|
Bugzilla – Full Text Bug Listing |
| Summary: | fstrim.service should not run fstrim -a | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Andreas Schneider <asn> |
| Component: | Basesystem | Assignee: | Stanislav Brabec <sbrabec> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | aschnell, sbrabec, sweet_f_a |
| Version: | Current | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
fstrim -a has a de-duplication heuristics. https://git.kernel.org/cgit/utils/util-linux/util-linux.git/commit/?id=8dd51c183396c4bf6344d08b01fb84658e0a8eb4 https://git.kernel.org/cgit/utils/util-linux/util-linux.git/commit/?id=e05a3400e79e8cba6aca8725fd53d2b9753f0cf3 But adding a printf just before the the FITRIM ioctl() really shows that it calls FITRIM for each sub-volume. We should not invent strange workarounds, but fix fstrim -a instead. I just consulted it with a btrfs developer. The patch is not significant for the function of TRIM. Call of ioctl(,FITRIM,) on any path and any sub-volume has the same effect: TRIM on the whole volume. This fact simplifies the possible fix. Debugging the code, I found: For duplicates to be removed, uniq_fs_source_cmp() sets eq = 1 (equal = duplicate), but then want inside mnt_table_uniq_fs() remains 1 (wanted = unique). But the code at least partially works, because it identified match and de-duplicated: gvfsd-fuse on /run/user/10027/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=10027,group_id=100) gvfsd-fuse on /var/run/user/10027/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=10027,group_id=100) Now I have to find a reason why it happens. Well, fixing fstrim is even better, thanks for looking into this. Currently I have no time to look into this. But I remember a similar issue with multiple bind mounts where "fstrim -a" should de-duplicate mount points. This was discussed on ul mailing list one or two years ago. Maybe forward this bug to util-linux. Ruediger Meier: I am working on it. De-duplication of bind mounts is explicitly mentioned in the first change in the comment 1. And it even mentions btrfs, but it does not work for it for some reason. I found the root of the problem: sys-utils/fstrim.c: uniq_fs_source_cmp() contains this code: const char *aroot = mnt_fs_get_root(a), *broot = mnt_fs_get_root(b); if (!aroot || !broot) eq = 0; else if (strcmp(aroot, broot) != 0) eq = 0; In case of btrfs, it may look? mnt_fs_streq_srcpath() founds a match. mnt_fs_get_root() returns: aroot = /usr/local broot = /var/crash These strings don't match => return "not match" Whole this function was added to de-duplicate bind mounts. There it makes sense: for gvfsd-fuse and gvfsd-fuse aroot = / broot = / => match We have similar problems in mount/libmount in past. There we attempted to detect whether a particular btrfs sub-volume is already mounted. Here we will need a similar code. I am unsure whether root comparison makes sense at all in fstrim -a. Patch assuming that it does not make sense was sent to the upstream for review: [PATCH 0/1] fstrim: de-duplication of btrfs subvolumes http://marc.info/?l=util-linux-ng&m=148536542717051&w=2 [PATCH 1/1] fstrim: de-duplicate btrfs sub-volumes http://marc.info/?l=util-linux-ng&m=148536543717059&w=2 The code author Karel Zak just confirmed my suspicion that the root comparison makes no sense. Once the patch will appear in the upstream, I'll backport it. http://marc.info/?l=util-linux-ng&m=148543544505102&w=2 Thanks for the update. Looking forward to a fix. fstrim runs again on Monday ;-) Fix was accepted: https://git.kernel.org/cgit/utils/util-linux/util-linux.git/commit/?id=afa382f2e087b71c9e626144f2f313abc42ad594 SUSE-SU-2017:0553-1: An update that solves two vulnerabilities and has 11 fixes is now available. Category: security (important) Bug References: 1008965,1012504,1012632,1019332,1020077,1023041,947494,966891,978993,982331,983164,987176,988361 CVE References: CVE-2016-5011,CVE-2017-2616 Sources used: SUSE Linux Enterprise Server for SAP 12 (src): python-libmount-2.25-24.10.3, util-linux-2.25-24.10.1, util-linux-systemd-2.25-24.10.1 SUSE Linux Enterprise Server 12-LTSS (src): python-libmount-2.25-24.10.3, util-linux-2.25-24.10.1, util-linux-systemd-2.25-24.10.1 SUSE-SU-2017:0554-1: An update that solves one vulnerability and has 6 fixes is now available. Category: security (important) Bug References: 1008965,1012504,1012632,1019332,1020077,1020985,1023041 CVE References: CVE-2017-2616 Sources used: SUSE Linux Enterprise Workstation Extension 12-SP2 (src): util-linux-2.28-44.3.1 SUSE Linux Enterprise Software Development Kit 12-SP2 (src): util-linux-2.28-44.3.1 SUSE Linux Enterprise Server for Raspberry Pi 12-SP2 (src): python-libmount-2.28-44.3.3, util-linux-2.28-44.3.1, util-linux-systemd-2.28-44.3.3 SUSE Linux Enterprise Server 12-SP2 (src): python-libmount-2.28-44.3.3, util-linux-2.28-44.3.1, util-linux-systemd-2.28-44.3.3 SUSE Linux Enterprise Desktop 12-SP2 (src): python-libmount-2.28-44.3.3, util-linux-2.28-44.3.1, util-linux-systemd-2.28-44.3.3 SUSE-SU-2017:0555-1: An update that solves one vulnerability and has 5 fixes is now available. Category: security (important) Bug References: 1008965,1012504,1012632,1019332,1020077,1023041 CVE References: CVE-2017-2616 Sources used: SUSE Linux Enterprise Workstation Extension 12-SP1 (src): util-linux-2.25-40.1 SUSE Linux Enterprise Software Development Kit 12-SP1 (src): util-linux-2.25-40.1 SUSE Linux Enterprise Server 12-SP1 (src): python-libmount-2.25-40.2, util-linux-2.25-40.1, util-linux-systemd-2.25-40.1 SUSE Linux Enterprise Desktop 12-SP1 (src): python-libmount-2.25-40.2, util-linux-2.25-40.1, util-linux-systemd-2.25-40.1 openSUSE-SU-2017:0589-1: An update that solves one vulnerability and has 6 fixes is now available. Category: security (important) Bug References: 1008965,1012504,1012632,1019332,1020077,1020985,1023041 CVE References: CVE-2017-2616 Sources used: openSUSE Leap 42.2 (src): python-libmount-2.28-10.2, util-linux-2.28-10.1, util-linux-systemd-2.28-10.1 openSUSE-SU-2017:0590-1: An update that solves one vulnerability and has 5 fixes is now available. Category: security (important) Bug References: 1008965,1012504,1012632,1019332,1020077,1023041 CVE References: CVE-2017-2616 Sources used: openSUSE Leap 42.1 (src): python-libmount-2.25-21.1, util-linux-2.25-21.1, util-linux-systemd-2.25-21.1 Now it should be fixed in Leap 42.1, Leap 42.2, SLE12, SLE12 SP1, SLE12 SP2 and Tumbleweed. I'm sorry, but this is not solved! magrathea:~ # rpm -q util-linux util-linux-2.29.1-1.1.x86_64 magrathea:~ # fstrim -av /home/asn/storage: 368.2 GiB (395347783680 bytes) trimmed /home: 413.7 GiB (444230541312 bytes) trimmed /usr/local: 46.9 GiB (50348015616 bytes) trimmed /var/crash: 46.9 GiB (50341838848 bytes) trimmed /var/lib/mysql: 46.9 GiB (50343002112 bytes) trimmed /srv: 46.9 GiB (50343165952 bytes) trimmed /boot/grub2/i386-pc: 46.9 GiB (50343690240 bytes) trimmed ^C magrathea:~ # mount [...] /dev/sda3 on /var/lib/named type btrfs (rw,relatime,ssd,space_cache,subvolid=273,subvol=/@/var/lib/named) /dev/sda3 on /var/tmp type btrfs (rw,relatime,ssd,space_cache,subvolid=278,subvol=/@/var/tmp) /dev/sda3 on /.snapshots type btrfs (rw,relatime,ssd,space_cache,subvolid=258,subvol=/@/.snapshots) /dev/sda3 on /var/spool type btrfs (rw,relatime,ssd,space_cache,subvolid=277,subvol=/@/var/spool) /dev/sda3 on /var/log type btrfs (rw,relatime,ssd,space_cache,subvolid=275,subvol=/@/var/log) /dev/sda3 on /boot/grub2/i386-pc type btrfs (rw,relatime,ssd,space_cache,subvolid=260,subvol=/@/boot/grub2/i386-pc) /dev/sda3 on /srv type btrfs (rw,relatime,ssd,space_cache,subvolid=263,subvol=/@/srv) /dev/sda3 on /var/lib/mysql type btrfs (rw,relatime,ssd,space_cache,subvolid=272,subvol=/@/var/lib/mysql) /dev/sda3 on /var/crash type btrfs (rw,relatime,ssd,space_cache,subvolid=267,subvol=/@/var/crash) /dev/sda3 on /usr/local type btrfs (rw,relatime,ssd,space_cache,subvolid=265,subvol=/@/usr/local) /dev/mapper/cr_home on /home type xfs (rw,relatime,attr2,inode64,noquota) /dev/sdd1 on /home/asn/storage type btrfs (rw,relatime,ssd,space_cache,subvolid=257,subvol=/@) tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=3288472k,mode=700,uid=1000,gid=100) fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime) gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=100) It still calls trim on each subvolume ... Please wait for release of util-linux-2.29.2. util-linux-2.29.1 does not have the fix yet. The update is on its way: https://build.opensuse.org/package/show/Base:System/util-linux (Feel free to test the package from there.) I have a similar setup, and FITRIM is called only twice in Tumbleweed with util-linux-2.29.2 (once for /, once for /home), so I assume that it is fixed. |
The fstrim.service of the util-linux-systemd package should not run 'fstrim -a'. If you install openSUSE Tumbleweed is chooses btrfs for the root filesystem and creates subvolumes. For example: /dev/sda3 on /tmp type btrfs (rw,relatime,ssd,space_cache,subvolid=264,subvol=/@/tmp) /dev/sda3 on /var/lib/mysql type btrfs (rw,relatime,ssd,space_cache,subvolid=272,subvol=/@/var/lib/mysql) /dev/sda3 on /var/lib/pgsql type btrfs (rw,relatime,ssd,space_cache,subvolid=274,subvol=/@/var/lib/pgsql) /dev/sda3 on /var/lib/machines type btrfs (rw,relatime,ssd,space_cache,subvolid=269,subvol=/@/var/lib/machines) /dev/sda3 on /opt type btrfs (rw,relatime,ssd,space_cache,subvolid=262,subvol=/@/opt) /dev/sda3 on /usr/local type btrfs (rw,relatime,ssd,space_cache,subvolid=265,subvol=/@/usr/local) /dev/sda3 on /var/lib/named type btrfs (rw,relatime,ssd,space_cache,subvolid=273,subvol=/@/var/lib/named) /dev/sda3 on /var/log type btrfs (rw,relatime,ssd,space_cache,subvolid=275,subvol=/@/var/log) /dev/sda3 on /var/tmp type btrfs (rw,relatime,ssd,space_cache,subvolid=278,subvol=/@/var/tmp) /dev/sda3 on /var/cache type btrfs (rw,relatime,ssd,space_cache,subvolid=266,subvol=/@/var/cache) /dev/sda3 on /srv type btrfs (rw,relatime,ssd,space_cache,subvolid=263,subvol=/@/srv) /dev/sda3 on /var/lib/mailman type btrfs (rw,relatime,ssd,space_cache,subvolid=270,subvol=/@/var/lib/mailman) /dev/sda3 on /.snapshots type btrfs (rw,relatime,ssd,space_cache,subvolid=258,subvol=/@/.snapshots) /dev/sda3 on /var/crash type btrfs (rw,relatime,ssd,space_cache,subvolid=267,subvol=/@/var/crash) /dev/sda3 on /var/lib/mariadb type btrfs (rw,relatime,ssd,space_cache,subvolid=271,subvol=/@/var/lib/mariadb) /dev/sda3 on /boot/grub2/x86_64-efi type btrfs (rw,relatime,ssd,space_cache,subvolid=261,subvol=/@/boot/grub2/x86_64-efi) /dev/sda3 on /var/spool type btrfs (rw,relatime,ssd,space_cache,subvolid=277,subvol=/@/var/spool) /dev/sda3 on /var/lib/libvirt/images type btrfs (rw,relatime,ssd,space_cache,subvolid=268,subvol=/@/var/lib/libvirt/images) From the fstrim manpage: "fstrim is used on a mounted filesystem" So 'fstrim -a' trims each subvolume of /dev/sda3. This means it trims /dev/sda3 in the exmaple more than 15 times! Please implement a more intelligent solution for trimming. A script which works on the output of 'lsblk -o MOUNTPOINT,FSTYPE' output would be more intelligent.