Bug 1189793

Summary: libzypp cache keeps /mnt/var mount busy
Product: [openSUSE] openSUSE Distribution Reporter: Stefan Hundhammer <shundhammer>
Component: InstallationAssignee: YaST Team <yast-internal>
Status: RESOLVED FIXED QA Contact: Jiri Srain <jsrain>
Severity: Normal    
Priority: P5 - None CC: lslezak, ma, mfilka, shundhammer
Version: Leap 15.3   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://trello.com/c/TRWCBSTv
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: y2logs tarball
y2logs tarball from TW 20210829 NET ISO without unmount problems

Description Stefan Hundhammer 2021-08-25 12:39:48 UTC
Created attachment 852030 [details]
y2logs tarball

In the process of testing my changes of umount_finish.rb I got errors because /mnt/var could not be unmounted (and subsequently of course also not /mnt). It turned out that the libzypp cache was still open when umount_finish.rb attempted to unmount all target mounts; so I got this:

2021-08-25 13:41:11 <3> install(4456) [Ruby] 
  lib/cheetah.rb(log_stream_line):208 Error output: 
  umount: /mnt/var: target is busy.

2021-08-25 13:41:11 <2> install(4456) [Ruby] clients/umount_finish.rb(log_running_processes):185 

Running processes using /mnt/var:
                     USER        PID ACCESS COMMAND
/mnt/var:            root     kernel mount /mnt/var
                     root       4456 f.... y2start



0:install:/ # /mnt/usr/bin/lsof -p 4456 | grep '/mnt'

y2start 4456 root 27r REG 0,40 821414 12740
  /mnt/var/cache/zypp/solv/repo-update/solv

y2start 4456 root 28r REG 0,40 20071 12744
  /mnt/var/cache/zypp/solv/repo-update-non-oss/solv

y2start 4456 root 29r REG 0,40 33644441 12748
  /mnt/var/cache/zypp/solv/repo-oss/solv

y2start 4456 root 30r REG 0,40 49563 12752
  /mnt/var/cache/zypp/solv/repo-non-oss/solv


This prevents /mnt/var and thus /mnt from being unmounted, so the reboot after the installation will complain about the root filesystem not being cleanly unmounted and require a filesystem check.


I was using a DVD ISO, no installer self-update, no online repositories enabled.
Comment 1 Stefan Hundhammer 2021-08-25 13:43:39 UTC
umount_finish.rb refactoring bug: bug #1149980
PR: https://github.com/yast/yast-installation/pull/975
Comment 2 Stefan Hundhammer 2021-08-25 13:51:13 UTC
Maybe related: bug #1189590 comment #4
(mentioning /mnt/var/cache failing to unmount)
Comment 3 Stefan Hundhammer 2021-08-26 10:57:08 UTC
Installing Tumbleweed from 2021-08-24:

- /mnt/var is only in use while an RPM is being installed;
  between RPMs it is not in use.

- From the step "Saving the software manager configuration" on,
  it appears to be always in use.

- When the popup "The system will reboot now" is open, it is
  still in use:


0:install:~ # fuser -m /mnt/var
/mnt/var:             4976


0:install:~ # /mnt/usr/bin/lsof +D /mnt/var

COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME

Zypp-main 4976 root 32r REG 0,28 12028 8432
/mnt/var/cache/zypp/solv/repo-update/solv

Zypp-main 4976 root 33r REG 0,28 23897763 8436
/mnt/var/cache/zypp/solv/repo-oss/solv

Zypp-main 4976 root 34r REG 0,28 41571 8440
/mnt/var/cache/zypp/solv/repo-non-oss/solv


0:install:~ # ps 4976
  PID TTY      STAT   TIME COMMAND

 4976 pts/0 Sl 2:02 /usr/bin/ruby.ruby2.7 --encoding=utf-8
 /usr/lib/YaST2/bin/y2start installation --arg initial qt


4976 is also the PID of all log lines in the y2log.


After confirming the "reboot" pop-up (booted with "startshell=1"
so it doesn't immediately reboot):

0:install:~ # findmnt /mnt
0:install:~ #

0:install:~ # findmnt
TARGET                    SOURCE     FSTYPE
/                         /dev/zram0 ext2
├─/proc                   proc       proc
├─/sys                    sysfs      sysfs
├─/dev                    devtmpfs   devtmpfs
│ └─/dev/pts              devpts     devpts
├─/parts/mp_0000          /dev/loop0 squashfs
├─/parts/mp_0001          /dev/loop1 squashfs
├─/var/lib/nfs/rpc_pipefs rpc_pipefs rpc_pipefs
├─/mounts/mp_0000         /dev/loop2 squashfs
├─/mounts/mp_0001         /dev/loop3 squashfs
└─/mounts/mp_0003         /dev/loop5 squashfs


i.e. all /mnt mounts are now unmounted (?).
Comment 4 Stefan Hundhammer 2021-08-26 10:58:02 UTC
From the y2log:

2021-08-26 12:35:30 <1> install(4976) [Ruby]
  clients/umount_finish.rb(umount_target_mounts):79
  Paths to unmount:
  ["/mnt/run", "/mnt/sys", "/mnt/proc", "/mnt/dev", "/mnt/home",
  "/mnt/boot/grub2/i386-pc", "/mnt/boot/grub2/x86_64-efi",
  "/mnt/opt", "/mnt/root", "/mnt/srv", "/mnt/usr/local",
  "/mnt/var", "/mnt"]


2021-08-26 12:35:30 <2> install(4976) [Ruby]
  clients/umount_finish.rb(unmount_summary):97
  Leftover paths that could not be unmounted:
  ["/mnt/var", "/mnt"]

2021-08-26 12:35:30 <2> install(4976) [Ruby]
  clients/umount_finish.rb(log_running_processes):185

Running processes using ["/mnt/var", "/mnt"]:
                     USER        PID ACCESS COMMAND
/mnt/var:            root     kernel mount /mnt/var
                     root       4976 f.... Zypp-main
/mnt:                root     kernel mount /mnt

2021-08-26 12:35:30 <1> install(4976) [Ruby]
  clients/umount_finish.rb(dump_file):107


/proc/mounts:

...
...
/dev/sda2 /mnt btrfs rw,relatime,space_cache,subvolid=256,subvol=/@ 0 0
/dev/sda2 /mnt/var btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/var 0 0


2021-08-26 12:35:30 <1> install(4976) [Ruby]
  clients/umount_finish.rb(write):64 umount_finish.rb done



While the "reboot" pop-up is still open:

0:install:~ # findmnt
TARGET                    SOURCE            FSTYPE
/                         /dev/zram0        ext2
├─/proc                   proc              proc
├─/sys                    sysfs             sysfs
├─/dev                    devtmpfs          devtmpfs
│ └─/dev/pts              devpts            devpts
├─/parts/mp_0000          /dev/loop0        squashfs
├─/parts/mp_0001          /dev/loop1        squashfs
├─/var/lib/nfs/rpc_pipefs rpc_pipefs        rpc_pipefs
├─/mounts/mp_0000         /dev/loop2        squashfs
├─/mounts/mp_0001         /dev/loop3        squashfs
├─/mounts/mp_0003         /dev/loop5        squashfs
└─/mnt                    /dev/sda2[/@]     btrfs
  └─/mnt/var              /dev/sda2[/@/var] btrfs
Comment 5 Stefan Hundhammer 2021-08-26 11:11:00 UTC
It appears to me that we seem to have another fallback during installation that tries to umount leftover /mnt mounts after YaST has finished. I'll check the y2start scripts.
Comment 6 Lukas Ocilka 2021-08-26 11:35:25 UTC
BTW, for the libzypp cache, it might work to stop using libzypp completely? IMO we still might keep some reference in the memory. Or maybe we need to call libzypp directly? Maybe MA knows?
Comment 7 Stefan Hundhammer 2021-08-26 11:44:40 UTC
I found it:

https://github.com/yast/yast-installation/blob/master/startup/First-Stage/F10-cleanup#L9-L10

# 10.1) reverse umount file systems...
awk '/ \/mnt/{ print $2 }' /proc/mounts | sort -r | xargs --no-run-if-empty umount



While this is very good to have as a last ditch defense, it's still scary that this is needed in this case: Something related to package handling (probably on the YaST side) keeps /mnt/var/cache/zypp busy.

AFAICS it works well while RPMs are being installed: It only attaches /mnt/var/cache/zypp when it starts installing an RPM and then releases it again immediately. That's how it should be.

But at the very end of the installation when files are copied over from the inst-sys to the target, some other code attaches to that directory and never releases it again. It is possible that some parts of libzypp are initialized again and never properly shut down.
Comment 8 Michael Andres 2021-08-26 12:36:01 UTC
(In reply to Lukas Ocilka from comment #6)
> BTW, for the libzypp cache, it might work to stop using libzypp completely?

The solv files are referenced by libsolv as long as the repos are loaded in the pool. This is needed because not all the package data are kept in memory. It's like this ever since.

If the repos are unloaded from the zypp pool and the files are still open, then something would have cahanged in libsolv. And it would IMO be bug.
Comment 9 Michael Andres 2021-08-26 12:47:57 UTC
> [grep 'target is busy\|setDirty\|Updating RepoManager' y2log]
> _createRepo openSUSE-Leap-15.3-1
> _addSolv openSUSE-Leap-15.3-1
> setRepoInfo openSUSE-Leap-15.3-1
>rget):459 Updating RepoManager (target changed from / to /mnt)
> _createRepo @System
> setRepoInfo @System
> _addSolv @System
> _deleteRepo @System
> _createRepo @System
> setRepoInfo @System
> _addSolv @System
> _createRepo repo-update
> _addSolv repo-update
> setRepoInfo repo-update
> _createRepo repo-update-non-oss
> _addSolv repo-update-non-oss
> setRepoInfo repo-update-non-oss
> _createRepo repo-oss
> _addSolv repo-oss
> setRepoInfo repo-oss
> _createRepo repo-non-oss
> _addSolv repo-non-oss
> setRepoInfo repo-non-oss
> _deleteRepo openSUSE-Leap-15.3-1
> _deleteRepo @System
>8 Error output: umount: /mnt/var: target is busy.
> umount_target_mounts):86 Execution of "umount /mnt/var" failed with status 32: umount: /mnt/var: target is busy..

From the y2log I'd say at least those were steated on /mnt and not unloaded:
> _createRepo repo-update
> _createRepo repo-update-non-oss
> _createRepo repo-oss
> _createRepo repo-non-oss
Comment 10 Stefan Hundhammer 2021-08-26 13:04:47 UTC
I believe it was @snwint who pointed me to this:

https://github.com/yast/yast-installation/blob/master/src/lib/installation/clients/pre_umount_finish.rb#L56-L62


      # Release all sources, they might be still mounted
      Pkg.SourceReleaseAll

      # save all sources and finish target
      # bnc #398315
      Pkg.SourceSaveAll
      Pkg.TargetFinish

Does this re-initialize the repos ("sources") after they have already been released? I don't know, but we'll have to check.
Comment 12 Michal Filka 2021-08-28 13:28:21 UTC
according to comment#9 and comment#10 it seems that playground is ready for us -> trelloizing it.
Comment 13 Stefan Hundhammer 2021-08-31 13:07:02 UTC
I just installed the latest TW (20210829) from a NET ISO with an ext4 root filesystem, and it didn't seem to have this problem.
Comment 14 Stefan Hundhammer 2021-08-31 13:07:48 UTC
Created attachment 852185 [details]
y2logs tarball from TW 20210829 NET ISO without unmount problems
Comment 17 Ladislav Slezák 2021-10-01 13:53:03 UTC
Fixed in yast2-installation-4.4.19 (https://github.com/yast/yast-installation/pull/981)