Bugzilla – Full Text Bug Listing
| Summary: | System goes into emergency mode on boot (lvmetad problem) | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Neil Rickert <nwr10cst-oslnx> |
| Component: | Basesystem | Assignee: | Liuhua Wang <lwang> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | bchou, mchang, msuchanek |
| Version: | Leap 42.2 | ||
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | SUSE Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | Typescript "transcript file" with most of the requested information. | ||
| | typescript (transcript file) from emergency mode, 7/28/2016 | ||
|
Description
Neil Rickert
2016-07-21 00:57:51 UTC
I have since installed Alpha 3 on a different computer. Again, I used an encrypted LVM, mounting the home volume to "/xhome". This time, everything worked. The system did not go into emergency mode. I'm not sure what the difference is. However, on the first computer (where I did have a problem) there are actually two encrypted LVMs (only one used by Alpha 3). Possibly that is what confuses things.

Please see whether the following service is running or not:
    systemctl status initrd-udevadm-cleanup-db.service

Responding to comment 2
> Please see whether the following service is running or not:
> systemctl status initrd-udevadm-cleanup-db.service

# systemctl status initrd-udevadm-cleanup-db.service
● initrd-udevadm-cleanup-db.service - Cleanup udevd DB
   Loaded: loaded (/usr/lib/systemd/system/initrd-udevadm-cleanup-db.service; static; vendor preset: disabled)
   Active: inactive (dead)

Jul 25 07:00:37 linux-zbzz systemd[1]: Starting Cleanup udevd DB...
Jul 25 07:00:37 linux-zbzz systemd[1]: Started Cleanup udevd DB.

I see the same output on the system where I am having problems and on the system where everything works (except for different timestamps).

So there is no problem with initrd-udevadm-cleanup-db.service. Is the problematic device a network device, such as iSCSI? If it is not, more information is needed:
- fstab
- lsblk
- /etc/lvm/lvm.conf
- systemctl status lvm2-lvmetad lvm2-pvscan@major:minor.service (the PV's major and minor number)
Thanks!

Created attachment 685640
Typescript "transcript file" with most of the requested information.
There is no unusual network device.
I generated the requested output, except "/etc/lvm/lvm.conf". That file is as originally installed, except for the change to "use_lvmetad = 0".
After generating the transcript file, I edited "lvm.conf" and set "use_lvmetad = 1" (as in the originally installed version). My plan was to reboot, and rerun those commands from emergency mode.
Strangely, the system booted properly this time. It did not go into emergency mode. Then I appended to the "typescript" and reran "lsblk", as the order of the LVM components had now changed. I have since rebooted a second time, and again it did not go into emergency mode. I do not know what changed. The only significant change that I have made was to install "fvwm2" and "fvwm-themes".
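For reference, the "use_lvmetad" toggle described in this comment could be scripted. The following is only a sketch under stated assumptions: "set_use_lvmetad" is a hypothetical helper name (it is not part of LVM), GNU sed is assumed for the "-i" option, and the setting is assumed to sit on an uncommented line of the file.

```shell
#!/bin/sh
# Sketch: flip the use_lvmetad setting in an lvm.conf-style file.
# set_use_lvmetad is a hypothetical helper, not an LVM command.
set_use_lvmetad() {
    conf=$1    # path to the config file, e.g. /etc/lvm/lvm.conf
    value=$2   # 0 disables lvmetad, 1 enables it

    # Keep a backup copy, then rewrite only the uncommented assignment.
    # Lines starting with "#" do not match the anchored pattern.
    cp -- "$conf" "$conf.bak" &&
    sed -i "s/^\([[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*\)[01]/\1$value/" "$conf"
}

# Example (run as root, then reboot):
#   set_use_lvmetad /etc/lvm/lvm.conf 0
```

The backup copy makes it easy to return to the originally installed file before the next test boot.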
(In reply to Neil Rickert from comment #5)
> Created attachment 685640
> Typescript "transcript file" with most of the requested information.
>
> There is no unusual network device.
>
> I generated the requested output, except "/etc/lvm/lvm.conf". That file is
> as originally installed, except for the change to "use_lvmetad = 0".
>
> After generating the transcript file, I edited "lvm.conf" and set
> "use_lvmetad = 1" (as in the originally installed version). My plan was to
> reboot, and rerun those commands from emergency mode.
>
> Strangely, the system booted properly this time. It did not go into
> emergency mode. Then I appended to the "typescript" and reran "lsblk", as
> the order of the LVM components had now changed. I have since rebooted a
> second time, and again it did not go into emergency mode. I do not know
> what changed. The only significant change that I have made was to install
> "fvwm2" and "fvwm-themes".

This should have no relationship with fvwm.
As to lvm2-pvscan@, in your case it should be:
    systemctl status lvm2-pvscan@8:21.service
(the physical volume's major and minor numbers, not the logical volume's)
I suspect the services are not initialized during boot when the failure occurs. If you can still reproduce it, please check:
    systemctl status lvm2-lvmetad.socket lvm2-lvmetad.service
    systemctl status lvm2-pvscan@8:21.service
Thank you!

It looks as if I cannot reproduce the problem at present. Maybe it will show up again with the upcoming beta release.

Created attachment 685935
typescript (transcript file) from emergency mode, 7/28/2016
It just went into emergency mode again on a recent boot. I hope the transcript file shows what you wanted.
I also listed "/dev/mapper", which shows that the root2 and swap volumes have been mapped, but the home volume has not been mapped.
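The unit name asked for earlier in this thread follows the pattern lvm2-pvscan@major:minor.service. As a rough sketch, it could be built as shown here. The helper names "pvscan_unit" and "pv_majmin" are hypothetical, and the device path in the second example is a placeholder that has to be replaced with the actual physical volume.

```shell
#!/bin/sh
# Sketch: build the lvm2-pvscan@ unit name for a physical volume.
# pvscan_unit and pv_majmin are hypothetical helper names.

# Print the unit name for a given major and minor number.
pvscan_unit() {
    printf 'lvm2-pvscan@%s:%s.service\n' "$1" "$2"
}

# Print "major:minor" for a block device, with lsblk's padding stripped.
pv_majmin() {
    lsblk -dno MAJ:MIN "$1" | tr -d ' '
}

# Example, for the 8:21 device mentioned in this thread:
#   systemctl status "$(pvscan_unit 8 21)"
# Example, starting from a device path (placeholder, adjust to the real PV):
#   mm=$(pv_majmin /dev/sdXN)
#   systemctl status "$(pvscan_unit "${mm%%:*}" "${mm##*:}")"
```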
I can update this for 42.2-Beta1. My first three boots were fine. The next three all failed (went into emergency mode). I could not recover by using lvm commands (such as vgchange). That gave error messages about a corrupt lvmetad cache. However, changing to "use_lvmetad = 0" in "lvm.conf" and rebooting does work around the issue.

I have more testing to do. But here's what caused the difference between the boots that failed and the ones that succeeded:
First, as background, this is a UEFI system and I do have secure-boot enabled.
The successful boots were all done with the grub2-efi as installed for 42.2. And the unsuccessful boots were all done with grub2-efi as installed for Tumbleweed.
I'm not sure why that would make a difference. When using the grub2 from Tumbleweed, I am using the "configfile" command to use the exact "grub.cfg" that was installed for 42.2. So the only differences are "shim.efi", "grub.efi" and the grub2 modules pointed to by "$prefix".

I am a little confused. Do you mean the boot failure only occurs when using grub2-efi with secure boot enabled? Or is it still related to Tumbleweed versus Leap? Does it only occur on Tumbleweed?
The difference between activating a logical volume with lvmetad enabled and without it is that, with lvmetad, logical volumes are activated when a udev event is received. I don't know whether grub2-efi has any relationship with those events. I will CC the grub2-efi maintainer, Michael Chang.
Hi Michael, do you have any thoughts about this?

I don't think it's grub2, but rather the different versions of dracut and udev between Leap 42.2 and TW. By then the boot process has passed grub2 and been handed over to the kernel, in a sequence where the initrd is in charge of mounting the root filesystem (which failed ...).

> I am a little confused.
Sorry about that. I guess I did not give enough detail.
I use this particular computer for testing. I have several Linux versions installed, including both Tumbleweed and 42.2.
My tests all had secure-boot enabled.
The boot path then should be:
firmware --> shim.efi --> grub.efi which loads "grub.cfg".
Here, "shim.efi", "grub.efi" and "grub.cfg" are all in "/boot/efi/EFI/opensuse".
Each install takes over booting. So I keep backups of those three files so that I can restore them as needed.
When those three files come from the new 42.2 install, then everything is fine.
When those three files come from Tumbleweed (snapshot 20160828), I can boot into Tumbleweed, but I run into lvmetad issues if I try to boot into 42.2.
For the Tumbleweed grub menu entry, I use the following to boot 42.2:
--- cut here ---
### Entry to boot 42.2 on sdb4
menuentry "configfile for linux on /dev/sdb4 (42.2)" {
    set bootdir='hd1,gpt4'
    search --fs-uuid --set=bootdir 723d6d1e-6b1f-410c-bd53-87e944c288e0
    configfile (${bootdir})/boot/grub2/grub.cfg
}
--- cut here ---
The UUID is for "/dev/sdb4" which is "/boot" for 42.2. And there is a symlink so that "/boot/grub2/grub.cfg" works relative to "/boot". So I am really using the 42.2 boot menu via a "configfile" command. It does work, in that the kernel is loaded. But the home volume in my encrypted LVM is not made accessible and boot ends in emergency mode.
No, I don't understand it either. All LVM operations should be initiated by the "initrd", and that happens after grub is no longer doing anything. So I agree that it does not make sense.
Nevertheless, using the three boot path files for 42.2 was 100% effective when booting. And using those files from Tumbleweed failed every time (at least 4 attempts) yesterday.
But now a new event. I tried again this morning, and it booted successfully using the basic boot files from Tumbleweed. So this is quite strange.
*** Bug 996916 has been marked as a duplicate of this bug. ***

I installed Leap 42.2 beta, and on the first and second boot the home partition did not activate. I can activate it by hand in the emergency shell and continue the boot. I will try changing to use_lvmetad = 0.

There is also a bug (997637) filed for an LVM partition being disabled while booting Leap 42.2. I am marking this as a duplicate of it.

*** This bug has been marked as a duplicate of bug 997637 ***
|