Bugzilla – Bug 937237
Timed out waiting for device dev-hvc0.device
Last modified: 2015-10-27 12:01:48 UTC
Trying to boot rescue mode from a Tumbleweed snapshot gives me:

[ TIME ] Timed out waiting for device dev-hvc0.device.
[DEPEND] Dependency failed for Serial Getty on hvc0.
[ OK ] Reached target Login Prompts.
[ OK ] Started /etc/init.d/after.local Compatibility.
Starting /etc/init.d/after.local Compatibility...
[FAILED] Failed to start System Logging Service.
See "systemctl status syslogd.service" for details.
[DEPEND] Dependency failed for System Kernel Logging Service.
[ OK ] Reached target Multi-User System.
[ OK ] Reached target Graphical Interface.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Started Update UTMP about System Runlevel Changes.
journal says: failed at step NAMESPACE spawning /usr/lib/systemd/systemd-udevd: invalid argument
Looks more like a startup problem of the base system. In any case, please provide YaST logs so that we can see what has been configured.
This is a rescue image, there are no yast logs
Could you attach journalctl output, dmesg, /var/log/boot.log, or anything else that might help debugging? Why don't we have supportconfig for openSUSE?
Give the rescue mode of the latest openSUSE 13.2 image a try ... Besides this, it could also be a linuxrc and maybe a dracut problem.
13.2 is fine in this regard.
After the rescue mode, does the final system work together with hvc0? If yes, it might be that something required for hvc0 is missing in the Tumbleweed rescue image.
If rescue is started over serial (no vga attached), then it is not possible to login, there is no login prompt. If machine started with VGA, then sure, there is login prompt, but hvc0 is not used in this case. I don't see the same behaviour on installed system.
My question is: is hvc0 attached in the installed system, that is: are there running agettys not only on /dev/tty1 up to /dev/tty6 but also on /dev/hvc0? Please show

ps aux | grep agetty
cat /proc/cmdline

of the *installed* system. Also I'd like to know what happens if you specify console=hvc0,38400 for the rescue mode on the kernel's command line.
Dinar? Please provide an answer on comment #8
Starting rescue with console=hvc0,38400 doesn't solve the problem. I'll provide the requested output from the running system in a minute.
Info from installed machine:

localhost:~ # cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.1.6-3-default root=UUID=16c8375f-dd26-4e1c-bed5-64644a5c92b1 quiet splash=silent
localhost:~ # ps aux | grep agetty
root 569 0.0 0.0 3072 2112 tty1 Ss+ 10:18 0:00 /sbin/agetty --noclear tty1 linux
root 1367 0.0 0.0 4160 2368 hvc0 S+ 10:20 0:00 grep --color=auto agetty
localhost:~ #

Also I don't see system boot output during boot:

Loading Linux 4.1.6-3-default ...
* finddevice /memory grub workaround *
Loading initial ramdisk ...
* finddevice /memory grub workaround *
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 4.1.6-3-default (geeko@buildhost) (gcc version 5.1.1 20150713 [gcc-5-branch revision 225736] (SUSE Linux) ) #1 SMP Fri Aug 28 10:59:34 UTC 2015 (d867e86)
Detected machine type: 0000000000000101
Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
Calling ibm,client-architecture-support... done
command line: BOOT_IMAGE=/boot/vmlinux-4.1.6-3-default root=UUID=16c8375f-dd26-4e1c-bed5-64644a5c92b1 quiet splash=silent
memory layout at init:
  memory_limit : 0000000000000000 (16 MB aligned)
  alloc_bottom : 0000000003b80000
  alloc_top : 0000000030000000
  alloc_top_hi : 0000000100000000
  rmo_top : 0000000030000000
  ram_top : 0000000100000000
instantiating rtas at 0x000000002fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000003b90000 -> 0x0000000003b90820
Device tree struct 0x0000000003ba0000 -> 0x0000000003bb0000
Quiescing Open Firmware ...
Booting Linux via __start() ...
-> smp_release_cpus()
spinning_secondaries = 0
<- smp_release_cpus()
<- setup_system()
SUSE Linux #1 SMP Fri Aug 2
Welcome to openSUSE 20150903 "Tumbleweed" - Kernel 4.1.6-3-default (hvc0).
(In reply to Dinar Valeev from comment #12)
AFAICS you *are* working on hvc0, that is, there has been an agetty as well as a login. Otherwise the terminal of the grep command wouldn't be hvc0.

> Also I don't see system boot output during boot

Wrong order of

console=hvc0,38400 console=tty0

Switch the order to

console=tty0 console=hvc0,38400

as with this the last one becomes the main console. Or use plymouth to see boot messages on both devices. There is no blogd anymore and systemd writes to /dev/console, which is printed on the main device only. Besides this I'd like to see

systemctl status --all dev-hvc0.device
(In reply to Dinar Valeev from comment #8)
> I don't see the same behaviour on installed system.

... this looks more like a problem in the installation system. Maybe the file /usr/lib/udev/rules.d/99-systemd.rules is missing in the installation system and/or the line:

SUBSYSTEM=="tty", KERNEL=="tty[a-zA-Z]*|hvc*|xvc*|hvsi*|ttysclp*|sclp_line*|3270/tty*", TAG+="systemd"

is missing `hvc*|'
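To make the rule above easier to reason about, here is a minimal sketch of which kernel device names the KERNEL== pattern would accept. It assumes udev's matching can be approximated by shell-style globbing with `|` separating alternatives; `kernel_matches` is a hypothetical helper written for this illustration, not part of udev.

```python
from fnmatch import fnmatchcase

# KERNEL== value copied from the 99-systemd.rules line quoted in this bug.
KERNEL_PATTERN = "tty[a-zA-Z]*|hvc*|xvc*|hvsi*|ttysclp*|sclp_line*|3270/tty*"


def kernel_matches(name: str, pattern: str = KERNEL_PATTERN) -> bool:
    """Approximate a udev KERNEL== match (assumption: glob semantics,
    with "|" separating alternative patterns)."""
    return any(fnmatchcase(name, alt) for alt in pattern.split("|"))
```

With this approximation, hvc0 and ttyS0 are tagged for systemd while the virtual console tty1 is not, and dropping the `hvc*|` alternative would leave hvc0 untagged, which is the failure mode suspected here.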
installed system:

localhost:~ # systemctl status --all dev-hvc0.device
● dev-hvc0.device - /dev/hvc0
   Follow: unit currently follows state of sys-devices-virtual-tty-hvc0.device
   Loaded: loaded
   Active: active (plugged) since Mon 2015-10-05 10:40:07 EDT; 38s ago
   Device: /sys/devices/virtual/tty/hvc0

Oct 05 10:40:07 localhost systemd[1]: Found device /dev/hvc0.
localhost:~ # cat /usr/lib/udev/rules.d/99-systemd.rules | grep hvc
SUBSYSTEM=="tty", KERNEL=="tty[a-zA-Z]*|hvc*|xvc*|hvsi*|ttysclp*|sclp_line*|3270/tty*", TAG+="systemd"
(In reply to Dinar Valeev from comment #17)
And this file *is* part of the installation system? Note that I'm not talking about the installed system nor the final initrd. I meant the installation system, as you had said that you do not have the problem on the installed system.
The rescue system has the udev rule:

SUBSYSTEM=="tty", KERNEL=="tty[a-zA-Z]*|hvc*|xvc*|hvsi*|ttysclp*|sclp_line*|3270/tty*", TAG+="systemd"

systemctl status --all dev-hvc0.device
● dev-hvc0.device
   Loaded: loaded
   Active: inactive (dead)

Oct 05 15:08:44 Rescue systemd[1]: Job dev-hvc0.device/start timed out.
Oct 05 15:08:44 Rescue systemd[1]: Timed out waiting for device dev-hvc0.device.
Oct 05 15:08:44 Rescue systemd[1]: Job dev-hvc0.device/start failed with result 'timeout'.
In addition to that:

Oct 05 15:07:14 Rescue systemd[1]: Starting udev Kernel Device Manager...
Oct 05 15:07:14 Rescue systemd[1254]: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-udevd: Invalid argument
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service: main process exited, code=exited, status=226/NAMESPACE
Oct 05 15:07:14 Rescue systemd[1]: Failed to start udev Kernel Device Manager.
Oct 05 15:07:14 Rescue systemd[1]: Unit systemd-udevd.service entered failed state.
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service failed.
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service has no holdoff time, scheduling restart.
Oct 05 15:07:14 Rescue systemd[1]: Starting udev Kernel Device Manager...
Oct 05 15:07:14 Rescue systemd[1258]: Failed at step NAMESPACE spawning /usr/lib/systemd/systemd-udevd: Invalid argument
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service: main process exited, code=exited, status=226/NAMESPACE
Oct 05 15:07:14 Rescue systemd[1]: Failed to start udev Kernel Device Manager.
Oct 05 15:07:14 Rescue systemd[1]: Unit systemd-udevd.service entered failed state.
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service failed.
Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service has no holdoff time, scheduling restart.
Oct 05 15:07:14 Rescue systemd[1]: Starting udev Kernel Device Manager...
Created attachment 650211 [details] boot.msg boot.msg from rescue boot
(In reply to Dinar Valeev from comment #20)
> Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service failed.
> Oct 05 15:07:14 Rescue systemd[1]: systemd-udevd.service has no holdoff time, scheduling restart.

This looks more like a broken library on the rescue system for PPC64. After searching Google for "holdoff systemd" I found

https://bugzilla.redhat.com/show_bug.cgi?id=1121419

which describes a bug in glibc on i686 with similar behaviour. Related to this I have found

https://bugzilla.redhat.com/show_bug.cgi?id=1120473#c1
(In reply to Dr. Werner Fink from comment #22)
> which describes a bug in glibc in i686 with a similar behaviour.
> Related to this I have found

The bugzillas I provided show that gcc had miscompiled glibc, therefore I have added the gcc maintainers to the carbon copy list.
Might be related to: https://github.com/openSUSE/installation-images/blob/master/data/rescue/rescue.file_list#L371-L373 ? It sets up a getty only on tty1, not a serial-getty, and tty1 doesn't exist on serial?
(In reply to Dinar Valeev from comment #24)
Indeed ... /dev/tty1 *is* the first virtual console of /dev/tty0, the current virtual console aka the VESA console of the graphics card. The question arises: why is there no serial support in the installation system?
Should we have something like:

d etc/systemd/system/serial-getty.target.wants
s /usr/lib/systemd/system/serial-getty@.service etc/systemd/system/serial-getty.target.wants/serial-getty@hvc0.service

? But how does the installed system handle this automatically, without creating a symlink manually?
That's what I'm wondering also...
If something like console=ttyS0 is on the command line, the getty generator kicks in, creating the particular "wants" link:

/run/systemd/generator/getty.target.wants/serial-getty@ttyS0.service

This ought to work similarly when having console=hvc0 on the line, and that is what it does, at least, the last time I checked with xvc0 (xen) on systemd-210.
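The mapping from console= arguments to "wants" links described above can be sketched roughly as follows. This is a simplified model under stated assumptions: it parses a command-line string directly, whereas the real generator may consult the kernel's list of active consoles instead, and it ignores systemd's unit-name escaping. `serial_getty_units` is a hypothetical helper for illustration only.

```python
from typing import List


def serial_getty_units(cmdline: str) -> List[str]:
    """Return the serial-getty@ unit names one would expect for the
    console= entries of a kernel command line (simplified model)."""
    units: List[str] = []
    for word in cmdline.split():
        if not word.startswith("console="):
            continue
        # Drop options such as the baud rate in "hvc0,38400".
        tty = word[len("console="):].split(",", 1)[0]
        # Assumption: virtual consoles (tty0, tty1, ...) are served by
        # getty@.service and therefore get no serial getty.
        if not tty or (tty.startswith("tty") and tty[3:].isdigit()):
            continue
        unit = "serial-getty@{}.service".format(tty)
        if unit not in units:
            units.append(unit)
    return units
```

For example, `serial_getty_units("console=tty0 console=hvc0,38400")` yields only the hvc0 unit, while the installed system's command line shown in comment #12, which has no console= entry, yields an empty list.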
(In reply to Steffen Winterfeldt from comment #27) man:systemd-getty-generator(8)
(In reply to Jan Engelhardt from comment #28)
> If something like console=ttyS0 is on the command line, the getty generator
> kicks in, creating the particular "wants" link:
>
> /run/systemd/generator/getty.target.wants/serial-getty@ttyS0.service
>
> This ought to work similarly when having console=hvc0 on the line, and that
> is what it does, at least, the last time I checked with xvc0 (xen) on
> systemd-210.

I tried to boot rescue with console=hvc0; it doesn't help with the current installation-images code.
(In reply to Dinar Valeev from comment #30)
Then the systemd-getty-generator was not executed:

strings /usr/lib/systemd/system-generators/systemd-getty-generator | grep hvc
hvc0
hvc0
>(systemd-getty-generator.8): It will also instantiate serial-getty@.service instances for virtualizer consoles, if execution in a virtualized environment is detected. So maybe that virtual environment is not detected properly? Run systemd-detect-virt(8).
detect-virt detects kvm
tried to change tty1 to hvc0 and getty to serial-getty -> fail. It leads me to think that /dev/hvc0 is not there.
More specifically, /sys/class/tty/hvc0 needs to exist (so it is not dependent upon the population status of /dev).
(In reply to Jan Engelhardt from comment #35)
Yes, after handling containers, the generator first uses the currently active console(s) and then adds a serial getty for each entry of the list hvc0, xvc0, hvsi0, sclp_line0, ttysclp0, and 3270!tty1 that exists below /sys/class/tty/

NB: the ! in 3270!tty1 is a replacement for /

Nevertheless, for agetty to work the devices have to exist. The question arises whether udevd is running in the installation image and whether the getty generator has been executed ... I doubt that this has happened if I read

https://github.com/openSUSE/installation-images/blob/master/data/rescue/rescue.file_list#L371-L373

as this looks more like setting up the first virtual console with a scripting language instead of executing /usr/lib/systemd/system-generators/systemd-getty-generator at boot time, maybe with

e /usr/lib/systemd/system-generators/systemd-getty-generator

but I do not know if this would work out.
(In reply to Dr. Werner Fink from comment #36)
... also the devices, if not handled by a running udevd with the appropriate rules, have to be created in this scripting, like rescue.file_list ... I guess there is indeed no /dev/hvc0
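As a rough illustration of the sysfs check described above, the following sketch lists which of the named virtualizer consoles exist below a given directory. The console names are taken from this comment; the function name and the idea of passing the directory as a parameter (so it can be tried against a scratch directory) are assumptions made for the example.

```python
import os
from typing import List

# Names as listed in this comment; "!" stands in for "/" in sysfs.
VIRT_CONSOLES = ["hvc0", "xvc0", "hvsi0", "sclp_line0", "ttysclp0", "3270!tty1"]


def present_virt_consoles(sys_class_tty: str = "/sys/class/tty") -> List[str]:
    """Return the virtualizer consoles that have an entry below the
    given sysfs directory (hypothetical helper, not systemd code)."""
    return [
        name
        for name in VIRT_CONSOLES
        if os.path.exists(os.path.join(sys_class_tty, name))
    ]
```

On an affected rescue system, an empty result would suggest the kernel never registered the console, whereas a result containing hvc0 combined with a missing /dev/hvc0 would point at udev instead.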
Well, at least the intention is that the rescue system is just a normal system; so udevd should be running. And afaics it is. It may well be that something is missing but I don't see what.
But udev seems to have trouble starting: failed at step NAMESPACE spawning /usr/lib/systemd/systemd-udevd: invalid argument
(In reply to Dinar Valeev from comment #39) Hmmm ... then something is missed in the namespace/context for starting systemd-udevd like a sticky /tmp
(In reply to Dr. Werner Fink from comment #40)
> (In reply to Dinar Valeev from comment #39)
>
> Hmmm ... then something is missed in the namespace/context for starting
> systemd-udevd like a sticky /tmp

The first error I see in the boot log is this:

<7>[ 75.875642] systemd[1]: Failed to set up the root directory for shared mount propagation: Invalid argument

This seems to be related to:

https://github.com/jumpstarter-io/pkgbuilds/blob/master/systemd/0001-Revert-mount-setup-change-system-mount-propagation-t.patch

Maybe there is a problem making the snapshot root shared?
trying to build iio with patched systemd
(In reply to Dinar Valeev from comment #42)
From the comment

/* Mark the root directory as shared in regards to mount
 * propagation. The kernel defaults to "private", but we think
 * it makes more sense to have a default of "shared" so that
 * nspawn and the container tools work out of the box. If
 * specific setups need other settings they can reset the
 * propagation mode to private if needed. */

I would guess that we need this feature, as containers are in use even for openSUSE ... also, from the function detect_container() in src/basic/virt.c, I'd suppose that

mkdir -p /run/systemd
echo systemd-nspawn > /run/systemd/container

could help in the rescue system.
Nope, doesn't help. Could somebody please take a look at it? vncviewer cabernet.arch:55
(In reply to Dinar Valeev from comment #44)
> Nope, doesn't help.
>
> Could somebody please take a look at it?
>
> vncviewer cabernet.arch:55

I still see the error message:

Failed to set up the root directory for shared mount propagation: Invalid argument

in the journal log.
(In reply to Thomas Blume from comment #45)
> I still see the error message:
>
> Failed to set up the root directory for shared mount propagation: Invalid
> argument
>
> in the journal log.

BTW, I suspect a connection with bug 902226.
Yes, 'mount --make-shared /' gives: / is not mountpoint or bad option
Hm, mount doesn't show a mountpoint for / but /proc/mounts does.
/proc/self/mountinfo misses /
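The observation that mountinfo lacks an entry for / can be checked mechanically. The sketch below assumes the usual mountinfo layout, in which the fifth whitespace-separated field is the mount point; the function names are hypothetical and the text is passed in as a string so the check can be tried on saved output.

```python
from typing import List


def mount_points(mountinfo_text: str) -> List[str]:
    """Extract the mount point (fifth field) from each mountinfo line."""
    points: List[str] = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            points.append(fields[4])
    return points


def has_root_mount(mountinfo_text: str) -> bool:
    """True if some mountinfo entry has "/" as its mount point."""
    return "/" in mount_points(mountinfo_text)
```

A typical use would be `has_root_mount(open("/proc/self/mountinfo").read())`; in the chroot'ed rescue environment discussed here one would expect this to return False.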
So probably the rescue system needs a real root fs and can't just live in a subdir. I'll try this.
Indeed, chroot has some effect.

# chroot /var/lib/containers/g2
# mount -t proc proc /proc
# cat /proc/mounts
proc /proc proc rw,relatime 0 0
# exit

One can work around this by creating a new vfsmount:

# mount --bind g2 g3
# chroot g3
# mount -t proc proc /proc
# cat /proc/mounts
/dev/md4 / xfs rw,relatime,attr2,inode64,noquota 0 0
proc /proc proc rw,relatime 0 0
Nice idea. It gets you a / mountpoint; but --make-shared still fails.
cabernet.arch:55 rebooted with the vanilla rescue image, since I failed to specify install= with a German keyboard.
Ok, rescue is now its own file system and the systemd error is gone. HOWEVER, it now mounts the root-fs ro. Any idea why and how to stop this?
(In reply to Steffen Winterfeldt from comment #54)
> Ok, rescue is now its own file system and the systemd error is gone.
>
> HOWEVER, it now mounts the root-fs ro. Any idea why and how to stop this?

I assume that the rescue system doesn't have an fstab entry for the system root mount. If so, systemd-fstab-generator would fail to remount system root rw after the initrd. You could try with the boot parameters:

rw

and/or:

rootflags=rw
I thought so but there is an entry; but probably a wrong one. How would an entry for a tmpfs / look like?
(In reply to Steffen Winterfeldt from comment #56)
> I thought so but there is an entry; but probably a wrong one.
>
> How would an entry for a tmpfs / look like?

Hm, not really sure either. When you are logged in as root, there is a systemd tmpfs mount for run-user-0.mount. On my machine, it has the following options:

Options=rw,nosuid,nodev,relatime,size=203868k,mode=700

Don't know whether all would apply for system root though. Can you paste the output of:

systemctl cat /
systemctl status /
Ah, I guess we need this: https://github.com/systemd/systemd/commit/b0438462089d1e1460429a57718305de08985908?utm_source=anzwix
Hm, seems so. Postponing the linuxrc changes until we have this in systemd. As a short term solution: would adding a static getty symlink for hvc0 do?
If /dev/hvc0 does not exist (because udevd did not launch), then starting the getty unit likely is not going to be successful either :-/
(In reply to Steffen Winterfeldt from comment #59)
> Hm, seems so.
>
> Postponing the linuxrc changes until we have this in systemd.
>
> As a short term solution: would adding a static getty symlink for hvc0 do?

I'm just building a systemd version with the (ported) patch from comment#58 here:

https://build.opensuse.org/package/show/home:tsaupe:branches:Base:System:bsc937237-systemd/systemd

You might want to give it a try when it is finished.
Hm, I just checked with qemu and sle12-sp1. I also saw the 'shared mount propagation error' in the journal. But /dev/hdc0 _does_ exist. Also a getty was running there and I could login there after fixing /etc/securetty. So why does that not work in openSUSE?
(In reply to Steffen Winterfeldt from comment #62)
> Hm, I just checked with qemu and sle12-sp1.
>
> I also saw the 'shared mount propagation error' in the journal.
>
> But /dev/hdc0 _does_ exist. Also a getty was running there and I could login
> there after fixing /etc/securetty.
>
> So why does that not work in openSUSE?

This is because:

http://cgit.freedesktop.org/systemd/systemd/commit/?id=b3ac5f8cb98757416d8660023d6564a7c411f0a0

is not in systemd-210. Basically the same as if you apply the patch from comment#41.
(In reply to Thomas Blume from comment #63)
> This is, because:
>
> http://cgit.freedesktop.org/systemd/systemd/commit/?id=b3ac5f8cb98757416d8660023d6564a7c411f0a0
>
> is not in systemd-210.
> Basically the same as if you apply the patch from comment#41.

Sorry, my fault, it is in systemd-210, but it is limited to containers:

-->--
if (detect_container(NULL) <= 0)
        if (mount(NULL, "/", NULL, MS_REC|MS_SHARED, NULL) < 0)
                log_warning("Failed to set up the root directory for shared mount propagation: %m");
--<--
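Whether the mount() call quoted above took effect can be read back from mountinfo, where a shared mount carries a `shared:N` tag in the optional fields before the `-` separator. The following is a minimal sketch under that assumption; `propagation` is a hypothetical helper, and it reports only one label even when a mount is both shared and a slave.

```python
from typing import Optional


def propagation(mountinfo_text: str, mount_point: str = "/") -> Optional[str]:
    """Return the propagation type of mount_point according to mountinfo
    text, or None if no entry for that mount point exists."""
    result = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 7 or fields[4] != mount_point:
            continue
        try:
            # Optional fields sit between field 6 and the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue
        tags = fields[6:sep]
        if any(tag.startswith("shared:") for tag in tags):
            result = "shared"
        elif any(tag.startswith("master:") for tag in tags):
            result = "slave"
        elif "unbindable" in tags:
            result = "unbindable"
        else:
            result = "private"
    return result
```

A None result for "/" corresponds to the situation reported earlier in this bug, where the root directory is not a mount point at all and therefore cannot be made shared.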
On tumbleweed I see comment 20; but /dev/hvc0 exists. So, my question still is: why can systemd on sle12-sp1 cope with the situation but the one in tumbleweed can't? And, should systemd be fixed or the setup?
(In reply to Steffen Winterfeldt from comment #62)
> Hm, I just checked with qemu and sle12-sp1.
>
> I also saw the 'shared mount propagation error' in the journal.
>
> But /dev/hdc0 _does_ exist. Also a getty was running there and I could login
> there after fixing /etc/securetty.
>
> So why does that not work in openSUSE?

Yes, it is only visible in openSUSE.
And it's not ppc-specific. udevd itself seems to run just fine; it's systemd doing this namespace separation thingy at service startup running amok or so.
Got reproduced on x86?
yes
(In reply to Steffen Winterfeldt from comment #65)
> On tumbleweed I see comment 20; but /dev/hvc0 exists.
>
> So, my question still is: why can systemd on sle12-sp1 cope with the
> situation
> but the one in tumbleweed can't? And, should systemd be fixed or the setup?

Checked the difference between SLES12SP1 and tumbleweed. /usr/lib/systemd/system/systemd-udevd.service on tumbleweed has this:

MountFlags=slave

and the manpage says:

-->--
MountFlags=
    Takes a mount propagation flag: shared, slave or private, which control whether mounts in the file system namespace set up for this unit's processes will receive or propagate mounts or unmounts.
--<--

I guess this makes systemd-udevd fail. The SLES12SP1 service file doesn't have this setting. The missing hvc0 device is most probably only a side effect of systemd-udevd.service failing to start.
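To compare service files between two systems as done above, a small sketch like the following may help. It assumes a unit file can be read as INI-style text with Python's configparser, which is only an approximation of systemd's own parser; `service_setting` is a hypothetical helper, and the unit text is passed as a string.

```python
import configparser
from typing import Optional


def service_setting(unit_text: str, key: str) -> Optional[str]:
    """Return the value of key in the [Service] section of a unit file,
    or None if the section or key is absent (approximate parsing)."""
    parser = configparser.ConfigParser(
        strict=False,        # unit files may repeat a key
        interpolation=None,  # keep "%" specifiers untouched
        delimiters=("=",),
    )
    parser.optionxform = str  # unit file keys are case-sensitive
    parser.read_string(unit_text)
    if not parser.has_section("Service"):
        return None
    return parser.get("Service", key, fallback=None)
```

Reading /usr/lib/systemd/system/systemd-udevd.service on both systems and calling `service_setting(text, "MountFlags")` would be expected to give `"slave"` on Tumbleweed and None on SLES12SP1, matching the difference described in this comment.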
I confirm:

Index: systemd.spec
===================================================================
--- systemd.spec (revision 223)
+++ systemd.spec (working copy)
@@ -286,6 +286,7 @@
 Patch1098: 1098-systemd-networkd-alias-network-service.patch
 # PATCH-FIX-OPENSUSE hostname-NULL.patch - fix crash on xen build hosts in OBS Marcus Meissner
 Patch1099: hostname-NULL.patch
+Patch1100: mount-flags.patch

 %description
 Systemd is a system and service manager, compatible with SysV and LSB
@@ -618,6 +619,7 @@
 %patch1097 -p1
 %patch1098 -p1
 %patch1099 -p1
+%patch1100 -p1

 #
 # In combination with Patch352 set-and-use-default-logconsole.patch
Index: mount-flags.patch
===================================================================
--- mount-flags.patch (revision 0)
+++ mount-flags.patch (revision 0)
@@ -0,0 +1,11 @@
+Index: systemd-224/units/systemd-udevd.service.in
+===================================================================
+--- systemd-224.orig/units/systemd-udevd.service.in
++++ systemd-224/units/systemd-udevd.service.in
+@@ -21,6 +21,5 @@ Sockets=systemd-udevd-control.socket sys
+ Restart=always
+ RestartSec=0
+ ExecStart=@rootlibexecdir@/systemd-udevd
+-MountFlags=slave
+ KillMode=mixed
+ WatchdogSec=1min

This fixes the rescue system; I'm able to log in from hvc0.
Ok, disabling the mount propagation works. But I'm not sure whether we should do this in general, as it seems to break nspawn (see comment #43). Jan, Werner, Steffen, what do you think?
Time to see what upstream's opinion is.
You don't have to do it in the package; I can just change the setting for the rescue system. I guess systemd doesn't expect to be run chroot'ed where you don't have a mountpoint at the top dir...
Can you not just make a bind mount before chrooting? (And chrooting seems to be a rescue system special case).
Tried that already; see comment 52.
(In reply to Steffen Winterfeldt from comment #76)
> Tried that already; see comment 52.

I guess you will need the patched systemd version from comment#61 to make this work.
Just modified the service file for the rescue system as in comment 71 and that seems to work. Looks like a good solution to me.
Just one thing: why can I login on hvc0 as root even though it's not in /etc/securetty?
BTW, fix will be https://github.com/openSUSE/installation-images/pull/68
(In reply to Steffen Winterfeldt from comment #79) From man:pam_securetty(8) .... It will also allow root logins on the tty specified with console= switch on the kernel command line and on ttys from the /sys/class/tty/console/active.
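The rule quoted from the man page can be modelled in a few lines. This is only a sketch of the stated behaviour, not of pam_securetty's actual implementation: the function name, the parameters, and the choice to treat every console= entry as allowed are assumptions made for illustration.

```python
from typing import Iterable


def root_login_allowed(
    tty: str,
    securetty: Iterable[str],
    cmdline: str = "",
    active_consoles: Iterable[str] = (),
) -> bool:
    """Model: root may log in on tty if it is listed in securetty, named
    by console= on the kernel command line, or an active console."""
    name = tty[len("/dev/"):] if tty.startswith("/dev/") else tty
    if name in set(securetty):
        return True
    cmdline_consoles = {
        word[len("console="):].split(",", 1)[0]
        for word in cmdline.split()
        if word.startswith("console=")
    }
    if name in cmdline_consoles:
        return True
    return name in set(active_consoles)
```

Under this model, a login on hvc0 succeeds even with a securetty that lists only tty1, as long as hvc0 appears among the active consoles, which would explain the behaviour asked about in comment #79.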
Thanks, Werner! as comment 80 is now on its way into tumbleweed, closing bug
This is an autogenerated message for OBS integration: This bug (937237) was mentioned in https://build.opensuse.org/request/show/341131 Leap:42.1 / installation-images-openSUSE
This is an autogenerated message for OBS integration: This bug (937237) was mentioned in https://build.opensuse.org/request/show/341167 Factory / installation-images-openSUSE https://build.opensuse.org/request/show/341169 Leap:42.1 / installation-images-openSUSE