Bug 758570

Summary: today's (3.1.10) kernel-desktop update renders machine unbootable: hangs at "GRUB " blinking cursor
Product: [openSUSE] openSUSE 12.1
Reporter: andreas bittner <abittner>
Component: Basesystem
Assignee: Michael Chang <mchang>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: mchang, sboyce
Version: Final
Target Milestone: ---
Hardware: i686
OS: openSUSE 12.1
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: various configfiles and infos about the 12.2/m3/x86-64 box
             whole /var/log/ directory from the 12.2/m3/x86-64 broken machine

Description andreas bittner 2012-04-23 14:53:18 UTC
User-Agent:       Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.162 Safari/535.19

suse boot trouble has cost me countless cross-country trips to opensuse machines to manually fix things broken by simple updates delivered via yast or zypper up


today I ran another zypper ref and zypper up; there was a kernel-desktop update among the packages, plus udev and some other libraries.

now this machine is not booting any more, it just sits at "GRUB " after the POST.

I tried chroot and looked at menu.lst and menu.lst.old but I didn't really find any trouble there.

mkinitrd fails for me, probably because my chroot is missing some of the special paths and files; it complains that mtab (which points somewhere in proc or so) can't be accessed.

maybe someone can give me a lead on this and, more specifically, what's wrong with this kernel update? :(

it came from 3.1.9......, so this machine is a normal 12.1 x86 installation and was up to date yesterday or some days before, and i only zypper up-ed those offered patches. nothing fancy

i have a separate /boot partition and / and /var and /opt

they are reiser and i think boot is ext3, something like that.


I copied these names from the suse download server. the zypper/history file shows that kernel-desktop rpms got applied, but I don't know whether all of them were, or which are applicable to my system.


 kernel-desktop-devel-3.1.10-1.9.1.i586.rpm   20-Apr-2012 15:18   2.0M
 kernel-desktop-base-3.1.10-1.9.1.i586.rpm    20-Apr-2012 15:17    13M
 kernel-desktop-3.1.10-1.9.1.i586.rpm         20-Apr-2012 15:17    37M

any more details on how to fix this? this is really annoying. it took me more than two hours to reach this location and machine just because of a kernel mess :((

Reproducible: Always

Comment 1 andreas bittner 2012-04-23 15:18:12 UTC
I dug around a bit for chroot, and I came across the --bind stuff, 

so in the 12.1/x86 dvd, i booted rescue, then i first mounted

mount /dev/sda5 /mnt  (my /, first partition in extended)
then i 

mount --bind /dev /mnt/dev

(at a later point mkinitrd complained that some perl bootloader module script or so could not access or find /sys, so I did --bind for /sys as well)

mount --bind /sys /mnt/sys

then

mount /dev/sda2 /mnt/boot     (my /boot) 
mount /dev/sda6 /mnt/var
mount /dev/sda7 /mnt/opt


then I did

chroot /mnt

then

mount /proc


according to some blog:
http://nwrickert2.wordpress.com/2011/10/24/rescuing-susie/
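The rescue sequence above can be collected into one small script. This is only a sketch under assumptions: the device names are the ones from this comment, while the `run` helper and the `RUN` and `TARGET` variables are made up for illustration. It also mounts /proc from outside the chroot with `mount -t proc` instead of running `mount /proc` inside it, which should have the same effect without relying on the installed system's fstab. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the rescue-chroot preparation described in this comment.
# Device names are taken from the comment; adjust them to your own layout.
# RUN, TARGET and run() are illustrative names, not part of any tool.

ROOT_DEV=${ROOT_DEV:-/dev/sda5}   # /
BOOT_DEV=${BOOT_DEV:-/dev/sda2}   # /boot
VAR_DEV=${VAR_DEV:-/dev/sda6}     # /var
OPT_DEV=${OPT_DEV:-/dev/sda7}     # /opt
TARGET=${TARGET:-/mnt}

# Print the command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

prepare_chroot() {
    run mount "$ROOT_DEV" "$TARGET"            # root first, everything else goes below it
    run mount --bind /dev "$TARGET/dev"
    run mount --bind /sys "$TARGET/sys"        # mkinitrd needed /sys here
    run mount "$BOOT_DEV" "$TARGET/boot"
    run mount "$VAR_DEV" "$TARGET/var"
    run mount "$OPT_DEV" "$TARGET/opt"
    run mount -t proc proc "$TARGET/proc"      # replaces "mount /proc" inside the chroot
}

prepare_chroot
```

After reviewing the printed commands, running the script as root with `RUN=1` from the rescue system would perform the mounts; `chroot /mnt` and `mkinitrd` are then run by hand as in the comment.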


anyway, only with the additional --bind /sys did mkinitrd continue a bit further, but at the very end I get the all-too-familiar perl-bootloader .... grub grubdev2unixdev or similar errors... followed by my /dev/disk/by-id/ata-SAMSUNG_xxxxxxx...... hard disk path name and an additional "with 2" after that....

five times this error line. always "with 2"... 

the odd thing is that the dev-disk-by-id-ata-samsung.... stuff doesn't contain the   "-part1"... or "-partX" at the end at all. all these lines only have the main disk name without mentioning any partition.


I already reported a serious bug in the perl-Bootloader module involving this same grubdev2unixdev stuff some weeks back, but not much has happened. that bug hosed my whole system right from the start, when I made the switch to opensuse 12.1 in the first place.....


sigh :(((( what now? what on earth is wrong here. HALP!
Comment 2 andreas bittner 2012-04-23 15:20:35 UTC
the mkinitrd part goes very far, right up to features and bootsplash with two resolutions displayed, and then gives the perl-boot-loader error five times, with no partition mentioned in those lines either...
Comment 3 andreas bittner 2012-04-23 15:35:54 UTC
inside the chroot I could run yast2 bootloader; I deleted all entries there and had it propose a new configuration.


that one was defective as well, as it selected root=tmpfs for both standard and failsafe entries next to the floppy and the memtest.

so I switched to root=/dev/disk/by-id/ata-SAMSUNG_____whatever-part5 .... 

here and now this darn system boots up again using the updated kernel 3.1.10
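A quick way to spot a bad proposal like root=tmpfs is to scan the kernel lines of menu.lst. The function below is a rough sketch, assuming a GRUB legacy config at /boot/grub/menu.lst; the name `find_bad_root` is made up for illustration. It treats /dev/... paths, UUID= and LABEL= as plausible root values and reports anything else.

```shell
# Print "<line number>: root=<value>" for every kernel line in a GRUB legacy
# menu.lst whose root= argument is neither a /dev path nor UUID=/LABEL=.
# Kernel lines without a root= argument (for example memtest) are skipped.
# find_bad_root is an illustrative name, not an existing tool.
find_bad_root() {
    awk '
        $1 == "kernel" {
            root = ""
            for (i = 2; i <= NF; i++)
                if ($i ~ /^root=/) root = substr($i, 6)
            if (root != "" && root !~ /^(\/dev\/|UUID=|LABEL=)/)
                printf "%d: root=%s\n", NR, root
        }
    ' "${1:-/boot/grub/menu.lst}"
}

# Usage inside the chroot:
#   find_bad_root /boot/grub/menu.lst
```

An empty result only means no obviously wrong root= value was found; it does not prove the entry will boot.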

sigh...... this whole boot process, bootloader installation, boot dependability, updating kernel packages initrd, bootloader areas and whatever, needs to be made MUCH MUCH MUCH more robust and dependable.


please suse folks and fellow participants. this is really a huge pain :(((

thanks in advance.
Comment 4 andreas bittner 2012-04-23 15:58:41 UTC
this is my other bug about serious perl-bootloader troubles
https://bugzilla.novell.com/show_bug.cgi?id=748988
Comment 5 andreas bittner 2012-04-25 11:42:40 UTC
yay, or better NAY.... 

I just zypper ref zypper up-ed my other test system with 12.2 milestone3 x86-64 installed a few days ago via iso/usb-key.

I added the oss/nonoss/debug repos from factory to this system, did a zypper ref, and zypper up.

it installed 1100+ packages. it also installed a kernel-desktop package.

the machine "rebooted". NO. actually not. it sits the same way at "GRUB " and the blinking cursor as described in this bug here with the production machine on 12.1/x86.


there is something seriously messed up with the kernel updates, grub and bootloaders.


the 12.2/milestone3 -> factory machine is the one from this bug:
https://bugzilla.novell.com/show_bug.cgi?id=758499

so this machine has very simple default proposed partition layout. i booted the milestone3 usb-key/iso and it offered me to partition the single sata harddisk existing in this machine as 


swap
/
/home

all with ext4 i think is the default proposal. nothing else.
no other operating systems, no dualboot, no fancy stuff, no nothing. linux testing machine. one hard disk. very simple.


so the bootloaders, scripts, grub and kernel-package updates even mess up this very simple partition layout, so that grub doesn't do anything at all any more and refuses to boot the kernel-package upgraded system.


:(((
Comment 6 andreas bittner 2012-04-25 12:02:27 UTC
sorry, I don't have a usable rescue system available right now, as I just found out in 

https://bugzilla.novell.com/show_bug.cgi?id=759074


so I can no longer check what is wrong with this 12.2/milestone3/x86-64 testing system, or provide logs showing what the update did to the kernel/bootloader/grub parts to cause the mess.
Comment 7 andreas bittner 2012-04-25 17:37:33 UTC
Created attachment 488035 [details]
various configfiles and infos about the 12.2/m3/x86-64 box

this machine is also hanging at "GRUB " and doing nothing any more after a vanilla milestone3 installation, adding the oss/nonoss/debug factory repositories once, and running zypper ref and zypper dup.

kernel package amongst many others (1158 patches total I think) get installed, machine needs reboot. then never comes back alive again. hangs with "GRUB " and blinking cursor....
Comment 8 andreas bittner 2012-04-25 17:41:13 UTC
Created attachment 488036 [details]
whole /var/log/ directory from the 12.2/m3/x86-64 broken machine

please dig through the /var/log content attached here as there might be or hopefully should be the hint what goes wrong with the kernel update packages recently.

this is from 12.2/m3, but I started this bug originally on a normally working 12.1/x86, and the kernel update from earlier this week or end of last week killed that machine in probably the very same way as this milestone3 machine here.

they are/were both hanging at the "GRUB " stage..... although 12.2/m3 is GRUB2 and 12.1/x86 is older GRUB1 I guess.....
Comment 9 andreas bittner 2012-04-25 18:53:55 UTC
ok a small update for 12.2/m3 (factory)

i was able to repro this bug. I reinstalled the milestone3/x64 from the usbkey.

then i added a single repo: factory-oss

zypper ref

then i updated only the kernel-desktop .rpm package first. rebooted. system came back fine.

then i did another zypper ref and zypper up finally.

it installed 11xx packages altogether and tried to reboot.

now the machine hangs at "GRUB " cursor blinking....


it's not the kernel-desktop rpm update that kills the machine; it must be some of the other relevant packages, related to grub, the bootloader or whatever else is involved.

please do look into this.
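To narrow down which of the 11xx packages could be responsible, the zypper history can be filtered for boot-related installs. This is a sketch that assumes the usual pipe-separated layout of /var/log/zypp/history (date, action, package name, version, ...); the function name and the list of name patterns are guesses, not a complete list of packages that touch the boot path.

```shell
# List installs of packages whose names look boot-related, taken from the
# zypper history file. Assumes pipe-separated fields:
#   date|action|name|version|arch|...
# boot_related_installs and the pattern list are illustrative only.
boot_related_installs() {
    awk -F'|' '
        $2 == "install" && $3 ~ /grub|[Bb]ootloader|kernel|mkinitrd|udev/ {
            print $1, $3, $4
        }
    ' "${1:-/var/log/zypp/history}"
}

# Usage on the affected system (or inside the rescue chroot):
#   boot_related_installs /var/log/zypp/history
```

Comparing the output from the run that still booted with the run that hung at "GRUB " should shorten the list of suspects.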

once again: machine has simple one sata hdd. milestone3 proposes

swap
/ (20gigs)
/home (all the rest)

/ and /home are ext4.

that's all.
Comment 10 Kun Kun Zhang 2012-04-26 02:10:55 UTC
Hi, could you please help to have a look at this? Feel free to reassign it. Thank you very much. ;)
Comment 11 andreas bittner 2012-04-26 07:15:42 UTC
some other opensuse tester also reported a milestone4 grub stalling bug...

https://bugzilla.novell.com/show_bug.cgi?id=759224

there has got to be some connection between the opensuse 12.1 and opensuse 12.2 problems
Comment 12 Michael Chang 2012-04-26 08:02:46 UTC
*** Bug 759224 has been marked as a duplicate of this bug. ***
Comment 13 Sid Boyce 2012-04-26 11:28:30 UTC
Michael, I updated bug 759224 with the following information showing how I recovered my system by going back to 12.2 Milestone 3.
-----------------------------------------------------------------------------
I booted from a 12.2 Milestone 2 DVD into rescue, the Milestone 3 DVD fails
rescue.
I mounted /dev/sda1 as /mnt
chroot /mnt
grub2-install complained it couldn't find / (as I remember).

Booted 12.2 Milestone 3 DVD and chose install --> upgrade, included factory
repos.
It's up and running as Milestone 3.

It's backed off some of the changes, e.g. gcc is now back at 4.6.3 whereas it
was 4.7.0 previously.

Awaiting your fix before doing zypper dup again.
Comment 14 Sid Boyce 2012-04-26 13:30:42 UTC
Doing a zypper dup now and will try the commands suggested.
Comment 15 Bernhard Wiedemann 2012-04-27 11:00:16 UTC
This is an autogenerated message for OBS integration:
This bug (758570) was mentioned in
https://build.opensuse.org/request/show/115825 Factory / perl-Bootloader
Comment 16 Michael Chang 2012-07-05 06:58:19 UTC
As the fix was already in factory, I'd like to close it.

Feel free to reopen if you still have the problem. Thanks.