Bugzilla – Full Text Bug Listing
| Summary: | error in service module - read-only file system | | |
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.0 | Reporter: | Dan Gahlinger <dgahling> |
| Component: | Bootloader | Assignee: | Philipp Thomas <pth> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | | |
| Priority: | P5 - None | CC: | cerebus_8, coolo, jplack |
| Version: | Beta 1 | | |
| Target Milestone: | --- | | |
| Hardware: | x86-64 | | |
| OS: | openSUSE 11.0 | | |
| Whiteboard: | | | |
| Found By: | --- | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | second screen capture of boot process; /var/log from the test system trying to boot 11.0 beta1 | | |

Description
Dan Gahlinger
2008-04-21 18:35:45 UTC
There is no 11.0 beta3 release, we are on beta1 right now. What are the specific error messages you get, and what exact release are you using?

I'm sorry, it is 11.0 beta 1, bad typo there. I was using this: http://download.opensuse.org/distribution/11.0-Beta1/iso/cd/openSUSE-11.0-Beta1-KDE4-LiveCD-i386.iso

As for error messages, that's the error I get when I try to log in after booting in "normal" mode, i.e. booting normally. I enter username "root" and the proper password, and it says "error in service module" and asks for the username again. I can't log in; every user that I set up has the same problem. As mentioned, in fail-safe mode it hangs, and single-user mode doesn't help either. The file system is read-only, so no errors or logs are created. If you can tell me how to get the boot messages, I'll gladly post the boot log here.

One further note: the install did NOT complete. It got through to the first reboot, and that's where I'm stuck. All the issues happened when it rebooted to do the next part. I've tried to reinstall and got the same thing. I also tried the full GM x86-64 DVD and have the same problems.

Created attachment 210958 [details]
second screen capture of boot process
capture of boot process early-on just before all the read-only errors.
Ok, thanks, this doesn't look like a kernel issue, but a boot-time issue, reassigning...

Sorry, I didn't see that option when opening the bug.

Your bug is heavily confusing. The KDE Live CD does not install in beta1, but in your initial report you talk about the GM DVD. Please specify if you really installed from DVD - if so, provide your yast logs.

Not confusing at all. I used BOTH the GM DVD and the LIVE CD to try to get something to work. Neither of them works. But most important is the GM DVD. How can I provide yast logs when the filesystem is READ ONLY? Nothing gets written to the logs. Have you seen the screen capture I attached previously? BTW, I tried downloading a fresh image of the live CD and burned it; it also hangs these systems. Can we focus on the GM DVD, which has the "error in service module" and the read-only file system? If we can fix these problems, I'm sure the lock/hang problem of the live CD will be easier to fix.

If the installation works, you should be able to boot the rescue system from DVD and mount the target system from it to grab the yast logs.

Created attachment 212009 [details]
/var/log from the test system trying to boot 11.0 beta1
This is the entire /var/log tar/gzip of the one test system I have that has 11.0 beta 1 installed but cannot login and has some problems.
Root device: /dev/disk/by-id/scsi-SATA_Maxtor_6V160E0_V301EPPG-part1 (/dev/sda1) (mounted on / as reiserfs)

If you boot with init=/bin/sh - do you see something strange in dmesg related to the file system?

I won't be able to test this until Monday. I'll test beta2 since it's out now, and update here once I've done that.

Ok, this is bad. openSUSE 11.0 beta 2 does EXACTLY the same thing! This is a show stopper if I ever saw one. I did the boot with init=/bin/sh, but nothing really out of the ordinary pops up on the file system. It says fsck is clean, mounting read-only. The only thing that stands out is a statement about invoking /dev/sda2 manually and a user retry for the same statement. I'll see if I can write it down and paste it in here later.

On the "up" side, I reinstalled using the Grub/ext3 combo and that works perfectly. So something about Lilo/reiser is really messing things up. I'm going to write down those boot lines, then reinstall, testing grub/reiser and lilo/ext3, and see which one (or both?) causes the issue. We need lilo (for some really weird reasons).

Here is the boot log I see on console, keep in mind copied by hand:
...
Trying manual resume from /dev/sda2
Invoking userspace resume from /dev/sda2
resume: libgcrypt version: 1.4.0
Trying manual resume from /dev/sda2
Invoking in-kernel resume from /dev/sda2
PM: starting manual resume from /dev/sda2
Waiting for device {ID} to appear: ok
fsck 1.4.0.8
.
{no errors - normal fsck logs}
.
filesystem is clean
fsck succeeded, mounting root device read-only
mounting root {ID same as above}
...
Note: there is no file /etc/mtab and /etc/fstab looks normal
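With /etc/mtab missing, the kernel's own mount table is another place to confirm how the root device is mounted. A minimal Python sketch, assuming text in the usual /proc/mounts layout (device, mount point, type, comma-separated options) and assuming a Python interpreter is reachable from wherever the table is read. The function names and the sample text are illustrative only and are not taken from the logs in this report:

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list of the last entry mounted at mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        if fields[1] == mount_point:
            # A later entry for the same mount point shadows an earlier one.
            options = fields[3].split(",")
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if mount_point is listed and its options include 'ro'."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


# Made-up sample in /proc/mounts format (not from the reporter's machine).
SAMPLE = """\
rootfs / rootfs rw 0 0
/dev/sda1 / reiserfs ro 0 0
proc /proc proc rw 0 0
"""

if __name__ == "__main__":
    print(is_read_only(SAMPLE))
```

On a live system the same check could be fed with `open("/proc/mounts").read()` in place of the sample string.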
I am trying grub/reiser next and will post notes here
This is weird. grub/reiser works perfectly! No issues at all! I am testing Lilo/ext3 next. But it looks like our focus should be on Lilo now as the most likely culprit! When I change the boot loader (during the initial install) from Grub to Lilo, I always choose the option "propose changes". That has always worked in the past; even 10.3 works perfectly. But now, as of 11.0, it's not working. I'll provide info on the lilo/ext3 combo shortly.

Well, lilo/ext3 has the exact same problem and messages using init=/bin/sh as the above.
So Lilo is definitely the issue.
Please escalate to the Lilo team for debugging ASAP!
BTW, a note on the above: /dev/sda2 is SWAP,
and the "{ID}" mentioned above is for /dev/sda1, which is the root partition.
I always build systems this way. No /home or /boot, it's a waste.
This should be fixed with beta2 - dup of 380781. Please retest.

*** This bug has been marked as a duplicate of bug 380781 ***

I *DID* test with beta2, did you not read my notes? ALL of these problems exist with Beta 2 as well!! All of the above tests were done under BETA 2! Repeat: lilo/reiser or lilo/ext3 FAILS on openSUSE 11.0 beta2 with "error in service module" and a read-only filesystem. Booting with init=/bin/sh shows the SAME problems as in beta 1.

This is NOT a duplicate of 380781, as this is not a yast issue, it is a Lilo issue.

And the lilo.conf looks good?

Yes, it looks perfect, exactly like other working systems. I have no idea what's going on. I will try one thing though, just out of desperation, and post an update here.

No go. Although I did find a minor issue/fix with lilo on 11.0 beta 2: lilo.conf contains vga=0x31s, which SHOULD be vga=0x317 instead. That's how 10.3 did it; not sure why this changed. But this really doesn't affect the bug I'm posting. I don't see any difference (other than that one) in the lilo.conf. I'm out of ideas, is there anything you want me to test?

If you wish to recreate this, just boot the DVD, do a BASE install (minimal), choose KDE4 as the desktop and ext3 or reiserfs as the file system (doesn't matter), go into the booting menu, select the boot manager tab, and choose LILO. It will pop up a box; select "propose new changes" (or something like that). In the boot loader screen after that, make sure you select the MBR (first option). Then install as normal and let it run. When it finishes the install and reboots, you'll see lots of "read-only file system" messages, and booting with init=/bin/sh behaves as mentioned above. So this problem is consistently reproducible, so far as I can see.

I'll try to reproduce tomorrow. For the time being I'm setting this to major as this bug is definitely not critical.

For reference, in case you missed it, it's x86-64 on an Intel processor. There are only about 3 days until Beta 3, is this bug going to get fixed?
I fail to see how having a totally non-bootable system is not critical, but maybe that's just me.

We can take out the lilo option in about no time.

LILO is a *critical* option for us and our company. Removing Lilo renders openSUSE practically useless to us. We have very specific needs, and I expect there are millions of other users who depend on it as well. Lilo is not something you can just casually remove and expect users to be ok with it. This would be a major change to the distribution; I expect you'd have to take it to committee. I think a better option would be to fix it. What's happened from 10.3 to 11.0 that breaks this functionality?

OK, I think I can reproduce this and am working on locating the bug.

Dan, please check /etc/lilo.conf. My guess is that this contains the line read-only, and that would be the culprit.

This is a known issue and not a LILO fault. I'll make this a duplicate.

*** This bug has been marked as a duplicate of bug 381669 ***

Yes, agreed, this is a duplicate of bug report 381669. I see the "read-only" in lilo.conf on 10.3, which is carried over to 11.0. Hopefully that other bug thread will get this mess resolved.
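The two lilo.conf observations in this thread (the read-only line and the malformed vga= value) can be checked mechanically. A minimal Python sketch, assuming plain `key = value` lines with `#` comments; the function names and the sample configuration are made up for illustration and are not the reporter's actual file. As noted above, read-only also appears in the working 10.3 configuration, so that finding is informational rather than a verdict:

```python
# Keywords accepted by LILO's vga option besides a numeric mode.
VGA_KEYWORDS = {"normal", "extended", "ext", "ask"}


def _is_number(value):
    """True for decimal or 0x-prefixed hex integers."""
    try:
        int(value, 0)
        return True
    except ValueError:
        return value.isdigit()  # digits with a leading zero


def check_lilo_conf(text):
    """Return (line_number, message) findings for lilo.conf text."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip().lower(), value.strip()
        if key == "read-only" and not value:
            findings.append((number, "read-only: root is mounted read-only at boot"))
        elif key == "vga":
            if value.lower() not in VGA_KEYWORDS and not _is_number(value):
                findings.append((number, f"vga={value}: not a number or known keyword"))
    return findings


# Hypothetical sample; only the vga and read-only lines mirror the thread.
SAMPLE = """\
boot = /dev/sda
vga = 0x31s
image = /boot/vmlinuz
    label = Linux
    root = /dev/sda1
    read-only
"""

if __name__ == "__main__":
    for number, message in check_lilo_conf(SAMPLE):
        print(f"line {number}: {message}")
```

Running it against the sample flags line 2 (the vga value) and line 6 (read-only); a file with vga=0x317 and no read-only line yields no findings.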