Bugzilla – Full Text Bug Listing
| Summary: | Recent HP workstations need pci=nommconf - Only needed with internal graphics card, external graphics card added the machines boot | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 11.0 | Reporter: | Thomas Renninger <trenn> |
| Component: | Kernel | Assignee: | Greg Kroah-Hartman <gregkh> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | asadeghpour, bernard.delley, bjorn.helgaas, bryan.christ, coolo, forgotten_FOUTW3E5Ow, henry.su, otto.hase, ric, richard.zhao, sbahling, trenn |
| Version: | Alpha 2 | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | This change from 2.6.17 to 2.6.18 breaks things | ||
|
Description
Thomas Renninger
2007-09-18 13:16:22 UTC
10.2 is also not booting; it hangs at exactly the same point. The machine continues booting when pci=nommconf is passed (I didn't test how far, but it looks good). This very much smells like a BIOS issue.

If there are no objections, I am going to upgrade the BIOS to a much higher revision now. If we still have problems at that point, I am going to add Andi; he knows more about mmconf issues...

Adding Andi -> he might be interested.

Reducing severity: this has nothing to do with the MSI boards, and a workaround has been found... should we close this bug?

It could be a duplicate of this one: https://bugzilla.novell.com/show_bug.cgi?id=331027 I just did not find the time...

I close it for now. Thomas, Christian or whoever will install the machine at some time:
- A BIOS update certainly would be a good idea.
- If the initial boot hangs, it should install fine with the pci=nommconf parameter.
- Once installed and with the kernel upgraded to the latest one, it would be worth trying without the boot option; if it works, this was a duplicate (see above).

If someone finds the time, it would be great to add the machine type to the bug's title. That makes it easier for people with the same machine and problem to get 10.3 running...

Stefan now also has an HP machine showing this issue. Stefan, could you please search for a BIOS update for that machine and try again without pci=nommconf? This looks important... we probably could also get some help from HP here.

Unfortunately I can't provide any feedback as long as no BIOS update is available. :-( I will do so once I can.

I'll provide feedback once I can test a BIOS update.

Thanks Scott for the new BIOSes. Unfortunately they did not help; I will start digging into the mm config table, or wherever this leads, asap.

Unfortunately I can't reproduce this issue any more on my machine, neither with the openSUSE 10.3-x86 default-kernel nor with the current STABLE-x86_64 default-kernel (2.6.24-rc8-git2-3-default).
Latest changelog entry of 2.6.24-rc8-git2-3-default: * Mo Jan 21 2008 aj@suse.de - Remove unused config/s390/rt.

Stefan Dirsch discovered something important: adding an external PCI Express graphics card makes the machine boot fine. I verified this on the second HP desktop machine with another PCI Express graphics card. A network PCIe card does not make it boot. I can try to extract ACPI tables (it should even be possible to extract them on a broken machine)... But I expect now is the right time to get some HP developers on board. Scott?

Bjorn->I am still on the pnp patches..., I updated and discovered yet another problem..., there will be a post really soon...

Which system has the issue? The dc7800 or dc5800 or both? The dc7800 is certified for SLE10 SP1 and I installed SLED10 SP1 several times without issues. Does this only show up with later kernels?

Yes, both. No, it is only 10.3 and newer (up to 11.0 Alpha1).

I tried a vanilla kernel with the same/similar config and it boots. So it seems to be a SUSE-specific patch; I will try to find it...

Not sure, I thought that booting with kernel-vanilla (suse rpm package) I had a working kernel..., but it seems I was wrong or it was something else...

I now went back from the latest kernel 2.6.24 to 2.6.17. 2.6.17 was the first kernel booting. I used plain vanilla sources (not the very latest stable ones, hmm, should have done this, but it should not really matter). I mainly used SUSE-derived kernel configs, but also used the vanilla default kernel config on 2.6.18 (also not working). 2.6.17 on the serial console shows:

ACPI: bus type pci registered
PCI: BIOS Bug: MCFG area is not E820-reserved
PCI: Not using MMCONFIG.

This seems to be the reason why this one is booting.

So this very much smells like an HP bug. I just wonder: (nearly?) every HP workstation with an internal graphics card seems to be affected (I saw this on 3 of 3 systems...), and the bug in mainline goes back to kernel 2.6.17 (the first one that works when going back).
Can this really be?!? Even if this seems to be a BIOS bug, I expect we want to have the check that prints

PCI: BIOS Bug: MCFG area is not E820-reserved
PCI: Not using MMCONFIG.

in our kernels? I am going to dig it out..., hmm, this should be something for Arjan, who does not have a Novell Bugzilla account. I'll come back.

Bryan, has HP seen this issue with later upstream and distro kernels? (This is related to the firmware updates I requested; the updates did not help.)

Created attachment 192558
This change from 2.6.17 to 2.6.18 breaks things
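
For readers unfamiliar with the 2.6.17 boot messages quoted in this report, the kind of check they imply can be sketched as follows. This is only an illustration under stated assumptions, not the kernel's code: the names `E820Entry`, `all_mapped` and `check_mcfg`, the 2 MiB window size, and any addresses passed in are hypothetical. The idea is that MMCONFIG is only used if the MCFG area reported by the BIOS lies entirely inside E820 regions marked as reserved.

```python
from typing import List, NamedTuple, Tuple

# E820 region types from the BIOS memory map (1 = usable RAM, 2 = reserved).
E820_RAM = 1
E820_RESERVED = 2

# Assumption for this sketch: only the first 2 MiB of the MCFG area is checked.
MMCONFIG_APERTURE_MIN = 2 * 1024 * 1024


class E820Entry(NamedTuple):
    addr: int  # start address of the region
    size: int  # length of the region in bytes
    type: int  # E820_RAM, E820_RESERVED, ...


def all_mapped(e820_map: List[E820Entry], start: int, end: int,
               wanted_type: int) -> bool:
    """Return True if [start, end) is fully covered by entries of wanted_type."""
    for entry in sorted(e820_map, key=lambda e: e.addr):
        entry_end = entry.addr + entry.size
        if entry.type != wanted_type:
            continue
        if entry_end <= start or entry.addr >= end:
            continue  # entry does not overlap the remaining range
        if entry.addr > start:
            return False  # gap before this entry: part of the range is unmapped
        start = entry_end  # covered up to the end of this entry
        if start >= end:
            return True
    return False


def check_mcfg(e820_map: List[E820Entry], mcfg_base: int) -> Tuple[bool, List[str]]:
    """Decide whether MMCONFIG would be used; return (usable, log messages)."""
    window_end = mcfg_base + MMCONFIG_APERTURE_MIN
    if all_mapped(e820_map, mcfg_base, window_end, E820_RESERVED):
        return True, []
    return False, [
        "PCI: BIOS Bug: MCFG area is not E820-reserved",
        "PCI: Not using MMCONFIG.",
    ]
```

With a memory map that does not reserve the MCFG area, `check_mcfg` returns `False` plus the two messages seen on the serial console, which corresponds to the fallback that lets 2.6.17 boot on these machines.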
Scott, one of the engineers in ISS worked on a patch for this and I believe his work has been submitted upstream. If not, maybe he could send you the patch. Would you like for me to investigate?
Bryan

Yes, please let us know the status. Thanks.

Scott, here is what Tony had to say: "My patch did get pushed upstream, but it has not been accepted. The submission of my patch generated a lot of discussion on LKML, and several alternatives have been suggested, one which I must admit is better than mine. The maintainer, Greg Kroah-Hartman is leaning towards a patch that was admitted into the -mm stream, but nobody seems to like that one. It is awful, in that it requires drivers to make a kernel call to enable MMCONFIG. My patch was accepted into RH and is working as advertised. If you give me a pointer to the kernel sources for Novell, I would be happy to create a patch for it, if they are willing to take it while waiting for the upstream resolution. If you log onto the LKML and search for MMCONF since Dec 19, 2007, you will see all the discussion about it."

Puhh, I really don't want to go through these hundreds of posts if not really necessary. Greg, can you take care of this? A two-sentence explanation/summary of the problem would be nice :) Is there something we could/should add to 10.3 (pci=nommconf works, so this is not severe; not adding anything sounds better than risking breakage?). I am readjusting the product to 11.0 now; this is more important IMO.

{sigh}
This is a long and complex path of patches, broken patches, broken hardware, and lots of other mess.
In short, this is not yet resolved upstream, and I would strongly hesitate to accept any patch into our kernel tree yet.
But, as this is an 11.0 issue, I think it will be resolved by then (hopefully), so I'll just take the bug and continue to work on the upstream issue.
And no, I don't want to take the Red Hat patch, because that is not a potential one for upstream, there are 2 others that might be used instead.

*** Bug 246646 has been marked as a duplicate of this bug. ***

*** Bug 328471 has been marked as a duplicate of this bug. ***

*** Bug 362588 has been marked as a duplicate of this bug. ***

Is now fixed in our kernel-of-the-day, will be in the next alpha. If not, please reopen this bug.