Bugzilla – Bug 202133
PPC: Xorg does not work on Pegasos PPC
Last modified: 2006-11-10 13:36:16 UTC
I installed factory from this weekend (about 08.23.06), and X does not start any more on my Pegasos PPC. Here is a note from a friend, who helped to solve the problem on Fedora: "X.org 7.1 has new PCI scanning code which has no concept of PCI domains. So.. if you have two PCI buses on a machine and a graphics card is not installed in the first one.. it simply doesn't find it (or at least it does, but if it is on bus 01 then it doesn't remember this part, and later code assumes it is on bus 00, or something like that)"
The PCI domain stuff should be fixed with Alpha4. Please retest when it's available (in about 2 weeks).
Any news on if this fix is in yet?
Yes, the PCI domain support is in Alpha4, but still needs to be tested.
What about a patch for all those guys who aren't just testing SUSE? :D
Probably you only need the xorg-x11-server package from Factory.
(In reply to comment #1)
> The PCI domain stuff should be fixed with Alpha4. Please retest when it's
> available (in about 2 weeks).
Tested with Alpha4. It still does not work. To share some good news as well: for the first time ever, my monitor was set to the optimum resolution and refresh rate during installation.
Please note the difference between 'lspci' and 'X -scanpci':
suse102a4:~ # lspci
00:00.0 Host bridge: Marvell Technology Group Ltd. MV64360/64361/64362 System Controller (rev 03)
00:01.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
00:05.0 USB Controller: NEC Corporation USB (rev 43)
00:05.1 USB Controller: NEC Corporation USB (rev 43)
00:05.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
00:0c.0 ISA bridge: VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge] (rev 10)
00:0c.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:0c.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1e)
00:0c.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1e)
00:0c.4 Bridge: VIA Technologies, Inc. VT8235 ACPI (rev 10)
00:0c.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 AC97 Audio Controller (rev 40)
00:0c.6 Communication controller: VIA Technologies, Inc. AC'97 Modem Controller (rev 20)
00:0d.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 51)
0001:01:00.0 Host bridge: Marvell Technology Group Ltd. MV64360/64361/64362 System Controller (rev 03)
0001:01:08.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01)
0001:01:08.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (Secondary) (rev 01)
and
suse102a4:~ # X -scanpci
Probing for PCI devices (Bus:Device:Function)
(0:0:0) Marvell Technology Group Ltd. MV64360/64361/64362 System Controller
(0:1:0) unknown card (0x1106/0x3044) using a VIA Technologies, Inc. IEEE 1394 Host Controller
(0:5:0) unknown card (0x1033/0x0035) using a NEC Corporation USB
(0:5:1) unknown card (0x1033/0x0035) using a NEC Corporation USB
(0:5:2) unknown card (0x9710/0x1906) using a NEC Corporation USB 2.0
(0:12:0) VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge]
(0:12:1) VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE
(0:12:2) unknown card (0x0925/0x1234) using a VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
(0:12:3) unknown card (0x0925/0x1234) using a VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
(0:12:4) VIA Technologies, Inc. VT8235 ACPI
(0:12:5) VIA Technologies, Inc. VT82C686 AC97 Audio Controller
(0:12:6) VIA Technologies, Inc. AC'97 Modem Controller
(0:13:0) unknown card (0x3065/0x1106) using a VIA Technologies, Inc. VT6102 [Rhine-II]
I'll also attach yast2 and sax2 logs.
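The mismatch between the two listings above can be checked mechanically. The following is a minimal sketch, not part of the X server or of any SUSE tooling: it assumes lspci prints hexadecimal addresses with an optional four-digit domain prefix and that 'X -scanpci' prints decimal (bus:device:function) triples, as both do in the output quoted in this report. The function names and the shortened sample strings are hypothetical.

```python
import re

# lspci address at the start of a line: "[dddd:]bb:dd.f", all fields hexadecimal.
LSPCI_ADDR = re.compile(r"^(?:([0-9a-f]{4}):)?([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])")
# 'X -scanpci' address: "(bus:device:function)", all fields decimal.
SCANPCI_ADDR = re.compile(r"\((\d+):(\d+):(\d+)\)")


def lspci_addresses(text):
    """Return a set of (domain, bus, device, function) tuples from lspci output."""
    found = set()
    for line in text.splitlines():
        m = LSPCI_ADDR.match(line.strip())
        if m:
            domain, bus, dev, func = m.groups()
            # A missing domain prefix is taken to mean domain 0.
            found.add((int(domain or "0", 16), int(bus, 16), int(dev, 16), int(func, 16)))
    return found


def scanpci_addresses(text):
    """Return a set of (bus, device, function) tuples from 'X -scanpci' output."""
    return {tuple(int(g) for g in m.groups()) for m in SCANPCI_ADDR.finditer(text)}


def missing_from_scan(lspci_text, scanpci_text):
    """List lspci devices whose bus:device:function the X scan did not report."""
    seen = scanpci_addresses(scanpci_text)
    return sorted(a for a in lspci_addresses(lspci_text) if a[1:] not in seen)


# Shortened excerpts of the listings quoted in this report.
LSPCI_SAMPLE = """
00:0c.0 ISA bridge: VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge] (rev 10)
00:0d.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 51)
0001:01:08.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01)
"""
SCANPCI_SAMPLE = """
Probing for PCI devices (Bus:Device:Function)
(0:12:0) VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge]
(0:13:0) unknown card (0x3065/0x1106) using a VIA Technologies, Inc. VT6102 [Rhine-II]
"""

if __name__ == "__main__":
    for domain, bus, dev, func in missing_from_scan(LSPCI_SAMPLE, SCANPCI_SAMPLE):
        print("not seen by X: %04x:%02x:%02x.%d" % (domain, bus, dev, func))
```

Run against the full listings, this kind of comparison should single out exactly the devices behind the second host bridge (domain 0001), which is where the Radeon sits.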
Can you also attach the output of 'lsprop -R /proc/device-tree'?
Well, the devices on Domain 1 are still missing.
Created attachment 98271 [details] device-tree
Created attachment 98272 [details] sax2 logs
Created attachment 98273 [details] yast2 logs
Reassigning to our current PCI domain specialist. :-)
Peter, could you also run 'find /proc/bus/pci/ -ls' and 'find /sys/bus/pci* -ls'? I changed the scanning code (as domain info is not available in /proc, which the old code used), and might have done something stupid.
Created attachment 98418 [details] find /proc/bus/pci/ -ls
Created attachment 98419 [details] find /sys/bus/pci* -ls
There is a Pegasos system in the office from max/werner; it's currently offline or has a dynamic DHCP IP.
This problem was probably fixed up-stream. Please take a look: https://bugs.freedesktop.org/show_bug.cgi?id=7248
No, it's not. But thanks for pointing to the upstream bug report.
Ok, according to a first test the PCI domain support does work now. The cards are correctly detected. However, the machine crashes deep down in RADEONPreInit().
Seems like legacy IO isn't trivially possible on that machine in other domains than domain 0. Domain support has always been broken in one or another respect, unfortunately. Egbert is much more fluent in working in PCI space, so I assign this bug to him.
Grmbl. Peter, Egbert, sorry, comment #19 and #20 correspond to testing with IA64. Wrong bug.
The lack of legacy IO is the same on Pegasos though; everything is mapped through the "ISA bridge" which is on the PCI bus and in PCI IO space. There is no VGA framebuffer at 0xa0000, and certainly no registers below 1k either. However the Radeon stuff should work fine. Does it? Or are PCI domains still broken? I or Peter (reporter) can test a PPC kernel if you have it.
We do have PPC machines here, however, there hasn't exactly been extensive testing lately due to other more urgent projects. I'll look into this issue next week. If I cannot reproduce any of the issues, I'll come back to you. The radeon needs access to VGA registers, BTW. I don't think it will work without any legacy I/O at all.
The Radeon driver should properly resolve the VGA registers it wants into PCI I/O space just fine.. however if it messes up because of the PCI domain stuff, it might not know how.
It still does not work with the first installable Factory after Alpha5 (10/15/2006).
That's bad news. Can you post the latest 'rpm -q --changelog xorg-x11-server' entry? There have been a couple of fixes lately, but I guess you tested the right package.
See below! Also, please take a look at http://www.ppczone.org/forums/viewtopic.php?p=5171#5171 as Fedora has, AFAIK, the same Xorg version, and the problem seems to be fixed there (I have not yet had a chance to verify).
factoryppc:~ # rpm -q --changelog xorg-x11-server
* Mon Oct 09 2006 - sndirsch@suse.de
- glx-align.patch:
  * reenabled -D__GLX_ALIGN64 on affected plaforms (X.Org Bug #8392)
- Fixes to p_pci-domain.diff (Bug #197572)
  * internal domain number of by one (was supposed to be a cleanup, but other code dependet on this semantics)
  * fixed another long-standing of-by-1 error
- p_enable-altrix.diff (Bug #197572)
  * This additional patch enables the build of the altrix detection routines, which have apparently not been included in Xorg 7.1 yet. This patch needs a autoreconf -fi after application.
* Mon Sep 18 2006 - sndirsch@suse.de
- updated to Mesa 6.5.1
* Wed Sep 13 2006 - sndirsch@suse.de
- disable-fbblt-opt.diff:
  * Disable optimization (introduced by ajax) due to a general vesa driver crash later in memcpy (Bug #204324)
Thanks! I assume Redhat doesn't have the updated domain support we're trying to fix right now. I'll check their packages.
I haven't checked the logs for Pegasos, but cranberry.suse.de cannot start X either with BusID "PCI193@1:0:0". Maybe it's the same bug, maybe I have to open a different one.
I have made some progress with IA64 (bug #197190), PPC is next. Olaf, do you have a machine I can test on (and potentially crash it)?
cranberry.suse.de would work.
Apparently the ppc Xserver is built for 32bit addresses only. cranberry is running a ppc64 OS, but the 32bit Xserver isn't capable of working with 64bit addresses. Olaf, did that combination (64bit OS w/ 32bit Xserver) ever work? It seems like the Xserver is using (hardcoded) unsigned long for PCI addresses all over the place in the linux PCI code...
AFAIK ppc64 has always been completely 32bit userland on top of a 64bit kernel.
I'm sure it never worked, but without pci domain support the mga driver did at least something in sles10.
For ppc64 - at least on PowerMac G5's and anything using IBM system controllers (Maple, p185, JS20, JS21, dunno about POWER or anything without the Power Architecture 32/64-bridge) I think memory has been mapped traditionally such that you have 2GB of real memory before PCI space, and then PCI space literally sits before the 4GB barrier minus 256MB.
http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/B2BBA0230B9B9BF4872570EB006DAB7B Section 12.1 gives the memory map.
So 32-bit physical addresses for PCI/PCIe/whatever will Just Work (tm) for this platform. I could be wrong, like I said, about POWER5 or POWER6 or anything that does not implement the magic 32-bit bridge specification, but I am sure they all do :)
(In reply to comment #34)
> I'm sure it never worked, but without pci domain support the mga driver did at
> least something in sles10.
Please define 'something'. Segfault? Break? Crash? Display a black screen? Work?
(In reply to comment #35)
> For ppc64 - at least on PowerMac G5's and anything using IBM system controllers
> (Maple, p185, JS20, JS21, dunno about POWER or anything without the Power
> Architecture 32/64-bridge) I think memory has been mapped traditionally such
> that you have 2GB of real memory before PCI space, and then PCI space literally
> sits before the 4GB barrier minus 256MB.
# cat /sys/bus/pci/devices/0001\:c1\:00.0/resource
0x0000040178000000 0x0000040179ffffff 0x0000000000000200
[...]
To me this reads as if the device is mapped in higher PCI memory regions. So at least mmap() needs 64bit addresses, and the PCI scanning code so far used unsigned long (which is 32bit on ppc). I do not understand how this could ever work, so it probably didn't.
I changed the scanning code (luckily only 2 levels deep; after that a special type is used, though I don't know whether that will work later on), only to find out that domain support for ppc isn't enabled at all yet (maximum of 256 buses). At least not after modularization. Then I vaguely remember that Egbert tried to enable domain support for PPC, but failed due to the absurd code complexity.
I probably need a different machine to fix the problems of the original bug report. Or cranberry booted into 32bit mode.
Thanks for the bridge reference document link!
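To make the address-width concern concrete, here is a small sketch that takes the 'resource' line quoted above and shows what remains once the start address is squeezed into 32 bits. It only models the arithmetic; the function names are hypothetical and nothing here is taken from the X server's PCI code.

```python
def parse_resource_line(line):
    """Split one line of a sysfs 'resource' file into (start, end, flags) integers."""
    start, end, flags = (int(field, 16) for field in line.split())
    return start, end, flags


def truncate_to_32_bits(address):
    """Model storing an address in a 32-bit 'unsigned long': the upper bits are lost."""
    return address & 0xFFFFFFFF


# The line quoted in this comment, from /sys/bus/pci/devices/0001:c1:00.0/resource.
LINE = "0x0000040178000000 0x0000040179ffffff 0x0000000000000200"

start, end, flags = parse_resource_line(LINE)
size = end - start + 1

if __name__ == "__main__":
    print("start            : 0x%016x" % start)
    print("upper 32 bits    : 0x%x" % (start >> 32))
    print("start, truncated : 0x%08x" % truncate_to_32_bits(start))
    print("size             : %d MiB" % (size // (1024 * 1024)))
```

The upper half of the start address is non-zero (0x401), so a 32-bit variable would hold 0x78000000 instead, which is a different physical location. Any code path that passes such a value on to mmap() would map the wrong region.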
(In reply to comment #36)
> # cat /sys/bus/pci/devices/0001\:c1\:00.0/resource
> 0x0000040178000000 0x0000040179ffffff 0x0000000000000200
> [...]
>
> For me this reads as if the device is mapped in higher PCI memory regions.
At least physically on the CPC945, you cannot 'map' it at any other place; it will always be physically accessible at a <4GB address. However there is always a trick to be played to put your peripheral addresses at the top of your virtual address space, and make your application think it has 62GB of real.. contiguous memory. Is this an artefact of that?
> at least mmap() needs 64bit addresses, and the pci scaning code so far used
> unsigned long (which is 32bit on ppc). I do not understand how this could ever
> work, so it probably didn't.
At least it should work on the 32-bit Pegasos.
> Thanks for the bridge reference document link!
No problem.
In SLES10 the PCI devices were found, but mga has been broken since 4.0.2 from 2001. We can add an xorg-server.ppc64.rpm to the install media.
(In reply to comment #37)
> (In reply to comment #36)
> > # cat /sys/bus/pci/devices/0001\:c1\:00.0/resource
> > 0x0000040178000000 0x0000040179ffffff 0x0000000000000200
> > For me this reads as if the device is mapped in higher PCI memory regions.
> At least physically on the CPC945, you cannot 'map' it at any other place; it
Yes, right, sorry for the mixup. Still, the Xserver needs 64bit addresses in a 32bit user space program, for mmap()ing. Currently, addresses are long or int in a lot of places.
> a trick to be played to put your peripheral addresses at the top of your
> virtual address space, and make your application think it has 62GB of real..
> contiguous memory. Is this an artefact of that?
I don't know, and I think it doesn't really matter.
(In reply to comment #38)
> in SLES10 the pci devices were found.
> But mga is broken since 4.0.2 from 2001.
The Xserver could never ever work on this machine as a 32 bit binary, irrespective of the driver (maybe except fb and/or vesa).
> we can add a xorg-server.ppc64.rpm to the install media.
I would like to debug 32bit ppc first, which won't work either ATM. Do we have a 32bit ppc machine here for debugging?
PPC64 seems to be a well-known issue, so unless this is a *major* use case, I won't continue to do research here. The 32bit ppc issue is still a regression, and should be solvable.
Xorg mailing list:
From: Ian Romanick <idr@us.ibm.com>
Benjamin Herrenschmidt wrote:
> What is the status of fixing X PCI layer nowadays ? Some folks here got
> some brand new shinny Power5+ workstations (damn, those things are
> faaaaast....). We've tried putting Matrox G400 or old Radeon 7000 PCI in
> there, and while the various fbdev's work fine, X doesn't.
>
> What seems to be happening is that basically, X crops the top 32 bits of
> all PCI resources :) It then tries to access those bogus addresses
> via /dev/mem, which is no good, and instead craps over system memory.
That and not having domain support are the core problems on PPC64. The first problem is somewhat specific to IBM hardware. Apple's PPC64 firmware puts domain 0 devices below the 32-bit boundary.
> Thus I'm wondering what is the status with reworking X PCI code to,
> among others, properly mmap /proc or /sys instead of /dev/mem, stop
> assuming PCI resources are 32 bits on 32 bits machines (which is broken
> on various other type of chips anyway, for example, some 44x embedded
> powerpc's have a 36 bits physical address space and typically devices
> sit above 4G), etc...
libpciaccess currently only supports the sysfs method. We still need to determine which Linux kernels we care about so that other interfaces can be supported. Of course, there's also the need to support non-Linux systems. All PCI address are treated as 64-bits.
[...]
My intention is to bring them up to date and merge them to HEAD once 7.2 ships. That's when the real fun should begin. :)
I will configure cantaloupe-giga or e89 manually with factory, yast is too broken.
Cool. Thanks a lot!
cantaloupe-giga:~ # Xorg -scanpci
Probing for PCI devices (Bus:Device:Function)
(0:0:0) Marvell Technology Group Ltd. MV64360/64361/64362 System Controller
(0:1:0) unknown card (0x1106/0x3044) using a VIA Technologies, Inc. IEEE 1394 Host Controller
(0:6:0) unknown card (0x1244/0x1100) using a Digital Equipment Corporation StrongARM DC21285
(0:12:0) VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge]
(0:12:1) VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE
(0:12:2) unknown card (0x0925/0x1234) using a VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
(0:12:3) unknown card (0x0925/0x1234) using a VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
(0:12:4) VIA Technologies, Inc. VT8235 ACPI
(0:12:5) VIA Technologies, Inc. VT82C686 AC97 Audio Controller
(0:12:6) VIA Technologies, Inc. AC'97 Modem Controller
(0:13:0) unknown card (0x3065/0x1106) using a VIA Technologies, Inc. VT6102 [Rhine-II]
cantaloupe-giga:~ # lspci
0000:00:00.0 Host bridge: Marvell Technology Group Ltd. MV64360/64361/64362 System Controller (rev 03)
0000:00:01.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
0000:00:06.0 I2O: Digital Equipment Corporation StrongARM DC21285 (rev 04)
0000:00:0c.0 ISA bridge: VIA Technologies, Inc. VT8231 [PCI-to-ISA Bridge] (rev 10)
0000:00:0c.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
0000:00:0c.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1e)
0000:00:0c.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 1e)
0000:00:0c.4 Bridge: VIA Technologies, Inc. VT8235 ACPI (rev 10)
0000:00:0c.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 AC97 Audio Controller (rev 40)
0000:00:0c.6 Communication controller: VIA Technologies, Inc. AC'97 Modem Controller (rev 20)
0000:00:0d.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 51)
0001:01:00.0 Host bridge: Marvell Technology Group Ltd. MV64360/64361/64362 System Controller (rev 03)
0001:01:08.0 VGA compatible controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (rev 01)
0001:01:08.1 Display controller: ATI Technologies Inc RV280 [Radeon 9200 SE] (Secondary) (rev 01)
cantaloupe-giga:~ #
Does not look good.
Of course. Didn't have a machine to debug this first.
Factory from Nov. 3. still has the problem. If finding a Pegasos PPC machine inside SUSE is a problem, I can provide ssh access to my machine.
maybe this bug should be slightly red.
Is now. I have now set up a debugging environment on e89 and can reproduce. Honestly, I'm astonished that this ever worked. AFAIK ppc never had working domain support, and the machines in this bug report have their VGA controllers in domain 1.
(In reply to comment #47)
> Honnestly, I'm astonished that this ever worked. AFAK ppc never had working
> domain support, and the machines in this bug report have their vga controllers
> in domain 1.
In the 'old days' there was procfs support, and for some insane reason the domain ('host bridge' in the case of Pegasos) was somehow fudged into the bus ID. This worked somehow, scarily, because of the way the PCI controller routed things. However it isn't two buses on one PCI controller, it's two real PCI controllers on real separate domains. When sysfs gives you a domain, you're stuck. However the legacy procfs interface still does the bus fudging; it's just being/been deprecated.
There are some patches around (to get Ubuntu Edgy working) which roll back the sysfs support and get it working.
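For readers who want to see what "sysfs gives you a domain" looks like in practice, the sketch below enumerates PCI devices from sysfs and groups them by domain. It assumes the usual Linux layout in which each entry under /sys/bus/pci/devices is named dddd:bb:dd.f in hexadecimal, as in the 'find /sys/bus/pci*' and 'cat .../resource' output quoted in this report. The function names are hypothetical, and this is not how the X server's scanning code is written.

```python
import os
import re

# sysfs names each PCI device entry "dddd:bb:dd.f"; all fields are hexadecimal.
SYSFS_NAME = re.compile(r"^([0-9a-f]{4}):([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])$")


def sysfs_pci_devices(root="/sys/bus/pci/devices"):
    """Return sorted (domain, bus, device, function) tuples for entries under root."""
    if not os.path.isdir(root):
        return []  # not a Linux system, or sysfs is not mounted
    devices = []
    for name in os.listdir(root):
        m = SYSFS_NAME.match(name)
        if m:
            devices.append(tuple(int(g, 16) for g in m.groups()))
    return sorted(devices)


def group_by_domain(devices):
    """Map each domain number to the list of (bus, device, function) found in it."""
    domains = {}
    for domain, bus, dev, func in devices:
        domains.setdefault(domain, []).append((bus, dev, func))
    return domains


if __name__ == "__main__":
    for domain, members in sorted(group_by_domain(sysfs_pci_devices()).items()):
        print("domain %04x: %d device(s)" % (domain, len(members)))
```

On the Pegasos described here, such a listing would be expected to show two domains, with the Radeon in domain 0001; a scanner that keeps only the (bus, device, function) part has no way to tell the two host bridges apart.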
Created attachment 104475 [details] Workaround patch for using old /proc interface on ppc only.
This patch uses the old /proc interface for ppc. It only includes code for ppc, as for other architectures /sys works or is even mandatory (ia64). This is an ugly hack, and only works due to the domain fudging in the kernel on ppc. Stefan, please use this patch until I have real domain support for ppc working.
Thanks, Matthias!
fixed for STABLE/Factory/BS and Beta2+.
Seems like the comment in the program text that kernel support isn't really there is still valid. If I completely enable domain support for PPC, the driver doesn't seem to address the card correctly, even though the bus/device is selected correctly. Maybe the bridge isn't detected correctly either, I don't know enough about this architecture to actually fix this issue.
Created attachment 104500 [details] Xorg log diff
Diff of Xorg with patch (2.4 /proc scanning w/o domain support) against a patched, non-working Xorg with domain support.
(In reply to comment #51)
> fixed for STABLE/Factory/BS and Beta2+.
I disagree that this bug is "fixed". It looks like you reverted X11 to a state where it doesn't understand PCI domains properly again - there are plenty of kernel configs and other system tweaks which may completely break the procfs interface scanning...
Only on PPC. It is fixed on openSUSE 10.2 (read: working for this architecture and no influence on others), but not upstream. Upstream will need a real fix, but I have to test first whether just disabling the workarounds in the kernel will be enough to make domain support work.