|
Bugzilla – Full Text Bug Listing |
| Summary: | multiple nvidia-*-kmp-* are installed during a kernel update for the same kernel flavour | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Heiko Rommel <heiko.rommel> |
| Component: | X11 3rd Party Driver | Assignee: | Stefan Dirsch <sndirsch> |
| Status: | RESOLVED FIXED | QA Contact: | Stefan Dirsch <sndirsch> |
| Severity: | Major | ||
| Priority: | P1 - Urgent | CC: | aaugusto, andrej.semen, astieger, edera, forgotten_7Vd19u3Vod, heiko.rommel, hpj, meissner, mlj, mmarek |
| Version: | 13.2 | Flags: | sndirsch: needinfo?(aaugusto) |
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| See Also: | http://bugzilla.novell.com/show_bug.cgi?id=966618 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | nvidia-kmp-remove-multiversion-provides.patch | ||
|
Description
Heiko Rommel
2015-04-01 12:45:04 UTC
Sounds reasonable. Michal, what do you think? How can I disable the "multiversion(kernel)" provides in a KMP?

If you install _only_ the old version (304.123) of the nvidia KMP, does it create the weak-updates symlink for the 3.16.7-13 kernel?

(In reply to Michal Marek from comment #4)
> If you install _only_ the old version (304.123) of the nvidia KMP, does it
> create the weak-updates symlink for the 3.16.7-13 kernel?

Thinking about it again, I'm afraid this is not doable, since the old version KMPs (304.123) are no longer in the repo. :-(

Created attachment 629754 [details]
nvidia-kmp-remove-multiversion-provides.patch
Remove multiversion(kernel) from provides. Similar patch also required for uvm specfile for G03 or higher.
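
For illustration only: the attached patch itself changes the KMP spec file and is not reproduced in this listing. The Python sketch below merely shows the intended effect on a list of provides, namely that the "multiversion(kernel)" entry disappears while everything else stays. The function name strip_multiversion and the shortened sample list are hypothetical.

```python
def strip_multiversion(provides):
    """Return the provides entries without the 'multiversion(kernel)' flag."""
    return [p for p in provides if p.strip() != "multiversion(kernel)"]


# Shortened sample in the shape of `rpm -q --provides` output (illustrative).
sample = [
    "ksym(default:nvidia_unregister_module) = 1fb3cf3d",
    "multiversion(kernel)",
    "nvidia-gfxG03-kmp = 340.76",
]

for entry in strip_multiversion(sample):
    print(entry)
```

Running `rpm -q --provides` on an installed KMP and looking for the flag by hand gives the same answer; the sketch is only meant to make the goal of the patch explicit.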
Note for myself: I still have 304.123 packages in ~sndirsch/links/beta/ddadap/340.58/nvidia-13.1.tgz.

Indeed, both KMP versions get installed during a kernel update. I tried this with G04 KMPs (346.35 -> 346.47 update). I got warnings from weak-updates, which pointed to kABI breakage with the latest kernel. Still, both weak symlinks got created for the 3.16.7-7 and 3.16.7-13 kernels. At least the 3.16.7-13 kernel worked with nvidia. No boot entry for 3.16.7-7 has been created.

Since I don't see any point in having the multiversion feature enabled for the nvidia KMPs (nvidia.ko/nvidia-uvm.ko will be created at the same location anyway, overwriting each other), I will disable this feature for these KMPs. If you like (and can still set up the scenario from before the update), you can try again with the internal test nvidia repo http://download.suse.de/ibs/home:/sndirsch:/drivers/openSUSE_13.2/ to see if this is of any help. The issue here is that the old KMPs still carry this multiversion flag, so the old KMP still might not be replaced. I have no idea how this magic works here. Sorry.

Same problem on SLED 12 with an nvidia gfxcard.

zypper up installed a 2nd version of the nvidia packages.
On boot the X server crashed.

After removing the nvidia packages and installing them again,
the X server starts on reboot and behaves normally.

zypper rm nvidia-computeG03 nvidia-gfxG03-kmp-default nvidia-glG03 nvidia-uvm-gfxG03-kmp-default x11-video-nvidiaG03

zypper in nvidia-computeG03 nvidia-gfxG03-kmp-default nvidia-glG03 nvidia-uvm-gfxG03-kmp-default x11-video-nvidiaG03

rpm -q --provides nvidia-gfxG03-kmp-default-340.76_k3.12.28_4-37.1.x86_64
ksym(default:nvUvmInterfaceAddressSpaceCreate) = 1cd8fc77
ksym(default:nvUvmInterfaceAddressSpaceCreateMirrored) = 207105f4
ksym(default:nvUvmInterfaceAddressSpaceDestroy) = ea4bdfea
ksym(default:nvUvmInterfaceChannelAllocate) = 9e77ba41
ksym(default:nvUvmInterfaceChannelDestroy) = 5e3233ab
ksym(default:nvUvmInterfaceChannelTranslateError) = de0e694e
ksym(default:nvUvmInterfaceCheckEccErrorSlowpath) = 207dc32f
ksym(default:nvUvmInterfaceCopyEngineAllocate) = cff5848a
ksym(default:nvUvmInterfaceDeRegisterUvmOps) = 2103c3ad
ksym(default:nvUvmInterfaceGetAttachedUuids) = f9c01347
ksym(default:nvUvmInterfaceGetGpuArch) = ad5f37e
ksym(default:nvUvmInterfaceGetUvmPrivRegion) = e06c7054
ksym(default:nvUvmInterfaceKillChannel) = b899c7f1
ksym(default:nvUvmInterfaceMemoryAllocFB) = 435f7ff1
ksym(default:nvUvmInterfaceMemoryAllocSys) = b1b8125e
ksym(default:nvUvmInterfaceMemoryCpuMap) = e8161540
ksym(default:nvUvmInterfaceMemoryCpuUnMap) = b9e082db
ksym(default:nvUvmInterfaceMemoryFree) = 94752471
ksym(default:nvUvmInterfaceQueryCaps) = 2e249a1f
ksym(default:nvUvmInterfaceRegisterUvmCallbacks) = 94fa66e8
ksym(default:nvUvmInterfaceServiceDeviceInterruptsRM) = e9211194
ksym(default:nvUvmInterfaceSessionCreate) = 69d2c985
ksym(default:nvUvmInterfaceSessionDestroy) = f7bcd53d
ksym(default:nvidia_frontend_add_device) = f17550a5
ksym(default:nvidia_frontend_remove_device) = bcc81f22
ksym(default:nvidia_p2p_destroy_mapping) = 4c9ba34e
ksym(default:nvidia_p2p_free_page_table) = 88765bb5
ksym(default:nvidia_p2p_get_pages) = f487b36a
ksym(default:nvidia_p2p_init_mapping) = b73bde45
ksym(default:nvidia_p2p_put_pages) = eacba72c
ksym(default:nvidia_register_module) = 499f1b98
ksym(default:nvidia_unregister_module) = 1fb3cf3d
multiversion(kernel)
nvidia-gfxG03-kmp = 340.76
nvidia-gfxG03-kmp = 340.76_k3.12.28_4
nvidia-gfxG03-kmp-default = 340.76
nvidia-gfxG03-kmp-default = 340.76_k3.12.28_4-37.1
nvidia-gfxG03-kmp-default(x86-64) = 340.76_k3.12.28_4-37.1

Ok. Andrej, feel free to retest with the SLE 12 repo http://download.suse.de/ibs/home:/sndirsch:/drivers/SLE_12/

I also had the NVIDIA problem after the kernel update yesterday (13th April). I could resolve it by refreshing all installed NVIDIA drivers via YaST.

(In reply to Andrej Semen from comment #12)
> Same problem on SLED 12 with an nvidia gfxcard.
>
> zypper up installed a 2nd version of the nvidia packages.
> On boot the X server crashed.
>
> After removing the nvidia packages and installing them again,
> the X server starts on reboot and behaves normally.
>
> zypper rm nvidia-computeG03 nvidia-gfxG03-kmp-default nvidia-glG03
> nvidia-uvm-gfxG03-kmp-default x11-video-nvidiaG03
>
> zypper in nvidia-computeG03 nvidia-gfxG03-kmp-default nvidia-glG03
> nvidia-uvm-gfxG03-kmp-default x11-video-nvidiaG03

Thanks, it helped a lot. For others, beware that if `uname -a` yields

Linux jade 3.16.7-13-desktop

you need to change all kmp-default to kmp-desktop in the above commands.

The update to 3.16.7-21-desktop on 13.2 required a refresh of the NVIDIA driver. I had the 3.16.7-13-desktop kernel installed and my NVIDIA module refreshed for that.

[This is meant as a reminder for the whole story. Since only distribution kernels are supported, it's not a problem for you, Stefan. Things change if you're going to support Tumbleweed one day (which would be a good idea, wouldn't it?)]

Unfortunately, this breaks installation of Kernel-stable and the like, since it replaces formerly working setups (after you've gone through the pains of building the packages locally, of course).

E.g.: installed is 4.0.5, new kernel is 4.1.0. Without multiversion(kernel), it removes a perfectly working driver for 4.0.5, which isn't the real McCoy, either.
A better resolution that deals with different kernel VERSIONS is needed.

This same issue is happening again on 13.2 since the 2016-03-02 update.

OS data:

earth:/home/me # uname -a
Linux earth 3.16.7-35-desktop #1 SMP PREEMPT Sun Feb 7 17:32:21 UTC 2016 (832c776) x86_64 x86_64 x86_64 GNU/Linux

earth:/home/me # rpm -qa | grep -Ei "nvidia"
nvidia-glG04-361.28-23.1.x86_64
nvidia-gfxG04-kmp-desktop-352.41_k3.16.6_2-16.1.x86_64
nvidia-computeG04-361.28-23.1.x86_64
nvidia-gfxG04-kmp-desktop-352.30_k3.16.6_2-15.1.x86_64
nvidia-gfxG04-kmp-desktop-361.28_k3.16.6_2-23.1.x86_64
nvidia-gfxG04-kmp-desktop-352.63_k3.16.6_2-18.1.x86_64
x11-video-nvidiaG04-361.28-23.1.x86_64
nvidia-gfxG04-kmp-desktop-352.55_k3.16.6_2-17.1.x86_64

The driver is available, but only tty consoles are accessible; there is no graphical console.

earth:/home/me # lsmod | grep -i nvidia
nvidia 10038754 31
drm 335594 3 nvidia

earth:/home/me # find /lib/modules/ -iname "nvidia.ko"
/lib/modules/3.16.6-2-desktop/updates/nvidia.ko
/lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia.ko
/lib/modules/3.16.7-32-desktop/weak-updates/updates/nvidia.ko

earth:/home/me # l /lib/modules/3.16.6-2-desktop/updates/nvidia.ko
-rw-r--r-- 1 root root 19664934 mar 3 10:22 /lib/modules/3.16.6-2-desktop/updates/nvidia.ko

earth:/home/me # l /lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia.ko
lrwxrwxrwx 1 root root 47 mar 2 14:13 /lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia.ko -> /lib/modules/3.16.6-2-desktop/updates/nvidia.ko

earth:/home/me # l /lib/modules/3.16.7-32-desktop/weak-updates/updates/nvidia.ko
lrwxrwxrwx 1 root root 47 mar 2 14:13 /lib/modules/3.16.7-32-desktop/weak-updates/updates/nvidia.ko -> /lib/modules/3.16.6-2-desktop/updates/nvidia.ko

I can't provide more information on my own; I just reproduced what I saw in the previous posts. Comment 19 probably has the clue. I am willing to help debug it by providing more info and testing things.

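
The symptom reported in this bug (several versions of the same KMP installed side by side) can be spotted mechanically in `rpm -qa` output. The following Python sketch is one possible way to do that, not a tool used in this report. It assumes package strings of the form name-version-release.arch, as in the listing above; the helper names split_nvra and duplicate_kmps are hypothetical.

```python
from collections import defaultdict


def split_nvra(package):
    """Split 'name-version-release.arch' into (name, version, release, arch)."""
    nvr, arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def duplicate_kmps(packages):
    """Map each KMP package name installed more than once to its versions."""
    versions = defaultdict(list)
    for pkg in packages:
        name, version, release, _arch = split_nvra(pkg)
        if "-kmp-" in name:  # only kernel module packages are of interest
            versions[name].append(f"{version}-{release}")
    return {n: sorted(v) for n, v in versions.items() if len(v) > 1}


# Package list copied from the `rpm -qa | grep -Ei "nvidia"` output above.
installed = [
    "nvidia-glG04-361.28-23.1.x86_64",
    "nvidia-gfxG04-kmp-desktop-352.41_k3.16.6_2-16.1.x86_64",
    "nvidia-computeG04-361.28-23.1.x86_64",
    "nvidia-gfxG04-kmp-desktop-352.30_k3.16.6_2-15.1.x86_64",
    "nvidia-gfxG04-kmp-desktop-361.28_k3.16.6_2-23.1.x86_64",
    "nvidia-gfxG04-kmp-desktop-352.63_k3.16.6_2-18.1.x86_64",
    "x11-video-nvidiaG04-361.28-23.1.x86_64",
    "nvidia-gfxG04-kmp-desktop-352.55_k3.16.6_2-17.1.x86_64",
]

for name, found in duplicate_kmps(installed).items():
    print(f"{name}: {len(found)} versions installed: {', '.join(found)}")
```

For the list above this reports nvidia-gfxG04-kmp-desktop with five installed versions, which matches what the transcript shows. The version strings are sorted as plain text, which is good enough for display but is not an RPM version comparison.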
earth:/home/me # hwinfo --gfxcard
32: PCI 200.0: 0300 VGA compatible controller (VGA)
  [Created at pci.328]
  Unique ID: B35A.lkXvo1kkhm4
  Parent ID: _Znp.dn0vZGMNp_9
  SysFS ID: /devices/pci0000:00/0000:00:02.0/0000:02:00.0
  SysFS BusID: 0000:02:00.0
  Hardware Class: graphics card
  Model: "nVidia VGA compatible controller"
  Vendor: pci 0x10de "nVidia Corporation"
  Device: pci 0x13ba
  SubVendor: pci 0x103c "Hewlett-Packard Company"
  SubDevice: pci 0x1097
  Revision: 0xa2
  Driver: "nvidia"
  Driver Modules: "nvidia"
  Memory Range: 0xf2000000-0xf2ffffff (rw,non-prefetchable)
  Memory Range: 0xe0000000-0xefffffff (ro,non-prefetchable)
  Memory Range: 0xf0000000-0xf1ffffff (ro,non-prefetchable)
  I/O Ports: 0x1000-0xffff (rw)
  Memory Range: 0xf3080000-0xf30fffff (ro,non-prefetchable,disabled)
  IRQ: 72 (19738 events)
  Module Alias: "pci:v000010DEd000013BAsv0000103Csd00001097bc03sc00i00"
  Driver Info #0:
    Driver Status: nouveau is not active
    Driver Activation Cmd: "modprobe nouveau"
  Driver Info #1:
    Driver Status: nvidia is active
    Driver Activation Cmd: "modprobe nvidia"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #10 (PCI bridge)

Primary display adapter: #32

Andres, this looks like an already messed up system. There are no 352.x driver packages in the openSUSE 13.2 repository any longer. Therefore I suggest removing these old driver packages and trying again. Also, there was a bug when removing the "multiversion(kernel)" provides ...

[...]
-------------------------------------------------------------------
Mon Jan 18 12:55:01 UTC 2016 - sndirsch@suse.com

- added missing '-e' option in sed call to remove
  "multiversion(kernel)" in provides

-------------------------------------------------------------------
Thu Nov 12 15:39:06 UTC 2015 - sndirsch@suse.com

- update latest long lived branch version 352.63
[...]
-------------------------------------------------------------------
Thu Apr 2 13:51:01 UTC 2015 - sndirsch@suse.com

- remove "multiversion(kernel)" from provides (bnc#925437)

-------------------------------------------------------------------
Thu Mar 5 14:38:54 UTC 2015 - sndirsch@suse.com

- update to latest long lived branch version 346.47

So things haven't really been fixed before release 352.79.

Thanks for the answer. I just removed all the "messy" things left over from previous kernels and drivers, so now the things that have changed are:

earth:~ # rpm -qa | grep -Ei "nvidia"
nvidia-glG04-361.28-23.1.x86_64
nvidia-gfxG04-kmp-desktop-361.28_k3.16.6_2-23.1.x86_64
nvidia-computeG04-361.28-23.1.x86_64
x11-video-nvidiaG04-361.28-23.1.x86_64

earth:~ # find /lib/modules/ -iname "nvidia.ko"
/lib/modules/3.16.6-2-desktop/updates/nvidia.ko

earth:~ # lsmod | grep -i nvidia
earth:~ # modprobe nvidia
modprobe: ERROR: could not find module by name='nvidia'
modprobe: ERROR: could not insert 'nvidia': Function not implemented
modprobe: ERROR: could not insert 'nvidia_uvm': Unknown symbol in module, or unknown parameter (see dmesg)
mknod: missing operand after ‘0’
Try 'mknod --help' for more information.

So now it is even worse, as the nvidia module does not load on this kernel. TTY consoles are fine, but there is no gfx environment. And from dmesg:

[ 20.707051] nvidia_uvm: module license 'MIT' taints kernel.
[ 20.707054] Disabling lock debugging due to kernel taint
[ 20.707419] nvidia_uvm: Unknown symbol nvUvmInterfaceChannelDestroy (err 0)
[ 20.707440] nvidia_uvm: Unknown symbol nvUvmInterfaceQueryCaps (err 0)
[ 20.707466] nvidia_uvm: Unknown symbol nvUvmInterfaceUnsetPageDirectory (err 0)
[ 20.707478] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryAllocSys (err 0)
[ 20.707494] nvidia_uvm: Unknown symbol nvUvmInterfaceFreeMemHandles (err 0)
[ 20.707507] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryCpuMap (err 0)
[ 20.707523] nvidia_uvm: Unknown symbol nvUvmInterfaceGetGmmuFmt (err 0)
[ 20.707534] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryAllocGpuPa (err 0)
[ 20.707552] nvidia_uvm: Unknown symbol nvUvmInterfacePmaFreePages (err 0)
[ 20.707563] nvidia_uvm: Unknown symbol nvUvmInterfaceKillChannel (err 0)
[ 20.707578] nvidia_uvm: Unknown symbol nvUvmInterfaceGetSurfaceMapInfo (err 0)
[ 20.707595] nvidia_uvm: Unknown symbol nvUvmInterfaceSetPageDirectory (err 0)
[ 20.707606] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryCpuUnMap (err 0)
[ 20.707620] nvidia_uvm: Unknown symbol nvUvmInterfaceAddressSpaceCreateMirrored (err 0)
[ 20.707634] nvidia_uvm: Unknown symbol nvUvmInterfaceOwnPageFaultIntr (err 0)
[ 20.707645] nvidia_uvm: Unknown symbol nvUvmInterfaceGetCtxBufferPhysInfo (err 0)
[ 20.707672] nvidia_uvm: Unknown symbol nvUvmInterfaceDupAddressSpace (err 0)
[ 20.707686] nvidia_uvm: Unknown symbol nvUvmInterfaceGetCtxBufferCount (err 0)
[ 20.707703] nvidia_uvm: Unknown symbol nvUvmInterfaceRegisterGpu (err 0)
[ 20.707714] nvidia_uvm: Unknown symbol nvUvmInterfaceP2pObjectDestroy (err 0)
[ 20.707725] nvidia_uvm: Unknown symbol nvUvmInterfaceGetFbInfo (err 0)
[ 20.707737] nvidia_uvm: Unknown symbol nvUvmInterfaceGetChannelPhysInfo (err 0)
[ 20.707764] nvidia_uvm: Unknown symbol nvUvmInterfaceValidateChannel (err 0)
[ 20.707788] nvidia_uvm: Unknown symbol nvUvmInterfaceStopChannel (err 0)
[ 20.707801] nvidia_uvm: Unknown symbol nvUvmInterfaceDestroyFaultInfo (err 0)
[ 20.707814] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryAllocFB (err 0)
[ 20.707830] nvidia_uvm: Unknown symbol nvUvmInterfaceGetGpuInfo (err 0)
[ 20.707846] nvidia_uvm: Unknown symbol nvUvmInterfaceBindChannel (err 0)
[ 20.707857] nvidia_uvm: Unknown symbol nvUvmInterfaceInitFaultInfo (err 0)
[ 20.707876] nvidia_uvm: Unknown symbol nvUvmInterfaceStopVaspaceChannels (err 0)
[ 20.707887] nvidia_uvm: Unknown symbol nvUvmInterfaceGetBigPageSize (err 0)
[ 20.707898] nvidia_uvm: Unknown symbol nvUvmInterfaceServiceDeviceInterruptsRM (err 0)
[ 20.707911] nvidia_uvm: Unknown symbol nvUvmInterfaceDeRegisterUvmOps (err 0)
[ 20.707922] nvidia_uvm: Unknown symbol nvUvmInterfaceFreeDupedHandle (err 0)
[ 20.707936] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryFree (err 0)
[ 20.707948] nvidia_uvm: Unknown symbol nvUvmInterfaceGetUvmPrivRegion (err 0)
[ 20.707962] nvidia_uvm: Unknown symbol nvUvmInterfaceGetAttachedUuids (err 0)
[ 20.707973] nvidia_uvm: Unknown symbol nvUvmInterfaceMemoryFreePa (err 0)
[ 20.707995] nvidia_uvm: Unknown symbol nvUvmInterfaceP2pObjectCreate (err 0)
[ 20.708008] nvidia_uvm: Unknown symbol nvUvmInterfaceGetCtxBufferInfo (err 0)
[ 20.708019] nvidia_uvm: Unknown symbol nvUvmInterfaceGetPmaObject (err 0)
[ 20.708032] nvidia_uvm: Unknown symbol nvUvmInterfaceSessionDestroy (err 0)
[ 20.708044] nvidia_uvm: Unknown symbol nvUvmInterfaceDupMemory (err 0)
[ 20.708057] nvidia_uvm: Unknown symbol nvUvmInterfaceCheckEccErrorSlowpath (err 0)
[ 20.708068] nvidia_uvm: Unknown symbol nvUvmInterfaceAddressSpaceCreate (err 0)
[ 20.708079] nvidia_uvm: Unknown symbol nvUvmInterfaceCopyEngineAllocate (err 0)
[ 20.708091] nvidia_uvm: Unknown symbol nvUvmInterfaceUnregisterGpu (err 0)
[ 20.708103] nvidia_uvm: Unknown symbol nvUvmInterfaceAddressSpaceDestroy (err 0)
[ 20.708117] nvidia_uvm: Unknown symbol nvUvmInterfaceRegisterUvmCallbacks (err 0)
[ 20.708129] nvidia_uvm: Unknown symbol nvUvmInterfaceChannelAllocate (err 0)
[ 20.708140] nvidia_uvm: Unknown symbol nvUvmInterfaceGetP2PCaps (err 0)
[ 20.708159] nvidia_uvm: Unknown symbol nvUvmInterfaceDupAllocation (err 0)
[ 20.708171] nvidia_uvm: Unknown symbol nvUvmInterfacePmaAllocPages (err 0)
[ 20.708189] nvidia_uvm: Unknown symbol nvUvmInterfaceSessionCreate (err 0)

I hope this info sheds some light on this issue!

Andres, meanwhile I figured out that your issue is bnc#969763.

This issue has now extended to 13.1 Evergreen with today's (08.Mar.16) kernel update (Linux 3.12.53-40-desktop, openSUSE 13.1 (Bottle) (x86_64)). It is likely the same bug, since the symptoms and the resolution are the same: after the new kernel was installed, the system booted to a default low screen resolution. When I booted back into the previous 3.11 kernel, all was well. Resolved by a forced refresh of all installed NVIDIA drivers via YaST. I did not explore my system enough to guarantee it is exactly the same as in 13.2, but it is annoying for the end user.

Fixed with the current updates in the repos since yesterday.

Just to complete the information in the previous report: I followed the indicated procedure of deleting the NVIDIA drivers and manually checked for files left behind by other kernels. I found a few, related to the 3.16.2 kernel, and deleted them, as recommended in bnc#969763.
Then I reinstalled only the packages for the current 3.16.7-35-desktop kernel, and on 2016-03-09 I got an update, so now:

earth:~ # uname -a
Linux earth 3.16.7-35-desktop #1 SMP PREEMPT Sun Feb 7 17:32:21 UTC 2016 (832c776) x86_64 x86_64 x86_64 GNU/Linux

earth:~ # rpm -qa | egrep nvidia
nvidia-computeG04-361.28-33.1.x86_64
nvidia-glG04-361.28-33.1.x86_64
nvidia-gfxG04-kmp-desktop-361.28_k3.16.6_2-33.1.x86_64
x11-video-nvidiaG04-361.28-33.1.x86_64

earth:~ # l /lib/modules/3.16.7-35-desktop/weak-updates/updates/
total 8
drwxr-xr-x 2 root root 4096 Mar 10 08:54 ./
drwxr-xr-x 3 root root 4096 Feb 23 08:5 ./
lrwxrwxrwx 1 root root 55 Mar 10 08:54 nvidia-modeset.ko -> /lib/modules/3.16.6-2-desktop/updates/nvidia-modeset.ko
lrwxrwxrwx 1 root root 51 Mar 10 08:54 nvidia-uvm.ko -> /lib/modules/3.16.6-2-desktop/updates/nvidia-uvm.ko
lrwxrwxrwx 1 root root 47 Mar 10 08:54 nvidia.ko -> /lib/modules/3.16.6-2-desktop/updates/nvidia.ko

earth:~ # find /lib/modules/ -name 'nvidia*'
/lib/modules/3.16.6-2-desktop/updates/nvidia-modeset.ko
/lib/modules/3.16.6-2-desktop/updates/nvidia.ko
/lib/modules/3.16.6-2-desktop/updates/nvidia-uvm.ko
/lib/modules/3.16.7-35-desktop/kernel/drivers/net/ethernet/nvidia
/lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia-modeset.ko
/lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia.ko
/lib/modules/3.16.7-35-desktop/weak-updates/updates/nvidia-uvm.ko

And all works like a charm :-) I mean, thanks, Stefan! |