Bug 1028369

Summary: missing virtio module in PowerPC install initrd kernel 4.10.1-1-default
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Michel Normand <normand>
Component: Kernel
Assignee: Steffen Winterfeldt <snwint>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P1 - Urgent
CC: igonzalezsosa, jreidinger, normand, okurz, snwint, tiwai
Version: Current
Target Milestone: ---
Hardware: PowerPC
OS: Other
URL: https://trello.com/c/fyV9gRjA
Whiteboard: http://openqa.opensuse.org/tests/365283/modules/installation_mode/steps/2
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: y2log
y2log_debug
bug1028369_data.txt
linuxrc and YaST2 logs and lsmod on failing TW snapshot 20170309

Description Michel Normand 2017-03-07 17:52:37 UTC
Created attachment 716638
y2log

## Observation

openQA test in scenario opensuse-Tumbleweed-DVD-ppc64le-boot_to_snapshot@ppc64le fails in
[installation_mode](http://openqa.opensuse.org/tests/365283/modules/installation_mode/steps/2)


## Reproducible

Fails since (at least) Build [20170305](http://openqa.opensuse.org/tests/365283) (current job)


## Expected result

Last good: [20170304](http://openqa.opensuse.org/tests/364714) (or more recent)


## Further details

Always latest result in this scenario: [latest](http://openqa.opensuse.org/tests/latest?machine=ppc64le&arch=ppc64le&version=Tumbleweed&distri=opensuse&flavor=DVD&test=boot_to_snapshot)
Comment 1 Michel Normand 2017-03-07 18:09:59 UTC
I do not understand why YaST displays the "Disk Activation" panel,
or how to identify the cause in the related log.
Comment 2 Michel Normand 2017-03-07 18:20:08 UTC
Created attachment 716644
y2log_debug

I was able to reproduce the same problem in a local openQA PowerPC instance
with the Y2DEBUG=1 boot parameter.
I captured y2log_debug, but I do not understand which lines could explain the failure.
Comment 3 Michel Normand 2017-03-12 05:31:08 UTC
Imobach, do you have a suggestion for how to continue investigating this bug?
Comment 4 Michel Normand 2017-03-15 09:30:53 UTC
Created attachment 717494
bug1028369_data.txt

I need help to continue the investigation and understand why YaST
now moves from step initial_5 to step initial_6
and displays the unexpected "Disk Activation" screen,
whereas before (at snapshot 20170304) YaST moved from initial_5 directly to initial_7
and displayed the "Installation Mode" screen.
The referenced data is available in the attached bug1028369_data.txt.
Comment 5 Michel Normand 2017-03-15 11:07:19 UTC
linuxrc.log does not report the drivers virtio_pci* or virtio_blk*,
despite the qemu parameters below.
Does this mean the hardware was not detected, or that the drivers are missing?
===
05:22:00.1890 131931 starting: /usr/bin/qemu-system-ppc64 -serial file:serial0 -soundhw ac97 -global isa-fdc.driveA= -g 1024x768 -vga std -m 4096 -machine usb=off -cpu host -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -device virtio-scsi-pci,id=scsi0 -device virtio-blk,drive=hd1 -drive file=raid/l1,cache=unsafe,if=none,id=hd1,format=qcow2 -drive media=cdrom,if=none,id=cd0,format=raw,file=/var/lib/openqa/share/factory/iso/openSUSE-Tumbleweed-DVD-ppc64le-Snapshot20170311-Media.iso -device scsi-cd,drive=cd0,bus=scsi0.0 -boot once=d,menu=on,splash-time=5000 -device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 8,threads=8 -enable-kvm -no-shutdown -vnc :95,share=force-shared -qmp unix:qmp_socket,server,nowait -monitor unix:hmp_socket,server,nowait -S -monitor telnet:127.0.0.1:20052,server,nowait
===
Comment 6 Josef Reidinger 2017-03-15 15:49:21 UTC
Hi Michel, I cannot log in to that openQA setup. What are the credentials? Is there a reason to have it hidden?
Comment 7 Michel Normand 2017-03-15 17:24:40 UTC
(In reply to Josef Reidinger from comment #6)
> Hi Michel, I cannot log in to that openQA setup. What are the credentials? Is
> there a reason to have it hidden?

I do not know which "openQA setup" you are referring to.
I have access, granted by Oliver Kurz, that allows me to log in to
https://openqa.opensuse.org/

Is that what you are looking for?
Comment 8 Josef Reidinger 2017-03-16 08:04:17 UTC
Oliver, do you have any idea why a password is required when I go to http://openqa.opensuse.org/tests/latest?machine=ppc64le&arch=ppc64le&version=Tumbleweed&distri=opensuse&flavor=DVD&test=boot_to_snapshot? And if that is intentional, which password is expected?
Comment 9 Oliver Kurz 2017-03-16 09:18:14 UTC
Accessing the test results on o3 should not require a login. Yesterday the openSUSE admins changed the login server, and it was unavailable for some time. jreidinger: can you access https://openqa.opensuse.org?
Comment 10 Josef Reidinger 2017-03-16 09:24:23 UTC
Michel, I can see it now. In fact, the Disk Activation screen is shown only when the installer does not see any physical disk. So I suggest checking whether the system sees the disks; maybe the problem is a disk driver? Did you check manually whether any disk is visible?
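
One way to do the manual check suggested here is to look at the block devices the kernel has registered. The following is only a minimal sketch, assuming sysfs is mounted at /sys in the installer shell; the function name `list_disks` and its optional directory argument are hypothetical and exist only to make the idea testable.

```shell
# List candidate hard disks by name, skipping pseudo and optical devices.
# The optional argument (hypothetical, for testing) overrides the sysfs path.
list_disks() {
    block_dir="${1:-/sys/block}"
    for dev in "$block_dir"/*; do
        [ -e "$dev" ] || continue      # directory empty: the glob did not match
        name="${dev##*/}"
        case "$name" in
            loop*|ram*|zram*|sr*|fd*) continue ;;   # not installation targets
        esac
        printf '%s\n' "$name"
    done
}
```

Calling `list_disks` with no argument in the installer shell and getting no output would be consistent with the installer seeing no physical disk, which in turn points at a missing or unloaded storage driver.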
Comment 11 Michel Normand 2017-03-16 09:54:20 UTC
I tried the same snapshot 20170309 using a libvirt configuration with a spapr-vscsi device (below) in place of the virtio-* devices used by openQA in comment #5,
and now the device driver is loaded.

So maybe the drivers for virtio-* are missing in the ISO?
How should I continue the investigation?

=== different qemu parameters from manual libvirt trial:
 44799 ?        SLl    1:04 /usr/bin/qemu-system-ppc64 -name twppc64le3 -S -machine pseries-2.2,accel=kvm,usb=off -m 6144 -realtime mlock=off -smp 8,sockets=1,cores=2,threads=4 -uuid f7cdfa4d-7c20-4b10-8e6d-ca4ddd5b841c -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/home/normand/.config/libvirt/qemu/lib/twppc64le3.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device pci-ohci,id=usb,bus=pci.0,addr=0x1 -device spapr-vscsi,id=scsi0,reg=0x2000 -drive file=/home/normand/images/twppc64le3.disk1.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 -drive file=http://sf1.test.toulouse-stg.fr.ibm.com:80/pub/linux/opensuse/factory/ppc64le/iso/latest,if=none,id=drive-scsi0-0-0-4,readonly=on,format=raw -device scsi-cd,bus=scsi0.0,channel=0,scsi-id=0,lun=4,drive=drive-scsi0-0-0-4,id=scsi0-0-0-4,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=20 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:e7:4f:32,bus=pci.0,addr=0x2 -chardev pty,id=charserial0 -device spapr-vty,chardev=charserial0,reg=0x30001000 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 -msg timestamp=on
=== extract console
...
Loading basic drivers... ok                                                                                                                                                            
Starting hardware detection... ok                                                                                                                                                      
(If a driver is not working for you, try booting with brokenmodules=driver_name.)                                                                                                      
                                                                                                                                                                                       
IBM Virtual SCSI 0                                                                                                                                                                     
  drivers: ibmvscsi*                                                                                                                                                                   
Activating usb devices... ok
===
Comment 12 Michel Normand 2017-03-17 11:37:12 UTC
Created attachment 717858
linuxrc and YaST2 logs and lsmod on failing TW snapshot 20170309

Comparing the lsmod output from the linuxrc shell
between the failing TW snapshot 20170309 and an earlier Leap 42.2
confirms that modules are missing on TW:
===
$grep virtio /tmp/lsmod_tw
virtio_balloon         12328  0
virtio_net             40607  0
===
$grep virtio /tmp/lsmod_leap
virtio_blk             15882  0 
virtio_balloon         10995  0 
virtio_net             33852  0 
virtio_pci             22583  0 
virtio_ring            15434  4 virtio_blk,virtio_net,virtio_pci,virtio_balloon
virtio                 11431  4 virtio_blk,virtio_net,virtio_pci,virtio_balloon
===

How should I continue the investigation?
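
The comparison above was done by eye on two grep results. A minimal sketch of automating it, assuming both lsmod dumps were saved to files as /tmp/lsmod_tw and /tmp/lsmod_leap were here, could look like the following. The function name `missing_modules` is made up for illustration. A module absent from lsmod may also be built into the kernel, so the result is a hint rather than proof.

```shell
# Print module names loaded in the reference lsmod dump but absent from the
# failing one. Both arguments are files holding saved `lsmod` output.
missing_modules() {
    failing="$1"
    reference="$2"
    tmp_fail=$(mktemp) || return 1
    tmp_ref=$(mktemp) || return 1
    # The first column is the module name; drop the "Module Size Used by" header.
    awk '$1 != "Module" { print $1 }' "$failing"   | LC_ALL=C sort -u > "$tmp_fail"
    awk '$1 != "Module" { print $1 }' "$reference" | LC_ALL=C sort -u > "$tmp_ref"
    # comm -13 keeps only the lines unique to the second (reference) file.
    LC_ALL=C comm -13 "$tmp_fail" "$tmp_ref"
    rm -f "$tmp_fail" "$tmp_ref"
}
```

With the two dumps from this comment, `missing_modules /tmp/lsmod_tw /tmp/lsmod_leap` would list virtio, virtio_blk, virtio_pci and virtio_ring.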
Comment 13 Michel Normand 2017-03-17 13:45:29 UTC
I need help from the kernel team to understand how to get the virtio modules added back to the boot initrd for PowerPC:

===
console:install:/ # lsmod |grep virtio
virtio_balloon         12328  0
virtio_net             40607  0
===
console:install:/ # find /lib/modules/4.10.1-1-default/ -name virtio\*                   
/lib/modules/4.10.1-1-default/initrd/virtio_scsi.ko                                      
/lib/modules/4.10.1-1-default/initrd/virtio_net.ko                                       
/lib/modules/4.10.1-1-default/initrd/virtio_mmio.ko
/lib/modules/4.10.1-1-default/initrd/virtio_input.ko                                     
/lib/modules/4.10.1-1-default/initrd/virtio_balloon.ko                                   
/lib/modules/4.10.1-1-default/initrd/virtio-gpu.ko 
===
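
The find output above shows which virtio module files exist in the initrd. As a rough sketch of turning that into a per-module check, assuming module files are named `<module>.ko` (optionally with a compression suffix), something like the following could be used. The function name `check_modules_present` is hypothetical.

```shell
# Report, for each named module, whether a matching .ko file exists
# anywhere under the given module tree.
check_modules_present() {
    tree="$1"
    shift
    for mod in "$@"; do
        # "$mod.ko*" also matches compressed files such as foo.ko.xz
        if find "$tree" -name "$mod.ko*" 2>/dev/null | grep -q .; then
            echo "$mod: present"
        else
            echo "$mod: MISSING"
        fi
    done
}
```

In the installer shell this might be called as `check_modules_present /lib/modules/4.10.1-1-default virtio virtio_ring virtio_pci virtio_blk`; going by the find output above, all four would be reported as MISSING.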
Comment 14 Takashi Iwai 2017-03-28 04:59:22 UTC
Michal, could you take a look?
Comment 15 Michal Suchanek 2017-03-30 11:02:27 UTC
The current kernel config builds virtio_blk as a module, but it is not present in the ramdisk.

It seems the openQA qemu config has switched from virtio_scsi to virtio_blk. My old config, copied from openQA, connects the disks as SCSI, which works with the ramdisk; the current openQA qemu config does not.

What determines which modules are added to the installer ramdisk? Presumably adding virtio_blk would resolve the issue, as would reverting the openQA config to connect the disks as SCSI (the same way as the cdrom in the current qemu config).
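
To make the link between the qemu configuration and the ramdisk contents explicit, the sketch below derives the guest kernel modules implied by the `-device` options of a qemu command line. It is only an illustration: the function name `required_guest_modules` is hypothetical, the mapping covers just the device models that appear in this report, and the virtio transport modules (virtio, virtio_ring, virtio_pci) are deliberately left out.

```shell
# Print the guest kernel modules implied by the -device options of a QEMU
# command line. Only device models seen in this bug report are mapped.
required_guest_modules() {
    prev=""
    for arg in "$@"; do
        if [ "$prev" = "-device" ]; then
            model="${arg%%,*}"      # strip the ",key=value" properties
            case "$model" in
                virtio-blk|virtio-blk-pci)         echo virtio_blk ;;
                virtio-scsi|virtio-scsi-pci)       echo virtio_scsi ;;
                virtio-net|virtio-net-pci)         echo virtio_net ;;
                virtio-balloon|virtio-balloon-pci) echo virtio_balloon ;;
                spapr-vscsi)                       echo ibmvscsi ;;
            esac
        fi
        prev="$arg"
    done | LC_ALL=C sort -u
}
```

Fed the `-device` options from comment #5, this prints virtio_blk, virtio_net and virtio_scsi; the first of these is the one missing from the ramdisk.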
Comment 16 Michal Suchanek 2017-03-30 11:06:44 UTC
scsi disk example

QEMU_AUDIO_DRV=spice qemu-system-ppc64 -global isa-fdc.driveA= -g 1024x768 -soundhw ac97 -vga std -m 4096 -machine usb=off -cpu host \
-netdev user,id=qanet1 -device virtio-net,netdev=qanet1,mac=52:54:00:12:34:57 \
-device virtio-scsi-pci,id=scsi0 -device virtio-scsi-pci,id=scsi1 -device scsi-hd,drive=hd1a,bus=scsi0.0 -drive file=/srv/virt/twle.hdd,cache=none,if=none,id=hd1a,serial=mpath1,format=raw -device scsi-hd,drive=hd1b,bus=scsi1.0 -drive file=/srv/virt/twle.hdd,cache=none,if=none,id=hd1b,serial=mpath1,format=raw \
-drive media=cdrom,if=none,id=cd0,format=raw,file=/var/lib/openqa/factory/iso/openSUSE-Tumbleweed-DVD-ppc64le-Snapshot20161102-Media.iso -device scsi-cd,drive=cd0,bus=scsi0.0 \
-boot once=d,menu=on,splash-time=5000 \
-device nec-usb-xhci -device usb-tablet -device usb-kbd -smp 8,threads=8 -enable-kvm \
-chardev spicevmc,id=charchannel0,name=vdagent -device virtio-serial-pci,id=virtio-serial0 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -spice port=5997,disable-ticketing \
-qmp unix:qmp_socket,server,nowait -monitor unix:hmp_socket,server,nowait -monitor telnet:127.0.0.1:20063,server,nowait -serial stdio
Comment 17 Michel Normand 2017-03-31 18:56:25 UTC
(In reply to Michal Suchanek from comment #15)
> The current kernel config has virtio_blk as module but it is not present in
> the ramdisk.
> 
> It seems there has been a switch from virtio_scsi to virtio_blk in the
> openQA qemu config. My old config copied from openQA has disks connected as
> scsi which would work with the ramdisk but current openQA qemu config does
> not.

I am a little confused: I do not see any recent change in openQA/os-autoinst
related to HDDMODEL, which is:
* set to virtio-blk by default (as per comment #5)
* set to scsi-hd for the MULTIPATH test (as per your comment #16)

Both configurations use virtio-scsi-pci as the controller.

> 
> What determines which modules are added to the installer ramdisk? Presumably
> adding virtio_blk would resolve the issue as would reverting the openQA
> config to connect the disks as scsi (same as the cdrom in the current qemu
> config).

The workaround of setting HDDMODEL=scsi-hd is not a viable solution,
because it would ultimately fail, as already reported in
bug #1018262.

So for me the real solution is to understand why virtio/virtio_blk is not present in the ramdisk at install time when the default HDDMODEL is used.
Comment 18 Michel Normand 2017-04-04 17:21:15 UTC
Michal, could you identify somebody familiar with the scripts
used in the "installation-images" package (1), with whom we could discuss
why the generated install-initrd RPMs differ
between ppc64le and x86_64?
virtio_blk.ko is not present for ppc64le, but it is present for x86_64 (2).

(1) https://build.opensuse.org/package/show/openSUSE:Factory:PowerPC/installation-images
(2) 
===
$rpmunpack ../install-initrd-14.304-2.6.ppc64le.rpm
./usr/lib/install-initrd
./usr/lib/install-initrd/default
./usr/lib/install-initrd/default/module.config
./usr/lib/install-initrd/default/module.list
./usr/lib/install-initrd/initrd-base.xz
./usr/lib/install-initrd/openSUSE
./usr/lib/install-initrd/ppc64le
./usr/lib/install-initrd/ppc64le/module.config
./usr/lib/install-initrd/ppc64le/module.list
./usr/sbin/mkinstallinitrd
61588 blocks
[michel@twppc64le2:~/work/binaries.tw/tmp]
$grep virtio ./usr/lib/install-initrd/default/module.list
caif_virtio.ko
virtio-gpu.ko
virtio_balloon.ko
virtio_input.ko
virtio_mmio.ko
virtio_net.ko
virtio_scsi.ko
===
$rpmunpack ../install-initrd-14.304-2.5.x86_64.rpm
./usr/lib/install-initrd
./usr/lib/install-initrd/default
./usr/lib/install-initrd/default/module.config
./usr/lib/install-initrd/default/module.list
./usr/lib/install-initrd/initrd-base.xz
./usr/lib/install-initrd/openSUSE
./usr/sbin/mkinstallinitrd
63146 blocks
[michel@twppc64le2:~/work/binaries.x86_64/tmp]
$grep virtio ./usr/lib/install-initrd/default/module.list
caif_virtio.ko
virtio-gpu.ko
virtio_balloon.ko
virtio_blk.ko
virtio_input.ko
virtio_net.ko
virtio_scsi.ko
===
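
The two grep results above can also be compared mechanically. A minimal sketch, assuming each module.list holds one file name per line as the grep output suggests, is shown below. The function name `diff_module_lists` is made up for illustration.

```shell
# Show the entries that differ between two module.list files
# (one module file name per line).
diff_module_lists() {
    list_a="$1"
    list_b="$2"
    # grep -vxFf FILE prints the lines that equal no whole line of FILE.
    grep -vxFf "$list_b" "$list_a" | sed 's/^/only in first: /'
    grep -vxFf "$list_a" "$list_b" | sed 's/^/only in second: /'
    return 0    # a grep that prints nothing exits 1; that is not an error here
}
```

Applied to the virtio entries listed above, with ppc64le as the first list and x86_64 as the second, this reports virtio_mmio.ko as only in the first and virtio_blk.ko as only in the second.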
Comment 19 Josef Reidinger 2017-04-05 06:40:52 UTC
Steffen, could you answer comment #18?
Comment 20 Steffen Winterfeldt 2017-04-05 08:19:08 UTC
The module location has been moved. :-/
Comment 21 Steffen Winterfeldt 2017-04-05 09:01:36 UTC
fixed

https://github.com/openSUSE/installation-images/pull/175
Comment 22 Michel Normand 2017-04-05 09:34:40 UTC
(In reply to Steffen Winterfeldt from comment #21)
> fixed
> 
> https://github.com/openSUSE/installation-images/pull/175

Thank you.

I will wait for https://build.opensuse.org/request/show/485773
to arrive in TW.
Comment 24 Michel Normand 2017-04-12 16:10:23 UTC
FYI,
the fix is available and has been validated with snapshot 20170411, tested on
https://openqa.opensuse.org/tests/overview?distri=opensuse&version=Tumbleweed&build=20170411&groupid=4