Bug 1018911

Summary: Total freeze with kernels bigger than 4.4.27-2.1
Product: [openSUSE] openSUSE Distribution
Reporter: Anton Smorodskyi <anton.smorodskyi>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: anton.smorodskyi, jslaby, sebastian.chlad, slindomansilla, tiwai, wvvelzen
Version: Leap 42.2
Flags: tiwai: needinfo? (anton.smorodskyi)
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Anton Smorodskyi 2017-01-09 16:51:43 UTC
1. I am using KDE Plasma as my desktop environment.
2. I have a Dell Latitude E7470.

I don't have a clear scenario for when or how it happens. I just noticed that after upgrading to 4.4.36-8.1 (currently the latest in the Main Update Repository) my system totally freezes at least once a day. After a freeze the system is unreachable even via magic SysRq requests, so there is no way to get a dump. This has been happening every day for a week (sometimes several times a day). After I downgraded the kernel to 4.4.27-2.1 the issue disappeared, and I had never hit it with that version over a long period of use. So some change between 4.4.27-2.1 and 4.4.36-8.1 causes a total kernel freeze.
Comment 1 Takashi Iwai 2017-01-09 16:56:17 UTC
Could you check KOTD (OBS Kernel:openSUSE-42.2 repo) to see whether the bug is still present?  Also, try to set up kdump.  With luck, we might catch something.
Comment 2 Anton Smorodskyi 2017-01-12 08:26:45 UTC
Some updates from my side:

1. I enabled kdump. First I reinstalled 4.4.36-8 (the latest on Leap 42.2) and reproduced the problem, meaning my system hung again. I tried pressing Alt+SysRq+C, but unfortunately found nothing in /var/crash after the restart.
2. I have now installed the latest KOTD, 4.10.0-rc3-1.gf1c24bb, and am waiting for the problem to reproduce.
3. One thing I notice whenever kernel-default is updated to any version on my system is these two "ln" errors:

```
(1/1) Installing: kernel-default-4.10.rc3-1.1.gf1c24bb.x86_64 ...........................................................................................................................[done]
Additional rpm output:
ln: failed to create symbolic link '/boot/vmlinuz': Operation not permitted
ln: failed to create symbolic link '/boot/initrd': Operation not permitted
```

A manual attempt to create the vmlinuz symlink also leads to the same error:
```
ln -s -T /boot/vmlinuz-4.10.0-rc3-1.gf1c24bb-default /boot/vmlinuz
ln: failed to create symbolic link '/boot/vmlinuz': Operation not permitted
```
Comment 3 Takashi Iwai 2017-01-12 09:55:06 UTC
alt-sysrq-c is not needed at the time the system crashes; it only simulates a crash.  Once kdump is set up, it should be triggered automatically on a kernel panic or similar.

To test kdump, just boot normally and, while the system is running, try alt-sysrq-c.  If this doesn't produce a crash dump, then either the kdump setup is insufficient or kdump is buggy somewhere.  YaST often reserves too little memory for kdump.  Try increasing the low memory size.
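
Before triggering a test crash, it can help to confirm that a capture kernel is both reserved and loaded. The following is a minimal sketch, not the official kdump tooling: the function name `kdump_check` and the `CMDLINE_FILE` / `LOADED_FILE` overrides are hypothetical and exist only so the check can be tried against sample files. The defaults are the standard `/proc/cmdline` and `/sys/kernel/kexec_crash_loaded` interfaces.

```shell
#!/bin/sh
# Sketch: report whether memory is reserved for a crash kernel and whether
# that kernel is loaded. CMDLINE_FILE and LOADED_FILE are illustrative
# overrides; by default the real kernel interfaces are read.

kdump_check() {
    cmdline_file=${CMDLINE_FILE:-/proc/cmdline}
    loaded_file=${LOADED_FILE:-/sys/kernel/kexec_crash_loaded}

    # 1. Is memory reserved for the capture kernel?
    if ! grep -q 'crashkernel=' "$cmdline_file" 2>/dev/null; then
        echo "no crashkernel= parameter on the kernel command line"
        return 1
    fi

    # 2. Is the capture kernel actually loaded? (1 = loaded, 0 = not loaded)
    if [ "$(cat "$loaded_file" 2>/dev/null)" != "1" ]; then
        echo "crashkernel= is set, but no crash kernel is loaded"
        return 1
    fi

    echo "kdump looks armed"
}

# Once kdump_check succeeds, a test crash can be triggered as root with
#   echo c > /proc/sysrq-trigger
# which has the same effect as alt-sysrq-c. This crashes the running system.
```

If the check passes but a test crash still leaves nothing in /var/crash, the reserved memory size is the first thing to revisit, as suggested above.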

The errors when installing the 4.10-rc kernel are unrelated to kdump or the other issues.  What does /boot/vmlinuz point to?  Show the output of "ls -l /boot/vmlinuz".
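
A small sketch for collecting the requested information; the helper name `describe_link` is hypothetical. It only reports what a path currently is and does not try to fix anything. As background rather than a diagnosis of this report: symlink(2) documents EPERM ("Operation not permitted") for filesystems that do not support symbolic links, so the filesystem type of /boot may also be worth noting.

```shell
#!/bin/sh
# Sketch: describe what a path such as /boot/vmlinuz currently is.

describe_link() {
    path=$1
    if [ -L "$path" ]; then
        # A symlink: show its target without following it further.
        echo "$path -> $(readlink "$path")"
    elif [ -e "$path" ]; then
        echo "$path exists but is not a symlink"
    else
        echo "$path does not exist"
    fi
}

# Example use on the affected machine:
#   describe_link /boot/vmlinuz
#   describe_link /boot/initrd
#   df -T /boot        # shows the filesystem type that holds /boot
```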
Comment 4 Anton Smorodskyi 2017-01-12 14:46:28 UTC
I managed to reproduce the issue with 4.4.41-1.1.g3bf02b3 from KOTD (OBS Kernel:openSUSE-42.2 repo), but again got no dump. Regarding the dump, there appears to be a separate problem, which I filed as another bug: https://bugzilla.suse.com/show_bug.cgi?id=1019590. Until that is fixed I can't do much, so I have returned to the stable, working 4.4.27-2.1.
Comment 5 Takashi Iwai 2017-01-12 15:32:37 UTC
OK, thanks.  Could you then try booting with the nomodeset option?  This will disable i915 graphics.  If the problem goes away with that, the likely culprit is the i915 updates.
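
For such a test it is worth confirming that the option really took effect on the running boot. A minimal sketch, assuming the option is added for a single boot by editing the kernel line in the boot loader menu; the helper name `has_boot_option` is hypothetical and the optional second argument exists only to allow trying it on a sample file.

```shell
#!/bin/sh
# Sketch: check whether an exact parameter is present on the kernel
# command line (default: the running kernel's /proc/cmdline).

has_boot_option() {
    opt=$1
    cmdline_file=${2:-/proc/cmdline}
    # Put one parameter per line, then require an exact, literal match
    # so that e.g. "modeset" does not match "nomodeset".
    tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qxF -- "$opt"
}

# Example use after rebooting with the option:
#   if has_boot_option nomodeset; then echo "nomodeset is active"; fi
```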
Comment 8 Anton Smorodskyi 2017-01-20 14:27:59 UTC
I managed to reproduce the bug with the new firmware:

Repository     : Main Update Repository     
Name           : kernel-firmware            
Version        : 20160516git-5.1            
Arch           : noarch                     
Vendor         : openSUSE                   
Installed Size : 135.6 MiB                  
Installed      : Yes                        
Status         : up-to-date

I am not able to use nomodeset on my workstation, so I am removing the "NEEDINFO" flag.
Comment 9 Takashi Iwai 2017-01-20 15:00:11 UTC
OK, thanks.  Then let's try an aggressive way: copy the old i915.ko to the new kernel directory.

- Boot the latest 4.4.x kernel and store its version in $VERSION:
    VERSION=$(uname -r)

- mkdir /lib/modules/$VERSION/updates

- cp /lib/modules/4.4.27-1-default/kernel/drivers/gpu/drm/i915.ko \
    /lib/modules/$VERSION/updates/

- /sbin/depmod -a

- /sbin/modinfo i915 | grep file
  filename:   /lib/modules/.../updates/i915.ko

- Reboot and retest.
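
The copy step above can be sketched as a small helper. This is an illustration under assumptions, not a supported procedure: the function name `stage_old_module` and its third argument (an alternative modules root, used only for dry runs) are hypothetical, and the module is located with `find` rather than a fixed path because the exact subdirectory and the version string of the old kernel may differ from the ones written in the steps. A module built for a different kernel may also refuse to load if its symbol versions do not match.

```shell
#!/bin/sh
# Sketch: copy i915.ko from a known-good kernel's module tree into the
# "updates" directory of the kernel under test.

stage_old_module() {
    old_version=$1                     # version string of the known-good kernel
    new_version=$2                     # e.g. $(uname -r) on the kernel under test
    modules_root=${3:-/lib/modules}    # override only for dry runs

    # Locate the module (possibly compressed) under the old kernel's DRM drivers.
    src=$(find "$modules_root/$old_version/kernel/drivers/gpu/drm" \
              -type f -name 'i915.ko*' 2>/dev/null | head -n 1)
    if [ -z "$src" ]; then
        echo "i915 module not found for $old_version" >&2
        return 1
    fi

    mkdir -p "$modules_root/$new_version/updates" &&
    cp "$src" "$modules_root/$new_version/updates/" &&
    echo "staged $src"
}

# Afterwards, as in the steps above (as root):
#   /sbin/depmod -a
#   /sbin/modinfo i915 | grep file     # should now point into .../updates/
```

To undo the experiment, remove the copied file from the updates directory and run depmod again.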
Comment 10 Sebastian Chlad 2017-02-08 11:53:11 UTC
@Takashi Iwai:

out of curiosity. Is this the commit which solves the issue?
commit fdf35a6b22247746a7053fc764d04218a9306f82
Author: Takashi Iwai <tiwai@suse.de>
Date:   Mon Jan 9 15:56:14 2017 +0100

    drm: Fix broken VT switch with video=1366x768 option
Comment 11 Takashi Iwai 2017-02-08 11:59:45 UTC
(In reply to Sebastian Chlad from comment #10)
> @Takashi Iwai:
> 
> out of curiosity. Is this the commit which solves the issue?
> commit fdf35a6b22247746a7053fc764d04218a9306f82
> Author: Takashi Iwai <tiwai@suse.de>
> Date:   Mon Jan 9 15:56:14 2017 +0100
> 
>     drm: Fix broken VT switch with video=1366x768 option

I don't think so.  It's for bsc#1018358.
E7470 has a higher resolution, IIRC.
Comment 12 Sergio Lindo Mansilla 2017-03-23 08:29:11 UTC
It also happened to me.

- Dell Latitude E7470
- openSUSE Leap 42.2

After downgrading the kernel as Anton Smorodskyi did, it hasn't happened again.
Comment 13 Anton Smorodskyi 2017-04-12 09:31:03 UTC
The problem went away after upgrading to kernel 4.4.57-18.3-default on the machine where it was initially found.
Comment 14 Jiri Slaby 2017-08-22 14:06:25 UTC
good