Bug 1203833

Summary: VMs with kernel 5.14.21 freeze after a while
Product: [openSUSE] openSUSE Distribution
Reporter: Axel Schwarzer <SchwarzerA>
Component: Kernel
Assignee: openSUSE Kernel Bugs <kernel-bugs>
Status: RESOLVED DUPLICATE
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: SchwarzerA, tiwai
Version: Leap 15.4
Target Milestone: ---
Hardware: VMWare
OS: openSUSE Leap 15.4
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: hardcopy console lnx02
             hardcopy console lnx99

Description Axel Schwarzer 2022-09-28 09:04:37 UTC
Created attachment 861810
hardcopy console lnx02

We have been running many Linux VMs on our VMware vCenter 7.0.3 for years without difficulties: various Debian and Red Hat releases plus openSUSE Tumbleweed and Leap 15.3.

The day after I updated some VMs from Leap 15.3 to 15.4, I found that they had frozen overnight. I could not determine when that happened and could only grab a last hardcopy from the console. Even journalctl showed nothing useful. One of them, "lnx02", runs a Tomcat-based application; the other, "lnx99", runs Apache with three vhosts. Both are rarely used: lnx02 is a test system and lnx99 is a clone master. No users were on the systems during the freezes.
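
For anyone trying to narrow down when such a freeze happened, here is a minimal sketch. It assumes the journal is persistent (so the previous boot is still readable after the reset) and that journald kept writing until shortly before the freeze; neither is confirmed in this report. The function name is made up for illustration; the journalctl options are standard.

```shell
#!/bin/sh
# last_lines_previous_boot (hypothetical helper): print the final N journal
# lines of the boot before the current one, with ISO timestamps.
# The timestamp of the last line is an upper bound for "still alive".
last_lines_previous_boot() {
    journalctl --boot=-1 --no-pager --lines="${1:-20}" --output=short-iso
}

# Example (run as root, or as a user allowed to read the system journal):
#   last_lines_previous_boot 50
```

If the journal is volatile, boot -1 does not exist and journalctl reports an error instead.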


Repositories in use (mirrored locally):

_ver="15.4"
rsync://ftp5.gwdg.de/pub/opensuse/distribution/leap/${_ver}/iso/
rsync://ftp5.gwdg.de/pub/opensuse/distribution/leap/${_ver}/repo/oss/
rsync://ftp5.gwdg.de/pub/opensuse/distribution/leap/${_ver}/repo/non-oss/
rsync://ftp.halifax.rwth-aachen.de/packman/suse/openSUSE_Leap_${_ver}/
rsync://ftp5.gwdg.de/pub/opensuse/update/leap/${_ver}/oss/
rsync://ftp5.gwdg.de/pub/opensuse/update/leap/${_ver}/non-oss/
rsync://ftp5.gwdg.de/pub/opensuse/repositories/Printing/${_ver}/
rsync://ftp5.gwdg.de/pub/opensuse/update/leap/${_ver}/backports/
rsync://ftp5.gwdg.de/pub/opensuse/update/leap/${_ver}/sle/

Comment 1 Axel Schwarzer 2022-09-28 09:05:06 UTC
Created attachment 861811
hardcopy console lnx99

Comment 2 Takashi Iwai 2022-09-28 09:12:45 UTC
This appears to be the same issue found on openQA: bug 1203630

Comment 3 Takashi Iwai 2022-09-28 09:14:44 UTC
The bug above suggests that it's a regression in the latest 15.4 maintenance update (MU) kernel.
Could you try an older 15.4 kernel?

Comment 4 Axel Schwarzer 2022-09-28 10:21:54 UTC
I've done 'zypper dup' and changed the kernel on lnx99:

S  | Name                           | Type       | Version                              | Arch   | Repository
---+--------------------------------+------------+--------------------------------------+--------+----------------
v  | kernel-default                 | package    | 5.14.21-150400.24.21.2               | x86_64 | oSU.SLES
v  | kernel-default                 | package    | 5.14.21-150400.24.21.2               | x86_64 | repo-sle-update
v  | kernel-default                 | package    | 5.14.21-150400.24.18.1               | x86_64 | oSU.SLES
v  | kernel-default                 | package    | 5.14.21-150400.24.18.1               | x86_64 | repo-sle-update
v  | kernel-default                 | package    | 5.14.21-150400.24.11.1               | x86_64 | oSU.SLES
v  | kernel-default                 | package    | 5.14.21-150400.24.11.1               | x86_64 | repo-sle-update
i+ | kernel-default                 | package    | 5.14.21-150400.22.1                  | x86_64 | oSI.OSS

Is this one sufficiently old? I am rebooting the VM now and will wait…
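
A small sketch for picking a candidate version out of a listing like the one above. It assumes the table comes from something like `zypper search -s kernel-default`; how the kernel was actually switched on lnx99 is not stated here. The helper name `list_versions` is hypothetical.

```shell
#!/bin/sh
# list_versions (hypothetical helper): read a zypper-style table on stdin
# and print each distinct version of the named package, in listed order.
list_versions() {
    awk -F'|' -v pkg="$1" '
        {
            name = $2; ver = $4
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim column padding
            gsub(/^[ \t]+|[ \t]+$/, "", ver)
            if (name == pkg && !(ver in seen)) { seen[ver] = 1; print ver }
        }'
}

# Example:
#   zypper search -s kernel-default | list_versions kernel-default
# A specific older build could then be requested with, for instance:
#   zypper install --oldpackage 'kernel-default=5.14.21-150400.24.18.1'
```

The header and separator rows are skipped because their second column never equals the package name.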

Comment 5 Takashi Iwai 2022-09-28 11:15:42 UTC
IIUC, it's a regression in the very last update, so even 5.14.21-150400.24.18.1 should work, too.

Comment 6 Axel Schwarzer 2022-09-29 06:22:49 UTC
The VM is still alive. It was started on Wed 28 Sep 2022 at 12:22:37 CEST and has since taken part in all daily tasks such as cleanup and archiving, which are scheduled via crontab on each of our machines. So there has been some workload and some root access via ssh.

Do you need further information or is this already sufficient?
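
When reporting a result like this, it can help to state which kernel is actually booted and since when. A minimal sketch, assuming `uptime -s` from procps-ng is available; the function name is made up for illustration.

```shell
#!/bin/sh
# running_kernel_report (hypothetical helper): print the release of the
# currently booted kernel together with the time the system was started.
running_kernel_report() {
    printf 'kernel %s, up since %s\n' "$(uname -r)" "$(uptime -s)"
}

# Example:
#   running_kernel_report
```

This confirms that the VM really booted the older kernel rather than the newest installed one.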

Comment 7 Takashi Iwai 2022-09-30 08:00:45 UTC
It's a bug in NFS. Let's track it in the other bug entry.

*** This bug has been marked as a duplicate of bug 1203630 ***