Bug 989084

Summary: The last kernel update broke "ecryptfs"
Product: [openSUSE] openSUSE Distribution
Reporter: Neil Rickert <nwr10cst-oslnx>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: astieger, forgotten_rwiaD0Rwva, forgotten_sZOYI8rJY8, gleixner, jeffm, maintenance, mhocko, nwr10cst-oslnx, tiwai
Version: Leap 42.1
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 42.1
Whiteboard:
Found By: Community User
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: "typescript" file showing journalctl output for several boots

Description Neil Rickert 2016-07-15 02:04:44 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.21 (KHTML, like Gecko) konqueror/4.14.18 Safari/537.21
Build Identifier: 

I'm using "ecryptfs" with a private directory (/home/rickert/Private )

I last used this just before the kernel update (to kernel 4.1.27-24-default).  It was working fine.

I tried looking at my private directory a little while ago, and got a weird message about wrong medium type (using "ls" at the command line).

I rebooted.  Still the same problem.

I then rebooted to the previous kernel, and my private directory is fine with the previous kernel (4.1.26-21-default).  So this looks like a bug in the 4.1.27 kernel.

I will attach the output from:

journalctl -b -1 | grep ecryptfs

Actually, it will be a "typescript" file, showing similar output for several boots (-1, -2, -3, -4 and the current boot)
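
A minimal sketch of how output like this could be collected for several boots in one pass. It assumes a persistent systemd journal, so that earlier boots are still available to journalctl; the function name is hypothetical and not part of the original report.

```sh
# Print ecryptfs-related journal lines for the current boot (0) and the
# four boots before it, with a header line per boot to tell them apart.
ecryptfs_journal() {
    for boot in 0 -1 -2 -3 -4; do
        echo "=== boot $boot ==="
        # grep exits non-zero when nothing matches; that is not an error here.
        journalctl -b "$boot" | grep ecryptfs || true
    done
}

# Usage: ecryptfs_journal > ecryptfs-boots.txt
```

Wrapping the loop in a function keeps the per-boot headers and the grep filter together, so the same filter is applied to every boot.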

Reproducible: Always
Comment 1 Neil Rickert 2016-07-15 02:09:33 UTC
Created attachment 684349
"typescript" file showing journalctl output for several boots

To help identify the various boots, here is the output from:
# last | grep reboot | head -5
reboot   system boot  4.1.26-21-defaul Thu Jul 14 20:37 - 20:52  (00:14)    
reboot   system boot  4.1.27-24-defaul Thu Jul 14 20:31 - 20:37  (00:05)    
reboot   system boot  4.1.27-24-defaul Thu Jul 14 09:07 - 20:30  (11:22)    
reboot   system boot  4.1.26-21-defaul Sun Jul 10 20:20 - 08:57 (3+12:37)   
reboot   system boot  4.1.26-21-defaul Thu Jul  7 07:05 - 19:12 (3+12:07)
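
As an illustration only, the listing above could be reduced to "which boots ran the suspect kernel" with a small awk filter. The function name and the "suspect"/"ok" labels are hypothetical. Note that last truncates the kernel string (for example "4.1.26-21-defaul"), so the match is done on a version prefix.

```sh
# Read `last` output on stdin. For each "reboot" record print the boot
# start time, the (truncated) kernel string, and whether that string
# starts with the version prefix given as $1.
boots_by_kernel() {
    awk -v bad="$1" '
        $1 == "reboot" && $2 == "system" && $3 == "boot" {
            status = (index($4, bad) == 1) ? "suspect" : "ok"
            printf "%s %s %s %s\t%s\t%s\n", $5, $6, $7, $8, $4, status
        }'
}

# Usage: last | boots_by_kernel 4.1.27-24
```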
Comment 2 Neil Rickert 2016-07-15 03:25:17 UTC
Here's some output.  This was on a second computer with the same problem.  This computer is running kernel 4.1.27-24-default (the one with problems).

--- begin output ---
nwr3:rickert 1% df
Filesystem                1K-blocks     Used Available Use% Mounted on
devtmpfs                     984340        0    984340   0% /dev
tmpfs                        991272      176    991096   1% /dev/shm
tmpfs                        991272     2196    989076   1% /run
tmpfs                        991272        0    991272   0% /sys/fs/cgroup
/dev/mapper/nwr3vol-root2  41153856 17218104  21822216  45% /
/dev/sda5                   2048000  1194893    853107  59% /windows/D
/dev/sda6                    516040    64380    425448  14% /boot
/dev/mapper/nwr3vol-home   67984968   823544  66107012   2% /xhome
/dev/mapper/cr_shared     459873248 88792936 370129252  20% /shared
/home/rickert/.Private     67984968   823544  66107012   2% /home/rickert/Private
nwr3:rickert 2% ls Private
ls: cannot open directory Private: Wrong medium type
nwr3:rickert 3% cd Private
nwr3:rickert 4% ls
ls: cannot open directory .: Wrong medium type
nwr3:rickert 5% head -3 calendar
Mon Sep 12      2016 Dr Becker, 1:00 pm
Tue Jul 26      2016 Dentist, 9:00 am
Mon Jun 01      2018 Dr Zubair set up endoscopy & colonoscopy
nwr3:rickert 6% head iso-usb.txt
32bit 4G        slackware64-14.2-install-dvd.iso
64bit 4G        openSUSE-13.2-Rescue-CD-x86_64.iso
32bit 8G        openSUSE-Leap-42.2-DVD-x86_64-Build0045-Media.iso (Alpha2)
64bit 8G        openSUSE-Leap-42.1-DVD-x86_64.iso
PNY   4G        Li-f-e-16.04.amd64.iso
CruzA 8G        openSUSE-Tumbleweed-DVD-x86_64-Snapshot20160621-Media.iso
CruzB 8G        openSUSE-Edu-li-f-e.x86_64-42.1.1.iso
nwr3:rickert 7% 
--- end output ---

So it looks as if the ecryptfs private directory is mounted, and files are accessible by name.  But "ls" gives an error, and probably anything that does a readdir() will fail.
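
The two symptoms described here (listing fails, access by name works) can be checked separately. The following is a rough sketch, not a diagnostic tool; the function name is hypothetical, and the second argument must be a file that is already known to exist in the directory.

```sh
# Report separately whether a directory can be listed (which needs
# readdir) and whether one known file inside it can be opened by name.
# $1 = directory, $2 = name of a file expected to exist in it.
probe_dir() {
    if ls "$1" >/dev/null 2>&1; then
        echo "list: ok"
    else
        echo "list: FAILED"
    fi
    if head -c 1 "$1/$2" >/dev/null 2>&1; then
        echo "open by name: ok"
    else
        echo "open by name: FAILED"
    fi
}

# Usage: probe_dir "$HOME/Private" calendar
```

On an affected kernel one would expect the first check to fail and the second to succeed, matching the transcript above.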
Comment 3 Takashi Iwai 2016-07-15 05:12:10 UTC
There have been a few fixes for ecryptfs due to the recent security issue (CVE-2016-1583), so most likely one of them caused a regression.

Could you test kernel-vanilla to check whether it shows the same problem?  Since we have two fixes from 4.1.x stable and one more fix of our own, we need to identify which one causes the issue.
Comment 4 Neil Rickert 2016-07-15 05:34:30 UTC
The same problem exists with kernel-vanilla.
Comment 5 Takashi Iwai 2016-07-15 05:54:16 UTC
OK, then it's the upstream fix that broke it.

I'm building a test kernel in OBS home:tiwai:bnc989084 with a partial revert that was already included in SLE11-SP2 / oS42.2 kernel.  It'll take some time (maybe an hour or so) until the build finishes.  Please try the kernel later.
Comment 6 Michal Hocko 2016-07-15 06:48:10 UTC
Just wondering: what filesystem type are you using for /home/rickert/.Private?
Comment 10 Andreas Stieger 2016-07-15 11:44:59 UTC
*** Bug 989157 has been marked as a duplicate of this bug. ***
Comment 11 Takashi Iwai 2016-07-15 11:47:52 UTC
The test kernel is ready at:
  http://download.opensuse.org/repositories/home:/tiwai:/bnc989084/standard/

Please check whether this works.
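
For testers unsure how to pick up a kernel from such a repository, one possible approach with zypper is sketched below. This is an assumption about the workflow, not something stated in this report: the function name and the default alias are hypothetical, the commands must be run as root, and zypper may ask to trust the repository's signing key.

```sh
# Add a test repository under a local alias, refresh it, and install
# kernel-default from that repository only.
# $1 = repository URL, $2 = local alias (hypothetical default below).
install_test_kernel() {
    repo_url=$1
    alias_name=${2:-test-kernel}
    zypper addrepo "$repo_url" "$alias_name" &&
    zypper refresh "$alias_name" &&
    zypper install --from "$alias_name" kernel-default
}

# Usage (as root):
#   install_test_kernel http://download.opensuse.org/repositories/home:/tiwai:/bnc989084/standard/ bnc989084
```

Restricting the install with --from keeps packages from the test repository from being mixed in with anything other than the kernel being tested.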
Comment 12 Forgotten User sZOYI8rJY8 2016-07-15 12:10:07 UTC
@Takashi Iwai:
the suggested test kernel (kernel-default-4.1.28-1.1.g8197286.x86_64.rpm) 
solved that issue for me! - Thank you!
Comment 13 Takashi Iwai 2016-07-15 12:15:00 UTC
Good to hear.  The fix was already merged to the kernel git openSUSE-42.1 branch, so it'll be included in tomorrow's KOTD:
   http://download.opensuse.org/repositories/Kernel:/openSUSE-42.1/standard/

Since it's a bad regression, we'll need to submit the new kernel again, I suppose.  I'm going to submit it early next week after some smoke tests.
Comment 14 Neil Rickert 2016-07-15 12:41:48 UTC
I tried that kernel.  I installed kernel-default-4.1.28.*.  I hope that was the one that I was supposed to test.

If I login at the command line, then

ls Private

works as it should.

If I login to KDE/Plasma 5, I get a kernel panic.  I've tried that as two different users, only one of them using "ecryptfs".  Either way, I end up in a kernel panic.

And a response to comment #6: The file systems are "ext4" (both computers that see the problem).
Comment 15 Takashi Iwai 2016-07-15 12:58:36 UTC
(In reply to Neil Rickert from comment #14)
> If I login to KDE/Plasma 5, I get a kernel panic.  I've tried that as two
> different users, only one of them using "ecryptfs".  Either way, I end up in
> a kernel panic.
> 
> And a response to comment #6: The file systems are "ext4" (both computers
> that see the problem).

That's bad.  It might be something from the latest 4.1.28 stable updates.

I'm building another kernel, just cherry-picking two fix commits on top of the previous 4.1.27-24 kernel.  It's being built on OBS home:tiwai:branches:openSUSE:Leap:42.1:Update repo.

Please check this later.  If this works, I'll submit it as a fast-path fix, then fix the other 4.1.28 regression later.
Comment 16 Forgotten User sZOYI8rJY8 2016-07-15 13:32:05 UTC
I can log in to KDE, but I also got a kernel panic after running the system for some time (doing nothing)!
(using btrfs)
I will report on that later...
Comment 17 Takashi Iwai 2016-07-15 13:45:45 UTC
It seems that 4.1.28 is a really bad release.  I did one more revert, of the commit that was reported on LKML.  The new 4.1.28 kernel is being built on OBS home:tiwai:bnc989084-2 repo.

So, at first, please test the 4.1.27 kernel with the ecryptfs fix in OBS home:tiwai:branches:openSUSE:Leap:42.1:Update repo.  (It's being built now.)  If this works, I'll submit this one as a quick fix.

After that, please test the one in OBS home:tiwai:bnc989084-2 repo.  (Also it's being built now.)  This is based on 4.1.28, and contains ecryptfs fixes and a few reverts.
Comment 19 Michal Hocko 2016-07-15 14:59:03 UTC
JFTR c5ad33184354260be6d05de57e46a5498692f6d6 is misbackported same as http://lkml.kernel.org/r/20160714175521.3675e3d6%40gandalf.local.home
Comment 20 Michal Hocko 2016-07-15 15:00:17 UTC
(In reply to Michal Hocko from comment #19)
> JFTR c5ad33184354260be6d05de57e46a5498692f6d6 is misbackported same as
> http://lkml.kernel.org/r/20160714175521.3675e3d6%40gandalf.local.home

Ble, sorry the original one was 4.1 as well and that is what Takashi probably meant.
Comment 21 Neil Rickert 2016-07-17 20:54:29 UTC
>So, at first, please test the 4.1.27 kernel with the ecryptfs fix in OBS home:tiwai:branches:openSUSE:Leap:42.1:Update repo.

For that, I used the url "http://download.opensuse.org/repositories/home:/tiwai:/branches:/openSUSE:/Leap:/42.1:/Update/openSUSE_Leap_42.1_Update/".  I hope that was right.

It seems to be working fine.  In particular, the "ecryptfs" problem no longer exists.

>The new 4.1.28 kernel is being built on OBS home:tiwai:bnc989084-2 repo.

I also tried this one.  And I am still getting kernel panics. So 4.1.28 is not looking good at the moment.
Comment 22 Bernhard Wiedemann 2016-07-18 08:00:14 UTC
This is an autogenerated message for OBS integration:
This bug (989084) was mentioned in
https://build.opensuse.org/request/show/411683 42.1 / kernel-source
Comment 23 Takashi Iwai 2016-07-18 08:05:00 UTC
I have now submitted a quick fix concentrating only on ecryptfs to openSUSE:Leap:42.1:Update.  It'll be 4.1.27-based.

For the remaining issues, the regressions in 4.1.28, let's continue tracking in bug 989176.

Thanks for reporting and testing!
Comment 24 Andreas Stieger 2016-07-18 10:02:00 UTC
What is your expectation for a turn-around time?
Do you expect to submit for bug 989176 soon to be included as well?
Comment 25 Takashi Iwai 2016-07-18 10:11:38 UTC
(In reply to Andreas Stieger from comment #24)
> What is your expectation for a turn-around time?

Please take SR#411683 quickly.  It contains only one patch on top of the previously released kernel, so it's absolutely safe.

> Do you expect to submit for bug 989176 soon to be included as well?

4.1.28 isn't yet in openSUSE:Leap:42.1:Update, and I'm not going to submit it until its many problems are settled.  Maybe after 4.1.29.
Comment 26 Takashi Iwai 2016-07-20 15:42:52 UTC
*** Bug 989805 has been marked as a duplicate of this bug. ***
Comment 27 Neil Rickert 2016-07-20 15:58:08 UTC
I'll note that the patch to fix this is already in the "update test" repo for Leap 42.1.  I've installed that on one system, where it is doing fine.  The "update test" repo is one of the community repos that you can easily add with YaST software repositories.
Comment 28 Takashi Iwai 2016-07-21 05:18:08 UTC
*** Bug 989865 has been marked as a duplicate of this bug. ***
Comment 29 Swamp Workflow Management 2016-07-21 22:09:36 UTC
openSUSE-RU-2016:1849-1: An update that has two recommended fixes can now be installed.

Category: recommended (important)
Bug References: 987990,989084
CVE References: 
Sources used:
openSUSE Leap 42.1 (src):    kernel-debug-4.1.27-27.1, kernel-default-4.1.27-27.1, kernel-docs-4.1.27-27.2, kernel-ec2-4.1.27-27.1, kernel-obs-build-4.1.27-27.2, kernel-obs-qa-4.1.27-27.1, kernel-obs-qa-xen-4.1.27-27.1, kernel-pae-4.1.27-27.1, kernel-pv-4.1.27-27.1, kernel-source-4.1.27-27.1, kernel-syms-4.1.27-27.1, kernel-vanilla-4.1.27-27.1, kernel-xen-4.1.27-27.1, virtualbox-5.0.24-25.1
Comment 31 Swamp Workflow Management 2016-09-12 12:14:55 UTC
openSUSE-SU-2016:2290-1: An update that solves 17 vulnerabilities and has 9 fixes is now available.

Category: security (important)
Bug References: 963931,970948,971126,971360,974266,978821,978822,979018,979213,979879,980371,981058,981267,986362,986365,986570,987886,989084,989152,989176,990058,991110,991608,991665,994296,994520
CVE References: CVE-2015-8787,CVE-2016-1237,CVE-2016-2847,CVE-2016-3134,CVE-2016-3156,CVE-2016-4485,CVE-2016-4486,CVE-2016-4557,CVE-2016-4569,CVE-2016-4578,CVE-2016-4580,CVE-2016-4805,CVE-2016-4951,CVE-2016-4998,CVE-2016-5696,CVE-2016-6480,CVE-2016-6828
Sources used:
openSUSE Leap 42.1 (src):    drbd-8.4.6-8.1, hdjmod-1.28-24.1, ipset-6.25.1-5.1, kernel-debug-4.1.31-30.2, kernel-default-4.1.31-30.2, kernel-docs-4.1.31-30.3, kernel-ec2-4.1.31-30.2, kernel-obs-build-4.1.31-30.3, kernel-obs-qa-4.1.31-30.1, kernel-obs-qa-xen-4.1.31-30.1, kernel-pae-4.1.31-30.2, kernel-pv-4.1.31-30.2, kernel-source-4.1.31-30.1, kernel-syms-4.1.31-30.1, kernel-vanilla-4.1.31-30.2, kernel-xen-4.1.31-30.2, lttng-modules-2.7.0-2.1, pcfclock-0.44-266.1, vhba-kmp-20140928-5.1