Bug 941305 - ldconfig fails with message "Aborted" after upgrade of sbl to version 3.5.0.20130317.git7a75bc29-24.4.1
Status: RESOLVED FIXED
Alias: None
Product: openSUSE Distribution
Classification: openSUSE
Component: Other
Version: 13.2
Hardware: PC openSUSE 13.2
Importance: P2 - High : Normal
Target Milestone: ---
Assignee: Jan Kara
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-08-11 14:19 UTC by Giacomo Comes
Modified: 2015-10-29 16:54 UTC
CC List: 4 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
xfs patches used for testing (2.88 KB, application/x-compressed-tar)
2015-08-13 17:09 UTC, Giacomo Comes
metadump of corrupted filesystem (1.46 MB, application/x-xz)
2015-08-14 16:13 UTC, Giacomo Comes
new metadump of corrupted filesystem (with xfs patches) (1.74 MB, application/x-xz)
2015-08-15 12:29 UTC, Giacomo Comes
xfs: Extend tracing of rename operations (2.52 KB, patch)
2015-08-20 17:42 UTC, Jan Kara
[PATCH v2] xfs: Extend tracing of rename operations (4.24 KB, patch)
2015-08-20 18:08 UTC, Jan Kara
requested xfs_trace (2.53 KB, application/x-xz)
2015-08-20 20:50 UTC, Giacomo Comes
[PATCH v3] xfs: Extend tracing of rename operations (6.50 KB, patch)
2015-08-21 08:28 UTC, Jan Kara
second requested xfs_trace (2.46 KB, application/x-xz)
2015-08-21 13:19 UTC, Giacomo Comes
xfs: Fix file type directory corruption for btree directories (2.18 KB, patch)
2015-08-21 17:54 UTC, Jan Kara

Description Giacomo Comes 2015-08-11 14:19:34 UTC
ldconfig stopped working (message: Aborted, returncode: 134) after a recent update. The update causing the problem is sbl-3.5.0.20130317.git7a75bc29-24.4.1.
Here is how to test it:
Remove sbl if installed:
  zypper rm sbl
install the version provided with 13.2:
  zypper in sbl-3.5.0
now install the update
  zypper in sbl
This error message appears:

Additional rpm output:
/var/tmp/rpm-tmp.7MTEpn: line 1: 26734 Aborted                 /sbin/ldconfig

now ldconfig does not work anymore:

  ldconfig
Aborted

I could not figure out what is causing the problem, but I suspect the preinstall scriptlet has something to do with it.
If I remove sbl and install the latest version directly, then ldconfig works again.
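
For readers wondering about the return code: 134 is the shell's way of reporting a process killed by a signal (128 plus the signal number), and signal 6 is SIGABRT, which matches the "Aborted" message. A minimal sketch of that arithmetic follows; the helper name `describe_exit_status` is made up for illustration and is not part of any tool mentioned here.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Shells report "killed by signal N" as 128 + N, so values above 128
    are decoded back into a signal name where possible.
    """
    if status > 128:
        signo = status - 128
        try:
            return f"killed by {signal.Signals(signo).name}"
        except ValueError:
            # Not a signal number known on this platform.
            return f"killed by unknown signal {signo}"
    return f"exited with status {status}"


# The return code reported for ldconfig in this bug.
print(describe_exit_status(134))
```

On Linux this reports SIGABRT for 134, i.e. ldconfig called abort() (or failed an assertion) rather than exiting normally.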
Comment 1 Martin Pluskal 2015-08-12 07:20:29 UTC
This issue looks remotely similar to boo#908780, are you by any chance using xfs?
Comment 2 Giacomo Comes 2015-08-12 15:07:40 UTC
Yes, / is xfs.
Looking at boo#908780 comment 23, there is a mention of a patch in boo#910336 that may fix the problem, but I cannot find it. If you can point me to that patch, I can apply it to the current kernel and test it.
Comment 3 Andreas Schwab 2015-08-12 15:27:51 UTC
See https://apibugzilla.novell.com/show_bug.cgi?id=908780#c27

*** This bug has been marked as a duplicate of bug 908780 ***
Comment 4 Jan Kara 2015-08-13 07:52:52 UTC
So the patches mentioned in bug 908780 are commits 0d612fb570b7 "xfs: ensure buffer types are set correctly".. 3443a3bca545 "xfs: set superblock buffer type correctly" upstream (in total 4 patches). They got merged in kernel 4.0 and backported to some -stable trees but we don't have them in 13.2 kernel. I'll push these patches to 13.2 kernel.
Comment 5 Jan Kara 2015-08-13 08:29:29 UTC
OK, I've pushed the fixes to openSUSE-13.2 branch.
Comment 6 Giacomo Comes 2015-08-13 17:09:25 UTC
Created attachment 643793 [details]
xfs patches used for testing

I have applied the set of four xfs patches included in the attachment to the current 13.2 kernel 3.16.7-21.1. I built a new kernel, installed it and rebooted.

Then again:
zypper rm sbl
zypper in sbl-3.5.0
zypper in sbl

At the end ldconfig is still broken.

Can somebody else test the patches on 13.2? Either I did something wrong, or the patches are not enough to fix the problem for the kernel 3.16.
Comment 7 Jan Kara 2015-08-14 08:33:36 UTC
Side note: Please don't set text/plain type for tar.gz archives. Bugzilla gets confused by that ;)

I'll have a look at the problem on Monday. Indeed it seems these patches aren't enough, but given that you are able to reproduce the issue, we should be able to drill down to the cause. Can you please boot from a rescue CD and run:

xfs_metadump -o <root-device> - | gzip -c > image_file.gz

where image_file.gz is on some other partition / USB stick?

Then please make the file available for download. The file will contain metadata from your filesystem which I will use for inspection if I fail to reproduce the issue on a test machine. Thanks!
Comment 8 Giacomo Comes 2015-08-14 12:44:47 UTC
I will install the newly released kernel and then create the metadump.

As a side note, the packages virtualbox-{host,guest}-kmp-<flavor> were not built for the latest kernel, leading to the installation of kernel-<flavor>-base in my case. Can you please make them build?
Comment 9 Giacomo Comes 2015-08-14 16:13:12 UTC
Created attachment 643916 [details]
metadump of corrupted filesystem

Here is the metadump output of a system with the latest kernel 3.16.7-24.1
after the installation of sbl and broken ldconfig
Comment 10 Giacomo Comes 2015-08-15 12:29:05 UTC
Created attachment 643953 [details]
new metadump of corrupted filesystem (with xfs patches)

It turns out that the latest kernel 3.16.7-24.1 does not contain the xfs patches. I have built a custom kernel which includes them.

Here is a new metadump output of a system with kernel 3.16.7-24.1 + xfs patches
after the installation of sbl and broken ldconfig
Comment 11 Jan Kara 2015-08-20 17:41:24 UTC
OK, so I've spent some time on this. I cannot reproduce the problem on my test machine but that doesn't really surprise me that much.

Looking into what zypper exactly does, I can see that it first creates a symlink under a different name and then renames it to a new name replacing the original regular file (the entry with wrong filetype is for /usr/lib64/libbrld.so.1). However on my xfs filesystem filetype gets properly updated while in your case it does not.
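
The sequence described above can be sketched from userspace as follows. This is only an illustration under assumptions: the file names and the `;tmp` suffix are made up, and a small temporary directory will most likely not use the directory format that /usr/lib64 has, so do not expect it to trigger the bug even on an affected kernel. It compares the type reported for the directory entry (`os.scandir`, which uses the dirent type when the filesystem provides one and falls back to lstat otherwise) with the type of the inode itself (`os.lstat`).

```python
import os
import stat
import tempfile
from typing import Tuple


def replace_file_with_symlink(directory: str, name: str, target: str) -> None:
    """Create a symlink under a temporary name, then rename it over `name`."""
    tmp_path = os.path.join(directory, name + ";tmp")  # hypothetical temp name
    os.symlink(target, tmp_path)
    os.rename(tmp_path, os.path.join(directory, name))  # atomic replace


def entry_types(directory: str, name: str) -> Tuple[bool, bool]:
    """Return (directory entry says symlink, inode says symlink)."""
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name == name:
                dirent_is_link = entry.is_symlink()
                inode_is_link = stat.S_ISLNK(os.lstat(entry.path).st_mode)
                return dirent_is_link, inode_is_link
    raise FileNotFoundError(name)


with tempfile.TemporaryDirectory() as workdir:
    # Start with a regular file, as the old package version had.
    open(os.path.join(workdir, "libdemo.so.1"), "w").close()
    replace_file_with_symlink(workdir, "libdemo.so.1", "libdemo.so.1.0")
    print(entry_types(workdir, "libdemo.so.1"))
```

On a healthy filesystem both values agree. A disagreement between the two would correspond to the stale file type discussed in this bug.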

I was looking into the metadump, and when dumping the log I can see that the buffer with libbrld.so.1 does get updated, but it stores the new inode number together with the original file type. So this rules out issues with journalling or the like.

Looking into the code, the filetype gets updated right next to the inode number, so that leaves me at a loss as to how the filetype can be wrong. I have created a patch which somewhat extends the tracing information from XFS. Can you please run a kernel with that patch applied (will attach it in a moment) and run:

zypper rm sbl
zypper in sbl-3.5.0
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_leaf_replace/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_leaf_removename/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_rename/enable
cat /sys/kernel/debug/tracing/trace_pipe >/tmp/xfs_trace &
zypper in sbl

And then attach here /tmp/xfs_trace. Thanks!
Comment 12 Jan Kara 2015-08-20 17:42:13 UTC
Created attachment 644505 [details]
xfs: Extend tracing of rename operations
Comment 13 Jan Kara 2015-08-20 18:08:03 UTC
Created attachment 644506 [details]
[PATCH v2] xfs: Extend tracing of rename operations

Updated version of the patch with more info...
Comment 14 Giacomo Comes 2015-08-20 20:50:11 UTC
Created attachment 644524 [details]
requested xfs_trace
Comment 15 Jan Kara 2015-08-21 08:27:32 UTC
Thanks for the prompt reply! Looking at the libbrld handling in the trace, I can see:
             rpm-1441  [009] ....  1011.553683: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so;55d63b6d target name libbrld.so
             rpm-1441  [009] ....  1011.553810: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so.1;55d63b6d target name libbrld.so.1
             rpm-1441  [009] ....  1011.554287: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 1 src name libbrld.so.1.0;55d63b6d target name libbrld.so.1.0

In particular we are interested in the second entry, where libbrld.so.1;55d63b6d gets renamed to libbrld.so.1. The 'type 7' field tells us that at least xfs_rename() was sending down the correct ftype. Sadly I misidentified the type of your /usr/lib64 directory, so the other tracepoints didn't trigger.
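
For anyone reading such a trace by hand, here is a rough sketch of pulling the fields out of an xfs_rename line and naming the target type. The line format is simply copied from the trace quoted in this comment (it comes from the debugging patch, so it is not a stable interface). The numeric values are the on-disk XFS directory file type codes as I understand them from the kernel headers; treat the table as an assumption and check it against your kernel source.

```python
import re

# XFS directory entry file type codes (assumed from the kernel's XFS headers).
XFS_DIR3_FT = {
    0: "UNKNOWN", 1: "REG_FILE", 2: "DIR", 3: "CHRDEV", 4: "BLKDEV",
    5: "FIFO", 6: "SOCK", 7: "SYMLINK", 8: "WHT",
}

# Pattern matching the xfs_rename lines as they appear in this bug.
RENAME_RE = re.compile(
    r"xfs_rename: dev (?P<dev>\S+) "
    r"src dp ino (?P<src_dp>0x[0-9a-f]+) "
    r"target dp ino (?P<target_dp>0x[0-9a-f]+) "
    r"src type (?P<src_type>\d+) target type (?P<target_type>\d+) "
    r"src name (?P<src_name>\S+) target name (?P<target_name>\S+)"
)


def parse_rename(line):
    """Return a dict of fields for an xfs_rename trace line, or None."""
    match = RENAME_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["target_type_name"] = XFS_DIR3_FT.get(int(fields["target_type"]), "?")
    return fields


sample = (
    "rpm-1441  [009] ....  1011.553810: xfs_rename: dev 8:2 "
    "src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 "
    "src name libbrld.so.1;55d63b6d target name libbrld.so.1"
)
parsed = parse_rename(sample)
print(parsed["target_name"], parsed["target_type_name"])
```

With the table above, target type 7 reads as a symlink and target type 1 as a regular file, which is consistent with the three renames shown in the trace.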

Since we have narrowed down the problem somewhat more, I've added further tracing. Can you please do another debug round for me with the patch I'll attach here in a moment? Do:

zypper rm sbl
zypper in sbl-3.5.0
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_replace/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_replace_done/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_removename/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_addname/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_rename/enable
cat /sys/kernel/debug/tracing/trace_pipe >/tmp/xfs_trace &
zypper in sbl

Thanks!
Comment 16 Jan Kara 2015-08-21 08:28:21 UTC
Created attachment 644557 [details]
[PATCH v3] xfs: Extend tracing of rename operations
Comment 17 Giacomo Comes 2015-08-21 13:19:40 UTC
Created attachment 644605 [details]
second requested xfs_trace
Comment 18 Jan Kara 2015-08-21 17:52:59 UTC
Thanks for the trace! So finally the info was detailed enough that I was able to pinpoint the problem, reproduce it and fix it. In the trace we can see:

             rpm-1476  [005] ....  1624.636105: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so.1;55d7247f target name libbrld.so.1
             rpm-1476  [005] ....  1624.636109: xfs_dir2_node_replace: dev 8:2 ino 0x1800110 name libbrld.so.1 namelen 12 hashval 0x8c93fe62 inumber 0x18967cd op_flags  ftype 7 dir3_op 1
             rpm-1476  [005] ....  1624.636113: xfs_dir2_node_replace_done: dev 8:2 ino 0x1800110 oldino 0x187cf79 newino 0x18967cd oldftype 1 newftype 1 dir3_op 1

We see that ftype was correct when entering xfs_dir2_node_replace() but the stored value was still wrong as we can see in the last trace entry. After some code inspection I have found that xfs_da3_node_lookup_int() called from xfs_dir2_node_replace() clobbers args->filetype. I'll attach here a patch which fixes the problem for me.
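
To make the failure mode easier to picture, here is a toy model of it. This is not the kernel code and not the attached patch; the classes and function names are invented for illustration. The only things taken from this bug are the idea that the lookup overwrites the caller's filetype and the values seen in the trace. That the inode number survives because it is saved before the lookup is an assumption chosen to match the trace (new inode number stored, old ftype kept).

```python
from dataclasses import dataclass

FT_REG_FILE = 1  # file type codes as seen in the trace above
FT_SYMLINK = 7


@dataclass
class Args:
    """Toy stand-in for the operation arguments passed down the call chain."""
    name: str
    inumber: int
    filetype: int


@dataclass
class Entry:
    """Toy stand-in for a stored directory entry."""
    inumber: int
    ftype: int


def lookup(directory, args):
    """Find the entry and, as a side effect, overwrite args with what was found."""
    entry = directory[args.name]
    args.inumber = entry.inumber
    args.filetype = entry.ftype  # this side effect is the clobbering
    return entry


def replace_buggy(directory, args):
    new_inumber = args.inumber       # inode number saved before the lookup
    entry = lookup(directory, args)
    entry.inumber = new_inumber
    entry.ftype = args.filetype      # reads the clobbered (old) file type


def replace_fixed(directory, args):
    new_inumber, new_ftype = args.inumber, args.filetype  # save both first
    entry = lookup(directory, args)
    entry.inumber = new_inumber
    entry.ftype = new_ftype


def run(replace):
    directory = {"libbrld.so.1": Entry(inumber=0x187CF79, ftype=FT_REG_FILE)}
    replace(directory, Args("libbrld.so.1", inumber=0x18967CD, filetype=FT_SYMLINK))
    return directory["libbrld.so.1"]


print("buggy:", run(replace_buggy))
print("fixed:", run(replace_fixed))
```

In the buggy variant the entry ends up with the new inode number but the old regular-file type, as in the xfs_dir2_node_replace_done line above; in the fixed variant both fields are updated.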
Comment 19 Jan Kara 2015-08-21 17:54:45 UTC
Created attachment 644636 [details]
xfs: Fix file type directory corruption for btree directories
Comment 20 Giacomo Comes 2015-08-21 19:16:42 UTC
Indeed the patch fixes the issue. I will build a custom rpm with this patch and use it until a newer kernel with the fix is released for 13.2. Just one question: should I also use the four xfs patches from comment 6, or are they not needed for 13.2?
Comment 21 Jan Kara 2015-08-24 09:22:10 UTC
Well, those patches fix a real problem that can result in data corruption (but only after a system crash when the log is being replayed) so using them doesn't do any harm. But they aren't needed to fix this particular bug.
Comment 22 Jan Kara 2015-08-31 06:41:26 UTC
OK, the patch has been merged into upstream repository as well. I have pushed it to openSUSE-13.2 and SLE12 kernel branches. Closing the bug.
Comment 25 Swamp Workflow Management 2015-10-13 09:20:55 UTC
SUSE-SU-2015:1727-1: An update that solves 7 vulnerabilities and has 44 fixes is now available.

Category: security (important)
Bug References: 856382,886785,898159,907973,908950,912183,914818,916543,920016,922071,924722,929092,929871,930813,932285,932350,934430,934942,934962,936556,936773,937609,937612,937613,937616,938550,938706,938891,938892,938893,939145,939266,939716,939834,939994,940398,940545,940679,940776,940912,940925,940965,941098,941305,941908,941951,942160,942204,942307,942367,948536
CVE References: CVE-2015-5156,CVE-2015-5157,CVE-2015-5283,CVE-2015-5697,CVE-2015-6252,CVE-2015-6937,CVE-2015-7613
Sources used:
SUSE Linux Enterprise Workstation Extension 12 (src):    kernel-default-3.12.48-52.27.1
SUSE Linux Enterprise Software Development Kit 12 (src):    kernel-docs-3.12.48-52.27.2, kernel-obs-build-3.12.48-52.27.1
SUSE Linux Enterprise Server 12 (src):    kernel-default-3.12.48-52.27.1, kernel-source-3.12.48-52.27.1, kernel-syms-3.12.48-52.27.1, kernel-xen-3.12.48-52.27.2
SUSE Linux Enterprise Module for Public Cloud 12 (src):    kernel-ec2-3.12.48-52.27.1
SUSE Linux Enterprise Live Patching 12 (src):    kgraft-patch-SLE12_Update_8-1-2.6
SUSE Linux Enterprise Desktop 12 (src):    kernel-default-3.12.48-52.27.1, kernel-source-3.12.48-52.27.1, kernel-syms-3.12.48-52.27.1, kernel-xen-3.12.48-52.27.2
Comment 26 Swamp Workflow Management 2015-10-29 16:54:20 UTC
openSUSE-SU-2015:1842-1: An update that solves 7 vulnerabilities and has 7 fixes is now available.

Category: security (important)
Bug References: 919154,926238,937969,938645,939834,940338,941104,941305,941867,942178,944296,947155,951195,951440
CVE References: CVE-2015-0272,CVE-2015-1333,CVE-2015-2925,CVE-2015-3290,CVE-2015-5283,CVE-2015-5707,CVE-2015-7872
Sources used:
openSUSE 13.2 (src):    bbswitch-0.8-3.13.2, cloop-2.639-14.13.2, crash-7.0.8-13.2, hdjmod-1.28-18.14.2, ipset-6.23-13.2, kernel-debug-3.16.7-29.1, kernel-default-3.16.7-29.1, kernel-desktop-3.16.7-29.1, kernel-docs-3.16.7-29.3, kernel-ec2-3.16.7-29.1, kernel-obs-build-3.16.7-29.2, kernel-obs-qa-3.16.7-29.1, kernel-obs-qa-xen-3.16.7-29.1, kernel-pae-3.16.7-29.1, kernel-source-3.16.7-29.1, kernel-syms-3.16.7-29.1, kernel-vanilla-3.16.7-29.1, kernel-xen-3.16.7-29.1, pcfclock-0.44-260.13.2, vhba-kmp-20140629-2.13.2, xen-4.4.2_06-27.2, xtables-addons-2.6-13.2