Bugzilla – Full Text Bug Listing

| Summary: | ldconfig fails with message "Aborted" after upgrade of sbl to version 3.5.0.20130317.git7a75bc29-24.4.1 | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Distribution | Reporter: | Giacomo Comes <comes> |
| Component: | Other | Assignee: | Jan Kara <jack> |
| Status: | RESOLVED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P2 - High | CC: | chcao, comes, jack, mpluskal |
| Version: | 13.2 | ||
| Target Milestone: | --- | ||
| Hardware: | PC | ||
| OS: | openSUSE 13.2 | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |

Attachments:
xfs patches used for testing
metadump of corrupted filesystem
new metadump of corrupted filesystem (with xfs patches)
xfs: Extend tracing of rename operations
[PATCH v2] xfs: Extend tracing of rename operations
requested xfs_trace
[PATCH v3] xfs: Extend tracing of rename operations
second requested xfs_trace
xfs: Fix file type directory corruption for btree directories

Description
Giacomo Comes
2015-08-11 14:19:34 UTC
This issue looks remotely similar to boo#908780, are you by any chance using xfs?
Yes, / is xfs. Looking at boo#908780 comment 23 there is a mention of a patch in boo#910336 that may fix the problem, but I cannot find it. If you can point me to such patch I can apply it to the current kernel and test it.
See https://apibugzilla.novell.com/show_bug.cgi?id=908780#c27
*** This bug has been marked as a duplicate of bug 908780 ***
So the patches mentioned in bug 908780 are commits 0d612fb570b7 "xfs: ensure buffer types are set correctly".. 3443a3bca545 "xfs: set superblock buffer type correctly" upstream (in total 4 patches). They got merged in kernel 4.0 and backported to some -stable trees but we don't have them in 13.2 kernel. I'll push these patches to 13.2 kernel.
OK, I've pushed the fixes to openSUSE-13.2 branch.
Created attachment 643793 [details]
xfs patches used for testing
I have applied the set of four xfs patches included in the attachment to the current 13.2 kernel 3.16.7-21.1. I built a new kernel, installed it, and rebooted.
Then again:
zypper rm sbl
zypper in sbl-3.5.0
zypper in sbl
At the end ldconfig is still broken.
Can somebody else test the patches on 13.2? Either I did something wrong, or the patches are not enough to fix the problem for the kernel 3.16.
Side note: Please don't set text/plain type for tar.gz archives. Bugzilla gets confused by that ;) I'll have a look at the problem on Monday.
Indeed it seems these patches aren't enough, but given that you are able to reproduce the issue, we should be able to drill down to the cause. Can you please boot from a rescue CD and run:
xfs_metadump -o <root-device> - | gzip -c > image_file.gz
where the image file is on some other partition / USB stick? Then please make the file available for download. The file will contain metadata from your filesystem, which I will use for inspection if I fail to reproduce the issue on a test machine. Thanks!
I will install the newly released kernel and then create the metadump.
As a side note, the packages virtualbox-{host,guest}-kmp-<flavor> were not built for the latest kernel, leading to the installation of kernel<flavor>-base in my case. Can you please make them build?
Created attachment 643916 [details]
metadump of corrupted filesystem
Here is the metadump output of a system with the latest kernel 3.16.7-24.1
after the installation of sbl and broken ldconfig
Created attachment 643953 [details]
new metadump of corrupted filesystem (with xfs patches)
After all, the latest kernel 3.16.7-24.1 does not contain the xfs patches. I have created a custom kernel which includes these patches.
Here is a new metadump output of a system with kernel 3.16.7-24.1 + xfs patches
after the installation of sbl and broken ldconfig
OK, so I've spent some time on this. I cannot reproduce the problem on my test machine, but that doesn't really surprise me that much. Looking into what zypper exactly does, I can see that it first creates a symlink under a different name and then renames it to the new name, replacing the original regular file (the entry with the wrong filetype is for /usr/lib64/libbrld.so.1). However, on my xfs filesystem the filetype gets properly updated, while in your case it does not. I was looking into the metadump, and when dumping the log I can see that the buffer with libbrld.so.1 does get updated, but it stores the new inode number together with the original file type. So this rules out issues with journalling or the like. In the code, the filetype gets updated right next to the inode number, which leaves me at a loss as to how the filetype can be wrong. I have created a patch which somewhat extends the tracing information from XFS. Can you please run a kernel with that patch applied (will attach it in a moment) and run:
zypper rm sbl
zypper in sbl-3.5.0
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_leaf_replace/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_leaf_removename/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_rename/enable
cat /sys/kernel/debug/tracing/trace_pipe >/tmp/xfs_trace &
zypper in sbl
And then attach /tmp/xfs_trace here. Thanks!
Created attachment 644505 [details]
xfs: Extend tracing of rename operations
Created attachment 644506 [details]
[PATCH v2] xfs: Extend tracing of rename operations
Updated version of the patch with more info...
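The rename sequence described above (a symlink created under a temporary name, then renamed over an existing regular file) can be sketched in Python, together with a check that the directory entry type agrees with the inode type afterwards. This is only a minimal sketch, not the exact procedure rpm uses: the helper names `replace_with_symlink` and `dirent_matches_inode`, the `;tmp` suffix and the `libdemo` file names are hypothetical. It relies on `os.scandir()`, which on Linux takes the entry type from the directory entry (`d_type`) when the filesystem supplies one, while `os.lstat()` reads the inode itself.

```python
import os
import stat
import tempfile


def replace_with_symlink(directory, name, target):
    """Create a symlink under a temporary name, then rename it over `name`."""
    tmp_path = os.path.join(directory, name + ";tmp")  # hypothetical suffix
    final_path = os.path.join(directory, name)
    os.symlink(target, tmp_path)
    os.rename(tmp_path, final_path)  # replaces the existing directory entry
    return final_path


def dirent_matches_inode(directory, name):
    """Return True if the directory entry and the inode agree on 'symlink'."""
    for entry in os.scandir(directory):
        if entry.name == name:
            inode_is_link = stat.S_ISLNK(os.lstat(entry.path).st_mode)
            # entry.is_symlink() uses the d_type from the directory entry
            # when available, so a stale file type shows up as a mismatch.
            return entry.is_symlink() == inode_is_link
    raise FileNotFoundError(name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Start with regular files, standing in for the old package contents.
        for fname in ("libdemo.so.1", "libdemo.so.1.0"):
            with open(os.path.join(d, fname), "w") as fh:
                fh.write("placeholder\n")
        replace_with_symlink(d, "libdemo.so.1", "libdemo.so.1.0")
        print(dirent_matches_inode(d, "libdemo.so.1"))
```

On a healthy filesystem the check should report agreement. A small temporary directory like this one is unlikely to reproduce the problem tracked here, since the traces later in this report point at the node (btree) directory format used by large directories such as /usr/lib64.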
Created attachment 644524 [details]
requested xfs_trace
Thanks for the prompt reply! Looking at the libbrld handling in the trace, I can see:
rpm-1441 [009] .... 1011.553683: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so;55d63b6d target name libbrld.so
rpm-1441 [009] .... 1011.553810: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so.1;55d63b6d target name libbrld.so.1
rpm-1441 [009] .... 1011.554287: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 1 src name libbrld.so.1.0;55d63b6d target name libbrld.so.1.0
In particular we are interested in the second entry where libbrld.so.1;55d63b6d gets renamed to libbrld.so.1. The 'type 7' field tells us that at least xfs_rename() was sending down correct ftype. Sadly I wrongly identified type of your /usr/lib64 directory so the other tracepoints didn't trigger.
Since we have narrowed down the problem somewhat more, I've added further tracing. Can you please do another debug round for me with the patch I'll attach here in a moment? Do:
zypper rm sbl
zypper in sbl-3.5.0
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_replace/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_replace_done/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_removename/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_dir2_node_addname/enable
echo 1 >/sys/kernel/debug/tracing/events/xfs/xfs_rename/enable
cat /sys/kernel/debug/tracing/trace_pipe >/tmp/xfs_trace &
zypper in sbl
Thanks!
Created attachment 644557 [details]
[PATCH v3] xfs: Extend tracing of rename operations
Created attachment 644605 [details]
second requested xfs_trace
Thanks for the trace! So finally the info was detailed enough that I was able to pinpoint the problem, reproduce it and fix it. In the trace we can see:
rpm-1476 [005] .... 1624.636105: xfs_rename: dev 8:2 src dp ino 0x1800110 target dp ino 0x1800110 src type 0 target type 7 src name libbrld.so.1;55d7247f target name libbrld.so.1
rpm-1476 [005] .... 1624.636109: xfs_dir2_node_replace: dev 8:2 ino 0x1800110 name libbrld.so.1 namelen 12 hashval 0x8c93fe62 inumber 0x18967cd op_flags ftype 7 dir3_op 1
rpm-1476 [005] .... 1624.636113: xfs_dir2_node_replace_done: dev 8:2 ino 0x1800110 oldino 0x187cf79 newino 0x18967cd oldftype 1 newftype 1 dir3_op 1
We see that ftype was correct when entering xfs_dir2_node_replace() but the stored value was still wrong as we can see in the last trace entry. After some code inspection I have found that xfs_da3_node_lookup_int() called from xfs_dir2_node_replace() clobbers args->filetype. I'll attach here a patch which fixes the problem for me.
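The comparison made by eye above (requested ftype versus stored newftype) can be automated for a longer trace file. The following is a rough sketch that assumes the line layout and field names shown in the trace excerpts in this report; the tracepoints `xfs_dir2_node_replace` and `xfs_dir2_node_replace_done` come from the debugging patch attached here, and the function names `field` and `find_ftype_mismatches` are hypothetical.

```python
import re

# "task-pid [cpu] flags timestamp: event: body", as in the excerpts above.
LINE_RE = re.compile(
    r"^\s*\S+\s+\[\d+\]\s+\S+\s+[\d.]+:\s+(?P<event>\w+):\s+(?P<body>.*)$"
)


def field(body, key):
    """Return the integer that follows `key` in a trace body, or None."""
    m = re.search(rf"\b{re.escape(key)} (0x[0-9a-f]+|\d+)\b", body)
    return int(m.group(1), 0) if m else None


def find_ftype_mismatches(lines):
    """Pair replace / replace_done events by new inode and compare ftypes."""
    requested = {}  # new inode number -> ftype passed into the replace
    mismatches = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        event, body = m.group("event"), m.group("body")
        if event == "xfs_dir2_node_replace":
            requested[field(body, "inumber")] = field(body, "ftype")
        elif event == "xfs_dir2_node_replace_done":
            newino = field(body, "newino")
            want = requested.get(newino)
            got = field(body, "newftype")
            if want is not None and got != want:
                mismatches.append((hex(newino), want, got))
    return mismatches


SAMPLE = [
    "rpm-1476 [005] .... 1624.636109: xfs_dir2_node_replace: dev 8:2 ino 0x1800110 "
    "name libbrld.so.1 namelen 12 hashval 0x8c93fe62 inumber 0x18967cd op_flags "
    "ftype 7 dir3_op 1",
    "rpm-1476 [005] .... 1624.636113: xfs_dir2_node_replace_done: dev 8:2 ino 0x1800110 "
    "oldino 0x187cf79 newino 0x18967cd oldftype 1 newftype 1 dir3_op 1",
]

if __name__ == "__main__":
    print(find_ftype_mismatches(SAMPLE))  # [('0x18967cd', 7, 1)]
```

Each reported tuple is (new inode, requested ftype, stored ftype); for the two lines quoted above it flags inode 0x18967cd as requested with ftype 7 but stored with ftype 1.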
Created attachment 644636 [details]
xfs: Fix file type directory corruption for btree directories
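The failure mode described above, a lookup that overwrites a field of the shared argument structure before the caller uses it, can be illustrated with a small stand-alone model. This is a hedged sketch and not the kernel code or the attached patch: the `Args` class, the dictionary-based directory and all function names are hypothetical, and it assumes the lookup overwrites both the inode number and the file type while the buggy caller saved only the inode number, which matches the symptom in the trace (new inode stored, old ftype kept).

```python
from dataclasses import dataclass

FT_REG_FILE = 1  # ftype values as they appear in the traces above
FT_SYMLINK = 7


@dataclass
class Args:
    """Hypothetical stand-in for the operation arguments."""
    name: str
    inumber: int
    filetype: int


def lookup(directory, args):
    """Find the entry; as a side effect, overwrite args with its old values."""
    entry = directory[args.name]
    args.inumber = entry["ino"]
    args.filetype = entry["ftype"]
    return entry


def replace_buggy(directory, args):
    inum = args.inumber               # new inode number is saved ...
    entry = lookup(directory, args)
    entry["ino"] = inum
    entry["ftype"] = args.filetype    # ... but filetype was already clobbered


def replace_fixed(directory, args):
    inum, ftype = args.inumber, args.filetype  # save both before the lookup
    entry = lookup(directory, args)
    entry["ino"] = inum
    entry["ftype"] = ftype


if __name__ == "__main__":
    for replace in (replace_buggy, replace_fixed):
        d = {"libbrld.so.1": {"ino": 0x187CF79, "ftype": FT_REG_FILE}}
        replace(d, Args("libbrld.so.1", 0x18967CD, FT_SYMLINK))
        print(replace.__name__, hex(d["libbrld.so.1"]["ino"]), d["libbrld.so.1"]["ftype"])
```

In this model the buggy variant leaves the entry pointing at the new inode while still marked as a regular file, and saving the requested values before the lookup avoids that.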
Indeed the patch fixes the issue. I will make a custom rpm with this patch and use it until a newer kernel with the fix is released for 13.2. Just one question: should I also use the four xfs patches from comment 6, or are they not needed for 13.2?
Well, those patches fix a real problem that can result in data corruption (but only after a system crash, when the log is being replayed), so using them doesn't do any harm. But they aren't needed to fix this particular bug.
OK, the patch has been merged into the upstream repository as well. I have pushed it to the openSUSE-13.2 and SLE12 kernel branches. Closing the bug.
SUSE-SU-2015:1727-1: An update that solves 7 vulnerabilities and has 44 fixes is now available.
Category: security (important)
Bug References: 856382,886785,898159,907973,908950,912183,914818,916543,920016,922071,924722,929092,929871,930813,932285,932350,934430,934942,934962,936556,936773,937609,937612,937613,937616,938550,938706,938891,938892,938893,939145,939266,939716,939834,939994,940398,940545,940679,940776,940912,940925,940965,941098,941305,941908,941951,942160,942204,942307,942367,948536
CVE References: CVE-2015-5156,CVE-2015-5157,CVE-2015-5283,CVE-2015-5697,CVE-2015-6252,CVE-2015-6937,CVE-2015-7613
Sources used:
SUSE Linux Enterprise Workstation Extension 12 (src): kernel-default-3.12.48-52.27.1
SUSE Linux Enterprise Software Development Kit 12 (src): kernel-docs-3.12.48-52.27.2, kernel-obs-build-3.12.48-52.27.1
SUSE Linux Enterprise Server 12 (src): kernel-default-3.12.48-52.27.1, kernel-source-3.12.48-52.27.1, kernel-syms-3.12.48-52.27.1, kernel-xen-3.12.48-52.27.2
SUSE Linux Enterprise Module for Public Cloud 12 (src): kernel-ec2-3.12.48-52.27.1
SUSE Linux Enterprise Live Patching 12 (src): kgraft-patch-SLE12_Update_8-1-2.6
SUSE Linux Enterprise Desktop 12 (src): kernel-default-3.12.48-52.27.1, kernel-source-3.12.48-52.27.1, kernel-syms-3.12.48-52.27.1, kernel-xen-3.12.48-52.27.2
openSUSE-SU-2015:1842-1: An update that solves 7 vulnerabilities and has 7 fixes is now available.
Category: security (important)
Bug References: 919154,926238,937969,938645,939834,940338,941104,941305,941867,942178,944296,947155,951195,951440
CVE References: CVE-2015-0272,CVE-2015-1333,CVE-2015-2925,CVE-2015-3290,CVE-2015-5283,CVE-2015-5707,CVE-2015-7872
Sources used:
openSUSE 13.2 (src): bbswitch-0.8-3.13.2, cloop-2.639-14.13.2, crash-7.0.8-13.2, hdjmod-1.28-18.14.2, ipset-6.23-13.2, kernel-debug-3.16.7-29.1, kernel-default-3.16.7-29.1, kernel-desktop-3.16.7-29.1, kernel-docs-3.16.7-29.3, kernel-ec2-3.16.7-29.1, kernel-obs-build-3.16.7-29.2, kernel-obs-qa-3.16.7-29.1, kernel-obs-qa-xen-3.16.7-29.1, kernel-pae-3.16.7-29.1, kernel-source-3.16.7-29.1, kernel-syms-3.16.7-29.1, kernel-vanilla-3.16.7-29.1, kernel-xen-3.16.7-29.1, pcfclock-0.44-260.13.2, vhba-kmp-20140629-2.13.2, xen-4.4.2_06-27.2, xtables-addons-2.6-13.2