|
Bugzilla – Full Text Bug Listing |
| Summary: | Installation failure "cpio: rename" PowerPC multipath openQA test | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE Tumbleweed | Reporter: | Michel Normand <normand> |
| Component: | Other | Assignee: | E-mail List <bnc-team-screening> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | coolo, forgotten_ny8t7SHjD_, igonzalezsosa, jreidinger, msuchanek, normand |
| Version: | Current | ||
| Target Milestone: | --- | ||
| Hardware: | PowerPC-64 | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | --- | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: |
y2log.txt.gz
openqa_save_y2logs_failure.png |
||
|
Description
Michel Normand
2016-11-10 11:12:56 UTC
If doing a new trial on another openQA server: * the install failing package is not the same: (one trial failed on perl-DateTime-Local install) (another one on manufacturer-PPDs install) * There is no such failure if openQA job is started without MULTIPATH=1 === failure: /var/lib/openqa/script/client jobs post ISO=openSUSE-Tumbleweed-DVD-ppc64le-Snapshot20161107-Media.iso DESKTOP=minimalx DISTRI=opensuse DVD=1 VERSION=Tumbleweed FLAVOR=DVD BUILD=20161107 INSTALLONLY=1 MACHINE=ppc64le-multipath MULTIPATH=1 QEMU=ppc64 QEMUCPU=host VGA=std WORKER_CLASS=qemu_ppc64le NOIMAGES=1 ARCH=ppc64le OFW=1 QEMUCPUS=8 QEMUTHREADS=8 TEST=install_minimalx Test=install_minimalx === no failure: /var/lib/openqa/script/client jobs post ISO=openSUSE-Tumbleweed-DVD-ppc64le-Snapshot20161107-Media.iso DESKTOP=minimalx DISTRI=opensuse DVD=1 VERSION=Tumbleweed FLAVOR=DVD BUILD=20161107 INSTALLONLY=1 MACHINE=ppc64le-multipath QEMU=ppc64 QEMUCPU=host VGA=std WORKER_CLASS=qemu_ppc64le NOIMAGES=1 ARCH=ppc64le OFW=1 QEMUCPUS=8 QEMUTHREADS=8 TEST=install_minimalx Test=install_minimalx === Note: Since a while the openQA server had an incorrect setup of ppc64le-multipath machine definition (1). That explain why this bug failure was not reported before for ppc64le. This should appear on futur new test run on this server. (1) === (10:56:06) michel_mno: There is an error in openQA database about ppc64le multipath machine http://paste.opensuse.org/view/raw/16432658 MULTIPATH1 is set, but should be MULTIPATH (10:56:06) michel_mno: the templates is correct in git, the database has to be modified (I do not have admin access) (10:58:03) okurz: michel_mno: I updated the machine definition for ppc64le-multipath === https://openqa.opensuse.org/admin/machines ppc64le-multipath qemu MULTIPATH1=1 <= this has to be replaced by MULTIPATH=1 NOVIDEO=1 OFW=1 QEMUCPU=host QEMUCPUS=8 QEMURAM=4096 QEMUTHREADS=8 WORKER_CLASS=qemu_ppc64le === Hmm, it is package failure during package unpackaing. Does multipath scenario have less disk space? My suspicious is some relation of btrfs disk usage and maybe multipath? It seems that we just need to rebuild a whole tree against recent rpm.. It was kbd-legacy, then kernel-firmware, now netpbm.. All are cured by rebuilding those packages. (I'm not certain about kbd-legacy though as Andreas did something to it.) Anyway rebuild is in progress.. (In reply to Dinar Valeev from comment #4) > It seems that we just need to rebuild a whole tree against recent rpm.. > Hi Dinar, I would be interested to understand what informations make you point to the need for such rebuild; could you give me some details ? same error with Snapshot 20161115 after rebuild as reported in https://openqa.opensuse.org/tests/305788 (In reply to Michel Normand from comment #6) > same error with Snapshot 20161115 after rebuild as reported in > https://openqa.opensuse.org/tests/305788 in this retry same error of rpm command, cpio rename. === 2016-11-16 01:50:42 <1> install(3311) [zypp++] ExternalProgram.cc(start_program):249 Executing 'rpm' '--root' '/mnt' '--dbpath' '/var/lib/rpm' '-U' '--percent' '--noglob' '--force' '--nodeps' '--' '/mnt/var/cache/zypp/packages/openSUSE-20161115-0/suse/noarch/kbd-legacy-2.0.3-1.2.noarch.rpm' 2016-11-16 01:50:42 <1> install(3311) [zypp++] ExternalProgram.cc(start_program):412 pid 4862 launched 2016-11-16 01:50:42 <1> install(3311) [zypp++] ExternalProgram.cc(checkStatus):506 Pid 4862 exited with status 1 2016-11-16 01:50:42 <5> install(3311) [zypp] Exception.cc(log):137 RpmDb.cc(doInstallPackage):2043 THROW: Subprocess failed. Error: RPM failed: error: unpacking of archive failed on file /usr/share/kbd/keymaps/legacy/include/compose.latin4: cpio: rename === Created attachment 702238 [details] openqa_save_y2logs_failure.png * in previous failing conditions default disk size is used (qcow2 10GB) * on my local openqa server (because I am not authorized to issue openQA client cmd on main instance) * if setup HDDSIZEGB=40 then still rpm cpio rename error. * if setup HDDSIZEGB=80 then no more rpm cpio rename error. so assumption of comment #3 by Josef Reidinger is valid, and yast2 ultimately stuck with "Saving bootloader configuration" But openQA unable to capture y2log because save_y2log failed as reported by this openqa_save_y2logs_failure.png this test defaults to HDDSIZEGB=10 To exclude btrfs, I'd run the same test but with ext3/ext4 fs. There is definitely something wrong with disk in this setup. (In reply to Dinar Valeev from comment #9) > this test defaults to HDDSIZEGB=10 > > To exclude btrfs, I'd run the same test but with ext3/ext4 fs. There is > definitely something wrong with disk in this setup. the problem may be specific to multipath disk setup but only for TW, because same test is not failing for Leap 42.2: https://openqa.opensuse.org/tests/308190# <= Leap 42.2 passed https://openqa.opensuse.org/tests/308165 <= TW failed (In reply to Michel Normand from comment #8) > Created attachment 702238 [details] > openqa_save_y2logs_failure.png > > * in previous failing conditions default disk size is used (qcow2 10GB) > * on my local openqa server > (because I am not authorized to issue openQA client cmd on main instance) > * if setup HDDSIZEGB=40 then still rpm cpio rename error. > * if setup HDDSIZEGB=80 then no more rpm cpio rename error. > so assumption of comment #3 by Josef Reidinger is valid, and > yast2 ultimately stuck with "Saving bootloader configuration" > But openQA unable to capture y2log because save_y2log failed > as reported by this openqa_save_y2logs_failure.png The hung condition at "Saving bootloader configuration" is tracked by new bug https://bugzilla.suse.com/show_bug.cgi?id=1011529 Hi Dinar, What's expected from the YaST team? It looks a problem with Btrfs and/or multipath set-up. How can we help? next step is for me to add debug traces to understand the problem around the cpio rename error (without qualification string as per log extract of comment #7) and why is it specific to multipath configuration (use of dev mapper) investigation blocked by new bug https://bugzilla.suse.com/show_bug.cgi?id=1012825 I need suggestion to continue investigation as per following status.
Current_Status:
* The failure is specific to disk multipath test and btrfs for TW PowerPC
the reported error in y2log is "cpio: rename" error
* No failure for Leap 42.2
* Unable to recreate the failure without openQA env.
* Not same failure in ext4 FS in place of btrfs.
* The error reported by Yast is any package installation failure
and the y2log reports a "cpio: rename" error with no error number.
* the "cpio: rename" string is related to error from fsmRename fct in lib/fsm.c:
Reported by rpm via the zypp traces from libzypp
(for ExternalProgram.cc, Exception.cc, RpmDb.cc)
the last error is reported by rpm psm.c rpmpsmUnpack fct as error from rpmPackageFilesInstall
the related string from emsg (output of rpmfileStrerror)
string "cpio: rename" is build in this rpmfileStrerror by decoding of RPMERR_RENAME_FAILED RC
Summary of related source lines:
===
./rpm-4.12.0.1/lib/psm.c:671: fsmrc = rpmPackageFilesInstall(psm->ts, psm->te, psm->files,
===
fsmrc = rpmPackageFilesInstall(psm->ts, psm->te, psm->files, psm, &failedFile);
emsg = rpmfileStrerror(fsmrc);
rpmlog(RPMLOG_ERR,
_("unpacking of archive failed%s%s: %s\n"),
(failedFile != NULL ? _(" on file ") : ""),
(failedFile != NULL ? failedFile : ""),
emsg);
===
./rpm-4.12.0.1/lib/rpmfi.c:2111:char * rpmfileStrerror(int rc)
./rpm-4.12.0.1/lib/fsm.c:535: static int fsmRename(const char *opath, const char *path)
./rpm-4.12.0.1/lib/rpmarchive.h RPMERR_RENAME_FAILED = -32774,
===
In my openQA instance there is no such failure if eatmydata package is removed ! This is used in os-autoinst qemu backend: === backend/qemu.pm:719: unshift(@params, "/usr/bin/eatmydata") if (-e "/usr/bin/eatmydata"); === Do we really need this eatmydata call ? that seems to really eat data. (In reply to Michel Normand from comment #16) > In my openQA instance there is no such failure if eatmydata package is > removed ! > This is used in os-autoinst qemu backend: > === > backend/qemu.pm:719: unshift(@params, "/usr/bin/eatmydata") if (-e > "/usr/bin/eatmydata"); > === > > Do we really need this eatmydata call ? that seems to really eat data. wait for openQA update with PR https://github.com/os-autoinst/os-autoinst/pull/660 (In reply to Michel Normand from comment #17) > [CUT] ... > > wait for openQA update with PR > https://github.com/os-autoinst/os-autoinst/pull/660 Hello Michal, do you know who is in charge to update the PowerPC workers used by openQA ? I would like to know when those workers will be updated with the correction of above PR. The yesterday's run of the related test (1) still trace that eatmydata is still used by the worker. (1) https://openqa.opensuse.org/tests/321033/file/autoinst-log.txt === +++ worker notes +++ start time: 2016-12-14 13:16:02 running on power8:16 ... 13:16:09.9790 153031 starting: /usr/bin/eatmydata /usr/bin/qemu-system-ppc64 ... === No idea. Will reassign to OpenQA The SUSE tool channel is used for our internal tracking. openQA is developed in the open as a community project: https://progress.opensuse.org/projects/openqav3 Please raise the issue in the community project: https://progress.opensuse.org/projects/openqav3/issues I close the bug and move it back to openSUSE Tumbleweed - QA Tools component, so that you can reference the bug to avoid loosing information. as suggested I am creating a new issue https://progress.opensuse.org/issues/15548 (In reply to Michel Normand from comment #21) > as suggested I am creating a new issue > https://progress.opensuse.org/issues/15548 above issue created in wrong project, replaced by https://progress.opensuse.org/issues/15770 in https://progress.opensuse.org/projects/openqav3/issues) after power8 worker update as per issue https://progress.opensuse.org/issues/15770 set this bug as duplicate of new boo#1018262 to continue investigation from status summary of comment #15 *** This bug has been marked as a duplicate of bug 1018262 *** |