Bug 1182544

Summary: transactional-update fails when self-updating from old snapshot
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Fabian Vogt <fvogt>
Component: MicroOS
Assignee: Ignaz Forster <iforster>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
CC: alexander.wenzel, dimstar, forgotten_u0-bnvADNc, guillaume.gardet, rbrown
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
URL: https://openqa.opensuse.org/tests/1638401/modules/image_checks/steps/13
Whiteboard:
Found By: openQA
Services Priority:
Business Priority:
Blocker: Yes
Marketing QA Status: ---
IT Deployment: ---
Attachments: log of partial workaround for transactional-update error

Description Fabian Vogt 2021-02-22 08:43:15 UTC
This test boots into an old MicroOS image. It fails because combustion failed to run, which is caused by the self-update of transactional-update:

Feb 21 05:49:26.233585 localhost combustion[542]: New version found - updating...
Feb 21 05:49:26.374352 localhost combustion[703]: Loading repository data...
Feb 21 05:49:26.584979 localhost combustion[703]: Reading installed packages...
Feb 21 05:49:26.884304 localhost combustion[703]: Retrieving package transactional-update-3.1.4-1.1.x86_64 (1/1),  54.6 KiB ( 73.1 KiB unpacked)
Feb 21 05:49:27.121761 localhost combustion[703]: .........done]
Feb 21 05:49:27.306095 localhost combustion[742]: Loading repository data...
Feb 21 05:49:27.517683 localhost combustion[742]: Reading installed packages...
Feb 21 05:49:27.771859 localhost combustion[742]: Retrieving package libtukit0-3.1.4-1.1.x86_64 (1/2), 128.4 KiB (283.7 KiB unpacked)
Feb 21 05:49:28.000220 localhost combustion[742]: ..........done]
Feb 21 05:49:28.002964 localhost combustion[742]: Retrieving package tukit-3.1.4-1.1.x86_64 (2/2),  55.9 KiB ( 85.2 KiB unpacked)
Feb 21 05:49:28.148466 localhost combustion[742]: ......done]
Feb 21 05:49:28.209932 localhost combustion[741]: /tmp/transactional-update.a1AVPfXW9o/usr/sbin/tukit: /lib64/libselinux.so.1: no version information available (required by /tmp/transactional-update.a1AVPfXW9o/usr/lib64/libtukit.so.0)
Feb 21 05:49:28.215355 localhost combustion[542]: transactional-update 3.1.4 started
Feb 21 05:49:28.217252 localhost combustion[542]: Options: shell
Feb 21 05:49:28.271971 localhost combustion[542]: Separate /var detected.
Feb 21 05:49:28.320210 localhost combustion[741]: tukit: /lib64/libselinux.so.1: no version information available (required by /tmp/transactional-update.a1AVPfXW9o/usr/lib64/libtukit.so.0)
Feb 21 05:49:28.328102 localhost combustion[741]: Failure (dbus fatal exception).
...
Feb 21 05:49:29.012290 localhost combustion[741]: ERROR: filesystem error: cannot copy: File exists [/run/netconfig] [/.snapshots/2/snapshot/run/netconfig]
Feb 21 05:49:29.013155 localhost combustion[542]: Opening chroot in snapshot , continue with 'exit'
Feb 21 05:49:29.013981 localhost combustion[877]: tukit: /lib64/libselinux.so.1: no version information available (required by /tmp/transactional-update.a1AVPfXW9o/usr/lib64/libtukit.so.0)
Feb 21 05:49:29.015824 localhost combustion[877]: 2021-02-21 05:49:29 tukit 3.1.4 started
...

## Observation

openQA test in scenario microos-Tumbleweed-MicroOS-Image-x86_64-microos-old2microosnext@64bit fails in
[image_checks](https://openqa.opensuse.org/tests/1638401/modules/image_checks/steps/13)

## Test suite description
Boot from existing, static MicroOS image and transactional-update dup to snapshot under test


## Reproducible

Fails since (at least) Build [20210219](https://openqa.opensuse.org/tests/1636750)


## Expected result

Last good: [20210218](https://openqa.opensuse.org/tests/1635223) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=microos&flavor=MicroOS-Image&machine=64bit&test=microos-old2microosnext&version=Tumbleweed)
Comment 1 Alexander Wenzel 2021-02-25 09:08:28 UTC
Created attachment 846500
log of partial workaround for transactional-update error

To reproducibly work around one part of the bug:

ERROR: filesystem error: cannot copy: File exists [/run/netconfig] [/.snapshots/2/snapshot/run/netconfig]

It is somehow sufficient to let netconfig recreate yp.conf like this:

echo -n > /etc/yp.conf && netconfig update -f

NETCONFIG_NIS_POLICY is still set to its default "auto".
Afterwards, transactional-update works fine again. I have not dug any further so far, but I thought this piece might help anyway. See also the attached log.

NB: We're building a variant of MicroOS using kiwi/obs and installing the OS using PXE, which shows the current behaviour/bug. Using the ISO or skipping the ignition/combustion part will lead to a system without update capability... And so we're at this point here and the wheel has come full circle ;)
Comment 2 Ignaz Forster 2021-02-25 09:27:36 UTC
The problem occurs when /run/netconfig is part of the root file system, which is unexpected, as /run is a tmpfs. In any case, libtukit will now simply overwrite any existing files, as of https://github.com/openSUSE/transactional-update/commit/d0b10c0a32ab33832c53a8c9ac90bac94a0a009d
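
For illustration, a minimal sketch of the overwrite-on-copy behaviour described in this comment, assuming C++17 `std::filesystem`. It is not the code from the linked commit; `copy_overwriting`, `write_file` and `read_first_line` are hypothetical names used only here.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper, not libtukit code: copy the tree at `from` into `to`,
// replacing files that already exist in the destination instead of failing
// with "File exists".
void copy_overwriting(const fs::path& from, const fs::path& to) {
    assert(fs::is_directory(from));  // this sketch only handles directory sources
    fs::copy(from, to,
             fs::copy_options::recursive | fs::copy_options::overwrite_existing);
}

// Demo-only helper: create the parent directories and write a small text file.
void write_file(const fs::path& path, const std::string& text) {
    fs::create_directories(path.parent_path());
    std::ofstream out(path, std::ios::trunc);
    out << text;
}

// Demo-only helper: return the first line of a text file.
std::string read_first_line(const fs::path& path) {
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);
    return line;
}
```

If the source directory contains `resolv.conf` and the destination already has a file of that name, the destination file should end up with the source's content. Without `overwrite_existing`, the C++ standard specifies that the copy reports an error for the existing file, which matches the `File exists` message in the log above.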
Comment 3 Fabian Vogt 2021-02-25 17:28:24 UTC
*** Bug 1182772 has been marked as a duplicate of this bug. ***
Comment 4 Fabian Vogt 2021-02-26 16:04:27 UTC
(In reply to Ignaz Forster from comment #2)
> The problem occurs when /run/netconfig is part of the root file system

This is actually not the full story! It's true that this particular error happens because there's already a different /etc/resolv.conf inside the "virgin" snapshot. However, I found that even the latest images have content inside /run (reported as https://github.com/OSInside/kiwi/issues/1744), and tukit actually runs fine on old images outside of combustion.

The question is: why? It clearly shouldn't work: when tukit copies /run/netconfig/, it doesn't overwrite existing files, but in those cases it doesn't complain and abort either. Debugging showed that there's a bug in libstdc++'s implementation of std::filesystem::copy, which I reported as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99290.

Combustion avoids this bug with a 100% chance of "success" because when networking is enabled (the default on old images), it creates /run/netconfig/ with just resolv.conf inside, so there's no other item to copy and the error is returned properly.
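
For illustration, one defensive pattern against a directory copy that silently drops an error is to iterate over the source directory yourself and check the `std::error_code` of every entry. The sketch below assumes C++17 and is not taken from tukit; `copy_entries_checked` and `write_file` are hypothetical names. It only checks top-level entries, so errors inside nested directories still depend on the library's own recursion.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper, not tukit code: copy each entry of `from` into `to`
// without overwriting, and return the destination paths that failed to copy.
std::vector<fs::path> copy_entries_checked(const fs::path& from, const fs::path& to) {
    assert(fs::is_directory(from));  // this sketch only handles directory sources
    std::vector<fs::path> failed;
    fs::create_directories(to);
    for (const fs::directory_entry& entry : fs::directory_iterator(from)) {
        const fs::path target = to / entry.path().filename();
        std::error_code ec;
        // No overwrite option is set, so an existing regular file is an error.
        fs::copy(entry.path(), target, fs::copy_options::recursive, ec);
        if (ec) {
            failed.push_back(target);  // remember every failure, not just the last
        }
    }
    return failed;
}

// Demo-only helper: create the parent directories and write a small text file.
void write_file(const fs::path& path, const std::string& text) {
    fs::create_directories(path.parent_path());
    std::ofstream out(path, std::ios::trunc);
    out << text;
}
```

With a source holding `resolv.conf` and `yp.conf` and a destination that already contains `resolv.conf`, the returned list should name only the destination `resolv.conf`, while `yp.conf` is copied normally.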
Comment 6 OBSbugzilla Bot 2021-03-03 00:40:15 UTC
This is an autogenerated message for OBS integration:
This bug (1182544) was mentioned in
https://build.opensuse.org/request/show/876314 Factory / transactional-update
Comment 8 Oliver Kurz 2021-03-18 06:05:36 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: microos-old2microosnext@
https://openqa.opensuse.org/tests/1668980

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released"
3. The label in the openQA scenario is removed
Comment 9 Richard Brown 2021-04-29 13:31:06 UTC
It's been fixed
Comment 11 Swamp Workflow Management 2021-06-28 19:40:00 UTC
SUSE-RU-2021:2192-1: An update that has 15 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1173842,1177149,1182525,1182544,1183442,1183521,1183539,1183856,1184529,1185224,1185226,1185625,1185766,1186775,1186842
CVE References: 
JIRA References: 
Sources used:
SUSE MicroOS 5.0 (src):    transactional-update-3.4.0-3.6.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.