Bug 1191260

Summary: kernel 5.14.8 post scriptlets failing
Product: [openSUSE] openSUSE Tumbleweed Reporter: Dominique Leuenberger <dimstar>
Component: KernelAssignee: Martin Wilck <martin.wilck>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: dimstar, fvogt, guillaume.gardet, jcheung, jslaby, kernel-bugs, msuchanek, ptesarik
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on:    
Bug Blocks: 1189841    

Description Dominique Leuenberger 2021-10-04 07:37:30 UTC
This seems to be a regression introduced with
  https://build.opensuse.org/request/show/921675

The post/posttrans scriptlets were modified, and we now see for example transactional dist upgrades failing with

warning: %post(kernel-default-5.14.8-1.1.x86_64) scriptlet failed, exit status 1
EFI variables are not supported on this system
Failed to delete /etc/uefi/certs/33CEA71B.crt

e.g. https://openqa.opensuse.org/tests/1951077#step/tdup/11
Comment 1 Fabian Vogt 2021-10-04 07:47:27 UTC
The previous %post always did exit 0:

# vim: set sts=4 sw=4 ts=8 noet:
if ! command -v mokutil >/dev/null; then
        exit 0
fi
# Only apply CA check on the kernel package certs (bsc#1173115)
if [ 0 = 0 ] && mokutil -h | grep -q "ca-check"; then
        MOK_ARG="--ca-check"
else
        MOK_ARG=""
fi
# XXX: Only call mokutil if UEFI and shim are used
for cert in 1AA60533; do
        cert="/etc/uefi/certs/${cert}.crt"
        if ! mokutil --import "$cert" --root-pw ${MOK_ARG}; then
                echo "Failed to import $cert"
        fi
done
exit 0

exit 0

There's also a failure inside OBS builds:

[  162s] (183/197) Installing: kernel-kvmsmall-5.14.8-1.1.x86_64 [..............done]
[  162s] Additional rpm output:
[  162s] Please run "/usr/bin/dracut -f /boot/initrd-5.14.8-1-kvmsmall 5.14.8-1-kvmsmall" as soon as your system is complete.
[  162s] You may need to setup and install the boot loader using the
[  162s] available bootloader for your platform (e.g. grub, lilo, zipl, ...).
[  162s] warning: %post(kernel-kvmsmall-5.14.8-1.1.x86_64) scriptlet failed, exit status 1

That doesn't show the cause for the non-zero exit directly though.
Comment 2 Martin Wilck 2021-10-04 09:10:20 UTC
This was caused by the changes for bug 1189841. Fix here:

> https://github.com/openSUSE/suse-module-tools/commit/605073fbb6dec1611f67ae7e1ced265546cce10c

Rebuilt package should appear shortly under home:wilck:suse-module-tools.

Can you retest with that?
Comment 3 Martin Wilck 2021-10-04 09:14:32 UTC
Michal, please review the above commit, lest we introduce a new regression.
Comment 4 Michal Suchanek 2021-10-04 11:43:36 UTC
As pointed out in 1189841 mokutil should be called only on EFI systems.

So the 'mokutil -v' check should be replaced with 'mokutil --sb-state' check.
Comment 5 Michal Suchanek 2021-10-04 12:20:21 UTC
Alternative test

https://github.com/hramrach/suse-module-tools/commit/f568bef4f70867cd72b684b4170c89362e6ed59e

It preserves the check that correct scriptlet is called even on non-EFI system.

How are secure boot certificates managed on s390?
Comment 6 Martin Wilck 2021-10-04 12:35:41 UTC
(In reply to Michal Suchanek from comment #5)
> Alternative test
> 
> https://github.com/hramrach/suse-module-tools/commit/
> f568bef4f70867cd72b684b4170c89362e6ed59e
> 
> It preserves the check that correct scriptlet is called even on non-EFI
> system.
> 

I don't understand. What's the benefit of doing the same test separately in each branch of the case statement, rather than once in the beginning of the script, as in my patch? What's your patch preserving that mine did not?
Comment 7 Michal Suchanek 2021-10-04 13:09:00 UTC
It's not done in each branch.

There are branches that do nothing and branch that complains about wrong scriptlet.

Exiting early because the test is needed on all branches now is prone to the same problem as we had with IFS in find-provides.ksyms. The test is completely separated from the code in question.
Comment 8 Martin Wilck 2021-10-04 13:46:38 UTC
(In reply to Michal Suchanek from comment #7)

> There are branches that do nothing and branch that complains about wrong
> scriptlet.

So what? What's the benefit of running the noop branch, or seeing that complaint on systems where the entire concept of certificates doesn't matter?

By definition, cert-script is a noop on systems that don't support UEFI. Therefore exiting early (and without a warning, which would just confuse users without good reason) on such a system is the right thing to do. As a side effect, it makes the code more compact and easier to read.

> Exiting early because the test is needed on all branches now is prone to the
> same problem as we had with IFS in find-provides.ksyms. The test is
> completely separated from the code in question.

No. The problem in find-provides.ksyms was not the early check, but the fact that IFS was mangled, which is about the dirtiest thing that you can do in shell programming. If a program can determine early on that it has no purpose, quitting immediately is reasonable and clean. The scriptlet is small enough that even in the extremely unlikely case that some time in the future we'll add some certificate handling on non-UEFI systems, we'll figure out how to skip the test.
Comment 10 OBSbugzilla Bot 2021-10-04 16:42:33 UTC
This is an autogenerated message for OBS integration:
This bug (1191260) was mentioned in
https://build.opensuse.org/request/show/923052 Factory / suse-module-tools
Comment 11 Michal Suchanek 2021-10-04 16:52:58 UTC
(In reply to Martin Wilck from comment #8)
> (In reply to Michal Suchanek from comment #7)
> 
> > There are branches that do nothing and branch that complains about wrong
> > scriptlet.
> 
> So what? What's the benefit of running the noop branch, or seeing that
> complaint on systems where the entire concept of certificates doesn't matter?
> 
> By definition, cert-script is a noop on systems that don't support UEFI.

We have secure boot on s390 and support for POWER is in progress (expected in 15 SP5). Maybe the kernel makes this look like EFI to userspace but the expectation is that we will have a different tool for POWER eventually.

> Therefore exiting early (and without a warning, which would just confuse
> users without good reason) on such a system is the right thing to do. As a
> side effect, it makes the code more compact and easier to read.

To the contrary - code that exits at random is difficult to follow, and is the very cause of bug 1189841 - if your code does not run because of IFS set at the start of because of a test exiting at the start the result is the same - the code does not run.
Comment 12 Martin Wilck 2021-10-04 19:56:00 UTC
(In reply to Michal Suchanek from comment #11)

> > By definition, cert-script is a noop on systems that don't support UEFI.
> 
> We have secure boot on s390 and support for POWER is in progress (expected
> in 15 SP5). Maybe the kernel makes this look like EFI to userspace but the
> expectation is that we will have a different tool for POWER eventually.

Fine. Either s390x and Power will use mokutil, in which case the test will be correct. Or they will use some other tool, in which we'll need to extend the test. It's fine both ways, and actually easier to maintain than having the same test in multiple places in the code.
Comment 13 Michal Suchanek 2021-10-05 09:36:57 UTC
(In reply to Martin Wilck from comment #12)
> (In reply to Michal Suchanek from comment #11)
> 
> > > By definition, cert-script is a noop on systems that don't support UEFI.
> > 
> > We have secure boot on s390 and support for POWER is in progress (expected
> > in 15 SP5). Maybe the kernel makes this look like EFI to userspace but the
> > expectation is that we will have a different tool for POWER eventually.
> 
> Fine. Either s390x and Power will use mokutil, in which case the test will
> be correct. Or they will use some other tool, in which we'll need to extend
> the test. It's fine both ways, and actually easier to maintain than having
> the same test in multiple places in the code.

Which then evolves into

if !X && !Y 

exit 0

...

if X 

do something with X

fi

if Y

do something with Y

fi

so the initial test is redundant.

But there is also another problem: when looking at the code you not only need to  take into account the code you are looking at but also any code that exits early. That makes maintaining the code harder.

The code that was imported from kernel-source does exit early in some places but adding to that does not make the code cleaner but rather more obfuscated.
Comment 14 Martin Wilck 2021-10-05 10:43:05 UTC
We have different taste wrt what code is easy to understand and read. I guess we will have to deal with that somehow. Being adults, we should be able to figure it out.

If we need a different tool for enrolling certificates on Power/s390, I expect that we will not work with explicit conditionals everywhere. Instead we'll add some abstraction layer that hides the architecture specifics and let us do high level operations like "enroll" or "delete"; and "is_supported", too.

The code will then look like this:

is_supported() {
   case $arch in
   power) ...;;
   s390x) ...;;
   *) mokutil --sb-state;;
   esac
}

if ! is_supported; then 
    exit 0
fi

case $op in
post) 
    enroll;;
postun)
    delete;;
...
esac


... which is nice and clean, IMHO. We can simplify the conditionals inside the last case statement because we don't need to worry whether certificates are generally supported in the first place.
Comment 15 Michal Suchanek 2021-10-05 12:21:38 UTC
It still stands.

You claim you remove redundancy but is_supported needs to detect which certificate management is preset as does enroll and delete.

So is_supported is completely redundant.

But the worse problem is that whatever piece of code you are changing you have to be aware of

if ! is_supported; then 
    exit 0
fi

 - the code is no longer local.
Comment 16 Martin Wilck 2021-10-05 14:52:58 UTC
> So is_supported is completely redundant.

It isn't, and I told you why. You don't buy it, and I didn't expect you would. No point in continuing this debate, we both have far more important things to do.

I hope we agree at least that the current code, with or without early exit, is far better readable and maintainable than anything we had during the years when the scriptlets were embedded in the rpm.

If there's a real bug, show me.
Comment 19 Swamp Workflow Management 2021-10-26 13:27:45 UTC
openSUSE-RU-2021:3509-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
openSUSE Leap 15.3 (src):    suse-module-tools-15.3.13-3.11.1
Comment 20 Swamp Workflow Management 2021-10-26 13:29:35 UTC
SUSE-RU-2021:3509-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE MicroOS 5.1 (src):    suse-module-tools-15.3.13-3.11.1
SUSE Linux Enterprise Module for Basesystem 15-SP3 (src):    suse-module-tools-15.3.13-3.11.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 21 Swamp Workflow Management 2021-10-26 16:25:41 UTC
SUSE-RU-2021:3515-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE MicroOS 5.0 (src):    suse-module-tools-15.2.15-4.9.1
SUSE Linux Enterprise Module for Basesystem 15-SP2 (src):    suse-module-tools-15.2.15-4.9.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 22 Swamp Workflow Management 2021-10-31 20:54:20 UTC
openSUSE-RU-2021:1406-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
openSUSE Leap 15.2 (src):    suse-module-tools-15.2.15-lp152.5.9.1
Comment 26 Michal Suchanek 2021-11-16 17:03:21 UTC
Update released.
Comment 29 Swamp Workflow Management 2021-12-01 17:26:36 UTC
SUSE-RU-2021:3820-1: An update that has 9 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1158817,1189841,1189879,1190598,1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server for SAP 15 (src):    suse-module-tools-15.0.10-3.12.1
SUSE Linux Enterprise Server 15-LTSS (src):    suse-module-tools-15.0.10-3.12.1
SUSE Linux Enterprise High Performance Computing 15-LTSS (src):    suse-module-tools-15.0.10-3.12.1
SUSE Linux Enterprise High Performance Computing 15-ESPOS (src):    suse-module-tools-15.0.10-3.12.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 30 Swamp Workflow Management 2021-12-02 11:49:58 UTC
SUSE-RU-2021:3869-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1189841,1189879,1190598,1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server for SAP 15-SP1 (src):    suse-module-tools-15.1.23-3.19.1
SUSE Linux Enterprise Server 15-SP1-LTSS (src):    suse-module-tools-15.1.23-3.19.1
SUSE Linux Enterprise Server 15-SP1-BCL (src):    suse-module-tools-15.1.23-3.19.1
SUSE Linux Enterprise High Performance Computing 15-SP1-LTSS (src):    suse-module-tools-15.1.23-3.19.1
SUSE Linux Enterprise High Performance Computing 15-SP1-ESPOS (src):    suse-module-tools-15.1.23-3.19.1
SUSE Enterprise Storage 6 (src):    suse-module-tools-15.1.23-3.19.1
SUSE CaaS Platform 4.0 (src):    suse-module-tools-15.1.23-3.19.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 31 Swamp Workflow Management 2021-12-07 17:17:19 UTC
SUSE-RU-2021:3966-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1189841,1189879,1190598,1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE Linux Enterprise Server 12-SP5 (src):    suse-module-tools-12.11-3.8.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 32 Swamp Workflow Management 2021-12-07 20:31:47 UTC
SUSE-RU-2021:3970-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1189841,1189879,1190598,1191200,1191260,1191480,1191804,1191922
CVE References: 
JIRA References: 
Sources used:
SUSE OpenStack Cloud Crowbar 9 (src):    suse-module-tools-12.6.1-27.6.1
SUSE OpenStack Cloud Crowbar 8 (src):    suse-module-tools-12.6.1-27.6.1
SUSE OpenStack Cloud 9 (src):    suse-module-tools-12.6.1-27.6.1
SUSE OpenStack Cloud 8 (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server for SAP 12-SP4 (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server for SAP 12-SP3 (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server 12-SP4-LTSS (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server 12-SP3-LTSS (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server 12-SP3-BCL (src):    suse-module-tools-12.6.1-27.6.1
SUSE Linux Enterprise Server 12-SP2-BCL (src):    suse-module-tools-12.6.1-27.6.1
HPE Helion Openstack 8 (src):    suse-module-tools-12.6.1-27.6.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.