Bug 1179981

Summary: EFI on raid1: grub2-install fails to update nvram
Product: [openSUSE] openSUSE Distribution
Reporter: Marco M. <jjletho67-esus>
Component: Bootloader
Assignee: Michael Chang <mchang>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P2 - High
CC: jjletho67-esus, mchang
Version: Leap 15.2
Target Milestone: ---
Hardware: Other
OS: openSUSE Leap 15.2
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: Partitions layout
output of efibootmgr after installation
output of efibootmgr after system update

Description Marco M. 2020-12-13 18:08:59 UTC
Created attachment 844418 [details]
Partitions layout

I installed openSUSE Leap 15.2 on a KVM virtual machine with OVMF firmware.

My disk layout is configured for full redundancy, so the EFI partition is on RAID1. (I have already used this layout on openSUSE 42.3, and according to the documentation it should be supported: https://doc.opensuse.org/documentation/leap/startup/single-html/book-opensuse-startup/index.html#sec-yast-install-partitioning)

I'm attaching the partition layout. All file systems are ext4, and Secure Boot is disabled.

The machine worked fine (I rebooted several times) until I performed a full patch (zypper patch).

At the next reboot GRUB was not able to boot the OS and stopped with the error: "error: verification requested but nobody cares"

I immediately found this bug, https://bugzilla.suse.com/show_bug.cgi?id=1175766, but the patch that fixes it was already installed.

Curiously, I was able to boot with the installation disk by selecting the "boot from hard disk" option.

I tried to launch grub2-install --verbose manually and it ended up with this error:

"
...
grub2-install: info: copying `/boot/grub2/x86_64-efi/core.efi' -> `/boot/efi/EFI/opensuse/grubx64.efi'.
grub2-install: info: Registering with EFI: distributor = `opensuse', path = `\EFI\opensuse\grubx64.efi', ESP at mduuid/9182c46b9d469f79b48850b68f3371a5.
grub2-install: info: executing efibootmgr --version </dev/null >/dev/null.
grub2-install: info: executing modprobe -q efivars.
grub2-install: info: executing efibootmgr -c -d.
efibootmgr: option requires an argument -- 'd'
efibootmgr version 14
usage: efibootmgr [options]
        -a | --active         sets bootnum active
        -A | --inactive       sets bootnum inactive
        -b | --bootnum XXXX   modify BootXXXX (hex)
        -B | --delete-bootnum delete bootnum (specified with -b)
             --delete         delete entry by bootnum (-b), by UUID (-P)
                              or by disk+partition[+file] (-d -p -l)
        -c | --create         create new variable bootnum and add to bootorder
        -C | --create-only      create new variable bootnum and do not
...
"

My speculation is that efibootmgr is called without the disk parameter because the EFI partition is on RAID. Some updated package seems to have "lost" the ability to deal properly with EFI on RAID1.

STEPS TO REPRODUCE:
Install Leap 15.2 with the partition layout described above.
Run "zypper patch" and reboot.

ACTUAL RESULTS: the system does not boot, grub stops with "error: verification requested but nobody cares"

EXPECTED RESULTS: the system continues to boot properly
Comment 1 Marco M. 2020-12-26 19:11:03 UTC
Some additional info:
The error appears if you select Secure Boot in the bootloader configuration but have Secure Boot disabled in the BIOS.

If you do not select Secure Boot in the bootloader configuration AND you have Secure Boot disabled, the installer is not able to set efivars properly and you end up with an unbootable system. If you manage to boot with the help of a removable drive or of the EFI firmware (if it allows you to select a boot file manually), you'll discover that neither update-bootloader nor grub2-install is able to set the efivar entries, because they do not pass the correct disk parameter to efibootmgr. The grub2-install error is the same one I already reported above.

With Secure Boot enabled in both the BIOS and the bootloader configuration, everything works fine and efibootmgr is called twice (once per disk) with the correct -d parameter.

It looks like a regression from Leap 14.3, which supported these configurations very well.
Comment 2 Michael Chang 2020-12-28 02:51:19 UTC
(In reply to Marco M. from comment #1)

> It looks like a regression from leap 14.3 which support these configurations
> very well.

No, it is not. "ESP on RAID1" does not yet work with grub-install, so you have to use shim-install (i.e. turn Secure Boot on) for this purpose ...
Comment 3 Marco M. 2020-12-28 11:13:44 UTC
I did further tests and was not able to reproduce the issue when using shim.
The system seems able to deal with EFI on RAID1 if you opt for Secure Boot in the bootloader configuration, regardless of whether Secure Boot is enabled or disabled in the BIOS.

I'm very sorry for this; I cannot explain what happened. I checked the patches released in the last few days, but they do not seem related to this kind of bug.

I'm attaching the output of efibootmgr -v after the installation, after the system update and after an update-bootloader --reinit. They look fine.

Now I kindly ask: is EFI on RAID1 supposed to be supported even without using shim (without selecting Secure Boot in the bootloader configuration)?
In Leap 42.3 this was supported. If in Leap 15.2 it is officially NOT supported, then the bug can be closed (but I suggest specifying this more clearly in the documentation).
Comment 4 Marco M. 2020-12-28 11:19:01 UTC
Created attachment 844722 [details]
output of efibootmgr after installation
Comment 5 Marco M. 2020-12-28 11:20:45 UTC
Created attachment 844723 [details]
output of efibootmgr after system update
Comment 6 Michael Chang 2020-12-29 04:38:43 UTC
(In reply to Marco M. from comment #3)

> Now I kindly ask: Is EFI on raid1 supposed to be supported even without
> using shim (without selecting secureboot in bootloader configuration)?
> In LEAP 42.3 this was supported. If in LEAP 15.2 it is officially NOT
> supported then the bug can be closed (but I suggest to better specify this
> aspect in documentation).

I don't think there's any difference between 42.3 and 15.2 regarding this support. It depends on how grub2-install was invoked: with --removable or --no-nvram the installation could succeed. On the other hand, if the installation tried to create boot variables for the mdadm RAID1 device, the attempt may fail, since no viable UEFI device path can represent that device.

The shim-install wrapper, however, uses the member devices of the RAID1 array to compose the device paths for the UEFI boot variables, so that the firmware can treat them as plain disks/partitions.
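
To illustrate the member-device approach described above, here is a minimal dry-run sketch. It is not the actual shim-install or grub2-install code; it only prints one efibootmgr command per RAID1 member partition, so that each boot variable refers to a plain disk and partition. The helper names split_member and print_nvram_cmds are hypothetical, and the label format and loader path are assumptions borrowed from the outputs posted in this report.

```shell
# Dry-run sketch: nothing is executed, the efibootmgr commands are only printed.

# split_member (hypothetical helper): split a partition node such as
# /dev/vda1 or /dev/nvme0n1p1 into "<disk> <partition number>".
split_member() {
    part=$1
    # Trailing digits of the node name are the partition number.
    num=$(printf '%s\n' "$part" | sed -E 's/.*[^0-9]([0-9]+)$/\1/')
    disk=${part%"$num"}
    # Names like nvme0n1p1 or mmcblk0p1 carry a "p" separator after a digit.
    case $disk in
        *[0-9]p) disk=${disk%p} ;;
    esac
    printf '%s %s\n' "$disk" "$num"
}

# print_nvram_cmds (hypothetical helper): LABEL LOADER MEMBER...
# Prints one "efibootmgr -c" call per member partition.
print_nvram_cmds() {
    label=$1
    loader=$2
    shift 2
    for member in "$@"; do
        pair=$(split_member "$member")
        disk=${pair% *}
        num=${pair#* }
        printf 'efibootmgr -c -d %s -p %s -L "%s (%s)" -l "%s"\n' \
            "$disk" "$num" "$label" "${member##*/}" "$loader"
    done
}

# Example with the member partitions of the mirrored ESP from this report.
print_nvram_cmds opensuse '\EFI\opensuse\grubx64.efi' /dev/vda1 /dev/vdb1
```

The printed commands can be reviewed before anything is written to NVRAM. Which members belong to the array would have to come from mdadm --detail on the real system.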
Comment 7 Michael Chang 2021-01-05 07:52:19 UTC
Hi Marco,

Would it be possible for you to test the grub package in this repository?

https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/openSUSE_Tumbleweed/

I added some patches to it, mainly so that grub-install supports updating NVRAM to boot grub installed on a mirrored ESP. You should now get identical results regardless of the Secure Boot settings.

Thanks in advance.
Comment 8 Marco M. 2021-01-05 09:08:35 UTC
(In reply to Michael Chang from comment #7)
> Hi Marco,
> 
> Would it be possible for you to test grub package in this repository ?
> 
> https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/
> openSUSE_Tumbleweed/
> 
> I added some patch to it, mainly to have grub-install support updating nvram
> to boot grub installed on mirrored esp. You should now have identical result
> regardless the secure boot settings.
> 
> Thanks in advanced.

Hi Chang,
yes, I can do the test, I'm glad to help. What is the cleanest method to install your patch? Should I do a vendor change on the grub-install package?

Thank you very much for your help.
Comment 9 Michael Chang 2021-01-06 06:55:52 UTC
(In reply to Marco M. from comment #8)
> (In reply to Michael Chang from comment #7)

[snip]

> Hi Chang,
> yes I can do the test, I'm glad to help.

Thank you.

> What is the cleanest method to
> install your patch? Should I do a vendor change on grub-install package? 

Yes. You can follow the instructions to install and uninstall test packages from that repository.

To install:

> zypper ar --repo https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/openSUSE_Tumbleweed/home:michael-chang:devel:grub.repo
> zypper ref home_michael-chang_devel_grub 
> zypper dup --allow-vendor-change --from home_michael-chang_devel_grub

To uninstall (revert to official package)
> zypper rr home_michael-chang_devel_grub
> zypper ref
> zypper in --allow-downgrade  --allow-vendor-change --force grub2

Please note that you'll have to disable Secure Boot in the UEFI firmware to test the package, since it was built in my home project and is thus not signed by openSUSE.
Comment 10 Michael Chang 2021-01-06 07:01:42 UTC
(In reply to Michael Chang from comment #7)
> Hi Marco,
> 
> Would it be possible for you to test grub package in this repository ?
> 
> https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/
> openSUSE_Tumbleweed/

Sorry, this is for Tumbleweed. I just noticed that you're using openSUSE Leap 15.2. I am going to enable the 15.2 repository for building the test package and will send you the new link once it is published. Please stay tuned.
Comment 11 Michael Chang 2021-01-06 07:29:37 UTC
(In reply to Michael Chang from comment #10)
> (In reply to Michael Chang from comment #7)

Hi Marco,

Please use this repository for Leap 15.2

https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/openSUSE_Leap_15.2/

Also change the first instruction in comment#9 to use the new link

> zypper ar --repo https://download.opensuse.org/repositories/home:/michael-chang:/devel:/grub/openSUSE_Leap_15.2/home:michael-chang:devel:grub.repo

Thanks.
Comment 12 Marco M. 2021-01-06 16:32:01 UTC
Hi,
I installed the patched grub, but something went wrong. I got this error at the end of the installation:

Output of grub2-i386-pc-2.05-lp152.23.1.noarch.rpm %posttrans script:
    update-bootloader: 2021-01-06 11:38:16 <3> update-bootloader-1768 run_command.294: '/usr/lib/bootloader/grub2-efi/install' failed with exit code 1, output:
    <<<<<<<<<<<<<<<<
    target = x86_64-efi
    + /usr/sbin/shim-install --config-file=/boot/grub2/grub.cfg
    copying /usr/share/efi/x86_64/grub.efi to /boot/efi/EFI/opensuse/grub.efi
    Installing for x86_64-efi platform.
    error: no such partition.
    /usr/sbin/grub2-install: error: disk `lvmid/dECLWx-tkSi-DQWq-dtwx-seVe-Dujp-xJ18ie/IKPCya-y4jr-Joku-tisO-WtOC-To7x-femnYs' not found.
    >>>>>>>>>>>>>>>>

I also tried to update bootloader through yast (removing the secureboot settings) and by manually invoking grub2-install, but I got the same error.

The cited lvmid belongs to my root partition, but please note that in my configuration root, /boot and /boot/efi are different partitions (see the attached Partition Layout).
Comment 13 Michael Chang 2021-01-07 04:48:18 UTC
(In reply to Marco M. from comment #12)

> The cited lvmid belongs to my root partition, but please note that in my
> configuration root /boot and /boot/efi are different partitions (see the
> attached Partition Layout)

Yes, this should be a separate problem, as the test package is based on the latest upstream commit of the 2.05 code stream (odd numbers are assigned to the development code stream). I think I will have to backport this work to 2.04 and provide another test package; sorry for the inconvenience. I'll try to get it done today.

As for the regression in LVM, I will certainly investigate that error as well.

Thanks.
Comment 14 Michael Chang 2021-01-07 11:46:42 UTC
(In reply to Michael Chang from comment #13)
> (In reply to Marco M. from comment #12)

> For the regression in lvm, certainly I will investigate the error as well.

The LVM error was actually introduced by the test patch and is not part of upstream. I have fixed it; please follow these instructions to update the grub2 package.

> zypper ref home_michael-chang_devel_grub 
> zypper up grub2

If you had removed the home_michael-chang_devel_grub repo, you would need to follow instructions in comment#9 to re-add the repo and install package from there.

Please let me know how it works this time.
Thanks.
Comment 15 Michael Chang 2021-01-07 11:49:10 UTC
(In reply to Michael Chang from comment #14)
> (In reply to Michael Chang from comment #13)
> > (In reply to Marco M. from comment #12)

> Please let me know how it works this time.

Btw, I have tested on a disk layout similar to yours (i.e. LVM in a RAID container), and the new package worked as expected.
Comment 16 Marco M. 2021-01-08 17:54:29 UTC
> Please let me know how it works this time.
> Thanks.
Hi,
first of all thank you very much for your help. I installed the new version of the patched grub, but there is still a problem. This is the output I got when I installed the package:

**************************
Output of grub2-i386-pc-2.05-lp152.24.1.noarch.rpm %posttrans script:
    update-bootloader: 2021-01-08 18:35:01 <3> update-bootloader-2141 run_command.294: '/usr/lib/bootloader/grub2-efi/install' failed with exit code 1, output:
    <<<<<<<<<<<<<<<<
    target = x86_64-efi
    + /usr/sbin/grub2-install --target=x86_64-efi
    Installing for x86_64-efi platform.
    /usr/sbin/grub2-install: warning: this array has metadata at the start and may not be suitable as a efi system partition. please ensure that your firmware understands md/v1.x metadata, or use --metadata=0.90 to create the array..
    /usr/sbin/grub2-install: error: mduuid/74b7e49b63d0490d99ec2895beceeec2: no device for efi.
    >>>>>>>>>>>>>>>>
*************************

The mentioned mduuid corresponds to my RAID1 md device mounted as /boot/efi, but the metadata version is 1.0, as you can see from the output of the mdadm --detail command:

*************************
/dev/md0:
           Version : 1.0
     Creation Time : Fri Jan  1 19:14:58 2021
        Raid Level : raid1
        Array Size : 511936 (499.94 MiB 524.22 MB)
     Used Dev Size : 511936 (499.94 MiB 524.22 MB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Fri Jan  8 18:35:28 2021
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : any:0
              UUID : 74b7e49b:63d0490d:99ec2895:beceeec2
            Events : 22

    Number   Major   Minor   RaidDevice State
       0     253        1        0      active sync   /dev/vda1
       1     253       17        1      active sync   /dev/vdb1
*************************

I don't understand why grub is complaining about the position of metadata, because metadata are at the end of the partition, for sure (the installer correctly defaults to metadata v1.0). Is there still a bug or am I missing something?

Thanks in advance!
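
For reference, the match between the mduuid in the grub2-install error and the UUID printed by mdadm can be checked mechanically. The sketch below assumes that the GRUB form is simply the mdadm UUID with the colons removed and in lowercase, which is consistent with the two values shown in this comment; the function name mduuid_from_mdadm is hypothetical.

```shell
# mduuid_from_mdadm (hypothetical helper): convert the colon-separated UUID
# printed by `mdadm --detail` into the form used in GRUB's mduuid/... names.
mduuid_from_mdadm() {
    printf '%s\n' "$1" | tr -d ':' | tr 'A-F' 'a-f'
}

# Compare the UUID from the mdadm output with the one in the error message.
if [ "$(mduuid_from_mdadm 74b7e49b:63d0490d:99ec2895:beceeec2)" = "74b7e49b63d0490d99ec2895beceeec2" ]; then
    echo "mduuid matches /dev/md0"
fi
```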
Comment 17 Michael Chang 2021-01-10 09:04:10 UTC
(In reply to Marco M. from comment #16)

> I don't understand why grub is complaining about the position of metadata,
> because metadata are at the end of the partition, for sure (the installer
> correctly defaults to metadata v1.0). Is there still a bug or am I missing
> something?

Yes, the intention is to ensure the use of a metadata format that places the superblock at the end of the partition. I added the warning because mdadm prints a similar one, which you may have encountered:

> localhost:~ # mdadm --create /dev/md0 --level=mirror --raid-devices=2 /dev/vdb1 /dev/vdc1
> mdadm: Note: this array has metadata at the start and
>     may not be suitable as a boot device.  If you plan to
>     store '/boot' on this device please ensure that
>     your boot-loader understands md/v1.x metadata, or use
>     --metadata=0.90
> Continue creating array? ^C

But you're right, version 1.0 also has its metadata at the end of the partition. I was misled by the mdadm note above, as it explicitly suggests only metadata 0.90.

I will have to work on that and provide another test package. Sorry for the trouble, but I think we are getting close to completing the fix now (hopefully).

> 
> Thanks in advance!

You are welcome.

Thanks.
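
To make the metadata point above concrete, here is a small sketch that classifies an array by superblock position from the "Version" line of mdadm --detail. It is only an illustration, not the check implemented in the test package. It relies on the md metadata layouts (0.90 and 1.0 keep the superblock at the end of the device, 1.1 and 1.2 at or near the start); the function name md_superblock_location is hypothetical, and the exact spelling of the 0.90 version string is an assumption.

```shell
# md_superblock_location (hypothetical helper): read `mdadm --detail` output
# on stdin and print "end", "start" or "unknown".
md_superblock_location() {
    version=$(sed -n 's/^[[:space:]]*Version[[:space:]]*:[[:space:]]*\([0-9.]*\).*/\1/p' | head -n 1)
    case $version in
        0.90*|00.90*|1.0) echo end ;;    # superblock at the end of the device
        1.1|1.2)          echo start ;;  # superblock at or near the start
        *)                echo unknown ;;
    esac
}

# Usage on a real system (not run here):
#   mdadm --detail /dev/md0 | md_superblock_location
```

An array reported as "end" leaves the start of each member partition looking like a plain file system, which is what the firmware needs in order to read the ESP directly.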
Comment 18 Michael Chang 2021-01-11 04:17:03 UTC
Hi Marco,

I have updated the package to address the metadata 1.0 problem we just discussed. Would you please verify whether it works for you?

Thanks in advance.
Comment 19 Marco M. 2021-01-13 21:44:46 UTC
(In reply to Michael Chang from comment #18)
> Hi Marco,
> 
> I have updated the package to address the metadata 1.0 problem we just
> discussed. Would you please verify if that can work for you ?
> 
> Thanks in advanced.

Hi,
I've just installed your updated packages and this time the results look good!
At first I installed the patch, and the posttrans scripts did not report any error.
But my efivars were still pointing to shim, because the last time I left the test machine with Secure Boot enabled in YaST. So I launched YaST, removed the Secure Boot setting and saved. I did not receive any errors, and the efivars are now correctly set as shown below:

**************
BootCurrent: 0001
Timeout: 0 seconds
BootOrder: 0002,0001,0003,000B,0000,000A
Boot0000* UiApp FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0001* opensuse (vda1)
        HD(1,GPT,d12112d2-e2a6-49bd-9f4c-9e22b9504f84,0x800,0xfa000)/File(\EFI\opensuse\grubx64.efi)
Boot0002* opensuse (vdb1)
        HD(1,GPT,481918fa-c9a3-4f74-a05f-96237355c4c8,0x800,0xfa000)/File(\EFI\opensuse\grubx64.efi)
Boot0003* UEFI QEMU DVD-ROM QM00005     PciRoot(0x0)/Pci(0x6,0x0)/Sata(0,65535,0)N.....YM....R,Y.
Boot000A* EFI Internal Shell    FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
**************

Reboot was fine. 
I also manually launched update-bootloader and grub2-install; in both cases the results look good (no errors and efivars correctly set).

I'm planning to do a more complete test by reinstalling from scratch and avoiding shim from the beginning (grub only). I'll let you know, but I'm confident that it will work.

Thank you very much
Comment 20 Marco M. 2021-01-17 14:13:56 UTC
I performed a new test from scratch:

- I installed openSUSE Leap 15.2 with the same partition layout.
- I disabled Secure Boot from the installation onwards, both in the BIOS and in the YaST configuration (so shim was never involved).
- At the first reboot the OS was not able to boot. This was expected: the YaST installer copied efi/opensuse/grubx64.efi to the RAID EFI partition but was not able to set efivars, due to the bug we are trying to solve.
- I managed to boot the newly installed system through the installation CD-ROM by chainloading grubx64.efi, and the boot succeeded.
- I added Chang's repository and updated grub with Chang's patch. The result was what I expected, and efivars were set correctly:

************
BootCurrent: 000E
Timeout: 0 seconds
BootOrder: 0003,000B,000E,000D,0000,000A
Boot0000* UiApp FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0003* UEFI QEMU DVD-ROM QM00005     PciRoot(0x0)/Pci(0x6,0x0)/Sata(0,65535,0)N.....YM....R,Y.
Boot000A* EFI Internal Shell    FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
Boot000B* UEFI Misc Device 3    PciRoot(0x0)/Pci(0xc,0x0)N.....YM....R,Y.
Boot000D* opensuse (vda1)
        HD(1,GPT,d12112d2-e2a6-49bd-9f4c-9e22b9504f84,0x800,0xfa000)/File(\EFI\opensuse\grubx64.efi)
Boot000E* opensuse (vdb1)
        HD(1,GPT,481918fa-c9a3-4f74-a05f-96237355c4c8,0x800,0xfa000)/File(\EFI\opensuse\grubx64.efi)
************

From my point of view the patch is working fine! Please tell me if you need some other tests or information. 

I'm looking forward to seeing the patch in the official update repository!

Thank you very much for your help!
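
The check done by eye on the efibootmgr output above could be scripted roughly as follows. This is only a sketch under stated assumptions: it expects labels of the form "opensuse (vda1)" as seen in the outputs in this report, and the function names boot_labels and check_members are hypothetical.

```shell
# boot_labels (hypothetical helper): read `efibootmgr -v` output on stdin and
# print the labels of Boot#### entries whose label starts with NAME.
boot_labels() {
    awk -v name="$1" '
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][* ] / {
            label = substr($0, 11)     # text after "BootXXXX* "
            sub(/\t.*/, "", label)     # drop the device path if it is on the same line
            if (index(label, name) == 1) print label
        }'
}

# check_members (hypothetical helper): NAME MEMBER...
# Returns 0 if every member has an entry labelled "NAME (member)", 1 otherwise.
check_members() {
    name=$1
    shift
    labels=$(boot_labels "$name")
    missing=0
    for m in "$@"; do
        case $labels in
            *"($m)"*) ;;
            *) echo "missing entry for $m"; missing=1 ;;
        esac
    done
    return $missing
}

# Usage on a real system (not run here):
#   efibootmgr -v | check_members opensuse vda1 vdb1
```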
Comment 21 Michael Chang 2021-01-19 06:52:32 UTC
Hi Marco,

Thank you very much. I really appreciate the time you spent on testing and on writing such informative comments. I'll send the patch series to openSUSE and upstream and update this bug accordingly.
Comment 22 Marco M. 2021-04-06 12:12:46 UTC
Hi,
I've just completed a test installation of Leap 15.3 beta and it has the same bug we are discussing here. 
Do you think the patched grub could be available before leap 15.3 final release?
Should we update the version affected in the bug metadata?

Thank you very much
Comment 24 Swamp Workflow Management 2022-04-14 13:19:03 UTC
SUSE-RU-2022:1202-1: An update that has four recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1179981,1191974,1192622,1195204
CVE References: 
JIRA References: 
Sources used:
SUSE Manager Server 4.1 (src):    grub2-2.04-150200.9.58.3
SUSE Manager Retail Branch Server 4.1 (src):    grub2-2.04-150200.9.58.3
SUSE Manager Proxy 4.1 (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise Server for SAP 15-SP2 (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise Server 15-SP2-LTSS (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise Server 15-SP2-BCL (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise Realtime Extension 15-SP2 (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise Micro 5.0 (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise High Performance Computing 15-SP2-LTSS (src):    grub2-2.04-150200.9.58.3
SUSE Linux Enterprise High Performance Computing 15-SP2-ESPOS (src):    grub2-2.04-150200.9.58.3
SUSE Enterprise Storage 7 (src):    grub2-2.04-150200.9.58.3

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 25 Swamp Workflow Management 2022-04-14 13:23:29 UTC
SUSE-RU-2022:1201-1: An update that has four recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1179981,1191974,1192622,1195204
CVE References: 
JIRA References: 
Sources used:
openSUSE Leap 15.3 (src):    grub2-2.04-150300.22.15.2
SUSE Linux Enterprise Module for Server Applications 15-SP3 (src):    grub2-2.04-150300.22.15.2
SUSE Linux Enterprise Module for SUSE Manager Proxy 4.2 (src):    grub2-2.04-150300.22.15.2
SUSE Linux Enterprise Module for Basesystem 15-SP3 (src):    grub2-2.04-150300.22.15.2

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 26 Marco M. 2022-04-28 17:15:49 UTC
I tested the patch with a fresh installation of:

1) Leap 15.3 with self_update=1 and using online repository (to ensure I was using the last update of the grub package)
2) Leap 15.4 beta 222.1 

Secureboot was disabled in bios and NOT selected in the bootloader configuration and everything has worked as expected in both installations.

From my point of view the patch has solved the problem.

Thank you very much!
Comment 27 Michael Chang 2022-04-29 02:44:09 UTC
(In reply to Marco M. from comment #26)
> I tested the patch with a fresh installation of:
> 
> 1) Leap 15.3 with self_update=1 and using online repository (to ensure I was
> using the last update of the grub package)
> 2) Leap 15.4 beta 222.1 
> 
> Secureboot was disabled in bios and NOT selected in the bootloader
> configuration and everything has worked as expected in both installations.
> 
> From my point of view the patch has solved the problem.
> 
> Thank you very much!

Great to hear about your positive test result. Since then the patch has also been revised to handle Dell's firmware RAID on EFI, so it took more time to get into Factory. Now we should have everything covered. :)

Thanks a lot for your cooperation and work.
Comment 28 Swamp Workflow Management 2022-06-14 13:18:25 UTC
SUSE-SU-2022:2073-1: An update that solves 7 vulnerabilities and has 14 fixes is now available.

Category: security (important)
Bug References: 1071559,1159205,1179981,1189769,1189874,1191184,1191185,1191186,1191504,1191974,1192522,1192622,1193282,1193532,1195204,1197948,1198460,1198493,1198495,1198496,1198581
CVE References: CVE-2021-3695,CVE-2021-3696,CVE-2021-3697,CVE-2022-28733,CVE-2022-28734,CVE-2022-28735,CVE-2022-28736
JIRA References: 
Sources used:
SUSE Linux Enterprise Micro 5.1 (src):    grub2-2.04-150300.3.5.1

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.