Bug 630434

Summary: Server refuses to boot, while used iSCSI based non "/" file system is not available.
Product: [openSUSE] openSUSE 11.3 Reporter: Dennis Olsson <DOlsson>
Component: Basesystem Assignee: Lee Duncan <lduncan>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Critical    
Priority: P3 - Medium CC: hare, heiko.rommel, ihno, jeffm, moussa.sagna, per, shawn.starr, stefan.bogner, stephan.barth
Version: Final   
Target Milestone: ---   
Hardware: x86-64   
OS: SLES 11   
Whiteboard: maint:released:sle11-sp1:38085 maint:released:11.3:38138 maint:released:sle11-sp1:37595 maint:running:52031:low maint:released:sle11-sp3:57014
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on: 751056, 821695    
Bug Blocks:    
Deadline: 2014-05-05   
Attachments: multipath-emit-change-events
udevmountd-multipath-handling
Template of how "/etc/sysconfig/initrd" could look like.
Patch fixing reported bug, enabling definitions of override variables in "/etc/sysconfig/initrd" as well as adding a "/etc/sysconfig/initrd" template file.
Patch fixing missing inclusion of iSCSI stack, when having "onboot" configured iSCSI sessions.
Alternative patch to "/etc/init.d/boot.open-iscsi".
/etc/sysconfig/initrd for my test
My setup-iscsi.sh patch for initrd
Patch for /etc/init.d/open-iscsi and boot.open-iscsi

Description Dennis Olsson 2010-08-11 17:26:04 UTC
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.6) Gecko/20100626 SUSE/3.6.6-1.1 Firefox/3.6.6

On our SLES 11 GM based "mirror" server (running as a virtual XEN guest), we are using iSCSI to attach to our NAS holding the "/data" file system, whereas all other file systems (/, /boot, /var, /home) are placed on local disk.

When rebooting the server, it refuses to boot correctly, because it is not able to "fsck" the "/data" file system -- The system stops in "(repair filesystem) #" mode (after having entered the "root" password).

Using the "nofail" option for the "/data" file system helps to get the system to reboot correctly, but in this case the "/data" file system is neither being checked nor mounted during the system startup, which therefore -- once again -- makes the system unusable!
Using the "nofail" option is part of the "solution" listed in the SLES 11 SP1 documentation, but since the system is unusable, this is *not* the solution to the problem.
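To see which entries are affected, the following sketch lists the mount points that carry "nofail" in an fstab-style file. The helper name is illustrative only and not part of any package; the field positions assume the standard fstab layout (device, mount point, type, options).

```shell
# List mount points whose option field (4th column) contains "nofail".
# list_nofail_mounts is a hypothetical helper name.
list_nofail_mounts() {
    fstab=${1:-/etc/fstab}
    # Skip comment lines and short lines, then match "nofail" as a whole option.
    awk '$1 !~ /^#/ && NF >= 4 && $4 ~ /(^|,)nofail(,|$)/ { print $2 }' "$fstab"
}

# Example: list_nofail_mounts /etc/fstab
```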

The problem is that "mkinitrd" and the boot scripts are not capable of handling iSCSI attached file systems correctly.

The correct way of handling iSCSI attached devices is:

- All iSCSI devices having the "onboot" set as node startup setting must be attached during the "initrd" boot sequence (i.e. by "boot.open-iscsi") -- and this *independent* of whether the "/" (root) file system is attached via iSCSI or not,

- All iSCSI devices having the "automatic" set as node startup setting need to be attached during the normal booting sequence (i.e. by "open-iscsi"),

- All iSCSI devices having the "manual" set as node startup setting should be left alone.

Using such an attachment scheme will ensure that a server is capable of being booted with all its iSCSI file systems being attached, checked and mounted.
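The proposed scheme can be summarised as a small lookup. This is only a sketch of the policy described above, not code from any of the packages; the function name and the phase labels are made up for illustration.

```shell
# Map a node startup setting to the boot phase in which, under the proposed
# scheme, the iSCSI session would be attached. attach_phase is hypothetical.
attach_phase() {
    case "$1" in
        onboot)    echo "initrd" ;;      # available before file systems are checked and mounted
        automatic) echo "open-iscsi" ;;  # attached by the normal init script
        manual)    echo "none" ;;        # left alone
        *)         echo "unknown"; return 1 ;;
    esac
}

# Example: attach_phase onboot
```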

Reproducible: Always

Steps to Reproduce:
1a. Set up an iSCSI device and mount it e.g. as "/home" (no changes to "/etc/fstab").
2a. Reboot the system.
3a. System fails to boot, because "/home" is not available.

1b. Set up an iSCSI device and mount it e.g. as "/home", and add the "nofail" option in "/etc/fstab" to the "/home" mount point.
2b. Reboot system.
3b. System boots, but "/home" is neither checked nor mounted!
Actual Results:  
When using iSCSI devices for non root file systems
- the system is not able to boot, when the file system on the iSCSI device is listed in "/etc/fstab" as a file system that should be mounted
or
- the system boots, when the file system on the iSCSI device has been marked with "nofail" in "/etc/fstab", but the file system is neither checked nor mounted.

Expected Results:  
System boots with all file systems listed in "/etc/fstab" being checked and mounted independently of whether they are on local or iSCSI attached disks.
Comment 1 Michal Marek 2010-08-12 09:46:46 UTC
Dennis, you should report SLES issues via the SLES support channels, not in the openSUSE bugzilla. Reassigning to Hannes nevertheless.
Comment 2 Hannes Reinecke 2010-08-12 11:26:21 UTC
Got me confused.
Where exactly is the problem?

Yes, you have to mark the iscsi device as 'automatic' or 'onboot' just as you described.
Which incidentally is also what our documentation states about the use of iSCSI devices. So that's not a bug.
We only have a bug if a device will _not_ be mounted despite being marked as 'automatic' or 'onboot'.
Is that the case here?
Comment 3 Dennis Olsson 2010-08-12 19:05:07 UTC
Yes, this is exactly the case here.

Sorry that my bug reporting was not as clear as I thought.

The problem is:

(1) Doing as described in the SLES 11 SP1 documentation, i.e.
    - Setting the "node.startup" to "automatic",
    - Adding "nofail" option to the file system options in "/etc/fstab" for the file system found on the iSCSI disk(s),
I end up with a system that boots, but is unusable, because the "/data" iSCSI file system has not been mounted.
The iSCSI disk has been attached to the system (by "open-iscsi"), but the "/data" file system has not been *checked* nor *mounted*.

I.e. using the "nofail" option on the "/data" file system in "/etc/fstab" only prevents the system from not booting at all, but it does not ensure that the system ends up in a usable state (in my case neither "nfsserver" nor "autofs" works correctly -- not even if "/data" is manually mounted after booting, unless "nfsserver" and "autofs" are stopped, "/data" is mounted, and "autofs" and "nfsserver" are then restarted).
This is *not* the way iSCSI disks should be used!  .-)


(2) Doing as one would expect the world to work (and as described above in comment #0 as the correct way of handling iSCSI attached devices), i.e.
    - Setting the "node.startup" to "onboot",
    - Adding the "/data" file system as normal, just like when using a local disk to the "/etc/fstab",
results in a non bootable system, because during the file system checking "fsck" cannot find the "/data" file system disk at all, and the system thus ends up at the "(repair filesystem)" shell prompt.


As I wrote in comment #0, iSCSI devices holding file systems that need to be mounted during the boot up sequence have to be attached to the booting system during the execution of "boot.open-iscsi" using "node.startup" set to "onboot", allowing for these file systems to be checked and mounted by "boot.localfs".


Just think what would happen to a system, if you have a "/" on a local disk and "/usr" on an iSCSI disk => Non bootable system.


Using "automatic" only ensures that an iSCSI target gets attached to the system, but it cannot be used for normal mount points, simply because the attachment of the iSCSI targets happens far too late in the boot sequence.
The "automatic" setting is useful for iSCSI disks that are used by e.g. databases that are being started rather late in the boot sequence, but is absolutely *not* usable for iSCSI disks holding normal mounted file systems.
Comment 4 Dennis Olsson 2010-08-13 09:08:59 UTC
Reply to comment #1:

> Dennis, you should report SLES issues via the SLES support channels, not in the
> openSUSE bugzilla. Reassigning to Hannes nevertheless.

Michal, thanks, yes, I am aware of this.   I have just reported it here first, because (1) the issue also affects openSUSE 11.3+, and (2) I am much more familiar with the usage of Bugzilla (it is far easier to use;-) than with Novell's proprietary Support Requests. ;-)

Issue was reported today as:

Service Request: 10642695911
    Description: Server refuses to boot, while used iSCSI based non "/" file system is not available.
Comment 5 Hannes Reinecke 2010-08-31 14:14:31 UTC
Ah. Now this makes sense.
Not mounting an existing iSCSI device is indeed an error.
Comment 6 Hannes Reinecke 2010-08-31 14:18:18 UTC
I've been doing some tests here, and the 'udevmountd' service did indeed mount the device. Sadly, the 'block' device only, so multipath would refuse to run there.

To fix this we need to update multipath-tools and sysconfig.
multipathd from multipath-tools needs to be patched to emit a 'change' event for block devices whenever it's finished processing. This event can then be parsed by udevmountd, which then will know if it should try to check/mount the block device or if it's claimed by multipath.
If the latter we can skip the fsck/mount step as we'll be getting a new multipath device eventually.
Comment 7 Hannes Reinecke 2010-08-31 14:32:45 UTC
Created attachment 386537 [details]
multipath-emit-change-events

Emit 'change' event when the daemon finished processing the 'add' event.
Comment 8 Hannes Reinecke 2010-08-31 14:34:17 UTC
Created attachment 386539 [details]
udevmountd-multipath-handling

Patch to 'sysconfig' to allow mounting of multipath devices.
Comment 9 Hannes Reinecke 2010-08-31 14:36:13 UTC
With these two patches mounting of iscsi devices works as advertised, both multipathed and non-multipathed.

The patch to multipath-tools should be applied on top of my git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/hare/multipath-tools.git
branch sles11-sp1

The patch to sysconfig should just be applied to the existing source rpm.

Please test (with SLES11 SP1) and report the results.
Comment 10 Dennis Olsson 2010-09-15 14:44:17 UTC
Although I currently cannot test on SLES 11 SP1 (my XEN guest first needs to be upgraded, which I currently cannot do), I have tried to test on the running SLES 11 GM with all updates using the PTFs Stefan Bogner supplied me with:

  kpartx-0.4.8-40.19.1.1889.0.PTF.638468.x86_64.rpm
  multipath-tools-0.4.8-40.19.1.1889.0.PTF.638468.x86_64.rpm
  sysconfig-0.71.14-12.9.1.1889.0.PTF.638468.x86_64.rpm

  mkinitrd-2.4-49.10.1

Having installed these, rerun "mkinitrd" and rebooted, I find that the system still refuses to boot, because the iSCSI devices are not attached to the system.

Which does not surprise me, because the problem is that "mkinitrd" still cannot figure out that it has to activate iSCSI during the booting up of "initrd".

Also tried to update to (from SLES 11 SP1):

  module-init-tools-3.11.1-1.3.5.x86_64.rpm
  mkinitrd-2.4.1-0.11.1.x86_64.rpm

but this did not change anything (yes, I did rerun "mkinitrd")!

As a note, after having finished the "initrd" booting, it finishes with:

...
System Boot Control: The system has been                             set up
Skipped features:                             boot.open-iscsi boot.cycle

which very clearly informs you that iSCSI has *not* been started nor that any iSCSI devices have been attached.
Comment 11 Hannes Reinecke 2010-09-15 14:58:18 UTC
Hmm? What exactly has initrd to do with this? I was under the impression that you're trying to mount an iSCSI device via /etc/fstab, ie _after_ initrd has run and the 'normal' SYSV init runs.
Is that correct?
Comment 12 Dennis Olsson 2010-09-15 15:20:19 UTC
An additional note to comment #10:

When doing the testing, I noticed that during a shutdown "open-iscsi" is called
*before* any of the file systems used over iSCSI have been unmounted, resulting
in having these file systems end up in an undefined state by the next start up.

Not especially smart.

By changing "/etc/init.d/open-iscsi" to have:

  # Required-Stop:     $remote_fs $network

the system is able to shut down correctly.

The problem is that the mounted file systems over iSCSI are being used by
"autofs" resp. "nfsserver".
By adding "$remote_fs", it is ensured that "open-iscsi" is executed after these
services have been stopped, enabling "open-iscsi" to unmount its partitions
before logging out from the iSCSI server.
Comment 13 Dennis Olsson 2010-09-15 15:30:16 UTC
> Hmm? What exactly has initrd to do with this? I was under the impression that
> you're trying to mount an iSCSI device via /etc/fstab, ie _after_ initrd has
> run and the 'normal' SYSV init runs.
> Is that correct?

Well, yes, that is what I am trying to do.

The problem is just that "open-iscsi" is started *after* "boot.localfs", which means that "fsck" bails out, because the file systems from the iSCSI devices (in my case "/data" and "/home") are not available.

On the other hand, if "open-iscsi" had already been started in "initrd" (as is the case, if "/" is on an iSCSI device), the iSCSI devices that hold the file systems would be available at the time when "boot.localfs" is being executed, and everything would run as it should.
Comment 14 Hannes Reinecke 2010-09-20 17:04:25 UTC
> The problem is just that "open-iscsi" is started *after* "boot.localfs", which
> means that "fsck" bails out, while the file systems from the iSCSI devices (in
> my case "/data" and "/home") are not available.

For which we have the option 'nofail' to avoid a failure here.
The devices will then be mounted via udev.
Comment 15 Dennis Olsson 2010-09-20 19:22:50 UTC
> For which we have the option 'nofail' to avoid a failure here.
> The devices will then be mounted via udev.

Sigh -- and *that* is the problem!

(1) The devices do *not* get mounted via udev.
(2) The file systems (sometimes) cannot be mounted, because a "fsck" needs to be run on them before mounting them.
(3) In cases of having one's system file systems, like "/usr", "/srv", "/opt", "/var", etc., on iSCSI devices, it is simply *not* possible to make use of the "nofail" option and then wait until udev mounts the file system, because the system simply *cannot* boot without having access to these file systems on their iSCSI devices!!

Therefore, and to repeat myself from my initial comment #0, when creating this bug entry:

The correct way of handling iSCSI attached devices is:

<quotation>
- All iSCSI devices having the "onboot" set as node startup setting must be
attached during the "initrd" boot sequence (i.e. by "boot.open-iscsi") -- and
this *independent* of whether the "/" (root) file system is attached via iSCSI
or not,

- All iSCSI devices having the "automatic" set as node startup setting need to
be attached during the normal booting sequence (i.e. by "open-iscsi"),

- All iSCSI devices having the "manual" set as node startup setting should be
left alone.

Using such an attachment scheme will ensure that a server is capable of being
booted with all its iSCSI file systems being attached, checked and mounted.
</quotation>

The usage of iSCSI devices is *independent* of what they are used for (file systems, system file systems, raw device or whatever), just like any other attachable disk device (IDE, SATA, SAS, SCSI, USB, etc.).

Therefore, using an attachment logic for iSCSI devices as described above ensures that the needed iSCSI devices are made available *before* they are addressed by the system.
Comment 17 Hannes Reinecke 2010-09-27 08:29:46 UTC
Hmm. Yes, you are correct, the 'nofail' option will not work for systems with any system directories (like /var or /usr) on iSCSI when root is _not_ on iSCSI.
And I'm equally sure there is no way we can fix this without major restructuring.

Reason being that network is started in runlevel 3 (and higher) only, and iscsi requires the network to run.
So moving iscsi to the 'boot' runlevel is a no-go.
On the other hand 'mkinitrd' considers the root fs _only_, and as iSCSI is not required to get the root fs up and running it won't be including it.

Of course we might be able to update mkinitrd to include iSCSI setup in these cases, but then the iSCSI setup might be more complicated than just a single connection to a single target, in which case we're bound to fail.

So to make this work we need first to update mkinitrd to handle more than one network connection and then update mkinitrd to include iSCSI when /var or /usr is on iSCSI.

But this is _not_ something we can do for an already released product. Sorry.
Comment 18 Hannes Reinecke 2010-09-27 08:31:30 UTC
So we should be updating the documentation to state that any system directory on a separate partition via iSCSI is _not_ supported when the root-fs is not on iSCSI, too.

Ihno, can you trigger the documentation team here?
Comment 21 Marius Tomaschewski 2011-01-05 17:27:28 UTC
Patch from comment #8 is included in the submit request 9953
  ->  SUSE:SLE-11-SP1:Update:Test
Fixes bnc#616765,bnc#637183,bnc#644738,bnc#660774,bnc#630434

Going to add to swampid 37050.
Comment 22 Marius Tomaschewski 2011-01-06 10:33:22 UTC
Mr Maintenance,

the update will go out via multipath swampid 37050 for SLE-11-SP1.
I've submitted it to git for 11.4 already and preparing for factory.

What about 11.3?
Comment 23 Christian Dengler 2011-01-10 14:14:04 UTC
Feel free to update it also on 11.3. You can use the same SwampID.
Comment 24 Marius Tomaschewski 2011-01-11 14:40:12 UTC
OK. Submitted package to 11.3:

 57886  State:new     By:mtomaschewski When:2011-01-11T15:31:18
        submit:       home:mtomaschewski:branches:openSUSE:11.3:Update:Test/sysconfig  ->  openSUSE:11.3:Update:Test   
        Descr: SWAMPID 37050: fixes for bnc#630434 and bnc#660774

and added patchinfo. The other fixes (comment 21) are already
released on 11.3 via SWAMPID 37959.
Comment 25 Swamp Workflow Management 2011-01-12 20:53:28 UTC
Update released for: sysconfig, sysconfig-debuginfo, sysconfig-debugsource
Products:
SLE-DEBUGINFO 11-SP1 (i386, ia64, ppc64, s390x, x86_64)
SLE-DESKTOP 11-SP1 (i386, x86_64)
SLE-SERVER 11-SP1 (i386, ia64, ppc64, s390x, x86_64)
SLES4VMWARE 11-SP1 (i386, x86_64)
Comment 26 Swamp Workflow Management 2011-01-13 12:31:06 UTC
Update released for: sysconfig, sysconfig-debuginfo, sysconfig-debugsource
Products:
openSUSE 11.3 (debug, i586, x86_64)
Comment 27 Swamp Workflow Management 2011-01-13 14:59:10 UTC
Update released for: kpartx, multipath-tools, multipath-tools-debuginfo, multipath-tools-debugsource
Products:
SLE-DEBUGINFO 11-SP1 (i386, ia64, ppc64, s390x, x86_64)
SLE-DESKTOP 11-SP1 (i386, x86_64)
SLE-SERVER 11-SP1 (i386, ia64, ppc64, s390x, x86_64)
SLES4VMWARE 11-SP1 (i386, x86_64)
Comment 28 Shawn Starr 2011-01-19 07:14:55 UTC
Can I get some additional clarification:

There seems to be TWO problems here:

1) Mounting system filesystems /var /home /usr - breaks with mkinitrd - Fine, you've documented that above but that is for that use case only.

2) mounting non-system or non-root '/' filesystems breaks - This is still not fixed.

I have hit this problem with OpenSuSE 11.x and SLES 11 SP1, trying to mount a data partition (let's call it /somedata). Worse, even if I use /dev/disk/by-id or by-uuid in /etc/fstab, fsck fails and brings to emergency root login. 

telling open-iscsi to use 'automatic' doesn't work either. I use nofail to at least get the machine booted, but this isn't very good.

So, are you saying that what USED to work in OpenSuSE 10.x/SLES 10.x no longer works in SLES 11 SP1 or in OpenSuSE?

It sounds like I'm going to have to write an initscript wrapper, that will parse /etc/fstab, grab the uuid/by-id device name and mount it explicitly after network and open-iscsi has attached disk.

This doesn't seem fully resolved, reopening to get attention.
Comment 29 Dennis Olsson 2011-09-14 11:40:24 UTC
A work-around to fix the issue with not getting "onboot" configured iSCSI sessions attached during system boot up, when root is not on an iSCSI device, is to make use of the overriding feature of "mkinitrd" (only documented in the sources) by creating the file "/etc/sysconfig/initrd" containing the setting of "root_iscsi=1".

See the attached template of what a "/etc/sysconfig/initrd" could look like.

Please, be aware of the bug in "mkinitrd" (bug 717590) that prevents the overriding feature from working correctly, although this bug does not prevent the setting of "root_iscsi=1" from working.

After having set "root_iscsi=1" in "/etc/sysconfig/initrd", all you need is to rerun "mkinitrd" to get an "initrd" that will attach all "onboot" configured iSCSI sessions during system boot up.
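As a minimal sketch of this work-around, the override file needs only the one assignment. The attached template contains further settings, which are not reproduced here.

```shell
# /etc/sysconfig/initrd
# Ask mkinitrd to include the iSCSI stack even though "/" is on a local disk.
root_iscsi=1
```

After saving the file, rerun "mkinitrd" so that the new "initrd" picks up the setting.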
Comment 30 Dennis Olsson 2011-09-14 11:44:36 UTC
Created attachment 450674 [details]
Template of how "/etc/sysconfig/initrd" could look like.
Comment 31 Dennis Olsson 2011-09-14 12:49:40 UTC
Created attachment 450695 [details]
Patch fixing reported bug, enabling definitions of override variables in "/etc/sysconfig/initrd" as well as adding a "/etc/sysconfig/initrd" template file.

Attached patch for "mkinitrd-2.4.1-0.14.1" contains patching of "/sbin/mkinitrd" (see also bug 717590) and the addition of the file "/etc/sysconfig/initrd" (see comment 29).
The inclusion of "/etc/sysconfig/initrd" in the packages allows for easier usage of "mkinitrd", when one has special settings, just as it documents a nice feature of "mkinitrd" that has been "hidden" away in the source code. ;-)
Comment 32 Dennis Olsson 2011-09-14 12:56:49 UTC
Created attachment 450697 [details]
Patch fixing missing inclusion of iSCSI stack, when having "onboot" configured iSCSI sessions.

Attached patch for "open-iscsi-2.0.871-0.29.1" contains patching of "/etc/init.d/open-iscsi" & "/lib/mkinitrd/scripts/setup-iscsi.sh" and the removal of "/etc/init.d/boot.open-iscsi".

This patch fixes the original reported problem (see comment 0) -- just as it makes the work-around described in comment 29 above unnecessary -- by changing the logic in "/lib/mkinitrd/scripts/setup-iscsi.sh" to look for all iSCSI devices / sessions that have their start up configured to "onboot" ("node.conn[0].startup = onboot") and when at least one is found ensures that "mkinitrd" includes the iSCSI session attachment of these during system boot up in "initrd".
The patch also removes "/etc/init.d/boot.open-iscsi", because, as far as I could see during my testing, this script is a NOOP script that is always either skipped (because iSCSI was already loaded by "initrd") or fails (because it cannot start the "/sbin/iscsid" daemon, since the necessary kernel modules have not been loaded).
Granted, I have not tested the case where iSCSI is being used during the install, where the script might be of use (see also comment 33 below).
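The detection logic described above can be sketched as follows. This is not the content of the attached patch; the helper name, the default node database directory and the way the result feeds "root_iscsi" are assumptions made for illustration.

```shell
# Succeed if at least one node record below the given directory contains
# "node.conn[0].startup = onboot". has_onboot_nodes is a hypothetical helper;
# the default directory is an assumption about where the node database lives.
has_onboot_nodes() {
    nodes_dir=${1:-/etc/iscsi/nodes}
    [ -d "$nodes_dir" ] || return 1
    grep -rqsE '^node\.conn\[0\]\.startup[[:space:]]*=[[:space:]]*onboot[[:space:]]*$' "$nodes_dir"
}

# A setup script could then enable the iSCSI stack automatically instead of
# relying on a manual override (assumed usage, not taken from the patch).
if has_onboot_nodes; then
    root_iscsi=1
fi
```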
Comment 33 Dennis Olsson 2011-09-14 13:01:56 UTC
Created attachment 450703 [details]
Alternative patch to "/etc/init.d/boot.open-iscsi".

For the case where script "/etc/init.d/boot.open-iscsi" is needed when iSCSI is being used during system installation, please, make use of this patch of the script instead of removing it.
Comment 34 Dennis Olsson 2011-09-28 13:23:05 UTC
Issue was reopened in Novell Support Center under

Service Request: 10724107481
    Description: Server refuses to boot, while used iSCSI based non "/" file system is not available.

requesting subject to be handled in time for SLES 11 SP2 as well as an update to SLES 11 SP1.
Comment 36 Hannes Reinecke 2011-10-10 11:43:49 UTC
boot.open-iscsi is _NOT_ a NOOP.

It is absolutely essential if you want to run with root on iSCSI.

Problem here is that while the iSCSI connection works perfectly without any daemon, the error recovery logic does _not_.
And we have to stop the daemon at the end of the initrd run as we're switching over to a new root fs, and can't have any daemons running which would still be referencing the old root fs.

So between the time the iscsi daemon is stopped at the end of initrd run and the daemon being started from the 'normal' init scripts no iSCSI exception handling can take place.
Any network interruption here will lead to an immediate system stall, with no chance of recovery other than reboot.
Hence we should keep this window as small as possible.

Now, as iscsi is normally dependent on the network setup, the daemon would be started rather late. The purpose of the boot.open-iscsi script is to start the daemon _as early as possible_, basically just after udev has started.

Experience shows that the highest I/O traffic is during mounting / checking of the local filesystems, as then fs logs will be replayed etc. Which also means that any I/O exception is most likely to occur at this time.
If we were to delete the boot.open-iscsi script no daemon would be running at this time and the system would stall. With the daemon running the error recovery logic would relogin into the iSCSI target and the system boots as normal.
Comment 37 Dennis Olsson 2011-10-10 12:46:17 UTC
Well, yes, I thought that might be the case.   I do not have a setup, where I am using root on an iSCSI device, and thus was unsure whether "boot.open-iscsi" would be kicking in these cases (assumed so, though;-), and have now been confirmed that it is so).

The reason for the suggested alternative changes (as opposed to the removal of "boot.open-iscsi") in comment 33 is that in my test cases, the unconditional call of "startproc $DAEMON $ARGS" in "boot.open-iscsi" resulted in this script bailing out, because the iscsi daemon ($DAEMON) is already running at this point in time!

Using the suggested patch, though, would ensure that the daemon only gets started in cases where it is really not running, just as it ensures that "boot.open-iscsi" is capable of continuing to run its "start" section to the end without bailing out too early.
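The idea of the guard can be sketched like this. It is not the attached patch itself; "checkproc" and "startproc" are the SUSE init helpers, while the wrapper name is made up for illustration.

```shell
# Start the daemon only when it is not already running.
# start_daemon_if_needed is a hypothetical wrapper around the SUSE helpers.
start_daemon_if_needed() {
    daemon=$1
    shift
    if checkproc "$daemon"; then
        return 0    # already running, e.g. started earlier from the initrd
    fi
    startproc "$daemon" "$@"
}

# Example inside the "start" section: start_daemon_if_needed "$DAEMON" $ARGS
```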
Comment 40 Lee Duncan 2012-02-29 23:31:06 UTC
I am trying to understand the current state of this bug, and what patches are still needed to fix the problem.

It sounds like you want /etc/init.d/boot.open-iscsi to be modified to only try to start the iscsi daemon if it is not already running, as per attachment (id=450703).

Do you also want changes to /etc/init.d/open-iscsi from the previous attachment (id=450697)?

Lastly, are the patches that Hannes suggested already handled? It looks like one of them is, from comment #8, but not sure about the one from comment #7.

And what about /etc/sysconfig/initrd? I am really not sure about adding such a template to fix this problem, when only the "root_iscsi=1" option is needed here.
Comment 41 Dennis Olsson 2012-03-11 21:48:47 UTC
> It sounds like you want /etc/init.d/boot.open-iscsi to be modified to only try
> to start the iscsi daemon if it is not already running, as per attachment (id=450703).

Correct (see also comment 39 for reason why).

> Do you also want changes to /etc/init.d/open-iscsi from the previous attachment (id=450697)?

Yes, also correct.
The removal of "/etc/init.d/boot.open-iscsi" in this attachment must be ignored (as per Hannes' comment 36), because this script is *needed*, although it then needs the fix above to work correctly in cases where iSCSI is used on systems where the root file system is not located on an iSCSI device.

> Lastly, are the patches that Hannes suggested already handled? It looks like
> one of them is, from comment #8, but not sure about the one from comment #7.

You will have to ask Hannes about this.
As far as I am concerned, these patches are unrelated to my original report (as far as I can tell).

> And what about /etc/sysconfig/initrd? I am really not sure about adding such a
> template to fix this problem, when only the "root_iscsi=1" option is needed here.

The "initrd" template suggested in attachment (id=450695) covers other cases than just using an iSCSI device without having the root file system on an iSCSI device.
This template is the sum of my experience with "mkinitrd" over time, where I have been forced to override various oddities found with "mkinitrd", the iSCSI case as reported here just being the latest.

Just to fix the reported iSCSI problem, the template is probably "an overkill", but then again the template covers more than just the iSCSI issue fix, so why not include all of it?  ;-)

PS:
Notice the fix to "mkinitrd" itself, too, in the attachment (id=450695) -- There is a rather annoying bug in "mkinitrd" that needs to be fixed!
See also bug 717590, which was created to keep track of this "follow up" bug.
This bug also contains a newer and better patch to fix this bug.
Comment 42 Lee Duncan 2012-03-27 19:00:11 UTC
Dennis: it does not look like the openSUSE 11.3 open-iscsi package is supported any longer. Are you looking for an 11.3 fix for this, or are you willing to move to a newer version of openSUSE? It looks like 11.4 is still supported.

As for the mkinitrd and template changes, I believe I will fork them off as a separate bug.
Comment 43 Dennis Olsson 2012-03-27 19:48:57 UTC
openSUSE 11.3??

Originally, I opened this bug for SLES 11 GM (see message #0), later on SLES 11 SP1 (message #3 & 10).  And, I believe, the bug still exists in SLES 11 SP2.
Only in message #4, I also confirmed that the issue exists under openSUSE 11.3.

With regard to openSUSE, I can live with no fixes for 11.3, because I would be using 11.4 or more likely 12.1, were I to set up an openSUSE server.

Regarding "mkinitrd", I have already opened bug 717590 to handle the bug.

Regarding the "/etc/sysconfig/initrd" template, well, this one would indeed need a new bug.  .-)
Comment 44 Lee Duncan 2012-04-03 22:06:52 UTC
Dennis:

I have looked at bug 717590, and I believe that we should also break out another part of this bug as a separate issue: I believe that mkinitrd template should be a separate bug. And I believe this current bug should depend on that new bug and on 717590.

I believe I have the open-iscsi part of this bug fixed. Would you like a patch for the open-iscsi source package for SLES 12 SP2 GM?
Comment 45 Dennis Olsson 2012-04-04 15:30:14 UTC
Well, I do not any longer have access to an iSCSI environment, but I would like to give the sources a review (I just cannot test in a live environment presently).

For SLES 12 SP2 GM???   I do believe, you mean SLES 11 SP2 GM, right?
Comment 46 Dennis Olsson 2012-04-04 19:07:35 UTC
Lee,

As requested, I have created bug 755748 to handle the "initrd" template, and made this bug depend on this bug and on 717590.
Comment 48 Lee Duncan 2012-04-05 17:29:13 UTC
(In reply to comment #45)
> Well, I do not any longer have access to an iSCSI environment, but I would like
> to give the sources a review (I just cannot test in a live environment
> presently).
> 
> For SLES 12 SP2 GM???   I do believe, you mean SLES 11 SP2 GM, right?

My apologies: I was off-by-one. Yes, I meant SLES 11 SP2 GM.

Note: I am trying to reproduce this problem to verify your suggested changes, and I am not having luck. Perhaps you can help me. Here's what I'm doing:

* Created an iSCSI target volume on a remote system
* Set up my SLE 12 SP2 GM client with open-iscsi
* Made sure client could connect to target
* Created a filesystem on the target disk from the client (EXT3)
* Set node.startup and node.conn[0].startup to "onboot"
* Updated /etc/init.d/boot.open-iscsi and /etc/init.d/open-iscsi as you suggested
* Added your suggested initrd template and ran mkinitrd
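For reference, setting both startup fields as in the steps above can be sketched with "iscsiadm" in node mode. The wrapper name, target name and portal below are placeholders, not values from this setup.

```shell
# Set node.startup and node.conn[0].startup of one node record to the given mode.
# set_node_startup is a hypothetical wrapper; target and portal are placeholders.
set_node_startup() {
    target=$1
    portal=$2
    mode=$3
    for field in node.startup 'node.conn[0].startup'; do
        iscsiadm -m node -T "$target" -p "$portal" -o update -n "$field" -v "$mode" || return 1
    done
}

# Example: set_node_startup iqn.2012-04.example:vol0 192.0.2.10:3260 onboot
```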

When I reboot, the /etc/init.d/boot.open-iscsi service finds no sessions, since it appears to not have logged into the target.

And when I am fully up and running (after /etc/init.d/open-iscsi has run), the session still is not present, i.e. no login has been attempted to my target!

If I change the "startup" variables to "automatic" instead of "onboot", then I get automatic login to the target, so when I get to full multi-user mode, the disk is present (in my case, as /dev/sdc).

So why is this not working when I set startup to "onboot", i.e. who/what is supposed to login to the target when its startup value is set to onboot? Is it initrd? Do I need to make changes to initrd itself?
Comment 49 Dennis Olsson 2012-04-06 11:16:17 UTC
Hmmm, Lee, assuming that you are off-by-one once again regarding SLE (I believe, you are working too hard on openSUSE 12.2 at the moment;-), it looks like you are doing exactly what I have been doing.

Except, as far as I can judge from your description, it seems that you might have forgotten to activate the out-commented line "#root_iscsi=1" (found almost at the end of the "initrd" file) *before* you ran "mkinitrd".

Without "root_iscsi=1" activated in the "initrd" sysconfig file, the iSCSI stack (including the "open-iscsi" script) is not packed into the "initrd" file, and without it iSCSI does not work, as you have noticed.
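A minimal sketch of activating that line, assuming the template contains it exactly as "#root_iscsi=1" at the start of a line. The function name is hypothetical; "mkinitrd" still has to be run afterwards.

```shell
# Enable a commented-out "#root_iscsi=1" line in a sysconfig-style file.
# Usage: enable_root_iscsi /etc/sysconfig/initrd
enable_root_iscsi() {
    file=$1
    tmp=$(mktemp) || return 1
    # Strip the leading "#" only from the root_iscsi=1 line; all other
    # lines, including real comments, are copied unchanged.
    sed 's/^#[[:space:]]*root_iscsi=1[[:space:]]*$/root_iscsi=1/' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```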

The difference between "onboot" and "automatic" is:

* "onboot" means that the iSCSI device has to be attached as soon as the system is being booted (this might have been done during the BIOS / UEFI setup phase!), or at least as soon as possible after the boot.

* "automatic" means that the system should attach the iSCSI device automatically when it runs its iSCSI stack setup. This typically means when running the iSCSI start-up scripts; i.e. under SLE / openSUSE, when going to run level 3 or 5.
---------------------
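
To illustrate the distinction, here is a small sketch that reads a node record and reports whether it is an "onboot" session, i.e. one that needs the iSCSI stack in the "initrd". It assumes the "name = value" layout that "iscsiadm -m node -T <target> -p <portal>" prints; both function names are hypothetical.

```shell
# Print the value of node.startup from a node record read on stdin.
node_startup_mode() {
    sed -n 's/^node\.startup[[:space:]]*=[[:space:]]*//p' | head -n 1
}

# Return 0 if the node record on stdin is configured as "onboot",
# which means the login has to happen before the init scripts run.
needs_iscsi_in_initrd() {
    [ "$(node_startup_mode)" = "onboot" ]
}
```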

Please, notice that if you only implemented the fixes for the "/etc/init.d/*open-iscsi" scripts, which is how I read your description, then you need to activate the "root_iscsi=1" variable setting in the "initrd" sysconfig file.

If you on the other hand have implemented the fixes for "mkinitrd" and the "/etc/init.d/*open-iscsi" scripts (see comment 32 and comment 33), then you would be able to get a correctly configured "initrd" without having to make use of the "initrd" sysconfig file.

In fact, if the fixes for the "mkinitrd" and "/etc/init.d/*open-iscsi" as suggested in comment 32 and comment 33 were to be implemented, the "initrd" sysconfig would not really be needed to get the iSCSI stack working correctly.

Please, notice that in comment 32 the script "/etc/init.d/boot.open-iscsi" was deleted.
This is NOT a correct handling of this script (refer to comment 36 and comment 37).
It is therefore important to use the fixes to "/etc/init.d/boot.open-iscsi" as given in comment 33 instead.

Hope this answers your questions.
Comment 50 Lee Duncan 2012-04-10 00:00:59 UTC
I found my problem. I created a /etc/sysconfig/initrd file, and I set root_iscsi=1 and ADDITIONAL_FEATURES="iscsi" at first. But then the network was not getting loaded, so I added "network" to the list of additional features.
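
A sketch of a check for the pitfall described above, where "iscsi" was listed but "network" was not. It assumes /etc/sysconfig/initrd is a plain shell-variable file; the function name is hypothetical.

```shell
# Return 0 if the sysconfig file lists every required feature in
# ADDITIONAL_FEATURES; otherwise name the first missing one.
# Usage: has_initrd_features /etc/sysconfig/initrd network iscsi
has_initrd_features() {
    file=$1
    shift
    # Source the file in a subshell so its variables do not leak out.
    features=$( . "$file" >/dev/null 2>&1; printf '%s' "$ADDITIONAL_FEATURES" )
    for want in "$@"; do
        case " $features " in
            *" $want "*) ;;
            *) echo "missing feature: $want"; return 1 ;;
        esac
    done
    return 0
}
```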

I also updated the mkinitrd script itself, as per your suggestion.

Now, when I reboot, my "/data", which is my iSCSI volume, is mounted at boot time, as desired.

I will update this bug soon with my patches for boot.open-iscsi and open-iscsi, so they are captured. After that, work on SLES will resume when L3 status is confirmed.
Comment 51 Lee Duncan 2012-04-10 01:07:12 UTC
Created attachment 485398 [details]
/etc/sysconfig/initrd for my test

this had just the minimum for my testing
Comment 52 Lee Duncan 2012-04-10 01:08:38 UTC
Created attachment 485399 [details]
My setup-iscsi.sh patch for initrd

My version of this patch differs from the bug filer's, since the script being patched has changed.
Comment 53 Lee Duncan 2012-04-10 01:09:18 UTC
Created attachment 485400 [details]
Patch for /etc/init.d/open-iscsi and boot.open-iscsi
Comment 54 Dennis Olsson 2012-04-15 14:20:58 UTC
Having reviewed attachment id=485400 in comment 53, may I suggest that you change the line reading (now that you are changing the line anyway ;-):

    cat /proc/mounts | sed -ne '/^\/dev\/.*/p' | while read d m t o x; do

to:

    sed -ne '/^\/dev\/.*/p' /proc/mounts | while read d m t o x; do

instead.
The use of "cat" is completely superfluous, using unnecessary resources (for no valid reasons).
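
As a self-contained sketch of the suggested form: sed reads the file directly and feeds the same "while read" loop. The function name and the optional file argument (defaulting to /proc/mounts) are additions for illustration and are not part of the patch.

```shell
# Print "device mountpoint" for every /dev-backed entry in a mounts table.
# $1 = mounts file to read (optional, defaults to /proc/mounts)
list_dev_mounts() {
    sed -ne '/^\/dev\/.*/p' "${1:-/proc/mounts}" | while read -r d m t o x; do
        # d=device, m=mount point, t=fs type, o=options, x=remaining fields
        echo "$d $m"
    done
}
```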
Comment 60 Lee Duncan 2013-01-10 22:45:32 UTC
I have submitted the tested changes to openSUSE:12.2:Update, using maintenance id# 147958. Once this is accepted I will submit these changes to SP2/SP3.
Comment 61 Bernhard Wiedemann 2013-01-10 23:00:11 UTC
This is an autogenerated message for OBS integration:
This bug (630434) was mentioned in
https://build.opensuse.org/request/show/147985 Maintenance /
Comment 64 Lee Duncan 2013-01-12 00:40:11 UTC
Submitted to build service for openSUSE:Factory/open-iscsi (linked to network/open-iscsi) as request id 148154.
Comment 65 Lee Duncan 2013-01-14 18:28:15 UTC
Here are the steps I used to test on all platforms:

- apply patch
- ensure open-iscsi is running
- connect to remote target you want to mount
- set node.conn[0].startup to "onboot" for that target
- add an entry to /etc/fstab to mount that target (I used "/data"), and ensure
   the options are set so that it mounts at boot time (I used "auto,nofail"). The "nofail"
   ensures the boot will succeed even if the mount does not
- set the defaults for mkinitrd in /etc/sysconfig/initrd, as per attached example
- run "mkinitrd -v" (the "-v" is optional, but I like to see what is happening)
- reboot

You can tell if the patch worked if the iSCSI volume you want is mounted after reboot. In my case,
/data was mounted after reboot, so the patch worked.
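
A sketch of how the fstab and post-reboot checks could be scripted. Both function names are hypothetical, and the optional file arguments (defaulting to /etc/fstab and /proc/mounts) exist only so the checks can be tried against sample files.

```shell
# Return 0 if the fstab entry for the mount point has the "nofail" option.
# Usage: fstab_has_nofail /data [fstab-file]
fstab_has_nofail() {
    awk -v mp="$1" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "nofail") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/fstab}"
}

# Return 0 if the mount point appears in the mounts table.
# Usage: is_mounted /data [mounts-file]
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit(found ? 0 : 1) }' \
        "${2:-/proc/mounts}"
}
```

After the reboot, `is_mounted /data && echo "patch worked"` corresponds to the check described above.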
Comment 66 Lee Duncan 2013-01-14 18:29:21 UTC
SP2 fix submitted to SUSE build server as request id 23526.
Comment 67 Benjamin Brunner 2013-01-17 10:54:53 UTC
Update released for openSUSE 12.2.
Comment 68 Lee Duncan 2013-01-21 23:31:54 UTC
The SP2 Update may be backed up behind a major open-iscsi version update for SP2. If that major version update occurs this patch will have to be ported, so it may make sense to block this until that other bug gets fixed.
Comment 69 Lee Duncan 2013-01-24 19:46:00 UTC
Updated bug dependencies, which were backwards for bnc#751056
Comment 70 Bernhard Wiedemann 2013-03-14 21:00:10 UTC
This is an autogenerated message for OBS integration:
This bug (630434) was mentioned in
https://build.opensuse.org/request/show/159468 Maintenance /
Comment 71 Lee Duncan 2013-03-14 21:08:22 UTC
Found that this problem was not yet fixed in openSUSE 12.3 and openSUSE Factory, so changes were submitted today for those two packages.

Still blocked, but not for much longer, on SP2, by bnc#751056.
Comment 72 Bernhard Wiedemann 2013-03-16 09:00:08 UTC
This is an autogenerated message for OBS integration:
This bug (630434) was mentioned in
https://build.opensuse.org/request/show/159644 Factory / open-iscsi
Comment 73 Benjamin Brunner 2013-03-19 12:46:38 UTC
Update released for openSUSE 12.3 and checked into Factory.
Comment 74 Swamp Workflow Management 2013-03-19 13:04:45 UTC
openSUSE-RU-2013:0483-1: An update that has two recommended fixes can now be installed.

Category: recommended (important)
Bug References: 630434,766300
CVE References: 
Sources used:
openSUSE 12.3 (src):    open-iscsi-2.0.870-47.4.1
Comment 75 Swamp Workflow Management 2013-04-05 11:21:07 UTC
The SWAMPID for this issue is 52031.
This issue was rated as low.
Please submit fixed packages until 2013-05-03.
Also create a patchinfo file using this link:
https://swamp.suse.de/webswamp/wf/52031
Comment 76 Lee Duncan 2013-05-16 19:04:53 UTC
Updated the patch for current SP2 open-iscsi.

Tested that open-iscsi came up and worked on a normal (non-boot-time) target.

Then, tested this patch as per Comment#65, i.e. with patch in place and open-iscsi up and running:

- connected to remote target
- set node.conn[0].startup to "onboot" for that target
- added entry for that target in /etc/fstab with options
  "auto,nofail", then mounted it
- set /etc/sysconfig/initrd to:

        ADDITIONAL_FEATURES="network iscsi"
        root_iscsi=1

- ran "mkinitrd -v"
- rebooted

And /data is mounted after boot.

I will submit the updated SP2 patch today, then SP3 shortly after.
Comment 77 Lee Duncan 2013-05-17 00:06:49 UTC
Submitted to build service for SP2, request id 26345.
Comment 79 Lee Duncan 2013-05-20 20:57:47 UTC
Changes submitted to build service for SP3, request id 26423.
Comment 81 Lee Duncan 2013-05-21 15:45:11 UTC
The fix for this bug is integrated now in SP3, but still under review in SP2. So bug can remain open until an SP2 decision is made.
Comment 82 Lee Duncan 2013-06-03 10:57:07 UTC
I would like to add this simple patch to SP2 open-iscsi, which fixes handling of boot-time non-root iSCSI volumes.
Comment 88 Lee Duncan 2013-06-26 11:06:55 UTC
I propose to *not* fix this in an SP2 update, since SP2 is more mature now than when this bug was filed, and consequently is not an appropriate place for a fix with as much risk as this fix has.

Therefore, once the submitted SP3 fix is accepted I will mark this bug as resolved.
Comment 89 Lee Duncan 2013-06-26 11:28:30 UTC
Just checked and found that the version of this fix that went into openSUSE:12.3 and openSUSE:Factory has the problem mentioned in bnc#823363, so those two releases need to be updated, as well.

The good news: it should be the same fix that went into SP3.
Comment 90 Lee Duncan 2013-06-27 15:06:38 UTC
The openSUSE fix for this will be merged into the openSUSE open-iscsi update (bnc#821695), so marking this bug as dependent on that one. When that one gets fixed, this bug will be complete.
Comment 91 Lee Duncan 2013-06-27 15:31:57 UTC
I am removing dependence on bnc#717590 and bnc#755748.

Both bugs are about updating mkinitrd to support a sysconfig defaults file, which is not needed to support a non-root boot-time iSCSI volume.
Comment 92 Lee Duncan 2013-07-23 01:12:04 UTC
Since bnc#821695 is now fixed, and that fix included the changes to fix this bug, marking this one as complete.
Comment 93 Swamp Workflow Management 2014-04-07 20:35:22 UTC
The SWAMPID for this issue is 56918.
This issue was rated as low.
Please submit fixed packages until 2014-05-05.
Also create a patchinfo file using this link:
https://swamp.suse.de/webswamp/wf/56918
Comment 94 SMASH SMASH 2014-04-15 07:40:59 UTC
Affected packages:

SLE-11-SP2: open-iscsi
Comment 95 SMASH SMASH 2014-04-22 14:41:04 UTC
Affected packages:

SLE-11-SP2: open-iscsi
Comment 96 SMASH SMASH 2014-05-15 12:40:26 UTC
Affected packages:

SLE-11-SP2: open-iscsi
Comment 97 Swamp Workflow Management 2014-05-26 20:46:56 UTC
Update released for: open-iscsi, open-iscsi-debuginfo, open-iscsi-debugsource
Products:
SLE-DEBUGINFO 11-SP3 (i386, ia64, ppc64, s390x, x86_64)
SLE-DESKTOP 11-SP3 (i386, x86_64)
SLE-SERVER 11-SP3 (i386, ia64, ppc64, s390x, x86_64)
SLES4VMWARE 11-SP3 (i386, x86_64)
Comment 98 Swamp Workflow Management 2014-05-27 00:04:35 UTC
SUSE-RU-2014:0714-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (low)
Bug References: 630434,831934,834256,867657,867934
CVE References: 
Sources used:
SUSE Linux Enterprise Server 11 SP3 for VMware (src):    open-iscsi-2.0.873-0.23.1
SUSE Linux Enterprise Server 11 SP3 (src):    open-iscsi-2.0.873-0.23.1
SUSE Linux Enterprise Desktop 11 SP3 (src):    open-iscsi-2.0.873-0.23.1
Comment 99 Bernhard Wiedemann 2016-04-15 12:57:47 UTC
This is an autogenerated message for OBS integration:
This bug (630434) was mentioned in
https://build.opensuse.org/request/show/57457 Factory / sysconfig
https://build.opensuse.org/request/show/57886 11.3:Test / sysconfig