Bug 335486 - network interface not set up during boot due to a separate /dev/shm tmpfs
Summary: network interface not set up during boot due to a separate /dev/shm tmpfs
Status: RESOLVED WONTFIX
Duplicates: 355786 410367 435189 435880 497924
Alias: None
Product: openSUSE 10.3
Classification: openSUSE
Component: Network
Version: Final
Hardware: i686 openSUSE 10.3
Importance: P5 - None : Normal with 5 votes
Target Milestone: ---
Assignee: Marius Tomaschewski
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 553794
Reported: 2007-10-20 14:29 UTC by Walter Haidinger
Modified: 2009-12-14 10:53 UTC (History)
CC List: 12 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Files requested by comment #7 (5.11 KB, application/x-gzip)
2008-02-15 08:10 UTC, Walter Haidinger
Comment #7 test results with most recent packages (15.27 KB, text/plain)
2008-05-09 08:05 UTC, Marius Tomaschewski
boot.msg from system with ethernet driver in initrd (50.37 KB, text/plain)
2008-05-09 11:36 UTC, Marius Tomaschewski
rcnetwork.log from Peter (132.18 KB, text/x-log)
2008-05-20 19:44 UTC, Peter Küppers
rcnetwork.set from Peter (11.45 KB, text/plain)
2008-05-20 19:45 UTC, Peter Küppers
My 6 logs/sets from rcnetwork (32.24 KB, application/x-bzip)
2008-05-20 20:22 UTC, Marc Munnen
/etc/init.d/network patch to skip unconverted ifcfg-<hwdesc> files (842 bytes, patch)
2008-05-22 12:39 UTC, Marius Tomaschewski
debuginfo from Peter (514.85 KB, application/x-tgz)
2008-05-22 19:37 UTC, Peter Küppers
debug info as requested, from Marc (372.55 KB, application/x-compressed-tar)
2008-05-23 20:55 UTC, Marc Munnen
boot.msg with udev debug from Peter (245.82 KB, text/plain)
2008-05-28 12:49 UTC, Peter Küppers
Not Working machine /etc/udev /etc/sysconfig/network (78.44 KB, application/x-compressed-tar)
2008-12-18 00:15 UTC, Kai Lappalainen
not working machine /etc/sysconfig/network before upgrade (56.27 KB, application/x-compressed-tar)
2008-12-18 00:16 UTC, Kai Lappalainen
working machine /etc/udev /etc/sysconfig/network (79.17 KB, application/x-compressed-tar)
2008-12-18 00:19 UTC, Kai Lappalainen
working machine /etc/sysconfig/network before upgrade (61.34 KB, application/x-compressed-tar)
2008-12-18 00:20 UTC, Kai Lappalainen
boot.localfs (11.1) patch to honour /dev options (size) from /etc/fstab (551 bytes, patch)
2009-01-09 16:44 UTC, Marius Tomaschewski
Output from scripts/extradebug (621.85 KB, application/x-compressed-tar)
2009-01-16 21:36 UTC, Marc Munnen

Description Walter Haidinger 2007-10-20 14:29:30 UTC
After upgrading from openSUSE 10.2 to 10.3, the /etc/init.d/network script
fails to configure the network interface (traditional method, no NetworkManager). It worked without problems under 10.2.

The script times out waiting for the mandatory device eth1 now:
 eth1  device: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13).
 eth1  DHCP client NOT running
 eth1  is down

And indeed, as ifconfig shows, the interface isn't up.
However, after booting, 'rcnetwork start' or 'yast2 network' works.

Unfortunately /var/log/messages isn't very helpful; it only shows:
  skge eth1: Link is up at 1000 Mbps, full duplex
  ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
  skge eth1: disabling interface
Please note that there is no delay between the above syslog entries (i.e. same timestamp). The interface seems to be brought down right after it was brought up.

Since the above does not provide much detail, I'd appreciate some hints
on how to diagnose the problem further. Thanks!
Comment 1 Walter Haidinger 2007-10-20 14:43:17 UTC
Please note that the interface is not set up with a static IP either.
It is therefore obviously not a DHCP client problem.
Comment 2 Walter Haidinger 2007-10-20 15:26:14 UTC
Well, the network script waits forever for hotplug to set up eth1 ("...still waiting for hotplug devices"). I manually load the skge module before /etc/init.d/network is called. Could this be causing the problem now?
Comment 3 Walter Haidinger 2007-10-20 15:35:08 UTC
/var/log/boot.msg with network debugging enabled:
(please note the "unknown option" error messages!)

<notice>network start
start

CONFIG      = 
INTERFACE   = 
AVAILABLE_IFACES =  
PHYSICAL_IFACES  = 
DIALUP_IFACES    = 
TUNNEL_IFACES    = 
MANDATORY_DEVICES = eth1 __NSC__ 
SKIP             = 
Setting up network interfaces:
    lo        returned 0
done... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES=eth1 __NSC__
Time to wait: 42
Waiting for mandatory devices:  eth1 __NSC__
42 ... still waiting for hotplug devices:
SUCCESS_IFACES= lo
[--cut--]
Time to wait: 1
 ... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__

eth1 -o rc onboot

    eth1      device: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)
eth1 eth1 -o rc onboot
    eth1      DHCP client NOT running
    eth1      is down
eth1 eth1 -o rc onboot
unknown option rc ignored
'eth1' is not wireless, exiting
eth1 eth1 -o rc onboot
unknown option rc ignored
interface eth1 is not up
    eth1      returned 7
failed    eth1      interface could not be set up until now
failed... final
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__
FAILED=1
noiface -o rc onboot
Setting up service network  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .failed
<notice>'network start' exits with status 7
Comment 4 Walter Haidinger 2007-10-22 11:52:50 UTC
Here is a workaround (not a fix). Ugly, but it works for me.
Exclude the eth interfaces from the physical/non-physical detection:

--- sysconfig-0.70.2-4/etc/init.d/network   2007-09-22 00:12:40.000000000 +0200
+++ /etc/init.d/network			    2007-10-22 12:18:54.000000000 +0200
@@ -470,7 +470,8 @@
 # later in the start section if it is considered mandatory (see next section).
 for a in $(type_filter `ls -A /sys/class/net/`); do
 	case "`get_iface_type $a`" in
-		eth|tr|wlan)	
+		eth) ;;
+		tr|wlan)	
 			STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
 			if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
 				if [ ! -e "$STAMPFILE" ] ; then


Nevertheless, the ethernet interfaces should be brought up at boot if configured that way (start=onboot), regardless of udev or prior module loading.
Comment 5 Marc Munnen 2007-12-08 22:10:16 UTC
I have the same problem. I upgraded SUSE 10.2 to 10.3 on two PCs, and both failed to bring up the network cards during boot. I tried almost every configuration option in YaST for my 3 network cards, but only letting NetworkManager start them gave me at least one working card. Alas, the other 2 are greyed out and so can't be started. Also, advanced routing does not work with NetworkManager. (And, strangely enough, it refuses my netmask of 255.255.255.0 and uses 255.0.0.0.)
In contrast, when I do a fresh install on both PCs, all network cards are started at boot with ifup, as expected.
After logging in on my upgraded PCs I can start them with 'rcnetwork start'. But this is hardly a solution, because I can't tell my friend he has to live with that.

So please, please, fix this one way or another.
I must admit that I don't really understand Walter Haidinger's workaround at first sight.

Marc
 
Comment 7 Christian Zoz 2008-02-03 17:35:02 UTC
Please attach your network configuration files.

Would you please try these two things:

1) rcnetwork stop
   rcnetwork start -o boot

2) rcnetwork stop
   unload the driver of your NIC
   rcnetwork start -o boot
   load the module while rcnetwork is waiting for the interface
Comment 8 Walter Haidinger 2008-02-15 08:10:12 UTC
Created attachment 195057
Files requested by comment #7

Hi!

I've attached a tarball with the following files:
* 1.log    - output of 1) above with DEBUG=yes
* 2.log    - output of 2) above with DEBUG=yes
* /etc/sysconfig/network/config
* /etc/sysconfig/network/ifcfg-eth[01]

Unfortunately inserting the module as requested by 2) did not change anything. 
Btw, here is the console mixed output (not in the attached logs) of syslog
logging the console when inserting the forcedeth driver in 2):

MANDATORY_DEVICES= eth1 __NSC__
Time to wait: 18
kernel: forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.60.
Feb 15 08:32:16 banshee kernel: ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 22 (level, high) -> IRQ 19
... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__
Time to wait: 16
kernel: eth0: forcedeth.c: subsystem: 01043:80a7 bound to 0000:00:04.0
... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__

Please tell me if you need anything else.
Comment 9 Peter Küppers 2008-03-24 20:05:59 UTC
Hello Walter and Christian,

First I have to say that I'm not an expert on Bugzilla or on how to participate in it. But I've investigated this bug a bit, and maybe I've found a solution that is of some interest.

In openSUSE 10.2, /etc/init.d/network reads as follows, starting at line 458:
>>>
# Now get all available interfaces drop lo and separate them into physical and
# not physical. Then get AVAILABLE_IFACES sorted to shutdown the not physical
# first.
for a in $(type_filter `ls -A /sys/class/net/`); do
	test "$a" = lo && continue;
	test "$a" = sit0 && continue;
	test "$a" = bonding_masters && continue;
	test "${a#wifi}" != "$a" && continue
	case $a in
		eth*|ath*|wlan*|ra*)
			# Skip these which are too new, they will come via hotplug
#Stamping in rename_netiface
#- at the start: virgin
#- while looping: looping
#- at the end: renamed
#ifup aborts immediately if the network service is not running
#If there is no stamp, stamp it as unknown --> skip
#If the stamp is virgin --> skip
#                looping --> skip
#                renamed --> set up
#In the status loop, when mandatory devices are checked:
#  if status is failed
#    STAMP == unknown && half of the wait time is over -> ifup
#          == virgin/looping/renamed -> nothing
			STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
			if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
				if [ -r "$STAMPFILE" ] ; then
					case "`cat $STAMPFILE`" in
						virgin|looping) continue ;;
					esac
				else
					echo unknown > $STAMPFILE
					continue
				fi
			fi
			;;
	esac
	for b in $DIALUP_IFACES $TUNNEL_IFACES; do
		if [ "$a" = "$b" ] ; then
			NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
			continue 2
		fi
	done
	case $a in
		sit*)
			NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
			continue 2
			;;
	esac
	PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
done
<<<

In openSUSE 10.3, /etc/init.d/network differs, starting at line 472:
>>>
# Now get all available interfaces drop lo and separate them into physical and
# not physical. Then get AVAILABLE_IFACES sorted to shutdown the not physical
# first.
# Interfaces may be renamed by udev after they are registered. In some cases
# this may take some time. Therefore we check a 'renamed' flag if an interface
# is ready to be set up. If an it is not ready now, it will be set up via
# udev/ifup (because network is started now). We will just have to wait for it
# later in the start section if it is considered mandatory (see next section).
for a in $(type_filter `ls -A /sys/class/net/`); do
	case "`get_iface_type $a`" in
		eth|tr|wlan)	
			STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
			if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
				if [ ! -e "$STAMPFILE" ] ; then
					continue # this leaves the for-loop!
				fi
			fi
			;;
		lo|wlan_aux)
			continue
			;;
	esac
	for b in $DIALUP_IFACES $TUNNEL_IFACES; do
		if [ "$a" = "$b" ] ; then
			NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
			continue 2
		fi
	done
	case $a in
		sit*)
			NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
			continue 2
			;;
	esac
	PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
done
<<<
I marked the line with the supposed bug with a comment (this leaves the for-loop!), see above.
Since the for-loop is stopped here, the variable PHYSICAL_IFACES has no value!
And what about "virgin and looping" from openSUSE 10.2?

I hope this gives a hint for a solution to the bug.

Peter
Comment 10 Peter Küppers 2008-03-26 18:49:06 UTC
Adding myself to the CC list.
Comment 11 Marius Tomaschewski 2008-05-09 08:01:08 UTC
The problem seems to be caused by some inconsistency in the scripts
after the update... I cannot reproduce this problem (see attachment
in next comment).

Comment #4 and comment #9 use an old rcnetwork script (not the one from
the latest update). Please update; there were also fixes to the mandatory
wait loop.

(In reply to comment #3 from Walter Haidinger)
> /var/log/boot.msg with network debugging enabled:
> (please note the "unknown option" error messages!)

The "unknown option rc ignored" message is just debug output from the
ifup-wireless script and has no relevance here (it just reports the ignored rc option).

Please update the sysconfig package to the most recent update package:

http://download.opensuse.org/update/10.3/rpm/i586/sysconfig-0.70.2-4.2.i586.rpm
http://download.opensuse.org/update/10.3/rpm/x86_64/sysconfig-0.70.2-4.2.x86_64.rpm

then, please verify the package using
   rpm -V sysconfig

When rpm reports a modification or inconsistency such as:

   # rpm -V sysconfig
   S.5....T    /etc/init.d/network

remove the reported files and install the package again.
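
As a rough aid for reading the verification output, the following sketch decodes a few of the attribute flags that `rpm -V` prints (S = size, M = mode, 5 = checksum, T = modification time). The function name `explain_rpm_verify` is hypothetical and not part of rpm; it only handles the simple line format shown above and assumes paths without spaces.

```shell
#!/bin/sh
# explain_rpm_verify (hypothetical helper): reads "rpm -V" output on stdin
# and prints, for each reported file, which attributes differ from the
# package database. Only a subset of the flags is decoded.
explain_rpm_verify() {
    while read -r flags rest; do
        file=${rest##* }                  # last field is the path
        desc=""
        case "$flags" in
            missing) desc=" missing" ;;
            *)
                case "$flags" in *S*) desc="$desc size" ;; esac
                case "$flags" in *M*) desc="$desc mode" ;; esac
                case "$flags" in *5*) desc="$desc checksum" ;; esac
                case "$flags" in *T*) desc="$desc mtime" ;; esac
                ;;
        esac
        echo "$file:$desc"
    done
}
```

It could be used as `rpm -V sysconfig | explain_rpm_verify`.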

Verify the udev installation using "rpm -V udev" as well.

# ls -l /etc/udev/rules.d/*net*.rules | cut -b 23-
  450  9. Mai 08:17 /etc/udev/rules.d/70-persistent-net.rules
 1518 21. Sep 2007  /etc/udev/rules.d/75-persistent-net-generator.rules
  823 24. Apr 00:26 /etc/udev/rules.d/77-network.rules

The /etc/udev/rules.d/70-persistent-net.rules file is generated and
contains the mapping of the hardware to the interface name.

Please verify that it matches your hardware address (MAC) and the
ifcfg-<interface> files in /etc/sysconfig/network.
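
This check can be scripted. The following is a minimal sketch, not part of the sysconfig package: the function name `check_persistent_rules` and its two arguments are hypothetical, and it assumes rules of the form `ATTR{address}=="<mac>", NAME="<iface>"`. The printed MAC still has to be compared by hand with the hardware address of the card.

```shell
#!/bin/sh
# check_persistent_rules RULES_FILE IFCFG_DIR   (hypothetical helper)
# For every rule that assigns a NAME, print the interface name, the MAC
# address from the rule, and whether a matching ifcfg-<name> file exists.
check_persistent_rules() {
    rules=$1
    cfgdir=$2
    # Drop comments and blank lines, then extract "<mac> <name>" pairs.
    grep -Ev '^#|^$' "$rules" |
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\1 \2/p' |
    while read -r mac name; do
        if [ -e "$cfgdir/ifcfg-$name" ]; then
            echo "$name $mac ok"
        else
            echo "$name $mac missing-ifcfg"
        fi
    done
}
```

A typical call would be `check_persistent_rules /etc/udev/rules.d/70-persistent-net.rules /etc/sysconfig/network`.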

(In reply to comment #9 from Peter Küppers)
> I marked the line with the supposed bug with a comment (this leaves the
> for-loop!), see above.
> Since the for-loop is stopped here, the variable PHYSICAL_IFACES has no value!
> And what's about "virgin and looping" from openSUSE 10.2?
> 
> I hope this gives a hint for a solution to the bug.

No, the scripts / rule files (in sysconfig and udev) were rewritten for 10.3
and are simpler in many places. On 10.3, the "ifcfg-<hardware-description>"
(ifcfg-eth-id-00:01:02:8E:21) support was removed completely; we just use
ifcfg-<interfacename>.

The PHYSICAL_IFACES variable is empty because the interface is not up
yet and we wait for udev to load the modules.
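
This behaviour can be reproduced outside the init script. The sketch below is a simplified model, not the real /etc/init.d/network code: `classify_ifaces` is a hypothetical name, the interface type is guessed from the name instead of calling `get_iface_type`, and the sysfs directory and stamp-file prefix are passed in as arguments so it can run against a scratch directory. It also shows that `continue` skips only the current interface; the loop itself keeps running.

```shell
#!/bin/sh
# classify_ifaces SYSNET_DIR STAMP_STUB   (hypothetical, simplified model)
# An eth/tr/wlan interface is added to PHYSICAL_IFACES only if its stamp
# file (STAMP_STUB followed by the ifindex) already exists.
classify_ifaces() {
    sysnet=$1
    stub=$2
    PHYSICAL_IFACES=""
    for a in $(ls -A "$sysnet"); do
        case "$a" in
            lo) continue ;;                  # loopback is never "physical" here
            eth*|tr*|wlan*)
                stamp=$stub$(cat "$sysnet/$a/ifindex")
                if [ ! -e "$stamp" ]; then
                    continue                 # skip this interface, go on with the next
                fi
                ;;
        esac
        PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
    done
    echo "PHYSICAL_IFACES=$PHYSICAL_IFACES"
}
```

With no stamp files present the result is empty, which matches the debug output in comment #3.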
Comment 12 Marius Tomaschewski 2008-05-09 08:05:44 UTC
Created attachment 213807
Comment #7 test results with most recent packages

Messages logged during "rcnetwork start -o boot debug" as described in 2)
in comment #7, collected with:
egrep "kernel:|ifup:|rcnetwork:" /var/log/messages > messages.txt
Comment 13 Marius Tomaschewski 2008-05-09 08:08:26 UTC
The messages.txt attached in comment #12 also covers the case
of interface renaming (eth0 -> eth1, eth1 -> eth0) in udev rules.
Comment 14 Marius Tomaschewski 2008-05-09 08:27:03 UTC
Walter, Peter,
please also update the kernel to the most recent one (2.6.22.17-0.1).
Comment 15 Walter Haidinger 2008-05-09 10:18:23 UTC
Thanks for the response, Marius!

Unfortunately the sysconfig update to -4.2 did not work for me. Actually,
this explains why my workaround from comment #4 stopped working some time ago
and had to be reapplied (I install the update RPMs automatically via a
custom script, not using YaST. This dates from the days before YaST
provided update functionality. So I obviously overlooked the sysconfig update
in the script's notification email.).

My udev rules and interface renaming are fine and I'm not running an openSUSE kernel, currently vanilla 2.6.25.1.

Please note the following things regarding this bug:
* It only shows during boot. Subsequent rcnetwork calls once logged in succeed.
* It's probably caused by the assumption that _only_ udevd loads the modules.
  But what if the module is already loaded before boot.udev is run?

I'll create a messages.txt as suggested in comment #12 later.
Comment 16 Marius Tomaschewski 2008-05-09 11:36:41 UTC
Created attachment 213890
boot.msg from system with ethernet driver in initrd

(In reply to comment #15 from Walter Haidinger)
> Unfortunately the sysconfig update to -4.2 did not work for me.

And what else is wrong with it?

> Please note the following things regarding this bug:
> * It only shows during boot. Subsequent rcnetwork calls once logged in succeed.
> * It's probably caused by the assumption that _only_ udevd loads the modules.
>   But what if the module is already loaded before boot.udev is run?

No, even if I add the forcedeth driver to the initrd (INITRD_MODULES variable
in /etc/sysconfig/kernel + mkinitrd) so that the driver is loaded before udev
starts, it works fine for me - see attached boot.msg.

Please update correctly.
Comment 17 Marc Munnen 2008-05-09 21:18:28 UTC
Marius,

I was already running kernel 2.6.22.17-0.1 and already had the latest sysconfig. After 'upgrading' to version 0.70.2-4.2, YaST showed me the same old and current version. This upgrade also replaced my 'patched' network script.
Of course my network interfaces were then not started after rebooting.

Your messages.txt told me that your interfaces were not renamed by udev. So, with the faint knowledge I have about udev, I edited some rules in /etc/udev/rules.d/.
That got rid of the renames for old, non-existent interfaces, but it did not solve the bug. Nevertheless, I'm happy with the simple setup I have now, with eth0 and wlan0 and nothing more.

Marc

Comment 18 Peter Küppers 2008-05-18 09:19:05 UTC
Hello Marius,

I also installed the latest sysconfig (sysconfig-0.70.2-4.5) and kernel (kernel-default-2.6.22.17-0.1 x86_64), udev is also OK. The error on boot still persists:
>>>
Waiting for mandatory devices:  eth0 eth1 __NSC__
18 17 16 14 13 11 10 8 7 6 4 3 1 0 
    eth0      device: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
    eth0      is down
failed    eth0      interface could not be set up until now
...
<<<

If I add the following (from openSUSE 10.2) to the network script at line 493 in place of the hard 'continue', the script works fine for me:
>>>
493: if [ ! -e "$STAMPFILE" ] ; then
         case "`cat $STAMPFILE`" in
            virgin|looping) continue ;;
         esac
     else
         echo unknown > $STAMPFILE
         continue
     fi
<<<

It must have something to do with the PHYSICAL_IFACES variable not being filled properly (or something around it, e.g. udev?).

Cheers

Peter





Comment 19 Walter Haidinger 2008-05-19 10:06:40 UTC
Sorry for the delayed answer.

Unfortunately I have to confirm what Marc and Peter report: sysconfig-0.70.2-4.5 does not work for me either; the ethernet interfaces are not set up by /etc/init.d/network during boot (and only during boot!).

Regarding comment #16:
>> Unfortunately the sysconfig update to -4.2 did not work for me.
> And what else is wrong with it?

Sorry, that was a bit misleading. I meant that the update itself worked but this bug still persists. No problems with updates.

Marc, Peter, could you please also do the following to help Marius debug the script: add these 3 lines

  set >> /var/log/rcnetwork.set
  exec >> /var/log/rcnetwork.log 2>&1
  set -v; set -x

at the top of your /etc/init.d/network and reboot. This will log the shell variables during boot and the (expanded) commands that are run.
Marius should then have logs from three different
machines, all experiencing the bug, to compare against his own script.
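
To see what these three lines produce before touching the init script, they can be tried in a throwaway subshell. This is only an illustrative sketch: `trace_demo`, `DEMO_VAR` and the `demo.set` / `demo.log` file names are made up for the example, and the output goes to a directory of your choice rather than /var/log.

```shell
#!/bin/sh
# trace_demo LOGDIR   (hypothetical demo of the three tracing lines)
# Runs in a subshell so the redirection and tracing do not affect the caller.
trace_demo() {
    logdir=$1
    (
        DEMO_VAR=hello
        set >> "$logdir/demo.set"           # dump all shell variables
        exec >> "$logdir/demo.log" 2>&1     # send stdout and stderr to the log
        set -v; set -x                      # echo input lines and expanded commands
        echo "value is $DEMO_VAR"
    )
}
```

After `trace_demo /tmp`, demo.set holds the variable dump and demo.log holds both the command trace and the command output.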
Comment 20 Peter Küppers 2008-05-20 19:44:33 UTC
Created attachment 217033
rcnetwork.log from Peter
Comment 21 Peter Küppers 2008-05-20 19:45:53 UTC
Created attachment 217035
rcnetwork.set from Peter
Comment 22 Peter Küppers 2008-05-20 19:50:30 UTC
Hello,

as recommended by Walter, I added the lines to /etc/init.d/network and rebooted.
See my attachments for the result.

(This is /etc/init.d/network with my modifications, see above!)

Cheers

Peter
Comment 23 Marc Munnen 2008-05-20 20:22:53 UTC
Created attachment 217048
My 6 logs/sets from rcnetwork
Comment 24 Marc Munnen 2008-05-20 20:24:37 UTC
Hi,

Like Peter, I did what Walter advised (yesterday).
Now I have 6 logs/sets.
- rcnetwork-changed.* : my patched/changed rcnetwork to make it work at boot time.
- rcnetwork-org.* : my original rcnetwork that does not do the job at boot time
Additionally:
- rcnetwork.* : output from the original rcnetwork with 'rcnetwork start' after booting

All this in the hope it will provide some useful information.
Good luck,

Marc
 
Comment 26 Marius Tomaschewski 2008-05-22 12:01:37 UTC
(In reply to comment #24 from Marc Munnen)
> All this in the hope it will provide some useful information.

Yes and no:

++ ls -d /etc/sysconfig/network/ifcfg-eth0 /etc/sysconfig/network/ifcfg-eth3~
/etc/sysconfig/network/ifcfg-eth5~ /etc/sysconfig/network/ifcfg-lo
/etc/sysconfig/network/ifcfg-type-wlan /etc/sysconfig/network/ifcfg-wlan0
/etc/sysconfig/network/ifcfg-wlan0-bu~ /etc/sysconfig/network/ifcfg-wlan0~
[...]
+ MANDATORY_DEVICES=' eth0 type-wlan wlan0 __NSC__ '

This means there is a problem: ifcfg-type-wlan has not been
converted -- the hwdesc2iface script is unable to handle it.

We will check with Christian (in NEEDINFO) whether it can be converted somehow;
for now, just move it out of the way (rename it to ifcfg.type-wlan) and delete
the "ifcfg-*~" files. This should fix your problem, Marc.


Apart from the obsolete ifcfg-<hwdescr> files mentioned above:

It is not the /etc/init.d/network script that causes the problems, but something
in udev and ifup. So please provide log files as described below.


At http://www.suse.de/~mt/openSUSE/10.3/, in 10.3-<arch> subdirectories,
you'll find sysconfig RPMs with extra debugging enabled.

Please install them, reset your /var/log/messages with:

bzip2 -9c < /var/log/messages \
          > /var/log/messages-$(date +%Y%m%d).bz2 && \
cp /dev/null /var/log/messages

and _reboot_ (no, really not a joke - I want to see what udev is doing).

After the reboot, please collect files and create an archive using e.g.:

mkdir /tmp/bug-355786-extradebug
cp -a /dev/shm/sysconfig/*                /tmp/bug-355786-extradebug/
cp -a /var/log/messages /var/log/boot.msg /tmp/bug-355786-extradebug/
cd    /tmp/bug-355786-extradebug/

#### replace wireless keys / passwords / other secrets with XXXXXX

tar cvzf /tmp/bug-355786-extradebug.tgz *

And attach the archive as (private) attachment to this bug.
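
The redaction step can be partly automated. The filter below is only a sketch under assumptions: `redact_secrets` is a hypothetical name, and the variable names it matches (`WIRELESS_KEY`, `WIRELESS_KEY_<n>`, `WIRELESS_WPA_PSK`, `WIRELESS_WPA_PASSWORD`) are a guess at which settings carry secrets. Keys that show up in other forms, for example inside log lines, still have to be checked by hand.

```shell
#!/bin/sh
# redact_secrets (hypothetical filter): reads text on stdin and replaces the
# value of assumed wireless key/password assignments with 'XXXXXX'.
redact_secrets() {
    sed -E "s/^([[:space:]]*WIRELESS_(KEY(_[0-9]+)?|WPA_PSK|WPA_PASSWORD)=).*/\1'XXXXXX'/"
}
```

For example, `redact_secrets < ifcfg-wlan0 > ifcfg-wlan0.redacted` writes a cleaned copy and leaves the original untouched.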
Comment 27 Marius Tomaschewski 2008-05-22 12:39:25 UTC
Created attachment 217531
/etc/init.d/network patch to skip unconverted ifcfg-<hwdesc> files

Marc,
you can try to apply this patch. It should skip the ifcfg-type-wlan file.
Comment 28 Peter Küppers 2008-05-22 19:37:39 UTC
Created attachment 217617
debuginfo from Peter
Comment 29 Marc Munnen 2008-05-23 20:55:24 UTC
Created attachment 217913
debug info as requested, from Marc

Marius,

As suggested, I deleted the "ifcfg-*~" files. You said: This should fix your problem Marc. But what problem? I also renamed ifcfg-type-wlan. There is no need to convert this one, I don't need it.
I rebooted with the original network script. The network was not started.

OK, I installed your sysconfig RPMs and collected the information.
The WLAN parts of these files don't matter for now; it's really only eth0 that is important for me (and for the bug).

I hope this will reveal something.

Regards,
Marc
Comment 30 Walter Haidinger 2008-05-26 10:08:11 UTC
Marius,

I was finally able to create the debug info as requested by comment #26.
Since I do not know how to make a private attachment (does somebody?),
I'll simply send it to you directly. 

The tarball also contains some *.log and *.set files (see comment #19).

Removing the NEEDINFO status because all three of us have provided the requested debug info. If you need additional information, please let us know!

I'm curious, though, why you need the /var/log/messages file too, because
syslog is started _after_ network and the script only breaks during boot.
Comment 31 Marius Tomaschewski 2008-05-26 13:41:56 UTC
(In reply to comment #30 from Walter Haidinger)
> Marius,
> 
> I was finally able to create the debug info as requested by comment #26.
> Since I do not know how to make a private attachment (does somebody?),
> I'll simply send it to you directly.

OK, thanks!

In your case and in Peter's, "ifup eth0 -o hotplug" is never called
(same for eth1). This means you have a problem with the udev rules.

/etc/udev/rules.d # ls -l *net*
-rw-r--r-- 1 root root  450 2008-05-26 14:59:09 70-persistent-net.rules
-rw-r--r-- 1 root root 1518 2007-09-21 21:12:39 75-persistent-net-generator.rules
-rw-r--r-- 1 root root  823 2008-05-22 12:39:27 77-network.rules

There should be one rule for each physical network device, e.g.:

/etc/udev/rules.d # grep -Ev "^#|^$" 70-persistent-net.rules 
SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:17:31:ca:a5:a5", NAME="eth0"
SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:17:31:ca:a3:92", NAME="eth1"

Please verify using "rpm -V udev" and "rpm -V syslog" that the
other two rule files have not been modified:

# rpm -qf /etc/udev/rules.d/75-persistent-net-generator.rules
udev-114-19
# rpm -qf /etc/udev/rules.d/77-network.rules 
sysconfig-0.70.2-4.6


The 77-network.rules file is responsible for "marking" the interface
as available by creating the $STAMPFILE:

==> 77-network.rules:
[...]
SUBSYSTEM=="net", ACTION=="add", RUN+="/sbin/ifup $env{INTERFACE} -o hotplug"
[...]

==> /sbin/ifup:
[...]
if [ "$SCRIPTNAME" == ifup -a "$HOTPLUG" == yes ] ; then
        IFINDEX=/sys/$DEVPATH/ifindex
        if [ -r "$IFINDEX" ] ; then
                STAMPFILE=$STAMPFILE_STUB`cat $IFINDEX`
                echo renamed > $STAMPFILE
        fi
fi
[...]

The DEVPATH variable is provided by udev and points to the path of the
device, e.g. devices/pci0000:00/0000:00:10.0/net/eth0.

The STAMPFILE is checked in the rcnetwork script -- see comment #4 and
#9; your patches apply to exactly this place.

But the network script is not the cause of the problem - something is
wrong with the udev network rules on your systems.

This is why I asked you to verify the sysconfig + udev installation.
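
Whether the stamp files were actually written can be checked after boot. The helper below is a diagnostic sketch only: `report_stamps` is a hypothetical name, and the stamp-file prefix is passed as an argument because the real value of STAMPFILE_STUB is not quoted in this report and has to be looked up in the installed scripts.

```shell
#!/bin/sh
# report_stamps SYSNET_DIR STAMP_STUB   (hypothetical diagnostic helper)
# For each interface directory, print its ifindex and the content of the
# matching stamp file, or "missing" if ifup never created it.
report_stamps() {
    sysnet=$1
    stub=$2
    for dir in "$sysnet"/*; do
        [ -r "$dir/ifindex" ] || continue    # skip entries that are not interfaces
        iface=${dir##*/}
        idx=$(cat "$dir/ifindex")
        if [ -r "$stub$idx" ]; then
            echo "$iface ifindex=$idx stamp=$(cat "$stub$idx")"
        else
            echo "$iface ifindex=$idx stamp=missing"
        fi
    done
}
```

On a real system the first argument would be /sys/class/net. An interface reported as stamp=missing is one for which the hotplug ifup call did not run or could not write its stamp.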
Comment 32 Marius Tomaschewski 2008-05-26 13:49:55 UTC
(In reply to comment #29 from Marc Munnen)
> Created an attachment (id=217913)
> debug info as requested, from Marc
> 
> Marius,
> 
> As suggested, I deleted the "ifcfg-*~" files.

> You said: This should fix your problem Marc. But what problem?

That the network script waits for a "type-wlan" interface that
will never be available - see also comment #27.

> I also renamed ifcfg-type-wlan. There is no need to convert
> this one, I don't need it.

Then remove it, but don't rename it to "ifcfg-wlan1" or the network
script will wait for a "wlan1" interface.

> I rebooted with the original network script. The network was
> not started.

Same problem as in Peter's and Walter's case - something is wrong
with the udev rule files. "ifup eth0 -o hotplug" is never called.
Comment 33 Marius Tomaschewski 2008-05-26 13:57:51 UTC
What is the result of "rpm -V udev sysconfig"?

Do any of these rule files exist on your system?

/etc/udev/rules.d/29-net_trigger_firmware.rules
/etc/udev/rules.d/30-net_persistent_names.rules
/etc/udev/rules.d/31-network.rules
/etc/udev/rules.d/80-sysconfig.rules
/etc/udev/rules.d/85-mount-fstab.rules

Rules like:

30-net_persistent_names.rules:SUBSYSTEM=="net", ACTION=="add", SYSFS{address}=="00:e0:81:02:a4:56", IMPORT="/lib/udev/rename_netiface %k eth1"

are obsolete and do not work on 10.3.
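
A quick way to answer this is to loop over the five file names listed above. This is a small sketch: `find_obsolete_rules` is a hypothetical name, and the optional argument exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# find_obsolete_rules [RULES_DIR]   (hypothetical helper)
# Prints the path of each obsolete pre-10.3 rule file that is still present.
find_obsolete_rules() {
    rulesdir=${1:-/etc/udev/rules.d}
    for f in 29-net_trigger_firmware.rules 30-net_persistent_names.rules \
             31-network.rules 80-sysconfig.rules 85-mount-fstab.rules; do
        if [ -e "$rulesdir/$f" ]; then
            echo "$rulesdir/$f"
        fi
    done
}
```

No output means none of the listed files exist.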
Comment 34 Walter Haidinger 2008-05-27 09:09:35 UTC
Hi Marius, thanks for your help!

Replying to comment #31:

> ls -l *net*
shows the 70-,75- und 77- files that you listed.

> # grep -Ev "^#|^$" 70-persistent-net.rules 
ok, there are entries for each ethX-interface.

> rpm -V ...
udev is quiet (no changes), and syslog-ng (not syslog!) shows only
/etc/syslog-ng/syslog-ng.conf which is ok (added custom entries). 

> rpm -qf ...
Same packages as yours.

> 77-network.rules
has exactly the required line.

> /sys/devices/pci*/net/ethX
entries exist.

Replying to comment #33:

> rpm -V udev sysconfig
ok, nothing is printed.

None of the 5 udev rules listed in comment #33 exist on my system.

The obsoleted 30-... rule was even documented in the Release Notes, IIRC.
So yes, I'm aware of that.

Now, without anything obvious wrong in the udev setup,
how can we debug this further?

What are the possible reasons that the "add" action of 77-network.rules
is _not_ triggered?
Comment 35 Marius Tomaschewski 2008-05-27 14:16:40 UTC
(In reply to comment #34 from Walter Haidinger)
> Hi Marius, thanks for your help!

Thanks for your reports!

> > rpm -V ...
> udev is quiet (no changes), and syslog-ng (not syslog!) shows only
> /etc/syslog-ng/syslog-ng.conf which is ok (added custom entries). 

OK.
I meant "rpm -V sysconfig" of course, but you verified it below ;-)

> Replying to comment #33:
> 
> > rpm -V udev sysconfig
> ok, nothing is printed.

OK.

> None of the 5 udev rules listed in comment #33 exist on my system.
> 
> The obsoleted 30-... rule was even documented in the Release Notes, IIRC.
> So yes, I'm aware of that.
> 
> Now, without anything obvious wrong in the udev setup,
> how can we debug this further?
> 
> What are the possible reasons that the "add" action of 77-network.rules
> is _not_ triggered?

Yes, this is the question here. Maybe it happens because of your
2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not
trigger this event without a patch?

Hmm... Marc and Peter seem to use SUSE default kernels...

We will ask the maintainer and find out - reassigning to him.

The rename rule (70-persistent-net.rules) is called / visible
in your boot.msg file.

<6>udev: renamed network interface eth1 to eth0
<6>udev: renamed network interface eth0_rename to eth1

In my case, I can see more log lines when rename happens (10.3):

<6>eth0 renamed to eth0_rename
<6>eth1 renamed to eth0
<6>udev: renamed network interface eth1 to eth0
<6>eth0_rename renamed to eth1
<6>udev: renamed network interface eth0_rename to eth1

I think it makes sense to enable udev debug logging; perhaps it
is visible there.

Please set udev_log="debug" in /etc/udev/udev.conf, reboot
and provide the logs - they should be in /var/log/boot.msg.

Alternatively, you can also try to trigger it at runtime:

rcnetwork stop
rmmod forcedeth # network card driver

udevcontrol log_priority=debug

rcnetwork start -o boot &        # in background, or modprobe on another
modprobe forcedeth               # terminal after rcnetwork start...
Comment 36 Kay Sievers 2008-05-27 14:32:52 UTC
Why is this bug assigned to me? Please do not reassign bugs without comment. I doubt it is something to fix from my side.
Comment 37 Marius Tomaschewski 2008-05-27 14:46:46 UTC
Because udev does not trigger rules from 77-network.rules file,
as mentioned in comment #35.
Comment 38 Kay Sievers 2008-05-28 08:32:37 UTC
Udevtest shows, it should run. How did you find out, it is not called?
Comment 39 Walter Haidinger 2008-05-28 08:49:45 UTC
Marius concluded that "ifup eth0 -o hotplug" is never called (see bottom of comment #32). It should be, though:

#udevtest /class/net/eth1
This program is for debugging only, it does not run any program,
specified by a RUN key. It may show incorrect results, because
some values may be different, or not available at a simulation run.

parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-alsa.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-bluetooth.rules' as rules file
parse_file: reading '/etc/udev/rules.d/41-soundfont.rules' as rules file
parse_file: reading '/etc/udev/rules.d/50-udev-default.rules' as rules file
parse_file: reading '/etc/udev/rules.d/51-lirc.rules' as rules file
parse_file: reading '/etc/udev/rules.d/52-irda.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-hpmud.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-libsane.rules' as rules file
parse_file: reading '/etc/udev/rules.d/56-idedma.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-cdrom_id.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-input.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-storage.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-device-mapper.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-md-raid.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-kpartx.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-cd.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-net.rules' as rules file
parse_file: reading '/etc/udev/rules.d/71-multipath.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-cd-aliases-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-persistent-net-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/77-network.rules' as rules file
parse_file: reading '/etc/udev/rules.d/80-drivers.rules' as rules file
parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file
parse_file: reading '/etc/udev/rules.d/95-udev-late.rules' as rules file
import_uevent_var: import into environment: 'PHYSDEVPATH=/devices/pci0000:00/0000:00:08.0'
import_uevent_var: import into environment: 'PHYSDEVBUS=pci'
import_uevent_var: import into environment: 'PHYSDEVDRIVER=forcedeth'
import_uevent_var: import into environment: 'INTERFACE=eth1'
import_uevent_var: import into environment: 'IFINDEX=4'
main: looking at device '/class/net/eth1' from subsystem 'net'
udev_rules_get_name: rule applied, 'eth1' becomes 'eth1'
main: run: 'socket:/org/kernel/dm/multipath_event'
main: run: '/sbin/ifup eth1 -o hotplug'
main: run: 'socket:/org/freedesktop/hal/udev_event'
main: run: 'socket:/org/kernel/udev/monitor'

I'll reboot with udev logging set to debug (from comment #35) and post
the logs. Marc, Peter, can you do this too?
Comment 40 Walter Haidinger 2008-05-28 08:52:58 UTC
Btw, has somebody looked into /sbin/ifup? Perhaps it is called by udev after all but doesn't do its job...
Comment 41 Walter Haidinger 2008-05-28 08:57:46 UTC
Q: Who sets environment variable DEVPATH? udev?
Required in /sbin/ifup (see comment #31)
Comment 42 Kay Sievers 2008-05-28 09:00:40 UTC
(In reply to comment #41 from Walter Haidinger)
> Q: Who sets environment variable DEVPATH? udev?
> Required in /sbin/ifup (see comment #31)

The kernel.

Comment 43 Walter Haidinger 2008-05-28 12:01:44 UTC
In reply to comment #35:
> Yes, this is the question here. Maybe it happens because of your
> 2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not
> trigger this event without a patch?

Checked patches of kernel-default-2.6.22.17-0.1.nosrc.rpm to drivers/net/forcedeth.c. There are some, but either already applied in
current 2.6.25.4 or not applicable, e.g. patches to printk().
Unless there are other patches to the networking stack, I doubt that
the forcedeth driver is the problem.

Marc, Peter, which network driver do you use?
Comment 44 Marius Tomaschewski 2008-05-28 12:47:22 UTC
(In reply to comment #38 from Kay Sievers)
> Udevtest shows, it should run. How did you find out, it is not called?

I've created a sysconfig package (http://www.suse.de/~mt/openSUSE/10.3/,
see comment 26), that enables "bash -vx" debugging in both, rcnetwork
and ifup:

/sbin/ifup:
R_INTERNAL=1      # internal error, e.g. no config or missing scripts
cd /etc/sysconfig/network || exit $R_INTERNAL
test -f ./config && . ./config
test -f scripts/functions && . scripts/functions || exit $R_INTERNAL

###### scripts/functions creates /dev/shm/sysconfig
. scripts/extradebug


/etc/sysconfig/network/scripts/extradebug:
SCRIPT=${0##*/}
if test -d /dev/shm/sysconfig ; then
exec 2> /dev/shm/sysconfig/exdeb.${SCRIPT}_$$.$PPID.${SEQNUM}_$1.$2
[...]
set -vx
fi

In the logs provided in comment 28 and comment 29, ifup is never called
with the "-o hotplug" option, which is how udev starts it in the rule.
Comment 45 Peter Küppers 2008-05-28 12:49:24 UTC
Created attachment 218600 [details]
boot.msg with udev debug from Peter

Hello,

Replying to comment #31 and #33 from Marius:
I've the same results as in comment #34 from Walter.

Replying to comment #35 from Marius and #39 from Walter:
I've set udev_log="debug", rebooted, and provided the log /var/log/boot.msg.

Replying to comment #43 from Walter:
I use network drivers 8139too and 8139cp for eth0 (Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)).
and r8169 for eth1 (Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet (rev 10)).
lsmod says:
r8169                  48392  0
8139too                44672  0
8139cp                 41216  0
mii                    22528  2 8139too,8139cp

Replying to comment #30 from Walter:
me neither -> Since I do not know how to make a private attachment (does somebody?)

Cheers

Peter
Comment 46 Marius Tomaschewski 2008-05-28 13:06:56 UTC
(In reply to comment #43 from Walter Haidinger)
> In reply to comment #35:
> > Yes, this is the question here. Maybe it happens because of your
> > 2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not
> > trigger this event without a patch?
> 
> Checked patches of kernel-default-2.6.22.17-0.1.nosrc.rpm to
> drivers/net/forcedeth.c. There are some, but either already applied in
> current 2.6.25.4 or not applicable, e.g. patches to printk().
> Unless there are other patches to the networking stack, I doubt that
> the forcedeth driver is the problem.
> 
> Marc, Peter, which network driver do you use?

It does not depend on the driver [at least as shipped by suse].

Marc and I are using forcedeth from the SUSE kernel too, and it
works fine for me but not for Marc. Peter is using r8169 (SUSE).
Comment 47 Marius Tomaschewski 2008-05-28 13:37:26 UTC
BTW: which dhcp clients are you using Marc, Peter, Walter?

Please provide the output of:
   rpm -qa | grep dhcp
   grep DHCLIENT_BIN /etc/sysconfig/network/dhcp
Comment 48 Walter Haidinger 2008-05-28 14:29:51 UTC
I usually use ISC /sbin/dhclient from dhcp-client.rpm.
The interfaces are configured statically, though.
Comment 49 Peter Küppers 2008-05-28 14:48:53 UTC
Hello,

Replying to comment #47 from Marius:
rpm -qa | grep dhcp:
dhcpcd-1.3.22pl4-287
grep DHCLIENT_BIN /etc/sysconfig/network/dhcp
DHCLIENT_BIN=""

But I don't have any ifcfg-<interface> with BOOTPROTO='dhcp'. They are both 'static'.

Cheers

Peter
Comment 50 Marc Munnen 2008-05-28 21:13:58 UTC
Hi,

Just like Walter & Peter, I use only statically configured interfaces. Cmt 48/49
Just like Peter, I have the same dhcp-stuff. Cmt 49

@Marius: (cmt 47): do you really have an updated 10.2 -> 10.3 Open Suse?

@all: more than 10 messages today while I was out. I'll try to grab what's in it and to do what is asked, tomorrow.
But I'm more worried that I haven't got any security updates for about a month than about this bug, for which some sort of workaround exists.

Regards,
Marc
Comment 51 Marius Tomaschewski 2008-05-29 06:01:51 UTC
I just wanted to make sure that you aren't using dhclient (ISC dhcp client).
Until now, ifup-dhcp does not support multiple interfaces (parallel up), but
it also complains in /var/log/messages when it detects this.
Comment 52 Innocenty Enikeew 2008-07-21 18:31:19 UTC
Hello,

I also have this problem and tried to investigate it myself. By putting echo statements here and there, I found that when /etc/init.d/network starts in 'onboot' mode, it expects udev to have renamed the interface by that time; if this isn't the case, ifup will not be run for it. For network interfaces for which the udev rule has run ifup, ifup creates a STAMPFILE in /dev/shm/sysconfig/ with 'renamed' in it. But if udev called this rule before rcnetwork had started, ifup exits with "Service network not started and mode 'auto' -> skipping".
Ifup creates that STAMPFILE during on-boot udev device detection, but when rcnetwork runs afterwards, /dev/shm/sysconfig/ is somehow empty! This makes rcnetwork wait for the interface's detection.

I'm not sure, but maybe that could help.
Comment 53 Kay Sievers 2008-07-22 08:33:45 UTC
I really doubt, that this is a udev issue. Reassigning back.
Comment 54 Marius Tomaschewski 2008-10-01 15:47:20 UTC
Walter, can you provide the output of the following command? :

for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
Comment 55 Walter Haidinger 2008-10-03 05:59:53 UTC
#for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
lrwxrwxrwx 1 root root 0 Oct  3 07:42 /proc/2558/exe -> /sbin/udevd*
#for p in $(pidof udevd) ; do ls -lL /proc/$p/exe ; done
-rwxr-xr-x 1 root root 80252 Sep 21  2007 /proc/2558/exe*

Do you suspect that I don't have udevd running?
Comment 56 Marius Tomaschewski 2008-10-06 09:50:04 UTC
(In reply to comment #55 from Walter Haidinger)
> #for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
> lrwxrwxrwx 1 root root 0 Oct  3 07:42 /proc/2558/exe -> /sbin/udevd*
                                                          ^^^^^^^^^^^
> Do you suspect that I don't have udevd running?

No. I wanted to rule out a bug that we've had on 11.1 Beta, where the
udevd from initrd was still running due to a bug in sysvinit. It would
appear as "-> /sbin/udevd (deleted)"...
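
For anyone who wants to script that check, here is a minimal sketch. The helper names exe_is_deleted and check_udevd are made up for illustration; the only assumption is that the kernel appends " (deleted)" to the /proc/<pid>/exe link target once the binary has been unlinked:

```shell
# Hypothetical helper: succeeds if a /proc/<pid>/exe link target
# carries the " (deleted)" suffix.
exe_is_deleted() {
    case "$1" in
        *" (deleted)") return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report every running udevd and whether its
# binary still exists on disk.
check_udevd() {
    for p in $(pidof udevd 2>/dev/null); do
        target=$(readlink "/proc/$p/exe") || continue
        if exe_is_deleted "$target"; then
            echo "udevd $p: stale binary ($target)"
        else
            echo "udevd $p: ok ($target)"
        fi
    done
}
```

Run check_udevd as root after boot; a "stale binary" line would point at the leftover initrd udevd described above.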

(In reply to comment #52 from Innocenty Enikeew)
> Hello,
> 
> I also have this problem and tried to investigate it myself. By putting echo
> statements here and there, I found that when /etc/init.d/network starts in
> 'onboot' mode, it expects udev to have renamed the interface by that time; if
> this isn't the case, ifup will not be run for it. For network interfaces for
> which the udev rule has run ifup, ifup creates a STAMPFILE in
> /dev/shm/sysconfig/ with 'renamed' in it. But if udev called this rule before
> rcnetwork had started, ifup exits with "Service network not started and mode
> 'auto' -> skipping".
> Ifup creates that STAMPFILE during on-boot udev device detection, but when
> rcnetwork runs afterwards, /dev/shm/sysconfig/ is somehow empty! This makes
> rcnetwork wait for the interface's detection.

I'll verify this.
Comment 57 Kai Lappalainen 2008-10-23 23:25:43 UTC
We have this same problem here with 2 of 4 machines which were upgraded from 10.2 to 10.3. All have static IP addresses, different network cards, different architectures.

The difference during the upgrade was, that the two working machines were upgraded from "outside" by booting with a network-install-cd. The two machines showing the problems were upgraded from a running system by changing the repos to the 10.3 versions and then using the "factory upgrade"-tool in yast.

Recently we upgraded one of these 2 problematic systems from 10.3 to 11.0 by booting with a retail dvd and performing the upgrade, but the problem still persists.

(Due to network-dependent services, our workaround after booting is to switch to runlevel 1 and then back to runlevel 3 (these are servers).)

One working server and one server with broken network after the upgrade are nearly identical (Dell PowerEdge servers with BCM5708 network cards), so I compared a lot of settings and configuration files without finding any difference. But my knowledge about udev is very limited. If there is something I should compare, I would happily try to help.
Comment 58 Marius Tomaschewski 2008-12-17 13:07:32 UTC
(In reply to comment #57 from Kai Lappalainen)
> We have this same problem here with 2 of 4 machines which were upgraded from
> 10.2 to 10.3. All have static IP addresses, different network cards, different
> architectures.
>
> The difference during the upgrade was, that the two working machines were
> upgraded from "outside" by booting with a network-install-cd. The two machines
> showing the problems were upgraded from a running system by changing the repos
> to the 10.3 versions and then using the "factory upgrade"-tool in yast.

[...]

Updates of a running system to a new distribution are AFAIK not supported,
and yast2 shows AFAIK at least a red warning. This is the case because
several problems may occur.

One example: the conversion of ifcfg-eth-id-* to ifcfg-ethX needs an
already updated udev with an already generated persistent-net rule 70,
or a kernel using the new sysfs, to work properly. When the conversion
happens while the old udev is running, the rule 70 it generates does
not exist yet, and a conversion using the old one may result in a
different persistent name. Using the old sysfs does not work either,
because the two differ significantly and completely different modules
may be in use on the new system. ...

But let's take a look at the rules / config. Perhaps we'll find it.
Can you attach a tgz from a working and a not working machine?

"tar cvzf /tmp/machine1.tgz /etc/udev /etc/sysconfig/network"

In /var/adm/backup/sysconfig are backups - please provide one of each
machine that contains the old configuration (ifcfg-eth-<hwdesc> files).

Of course, please copy the dirs somewhere first, review and replace
any private data with XXXXX / example.com / dummy IPs / ... and attach
it with a private flag set.

And please also provide the output of "rpm -V udev sysconfig".
Comment 59 Marius Tomaschewski 2008-12-17 13:28:51 UTC
*** Bug 410367 has been marked as a duplicate of this bug. ***
Comment 60 Marius Tomaschewski 2008-12-17 13:31:14 UTC
WARNING: The following "udevadm trigger" may cause strange things
         (interfaces may be set up twice); it is a good idea to
         reboot afterwards.

Can you disable services like databases first, reboot and provide
the output of:

  ls -l /dev/shm/sysconfig/
  udevadm trigger --verbose --retry-failed --subsystem-match="net"
  ls -l /dev/shm/sysconfig/
  udevadm trigger --verbose --subsystem-match="net"
  ls -l /dev/shm/sysconfig/

after the network script failed?
Comment 61 Kai Lappalainen 2008-12-17 23:09:48 UTC
(In reply to comment #60 from Marius Tomaschewski)

I have no udevadm, so I used udevtrigger

>   ls -l /dev/shm/sysconfig/
total 20
-rw-r--r-- 1 root root  5 Dec 17 23:49 config-eth0
-rw-r--r-- 1 root root  3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root  7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root  3 Dec 17 23:49 network
-rw-r--r-- 1 root root  0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:49 tmp

>   udevadm trigger --verbose --retry-failed --subsystem-match="net"
(no output)

>   ls -l /dev/shm/sysconfig/
total 20
-rw-r--r-- 1 root root  5 Dec 17 23:49 config-eth0
-rw-r--r-- 1 root root  3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root  7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root  3 Dec 17 23:49 network
-rw-r--r-- 1 root root  0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:49 tmp

>   udevadm trigger --verbose --subsystem-match="net"
/devices/pci0000:00/0000:00:02.0/0000:06:00.0/0000:07:00.0/0000:08:00.0/0000:09:00.0/net/eth1
/devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:00.0/net/eth0
/devices/virtual/net/lo


>   ls -l /dev/shm/sysconfig/
total 36
-rw-r--r-- 1 root root  5 Dec 17 23:54 config-eth0
-rw-r--r-- 1 root root  3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 29 Dec 17 23:54 if-eth0
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root  7 Dec 17 23:54 ifup-eth0
-rw-r--r-- 1 root root  7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root  3 Dec 17 23:49 network
-rw-r--r-- 1 root root  8 Dec 17 23:54 new-stamp-2
-rw-r--r-- 1 root root  8 Dec 17 23:54 new-stamp-3
-rw-r--r-- 1 root root  0 Dec 17 23:54 ready-eth0
-rw-r--r-- 1 root root  0 Dec 17 23:54 ready-eth1
-rw-r--r-- 1 root root  0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:54 tmp

(eth1 is not used/configured!)
Comment 62 Kai Lappalainen 2008-12-18 00:15:05 UTC
Created attachment 260717 [details]
Not Working machine /etc/udev /etc/sysconfig/network

(In reply to comment #58 from Marius Tomaschewski)
> But let's take a look to the rules / config. Perhaps we'll find it.
> Can you attach a tgz from a working and a not working machine?
> 
> "tar cvzf /tmp/machine1.tgz /etc/udev /etc/sysconfig/network"

machine1 is the not working machine, machine2 is the working machine.

> 
> In /var/adm/backup/sysconfig are backups - please provide one of each
> machine that contains the old configuration (ifcfg-eth-<hwdesc> files).
the working machine had no ifcfg-eth-<hwdesc> files before the upgrade, but an ifcfg-eth0 file in the oldest sysconfig-backup I've found. (?)
It's the same on the second working machine.
I'll attach the files anyway.

> 
> Of course, please copy the dirs somewhere first, review and replace
> any private data with XXXXX / example.com / dummy IPs / ... and attach
> it with a private flag set.
> 
> And please also provide the output of "rpm -V udev sysconfig"?
> 
No output on both machines.
Comment 63 Kai Lappalainen 2008-12-18 00:16:36 UTC
Created attachment 260718 [details]
not working machine /etc/sysconfig/network before upgrade
Comment 64 Kai Lappalainen 2008-12-18 00:19:11 UTC
Created attachment 260719 [details]
working machine /etc/udev /etc/sysconfig/network

( /etc/sysconfig/network/route intentionally missing)
Comment 65 Kai Lappalainen 2008-12-18 00:20:13 UTC
Created attachment 260720 [details]
working machine /etc/sysconfig/network before upgrade
Comment 66 Marius Tomaschewski 2008-12-19 13:54:43 UTC
Now, you've provided:

  machine1/new

  machine1/old/sysconfig

  machine2/new/udev
  machine2/new/sysconfig

  machine2/old/sysconfig

In machine1/old archive are ifcfg-eth-id-* files - ok, but:

The new files from machine1 would be most interesting ... Can you
provide them? There is AFAIK no backup of udev rules, except
for the rule 30 that sysconfig creates during conversion...

The machine2/new/udev/rules.d/* looks like a fresh install,
not like an update. There is no converted & disabled rule 30.


I've tested it yesterday - I've booted from CD and updated;
the network starts fine after, rule 30 was correctly converted
into a rule 70.

I'll retest making an update in a running system when I'm back
in work next year...

(In reply to comment #61 from Kai Lappalainen)
> (In reply to comment #60 from Marius Tomaschewski)
> 
> I have no udevadm, so I used udevtrigger

sure, sorry.

> >   ls -l /dev/shm/sysconfig/
> total 36
> -rw-r--r-- 1 root root  5 Dec 17 23:54 config-eth0
> -rw-r--r-- 1 root root  3 Dec 17 23:49 config-lo
> -rw-r--r-- 1 root root 29 Dec 17 23:54 if-eth0
> -rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
> -rw-r--r-- 1 root root  7 Dec 17 23:54 ifup-eth0
> -rw-r--r-- 1 root root  7 Dec 17 23:49 ifup-lo
> -rw-r--r-- 1 root root  3 Dec 17 23:49 network
> -rw-r--r-- 1 root root  8 Dec 17 23:54 new-stamp-2
> -rw-r--r-- 1 root root  8 Dec 17 23:54 new-stamp-3
> -rw-r--r-- 1 root root  0 Dec 17 23:54 ready-eth0
> -rw-r--r-- 1 root root  0 Dec 17 23:54 ready-eth1
> -rw-r--r-- 1 root root  0 Dec 17 23:49 ready-lo
> drwxr-xr-x 2 root root 60 Dec 17 23:54 tmp
> 
> (eth1 is not used/configured!)

Then "all is fine": the new-stamp-* files are created after the explicit
udevtrigger run for both interfaces (correct, they exist), and
because the network was already started (the network file exists),
eth0 is also configured.
Comment 67 Kai Lappalainen 2008-12-19 16:25:06 UTC
(In reply to comment #66 from Marius Tomaschewski)
> Now, you've provided:
> 
>   machine1/new
Yes, as attachment 260717 [details] to comment 62.


> In machine1/old archive are ifcfg-eth-id-* files - ok, but:
> 
> The new files machine1 would be most interesting ... Can you
> provide them? There is afaik no backup of udev rules, except
> of the rule 30 that sysconfig creates during conversion...
It's not clear to me what's missing? Could you please explain?

> 
> The machine2/new/udev/rules.d/* looks like a fresh install,
> not like an update. There is no converted & disabled rule 30.
Sorry, I've checked our logs. It turned out that this server was upgraded (in place) on September 13th, 2007 from 10.2 to 10.3 *factory* before doing the upgrade to 10.3 final on October 4th, 2007.
So I'm afraid this machine is not comparable - other than maybe there was a change between Sept. 13th and Oct. 4th which broke the conversion, because this machine works?
The same is true for a second working machine here, which was also upgraded before to 10.3 factory on Sept. 13th and also works.

In the /etc/sysconfig/network backup from Oct. 4th there is an empty file called "__convert_hwdesc_to_iface__". Other than that I see no significant difference from the one provided.
Comment 68 Marius Tomaschewski 2009-01-07 15:49:27 UTC
Do you (or somebody who can reproduce the problem) have a line like this
in /etc/fstab?

tmpfs /dev/shm tmpfs defaults,size=132M 0 0
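
A quick way to answer that without reading the whole file is sketched below. The function name find_shm_fstab_entries is made up for illustration, and the sketch assumes the usual whitespace-separated fstab layout with the mount point in the second field:

```shell
# Print every active (non-comment) fstab line whose mount point,
# i.e. the second field, is /dev/shm.
find_shm_fstab_entries() {
    fstab="${1:-/etc/fstab}"
    awk '$1 !~ /^#/ && $2 == "/dev/shm"' "$fstab"
}
```

Calling find_shm_fstab_entries with no argument reads /etc/fstab; empty output means no separate /dev/shm entry is configured there.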
Comment 69 Kai Lappalainen 2009-01-07 16:12:59 UTC
On both (not working) machines I have 

tmpfs /dev/shm tmpfs defaults 0 0

in /etc/fstab. (no size specified)


Comment 70 Marius Tomaschewski 2009-01-07 16:42:15 UTC
Is the problem fixed when you remove this line and reboot?
Comment 71 Walter Haidinger 2009-01-07 19:56:20 UTC
I have such a line too:
tmpfs   /dev/shm   tmpfs   auto,size=384m,mode=1777     0 0

I cannot shutdown the server for tests atm, though. Sorry.
However, I'm curious: How can this make a difference? It should not.
Comment 72 Kai Lappalainen 2009-01-07 20:02:29 UTC
*Bingo*! :-)

After removing the line on the two affected servers both machines were able to boot with working network!

As I understand it, the mount during boot causes the flags in /dev/shm/sysconfig to be "overwritten"?
Comment 73 Peter Küppers 2009-01-07 20:45:53 UTC
Hello,

sorry for the late answer

I've the same result:

my /etc/fstab contains:
tmpfs /dev/shm tmpfs size=1G 0 0

When I remove (comment out) the line, yes, bingo, that's it: the problem is fixed!
But I need the line...

Cheers

Peter
Comment 74 Marius Tomaschewski 2009-01-07 21:00:59 UTC
Yes, the problem is, that first there is the udev tmpfs mounted on /dev,
udev gets started and creates the /dev/shm/sysconfig/new-stamp-$INDEX
interface marks via "ifup $IF -o hotplug".

But then a separate tmpfs gets mounted (by boot.localfs) _over_ /dev/shm
and _hides_ them -- and the network script is waiting for them to appear.

When you move /dev/shm/sysconfig to somewhere else and "umount /dev/shm"
you will find the new-stamp-* files again...
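
To tell whether a running machine has such an overmount, one can look for a dedicated /dev/shm entry in the mount table. This is only a sketch: the function name shm_is_separate_mount is made up, and it assumes the /proc/mounts layout with the mount point in the second field:

```shell
# Succeeds (exit status 0) if the mount table lists /dev/shm as a
# mount point of its own, i.e. something is mounted over the
# directory inside the udev tmpfs.
shm_is_separate_mount() {
    mounts="${1:-/proc/mounts}"
    awk '$2 == "/dev/shm" { found = 1 } END { exit !found }' "$mounts"
}
```

For example, "shm_is_separate_mount && echo affected" prints "affected" on a system where the stamp files would be hidden as described above.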

The credits go to Michael Monnerie in bug 435880, who found out that
there is sometimes a separate /dev/shm mount.

I have to find out who is creating it (some Oracle software? see bug 355786)
and find a way to fix it. I'll set all the bugs as duplicates tomorrow.
Comment 75 Marius Tomaschewski 2009-01-07 21:02:20 UTC
(In reply to comment #73 from Peter Küppers)
> Hello,
> 
> sorry for the late answer
> 
> I've the same result:
> 
> my /etc/fstab contains:
> tmpfs /dev/shm tmpfs size=1G 0 0
> 
> When I remove (comment out) the line, yes, bingo, that's it: the problem is fixed!
> But I need the line...

No, you don't need it: /dev is already a tmpfs.
Comment 76 Marius Tomaschewski 2009-01-07 21:04:24 UTC
When you (some software) need a mount point there, you can fake it with
  mount -obind /dev/shm /dev/shm
Comment 77 Neil Murphy 2009-01-07 21:45:45 UTC
Ah, sounds like I was suffering from the /dev/shm problem too (bug #410367).

ATI tell you to mount a tmpfs at /dev/shm in order to make 3d support work in their fglrx drivers.

I gave up on ATI's drivers some time ago, which probably explains why I've not seen this bug for a while.

Comment 79 Marius Tomaschewski 2009-01-09 13:24:31 UTC
*** Bug 435880 has been marked as a duplicate of this bug. ***
Comment 80 Marius Tomaschewski 2009-01-09 13:45:44 UTC
(In reply to comment #77 from Neil Murphy)
> Ah, sounds like I was suffering from the /dev/shm problem too (bug #410367).
> 
> ATI tell you to mount a tmpfs at /dev/shm in order to make 3d support work in
> their fglrx drivers.
> 
> I gave up on ATI's drivers some time ago, which probably explains why I've
> not seen this bug for a while.

Well, ATI, VMWARE, Oracle. 

By default there are up to 2G in /dev/shm:

LANG=C df -h /dev/shm
Filesystem            Size  Used Avail Use% Mounted on
udev                  2.0G  264K  2.0G   1% /dev

but some applications may need more.
Comment 81 Marius Tomaschewski 2009-01-09 13:45:58 UTC
*** Bug 355786 has been marked as a duplicate of this bug. ***
Comment 82 Marius Tomaschewski 2009-01-09 13:47:49 UTC
*** Bug 435189 has been marked as a duplicate of this bug. ***
Comment 83 Marius Tomaschewski 2009-01-09 13:51:42 UTC
Is it possible to specify the size of the /dev fs in /etc/fstab?
Comment 84 Kay Sievers 2009-01-09 13:57:54 UTC
(In reply to comment #83 from Marius Tomaschewski)
> Is it possible to specify the size of the /dev fs in /etc/fstab?

It may be possible by adding size=, but making it larger than the default (half the RAM size) just papers over some utterly broken applications, which should be fixed instead.
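
For reference, the default limit (half the RAM size) can be computed from /proc/meminfo. A minimal sketch; the function name default_tmpfs_kb is made up, and it assumes the usual "MemTotal: <n> kB" line:

```shell
# Print half of MemTotal in kB, which corresponds to the default
# tmpfs size limit when no size= option is given.
default_tmpfs_kb() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { printf "%d\n", $2 / 2 }' "$meminfo"
}
```

Comparing this value with the Size column of "df -k /dev" shows whether /dev is still mounted with the default limit.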
Comment 85 Marius Tomaschewski 2009-01-09 14:09:28 UTC
(In reply to comment #84 from Kay Sievers)
> (In reply to comment #83 from Marius Tomaschewski)
> > Is it possible to specify the size of the /dev fs in /etc/fstab?
> 
> It may be possible by adding size=, but making it larger than the default (half
> the RAM size) just papers over some utterly broken applications, which should
> be fixed instead.

Yes, half of RAM size (2G was wrong / my machine only :).

Well, but when you've 8G RAM and run only e.g. a database it may be
required to set it to e.g. 6G.

IMO we have at least two choices:

 a) change the init / network scripts to not use it
 b) make it adjustable (/etc/init.d/boot mounts /dev using
    "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going
    to test if the size is used when I add /dev with
    a different size to /etc/fstab)...

In case of a) -- which path can we use instead - /dev/.tmp?
/var may be mounted on a separate disk, so it can't be used either.
Comment 86 Marius Tomaschewski 2009-01-09 14:29:09 UTC
(In reply to comment #85 from Marius Tomaschewski)
>  a) change the init / network scripts to not use it
>  b) make it adjustable (/etc/init.d/boot mounts /dev using
>     "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going
>     to test if the size is used when I add /dev with
>     a different size to /etc/fstab)...

No, the size for /dev isn't used...
Comment 87 Marius Tomaschewski 2009-01-09 14:32:29 UTC
Michael Taylor:

does Oracle work when you remove the separate /dev/shm and
call "mount -oremount,size=7g /dev" after booting instead?
Comment 88 Marius Tomaschewski 2009-01-09 14:34:18 UTC
Michael, you can add the mount -oremount call to "/etc/init.d/boot.local".
Comment 89 Walter Haidinger 2009-01-09 14:35:57 UTC
Marius, thanks for hunting this bug so relentlessly... ;-)

I'd favor a) because b) would be optional then but still nice to have anyways.

Instead of /dev/.tmp use a dynamically created path, with some unique magic
which is persistent between the script calls, like (bad) e.g. 
/dev/.network-config.`uname -r`

It would be nice if the directory gets cleaned up (say, removed) after the
network is completely configured (provided that is possible) but /dev/ gets
cleaned up upon reboot anyways. 
Comment 90 Kay Sievers 2009-01-09 14:37:43 UTC
(In reply to comment #85 from Marius Tomaschewski)

>  b) make it adjustable (/etc/init.d/boot mounts /dev using
>     "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going
>     to test if the size is used when I add /dev with
>     a different size to /etc/fstab)...

It should honor the fstab options by re-mounting the already mounted filesystem. We do that for some other filesystems too.
 
> In case of a) -- which path can we use instead - /dev/.tmp?
> /var may be mounted on a separate disk, so it can't be used either.

Better use some name private to your package, and delete the directory after it is no longer needed when the rootfs is available. We should not put new generic names like ".tmp" in /dev, and suggest people to share that.
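
Roughly what that could look like in the scripts, as a sketch only. The directory name .sysconfig-network, the STATE_BASE variable, and the function names are all hypothetical and not what sysconfig actually uses:

```shell
# Base directory: /dev on a real system, overridable for dry runs.
STATE_BASE="${STATE_BASE:-/dev}"

# Hypothetical package-private directory name (not a generic ".tmp").
state_dir() {
    printf '%s/.sysconfig-network\n' "$STATE_BASE"
}

# Create the private directory early in boot, before ifup needs it.
state_dir_create() {
    mkdir -p "$(state_dir)" && chmod 0755 "$(state_dir)"
}

# Remove it again once the stamp files are no longer needed.
state_dir_cleanup() {
    rm -rf -- "$(state_dir)"
}
```

Because the directory lives directly in /dev and not under /dev/shm, a later mount over /dev/shm would not hide the files in it.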
Comment 91 Marius Tomaschewski 2009-01-09 16:44:32 UTC
Created attachment 264188 [details]
boot.localfs (11.1) patch to honour /dev options (size) from /etc/fstab

Rüdiger,

what do you think about this patch to honor a size option?
Comment 92 Marius Tomaschewski 2009-01-09 16:47:46 UTC
The while loop is not really needed, since the
   mount -fv -t tmpfs udev /dev
call before it has already added /dev to /etc/mtab...
Comment 93 Michael Taylor 2009-01-09 17:56:20 UTC
Hi Marius,

Thank you for your work on this issue, and finding the tmpfs root cause.  Since creating this bug, I have moved my Oracle machines to use HugePages, which are incompatible with the tmpfs construct, so you can close the bug.  Here is the Oracle note.  I have removed the parameter that made use of /tmpfs and my network interfaces are now working.

Thanks,
-Michael

Subject: 	MEMORY_TARGET/MEMORY_MAX_TARGET And Linux Hugepages
  	Doc ID: 	473165.1 	Type: 	HOWTO
  	Modified Date : 	26-MAY-2008 	Status: 	MODERATED

In this Document
  Goal
  Solution

This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review.

Applies to:
Oracle Server - Enterprise Edition - Version: 11.1.0.6
Linux x86
Goal

Using MEMORY_TARGET/MEMORY_MAX_TARGET for managing memory in an 11g database on Linux. When trying to check if Hugpages are being used by running the command (grep Huge /proc/meminfo), can see that Hugepages are not being used.

But, When using SGA_MAX_SIZE to manage the memory in the same database, can see by using the same command (grep Huge /proc/meminfo) that  Hugepages are being used.

Does MEMORY_TARGET/MEMORY_MAX_TARGET make use of Linux Hugepages?
Solution

Automatic Memory Management (MEMORY_TARGET/MEMORY_MAX_TARGET) cannot be used in conjunction with Hugepages on Linux. This is because its memory segments are memory mapped files in /dev/shm. 
Comment 94 Peter Küppers 2009-01-09 22:16:19 UTC
Hello Marius,

on my server, I used the /etc/fstab line "tmpfs /dev/shm tmpfs size=1G 0 0" because it was recommended for a SAP Testdrive (Netweaver 2004s with MaxDB 7.6).

There are various sapnotes with hints on tmpfs and SAP memory management for Linux systems. In the SAP Note 941735 (SAP memory management for 64-bit Linux systems), I found a solution to customize my system without the line in /etc/fstab. But there is another hint in this SAP Note:
...
TMPFS
With the STD implementation (SAP profile parameter es/implementation=std), the SAP Extended Memory is no longer stored in the TMPFS (under /dev/shm). However, the TMPFS is required by the Virtual Machine Container (VMC). For this reason, we still recommend the same configuration of the TMPFS:
75% (RAM + Swap) is still recommended as the size.
...
So I understand that in my case the "tmpfs /dev/shm tmpfs size=1G 0 0" line is still relevant. If not, is this a question for Linux (so bugzilla.novell and SLES) or more for SAP?
Question would be "How to configure the tmpfs for the VMC otherwise?"
With the patch you recommended or "hard" size= in /etc/init.d/boot or...?

Cheers

Peter
Comment 95 Marius Tomaschewski 2009-01-12 09:47:01 UTC
(In reply to comment #94)
> Hello Marius,
Hi!

> on my server, I used the /etc/fstab line "tmpfs /dev/shm tmpfs size=1G 0 0"
> because it was recommended for a SAP Testdrive (Netweaver 2004s with MaxDB 7.6).
> 
> There are various sapnotes with hints on tmpfs and SAP memory management for
> Linux systems. In the SAP Note 941735 (SAP memory management for 64-bit Linux
> systems), I found a solution to customize my system without the line in
> /etc/fstab. But there is another hint in this SAP Note:
> ...
> TMPFS
> With the STD implementation (SAP profile parameter es/implementation=std), the
> SAP Extended Memory is no longer stored in the TMPFS (under /dev/shm). However,
> the TMPFS is required by the Virtual Machine Container (VMC). For this reason,
> we still recommend the same configuration of the TMPFS:
> 75% (RAM + Swap) is still recommended as the size.
> ...
> So I understand, that in my case the "tmpfs /dev/shm tmpfs size=1G 0 0" is
> still relevant. If not, is this a question for Linux (so bugzilla.novell and
> SLES) or more for SAP?
> Question would be "How to configure the tmpfs for the VMC otherwise?"
> With the patch you recommended or "hard" size= in /etc/init.d/boot or...?

Since udev (something like 10.x), the complete /dev is a tmpfs. Before,
/dev was (usually) a normal directory on root-fs with static device files
and only the /dev/shm directory was a tmpfs.

Because the complete /dev is a tmpfs (udev & init scripts use very
little of it [254K on my system]), it is no longer required to create a
separate tmpfs for /dev/shm.

But this does not mean that the default size (50% of RAM) never
needs to be adjusted.

The patch in comment #91 allows specifying the size directly for /dev
by adding an /etc/fstab line like:

  udev  /dev  tmpfs  size=3g,mode=755  0 0

and removing the /dev/shm mount entry.

Whether setting it to "75% (RAM + Swap)" (as recommended above) makes
sense is another issue entirely.
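
To make the arithmetic concrete, here is a rough sketch that derives the "75% (RAM + Swap)" figure from /proc/meminfo and prints a matching fstab line. The function names tmpfs_size_mb and dev_fstab_line are hypothetical, and the resulting line only takes effect for /dev with the patch from comment #91 applied. It does not answer whether 75% is a sensible value.

```shell
#!/bin/sh
# Hypothetical helper: 75% of (MemTotal + SwapTotal) in MiB, read from a
# meminfo-style file (default: /proc/meminfo). Values there are in kB.
tmpfs_size_mb() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/  { ram = $2 }
         /^SwapTotal:/ { swap = $2 }
         END { printf "%d\n", (ram + swap) * 0.75 / 1024 }' "$meminfo"
}

# Hypothetical helper: print an /etc/fstab line for /dev in the form
# used above, with the computed size in MiB.
dev_fstab_line() {
    printf 'udev  /dev  tmpfs  size=%sm,mode=755  0 0\n' "$(tmpfs_size_mb "$1")"
}
```

Running dev_fstab_line without arguments uses the live /proc/meminfo; the output is meant to be reviewed by hand before it goes anywhere near /etc/fstab.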
Comment 96 Marius Tomaschewski 2009-01-12 09:55:29 UTC
The question now is what the maintainer of /etc/init.d/boot.localfs
says about the patch from comment #91 or the alternative one below.

[Because "mount -fv -t tmpfs udev /dev" writes an /etc/mtab entry,
 it is not really required to check the /etc/fstab before remount]

Ruediger?

=============
--- /etc/init.d/boot.localfs
+++ /etc/init.d/boot.localfs    2009-01-12 10:48:43.000000000 +0100
@@ -239,6 +239,8 @@
            fi
        done < /proc/filesystems
        mount -fv -t tmpfs udev /dev
+       # remount /dev too when there may be options in fstab
+       mount -oremount /dev
        rc_status
        if test ! -d /sys/block/loop0 ; then
            /sbin/modprobe loop
=============
Comment 97 Marc Munnen 2009-01-14 21:18:10 UTC
On Jan 7 I updated my fstab to remove the line for tmpfs, and I restored /etc/init.d/network to its original state.
Since then I have had a working system and eth0 was functioning as it should. I was very pleased to see this bug being resolved with such a clever solution.
That is, until now.
Although my fstab has not changed since the 7th, all of a sudden the bug is back.
This afternoon nothing was wrong, but this evening I again have no network without manually restarting rcnetwork.
The only difference today that I can remember is some security fixes for CUPS that came my way.

Marc
Comment 98 Marc Munnen 2009-01-14 22:20:18 UTC
Hi,

Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network.
But it did not help this time!
I even restored fstab with the offending tmpfs line.
Needless to say this did not help either.

Basically I have no clue left. The problem is back, and there is no cheap fix anymore.
Am I the only one with this recurring problem?

Marc
Comment 99 Marius Tomaschewski 2009-01-15 08:13:15 UTC
(In reply to comment #98)
> Hi,
Hi!

> Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network.
> But it did not help this time!
> I even restored fstab with the offending tmpfs line.
> Needless to say this did not help either.
> 
> Basically I have no clue left. The problem is back, and there is no cheap fix
> anymore.
> Am I the only one with this recurring problem?

Please take a look at the output of "cat /proc/mounts" to see whether a
separate /dev/shm is mounted. Then check whether there are "new-stamp-$ID"
files in /dev/shm/sysconfig matching your network cards' interface IDs in
the "ip link show" output ("3: eth1:" => check if new-stamp-3 exists).
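
The two checks above could be scripted roughly as follows. This is only a sketch: the function names has_separate_shm and check_stamps are made up here, and it assumes the "N: name:" line format of "ip link show" as it appears in this bug.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a separate filesystem is mounted on
# /dev/shm according to a mounts-style file (default: /proc/mounts).
has_separate_shm() {
    mounts="${1:-/proc/mounts}"
    awk '$2 == "/dev/shm" { found = 1 } END { exit !found }' "$mounts"
}

# Hypothetical helper: read "ip link show" output on stdin and report,
# for each "N: name:" line, whether new-stamp-N exists in the state
# directory (default: /dev/shm/sysconfig).
check_stamps() {
    statedir="${1:-/dev/shm/sysconfig}"
    awk -F': ' '/^[0-9]+: / { print $1, $2 }' | while read -r id name; do
        if [ -e "$statedir/new-stamp-$id" ]; then
            echo "$name: new-stamp-$id present"
        else
            echo "$name: new-stamp-$id missing"
        fi
    done
}
```

Typical use would be "has_separate_shm && echo separate /dev/shm mounted" followed by "ip link show | check_stamps". A "missing" line is only a hint: in the listings posted in this bug, lo has no stamp file either.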
Comment 100 Marc Munnen 2009-01-15 21:09:11 UTC
Marius,

There is no /dev/shm mounted:
 
root@Planhold:/home/marc> cat /proc/mounts
rootfs / rootfs rw 0 0
udev /dev tmpfs rw 0 0
/dev/mapper/nvidia_eicaegch_part1 / reiserfs rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
debugfs /sys/kernel/debug debugfs rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/mapper/nvidia_eicaegch_part9 /home ext3 rw,data=ordered 0 0
/dev/mapper/nvidia_eicaegch_part3 /boot ext3 rw,data=ordered 0 0
/dev/mapper/nvidia_eicaegch_part8 /bu xfs rw 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
securityfs /sys/kernel/security securityfs rw 0 0

In sysconfig, new-stamp-2 and new-stamp-4 exist:

root@Planhold:/home/marc> ls /dev/shm/sysconfig
config-eth0  config-lo  config-wlan0  new-stamp-2  new-stamp-4  ready-lo
root@Planhold:/home/marc> ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:1b:fc:df:9e:11 brd ff:ff:ff:ff:ff:ff
3: wmaster0: <BROADCAST,MULTICAST> mtu 1500 qdisc ieee80211 qlen 1000
    link/ieee802.11 00:80:5a:4e:f8:ea brd ff:ff:ff:ff:ff:ff
4: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:80:5a:4e:f8:ea brd ff:ff:ff:ff:ff:ff
root@Planhold:/home/marc> ls /dev/shm/sysconfig/new-stamp-2
/dev/shm/sysconfig/new-stamp-2
root@Planhold:/home/marc> cat /dev/shm/sysconfig/new-stamp-2
renamed
root@Planhold:/home/marc> cat /dev/shm/sysconfig/new-stamp-4
renamed
root@Planhold:/home/marc> cat /dev/shm/sysconfig/config-eth0
eth0

I don't use wlan0. But as I said, I haven't changed anything except network and fstab last week. 'network' is original, fstab has a comment in front of tmpfs.
It worked for 7 days, but not anymore.
Your help is appreciated.

Marc
Comment 101 Marius Tomaschewski 2009-01-16 09:02:04 UTC
BTW:
I've created a separate bug 466718 for the "apply size to /dev fs" issue.

(In reply to comment #100)
> Marius,
> 
> There is no /dev/shm mounted:
[...]

OK.

> In sysconfig new-stamp-2 and new-stamp-4 exists
> 
> root@Planhold:/home/marc> ls /dev/shm/sysconfig
> config-eth0  config-lo  config-wlan0  new-stamp-2  new-stamp-4  ready-lo

Hmm... strange - it should work then.

Please reinstall the most recent sysconfig and udev RPMs for your distribution,
verify the install using "rpm -V sysconfig udev" and reboot (true reboot).
If it still happens, please enable the ". scripts/extradebug" line in
the /sbin/ifup and /etc/init.d/network scripts and reboot (true reboot).

Depending on the distribution/sysconfig version it should create "bash -x"
trace files either in /tmp/exdeb.* or in /dev/shm/sysconfig/exdeb.*.
Please tar them together with the /dev/shm/sysconfig files (tar cvzf
bug335486-exdeb.tgz /tmp/exdeb.* /dev/shm/sysconfig) and attach to this
bug.
Comment 102 Peter Küppers 2009-01-16 19:47:29 UTC
(In reply to comment #98)
> Hi,
> 
> Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network.
> But it did not help this time!
> I even restored fstab with the offending tmpfs line.
> Needless to say this did not help either.
> 
> Basically I have no clue left. The problem is back, and there is no cheap fix
> anymore.
> Am I the only one with this recurring problem?
> 
> Marc

Sorry, I'm late again with my answer.

I installed the latest update packages on my server (openSUSE 10.3), but the solution (remove tmpfs from /etc/fstab) still works on my server.
cat /proc/mounts says that there is no separate /dev/shm mounted.
ip link show says there are no "new-stamp-$ID".
In /dev/shm/sysconfig, new-stamp-2 and new-stamp-3 exist (and 'cat' shows "renamed" for both).

Cheers

Peter
Comment 103 Marc Munnen 2009-01-16 21:36:38 UTC
Created attachment 265808 [details]
Output from scripts/extradebug

Marius,

My first verify with rpm showed something hopeful:
root@Planhold:/home/marc> rpm -V sysconfig udev
S.5....T    /etc/init.d/network
S.5....T    /etc/sysconfig/network/scripts/ifup-wireless
This disappeared after reinstalling.

Unfortunately the bug did not...
See the logfiles. Hopefully something evil will reveal itself.

Marc
Comment 104 Forgotten User E4aj6OYf6m 2009-01-17 16:49:10 UTC
Marc wrote:
> fstab has a comment in front of tmpfs.

Marc, try completely deleting that entry. Even with a comment in front, it didn't work for me. Removing the line helped.
Comment 105 Marc Munnen 2009-01-17 21:10:49 UTC
Michael,

Michael wrote:
> Marc, try completely deleting that entry. Even with a comment in front, it
> didn't work for me. Removing the line helped.

This looks like wizardry. If that helps, it's magic.
But it did not help for me, I am sorry to say.

I will try to upgrade to openSUSE 11.1, maybe that is more rewarding.

Marc
Comment 106 Marc Munnen 2009-01-21 21:14:48 UTC
Hi,

I just upgraded to openSUSE 11.1, and that was a good workaround for this bug.
All is well that ends well.

Love,
Marc
Comment 107 Marius Tomaschewski 2009-02-02 11:07:55 UTC
(In reply to comment #103)
> Created an attachment (id=265808) [details]
> Output from scripts/extradebug
> 
> Marius,
> 
> My first verify with rpm showed something hopeful:
> root@Planhold:/home/marc> rpm -V sysconfig udev
> S.5....T    /etc/init.d/network
> S.5....T    /etc/sysconfig/network/scripts/ifup-wireless
> This disappeared after reinstalling.
> 
> Unfortunately the bug did not...
> See the logfiles, Hopefully something evil will reveal itself.
> 
> Marc

The bug did not go away because you have /etc/sysconfig/network/ifcfg-eth6
and /etc/sysconfig/network/ifcfg-eth7 files on your system, so the
network script waits for this non-existent hardware to appear.
A "rm /etc/sysconfig/network/ifcfg-eth{6,7}" solves the problem without
the need to update to 11.1.
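
To spot such leftover files, something along these lines could help. It is a sketch only: the function name stale_ifcfg is made up, and it assumes that ifcfg files are named after the interface and that existing interfaces appear under /sys/class/net. Configurations for interfaces that are created on demand would be listed too, so treat the output as candidates, not as a delete list.

```shell
#!/bin/sh
# Hypothetical helper: list ifcfg-<name> files for which no interface
# <name> exists. Both directories can be overridden for testing.
stale_ifcfg() {
    cfgdir="${1:-/etc/sysconfig/network}"
    sysnet="${2:-/sys/class/net}"
    for f in "$cfgdir"/ifcfg-*; do
        [ -e "$f" ] || continue            # no ifcfg files at all
        name="${f##*/ifcfg-}"
        case "$name" in
            *.rpmsave|*.rpmnew|*~) continue ;;   # skip backup copies
        esac
        [ -e "$sysnet/$name" ] || echo "$f"
    done
}
```

Each file it prints can then be removed, or kept with STARTMODE set to 'off' or 'manual', depending on whether the hardware is expected to come back.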
Comment 108 Marius Tomaschewski 2009-02-02 14:45:02 UTC
Resetting Bug Prio (while comment #5) back to P5 as assigned by
bnc-team-screening.
Comment 109 Marius Tomaschewski 2009-02-02 15:19:53 UTC
The problem that the network script waits for an interface until
timeout (WAIT_FOR_INTERFACES in /etc/sysconfig/network/config),
occurs under two conditions:

 a) there are ifcfg files (interface configurations) for hardware
    that does not exist (any more) as in comment #103 and comment
    #107.

    solution =>
        Delete the [excrescent] ifcfg files or set
        STARTMODE='off' or 'manual' in these files.

 b) there is a separate tmpfs mounted on /dev/shm (via /etc/fstab),
    that hides the sysconfig udev rule state files created before
    /dev/shm got mounted.

    solution =>
        Remove /dev/shm mount point from /etc/fstab. A separate
        /dev/shm is not required, because /dev is already a tmpfs
        with a maximal size of 50% of RAM.

        In case that 50% of RAM for the tmpfs is not sufficient
        because of special requirements of some software, the
        size can be adjusted by adding an /etc/init.d/boot.local
        line, like:

                /bin/mount -oremount,size=3g /dev

        to remount it with the desired size (3GB in this example).

I'm resolving this Bug as WONTFIX, because it is not sufficient
to just change the /dev/shm/sysconfig path [used for many years]
in sysconfig to something else: the change also affects several
other packages and may break custom if-up.d/if-down.d scripts.

We'll address this issue in a later/next openSUSE version.
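
For condition b), a quick way to see whether /etc/fstab still asks for a separate /dev/shm is sketched below. The function name shm_fstab_entries is made up for illustration; it simply prints the non-comment lines whose second field (the mount point) is /dev/shm.

```shell
#!/bin/sh
# Hypothetical helper: print active (non-comment) entries of an
# fstab-style file (default: /etc/fstab) that mount something on /dev/shm.
shm_fstab_entries() {
    fstab="${1:-/etc/fstab}"
    awk '!/^[ \t]*#/ && $2 == "/dev/shm"' "$fstab"
}
```

No output means there is no active /dev/shm entry left. After a reboot, "cat /proc/mounts" shows whether a separate /dev/shm is really gone, and "df -h /dev" shows the size of the /dev tmpfs.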
Comment 110 Marc Munnen 2009-02-03 20:57:17 UTC
Marius,

Thank you for your analysis (comment #107).
In the meantime I have upgraded to 11.1, and the extra ifcfg-eth* files are gone.

Usually I update my system as soon as an update is available, but this time I hesitated because I have a two-seat configuration that I did not want to lose.

After upgrading it took some time indeed to get the configuration in order, but now I'm glad I took the step.

Thanks for all your effort, and hopefully one day Linux will rule.

Marc
Comment 111 Walter Haidinger 2009-02-04 11:32:52 UTC
As the initial bug reporter, I'd also like to thank Marius for hunting this bug so persistently.

Since the problem was identified and we have workarounds in comment #109, I guess we can live with a resolution of WONTFIX for the older distributions.
Comment 112 Marius Tomaschewski 2009-08-13 14:36:51 UTC
*** Bug 516769 has been marked as a duplicate of this bug. ***
Comment 113 Marius Tomaschewski 2009-12-14 10:53:32 UTC
*** Bug 497924 has been marked as a duplicate of this bug. ***