Bugzilla – Bug 335486
network interface not set up during boot due to a separate /dev/shm tmpfs
Last modified: 2009-12-14 10:53:32 UTC
After upgrading from openSUSE 10.2 to 10.3, the /etc/init.d/network script fails to configure the network interface (traditional method, no NetworkManager). It worked without problems under 10.2. The script now times out waiting for the mandatory device eth1:

    eth1 device: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13)
    eth1 DHCP client NOT running
    eth1 is down

And indeed, as ifconfig shows, the interface isn't up. However, after booting, 'rcnetwork start' or 'yast2 network' works. Unfortunately /var/log/messages isn't very helpful; it shows only:

    skge eth1: Link is up at 1000 Mbps, full duplex
    ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
    skge eth1: disabling interface

Please note that there is no delay between the above syslog entries (i.e. same timestamp). The interface seems to be brought down right after it was brought up. Since the above does not provide much detail, I'd appreciate some hints to further diagnose the problem. Thanks!
Please note that the interface is not brought up when configured with a static IP either. It is therefore obviously not a DHCP client problem.
Well, the network script waits forever for hotplug to set up eth1 ("... still waiting for hotplug devices"). I manually load the skge module before /etc/init.d/network is called. Is that what causes this now?
/var/log/boot.msg with network debugging enabled: (please note the "unknown option" error messages!) <notice>network start start CONFIG = INTERFACE = AVAILABLE_IFACES = PHYSICAL_IFACES = DIALUP_IFACES = TUNNEL_IFACES = MANDATORY_DEVICES = eth1 __NSC__ SKIP = Setting up network interfaces: lo returned 0 done... still waiting for hotplug devices: SUCCESS_IFACES= lo MANDATORY_DEVICES=eth1 __NSC__ Time to wait: 42 Waiting for mandatory devices: eth1 __NSC__ 42 ... still waiting for hotplug devices: SUCCESS_IFACES= lo [--cut--] Time to wait: 1 ... still waiting for hotplug devices: SUCCESS_IFACES= lo MANDATORY_DEVICES= eth1 __NSC__ eth1 -o rc onboot eth1 device: Marvell Technology Group Ltd. 88E8001 Gigabit Ethernet Controller (rev 13) eth1 eth1 -o rc onboot eth1 DHCP client NOT running eth1 is down eth1 eth1 -o rc onboot unknown option rc ignored 'eth1' is not wireless, exiting eth1 eth1 -o rc onboot unknown option rc ignored interface eth1 is not up eth1 returned 7 failed eth1 interface could not be set up until now failed... final SUCCESS_IFACES= lo MANDATORY_DEVICES= eth1 __NSC__ FAILED=1 noiface -o rc onboot Setting up service network . . . . . . . . . . . . . . . .failed <notice>'network start' exits with status 7
Here is a workaround (not a fix). Ugly, but it works for me. Exclude the eth interfaces from the physical/non-physical detection:

--- sysconfig-0.70.2-4/etc/init.d/network 2007-09-22 00:12:40.000000000 +0200
+++ /etc/init.d/network 2007-10-22 12:18:54.000000000 +0200
@@ -470,7 +470,8 @@
 # later in the start section if it is considered mandatory (see next section).
 for a in $(type_filter `ls -A /sys/class/net/`); do
 case "`get_iface_type $a`" in
- eth|tr|wlan)
+ eth) ;;
+ tr|wlan)
 STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
 if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
 if [ ! -e "$STAMPFILE" ] ; then

Nevertheless, the ethernet interfaces should be brought up at boot if configured that way (start=onboot), regardless of udev or prior module load.
I have the same problem. I upgraded SUSE 10.2 to 10.3 on two PCs, and both refused to bring up the network cards after booting. I tried almost every configuration option in YaST for my 3 network cards, but only letting NetworkManager start them gave me at least one working card. Alas, the other 2 are greyed out and so can't be started. Advanced routing does not work with NetworkManager either. (And, strangely enough, it refuses my netmask of 255.255.255.0 and uses 255.0.0.0.) In contrast, when I do a fresh install on both PCs, all network cards are started at boot with ifup, as expected. After logging in on my upgraded PCs I can start them with 'rcnetwork start'. But this is hardly a solution, because I can't tell my friend he has to live with that. So please, please, repair this one way or another. I must admit that the workaround from Walter Haidinger does not ring many bells for me at first sight. Marc
Please attach your network configuration files. Would you please try these two things:

1) rcnetwork stop
   rcnetwork start -o boot

2) rcnetwork stop
   unload the driver of your NIC
   rcnetwork start -o boot
   load the module while rcnetwork is waiting for the interface
Created attachment 195057 [details]
Files requested by comment #7

Hi! I've attached a tarball with the following files:
* 1.log - output of 1) above with DEBUG=yes
* 2.log - output of 2) above with DEBUG=yes
* /etc/sysconfig/network/config
* /etc/sysconfig/network/ifcfg-eth[01]

Unfortunately, inserting the module as requested by 2) did not change anything. Btw, here is the mixed console output (not in the attached logs), with syslog logging to the console, when inserting the forcedeth driver in 2):

MANDATORY_DEVICES= eth1 __NSC__
Time to wait: 18
kernel: forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.60.
Feb 15 08:32:16 banshee kernel: ACPI: PCI Interrupt 0000:00:04.0[A] -> Link [APCH] -> GSI 22 (level, high) -> IRQ 19
... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__
Time to wait: 16
kernel: eth0: forcedeth.c: subsystem: 01043:80a7 bound to 0000:00:04.0
... still waiting for hotplug devices:
SUCCESS_IFACES= lo
MANDATORY_DEVICES= eth1 __NSC__

Please tell me if you need anything else.
Hello Walter and Christian,

First I have to say that I'm not an expert on Bugzilla and how to participate in it. But I've investigated this bug, and maybe I've found a solution that is of some interest.

In openSUSE 10.2, /etc/init.d/network reads from line 458 onward:
>>>
# Now get all available interfaces drop lo and separate them into physical and
# not physical. Then get AVAILABLE_IFACES sorted to shutdown the not physical
# first.
for a in $(type_filter `ls -A /sys/class/net/`); do
    test "$a" = lo && continue;
    test "$a" = sit0 && continue;
    test "$a" = bonding_masters && continue;
    test "${a#wifi}" != "$a" && continue
    case $a in
        eth*|ath*|wlan*|ra*)
            # Skip these which are too new, they will come via hotplug
            # Stamping in rename_netiface
            # - at the start: virgin
            # - while looping: looping
            # - at the end: renamed
            # ifup aborts immediately if there is no service network
            # If there is no stamp, then stamp unknown --> skip
            # If stamp is virgin --> skip
            #              looping --> skip
            #              renamed --> set up
            # In the status loop, when mandatory devices are checked:
            #   if status failed
            #     STAMP == unknown && half of the wait time is over -> ifup
            #           == virgin/looping/renamed -> nothing
            STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
            if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
                if [ -r "$STAMPFILE" ] ; then
                    case "`cat $STAMPFILE`" in
                        virgin|looping) continue ;;
                    esac
                else
                    echo unknown > $STAMPFILE
                    continue
                fi
            fi
            ;;
    esac
    for b in $DIALUP_IFACES $TUNNEL_IFACES; do
        if [ "$a" = "$b" ] ; then
            NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
            continue 2
        fi
    done
    case $a in
        sit*)
            NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
            continue 2
            ;;
    esac
    PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
done
<<<

In openSUSE 10.3, /etc/init.d/network is different from line 472 onward:
>>>
# Now get all available interfaces drop lo and separate them into physical and
# not physical. Then get AVAILABLE_IFACES sorted to shutdown the not physical
# first.
# Interfaces may be renamed by udev after they are registered. In some cases
# this may take some time. Therefore we check a 'renamed' flag if an interface
# is ready to be set up. If it is not ready now, it will be set up via
# udev/ifup (because network is started now). We will just have to wait for it
# later in the start section if it is considered mandatory (see next section).
for a in $(type_filter `ls -A /sys/class/net/`); do
    case "`get_iface_type $a`" in
        eth|tr|wlan)
            STAMPFILE=$STAMPFILE_STUB`cat /sys/class/net/$a/ifindex`
            if [ "$MODE" == onboot -a "$ACTION" == start ] ; then
                if [ ! -e "$STAMPFILE" ] ; then
                    continue # this leaves the for-loop!
                fi
            fi
            ;;
        lo|wlan_aux)
            continue
            ;;
    esac
    for b in $DIALUP_IFACES $TUNNEL_IFACES; do
        if [ "$a" = "$b" ] ; then
            NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
            continue 2
        fi
    done
    case $a in
        sit*)
            NOT_PHYSICAL_IFACES="$NOT_PHYSICAL_IFACES $a"
            continue 2
            ;;
    esac
    PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
done
<<<

I marked the line with the supposed bug with a comment (this leaves the for-loop!), see above.
Since the for-loop is stopped here, the variable PHYSICAL_IFACES has no value!
And what's about "virgin and looping" from openSUSE 10.2?

I hope this gives a hint for a solution to the bug.

Peter
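Whether `continue` really leaves the loop can be checked in isolation. The following is a minimal sketch, not the shipped script: temporary directories stand in for /sys/class/net and for the stamp directory, and the names `classify_ifaces`, `sys_dir` and `stamp_stub` as well as the `stamp-` prefix are made up for illustration. In a POSIX shell, `continue` only skips the rest of the current iteration, so an interface that already has a stamp file still ends up in PHYSICAL_IFACES.

```shell
#!/bin/sh
# Simplified model of the 10.3 loop. Assumption: sys_dir stands in for
# /sys/class/net and stamp_stub for $STAMPFILE_STUB; both are hypothetical.
classify_ifaces() {
    sys_dir=$1
    stamp_stub=$2
    PHYSICAL_IFACES=""
    for a in $(ls -A "$sys_dir"); do
        case "$a" in
            eth*|tr*|wlan*)
                stampfile=$stamp_stub$(cat "$sys_dir/$a/ifindex")
                if [ ! -e "$stampfile" ]; then
                    # Skips only this interface; the loop goes on with the next one.
                    continue
                fi
                ;;
            lo)
                continue
                ;;
        esac
        PHYSICAL_IFACES="$PHYSICAL_IFACES $a"
    done
}

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/net/eth0" "$demo_dir/net/eth1" "$demo_dir/net/lo" "$demo_dir/stamps"
echo 2 > "$demo_dir/net/eth0/ifindex"
echo 3 > "$demo_dir/net/eth1/ifindex"
echo 1 > "$demo_dir/net/lo/ifindex"
# Only eth1 (ifindex 3) has been stamped, as if udev had already run ifup for it.
echo renamed > "$demo_dir/stamps/stamp-3"

classify_ifaces "$demo_dir/net" "$demo_dir/stamps/stamp-"
echo "PHYSICAL_IFACES=$PHYSICAL_IFACES"
```

In this model eth0 is skipped and eth1 is still collected. If no stamp file exists at all, the variable stays empty even though the loop visits every interface.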
Add myself to the CC list
The problem seems to be caused by some inconsistency of the scripts after the update... I cannot reproduce this problem (see attachment in next comment). In comment #4 and comment #9 an old rcnetwork script (not from the last update) is used. Please update; there were also fixes to the mandatory wait loop.

(In reply to comment #3 from Walter Haidinger)
> /var/log/boot.msg with network debugging enabled:
> (please note the "unknown option" error messages!)

The "unknown option rc ignored" message is just debug output from the ifup-wireless script and has no relevance (it just reports the ignored rc option).

Please update the sysconfig package to the most recent update package:

http://download.opensuse.org/update/10.3/rpm/i586/sysconfig-0.70.2-4.2.i586.rpm
http://download.opensuse.org/update/10.3/rpm/x86_64/sysconfig-0.70.2-4.2.x86_64.rpm

Then please verify the package using

rpm -V sysconfig

When rpm reports some modification/inconsistency like:

# rpm -V sysconfig
S.5....T /etc/init.d/network

remove the reported files and install the package again. Verify the udev installation using "rpm -V udev" as well.

# ls -l /etc/udev/rules.d/*net*.rules | cut -b 23-
450 9. Mai 08:17 /etc/udev/rules.d/70-persistent-net.rules
1518 21. Sep 2007 /etc/udev/rules.d/75-persistent-net-generator.rules
823 24. Apr 00:26 /etc/udev/rules.d/77-network.rules

The /etc/udev/rules.d/70-persistent-net.rules file is generated and contains the mapping of the hardware to the interface name. Please verify that it reflects your hardware address (MAC) and the ifcfg-<interface> files in /etc/sysconfig/network.

(In reply to comment #9 from Peter Küppers)
> I marked the line with the supposed bug with a comment (this leaves the
> for-loop!), see above.
> Since the for-loop is stopped here, the variable PHYSICAL_IFACES has no value!
> And what's about "virgin and looping" from openSUSE 10.2?
>
> I hope this gives a hint for a solution to the bug.

No, the scripts / rule files (in sysconfig and udev) were rewritten for 10.3 and are simpler in many places. On 10.3, the "ifcfg-<hardware-description>" (ifcfg-eth-id-00:01:02:8E:21) support has been removed completely; we just use ifcfg-<interfacename>. The PHYSICAL_IFACES variable is empty because the interface is not up yet and we wait for udev to load the modules.
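The verification step can be scripted. The following is only a sketch: `changed_files` is a hypothetical helper, and it assumes the usual `rpm -V` layout with the flag string in the first column and the path in the last one, where a "5" in the flags marks a file whose digest differs from the packaged one. The second sample line is invented for illustration.

```shell
#!/bin/sh
# Print the paths whose content differs from the packaged version.
# Assumption: input is `rpm -V` output, with the flag string in the first
# column and the path in the last one; a "5" in the flags marks a digest mismatch.
changed_files() {
    awk '$1 ~ /5/ { print $NF }'
}

# First line taken from the comment above, second line made up (timestamp change only).
sample='S.5....T /etc/init.d/network
.......T /etc/sysconfig/network/config'
modified=$(printf '%s\n' "$sample" | changed_files)
echo "$modified"
```

On a real system the input would come from `rpm -V sysconfig | changed_files`; the listed files are the candidates to remove before reinstalling the package.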
Created attachment 213807 [details] Comment #7 test results with most recent packages Messages while "rcnetwork start -o boot debug" as described in 2) in comment #7 egrep "kernel:|ifup:|rcnetwork:" /var/log/messages > messages.txt
The messages.txt attached in comment #12 contains also the case of interface renaming (eth0 -> eth1, eth1 -> eth0) in udev rules.
Walter, Peter, please update also the kernel to the most recent one (2.6.22.17-0.1).
Thanks for the response, Marius! Unfortunately the sysconfig update to -4.2 did not work for me. Actually, this explains why my workaround from comment #4 stopped working some time ago and had to be reapplied. (I automatically install the update RPMs via a custom script, not using YaST. This is still from the days before YaST provided update functionality, so I obviously overlooked the sysconfig update in the script's notification email.) My udev rules and interface renaming are fine, and I'm not running an openSUSE kernel; currently it is vanilla 2.6.25.1.

Please note the following things regarding this bug:
* It only shows during boot. Subsequent rcnetwork calls once logged in succeed.
* It's probably caused by the assumption that _only_ udevd loads the modules. But what if the module is already loaded before boot.udev is run?

I'll create a messages.txt as suggested in comment #12 later.
Created attachment 213890 [details]
boot.msg from system with ethernet driver in initrd

(In reply to comment #15 from Walter Haidinger)
> Unfortunately the sysconfig update to -4.2 did not work for me.

And what else is wrong with it?

> Please note the following things regarding this bug:
> * It only shows during boot. Subsequent rcnetwork calls once logged in succeed.
> * It's probably caused by the assumption that _only_ udevd loads the modules.
> But what if the module is already loaded before boot.udev is run?

No, even if I add the forcedeth driver to the initrd (INITRD_MODULES variable in /etc/sysconfig/kernel + mkinitrd) so that the driver is loaded before udev starts, it works fine for me - see the attached boot.msg. Please update correctly.
Marius, I was already running kernel 2.6.22.17-0.1, and I also already had the latest sysconfig. After 'upgrading' to version 0.70.2-4.2, YaST showed me the same old and current version. This upgrade also replaced my 'patched' network script. Of course, my network interfaces were then not started after rebooting. Your messages.txt told me that your interfaces were not renamed by udev. So, with the faint knowledge I have about udev, I edited some rules in /etc/udev/rules.d/. Now I have got rid of those renames for old, non-existing interfaces. But this did not solve the bug. Nevertheless, I'm happy with the simple setup I have now, with eth0 and wlan0 and no more. Marc
Hello Marius,

I also installed the latest sysconfig (sysconfig-0.70.2-4.5) and kernel (kernel-default-2.6.22.17-0.1 x86_64); udev is also OK. The error on boot still persists:
>>>
Waiting for mandatory devices: eth0 eth1 __NSC__
18 17 16 14 13 11 10 8 7 6 4 3 1 0
eth0 device: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
eth0 is down failed
eth0 interface could not be set up until now
...
<<<
If I add the following (from openSUSE 10.2) to the network script from line 493 onward, instead of the hard 'continue', the script works fine for me:
>>>
493: if [ ! -e "$STAMPFILE" ] ; then
         case "`cat $STAMPFILE`" in
             virgin|looping) continue ;;
         esac
     else
         echo unknown > $STAMPFILE
         continue
     fi
<<<
It must have something to do with the PHYSICAL_IFACES variable not being filled properly (or something around it, e.g. with udev?).

Cheers
Peter
Sorry for the delayed answer. Unfortunately I have to confirm what Marc and Peter reported: sysconfig-0.70.2-4.5 does not work for me either; the ethernet interfaces are not set up during boot (only!) by /etc/init.d/network.

Regarding comment #16:
>> Unfortunately the sysconfig update to -4.2 did not work for me.
> And what else is wrong with it?

Sorry, that was a bit misleading. I meant that the update itself worked but this bug still persists. No problems with updates.

Marc, Peter, could you please do the following too, in order to help Marius debug the script? Add the 3 lines

set >> /var/log/rcnetwork.set
exec >> /var/log/rcnetwork.log 2>&1
set -v; set -x

at the top of your /etc/init.d/network and reboot. This will log the shell variables during boot and the (expanded) commands run. Marius should then have something to compare with his script, from three different machines that all experience the bug.
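The effect of these three lines can be tried out without touching /etc/init.d/network. This is a small sketch under assumptions: `log_dir` is a scratch directory standing in for /var/log, and the traced commands are placeholders. The subshell keeps the `exec` redirection from affecting the calling shell.

```shell
#!/bin/sh
# Assumption: log_dir is a scratch directory standing in for /var/log.
log_dir=$(mktemp -d)

(
    # Same three steps as suggested above, confined to a subshell.
    set >> "$log_dir/rcnetwork.set"          # dump all shell variables
    exec >> "$log_dir/rcnetwork.log" 2>&1    # send stdout and stderr to the log
    set -v; set -x                           # echo input lines and expanded commands
    MODE=onboot
    echo "mode is $MODE"
)

echo "trace written to $log_dir/rcnetwork.log"
```

Afterwards rcnetwork.set holds the variable dump and rcnetwork.log holds both the trace lines and the regular output of the commands.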
Created attachment 217033 [details] rcnetwork.log from Peter
Created attachment 217035 [details] rcnetwork.set from Peter
Hello, as recommended by Walter, I added the lines to /etc/init.d/network and rebooted. See my attachments for the result. (/etc/init.d/network with my modifications, see above!) Cheers Peter
Created attachment 217048 [details] My 6 logs/sets from rcnetwork
Hi, Just as Peter I did what Walter has advised. Yesterday. Now I have 6 logs/sets. - rcnetwork-changed.* : my patched/changed rcnetwork to make it work at boot time. - rcnetwork-org.* : my original rcnetwork that does not do the job at boot time Additionally: - rcnetwork.* : output from the original rcnetwork with 'rcnetwork start' after booting All this in the hope it will provide some useful information. Good luck, Marc
(In reply to comment #24 from Marc Munnen)
> All this in the hope it will provide some useful information.

Yes and no:

++ ls -d /etc/sysconfig/network/ifcfg-eth0 /etc/sysconfig/network/ifcfg-eth3~ /etc/sysconfig/network/ifcfg-eth5~ /etc/sysconfig/network/ifcfg-lo /etc/sysconfig/network/ifcfg-type-wlan /etc/sysconfig/network/ifcfg-wlan0 /etc/sysconfig/network/ifcfg-wlan0-bu~ /etc/sysconfig/network/ifcfg-wlan0~
[...]
+ MANDATORY_DEVICES=' eth0 type-wlan wlan0 __NSC__ '

This means there is a problem: ifcfg-type-wlan is not converted, because the hwdesc2iface script is unable to handle it. We will check with Christian (in NEEDINFO) whether it can be converted somehow; for the moment, just move it away (rename it to ifcfg.type-wlan) and delete the "ifcfg-*~" files. This should fix your problem, Marc.

Apart from the obsolete ifcfg-<hwdescr> files above: it is not the /etc/init.d/network script that causes the problems, but something with udev and ifup. So please provide log files as described below.

At http://www.suse.de/~mt/openSUSE/10.3/, in the 10.3-<arch> subdirectories, you'll find sysconfig RPMs with extra debugging enabled. Please install them, reset your /var/log/messages with:

bzip2 -9c < /var/log/messages \
> /var/log/messages-$(date +%Y%m%d).bz2 && \
cp /dev/null /var/log/messages

and _reboot_ (no, really not a joke - I want to see what udev is doing). After the reboot, please collect the files and create an archive using e.g.:

mkdir /tmp/bug-355786-extradebug
cp -a /dev/shm/sysconfig/* /tmp/bug-355786-extradebug/
cp -a /var/log/messages /var/log/boot.msg /tmp/bug-355786-extradebug/
cd /tmp/bug-355786-extradebug/
#### replace wireless keys / passwords / other secrets with XXXXXX
tar cvzf /tmp/bug-355786-extradebug.tgz *

And attach the archive as a (private) attachment to this bug.
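Leftover configuration files of the kind found in Marc's log can be spotted with a short script. This is a sketch only: `stale_ifcfg` is a made-up helper, not part of sysconfig, and the two directory arguments stand in for /etc/sysconfig/network and /sys/class/net. A file for an interface whose driver is simply not loaded yet would be reported as well, so the output is a hint and not a verdict.

```shell
#!/bin/sh
# List ifcfg files that have no matching entry in the interface directory.
# Assumptions: cfg_dir stands in for /etc/sysconfig/network and sys_dir for
# /sys/class/net; stale_ifcfg is a made-up helper, not part of sysconfig.
stale_ifcfg() {
    cfg_dir=$1
    sys_dir=$2
    for f in "$cfg_dir"/ifcfg-*; do
        [ -e "$f" ] || continue
        name=${f##*/ifcfg-}
        case "$name" in
            *~) echo "$f (editor backup)" ;;
            *)  [ -e "$sys_dir/$name" ] || echo "$f (no such interface)" ;;
        esac
    done
}

demo=$(mktemp -d)
mkdir -p "$demo/cfg" "$demo/net/eth0" "$demo/net/wlan0" "$demo/net/lo"
for n in eth0 eth3~ lo type-wlan wlan0 wlan0~; do
    : > "$demo/cfg/ifcfg-$n"
done
report=$(stale_ifcfg "$demo/cfg" "$demo/net")
echo "$report"
```

With the sample files above, the two backup files and ifcfg-type-wlan are reported, while ifcfg-eth0, ifcfg-lo and ifcfg-wlan0 are not.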
Created attachment 217531 [details] /etc/init.d/network patch to skip unconverted ifcfg-<hwdesc> files Marc, you can try to apply this patch. It should skip the ifcfg-type-wlan file.
Created attachment 217617 [details] debuginfo from Peter
Created attachment 217913 [details]
debug info as requested, from Marc

Marius,

As suggested, I deleted the "ifcfg-*~" files. You said: This should fix your problem Marc. But what problem? I also renamed ifcfg-type-wlan. There is no need to convert this one, I don't need it. I rebooted with the original network script. The network was not started. OK, I installed your sysconfig RPMs and collected the information. The WLAN part of these files doesn't matter for now; it's really only eth0 that is important for me (and the bug). I hope this will reveal something.

Regards, Marc
Marius, I was finally able to create the debug info as requested by comment #26. Since I do not know how to make a private attachment (does somebody?), I'll simply send it to you directly. The tarball also contains some *.log and *.set files (see comment #19). Removing the NEEDINFO status because all three of us have provided the requested debug info. If you need additional information, please let us know! I'm curious, though, why you require the /var/log/messages file too, because syslog is started _after_ network and the script only breaks during boot.
(In reply to comment #30 from Walter Haidinger)
> Marius,
>
> I was finally able to create the debug info as requested by comment #26.
> Since I do not know how to make a private attachment (does somebody?),
> I'll simply send it to you directly.

OK, thanks! In your and in Peter's case, "ifup eth0 -o hotplug" is never called (same for eth1). This means you have a problem with udev rules.

/etc/udev/rules.d # ls -l *net*
-rw-r--r-- 1 root root 450 2008-05-26 14:59:09 70-persistent-net.rules
-rw-r--r-- 1 root root 1518 2007-09-21 21:12:39 75-persistent-net-generator.rules
-rw-r--r-- 1 root root 823 2008-05-22 12:39:27 77-network.rules

There should be one rule for each physical network device, e.g.:

/etc/udev/rules.d # grep -Ev "^#|^$" 70-persistent-net.rules
SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:17:31:ca:a5:a5", NAME="eth0"
SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:17:31:ca:a3:92", NAME="eth1"

Please verify using "rpm -V udev" and "rpm -V syslog" that neither of the other two rule files is modified:

# rpm -qf /etc/udev/rules.d/75-persistent-net-generator.rules
udev-114-19
# rpm -qf /etc/udev/rules.d/77-network.rules
sysconfig-0.70.2-4.6

The 77-network.rules file is responsible for "marking" the interface as available by creating the $STAMPFILE:

==> 77-network.rules:
[...]
SUBSYSTEM=="net", ACTION=="add", RUN+="/sbin/ifup $env{INTERFACE} -o hotplug"
[...]

==> /sbin/ifup:
[...]
if [ "$SCRIPTNAME" == ifup -a "$HOTPLUG" == yes ] ; then
    IFINDEX=/sys/$DEVPATH/ifindex
    if [ -r "$IFINDEX" ] ; then
        STAMPFILE=$STAMPFILE_STUB`cat $IFINDEX`
        echo renamed > $STAMPFILE
    fi
fi
[...]

The DEVPATH variable is provided by udev and points to the path of the device, e.g. devices/pci0000:00/0000:00:10.0/net/eth0. The STAMPFILE is checked in the rcnetwork script -- see comment #4 and #9; your patches apply to exactly this place. But the network script is not the cause of the problem - something is wrong with the udev network rules on your systems. This is the reason why I asked you to verify the sysconfig + udev installation.
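To see which interfaces have actually been marked, the stamp files can be compared against the interface indexes. The following is a rough sketch: `stamp_status` is a hypothetical helper, the directories are temporary stand-ins for /sys/class/net and the stamp location, and the `stamp-` prefix is a placeholder because the real value of $STAMPFILE_STUB is not shown in this report.

```shell
#!/bin/sh
# Report, per interface, whether a stamp file exists and what it contains.
# Assumptions: sys_dir stands in for /sys/class/net; stamp_stub is a
# placeholder for $STAMPFILE_STUB, whose real value is not shown in this report.
stamp_status() {
    sys_dir=$1
    stamp_stub=$2
    for d in "$sys_dir"/*; do
        [ -r "$d/ifindex" ] || continue
        stampfile=$stamp_stub$(cat "$d/ifindex")
        if [ -r "$stampfile" ]; then
            echo "${d##*/}: $(cat "$stampfile")"
        else
            echo "${d##*/}: no stamp"
        fi
    done
}

demo=$(mktemp -d)
mkdir -p "$demo/net/eth0" "$demo/net/eth1" "$demo/shm"
echo 2 > "$demo/net/eth0/ifindex"
echo 3 > "$demo/net/eth1/ifindex"
echo renamed > "$demo/shm/stamp-2"    # as ifup would write it when called with -o hotplug
status=$(stamp_status "$demo/net" "$demo/shm/stamp-")
echo "$status"
```

An interface reported without a stamp is one the network script would skip in onboot mode and then wait for if it is mandatory.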
(In reply to comment #29 from Marc Munnen)
> Created an attachment (id=217913) [details]
> debug info as requested, from Marc
>
> Marius,
>
> As suggested, I deleted the "ifcfg-*~" files.
> You said: This should fix your problem Marc. But what problem?

That the network script waits for a "type-wlan" interface that will never be available - see also comment #27.

> I also renamed ifcfg-type-wlan. There is no need to convert
> this one, I don't need it.

Then remove it, but don't rename it to "ifcfg-wlan1", or the network script will wait for a "wlan1" interface.

> I rebooted with the original network script. The network was
> not started.

Same problem as in Peter's and Walter's case - something is wrong with the udev rule files. "ifup eth0 -o hotplug" is never called.
What is the result of "rpm -V udev sysconfig"? Does one of these rule files exist on your system?

/etc/udev/rules.d/29-net_trigger_firmware.rules
/etc/udev/rules.d/30-net_persistent_names.rules
/etc/udev/rules.d/31-network.rules
/etc/udev/rules.d/80-sysconfig.rules
/etc/udev/rules.d/85-mount-fstab.rules

Rules like:

30-net_persistent_names.rules:SUBSYSTEM=="net", ACTION=="add", SYSFS{address}=="00:e0:81:02:a4:56", IMPORT="/lib/udev/rename_netiface %k eth1"

are obsolete and do not work on 10.3.
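The check for leftover rule files can be done in one go. A minimal sketch, assuming the five file names listed above are the complete set to look for; `find_obsolete_rules` is a hypothetical helper and the demo runs against a temporary directory instead of /etc/udev/rules.d.

```shell
#!/bin/sh
# Print which of the obsolete rule files listed above exist in a directory.
# Assumption: rules_dir defaults to /etc/udev/rules.d; find_obsolete_rules is
# a hypothetical helper name.
find_obsolete_rules() {
    rules_dir=${1:-/etc/udev/rules.d}
    for f in 29-net_trigger_firmware.rules 30-net_persistent_names.rules \
             31-network.rules 80-sysconfig.rules 85-mount-fstab.rules; do
        [ -e "$rules_dir/$f" ] && echo "$rules_dir/$f"
    done
    return 0
}

demo=$(mktemp -d)
: > "$demo/31-network.rules"    # obsolete file left behind
: > "$demo/77-network.rules"    # current file, must not be reported
found=$(find_obsolete_rules "$demo")
echo "$found"
```

Called without an argument, the function looks in /etc/udev/rules.d; empty output means none of the listed files is present.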
Hi Marius, thanks for your help!

Replying to comment #31:

> ls -l *net*
shows the 70-, 75- and 77- files that you listed.

> # grep -Ev "^#|^$" 70-persistent-net.rules
ok, there are entries for each ethX interface.

> rpm -V ...
udev is quiet (no changes), and syslog-ng (not syslog!) shows only
/etc/syslog-ng/syslog-ng.conf which is ok (added custom entries).

> rpm -qf ...
Same packages as yours.

> 77-network.rules
has exactly the required line.

> /sys/devices/pci*/net/ethX
entries exist.

Replying to comment #33:

> rpm -V udev sysconfig
ok, nothing is printed.

None of the 5 udev rules listed in comment #33 exist on my system.

The obsoleted 30-... rule was even documented in the Release Notes, IIRC. So yes, I'm aware of that.

Now, without anything obviously wrong in the udev setup, how can we debug this further?

What are the possible reasons that the "add" action of 77-network.rules is _not_ triggered?
(In reply to comment #34 from Walter Haidinger)
> Hi Marius, thanks for your help!

Thanks for your reports!

> > rpm -V ...
> udev is quiet (no changes), and syslog-ng (not syslog!) shows only
> /etc/syslog-ng/syslog-ng.conf which is ok (added custom entries).

OK. I meant "rpm -V sysconfig" of course, but you verified it below ;-)

> Replying to comment #33:
>
> > rpm -V udev sysconfig
> ok, nothing is printed.

OK.

> None of the 5 udev rules listed in comment #33 exist on my system.
>
> The obsoleted 30-... rule was even documented in the Release Notes, IIRC.
> So yes, I'm aware of that.
>
> Now, without anything obvious wrong in the udev setup,
> how can we debug this further?
>
> What are the possible reasons that the "add" action of 77-network.rules
> is _not_ triggered?

Yes, this is the question here. Maybe it happens because of your 2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not trigger this event without a patch? Hmm... Marc and Peter seem to use SUSE default kernels... We will ask the maintainer and find out - reassigning to him.

The rename rule (70-persistent-net.rules) is called / visible in your boot.msg file.

<6>udev: renamed network interface eth1 to eth0
<6>udev: renamed network interface eth0_rename to eth1

In my case, I can see more log lines when the rename happens (10.3):

<6>eth0 renamed to eth0_rename
<6>eth1 renamed to eth0
<6>udev: renamed network interface eth1 to eth0
<6>eth0_rename renamed to eth1
<6>udev: renamed network interface eth0_rename to eth1

I think it makes sense to enable udev debug logging; perhaps it is visible there. Please set udev_log="debug" in /etc/udev/udev.conf, reboot and provide the logs - they should be in /var/log/boot.msg.

Alternatively, you can also try to trigger it at runtime:

rcnetwork stop
rmmod forcedeth # network card driver
udevcontrol log_priority=debug
rcnetwork start -o boot & # in background, or modprobe on another
modprobe forcedeth # terminal after rcnetwork start...
Why is this bug assigned to me? Please do not reassign bugs without comment. I doubt it is something to fix from my side.
Because udev does not trigger rules from 77-network.rules file, as mentioned in comment #35.
Udevtest shows that it should run. How did you find out that it is not called?
Marius concluded that "ifup eth0 -o hotplug" is never called (see bottom of comment #32). It should, though:

#udevtest /class/net/eth1
This program is for debugging only, it does not run any program, specified by a RUN key. It may show incorrect results, because some values may be different, or not available at a simulation run.
parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-alsa.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-bluetooth.rules' as rules file
parse_file: reading '/etc/udev/rules.d/41-soundfont.rules' as rules file
parse_file: reading '/etc/udev/rules.d/50-udev-default.rules' as rules file
parse_file: reading '/etc/udev/rules.d/51-lirc.rules' as rules file
parse_file: reading '/etc/udev/rules.d/52-irda.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-hpmud.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-libsane.rules' as rules file
parse_file: reading '/etc/udev/rules.d/56-idedma.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-cdrom_id.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-input.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-storage.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-device-mapper.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-md-raid.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-kpartx.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-cd.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-net.rules' as rules file
parse_file: reading '/etc/udev/rules.d/71-multipath.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-cd-aliases-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-persistent-net-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/77-network.rules' as rules file
parse_file: reading '/etc/udev/rules.d/80-drivers.rules' as rules file
parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file
parse_file: reading '/etc/udev/rules.d/95-udev-late.rules' as rules file
import_uevent_var: import into environment: 'PHYSDEVPATH=/devices/pci0000:00/0000:00:08.0'
import_uevent_var: import into environment: 'PHYSDEVBUS=pci'
import_uevent_var: import into environment: 'PHYSDEVDRIVER=forcedeth'
import_uevent_var: import into environment: 'INTERFACE=eth1'
import_uevent_var: import into environment: 'IFINDEX=4'
main: looking at device '/class/net/eth1' from subsystem 'net'
udev_rules_get_name: rule applied, 'eth1' becomes 'eth1'
main: run: 'socket:/org/kernel/dm/multipath_event'
main: run: '/sbin/ifup eth1 -o hotplug'
main: run: 'socket:/org/freedesktop/hal/udev_event'
main: run: 'socket:/org/kernel/udev/monitor'

I'll reboot with udev logging set to debug (from comment #35) and post the logs. Marc, Peter, can you do this too?
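The relevant part of such a udevtest run is the list of "run" entries. As a sketch, they can be filtered out like this; `queued_runs` is a made-up helper, and the only assumption about the format is the "main: run: '<command>'" shape seen in the output above.

```shell
#!/bin/sh
# Extract the programs udevtest reports it would run for a device.
# Assumption: input lines look like "main: run: '<command>'" as in the output above.
queued_runs() {
    sed -n "s/^main: run: '\(.*\)'\$/\1/p"
}

# Three lines copied from the udevtest output above.
sample="main: run: 'socket:/org/kernel/dm/multipath_event'
main: run: '/sbin/ifup eth1 -o hotplug'
main: run: 'socket:/org/freedesktop/hal/udev_event'"
runs=$(printf '%s\n' "$sample" | queued_runs)
printf '%s\n' "$runs" | grep -- '-o hotplug'
```

If the hotplug ifup call is listed, the rule matches in the simulation; as the udevtest banner itself notes, this does not prove that the program is actually run at boot.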
Btw, has somebody looked into /sbin/ifup? Perhaps it is called by udev after all but doesn't do its job...
Q: Who sets environment variable DEVPATH? udev? Required in /sbin/ifup (see comment #31)
(In reply to comment #41 from Walter Haidinger) > Q: Who sets environment variable DEVPATH? udev? > Required in /sbin/ifup (see comment #31) The kernel.
In reply to comment #35: > Yes, this is the question here. It mayby happens, because of your > 2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not > trigger this event without patch? Checked patches of kernel-default-2.6.22.17-0.1.nosrc.rpm to drivers/net/forcedeth.c. There are some, but either already applied in current 2.6.25.4 or not applicable, e.g. patches to printk(). Unless there are other patches to the networking stack, I doubt that the forcedeth driver is the problem. Marc, Peter, which network driver do you use?
(In reply to comment #38 from Kay Sievers)
> Udevtest shows, it should run. How did you find out, it is not called?

I've created a sysconfig package (http://www.suse.de/~mt/openSUSE/10.3/, see comment 26) that enables "bash -vx" debugging in both rcnetwork and ifup:

/sbin/ifup:
R_INTERNAL=1 # internal error, e.g. no config or missing scripts
cd /etc/sysconfig/network || exit $R_INTERNAL
test -f ./config && . ./config
test -f scripts/functions && . scripts/functions || exit $R_INTERNAL
###### scripts/functions creates /dev/shm/sysconfig
. scripts/extradebug

/etc/sysconfig/network/scripts/extradebug:
SCRIPT=${0##*/}
if test -d /dev/shm/sysconfig ; then
    exec 2> /dev/shm/sysconfig/exdeb.${SCRIPT}_$$.$PPID.${SEQNUM}_$1.$2
    [...]
    set -vx
fi

In the logs provided in comment 28 and comment 29, ifup is never called with the "-o hotplug" option, which is how udev starts it in the rule.
Created attachment 218600 [details] boot.msg with udev debug from Peter Hello, Replying to comment #31 and #33 from Marius: I've the same results as in comment #34 from Walter. Replying to comment #35 from Marius and #39 from Walter: I've set udev_log="debug", rebooted, and provided the log /var/log/boot.msg. Replying to comment #43 from Walter: I use network drivers 8139too and 8139cp for eth0 (Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)). and r8169 for eth1 (Realtek Semiconductor Co., Ltd. RTL-8110SC/8169SC Gigabit Ethernet (rev 10)). lsmod says: r8169 48392 0 8139too 44672 0 8139cp 41216 0 mii 22528 2 8139too,8139cp Replying to comment #30 from Walter: me neither -> Since I do not know how to make a private attachment (does somebody?) Cheers Peter
(In reply to comment #43 from Walter Haidinger) > In reply to comment #35: > > Yes, this is the question here. It mayby happens, because of your > > 2.6.25.4-vmhost32 kernel... Perhaps the forcedeth driver does not > > trigger this event without patch? > > Checked patches of kernel-default-2.6.22.17-0.1.nosrc.rpm to > drivers/net/forcedeth.c. There are some, but either already applied in > current 2.6.25.4 or not applicable, e.g. patches to printk(). > Unless there are other patches to the networking stack, I doubt that > the forcedeth driver is the problem. > > Marc, Peter, which network driver do you use? It does not depend on the driver [at least as shipped by suse]. Marc and me are using forcedeth from suse kernel too and it works fine for me but not for Marc. Peter is using r8169 (suse).
BTW: which dhcp clients are you using Marc, Peter, Walter? Please provide the output of:
    rpm -qa | grep dhcp
    grep DHCLIENT_BIN /etc/sysconfig/network/dhcp
I usually use ISC /sbin/dhclient from dhcp-client.rpm. The interfaces are configured statically, though.
Hello, Replying to comment #47 from Marius:
rpm -qa | grep dhcp:
    dhcpcd-1.3.22pl4-287
grep DHCLIENT_BIN /etc/sysconfig/network/dhcp:
    DHCLIENT_BIN=""
But I don't have any ifcfg-<interface> with BOOTPROTO='dhcp'. They are both 'static'. Cheers Peter
Hi, Just as Walter & Peter I use only statically configured interfaces. Cmt 48/49 Just as Peter I have the same dhcp-stuff. Cmt 49 @Marius: (cmt 47): do you really have an updated 10.2 -> 10.3 Open Suse? @all: more than 10 messages today while I was out. I'll try to grab what's in it and to do what is asked, tomorrow. But I'm more worried that I haven't got any security updates for about a month than about this bug, for which some sort of workaround exists. Regards, Marc
I just wanted to make sure, that you aren't using dhclient (ISC dhcp client). until now, ifup-dhcp does not support multiple interfaces (parallel up), but it also complains in /var/log/messages when it detects this.
Hello, I also have this problem and tried to investigate it myself. By putting echo statements here and there, I found that when /etc/init.d/network starts in 'onboot' mode, it expects udev to have renamed the interface by that time; if this isn't the case, ifup will not run for it. For network interfaces for which the udev rule has run ifup, ifup creates a STAMPFILE in /dev/shm/sysconfig/ with 'renamed' in it. But if udev calls this rule before rcnetwork has started, ifup exits with "Service network not started and mode 'auto' -> skipping". Ifup creates that STAMPFILE during the on-boot udev device detection, but when rcnetwork runs after it, /dev/shm/sysconfig/ is somehow empty! This makes rcnetwork wait for the interface's detection. I'm not sure, but maybe that helps.
I really doubt, that this is a udev issue. Reassigning back.
Walter, can you provide the output of the following command? : for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
#for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done lrwxrwxrwx 1 root root 0 Oct 3 07:42 /proc/2558/exe -> /sbin/udevd* #for p in $(pidof udevd) ; do ls -lL /proc/$p/exe ; done -rwxr-xr-x 1 root root 80252 Sep 21 2007 /proc/2558/exe* Do you doubt that I don't have udevd running?
(In reply to comment #55 from Walter Haidinger)
> #for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
> lrwxrwxrwx 1 root root 0 Oct 3 07:42 /proc/2558/exe -> /sbin/udevd*
>                                                        ^^^^^^^^^^^
> Do you doubt that I don't have udevd running?
No. I wanted to verify a bug that we've had on 11.1 Beta, where the udevd from initrd was still running due to a bug in sysvinit. It would appear as "-> /sbin/udevd (deleted)"...
(In reply to comment #52 from Innocenty Enikeew)
> Hello,
>
> I also have this problem and tried to investigate it myself. Putting echo's
> here and there I've found that during /etc/init.d/network starts in 'onboot'
> mode it expects udev to rename interface to that time and if this isn't the
> case ifup wouldn't not run for it. Network interfaces for which udev rule has
> run ifup it creates STAMPFILE in /dev/shm/sysconfig/ with 'renamed' in it. But
> if udev called this rule before rcnetwork had started ifup exits with "Service
> network not started and mode 'auto' -> skipping"".
> Ifup creates that STAMPFILE during on boot udev device detection, but when
> rcnetwork ran after it, somehow /dev/shm/sysconfig/ is empty! And makes
> rcnetwork wait for interface's detection.
I'll verify this.
We have this same problem here with 2 of 4 machines which were upgraded from 10.2 to 10.3. All have static IP addresses, different network cards, different architecture. The difference during the upgrade was that the two working machines were upgraded from "outside" by booting with a network-install-cd. The two machines showing the problems were upgraded from a running system by changing the repos to the 10.3 versions and then using the "factory upgrade"-tool in yast. Recently we upgraded one of these 2 problematic systems from 10.3 to 11.0 by booting with a retail dvd and performing the upgrade, but the problem still persists. (Due to network-dependent services, our workaround after booting is to switch to runlevel 1 and then back to runlevel 3 (these are servers).) One working server and one server with broken network after the upgrade are nearly identical (Dell PowerEdge Servers with BCM5708 Network Cards), so I compared a lot of settings and configuration files without finding any difference. But my knowledge about udev is very limited. If there is something I should compare, I would happily try to help.
(In reply to comment #57 from Kai Lappalainen) > We have this same problem here with 2 from 4 machines which were upgraded from > 10.2 to 10.3. All have static ip-addresses, different network cards, different > architecture. > > The difference during the upgrade was, that the two working machines were > upgraded from "outside" by booting with a network-install-cd. The two machines > showing the problems were upgraded from a running system by changing the repos > to the 10.3 versions and then using the "factory upgrade"-tool in yast. [...] Updates of a running system to a new distribution are AFAIK not supported and yast2 shows AFAIK at least a red warning. This is the case because there are several problems that may occur. One example: the conversion of ifcfg-eth-id-* to ifcfg-ethX needs an already updated udev with an already generated persistent-net rule 70 or a kernel using the new sysfs to work properly. When the conversion happens while the old udev is running, the rule 70 it generates does not exist and the conversion using the old one may result in a different persistent name. Using the old sysfs also does not work, because they differ significantly and on the new system completely different modules may be in use. ... But let's take a look at the rules / config. Perhaps we'll find it. Can you attach a tgz from a working and a not working machine? "tar cvzf /tmp/machine1.tgz /etc/udev /etc/sysconfig/network" In /var/adm/backup/sysconfig are backups - please provide one from each machine that contains the old configuration (ifcfg-eth-<hwdesc> files). Of course, please copy the dirs somewhere first, review and replace any private data with XXXXX / example.com / dummy IPs / ... and attach it with a private flag set. And please also provide the output of "rpm -V udev sysconfig"?
*** Bug 410367 has been marked as a duplicate of this bug. ***
WARNING: The following "udevadm trigger" may cause strange things (interfaces may be set up twice); it is a good idea to reboot after. Can you disable services like databases first, reboot and provide the output of:
    ls -l /dev/shm/sysconfig/
    udevadm trigger --verbose --retry-failed --subsystem-match="net"
    ls -l /dev/shm/sysconfig/
    udevadm trigger --verbose --subsystem-match="net"
    ls -l /dev/shm/sysconfig/
after the network script failed?
(In reply to comment #60 from Marius Tomaschewski) I have no udevadm, so I used udevtrigger
> ls -l /dev/shm/sysconfig/
total 20
-rw-r--r-- 1 root root 5 Dec 17 23:49 config-eth0
-rw-r--r-- 1 root root 3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root 7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root 3 Dec 17 23:49 network
-rw-r--r-- 1 root root 0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:49 tmp
> udevadm trigger --verbose --retry-failed --subsystem-match="net"
(no output)
> ls -l /dev/shm/sysconfig/
total 20
-rw-r--r-- 1 root root 5 Dec 17 23:49 config-eth0
-rw-r--r-- 1 root root 3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root 7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root 3 Dec 17 23:49 network
-rw-r--r-- 1 root root 0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:49 tmp
> udevadm trigger --verbose --subsystem-match="net"
/devices/pci0000:00/0000:00:02.0/0000:06:00.0/0000:07:00.0/0000:08:00.0/0000:09:00.0/net/eth1
/devices/pci0000:00/0000:00:1c.0/0000:04:00.0/0000:05:00.0/net/eth0
/devices/virtual/net/lo
> ls -l /dev/shm/sysconfig/
total 36
-rw-r--r-- 1 root root 5 Dec 17 23:54 config-eth0
-rw-r--r-- 1 root root 3 Dec 17 23:49 config-lo
-rw-r--r-- 1 root root 29 Dec 17 23:54 if-eth0
-rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo
-rw-r--r-- 1 root root 7 Dec 17 23:54 ifup-eth0
-rw-r--r-- 1 root root 7 Dec 17 23:49 ifup-lo
-rw-r--r-- 1 root root 3 Dec 17 23:49 network
-rw-r--r-- 1 root root 8 Dec 17 23:54 new-stamp-2
-rw-r--r-- 1 root root 8 Dec 17 23:54 new-stamp-3
-rw-r--r-- 1 root root 0 Dec 17 23:54 ready-eth0
-rw-r--r-- 1 root root 0 Dec 17 23:54 ready-eth1
-rw-r--r-- 1 root root 0 Dec 17 23:49 ready-lo
drwxr-xr-x 2 root root 60 Dec 17 23:54 tmp
(eth1 is not used/configured!)
Created attachment 260717 [details] Not Working machine /etc/udev /etc/sysconfig/network (In reply to comment #58 from Marius Tomaschewski) > But let's take a look to the rules / config. Perhaps we'll find it. > Can you attach a tgz from a working and a not working machine? > > "tar cvzf /tmp/machine1.tgz /etc/udev /etc/sysconfig/network" machine1 is the not working machine, machine2 is the working machine. > > In /var/adm/backup/sysconfig are backups - please provide one of each > machine that contains the old configuration (ifcfg-eth-<hwdesc> files). the working machine had no ifcfg-eth-<hwdesc> files before the upgrade, but an ifcfg-eth0 file in the oldest sysconfig-backup I've found. (?) It's the same on the second working machine. I'll attach the files anyway. > > Of course, please copy the dirs somewhere first, review and replace > any private data with XXXXX / example.com / dummy IPs / ... and attach > it with a private flag set. > > And please also provide the output of "rpm -V udev sysconfig"? > No output on both machines.
Created attachment 260718 [details] not working machine /etc/sysconfig/network before upgrade
Created attachment 260719 [details] working machine /etc/udev /etc/sysconfig/network ( /etc/sysconfig/network/route intentionally missing)
Created attachment 260720 [details] working machine /etc/sysconfig/network before upgrade
Now, you've provided: machine1/new machine1/old/sysconfig machine2/new/udev machine2/new/sysconfig machine2/old/sysconfig In machine1/old archive are ifcfg-eth-id-* files - ok, but: The new files machine1 would be most interesting ... Can you provide them? There is afaik no backup of udev rules, except of the rule 30 that sysconfig creates during conversion... The machine2/new/udev/rules.d/* looks like a fresh install, not like an update. There is no converted & disabled rule 30. I've tested it yesterday - I've booted from CD and updated; the network starts fine after, rule 30 was correctly converted into a rule 70. I'll retest making an update in a running system when I'm back in work next year... (In reply to comment #61 from Kai Lappalainen) > (In reply to comment #60 from Marius Tomaschewski) > > I have no udevadm, so I used udevtrigger sure, sorry. > > ls -l /dev/shm/sysconfig/ > total 36 > -rw-r--r-- 1 root root 5 Dec 17 23:54 config-eth0 > -rw-r--r-- 1 root root 3 Dec 17 23:49 config-lo > -rw-r--r-- 1 root root 29 Dec 17 23:54 if-eth0 > -rw-r--r-- 1 root root 27 Dec 17 23:49 if-lo > -rw-r--r-- 1 root root 7 Dec 17 23:54 ifup-eth0 > -rw-r--r-- 1 root root 7 Dec 17 23:49 ifup-lo > -rw-r--r-- 1 root root 3 Dec 17 23:49 network > -rw-r--r-- 1 root root 8 Dec 17 23:54 new-stamp-2 > -rw-r--r-- 1 root root 8 Dec 17 23:54 new-stamp-3 > -rw-r--r-- 1 root root 0 Dec 17 23:54 ready-eth0 > -rw-r--r-- 1 root root 0 Dec 17 23:54 ready-eth1 > -rw-r--r-- 1 root root 0 Dec 17 23:49 ready-lo > drwxr-xr-x 2 root root 60 Dec 17 23:54 tmp > > (eth1 is not used/configured!) Then "all is fine", the new-stamp-* are created after explicit udevtrigger run for both interfaces (correct, they exists) and because the network was already started (network file exists), also configured (eth0).
(In reply to comment #66 from Marius Tomaschewski) > Now, you've provided: > > machine1/new Yes, as attachment 260717 [details] to comment 62. > In machine1/old archive are ifcfg-eth-id-* files - ok, but: > > The new files machine1 would be most interesting ... Can you > provide them? There is afaik no backup of udev rules, except > of the rule 30 that sysconfig creates during conversion... It's not clear to me what's missing? Could you please explain? > > The machine2/new/udev/rules.d/* looks like a fresh install, > not like an update. There is no converted & disabled rule 30. Sorry, I've checked our logs. It turned out, that this server was (inplace) upgraded at September, 13th 2007 from 10.2 to 10.3 *factory* before doing the upgrade to 10.3 final at October, 4th 2007. So I'm afraid this machine is not comparable - other than maybe there was a change between Sept. 13th and Oct. 4th which broke the conversion, because this machine works? The same is true for a second working machine here, which was also upgraded before to 10.3 factory on Sept. 13th and also works. In the /etc/sysconfig/network backup from Oct., 4th there is an empty file called "__convert_hwdesc_to_iface__". Other than that I see no significant difference to the one provided.
Do you (or somebody who can reproduce the problem) have a line like this in /etc/fstab? : tmpfs /dev/shm tmpfs defaults,size=132M 0 0
On both (not working) machines I have tmpfs /dev/shm tmpfs defaults 0 0 in /etc/fstab. (no size specified)
Is the problem fixed when you remove this line and reboot?
I have such a line too: tmpfs /dev/shm tmpfs auto,size=384m,mode=1777 0 0 I cannot shutdown the server for tests atm, though. Sorry. However, I'm curious: How can this make difference? It should not.
*Bingo*! :-) After removing the line on the two affected servers both machines were able to boot with working network! As I understand it, it's because of the mount during boot the flags in /dev/shm/sysconfig have been "overwritten"?
Hello, sorry for the late answer. I have the same result. My /etc/fstab contains: tmpfs /dev/shm tmpfs size=1G 0 0 When I remove (comment out) the line, yes, bingo, that's it: the problem is fixed! But I need the line... Cheers Peter
Yes, the problem is that first the udev tmpfs is mounted on /dev, udev gets started and creates the /dev/shm/sysconfig/new-stamp-$INDEX interface marks via "ifup $IF -o hotplug". But then a separate tmpfs gets mounted (by boot.localfs) _over_ /dev/shm and _hides_ them -- and the network script keeps waiting for them to appear. When you move /dev/shm/sysconfig somewhere else and "umount /dev/shm", you will find the new-stamp-* files again... The credit goes to Michael Monnerie in bug 435880, who found out that there is sometimes a separate /dev/shm mount. I have to find out who is creating it (Oracle software? see bug 355786) and find a way to fix it. I'll set all the bugs as duplicates tomorrow.
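To check whether a machine is affected by the overmount described above, one can look for /dev/shm as its own entry in the mount table. The following is only a minimal sketch: the function name has_separate_mount is made up for illustration and is not part of sysconfig or udev, and it assumes the usual /proc/mounts layout where the second field is the mount point.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysconfig): succeed if the mount table
# lists the given directory as a mount point of its own.
# $1 = mount point to look for, $2 = mount table (defaults to /proc/mounts)
has_separate_mount() {
    mountpoint=$1
    table=${2:-/proc/mounts}
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v mp="$mountpoint" '$2 == mp { found = 1 } END { exit !found }' "$table"
}

if has_separate_mount /dev/shm; then
    echo "separate mount on /dev/shm: state files written earlier may be hidden"
else
    echo "no separate /dev/shm mount"
fi
```

Note that this only reports the current mount state; on distributions that always mount /dev/shm separately the result says nothing about this particular bug.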
(In reply to comment #73 from Peter Küppers) > Hello, > > sorry for the late answer > > I've the same result: > > my /etc/fstab contains: > tmpfs /dev/shm tmfs size=1G 0 0 > > when I remove (uncomment) the line, yes bingo that's it: the problem is fixed! > but I need the line... No, you don't need it: /dev is already a tmpfs.
When you (some software) need a mount point there, you can fake it with mount -obind /dev/shm /dev/shm
Ah, sounds like I was suffering from the /dev/shm problem too (bug #410367). ATI tell you to mount a tmpfs at /dev/shm in order to make 3D support work in their fglrx drivers. I gave up on ATI's drivers some time ago, which probably explains why I've not seen this bug for a while.
*** Bug 435880 has been marked as a duplicate of this bug. ***
(In reply to comment #77 from Neil Murphy) > Ah, sounds like I was suffering from the /dev/shm problem too (bug #410367). > > ATI tell you to mount a tmpfs at /dev/shm in order to make 3d support work in > their fglrx drivers. > > I've gave up on ATI's drivers some time ago which probably explains why I've > not seen this bug for a while. Well, ATI, VMWARE, Oracle. By default there are up to 2G in /dev/shm: LANG=C df -h /dev/shm Filesystem Size Used Avail Use% Mounted on udev 2.0G 264K 2.0G 1% /dev but some applications may need more.
*** Bug 355786 has been marked as a duplicate of this bug. ***
*** Bug 435189 has been marked as a duplicate of this bug. ***
Is it possible to specify the size of the /dev fs in /etc/fstab?
(In reply to comment #83 from Marius Tomaschewski) > Is is possible to specify the size of the /dev fs in /etc/fstab? It may be possible by adding size=, but making it larger than the default (half the RAM size) just papers over some utterly broken applications, which should be fixed instead.
(In reply to comment #84 from Kay Sievers) > (In reply to comment #83 from Marius Tomaschewski) > > Is is possible to specify the size of the /dev fs in /etc/fstab? > > It may be possible by adding size=, but making it larger than the default (half > the RAM size) just papers over some utterly broken applications, which should > be fixed instead. Yes, half of RAM size (2G was wrong / my machine only :). Well, but when you've 8G RAM and run only e.g. a database it may be required to set it to e.g. 6G. IMO we have at least two choices: a) change the init / network scripts to not to use it b) make it adjustable (/etc/init.d/boot mounts /dev using "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going to test if the size is used when I add /dev with a different size to /etc/fstab)... In case of a) -- which path we can use instead - /dev/.tmp? /var may be mounted on a separate disk so it can't be used too.
(In reply to comment #85 from Marius Tomaschewski) > a) change the init / network scripts to not to use it > b) make it adjustable (/etc/init.d/boot mounts /dev using > "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going > to test if the size is used when I add /dev with > a different size to /etc/fstab)... No, the size for /dev isn't used...
Michael Taylor: does Oracle work when you remove the separate /dev/shm and call "mount -oremount,size=7g /dev" after booting instead?
Michael, you can add the mount -oremount call to "/etc/init.d/boot.local".
Marius, thanks for hunting this bug so relentlessly... ;-) I'd favor a) because b) would be optional then but still nice to have anyways. Instead of /dev/.tmp use a dynamically created path, with some unique magic which is persistent between the script calls, like (bad) e.g. /dev/.network-config.`uname -r` It would be nice if the directory gets cleaned up (say, removed) after the network is completely configured (provided that is possible) but /dev/ gets cleaned up upon reboot anyways.
(In reply to comment #85 from Marius Tomaschewski) > b) make it adjustable (/etc/init.d/boot mounts /dev using > "mount -n -t tmpfs -o mode=0755 udev /dev" -- I'm going > to test if the size is used when I add /dev with > a different size to /etc/fstab)... It should honor the fstab options by re-mounting the already mounted filesystem. We do that for some other filesystems too. > In case of a) -- which path we can use instead - /dev/.tmp? > /var may be mounted on a separate disk so it can't be used too. Better use some name private to your package, and delete the directory after it is no longer needed when the rootfs is available. We should not put new generic names like ".tmp" in /dev, and suggest people to share that.
Created attachment 264188 [details] boot.localfs (11.1) patch to honour /dev options (size) from /etc/fstab Rüdiger, what do you think about this patch to honor a size option?
The while loop is not really needed, since the mount -fv -t tmpfs udev /dev call before added /dev to /etc/mtab already...
Hi Marius, Thank you for your work on this issue, and finding the tmpfs root cause. Since creating this bug, I have moved my Oracle machines to use HugePages, which are incompatible with the tmpfs construct, so you can close the bug. Here is the Oracle note. I have removed the parameter that made use of /tmpfs and my network interfaces are now working. Thanks, -Michael Subject: MEMORY_TARGET/MEMORY_MAX_TARGET And Linux Hugepages Doc ID: 473165.1 Type: HOWTO Modified Date : 26-MAY-2008 Status: MODERATED In this Document Goal Solution This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review. Applies to: Oracle Server - Enterprise Edition - Version: 11.1.0.6 Linux x86 Goal Using MEMORY_TARGET/MEMORY_MAX_TARGET for managing memory in an 11g database on Linux. When trying to check if Hugpages are being used by running the command (grep Huge /proc/meminfo), can see that Hugepages are not being used. But, When using SGA_MAX_SIZE to manage the memory in the same database, can see by using the same command (grep Huge /proc/meminfo) that Hugepages are being used. Does MEMORY_TARGET/MEMORY_MAX_TARGET make use of Linux Hugepages? Solution Automatic Memory Management (MEMORY_TARGET/MEMORY_MAX_TARGET) cannot be used in conjunction with Hugepages on Linux. This is because its memory segments are memory mapped files in /dev/shm.
Hello Marius, on my server, I used the /etc/fstab line "tmpfs /dev/shm tmpfs size=1G 0 0" cause it was recommended for a SAP Testdrive (Netweaver 2004s with MaxDB 7.6). There are various sapnotes with hints on tmpfs and SAP memory management for Linux systems. In the SAP Note 941735 (SAP memory management for 64-bit Linux systems), I found a solution to customize my system without the line in /etc/fstab. But there is another hint in this SAP Note: ... TMPFS With the STD implementation (SAP profile parameter es/implementation=std), the SAP Extended Memory is no longer stored in the TMPFS (under /dev/shm). However, the TMPFS is required by the Virtual Machine Container (VMC). For this reason, we still recommend the same configuration of the TMPFS: 75% (RAM + Swap) is still recommended as the size. ... So I understand, that in my case the "tmpfs /dev/shm tmpfs size=1G 0 0" is still relevant. If not, is this a question for Linux (so bugzilla.novell and SLES) or more for SAP? Question would be "How to configure the tmpfs for the VMC otherwise?" With the patch you recommended or "hard" size= in /etc/init.d/boot or...? Cheers Peter
(In reply to comment #94)
> Hello Marius,
Hi!
> on my server, I used the /etc/fstab line "tmpfs /dev/shm tmpfs size=1G 0 0"
> cause it was recommended for a SAP Testdrive (Netweaver 2004s with MaxDB 7.6).
>
> There are various sapnotes with hints on tmpfs and SAP memory management for
> Linux systems. In the SAP Note 941735 (SAP memory management for 64-bit Linux
> systems), I found a solution to customize my system without the line in
> /etc/fstab. But there is another hint in this SAP Note:
> ...
> TMPFS
> With the STD implementation (SAP profile parameter es/implementation=std), the
> SAP Extended Memory is no longer stored in the TMPFS (under /dev/shm). However,
> the TMPFS is required by the Virtual Machine Container (VMC). For this reason,
> we still recommend the same configuration of the TMPFS:
> 75% (RAM + Swap) is still recommended as the size.
> ...
> So I understand, that in my case the "tmpfs /dev/shm tmpfs size=1G 0 0" is
> still relevant. If not, is this a question for Linux (so bugzilla.novell and
> SLES) or more for SAP?
> Question would be "How to configure the tmpfs for the VMC otherwise?"
> With the patch you recommended or "hard" size= in /etc/init.d/boot or...?
Since udev (something like 10.x), the complete /dev is a tmpfs. Before, /dev was (usually) a normal directory on the root fs with static device files and only the /dev/shm directory was a tmpfs. Because the complete /dev is a tmpfs (udev & init scripts use very little of it [254K on my system]), it is no longer required to create a separate tmpfs for /dev/shm. But this does not mean that it is never necessary to adjust the default size (50% of RAM). The patch in comment #91 allows specifying the size directly for /dev by adding an /etc/fstab line like:
    udev /dev tmpfs size=3g,mode=755 0 0
and removing the /dev/shm mount entry. Whether setting it to "75% (RAM + Swap)" (as recommended above) makes sense or not is a completely different issue.
The question is now what the maintainer of /etc/init.d/boot.localfs says about the patch from comment #91 or the alternative one below. [Because "mount -fv -t tmpfs udev /dev" writes an /etc/mtab entry, it is not really required to check /etc/fstab before the remount] Ruediger?
=============
--- /etc/init.d/boot.localfs
+++ /etc/init.d/boot.localfs 2009-01-12 10:48:43.000000000 +0100
@@ -239,6 +239,8 @@
 fi
 done < /proc/filesystems
 mount -fv -t tmpfs udev /dev
+ # remount /dev too when there may be options in fstab
+ mount -oremount /dev
 rc_status
 if test ! -d /sys/block/loop0 ; then
 /sbin/modprobe loop
=============
On jan 7 I updated my fstab to remove the line for tmpfs, and I restored /etc/init.d/network to the original state. Since then I have had a working system and eth0 was functioning as it should. I was very pleased to see this bug being resolved with such a clever solution. That is, until now. Although my fstab is not changed since the 7., all of a sudden the bug is back. This afternoon nothing was wrong, but this evening again I have no network without manually restarting rcnetwork. The only difference today that I can remember are some security fixes for Cups, that came my way. Marc
Hi, Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network. But it did not help this time! I even restored fstab with the offending tmpfs line. Needless to say this did not help either. Basically I have no clue left. The problem is back, and there is no cheap fix anymore. Am I the only one with this recurring problem? Marc
(In reply to comment #98) > Hi, Hi! > Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network. > But it did not help this time! > I even restored fstab with the offending tmpfs line. > Needless to say this did not help either. > > Basically I have no clue left. The problem is back, and there is no cheap fix > anymore. > Am I the only one with this reoccuring problem? Please take a look at the output of "cat /proc/mounts" to see if there is a separate /dev/shm mounted. Then check whether there are "new-stamp-$ID" files in /dev/shm/sysconfig matching your network cards' interface IDs in the "ip link show" output ("3: eth1:" => check if new-stamp-3 exists).
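The second check described above (matching the interface index from "ip link show" against the new-stamp-$ID files) can be sketched as a small loop. This is only an illustration: the function name missing_stamps is hypothetical, it assumes the one-line "ip -o link show" format that starts with "INDEX: NAME:", and it skips lo because no stamp file for lo shows up in the listings in this report.

```shell
#!/bin/sh
# Hypothetical helper: read "ip -o link show" style lines on stdin and print
# every interface whose new-stamp-<index> file is missing from directory $1.
missing_stamps() {
    stampdir=$1
    # Splitting on ':' and ' ' turns "3: eth1: <...>" into index=3, name=eth1.
    while IFS=': ' read -r index name _rest; do
        [ "$name" = "lo" ] && continue
        [ -e "$stampdir/new-stamp-$index" ] || echo "$name (index $index)"
    done
}

# The stamp directory is the one named in this report.
ip -o link show 2>/dev/null | missing_stamps /dev/shm/sysconfig
```

An interface listed here is not necessarily broken (helper devices may never get a stamp); a missing stamp for an interface that has an ifcfg file is the case worth looking at.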
Marius, There is no /dev/shm mounted:
root@Planhold:/home/marc> cat /proc/mounts
rootfs / rootfs rw 0 0
udev /dev tmpfs rw 0 0
/dev/mapper/nvidia_eicaegch_part1 / reiserfs rw 0 0
proc /proc proc rw 0 0
sysfs /sys sysfs rw 0 0
debugfs /sys/kernel/debug debugfs rw 0 0
devpts /dev/pts devpts rw 0 0
/dev/mapper/nvidia_eicaegch_part9 /home ext3 rw,data=ordered 0 0
/dev/mapper/nvidia_eicaegch_part3 /boot ext3 rw,data=ordered 0 0
/dev/mapper/nvidia_eicaegch_part8 /bu xfs rw 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
securityfs /sys/kernel/security securityfs rw 0 0
In sysconfig new-stamp-2 and new-stamp-4 exists
root@Planhold:/home/marc> ls /dev/shm/sysconfig
config-eth0 config-lo config-wlan0 new-stamp-2 new-stamp-4 ready-lo
root@Planhold:/home/marc> ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:1b:fc:df:9e:11 brd ff:ff:ff:ff:ff:ff
3: wmaster0: <BROADCAST,MULTICAST> mtu 1500 qdisc ieee80211 qlen 1000
    link/ieee802.11 00:80:5a:4e:f8:ea brd ff:ff:ff:ff:ff:ff
4: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
    link/ether 00:80:5a:4e:f8:ea brd ff:ff:ff:ff:ff:ff
root@Planhold:/home/marc> ls /dev/shm/sysconfig/new-stamp-2
/dev/shm/sysconfig/new-stamp-2
root@Planhold:/home/marc> cat /dev/shm/sysconfig/new-stamp-2
renamed
root@Planhold:/home/marc> cat /dev/shm/sysconfig/new-stamp-4
renamed
root@Planhold:/home/marc> cat /dev/shm/sysconfig/config-eth0
eth0
I don't use wlan0. But as said I haven't changed anything except network and fstab last week. 'network' is original, fstab has a comment in front of tmpfs. It worked for 7 days, but not anymore. Your help is appreciated. Marc
BTW: I've created a separate bug 466718 for the "apply size to /dev fs" issue. (In reply to comment #100) > Marius, > > There is no /dev/shm mounted: [...] OK. > In sysconfig new-stamp-2 and new-stamp-4 exists > > root@Planhold:/home/marc> ls /dev/shm/sysconfig > config-eth0 config-lo config-wlan0 new-stamp-2 new-stamp-4 ready-lo Hmm... strange - it should work then. Please reinstall most recent sysconfig and udev RPMs for your distribution, verify the install using "rpm -V sysconfig udev" and reboot (true reboot). When it still happens, please enable the ". scripts/extradebug" line in the /sbin/ifup and /etc/init.d/network scripts and reboot (true reboot). Depending on the distribution/sysconfig version it should create "bash -x" trace files either in /tmp/exdeb.* or in /dev/shm/sysconfig/exdeb.*. Please tar them together with the /dev/shm/sysconfig files (tar cvzf bug335486-exdeb.tgz /tmp/exdeb.* /dev/shm/sysconfig) and attach to this bug.
(In reply to comment #98) > Hi, > > Because the problem reoccurs, I re-applied the 'fix' in /etc/init.d/network. > But it did not help this time! > I even restored fstab with the offending tmpfs line. > Needless to say this did not help either. > > Basically I have no clue left. The problem is back, and there is no cheap fix > anymore. > Am I the only one with this reoccuring problem? > > Marc Sorry, I'm late again with my answer. I installed the latest update packages on my server (openSUSE 10.3), but the solution (remove tmpfs from /etc/fstab) still works on my server. cat /proc/mounts says that there is no separate /dev/shm mounted. ip link show says there are no "new-stamp-$ID". In /dev/shm/sysconfig, new-stamp-2 and new-stamp-3 exist (and 'cat' shows 'renamed' for both). Cheers Peter
Created attachment 265808 [details] Output from scripts/extradebug Marius, My first verify with rpm showed something hopeful: root@Planhold:/home/marc> rpm -V sysconfig udev S.5....T /etc/init.d/network S.5....T /etc/sysconfig/network/scripts/ifup-wireless This disappeared after reinstalling. Unfortunately the bug did not... See the logfiles, Hopefully something evil will reveal itself. Marc
Marc wrote: > fstab has a comment in front of tmpfs. Marc, try completely deleting that entry. Even with a comment in front, it didn't work for me. Removing the line helped.
Michael, Michael wrote: > Marc, try completely deleting that entry. Even with a comment in front, it > didn't work for me. Removing the line helped. This looks like wizardry. If that helps, it's magic. But it did not help for me, I am sorry to say. I will try to upgrade to OpenSuse 11.1, maybe that is more rewarding. Marc
Hi, I just upgraded to OpenSuse 11.1, and that was a good work around for this bug. All is well that ends well. Love, Marc
(In reply to comment #103) > Created an attachment (id=265808) [details] > Output from scripts/extradebug > > Marius, > > My first verify with rpm showed something hopeful: > root@Planhold:/home/marc> rpm -V sysconfig udev > S.5....T /etc/init.d/network > S.5....T /etc/sysconfig/network/scripts/ifup-wireless > This disappeared after reinstalling. > > Unfortunately the bug did not... > See the logfiles, Hopefully something evil will reveal itself. > > Marc The bug did not go away because you have /etc/sysconfig/network/ifcfg-eth6 and /etc/sysconfig/network/ifcfg-eth7 files on your system, and the network script then waits for this non-existent hardware to appear. A "rm /etc/sysconfig/network/ifcfg-eth{6,7}" solves the problem without a need to update to 11.1.
Resetting Bug Prio (while comment #5) back to P5 as assigned by bnc-team-screening.
The problem that the network script waits for an interface until timeout (WAIT_FOR_INTERFACES in /etc/sysconfig/network/config) occurs under two conditions:
a) There are ifcfg files (interface configurations) for hardware that does not exist (any more), as in comment #103 and comment #107.
   solution => Delete the superfluous ifcfg files or set STARTMODE='off' or 'manual' in these files.
b) There is a separate tmpfs mounted on /dev/shm (via /etc/fstab) that hides the sysconfig udev rule state files created before /dev/shm got mounted.
   solution => Remove the /dev/shm mount entry from /etc/fstab. A separate /dev/shm is not required, because /dev is already a tmpfs with a maximum size of 50% of RAM. If 50% of RAM is not sufficient for the tmpfs because of special requirements of some software, the size can be adapted by adding an /etc/init.d/boot.local line like:
       /bin/mount -oremount,size=3g /dev
   to remount it with the desired size (3GB in this example).
I'm resolving this bug as WONTFIX, because it is not sufficient to just change the /dev/shm/sysconfig path [used for many years] in sysconfig to something else: the change also affects several other packages and may break custom if-up.d/if-down.d scripts. We'll address this issue in a later/next openSUSE version.
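For condition a), a rough way to spot candidate files is to compare the ifcfg-* names against the interfaces the kernel currently knows. The sketch below is an assumption-laden illustration, not a sysconfig tool: stale_ifcfg is a made-up name, and it assumes that /sys/class/net contains one entry per existing interface. Configurations for interfaces that are only created at network start (bridges, VLANs, dial-up) and editor backup files would also be listed, so the output is a list of files to review, not to delete blindly.

```shell
#!/bin/sh
# Hypothetical helper: print ifcfg-<name> files in $1 that have no matching
# interface entry in $2 (defaults: /etc/sysconfig/network, /sys/class/net).
stale_ifcfg() {
    cfgdir=${1:-/etc/sysconfig/network}
    netdir=${2:-/sys/class/net}
    for cfg in "$cfgdir"/ifcfg-*; do
        [ -e "$cfg" ] || continue      # the glob matched nothing
        name=${cfg##*/ifcfg-}          # strip directory and "ifcfg-" prefix
        [ -e "$netdir/$name" ] || echo "$cfg"
    done
}

stale_ifcfg
```

For a file reported here that belongs to removed hardware, either of the two remedies named in a) applies: delete it, or set STARTMODE='off' or 'manual' in it.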
Marius, Thank you for your analysis (comment #107). In the meantime I have upgraded to 11.1, and the extra ifcfg-eth* files are gone. Usually I update my system as soon as an update is available, but this time I hesitated because I have a two-seat configuration that I did not want to lose. After upgrading it indeed took some time to get the configuration in order, but now I'm glad I took the step. Thanks for all your effort, and hopefully Linux will rule one day. Marc
As the initial bug reporter, I'd also like to thank Marius for hunting this bug so persistently. Since the problem was identified and we have workarounds in comment #109, I guess we can live with a resolution of WONTFIX for the older distributions.
*** Bug 516769 has been marked as a duplicate of this bug. ***
*** Bug 497924 has been marked as a duplicate of this bug. ***