Bug 355786

Summary: Network fails to start with multiple NICs
Product: [openSUSE] openSUSE 10.3 Reporter: Michael Taylor <michael.d.taylor>
Component: Network Assignee: Marius Tomaschewski <mt>
Status: RESOLVED DUPLICATE QA Contact: E-mail List <qa-bugs>
Severity: Major    
Priority: P5 - None CC: aschnell, cihlarov, mt, ro
Version: Final   
Target Milestone: ---   
Hardware: i686   
OS: openSUSE 10.3   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on:    
Bug Blocks: 392646    
Attachments: /var/log/boot.msg
/var/log/messages
/var/log/hwinfo.txt
50-udev-default.rules

Description Michael Taylor 2008-01-23 22:10:48 UTC
I am on openSUSE 10.3 final with the latest online updates on several machines, including a new Thinkpad T61p, and all network interfaces began failing to start in the past few weeks, with the following messages in the startup log:

Setting up network interfaces:
lo
lo IP address: 127.0.0.1/8
Checking for network time protocol daemon (NTPD): unused
doneWaiting for mandatory devices: eth1 __NSC__
20 <notice>checkproc: /opt/kde3/bin/kdm 3594
18 17 16 15 14 12 11 10 9 8 7 5 4 3 2 1 0
eth1 device: Intel Corporation 82566MM Gigabit Network Connection (rev 03)
eth1 is down
failed eth1 interface could not be set up until now
failedSetting up service network . . . . . . . . . . . . . . . .failed

Once I log in, I can sux -, and do an ifup eth1 to get the network working. It is a pain because I have a bunch of NFS client mounts I have to manually remount after starting up. I have 3 machines with the same problem. I am using the standard ifup with eth1 configured to start at boot time and wlan1 to start manually.  This worked fine until about 2 weeks ago.
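
The manual recovery described above (bring the interface up, then remount the NFS shares) could be wrapped in a small helper. This is only a sketch: the function names `run` and `net_recover` and the `DRY_RUN` switch are made up for illustration, and it assumes the NFS shares are listed in /etc/fstab so that `mount -a -t nfs` picks them up.

```shell
# Run a command, or just print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bring up one interface, then mount all NFS entries from /etc/fstab.
# Must be run as root unless DRY_RUN=1 is set.
net_recover() {
    run ifup "$1" && run mount -a -t nfs
}

# Example: DRY_RUN=1 net_recover eth1
```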

rcnetwork start works fine after the machine has booted:

rcnetwork start
Hint: you may set mandatory devices in /etc/sysconfig/network/config
Setting up network interfaces:
    lo        
    lo        IP address: 127.0.0.1/8   
Checking for network time protocol daemon (NTPD):                    doneed
    eth1      device: Intel Corporation 82566MM Gigabit Network Connection (rev 03)
    eth1      IP address: 192.168.1.106/24   
Checking for network time protocol daemon (NTPD):                    doneed
    wlan1     device: Intel Corporation PRO/Wireless 4965 AG or AGN Network Connection (rev 61)
    wlan1     Startmode is 'manual'                                  skipped
Setting up service network  .  .  .  .  .  .  .  .  .  .  .  .  .  . done.
SuSEfirewall2: Setting up rules from /etc/sysconfig/SuSEfirewall2 ...
SuSEfirewall2: using default zone 'ext' for interface wmaster0
SuSEfirewall2: batch committing...
SuSEfirewall2: Firewall rules successfully set

Networking starts fine on OpenSuse 10.3 machines with 1 NIC.  It fails on machines with 2 NICs like laptops.

Not sure if bug 335486 is similar or not.
Comment 1 Klara Cihlarova 2008-01-30 14:48:27 UTC
Can you please attach /var/log/messages, /var/log/boot.msg, and output of hwinfo? Thanks!
Comment 2 Michael Taylor 2008-01-30 15:21:17 UTC
Created attachment 192349
/var/log/boot.msg
Comment 3 Michael Taylor 2008-01-30 15:22:20 UTC
Created attachment 192351
/var/log/messages
Comment 4 Michael Taylor 2008-01-30 15:22:37 UTC
Created attachment 192352
/var/log/hwinfo.txt
Comment 5 Christian Zoz 2008-04-14 16:04:38 UTC
Sorry, I did not find any time to work on that. Marius, can you take it, please?
Comment 6 Marius Tomaschewski 2008-05-09 08:25:23 UTC
Michael,
can you update to most recent sysconfig and kernel packages first and
verify the installation using "rpm -V" as I've described in Bug #335486,
comment #11?

http://download.opensuse.org/update/10.3/rpm/i586/kernel-bigsmp-2.6.22.17-0.1.i586.rpm
http://download.opensuse.org/update/10.3/rpm/i586/sysconfig-0.70.2-4.2.i586.rpm

I can not reproduce your problems. Yes, the bug looks similar to #335486,
but in your case it seems to be hardware specific, because you write it
worked some time before your report.

Does the problem also happen, when you _downgrade_ the kernel to for example:

http://download.opensuse.org/update/10.3/rpm/i586/kernel-bigsmp-2.6.22.12-0.1.i586.rpm
Comment 7 Marius Tomaschewski 2008-05-20 14:27:08 UTC
No response. Looks like it would work with most recent updates...

If it does not work, please reopen and provide the info requested
in comment #6 and Bug #335486, comment #11.
Comment 8 Michael Taylor 2008-05-21 23:57:35 UTC
No change after removing the extra file and reinstalling sysconfig.

rpm -V sysconfig
.......T  c /etc/sysconfig/network/ifroute-lo
plato:~ # vi /etc/sysconfig/network/ifroute-lo
plato:~ # rm /etc/sysconfig/network/ifroute-lo

It looks like they are still working on bug 335486.
Comment 9 Marius Tomaschewski 2008-05-22 10:23:53 UTC
.
Comment 10 Marius Tomaschewski 2008-05-22 10:25:28 UTC
*** Bug 392646 has been marked as a duplicate of this bug. ***
Comment 11 Marius Tomaschewski 2008-05-26 14:00:30 UTC
Michael, can you verify your udev and sysconfig installation using
"rpm -V udev sysconfig"? See also bug #335486, comments #31-#33.
Comment 12 Marius Tomaschewski 2008-05-27 14:22:33 UTC
Kay, this seems to be the same problem as in bug 335486.

I just keep it separately, because Michael says that it was
working fine before (December/January? Michael?)...
Comment 13 Kay Sievers 2008-05-27 14:35:41 UTC
Why do I get this bug assigned? What do you expect from me? I doubt it is udev related.
Comment 14 Michael Taylor 2008-05-27 15:13:58 UTC
/etc/sysconfig/network/ifroute-lo keeps appearing from sysconfig even though I remove it and reinstall sysconfig.
I modified one line of /etc/udev/rules.d/50-udev-default.rules to get QEMU/KVM working.

This was working fine for a few months after 10.3 GM was released.

The interesting thing is one machine had 3 NICs and showed the problem.  I removed the two extra NIC cards and still have the problem.  All of my 10.3 machines with a single NIC do not have the problem.
Comment 15 Marius Tomaschewski 2008-05-27 15:35:38 UTC
(In reply to comment #13 from Kay Sievers)
> Why do I get this bug assigned? What do you expect from me?
> I doubt it is udev related.

For whatever reason, udev seems not to execute the rule:

SUBSYSTEM=="net", ACTION=="add", RUN+="/sbin/ifup $env{INTERFACE} -o hotplug"

from 77-network.rules file (see also bug 335486, where I already
requested udev debug) and you're the udev maintainer.

It happens with different hardware (Intel here). While it works
fine for me with the forcedeth driver, it does not work for at
least 2 customers using forcedeth too.

Michael:
please provide the output of "rpm -V udev" and all rule files
listed in this output.
Comment 16 Michael Taylor 2008-05-27 16:21:31 UTC
Created attachment 218366
50-udev-default.rules
Comment 17 Michael Taylor 2008-05-27 16:21:53 UTC
# rpm -V udev
..5....T  c /etc/udev/rules.d/50-udev-default.rules
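
For reading this kind of output: in the `rpm -V` flag string a "5" marks a content digest mismatch and a "T" a changed modification time, while the "c" marks a configuration file. So the line above reports edited content, whereas the earlier ifroute-lo line (only "T") reports a timestamp change. A rough filter for the files whose content really changed might look like this; the function name `digest_changed` is hypothetical and paths containing spaces are not handled.

```shell
# Print paths whose content digest changed, from `rpm -V` output on stdin.
# The first field is the flag string; a "5" in it marks a digest mismatch.
digest_changed() {
    awk '$1 ~ /5/ { print $NF }'
}

# Example: rpm -V udev sysconfig | digest_changed
```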
Comment 18 Kay Sievers 2008-05-27 16:46:39 UTC
Please run:
  udevtest /class/net/ethX
and check, if /sbin/ifup is printed?
Comment 19 Michael Taylor 2008-05-27 16:56:40 UTC
# udevtest /class/net/eth1
This program is for debugging only, it does not run any program,
specified by a RUN key. It may show incorrect results, because
some values may be different, or not available at a simulation run.

parse_file: reading '/etc/udev/rules.d/05-udev-early.rules' as rules file
parse_file: reading '/etc/udev/rules.d/10-local.rules' as rules file
add_to_rules: invalid ATTRS operation
add_to_rules: invalid rule '/etc/udev/rules.d/10-local.rules:1'
parse_file: reading '/etc/udev/rules.d/10-smwan.rules' as rules file
add_to_rules: invalid KERNEL operation
add_to_rules: invalid rule '/etc/udev/rules.d/10-smwan.rules:9'
add_to_rules: invalid KERNEL operation
add_to_rules: invalid rule '/etc/udev/rules.d/10-smwan.rules:11'
add_to_rules: invalid KERNEL operation
add_to_rules: invalid rule '/etc/udev/rules.d/10-smwan.rules:13'
parse_file: reading '/etc/udev/rules.d/40-alsa.rules' as rules file
parse_file: reading '/etc/udev/rules.d/40-bluetooth.rules' as rules file
parse_file: reading '/etc/udev/rules.d/50-udev-default.rules' as rules file
parse_file: reading '/etc/udev/rules.d/51-lirc.rules' as rules file
parse_file: reading '/etc/udev/rules.d/52-irda.rules' as rules file
parse_file: reading '/etc/udev/rules.d/52-usx2yaudio.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-hpmud.rules' as rules file
parse_file: reading '/etc/udev/rules.d/55-libsane.rules' as rules file
parse_file: reading '/etc/udev/rules.d/56-idedma.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-cdrom_id.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-dock.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-kqemu.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-kvm.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-pcmcia.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-input.rules' as rules file
parse_file: reading '/etc/udev/rules.d/60-persistent-storage.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-device-mapper.rules' as rules file
parse_file: reading '/etc/udev/rules.d/64-md-raid.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-kpartx.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-cd.rules' as rules file
parse_file: reading '/etc/udev/rules.d/70-persistent-net.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-cd-aliases-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/75-persistent-net-generator.rules' as rules file
parse_file: reading '/etc/udev/rules.d/77-network.rules' as rules file
parse_file: reading '/etc/udev/rules.d/80-drivers.rules' as rules file
parse_file: reading '/etc/udev/rules.d/90-hal.rules' as rules file
parse_file: reading '/etc/udev/rules.d/95-udev-late.rules' as rules file
parse_file: reading '/etc/udev/rules.d/ibm-ultrabay.rules' as rules file
import_uevent_var: import into environment: 'INTERFACE=eth1'
import_uevent_var: import into environment: 'IFINDEX=2'
main: looking at device '/devices/pci0000:00/0000:00:19.0/net/eth1' from subsystem 'net'
udev_rules_get_name: rule applied, 'eth1' becomes 'eth1'
main: run: '/sbin/ifup eth1 -o hotplug'
main: run: 'socket:/org/freedesktop/hal/udev_event'
main: run: 'socket:/org/kernel/udev/monitor'

I see /sbin/ifup eth1 -o hotplug printed third line up from the bottom.
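
The same check can be scripted instead of read by eye. The sketch below only looks for the `run:` line in udevtest output; the function name `ifup_would_run` is made up here. As the udevtest banner says, it is a simulation, so a match shows that the rule applies, not that ifup was actually executed at boot.

```shell
# Succeeds if udevtest output on stdin contains a "run:" line calling /sbin/ifup.
ifup_would_run() {
    grep -q "run: '/sbin/ifup "
}

# Example (needs udevtest from the udev package):
#   udevtest /class/net/eth1 2>&1 | ifup_would_run && echo "ifup rule matched"
```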
Comment 20 Kay Sievers 2008-07-22 08:34:28 UTC
I really doubt this is a udev issue. Reassigning back.
Comment 21 Marius Tomaschewski 2008-10-01 15:47:24 UTC
Michael, can you provide the output of the following command? :

for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
Comment 22 Michael Taylor 2008-10-01 20:54:51 UTC
Thanks for working on this bug again.  It is pretty annoying, and I really like using SUMF to maintain multiple profiles (different static IP addresses, NFS shares etc) between home and work.  SUMF over ifup provides the same functionality of Thinkpad Access Connections on Windows.

plato:~ # for p in $(pidof udevd) ; do ls -l /proc/$p/exe ; done
lrwxrwxrwx 1 root root 0 Oct  1 08:58 /proc/1270/exe -> /sbin/udevd
plato:~ # 
Comment 23 Michael Taylor 2008-11-26 20:00:56 UTC
I reproduced this on another Thinkpad with the following 2 network controllers:

02:01.0 Ethernet controller: Intel Corporation 82540EP Gigabit Ethernet Controller (Mobile) (rev 03)
02:02.0 Network controller: Intel Corporation PRO/Wireless 2200BG Network Connection (rev 05)

Just did the 10.3 GM install, eth0 came up after rebooting.  After taking all of the online updates, the only thing up in ifconfig is lo.

Is anyone still looking at this?
Comment 24 Michael Taylor 2008-11-28 17:12:42 UTC
If I install openSUSE 10.3 GM from the KDE CD download on a laptop or desktop with 2 NICs and take all of the online updates, the NICs fail to start upon reboot.  If I install openSUSE 10.3 GM from DVD and take all of the online updates, the networking is fine.  Is there a way to diagnose this discrepancy?
Comment 25 Marius Tomaschewski 2009-01-07 16:00:28 UTC
Do you (or somebody who can reproduce the problem) have a line like this
in the /etc/fstab? :

tmpfs /dev/shm tmpfs defaults,size=132M 0 0
Comment 26 Michael Taylor 2009-01-07 16:04:08 UTC
Yes.  I have the following line required by some Oracle software:

tmpfs   /dev/shm        tmpfs   defaults 0 0
Comment 27 Marius Tomaschewski 2009-01-07 16:47:14 UTC
Is the problem fixed when you remove this line and reboot?

Note, an explicit tmpfs mount on /dev/shm is _not_ needed,
because /dev is already a tmpfs.
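
To find such an entry quickly on other affected machines, something like the following could be used. It is a sketch only; the function name `dev_shm_entries` is hypothetical. It prints every active (non-comment) fstab line whose mount point is /dev/shm.

```shell
# List active (non-comment) fstab entries that mount something on /dev/shm.
# Usage: dev_shm_entries [FSTAB]   (defaults to /etc/fstab)
dev_shm_entries() {
    awk '$1 !~ /^#/ && $2 == "/dev/shm"' "${1:-/etc/fstab}"
}
```

No output means there is no explicit /dev/shm mount configured.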
Comment 28 Michael Taylor 2009-01-07 19:01:38 UTC
It works if I comment out the tmpfs fstab entry.  I was using it due to the following issue: 

Connected to an idle instance.
ORA-00845: MEMORY_TARGET not supported on this system
Disconnected

Is there another workaround?

Subject: 	ORA-00845: MEMORY_TARGET not supported on this system - Linux Servers
  	Doc ID: 	465048.1 	Type: 	PROBLEM
  	Modified Date : 	28-SEP-2008 	Status: 	PUBLISHED

In this Document
  Symptoms
  Cause
  Solution

Applies to:
Oracle Server - Standard Edition - Version: 11.1.0.6.0
Linux x86-64
Linux x86
Symptoms
Customer may receive  the below error messages on Linux Machines ::

SQL> connect sys as sysdba
Enter password:
Connected to an idle instance.
SQL> STARTUP NOMOUNT PFILE="/opt/oracle/admin/day/pfile/day2.ini";
ORA-00845: MEMORY_TARGET not supported on this system
Cause
On Linux systems, insufficient /dev/shm mount size for PGA and SGA.

AMM (Automatic Memory Management) is a New feature in 11G which manages both SGA and PGA
together. It is managed by MMAN, same as with 10g AMM

MEMORY_TARGET is used instead of SGA_TARGET
MEMORY_MAX_TARGET is used instead of SGA_MAX_SIZE (defaults to MEMORY_TARGET )

It uses /dev/shm on Linux

If max_target set over /dev/shm size, you may  get the below error message ::

 ORA-00845: MEMORY_TARGET not supported on this system

If you are installing Oracle 11G on a Linux system, note that Memory Size (SGA and PGA), which sets
the initialization parameter MEMORY_TARGET or MEMORY_MAX_TARGET, cannot be greater than the shared memory filesystem (/dev/shm) on your operating system.

NOTE:

This error may also occur if /dev/shm is not properly mounted.  Make sure your df output is similar to the following:

$ df -k
Filesystem 1K-blocks Used Available Use% Mounted on
...
shmfs 6291456 832356 5459100 14% /dev/shm


Solution
Increase the /dev/shm mountpoint size.

For example:

# mount -t tmpfs shmfs -o size=7g /dev/shm

Also, to make this change persistent across system restarts, add an entry in /etc/fstab similar to  the following:

shmfs /dev/shm tmpfs size=7g 0
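
Since the note boils down to comparing MEMORY_TARGET against the size of /dev/shm, the size can be pulled out of df output with a short filter. This is a sketch; the function name `mount_size_kb` is made up, and it assumes `df -kP` style output with the mount point in the last column.

```shell
# Print the size in 1K blocks of the filesystem mounted at MOUNTPOINT,
# reading `df -kP` style output from stdin.
mount_size_kb() {
    awk -v mp="$1" '$NF == mp { print $2 }'
}

# Example: df -kP | mount_size_kb /dev/shm
```

With the sample df line from the note, this gives 6291456 (6 GiB), which is the upper bound for MEMORY_TARGET on that system.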
Comment 29 Marius Tomaschewski 2009-01-09 13:39:06 UTC
Rüdiger, Kay,

and what can we do here?

udev                  2,0G  264K  2,0G   1% /dev

Is it possible to tweak the fs size of the tmpfs in /dev?
Comment 30 Marius Tomaschewski 2009-01-09 13:39:50 UTC
Is it possible for the user to tweak the fs size of the tmpfs in /dev?
Comment 31 Marius Tomaschewski 2009-01-09 13:45:58 UTC
Let's resolve to the oldest bug 335486 and continue there.

*** This bug has been marked as a duplicate of bug 335486 ***
Comment 32 Kay Sievers 2009-01-09 13:55:01 UTC
(In reply to comment #30 from Marius Tomaschewski)
> Is it possible for the user to tweak the fs size of the tmpfs in /dev?

/dev tmpfs is by default half the RAM size of the box. Apps that need larger sizes here are just terribly broken.
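
To see what that default works out to on a given box, half of MemTotal can be computed from /proc/meminfo. A minimal sketch, with the hypothetical function name `default_tmpfs_kb`; it assumes no explicit size= option was given for the mount.

```shell
# Print half of MemTotal in kB, i.e. the default tmpfs size limit.
# Usage: default_tmpfs_kb [MEMINFO]   (defaults to /proc/meminfo)
default_tmpfs_kb() {
    awk '$1 == "MemTotal:" { print int($2 / 2) }' "${1:-/proc/meminfo}"
}
```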
Comment 33 Marius Tomaschewski 2009-01-09 16:50:37 UTC
Michael,
does Oracle still complain when you remove the separate /dev/shm
and call "mount -oremount,size=7g /dev" after booting instead?

It should work to add the mount -oremount /dev call to the
/etc/init.d/boot.local. See also patch in bug 335486 comment 92.
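
A guarded version of that remount call, suitable for pasting into boot.local, might look like the sketch below. The function name `resize_dev` and the `DRY_RUN` switch are illustrative only; the size check simply accepts digits with an optional k, m or g suffix.

```shell
# Remount the /dev tmpfs with a new size limit (e.g. 7g). Needs root.
# With DRY_RUN=1 the command is printed instead of executed.
resize_dev() {
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+[kmg]?$'; then
        echo "bad size: $1" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "mount -o remount,size=$1 /dev"
    else
        mount -o remount,size="$1" /dev
    fi
}

# Example: DRY_RUN=1 resize_dev 7g
```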
Comment 34 Michael Taylor 2009-01-09 17:55:59 UTC
Hi Marius,

Thank you for your work on this issue, and finding the tmpfs root cause.  Since creating this bug, I have moved my Oracle machines to use HugePages, which are incompatible with the tmpfs construct, so you can close the bug.  Here is the Oracle note.  I have removed the parameter that made use of /tmpfs and my network interfaces are now working.

Thanks,
-Michael

Subject: 	MEMORY_TARGET/MEMORY_MAX_TARGET And Linux Hugepages
  	Doc ID: 	473165.1 	Type: 	HOWTO
  	Modified Date : 	26-MAY-2008 	Status: 	MODERATED

In this Document
  Goal
  Solution

This document is being delivered to you via Oracle Support's Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review.

Applies to:
Oracle Server - Enterprise Edition - Version: 11.1.0.6
Linux x86
Goal

Using MEMORY_TARGET/MEMORY_MAX_TARGET for managing memory in an 11g database on Linux. When trying to check if Hugepages are being used by running the command (grep Huge /proc/meminfo), can see that Hugepages are not being used.

But, When using SGA_MAX_SIZE to manage the memory in the same database, can see by using the same command (grep Huge /proc/meminfo) that  Hugepages are being used.

Does MEMORY_TARGET/MEMORY_MAX_TARGET make use of Linux Hugepages?
Solution

Automatic Memory Management (MEMORY_TARGET/MEMORY_MAX_TARGET) cannot be used in conjunction with Hugepages on Linux. This is because its memory segments are memory mapped files in /dev/shm. 
Comment 35 Bernhard Wiedemann 2016-04-15 09:06:31 UTC
This is an autogenerated message for OBS integration:
This bug (355786) was mentioned in
https://build.opensuse.org/request/show/4310 Factory / sysconfig