Bugzilla – Bug 436310
nscd crash if you use ldap client for authentication
Last modified: 2009-01-21 00:29:58 UTC
Created attachment 246131 [details] nsswitch.conf
Hi, I see the following behaviour on my openSUSE 11 system: after boot, the nscd service stops at random (sometimes after 1 minute, sometimes after 10). I am authenticating against an OpenLDAP server. My architecture: a cluster running IPVS and 100 servers.
This would not be a major problem, but I have about 50-70 users with remote access on each machine. Without nscd the system gets really slow (because of the remote passwd lookups), and the worst part is that the OpenLDAP cluster gets knocked out by the number of requests.
I have tried several Novell distros and they all show the same symptom (openSUSE 10.0-10.3, SLED 10, SLED 9, SLES 10, SLES 10 x86-64).
As a workaround I use a tiny script that checks for the nscd PID every second and restarts nscd if it is not running. I am attaching /etc/nsswitch.conf and /etc/nscd.conf.
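The reporter's workaround script is not attached to this bug. A minimal sketch of that kind of watchdog might look like the following. It is only an illustration: the use of `pidof` to detect the daemon, the `/etc/init.d/nscd restart` command, and all function names are assumptions, not taken from the report.

```python
import subprocess
import time


def nscd_is_running():
    """Return True if `pidof nscd` finds at least one process (assumed check)."""
    result = subprocess.run(
        ["pidof", "nscd"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_nscd():
    """Restart nscd. The init script path is an assumption; adjust per distro."""
    subprocess.call(["/etc/init.d/nscd", "restart"])


def watchdog_step(is_running=nscd_is_running, restart=restart_nscd):
    """Do one check; restart nscd if it is gone. Return True if a restart was triggered."""
    if is_running():
        return False
    restart()
    return True


def main(interval=1.0):
    """Check once per `interval` seconds, as the report describes (every second)."""
    while True:
        watchdog_step()
        time.sleep(interval)
```

Calling `main()` as root starts the loop. A watchdog like this only hides the crash; it does not address the assertion failure itself.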
Created attachment 246132 [details] nscd.conf
Huh, nscd in 10.3 used to be very stable. It would be interesting if you checked with that one again, pasted the last few messages from nscd -d, and included the core (call ulimit -c unlimited, then nscd -d from the same shell). For 11.0, can you do the same, please?
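The requested steps (raise the core-size limit, then start nscd -d from the same environment) could also be scripted roughly as below. This is a hedged sketch: the function names and the log file name `nscd-debug.log` are hypothetical. It raises the soft core limit to the current hard limit, which matches `ulimit -c unlimited` only when the hard limit is itself unlimited.

```python
import resource
import subprocess


def enable_core_dumps():
    """Raise the soft RLIMIT_CORE to the hard limit and return the new (soft, hard)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def run_nscd_debug(log_path="nscd-debug.log"):
    """Run `nscd -d` with core dumps enabled, saving its output to log_path.

    The child process inherits the resource limits set in this process.
    Returns the exit status of nscd.
    """
    enable_core_dumps()
    with open(log_path, "w") as log:
        return subprocess.call(
            ["nscd", "-d"], stdout=log, stderr=subprocess.STDOUT
        )
```

After a crash, the last lines of the log file are the ones worth pasting into the bug.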
I am seeing this bug also. The problem of nscd dying unexpectedly is causing other problems; see bug 467161 (openSUSE Factory as of 01/16/09, also SLES 11 RC1). I will try to get the 'nscd -d' output.
I ran nscd interactively with -d. It crashed at an assertion:
. . .
30010: considering GETPWBYUID entry "4001", timeout 1232293240
30010: considering GETPWBYUID entry "74", timeout 1232293240
30010: considering GETPWBYUID entry "106", timeout 1232293240
30010: considering GETPWBYNAME entry "postfix", timeout 1232293240
30010: considering GETPWBYNAME entry "haldaemon", timeout 1232293240
30010: considering GETPWBYNAME entry "messagebus", timeout 1232293240
30010: considering GETPWBYNAME entry "ntp", timeout 1232293240
30010: considering GETPWBYUID entry "76", timeout 1232293240
30010: considering GETPWBYNAME entry "topol", timeout 1232293240
30010: considering GETPWBYUID entry "66", timeout 1232293240
30010: considering GETPWBYUID entry "3912", timeout 1232293240
30010: considering GETPWBYUID entry "100", timeout 1232293240
30010: considering GETPWBYNAME entry "volfovsn", timeout 1232293240
30010: remove GETPWBYUID entry "101"
30010: remove GETPWBYNAME entry "myi"
30010: remove GETPWBYNAME entry "ldap"
30010: remove GETPWBYNAME entry "sshd"
30010: remove GETPWBYUID entry "2287"
30010: remove GETPWBYNAME entry "avahi"
30010: remove GETPWBYNAME entry "harbaugh"
30010: remove GETPWBYUID entry "71"
30010: remove GETPWBYUID entry "51"
30010: remove GETPWBYNAME entry "nobody"
30010: remove GETPWBYNAME entry "wnn"
30010: remove GETPWBYUID entry "1596"
30010: remove GETPWBYUID entry "65534"
30010: remove GETPWBYUID entry "4001"
30010: remove GETPWBYNAME entry "postfix"
30010: remove GETPWBYUID entry "106"
30010: remove GETPWBYUID entry "74"
30010: remove GETPWBYNAME entry "haldaemon"
30010: remove GETPWBYNAME entry "messagebus"
30010: remove GETPWBYNAME entry "ntp"
30010: remove GETPWBYUID entry "76"
30010: remove GETPWBYNAME entry "topol"
30010: remove GETPWBYUID entry "66"
30010: remove GETPWBYUID entry "3912"
30010: remove GETPWBYUID entry "100"
30010: remove GETPWBYNAME entry "volfovsn"
nscd: mem.c:412: gc: Assertion `next_data < &he_data[db->head->nentries]' failed.
Aborted

I set 'ulimit -c unlimited', but no core file was produced.
This appears to be the same problem as bug 387202:
nscd: mem.c:412: gc: Assertion `next_data < &he_data[db->head->nentries]' failed.
Aborted
This has nothing to do with LDAP; it is indeed the same problem as bug 387202. No details about the LDAP interaction are available, so I'm closing this.