Bugzilla – Bug 318546
[PATCH] Deadlock on assembly loading locks
Last modified: 2007-09-15 21:24:46 UTC
---- Reported by smsmith@novell.com 2005-07-19 13:28:37 MST ----

The web domain is locking up in the loading of assemblies.

Roger Beus in system test has been running a server for several days now trying to build it up so that the data contained in the system is 1.5 TB. When the server rolled at 4:15 this morning it failed to come back up all the way. I had Rob Lyon look at it and he said that the problem is that the web domain is locking up in the loading of assemblies. I'll attach the backtraces to this bug.

---- Additional Comments From smsmith@novell.com 2005-07-19 13:31:53 MST ----

Created an attachment (id=168256)
This is the backtraces

---- Additional Comments From smsmith@novell.com 2005-07-19 13:39:26 MST ----

Sorry, didn't specify the server. He is running an iFolder server. Below are the details:

iFolder server build: 20050712
Mono Version: 1.1.7.7

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:18:52 MST ----

The issue is between these two threads:

#12 0x080f80a2 in EnterCriticalSection (section=0x81a4f64) at critical-sections.c:151
#13 0x080f80a2 in EnterCriticalSection (section=0x81a4f60) at critical-sections.c:151
#14 0x08096ed8 in mono_assembly_addref (assembly=0x4651c518) at assembly.c:383
#15 0x080d19d9 in add_assemblies_to_domain (domain=0x81fc6c0, ass=0x224d, ht=0x400beeae) at appdomain.c:542
#16 0x080d1a89 in mono_domain_fire_assembly_load (assembly=0x4651c518, user_data=0x0) at appdomain.c:576
#17 0x0809754d in mono_assembly_invoke_load_hook (ass=0x4651c518) at assembly.c:573
#18 0x08097f9a in mono_assembly_load_from_full (image=0x4651b9e8, fname=0xfffffffc <Address 0xfffffffc out of bounds>, status=0x425e8164, refonly=0) at assembly.c:1018
#19 0x08097cda in mono_assembly_open_full (filename=0x4651b788 "/opt/novell/ifolder3/web/bin/SyncService.Web.dll", status=0x425e8164, refonly=0) at assembly.c:887
#20 0x080d247d in ves_icall_System_Reflection_Assembly_LoadFrom (fname=0x8e0ab60, refOnly=0 '\0') at appdomain.c:899

---

#12 0x080f80a2 in EnterCriticalSection (section=0x81fc764) at critical-sections.c:151
#13 0x080f80a2 in EnterCriticalSection (section=0x81fc760) at critical-sections.c:151
#14 0x080d23ce in mono_domain_assembly_search (aname=0x44124bc8, user_data=0x0) at appdomain.c:867
#15 0x0809760d in mono_assembly_invoke_search_hook (aname=0x44124bc8) at assembly.c:607
#16 0x08096ba8 in search_loaded (aname=0x44124bc8, refonly=0) at assembly.c:243
#17 0x08097ff2 in mono_assembly_load_from_full (image=0x4397af58, fname=0xfffffffc <Address 0xfffffffc out of bounds>, status=0x423e6164, refonly=0) at assembly.c:970
#18 0x08097cda in mono_assembly_open_full (filename=0x4411f920 "/opt/novell/ifolder3/web/bin/Simias.POBox.Web.dll", status=0x423e6164, refonly=0) at assembly.c:887
#19 0x080d247d in ves_icall_System_Reflection_Assembly_LoadFrom (fname=0x8e0a2a0, refOnly=0 '\0') at appdomain.c:899

In the first thread, mono_assembly_load_from_full does not hold any locks when it calls mono_assembly_invoke_load_hook. This will call mono_domain_fire_assembly_load which calls add_assemblies_to_domain under mono_domain_assemblies_lock. When this calls mono_assembly_addref, it increments a refcount inside assemblies_mutex.

mono_assembly_load_from_full (in the second thread) holds the assemblies_mutex lock when it calls search_loaded. This calls mono_domain_assembly_search which will acquire mono_domain_assemblies_lock.

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:28:08 MST ----

So, this is something that we are doing wrong in the VM in terms of locking

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:35:00 MST ----

One simple fix here is to use atomic ops to do the refcounting. Am going to try a patch for that.
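
For readers following the analysis, the lock-order inversion can be modelled without touching the runtime at all. The sketch below is only an illustration: the types `LockId` and `LockPath`, the path tables, and the helper functions are hypothetical names, not Mono code. Each path records nothing but the order in which the two locks named above are taken, and the third table assumes the proposed fix, where the refcount no longer needs assemblies_mutex.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the two locks named in the analysis. */
typedef enum {
    LOCK_DOMAIN_ASSEMBLIES, /* mono_domain_assemblies_lock */
    LOCK_ASSEMBLIES_MUTEX   /* assemblies_mutex */
} LockId;

/* A code path, described only by the order in which it takes locks. */
typedef struct {
    const char *name;
    LockId order[2];
    size_t count;
} LockPath;

/* Thread 1: the load hook takes the domain lock, then addref takes
 * assemblies_mutex. */
static const LockPath load_hook_path = {
    "load hook", {LOCK_DOMAIN_ASSEMBLIES, LOCK_ASSEMBLIES_MUTEX}, 2
};

/* Thread 2: the loader holds assemblies_mutex, then the search hook
 * takes the domain lock. */
static const LockPath search_hook_path = {
    "search hook", {LOCK_ASSEMBLIES_MUTEX, LOCK_DOMAIN_ASSEMBLIES}, 2
};

/* Thread 1 under the assumed fix: addref is atomic, so only the
 * domain lock is taken. */
static const LockPath load_hook_path_atomic = {
    "load hook (atomic addref)", {LOCK_DOMAIN_ASSEMBLIES}, 1
};

/* Index of a lock within a path, or -1 if the path never takes it. */
static int lock_position(const LockPath *path, LockId id)
{
    for (size_t i = 0; i < path->count; i++) {
        if (path->order[i] == id)
            return (int)i;
    }
    return -1;
}

/* True when some pair of locks is taken in one order by path a and in
 * the opposite order by path b, which is the precondition for the
 * deadlock seen in the backtraces. */
static bool paths_invert_lock_order(const LockPath *a, const LockPath *b)
{
    for (size_t i = 0; i < a->count; i++) {
        for (size_t j = i + 1; j < a->count; j++) {
            int bi = lock_position(b, a->order[i]);
            int bj = lock_position(b, a->order[j]);
            if (bi >= 0 && bj >= 0 && bj < bi)
                return true;
        }
    }
    return false;
}
```

Comparing load_hook_path against search_hook_path reports an inversion, while load_hook_path_atomic against search_hook_path does not, since a path that takes a single lock cannot take part in an ordering cycle.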

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:15:51 MST ----

Created an attachment (id=168257)
Patch

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:22:26 MST ----

Created an attachment (id=168258)
And now, without a typo :-).

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:24:23 MST ----

This patch fixes the issue by using atomic operations inside mono_assembly_addref. This should prevent the out-of-order locking you are seeing during assembly loading.

---- Additional Comments From lupus@ximian.com 2005-07-19 17:09:20 MST ----

Ben, please commit, but remove the useless volatile keyword you added to the field.

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 17:15:44 MST ----

It's in head.

---- Additional Comments From bmaurer@users.sf.net 2005-07-19 17:48:47 MST ----

On the branch. Should be fixed in the next set of rpms we give you.
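
The patch itself lives in the attachments and is not reproduced in this thread. As a rough illustration of the approach described above (an atomic refcount in place of one guarded by assemblies_mutex), a minimal sketch might look like the following. It assumes C11 `<stdatomic.h>` rather than whatever primitive the actual patch used, and `SketchAssembly`, `sketch_assembly_addref`, and `sketch_assembly_release` are hypothetical names, not Mono's.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the runtime's assembly structure; only the
 * reference count matters for this sketch. */
typedef struct {
    atomic_int ref_count;
} SketchAssembly;

/* Take a reference without acquiring any mutex, so the caller may hold
 * the domain assemblies lock without creating a lock-order dependency. */
static void sketch_assembly_addref(SketchAssembly *assembly)
{
    assert(assembly != NULL);
    atomic_fetch_add(&assembly->ref_count, 1);
}

/* Drop a reference and return the new count; a caller would free the
 * assembly when this reaches zero. */
static int sketch_assembly_release(SketchAssembly *assembly)
{
    assert(assembly != NULL);
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(&assembly->ref_count, 1) - 1;
}
```

Because `atomic_int` operations are already atomic, the field needs no `volatile` qualifier in this sketch, which is consistent with the review comment asking for it to be removed.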