Bug 318546 (MONO75586) - [PATCH] Deadlock on assembly loading locks
Summary: [PATCH] Deadlock on assembly loading locks
Status: RESOLVED FIXED
Alias: MONO75586
Product: Mono: Runtime
Classification: Mono
Component: misc
Version: 1.1
Hardware: Other Other
Priority: P3 - Medium, Severity: Major
Target Milestone: ---
Assignee: Ben Maurer
QA Contact: Mono Bugs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-07-19 20:28 UTC by Sharon Smith
Modified: 2007-09-15 21:24 UTC (History)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
This is the backtraces (377.38 KB, text/plain)
2005-07-19 20:31 UTC, Thomas Wiest
Patch (3.88 KB, patch)
2005-07-19 23:15 UTC, Thomas Wiest
And now, without a typo :-). (3.88 KB, patch)
2005-07-19 23:22 UTC, Thomas Wiest

Description Thomas Wiest 2007-09-15 19:25:20 UTC


---- Reported by smsmith@novell.com 2005-07-19 13:28:37 MST ----

The web domain is locking up while loading assemblies.

Roger Beus in system test has been running a server for several days,
building it up until the system holds 1.5 TB of data. When the server
rolled at 4:15 this morning, it failed to come all the way back up. I
had Rob Lyon look at it, and he said the problem is that the web
domain is locking up while loading assemblies.

I'll attach the backtraces to this bug.



---- Additional Comments From smsmith@novell.com 2005-07-19 13:31:53 MST ----

Created an attachment (id=168256)
This is the backtraces




---- Additional Comments From smsmith@novell.com 2005-07-19 13:39:26 MST ----

Sorry, didn't specify the server.  He is running an iFolder server.  
Below are the details:

iFolder server build: 20050712
Mono Version: 1.1.7.7



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:18:52 MST ----

The issue is between these two threads:

#12 0x080f80a2 in EnterCriticalSection (section=0x81a4f64)
    at critical-sections.c:151
#13 0x080f80a2 in EnterCriticalSection (section=0x81a4f60)
    at critical-sections.c:151
#14 0x08096ed8 in mono_assembly_addref (assembly=0x4651c518)
    at assembly.c:383
#15 0x080d19d9 in add_assemblies_to_domain (domain=0x81fc6c0, ass=0x224d, 
    ht=0x400beeae) at appdomain.c:542
#16 0x080d1a89 in mono_domain_fire_assembly_load (assembly=0x4651c518, 
    user_data=0x0) at appdomain.c:576
#17 0x0809754d in mono_assembly_invoke_load_hook (ass=0x4651c518)
    at assembly.c:573
#18 0x08097f9a in mono_assembly_load_from_full (image=0x4651b9e8,
    fname=0xfffffffc <Address 0xfffffffc out of bounds>, status=0x425e8164,
    refonly=0) at assembly.c:1018
#19 0x08097cda in mono_assembly_open_full (
    filename=0x4651b788 "/opt/novell/ifolder3/web/bin/SyncService.Web.dll",
    status=0x425e8164, refonly=0) at assembly.c:887
#20 0x080d247d in ves_icall_System_Reflection_Assembly_LoadFrom (
    fname=0x8e0ab60, refOnly=0 '\0') at appdomain.c:899


---

#12 0x080f80a2 in EnterCriticalSection (section=0x81fc764)
    at critical-sections.c:151
#13 0x080f80a2 in EnterCriticalSection (section=0x81fc760)
    at critical-sections.c:151
#14 0x080d23ce in mono_domain_assembly_search (aname=0x44124bc8, user_data=0x0)
    at appdomain.c:867
#15 0x0809760d in mono_assembly_invoke_search_hook (aname=0x44124bc8)
    at assembly.c:607
#16 0x08096ba8 in search_loaded (aname=0x44124bc8, refonly=0)
    at assembly.c:243
#17 0x08097ff2 in mono_assembly_load_from_full (image=0x4397af58,
    fname=0xfffffffc <Address 0xfffffffc out of bounds>, status=0x423e6164,
    refonly=0) at assembly.c:970
#18 0x08097cda in mono_assembly_open_full (
    filename=0x4411f920 "/opt/novell/ifolder3/web/bin/Simias.POBox.Web.dll",
    status=0x423e6164, refonly=0) at assembly.c:887
#19 0x080d247d in ves_icall_System_Reflection_Assembly_LoadFrom (
    fname=0x8e0a2a0, refOnly=0 '\0') at appdomain.c:899

In the first thread, mono_assembly_load_from_full does not hold any
locks when it calls mono_assembly_invoke_load_hook. This calls
mono_domain_fire_assembly_load, which calls add_assemblies_to_domain
under mono_domain_assemblies_lock. When that calls
mono_assembly_addref, it takes assemblies_mutex to increment a
refcount.

In the second thread, mono_assembly_load_from_full holds the
assemblies_mutex lock when it calls search_loaded. This calls
mono_domain_assembly_search, which acquires
mono_domain_assemblies_lock.

The two threads take the same two locks in opposite order, so each
ends up waiting for the lock the other holds.
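
The ordering problem in this analysis can be modelled without threads. The following is only a minimal sketch, not Mono code: the enum values and the functions record_path, has_inversion, and backtrace_paths_conflict are hypothetical names, and only the two lock names come from the comment above. It records which lock each path takes while holding the other and reports an inversion when both nestings occur.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the two locks named in the analysis. */
enum lock_id { ASSEMBLIES_MUTEX, DOMAIN_ASSEMBLIES_LOCK, LOCK_COUNT };

/* order[a][b] is true when some code path acquires b while holding a. */
static bool order[LOCK_COUNT][LOCK_COUNT];

/* Record the nesting implied by one code path; locks are listed outermost first. */
static void record_path(const enum lock_id *locks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(locks[i] < LOCK_COUNT);
        for (size_t j = i + 1; j < n; j++)
            order[locks[i]][locks[j]] = true;
    }
}

/* Two locks are inverted when each one is taken while the other is held. */
static bool has_inversion(void)
{
    for (int a = 0; a < LOCK_COUNT; a++)
        for (int b = a + 1; b < LOCK_COUNT; b++)
            if (order[a][b] && order[b][a])
                return true;
    return false;
}

/* Feed in the two paths from the backtraces and report whether they conflict. */
static bool backtrace_paths_conflict(void)
{
    /* Thread 1: the load hook takes the domain lock, then addref takes assemblies_mutex. */
    const enum lock_id load_hook[] = { DOMAIN_ASSEMBLIES_LOCK, ASSEMBLIES_MUTEX };
    /* Thread 2: load_from_full holds assemblies_mutex, then the search hook takes the domain lock. */
    const enum lock_id search_hook[] = { ASSEMBLIES_MUTEX, DOMAIN_ASSEMBLIES_LOCK };

    record_path(load_hook, 2);
    record_path(search_hook, 2);
    return has_inversion();
}
```

Removing either nesting, for example by making the refcount update not need assemblies_mutex at all, leaves only one ordering in the table and the inversion disappears.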



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:28:08 MST ----

So, this is something we are doing wrong in the VM in terms of
locking.



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 15:35:00 MST ----

One simple fix here is to use atomic ops to do the refcounting. Am
going to try a patch for that.



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:15:51 MST ----

Created an attachment (id=168257)
Patch




---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:22:26 MST ----

Created an attachment (id=168258)
And now, without a typo :-).




---- Additional Comments From bmaurer@users.sf.net 2005-07-19 16:24:23 MST ----

This patch fixes the issue by using atomic operations inside
mono_assembly_addref. This should prevent the out-of-order locking you
are seeing during assembly loading.
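
The attached patch is not reproduced in this thread, so the following is only a rough sketch of the general technique, written with C11 <stdatomic.h> as an assumption rather than whatever primitives the runtime itself uses. The demo_assembly type and the demo_assembly_* functions are hypothetical names and are not Mono's API. The point it illustrates is that a reference count can be updated without taking any mutex, so the addref path no longer participates in lock ordering.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an assembly record; not Mono's MonoAssembly. */
typedef struct {
    atomic_int ref_count;
} demo_assembly;

/* A freshly created object starts with one reference held by its creator. */
static void demo_assembly_init(demo_assembly *a)
{
    atomic_init(&a->ref_count, 1);
}

/* Take a reference without acquiring any mutex. */
static void demo_assembly_addref(demo_assembly *a)
{
    int previous = atomic_fetch_add(&a->ref_count, 1);
    assert(previous > 0); /* adding a reference to a dead object is a bug */
}

/* Drop a reference and return the number of references that remain. */
static int demo_assembly_release(demo_assembly *a)
{
    int previous = atomic_fetch_sub(&a->ref_count, 1);
    assert(previous > 0);
    return previous - 1;
}
```

A caller would initialise the object once, call demo_assembly_addref from any thread that needs to keep it alive, and free it when demo_assembly_release returns 0.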



---- Additional Comments From lupus@ximian.com 2005-07-19 17:09:20 MST ----

Ben, please commit, but remove the useless volatile keyword you added
to the field.



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 17:15:44 MST ----

It's in head.



---- Additional Comments From bmaurer@users.sf.net 2005-07-19 17:48:47 MST ----

On the branch. Should be fixed in the next set of RPMs we give you.

Imported an attachment (id=168256)
Imported an attachment (id=168257)
Imported an attachment (id=168258)
