Bugzilla – Attachment 342869 Details for Bug 568319: DM lockup causes filesystem failure
[patch] xfs - non-blocking file size updates during io completion backport to openSUSE11.2
Filename: xfs-non-blobking-inode-locking-io-completion
MIME Type: text/plain
Creator: Nikanth Karthikesan
Created: 2010-02-17 11:45:58 UTC
Size: 4.91 KB
xfs: Non-blocking inode locking in IO completion

The introduction of barriers to DM loop devices (e.g. dm-crypt) has
created a new IO order completion dependency that XFS does not
handle. That is, the completion of log IOs (which have barriers) in
the loop filesystem is now dependent on completion of data IO in
the backing filesystem.

This can cause deadlocks when a flush daemon issues a log force with
an inode locked because the IO completion of IO on the inode is
blocked by the inode lock. This in turn prevents further data IO
completion from occurring, which prevents the log IO from completing
because DM is waiting for data IO completion as well.

The fix for this new completion order dependency issue is to make
the IO completion inode locking non-blocking. If the inode lock
can't be grabbed, simply requeue the IO completion back to the work
queue so that it can be processed later. This prevents the
completion queue from being blocked and allows data IO completion on
other inodes to proceed, hence avoiding completion order dependent
deadlocks.

Signed-off-by: Dave Chinner <david@fromorbit.com>

---

Index: linux-2.6.31-openSUSE-11.2/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- linux-2.6.31-openSUSE-11.2.orig/fs/xfs/linux-2.6/xfs_aops.c
+++ linux-2.6.31-openSUSE-11.2/fs/xfs/linux-2.6/xfs_aops.c
@@ -186,13 +186,17 @@ xfs_destroy_ioend(
 }
 
 /*
- * Update on-disk file size now that data has been written to disk.
- * The current in-memory file size is i_size. If a write is beyond
- * eof i_new_size will be the intended file size until i_size is
- * updated. If this write does not extend all the way to the valid
- * file size then restrict this update to the end of the write.
+ * Update on-disk file size now that data has been written to disk. The
+ * current in-memory file size is i_size. If a write is beyond eof i_new_size
+ * will be the intended file size until i_size is updated. If this write does
+ * not extend all the way to the valid file size then restrict this update to
+ * the end of the write.
+ *
+ * This function does not block as blocking on the inode lock in IO completion
+ * can lead to IO completion order dependency deadlocks.. If it can't get the
+ * inode ilock it will return EAGAIN. Callers must handle this.
  */
-STATIC void
+STATIC int
 xfs_setfilesize(
 	xfs_ioend_t		*ioend)
 {
@@ -204,11 +208,12 @@ xfs_setfilesize(
 	ASSERT(ioend->io_type != IOMAP_READ);
 
 	if (unlikely(ioend->io_error))
-		return;
+		return 0;
 
 	bsize = ioend->io_offset + ioend->io_size;
 
-	xfs_ilock(ip, XFS_ILOCK_EXCL);
+	if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
+		return EAGAIN;
 
 	isize = MAX(ip->i_size, ip->i_new_size);
 	isize = MIN(isize, bsize);
@@ -221,6 +226,30 @@ xfs_setfilesize(
 	}
 
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+
+	return 0;
+}
+
+
+/*
+ * Schedule IO completion handling on a xfsdatad if this was
+ * the final hold on this ioend. If we are asked to wait,
+ * flush the workqueue.
+ */
+STATIC void
+xfs_finish_ioend(
+	xfs_ioend_t	*ioend,
+	int		wait)
+{
+	if (atomic_dec_and_test(&ioend->io_remaining)) {
+		struct workqueue_struct *wq;
+
+		wq = (ioend->io_type == IOMAP_UNWRITTEN) ?
+			xfsconvertd_workqueue : xfsdatad_workqueue;
+		queue_work(wq, &ioend->io_work);
+		if (wait)
+			flush_workqueue(wq);
+	}
 }
 
 /*
@@ -233,6 +262,8 @@ xfs_end_io(
 	xfs_ioend_t		*ioend =
 		container_of(work, xfs_ioend_t, io_work);
 	struct xfs_inode	*ip = XFS_I(ioend->io_inode);
+	int			error;
+
 
 	/*
 	 * For unwritten extents we need to issue transactions to convert a
@@ -240,7 +271,6 @@ xfs_end_io(
 	 */
 	if (ioend->io_type == IOMAP_UNWRITTEN &&
 	    likely(!ioend->io_error && !XFS_FORCED_SHUTDOWN(ip->i_mount))) {
-		int error;
 
 		error = xfs_iomap_write_unwritten(ip, ioend->io_offset,
 						 ioend->io_size);
@@ -252,30 +282,24 @@ xfs_end_io(
 	 * We might have to update the on-disk file size after extending
 	 * writes.
 	 */
-	if (ioend->io_type != IOMAP_READ)
-		xfs_setfilesize(ioend);
-	xfs_destroy_ioend(ioend);
-}
+	if (ioend->io_type != IOMAP_READ) {
+		error = xfs_setfilesize(ioend);
+		ASSERT(!error || error == EAGAIN);
+	}
 
-/*
- * Schedule IO completion handling on a xfsdatad if this was
- * the final hold on this ioend. If we are asked to wait,
- * flush the workqueue.
- */
-STATIC void
-xfs_finish_ioend(
-	xfs_ioend_t	*ioend,
-	int		wait)
-{
-	if (atomic_dec_and_test(&ioend->io_remaining)) {
-		struct workqueue_struct *wq;
+	/*
+	 * If we didn't complete processing of the ioend, requeue it to the
+	 * tail of the workqueue for another attempt later. Otherwise destroy
+	 * it.
+	 */
+	if (error == EAGAIN) {
+		atomic_inc(&ioend->io_remaining);
+		xfs_finish_ioend(ioend, 0);
+		/* ensure we don't spin on blocked ioends */
+		delay(1);
+	} else
+		xfs_destroy_ioend(ioend);
 
-		wq = (ioend->io_type == IOMAP_UNWRITTEN) ?
-			xfsconvertd_workqueue : xfsdatad_workqueue;
-		queue_work(wq, &ioend->io_work);
-		if (wait)
-			flush_workqueue(wq);
-	}
 }
 
 /*
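
The patch description boils down to a trylock-and-requeue pattern: a completion handler never sleeps on the inode lock, and a busy item goes back to the tail of the queue so that completions for other inodes can still run. The following is a minimal userspace sketch of that idea only, not XFS code. Every name in it (`inode_sim`, `ioend_sim`, `workq_sim`, `ilock_nowait_sim`, `run_pass`, and so on) is hypothetical, a C11 `atomic_flag` stands in for the inode lock, the size update is simplified to "grow the on-disk size to the end of the write", and a bounded single pass stands in for the patch's `delay(1)` as the way to avoid spinning on a blocked item.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are XFS APIs. */
struct inode_sim {
	atomic_flag ilock;	/* set while the simulated inode lock is held */
	long disk_size;		/* simulated on-disk file size */
};

struct ioend_sim {
	struct inode_sim *ip;
	long end_offset;	/* offset + size of the completed write */
	bool done;
	struct ioend_sim *next;	/* FIFO work queue linkage */
};

struct workq_sim {
	struct ioend_sim *head, *tail;
};

/* Non-blocking lock attempt: true if we took the lock, false if it was held. */
bool ilock_nowait_sim(struct inode_sim *ip)
{
	return !atomic_flag_test_and_set(&ip->ilock);
}

void iunlock_sim(struct inode_sim *ip)
{
	atomic_flag_clear(&ip->ilock);
}

/* Append an item to the tail of the queue. */
void wq_push(struct workq_sim *wq, struct ioend_sim *io)
{
	io->next = NULL;
	if (wq->tail)
		wq->tail->next = io;
	else
		wq->head = io;
	wq->tail = io;
}

/* Remove and return the head of the queue, or NULL if it is empty. */
static struct ioend_sim *wq_pop(struct workq_sim *wq)
{
	struct ioend_sim *io = wq->head;

	if (io) {
		wq->head = io->next;
		if (!wq->head)
			wq->tail = NULL;
	}
	return io;
}

/* Returns 0 on success, or EAGAIN without blocking if the lock is busy. */
static int setfilesize_sim(struct ioend_sim *io)
{
	if (!ilock_nowait_sim(io->ip))
		return EAGAIN;
	if (io->end_offset > io->ip->disk_size)
		io->ip->disk_size = io->end_offset;
	iunlock_sim(io->ip);
	return 0;
}

/*
 * Handle every item that was queued when the pass began, once each.
 * An item whose inode lock is busy goes back on the tail, so items for
 * other inodes still complete. Returns the number of requeued items.
 */
int run_pass(struct workq_sim *wq)
{
	size_t pending = 0;
	int requeued = 0;

	for (struct ioend_sim *p = wq->head; p; p = p->next)
		pending++;

	while (pending--) {
		struct ioend_sim *io = wq_pop(wq);

		assert(io != NULL);
		if (setfilesize_sim(io) == EAGAIN) {
			wq_push(wq, io);
			requeued++;
		} else {
			io->done = true;
		}
	}
	return requeued;
}
```

To try it, initialise two `inode_sim` objects with `ATOMIC_FLAG_INIT`, take the lock on the first with `ilock_nowait_sim()` to play the part of the flush daemon, queue one `ioend_sim` per inode with `wq_push()`, and call `run_pass()`. The item for the locked inode is requeued while the other one completes; after `iunlock_sim()` a second pass finishes the remaining item.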
Attachments on bug 568319: 339770 | 339771 | 342356 | 342357 | 342869 | 345769 | 345771