Bug 1165434

Summary: btrfs balance stuck in endless loop causing high I/O but never finishes on 4.12.14-lp151.28.36-default as well as 5.5.6-4.geca1eba-vanilla
Product: [openSUSE] openSUSE Distribution
Reporter: Oliver Kurz <okurz>
Component: Kernel
Assignee: E-mail List <kernel-maintainers>
Status: RESOLVED FIXED
QA Contact: E-mail List <qa-bugs>
Severity: Major
Priority: P5 - None
CC: msuchanek
Version: Leap 15.1   
Target Milestone: ---   
Hardware: Other   
OS: Other   
See Also: http://bugzilla.suse.com/show_bug.cgi?id=1163994

Description Oliver Kurz 2020-03-02 12:09:00 UTC
## Observation

btrfs balance seems to be stuck in a loop in the background, causing high I/O. The kernel log shows messages like the following:

```
Mar 02 13:05:00 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents
Mar 02 13:05:00 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents
Mar 02 13:05:01 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents
…
```

These messages repeat endlessly on 4.12.14-lp151.28.36-default, and `btrfs balance status /` seems to be stuck at "100% left". On 5.5.6-4.geca1eba-vanilla the balance got further, either by luck or because the newer version behaves differently. There, `btrfs balance status /` reports:

```
1 out of about 3 chunks balanced (9 considered),  67% left
```

but it seems to be stuck there as well.
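
To tell a genuine stall from a slow balance, the two symptoms above can be checked mechanically. The following is a minimal sketch, not part of the original report. It assumes the kernel message and the `btrfs balance status` line look exactly like the samples quoted here (other kernel or btrfs-progs versions may word them differently), and the function names are made up for illustration.

```python
import re
from itertools import groupby

# Kernel line as quoted in this report:
#   "... kernel: BTRFS info (device dm-1): found 35 extents"
EXTENT_RE = re.compile(
    r"BTRFS info \(device (?P<dev>[^)]+)\): found (?P<count>\d+) extents"
)

# Status line as quoted in this report:
#   "1 out of about 3 chunks balanced (9 considered),  67% left"
STATUS_RE = re.compile(
    r"(?P<done>\d+) out of about (?P<total>\d+) chunks balanced "
    r"\((?P<considered>\d+) considered\),\s+(?P<left>\d+)% left"
)


def repeated_extent_runs(lines, min_repeats=3):
    """Return (device, extent_count, repeats) for each run of identical
    'found N extents' messages that is at least min_repeats long.
    Lines that do not match the pattern are ignored entirely."""
    keys = []
    for line in lines:
        m = EXTENT_RE.search(line)
        if m:
            keys.append((m.group("dev"), int(m.group("count"))))
    runs = []
    for key, group in groupby(keys):
        repeats = sum(1 for _ in group)
        if repeats >= min_repeats:
            runs.append((key[0], key[1], repeats))
    return runs


def parse_balance_status(text):
    """Extract the counters from a balance status line, or None."""
    m = STATUS_RE.search(text)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


def looks_stalled(samples):
    """Heuristic only: True if two or more parsed status samples,
    taken some time apart, are all identical."""
    return len(samples) >= 2 and all(s == samples[0] for s in samples)


log_sample = [
    "Mar 02 13:05:00 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents",
    "Mar 02 13:05:00 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents",
    "Mar 02 13:05:01 linux-28d6.suse kernel: BTRFS info (device dm-1): found 35 extents",
]
status_sample = "1 out of about 3 chunks balanced (9 considered),  67% left"

print(repeated_extent_runs(log_sample))
print(parse_balance_status(status_sample))
```

Identical status samples alone do not prove a loop, since a large chunk can take a long time to relocate. The combination of an unchanged status and the same extent count repeating in the log is what matches the behaviour described here.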

```
# ps auxf | grep 'btrfs'
root       795  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-worker]
root       796  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-worker-hi]
root       797  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-delalloc]
root       798  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-flush_del]
root       799  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-cache]
root       800  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-fixup]
root       801  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio]
root       802  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio-met]
root       803  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio-met]
root       804  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio-rai]
root       805  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio-rep]
root       806  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-rmw]
root       807  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-endio-wri]
root       808  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-freespace]
root       809  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-delayed-m]
root       810  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-readahead]
root       811  0.0  0.0      0     0 ?        I<   Mar01   0:00  \_ [btrfs-qgroup-re]
root       813  0.0  0.0      0     0 ?        S    Mar01   0:00  \_ [btrfs-cleaner]
root       814  0.0  0.0      0     0 ?        S    Mar01   0:01  \_ [btrfs-transacti]
root     24170  1.2  0.0      0     0 ?        I    10:35   1:52  \_ [kworker/u8:8-btrfs-endio-write]
root     25435  1.6  0.0      0     0 ?        I    11:33   1:30  \_ [kworker/u8:10-btrfs-endio]
root     25841  1.5  0.0      0     0 ?        I    11:54   1:06  \_ [kworker/u8:3-btrfs-endio]
root      3301  1.4  0.0      0     0 ?        I    12:26   0:34  \_ [kworker/u8:0-btrfs-endio-meta]
root      4311  1.1  0.0      0     0 ?        I    12:33   0:22  \_ [kworker/u8:11-btrfs-endio-meta]
root      5468  1.6  0.0      0     0 ?        I    12:47   0:17  \_ [kworker/u8:5-btrfs-freespace-write]
root      5764  1.4  0.0      0     0 ?        I    12:56   0:08  \_ [kworker/u8:6-btrfs-endio-write]
root      5936  0.0  0.0      0     0 ?        I    13:01   0:00  \_ [kworker/u8:7-btrfs-endio-meta]
root      5938  0.0  0.0      0     0 ?        I<   13:01   0:00  \_ [kworker/u9:0-btrfs-worker-high]
root      6421  0.0  0.0   8684   812 pts/5    S+   13:06   0:00  |                           \_ grep --color=auto btrfs
root     20803  0.0  0.0  10164   816 ?        Ss   09:32   0:00 /usr/bin/flock /run/btrfs-maintenance-running /usr/share/btrfsmaintenance/btrfs-balance.sh
root     20804  0.0  0.0  16404  3056 ?        S    09:32   0:00  \_ /bin/sh /usr/share/btrfsmaintenance/btrfs-balance.sh
root     20832  0.0  0.0  16404  2472 ?        S    09:32   0:00      \_ /bin/sh /usr/share/btrfsmaintenance/btrfs-balance.sh
root     20926 11.8  0.0  16100   584 ?        D    09:34  25:06      |   \_ btrfs balance start -v -dusage 40 /
root     20833  0.0  0.0  16404  2196 ?        S    09:32   0:00      \_ /bin/sh /usr/share/btrfsmaintenance/btrfs-balance.sh
linux-28d6:/home/okurz # cat /proc/20926/stack
[<0>] wait_on_page_bit+0x12f/0x220
[<0>] __filemap_fdatawait_range+0x82/0xe0
[<0>] filemap_fdatawait_range+0xe/0x20
[<0>] __btrfs_wait_marked_extents.isra.0+0xc2/0x100 [btrfs]
[<0>] btrfs_write_and_wait_transaction.isra.0+0x67/0xd0 [btrfs]
[<0>] btrfs_commit_transaction+0x716/0xa00 [btrfs]
[<0>] prepare_to_merge+0x206/0x240 [btrfs]
[<0>] relocate_block_group+0x3bc/0x650 [btrfs]
[<0>] btrfs_relocate_block_group+0x161/0x2f0 [btrfs]
[<0>] btrfs_relocate_chunk+0x25/0x80 [btrfs]
[<0>] __btrfs_balance+0x401/0xa10 [btrfs]
[<0>] btrfs_balance+0x380/0x540 [btrfs]
[<0>] btrfs_ioctl_balance+0x28f/0x340 [btrfs]
[<0>] btrfs_ioctl+0x477/0x2790 [btrfs]
[<0>] do_vfs_ioctl+0x461/0x6d0
[<0>] ksys_ioctl+0x5e/0x90
[<0>] __x64_sys_ioctl+0x16/0x20
[<0>] do_syscall_64+0x5a/0x1c0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
```

Reproduced multiple times on this machine so far.
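
For comparing repeated captures of `/proc/<pid>/stack`, the trace can be reduced to a list of symbol names. This is a rough sketch under the assumption that every frame has the `[<addr>] symbol+0xoff/0xlen [module]` shape seen above; the helper names are hypothetical, and reading the file normally requires root.

```python
import re

# One frame per line, e.g.:
#   "[<0>] prepare_to_merge+0x206/0x240 [btrfs]"
FRAME_RE = re.compile(
    r"^\[<[0-9a-fA-F]+>\]\s+"
    r"(?P<sym>[^\s+]+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+"
    r"(?:\s+\[(?P<mod>[^\]]+)\])?"
)


def parse_stack(text):
    """Return a list of (symbol, module) tuples, innermost frame first.
    module is None for frames that belong to the core kernel."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group("sym"), m.group("mod")))
    return frames


def waiting_in_relocation_commit(frames):
    """True if the task is inside a transaction commit that was
    started from block group relocation, as in this report."""
    symbols = {sym for sym, _ in frames}
    return {"btrfs_commit_transaction", "relocate_block_group"} <= symbols


# Excerpt of the trace quoted in the description.
stack_sample = """\
[<0>] wait_on_page_bit+0x12f/0x220
[<0>] __btrfs_wait_marked_extents.isra.0+0xc2/0x100 [btrfs]
[<0>] btrfs_commit_transaction+0x716/0xa00 [btrfs]
[<0>] prepare_to_merge+0x206/0x240 [btrfs]
[<0>] relocate_block_group+0x3bc/0x650 [btrfs]
"""

frames = parse_stack(stack_sample)
print(frames[0])
print(waiting_in_relocation_commit(frames))
```

A single capture only shows where the task was at that instant. Taking several captures a few seconds apart and comparing the parsed frames gives a better idea of whether the task keeps returning to the same path.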
Comment 1 Michal Suchanek 2020-03-02 14:25:43 UTC
Please test KOTD (Kernel of the Day). Note that the system will likely crash if you create a snapshot (e.g. by using zypper to install packages).
Comment 2 Oliver Kurz 2020-03-02 22:00:14 UTC
You might have misunderstood something. I reported a problem observed with the latest published official openSUSE Leap 15.1 kernel. I can also test with KOTD, but if this is a bug in the latest released kernel then a fix should be backported, right? Or do you expect that there already *is* a fix in the queue?
Comment 3 Michal Suchanek 2020-03-02 22:11:28 UTC
Yes, there have been btrfs fixes since the last MU that address these very symptoms.

You may be experiencing a variant of the issue that is not yet addressed, but the only way to show that is if the issue persists with a current kernel.
Comment 4 Oliver Kurz 2020-03-23 07:31:07 UTC
Can you please point me to the kernel version that should fix this, or to the changelog entry I should look for? Should kernel-default-4.12.14-lp151.28.40.1.x86_64 have this fix? I am seeing other problems on 5.5.6, so I am currently stuck with either a partially unresponsive system or a system stuck on I/O.
Comment 5 Michal Suchanek 2020-03-23 09:36:42 UTC
-------------------------------------------------------------------
Thu Jan 23 06:28:30 CET 2020 - wqu@suse.com

- btrfs: relocation: fix reloc_root lifespan and access
  (bsc#1159588).
- commit e44a0b9
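
To check whether a given kernel package carries this entry, its changelog text can be searched for the bug reference. The sketch below is illustrative only: it assumes changelog items start with `- ` and wrap onto indented continuation lines, as in the excerpt above, and the function names are hypothetical. The text could come, for example, from `rpm -q --changelog kernel-default`, whose header lines are formatted differently from this excerpt.

```python
import re


def changelog_items(text):
    """Split changelog text into items. An item starts with '- ';
    indented follow-up lines are joined onto the current item.
    Separator and date/author lines are skipped."""
    items = []
    for line in text.splitlines():
        if line.startswith("- "):
            items.append(line[2:].strip())
        elif items and line.startswith(" ") and line.strip():
            items[-1] += " " + line.strip()
    return items


def items_mentioning(text, bug_ref):
    """Return the items that reference bug_ref, e.g. 'bsc#1159588'.
    The lookahead avoids matching a longer bug number."""
    pattern = re.compile(re.escape(bug_ref) + r"(?!\d)")
    return [item for item in changelog_items(text) if pattern.search(item)]


# The entry quoted in comment 5.
changelog_sample = """\
-------------------------------------------------------------------
Thu Jan 23 06:28:30 CET 2020 - wqu@suse.com

- btrfs: relocation: fix reloc_root lifespan and access
  (bsc#1159588).
- commit e44a0b9
"""

for item in items_mentioning(changelog_sample, "bsc#1159588"):
    print(item)
```

An empty result for an installed kernel would suggest that the fix is not in that build, provided the changelog layout matches the assumption above.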