From 2a8cec4b12d69a63828974823cb0867290d21274 Mon Sep 17 00:00:00 2001 From: Qu Wenruo Date: Mon, 8 Jul 2019 15:33:51 +0800 Subject: [PATCH] btrfs-progs: Exhaust delayed refs and dirty block groups to prevent delayed refs lost MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit [BUG] Btrfs-progs sometimes fails to find certain extent backref when committing transaction. The most reliable way to reproduce it is fsck-test/013 on 64K page sized system: [...] adding new data backref on 315859712 root 287 owner 292 offset 0 found 1 btrfs unable to find ref byte nr 31850496 parent 0 root 2 owner 0 offset 0 Failed to find [30867456, 168, 65536] Also there are some github bug reports related to this problem. [CAUSE] Commit 909357e86799 ("btrfs-progs: Wire up delayed refs") introduced delayed refs in btrfs-progs. However in that commit, delayed refs are not run at correct timing. That commit calls btrfs_run_delayed_refs() before btrfs_write_dirty_block_groups(), which needs to update BLOCK_GROUP_ITEMs in extent tree, thus could cause new delayed refs. This means each time we commit a transaction, we may screw up the extent tree by dropping some pending delayed refs, like: Transaction 711: btrfs_commit_transaction() |- btrfs_run_delayed_refs() | Now all delayed refs are written to extent tree | |- btrfs_write_dirty_block_groups() | Needs to update extent tree root | ADD_DELAYED_REF to 315859712. | Delayed refs are attached to current trans handle. | |- __commit_transaction() |- write_ctree_super() |- btrfs_finish_extent_commit() |- kfree(trans) Now delayed ref for 315859712 are lost Transaction 712: Tree block 315859712 get dropped btrfs_commit_transaction() |- btrfs_run_delayed_refs() |- run_one_delayed_ref() |- __free_extent() As previous ADD_DELAYED_REF to 315859712 is lost, extent tree doesn't have any backref for 315859712, causing the bug In fact, commit c31edf610cbe ("btrfs-progs: Fix false ENOSPC alert by tracking used space correctly") detects the tree block leakage, but in the reproducer we have too much noise, thus nobody notices the leakage warning. [FIX] We can't just move btrfs_run_delayed_refs() after btrfs_write_dirty_block_groups(), as during btrfs_run_delayed_refs(), we can re-dirty block groups. Thus we need to exhaust both delayed refs and dirty blocks. This patch will call btrfs_write_dirty_block_groups() and btrfs_run_delayed_refs() in a loop until both delayed refs and dirty blocks are exhausted. Much like what we do in commit_cowonly_roots() in kernel. Also, to prevent such problem from happening again (and not to debug such problem again), add extra check on delayed refs before freeing the transaction handle. Reported-by: Klemens Schölhorn Issue: #187 Reviewed-by: Nikolay Borisov Signed-off-by: Qu Wenruo Signed-off-by: David Sterba --- transaction.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-) diff --git a/transaction.c b/transaction.c index 551fb24e..45bb9e1f 100644 --- a/transaction.c +++ b/transaction.c @@ -193,17 +193,31 @@ commit_tree: ret = commit_tree_roots(trans, fs_info); if (ret < 0) goto error; + /* - * Ensure that all committed roots are properly accounted in the - * extent tree + * btrfs_write_dirty_block_groups() can cause COW thus new delayed + * tree refs, while run such delayed tree refs can dirty block groups + * again, we need to exhause both dirty blocks and delayed refs */ - ret = btrfs_run_delayed_refs(trans, -1); - if (ret < 0) - goto error; - btrfs_write_dirty_block_groups(trans); + while (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root) || + test_range_bit(&fs_info->block_group_cache, 0, (u64)-1, + BLOCK_GROUP_DIRTY, 0)) { + ret = btrfs_write_dirty_block_groups(trans); + if (ret < 0) + goto error; + ret = btrfs_run_delayed_refs(trans, -1); + if (ret < 0) + goto error; + } __commit_transaction(trans, root); if (ret < 0) goto error; + + /* There should be no pending delayed refs now */ + if (!RB_EMPTY_ROOT(&trans->delayed_refs.href_root)) { + error("uncommitted delayed refs detected"); + goto error; + } ret = write_ctree_super(trans); btrfs_finish_extent_commit(trans); kfree(trans);