Add a test for the regression introduced by 460e93f25754 ("btrfs-progs:
mkfs: check the status of file at mkfs").
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update test to create a out of /tmp ]
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 460e93f25754 ("btrfs-progs: mkfs: check the status of file at mkfs")
checks the file state before creating a filesystem on it.
The check is fine for the normal mkfs case, but with the --rootdir
option mkfs is allowed to create a new file if the destination file
doesn't exist.
Fix it by allowing a non-existent file if --rootdir is specified.
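The intended rule can be sketched roughly as below. This is only an illustration in Python, not the mkfs.btrfs code; the helper name target_may_be_used() and its arguments are hypothetical, and the status check for existing files is not modeled.

```python
import os


def target_may_be_used(path, rootdir_given):
    """Hypothetical helper: decide whether mkfs may proceed on `path`."""
    if not os.path.exists(path):
        # A missing target is only acceptable when --rootdir is given,
        # because in that case mkfs creates the image file itself.
        return rootdir_given
    # An existing target would go through the usual status check
    # (already formatted, mounted, ...), which is not modeled here.
    return True
```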
Fixes: 460e93f25754 ("btrfs-progs: mkfs: check the status of file at mkfs")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Make --shrink a separate option for --rootdir, and change the default to
off.
The shrinking behaviour is not a commonly used feature but can be useful
for creating minimal pre-filled images, in one step, without requiring
to mount.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog and error messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use the new dev extent based shrink method for the rootdir option. This
restores the original behaviour, where --rootdir creates a filesystem of
minimal size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Use an easier method to calculate the estimated device size for
mkfs.btrfs --rootdir.
The new method will over-estimate, but should ensure we won't encounter
ENOSPC.
It relies on the following data:
1) number of inodes -- for metadata chunk size
2) rounded up data size of each regular inode -- for data chunk size
Total meta chunk size = round_up(nr_inode * (PATH_MAX * 3 + sectorsize),
min_chunk_size) * profile_multiplier
PATH_MAX is the maximum size possible for INODE_REF/DIR_INDEX/DIR_ITEM.
Sectorsize is the maximum size possible for inline extent.
min_chunk_size is 8M for SINGLE, and 32M for DUP, taken from
btrfs_alloc_chunk().
profile_multiplier is 1 for SINGLE, 2 for DUP.
Total data chunk size is much easier.
Total data chunk size = round_up(total_data_usage, min_chunk_size) *
profile_multiplier
total_data_usage is the sum of the *rounded up* size of each regular
inode.
min_chunk_size is 8M for SINGLE, 64M for DUP, taken from btrfs_alloc_chunk().
The profile_multiplier is the same as for metadata.
This over-estimating calculation is, of course, inaccurate, but since we
will later shrink the fs to its real usage, it doesn't matter much now.
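The two formulas above can be sketched as follows. This is an illustration in Python, not the C code in mkfs; the function names are made up, PATH_MAX is taken as 4096 (the Linux value), and rounding each file size up to the sectorsize is an assumption, since the text only says "rounded up".

```python
PATH_MAX = 4096          # Linux value, assumed here
MiB = 1024 * 1024


def round_up(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def estimate_meta_chunk_size(nr_inode, sectorsize, min_chunk_size,
                             profile_multiplier):
    # Worst case per inode: three PATH_MAX sized items
    # (INODE_REF/DIR_INDEX/DIR_ITEM) plus one inline extent.
    per_inode = PATH_MAX * 3 + sectorsize
    return round_up(nr_inode * per_inode, min_chunk_size) * profile_multiplier


def estimate_data_chunk_size(file_sizes, sectorsize, min_chunk_size,
                             profile_multiplier):
    # Assumption: each regular file size is rounded up to the sectorsize.
    total_data_usage = sum(round_up(size, sectorsize) for size in file_sizes)
    return round_up(total_data_usage, min_chunk_size) * profile_multiplier


if __name__ == "__main__":
    # 1000 inodes, 4K sectorsize, SINGLE metadata (8M minimum, multiplier 1)
    print(estimate_meta_chunk_size(1000, 4096, 8 * MiB, 1) // MiB, "MiB")
    # Same inodes with DUP metadata (32M minimum, multiplier 2)
    print(estimate_meta_chunk_size(1000, 4096, 32 * MiB, 2) // MiB, "MiB")
```

With these inputs the metadata estimate comes out at 16 MiB for SINGLE and 64 MiB for DUP, well above what 1000 small inodes would really need, which is the intended over-estimation.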
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
Remove the custom chunk allocator for mkfs. It is buggy in connection
with the --rootdir option and puts file data into the reserved 1M area.
The purpose of the custom allocator was to reserve only a minimal amount
of blockgroup space. This will temporarily stop working and will need an
explicit request by an option, added by the following patches.
Use the generic chunk allocator.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanup of temporary chunks should be done as soon as possible,
especially before doing large tree operations, like filling the
filesystem when using --rootdir.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since the new --rootdir code can allocate chunks, it changes the chunk
allocation result.
Update the allocation info before the verbose output so that it reflects
the change.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
New test case to check that the minimal device size reported by
"mkfs.btrfs" in the failure case is valid.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ renamed script ]
Signed-off-by: David Sterba <dsterba@suse.com>
Also rename the function from size_sourcedir() to mkfs_size_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In fact, the --rootdir option is becoming more and more independent of
the normal mkfs code.
So move the image creation function make_image() and its related code to
mkfs/rootdir.[ch], and rename the function to btrfs_mkfs_fill_dir().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 1170ac3079 ("btrfs-progs: convert: Introduce function to check if
convert image is able to be rolled back") reworked rollback check
condition, by checking 1:1 mapping of each file extent.
The idea itself is fine, but the error handler is not implemented
correctly: it overwrites the return value and always tries to roll back
the fs even if it fails to pass the check.
Fix it by correctly returning the error before rolling back the fs.
Fixes: 1170ac3079 ("btrfs-progs: convert: Introduce function to check if convert image is able to be rolled back")
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In function cmd_filesystem_defrag(), the lines of code for error
handling are duplicated and hard to extend.
Create a jump label for errors.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's a waste of IO to fill the whole image before creating btrfs on it.
Just wipe the first 1M, and then write 1 byte to the last position to
create a sparse file.
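The same idea, sketched in Python rather than in the shell test library (the helper name prepare_sparse_image() is hypothetical, and the sketch recreates the file from scratch):

```python
import os
import tempfile

MiB = 1024 * 1024


def prepare_sparse_image(path, size):
    """Create `path` with `size` bytes while writing only about 1M of data."""
    if size <= 0:
        raise ValueError("size must be positive")
    with open(path, "wb") as f:
        # Wipe the first 1M (or the whole file if it is smaller).
        f.write(b"\0" * min(MiB, size))
        # Write one byte at the last position; everything in between
        # stays a hole on filesystems that support sparse files.
        f.seek(size - 1)
        f.write(b"\0")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        image = os.path.join(tmpdir, "test.img")
        prepare_sparse_image(image, 128 * MiB)
        print(os.path.getsize(image))
```

The apparent size is the full image size, but only the first 1M and the final byte are actually written.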
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit c11e36a29e ("Btrfs-progs: Do not force mixed block group
creation unless '-M' option is specified"), mkfs no longer uses mixed
block groups unless specified manually.
This breaks the minimal device size calculation, which only considered
the mixed block group use case.
This patch enhances the minimal device size calculation for mkfs by
using a different minimal stripe length (calculated from code) for each
profile, and using them to calculate the minimal device size.
Reported-by: Wesley Aptekar-Cassels <W.Aptekar@gmail.com>
Fixes: c11e36a29e ("Btrfs-progs: Do not force mixed block group creation unless '-M' option is specified")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
[ updated comments ]
Signed-off-by: David Sterba <dsterba@suse.com>
So prepare_test_dev() can be called several times in one test case, to
test different device sizes.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ switch to [ ] ]
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, only the status of block devices is checked at mkfs,
but we should also check whether regular files are already formatted
or mounted, to prevent accidental overwrites.
Device status is checked by test_dev_for_mkfs().
The part which is not related to block devices is split from it
and used for both block devices and regular files.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With extended tests in the following patch a file based filesystem image
also needs -f, otherwise it will fail.
Signed-off-by: David Sterba <dsterba@suse.com>
The reloc tree is a special tree with a very short life span. It acts as
a special snapshot of any tree, with the related nodes/leaves or
EXTENT_DATA modified to point to the new position.
Considering the short life span and its special purpose, it is quite
reasonable to keep them both as a corner case for fsck and as an
educational dump for anyone interested in relocation.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This error occurs when no_holes is not set, but there is a gap
before the file extent.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To simplify, I suggest moving the 'writable/readonly' issue only to the
-r line, instead of having it introduced in two places.
Pull-request: #80
Author: Howard <hwj@BridgeportContractor.com>
Signed-off-by: David Sterba <dsterba@suse.com>
-# btrfs inspect-internal dump-tree -t fs /dev/block/device
ERROR: unrecognized tree id: fs
Without this fix I can't dump-tree fs, but I can dump-tree fs_tree and
also fs_tree_tree, which is a bit silly.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Hans van Kranenburg <hans@knorrie.org>
Signed-off-by: David Sterba <dsterba@suse.com>
We missed some regressions because the lowmem mode was not run in the CI
tests. This is partially due to the incomplete implementation but we
have exceptions for the --repair mode in the tests.
Signed-off-by: David Sterba <dsterba@suse.com>
For tree blocks shared between a snapshot and its source subvolume, the
keyed backref counter only counts the exclusively owned references.
In the following case, 258 is a snapshot of 257, which inherits all the
references to this data extent.
------
item 4 key (12582912 EXTENT_ITEM 524288) itemoff 3741 itemsize 140
refs 179 gen 9 flags DATA
extent data backref root 257 objectid 258 offset 0 count 49
extent data backref root 257 objectid 257 offset 0 count 1
extent data backref root 256 objectid 258 offset 0 count 128
extent data backref root 256 objectid 257 offset 0 count 1
------
However, lowmem mode used to iterate the whole inode to find all
references, and didn't check whether a reference was already counted by
the shared tree block.
Add a test case to check it.
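As a quick sanity check of the dump above, the counts of the four keyed backrefs add up to the refs value of the extent item. A small Python sketch, with the tuples copied from the dump:

```python
# (root, objectid, offset, count) of each keyed extent data backref
keyed_backrefs = [
    (257, 258, 0, 49),
    (257, 257, 0, 1),
    (256, 258, 0, 128),
    (256, 257, 0, 1),
]

# "refs 179" from the EXTENT_ITEM line of the dump
extent_item_refs = 179

total_refs = sum(count for _root, _objectid, _offset, count in keyed_backrefs)
assert total_refs == extent_item_refs
```

A checker that counts references a second time when they are already covered by a shared tree block ends up with a different total and reports a mismatch.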
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The image is dumped by modifying the kernel to sleep long enough before
merging relocation trees, so we can just copy the whole image elsewhere
before the kernel begins to merge the reloc trees.
The base image is created by the following script, which bumps the
metadata size:
------
dev=~/test.img
mnt=/mnt/btrfs
umount $mnt &> /dev/null
fallocate -l 128M $dev
mkfs.btrfs -f -n 4k -m single -d single $dev
mount $dev $mnt -o nospace_cache,max_inline=2048
btrfs subvolume create $mnt/src
for i in $(seq -w 0 128); do
	xfs_io -f -c "pwrite 0 2k" $mnt/src/file_$i > /dev/null
done
for i in $(seq -w 0 64); do
	btrfs subvolume snapshot $mnt/src/ $mnt/snapshot_$i
	touch $mnt/snapshot_$i/new
done
sync
------
The image triggers several corner cases that the old lowmem mode didn't
consider, such as a metadata backref with the FULL_BACKREF flag and only
SHARED_BLOCK_REF backrefs for metadata, and several tree reloc trees
with shared leaves/nodes that confuse the old lowmem mode.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For an image with a metadata item that has only shared block refs, like:
------
item 66 key (21573632 METADATA_ITEM 0) itemoff 3971 itemsize 24
refs 66 gen 9 flags TREE_BLOCK|FULL_BACKREF
tree block skinny level 0
item 0 key (21573632 SHARED_BLOCK_REF 21676032) itemoff 3995 itemsize 0
shared block backref
item 1 key (21573632 SHARED_BLOCK_REF 21921792) itemoff 3995 itemsize 0
shared block backref
item 2 key (21573632 SHARED_BLOCK_REF 21995520) itemoff 3995 itemsize 0
shared block backref
item 3 key (21573632 SHARED_BLOCK_REF 22077440) itemoff 3995 itemsize 0
shared block backref
...
------
Lowmem mode check will report false alerts like:
------
ERROR: extent[21573632 4096] backref lost (owner: 256, level: 0)
------
[CAUSE]
In fact, the false alerts are not even from extent tree verification,
but from a fs tree helper which is designed to make sure there is some
tree block referring to the fs tree block.
The idea is to look for an inlined tree backref, then a keyed
TREE_BLOCK_REF_KEY.
However it missed SHARED_BLOCK_REF_KEY, which caused the false alert.
[FIX]
Also check SHARED_BLOCK_REF_KEY to silence the false alert.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For a reloc tree root, the backref points to itself, so in that case
we should finish the lookup.
The previous end condition required the tree to be a reloc tree *and*
its root bytenr to match the bytenr passed in.
However the @root passed in can be another tree, e.g. another tree reloc
root which shares the node/leaf. This makes any check based on the @root
passed in invalid.
The patch removes the unreliable root objectid detection, and only uses
the root->bytenr check.
As for the possibility of an invalid self-pointing backref, the extent
tree checker should have already handled it, so we don't need to bother
in the fs tree checker.
Fixes: 54c8f9152f ("btrfs-progs: check: Fix lowmem mode stack overflow caused by fsck/023")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Btrfs lowmem mode fails with the following ASSERT() on certain valid
images.
------
backref.c:466: __add_missing_keys: Assertion `ref->root_id` failed, value 0
------
[REASON]
Lowmem mode uses btrfs_find_all_roots() when walking down fs trees.
However, if a tree block has only shared parent backrefs like below,
the backref code from btrfs-progs doesn't handle it correctly.
------
item 72 key (604653731840 METADATA_ITEM 0) itemoff 13379 itemsize 60
refs 4 gen 7198 flags TREE_BLOCK|FULL_BACKREF
tree block skinny level 0
shared block backref parent 604498477056
shared block backref parent 604498460672
shared block backref parent 604498444288
shared block backref parent 604498411520
------
Such a shared block ref is a *direct* ref, which means we don't need to
resolve its key, nor its rootid, as the objective of the backref walk is
to find all direct parents until it reaches the tree root.
So such a direct ref should be added to pref_state->pending, rather than
to pref_state->pending_missing_key.
[FIX]
For a direct ref, add it to pref_state->pending directly to solve the
problem.
Reported-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce a new test image, which has an extent item with no inlined
extent data ref, only keyed extent data refs.
Only in this case can we trigger the false data extent backref lost bug
in lowmem mode.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For a keyed extent ref, the offset is the calculated offset (file offset
- file extent offset), just like for an inlined extent data ref.
However, the code uses the file offset to hash the extent data ref
offset, causing a false backref lost warning like:
------
ERROR: data extent[16913485824 7577600] backref lost
------
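A small Python sketch of why the lookup misses. The numbers and the helper names are made up for illustration, and a plain dictionary stands in for the real hashed key lookup:

```python
KiB = 1024


def data_ref_offset(file_offset, extent_offset):
    """Offset stored in an extent data ref: file offset - file extent offset."""
    return file_offset - extent_offset


# Hypothetical file extent: starts at file offset 1M and begins 256K into
# the data extent, so the ref is recorded with offset 768K.
FILE_OFFSET = 1024 * KiB
EXTENT_OFFSET = 256 * KiB

# Keyed refs of one data extent, indexed by (root, objectid, offset).
keyed_refs = {(257, 258, data_ref_offset(FILE_OFFSET, EXTENT_OFFSET)): 1}


def find_keyed_ref(root, ino, offset):
    """Return the ref count, or None if no such keyed ref exists."""
    return keyed_refs.get((root, ino, offset))


# Looking up with the raw file offset misses the ref (false "backref lost"),
# looking up with the calculated offset finds it.
wrong = find_keyed_ref(257, 258, FILE_OFFSET)
right = find_keyed_ref(257, 258, data_ref_offset(FILE_OFFSET, EXTENT_OFFSET))
```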
Fixes: b0d360b541 ("btrfs-progs: check: introduce function to check data backref in extent tree")
Reported-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When lowmem fsck tries to find the backref of a specified file extent,
it searches the inlined data refs first.
However, an extent data ref contains the owner root objectid, the inode
number and the calculated offset (file offset - extent offset).
The code only checks the owner root objectid, and checks neither the
inode number nor the calculated offset.
This makes lowmem mode fail to detect any backref mismatch if there is
an inlined data ref with the same owner objectid.
Fix it by also checking the extent data ref's objectid and offset.
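The stricter comparison can be sketched like this in Python. DataRef and ref_matches() are hypothetical names used for illustration, not the structures used by the checker:

```python
from typing import NamedTuple


class DataRef(NamedTuple):
    root: int       # owner root objectid
    objectid: int   # inode number
    offset: int     # calculated offset (file offset - extent offset)


def ref_matches(ref, root, ino, file_offset, extent_offset):
    """Match all three fields, not just the owner root objectid."""
    return (ref.root == root
            and ref.objectid == ino
            and ref.offset == file_offset - extent_offset)


inline_ref = DataRef(root=257, objectid=258, offset=0)

# Same root, same inode, matching calculated offset: a real match.
assert ref_matches(inline_ref, 257, 258, 4096, 4096)
# Same root but a different inode: a root-only check would accept this.
assert not ref_matches(inline_ref, 257, 259, 0, 0)
```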
Fixes: b0d360b541 ("btrfs-progs: check: introduce function to check data backref in extent tree")
Reported-by: Chris Murphy <chris@colorremedies.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
v4.14 btrfs-progs can't pass the new self test image with large tree
reloc trees. It fails on the later "shared_block_ref_only.raw.xz" test
image with a NULL pointer access.
[CAUSE]
For an image with a high (level >= 2) tree reloc tree, the ulist passed
to need_check() is empty, as tree reloc trees are not accounted for in
btrfs_find_all_roots(). Accessing ulist->roots with rb_first() then
returns a NULL pointer.
[FIX]
In need_check(), if @roots is empty, meaning the block belongs to a tree
reloc tree, always check it. This can be slow, but at least it's safe:
we won't skip any possibly wrong tree block.
Fixes: 5e2dc77047 ("btrfs-progs: check: skip shared node or leaf check for low_memory mode")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Commit 723427d7e6 ("btrfs-progs: check: change the way lowmem mode
traverses metadata") introduced a regression which can cause some fsck
self test cases to fail.
For fsck test case 004-no-dir-item, btrfs check --mode=lowmem --repair
can cause BUG_ON() with ret = -17 (-EEXIST) when committing transaction.
The problem happens with the following backtrace:
./btrfs(+0x22045)[0x555d0dade045]
./btrfs(+0x2216f)[0x555d0dade16f]
./btrfs(+0x29df1)[0x555d0dae5df1]
./btrfs(+0x2a142)[0x555d0dae6142]
./btrfs(btrfs_alloc_free_block+0x78)[0x555d0dae6202]
./btrfs(__btrfs_cow_block+0x177)[0x555d0dad00a2]
./btrfs(btrfs_cow_block+0x116)[0x555d0dad05a8]
./btrfs(commit_tree_roots+0x91)[0x555d0db1fd4f]
./btrfs(btrfs_commit_transaction+0x18c)[0x555d0db20100]
./btrfs(btrfs_fix_super_size+0x190)[0x555d0db005a4]
./btrfs(btrfs_fix_device_and_super_size+0x177)[0x555d0db00771]
./btrfs(cmd_check+0x1757)[0x555d0db4f6ab]
./btrfs(main+0x138)[0x555d0dace5dd]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7fa5e4613f6a]
./btrfs(_start+0x2a)[0x555d0dacddda]
The bug is triggered when the extent allocator considers the range
[29360128, 29376512) free and allocates it. However, when inserting the
EXTENT_ITEM, btrfs finds there is already a tree block (the fs tree
root) there, returning -EEXIST and causing the later BUG_ON().
[CAUSE]
The cause is that in repair mode, lowmem check always pins all metadata
blocks. However, pinned metadata blocks are unpinned when the
transaction commits, and are marked as *FREE* space.
So the extent allocator later considers such ranges free and allocates
them incorrectly.
[FIX]
Don't pin metadata blocks without a valid reason or preparation (such as
discarding all free space cache to re-calculate free space on next write).
Fixes: 723427d7e6 ("btrfs-progs: check: change the way lowmem mode traverses metadata")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The standalone utility btrfs-show-super has been obsoleted by 'btrfs
inspect-internal dump-super' but it's still in the repository and should
build in case somebody still uses it.
Reported-by: "John L. Center" <jlcenter15@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently ctime/otime/stime/rtime of ROOT_ITEM are not printed in
print_root_item(). Fix this and print them if the values are not zero.
The function print_timespec() is moved earlier in the file so it can be
reused.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The helper script ./travis-should-run-test has been moved to a directory
in 4.13.3 but the path in the config was not updated. This was not
caught in the CI environment and the tests did not report a failure.
Signed-off-by: David Sterba <dsterba@suse.com>