Since lowmem mode can repair certain corruptions (mostly in the fs tree),
add a beacon file to fsck test cases so that some of them can be
tested in lowmem mode.
With this patch, the fsck option override checks for the beacon file
".lowmem_repairable" in the same directory as the test image, and if the
beacon exists, it also runs lowmem mode repair on the
image.
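The beacon lookup itself lives in the shell test framework. Purely as an illustration, the same check can be sketched in C; has_lowmem_beacon() is a hypothetical helper, not part of btrfs-progs:

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if the directory holding the test image
 * contains the beacon file ".lowmem_repairable", 0 otherwise.
 */
int has_lowmem_beacon(const char *image_dir)
{
	char path[4096];
	int n = snprintf(path, sizeof(path), "%s/.lowmem_repairable", image_dir);

	/* Treat an overlong path as "no beacon" rather than truncating it. */
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;
	return access(path, F_OK) == 0;
}
```

A test runner would call this once per test case and only add the lowmem repair pass when it returns 1.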
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Repair functions search the path again, so once the last item has been
checked, the location the path points to is invalid.
Fix it by saving the last valid key if err contains LAST_ITEM,
and calling btrfs_next_item() before returning from check_inode_item().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While checking file extents, there are two errors that may occur:
1) There is a hole between the end of the last extent and the beginning
of the current extent, but no-holes is disabled.
2) No-holes is disabled, a file's nbytes is 0, but its isize is not 0.
Both mean the file may have lost some extents.
To make btrfsck stop reporting these errors, introduce the function
'punch_extent_hole' to punch holes.
For case 1, punch a hole extent whose length is
(current extent begin - last extent end)
while checking one extent.
For case 2, punch a hole extent whose length is
(file isize - actual file size)
after traversing one entire file.
Then repair_inode_nbytes will set the nbytes to isize.
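The two hole lengths above can be sketched as plain arithmetic. This is a minimal illustration only; the function names are hypothetical and the real punch_extent_hole() also inserts the hole extent item, which is not shown here:

```c
#include <stdint.h>

/*
 * Case 1: gap between the end of the previous extent and the start of
 * the current one. Returns 0 when the extents are contiguous or overlap.
 */
uint64_t hole_len_between(uint64_t last_extent_end, uint64_t cur_extent_start)
{
	if (cur_extent_start <= last_extent_end)
		return 0;
	return cur_extent_start - last_extent_end;
}

/*
 * Case 2: missing tail after the whole file has been traversed.
 * Returns 0 when the extents already cover isize.
 */
uint64_t hole_len_tail(uint64_t isize, uint64_t actual_size)
{
	if (isize <= actual_size)
		return 0;
	return isize - actual_size;
}
```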
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
New function repair_inode_nlinks_lowmem() sets nlink of the inode to refs.
If refs equals 0, move the inode to lost+found and set refs to 1
initially.
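A minimal sketch of that rule, using a hypothetical in-memory struct rather than the real btrfs inode item (moving the inode into lost+found is reduced to a flag here):

```c
#include <stdint.h>

/* Hypothetical view of an inode under repair; not a btrfs-progs type. */
struct inode_fix {
	uint64_t refs;		/* references actually found on disk */
	uint32_t nlink;		/* nlink currently stored in the inode item */
	int in_lost_found;	/* set when the inode had to be relinked */
};

void fix_nlink(struct inode_fix *ino)
{
	if (ino->refs == 0) {
		/* No name points at the inode: give it one in lost+found. */
		ino->in_lost_found = 1;
		ino->refs = 1;
	}
	ino->nlink = (uint32_t)ino->refs;
}
```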
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
repair_ternary_lowmem() may delete dir_item(s), so a later traversal can
compute a wrong isize for the directory inode.
Introduce count_dir_isize() to count the directory isize if any
dir_item(s) in the directory have been repaired.
check_dir_item() now returns DIR_COUNT_AGAIN to signal that the isize of
the inode should be counted again.
It is unnecessary to recount after check_inode_ref(), since
inode_ref is irrelevant to isize.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce repair_ternary_lowmem() to repair dir_item, dir_index
and inode_ref.
If two of the three are missing or mismatched, call btrfs_unlink() to
delete the existing one.
If one of three is missing or mismatched, call btrfs_add_link() to
add the missing one.
repair_dir_item() inserts an inode item corresponding to location in the
dir item if error contains INODE_ITEM_MISSING.
Also, it calls repair_ternary_lowmem() to repair relationship of
dir_item, dir_index and inode_ref.
check_inode_ref() calls repair_ternary_lowmem() to fix up errors.
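The decision rule for the three items can be sketched as follows. The enum and function names are hypothetical. The case where none or all three items are bad is not described above, so returning "nothing" for it is an assumption of this sketch:

```c
#include <stdbool.h>

enum ternary_action {
	TERNARY_NOTHING,	/* 0 or 3 bad items: assumption, not covered above */
	TERNARY_ADD_LINK,	/* one bad item: re-add it (btrfs_add_link()) */
	TERNARY_UNLINK,		/* two bad items: drop the survivor (btrfs_unlink()) */
};

enum ternary_action ternary_decide(bool dir_item_ok, bool dir_index_ok,
				   bool inode_ref_ok)
{
	/* Count how many of the three items are missing or mismatched. */
	int bad = !dir_item_ok + !dir_index_ok + !inode_ref_ok;

	if (bad == 2)
		return TERNARY_UNLINK;
	if (bad == 1)
		return TERNARY_ADD_LINK;
	return TERNARY_NOTHING;
}
```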
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce 'repair_fs_first_inode' to repair first inode errors
(ref missing and inode item missing).
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce __create_inode_item() to create a new inode item.
It is called by create_inode_item() and create_inode_item_lowmem().
Function repair_inode_item_missing() just adds a new inode item.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For code reuse, btrfs_insert_dir_item() now calls
inserts_with_overflow() even if the dir_item existed.
Add a parameter @ignore_existed to btrfs_add_link().
If @ignore_existed is not zero, btrfs_add_link() continues to do link.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
check_dir_item() now checks relative dir_item/dir_index.
Introduce print_dir_item_err() to print error msg while
checking dir_item/dir_index.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce print_inode_ref() to print error msg while checking inode ref.
Add args @name_ret and @namelen_ret to check_inode_ref().
Name is essential if the inode item is to be put into lost+found
while doing nlinks repair.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The changes in this patch prepare for further repair:
1. Introduce find_dir_index() to get the index by traversing items.
2. We should distinguish dir_index errors from dir_item errors.
However, there are only DIR_ITEM_MISSING and DIR_ITEM_MISMATCH.
Introduce macros DIR_INDEX_MISSING and DIR_INDEX_MISMATCH
to represent a missing or mismatched index.
3. find_dir_item() currently prints a message as soon as it detects an
error.
Remove that output now; later patches will introduce functions
to print the error messages.
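With separate bits, a caller can tell the two error classes apart. The sketch below only illustrates that idea: the bit values are made up for this example and do not match the real definitions in the btrfs-progs sources, and the two helper functions are hypothetical:

```c
/* Illustrative bit values only; the real ones are defined in btrfs-progs. */
#define DIR_ITEM_MISSING	(1U << 0)
#define DIR_ITEM_MISMATCH	(1U << 1)
#define DIR_INDEX_MISSING	(1U << 2)
#define DIR_INDEX_MISMATCH	(1U << 3)

/* Non-zero if the error mask reports a problem with the dir_item. */
int has_dir_item_err(unsigned int err)
{
	return (err & (DIR_ITEM_MISSING | DIR_ITEM_MISMATCH)) != 0;
}

/* Non-zero if the error mask reports a problem with the dir_index. */
int has_dir_index_err(unsigned int err)
{
	return (err & (DIR_INDEX_MISSING | DIR_INDEX_MISMATCH)) != 0;
}
```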
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Modify check_fs_first_inode to check the inode ref in the first inode.
The root dir inode differs from other inodes in that its inode_ref
points to "..".
So we just handle this special case and treat it as a normal
inode in the rest of the check.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For further lowmem repair, change the type of the @index parameter of
find_inode_ref() from u64 to u64 *,
so the caller can get the index of the ref.
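The out-parameter pattern looks roughly like this. It is a generic sketch over a hypothetical in-memory table; lookup_ref_index() and struct ref_entry are not the real find_inode_ref() interface:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an inode ref: a name and its dir index. */
struct ref_entry {
	const char *name;
	uint64_t index;
};

/*
 * Return 0 and store the index in *index_ret when @name is found,
 * -1 otherwise. *index_ret is left untouched on failure.
 */
int lookup_ref_index(const struct ref_entry *refs, size_t nr,
		     const char *name, uint64_t *index_ret)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (strcmp(refs[i].name, name) == 0) {
			if (index_ret)
				*index_ret = refs[i].index;
			return 0;
		}
	}
	return -1;
}
```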
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce repair_inode_orphan_item_lowmem() to add an orphan
item if the inode refs and nlink are both zero.
repair_inode_orphan_item_lowmem() is just a wrapper function
that calls btrfs_add_orphan_item().
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
After traversal of whole directory, we should get the actual isize.
Like original mode, function repair_dir_isize_lowmem() sets isize of the
directory inode item to actual size.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
After checking one entire inode item, we should get the actual
nbytes of the inode item.
Like original mode, repair_inode_nbytes_lowmem() sets nbytes in
struct btrfs_inode_item to the actual nbytes.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Turn on the option --repair with --mode=lowmem in btrfs check.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ use warning() and adjust wording ]
Signed-off-by: David Sterba <dsterba@suse.com>
We should use entry->root_id instead of top_id to determine whether it is
the toplevel subvolume. Introduced in 4.13.2.
Issue: #72
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, gcc is passed the include directory with its full path. As a
result, dependency files (*.o.d) also record the full path at build time.
Such full-path dependencies are annoying when sharing the source between
multiple machines, containers, or anywhere else the path differs.
Using a relative path is what other programs using autotools, e.g.
e2fsprogs, do:
$ grep top_builddir Makefile
top_builddir = .
CPPFLAGS = -I. -I$(top_builddir)/lib -I$(top_srcdir)/lib
BUILD_CFLAGS = -g -O2 -I. -I$(top_builddir)/lib -I$(top_srcdir)/lib -DHAVE_CONFIG_H
<snip>
Signed-off-by: Naohiro Aota <naota@elisp.net>
[ set TOPDIR=. instead of -I as discussed, does not harm linker ]
Signed-off-by: David Sterba <dsterba@suse.com>
We'll need TOPDIR to be ./ but library-test is intentionally built
outside of the git repository so we need to make them separate.
Signed-off-by: David Sterba <dsterba@suse.com>
Before the change, configure refused to accept its defaults when given explicitly:
$ ./configure --enable-convert --with-convert=ext2,reiserfs
...
configure: error: unknown tokens for --with-convert: ,
After the change both converters are enabled:
$ ./configure --enable-convert --with-convert=ext2,reiserfs
...
btrfs-convert: yes (ext2,reiserfs)
Cc: linux-btrfs@vger.kernel.org
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: David Sterba <dsterba@suse.com>
We're going to add more build check integration scripts and
configuration, so put everything under travis/ now.
Signed-off-by: David Sterba <dsterba@suse.com>
The check opens the given device in exclusive mode by default. In the
forced mode we want to access a device in use, so we have to drop the
exclusivity bit.
This works for block devices but not for files, which could be mounted
via a loop device. In that respect test check/007 is broken and will be
fixed.
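At the open(2) level, dropping the exclusivity bit can be sketched like this. On Linux, O_EXCL without O_CREAT on a block device makes open() fail with EBUSY while the device is in use. The helper name and the O_RDWR base are assumptions of this sketch; btrfs-progs wraps this in its own open flags:

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical helper: compute open(2) flags for the device being
 * checked. Exclusive by default, non-exclusive in forced mode.
 */
int check_open_flags(bool force)
{
	int flags = O_RDWR | O_EXCL;

	if (force)
		flags &= ~O_EXCL;	/* allow opening a device that is in use */
	return flags;
}
```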
Signed-off-by: David Sterba <dsterba@suse.com>
Lowmem mode only repairs the few cases that have a beacon file
".lowmem_repairable" in the case's directory.
However, defining TEST_ENABLE_OVERRIDE=true on the command line does not
work with the above strategy,
because _skip_spec() in tests/common.local isn't interpreted by the shell
in that case.
Solve it by making _skip_spec() always defined in common.local.
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
[ keep the _skip_spec check ]
Signed-off-by: David Sterba <dsterba@suse.com>
Adding support for 'btrfs subvol show' for the toplevel subvolume
accidentally started to list the toplevel subvolume among the deleted
ones. This has happened since version 4.8.3.
Don't panic. The toplevel subvolume (id 5) cannot be deleted.
Fixes: d4aa2bc07e ("btrfs-progs: subvol show: print more details about toplevel subvolume")
Signed-off-by: David Sterba <dsterba@suse.com>
Kernel 4.14 introduces a new function for checking whether all chunks are
OK for mounting with the -o degraded option:
commit 21634a19f646 ("btrfs: Introduce a function to check if all
chunks a OK for degraded rw mount")
As a result, raid0 profile cannot be mounted with -o degraded on 4.14.
This causes failure of the misc-test 011 "delete missing device".
Fix this by using raid1 profile for both data and metadata.
This should also work for kernels before 4.13.
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The values for block group offset, length etc. in btrfs-debugfs' output
are left-aligned, which creates unaligned output and makes the usage
percentage hard to read/process further. This patch adds right-aligning
format specifiers for the number values.
Ideally the format values wouldn't be hardcoded but instead derived from
the filesystem size, but this seems to work for now.
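The effect of right-aligning width specifiers can be shown with C printf, whose %-style widths behave the same way. This is only an analogue of the change: the function name, field labels and the widths 12, 10 and 6 are made up for this sketch and are not taken from btrfs-debugfs:

```c
#include <stdio.h>

/*
 * Format one block group line with fixed-width, right-aligned numbers.
 * Returns the value of snprintf(), i.e. the length of the full line.
 */
int format_bg_line(char *buf, size_t len, unsigned long long offset,
		   unsigned long long length, double used_pct)
{
	return snprintf(buf, len,
			"block group offset %12llu len %10llu used %6.2f%%",
			offset, length, used_pct);
}
```

Numbers shorter than the field width are padded with spaces on the left, so the columns line up as long as no value exceeds the hardcoded width.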
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The tabular output format looks better if the empty uuids are aligned
with the others. In the list output (now default) it's not that nice, but
the whole list format is not nice anyway.
Signed-off-by: David Sterba <dsterba@suse.com>
Rather than iterate over all outstanding backrefs to resolve indirect refs,
use a separate list that only contains indirect refs.
When we process missing keys, the ref moves to the indirect ref list.
Once the indirect ref is resolved, move the ref to the pending list.
Eventually these lists will be replaced by rbtrees.
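The life cycle of a ref across the lists can be sketched with a small circular list. All names here (struct ref, struct ref_lists, the list members and the helper functions) are hypothetical and only mirror the flow described above: missing key, then indirect, then pending:

```c
struct list_node {
	struct list_node *prev, *next;
};

void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

/* Unlink @n from the list it is on and append it to @head. */
void list_move_to_tail(struct list_node *n, struct list_node *head)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

struct ref {
	struct list_node list;
	int key_known;
	int parent_known;
};

struct ref_lists {
	struct list_node pending;		/* fully resolved refs */
	struct list_node pending_missing_keys;	/* refs that still need a key */
	struct list_node pending_indirect_refs;	/* refs that need a parent */
};

void ref_lists_init(struct ref_lists *l)
{
	list_init(&l->pending);
	list_init(&l->pending_missing_keys);
	list_init(&l->pending_indirect_refs);
}

/* Queue a new ref on the list that matches its current state. */
void queue_ref(struct ref_lists *l, struct ref *r)
{
	list_init(&r->list);
	if (!r->key_known)
		list_move_to_tail(&r->list, &l->pending_missing_keys);
	else if (!r->parent_known)
		list_move_to_tail(&r->list, &l->pending_indirect_refs);
	else
		list_move_to_tail(&r->list, &l->pending);
}

/* The missing key was found: the ref still has to be resolved. */
void key_resolved(struct ref_lists *l, struct ref *r)
{
	r->key_known = 1;
	list_move_to_tail(&r->list, &l->pending_indirect_refs);
}

/* The indirect ref was resolved: nothing left to look up. */
void indirect_resolved(struct ref_lists *l, struct ref *r)
{
	r->parent_known = 1;
	list_move_to_tail(&r->list, &l->pending);
}
```

Each pass then walks only the list it cares about instead of filtering one big list of outstanding backrefs.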
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ added assertion fix from Josef ]
Signed-off-by: David Sterba <dsterba@suse.com>
Rather than iterate over all outstanding backrefs to resolve missing keys,
use a separate list that only contains refs that need missing keys resolved.
Once the missing key is resolved, move the ref to the pending list.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Eventually, we'll have several lists and trees, as well as some statistics.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have the infrastructure to cache extent buffers but we don't actually
do the caching. As soon as the last reference is dropped, the buffer
is dropped. This patch keeps the extent buffers around until the max
cache size is reached (defaults to 25% of memory) and then it drops
the last 10% of the LRU to free up cache space for reallocation. The
cache size is configurable (for use by e.g. lowmem) when the cache is
initialized.
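The keep-then-trim policy can be sketched with a byte-counted LRU list. This is a simplified model, not the btrfs-progs extent buffer cache: the type and function names are hypothetical, and "drop the last 10%" is interpreted here as evicting from the cold end until usage is at most 90% of the limit:

```c
#include <stddef.h>
#include <stdlib.h>

struct cached_buf {
	struct cached_buf *prev, *next;	/* LRU links, head = most recent */
	size_t len;
};

struct buf_cache {
	struct cached_buf *head, *tail;
	size_t size;		/* bytes currently cached */
	size_t max_size;	/* limit chosen when the cache is initialized */
};

void cache_init(struct buf_cache *c, size_t max_size)
{
	c->head = c->tail = NULL;
	c->size = 0;
	c->max_size = max_size;
}

struct cached_buf *cache_new_buf(size_t len)
{
	struct cached_buf *b = malloc(sizeof(*b));

	if (b) {
		b->prev = b->next = NULL;
		b->len = len;
	}
	return b;
}

/*
 * Called when the last reference to @b is dropped. The buffer is kept
 * at the hot end of the LRU; if the cache is now over its limit, cold
 * buffers are freed until usage is back at 90% of the limit.
 * The cache owns @b afterwards. Returns the number of buffers evicted.
 */
int cache_release(struct buf_cache *c, struct cached_buf *b)
{
	size_t target;
	int evicted = 0;

	if (!b)
		return 0;
	b->prev = NULL;
	b->next = c->head;
	if (c->head)
		c->head->prev = b;
	else
		c->tail = b;
	c->head = b;
	c->size += b->len;

	if (c->size <= c->max_size)
		return 0;

	target = c->max_size - c->max_size / 10;
	while (c->tail && c->size > target) {
		struct cached_buf *victim = c->tail;

		c->tail = victim->prev;
		if (c->tail)
			c->tail->next = NULL;
		else
			c->head = NULL;
		c->size -= victim->len;
		free(victim);
		evicted++;
	}
	return evicted;
}
```

Trimming in a batch rather than one buffer at a time means the cache does not hit the limit again on the very next release.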
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
[ update codingstyle, switch total_memory to bytes ]
Signed-off-by: David Sterba <dsterba@suse.com>
We now have two data structures that can be used to iterate the same data
set, and there may be quite a few of them in memory. Eliminating the
list_head member will reduce memory consumption while iterating over
the extent backrefs.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For the pathological case, like xfstests generic/297, which creates a
large file consisting of one repeating reflinked extent, fsck can
take hours. The root cause is that calling find_data_backref while
iterating the extent records is an O(n^2) algorithm. For my
example test run, n was 2*2^20 and fsck was at 8 hours and counting.
This patch supplements the list with an rbtree and drops the runtime
of that testcase to about 20 seconds.
A previous version of this patch introduced a regression that would
have corrupted file systems during repair. It was traced to the
compare algorithm honoring ->bytes regardless of whether the
reference had been found and a failure to reinsert nodes after
the target reference was found.
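To illustrate the lookup change without pulling in the btrfs-progs rbtree code, the sketch below uses POSIX tsearch()/tfind() as a stand-in ordered tree (glibc implements it as a red-black tree; other libcs may differ). struct data_ref and its three key fields are simplified assumptions, not the real backref structure:

```c
#include <search.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for a data backref key. */
struct data_ref {
	uint64_t root;
	uint64_t objectid;
	uint64_t offset;
};

/* Total order over (root, objectid, offset), as a tree key needs. */
int ref_cmp(const void *a, const void *b)
{
	const struct data_ref *x = a, *y = b;

	if (x->root != y->root)
		return x->root < y->root ? -1 : 1;
	if (x->objectid != y->objectid)
		return x->objectid < y->objectid ? -1 : 1;
	if (x->offset != y->offset)
		return x->offset < y->offset ? -1 : 1;
	return 0;
}

/* Insert @ref into the tree; returns 0 on success, -1 if out of memory. */
int ref_insert(void **tree, struct data_ref *ref)
{
	return tsearch(ref, tree, ref_cmp) ? 0 : -1;
}

/* Tree lookup instead of walking a list of all refs. */
struct data_ref *ref_find(void *const *tree, uint64_t root,
			  uint64_t objectid, uint64_t offset)
{
	struct data_ref key = { root, objectid, offset };
	void *node = tfind(&key, tree, ref_cmp);

	return node ? *(struct data_ref **)node : NULL;
}
```

With a balanced tree each lookup is logarithmic in the number of refs, so n lookups no longer cost O(n^2) in total.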
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Mkfs uses make_path, which is a duplicate of the path_cat* functions, so
we can switch to them and add the error handling.
Signed-off-by: David Sterba <dsterba@suse.com>
A convert parameter is added as a flag to indicate if btrfs_mksubvol()
is used for btrfs-convert. The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
link_subvol() is moved to inode.c and renamed as btrfs_mksubvol().
The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Fix 'subvolume delete --commit-after' to work properly:
- SYNC ioctl will be issued even when last delete fails
- SYNC ioctl will be issued on each file system only once in the end
To achieve this, get_fsid() and add_seen_fsid() are called after each
delete to keep only one fd for each fs.
In the end, seen_fsid_hash will be traversed and SYNC is issued on each
fs.
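The "one fd per filesystem" bookkeeping can be sketched as below. A fixed array with a linear scan is used here for brevity instead of the hash the patch refers to, and struct seen_fsids and remember_fsid() are hypothetical names:

```c
#include <string.h>

#define FSID_SIZE	16
#define MAX_SEEN	32

struct seen_fsids {
	unsigned char fsid[MAX_SEEN][FSID_SIZE];
	int fd[MAX_SEEN];	/* one kept-open fd per filesystem */
	int nr;
};

/*
 * Record @fsid after a delete.
 * Returns 1 if the filesystem is new (the caller keeps @fd open for the
 * final sync), 0 if it was already seen (the caller may close @fd),
 * -1 if the table is full.
 */
int remember_fsid(struct seen_fsids *s, const unsigned char *fsid, int fd)
{
	int i;

	for (i = 0; i < s->nr; i++)
		if (memcmp(s->fsid[i], fsid, FSID_SIZE) == 0)
			return 0;
	if (s->nr == MAX_SEEN)
		return -1;
	memcpy(s->fsid[s->nr], fsid, FSID_SIZE);
	s->fd[s->nr] = fd;
	s->nr++;
	return 1;
}
```

At the end, the caller walks fd[0] to fd[nr - 1] and issues the SYNC ioctl once per entry, regardless of whether the last delete succeeded.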
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
Signed-off-by: David Sterba <dsterba@suse.com>