25 Commits (07ce7005fc81289eb4c7dde7d601be08c977b92c)

Author SHA1 Message Date
Qu Wenruo 0ffacad290 btrfs-progs: rebuild missing block group during chunk recovery if possible
Before this patch, a chunk was considered bad if its corresponding
block group was missing, even though the only uncertain data is the 'used'
member of the block group.

This patch tries to recalculate the 'used' value of the block group
and rebuild it.
So even if only the chunk item and dev extent item are found, the chunk can be
recovered.
However, if the extent tree is damaged and the needed extent item can't be
read, the block group's 'used' value is set to the block group length, to
prevent any later write or block reservation from damaging the block group.
In that case, we prompt the user and recommend using
'--init-extent-tree' to rebuild the extent tree if possible.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-12-04 16:48:13 +01:00
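
The 'used' recalculation and its fallback described in 0ffacad290 can be sketched as below. This is a minimal illustration under stated assumptions, not the btrfs-progs implementation: `struct extent_rec`, `calc_block_group_used()` and the `extents_ok` flag are hypothetical names, and real extent items carry more state than a start and a length.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified extent record: one allocated byte range. */
struct extent_rec {
	uint64_t start;
	uint64_t len;
};

/*
 * Recompute the 'used' value for a block group covering
 * [bg_start, bg_start + bg_len). If the extent tree could not be read
 * (extents_ok == 0), report the group as full so that later writes and
 * reservations cannot land in it.
 */
uint64_t calc_block_group_used(uint64_t bg_start, uint64_t bg_len,
			       const struct extent_rec *extents,
			       size_t nr, int extents_ok)
{
	uint64_t used = 0;
	size_t i;

	if (!extents_ok)
		return bg_len;

	for (i = 0; i < nr; i++) {
		/* Count only extents that lie fully inside the block group. */
		if (extents[i].start >= bg_start &&
		    extents[i].start + extents[i].len <= bg_start + bg_len)
			used += extents[i].len;
	}
	return used;
}
```

Returning the full length in the failure case is the conservative choice: the group looks full, so nothing new is allocated from it until the extent tree is rebuilt.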
Anand Jain 38cfeef103 btrfs-progs: introduce a proper structure on which cli will call register-device ioctl
As of now, the commands mentioned below (within [..]) call the register-device
ioctl BTRFS_IOC_SCAN_DEV for all the devices in the system.
Some issues with it:
 BTRFS_IOC_SCAN_DEV: the ioctl is a write operation; we don't want commands like
 btrfs-debug-tree to do that.
   e.g.:
   ----
   $ cat /proc/fs/btrfs/devlist  | egrep fsid | wc -l
   0
   $ btrfs-debug-tree /dev/sde  (num_device > 1)
   $ cat /proc/fs/btrfs/devlist  | egrep fsid | wc -l
   5
   ----

 btrfs_scan_fs_devices() ends up calling this ioctl only when num_device > 1.
 That's an inconsistency within this feature/bug.

 We don't have to register _all_ the btrfs devices (again) in the system
 without user consent.

Why it's inconsistent:
 The function btrfs_scan_fs_devices() calls btrfs_scan_lblkid() only when
 num_devices is > 1, which in turn calls the BTRFS_IOC_SCAN_DEV ioctl if
 conditions are met.

 But the main issue is that we have too many consumers of btrfs_scan_fs_devices();
 the names below within [] are the CLI commands leading to this function.

 open_ctree_broken()  [btrfs-find-root]
 recover_prepare()    [btrfs rescue super-recover]
 __open_ctree_fd
 (updates always, except when the flag OPEN_CTREE_RECOVER_SUPER is set; that
 flag is set only by 'btrfs rescue super-recover', but this thread still
 sneaks through the open_ctree function to call the register-device ioctl,
 as shown below).
	open_ctree_fs_info
		[btrfs-debug-tree]
		[btrfs-image -r]
		[btrfs check]
		open_fs
			[btrfs restore]
		open_ctree
			[calc-size]
			[btrfs-corrupt-block]
			[btrfs-image] (create)
			[btrfs-map-logical]
			[btrfs-select-super]
			[btrfstune]
			[btrfs-zero-log]
			[tester]
			[mkfs]
			[quick-test.c]
			[btrfs label set unmounted]
			[btrfs get label unmounted]
			[btrfs rescue super-recover]

	open_ctree_fd
		[btrfs-convert]

Fix:
 In an effort to make register-device consistent, all calls to
 btrfs_scan_fs_devices() will have the 5th parameter set to 0. That means
 we don't need the 5th parameter at all. With this function no longer calling
 the register ioctl, only the following two CLI commands call
 the ioctl BTRFS_IOC_SCAN_DEV:
    btrfs dev scan and
    mkfs.btrfs
 Threads that need to update the kernel about a device have to call
 btrfs_register_one_device() separately.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-10-16 12:02:00 +02:00
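
For 38cfeef103, registering a single device explicitly might look like the sketch below. It uses the kernel's `BTRFS_IOC_SCAN_DEV` ioctl on `/dev/btrfs-control` from `<linux/btrfs.h>`; the function name `register_one_device_sketch()` is hypothetical and the body is an assumption about the general shape of such a helper, not a copy of btrfs_register_one_device().

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

/*
 * Register one device with the kernel through the btrfs control node.
 * Returns 0 on success or a negative errno value.
 * Illustrative only: name and error handling are assumptions.
 */
int register_one_device_sketch(const char *path)
{
	struct btrfs_ioctl_vol_args args;
	int fd;
	int ret = 0;

	if (strlen(path) > BTRFS_PATH_NAME_MAX)
		return -ENAMETOOLONG;

	fd = open("/dev/btrfs-control", O_RDWR);
	if (fd < 0)
		return -errno;

	/* Zero the argument so the name is always NUL-terminated. */
	memset(&args, 0, sizeof(args));
	memcpy(args.name, path, strlen(path));

	if (ioctl(fd, BTRFS_IOC_SCAN_DEV, &args) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Because the ioctl changes kernel state, keeping it in one explicit helper means read-only tools never trigger it as a side effect of opening a filesystem.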
Anand Jain 7fd6d93352 btrfs-progs: fix uninitialized warning in btrfs_calc_stripe_index
chunk-recover.c: In function btrfs_calc_stripe_index
chunk-recover.c:1481: warning: index may be used uninitialized in this function

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-10-14 11:20:09 +02:00
Qu Wenruo a1a3dc7fd4 btrfs-progs: Fix malloc size for superblock.
recover_prepare() in chunk-recover.c allocates a buffer of only
sizeof(struct btrfs_super_block) bytes. This causes a glibc malloc error
after the superblock csum is calculated.

Use BTRFS_SUPER_INFO_SIZE to fix the bug.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-10-10 09:09:47 +02:00
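
The size mismatch fixed in a1a3dc7fd4 can be illustrated as follows. This is a sketch, assuming the checksum covers everything after the checksum field up to the full 4096-byte superblock area; `SUPER_INFO_SIZE` and `CSUM_SIZE` mirror BTRFS_SUPER_INFO_SIZE and BTRFS_CSUM_SIZE, while `alloc_super_buffer()` and `toy_super_csum()` are hypothetical helpers and the byte sum is only a stand-in for crc32c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SUPER_INFO_SIZE 4096	/* mirrors BTRFS_SUPER_INFO_SIZE */
#define CSUM_SIZE       32	/* mirrors BTRFS_CSUM_SIZE */

/* Allocate a zeroed buffer large enough for the whole on-disk superblock. */
uint8_t *alloc_super_buffer(void)
{
	return calloc(1, SUPER_INFO_SIZE);
}

/*
 * Toy stand-in for the real checksum: sums every byte after the checksum
 * field. The point is the range: it runs to SUPER_INFO_SIZE, so a buffer
 * smaller than that would be accessed past its end.
 */
uint32_t toy_super_csum(const uint8_t *buf)
{
	uint32_t sum = 0;
	size_t i;

	for (i = CSUM_SIZE; i < SUPER_INFO_SIZE; i++)
		sum += buf[i];
	return sum;
}
```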
Qu Wenruo 23d7f6d9dc btrfs-progs: Allow btrfs_read_dev_super() to read all 3 super for super_recover.
The btrfs-progs superblock checksum check is too restrictive for
super-recover: current btrfs-progs reads only the 1st
superblock, and if you need super-recover the 1st superblock is
possibly already damaged.

The fix introduces a super_recover parameter for
btrfs_read_dev_super() and its callers to allow scanning the backup superblocks if
needed.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 15:04:50 +02:00
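
To make the three superblock copies in 23d7f6d9dc concrete, the sketch below computes their byte offsets (64KiB, 64MiB, 256GiB) and picks a copy. `sb_offset()` follows the usual btrfs offset formula; `pick_super()`, its `valid`/`generation` inputs and the `scan_backups` flag are hypothetical stand-ins for "this copy was read and passed its checksum" and the new super_recover parameter, and choosing the highest generation is an assumption about the selection rule.

```c
#include <stdint.h>

#define SUPER_INFO_OFFSET  (64ULL * 1024)	/* primary copy at 64KiB */
#define SUPER_MIRROR_MAX   3
#define SUPER_MIRROR_SHIFT 12

/* Byte offset of superblock copy 'mirror' (0 is the primary). */
uint64_t sb_offset(int mirror)
{
	uint64_t start = 16ULL * 1024;

	if (mirror)
		return start << (SUPER_MIRROR_SHIFT * mirror);
	return SUPER_INFO_OFFSET;
}

/*
 * Pick the usable copy with the highest generation. With scan_backups
 * set to 0 only the primary copy is considered.
 * Returns the mirror index, or -1 if no copy is usable.
 */
int pick_super(const int valid[SUPER_MIRROR_MAX],
	       const uint64_t generation[SUPER_MIRROR_MAX],
	       int scan_backups)
{
	int max = scan_backups ? SUPER_MIRROR_MAX : 1;
	int best = -1;
	int i;

	for (i = 0; i < max; i++) {
		if (!valid[i])
			continue;
		if (best < 0 || generation[i] > generation[best])
			best = i;
	}
	return best;
}
```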
Gui Hecheng ad70b55b66 btrfs-progs: replace BTRFS_NUM_MIRRORS with BTRFS_MAX_MIRRORS
BTRFS_NUM_MIRRORS in the userspace chunk-recover.c means
the same thing as BTRFS_MAX_MIRRORS in the kernel's ctree.h,
so to stay consistent with the kernel, make this move
in userspace:
	chunk-recover.c/BTRFS_NUM_MIRRORS
		===>
	ctree.h/BTRFS_MAX_MIRRORS

This makes the constant convenient to reuse later.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:43:11 +02:00
Gui Hecheng 2da5099c69 btrfs-progs: fix max mirror number error for chunk-recover
When running chunk-recover on a healthy btrfs (data profile raid0, with
plenty of data), the program may abort on the number
of mirrors of an extent.

According to the kernel code, the max mirror number of an extent
is 3 not 2:
	ctree.h: 		BTRFS_MAX_MIRRORS	3
	chunk-recover.c :	BTRFS_NUM_MIRRORS	2
Change BTRFS_NUM_MIRRORS to 3 to fix the abort.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:40:11 +02:00
Gui Hecheng 41b617ed73 btrfs-progs: fix missing parity stripe for raid6 in chunk-recover
When dealing with the P and Q stripes for data profile raid6, chunk-recover
forgets to insert them into the chunk record. Insert them as well.
Also wrap the insert procedure into a new function, fill_chunk_up.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:39:35 +02:00
Gui Hecheng 8559a1626f btrfs-progs: cleanup unused assignment for chunk-recover
'num_unordered' is recounted after 'goto out',
so remove the unused assignment.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:39:34 +02:00
Gui Hecheng c63d47653f btrfs-progs: fix blindly goto failure for chunk-recover
If the csum of one stripe cannot determine the order of two
device extents, the stripe may belong to a device extent
that has already been kicked out as ordered.
Take this condition into account: don't report failure, and
retry with the following stripes.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:39:33 +02:00
Gui Hecheng ae5c13934e btrfs-progs: fix uninitialized number count in chunk-recover
When counting the number of unordered device extents in chunk-recover,
the counter must be reinitialized before it is used.
Also, introduce a new function for the counting job.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-08-22 14:39:33 +02:00
Rakesh Pandit 30156b749b Btrfs-progs: chunk_recovery: fix mem leak and pthread_cancel call
Free memory if open call fails. Prevent pthread_cancel on threads
which have already finished successfully. If all calls to
pthread_create and pthread_join are successful, we mistakenly call
pthread_cancel because cancel_from and cancel_to are both zero.

Make POSIX.1-2001 happy by supplying a non-NULL second argument to
pthread_setcanceltype.

Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
2014-04-22 14:33:44 +02:00
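
The thread handling fixed in 30156b749b can be sketched as below. It is a simplified illustration, not the chunk-recover code: `scan_worker()`, `scan_all()` and `MAX_THREADS` are hypothetical, and the worker only sets a flag instead of scanning a device. It shows the two points from the message: a non-NULL second argument to pthread_setcanceltype, and cancelling only threads that were created but not yet joined.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 8

/* Hypothetical per-device worker: sets the cancel type, then "scans". */
static void *scan_worker(void *arg)
{
	int oldtype;	/* POSIX.1-2001 expects a non-NULL oldtype pointer */

	pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);
	*(int *)arg = 1;	/* mark this device as scanned */
	return NULL;
}

/*
 * Start nr workers and join them. On failure, cancel only the threads in
 * [joined, created); when every call succeeds that range is never visited
 * and pthread_cancel is not called. Returns 0 on success, nonzero on error.
 */
int scan_all(int *done, int nr)
{
	pthread_t tids[MAX_THREADS];
	int created = 0, joined = 0;
	int ret = 0;
	int i;

	if (nr < 0 || nr > MAX_THREADS)
		return -1;

	for (i = 0; i < nr; i++) {
		ret = pthread_create(&tids[i], NULL, scan_worker, &done[i]);
		if (ret)
			goto cancel;
		created++;
	}
	for (i = 0; i < nr; i++) {
		ret = pthread_join(tids[i], NULL);
		joined = i + 1;		/* never cancel a thread we tried to join */
		if (ret)
			goto cancel;
	}
	return 0;

cancel:
	for (i = joined; i < created; i++) {
		pthread_cancel(tids[i]);
		pthread_join(tids[i], NULL);	/* reap the cancelled thread */
	}
	return ret;
}
```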
Hidetoshi Seto 9c59fb9809 btrfs-progs: Copyright string update
Fix corporate name for copyright.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-03-21 06:23:24 -07:00
Anand Jain 5e5fd1b9ed btrfs-progs: don't replicate the stripe_len defines
A cleanup patch: BTRFS_STRIPE_LEN has been duplicated across
btrfs-progs; the kernel defines it in volumes.h, so do the same
for progs.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:18 -08:00
Gui Hecheng 9df48a7d01 btrfs-progs: scan devices in parallel for chunk-recover
Originally, multiple devices were scanned one by one;
now, one thread is used per device to scan.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:09 -08:00
Gui Hecheng 4989dc82d1 btrfs-progs: add chunk-recover raid0/5/6 data stripes rebuild routine
Decide the order of the raid0/5/6 data stripes using checksums.
For one chunk, fetch each 64k logical stripe:
	1. search for its checksum in the csum tree
	2. read the physical stripe data on each device
	3. calculate the data checksums
	4. if one checksum matches the value from the csum tree,
	   then the logical stripe resides on that device and
	   the stripe order index can be calculated.
	5. if more than one checksum matches,
	   then use the next csum in the tree to compare again.
	6. if equal stripes are encountered, just fetch the next stripe.
	7. if the order of some devices is still undecided, they
	   cannot be recovered.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:09 -08:00
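
Steps 3 to 5 of the routine in 4989dc82d1 come down to finding the one device whose stripe data matches the expected checksum. The sketch below shows that matching step only. `toy_csum()` is a simple stand-in for the real checksum, and `match_stripe()` with its -1/-2 return convention is a hypothetical helper, not a function from chunk-recover.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy checksum standing in for the real checksum of one stripe. */
uint32_t toy_csum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum = sum * 31 + data[i];
	return sum;
}

/*
 * Compare the expected checksum of a logical stripe against the data read
 * from each device. Returns the index of the only matching device, -1 if
 * none matches, or -2 if several match (the caller would then retry with
 * the next checksum from the tree).
 */
int match_stripe(uint32_t expected, const uint8_t *const *dev_data,
		 size_t nr_devs, size_t stripe_len)
{
	int found = -1;
	size_t i;

	for (i = 0; i < nr_devs; i++) {
		if (toy_csum(dev_data[i], stripe_len) != expected)
			continue;
		if (found >= 0)
			return -2;	/* ambiguous: more than one match */
		found = (int)i;
	}
	return found;
}
```

Keeping "no match" and "ambiguous" as separate results lets the caller distinguish a stripe that needs another comparison from one that cannot be placed at all.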
Gui Hecheng 7af8e4ee2a btrfs-progs: skip chunk recover works when check chunks successfully
If no chunks need to be recovered, skip the recovery work,
so the user isn't bothered by the "ask_user" prompt.

Signed-off-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:08 -08:00
Wang Shilong 52ddfa74fe Btrfs-progs: chunk-recover: add new flag to prepare recovering for ordered data chunk
When reading block groups we search for the corresponding chunk; however, at this
time some chunks have not been built yet (data chunks raid0/raid10/raid56). Don't BUG_ON here;
we will try to rebuild these chunks later.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:07 -08:00
Wang Shilong e5f72de944 Btrfs-progs: chunk-recover: use right size when allocating chunk root node
When allocating the chunk root node, we should use nodesize rather than sectorsize;
otherwise this causes a regression when another nodesize is chosen (for example, 16k).

Reported-by: Gui Hecheng <guihc.fnst@cn.fujitsu.com>
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:06 -08:00
Ross Kirk 7ff512ce38 btrfs-progs: Make btrfs_header_chunk_tree_uuid() return unsigned long
Internally, btrfs_header_chunk_tree_uuid() calculates an unsigned
long, but casts it to a pointer, while all callers cast it to unsigned
long again.

From btrfs commit b308bc2f05a86e728bd035e21a4974acd05f4d1e

Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
2014-01-31 08:22:04 -08:00
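
The change in 7ff512ce38 (and in 33ce9a82b8 for the fsid) amounts to returning a field offset as an integer instead of a fake pointer. A minimal sketch, assuming a simplified header layout: `struct header_sketch` and the helper names are hypothetical, though the field order follows the start of the on-disk btrfs header.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CSUM_SIZE 32
#define UUID_SIZE 16

/* Simplified header for illustration only. */
struct header_sketch {
	uint8_t  csum[CSUM_SIZE];
	uint8_t  fsid[UUID_SIZE];
	uint64_t bytenr;
	uint64_t flags;
	uint8_t  chunk_tree_uuid[UUID_SIZE];
} __attribute__((packed));

/* Return plain offsets, so callers no longer cast a pointer back. */
unsigned long header_fsid_offset(void)
{
	return offsetof(struct header_sketch, fsid);
}

unsigned long header_chunk_tree_uuid_offset(void)
{
	return offsetof(struct header_sketch, chunk_tree_uuid);
}

/* Copy a UUID out of a raw block buffer at the given offset. */
void read_uuid(const uint8_t *buf, unsigned long offset,
	       uint8_t out[UUID_SIZE])
{
	memcpy(out, buf + offset, UUID_SIZE);
}
```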
Ross Kirk 33ce9a82b8 btrfs-progs: Make btrfs_header_fsid() return unsigned long
Internally, btrfs_header_fsid() calculates an unsigned long, but casts
it to a pointer, while all callers cast it to unsigned long again.

Committed to btrfs as fba6aa75654394fccf2530041e9451414c28084f

Fix line length issues and match changes to kernelspace

Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-10-16 08:23:11 -04:00
Ross Kirk db6feaadfe btrfs-progs: remove unused parameter from btrfs_header_fsid
Remove unused parameter, 'eb'. Unused since introduction in
7777e63b42

Signed-off-by: Ross Kirk <ross.kirk@gmail.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-10-16 08:23:10 -04:00
Wang Shilong 39813fb7ac Btrfs-progs: move ask_user() to utils.c
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-10-16 08:22:26 -04:00
Wang Shilong 77801d34d0 Btrfs-progs: pass flag to control whether run ioctl in btrfs_scan_for_fsid()
If some critical superblocks are damaged, running the ioctl returns failure;
in this case, we should avoid running the ioctl.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-10-16 08:22:26 -04:00
David Sterba e9270f6209 btrfs-progs: separate command and implementation of chunk-recover code
The command has been moved and we should rename the files accordingly,
so the entry point is now in cmds-rescue.c and the core functionality
is in its own file.

Return codes of btrfs_recover_chunk_tree have been simplified so they do not
require a define and another file for the definition.

CC: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-10-16 08:22:23 -04:00