From c6f903fa0443d16c14c6755597a779e03d35fbcd Mon Sep 17 00:00:00 2001 From: Su Yue Date: Mon, 3 Jun 2019 09:27:54 +0800 Subject: [PATCH] btrfs-progs: fix invalid memory write in get_fs_info() As the link reported, btrfs fi sh may crash while a device is removing. valgrind reported: ====================================================================== ... ==883== Invalid write of size 8 ==883== at 0x13C99A: get_device_info (in /usr/bin/btrfs) ==883== by 0x13D715: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== Address 0x4d8c7a0 is 0 bytes after a block of size 12,288 alloc'd ==883== at 0x483877F: malloc (vg_replace_malloc.c:299) ==883== by 0x13D861: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== ==883== Invalid write of size 8 ==883== at 0x13C99D: get_device_info (in /usr/bin/btrfs) ==883== by 0x13D715: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== Address 0x4d8c7a8 is 8 bytes after a block of size 12,288 alloc'd ==883== at 0x483877F: malloc (vg_replace_malloc.c:299) ==883== by 0x13D861: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== ==883== Syscall param ioctl(generic) points to unaddressable byte(s) ==883== at 0x4CA9CBB: ioctl (in /usr/lib/libc-2.29.so) ==883== by 0x13C9AB: get_device_info (in /usr/bin/btrfs) ==883== by 0x13D715: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== Address 0x4d8c7a0 is 0 bytes after a block of size 12,288 alloc'd ==883== at 0x483877F: malloc (vg_replace_malloc.c:299) ==883== by 0x13D861: get_fs_info (in /usr/bin/btrfs) ==883== by 0x153B5F: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) ==883== --883-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting --883-- si_code=1; Faulting address: 0x284D8C7B8; sp: 0x1002eb5e50 valgrind: the 'impossible' happened: Killed by fatal signal host stacktrace: ==883== at 0x5805261C: get_bszB_as_is (m_mallocfree.c:303) ==883== by 0x5805261C: get_bszB (m_mallocfree.c:315) ==883== by 0x5805261C: vgPlain_arena_malloc (m_mallocfree.c:1799) ==883== by 0x58005AD2: vgMemCheck_new_block (mc_malloc_wrappers.c:372) ==883== by 0x58005AD2: vgMemCheck_malloc (mc_malloc_wrappers.c:407) ==883== by 0x580A7373: do_client_request (scheduler.c:1925) ==883== by 0x580A7373: vgPlain_scheduler (scheduler.c:1488) ==883== by 0x580F57A0: thread_wrapper (syswrap-linux.c:103) ==883== by 0x580F57A0: run_a_thread_NORETURN (syswrap-linux.c:156) sched status: running_tid=1 Thread 1: status = VgTs_Runnable (lwpid 883) ==883== at 0x483877F: malloc (vg_replace_malloc.c:299) ==883== by 0x1534AA: ??? (in /usr/bin/btrfs) ==883== by 0x153C49: ??? (in /usr/bin/btrfs) ==883== by 0x11B0C1: main (in /usr/bin/btrfs) client stack range: [0x1FFEFFA000 0x1FFF000FFF] client SP: 0x1FFEFFDCE0 valgrind stack range: [0x1002DB6000 0x1002EB5FFF] top usage: 7520 of 1048576 ====================================================================== The above log says that invalid write to allocated @di_args happened in get_device_info() called in get_fs_info(). The size of @di_args is allocated according by fi_args->num_devices. And fi_args->num_devices is *the number of dev_items in chunk_tree*. However, in the loop to get devices info, btrfs-progs calls ioctl BTRFS_IOC_DEV_INFO which just finds device in fs_info->fs_devices->devices. Let's look at kernel side. In btrfs_rm_device(), btrfs_rm_dev_item() causes removal of related dev_items in chunk_tree. *Do something*. Then delete the device from device->fs_devices. So the case is: Userspace kernel get_fs_info() btrfs_rm_device() ... btrfs_rm_dev_item() determine fi_args->num_devices and fi_args->max_id by seraching chunk_tree. malloc() ... Loop(Crashed): call get_device_info() by devid from 1 to fi_args->max_id. mutex_lock(&fs_devices->device_list_mutex); list_del_rcu(&device->dev_list); ... In the loop of get_device_info(), get_device_info() still can get info of the removing device since it's still in fs_info->fs_devices->devices. Then the iterator value @ndev increaments causes invalid access out of bounds. Solved it by adding the check of @ndev while looping. Reported-by: Peter Hjalmarsson Link: https://bugzilla.redhat.com/show_bug.cgi?id=1711787 Signed-off-by: Su Yue Signed-off-by: David Sterba --- utils.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/utils.c b/utils.c index c6cdc8f0..0b271517 100644 --- a/utils.c +++ b/utils.c @@ -1763,7 +1763,8 @@ int get_fs_info(const char *path, struct btrfs_ioctl_fs_info_args *fi_args, if (replacing) memcpy(di_args, &tmp, sizeof(tmp)); - for (; last_devid <= fi_args->max_id; last_devid++) { + for (; last_devid <= fi_args->max_id && ndevs < fi_args->num_devices; + last_devid++) { ret = get_device_info(fd, last_devid, &di_args[ndevs]); if (ret == -ENODEV) continue;