commit 53fff24aaf01dcb09cbfabbe060f42db8e61ab01 Author: Greg Kroah-Hartman Date: Tue Nov 10 12:36:02 2020 +0100 Linux 4.19.156 Tested-by: Jon Hunter Tested-by: Pavel Machek (CIP) Tested-by: Guenter Roeck Tested-by: Shuah Khan Tested-by: Linux Kernel Functional Testing Link: https://lore.kernel.org/r/20201109125019.906191744@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 779d3e303977edbe43c4f0017d1553b4306d8d9e Author: Pali Rohár Date: Mon Sep 7 13:27:17 2020 +0200 arm64: dts: marvell: espressobin: Add ethernet switch aliases commit b64d814257b027e29a474bcd660f6372490138c7 upstream. Espressobin boards have 3 ethernet ports and some of them got assigned more then one MAC address. MAC addresses are stored in U-Boot environment. Since commit a2c7023f7075c ("net: dsa: read mac address from DT for slave device") kernel can use MAC addresses from DT for particular DSA port. Currently Espressobin DTS file contains alias just for ethernet0. This patch defines additional ethernet aliases in Espressobin DTS files, so bootloader can fill correct MAC address for DSA switch ports if more MAC addresses were specified. DT alias ethernet1 is used for wan port, DT aliases ethernet2 and ethernet3 are used for lan ports for both Espressobin revisions (V5 and V7). Fixes: 5253cb8c00a6f ("arm64: dts: marvell: espressobin: add ethernet alias") Cc: # a2c7023f7075c: dsa: read mac address Signed-off-by: Pali Rohár Reviewed-by: Andrew Lunn Reviewed-by: Andre Heider Signed-off-by: Gregory CLEMENT [pali: Backported Espressobin rev V5 changes to 5.4 and 4.19 versions] Signed-off-by: Greg Kroah-Hartman commit 8069a08d35b72e16486a6a81eb06f964b2e1432b Author: Xiaofei Shen Date: Fri Mar 29 11:04:58 2019 +0530 net: dsa: read mac address from DT for slave device commit a2c7023f7075ca9b80f944d3f20f60e6574538e2 upstream. Before creating a slave netdevice, get the mac address from DTS and apply in case it is valid. Signed-off-by: Xiaofei Shen Signed-off-by: Vinod Koul Signed-off-by: David S. Miller Cc: Pali Rohár Signed-off-by: Greg Kroah-Hartman commit 3ab5872df9211447be82d0b01a2ec5731aa20621 Author: Guenter Roeck Date: Sat Nov 7 16:31:24 2020 -0800 tools: perf: Fix build error in v4.19.y perf may fail to build in v4.19.y with the following error. util/evsel.c: In function ‘perf_evsel__exit’: util/util.h:25:28: error: passing argument 1 of ‘free’ discards ‘const’ qualifier from pointer target type This is observed (at least) with gcc v6.5.0. The underlying problem is the following statement. zfree(&evsel->pmu_name); evsel->pmu_name is decared 'const *'. zfree in turn is defined as #define zfree(ptr) ({ free(*ptr); *ptr = NULL; }) and thus passes the const * to free(). The problem is not seen in the upstream kernel since zfree() has been rewritten there. The problem has been introduced into v4.19.y with the backport of upstream commit d4953f7ef1a2 (perf parse-events: Fix 3 use after frees found with clang ASAN). One possible fix of this problem would be to not declare pmu_name as const. This patch chooses to typecast the parameter of zfree() to void *, following the guidance from the upstream kernel which does the same since commit 7f7c536f23e6a ("tools lib: Adopt zalloc()/zfree() from tools/perf") Fixes: a0100a363098 ("perf parse-events: Fix 3 use after frees found with clang ASAN") Signed-off-by: Guenter Roeck Signed-off-by: Greg Kroah-Hartman commit 29a975bcc107d68e379a55048813ddf3e7b120b8 Author: kiyin(尹亮) Date: Wed Nov 4 08:23:22 2020 +0300 perf/core: Fix a memory leak in perf_event_parse_addr_filter() commit 7bdb157cdebbf95a1cd94ed2e01b338714075d00 upstream. As shown through runtime testing, the "filename" allocation is not always freed in perf_event_parse_addr_filter(). There are three possible ways that this could happen: - It could be allocated twice on subsequent iterations through the loop, - or leaked on the success path, - or on the failure path. Clean up the code flow to make it obvious that 'filename' is always freed in the reallocation path and in the two return paths as well. We rely on the fact that kfree(NULL) is NOP and filename is initialized with NULL. This fixes the leak. No other side effects expected. [ Dan Carpenter: cleaned up the code flow & added a changelog. ] [ Ingo Molnar: updated the changelog some more. ] Fixes: 375637bc5249 ("perf/core: Introduce address range filtering") Signed-off-by: "kiyin(尹亮)" Signed-off-by: Dan Carpenter Signed-off-by: Ingo Molnar Cc: "Srivatsa S. Bhat" Cc: Anthony Liguori -- kernel/events/core.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) Signed-off-by: Greg Kroah-Hartman commit 292e5700f4db841593141e9a3b1210e94bc8f74f Author: Rafael J. Wysocki Date: Thu Oct 22 17:38:22 2020 +0200 PM: runtime: Resume the device earlier in __device_release_driver() commit 9226c504e364158a17a68ff1fe9d67d266922f50 upstream. Since the device is resumed from runtime-suspend in __device_release_driver() anyway, it is better to do that before looking for busy managed device links from it to consumers, because if there are any, device_links_unbind_consumers() will be called and it will cause the consumer devices' drivers to unbind, so the consumer devices will be runtime-resumed. In turn, resuming each consumer device will cause the supplier to be resumed and when the runtime PM references from the given consumer to it are dropped, it may be suspended. Then, the runtime-resume of the next consumer will cause the supplier to resume again and so on. Update the code accordingly. Signed-off-by: Rafael J. Wysocki Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: All applicable # All applicable Tested-by: Xiang Chen Reviewed-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 2590411220774522fe00c598e089cc11baebe26e Author: Vineet Gupta Date: Mon Oct 19 19:19:57 2020 -0700 Revert "ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE" This reverts commit 00fdec98d9881bf5173af09aebd353ab3b9ac729. (but only from 5.2 and prior kernels) The original commit was a preventive fix based on code-review and was auto-picked for stable back-port (for better or worse). It was OK for v5.3+ kernels, but turned up needing an implicit change 68e5c6f073bcf70 "(ARC: entry: EV_Trap expects r10 (vs. r9) to have exception cause)" merged in v5.3 which itself was not backported. So to summarize the stable backport of this patch for v5.2 and prior kernels is busted and it won't boot. The obvious solution is backport 68e5c6f073bcf70 but that is a pain as it doesn't revert cleanly and each of affected kernels (so far v4.19, v4.14, v4.9, v4.4) needs a slightly different massaged varaint. So the easier fix is to simply revert the backport from 5.2 and prior. The issue was not a big deal as it would cause strace to sporadically not work correctly. Waldemar Brodkorb first reported this when running ARC uClibc regressions on latest stable kernels (with offending backport). Once he bisected it, the analysis was trivial, so thx to him for this. Reported-by: Waldemar Brodkorb Bisected-by: Waldemar Brodkorb Cc: stable # 5.2 and prior Signed-off-by: Vineet Gupta Signed-off-by: Greg Kroah-Hartman commit a7443bebdecaa474900ee5eec5da69de63686c24 Author: Vineet Gupta Date: Tue Oct 27 15:01:17 2020 -0700 ARC: stack unwinding: avoid indefinite looping commit 328d2168ca524d501fc4b133d6be076142bd305c upstream. Currently stack unwinder is a while(1) loop which relies on the dwarf unwinder to signal termination, which in turn relies on dwarf info to do so. This in theory could cause an infinite loop if the dwarf info was somehow messed up or the register contents were etc. This fix thus detects the excessive looping and breaks the loop. | Mem: 26184K used, 1009136K free, 0K shrd, 0K buff, 14416K cached | CPU: 0.0% usr 72.8% sys 0.0% nic 27.1% idle 0.0% io 0.0% irq 0.0% sirq | Load average: 4.33 2.60 1.11 2/74 139 | PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND | 133 2 root SWN 0 0.0 3 22.9 [rcu_torture_rea] | 132 2 root SWN 0 0.0 0 22.0 [rcu_torture_rea] | 131 2 root SWN 0 0.0 3 21.5 [rcu_torture_rea] | 126 2 root RW 0 0.0 2 5.4 [rcu_torture_wri] | 129 2 root SWN 0 0.0 0 0.2 [rcu_torture_fak] | 137 2 root SW 0 0.0 0 0.2 [rcu_torture_cbf] | 127 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak] | 138 115 root R 1464 0.1 2 0.1 top | 130 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak] | 128 2 root SWN 0 0.0 0 0.1 [rcu_torture_fak] | 115 1 root S 1472 0.1 1 0.0 -/bin/sh | 104 1 root S 1464 0.1 0 0.0 inetd | 1 0 root S 1456 0.1 2 0.0 init | 78 1 root S 1456 0.1 0 0.0 syslogd -O /var/log/messages | 134 2 root SW 0 0.0 2 0.0 [rcu_torture_sta] | 10 2 root IW 0 0.0 1 0.0 [rcu_preempt] | 88 2 root IW 0 0.0 1 0.0 [kworker/1:1-eve] | 66 2 root IW 0 0.0 2 0.0 [kworker/2:2-eve] | 39 2 root IW 0 0.0 2 0.0 [kworker/2:1-eve] | unwinder looping too long, aborting ! Cc: Signed-off-by: Vineet Gupta Signed-off-by: Greg Kroah-Hartman commit 83a282f990ba94103eb49be1ad69a8e4b2de4fbd Author: Macpaul Lin Date: Fri Nov 6 13:54:29 2020 +0800 usb: mtu3: fix panic in mtu3_gadget_stop() commit 20914919ad31849ee2b9cfe0428f4a20335c9e2a upstream. This patch fixes a possible issue when mtu3_gadget_stop() already assigned NULL to mtu->gadget_driver during mtu_gadget_disconnect(). [] notifier_call_chain+0xa4/0x128 [] __atomic_notifier_call_chain+0x84/0x138 [] notify_die+0xb0/0x120 [] die+0x1f8/0x5d0 [] __do_kernel_fault+0x19c/0x280 [] do_bad_area+0x44/0x140 [] do_translation_fault+0x4c/0x90 [] do_mem_abort+0xb8/0x258 [] el1_da+0x24/0x3c [] mtu3_gadget_disconnect+0xac/0x128 [] mtu3_irq+0x34c/0xc18 [] __handle_irq_event_percpu+0x2ac/0xcd0 [] handle_irq_event_percpu+0x80/0x138 [] handle_irq_event+0xac/0x148 [] handle_fasteoi_irq+0x234/0x568 [] generic_handle_irq+0x48/0x68 [] __handle_domain_irq+0x264/0x1740 [] gic_handle_irq+0x14c/0x250 [] el1_irq+0xec/0x194 [] dma_pool_alloc+0x6e4/0xae0 [] cmdq_mbox_pool_alloc_impl+0xb0/0x238 [] cmdq_pkt_alloc_buf+0x2dc/0x7c0 [] cmdq_pkt_add_cmd_buffer+0x178/0x270 [] cmdq_pkt_perf_begin+0x108/0x148 [] cmdq_pkt_create+0x178/0x1f0 [] mtk_crtc_config_default_path+0x328/0x7a0 [] mtk_drm_idlemgr_kick+0xa6c/0x1460 [] mtk_drm_crtc_atomic_begin+0x1a4/0x1a68 [] drm_atomic_helper_commit_planes+0x154/0x878 [] mtk_atomic_complete.isra.16+0xe80/0x19c8 [] mtk_atomic_commit+0x258/0x898 [] drm_atomic_commit+0xcc/0x108 [] drm_mode_atomic_ioctl+0x1c20/0x2580 [] drm_ioctl_kernel+0x118/0x1b0 [] drm_ioctl+0x5c0/0x920 [] do_vfs_ioctl+0x188/0x1820 [] SyS_ioctl+0x8c/0xa0 Fixes: df2069acb005 ("usb: Add MediaTek USB3 DRD driver") Signed-off-by: Macpaul Lin Acked-by: Chunfeng Yun Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1604642069-20961-1-git-send-email-macpaul.lin@mediatek.com Signed-off-by: Greg Kroah-Hartman commit 327a8935379f2e9b73b1c0f7345df321d6b42c88 Author: Alan Stern Date: Mon Nov 2 09:58:21 2020 -0500 USB: Add NO_LPM quirk for Kingston flash drive commit afaa2e745a246c5ab95103a65b1ed00101e1bc63 upstream. In Bugzilla #208257, Julien Humbert reports that a 32-GB Kingston flash drive spontaneously disconnects and reconnects, over and over. Testing revealed that disabling Link Power Management for the drive fixed the problem. This patch adds a quirk entry for that drive to turn off LPM permanently. CC: Hans de Goede CC: Reported-and-tested-by: Julien Humbert Signed-off-by: Alan Stern Link: https://lore.kernel.org/r/20201102145821.GA1478741@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman commit 59187e8bba015683ac00a1e10895a18443249cee Author: Daniele Palmas Date: Tue Nov 3 13:44:25 2020 +0100 USB: serial: option: add Telit FN980 composition 0x1055 commit db0362eeb22992502764e825c79b922d7467e0eb upstream. Add the following Telit FN980 composition: 0x1055: tty, adb, tty, tty, tty, tty Signed-off-by: Daniele Palmas Link: https://lore.kernel.org/r/20201103124425.12940-1-dnlplm@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 6ea688db659c49334256d2dc69dc45f1e93f3594 Author: Daniele Palmas Date: Sat Oct 31 23:54:58 2020 +0100 USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 commit 489979b4aab490b6b917c11dc02d81b4b742784a upstream. Add following Telit LE910Cx compositions: 0x1203: rndis, tty, adb, tty, tty, tty, tty 0x1230: tty, adb, rmnet, audio, tty, tty, tty, tty 0x1231: rndis, tty, adb, audio, tty, tty, tty, tty Signed-off-by: Daniele Palmas Link: https://lore.kernel.org/r/20201031225458.10512-1-dnlplm@gmail.com [ johan: add comments after entries ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 3411d382f62132df82fb697c79ef3c4a7b9c741b Author: Ziyi Cao Date: Tue Oct 20 00:08:06 2020 +0800 USB: serial: option: add Quectel EC200T module support commit a46b973bced1ba57420752bf38426acd9f6cbfa6 upstream. Add usb product id of the Quectel EC200T module. Signed-off-by: Ziyi Cao Link: https://lore.kernel.org/r/17f8a2a3-ce0f-4be7-8544-8fdf286907d0@www.fastmail.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 202dae99e649d49355133d2bb9150244fd7695e5 Author: Johan Hovold Date: Mon Oct 26 09:25:48 2020 +0100 USB: serial: cyberjack: fix write-URB completion race commit 985616f0457d9f555fff417d0da56174f70cc14f upstream. The write-URB busy flag was being cleared before the completion handler was done with the URB, something which could lead to corrupt transfers due to a racing write request if the URB is resubmitted. Fixes: 507ca9bc0476 ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.") Cc: stable # 2.6.13 Reviewed-by: Greg Kroah-Hartman Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 288609c2bae2b08a21d1c794053c7e761620a946 Author: Qinglang Miao Date: Tue Nov 3 16:49:42 2020 +0800 serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init commit 0c5fc92622ed5531ff324b20f014e9e3092f0187 upstream. Add the missing platform_driver_unregister() before return from serial_txx9_init in the error handling case when failed to register serial_txx9_pci_driver with macro ENABLE_SERIAL_TXX9_PCI defined. Fixes: ab4382d27412 ("tty: move drivers/serial/ to drivers/tty/serial/") Signed-off-by: Qinglang Miao Link: https://lore.kernel.org/r/20201103084942.109076-1-miaoqinglang@huawei.com Cc: stable Signed-off-by: Greg Kroah-Hartman commit 18f5757d7fd3ae455c2fcfd60efd5242ea9836f6 Author: Claire Chang Date: Mon Nov 2 20:07:49 2020 +0800 serial: 8250_mtk: Fix uart_get_baud_rate warning commit 912ab37c798770f21b182d656937072b58553378 upstream. Mediatek 8250 port supports speed higher than uartclk / 16. If the baud rates in both the new and the old termios setting are higher than uartclk / 16, the WARN_ON in uart_get_baud_rate() will be triggered. Passing NULL as the old termios so uart_get_baud_rate() will use uartclk / 16 - 1 as the new baud rate which will be replaced by the original baud rate later by tty_termios_encode_baud_rate() in mtk8250_set_termios(). Fixes: 551e553f0d4a ("serial: 8250_mtk: Fix high-speed baud rates clamping") Signed-off-by: Claire Chang Link: https://lore.kernel.org/r/20201102120749.374458-1-tientzu@chromium.org Cc: stable Signed-off-by: Greg Kroah-Hartman commit b177d2d915cea2d0a590f0034a20299dd1ee3ef2 Author: Eddy Wu Date: Sat Nov 7 14:47:22 2020 +0800 fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent commit b4e00444cab4c3f3fec876dc0cccc8cbb0d1a948 upstream. current->group_leader->exit_signal may change during copy_process() if current->real_parent exits. Move the assignment inside tasklist_lock to avoid the race. Signed-off-by: Eddy Wu Acked-by: Oleg Nesterov Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 6612b754ac0c85ca8b1181b5d3ea4461a8c1bbcb Author: Daniel Vetter Date: Sun Nov 8 16:38:06 2020 +0100 vt: Disable KD_FONT_OP_COPY commit 3c4e0dff2095c579b142d5a0693257f1c58b4804 upstream. It's buggy: On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote: > We recently discovered a slab-out-of-bounds read in fbcon in the latest > kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that > "fbcon_do_set_font" did not handle "vc->vc_font.data" and > "vc->vc_font.height" correctly, and the patch > for VT_RESIZEX can't handle this > issue. > > Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and > use KD_FONT_OP_SET again to set a large font.height for tty1. After that, > we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data > in "fbcon_do_set_font", while tty1 retains the original larger > height. Obviously, this will cause an out-of-bounds read, because we can > access a smaller vc_font.data with a larger vc_font.height. Further there was only one user ever. - Android's loadfont, busybox and console-tools only ever use OP_GET and OP_SET - fbset documentation only mentions the kernel cmdline font: option, not anything else. - systemd used OP_COPY before release 232 published in Nov 2016 Now unfortunately the crucial report seems to have gone down with gmane, and the commit message doesn't say much. But the pull request hints at OP_COPY being broken https://github.com/systemd/systemd/pull/3651 So in other words, this never worked, and the only project which foolishly every tried to use it, realized that rather quickly too. Instead of trying to fix security issues here on dead code by adding missing checks, fix the entire thing by removing the functionality. Note that systemd code using the OP_COPY function ignored the return value, so it doesn't matter what we're doing here really - just in case a lone server somewhere happens to be extremely unlucky and running an affected old version of systemd. The relevant code from font_copy_to_all_vcs() in systemd was: /* copy font from active VT, where the font was uploaded to */ cfo.op = KD_FONT_OP_COPY; cfo.height = vcs.v_active-1; /* tty1 == index 0 */ (void) ioctl(vcfd, KDFONTOP, &cfo); Note this just disables the ioctl, garbage collecting the now unused callbacks is left for -next. v2: Tetsuo found the old mail, which allowed me to find it on another archive. Add the link too. Acked-by: Peilin Ye Reported-by: Minh Yuan References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html References: https://github.com/systemd/systemd/pull/3651 Cc: Greg KH Cc: Peilin Ye Cc: Tetsuo Handa Signed-off-by: Daniel Vetter Link: https://lore.kernel.org/r/20201108153806.3140315-1-daniel.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman commit a52cdf61125b2189a8e1d85d1e61d654f7fe5d4d Author: Zhang Qilong Date: Tue Oct 27 21:49:01 2020 +0800 ACPI: NFIT: Fix comparison to '-ENXIO' [ Upstream commit 85f971b65a692b68181438e099b946cc06ed499b ] Initial value of rc is '-ENXIO', and we should use the initial value to check it. Signed-off-by: Zhang Qilong Reviewed-by: Pankaj Gupta Reviewed-by: Vishal Verma [ rjw: Subject edit ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit 6eecfcbcde431904e5837d285e9e99b5a5eac02c Author: Hoegeun Kwon Date: Tue Oct 27 13:14:42 2020 +0900 drm/vc4: drv: Add error handding for bind [ Upstream commit 9ce0af3e9573fb84c4c807183d13ea2a68271e4b ] There is a problem that if vc4_drm bind fails, a memory leak occurs on the drm_property_create side. Add error handding for drm_mode_config. Signed-off-by: Hoegeun Kwon Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/20201027041442.30352-2-hoegeun.kwon@samsung.com Signed-off-by: Sasha Levin commit 1aa82721dd74c9f08337ea0a22689eb2a94634ae Author: Jeff Vander Stoep Date: Fri Oct 23 16:37:57 2020 +0200 vsock: use ns_capable_noaudit() on socket create [ Upstream commit af545bb5ee53f5261db631db2ac4cde54038bdaf ] During __vsock_create() CAP_NET_ADMIN is used to determine if the vsock_sock->trusted should be set to true. This value is used later for determing if a remote connection should be allowed to connect to a restricted VM. Unfortunately, if the caller doesn't have CAP_NET_ADMIN, an audit message such as an selinux denial is generated even if the caller does not want a trusted socket. Logging errors on success is confusing. To avoid this, switch the capable(CAP_NET_ADMIN) check to the noaudit version. Reported-by: Roman Kiryanov https://android-review.googlesource.com/c/device/generic/goldfish/+/1468545/ Signed-off-by: Jeff Vander Stoep Reviewed-by: James Morris Link: https://lore.kernel.org/r/20201023143757.377574-1-jeffv@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6280db912e03089b15a3e2c2882d573d3b09bd0c Author: Ming Lei Date: Sat Oct 10 11:25:39 2020 +0800 scsi: core: Don't start concurrent async scan on same host [ Upstream commit 831e3405c2a344018a18fcc2665acc5a38c3a707 ] The current scanning mechanism is supposed to fall back to a synchronous host scan if an asynchronous scan is in progress. However, this rule isn't strictly respected, scsi_prep_async_scan() doesn't hold scan_mutex when checking shost->async_scan. When scsi_scan_host() is called concurrently, two async scans on same host can be started and a hang in do_scan_async() is observed. Fixes this issue by checking & setting shost->async_scan atomically with shost->scan_mutex. Link: https://lore.kernel.org/r/20201010032539.426615-1-ming.lei@redhat.com Cc: Christoph Hellwig Cc: Ewan D. Milne Cc: Hannes Reinecke Cc: Bart Van Assche Reviewed-by: Lee Duncan Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 239aed5d2ecbfd381ca642f0a6bcb272c98408df Author: Gabriel Krisman Bertazi Date: Thu Oct 22 16:58:42 2020 -0400 blk-cgroup: Pre-allocate tree node on blkg_conf_prep [ Upstream commit f255c19b3ab46d3cad3b1b2e1036f4c926cb1d0c ] Similarly to commit 457e490f2b741 ("blkcg: allocate struct blkcg_gq outside request queue spinlock"), blkg_create can also trigger occasional -ENOMEM failures at the radix insertion because any allocation inside blkg_create has to be non-blocking, making it more likely to fail. This causes trouble for userspace tools trying to configure io weights who need to deal with this condition. This patch reduces the occurrence of -ENOMEMs on this path by preloading the radix tree element on a GFP_KERNEL context, such that we guarantee the later non-blocking insertion won't fail. A similar solution exists in blkcg_init_queue for the same situation. Acked-by: Tejun Heo Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 6dba51828dfd31e326e6642727ede24c054a86fe Author: Gabriel Krisman Bertazi Date: Thu Oct 22 16:58:41 2020 -0400 blk-cgroup: Fix memleak on error path [ Upstream commit 52abfcbd57eefdd54737fc8c2dc79d8f46d4a3e5 ] If new_blkg allocation raced with blk_policy change and blkg_lookup_check fails, new_blkg is leaked. Acked-by: Tejun Heo Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit b8f922aa95aa07759623e2adab65bdba8811e808 Author: Vincent Whitchurch Date: Wed Oct 21 11:53:59 2020 +0200 of: Fix reserved-memory overlap detection [ Upstream commit ca05f33316559a04867295dd49f85aeedbfd6bfd ] The reserved-memory overlap detection code fails to detect overlaps if either of the regions starts at address 0x0. The code explicitly checks for and ignores such regions, apparently in order to ignore dynamically allocated regions which have an address of 0x0 at this point. These dynamically allocated regions also have a size of 0x0 at this point, so fix this by removing the check and sorting the dynamically allocated regions ahead of any static regions at address 0x0. For example, there are two overlaps in this case but they are not currently reported: foo@0 { reg = <0x0 0x2000>; }; bar@0 { reg = <0x0 0x1000>; }; baz@1000 { reg = <0x1000 0x1000>; }; quux { size = <0x1000>; }; but they are after this patch: OF: reserved mem: OVERLAP DETECTED! bar@0 (0x00000000--0x00001000) overlaps with foo@0 (0x00000000--0x00002000) OF: reserved mem: OVERLAP DETECTED! foo@0 (0x00000000--0x00002000) overlaps with baz@1000 (0x00001000--0x00002000) Signed-off-by: Vincent Whitchurch Link: https://lore.kernel.org/r/ded6fd6b47b58741aabdcc6967f73eca6a3f311e.1603273666.git-series.vincent.whitchurch@axis.com Signed-off-by: Rob Herring Signed-off-by: Sasha Levin commit f81794c3789dcacf5dc5abe7da557f33e0c60ecf Author: Kairui Song Date: Wed Oct 14 17:24:28 2020 +0800 x86/kexec: Use up-to-dated screen_info copy to fill boot params [ Upstream commit afc18069a2cb7ead5f86623a5f3d4ad6e21f940d ] kexec_file_load() currently reuses the old boot_params.screen_info, but if drivers have change the hardware state, boot_param.screen_info could contain invalid info. For example, the video type might be no longer VGA, or the frame buffer address might be changed. If the kexec kernel keeps using the old screen_info, kexec'ed kernel may attempt to write to an invalid framebuffer memory region. There are two screen_info instances globally available, boot_params.screen_info and screen_info. Later one is a copy, and is updated by drivers. So let kexec_file_load use the updated copy. [ mingo: Tidied up the changelog. ] Signed-off-by: Kairui Song Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/20201014092429.1415040-2-kasong@redhat.com Signed-off-by: Sasha Levin commit dffce0f0e7f1c2c421ef488642764d81ff62da0f Author: Clément Péron Date: Sat Oct 3 12:03:32 2020 +0200 ARM: dts: sun4i-a10: fix cpu_alert temperature [ Upstream commit dea252fa41cd8ce332d148444e4799235a8a03ec ] When running dtbs_check thermal_zone warn about the temperature declared. thermal-zones: cpu-thermal:trips:cpu-alert0:temperature:0:0: 850000 is greater than the maximum of 200000 It's indeed wrong the real value is 85°C and not 850°C. Signed-off-by: Clément Péron Signed-off-by: Maxime Ripard Link: https://lore.kernel.org/r/20201003100332.431178-1-peron.clem@gmail.com Signed-off-by: Sasha Levin commit c096a3d44ead54b53a3b65d9d210022099bdc7ef Author: Mike Galbraith Date: Wed Nov 4 16:12:44 2020 +0100 futex: Handle transient "ownerless" rtmutex state correctly commit 9f5d1c336a10c0d24e83e40b4c1b9539f7dba627 upstream. Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner(). This is one possible chain of events leading to this: Task Prio Operation T1 120 lock(F) T2 120 lock(F) -> blocks (top waiter) T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter) XX timeout/ -> wakes T2 signal T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set) T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter and the lower priority T2 cannot steal the lock. -> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON() The comment states that this is invalid and rt_mutex_real_owner() must return a non NULL owner when the trylock failed, but in case of a queued and woken up waiter rt_mutex_real_owner() == NULL is a valid transient state. The higher priority waiter has simply not yet managed to take over the rtmutex. The BUG_ON() is therefore wrong and this is just another retry condition in fixup_pi_state_owner(). Drop the locks, so that T3 can make progress, and then try the fixup again. Gratian provided a great analysis, traces and a reproducer. The analysis is to the point, but it confused the hell out of that tglx dude who had to page in all the futex horrors again. Condensed version is above. [ tglx: Wrote comment and changelog ] Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex") Reported-by: Gratian Crisan Signed-off-by: Mike Galbraith Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87a6w6x7bb.fsf@ni.com Link: https://lore.kernel.org/r/87sg9pkvf7.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman commit 7e4eeff7da1df5602e9af340d01b2ce1b4662d3e Author: Qiujun Huang Date: Fri Oct 30 00:19:05 2020 +0800 tracing: Fix out of bounds write in get_trace_buf commit c1acb4ac1a892cf08d27efcb964ad281728b0545 upstream. The nesting count of trace_printk allows for 4 levels of nesting. The nesting counter starts at zero and is incremented before being used to retrieve the current context's buffer. But the index to the buffer uses the nesting counter after it was incremented, and not its original number, which in needs to do. Link: https://lkml.kernel.org/r/20201029161905.4269-1-hqjagain@gmail.com Cc: stable@vger.kernel.org Fixes: 3d9622c12c887 ("tracing: Add barrier to trace_printk() buffer nesting modification") Signed-off-by: Qiujun Huang Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit 2de780dfbe1ad05c3f15f8d6f2d22dae9cf95267 Author: Steven Rostedt (VMware) Date: Thu Oct 29 19:35:08 2020 -0400 ftrace: Handle tracing when switching between context commit 726b3d3f141fba6f841d715fc4d8a4a84f02c02a upstream. When an interrupt or NMI comes in and switches the context, there's a delay from when the preempt_count() shows the update. As the preempt_count() is used to detect recursion having each context have its own bit get set when tracing starts, and if that bit is already set, it is considered a recursion and the function exits. But if this happens in that section where context has changed but preempt_count() has not been updated, this will be incorrectly flagged as a recursion. To handle this case, create another bit call TRANSITION and test it if the current context bit is already set. Flag the call as a recursion if the TRANSITION bit is already set, and if not, set it and continue. The TRANSITION bit will be cleared normally on the return of the function that set it, or if the current context bit is clear, set it and clear the TRANSITION bit to allow for another transition between the current context and an even higher one. Cc: stable@vger.kernel.org Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit ee2b95c08515633da67c048beb164966224ffe9e Author: Steven Rostedt (VMware) Date: Thu Oct 29 17:31:45 2020 -0400 ftrace: Fix recursion check for NMI test commit ee11b93f95eabdf8198edd4668bf9102e7248270 upstream. The code that checks recursion will work to only do the recursion check once if there's nested checks. The top one will do the check, the other nested checks will see recursion was already checked and return zero for its "bit". On the return side, nothing will be done if the "bit" is zero. The problem is that zero is returned for the "good" bit when in NMI context. This will set the bit for NMIs making it look like *all* NMI tracing is recursing, and prevent tracing of anything in NMI context! The simple fix is to return "bit + 1" and subtract that bit on the end to get the real bit. Cc: stable@vger.kernel.org Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit b410d07e965a039dc67073889dab9ff01cee8402 Author: Steven Rostedt (VMware) Date: Mon Nov 2 15:31:27 2020 -0500 ring-buffer: Fix recursion protection transitions between interrupt context commit b02414c8f045ab3b9afc816c3735bc98c5c3d262 upstream. The recursion protection of the ring buffer depends on preempt_count() to be correct. But it is possible that the ring buffer gets called after an interrupt comes in but before it updates the preempt_count(). This will trigger a false positive in the recursion code. Use the same trick from the ftrace function callback recursion code which uses a "transition" bit that gets set, to allow for a single recursion for to handle transitions between contexts. Cc: stable@vger.kernel.org Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking") Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Greg Kroah-Hartman commit fe0af0efa7b1ecb8aebb5e8f44d00db07f564c6c Author: Alexander Aring Date: Mon Oct 26 10:52:29 2020 -0400 gfs2: Wake up when sd_glock_disposal becomes zero commit da7d554f7c62d0c17c1ac3cc2586473c2d99f0bd upstream. Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a sd_glock_disposal accounting bug by adding a missing atomic_dec statement, but it failed to wake up sd_glock_wait when that decrement causes sd_glock_disposal to reach zero. As a consequence, gfs2_gl_hash_clear can now run into a 10-minute timeout instead of being woken up. Add the missing wakeup. Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race") Cc: stable@vger.kernel.org # v2.6.39+ Signed-off-by: Alexander Aring Signed-off-by: Andreas Gruenbacher Signed-off-by: Greg Kroah-Hartman commit 361d82eb7c301e7473459cb7f1e0d7733cf5e7ec Author: Jason Gunthorpe Date: Sun Nov 1 17:08:00 2020 -0800 mm: always have io_remap_pfn_range() set pgprot_decrypted() commit f8f6ae5d077a9bdaf5cbf2ac960a5d1a04b47482 upstream. The purpose of io_remap_pfn_range() is to map IO memory, such as a memory mapped IO exposed through a PCI BAR. IO devices do not understand encryption, so this memory must always be decrypted. Automatically call pgprot_decrypted() as part of the generic implementation. This fixes a bug where enabling AMD SME causes subsystems, such as RDMA, using io_remap_pfn_range() to expose BAR pages to user space to fail. The CPU will encrypt access to those BAR pages instead of passing unencrypted IO directly to the device. Places not mapping IO should use remap_pfn_range(). Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption") Signed-off-by: Jason Gunthorpe Signed-off-by: Andrew Morton Cc: Arnd Bergmann Cc: Tom Lendacky Cc: Thomas Gleixner Cc: Andrey Ryabinin Cc: Borislav Petkov Cc: Brijesh Singh Cc: Jonathan Corbet Cc: Dmitry Vyukov Cc: "Dave Young" Cc: Alexander Potapenko Cc: Konrad Rzeszutek Wilk Cc: Andy Lutomirski Cc: Larry Woodman Cc: Matt Fleming Cc: Ingo Molnar Cc: "Michael S. Tsirkin" Cc: Paolo Bonzini Cc: Peter Zijlstra Cc: Rik van Riel Cc: Toshimitsu Kani Cc: Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 68e8b8ed787a8acafa3e987d52f6d415c5949ec6 Author: Zqiang Date: Sun Nov 1 17:07:53 2020 -0800 kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled commit 6993d0fdbee0eb38bfac350aa016f65ad11ed3b1 upstream. There is a small race window when a delayed work is being canceled and the work still might be queued from the timer_fn: CPU0 CPU1 kthread_cancel_delayed_work_sync() __kthread_cancel_work_sync() __kthread_cancel_work() work->canceling++; kthread_delayed_work_timer_fn() kthread_insert_work(); BUG: kthread_insert_work() should not get called when work->canceling is set. Signed-off-by: Zqiang Signed-off-by: Andrew Morton Reviewed-by: Petr Mladek Acked-by: Tejun Heo Cc: Link: https://lkml.kernel.org/r/20201014083030.16895-1-qiang.zhang@windriver.com Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 1471e96263b6e68929e7afed28f80ab3b6bd1066 Author: Vasily Gorbik Date: Sun Nov 1 17:07:47 2020 -0800 lib/crc32test: remove extra local_irq_disable/enable commit aa4e460f0976351fddd2f5ac6e08b74320c277a1 upstream. Commit 4d004099a668 ("lockdep: Fix lockdep recursion") uncovered the following issue in lib/crc32test reported on s390: BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1 caller is lockdep_hardirqs_on_prepare+0x48/0x270 CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.9.0-next-20201015-15164-g03d992bd2de6 #19 Hardware name: IBM 3906 M04 704 (LPAR) Call Trace: lockdep_hardirqs_on_prepare+0x48/0x270 trace_hardirqs_on+0x9c/0x1b8 crc32_test.isra.0+0x170/0x1c0 crc32test_init+0x1c/0x40 do_one_initcall+0x40/0x130 do_initcalls+0x126/0x150 kernel_init_freeable+0x1f6/0x230 kernel_init+0x22/0x150 ret_from_fork+0x24/0x2c no locks held by swapper/0/1. Remove extra local_irq_disable/local_irq_enable helpers calls. Fixes: 5fb7f87408f1 ("lib: add module support to crc32 tests") Signed-off-by: Vasily Gorbik Signed-off-by: Andrew Morton Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Greg Kroah-Hartman Link: https://lkml.kernel.org/r/patch.git-4369da00c06e.your-ad-here.call-01602859837-ext-1679@work.hours Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d4fea27ffcb01bbf8ca8ae54f60479945394b83e Author: Shijie Luo Date: Sun Nov 1 17:07:40 2020 -0800 mm: mempolicy: fix potential pte_unmap_unlock pte error commit 3f08842098e842c51e3b97d0dcdebf810b32558e upstream. When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to pte_unmap_unlock seems like not a good idea. queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't migrate misplaced pages but returns with EIO when encountering such a page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") and early break on the first pte in the range results in pte_unmap_unlock on an underflow pte. This can lead to lockups later on when somebody tries to lock the pte resp. page_table_lock again.. Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified") Signed-off-by: Shijie Luo Signed-off-by: Miaohe Lin Signed-off-by: Andrew Morton Reviewed-by: Oscar Salvador Acked-by: Michal Hocko Cc: Miaohe Lin Cc: Feilong Lin Cc: Shijie Luo Cc: Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit df0486a4faee7277932136f2526721f07443ec95 Author: Geoffrey D. Bennett Date: Wed Nov 4 22:37:05 2020 +1030 ALSA: usb-audio: Add implicit feedback quirk for MODX commit 26201ddc1373c99b2a67c5774da2f0eecd749b93 upstream. This patch fixes audio distortion on playback for the Yamaha MODX. Signed-off-by: Geoffrey D. Bennett Tested-by: Frank Slotta Cc: Link: https://lore.kernel.org/r/20201104120705.GA19126@b4.vu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 9a83c225d4b118e65957abbc75bb844f88e5162b Author: Geoffrey D. Bennett Date: Wed Nov 4 22:27:17 2020 +1030 ALSA: usb-audio: Add implicit feedback quirk for Qu-16 commit 0938ecae432e7ac8b01080c35dd81d50a1e43033 upstream. This patch fixes audio distortion on playback for the Allen&Heath Qu-16. Signed-off-by: Geoffrey D. Bennett Cc: Link: https://lore.kernel.org/r/20201104115717.GA19046@b4.vu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 69a7cae8656039d4420e8613448387d0ebf494f0 Author: Artem Lapkin Date: Tue Nov 3 18:08:09 2020 +0800 ALSA: usb-audio: add usb vendor id as DSD-capable for Khadas devices commit 07815a2b3501adeaae6384a25b9c4a9c81dae59f upstream. Khadas audio devices ( USB_ID_VENDOR 0x3353 ) have DSD-capable implementations from XMOS need add new usb vendor id for recognition Signed-off-by: Artem Lapkin Cc: Link: https://lore.kernel.org/r/20201103103311.5435-1-art@khadas.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 0b443d1aff6d6f3706e48c93d9fe1212a569af4b Author: Keith Winstein Date: Sun Oct 25 22:05:47 2020 -0700 ALSA: usb-audio: Add implicit feedback quirk for Zoom UAC-2 commit f15cfca818d756dd1c9492530091dfd583359db3 upstream. The Zoom UAC-2 USB audio interface provides an async playback endpoint ("1 OUT (ASYNC)") and capture endpoint ("2 IN (ASYNC)"), both with 2-channel S32_LE in 44.1, 48, 88.2, 96, 176.4, or 192 kilosamples/s. The device provides explicit feedback to adjust the host's playback rate, but the feedback appears unstable and biased relative to the device's capture rate. "alsaloop -t 1000" experiences playback underruns and tries to resample the captured audio to match the varying playback rate. Forcing the kernel to use implicit feedback appears to produce more stable results. This causes the host to transmit one playback sample for each capture sample received. (Zoom North America has been notified of this change.) Signed-off-by: Keith Winstein Tested-by: Keith Winstein Cc: BugLink: https://lore.kernel.org/r/20201027071841.GA164525@trolley.csail.mit.edu Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 11af858054baadff13eb92eb347443f93768fde7 Author: Lee Jones Date: Mon Nov 2 13:32:42 2020 -0500 Fonts: Replace discarded const qualifier commit 9522750c66c689b739e151fcdf895420dc81efc0 upstream. Commit 6735b4632def ("Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts") introduced the following error when building rpc_defconfig (only this build appears to be affected): `acorndata_8x8' referenced in section `.text' of arch/arm/boot/compressed/ll_char_wr.o: defined in discarded section `.data' of arch/arm/boot/compressed/font.o `acorndata_8x8' referenced in section `.data.rel.ro' of arch/arm/boot/compressed/font.o: defined in discarded section `.data' of arch/arm/boot/compressed/font.o make[3]: *** [/scratch/linux/arch/arm/boot/compressed/Makefile:191: arch/arm/boot/compressed/vmlinux] Error 1 make[2]: *** [/scratch/linux/arch/arm/boot/Makefile:61: arch/arm/boot/compressed/vmlinux] Error 2 make[1]: *** [/scratch/linux/arch/arm/Makefile:317: zImage] Error 2 The .data section is discarded at link time. Reinstating acorndata_8x8 as const ensures it is still available after linking. Do the same for the other 12 built-in fonts as well, for consistency purposes. Cc: Cc: Russell King Reviewed-by: Greg Kroah-Hartman Fixes: 6735b4632def ("Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts") Signed-off-by: Lee Jones Co-developed-by: Peilin Ye Signed-off-by: Peilin Ye Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20201102183242.2031659-1-yepeilin.cs@gmail.com Signed-off-by: Greg Kroah-Hartman commit cdf69f3b132c95aa255383b2d7907da5ba63221a Author: Qu Wenruo Date: Tue Aug 25 21:42:51 2020 +0800 btrfs: tree-checker: fix the error message for transid error commit f96d6960abbc52e26ad124e69e6815283d3e1674 upstream. The error message for inode transid is the same as for inode generation, which makes us unable to detect the real problem. Reported-by: Tyler Richmond Fixes: 496245cac57e ("btrfs: tree-checker: Verify inode item") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Marcos Paulo de Souza Signed-off-by: Qu Wenruo Signed-off-by: David Sterba [bwh: Backported to 4.19: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 3384e8d72567dc6827d53c66071da61f57c517f7 Author: Qu Wenruo Date: Wed Mar 13 14:31:35 2019 +0800 btrfs: tree-checker: Verify inode item commit 496245cac57e26d8b738d85c7a29cf9a47610f3f upstream. There is a report in kernel bugzilla about mismatch file type in dir item and inode item. This inspires us to check inode mode in inode item. This patch will check the following members: - inode key objectid Should be ROOT_DIR_DIR or [256, (u64)-256] or FREE_INO. - inode key offset Should be 0 - inode item generation - inode item transid No newer than sb generation + 1. The +1 is for log tree. - inode item mode No unknown bits. No invalid S_IF* bit. NOTE: S_IFMT check is not enough, need to check every know type. - inode item nlink Dir should have no more link than 1. - inode item flags Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit bedd9974c106fe96b518da667d49b0bfe6353590 Author: Qu Wenruo Date: Wed Mar 13 12:17:50 2019 +0800 btrfs: tree-checker: Enhance chunk checker to validate chunk profile commit 80e46cf22ba0bcb57b39c7c3b52961ab3a0fd5f2 upstream. Btrfs-progs already have a comprehensive type checker, to ensure there is only 0 (SINGLE profile) or 1 (DUP/RAID0/1/5/6/10) bit set for chunk profile bits. Do the same work for kernel. Reported-by: Yoon Jungyeon Link: https://bugzilla.kernel.org/show_bug.cgi?id=202765 Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit b5b2a94d846a599a0b843895d3dbd90246d8d1fc Author: Qu Wenruo Date: Wed Aug 28 10:33:13 2019 +0800 btrfs: tree-checker: Fix wrong check on max devid commit 8bb177d18f114358a57d8ae7e206861b48b8b4de upstream. [BUG] The following script will cause false alert on devid check. #!/bin/bash dev1=/dev/test/test dev2=/dev/test/scratch1 mnt=/mnt/btrfs umount $dev1 &> /dev/null umount $dev2 &> /dev/null umount $mnt &> /dev/null mkfs.btrfs -f $dev1 mount $dev1 $mnt _fail() { echo "!!! FAILED !!!" exit 1 } for ((i = 0; i < 4096; i++)); do btrfs dev add -f $dev2 $mnt || _fail btrfs dev del $dev1 $mnt || _fail dev_tmp=$dev1 dev1=$dev2 dev2=$dev_tmp done [CAUSE] Tree-checker uses BTRFS_MAX_DEVS() and BTRFS_MAX_DEVS_SYS_CHUNK() as upper limit for devid. But we can have devid holes just like above script. So the check for devid is incorrect and could cause false alert. [FIX] Just remove the whole devid check. We don't have any hard requirement for devid assignment. Furthermore, even devid could get corrupted by a bitflip, we still have dev extents verification at mount time, so corrupted data won't sneak in. This fixes fstests btrfs/194. Reported-by: Anand Jain Fixes: ab4ba2e13346 ("btrfs: tree-checker: Verify dev item") CC: stable@vger.kernel.org # 5.2+ Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Backported to 4.19: adjust context] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit e23e5d2594307d1088e1e1b3621d785add2a57b8 Author: Qu Wenruo Date: Fri Mar 8 14:20:03 2019 +0800 btrfs: tree-checker: Verify dev item commit ab4ba2e133463c702b37242560d7fabedd2dc750 upstream. [BUG] For fuzzed image whose DEV_ITEM has invalid total_bytes as 0, then kernel will just panic: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 #PF error: [normal kernel read fault] PGD 800000022b2bd067 P4D 800000022b2bd067 PUD 22b2bc067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 0 PID: 1106 Comm: mount Not tainted 5.0.0-rc8+ #9 RIP: 0010:btrfs_verify_dev_extents+0x2a5/0x5a0 Call Trace: open_ctree+0x160d/0x2149 btrfs_mount_root+0x5b2/0x680 [CAUSE] If device extent verification finds a deivce with 0 total_bytes, then it assumes it's a seed dummy, then search for seed devices. But in this case, there is no seed device at all, causing NULL pointer. [FIX] Since this is caused by fuzzed image, let's go the tree-check way, just add a new verification for device item. Reported-by: Yoon Jungyeon Link: https://bugzilla.kernel.org/show_bug.cgi?id=202691 Reviewed-by: Nikolay Borisov Signed-off-by: Qu Wenruo Reviewed-by: Johannes Thumshirn Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 3c06c86cd08fdabe2ed20cbc3ea5010dcd7d4b29 Author: Qu Wenruo Date: Wed Mar 20 13:42:33 2019 +0800 btrfs: tree-checker: Check chunk item at tree block read time commit 075cb3c78fe7976c9f29ca1fa23f9728634ecefc upstream. Since we have btrfs_check_chunk_valid() in tree-checker, let's do chunk item verification in tree-checker too. Since the tree-checker is run at endio time, if one chunk leaf fails chunk verification, we can still retry the other copy, making btrfs more robust to fuzzed image as we may still get a good chunk item. Also since we have done chunk verification in tree block read time, skip the btrfs_check_chunk_valid() call in read_one_chunk() if we're reading chunk items from leaf. Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 782aa87c8f1d4472fc45697fdddbe8d47b56ebff Author: Qu Wenruo Date: Wed Mar 20 13:39:14 2019 +0800 btrfs: tree-checker: Make btrfs_check_chunk_valid() return EUCLEAN instead of EIO commit bf871c3b43b1dcc3f2a076ff39a8f1ce7959d958 upstream. To follow the standard behavior of tree-checker. Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Cherry-picked for 4.19 to ease backporting later fixes] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit c24ed5ea52165bf44f405aca10d9af0ad593ae86 Author: Qu Wenruo Date: Wed Mar 20 13:36:06 2019 +0800 btrfs: tree-checker: Make chunk item checker messages more readable commit f114024376bceb1c0f61a7bad4a72a0f978767af upstream. Old error message would be something like: BTRFS error (device dm-3): invalid chunk num_stipres: 0 New error message would be: Btrfs critical (device dm-3): corrupt superblock syschunk array: chunk_start=2097152, invalid chunk num_stripes: 0 Or Btrfs critical (device dm-3): corrupt leaf: root=3 block=8388608 slot=3 chunk_start=2097152, invalid chunk num_stripes: 0 And for certain error message, also output expected value. The error message levels are changed from error to critical. Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Cherry-picked for 4.19 to ease backporting later fixes] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 5d123c6335e82db8de1e2fe1c15563e003245438 Author: Qu Wenruo Date: Wed Mar 20 13:16:42 2019 +0800 btrfs: Move btrfs_check_chunk_valid() to tree-check.[ch] and export it commit 82fc28fbedbb59642f05215db3b0ef4eb91aa31d upstream. By function, chunk item verification is more suitable to be done inside tree-checker. So move btrfs_check_chunk_valid() to tree-checker.c and export it. And since it's now moved to tree-checker, also add a better comment for what this function is doing. Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Cherry-picked for 4.19 to ease backporting later fixes] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 1527c0e0229d2dd1c8ae1e73b1579bd8d5866b5b Author: Qu Wenruo Date: Wed Feb 12 14:12:44 2020 +0800 btrfs: Don't submit any btree write bio if the fs has errors commit b3ff8f1d380e65dddd772542aa9bff6c86bf715a upstream. [BUG] There is a fuzzed image which could cause KASAN report at unmount time. BUG: KASAN: use-after-free in btrfs_queue_work+0x2c1/0x390 Read of size 8 at addr ffff888067cf6848 by task umount/1922 CPU: 0 PID: 1922 Comm: umount Tainted: G W 5.0.21 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: dump_stack+0x5b/0x8b print_address_description+0x70/0x280 kasan_report+0x13a/0x19b btrfs_queue_work+0x2c1/0x390 btrfs_wq_submit_bio+0x1cd/0x240 btree_submit_bio_hook+0x18c/0x2a0 submit_one_bio+0x1be/0x320 flush_write_bio.isra.41+0x2c/0x70 btree_write_cache_pages+0x3bb/0x7f0 do_writepages+0x5c/0x130 __writeback_single_inode+0xa3/0x9a0 writeback_single_inode+0x23d/0x390 write_inode_now+0x1b5/0x280 iput+0x2ef/0x600 close_ctree+0x341/0x750 generic_shutdown_super+0x126/0x370 kill_anon_super+0x31/0x50 btrfs_kill_super+0x36/0x2b0 deactivate_locked_super+0x80/0xc0 deactivate_super+0x13c/0x150 cleanup_mnt+0x9a/0x130 task_work_run+0x11a/0x1b0 exit_to_usermode_loop+0x107/0x130 do_syscall_64+0x1e5/0x280 entry_SYSCALL_64_after_hwframe+0x44/0xa9 [CAUSE] The fuzzed image has a completely screwd up extent tree: leaf 29421568 gen 8 total ptrs 6 free space 3587 owner EXTENT_TREE refs 2 lock (w:0 r:0 bw:0 br:0 sw:0 sr:0) lock_owner 0 current 5938 item 0 key (12587008 168 4096) itemoff 3942 itemsize 53 extent refs 1 gen 9 flags 1 ref#0: extent data backref root 5 objectid 259 offset 0 count 1 item 1 key (12591104 168 8192) itemoff 3889 itemsize 53 extent refs 1 gen 9 flags 1 ref#0: extent data backref root 5 objectid 271 offset 0 count 1 item 2 key (12599296 168 4096) itemoff 3836 itemsize 53 extent refs 1 gen 9 flags 1 ref#0: extent data backref root 5 objectid 259 offset 4096 count 1 item 3 key (29360128 169 0) itemoff 3803 itemsize 33 extent refs 1 gen 9 flags 2 ref#0: tree block backref root 5 item 4 key (29368320 169 1) itemoff 3770 itemsize 33 extent refs 1 gen 9 flags 2 ref#0: tree block backref root 5 item 5 key (29372416 169 0) itemoff 3737 itemsize 33 extent refs 1 gen 9 flags 2 ref#0: tree block backref root 5 Note that leaf 29421568 doesn't have its backref in the extent tree. Thus extent allocator can re-allocate leaf 29421568 for other trees. In short, the bug is caused by: - Existing tree block gets allocated to log tree This got its generation bumped. - Log tree balance cleaned dirty bit of offending tree block It will not be written back to disk, thus no WRITTEN flag. - Original owner of the tree block gets COWed Since the tree block has higher transid, no WRITTEN flag, it's reused, and not traced by transaction::dirty_pages. - Transaction aborted Tree blocks get cleaned according to transaction::dirty_pages. But the offending tree block is not recorded at all. - Filesystem unmount All pages are assumed to be are clean, destroying all workqueue, then call iput(btree_inode). But offending tree block is still dirty, which triggers writeback, and causes use-after-free bug. The detailed sequence looks like this: - Initial status eb: 29421568, header=WRITTEN bflags_dirty=0, page_dirty=0, gen=8, not traced by any dirty extent_iot_tree. - New tree block is allocated Since there is no backref for 29421568, it's re-allocated as new tree block. Keep in mind that tree block 29421568 is still referred by extent tree. - Tree block 29421568 is filled for log tree eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 << (gen bumped) traced by btrfs_root::dirty_log_pages - Some log tree operations Since the fs is using node size 4096, the log tree can easily go a level higher. - Log tree needs balance Tree block 29421568 gets all its content pushed to right, thus now it is empty, and we don't need it. btrfs_clean_tree_block() from __push_leaf_right() get called. eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9 traced by btrfs_root::dirty_log_pages - Log tree write back btree_write_cache_pages() goes through dirty pages ranges, but since page of tree block 29421568 gets cleaned already, it's not written back to disk. Thus it doesn't have WRITTEN bit set. But ranges in dirty_log_pages are cleared. eb: 29421568, header=0 bflags_dirty=0, page_dirty=0, gen=9 not traced by any dirty extent_iot_tree. - Extent tree update when committing transaction Since tree block 29421568 has transid equal to running trans, and has no WRITTEN bit, should_cow_block() will use it directly without adding it to btrfs_transaction::dirty_pages. eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 not traced by any dirty extent_iot_tree. At this stage, we're doomed. We have a dirty eb not tracked by any extent io tree. - Transaction gets aborted due to corrupted extent tree Btrfs cleans up dirty pages according to transaction::dirty_pages and btrfs_root::dirty_log_pages. But since tree block 29421568 is not tracked by neither of them, it's still dirty. eb: 29421568, header=0 bflags_dirty=1, page_dirty=1, gen=9 not traced by any dirty extent_iot_tree. - Filesystem unmount Since all cleanup is assumed to be done, all workqueus are destroyed. Then iput(btree_inode) is called, expecting no dirty pages. But tree 29421568 is still dirty, thus triggering writeback. Since all workqueues are already freed, we cause use-after-free. This shows us that, log tree blocks + bad extent tree can cause wild dirty pages. [FIX] To fix the problem, don't submit any btree write bio if the filesytem has any error. This is the last safe net, just in case other cleanup haven't caught catch it. Link: https://github.com/bobfuzzer/CVE/tree/master/CVE-2019-19377 CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Backported to 4.19: fs_info variable already exists in btree_write_cache_pages()] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 432bdb0d931b3a6e8c8d34ffa29a218479b5f9be Author: Filipe Manana Date: Wed Sep 11 17:42:00 2019 +0100 Btrfs: fix unwritten extent buffers and hangs on future writeback attempts commit 18dfa7117a3f379862dcd3f67cadd678013bb9dd upstream. The lock_extent_buffer_io() returns 1 to the caller to tell it everything went fine and the callers needs to start writeback for the extent buffer (submit a bio, etc), 0 to tell the caller everything went fine but it does not need to start writeback for the extent buffer, and a negative value if some error happened. When it's about to return 1 it tries to lock all pages, and if a try lock on a page fails, and we didn't flush any existing bio in our "epd", it calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or an error. The page might have been locked elsewhere, not with the goal of starting writeback of the extent buffer, and even by some code other than btrfs, like page migration for example, so it does not mean the writeback of the extent buffer was already started by some other task, so returning a 0 tells the caller (btree_write_cache_pages()) to not start writeback for the extent buffer. Note that epd might currently have either no bio, so flush_write_bio() returns 0 (success) or it might have a bio for another extent buffer with a lower index (logical address). Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the extent buffer and writeback is never started for the extent buffer, future attempts to writeback the extent buffer will hang forever waiting on that bit to be cleared, since it can only be cleared after writeback completes. Such hang is reported with a trace like the following: [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds. [49887.347059] Not tainted 5.2.13-gentoo #2 [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [49887.347062] btrfs-transacti D 0 1752 2 0x80004000 [49887.347064] Call Trace: [49887.347069] ? __schedule+0x265/0x830 [49887.347071] ? bit_wait+0x50/0x50 [49887.347072] ? bit_wait+0x50/0x50 [49887.347074] schedule+0x24/0x90 [49887.347075] io_schedule+0x3c/0x60 [49887.347077] bit_wait_io+0x8/0x50 [49887.347079] __wait_on_bit+0x6c/0x80 [49887.347081] ? __lock_release.isra.29+0x155/0x2d0 [49887.347083] out_of_line_wait_on_bit+0x7b/0x80 [49887.347084] ? var_wake_function+0x20/0x20 [49887.347087] lock_extent_buffer_for_io+0x28c/0x390 [49887.347089] btree_write_cache_pages+0x18e/0x340 [49887.347091] do_writepages+0x29/0xb0 [49887.347093] ? kmem_cache_free+0x132/0x160 [49887.347095] ? convert_extent_bit+0x544/0x680 [49887.347097] filemap_fdatawrite_range+0x70/0x90 [49887.347099] btrfs_write_marked_extents+0x53/0x120 [49887.347100] btrfs_write_and_wait_transaction.isra.4+0x38/0xa0 [49887.347102] btrfs_commit_transaction+0x6bb/0x990 [49887.347103] ? start_transaction+0x33e/0x500 [49887.347105] transaction_kthread+0x139/0x15c So fix this by not overwriting the return value (ret) with the result from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK bit in case flush_write_bio() returns an error, otherwise it will hang any future attempts to writeback the extent buffer, and undo all work done before (set back EXTENT_BUFFER_DIRTY, etc). This is a regression introduced in the 5.2 kernel. Fixes: 2e3c25136adfb ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()") Fixes: f4340622e0226 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up") Reported-by: Zdenek Sojka Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u Reported-by: Stefan Priebe - Profihost AG Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t Reported-by: Drazen Kacar Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/ Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377 Signed-off-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 2311ea7ea04fbc15f51760d3b91f6a8381b06c9a Author: Qu Wenruo Date: Wed Mar 20 14:27:46 2019 +0800 btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io() commit 2e3c25136adfb293d517e17f761d3b8a43a8fc22 upstream. This function needs some extra checks on locked pages and eb. For error handling we need to unlock locked pages and the eb. There is a rare >0 return value branch, where all pages get locked while write bio is not flushed. Thankfully it's handled by the only caller, btree_write_cache_pages(), as later write_one_eb() call will trigger submit_one_bio(). So there shouldn't be any problem. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 22bb77c13fbeeef2e0d0a8506c4efdde41598d32 Author: Qu Wenruo Date: Wed Mar 20 14:27:43 2019 +0800 btrfs: extent_io: Handle errors better in btree_write_cache_pages() commit 2b952eea813b1f7e7d4b9782271acd91625b9bb9 upstream. In btree_write_cache_pages(), we can only get @ret <= 0. Add an ASSERT() for it just in case. Then instead of submitting the write bio even we got some error, check the return value first. If we have already hit some error, just clean up the corrupted or half-baked bio, and return error. If there is no error so far, then call flush_write_bio() and return the result. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit eeda803b77d6cc80c9f9dffcfc86cddc5a52c0fe Author: Qu Wenruo Date: Wed Mar 20 14:27:42 2019 +0800 btrfs: extent_io: Handle errors better in extent_write_full_page() commit 3065976b045f77a910809fa7699f99a1e7c0dbbb upstream. Since now flush_write_bio() could return error, kill the BUG_ON() first. Then don't call flush_write_bio() unconditionally, instead we check the return value from __extent_writepage() first. If __extent_writepage() fails, we do cleanup, and return error without submitting the possible corrupted or half-baked bio. If __extent_writepage() successes, then we call flush_write_bio() and return the result. Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 169ae603400293e0182469f3adfd135d04a38470 Author: Josef Bacik Date: Thu Jan 23 15:33:02 2020 -0500 btrfs: flush write bio if we loop in extent_write_cache_pages commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f upstream. There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 #3 [ffffc900297bb708] __lock_page at ffffffff811f145b #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02e1423 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit aa38097b44153d7470ce8956ec4de8f1372390c6 Author: Ben Hutchings Date: Mon Oct 12 23:18:11 2020 +0100 Revert "btrfs: flush write bio if we loop in extent_write_cache_pages" This reverts commit 860473714cbe7fbedcf92bfe3eb6d69fae8c74ff. That has an incorrect upstream commit reference, and was modified in a way that conflicts with some older fixes. We can cleanly cherry-pick the upstream commit *after* those fixes. Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 63ece3bb01489e3942bc15e4aa976afcb3cd864f Author: Qu Wenruo Date: Wed Mar 20 14:27:41 2019 +0800 btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up commit f4340622e02261fae599e3da936ff4808b418173 upstream. We have a BUG_ON() in flush_write_bio() to handle the return value of submit_one_bio(). Move the BUG_ON() one level up to all its callers. This patch will introduce temporary variable, @flush_ret to keep code change minimal in this patch. That variable will be cleaned up when enhancing the error handling later. Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba [bwh: Cherry-picked for 4.19 to ease backporting later fixes] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 7f2b9e8d42bd957332dc2af0ba6a77684fe32b1c Author: Qu Wenruo Date: Fri Jan 25 13:09:15 2019 +0800 btrfs: extent_io: Kill the forward declaration of flush_write_bio commit bb58eb9e167d087cc518f7a71c3c00f1671958da upstream. There is no need to forward declare flush_write_bio(), as it only depends on submit_one_bio(). Both of them are pretty small, just move them to kill the forward declaration. Reviewed-by: Nikolay Borisov Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Signed-off-by: David Sterba [bwh: Cherry-picked for 4.19 to ease backporting later fixes] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 8a78b4c0d6292d32d76b4268b5a33ae089a5d791 Author: Luis Chamberlain Date: Fri Jun 19 20:47:28 2020 +0000 blktrace: fix debugfs use after free commit bad8e64fb19d3a0de5e564d9a7271c31bd684369 upstream. On commit 6ac93117ab00 ("blktrace: use existing disk debugfs directory") merged on v4.12 Omar fixed the original blktrace code for request-based drivers (multiqueue). This however left in place a possible crash, if you happen to abuse blktrace while racing to remove / add a device. We used to use asynchronous removal of the request_queue, and with that the issue was easier to reproduce. Now that we have reverted to synchronous removal of the request_queue, the issue is still possible to reproduce, its however just a bit more difficult. We essentially run two instances of break-blktrace which add/remove a loop device, and setup a blktrace and just never tear the blktrace down. We do this twice in parallel. This is easily reproduced with the script run_0004.sh from break-blktrace [0]. We can end up with two types of panics each reflecting where we race, one a failed blktrace setup: [ 252.426751] debugfs: Directory 'loop0' with parent 'block' already present! [ 252.432265] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 252.436592] #PF: supervisor write access in kernel mode [ 252.439822] #PF: error_code(0x0002) - not-present page [ 252.442967] PGD 0 P4D 0 [ 252.444656] Oops: 0002 [#1] SMP NOPTI [ 252.446972] CPU: 10 PID: 1153 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 252.452673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 252.456343] RIP: 0010:down_write+0x15/0x40 [ 252.458146] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 48 0f b1 55 00 75 0f 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 252.463638] RSP: 0018:ffffa626415abcc8 EFLAGS: 00010246 [ 252.464950] RAX: 0000000000000000 RBX: ffff958c25f0f5c0 RCX: ffffff8100000000 [ 252.466727] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 252.468482] RBP: 00000000000000a0 R08: 0000000000000000 R09: 0000000000000001 [ 252.470014] R10: 0000000000000000 R11: ffff958d1f9227ff R12: 0000000000000000 [ 252.471473] R13: ffff958c25ea5380 R14: ffffffff8cce15f1 R15: 00000000000000a0 [ 252.473346] FS: 00007f2e69dee540(0000) GS:ffff958c2fc80000(0000) knlGS:0000000000000000 [ 252.475225] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 252.476267] CR2: 00000000000000a0 CR3: 0000000427d10004 CR4: 0000000000360ee0 [ 252.477526] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 252.478776] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 252.479866] Call Trace: [ 252.480322] simple_recursive_removal+0x4e/0x2e0 [ 252.481078] ? debugfs_remove+0x60/0x60 [ 252.481725] ? relay_destroy_buf+0x77/0xb0 [ 252.482662] debugfs_remove+0x40/0x60 [ 252.483518] blk_remove_buf_file_callback+0x5/0x10 [ 252.484328] relay_close_buf+0x2e/0x60 [ 252.484930] relay_open+0x1ce/0x2c0 [ 252.485520] do_blk_trace_setup+0x14f/0x2b0 [ 252.486187] __blk_trace_setup+0x54/0xb0 [ 252.486803] blk_trace_ioctl+0x90/0x140 [ 252.487423] ? do_sys_openat2+0x1ab/0x2d0 [ 252.488053] blkdev_ioctl+0x4d/0x260 [ 252.488636] block_ioctl+0x39/0x40 [ 252.489139] ksys_ioctl+0x87/0xc0 [ 252.489675] __x64_sys_ioctl+0x16/0x20 [ 252.490380] do_syscall_64+0x52/0x180 [ 252.491032] entry_SYSCALL_64_after_hwframe+0x44/0xa9 And the other on the device removal: [ 128.528940] debugfs: Directory 'loop0' with parent 'block' already present! [ 128.615325] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 128.619537] #PF: supervisor write access in kernel mode [ 128.622700] #PF: error_code(0x0002) - not-present page [ 128.625842] PGD 0 P4D 0 [ 128.627585] Oops: 0002 [#1] SMP NOPTI [ 128.629871] CPU: 12 PID: 544 Comm: break-blktrace Tainted: G E 5.7.0-rc2-next-20200420+ #164 [ 128.635595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 [ 128.640471] RIP: 0010:down_write+0x15/0x40 [ 128.643041] Code: eb ca e8 ae 22 8d ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 fd e8 52 db ff ff 31 c0 ba 01 00 00 00 48 0f b1 55 00 75 0f 65 48 8b 04 25 c0 8b 01 00 48 89 45 08 5d [ 128.650180] RSP: 0018:ffffa9c3c05ebd78 EFLAGS: 00010246 [ 128.651820] RAX: 0000000000000000 RBX: ffff8ae9a6370240 RCX: ffffff8100000000 [ 128.653942] RDX: 0000000000000001 RSI: ffffff8100000000 RDI: 00000000000000a0 [ 128.655720] RBP: 00000000000000a0 R08: 0000000000000002 R09: ffff8ae9afd2d3d0 [ 128.657400] R10: 0000000000000056 R11: 0000000000000000 R12: 0000000000000000 [ 128.659099] R13: 0000000000000000 R14: 0000000000000003 R15: 00000000000000a0 [ 128.660500] FS: 00007febfd995540(0000) GS:ffff8ae9afd00000(0000) knlGS:0000000000000000 [ 128.662204] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 128.663426] CR2: 00000000000000a0 CR3: 0000000420042003 CR4: 0000000000360ee0 [ 128.664776] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 128.666022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 128.667282] Call Trace: [ 128.667801] simple_recursive_removal+0x4e/0x2e0 [ 128.668663] ? debugfs_remove+0x60/0x60 [ 128.669368] debugfs_remove+0x40/0x60 [ 128.669985] blk_trace_free+0xd/0x50 [ 128.670593] __blk_trace_remove+0x27/0x40 [ 128.671274] blk_trace_shutdown+0x30/0x40 [ 128.671935] blk_release_queue+0x95/0xf0 [ 128.672589] kobject_put+0xa5/0x1b0 [ 128.673188] disk_release+0xa2/0xc0 [ 128.673786] device_release+0x28/0x80 [ 128.674376] kobject_put+0xa5/0x1b0 [ 128.674915] loop_remove+0x39/0x50 [loop] [ 128.675511] loop_control_ioctl+0x113/0x130 [loop] [ 128.676199] ksys_ioctl+0x87/0xc0 [ 128.676708] __x64_sys_ioctl+0x16/0x20 [ 128.677274] do_syscall_64+0x52/0x180 [ 128.677823] entry_SYSCALL_64_after_hwframe+0x44/0xa9 The common theme here is: debugfs: Directory 'loop0' with parent 'block' already present This crash happens because of how blktrace uses the debugfs directory where it places its files. Upon init we always create the same directory which would be needed by blktrace but we only do this for make_request drivers (multiqueue) block drivers. When you race a removal of these devices with a blktrace setup you end up in a situation where the make_request recursive debugfs removal will sweep away the blktrace files and then later blktrace will also try to remove individual dentries which are already NULL. The inverse is also possible and hence the two types of use after frees. We don't create the block debugfs directory on init for these types of block devices: * request-based block driver block devices * every possible partition * scsi-generic And so, this race should in theory only be possible with make_request drivers. We can fix the UAF by simply re-using the debugfs directory for make_request drivers (multiqueue) and only creating the ephemeral directory for the other type of block devices. The new clarifications on relying on the q->blk_trace_mutex *and* also checking for q->blk_trace *prior* to processing a blktrace ensures the debugfs directories are only created if no possible directory name clashes are possible. This goes tested with: o nvme partitions o ISCSI with tgt, and blktracing against scsi-generic with: o block o tape o cdrom o media changer o blktests This patch is part of the work which disputes the severity of CVE-2019-19770 which shows this issue is not a core debugfs issue, but a misuse of debugfs within blktace. Fixes: 6ac93117ab00 ("blktrace: use existing disk debugfs directory") Reported-by: syzbot+603294af2d01acfdd6da@syzkaller.appspotmail.com Signed-off-by: Luis Chamberlain Reviewed-by: Christoph Hellwig Cc: Bart Van Assche Cc: Omar Sandoval Cc: Hannes Reinecke Cc: Nicolai Stange Cc: Greg Kroah-Hartman Cc: Michal Hocko Cc: "Martin K. Petersen" Cc: "James E.J. Bottomley" Cc: yu kuai Signed-off-by: Jens Axboe [bwh: Backported to 4.19: open-code queue_is_mq()] Signed-off-by: Ben Hutchings Signed-off-by: Greg Kroah-Hartman commit 4ff840604a45673bb02dfaaa5f30cd9bc40011e5 Author: YueHaibing Date: Sat Oct 31 11:10:53 2020 +0800 sfp: Fix error handing in sfp_probe() [ Upstream commit 9621618130bf7e83635367c13b9a6ee53935bb37 ] gpiod_to_irq() never return 0, but returns negative in case of error, check it and set gpio_irq to 0. Fixes: 73970055450e ("sfp: add SFP module support") Signed-off-by: YueHaibing Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/20201031031053.25264-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 64752f5cfda61aa7ca12d23ca1ecc7d36e996f93 Author: Petr Malat Date: Fri Oct 30 14:26:33 2020 +0100 sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms [ Upstream commit b6df8c81412190fbd5eaa3cec7f642142d9c16cd ] Commit 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning")' broke err reading from sctp_arg, because it reads the value as 32-bit integer, although the value is stored as 16-bit integer. Later this value is passed to the userspace in 16-bit variable, thus the user always gets 0 on big-endian platforms. Fix it by reading the __u16 field of sctp_arg union, as reading err field would produce a sparse warning. Fixes: 978aa0474115 ("sctp: fix some type cast warnings introduced since very beginning") Signed-off-by: Petr Malat Acked-by: Marcelo Ricardo Leitner Link: https://lore.kernel.org/r/20201030132633.7045-1-oss@malat.biz Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 1a9118379cc7b66bfd2ac030da4af3a712a9e9e2 Author: Daniele Palmas Date: Mon Nov 2 12:01:08 2020 +0100 net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition [ Upstream commit 5fd8477ed8ca77e64b93d44a6dae4aa70c191396 ] Add support for Telit LE910Cx 0x1230 composition: 0x1230: tty, adb, rmnet, audio, tty, tty, tty, tty Signed-off-by: Daniele Palmas Acked-by: Bjørn Mork Link: https://lore.kernel.org/r/20201102110108.17244-1-dnlplm@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 3a40f713276b2a03609749c6ef2e9efe4acdfa1a Author: Claudiu Manoil Date: Tue Oct 20 20:36:05 2020 +0300 gianfar: Account for Tx PTP timestamp in the skb headroom [ Upstream commit d6a076d68c6b5d6a5800f3990a513facb7016dea ] When PTP timestamping is enabled on Tx, the controller inserts the Tx timestamp at the beginning of the frame buffer, between SFD and the L2 frame header. This means that the skb provided by the stack is required to have enough headroom otherwise a new skb needs to be created by the driver to accommodate the timestamp inserted by h/w. Up until now the driver was relying on the second option, using skb_realloc_headroom() to create a new skb to accommodate PTP frames. Turns out that this method is not reliable, as reallocation of skbs for PTP frames along with the required overhead (skb_set_owner_w, consume_skb) is causing random crashes in subsequent skb_*() calls, when multiple concurrent TCP streams are run at the same time on the same device (as seen in James' report). Note that these crashes don't occur with a single TCP stream, nor with multiple concurrent UDP streams, but only when multiple TCP streams are run concurrently with the PTP packet flow (doing skb reallocation). This patch enforces the first method, by requesting enough headroom from the stack to accommodate PTP frames, and so avoiding skb_realloc_headroom() & co, and the crashes no longer occur. There's no reason not to set needed_headroom to a large enough value to accommodate PTP frames, so in this regard this patch is a fix. Reported-by: James Jurack Fixes: bee9e58c9e98 ("gianfar:don't add FCB length to hard_header_len") Signed-off-by: Claudiu Manoil Link: https://lore.kernel.org/r/20201020173605.1173-1-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 63a22c7cb9d9a422e959bd7a01d4af07cfa00298 Author: Claudiu Manoil Date: Thu Oct 29 10:10:56 2020 +0200 gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP [ Upstream commit d145c9031325fed963a887851d9fa42516efd52b ] When PTP timestamping is enabled on Tx, the controller inserts the Tx timestamp at the beginning of the frame buffer, between SFD and the L2 frame header. This means that the skb provided by the stack is required to have enough headroom otherwise a new skb needs to be created by the driver to accommodate the timestamp inserted by h/w. Up until now the driver was relying on skb_realloc_headroom() to create new skbs to accommodate PTP frames. Turns out that this method is not reliable in this context at least, as skb_realloc_headroom() for PTP frames can cause random crashes, mostly in subsequent skb_*() calls, when multiple concurrent TCP streams are run at the same time with the PTP flow on the same device (as seen in James' report). I also noticed that when the system is loaded by sending multiple TCP streams, the driver receives cloned skbs in large numbers. skb_cow_head() instead proves to be stable in this scenario, and not only handles cloned skbs too but it's also more efficient and widely used in other drivers. The commit introducing skb_realloc_headroom in the driver goes back to 2009, commit 93c1285c5d92 ("gianfar: reallocate skb when headroom is not enough for fcb"). For practical purposes I'm referencing a newer commit (from 2012) that brings the code to its current structure (and fixes the PTP case). Fixes: 9c4886e5e63b ("gianfar: Fix invalid TX frames returned on error queue when time stamping") Reported-by: James Jurack Suggested-by: Jakub Kicinski Signed-off-by: Claudiu Manoil Link: https://lore.kernel.org/r/20201029081057.8506-1-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 75b0b05ace856333c2a94c2173c92973dd027297 Author: Vinay Kumar Yadav Date: Mon Nov 2 23:09:10 2020 +0530 chelsio/chtls: fix always leaking ctrl_skb [ Upstream commit dbfe394dad33f99cf8458be50483ec40a5d29c34 ] Correct skb refcount in alloc_ctrl_skb(), causing skb memleak when chtls_send_abort() called with NULL skb. it was always leaking the skb, correct it by incrementing skb refs by one. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav Link: https://lore.kernel.org/r/20201102173909.24826-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit fe478a589404287513bf4cbb074df80805db9f6f Author: Vinay Kumar Yadav Date: Mon Nov 2 23:06:51 2020 +0530 chelsio/chtls: fix memory leaks caused by a race [ Upstream commit 8080b462b6aa856ae05ea010441a702599e579f2 ] race between user context and softirq causing memleak, consider the call sequence scenario chtls_setkey() //user context chtls_peer_close() chtls_abort_req_rss() chtls_setkey() //user context work request skb queued in chtls_setkey() won't be freed because resources are already cleaned for this connection, fix it by not queuing work request while socket is closing. v1->v2: - fix W=1 warning. v2->v3: - separate it out from another memleak fix. Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition") Signed-off-by: Vinay Kumar Yadav Link: https://lore.kernel.org/r/20201102173650.24754-1-vinay.yadav@chelsio.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit d67543112dc5f39b41f9bd309e47da25be2bb14c Author: Mark Deneen Date: Fri Oct 30 15:58:14 2020 +0000 cadence: force nonlinear buffers to be cloned [ Upstream commit 403dc16796f5516acf23d94a1cd9eba564d03210 ] In my test setup, I had a SAMA5D27 device configured with ip forwarding, and second device with usb ethernet (r8152) sending ICMP packets.  If the packet was larger than about 220 bytes, the SAMA5 device would "oops" with the following trace: kernel BUG at net/core/skbuff.c:1863! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: xt_MASQUERADE ppp_async ppp_generic slhc iptable_nat xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 can_raw can bridge stp llc ipt_REJECT nf_reject_ipv4 sd_mod cdc_ether usbnet usb_storage r8152 scsi_mod mii o ption usb_wwan usbserial micrel macb at91_sama5d2_adc phylink gpio_sama5d2_piobu m_can_platform m_can industrialio_triggered_buffer kfifo_buf of_mdio can_dev fixed_phy sdhci_of_at91 sdhci_pltfm libphy sdhci mmc_core ohci_at91 ehci_atmel o hci_hcd iio_rescale industrialio sch_fq_codel spidev prox2_hal(O) CPU: 0 PID: 0 Comm: swapper Tainted: G           O      5.9.1-prox2+ #1 Hardware name: Atmel SAMA5 PC is at skb_put+0x3c/0x50 LR is at macb_start_xmit+0x134/0xad0 [macb] pc : []    lr : []    psr: 20070113 sp : c0d01a60  ip : c07232c0  fp : c4250000 r10: c0d03cc8  r9 : 00000000  r8 : c0d038c0 r7 : 00000000  r6 : 00000008  r5 : c59b66c0  r4 : 0000002a r3 : 8f659eff  r2 : c59e9eea  r1 : 00000001  r0 : c59b66c0 Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none Control: 10c53c7d  Table: 2640c059  DAC: 00000051 Process swapper (pid: 0, stack limit = 0x75002d81) [] (skb_put) from [] (macb_start_xmit+0x134/0xad0 [macb]) [] (macb_start_xmit [macb]) from [] (dev_hard_start_xmit+0x90/0x11c) [] (dev_hard_start_xmit) from [] (sch_direct_xmit+0x124/0x260) [] (sch_direct_xmit) from [] (__dev_queue_xmit+0x4b0/0x6d0) [] (__dev_queue_xmit) from [] (ip_finish_output2+0x350/0x580) [] (ip_finish_output2) from [] (ip_output+0xb4/0x13c) [] (ip_output) from [] (ip_forward+0x474/0x500) [] (ip_forward) from [] (ip_sublist_rcv_finish+0x3c/0x50) [] (ip_sublist_rcv_finish) from [] (ip_sublist_rcv+0x11c/0x188) [] (ip_sublist_rcv) from [] (ip_list_rcv+0xf8/0x124) [] (ip_list_rcv) from [] (__netif_receive_skb_list_core+0x1a0/0x20c) [] (__netif_receive_skb_list_core) from [] (netif_receive_skb_list_internal+0x194/0x230) [] (netif_receive_skb_list_internal) from [] (gro_normal_list.part.0+0x14/0x28) [] (gro_normal_list.part.0) from [] (napi_complete_done+0x16c/0x210) [] (napi_complete_done) from [] (r8152_poll+0x684/0x708 [r8152]) [] (r8152_poll [r8152]) from [] (net_rx_action+0x100/0x328) [] (net_rx_action) from [] (__do_softirq+0xec/0x274) [] (__do_softirq) from [] (irq_exit+0xcc/0xd0) [] (irq_exit) from [] (__handle_domain_irq+0x58/0xa4) [] (__handle_domain_irq) from [] (__irq_svc+0x6c/0x90) Exception stack(0xc0d01ef0 to 0xc0d01f38) 1ee0:                                     00000000 0000003d 0c31f383 c0d0fa00 1f00: c0d2eb80 00000000 c0d2e630 4dad8c49 4da967b0 0000003d 0000003d 00000000 1f20: fffffff5 c0d01f40 c04e0f88 c04e0f8c 30070013 ffffffff [] (__irq_svc) from [] (cpuidle_enter_state+0x7c/0x378) [] (cpuidle_enter_state) from [] (cpuidle_enter+0x28/0x38) [] (cpuidle_enter) from [] (do_idle+0x194/0x214) [] (do_idle) from [] (cpu_startup_entry+0xc/0x14) [] (cpu_startup_entry) from [] (start_kernel+0x46c/0x4a0) Code: e580c054 8a000002 e1a00002 e8bd8070 (e7f001f2) ---[ end trace 146c8a334115490c ]--- The solution was to force nonlinear buffers to be cloned.  This was previously reported by Klaus Doth (https://www.spinics.net/lists/netdev/msg556937.html) but never formally submitted as a patch. This is the third revision, hopefully the formatting is correct this time! Suggested-by: Klaus Doth Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Signed-off-by: Mark Deneen Link: https://lore.kernel.org/r/20201030155814.622831-1-mdeneen@saucontech.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit caf8f9c19a93ec501d0ffca5e6767a969961baea Author: Oleg Nesterov Date: Sun Nov 1 17:07:44 2020 -0800 ptrace: fix task_join_group_stop() for the case when current is traced commit 7b3c36fc4c231ca532120bbc0df67a12f09c1d96 upstream. This testcase #include #include #include #include #include #include #include void *tf(void *arg) { return NULL; } int main(void) { int pid = fork(); if (!pid) { kill(getpid(), SIGSTOP); pthread_t th; pthread_create(&th, NULL, tf, NULL); return 0; } waitpid(pid, NULL, WSTOPPED); ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE); waitpid(pid, NULL, 0); ptrace(PTRACE_CONT, pid, 0,0); waitpid(pid, NULL, 0); int status; int thread = waitpid(-1, &status, 0); assert(thread > 0 && thread != pid); assert(status == 0x80137f); return 0; } fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap(). This is because task_join_group_stop() has 2 problems when current is traced: 1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee can be woken up by debugger and it can clone another thread which should join the group-stop. We need to check group_stop_count || SIGNAL_STOP_STOPPED. 2. If SIGNAL_STOP_STOPPED is already set, we should not increment sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread should stop without another do_notify_parent_cldstop() report. To clarify, the problem is very old and we should blame ptrace_init_task(). But now that we have task_join_group_stop() it makes more sense to fix this helper to avoid the code duplication. Reported-by: syzbot+3485e3773f7da290eecc@syzkaller.appspotmail.com Signed-off-by: Oleg Nesterov Signed-off-by: Andrew Morton Cc: Jens Axboe Cc: Christian Brauner Cc: "Eric W . Biederman" Cc: Zhiqiang Liu Cc: Tejun Heo Cc: Link: https://lkml.kernel.org/r/20201019134237.GA18810@redhat.com Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 604ac10d9d1e9a85787d53acbba541f420b9cc9e Author: Hoang Huu Le Date: Thu Aug 27 09:56:51 2020 +0700 tipc: fix use-after-free in tipc_bcast_get_mode commit fdeba99b1e58ecd18c2940c453e19e4ef20ff591 upstream. Syzbot has reported those issues as: ================================================================== BUG: KASAN: use-after-free in tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 Read of size 1 at addr ffff88805e6b3571 by task kworker/0:6/3850 CPU: 0 PID: 3850 Comm: kworker/0:6 Not tainted 5.8.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events tipc_net_finalize_work Thread 1's call trace: [...] kfree+0x103/0x2c0 mm/slab.c:3757 <- bcbase releasing tipc_bcast_stop+0x1b0/0x2f0 net/tipc/bcast.c:721 tipc_exit_net+0x24/0x270 net/tipc/core.c:112 [...] Thread 2's call trace: [...] tipc_bcast_get_mode+0x3ab/0x400 net/tipc/bcast.c:759 <- bcbase has already been freed by Thread 1 tipc_node_broadcast+0x9e/0xcc0 net/tipc/node.c:1744 tipc_nametbl_publish+0x60b/0x970 net/tipc/name_table.c:752 tipc_net_finalize net/tipc/net.c:141 [inline] tipc_net_finalize+0x1fa/0x310 net/tipc/net.c:131 tipc_net_finalize_work+0x55/0x80 net/tipc/net.c:150 [...] ================================================================== BUG: KASAN: use-after-free in tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344 Read of size 8 at addr ffff888052ab2000 by task kworker/0:13/30628 CPU: 0 PID: 30628 Comm: kworker/0:13 Not tainted 5.8.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events tipc_net_finalize_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1f0/0x31e lib/dump_stack.c:118 print_address_description+0x66/0x5a0 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 tipc_named_reinit+0xef/0x290 net/tipc/name_distr.c:344 tipc_net_finalize+0x85/0xe0 net/tipc/net.c:138 tipc_net_finalize_work+0x50/0x70 net/tipc/net.c:150 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 [...] Freed by task 14058: save_stack mm/kasan/common.c:48 [inline] set_track mm/kasan/common.c:56 [inline] kasan_set_free_info mm/kasan/common.c:316 [inline] __kasan_slab_free+0x114/0x170 mm/kasan/common.c:455 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x220 mm/slab.c:3757 tipc_exit_net+0x29/0x50 net/tipc/core.c:113 ops_exit_list net/core/net_namespace.c:186 [inline] cleanup_net+0x708/0xba0 net/core/net_namespace.c:603 process_one_work+0x789/0xfc0 kernel/workqueue.c:2269 worker_thread+0xaa4/0x1460 kernel/workqueue.c:2415 kthread+0x37e/0x3a0 drivers/block/aoe/aoecmd.c:1234 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293 Fix it by calling flush_scheduled_work() to make sure the tipc_net_finalize_work() stopped before releasing bcbase object. Reported-by: syzbot+6ea1f7a8df64596ef4d7@syzkaller.appspotmail.com Reported-by: syzbot+e9cc557752ab126c1b99@syzkaller.appspotmail.com Acked-by: Jon Maloy Signed-off-by: Hoang Huu Le Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4141168704e40509b376d0c32822eccc29e00627 Author: Chris Wilson Date: Wed Sep 16 10:00:58 2020 +0100 drm/i915: Break up error capture compression loops with cond_resched() commit 7d5553147613b50149238ac1385c60e5c7cacb34 upstream. As the error capture will compress user buffers as directed to by the user, it can take an arbitrary amount of time and space. Break up the compression loops with a call to cond_resched(), that will allow other processes to schedule (avoiding the soft lockups) and also serve as a warning should we try to make this loop atomic in the future. Testcase: igt/gem_exec_capture/many-* Signed-off-by: Chris Wilson Cc: Mika Kuoppala Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-2-chris@chris-wilson.co.uk (cherry picked from commit 293f43c80c0027ff9299036c24218ac705ce584e) Signed-off-by: Rodrigo Vivi Signed-off-by: Greg Kroah-Hartman