mfd: handle legacy m10bmc by trixirt · Pull Request #1 · OFS/linux-dfl

trixirt · 2020-06-19T21:47:31Z

When the bmc version is non 0xffffffff, the bmc is just old, not
invalid.

So adjust the reads to use the legacy base address.

Signed-off-by: Tom Rix trix@redhat.com

Each DFL functional block, e.g. AFU (Accelerated Function Unit) and FME (FPGA Management Engine), could implement more than one function within its region, but current driver only allows one user application to access it by exclusive open on device node. So this is not convenient and flexible for userspace applications, as they have to combine lots of different functions into one single application. This patch removes the limitation here to allow multiple opens to each feature device node for AFU and FME from userspace applications. If user still needs exclusive access to these device node, O_EXCL flag must be issued together with open. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>

pci_driver.sriov_configure should return negative value on error and number of enabled VFs on success. But now the driver returns 0 on success. The sriov configure still works but will cause a warning message: XX VFs requested; only 0 enabled This patch changes the return value accordingly. Cc: stable@vger.kernel.org Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Moritz Fischer <mdf@kernel.org>

…upport This patch adds description for performance reporting support for Device Feature List (DFL) based FPGA. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com>

This patch adds support for performance reporting private feature for FPGA Management Engine (FME). Now it supports several different performance counters, including 'basic', 'cache', 'fabric', 'vtd' and 'vtd_sip'. It allows user to use standard linux tools to access these performance counters. e.g. List all events by "perf list" perf list | grep fme dfl_fme0/cache_read_hit/ [Kernel PMU event] dfl_fme0/cache_read_miss/ [Kernel PMU event] ... dfl_fme0/fab_mmio_read/ [Kernel PMU event] dfl_fme0/fab_mmio_write/ [Kernel PMU event] ... dfl_fme0/fab_port_mmio_read,portid=?/ [Kernel PMU event] dfl_fme0/fab_port_mmio_write,portid=?/ [Kernel PMU event] ... dfl_fme0/vtd_port_devtlb_1g_fill,portid=?/ [Kernel PMU event] dfl_fme0/vtd_port_devtlb_2m_fill,portid=?/ [Kernel PMU event] ... dfl_fme0/vtd_sip_iotlb_1g_hit/ [Kernel PMU event] dfl_fme0/vtd_sip_iotlb_1g_miss/ [Kernel PMU event] ... dfl_fme0/clock [Kernel PMU event] ... e.g. check increased counter value after run one application using "perf stat" command. perf stat -e dfl_fme0/fab_mmio_read/,dfl_fme0/fab_mmio_write/ ./test Performance counter stats for './test': 1 dfl_fme0/fab_mmio_read/ 2 dfl_fme0/fab_mmio_write/ 1.009496520 seconds time elapsed Please note that fabric counters support both fab_* and fab_port_*, but actually they are sharing one set of performance counters in hardware. If user wants to monitor overall data events on fab_* then fab_port_* can't be supported at the same time, see example below: perf stat -e dfl_fme0/fab_mmio_read/,dfl_fme0/fab_port_mmio_write,portid=0/ Performance counter stats for 'system wide': 0 dfl_fme0/fab_mmio_read/ <not supported> dfl_fme0/fab_port_mmio_write,portid=0/ 2.141064085 seconds time elapsed Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com>

DFL based FPGA devices could support interrupts for different purposes, but current DFL framework only supports feature device enumeration with given MMIO resources information via common DFL headers. This patch introduces one new API dfl_fpga_enum_info_add_irq for low level bus drivers (e.g. PCIe device driver) to pass its interrupt resources information to DFL framework for enumeration, and also adds interrupt enumeration code in framework to parse and assign interrupt resources for enumerated feature devices and their own sub features. With this patch, DFL framework enumerates interrupt resources for core features, including PORT Error Reporting, FME (FPGA Management Engine) Error Reporting and also AFU User Interrupts. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: early validating irq table for each feature in parse_feature_irq(). Some code improvement and minor fix for Hao's comments. v3: put parse_feature_irqs() inside create_feature_instance() some minor fixes and more comments v4: no need to include asm/irq.h. fail the dfl enumeration when irq parsing error happens. Signed-off-by: Xu Yilun <yilun.xu@intel.com>

Some DFL FPGA PCIe cards (e.g. Intel FPGA Programmable Acceleration Card) support MSI-X based interrupts. This patch allows PCIe driver to prepare and pass interrupt resources to DFL via enumeration API. These interrupt resources could then be assigned to actual features which use them. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: put irq resources init code inside cce_enumerate_feature_dev() Some minor changes for Hao's comments. v3: Some minor fix for Hao's comments for v2. v4: Some minor fix for Hao's comments for v3.

FPGA user applications may be interested in interrupts generated by DFL features. For example, users can implement their own FPGA logics with interrupts enabled in AFU (Accelerated Function Unit, dynamic region of DFL based FPGA). So user applications need to be notified to handle these interrupts. In order to allow userspace applications to monitor interrupts, driver requires userspace to provide eventfds as interrupt notification channels. Applications then poll/select on the eventfds to get notified. This patch introduces a generic helper functions to do eventfds binding with given interrupts. Sub feature drivers are expected to use XXX_GET_IRQ_NUM to query irq info, and XXX_SET_IRQ to set eventfds for interrupts. This patch also introduces helper functions for these 2 ioctls. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: use unsigned int instead of int for irq array indexes in dfl_fpga_set_irq_triggers() Improves comments for NULL fds param in dfl_fpga_set_irq_triggers() v3: Improve comments of dfl_fpga_set_irq_triggers() refines code for dfl_fpga_set_irq_triggers, delete local variable j v4: Introduce 2 helper functions to help handle the XXX_GET_IRQ_NUM & XXX_SET_IRQ ioctls for sub feature drivers.

Error reporting interrupt is very useful to notify users that some errors are detected by the hardware. Once users are notified, they could query hardware logged error states, no need to continuously poll on these states. This patch adds interrupt support for port error reporting sub feature. It follows the common DFL interrupt notification and handling mechanism, implements two ioctl commands below for user to query number of irqs supported, and set/unset interrupt triggers. Ioctls: * DFL_FPGA_PORT_ERR_GET_IRQ_NUM get the number of irqs, which is used to determine whether/how many interrupts error reporting feature supports. * DFL_FPGA_PORT_ERR_SET_IRQ set/unset given eventfds as error interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: use DFL_FPGA_PORT_ERR_GET_IRQ_NUM instead of DFL_FPGA_PORT_ERR_GET_INFO Delete flag field for DFL_FPGA_PORT_ERR_SET_IRQ param v3: put_user() instead of copy_to_user() improves comments v4: use common functions to handle irq ioctls

Error reporting interrupt is very useful to notify users that some errors are detected by the hardware. Once users are notified, they could query hardware logged error states, no need to continuously poll on these states. This patch adds interrupt support for fme global error reporting sub feature. It follows the common DFL interrupt notification and handling mechanism. And it implements two ioctls below for user to query number of irqs supported, and set/unset interrupt triggers. Ioctls: * DFL_FPGA_FME_ERR_GET_IRQ_NUM get the number of irqs, which is used to determine whether/how many interrupts fme error reporting feature supports. * DFL_FPGA_FME_ERR_SET_IRQ set/unset given eventfds as fme error reporting interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: use DFL_FPGA_FME_ERR_GET_IRQ_NUM instead of DFL_FPGA_FME_ERR_GET_INFO Delete flags field for DFL_FPGA_FME_ERR_SET_IRQ v3: put_user() instead of copy_to_user() improves comments v4: use common functions to handle irq ioctls

AFU (Accelerated Function Unit) is dynamic region of the DFL based FPGA, and always defined by users. Some DFL based FPGA cards allow users to implement their own interrupts in AFU. In order to support this, hardware implements a new UINT (AFU Interrupt) private feature with related capability register which describes the number of supported AFU interrupts as well as the local index of the interrupts for software enumeration, and from software side, driver follows the common DFL interrupt notification and handling mechanism, and it implements two ioctls below for user to query number of irqs supported and set/unset interrupt triggers. Ioctls: * DFL_FPGA_PORT_UINT_GET_IRQ_NUM get the number of irqs, which is used to determine how many interrupts UINT feature supports. * DFL_FPGA_PORT_UINT_SET_IRQ set/unset eventfds as AFU interrupt triggers. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: use DFL_FPGA_PORT_UINT_GET_IRQ_NUM instead of DFL_FPGA_PORT_UINT_GET_INFO Delete flags field for DFL_FPGA_PORT_UINT_SET_IRQ v3: put_user() instead of copy_to_user() improves comments v4: use common functions to handle irq ioctls

…rfaces. This patch adds introductions of interrupt related interfaces for FME error reporting, port error reporting and AFU user interrupts features. Signed-off-by: Luwei Kang <luwei.kang@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> ---- v2: Update Documents cause change of irq ioctl interfaces. v3: No change v4: Update interrupt support part.

In early partial reconfiguration private feature, it only supports 32bit data width when writing data to hardware for PR. 512bit data width PR support is an important optimization for some specific solutions (e.g. XEON with FPGA integrated), it allows driver to use AVX512 instruction to improve the performance of partial reconfiguration. e.g. programming one 100MB bitstream image via this 512bit data width PR hardware only takes ~300ms, but 32bit revision requires ~3s per test result. Please note now this optimization is only done on revision 2 of this PR private feature which is only used in integrated solution that AVX512 is always supported. This revision 2 hardware doesn't support 32bit PR. Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Acked-by: Alan Tull <atull@kernel.org> Signed-off-by: Moritz Fischer <mdf@kernel.org>

This patch makes preparation for modularization of DFL sub feature drivers. Currently, if we need to support a new DFL sub feature, an entry should be added to fme/port_feature_drvs[] in dfl-fme/port-main.c. And we need to re-compile the whole DFL modules. That make the DFL drivers hard to be extended. Another consideration is that DFL may contain some IP blocks which are already supported by kernel, most of them are supported by platform device drivers. We could create platform devices for these IP blocks and get them supported by these drivers. An important issue is that platform device drivers usually requests mmio resources on probe. But now dfl mmio is mapped in dfl bus driver (e.g. dfl-pci) as a whole region. Then platform device drivers for sub features can't request their own mmio resources again. This is what the patch trying to resolve. This patch changes the DFL enumeration. DFL bus driver will unmap mmio resources after first step enumeration and pass enumeration info to DFL framework. Then DFL framework will map the mmio resources again, do 2nd step enumeration, and also unmap the mmio resources. In this way, sub feature drivers could then request their own mmio resources as needed. An exception is that mmio resource of FIU headers are still mapped in dfl bus driver. The FIU headers have some fundamental functions (sriov set, port enable/disable) needed for dfl bus devices and other sub features. They should not be unmapped as long as dfl bus device is alive. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> ---- v2: dfl-pci: move mapped bars out of cci_pci_ioremap_bar. dfl: introduced a macro is_hdr_feature delete feature->ioaddr = NULL in dfl_fpga_dev_feature_uinit() rename ioremap_dfl_region -> dfl_map_iomem refactor build_info_commit_dev & build_info_create_dev, make binfo->start & ioaddr change out of these functions some minor fixes v3: merged Hao's code improvement patch. v4: improves comments.

A new bus type "dfl" is introduced for private features which are not initialized by DFL feature drivers (dfl-fme & dfl-afu drivers). So these private features could be handled by separate driver modules. DFL framework will create DFL devices on enumeration. DFL drivers could be registered on this bus to match these DFL devices. They are matched by dfl type & feature_id. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> ---- v2: add struct dfl_fpga_cdev for struct dfl_sub_device add driver_data for dflsub_device_id v4: rename dflsub bus to dfl bus dfl device naming change, delete .feature_id, dev_name(parent dev) + feature_index is enough to make every name unique. Add "type" & "feature_id" sysfs attrs for dfl devices. Add documentation sys-bus-dfl for dfl device sysfs attrs

In order to support MODULE_DEVICE_TABLE() for dfl device driver, this patch moves struct dfl_device_id to mod_devicetable.h Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> ---- v4: rename dflsub bus to dfl bus

Device Feature List (DFL) is a linked list of feature headers within the device MMIO space. It is used by FPGA to enumerate multiple sub features within it. Each feature can be uniquely identified by DFL type and feature id, which can be read out from feature headers. A dfl bus helps DFL framework modularize DFL device drivers for different sub features. The dfl bus matches its devices and drivers by DFL type and feature id. This patch add dfl bus support to MODULE_DEVICE_TABLE() by adding info about struct dfl_device_id in devicetable-offsets.c and add a dfl entry point in file2alias.c. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> ---- v4: rename dflsub to dfl

Add PCIe Device ID for Intel FPGA PAC N3000. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

Add support for 32bit width data register, then it supports 32bit data width spi slave device and spi transfers. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

This patch introduced SPI core parameters in platform data, it allows passing these SPI core parameters via platform data. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

This patch introduces platform data for slave information, it allows spi-altera to add new spi devices once master registration is done. Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

This patch adds support for regmap. It allows this driver to be compatible if low layer register access method is changed in some cases. Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

This allows other driver to reuse the name string for spi-altera platform device creation. Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

The patch moves dfl-bus related APIs to include/linux/fpga/dfl-bus.h Now the DFL sub feature drivers could be made as independent modules and put in different folders according to their functionality. In order for scattered sub feature drivers to include dfl bus APIs, move the dfl bus APIs to a new header file in the public folder. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com> Reviewed-by: Russ Weight <russell.h.weight@intel.com>

This patch adds support for the nios handshake private feature on Intel N3000 FPGA Card. This private feature provides a handshake interface to FPGA NIOS firmware, which receives retimer configuration command from host and executes via an internal SPI master. When nios finished the configuration, host takes over the ownership of the SPI master to control an Intel MAX10 BMC Chip on the SPI bus. For NIOS firmware handshake part, this driver requests the retimer configuration for NIOS with parameters from module param, and adds some sysfs nodes for user to query NIOS state. For SPI part, this driver adds a spi-altera platform device as well as the MAX10 BMC spi slave info. A spi-altera driver will be matched to handle following the SPI work. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com> Reviewed-by: Russ Weight <russell.h.weight@intel.com>

This patch implements the basic functions of the BMC chip for some Intel FPGA PCIe Acceleration Cards (PAC). The BMC is implemented using the intel max10 CPLD. This BMC chip is connected to FPGA by a SPI bus. To provide reliable register access from FPGA, an Avalon Memory-Mapped (Avmm) transaction protocol over the SPI bus is used between host and slave. This driver implements the basic register access with the regmap framework. The mfd cells array is empty now as a placeholder. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Russ Weight <russell.h.weight@intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com> Reviewed-by: Russ Weight <russell.h.weight@intel.com>

The spi-altera driver was originally written with a 32 bit processor, where sizeof(unsigned long) is 4. On a 64 bit processor sizeof(unsigned long) is 8. Change the structure member to u32 to match the actual size of the control register. Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com> Reviewed-by: Xu Yilun <yilun.xu@intel.com>

This is to fix lkp cppcheck warnings: drivers/fpga/dfl-pci.c:230:6: warning: The scope of the variable 'ret' can be reduced. [variableScope] int ret = 0; ^ drivers/fpga/dfl-pci.c:230:10: warning: Variable 'ret' is assigned a value that is never used. [unreadVariable] int ret = 0; ^ Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com>

When putting the port in reset, driver must wait for the soft reset acknowledgment bit instead of the soft reset bit. Fixes: 47c1b19 (fpga: dfl: afu: add port ops support) Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com>

This patch adds hwmon functionality for Intel MAX10 BMC chip. The MAX10 BMC chip connects to a set of sensor chips to monitor current, voltage, thermal and power of different components on board. BMC firmware is responsible for sensor data sampling and recording in shared registers. Host driver reads the sensor data from these shared registers and exposes them to users as hwmon interfaces. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@linux.intel.com> Reviewed-by: Wu Hao <hao.wu@intel.com>

Create two sysfs entries for exposing the MAC address and count from the MAX10 BMC register space. Signed-off-by: Russ Weight <russell.h.weight@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com>

Lockdep with fstests test case btrfs/041 detected a unsafe locking scenario when we allocate the log node on a zoned filesystem. btrfs/041 ============================================ WARNING: possible recursive locking detected 5.11.0-rc7+ #939 Not tainted -------------------------------------------- xfs_io/698 is trying to acquire lock: ffff88810cd673a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x3d1/0xee0 [btrfs] but task is already holding lock: ffff88810b0fc3a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x313/0xee0 [btrfs] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&root->log_mutex); lock(&root->log_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by xfs_io/698: #0: ffff88810cd66620 (sb_internal){.+.+}-{0:0}, at: btrfs_sync_file+0x2c3/0x570 [btrfs] #1: ffff88810b0fc3a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x313/0xee0 [btrfs] stack backtrace: CPU: 0 PID: 698 Comm: xfs_io Not tainted 5.11.0-rc7+ #939 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack+0x77/0x97 __lock_acquire.cold+0xb9/0x32a lock_acquire+0xb5/0x400 ? btrfs_sync_log+0x3d1/0xee0 [btrfs] __mutex_lock+0x7b/0x8d0 ? btrfs_sync_log+0x3d1/0xee0 [btrfs] ? btrfs_sync_log+0x3d1/0xee0 [btrfs] ? find_first_extent_bit+0x9f/0x100 [btrfs] ? __mutex_unlock_slowpath+0x35/0x270 btrfs_sync_log+0x3d1/0xee0 [btrfs] btrfs_sync_file+0x3a8/0x570 [btrfs] __x64_sys_fsync+0x34/0x60 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happens, because we are taking the ->log_mutex albeit it has already been locked. Also while at it, fix the bogus unlock of the tree_log_mutex in the error handling. Fixes: 3ddebf2 ("btrfs: zoned: reorder log node allocation on zoned filesystem") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>

Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_timeout at ffffffffa52a1bdd #3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held #4 start_delalloc_inodes at ffffffffc0380db5 #5 btrfs_start_delalloc_snapshot at ffffffffc0393836 #6 try_flush_qgroup at ffffffffc03f04b2 #7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. #8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex #9 btrfs_update_inode at ffffffffc0385ba8 #10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED #11 touch_atime at ffffffffa4cf0000 #12 generic_file_read_iter at ffffffffa4c1f123 #13 new_sync_read at ffffffffa4ccdc8a #14 vfs_read at ffffffffa4cd0849 #15 ksys_read at ffffffffa4cd0bd1 #16 do_syscall_64 at ffffffffa4a052eb #17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_preempt_disabled at ffffffffa529e80a #3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. #4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex #5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding #6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 #7 cow_file_range at ffffffffc03879c1 #8 btrfs_run_delalloc_range at ffffffffc038894c #9 writepage_delalloc at ffffffffc03a3c8f #10 __extent_writepage at ffffffffc03a4c01 #11 extent_write_cache_pages at ffffffffc03a500b #12 extent_writepages at ffffffffc03a6de2 #13 do_writepages at ffffffffa4c277eb #14 __filemap_fdatawrite_range at ffffffffa4c1e5bb #15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes #16 normal_work_helper at ffffffffc03b706c #17 process_one_work at ffffffffa4aba4e4 #18 worker_thread at ffffffffa4aba6fd #19 kthread at ffffffffa4ac0a3d #20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>

syzbot reports a deadlock, attempting to lock the same spinlock twice: ============================================ WARNING: possible recursive locking detected 5.11.0-syzkaller #0 Not tainted -------------------------------------------- swapper/1/0 is trying to acquire lock: ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline] ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960 but task is already holding lock: ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&runtime->sleep); lock(&runtime->sleep); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by swapper/1/0: #0: ffff888147474908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170 #1: ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xfa/0x151 lib/dump_stack.c:120 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline] check_deadlock kernel/locking/lockdep.c:2872 [inline] validate_chain kernel/locking/lockdep.c:3661 [inline] __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900 lock_acquire kernel/locking/lockdep.c:5510 [inline] lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378 __run_hrtimer kernel/time/hrtimer.c:1519 [inline] __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345 invoke_softirq kernel/softirq.c:221 [inline] __irq_exit_rcu kernel/softirq.c:422 [inline] irq_exit_rcu+0x134/0x200 kernel/softirq.c:434 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100 </IRQ> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632 RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline] RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline] RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:137 [inline] RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline] RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516 Code: dd 38 6e f8 84 db 75 ac e8 54 32 6e f8 e8 0f 1c 74 f8 e9 0c 00 00 00 e8 45 32 6e f8 0f 00 2d 4e 4a c5 00 e8 39 32 6e f8 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 14 3a 6e f8 48 85 db RSP: 0018:ffffc90000d47d18 EFLAGS: 00000293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8880115c3780 RSI: ffffffff89052537 RDI: 0000000000000000 RBP: ffff888141127064 R08: 0000000000000001 R09: 0000000000000001 R10: ffffffff81794168 R11: 0000000000000000 R12: 0000000000000001 R13: ffff888141127000 R14: ffff888141127064 R15: ffff888143331804 acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:647 cpuidle_enter_state+0x1b1/0xc80 drivers/cpuidle/cpuidle.c:237 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351 call_cpuidle kernel/sched/idle.c:158 [inline] cpuidle_idle_call kernel/sched/idle.c:239 [inline] do_idle+0x3e1/0x590 kernel/sched/idle.c:300 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:397 start_secondary+0x274/0x350 arch/x86/kernel/smpboot.c:272 secondary_startup_64_no_verify+0xb0/0xbb which is due to the driver doing poll_wait() twice on the same wait_queue_head. That is perfectly valid, but from checking the rest of the kernel tree, it's the only driver that does this. We can handle this just fine, we just need to ignore the second addition as we'll get woken just fine on the first one. Cc: stable@vger.kernel.org # 5.8+ Fixes: 18bceab ("io_uring: allow POLL_ADD with double poll_wait() users") Reported-by: syzbot+28abd693db9e92c160d8@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>

…ules [ Upstream commit 61834c9 ] The custom regulatory ruleset in the rtl8723bs driver lists an incorrect number of rules: one too many. This results in an out-of-bounds access, as detected by KASAN. This was possible thanks to the newly added support for KASAN on ARMv7. Fix this by filling in the correct number of rules given. KASAN report: ================================================================== BUG: KASAN: global-out-of-bounds in cfg80211_does_bw_fit_range+0x14/0x4c [cfg80211] Read of size 4 at addr bf20c254 by task ip/971 CPU: 2 PID: 971 Comm: ip Tainted: G C 5.11.0-rc2-00020-gf7fe528a7ebe #1 Hardware name: Allwinner sun8i Family [<c0113338>] (unwind_backtrace) from [<c010e8a4>] (show_stack+0x10/0x14) [<c010e8a4>] (show_stack) from [<c0e0f868>] (dump_stack+0x9c/0xb4) [<c0e0f868>] (dump_stack) from [<c0388284>] (print_address_description.constprop.2+0x1dc/0x2dc) [<c0388284>] (print_address_description.constprop.2) from [<c03885cc>] (kasan_report+0x1a8/0x1c4) [<c03885cc>] (kasan_report) from [<bf00a354>] (cfg80211_does_bw_fit_range+0x14/0x4c [cfg80211]) [<bf00a354>] (cfg80211_does_bw_fit_range [cfg80211]) from [<bf00b41c>] (freq_reg_info_regd.part.6+0x108/0x124 [> [<bf00b41c>] (freq_reg_info_regd.part.6 [cfg80211]) from [<bf00df00>] (handle_channel_custom.constprop.12+0x48/> [<bf00df00>] (handle_channel_custom.constprop.12 [cfg80211]) from [<bf00e150>] (wiphy_apply_custom_regulatory+0> [<bf00e150>] (wiphy_apply_custom_regulatory [cfg80211]) from [<bf1fb9e8>] (rtw_regd_init+0x60/0x70 [r8723bs]) [<bf1fb9e8>] (rtw_regd_init [r8723bs]) from [<bf1ee5a8>] (rtw_cfg80211_init_wiphy+0x164/0x1e8 [r8723bs]) [<bf1ee5a8>] (rtw_cfg80211_init_wiphy [r8723bs]) from [<bf1f8d50>] (_netdev_open+0xe4/0x28c [r8723bs]) [<bf1f8d50>] (_netdev_open [r8723bs]) from [<bf1f8f58>] (netdev_open+0x60/0x88 [r8723bs]) [<bf1f8f58>] (netdev_open [r8723bs]) from [<c0bb3730>] (__dev_open+0x178/0x220) [<c0bb3730>] (__dev_open) from [<c0bb3cdc>] (__dev_change_flags+0x258/0x2c4) [<c0bb3cdc>] (__dev_change_flags) from [<c0bb3d88>] (dev_change_flags+0x40/0x80) [<c0bb3d88>] (dev_change_flags) from [<c0bc86fc>] (do_setlink+0x538/0x1160) [<c0bc86fc>] (do_setlink) from [<c0bcf9e8>] (__rtnl_newlink+0x65c/0xad8) [<c0bcf9e8>] (__rtnl_newlink) from [<c0bcfeb0>] (rtnl_newlink+0x4c/0x6c) [<c0bcfeb0>] (rtnl_newlink) from [<c0bc67c8>] (rtnetlink_rcv_msg+0x1f8/0x454) [<c0bc67c8>] (rtnetlink_rcv_msg) from [<c0c330e4>] (netlink_rcv_skb+0xc4/0x1e0) [<c0c330e4>] (netlink_rcv_skb) from [<c0c32478>] (netlink_unicast+0x2c8/0x3c4) [<c0c32478>] (netlink_unicast) from [<c0c32894>] (netlink_sendmsg+0x320/0x5f0) [<c0c32894>] (netlink_sendmsg) from [<c0b75eb0>] (____sys_sendmsg+0x320/0x3e0) [<c0b75eb0>] (____sys_sendmsg) from [<c0b78394>] (___sys_sendmsg+0xe8/0x12c) [<c0b78394>] (___sys_sendmsg) from [<c0b78a50>] (__sys_sendmsg+0xc0/0x120) [<c0b78a50>] (__sys_sendmsg) from [<c0100060>] (ret_fast_syscall+0x0/0x58) Exception stack(0xc5693fa8 to 0xc5693ff0) 3fa0: 00000074 c7a39800 00000003 b6cee648 00000000 00000000 3fc0: 00000074 c7a39800 00000001 00000128 78d18349 00000000 b6ceeda0 004f7cb0 3fe0: 00000128 b6cee5e8 aeca151f aec1d746 The buggy address belongs to the variable: rtw_drv_halt+0xf908/0x6b4 [r8723bs] Memory state around the buggy address: bf20c100: 00 00 00 00 00 00 00 00 00 00 04 f9 f9 f9 f9 f9 bf20c180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >bf20c200: 00 00 00 00 00 00 00 00 00 00 04 f9 f9 f9 f9 f9 ^ bf20c280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 bf20c300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Fixes: 554c0a3 ("staging: Add rtl8723bs sdio wifi driver") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Link: https://lore.kernel.org/r/20210108141401.31741-1-wens@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 12c8f3d ] When trying to set the noise floor via debugfs, a "data bus error" crash like the following can happen: [ 88.433133] Data bus error, epc == 80221c28, ra == 83314e60 [ 88.438895] Oops[#1]: [ 88.441246] CPU: 0 PID: 7263 Comm: sh Not tainted 4.14.195 #0 [ 88.447174] task: 838a1c20 task.stack: 82d5e000 [ 88.451847] $ 0 : 00000000 00000030 deadc0de 83141de4 [ 88.457248] $ 4 : b810a2c4 0000a2c4 83230fd4 00000000 [ 88.462652] $ 8 : 0000000a 00000000 00000001 00000000 [ 88.468055] $12 : 7f8ef318 00000000 00000000 77f802a0 [ 88.473457] $16 : 83230080 00000002 0000001b 83230080 [ 88.478861] $20 : 83a1c3f8 00841000 77f7adb0 ffffff92 [ 88.484263] $24 : 00000fa4 77edd860 [ 88.489665] $28 : 82d5e000 82d5fda8 00000000 83314e60 [ 88.495070] Hi : 00000000 [ 88.498044] Lo : 00000000 [ 88.501040] epc : 80221c28 ioread32+0x8/0x10 [ 88.505671] ra : 83314e60 ath9k_hw_loadnf+0x88/0x520 [ath9k_hw] [ 88.512049] Status: 1000fc03 KERNEL EXL IE [ 88.516369] Cause : 5080801c (ExcCode 07) [ 88.520508] PrId : 00019374 (MIPS 24Kc) [ 88.524556] Modules linked in: ath9k ath9k_common pppoe ppp_async l2tp_ppp cdc_mbim batman_adv ath9k_hw ath sr9700 smsc95xx sierra_net rndis_host qmi_wwan pppox ppp_generic pl2303 nf_conntrack_ipv6 mcs7830 mac80211 kalmia iptable_nat ipt_REJECT ipt_MASQUERADE huawei_cdc_ncm ftdi_sio dm9601 cfg80211 cdc_subset cdc_ncm cdc_ether cdc_eem ax88179_178a asix xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_NETMAP xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CLASSIFY usbserial usbnet usbhid slhc rtl8150 r8152 pegasus nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack [ 88.597894] libcrc32c kaweth iptable_mangle iptable_filter ipt_ECN ipheth ip_tables hso hid_generic crc_ccitt compat cdc_wdm cdc_acm br_netfilter hid evdev input_core nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 l2tp_netlink l2tp_core udp_tunnel ip6_udp_tunnel xfrm6_mode_tunnel xfrm6_mode_transport xfrm6_mode_beet ipcomp6 xfrm6_tunnel esp6 ah6 xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm4_mode_beet ipcomp esp4 ah4 tunnel6 tunnel4 tun xfrm_user xfrm_ipcomp af_key xfrm_algo sha256_generic sha1_generic jitterentropy_rng drbg md5 hmac echainiv des_generic deflate zlib_inflate zlib_deflate cbc authenc crypto_acompress ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common crc16 mii aead crypto_null cryptomgr crc32c_generic [ 88.671671] crypto_hash [ 88.674292] Process sh (pid: 7263, threadinfo=82d5e000, task=838a1c20, tls=77f81efc) [ 88.682279] Stack : 00008060 00000008 00000200 00000000 00000000 00000000 00000000 00000002 [ 88.690916] 80500000 83230080 82d5fe22 00841000 77f7adb0 00000000 00000000 83156858 [ 88.699553] 00000000 8352fa00 83ad62b0 835302a8 00000000 300a00f8 00000003 82d5fe38 [ 88.708190] 82d5fef4 00000001 77f54dc4 77f80000 77f7adb0 c79fe901 00000000 00000000 [ 88.716828] 80510000 00000002 00841000 77f54dc4 77f80000 801ce4cc 0000000b 41824292 [ 88.725465] ... [ 88.727994] Call Trace: [ 88.730532] [<80221c28>] ioread32+0x8/0x10 [ 88.734765] Code: 00000000 8c820000 0000000f <03e00008> 00000000 08088708 00000000 aca40000 03e00008 [ 88.744846] [ 88.746464] ---[ end trace db226b2de1b69b9e ]--- [ 88.753477] Kernel panic - not syncing: Fatal exception [ 88.759981] Rebooting in 3 seconds.. The "REG_READ(ah, AR_PHY_AGC_CONTROL)" in ath9k_hw_loadnf() does not like being called when the hardware is asleep, leading to this crash. The easiest way to reproduce this is trying to set nf_override while the hardware is down: $ ip link set down dev wlan0 $ echo "-85" > /sys/kernel/debug/ieee80211/phy0/ath9k/nf_override Fixing this crash by waking the hardware up before trying to set the noise floor. Similar to what other ath9k debugfs files do. Tested on a Lima board from 8devices, which has a QCA 4531 chipset. Fixes: b901897 ("ath9k: add noise floor override option") Cc: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Linus Lüssing <ll@simonwunderlich.de> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210209184352.4272-1-linus.luessing@c0d3.blue Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit a217313 ] The ct entry object is accessed by the ct add, del, stats and restore methods. In addition, it is referenced from several hash tables. The lifetime of the ct entry object was not managed which triggered race conditions as in the following kasan dump: [ 3374.973945] ================================================================== [ 3374.988552] BUG: KASAN: use-after-free in memcmp+0x4c/0x98 [ 3374.999590] Read of size 1 at addr ffff00036129ea55 by task ksoftirqd/1/15 [ 3375.016415] CPU: 1 PID: 15 Comm: ksoftirqd/1 Tainted: G O 5.4.31+ #1 [ 3375.055301] Call trace: [ 3375.060214] dump_backtrace+0x0/0x238 [ 3375.067580] show_stack+0x24/0x30 [ 3375.074244] dump_stack+0xe0/0x118 [ 3375.081085] print_address_description.isra.9+0x74/0x3d0 [ 3375.091771] __kasan_report+0x198/0x1e8 [ 3375.099486] kasan_report+0xc/0x18 [ 3375.106324] __asan_load1+0x60/0x68 [ 3375.113338] memcmp+0x4c/0x98 [ 3375.119409] mlx5e_tc_ct_restore_flow+0x3a4/0x6f8 [mlx5_core] [ 3375.131073] mlx5e_rep_tc_update_skb+0x1d4/0x2f0 [mlx5_core] [ 3375.142553] mlx5e_handle_rx_cqe_rep+0x198/0x308 [mlx5_core] [ 3375.154034] mlx5e_poll_rx_cq+0x2a0/0x1060 [mlx5_core] [ 3375.164459] mlx5e_napi_poll+0x1d4/0xa78 [mlx5_core] [ 3375.174453] net_rx_action+0x28c/0x7a8 [ 3375.182004] __do_softirq+0x1b4/0x5d0 Manage the lifetime of the ct entry object by using synchornization mechanisms for concurrent access. Fixes: ac991b4 ("net/mlx5e: CT: Offload established flows") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit eaba3b2 ] Unprivileged user can crash kernel by using DRM_IOCTL_NOUVEAU_CHANNEL_ALLOC ioctl. This was reported by trinity[1] fuzzer. [ 71.073906] nouveau 0000:01:00.0: crashme[1329]: channel failed to initialise, -17 [ 71.081730] BUG: kernel NULL pointer dereference, address: 00000000000000a0 [ 71.088928] #PF: supervisor read access in kernel mode [ 71.094059] #PF: error_code(0x0000) - not-present page [ 71.099189] PGD 119590067 P4D 119590067 PUD 1054f5067 PMD 0 [ 71.104842] Oops: 0000 [#1] SMP NOPTI [ 71.108498] CPU: 2 PID: 1329 Comm: crashme Not tainted 5.8.0-rc6+ #2 [ 71.114993] Hardware name: AMD Pike/Pike, BIOS RPK1506A 09/03/2014 [ 71.121213] RIP: 0010:nouveau_abi16_ioctl_channel_alloc+0x108/0x380 [nouveau] [ 71.128339] Code: 48 89 9d f0 00 00 00 41 8b 4c 24 04 41 8b 14 24 45 31 c0 4c 8d 4b 10 48 89 ee 4c 89 f7 e8 10 11 00 00 85 c0 75 78 48 8b 43 10 <8b> 90 a0 00 00 00 41 89 54 24 08 80 7d 3d 05 0f 86 bb 01 00 00 41 [ 71.147074] RSP: 0018:ffffb4a1809cfd38 EFLAGS: 00010246 [ 71.152526] RAX: 0000000000000000 RBX: ffff98cedbaa1d20 RCX: 00000000000003bf [ 71.159651] RDX: 00000000000003be RSI: 0000000000000000 RDI: 0000000000030160 [ 71.166774] RBP: ffff98cee776de00 R08: ffffdc0144198a08 R09: ffff98ceeefd4000 [ 71.173901] R10: ffff98cee7e81780 R11: 0000000000000001 R12: ffffb4a1809cfe08 [ 71.181214] R13: ffff98cee776d000 R14: ffff98cec519e000 R15: ffff98cee776def0 [ 71.188339] FS: 00007fd926250500(0000) GS:ffff98ceeac80000(0000) knlGS:0000000000000000 [ 71.196418] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 71.202155] CR2: 00000000000000a0 CR3: 0000000106622000 CR4: 00000000000406e0 [ 71.209297] Call Trace: [ 71.211777] ? nouveau_abi16_ioctl_getparam+0x1f0/0x1f0 [nouveau] [ 71.218053] drm_ioctl_kernel+0xac/0xf0 [drm] [ 71.222421] drm_ioctl+0x211/0x3c0 [drm] [ 71.226379] ? nouveau_abi16_ioctl_getparam+0x1f0/0x1f0 [nouveau] [ 71.232500] nouveau_drm_ioctl+0x57/0xb0 [nouveau] [ 71.237285] ksys_ioctl+0x86/0xc0 [ 71.240595] __x64_sys_ioctl+0x16/0x20 [ 71.244340] do_syscall_64+0x4c/0x90 [ 71.248110] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 71.253162] RIP: 0033:0x7fd925d4b88b [ 71.256731] Code: Bad RIP value. [ 71.259955] RSP: 002b:00007ffc743592d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [ 71.267514] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd925d4b88b [ 71.274637] RDX: 0000000000601080 RSI: 00000000c0586442 RDI: 0000000000000003 [ 71.281986] RBP: 00007ffc74359340 R08: 00007fd926016ce0 R09: 00007fd926016ce0 [ 71.289111] R10: 0000000000000003 R11: 0000000000000206 R12: 0000000000400620 [ 71.296235] R13: 00007ffc74359420 R14: 0000000000000000 R15: 0000000000000000 [ 71.303361] Modules linked in: rfkill sunrpc snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core edac_mce_amd snd_hwdep kvm_amd snd_seq ccp snd_seq_device snd_pcm kvm snd_timer snd irqbypass soundcore sp5100_tco pcspkr crct10dif_pclmul crc32_pclmul ghash_clmulni_intel wmi_bmof joydev i2c_piix4 fam15h_power k10temp acpi_cpufreq ip_tables xfs libcrc32c sd_mod t10_pi sg nouveau video mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm broadcom bcm_phy_lib ata_generic ahci drm e1000 crc32c_intel libahci serio_raw tg3 libata firewire_ohci firewire_core wmi crc_itu_t dm_mirror dm_region_hash dm_log dm_mod [ 71.365269] CR2: 00000000000000a0 simplified reproducer ---------------------------------8<---------------------------------------- /* * gcc -o crashme crashme.c * ./crashme /dev/dri/renderD128 */ struct drm_nouveau_channel_alloc { uint32_t fb_ctxdma_handle; uint32_t tt_ctxdma_handle; int channel; uint32_t pushbuf_domains; /* Notifier memory */ uint32_t notifier_handle; /* DRM-enforced subchannel assignments */ struct { uint32_t handle; uint32_t grclass; } subchan[8]; uint32_t nr_subchan; }; static struct drm_nouveau_channel_alloc channel; int main(int argc, char *argv[]) { int fd; int rv; if (argc != 2) die("usage: %s <dev>", 0, argv[0]); if ((fd = open(argv[1], O_RDONLY)) == -1) die("open %s", errno, argv[1]); if (ioctl(fd, DRM_IOCTL_NOUVEAU_CHANNEL_ALLOC, &channel) == -1 && errno == EACCES) die("ioctl %s", errno, argv[1]); close(fd); printf("PASS\n"); return 0; } ---------------------------------8<---------------------------------------- [1] https://github.com/kernelslacker/trinity Fixes: eeaf06a ("drm/nouveau/svm: initial support for shared virtual memory") Signed-off-by: Frantisek Hrbata <frantisek@hrbata.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

…g system shutdown [ Upstream commit 45a2702 ] During Coldboot stress tests, system encountered the following panic. Panic logs depicts rt5682_i2c_shutdown() happened first and then later jack detect handler workqueue function triggered. This situation causes panic as rt5682_i2c_shutdown() resets codec. Fix this panic by cancelling all jack detection delayed work. Panic log: [ 20.936124] sof_pci_shutdown [ 20.940248] snd_sof_device_shutdown [ 20.945023] snd_sof_shutdown [ 21.126849] rt5682_i2c_shutdown [ 21.286053] rt5682_jack_detect_handler [ 21.291235] BUG: kernel NULL pointer dereference, address: 000000000000037c [ 21.299302] #PF: supervisor read access in kernel mode [ 21.305254] #PF: error_code(0x0000) - not-present page [ 21.311218] PGD 0 P4D 0 [ 21.314155] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 21.319206] CPU: 2 PID: 123 Comm: kworker/2:3 Tainted: G U 5.4.68 #10 [ 21.333687] ACPI: Preparing to enter system sleep state S5 [ 21.337669] Workqueue: events_power_efficient rt5682_jack_detect_handler [snd_soc_rt5682] [ 21.337671] RIP: 0010:rt5682_jack_detect_handler+0x6c/0x279 [snd_soc_rt5682] Fixes: a50067d ('ASoC: rt5682: split i2c driver into separate module') Signed-off-by: Jairaj Arava <jairaj.arava@intel.com> Signed-off-by: Sathyanarayana Nujella <sathyanarayana.nujella@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@intel.com> Reviewed-by: Shuming Fan <shumingf@realtek.com> Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://lore.kernel.org/r/20210205171428.2344210-1-ranjani.sridharan@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit ed670c3 ] Abaci reported follow issue: [ 30.615891] ====================================================== [ 30.616648] WARNING: possible circular locking dependency detected [ 30.617423] 5.11.0-rc3-next-20210115 #1 Not tainted [ 30.618035] ------------------------------------------------------ [ 30.618914] a.out/1128 is trying to acquire lock: [ 30.619520] ffff88810b063868 (&ep->mtx){+.+.}-{3:3}, at: __ep_eventpoll_poll+0x9f/0x220 [ 30.620505] [ 30.620505] but task is already holding lock: [ 30.621218] ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0 [ 30.622349] [ 30.622349] which lock already depends on the new lock. [ 30.622349] [ 30.623289] [ 30.623289] the existing dependency chain (in reverse order) is: [ 30.624243] [ 30.624243] -> #1 (&ctx->uring_lock){+.+.}-{3:3}: [ 30.625263] lock_acquire+0x2c7/0x390 [ 30.625868] __mutex_lock+0xae/0x9f0 [ 30.626451] io_cqring_overflow_flush.part.95+0x6d/0x70 [ 30.627278] io_uring_poll+0xcb/0xd0 [ 30.627890] ep_item_poll.isra.14+0x4e/0x90 [ 30.628531] do_epoll_ctl+0xb7e/0x1120 [ 30.629122] __x64_sys_epoll_ctl+0x70/0xb0 [ 30.629770] do_syscall_64+0x2d/0x40 [ 30.630332] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 30.631187] [ 30.631187] -> #0 (&ep->mtx){+.+.}-{3:3}: [ 30.631985] check_prevs_add+0x226/0xb00 [ 30.632584] __lock_acquire+0x1237/0x13a0 [ 30.633207] lock_acquire+0x2c7/0x390 [ 30.633740] __mutex_lock+0xae/0x9f0 [ 30.634258] __ep_eventpoll_poll+0x9f/0x220 [ 30.634879] __io_arm_poll_handler+0xbf/0x220 [ 30.635462] io_issue_sqe+0xa6b/0x13e0 [ 30.635982] __io_queue_sqe+0x10b/0x550 [ 30.636648] io_queue_sqe+0x235/0x470 [ 30.637281] io_submit_sqes+0xcce/0xf10 [ 30.637839] __x64_sys_io_uring_enter+0x3fb/0x5b0 [ 30.638465] do_syscall_64+0x2d/0x40 [ 30.638999] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 30.639643] [ 30.639643] other info that might help us debug this: [ 30.639643] [ 30.640618] Possible unsafe locking scenario: [ 30.640618] [ 30.641402] CPU0 CPU1 [ 30.641938] ---- ---- [ 30.642664] lock(&ctx->uring_lock); [ 30.643425] lock(&ep->mtx); [ 30.644498] lock(&ctx->uring_lock); [ 30.645668] lock(&ep->mtx); [ 30.646321] [ 30.646321] *** DEADLOCK *** [ 30.646321] [ 30.647642] 1 lock held by a.out/1128: [ 30.648424] #0: ffff88810e952be8 (&ctx->uring_lock){+.+.}-{3:3}, at: __x64_sys_io_uring_enter+0x3f0/0x5b0 [ 30.649954] [ 30.649954] stack backtrace: [ 30.650592] CPU: 1 PID: 1128 Comm: a.out Not tainted 5.11.0-rc3-next-20210115 #1 [ 30.651554] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 30.652290] Call Trace: [ 30.652688] dump_stack+0xac/0xe3 [ 30.653164] check_noncircular+0x11e/0x130 [ 30.653747] ? check_prevs_add+0x226/0xb00 [ 30.654303] check_prevs_add+0x226/0xb00 [ 30.654845] ? add_lock_to_list.constprop.49+0xac/0x1d0 [ 30.655564] __lock_acquire+0x1237/0x13a0 [ 30.656262] lock_acquire+0x2c7/0x390 [ 30.656788] ? __ep_eventpoll_poll+0x9f/0x220 [ 30.657379] ? __io_queue_proc.isra.88+0x180/0x180 [ 30.658014] __mutex_lock+0xae/0x9f0 [ 30.658524] ? __ep_eventpoll_poll+0x9f/0x220 [ 30.659112] ? mark_held_locks+0x5a/0x80 [ 30.659648] ? __ep_eventpoll_poll+0x9f/0x220 [ 30.660229] ? _raw_spin_unlock_irqrestore+0x2d/0x40 [ 30.660885] ? trace_hardirqs_on+0x46/0x110 [ 30.661471] ? __io_queue_proc.isra.88+0x180/0x180 [ 30.662102] ? __ep_eventpoll_poll+0x9f/0x220 [ 30.662696] __ep_eventpoll_poll+0x9f/0x220 [ 30.663273] ? __ep_eventpoll_poll+0x220/0x220 [ 30.663875] __io_arm_poll_handler+0xbf/0x220 [ 30.664463] io_issue_sqe+0xa6b/0x13e0 [ 30.664984] ? __lock_acquire+0x782/0x13a0 [ 30.665544] ? __io_queue_proc.isra.88+0x180/0x180 [ 30.666170] ? __io_queue_sqe+0x10b/0x550 [ 30.666725] __io_queue_sqe+0x10b/0x550 [ 30.667252] ? __fget_files+0x131/0x260 [ 30.667791] ? io_req_prep+0xd8/0x1090 [ 30.668316] ? io_queue_sqe+0x235/0x470 [ 30.668868] io_queue_sqe+0x235/0x470 [ 30.669398] io_submit_sqes+0xcce/0xf10 [ 30.669931] ? xa_load+0xe4/0x1c0 [ 30.670425] __x64_sys_io_uring_enter+0x3fb/0x5b0 [ 30.671051] ? lockdep_hardirqs_on_prepare+0xde/0x180 [ 30.671719] ? syscall_enter_from_user_mode+0x2b/0x80 [ 30.672380] do_syscall_64+0x2d/0x40 [ 30.672901] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 30.673503] RIP: 0033:0x7fd89c813239 [ 30.673962] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48 [ 30.675920] RSP: 002b:00007ffc65a7c628 EFLAGS: 00000217 ORIG_RAX: 00000000000001aa [ 30.676791] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd89c813239 [ 30.677594] RDX: 0000000000000000 RSI: 0000000000000014 RDI: 0000000000000003 [ 30.678678] RBP: 00007ffc65a7c720 R08: 0000000000000000 R09: 0000000003000000 [ 30.679492] R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000400ff0 [ 30.680282] R13: 00007ffc65a7c840 R14: 0000000000000000 R15: 0000000000000000 This might happen if we do epoll_wait on a uring fd while reading/writing the former epoll fd in a sqe in the former uring instance. So let's don't flush cqring overflow list, just do a simple check. Reported-by: Abaci <abaci@linux.alibaba.com> Fixes: 6c50315 ("io_uring: patch up IOPOLL overflow_flush sync") Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit b8437a3 ] Sleeping while atomic = bad. Let's fix an obvious typo to try to avoid it. The warning that was seen (on a downstream kernel with the problematic patch backported): BUG: sleeping function called from invalid context at mm/page_alloc.c:4726 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9, name: ksoftirqd/0 CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.93-12508-gc10c93e28e39 #1 Call trace: dump_backtrace+0x0/0x154 show_stack+0x20/0x2c dump_stack+0xa0/0xfc ___might_sleep+0x11c/0x12c __might_sleep+0x50/0x84 __alloc_pages_nodemask+0xf8/0x2bc __arm_lpae_alloc_pages+0x48/0x1b4 __arm_lpae_map+0x124/0x274 __arm_lpae_map+0x1cc/0x274 arm_lpae_map+0x140/0x170 arm_smmu_map+0x78/0xbc __iommu_map+0xd4/0x210 _iommu_map+0x4c/0x84 iommu_map_atomic+0x44/0x58 __iommu_dma_map+0x8c/0xc4 iommu_dma_map_page+0xac/0xf0 Fixes: d8c1df0 ("iommu: Move iotlb_sync_map out from __iommu_map") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210201170611.1.I64a7b62579287d668d7c89e105dcedf45d641063@changeid Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit 429fa96 ] The size of tx_valid_cpus was calculated under the assumption that the numa nodes identifiers are continuous, which is not the case in all archs as this could lead to the following panic when trying to access an invalid tx_valid_cpus index, avoid the following panic by using nr_node_ids instead of num_online_nodes() to allocate the tx_valid_cpus size. Kernel attempted to read user page (8) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000008 Faulting instruction address: 0xc0080000081b4a90 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: siw(+) rfkill rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm sunrpc ib_umad rdma_cm ib_cm iw_cm i40iw ib_uverbs ib_core i40e ses enclosure scsi_transport_sas ipmi_powernv ibmpowernv at24 ofpart ipmi_devintf regmap_i2c ipmi_msghandler powernv_flash uio_pdrv_genirq uio mtd opal_prd zram ip_tables xfs libcrc32c sd_mod t10_pi ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm vmx_crypto aacraid drm_panel_orientation_quirks dm_mod CPU: 40 PID: 3279 Comm: modprobe Tainted: G W X --------- --- 5.11.0-0.rc4.129.eln108.ppc64le #2 NIP: c0080000081b4a90 LR: c0080000081b4a2c CTR: c0000000007ce1c0 REGS: c000000027fa77b0 TRAP: 0300 Tainted: G W X --------- --- (5.11.0-0.rc4.129.eln108.ppc64le) MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 44224882 XER: 00000000 CFAR: c0000000007ce200 DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 0 GPR00: c0080000081b4a2c c000000027fa7a50 c0080000081c3900 0000000000000040 GPR04: c000000002023080 c000000012e1c300 000020072ad70000 0000000000000001 GPR08: c000000001726068 0000000000000008 0000000000000008 c0080000081b5758 GPR12: c0000000007ce1c0 c0000007fffc3000 00000001590b1e40 0000000000000000 GPR16: 0000000000000000 0000000000000001 000000011ad68fc8 00007fffcc09c5c8 GPR20: 0000000000000008 0000000000000000 00000001590b2850 00000001590b1d30 GPR24: 0000000000043d68 000000011ad67a80 000000011ad67a80 0000000000100000 GPR28: c000000012e1c300 c0000000020271c8 0000000000000001 c0080000081bf608 NIP [c0080000081b4a90] siw_init_cpulist+0x194/0x214 [siw] LR [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw] Call Trace: [c000000027fa7a50] [c0080000081b4a2c] siw_init_cpulist+0x130/0x214 [siw] (unreliable) [c000000027fa7a90] [c0080000081b4e68] siw_init_module+0x40/0x2a0 [siw] [c000000027fa7b30] [c0000000000124f4] do_one_initcall+0x84/0x2e0 [c000000027fa7c00] [c000000000267ffc] do_init_module+0x7c/0x350 [c000000027fa7c90] [c00000000026a180] __do_sys_init_module+0x210/0x250 [c000000027fa7db0] [c0000000000387e4] system_call_exception+0x134/0x230 [c000000027fa7e10] [c00000000000d660] system_call_common+0xf0/0x27c Instruction dump: 40810044 3d420000 e8bf0000 e88a82d0 3d420000 e90a82c8 792a1f24 7cc4302a 7d2642aa 79291f24 7d25482a 7d295214 <7d4048a8> 7d4a3b78 7d4049ad 40c2fff4 Fixes: bdcf26b ("rdma/siw: network and RDMA core interface") Link: https://lore.kernel.org/r/20210201112922.141085-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

[ Upstream commit c5c97ca ] The ubsan reported the following error. It was because sample's raw data missed u32 padding at the end. So it broke the alignment of the array after it. The raw data contains an u32 size prefix so the data size should have an u32 padding after 8-byte aligned data. 27: Sample parsing :util/synthetic-events.c:1539:4: runtime error: store to misaligned address 0x62100006b9bc for type '__u64' (aka 'unsigned long long'), which requires 8 byte alignment 0x62100006b9bc: note: pointer points here 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13 #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8 #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9 #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9 #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9 #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4 #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9 #7 0x56153264e796 in run_builtin perf.c:312:11 #8 0x56153264cf03 in handle_internal_command perf.c:364:8 #9 0x56153264e47d in run_argv perf.c:408:2 #10 0x56153264c9a9 in main perf.c:538:3 #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc) #12 0x561532596828 in _start ... SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use util/synthetic-events.c:1539:4 in Fixes: 045f8cd ("perf tests: Add a sample parsing test") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 957e3f7 upstream. acpi_walk_namespace can return success without executing our callback which initializes info->handle. If the random value in this structure is a valid address (which is on the stack, so it's quite possible), then nothing bad will happen, because: sdw_intel_scan_controller -> acpi_bus_get_device -> acpi_get_device_data -> acpi_get_data_full -> acpi_ns_validate_handle will reject this handle. However, if the value from the stack doesn't point to a valid address, we get this: BUG: kernel NULL pointer dereference, address: 0000000000000050 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 6 PID: 472 Comm: systemd-udevd Tainted: G W 5.10.0-1-amd64 #1 Debian 5.10.4-1 Hardware name: HP HP Pavilion Laptop 15-cs3xxx/86E2, BIOS F.05 01/01/2020 RIP: 0010:acpi_ns_validate_handle+0x1a/0x23 Code: 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 1f 44 00 00 48 8d 57 ff 48 89 f8 48 83 fa fd 76 08 48 8b 05 0c b8 67 01 c3 <80> 7f 08 0f 74 02 31 c0 c3 0f 1f 44 00 00 48 8b 3d f6 b7 67 01 e8 RSP: 0000:ffffc388807c7b20 EFLAGS: 00010213 RAX: 0000000000000048 RBX: ffffc388807c7b70 RCX: 0000000000000000 RDX: 0000000000000047 RSI: 0000000000000246 RDI: 0000000000000048 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffffc0f5f4d1 R11: ffffffff8f0cb268 R12: 0000000000001001 R13: ffffffff8e33b160 R14: 0000000000000048 R15: 0000000000000000 FS: 00007f24548288c0(0000) GS:ffff9f781fb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000050 CR3: 0000000106158004 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: acpi_get_data_full+0x4d/0x92 acpi_bus_get_device+0x1f/0x40 sdw_intel_acpi_scan+0x59/0x230 [soundwire_intel] ? strstr+0x22/0x60 ? dmi_matches+0x76/0xe0 snd_intel_dsp_driver_probe.cold+0xaf/0x163 [snd_intel_dspcfg] azx_probe+0x7a/0x970 [snd_hda_intel] local_pci_probe+0x42/0x80 ? _cond_resched+0x16/0x40 pci_device_probe+0xfd/0x1b0 really_probe+0x205/0x460 driver_probe_device+0xe1/0x150 device_driver_attach+0xa1/0xb0 __driver_attach+0x8a/0x150 ? device_driver_attach+0xb0/0xb0 ? device_driver_attach+0xb0/0xb0 bus_for_each_dev+0x78/0xc0 bus_add_driver+0x12b/0x1e0 driver_register+0x8b/0xe0 ? 0xffffffffc0f65000 do_one_initcall+0x44/0x1d0 ? do_init_module+0x23/0x250 ? kmem_cache_alloc_trace+0xf5/0x200 do_init_module+0x5c/0x250 __do_sys_finit_module+0xb1/0x110 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210208120104.204761-1-marcin.slusarz@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7e2a870 upstream. Zygo reported the following panic when testing my error handling patches for relocation: kernel BUG at fs/btrfs/backref.c:2545! invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 8472 Comm: btrfs Tainted: G W 14 Hardware name: QEMU Standard PC (i440FX + PIIX, Call Trace: btrfs_backref_error_cleanup+0x4df/0x530 build_backref_tree+0x1a5/0x700 ? _raw_spin_unlock+0x22/0x30 ? release_extent_buffer+0x225/0x280 ? free_extent_buffer.part.52+0xd7/0x140 relocate_tree_blocks+0x2a6/0xb60 ? kasan_unpoison_shadow+0x35/0x50 ? do_relocation+0xc10/0xc10 ? kasan_kmalloc+0x9/0x10 ? kmem_cache_alloc_trace+0x6a3/0xcb0 ? free_extent_buffer.part.52+0xd7/0x140 ? rb_insert_color+0x342/0x360 ? add_tree_block.isra.36+0x236/0x2b0 relocate_block_group+0x2eb/0x780 ? merge_reloc_roots+0x470/0x470 btrfs_relocate_block_group+0x26e/0x4c0 btrfs_relocate_chunk+0x52/0x120 btrfs_balance+0xe2e/0x18f0 ? pvclock_clocksource_read+0xeb/0x190 ? btrfs_relocate_chunk+0x120/0x120 ? lock_contended+0x620/0x6e0 ? do_raw_spin_lock+0x1e0/0x1e0 ? do_raw_spin_unlock+0xa8/0x140 btrfs_ioctl_balance+0x1f9/0x460 btrfs_ioctl+0x24c8/0x4380 ? __kasan_check_read+0x11/0x20 ? check_chain_key+0x1f4/0x2f0 ? __asan_loadN+0xf/0x20 ? btrfs_ioctl_get_supported_features+0x30/0x30 ? kvm_sched_clock_read+0x18/0x30 ? check_chain_key+0x1f4/0x2f0 ? lock_downgrade+0x3f0/0x3f0 ? handle_mm_fault+0xad6/0x2150 ? do_vfs_ioctl+0xfc/0x9d0 ? ioctl_file_clone+0xe0/0xe0 ? check_flags.part.50+0x6c/0x1e0 ? check_flags.part.50+0x6c/0x1e0 ? check_flags+0x26/0x30 ? lock_is_held_type+0xc3/0xf0 ? syscall_enter_from_user_mode+0x1b/0x60 ? do_syscall_64+0x13/0x80 ? rcu_read_lock_sched_held+0xa1/0xd0 ? __kasan_check_read+0x11/0x20 ? __fget_light+0xae/0x110 __x64_sys_ioctl+0xc3/0x100 do_syscall_64+0x37/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This occurs because of this check if (RB_EMPTY_NODE(&upper->rb_node)) BUG_ON(!list_empty(&node->upper)); As we are dropping the backref node, if we discover that our upper node in the edge we just cleaned up isn't linked into the cache that we are now done with this node, thus the BUG_ON(). However this is an erroneous assumption, as we will look up all the references for a node first, and then process the pending edges. All of the 'upper' nodes in our pending edges won't be in the cache's rb_tree yet, because they haven't been processed. We could very well have many edges still left to cleanup on this node. The fact is we simply do not need this check, we can just process all of the edges only for this node, because below this check we do the following if (list_empty(&upper->lower)) { list_add_tail(&upper->lower, &cache->leaves); upper->lowest = 1; } If the upper node truly isn't used yet, then we add it to the cache->leaves list to be cleaned up later. If it is still used then the last child node that has it linked into its node will add it to the leaves list and then it will be cleaned up. Fix this problem by dropping this logic altogether. With this fix I no longer see the panic when testing with error injection in the backref code. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 938fcbf upstream. While doing error injection testing with my relocation patches I hit the following assert: assertion failed: list_empty(&block_group->dirty_list), in fs/btrfs/block-group.c:3356 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.h:3357! invalid opcode: 0000 [#1] SMP NOPTI CPU: 0 PID: 24351 Comm: umount Tainted: G W 5.10.0-rc3+ #193 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 RIP: 0010:assertfail.constprop.0+0x18/0x1a RSP: 0018:ffffa09b019c7e00 EFLAGS: 00010282 RAX: 0000000000000056 RBX: ffff8f6492c18000 RCX: 0000000000000000 RDX: ffff8f64fbc27c60 RSI: ffff8f64fbc19050 RDI: ffff8f64fbc19050 RBP: ffff8f6483bbdc00 R08: 0000000000000000 R09: 0000000000000000 R10: ffffa09b019c7c38 R11: ffffffff85d70928 R12: ffff8f6492c18100 R13: ffff8f6492c18148 R14: ffff8f6483bbdd70 R15: dead000000000100 FS: 00007fbfda4cdc40(0000) GS:ffff8f64fbc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fbfda666fd0 CR3: 000000013cf66002 CR4: 0000000000370ef0 Call Trace: btrfs_free_block_groups.cold+0x55/0x55 close_ctree+0x2c5/0x306 ? fsnotify_destroy_marks+0x14/0x100 generic_shutdown_super+0x6c/0x100 kill_anon_super+0x14/0x30 btrfs_kill_super+0x12/0x20 deactivate_locked_super+0x36/0xa0 cleanup_mnt+0x12d/0x190 task_work_run+0x5c/0xa0 exit_to_user_mode_prepare+0x1b1/0x1d0 syscall_exit_to_user_mode+0x54/0x280 entry_SYSCALL_64_after_hwframe+0x44/0xa9 This happened because I injected an error in btrfs_cow_block() while running the dirty block groups. When we run the dirty block groups, we splice the list onto a local list to process. However if an error occurs, we only cleanup the transactions dirty block group list, not any pending block groups we have on our locally spliced list. In fact if we fail to allocate a path in this function we'll also fail to clean up the splice list. Fix this by splicing the list back onto the transaction dirty block group list so that the block groups are cleaned up. Then add a 'out' label and have the error conditions jump to out so that the errors are handled properly. This also has the side-effect of fixing a problem where we would clear 'ret' on error because we unconditionally ran btrfs_run_delayed_refs(). CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a56f441 upstream. In sdhci_esdhc_imx_remove() the SDHCI_INT_STATUS in read. Under some circumstances, this may be done while the device is runtime suspended, triggering the below splat. Fix the problem by adding a pm_runtime_get_sync(), before reading the register, which will turn on clocks etc making the device accessible again. [ 1811.323148] mmc1: card aaaa removed [ 1811.347483] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP [ 1811.354988] Modules linked in: sdhci_esdhc_imx(-) sdhci_pltfm sdhci cqhci mmc_block mmc_core [last unloaded: mmc_core] [ 1811.365726] CPU: 0 PID: 3464 Comm: rmmod Not tainted 5.10.1-sd-99871-g53835a2e8186 #5 [ 1811.373559] Hardware name: Freescale i.MX8DXL EVK (DT) [ 1811.378705] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) [ 1811.384723] pc : sdhci_esdhc_imx_remove+0x28/0x15c [sdhci_esdhc_imx] [ 1811.391090] lr : platform_drv_remove+0x2c/0x50 [ 1811.395536] sp : ffff800012c7bcb0 [ 1811.398855] x29: ffff800012c7bcb0 x28: ffff00002c72b900 [ 1811.404181] x27: 0000000000000000 x26: 0000000000000000 [ 1811.409497] x25: 0000000000000000 x24: 0000000000000000 [ 1811.414814] x23: ffff0000042b3890 x22: ffff800009127120 [ 1811.420131] x21: ffff00002c4c9580 x20: ffff0000042d0810 [ 1811.425456] x19: ffff0000042d0800 x18: 0000000000000020 [ 1811.430773] x17: 0000000000000000 x16: 0000000000000000 [ 1811.436089] x15: 0000000000000004 x14: ffff000004019c10 [ 1811.441406] x13: 0000000000000000 x12: 0000000000000020 [ 1811.446723] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f [ 1811.452040] x9 : fefefeff6364626d x8 : 7f7f7f7f7f7f7f7f [ 1811.457356] x7 : 78725e6473607372 x6 : 0000000080808080 [ 1811.462673] x5 : 0000000000000000 x4 : 0000000000000000 [ 1811.467990] x3 : ffff800011ac1cb0 x2 : 0000000000000000 [ 1811.473307] x1 : ffff8000091214d4 x0 : ffff8000133a0030 [ 1811.478624] Call trace: [ 1811.481081] sdhci_esdhc_imx_remove+0x28/0x15c [sdhci_esdhc_imx] [ 1811.487098] platform_drv_remove+0x2c/0x50 [ 1811.491198] __device_release_driver+0x188/0x230 [ 1811.495818] driver_detach+0xc0/0x14c [ 1811.499487] bus_remove_driver+0x5c/0xb0 [ 1811.503413] driver_unregister+0x30/0x60 [ 1811.507341] platform_driver_unregister+0x14/0x20 [ 1811.512048] sdhci_esdhc_imx_driver_exit+0x1c/0x3a8 [sdhci_esdhc_imx] [ 1811.518495] __arm64_sys_delete_module+0x19c/0x230 [ 1811.523291] el0_svc_common.constprop.0+0x78/0x1a0 [ 1811.528086] do_el0_svc+0x24/0x90 [ 1811.531405] el0_svc+0x14/0x20 [ 1811.534461] el0_sync_handler+0x1a4/0x1b0 [ 1811.538474] el0_sync+0x174/0x180 [ 1811.541801] Code: a9025bf5 f9403e95 f9400ea0 9100c000 (b9400000) [ 1811.547902] ---[ end trace 3fb1a3bd48ff7be5 ]--- Signed-off-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org # v4.0+ Link: https://lore.kernel.org/r/20210210181933.29263-1-Frank.Li@nxp.com [Ulf: Clarified the commit message a bit] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

[ Upstream commit 4964a43 ] Replace strcpy() with strscpy() in bcm2835-audio/bcm2835.c to prevent the following when loading snd-bcm2835: [ 58.480634] ------------[ cut here ]------------ [ 58.485321] kernel BUG at lib/string.c:1149! [ 58.489650] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 58.495214] Modules linked in: snd_bcm2835(COE+) snd_pcm snd_timer snd dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua btsdio bluetooth ecdh_generic ecc bcm2835_v4l2(CE) bcm2835_codec(CE) brcmfmac bcm2835_isp(CE) bcm2835_mmal_vchiq(CE) brcmutil cfg80211 v4l2_mem2mem videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops raspberrypi_hwmon videobuf2_v4l2 videobuf2_common videodev bcm2835_gpiomem mc vc_sm_cma(CE) rpivid_mem uio_pdrv_genirq uio sch_fq_codel drm ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear dwc2 roles spidev udc_core crct10dif_ce xhci_pci xhci_pci_renesas phy_generic aes_neon_bs aes_neon_blk crypto_simd cryptd [ 58.563787] CPU: 3 PID: 1959 Comm: insmod Tainted: G C OE 5.11.0-1001-raspi #1 [ 58.572172] Hardware name: Raspberry Pi 4 Model B Rev 1.2 (DT) [ 58.578086] pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--) [ 58.584178] pc : fortify_panic+0x20/0x24 [ 58.588161] lr : fortify_panic+0x20/0x24 [ 58.592136] sp : ffff800010a83990 [ 58.595491] x29: ffff800010a83990 x28: 0000000000000002 [ 58.600879] x27: ffffb0b07cb72928 x26: 0000000000000000 [ 58.606268] x25: ffff39e884973838 x24: ffffb0b07cb74190 [ 58.611655] x23: ffffb0b07cb72030 x22: 0000000000000000 [ 58.617042] x21: ffff39e884973014 x20: ffff39e88b793010 [ 58.622428] x19: ffffb0b07cb72670 x18: 0000000000000030 [ 58.627814] x17: 0000000000000000 x16: ffffb0b092ce2c1c [ 58.633200] x15: ffff39e88b901500 x14: 0720072007200720 [ 58.638588] x13: 0720072007200720 x12: 0720072007200720 [ 58.643979] x11: ffffb0b0936cbdf0 x10: 00000000fffff000 [ 58.649366] x9 : ffffb0b09220cfa8 x8 : 0000000000000000 [ 58.654752] x7 : ffffb0b093673df0 x6 : ffffb0b09364e000 [ 58.660140] x5 : 0000000000000000 x4 : ffff39e93b7db948 [ 58.665526] x3 : ffff39e93b7ebcf0 x2 : 0000000000000000 [ 58.670913] x1 : 0000000000000000 x0 : 0000000000000022 [ 58.676299] Call trace: [ 58.678775] fortify_panic+0x20/0x24 [ 58.682402] snd_bcm2835_alsa_probe+0x5b8/0x7d8 [snd_bcm2835] [ 58.688247] platform_probe+0x74/0xe4 [ 58.691963] really_probe+0xf0/0x510 [ 58.695585] driver_probe_device+0xe0/0x100 [ 58.699826] device_driver_attach+0xcc/0xd4 [ 58.704068] __driver_attach+0xb0/0x17c [ 58.707956] bus_for_each_dev+0x7c/0xd4 [ 58.711843] driver_attach+0x30/0x40 [ 58.715467] bus_add_driver+0x154/0x250 [ 58.719354] driver_register+0x84/0x140 [ 58.723242] __platform_driver_register+0x34/0x40 [ 58.728013] bcm2835_alsa_driver_init+0x30/0x1000 [snd_bcm2835] [ 58.734024] do_one_initcall+0x54/0x300 [ 58.737914] do_init_module+0x60/0x280 [ 58.741719] load_module+0x680/0x770 [ 58.745344] __do_sys_finit_module+0xbc/0x130 [ 58.749761] __arm64_sys_finit_module+0x2c/0x40 [ 58.754356] el0_svc_common.constprop.0+0x88/0x220 [ 58.759216] do_el0_svc+0x30/0xa0 [ 58.762575] el0_svc+0x28/0x70 [ 58.765669] el0_sync_handler+0x1a4/0x1b0 [ 58.769732] el0_sync+0x178/0x180 [ 58.773095] Code: aa0003e1 91366040 910003fd 97ffee21 (d4210000) [ 58.779275] ---[ end trace 29be5b17497bd898 ]--- [ 58.783955] note: insmod[1959] exited with preempt_count 1 [ 58.791921] ------------[ cut here ]------------ For the sake of it, replace all the other occurences of strcpy() under bcm2835-audio/ as well. Signed-off-by: Juerg Haefliger <juergh@canonical.com> Link: https://lore.kernel.org/r/20210205072502.10907-1-juergh@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 1c3b3e6 upstream. syzbot reports a deadlock, attempting to lock the same spinlock twice: ============================================ WARNING: possible recursive locking detected 5.11.0-syzkaller #0 Not tainted -------------------------------------------- swapper/1/0 is trying to acquire lock: ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline] ffff88801b2b1130 (&runtime->sleep){..-.}-{2:2}, at: io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960 but task is already holding lock: ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&runtime->sleep); lock(&runtime->sleep); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by swapper/1/0: #0: ffff888147474908 (&group->lock){..-.}-{2:2}, at: _snd_pcm_stream_lock_irqsave+0x9f/0xd0 sound/core/pcm_native.c:170 #1: ffff88801b2b3130 (&runtime->sleep){..-.}-{2:2}, at: __wake_up_common_lock+0xb4/0x130 kernel/sched/wait.c:137 stack backtrace: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xfa/0x151 lib/dump_stack.c:120 print_deadlock_bug kernel/locking/lockdep.c:2829 [inline] check_deadlock kernel/locking/lockdep.c:2872 [inline] validate_chain kernel/locking/lockdep.c:3661 [inline] __lock_acquire.cold+0x14c/0x3b4 kernel/locking/lockdep.c:4900 lock_acquire kernel/locking/lockdep.c:5510 [inline] lock_acquire+0x1ab/0x730 kernel/locking/lockdep.c:5475 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] io_poll_double_wake+0x25f/0x6a0 fs/io_uring.c:4960 __wake_up_common+0x147/0x650 kernel/sched/wait.c:108 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:138 snd_pcm_update_state+0x46a/0x540 sound/core/pcm_lib.c:203 snd_pcm_update_hw_ptr0+0xa75/0x1a50 sound/core/pcm_lib.c:464 snd_pcm_period_elapsed+0x160/0x250 sound/core/pcm_lib.c:1805 dummy_hrtimer_callback+0x94/0x1b0 sound/drivers/dummy.c:378 __run_hrtimer kernel/time/hrtimer.c:1519 [inline] __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583 hrtimer_run_softirq+0x17b/0x360 kernel/time/hrtimer.c:1600 __do_softirq+0x29b/0x9f6 kernel/softirq.c:345 invoke_softirq kernel/softirq.c:221 [inline] __irq_exit_rcu kernel/softirq.c:422 [inline] irq_exit_rcu+0x134/0x200 kernel/softirq.c:434 sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100 </IRQ> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632 RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:29 [inline] RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:70 [inline] RIP: 0010:arch_irqs_disabled arch/x86/include/asm/irqflags.h:137 [inline] RIP: 0010:acpi_safe_halt drivers/acpi/processor_idle.c:111 [inline] RIP: 0010:acpi_idle_do_entry+0x1c9/0x250 drivers/acpi/processor_idle.c:516 Code: dd 38 6e f8 84 db 75 ac e8 54 32 6e f8 e8 0f 1c 74 f8 e9 0c 00 00 00 e8 45 32 6e f8 0f 00 2d 4e 4a c5 00 e8 39 32 6e f8 fb f4 <9c> 5b 81 e3 00 02 00 00 fa 31 ff 48 89 de e8 14 3a 6e f8 48 85 db RSP: 0018:ffffc90000d47d18 EFLAGS: 00000293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff8880115c3780 RSI: ffffffff89052537 RDI: 0000000000000000 RBP: ffff888141127064 R08: 0000000000000001 R09: 0000000000000001 R10: ffffffff81794168 R11: 0000000000000000 R12: 0000000000000001 R13: ffff888141127000 R14: ffff888141127064 R15: ffff888143331804 acpi_idle_enter+0x361/0x500 drivers/acpi/processor_idle.c:647 cpuidle_enter_state+0x1b1/0xc80 drivers/cpuidle/cpuidle.c:237 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:351 call_cpuidle kernel/sched/idle.c:158 [inline] cpuidle_idle_call kernel/sched/idle.c:239 [inline] do_idle+0x3e1/0x590 kernel/sched/idle.c:300 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:397 start_secondary+0x274/0x350 arch/x86/kernel/smpboot.c:272 secondary_startup_64_no_verify+0xb0/0xbb which is due to the driver doing poll_wait() twice on the same wait_queue_head. That is perfectly valid, but from checking the rest of the kernel tree, it's the only driver that does this. We can handle this just fine, we just need to ignore the second addition as we'll get woken just fine on the first one. Cc: stable@vger.kernel.org # 5.8+ Fixes: 18bceab ("io_uring: allow POLL_ADD with double poll_wait() users") Reported-by: syzbot+28abd693db9e92c160d8@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4d14c5c upstream Calling btrfs_qgroup_reserve_meta_prealloc from btrfs_delayed_inode_reserve_metadata can result in flushing delalloc while holding a transaction and delayed node locks. This is deadlock prone. In the past multiple commits: * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction") * 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle") Tried to solve various aspects of this but this was always a whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock scenario involving btrfs_delayed_node::mutex. Namely, one thread can call btrfs_dirty_inode as a result of reading a file and modifying its atime: PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_timeout at ffffffffa52a1bdd #3 wait_for_completion at ffffffffa529eeea <-- sleeps with delayed node mutex held #4 start_delalloc_inodes at ffffffffc0380db5 #5 btrfs_start_delalloc_snapshot at ffffffffc0393836 #6 try_flush_qgroup at ffffffffc03f04b2 #7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes. #8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex #9 btrfs_update_inode at ffffffffc0385ba8 #10 btrfs_dirty_inode at ffffffffc038627b <-- TRANSACTIION OPENED #11 touch_atime at ffffffffa4cf0000 #12 generic_file_read_iter at ffffffffa4c1f123 #13 new_sync_read at ffffffffa4ccdc8a #14 vfs_read at ffffffffa4cd0849 #15 ksys_read at ffffffffa4cd0bd1 #16 do_syscall_64 at ffffffffa4a052eb #17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c This will cause an asynchronous work to flush the delalloc inodes to happen which can try to acquire the same delayed_node mutex: PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30" #0 __schedule at ffffffffa529e07d #1 schedule at ffffffffa529e4ff #2 schedule_preempt_disabled at ffffffffa529e80a #3 __mutex_lock at ffffffffa529fdcb <-- goes to sleep, never wakes up. #4 btrfs_delayed_update_inode at ffffffffc03e3143 <-- tries to acquire the mutex #5 btrfs_update_inode at ffffffffc0385ba8 <-- this is the same inode that pid 6963 is holding #6 cow_file_range_inline.constprop.78 at ffffffffc0386be7 #7 cow_file_range at ffffffffc03879c1 #8 btrfs_run_delalloc_range at ffffffffc038894c #9 writepage_delalloc at ffffffffc03a3c8f #10 __extent_writepage at ffffffffc03a4c01 #11 extent_write_cache_pages at ffffffffc03a500b #12 extent_writepages at ffffffffc03a6de2 #13 do_writepages at ffffffffa4c277eb #14 __filemap_fdatawrite_range at ffffffffa4c1e5bb #15 btrfs_run_delalloc_work at ffffffffc0380987 <-- starts running delayed nodes #16 normal_work_helper at ffffffffc03b706c #17 process_one_work at ffffffffa4aba4e4 #18 worker_thread at ffffffffa4aba6fd #19 kthread at ffffffffa4ac0a3d #20 ret_from_fork at ffffffffa54001ff To fully address those cases the complete fix is to never issue any flushing while holding the transaction or the delayed node lock. This patch achieves it by calling qgroup_reserve_meta directly which will either succeed without flushing or will fail and return -EDQUOT. In the latter case that return value is going to be propagated to btrfs_dirty_inode which will fallback to start a new transaction. That's fine as the majority of time we expect the inode will have BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly copying the in-memory state. Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT") CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [sudip: adjust context] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

[ Upstream commit 4add4d9 ] If a reset is performed, but even the reset fails for some reasons (e.g., on Surface devices, the fw reset requires another quirks), cancel_work_sync() hangs in mwifiex_cleanup_pcie(). # firmware went into a bad state [...] [ 1608.281690] mwifiex_pcie 0000:03:00.0: info: shutdown mwifiex... [ 1608.282724] mwifiex_pcie 0000:03:00.0: rx_pending=0, tx_pending=1, cmd_pending=0 [ 1608.292400] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed [ 1608.292405] mwifiex_pcie 0000:03:00.0: PREP_CMD: card is removed # reset performed after firmware went into a bad state [ 1609.394320] mwifiex_pcie 0000:03:00.0: WLAN FW already running! Skip FW dnld [ 1609.394335] mwifiex_pcie 0000:03:00.0: WLAN FW is active # but even the reset failed [ 1619.499049] mwifiex_pcie 0000:03:00.0: mwifiex_cmd_timeout_func: Timeout cmd id = 0xfa, act = 0xe000 [ 1619.499094] mwifiex_pcie 0000:03:00.0: num_data_h2c_failure = 0 [ 1619.499103] mwifiex_pcie 0000:03:00.0: num_cmd_h2c_failure = 0 [ 1619.499110] mwifiex_pcie 0000:03:00.0: is_cmd_timedout = 1 [ 1619.499117] mwifiex_pcie 0000:03:00.0: num_tx_timeout = 0 [ 1619.499124] mwifiex_pcie 0000:03:00.0: last_cmd_index = 0 [ 1619.499133] mwifiex_pcie 0000:03:00.0: last_cmd_id: fa 00 07 01 07 01 07 01 07 01 [ 1619.499140] mwifiex_pcie 0000:03:00.0: last_cmd_act: 00 e0 00 00 00 00 00 00 00 00 [ 1619.499147] mwifiex_pcie 0000:03:00.0: last_cmd_resp_index = 3 [ 1619.499155] mwifiex_pcie 0000:03:00.0: last_cmd_resp_id: 07 81 07 81 07 81 07 81 07 81 [ 1619.499162] mwifiex_pcie 0000:03:00.0: last_event_index = 2 [ 1619.499169] mwifiex_pcie 0000:03:00.0: last_event: 58 00 58 00 58 00 58 00 58 00 [ 1619.499177] mwifiex_pcie 0000:03:00.0: data_sent=0 cmd_sent=1 [ 1619.499185] mwifiex_pcie 0000:03:00.0: ps_mode=0 ps_state=0 [ 1619.499215] mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device # mwifiex_pcie_work hang happening [ 1823.233923] INFO: task kworker/3:1:44 blocked for more than 122 seconds. [ 1823.233932] Tainted: G WC OE 5.10.0-rc1-1-mainline #1 [ 1823.233935] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1823.233940] task:kworker/3:1 state:D stack: 0 pid: 44 ppid: 2 flags:0x00004000 [ 1823.233960] Workqueue: events mwifiex_pcie_work [mwifiex_pcie] [ 1823.233965] Call Trace: [ 1823.233981] __schedule+0x292/0x820 [ 1823.233990] schedule+0x45/0xe0 [ 1823.233995] schedule_timeout+0x11c/0x160 [ 1823.234003] wait_for_completion+0x9e/0x100 [ 1823.234012] __flush_work.isra.0+0x156/0x210 [ 1823.234018] ? flush_workqueue_prep_pwqs+0x130/0x130 [ 1823.234026] __cancel_work_timer+0x11e/0x1a0 [ 1823.234035] mwifiex_cleanup_pcie+0x28/0xd0 [mwifiex_pcie] [ 1823.234049] mwifiex_free_adapter+0x24/0xe0 [mwifiex] [ 1823.234060] _mwifiex_fw_dpc+0x294/0x560 [mwifiex] [ 1823.234074] mwifiex_reinit_sw+0x15d/0x300 [mwifiex] [ 1823.234080] mwifiex_pcie_reset_done+0x50/0x80 [mwifiex_pcie] [ 1823.234087] pci_try_reset_function+0x5c/0x90 [ 1823.234094] process_one_work+0x1d6/0x3a0 [ 1823.234100] worker_thread+0x4d/0x3d0 [ 1823.234107] ? rescuer_thread+0x410/0x410 [ 1823.234112] kthread+0x142/0x160 [ 1823.234117] ? __kthread_bind_mask+0x60/0x60 [ 1823.234124] ret_from_fork+0x22/0x30 [...] This is a deadlock caused by calling cancel_work_sync() in mwifiex_cleanup_pcie(): - Device resets are done via mwifiex_pcie_card_reset() - which schedules card->work to call mwifiex_pcie_card_reset_work() - which calls pci_try_reset_function(). - This leads to mwifiex_pcie_reset_done() be called on the same workqueue, which in turn calls - mwifiex_reinit_sw() and that calls - _mwifiex_fw_dpc(). The problem is now that _mwifiex_fw_dpc() calls mwifiex_free_adapter() in case firmware initialization fails. That ends up calling mwifiex_cleanup_pcie(). Note that all those calls are still running on the workqueue. So when mwifiex_cleanup_pcie() now calls cancel_work_sync(), it's really waiting on itself to complete, causing a deadlock. This commit fixes the deadlock by skipping cancel_work_sync() on a reset failure path. After this commit, when reset fails, the following output is expected to be shown: kernel: mwifiex_pcie 0000:03:00.0: info: _mwifiex_fw_dpc: unregister device kernel: mwifiex: Failed to bring up adapter: -5 kernel: mwifiex_pcie 0000:03:00.0: reinit failed: -5 To reproduce this issue, for example, try putting the root port of wifi into D3 (replace "00:1d.3" with your setup). # put into D3 (root port) sudo setpci -v -s 00:1d.3 CAP_PM+4.b=0b Cc: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20201028142346.18355-1-kitakar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>

yilunxu1984 and others added 30 commits June 4, 2020 09:49

Documentation: fpga: dfl: add description for performance reporting s…

4f6918c

…upport This patch adds description for performance reporting support for Device Feature List (DFL) based FPGA. Signed-off-by: Xu Yilun <yilun.xu@intel.com> Signed-off-by: Wu Hao <hao.wu@intel.com>

mfd: intel-m10-bmc: expose mac address and count

b51af43

Create two sysfs entries for exposing the MAC address and count from the MAX10 BMC register space. Signed-off-by: Russ Weight <russell.h.weight@intel.com> Signed-off-by: Xu Yilun <yilun.xu@intel.com>

rweight force-pushed the fpga-ofs-dev branch from d8e07c2 to d058516 Compare February 5, 2021 01:33

rweight force-pushed the fpga-ofs-dev branch 3 times, most recently from 689891f to 4afd3e8 Compare March 4, 2021 17:17

rweight force-pushed the fpga-ofs-dev branch 3 times, most recently from a91ce0e to edb6b7e Compare March 16, 2021 17:02

rweight force-pushed the fpga-ofs-dev branch from edb6b7e to fb7e349 Compare March 18, 2021 16:50

chenquanquan mentioned this pull request Mar 24, 2021

net: ethernet: s10hssi: fix the crash in the ipv6 network #6

Open

208yunyao mentioned this pull request Sep 19, 2024

5.15.92-dfl (struct device *)’ from incompatible pointer type ‘int (*)(struct device *)’ [-Werror=incompatible-pointer-types] .remove = dfl_bus_remove, OFS/linux-dfl-backport#140

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mfd: handle legacy m10bmc#1

mfd: handle legacy m10bmc#1
trixirt wants to merge 54 commits intoOFS:fpga-ofs-devfrom
trixirt:legacy-bmc

trixirt commented Jun 19, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

trixirt commented Jun 19, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants