| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
drm/xe/guc: Check GuC running state before deregistering exec queue
In normal operation, a registered exec queue is disabled and
deregistered through the GuC, and freed only after the GuC confirms
completion. However, if the driver is forced to unbind while the exec
queue is still running, the user may call exec_destroy() after the GuC
has already been stopped and CT communication disabled.
In this case, the driver cannot receive a response from the GuC,
preventing proper cleanup of exec queue resources. Fix this by directly
releasing the resources when GuC is not running.
Here is the failure dmesg log:
"
[ 468.089581] ---[ end trace 0000000000000000 ]---
[ 468.089608] pci 0000:03:00.0: [drm] *ERROR* GT0: GUC ID manager unclean (1/65535)
[ 468.090558] pci 0000:03:00.0: [drm] GT0: total 65535
[ 468.090562] pci 0000:03:00.0: [drm] GT0: used 1
[ 468.090564] pci 0000:03:00.0: [drm] GT0: range 1..1 (1)
[ 468.092716] ------------[ cut here ]------------
[ 468.092719] WARNING: CPU: 14 PID: 4775 at drivers/gpu/drm/xe/xe_ttm_vram_mgr.c:298 ttm_vram_mgr_fini+0xf8/0x130 [xe]
"
v2: use xe_uc_fw_is_running() instead of xe_guc_ct_enabled().
As CT may go down and come back during VF migration.
(cherry picked from commit 9b42321a02c50a12b2beb6ae9469606257fbecea) |
| In the Linux kernel, the following vulnerability has been resolved:
staging: most: remove broken i2c driver
The MOST I2C driver has been completely broken for five years without
anyone noticing so remove the driver from staging.
Specifically, commit 723de0f9171e ("staging: most: remove device from
interface structure") started requiring drivers to set the interface
device pointer before registration, but the I2C driver was never updated
which results in a NULL pointer dereference if anyone ever tries to
probe it. |
| In the Linux kernel, the following vulnerability has been resolved:
irqchip/mchp-eic: Fix error code in mchp_eic_domain_alloc()
If irq_domain_translate_twocell() sets "hwirq" to >= MCHP_EIC_NIRQ (2) then
it results in an out of bounds access.
The code checks for invalid values, but doesn't set the error code. Return
-EINVAL in that case, instead of returning success. |
| In the Linux kernel, the following vulnerability has been resolved:
serial: qcom-geni: Fix blocked task
Revert commit 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for
serial driver") and its dependent commit 86fa39dd6fb7 ("serial:
qcom-geni: Enable Serial on SA8255p Qualcomm platforms") because the
first one causes regression - hang task on Qualcomm RB1 board (QRB2210)
and unable to use serial at all during normal boot:
INFO: task kworker/u16:0:12 blocked for more than 42 seconds.
Not tainted 6.17.0-rc1-00004-g53e760d89498 #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:0 state:D stack:0 pid:12 tgid:12 ppid:2 task_flags:0x4208060 flags:0x00000010
Workqueue: async async_run_entry_fn
Call trace:
__switch_to+0xe8/0x1a0 (T)
__schedule+0x290/0x7c0
schedule+0x34/0x118
rpm_resume+0x14c/0x66c
rpm_resume+0x2a4/0x66c
rpm_resume+0x2a4/0x66c
rpm_resume+0x2a4/0x66c
__pm_runtime_resume+0x50/0x9c
__driver_probe_device+0x58/0x120
driver_probe_device+0x3c/0x154
__driver_attach_async_helper+0x4c/0xc0
async_run_entry_fn+0x34/0xe0
process_one_work+0x148/0x290
worker_thread+0x2c4/0x3e0
kthread+0x118/0x1c0
ret_from_fork+0x10/0x20
The issue was reported on 12th of August and was ignored by author of
commits introducing issue for two weeks. Only after complaining author
produced a fix which did not work, so if original commits cannot be
reliably fixed for 5 weeks, they obviously are buggy and need to be
dropped. |
| In the Linux kernel, the following vulnerability has been resolved:
net/sched: ets: Remove drr class from the active list if it changes to strict
Whenever a user issues an ets qdisc change command, transforming a
drr class into a strict one, the ets code isn't checking whether that
class was in the active list and removing it. This means that, if a
user changes a strict class (which was in the active list) back to a drr
one, that class will be added twice to the active list [1].
Doing so with the following commands:
tc qdisc add dev lo root handle 1: ets bands 2 strict 1
tc qdisc add dev lo parent 1:2 handle 20: \
tbf rate 8bit burst 100b latency 1s
tc filter add dev lo parent 1: basic classid 1:2
ping -c1 -W0.01 -s 56 127.0.0.1
tc qdisc change dev lo root handle 1: ets bands 2 strict 2
tc qdisc change dev lo root handle 1: ets bands 2 strict 1
ping -c1 -W0.01 -s 56 127.0.0.1
Will trigger the following splat with list debug turned on:
[ 59.279014][ T365] ------------[ cut here ]------------
[ 59.279452][ T365] list_add double add: new=ffff88801d60e350, prev=ffff88801d60e350, next=ffff88801d60e2c0.
[ 59.280153][ T365] WARNING: CPU: 3 PID: 365 at lib/list_debug.c:35 __list_add_valid_or_report+0x17f/0x220
[ 59.280860][ T365] Modules linked in:
[ 59.281165][ T365] CPU: 3 UID: 0 PID: 365 Comm: tc Not tainted 6.18.0-rc7-00105-g7e9f13163c13-dirty #239 PREEMPT(voluntary)
[ 59.281977][ T365] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 59.282391][ T365] RIP: 0010:__list_add_valid_or_report+0x17f/0x220
[ 59.282842][ T365] Code: 89 c6 e8 d4 b7 0d ff 90 0f 0b 90 90 31 c0 e9 31 ff ff ff 90 48 c7 c7 e0 a0 22 9f 48 89 f2 48 89 c1 4c 89 c6 e8 b2 b7 0d ff 90 <0f> 0b 90 90 31 c0 e9 0f ff ff ff 48 89 f7 48 89 44 24 10 4c 89 44
...
[ 59.288812][ T365] Call Trace:
[ 59.289056][ T365] <TASK>
[ 59.289224][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.289546][ T365] ets_qdisc_change+0xd2b/0x1e80
[ 59.289891][ T365] ? __lock_acquire+0x7e7/0x1be0
[ 59.290223][ T365] ? __pfx_ets_qdisc_change+0x10/0x10
[ 59.290546][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.290898][ T365] ? __mutex_trylock_common+0xda/0x240
[ 59.291228][ T365] ? __pfx___mutex_trylock_common+0x10/0x10
[ 59.291655][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.291993][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.292313][ T365] ? trace_contention_end+0xc8/0x110
[ 59.292656][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.293022][ T365] ? srso_alias_return_thunk+0x5/0xfbef5
[ 59.293351][ T365] tc_modify_qdisc+0x63a/0x1cf0
Fix this by always checking and removing an ets class from the active list
when changing it to strict.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/sched/sch_ets.c?id=ce052b9402e461a9aded599f5b47e76bc727f7de#n663 |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: ufs: core: Fix invalid probe error return value
After DME Link Startup, the error return value is set to the MIPI UniPro
GenericErrorCode which can be 0 (SUCCESS) or 1 (FAILURE). Upon failure
during driver probe, the error code 1 is propagated back to the driver
probe function which must return a negative value to indicate an error,
but 1 is not negative, so the probe is considered to be successful even
though it failed. Subsequently, removing the driver results in an oops
because it is not in a valid state.
This happens because none of the callers of ufshcd_init() expect a
non-negative error code.
Fix the return value and documentation to match actual usage. |
| In the Linux kernel, the following vulnerability has been resolved:
firmware: stratix10-svc: fix bug in saving controller data
Fix the incorrect usage of platform_set_drvdata and dev_set_drvdata. They
both are of the same data and overrides each other. This resulted in the
rmmod of the svc driver to fail and throw a kernel panic for kthread_stop
and fifo free. |
| In the Linux kernel, the following vulnerability has been resolved:
net: dsa: microchip: Don't free uninitialized ksz_irq
If something goes wrong at setup, ksz_irq_free() can be called on
uninitialized ksz_irq (for example when ksz_ptp_irq_setup() fails). It
leads to freeing uninitialized IRQ numbers and/or domains.
Use dsa_switch_for_each_user_port_continue_reverse() in the error path
to iterate only over the fully initialized ports. |
| In the Linux kernel, the following vulnerability has been resolved:
scsi: qla2xxx: Clear cmds after chip reset
Commit aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling
and host reset handling") caused two problems:
1. Commands sent to FW, after chip reset got stuck and never freed as FW
is not going to respond to them anymore.
2. BUG_ON(cmd->sg_mapped) in qlt_free_cmd(). Commit 26f9ce53817a
("scsi: qla2xxx: Fix missed DMA unmap for aborted commands")
attempted to fix this, but introduced another bug under different
circumstances when two different CPUs were racing to call
qlt_unmap_sg() at the same time: BUG_ON(!valid_dma_direction(dir)) in
dma_unmap_sg_attrs().
So revert "scsi: qla2xxx: Fix missed DMA unmap for aborted commands" and
partially revert "scsi: qla2xxx: target: Fix offline port handling and
host reset handling" at __qla2x00_abort_all_cmds. |
| In the Linux kernel, the following vulnerability has been resolved:
afs: Fix delayed allocation of a cell's anonymous key
The allocation of a cell's anonymous key is done in a background thread
along with other cell setup such as doing a DNS upcall. In the reported
bug, this is triggered by afs_parse_source() parsing the device name given
to mount() and calling afs_lookup_cell() with the name of the cell.
The normal key lookup then tries to use the key description on the
anonymous authentication key as the reference for request_key() - but it
may not yet be set and so an oops can happen.
This has been made more likely to happen by the fix for dynamic lookup
failure.
Fix this by firstly allocating a reference name and attaching it to the
afs_cell record when the record is created. It can share the memory
allocation with the cell name (unfortunately it can't just overlap the cell
name by prepending it with "afs@" as the cell name already has a '.'
prepended for other purposes). This reference name is then passed to
request_key().
Secondly, the anon key is now allocated on demand at the point a key is
requested in afs_request_key() if it is not already allocated. A mutex is
used to prevent multiple allocation for a cell.
Thirdly, make afs_request_key_rcu() return NULL if the anonymous key isn't
yet allocated (if we need it) and then the caller can return -ECHILD to
drop out of RCU-mode and afs_request_key() can be called.
Note that the anonymous key is kind of necessary to make the key lookup
cache work as that doesn't currently cache a negative lookup, but it's
probably worth some investigation to see if NULL can be used instead. |
| In the Linux kernel, the following vulnerability has been resolved:
drm/xe/guc: Fix stack_depot usage
Add missing stack_depot_init() call when CONFIG_DRM_XE_DEBUG_GUC is
enabled to fix the following call stack:
[] BUG: kernel NULL pointer dereference, address: 0000000000000000
[] Workqueue: drm_sched_run_job_work [gpu_sched]
[] RIP: 0010:stack_depot_save_flags+0x172/0x870
[] Call Trace:
[] <TASK>
[] fast_req_track+0x58/0xb0 [xe]
(cherry picked from commit 64fdf496a6929a0a194387d2bb5efaf5da2b542f) |
| In the Linux kernel, the following vulnerability has been resolved:
macintosh/mac_hid: fix race condition in mac_hid_toggle_emumouse
The following warning appears when running syzkaller, and this issue also
exists in the mainline code.
------------[ cut here ]------------
list_add double add: new=ffffffffa57eee28, prev=ffffffffa57eee28, next=ffffffffa5e63100.
WARNING: CPU: 0 PID: 1491 at lib/list_debug.c:35 __list_add_valid_or_report+0xf7/0x130
Modules linked in:
CPU: 0 PID: 1491 Comm: syz.1.28 Not tainted 6.6.0+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:__list_add_valid_or_report+0xf7/0x130
RSP: 0018:ff1100010dfb7b78 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffa57eee18 RCX: ffffffff97fc9817
RDX: 0000000000040000 RSI: ffa0000002383000 RDI: 0000000000000001
RBP: ffffffffa57eee28 R08: 0000000000000001 R09: ffe21c0021bf6f2c
R10: 0000000000000001 R11: 6464615f7473696c R12: ffffffffa5e63100
R13: ffffffffa57eee28 R14: ffffffffa57eee28 R15: ff1100010dfb7d48
FS: 00007fb14398b640(0000) GS:ff11000119600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010d096005 CR4: 0000000000773ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 80000000
Call Trace:
<TASK>
input_register_handler+0xb3/0x210
mac_hid_start_emulation+0x1c5/0x290
mac_hid_toggle_emumouse+0x20a/0x240
proc_sys_call_handler+0x4c2/0x6e0
new_sync_write+0x1b1/0x2d0
vfs_write+0x709/0x950
ksys_write+0x12a/0x250
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x78/0xe2
The WARNING occurs when two processes concurrently write to the mac-hid
emulation sysctl, causing a race condition in mac_hid_toggle_emumouse().
Both processes read old_val=0, then both try to register the input handler,
leading to a double list_add of the same handler.
CPU0 CPU1
------------------------- -------------------------
vfs_write() //write 1 vfs_write() //write 1
proc_sys_write() proc_sys_write()
mac_hid_toggle_emumouse() mac_hid_toggle_emumouse()
old_val = *valp // old_val=0
old_val = *valp // old_val=0
mutex_lock_killable()
proc_dointvec() // *valp=1
mac_hid_start_emulation()
input_register_handler()
mutex_unlock()
mutex_lock_killable()
proc_dointvec()
mac_hid_start_emulation()
input_register_handler() //Trigger Warning
mutex_unlock()
Fix this by moving the old_val read inside the mutex lock region. |
| In the Linux kernel, the following vulnerability has been resolved:
crash: fix crashkernel resource shrink
When crashkernel is configured with a high reservation, shrinking its
value below the low crashkernel reservation causes two issues:
1. Invalid crashkernel resource objects
2. Kernel crash if crashkernel shrinking is done twice
For example, with crashkernel=200M,high, the kernel reserves 200MB of high
memory and some default low memory (say 256MB). The reservation appears
as:
cat /proc/iomem | grep -i crash
af000000-beffffff : Crash kernel
433000000-43f7fffff : Crash kernel
If crashkernel is then shrunk to 50MB (echo 52428800 >
/sys/kernel/kexec_crash_size), /proc/iomem still shows 256MB reserved:
af000000-beffffff : Crash kernel
Instead, it should show 50MB:
af000000-b21fffff : Crash kernel
Further shrinking crashkernel to 40MB causes a kernel crash with the
following trace (x86):
BUG: kernel NULL pointer dereference, address: 0000000000000038
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
<snip...>
Call Trace: <TASK>
? __die_body.cold+0x19/0x27
? page_fault_oops+0x15a/0x2f0
? search_module_extables+0x19/0x60
? search_bpf_extables+0x5f/0x80
? exc_page_fault+0x7e/0x180
? asm_exc_page_fault+0x26/0x30
? __release_resource+0xd/0xb0
release_resource+0x26/0x40
__crash_shrink_memory+0xe5/0x110
crash_shrink_memory+0x12a/0x190
kexec_crash_size_store+0x41/0x80
kernfs_fop_write_iter+0x141/0x1f0
vfs_write+0x294/0x460
ksys_write+0x6d/0xf0
<snip...>
This happens because __crash_shrink_memory()/kernel/crash_core.c
incorrectly updates the crashk_res resource object even when
crashk_low_res should be updated.
Fix this by ensuring the correct crashkernel resource object is updated
when shrinking crashkernel memory. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to avoid updating compression context during writeback
Bai, Shuangpeng <sjb7183@psu.edu> reported a bug as below:
Oops: divide error: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 11441 Comm: syz.0.46 Not tainted 6.17.0 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:f2fs_all_cluster_page_ready+0x106/0x550 fs/f2fs/compress.c:857
Call Trace:
<TASK>
f2fs_write_cache_pages fs/f2fs/data.c:3078 [inline]
__f2fs_write_data_pages fs/f2fs/data.c:3290 [inline]
f2fs_write_data_pages+0x1c19/0x3600 fs/f2fs/data.c:3317
do_writepages+0x38e/0x640 mm/page-writeback.c:2634
filemap_fdatawrite_wbc mm/filemap.c:386 [inline]
__filemap_fdatawrite_range mm/filemap.c:419 [inline]
file_write_and_wait_range+0x2ba/0x3e0 mm/filemap.c:794
f2fs_do_sync_file+0x6e6/0x1b00 fs/f2fs/file.c:294
generic_write_sync include/linux/fs.h:3043 [inline]
f2fs_file_write_iter+0x76e/0x2700 fs/f2fs/file.c:5259
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x7e9/0xe00 fs/read_write.c:686
ksys_write+0x19d/0x2d0 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xf7/0x470 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The bug was triggered w/ below race condition:
fsync setattr ioctl
- f2fs_do_sync_file
- file_write_and_wait_range
- f2fs_write_cache_pages
: inode is non-compressed
: cc.cluster_size =
F2FS_I(inode)->i_cluster_size = 0
- tag_pages_for_writeback
- f2fs_setattr
- truncate_setsize
- f2fs_truncate
- f2fs_fileattr_set
- f2fs_setflags_common
- set_compress_context
: F2FS_I(inode)->i_cluster_size = 4
: set_inode_flag(inode, FI_COMPRESSED_FILE)
- f2fs_compressed_file
: return true
- f2fs_all_cluster_page_ready
: "pgidx % cc->cluster_size" trigger dividing 0 issue
Let's change as below to fix this issue:
- introduce a new atomic type variable .writeback in structure f2fs_inode_info
to track the number of threads which calling f2fs_write_cache_pages().
- use .i_sem lock to protect .writeback update.
- check .writeback before update compression context in f2fs_setflags_common()
to avoid race w/ ->writepages. |
| In the Linux kernel, the following vulnerability has been resolved:
sparc: fix accurate exception reporting in copy_{from_to}_user for UltraSPARC III
Anthony Yznaga tracked down that a BUG_ON in ext4 code with large folios
enabled resulted from copy_from_user() returning impossibly large values
greater than the size to be copied. This lead to __copy_from_iter()
returning impossible values instead of the actual number of bytes it was
able to copy.
The BUG_ON has been reported in
https://lore.kernel.org/r/b14f55642207e63e907965e209f6323a0df6dcee.camel@physik.fu-berlin.de
The referenced commit introduced exception handlers on user-space memory
references in copy_from_user and copy_to_user. These handlers return from
the respective function and calculate the remaining bytes left to copy
using the current register contents. The exception handlers expect that
%o2 has already been masked during the bulk copy loop, but the masking was
performed after that loop. This will fix the return value of copy_from_user
and copy_to_user in the faulting case. The behaviour of memcpy stays
unchanged. |
| In the Linux kernel, the following vulnerability has been resolved:
fuse: missing copy_finish in fuse-over-io-uring argument copies
Fix a possible reference count leak of payload pages during
fuse argument copies.
[Joanne: simplified error cleanup] |
| Integer overflow or wraparound in the Linux kernel-mode driver for some Intel(R) 800 Series Ethernet before version 1.17.2 may allow an authenticated user to potentially enable denial of service via local access. |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to avoid NULL pointer dereference in f2fs_check_quota_consistency()
syzbot reported a f2fs bug as below:
Oops: gen[ 107.736417][ T5848] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 5848 Comm: syz-executor263 Tainted: G W 6.17.0-rc1-syzkaller-00014-g0e39a731820a #0 PREEMPT_{RT,(full)}
RIP: 0010:strcmp+0x3c/0xc0 lib/string.c:284
Call Trace:
<TASK>
f2fs_check_quota_consistency fs/f2fs/super.c:1188 [inline]
f2fs_check_opt_consistency+0x1378/0x2c10 fs/f2fs/super.c:1436
__f2fs_remount fs/f2fs/super.c:2653 [inline]
f2fs_reconfigure+0x482/0x1770 fs/f2fs/super.c:5297
reconfigure_super+0x224/0x890 fs/super.c:1077
do_remount fs/namespace.c:3314 [inline]
path_mount+0xd18/0xfe0 fs/namespace.c:4112
do_mount fs/namespace.c:4133 [inline]
__do_sys_mount fs/namespace.c:4344 [inline]
__se_sys_mount+0x317/0x410 fs/namespace.c:4321
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The direct reason is f2fs_check_quota_consistency() may suffer null-ptr-deref
issue in strcmp().
The bug can be reproduced w/ below scripts:
mkfs.f2fs -f /dev/vdb
mount -t f2fs -o usrquota /dev/vdb /mnt/f2fs
quotacheck -uc /mnt/f2fs/
umount /mnt/f2fs
mount -t f2fs -o usrjquota=aquota.user,jqfmt=vfsold /dev/vdb /mnt/f2fs
mount -t f2fs -o remount,usrjquota=,jqfmt=vfsold /dev/vdb /mnt/f2fs
umount /mnt/f2fs
So, before old_qname and new_qname comparison, we need to check whether
they are all valid pointers, fix it. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: guard against EA inode refcount underflow in xattr update
syzkaller found a path where ext4_xattr_inode_update_ref() reads an EA
inode refcount that is already <= 0 and then applies ref_change (often
-1). That lets the refcount underflow and we proceed with a bogus value,
triggering errors like:
EXT4-fs error: EA inode <n> ref underflow: ref_count=-1 ref_change=-1
EXT4-fs warning: ea_inode dec ref err=-117
Make the invariant explicit: if the current refcount is non-positive,
treat this as on-disk corruption, emit ext4_error_inode(), and fail the
operation with -EFSCORRUPTED instead of updating the refcount. Delete the
WARN_ONCE() as negative refcounts are now impossible; keep error reporting
in ext4_error_inode().
This prevents the underflow and the follow-on orphan/cleanup churn. |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: BPF: Disable trampoline for kernel module function trace
The current LoongArch BPF trampoline implementation is incompatible
with tracing functions in kernel modules. This causes several severe
and user-visible problems:
* The `bpf_selftests/module_attach` test fails consistently.
* Kernel lockup when a BPF program is attached to a module function [1].
* Critical kernel modules like WireGuard experience traffic disruption
when their functions are traced with fentry [2].
Given the severity and the potential for other unknown side-effects, it
is safest to disable the feature entirely for now. This patch prevents
the BPF subsystem from allowing trampoline attachments to kernel module
functions on LoongArch.
This is a temporary mitigation until the core issues in the trampoline
code for kernel module handling can be identified and fixed.
[root@fedora bpf]# ./test_progs -a module_attach -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_module_attach:PASS:skel_open 0 nsec
test_module_attach:PASS:set_attach_target 0 nsec
test_module_attach:PASS:set_attach_target_explicit 0 nsec
test_module_attach:PASS:skel_load 0 nsec
libbpf: prog 'handle_fentry': failed to attach: -ENOTSUPP
libbpf: prog 'handle_fentry': failed to auto-attach: -ENOTSUPP
test_module_attach:FAIL:skel_attach skeleton attach failed: -524
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
Successfully unloaded bpf_testmod.ko.
[1]: https://lore.kernel.org/loongarch/CAK3+h2wDmpC-hP4u4pJY8T-yfKyk4yRzpu2LMO+C13FMT58oqQ@mail.gmail.com/
[2]: https://lore.kernel.org/loongarch/CAK3+h2wYcpc+OwdLDUBvg2rF9rvvyc5amfHT-KcFaK93uoELPg@mail.gmail.com/ |