| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
smb: server: avoid double-free in smb_direct_free_sendmsg after smb_direct_flush_send_list()
smb_direct_flush_send_list() already calls smb_direct_free_sendmsg(),
so we should not call it again after post_sendmsg()
moved it to the batch list. |
| In the Linux kernel, the following vulnerability has been resolved:
usbip: validate number_of_packets in usbip_pack_ret_submit()
When a USB/IP client receives a RET_SUBMIT response,
usbip_pack_ret_submit() unconditionally overwrites
urb->number_of_packets from the network PDU. This value is
subsequently used as the loop bound in usbip_recv_iso() and
usbip_pad_iso() to iterate over urb->iso_frame_desc[], a flexible
array whose size was fixed at URB allocation time based on the
*original* number_of_packets from the CMD_SUBMIT.
A malicious USB/IP server can set number_of_packets in the response
to a value larger than what was originally submitted, causing a heap
out-of-bounds write when usbip_recv_iso() writes to
urb->iso_frame_desc[i] beyond the allocated region.
KASAN confirmed this with kernel 7.0.0-rc5:
BUG: KASAN: slab-out-of-bounds in usbip_recv_iso+0x46a/0x640
Write of size 4 at addr ffff888106351d40 by task vhci_rx/69
The buggy address is located 0 bytes to the right of
allocated 320-byte region [ffff888106351c00, ffff888106351d40)
The server side (stub_rx.c) and gadget side (vudc_rx.c) already
validate number_of_packets in the CMD_SUBMIT path since commits
c6688ef9f297 ("usbip: fix stub_rx: harden CMD_SUBMIT path to handle
malicious input") and b78d830f0049 ("usbip: fix vudc_rx: harden
CMD_SUBMIT path to handle malicious input"). The server side validates
against USBIP_MAX_ISO_PACKETS because no URB exists yet at that point.
On the client side we have the original URB, so we can use the tighter
bound: the response must not exceed the original number_of_packets.
This mirrors the existing validation of actual_length against
transfer_buffer_length in usbip_recv_xbuff(), which checks the
response value against the original allocation size.
Kelvin Mbogo's series ("usb: usbip: fix integer overflow in
usbip_recv_iso()", v2) hardens the receive-side functions themselves;
this patch complements that work by catching the bad value at its
source -- in usbip_pack_ret_submit() before the overwrite -- and
using the tighter per-URB allocation bound rather than the global
USBIP_MAX_ISO_PACKETS limit.
Fix this by checking rpdu->number_of_packets against
urb->number_of_packets in usbip_pack_ret_submit() before the
overwrite. On violation, clamp to zero so that usbip_recv_iso() and
usbip_pad_iso() safely return early. |
| In the Linux kernel, the following vulnerability has been resolved:
fbdev: udlfb: avoid divide-by-zero on FBIOPUT_VSCREENINFO
Much like commit 19f953e74356 ("fbdev: fb_pm2fb: Avoid potential divide
by zero error"), we also need to prevent that same crash from happening
in the udlfb driver as it uses pixclock directly when dividing, which
will crash. |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: rtw88: fix device leak on probe failure
Driver core holds a reference to the USB interface and its parent USB
device while the interface is bound to a driver and there is no need to
take additional references unless the structures are needed after
disconnect.
This driver takes a reference to the USB device during probe but does
not to release it on all probe errors (e.g. when descriptor parsing
fails).
Drop the redundant device reference to fix the leak, reduce cargo
culting, make it easier to spot drivers where an extra reference is
needed, and reduce the risk of further memory leaks. |
| In the Linux kernel, the following vulnerability has been resolved:
staging: sm750fb: fix division by zero in ps_to_hz()
ps_to_hz() is called from hw_sm750_crtc_set_mode() without validating
that pixclock is non-zero. A zero pixclock passed via FBIOPUT_VSCREENINFO
causes a division by zero.
Fix by rejecting zero pixclock in lynxfb_ops_check_var(), consistent
with other framebuffer drivers. |
| In the Linux kernel, the following vulnerability has been resolved:
ALSA: ctxfi: Limit PTP to a single page
Commit 391e69143d0a increased CT_PTP_NUM from 1 to 4 to support 256
playback streams, but the additional pages are not used by the card
correctly. The CT20K2 hardware already has multiple VMEM_PTPAL
registers, but using them separately would require refactoring the
entire virtual memory allocation logic.
ct_vm_map() always uses PTEs in vm->ptp[0].area regardless of
CT_PTP_NUM. On AMD64 systems, a single PTP covers 512 PTEs (2M). When
aggregate memory allocations exceed this limit, ct_vm_map() tries to
access beyond the allocated space and causes a page fault:
BUG: unable to handle page fault for address: ffffd4ae8a10a000
Oops: Oops: 0002 [#1] SMP PTI
RIP: 0010:ct_vm_map+0x17c/0x280 [snd_ctxfi]
Call Trace:
atc_pcm_playback_prepare+0x225/0x3b0
ct_pcm_playback_prepare+0x38/0x60
snd_pcm_do_prepare+0x2f/0x50
snd_pcm_action_single+0x36/0x90
snd_pcm_action_nonatomic+0xbf/0xd0
snd_pcm_ioctl+0x28/0x40
__x64_sys_ioctl+0x97/0xe0
do_syscall_64+0x81/0x610
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Revert CT_PTP_NUM to 1. The 256 SRC_RESOURCE_NUM and playback_count
remain unchanged. |
| In the Linux kernel, the following vulnerability has been resolved:
vfio/xe: Reorganize the init to decouple migration from reset
Attempting to issue reset on VF devices that don't support migration
leads to the following:
BUG: unable to handle page fault for address: 00000000000011f8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 2 UID: 0 PID: 7443 Comm: xe_sriov_flr Tainted: G S U 7.0.0-rc1-lgci-xe-xe-4588-cec43d5c2696af219-nodebug+ #1 PREEMPT(lazy)
Tainted: [S]=CPU_OUT_OF_SPEC, [U]=USER
Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
RIP: 0010:xe_sriov_vfio_wait_flr_done+0xc/0x80 [xe]
Code: ff c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 54 53 <83> bf f8 11 00 00 02 75 61 41 89 f4 85 f6 74 52 48 8b 47 08 48 89
RSP: 0018:ffffc9000f7c39b8 EFLAGS: 00010202
RAX: ffffffffa04d8660 RBX: ffff88813e3e4000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000f7c39c8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888101a48800
R13: ffff88813e3e4150 R14: ffff888130d0d008 R15: ffff88813e3e40d0
FS: 00007877d3d0d940(0000) GS:ffff88890b6d3000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000011f8 CR3: 000000015a762000 CR4: 0000000000f52ef0
PKRU: 55555554
Call Trace:
<TASK>
xe_vfio_pci_reset_done+0x49/0x120 [xe_vfio_pci]
pci_dev_restore+0x3b/0x80
pci_reset_function+0x109/0x140
reset_store+0x5c/0xb0
dev_attr_store+0x17/0x40
sysfs_kf_write+0x72/0x90
kernfs_fop_write_iter+0x161/0x1f0
vfs_write+0x261/0x440
ksys_write+0x69/0xf0
__x64_sys_write+0x19/0x30
x64_sys_call+0x259/0x26e0
do_syscall_64+0xcb/0x1500
? __fput+0x1a2/0x2d0
? fput_close_sync+0x3d/0xa0
? __x64_sys_close+0x3e/0x90
? x64_sys_call+0x1b7c/0x26e0
? do_syscall_64+0x109/0x1500
? __task_pid_nr_ns+0x68/0x100
? __do_sys_getpid+0x1d/0x30
? x64_sys_call+0x10b5/0x26e0
? do_syscall_64+0x109/0x1500
? putname+0x41/0x90
? do_faccessat+0x1e8/0x300
? __x64_sys_access+0x1c/0x30
? x64_sys_call+0x1822/0x26e0
? do_syscall_64+0x109/0x1500
? tick_program_event+0x43/0xa0
? hrtimer_interrupt+0x126/0x260
? irqentry_exit+0xb2/0x710
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7877d5f1c5a4
Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d a5 ea 0e 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
RSP: 002b:00007fff48e5f908 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007877d5f1c5a4
RDX: 0000000000000001 RSI: 00007877d621b0c9 RDI: 0000000000000009
RBP: 0000000000000001 R08: 00005fb49113b010 R09: 0000000000000007
R10: 0000000000000000 R11: 0000000000000202 R12: 00007877d621b0c9
R13: 0000000000000009 R14: 00007fff48e5fac0 R15: 00007fff48e5fac0
</TASK>
This is caused by the fact that some of the xe_vfio_pci_core_device
members needed for handling reset are only initialized as part of
migration init.
Fix the problem by reorganizing the code to decouple VF init from
migration init. |
| In the Linux kernel, the following vulnerability has been resolved:
arm64: mm: Handle invalid large leaf mappings correctly
It has been possible for a long time to mark ptes in the linear map as
invalid. This is done for secretmem, kfence, realm dma memory un/share,
and others, by simply clearing the PTE_VALID bit. But until commit
a166563e7ec37 ("arm64: mm: support large block mapping when
rodata=full") large leaf mappings were never made invalid in this way.
It turns out various parts of the code base are not equipped to handle
invalid large leaf mappings (in the way they are currently encoded) and
I've observed a kernel panic while booting a realm guest on a
BBML2_NOABORT system as a result:
[ 15.432706] software IO TLB: Memory encryption is active and system is using DMA bounce buffers
[ 15.476896] Unable to handle kernel paging request at virtual address ffff000019600000
[ 15.513762] Mem abort info:
[ 15.527245] ESR = 0x0000000096000046
[ 15.548553] EC = 0x25: DABT (current EL), IL = 32 bits
[ 15.572146] SET = 0, FnV = 0
[ 15.592141] EA = 0, S1PTW = 0
[ 15.612694] FSC = 0x06: level 2 translation fault
[ 15.640644] Data abort info:
[ 15.661983] ISV = 0, ISS = 0x00000046, ISS2 = 0x00000000
[ 15.694875] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
[ 15.723740] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 15.755776] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081f3f000
[ 15.800410] [ffff000019600000] pgd=0000000000000000, p4d=180000009ffff403, pud=180000009fffe403, pmd=00e8000199600704
[ 15.855046] Internal error: Oops: 0000000096000046 [#1] SMP
[ 15.886394] Modules linked in:
[ 15.900029] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc4-dirty #4 PREEMPT
[ 15.935258] Hardware name: linux,dummy-virt (DT)
[ 15.955612] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 15.986009] pc : __pi_memcpy_generic+0x128/0x22c
[ 16.006163] lr : swiotlb_bounce+0xf4/0x158
[ 16.024145] sp : ffff80008000b8f0
[ 16.038896] x29: ffff80008000b8f0 x28: 0000000000000000 x27: 0000000000000000
[ 16.069953] x26: ffffb3976d261ba8 x25: 0000000000000000 x24: ffff000019600000
[ 16.100876] x23: 0000000000000001 x22: ffff0000043430d0 x21: 0000000000007ff0
[ 16.131946] x20: 0000000084570010 x19: 0000000000000000 x18: ffff00001ffe3fcc
[ 16.163073] x17: 0000000000000000 x16: 00000000003fffff x15: 646e612065766974
[ 16.194131] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 16.225059] x11: 0000000000000000 x10: 0000000000000010 x9 : 0000000000000018
[ 16.256113] x8 : 0000000000000018 x7 : 0000000000000000 x6 : 0000000000000000
[ 16.287203] x5 : ffff000019607ff0 x4 : ffff000004578000 x3 : ffff000019600000
[ 16.318145] x2 : 0000000000007ff0 x1 : ffff000004570010 x0 : ffff000019600000
[ 16.349071] Call trace:
[ 16.360143] __pi_memcpy_generic+0x128/0x22c (P)
[ 16.380310] swiotlb_tbl_map_single+0x154/0x2b4
[ 16.400282] swiotlb_map+0x5c/0x228
[ 16.415984] dma_map_phys+0x244/0x2b8
[ 16.432199] dma_map_page_attrs+0x44/0x58
[ 16.449782] virtqueue_map_page_attrs+0x38/0x44
[ 16.469596] virtqueue_map_single_attrs+0xc0/0x130
[ 16.490509] virtnet_rq_alloc.isra.0+0xa4/0x1fc
[ 16.510355] try_fill_recv+0x2a4/0x584
[ 16.526989] virtnet_open+0xd4/0x238
[ 16.542775] __dev_open+0x110/0x24c
[ 16.558280] __dev_change_flags+0x194/0x20c
[ 16.576879] netif_change_flags+0x24/0x6c
[ 16.594489] dev_change_flags+0x48/0x7c
[ 16.611462] ip_auto_config+0x258/0x1114
[ 16.628727] do_one_initcall+0x80/0x1c8
[ 16.645590] kernel_init_freeable+0x208/0x2f0
[ 16.664917] kernel_init+0x24/0x1e0
[ 16.680295] ret_from_fork+0x10/0x20
[ 16.696369] Code: 927cec03 cb0e0021 8b0e0042 a9411c26 (a900340c)
[ 16.723106] ---[ end trace 0000000000000000 ]---
[ 16.752866] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 16.792556] Kernel Offset: 0x3396ea200000 from 0xffff8000800000
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
media: vidtv: fix NULL pointer dereference in vidtv_channel_pmt_match_sections
syzbot reported a general protection fault in vidtv_psi_desc_assign [1].
vidtv_psi_pmt_stream_init() can return NULL on memory allocation
failure, but vidtv_channel_pmt_match_sections() does not check for
this. When tail is NULL, the subsequent call to
vidtv_psi_desc_assign(&tail->descriptor, desc) dereferences a NULL
pointer offset, causing a general protection fault.
Add a NULL check after vidtv_psi_pmt_stream_init(). On failure, clean
up the already-allocated stream chain and return.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:vidtv_psi_desc_assign+0x24/0x90 drivers/media/test-drivers/vidtv/vidtv_psi.c:629
Call Trace:
<TASK>
vidtv_channel_pmt_match_sections drivers/media/test-drivers/vidtv/vidtv_channel.c:349 [inline]
vidtv_channel_si_init+0x1445/0x1a50 drivers/media/test-drivers/vidtv/vidtv_channel.c:479
vidtv_mux_init+0x526/0xbe0 drivers/media/test-drivers/vidtv/vidtv_mux.c:519
vidtv_start_streaming drivers/media/test-drivers/vidtv/vidtv_bridge.c:194 [inline]
vidtv_start_feed+0x33e/0x4d0 drivers/media/test-drivers/vidtv/vidtv_bridge.c:239 |
| In the Linux kernel, the following vulnerability has been resolved:
ocfs2: fix possible deadlock between unlink and dio_end_io_write
ocfs2_unlink takes orphan dir inode_lock first and then ip_alloc_sem,
while in ocfs2_dio_end_io_write, it acquires these locks in reverse order.
This creates an ABBA lock ordering violation on lock classes
ocfs2_sysfile_lock_key[ORPHAN_DIR_SYSTEM_INODE] and
ocfs2_file_ip_alloc_sem_key.
Lock Chain #0 (orphan dir inode_lock -> ip_alloc_sem):
ocfs2_unlink
ocfs2_prepare_orphan_dir
ocfs2_lookup_lock_orphan_dir
inode_lock(orphan_dir_inode) <- lock A
__ocfs2_prepare_orphan_dir
ocfs2_prepare_dir_for_insert
ocfs2_extend_dir
ocfs2_expand_inline_dir
down_write(&oi->ip_alloc_sem) <- Lock B
Lock Chain #1 (ip_alloc_sem -> orphan dir inode_lock):
ocfs2_dio_end_io_write
down_write(&oi->ip_alloc_sem) <- Lock B
ocfs2_del_inode_from_orphan()
inode_lock(orphan_dir_inode) <- Lock A
Deadlock Scenario:
CPU0 (unlink) CPU1 (dio_end_io_write)
------ ------
inode_lock(orphan_dir_inode)
down_write(ip_alloc_sem)
down_write(ip_alloc_sem)
inode_lock(orphan_dir_inode)
Since ip_alloc_sem is to protect allocation changes, which is unrelated
with operations in ocfs2_del_inode_from_orphan. So move
ocfs2_del_inode_from_orphan out of ip_alloc_sem to fix the deadlock. |
| In the Linux kernel, the following vulnerability has been resolved:
ocfs2: fix use-after-free in ocfs2_fault() when VM_FAULT_RETRY
filemap_fault() may drop the mmap_lock before returning VM_FAULT_RETRY,
as documented in mm/filemap.c:
"If our return value has VM_FAULT_RETRY set, it's because the mmap_lock
may be dropped before doing I/O or by lock_folio_maybe_drop_mmap()."
When this happens, a concurrent munmap() can call remove_vma() and free
the vm_area_struct via RCU. The saved 'vma' pointer in ocfs2_fault() then
becomes a dangling pointer, and the subsequent trace_ocfs2_fault() call
dereferences it -- a use-after-free.
Fix this by saving ip_blkno as a plain integer before calling
filemap_fault(), and removing vma from the trace event. Since
ip_blkno is copied by value before the lock can be dropped, it
remains valid regardless of what happens to the vma or inode
afterward. |
| In the Linux kernel, the following vulnerability has been resolved:
ocfs2: handle invalid dinode in ocfs2_group_extend
[BUG]
kernel BUG at fs/ocfs2/resize.c:308!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ocfs2_group_extend+0x10aa/0x1ae0 fs/ocfs2/resize.c:308
Code: 8b8520ff ffff83f8 860f8580 030000e8 5cc3c1fe
Call Trace:
...
ocfs2_ioctl+0x175/0x6e0 fs/ocfs2/ioctl.c:869
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
...
[CAUSE]
ocfs2_group_extend() assumes that the global bitmap inode block
returned from ocfs2_inode_lock() has already been validated and
BUG_ONs when the signature is not a dinode. That assumption is too
strong for crafted filesystems because the JBD2-managed buffer path
can bypass structural validation and return an invalid dinode to the
resize ioctl.
[FIX]
Validate the dinode explicitly in ocfs2_group_extend(). If the global
bitmap buffer does not contain a valid dinode, report filesystem
corruption with ocfs2_error() and fail the resize operation instead of
crashing the kernel. |
| In the Linux kernel, the following vulnerability has been resolved:
PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
Disable the delayed work before clearing BAR mappings and doorbells to
avoid running the handler after resources have been torn down.
Unable to handle kernel paging request at virtual address ffff800083f46004
[...]
Internal error: Oops: 0000000096000007 [#1] SMP
[...]
Call trace:
epf_ntb_cmd_handler+0x54/0x200 [pci_epf_vntb] (P)
process_one_work+0x154/0x3b0
worker_thread+0x2c8/0x400
kthread+0x148/0x210
ret_from_fork+0x10/0x20 |
| In the Linux kernel, the following vulnerability has been resolved:
PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
epf_ntb_epc_destroy() duplicates the teardown that the caller is
supposed to perform later. This leads to an oops when .allow_link fails
or when .drop_link is performed. The following is an example oops of the
former case:
Unable to handle kernel paging request at virtual address dead000000000108
[...]
[dead000000000108] address between user and kernel address ranges
Internal error: Oops: 0000000096000044 [#1] SMP
[...]
Call trace:
pci_epc_remove_epf+0x78/0xe0 (P)
pci_primary_epc_epf_link+0x88/0xa8
configfs_symlink+0x1f4/0x5a0
vfs_symlink+0x134/0x1d8
do_symlinkat+0x88/0x138
__arm64_sys_symlinkat+0x74/0xe0
[...]
Remove the helper, and drop pci_epc_put(). EPC device refcounting is
tied to the configfs EPC group lifetime, and pci_epc_put() in the
.drop_link path is sufficient. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Reject attempts to sync VMSA of an already-launched/encrypted vCPU
Reject synchronizing vCPU state to its associated VMSA if the vCPU has
already been launched, i.e. if the VMSA has already been encrypted. On a
host with SNP enabled, accessing guest-private memory generates an RMP #PF
and panics the host.
BUG: unable to handle page fault for address: ff1276cbfdf36000
#PF: supervisor write access in kernel mode
#PF: error_code(0x80000003) - RMP violation
PGD 5a31801067 P4D 5a31802067 PUD 40ccfb5063 PMD 40e5954063 PTE 80000040fdf36163
SEV-SNP: PFN 0x40fdf36, RMP entry: [0x6010fffffffff001 - 0x000000000000001f]
Oops: Oops: 0003 [#1] SMP NOPTI
CPU: 33 UID: 0 PID: 996180 Comm: qemu-system-x86 Tainted: G OE
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. PowerEdge R7625/0H1TJT, BIOS 1.5.8 07/21/2023
RIP: 0010:sev_es_sync_vmsa+0x54/0x4c0 [kvm_amd]
Call Trace:
<TASK>
snp_launch_update_vmsa+0x19d/0x290 [kvm_amd]
snp_launch_finish+0xb6/0x380 [kvm_amd]
sev_mem_enc_ioctl+0x14e/0x720 [kvm_amd]
kvm_arch_vm_ioctl+0x837/0xcf0 [kvm]
kvm_vm_ioctl+0x3fd/0xcc0 [kvm]
__x64_sys_ioctl+0xa3/0x100
x64_sys_call+0xfe0/0x2350
do_syscall_64+0x81/0x10f0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7ffff673287d
</TASK>
Note, the KVM flaw has been present since commit ad73109ae7ec ("KVM: SVM:
Provide support to launch and run an SEV-ES guest"), but has only been
actively dangerous for the host since SNP support was added. With SEV-ES,
KVM would "just" clobber guest state, which is totally fine from a host
kernel perspective since userspace can clobber guest state any time before
sev_launch_update_vmsa(). |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Protect *all* of sev_mem_enc_register_region() with kvm->lock
Take and hold kvm->lock for before checking sev_guest() in
sev_mem_enc_register_region(), as sev_guest() isn't stable unless kvm->lock
is held (or KVM can guarantee KVM_SEV_INIT{2} has completed and can't
rollack state). If KVM_SEV_INIT{2} fails, KVM can end up trying to add to
a not-yet-initialized sev->regions_list, e.g. triggering a #GP
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 110 UID: 0 PID: 72717 Comm: syz.15.11462 Tainted: G U W O 6.16.0-smp-DEV #1 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024
RIP: 0010:sev_mem_enc_register_region+0x3f0/0x4f0 ../include/linux/list.h:83
Code: <41> 80 3c 04 00 74 08 4c 89 ff e8 f1 c7 a2 00 49 39 ed 0f 84 c6 00
RSP: 0018:ffff88838647fbb8 EFLAGS: 00010256
RAX: dffffc0000000000 RBX: 1ffff92015cf1e0b RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff888367870000
RBP: ffffc900ae78f050 R08: ffffea000d9e0007 R09: 1ffffd4001b3c000
R10: dffffc0000000000 R11: fffff94001b3c001 R12: 0000000000000000
R13: ffff8982ab0bde00 R14: ffffc900ae78f058 R15: 0000000000000000
FS: 00007f34e9dc66c0(0000) GS:ffff89ee64d33000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe180adef98 CR3: 000000047210e000 CR4: 0000000000350ef0
Call Trace:
<TASK>
kvm_arch_vm_ioctl+0xa72/0x1240 ../arch/x86/kvm/x86.c:7371
kvm_vm_ioctl+0x649/0x990 ../virt/kvm/kvm_main.c:5363
__se_sys_ioctl+0x101/0x170 ../fs/ioctl.c:51
do_syscall_x64 ../arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x6f/0x1f0 ../arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f34e9f7e9a9
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f34e9dc6038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f34ea1a6080 RCX: 00007f34e9f7e9a9
RDX: 0000200000000280 RSI: 000000008010aebb RDI: 0000000000000007
RBP: 00007f34ea000d69 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f34ea1a6080 R15: 00007ffce77197a8
</TASK>
with a syzlang reproducer that looks like:
syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000040)={0x0, &(0x7f0000000180)=ANY=[], 0x70}) (async)
syz_kvm_add_vcpu$x86(0x0, &(0x7f0000000080)={0x0, &(0x7f0000000180)=ANY=[@ANYBLOB="..."], 0x4f}) (async)
r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000200), 0x0, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
r2 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000240), 0x0, 0x0)
r3 = ioctl$KVM_CREATE_VM(r2, 0xae01, 0x0)
ioctl$KVM_SET_CLOCK(r3, 0xc008aeba, &(0x7f0000000040)={0x1, 0x8, 0x0, 0x5625e9b0}) (async)
ioctl$KVM_SET_PIT2(r3, 0x8010aebb, &(0x7f0000000280)={[...], 0x5}) (async)
ioctl$KVM_SET_PIT2(r1, 0x4070aea0, 0x0) (async)
r4 = ioctl$KVM_CREATE_VM(0xffffffffffffffff, 0xae01, 0x0)
openat$kvm(0xffffffffffffff9c, 0x0, 0x0, 0x0) (async)
ioctl$KVM_SET_USER_MEMORY_REGION(r4, 0x4020ae46, &(0x7f0000000400)={0x0, 0x0, 0x0, 0x2000, &(0x7f0000001000/0x2000)=nil}) (async)
r5 = ioctl$KVM_CREATE_VCPU(r4, 0xae41, 0x2)
close(r0) (async)
openat$kvm(0xffffffffffffff9c, &(0x7f0000000000), 0x8000, 0x0) (async)
ioctl$KVM_SET_GUEST_DEBUG(r5, 0x4048ae9b, &(0x7f0000000300)={0x4376ea830d46549b, 0x0, [0x46, 0x0, 0x0, 0x0, 0x0, 0x1000]}) (async)
ioctl$KVM_RUN(r5, 0xae80, 0x0)
Opportunistically use guard() to avoid having to define a new error label
and goto usage. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Lock all vCPUs when synchronzing VMSAs for SNP launch finish
Lock all vCPUs when synchronizing and encrypting VMSAs for SNP guests, as
allowing userspace to manipulate and/or run a vCPU while its state is being
synchronized would at best corrupt vCPU state, and at worst crash the host
kernel.
Opportunistically assert that vcpu->mutex is held when synchronizing its
VMSA (the SEV-ES path already locks vCPUs). |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: SEV: Drop WARN on large size for KVM_MEMORY_ENCRYPT_REG_REGION
Drop the WARN in sev_pin_memory() on npages overflowing an int, as the
WARN is comically trivially to trigger from userspace, e.g. by doing:
struct kvm_enc_region range = {
.addr = 0,
.size = -1ul,
};
__vm_ioctl(vm, KVM_MEMORY_ENCRYPT_REG_REGION, &range);
Note, the checks in sev_mem_enc_register_region() that presumably exist to
verify the incoming address+size are completely worthless, as both "addr"
and "size" are u64s and SEV is 64-bit only, i.e. they _can't_ be greater
than ULONG_MAX. That wart will be cleaned up in the near future.
if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
return -EINVAL;
Opportunistically add a comment to explain why the code calculates the
number of pages the "hard" way, e.g. instead of just shifting @ulen. |
| In the Linux kernel, the following vulnerability has been resolved:
mm: call ->free_folio() directly in folio_unmap_invalidate()
We can only call filemap_free_folio() if we have a reference to (or hold a
lock on) the mapping. Otherwise, we've already removed the folio from the
mapping so it no longer pins the mapping and the mapping can be removed,
causing a use-after-free when accessing mapping->a_ops.
Follow the same pattern as __remove_mapping() and load the free_folio
function pointer before dropping the lock on the mapping. That lets us
make filemap_free_folio() static as this was the only caller outside
filemap.c. |
| In the Linux kernel, the following vulnerability has been resolved:
KVM: x86: Use scratch field in MMIO fragment to hold small write values
When exiting to userspace to service an emulated MMIO write, copy the
to-be-written value to a scratch field in the MMIO fragment if the size
of the data payload is 8 bytes or less, i.e. can fit in a single chunk,
instead of pointing the fragment directly at the source value.
This fixes a class of use-after-free bugs that occur when the emulator
initiates a write using an on-stack, local variable as the source, the
write splits a page boundary, *and* both pages are MMIO pages. Because
KVM's ABI only allows for physically contiguous MMIO requests, accesses
that split MMIO pages are separated into two fragments, and are sent to
userspace one at a time. When KVM attempts to complete userspace MMIO in
response to KVM_RUN after the first fragment, KVM will detect the second
fragment and generate a second userspace exit, and reference the on-stack
variable.
The issue is most visible if the second KVM_RUN is performed by a separate
task, in which case the stack of the initiating task can show up as truly
freed data.
==================================================================
BUG: KASAN: use-after-free in complete_emulated_mmio+0x305/0x420
Read of size 1 at addr ffff888009c378d1 by task syz-executor417/984
CPU: 1 PID: 984 Comm: syz-executor417 Not tainted 5.10.0-182.0.0.95.h2627.eulerosv2r13.x86_64 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace:
dump_stack+0xbe/0xfd
print_address_description.constprop.0+0x19/0x170
__kasan_report.cold+0x6c/0x84
kasan_report+0x3a/0x50
check_memory_region+0xfd/0x1f0
memcpy+0x20/0x60
complete_emulated_mmio+0x305/0x420
kvm_arch_vcpu_ioctl_run+0x63f/0x6d0
kvm_vcpu_ioctl+0x413/0xb20
__se_sys_ioctl+0x111/0x160
do_syscall_64+0x30/0x40
entry_SYSCALL_64_after_hwframe+0x67/0xd1
RIP: 0033:0x42477d
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faa8e6890e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004d7338 RCX: 000000000042477d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
RBP: 00000000004d7330 R08: 00007fff28d546df R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d733c
R13: 0000000000000000 R14: 000000000040a200 R15: 00007fff28d54720
The buggy address belongs to the page:
page:0000000029f6a428 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9c37
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0000000 0000000000000000 ffffea0000270dc8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888009c37780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888009c37880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff888009c37900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
The bug can also be reproduced with a targeted KVM-Unit-Test by hacking
KVM to fill a large on-stack variable in complete_emulated_mmio(), i.e. by
overwrite the data value with garbage.
Limit the use of the scratch fields to 8-byte or smaller accesses, and to
just writes, as larger accesses and reads are not affected thanks to
implementation details in the emulator, but add a sanity check to ensure
those details don't change in the future. Specifically, KVM never uses
on-stack variables for accesses larger that 8 bytes, e.g. uses an operand
in the emulator context, and *al
---truncated--- |