[LTS 9.6] CVE-2026-31402, CVE-2026-23066, CVE-2026-23097, CVE-2026-23171 #1248

Open: pvts-mat wants to merge 4 commits into ctrliq:ciqlts9_6 from pvts-mat:CVE-batch-32_ciqlts9_6

pvts-mat commented May 18, 2026

[LTS 9.6]

CVE-2026-31402 VULN-180165
CVE-2026-23066 VULN-175573
CVE-2026-23097 VULN-175621
CVE-2026-23171 VULN-176288

Commits

CVE-2026-31402

nfsd: fix heap overflow in NFSv4.0 LOCK replay cache

jira VULN-180165
cve CVE-2026-31402
commit-author Jeff Layton <jlayton@kernel.org>
commit 5133b61aaf437e5f25b1b396b14242a6bb0508e2
upstream-diff Used `post_err_offset' instead of `op_status_offset +
  XDR_UNIT' in the `read_bytes_from_xdr_buf()' call, as the LTS 9.6
  version is missing ef3675b45bcb6c17cabbbde620c6cea52ffb21ac ("NFSD:
  Encode COMPOUND operation status on page boundaries")

CVE-2026-23066

rxrpc: Fix recvmsg() unconditional requeue

jira VULN-175573
cve CVE-2026-23066
commit-author David Howells <dhowells@redhat.com>
commit 2c28769a51deb6022d7fbd499987e237a01dd63a
upstream-diff Used linux-6.12.y backport
  cf969bddd6e69c5777fa89dc88402204e72f312a as the basis, which, unlike
  the upstream, produces no conflicts in net/rxrpc/recvmsg.c and only
  trivial conflicts in include/trace/events/rxrpc.h. The contentious
  commit is the non-backported a2ea9a9072607c2fd6442bd1ffb4dbdbf882aed7
  ("Use irq-disabling spinlocks between app and I/O thread"), which
  changes spin_lock()/spin_unlock() pairs to
  spin_lock_irq()/spin_unlock_irq(). Spin locks are used in the CVE fix
  which makes the upstream version incompatible. Linux 6.12 doesn't have
  this commit backported either which makes it better suited for LTS
  9.6. Conflicts left in include/trace/events/rxrpc.h are just a matter
  of putting the newly defined trace points in a correct place on the
  alphabetically-ordered list.

Commit a2ea9a9 was considered as a prerequisite for the backport, but it is quite extensive (14 files modified), requires considerable additional work (conflicts in 4 files), and introduces CVE-2025-38525, which would then have to be fixed as well.

CVE-2026-23097

migrate: correct lock ordering for hugetlb file folios

jira VULN-175621
cve CVE-2026-23097
commit-author Matthew Wilcox (Oracle) <willy@infradead.org>
commit b7880cb166ab62c2409046b2347261abf701530e
upstream-diff Changes in the `remove_migration_ptes()' function call:
  1. Compared `rc' with `MIGRATEPAGE_SUCCESS' as it was originally,
     instead of checking if it's zero - the success code simplification
     was done in the non-backported commit
     fb49a4425cfa163faccd91f913773d3401d3a7d4 ("treewide: remove
     MIGRATEPAGE_SUCCESS").
  2. Used `false' instead of `0' in the third argument in case `ttu' is
     zero. The LTS 9.6 version of `remove_migration_ptes()' accepts
     booleans as the last argument - this was changed in the
     non-backported commit b1f202060afeb7fcb98473929d26fd3d2093b067 ("mm:
     remap unused subpages to shared zeropage when splitting isolated
     thp"). For the same reason used `true' instead of `RMP_LOCKED' in
     case it's non-zero. Inside the LTS 9.6 version of
     `remove_migration_ptes()' it narrows down to the
     `rmap_walk_locked(dst, &rwc)' call, just like in the upstream.

The solution is basically the same as centos9 7db3700, except for formatting.
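
The two argument adaptations listed above can be illustrated with a small userspace sketch. It is not the kernel code: `folio_model`, `pick_target()`, `upstream_flags_arg()` and `lts_locked_arg()` are hypothetical names, and the numeric values given to `MIGRATEPAGE_SUCCESS` and `RMP_LOCKED` are assumptions made only so that the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, for illustration only. */
#define MIGRATEPAGE_SUCCESS 0
#define RMP_LOCKED (1 << 0)

/* Hypothetical stand-in for struct folio. */
struct folio_model { int id; };

/*
 * LTS 9.6 style target selection: compare rc against MIGRATEPAGE_SUCCESS
 * explicitly instead of testing it for zero.
 */
static struct folio_model *pick_target(int rc, struct folio_model *src,
                                       struct folio_model *dst)
{
    return rc == MIGRATEPAGE_SUCCESS ? dst : src;
}

/* Upstream style: the last argument is a flags word. */
static int upstream_flags_arg(unsigned int ttu)
{
    return ttu ? RMP_LOCKED : 0;
}

/* LTS 9.6 style: the last argument is a boolean "locked". */
static bool lts_locked_arg(unsigned int ttu)
{
    bool locked = ttu ? true : false;

    /* Both forms must request the locked rmap walk in the same cases. */
    assert(locked == ((upstream_flags_arg(ttu) & RMP_LOCKED) != 0));
    return locked;
}
```

Under these assumptions, any non-zero `ttu` yields `true` in the LTS form and `RMP_LOCKED` in the upstream form, so both variants select the locked walk in the same situations.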

CVE-2026-23171

bonding: fix use-after-free due to enslave fail after slave array update

jira VULN-176288
cve CVE-2026-23171
commit-author Nikolay Aleksandrov <razor@blackwall.org>
commit e9acda52fd2ee0cdca332f996da7a95c5fd25294
upstream-diff Accounted for the missing commit
  e0caeb24f538c3c9c94f471882ceeb43d9dc2739 ("net: bonding: update the
  slave array for broadcast mode"): moved the conditional
  `bond_update_slave_arr()' call to the same place regardless of the
  condition differing from the upstream.

The LTS 9.6 version is missing e0caeb2, which changes the condition for the moved code fragment and causes conflicts when applying the upstream fix. That commit is a fix for ce7a381 ("net: bonding: add broadcast_neighbor option for 802.3ad"), which was not backported to LTS 9.6 and should not be backported as part of the CVE-2026-23171 fix, as it introduces a new feature. The fix was therefore adapted manually.

Since neither the mechanism of the bug nor its fix was entirely clear, and no other backport of this fix to a Linux version lacking e0caeb2 was found, a reproduction of the bug was attempted to make sure the proposed fix actually solves the problem.

Bug reproduction

From the fixing commit e9acda5 message:

It is very easy to reproduce the problem with a simple xdp_pass prog:
ip l add bond1 type bond mode balance-xor
ip l set bond1 up
ip l set dev bond1 xdp object xdp_pass.o sec xdp_pass
ip l add dumdum type dummy

Then run in parallel:
while :; do ip l set dumdum master bond1 1>/dev/null 2>&1; done;
mausezahn bond1 -a own -b rand -A rand -B 1.1.1.1 -c 0 -t tcp "dp=1-1023, flags=syn"

Config

To reproduce the issue, the KASAN option was enabled in ciqlts9_6. However, merely setting

CONFIG_KASAN=y

in the default config kernel-x86_64-rhel.config resulted in a kernel that crashed on boot. Instead, the kernel-x86_64-debug-rhel.config configuration was used, which has KASAN enabled and nevertheless booted fine, for an undetermined reason.

kasan-crashing.log

The "xdp_pass prog"

The xdp_pass prog mentioned in the commit message was assumed to be a minimal XDP eBPF program whose behavior is just to return XDP_PASS:

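#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* The section name must match the "sec xdp_pass" argument given to ip(8). */
SEC("xdp_pass")
int xdp_prog(struct xdp_md *ctx)
{
    /* Hand every packet to the regular network stack unchanged. */
    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";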

Compilation:

clang -I/usr/src/kernels/5.14.0-611.55.1.el9_7.x86_64+debug/tools/bpf/resolve_btfids/libbpf/include -O2 -g -target bpf -c xdp_pass.c -o xdp_pass.o

The bpf/bpf_helpers.h header was only available after installing the kernel-debug-devel package:

sudo dnf --color=never install kernel-debug-devel.x86_64 -y

Bug reproduction on ciqlts9_6

Commands from the commit message were used to reproduce the bug, namely

ip l add bond1 type bond mode balance-xor
ip l set bond1 up
ip l set dev bond1 xdp object xdp_pass.o sec xdp_pass
ip l add dumdum type dummy
while :; do ip l set dumdum master bond1 1>/dev/null 2>&1; done;

in one console and

mausezahn bond1 -a own -b rand -A rand -B 1.1.1.1 -c 0 -t tcp "dp=1-1023, flags=syn"

in another.

The reported KASAN message was not obtained on any run. Instead, the kernel spontaneously rebooted right after the mausezahn command was invoked.

ciqlts9_6--reproduction-console-1.log
ciqlts9_6--reproduction-console-2.log

Patch test

As with the bug reproduction on ciqlts9_6, the patched version was compiled with the kernel-x86_64-debug-rhel.config configuration file. Repeating the same steps resulted in no reboots, crashes or hangs.

ciqlts9_6-patched--reproduction-console-1.log
ciqlts9_6-patched--reproduction-console-2.log

kABI check: passed

[0/1] kabi_check_kernel	Check ABI of kernel [ciqlts9_6-CVE-batch-32]	_kabi_check_kernel__x86_64--test--ciqlts9_6-CVE-batch-32
+ dist_git_version=el-9.6
+ local_version=ciqlts9_6-CVE-batch-32
+ arch=x86_64
+ user=pvts
+ buildmachine=x86_64--build--ciqlts9_6
+ virsh_timeout=600
+ ssh_daemon_wait=20
+ src_dir=/mnt/code/kernel-dist-git-el-9.6
+ build_dir=/mnt/build_files/kernel-src-tree-ciqlts9_6-CVE-batch-32
+ sudo chmod +x /data/src/ctrliq-github-haskell/kernel-dist-git-el-9.6/SOURCES/check-kabi
+ ninja-back/virssh.xsh --max 8 --shutdown-on-success --shutdown-on-failure --timeout 600 --ssh-daemon-wait 20 pvts x86_64--build--ciqlts9_6 ''\''/mnt/code/kernel-dist-git-el-9.6/SOURCES/check-kabi'\'' -k '\''/mnt/code/kernel-dist-git-el-9.6/SOURCES/Module.kabi_x86_64'\'' -s '\''/mnt/build_files/kernel-src-tree-ciqlts9_6-CVE-batch-32/Module.symvers'\'''
kABI check passed
+ touch state/kernels/ciqlts9_6-CVE-batch-32/x86_64/kabi_checked

Boot test: passed

boot-test.log

Kselftests: passed relative to the reference

Reference

kselftests--ciqlts9_6--run1.log

Patch

kselftests--ciqlts9_6-CVE-batch-32--run1.log

Comparison

The test results for the reference and the patch are the same.

$ ktests.xsh diff -d kselftests*.log

Column    File
--------  --------------------------------------------
Status0   kselftests--ciqlts9_6--run1.log
Status1   kselftests--ciqlts9_6-CVE-batch-32--run1.log

tests-comparison.txt

pvts-mat added 4 commits May 19, 2026 01:33
jira VULN-180165
cve CVE-2026-31402
commit-author Jeff Layton <jlayton@kernel.org>
commit 5133b61
upstream-diff Used `post_err_offset' instead of `op_status_offset +
  XDR_UNIT' in the `read_bytes_from_xdr_buf()' call, as the LTS 9.6
  version is missing ef3675b ("NFSD:
  Encode COMPOUND operation status on page boundaries")

The NFSv4.0 replay cache uses a fixed 112-byte inline buffer
(rp_ibuf[NFSD4_REPLAY_ISIZE]) to store encoded operation responses.
This size was calculated based on OPEN responses and does not account
for LOCK denied responses, which include the conflicting lock owner as
a variable-length field up to 1024 bytes (NFS4_OPAQUE_LIMIT).

When a LOCK operation is denied due to a conflict with an existing lock
that has a large owner, nfsd4_encode_operation() copies the full encoded
response into the undersized replay buffer via read_bytes_from_xdr_buf()
with no bounds check. This results in a slab-out-of-bounds write of up
to 944 bytes past the end of the buffer, corrupting adjacent heap memory.

This can be triggered remotely by an unauthenticated attacker with two
cooperating NFSv4.0 clients: one sets a lock with a large owner string,
then the other requests a conflicting lock to provoke the denial.

We could fix this by increasing NFSD4_REPLAY_ISIZE to allow for a full
opaque, but that would increase the size of every stateowner, when most
lockowners are not that large.

Instead, fix this by checking the encoded response length against
NFSD4_REPLAY_ISIZE before copying into the replay buffer. If the
response is too large, set rp_buflen to 0 to skip caching the replay
payload. The status is still cached, and the client already received the
correct response on the original request.

Fixes: 1da177e ("Linux-2.6.12-rc2")
	Cc: stable@kernel.org
	Reported-by: Nicholas Carlini <npc@anthropic.com>
	Tested-by: Nicholas Carlini <npc@anthropic.com>
	Signed-off-by: Jeff Layton <jlayton@kernel.org>
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit 5133b61)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
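
The bounds check described in the commit message above can be modeled in userspace as follows. This is a minimal sketch, not the nfsd code: `replay_model` and `replay_cache_store()` are hypothetical names, and only the 112-byte buffer size is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the 112-byte NFSD4_REPLAY_ISIZE mentioned in the commit message. */
#define REPLAY_ISIZE 112

/* Hypothetical, simplified stand-in for the kernel's replay structure. */
struct replay_model {
    int status;                        /* cached operation status */
    size_t buflen;                     /* number of payload bytes cached */
    unsigned char ibuf[REPLAY_ISIZE];  /* fixed-size inline buffer */
};

/*
 * Cache an encoded response. The status is always recorded; the payload is
 * copied only when it fits. Otherwise buflen is set to 0, so nothing is
 * written past the end of ibuf.
 */
static void replay_cache_store(struct replay_model *rp, int status,
                               const unsigned char *resp, size_t len)
{
    rp->status = status;
    if (len <= sizeof(rp->ibuf)) {
        memcpy(rp->ibuf, resp, len);
        rp->buflen = len;
    } else {
        rp->buflen = 0;  /* too large: skip caching the payload */
    }
    assert(rp->buflen <= sizeof(rp->ibuf));
}
```

With a 1056-byte response (112 bytes plus the 944-byte overrun quoted above), the model keeps the status but caches no payload, which is the behavior the commit message describes for oversized LOCK denied responses.
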
jira VULN-175573
cve CVE-2026-23066
commit-author David Howells <dhowells@redhat.com>
commit 2c28769
upstream-diff Used linux-6.12.y backport
  cf969bd as the basis, which, unlike
  the upstream, produces no conflicts in net/rxrpc/recvmsg.c and only
  trivial conflicts in include/trace/events/rxrpc.h. The contentious
  commit is the non-backported a2ea9a9
  ("Use irq-disabling spinlocks between app and I/O thread"), which
  changes spin_lock()/spin_unlock() pairs to
  spin_lock_irq()/spin_unlock_irq(). Spin locks are used in the CVE fix
  which makes the upstream version incompatible. Linux 6.12 doesn't have
  this commit backported either which makes it better suited for LTS
  9.6. Conflicts left in include/trace/events/rxrpc.h are just a matter
  of putting the newly defined trace points in a correct place on the
  alphabetically-ordered list.

If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at
the front of the recvmsg queue already has its mutex locked, it requeues
the call - whether or not the call is already queued.  The call may be on
the queue because MSG_PEEK was also passed and so the call was not dequeued
or because the I/O thread requeued it.

The unconditional requeue may then corrupt the recvmsg queue, leading to
things like UAFs or refcount underruns.

Fix this by only requeuing the call if it isn't already on the queue - and
moving it to the front if it is already queued.  If we don't queue it, we
have to put the ref we obtained by dequeuing it.

Also, MSG_PEEK doesn't dequeue the call so shouldn't call
rxrpc_notify_socket() for the call if we didn't use up all the data on the
queue, so fix that also.

Fixes: 540b1c4 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
	Reported-by: Faith <faith@zellic.io>
	Reported-by: Pumpkin Chang <pumpkin@devco.re>
	Signed-off-by: David Howells <dhowells@redhat.com>
	Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: Nir Ohfeld <niro@wiz.io>
cc: Willy Tarreau <w@1wt.eu>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit cf969bd)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
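
The conditional requeue described in the commit message above can be sketched with a small userspace list model. All names here (`node`, `call_model`, `requeue_call()`) are hypothetical, and the reference counting is reduced to a plain integer; the real rxrpc code uses kernel list and refcount primitives under a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly-linked list, loosely modeled on list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static bool node_unlinked(const struct node *n) { return n->next == n; }

static void node_del_init(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

static void node_add_front(struct node *n, struct node *head)
{
    /* Adding an already linked node is what corrupts the queue. */
    assert(node_unlinked(n));
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

static size_t queue_len(const struct node *head)
{
    size_t n = 0;
    for (const struct node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}

/* Hypothetical stand-in for a call: a queue link plus a reference count. */
struct call_model {
    struct node link;
    int refs;
};

/*
 * Put the call back at the front of the queue. If it was not queued, the
 * queue takes over the caller's reference. If it was already queued, it is
 * only moved to the front and the caller's extra reference is dropped.
 */
static void requeue_call(struct call_model *call, struct node *queue)
{
    if (node_unlinked(&call->link)) {
        node_add_front(&call->link, queue);
    } else {
        node_del_init(&call->link);
        node_add_front(&call->link, queue);
        call->refs--;
    }
}
```

In this model an unconditional `node_add_front()` on a call that is still linked would trip the assertion, which corresponds to the queue corruption the fix avoids.
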
jira VULN-175621
cve CVE-2026-23097
commit-author Matthew Wilcox (Oracle) <willy@infradead.org>
commit b7880cb
upstream-diff Changes in the `remove_migration_ptes()' function call:
  1. Compared `rc' with `MIGRATEPAGE_SUCCESS' as it was originally,
     instead of checking if it's zero - the success code simplification
     was done in the non-backported commit
     fb49a44 ("treewide: remove
     MIGRATEPAGE_SUCCESS").
  2. Used `false' instead of `0' in the third argument in case `ttu' is
     zero. The LTS 9.6 version of `remove_migration_ptes()' accepts
     booleans as the last argument - this was changed in the
     non-backported commit b1f2020 ("mm:
     remap unused subpages to shared zeropage when splitting isolated
     thp"). For the same reason used `true' instead of `RMP_LOCKED' in
     case it's non-zero. Inside the LTS 9.6 version of
     `remove_migration_ptes()' it narrows down to the
     `rmap_walk_locked(dst, &rwc)' call, just like in the upstream.

Syzbot has found a deadlock (analyzed by Lance Yang):

1) Task (5749): Holds folio_lock, then tries to acquire i_mmap_rwsem(read lock).
2) Task (5754): Holds i_mmap_rwsem(write lock), then tries to acquire
folio_lock.

migrate_pages()
  -> migrate_hugetlbs()
    -> unmap_and_move_huge_page()     <- Takes folio_lock!
      -> remove_migration_ptes()
        -> __rmap_walk_file()
          -> i_mmap_lock_read()       <- Waits for i_mmap_rwsem(read lock)!

hugetlbfs_fallocate()
  -> hugetlbfs_punch_hole()           <- Takes i_mmap_rwsem(write lock)!
    -> hugetlbfs_zero_partial_page()
     -> filemap_lock_hugetlb_folio()
      -> filemap_lock_folio()
        -> __filemap_get_folio        <- Waits for folio_lock!

The migration path is the one taking locks in the wrong order according to
the documentation at the top of mm/rmap.c.  So expand the scope of the
existing i_mmap_lock to cover the calls to remove_migration_ptes() too.

This is (mostly) how it used to be after commit c0d0381.  That was
removed by 336bf30 for both file & anon hugetlb pages when it should
only have been removed for anon hugetlb pages.

Link: https://lkml.kernel.org/r/20260109041345.3863089-2-willy@infradead.org
	Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Fixes: 336bf30 ("hugetlbfs: fix anon huge page migration race")
	Reported-by: syzbot+2d9c96466c978346b55f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/all/68e9715a.050a0220.1186a4.000d.GAE@google.com
	Debugged-by: Lance Yang <lance.yang@linux.dev>
	Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
	Acked-by: Zi Yan <ziy@nvidia.com>
	Cc: Alistair Popple <apopple@nvidia.com>
	Cc: Byungchul Park <byungchul@sk.com>
	Cc: Gregory Price <gourry@gourry.net>
	Cc: Jann Horn <jannh@google.com>
	Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
	Cc: Liam Howlett <liam.howlett@oracle.com>
	Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
	Cc: Matthew Brost <matthew.brost@intel.com>
	Cc: Rakie Kim <rakie.kim@sk.com>
	Cc: Rik van Riel <riel@surriel.com>
	Cc: Vlastimil Babka <vbabka@suse.cz>
	Cc: Ying Huang <ying.huang@linux.alibaba.com>
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit b7880cb)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
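
The deadlock in the commit message above is an ABBA lock-order inversion between the two call chains. As a rough illustration, the following userspace sketch detects such an inversion from two acquisition sequences. The `lock_id` values and `abba_conflict()` are hypothetical helpers, not kernel interfaces, and the sketch does not model how the fix itself takes the locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the two locks named in the commit message. */
enum lock_id { LOCK_FOLIO, LOCK_I_MMAP_RWSEM };

/* Position of `id` in an acquisition sequence, or -1 if it is not taken. */
static int index_of(const enum lock_id *seq, size_t n, enum lock_id id)
{
    for (size_t i = 0; i < n; i++)
        if (seq[i] == id)
            return (int)i;
    return -1;
}

/*
 * Return true if path `a` takes some lock X before Y while path `b` takes
 * Y before X, the pattern that can deadlock two tasks.
 */
static bool abba_conflict(const enum lock_id *a, size_t na,
                          const enum lock_id *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < na; i++) {
        for (size_t j = i + 1; j < na; j++) {
            int bx = index_of(b, nb, a[i]);
            int by = index_of(b, nb, a[j]);
            if (bx >= 0 && by >= 0 && by < bx)
                return true;
        }
    }
    return false;
}
```

Feeding it the migration path (folio lock, then i_mmap_rwsem) and the hole-punch path (i_mmap_rwsem, then folio lock) reports a conflict, while two paths that both take i_mmap_rwsem first do not.
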
jira VULN-176288
cve CVE-2026-23171
commit-author Nikolay Aleksandrov <razor@blackwall.org>
commit e9acda5
upstream-diff Accounted for the missing commit
  e0caeb2 ("net: bonding: update the
  slave array for broadcast mode"): moved the conditional
  `bond_update_slave_arr()' call to the same place regardless of the
  condition differing from the upstream.

Fix a use-after-free which happens due to enslave failure after the new
slave has been added to the array. Since the new slave can be used for Tx
immediately, we can use it after it has been freed by the enslave error
cleanup path which frees the allocated slave memory. Slave update array is
supposed to be called last when further enslave failures are not expected.
Move it after xdp setup to avoid any problems.

It is very easy to reproduce the problem with a simple xdp_pass prog:
 ip l add bond1 type bond mode balance-xor
 ip l set bond1 up
 ip l set dev bond1 xdp object xdp_pass.o sec xdp_pass
 ip l add dumdum type dummy

Then run in parallel:
 while :; do ip l set dumdum master bond1 1>/dev/null 2>&1; done;
 mausezahn bond1 -a own -b rand -A rand -B 1.1.1.1 -c 0 -t tcp "dp=1-1023, flags=syn"

The crash happens almost immediately:
 [  605.602850] Oops: general protection fault, probably for non-canonical address 0xe0e6fc2460000137: 0000 [#1] SMP KASAN NOPTI
 [  605.602916] KASAN: maybe wild-memory-access in range [0x07380123000009b8-0x07380123000009bf]
 [  605.602946] CPU: 0 UID: 0 PID: 2445 Comm: mausezahn Kdump: loaded Tainted: G    B               6.19.0-rc6+ #21 PREEMPT(voluntary)
 [  605.602979] Tainted: [B]=BAD_PAGE
 [  605.602998] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 [  605.603032] RIP: 0010:netdev_core_pick_tx+0xcd/0x210
 [  605.603063] Code: 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 3e 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 6b 08 49 8d 7d 30 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 25 01 00 00 49 8b 45 30 4c 89 e2 48 89 ee 48 89
 [  605.603111] RSP: 0018:ffff88817b9af348 EFLAGS: 00010213
 [  605.603145] RAX: dffffc0000000000 RBX: ffff88817d28b420 RCX: 0000000000000000
 [  605.603172] RDX: 00e7002460000137 RSI: 0000000000000008 RDI: 07380123000009be
 [  605.603199] RBP: ffff88817b541a00 R08: 0000000000000001 R09: fffffbfff3ed8c0c
 [  605.603226] R10: ffffffff9f6c6067 R11: 0000000000000001 R12: 0000000000000000
 [  605.603253] R13: 073801230000098e R14: ffff88817d28b448 R15: ffff88817b541a84
 [  605.603286] FS:  00007f6570ef67c0(0000) GS:ffff888221dfa000(0000) knlGS:0000000000000000
 [  605.603319] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  605.603343] CR2: 00007f65712fae40 CR3: 000000011371b000 CR4: 0000000000350ef0
 [  605.603373] Call Trace:
 [  605.603392]  <TASK>
 [  605.603410]  __dev_queue_xmit+0x448/0x32a0
 [  605.603434]  ? __pfx_vprintk_emit+0x10/0x10
 [  605.603461]  ? __pfx_vprintk_emit+0x10/0x10
 [  605.603484]  ? __pfx___dev_queue_xmit+0x10/0x10
 [  605.603507]  ? bond_start_xmit+0xbfb/0xc20 [bonding]
 [  605.603546]  ? _printk+0xcb/0x100
 [  605.603566]  ? __pfx__printk+0x10/0x10
 [  605.603589]  ? bond_start_xmit+0xbfb/0xc20 [bonding]
 [  605.603627]  ? add_taint+0x5e/0x70
 [  605.603648]  ? add_taint+0x2a/0x70
 [  605.603670]  ? end_report.cold+0x51/0x75
 [  605.603693]  ? bond_start_xmit+0xbfb/0xc20 [bonding]
 [  605.603731]  bond_start_xmit+0x623/0xc20 [bonding]

Fixes: 9e2ee5c ("net, bonding: Add XDP support to the bonding driver")
	Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
	Reported-by: Chen Zhen <chenzhen126@huawei.com>
Closes: https://lore.kernel.org/netdev/fae17c21-4940-5605-85b2-1d5e17342358@huawei.com/
CC: Jussi Maki <joamaki@gmail.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
	Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20260123120659.571187-1-razor@blackwall.org
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>
(cherry picked from commit e9acda5)
	Signed-off-by: Marcin Wcisło <marcin.wcislo@conclusive.pl>
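
The ordering problem described in the commit message above can be sketched in userspace: a new slave that is published for Tx before the last fallible setup step stays reachable after the error path frees it. This is a simplified model under stated assumptions; `bond_model`, `publish_for_tx()` and `enslave_model()` are hypothetical names, and the `publish_early` switch exists only to contrast the two orderings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SLAVES 4

/* Hypothetical stand-ins for the slave and for the bond's Tx slave array. */
struct slave_model { int id; };

struct bond_model {
    struct slave_model *tx_array[MAX_SLAVES];
    size_t tx_count;
};

/* Make the slave visible to the transmit path. */
static void publish_for_tx(struct bond_model *bond, struct slave_model *s)
{
    assert(bond->tx_count < MAX_SLAVES);
    bond->tx_array[bond->tx_count++] = s;
}

/*
 * Model of enslaving one device. `late_step_fails` simulates a failure in a
 * setup step that runs late (XDP setup in the real bug). With publish_early
 * set, the slave is published before that step, as in the buggy ordering;
 * otherwise it is published last, as in the fix. Returns 0 on success.
 */
static int enslave_model(struct bond_model *bond, bool late_step_fails,
                         bool publish_early)
{
    struct slave_model *s = malloc(sizeof(*s));

    if (!s)
        return -1;
    s->id = 1;

    if (publish_early)
        publish_for_tx(bond, s);

    if (late_step_fails) {
        /* Error cleanup frees the slave; an early publish is now stale. */
        free(s);
        return -1;
    }

    if (!publish_early)
        publish_for_tx(bond, s);
    return 0;
}
```

After a failed enslave, the publish-last ordering leaves `tx_count` at 0, whereas the early ordering leaves one stale entry that a concurrent transmitter could dereference.
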
pvts-mat force-pushed the CVE-batch-32_ciqlts9_6 branch from bd6d584 to 7a58277 May 18, 2026 23:34
pvts-mat marked this pull request as ready for review May 18, 2026 23:34