CVE-2024-46787
概要

In the Linux kernel, the following vulnerability has been resolved:

userfaultfd: fix checks for huge PMDs

Patch series "userfaultfd: fix races around pmd_trans_huge() check", v2.

The pmd_trans_huge() code in mfill_atomic() is wrong in three different
ways depending on kernel version:

1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit
the right two race windows) - I've tested this in a kernel build with
some extra mdelay() calls. See the commit message for a description
of the race scenario.
On older kernels (before 6.5), I think the same bug can even
theoretically lead to accessing transhuge page contents as a page table
if you hit the right 5 narrow race windows (I haven't tested this case).
2. As pointed out by Qi Zheng, pmd_trans_huge() is not sufficient for
detecting PMDs that don't point to page tables.
On older kernels (before 6.5), you'd just have to win a single fairly
wide race to hit this.
I've tested this on 6.1 stable by racing migration (with a mdelay()
patched into try_to_migrate()) against UFFDIO_ZEROPAGE - on my x86
VM, that causes a kernel oops in ptlock_ptr().
3. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed
to yank page tables out from under us (though I haven't tested that),
so I think the BUG_ON() checks in mfill_atomic() are just wrong.

I decided to write two separate fixes for these (one fix for bugs 1+2, one
fix for bug 3), so that the first fix can be backported to kernels
affected by bugs 1+2.

This patch (of 2):

This fixes two issues.

I discovered that the following race can occur:

mfill_atomic other thread
============ ============
<zap PMD>
pmdp_get_lockless() [reads none pmd]
<bail if trans_huge>
<if none:>
<pagefault creates transhuge zeropage>
__pte_alloc [no-op]
<zap PMD>
<bail if pmd_trans_huge(*dst_pmd)>
BUG_ON(pmd_none(*dst_pmd))

I have experimentally verified this in a kernel with extra mdelay() calls;
the BUG_ON(pmd_none(*dst_pmd)) triggers.

On kernels newer than commit 0d940a9b270b ("mm/pgtable: allow
pte_offset_map[_lock]() to fail"), this can't lead to anything worse than
a BUG_ON(), since the page table access helpers are actually designed to
deal with page tables concurrently disappearing; but on older kernels
(<=6.4), I think we could probably theoretically race past the two
BUG_ON() checks and end up treating a hugepage as a page table.

The second issue is that, as Qi Zheng pointed out, there are other types
of huge PMDs that pmd_trans_huge() can't catch: devmap PMDs and swap PMDs
(in particular, migration PMDs).

On <=6.4, this is worse than the first issue: If mfill_atomic() runs on a
PMD that contains a migration entry (which just requires winning a single,
fairly wide race), it will pass the PMD to pte_offset_map_lock(), which
assumes that the PMD points to a page table.

Breakage follows: First, the kernel tries to take the PTE lock (which will
crash or maybe worse if there is no "struct page" for the address bits in
the migration entry PMD - I think at least on X86 there usually is no
corresponding "struct page" thanks to the PTE inversion mitigation, amd64
looks different).

If that didn't crash, the kernel would next try to write a PTE into what
it wrongly thinks is a page table.

As part of fixing these issues, get rid of the check for pmd_trans_huge()
before __pte_alloc() - that's redundant, we're going to have to check for
that after the __pte_alloc() anyway.

Backport note: pmdp_get_lockless() is pmd_read_atomic() in older kernels.

公表日 2024年9月18日17:15
登録日 2024年9月18日20:00
最終更新日 2024年11月21日0:33
CVSS3.1 : MEDIUM
スコア 4.7
ベクター CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:N/A:H
攻撃元区分(AV) ローカル
攻撃条件の複雑さ(AC)
攻撃に必要な特権レベル(PR)
利用者の関与(UI) 不要
影響の想定範囲(S) 変更なし
機密性への影響(C) なし
完全性への影響(I) なし
可用性への影響(A)
影響を受けるソフトウェアの構成
構成1 以上 以下 より上 未満
cpe:2.3:o:linux:linux_kernel:6.11:rc1:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.11:rc4:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.11:rc3:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.11:rc2:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.11:rc5:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:6.11:rc6:*:*:*:*:*:*
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* 6.7 6.10.10
cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* 4.3 6.6.51
関連情報、対策とツール
共通脆弱性一覧

JVN脆弱性情報
Linux の Linux Kernel における脆弱性
タイトル Linux の Linux Kernel における脆弱性
概要

Linux の Linux Kernel には、不特定の脆弱性が存在します。

想定される影響 サービス運用妨害 (DoS) 状態にされる可能性があります。 
対策

ベンダより正式な対策が公開されています。ベンダ情報を参照して適切な対策を実施してください。

公表日 2024年9月1日0:00
登録日 2024年11月21日15:13
最終更新日 2024年11月21日15:13
影響を受けるシステム
Linux
Linux Kernel 4.3 以上 6.6.51 未満
Linux Kernel 6.11
Linux Kernel 6.7 以上 6.10.10 未満
CVE (情報セキュリティ 共通脆弱性識別子)
CWE (共通脆弱性タイプ一覧)
ベンダー情報
変更履歴
No 変更内容 変更日
1 [2024年11月21日]   掲載 2024年11月21日10:03