Multiple vulnerabilities in vLLM

Title	Multiple vulnerabilities in vLLM
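Overview	vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes only part of a tensor to be processed. The output tensor is allocated at full size with torch::empty, which leaves its memory uninitialized, but the dequantize CUDA kernel processes only the truncated number of elements. The unprocessed portion of the output tensor therefore retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, which can result in information disclosure. The vulnerability is fixed in 0.23.1rc0.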
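Expected impact	・All information handled by the software may be disclosed. ・Information handled by the software is not altered. ・The software does not stop.
Countermeasure	Refer to the vendor information and apply the appropriate countermeasures.
Published	June 22, 2026 0:00
Registered	June 26, 2026 11:56
Last updated	June 26, 2026 11:56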
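
The truncation described in the overview can be illustrated with a minimal Python sketch. This is not vLLM code: the tensor dimensions, the helper names `truncate_to_int32` and `toy_dequantize`, and the 8-bit toy width are assumptions chosen for illustration, and the list standing in for GPU memory only mimics an uninitialized torch::empty allocation.

```python
import ctypes


def truncate_to_int32(n: int) -> int:
    """Mimic a C-style narrowing conversion of a 64-bit count to int32."""
    return ctypes.c_int32(n).value


# Hypothetical tensor dimensions whose product exceeds the int32 range.
rows, cols = 65_537, 65_536
full_count = rows * cols                      # count the output buffer is sized for
kernel_count = truncate_to_int32(full_count)  # count a truncating kernel would see


def toy_dequantize(stale_memory: list, n_elements: int, width_bits: int = 8) -> list:
    """Scaled-down model: the element count is truncated to `width_bits` bits."""
    out = list(stale_memory)                  # allocated at full size, never cleared
    processed = n_elements & ((1 << width_bits) - 1)
    for i in range(processed):
        out[i] = 0.0                          # stands in for a dequantized value
    return out


stale = ["other tenant's data"] * 260
result = toy_dequantize(stale, n_elements=260)
leaked = [x for x in result if x == "other tenant's data"]
print(full_count, kernel_count, len(leaked))
```

In the toy model only the first few slots are overwritten, so every remaining slot still holds the stale value. That is the same shape of problem the entry describes for the full-size output tensor.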

CVSS 3.0: High
Score	7.5
Vector	CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
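
The score of 7.5 can be reproduced from the vector string. The sketch below applies the base score equations and metric weights of the CVSS v3.0 specification. The function names `roundup` and `base_score` are illustrative, and the rounding helper is the simple ceiling form, which may be sensitive to floating-point error for some other vectors.

```python
import math

# Metric weights from the CVSS v3.0 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(x: float) -> float:
    """Round up to one decimal place (simple ceiling form)."""
    return math.ceil(x * 10) / 10


def base_score(vector: str) -> float:
    """Compute the CVSS v3.0 base score for a vector of base metrics."""
    parts = vector.split("/")
    metrics = dict(p.split(":") for p in parts[1:])  # skip the "CVSS:3.0" prefix
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"))
```

For this vector the impact sub-score comes only from confidentiality (C:H), which matches the expected impact listed above: disclosure is possible, but there is no modification and no loss of availability.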

Affected systems

vLLM

vLLM 0.5.5 or later and earlier than 0.23.1
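
A version string can be checked against this range with a short standard-library sketch. The helper names `release_tuple` and `in_affected_range` are hypothetical, and the comparison is a deliberate simplification: it looks only at the numeric release segment and ignores pre-release suffixes. As a result 0.23.1rc0 is treated as outside the range, which agrees with the overview's statement that the fix is in 0.23.1rc0, but a pre-release of 0.5.5 would be counted as affected.

```python
import re

LOWER = (0, 5, 5)    # inclusive lower bound from the entry
UPPER = (0, 23, 1)   # exclusive upper bound from the entry


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric release segment, e.g. '0.23.1rc0' -> (0, 23, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def in_affected_range(version: str) -> bool:
    """Return True if the release segment falls in [0.5.5, 0.23.1)."""
    release = release_tuple(version)
    # Pad to three components so that '0.6' compares like '0.6.0'.
    release = release + (0,) * (3 - len(release))
    return LOWER <= release < UPPER


for candidate in ("0.5.4", "0.5.5", "0.23.0", "0.23.1rc0", "0.23.1"):
    print(candidate, in_affected_range(candidate))
```

The installed version could be obtained with `importlib.metadata.version("vllm")` and passed to `in_affected_range`. For an authoritative answer, the vendor advisory remains the reference.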

CVE (Common Vulnerabilities and Exposures)

No	Name	URL
Common Vulnerabilities and Exposures (CVE)
1	CVE-2026-53923	https://www.cve.org/CVERecord?id=CVE-2026-53923
National Vulnerability Database (NVD)
2	CVE-2026-53923	https://nvd.nist.gov/vuln/detail/CVE-2026-53923

CWE (Common Weakness Enumeration)

No	Name	URL
JVNDB
1	CWE-200	https://jvndb.jvn.jp/ja/cwe/CWE-200.html
2	CWE-681	http://cwe.mitre.org/data/definitions/681.html

Vendor information

No	Name	URL
GitHub
1	GGUF dequantize kernel int truncation exposes uninitialized GPU memory in multi-tenant serving Advisory vllm-project/vllm GitHub	https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4

Other

No	Name	URL
Related documents
1	[Security] Fix info disclosure via int32 truncation in GGUF dequantize kernels by jperezdealgaba Pull Request #44971 vllm-project/vllm GitHub	https://github.com/vllm-project/vllm/pull/44971
2	[Security] Fix info disclosure via int32 truncation in GGUF dequantiz… vllm-project/vllm@f219788 GitHub	https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e

Change history

No	Change	Date
1	[June 26, 2026] Published	June 26, 2026 11:56

NVD vulnerability information

CVE-2026-53923

Overview	vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.
Published	June 23, 2026 8:16
Registered	June 27, 2026 4:12
Last updated	June 25, 2026 1:51

Affected software configurations

Configuration 1		From (including)	Up to (including)	From (excluding)	Up to (excluding)
cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*		0.5.5			0.23.1

Related information, countermeasures, and tools

No	URL	refsource	Tags
1	https://github.com/vllm-project/vllm/commit/f219788f91952827132fa4fdf916427cd20d225e	security-advisories@github.com
2	https://github.com/vllm-project/vllm/pull/44971	security-advisories@github.com
3	https://github.com/vllm-project/vllm/security/advisories/GHSA-5jv2-g5wq-cmr4	security-advisories@github.com

Common weakness list

No	CWE	Name	URL
1	CWE-200	Information Exposure	https://cwe.mitre.org/data/definitions/200.html
2	CWE-681	Incorrect Conversion between Numeric Types	https://cwe.mitre.org/data/definitions/681.html