TensorRT-LLM

Creator

Created

2023 Oct 24 4:10

Editor

Edited

2025 Mar 25 17:43

Refs

NVIDIA • Updated 2023 Nov 17 4:36

nvidia

Fast, but it supports few models, and for some reason the output is deterministic even with top p set to 1. Probably a bug?
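
For reference, a minimal sketch of why top p = 1 alone should not make output deterministic: at 1.0 the nucleus covers the whole vocabulary, so sampling still varies. The `sample_token` helper and the toy distribution are hypothetical, not TensorRT-LLM code. One possibility (unverified here) is that another setting, such as a top k of 1, forces greedy decoding regardless of top p.

```python
import random


def sample_token(probs, top_k=0, top_p=1.0, rng=random):
    """Sample a token index after top-k, then top-p (nucleus) filtering.

    Hypothetical helper for illustration only; not the TensorRT-LLM sampler.
    """
    # Rank token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]  # top_k == 1 leaves only the argmax token
    # Keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Draw one token from the kept set, weighted by its original probability.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


probs = [0.5, 0.3, 0.15, 0.05]  # toy next-token distribution
rng = random.Random(0)

# top_p = 1.0 with no top-k limit: the full distribution is sampled.
full = {sample_token(probs, top_p=1.0, rng=rng) for _ in range(200)}

# top_k = 1: only the most likely token survives, whatever top_p is.
greedy = {sample_token(probs, top_k=1, top_p=1.0, rng=rng) for _ in range(200)}

print(sorted(full), sorted(greedy))
```

If the real sampler behaves like the second case, checking the effective top k, temperature, and beam settings would be the first thing to rule out before calling it a bug.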

support-matrix.md

Sampling options

API Reference — tensorrt_llm documentation

model (str or Path) – The model name or a local model directory. Note that if the value could be both a model name or a local model directory, the local model directory will be prioritized.

https://nvidia.github.io/TensorRT-LLM/llm-api/reference.html#tensorrt_llm.llmapi.SamplingParams

Backlinks

FlashInfer Mistral
