NVIDIA
Fast, but it supports few models, and for some reason the output is deterministic even with top p set to 1. Probably a bug?
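For reference, here is a minimal sketch of why top p = 1 would normally be expected to give varied output. It is plain Python over a toy distribution, not TensorRT-LLM's sampler; `top_p_sample` and the probabilities are hypothetical names and values made up for illustration.

```python
import random


def top_p_sample(probs, top_p, rng):
    """Sample one token index using nucleus (top-p) filtering."""
    # Rank token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # Stop once the kept tokens cover at least top_p of the mass.
        if cumulative >= top_p:
            break
    # Draw from the kept tokens in proportion to their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


probs = [0.5, 0.3, 0.2]  # toy next-token distribution (assumed)
rng = random.Random(0)

# top_p = 1.0 keeps every token, so repeated draws should vary.
samples = {top_p_sample(probs, 1.0, rng) for _ in range(200)}
# A very small top_p keeps only the most likely token (greedy-like).
greedy = {top_p_sample(probs, 0.1, rng) for _ in range(200)}
```

In this sketch `samples` ends up containing several distinct indices while `greedy` contains only index 0. So if an engine still returns identical text at top p = 1, some other setting (a top-k limit or a fixed seed, for example) may be narrowing the choice; that is a guess and is not confirmed from the TensorRT-LLM docs.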
Sampling options
API Reference — tensorrt_llm documentation
model (str or Path) – The model name or a local model directory.
Note that if the value could be both a model name or a local model directory,
the local model directory will be prioritized.
https://nvidia.github.io/TensorRT-LLM/llm-api/reference.html#tensorrt_llm.llmapi.SamplingParams

Seonglae Cho