Tesla P40 (GPU, 24 GB)
PERFORMANCE OVERVIEW

Model                            Quantization   Parameters  Prompt Speed  Generation Speed  Time to First Token  LocalScore
LLaMA v2                         Q4_0           6.7B        833 tokens/s  40.9 tokens/s     1.52 sec             282
Qwen2.5 14B Instruct             Q4_K - Medium  14.8B       339 tokens/s  15.7 tokens/s     3.50 sec             115
Qwen3-Coder-30B-A3B-Instruct-1M  Q4_K - Medium  30.5B       288 tokens/s  29.3 tokens/s     4.04 sec             128
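
The page does not say how the LocalScore column is derived from the other three. As a rough sketch, assuming the score is 10 times the geometric mean of prompt speed, generation speed, and the inverse of time to first token in seconds, the figures above can be reproduced to the nearest integer. The function name `local_score` and the formula are assumptions made for illustration, not something this page confirms.

```python
def local_score(prompt_tps: float, gen_tps: float, ttft_sec: float) -> float:
    """Assumed formula: 10 * geometric mean of the three metrics.

    Time to first token is inverted (1 / seconds) so that lower latency
    raises the score, like the two throughput figures.
    """
    product = prompt_tps * gen_tps * (1.0 / ttft_sec)
    return 10.0 * product ** (1.0 / 3.0)


# Rows from the Tesla P40 results on this page:
# (model, prompt tokens/s, generation tokens/s, time to first token in sec)
results = [
    ("LLaMA v2 Q4_0", 833, 40.9, 1.52),
    ("Qwen2.5 14B Instruct Q4_K - Medium", 339, 15.7, 3.50),
    ("Qwen3-Coder-30B-A3B-Instruct-1M Q4_K - Medium", 288, 29.3, 4.04),
]

for model, prompt_tps, gen_tps, ttft_sec in results:
    print(f"{model}: {round(local_score(prompt_tps, gen_tps, ttft_sec))}")
```

Under this assumption, the 30B mixture-of-experts model outscores the 14B dense model mainly because its generation speed is nearly twice as high, even though its prompt speed and time to first token are worse.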

3 models tested
