Tesla P40 (GPU, 24 GB)
PERFORMANCE OVERVIEW

Model                            Quantization   Params  Prompt Speed  Generation Speed  Time to First Token  LocalScore
LLaMA v2                         Q4_0           6.7B    833 tokens/s  40.9 tokens/s     1.52 sec             282
Qwen2.5 14B Instruct             Q4_K - Medium  14.8B   339 tokens/s  15.7 tokens/s     3.50 sec             115
Qwen3-Coder-30B-A3B-Instruct-1M  Q4_K - Medium  30.5B   288 tokens/s  29.3 tokens/s     4.04 sec             128
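
The page does not say how LocalScore is derived from the other three columns. As a rough sketch, assuming the score is 10 times the geometric mean of prompt speed, generation speed, and the reciprocal of time to first token in seconds, the listed figures can be checked in a few lines. The formula and the names `local_score` and `results` are assumptions made for illustration and are not taken from this page.

```python
# Figures copied from the performance overview for the Tesla P40:
# (prompt tokens/s, generation tokens/s, time to first token in seconds)
results = {
    "LLaMA v2 Q4_0": (833.0, 40.9, 1.52),
    "Qwen2.5 14B Instruct Q4_K - Medium": (339.0, 15.7, 3.50),
    "Qwen3-Coder-30B-A3B-Instruct-1M Q4_K - Medium": (288.0, 29.3, 4.04),
}


def local_score(prompt_tps: float, gen_tps: float, ttft_sec: float) -> float:
    """Assumed formula: 10 x geometric mean of the two speeds and 1/TTFT."""
    product = prompt_tps * gen_tps * (1.0 / ttft_sec)
    return 10.0 * product ** (1.0 / 3.0)


if __name__ == "__main__":
    for model, (prompt_tps, gen_tps, ttft_sec) in results.items():
        score = local_score(prompt_tps, gen_tps, ttft_sec)
        print(f"{model}: {score:.1f}")
```

Under this assumption the rounded results agree with the LocalScore column, which also suggests why the 30.5B model outscores the 14.8B one: its generation speed is nearly twice as high, and that outweighs its slower prompt speed and longer time to first token.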
