Tesla P40 (GPU, 24 GB)
PERFORMANCE OVERVIEW

Model                  Quantization   Parameters  Prompt Speed   Generation Speed  Time to First Token  LocalScore
Llama 3.2 1B Instruct  Q4_K - Medium  1.5B        2209 tokens/s  90.5 tokens/s     656 ms               700
LLaMA v2               Q4_0           6.7B        833 tokens/s   40.9 tokens/s     1.52 sec             282
Qwen2.5 14B Instruct   Q4_K - Medium  14.8B       292 tokens/s   13.6 tokens/s     4.02 sec             100
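
As a rough reading aid for these figures, the sketch below estimates how long a reply of a given length might take on the Tesla P40. It assumes that total time is the time to first token plus the remaining tokens streamed at the listed generation speed, and that the time to first token holds for prompts similar in size to the benchmark's. The page states neither assumption. The names `RESULTS` and `estimated_response_seconds` are hypothetical and used only for illustration.

```python
# Figures copied from the performance overview for the Tesla P40 (24 GB).
# Each entry: (time to first token in seconds, generation speed in tokens/s).
RESULTS = {
    "Llama 3.2 1B Instruct (Q4_K - Medium)": (0.656, 90.5),
    "LLaMA v2 (Q4_0)": (1.52, 40.9),
    "Qwen2.5 14B Instruct (Q4_K - Medium)": (4.02, 13.6),
}


def estimated_response_seconds(ttft_s: float, gen_tokens_per_s: float, n_tokens: int) -> float:
    """Rough estimate: wait for the first token, then stream at a constant rate.

    Assumption (not from the page): generation speed stays constant over the
    whole reply and the prompt is similar in size to the benchmark's prompt.
    """
    if gen_tokens_per_s <= 0:
        raise ValueError("generation speed must be positive")
    if n_tokens < 0:
        raise ValueError("token count must be non-negative")
    return ttft_s + n_tokens / gen_tokens_per_s


if __name__ == "__main__":
    # Estimate a 256-token reply for each benchmarked model.
    for model, (ttft_s, speed) in RESULTS.items():
        seconds = estimated_response_seconds(ttft_s, speed, 256)
        print(f"{model}: about {seconds:.1f} s for 256 tokens")
```

Under these assumptions the estimate grows linearly with reply length, so the gap between the 1B and 14B models widens as replies get longer.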
COMPARE MODELS

4 models tested:

Llama 3.2 1B Instruct (Q4_K - Medium)
Qwen2.5 14B Instruct (Q4_K - Medium)
LLaMA v2 (Q4_0)
Qwen3-Coder-30B-A3B-Instruct-1M (Q4_K - Medium)