Tesla P40
GPU, 24 GB
PERFORMANCE OVERVIEW
Model: LLaMA v2, Q4_0, 6.7B
Prompt Speed: 833 tokens/s
Generation Speed: 40.9 tokens/s
Time to First Token: 1.52 sec
LocalScore: 282
COMPARE MODELS
1 model tested: LLaMA v2, Q4_0