Tesla P40 (GPU, 24 GB)
PERFORMANCE OVERVIEW
Model: LLaMA v2, Q4_0, 6.7B
Prompt Speed: 833 tokens/s
Generation Speed: 40.9 tokens/s
Time to First Token: 1.52 sec
LocalScore: 282
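The page does not say how the LocalScore figure is derived from the three measurements. As a rough sketch only, the snippet below assumes the score is ten times the geometric mean of prompt speed, generation speed, and 1000 divided by the time to first token in milliseconds. That formula is an assumption, not something stated here, and the function name `local_score` is hypothetical. It does reproduce the listed value from the numbers above.

```python
def local_score(prompt_tps: float, gen_tps: float, ttft_sec: float) -> float:
    """Assumed scoring formula (not confirmed by this page):
    10 * geometric mean of prompt speed, generation speed,
    and the inverse time to first token (1000 / TTFT in ms)."""
    ttft_ms = ttft_sec * 1000.0
    product = prompt_tps * gen_tps * (1000.0 / ttft_ms)
    return 10.0 * product ** (1.0 / 3.0)


# Figures reported for the Tesla P40 with LLaMA v2 Q4_0
score = local_score(prompt_tps=833, gen_tps=40.9, ttft_sec=1.52)
print(round(score))
```

Under this assumption a lower time to first token raises the score, while the two throughput figures contribute in direct proportion.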
COMPARE MODELS

1 model tested
