Llama 3.2 1B Instruct
Q4_K - Medium
1.5B params
COMPARE ACCELERATORS
224 accelerators tested
Selected accelerators:
NVIDIA RTX 6000 Ada Generation, 47GB
NVIDIA H100 PCIe, 79GB
NVIDIA A100-SXM4-80GB, 79GB
NVIDIA GeForce RTX 3090 Ti, 24GB
NVIDIA GeForce RTX 3080 Ti, 12GB
Llama 3.2 1B Instruct - Q4_K - Medium
LEADERBOARD
TYPE / MEMORY   PROMPT (tokens/s)   GENERATION (tokens/s)   TTFT       LOCALSCORE
GPU / 47GB      27892               416                     53 ms      6030
GPU / 48GB      19620               131                     68 ms      3350
GPU / 16GB      15512               226                     93 ms      3334
GPU / 16GB      16736               141                     81 ms      3077
GPU / 12GB      13347               188                     106 ms     2856
GPU / 20GB      10703               237                     142 ms     2616
-               8206                181                     161 ms     2099
GPU / 20GB      8737                189                     179 ms     2099
GPU / 8GB       6350                212                     216 ms     1840
GPU / 16GB      6087                168                     251 ms     1599
GPU / 8GB       5553                195                     287 ms     1558
GPU / 8GB       6225                96.4                    224 ms     1388
GPU / 6GB       4979                114                     286 ms     1255
GPU / 128GB     3296                176                     334 ms     1203
GPU / 192GB     3272                170                     339 ms     1179
-               4083                124                     353 ms     1128
GPU / 8GB       3399                109                     416 ms     962
-               814                 24.4                    1.83 sec   221
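
The LOCALSCORE values appear to be derived from the other three columns. The sketch below assumes the score is ten times the geometric mean of prompt speed, generation speed, and 1000 divided by TTFT in milliseconds. That formula is not stated on this page, and `local_score` is a hypothetical helper name used only for illustration. Because the displayed figures are rounded, the computed values should only approximately match the leaderboard.

```python
def local_score(prompt_tps: float, gen_tps: float, ttft_ms: float) -> float:
    """Approximate a LocalScore from the three leaderboard metrics.

    Assumption (not stated on the page): the score is
    10 * geometric mean of (prompt tokens/s, generation tokens/s, 1000 / TTFT in ms).
    """
    if prompt_tps <= 0 or gen_tps <= 0 or ttft_ms <= 0:
        raise ValueError("all metrics must be positive")
    # A lower TTFT is better, so it enters the mean as its reciprocal
    # (expressed as "first tokens per second").
    inverse_ttft = 1000.0 / ttft_ms
    geometric_mean = (prompt_tps * gen_tps * inverse_ttft) ** (1.0 / 3.0)
    return 10.0 * geometric_mean


if __name__ == "__main__":
    # Rows copied from the leaderboard: (prompt, generation, TTFT in ms, listed score).
    rows = [
        (27892, 416, 53, 6030),    # first row
        (19620, 131, 68, 3350),    # second row
        (814, 24.4, 1830, 221),    # last row, TTFT 1.83 sec converted to ms
    ]
    for prompt_tps, gen_tps, ttft_ms, listed in rows:
        computed = local_score(prompt_tps, gen_tps, ttft_ms)
        print(f"listed={listed}  computed={computed:.0f}")
```

Under this assumption, the score rewards throughput and latency equally, which would explain why a row with high prompt speed but slow generation (such as the second row) ranks well below the first.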