Llama 3.2 1B Instruct
Q4_K - Medium
1.5B params
COMPARE ACCELERATORS
717 accelerators tested
NVIDIA RTX 6000 Ada Generation (47GB)
NVIDIA GeForce RTX 4090 (23GB)
NVIDIA GeForce RTX 4090 D (47GB)
NVIDIA L40S (44GB)
NVIDIA RTX PRO 6000 Blackwell Workstation Edition (95GB)
Llama 3.2 1B Instruct - Q4_K - Medium
LEADERBOARD
TYPE / MEMORY   PROMPT (tokens/s)  GENERATION (tokens/s)  TTFT      LOCALSCORE
GPU / 47GB      27281              411                    53 ms     5948
-               25401              244                    55 ms     4842
GPU / 32GB      17479              247                    80 ms     3780
GPU / 48GB      19620              131                    68 ms     3350
GPU / 16GB      16736              141                    81 ms     3077
-               18081              106                    74 ms     2962
GPU / 12GB      13280              183                    106 ms    2821
-               16184              106                    80 ms     2780
GPU / 16GB      13268              161                    104 ms    2738
GPU / 12GB      11428              186                    119 ms    2612
GPU / 24GB      11598              182                    125 ms    2565
GPU / 16GB      11965              207                    148 ms    2559
GPU / 20GB      10448              210                    144 ms    2477
GPU / 20GB      8753               189                    179 ms    2100
-               8206               181                    161 ms    2099
GPU / 8GB       7096               186                    189 ms    1911
GPU / 16GB      7533               157                    183 ms    1865
GPU / 8GB       6350               212                    216 ms    1840
GPU / 8GB       7398               143                    201 ms    1723
GPU / 512GB     5601               179                    206 ms    1693
GPU / 8GB       5696               189                    236 ms    1660
GPU / 16GB      6087               168                    251 ms    1599
GPU / 256GB     4999               178                    227 ms    1578
GPU / 8GB       5553               195                    287 ms    1558
GPU / 8GB       6467               128                    219 ms    1545
-               5884               131                    237 ms    1481
GPU / 6GB       5430               151                    258 ms    1475
GPU / 8GB       4930               124                    251 ms    1347
GPU / 6GB       4917               118                    301 ms    1248
GPU / 128GB     3296               176                    334 ms    1203
GPU / 192GB     3272               170                    339 ms    1179
-               4087               130                    354 ms    1144
GPU / 128GB     2977               151                    368 ms    1070
GPU / 8GB       3399               109                    416 ms    962
GPU / 16GB      5678               34.7                   231 ms    949
-               3376               99.4                   435 ms    920
-               3624               81.6                   416 ms    893
GPU / 6GB       800                35.6                   1.81 sec  251
-               814                24.4                   1.83 sec  221
GPU / 4GB       667                31.2                   2.11 sec  214
-               547                19.5                   2.62 sec  160
