TEST #2549 RESULTS

02/06/2026 - 12:45 PM

Generation: 32.2 tokens/s
Time to first token: 5.78 sec
Prompt: 238 tokens/s
LocalScore: 110


Llama 3.2 1B Instruct - Q4_K - Medium

SYSTEM
CPU: Intel Xeon CPU E5-2680 v4 @ 2.40GHz (broadwell)
RAM: 125.8GB
OS: Linux
Kernel Release: 6.12.15-production+truenas
Architecture: x86_64
Version: Cosmopolitan 3.9.7 MODE=x86_64; #1 SMP PREEMPT_DYNAMIC Mon Sep 8 18:50:34 UTC 2025
RUNTIME
Name: llamafile
Version: 0.9.2
Commit Hash: a30b324
DETAILED RESULTS
TEST NAME        PROMPT (tokens/s)   GENERATION (tokens/s)   TTFT
pp1024+tg16      291                 33.5                    3.55 sec
pp4096+tg256     218                 28.6                    18.81 sec
pp2048+tg256     233                 31.7                    8.81 sec
pp2048+tg768     226                 31.5                    9.08 sec
pp1024+tg1024    245                 32.9                    4.22 sec
pp1280+tg3072    233                 30.4                    5.52 sec
pp384+tg1152     235                 33.3                    1.66 sec
pp64+tg1024      242                 33.9                    293 ms
pp16+tg1536      215                 33.6                    103 ms
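
The headline figures at the top of this report appear to be plain arithmetic means of the nine per-test rows. The sketch below checks that using the values from the detailed results, with the two millisecond TTFT readings converted to seconds. The `assumed_score` function is an assumption and is not stated anywhere on this page: it guesses that the LocalScore is 10 times the geometric mean of prompt speed, generation speed, and the inverse of TTFT in seconds. The function and variable names are illustrative only.

```python
from statistics import mean

# (test name, prompt tokens/s, generation tokens/s, TTFT in seconds)
# Values copied from the detailed results; 293 ms and 103 ms converted to seconds.
RESULTS = [
    ("pp1024+tg16", 291, 33.5, 3.55),
    ("pp4096+tg256", 218, 28.6, 18.81),
    ("pp2048+tg256", 233, 31.7, 8.81),
    ("pp2048+tg768", 226, 31.5, 9.08),
    ("pp1024+tg1024", 245, 32.9, 4.22),
    ("pp1280+tg3072", 233, 30.4, 5.52),
    ("pp384+tg1152", 235, 33.3, 1.66),
    ("pp64+tg1024", 242, 33.9, 0.293),
    ("pp16+tg1536", 215, 33.6, 0.103),
]


def summarize(results):
    """Return the arithmetic mean of prompt speed, generation speed, and TTFT."""
    prompt_tps = mean(row[1] for row in results)
    gen_tps = mean(row[2] for row in results)
    ttft_s = mean(row[3] for row in results)
    return prompt_tps, gen_tps, ttft_s


def assumed_score(prompt_tps, gen_tps, ttft_s):
    """ASSUMPTION (not from this page): 10 x geometric mean of the three
    metrics, with TTFT entering as its inverse in seconds."""
    return 10 * (prompt_tps * gen_tps * (1.0 / ttft_s)) ** (1.0 / 3.0)


if __name__ == "__main__":
    prompt_tps, gen_tps, ttft_s = summarize(RESULTS)
    print(f"prompt:     {prompt_tps:.0f} tokens/s")
    print(f"generation: {gen_tps:.1f} tokens/s")
    print(f"TTFT:       {ttft_s:.2f} sec")
    print(f"assumed score: {assumed_score(prompt_tps, gen_tps, ttft_s):.0f}")
```

The means round to 238 tokens/s, 32.2 tokens/s, and 5.78 sec, matching the summary. Under the assumed formula the score also rounds to 110, which is consistent with the reported LocalScore but does not confirm that this is how it is computed.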