TEST #3299 RESULTS
04/13/2026 - 8:08 AM
MODEL
Meta Llama 3.1 8B Instruct - Q4_K - Medium
GENERATION
3.1 tokens/s
TIME TO FIRST TOKEN
97.25 sec
PROMPT
14 tokens/s
LOCALSCORE
8
SYSTEM
CPU
Intel Xeon CPU E3-1231 v3 @ 3.40GHz (haswell)
RAM
15.6GB
OS
Linux
Kernel Release
6.17.0-20-generic
Architecture
x86_64
Version
Cosmopolitan 3.9.7 MODE=x86_64; #20~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Mar 19 01:28:37 UTC 2
RUNTIME
Name
llamafile
Version
0.9.2
Commit Hash
a30b324
DETAILED RESULTS
TEST NAME        PROMPT (tokens/s)   GENERATION (tokens/s)   TTFT (sec)
pp1024+tg16      17                  3.6                     61.61
pp4096+tg256     14                  2.8                     294.76
pp2048+tg256     15                  3.0                     141.54
pp2048+tg768     14                  3.0                     145.09
pp1024+tg1024    15                  3.2                     69.14
pp1280+tg3072    10                  2.6                     132.70
pp384+tg1152     16                  3.3                     24.70
pp64+tg1024      16                  3.3                     4.40
pp16+tg1536      15                  3.3                     1.35
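
The headline figures at the top of this report appear to be plain arithmetic means of the nine per-test rows above; the sketch below recomputes them as a check. The names `ROWS`, `summarize`, and `assumed_score` are hypothetical, introduced only for illustration. The score formula (ten times the geometric mean of prompt tokens/s, generation tokens/s, and 1000 divided by TTFT in milliseconds) is an assumption that is not stated anywhere on this page; it happens to reproduce the reported value for this run, which does not confirm it as the official definition.

```python
from statistics import mean

# (test name, prompt tokens/s, generation tokens/s, TTFT in seconds),
# copied from the detailed results of this report.
ROWS = [
    ("pp1024+tg16", 17, 3.6, 61.61),
    ("pp4096+tg256", 14, 2.8, 294.76),
    ("pp2048+tg256", 15, 3.0, 141.54),
    ("pp2048+tg768", 14, 3.0, 145.09),
    ("pp1024+tg1024", 15, 3.2, 69.14),
    ("pp1280+tg3072", 10, 2.6, 132.70),
    ("pp384+tg1152", 16, 3.3, 24.70),
    ("pp64+tg1024", 16, 3.3, 4.40),
    ("pp16+tg1536", 15, 3.3, 1.35),
]


def summarize(rows):
    """Return the arithmetic mean of each metric across all tests."""
    return {
        "prompt_tps": mean(r[1] for r in rows),
        "generation_tps": mean(r[2] for r in rows),
        "ttft_sec": mean(r[3] for r in rows),
    }


def assumed_score(summary):
    """ASSUMPTION: 10 x geometric mean of prompt tps, generation tps,
    and 1000 / TTFT(ms). This formula is not given in the report."""
    ttft_ms = summary["ttft_sec"] * 1000
    product = (
        summary["prompt_tps"]
        * summary["generation_tps"]
        * (1000 / ttft_ms)
    )
    return 10 * product ** (1 / 3)


if __name__ == "__main__":
    s = summarize(ROWS)
    print(f"prompt:     {s['prompt_tps']:.2f} tokens/s")
    print(f"generation: {s['generation_tps']:.2f} tokens/s")
    print(f"TTFT:       {s['ttft_sec']:.2f} sec")
    print(f"score (assumed formula): {assumed_score(s):.2f}")
```

The per-test prompt speeds are shown as whole numbers, so the recomputed prompt mean is only approximate and can differ slightly from the headline figure of 14 tokens/s.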
