Llama Monitor

Stopped

Server Control

gfx906: Q4_0, Q4_1, and Q8_0 use the optimized MMVQ path. Q4_K_M falls back to the generic path, which is slower. The turbo3 KV cache type gives 4.6x compression.

Server Logs

Inference

Prompt Speed
Generation Speed
Context (KV Cache)
Slots
Server Status

GPUs

GPU | Temp | Load | VRAM | Power | SCLK | MCLK