Token Throughput per GPU vs End-to-end Latency

LLM Benchmarks • 8 GPUs • full benchmark at https://github.com/Scicom-AI-Enterprise-Organization/llm-benchmaq/tree/main/benchmarks
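The chart's two axes relate as follows: throughput per GPU normalizes total token generation rate by device count, while end-to-end latency is the wall-clock time from request submission to final token. A minimal sketch of the throughput metric (function and variable names here are illustrative, not taken from the linked repository):

```python
def throughput_per_gpu(total_tokens: int, elapsed_s: float, num_gpus: int) -> float:
    """Tokens generated per second, normalized by GPU count.

    total_tokens: tokens produced across all requests in the run
    elapsed_s:    end-to-end wall-clock time of the run, in seconds
    num_gpus:     number of GPUs serving the model
    """
    return total_tokens / elapsed_s / num_gpus

# Example: 8 GPUs generating 160_000 tokens over a 20 s run
tps = throughput_per_gpu(160_000, 20.0, 8)  # 1000.0 tokens/s/GPU
```

Plotting this value against per-request latency shows the usual serving trade-off: larger batches raise aggregate throughput but lengthen each request's end-to-end latency.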

Benchmark Configuration