Performance of Llama 3.1 8B AI Inference using vLLM on ND-H100-v5 | Microsoft Community Hub

https://techcommunity.microsoft.com/t5/azure-high-performance-computing/performance-of-llama-3-1-8b-ai-inference-using-vllm-on-nd-h100/ba-p/4448355