Comparison of Dell and HPE Enterprise Servers for LLM Inference with NVIDIA A100 GPUs

The tables below compare Dell and HPE enterprise servers for running large language models on NVIDIA A100 GPUs, covering inference benchmarks, system specifications, power requirements, and list prices. Values marked with an asterisk (*) are estimates rather than measured results.

Server Performance Comparison Table

The table below summarizes key metrics for each server:

| Server Model | GPU Configuration | Tokens/Second | P95 Latency (ms) | Max Concurrent Sessions | Words/Second (Llama 70B) | List Price (USD) |
|---|---|---|---|---|---|---|
| Dell PowerEdge XE8545 | 4× NVIDIA A100 SXM4 80GB | 621.4 | 50.78 | 124 | 414 | $150,000–$200,000 |
| Dell PowerEdge XE8545 | 4× NVIDIA A100 SXM4 40GB | ~310.7* | ~55.0* | ~100* | ~207* | $125,000–$170,000 |
| Dell DGX A100 P3687 | 8× NVIDIA A100 SXM4 80GB | 1,172.63 | 66.41 | ~240* | 782 | $199,357–$307,367 |
| Dell DGX A100 P3687 | 8× NVIDIA A100 SXM4 40GB | ~586.3* | ~70.0* | ~200* | ~700* | $255,012 |
| HPE ProLiant DL380 Gen10 Plus | 3× NVIDIA A100 PCIe 80GB | ~465.0* | ~60.0* | ~93* | ~310* | $100,000–$150,000 |
| HPE ProLiant DL380 Gen10 Plus | 3× NVIDIA A100 PCIe 40GB | ~232.5* | ~65.0* | ~75* | ~155* | ~$90,000* |

*Estimated values based on scaling factors and available comparative benchmarks.
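
To make the measured rows easier to compare, the following is a minimal Python sketch that derives two ratios from them: the words-per-token conversion implied by the Tokens/Second and Words/Second columns, and the list price per unit of throughput. The names `MEASURED`, `words_per_token`, and `price_per_throughput` are illustrative and not part of any benchmark tooling, and the estimated (asterisked) rows are deliberately left out.

```python
# Measured rows copied from the table above (estimated rows are left out).
MEASURED = {
    "Dell PowerEdge XE8545 (4x A100 80GB)": {
        "tokens_per_s": 621.4,
        "words_per_s": 414,
        "list_price_usd": (150_000, 200_000),
    },
    "Dell DGX A100 P3687 (8x A100 80GB)": {
        "tokens_per_s": 1172.63,
        "words_per_s": 782,
        "list_price_usd": (199_357, 307_367),
    },
}


def words_per_token(row):
    """Ratio implied by the Words/Second and Tokens/Second columns."""
    return row["words_per_s"] / row["tokens_per_s"]


def price_per_throughput(row):
    """(low, high) list price in USD per token/second of throughput."""
    low, high = row["list_price_usd"]
    return low / row["tokens_per_s"], high / row["tokens_per_s"]


if __name__ == "__main__":
    for name, row in MEASURED.items():
        low, high = price_per_throughput(row)
        print(f"{name}: {words_per_token(row):.3f} words/token, "
              f"${low:,.0f} to ${high:,.0f} per token/s")
```

Both measured rows give roughly 0.67 words per token, which suggests the Words/Second column is a fixed conversion of Tokens/Second rather than an independent measurement. List price per token/second only reflects purchase cost; it ignores power, cooling, and support contracts.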

Detailed System Specifications

Processor and Memory Configuration

| Server Model | CPU | Maximum Memory | Memory Bandwidth | Storage Options |
|---|---|---|---|---|
| Dell PowerEdge XE8545 | Dual AMD EPYC (up to 128 cores) | 2TB DDR4-3200 | 410 GB/s | NVMe SSDs, up to 24 drives |
| Dell DGX A100 P3687 | Dual AMD EPYC 7742 (128 cores total) | 2TB DDR4-3200 | 410 GB/s | 15TB NVMe SSD storage |
| HPE ProLiant DL380 Gen10 Plus | 3rd Gen Intel Xeon Scalable | 4TB DDR4-3200 | 204.8 GB/s | Up to 30 SFF or 18 LFF drives |

GPU Interconnect and Networking

| Server Model | GPU Interconnect | Network Bandwidth | GPU-to-GPU Bandwidth |
|---|---|---|---|
| Dell PowerEdge XE8545 | NVLink, NVSwitch | Up to 200Gbps | 600 GB/s |
| Dell DGX A100 P3687 | NVLink, NVSwitch | 8× 200Gbps HDR InfiniBand | 600 GB/s |
| HPE ProLiant DL380 Gen10 Plus | PCIe Gen4 | Up to 100Gbps | PCIe only (lower than NVLink) |

Model-Specific Performance

| Server Model | GPU Configuration | Nemotron 70B (tokens/sec) | Llama 3.2 70B (tokens/sec) | Quantization Level |
|---|---|---|---|---|
| Dell PowerEdge XE8545 | 4× A100 80GB | ~605.2* | 621.4 | 4-bit (FP8/INT4) |
| Dell DGX A100 P3687 | 8× A100 80GB | ~1,142.3* | 1,172.63 | 4-bit (FP8/INT4) |
| HPE ProLiant DL380 Gen10 Plus | 3× A100 80GB | ~451.5* | 465.0* | 4-bit (FP8/INT4) |
| Single A100 GPU | 1× A100 40GB | ~23.5* | 24.09 | 4-bit (FP8/INT4) |

*Estimated values based on scaling factors and available benchmarks.
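
One way to read the two measured multi-GPU figures is as a scaling-efficiency check: how close does the 8-GPU system come to doubling the throughput of the 4-GPU system? The sketch below is a hedged illustration using only the measured Llama numbers from the table; `scaling_efficiency` is a hypothetical helper, and the calculation assumes both figures were taken under comparable workloads, which the source does not confirm.

```python
def scaling_efficiency(base_tps, base_gpus, scaled_tps, scaled_gpus):
    """Fraction of ideal linear scaling achieved when adding GPUs.

    1.0 means throughput grew in exact proportion to GPU count.
    """
    speedup = scaled_tps / base_tps    # observed throughput gain
    ideal = scaled_gpus / base_gpus    # gain under perfect linear scaling
    return speedup / ideal


# Measured Llama throughput (tokens/sec) from the table above.
XE8545_TPS, XE8545_GPUS = 621.4, 4
DGX_TPS, DGX_GPUS = 1172.63, 8

efficiency = scaling_efficiency(XE8545_TPS, XE8545_GPUS, DGX_TPS, DGX_GPUS)

if __name__ == "__main__":
    print(f"Speedup: {DGX_TPS / XE8545_TPS:.2f}x, efficiency: {efficiency:.1%}")
```

On these numbers the 8-GPU system delivers about 1.89 times the 4-GPU throughput, or roughly 94% of ideal linear scaling. The two servers are different platforms, so this is a cross-system comparison rather than a controlled scaling test.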

Power and Cooling Requirements

| Server Model | Maximum Power Consumption | Required Cooling | Rack Units |
|---|---|---|---|
| Dell PowerEdge XE8545 | 4,800W | Air or liquid cooling options | 4U |
| Dell DGX A100 P3687 | 6,500W | Liquid cooling recommended | 4U |
| HPE ProLiant DL380 Gen10 Plus | 3,600W | High Performance Fan Kit required | 2U |
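
Combining the maximum power figures with the 80GB throughput numbers gives a rough tokens-per-second-per-watt comparison. The sketch below is illustrative only: `SERVERS` and `tokens_per_watt` are names introduced here, the HPE throughput is an estimate in the source, and maximum power consumption is an upper bound rather than the measured draw during inference, so actual efficiency is likely higher than these figures.

```python
# (tokens/sec with A100 80GB GPUs, maximum power in watts, throughput is estimated?)
SERVERS = {
    "Dell PowerEdge XE8545": (621.4, 4800, False),
    "Dell DGX A100 P3687": (1172.63, 6500, False),
    "HPE ProLiant DL380 Gen10 Plus": (465.0, 3600, True),  # estimated throughput
}


def tokens_per_watt(tokens_per_s, max_power_w):
    """Throughput per watt of maximum rated power (a lower bound on efficiency)."""
    return tokens_per_s / max_power_w


if __name__ == "__main__":
    ranked = sorted(
        SERVERS.items(),
        key=lambda item: tokens_per_watt(item[1][0], item[1][1]),
        reverse=True,
    )
    for name, (tps, watts, estimated) in ranked:
        flag = " (estimated throughput)" if estimated else ""
        print(f"{name}: {tokens_per_watt(tps, watts):.3f} tokens/s per W{flag}")
```

By this measure the 8-GPU system comes out ahead at about 0.180 tokens/second per watt, with the other two servers close together near 0.129.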

Conclusion

For LLM inference on NVIDIA A100 GPUs, the three servers occupy distinct positions. The Dell DGX A100 P3687 delivers the highest throughput of the group (1,172.63 tokens/second with 8× A100 80GB), at the highest list price and power budget. The HPE ProLiant DL380 Gen10 Plus is the lowest-cost option ($100,000–$150,000 with 3× A100 80GB) and suits modest inference workloads, though its performance figures here are estimates and its GPUs communicate over PCIe rather than NVLink. The Dell PowerEdge XE8545 sits between the two, with 621.4 tokens/second from 4× A100 80GB at a list price of $150,000–$200,000.

About the Author

Rejith Krishnan

Rejith Krishnan is the Founder and CEO of lowtouch.ai, a platform dedicated to empowering enterprises with private, no-code AI agents. With expertise in Site Reliability Engineering (SRE), Kubernetes, and AI systems architecture, he is passionate about simplifying the adoption of AI-driven automation to transform business operations.

Rejith specializes in deploying Large Language Models (LLMs) and building intelligent agents that automate workflows, enhance customer experiences, and optimize IT processes, all while ensuring data privacy and security. His mission is to help businesses unlock the full potential of enterprise AI with seamless, scalable, and secure solutions that fit their unique needs.
