Enterprise Servers for LLM Inference: Dell and HPE with A100 GPUs

Comparison of Dell and HPE Enterprise Servers for LLM Inference with NVIDIA A100 GPUs

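The tables below compare Dell and HPE enterprise servers equipped with NVIDIA A100 GPUs for large language model (LLM) inference, covering throughput, latency, hardware specifications, power requirements, and list pricing. Values marked with an asterisk are estimates rather than measured results.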

Server Performance Comparison Table

The table below summarizes key metrics for each server:

Server Model	GPU Configuration	Tokens/Second	P95 Latency (ms)	Max Concurrent Sessions	Words/Second (Llama 70B)	List Price (USD)
Dell PowerEdge XE8545	4× NVIDIA A100 SXM4 80GB	621.4	50.78	124	414	$150,000–$200,000
Dell PowerEdge XE8545	4× NVIDIA A100 SXM4 40GB	~310.7*	~55.0*	~100*	~207*	$125,000–$170,000
Dell DGX A100 P3687	8× NVIDIA A100 SXM4 80GB	1,172.63	66.41	~240*	782	$199,357–$307,367
Dell DGX A100 P3687	8× NVIDIA A100 SXM4 40GB	~586.3*	~70.0*	~200*	~391*	$255,012
HPE ProLiant DL380 Gen10 Plus	3× NVIDIA A100 PCIe 80GB	~465.0*	~60.0*	~93*	~310*	$100,000–$150,000
HPE ProLiant DL380 Gen10 Plus	3× NVIDIA A100 PCIe 40GB	~232.5*	~65.0*	~75*	~155*	~$90,000*

*Estimated values based on scaling factors and available comparative benchmarks.
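
The starred estimates appear to follow simple scaling rules applied to the measured 80GB rows. The Python sketch below reproduces them under assumptions that are inferred from the table rather than stated by the source: throughput scales linearly with GPU count, a 40GB configuration delivers half the throughput of its 80GB counterpart, and one token corresponds to roughly two thirds of a word. The function and constant names are illustrative.

```python
# Measured Llama 70B throughput in tokens/second, copied from the table above.
XE8545_80GB_TPS = 621.4
DGX_80GB_TPS = 1172.63

# Assumptions inferred from the table's numbers, not stated by the source.
WORDS_PER_TOKEN = 2 / 3      # 621.4 tokens/s is listed as 414 words/s
HALF_MEMORY_FACTOR = 0.5     # each 40GB row is half of its 80GB counterpart


def words_per_second(tokens_per_sec: float) -> float:
    """Convert token throughput to an approximate word throughput."""
    return tokens_per_sec * WORDS_PER_TOKEN


def scale_by_gpu_count(tokens_per_sec: float, from_gpus: int, to_gpus: int) -> float:
    """Linearly rescale throughput from one GPU count to another."""
    return tokens_per_sec * to_gpus / from_gpus


def tokens_per_session(tokens_per_sec: float, sessions: int) -> float:
    """Average throughput available to each concurrent session."""
    return tokens_per_sec / sessions


if __name__ == "__main__":
    print("XE8545 80GB words/s:", round(words_per_second(XE8545_80GB_TPS)))
    print("XE8545 40GB tokens/s:", XE8545_80GB_TPS * HALF_MEMORY_FACTOR)
    print("3-GPU estimate tokens/s:", scale_by_gpu_count(XE8545_80GB_TPS, 4, 3))
    print("XE8545 80GB tokens/s per session:",
          round(tokens_per_session(XE8545_80GB_TPS, 124), 2))
```

Linear scaling from four GPUs to three gives about 466 tokens/second, close to the ~465.0 listed for the HPE system. That system uses PCIe cards rather than SXM4 modules, so real throughput may differ from a purely linear estimate.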

Detailed System Specifications

Processor and Memory Configuration

Server Model	CPU	Maximum Memory	Memory Bandwidth	Storage Options
Dell PowerEdge XE8545	Dual AMD EPYC (up to 128 cores)	2TB DDR4-3200	410 GB/s	NVMe SSDs, Up to 24 drives
Dell DGX A100 P3687	Dual AMD EPYC 7742 (128 cores total)	2TB DDR4-3200	410 GB/s	15TB NVMe SSD storage
HPE ProLiant DL380 Gen10 Plus	3rd Gen Intel Xeon Scalable	4TB DDR4-3200	204.8 GB/s	Up to 30 SFF or 18 LFF drives
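
The memory bandwidth column matches the theoretical peak of DDR4-3200, which follows from the transfer rate and the number of memory channels. The sketch below assumes a 64-bit (8-byte) data bus per channel and eight memory channels per CPU socket; the channel counts are assumptions about the processors, not figures given in the table.

```python
def ddr_peak_bandwidth_gb_s(mega_transfers_per_sec: int,
                            channels: int,
                            bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    return mega_transfers_per_sec * bytes_per_transfer * channels / 1000


# Assumption: 8 channels per socket, so a dual-socket system has 16.
dual_socket = ddr_peak_bandwidth_gb_s(3200, channels=16)
single_socket = ddr_peak_bandwidth_gb_s(3200, channels=8)

print(f"16 channels of DDR4-3200: {dual_socket:.1f} GB/s")
print(f" 8 channels of DDR4-3200: {single_socket:.1f} GB/s")
```

Sixteen channels give 409.6 GB/s, in line with the 410 GB/s listed for the AMD-based systems. Eight channels give the 204.8 GB/s listed for the HPE system, which suggests that figure may be a per-socket value if the eight-channel assumption holds.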

GPU Interconnect and Networking

Server Model	GPU Interconnect	Network Bandwidth	GPU-to-GPU Bandwidth
Dell PowerEdge XE8545	NVLink	Up to 200Gbps	600 GB/s
Dell DGX A100 P3687	NVLink, NVSwitch	8× 200Gbps HDR InfiniBand	600 GB/s
HPE ProLiant DL380 Gen10 Plus	PCIe Gen4	Up to 100Gbps	PCIe only (lower than NVLink)

Model-Specific Performance

Server Model	GPU Configuration	Nemotron 70B (tokens/sec)	Llama 70B (tokens/sec)	Quantization Level
Dell PowerEdge XE8545	4× A100 80GB	~605.2*	621.4	4-bit (INT4)
Dell DGX A100 P3687	8× A100 80GB	~1,142.3*	1,172.63	4-bit (INT4)
HPE ProLiant DL380 Gen10 Plus	3× A100 80GB	~451.5*	~465.0*	4-bit (INT4)
Single A100 GPU	1× A100 40GB	~23.5*	24.09	4-bit (INT4)

*Estimated values based on scaling factors and available benchmarks.
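
One way to read the table above is to normalize throughput by GPU count and to check how much memory the quantized weights need. The sketch below uses only the measured Llama 70B figures from the table. The weight-size calculation counts parameters only and ignores the KV cache and activations, and the helper names are illustrative.

```python
def per_gpu_throughput(tokens_per_sec: float, gpu_count: int) -> float:
    """Tokens/second contributed by each GPU on average."""
    return tokens_per_sec / gpu_count


def scaling_efficiency(small_tps: float, small_gpus: int,
                       large_tps: float, large_gpus: int) -> float:
    """Per-GPU throughput of the larger system relative to the smaller one."""
    return (per_gpu_throughput(large_tps, large_gpus)
            / per_gpu_throughput(small_tps, small_gpus))


def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for model weights alone, in GB (excludes KV cache and activations)."""
    return params_billion * bits_per_weight / 8


four_gpu = per_gpu_throughput(621.4, 4)      # XE8545, 4x A100 80GB
eight_gpu = per_gpu_throughput(1172.63, 8)   # DGX A100, 8x A100 80GB
efficiency = scaling_efficiency(621.4, 4, 1172.63, 8)

print(f"4-GPU system: {four_gpu:.1f} tokens/s per GPU")
print(f"8-GPU system: {eight_gpu:.1f} tokens/s per GPU")
print(f"Scaling efficiency from 4 to 8 GPUs: {efficiency:.1%}")
print(f"70B weights at 4 bits: {weight_memory_gb(70, 4):.0f} GB")
```

By this measure the 8-GPU system retains roughly 94% of the 4-GPU system's per-GPU throughput. At 4 bits per weight, a 70B-parameter model needs about 35 GB for weights, which is why it can fit on a single 40GB A100, with little room left for the KV cache.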

Power and Cooling Requirements

Server Model	Maximum Power Consumption	Required Cooling	Rack Units
Dell PowerEdge XE8545	4,800W	Air or liquid cooling options	4U
Dell DGX A100 P3687	6,500W	Liquid cooling recommended	6U
HPE ProLiant DL380 Gen10 Plus	3,600W	High Performance Fan Kit required	2U
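
Combining the power figures with the throughput numbers reported earlier gives a rough energy-efficiency comparison. The sketch below divides maximum power by Llama 70B throughput for the 80GB configurations. It assumes each server draws its full rated maximum during inference, which overstates real consumption, and the HPE throughput is an estimate rather than a measurement.

```python
# (tokens/second for Llama 70B on the 80GB configuration, maximum power in watts)
SERVERS = {
    "Dell PowerEdge XE8545": (621.4, 4800),
    "Dell DGX A100 P3687": (1172.63, 6500),
    "HPE ProLiant DL380 Gen10 Plus": (465.0, 3600),  # throughput is an estimate
}


def joules_per_token(tokens_per_sec: float, watts: float) -> float:
    """Energy per generated token, assuming the server runs at maximum power."""
    return watts / tokens_per_sec


def tokens_per_watt(tokens_per_sec: float, watts: float) -> float:
    """Throughput per watt of maximum rated power."""
    return tokens_per_sec / watts


for name, (tps, watts) in SERVERS.items():
    print(f"{name}: {joules_per_token(tps, watts):.2f} J/token, "
          f"{tokens_per_watt(tps, watts):.3f} tokens/s per W")

most_efficient = max(SERVERS, key=lambda n: tokens_per_watt(*SERVERS[n]))
print("Highest tokens/s per watt:", most_efficient)
```

Under these assumptions the 8-GPU system delivers the most throughput per watt, while the 4-GPU and 3-GPU systems come out nearly equal.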

Conclusion

Among the configurations compared, the Dell DGX A100 P3687 delivers the highest throughput, at 1,172.63 tokens/second with 8× A100 80GB. The HPE ProLiant DL380 Gen10 Plus has the lowest entry price ($100,000–$150,000 with 3× A100 PCIe 80GB) and suits modest inference requirements, though its throughput figures are estimates and its GPUs communicate over PCIe rather than NVLink. The Dell PowerEdge XE8545 sits between the two, at 621.4 tokens/second with 4× A100 80GB for $150,000–$200,000.
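
The price and performance trade-off can be made concrete by dividing list price by throughput. The sketch below uses the price ranges and Llama 70B throughput reported for the 80GB configurations. The HPE throughput is an estimate, list prices vary by configuration and vendor quote, and the `Server` type is illustrative.

```python
from typing import NamedTuple, Tuple


class Server(NamedTuple):
    name: str
    tokens_per_sec: float
    price_low_usd: float
    price_high_usd: float


SERVERS = [
    Server("Dell PowerEdge XE8545 (4x A100 80GB)", 621.4, 150_000, 200_000),
    Server("Dell DGX A100 P3687 (8x A100 80GB)", 1172.63, 199_357, 307_367),
    # Throughput for the HPE system is an estimate in the source tables.
    Server("HPE ProLiant DL380 Gen10 Plus (3x A100 80GB)", 465.0, 100_000, 150_000),
]


def cost_per_tps(server: Server) -> Tuple[float, float]:
    """Low and high list price in USD per token/second of throughput."""
    return (server.price_low_usd / server.tokens_per_sec,
            server.price_high_usd / server.tokens_per_sec)


for server in SERVERS:
    low, high = cost_per_tps(server)
    print(f"{server.name}: ${low:,.0f} to ${high:,.0f} per token/s")

cheapest_to_buy = min(SERVERS, key=lambda s: s.price_low_usd)
cheapest_per_tps = min(SERVERS, key=lambda s: cost_per_tps(s)[0])
print("Lowest entry price:", cheapest_to_buy.name)
print("Lowest cost per token/s:", cheapest_per_tps.name)
```

Under these figures the HPE system has the lowest absolute price, while the 8-GPU system has the lowest cost per unit of throughput, so which option is most economical depends on how much capacity is actually needed.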

About the Author

Rejith Krishnan

Rejith Krishnan is the Founder and CEO of lowtouch.ai, a platform dedicated to empowering enterprises with private, no-code AI agents. With expertise in Site Reliability Engineering (SRE), Kubernetes, and AI systems architecture, he is passionate about simplifying the adoption of AI-driven automation to transform business operations.

Rejith specializes in deploying Large Language Models (LLMs) and building intelligent agents that automate workflows, enhance customer experiences, and optimize IT processes, all while ensuring data privacy and security. His mission is to help businesses unlock the full potential of enterprise AI with seamless, scalable, and secure solutions that fit their unique needs.

