StoneFly AI Servers Powered by Nvidia GPU
Available Now
Accelerate your AI workloads with high-performance NVIDIA L40S GPU-based AI servers built for deep learning, machine learning, generative AI, large language models (LLMs), and data analytics. Scale effortlessly, optimize processing power, and reduce training times to drive innovation in real time.
Breakthrough Performance for Artificial Intelligence
StoneFly GPU-based AI servers are optimized to handle the most demanding AI workloads, from training Generative AI models and Large Language Models (LLMs) to powering image generation and deep learning applications. Equipped with NVIDIA L40S GPUs and optional advanced interconnects like NVLink and NVSwitch, these servers deliver exceptional performance for enterprise AI tools such as CUDA, TensorFlow, PyTorch, and ONNX. Whether you’re developing complex NLP models, enabling real-time image and video analysis, or deploying AI-driven business solutions like IBM Watson, DataRobot, or H2O.ai, our AI servers provide the high throughput, scalability, and low latency needed to accelerate AI innovation across your organization.
Key Features
StoneFly’s GPU-based AI servers are designed to deliver unmatched performance, scalability, and efficiency for demanding AI, deep learning, generative AI, and machine learning workloads. Equipped with NVIDIA GPUs, these AI servers provide a robust infrastructure for training complex models, running real-time inference, and handling large-scale data analytics with ease.
High-Performance GPU Acceleration
Equipped with multiple NVIDIA L40S GPUs per appliance for fast AI model training and data processing.
Scalable for AI Workloads
Easily expand resources to handle growing deep learning and machine learning demands without forklift upgrades.
Support for Leading AI Frameworks
Compatible with popular AI frameworks and tools such as CUDA, TensorFlow, PyTorch, and ONNX, ensuring seamless integration with existing workflows.
StoneFly NVIDIA GPU AI Servers Performance Highlights
- Tensor Performance (FP8): 1,466 TFLOPS
- RT Core Performance: 212 TFLOPS
- Single-Precision (FP32) Performance: 91.6 TFLOPS
Available StoneFly AI Servers
- Up to 4x L40S GPUs
- 2U Systems with Up to 8x Hot-Swappable NVMe/SAS Drives
- Up to 10x L40S GPUs
- 4U/5U Systems with Up to 24x Hot-Swappable NVMe/SAS Drives
- Up to 20x L40S GPUs
- 8U Systems with Support for Multi-Node Blade Servers
Comparing StoneFly NVIDIA L40S Systems vs NVIDIA HGX A100 vs NVIDIA H100 NVL
| | NVIDIA L40S | NVIDIA HGX A100 | NVIDIA H100 NVL |
|---|---|---|---|
| Best For | Universal GPU for Gen AI | Highest Perf Multi-Node AI | Gen AI Performance |
| GPU Architecture | NVIDIA Ada Lovelace | NVIDIA Ampere | NVIDIA Hopper |
| FP64 | N/A | 9.7 TFLOPS | 68 TFLOPS |
| FP32 | 91.6 TFLOPS | 19.5 TFLOPS | 134 TFLOPS |
| RT Core | 212 TFLOPS | N/A | N/A |
| TF32 Tensor Core* | 366 TFLOPS | 312 TFLOPS | 1,979 TFLOPS |
| FP16/BF16 Tensor Core* | 733 TFLOPS | 624 TFLOPS | 3,958 TFLOPS |
| FP8 Tensor Core* | 1,466 TFLOPS | N/A | 7,916 TFLOPS |
| INT8 Tensor Core* | 1,466 TOPS | 1,248 TOPS | 7,916 TOPS |
| GPU Memory | 48 GB GDDR6 | 80 GB HBM2e | 188 GB HBM3 w/ ECC |
| GPU Memory Bandwidth | 864 GB/s | 2,039 GB/s | 7.8 TB/s |
| L2 Cache | 96 MB | 40 MB | 100 MB |
| Media Engines | 3 NVENC (+AV1), 3 NVDEC, 4 NVJPEG | 0 NVENC, 5 NVDEC, 5 NVJPEG | 0 NVENC, 14 NVDEC, 14 NVJPEG |
| Power | Up to 350 W | Up to 400 W | 2x 350-400 W |
| Form Factor | 2-slot FHFL | 8-way HGX | 2x 2-slot FHFL |

*Tensor Core performance shown with sparsity.
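To put the table in perspective, the peak figures can be compared directly. The sketch below is a minimal, hypothetical helper (not a StoneFly tool) that computes theoretical peak-spec ratios from the numbers in the table above; these are datasheet ratios, not measured benchmark results.

```python
# Hypothetical helper: compare headline specs from the table above.
# All values are taken from the comparison table; ratios are
# theoretical peaks only, not measured performance.

SPECS = {
    "L40S":     {"fp8_tflops": 1466, "mem_bw_gbs": 864},
    "HGX A100": {"fp8_tflops": None, "mem_bw_gbs": 2039},  # A100 has no FP8 Tensor Cores
    "H100 NVL": {"fp8_tflops": 7916, "mem_bw_gbs": 7800},  # 7.8 TB/s = 7,800 GB/s
}

def peak_ratio(gpu_a: str, gpu_b: str, key: str) -> float:
    """Ratio of GPU A's peak spec to GPU B's (theoretical only)."""
    a, b = SPECS[gpu_a][key], SPECS[gpu_b][key]
    if a is None or b is None:
        raise ValueError(f"{key} not available for this comparison")
    return a / b

print(f"H100 NVL vs L40S, FP8 peak:  {peak_ratio('H100 NVL', 'L40S', 'fp8_tflops'):.1f}x")
print(f"H100 NVL vs L40S, memory BW: {peak_ratio('H100 NVL', 'L40S', 'mem_bw_gbs'):.1f}x")
```

Real-world gains depend on the workload: bandwidth-bound inference tracks the memory figure more closely than the compute peak, which is why the L40S remains cost-effective for many generative AI deployments despite the gap in raw TFLOPS.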