StoneFly AI Servers Powered by Nvidia GPU
Available Now
Accelerate your AI workloads with high-performance NVIDIA L40S, A100, and H100 GPU-based AI servers built for deep learning, machine learning, generative AI, large language models (LLMs), and data analytics. Scale effortlessly, optimize processing power, and reduce training times to drive innovation in real time.
Multi-Core Processor(s)
NVME SSD and SAS Support
Up to 100Gb Network
Breakthrough Performance for Artificial Intelligence
StoneFly GPU-based AI servers are optimized to handle the most demanding AI workloads, from training Generative AI models and Large Language Models (LLMs) to powering image generation and deep learning applications. Equipped with NVIDIA L40S, H100, and A100 GPUs and optional advanced interconnects like NVLink and NVSwitch, these servers deliver exceptional performance for enterprise AI tools such as CUDA, TensorFlow, PyTorch, and ONNX. Whether you’re developing complex NLP models, enabling real-time image and video analysis, or deploying AI-driven business solutions like IBM Watson, DataRobot, or H2O.ai, our AI servers provide the high throughput, scalability, and low latency needed to accelerate AI innovation across your organization.
Optional
Integrated or Disaggregated Shared Multi-Tier Low Latency, High Performance Storage for AI
StoneFly AI servers offer optional integrated or disaggregated shared multi-tier high-performance storage, designed for low-latency access and scalability. The system supports block, file, and S3 object storage within the same appliance, or as a shared repository, providing flexibility for various workloads. The architecture enables near-zero latency between storage and AI processing, ensuring rapid access to critical data. As demands grow, storage capacity can be expanded seamlessly without performance bottlenecks. This makes it an ideal choice for AI projects requiring large datasets and fast I/O performance.
Key Features
StoneFly’s GPU-based AI servers are designed to deliver unmatched performance, scalability, and efficiency for demanding AI, deep learning, generative AI, and machine learning workloads. Equipped with NVIDIA GPUs, these AI servers provide a robust infrastructure for training complex models, running real-time inferences, and handling large-scale data analytics with ease.
High-Performance GPU Acceleration
Equipped with multiple NVIDIA L40S GPUs per appliance for fast AI model training and data processing.
Scalable for AI Workloads
Easily expand resources to handle growing deep learning and machine learning demands without forklift upgrades.
Support for Leading AI Frameworks
Compatible with popular AI frameworks such as CUDA, TensorFlow, PyTorch, and ONNX, ensuring seamless integration with existing workflows.
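As a quick illustration of framework integration, the sketch below shows the kind of device-selection logic a PyTorch-style workload typically runs on a GPU server. The `select_device` helper is hypothetical, shown only to illustrate the pattern; it is not part of any StoneFly product or the PyTorch API.

```python
# Illustrative sketch: choosing a compute device on a GPU-equipped AI server.
# The helper mirrors common PyTorch practice; the function name is hypothetical.

def select_device(cuda_available: bool, gpu_count: int) -> str:
    """Prefer the first CUDA GPU when one is present, else fall back to CPU."""
    if cuda_available and gpu_count > 0:
        return "cuda:0"
    return "cpu"

# In an actual PyTorch workflow, the inputs come from the framework itself:
#   import torch
#   device = select_device(torch.cuda.is_available(), torch.cuda.device_count())
#   model = model.to(device)
```

The same pattern carries over to TensorFlow (`tf.config.list_physical_devices("GPU")`) and ONNX Runtime execution providers, which is why framework-agnostic device selection is usually handled once at workload startup.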
StoneFly NVIDIA GPU AI Servers Performance Highlights
- Tensor Performance (TFLOPS)
- RT Core Performance (TFLOPS)
- Single-Precision Performance (TFLOPS)
Integrate 300+ Sources to Your AI Servers with StoneFly SourceConnect™
Integrating SourceConnect™ with StoneFly’s AI servers enables you to combine data from over 300 sources, including applications, databases, SaaS environments, and cloud, into a single location. Leverage advanced transformation and SQL modeling capabilities to optimize your datasets, making them ready for high-performance AI-based analytics. Handle and process large-scale datasets with ease, and set up near real-time sync for critical data, ensuring up-to-date insights. This powerful integration supports seamless data replication, secure backups, and efficient AI workloads for comprehensive, advanced analytics in a robust, scalable environment tailored for AI-driven applications.
Available StoneFly AI Servers
- Up to 4x L40S GPUs
- 2U Systems with Up to 8x Hot-Swappable NVMe/SAS Drives
- Up to 10x L40S GPUs
- 4U/5U Systems with Up to 24x Hot-Swappable NVMe/SAS Drives
- Up to 20x L40S GPUs
- 8U Systems with Support for Multi-Node Blade Servers
Comparing StoneFly NVIDIA L40S Systems vs NVIDIA HGX A100 vs NVIDIA H100 NVL
| | NVIDIA L40S | NVIDIA HGX A100 | NVIDIA H100 NVL |
|---|---|---|---|
| Best For | Universal GPU for Gen AI | Highest Perf Multi-Node AI | Gen AI performance |
| GPU Architecture | NVIDIA Ada Lovelace | NVIDIA Ampere | NVIDIA Hopper |
| FP64 | N/A | 9.7 TFLOPS | 68 TFLOPS |
| FP32 | 91.6 TFLOPS | 19.5 TFLOPS | 134 TFLOPS |
| RT Core | 212 TFLOPS | N/A | N/A |
| TF32 Tensor Core* | 366 TFLOPS | 312 TFLOPS | 1,979 TFLOPS |
| FP16/BF16 Tensor Core* | 733 TFLOPS | 624 TFLOPS | 3,958 TFLOPS |
| FP8 Tensor Core* | 1,466 TFLOPS | N/A | 7,916 TFLOPS |
| INT8 Tensor Core* | 1,466 TOPS | 1,248 TOPS | 7,916 TOPS |
| GPU Memory | 48 GB GDDR6 | 80 GB HBM2e | 188 GB HBM3 w/ ECC |
| GPU Memory Bandwidth | 864 GB/s | 2,039 GB/s | 7.8 TB/s |
| L2 Cache | 96 MB | 40 MB | 100 MB |
| Media Engines | 3 NVENC (+AV1), 3 NVDEC, 4 NVJPEG | 0 NVENC, 5 NVDEC, 5 NVJPEG | 0 NVENC, 14 NVDEC, 14 NVJPEG |
| Power | Up to 350 W | Up to 400 W | 2x 350-400 W |
| Form Factor | 2-slot FHFL | 8-way HGX | 2x 2-slot FHFL |

\* With sparsity.