Hadoop and big data go hand in hand; however, many companies feel Hadoop falls short on certain enterprise features they need. StoneFly’s Scale-Out NAS Storage offers an enterprise-grade alternative to the underlying Hadoop Distributed File System (HDFS) that lets you keep data in a POSIX-compatible storage environment while performing big data analytics with the Hadoop MapReduce framework.
To overcome the traditional limitations of hardware-based storage, StoneFly™ has created an HDFS plug-in that enables MapReduce to run directly on StoneFly’s Scale-Out NAS Storage. The plug-in uses Scale-Out NAS Storage volumes to run Hadoop jobs across multiple namespaces, allowing you to perform in-place analytics without migrating data in or out of HDFS.
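Hadoop plug-ins of this kind are typically registered through Hadoop’s core-site.xml, mapping a custom URI scheme to the plug-in’s FileSystem class so that jobs address the NAS volume instead of hdfs://. The sketch below shows the general pattern only; the property name, Java class, and URI scheme are illustrative assumptions, not StoneFly’s actual values, which would come from the plug-in’s own documentation.

```xml
<!-- core-site.xml: illustrative sketch only. The property key, class
     name, and "stonefly://" scheme are placeholders, not the actual
     identifiers shipped with the StoneFly plug-in. -->
<configuration>
  <property>
    <!-- Map a custom URI scheme to the plug-in's FileSystem implementation -->
    <name>fs.stonefly.impl</name>
    <value>com.stonefly.hadoop.fs.StoneFlyFileSystem</value>
  </property>
  <property>
    <!-- Point the cluster's default filesystem at a Scale-Out NAS volume
         instead of an HDFS NameNode -->
    <name>fs.defaultFS</name>
    <value>stonefly://nas-pool-1/analytics-volume</value>
  </property>
</configuration>
```

With a mapping like this in place, existing MapReduce jobs resolve their input and output paths against the NAS volume transparently, which is what makes in-place analytics possible without a data migration step.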

Integrating the plug-in into the Hadoop ecosystem goes well beyond MapReduce and HDFS. The Hadoop plug-in is compatible with Hadoop-based applications and supports technologies such as Hive, Pig, HBase, Tez, Sqoop, Flume, and more!
In this example we see four Scale-Out NAS Storage servers in a trusted storage pool, split between two zones for high availability. A separate server runs the Ambari management console, the YARN ResourceManager, and the Job History Server. This architecture eliminates the centralized metadata server and supports a fully fault-tolerant system with two- or three-way replication across a cluster that can scale anywhere from 2 to 128 nodes.
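In a layout like the one above, each storage node’s Hadoop configuration points at the dedicated management server for resource scheduling and job history. The fragment below uses standard Apache Hadoop properties to sketch that wiring; the hostname “mgmt-node” is a placeholder for the management server in this example, and 10020 is Hadoop’s default Job History Server IPC port.

```xml
<!-- yarn-site.xml on each storage node: direct NodeManagers at the
     dedicated management server ("mgmt-node" is a placeholder hostname) -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>mgmt-node</value>
</property>

<!-- mapred-site.xml: locate the Job History Server on the same host,
     using its default port (10020) -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>mgmt-node:10020</value>
</property>
```

Keeping the ResourceManager and Job History Server off the storage nodes, as described above, separates cluster management from the data path, so the storage pool itself has no single coordinating server to lose.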

The StoneFly Scale-Out NAS Storage plug-in for Apache Hadoop makes it painless and cost-effective to run analytics on data in Apache Hadoop, eliminating many of the challenges enterprises face when working with the Hadoop Distributed File System.
Get in touch with us to learn more about StoneFly’s Scale-Out NAS Storage.
Want new articles before they get published?
Subscribe to our Awesome Newsletter.