StoneFly’s Scale-Out NAS Storage plug-in for Hadoop

Hadoop and big data go hand in hand; however, many companies feel that Hadoop comes up short on certain enterprise features they need. StoneFly’s Scale-Out NAS Storage offers an enterprise-grade alternative to the underlying Hadoop Distributed File System (HDFS), enabling you to keep data in a POSIX-compatible storage environment while performing big data analytics with the Hadoop MapReduce framework.

To overcome the traditional limitations of hardware-based storage, StoneFly™ has created an HDFS plug-in that enables MapReduce to run directly on StoneFly’s Scale-Out NAS Storage. The plug-in uses Scale-Out NAS Storage volumes to run Hadoop jobs across multiple namespaces, allowing you to perform in-place analytics without migrating data in or out of HDFS.

Integration with the Hadoop ecosystem goes well beyond MapReduce and HDFS. The Hadoop plug-in is compatible with Hadoop-based applications and supports technologies such as Hive, Pig, HBase, Tez, Sqoop, Flume and more.

In this example we see four Scale-Out NAS Storage servers in a trusted storage pool, split between two zones for high availability. A separate server runs the “Ambari” management console, the “YARN Resource Manager” and the “Job History Server”. This architecture eliminates the centralized metadata server and supports a fully fault-tolerant system with two- or three-way replication across a cluster that can scale anywhere from 2 to 128 nodes.
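As a rough planning aid, the effect of two- or three-way replication on capacity can be sketched in a few lines of Python. This is only an illustration under simple assumptions: every node contributes the same raw capacity, each file is stored once per replica, and filesystem overhead and replica placement rules are ignored. The function name and the 10 TB per-node figure are hypothetical and do not come from StoneFly documentation.

```python
def usable_capacity_tb(nodes, raw_tb_per_node, replicas):
    """Estimate usable capacity when every file is stored `replicas` times.

    Assumes identical nodes and ignores filesystem overhead (illustrative only).
    """
    if not 2 <= nodes <= 128:
        raise ValueError("cluster size is described as 2 to 128 nodes")
    if replicas not in (2, 3):
        raise ValueError("replication is described as two- or three-way")
    if nodes < replicas:
        raise ValueError("need at least one node per replica")
    # Each stored byte consumes `replicas` bytes of raw capacity.
    return nodes * raw_tb_per_node / replicas


if __name__ == "__main__":
    # Four nodes as in the example above; 10 TB per node is a made-up figure.
    print(usable_capacity_tb(nodes=4, raw_tb_per_node=10, replicas=2))
```

With these assumed numbers, four 10 TB nodes under two-way replication leave roughly half of the 40 TB raw capacity available for data.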

To eliminate complex and time-consuming code rewrites, StoneFly’s Scale-Out NAS Storage supports data access through several different mechanisms: file access with NFS or SMB, object access with Swift, and access via the Hadoop file-system API. You can use standard Linux tools and utilities such as grep, awk and Python, and take advantage of multi-protocol support including native StoneFly Scale-Out NAS Storage, NFS, SMB, HCFS and Swift.
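Because the data sits in a POSIX-compatible file system, an ordinary script can read it without any Hadoop client libraries. The following is a minimal sketch of that idea, assuming the volume is already mounted on the machine running the script. The mount point `/mnt/scaleout/logs`, the search string and the function name are hypothetical placeholders, not part of the StoneFly product.

```python
import os
from collections import Counter


def count_matching_lines(root, needle):
    """Walk a directory tree and count, per file, the lines containing `needle`."""
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Plain POSIX file I/O; undecodable bytes are replaced, not fatal.
            with open(path, "r", encoding="utf-8", errors="replace") as handle:
                hits = sum(1 for line in handle if needle in line)
            if hits:
                counts[path] = hits
    return counts


if __name__ == "__main__":
    # Hypothetical mount point; substitute the path where your volume is mounted.
    results = count_matching_lines("/mnt/scaleout/logs", "ERROR")
    for path, hits in results.most_common(10):
        print(f"{hits:8d}  {path}")
```

The same files remain available to Hadoop jobs through the file-system API, so a quick check like this can run alongside MapReduce without copying data.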
You can also grow or shrink a cluster on the fly without impacting application availability, and data is rebalanced automatically. Let’s take a closer look at the plug-in in action. From the “Ambari” management console you can start all the services with the click of a button:
We see there are a number of Hadoop services on the “Ambari” manager node. There are also four nodes in the StoneFly Scale-Out NAS Storage cluster.
In the terminal window we see maps and reduces happening in real time on the Scale-Out NAS Storage nodes, and the management console shows us that all the work is complete.
The StoneFly Scale-Out NAS Storage plug-in for Apache Hadoop makes it painless and cost-effective to run analytics on data in Apache Hadoop, eliminating many of the challenges enterprises face when working with the Hadoop Distributed File System.

Get in touch with us to learn more about StoneFly’s Scale-Out NAS Storage.
