
Hadoop cluster capacity planning best practices

Submitted on May 14, 2018 – 10:41 pm

The first rule of Hadoop cluster capacity planning is that Hadoop can accommodate changes. If you overestimate your storage requirements, you can scale the cluster down. If you need more storage than you budgeted for, you can start out with a small cluster and add nodes as your data set grows.

Another best practice for Hadoop cluster capacity planning is to consider data redundancy needs. One of the advantages of storing data in a Hadoop cluster is that it replicates data, which protects against data loss. These replicas consume storage space, which you must factor into your capacity planning efforts. If you estimate that you will have 5 TB of data, and you opt for the Hadoop default of three replicas, your cluster must accommodate 15 TB of data.
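The replication arithmetic above can be sketched in a few lines. This is a minimal capacity estimator, not part of any Hadoop tooling; the function name and parameters are illustrative, and it assumes HDFS's default replication factor of three:

```python
def raw_capacity_tb(data_tb, replication_factor=3):
    """Raw storage a cluster must provide: every block is
    stored replication_factor times (HDFS default is 3)."""
    return data_tb * replication_factor

# 5 TB of data at the default three replicas requires 15 TB of raw storage.
print(raw_capacity_tb(5))
```

In practice you would also pad this figure for intermediate job output and operational headroom, but the replica multiplier is the starting point.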

To read the entire article, please click on the link to the original post.