Top 10 Open Source Big Data Databases


A look at some of the most interesting examples of open source Big Data databases in use today.

The databases and data warehouses you’ll find on these pages are the true workhorses of the Big Data world. They hold and help manage the vast reservoirs of structured and unstructured data that make it possible to mine for insight with Big Data.

Businesses rely heavily on these open source solutions, from tools like Cassandra (originally developed by Facebook) to the well-regarded MongoDB, which was designed to support the biggest of big data loads. And the tools live up to expectations: OrientDB, for example, can store up to 150,000 documents per second. The organizations that rely on these open source databases range from Boeing to Comcast to the Danish government. It’s accurate to say that, as much as any tool set, the software listed on these pages plays a central role in today’s global business marketplace.

1. Cassandra

Originally developed by Facebook, this NoSQL database is now managed by the Apache Software Foundation. It’s used by many organizations with large, active datasets, including Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco and Digg. Commercial support and services are available through third-party vendors. Operating System: OS Independent.

2. HBase

Another Apache project, HBase is the non-relational data store for Hadoop. Features include linear and modular scalability, strictly consistent reads and writes, automatic failover support and much more. Operating System: OS Independent.

3. MongoDB

MongoDB was designed to support humongous databases. It’s a NoSQL database with document-oriented storage, full index support, replication and high availability, and more. Commercial support is available through 10gen. Operating system: Windows, Linux, OS X, Solaris.

4. Neo4j

The “world’s leading graph database,” Neo4j boasts performance improvements up to 1000x or more versus relational databases. Interested organizations can purchase advanced or enterprise versions from Neo Technology. Operating System: Windows, Linux.

5. CouchDB

Designed for the Web, CouchDB stores data in JSON documents that you can access via the Web or query using JavaScript. It offers distributed scaling with fault-tolerant storage. Operating system: Windows, Linux, OS X, Android.
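To make the idea of JSON document storage more concrete, here is a minimal sketch in Python. It does not talk to a CouchDB server and does not use CouchDB's API; the documents, field names, and the helpers `map_posts_by_author` and `build_view` are hypothetical, chosen only to illustrate how schemaless JSON documents can be indexed by a map-style function, loosely in the spirit of a JavaScript view.

```python
import json

# Hypothetical documents; the field names are illustrative, not a CouchDB schema.
docs = [
    {"_id": "post-2", "type": "post", "author": "bob", "title": "Big Data"},
    {"_id": "post-1", "type": "post", "author": "alice", "title": "Hello"},
    {"_id": "user-1", "type": "user", "name": "alice"},
]


def map_posts_by_author(doc):
    """Yield (key, value) pairs for documents of interest and skip the rest."""
    if doc.get("type") == "post":
        yield doc["author"], doc["title"]


def build_view(documents, map_fn):
    """Apply map_fn to every document and return rows sorted by key, then id."""
    rows = [
        (key, value, doc["_id"])
        for doc in documents
        for key, value in map_fn(doc)
    ]
    return sorted(rows, key=lambda row: (row[0], row[2]))


view = build_view(docs, map_posts_by_author)

# Each document round-trips through plain JSON text.
serialized = json.dumps(docs[0], sort_keys=True)
restored = json.loads(serialized)
```

In this sketch the user document is simply skipped by the map function, which is the usual way a document store handles records of different shapes living side by side.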

6. OrientDB

This NoSQL database can store up to 150,000 documents per second and can load graphs in just milliseconds. It combines the flexibility of document databases with the power of graph databases, while supporting features such as ACID transactions and fast indexes.

7. Terrastore

Based on Terracotta, Terrastore boasts “advanced scalability and elasticity features without sacrificing consistency.” It supports custom data partitioning, event processing, push-down predicates, range queries, map/reduce querying and processing and server-side update functions. Operating System: OS Independent.

8. FlockDB

Best known as Twitter’s database, FlockDB was designed to store social graphs (i.e., who is following whom and who is blocking whom). It offers horizontal scaling and very fast reads and writes. Operating System: OS Independent.
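A social graph of this kind can be pictured as labeled edges between user IDs. The sketch below is a minimal in-memory illustration in Python, not FlockDB's API or storage design; the `SocialGraph` class, its method names, and the edge labels `"follows"` and `"blocks"` are assumptions made for the example. It keeps each edge in both directions so that "who does this user follow?" and "who follows this user?" are both direct lookups.

```python
from collections import defaultdict


class SocialGraph:
    """Hypothetical in-memory store of labeled, directed edges between user IDs."""

    def __init__(self):
        # kind -> source -> set of destinations
        self._forward = defaultdict(lambda: defaultdict(set))
        # kind -> destination -> set of sources
        self._backward = defaultdict(lambda: defaultdict(set))

    def add_edge(self, kind, source, dest):
        self._forward[kind][source].add(dest)
        self._backward[kind][dest].add(source)

    def remove_edge(self, kind, source, dest):
        self._forward[kind][source].discard(dest)
        self._backward[kind][dest].discard(source)

    def outgoing(self, kind, source):
        """Users that `source` points at, e.g. the accounts it follows."""
        return sorted(self._forward[kind][source])

    def incoming(self, kind, dest):
        """Users that point at `dest`, e.g. its followers."""
        return sorted(self._backward[kind][dest])


graph = SocialGraph()
graph.add_edge("follows", 1, 2)
graph.add_edge("follows", 3, 2)
graph.add_edge("follows", 2, 1)
graph.add_edge("blocks", 2, 3)

followers_of_2 = graph.incoming("follows", 2)   # users 1 and 3
followed_by_2 = graph.outgoing("follows", 2)    # user 1
blocked_by_2 = graph.outgoing("blocks", 2)      # user 3
```

Storing every edge twice costs extra memory, but it avoids scanning the whole graph to answer the reverse question.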

9. Hibari

Used by many telecom companies, Hibari is a key-value, big data store with strong consistency, high availability and fast performance. Support is available through Gemini Mobile. Operating System: OS Independent.

10. Riak

Riak humbly claims to be “the most powerful open-source, distributed database you’ll ever put into production.” Users include Comcast, Yammer, Voxer, Boeing, SEOMoz, Joyent, DotCloud, Formspring, the Danish Government and many others. Operating System: Linux, OS X.