Cassandra vs. MongoDB: Performance and Feature Comparison

For organizations considering their open source NoSQL database options, Cassandra and MongoDB are often near the top of the list. However, Cassandra and MongoDB are very different options, each with their own unique strengths, weaknesses, and use cases.

In this blog, we look at those differences and compare the performance, benefits, and use cases of Cassandra vs. MongoDB.

  • Comparing Cassandra vs. MongoDB
  • Cassandra vs. MongoDB: Performance
  • Cassandra vs. MongoDB: Key Differences
  • Cassandra Use Cases
  • MongoDB Use Cases
  • Final Thoughts

Comparing Cassandra vs. MongoDB

On the surface, Cassandra and MongoDB may look similar. Both are NoSQL databases, both are open source software, and neither was designed around the ACID guarantees of a traditional relational database. But before we delve too deep into their benefits and features, let’s get an overview of each of these open-source NoSQL DBMSs.

What is Cassandra?

Cassandra is an open source NoSQL database written in Java and maintained by the Apache Software Foundation.

It offers high availability and scalability, and it can handle large volumes of data as well as unstructured data types. Because it does not require a fixed schema, Cassandra can handle tasks like replication more easily than many other databases.

Apache Cassandra was originally developed at Facebook to handle inbox search. It was made open source in 2008 and became an Apache project in 2009.

What is MongoDB?

MongoDB is a popular NoSQL document-oriented database developed by MongoDB Incorporated.

The name “Mongo” comes from “humongous,” a nod to the database’s ability to store huge amounts of JSON data. Unlike rows in a relational database management system (RDBMS), documents do not have to share a fixed schema. This means that data that would otherwise live in related tables can be combined into a single document, denormalizing the data.
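As a rough illustration of that denormalization, the sketch below folds rows from two hypothetical relational-style tables into one nested document. The table names, field names, and the embed_orders helper are assumptions made for this example and do not come from any MongoDB API.

```python
# Relational-style rows: one customer, with orders kept in a separate table.
# All names and values here are hypothetical sample data.
customers = [{"customer_id": 1, "name": "Ana"}]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 30},
    {"order_id": 11, "customer_id": 1, "total": 15},
]


def embed_orders(customer, all_orders):
    """Return one document with the customer's orders nested inside it."""
    nested = [
        {"order_id": o["order_id"], "total": o["total"]}
        for o in all_orders
        if o["customer_id"] == customer["customer_id"]
    ]
    return {"_id": customer["customer_id"], "name": customer["name"], "orders": nested}


# The join that an RDBMS would perform at query time is done once, at write time.
document = embed_orders(customers[0], orders)
print(document)
```

Reading the customer together with their orders then becomes a single document lookup, at the cost of duplicating data if the same order details are needed elsewhere.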

Cassandra vs. MongoDB: Performance

Performance differences between Cassandra and MongoDB are difficult to quantify. You can’t take the same application and data model, test it on both platforms, and conclude that one database performs better than the other. Each is suited to different types of data models and workloads. Instead, focus on how each database addresses consistency, availability, and partition tolerance in light of your application requirements.

Write Performance

Both MongoDB and Cassandra direct writes to a primary node, but they handle writes differently. Because Cassandra supports multiple primary nodes, its architecture allows it to handle many simultaneous writes on more than one node. This generally makes it more write-efficient than MongoDB, which is limited to one writable primary node per replica set; its secondary servers can only be used for reads.

Read Performance

This is where MongoDB really shines, especially with consistency. By default, secondary nodes do not accept read requests, but they can easily be configured for reads by setting a “read preference”. This means that the entire replica set can serve reads, although Cassandra can be configured to do the same.
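One common place to set a read preference is the connection string. The sketch below builds such a URI using only the Python standard library; the host names and replica set name are hypothetical, and the replica_set_uri helper is an illustration rather than part of any driver. The replicaSet and readPreference option names and the five mode values are the ones MongoDB connection strings accept.

```python
from urllib.parse import urlencode

# Read preference modes accepted in a MongoDB connection string.
READ_PREFERENCES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a connection URI that lets secondaries serve reads (illustrative helper)."""
    if read_preference not in READ_PREFERENCES:
        raise ValueError(f"unknown read preference: {read_preference}")
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{options}"


# Hypothetical hosts and replica set name.
uri = replica_set_uri(["db1.example.net:27017", "db2.example.net:27017"], "rs0")
print(uri)
```

With secondaryPreferred, reads go to secondaries when one is available and fall back to the primary otherwise, so reads served this way may lag slightly behind the latest write.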

MongoDB has an advantage if your data model includes nested objects that require indexes, as it has better support for secondary indexes. Cassandra, by contrast, has only limited support for secondary indexes, which are restricted to individual columns and equality comparisons. If you are mainly querying by primary key, Cassandra is a strong choice. It really depends on how you model and query your data.

Both Cassandra and MongoDB scale up and out seamlessly. Each has generous, documented limits on the number of nodes in a cluster, but keep in mind that there are practical limits based on the server architecture.

Cassandra vs. MongoDB: Key Differences

As mentioned above, there are a number of differences between Cassandra and MongoDB, including query language/data model, architecture, aggregation, consistency, data types, and more.

Query Language / Data Model

The most striking difference between the two databases is the query language and data model. Cassandra mimics traditional RDBMS SQL with its own Cassandra Query Language (CQL) and a table structure of rows and columns in which different rows can store different columns. MongoDB has a JavaScript command-line interface and organizes its documents into collections and databases using a JSON format. MongoDB provides the most flexibility, letting you define either a fixed or a variable schema for your data.

Architecture

As discussed above, MongoDB uses a primary/secondary architecture in which a single node accepts writes and the rest can serve read requests. In Cassandra, every node is an active member of a ring and passes queries to the appropriate node where the data resides. This gives Cassandra an always-on architecture that stays available when a node fails. Failover within a MongoDB replica set can take up to a minute while a secondary is promoted to writable primary.
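The ring idea can be sketched in a few lines of Python. This is a simplified model, not Cassandra’s implementation: MD5 stands in for Cassandra’s real partitioner, and the Ring class, node names, virtual node count, and replication factor are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns several tokens (virtual nodes)."""

    def __init__(self, nodes, vnodes=8):
        self._owners = {}
        for node in nodes:
            for i in range(vnodes):
                self._owners[self._hash(f"{node}#{i}")] = node
        self._tokens = sorted(self._owners)

    @staticmethod
    def _hash(value):
        # MD5 is only a stand-in here for a real partitioner's hash function.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, partition_key, replication_factor=3):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect.bisect_right(self._tokens, self._hash(partition_key))
        found = []
        for offset in range(len(self._tokens)):
            token = self._tokens[(start + offset) % len(self._tokens)]
            node = self._owners[token]
            if node not in found:
                found.append(node)
            if len(found) == replication_factor:
                break
        return found


# Hypothetical four-node cluster; any node could coordinate this lookup.
ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("sensor-42"))
```

Because every node can compute the same mapping, any node can accept a request and forward it to the replicas that own the key, which is why no single writable primary is needed in this model.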

Aggregation

MongoDB includes an aggregation framework that transforms data in stages rather than processing a large dataset in one pass, whereas Cassandra requires an external tool such as Apache Hadoop or Spark for this kind of processing.
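To show what transforming data in stages means, here is a plain-Python imitation of a three-stage pipeline. It is loosely modeled on MongoDB’s $match, $group, and $sort stages but does not use MongoDB itself; the function names and the sample orders are hypothetical.

```python
from collections import defaultdict


def match(docs, **criteria):
    """Stage 1: keep only documents whose fields equal the given values."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum(docs, key, field):
    """Stage 2: group documents by `key` and sum `field` within each group."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return [{"_id": k, "total": v} for k, v in totals.items()]


def sort_desc(docs, field):
    """Stage 3: order the grouped results from largest to smallest."""
    return sorted(docs, key=lambda d: d[field], reverse=True)


# Hypothetical sample data.
orders = [
    {"customer": "ana", "status": "shipped", "amount": 30},
    {"customer": "ben", "status": "shipped", "amount": 50},
    {"customer": "ana", "status": "pending", "amount": 20},
    {"customer": "ana", "status": "shipped", "amount": 15},
]

# Each stage consumes the output of the previous one.
result = sort_desc(group_sum(match(orders, status="shipped"), "customer", "amount"), "total")
print(result)
```

Each stage narrows or reshapes the data before handing it on, so later stages work on less data than the original collection.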

Consistency

MongoDB is strong on consistency: you can query multiple nodes in a replica set and get the same data. Cassandra offers adjustable consistency at the cost of performance.

Data Types

MongoDB has a rich set of data types, including String, Numeric, Boolean, Min/Max, Arrays, Timestamps, Object, Null, Symbol, Date, Object ID, Binary, Code, and Regular Expression. Cassandra supports built-in data types (similar to MongoDB’s), collections (maps, sets, and lists), and user-defined data types. The maximum document size in MongoDB is 16 megabytes, whereas Cassandra allows up to 2 GB for a single column (not recommended).
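Returning briefly to the adjustable consistency mentioned for Cassandra: a common rule of thumb is that a read is guaranteed to overlap the most recent successful write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below only does that arithmetic; the function names are hypothetical and nothing here talks to a real cluster.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_acks + write_acks > replication_factor


rf = 3  # hypothetical replication factor

# Quorum writes plus quorum reads overlap on at least one replica.
print(quorum(rf), reads_see_latest_write(rf, quorum(rf), quorum(rf)))

# Writing to one replica and reading from one replica is faster,
# but a read may miss the latest write.
print(reads_see_latest_write(rf, 1, 1))
```

This is the trade-off behind the phrase "at the cost of performance": waiting for more replicas to acknowledge a read or write strengthens consistency but adds latency.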

Security

MongoDB and Cassandra both support role-based access control as well as node-to-node and client-to-node TLS/SSL transport security. MongoDB requires an enterprise license for LDAP and Kerberos authentication, whereas free extensions for LDAP and Kerberos can be downloaded for Cassandra.

Apache Cassandra 4 added audit logging to track user access and activity, and it allows live traffic to be captured and replayed.

Encryption at rest is an enterprise feature for both MongoDB and Cassandra. Without it, you would need to explore volume-level or operating system file encryption for this functionality.

Licensing

MongoDB has a multiple licensing system. All versions released after October 16, 2018 are covered by the Server Side Public License, while previous versions are licensed under the GNU Affero General Public License v3. Enterprise versions of MongoDB are also available under their own licenses. Cassandra is simply licensed under the well-known Apache License v2.0.

Cassandra Use Cases

Cassandra handles write-heavy workloads such as transaction logging, time series data, inventory tracking, Internet of Things status and events, weather tracking, and much more where data is inserted and rarely updated. Cassandra is a great choice for geographically distributed data.

Cassandra favors partition tolerance and availability over things like write consistency. While Cassandra is a capable database, it doesn’t have all the features of a relational database, such as transactions or a way to lock data. Joins are also not implemented in Cassandra.

MongoDB Use Cases

MongoDB is also a great choice for many of the big data workloads where Cassandra does well, including analytics and time series data. MongoDB’s built-in aggregation framework allows you to extract data into a central database that provides a single view of your data. Other great use cases for MongoDB include content management systems, product data management, real-time data integration, and high-speed data logging.

Final Thoughts

Choosing the right database is a big decision for any team, and one that should always focus on the needs of the application now and what those needs will be in the future. With this in mind, Cassandra and MongoDB can be great choices for inclusion in modern, scalable data stacks.

Get support for your open source data stack

From ActiveMQ to MongoDB, OpenLogic can support the open source data technologies that power your business. Talk to an expert today to see how 24/7/365 support from our enterprise architects can help your team find success.
