Relational databases were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the commodity storage and processing power available today.
NoSQL encompasses a wide variety of different database technologies that were developed in response to the demands presented in building modern applications:
The Benefits of NoSQL
When compared to relational databases, NoSQL databases are more scalable and provide superior performance, and their data model addresses several issues that the relational model is not designed to address:
- Supports large volumes of dynamic schema (structured, semi-structured, and unstructured data). Using NoSQL is especially useful in agile development environments where changes to schema happen frequently which would require migration of the entire database involving significant downtime
- Automatic Sharding – NoSQL databases, on the other hand, usually support auto-sharding, meaning that they natively and automatically spread data across an arbitrary number of servers, without requiring the application to even be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down, it can be quickly and transparently replaced with no application disruption.
Most NoSQL databases also support automatic database replication to maintain availability in the event of outages or planned maintenance events. More sophisticated NoSQL databases are fully self-healing, offering automated failover and recovery, as well as the ability to distribute the database across multiple geographic regions to withstand regional failures and enable data localization. Unlike relational databases, NoSQL databases generally have no requirement for separate applications or expensive add-ons to implement replication.
A number of products provide a caching tier for SQL database systems. These systems can improve read performance substantially, but they do not improve write performance, and they add operational complexity to system deployments. If your application is dominated by reads then a distributed cache could be considered, but if your application has just a modest write volume, then a distributed cache may not improve the overall experience of your end users, and will add complexity in managing cache invalidation.
Many NoSQL database technologies have excellent integrated caching capabilities, keeping frequently-used data in system memory as much as possible and removing the need for a separate caching layer. Some NoSQL databases also offer fully managed, integrated in-memory database management layer for workloads demanding the highest throughput and lowest latency.
NoSQL Database Types
- Document databases pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.
- Graph stores are used to store information about networks of data, such as social connections. Graph stores include Titan, Neo4J and Giraph.
- Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or ‘key’), together with its value. Examples of key-value stores are Riak and Berkeley DB. Some key-value stores, such as Redis, allow each value to have a type, such as ‘integer’, which adds functionality.
- Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.
To be Covered
- Sharding/Data Partitioning Strategy, Multi-tenancy, where they fit in the CAP theorem, supported data types, indexes, transactions, locking/isolation levels, querying capabilities
Some real world use cases for NoSQL
Google Cloud Datastore vs MongoDB vs Amazon DynamoDB