All too often, we receive questions of the following nature when it comes to how to manage your data:
- What database is best for my application?
- Can I get better performance by switching to a NoSQL database?
- How should I store files in my database?
Unfortunately, there are no simple answers; each needs to be considered on the basis of the application’s function and overall use within the organization. However, there are some general elements to consider.
Relational Data Stores
The most common data store we encounter are relational, or transactional, databases; such as MySQL, PostgreSQL, SQL Server, or Oracle. These systems are typically row-oriented data stores, highly normalized for referential integrity, supporting ACID-compliant transactions. The database is typically paired with the underlying web framework, comprising the model component of an MVC application. Though each database vendor offers unique features, switching from one to another won’t have a noticeable benefit.
Web frameworks are designed for model performance, as most will need at a minimum some level of user credentials (authentication) and permissions (authorization). However, many other elements of an application, while functional within a relational data store, are better served by another system.
Key-Value Data Stores
Most web applications store session states for subsequent requests; a key-value store such as Memcached or Redis could be used as the session data store. Compared to relational data stores, key-value data stores provide much lower latency with higher throughput, as they are implemented fully in-memory. In addition to sessions, a key-value store could be used as a data store for cached values, such as expensive query results or site configuration. Other simple state-based data, i.e. shopping carts or user preferences, as well as real-time data will benefit from key-value data stores. Due to the nature of in-memory systems, persistence and availability should be a factor in the decision to implement a key-value store in an application.
Document Data Stores
Sometimes referred to as “NoSQL” or “nonrelational", document databases store native JSON objects in semi-structured models. Document stores such as MongoDB, Elasticsearch, or DynamoDB allow for structurally independent objects to be stored alongside each other. Assume you have a CMS site with complex content needs. A document store could be used to support your content catalog, where each document (i.e. row) in the catalog (i.e. table) could have varying attributes (i.e. columns). Searching and aggregating data in a document store is much easier than maintaining complex joins, and horizontal scaling is much easier with document databases than relational data stores.
In addition to data attributes, applications might need to store files, i.e. static assets, user uploads, or even multimedia. These various objects, while possible to store in a database, are better served from object data stores as name-value pairs. The most common object storage we encounter is S3, but many compatible services exist from other cloud hosting providers. Compared to traditional file systems, object storage is more flexible in size and format but less consistent; i.e. data read after an update are eventually consistent.
Object storage can also be used for any type of data, from structured (csv) and semi-structured (JSON) data to unstructured data dumps of any system. This allows for a central repository of data, also known as a data lake, independent of structure or content. Big data frameworks such as Spark, Presto, and Hive can natively integrate into the data storage, allowing for analytic consumption and visualization of the data store.
Of course, you are not limited to only one data store for your application. Your architecture should allow for multiple data use cases and include the appropriate store for each. As the old adage goes, “If you only have a hammer, everything starts to look like a nail.” Instead of asking “what database is best for my application”, make sure you are instead asking “which store best supports this data type”?
There are many other database types beyond the common ones listed here, including data warehouse stores, time series (IoT), graph databases, and ledger (blockchain) databases. Each has a unique benefit for a specific dataset, desired workload, or use case. Be sure to explore all the options, or contact us for a free 1/2 day assessment.