As old as it is, we have debate: what kind of database should be used for my system? Usually the answer is “it depends”. We know that it depends on many different terms. I would like to cover certain terms, which can help you identifying and selecting the proper database, based on requirements for your project.
DOCUMENT VS ROW BASED DATA
At the beginning it might be worth mentioning in what way we store the data in these two concepts.
When you have a Sql database, it keeps a row-based-data in tables. For instnace you may have a users table, where it stores data about users inrows and each row has strict, predefined fields.
A NoSql database, keeps data in collections of documents. So what we call a table in a Sql database, we call it collection in NoSql. Following the naming, what we call a record in Sql, we call it a document in the latter one. Each document has dynamic fields, so it can store what ever data you want to put in there, with limitation to its size.
There is one more concept in the NoSql database. Some databases does not have collections or documents. They store data in a key-value way.
STRUCTURED VS UNSTRUCTURED DATA
The first thing which may come to your mind is stored data. Actually as you think about this, it is all about the data. So the question can be:what kind of data do you want to store? Is it data which is well structured, e.g. it comes from predefined forms, filled in by users, or is it unstructured or even with dynamic types from external services?
As you consider this, you will see that well structured data plays well with a SQL database, where it has a predefined schema for data. When you have an integration with external services and you want to store the lot from these services, maintaining schema can be a painful thing. In the latter case it is better to use a database which handles unstructured of dynamic data; a NoSQL database plays well in this case.
Also, when you have a hierarchical type of data, e.g. an order with order lines together with its prices, it can be very well placed in a document of a NoSql database. In a NoSql database we keep data as documents, and these documents can be nested in each other. For instance you may have a patient document with his treatments and examination as sub-documents.
On the other hand you may have well structured data like bills orinvoices, from which you generate a beautiful report. The best choice for that is Sql as you can easily generate these reports based on multiple tables joined together.
SCHEMA VS SCHEMA-LESS
Question about schema is almost like… “Quo-vadis project?”. Consider that. When you know up-front what kind of data you want to store in your project, and you are sure that it will not change too much in the future, you may safely go for a Sql database.
When you know that you do not know upfront too much about the data and it should be dynamic, it is better to go for NoSql. You may want to have an easy ability to add new documents to exisitng collections with new, not defined fields, it will be easier to have a schema-less kind of database – a NoSql one.
HORIZONTAL VS VERTICAL SCALING
When you think about scalability, it is important to understand how these two database concepts scale. Taking into the considertation cases where you want to have easy scalability of the database, as you system grows and handles more and more users and their data. Depending on the database type, scaling is handled in a slightly different way. Both ways obviously are valid, however it is good to see the difference between these them.
When you consider a Sql database, it scales vertically. You scale it by using more powerful machine. Instead scaling it across multiple machines, you scale it in one machine by adding better and more powerful hardware.
The second concept – a NoSql database scales horizontally. It means you can add new machines and it will scale across all of them. It is prepared for this type of scaling.
You will see it on the later step of you project, what kind of scalability you need, but it is a good thing, to have some predictions where the project may go. Based on these predictions plan and prepare for the future.
CONSISTENT VS CONSISTENT-LESS DATA
Consistency in the database is really important. It might be really important but that is not true for all cases.
In a NoSql database you have less consistent dataas it does not have a predefined schema. You may think that it is not a pretty good idea. Actually it is not a bad idea. Especially as I mentioned earlier. Sometimes you may want to have inconsistent data, e.g. in your warehouse or in a database which takes data from multiple places. You may want it not to be consistent as you want to handle it from many different sources. In this case, a NoSql database is a better choice.
On the other side of the spectrum, you have consistent databases. For instance you want to have strict validations and be sure that the data is kept in a predefined way, that it actually is consistent across an entire table and database. There you better go with a Sql database.
COMPLEX VS SIMPLE TRANSACTIONS
Related thing to consistency is about transactions. For certain project, transactions are important, for certain these transactions are placed somewhere in a shadow.
Consider that you need to ensure that user puts his data and it is valid, so later you can create additional records for the user in the same time. Create it only, when the data is valid, in the other case, do not insert user data at all. In order to ensure complex transactions across multiple tables, it is only handled by Sql databases.
NoSql databases so far allow you to define transactions just for a single document. So you are not able to revert insertions into multiple tables, if a data is wrong at certain stage. Also again that might be a benefit when you find that you do not care about consistency, and transactions are not taken into the consideration.
Whether you need a database for storing unstructured data or a database for handling high load for inserts and reads, consider making a conscious choice. Do not tempt for making choice based on what currently is new and shiny, just because it is that.
Make it based on requirements, numbers and valid arguments. Remember, that the selection cannot be a golden hammer for every problem. You may consider mixing SQL and NoSQL for appropriate use-case.
If you have any trouble on that matter, I am open to help you making an appropriate choice.