SQL vs noSQL Databases

9041-sql.jpg

For more than 50 years, developers have been using relational databases. The dominance of relational databases for so long stems from the pathway they’ve opened to store and process massive volumes of information and unify database-related work based on standardized SQL queries.

Now, for a little history lesson, the first version of SQL was developed by IBM in the early 70s, becoming a household name and an ANSI standard in 1986. A year later, it was ratified by ISO which marked a new milestone for SQL.

Despite being robust and high-volume, relational databases have a few performance and scalability limitations. To bridge the gaps germinated by SQL’s limitations, NoSQL databases came to exist. 

NoSQL databases, or in other words, “not only SQL”, primarily seek to solve the problems of scalability and availability by completely or partially rejecting the requirements of atomicity and data consistency

Relational versus non-relational databases began to compete in the 2000s, and are both still widely used today.

What is a Relational Database

A relational database, or oftentimes referred to as SQL or Structured Query Language, is a database where data is stored in table format. Data in relational databases are strictly structured and related to each other. For example, a table has rows and columns where each row represents a separate record, a column represents a field with an assigned data type, and each cell has data written based on its type.

Relational databases have evidenced their worth, reliability, and stability through prominent products like MySQL, PostgreSQL, Oracle, etc. SQL databases are built on the ACID principle, which describes the requirements for a transactional system: Atomicity, Consistency, Isolation, and Durability.

ACID-built databases don’t just store data, but they store it with maximum reliability. Simultaneously, one of the biggest disadvantages of SQL databases is their speed, which is not that great.

What is a Non-Relational Database

Non-relational databases (NoSQL) store data without clear relationships with each other and sans a clear structure. Instead of structured tables, the database contains many disparate documents, including images, videos, and even social media posts. Unlike relational databases, NoSQL databases do not support SQL queries.

NoSQL is an approach to implementing scalable storage (database) of information with a flexible data model, which differs from classical relational DBMS. 

In non-relational databases, scalability and availability problems that are important for big data are solved thanks to atomicity and consistency.

Multifunctionality vs Performance in Databases

Relational databases vs non relational databases can be compared in terms of multifunctionality and performance.

Relational SQL databases are best suited for storing structured data, which is especially true when data integrity is critical. It is also better to choose this model if the project requires a technology based on well-known standards. This is a big plus since you can count on a large number of additions to future functionality and a lot of developer experience. Compliance with ACID principles in relational databases guarantees data safety and predictability of functionality:

  • Atomicity - no transaction will be partially committed in the system.
  • Consistency - only valid transaction results are recorded.
  • Isolation - the result of a transaction is not affected by transactions running in parallel with it.
  • Durability - changes to the database persist despite crashes or user actions.

Non-relational NoSQL databases fit best when data requirements are unclear or vague at the time of their creation. It’s also a great fit for situations where requirements are prone to change as the project grows and develops. Also, NoSQL databases are best used in cases where it’s needed to deliver high speed of work on large amounts of data.

One of the main differences between SQL and NoSQL databases is scalability. Scalability in relational databases is vertical. This means that with an increase in the load of requests to the database, the load on the servers increases proportionally. If a large amount of data comes to or from the database, sooner or later the vertical scaling threshold may come. Then, the server will not be able to further increase performance, meaning that you’ll need horizontal scaling, that is, parallel processing of data in a server cluster.

Scalability in non-relational databases is based on a distributed architecture so it scales well horizontally and has high performance. NoSQL technologies can automatically distribute data across different servers, which improves the speed of reading data in a distributed environment.

There are two approaches to improving database performance. These approaches are replication and sharding.

Replication allows you to create a complete duplicate of a database. So, instead of one server, you will have several servers. Sharding is another data scaling technique that allows you to divide (partition) the database into multiple separate parts. Each of the partitions can be placed on a separate server. 

In sharding specifically, there are two types available: vertical and horizontal sharding. Vertical sharding is the allocation of a table or group of tables to a separate server. Horizontal sharding is the division of one table into different servers, so it’s best used for huge tables that cannot fit onto a single server. Please note that horizontal scaling can lead to CAP-theorem problems.

In the context of SQL vs NoSQL DB,  we’d like to highlight that NoSQL databases are potentially more fit for scalability. It is also worth mentioning that sharding is not easy for relational databases, so our relations can be between different shards and therefore their joins will be very slow. Thanks to the attached documents in document-oriented databases, there is no such problem. But, if the number of linked objects is high, there is a problem with the size of the object in document-oriented databases.

Usually the question of speeding up databases is solved by using cloud systems like Amazon AWS, Google Cloud, and Microsoft Azure provide very good support for database scalability, both SQL and NoSQL.

Which system is better to use for which projects

SQL

SQL is primarily a programming language designed to describe, modify, and retrieve data stored in relational databases.

Initially, SQL was the main way for a user to work with a database and allowed the following set of operations to be performed:

  • creating a new table in the database;
  • adding new records to the table;
  • selection of records from one or several tables;
  • changing records;
  • deleting records;
  • changing table structures.

Many well known RDBMS products use SQL language for queries: MySQL, PostgreSQL, Oracle, etc. 

Here’s a simple example of SQL requests:

SELECT * FROM Customers;

or

SELECT Name, Grade, Class FROM School;Code language: PHP (php)

SQL databases are ideal for structured data where you need consistency and a guarantee of information durability. However, it can be slow on some SQL requests especially for large amounts of data. It’s best practice for relational databases to use database version control migration technology such as Liquibase and Flyway.

NoSQL

Document Oriented Databases

MongoDB is a great example of a document-oriented NoSQL database. This database is used to store large amounts of weakly structured data. MongoDB uses collections and documents instead of using tables and rows as in traditional SQL databases. 

Documents are made up of key-value pairs which are the main unit of data in MongoDB. Collections contain sets of documents and functions and they can be equivalent to relational database tables.

MongoDB can work with both structured and unstructured data. This database is suitable for projects with heterogeneous data that are difficult to classify. MongoDB will also work well if a significant change in the data structure is expected in the project in the future, including for OLAP scenarios.

Here’s a MongoDB example in JSON format:

{
      _id: ObjectId("9....3"),
      name: { first: "John", last: "Smith" },
      birth: new Date('Jan 21, 1981'),
      groups: [ "Sport", "IT", "Photo" ],
      rating : NumberLong(50000)
}

Code language: JavaScript (javascript)

Document databases are well suited for storing data when it’s likely the structure will change frequently over time and when higher performance is needed as compared to relational databases.

Key-value Databases

Another great example of NoSQL is a key-value database Redis (REmote DIctionary Server). It is a non-in-memory relational data structure used as a database where the data is stored as a key-value pair. At the same time, the storage can scale by replicating between servers. 

Redis stores all data in memory, which makes data access as fast as possible compared to other databases. Redis is known for its exceptionally high performance even among other key-value stores.

Redis allows you to store data in high-level data structures like strings, hashes, lists, sets, giving you more flexibility in the type and amount of information that you can store in the Redis datastore. 

Here is a simple example of using Redis commands:

>SET foo "hello"
OK

>GET foo
"hello"

>EXISTS foo
(integer) 1

>APPEND foo " world"
(integer) 11

> GET foo
"hello world"

Code language: JavaScript (javascript)

Key-value databases will be very well suited for such tasks:

  • Data caching
  • Chats and messaging systems
  • Dispatching a variety of data
  • Different task queues
  • Monitoring various data

The advantage of key-value databases in comparison with relational and document-oriented databases is the high speed of processing queries. And yes, it is also worth mentioning the possibility of indexing, replication and load balancing to increase performance. The big plus of key-value databases is the low entry threshold and ease of development by developers, which is now a rather weighty argument. 

In document-oriented databases, there are certain problems with nesting, document size, search speed, index size, similarly to problems of scaling of relational databases, yet in relational databases, you can build more complex database queries and ensure data integrity.

Column based Databases

Columnar databases are a good option for big data processing as they offer high performance, efficient data compression, and excellent scalability.

Columnar databases store data as a sparse matrix, whose rows and columns are used as keys. Like a table, a set of data contains columns and rows. However, there is a clear difference: the column does not span all rows. Instead, the column is contained within the row, which also means that different rows can have different columns. 

Here is a simple example of table data representation in columnar databases:

ID

FistName

LastName

Age

001

John

Johnson

36

002

Amy

White

54

003

Alex

Pratt

43

In a columnar database, this information from the table will have the following structure:

001, 002, 003
John, Amy, Alex
Johnson, White, Pratt
36, 54, 43

Columnar databases allow you to search for values by individual columns and retrieve only those values that are required. This property of columnar databases makes them especially promising for use in OLAP systems and when working with big data.

Currently, a large number of columnar databases are presented on the software market: Vertica, HANA, MonetDB, Greenplum, ClickHouse.

An important property of columnar databases is high speed and flexibility in executing complex queries.

Conclusion

In the SQL vs NoSQL databases smackdown, the clear winner is you. Both have unique advantages and specific use case scenarios where they work best, so it’s important that you carefully evaluate the unique business needs of your project. 

Both are excellent choices for building storage and processing information; on one hand, SQL databases are a perfect solution for projects where you need to make complex queries and the data structure of the database is stable and will not change very often, yet they can be slow when compared to NoSQL.On the other hand, NoSQL databases are gaining popularity as well but this does not mean that relational databases are becoming an archaic solution. Most likely, they will be used actively, but more and more NoSQL databases will appear in symbiosis with relational databases.

There’s no apples to apples comparison when it comes to Relational DBb vs Non Relational DB, as both play their own role and solve specific tasks in information systems. Modern cloud solutions make SQL and NoSQL databases very quick and cost effective, allowing you to build distributed databases free of headaches.

Our database engineers from Svitla Systems have experience with both SQL and NoSQL solutions. Many of our clients are happy with fully realized projects with quality, high-performing databases. If you need to decide what database approach is best for your future project, or you need to migrate your database from SQL to NoSQL, please contact our sales representatives at Svitla Systems around the world.