Cloud Databases on AWS
by Svitla Team
In recent years, cloud solutions have become so firmly established in the development and operation of information technology that it is hard to find a technology without a high-quality cloud service behind it.
Databases, which are widely offered as cloud services, are no exception. Cloud databases are available both in combination with other cloud services and as standalone services within hybrid cloud systems.
Many cloud providers now offer general-purpose and specialized databases with Internet access and easy scalability.
Here we take a look at the cloud database solutions available as Amazon services. These services cover almost all database types and are competitively priced.
Why are Cloud Databases important?
The advantages of using databases in cloud systems appear in the following cases:
- a large amount of stored data in the project that is frequently updated;
- the need for database replication;
- the need to increase and decrease the number of nodes to process queries to the database;
- the need to locate the database as close as possible to the customer service area, to reduce request processing time;
- the need for tight integration with other cloud services within the project;
- the need to make backups quickly and store them cheaply in the cloud.
What to look for when hosting a database in the cloud:
- security of the information system, encryption of database queries;
- data residency: the legislation of certain countries prohibits storing some data in the cloud, so plan the system architecture carefully so that sensitive data is stored on-premises;
- certification of a specific cloud location, which must be examined carefully so that the information system meets the standards the customer requires;
- the right type of database to achieve the required speed and ease of use.
AWS Database Services
Let's take a look at AWS databases by type. AWS offers a wide range of cloud databases, so we will briefly describe each solution and its benefits. When selecting a database for a project, it helps to consult experienced information systems specialists and developers, for example, from Svitla Systems.
Relational (SQL) Databases
A relational database is a system for organizing a set of data with predefined relationships between its items. The data is organized as a set of tables, and the tables consist of columns and rows. The tables store information about the objects represented in the database. Each column of a table stores one specific data type, and each cell contains one attribute value. Each table row is a set of related values that refer to a single object or entity. Each row in a table can have a unique identifier called a primary key, and rows from multiple tables can be related using foreign keys. The data can be accessed in many ways using relational queries, without reorganizing the database tables. Using a relational database allows us to:
- work with data using the standardized Structured Query Language (SQL) and build complex queries to process information from a database;
- monitor and enforce data integrity (the completeness, accuracy, and uniformity of data) by applying business rules to the tables through constraints and triggers;
- develop stored procedures to streamline work with relational queries;
- use the transaction mechanism to group database queries and meet the ACID requirements (Atomicity, Consistency, Isolation, Durability);
- keep the model simple and easy for developers to understand, since the only information construct used is the table;
- apply strict design rules based on a mathematical foundation;
- achieve data independence, i.e. changes to the application are minimal if the relational database is changed.
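The points about keys, constraints, and transactions can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database rather than an AWS service, and the `customers` and `orders` tables are hypothetical names chosen for the example.

```python
import sqlite3


def run_demo():
    """Create two related tables and show a committed and a rolled-back transaction."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total REAL NOT NULL CHECK (total >= 0))"
    )

    # First transaction: both inserts succeed and are committed together.
    with conn:
        conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 25.0)")

    # Second transaction: the second insert violates the foreign key,
    # so the whole group, including the valid first insert, is rolled back.
    try:
        with conn:
            conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 10.0)")
            conn.execute("INSERT INTO orders (customer_id, total) VALUES (99, 5.0)")
    except sqlite3.IntegrityError:
        pass

    # A relational query joins the tables through the foreign key.
    rows = conn.execute(
        "SELECT c.name, COUNT(o.id), SUM(o.total)"
        " FROM customers c JOIN orders o ON o.customer_id = c.id"
        " GROUP BY c.id"
    ).fetchall()
    conn.close()
    return rows
```

Calling `run_demo()` returns one row for Alice with a single order, because the failed transaction left no partial data behind. The same SQL concepts carry over to the managed engines described in this section, although each engine has its own dialect details.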
Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database built for the cloud. It combines the speed and availability of traditional commercial databases with the simplicity and cost-effectiveness of open-source databases. Application code developed for MySQL or PostgreSQL can be quickly reconfigured to use Aurora and take advantage of the benefits it provides.
According to AWS, Amazon Aurora can be up to five times faster than standard MySQL databases and up to three times faster than standard PostgreSQL databases.
The service provides increased security, availability across different regions, and the reliability of commercial databases at about one-tenth of the cost. Amazon Aurora is managed by Amazon Relational Database Service (RDS), which automates time-consuming administrative tasks such as provisioning hardware, setting up a database, installing updates, and backing up.
Amazon Relational Database Service (Amazon RDS) is a managed cloud service for relational databases queried with SQL. The main benefit of using RDS is the automation of time-consuming administrative tasks such as hardware provisioning, database configuration, patching, and backups.
As described in the user documentation, RDS provides the following database engines: Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server. In addition, AWS Database Migration Service works well for migrating and replicating existing databases to Amazon Relational Database Service.
If analytical processing of large amounts of information is required, Amazon Redshift can be used. It lets you run queries on structured and semi-structured data using standard SQL tools. The data can amount to several petabytes, can reside in a data warehouse, operational database, or data lake, and can be combined across these sources. Redshift makes it easy to store query results in an S3 data lake and then feed them to analytics services such as Amazon EMR, Amazon Athena, and Amazon SageMaker. There is also the option to enable AQUA (Advanced Query Accelerator), a hardware-accelerated distributed cache that, according to AWS, allows Redshift to run up to 10x faster than other cloud data warehouses.
Key-value Databases
Amazon DynamoDB is a high-speed key-value and document database. DynamoDB is a multi-region, multi-active database with built-in security, backup and recovery, and in-memory caching. It can handle more than 10 trillion requests per day and peak loads in excess of 20 million requests per second, and it delivers single-digit millisecond latency at any scale. DynamoDB can be used with AWS AppSync and GraphQL to develop interactive mobile and web applications with real-time updates. Note that this database is accessed through a key-value API rather than through SQL as in relational databases. In addition, Amazon DynamoDB provides 99.99% availability and automatically replicates your data across multiple AWS Availability Zones in each region where DynamoDB tables are deployed.
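To make the difference between key-value API access and SQL more concrete, here is a small sketch that builds request parameters in the typed attribute-value format used by DynamoDB's low-level API. It does not contact AWS. The helper functions, the `Users` table, and the `user_id` key are hypothetical names introduced for illustration, and only strings, numbers, and booleans are handled.

```python
def to_attribute_value(value):
    """Encode a Python value in DynamoDB's typed attribute-value format."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        # Numbers are sent as strings to preserve precision.
        return {"N": str(value)}
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type for this sketch: {type(value).__name__}")


def build_put_item_request(table, item):
    """Parameters for writing one item, addressed by its key attributes."""
    return {
        "TableName": table,
        "Item": {name: to_attribute_value(v) for name, v in item.items()},
    }


def build_get_item_request(table, key):
    """Parameters for reading one item by its primary key."""
    return {
        "TableName": table,
        "Key": {name: to_attribute_value(v) for name, v in key.items()},
    }


# Hypothetical table and attributes, used only for illustration.
put_request = build_put_item_request("Users", {"user_id": "u-42", "visits": 3})
get_request = build_get_item_request("Users", {"user_id": "u-42"})
# With the boto3 SDK installed and credentials configured, these dictionaries
# could be passed to a DynamoDB client, e.g. client.get_item(**get_request).
```

The read is expressed as a lookup by key rather than as a query over tables, which is why access patterns usually need to be known when the table's key is designed.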
In-memory Databases
An in-memory database is a database that stores tables and records in memory. An in-memory database management system (also called a resident DBMS) is a type of software system that operates in the in-memory computing paradigm. Thanks to optimizations that are possible when data is stored and processed in byte-addressable RAM, in-memory DBMSs provide better performance than DBMSs that keep their databases on persistent storage devices, which typically have a block organization and are connected via a bus or network interface.
Amazon ElastiCache for Memcached
Memcached is software that implements a data caching service in RAM based on a hash table. Accordingly, Amazon ElastiCache for Memcached is a Memcached-compatible cloud key-value store that can be used as a cache tier or data store. One benefit of using this service in the cloud is the ElastiCache cluster client for Memcached with Auto Discovery, which saves time by making it easier to connect the application to the Memcached cluster.
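Using a key-value store as a cache tier is often done with the cache-aside pattern, sketched below. The `TTLCache` class is only an in-process stand-in for a Memcached client (a hash table with expiry), and `get_with_cache` and `load_from_db` are hypothetical names; a real application would talk to the ElastiCache cluster through a Memcached client library.

```python
import time


class TTLCache:
    """In-process stand-in for a Memcached client: a hash table with expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry has expired, so drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_with_cache(cache, key, load_from_db, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)            # slow path: query the database
        cache.set(key, value, ttl_seconds)   # populate the cache for later reads
    return value
```

Repeated reads of the same key within the time-to-live are served from memory, so the database sees only the first request. This sketch treats a cached `None` as a miss, which is acceptable for illustration but worth handling explicitly in production code.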
Amazon ElastiCache for Redis
Redis (Remote Dictionary Server) is an open-source, in-memory NoSQL database management system that works with key-value data structures. Accordingly, Amazon ElastiCache for Redis is a very fast in-memory data store that delivers sub-millisecond latency to real-time, Internet-scale applications. As stated in the description of this cloud service, Amazon ElastiCache supports in-memory clusters of up to 340 TB. A cluster can scale to 500 nodes, and data can be partitioned across multiple shards to provide a high level of scalability and availability.
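Partitioning across shards can be illustrated with the key-to-slot mapping defined by the open-source Redis Cluster specification: the CRC16 (XMODEM) checksum of the key modulo 16384, with optional hash tags in braces. The sketch below assumes that a cluster-mode ElastiCache deployment follows the same scheme, and the function name `key_hash_slot` is our own.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_hash_slot(key: str) -> int:
    """Map a key to a hash slot as described in the Redis Cluster specification."""
    data = key.encode("utf-8")
    # Hash tags: if the key contains a non-empty {...} section, only that part
    # is hashed, so related keys can be forced into the same slot.
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is the CRC16/XMODEM checksum.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


slot = key_hash_slot("foo")  # → 12182
same_slot = key_hash_slot("{user:1}:cart") == key_hash_slot("{user:1}:profile")  # → True
```

Each shard owns a range of slots, so the slot number determines which shard stores the key. Hash tags matter in practice because multi-key operations generally require all keys to live in the same slot.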
Document Databases
Amazon DocumentDB is a document database that makes it easy to store, index, and quickly query JSON data. It is a non-relational database with a MongoDB-compatible API that gives users the performance, scalability, and availability they need to handle mission-critical workloads at any scale. Because it is an AWS-managed service, users can run MongoDB workloads without the overhead of self-managing the infrastructure, backup, and availability of the database. In Amazon DocumentDB, storage and compute are decoupled, which allows each to scale independently. To increase read capacity to millions of requests per second, users can add up to 15 low-latency read replicas.
Wide column Databases
Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, managed database service compatible with Apache Cassandra. You can use this service to run Cassandra workloads on AWS. Apache Cassandra, in turn, is a non-relational, fault-tolerant distributed DBMS designed for highly scalable and reliable storage of huge amounts of data organized as a hash. Amazon Keyspaces is serverless, so it automatically scales tables up and down based on application traffic, and the customer pays only for the resources used. Amazon Keyspaces lets you build applications with virtually unlimited throughput and storage, serving thousands of requests per second. The service encrypts data by default and can continuously back up table data using point-in-time recovery.
Graph Databases
Graph databases are designed for highly connected data with complex relationships. For this purpose, the Amazon cloud offers the Amazon Neptune graph database, which supports the popular Property Graph and W3C Resource Description Framework (RDF) models. Their respective query languages, Apache TinkerPop Gremlin and SPARQL, are also supported, making it easy to write queries that efficiently navigate complex, connected data sets.
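The kind of navigation a graph database is built for can be sketched with a tiny in-memory property graph. This is an illustration in plain Python, not Neptune's API; the `PropertyGraph` class, the `knows` edge label, and the vertex names are hypothetical.

```python
from collections import defaultdict


class PropertyGraph:
    """Minimal property graph: vertices with properties and labeled directed edges."""

    def __init__(self):
        self.vertices = {}              # vertex id -> properties
        self._out = defaultdict(list)   # vertex id -> [(label, target id)]

    def add_vertex(self, vid, **props):
        self.vertices[vid] = props

    def add_edge(self, src, label, dst):
        self._out[src].append((label, dst))

    def out(self, vid, label):
        """Follow outgoing edges with the given label from one vertex."""
        return [dst for (lbl, dst) in self._out.get(vid, []) if lbl == label]


def friends_of_friends(graph, start):
    """Two-hop traversal: people known by the people `start` knows."""
    direct = set(graph.out(start, "knows"))
    result = set()
    for friend in direct:
        for candidate in graph.out(friend, "knows"):
            # Skip the starting person and anyone already known directly.
            if candidate != start and candidate not in direct:
                result.add(candidate)
    return sorted(result)


# In Gremlin, the core two-hop step might look roughly like:
#   g.V('alice').out('knows').out('knows').dedup()
```

In a relational schema the same question would need a self-join per hop, whereas a graph model follows edges directly, which is the main reason to choose it for highly connected data.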
Time series Databases
If you are looking for a fast, scalable, serverless time series database service for Internet of Things (IoT) and operational applications, then Amazon Timestream is the right choice. As stated on the Amazon website, Amazon Timestream can store and analyze more than a trillion events per day up to 1,000 times faster and at as little as one-tenth of the cost of relational databases.
This is an impressive result and can save a lot of money if you work with millions of IoT devices or with monitoring systems for a complex architecture. Amazon Timestream suits IoT, DevOps, and analytical applications. Note that although Amazon Timestream is not a relational database, its data is queried with SQL extended with built-in time series functions.
Amazon Quantum Ledger Database (QLDB)
If your project needs a ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority, then Amazon Quantum Ledger Database (QLDB) is a good fit. Ledgers are essential for recording the history of economic and financial activity in an organization, for example in fintech or banking. Blockchain frameworks such as Hyperledger Fabric and Ethereum are another approach that can also be used as a ledger. The Amazon website states: “With QLDB, your data’s change history is immutable – it cannot be altered or deleted – and using cryptography, you can easily verify that there have been no unintended modifications to your application’s data. QLDB uses an immutable transactional log, known as a journal, that tracks each application data change and maintains a complete and verifiable history of changes over time.”
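The idea of a cryptographically verifiable journal can be sketched with a simple hash chain, where every entry's SHA-256 digest covers the previous entry's digest. This is a simplified illustration of the general technique and not QLDB's actual internal format or API; the `Journal` class and its fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class Journal:
    """Append-only log in which each entry's hash covers the previous entry's hash."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, data):
        # Serialize deterministically so the same data always hashes the same way.
        payload = json.dumps(data, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, data):
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        digest = self._digest(prev_hash, data)
        self.entries.append({"data": data, "prev_hash": prev_hash, "hash": digest})
        return digest

    def verify(self):
        """Recompute the chain; any altered or removed entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["data"])
            if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

After a few `append` calls, `verify()` returns `True`; changing the data of any earlier entry makes it return `False`, because the stored hashes no longer match the recomputed ones. A production ledger adds much more, such as durable storage and digests that can be checked by outside parties.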
Customer Use Cases
The most common use case we have seen with our customers is the use of an RDBMS in support of a transactional application, including both web and mobile applications. These RDBMSs have frequently been deployed on RDS to take advantage of all of the underlying management it provides, so that we can focus instead on the data storage and access requirements. (As AWS likes to say, we greatly prefer to let AWS handle the “undifferentiated heavy lifting” of managing a clustered, highly available database, including redundant storage, replication, backups, operating system updates, and more.) A common practice to offset some of the extra cost associated with using a managed service is to utilize reserved instances for the databases, which trade flexibility for a time commitment and make sense when a predictable database size and location will be used for the foreseeable future. Most commonly, these databases have been Aurora MySQL, PostgreSQL, and SQL Server.
Another frequent use case we have seen involves a greater need for high-performing, low-latency storage than for complex relationships between data, typically in support of mobile applications. The best fit for these has often been DynamoDB, fronted with a thin, GraphQL-based API layer in AWS AppSync where relationships between records can be applied while still maintaining high performance. DynamoDB is similar to RDS in that it is a fully managed service, and it is even considered “serverless” since none of the underlying computing infrastructure is exposed to consumers of the service; it is entirely managed by AWS.
Thus, cloud databases are rapidly entering service in a variety of information systems. This applies to both relational (SQL) and non-relational (NoSQL) databases. In terms of hosting cost, cloud databases provide a more economical, high-quality solution that works reliably in the geographic location users require.
Our highly qualified specialists from Svitla Systems solve a wide range of tasks in cloud systems, not only in Amazon but also in other clouds, such as Azure, Heroku, and DigitalOcean.
You can contact our company about various tasks related to cloud systems, including database migration and the necessary package of services related to cloud databases.