DB Migration Service AWS

22915-tes2.jpg

Comprehensive Guide to Key Features and Benefits

Organizations continue making progress with the help of benchmarks, goals, and baselines made possible by high-quality data. Data empowers organizations to measure and establish goals, so you can pick starting points, identify relevant comparisons, and determine desired outcomes for your efforts. 

In short, data is what analytics is built on. Organizations use databases to store and get information that later on becomes actionable insights. But, what is a database, exactly? Is it a magical place where data goes to live? Kinda, yes. 

But, here’s a more formal definition: a database is a collection of data that follows a predetermined structure and is kept in a computer for easy access. In most cases, a database management system is responsible for overseeing the operation of a database. Database refers to both the data and the database management system (DBMS), as well as the accompanying applications.

An organization may move away from an existing database over time to save money, improve reliability, make the business more scalable, or reach any other goal. Database migration is the process of moving data from one or more source databases to one or more target databases, which is the focus subject of this post. When a migration is done, all of the data from the source databases is reflected in the target databases, even if the structure has changed. Clients that were using the source databases (such as web applications or BI tools) are switched over to the target databases and then the source databases are turned off.

Despite database migrations are extremely important, projects to move data can be very complicated. Data migration requires downtime, which may lead to interruption in data management operations. This is why it's important to know about the risks and best practices of DB migration, as well as the tools that can help make the process go smoothly.

AWS Database Migration Service is a great tool that implements migration scenarios to migrate data from various sources to a database on an AWS service. Svitla has rich and unique migration proposals on the AWS Marketplace as databases are an essential part of applications and software systems; each of our migration proposals are tailored based on customer needs. 

Next, we’ll cover all the fundamentals of database cloud migration to equip you with the best knowledge as you gear up for this important step for your databases.

What is a Database Cloud Migration?

Database Migration process consists of the following steps:

  1. Identify source and target databases.
  2. Define schema conversion and data mapping rules. They will depend on business requirements and/or specific database limitations. This could mean adding or removing tables and columns, removing elements, splitting fields, or changing types and constraints.
  3. Perform the actual process of copying data to the new location.
  4. (Optionally) Setup a continuous replication process.
  5. Update clients to use the new instance of the database.

It is important to understand that database (data) migration and schema migration are two distinct processes.

The goal of database schema migration is to update the existing database structure to adhere to the new requirements. Schema migrations are a programmatic way to handle small, often reversible changes to data structures. The goal of schema migration software is to make changes to databases easy to test, share, and repeat without losing any data. Most migration software makes artifacts that list the exact set of operations that need to be done to change a database from its known state to its new state. Standard version control software can be used to check these in and keep track of changes and share them with other team members. It usually involves only one database.

Contrary to that, database migration is a process of copying both database schema and data it contains, as well as all other database components such as views, stored procedures etc. In most cases data is copied from one database to another but more complex scenarios are also required sometimes.

Here’s an example of what a more complicated database migration looks like. 

Database Migration

Even though data migration is quite a standard process, there are multiple challenges and dangers to be faced. To name just a few, two databases might be incompatible for some reasons, which requires additional actions. Migration process usually lasts a substantial amount of time. Constant supervision is required to ensure no data is lost in the process. In case of data migration between two distant locations, a secure network connection must be established.

Now that you have a clearer idea of what database cloud migrations are, let’s look at some of the most common reasons to move a database:

Why Database Migrations matter

Below you can find a list of the most popular reasons to migrate your database:

  • to save money, e.g. by switching to a license-free database
  • to move data from an old system to newer software.
  • to move off of commercial databases (Oracle, MS SQL, etc.) to the open source ones (Postgres, MySQL, etc.)
  • to introduce a db replica for reliability - kind of a “living” backup (DMS also supports continuous data replication for such scenarios)
  • to create a copy of DB instances for different stages, for example for development or testing environments.
  • to unify disparate data, so it is accessible by different techniques.
  • to upgrade to the latest version of the database software to strengthen safety protocols and governance.
  • to improve performance and achieve scalability.
  • to move from an on-premise database to a cloud-based database.
  • to merge data from numerous sources into a centralized database

And as with anything, there are also challenges to consider: 

  • Data Loss: Data loss is the most common problem companies face during DB migration. During the planning phase, it's important to test for data loss and data corruption to see if all of the data was transferred correctly.
  • Data Security: The most valuable thing a business has is its data. So, keeping it safe is of the utmost importance. Data encryption should be the most important thing to do before the DB migration process.
  • Planning challenges: Large businesses often have different databases in different parts of the business. When planning a database migration, it can be hard to find these databases and figure out how to convert all schemas and normalize data.
  • Migration strategy challenges: Database migration isn’t something that is done often, for this reason companies struggle to come up with an optimal migration strategy for their specific needs. Hence, it is important to do a lot of research before DB migration.

Cloud migration strategies typically include seven Rs. Now, you’re probably wondering what Rs have to do with cloud migration but worry not as we’ve got you covered!

In the context of database cloud migration, the seven Rs are different ways to move to or inside the cloud. They include such scenarios when a business decides to move its on-premise ERP application for a business unit to the AWS cloud or extends or rearchitects existing solutions in AWS cloud to achieve new capabilities. Every application from that business unit will fall under one of the categories below.

  1. Retire. Decommission the database from on-premise, which includes many parts like DNS entry, load balancer rules tear down, the database itself decommission, and more.
  2. Retain. Retain the few databases that are still being used on-premises by organizations if their migration could give more issues than benefits. If other parts of the infrastructure are being migrated to the cloud, this might require setting up a fixed fast connection between the company infrastructure and the cloud.
  3. Relocate. Move your database servers from on-premises to the cloud virtual instances without making any updates.
  4. Rehost. Known by many as “lift and shift,” in the case of databases this means migrating the data from on-premises to a cloud environment. This option is typically favored by businesses looking to quickly scale in order to meet modern business demands. It allows for some additional changes such as a change in the database provider. The rehost strategy is for example moving from the on-premise database to a database hosted on the EC2 (VM) instance in the cloud. Moving to a cloud-managed database in the SaaS model is also possible.
  5. Repurchase. Also known as “drop and shop,” this is a decision to move to a new or different version of the database server. This option is favored by organizations willing to change their current licensing model. 
  6. Replatform. In the re-platforming strategy, the architecture of the core database will not be changed but it is moved to a new cloud-native database service with the same or compatible database engine. It means moving databases to the cloud without making major data and structure changes, but still taking advantage of the cloud's benefits.
  7. Refactor. There is no such thing as database refactoring per se. However, introducing major changes to your application followed by database migration might be very beneficial. Refactoring an application to use a different kind of database, for instance, a NoSQL database DynamoDB instead of standard relational database MySQL in many cases will reduce the costs of the entire application and/or increase its performance. 

What is AWS DMS?

One of the most popular sets of migration tools in the AWS ecosystem is AWS Data Migration. It's a great tool for moving data to the cloud because it lets you store data in a hybrid cloud, transfer data online or offline, or both. It also has a number of services that help you move datasets, no matter what kind they are.

The price for AWS Database Migration Services (DMS) starts out low. You only pay for the times you actually use. Also, you can use DMS for free for six months when you are moving data to any AWS-based database. Click on this link to find out more.

Pros: 

  • It helps a lot to cut down on application downtime.
  • It supports both homogeneous (same DBMS as source and target) and heterogeneous (different DBMS) migrations.
  • Due to its high availability, it can be used for continuous data migration.
  • Supervision of the migration process and automatic remediation.
  • Built-in security for data both when it is at rest and when it is being moved.
  • Built-in integration with AWS services.

Cons:

  • Only copies a small number of DDL (data definition language).
  • Does not support some network configurations.
  • Does not support any database version to any database version migrations.
  • Requires periodical monitoring.
  • As most database migration tools, may cause the system to slow down when a lot of data is being used.

As you can see, you can do one-time migrations with AWS DMS, and you can also replicate ongoing changes to make sure that sources and targets are always in sync. You can use the AWS Schema Conversion Tool (AWS SCT) to convert your database schema to the new platform if you want to switch to a different database engine. Then, you move the data with AWS DMS. Because AWS DMS is part of the AWS Cloud, you get the cost-effectiveness, speed to market, security, and flexibility that AWS services offer.

AWS DMS is, at its most basic, a server in the AWS Cloud that runs software for replication. You make a secure source and target connections to tell AWS DMS where to get data from and where to put it. Then, you tell this server to move your data by setting up a task to run on it. If the tables and their primary keys don't already exist on the target, AWS DMS makes them. You can make the target tables yourself ahead of time if you'd like. Or, you can use the AWS Schema Conversion Tool (AWS SCT) to create some or all of the target tables, indexes, views, triggers, etc.

How to migrate a Database to AWS?

AWS DMS helps you skip over the somewhat boring parts of the migration. In a traditional solution, you have to do a capacity analysis, buy hardware and software, install and manage systems, test and fix the installation, and then do it all over again. AWS DMS automatically sets up, manages, and keeps an eye on all the hardware and software you need for your migration. As soon as you start the AWS DMS configuration process, your migration can be up and running.

With AWS DMS, you can change the number of migration resources you use based on how busy you are. For example, if you decide you need more storage, you can easily add more storage and start your migration again, which usually only takes a few minutes.

AWS DMS uses a pay-as-you-go model. You only pay for AWS DMS resources when you use them. This is different from traditional licensing models, where you pay to buy the license upfront and then pay to keep it up to date.

Let’s take a look at the key considerations involved with using AWS DMS for your database cloud migrations. 

Components of AWS DMS

There are several AWS DMS internal components that help shape the solution into the success it is. Together, these elements help accomplish data migration and as you gain a greater understanding of each, you’ll be better equipped to provide improved insights when faced with troubleshooting tasks or resolving issues. 

There are three key AWS DMS components. 

Replication instance

At a high level, an AWS DMS replication instance is just a managed Amazon Elastic Compute Cloud (Amazon EC2) instance that hosts one or more replication tasks. Depending on how your migration works and how much space the replication server has, a single replication instance can host one or more replication tasks. You can choose the best configuration for your use case from the different replication instances that AWS DMS provides. 

DB Migration Service AWS: a Comprehensive Guide to Key Features and Benefits

The replication instance is set up by AWS DMS on an Amazon EC2 instance. Some of the smaller instance classes are enough to test the service or do small migrations. If you need to move a lot of tables or want to run multiple replication tasks at the same time, you might want to use one of the larger instances. This is the optimal method because AWS DMS can take up a lot of memory and CPU.

 

Source and target endpoints

To get to your source or target data store, AWS DMS uses an endpoint. Depending on your data store, the connection information is different, but in general, you give the following information when you make an endpoint.

  • Endpoint type. Source or target.
  • Engine type. Type of database engine.
  • Hostname. Server name or IP address that AWS DMS can reach.
  • Port. Port number used for database server connections.
  • Encryption. Secure Socket Layer (SSL) mode, if SSL is used to encrypt the connection.
  • Credentials. Username and password with required access rights.

When you use the AWS DMS console to set up an endpoint, the console will ask you to test the endpoint connection. Before the endpoint can be used in a DMS task, the test must pass. Like the information about the connection, the test criteria are different for each type of engine. In general, AWS DMS checks that the database exists at the given server name and port and that the provided credentials can be used to connect to the database with the necessary permissions to perform a migration. If the connection test works, AWS DMS will download and store schema information that will be used later when configuring tasks. 

One endpoint can be used by more than one replication task. For example, you might want to move two applications that are logically different but are hosted on the same source database. In this case, you make two replication tasks, one for each set of application tables. You can do both tasks with the same AWS DMS endpoint.

Replication task

You can move a set of data from the source endpoint to the target endpoint with an AWS DMS replication task. The last thing you need to do before you can start a migration is to set up a replication task.

When you make a replication task, you choose the following settings for it.

  • Replication instance: The instance that will host the task and run it.
  • Source endpoint.
  • Target endpoint.
  • Types of migration.
    • Full load (Migrating existing data).
    • Full load + CDC (Migrate existing data and replicate changes as they happen).
    • Change data capture (CDC) only, that is: replicate only ongoing data changes.
  • Options for target table preparation mode.
    • Do nothing. AWS DMS will assume that the tables on the target have already been made.
    • Drop tables on target. AWS DMS deletes the target tables and makes new ones.
    • Trim. If you made tables on the target, AWS DMS will delete them before the migration begins. If you choose this option and there are no tables, AWS DMS makes them.
  • Large Binary Objects (LOB) mode.
    • Don't include LOB columns. LOB columns are left out of the migration.
    • Full LOB mode: Move whole LOBs, no matter how big they are.
    • Limited LOB mode: LOBs will be cut off at the size set by the Max LOB Size parameter.
  • Table mappings show which tables need to be moved and how they are moved.
  • Data transformation.
    • Schema, table, and column names are being changed.
    • Changing tablespace names (for Oracle target endpoints).
    • Setting up the target's primary keys and unique indexes.
  • Data validation.
  • Logging on Amazon CloudWatch.

Please note you need an AWS Region to create your AWS DMS migration project with the needed replication instance, endpoints, and tasks. 

Sources for AWS DMS

Users can leverage several data stores as source endpoints in AWS DBS for data migration.

  • On-premises databases and databases in EC2

    • Oracle versions 10.2 and later (for versions 10.x), 11g and up to 12.2, 18c, and 19c for Enterprise, Standard, Standard One, and Standard Two editions.
    • Microsoft SQL Server versions for Enterprise, Standard, Workgroup, and Developer editions in 2005, 2008, 2008R2, 2012, 2014, 2016, 2017, and 2019. 
    • MySQL versions 5.5, 5.6, 5.7, and 8.0 versions.
    • MariaDB (supported as a MySQL-compatible data source) versions 10.0 (only versions 10.0.24 and later), 10.2, 10.3, 10.4, and 10.5.
    • PostgreSQL version 9.4 and later (for versions 9.x), 10.x, 11.x, 12.x, 13.x and 14.x.
    • MongoDB versions 3.x, 4.0, 4.2, and 4.4.
    • SAP Adaptive Server Enterprise (ASE) versions 12.5, 15, 15.5, 15.7, 16 and later.
    • IBM Db2 for Linux, UNIX, and Windows (Db2 LUW) versions:
      • Version 9.7, all fix packs are supported.
      • Version 10.1, all fix packs are supported.
      • Version 10.5, all fix packs except for Fix Pack 5 are supported.
      • Version 11.1, all fix packs are supported.
      • Version 11.5, Mods (0-7) with only Fix Pack Zero supported
    • IBM Db2 for z/OS version 12
  • Third-party managed database services:
    • Microsoft Azure SQL Database
    • Google Cloud for MySQL
  • Amazon RDS instance databases, and Amazon Simple Storage Service (S3)

    • Oracle versions 11g (versions 11.2.0.4 and later) and up to 12.2, 18c, and 19c for the Enterprise, Standard, Standard One, and Standard Two editions.
    • Microsoft SQL Server versions 2012, 2014, 2016, 2017, and 2019 for the Enterprise, Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
    • MySQL versions 5.5, 5.6, 5.7, and 8.0.
    • MariaDB (supported as a MySQL-compatible data source) versions 10.0.24 to 10.0.28, 10.1, 10.2, and 10.3, 10.3.13 and 10.4.
    • PostgreSQL version 10.x, 11.x, 12.x, 13.x, and 14.x.
    • Amazon Aurora with MySQL compatibility (supported as a MySQL-compatible data source).
    • Aurora with PostgreSQL compatibility (supported as a PostgreSQL-compatible data source).
    • Amazon S3.
    • Amazon DocumentDB (with MongoDB compatibility)

Targets for AWS DMS

With AWS DMS, you can use the following data stores as the endpoints for moving data.

  • On-premises and Amazon EC2 instance databases:
    • Oracle versions 10g, 11g, 12c, 18c, and 19c for the Enterprise, Standard, Standard One, and Standard Two editions.
    • Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, 2014, 2016, 2017, and 2019 for the Enterprise, Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
    • MySQL versions 5.5, 5.6, 5.7, and 8.0.
    • MariaDB (supported as a MySQL-compatible data target) versions 10.0.24 to 10.0.28, 10.1, 10.2, 10.3 and 10.4.
    • PostgreSQL version 9.4 and later (for versions 9.x), 10.x, 11.x, 12.x, 13.x, and 14.x.
    • SAP Adaptive Server Enterprise (ASE) versions 15, 15.5, 15.7, 16 and later.
    • Redis versions 6.x.
    • Oracle versions 11g (versions 11.2.0.3.v1 and later), 12c, 18c, and 19c for the Enterprise, Standard, Standard One, and Standard Two editions.
    • Microsoft SQL Server versions 2012, 2014, 2016, 2017, and 2019 for the Enterprise, Standard, Workgroup, and Developer editions. The Web and Express editions are not supported.
    • MySQL versions 5.5, 5.6, 5.7, and 8.0.
    • MariaDB (supported as a MySQL-compatible data target) versions 10.0.24 to 10.0.28, 10.1, 10.2,10.3 and 10.4.
    • PostgreSQL version 10.x, 11.x, 12.x, 13.x, and 14.x.
  • Amazon SaaS relational databases and NoSQL solutions:
    • Amazon Aurora MySQL-Compatible Edition
    • Amazon Aurora PostgreSQL-Compatible Edition
    • Amazon Aurora Serverless v2
    • Amazon Redshift
    • Amazon S3
    • Amazon DynamoDB
    • Amazon OpenSearch Service
    • Amazon ElastiCache for Redis
    • Amazon Kinesis Data Streams
    • Amazon DocumentDB (with MongoDB compatibility)
    • Amazon Neptune
    • Apache Kafka – Amazon Managed Streaming for Apache Kafka (Amazon MSK) and self-managed Apache Kafka
    • Babelfish (version 1.2.0) for Aurora PostgreSQL (versions 13.4, 13.5, 13.6)

Migration with AWS DMS - best practices

These are some of the best tips and tricks to use AWS DMS most efficiently.

Converting schema

If your database migration scenario requires a change of the schema, we recommend using the AWS Schema Conversion Tool for all basic and advanced migrations. The tool supports all kinds of conversions, including code conversion, which is not supported by the built-in AWS DMS schema conversion mechanism. AWS SCT converts tables, indexes, views, triggers, and other elements into the DDL format of your choice.

Putting an idea to the test

After configuring your migration scenario, you should always do a test run with a smaller subset of data. This will allow you to find errors in your approach as well as help estimate the actual migration duration.

Using your name server onsite

If you have connectivity issues or data transfer time is too slow, you should consider configuring your own on-site DNS server to resolve AWS endpoints. Amazon Elastic Compute Cloud (EC2) DNS resolver is typically used by AWS Domain Name System (DNS) replication instances to fix domain endpoints. With Amazon Route 53 Resolver, however, you direct traffic to your internal name server. By establishing a secure connection between your server and AWS, you can send and receive queries with the help of this tool's incoming and outgoing endpoints, forwarding rules, and private connection. Behind a firewall, it is safer and more convenient to use an on-premises name server.

Using row filtering to improve performance when migrating large tables

When migrating a large table, it can be helpful to break the process down into smaller, more manageable tasks. Partition the migration into manageable chunks using a key or partition key, and then use row filtering to break up the tasks into smaller ones. If your integer primary key ID ranges from 1 to 8,000,000, you can use row filtering to create eight tasks that each transfer 1,000,000 records.

Removing the load off your source database

Amazon DMS will consume some of your primary database's resources. For each table being processed in parallel, AWS DMS does a full scan of the source table during a full load task. Each migration task you do uses the CDC process to track changes in the source. For example, if you're using Oracle and want AWS DMS to perform CDC, you might need to increase the amount of written data in your change log.

Wrapping up

Svitla Systems is a leading tech company that helps businesses improve their performance by making custom software solutions for them, such as a service to move their databases. We have a team of skilled professionals who can help you successfully move your database with an expert emphasis on cloud database migrations and AWS DMS. We can evaluate your current database to create a plan for the move by making sure all the right precautions are taken so that no data is lost while the database is being moved.