Thursday, April 16, 2020

Azure Series : Synchronization between Azure Databases


I have to copy a couple of tables of about 200 million rows between a couple of Azure SQL databases, just once or perhaps twice; I don't know exactly yet. In SQL Server (on-premises) you have several options and copying data is fairly easy, but in Azure it's a different ballgame. If there is not much data, you can use "Generate scripts" with the "data only" option, which scripts the data as INSERT statements. I tried the bacpac option in SSMS, but I received a lot of validation errors, perhaps because the database was not in a consistent state. I didn't investigate this much further.

Another trick I tried was SSIS, my old favorite ETL tool. Although I enjoy the tool a lot, its integration and alignment with Azure could be better. It is a good fallback for my problem, but I would like to know if there is something better, easier or faster to use. So my options were getting fewer and fewer. Elastic queries could be an option, but I have been there and done that, so I ended up experimenting with synchronization groups in Azure SQL Database.

This blog post describes the process and my investigation of how to set up synchronization between Azure SQL databases. I hope you find it useful; leave me a note if you have remarks or questions.

The setup

Firstly, what are synchronization groups in Azure SQL Database? Well, they are a tool for synchronizing data between Azure SQL databases and on-premises SQL Servers (for the latter you have to install an agent). For this blog post, I am only interested in synchronization between Azure SQL databases.

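There are two types of databases in a sync group: a hub and one or more members. The hub is the central database of the group, and each member synchronizes with the hub rather than directly with other members.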
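To make the hub and member roles a bit more concrete, here is a minimal Python sketch of the topology. It is only a mental model, not the Azure API: the class names, the `SyncDirection` values and the database names `SourceDb` and `TargetDb` are all made up for illustration. It assumes each member has its own sync direction relative to the hub.

```python
from dataclasses import dataclass, field
from enum import Enum


class SyncDirection(Enum):
    # Hypothetical labels for the direction of a member relative to the hub.
    TO_HUB = "member -> hub"
    FROM_HUB = "hub -> member"
    BIDIRECTIONAL = "hub <-> member"


@dataclass
class SyncMember:
    database: str
    direction: SyncDirection


@dataclass
class SyncGroup:
    name: str
    hub: str
    members: list[SyncMember] = field(default_factory=list)

    def data_flows(self) -> list[tuple[str, str]]:
        """Return (source, target) pairs implied by each member's direction."""
        flows = []
        for member in self.members:
            if member.direction in (SyncDirection.FROM_HUB, SyncDirection.BIDIRECTIONAL):
                flows.append((self.hub, member.database))
            if member.direction in (SyncDirection.TO_HUB, SyncDirection.BIDIRECTIONAL):
                flows.append((member.database, self.hub))
        return flows


# A one-way copy like mine: data only flows from the hub to a single member.
group = SyncGroup(name="copy-group", hub="SourceDb")
group.members.append(SyncMember("TargetDb", SyncDirection.FROM_HUB))
print(group.data_flows())
```

For a one-off copy, the interesting point is that data never flows member to member; everything passes through the hub.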

The configuration

First, create a sync group in the Azure Portal. Navigate to the database, search for "Sync to other databases" and click on it.

Create a new sync group with "New Sync Group".

Enter the sync group name. I chose to use an existing database; all of the databases are shown in the drop-down box, and I chose the hub database. The next step is choosing the member database that is used for the synchronization.

This database is added to the sync group as a sync member.

The next step is choosing the tables (and columns, if you wish) to sync, but in my case it kept saying "loading tables" for what seemed like hours... hmpf...

After a couple of tries and some clicking around, the following error message appeared, and things became clearer: the service had no access to the Azure SQL Database server.

So I set this option: Allow Azure services to access the server!
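If you would rather script this setting than click it in the portal, the sketch below builds the matching Azure CLI command. As far as I know, the portal option corresponds to a server firewall rule whose start and end IP address are both 0.0.0.0. The resource group and server names are placeholders, and the rule name is the one the portal appears to use; treat all of them as assumptions and check them against your own environment.

```python
import shlex


def allow_azure_services_cmd(resource_group: str, server: str) -> list[str]:
    """Build an `az` command that adds the 0.0.0.0 - 0.0.0.0 firewall rule."""
    return [
        "az", "sql", "server", "firewall-rule", "create",
        "--resource-group", resource_group,
        "--server", server,
        # Assumed rule name; the 0.0.0.0 range is what matters here.
        "--name", "AllowAllWindowsAzureIps",
        "--start-ip-address", "0.0.0.0",
        "--end-ip-address", "0.0.0.0",
    ]


# Placeholder names; replace them with your own resource group and server.
cmd = allow_azure_services_cmd("my-resource-group", "my-sql-server")
print(shlex.join(cmd))
```

The function only builds and prints the command, so nothing changes in Azure until you run it yourself. If the hub and the member live on different logical servers, each server probably needs the setting.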

And now I received another error message, something about a bad login.

After correcting the password, I suddenly received a list of tables. It took me a while to get this working, but maybe it's just me.

In the next step I can even choose which columns to synchronize. Some columns are marked as not supported. I'll leave that for later investigation.

The execution

Everything was ready, so I pressed the Sync button and some magic happened. The table was synchronized to the other database!

Some logging appears, and it seems that the synchronization succeeded.

Let's take a look in the database. But hey, there are some tables in the database that I didn't expect, which seems a bit awkward. These tables are needed for the synchronization between the databases.

A lot of synchronization (meta) tables were also created in the member database.
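To see exactly which extra tables were added, you can query the catalog views in both the hub and the member database. The sketch below only builds the T-SQL text. It assumes the sync objects live in a schema called `DataSync`, which is what I would expect from Data Sync but which you should verify in your own database. The function name is made up for this example.

```python
def sync_metadata_tables_query(schema: str = "DataSync") -> str:
    """Return a T-SQL query listing all tables in the given schema."""
    # Only allow simple identifiers, because the name is embedded in the SQL text.
    if not schema.replace("_", "").isalnum():
        raise ValueError(f"unexpected schema name: {schema!r}")
    return (
        "SELECT s.name AS schema_name, t.name AS table_name\n"
        "FROM sys.tables AS t\n"
        "JOIN sys.schemas AS s ON s.schema_id = t.schema_id\n"
        f"WHERE s.name = N'{schema}'\n"
        "ORDER BY t.name;"
    )


# Paste the printed query into SSMS, or run it with your SQL client of choice.
query = sync_metadata_tables_query()
print(query)
```

Running the query before and after setting up the sync group gives a quick overview of what the synchronization added.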

Final thoughts

I expected, or at least hoped, that the synchronization of databases would be a kind of replacement for the import/export of data in the on-premises SQL Server version, a one-stop copy-and-paste method, but it's more of a synchronization tool, as of course the name implies. So for a simple copy action it's usable, but you will get a lot of extra tables in your database unless you use a metadata database.

