Lifting & Shifting Data From RedShift To Azure SQL Using Azure Pipeline

Overview

AWS To Azure Data Transfer: Migrating raw data from AWS Redshift to Azure SQL Data Warehouse.

Easy data migration is one of the key components of digital transformation. This case study describes how we created an Azure pipeline to orchestrate an automated data migration workflow using ETL logic.

The objective of this workflow was to transfer raw data from Amazon Redshift to Azure SQL Data Warehouse. Additionally, we were required to create logic that would trigger automated data extraction based on conditions such as the arrival of a new file or a specific time of day.

This was not a simple lift-and-shift. After extraction, the data was transformed and its schema was changed so that it resembled the other data the client was accustomed to handling.

Technology Used:

Azure Data Factory for providing ETL logic and for file processing
Azure SQL Data Warehouse as the final store for the data consumed by the analysts
Azure Data Lake for long-term file storage


Challenge

We reviewed the challenge and, leveraging the power and scale of the cloud, devised a solution that goes beyond traditional infrastructure.

The client approached us with a data science challenge pertaining to one of their data sets. The data was provided to the client in an AWS environment, as part of a Redshift data warehouse. This proved expensive because the storage and compute costs were coupled together in AWS, which drove up the computing costs. Speed improved as well, but the client did not need that level of performance.

However, the data was also available as CSV files stored in an S3 bucket, which opened the way for a new approach. The client's entire infrastructure was already deployed and managed by AutomationFactory.ai in Azure, so they opted to consolidate the data set into that existing infrastructure.

The key considerations were as follows. The solution must process a large data set of more than 11,000 files with a total compressed size of 2 TB, with additional files arriving every day. Raw files must be ingested into a database and also stored for future needs. Ingestion should be rate-controlled and parallelizable so that multiple database connections can be managed and files are ingested in an orderly way.

Every single file must be accounted for to ensure that the data is moved correctly. Ongoing maintenance must involve minimal effort and cost, and it must be automated and delegated away from the end users.
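As an illustration of the rate-controlled, parallel ingestion requirement, here is a minimal Python sketch. It is not the Azure Data Factory implementation used in the project. The function `ingest_file` and the connection cap `MAX_CONNECTIONS` are hypothetical stand-ins; the idea shown is simply that a bounded worker pool keeps the number of simultaneous database connections under a fixed limit while still working through the files in parallel.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CONNECTIONS = 4  # assumed cap on concurrent database connections


def ingest_file(path):
    """Hypothetical stand-in for loading one raw file into the warehouse."""
    return f"loaded:{path}"


def ingest_all(paths, max_connections=MAX_CONNECTIONS):
    """Ingest files in parallel, with at most max_connections running at once."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        futures = {pool.submit(ingest_file, p): p for p in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # Record the failure and keep going so one bad file
                # does not block the rest of the batch.
                failures[path] = str(exc)
    return results, failures
```

Collecting failures per file, rather than aborting the batch, fits the requirement that every file be accounted for.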


How It Works

Deliver Only Exceptional Quality, And Improve!

Objectives

Our client already had Azure-based infrastructure, so they wanted to lift and shift one of their data sets from AWS Redshift into it.

Technology Used

Azure Data Factory for providing ETL logic and for file processing, Azure Data Lake, and Azure SQL Data Warehouse.

Execution

We built the data pipeline, scheduled the data transfer initialization, created status tables, and abstracted the data pipeline.

Results

The client can now seamlessly and automatically transfer data from Redshift to Azure. The client found the data useful and familiar because its schema had been changed to match their existing data.

Technical Details

AutomationFactory.ai recommended Microsoft's Extract-Transform-Load (ETL) service for Azure because it is a native Azure service that integrates with the other Microsoft services.

Cilio Automation Factory built a data pipeline to move the data from AWS to Azure. The initial load was triggered manually, and update schedules were then set up to check for new files at regular intervals.
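The scheduled check for new files can be sketched as a comparison between the current source listing and the set of files already known to the pipeline. This is only an illustrative Python sketch: `list_source` is a hypothetical callable that returns the file names currently in the source bucket, and the actual project used Azure Data Factory scheduling rather than code like this.

```python
def find_new_files(listed, tracked):
    """Return files in the source listing that are not yet tracked, sorted."""
    return sorted(set(listed) - set(tracked))


def run_scheduled_check(list_source, tracked):
    """One scheduled run: detect new files and mark them as tracked.

    list_source is a hypothetical callable returning current file names;
    tracked is a mutable set of file names the pipeline already knows.
    """
    new_files = find_new_files(list_source(), tracked)
    tracked.update(new_files)
    return new_files
```

Running the check twice against an unchanged listing returns an empty list the second time, so a regular schedule only picks up files that arrived since the previous run.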

We created status tables to keep track of all the files. These tables track the status of the data as it passes through the data pipelines and support a decoupled structure, so troubleshooting or manual intervention can happen at any stage without creating dependencies. Because of this decoupling, individual files and steps can be fixed in isolation while the other pipelines and steps continue without interruption. It also makes errors in the process easy to identify, and users are notified of them for further investigation.
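To make the status-table idea concrete, here is a small Python sketch using an in-memory SQLite database. The table name `file_status`, its columns, and the stage names in `STAGES` are assumptions made for illustration; the source does not describe the actual schema, and the real tables lived in the Azure environment.

```python
import sqlite3

# Hypothetical pipeline stages; the real stage names are not given in the source.
STAGES = ("discovered", "copied", "ingested", "transformed")


def open_status_db(path=":memory:"):
    """Create the status table if needed and return the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS file_status (
               file_name TEXT NOT NULL,
               stage     TEXT NOT NULL,
               state     TEXT NOT NULL CHECK (state IN ('ok', 'failed')),
               detail    TEXT,
               PRIMARY KEY (file_name, stage)
           )"""
    )
    return conn


def record(conn, file_name, stage, state, detail=None):
    """Record the outcome of one stage for one file, replacing any earlier row."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    conn.execute(
        "INSERT OR REPLACE INTO file_status VALUES (?, ?, ?, ?)",
        (file_name, stage, state, detail),
    )
    conn.commit()


def failed_steps(conn):
    """List (file, stage, detail) for every step that needs attention."""
    rows = conn.execute(
        "SELECT file_name, stage, detail FROM file_status "
        "WHERE state = 'failed' ORDER BY file_name, stage"
    )
    return rows.fetchall()
```

Because each row is keyed by file and stage, a failed step can be rerun and re-recorded on its own without touching the status of any other file or stage.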

All of the data is mapped back to the tables so that it can be reprocessed or cleaned when required. The data is then transformed, with additional schema changes to match the client's end use and to map it onto their traditional trading data.
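The schema change step can be pictured as a column mapping applied to each raw CSV row. The sketch below is illustrative only: the column names in `COLUMN_MAP` are invented, since the source does not list the raw or target schemas.

```python
import csv
import io

# Hypothetical mapping from raw column names to the names the client
# already uses; the real schemas are not given in the source.
COLUMN_MAP = {"ts": "trade_time", "sym": "symbol", "px": "price"}


def transform_rows(raw_csv_text, column_map=COLUMN_MAP):
    """Rename and select columns so raw rows match the target schema."""
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    return [
        {new: row[old] for old, new in column_map.items()}
        for row in reader
    ]
```

Keeping the mapping in data rather than in code means a schema adjustment is a change to `COLUMN_MAP`, not to the transformation logic.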

The data pipelines were deliberately abstracted so that adding new data sources in the future requires as little work as possible. The objective was to make things easy for the client's end users, letting them carry out the required steps themselves.
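One way to picture this kind of abstraction is a pipeline driven entirely by a per-source configuration, so that onboarding a new source means adding a configuration entry rather than writing a new pipeline. The Python sketch below assumes a hypothetical `SourceConfig` structure and `plan_pipeline` helper; it does not reflect how the Azure Data Factory pipelines were actually parameterized.

```python
from dataclasses import dataclass, field


@dataclass
class SourceConfig:
    """Hypothetical description of one data source."""
    name: str
    landing_prefix: str   # where the raw files land
    target_table: str     # where the ingested rows go
    column_map: dict = field(default_factory=dict)


def plan_pipeline(source, file_names):
    """Build one generic work item per file from the source's configuration."""
    return [
        {
            "file": f"{source.landing_prefix}/{name}",
            "table": source.target_table,
            "columns": source.column_map,
        }
        for name in file_names
    ]
```

For example, a second source would only need its own `SourceConfig` instance; `plan_pipeline` itself stays unchanged.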


Brainstorming & Recommendation

Microsoft ETL For Azure
Data Cleaning Strategies
Cost Saving Strategies
Ease Of Use For End Users

Execution

Building Data Pipeline
Update Schedule Initialization
Creation of Status Tables
Data Pipeline Abstraction

Results

Seamless Transfer Of Data
Automatic Data Transfers
Schematically Changed Data
Cost Saving
