AWS DataSync overview

AWS DataSync is an online data transfer service that simplifies, automates, and accelerates copying large amounts of data between on-premises storage systems and AWS Storage services and between AWS Storage services.

For example, DataSync can copy data between Network File System (NFS) shares, Server Message Block (SMB) file servers, self-managed object storage, Amazon S3 buckets, Amazon EFS file systems, and Amazon FSx file systems.

The diagram above depicts the typical architecture of AWS DataSync.

How it works:

1) DataSync service: The managed service in the AWS cloud that manages and tracks data sync tasks and their schedules.

2) DataSync agent: A virtual appliance, deployed on premises or in the cloud, with the compute power to run scheduled copies, upload data, and maintain metadata (for full and incremental data transfers).
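The split between the two components can be sketched in Python. The snippet below only builds the request parameters one might hand to the DataSync service through boto3; it makes no AWS calls. The hostname, ARNs, and helper function names are hypothetical placeholders, and the parameter keys follow the boto3 `create_location_nfs` and `create_location_s3` calls as I understand them, so verify them against the current API reference before use.

```python
# Sketch only: builds request parameters and makes no AWS calls.
# Every ARN, hostname, and helper name here is a hypothetical placeholder.


def nfs_source_params(agent_arn: str, hostname: str, export_path: str) -> dict:
    """Parameters for an on-premises NFS source that is read through an agent."""
    return {
        "ServerHostname": hostname,
        "Subdirectory": export_path,
        # The agent is the component that actually reads the file server.
        "OnPremConfig": {"AgentArns": [agent_arn]},
    }


def s3_destination_params(bucket_arn: str, role_arn: str, prefix: str = "/") -> dict:
    """Parameters for an S3 destination that the service writes to."""
    return {
        "S3BucketArn": bucket_arn,
        "Subdirectory": prefix,
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }


source = nfs_source_params(
    agent_arn="arn:aws:datasync:us-east-1:111122223333:agent/agent-EXAMPLE",
    hostname="nfs.example.internal",
    export_path="/exports/data",
)
destination = s3_destination_params(
    bucket_arn="arn:aws:s3:::example-destination-bucket",
    role_arn="arn:aws:iam::111122223333:role/example-datasync-role",
)

# With boto3 installed and credentials configured, these dictionaries could be
# passed to the service, for example:
#   client = boto3.client("datasync")
#   source_arn = client.create_location_nfs(**source)["LocationArn"]
#   destination_arn = client.create_location_s3(**destination)["LocationArn"]
```

The agent appears only in the source definition because it sits next to the on-premises storage, while the S3 side is reached by the service itself through an IAM role.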

Advantages:

a) Cost-effective solution for data sync tasks (the service is charged per GB of data copied)

b) Well suited for rapid deployment, with no changes required to existing infrastructure.

c) Secure transport between source and destination

d) Granular data sync schedules, from hourly to multi-day intervals

e) Full and incremental data sync support

f) Data verification is supported at various stages of a transfer.

g) Logs and events can be integrated with AWS monitoring services such as Amazon CloudWatch.
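Several of these points (scheduling, full versus incremental transfer, verification, and logging) correspond to settings on a DataSync task. The following is a minimal sketch, assuming the boto3 `create_task` parameter names `Schedule`, `Options`, and `CloudWatchLogGroupArn`. It only assembles the request dictionary and makes no AWS calls. The task name, ARNs, and the `task_params` helper are hypothetical.

```python
# Sketch only: assembles a create_task request without calling AWS.
# The helper name, task name, and all ARNs are hypothetical placeholders.


def task_params(
    source_location_arn: str,
    destination_location_arn: str,
    log_group_arn: str,
    *,
    incremental: bool = True,
    schedule: str = "cron(0 2 * * ? *)",  # daily at 02:00 UTC
) -> dict:
    return {
        "Name": "nightly-sync",
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        # Task logs are delivered to this CloudWatch log group.
        "CloudWatchLogGroupArn": log_group_arn,
        "Schedule": {"ScheduleExpression": schedule},
        "Options": {
            # CHANGED copies only differences; ALL copies everything.
            "TransferMode": "CHANGED" if incremental else "ALL",
            # Verify only the files that were transferred in this run.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Log each transferred file.
            "LogLevel": "TRANSFER",
        },
    }


params = task_params(
    "arn:aws:datasync:us-east-1:111122223333:location/loc-SOURCE",
    "arn:aws:datasync:us-east-1:111122223333:location/loc-DEST",
    "arn:aws:logs:us-east-1:111122223333:log-group:/example/datasync",
)

# With boto3 available, the request could be sent as:
#   boto3.client("datasync").create_task(**params)
```

Passing `incremental=False` switches the sketch to a full copy on every run, which is the same choice the service exposes through its transfer mode option.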

Disadvantages:

a) Multiple agent appliances may be required, depending on the number of files

b) Multiple tasks may need to be created, depending on the depth of the directory and file hierarchy

c) No visibility into, or way to track, the sync or scan status at the source and destination
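One way to handle the need for multiple tasks is to partition the top-level directories of a source and give each task its own include filter. The sketch below illustrates that idea under stated assumptions: the `split_into_tasks` helper, the directory names, and the task naming scheme are hypothetical, and the filter shape (`FilterType` of `SIMPLE_PATTERN` with a pipe-delimited `Value`) follows the boto3 `create_task` `Includes` parameter as I understand it. No AWS calls are made.

```python
# Sketch only: plans several tasks from one directory list, no AWS calls.
# Helper name, directory names, and task names are hypothetical.


def split_into_tasks(top_level_dirs: list[str], dirs_per_task: int) -> list[dict]:
    """Group directories and emit one include filter per planned task."""
    if dirs_per_task < 1:
        raise ValueError("dirs_per_task must be at least 1")

    tasks = []
    for start in range(0, len(top_level_dirs), dirs_per_task):
        group = top_level_dirs[start:start + dirs_per_task]
        tasks.append(
            {
                "Name": f"sync-part-{len(tasks) + 1}",
                # Patterns in a single filter value are separated by "|".
                "Includes": [
                    {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(group)}
                ],
            }
        )
    return tasks


plans = split_into_tasks(
    ["/projects", "/home", "/archive", "/scratch", "/media"], dirs_per_task=2
)
for plan in plans:
    print(plan["Name"], plan["Includes"][0]["Value"])
```

Each planned entry would still need source and destination location ARNs before it could be submitted as a task; how many directories to group per task depends on the file counts involved and is left as a tuning choice.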