
Cloud Storage - Backup and Archiving (Commvault) -1

As hybrid cloud has become the standard for enterprise IT infrastructure, enterprises increasingly consider public cloud storage as a long-term archiving solution. As a result, most backup applications and storage appliances are now ready to integrate with the Azure and AWS storage APIs.


I thought I would share some Day-2 challenges encountered while deploying, integrating, and managing backup applications with cloud storage options.


Commvault is one of the leaders in enterprise backup tools, so a couple of scenarios will be tested in this series of posts using Commvault with AWS S3 and Glacier. The picture below depicts the lab architecture.


1) Cloud storage integration support

2) Where cloud storage fits in a 3-2-1 backup strategy

3) Deduplication and micro pruning options

4) Encryption

5) Object locking and ransomware protection

6) Cloud lifecycle policy support

7) Disaster recovery within the cloud


Commvault appears to natively support most cloud storage APIs without any additional license requirement.


The cloud library to integrate can be selected from Library -> Cloud Storage -> Cloud Storage.



Linking the cloud storage bucket requires a programmatic access key and secret. Traditionally, we can store these within the Commvault Credential Manager (encrypted within the appliance); registering the backup infrastructure as a resource and associating an IAM role will be explained in further posts.
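
As a quick sanity check before registering the credentials in Commvault, a short boto3 script can confirm that the access key and secret actually have read/write permission on the bucket. This is only an illustrative sketch; the bucket name, region, and credential values below are placeholders, not the ones used in the lab.

# Verify the programmatic credentials against the backup bucket before
# wiring them into the Commvault Credential Manager.
# Bucket name, region, and credentials are placeholders for illustration.
import boto3
from botocore.exceptions import ClientError

BUCKET = "commvault-lab-bucket"   # hypothetical bucket name

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",        # programmatic access key (placeholder)
    aws_secret_access_key="<secret>",   # secret key (placeholder)
    region_name="us-east-1",
)

try:
    s3.head_bucket(Bucket=BUCKET)                            # can we reach the bucket?
    s3.put_object(Bucket=BUCKET, Key="cv-test", Body=b"x")   # can we write an object?
    s3.delete_object(Bucket=BUCKET, Key="cv-test")           # can we clean up?
    print("Credentials have head/put/delete access to", BUCKET)
except ClientError as err:
    print("Access check failed:", err.response["Error"]["Code"])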


In the lab, the S3 Standard storage class was integrated as the primary destination for on-premises backups to explore the IO pattern, dedup behaviour (storage cost), and DR/BCP capability within the cloud.

Commvault automatically populated the maximum writer streams per bucket, set the library up as a regular share (accessible to other media servers), and enabled micro pruning support.



A sample backup was examined, with deduplication enabled on premises and no deduplication at the cloud library.

Since it is the first backup, we got only the compression benefit, not dedup savings, at the Commvault disk library level.

The 780 MB test payload occupied around 680 MB in the disk library, and the same amount of storage was consumed in the S3 library (without dedup).


Job on premises:


Local disk library chunks:



Cloud disk library chunks in the activity log:


The S3 cloud disk library created each chunk at 32 MB with the default configuration (further fine-tuning needs to be explored).


 

Takeaway: while calculating storage IO cost, we need to account for the 32 MB object size plus retry IO and metadata IO, so per GB it is not just 32 PUT requests; the actual request count may exceed that depending on configuration and tuning parameters.
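
To make that concrete, here is a rough back-of-the-envelope estimate. The assumptions are mine, not from the lab: 32 MB chunks, an illustrative 20% overhead for retries and metadata calls, and a placeholder PUT price of $0.005 per 1,000 requests; check the current S3 pricing for your region before relying on the numbers.

# Rough estimate of S3 write requests and request cost per GB of backup data.
# Overhead factor and PUT pricing are assumptions for illustration only.
GIB_MB = 1024
CHUNK_MB = 32
OVERHEAD = 1.20                 # assumed +20% for retry and metadata IO
PUT_PRICE_PER_1000 = 0.005      # USD, placeholder for S3 Standard PUT requests

base_puts_per_gb = GIB_MB / CHUNK_MB                # 1024 / 32 = 32 objects
effective_requests = base_puts_per_gb * OVERHEAD    # ~38 requests per GB
cost_per_gb = effective_requests / 1000 * PUT_PRICE_PER_1000

print(f"Base PUTs per GB     : {base_puts_per_gb:.0f}")
print(f"With overhead factor : {effective_requests:.0f}")
print(f"Request cost per GB  : ${cost_per_gb:.6f}")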



