Azure Blob Storage Backup Automation

Azure Blob Storage is Microsoft's object storage service for unstructured data. Automating backups to it keeps copies of critical data off the source machine and makes restores repeatable. This guide provides Bash and Python scripts that archive a directory, upload the archive to a blob container, and delete backups older than a retention period, with explanations and examples.


Key Features

  1. Support for File/Directory Backups:
    • Compress files/directories and upload them to Azure Blob Storage.
  2. Retention Management:
    • Automatically delete old backups to optimize storage.
  3. Customizable Scheduling:
    • Automate backups with cron or similar tools.
  4. Logging and Error Handling:
    • Maintain logs for all backup operations.
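
As a rough illustration of the logging and error-handling feature, the sketch below uses Python's standard logging module and runs each backup step inside a try/except so that a failure is logged and stops the run. The helper names build_logger and run_steps, and the step names, are hypothetical and are not part of the scripts in this guide.

```python
import logging
import sys


def build_logger(log_path=None):
    """Return a logger that writes to stderr and, optionally, to a file."""
    logger = logging.getLogger("azure_blob_backup")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    fmt = logging.Formatter("[%(asctime)s] %(levelname)s %(message)s")
    handlers = [logging.StreamHandler(sys.stderr)]
    if log_path:
        handlers.append(logging.FileHandler(log_path))
    for handler in handlers:
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger


def run_steps(steps, logger):
    """Run (name, callable) pairs in order; log and stop at the first failure."""
    for name, step in steps:
        try:
            step()
        except Exception:
            logger.exception("Step failed: %s", name)
            return False
        logger.info("Step succeeded: %s", name)
    return True


# Placeholder steps; a real run would pass the archive, upload and cleanup functions.
log = build_logger()  # pass a path such as "backup.log" to also log to a file
ok = run_steps([("create archive", lambda: None), ("upload", lambda: None)], log)
```

Returning a boolean instead of raising lets the caller decide on the exit code, which is what cron-based monitoring usually keys on.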

Prerequisites

  1. Azure CLI Installed:
    • Install the Azure CLI (the command below is the installer for Debian and Ubuntu):
      curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
      
    • Log in and select the subscription to use:
      az login
      az account set --subscription <subscription_id>
      
  2. Storage Account and Container:
    • Create a storage account and blob container:
      az storage account create --name <storage_account_name> --resource-group <resource_group> --location <location>
      az storage container create --account-name <storage_account_name> --name <container_name>
      
  3. Access Key or SAS Token:
    • Retrieve the storage account's access keys:
      az storage account keys list --account-name <storage_account_name>
      
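    • Or create a Shared Access Signature (SAS) token. The retention step lists, reads, and deletes blobs, so the token needs those permissions as well as write:
      az storage container generate-sas --name <container_name> --account-name <storage_account_name> --permissions dlrw --expiry <expiry_date>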
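
A SAS token is used by appending it to a blob's URL as the query string. The sketch below shows how such a URL could be assembled; blob_sas_url is a hypothetical helper, the account, container, and token values are placeholders, and the public-cloud endpoint suffix core.windows.net is assumed.

```python
from urllib.parse import quote


def blob_sas_url(account, container, blob_name, sas_token,
                 endpoint_suffix="core.windows.net"):
    """Build the HTTPS URL of a blob with a SAS token as its query string."""
    token = sas_token.lstrip("?")  # tolerate tokens copied with a leading "?"
    path = f"{quote(container)}/{quote(blob_name)}"
    return f"https://{account}.blob.{endpoint_suffix}/{path}?{token}"


# Placeholder values for illustration only; this is not a real token.
url = blob_sas_url("mystorageaccount", "backups",
                   "backup_20240101020000.tar.gz", "sv=...&sig=...")
print(url)
```

Anyone who holds the resulting URL has the token's permissions until it expires, so it should be treated like a credential.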
      

Bash Script for Azure Blob Backup

Script: azure_blob_backup.sh

#!/bin/bash

# Azure Blob Storage Backup Script
# Author: [Your Name]
# Version: 1.0

# Configuration
SOURCE_DIR="/path/to/source"          # Directory to back up
BACKUP_DIR="/path/to/local/backups"   # Temporary local backup directory
STORAGE_ACCOUNT="<storage_account>"  # Azure storage account name
CONTAINER="<container_name>"          # Azure Blob container name
RETENTION_DAYS=7                      # Number of days to retain backups
LOG_FILE="/var/log/azure_blob_backup.log"  # Log file for backup process

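# Credentials: the Azure CLI reads AZURE_STORAGE_KEY from the environment,
# so the variable must be exported. Prefer setting it outside this script.
export AZURE_STORAGE_KEY="${AZURE_STORAGE_KEY:-<access_key>}"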

# Ensure the backup directory exists
mkdir -p "$BACKUP_DIR"

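# Function: Log messages (written to stderr so that command substitution
# captures only a function's real output, not its log lines)
log_message() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE" >&2
}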

# Function: Create a backup
create_backup() {
    local backup_file="${BACKUP_DIR}/backup_$(date '+%Y%m%d%H%M%S').tar.gz"
    log_message "Creating backup for $SOURCE_DIR..."
    tar -czf "$backup_file" "$SOURCE_DIR"

    if [[ $? -eq 0 ]]; then
        log_message "Backup created: $backup_file"
        echo "$backup_file"
    else
        log_message "Backup creation failed."
        exit 1
    fi
}

# Function: Upload backup to Azure Blob Storage
upload_to_blob() {
    local file=$1
    log_message "Uploading $file to Azure Blob Storage..."
    az storage blob upload --account-name "$STORAGE_ACCOUNT" --container-name "$CONTAINER" --name "$(basename "$file")" --file "$file"

    if [[ $? -eq 0 ]]; then
        log_message "Upload successful: $file"
    else
        log_message "Upload failed for $file."
        exit 1
    fi
}

# Function: Clean up old backups
cleanup_old_backups() {
    log_message "Cleaning up backups older than $RETENTION_DAYS days..."
    az storage blob list --account-name "$STORAGE_ACCOUNT" --container-name "$CONTAINER" --query "[].name" -o tsv | while read -r blob_name; do
        blob_date=$(az storage blob show --account-name "$STORAGE_ACCOUNT" --container-name "$CONTAINER" --name "$blob_name" --query "properties.lastModified" -o tsv | cut -d'T' -f1)
        blob_date_epoch=$(date -d "$blob_date" +%s)
        cutoff_date_epoch=$(date -d "$RETENTION_DAYS days ago" +%s)

        if [[ $blob_date_epoch -lt $cutoff_date_epoch ]]; then
            az storage blob delete --account-name "$STORAGE_ACCOUNT" --container-name "$CONTAINER" --name "$blob_name"
            log_message "Deleted old backup: $blob_name"
        fi
    done
}

# Main Script Execution
log_message "=== Azure Blob Backup Script Started ==="
# "exit" inside $(...) only ends the subshell, so check the status here
backup_file=$(create_backup) || exit 1
upload_to_blob "$backup_file"
cleanup_old_backups
log_message "Backup process completed successfully."

Explanation of the Bash Script

  1. Backup Creation:
    • Compresses the source directory into a .tar.gz file:
      tar -czf "$backup_file" "$SOURCE_DIR"
      
  2. Upload to Azure Blob Storage:
    • Uses the Azure CLI to upload files:
      az storage blob upload --account-name "$STORAGE_ACCOUNT" --container-name "$CONTAINER" --name "$(basename "$file")" --file "$file"
      
  3. Retention Policy:
    • Lists blobs in the container and deletes those older than RETENTION_DAYS.
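
The retention decision itself is a timestamp comparison against a cutoff. The following sketch expresses it in Python under a few assumptions: the listing is a made-up set of (name, last-modified) pairs, the timestamps are ISO 8601 strings with a UTC offset, and expired_blobs is a hypothetical helper. Unlike the Bash script, which truncates the timestamp to a date, it compares full timestamps.

```python
from datetime import datetime, timedelta, timezone


def expired_blobs(blobs, retention_days, now=None):
    """Return names of blobs whose last-modified time is older than the cutoff.

    `blobs` is an iterable of (name, last_modified_iso) pairs, where the
    timestamp is an ISO 8601 string with a UTC offset (assumed format).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [name for name, modified in blobs
            if datetime.fromisoformat(modified) < cutoff]


# Hypothetical listing, evaluated against a fixed "now" so the result is stable.
listing = [
    ("backup_20240101020000.tar.gz", "2024-01-01T02:00:05+00:00"),
    ("backup_20240109020000.tar.gz", "2024-01-09T02:00:04+00:00"),
]
fixed_now = datetime(2024, 1, 10, 2, 0, tzinfo=timezone.utc)
print(expired_blobs(listing, 7, now=fixed_now))
```

Keeping the decision in a pure function makes it easy to test without touching the storage account; only the names it returns would then be passed to a delete call.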

Python Script for Azure Blob Backup

Script: azure_blob_backup.py

import os
import tarfile
from datetime import datetime, timedelta
from azure.storage.blob import BlobServiceClient

# Configuration
SOURCE_DIR = "/path/to/source"              # Directory to back up
BACKUP_DIR = "/path/to/local/backups"       # Temporary local backup directory
STORAGE_ACCOUNT = "mystorageaccount"        # Azure storage account name (lowercase letters and digits only)
CONTAINER = "my-container"                  # Azure Blob container name (lowercase letters, digits, hyphens)
CONNECTION_STRING = f"DefaultEndpointsProtocol=https;AccountName={STORAGE_ACCOUNT};AccountKey=your_key;EndpointSuffix=core.windows.net"
RETENTION_DAYS = 7                          # Retention period in days
LOG_FILE = "/var/log/azure_blob_backup.log" # Log file for backup process

# Ensure the backup directory exists
os.makedirs(BACKUP_DIR, exist_ok=True)

# Logging function
def log_message(message):
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(LOG_FILE, "a") as log_file:
        log_file.write(f"[{timestamp}] {message}\n")
    print(f"[{timestamp}] {message}")

# Create a tar.gz backup
def create_backup():
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    backup_file = os.path.join(BACKUP_DIR, f"backup_{timestamp}.tar.gz")
    log_message(f"Creating backup: {backup_file}")

    with tarfile.open(backup_file, "w:gz") as tar:
        tar.add(SOURCE_DIR, arcname=os.path.basename(SOURCE_DIR))

    log_message(f"Backup created: {backup_file}")
    return backup_file

# Upload a file to Azure Blob Storage
def upload_to_blob(file_path):
    log_message(f"Uploading {file_path} to Azure Blob Storage...")
    blob_service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    blob_client = blob_service.get_blob_client(container=CONTAINER, blob=os.path.basename(file_path))

    with open(file_path, "rb") as data:
        blob_client.upload_blob(data, overwrite=True)

    log_message(f"Upload successful: {file_path}")

# Delete old backups from Azure Blob Storage
def cleanup_old_backups():
    log_message(f"Cleaning up backups older than {RETENTION_DAYS} days...")
    blob_service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    container_client = blob_service.get_container_client(CONTAINER)
    # blob.last_modified is timezone-aware, so the cutoff must be timezone-aware as well.
    cutoff_date = datetime.now().astimezone() - timedelta(days=RETENTION_DAYS)

    for blob in container_client.list_blobs():
        if blob.last_modified < cutoff_date:
            container_client.delete_blob(blob.name)
            log_message(f"Deleted old backup: {blob.name}")

# Main function
def main():
    log_message("=== Azure Blob Backup Script Started ===")
    backup_file = create_backup()
    upload_to_blob(backup_file)
    cleanup_old_backups()
    log_message("Backup process completed successfully.")

if __name__ == "__main__":
    main()

Key Features of the Python Script

  1. File Compression:
    • Uses the tarfile module to create a .tar.gz archive.
  2. Blob Upload:
    • Uses the azure-storage-blob Python SDK (installed with pip install azure-storage-blob) to upload files to Azure Blob Storage.
  3. Retention Management:
    • Deletes blobs older than RETENTION_DAYS based on the last_modified timestamp.
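
To make the compression step concrete, here is a small self-contained sketch of how tarfile stores paths when arcname is set. It runs in a throwaway temporary directory; make_archive and the file and directory names are made up for the demonstration.

```python
import os
import tarfile
import tempfile


def make_archive(source_dir, archive_path):
    """Write source_dir into a gzip-compressed tar archive and return its member names."""
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the top-level directory name
        # instead of the full absolute path on the source machine.
        tar.add(source_dir, arcname=top_level)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Demo in a throwaway directory; the file and directory names are made up.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "site-data")
    os.makedirs(source)
    with open(os.path.join(source, "notes.txt"), "w") as fh:
        fh.write("hello\n")
    names = make_archive(source, os.path.join(tmp, "backup.tar.gz"))
    print(names)
```

os.path.normpath is applied first so that a source path with a trailing slash still yields a non-empty top-level name.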

Scheduling Backup Automation

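  1. Bash Script with cron:
    Make the script executable and open the crontab for editing:
    chmod +x /path/to/azure_blob_backup.sh
    crontab -e

    Add the following entry to run the backup every day at 02:00:

    0 2 * * * /path/to/azure_blob_backup.sh >> /var/log/azure_blob_backup_cron.log 2>&1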
    
  2. Python Script with cron:
    0 2 * * * python3 /path/to/azure_blob_backup.py >> /var/log/azure_blob_backup_cron.log 2>&1
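
A crontab entry consists of five schedule fields followed by the command. The sketch below splits an entry into those parts to show how "0 2 * * *" reads as minute 0 of hour 2 on every day; split_cron_entry is a hypothetical helper for illustration and does not validate the field values.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(entry):
    """Split a crontab line into its five schedule fields and the command."""
    parts = entry.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


schedule, command = split_cron_entry(
    "0 2 * * * /path/to/azure_blob_backup.sh >> /var/log/azure_blob_backup_cron.log 2>&1"
)
print(schedule)
print(command)
```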
    

Best Practices

  1. Encrypt Backups:
    • Use GPG to encrypt backups before uploading:
      gpg --encrypt --recipient <recipient_email> backup.tar.gz
      
  2. Monitor Logs:
    • Regularly review logs to ensure successful backups.
  3. Secure Credentials:
    • Use Azure Key Vault to securely store access keys and connection strings.
  4. Test Restorations:
    • Periodically restore backups to verify data integrity.
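
A restore test can be partly automated. The sketch below, which assumes the archive was created with the source directory's name as its top-level entry (as in the Python script above), extracts a downloaded archive into a temporary directory and compares SHA-256 digests of every file against the source. The helper names file_digests and verify_restore are hypothetical.

```python
import hashlib
import os
import tarfile
import tempfile


def file_digests(root):
    """Map each file's path relative to root to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            # Reads each file fully into memory; fine for a sketch, not for huge files.
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def verify_restore(archive_path, source_dir):
    """Extract the archive to a temp directory and compare it with source_dir."""
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tempfile.TemporaryDirectory() as restore_dir:
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(restore_dir)  # only extract archives you created yourself
        restored = file_digests(os.path.join(restore_dir, top_level))
    return restored == file_digests(source_dir)
```

A call such as verify_restore("/path/to/downloaded/backup.tar.gz", "/path/to/source") returns True only if both trees contain the same files with identical contents. If the source has changed since the backup was taken, a mismatch is expected, so the check is most meaningful against a snapshot or right after the backup runs.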


 
