Saturday, 24 February 2018

AWS Storage Gateway

Overview
  • Enables hybrid storage architectures 
    • Move data to AWS for big data, cloud bursting, or migration
    • Backup, archive, DR
    • Tiered storage (on-premise, cloud)
  • Uses native AWS storage
    • S3
    • EBS Snapshots
  • Efficient data transfer
    • Reduces bandwidth usage
  • Local caching

AWS Storage Gateway VM
  • Virtual appliance downloaded from AWS
  • Acts as a facade so that client applications on-premise need no change
    • Standard storage protocols
  • Installed on a host on-premise 
    • Needs a VMware ESXi or Microsoft Hyper-V hypervisor
    • Possible to install on EC2
      • e.g. for PoC purpose
  • Activation
    • specify IP address, name, timezone
    • AWS region to store snapshots
    • associates gateway with AWS account
  • Must have access to disk subsystem: SAN, NAS or DAS
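
A minimal sketch of the values collected during activation, assuming the gateway is activated through the API rather than the console. The field names follow the ActivateGateway request as exposed by boto3 (`activate_gateway`). The helper name, the example values, and the set of accepted gateway types are assumptions for illustration and should be checked against the current API reference.

```python
# Hypothetical helper: collects the activation settings listed above into
# keyword arguments shaped like the Storage Gateway ActivateGateway request.
# Nothing here talks to AWS; it only builds and sanity-checks the values.

# Gateway types assumed for this sketch (file, stored/cached volume, tape).
KNOWN_GATEWAY_TYPES = {"FILE_S3", "STORED", "CACHED", "VTL"}


def build_activation_request(activation_key, name, timezone, region, gateway_type):
    """Return a dict of activation parameters, or raise ValueError."""
    if gateway_type not in KNOWN_GATEWAY_TYPES:
        raise ValueError(f"unknown gateway type: {gateway_type}")
    if not name:
        raise ValueError("gateway name must not be empty")
    return {
        "ActivationKey": activation_key,
        "GatewayName": name,
        "GatewayTimezone": timezone,
        "GatewayRegion": region,
        "GatewayType": gateway_type,
    }


request = build_activation_request(
    activation_key="EXAMPLE-KEY",  # placeholder; the real key comes from the gateway VM
    name="onprem-gateway-1",       # placeholder name
    timezone="GMT+1:00",
    region="eu-west-1",
    gateway_type="CACHED",
)
```

With boto3 installed and credentials configured, the dictionary could then be passed on as `boto3.client("storagegateway").activate_gateway(**request)`.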

File Gateway
  • Exposes NFS mount target
    • NFS v3/4.1 
    • Mounts S3 as a file system
  • 1-1 mapping between S3 object and file
    • Including metadata
  • Data stored in S3 bucket and can be accessed directly
    • Fine-grained access control, lifecycle policies, cross-region replication (CRR), etc.
    • EFS does not provide this
    • NFS client can access any data in the bucket
      • Including objects created outside of File Gateway, e.g.
        • Replicated from other bucket
        • Imported via Snowball
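
The 1-1 mapping between files and S3 objects can be pictured with a small sketch. It assumes the share is mounted at a hypothetical path (`/mnt/gateway`) and that the object key is simply the file path relative to the mount point. The function names are made up for illustration.

```python
from pathlib import PurePosixPath


def s3_key_for(mount_point, file_path):
    """Return the object key for a file that lives under the NFS mount.

    Raises ValueError if the file is not under the mount point.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(mount_point))
    return relative.as_posix()


def file_path_for(mount_point, key):
    """Return the path at which an object key shows up under the NFS mount."""
    return (PurePosixPath(mount_point) / key).as_posix()


# A file written through the gateway and the key it would map to.
key = s3_key_for("/mnt/gateway", "/mnt/gateway/reports/2018/q1.csv")

# An object that arrived in the bucket some other way (e.g. replication)
# maps back to a path under the same mount.
path = file_path_for("/mnt/gateway", "imports/snowball/data.bin")
```

Because the mapping is this direct, anything that can read the bucket (lifecycle rules, replication, other applications) sees the same objects the NFS client sees as files.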

Volume Gateway
  • Exposes iSCSI mount point (block storage)
    • Initiator: on-premise Application Server
    • Target: Storage Gateway
  • Data stored on S3 in opaque format 
    • unlike File Gateway
      • no direct access to objects
      • stored in buckets managed by AWS, not in your own
    • compression at-rest and in-transit
  • On-premise volume can be backed up to EBS snapshot
    • Restored as EBS volume
  • Max 1 PB of total volume storage
  • Mode
    • Stored
    • Cached

Volume Gateway (Stored)
  • Local disk: "source of truth"
    • S3: continuous asynchronous backup of on-premise volume
  • EBS snapshots can be created
    • ad-hoc
    • scheduled
  • Size
    • Max 16TB per volume (EBS volume limit)
    • Max 32 volumes (=Max 512TB)
  • Use case
    • Offsite backup
      • On restore: everything downloaded
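
The size limits above are plain arithmetic, sketched here as a check over a planned set of stored volumes. The limits are the ones from these notes (16 TB per volume, 32 volumes). The function and constant names are assumptions for illustration.

```python
# Limits for stored volumes as listed in the notes above.
MAX_STORED_VOLUME_TB = 16
MAX_STORED_VOLUMES = 32


def max_stored_capacity_tb():
    """Largest total capacity a single gateway can hold in stored mode."""
    return MAX_STORED_VOLUME_TB * MAX_STORED_VOLUMES


def check_stored_layout(volume_sizes_tb):
    """Return a list of human-readable problems; empty means the plan fits."""
    problems = []
    if len(volume_sizes_tb) > MAX_STORED_VOLUMES:
        problems.append(
            f"{len(volume_sizes_tb)} volumes planned, limit is {MAX_STORED_VOLUMES}"
        )
    for index, size in enumerate(volume_sizes_tb):
        if size > MAX_STORED_VOLUME_TB:
            problems.append(
                f"volume {index} is {size} TB, limit is {MAX_STORED_VOLUME_TB} TB"
            )
    return problems


total = max_stored_capacity_tb()          # 16 * 32
ok_plan = check_stored_layout([16] * 32)  # exactly at both limits
bad_plan = check_stored_layout([8, 20])   # second volume is too large
```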

Volume Gateway (Cached)
  • S3: "source of truth"
    • Local disk: cache of data
  • Minimize need for on-premise scaling
  • Size
    • Max 32TB per volume
    • Max 32 volumes (=Max 1PB)
  • Data stored on S3 as "Volume Storage"
    • Supports Point In Time Snapshot of data in S3 -> EBS Snapshot
      • Can be restored to "Volume Storage"
      • If < 16 TB can also be restored to EBS volume
  • Allocated on-premise storage best practices
    • Cache Storage - local cache of frequently accessed data
      • optimize performance for iSCSI
      • durable data storage on-premise
      • Allocate at least 20% of the total volume size
      • Use RAID5 or RAID6
      • When it fills up with dirty data, iSCSI writes are blocked
    • Upload Buffer - queue for data waiting to be uploaded to S3 (asynchronous)
      • At least 150 GB recommended
      • Optimizes performance for S3 uploads
      • When it fills up, data is uploaded directly from Cache Storage, but point-in-time (PIT) snapshots cannot be taken during that time
    • Separate Cache/Buffer to different spindles
  • Dirty data - data written to a cached volume that has not yet been uploaded to S3
  • Avoid
    • Windows full format, as it initializes every block and you start paying for the used storage (use quick format instead)
    • Full antivirus sweep, as it reads every block and pushes the working set out of the cache
  • Use cases
    • Large data set but small working set 
    • Moving from Volume (Stored) to Volume (Cached)
    • Backup
      • On restore: nothing downloaded (empty cache)
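
The behaviour of dirty data in the cache can be illustrated with a toy model. This is a simplified sketch, not how the gateway is implemented: blocks are just labels, the cache holds a fixed number of them, and the class and method names are invented for illustration. It only shows the rule from the notes that writes are blocked once the cache is full of data that has not yet reached S3.

```python
class CachedVolumeModel:
    """Toy write-back cache: S3 is the source of truth, local disk is a cache."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity
        self.cache = {}   # block id -> True if dirty (not yet uploaded)
        self.s3 = set()   # block ids that have been uploaded

    def dirty_blocks(self):
        return [block for block, dirty in self.cache.items() if dirty]

    def write(self, block):
        """Write a block locally. Returns False if the write is blocked."""
        if block not in self.cache and len(self.cache) >= self.cache_capacity:
            # Only clean blocks may be evicted; they are already safe in S3.
            clean = next((b for b, dirty in self.cache.items() if not dirty), None)
            if clean is None:
                return False  # cache is full of dirty data
            del self.cache[clean]
        self.cache[block] = True
        return True

    def upload_one(self):
        """Upload the oldest dirty block to S3 and mark it clean."""
        for block, dirty in self.cache.items():
            if dirty:
                self.cache[block] = False
                self.s3.add(block)
                return block
        return None


volume = CachedVolumeModel(cache_capacity=2)
first = volume.write("a")
second = volume.write("b")
blocked = volume.write("c")      # cache holds two dirty blocks, so this fails
uploaded = volume.upload_one()   # "a" reaches S3 and becomes evictable
retried = volume.write("c")      # now succeeds by evicting the clean block
```

In this model a slow uplink shows up as `write` returning False more often, which is the reason the notes recommend a generously sized cache and upload buffer.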

Tape Gateway
  • Drop-in replacement for physical tape infrastructure
  • Exposes iSCSI interface
    • Media Changer
    • Tape Drive
  • Virtual Tape (VT)
    • Analogous to physical cartridge
    • Size: 100 GB to 2.5 TB
    • States
      • AVAILABLE - application may write to it
      • IN TRANSIT TO VTS - uploading data to AWS
      • ARCHIVING - upload to AWS complete; tape is being moved to the archive
      • ARCHIVED - in Glacier
  • Virtual Tape Library (VTL)
    • Analogous to Physical Library (with robotic arms and tape drives)
    • Many existing backup tools supported (e.g. Dell, Veritas)
    • Max 1,500 Virtual Tapes (150 TB total) in library
      • Unlimited number in AWS
    • Drives
      • Tape Drive
        • I/O and Seek
        • Max 10
        • Responds to SCSI commands
      • Media Changer  (robotic arm)
        • Max 1
        • "Inserts" Virtual Tape into Tape Drive
  • Virtual Tape Shelf (VTS)
    • Analogous to off-site tape holding facility
    • When backup software ejects the tape it is moved to VTS
    • Backed by Glacier
  • Use case 
    • Replacing physical tape infrastructure
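
The virtual tape states listed above form a simple one-way sequence, sketched here as a small state machine. It covers only the four states from these notes, in the order given. Retrieval of a tape from the archive is left out, and the function names are assumptions for illustration.

```python
from enum import Enum


class TapeState(Enum):
    AVAILABLE = "AVAILABLE"
    IN_TRANSIT_TO_VTS = "IN TRANSIT TO VTS"
    ARCHIVING = "ARCHIVING"
    ARCHIVED = "ARCHIVED"


# Forward transitions only, in the order the notes list the states.
NEXT_STATE = {
    TapeState.AVAILABLE: TapeState.IN_TRANSIT_TO_VTS,   # backup software ejects the tape
    TapeState.IN_TRANSIT_TO_VTS: TapeState.ARCHIVING,   # upload to AWS finished
    TapeState.ARCHIVING: TapeState.ARCHIVED,            # tape now sits in the VTS
}


def can_write(state):
    """Only a tape that is still in the library accepts writes."""
    return state is TapeState.AVAILABLE


def advance(state):
    """Return the next state, or raise ValueError for the final state."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value} has no further state in this model")
    return NEXT_STATE[state]


def archive_path(start=TapeState.AVAILABLE):
    """List every state a tape passes through on its way to the archive."""
    path = [start]
    while path[-1] in NEXT_STATE:
        path.append(NEXT_STATE[path[-1]])
    return path


states = [state.value for state in archive_path()]
```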
