Model
- Vault
- name must be unique within a region (the same name may be reused in another region)
- container for archives
- analogous to S3 bucket
- max 1000 per region
- Archive
- base unit of storage in Glacier
- immutable (create/delete only)
- can be any data (photo, document, etc.)
- best practice: aggregate data into .zip or .tar
- 32 kB metadata overhead
- max 40 TB
- Upload
- Single request: max 4 GB
- Multipart upload recommended for > 100 MB
- Compute and supply the tree hash
- SHA-256 hash of each 1 MB segment, combined in tree fashion
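The upload limits above can be sketched as a small planning helper. This is only an illustration: the function names are hypothetical, and two rules not stated in these notes are assumed from the Glacier multipart rules as I recall them (part size is 1 MB times a power of two, up to 4 GB, and an upload has at most 10,000 parts).

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

MULTIPART_RECOMMENDED = 100 * MIB  # from the notes: multipart recommended above 100 MB
MAX_PART_SIZE = 4 * GIB            # assumption: largest allowed part
MAX_PARTS = 10_000                 # assumption: maximum parts per upload


def choose_part_size(archive_size: int) -> int:
    """Smallest power-of-two multiple of 1 MiB that fits the archive in MAX_PARTS parts."""
    part_size = MIB
    while part_size * MAX_PARTS < archive_size:
        part_size *= 2
    if part_size > MAX_PART_SIZE:
        raise ValueError("archive exceeds the maximum archive size")
    return part_size


def plan_upload(archive_size: int) -> dict:
    """Pick single-request or multipart upload for an archive of the given size."""
    if archive_size <= MULTIPART_RECOMMENDED:
        return {"mode": "single", "parts": 1}
    part_size = choose_part_size(archive_size)
    parts = -(-archive_size // part_size)  # ceiling division
    return {"mode": "multipart", "part_size": part_size, "parts": parts}
```

Under these assumptions the largest archive is 10,000 parts of 4 GB each, which matches the 40 TB limit listed for archives.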
- Inventory
- Updated once per day
- List of all archives
- Inventory date does not change if no archives were added or deleted
- Format: CSV or JSON
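A minimal sketch of reading a JSON inventory. The sample document is made up for illustration (vault name, IDs, and hashes are placeholders), and the field names are assumed to follow the JSON inventory layout (`InventoryDate`, `ArchiveList`, `Size`, and so on).

```python
import json

# Hypothetical inventory output; real IDs and tree hashes are much longer.
SAMPLE_INVENTORY = """
{
  "VaultARN": "arn:aws:glacier:us-east-1:111122223333:vaults/example-vault",
  "InventoryDate": "2015-04-01T00:00:00Z",
  "ArchiveList": [
    {"ArchiveId": "id-1", "ArchiveDescription": "photos-2014.tar",
     "CreationDate": "2015-01-10T12:00:00Z", "Size": 1048576,
     "SHA256TreeHash": "aaaa"},
    {"ArchiveId": "id-2", "ArchiveDescription": "docs-2014.zip",
     "CreationDate": "2015-02-20T08:30:00Z", "Size": 5242880,
     "SHA256TreeHash": "bbbb"}
  ]
}
"""


def summarize_inventory(text: str) -> dict:
    """Return the inventory date, archive count, and total stored bytes."""
    inventory = json.loads(text)
    archives = inventory["ArchiveList"]
    return {
        "inventory_date": inventory["InventoryDate"],
        "count": len(archives),
        "total_bytes": sum(a["Size"] for a in archives),
    }


summary = summarize_inventory(SAMPLE_INVENTORY)
print(summary["count"], summary["total_bytes"])
```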
Jobs
- Executed asynchronously (Job ID returned)
- Associated with vault
- Multiple jobs may be in-progress
- Typically 3.5-4.5 hours
- When a job completes, the user can download its output (available for 24h)
- Types
- Archive Retrieval
- entire archive or a byte range of the archive (the range must be megabyte-aligned)
- Inventory Retrieval (list of archives)
- filter can be applied (e.g. archive creation date)
- May have SNS notifications enabled
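The two job types can be sketched as plain job-parameter dictionaries. The helper names are hypothetical; the keys (`Type`, `Format`, `InventoryRetrievalParameters`, `ArchiveId`, `RetrievalByteRange`, `SNSTopic`) are assumed to match the Initiate Job request body.

```python
MIB = 1024 * 1024


def inventory_job(start_date=None, end_date=None, fmt="JSON", sns_topic=None):
    """Parameters for an inventory-retrieval job, optionally filtered by creation date."""
    params = {"Type": "inventory-retrieval", "Format": fmt}
    date_filter = {}
    if start_date:
        date_filter["StartDate"] = start_date
    if end_date:
        date_filter["EndDate"] = end_date
    if date_filter:
        params["InventoryRetrievalParameters"] = date_filter
    if sns_topic:
        params["SNSTopic"] = sns_topic  # notify this topic when the job completes
    return params


def archive_job(archive_id, byte_range=None, sns_topic=None):
    """Parameters for an archive-retrieval job; byte_range is (start, end), inclusive."""
    params = {"Type": "archive-retrieval", "ArchiveId": archive_id}
    if byte_range is not None:
        start, end = byte_range
        # Only the start is checked here; the end must also fall on a
        # megabyte boundary (end + 1) or be the last byte of the archive.
        if start % MIB != 0:
            raise ValueError("range start must be megabyte-aligned")
        params["RetrievalByteRange"] = f"{start}-{end}"
    if sns_topic:
        params["SNSTopic"] = sns_topic
    return params
```

With boto3, a dictionary like this would be passed as `jobParameters` to the Glacier client's `initiate_job` call, which returns the job ID to poll or to match against the SNS notification.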
Tree Hash
- On upload include 2 headers
- x-amz-content-sha256
- hash of entire payload used for signature calculation
- x-amz-sha256-tree-hash
- specific to archive upload
- main benefit - avoids re-reading a (potentially big) file to calculate its hash
- it's computed piecemeal
- for each 1 MB chunk compute a SHA-256 hash (the last chunk may be < 1 MB)
- build the next level of the tree: hash each pair of adjacent hashes (an unpaired last hash moves up unchanged)
- repeat until you reach the top (root)
- Examples
- Single request (6.5 MB)
- 1 request (SHA256 computed 13 times)
- Multi-part
- 2 requests, each with the tree hash of its own part
- Complete Multipart Upload (tree hash of the entire archive)
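The tree-hash steps above can be sketched with the standard library alone. The function names are hypothetical, and the whole payload is held in memory for brevity; a real uploader would hash 1 MB chunks as it streams the file. The second return value counts SHA-256 computations so the 6.5 MB example can be checked.

```python
import hashlib

MIB = 1024 * 1024


def combine(digests):
    """Reduce a list of SHA-256 digests to a root; returns (root, hashes_computed)."""
    count = 0
    level = list(digests)
    while len(level) > 1:
        # Hash each adjacent pair; an unpaired last digest moves up unchanged.
        pairs = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level) - 1, 2)]
        count += len(pairs)
        if len(level) % 2:
            pairs.append(level[-1])
        level = pairs
    return level[0], count


def tree_hash(data: bytes):
    """Return (hex tree hash, number of SHA-256 computations) for a payload."""
    leaves = [hashlib.sha256(data[i:i + MIB]).digest()
              for i in range(0, max(len(data), 1), MIB)]
    root, count = combine(leaves)
    return root.hex(), count + len(leaves)


# 6.5 MB payload: 7 leaf hashes + 3 + 2 + 1 combining hashes.
payload = bytes(range(256)) * (6 * 4096 + 2048)
digest, hashes = tree_hash(payload)
print(hashes)  # 13
```

For a multipart upload, the tree hashes of the individual parts can be fed to `combine` to obtain the tree hash of the whole archive, provided every part except the last has a size of 1 MB times a power of two, so that the part boundaries line up with subtrees.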
Vault Access Policy
- Resource-based policy (similar to bucket policy)
Vault Lock Policy
- Similar to a vault access policy
- Enforce compliance requirements
- e.g. WORM (Write Once Read Many)
- Once policy is locked it cannot be edited
- Stronger control than vault access policy
- Use case
- time-based data retention rules (deny deletes) but allow read access
- Combine vault lock policy (deny delete) and vault access policy (read)
- Process
- Initiate lock
- Sets to IN_PROGRESS and returns LockId
- Validate and test your policy
- 24-hour timeout (the lock ID expires if the lock is not completed in time)
- Complete the lock process
- Policy elements
- Resource (vault)
- Conditions
- glacier:ArchiveAgeInDays, glacier:ResourceTag
- Action
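A minimal sketch of a time-based retention policy built from the elements above: deny `glacier:DeleteArchive` while an archive is younger than a given age. The function name, statement ID, and vault ARN are placeholders chosen for illustration.

```python
import json


def deny_early_delete_policy(vault_arn: str, min_age_days: int) -> dict:
    """Vault lock policy that denies deletes of archives younger than min_age_days."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "deny-delete-before-retention",  # hypothetical statement ID
            "Principal": "*",
            "Effect": "Deny",
            "Action": "glacier:DeleteArchive",
            "Resource": vault_arn,
            "Condition": {
                "NumericLessThan": {"glacier:ArchiveAgeInDays": str(min_age_days)}
            },
        }],
    }


policy = deny_early_delete_policy(
    "arn:aws:glacier:us-east-1:111122223333:vaults/example-vault", 365
)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

With boto3, the JSON string would be passed to `initiate_vault_lock` (which returns the lock ID), and `complete_vault_lock` would be called with that ID within 24 hours once the policy has been tested; `abort_vault_lock` discards a policy that is still in progress.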
Pricing
- Up to 5% of stored data can be retrieved each month for free
- Retrieval beyond that is billed per GB
- The peak retrieval rate is taken and applied to the whole month retroactively
- Data Retrieval Policy
- Simplifies cost management by setting limits
- Free Tier Only
- Max Retrieval Rate (GB/h)
- No Retrieval Limit
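The three options above can be sketched as the policy document the API expects. The helper name is hypothetical, and the strategy names are assumed to map as follows: Free Tier Only is `FreeTier`, Max Retrieval Rate is `BytesPerHour`, and No Retrieval Limit is `None`.

```python
GIB = 1024 ** 3


def retrieval_policy(strategy: str, max_gb_per_hour=None) -> dict:
    """Build a data retrieval policy with a single rule."""
    if strategy == "BytesPerHour":
        if max_gb_per_hour is None:
            raise ValueError("BytesPerHour needs a rate limit")
        # The notes give the limit in GB/h; the rule takes bytes per hour.
        rule = {"Strategy": strategy, "BytesPerHour": int(max_gb_per_hour * GIB)}
    elif strategy in ("FreeTier", "None"):
        rule = {"Strategy": strategy}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {"Rules": [rule]}
```

With boto3, the result would be passed as `Policy` to the Glacier client's `set_data_retrieval_policy` call.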
S3
- Integrated with S3
- Lifecycle configuration can transition objects from S3 to Glacier (getting them back requires a restore request, not a lifecycle rule)
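A minimal sketch of a lifecycle configuration that moves objects under a prefix to Glacier after a number of days. The helper name, rule ID scheme, and prefix are made up for illustration.

```python
def glacier_transition_rule(prefix: str, days: int) -> dict:
    """Lifecycle configuration with one rule: transition to GLACIER after `days` days."""
    return {
        "Rules": [{
            "ID": f"archive-{prefix.strip('/') or 'all'}",  # hypothetical naming scheme
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
        }]
    }


config = glacier_transition_rule("logs/", 30)
print(config["Rules"][0]["ID"])
```

With boto3, the dictionary would be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call.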