Friday, 23 March 2018

AWS SNS (SMS)


Overview
  • Worldwide SMS delivery
  • Sending modes
    • Direct
    • Bulk (topic subscribers)
  • DisplayName (first 10 characters) used in text message
  • Phone number in E.164 format (e.g. +48XXXYYYZZZ)
  • Character limit
    • 160 GSM characters
    • 140 ASCII
    • 70 for Unicode (UCS-2)
  • Type
    • Promotional
    • Transactional
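
The number format and per-message limits above can be checked before publishing. This is a minimal sketch: the helper names are hypothetical, the caller is assumed to already know which encoding the message will use, and the E.164 pattern assumes the usual maximum of 15 digits.

```python
import re

# Per-message limits taken from the list above.
LIMITS = {"gsm": 160, "ascii": 140, "ucs2": 70}

# Assumption: E.164 is "+" followed by 2-15 digits, the first one non-zero.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True when the number looks like an E.164 phone number."""
    return E164.fullmatch(number) is not None


def fits_single_message(text: str, encoding: str) -> bool:
    """Return True when the text fits in one message for the given encoding."""
    if encoding == "ucs2":
        # UCS-2 limits count 16-bit units, not Python code points.
        units = len(text.encode("utf-16-be")) // 2
    else:
        units = len(text)
    return units <= LIMITS[encoding]
```

Longer texts are not rejected by this sketch; it only reports whether a single message is enough.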

Opt-out
  • Certain countries require it (e.g. US, Canada)
  • Send STOP or QUIT to unsubscribe

Short code
  • 5-6 digit
  • By default AWS assigns a shared code
  • Dedicated code
    • Assigned exclusively to sender
    • Supports higher throughput
  • Recipient may reply to the message (e.g. opt-out)

Sender Id
  • Up to 11 alphanumeric characters
    • e.g. brand name
  • Only certain countries supported
    • e.g. US is not


Monitoring
  • Sent
  • Failed
  • Delivery rate (successful deliveries / total sent)

AWS S3 (Security)


Encryption
  • Metadata is never encrypted
  • Server Side (SSE)
    • Possible to enforce with bucket policy (e.g. only encrypted data can be uploaded)
    • SSE-S3
      • S3 manages keys (AES-256)
    • SSE-KMS
      • More flexible than SSE-S3 but additional charges (for KMS)
      • Customer can manage their own key or use the default KMS key generated for them (aws/s3)
      • ETag is no longer the MD5 hash of the object (as that would be a security hole)
      • Headers
        • x-amz-server-side-encryption = aws:kms
        • x-amz-server-side-encryption-aws-kms-key-id
        • x-amz-server-side-encryption-context (do not use sensitive data here)
    • SSE-C
      • Customer provides the key
      • Different objects (and versions) may use different keys
      • Headers
        • x-amz-server-side-encryption-customer-algorithm = AES256
        • x-amz-server-side-encryption-customer-key
        • x-amz-server-side-encryption-customer-key-MD5
    • Default Encryption
      • Feature to have S3 automatically encrypt the object (SSE-S3, SSE-KMS)
  • Client Side (CSE)
    • Encryption is opaque to S3 (just a blob)
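
The SSE-C headers listed above can be derived from a raw key as follows. This is a sketch, not an S3 client: the function name is hypothetical, and it relies on S3 expecting the 256-bit key and its MD5 digest in base64.

```python
import base64
import hashlib
import os


def sse_c_headers(key: bytes) -> dict:
    """Build the three SSE-C request headers for a customer-provided key."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself travels base64-encoded (only ever over HTTPS).
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(key).decode("ascii"),
        # The MD5 digest lets S3 detect a key corrupted in transit.
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode("ascii"),
    }


# Example: a fresh random key; the same key is needed again for every GET.
headers = sse_c_headers(os.urandom(32))
```

S3 does not store the key, so losing it means losing the object.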

Permissions
  • Places where you set up access permissions
    • Bucket Policy
      • Limited document size
    • ACL
      • Bucket ACL
      • Object ACL
    • User IAM Policy
  • Authorities
    • Parent Account Owner
    • Bucket Account Owner
    • Object Account Owner
  • User Context
    • Only when IAM user
  • Bucket Context
  • Object Context
    • Bucket Account Owner can deny access

ACL
  • Bucket and object level
  • Default ACL: grants owner full permissions
  • Max 100 grants per ACL
  • Grantee
    • AWS account
      • can be identified by email address
      • Cannot grant permissions to IAM users
    • Predefined AWS Group
      • Authenticated Users (any AWS account) - request must be signed (Authorization header)
      • All Users (includes Anonymous)
      • Log Delivery Group (WRITE permission enables storing S3 logs)
  • Permissions
    • READ
      • Bucket
        • ListBucket, ListBucketVersions, ListBucketMultipartUploads
      • Object
        • GetObject, GetObjectVersion, GetObjectTorrent
    • WRITE
      • Bucket
        • PutObject, DeleteObject; DeleteObjectVersion only when the grantee is the bucket owner
    • READ_ACP (read bucket/object ACL)
      • Bucket
        • GetBucketAcl
      • Object
        • GetObjectAcl, GetObjectVersionAcl
    • WRITE_ACP (change bucket/object ACL)
      • Bucket
        • PutBucketAcl
      • Object
        • PutObjectAcl, PutObjectVersionAcl
  • Canned ACL (predefined grants)
    • private
    • public-read
    • public-read-write
    • aws-exec-read
    • authenticated-read
    • bucket-owner-read
    • bucket-owner-full-control
    • log-delivery-write
  • Use cases
    • Generally prefer Bucket Policy and IAM policy (ACL is legacy mechanism)
    • LogDeliveryGroup must use ACL
    • Bucket Policy document limit reached
    • Wide variety of permissions on objects (cannot be captured by policy easily)
    • Used in conjunction with Requester Pays

Pre-signed urls
  • Example
    • https://s3.amazonaws.com/examplebucket/test.txt
      ?X-Amz-Algorithm=AWS4-HMAC-SHA256
      &X-Amz-Credential=<your-access-key-id>/20130721/us-east-1/s3/aws4_request
      &X-Amz-Date=20130721T201207Z
      &X-Amz-Expires=86400
      &X-Amz-SignedHeaders=host
      &X-Amz-Signature=<signature-value>  
  • Uploading encrypted object
    • SSE-KMS
    • SSE-S3
    • SSE-C (customer specified key)
      • restricts that upload to specific encryption key
  • Use cases
    • Restricted download
      • e.g. temporary access to a file (max 7 days)
    • Restricted upload
      • e.g. for clients without any AWS credentials
    • Communication mechanism in CloudFormation
      • Signaling
        • CreationPolicy - signaling
      • WaitCondition/WaitConditionHandle
  • Generating
    • Anyone with valid security credentials can create pre-signed url
      • It only works if the signer's permissions actually allow the upload (otherwise it would be a privilege escalation)
    • Java SDK supports creation
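
The query parameters in the example URL above come from Signature Version 4 query-string signing. The sketch below shows how such a URL could be assembled with the standard library only. The function name is hypothetical, it assumes a virtual-hosted `<bucket>.s3.amazonaws.com` host, a GET request, and `host` as the only signed header. In practice an SDK should do this.

```python
import hashlib
import hmac
from urllib.parse import quote


def _sign(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get(bucket, key, access_key, secret_key, region, amz_date, expires):
    """Return a SigV4 presigned GET URL; amz_date looks like 20130721T201207Z."""
    if not 1 <= expires <= 7 * 24 * 3600:
        raise ValueError("expiry must be between 1 second and 7 days")
    host = f"{bucket}.s3.amazonaws.com"  # assumption: virtual-hosted style
    path = "/" + quote(key, safe="/")
    scope = f"{amz_date[:8]}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: sorted by name, strictly percent-encoded.
    query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in sorted(params.items())
    )
    canonical_request = "\n".join(
        ["GET", path, query, f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    # Derive the signing key: date -> region -> service -> "aws4_request".
    signing_key = ("AWS4" + secret_key).encode("utf-8")
    for part in (amz_date[:8], region, "s3", "aws4_request"):
        signing_key = _sign(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"
```

Signing happens entirely offline, which is why S3 can only check the signer's permissions when the URL is actually used.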

CORS
  • Cross-origin access to mitigate JavaScript same-origin policy (SOP) restrictions
    • Preflight (OPTIONS) request to determine access rights
  • Configured on bucket
  • CORSRule
    • Allowed Origin (i.e. requestor domain)
    • Allowed Methods (GET, PUT, POST, ...)
    • Allowed Headers (in the preflight request which headers requestor may ask for)
    • Expose Headers (which headers can be read on the client side)
    • MaxAgeSeconds - how long preflight response can be cached
  • Use Cases
    • Auto-complete
    • Drag'n'Drop upload to S3
    • Upload progress
    • Update content directly from JS
    • Serving Web Fonts
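
A CORSRule with the elements above could be written as the dictionary below, in the shape that boto3's put_bucket_cors takes as CORSConfiguration. The origin and values are placeholders, and rule_allows is a hypothetical, simplified matcher that handles only exact origins and a lone "*", not S3's full wildcard matching.

```python
cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],  # placeholder origin
            "AllowedMethods": ["GET", "PUT", "POST"],
            "AllowedHeaders": ["*"],
            "ExposeHeaders": ["ETag"],
            "MaxAgeSeconds": 3000,
        }
    ]
}


def rule_allows(rule: dict, origin: str, method: str) -> bool:
    """Simplified check of whether a rule would match a preflight request."""
    origins = rule["AllowedOrigins"]
    origin_ok = "*" in origins or origin in origins
    return origin_ok and method in rule["AllowedMethods"]
```

S3 answers a preflight request from the first rule that matches, so rule order matters when several rules overlap.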

VPC Endpoint
  • Allows direct access to S3 from VPC
  • Use case
    • Bypass public Internet
  • Policies
    • S3 bucket policy - who can access me (aws:SourceVpc and aws:SourceVpce)
    • Endpoint policy - whom can I access (e.g. my own buckets only)
  • No need to change DNS name
    • Internally requests are routed differently
  • See also: VPC (Endpoint)
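
The "who can access me" side can be expressed as a bucket policy that denies everything not arriving through a given endpoint. This is a sketch: the helper name, bucket name and endpoint id are placeholders.

```python
import json


def vpce_only_policy(bucket: str, vpce_id: str) -> dict:
    """Bucket policy denying any request that does not come via vpce_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Placeholder identifiers, for illustration only.
policy_document = json.dumps(vpce_only_policy("examplebucket", "vpce-1a2b3c4d"), indent=2)
```

A deny this broad also blocks console access to the bucket, so it is worth testing on a scratch bucket first.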

Macie
  • AWS managed service to scan/categorize data in S3
  • See also: Macie

AWS S3

Model
  • Resources
    • Bucket - subresources
      • website 
      • versioning 
      • bucket policy
      • ACL
      • CORS
      • logging
      • event notifications
    • Object - subresources
      • ACL
      • restore (when using Glacier restore)

Limits
  • 100 buckets per account (soft limit)

Bucket Addressing
  • virtual hosting style
  • path style
    • https://s3-eu-west-1.amazonaws.com/BUCKET/FILE 
      • must specify correct region    
      • s3.amazonaws.com refers to us-east-1
      • when wrong region specified you get "301 Moved Permanently"
  • name globally unique
    • must be DNS compliant (except for us-east-1)
    • may contain "."
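
The two addressing styles differ only in where the bucket name goes. In this sketch the helper names are hypothetical and the endpoint is passed in by the caller, so no particular regional naming scheme is assumed.

```python
from urllib.parse import quote


def path_style_url(endpoint: str, bucket: str, key: str) -> str:
    """Bucket is the first path segment; endpoint must match the bucket's region."""
    return f"https://{endpoint}/{bucket}/{quote(key)}"


def virtual_hosted_url(endpoint: str, bucket: str, key: str) -> str:
    """Bucket is part of the host name, so it must be DNS compliant."""
    return f"https://{bucket}.{endpoint}/{quote(key)}"
```

For example, path_style_url("s3-eu-west-1.amazonaws.com", "BUCKET", "FILE") reproduces the path-style address shown above.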

Request Redirects
  • DNS is used to route to S3 nodes - temporary errors may occur
  • Must resend request to different endpoint
    • Do not reuse temporary redirect as it may fail in future
  • Typically happen when bucket just created/deleted (eventual consistency)
  • Permanent redirect - addressed bucket incorrectly (see bucket addressing)
  • Use "Expect: 100-continue" header for PUT requests
    • Server decides based on headers if it can accept the request
    • Avoid unnecessary work if the request is to be redirected anyway

Requester Pays
  • Owner pays for storage, requester pays for data transfer and requests
  • Requester includes x-amz-request-payer header
  • Requester must be authenticated by AWS
    • Not allowed for anonymous access
      • In particular not allowed for static website hosting

DevPay
  • "Tenant pays"
  • Charging for S3 based products
    • Once a month Amazon bills your customers
    • Deducts fixed transaction fee and gives you the rest
    • Amazon charges for my S3 costs + DevPay percentage fee
    • If customers do not pay - their access is cut-off
  • Customer data is isolated (cannot be accessed directly via S3 API)
  • DevPay Tokens
    • Product token - identifies the app
    • User token - identifies the user to be charged

Storage Classes
  • STANDARD - default (S)
    • Designed for hot/temporary data
  • STANDARD_IA - infrequently accessed (IA)
    • Suitable for long lived infrequently accessed data 
    • Storage cheaper than (S)
    • Requests more expensive than (S)
    • Minimum object size: 128 KB
    • Minimum 30 days storage charge (not suitable for temporary objects)
  • GLACIER (G)
    • Retrieval takes many hours (async job)
    • Cannot be used for initial upload
      • can be used as lifecycle target
      • alternatively upload directly via Glacier API (but S3 does not see it then)
    • Object must be restored before accessed
    • Object visible in S3
  • REDUCED_REDUNDANCY (RR)
    • Sustains loss of data in a single facility (STANDARD sustains concurrent loss in two)
    • Probably being phased-out (some regions do not support it)
    • 99.99% durability (1/10000 objects can be lost per year)
    • When object is lost AWS returns 405 Method Not Allowed
      • 410 (Gone) would be inappropriate as owner may decide to re-upload
  • Storage class can be changed
    • PUT object copy (request)
      • Destination is the same as source (x-amz-copy-source)
      • Indicate it is a copy by using directive: x-amz-metadata-directive: COPY
      • Set x-amz-storage-class: [STANDARD, STANDARD_IA, REDUCED_REDUNDANCY]

Restoring from Glacier
  • Specify number of days to keep the restored file
    • Possible to modify later (until expires)
    • Cheapest storage class used (e.g. RR)
  • Restored objects charged for both S3 and Glacier

Versioning
  • Multiple versions of file 
  • Enabled on bucket level (versioning-enabled)
    • Cannot be disabled
  • Can be suspended (versioning-suspended)
    • stops accruing new versions
    • delete can only remove the object with the (null) versionId; a delete marker is inserted either way
  • Each object version gets a unique versionId (string of up to 1024 bytes)
  • Listing
    • versions treated as separate objects
  • Deleting
    • S3 inserts DELETE MARKER
    • You can specify versionId to retrieve it
    • To permanently delete specify versionId

Cross-region replication (CRR)
  • Enabled on bucket level
    • Subset of objects can be replicated (prefix)
  • Asynchronous copy of all S3 objects from (S)ource bucket to (D)estination bucket
    • Can override storage class
    • Takes up to several hours
  • Ownership
    • By default ACL is copied
      • (S) account owns the (D) object
      • Can be overridden 
  • Storage Class
    • Can be specified
      • DR may want STANDARD_IA
  • Bi-directional replication
    • Master-master
  • Use cases
    • Compliance
    • Minimize latency
    • Operations (e.g. computing on the same set of resources)
    • DR
  • Requirements
    • Versioning must be enabled
    • (S) and (D) must be in different regions
    • Permissions need to be setup
  • Not replicable
    • Retroactive objects (i.e. created before configuration enabled)
    • Objects encrypted with SSE-C
      • SSE-KMS is replicable
        • KMS key is regional so you specify what to use in (D)
    • Bucket subresources
      • e.g. Lifecycle configuration
        • (S) and (D) may have different configurations
    • Non-customer actions (e.g. lifecycle inserts delete marker)
    • Replicas from other buckets  (i.e. not transitive)
  • Status
    • GET Includes "x-amz-replication-status" header in response
      • S: PENDING, COMPLETED, FAILED
      • D: REPLICA

Server Access Logging
  • Requires Log Delivery group ACL on the target bucket
  • Best effort
  • Alternative
    • CloudTrail: Data Events

Bit Torrent
  • Speeds up distribution of large and popular objects
  • .torrent file - bootstrap information for the file
    • Use s3-path?torrent
  • Need BitTorrent client
  • Every publicly readable (anonymous access) object is available for download
  • S3 acts as a "seeder"

Performance
  • No special action needed: < 100/s {PUT, LIST, DELETE } && < 300/s GET
  • Rapid increase: ask S3 Team to pre-partition (submit support case)
  • Key name dictates the S3 partition
    • Unlike DynamoDB it does not shard on key hash
      • Hence "List" is available
    • Objects are stored lexicographically across partitions
    • Recommendation for high-scale workloads
      • Avoid sequential keys (timestamps, ids)
        • They start with the same prefix and land on the same partition
        • Shard keys (e.g. 4-character MD5 hash prefix)
          • Listing becomes very expensive (scan)
        • Group objects by key names (animations, videos)
        • Reverse the key name for better distribution of initial characters
          • e.g. userId=1234 -> 4321
  • Transfer optimizations 
    • TCP window scaling (increase initial receive windows WSCALE)
    • TCP selective acknowledgment - speed-up recovery after large packet loss
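
The two key-naming tricks above (hash prefix and reversed id) can be sketched as follows. The function names and the "-" separator are illustrative choices, not something S3 prescribes.

```python
import hashlib


def hashed_key(key: str, digits: int = 4) -> str:
    """Prefix the key with the first hex digits of its MD5 hash."""
    prefix = hashlib.md5(key.encode("utf-8")).hexdigest()[:digits]
    return f"{prefix}-{key}"


def reversed_id(user_id: str) -> str:
    """Reverse an id so the fast-changing digits come first."""
    return user_id[::-1]
```

The hash prefix spreads sequential keys across partitions, at the price mentioned above: a listing by the original prefix now requires a scan over all hash prefixes.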

Data Consistency Model
  • Updates are atomic but on single key only
  • Latest PUT wins
  • Read-after-write consistency for NEW objects
  • Eventually consistent (may return stale data)
    • list newly written NEW object 
    • read-after-write for EXISTING objects (i.e. overwrite)
    • read-after-delete
    • list deleted object

Multi-part upload
  • Recommended for size > 100 MB
    • Max object size = 5 TB
  • Process
    • Split file locally
    • Initiate (S3 returns uploadId)
    • Upload each part
      • Specify part number (1-10000)
        • Determines order
        • May not be contiguous
      • Can be done in parallel
    • Complete the upload by listing every part (part number/ETag)
  • ETag is generally not an MD5 hash anymore (as with SSE-KMS encryption)
  • Orphaned uploads
    • Billed as normal objects
    • Not visible in S3 console
    • May be cleaned-up with lifecycle rule
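
The "split file locally" step amounts to computing byte ranges for each part. This is a sketch with a hypothetical helper name. Beyond the notes above, it relies on one additional S3 rule: every part except the last must be at least 5 MiB.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000


def plan_parts(object_size: int, part_size: int):
    """Return a list of (part_number, offset, length) tuples."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    count = -(-object_size // part_size)  # ceiling division on integers
    if count > MAX_PARTS:
        raise ValueError("too many parts; use a larger part size")
    return [
        (n + 1, n * part_size, min(part_size, object_size - n * part_size))
        for n in range(count)
    ]
```

Each tuple can be uploaded independently and in parallel; the part numbers are what determines the final order.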

Static Website Hosting
  • Supports GET and HEAD requests (no POST)
  • Public only content 
  • On error returns HTML (not XML like S3 REST API)
  • Supports redirects (object and bucket level)
  • Supports root documents (e.g. index.html)
  • Does not support SSL
    • Use CloudFront on top
  • Option to redirect all requests to different hostname

Event Notifications
  • Bucket configuration
  • Types
    • object created (Put, Post, Copy, CompleteMultipartUpload)
    • object removed (Delete, DeleteMarkerCreated)
    • object loss detected (RRSObjectLost)
  • Filters (optional)
    • Prefix
    • Suffix
  • Target
    • SNS, SQS, Lambda
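
A notification with a prefix and suffix filter could look like the dictionary below, in the shape boto3's put_bucket_notification_configuration takes. The queue ARN and filter values are placeholders, and key_matches is a hypothetical helper that mimics how the filter is applied to object keys.

```python
notification_configuration = {
    "QueueConfigurations": [
        {
            "QueueArn": "arn:aws:sqs:eu-west-1:111122223333:example-queue",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [
                        {"Name": "prefix", "Value": "images/"},
                        {"Name": "suffix", "Value": ".jpg"},
                    ]
                }
            },
        }
    ]
}


def key_matches(filter_rules, key: str) -> bool:
    """Return True when the key satisfies every prefix/suffix rule."""
    for rule in filter_rules:
        if rule["Name"] == "prefix" and not key.startswith(rule["Value"]):
            return False
        if rule["Name"] == "suffix" and not key.endswith(rule["Value"]):
            return False
    return True
```

With these placeholder rules only keys such as images/cat.jpg would trigger a message to the queue.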

Lifecycle configuration for versioning-enabled buckets
  • Acts as Recycle Bin
  • Action on Current Version
    • Transition to the Standard-Infrequent Access
      • X days after the object's creation date
    • Archive to the Glacier Storage Class
      • X days after the object's creation date
    • Expire
      • Expiring the current version inserts a delete marker, which becomes the new current version
  • Action on Previous Version
    • Transition to the Standard-Infrequent Access
      • X days after object becoming a previous version
    • Archive to the Glacier Storage Class
      • X days after object becoming a previous version
    • Permanently Delete
      • X days after object becoming a previous version
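
The actions above map onto one lifecycle rule roughly as follows, in the shape boto3's put_bucket_lifecycle_configuration takes. The rule id and all day counts are placeholder values chosen for illustration.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "recycle-bin",          # placeholder rule name
            "Filter": {"Prefix": ""},     # empty prefix = whole bucket
            "Status": "Enabled",
            # Actions on the current version (days since creation).
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
            # Actions on previous versions (days since becoming noncurrent).
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"},
                {"NoncurrentDays": 60, "StorageClass": "GLACIER"},
            ],
            "NoncurrentVersionExpiration": {"NoncurrentDays": 180},
        }
    ]
}
```

With these values a deleted or overwritten object stays recoverable for 180 days before it is permanently removed.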

Object Tagging
  • Way to organize data
    • more flexible than location (bucket/prefix) 
  • Max 10 tags per object
  • Can grant IAM policy permissions per Tag
  • Can be used in Lifecycle rules

S3 Inventory
  • Same set of metadata as LIST API
  • Format: CSV, ORC
  • Can be queried by Athena (S3 file)
  • Split between multiple files
    • Manifest files
      • manifest.json 
      • symlink.txt - Apache Hive compatible
  • Report: daily/weekly
  • Delivery to S3 bucket
  • Delivery Notification
    • On "checksum" written (last step)
    • SNS/SQS/Lambda
  • Eventually consistent
  • Pricing - cost half of LIST API

Storage Class Analysis
  • Daily report
  • Set of heuristics what is the appropriate Storage Class
    • Access
    • Retention
  • Provides lifecycle recommendation
  • Can be exported to BI tool
  • Additional pricing

Transfer Acceleration
  • Uses CloudFront edge network (in reverse)
  • Upload to closest "PoP"
  • Uses backbone network to deliver to target
  • Enable on bucket level (new endpoint)

IAM (STS)


Security Token Service (STS)
  • Gives out temporary credentials
  • Global service
    • All credentials are global
    • Possible to call regional endpoint for reduced latency
  • Use cases
    • Federation (e.g. Enterprise or Web)
    • Delegation (e.g. Cross-account access)
    • Roles for EC2 instances (no need for storing Access Keys)
    • AWS Service that manage resources on customer behalf (e.g. AutoScaling)

Temporary Credentials
  • Similar to Access Keys but short-lived (minutes to hours)
  • Not stored with the user but dynamically generated
  • No need to distribute them
  • Can grant access without AWS identity (basis for federated identity)
  • Have restrictions based on the API used to generate them
    • Credentials from AssumeRole cannot call GetFederationToken or GetSessionToken
      • Otherwise you could extend the expiration of your token this way

Revoking Temporary Credentials
  • Specify policy that denies access based on
    • creator name (e.g. compromised account)
    • issue time before a certain instant (aws:TokenIssueTime)
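
The time-based variant can be written as a deny policy on the aws:TokenIssueTime condition key. This is a sketch: the helper name is hypothetical, and the cutoff is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone


def revoke_sessions_before(cutoff: datetime) -> dict:
    """Policy denying everything to credentials issued before the cutoff."""
    issued_before = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {
                    "DateLessThan": {"aws:TokenIssueTime": issued_before}
                },
            }
        ],
    }
```

Attached to a role, such a policy leaves the role usable: sessions issued after the cutoff are not affected by the deny.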

Session
  • Temporary access to AWS
  • Generated by STS 
  • Elements
    • Access Key
    • Secret Access Key
    • Session Token
      • Must be submitted to every API call along with Access Key and Secret Access Key
    • Expiration (Min/Max/Default)
      • GetFederationToken (15m/36h/12h)
      • AssumeRole* (15m/1h/1h)

Policy scoping
  • Allows restricting permissions (logical: role permissions && policy)
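
The "&&" above is a set intersection: a scoping policy can only narrow what the role already allows. A toy model follows, treating permissions as plain action names, which real IAM policy evaluation is not limited to.

```python
from typing import Optional, Set


def effective_permissions(
    role_permissions: Set[str],
    scoping_policy: Optional[Set[str]] = None,
) -> Set[str]:
    """Return what a session may do: role permissions AND passed policy."""
    if scoping_policy is None:
        # No policy passed: the session gets the full role permissions.
        return set(role_permissions)
    return role_permissions & scoping_policy
```

Anything named in the passed policy but missing from the role is silently dropped, so scoping can never add permissions.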

GetFederationToken
  • Works within AWS account
  • Up to 36 hours (much longer than others)
  • No MFA
  • Requires AWS credentials
    • Desired policy is passed as an argument
    • Caller must have the union of all permissions that it wants to grant
  • Policy
    • There are no "role permissions" here so you only get what you specify
    • If no policy specified authenticated user may still get access based on resource policy, e.g.
      • Temporary credentials created for "Susan" (federation token)
      • S3 bucket access for "arn:aws:sts::111122223333:federated-user/Susan"
  • Use cases 
    • server side proxies (must safely store long term credentials)

AssumeRole
  • Works cross-account
  • 15 minutes - 1 hour
  • MFA supported
  • Supports policy scoping
  • Requires AWS credentials
  • Use cases
    • Grant access to resources in different AWS account
    • Enforce MFA authentication for privilege escalations

AssumeRoleWithSAML (SAML 2.0)
  • Works cross-account
  • 15 minutes - 1 hour
  • Does not require AWS credentials (SAML response is cryptographically signed)
  • Must configure SAML Identity Provider first
  • Supports policy scoping
  • RoleSessionName is visible in CloudTrail so use correct value for traceability
  • Use cases
    • Enterprise organizations who have software that produces SAML assertions
      • Active Directory -> Active Directory Federation Services
    • Used for corporate Single Sign On (e.g. Isengard)

AssumeRoleWithWebIdentity
  • Works cross-account
  • 15 minutes - 1 hour
  • Does not require AWS credentials
  • Obsoleted by Cognito for mobile scenarios
  • Supports Policy Scoping
    • Request is not signed so make sure no intermediary can alter the policy
  • Use cases
    • Mobile and web users who do not have IAM users

GetSessionToken
  • Gives temporary credentials for IAM user
  • 15 minutes - 36 hours
  • Requires AWS Credentials
  • Use cases
    • enforce MFA for avoiding privilege escalation
    • untrusted environments (web, mobile)

Single Sign On to the console
  • Temporary credentials can be used for sign-in
  • Endpoint: https://signin.aws.amazon.com/federation
    • Pass temporary credentials
    • It returns a token that can be used to sign-in directly to AWS console
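
The two-step exchange with the federation endpoint can be sketched as URL construction only, with no HTTP call made here. The function names and the default destination are illustrative; the query parameters follow the AWS console federation flow (getSigninToken, then login).

```python
import json
from urllib.parse import urlencode

FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"


def signin_token_url(access_key_id: str, secret_access_key: str, session_token: str) -> str:
    """Step 1: URL that exchanges temporary credentials for a sign-in token."""
    session = json.dumps({
        "sessionId": access_key_id,
        "sessionKey": secret_access_key,
        "sessionToken": session_token,
    })
    return FEDERATION_ENDPOINT + "?" + urlencode(
        {"Action": "getSigninToken", "Session": session}
    )


def console_login_url(signin_token: str, issuer: str,
                      destination: str = "https://console.aws.amazon.com/") -> str:
    """Step 2: URL that signs the browser in using the returned token."""
    return FEDERATION_ENDPOINT + "?" + urlencode({
        "Action": "login",
        "Issuer": issuer,
        "Destination": destination,
        "SigninToken": signin_token,
    })
```

The response to the first URL is a small JSON document whose SigninToken value is passed to console_login_url.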

References
  • https://danielgrzelak.com/backdooring-an-aws-account-da007d36f8f9