Overview
- Delivers streaming data directly to a destination (no need to write a consumer application)
Model
- Delivery Stream - main entity
- No need to specify shards or partition keys (unlike Kinesis Data Streams)
- Data record: up to 1,000 KB
- At-least-once semantics - duplicates possible (like SQS)
- Retention: up to 24h of retries if the destination is unavailable
- Retries are automatic
Source
- Direct PUT
- API
- AWS IoT
- CloudWatch Logs (Subscription)
- CloudWatch Events (Rules)
- Amazon Kinesis Agent
- Monitors files and sends records to Kinesis Firehose
- Handles file rotation, checkpointing
- Similar to CloudWatch Agent (Logs)
- Also works with Kinesis streams
- Kinesis Data Streams
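A Direct PUT source is just an API call: records are sent as raw bytes. A minimal sketch (the stream name is a placeholder; Firehose concatenates records as-is, so a trailing newline keeps the resulting S3 objects line-parseable):

```python
import json

def to_firehose_record(obj):
    """Serialize a dict as one newline-delimited JSON Firehose record."""
    return {"Data": (json.dumps(obj) + "\n").encode("utf-8")}

records = [to_firehose_record({"event": "click", "user": i}) for i in range(3)]

# With boto3 (requires AWS credentials; stream name is hypothetical):
# import boto3
# firehose = boto3.client("firehose")
# firehose.put_record_batch(DeliveryStreamName="my-stream", Records=records)
```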
Destination
- S3 bucket
- records are concatenated into larger objects
- compression: gzip, zip, snappy
- needs IAM role
- Supports encryption (SSE-KMS)
- Redshift table
- uses intermediate S3 bucket
- issues COPY command continuously
- COPY errors are not retried per record
- skipped objects are listed in an error manifest file (S3) for manual reload
- Compression: gzip
- Amazon Elasticsearch Service
- Splunk
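The S3 destination settings above (bucket, IAM role, compression, KMS encryption) map onto the configuration passed to `create_delivery_stream`. A sketch with placeholder ARNs:

```python
# All ARNs below are hypothetical placeholders.
s3_destination = {
    "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
    "BucketARN": "arn:aws:s3:::example-bucket",
    "CompressionFormat": "GZIP",  # also: ZIP, Snappy, UNCOMPRESSED
    "EncryptionConfiguration": {
        "KMSEncryptionConfig": {
            "AWSKMSKeyARN": "arn:aws:kms:eu-west-1:123456789012:key/example"
        }
    },
}

# With boto3 (requires AWS credentials):
# import boto3
# firehose = boto3.client("firehose")
# firehose.create_delivery_stream(
#     DeliveryStreamName="my-stream",
#     S3DestinationConfiguration=s3_destination)
```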
Data transformation
- Invoke Lambda function on every record
- Source record backup possible
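The transformation Lambda receives a batch of base64-encoded records and must echo each `recordId` back with a result of `Ok`, `Dropped`, or `ProcessingFailed`. A minimal sketch (the uppercase transform is just an example):

```python
import base64

def lambda_handler(event, context):
    """Transform each Firehose record; echo recordId with a result status."""
    output = []
    for record in event["records"]:
        payload = base64.b64decode(record["data"]).decode("utf-8")
        transformed = payload.upper()  # example transformation (assumption)
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(transformed.encode("utf-8")).decode("utf-8"),
        })
    return {"records": output}
```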
Buffer
- Size (1MB-128MB)
- Time (1-15 minutes)
- Buffer size may be raised automatically if delivery falls behind
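Delivery triggers on whichever buffering threshold is hit first. A small sketch validating hints against the ranges noted above (5 MB / 300 s are the common S3 defaults):

```python
# Buffering hints as passed inside an S3 destination configuration.
buffering_hints = {"SizeInMBs": 5, "IntervalInSeconds": 300}

def valid_hints(hints):
    """Check hints against the ranges above: 1-128 MB, 1-15 minutes (60-900 s)."""
    return (1 <= hints["SizeInMBs"] <= 128
            and 60 <= hints["IntervalInSeconds"] <= 900)
```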