Notes on AWS, Big Data, Machine Learning and Leadership: AWS ElastiCache (Redis)

Saturday, 17 March 2018

AWS ElastiCache (Redis)

Shard (node group)

Group of 1-6 nodes
- 1 master (read/write)
- 0-5 slaves (read only)
  - Async replication (replication lag)

Multi AZ failover

At least 1 read-replica must exist
Data loss possible (async replication)
Process
- ElastiCache (EC) detects that the primary is down
- Selects read replica (least replication lag) and promotes
- Primary endpoint DNS updated (no application change for writes)
- A new replica is created to replace the failed primary (only once its AZ is back up)
- You need to update the read endpoint yourself (reads connect to replica nodes directly)
- Takes a few minutes
- Cannot manually promote when Multi-AZ enabled
- AOF must be disabled
- A customer-initiated reboot does not trigger failover (but other reboots do)
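
The promotion step above can be illustrated with a toy model. This is only a sketch of the selection rule described in these notes (promote the replica with the least replication lag), not ElastiCache's actual implementation; the `Node` and `Shard` classes, the node names, and the lag values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    az: str
    replication_lag_s: float = 0.0  # seconds behind the primary (made-up metric)


@dataclass
class Shard:
    primary: Node
    replicas: List[Node] = field(default_factory=list)

    def fail_over(self) -> Node:
        """Promote the replica with the least replication lag."""
        if not self.replicas:
            raise RuntimeError("Multi-AZ failover needs at least one read replica")
        candidate = min(self.replicas, key=lambda n: n.replication_lag_s)
        self.replicas.remove(candidate)
        candidate.replication_lag_s = 0.0
        # In the real service the primary endpoint DNS is repointed here,
        # so writers keep using the same endpoint name.
        self.primary = candidate
        return candidate


shard = Shard(
    primary=Node("node-001", "az-a"),
    replicas=[Node("node-002", "az-b", 0.8), Node("node-003", "az-c", 0.2)],
)
promoted = shard.fail_over()
print(promoted.name)  # node-003 (lowest lag)
```

Any writes the old primary accepted but had not yet replicated to `node-003` are lost, which is the data-loss caveat noted above.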

Redis (Cluster mode disabled)

1 shard

Redis (Cluster mode enabled)

1-15 shards
Nodes know about each other's existence
- Gossip binary protocol on dedicated peer-to-peer port
Assigns keys to 16384 "hash slots"
- Not consistent hashing but a similar concept
- Nodes are aware of assignment
  - They do not proxy requests but can redirect
- Assignment is customizable
Online cluster resizing
- Dynamically add/remove shards
- EC manages slot ownership migration
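
To make the hash-slot assignment concrete, here is a minimal sketch of how a client can compute a key's slot, following the Redis Cluster specification (CRC16/XMODEM of the key, modulo 16384, with `{...}` hash tags). The function names are my own. The `MOVED` string is the redirect a node returns when it does not own the slot; the example address is made up.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


def parse_moved(error: str):
    """Parse a 'MOVED <slot> <host>:<port>' redirect into its parts."""
    kind, slot, address = error.split()
    if kind != "MOVED":
        raise ValueError(f"not a MOVED redirect: {error!r}")
    host, port = address.rsplit(":", 1)
    return int(slot), host, int(port)


# Keys sharing a hash tag land in the same slot, hence on the same shard.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))  # True
print(parse_moved("MOVED 3999 127.0.0.1:6381"))  # (3999, '127.0.0.1', 6381)
```

A cluster-aware client caches the slot-to-node map and refreshes it when it sees a `MOVED` reply, which is why nodes can redirect instead of proxying.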

Backup & Restore

Snapshot of entire cluster
- Entire cluster is restored
Contains cluster metadata + data
Stored in S3
Snapshot method depends on the memory available
- Enough memory available: a child process is forked
- Not enough memory: a cooperative background process is used
Can be automated (just like in RDS)
Cluster may be seeded from external .rdb file
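
As a rough sketch of the automation point above, automatic backups can be switched on by setting a retention period and a daily window on the replication group. The helper names, the group ID, and the values are hypothetical; `SnapshotRetentionLimit` and `SnapshotWindow` are the parameter names I believe boto3's `modify_replication_group` accepts, so check them against the current API reference before relying on this.

```python
def automatic_backup_params(replication_group_id: str,
                            retention_days: int,
                            window_utc: str) -> dict:
    """Build the request for enabling daily automatic snapshots."""
    if retention_days < 1:
        raise ValueError("a retention of 0 days disables automatic backups")
    return {
        "ReplicationGroupId": replication_group_id,
        "SnapshotRetentionLimit": retention_days,  # days to keep each snapshot
        "SnapshotWindow": window_utc,              # daily window, e.g. "03:00-04:00" (UTC)
        "ApplyImmediately": True,
    }


def enable_automatic_backups(params: dict) -> None:
    """Send the request; needs boto3 and AWS credentials, so not called here."""
    import boto3  # imported lazily so the sketch runs without the SDK installed

    client = boto3.client("elasticache")
    client.modify_replication_group(**params)


params = automatic_backup_params("my-redis-group", 7, "03:00-04:00")
print(params["SnapshotRetentionLimit"])  # 7
```

Seeding a new cluster from an external .rdb file goes through the create call instead, by pointing it at the file's S3 location.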

At-rest encryption

Encrypts data on disk during sync and backup operations
Has some performance impact

References

https://www.allthingsdistributed.com/2017/11/scaling-amazon-elasticache.html
