Presto
- Developed at Facebook, where Hive was too slow for interactive queries
- Does not use MapReduce
- Does not store intermediate results on disk (in-memory only)
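- Does not "spill" to disk by default; a query that exceeds its memory limit fails (later releases added optional spilling)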
- Pipelined execution (all stages run concurrently)
- Example: as soon as it discovers a file it starts processing it, even though the other files have not been touched yet; this is great for "LIMIT 10" queries
- It can stream results
- Implemented in Java
- Generates bytecode directly
- Manages its own memory (to avoid garbage collection pauses)
- Flat memory (data held in flat arrays rather than object graphs)
- Most of the execution time is spent reading, parsing and deserializing data into the internal memory format
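
The pipelining point above is easier to see in a toy model. This is a minimal sketch in Python generators, not Presto code: the names `scan`, `filter_rows`, `limit`, the file names and the row values are all made up for illustration. It shows how a LIMIT can finish a query before the remaining files are ever opened.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def scan(files: Dict[str, List[int]], opened: List[str]) -> Iterator[int]:
    """Yield rows file by file; a file is recorded only when it is actually opened."""
    for name, rows in files.items():
        opened.append(name)
        yield from rows


def filter_rows(rows: Iterable[int], threshold: int) -> Iterator[int]:
    """Pass through rows greater than the threshold, one at a time."""
    for row in rows:
        if row > threshold:
            yield row


def limit(rows: Iterable[int], n: int) -> Iterator[int]:
    """Stop pulling from upstream once n rows have been produced."""
    return islice(rows, n)


# Hypothetical input: three "files" of integer rows.
files = {
    "part-0": [5, 12, 7, 30],
    "part-1": [41, 2, 19],
    "part-2": [8, 50],
}

opened: List[str] = []

# Roughly: SELECT x FROM t WHERE x > 6 LIMIT 2
result = list(limit(filter_rows(scan(files, opened), 6), 2))

print(result)  # [12, 7]
print(opened)  # ['part-0']
```

Every stage pulls rows from the one before it, so nothing is materialized between stages, and the limit stops the scan after the first file. A real engine runs the stages in parallel across workers, which this single-threaded sketch does not attempt to model.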
AWS
- Used by Amazon Athena
- Can run on an EMR cluster (bypassing Hadoop)