Data can be static or dynamic. Static data is normally stored in a database management system (DBMS), a file system, or an object store, and you retrieve it from these systems as needed. Dynamic data, also called streaming data, is generated and moved continuously. To process it you need a Data Stream Management System (DSMS) [1], which lets you handle continuous data streams. This blog discusses and compares two of the most popular data stream management systems: AWS Kinesis [2] and Apache Kafka.

| Feature | AWS Kinesis | Apache Kafka |
| --- | --- | --- |
| Platform Support | Cloud Only | On-Premise & Cloud |
| Data Replay | Yes (Kinesis VCR) | N/A |
| Availability | Managed by AWS (Auto Replication) | User to Configure (MirrorMaker) |
| Scalability | Limited (AWS Managed Shards) | Highly Scalable (User Managed Partitions/Brokers) |
| Connectivity | Yes (Kinesis REST APIs) | Yes (Kafka Connect, Kafka Streams, Kafka REST Proxy) |
| Interoperability | Limited | Extensive (Apache Samza, Apache Spark Streaming, Storm) |
| Data Retention | 7 days | No Limit |
| Ease of Setup | Easy | Difficult (Confluent Cloud) |
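
The shards and partitions in the scalability comparison above are the unit of parallelism in both systems: each record carries a key, and the key decides which shard or partition receives it. The sketch below is a simplified illustration of that idea, not the actual algorithm of either product. It hashes the key with MD5 and takes the result modulo the partition count. Kinesis instead maps the MD5 hash of the partition key onto per-shard hash-key ranges, and Kafka's default partitioner uses a murmur2 hash. The function names `assign_partition` and `distribute` and the `device-N` keys are hypothetical.

```python
import hashlib
from collections import Counter


def assign_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition (or shard) index.

    Simplified assumption: MD5 of the key, modulo the partition count.
    Neither Kinesis nor Kafka routes records exactly this way.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_partitions


def distribute(keys, num_partitions):
    """Count how many records land on each partition index."""
    counts = Counter(assign_partition(k, num_partitions) for k in keys)
    return [counts.get(i, 0) for i in range(num_partitions)]


if __name__ == "__main__":
    # Hypothetical record keys, e.g. one per device sending events.
    keys = [f"device-{i}" for i in range(1000)]
    print("2 partitions:", distribute(keys, 2))
    print("8 partitions:", distribute(keys, 8))
```

Because the same key always maps to the same index, records for one key stay in order on one shard or partition, while adding shards or partitions spreads the keys across more parallel consumers.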