Product comparison matrix

Honest side-by-side — StoatFlow vs Kafka Streams vs Apache Flink (self-hosted) vs managed Flink platforms.

One honest side-by-side so evaluators don't have to assemble it themselves. Strengths of each are acknowledged in the rows where they apply; trade-offs are stated plainly.

StoatFlow
Kafka Streams
Managed Flink (varies significantly by provider)
Deployment model
JVM app → single instance → static group membership
JVM app → n service instances → Kafka consumer group
JAR/SQL/Python job → managed service → provider-specific compute/capacity units, autoscaling, and runtime constraints
Scaling model
Vertical — CPU, memory, n lanes (virtual threads)
Horizontal — scale out n replicas; stream threads t
(same)
Rebalancing
None — single instance
Consumer-group rebalances and state restoration are operational concerns; mitigated by static membership, cooperative rebalancing, and standby replicas
Provider may automate or hide rescaling/restart mechanics; workload recovery still follows Flink's checkpoint/savepoint model.
Repartitioning
Local, in-process redistribution via lane channels → no Kafka repartition topics → no broker round-trip, no serialization, no network shuffle
Kafka repartition topics → serialization → network IO, load on brokers
(same)
Exactly-once
Chandy-Lamport with barriers + Kafka transactions; single-coordinator
Kafka transactions; stream task scoped
Distributed checkpoints (managed) — config options depend on provider & service
State model
Global — any key accessible from any lane — no replicated copies across application instances
Partition-scoped; global stores (limited scope) with local copies per replica
(same)
State access
(1) abstracted by DSL — (2) flexible direct global store access via interactive queries
(1) abstracted by DSL — (2) flexible direct partition-bounded store access via interactive queries
(same)
High-availability model
Single active instance with fast restart — benchmarked 400ms (stateless) / 800–1200ms (stateful) — no cross-instance rebalance or state shuffle in the normal failure path — hot-standby for blue-green failover on roadmap
HA patterns via static membership, cooperative rebalancing, and standby replicas; effective when configured well, but requires expert knowledge to set up and operate
Managed control plane and recovery infrastructure; workload recovery still follows Flink's checkpoint/savepoint model, with provider-specific autoscaling and failover behaviour.
DSL
Kafka Streams DSL-compatible + StoatFlow extensions
Kafka Streams DSL (native)
(same)
Low-latency
Designed for E2E sub-second Kafka-native workloads
Sub-second E2E latency per sub-topology — every (repartition) Kafka round-trip adds latency
(same)
Throughput ceiling
Benchmarked up to 500 MB/sec on a single machine (8-core VM)
Scales with instance count — repartition topics add Kafka I/O, which factors into the throughput envelope (distribution tax)
(same)
External I/O (REST, DB, AI)
Virtual-thread-friendly blocking I/O — high concurrency without dedicating platform threads per request — configurable parallelism
No native async-I/O DSL operator — blocking calls occupy stream threads — parallelism limited by Kafka partitions
(same)
Migration from Kafka Streams
Dependency swap + config cleanup; state is rebuilt by reprocessing the source topics (assumes adequate retention)
— (native)
Full rewrite
License
Commercial — seats + machines
Apache 2.0 (free)
Commercial — pricing model varies by vendor (usage-based for SaaS offerings like AWS KDA / Confluent Flink / Ververica Cloud; subscription for self-managed Ververica Platform)
Maturity
v1.0, MVP — early production pilots
Production since 2016 — battle-tested at scale
Ververica Platform GA since ~2018; AWS KDA for Apache Flink GA Nov 2018; Confluent Flink GA 2024
SQL interface
None — Kotlin/Java DSL only
None — DSL only (ksqlDB is a separate, Confluent-licensed product and is not considered here)
Same as Flink; often the primary surface
Not best fit
Workloads needing open-ended horizontal scale or very large state beyond one machine
Workloads where rebalance/state/repartition overhead dominates
Cost-sensitive mid-size Kafka-only workloads

(same) = identical to the column to the left.
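
The Kafka Streams mitigations named in the Rebalancing and High-availability rows are mostly a matter of configuration. As a rough sketch, assuming a recent Kafka Streams version, the settings might look like the following. The keys are standard Kafka configuration names; the class name, the instance id, and the values are illustrative assumptions, not recommendations. Cooperative rebalancing has no entry because recent Kafka Streams versions apply it by default.

```java
import java.util.Properties;

public class KafkaStreamsHaConfigSketch {

    // Collects rebalance-related settings for one Kafka Streams instance.
    // instanceId is a hypothetical stable name, for example a StatefulSet pod name.
    static Properties haProperties(String instanceId) {
        Properties props = new Properties();

        // Static membership: a restarted instance rejoins under the same identity,
        // so a restart within the session timeout need not trigger a rebalance.
        props.put("group.instance.id", instanceId);
        props.put("session.timeout.ms", "60000"); // illustrative value only

        // Standby replicas: keep a warm copy of each state store on another instance
        // to shorten state restoration after a failure.
        props.put("num.standby.replicas", "1");

        // Exactly-once processing backed by Kafka transactions.
        props.put("processing.guarantee", "exactly_once_v2");

        // application.id and bootstrap.servers are required in a real deployment
        // and are left out of this sketch.
        return props;
    }

    public static void main(String[] args) {
        haProperties("orders-app-0")
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each of these settings has to be sized against restart times and state volume, which is the operational expertise the High-availability row refers to.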

When StoatFlow vs. each

vs. Kafka Streams

Kafka Streams is battle-tested and has one of the most ergonomic topology APIs on the JVM. StoatFlow keeps a Kafka Streams DSL-compatible programming model, but trades Kafka Streams' horizontal scale-out architecture for a simpler single-instance runtime. That removes whole categories of distributed overhead: inter-instance rebalancing, state migration, and repartition-topic round-trips. For Kafka-native workloads that fit comfortably on one modern machine, this can translate into lower compute, memory, network, and storage usage — and often lower latency, higher throughput on the same hardware, and materially lower operating cost. The trade-off is explicit: StoatFlow gives up open-ended horizontal scale in favour of simpler, more resource-efficient vertical scale.

vs. Apache Flink (self-hosted)

Flink is the right tool for non-Kafka sources and sinks, analytics, ML, unified streaming + batch, or massive-scale workloads (Netflix, Uber, Stripe, Alibaba). StoatFlow targets the wide middle of Kafka-native stream processing applications — workloads Flink can also handle, but where the JobManager/TaskManager runtime, checkpoint tuning, and platform expertise can become a significant part of the cost structure.

vs. Managed Flink

Managed Flink removes much of the operational burden, but you still pay for a managed distributed data-processing platform. That cost is justified when you need Flink's scale, elasticity, SQL, or connector ecosystem. It can be disproportionate for Kafka-native workloads that would otherwise fit comfortably on one machine. StoatFlow targets that middle ground: simpler deployment and a service-like cost structure, with explicit single-machine limits.

Feature parity

Side-by-side DSL coverage — Kafka Streams baseline, Apache Flink for reference, StoatFlow's surface against both.

StoatFlow
Kafka Streams
Core transformations
map, flatMap, filter, peek
transform / process
Processor API / ProcessFunction
Branching / routing
split, branch, merge
union
side outputs
on roadmap
broadcast
Keying / partitioning
selectKey / keyBy, groupBy
repartition
shuffle, rescale, rebalance
global / GlobalKTable
Aggregations
count, reduce, aggregate
cogroup
Windowing
tumbling, hopping/sliding, session windows
custom windows
grace period / allowed lateness
suppression
triggers, evictors
Time
event time, processing time
stream time
ingestion time
watermarks
idleness, watermark alignment
Timers / callbacks
punctuators (stream-time / wall-clock-time)
timers (event-time / processing-time)
scheduled sources
Joins
stream-stream join, stream-table join, table-table join
inner join, left join, outer join
window join
foreign-key join
interval join, temporal join
semi join, anti join
Tables / changelogs
KTable / dynamic table
changelog stream, upsert
GlobalKTable
retractions
State
local state, keyed state
key-value state, window state, session state
custom state store
operator state, broadcast state
state TTL
atomic store operations (compute, merge)
KeyLockManager
State access
interactive queries
local state queries
containsKey, containsSession
I/O
Kafka source, Kafka sink
custom source, custom sink
file source/sink, JDBC source/sink
async I/O
Fault tolerance
restore
changelog topics / changelog state backend
standby replicas
on roadmap
checkpoints, savepoints
Consistency
at-least-once, exactly-once
transactions
two-phase commit
Higher-level APIs
DSL
Processor API / DataStream API
Table API, SQL
CEP
Execution modes
streaming
bounded streams, batch
unified batch/streaming
Testing
TopologyTestDriver
MiniCluster, OperatorTestHarness

Measured single-machine limit

On a basic 8-core Hetzner dedicated VM, StoatFlow saturates CPU or network at ~200–300 MB/s uncompressed throughput. In events, that is ~124k/sec on a 1 KB stateless transform and up to ~2.1M/sec output on a word-count-style aggregation. That is the practical ceiling to plan against when sizing a single instance; sustained workloads above it are the ones where horizontal scale-out is the right tool.

Two reference workloads anchor that envelope.

  • stateless-simple: 1 KB strings → map(uppercase()) → sink
    124k ops/s in -> 124k ops/s out
  • word-count: phrases dictionary → split + groupBy + count → sink
    52k ops/s in -> 2.1M ops/s out
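
To relate the event rates to the MB/s figure, the arithmetic can be sketched as below. It rests on two assumptions that the benchmark description does not state: that 1 KB means 1,000 bytes, and that the throughput figure counts bytes read plus bytes written. The class and method names are hypothetical.

```java
import java.util.Locale;

public class ThroughputEnvelope {

    // Converts an event rate and a payload size into MB/s (assumes 1 MB = 1,000,000 bytes).
    static double mbPerSec(long eventsPerSec, long bytesPerEvent) {
        return eventsPerSec * (double) bytesPerEvent / 1_000_000.0;
    }

    // Output records produced per input record.
    static double fanOut(long inPerSec, long outPerSec) {
        return (double) outPerSec / inPerSec;
    }

    public static void main(String[] args) {
        // stateless-simple: 124k events/s of roughly 1 KB each, read once and written once.
        double inbound = mbPerSec(124_000, 1_000);
        double combined = 2 * inbound; // assumption: in and out traffic are both counted
        System.out.println(String.format(Locale.ROOT,
                "stateless-simple: %.0f MB/s in, %.0f MB/s in+out", inbound, combined));

        // word-count: 52k phrases/s in, 2.1M count updates/s out.
        System.out.println(String.format(Locale.ROOT,
                "word-count fan-out: %.1f records out per record in",
                fanOut(52_000, 2_100_000)));
    }
}
```

Under those assumptions the stateless workload moves about 124 MB/s in each direction, roughly 248 MB/s combined, which falls inside the ~200–300 MB/s range. The word-count workload emits about 40 output records per input phrase, which is why its output rate is so much higher than its input rate.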

Last measured 2026-05-10. More benchmarks — on stronger hardware — coming soon.