[{"data":1,"prerenderedAt":649},["ShallowReactive",2],{"navigation":3,"pages-product-features":38},[4],{"title":5,"path":6,"stem":7,"children":8,"icon":37},"Getting Started","\u002Fdocs\u002Fgetting-started","1.docs\u002F1.getting-started\u002F1.index",[9,12,17,22,27,32],{"title":10,"path":6,"stem":7,"icon":11},"Getting started","i-lucide-flag",{"title":13,"path":14,"stem":15,"icon":16},"Installation","\u002Fdocs\u002Fgetting-started\u002Finstallation","1.docs\u002F1.getting-started\u002F2.installation","i-lucide-download",{"title":18,"path":19,"stem":20,"icon":21},"License configuration","\u002Fdocs\u002Fgetting-started\u002Flicense-configuration","1.docs\u002F1.getting-started\u002F3.license-configuration","i-lucide-key-round",{"title":23,"path":24,"stem":25,"icon":26},"Your first app","\u002Fdocs\u002Fgetting-started\u002Ffirst-app","1.docs\u002F1.getting-started\u002F4.first-app","i-lucide-square-play",{"title":28,"path":29,"stem":30,"icon":31},"Architecture","\u002Fdocs\u002Fgetting-started\u002Farchitecture","1.docs\u002F1.getting-started\u002F5.architecture","i-lucide-layers",{"title":33,"path":34,"stem":35,"icon":36},"Migrating from Kafka Streams","\u002Fdocs\u002Fgetting-started\u002Fmigration","1.docs\u002F1.getting-started\u002F6.migration","i-lucide-shuffle",false,{"id":39,"title":40,"body":41,"booktabsTables":37,"compactToc":641,"description":642,"extension":643,"meta":644,"monoTables":37,"navigation":641,"path":645,"seo":646,"stem":647,"wide":37,"__hash__":648},"pages\u002F6.pages\u002Fproduct\u002Ffeatures.md","Features",{"type":42,"value":43,"toc":626},"minimark",[44,69,74,77,93,179,185,206,210,213,255,261,265,270,274,281,323,331,335,338,352,361,365,368,423,428,439,442,475,478,514,519,523,539,543,550,581,587,591],[45,46,47,48,52,53,58,59,63,64,68],"p",{},"StoatFlow is a Java library for building stream-processing applications on Apache Kafka. 
This page covers what it does — the DSL it implements, the runtime that executes it, the state-store options, the operational tooling it ships with, and the production-grade scaffolding for testing, building, and deploying. For ",[49,50,51],"em",{},"why"," StoatFlow is built this way see ",[54,55,57],"a",{"href":56},"\u002Fproduct\u002Fmotivation","Motivation","; for how it stacks up against Kafka Streams and Flink see the ",[54,60,62],{"href":61},"\u002Fproduct\u002Fcomparison-matrix","Comparison matrix","; for the benchmark numbers see ",[54,65,67],{"href":66},"\u002Fproduct\u002Fbenchmarks","Benchmarks",".",[70,71,73],"h2",{"id":72},"stream-processing-with-the-kafka-streams-dsl","Stream processing with the Kafka Streams DSL",[45,75,76],{},"Stream processing means reading continuous event streams from Kafka, transforming them on the fly — filtering, joining, aggregating, windowing — and writing the results to Kafka or to downstream systems. No batch windows, no nightly jobs; events flow through the pipeline and produce outputs continuously, typically with sub-second end-to-end latency.",[45,78,79,80,84,85,88,89,92],{},"The Kafka Streams DSL is the canonical programming model for this on the JVM. You declare a ",[81,82,83],"strong",{},"topology"," — a graph of processing steps — using two main abstractions: a ",[81,86,87],{},"KStream"," represents an unbounded sequence of events, and a ",[81,90,91],{},"KTable"," represents a continuously-updated table whose latest value for each key is the materialised state of an event stream. 
From these two primitives you compose every common stream-processing pattern:",[94,95,96,123,141,147,153,159,173],"ul",{},[97,98,99,102,103,107,108,107,111,107,114,107,117,107,120,68],"li",{},[81,100,101],{},"Stateless transforms"," — ",[104,105,106],"code",{},"map",", ",[104,109,110],{},"filter",[104,112,113],{},"flatMap",[104,115,116],{},"selectKey",[104,118,119],{},"branch",[104,121,122],{},"merge",[97,124,125,102,128,107,131,107,134,107,137,140],{},[81,126,127],{},"Stateful aggregations",[104,129,130],{},"count",[104,132,133],{},"reduce",[104,135,136],{},"aggregate",[104,138,139],{},"cogroup"," (combining multiple grouped streams into a joint aggregation).",[97,142,143,146],{},[81,144,145],{},"Joins"," — stream-stream, stream-table, table-table, and foreign-key joins (KIP-1104, Kafka 4.0+).",[97,148,149,152],{},[81,150,151],{},"Windowing"," — tumbling, hopping, sliding, and session windows for time-bounded aggregations.",[97,154,155,158],{},[81,156,157],{},"Event-time semantics"," — process records by when they happened, not when they arrived, using watermarks to handle out-of-order data.",[97,160,161,164,165,168,169,172],{},[81,162,163],{},"Processor API"," — drop down to a lower level when the DSL doesn't cover what you need; write a ",[104,166,167],{},"Processor"," or ",[104,170,171],{},"FixedKeyProcessor"," and wire it into the topology.",[97,174,175,178],{},[81,176,177],{},"Serdes"," — pluggable serialization\u002Fdeserialization for keys and values (String, Long, Avro with Schema Registry, Protobuf, JSON, custom).",[45,180,181,184],{},[81,182,183],{},"Error handling is a first-class part of the DSL surface."," Records that fail deserialization can be routed to a dead-letter queue rather than failing the topology. Errors thrown during processor execution are caught by pluggable exception handlers — log-and-continue, log-and-fail, or send-to-dead-letter-queue — configurable per topology. 
Errors during production back to Kafka go through a configurable handler too. StoatFlow ships the latest Kafka Streams additions in this area: pluggable per-record processing exception handlers and standardised dead-letter-queue integration with rich failure metadata (original topic, partition, offset, exception class, error context) so downstream consumers can reason about what failed and why.",[45,186,187,188,107,191,107,193,107,195,198,199,201,202,68],{},"StoatFlow implements the full Kafka Streams DSL surface. Existing topologies port over with a dependency swap and a configuration cleanup — the ",[104,189,190],{},"StreamsBuilder",[104,192,87],{},[104,194,91],{},[104,196,197],{},"Materialized",", and ",[104,200,167],{}," types you already know all work. Operator-by-operator detail, with usage examples, lives in the ",[54,203,205],{"href":204},"\u002Fdocs","docs",[70,207,209],{"id":208},"stoatflow-extensions-to-the-dsl","StoatFlow extensions to the DSL",[45,211,212],{},"On top of the standard DSL, StoatFlow adds a small set of capabilities borrowed from the Apache Flink design playbook — features that complement the standard Kafka Streams primitives.",[94,214,215,230,243,249],{},[97,216,217,102,220,107,223,107,226,229],{},[81,218,219],{},"Flink-style watermark strategies",[104,221,222],{},"BoundedOutOfOrderness",[104,224,225],{},"MonotonousTimestamps",[104,227,228],{},"NoWatermarks",", and custom strategies. Configure event-time progress per source topic; reuse Flink's mental model for late-arrival handling. Useful when source topics have different out-of-order tolerances, when watermark logic depends on record content rather than just record timestamps, or when you want a single watermark strategy reused consistently across multiple topologies.",[97,231,232,235,236,238,239,242],{},[81,233,234],{},"Flink-style timers"," — register event-time and processing-time timers from inside a custom ",[104,237,167],{},". 
Fire callbacks at deterministic event-time progress, independent of incoming records. Useful for emitting expiry events when per-key state goes inactive, scheduling periodic emissions of aggregated state per key, or implementing custom watermark-triggered semantics. Standard Kafka Streams ",[104,240,241],{},"Punctuator","s still work and remain useful for topology-wide scheduled work; these timers add a complementary per-key option.",[97,244,245,248],{},[81,246,247],{},"Scheduled sources"," — a topology-level source that emits records on an interval or cron schedule, without consuming from a Kafka topic. Useful for periodic enrichment refreshes (reload a reference table every five minutes), heartbeat records that keep watermarks advancing when traffic is sparse, watchdog records that trigger window flushes during quiet periods, scheduled report generation, and other time-driven inputs that aren't sourced from another topic.",[97,250,251,254],{},[81,252,253],{},"Side outputs"," — a single processor can emit to multiple labelled output channels rather than one. Useful for separating a primary output stream from audit or diagnostic streams, producing multiple typed outputs from one input record, or routing records to different destinations based on content while keeping the routing logic inside one processor.",[45,256,257,258,260],{},"These extensions sit alongside the DSL: standard topologies don't see them; new code can use them where they fit. The ",[54,259,205],{"href":204}," cover usage.",[70,262,264],{"id":263},"state-stores","State stores",[45,266,267,268,68],{},"Stateful operations — aggregations, joins, windowed counts — need somewhere to put their state. StoatFlow ships several store types: key-value, window, session, versioned (timestamped lookups per KIP-889), and timer. Each is available in two flavours: persistent (RocksDB-backed) for durability under crash and in-memory for low-latency hot data. 
State updates and Kafka commits land atomically together — there are no half-applied changes after a failure. Durability is provided by changelog topics; on restart, state is rebuilt from the changelog if the local copy is gone. Store-by-store reference and configuration knobs are in the ",[54,269,205],{"href":204},[70,271,273],{"id":272},"single-replica-architecture","Single-replica architecture",[45,275,276,277,280],{},"StoatFlow runs your topology as ",[81,278,279],{},"one application instance per app"," — a single replica, not a cluster, not a partitioned set. The single-replica shape gives you a set of capabilities that follow directly from there being no coordination across replicas:",[94,282,283,289,295,301,313],{},[97,284,285,288],{},[81,286,287],{},"Parallel processing decoupled from Kafka partition count."," Processing concurrency scales with the cores you give the JVM, not with how many partitions your source topics have. Add cores; get more throughput.",[97,290,291,294],{},[81,292,293],{},"Per-key processing ordering preserved."," Records for the same key always process in the order they arrive, regardless of how parallelism is distributed underneath.",[97,296,297,300],{},[81,298,299],{},"Global state."," Every state store is accessible from any processing context, by any key. No partition-scoped isolation, no inter-instance lookups, no global-table replication overhead.",[97,302,303,306,307,107,309,312],{},[81,304,305],{},"In-memory repartitioning."," Operations that change the key — ",[104,308,116],{},[104,310,311],{},"groupBy",", joins on different keys — route records to a new processing context within the same JVM, without a round-trip through a Kafka repartition topic.",[97,314,315,318,319,322],{},[81,316,317],{},"Blocking I\u002FO is natural."," REST calls, database queries, AI-inference requests work in-line. 
No async-callback frameworks, no ",[104,320,321],{},"CompletableFuture"," ceremony.",[45,324,325,326,328,329,68],{},"The mechanism behind these capabilities (virtual-thread lanes, key-affinity hashing, the lane dispatcher, the commit-barrier protocol) is described in ",[54,327,28],{"href":29},". The architectural reasoning — why single-replica, why now — lives on ",[54,330,57],{"href":56},[70,332,334],{"id":333},"exactly-once-semantics","Exactly-once semantics",[45,336,337],{},"StoatFlow supports both processing modes Apache Kafka offers:",[94,339,340,346],{},[97,341,342,345],{},[81,343,344],{},"At-least-once (ALO)"," — every record is processed at least once; duplicates can occur on failure recovery. Faster, simpler, the default for workloads that tolerate retries downstream.",[97,347,348,351],{},[81,349,350],{},"Exactly-once (EOS)"," — every record's effects (state-store updates, downstream output records, input-offset commits) are atomically committed exactly once, even across crashes and restarts. No duplicates, no lost work, no partially-applied side effects.",[45,353,354,355,358,359,68],{},"In StoatFlow, exactly-once is delivered by a ",[81,356,357],{},"single commit barrier"," that flows through the whole topology. When the barrier completes, all lane state, all sink writes, and all consumer offsets commit together in one Kafka transaction. There are no per-task transactions to coordinate, no cross-instance two-phase commits. Switching between ALO and EOS is a configuration change; the topology code doesn't move. The commit-protocol details are in ",[54,360,28],{"href":29},[70,362,364],{"id":363},"operational-simplicity-and-fast-deploys","Operational simplicity and fast deploys",[45,366,367],{},"StoatFlow is built to be operated, not orchestrated.",[94,369,370,376,382,388,394,400,409],{},[97,371,372,375],{},[81,373,374],{},"One JAR or container per app."," No cluster, no operator, no Helm chart, no sidecar. 
Deploy it the way you deploy any other Java application.",[97,377,378,381],{},[81,379,380],{},"No consumer-group sizing."," There's no \"group\" — a single replica owns all partitions. Nothing to plan, nothing to rebalance.",[97,383,384,387],{},[81,385,386],{},"No standby replicas to budget."," State is on the active instance; recovery is from changelog, not from a hot copy with its own resource cost.",[97,389,390,393],{},[81,391,392],{},"No checkpoint storage, no savepoint coordination."," The commit-barrier mechanism is internal to the single replica; there's no external checkpoint store to configure, tune, or pay for.",[97,395,396,399],{},[81,397,398],{},"Deploys are single-replica restarts."," A new version starts, recovers state, and resumes processing. No cluster-wide rebalance, no partition reassignment, no consumer-group reconvergence.",[97,401,402,405,406,408],{},[81,403,404],{},"Fast time-to-processing."," State restoration is parallelised across stores; on small and medium state, start-to-process is in the hundreds of milliseconds. See ",[54,407,67],{"href":66}," for measured cold-start numbers.",[97,410,411,414,415,418,419,422],{},[81,412,413],{},"Kubernetes-native, without a custom operator."," A standard ",[104,416,417],{},"Deployment"," plus a ",[104,420,421],{},"PersistentVolumeClaim"," is enough.",[45,424,425,426,68],{},"Less to set up, less to tune, less to fail in production. The design rationale behind this footprint sits on ",[54,427,57],{"href":56},[70,429,431,432,435,436],{"id":430},"production-runtime-core-and-runtime","Production runtime — ",[104,433,434],{},":core"," and ",[104,437,438],{},":runtime",[45,440,441],{},"StoatFlow ships as two modules:",[94,443,444,465],{},[97,445,446,450,451,453,454,457,458,461,462,464],{},[81,447,448],{},[104,449,434],{}," is the programmatic API. You build a topology with ",[104,452,190],{},", hand it to a ",[104,455,456],{},"StoatFlow"," instance, and call ",[104,459,460],{},"start()",". 
",[104,463,434],{}," is embeddable in any JVM application — bring your own framework, your own DI container, your own deployment shape.",[97,466,467,471,472,474],{},[81,468,469],{},[104,470,438],{}," wraps ",[104,473,434],{}," with production scaffolding. It exposes an admin REST layer modelled on Spring Boot Actuator, so the boilerplate you'd otherwise write (health checks, metrics, graceful shutdown, lifecycle management, observability endpoints) ships out of the box.",[45,476,477],{},"The runtime endpoints cover:",[94,479,480,486,492,498,504],{},[97,481,482,485],{},[81,483,484],{},"Health probes"," — liveness and readiness, for Kubernetes and any other orchestrator that speaks HTTP probes.",[97,487,488,491],{},[81,489,490],{},"Metrics"," — Prometheus scrape endpoint with JVM, Kafka-client, and StoatFlow internal counters.",[97,493,494,497],{},[81,495,496],{},"Topology introspection"," — see the compiled processor DAG, watermark state per partition, commit-barrier coordinator state.",[97,499,500,503],{},[81,501,502],{},"Debug endpoints"," — thread state and barrier coordination, useful when a production app behaves unexpectedly and you need to see what every lane is doing right now.",[97,505,506,509,510,513],{},[81,507,508],{},"Pause \u002F unpause"," controls and a ",[81,511,512],{},"plugin \u002F lifecycle hook system"," for custom integrations.",[45,515,516,517,68],{},"The full endpoint reference is in the ",[54,518,205],{"href":204},[70,520,522],{"id":521},"testing","Testing",[45,524,525,526,529,530,435,533,536,537,68],{},"StoatFlow runs in any standard JVM testing framework — JUnit, Kotest, Spock, anything that supports JDK 25. The library ships its own in-memory test harness: a ",[104,527,528],{},"TopologyTestDriver"," that runs a topology synchronously without a Kafka broker, plus ",[104,531,532],{},"TestInputTopic",[104,534,535],{},"TestOutputTopic"," helpers for piping records through and asserting on outputs. 
Detailed test patterns and worked examples are in the ",[54,538,205],{"href":204},[70,540,542],{"id":541},"build-and-deployment","Build and deployment",[45,544,545,546,549],{},"StoatFlow ships a Gradle convention plugin (",[104,547,548],{},"io.stoatflow",") alongside the runtime modules. The plugin encapsulates the build conventions that StoatFlow apps need so you don't have to reproduce them yourself.",[94,551,552,558,564,570],{},[97,553,554,557],{},[81,555,556],{},"Toolchain"," — JDK 25, which provides the virtual-thread and FFM APIs StoatFlow uses as final (non-preview) features.",[97,559,560,563],{},[81,561,562],{},"Shadow JAR"," — distributable fat JAR with the main class set; one artifact to deploy.",[97,565,566,569],{},[81,567,568],{},"Container image"," — opt-in Jib integration produces an OCI image from Gradle without a Dockerfile.",[97,571,572,575,576,435,578,580],{},[81,573,574],{},"GraalVM native image"," — opt-in native compilation for smaller binaries and faster cold starts. Reachability metadata (RocksDB FFM downcalls, Kafka client reflection) is curated and shipped with ",[104,577,434],{},[104,579,438],{},", so the standard StoatFlow surface compiles to native without manual configuration.",[45,582,583,584,586],{},"Standard Java packaging — no special runtime, no platform agent. 
The ",[54,585,205],{"href":204}," cover the build setup.",[70,588,590],{"id":589},"see-also","See also",[94,592,593,598,603,608,613,619],{},[97,594,595,597],{},[54,596,57],{"href":56}," — why StoatFlow is built the way it is",[97,599,600,602],{},[54,601,62],{"href":61}," — feature-by-feature against Kafka Streams and Flink",[97,604,605,607],{},[54,606,67],{"href":66}," — measured throughput, latency, resource use, cold-start time",[97,609,610,612],{},[54,611,28],{"href":29}," — runtime internals (lanes, dispatcher, commit-barrier protocol)",[97,614,615,618],{},[54,616,617],{"href":204},"Docs"," — usage, examples, configuration reference",[97,620,621,625],{},[54,622,624],{"href":623},"\u002Fproduct\u002Froadmap","Roadmap"," — what's coming next",{"title":627,"searchDepth":628,"depth":628,"links":629},"",2,[630,631,632,633,634,635,636,638,639,640],{"id":72,"depth":628,"text":73},{"id":208,"depth":628,"text":209},{"id":263,"depth":628,"text":264},{"id":272,"depth":628,"text":273},{"id":333,"depth":628,"text":334},{"id":363,"depth":628,"text":364},{"id":430,"depth":628,"text":637},"Production runtime — :core and :runtime",{"id":521,"depth":628,"text":522},{"id":541,"depth":628,"text":542},{"id":589,"depth":628,"text":590},true,"What StoatFlow does — the Kafka Streams DSL, StoatFlow extensions, state stores, single-replica runtime, exactly-once semantics, operational tooling, testing, and build chain.","md",{},"\u002Fpages\u002Fproduct\u002Ffeatures",{"title":40,"description":642},"6.pages\u002Fproduct\u002Ffeatures","yBgQGD2WFkS5OLMsiJwTov12w8H2voFmPDi7KyjmS0k",1780332010534]