[{"data":1,"prerenderedAt":701},["ShallowReactive",2],{"navigation":3,"\u002Fdocs\u002Fgetting-started\u002Farchitecture":38,"\u002Fdocs\u002Fgetting-started\u002Farchitecture-surround":696},[4],{"title":5,"path":6,"stem":7,"children":8,"icon":37},"Getting Started","\u002Fdocs\u002Fgetting-started","1.docs\u002F1.getting-started\u002F1.index",[9,12,17,22,27,32],{"title":10,"path":6,"stem":7,"icon":11},"Getting started","i-lucide-flag",{"title":13,"path":14,"stem":15,"icon":16},"Installation","\u002Fdocs\u002Fgetting-started\u002Finstallation","1.docs\u002F1.getting-started\u002F2.installation","i-lucide-download",{"title":18,"path":19,"stem":20,"icon":21},"License configuration","\u002Fdocs\u002Fgetting-started\u002Flicense-configuration","1.docs\u002F1.getting-started\u002F3.license-configuration","i-lucide-key-round",{"title":23,"path":24,"stem":25,"icon":26},"Your first app","\u002Fdocs\u002Fgetting-started\u002Ffirst-app","1.docs\u002F1.getting-started\u002F4.first-app","i-lucide-square-play",{"title":28,"path":29,"stem":30,"icon":31},"Architecture","\u002Fdocs\u002Fgetting-started\u002Farchitecture","1.docs\u002F1.getting-started\u002F5.architecture","i-lucide-layers",{"title":33,"path":34,"stem":35,"icon":36},"Migrating from Kafka Streams","\u002Fdocs\u002Fgetting-started\u002Fmigration","1.docs\u002F1.getting-started\u002F6.migration","i-lucide-shuffle",false,{"id":39,"title":28,"body":40,"description":690,"extension":691,"meta":692,"navigation":693,"path":29,"seo":694,"stem":30,"__hash__":695},"docs\u002F1.docs\u002F1.getting-started\u002F5.architecture.md",{"type":41,"value":42,"toc":673},"minimark",[43,52,59,64,67,70,98,101,110,114,117,180,191,195,202,209,212,218,228,236,240,247,257,269,276,284,288,294,297,308,311,318,324,327,331,341,352,366,377,393,397,403,414,420,434,440,458,463,467,470,475,489,501,523,535,557,561,564,639,643],[44,45,46,47,51],"p",{},"This page is the conceptual architecture of StoatFlow — the model that runs your topology, the 
mechanisms behind the exactly-once and per-key-order guarantees, and what you can observe at runtime. It describes ",[48,49,50],"strong",{},"behaviour and design",", not implementation: the source code is the source of truth for the specific algorithms, data structures, and internal protocols.",[44,53,54],{},[55,56],"img",{"alt":57,"src":58},"StoatFlow architecture — single-replica engine with key-affinity lanes, global state, and transactional producer","\u002Fassets\u002Fdocs\u002Farchitecture\u002FStoatFlow_high-level_architecture_detailed_20260517.png",[60,61,63],"h2",{"id":62},"the-single-instance-model","The single-instance model",[44,65,66],{},"The starting point is unusual for stream processing: a StoatFlow application runs as exactly one JVM process. There is no cluster, no scheduler, no worker pool. The process opens a Kafka consumer group with one member — itself — and that member is assigned every partition of every source topic the topology reads from.",[44,68,69],{},"Several properties follow directly from this:",[71,72,73,80,86,92],"ul",{},[74,75,76,79],"li",{},[48,77,78],{},"No rebalancing"," — there is no group to rebalance.",[74,81,82,85],{},[48,83,84],{},"Global state"," — every state store lives in this one process; any processing context can read or write any key.",[74,87,88,91],{},[48,89,90],{},"Deterministic behaviour"," — no inter-instance race, no clock skew between replicas, no split-brain scenarios.",[74,93,94,97],{},[48,95,96],{},"One coordinator"," — exactly-once commits are coordinated within the process, not across nodes.",[44,99,100],{},"What this model forbids is equally explicit. You cannot run two replicas of the same StoatFlow application pointed at the same source topics. There is no protocol to coordinate them. Kafka's consumer-group semantics would assign all partitions to one and idle the other. 
High availability comes from fast restart, not from running additional active replicas.",[44,102,103,104,109],{},"The design reasoning — including the trade-offs you accept by giving up open-ended horizontal scaling — is on ",[105,106,108],"a",{"href":107},"\u002Fproduct\u002Fmotivation","Motivation",".",[60,111,113],{"id":112},"data-flow","Data flow",[44,115,116],{},"The diagram above shows the path a record takes from source topic to sink topic.",[118,119,120,126,137,162,168,174],"ol",{},[74,121,122,125],{},[48,123,124],{},"Kafka consumer."," A single consumer reads from every partition of every source topic. Records arrive in batches.",[74,127,128,131,132,136],{},[48,129,130],{},"Record dispatch."," The dispatcher inspects each record's key, decides which processing lane handles it (via consistent hashing — see the next section), and places records onto per-lane queues. The dispatcher also injects commit barriers into the lanes when it's time to commit (see ",[133,134,135],"em",{},"Exactly-once semantics"," below).",[74,138,139,142,143,147,148,147,151,147,154,157,158,161],{},[48,140,141],{},"Processing lanes."," Each lane runs the topology — your ",[144,145,146],"code",{},"mapValues",", ",[144,149,150],{},"filter",[144,152,153],{},"join",[144,155,156],{},"aggregate",", custom ",[144,159,160],{},"Processor"," code — for the records assigned to it. Stateful operators read and write state stores.",[74,163,164,167],{},[48,165,166],{},"State stores."," Backed by RocksDB or held in memory. 
Globally accessible from any lane; see the next-but-one section.",[74,169,170,173],{},[48,171,172],{},"Sink collection."," Records emitted by the topology buffer in a sink collector, ready to publish to Kafka.",[74,175,176,179],{},[48,177,178],{},"Transactional producer."," A single Kafka producer writes the buffered output records, updates the consumer-group offsets, and commits — under exactly-once, all three of those happen atomically on a commit barrier.",[44,181,182,183,186,187,190],{},"That is the full data path. There is no broker round-trip between processing steps inside the topology, no cluster shuffle, no external coordinator. Repartitioning — moving a record from one lane to another because a ",[144,184,185],{},"selectKey"," or ",[144,188,189],{},"groupBy"," changed its key — happens in-memory between lanes; there is no internal repartition topic.",[60,192,194],{"id":193},"processing-lanes-and-key-affinity","Processing lanes and key affinity",[44,196,197,198,201],{},"A ",[48,199,200],{},"lane"," is a unit of concurrent processing inside the JVM. Each lane runs the topology independently of all other lanes, but against the same shared state stores.",[44,203,204,205,208],{},"Records are routed to lanes by ",[48,206,207],{},"key affinity",". The dispatcher hashes each record's key and consistently picks one lane — same key, same lane, every time. This guarantees that for any given key, the topology processes events in the order Kafka delivered them. Different keys process in parallel.",[44,210,211],{},"Two consequences are worth naming:",[44,213,214,217],{},[48,215,216],{},"Lane count is decoupled from Kafka partition count."," In the standard Kafka Streams model, processing parallelism is bounded by the partition count of the input topics — one stream thread per task, one task per partition. StoatFlow doesn't have that coupling. The consumer reads all partitions, then the dispatcher distributes work across however many lanes you configure. 
Lane count scales with cores, not partitions.",[44,219,220,223,224,227],{},[48,221,222],{},"Blocking I\u002FO is cheap."," Lanes run on virtual threads — JDK 21's GA primitive. A lane blocked on a REST call, a database query, or an AI-inference response parks at near-zero cost; the JVM keeps making progress on other lanes. This is what makes in-line external enrichment natural — no ",[144,225,226],{},"CompletableFuture"," chains, no reactive frameworks, no callback wiring required to keep throughput up under blocking calls.",[44,229,230,231,147,233,235],{},"Records that the topology re-keys with ",[144,232,185],{},[144,234,189],{},", or a key-changing join get re-hashed and routed to a different lane. That is the in-memory equivalent of Kafka Streams' repartition topic, without the broker round-trip or extra serialization.",[60,237,239],{"id":238},"state-stores-and-durability","State stores and durability",[44,241,242,243,246],{},"State is ",[48,244,245],{},"global",". Every state store lives in the JVM that's running the topology, and any lane can read or write any key. There's no partition-scoped isolation, no inter-instance lookup protocol — and because there's a single process holding all state, no replication of the same data across multiple JVMs.",[44,248,249,252,253,256],{},[48,250,251],{},"State stores are safe under concurrent access"," across lanes. The correctness story falls out of key affinity (see ",[133,254,255],{},"Processing lanes"," above): records with the same key always route to the same lane, so updates to any given key are processed serially by one lane in arrival order. Different keys update in parallel across different lanes — no contention, no global lock. 
For custom Processors that need to read-modify-write multiple keys atomically (rare, but real for some patterns), the runtime provides a key-lock utility so the cross-key invariant holds without forcing single-threaded execution.",[44,258,259,260,147,263,147,266,268],{},"StoatFlow ships several store types — key-value, window, session, versioned (timestamped lookups), and timer — each available in a RocksDB-backed (persistent, on-disk) variant or an in-memory variant. Stateful DSL operators (",[144,261,262],{},"count",[144,264,265],{},"reduce",[144,267,156],{},", joins, windowed counts, suppress) choose the appropriate store type automatically. Custom Processors can declare their own.",[44,270,271,272,275],{},"Durability is provided by ",[48,273,274],{},"Kafka changelog topics",". Every state write produces a changelog entry. Changelog topics are compacted by key, so the latest value for every key is preserved indefinitely without unbounded storage growth. State updates and the corresponding changelog publish are coupled atomically — when a commit barrier completes, you can be confident the changelog has the same data your in-memory state does.",[44,277,278,279,283],{},"On restart, the runtime rebuilds local state from the changelog. Restoration runs in parallel across stores so a topology with many state stores recovers concurrently rather than serially. For workloads with large state, the changelog read dominates restart time; see ",[105,280,282],{"href":281},"\u002Fproduct\u002Fbenchmarks","Benchmarks"," for measured cold-start numbers on representative workloads.",[60,285,287],{"id":286},"exactly-once-semantics-the-commit-barrier","Exactly-once semantics — the commit barrier",[44,289,290,291,109],{},"Exactly-once is the conceptual centrepiece of the runtime, and the mechanism is the ",[48,292,293],{},"commit barrier",[44,295,296],{},"A commit barrier is a marker — not a data record, not user content — that the dispatcher periodically injects into the lanes. 
As records flow through the topology the barrier flows with them. When every lane has reached the barrier, the runtime executes a single Kafka transaction that commits, atomically:",[71,298,299,302,305],{},[74,300,301],{},"Every state-store write since the previous barrier (via the changelog topics).",[74,303,304],{},"Every sink output record produced since the previous barrier.",[74,306,307],{},"The Kafka consumer-group offsets for every input partition that contributed records.",[44,309,310],{},"Either all three commit together, or none do. If the JVM crashes mid-barrier, the in-flight Kafka transaction aborts. The partial work — uncommitted state changes, uncommitted output records, uncommitted offset advances — is discarded. On restart, processing resumes from the previous successful barrier as if the interrupted epoch had never happened. No duplicate outputs. No lost state. No replayed offsets.",[44,312,313,314,317],{},"This protocol is in the ",[48,315,316],{},"Chandy-Lamport family"," of distributed-snapshot algorithms — the same conceptual lineage that Flink's checkpoint barriers descend from. What's different in StoatFlow is the scope: one process, one barrier, one transaction covers the whole topology. There are no per-task transactions to coordinate, no cross-instance two-phase commits, no external checkpoint store to configure.",[44,319,320,323],{},[48,321,322],{},"At-least-once mode"," bypasses the barrier entirely. The producer commits its output records and the consumer commits its offsets on independent, faster cadences. You accept that on a crash some records may be processed twice and downstream consumers may see duplicates. 
The trade-off is a lower commit-cadence floor on end-to-end latency — useful when downstream systems are already idempotent or duplicate-tolerant.",[44,325,326],{},"The barrier scheduling cadence, the recovery handshake, the bounded-wait protocol for the transaction itself, and the recovery accounting are implementation concerns and stay in the source.",[60,328,330],{"id":329},"event-time-and-watermarks","Event time and watermarks",[44,332,333,334,337,338,109],{},"Stream processing has to handle time. Records arrive out of order. Network buffers can hold a batch for an unpredictable interval. A topic with many producers carries events generated at very different wall-clock times. The runtime needs a consistent model for ",[133,335,336],{},"when did this happen"," that's independent of ",[133,339,340],{},"when did this arrive",[44,342,343,344,347,348,351],{},"That model is ",[48,345,346],{},"event time",". Every input record carries a timestamp — the Kafka record timestamp by default, or whatever a custom ",[144,349,350],{},"TimestampExtractor"," returns. Stateful operators that care about time (windowed aggregations, session windows, joins with time bounds) reason in event time.",[44,353,197,354,357,358,361,362,365],{},[48,355,356],{},"watermark"," is a claim made by the runtime: \"I do not expect any further records earlier than time ",[133,359,360],{},"T",".\" Watermarks are tracked ",[48,363,364],{},"per source partition","; the runtime combines them into a single global watermark for the application — because there's only one application instance, there's no distributed watermark-coordination protocol. 
The global watermark advances together with the commit barrier, so windowed-result records are committed alongside the watermark progress that produced them, and recovery sees a consistent snapshot of what the app has seen so far.",[44,367,368,369,372,373,376],{},"When the global watermark passes a window's end, the runtime can close the window, because it expects no further records that belong inside it. ",[48,370,371],{},"Late records"," (records whose event time is older than the current watermark) follow configurable ",[48,374,375],{},"per-source-topic"," policies: drop them, route them to a dead-letter queue, or apply them to an open window within a configured grace period.",[44,378,379,380,383,384,387,388,392],{},"Custom Processors can register ",[48,381,382],{},"event-time timers"," and ",[48,385,386],{},"processing-time timers"," that fire callbacks when the relevant clock advances past a registered moment, independent of incoming records. See ",[105,389,391],{"href":390},"\u002Fproduct\u002Ffeatures","Features"," for the watermark strategies and timer API.",[60,394,396],{"id":395},"lifecycle-startup-restart-recovery","Lifecycle: startup, restart, recovery",[44,398,399,402],{},[48,400,401],{},"Cold start"," runs in three steps:",[118,404,405,408,411],{},[74,406,407],{},"The runtime opens a Kafka consumer in the configured group and gets every partition of every source topic assigned.",[74,409,410],{},"State stores restore from their changelog topics, in parallel. 
For stores without a local snapshot, this reads the whole changelog; for stores with a local snapshot, only the records since the last commit need to be read.",[74,412,413],{},"Once every store has caught up, the consumer seeks to the last committed input offsets and processing begins.",[44,415,416,419],{},[48,417,418],{},"Clean shutdown"," runs in four steps:",[118,421,422,425,428,431],{},[74,423,424],{},"The runtime stops accepting new records into the dispatcher.",[74,426,427],{},"Records already in flight drain through the topology.",[74,429,430],{},"The dispatcher injects one final commit barrier.",[74,432,433],{},"When that barrier completes, the runtime commits, closes the consumer and producer, and exits.",[44,435,436,439],{},[48,437,438],{},"After a crash",", the flow is similar to cold start, with one difference:",[118,441,442,445,448,451],{},[74,443,444],{},"The JVM exits non-cleanly; in-flight work was uncommitted, by design.",[74,446,447],{},"On restart, the runtime opens the consumer at the last committed offsets, so every record after those offsets will be re-read.",[74,449,450],{},"Local state stores may still have partial data on disk from before the crash; the runtime uses what's there as a head start, and the changelog fills the gap up to the last committed barrier.",[74,452,453,454,457],{},"Processing resumes. Under exactly-once, the previous epoch's partial work was aborted at the broker; downstream consumers reading with ",[144,455,456],{},"read_committed"," isolation see no duplicates.",[44,459,460,461,283],{},"Restart times scale with state size; see ",[105,462,282],{"href":281},[60,464,466],{"id":465},"failure-modes-and-observability","Failure modes and observability",[44,468,469],{},"Production architecture is partly about what happens when things go right, and partly about what you observe when they don't. 
The runtime handles common failures with explicit, configurable policies; the admin endpoints expose the state you need to diagnose and respond.",[471,472,474],"h3",{"id":473},"common-failure-modes","Common failure modes",[44,476,477,480,481,484,485,488],{},[48,478,479],{},"A processor throws an exception."," The configured processing-exception handler decides: log and continue (skip the record), log and fail (stop the topology), or send to a dead-letter queue with the original record and error context. Silent skipping should be a deliberate choice, not an unexamined default. ",[133,482,483],{},"What you see:"," ",[144,486,487],{},"processor-error"," metrics, an error-level log entry with stack trace and record metadata, the offending record in the configured DLQ topic.",[44,490,491,494,495,484,497,500],{},[48,492,493],{},"A record fails deserialization on input."," Same machinery as a processor exception, with its own configurable handler. Useful for source topics that may contain malformed records — keep processing, route the broken records to a DLQ for offline inspection. ",[133,496,483],{},[144,498,499],{},"deserialization-error"," metrics, DLQ records carrying the original key\u002Fvalue bytes and the offending exception.",[44,502,503,506,507,484,509,512,513,516,517,383,520,109],{},[48,504,505],{},"A commit transaction times out or fails."," The runtime aborts the in-flight Kafka transaction, treats it as a fatal commit failure, and exits — Kubernetes restarts the process. On restart, the previous epoch's partial work was aborted at the broker (per Kafka's transaction semantics); the new process resumes from the last successful barrier with no duplicates downstream. 
",[133,508,483],{},[144,510,511],{},"commit-stall"," metrics and the ",[144,514,515],{},"\u002Fdebug\u002Fbarriers"," endpoint show the stuck barrier before exit; the restarting instance enters the restoration phase visible via ",[144,518,519],{},"\u002Fstate",[144,521,522],{},"\u002Fhealth\u002Fready",[44,524,525,528,529,531,532,534],{},[48,526,527],{},"The Kafka broker is unavailable."," Producer and consumer retry per the Kafka client's exponential-backoff defaults. Short outages cause throughput to dip and recover. Sustained outages eventually exceed configured retry budgets and trigger a fatal failure, on the same exit-and-restart pattern as a commit failure. ",[133,530,483],{}," Kafka-client error metrics, growing consumer-lag metric, ",[144,533,522],{}," flipping to 503 once the runtime can no longer make progress.",[44,536,537,540,541,543,544,546,547,549,550,553,554,556],{},[48,538,539],{},"State restoration is slow on cold start."," Restoration proceeds store-by-store from the changelog topics; the time scales with state size. The ",[144,542,522],{}," probe returns 503 until restoration completes — Kubernetes won't route traffic, and load balancers won't think the instance is ready. 
",[133,545,483],{}," the ",[144,548,519],{}," endpoint shows per-store restoration progress; ",[144,551,552],{},"restoration-lag"," metric per store; ",[144,555,522],{}," returning 503 with a JSON body identifying the in-progress restorations.",[471,558,560],{"id":559},"always-on-observability","Always-on observability",[44,562,563],{},"For steady-state diagnostics and capacity planning, the admin endpoints expose:",[71,565,566,578,587,603,614,633],{},[74,567,568,571,572,383,575,577],{},[48,569,570],{},"Health probes"," — ",[144,573,574],{},"\u002Fhealth\u002Flive",[144,576,522],{}," for Kubernetes or any HTTP-probe-aware orchestrator.",[74,579,580,571,583,586],{},[48,581,582],{},"Prometheus metrics",[144,584,585],{},"\u002Fmetrics"," exposes JVM, Kafka-client, and StoatFlow internal counters in the standard Prometheus format. Scrape from your existing monitoring stack; no agents to install.",[74,588,589,571,592,595,596,598,599,602],{},[48,590,591],{},"Topology introspection",[144,593,594],{},"\u002Ftopology"," renders the processor DAG; ",[144,597,519],{}," lists active state stores; ",[144,600,601],{},"\u002Fwatermarks"," shows per-partition watermark state.",[74,604,605,571,608,383,611,613],{},[48,606,607],{},"Debug endpoints",[144,609,610],{},"\u002Fdebug\u002Fthreads",[144,612,515],{}," give live views of every lane's thread state and the commit-barrier coordinator. Useful when a topology behaves unexpectedly and you need to see exactly what every lane is doing right now.",[74,615,616,619,620,147,623,147,626,629,630,109],{},[48,617,618],{},"Plugin and lifecycle hooks"," — register custom indicators, listeners, or shutdown hooks programmatically. 
User code can run on ",[144,621,622],{},"PRE_START",[144,624,625],{},"POST_START",[144,627,628],{},"PRE_STOP",", and ",[144,631,632],{},"POST_STOP",[74,634,635,638],{},[48,636,637],{},"Structured logs"," — JSON-formatted by default, correlatable with metrics by topology name, lane identifier, and barrier identifier.",[60,640,642],{"id":641},"where-to-go-next","Where to go next",[71,644,645,650,655,662,667],{},[74,646,647,649],{},[105,648,391],{"href":390}," — every distinguishing capability, by area",[74,651,652,654],{},[105,653,108],{"href":107}," — why StoatFlow is built this way",[74,656,657,661],{},[105,658,660],{"href":659},"\u002Fproduct\u002Fcomparison-matrix","Comparison matrix"," — feature-by-feature against Kafka Streams and self-hosted \u002F managed Flink",[74,663,664,666],{},[105,665,282],{"href":281}," — measured throughput, latency, resource use, cold-start times",[74,668,669,672],{},[105,670,671],{"href":34},"Migration"," — porting an existing Kafka Streams topology",{"title":674,"searchDepth":675,"depth":675,"links":676},"",2,[677,678,679,680,681,682,683,684,689],{"id":62,"depth":675,"text":63},{"id":112,"depth":675,"text":113},{"id":193,"depth":675,"text":194},{"id":238,"depth":675,"text":239},{"id":286,"depth":675,"text":287},{"id":329,"depth":675,"text":330},{"id":395,"depth":675,"text":396},{"id":465,"depth":675,"text":466,"children":685},[686,688],{"id":473,"depth":687,"text":474},3,{"id":559,"depth":687,"text":560},{"id":641,"depth":675,"text":642},"How StoatFlow runs your Kafka Streams topology as a single replica — the conceptual model, processing lanes, commit barriers, state, and operational surface.","md",{},{"icon":31},{"title":28,"description":690},"sus1V2BI_dYYJ-T0dahfTLVhoaRmQ-lLU5eZcNjeivI",[697,699],{"title":23,"path":24,"stem":25,"description":698,"icon":26,"children":-1},"Build and run a complete word-count stream processor on the StoatFlow runtime — in Kotlin or 
Java.",{"title":33,"path":34,"stem":35,"description":700,"icon":36,"children":-1},"Drop-in instructions for running an existing Kafka Streams topology on StoatFlow.",1780332012754]