Blazil Roadmap: Fintech Scale, AI LLM, and Hardening


Fintech scale, AI inference, and operating maturity now move on one roadmap.

v0.1 delivered 62,770 TPS on a 3-node DigitalOcean cluster. v0.2 reached 436K TPS sharded and 131K TPS with full VSR consensus. v0.3 shipped April 19, 2026 on AWS i4i.4xlarge at a 233,894 TPS peak, with live VSR failover testing. v0.3.1 integrated AI inference with Tract ONNX. v0.3.2 added multi-stream priority routing with <1ms latency for critical requests. v0.3.3 hardened the platform with cross-shard 2PC, mTLS, signed artifacts, and operator runbooks. v0.4 targets 1M+ sustained TPS on 4× i8g.16xlarge. v0.5 AI LLM verified distributed Qwen2.5-7B generation, and v0.5 AI Production targets large-model serving.

v0.1 · ✓ Done · 62,770 TPS · DO cluster · gRPC

—Core Rust engine with LMAX Disruptor ring buffer
—TigerBeetle VSR consensus across three nodes
—gRPC + Tokio UDP transport · 256 in-flight window
—3-node DigitalOcean cluster · $252/month · 0% error rate
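
The Disruptor pattern named above rests on a preallocated, power-of-two ring indexed by ever-increasing sequence numbers. The following is a minimal single-threaded sketch of that idea, not Blazil's Rust implementation: the class and method names are hypothetical, and a real Disruptor adds memory barriers, batching, and multi-consumer coordination that are omitted here.

```python
class RingBuffer:
    """Illustrative single-producer, single-consumer ring (hypothetical API)."""

    def __init__(self, capacity: int) -> None:
        # A power-of-two capacity lets us replace modulo with a bit mask.
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._slots = [None] * capacity  # preallocated once, reused forever
        self._mask = capacity - 1
        self._head = 0  # next sequence the producer will write
        self._tail = 0  # next sequence the consumer will read

    def try_publish(self, event) -> bool:
        # Full when the producer is a whole lap ahead of the consumer.
        if self._head - self._tail == len(self._slots):
            return False
        self._slots[self._head & self._mask] = event
        self._head += 1
        return True

    def try_consume(self):
        # Empty when the consumer has caught up; None signals "nothing to read".
        if self._tail == self._head:
            return None
        event = self._slots[self._tail & self._mask]
        self._tail += 1
        return event
```

Because the sequences only ever grow, the producer and consumer never contend on a shared index, which is the property that makes the pattern attractive for a hot path.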

v0.2 · ✓ Done · 1.2M TPS local · 436K TPS sharded · 131K TPS VSR

—Aeron IPC transport — 1,203,108 TPS peak on MacBook Air M4
—io_uring disk writes · sharded TigerBeetle E2E integration
—2 shards per node · 436,351 TPS aggregate (3-node sharded)
—130,998 TPS with full VSR consensus · 2/3 quorum · 0% error
—DO cluster benchmark completed April 13 2026 · SGP1
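
To make the sharded layout concrete, here is a small sketch of how an account could be mapped to one of the 3 × 2 shards. Modulo routing by account ID is an assumption chosen for illustration; the roadmap does not state how Blazil assigns accounts to shards, and the function names are hypothetical.

```python
NODES = 3
SHARDS_PER_NODE = 2
NUM_SHARDS = NODES * SHARDS_PER_NODE  # 6 shards in the v0.2 sharded run


def shard_for(account_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard for an account (assumed scheme: simple modulo)."""
    return account_id % num_shards


def node_for(shard: int, shards_per_node: int = SHARDS_PER_NODE) -> int:
    """Map a shard index to the node that hosts it."""
    return shard // shards_per_node


if __name__ == "__main__":
    for account_id in (7, 42, 1001):
        shard = shard_for(account_id)
        print(f"account {account_id} -> shard {shard} on node {node_for(shard)}")
```

Aggregate throughput in a layout like this is the sum over shards, which is why the sharded figure can exceed the single-cluster VSR figure.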

v0.3 · ✓ Done · 233,894 TPS peak · AWS i4i · VSR failover tested

—Production-grade single-shard with live VSR failover test
—AWS i4i.4xlarge · Intel Xeon Platinum 8375C · 16 vCPU · 128 GiB · 1.9TB NVMe
—4 shards × dedicated TigerBeetle client · 3-node VSR cluster (loopback)
—233,894 TPS peak · 180,500 TPS avg · 12,421,068 events · 0% error
—VSR failover tested: 1 replica killed at t=80s → recovered in 37s · bench continued
—AWS Singapore · April 19 2026
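
The failover test relies on majority quorums: a 3-replica cluster needs 2 replicas to agree, so it can lose 1 and keep committing. A quick sketch of that arithmetic (helper names are hypothetical):

```python
def quorum_size(replicas: int) -> int:
    """Smallest majority of a replica set."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be down while a majority remains."""
    return replicas - quorum_size(replicas)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} replicas: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```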

v0.3 — AWS i4i.4xlarge · 233,894 TPS peak · VSR failover recovery tested · 12,421,068 events · 0% error · April 19 2026

v0.3.1 AI · 🔨 Implemented · 1,500–2,000 RPS estimate · AI inference pipeline

—5 production-grade datasets implemented
—Tract ONNX runtime integration for AI inference
—io_uring dataloader for high-throughput model serving
—2,291 LOC · 57 tests passing · CI 100% green
—AI inference pipeline implemented · production benchmark pending

v0.3.2 Priority · ✓ Done · Same TPS · <1ms critical latency

—Multi-stream priority routing (Critical/High/Normal)
—Critical requests bypass queue with <1ms latency guarantee
—429 tests passing · 0 Clippy warnings
—Production-ready priority scheduler deployed
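
One way to picture the three-stream routing is a strict-priority scheduler with one FIFO per level, where Critical is always drained first. This is a sketch under that assumption, with hypothetical names; it does not model the <1ms guarantee, and strict priority can starve Normal traffic unless the real scheduler adds fairness controls.

```python
from collections import deque
from enum import IntEnum


class Priority(IntEnum):
    CRITICAL = 0
    HIGH = 1
    NORMAL = 2


class PriorityRouter:
    """Illustrative strict-priority router with one FIFO stream per level."""

    def __init__(self) -> None:
        self._streams = {p: deque() for p in Priority}

    def submit(self, priority: Priority, request) -> None:
        self._streams[priority].append(request)

    def next_request(self):
        # Enum iteration follows definition order: CRITICAL, HIGH, NORMAL.
        for priority in Priority:
            stream = self._streams[priority]
            if stream:
                return stream.popleft()
        return None  # every stream is empty
```

A Critical request submitted after a backlog of Normal requests is still served first, which is the "bypass" behavior the list describes.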

v0.3.3 Hardening · ✓ Done · Same TPS · 2PC · mTLS · signed supply chain

—Cross-shard 2PC via TigerBeetle pending/post/void flow
—Kubernetes ingress + cert-manager mTLS automation
—Prometheus PVC persistence and operational hardening
—Syft SBOM + Cosign keyless signing in CI
—3 ADRs and 8 runbooks published for operators
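
The pending/post/void flow maps naturally onto two-phase commit: each shard first records a pending leg, then all legs are posted if every shard accepted, or voided otherwise. The sketch below models that with in-memory shards. It is not the TigerBeetle client API; `Shard`, `prepare`, and `transfer` are hypothetical names, and it assumes the two legs live on different shards.

```python
class Shard:
    """In-memory stand-in for one ledger shard (illustration only)."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.pending = {}  # transfer_id -> (account, signed amount)

    def prepare(self, transfer_id, account, delta) -> bool:
        """Phase 1: record a pending leg; reject unknown accounts and overdrafts."""
        if account not in self.balances:
            return False
        # Debits already held as pending count against the available balance.
        held = sum(d for a, d in self.pending.values() if a == account and d < 0)
        if delta < 0 and self.balances[account] + held + delta < 0:
            return False
        self.pending[transfer_id] = (account, delta)
        return True

    def post(self, transfer_id) -> None:
        """Phase 2 (commit): apply the pending leg to the balance."""
        account, delta = self.pending.pop(transfer_id)
        self.balances[account] += delta

    def void(self, transfer_id) -> None:
        """Phase 2 (abort): drop the pending leg without touching balances."""
        self.pending.pop(transfer_id, None)


def transfer(transfer_id, src_shard, src, dst_shard, dst, amount) -> bool:
    """Move funds across two distinct shards; all legs post or all are voided."""
    legs = [(src_shard, src, -amount), (dst_shard, dst, amount)]
    prepared = []
    for shard, account, delta in legs:
        if shard.prepare(transfer_id, account, delta):
            prepared.append(shard)
        else:
            for done in prepared:
                done.void(transfer_id)
            return False
    for shard in prepared:
        shard.post(transfer_id)
    return True
```

A production coordinator also has to survive its own crash between the two phases, typically by persisting the decision and relying on pending-transfer timeouts; that part is out of scope here.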

v0.4 Fintech Scale · 🔮 Future · 1M+ TPS target · 4× i8g.16xlarge Graviton 4

—4× AWS i8g.16xlarge instances (Graviton 4, 64 vCPU each)
—Sharded VSR cluster with horizontal scaling
—Production benchmark pending · target 1M+ sustained TPS
—Multi-region replication architecture
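
A back-of-the-envelope check of the target, assuming load splits evenly across the four instances and scales linearly (both are assumptions, and the v0.3 figures come from different hardware):

```python
TARGET_TPS = 1_000_000   # v0.4 sustained target
INSTANCES = 4            # 4x i8g.16xlarge
V03_PEAK_TPS = 233_894   # v0.3 peak on one i4i.4xlarge
V03_AVG_TPS = 180_500    # v0.3 average on one i4i.4xlarge

# Throughput each instance must sustain under an even split.
per_instance = TARGET_TPS / INSTANCES

print(f"required per instance: {per_instance:,.0f} TPS")
print(f"relative to v0.3 peak:    {per_instance / V03_PEAK_TPS:.2f}x")
print(f"relative to v0.3 average: {per_instance / V03_AVG_TPS:.2f}x")
```

Under these assumptions each instance needs 250,000 sustained TPS, slightly above the v0.3 peak and well above its average, so the target depends on the larger instances delivering more per node than v0.3 did.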

Target: 2027 — bare-metal NVMe Gen4 · XDP ingress · estimated 5–10M TPS sharded · 1–2M TPS VSR

v0.5 AI LLM · ✓ Verified · 32 tokens in 19.7s · distributed Qwen2.5-7B

—Qwen2.5-7B-Instruct distributed 3-stage inference pipeline
—Stage orchestration across Aeron IPC streams 1001/2001/2002/1002/1003
—Language Drift issue fixed and English generation verified
—KV cache preserved across decode steps with correct position propagation
—Production-ready multi-token generation on CPU-only Apple M4
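
For scale, the verified run converts to throughput as follows. This is plain arithmetic on the published figures and treats the 19.7s as covering all 32 generated tokens:

```python
TOKENS = 32
SECONDS = 19.7

tokens_per_second = TOKENS / SECONDS
seconds_per_token = SECONDS / TOKENS

print(f"{tokens_per_second:.2f} tokens/s")
print(f"{seconds_per_token:.3f} s per token")
```

That works out to roughly 1.6 tokens per second on CPU alone.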

Qwen2.5-7B-Instruct · 3-stage distributed pipeline · 32 tokens in 19.7s on Apple M4 CPU · multi-token generation verified

v0.5 AI Production · 🔮 Future · TBD RPS · AI inference production benchmark

—Cortex v1 on ClarkenAI 70B plus Blazil Super Engine
—Cloud cost validation against commercial inference APIs
—Ankatos runtime integration and shared transport layer
—Production benchmark pending for large-model serving

The source is open and the benchmarks are reproducible.

Clone the repository, run the benchmark suite, and verify every number yourself.

View on GitHub →