AKKO AI Runtime

Orchestrate, govern, audit and optimise
private LLMs on your infrastructure.

Six proprietary components on top of an open-source Lego foundation. Built for banking, healthcare and the public sector — regulated environments where every prompt, every token and every row must stay inside your perimeter.

Air-gapped capable · 0 outbound calls · OCSF audit native · Helm one-command

Six components, one runtime contract.

Each component is wired into the same Helm chart. Swap an engine (Query, Compute, Storage, Orchestration) without touching the runtime surface — the contract stays the same.
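One way to picture the "swap an engine, keep the contract" idea is a small interface that the runtime depends on, with interchangeable engines behind it. This is a minimal sketch, not AKKO's actual contract: `QueryEngine`, `EngineA`, `EngineB` and `Runtime` are hypothetical names introduced for illustration.

```python
from typing import Protocol


class QueryEngine(Protocol):
    """Hypothetical contract: anything that can run SQL and return rows."""

    def run(self, sql: str) -> list: ...


class EngineA:
    def run(self, sql: str) -> list:
        return [("engine-a", sql)]


class EngineB:
    def run(self, sql: str) -> list:
        return [("engine-b", sql)]


class Runtime:
    """The runtime surface only sees the contract, never a concrete engine."""

    def __init__(self, engine: QueryEngine) -> None:
        self._engine = engine

    def ask(self, sql: str) -> list:
        return self._engine.run(sql)
```

Callers use `Runtime(EngineA()).ask(...)` and `Runtime(EngineB()).ask(...)` the same way; only the constructor argument changes.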

Figure: AKKO AI Runtime hex-grid architecture diagram showing the six components: Query Engine and AI Functions, NL2SQL Orchestrator, Catalog Auto-Enrichment, AI Governance Engine, AI Observability Layer, and RAG Pipeline.

What ships inside the runtime.

Each card maps to one component of the AKKO AI Runtime. The bold capital words are the layers; vendor names appear only because this is an architectural page for CTOs and architects.

01 Query Layer

Query Engine + AI Functions

A federated query engine (Trino 480) extended with a JVM plugin that exposes 17 native SQL functions under the akko_ai_* namespace.

  • akko_ai_embed() — vector embeddings inside SQL
  • akko_ai_similarity() — cosine / dot-product / euclidean
  • akko_ai_search() — semantic search with Caffeine LRU cache
  • akko_ai_generate() — guarded LLM calls with token caps
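For readers less familiar with the three metrics listed for akko_ai_similarity(), here is what each one computes, written as a plain Python sketch. It illustrates the maths only; it is not the plugin's code, and the function names `dot`, `cosine` and `euclidean` are chosen here for illustration.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product: larger means more aligned, and it scales with vector length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot product of the two vectors after length normalisation."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller means closer (a distance, not a similarity)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine ignores vector magnitude, which is why it is a common default for text embeddings; dot product and Euclidean distance are sensitive to it.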
17 SQL functions, JMX-instrumented

02 AI Layer

NL2SQL Orchestrator

ADEN turns natural language into governed SQL — with a scope-first OPA design (one policy call resolves the full grant set before generation, instead of N calls per column).

  • Scope-first OPA (1 call instead of N)
  • Multi-tier cache: semantic + exact + result
  • Vector catalog semantic search (Milvus)
  • SELECT-only enforcement at parse-time
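The difference between scope-first and per-column authorisation can be sketched in a few lines. The sketch below assumes a stand-in policy client that simply counts decisions; `FakePolicyClient`, `resolve_scope` and `is_allowed` are hypothetical names, not OPA's API or ADEN's code.

```python
from dataclasses import dataclass


@dataclass
class FakePolicyClient:
    """Stand-in for a policy engine; counts how many decisions are requested."""

    grants: dict  # user -> set of allowed "table.column" names
    calls: int = 0

    def resolve_scope(self, user: str) -> frozenset:
        """One call that returns the full grant set for a user."""
        self.calls += 1
        return frozenset(self.grants.get(user, set()))

    def is_allowed(self, user: str, column: str) -> bool:
        """One call per column."""
        self.calls += 1
        return column in self.grants.get(user, set())


def per_column_check(client: FakePolicyClient, user: str, columns: list) -> list:
    # N policy calls: one for every candidate column.
    return [c for c in columns if client.is_allowed(user, c)]


def scope_first_check(client: FakePolicyClient, user: str, columns: list) -> list:
    # 1 policy call: resolve the grant set, then filter locally.
    scope = client.resolve_scope(user)
    return [c for c in columns if c in scope]
```

Both functions return the same allowed columns; the scope-first variant makes a single policy call regardless of how many columns the question touches.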
0.8 s p50 NL → governed SQL

03 Catalog Layer

Catalog Auto-Enrichment

NORA + catalog-sync daemon: a steward-reviewed pipeline that mines the query log and the data itself to produce production-ready metadata, not a one-shot LLM hallucination.

  • PII detection (regex + LLM cross-check)
  • Description generation with steward sign-off
  • Foreign-key discovery from join patterns
  • Query log mining for column importance
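The "regex + LLM cross-check" idea for PII detection can be sketched as two independent votes that are then compared. Everything below is an illustrative assumption: the two patterns, the 80% sample threshold, the status names and the `cross_check` callable (a placeholder for whatever model call a real pipeline would make) are not taken from NORA.

```python
import re
from typing import Callable, List, Optional

# Illustrative patterns only; a real deployment would need a much larger set.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "iban": re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$"),
}


def regex_vote(samples: List[str], min_ratio: float = 0.8) -> Optional[str]:
    """Return a PII label if enough non-empty samples match one pattern."""
    values = [s for s in samples if s]
    if not values:
        return None
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for s in values if pattern.match(s))
        if hits / len(values) >= min_ratio:
            return label
    return None


def classify_column(
    name: str,
    samples: List[str],
    cross_check: Callable[[str, List[str]], Optional[str]],
) -> dict:
    """Compare the regex vote with a second opinion and record the outcome."""
    regex_label = regex_vote(samples)
    model_label = cross_check(name, samples)
    if regex_label is None and model_label is None:
        status = "clear"
    elif regex_label == model_label:
        status = "agreed"      # both detectors propose the same tag
    else:
        status = "conflict"    # disagreement, flag for a data steward
    return {"column": name, "regex": regex_label,
            "cross_check": model_label, "status": status}
```

In this sketch a disagreement never resolves itself automatically; it is surfaced so that a steward makes the final call.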
4 enrichment pipelines, sync write-through

04 Governance Layer

AI Governance Engine

OPA bundles, OCSF audit emitter, Keycloak event router and a PII propagation tracker stitched into a single governance graph spanning data, prompts and outputs.

  • OPA bundles versioned + signed
  • OCSF event schema native (no translator)
  • Keycloak event router → audit sink
  • PII propagation graph (source → prompt → output)
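A PII propagation graph of the source → prompt → output kind can be pictured as a directed graph in which "contains PII" flows along the edges. The sketch below is a generic breadth-first traversal under that assumption; the node names and the `tainted_nodes` function are hypothetical and say nothing about how the governance engine stores its graph.

```python
from collections import deque
from typing import Dict, List, Set


def tainted_nodes(edges: Dict[str, List[str]], pii_sources: Set[str]) -> Set[str]:
    """Return every node reachable from a PII source, sources included."""
    seen = set(pii_sources)
    queue = deque(pii_sources)
    while queue:
        node = queue.popleft()
        for downstream in edges.get(node, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return seen


# Hypothetical lineage: two columns feed two prompts, each producing one output.
EDGES = {
    "crm.customers.email": ["prompt:42"],
    "sales.orders.total": ["prompt:43"],
    "prompt:42": ["output:42"],
    "prompt:43": ["output:43"],
}
```

With `crm.customers.email` marked as a PII source, the traversal flags `prompt:42` and `output:42` and leaves the `sales.orders.total` branch untouched.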
6 audit sources, single OCSF stream

05 Observability Layer

AI Observability Layer

Cost, quality and drift — the three blind spots of every DIY LLM stack — surfaced as first-class metrics on top of an open observability bus (Prometheus + Tempo + VictoriaLogs).

  • LLM cost per user / per project / per query
  • Cache hit rate (semantic / exact / result)
  • Hallucination tracking against golden answers
  • Drift monitoring via golden-question replay
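Golden-question replay boils down to re-asking a fixed set of questions and comparing the pass rate with a baseline. The sketch below assumes normalised exact-match scoring and a 5-point tolerance, both simplifications chosen for illustration; `GoldenCase`, `replay` and `drifted` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class GoldenCase:
    question: str
    expected: str


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def replay(cases: List[GoldenCase], answer_fn: Callable[[str], str]) -> float:
    """Re-ask every golden question and return the fraction answered correctly."""
    if not cases:
        raise ValueError("at least one golden case is required")
    passed = sum(
        1 for c in cases if normalize(answer_fn(c.question)) == normalize(c.expected)
    )
    return passed / len(cases)


def drifted(baseline_rate: float, current_rate: float, tolerance: float = 0.05) -> bool:
    """Flag drift when the pass rate drops by more than the tolerance."""
    return (baseline_rate - current_rate) > tolerance
```

Exact match is the strictest possible scorer; free-text answers would usually need a softer comparison, which is outside the scope of this sketch.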
4 signals, one dashboard per persona

06 RAG Layer

RAG Pipeline

A three-tier retrieval pipeline that lets you start small with pgvector, scale to OpenSearch, and archive cold corpora as Iceberg — all with lineage.

  • Tier 1 — pgvector for hot, transactional retrieval
  • Tier 2 — OpenSearch for hybrid lexical + vector
  • Tier 3 — Iceberg for archived corpora and replay
  • Lineage end-to-end: document → chunk → answer
3 retrieval tiers, one lineage graph
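Document → chunk → answer lineage can be sketched as a pair of records that always carry their parent's identifier. The data model below is an assumption made for illustration (`Chunk`, `Answer` and `trace_lineage` are hypothetical names); it only shows how an answer could be traced back to its source documents and the tier each chunk was served from.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    document_id: str  # parent document
    tier: str         # "pgvector", "opensearch" or "iceberg"
    text: str


@dataclass(frozen=True)
class Answer:
    text: str
    chunk_ids: Tuple[str, ...]  # chunks the answer was generated from


def trace_lineage(answer: Answer, chunks_by_id: Dict[str, Chunk]) -> List[Tuple[str, str, str]]:
    """Return (document_id, chunk_id, tier) for every chunk behind an answer."""
    lineage = []
    for chunk_id in answer.chunk_ids:
        chunk = chunks_by_id[chunk_id]
        lineage.append((chunk.document_id, chunk.chunk_id, chunk.tier))
    return lineage
```

Because every chunk keeps its `document_id` and `tier`, the trace works the same way whichever of the three tiers served the chunk.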

What’s NOT in a typical assembly.

You can absolutely assemble Trino + OPA + Airflow + Iceberg yourself. The table below lists what you would still have to build before you ship the first governed prompt.

Capability | AKKO AI Runtime | Typical Trino + OPA + Airflow assembly
17 AI functions inside SQL | shipped as JVM plugin, JMX-instrumented | build a Trino plugin, wire JMX, version, sign
Single-call OPA (scope-first) | 1 OPA call resolves the full grant set | N calls per query, P95 explodes past 5 columns
Auto-enrichment daemon | production-ready, steward-reviewed, write-through | Airflow DAG + LLM glue + steward UI to build
Governance graph | data × prompt × output, single store | three siloed lineage tools, no PII propagation
Native OCSF audit | 6 sources, one OCSF stream, regulator-ready | write a translator per source, hope for parity
Multi-tier semantic cache | semantic + exact + result, with invalidation | Redis with TTL, no semantic equivalence
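To make the last row concrete, here is one way a semantic + exact + result cache with invalidation could be layered. It is a sketch under stated assumptions, not AKKO's implementation: the lookup order, the 0.9 cosine threshold, the caller-supplied `embed` function and the `MultiTierCache` class are all hypothetical.

```python
import math
from typing import Callable, List, Optional, Tuple


def _cosine(a: List[float], b: List[float]) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 0.0 if norm == 0 else sum(x * y for x, y in zip(a, b)) / norm


class MultiTierCache:
    """Exact tier, then semantic tier for question -> SQL; result tier for SQL -> rows."""

    def __init__(self, embed: Callable[[str], List[float]], threshold: float = 0.9):
        self._embed = embed
        self._threshold = threshold
        self._exact = {}     # normalised question -> SQL
        self._semantic = []  # list of (embedding, SQL)
        self._results = {}   # SQL text -> rows

    @staticmethod
    def _key(question: str) -> str:
        return " ".join(question.lower().split())

    def put_sql(self, question: str, sql: str) -> None:
        self._exact[self._key(question)] = sql
        self._semantic.append((self._embed(question), sql))

    def get_sql(self, question: str) -> Optional[Tuple[str, str]]:
        """Return (tier, sql) for the first tier that hits, or None on a miss."""
        key = self._key(question)
        if key in self._exact:
            return ("exact", self._exact[key])
        vector = self._embed(question)
        best_score, best_sql = 0.0, None
        for cached_vector, sql in self._semantic:
            score = _cosine(vector, cached_vector)
            if score > best_score:
                best_score, best_sql = score, sql
        if best_sql is not None and best_score >= self._threshold:
            return ("semantic", best_sql)
        return None

    def put_result(self, sql: str, rows: list) -> None:
        self._results[sql] = rows

    def get_result(self, sql: str) -> Optional[list]:
        return self._results.get(sql)

    def invalidate(self, sql: str) -> None:
        """Drop cached rows, e.g. after the underlying table changes."""
        self._results.pop(sql, None)
```

The semantic tier is what a plain key-value cache with TTL cannot provide: two differently worded questions can resolve to the same cached SQL when their embeddings are close enough.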

12–18 months of senior engineering to reconstruct what AKKO ships as a runtime.

Get the AKKO AI Runtime in your environment.

Cloud, on-prem or air-gapped. One Helm command. Your keys, your data, your perimeter, your exit.

Banking · Healthcare · Public Sector