AKKO AI Runtime

Orchestrate, govern, audit and optimise private LLMs on your infrastructure.

Six proprietary components on top of an open-source Lego foundation. Built for banking, healthcare and the public sector — regulated environments where every prompt, every token and every row must stay inside your perimeter.

Try Sandbox · Talk to sales

Air-gapped capable · 0 outbound calls · OCSF audit native · Helm one-command

Architecture

Six components, one runtime contract.

Each component is wired into the same Helm chart. Swap an engine (Query, Compute, Storage, Orchestration) without touching the runtime surface — the contract stays the same.

Components

What ships inside the runtime.

Each card maps to one component of the AKKO AI Runtime. The bold capital words are the layers; vendor names appear only because this is an architecture page for CTOs and architects.

01 Query Layer

Query Engine + AI Functions

A federated query engine (Trino 480) extended with a JVM plugin that exposes 17 native SQL functions under the akko_ai_* namespace.

akko_ai_embed() — vector embeddings inside SQL
akko_ai_similarity() — cosine / dot-product / euclidean
akko_ai_search() — semantic search with Caffeine LRU cache
akko_ai_generate() — guarded LLM calls with token caps
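
For readers unfamiliar with the three metrics listed for akko_ai_similarity(), the sketch below shows what each one computes on two small vectors. It is plain Python for illustration only: the function names are hypothetical and it makes no claim about how the JVM plugin is implemented.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more aligned."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product normalised by both vector lengths, in [-1, 1]."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance; smaller means closer."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine ignores vector magnitude, dot product rewards it, and euclidean distance is a distance rather than a similarity, so rankings sort in the opposite direction.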

02 AI Layer

NL2SQL Orchestrator

ADEN turns natural language into governed SQL — with a scope-first OPA design (one policy call resolves the full grant set before generation, instead of N calls per column).

Scope-first OPA (1 call instead of N)
Multi-tier cache: semantic + exact + result
Vector catalog semantic search (Milvus)
SELECT-only enforcement at parse-time
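
The difference between the scope-first pattern and per-column checks can be sketched with a stand-in policy client that counts round trips. Everything here is hypothetical (class name, method names, the "table.column" grant format); it does not reproduce the OPA API or ADEN's internals, only the call-count argument.

```python
class CountingPolicyClient:
    """Hypothetical stand-in for a policy engine; counts round trips."""

    def __init__(self, grants):
        self._grants = grants  # user -> set of "table.column" strings
        self.calls = 0

    def allowed(self, user, column):
        """One round trip per column."""
        self.calls += 1
        return column in self._grants.get(user, set())

    def grant_set(self, user):
        """One round trip that returns the user's full grant set."""
        self.calls += 1
        return frozenset(self._grants.get(user, set()))


def check_per_column(client, user, columns):
    # N policy calls: one for every referenced column.
    return all([client.allowed(user, c) for c in columns])


def check_scope_first(client, user, columns):
    # 1 policy call: resolve the scope once, then check locally.
    scope = client.grant_set(user)
    return all(c in scope for c in columns)
```

With two referenced columns, check_per_column makes two policy calls while check_scope_first makes one; the gap grows linearly with the number of columns in the query.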

03 Catalog Layer

Catalog Auto-Enrichment

NORA + catalog-sync daemon: a steward-reviewed pipeline that mines the query log and the data itself to produce production-ready metadata, not a one-shot LLM hallucination.

PII detection (regex + LLM cross-check)
Description generation with steward sign-off
Foreign-key discovery from join patterns
Query log mining for column importance
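
A minimal sketch of the regex-plus-cross-check idea follows, assuming a single email pattern and a pluggable second opinion (a plain callable standing in for the LLM). The labels, the 0.5 threshold and the function names are illustrative assumptions, not NORA's actual rules.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def regex_flags_pii(samples, threshold=0.5):
    """True when at least `threshold` of the sampled values look like emails."""
    if not samples:
        return False
    hits = sum(1 for value in samples if EMAIL_RE.fullmatch(value))
    return hits / len(samples) >= threshold


def classify_column(samples, second_opinion):
    """Combine the regex verdict with a second classifier's verdict.

    `second_opinion` is any callable taking the samples and returning
    a bool; in a real pipeline it would wrap the LLM cross-check.
    """
    by_regex = regex_flags_pii(samples)
    by_model = bool(second_opinion(samples))
    if by_regex and by_model:
        return "pii"
    if by_regex or by_model:
        return "needs_steward_review"  # detectors disagree: escalate
    return "not_pii"
```

Routing disagreements to a steward rather than picking a winner mirrors the steward sign-off step listed above.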

04 Governance Layer

AI Governance Engine

OPA bundles, OCSF audit emitter, Keycloak event router and a PII propagation tracker stitched into a single governance graph spanning data, prompts and outputs.

OPA bundles versioned + signed
OCSF event schema native (no translator)
Keycloak event router → audit sink
PII propagation graph (source → prompt → output)
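
The source → prompt → output propagation can be pictured as reachability in a directed graph: anything downstream of a PII column is treated as tainted. The sketch below assumes edges are plain (from, to) pairs with made-up node names; the real graph schema is not described on this page.

```python
from collections import deque


def tainted_nodes(edges, pii_sources):
    """Return every node reachable from a PII source, sources included.

    `edges` is an iterable of (upstream, downstream) pairs, for example
    a column feeding a prompt, or a prompt producing an output.
    """
    graph = {}
    for upstream, downstream in edges:
        graph.setdefault(upstream, []).append(downstream)

    seen = set(pii_sources)
    queue = deque(pii_sources)
    while queue:  # breadth-first walk over downstream edges
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

For example, if crm.customers.email feeds prompt:42 and prompt:42 produces output:42, both the prompt and the output are flagged, while an unrelated prompt built only from sales.orders.total is not.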

05 Observability Layer

AI Observability Layer

Cost, quality and drift — the three blind spots of every DIY LLM stack — surfaced as first-class metrics on top of an open observability bus (Prometheus + Tempo + VictoriaLogs).

LLM cost per user / per project / per query
Cache hit rate (semantic / exact / result)
Hallucination tracking against golden answers
Drift monitoring via golden-question replay
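
Golden-question replay can be sketched as: re-ask a fixed set of questions, compare each answer with its stored golden answer, and flag drift when the pass rate falls too far below a baseline. The exact-match comparison and the 0.05 tolerance are simplifying assumptions; a production check would likely use a softer comparison.

```python
def replay_pass_rate(golden, answer_fn):
    """Fraction of golden questions whose answer still matches.

    `golden` maps question -> expected answer; `answer_fn` is any
    callable that takes a question and returns the current answer.
    """
    if not golden:
        raise ValueError("golden set must not be empty")
    passed = sum(
        1
        for question, expected in golden.items()
        if str(answer_fn(question)).strip().lower() == expected.strip().lower()
    )
    return passed / len(golden)


def has_drifted(baseline_rate, current_rate, tolerance=0.05):
    """True when the pass rate dropped by more than `tolerance`."""
    return (baseline_rate - current_rate) > tolerance
```

Running the replay on a schedule and exporting the pass rate as a metric is one way to make drift visible alongside cost and cache hit rate.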

06 RAG Layer

RAG Pipeline

A three-tier retrieval pipeline that lets you start small with pgvector, scale to OpenSearch, and archive cold corpora as Iceberg — all with lineage.

Tier 1 — pgvector for hot, transactional retrieval
Tier 2 — OpenSearch for hybrid lexical + vector
Tier 3 — Iceberg for archived corpora and replay
Lineage end-to-end: document → chunk → answer
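
One way to picture document → chunk → answer lineage is to give every chunk a stable reference back to its document and to carry those references on the answer. The data classes and fixed-size chunking below are illustrative assumptions, independent of which retrieval tier stores the chunks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str  # which document the text came from
    index: int   # position of the chunk inside that document
    text: str


@dataclass(frozen=True)
class Answer:
    text: str
    sources: tuple  # (doc_id, chunk index) pairs backing the answer


def chunk_document(doc_id, text, size):
    """Split text into fixed-size chunks that remember their origin."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        Chunk(doc_id, i, text[start:start + size])
        for i, start in enumerate(range(0, len(text), size))
    ]


def answer_with_lineage(text, used_chunks):
    """Attach the references of the retrieved chunks to the answer."""
    return Answer(text, tuple((c.doc_id, c.index) for c in used_chunks))
```

Because the references are plain (doc_id, index) pairs, an auditor can walk from any answer back to the exact passages that produced it.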

Build vs run

What’s NOT in a typical assembly.

You can absolutely assemble Trino + OPA + Airflow + Iceberg yourself. The table below lists what you would still have to build before you ship the first governed prompt.

Capability	AKKO AI Runtime	Typical Trino + OPA + Airflow assembly
17 AI functions inside SQL	✓ shipped as JVM plugin, JMX-instrumented	✗ build a Trino plugin, wire JMX, version, sign
Single-call OPA (scope-first)	✓ 1 OPA call resolves the full grant set	✗ N calls per query, P95 latency degrades sharply past 5 columns
Auto-enrichment daemon	✓ production-ready, steward-reviewed, write-through	✗ Airflow DAG + LLM glue + steward UI to build
Governance graph	✓ data × prompt × output, single store	✗ three siloed lineage tools, no PII propagation
Native OCSF audit	✓ 6 sources, one OCSF stream, regulator-ready	✗ write a translator per source, hope for parity
Multi-tier semantic cache	✓ semantic + exact + result, with invalidation	✗ Redis with TTL — no semantic equivalence

12–18 months of senior engineering to reconstruct what AKKO ships as a runtime.

Get the AKKO AI Runtime in your environment.

Cloud, on-prem or air-gapped. One Helm command. Your keys, your data, your perimeter, your exit.

Try the sandbox · Contact form

Banking · Healthcare · Public Sector
