Architecture USP · AI sovereignty

Your model.
Your key.
Your tenant.

AI features in EMS don't run through our vendor account. Every tenant configures their own LLM endpoint and API key — Azure OpenAI in an EU datacenter, Mistral AI in Paris, or a self-hosted model on your own GPU cluster. The LiteLLM proxy unifies the API. With a custom endpoint, the rule is: hard-fail instead of silent fallback — data control stays with you, even on failure.

30+ models in the capability database
4 provider families out of the box
0 silent fallbacks on custom endpoint

Anatomy

Five building blocks for real AI data sovereignty

"BYO-LLM" is a checkbox in many SaaS settings — for us it's an architecture principle carried through from the tenant setting to the inference layer.

Per-tenant endpoint & key

Every tenant decides in their AI settings: default proxy, or their own endpoint with their own API key. With a custom endpoint, the vendor never sees the prompts or responses.

  • endpointUrl
  • apiKey (encrypted)
  • fallbackToGlobalProxy: false
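A minimal sketch of what these per-tenant settings could look like in TypeScript. The three field names come from the list above; the `GLOBAL_PROXY_URL` constant, the interface name, and the `resolveEndpoint` helper are illustrative assumptions, not the actual EMS implementation.

```typescript
// Hypothetical shape of the per-tenant AI settings described above.
interface TenantAiSettings {
  endpointUrl?: string;           // tenant's own OpenAI-compatible endpoint
  apiKey?: string;                // stored encrypted at rest
  fallbackToGlobalProxy: boolean; // false = never reroute to the vendor account
}

// Placeholder for the vendor-operated default proxy.
const GLOBAL_PROXY_URL = "https://proxy.example.internal/v1";

// Pick the endpoint a tenant's requests are allowed to use:
// a fully configured custom endpoint wins, otherwise the default proxy.
function resolveEndpoint(settings: TenantAiSettings): string {
  if (settings.endpointUrl && settings.apiKey) {
    return settings.endpointUrl;
  }
  return GLOBAL_PROXY_URL;
}
```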

LiteLLM proxy as universal adapter

One unified, OpenAI-compatible API in front of every provider. Switching from OpenAI to Mistral or Ollama is a config change, not a code change in EMS.

  • OpenAI
  • Azure OpenAI
  • Anthropic
  • Mistral
  • Ollama
  • LocalAI
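To illustrate "config change, not code change": LiteLLM addresses providers via a `provider/model` naming convention, so swapping providers means editing one entry. The concrete model names and base URLs below are illustrative assumptions.

```typescript
// Sketch: each provider is just a config entry behind the unified API.
type ProviderConfig = { model: string; apiBase: string };

const inferenceConfig: Record<string, ProviderConfig> = {
  openai:  { model: "openai/gpt-4o",                 apiBase: "https://api.openai.com/v1" },
  mistral: { model: "mistral/mistral-large-latest",  apiBase: "https://api.mistral.ai/v1" },
  ollama:  { model: "ollama/llama3.3",               apiBase: "http://ollama.internal:11434" },
};

// EMS code only ever sees the unified, OpenAI-compatible shape.
function activeConfig(provider: keyof typeof inferenceConfig): ProviderConfig {
  return inferenceConfig[provider];
}
```

Switching from OpenAI to a self-hosted Ollama instance is then a matter of selecting a different key, with no change to the calling code.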

Hard-fail instead of silent fallback

If your custom endpoint goes down, the system rejects the request — it doesn't quietly reroute to the global vendor account. Data control stays intact, even in failure mode.

  • if (tenant.hasCustomEndpoint && !ok)
  • throw new InferenceUnavailableException();
  • // no fallback. No "opportunistic" routing.
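The snippet above can be expanded into a small runnable sketch of the routing rule. The exception name follows the snippet; the `Tenant` shape, the return values, and the health flag are assumptions for illustration.

```typescript
// Hedged sketch of the hard-fail rule: a tenant with a custom endpoint
// never gets rerouted to the global vendor proxy, even on failure.
class InferenceUnavailableException extends Error {
  constructor() {
    super("Custom LLM endpoint unreachable; refusing to fall back.");
  }
}

interface Tenant { hasCustomEndpoint: boolean; }

function routeInference(tenant: Tenant, endpointHealthy: boolean): string {
  if (tenant.hasCustomEndpoint) {
    // No fallback. No "opportunistic" routing.
    if (!endpointHealthy) throw new InferenceUnavailableException();
    return "custom-endpoint";
  }
  return "global-proxy";
}
```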

Tool calling with schema enforcement

Each editor ships its own JSON schema, generated from the TypeScript model. The LLM literally cannot return fields the editor doesn't know — no hallucination, no format break.

  • structuredOutput: true
  • toolCalling: true
  • capabilityCheck per model
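A sketch of what schema enforcement means in practice: the tool-call result is only accepted if it matches the editor's schema, with unknown fields rejected outright. The `timeEntrySchema` here is hand-written and hypothetical; in the text it is generated from the TypeScript model, and a real implementation would use a full JSON Schema validator.

```typescript
// Illustrative editor schema: additionalProperties: false means the LLM
// cannot return fields the editor doesn't know.
const timeEntrySchema = {
  type: "object",
  additionalProperties: false,
  required: ["description", "minutes"],
  properties: {
    description: { type: "string" },
    minutes: { type: "number" },
  },
} as const;

// Minimal structural check: no unknown keys, all required keys present.
function conformsToSchema(payload: Record<string, unknown>): boolean {
  const allowed = Object.keys(timeEntrySchema.properties);
  const hasUnknown = Object.keys(payload).some((k) => !allowed.includes(k));
  const hasRequired = timeEntrySchema.required.every((k) => k in payload);
  return !hasUnknown && hasRequired;
}
```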

Live token transparency per tenant

A dashboard bar in the tenant settings shows tokens spent and live cost — broken down per feature. No hidden "AI credits", no surprise on the monthly invoice.

  • tokens.in · tokens.out
  • cost.eur live
  • breakdown per aiFeature
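The per-feature breakdown behind such a dashboard bar could be a simple aggregation over usage records. The record shape and field names below are assumptions for illustration.

```typescript
// Hypothetical usage record emitted per LLM request.
interface UsageRecord {
  aiFeature: string;  // e.g. "aiPaste", "textEnhancement"
  tokensIn: number;
  tokensOut: number;
  costEur: number;
}

// Aggregate tokens and live cost per AI feature for the dashboard.
function summarizeByFeature(
  records: UsageRecord[]
): Map<string, { tokens: number; costEur: number }> {
  const out = new Map<string, { tokens: number; costEur: number }>();
  for (const r of records) {
    const agg = out.get(r.aiFeature) ?? { tokens: 0, costEur: 0 };
    agg.tokens += r.tokensIn + r.tokensOut;
    agg.costEur += r.costEur;
    out.set(r.aiFeature, agg);
  }
  return out;
}
```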

Scenario 1 · Data sovereignty

Your own Azure key. The vendor never sees your prompts.

A law firm bound by attorney-client privilege cannot send client data to third-party LLMs without a data processing agreement. They deploy Azure OpenAI in an EU region (Sweden Central), create their own service principal with their own API key — and configure it in their EMS tenant's AI settings.

From the next click on "AI Paste" or "Improve with AI", every request runs exclusively against their own Azure tenancy. The vendor (Consiliari) has no access to prompts, no logs, no token counters. The GDPR data-protection impact assessment now addresses only one data processor — Azure EU — instead of a provider chain.

  • DPA with Microsoft only, not with the SaaS vendor
  • EU data residency enforceable (Sweden Central, France Central)
  • Your own Microsoft compliance inventory applies

Scenario 2 · Air-gap

Self-hosted Ollama. Air-gapped research. AI still works.

A research institute runs EMS in an isolated network with no internet — typical for pharma and defense research. Instead of reaching out to an external LLM, the LiteLLM proxy points at an internal Ollama endpoint running Llama 3.3 70B on their own GPU cluster.

AI Paste, AI Grid Filter and Text Enhancement keep working — just against your model. Not one bit of research data leaves the network. No black-box cloud. No monthly token invoice.

  • Ollama, LocalAI, vLLM, or your own OpenAI-compatible API
  • Capability check adapts: vision off → AI Paste PDF off
  • Switching to a cloud LLM later is a single config line

Scenario 3 · Model routing

One model per AI feature — fitted to the task.

Not every AI feature needs the most expensive model. AI Paste with PDF upload needs vision capability — so GPT-5.5. Text Enhancement benefits from Claude Sonnet for more natural tone. AI Grid Filter is a fast, simple tool-calling task — a local 8B model is enough.

Configurable per tenant and per feature. The capability check prevents mistakes: pick an older model without tool calling for AI Paste, and you get a clear error — not a hallucinated reply.
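The capability check described here can be sketched as a lookup from feature requirements to model capabilities. The model names and capability flags below are illustrative assumptions, not the actual capability database.

```typescript
// Illustrative capability flags per model.
interface ModelCapabilities { vision: boolean; toolCalling: boolean; }

const capabilities: Record<string, ModelCapabilities> = {
  "gpt-5.5":      { vision: true,  toolCalling: true },
  "local-8b":     { vision: false, toolCalling: true },
  "legacy-model": { vision: false, toolCalling: false },
};

// What each AI feature requires from its assigned model.
const featureNeeds: Record<string, (c: ModelCapabilities) => boolean> = {
  aiPastePdf:   (c) => c.vision && c.toolCalling, // PDF upload needs vision
  aiGridFilter: (c) => c.toolCalling,             // simple tool-calling task
};

// Reject a misconfiguration with a clear error, not a hallucinated reply.
function checkAssignment(feature: string, model: string): true {
  const caps = capabilities[model];
  const need = featureNeeds[feature];
  if (!caps || !need || !need(caps)) {
    throw new Error(`Model "${model}" lacks the capabilities required by "${feature}".`);
  }
  return true;
}
```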

Scenario 4 · EU AI Act

EU AI Act readiness, off-switch and live cost — per tenant.

AI features default to off per tenant. Activation is a deliberate tenant decision — not a silent vendor update. Each individual feature has its own switch. Token use and cost show up live in the settings, broken down by feature.

No autonomous model decision affects entries, hourly rates or approvals — AI only pre-fills fields and drafts text. The right to override and the audit log remain fully intact.

  • Per-feature opt-in instead of a global "AI on/off" switch
  • Live token and cost bar (no hidden credits)
  • No autonomous decisions — AI suggests, never rules
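Per-feature opt-in instead of one global switch can be sketched as follows; the feature names are taken from the text, while the settings shape is an assumption.

```typescript
// AI features the text names; each has its own independent switch.
type AiFeature = "aiPaste" | "aiGridFilter" | "textEnhancement";

// Off by default per tenant: activation is a deliberate tenant decision.
const defaultAiSettings: Record<AiFeature, boolean> = {
  aiPaste: false,
  aiGridFilter: false,
  textEnhancement: false,
};

// Enabling one feature leaves every other feature untouched.
function enableFeature(
  settings: Record<AiFeature, boolean>,
  feature: AiFeature
): Record<AiFeature, boolean> {
  return { ...settings, [feature]: true };
}
```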

Comparison

BYO-LLM versus vendor-bound LLM

Most SaaS AI features run through one central vendor account — Salesforce Einstein, HubSpot Breeze, Personio AI. Your data flows under their provider contract, in their region, under their model-swap policy. Our approach is different.

Dimension | Temporalis EMS | Vendor-bound suites
LLM provider choice | OpenAI · Azure OpenAI · Anthropic · Mistral · Ollama · LocalAI · your own OpenAI-compatible API | fixed by the vendor — usually OpenAI behind the curtain
API key ownership | tenant configures their own key | vendor account, tenant pays for "AI credits"
Self-hosted option | Ollama, vLLM, LocalAI — air-gap possible | not supported, cloud enforced
Custom endpoint failure | hard-fail (data control retained) | irrelevant — no custom endpoint possible
Schema enforcement (tool calling) | in 10 editors · schema generated from TS model | partly free text, partly templates, rarely tool calling
Model per feature | each AI feature picks its own model | one model for every feature
Token transparency | live in tenant settings · per feature | "AI credit" bundles, no real-time view
Per-feature off switch | every AI mode independently switchable | global "AI on/off" or nothing at all

Want AI in your EMS — without handing your data to our vendor account?

Try free for 14 days, plug in your own key, trigger the hard-fail behaviour yourself. No credit card, no sales call.