GDPR-Compliant AI for German Companies: How It Works
Most AI tools send your data to US servers. Here is how to run private AI on European infrastructure, GDPR-compliant from day one.
Your legal team just asked a question no one in the room could answer: does your AI vendor qualify as a data processor under GDPR, and if so, have you signed a data processing agreement with them? If the answer involves any hesitation, your company is probably in violation right now — not because you did something wrong, but because most AI tools were not built with this question in mind.
German mid-market companies between 20 and 500 employees are in a specific bind. Enterprise AI platforms from SAP or Microsoft cost six figures to deploy and take months to configure. Consumer tools like ChatGPT process data on servers in the United States, which puts you in direct conflict with GDPR Articles 44 through 49, the rules governing data transfers outside the EU. The middle ground — a system built for your size, deployed on European infrastructure, actually GDPR-compliant — has been mostly empty.
GDPR-Compliant AI (DSGVO-konforme KI) Starts with Where the Data Lives
GDPR compliance for an AI system is not a policy you write. It is an architecture decision you make before writing a single line of code.
When a German mid-market company uses a US-based AI service, even through a European subsidiary, Article 44 applies: personal data may not be transferred to a third country unless that country ensures an adequate level of protection. The US does not have an adequacy decision for AI processing — the EU-US Data Privacy Framework covers specific certified companies, but most AI API providers are not on that list, and even those that are face ongoing legal challenge.
The practical consequence: if a tax advisory firm in Stuttgart uploads client documents to ChatGPT or Claude, that is a data transfer to the US. If those documents contain names, financial information, or tax IDs — they do, of course — the firm has processed personal data outside the EU without a legal basis.
The alternative is not complicated. Run the AI on servers inside Germany.
Loopwise deploys on Hetzner Cloud infrastructure in Nuremberg and Falkenstein. Both data centers are in Germany, subject to German and EU law. The LLM layer runs on Mistral AI, a French company with servers in France — inside the EU, inside the EEA, covered by the same legal framework as data processed in Berlin. No request ever reaches a server outside Europe.
Data sovereignty for AI means a system where the data cannot leave the continent because there is no path for it to do so. Not a DPA document you sign and forget.
Private AI Germany: The Architecture Behind the Compliance
GDPR Article 28 requires that if you use a third party to process personal data on your behalf — a data processor — you must have a written contract specifying what they can do with that data. Most AI vendors offer this as a standard DPA. The problem is not the contract. The problem is what happens to the data technically, regardless of what the contract says.
With a system like Loopwise, the architecture isolates data at three levels.
Database isolation. Every client gets its own PostgreSQL schema. A law firm's documents cannot be accessed by queries intended for a recruiting agency using the same platform. This is per-client schema isolation — not just row-level access control, but a structural separation in the database itself.
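A minimal sketch of what per-client schema provisioning can look like in Python. The tenant names, table layout, and helper function are illustrative assumptions for this article, not Loopwise's actual code; the point is that isolation happens at the DDL level, not in application logic.

```python
import re

def provision_tenant_sql(tenant: str) -> list[str]:
    """Build the DDL for an isolated per-client schema (illustrative).

    Validating the identifier up front matters because schema names
    cannot be bound as query parameters, so they must never contain
    attacker-controlled characters.
    """
    if not re.fullmatch(r"[a-z][a-z0-9_]{2,30}", tenant):
        raise ValueError(f"invalid tenant identifier: {tenant!r}")
    return [
        # One schema per client: a structural boundary in the database.
        f"CREATE SCHEMA {tenant};",
        f"CREATE TABLE {tenant}.documents ("
        "id BIGSERIAL PRIMARY KEY, "
        "title TEXT NOT NULL, "
        "body TEXT NOT NULL);",
        # Each connection pins its search_path to one schema, so
        # unqualified table names resolve inside that client only.
        f"SET search_path TO {tenant};",
    ]
```

Because every tenant's tables live in their own schema, a query written for one client cannot accidentally resolve against another client's data, even if the application layer has a bug.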
Vector storage. Document embeddings are stored in PostgreSQL with the pgvector extension as 1024-dimension vectors, indexed with HNSW (Hierarchical Navigable Small World) for fast nearest-neighbor search. When your team asks a question, the system searches your firm's documents only. The embedding model is Mistral Embed, running on Mistral's French infrastructure.
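A hedged sketch of what that storage layer can look like. The DDL mirrors the setup described above (1024-dimension embeddings, HNSW index via pgvector); the table and column names are assumptions, and the pure-Python cosine function shows the measure the index approximates.

```python
import math

# Illustrative DDL: pgvector extension, 1024-dimension embedding
# column, HNSW index with the cosine-distance operator class.
EMBEDDINGS_DDL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE chunks (
    id        BIGSERIAL PRIMARY KEY,
    doc_id    BIGINT NOT NULL,
    content   TEXT NOT NULL,
    embedding VECTOR(1024) NOT NULL
);
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);
"""

# Retrieval query: <=> is pgvector's cosine-distance operator, so
# ordering by it returns the most similar chunks first.
SEARCH_SQL = """
SELECT content, 1 - (embedding <=> %(query_vec)s) AS similarity
FROM chunks
ORDER BY embedding <=> %(query_vec)s
LIMIT %(k)s;
"""

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """The similarity measure the HNSW index approximates."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

The design choice worth noting: because the vectors sit in the same PostgreSQL instance as the rest of the client's schema, the per-client isolation described above covers the embeddings too, with no separate vector database to secure.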
No training on your data. The LLM provider, Mistral AI, does not use API-submitted data to train its models. This matters because it answers a question clients ask repeatedly: does the AI learn from my documents and share what it learns with other customers? No. The model is fixed. Your documents are retrieved for context, used for one response, and not retained by the model.
These three points together — German servers, per-client database isolation, no training on your data — are what a DPO (Datenschutzbeauftragter) needs to sign off on an AI deployment. Not a vendor's marketing page, but a technical architecture they can audit.
What "No Data Crossing the Atlantic" Actually Means for Your Team
The compliance story matters to your legal and data protection teams. The people using the system every day have a more immediate question: does it actually work?
A mid-sized German company typically has its knowledge distributed across three places. Internal documents — PDFs, Word files, policy manuals — sitting on a file server or in SharePoint. Structured data in systems like DATEV, SAP, or a CRM. And tacit knowledge in people's heads that never got written down.
A private AI deployment addresses the first category directly. Upload your document library once. The system processes each file — PDFs, Word documents, CSVs — chunks them into segments the LLM can reason over, embeds them into vectors, and stores them in your PostgreSQL instance. After that, your team queries the library in plain German.
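The chunking step above can be sketched in a few lines. The segment size and overlap here are illustrative defaults, not the system's actual parameters; the overlap exists so that a sentence straddling a chunk boundary stays retrievable from at least one chunk.

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split a document into overlapping segments for embedding.

    Each returned chunk is at most `size` characters, and consecutive
    chunks share `overlap` characters so boundary-spanning sentences
    are not lost. Sizes are illustrative, not production values.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```

In the pipeline described above, each chunk would then be sent to the embedding model and the resulting vector stored next to the chunk text in PostgreSQL.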
"Was sagt unsere Betriebsvereinbarung zu Homeoffice-Regelungen?" ("What does our works agreement say about home-office rules?") The system retrieves the relevant sections, passes them to the LLM with the question, and returns an answer with citations to the source documents. The answer is only as good as what is in your documents — which is exactly what you want, because the system is not guessing or hallucinating from training data. It is reading your files.
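The retrieval-and-cite step can be sketched as prompt assembly. This is a hypothetical illustration of the pattern, not Loopwise's prompt: retrieved chunks are paired with their source document names so the model can cite verifiable sources, and the instructions constrain it to the supplied excerpts.

```python
def build_prompt(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Assemble a grounded answer prompt with citation markers.

    `retrieved` pairs each text chunk with the document it came from,
    e.g. ("Betriebsvereinbarung.pdf", "..."), so the model's answer
    can point the user back to the exact source file.
    """
    context = "\n\n".join(
        f"[{source}]\n{text}" for source, text in retrieved
    )
    return (
        "Answer the question using ONLY the excerpts below. "
        "Cite the bracketed source name for every claim. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Constraining the model to the retrieved excerpts is what makes the earlier point concrete: an answer either traces back to a named document or the system says it does not know.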
For a tax advisory firm with 50 employees, this might mean 3,000 regulatory documents instantly searchable. For a manufacturing company, it might mean maintenance manuals for 200 machines accessible from a tablet on the shop floor. The architecture handles both; the use case is yours to define.
One honest caveat: a document-based system only knows what you have put into it. Tacit knowledge, undocumented decisions, and data sitting in DATEV or SAP require additional integration work. The document layer is the right starting point for most companies, but it is a starting point.
The Compliance Requirements and the Product Requirements Are the Same
GDPR-compliant AI turns out to be more useful for mid-market companies than non-compliant alternatives — not as a coincidence, but as a direct result of how the constraints shape the system.
Per-client isolation means your data is not mixed with anyone else's, which means the AI's answers are based entirely on your documents, not someone else's. No-training-on-your-data means your competitive information stays yours. European infrastructure means lower latency for German users than US-hosted alternatives.
A vendor who builds for the global consumer market and bolts on a DPA afterward has to paper over the gaps between how the system works and what the contract says. A system designed for GDPR from the architecture up has no gaps to paper over.
Book a 30-minute demo — we'll run a live query against sample documents from your industry and show you exactly where your data sits in the architecture.