Tags: audit-trail, compliance, logging, ai-act

AI Audit Trail and Compliance Logging: Practical SMB Guide

April 24, 2026 · 7 min read · Pixel Management

This article is also available in Dutch

The EU AI Act entered into force in August 2024, and from August 2026 the obligations for high-risk AI systems apply. That means you can no longer get away with "we have AI in use, but we can't show exactly what it did." For high-risk AI systems, logging of decisions is mandatory. For lower-risk AI it becomes effectively mandatory the moment a data subject or regulator asks why an AI decision came out the way it did: no log means no defense.

But what should you actually capture? How long do you keep it? What are the minimum requirements for an SMB without a security operations center? This article gives practical answers, with a reference architecture you can stand up in a few weeks.

Why audit logging is more than a regulatory checkbox

It's tempting to treat audit logging as a compliance tickbox. But businesses that take it seriously quickly discover three other benefits that outweigh the regulation itself.

Debugging when things go wrong. An AI system that made a wrong call — a rejected lead, a faulty price, a misclassification — can only be diagnosed if you know what input it received, which model version it ran, and what output it produced. Without a log, your team is empty-handed.

Evidence for customers. A customer who says "your AI treated me unfairly" is a different conversation when you can show within 5 minutes exactly what happened, vs. when you can't.

Improvement through analysis. Good logs are the foundation for measuring whether your AI is getting better or worse over time. Drift, hallucinations, and patterns of error become visible in retrospective analysis — not in real-time monitoring.

For broader regulatory context, see our pillar on AI legislation in the Netherlands and the EU AI Act.

What the law minimally requires

The EU AI Act (Art. 12, 13, 14) and GDPR (Art. 22, 30) together require a logging set that, in practice, breaks down into six components:

  • Which AI component was used — model, version, vendor
  • What input went in — the exact prompt or data, anonymized where possible
  • What output came out — including any scores or classifications
  • What decision followed — what the system or human did with the output
  • When and by whom — timestamp, user (human or system), context
  • What human oversight took place — did a human review the output? What did that person decide?

This sounds like a lot, but for most AI applications it condenses into one JSON line per decision. The technical execution isn't the bottleneck; the organizational discipline to do it consistently is.
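To make "one JSON line per decision" concrete, here is a minimal sketch of a record covering the six components above. The field names and the `build_log_entry` helper are illustrative choices, not terms mandated by the AI Act or GDPR:

```python
import json
from datetime import datetime, timezone

def build_log_entry(component, model_version, vendor, input_data,
                    output, decision, actor, human_review):
    """Assemble one audit-log record covering the six required components."""
    return {
        "component": component,        # which AI component was used
        "model_version": model_version,
        "vendor": vendor,
        "input": input_data,           # anonymize where possible before logging
        "output": output,              # including any scores or classifications
        "decision": decision,          # what was done with the output
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,                # human user or system identity
        "human_review": human_review,  # e.g. {"reviewed": True, "outcome": "approved"}
    }

entry = build_log_entry(
    component="lead-scoring", model_version="2.3.1", vendor="example-vendor",
    input_data={"lead_id": "L-1042"}, output={"score": 0.82},
    decision="accepted", actor="crm-service",
    human_review={"reviewed": True, "outcome": "approved"},
)
line = json.dumps(entry)  # one line, ready to append to the log
```

One line per decision keeps storage cheap and makes the log trivially greppable later.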

Reference architecture for SMBs

A workable setup that scales without enterprise tooling has five components.

At the front sits a logging layer: every AI call (internal or external) is intercepted by a wrapper function that captures the six fields above. For Python that's a decorator, for Node.js it's middleware. If you're working in no-code stacks like n8n or Make, you add a dedicated "log-to-database" step to each AI flow.
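A Python decorator version of that wrapper might look like the sketch below. The `log_sink` (here a plain list, in production your database writer) and the model function are stand-ins for whatever your stack actually uses:

```python
import functools
import time
import uuid

def audited(component, model_version, vendor, log_sink):
    """Decorator that records every call to an AI function as one structured entry."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(input_data, *, actor, human_review=None, **kwargs):
            output = fn(input_data, **kwargs)
            log_sink.append({
                "id": str(uuid.uuid4()),
                "component": component,
                "model_version": model_version,
                "vendor": vendor,
                "input": input_data,
                "output": output,
                "decision": output.get("decision"),
                "timestamp": time.time(),
                "actor": actor,
                "human_review": human_review,
            })
            return output
        return inner
    return wrap

log = []

@audited("price-check", "1.0.0", "internal", log)
def price_model(input_data):
    # stand-in for a real model or API call
    return {"price": 99.0, "decision": "quote"}

price_model({"sku": "A-17"}, actor="web-shop")
```

Because the decorator wraps the call site, no individual developer has to remember to log: any function carrying `@audited` is covered by construction.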

Behind it sits storage — an append-only database. Postgres with an immutable_log table works well for most SMB volumes (up to tens of thousands of decisions per day). For higher volumes, you graduate to a time-series database (TimescaleDB, ClickHouse) or a log platform (Datadog Logs, Logtail).
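The append-only property is worth enforcing at the database level, not just by convention. In Postgres you would typically `REVOKE UPDATE, DELETE` on the `immutable_log` table from the application role; the sketch below illustrates the same idea with SQLite triggers, purely so it runs self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE immutable_log (
    id        INTEGER PRIMARY KEY,
    logged_at TEXT NOT NULL DEFAULT (datetime('now')),
    entry     TEXT NOT NULL           -- the JSON line per decision
);
-- Block mutation so the table is effectively append-only.
CREATE TRIGGER no_update BEFORE UPDATE ON immutable_log
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
CREATE TRIGGER no_delete BEFORE DELETE ON immutable_log
BEGIN SELECT RAISE(ABORT, 'audit log is append-only'); END;
""")

conn.execute("INSERT INTO immutable_log (entry) VALUES (?)",
             ('{"decision": "accepted"}',))

try:
    conn.execute("UPDATE immutable_log SET entry = '{}'")
except sqlite3.DatabaseError:
    pass  # rewriting history is rejected by the trigger
```

The point is that even a well-meaning developer cannot quietly rewrite history; corrections get logged as new entries instead.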

Logs are only useful if you can search them, so build indexing around them: at minimum searchable by user, date, AI component, and decision type. Tools like Grafana Loki or a small Elastic instance handle this.

Retention runs into a dual requirement: GDPR demands deletion when data is no longer necessary; the AI Act requires retention for the time the system is in production plus a reasonable post-deployment investigation window. In practice that lands on a 24-month retention default for most use cases, with explicit deletion routines for personal data.
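One way to reconcile the two requirements is a two-stage routine: scrub personal data early, drop whole records at the retention horizon. The sketch below assumes a 90-day window for personal data and the 24-month default from above; both windows and the choice of which fields count as personal data are assumptions you must set per use case:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=730)       # ~24-month default from the text
PII_RETENTION = timedelta(days=90)    # assumption: scrub personal data sooner
PII_FIELDS = {"input", "actor"}       # assumption: where personal data may sit

def apply_retention(entries, now=None):
    """Scrub personal data past PII_RETENTION, drop records past RETENTION."""
    now = now or datetime.now(timezone.utc)
    kept = []
    for e in entries:
        age = now - datetime.fromisoformat(e["timestamp"])
        if age > RETENTION:
            continue  # hard delete after the retention window
        if age > PII_RETENTION:
            e = {k: ("[redacted]" if k in PII_FIELDS else v)
                 for k, v in e.items()}
        kept.append(e)
    return kept
```

Run as a scheduled job, this gives you a documented, testable deletion routine to point at when a GDPR question about storage limitation comes in.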

Finally: not everyone gets to see the logs. Separate roles work best — developers see anonymized logs for debugging, compliance officers see full logs for audits. And access itself is logged, because meta-logging is what makes an audit of your audit possible.
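The role split plus meta-logging can be sketched as a single access function; the role names and which fields count as identifying are assumptions to adapt to your own setup:

```python
access_log = []  # the meta-log: who read the audit log, when, how much

ANONYMIZED_FIELDS = {"input", "actor"}  # assumption: fields hidden from developers

def read_logs(entries, user, role):
    """Return a role-appropriate view of the logs and record the access itself."""
    access_log.append({"user": user, "role": role, "count": len(entries)})
    if role == "compliance":
        return entries  # full records for audits
    if role == "developer":
        # anonymized view for debugging
        return [{k: v for k, v in e.items() if k not in ANONYMIZED_FIELDS}
                for e in entries]
    raise PermissionError("no log access for role: " + role)
```

Because every read lands in `access_log`, you can later show an auditor not only what the AI did, but also who looked at those records and when.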

This setup typically costs €200–€800 per month in tooling for an SMB, plus a one-off €5,000–€15,000 to set up.

What to log per use case

Log content varies by AI application type. Four common patterns:

Chatbot or customer service AI

Per conversation: user question, retrieved knowledge source, answer, any escalation to human, whether the user was satisfied. Important extra field: confidence score of the answer — low-confidence answers that didn't escalate are the first place problems show up.

Document or contract analysis

Per analyzed document: document type, model that ran, extracted data points, any rejected or flagged elements, human review outcome. This connects directly to what we cover in AI document processing.

Decision support (recruiting, lending, pricing)

Per decision: all input variables, generated score, threshold value, final decision, whether a human reviewed it, and if so, whether that human overruled. This is high-risk under the AI Act — logging here must be stricter than for other use cases.
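For this high-risk pattern, the record should make the override explicit. A sketch of such a record builder, with illustrative field names and a simple threshold rule standing in for your actual decision logic:

```python
def decision_support_record(inputs, score, threshold,
                            reviewer=None, reviewer_decision=None):
    """Build one audit record for a score-based decision, capturing any human override."""
    system_decision = "approve" if score >= threshold else "reject"
    final = reviewer_decision or system_decision
    return {
        "inputs": inputs,                  # all input variables
        "score": score,                    # generated score
        "threshold": threshold,            # threshold in force at decision time
        "system_decision": system_decision,
        "human_reviewed": reviewer is not None,
        "reviewer": reviewer,
        "overruled": (reviewer_decision is not None
                      and reviewer_decision != system_decision),
        "final_decision": final,
    }
```

Logging the threshold alongside the score matters: if thresholds change over time, you can still reconstruct why a given score led to a given outcome on a given day.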

Generative AI for content

Per generated piece: prompt, model, version, output, where it was published, whether a human reviewed before it went out. This connects to transparency requirements — more in AI transparency requirements for businesses.


Three common mistakes

Three problems that come up in every audit of AI logging:

Logging too much. Some businesses literally log every API call with the full prompt and output, including personal data, and keep it indefinitely. That's not an audit trail — it's a GDPR violation that regulators will enforce. Logs must be purpose-driven: only what's necessary for accountability.

Logging too little. Other businesses log only "AI used" without any content. That's legally equivalent to logging nothing — you can't reconstruct a decision. An auditor will expect to be able to retrieve and trace an individual decision.

Logging without being able to read it. Logs without search, without reporting, without periodic review are dead data. It takes weeks to find anything, and no one looks at it proactively. The value of logging sits in its accessibility.


How to organize this internally

Logging is a technical topic that quickly becomes an organizational problem the moment ownership comes up. Three roles you should explicitly assign:

Logging owner (technical). Someone in IT or engineering who maintains the pipelines and storage and detects disruptions. In an SMB, often the CTO or lead developer.

Compliance reviewer. Someone who periodically (per quarter) walks through a sample of logs and checks whether the system is doing what it should. Often the privacy point of contact or an external consultant.

Incident responder. Someone who can respond within 24 hours when a question comes from a customer, regulator, or a reported error. Doesn't have to be full-time — but clearly named.

These roles map directly onto the broader AI governance framework for SMBs and work best in combination with a current AI compliance checklist. Anyone who's run a DPIA will find much of the input for logging requirements already in that document — see DPIA for AI projects.

What it costs

For an SMB with several AI systems in production:

Component                                   Cost
Logging infrastructure (storage + search)   €100–€500/month
Wrapper and pipeline implementation         €5,000–€15,000 one-off
Ongoing maintenance and review              €500–€1,500/month
External auditor (if needed)                €2,000–€8,000/year

Year 1 total: €12,000–€35,000. Against a potential AI Act fine of up to €15 million (or 3% of global turnover), this is cheap insurance. And the operational value — debug capability, customer trust, improvement insight — typically pays back independent of regulation.

The core: logs that hold up in an audit, not logs that look pretty

An audit trail that works is one that survives when someone in a real audit situation asks: "show me what this system did on March 14 at 11:23 with customer X." If you can show that in ten minutes — input, output, decision, human review — you're fine. If it takes three days to find or you have to say "we're not logging that" — then the work you've done is compliance theater, not compliance.

So build your logs from that scenario backward, not from a template list of fields. Ask yourself: if the data protection authority or AI Act regulator knocked on the door tomorrow, could I show within a working day what my AI did over the past month? If the answer is "no," you know where the first investment needs to land.

Curious how much time you could save?

Request a free automation scan. We'll analyze your processes and show you where the gains are — no strings attached.