# AI Security Checklist 2026

> The 45-point checklist DField Solutions runs on every AI audit engagement.
> Free · CC-BY 4.0 · attribution: [dfieldsolutions.com](https://dfieldsolutions.com)

**Legend**: **S** Severe · **H** High · **M** Medium · **L** Low
**Scope**: production LLM systems (chat, RAG, agents, classifiers)

---

## 1 · Data (10 items)

1. **[S]** Data entering the AI system is classified into four tiers (public / internal / confidential / GDPR personal). Only `public` and `internal` reach the prompt.
2. **[S]** PII is masked before the request hits the LLM (regex + ML detector, e.g. Presidio).
3. **[H]** Prompt and response logs are retained for at most 30 days and encrypted at rest.
4. **[H]** Training data exclusion clause exists in the provider contract (no-train, in writing).
5. **[M]** Embeddings are encrypted if stored outside the primary region.
6. **[M]** Access to the vector DB is tenant-scoped at the query layer (Postgres RLS or similar).
7. **[H]** Raw embeddings are never exposed via a public API endpoint.
8. **[M]** DPA (Data Processing Agreement) is signed with the LLM vendor.
9. **[L]** Data-lineage tracking logs where each ingested document came from.
10. **[M]** PII-classifier eval runs monthly against a gold set.
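
Items 1 and 2 can be sketched as a small filter that runs before prompt assembly. This is a minimal illustration under stated assumptions, not a production detector: the tier names come from item 1, while `PII_PATTERNS`, `mask_pii`, `build_prompt_context`, and the document dictionary shape are hypothetical. It covers only the regex half of item 2; an ML detector such as Presidio would run alongside it.

```python
import re

# Tiers allowed to reach the prompt (item 1).
ALLOWED_TIERS = {"public", "internal"}

# Hypothetical patterns for illustration; real coverage must be far broader.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def mask_pii(text: str) -> str:
    """Replace each regex match with a typed placeholder such as <EMAIL>."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


def build_prompt_context(documents: list[dict]) -> list[str]:
    """Drop documents above the allowed tiers, then mask PII in the rest."""
    return [
        mask_pii(doc["text"])
        for doc in documents
        if doc["tier"] in ALLOWED_TIERS
    ]
```

Filtering by tier first means confidential text is never even passed to the masking step, so a masking miss cannot leak it.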

## 2 · Model (7 items)

11. **[H]** Model version is pinned (explicit versioned identifier, not `latest`).
12. **[H]** Model provider region complies with data residency requirements (EU-region for GDPR).
13. **[M]** A fallback model is wired via a router (different provider preferred).
14. **[M]** Evals are re-run on every model version upgrade before promotion to prod.
15. **[L]** Cost-tracking dashboards exist per model / per feature.
16. **[M]** Model-output sampling is deterministic enough for reproducibility (temperature documented).
17. **[M]** On-prem / local model option exists for regulated workloads.
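
Items 11 and 13 lend themselves to a short sketch: a check that rejects floating model aliases, and a router that falls through to the next provider on failure. Everything here is an assumption for illustration: `is_pinned`, `call_with_fallback`, and the model identifiers are hypothetical, and each provider is modelled as a plain callable rather than a real SDK client.

```python
import re
from typing import Callable, Sequence


def is_pinned(model_id: str) -> bool:
    """Reject empty identifiers and any identifier containing a `latest` token."""
    tokens = re.split(r"[-:@/]", model_id.lower())
    return bool(model_id) and "latest" not in tokens


def call_with_fallback(
    prompt: str, providers: Sequence[Callable[[str], str]]
) -> str:
    """Try each provider in order; return the first successful response."""
    last_error = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

For example, `call_with_fallback(prompt, [primary, secondary])` returns the secondary provider's answer only when the primary raises.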

## 3 · Prompt (9 items)

18. **[S]** 200+ prompt-injection patterns run in CI; the build fails if more than 5% of them succeed.
19. **[S]** User-submitted content is segmented with an unambiguous delimiter and marked as untrusted.
20. **[H]** System prompt contains no secrets (API keys, internal URLs, customer PII, business rules).
21. **[H]** Indirect prompt injection via document uploads is tested (white-on-white text, invisible markers).
22. **[H]** RAG-index poisoning is mitigated (source attribution, chunk-level classifier).
23. **[M]** Instruction hierarchy is explicit in the system prompt (system > developer > user > retrieved).
24. **[M]** Context window budget is enforced; long inputs are summarised before the primary call.
25. **[L]** Few-shot examples in the prompt don't leak customer data.
26. **[L]** Jailbreak prompts from public lists are blocked at the gateway.
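
As a rough sketch of items 18 and 19: the first function fences user content with a random marker, and the second turns a list of injection test outcomes into a pass/fail build decision using the 5% threshold from item 18. The function names and the delimiter format are hypothetical; the checklist does not prescribe a particular delimiter.

```python
import secrets


def wrap_untrusted(content: str) -> str:
    """Fence user content with a random marker the user cannot predict."""
    nonce = secrets.token_hex(8)
    return (
        f"[UNTRUSTED-{nonce} START]\n"
        f"{content}\n"
        f"[UNTRUSTED-{nonce} END]"
    )


def injection_gate(attack_succeeded: list[bool], max_rate: float = 0.05) -> bool:
    """Return True when the build may pass (success rate is at most max_rate)."""
    if not attack_succeeded:
        raise ValueError("no injection test results supplied")
    rate = sum(attack_succeeded) / len(attack_succeeded)
    return rate <= max_rate
```

A fresh nonce per request means submitted text cannot close the fence early by guessing the end marker. The delimiter alone does not stop injection; it only gives the system prompt something unambiguous to refer to.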

## 4 · Access (9 items)

27. **[S]** All LLM calls route through a server-side proxy (no client-side API keys).
28. **[S]** Per-user rate-limit and token cap are enforced.
29. **[H]** Tool-call wrappers re-authorise: the LLM can call the tool, but the wrapper enforces that the user can access the target data.
30. **[H]** LLM-database reads respect row-level security / multi-tenant boundaries.
31. **[H]** Global rate-limit prevents cost explosion (per-minute + per-hour cap).
32. **[M]** Service account for the LLM has the minimum necessary IAM scope.
33. **[M]** API tokens to the LLM provider are rotated quarterly.
34. **[L]** Admin-only debug prompts require SSO / MFA.
35. **[M]** Audit log captures user, prompt, response, tool calls, and timing for every request.
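
The re-authorisation pattern in item 29 might look like the decorator below. It is a sketch under assumptions: `reauthorise`, `AccessDenied`, `read_document`, and the in-memory `ACL` are all hypothetical stand-ins for a real authorisation service.

```python
import functools
from typing import Callable


class AccessDenied(Exception):
    """Raised when the calling user may not touch the requested resource."""


def reauthorise(can_access: Callable[[str, str], bool]):
    """Wrap a tool so every call is checked against the user's own permissions."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(user_id: str, resource_id: str, *args, **kwargs):
            if not can_access(user_id, resource_id):
                raise AccessDenied(f"{user_id} may not access {resource_id}")
            return tool(user_id, resource_id, *args, **kwargs)
        return wrapper
    return decorator


# Hypothetical in-memory ACL; a real system would query its authorisation service.
ACL = {("alice", "doc-1")}


@reauthorise(lambda user, resource: (user, resource) in ACL)
def read_document(user_id: str, resource_id: str) -> str:
    return f"contents of {resource_id}"
```

The important design choice is that `user_id` comes from the authenticated session, never from arguments the model generated. The model chooses `resource_id`; the wrapper decides whether this user may see it.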

## 5 · Response (10 items)

36. **[S]** Output validator checks for disallowed content (NSFW, PII leak, disallowed personas).
37. **[S]** Reverse PII check on the response (no accidental customer data in generated text).
38. **[H]** Hallucination eval runs on a gold set in CI.
39. **[H]** Real-time alert fires when the response contains known sensitive patterns.
40. **[H]** Output HTML / Markdown is sanitised on the client to prevent data exfiltration via image URLs.
41. **[M]** Citation / source-attribution is accurate (retrieved docs match claimed sources).
42. **[M]** A `reject & retry` loop runs when the validator finds a problem.
43. **[L]** Response is streamed incrementally (a UX item rather than a security control; it improves perceived latency).
44. **[M]** Cache keys for response caching include user / tenant ID.
45. **[S]** Observability covers all 44 items above · if you can't see it, it's not controlled.
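
Items 37 and 42 combine naturally into a validate-then-retry loop. The sketch below is illustrative only: `find_problems` and `generate_with_retry` are hypothetical names, the single email regex stands in for a full reverse PII check, and `generate` is any callable that takes the attempt number and returns model output.

```python
import re
from typing import Callable, Optional

# Stand-in for a full reverse PII check (item 37).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_problems(response: str) -> list[str]:
    """Return a list of validation failures; an empty list means the response is clean."""
    problems = []
    if EMAIL.search(response):
        problems.append("email address in output")
    return problems


def generate_with_retry(
    generate: Callable[[int], str], max_attempts: int = 3
) -> Optional[str]:
    """Regenerate until the validator passes; return None if every attempt fails."""
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if not find_problems(candidate):
            return candidate
    return None
```

Returning `None` after the attempt budget is spent lets the caller substitute a safe refusal message instead of shipping a response that failed validation.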

---

## Scoring

| Score | Band | Action |
|-------|------|--------|
| 0-34% | **Critical** | Stop ship. Fix Severe + High items this week. |
| 35-59% | **At risk** | Ship blocked on any unresolved Severe item. Fix High items this sprint. |
| 60-84% | **Stable** | OK to ship. Schedule Medium items for the next two sprints. |
| 85-100% | **Strong** | Ship with confidence. Annual refresh + eval recalibration. |
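
The band lookup can be sketched in a few lines. Two assumptions are not stated in the table and are labelled as such: the score is taken to be the unweighted share of the 45 items that pass, and fractional scores are assigned to the band whose lower bound they have reached. `score_percent` and `band` are hypothetical names.

```python
def score_percent(passed: int, total: int = 45) -> float:
    """Assumption: every item counts equally, regardless of severity."""
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return 100.0 * passed / total


def band(percent: float) -> str:
    """Map a percentage score to the band names used in the scoring table."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if percent < 35:
        return "Critical"
    if percent < 60:
        return "At risk"
    if percent < 85:
        return "Stable"
    return "Strong"
```

For instance, 27 passing items out of 45 gives 60%, which falls in the Stable band. An unresolved Severe item still blocks shipping in the At risk band, so the band alone is not the full ship decision.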

---

## Attribution

Published by DField Solutions · [dfieldsolutions.com](https://dfieldsolutions.com)
Released under Creative Commons Attribution 4.0 (CC-BY 4.0).
Cite as: DField Solutions, "AI Security Checklist 2026" (April 2026).
