SOC 2 for AI Systems: Compliance Guide

Enterprise buyers increasingly require SOC 2 reports for AI vendors. This guide maps SOC 2's Trust Service Criteria to AI-specific risks, shows which controls you need for LLMs, RAG systems, and agent deployments, and provides a practical path to audit readiness.

Key Takeaways

  • SOC 2 is technology-neutral — no AI-specific criteria, but auditors expect AI-specific controls
  • Map AI risks to the 5 Trust Service Criteria: Security, Availability, Processing Integrity, Confidentiality, Privacy
  • AI-specific controls: model governance, output monitoring, prompt injection defense, data lineage, human oversight
  • Third-party AI providers (OpenAI, Anthropic) are sub-service organizations — you need their SOC 2 reports
  • Timeline: 6-9 months for Type I, 12-18 months for Type II from scratch

SOC 2 Overview

SOC 2 (System and Organization Controls 2) evaluates an organization's controls related to security, availability, processing integrity, confidentiality, and privacy. Created by the AICPA, it's the baseline security assurance framework for SaaS and technology companies.

  • Type I: Point-in-time assessment — are controls properly designed? Faster to achieve, less convincing.
  • Type II: Operating effectiveness over a period (3-12 months) — are controls actually working? The standard enterprise buyers expect.

For AI companies and companies deploying AI systems, SOC 2 demonstrates that you handle customer data responsibly, your AI systems are reliable, and you have controls preventing unauthorized data access or processing.

TSC Mapping for AI

Map each Trust Service Criteria category to AI-specific risks:

| TSC Category | AI-Specific Risks | Control Examples |
|---|---|---|
| Security | Prompt injection, model theft, API key exposure | Input sanitization, model access controls, secrets management |
| Availability | Model degradation, LLM provider outages, GPU capacity | Fallback models, multi-provider, capacity planning |
| Processing Integrity | Hallucination, data drift, output errors | Output validation, drift monitoring, evaluation suites |
| Confidentiality | Training data leakage, prompt data exposure | Data encryption, model isolation, zero-retention APIs |
| Privacy | PII in training data, automated profiling | PII detection, consent management, data minimization |

Security Controls

Security is the only mandatory TSC category. For AI systems:

  • Access Control: RBAC for model management, API access, and training data. Separate roles for model developers, operators, and consumers.
  • Network Security: AI services in private subnets, API gateway with rate limiting, WAF for web-facing AI endpoints.
  • Input Validation: Prompt injection defenses — input sanitization layers, adversarial input detection, output filtering.
  • Secrets Management: API keys for LLM providers in Vault/Secrets Manager. Rotation policies. No hardcoded keys.
  • Vulnerability Management: Regular scanning of AI dependencies (LangChain, transformers, vector DB libraries). Patch management SLAs.
  • Incident Response: AI-specific incident procedures — model poisoning, data breach through AI output, prompt injection attacks.
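
The input validation control above can be sketched as a simple screening layer. This is a minimal illustration, not a production defense: the pattern list is hypothetical, and real deployments would pair heuristics like these with a maintained detection classifier.

```python
import re

# Hypothetical patterns for illustration only; a static regex list is
# easy to evade and should be backed by a dedicated detection model.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"reveal (the )?system prompt",
    r"you are now",
]

def screen_prompt(user_input: str) -> tuple[bool, str]:
    """Return (allowed, sanitized_input) for an incoming user prompt."""
    lowered = user_input.lower()
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lowered):
            return False, ""  # block and log for incident review
    # Strip control characters that can hide instructions from human reviewers.
    sanitized = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", user_input)
    return True, sanitized
```

Blocked prompts should feed the incident-response log described above, so prompt injection attempts become auditable evidence rather than silent drops.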

Availability Controls

  • Uptime SLAs: Define SLAs for AI endpoints (99.9% = 8.76 hours downtime/year). Different tiers for different AI services.
  • Redundancy: Multi-model fallback — primary LLM → secondary LLM → rule-based fallback. See production deployment patterns.
  • Capacity Planning: Monitor GPU utilization, queue depth, and API rate limit consumption. Auto-scale inference infrastructure.
  • Disaster Recovery: Model artifacts in versioned storage. Vector database backups with tested restoration procedures. Recovery time objectives (RTO) documented and tested.
  • Provider Dependency: Document dependency on third-party LLM providers. Have contingency plans for provider outages or API changes.
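
The primary LLM → secondary LLM → rule-based fallback chain can be sketched as an ordered provider list. The provider functions below are placeholders standing in for real client calls (one is wired to fail to simulate an outage).

```python
from collections.abc import Callable

def call_primary_llm(prompt: str) -> str:
    # Placeholder for the primary provider's client call.
    raise TimeoutError("provider outage")  # simulated outage

def call_secondary_llm(prompt: str) -> str:
    # Placeholder for the secondary provider's client call.
    return f"[secondary] answer to: {prompt}"

def rule_based_fallback(prompt: str) -> str:
    return "Service is degraded; returning a canned response."

# Ordered by preference: primary, secondary, then the rule-based tier.
PROVIDERS: list[Callable[[str], str]] = [
    call_primary_llm,
    call_secondary_llm,
    rule_based_fallback,
]

def complete(prompt: str) -> str:
    last_error: Exception | None = None
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as exc:  # in production: log, emit metrics, try next tier
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

Each tier transition is an availability event worth logging, since auditors will ask for evidence that the fallback actually operates.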

Processing Integrity

Processing integrity ensures AI outputs are accurate and reliable:

  • Output Validation: Automated quality checks on AI outputs — format validation, factual consistency, hallucination detection. Log quality scores.
  • Evaluation Suites: Versioned test datasets that run before every model update. Track accuracy, precision, recall, and domain-specific metrics.
  • Data Drift Monitoring: Statistical comparison of input distributions between baseline and production. Alert on significant distribution shifts.
  • Human Review: Sampling-based human review of AI outputs. Minimum review rate based on risk classification. Results feed back into model improvement.
  • Change Management: Model updates go through defined change management — testing, approval, staged rollout, rollback capability.
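
One common statistical comparison for the drift monitoring control is the Population Stability Index (PSI), sketched here with a fixed-width binning scheme. The threshold is a rule of thumb, not a standard; tune it per model.

```python
import math

def psi(baseline: list[float], production: list[float], bins: int = 10) -> float:
    """Population Stability Index between two samples (illustrative only)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0

    def bucket_fracs(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = max(min(int((v - lo) / width), bins - 1), 0)
            counts[idx] += 1
        # A small epsilon avoids log(0) for empty buckets.
        return [max(c / len(values), 1e-6) for c in counts]

    b, p = bucket_fracs(baseline), bucket_fracs(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))

# Common rule of thumb: PSI > 0.2 signals meaningful drift worth alerting on.
```

Baseline distributions should be versioned alongside the model so drift alerts can be traced to a specific reference snapshot during an audit.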

Confidentiality

  • Data Classification: Classify data that flows through AI systems (public, internal, confidential, restricted). Apply controls based on classification.
  • Encryption: TLS 1.3 in transit, AES-256 at rest for all AI data stores (vector databases, model artifacts, training data, logs).
  • Data Minimization: Only send necessary data to AI models. Strip unnecessary fields before API calls. Implement zero-retention policies with LLM providers.
  • Training Data Security: Secure storage of training datasets. Access logging. Prevent unauthorized copy/export.
  • Output Leakage Prevention: Ensure AI outputs don't reveal confidential data to unauthorized users. Multi-tenant isolation in RAG systems.
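
The data minimization control can be as simple as an allow-list applied before any payload leaves the trust boundary. The field names below are hypothetical; the point is that fields are dropped by default and shared only by explicit decision.

```python
# Only fields on the allow-list are ever sent to the LLM provider.
ALLOWED_FIELDS = {"ticket_id", "subject", "body"}

def minimize(record: dict) -> dict:
    """Drop every field not explicitly allow-listed for AI processing."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}

record = {
    "ticket_id": "T-1042",
    "subject": "Login issue",
    "body": "Cannot sign in since Tuesday.",
    "customer_email": "user@example.com",   # never leaves the boundary
    "internal_notes": "VIP account",        # never leaves the boundary
}
payload = minimize(record)
```

An allow-list is preferable to a deny-list here: new fields added to the schema stay confidential until someone deliberately approves them for AI processing.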

Privacy

  • PII Detection: Automated PII scanning in AI inputs and outputs. Block or redact PII that shouldn't be processed.
  • Consent Management: Track consent for data used in AI processing. Honor opt-out requests for AI-based profiling.
  • Data Subject Rights: Implement right-to-delete across AI data stores — training data, embeddings, vector databases, model weights (where applicable).
  • Automated Decision Notices: When AI makes decisions affecting individuals, provide notice and explanation per applicable regulations.
  • Cross-Border Transfers: If AI processing occurs in different jurisdictions than data collection, document transfer mechanisms. See GDPR AI compliance for EU requirements.
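
A minimal sketch of the PII detection control, using regex-based redaction. Production systems typically use a dedicated PII detection service with far higher recall than simple patterns; this only illustrates the block-or-redact flow.

```python
import re

# Illustrative patterns only; real PII detection needs NER-based tooling.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with a typed placeholder before AI processing."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Applying the same redaction to AI outputs as to inputs also supports the confidentiality controls above, since a model can echo PII it was given earlier in a conversation.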

AI-Specific Controls

Beyond standard SOC 2 controls, implement these AI-specific policies:

Model Governance

  • Model inventory — catalog all models in production with version, purpose, data sources, owners
  • Model risk assessment — classify models by risk level (automated decisions, financial impact, safety)
  • Approval workflow — documented process for promoting models to production
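
The model inventory and risk classification above can be captured in a simple structured record. The schema and example values are hypothetical; the useful property is that promotion checks become a query over the inventory rather than tribal knowledge.

```python
from dataclasses import dataclass
from enum import Enum

class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"  # automated decisions, financial impact, or safety

@dataclass
class ModelRecord:
    name: str
    version: str
    purpose: str
    data_sources: list[str]
    owner: str
    risk: RiskLevel
    approved_for_production: bool = False

inventory: list[ModelRecord] = [
    ModelRecord(
        name="support-triage",
        version="2.3.1",
        purpose="Route inbound support tickets",
        data_sources=["tickets-2024", "kb-articles"],
        owner="ml-platform@example.com",
        risk=RiskLevel.MEDIUM,
    ),
]

def production_blockers(models: list[ModelRecord]) -> list[str]:
    """High-risk models must be explicitly approved before promotion."""
    return [
        m.name
        for m in models
        if m.risk is RiskLevel.HIGH and not m.approved_for_production
    ]
```

Exporting this inventory on a schedule gives the auditor continuous evidence that the approval workflow is actually enforced.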

Data Lineage

  • Track training data sources, preprocessing steps, and data quality metrics
  • Document which data each model was trained/fine-tuned on
  • Maintain data provenance for RAG knowledge bases
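
For RAG knowledge bases, provenance can be recorded per ingested document: a content hash ties each embedded chunk back to an exact source version. The record shape below is a hypothetical sketch, not a standard.

```python
import hashlib
from datetime import datetime, timezone

def lineage_record(source_uri: str, content: bytes, pipeline_version: str) -> dict:
    """Provenance entry stored alongside each document's embeddings."""
    return {
        "source_uri": source_uri,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "pipeline_version": pipeline_version,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

rec = lineage_record("s3://kb/policies/refunds.md", b"Refunds are ...", "ingest-1.4.0")
```

Storing the pipeline version in each record also supports the right-to-delete control in the Privacy section: you can find every embedding derived from a given source and remove it.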

Responsible AI

  • Bias testing and fairness metrics for models that affect individuals
  • Transparency documentation — what the model does, how it was trained, known limitations
  • Human oversight requirements based on decision impact
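
One simple fairness metric for the bias testing item above is the demographic parity gap: the difference in positive-outcome rates between groups. This is an illustrative sketch; which metric and threshold are appropriate depends on the model and the applicable regulation.

```python
def selection_rates(outcomes: list[tuple[str, bool]]) -> dict[str, float]:
    """Positive-outcome rate per group from (group, selected) pairs."""
    totals: dict[str, int] = {}
    positives: dict[str, int] = {}
    for group, selected in outcomes:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(selected)
    return {g: positives[g] / totals[g] for g in totals}

def parity_gap(outcomes: list[tuple[str, bool]]) -> float:
    """Largest difference in selection rate between any two groups."""
    rates = selection_rates(outcomes).values()
    return max(rates) - min(rates)

# Example: group A selected 2/3, group B selected 1/3 → gap of about 0.33,
# which would trigger review at a (hypothetical) 0.2 threshold.
data = [("A", True), ("A", True), ("A", False),
        ("B", True), ("B", False), ("B", False)]
```

Running checks like this on every model release, and archiving the results, turns the responsible-AI policy into testable audit evidence.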

Audit Preparation

  1. Months 1-2: Gap Assessment — Evaluate current controls against TSC. Identify gaps, especially AI-specific controls.
  2. Months 3-4: Control Design — Write control descriptions, policies, and procedures. Implement missing technical controls. Document AI-specific governance.
  3. Months 5-6: Control Implementation — Deploy monitoring, logging, and access controls. Train team on procedures. Begin evidence collection.
  4. Month 7: Type I Readiness — Internal review. Mock audit. Fix gaps. Engage auditor.
  5. Months 8-9: Type I Audit — Auditor evaluates control design. Receive report.
  6. Months 10-18: Type II Observation — Controls operate for 6-12 months. Collect evidence continuously. Auditor tests operating effectiveness.

Ready to build SOC 2-compliant AI systems? Explore our enterprise AI consulting services.

Frequently Asked Questions

Does SOC 2 have specific requirements for AI?

No — SOC 2's Trust Service Criteria are technology-neutral. But auditors increasingly expect AI-specific controls: model governance, data lineage, output monitoring, prompt injection defenses, and human oversight. Map AI risks to existing TSC categories.

How do I handle third-party AI models in SOC 2?

Third-party AI providers are sub-service organizations. You need their SOC 2 report, a data processing agreement, documented monitoring, and compensating controls for risks they don't cover.

What's the timeline to achieve SOC 2?

Type I: 6-9 months. Type II: 12-18 months from scratch. For AI systems, add 1-2 months for AI-specific control design.

Build Compliant AI Systems

SOC 2, HIPAA, and GDPR-ready AI — from architecture to audit.

Start a Project