SOC 2 for AI Systems: Compliance Guide
Enterprise buyers increasingly require SOC 2 reports from AI vendors. This guide maps SOC 2's Trust Services Criteria to AI-specific risks, shows which controls you need for LLMs, RAG systems, and agent deployments, and provides a practical path to audit readiness.
Key Takeaways
- SOC 2 is technology-neutral — no AI-specific criteria, but auditors expect AI-specific controls
- Map AI risks to the five Trust Services Criteria categories: Security, Availability, Processing Integrity, Confidentiality, Privacy
- AI-specific controls: model governance, output monitoring, prompt injection defense, data lineage, human oversight
- Third-party AI providers (OpenAI, Anthropic) are sub-service organizations — you need their SOC 2 reports
- Timeline: 6-9 months for Type I, 12-18 months for Type II from scratch
SOC 2 Overview
SOC 2 (System and Organization Controls 2) evaluates an organization's controls related to security, availability, processing integrity, confidentiality, and privacy. Created by the AICPA, it's the baseline security assurance framework for SaaS and technology companies.
- Type I: Point-in-time assessment — are controls properly designed? Faster to achieve, less convincing.
- Type II: Operating effectiveness over a period (3-12 months) — are controls actually working? The standard enterprise buyers expect.
For AI companies and companies deploying AI systems, SOC 2 demonstrates that you handle customer data responsibly, your AI systems are reliable, and you have controls preventing unauthorized data access or processing.
TSC Mapping for AI
Map each Trust Services Criteria category to AI-specific risks:
| TSC Category | AI-Specific Risks | Control Examples |
|---|---|---|
| Security | Prompt injection, model theft, API key exposure | Input sanitization, model access controls, secrets management |
| Availability | Model degradation, LLM provider outages, GPU capacity | Fallback models, multi-provider, capacity planning |
| Processing Integrity | Hallucination, data drift, output errors | Output validation, drift monitoring, evaluation suites |
| Confidentiality | Training data leakage, prompt data exposure | Data encryption, model isolation, zero-retention APIs |
| Privacy | PII in training data, automated profiling | PII detection, consent management, data minimization |
Security Controls
Security is the only mandatory TSC category. For AI systems:
- Access Control: RBAC for model management, API access, and training data. Separate roles for model developers, operators, and consumers.
- Network Security: AI services in private subnets, API gateway with rate limiting, WAF for web-facing AI endpoints.
- Input Validation: Prompt injection defenses — input sanitization layers, adversarial input detection, output filtering.
- Secrets Management: API keys for LLM providers in Vault/Secrets Manager. Rotation policies. No hardcoded keys.
- Vulnerability Management: Regular scanning of AI dependencies (LangChain, transformers, vector DB libraries). Patch management SLAs.
- Incident Response: AI-specific incident procedures — model poisoning, data breach through AI output, prompt injection attacks.
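As a sketch of the input-validation control above, here is a minimal pattern-based screen. The phrase patterns and length limit are illustrative assumptions, not a complete defense; production systems layer classifier-based detection and output filtering on top of heuristics like these.

```python
import re

# Illustrative patterns only -- a real deployment maintains a broader,
# regularly updated set and combines it with ML-based detection.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
    r"disregard .* rules",
]

MAX_INPUT_CHARS = 4000  # assumed cap before input reaches the model


def screen_input(user_input: str) -> tuple[bool, str]:
    """Return (allowed, reason). Flags inputs matching known injection phrasing."""
    if len(user_input) > MAX_INPUT_CHARS:
        return False, "input exceeds length limit"
    lowered = user_input.lower()
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, lowered):
            return False, f"matched injection pattern: {pattern}"
    return True, "ok"
```

Logging every rejection (with the matched pattern) also produces the audit evidence a Type II assessment expects.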
Availability Controls
- Uptime SLAs: Define SLAs for AI endpoints (99.9% = 8.76 hours downtime/year). Different tiers for different AI services.
- Redundancy: Multi-model fallback — primary LLM → secondary LLM → rule-based fallback. See production deployment patterns.
- Capacity Planning: Monitor GPU utilization, queue depth, and API rate limit consumption. Auto-scale inference infrastructure.
- Disaster Recovery: Model artifacts in versioned storage. Vector database backups with tested restoration procedures. Recovery time objectives (RTO) documented and tested.
- Provider Dependency: Document dependency on third-party LLM providers. Have contingency plans for provider outages or API changes.
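The multi-model fallback chain above (primary LLM → secondary LLM → rule-based fallback) can be sketched as follows; the provider names and callables are hypothetical placeholders for real client calls.

```python
from collections.abc import Callable


def call_with_fallback(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
    rule_based_fallback: Callable[[str], str],
) -> tuple[str, str]:
    """Try each provider in order; fall back to a rule-based answer if all fail.

    Returns (source, response) so availability metrics can track which
    tier actually served each request.
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            continue  # in production: log the failure, then try the next tier
    return "rule-based", rule_based_fallback(prompt)
```

Returning the serving tier alongside the response lets you alert when fallback rates rise, which is itself evidence for the availability criteria.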
Processing Integrity
Processing integrity ensures AI outputs are accurate and reliable:
- Output Validation: Automated quality checks on AI outputs — format validation, factual consistency, hallucination detection. Log quality scores.
- Evaluation Suites: Versioned test datasets that run before every model update. Track accuracy, precision, recall, and domain-specific metrics.
- Data Drift Monitoring: Statistical comparison of input distributions between baseline and production. Alert on significant distribution shifts.
- Human Review: Sampling-based human review of AI outputs. Minimum review rate based on risk classification. Results feed back into model improvement.
- Change Management: Model updates go through defined change management — testing, approval, staged rollout, rollback capability.
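A minimal sketch of the drift-monitoring control, assuming the Population Stability Index (PSI) as the comparison statistic and the commonly cited screening thresholds (below 0.1 stable, 0.1-0.25 watch, above 0.25 alert):

```python
import math


def psi(baseline: list[float], production: list[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of a numeric feature.

    Assumed rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant drift worth alerting on.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bucket_fractions(sample: list[float]) -> list[float]:
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[max(idx, 0)] += 1
        # small epsilon so empty buckets don't blow up the log term
        return [(c + 1e-6) / (len(sample) + 1e-6 * bins) for c in counts]

    p, q = bucket_fractions(baseline), bucket_fractions(production)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))
```

Running this per input feature on a schedule, and recording the scores, doubles as the continuous evidence collection the audit requires.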
Confidentiality
- Data Classification: Classify data that flows through AI systems (public, internal, confidential, restricted). Apply controls based on classification.
- Encryption: TLS 1.3 in transit, AES-256 at rest for all AI data stores (vector databases, model artifacts, training data, logs).
- Data Minimization: Only send necessary data to AI models. Strip unnecessary fields before API calls. Implement zero-retention policies with LLM providers.
- Training Data Security: Secure storage of training datasets. Access logging. Prevent unauthorized copy/export.
- Output Leakage Prevention: Ensure AI outputs don't reveal confidential data to unauthorized users. Multi-tenant isolation in RAG systems.
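The data-minimization control above can be enforced with a field allow-list at the trust boundary before any LLM API call. The field names here are hypothetical; an allow-list (rather than a deny-list) fails closed, so fields added upstream are excluded by default until explicitly approved.

```python
# Hypothetical allow-list: only these fields may leave the trust boundary.
ALLOWED_FIELDS = {"ticket_id", "subject", "body"}


def minimize_payload(record: dict) -> dict:
    """Drop every field not on the allow-list before calling an LLM provider."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
```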
Privacy
- PII Detection: Automated PII scanning in AI inputs and outputs. Block or redact PII that shouldn't be processed.
- Consent Management: Track consent for data used in AI processing. Honor opt-out requests for AI-based profiling.
- Data Subject Rights: Implement right-to-delete across AI data stores — training data, embeddings, vector databases, model weights (where applicable).
- Automated Decision Notices: When AI makes decisions affecting individuals, provide notice and explanation per applicable regulations.
- Cross-Border Transfers: If AI processing occurs in different jurisdictions than data collection, document transfer mechanisms. See GDPR AI compliance for EU requirements.
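A minimal regex-based redaction pass for the PII-detection control; the patterns are illustrative assumptions, and real deployments pair regexes like these with NER-based detectors.

```python
import re

# Illustrative patterns only -- not an exhaustive PII taxonomy.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact_pii(text: str) -> str:
    """Replace matched PII with type-tagged placeholders before AI processing."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Type-tagged placeholders (rather than blanket removal) keep redacted prompts useful to the model while producing an auditable record of what was stripped.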
AI-Specific Controls
Beyond standard SOC 2 controls, implement these AI-specific policies:
Model Governance
- Model inventory — catalog all models in production with version, purpose, data sources, owners
- Model risk assessment — classify models by risk level (automated decisions, financial impact, safety)
- Approval workflow — documented process for promoting models to production
Data Lineage
- Track training data sources, preprocessing steps, and data quality metrics
- Document which data each model was trained/fine-tuned on
- Maintain data provenance for RAG knowledge bases
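One way to make the lineage items above auditable is to fingerprint each dataset state, so a model record can reference exactly the data it was trained on. The field names and URI below are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class LineageEntry:
    dataset: str
    source_uri: str
    preprocessing: list[str]  # ordered transformation steps
    row_count: int


def fingerprint(entry: LineageEntry) -> str:
    """Stable hash of the lineage record; same data state -> same fingerprint."""
    payload = json.dumps(asdict(entry), sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]
```

Storing the fingerprint in the model inventory ties each production model to a specific, verifiable data state.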
Responsible AI
- Bias testing and fairness metrics for models that affect individuals
- Transparency documentation — what the model does, how it was trained, known limitations
- Human oversight requirements based on decision impact
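As one concrete fairness metric for the bias testing above, here is a demographic parity ratio sketch, using the common "four-fifths" screening rule as an assumed threshold (ratios below 0.8 warrant investigation). Outcomes are encoded as 1 for a positive decision, 0 otherwise.

```python
def selection_rate(outcomes: list[int]) -> float:
    """Fraction of positive (1) outcomes in a group."""
    return sum(outcomes) / len(outcomes)


def demographic_parity_ratio(group_a: list[int], group_b: list[int]) -> float:
    """Ratio of positive-outcome rates between two groups (1.0 = parity).

    Assumed screening rule: flag for review when the ratio falls below 0.8.
    """
    rates = sorted([selection_rate(group_a), selection_rate(group_b)])
    return rates[0] / rates[1]
```

This is a screening statistic, not a verdict; flagged models go to the human-oversight process rather than being auto-rejected.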
Audit Preparation
- Months 1-2: Gap Assessment — Evaluate current controls against the Trust Services Criteria. Identify gaps, especially in AI-specific controls.
- Months 3-4: Control Design — Write control descriptions, policies, and procedures. Implement missing technical controls. Document AI-specific governance.
- Months 5-6: Control Implementation — Deploy monitoring, logging, and access controls. Train team on procedures. Begin evidence collection.
- Month 7: Type I Readiness — Internal review. Mock audit. Fix gaps. Engage auditor.
- Months 8-9: Type I Audit — Auditor evaluates control design. Receive report.
- Months 10-18: Type II Observation — Controls operate for 6-12 months. Collect evidence continuously. Auditor tests operating effectiveness.
Ready to build SOC 2-compliant AI systems? Explore our enterprise AI consulting services.
Frequently Asked Questions
Does SOC 2 have specific requirements for AI?
No — SOC 2's Trust Service Criteria are technology-neutral. But auditors increasingly expect AI-specific controls: model governance, data lineage, output monitoring, prompt injection defenses, and human oversight. Map AI risks to existing TSC categories.
How do I handle third-party AI models in SOC 2?
Third-party AI providers are sub-service organizations. You need their SOC 2 report, a data processing agreement, documented monitoring, and compensating controls for risks they don't cover.
What's the timeline to achieve SOC 2?
Type I: 6-9 months. Type II: 12-18 months from scratch. For AI systems, add 1-2 months for AI-specific control design.
Build Compliant AI Systems
SOC 2, HIPAA, and GDPR-ready AI — from architecture to audit.
Start a Project