Description:
- Adversaries may poison training data and publish it to a public location. The poisoned dataset may be a novel dataset or a poisoned variant of an existing open-source dataset. This data may be introduced to a victim system via supply chain compromise.
Source: MITRE ATLAS
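To make the ingestion vector concrete, below is a minimal sketch of the point where a poisoned public dataset typically enters a training pipeline, assuming the Hugging Face `datasets` library; the dataset name and revision hash are hypothetical placeholders, not from any cited source.

```python
# Minimal sketch of the supply-chain ingestion point, assuming the Hugging Face
# `datasets` library. The dataset name and commit hash below are hypothetical.
from datasets import load_dataset

# Risky: follows the head of the upstream repository, so a poisoned revision
# published later by an adversary is pulled in silently on the next run.
ds = load_dataset("example-org/public-corpus")

# Safer: pin the exact upstream revision that was reviewed, so later upstream
# changes (malicious or otherwise) are not ingested automatically.
ds = load_dataset(
    "example-org/public-corpus",
    revision="8a1f3b9c",  # hypothetical commit hash recorded when the data was vetted
)
```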
Impact:
- Use of datasets poisoned upstream in the AI supply chain can lead to integrity issues such as biased or manipulated outcomes, or even availability issues such as an outage of the AI system. The impact depends heavily on the context of the overall AI system.
Applies to which types of AI models? Data-driven models (e.g., predictive ML models, generative AI models)
- Which AI security requirements function against this threat?
  - Control function: Decision support
    - Identifying security threats to the AI system
    - Threat modeling
    - Security evaluations such as AI red teaming
    - Identify and evaluate any constraints on data used for AI
    - Identify and evaluate compliance and legal obligations for AI system development and deployment
    - Inventory deployed AI systems
    - AI data and data supply inventory
    - Linkage between dataset, model, and pipeline configuration (a minimal automation sketch follows this list)
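The inventory and linkage requirements above lend themselves to automation. The sketch below assumes a locally downloaded dataset and a SHA-256 digest pinned when the data was first vetted; the function names, file paths, digest value, and manifest format are illustrative, not taken from any cited framework.

```python
import hashlib
import json
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_and_link(dataset: Path, pinned_sha256: str,
                    model_name: str, pipeline_config: Path) -> dict:
    """Fail closed if the dataset no longer matches the digest pinned at
    vetting time, then record a manifest linking dataset, model, and
    pipeline config for the AI asset inventory."""
    actual = sha256_of(dataset)
    if actual != pinned_sha256:
        raise RuntimeError(
            f"Digest mismatch for {dataset}: expected {pinned_sha256}, got {actual}. "
            "Possible upstream tampering; do not train on this copy."
        )
    manifest = {
        "dataset": str(dataset),
        "dataset_sha256": actual,
        "model": model_name,
        "pipeline_config": str(pipeline_config),
        "pipeline_config_sha256": sha256_of(pipeline_config),
    }
    Path("linkage_record.json").write_text(json.dumps(manifest, indent=2))
    return manifest

# Hypothetical usage; the digest and paths are placeholders:
# verify_and_link(Path("data/public-corpus.parquet"),
#                 "<64-hex digest recorded at vetting time>",
#                 "fraud-detector-v2", Path("configs/train.yaml"))
```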
- Discussed in which authoritative sources?
  - CSA Large Language Model (LLM) Threats Taxonomy, 2024, © Cloud Security Alliance
    - Where:
      - 4. LLM Service Threat Categories > 4.2. Data Poisoning
      - 4. LLM Service Threat Categories > 4.6. Insecure Supply Chain
  - Mitigating Artificial Intelligence (AI) Risk: Safety and Security Guidelines for Critical Infrastructure Owners and Operators, April 2024, © Department of Homeland Security (DHS)
    - Where:
      - Appendix A: Cross-sector AI risks and mitigation strategies > Risk category: AI design and implementation failures > Supply chain vulnerabilities
      - Appendix A: Cross-sector AI risks and mitigation strategies > Risk category: Attacks on AI > Adversarial manipulation of AI algorithms or data
  - MITRE ATLAS, 2024, © The MITRE Corporation
  - Multilayer Framework for Good Cybersecurity Practices for AI, 2023, © European Union Agency for Cybersecurity (ENISA)
    - Where:
      - 2. Framework for good cybersecurity practices for AI > 2.2. Layer II – AI fundamentals and cybersecurity > Compromise of ML application components
  - OWASP AI Exchange, 2024, © The OWASP Foundation
  - OWASP 2023 Top 10 for LLM Applications, Oct. 2023, © The OWASP Foundation
    - Where:
      - LLM03: Training Data Poisoning
      - LLM05: Supply Chain Vulnerabilities
  - OWASP 2025 Top 10 for LLM Applications, 2025, © The OWASP Foundation
  - OWASP Machine Learning Security Top 10, 2023, © The OWASP Foundation
- Discussed in which commercial sources?
  - Databricks AI Security Framework, Sept. 2024, © Databricks
    - Where:
      - Risks in AI System Components > Raw data 1.7: Lack of data trustworthiness
  - HiddenLayer’s 2024 AI Threat Landscape Report, 2024, © HiddenLayer
    - Where:
      - Part 2: Risks faced by AI-based systems > Supply chain attacks
      - Part 2: Risks faced by AI-based systems > Data poisoning in supply chain attacks