Description:
- A class of attacks that seeks to reconstruct class representatives from the training data of an AI model. The result is the generation of semantically similar data rather than a direct reconstruction of the data itself (i.e., extraction). (Source: NIST AI 100-2, section 2.4.1)
- A machine learning model’s training data can be reconstructed by exploiting the confidence scores available via an inference API. By querying the inference API strategically, adversaries can back out potentially private information embedded in the training data. (Source: MITRE ATLAS)
- Model inversion (or data reconstruction) occurs when an attacker reconstructs part of the training set through intensive experimentation in which the input is optimized to maximize indications of confidence in the model’s output (a minimal sketch of this optimization appears below). (Source: OWASP AI Exchange)
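To make the optimization loop described above concrete, the following is a minimal sketch in Python/PyTorch. It assumes white-box gradient access for simplicity; a black-box attacker would instead estimate gradients from the confidence scores returned by the inference API. The victim network, target class, and hyperparameters are illustrative placeholders, not details of any real system.

```python
# Minimal model inversion sketch: optimize an input to maximize the victim
# model's confidence in one class, yielding a candidate class representative.
# The tiny "victim" below is a stand-in for a deployed classifier.
import torch
import torch.nn as nn

victim = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # placeholder model
victim.eval()

target_class = 3  # hypothetical class whose representative we reconstruct
x = torch.zeros(1, 1, 28, 28, requires_grad=True)  # start from a blank input
optimizer = torch.optim.Adam([x], lr=0.1)

for step in range(500):
    optimizer.zero_grad()
    confidence = torch.softmax(victim(x), dim=1)[0, target_class]
    # Minimizing -log(confidence) drives the input toward whatever the
    # model most strongly associates with the target class.
    loss = -torch.log(confidence + 1e-12)
    loss.backward()
    optimizer.step()
    x.data.clamp_(0.0, 1.0)  # keep the input in a valid pixel range

reconstruction = x.detach()  # semantically similar to training data, not a copy
```

Note that the result is a class representative in the NIST AI 100-2 sense: data semantically similar to the training examples rather than an exact extraction of them.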
Impact:
- Can lead to a confidentiality breach of sensitive and/or confidential model training data. Depending on the model, this training data may include personally identifiable information or other protected data.
Applies to which types of AI models? Predictive (non-generative) machine learning models
Which AI security requirements function against this threat?
- Control function: Decision support
  - Identifying security threats to the AI system
  - Threat modeling
  - Security evaluations such as AI red teaming
  - Identify and evaluate any constraints on data used for AI
  - Identify and evaluate compliance and legal obligations for AI system development and deployment
  - Inventory deployed AI systems
  - Model card publication
  - Linkage between dataset, model, and pipeline configuration
  - Review the model cards of models used by the AI system
Discussed in which authoritative sources?
- Engaging with Artificial Intelligence
  Jan. 2024, Australian Signals Directorate’s Australian Cyber Security Centre (ASD’s ACSC)
  - Where: Challenges when engaging with AI > 5. Model stealing attack (discusses model inversion)
- ISO/IEC TR 24028:2020: Information technology — Artificial intelligence — Overview of trustworthiness in artificial intelligence
  2020, © International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC)
  - Where: 8. Vulnerabilities, Risks, and Challenges > 8.3. AI-specific privacy threats > 8.3.4. Model query
- Mitigating Artificial Intelligence (AI) Risk: Safety and Security Guidelines for Critical Infrastructure Owners and Operators
  April 2024, © Department of Homeland Security (DHS)
  - Where: Appendix A: Cross-sector AI risks and mitigation strategies > Risk category: Attacks on AI > Model inversion and extraction
  - Where: Appendix A: Cross-sector AI risks and mitigation strategies > Risk category: Attacks on AI > Loss of data
- MITRE ATLAS
  2024, © The MITRE Corporation
- NIST AI 100-2 E2023: Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations
  Jan. 2024, National Institute of Standards and Technology (NIST)
  - Where: 2. Predictive AI Taxonomy > 2.4. Privacy Attacks
- OWASP Machine Learning Security Top 10
  2023, © The OWASP Foundation
- OWASP AI Exchange
  2024, © The OWASP Foundation
- Securing Artificial Intelligence (SAI); AI Threat Ontology
  2022, © European Telecommunications Standards Institute (ETSI)
  - Where: 6. Threat landscape > 6.4. Threat modeling > 6.4.2.4 > Deployment
- Securing Machine Learning Algorithms
  2021, © European Union Agency for Cybersecurity (ENISA)
  - Where: 3. ML Threats and Vulnerabilities > 3.1. Identification of Threats > Oracle
Discussed in which commercial sources?
- AI Risk Atlas
  2024, © IBM Corporation
- Databricks AI Security Framework
  Sept. 2024, © Databricks
  - Where: Risks in AI System Components > Model management 8.4: Model inversion
  - Where: Risks in AI System Components > Model serving – Inference requests 9.2: Model inversion
  - Where: Risks in AI System Components > Model serving – Inference requests 9.5: Infer training data membership
- Failure Modes in Machine Learning
  Nov. 2022, © Microsoft
  - Where: Intentionally-Motivated Failures > Model Inversion
  - Where: Intentionally-Motivated Failures > Membership Inference Attack
- Snowflake AI Security Framework
  2024, © Snowflake Inc.
  - Where: Privacy
  - Where: Fuzzing
  - Where: Model inversion
- StackAware AI Security Reference
  2024, © StackAware
  - Where: AI Risks > Training data extraction