HITRUST CSF requirement statement [?] (07.10bAISecSystem.2)

Data for 
(1) AI model training, 
(2) AI model fine-tuning, or 
(3) prompt enhancement via RAG—if used—is checked prior to usage (e.g., using statistical methods, through manual inspection, 
or through automated means) for suspicious unexpected values or patterns which could be adversarial or malicious in nature 
(e.g., poisoned samples).  
Identified anomalous entries are 
(4) removed.

Evaluative elements in this requirement statement [?]
1. Data for AI model training is checked prior to usage (e.g., using statistical 
methods, through manual inspection, or through automated means) for suspicious 
unexpected values or patterns which could be adversarial or malicious in nature 
(e.g., poisoned samples).  
2. Data for AI model fine-tuning is checked prior to usage (e.g., using statistical 
methods, through manual inspection, or through automated means) for suspicious 
unexpected values or patterns which could be adversarial or malicious in nature 
(e.g., poisoned samples).  
3. Data for prompt enhancement for RAG (if used) is checked prior to usage 
(e.g., using statistical methods, through manual inspection, or through automated 
means) for suspicious unexpected values or patterns which could be adversarial 
or malicious in nature (e.g., poisoned samples).  
4. Identified suspicious unexpected values or patterns are removed.


Illustrative procedures for use during assessments [?]

  • Policy: Examine policies related to each evaluative element within the requirement statement. Validate the existence of a written or undocumented policy as defined in the HITRUST scoring rubric.

  • Procedure: Examine evidence that written or undocumented procedures exist as defined in the HITRUST scoring rubric. Determine if the procedures and address the operational aspects of how to perform each evaluative element within the requirement statement.

  • Implemented: Examine evidence that all evaluative elements within the requirement statement have been implemented as defined in the HITRUST scoring rubric, using a sample based test where possible for each evaluative element. Example test(s):
    • For example, review the AI system to ensure data for AI model training, AI model tuning, or prompt enhancement via RAG if used, is checked prior to usage for anomalies such as unexpected values or patterns (e.g., using statistical methods, through manual inspection). Further, confirm that the identified anomalous entries are removed.

  • Measured: Examine measurements that formally evaluate and communicate the operation and/or performance of each evaluative element within the requirement statement. Determine the percentage of evaluative elements addressed by the organization’s operational and/or independent measure(s) or metric(s) as defined in the HITRUST scoring rubric. Determine if the measurements include independent and/or operational measure(s) or metric(s) as defined in the HITRUST scoring rubric. Example test(s):
    • For example, measures indicate if the data for AI model training, AI model tuning, or prompt enhancement via RAG if used, is checked prior to usage for anomalies such as unexpected values or patterns (e.g., using statistical methods, through manual inspection). Reviews, tests, or audits are completed by the organization to measure the effectiveness of the implemented controls and to confirm the identified anomalous entries are removed.

  • Managed: Examine evidence that a written or undocumented risk treatment process exists, as defined in the HITRUST scoring rubric. Determine the frequency that the risk treatment process was applied to issues identified for each evaluative element within the requirement statement.

Placement of this requirement in the HITRUST CSF [?]

  • Assessment domain: 07 Vulnerability Management
  • Control category: 10.0 – Information Systems Acquisition, Development, and Maintenance
  • Control reference: 10.b – Input Data Validation

Specific to which parts of the overall AI system? [?]
AI application layer:
  • Prompt enhancement via RAG, and associated RAG data sources
AI platform layer
  • AI datasets and data pipelines


Discussed in which authoritative AI security sources? [?]
  • OWASP 2023 Top 10 for LLM Applications
    Oct. 2023, © The OWASP Foundation
    • Where:
      • LLM03: Training data poisoning > Prevention and Mitigation Strategies > Bullet #5
      • LLM06: Sensitive information disclosure > Prevention and Mitigation Strategies > Bullet #1

  • Securing Machine Learning Algorithms
    2021, © European Union Agency for Cybersecurity (ENISA)
    • Where:
      • 4.1- Security Controls > Technical: Control all data used by the ML model > Use methods to clean the training dataset from suspicious samples

Discussed in which commercial AI security sources? [?]
  • Databricks AI Security Framework
    Sept. 2024, © Databricks
    • Where:
      • DASF 7: Enforce data quality checks on batch and streaming datasets
      • DASF 15: Explore datasets and identify problems

  • Snowflake AI Security Framework
    2024, © Snowflake Inc.
    • Where:
      • Adversarial samples > Mitigations > Input preprocessing
      • Model poisoning > Mitigations > Bullet 2
      • Training data poisoning > Mitigations > Dataset sanitization

Control functions against which AI security threats? [?]

Additional information
  • Q: When will this requirement included in an assessment? [?]
    • This requirement is included when the assessment’s in-scope AI system(s) leverage data-driven AI models (e.g., non-generative machine learning models, generative AI models).
    • The Security for AI systems regulatory factor must also be present in the assessment.

  • Q: Will this requirement be externally inheritable? [?] [?]
    • Yes, fully. This requirement may be the sole responsibility of the AI platform provider and/or model creator. Or, depending on the AI system’s architecture, only evaluative elements that are the sole responsibility of the AI platform provider and/or model creator apply.