HITRUST CSF requirement statement (07.07aAISecOrganizational.5)

The organization maintains a documented inventory of data used to
(1) train, test, and validate AI models; 
(2) fine-tune AI models; and
(3) enhance AI prompts via RAG, as applicable. 
At minimum, this inventory contains the data
(4) provenance and
(5) sensitivity level (e.g., protected, confidential, public). 
This inventory is 
(6) periodically (at least semiannually) reviewed and updated.

Evaluative elements in this requirement statement
1. The organization maintains a documented inventory of data used to train, test, and
validate AI models, as applicable. 
2. The organization maintains a documented inventory of data used to fine-tune AI models,
as applicable.
3. The organization maintains a documented inventory of data used to enhance AI 
prompts via RAG, as applicable.
4. The organization’s AI data inventory contains the data provenance.
5. The organization’s AI data inventory contains the data sensitivity level (e.g., 
protected, confidential, public).
6. The organization’s AI data inventory is periodically (at least semiannually) reviewed and
updated.
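
As an illustration only, the evaluative elements above could map onto an inventory record like the one sketched below. Every name here (`DatasetRecord`, `Usage`, `Sensitivity`, `REVIEW_INTERVAL`, `overdue_names`) and the sample entries are hypothetical and not part of the HITRUST CSF. The 183-day interval is an assumed approximation of "at least semiannually".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Usage(Enum):
    """How the data is used (elements 1-3)."""
    TRAIN_TEST_VALIDATE = "train/test/validate"
    FINE_TUNE = "fine-tune"
    RAG = "rag"


class Sensitivity(Enum):
    """Sensitivity levels named as examples in element 5."""
    PROTECTED = "protected"
    CONFIDENTIAL = "confidential"
    PUBLIC = "public"


# Assumption: "at least semiannually" is approximated as 183 days.
REVIEW_INTERVAL = timedelta(days=183)


@dataclass
class DatasetRecord:
    name: str
    usage: Usage
    provenance: str           # element 4: where the data came from
    sensitivity: Sensitivity  # element 5
    last_reviewed: date       # supports element 6

    def review_overdue(self, today: date) -> bool:
        """True when the last review is older than the assumed interval."""
        return today - self.last_reviewed > REVIEW_INTERVAL


def overdue_names(records, today):
    """Names of inventory entries whose periodic review is overdue."""
    return [r.name for r in records if r.review_overdue(today)]


# Hypothetical sample entries, for illustration only.
inventory = [
    DatasetRecord("claims-2023", Usage.TRAIN_TEST_VALIDATE,
                  "internal claims warehouse export",
                  Sensitivity.PROTECTED, date(2024, 1, 15)),
    DatasetRecord("kb-articles", Usage.RAG,
                  "public help-center site",
                  Sensitivity.PUBLIC, date(2024, 6, 1)),
]

print(overdue_names(inventory, date(2024, 8, 1)))
```

With these sample dates, only the entry last reviewed in January is flagged. An organization would choose its own review interval and record fields.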


Illustrative procedures for use during assessments

  • Policy: Examine policies related to each evaluative element within the requirement statement. Validate the existence of a written or undocumented policy as defined in the HITRUST scoring rubric.

  • Procedure: Examine evidence that written or undocumented procedures exist as defined in the HITRUST scoring rubric. Determine if the procedures address the operational aspects of how to perform each evaluative element within the requirement statement.

  • Implemented: Examine evidence that all evaluative elements within the requirement statement have been implemented as defined in the HITRUST scoring rubric, using a sample-based test where possible for each evaluative element. Example test(s):
    • Review the AI system to ensure the organization maintains a documented inventory of data used to train, test, and validate AI models; fine-tune AI models; and enhance AI prompts via RAG, as applicable. Further, confirm this inventory contains the data provenance and sensitivity level (e.g., protected, confidential, public).

  • Measured: Examine measurements that formally evaluate and communicate the operation and/or performance of each evaluative element within the requirement statement. Determine the percentage of evaluative elements addressed by the organization’s operational and/or independent measure(s) or metric(s) as defined in the HITRUST scoring rubric. Determine if the measurements include independent and/or operational measure(s) or metric(s) as defined in the HITRUST scoring rubric. Example test(s):
    • Measures indicate whether the organization maintains a documented inventory of data used to train, test, and validate AI models; fine-tune AI models; and enhance AI prompts via RAG, as applicable. Reviews, tests, or audits are completed by the organization to measure the effectiveness of the implemented controls and to confirm the inventory contains the data provenance and sensitivity level (e.g., protected, confidential, public).

  • Managed: Examine evidence that a written or undocumented risk treatment process exists, as defined in the HITRUST scoring rubric. Determine the frequency that the risk treatment process was applied to issues identified for each evaluative element within the requirement statement.
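
One way an organization might produce an operational measure for the example tests above is to compute how many inventory entries carry each required attribute. The sketch below is an assumption for illustration, not the HITRUST scoring rubric. The function name `field_coverage`, the dictionary layout, and the sample entries are all hypothetical.

```python
def field_coverage(entries, fields=("provenance", "sensitivity", "last_reviewed")):
    """Return, per field, the percentage of entries with a non-empty value."""
    total = len(entries)
    if total == 0:
        # An empty inventory has nothing to measure; report zero coverage.
        return {f: 0.0 for f in fields}
    return {
        f: 100.0 * sum(1 for e in entries if e.get(f)) / total
        for f in fields
    }


# Hypothetical inventory export, for illustration only.
entries = [
    {"name": "claims-2023", "provenance": "internal export",
     "sensitivity": "protected", "last_reviewed": "2024-01-15"},
    {"name": "kb-articles", "provenance": "",
     "sensitivity": "public", "last_reviewed": "2024-06-01"},
    {"name": "tuning-set", "provenance": "vendor delivery",
     "sensitivity": None, "last_reviewed": None},
    {"name": "open-corpus", "provenance": "open dataset",
     "sensitivity": "public", "last_reviewed": None},
]

for field, pct in field_coverage(entries).items():
    print(f"{field}: {pct:.1f}% of entries populated")
```

A value below 100% for a field points to inventory entries that need follow-up before the next periodic review.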

Placement of this requirement in the HITRUST CSF

  • Assessment domain: 07 Vulnerability Management
  • Control category: 07.0 – Asset Management
  • Control reference: 07.a – Inventory of Assets

Specific to which parts of the overall AI system?
AI application layer:
  • Prompt enhancement via RAG and associated RAG data sources
AI platform layer:
  • Model tuning and associated datasets
  • AI datasets and data pipelines


Discussed in which authoritative AI security sources?
  • Generative AI framework for HM Government
    2023, Central Digital and Data Office, UK Government
    • Where:
      • Using generative AI safely and responsibly > Ethics > Transparency and explainability > Practical recommendations > Bullet #2
      • Using generative AI safely and responsibly > Data protection and privacy > Accuracy > Practical recommendations > Bullet #3

Discussed in which commercial AI security sources?
  • Databricks AI Security Framework
    Sept. 2024, © Databricks
    • Where:
      • Control DASF 11: Capture and view data lineage
      • Control DASF 17: Track and reproduce the training data used for ML model training

  • Google Secure AI Framework
    June 2023, © Google
    • Where:
      • Step 4. Apply the six core elements of the SAIF > Expand strong security foundations to the AI ecosystem > Prepare to store and track supply chain assets, code, and training data

  • Snowflake AI Security Framework
    2024, © Snowflake Inc.
    • Where:
      • Backdooring models (insider attacks) > Mitigations > Data provenance and auditability

Control functions against which AI security threats?

Additional information
  • Q: When will this requirement be included in an assessment?
    • This requirement is included when the assessment’s in-scope AI system(s) leverage data-driven AI models (e.g., non-generative machine learning models, generative AI models).
    • The Security for AI systems regulatory factor must also be present in the assessment.

  • Q: Will this requirement be externally inheritable?
    • Yes, fully. This requirement may be the sole responsibility of the AI model creator. Or, depending on the AI system’s architecture, only evaluative elements that are the sole responsibility of the AI model creator apply.