AI Testing: NIST’s Dioptra as a Step Forward, and Other NIST Guidance


As part of NIST’s mandate to formalize AI testing under President Joe Biden’s Executive Order on AI, NIST recently released a testbed called Dioptra that can be used to evaluate AI developers’ claims about their systems’ performance. Dioptra helps users identify attacks that would reduce model performance and quantify the failures that may result. Dioptra’s capabilities align with the core principles of rigorous AI testing, emphasizing the need to validate AI systems against risks to reliability, security, and fairness.
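
For readers curious what “identifying attacks that reduce model performance and quantifying the failures” looks like in practice, the sketch below measures a toy classifier’s accuracy before and after a fast gradient sign method (FGSM) perturbation, one common attack type in such evaluations. It is purely illustrative: it does not use Dioptra’s actual API, and the model, synthetic data, and perturbation budget are hypothetical placeholders.

```python
# Illustrative only: a minimal robustness check of the kind a testbed like
# Dioptra automates. This does NOT use Dioptra's API; the model, data, and
# epsilon budget below are hypothetical placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for a deployed classifier and its evaluation data.
inputs = torch.randn(512, 20)
labels = (inputs[:, 0] > 0).long()  # simple synthetic ground truth
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()

# Briefly fit the toy model so the robustness measurement is meaningful.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    loss_fn(model(inputs), labels).backward()
    optimizer.step()

def accuracy(x: torch.Tensor, y: torch.Tensor) -> float:
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

def fgsm(x: torch.Tensor, y: torch.Tensor, epsilon: float = 0.25) -> torch.Tensor:
    """Fast Gradient Sign Method: shift each input in the direction that most
    increases the loss, bounded by a small budget epsilon."""
    x_adv = x.clone().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# Quantify the performance degradation caused by the attack.
clean_acc = accuracy(inputs, labels)
adv_acc = accuracy(fgsm(inputs, labels), labels)
print(f"clean accuracy:          {clean_acc:.3f}")
print(f"accuracy under FGSM:     {adv_acc:.3f}")
print(f"performance degradation: {clean_acc - adv_acc:.3f}")
```

The gap between clean and attacked accuracy is the kind of quantified failure measurement that can then be compared against a developer’s stated performance claims.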

In our article “Is Your AI Testing Tool a Breach of Contract Claim Waiting to Happen?”, we encouraged customers leveraging AI testing tools to ensure that their testing practices align with the terms and conditions outlined in their agreements for such tools.

Thus, while the advantages of a tool like Dioptra are clear, its deployment (as well as the deployment of any other user testing mechanism) must respect the rights and obligations set out in AI contracts. Specifically, Dioptra should be used within the boundaries of use rights, data privacy obligations, and intellectual property rights. Before deploying any testing tool, users should:

  • Review the license or use rights in the contract for the AI service to confirm there are no restrictions that would prohibit use of testing software.
  • Ensure that the testing does not disrupt the performance of the AI service more broadly, or otherwise use the AI service in a manner prohibited by the acceptable use policy (AUP).
  • Consider whether local deployment of the testing tool is viable, so as to avoid needing to host or share the model with a third party.

In addition to releasing Dioptra, NIST provided its initial guidance on Managing Misuse Risk for Dual-Use Foundation Models, along with finalized versions of its Artificial Intelligence Risk Management Framework: Generative AI Profile, its Secure Software Development Practices for Generative AI and Dual-Use Foundation Models, and its Plan for Global Engagement on AI Standards. A summary of those materials follows.

Managing Misuse Risk for Dual-Use Foundation Models
NIST sets out seven objectives, and related practices to meet those objectives, for organizations to map, measure, manage, and govern the risk that foundation models will result in harm. The table below summarizes those objectives and practices.

[Chart: summary of NIST’s seven objectives and related practices for managing misuse risk]

Artificial Intelligence Risk Management Framework (RMF) – Generative AI Profile
The AI RMF profiles assist organizations in deciding how best to manage AI risks in a manner that aligns with their goals, considers legal and regulatory requirements and best practices, and reflects their risk management priorities. Some of the risks identified in the Generative AI Profile, though significant, are inapplicable to the most common commercial use cases for these services. Most relevant for a business client are the risks of (1) confabulation; (2) data privacy; (3) environmental impacts; (4) harmful bias or homogenization; (5) information security; (6) intellectual property; and (7) value chain and component integration. NIST then suggests a number of general actions to manage those risks, which may serve as a helpful governance structure for implementing AI services.

Secure Software Development Practices for Generative AI and Dual-Use Foundation Models
This document provides a framework for secure software development. It sets forth a set of practices, aligned with the suggested actions in the AI RMF, that can be implemented by AI model producers, AI system producers, and AI system acquirers.

Plan for Global Engagement on AI Standards
This plan sets forth policy objectives to “promote responsible AI safety and security principles and actions with other nations, including our competitors, while leading key global conversations and collaborations to ensure that AI benefits the whole world, rather than exacerbating inequities, threatening human rights, and causing other harms.”

The rapidly evolving AI industry lacks the maturity and transparency found in more established technology sectors, such as cloud and other SaaS services. This vacuum has led to ad hoc development of services and evaluation methods, with no robust, universally accepted standards to guide the process. NIST, which President Biden’s executive order designates to take a significant role in AI transparency efforts, therefore emerges as a crucial resource. In other words, NIST can likely be viewed as the source of “industry standard” in the AI space.

