DoD to Develop Scalable GenAI Testing Datasets

In an announcement Thursday, DoD said the CAIRT program's most recent red-team test involved more than 200 agency clinical providers and healthcare analysts, who compared three LLMs for two prospective use cases: clinical note summarization and a medical advisory chatbot. Participants found more than 800 potential vulnerabilities and biases in areas where LLMs are being tested to enhance military medical care. CAIRT aimed to build a community of practice around algorithmic evaluations in collaboration with the Defense Health Agency and the Program Executive Office, Defense Healthcare Management Systems. In 2024, the program also offered a financial AI bias bounty focused on unknown risks in LLMs, beginning with open-source chatbots.

Medigy Insights

"Since applying GenAI for such purposes within the DoD is in earlier stages of piloting and experimentation, this program acts as an essential pathfinder for generating a mass of testing data, surfacing areas for consideration and validating mitigation options that will shape future research, development and assurance of GenAI systems that may be deployed in the future," said Dr. Matthew Johnson, CAIRT program lead, in a Jan. 2 statement about the initiative.

Continue reading at healthcareitnews.com
