Date: 8 July 2025
 
Phillipa is Salsa’s Rules as Code Practice Lead and a content specialist.

The Policy2Code Prototyping Challenge

The Policy2Code Prototyping Challenge, organised by the Digital Benefits Network and the Massive Data Institute at Georgetown University, aimed to assess the feasibility of using generative AI to convert US public benefits policies into machine-readable code. This initiative is part of the broader Rules as Code (RaC) movement, which seeks to make policy implementation more efficient by translating legislation into software code.
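
To make the idea of machine-readable policy concrete, the short sketch below encodes a single, invented income-test rule as an ordinary Python function. The program, threshold and household model are hypothetical and do not come from the Challenge or from any real US benefits policy.

```python
# Hypothetical illustration of Rules as Code: one eligibility rule expressed
# as a small, testable function instead of prose. All figures are invented.
from dataclasses import dataclass


@dataclass
class Household:
    monthly_income: float  # gross monthly income in dollars
    size: int              # number of household members


def is_income_eligible(household: Household, limit_per_person: float = 1000.0) -> bool:
    """Return True if gross income is within a per-person monthly limit."""
    return household.monthly_income <= limit_per_person * household.size


print(is_income_eligible(Household(monthly_income=2500.0, size=3)))  # True
```

Once a rule is expressed this way it can be tested, versioned and reused across services, which is what makes the prospect of AI-assisted codification so attractive.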

Twelve teams participated in the challenge, each experimenting with different AI models and approaches. For instance, BDO Canada’s “Policy Pulse” tool employed a large language model to analyse legislation related to homeless shelter deductions, achieving approximately 80% accuracy in identifying relevant policy changes.

Another notable project, “Code the Dream”, developed an AI assistant to help community benefits navigators manage client eligibility for programs like SNAP and Medicaid.

For more information on the Challenge, see our dedicated insight, Policy2Code Prototyping Challenge.

The challenges of using AI in Rules as Code

Despite these advances, the challenge highlighted significant obstacles. AI models often struggled with the nuances and exceptions inherent in legal texts, leading to potential inaccuracies. Given the high stakes of public benefits programs, even minor errors could have serious consequences for citizens.

Salsa Digital, with its extensive experience in RaC and tools like OpenFisca, recognises these challenges. The manual process of dissecting legislation into codifiable rules remains labour-intensive, requiring close collaboration between developers and policy experts. While AI can help in preliminary stages, human oversight is crucial to ensure the fidelity and reliability of the final code.
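
As a rough illustration of what that hand-crafted codification looks like, the sketch below defines a single OpenFisca-style variable. It assumes the openfisca-core and openfisca-country-template packages are installed; the benefit name, age threshold and dollar amount are invented for illustration and are not drawn from any legislation.

```python
# A hypothetical OpenFisca variable, assuming openfisca-core and the standard
# country template are installed. Names and amounts are illustrative only.
from openfisca_core.model_api import Variable, MONTH
from openfisca_country_template.entities import Person


class example_shelter_deduction(Variable):
    value_type = float
    entity = Person
    definition_period = MONTH
    label = "Hypothetical monthly shelter deduction"

    def formula(person, period, parameters):
        # In a real encoding these figures would live in legislated parameter
        # files and be traced back to the relevant clause, not hard-coded.
        is_adult = person("age", period) >= 18
        return is_adult * 200.0
```

Every variable like this has to be checked against the legislation it implements, which is exactly where the close collaboration between developers and policy experts comes in.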

Moreover, integrating AI-driven chatbots with RaC systems introduces additional complexities. Given the imperative for absolute accuracy in government communications, the potential for AI to generate incorrect or misleading information is a significant barrier. Salsa’s research indicates that, despite advances, AI is not yet equipped to handle the intricacies of policy interpretation without substantial human intervention.
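
One way to keep a chatbot from inventing determinations is sketched below: the coded rules produce the answer, and the language model is restricted to rewording it, with human review before anything reaches a citizen. The function names and figures are hypothetical and do not describe Salsa's or any Challenge team's implementation.

```python
# Hypothetical hybrid pattern: the deterministic rules engine decides,
# the language model (stubbed here) only rephrases, and a human reviews.
from dataclasses import dataclass


@dataclass
class Determination:
    eligible: bool
    reason: str


def determine_eligibility(monthly_income: float, household_size: int) -> Determination:
    """Deterministic, testable rule logic: the single source of truth."""
    limit = 1000.0 * household_size  # invented per-person income limit
    if monthly_income <= limit:
        return Determination(True, f"Income ${monthly_income:,.2f} is within the ${limit:,.2f} limit.")
    return Determination(False, f"Income ${monthly_income:,.2f} exceeds the ${limit:,.2f} limit.")


def draft_reply(result: Determination) -> str:
    """Stand-in for an LLM call that only rewords the determination it is given.

    The model never decides eligibility, and the drafted text would still be
    reviewed by a person before being sent.
    """
    outcome = "appears eligible" if result.eligible else "does not appear eligible"
    return f"Based on the coded rules, the applicant {outcome}. {result.reason}"


print(draft_reply(determine_eligibility(monthly_income=2500.0, household_size=3)))
```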

While AI offers tools that can aid in the RaC process, it is not a complete solution on its own. The path forward lies in a hybrid approach: leveraging AI for efficiency while maintaining rigorous human oversight to uphold the standards required in public service delivery.