OpenAI has partnered with Pacific Northwest National Laboratory (PNNL), a U.S. Department of Energy (DOE) national laboratory, to evaluate whether coding agents can accelerate federal permitting work, using PNNL’s PermitAI program as the delivery vehicle.
A benchmark built around federal drafting work
The collaboration produced DraftNEPABench, a benchmark OpenAI said is designed around National Environmental Policy Act (NEPA) workflows such as drafting environmental impact statements (EISs).
According to the announcement, PermitAI is funded by DOE’s Office of Policy and was built with 19 subject matter experts on the NEPA review process.
OpenAI described the benchmark around document-heavy work where reviewers must read technical reports running hundreds of pages, cross-check information across multiple sources and produce structured drafts that meet specified legal and technical criteria.
In its description of the testbed, PNNL said PermitAI is building on a large dataset of past environmental reviews and permitting documents and is testing applications that speed up discrete tasks inside permitting workflows.
How the agents were tested
The evaluation ran “generalized coding agents” through Codex CLI, which OpenAI said can draw out the performance of reasoning models such as GPT-5 on research, technical analysis and report-writing tasks that involve a file system.
OpenAI’s developer documentation describes Codex CLI as a terminal-based coding agent that can read and run code in a selected directory.
What the experts found
Across a representative set of drafting tasks spanning NEPA document sections from 18 federal agencies, OpenAI said the 19 experts found the agents could cut drafting work by one to five hours per subsection, or roughly 15% of drafting time. OpenAI also said benchmark scoring aggregates structure, clarity, accuracy and references across 102 tasks.
OpenAI positioned DraftNEPABench as a capability test on well-specified drafting tasks rather than a substitute for real-world permitting discretion. It said the benchmark emphasizes accuracy and correct reference use, and noted that models may not flag incomplete, inconsistent or out-of-date source materials unless explicitly instructed to do so.
The timeline problem the benchmark lands in
The benchmark’s release coincides with a push from the Council on Environmental Quality (CEQ) for increased transparency in EIS timelines.
CEQ’s January 2025 EIS timeline report said the median time from notice of intent to final EIS across agencies was 2.8 years for final EISs issued from 2019 through 2024, with a 2024 median of 2.2 years.
Separately, Resources for the Future cited earlier CEQ reporting that put average NEPA EIS timelines at 54 months, or 4.5 years, across project types and agencies.
PermitAI has also been built to align with the government’s push for shared permitting data structures. DOE’s description of PermitAI said agencies can host NEPA documents on a PermitAI platform that implements CEQ’s NEPA and Permitting Data and Technology Standards.