Microsoft Research has launched debug-gym, an environment for training AI coding tools in the intricate process of debugging code. As AI takes on a larger role in software development, attention is shifting to a significant gap: AI excels at generating code, but debugging remains a time-consuming task for developers. GitHub CEO Thomas Dohmke has predicted that as much as 80% of code may soon be written by AI tools like Copilot.
Organizations of all sizes are increasingly using AI for code generation. Y Combinator has reported that in one recent startup batch, about a quarter of the companies had 95% of their code written by language models. In practice, however, debugging often takes more time than writing the initial code. The Microsoft Research team, who maintain popular open-source repositories, pose a thought-provoking question: what if AI could propose fixes for the many open issues in those repositories, requiring only human approval before they are applied?
Debugging is an interactive process that involves forming hypotheses about crashes, gathering evidence, and testing code iteratively. Debug-gym aims to give AI agents the same capabilities, and it asks how effectively large language models (LLMs) can use interactive debugging tools. The environment lets code-repairing AI agents set breakpoints, inspect variable values, and then decide whether to investigate further or rewrite the code.
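To make the "gather evidence, then decide" loop concrete, here is a minimal sketch in Python. It does not use debug-gym's own interface; it relies only on the standard library's `sys.settrace` hook to pause at the point where a function returns and capture its local variables, roughly what a breakpoint plus a variable inspection would give an agent. The names `buggy_mean` and `inspect_before_return` are hypothetical and exist only for illustration.

```python
import sys


def buggy_mean(values):
    # Hypothetical code under repair: the divisor is off by one.
    total = 0
    for v in values:
        total += v
    return total / (len(values) - 1)


def inspect_before_return(func, *args):
    """Run func and capture its local variables at the moment it returns.

    This stands in for "set a breakpoint, then inspect variables".
    It is an illustrative sketch, not debug-gym's actual API.
    """
    snapshot = {}

    def tracer(frame, event, arg):
        # Ignore every frame except the function being investigated.
        if frame.f_code is not func.__code__:
            return None
        if event == "return":
            snapshot.update(frame.f_locals)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        # Always restore whatever trace hook was active before.
        sys.settrace(previous)
    return result, snapshot


result, evidence = inspect_before_return(buggy_mean, [1, 2, 3])
print("returned:", result)
print("total:", evidence["total"], "count:", len(evidence["values"]))
```

For the input `[1, 2, 3]` the captured evidence shows a total of 6 over three values, so the correct mean is 2.0, yet the function returns 3.0. That mismatch points at the divisor rather than the summation loop, which is the kind of targeted information an agent could use to decide between further inspection and a rewrite.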
Initial experiments have shown promising results. Although complex issues remain difficult, agents with access to debugging tools achieved significantly higher success rates than agents without them. The Microsoft team believes that fine-tuning LLMs specifically for debugging, along with creating specialized datasets, is crucial for advancing this research area.
By open-sourcing debug-gym, Microsoft Research invites the community to help advance interactive debugging agents and, more broadly, AI’s ability to actively seek out information.