Can AI automate everything and replace programmers? Photo: Midjourney

A team of researchers has just published a comprehensive map of the challenges facing artificial intelligence (AI) in software development, and proposed a research roadmap to push the field further.

Imagine a future where AI quietly takes over the mundane tasks of software development: refactoring tangled code, migrating legacy systems, and hunting down race conditions, leaving human software engineers free to focus on system architecture, design, and creative problems that machines can’t yet solve. Recent advances in AI seem to be bringing that vision closer.

However, a new study by scientists at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and partner research institutes argues that, to realize that future, we must first look squarely at the very real challenges of the present.

“Many people say that programmers are no longer needed because AI can automate everything,” said Armando Solar-Lezama, a professor of electrical engineering and computer science at MIT, a senior researcher at CSAIL, and the study’s senior author. “In fact, we’ve made significant progress. The tools we have today are much more powerful than they were before. But we still have a long way to go to realize the full potential of automation.”

Solar-Lezama argues that popular perception reduces software engineering to something like a student programming assignment: take the specification for a small function and implement it, or solve a LeetCode-style exercise. Real practice is far broader, ranging from refactoring code to improve its design to large-scale migrations that move millions of lines from COBOL to Java and reshape a company's entire technology stack.

Measurement and communication remain difficult problems

Industrial-scale code optimizations, such as tuning GPU kernels or the multi-layer improvements behind Chrome's V8 engine, are still difficult to evaluate. Current benchmarks mostly target small, self-contained problems. The most practical benchmark, SWE-Bench, simply asks an AI model to fix a bug reported on GitHub. That is a low-level programming exercise involving a few hundred lines of code, it risks data leakage from public repositories, and it ignores a wide range of real-world scenarios, such as AI-assisted refactoring, human-machine pair programming, or rewrites of performance-critical systems with millions of lines of code. Until benchmarks expand to cover these higher-stakes scenarios, measuring progress, and thus accelerating it, will remain an open challenge.

Human-machine communication is another major barrier. Alex Gu, a PhD student and the study's first author, said that interacting with AI today is still like "a fragile communication line". When he asks an AI system to generate code, he often gets back large, unstructured files along with a handful of superficial tests. The gap also shows in the fact that AI cannot yet make effective use of the software tools humans rely on, such as debuggers and static analyzers.

Call to action from the community

The authors argue that there is no silver bullet for these problems and call for community-scale efforts: building datasets that reflect how programmers actually develop software (which code they keep, which code they discard, how code is refactored over time); creating shared evaluation tools for refactoring quality, patch durability, and migration correctness; and building transparent tools that allow AI to express uncertainty and invite human intervention.

Gu sees this as a “call to action” for large-scale open-source collaboration of a kind that no single lab can deliver alone. Solar-Lezama envisions progress coming in small, incremental steps, “research findings that solve one piece of the problem at a time”, that transform AI from a “code suggestion tool” into a true engineering partner.

“Why does this matter? Software is already the foundation of finance, transportation, healthcare, and just about every everyday activity. But the human effort to build and maintain it securely is becoming a bottleneck,” Gu said. “An AI that can do the heavy lifting without making hidden errors would free up programmers to focus on creativity, strategy, and ethics. But to get there, we need to understand that finishing a piece of code is the easy part; the hard part is everything else.”

(Abridged translation from MIT News)

Source: https://vietnamnet.vn/hanh-trinh-dai-cua-ai-trong-ky-thuat-phan-mem-tu-dong-hoa-2426456.html