Beyond augmentation: Agentic AI for software development

Insights

  • Software development is both a very popular and a very viable AI use case.
  • But until now, AI has been used only to augment, rather than automate, many phases of the software development life cycle.
  • Agentic AI takes software development to the next level, thanks to its ability to manage open-ended problems and multistep processes, and to improve over time.
  • Agents we’ve tested can achieve between 80% and 90% improvement in database code generation; between 60% and 70% improvement in generating application programming interfaces (APIs) and microservices; and up to 60% improvement in generating user interface code.
  • But there are many ways agents can go wrong in large enterprise applications.
  • To reduce risk and ensure business success, we provide five recommendations, along with specific operating model changes, for adopting agentic AI.

Large language models (LLMs) are increasingly “a commodity in many of our client organizations,” Rafee Tarafdar, Infosys chief technology officer, told the Infosys Knowledge Institute in April 2025.

Software development is a key application area for artificial intelligence (AI), with Microsoft, Google, and other hyperscalers launching AI assistants and software to help complete lines of code, migrate and modernize legacy systems, and architect business processes from scratch.

Our AI Business Value Radar 2025 research, a deep dive into AI use cases across functions and industries, found that software development is both a very popular and a very viable AI use case, creating more value than use cases in more human-focused areas such as marketing and sales (Figure 1).

Figure 1. IT use cases have high selection rates and are much more viable

Source: Infosys Knowledge Institute

The problem with augmentation

But until now, AI has been used only to augment, rather than automate, many phases of the software development life cycle (SDLC).

Initially, coding assistants from GitHub and other vendors offered basic suggestions and auto-completions for simple functions or methods. Subsequent versions introduced conversational interactions, offering multiple code snippets and recommendations. However, developers with technical expertise still had to manually review, validate, and integrate these suggestions into their existing applications.

Due to the probabilistic nature of generative AI (you can never be sure you will get the right answer), developers were responsible for ensuring that the suggested code adhered to best practices, was error-free, and integrated with existing codebases. This manual validation was crucial, as suggested code could inadvertently break existing functionality, especially when external dependencies or multiple functions were involved.

Additionally, developers had to handle tasks such as creating new files, managing third-party package references, and ensuring the application compiled and ran correctly, adding complexity to the development process.

In this early stage, AI often acted as a copilot, limited to helping programmers with predefined workflows and tasks. For example, early releases of GitHub Copilot auto-completed code snippets through natural language but couldn’t plan and execute code on their own.

To make software developers more productive and companies more competitive, LLMs must work with even more context and reasoning, without the need for prompting at each stage of the SDLC.

They must be able to plan, reason, and act autonomously, leaving low-level drudgery to the machine and turning software developers into product architects.

Agentic AI in software development

Compounding the urgency of introducing AI systems that are even more automated is the service-as-software paradigm. Here, the software does not merely enable a task; it performs the service itself. The customer pays not for the use of a tool, but for the outcome: A risk report, a marketing campaign, a legal brief, all conjured by software, with little or no human input. In this landscape, augmentation won’t deliver the business benefit needed.

This is where AI agents come in: Systems that plan, reason, and act without the need for human intervention. Infosys has identified agentic AI as one of our Top 10 AI Imperatives for 2025.

Agentic AI takes software development to the next level, thanks to its ability to manage open-ended problems and multistep processes, and to improve over time. Agents enable the LLM to determine the steps needed to complete a task that can’t always be hardcoded into a workflow; use tools or information over multiple iterations instead of single-shot retrieval; and receive feedback from its environment (terminals, OS, code base) or from users to provide better utility (Figure 2).

Figure 2. Agents plan, reason, and act without the need for prompts

Source: Infosys Knowledge Institute
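To make the loop in Figure 2 concrete, here is a minimal sketch in Python. The helpers (call_llm, run_tool) are hypothetical stand-ins, not any vendor’s actual API; a real agent would call an LLM and execute genuine tools at those points.

```python
# A toy agentic loop: the model plans the next step, acts through a tool,
# observes feedback from the environment, and iterates until done.
# call_llm and run_tool are hypothetical stand-ins, not a vendor API.

def call_llm(goal: str, history: list) -> dict:
    """Stand-in for an LLM call that decides the next action.
    Here it simply finishes once the tests have passed."""
    if any(obs == "tests passed" for _, obs in history):
        return {"done": True}
    return {"tool": "run_tests", "args": {}}

def run_tool(name: str, args: dict) -> str:
    """Stand-in for tool execution (editing files, compiling, running tests).
    Returns feedback from the terminal, OS, or code base as text."""
    return "tests passed"

def run_agent(goal: str, max_steps: int = 20) -> list:
    history = []
    for _ in range(max_steps):
        action = call_llm(goal, history)        # the model plans the step
        if action.get("done"):
            break
        observation = run_tool(action["tool"], action["args"])
        history.append((action, observation))   # feedback drives the next turn
    return history

print(run_agent("fix the failing unit test"))
```

The important design point is the feedback edge: each observation flows back into the model’s next decision, which is what separates an agent from single-shot code completion.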

The real-world impact of AI agents

At Infosys, we have adopted GitHub Copilot at scale, and its agent mode is particularly exciting.

This agent is capable of iterating on its own code, understanding its own errors, and then fixing them itself through self-reflection. According to the GitHub blog, it can “suggest terminal commands and ask you to execute them.” It also analyzes run-time errors with self-healing capabilities.

“In agent mode, Copilot will iterate on not just its own output, but the result of that output. And it will iterate until it has completed all the subtasks required to complete your prompt,” says Thomas Dohmke, CEO of GitHub. “Instead of performing just the task you requested, Copilot now has the ability to infer additional tasks that were not specified, but are also necessary for the primary request to work. Even better, it can catch its own errors, freeing you up from having to copy/paste from the terminal back into chat.”

But does it work well in a large enterprise setting like Infosys?

To find out, we conducted our own tests across an array of software development tasks, ranging from small experiments that required basic programming skills to more advanced software engineering, such as making iterative changes to the source code of Infosys’ own IP products and platforms.

According to our analysis, which includes experiments from both senior architects and junior developers at Infosys, agent mode can achieve between 80% and 90% improvement in database code generation; between 60% and 70% improvement in generating application programming interfaces (APIs) and microservices; and up to 60% improvement in generating user interface code.

Of course, productivity varies across programming languages.

Furthermore, the complexity of use cases and the expertise of developers significantly impact productivity, with expert developers outperforming novices.

This last point is no surprise. Other tests on AI in fields as far afield as law and research have found that the more senior a person is in their respective field, the greater the benefit of AI.

Aidan Toner-Rodgers of MIT, for example, found that using an AI tool to assist with materials discovery nearly doubled the productivity of top researchers, while having no measurable impact on the bottom third.

What should companies consider when adopting agentic AI?

However, organizations will need to use agents like GitHub Copilot agent mode with care, especially when building applications for key business processes. We have five recommendations for using agents in this way:

First, establish a well-defined and robust software architecture with clear component boundaries, consistent coding standards, and documented design principles before letting AI agents implement new features. This architectural foundation provides essential context and guardrails that help agents generate properly integrated, maintainable code that aligns with enterprise standards. This foundation can be communicated either as prompts or through instructions or rule sets depending on the AI coding agents being leveraged.
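As a hedged illustration of that last point, the sketch below shows one simple way a rule set might be packaged and prepended to an agent’s task prompt. The rules and file layout are made-up examples, not a specific vendor’s instruction format.

```python
# Illustrative only: packaging standing architecture rules as a rule set
# that travels with every task given to a coding agent.

ARCHITECTURE_RULES = """\
- Services communicate only through published REST APIs; no direct DB access.
- New modules follow the existing layered structure: api/ -> service/ -> repo/.
- Do not introduce a third-party package without flagging it for review.
- All public methods require unit tests in the matching test project.
"""

def build_agent_context(feature_request: str) -> str:
    """Combine the standing rules with the specific task so the agent
    generates code inside the established architectural boundaries."""
    return f"Project rules:\n{ARCHITECTURE_RULES}\nTask:\n{feature_request}"

print(build_agent_context("Add an endpoint to export invoices as CSV."))
```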

Second, agents can produce results faster than humans can check them properly. To address this, break software development tasks into small, manageable chunks to simplify reviewing, managing, and accepting changes. This approach can help catch the hallucinations that are likely when dealing with large code bases.
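A minimal sketch of that review rhythm, assuming a hypothetical propose_change helper that asks the agent for one small diff at a time:

```python
# Illustrative only: chunk a feature into small subtasks and gate each
# agent-generated change behind a human review step.

subtasks = [
    "Add the InvoiceExport data model",
    "Implement the CSV serializer",
    "Wire up the /invoices/export endpoint",
    "Add unit tests for the serializer",
]

def propose_change(subtask: str) -> str:
    """Stand-in for the agent producing one small, focused diff."""
    return f"<diff for: {subtask}>"

for subtask in subtasks:
    print(f"\n--- Review: {subtask} ---\n{propose_change(subtask)}")
    if input("Accept this change? [y/N] ").strip().lower() != "y":
        print("Rejected; pausing so a human can redirect the agent.")
        break
```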

Third, provide the appropriate level of context to enhance solution accuracy. This includes specific technical information that an AI agent needs to plan, interpret tasks, and take actions. For example, share details about your tech stack, domain-specific requirements, business rules, user stories, and other relevant parameters applicable to the feature you want the agent to develop. The more precisely you define the problem statement and constraints, the more aligned the agent's output will be with your actual needs.
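The bundle below is a hedged example of what such context might contain. The fields and values are illustrative; in practice, this information reaches the agent as prompt text, instruction files, or attached resources.

```python
# Illustrative only: the kind of context bundle that sharpens an agent's
# output. Every value here is a made-up example.

feature_context = {
    "tech_stack": ["C# / .NET 8", "Angular", "SQL Server"],
    "domain": "Accounts receivable",
    "business_rules": [
        "Invoices more than 90 days past due are flagged as delinquent.",
        "Exports must exclude personally identifiable information.",
    ],
    "user_story": "As a finance analyst, I can export delinquent invoices to CSV.",
    "constraints": ["No new external packages", "Responses under 2 seconds"],
}
```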

Fourth, leverage references, best practices, and guardrails to guide AI agents in creating code aligned with company standards. The absence of such guardrails is one reason many agents haven’t been let loose on the enterprise yet: Giving an agent full freedom rather than grounding it in human-defined prompts and rules is risky, increasing the potential for harm. Hand control back to the user at regular intervals, or whenever a complex task needs to be finalized.

To help in this, Infosys leverages the iLEAD platform. This is an AI-first, agent-integrated platform providing end-to-end SDLC assistance tailored for key personas including architects, developers, and DevOps teams, supporting the journey from requirements definition to release management. The platform ensures that the agent-generated code adheres to best practices, incorporates strong security measures, and is straightforward to maintain.

To facilitate this, we are also developing Model Context Protocol (MCP) servers within the iLEAD platform. These servers are software components that act as bridges between AI agents and external systems, providing tools and resources for agents. They will expose endpoints designed for both industry-specific solutions (such as energy, finance, and healthcare) and technology solutions (for integration with databases, design and architecture patterns, etc.), ensuring the generated code consistently aligns with industry and enterprise standards.
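To illustrate the pattern, here is a minimal MCP server written with the open-source Python MCP SDK’s FastMCP helper. The coding-standards tool and its contents are hypothetical placeholders, not part of iLEAD.

```python
# Minimal MCP server sketch using the open-source Python MCP SDK.
# The tool and its contents are hypothetical, not part of iLEAD.

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("enterprise-standards")

@mcp.tool()
def get_coding_standard(technology: str) -> str:
    """Return the enterprise coding standard for a technology."""
    standards = {
        "sql": "Use parameterized queries; never concatenate SQL strings.",
        "csharp": "Enable nullable reference types; suffix async methods with Async.",
    }
    return standards.get(technology.lower(), "No standard registered.")

if __name__ == "__main__":
    mcp.run()  # serve the tool to any MCP-capable agent over stdio
```

An MCP-capable agent connected to such a server can look up standards before generating code, which is how platform-level guardrails reach the agent without bloating every prompt.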

The final recommendation is to strategically select your AI model. AI assistants provide a choice of LLMs, and choosing the most appropriate one for your needs can improve results. For instance, choose Gemini 2.5 Pro or Claude 3.7 Sonnet for intricate problems, and reasoning-focused models for effective debugging and error fixing.

Senior developers who used agent mode in our analysis offered three further thoughts. First, avoid giving generic instructions with the entire codebase as context, as this can disrupt the overall solution. Second, “Do not accept responses blindly — always review them!” And third, use agents to assist with fixing and debugging issues, but avoid lengthy debugging sessions; starting fresh is often more effective.

Changing roles, changing outcomes

We tested agentic AI coding assistants with technologies including C#, Java, HTML, CSS, Angular, React, Vue.js, and SQL. Creating end-to-end unit test cases took about 45 minutes with AI assistance, compared with the couple of weeks it would have required manually. Migrating Oracle database views to SQL views was completed in just 10 minutes instead of the approximately three days usually needed. Similarly, upgrading configuration files, packages, and dependencies for a web application was completed in 30 minutes, a task that would typically require two to three days of developer time using traditional methods.

However, beyond the technology, when adopted at enterprise scale, how an organization is set up will determine its success with agentic AI, as we found in our 2023 Digital Radar research. Operating model and talent profile will drive a lot of the value, and speed coupled with quality will be of the essence.

Product-based teams will be needed, with each member having a product and customer mindset.

The ratio of senior architects to junior developers will increase — with a focus on fewer people managing an increasingly large number of agents: "With AI agents producing fewer syntax errors, cleaner structure, and faster iterations, software developers and engineers will become editors and reviewers, not authors of every line," says Lori Schafer, CEO at Digital Wave.

Our AI Business Value Radar research also found that workforce readiness drives the most success. Workforce preparation enables organizations to derive more value from agentic AI outputs and can increase value from software development use cases by as much as 18 percentage points.

Workshops and hackathons are ways to improve skills for more junior developers, while more senior talent will benefit from instruction in product management, Agile methodologies, and domain knowledge improvement.

Populating teams with skilled problem solvers who combine logical reasoning and IT experience increases the chances of achieving significant return on investment, our analysis found.

Ultimately, the success of agentic AI in software development will depend on business impact, user adoption, customer experience, risk management, and operational efficiency.

And many organizations are aware of this: 33% of enterprise software applications are set to incorporate agentic AI by 2028, according to Gartner, enabling autonomous solutions for 15% of day-to-day work decisions.

To remain competitive, enterprises will need to plan, reason, and act accordingly.
