Report

Agentic data engineering: How to build an AI-native data enterprise

By Ashish Suratkal, Gaurav Verma, Pallavi Kaur, Harry Keir Hughes

30 Jun, 2026
15 min read

Insights

Enterprise data teams face two compounding pressures: building reliable data fast enough to meet demand and making that data accessible enough for business users to act on it. Agentic AI addresses both challenges, but only when implemented with discipline.
Agents inherit the weaknesses of the data estate they operate on. Poor governance, uncertified data, and thin instructions do not disappear when AI is applied.
Databricks, Snowflake, Google Cloud, and Microsoft Fabric each bring differentiated agentic capability to the enterprise data stack. Platform selection should start with an honest diagnosis of where the primary constraint lies.
Within any platform, data source selection, agent architecture, instruction quality, and evaluation discipline determine production success. Human oversight is what keeps autonomous systems trustworthy.

Enterprise data teams have long operated on a familiar rhythm: Data engineers build manual data pipelines, deploy them, fix them when they break, and start again. At the other end, business users need answers to specific, context-driven questions, often in simple language rather than through tools that require coding knowledge. Instead, they receive dashboards and reports built in advance, and when they have a question those tools cannot answer, they submit analysis requests and wait. The model is mostly manual at every step, and for years it was adequate.

Two shifts have disrupted that model.

The first is on the supply side: Artificial intelligence (AI) adoption has enabled high-value analytics. This has driven demand for high-quality, current, well-governed data to a level that manual data engineering cannot sustain. Every new AI initiative, every new reporting requirement, every schema change upstream creates more work for teams already stretched by keeping existing pipelines running.

The second shift is on the demand side: Business decisions now require answers in hours, not the days or weeks a manual analytics cycle takes. A sales leader needs to know why conversion rates dropped in a specific region this week. A supply chain manager needs to understand an anomaly in today's inventory data, not in the next scheduled report. Static dashboards, built around questions someone anticipated months ago, cannot serve questions like these.

These two shifts feed each other. A data estate that is slow to build cannot keep pace with the consumption layer that draws on it. A consumption layer built around static reports cannot serve a business that needs answers on demand. AI has begun to change this, first through targeted automation of individual tasks, and more recently through agentic AI: systems that can plan, execute, and self-correct across multistep data workflows end to end. Leading cloud data platforms, such as Databricks, Snowflake, Google Cloud, and Microsoft Fabric, have invested in these capabilities. The underlying technology has matured, and early results are promising.

Where enterprise readiness falls short

Deploying agents without discipline creates risks at two junctions: one at the data infrastructure layer and the other at the data consumption layer. These risks are usually underestimated, resulting in unnecessary analyst queues to determine what went wrong, where, and why (Figure 1).

Figure 1. Compounding challenges of agentic AI

Source: Infosys

The infrastructure risk

Governance gaps
Agentic pipeline tools inherit the data they operate on. An agent running against a poorly governed schema produces poorly governed pipelines, faster. Automation amplifies what is already true about the data estate. Organizations that deploy agentic data engineering without first establishing governance frameworks (agent identities, defined permissions, and audit trails) do not reduce data quality risk. They accelerate it. Industry findings suggest that only one in five companies has a mature governance model for autonomous AI agents. That means most enterprises deploying these tools today are doing so without the controls needed to manage them safely and effectively at scale.

Skill gaps
There is also a skill shift most organizations underestimate. For engineers, the work moves from writing and fixing code to supervising agents, validating their output, and managing the governance frameworks that keep them operating safely. Teams that deploy agentic systems without adequate AI reskilling may find themselves with agents producing outputs that no one is equipped to review.

The consumption risk

Data quality
Conversational analytics tools, AI-powered systems that allow business users to ask questions in plain English and receive answers from enterprise data, face a different but equally serious set of failure patterns. Conversational agents do not interpret data. They generate queries against whatever structure and naming they find in the connected source. A raw data store, which holds transactional data without embedded business logic, forces the agent to infer definitions from column names and table structures. That inference becomes unreliable when business logic is inconsistently named or defined across tables, which is common in most enterprise data environments. Revenue may be calculated differently across three tables in the same system. The agent cannot know which definition is correct. It guesses and gives a confident-looking answer that is wrong.
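To make the revenue example concrete, the following minimal Python sketch applies three plausible definitions of revenue to the same rows. The order rows, the amounts, and the three definition names are all hypothetical and chosen only for illustration; they are not drawn from any real system or from the trials described in this report.

```python
from decimal import Decimal

# Hypothetical order rows; every amount here is illustrative only.
ORDERS = [
    {"gross": Decimal("100.00"), "refund": Decimal("0.00"), "tax": Decimal("8.00")},
    {"gross": Decimal("250.00"), "refund": Decimal("50.00"), "tax": Decimal("16.00")},
    {"gross": Decimal("75.00"), "refund": Decimal("75.00"), "tax": Decimal("0.00")},
]

# Three plausible meanings of "revenue" that could sit behind
# similarly named columns in different tables (assumed names).
REVENUE_DEFINITIONS = {
    "gross_revenue": lambda row: row["gross"],
    "net_of_refunds": lambda row: row["gross"] - row["refund"],
    "net_of_refunds_and_tax": lambda row: row["gross"] - row["refund"] - row["tax"],
}


def revenue_by_definition(rows):
    """Total the same rows under each competing definition."""
    return {
        name: sum((definition(row) for row in rows), Decimal("0"))
        for name, definition in REVENUE_DEFINITIONS.items()
    }


totals = revenue_by_definition(ORDERS)
for name, total in totals.items():
    print(f"{name}: {total}")
```

Each total is a defensible answer to "what was revenue?", which is why an agent working from column names alone has no basis for choosing among them. A certified definition removes the guess.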

Instruction quality
Conversational agents are configured through natural-language instruction blocks, which are the agent's only source of context about the business domain, naming conventions, and expected outputs. Instructions written at the level of “answer questions about our sales data” are not sufficient for production use. They leave the agent without the context needed for ambiguous terms, multistep calculations, or questions that cross domain boundaries. Most organizations treat instruction design as a one-time setup task, when instructions need to evolve in parallel with the data sources. When they do not, accuracy declines.

Agents scale what is already there. Poor governance, uncertified data, and thin instructions do not disappear when AI is applied. They compound.

Build a two-layer solution

The solution operates at the same two layers as the problem. At the infrastructure layer, agentic data engineering tools automate pipeline creation, self-healing, documentation, and life cycle management. At the consumption layer, conversational analytics tools give business users governed, natural-language access to certified enterprise data. Getting both layers right, which means choosing the most suitable platform for each layer and understanding how the two connect, is what produces an autonomous data enterprise. The findings and implementation guidance that follow draw on Infosys deployment experience and platform-published performance data.

The platform landscape

Databricks: Broadest autonomy

Databricks has built the most autonomous data engineering capability available today. Genie Code covers the full pipeline life cycle from a single prompt: it builds, runs, monitors, and self-heals largely without manual intervention. In agent mode, it can plan and generate pipelines end to end or accelerate work on an existing one. It confirms plans with the user before proceeding. With approval, it searches tables, edits SQL or Python source files, runs pipeline updates, and reads datasets, operating strictly within the user's existing permissions. In Databricks internal benchmarking on real-world data science tasks, Genie Code resolved 77% of tasks against 32% for a leading coding agent equipped with Databricks model context protocol (MCP) integration, reflecting the advantage of purpose-built tooling over general-purpose integrations. For business users, Genie extends to conversational analytics grounded in Unity Catalog, the metadata and governance layer that stores table definitions, relationships, and business semantics across the enterprise. The platform's orientation remains strongest with data professionals and analyst-level users rather than general business users.

Snowflake: Developer productivity focus

Snowflake's acquisition of TensorStax brought autonomous pipeline construction and verification into its AI Data Cloud via Cortex Code, since renamed Snowflake CoCo. The platform's clearest differentiator is breadth: CoCo works across dbt, Apache Airflow, Jira, GitHub, Salesforce, and Slack through native MCP support, making it the most connected agentic tool for organizations managing data engineering through standard software development workflows. For business users, Snowflake CoWork, formerly Snowflake Intelligence, provides conversational access to both structured and unstructured data via a dedicated portal and an iOS app. Snowflake's April 2026 update framed CoWork and CoCo as a unified control plane, with CoWork handling the user layer and CoCo handling the builder layer within a single governed platform.

Google Cloud: LLM depth and migration

Google Cloud's data engineering agent, which has been generally available since April 2026, is powered by Gemini. Unlike other platforms where large language model (LLM) capability is connected through external APIs, Google Cloud embeds Gemini directly across BigQuery Studio, giving data engineering and analytics workflows native LLM access without additional configuration. Its strongest practical application is legacy migration: the agent translates on-premises pipeline code into modern Dataform or dbt formats, directly addressing one of the most costly data modernization challenges enterprises face. For conversational analytics, Google's Conversational Analytics API and Gemini in Looker give business users natural-language access to BigQuery data through Gemini Enterprise and Looker. The platform's advantage is deepest for organizations already running on Google Cloud, where Gemini is available across the full platform rather than as an add-on.

Microsoft Fabric: Conversational autonomy

Microsoft Fabric has made its most differentiated investment at the consumption layer. Fabric Data Agents, generally available since Microsoft’s FabCon conference in March 2026, give business users governed conversational access to enterprise data with no code required. The agent translates plain-English questions into SQL, DAX (the calculation language used in Power BI semantic models), or KQL (query language for real-time event data), executes against the connected source, and returns a formatted answer. The service inherits Microsoft Purview governance policies and row-level security automatically. Fabric's structural advantage is deployment surface: through Microsoft Teams and M365 Copilot, agents reach users inside the tools where most enterprise business users already work, including Excel and Word, without requiring a new interface or login. On the infrastructure side, Fabric has data engineering capability through Data Factory and Lakeflow pipelines, though this is not where it currently differentiates against the three infrastructure-focused platforms.

Figure 2. Comparative analysis of agentic data engineering platforms

Source: Infosys

The shift to agentic AI

The platforms are production-ready. The capability at both layers is real. What separates organizations that capture value from those that do not is a clear-eyed understanding of what the end objective is, where the current bottlenecks lie, and what the shift to agentic AI requires of the people and processes around it.

Govern before you deploy

Governance must be in place before deployment, not configured around it. At the infrastructure layer, this means establishing agent identities, defining permissions, and putting audit trails in place before any agentic pipeline tool runs in a production environment. At the consumption layer, it means having certified data sources and access controls configured before conversational agents are connected to enterprise data. Deploying an agent without governance boundaries removes the visibility needed to detect when risk materializes.
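The three infrastructure-layer controls named above (agent identity, defined permissions, and an audit trail) can be illustrated with a minimal Python sketch. It is not tied to any platform's governance API: the class names, the action strings such as "read:sales.orders", and the in-memory audit list are all assumptions made for illustration. A production deployment would rely on the platform's own identity and audit services.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AgentIdentity:
    """A named, owned agent with an explicit allow-list of actions."""
    agent_id: str
    owner: str
    allowed_actions: frozenset


@dataclass
class AuditTrail:
    """In-memory stand-in for a durable audit log."""
    entries: list = field(default_factory=list)

    def record(self, agent_id: str, action: str, allowed: bool) -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "action": action,
            "allowed": allowed,
        })


def authorize(agent: AgentIdentity, action: str, audit: AuditTrail) -> bool:
    """Check the action against the allow-list and log the decision either way."""
    allowed = action in agent.allowed_actions
    audit.record(agent.agent_id, action, allowed)
    return allowed


# Hypothetical agent and actions, for illustration only.
audit = AuditTrail()
pipeline_agent = AgentIdentity(
    agent_id="pipeline-agent-01",
    owner="data-platform-team",
    allowed_actions=frozenset({"read:sales.orders"}),
)
can_read = authorize(pipeline_agent, "read:sales.orders", audit)
can_drop = authorize(pipeline_agent, "drop:sales.orders", audit)
```

Denied requests are logged as well as permitted ones, so the audit trail shows what an agent attempted, not only what it did.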

Sequence the layers

Infrastructure comes before consumption. The quality of data the infrastructure layer produces directly determines the reliability of the consumption layer. A conversational agent connected to well-governed, certified data can deliver accurate answers at scale. The same agent connected to uncertified data will produce unreliable answers, regardless of how carefully it is configured. Build and certify the data estate first, and then expose it to conversational agents. Organizations that invert this sequence spend more time recovering trust than building it.

Choose the right platform, and then implement deliberately

No single platform currently leads on both layers. The selection decision should start with an honest diagnosis of where the primary constraint sits. Platform choice, however, is only the first decision. Two implementation choices, each grounded in Infosys deployment experience, determine whether the selected platform delivers in production.

Implementation choice 1: Choose the right data source
The most consequential decision for a conversational agent is the data source it connects to (Figure 3). In Microsoft Fabric, for instance, semantic models (governed data layers that embed certified business logic and prebuilt key performance indicator, or KPI, definitions) consistently outperform raw data stores for business metric questions. Infosys trials showed semantic model-connected agents achieving approximately 90% accuracy on complex business questions, against 75% to 80% for raw store connections on the same question set. The raw store is the right choice for exploratory queries and transaction-level investigation. For certified KPI reporting and nontechnical business users, the semantic model should be the default.

Figure 3. Data source comparison based on Infosys trials conducted using Microsoft Fabric

Source: Infosys

Implementation choice 2: Match architecture to complexity
The pattern that appears simplest, one agent connected to all available data sources, is usually the worst performer in production, according to Infosys trials conducted using Microsoft Fabric (Figure 4). The agent must navigate different query languages, resolve naming conflicts, and split its instruction context across business domains simultaneously. Accuracy drops sharply, and the failure is silent: the agent returns answers, but reliability degrades in ways that are hard to detect without structured evaluation. The highest-performing pattern connects domain-specialist agents to their optimal data sources, coordinated by an orchestrator that classifies and routes incoming questions. One critical rule at this layer: the orchestrator must forward the user's original question without rephrasing. Any rewording before the question reaches the specialist agent degrades accuracy downstream.
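The orchestrator pattern and its forwarding rule can be sketched in a few lines of Python. This is a simplified illustration, not the configuration used in the Infosys trials: the keyword classifier stands in for whatever classification method a real orchestrator uses, and the domain names, keywords, and specialist functions are hypothetical.

```python
from typing import Callable, Dict, Tuple

Specialist = Callable[[str], str]

# Hypothetical routing keywords per domain.
KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "sales": ("revenue", "conversion"),
    "supply_chain": ("inventory", "shipment"),
}


def classify(question: str) -> str:
    """Keyword stand-in for a real classifier; lowercases a local copy only."""
    lowered = question.lower()
    for domain, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return domain
    return "unrouted"


def route(question: str, specialists: Dict[str, Specialist]) -> Tuple[str, str]:
    """Pick a specialist, then forward the user's question exactly as asked."""
    domain = classify(question)
    if domain not in specialists:
        return domain, "No specialist agent is configured for this question."
    return domain, specialists[domain](question)  # no rephrasing


# Stub specialists that record what they receive.
received = []


def sales_agent(question: str) -> str:
    received.append(question)
    return "sales answer"


def supply_chain_agent(question: str) -> str:
    received.append(question)
    return "supply chain answer"


specialists = {"sales": sales_agent, "supply_chain": supply_chain_agent}
original = "Why did Conversion rates drop in EMEA this week?"
domain, answer = route(original, specialists)
```

Classification reads the question but never alters it, so the specialist receives the user's wording unchanged, including capitalization and punctuation.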

Figure 4. Comparison of agent architecture patterns from Infosys trials using Microsoft Fabric

Source: Infosys

Engineer instructions properly

An agent's instruction block is its only source of context about the business domain, naming conventions, and expected output formats. Instruction quality has more impact on accuracy than architecture choice in many deployment scenarios. A well-engineered instruction set on a simple single-source agent can outperform a complex multiagent architecture with poorly written instructions. Instructions must specify the business domain explicitly, define ambiguous terms, name the correct data sources for specific question types, and include handling patterns for common multistep queries. They must be treated as a production engineering asset: versioned, owned, and updated whenever the underlying data sources evolve.
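One way to treat instructions as a versioned, owned asset is to hold them in a structured record and render the instruction block from it. The Python sketch below is a hypothetical layout, not a platform feature: the field names, the example definitions, and the rendered format are all assumptions. The source mapping in the example follows the data source guidance given earlier in this report.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InstructionSet:
    """Hypothetical versioned, owned instruction asset."""
    version: str
    owner: str
    domain: str
    term_definitions: dict = field(default_factory=dict)
    source_by_question_type: dict = field(default_factory=dict)
    handling_patterns: tuple = ()

    def missing_sections(self) -> list:
        """Name the required sections that are still empty."""
        required = {
            "domain": self.domain.strip(),
            "term_definitions": self.term_definitions,
            "source_by_question_type": self.source_by_question_type,
            "handling_patterns": self.handling_patterns,
        }
        return [name for name, value in required.items() if not value]

    def render(self) -> str:
        """Produce the natural-language block handed to the agent."""
        lines = [
            f"Instruction set v{self.version} (owner: {self.owner})",
            f"Business domain: {self.domain}",
            "Definitions:",
        ]
        lines += [f"- {term}: {meaning}" for term, meaning in self.term_definitions.items()]
        lines.append("Data source by question type:")
        lines += [f"- {kind}: {source}" for kind, source in self.source_by_question_type.items()]
        lines.append("Handling patterns:")
        lines += [f"- {pattern}" for pattern in self.handling_patterns]
        return "\n".join(lines)


# A thin instruction set of the kind the report warns against.
thin = InstructionSet(version="0.1.0", owner="unassigned",
                      domain="Answer questions about our sales data")

# A fuller, illustrative instruction set (all content is made up).
full = InstructionSet(
    version="1.2.0",
    owner="sales-analytics-team",
    domain="Regional sales performance",
    term_definitions={"revenue": "gross sales minus refunds, excluding tax"},
    source_by_question_type={
        "certified KPI": "semantic model",
        "transaction-level investigation": "raw data store",
    },
    handling_patterns=("For week-over-week questions, compare complete weeks only.",),
)
block = full.render()
```

Because the record carries a version and an owner, a change to a definition or a data source becomes a reviewable revision rather than an untracked edit to free text.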

Evaluate before deploying

Without a structured evaluation framework in place before deployment, business users encounter inconsistent agent behavior, lose confidence, and return to the analyst queue. Infosys uses a test set of 40 to 50 representative questions with verified ground-truth answers, scored automatically using an LLM-as-judge method. The test set covers fact retrieval, KPI calculation, trend analysis, multistep reasoning, and edge cases, including null values, ambiguous date references, and cross-domain questions. Every deployment decision is gated by performance against this test set. An agent that does not meet the defined accuracy threshold does not move to production. Architecture changes, instruction revisions, and data source modifications are all re-evaluated against the same test set before taking effect.
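The gating logic described above can be sketched in Python as follows. This is an illustrative outline rather than the Infosys evaluation framework: the exact-match judge is a deliberately simple stand-in for an LLM-as-judge call, the four test questions and their answers are made up, and the 0.90 threshold is an assumed value, since the report does not state the threshold used.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected: str   # verified ground-truth answer
    category: str   # e.g. "fact retrieval", "edge case"


Agent = Callable[[str], str]
Judge = Callable[[str, str, str], bool]  # (question, expected, actual) -> pass?


def exact_match_judge(question: str, expected: str, actual: str) -> bool:
    """Stand-in for an LLM-as-judge call; compares normalized strings."""
    return expected.strip().lower() == actual.strip().lower()


def evaluate(agent: Agent, cases: Sequence[EvalCase], judge: Judge) -> float:
    """Return the share of cases the judge marks as correct."""
    if not cases:
        raise ValueError("The test set must contain at least one case.")
    passed = sum(1 for c in cases if judge(c.question, c.expected, agent(c.question)))
    return passed / len(cases)


def passes_gate(accuracy: float, threshold: float) -> bool:
    """A deployment or change proceeds only at or above the threshold."""
    return accuracy >= threshold


# Made-up test set and a stub agent that gets one answer wrong.
CASES = [
    EvalCase("Total orders in March?", "1200", "fact retrieval"),
    EvalCase("Average order value in March?", "85.50", "KPI calculation"),
    EvalCase("Did orders rise from February to March?", "yes", "trend analysis"),
    EvalCase("Orders with no region recorded?", "14", "edge case"),
]
STUB_ANSWERS = {
    "Total orders in March?": "1200",
    "Average order value in March?": "85.50",
    "Did orders rise from February to March?": "Yes",
    "Orders with no region recorded?": "0",
}


def stub_agent(question: str) -> str:
    return STUB_ANSWERS[question]


ASSUMED_THRESHOLD = 0.90  # hypothetical; set per deployment
accuracy = evaluate(stub_agent, CASES, exact_match_judge)
ready = passes_gate(accuracy, ASSUMED_THRESHOLD)
```

Running the same function against the same test set after every architecture change, instruction revision, or data source modification keeps results comparable from one change to the next.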

Redefine the human role

As agentic tools take on more of the execution work, human oversight becomes an increasingly critical investment. Organizations should define new role expectations for data engineers and analysts before deployment, assign explicit ownership of agent governance, instruction quality, and evaluation frameworks, and build those responsibilities into team structures and performance expectations. The return on agentic AI is strongest in organizations where human judgment is applied at the right points: approving agent plans, reviewing outputs, and maintaining the governance boundaries that keep automated systems trustworthy.

The autonomous data enterprise is an operating model built across two layers, governed from the start, and maintained through disciplined human oversight. Platform choice sets the ceiling. Governance, data source selection, instruction quality, and evaluation discipline determine how close to that ceiling the deployment reaches. The organizations that move earliest and most deliberately will build a data estate capable of serving a business that moves faster than manual processes can support.
