
The State of Performant Distributed Applications in 2025

Distributed systems are evolving rapidly, but true performance isn't just about execution speed. Principal Engineer Anant Agarwal explains why the secret to scaling enterprise platforms lies in making the 'change path' predictable. By prioritizing configuration over code-heavy workflows, teams can reduce latency and maintain stability even as requirements shift under heavy load.

Distributed systems are no longer a specialty sport. The cloud-native applications market is projected to grow from $8.371 billion in 2024 to $30.236 billion by 2030, and that growth shows up in ordinary places: an HR admin trying to roll out a new workflow before a payroll cutoff, or an ops team watching latency climb because a "small" configuration change multiplied downstream calls. Anant Agarwal is a Principal Engineer whose work on configurable platform foundations earned him a promotion to Lead Software Engineer, and his core belief is blunt: performance in distributed systems is not just about execution speed; it is about ensuring that change can be introduced and propagated without creating instability. To understand how teams are designing highly performant, scalable distributed applications that stay stable while requirements keep moving, we spoke with Agarwal.

"Latency problems often start as change problems," Agarwal says. "If you cannot introduce a new field, rule, or workflow cleanly, teams ship workarounds, and those workarounds become your slowest dependency." The industry has learned the hard way that applications rarely fail only because of compute; they fail because the "simple" change takes too long to model, test, and roll out across many services and tenants. That matters because implementation time is still a drag on enterprise systems. One widely cited benchmark shows the average implementation timeline dropping from 15.5 months to 9 months year over year, which is progress, but it also shows how much calendar time can disappear before users feel value. Slow setup forces teams to batch changes, and batching is where defects hide.


Performance Starts With The Change Path

Agarwal’s answer was to treat configuration as a first-class performance surface. In his work on the Metadata Framework (MDF) at SAP, Agarwal focused on reducing the friction of change within enterprise systems. By introducing a configuration-driven model for objects, business rules, and user interfaces, the platform reduced setup time by over 90%, allowing teams to implement changes without relying on code-heavy workflows.

Configurability Is How You Reduce The Human Queue

The next pressure usually arrives right after initial setup: customization requests that keep coming, often from customers who cannot wait for a release train. People blame "process," but the bottleneck is usually the system’s shape. If a platform forces code changes for routine variations, the request becomes a ticket, then a sprint, then a delay. It is exhausting. Nobody wants that.

Agarwal remembers a late-week escalation that felt familiar to anyone who has supported enterprise customers. A customer needed a small change before a Monday deadline, and the uncomfortable truth was that the change itself was easy, but the path to make it safely was not. "The tense part is not writing code," he says. "The tense part is knowing that a small tweak will not ripple into permissions, data loads, and downstream behavior you did not intend."

MDF was built to remove that queue by shifting routine customization into controlled configuration. Customers could define fields, objects, business rules, and UIs through metadata rather than code, and the payoff was measurable: customization turnaround moved from 4–6 weeks to less than a day. That single change compresses entire cycles of back-and-forth, because a configuration that is explicit and permissioned is easier to reason about than a patchwork of special-case code.
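The mechanics of that shift can be sketched in a few lines. The example below is a hypothetical illustration of a metadata-driven object definition, not MDF's actual schema: adding a field becomes a declarative configuration edit that the platform validates, rather than new code.

```python
# Hypothetical sketch of metadata-driven customization: a routine change
# becomes a config edit, not a code change. All names are illustrative,
# not MDF's actual schema.

FIELD_TYPES = {"string": str, "number": float, "date": str}

def validate_record(object_def: dict, record: dict) -> list:
    """Check a record against a metadata object definition."""
    errors = []
    for field in object_def["fields"]:
        name, required = field["name"], field.get("required", False)
        if name not in record:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        expected = FIELD_TYPES[field["type"]]
        if not isinstance(record[name], expected):
            errors.append(f"{name}: expected {field['type']}")
    return errors

# Adding a field is a one-line configuration edit:
employee_def = {
    "object": "Employee",
    "fields": [
        {"name": "id", "type": "string", "required": True},
        {"name": "costCenter", "type": "string"},  # added via configuration
    ],
}

print(validate_record(employee_def, {"costCenter": "CC-42"}))
# → ['missing required field: id']
```

Because the definition is explicit data, the platform can enforce permissions and validation uniformly, which is what makes the turnaround safe rather than merely fast.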

While MDF addressed the problem of managing change safely, Agarwal’s work on distributed systems focused on how those changes behave at scale. At VMware, he contributed to the design of a Distributed Task Framework that enabled reliable execution of tasks across multiple nodes and services. This framework handled coordination, fault tolerance, and execution consistency in distributed environments, ensuring that system-wide operations could complete reliably even under partial failures or high load. In contrast to configuration systems like MDF, which optimize how changes are introduced, distributed task systems determine how those changes are executed across infrastructure.
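To make the coordination side concrete, here is a minimal, hypothetical sketch of the retry-with-idempotency pattern such a task framework relies on to survive partial failures; the names and structure are illustrative, not VMware's actual implementation.

```python
# Illustrative sketch (not VMware's actual framework): a small task runner
# that tolerates transient failures by retrying, and guards against
# duplicate execution with an idempotency set. All names are hypothetical.
import time

def run_task(task_id, work, completed, attempts=3, backoff=0.0):
    """Run `work` at most `attempts` times; skip if already completed,
    so a coordinator can safely re-dispatch after a node failure."""
    if task_id in completed:
        return "skipped"          # idempotency: the task already ran to completion
    for attempt in range(1, attempts + 1):
        try:
            work()
            completed.add(task_id)
            return "ok"
        except Exception:
            if attempt == attempts:
                return "failed"   # surfaced to the coordinator for rescheduling
            time.sleep(backoff)   # real frameworks use jittered exponential backoff

# A transient failure: succeeds on the third try.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient network error")

done = set()
print(run_task("load-tenant-42", flaky, done))   # → ok
print(run_task("load-tenant-42", flaky, done))   # → skipped
```

The point of the idempotency guard is that re-dispatching a task after an ambiguous failure is safe, which is the property that lets system-wide operations complete under partial failure.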

Together, these two areas—configurability and distributed execution—address different sides of the same problem. One determines how safely change can be introduced, while the other determines how reliably that change propagates across a distributed system.

APIs Are The Real Load Test

Once a system is configurable, the next question is what happens when it is connected. Most enterprise applications are no longer used in isolation; they are stitched into identity systems, reporting layers, payroll providers, and internal tools that expect stable interfaces. The traffic follows: programmable interfaces now account for 57% of Internet traffic.

In practice, that means your application is often a contract. Small inconsistencies get punished quickly, because integrations do not tolerate ambiguity. Agarwal’s approach was to make the model itself consistent: a dynamic object and UI layer with event-driven business rules, so behavior could be extended without breaking the underlying contract. It is not glamorous work, but it is the work that keeps systems fast when usage expands.
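One way to picture an event-driven rule layer is a registry of handlers keyed by event name, so behavior can be extended without altering the core contract. The sketch below is an illustration under that assumption; all names are hypothetical.

```python
# Hypothetical sketch of event-driven business rules: handlers register
# against events, so tenants extend behavior without touching the core
# object contract. Names are illustrative.
from collections import defaultdict

RULES = defaultdict(list)

def on(event):
    """Decorator that registers a rule for an event."""
    def register(fn):
        RULES[event].append(fn)
        return fn
    return register

def emit(event, payload):
    """Run every registered rule; each sees and returns the same contract."""
    for rule in RULES[event]:
        payload = rule(payload)
    return payload

# An extension added without modifying emit() or the payload schema:
@on("employee.updated")
def normalize_email(payload):
    payload["email"] = payload["email"].lower()
    return payload

print(emit("employee.updated", {"email": "A@Example.com"}))
# → {'email': 'a@example.com'}
```

The design choice is that every rule consumes and produces the same payload shape, which is exactly what keeps integrations stable as the rule set grows.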

The challenge of maintaining visibility across distributed contracts is something Agarwal has written about publicly. His recent HackerNoon article, "Why Observability Needs an AI On-Call Engineer," makes the case that as systems become more configurable and distributed, the tooling watching them must evolve beyond static dashboards toward intelligent anomaly detection and cross-service correlation, treating operational problems as system design problems.

That emphasis on evidence over slogans also shows up in how he contributes outside his day job. His multiple peer reviews for technical manuscripts and papers for SARC journals reflect a habit of treating claims as testable statements, not vibes. In large platforms, that mindset translates directly into safer extensibility, clearer interfaces, and fewer surprises when load rises.

Extensibility Becomes A Business Constraint

As systems spread across regions and customer footprints, performance starts to look like governance: who can change what, how quickly, and with what guarantees. Extensibility lives on those API edges, and that is where confusion turns into real operational cost. The pressure is not only scale but exposure: 150 billion API attacks were documented from January 2023 through December 2024.

"You cannot scale an enterprise platform by negotiating every change," Agarwal says. "You scale it by deciding what is configurable, making that configuration safe, and refusing to let exceptions become hidden code paths." It is a pragmatic view of performance, because the system that can accept controlled change is the system that stays predictable under load. It also keeps engineers from re-litigating the same rules every time a customer asks for a variation.

MDF became that extensibility foundation at customer scale, adopted by 6,800+ customers worldwide and serving as the configuration model for a broad HCM SaaS product suite. The number matters because it implies diversity: different compliance regimes, different process shapes, different admin habits. A platform that performs across that spread is not just fast, it stays explainable enough that thousands of organizations can keep changing it without breaking it.

The Future Belongs To Systems That Make Change Boring

By the time a platform reaches real scale, the winning systems are not the ones with the cleverest internals. They are the ones that let teams ship necessary variation without starting a fire. That is the hidden principle behind performant distributed applications: the runtime is only half the story. The change path is the other half.

Agarwal’s advice is simple and slightly strict. "Write down what the platform promises, and then enforce it," he says. "If you let every edge case invent a new rule, you will eventually pay for it in latency, incidents, and rework."

The same evidence-first posture extends beyond his engineering work: his selection as a judge for the Business Intelligence Group's 2025 AI Excellence Awards reflects a field that ultimately rewards engineers who can make systems understandable at scale.
