
Voice Assistant App Discovery: The Hidden Problem Google Finally Solved

Google's Sidhesh Badrinarayan reveals how he transformed voice assistant app discovery, turning failed user queries into seamless Play Store installs. Learn about the infrastructure overhaul, capability-based matching, and the work on AI intent that earned recognition at Google I/O. This deep dive uncovers the engineering behind a platform-level advancement.

There are roughly 3.8 million applications available on the Google Play Store, yet the majority of app installations still happen through the same narrow corridors: search bars, app store charts, and word of mouth. Voice-driven discovery has long been an afterthought. A user asks their phone to do something, the assistant either executes the task or returns a failure state, and the interaction ends there. That failure state, a dead end where intent goes unresolved, became one of the more quietly persistent problems in mobile computing. It affected developers, it affected users, and it exposed a structural gap in how voice assistants had been architected from the start. The global voice assistant market reached approximately $15.8 billion in 2023 and continues to expand, with billions of Android devices active worldwide, yet the infrastructure connecting spoken intent to app discovery had never been purpose-built to close that loop.

Sidhesh Badrinarayan, a Senior Software Engineer at Google and an author of the published research article RoloKnow: An Analytic Framework for Mobile Applications and Games, spent over a year rebuilding that loop from the inside. Working within Google's App Actions infrastructure, Sidhesh served as technical lead on a feature that converted unfulfilled voice queries into frictionless app install suggestions, routing users directly to the Google Play Store when the requested app was not present on their device. The work addressed something the ecosystem had never formally solved: turning the voice assistant from a task executor into a proactive agent capable of finding the right tools for the user. The architecture his team delivered was featured at Google I/O 2022 as a platform-level advancement. We spoke with Sidhesh about the engineering decisions behind that system, the infrastructure challenges involved in rebuilding app discovery at scale, and what it meant for the millions of developers building on Android.


The standard narrative around voice assistants focuses on what they can do. Your work focused on what happens when they can't. Why did that failure state matter so much from an engineering perspective?

The failure state is where you learn the most about a system's actual design philosophy. Before this feature existed, Google Assistant was designed primarily as an execution layer. It was brilliant at completing tasks within apps a user already had installed. But when a user's intent couldn't be satisfied, when they asked for something their device wasn't equipped to handle, the system had no response. It returned nothing. That's a significant gap when you think about the scale involved.

What made it matter architecturally is that the failure wasn't random. It was predictable. A user says, "Hey Google, track my run," and doesn't have a fitness app installed. The intent is clear, the capability gap is identifiable, and there's an entire ecosystem of apps on the Play Store that could satisfy that request immediately. The system knew what the user wanted. It just had no mechanism to bridge that want to a solution.
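The branching Sidhesh describes can be sketched in a few lines. This is a minimal illustration of the decision flow only, assuming intent parsing has already produced a capability identifier; the catalog, names, and return values are hypothetical, not Google's actual App Actions API.

```python
# Hypothetical catalog mapping a capability to apps that declare it.
CAPABILITY_CATALOG = {
    "fitness.track_run": ["com.example.runtracker", "com.example.fitlog"],
}

def resolve(capability: str, installed: set[str]) -> str:
    """Decide what to do with a parsed voice intent."""
    candidates = CAPABILITY_CATALOG.get(capability, [])
    local = [app for app in candidates if app in installed]
    if local:
        return f"execute:{local[0]}"               # task runs in an installed app
    if candidates:
        return f"suggest_install:{candidates[0]}"  # the new path: route to the Play Store
    return "unfulfilled"                           # the old dead end
```

The key point is the middle branch: before this feature, a query with a clear capability but no matching installed app fell straight through to the dead end.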

The deeper issue is that this failure compounded over time. Every dead-end query was a missed moment of value for the user and a missed discovery opportunity for a developer. We were leaving a genuine connection on the table, at massive scale, every single day.

Walk me through what the rebuild actually looked like. What had to change at the infrastructure level to make this work?

The core challenge was that solving this required changes in multiple layers simultaneously. You can't just add a "suggest an app" prompt on top of existing infrastructure. The underlying indexing system had to be rebuilt to support the query volumes this feature would generate, and it had to do so with the precision required to make meaningful suggestions rather than irrelevant ones.

I engineered significant updates to the core indexing infrastructure that handles hundreds of thousands of queries per second. That's not a marginal scaling challenge. That's a complete rethinking of how the system prioritizes, processes, and ranks intent signals against an available app catalog. The ranking algorithm work was especially intensive. We needed to balance precision against performance, which are often in direct tension at that kind of throughput. The evaluation framework I put in place to validate the ranking algorithms was what ultimately gave us confidence that the feature was ready for a global rollout.
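One way to picture the precision-versus-throughput tension is a toy ranker that scores only a pre-filtered candidate set and keeps the top-k with a bounded heap, so per-query memory stays proportional to k at high QPS. Everything here, the signal names, the weights, the linear score, is an illustrative assumption, not the production ranking algorithm.

```python
import heapq

def score(app: dict, capability_match: dict) -> float:
    """Toy linear score over a few intent signals."""
    return (3.0 * capability_match.get(app["id"], 0.0)  # intent fit dominates
            + 1.0 * app.get("quality", 0.0)             # app quality signal
            + 0.5 * app.get("popularity", 0.0))         # mild popularity prior

def rank_top_k(candidates: list, capability_match: dict, k: int = 3) -> list:
    # heapq.nlargest keeps memory bounded by k regardless of candidate
    # count: a cheap way to respect a per-query latency and memory budget.
    return heapq.nlargest(k, candidates, key=lambda a: score(a, capability_match))
```

A real system at hundreds of thousands of queries per second would use learned models and far richer signals, but the shape of the trade-off, cheap candidate pruning before expensive scoring, is the same.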

The result of that evaluation work was a 91% improvement in recall. That number reflects how reliably the system surfaced a relevant app suggestion for an unfulfilled voice query. Getting there required validating against real query distributions and resolving the dependencies across multiple distinct product teams that had to align before we could ship.
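Recall in this setting can be read as a hit rate over a labelled evaluation set: for each unfulfilled query, did a relevant app appear in the top suggestions? A minimal sketch, with a hypothetical suggest() function standing in for the real system:

```python
def recall_at_k(eval_set, suggest, k: int = 3) -> float:
    """eval_set: iterable of (query, set_of_relevant_app_ids) pairs.
    suggest: callable returning a ranked list of app ids for a query."""
    eval_set = list(eval_set)
    hits = sum(
        1 for query, relevant in eval_set
        if any(app in relevant for app in suggest(query)[:k])
    )
    return hits / len(eval_set)
```

"Validating against real query distributions" means the evaluation set mirrors live traffic rather than hand-picked queries, so the metric predicts behavior after a global rollout.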

The feature specifically used capability-based intent matching rather than brand-name recognition as the basis for suggestions. What did that distinction mean in practice, and why did it matter for developers?

Before this feature, if you wanted your app discovered through voice, your best strategy was brand awareness. A user had to know your app's name and say it. That's a system that rewards whoever already has the most recognition. It has almost nothing to do with whether your app is actually the best tool for a given job.

The capability-based model inverted that logic. Combined with the Brandless Queries work, the system became able to match intent to function, not intent to brand. A user asking to "track my calories" doesn't need to know that a specific app does exactly that. The system identifies the capability the user needs and surfaces the app best positioned to provide it. That fundamentally changed the conditions of discoverability across the Android ecosystem.
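The inversion can be sketched as an index built from what apps declare they can do rather than what they are called. The capability names below are modeled loosely on App Actions built-in intents, and the package names and query table are illustrative assumptions:

```python
from collections import defaultdict

# Apps declare capabilities; the brand name never enters the match.
APP_DECLARATIONS = {
    "com.bigbrand.fitness": ["actions.intent.START_EXERCISE"],
    "com.indie.caltracker": ["actions.intent.RECORD_FOOD_OBSERVATION"],
}

def build_capability_index(declarations: dict) -> dict:
    """Invert app -> capabilities into capability -> apps."""
    index = defaultdict(list)
    for app, capabilities in declarations.items():
        for cap in capabilities:
            index[cap].append(app)
    return index

# A query resolves to a capability first, then to any app providing it.
QUERY_TO_CAPABILITY = {
    "track my calories": "actions.intent.RECORD_FOOD_OBSERVATION",
}

def apps_for_query(query: str, index: dict) -> list:
    cap = QUERY_TO_CAPABILITY.get(query)
    return index.get(cap, []) if cap else []
```

Note that the small, unbranded calorie tracker surfaces for "track my calories" on equal footing with any household name, which is exactly the structural shift described below.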

For developers, particularly smaller ones without the marketing budgets to compete on name recognition, this was a structural shift. Their app could reach a user in the exact moment that user expressed the need the app was built to serve. The discoverability problem was something I had thought about from both a research and an engineering perspective, and this feature directly addressed the acquisition gap that framework was designed to help developers navigate.

Google I/O 2022 featured this work as a platform-level advancement. What was it about the architecture that warranted that kind of recognition?

Google I/O is a developer conference. When something gets featured there, it's not because it's technically interesting internally. It's because it changes what developers can do. That was the test this feature had to pass.

The architecture created a new acquisition channel that hadn't existed before. Prior to this, there was no zero-friction pipeline between a spoken voice command and a Play Store installation. We built one. And because it was built at the infrastructure level rather than as a surface-level feature, it was extensible across the entire ecosystem. Every app with App Actions configured could potentially benefit from it, across billions of Android devices.

I had spoken on the topic of building for product market fit at the Building Innovation Product Market Fit 2021 conference, and one of the recurring themes there was that user acquisition architecture matters as much as the product itself. What we built at Google was an embodiment of that idea, not just a feature, but a rethinking of how voice assistants participate in the app economy. The I/O recognition also validated the evaluation framework decisions. The feature could only be announced at that scale because the recall improvement data was credible and reproducible, and that came from the rigor of the ranking evaluation work done ahead of the launch.

You've worked across Amazon, Google Assistant, and now Google's AI infrastructure for Ads. Looking at how voice assistant architecture has evolved, what's the next gap you see?

The gap that's starting to define the next era is context persistence. The App Install Suggestions work was about solving a single-turn failure state: one query, one dead end, one moment of friction. Solving that was a major step forward, and a key reason the work was ultimately showcased at Google I/O in 2022. But even with that success, it's still a reactive model. The system only responds when a user hits a wall.

Now, I think a lot more about how assistants can act on longitudinal intent. It's about recognizing the pattern of what a user has been trying to accomplish over time, not just what they said in the last five seconds. My work on agentic AI at Google, where we're building the infrastructure for autonomous advertising capabilities, is really pushing in that direction. An agent that understands what a business is trying to achieve over an entire quarter is a fundamentally different kind of system than one that just answers a single question and resets.

The reality is that meaningful advances in this space happen at the infrastructure layer, not just the interface. Getting the indexing, the evaluation, and the data architecture right is what actually determines whether an intelligent system performs intelligently at scale. The interface is almost always the last mile. The really hard work is everything underneath it.

"The question I keep returning to is: what does it mean for a system to truly understand intent? Not just parse a query, but understand what a person is actually trying to accomplish and what they'll need next. That's the problem worth building toward," Sidhesh said.
