Franny Hsiao, Salesforce: Scaling enterprise AI

Enterprise AI Scaling: Why Most Pilots Fail and How to Fix It

The harsh reality of enterprise AI adoption is that most promising pilots never make it to production. While generative AI prototypes can be spun up in days, the journey from experimental showcase to reliable business asset is fraught with challenges that go far beyond model selection. At the heart of this problem lies a fundamental misunderstanding of what it takes to scale AI in real-world enterprise environments.

Franny Hsiao, EMEA Leader of AI Architects at Salesforce, has seen firsthand why so many AI initiatives hit a wall. The culprit isn’t the sophistication of the models or the creativity of the use cases—it’s the failure to architect production-grade data infrastructure with built-in end-to-end governance from day one.

“The single most common architectural oversight that prevents AI pilots from scaling is the failure to architect a production-grade data infrastructure with built-in end-to-end governance from the start,” Hsiao explains. This insight cuts to the core of why enterprises struggle with AI adoption at scale.

The ‘Pristine Island’ Problem That Kills AI Projects

Most AI pilots begin in what Hsiao calls “pristine islands”—controlled environments using small, curated datasets and simplified workflows. This approach creates a false sense of security. Teams celebrate successful demos, only to watch their systems crumble when faced with the messy reality of enterprise data: complex integration requirements, data normalization challenges, and the sheer volume and variability of real-world information.

When companies attempt to scale these island-based pilots without addressing the underlying data mess, the systems break spectacularly. Hsiao warns that “the resulting data gaps and performance issues like inference latency render the AI systems unusable—and, more importantly, untrustworthy.”

The companies successfully bridging this gap are those that “bake end-to-end observability and guardrails into the entire lifecycle.” This approach provides “visibility and control into how effective the AI systems are and how users are adopting the new technology.”

Engineering for Perceived Responsiveness

As enterprises deploy large reasoning models like Salesforce’s ‘Atlas Reasoning Engine,’ they face a critical trade-off between model depth and user patience. Heavy computation creates latency that kills user adoption. The solution isn’t just about raw speed—it’s about managing perception.

Salesforce addresses this through “Agentforce Streaming,” which delivers AI-generated responses progressively while the reasoning engine performs heavy computation in the background. “It’s an incredibly effective approach for reducing perceived latency, which often stalls production AI,” Hsiao notes.

Transparency serves a dual purpose here. By surfacing progress indicators that show reasoning steps or tools being used, along with visual elements like spinners and progress bars, Salesforce doesn’t just keep users engaged—it builds trust. This visibility, combined with strategic model selection and explicit length constraints, ensures the system feels deliberate and responsive rather than sluggish.
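The pattern described above can be sketched as a generator that surfaces progress events for each reasoning step before streaming the answer in small pieces. This is an illustrative sketch only, not the Agentforce Streaming API; the event names and shapes are hypothetical.

```python
from typing import Iterator

def stream_response(prompt: str) -> Iterator[dict]:
    """Progressive delivery sketch: emit reasoning-step indicators first,
    then stream the answer token-by-token, so the UI always shows activity
    while heavy computation runs. All names here are hypothetical."""
    # 1. Emit a progress event as each (pretend) reasoning step begins;
    #    the UI renders these as spinners or step labels.
    for step in ("planning", "retrieving_context", "calling_tool"):
        yield {"type": "progress", "step": step}
    # 2. Stand-in for the reasoning engine producing a full answer.
    answer = f"Here is a summary for: {prompt}"
    # 3. Stream the answer in chunks instead of waiting for the whole text.
    for token in answer.split():
        yield {"type": "token", "text": token}

# A client would render progress events live and append tokens as they arrive.
events = list(stream_response("quarterly pipeline"))
progress = [e for e in events if e["type"] == "progress"]
tokens = [e["text"] for e in events if e["type"] == "token"]
```

The point is perceived latency: the user sees the first event almost immediately, even though the complete answer takes just as long to produce.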

Offline Intelligence at the Edge

For industries with field operations—utilities, logistics, manufacturing—continuous cloud connectivity is often impossible or impractical. “For many of our enterprise customers, the biggest practical driver is offline functionality,” states Hsiao.

The shift toward on-device intelligence is particularly crucial in field services, where workflows must continue regardless of signal strength. A technician can photograph a faulty part, error code, or serial number while completely offline. An on-device LLM can then identify the asset or error and provide guided troubleshooting steps from a cached knowledge base—instantly.

Once connectivity returns, the system handles the “heavy lifting” of syncing that data back to the cloud to maintain a single source of truth. “This ensures that work gets done, even in the most disconnected environments,” Hsiao emphasizes.
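The offline-first flow above follows a familiar pattern: capture work in a local queue while disconnected, then replay it to the cloud once a connection is available. The sketch below is a minimal, hypothetical illustration of that pattern, not Salesforce's sync implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class OfflineWorkQueue:
    """Toy offline-first queue: field captures (a photo of a part, an error
    code) are stored locally first, then synced to the cloud, which remains
    the single source of truth."""
    pending: List[dict] = field(default_factory=list)
    synced: List[dict] = field(default_factory=list)

    def record(self, item: dict) -> None:
        # Work is always recorded locally, regardless of connectivity.
        self.pending.append(item)

    def sync(self, online: bool) -> int:
        # When a connection returns, push everything pending to the cloud.
        if not online:
            return 0
        count = len(self.pending)
        self.synced.extend(self.pending)   # stand-in for a cloud API call
        self.pending.clear()
        return count

queue = OfflineWorkQueue()
queue.record({"asset": "pump-7", "error_code": "E42"})
queue.sync(online=False)        # still offline: nothing leaves the device
synced_now = queue.sync(online=True)
```

A production version would add conflict resolution and retry logic, but the core guarantee is the same: work gets done offline and reconciles later.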

The benefits driving continued innovation in edge AI are compelling: ultra-low latency, enhanced privacy and data security, energy efficiency, and significant cost savings compared to constant cloud processing.

High-Stakes Gateways: When Humans Must Intervene

Autonomous agents are not set-and-forget tools. When scaling enterprise AI deployments, governance requires defining exactly when human verification is mandatory. Hsiao describes this not as dependency but as “architecting for accountability and continuous learning.”

Salesforce mandates a “human-in-the-loop” approach for specific areas Hsiao calls “high-stakes gateways.” These include any “CUD” action (creating, updating, or deleting records), verified customer-contact actions, and any action that could be exploited through prompt manipulation.
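A gateway policy like this can be expressed as a small predicate that routes risky actions to a human before execution. The sketch below is illustrative only, with hypothetical action names; it is not Salesforce's implementation.

```python
# Actions that write data ("CUD") or contact a customer are gated by default.
HIGH_STAKES_ACTIONS = {"create", "update", "delete", "contact_customer"}

def requires_human_approval(action: str, flagged_for_prompt_injection: bool = False) -> bool:
    """Return True when the agent must pause for human sign-off."""
    if flagged_for_prompt_injection:
        # Anything suspected of being steered by prompt manipulation is gated,
        # regardless of the action type.
        return True
    return action in HIGH_STAKES_ACTIONS

requires_human_approval("delete")            # CUD action: gated
requires_human_approval("summarise_case")    # read-only: runs autonomously
```

Keeping the policy explicit and centralised makes it auditable, which is what turns "human-in-the-loop" from a slogan into an enforceable control.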

This structure creates a feedback loop where “agents learn from human expertise,” yielding a system of “collaborative intelligence” rather than unchecked automation. Trusting an agent requires seeing its work, which is why Salesforce built a “Session Tracing Data Model (STDM)” to provide granular visibility into the agent’s logic.

The STDM captures “turn-by-turn logs” that offer comprehensive insight into every interaction, including user questions, planner steps, tool calls, inputs/outputs, retrieved chunks, responses, timing, and errors. This data enables organizations to run ‘Agent Analytics’ for adoption metrics, ‘Agent Optimisation’ to drill down into performance, and ‘Health Monitoring’ for uptime and latency tracking.
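One way to picture such a turn-by-turn log entry is as a record carrying the fields the article lists. The dataclass below is a hypothetical shape for illustration; the real STDM schema will differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TurnTrace:
    """Illustrative shape of one turn-by-turn trace record: the user's
    question, what the planner did, every tool call with its inputs and
    outputs, retrieved context, the response, timing, and any error."""
    user_question: str
    planner_steps: List[str] = field(default_factory=list)
    tool_calls: List[dict] = field(default_factory=list)       # inputs/outputs per call
    retrieved_chunks: List[str] = field(default_factory=list)
    response: str = ""
    latency_ms: Optional[float] = None
    error: Optional[str] = None

trace = TurnTrace(
    user_question="Why did order 123 fail?",
    planner_steps=["lookup_order", "summarise"],
    tool_calls=[{"name": "get_order", "input": {"id": 123}, "output": {"status": "failed"}}],
    response="Order 123 failed at payment.",
    latency_ms=840.0,
)
# Analytics jobs aggregate records like this for adoption, performance,
# and health metrics across sessions.
```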

Standardising Agent Communication

As businesses deploy agents from different vendors, these systems need a shared protocol to collaborate effectively. “For multi-agent orchestration to work, agents can’t exist in a vacuum; they need a common language,” argues Hsiao.

Salesforce is adopting open-source standards like MCP (Model Context Protocol) and A2A (Agent to Agent Protocol) for orchestration. “We believe open source standards are non-negotiable; they prevent vendor lock-in, enable interoperability, and accelerate innovation.”

However, communication is useless if agents interpret data differently. To solve fragmented data semantics, Salesforce co-founded OSI (Open Semantic Interchange) to unify meaning so an agent in one system “truly understands the intent of an agent in another.”

The Future Enterprise AI Scaling Bottleneck: Agent-Ready Data

Looking forward, the challenge will shift from model capability to data accessibility. Many organizations still struggle with legacy, fragmented infrastructure where “searchability and reusability” remain difficult.

Hsiao predicts the next major hurdle—and solution—will be making enterprise data “agent-ready” through searchable, context-aware architectures that replace traditional, rigid ETL pipelines. This shift is necessary to enable “hyper-personalised and transformed user experience because agents can always access the right context.”
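To make the contrast with rigid ETL concrete, here is a toy stand-in for "agent-ready" data access: score stored chunks against the agent's query at request time rather than pre-shaping data into fixed pipelines. A real system would use embeddings and metadata-aware indexes; this word-overlap scorer is purely illustrative.

```python
def retrieve_context(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Toy context retrieval: rank chunks by word overlap with the query,
    so agents pull the right context on demand instead of relying on a
    rigid, pre-built pipeline. Purely a sketch, not a production retriever."""
    q = set(query.lower().split())
    # Higher overlap with the query words means a more relevant chunk.
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:top_k]

docs = [
    "refund policy for enterprise contracts",
    "holiday schedule for the Dublin office",
    "refund processing steps for failed payments",
]
retrieve_context("refund steps for failed payments", docs)
```

The design point is that relevance is computed per query, so new questions do not require new pipelines.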

“Ultimately, the next year isn’t about the race for bigger, newer models; it’s about building the orchestration and data infrastructure that allows production-grade agentic systems to thrive,” Hsiao concludes.
