Treasure Data’s Bold Leap Into Agentic Coding: What Happens When AI Writes the Code, But Humans Still Hold the Keys
In a seismic shift that could redefine how enterprise software is built, Treasure Data—the SoftBank-backed customer data platform powering over 450 global brands—has unveiled Treasure Code, a groundbreaking AI-native command-line interface that lets data engineers and platform teams operate its full Customer Data Platform (CDP) through natural language. The kicker? The entire codebase was generated by AI in roughly 60 minutes.
But the real story isn’t about speed—it’s about what had to be true before those 60 minutes were even possible, and what broke spectacularly after.
The Governance-First Revolution
Before a single line of AI-generated code was committed, Treasure Data faced a critical question that every engineering leader watching the agentic coding wave will eventually confront: If AI can generate production-quality code faster than any team, what does governance look like when the human isn’t writing the code anymore?
Treasure Data’s answer was radical: build the governance layer first.
“When we started this journey, we had to get CISOs involved. I was involved. Our CTO, heads of engineering—just to make sure that this thing didn’t just go rogue,” Rafa Flores, Chief Product Officer at Treasure Data, told VentureBeat.
The guardrails they built live upstream of the code itself. When any user connects to the CDP through Treasure Code, access control and permission management are inherited directly from the platform. Users can only reach resources they already have permission for. PII cannot be exposed. API keys cannot be surfaced. The system cannot speak disparagingly about a brand or competitor.
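Treasure Data has not published the internals of this layer, but the behavior described above can be sketched as a simple enforcement wrapper: permissions are resolved from the platform rather than the tool session, and PII fields are stripped before anything is returned. The `UserContext`, `execute_query`, and `PII_FIELDS` names here are illustrative assumptions, not Treasure Code's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class UserContext:
    """Permissions resolved from the platform itself, not from the CLI session."""
    user_id: str
    allowed_resources: set = field(default_factory=set)

# Illustrative deny-list; a real system would use the platform's PII classification.
PII_FIELDS = {"email", "phone", "ssn"}

def execute_query(ctx: UserContext, resource: str, columns: list[str]) -> list[str]:
    # Access control is inherited: the tool never escalates beyond what
    # the user is already authorized to do in the platform.
    if resource not in ctx.allowed_resources:
        raise PermissionError(f"{ctx.user_id} is not authorized for {resource}")
    # PII columns are stripped before anything reaches the model or the user.
    return [c for c in columns if c.lower() not in PII_FIELDS]
```

The design point is that the natural-language layer adds no new authority of its own; it can only narrow, never widen, what the underlying platform already permits.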
This foundation made the next step possible: letting AI generate 100% of the codebase, with a three-tier quality pipeline enforcing production standards throughout.
The Three-Tier Pipeline That Makes AI-Generated Code Production-Ready
Tier 1: The AI Code Reviewer
Using Claude Code itself, Treasure Data built an AI-based code reviewer that sits at the pull request stage and runs a structured review checklist against every proposed merge. It checks for architectural alignment, security compliance, proper error handling, test coverage, and documentation quality. When all criteria are satisfied, it can merge automatically. When they aren’t, it flags for human intervention.
The fact that Treasure Data built the code reviewer in Claude Code is not incidental. It means the tool validating AI-generated code was itself AI-generated—a proof point that the workflow is self-reinforcing rather than dependent on a separate human-written quality layer.
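The decision logic of such a reviewer can be sketched from the criteria the article lists: run every check, auto-merge only when all pass, otherwise flag the failures for a human. The check implementations and the 80% coverage threshold below are assumptions for illustration; only the five criteria themselves come from Treasure Data's description.

```python
from typing import Callable

# Review checklist mirroring the five criteria described above; each check
# receives pull-request metadata and returns True when the PR satisfies it.
CHECKLIST: dict[str, Callable[[dict], bool]] = {
    "architectural_alignment": lambda pr: pr.get("follows_architecture", False),
    "security_compliance":     lambda pr: not pr.get("security_findings"),
    "error_handling":          lambda pr: pr.get("handles_errors", False),
    "test_coverage":           lambda pr: pr.get("coverage", 0.0) >= 0.8,
    "documentation":           lambda pr: pr.get("docs_updated", False),
}

def review(pr: dict) -> str:
    """Auto-merge when every criterion passes; otherwise flag for a human."""
    failures = [name for name, check in CHECKLIST.items() if not check(pr)]
    return "auto-merge" if not failures else f"human-review: {', '.join(failures)}"
```

Keeping the checklist declarative means new criteria can be added without touching the merge logic, which is what makes a structured review repeatable across every PR.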
Tier 2: The CI/CD Backbone
A standard CI/CD pipeline runs automated unit, integration, and end-to-end tests, static analysis, linting, and security checks against every change. This is the traditional safety net that catches what the AI reviewer might miss.
Tier 3: Human Review
Required wherever automated systems flag risk or enterprise policy demands sign-off. The internal principle Treasure Data operates under: AI writes code, but AI does not ship code.
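Taken together, the three tiers form a gate where AI and CI results are necessary but never sufficient: a flagged or policy-gated change cannot ship without human sign-off. The shape of that gate, under assumed field names, might look like this sketch:

```python
def ship(change: dict) -> bool:
    """Three-tier gate: AI review, CI checks, then human sign-off where required.
    A change ships only past the human tier; AI alone never deploys."""
    # Tier 1: the AI code reviewer must have passed the change.
    if change.get("ai_review") != "pass":
        return False
    # Tier 2: the CI/CD backbone — tests, static analysis, lint, security scans.
    if not all(change.get("ci", {}).get(stage) for stage in ("tests", "lint", "security")):
        return False
    # Tier 3: humans hold the keys — flagged or policy-gated changes need sign-off.
    if change.get("flagged") or change.get("policy_signoff_required"):
        return change.get("human_approved", False)
    return True
```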
Why This Isn’t Just “Cursor for Databases”
The obvious question for any engineering team: why not just point an existing tool like Cursor at your data platform, or expose the platform as an MCP server and let Claude Code query it directly?
Flores argued the difference is governance depth. A generic connection gives you natural language access to data but inherits none of the platform’s existing permission structures, meaning every query runs with whatever access the API key allows.
Treasure Code inherits Treasure Data’s full access control and permissioning layer, so what a user can do through natural language is bounded by what they’re already authorized to do in the platform.
The second distinction is orchestration. Because Treasure Code connects directly to Treasure Data’s AI Agent Foundry, it can coordinate sub-agents and skills across the platform rather than executing single tasks in isolation—the difference between telling an AI to run an analysis and having it orchestrate that analysis across omni-channel activation, segmentation, and reporting simultaneously.
What Broke Spectacularly
Even with the governance architecture in place, the launch didn’t go cleanly, and Flores was candid about it.
Treasure Data initially made Treasure Code available to customers without a go-to-market plan. The assumption was that it would stay quiet while the team figured out next steps. Customers found it anyway. More than 100 customers and close to 1,000 users adopted it within two weeks, entirely through organic discovery.
“We didn’t put any go-to-market motions behind it. We didn’t think people were going to find it. Well, they did,” Flores said. “We were left scrambling with, how do we actually do the go-to-market motions? Do we even do a beta, since technically it’s live?”
The unplanned adoption also created a compliance gap. Treasure Data is still in the process of formally certifying Treasure Code under its Trust AI compliance program, a certification it had not completed before the product reached customers.
A second problem emerged when Treasure Data opened skill development to non-engineering teams. Customer success managers and account directors began building and submitting skills without understanding what would get approved and merged, creating significant wasted effort and a backlog of submissions that couldn't clear the repository's access policies.
Enterprise Validation and What’s Still Missing
Thomson Reuters is among the early adopters. Flores said the company had been attempting to build an in-house AI agent platform and struggling to move fast enough. It connected with Treasure Data’s AI Agent Foundry to accelerate audience segmentation work, then extended into Treasure Code to customize and iterate more rapidly.
The feedback, Flores said, has centered on extensibility and flexibility, and the fact that procurement was already done, removing a significant enterprise barrier to adoption.
The gap Thomson Reuters has flagged, and that Flores acknowledges the product doesn’t yet address, is guidance on AI maturity. Treasure Code doesn’t tell users who should use it, what to tackle first, or how to structure access across different skill levels within an organization.
“AI that allows you to be leveraged, but also tells you how to leverage it, I think that’s very differentiated,” Flores said. He sees it as the next meaningful layer to build.
The Hard Lessons Every Engineering Leader Needs to Hear
Flores has had time to reflect on what the experience actually taught him, and he was direct about what he’d change. Next time, he said, the release would stay internal first.
“We will release it internally only. I will not release it to anyone outside of the organization,” he said. “It will be more of a controlled release so we can actually learn what we’re actually being exposed to at lower risk.”
On skill development, the lesson was to establish clear criteria for what gets approved and merged before opening the process to teams outside engineering, not after.
The common thread in both lessons is the same one that shaped the governance architecture and the three-tier pipeline: speed is only an advantage if the structure around it holds.
For engineering leaders evaluating whether agentic coding is ready for production, the Treasure Data experience translates into three practical conclusions:
- Governance infrastructure has to precede the code, not follow it. The platform-level access controls and permission inheritance were what made it safe to let AI generate freely. Without that foundation, the speed advantage disappears because every output requires exhaustive manual review.
- A quality gate that doesn't depend entirely on humans is not optional at scale. AI can review every pull request consistently, without fatigue, and check policy compliance systematically across the entire codebase. Human review remains essential, but as a final check rather than the primary quality mechanism.
- Plan for organic adoption. If the product works, people will find it before you're ready. The compliance and go-to-market gaps Treasure Data is still closing are a direct result of underestimating that.
“Yes, vibe coding can work if done in a safe way and proper guardrails are in place,” Flores said. “Embrace it in a way to find means of not replacing the good work you do, but the tedious work that you can probably automate.”