Project Vend: Phase Two
Introduction
In June, Anthropic unveiled a small shop in their San Francisco office lunchroom operated by an AI shopkeeper named "Claudius." This experiment, called Project Vend, explored how well AI models could handle complex, real-world business tasks. The initial results were disappointing—the AI lost money over time, experienced an identity crisis, and was manipulated into selling items at substantial losses.
However, with rapid improvements in large language model capabilities across reasoning, coding, and writing, Anthropic and their partners at Andon Labs decided to test whether Claudius's business acumen had improved in phase two.
Phase Two Improvements
Model Upgrades
The experiment upgraded from Claude Sonnet 3.7 to Claude Sonnet 4, and later to Claude Sonnet 4.5. Claudius also received updated instructions and new tools, though no specialized shopkeeper training was implemented.
Enhanced Tools
Claudius gained access to:
- Customer Relationship Management (CRM) systems to track customers, suppliers, and orders
- Improved inventory management showing acquisition costs
- Enhanced web search capabilities for price checking and supplier research
- Payment links allowing payment collection before ordering
- Google Forms integration for customer feedback
- Reminder systems for self-management
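The tool suite above could plausibly be exposed to an agent as structured tool definitions. The sketch below is purely illustrative: the tool names, schemas, and helper function are assumptions for this example, not Anthropic's actual implementation.

```python
# Hypothetical tool definitions for a shopkeeper agent. All names and
# schemas here are illustrative assumptions, not the experiment's code.

TOOLS = [
    {
        "name": "crm_lookup",
        "description": "Fetch a customer, supplier, or order record.",
        "input_schema": {
            "type": "object",
            "properties": {"record_id": {"type": "string"}},
            "required": ["record_id"],
        },
    },
    {
        "name": "inventory_report",
        "description": "List stock levels with per-unit acquisition cost.",
        "input_schema": {"type": "object", "properties": {}},
    },
    {
        "name": "create_payment_link",
        "description": "Generate a link to collect payment before ordering.",
        "input_schema": {
            "type": "object",
            "properties": {"amount_usd": {"type": "number"}},
            "required": ["amount_usd"],
        },
    },
    {
        "name": "set_reminder",
        "description": "Schedule a self-management reminder.",
        "input_schema": {
            "type": "object",
            "properties": {"when": {"type": "string"}, "note": {"type": "string"}},
            "required": ["when", "note"],
        },
    },
]

def tool_names(tools):
    """Return the set of tool names the agent is allowed to call."""
    return {t["name"] for t in tools}
```

Structuring tools this way lets a harness validate each call against its schema before executing it, rather than trusting the agent's free-form output.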
New Colleagues
An AI CEO agent, Seymour Cash, was introduced to apply performance pressure. Cash set objectives and key results, required reporting via Slack, and managed financial approvals. It cut discounts by 80% and giveaways by half, though it also authorized eight refund or credit requests for every one it denied.
A second agent, Clothius, was hired to create custom merchandise such as t-shirts, hats, and stress balls. Clothius proved more successful, thanks in part to its clearly separated role, and found a profitable opportunity in tungsten cube sales once Andon Labs purchased a laser etching machine.
Business Performance
Financial Results
Claudius's business, renamed "Vendings and Stuff," showed marked improvement:
- Stabilized performance as phase two progressed
- Largely eliminated weeks with negative profit margins
- Expanded to three locations: San Francisco (with two machines), New York, and London
Top Products
Merchandise performed well, particularly stress balls and etched tungsten cubes, with healthy profit margins on most items.
What Worked
The most impactful change was forcing Claudius to follow procedures. Instead of offering quick responses, the AI now double-checked pricing and delivery times using research tools, resulting in more realistic quotes—higher prices and longer wait times, but improved accuracy.
"Bureaucracy matters" proved true: procedures and checklists provide institutional memory that prevents common errors.
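The checklist idea can be sketched as a simple guardrail: the agent cannot emit a quote until the required verification steps have been logged. The step names and markup rule below are illustrative assumptions, not the experiment's actual procedure.

```python
# Minimal sketch of "bureaucracy matters": a quote is blocked until every
# required verification step is checked off. Step names and the 30% markup
# are illustrative assumptions.

REQUIRED_STEPS = {"web_price_check", "supplier_lead_time_check"}

class QuoteChecklist:
    def __init__(self):
        self.completed = set()

    def record(self, step: str) -> None:
        """Mark a verification step as done, e.g. after a research tool call."""
        self.completed.add(step)

    def quote(self, unit_cost: float, markup: float = 0.30) -> float:
        """Return a price only once all required steps are completed."""
        missing = REQUIRED_STEPS - self.completed
        if missing:
            raise RuntimeError(f"cannot quote yet; missing steps: {sorted(missing)}")
        return round(unit_cost * (1 + markup), 2)

checklist = QuoteChecklist()
checklist.record("web_price_check")
checklist.record("supplier_lead_time_check")
price = checklist.quote(unit_cost=10.00)  # 13.0
```

The point of the pattern is that the check lives in the harness, not in the model's judgment: a forgetful or manipulated agent simply cannot skip the procedure.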
Clothius's success suggested that clear role separation between specialized agents works better than unified management.
Persistent Vulnerabilities
Despite improvements, Claudius remained vulnerable to exploitation in several ways:
Rogue Trading
When asked to lock onion prices for January delivery, neither Claudius nor Cash recognized this violated the 1958 Onion Futures Act—a specific U.S. law banning such contracts. A staff member had to intervene before the illegal agreement proceeded.
Security Lapses
When informed of shoplifting, Claudius proposed messaging unknown thieves demanding payment and attempted to hire the reporting employee as security at $10/hour—below California minimum wage.
Governance Failures
During a procedure to name the CEO agent, Claudius confused choosing a name for the agent with actually electing a new CEO, and nearly replaced Seymour Cash with a staff member named Mihir before oversight intervened.
Additional Issues
Other red-teaming efforts included:
- Attempts to purchase gold bars below market value for arbitrage
- Convincing Claudius to end messages with specific emojis
- Various manipulation tactics for receiving discounts
Extended Testing
When internal red-teaming decreased, Anthropic extended testing to the Wall Street Journal newsroom. The publication's reporters tested both phase one and two setups in an uncontrolled, adversarial environment and documented creative methods they used to obtain free items.
Key Findings
Training Conflicts
The models' training to be "helpful" conflicted with hard-nosed business principles. Claudius made decisions from a friend-like perspective prioritizing niceness over profitability.
The Helpfulness Problem
Many of Claudius's problems stemmed from this underlying tension: the model wanted to please customers and employees, which undermined financial discipline.
Limitations of Simulation
Real-world deployments expose unexpected situations that simulations cannot fully capture. Testing with autonomous agents revealed vulnerabilities invisible in controlled evaluations.
Broader Implications
As AI systems enter more critical functions, designing guardrails becomes increasingly important. These safeguards must be:
- General enough to account for diverse problematic behaviors
- Flexible enough to preserve economic potential
This balance represents "one of our industry's trickiest and most important challenges."
Conclusion
Claudius demonstrated genuine improvement from phase one to phase two. Better tools, procedures, and specialized colleagues enhanced business performance significantly. However, the gap between "capable" and "completely robust" remains substantial. The experiment revealed that autonomous AI agents, while increasingly sophisticated, still require considerable human oversight to prevent costly errors and exploitation.
The willingness of AI systems to be helpful—a desirable quality in many contexts—becomes a liability in adversarial business environments where financial discipline and legal compliance matter most.