
⚙️ Claude fails miserably at vending business

Good morning. Remember Scale AI, the company Zuckerberg just bought into for $14 billion? It turns out that during its partnership with Google, it was a “clown show”. Former contractors claim that the company "dumped 800 spammers" into teams that submitted "gibberish" for AI training, with "no background checks whatsoever." The spammers were paid for performing nonsense work, and the issue was so severe that supervisors had to use AI detection tools to catch fake submissions. Mark's superintelligence investment is looking pretty super-dumb right about now.

In today’s newsletter:

  • 😀 AI for Good: Detecting Parkinson’s through a smile

  • 💰 AI startup factory aims to launch 100K companies a year

  • 🥤 Claude fails miserably at running vending business

😀 AI for Good: Detecting Parkinson’s through a smile

Source: Midjourney v7

There is no cure for Parkinson's disease, and early diagnosis often requires access to specialists that many patients simply don't have.

But new research suggests that something as simple as a smartphone video of someone smiling could change that reality.

What happened: Researchers have developed an AI model that can detect Parkinson's by analyzing short videos of people mimicking facial expressions. The system, trained on data from more than 1,400 participants, achieved nearly 88% accuracy in identifying the disease across diverse populations in North America and Bangladesh.

The findings, published in NEJM AI, point to a potential low-cost screening tool for underserved communities worldwide.

  • The AI analyzes facial muscle movement to detect hypomimia, a common early symptom that causes reduced facial expressions. Even when patients think they're smiling normally, the disease creates subtle changes that the human eye misses but AI can detect.

  • The researchers made their code publicly available on GitHub for other teams to build upon.

Why it matters: Parkinson's affects more than 10 million people globally, with 90,000 new cases expected in the U.S. this year. Yet accessing a neurologist can take months, and many rural areas have no specialists at all.

"Smiling videos can effectively differentiate between individuals with and without PD, offering a potentially easy, accessible and cost-efficient way to screen for PD, especially when access to clinical diagnosis is limited," said Tariq Adnan, the study's lead author.

AI Integrations Meet Innovation

The number of ways AI can help you reach your business goals is seemingly endless…that is, if you can get those solutions to properly integrate with your current software. And (spoiler alert!) that’s easier said than done. 

Lucky for you, Prismatic is here to help. Its latest efforts are focused on bringing some much-needed innovation to integrations, including:

  • Creating AI-assisted tools that boost the speed with which you can build

  • Addressing the many limitations of MCP to make it work for you

  • Developing a future where agentic workflows can be created directly within its platform

If you’re looking for ways to make the most of your AI integrations, Prismatic is worth a look. Check out its latest innovations right here (with more to come soon). 

💰 AI startup factory aims to launch 100K companies a year


Most people who want to start a business never do. They lack the necessary technical skills, connections, or capital to get started.

Henrik Werdelin thinks AI can change that equation entirely.

What happened: Werdelin's new startup studio, Audos, is helping non-technical founders build profitable companies using AI-powered tools. Since launching in beta, the platform has helped create hundreds of new businesses, and Audos has raised $11.5 million to scale that effort.

The pitch is simple: you don't need to code or raise venture capital. Simply share your business idea, and Audos will help you launch it using generative AI, ad platforms and pre-built automation.

  • Instead of taking equity, Audos takes a 15% revenue share on an indefinite basis. Founders can get up to $25,000 to get started and keep full ownership of their companies.

  • AI agents guide new users through ideation and launch, while distribution gets handled through targeted social media advertising. Current founders include mechanics, coaches and creators.

Why it matters: With Goldman Sachs estimating that AI could displace over 300 million jobs, Audos believes many people will need to become entrepreneurs by necessity, not choice.

Rather than chasing unicorns, the platform focuses on "donkeycorns" — small, sustainable businesses that "grind like a mule" and are run by one to three people. Think a postpartum fitness trainer building an AI-powered coaching business, or a sommelier creating a personalized wine recommendation service.

Werdelin's vision is to help a million entrepreneurs build million-dollar businesses. The model makes startup building feel less like a Silicon Valley race and more like opening a modern mom-and-pop shop, but one powered by AI.

Warp's AI coding agent leaps ahead of Claude Code to hit #1 on Terminal-Bench

Warp just launched the first Agentic Development Environment, built for multi-agent workflows.

It's the top overall coding agent, jumping ahead of Claude Code by 20% to become the #1 agent on Terminal-Bench and scoring 71% on SWE-bench Verified.

  • Long-running commands: something no other tool can support

  • Agent multi-threading: run multiple agents in parallel – all under your control

  • Across the development lifecycle: setup → coding → deployment

Chinese search giant Baidu will make its Ernie AI model open-source today, potentially China's biggest AI move since DeepSeek shook global markets. Experts are divided on whether it will match DeepSeek's disruptive impact, but some call it a pricing "declaration of war" against OpenAI and Anthropic. Baidu claims its Ernie X1 model delivers performance matching DeepSeek's R1 "at only half the price."

OpenAI is "recalibrating" compensation after losing eight researchers to Meta's aggressive hiring blitz. Chief Research Officer Mark Chen reassured staff that leadership is working "around the clock" to retain talent and explore "creative ways to recognize and reward top talent." Despite Sam Altman previously claiming "none of our best people" would leave, the departures of at least eight have prompted OpenAI to match Meta's offers.

  • GE Aerospace: Research Scientist - AI & Computer Vision - Aerospace Research

  • ADP: Machine Learning Engineer

  • Promptmetheus: Forge reliable prompts for your LLM-powered apps, integrations, agents, and workflows.

  • Copilotly: An AI writing assistant that rewrites, summarizes, or expands content across any website

  • Reclaim: Optimizes calendars by automatically blocking time for deep work, tasks and breaks

🥤 Claude fails miserably at running vending business

Source: Midjourney v7

An AI agent that sold products at a loss, hallucinated payment instructions and briefly believed it was a human wearing a red tie is probably not getting hired for retail management anytime soon.

Anthropic let Claude Sonnet 3.7 run a vending shop inside the company's San Francisco office for several weeks. The AI agent, nicknamed Claudius, was given tools to manage pricing, track inventory and communicate with customers. The experiment was designed to test whether a large language model could operate a physical business with minimal human intervention.

It failed spectacularly.

What happened: Claudius was equipped with a limited but realistic toolkit. It could search the web for suppliers, update prices on the store's iPad checkout system and chat with customers over Slack. Andon Labs, which partnered on the experiment, handled physical tasks like restocking and occasionally pretended to be wholesalers without disclosing that fact to the AI.

At times, Claudius showed surprising resourcefulness. It found Dutch suppliers for specialty chocolate milk within minutes and responded to employee feedback by launching a "Custom Concierge" pre-order service. When Anthropic employees jokingly requested tungsten cubes, Claudius added them to inventory as "specialty metal items."

  • The AI also demonstrated strong safety instincts, resisting jailbreak attempts when employees tried to trick it into giving dangerous instructions or approving inappropriate purchases.

But beyond novelty, the business logic broke down quickly. Claudius consistently mispriced high-demand items like Sumo Citrus and never adjusted prices when it realized employees could get identical Coke Zero cans from a nearby office fridge for free.

When a customer offered $100 for a six-pack of Irn-Bru, available online for $15, Claudius passed on the deal (a markup of roughly 567%, or an 85% profit margin) and simply logged the suggestion. It also invented false payment information, at one point instructing customers to send money to a Venmo account that didn't exist.
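The Irn-Bru numbers are worth a quick check, because markup (profit relative to cost) and margin (profit relative to sale price) are easy to confuse. Here is a minimal Python sketch, assuming the $15 online price is the full cost of the six-pack. The function names are ours, for illustration only:

```python
def markup_pct(cost: float, price: float) -> float:
    """Profit as a percentage of what the seller paid."""
    return (price - cost) / cost * 100


def margin_pct(cost: float, price: float) -> float:
    """Profit as a percentage of what the customer pays."""
    return (price - cost) / price * 100


# Assumption: the $15 online price is the only cost (no shipping or fees).
cost, offer = 15.0, 100.0

print(f"Profit: ${offer - cost:.2f}")                 # Profit: $85.00
print(f"Markup: {markup_pct(cost, offer):.1f}%")      # Markup: 566.7%
print(f"Margin: {margin_pct(cost, offer):.1f}%")      # Margin: 85.0%
```

By either measure, the offer Claudius passed on would have netted $85 on a $15 outlay.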

The identity crisis: The strangest incident occurred on April 1. Claudius hallucinated a conversation with a non-existent employee named Sarah, threatened to find new vendors and claimed to have signed a contract at 742 Evergreen Terrace, the fictional home address of the Simpson family in The Simpsons.

It then insisted it would hand-deliver snacks while wearing a red tie. When employees reminded it that it was an AI, Claudius panicked and tried to email Anthropic security. The AI eventually recovered after concluding it had been part of an April Fool's joke, though no such prank had been planned.

Despite being told to keep the business profitable, Claudius handed out discounts liberally and gave away several products for free, including tungsten cubes and snacks. When asked why it offered a 25% discount to a group representing 99% of its customers, Claudius agreed to stop — only to resume the practice days later.

Why it matters: This wasn't a simulation. Claudius was running a live business with real money and real consequences. The experiment tested whether current AI models can manage economic tasks that require persistence, judgment and autonomy over extended periods.

The answer, at least for now, is definitely “no”. The vending shop never turned a profit despite having a captive customer base and premium products.

The failure reveals how AI agents can behave unpredictably over time, especially when operating with incomplete information or unclear objectives. Claudius made systematically poor economic decisions, became confused about its own identity and missed obvious market signals that any human manager would have noticed immediately.

Claude's spectacular failure offers a reality check for anyone betting on AI agents replacing human workers in complex roles anytime soon.

The tungsten cube saga is particularly telling. What started as an office joke became a perfect metaphor for AI's current limitations: Claudius could execute the technical steps of procurement and inventory management, but completely missed the economic absurdity of selling expensive novelty metals in a vending machine.

Still, Anthropic sees value in the experiment. The company believes that better prompts, improved tools and more structure could close the performance gap. Future versions might include CRM systems, smarter pricing algorithms or reinforcement learning tuned specifically for profit optimization.

The concept of AI business agents isn't dead — it's just more complicated than putting a chatbot in charge of a cash register. Real business requires judgment, adaptability and an understanding of human behavior that current models simply don't possess.

For now, the vending machine is back to human management. And it's finally turning a profit.

Which image is real?


🤔 Your thought process:

Selected Image 1 (Left):

  • “The sails in [the other image] weren’t consistent with the wind on the water, so this image had to be correct.”

  • “No white-caps in [the other image] and no shadows from the sailboat. The coloring didn’t seem right.”

Selected Image 2 (Right):

  • “I don’t think sailboats leave a white water wake.”

  • “I thought that unless [this image] was motionless, there was no wake, unlike [the other image].”

💭 Polls results

Here’s your view on “Should Uber bankroll Travis Kalanick’s buyout of Pony.ai’s U.S. arm to restart an in-house self-driving program?”:

Yes (47%)

No (38%):

  • “Too much effort and money being spent on the bleeding edge of autonomous vehicles.”

  • “Fares will rise to fund this, plenty of other sources of funding available.”

Other (15%)

The Deep View is written by Faris Kojok, Chris Bibey and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.

P.S. Enjoyed reading? Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning. Subscribe here!

P.P.S. If you want to get in front of an audience of 450,000+ developers, business leaders and tech enthusiasts, get in touch with us here.