⚙️ OpenAI introduces Codex

Good morning. Nvidia CEO Jensen Huang's trip to Taiwan, after visiting the Middle East with Trump, has sparked "Jensanity" as adoring fans mob him for autographs on books, posters, and even baseballs. The frenzy has his US-based colleagues confused: at home, the Taiwan-born billionaire — whose company is now selling official Jensen-branded merch at a pop-up store — walks around largely unnoticed.
— The Deep View Crew
In today’s newsletter:
🩸 AI for Good: AI spots blood clots before they strike
🤖 Penn reimagines research with AI at its core
🧠 OpenAI introduces Codex
🩸 AI for Good: AI spots blood clots before they strike

Source: ChatGPT 4o
For heart patients, the first sign of a dangerous clot is often a heart attack or stroke. Now, researchers at the University of Tokyo have unveiled an AI-powered microscope that can watch clots form in a routine blood sample – no catheter needed.
The new system uses a high-speed "frequency-division multiplexed" microscope – essentially a super-fast camera – to capture thousands of blood cell images each second. An AI algorithm then analyzes those images in real time to spot when platelets start piling into clumps, like a traffic jam forming in the bloodstream.
In tests on over 200 patients with coronary artery disease, those with acute coronary syndrome – a dangerous flare-up of heart disease – had far more platelet clumps than patients with stable conditions. Just as importantly, an ordinary arm-vein blood draw yielded virtually the same platelet data as blood taken directly from the heart’s arteries via catheter.
Why it matters: This AI tool could make personalized treatment easier and safer:
Traditional platelet monitoring relies on invasive or indirect methods
The AI tool analyzes blood from a basic arm draw
Real-time imaging allows doctors to observe platelet clumping directly
The method may reduce reliance on catheter-based procedures
The team of researchers published its findings this week in Nature Communications.

✂️ Cut your QA cycles down from hours to minutes
If slow QA processes and flaky tests are a bottleneck for your engineering team, you need QA Wolf.
QA Wolf's AI-native platform supports both web and mobile apps, delivering 80% automated test coverage in weeks and helping teams ship 5x faster by reducing QA cycles to minutes.
With QA Wolf, you get:
✅ Unlimited parallel test runs
✅ 15-min QA cycles
✅ 24-hour maintenance and on-demand test creation
✅ Zero-flake guarantee
The result? Drata’s team of 80+ engineers saw 4x more test cases and 86% faster QA cycles.
No flakes, no delays, just better QA — that’s QA Wolf.
🤖 Penn reimagines research with AI at its core

Source: UPenn
The University of Pennsylvania has quietly built a human collider for AI.
Launched this spring by cosmologist Bhuvnesh Jain and computer scientist René Vidal, the AI x Science Fellowship unites more than 20 postdoctoral researchers from physics, linguistics, chemistry, engineering and medicine. Each fellow receives two faculty mentors, a modest research budget and campus-wide access to labs and high-performance computing. Weekly Tuesday lunches double as idea exchanges, while open seminars pull in curious researchers from every school.
The fellowship grew out of a 2021 data-science pilot in Arts & Sciences and now spans Engineering and Penn Medicine, with Wharton fellows due in the fall. Jain and Vidal—co-chairs of Penn’s AI Council—plan to scale it into a university-wide Penn AI Fellowship and create a “data-science hub” where roaming AI specialists spend a fifth of their time parachuting into other labs.
Why it matters: As AI research moves rapidly into the private sector, this initiative encourages collaboration on AI research questions that don’t yet have commercial applications. Industry labs chase near-term products. Penn is betting that open-ended, ethically grounded questions—trustworthy AI, machine learning for dark-matter hunts—still belong in academia. The fellowship gives young scientists a network, résumé-ready collaborations and a sandbox for ideas too early or risky for corporate funding.

The Fastest LLM Guardrails Are Now Available For Free
Fast, secure and free: prevent LLM application toxicity and jailbreak attempts with <100ms latency.
Fiddler Guardrails are up to 6x cheaper than alternatives and deploy in your secure environment.




Together AI: A fast and efficient way to launch AI models
Talently AI: A conversational AI interview platform (no more manual screening)
RevRag: Automated sales via AI calling, email, chat, and WhatsApp
🧠 OpenAI introduces Codex

Source: OpenAI
Vibe coding might be all the rage – the trend of non-coders building apps through AI – but OpenAI's latest release is pointedly not for the casual "build me a website" crowd. The company just launched Codex, a cloud-based software engineering agent built to assist professional developers with real production code.
"This is definitely not for vibe coding. I will say it's more for actual engineers working in prod, and sort of throwing all the annoying tasks you don't want to do," noted Pietro Schirano, one early user, capturing the tool's intent in plain terms.
OpenAI is rolling out Codex as a research preview to ChatGPT subscribers (initially Pro, Team, and Enterprise, with Plus users to follow). Here’s Sam Altman’s tweet on the response to the rollout so far.
What makes Codex unique is that it spins up a remote development environment in OpenAI's cloud – complete with your repository, files, and a command line – and can carry out complex coding jobs independently before reporting back. Once enabled via the ChatGPT sidebar, you assign Codex a task with a prompt (for example, "Scan my project for a bug in the last five commits and fix it").
Under the hood, Codex uses a specialized new model called codex-1, derived from OpenAI's latest reasoning model, o3, but tuned specifically for code work. Key capabilities include:
Multi-step autonomy: Codex can write new features, answer questions about the codebase, fix bugs, and propose code changes via pull request – all by itself.
Parallel agents: You can spawn multiple Codex agents working concurrently (the launch demo showed several fixing different parts of a codebase in parallel).
Test-driven verification: Codex repeatedly runs the project's test suite until the code passes (or it runs out of approaches), then provides verifiable logs and citations of everything it did.
Configurable via AGENTS.md: You can drop an AGENTS.md file in your repo to guide the AI. This file tells Codex about project-specific conventions, how to run the build or tests, which parts of the codebase matter most, etc. Early users report this dramatically helps Codex avoid rookie mistakes.
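OpenAI doesn't prescribe a single canonical template, but an AGENTS.md typically reads like a short onboarding doc for the agent. A minimal sketch might look like this (the commands, paths, and conventions below are hypothetical examples, not from any real project):

```markdown
# AGENTS.md

## Setup
- Install dependencies with `npm ci`
- Run the full test suite with `npm test`; lint with `npm run lint`

## Conventions
- TypeScript strict mode; avoid `any`
- Every new feature needs a unit test under `tests/`
- Keep commits small and reference the issue number in the message

## Where to look
- Core business logic lives in `src/core/`
- `src/legacy/` is frozen; don't modify it unless the task requires it
```

The idea is the same as onboarding a junior developer: telling the agent how to build, test, and navigate the repo up front is what lets it verify its own work instead of guessing.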
OpenAI has been testing Codex with several early design partners to prove its value in actual development teams:
Temporal uses Codex to debug issues, write and execute tests, and refactor large codebases, letting Codex handle tedious background tasks so human developers can stay "in flow" on core logic.
Superhuman is leveraging Codex to tackle small, repetitive tasks, and has found that PMs (non-engineers) can use Codex to contribute lightweight code changes.
Kodiak Robotics has Codex working on their self-driving codebase, writing debugging tools and improving test coverage.
The big picture: All this comes amid a broader frenzy to build agentic AI developers. Just months ago, startup Cognition released "Devin," branding it "the first AI software engineer." We immediately subscribed to the $500/month service when it launched to the public, drawn in by promises that it could write entire apps in minutes and solve complex coding issues with minimal help. However, we canceled within the first month after finding it didn't live up to the hyped announcements – a common theme in the current AI landscape where capabilities often lag behind marketing claims.
Cognition raised $21 million for Devin despite its early performance on the SWE-Bench coding challenge (an industry benchmark for fixing real GitHub issues) being modest – it solved about 13.9% of test tasks on its own. Hot on its heels, researchers at Princeton built SWE-Agent, an open-source autonomous coder using a GPT-4 backend that scored 12.3% on the same benchmark – nearly matching the venture-backed startup's AI dev agent with a fraction of the resources.
Big tech isn't sitting idle. Google is expected to unveil a major AI coding tool at tomorrow's I/O developer conference, and GitHub Copilot, the incumbent AI assistant, is evolving rapidly as Microsoft folds it into a broader Copilot X vision with chat and voice features inside the IDE.

It's becoming clear that in this new landscape, the advantage of simply owning a big codebase is evaporating. We previously dubbed this "the no-moat era" in our analysis – when an indie dev with AI tools can reimplement a competitor's core features over a weekend, traditional software moats based on headcount start to crumble.
AI agents succeed when they’re scoped, sandboxed, and verifiable. Devin over-promised, under-specified, and hit a wall. Codex under-promises (no “build me Instagram”), gives the agent a test harness, and documents every step. That mindset — treat the AI like a junior dev who must show their work — is how agentic coding will stick over the short to mid term.
Expect pricing to migrate toward “pay per compute” rather than all-you-can-eat. By year’s end, we expect every IDE, CI pipeline and repo host to surface “spawn agent” buttons. And expect the winners to be the dev teams that invest in good tests, clear docs, and tight review loops.
Software engineering just got yet another teammate. It works fast, complains never, and absolutely needs a code review. Use it wisely. Buyer beware.


Which image is real?



🤔 Your thought process:
Selected Image 1 (Left):
“The reflections on the fuselage of the airplane in [the other image] seemed out of place, and the motion blur with the propellers didn't feel correct.”
“I think this one is real because I have seen images like this in realtime on occasion. I have seen the moon during the day and I have seen it with an aircraft too. In Florida, especially, cloud formations are common and seeing all three have happened before.”
Selected Image 2 (Right):
“Landing gear was open in the other pic, which put me off.”
“[The other image’s] tail stabilizer is too high for an aircraft with underwing engines”
💭 A poll before you go
Will you let OpenAI's Codex into your codebase?
Thanks for reading today’s edition of The Deep View!
We’ll see you in the next one.
P.S. Enjoyed reading? Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning. Subscribe here!
If you want to get in front of an audience of 450,000+ developers, business leaders and tech enthusiasts, get in touch with us here.