
⚙️ ChatGPT 5x worse than humans at summarizing

Welcome back. Hope you enjoyed your July 4th weekend, because Trump and Elon certainly didn't — they spent the holiday feuding after Musk announced his new "America Party" to challenge the two-party system. Trump called his former ally "off the rails" and a "train wreck" over their $3.3 trillion spending bill dispute, while threatening to deport Musk and cut SpaceX's billions in government contracts. Apparently, Musk took declaring independence a little too literally this year.

In today’s newsletter:

  • 🏥 AI for Good: AI model catches signs of pancreatic cancer long before diagnosis

  • 🤖 How a national AI project could supercharge model scaling by 2027

  • 🔬 AI chatbots are distorting science behind the scenes

🏥 AI for Good: AI model catches signs of pancreatic cancer long before diagnosis

Source: Midjourney v7

Pancreatic cancer is one of the deadliest cancers, with a five-year survival rate of just 13%. Part of the problem is that it's notoriously difficult to catch early — symptoms are vague and the pancreas sits deep in the abdomen, making tumors hard to spot on scans.

What happened: Researchers from the University of Copenhagen developed an AI model called PANCANAI, which can detect pancreatic cancer on CT scans with remarkable accuracy. In a study published in Investigative Radiology, the model analyzed more than 1,200 CT scans from over 1,000 Danish patients with biopsy-confirmed pancreatic cancer.

  • Detected cancer with 92% accuracy on scans taken at the time of diagnosis

  • Spotted signs in 54% of scans taken more than a year before official diagnosis

  • Identified 83% of stage I cancers, when treatment is most effective

  • Worked across all cancer stages, from early to advanced disease

  • Analyzed scans from 2006 to 2016, showing consistent performance over time

The AI was trained to recognize subtle changes, such as lesions and pancreatic duct dilation, that often precede a formal diagnosis but are easily missed by human radiologists.

Why it matters: Early detection could be life-changing for patients. When pancreatic cancer is caught while still confined to the pancreas, survival rates jump to 44%. For one of medicine's most challenging cancers, that kind of head start could save thousands of lives each year.

The AI race is getting faster and dirtier by the day. Things we could never have imagined are happening: thousands of people are being laid off every day, while others are building one-person, million-dollar companies.

So if you’re not learning AI today, you probably won't have a job in the next 6 months.

That’s why you need to join the 3-Day Free AI Mastermind by Outskill: 16 hours of intensive training on AI frameworks, hands-on building sessions, image and video creation and more that will make you an AI expert.

Originally priced at $895, but the first 100 of you get in completely FREE with the extended July 4th sale! 🎁

📅 FRI-SAT-SUN: Kick-off Call & Live Sessions

🕜 10 AM to 7 PM EST

In the 5 sessions, you will:

  • Master prompt engineering to get the best out of AI.

  • Build custom GPT bots & AI agents for email management to save you 20+ hours weekly.

  • Monetize your AI skills into a $10,000/mo business.

Join now and get $5100+ in additional bonuses:

$5100+ worth of AI tools across 3 days:

  • Day 1: 3000+ Prompt Bible

  • Day 2: Roadmap to make $10K/month with AI

  • Day 3: Your Personal AI Toolkit Builder

🤖 How a national AI project could supercharge model scaling by 2027

Source: Midjourney v7

The phrase "AI Manhattan Project" once sounded like science fiction. Today, it's being studied with real math, real budgets and real timelines. Researchers now believe a US-led national initiative could push artificial intelligence to the edge of artificial general intelligence within just a few years.

What happened: In November, the US-China Economic and Security Review Commission recommended Congress "establish and fund a Manhattan Project-like program dedicated to racing to and acquiring an Artificial General Intelligence capability." Now, research from Epoch AI shows exactly what such a project could achieve — and the numbers are staggering.

The basic structure is becoming clearer. The federal government would coordinate funding and energy resources. Private companies would contribute chips, infrastructure and training methods. Together, they could consolidate nearly all US AI compute into a single initiative.

The headline figure: a 2e29 FLOP training run by the end of 2027. That's 10,000 times the compute used to train GPT-4, the equivalent of running GPT-4's entire training process 10,000 times over.

The math: The cost structure mirrors past national projects. The Manhattan Project consumed 0.4% of the United States' GDP; the Apollo program used twice that. A modern AI effort would fall within that range while delivering an enormous jump in model scale.

  • At $244 billion in annual spending, the US could support a 100-day training run at unprecedented scale

  • 27 million H100-class GPUs would be deployed across the initiative

  • Total training power demand would reach 7.4 gigawatts — more than New York City's average power usage

  • Planned gas-fired generation coming online in 2027 could cover this demand without new construction
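Epoch's headline numbers hang together arithmetically. Here's a quick back-of-envelope sketch, assuming (these estimates are not from the article) that GPT-4 took roughly 2e25 FLOP to train and that an H100-class GPU sustains about 1e15 FLOP/s at 85% utilization:

```python
# Back-of-envelope check of the figures quoted above.
# Assumed, not from the article: GPT-4's training compute (~2e25 FLOP)
# and sustained per-GPU throughput (~1e15 FLOP/s at 85% utilization).

TARGET_FLOP = 2e29    # proposed national training run
GPT4_FLOP = 2e25      # assumed GPT-4 training compute
RUN_DAYS = 100        # the 100-day training run cited above
H100_FLOPS = 1e15     # assumed sustained FLOP/s per H100-class GPU
UTILIZATION = 0.85    # assumed hardware utilization

scale_vs_gpt4 = TARGET_FLOP / GPT4_FLOP
seconds = RUN_DAYS * 86_400
required_flops = TARGET_FLOP / seconds             # aggregate FLOP/s needed
gpus = required_flops / (H100_FLOPS * UTILIZATION)

print(f"Scale vs GPT-4: {scale_vs_gpt4:,.0f}x")    # ~10,000x
print(f"GPUs needed:    {gpus / 1e6:.0f} million") # ~27 million
```

Under those assumptions the math lands almost exactly on the article's figures: a 10,000x multiple of GPT-4's compute, delivered by roughly 27 million H100-class GPUs running for 100 days.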

Why it matters: The analysis shows the physical requirements are within reach. Existing technologies can support the needed compute, energy and infrastructure. The Defense Production Act could expedite critical infrastructure, as it did during the pandemic, when it was invoked over 100 times.

With significant investment and coordination, this type of project could accelerate AI scaling by two full years, potentially delivering AGI-level systems by late 2027.

Your time is valuable. Let’s prove it with a $200 gift card.

Take 30 minutes to get to know Melio and we’ll give you a $200 gift card—no strings attached. 

Melio simplifies your business payments with:

  • Paying bills by card—even where cards are not accepted.

  • Earning credit card points and rewards for paying business bills.

  • Free monthly ACH payments.

  • 2-way sync with QuickBooks and Xero.

  • AR invoicing, and much, much more…

  • Google: Machine Learning Engineer, Search Ads, Shopping Relevance Models

  • Adobe: Data Scientist, Experience and Engagement Surfaces

  • Magnific: Upscale, enhance and expand images with AI for design, art or product visuals

  • Abridge: AI that listens to clinical conversations and generates structured notes

  • Kindo: Agentic infrastructure and security automation platform for enterprises

🔬 AI chatbots are distorting science behind the scenes

Source: Midjourney v7

The latest AI models have a problem — they're getting worse at accurately summarizing scientific research. In a new study, researchers found that advanced chatbots systematically misrepresent scientific findings, not through obvious hallucinations, but by quietly inflating conclusions, skipping qualifiers and replacing caution with false confidence.

When ChatGPT, Llama and DeepSeek were given scientific abstracts to summarize, the latest versions were five times more likely than human experts to overgeneralize findings. Even when explicitly prompted for accuracy, the models doubled down on broader, more assertive claims.

Researchers from Utrecht University and the University of Cambridge tested 10 popular large language models on nearly 5,000 summaries of journal articles from Nature, Science and The Lancet.

Only Claude performed well across all conditions in the study.

The problem isn't refusal; it's false fluency. Earlier models often declined to summarize unclear information. Newer models produce confident outputs that feel complete but carry subtle shifts in meaning. LLMs trained on simplified science journalism and press releases prioritize clarity over accuracy, producing polished summaries that gloss over the careful qualifications of the underlying research.

Chatbots now shape how researchers, clinicians and students consume science. When summaries strip nuance, they reshape understanding. In medicine, removing critical safety context can influence treatment decisions.

So we've spent billions of dollars creating AI models that are worse at the task we actually need them to perform. Cool, cool, cool.

The really maddening part isn't that AI gets things wrong, it's that it gets them wrong in the exact way that sounds most convincing. When ChatGPT changes "was safe in this study" to "is a safe treatment," that's not a bug; it's a feature. Users want confidence, not caveats. The market rewards models that sound authoritative.

Asking for accuracy makes it worse. It's like telling someone not to think about elephants, except the elephant is medical misinformation and the someone is a $100 billion language model that half the internet now trusts more than doctors.

Which image is real?


🤔 Your thought process:

Selected Image 1 (Left):

  • “All of the boxes on the [other] image said ‘Dole.’ AI still isn’t great at generating text accurately 100% of the time. All the signs and writing on the right looked garbled, like gibberish. Dead giveaway.”

  • “There were too many things that were ‘real.’ By this I mean the slight imperfections of the actual world. The gutters on the roof were slightly bent, and the metal drains, once straight, had been bent over time to follow the contour of the street.”

Selected Image 2 (Right):

  • “In [the other image], there seems to be some wind (awning) but the scarf on the woman does not seem affected by the breeze”

  • “I think the walking stride of the man on the fake pic is exaggerated. His heel looks high as if he were running, but the rest of his posture looks docile. ”

💭 Poll results

“Would you subscribe to an AI-only channel?”

Already do (11%):

  • “Currently already following another AI VTuber named Neuro-sama! Her creator, Vedal, also shows up on streams and is always very informative!”

Would consider it (22%):

  • “AI-only channels are in the arena of drones and self-driving cars. Once they have been around and used enough to have a track record, and the potential uses for them are explored, it will explode.”

Maybe for certain genres/games (16%)

Probably not (27%):

  • “I think the idea should be to make AI videos indistinguishable from traditional videos. I wouldn't (knowingly) subscribe to an AI channel because they are usually sloppy when it comes to important details. I consume a lot of automotive content and of course some obviously AI-produced content has come across my feed. They could be very useful if edited well.”

  • “Fake humans are cool, but I want to hear from real people. I don't want to lose the humanity in what I consume.”

Definitely not (24%)

The Deep View is written by Faris Kojok, Chris Bibey and The Deep View crew. Please reply with any feedback.

Thanks for reading today’s edition of The Deep View! We’ll see you in the next one.

P.S. Enjoyed reading? Take The Deep View with you on the go! We’ve got exclusive, in-depth interviews for you on The Deep View: Conversations podcast every Tuesday morning. Subscribe here!

P.P.S. If you want to get in front of an audience of 450,000+ developers, business leaders and tech enthusiasts, get in touch with us here.