Featured image of post Last Week in AI: 05/25/25-05/31/25: DeepSeek Drops Open-Source Bomb, AI Scientists Get Published

Last Week in AI: 05/25/25-05/31/25: DeepSeek Drops Open-Source Bomb, AI Scientists Get Published

DeepSeek's massive open-source model challenges proprietary giants while an AI system achieves peer review - plus public polling shows growing caution about AI pace.

AI Summary

  • DeepSeek released R1-0528, a 685B parameter open-source model that beats Grok 3 mini and Qwen 3 in benchmarks
  • “The AI Scientist-v2” became the first AI system to get a paper accepted at a peer-reviewed ICLR workshop
  • New polling shows 77% of Americans want AI companies to slow development for safety considerations
  • Anthropic CEO warns 50% of entry-level white-collar jobs could disappear within five years

DeepSeek’s R1-0528 Challenges the Proprietary Giants

Chinese AI startup DeepSeek released R1-0528 this week, a 685 billion parameter reasoning model that’s forcing the industry to recalibrate what’s possible from open-source AI. The model jumped from 70% to 87.5% accuracy on the AIME 2025 mathematics test and is reportedly outperforming both xAI’s Grok 3 mini and Alibaba’s Qwen 3 in code generation tasks, according to Tom’s Guide’s analysis.

The company also released a smaller distilled version, DeepSeek-R1-0528-Qwen3-8B, targeting users with limited computational resources. Both models are available under open-source licenses on DeepSeek’s GitHub repository, continuing the company’s strategy of challenging Western AI dominance through freely available alternatives.

“The new version delivers major performance gains in complex reasoning, coding and logic, which are areas where even top-tier models often stumble.” - Tom’s Guide

Having a credible, locally-runnable alternative to proprietary APIs fundamentally changes procurement conversations. While these models don’t match the polish of OpenAI or Anthropic’s offerings - lacking sophisticated omni-modal features and the refinement that comes from massive commercial investment - they eliminate vendor lock-in and API dependencies that can torpedo project budgets overnight.

Testing Note: I’ll be deploying DeepSeek-R1-0528-Qwen3-8B on my local AI server this week to evaluate its real-world performance against my current model stack. Full review coming soon.

AI System Achieves Genuine Research Milestone

Researchers introduced “The AI Scientist-v2,” an autonomous system capable of conducting complete research cycles from hypothesis generation through manuscript writing. The breakthrough came when one of its generated papers was accepted at an ICLR workshop - marking the first time an AI-authored research paper passed peer review at a major machine learning conference, as detailed in the arXiv preprint.

The system handles the full scientific workflow: formulating hypotheses, designing experiments, executing them, analyzing results, and writing academic papers that meet publication standards. ICLR workshops maintain rigorous peer review standards, making this acceptance a genuine validation rather than a publicity stunt.

This moves us past AI as a research assistant into AI as an independent contributor to scientific knowledge. The acceleration potential is staggering - imagine thousands of parallel research streams running continuously, surfacing promising results for human interpretation and follow-up. We’re looking at a fundamental shift in how scientific discovery operates.

Public Demands Slower AI Development (Industry Won’t Listen)

A new Axios/Harris Poll revealed that 77% of Americans want AI companies to slow development and “get it right the first time,” even if that delays technological breakthroughs. The sentiment crosses generational lines, driven primarily by concerns about job displacement and misinformation proliferation.

“More than three-quarters of Americans (77%) want companies to create AI slowly and get it right the first time, even if that delays breakthroughs.” - Axios

The polling data creates obvious tension with industry dynamics where competitive pressure drives rapid deployment and iteration. Companies that pause risk being overtaken by those that don’t - public sentiment rarely drives technology deployment timelines in practice.

From my perspective, both extremes miss the mark. The “move fast and break things” crowd ignores legitimate concerns about societal disruption. The “pause everything until it’s perfect” faction doesn’t understand that responsible development requires real-world testing and iteration. The practical path forward involves selective acceleration - pushing hard on capabilities that solve genuine problems while exercising restraint on applications that could destabilize social institutions.

Anthropic CEO Delivers Blunt Job Displacement Warning

Dario Amodei, CEO of Anthropic, warned that AI could eliminate up to 50% of entry-level white-collar jobs within five years. Speaking to Axios, Amodei criticized the industry for “sugarcoating” the employment impact of rapidly advancing AI capabilities.

“The industry needs to stop ‘sugarcoating’ this white-collar bloodbath.” - Dario Amodei, Anthropic CEO

Based on current deployment patterns I’m observing, offshore implementation roles appear most vulnerable initially. AI-assisted development produces code much closer to production-ready, dramatically reducing the need for large offshore teams that traditionally handle specification-to-implementation work. The expensive domestic talent solves core problems, then tosses increasingly complete solutions over the fence - except the fence is getting much shorter.

This isn’t just coding jobs. Any role that primarily involves taking detailed specifications and executing them systematically faces automation pressure. Roles requiring creative problem-solving, stakeholder management, or dealing with ambiguous requirements remain secure for now.

Google Expands Veo 3 Video Generation Access

Google broadened access to Veo 3, its text-to-video AI model, through the Gemini app and Flow tool. Google One AI Premium subscribers with Pro tier can generate up to 10 videos, while Ultra subscribers receive higher usage limits, according to 9to5Google’s coverage.

“Google expanded access to Veo 3 through its Gemini app and Flow tool, allowing Pro subscribers to generate up to 10 videos and Ultra subscribers to have higher usage limits.” - 9to5Google

The expansion follows Google I/O demonstrations and represents the standard rollout pattern for new AI capabilities: limited initial access, gradual expansion, tiered usage based on subscription levels. The creative possibilities are real - small teams can now produce content that would have required substantial production budgets just months ago.

The broader implications around deepfakes and misinformation grow proportionally with accessibility. We’re not just getting better creative tools - we’re eliminating entire categories of expertise that previously served as natural barriers to sophisticated content manipulation.

The Week’s Underlying Current

This week crystallizes the fundamental contradiction driving AI development. Technical capabilities advance faster than our collective ability to process their implications. Open-source alternatives ensure no single entity controls the pace. Public opinion favors caution while competitive dynamics demand speed.

The most significant trend isn’t any single announcement - it’s the continued acceleration despite growing calls for restraint. Whether that acceleration produces broadly beneficial outcomes or destabilizing disruption depends on decisions being made right now in labs and boardrooms worldwide.

*This week I’ll have hands-on results from testing DeepSeek’s 8B model locally, new releases of my ‘AI Box’ and ‘MacroTracker’ apps, and, time permitting, a quick guide to starting with n8n automation.

Powered by Hugo & Stack Theme
Built with Hugo
Theme Stack designed by Jimmy