AI Updates on 2025-05-29

DeepSeek releases R1-0528 with improved benchmark performance, enhanced front-end capabilities, reduced hallucinations, and support for JSON output and function calling @deepseek_ai
Google DeepMind introduces MedGemma, its most capable open model for multimodal medical text and image comprehension @GoogleDeepMind
Perplexity launches Labs, an agentic AI system for complex tasks that can build analytical reports, presentations, and dynamic dashboards @perplexity_ai
Anthropic's Claude Opus 4 shows a notable tendency toward spiritual themes and mystical content when prompted @emollick

The New York Times signs agreement with Amazon to license editorial content for AI training, including content from NYT Cooking and The Athletic @AndrewCurran_
Andrew Ng warns that proposed cuts to U.S. basic research funding could severely impact American competitiveness in AI, noting that DARPA's $50M investment in early deep learning research created hundreds of billions in market value through Google Brain alone @AndrewYNg
Nathan Lambert observes that Chinese labs are dominating open model development throughout 2025, with little apparent concern from U.S. companies @natolambert
Hugging Face questions traditional AI business models, suggesting that tech companies will want to own their models and use open source protocols rather than rely on proprietary APIs @huggingface
Jeff Clune predicts that by the end of 2027, almost every economically valuable computer task will be done more effectively and cheaply by computers @jeffclune

MIT Technology Review reports that GenAI is almost 5x less accurate than humans when summarizing scientific research, raising concerns about reliability in academic contexts @MIT_CSAIL
Ethan Mollick demonstrates o3's advanced capabilities in business analysis but emphasizes the ongoing challenge of trusting AI results without domain expertise to verify them @emollick
Christopher Manning criticizes new visa restrictions affecting Chinese STEM students, arguing they harm U.S. scientific competitiveness @chrmanning
Haya Odeh discovers critical security vulnerabilities in Lovable's Row Level Security implementation, highlighting risks in AI-generated applications @HayaOdeh

Andrew Curran demonstrates how new video generation models like Veo are making high-quality content production accessible to individual creators, potentially disrupting traditional media production @AndrewCurran_
Deedy shows o3 achieving 90% accuracy on cricket game prediction from ball-by-ball data, calling it an extremely nontrivial task even for senior data scientists @deedydas
Brian Lovin uses Claude and Gemini to backfill hundreds of hours of podcast audio into a searchable database, creating a custom knowledge system @brian_lovin
Ethan Mollick has Claude 4 create a novel game with unique mechanics involving stealing and redistributing physical properties between objects @emollick
Microsoft integrates Copilot with Instacart for automated grocery shopping, handling recipes, shopping lists, and delivery seamlessly @mustafasuleyman

Anthropic open-sources interpretability tools that allow researchers to generate attribution graphs showing internal reasoning steps models use to arrive at answers @AnthropicAI
Berkeley AI Research presents FastTD3, a simple and fast off-policy reinforcement learning algorithm for humanoid control with open-source implementation @berkeley_ai
Alex Graveley introduces VScan, a two-stage visual token reduction framework enabling up to 2.91x faster inference and 10x fewer FLOPs while maintaining 95.4% of original performance @alexgraveley
Stanford NLP Group uses test-time search to produce AI-generated kernels that perform close to, and sometimes beat, expert-optimized production kernels in PyTorch @stanfordnlp
Nathan Lambert publishes research on noisy rewards in learning to reason, finding that LLMs demonstrate strong robustness to substantial reward noise, with models still converging even when 40% of reward outputs are manually flipped @natolambert
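
The 40% flip rate in the noisy-rewards item can be made concrete with a toy sketch. It is not the paper's code: it assumes a binary 0/1 reward and independent flips, and the names flip_reward and mean_noisy_reward are hypothetical. Under those assumptions a correct answer still earns an average reward of about 1 - p and an incorrect one about p, so at p = 0.4 the gap shrinks to roughly 0.2 but stays positive, which is one intuition for why training could still converge.

```python
import random


def flip_reward(reward: int, flip_prob: float, rng: random.Random) -> int:
    """Return a binary reward, inverted with probability flip_prob (assumed noise model)."""
    return 1 - reward if rng.random() < flip_prob else reward


def mean_noisy_reward(true_reward: int, flip_prob: float, n: int, seed: int = 0) -> float:
    """Average the noisy reward over n independent draws for a fixed true reward."""
    rng = random.Random(seed)
    return sum(flip_reward(true_reward, flip_prob, rng) for _ in range(n)) / n


if __name__ == "__main__":
    p = 0.4  # flip rate quoted in the digest item
    correct = mean_noisy_reward(1, p, 100_000)
    incorrect = mean_noisy_reward(0, p, 100_000)
    # The gap should sit near 1 - 2p, which is positive for any p below 0.5.
    print(f"correct: {correct:.3f}  incorrect: {incorrect:.3f}  gap: {correct - incorrect:.3f}")
```

At a flip rate of 0.5 the expected gap is zero and the reward carries no signal, so 40% is close to the limit of this simple model.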