AI Updates on 2025-05-28
AI Model Announcements
- DeepSeek R1-v2 model released on Hugging Face, reportedly performing nearly on par with o3 (high) on LiveCodeBench @AndrewCurran_ @huggingface
- Google releases Jules, an AI coding agent built on Gemini 2.5 Pro that works in parallel with developers and integrates with GitHub @GoogleAI
- Google launches Stitch, an experiment that generates UI designs and frontend code for desktop and mobile from natural-language and image prompts @GoogleAI
- Veo 3 video generation is rolling out in 70+ countries and is available to Pro users @GeminiApp
- Mistral AI introduces Codestral Embed, which it describes as a new state-of-the-art embedding model for code @MistralAI
- Anthropic rolls out a beta voice mode for Claude on mobile, in English, coming to all plans over the next few weeks @AnthropicAI
- Grok is coming to Telegram, with Telegram receiving $300M in cash and equity from xAI plus 50% of revenue from xAI subscriptions sold via Telegram @AndrewCurran_
AI Research
- Research shows Llama 1B batch-size-1 inference can run in a single CUDA kernel, removing the synchronization boundaries between separately launched kernels so that compute and memory can be orchestrated optimally @karpathy
- Study demonstrates LLMs can be made more creative by training them on human "creativity signals" (novelty, diversity, surprise, quality), with even small models scoring higher on all four dimensions simultaneously @emollick
- New research on Self-Rewarding Training (SRT) where language models provide their own reward for RL training when ground truth answers are unavailable @rsalakhu
- Stanford research investigates how LLMs internally represent factual knowledge and how diverse their encodings of truth are @stanfordnlp
- New paper explores why state space models (SSMs) are worse than Transformers at recall over their context using mechanistic evaluations @stanfordnlp
- Chatterbox by Resemble AI performs zero-shot voice cloning from just 5 seconds of audio and was reportedly consistently preferred over ElevenLabs in blind evaluations @huggingface
AI Applications
- The LLM command-line tool now supports tool calling via Python functions or plugins, and works with OpenAI, Anthropic, Gemini, and Ollama models @simonw
- Perplexity launches a daily news feature on WhatsApp, sent at 9 AM local time with a /news command, as an experiment in proactive messaging @AravSrinivas
- Goodfire releases first publicly usable application for steering image generation model weights, allowing concept-based editing like MS Paint but with concepts instead of colors @Deedy
- Odyssey ML introduces interactive video: video that viewers can both watch and interact with, imagined by AI in real time @eladgil @garrytan
- Visual Electric launches image enhancement up to 6x with faster speeds, five pro-grade modes and automatic face enhancement @soleio
- Retool Agents has reportedly automated 50k jobs and saved $6B in manual work across departments, using existing APIs, SQL queries, and workflows as LLM tools @ycombinator
- BOND, an AI chief of staff, centralizes data from Slack, Jira, and Notion and pings executives about blockers and wins in real time @ycombinator
- Chunkr supports latest LLMs over API for document parsing with model selection, fallbacks, and custom prompts for tables, formulas, and diagrams @ycombinator
AI Industry Analysis
- Dario Amodei predicts AI could wipe out half of entry-level white-collar jobs and push unemployment to 10-20% within the next one to five years @AndrewCurran_
- Developers report clearing backlogs and shipping months of work in days since the Claude 4 launch, with that pace becoming the norm @eugeneyan
- AI coding tools are significantly less useful on large existing codebases at work than on greenfield or side projects @GergelyOrosz
- A large tech company found that roughly half of its developers stopped using Cursor after a few months because of its limited usefulness inside the company @GergelyOrosz
- Enterprise customer quote after using Replit: "In the future no one will use Excel" - highlighting market potential beyond replacing traditional coders @amasad
- Cohere argues the "bigger is better" era of AI is ending, with the next wave defined by smarter, more efficient models that scale securely and lower costs @cohere
- a16z identifies Generative Engine Optimization (GEO) as an $80B+ opportunity, with GEO replacing SEO as brands optimize for LLM citations rather than search rankings @a16z
AI Ethics & Society
- AI agents should be designed to align users to long-term prosocial outcomes and help with reality checks rather than fulfilling every whim @jasonyuandesign
- Machines should refuse abusive treatment, because how people treat machines has downstream effects on how they treat other people and themselves @jasonyuandesign
- Good AI models admit when they don't know something, but great models ask for help figuring it out to earn user trust @mustafasuleyman
- Personalization in conversational interfaces should move beyond content recommendations to tailoring how information is presented to individual learning styles and preferences @joulee
- AI policy discourse should focus on practical implementation challenges like infrastructure and diffusion rather than just innovation @random_walker