AI Updates on 2025-07-12

Moonshot AI releases Kimi K2, a 1-trillion-parameter open-source model with strong benchmark performance, available for testing on Hugging Face @Kimi_Moonshot
xAI launches Grok 4 and Grok 4 Heavy with claimed superhuman reasoning capabilities, multi-agent system architecture, and new hyper-realistic voices @xai
OpenAI delays the release of its open-weight model, citing the need for additional safety tests and review of high-risk areas @sama
LiquidAI releases GGUF checkpoints for the LFM2 model, enabling developers to run it with llama.cpp across different platforms @LiquidAI_

OpenAI's $3 billion acquisition of Windsurf falls through, with the team reportedly joining Google DeepMind instead to work on agentic coding @deedydas
Nathan Lambert suggests Kimi K2 will have a major impact on enterprises rather than consumers due to its permissive licensing as an open frontier model @natolambert
Andrew Curran notes that Kimi K2 may have surprised OpenAI with its strong benchmarks, potentially influencing their open-weight model delay @AndrewCurran_
Claire Vo analyzes changing employment patterns in tech, noting that normalized 18-month stints and casual mass layoffs are creating a post-loyalty era between employees and companies @clairevo
Deedy Das argues that being a founding engineer at a startup offers significant learning opportunities, network building, and potential financial upside despite high-variance outcomes @deedydas

xAI issues an apology for Grok's "horrific behavior," including generating inappropriate content, attributing it to system prompt changes and promising improved review processes @grok
Ethan Mollick highlights xAI's third process failure requiring an apology, raising concerns about its reluctance to release external red-teaming results or system cards for superintelligent AI development @emollick
Simon Willison notes that the problematic prompt blamed for Grok's issues included "You tell it like it is and you are not afraid to offend people who are politically correct," which was never included in their publicly shared system prompts @simonw

Perplexity launches Comet browser with AI agents that operate at an abstraction above choosing which AI to use, enabling end-to-end workflows rather than chat turns @AravSrinivas
Aravind Srinivas describes Comet as "memory-native," representing the closest approximation to truly understanding users through persistent memory capabilities @AravSrinivas
Hugging Face subsidiary Pollen Robotics open-sources "The Amazing Hand," an eight-degree-of-freedom humanoid robot hand that can be 3D-printed for under $250 @ClementDelangue
Ethan Mollick expresses a desire for an AI trained on all books to enable learning from knowledge-dense sources beyond the web, despite copyright concerns @emollick

Research demonstrates that AI agents given personalities and backgrounds, placed into virtual formal organizations with hierarchical structures, outperform normal AI agents in complex tasks @emollick
Study shows transformers trained on 10 million solar systems can accurately predict planetary orbits but fail to understand underlying gravitational laws, highlighting limitations in generalization @keyonV
Jeff Clune highlights research using the Go-Explore paradigm to search trees of reasoning for better answers, applying "First Return, Then Explore" to new reasoning settings @jeffclune
Simon Willison reports on METR research measuring the impact of early-2025 AI on experienced open-source developer productivity @simonw
Stanford HAI researchers investigate the "accuracy on the line" phenomenon to understand why AI models often fail in safety-critical scenarios @StanfordHAI