AI Updates on 2025-07-11
AI Model Announcements
- Moonshot AI releases Kimi K2, a 1T-parameter MoE model with 32B active parameters, achieving state-of-the-art performance on coding benchmarks including 65.8% on SWE-Bench Verified and 53.7% Pass@1 on LiveCodeBench @Kimi_Moonshot
- Perplexity adds Grok 4 to its platform for Pro and Max subscribers @perplexity_ai
- Google releases Veo 3 image-to-video generation in the Gemini App for Ultra and Pro subscribers, letting users turn photos into 8-second videos with sound @Google
AI Industry Analysis
- A large study of 187k developers using GitHub Copilot finds that AI transforms the nature of coding: developers focus more on coding and less on management, coordinate with fewer people, and experiment more with new languages, potentially increasing earnings by $1,683/year @emollick
- Andrew Ng expresses disappointment that Trump's "Big Beautiful Bill" didn't include a moratorium on U.S. state-level AI regulation, arguing that when technology is new and poorly understood, lobbyists can push through anti-competitive regulations that hamper open-source AI efforts @AndrewYNg
- Stripe's usage-based billing platform has grown 145% year-to-date, indicating the industry is already transitioning from seat-based pricing to consumption models @patrickc
- Goldman Sachs is testing viral AI agent Devin as a "new employee," according to TechCrunch reporting @TechCrunch
- Study shows AI coding tools may not speed up every developer: the wall-clock time between starting work on an issue and having the PR merged may increase, even as the number of PRs merged per day might grow tenfold @TechCrunch
AI Ethics & Society
- Simon Willison discovers that Grok 4 automatically searches for tweets "from:elonmusk" when asked about controversial topics like Israel/Palestine, raising concerns about bias in AI search behavior @simonw
- Jeremy Howard demonstrates that Grok searches Twitter for Elon Musk's views when asked about Israel/Palestine, with 54 of 64 citations being about Elon, highlighting potential bias in AI information retrieval @jeremyphoward
- France is investigating X over foreign interference while a Member of Parliament criticizes Grok, according to TechCrunch reporting @TechCrunch
AI Applications
- Perplexity launches Comet, its AI-powered browser that puts Perplexity's search engine front and center, featuring an always-on assistant accessible via Alt+A; early users describe it as providing "100x productivity" @AravSrinivas
- Comet Assistant demonstrates practical applications including researching and filling details for Facebook Marketplace listings, coding assistance, and voice-controlled tab management @AravSrinivas
- NVIDIA announces collaboration with Indosat Ooredoo Hutchison and Cisco to build an AI Center of Excellence in Indonesia, featuring localized AI research support and talent development through the NVIDIA Deep Learning Institute @NVIDIAAI
- MIT researchers develop PAC Privacy, a new method that allows AI to learn from sensitive data like medical records without risking privacy, maintaining both accuracy and security @MIT
- MIT creates a new bionic knee that outperforms other prostheses, helping people with above-the-knee amputations walk faster, climb stairs, and avoid obstacles, while the prosthesis feels more like part of their body @MIT
AI Research
- Berkeley AI Research explores user simulators as a bridge between reinforcement learning and real-world interaction, addressing the challenge of designing environments for RL tasks beyond math and code @realJessyLin
- Research shows action chunking helps in robotics and RL: having models produce short sequences of actions aids exploration and backups, for reasons that remain mysterious @svlevine
- Stanford announces Agents4Science conference where AI is the primary author and reviewer, with LLM reviewers providing initial assessments and human experts making final selections; all submissions and reviews will be public @james_y_zou
- Hamel Husain argues against prompt automation, stating that good writing correlates with good thinking and that deliberate iterative writing is necessary for challenging problems, as research shows criteria drift significantly after looking at LLM traces @HamelHusain
- Ethan Mollick notes that Grok 4 is heavily influenced by search results and often looks for code online first when asked to code, making it quite credulous when seeing web search results @emollick
- Ethan Mollick observes that leading LM Arena went from being the big benchmark every AI maker aimed for to being rarely mentioned in recent releases, questioning whether this is due to reputation issues or a realization that arena scores were easily optimized @emollick