CoreAutoAI launches Core Automation for research workflows
mostly @andy_matuschak (5/6)
Reminds me of when we had early GPT-4 access cuz OpenAI accidentally leaked their Discord bot https://twitter.com/business/status/2046707189922890025
RT @rachelmetz: Big scoop from me: Anthropic's Mythos AI model -- the cybersecurity model it says is so powerful it can enable dangerous cy…
Wow, *everyone's* trading options these days. https://x.com/ByrneHobart/status/2046722973353062418/photo/1
SpaceX just secured the option to acquire Cursor, the AI coding company, for $60B later this year. If SpaceX passes on the acquisition, it pays $10B for the partnership alone. The combination is Cursor’s product and distribution to expert software engineers, paired with SpaceX’s Colossus supercomputer, roughly 1 million H100-equivalent GPUs, to build what they’re calling the world’s best AI for coding and knowledge work.

To understand why SpaceX wants Cursor, you need to understand what Cursor actually is. It went from $1M ARR in Dec 2023 to $1B in Nov 2025 and $2B by Feb 2026, the fastest-growing SaaS company ever, doubling roughly every two months. As of this month, Cursor is in talks to raise $2 billion or more at a $50 billion valuation and is forecasting over $6 billion in annualized revenue by end of 2026. Two months ago, SpaceX acquired xAI in an all-stock deal valuing the combined entity at $1.25 trillion, and Cursor's two heads of product engineering quietly left in March to join SpaceX and xAI.

The deal makes more sense when you hear the CEO explain what Cursor is actually trying to do: "The goal with the company is to replace coding with something that's much better." The four co-founders have been programmers their entire lives. Their frustration is that even things that are simple to describe require editing millions of lines of esoteric formal programming languages, and enormous amounts of labor, just to make them appear on screen. Their thesis for the next 5 to 10 years is to invent a new way to build software that is higher level and more productive, distilled down to simply defining how you want the software to work and how you want it to look. The path to get there is to be the best way to code with AI at every point in time, then evolve that process away from normal programming entirely. With Elon and SpaceX's compute, distribution, and track record of compressing 10-year timelines into 3, Cursor might get there faster than anyone expected.
RT @jordanschneider: but he's literally doing the same thing
RT @LeahLibresco: It’s so over for Locke and Demosthenes “What if I try a college application essay I wrote 15 years ago, when my prose s…
I always greatly enjoy reading @KelseyTuoc’s takes on AI. Her recent piece on de-anonymization is really important and I expect in five years we’ll be calling it prescient. https://www.theargumentmag.com/p/i-can-never-talk-to-an-ai-anonymously
SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.
RT @Google: The next evolution of our autonomous research agent is here. Today, we’re introducing Deep Research and Deep Research Max via t…
The next evolution of our autonomous research agent is here. Today, we’re introducing Deep Research and Deep Research Max via the Gemini API. Powered by Gemini 3.1 Pro, they let you trigger comprehensive research workflows with unprecedented control and transparency, featuring:
🔌 Arbitrary MCP support
📊 Native infographic & chart generation
🌐 Fully cited reports grounded in the open web + your own files and data
All from a single API call. Meet the new agent 🧵↓
RT @akseljoonas: Introducing ml-intern, the agent that just automated the post-training team @huggingface It's an open-source implementati…
OpenAI's first AI intern is expected by the end of this year, but we got impatient and decided to build it ourselves :)
> Runs autonomously for hours / days depending on the task
> Can read every paper, model, and dataset on the HF Hub to build the best post-training recipes
> Works with any capable model (Kimi K2.6, GPT-5.4, Opus 4.7, etc.)
> Runs locally or on HF infra (Spaces as sandboxes, Jobs for generating data & training models, Buckets for storage)
For the "small test" they've modified their docs to remove mention of Claude Code in Claude Pro: https://support.claude.com/en/articles/11145838-using-claude-code-with-your-max-plan It's been a shock to see Anthropic's integrity collapse in the face of commercial pressure. Would love a renewed commitment to straightforward honesty. https://twitter.com/TheAmolAvasare/status/2046724659039932830
"we are running experiment on you, because you are cattle. we have decided to partition an arbitrary number of you into the permanent underclass. enjoy" https://twitter.com/TheAmolAvasare/status/2046724659039932830
👀 https://twitter.com/Dareasmunhoz/status/2046574025258754190
open philanthropy strikes again https://twitter.com/dareasmunhoz/status/2046574025258754190
@gabeeegoooh @ayaanzhaque @BoyuanChen0 @dibyayB @jianfw @kenjihata @kiwhansong0 @liang_weixin @Marco_B_Liang @mengchaozzz @yuguang_yang Congrats! 🥳
@gabeeegoooh @ayaanzhaque @BoyuanChen0 @dibyayB @jianfw @kenjihata @kiwhansong0 @liang_weixin @Marco_B_Liang @mengchaozzz @yuguang_yang woaw
RT @Reuters: Exclusive: Meta is installing new tracking software on US-based employees' computers to capture mouse movements, clicks and ke…
Mark Zuckerberg and Meta Platforms $META just sent a memo to employees saying Meta is installing new tracking software on the computers of all employees in the United States 🇺🇸 so it can train its AI. Meta said the tracking tool will run on a list of work-related apps and websites, and will capture things like mouse movements, keystrokes, and screenshots of what employees are seeing on their screens. - Reuters
This is with the new tokenizer, btw. Pareto-optimal model for WeirdML. Another piece of evidence that Opus 4.7 is a distilled Mythos (also extremely token-efficient model). But it more strongly suggests that they've changed their post-training towards even greater economy. https://twitter.com/htihle/status/2045146084914155861
RT @arena: More on Claude Opus 4.7: the Thinking variant from @AnthropicAI takes #1 in Code Arena! This is +27 points over Opus-4.6 Thinki…
RT @daniel_mac8: Anthropic allows OpenClaw usage again. From @openclaw docs. https://x.com/daniel_mac8/status/2046547526413644272/photo/1
"Something to show you," so they start with GPT Image Gen 2 at 12 pm PT (sadly 3 AM in China, where I am right now :( ). And Spud (GPT 5.5) probably Thursday.
I cannot express enough how excited I am about this! Make sure to tune in on OpenAI livestream at 12 PT, 3 PM ET. https://twitter.com/openai/status/2046589828918317155
RT @yohaniddawela: A single GPU can now calculate hundreds of global weather scenarios in under 60 seconds. The exact same task requires a…
This is a great example of what I call a "cloud law" (and it's about actual clouds!). A "cloud law" is a regular, exploitable pattern in nature that's too complex to be either intuited or explained by an individual human being. AI is opening up a whole new type of science. https://twitter.com/yohaniddawela/status/2046557831902417264
RT @thdxr: it's entirely fine and great for proprietary models to exist the issue is they try to kill open source behind scenes while pret…
RT @ClementDelangue: I’m hearing there’s renewed lobbying in DC and in state legislatures to ban or severely restrict open-source. Like a…
'OpenAI is preparing Agents in ChatGPT (codename Hermes)' @Teknium https://twitter.com/btibor91/status/2046545878538961304
Ok, since you all don't get the joke: it means they clearly see Hermes as superior to Lobsters. Otherwise they'd have named their silly side project after openclaw.
RT @QiaochuYuan: people compared GPT-5.4's solution of erdos #1196 to alphago's move 37 but i think a tighter analogy is to alphago's unusu…
alphago beat a human with this, but a human beat alphazero with an adversarial strategy too. these domains are more dynamic than you might expect https://twitter.com/QiaochuYuan/status/2046659137623454007
RT @thielfellowship: Welcome 2026 Thiel Fellows! WHO ARE THEY? Victor Boyd: Birmingham, AL - @VictorWBoyd Cavalla is on a mission to get…
congratulations to the 2026 class. the fellowship was the most impactful thing i've ever participated in, and the quality of fellows just keeps getting better https://twitter.com/thielfellowship/status/2046606224070733894
Moonshot’s Kimi K2.6 is the new leading open weights model. Kimi K2.6 lands at #4 on the Artificial Analysis Intelligence Index (54), behind only Anthropic, Google, and OpenAI (all 57).

Key takeaways:
➤ Increase in performance on agentic tasks: @Kimi_Moonshot's Kimi K2.6 achieves an Elo of 1520 on our GDPval-AA evaluation, a marked improvement over Kimi K2.5’s Elo of 1309. GDPval-AA is our leading metric for general agentic performance, measuring performance on knowledge work tasks such as preparing presentations and analysis. Models are given code execution and web browsing tools in an agentic loop via our open source reference agentic harness, Stirrup. This continues Kimi K2.6’s strength in tool use, maintaining a 96% score on τ²-Bench Telecom and placing it among other frontier models in this category.
➤ Low hallucination rate: Kimi K2.6 scores 6 on the AA-Omniscience Index, our knowledge evaluation measuring both accuracy and hallucination rate. This score is primarily driven by a comparatively low hallucination rate of 39% (down from Kimi K2.5’s 65%), indicating a greater capability to abstain rather than fabricate knowledge when the model is uncertain. Kimi K2.6’s low hallucination rate places it near other models such as Claude Opus 4.7 (36%) and MiniMax-M2.7 (34%).
➤ High token usage: Kimi K2.6 demonstrates high token usage, but is in line with other frontier models in the same intelligence tier. To run the full Artificial Analysis Intelligence Index, Kimi K2.6 used ~160M reasoning tokens, slightly lower than Claude Sonnet 4.6 (~190M reasoning tokens) but much higher than GPT 5.4 (~110M reasoning tokens).
➤ Open weights: Kimi K2.6 is a Mixture-of-Experts (MoE) model with 1T total parameters and 32B active, the same as the previous two generations, Kimi K2 Thinking and Kimi K2.5. Kimi K2.6 again pushes the open weights frontier in intelligence.
➤ Third-party access: Kimi K2.6 is accessible through Moonshot’s first-party API as well as third-party API providers Novita, Baseten, Fireworks, and Parasail.
➤ Multimodality: Kimi K2.6 natively supports image and video input and text output. The model’s max context length remains 256k.
Further analysis in the threads below.
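The Elo numbers in the report above can be read as pairwise-preference probabilities. A minimal sketch of the standard Elo expected-score formula (illustrative only; this is the textbook logistic model, not necessarily Artificial Analysis's actual rating procedure):

```python
def elo_expected_score(r_a: float, r_b: float) -> float:
    """Probability that a player rated r_a beats one rated r_b
    under the standard Elo model (logistic curve, base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

# Hypothetical reading of the GDPval-AA gap above: 1520 vs 1309
# implies K2.6's output would be preferred over K2.5's in roughly
# three out of four pairwise comparisons.
p = elo_expected_score(1520, 1309)
```

Under this model, equal ratings give exactly a 50% expected score, and every 400-point gap multiplies the win odds by 10.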
This specific reason is cope tho. Kimi 2.6 is a big leap over Kimi 2.5, still open, with the same license even. My source, who's usually very well informed, was pretty confident it's over and they go full commercial. Well, not yet. https://twitter.com/woke8yearold/status/2046719432831963501
OpenAI's new Euphony tool works almost exactly the same way as my Codex transcript viewer https://tools.simonwillison.net/codex-timeline?url=https%3A%2F%2Fgist.githubusercontent.com%2Fsimonw%2Fa9eb5993a2853ec840d26c0e56bde362%2Fraw%2Fb8c5febdf60d878da84e27c07efdaed159abde4a%2Flogs.jsonl#tz=local&q=&type=all&payload=all&role=all&hide=1&truncate=1 https://twitter.com/OpenAIDevs/status/2046620363568890230
I have found Euphony so useful internally! Glad it's now open source! https://twitter.com/OpenAIDevs/status/2046620363568890230
RT @antoine_chaffin: The new generation of open state-of-the-art single and multi-vector retrieval models is here It's time, DenseOn with…
RT @ozziekirkby: Frontier LLMs can do a lot—but can they write good flashcards? Turns out: not yet! @andy_matuschak and I created an eval…
Across many training, evaluation, and generation strategies, we find: no! LLMs are surprisingly bad at generating flashcards! And… getting slightly worse over time? https://x.com/andy_matuschak/status/2046621452041159057/photo/1
@_arohan_ congrats!!
@_arohan_ So it begins.
RT @cohere: New Technical Report from @EkagraRanjan: Contrary to what you might expect, MoE-based LLMs make speculative decoding even more…
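Speculative decoding, mentioned in the report above, has a cheap draft model propose several tokens at once and a target model verify them, keeping the longest agreed prefix plus one correction. A toy greedy sketch (illustrative only, not Cohere's method; the `draft`/`target` models here are made-up deterministic stand-ins):

```python
def greedy_speculative_decode(draft, target, prompt, k=4, max_new=12):
    """Greedy speculative decoding on toy models: the draft proposes
    k tokens, the target checks each one given the verified prefix,
    and the first disagreement is replaced by the target's token."""
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        # Draft proposes k tokens autoregressively.
        proposed, ctx = [], list(seq)
        for _ in range(k):
            t = draft(ctx)
            proposed.append(t)
            ctx.append(t)
        # Target verifies proposals; accept until the first mismatch.
        accepted, ctx = [], list(seq)
        for t in proposed:
            want = target(ctx)
            if t == want:
                accepted.append(t)
                ctx.append(t)
            else:
                accepted.append(want)  # correct the mismatch and stop
                break
        seq.extend(accepted)
    return seq[len(prompt):][:max_new]

# Toy stand-in models over tokens 0..9 (hypothetical, for the demo only):
# the draft always counts up; the target wraps back to 0 after a 3.
def draft(ctx):
    return (ctx[-1] + 1) % 10

def target(ctx):
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10
```

Each loop iteration here accepts up to k tokens for one verification pass, which is where the speedup comes from when the draft usually agrees with the target.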
edge city is special, check it out https://x.com/ashebytes/status/2046675495543070848/video/1 https://twitter.com/JoinEdgeCity/status/2046620050673516648
RT @JoinEdgeCity: How many of the best things in your life came from being in the right place at the right time? This summer, we're creati…
RT @pedroh96: OpenClaw is the fastest-growing open source project, but there are no stories of running it safely in production at scale. As…
@goodside interesting
Riley is back! 🫡 https://twitter.com/goodside/status/2046715990763774007
big world model tings! https://twitter.com/odysseyml/status/2046654139615326626