From: Thinking Machines introduces Interaction Models for real-time collaboration
Rowan Zellers @rown

Our interaction model is the first general video+speech model that's visually proactive. It was super fun working on this with @liliyu_lili / @saurabh_garg67 / @AndreaMadotto and others - after countless versions it was amazing when visual interruptions suddenly worked!

Lili Yu @liliyu_lili

We’re interested in AI systems that can collaborate in real time, without relying only on artificial turn boundaries. For audio, this feels natural: listen, speak, interrupt, update. For video, we think an important version of this is visual proactivity — models that respond when something happens visually: “Tell me when I start slouching.” “Count my pushups.” “Say stop when the person stops doing X.”
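The shift described above — from waiting for a turn boundary to reacting the moment a visual condition fires — can be sketched as a simple event loop. This is a hypothetical illustration only, not Thinking Machines' actual architecture or API: `Frame`, `watch`, and the mock per-frame `pushup_count` are all invented stand-ins for whatever a real vision model would produce.

```python
# Hypothetical sketch of a visually proactive loop: scan a continuous frame
# stream and speak whenever a visual trigger fires, with no user turn needed.
# All names are illustrative, not Thinking Machines' real API.
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

@dataclass
class Frame:
    t: float           # timestamp in seconds
    pushup_count: int  # stand-in for a vision model's per-frame inference

def watch(frames: Iterable[Frame],
          trigger: Callable[[Frame], Optional[str]]) -> list[str]:
    """Scan every frame; collect a proactive utterance whenever `trigger` fires."""
    utterances = []
    for frame in frames:
        msg = trigger(frame)
        if msg:
            utterances.append(msg)
    return utterances

def make_pushup_counter() -> Callable[[Frame], Optional[str]]:
    """Stateful trigger for 'count my pushups': speak on each new rep."""
    seen = 0
    def trigger(frame: Frame) -> Optional[str]:
        nonlocal seen
        if frame.pushup_count > seen:
            seen = frame.pushup_count
            return f"That's {frame.pushup_count}!"
        return None
    return trigger

# Mock 6-second stream at 2 fps, one rep roughly every 2 seconds.
stream = [Frame(t=i * 0.5, pushup_count=i // 4) for i in range(1, 13)]
print(watch(stream, make_pushup_counter()))
# → ["That's 1!", "That's 2!", "That's 3!"]
```

The point of the sketch is that the model stays silent on most frames and interjects only on a state change — the visual analogue of interrupting mid-utterance in audio.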
