
Perceptron AI releases Perceptron Mk1 multi-modal model


Perceptron AI has released Perceptron Mk1, a multi-modal model aimed at frontier video understanding and embodied reasoning. According to the company, the model was developed over 16 months using custom recipes optimized for physical-world performance. Demonstrations show it extracting highlights from overhead soccer footage and recommending optimizations for a robotic arm working around obstructions in a factory. A public demo is available at demo.perceptron.inc, and the company says it will open a small partners program for direct access to the weights.

Original post

Today we're releasing Perceptron Mk1: frontier video and embodied reasoning.

8:07 AM · May 12, 2026

@ArmenAgha It was one of the main/coolest features of OWL-ViT quite a while ago. Luckily you said almost all 😁

Armen Aghajanyan@ArmenAgha

Pointing by example is a surprisingly useful capability almost all multimodal models except ours get wrong.

5:19 PM · May 12, 2026 · 3.3K Views
4:33 PM · May 13, 2026 · 3.9K Views

I'm excited to finally release the fruit of the research we've been doing at Perceptron for the last 16 months: Perceptron Mk1. We've been developing multi-modal recipes from the ground up to build models that perform best in the physical world, from video understanding to embodied reasoning to robotics. Mk1 is our scaled-up recipe.

Perceptron AI@perceptroninc

Today we're releasing Perceptron Mk1: frontier video and embodied reasoning.

3:07 PM · May 12, 2026 · 1.2M Views
3:34 PM · May 12, 2026 · 42.8K Views

Mk1 is incredibly cost-effective. It performs on par with Gemini-Flash/Gemini-ER as well as the larger open source Qwen models on all perceptive/physical AI tasks, at a fraction of the cost ($0.15/M input, $1.50/M output).

3:34 PM · May 12, 2026 · 1.6K Views
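
To put the quoted prices ($0.15 per million input tokens, $1.50 per million output tokens) in concrete terms, here is a minimal sketch of the arithmetic. It assumes simple linear per-token billing with no discounts or minimums, which the post does not confirm, and the function name and token counts are hypothetical. The post also does not say how video or image input is converted to tokens, so the input count is just a placeholder.

```python
# Prices quoted in the post, in US dollars per million tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost under an assumed linear per-token billing model."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 100K output tokens.
cost = estimate_cost_usd(2_000_000, 100_000)
print(f"${cost:.2f}")  # prints $0.45
```

Under these assumptions, output tokens cost ten times as much as input tokens, so workloads that read long videos but return short answers would be dominated by the input side.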

On Embodied Reasoning tasks, Mk1 hits the frontier while running cheaper and faster.

3:34 PM · May 12, 2026 · 1.2K Views
3:34 PM · May 12, 2026 · 539 Views

Video understanding is best in class.

3:34 PM · May 12, 2026 · 444 Views
3:34 PM · May 12, 2026 · 607 Views

Try it: http://demo.perceptron.inc

Interested in the weights? We'll be opening a small partners program for folks to get direct access to the model. DM me.

3:34 PM · May 12, 2026 · 584 Views

We're throwing a release party in Bellevue, lined up with MLSys. Thursday May 21, 6 to 9pm; food, drinks, RSVP at https://partiful.com/e/bsC2wZxXPhFiDbMynb5l

8:54 PM · May 13, 2026 · 4.9K Views

@ArmenAgha @Scobleizer Congrats, very cool model!

5:47 PM · May 12, 2026 · 324 Views

new architecture ideas coming out of ai startups are cool as hell

11:44 PM · May 12, 2026 · 2.5K Views