Runs an LLM inference server with continuous batching and tiered SSD KV caching for Apple Silicon, managed via a macOS menu bar app.
Runs DeepSeek V4 Flash inference on Apple Metal with a disk-persisted KV cache and an OpenAI-compatible server.
Implements linear-time sequence modeling via sparse top-k routing that selectively updates fixed memory slots for long-range associative recall.
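A minimal sketch of the top-k slot-routing idea in NumPy: each input is routed to the k most similar memory slots and only those slots are updated, so per-token cost stays constant in the number of slots touched. The function and variable names here are illustrative assumptions, not the project's actual API.

```python
import numpy as np

def topk_slot_update(memory, keys, x, k=2, lr=0.5):
    """Route input x to the top-k memory slots by key similarity and
    update only those slots. All other slots are left untouched, which
    is what makes the per-token update sparse and sequence-length-free."""
    scores = keys @ x                       # (num_slots,) similarity scores
    top = np.argpartition(scores, -k)[-k:]  # indices of the k best slots
    gates = np.exp(scores[top] - scores[top].max())
    gates /= gates.sum()                    # softmax gate over selected slots
    memory[top] += lr * gates[:, None] * (x - memory[top])
    return memory, top

rng = np.random.default_rng(1)
memory = np.zeros((8, 4))                   # 8 fixed slots of width 4
keys = rng.standard_normal((8, 4))
x = rng.standard_normal(4)
updated, top = topk_slot_update(memory.copy(), keys, x, k=2)
```

Recall would then read from the same slots via the score softmax; the sketch only shows the sparse write path.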
Aligns LLMs without parameter updates by applying a fixed set of in-context stylistic examples and system prompts.
Deploys personality-driven AI agents that survey codebases, pitch ideas, critique them in character, and generate design documents.
Extends the OpenAI Codex CLI with hooks, agent teams, HUDs, and .omx/ state for structured workflows.
Applies Muon-style momentum orthogonalization via Newton-Schulz iterations to matrix parameters in PyTorch, reusing cached orthogonalization directions for efficiency and falling back to Adam for non-matrix parameters.
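For illustration, here is the classic cubic Newton-Schulz iteration that underlies this kind of momentum orthogonalization, written as a NumPy sketch (Muon itself uses tuned quintic coefficients, and the project runs on PyTorch tensors; this stand-in just shows the fixed-point iteration toward the orthogonal polar factor):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=30):
    """Approximate the nearest orthogonal matrix (polar factor) of G via
    the classic Newton-Schulz iteration: X <- 1.5*X - 0.5*X @ X.T @ X.
    The iteration converges when the starting spectral norm is <= 1,
    which dividing by the Frobenius norm guarantees."""
    X = G / (np.linalg.norm(G) + 1e-8)   # Frobenius norm bounds spectral norm
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X  # each step pushes singular values to 1
    return X

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))          # stand-in for a momentum matrix
O = newton_schulz_orthogonalize(M)
```

Each step only needs matrix multiplies, which is why it maps well to GPU/accelerator hardware compared with an explicit SVD.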
Ports pi-mono to Swift for in-process subagents and prompt templates in iPad apps.
Trains PyTorch drifting models for single-step image generation by approximating attraction with projected RKHS while keeping repulsion exact.
Mirrors Android device screens over USB or TCP/IP and controls them with keyboard and mouse without root.
Visualizes WhatsApp analytics from SQLite archives via a local React dashboard and read-only Express API.
Implements a bit-serial RV32I RISC-V CPU core that processes one bit per clock cycle.
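To show the bit-serial principle behind such a core, here is a Python simulation of a bit-serial adder: a single full adder plus a carry register produces one result bit per simulated clock cycle, LSB first. This is an illustrative model of the datapath style, not the project's actual HDL.

```python
def bit_serial_add(a, b, width=32):
    """Simulate a bit-serial adder producing one sum bit per 'clock cycle'.
    One full adder and a carry flip-flop replace a 32-bit-wide adder,
    trading 32 cycles of latency for minimal logic area."""
    carry, result = 0, 0
    for i in range(width):                # one clock cycle per bit, LSB first
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        s = bit_a ^ bit_b ^ carry         # full-adder sum bit
        carry = (bit_a & bit_b) | (carry & (bit_a ^ bit_b))  # carry out
        result |= s << i
    return result & ((1 << width) - 1)    # wrap at 32 bits, like hardware
```

A bit-serial RV32I ALU applies the same idea to every arithmetic and logic operation, which is why such cores are tiny but take tens of cycles per instruction.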