Hi, I might give it away as open source (it's called GBrain) https://github.com/garrytan/gbrain https://twitter.com/businessbarista/status/2044874360280723934
GBrain, a knowledge brain for OpenClaw agents developed by the author, provides searchable, indexed memory over markdown repos using Postgres, pgvector, and hybrid search (keyword + vector), enabling instant queries like "Who should I invite to dinner who knows both Pedro and Diana?" across 3,000+ people pages, 5,800 Apple Notes, and 13 years of calendar data. Built after grep failed at thousands of files, it fulfills Andrej Karpathy's LLM OS vision of AI-maintained personal wikis with compiled truth pages and append-only timelines, where agents incrementally update and enrich entities from meetings, emails, and tweets to make users smarter over time without manual note-taking.
Context: Quoting @businessbarista: "Someone is going to build a worldclass “Brain” for enterprises & make a stupid amount of money. Why? As @da_fant said, “coding w ai is solved bc all context is in the git repo. knowledge work is difficult bc context is spread out. an ai system that creates a git repo w all"
Garry's Opinionated OpenClaw Brain.
The memex Vannevar Bush imagined, built for people who think for a living.

GBrain is a knowledge brain for OpenClaw agents. It gives your agent a searchable, indexed memory over your markdown repos using Postgres + pgvector + hybrid search. It works with any OpenClaw agent: paste the install instructions into your agent, and it handles the rest.

Here's what one person built with gbrain and a single AI agent over six months. All in Postgres. All searchable by meaning, not just keywords. All maintained by an agent that runs while you sleep.

- "Who should I invite to dinner who knows both Pedro and Diana?" — cross-references the social graph across 3,000+ people pages
- "What have I said about the relationship between shame and founder performance?" — searches YOUR thinking, not the internet
- "What changed with the Series A since Tuesday?" — diffs timeline entries across deal and company pages
- "Prep me for my meeting with Jordan in 30 minutes" — pulls dossier, shared history, recent activity, open threads

Every meeting, email, tweet, and person enrichment flows back into the brain. Six months from now you know more than any human could retain. Not because you're taking notes — because the system never forgets. Your markdown repo is the source of truth. GBrain makes it searchable. Your AI agent makes it live.

Andrej Karpathy's LLM OS / Knowledge LLM post sketched the vision: a personal wiki maintained by AI agents, where every page is a living document that gets smarter as the agent processes more information. I started building exactly that. Markdown files in a git repo, one page per entity, compiled truth on top, append-only timeline on the bottom. It worked. Until I hit thousands of files.

At 500 files, grep is fine. At 3,000 people pages, 5,800 Apple Notes, and 13 years of calendar data, grep falls apart. You need real search: keyword for exact names, vector for semantic meaning, and something that fuses both.
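The fusion step can be sketched with reciprocal rank fusion (RRF), which merges a keyword-ranked list and a vector-ranked list without trying to compare their incompatible raw scores. This is an illustrative sketch, not gbrain's actual implementation; `rrfFuse` is a hypothetical name, and `k = 60` is the constant from the original RRF paper, assumed here.

```typescript
// Reciprocal Rank Fusion: a document at rank r in a list contributes
// 1 / (k + r), and contributions are summed across both lists.
type Ranked = { id: string; score: number };

function rrfFuse(keyword: string[], vector: string[], k = 60): Ranked[] {
  const scores = new Map<string, number>();
  for (const list of [keyword, vector]) {
    list.forEach((id, i) => {
      // rank is 1-based: i + 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// A doc ranked well in BOTH lists beats a doc ranked high in only one.
const fused = rrfFuse(["a", "b", "c"], ["c", "a", "d"]);
```

Here "a" (ranked #1 by keyword, #2 by vector) fuses ahead of "c" (#3 and #1), which is the behavior that makes hybrid search robust: agreement between retrievers outweighs a single strong rank.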
You need an index that updates incrementally when one file changes, not a full directory walk. You need your agent to find "everyone who was at the board dinner last March" in milliseconds, not 30 seconds of grepping. That's what GBrain is: the search and sync layer I had to build once the brain outgrew grep.

GBrain fixes this with hybrid search that combines keyword and vector approaches, plus a knowledge model that treats every page like an intelligence assessment: compiled truth on top (your current best understanding, rewritten when evidence changes), append-only timeline on the bottom (the evidence trail that never gets edited). AI agents maintain the brain. You ingest a document and the agent updates every entity mentioned, creates cross-reference links, and appends timeline entries. MCP clients query it. The intelligence lives in fat markdown skills, not application code.

Most tools help you find things. GBrain makes you smarter over time. The core loop: every cycle through this loop adds knowledge. The agent enriches a person page after a meeting. Next time that person comes up, the agent already has context — their role, your history, what they care about, what you discussed last time. You never start from zero. An agent without this loop answers from stale context. An agent with it gets smarter every conversation. The difference compounds daily.

Never do anything twice. If you look someone up once, that lookup lives in the brain forever. If a pattern emerges across three meetings, the agent captures it. If you generate an original idea in conversation, it goes to `originals/` — your searchable intellectual archive.

The repo is the system of record. GBrain is the retrieval layer. The agent reads and writes through both. Human always wins — you can edit any markdown file directly and `gbrain sync` picks up the changes.

The numbers above aren't theoretical. They come from a real deployment documented in GBRAIN_SKILLPACK.md — a reference architecture for how a production AI agent uses gbrain as its knowledge backbone. The skillpack is a pattern book, not a tutorial: "Here's what works, here's why."

GBrain is world knowledge — people, companies, deals, meetings, concepts, your original thinking. It's the long-term memory of what you know about the world. OpenClaw agent memory (`memory_search`) is operational state — preferences, decisions, session context, how the agent should behave. They're complementary, and all three layers should be checked: GBrain for facts about the world, memory for agent config, session for immediate context. Install via `openclaw skills install gbrain`.

GBrain doesn't ship with demo data. It finds YOUR markdown and makes it searchable.

- Act 1: Discovery. GBrain scans your machine for markdown repos.
- Act 2: Import. Your files move from the repo into Supabase.
- Act 3: Search. The agent picks a query from your actual content.

Your file count will be different. Your queries will be different. The agent picks them based on what it imported. That's the point: this is YOUR brain, not a demo.

The compounding effect: search for Pedro, and the agent pulls his page, his relationship history, his company. Next time Brex comes up in conversation, the agent already knows Pedro co-founded it, what you discussed last, and what's on your open threads. You didn't do anything — the brain already had it.

GBrain needs three things to run. Set the API keys as environment variables; the Supabase connection URL is configured during `gbrain init`, and the OpenAI and Anthropic SDKs read their keys from the environment automatically. Without an OpenAI key, search still works (keyword only, no vector search). Without an Anthropic key, search still works (no multi-query expansion, no LLM chunking).
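The graceful degradation described above can be sketched as a capability check at startup. The `Capabilities` type and `capabilitiesFromEnv` function are hypothetical names for illustration; `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` are the standard environment variables the two SDKs read.

```typescript
// Hypothetical sketch: which search features are available given which
// API keys are present. Keyword search needs no key at all.
type Capabilities = {
  vectorSearch: boolean;   // needs an OpenAI key (embeddings)
  queryExpansion: boolean; // needs an Anthropic key (multi-query expansion)
  llmChunking: boolean;    // needs an Anthropic key (LLM-guided chunking)
};

function capabilitiesFromEnv(
  env: Record<string, string | undefined>
): Capabilities {
  return {
    vectorSearch: Boolean(env.OPENAI_API_KEY),
    queryExpansion: Boolean(env.ANTHROPIC_API_KEY),
    llmChunking: Boolean(env.ANTHROPIC_API_KEY),
  };
}

// With no keys set, only keyword search remains available.
const caps = capabilitiesFromEnv({});
```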
To install, paste this into OpenClaw and we'll work with you to do the rest. OpenClaw will install gbrain, walk through Supabase setup, discover your markdown files, import them, and prove search works with a query from your data. After setup, you talk to your brain through OpenClaw.

GBrain keeps your brain current. After setup, `gbrain sync --watch` polls your git repo and imports only what changed. Binary files (images, PDFs, audio) can be moved to cloud storage with `gbrain files mirror` to slim down your git repo.

Supabase settings: GBrain connects directly to Postgres (not the REST API). You need the Session pooler connection string (port 6543), not the project URL or anon key. Find it under Project Settings > Database > Connection string > URI tab, then change the dropdown to "Session pooler". All paths require a Postgres database with pgvector; Supabase Pro ($25/mo) is the recommended zero-ops option.

Upgrade depends on how you installed. After upgrading, run `gbrain init` again to apply any schema migrations (idempotent, safe to re-run). After installing via the CLI or library path, run the setup wizard. Config is saved to `~/.gbrain/config.json` with 0600 permissions. OpenClaw users skip this step; the orchestrator runs the wizard for you during install.

Import is idempotent. Re-running it skips unchanged files (compared by SHA-256 content hash). A progress bar shows status: ~30s for text import of 7,000 files, ~10-15 min for embedding.

Every page in the brain follows the compiled truth + timeline pattern. Above the `---` separator: compiled truth, your current best understanding, rewritten when new evidence changes the picture. Below: the timeline, an append-only evidence trail that is never edited, only added to. The compiled truth is the answer. The timeline is the proof.

Keyword search alone misses conceptual matches. "Ignore conventional wisdom" won't find an essay titled "The Bus Ticket Theory of Genius" even though it's exactly about that.
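The idempotent import can be sketched as a hash gate: a file is re-imported only when its SHA-256 content hash differs from what was stored last time. This is a minimal sketch assuming an in-memory store; gbrain's real import keeps hashes in Postgres, and `needsImport` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

// Hash the file content; identical content always yields an identical hash.
function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Returns true (and records the new hash) only when content changed.
function needsImport(
  path: string,
  content: string,
  stored: Map<string, string> // path -> last imported content hash
): boolean {
  const hash = sha256(content);
  if (stored.get(path) === hash) return false; // unchanged: skip
  stored.set(path, hash);
  return true;
}

const stored = new Map<string, string>();
needsImport("people/pedro.md", "# Pedro", stored);            // first run: imported
const rerun = needsImport("people/pedro.md", "# Pedro", stored); // re-run: false, skipped
```

This is why re-running import is safe: unchanged files cost one hash computation and no database writes.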
Vector search alone misses exact phrases when the embedding is diluted by surrounding text. RRF fusion gets both right. Multi-query expansion catches phrasings you didn't think of.

Storage is 10 tables in Postgres + pgvector. Indexes: B-tree on slug/type, GIN on frontmatter/search_vector, HNSW on embeddings, pg_trgm on title for fuzzy slug resolution.

Chunking uses three strategies, dispatched by content type:

- Recursive (timeline, bulk import): 5-level delimiter hierarchy (paragraphs, lines, sentences, clauses, words). 300-word chunks with 50-word sentence-aware overlap. Fast, predictable, lossless.
- Semantic (compiled truth): embeds each sentence, computes adjacent cosine similarities, applies Savitzky-Golay smoothing to find topic boundaries. Falls back to recursive on failure. Best quality for intelligence assessments.
- LLM-guided (high-value content, on request): pre-splits into 128-word candidates, asks Claude Haiku to identify topic shifts in sliding windows. 3 retries per window. Most expensive, best results.

GBrain is library-first. The CLI and MCP server are thin wrappers over the engine. The BrainEngine interface is pluggable; see `docs/ENGINES.md` for how to add backends.

To use it from an MCP client, add it to your Claude Code or Cursor MCP config. 30 tools are generated from the contract-first `operations.ts`: page CRUD, search, tags, links, timeline, admin, sync, raw data, file management, and more. Every tool is generated from the same operation definitions as the CLI; parity tests verify structural identity.

Skills are fat markdown files that tell AI agents HOW to use gbrain. No skill logic lives in the binary.

Embedding, chunking, and search fusion are engine-agnostic. Only raw keyword search (`searchKeyword`) and raw vector search (`searchVector`) are engine-specific. RRF fusion, multi-query expansion, and 4-layer dedup run above the engine on `SearchResult[]` arrays.

For a brain with ~7,500 pages, the Supabase free tier (500MB) won't fit. Supabase Pro ($25/mo, 8GB) is the starting point.
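The recursive strategy's sliding window can be sketched in a few lines. This simplified version uses plain word windows (300-word chunks, 50-word overlap) and omits the delimiter hierarchy and sentence-aware boundary snapping that the real strategy adds; `chunkWords` is a hypothetical name.

```typescript
// Simplified word-window chunker: fixed-size chunks with overlap so
// context spanning a chunk boundary appears in both neighbors.
function chunkWords(text: string, size = 300, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = size - overlap; // advance 250 words per chunk
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}

// 600 words -> 3 chunks starting at words 0, 250, and 500;
// consecutive chunks share a 50-word overlap.
const sample = Array.from({ length: 600 }, (_, i) => `w${i}`).join(" ");
const chunks = chunkWords(sample);
```

The overlap is what keeps retrieval lossless at boundaries: a sentence that ends one chunk also opens the next, so neither embedding misses it.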
Initial embedding cost: ~$4-5 for 7,500 pages via OpenAI text-embedding-3-large.

See CONTRIBUTING.md. Run `bun test` for unit tests. For E2E tests against real Postgres + pgvector: `docker compose -f docker-compose.test.yml up -d`, then `DATABASE_URL=postgresql://postgres:postgres@localhost:5434/gbrain_test bun run test:e2e`. PRs are welcome.

License: MIT
| Time | Views | Likes | Bookmarks | RTs | Replies |
|---|---|---|---|---|---|
| 11:00 AM UTC | +343 | +1 | +4 | — | — |
| 10:50 AM UTC | +343 | +1 | +1 | — | — |
| 10:40 AM UTC | +284 | +3 | +3 | +1 | — |
| 10:30 AM UTC | +247 | +3 | +3 | — | — |
| 10:20 AM UTC | +275 | +3 | +2 | — | — |
| 10:10 AM UTC | +254 | +2 | +2 | — | — |
| 10:00 AM UTC | +226 | +1 | +3 | — | — |
| 9:50 AM UTC | +22 | +1 | — | — | — |
| 9:40 AM UTC | +398 | +4 | +8 | — | — |
| 9:30 AM UTC | +266 | — | +2 | — | — |