TL;DR — I spent the last few months building Sathi, an MCP server that exposes 70+ tools across 8 pillars (memory, tasks, habits, goals, finance, documents, skills, graph) so Claude can read and write all of my personal data through chat. This is the architecture and reasoning behind it — and why I'm now building MCP servers as a service.
The problem I was actually solving
I was juggling six productivity apps: Todoist for tasks, Streaks for habits, Apple Notes for thoughts, Google Sheets for expenses, a journaling app for reviews, and Drive for documents. None of them talked to each other.
Worse — every AI conversation started from zero context. I'd tell Claude about a project on Monday, and by Wednesday that context was gone. The LLM had no persistent memory of my life, so I spent half of every session re-explaining myself.
I didn't want another dashboard. I wanted to talk to one interface — in English, Hindi, or Hinglish — and have it handle the rest.
What Sathi is
Sathi is a single Model Context Protocol (MCP) server. Claude (or Cursor, or any MCP-compatible client) connects once via OAuth and gets read/write access to 70+ tools across 8 pillars:
- Memory — persistent knowledge with hybrid semantic + keyword search
- Tasks — priority-based with subtasks and auto-complete
- Habits — daily logging, streaks, analytics
- Goals — outcomes + milestones + cross-pillar reviews
- Finance — UPI auto-detection, categorization, summaries
- Documents — upload, extract, semantic search
- Skills — snapshots and progression
- Graph — cross-pillar links, so a task can reference a goal, which can reference a document
No forms. No dashboards (an optional one exists, but I rarely open it). Just chat.
Four architecture decisions that actually mattered
1. Stateless auth — re-authenticate every call
Vercel serverless functions can't reliably hold state between requests; any invocation may land on a fresh instance. Instead of fighting that, I leaned into it. Every MCP call re-authenticates.
The payoff: I can hit Sathi from Claude Desktop, Cursor, and my Android companion app simultaneously without state drift. No session collisions, no sync bugs. Slightly more latency, massively more reliability.
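To make the idea concrete, here is a minimal sketch of stateless per-request verification: everything needed to authenticate travels inside the bearer token itself, so any serverless instance can validate any request without shared session storage. This is an illustration only — Sathi's actual flow is OAuth 2.0 + PKCE, and the secret, token format, and function names here are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical stateless token: the user id plus an HMAC signature.
// No server-side session is stored, so every instance can verify
// every call independently. In practice the secret lives in an env var.
const SECRET = "demo-secret";

function signToken(userId: string): string {
  const sig = createHmac("sha256", SECRET).update(userId).digest("hex");
  return `${userId}.${sig}`;
}

// Returns the user id if the signature checks out, null otherwise.
function verifyToken(token: string): string | null {
  const dot = token.lastIndexOf(".");
  if (dot < 0) return null;
  const userId = token.slice(0, dot);
  const expected = createHmac("sha256", SECRET).update(userId).digest("hex");
  const a = Buffer.from(token.slice(dot + 1));
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check length first.
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  return userId;
}
```

The trade is exactly the one described above: a signature check on every call instead of a session lookup, in exchange for zero cross-instance state.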
2. Stripe-style prefixed IDs
Every entity has a self-describing ID: pa_task_xxx, pa_habit_xxx, pa_memory_xxx. In logs, I instantly know what I'm looking at. Debugging went from "what is this UUID?" to "oh, that's a habit log."
Small decision, huge compounding value.
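A generator for this kind of ID is a few lines. This is a sketch of the pattern, not Sathi's actual implementation — the entity list and regex are illustrative.

```typescript
import { randomUUID } from "node:crypto";

// Stripe-style prefixed IDs: the entity type is readable straight
// from the identifier in any log line. "pa_" is the app namespace.
type Entity = "task" | "habit" | "memory" | "goal";

function makeId(entity: Entity): string {
  // Strip dashes so the ID stays one compact token in logs.
  return `pa_${entity}_${randomUUID().replace(/-/g, "")}`;
}

// Recover the entity type from an ID, e.g. for generic error messages.
function entityOf(id: string): string | null {
  const m = id.match(/^pa_([a-z]+)_[0-9a-f]{32}$/);
  return m ? m[1] : null;
}
```

The payoff shows up everywhere an ID appears without its surrounding context: logs, error reports, foreign keys in the graph pillar.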
3. IST timestamps, stored natively
I'm the only user. I live in Delhi. Every timezone conversion was pure overhead — and week-boundary math broke in subtle ways when UTC converted back to IST.
So I just store IST (Asia/Kolkata) directly. Streak calculations, weekly summaries, "this month's spending" — all correct by default, no off-by-one day errors. Would I do this for a multi-region product? No. For a personal tool? Every single time.
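The day-boundary bug is easy to see with a concrete instant: 19:00 UTC on Jan 1 is already 00:30 on Jan 2 in Asia/Kolkata, so a habit logged then belongs to the next day's streak. A minimal sketch of deriving the IST calendar date (function name is mine, not Sathi's):

```typescript
// Derive the calendar date in Asia/Kolkata for a given instant.
// Aggregating on this date avoids the UTC round-trip that shifts
// late-evening events onto the wrong day.
const istDay = new Intl.DateTimeFormat("en-CA", {
  timeZone: "Asia/Kolkata",
  year: "numeric",
  month: "2-digit",
  day: "2-digit",
});

// en-CA formats as "YYYY-MM-DD", which sorts and groups cleanly.
function istDateString(instant: Date): string {
  return istDay.format(instant);
}
```

Storing this string natively means streaks, weekly summaries, and monthly spending all group on the day the user actually experienced.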
4. Hybrid search — pgvector + Postgres FTS in one query
Pure vector search misses exact keyword matches (names, dates, IDs). Pure full-text search misses intent. So I wrote a single SQL function that merges both with a tunable score weighting.
Asking "when did I last meet Sarah?" works whether "Sarah" is the exact memory title or buried in a paragraph about a 2024 project. Same query, same function, one round-trip.
The numbers I'm proud of
- 70+ MCP tools across 8 pillars
- 812 tests across 74 files
- 92% statement coverage (84.9% branch, 95.69% function)
- OAuth 2.0 + PKCE compliant
- Sub-300ms p50 tool call latency
The tech stack
Next.js 16, Postgres + pgvector (Supabase), TypeScript strict mode everywhere, Zod validation on every tool input, OpenAI text-embedding-ada-002 for semantic vectors, MCP SDK from Anthropic, deployed on Vercel. Android companion app built with React Native + Expo.
What I learned building this
Three things I'll carry into every future project:
1. Fight the urge to generalize early. Sathi is aggressively single-user. I could have built multi-tenancy from day one. I didn't. Shipping a working thing for one user beat a half-working thing for theoretical users.
2. Tool surface area is a design problem, not an engineering one. I landed on a query_X / manage_X pattern per pillar — ~19 top-level tools fronting 64 logical operations. Claude routes intent to the right tool more reliably when the surface is flat and named clearly.
3. The boring parts are the hard parts. The hybrid search took a weekend. The OAuth flow took two. What took months? IST edge cases, offline queue sync on mobile, and tests that actually meant something.
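The query_X / manage_X pattern from lesson 2 can be sketched as one flat tool per pillar with an action field routing to the logical operation. Field names and actions below are illustrative, not Sathi's actual schema (the real inputs are validated with Zod).

```typescript
// One manage_tasks tool fronting several logical operations.
// A discriminated union on "action" keeps the surface flat while
// the compiler still checks each branch's required fields.
type ManageTasksInput =
  | { action: "create"; title: string; priority?: number }
  | { action: "complete"; id: string }
  | { action: "delete"; id: string };

function manageTasks(input: ManageTasksInput): string {
  switch (input.action) {
    case "create":
      return `created "${input.title}"`;
    case "complete":
      return `completed ${input.id}`;
    case "delete":
      return `deleted ${input.id}`;
  }
}
```

Exposing ~19 tools of this shape instead of 64 separate ones gives the model a short, clearly named menu to route intent against.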
Watch it in action
4-minute demo on YouTube:
Live at sathi.devfrend.com. Free, bootstrapped, Android companion app available.
Need a custom MCP server?
Sathi is my proof that MCP servers can do serious work — not just demos. If you're building an AI product and want purpose-built MCP tools that actually ship, I take on a small number of contract clients per quarter.
Book a free 30-minute consult →
Or reach out directly: theamargupta.tech@gmail.com
