I build AI systems for companies that need them to survive contact with real users, not collapse the first time someone speaks in Hinglish, drops the call, or loses Wi-Fi.
Outbound AI calling system for a hospital network: it calls discharged patients on real Indian SIM numbers, in their own language. Compared Pipecat, LiveKit and Bolna on the same test scenario before committing. ElevenLabs for TTS, Deepgram for transcription, Sarvam for Hindi STT, Plivo for telephony. Unit economics close at ₹1.20–1.30 per minute at scale. That's the hard part nobody writes threads about.
A 24/7 conversational appointment agent built on LiveKit and Pipecat. Reads real scheduling data, checks availability across providers dynamically, handles rescheduling and cancellations — all in voice, all in real time. Not a decision tree with a voice skin. An actual reasoning agent that books appointments.
Qwen2.5-0.5B fine-tuned across five rounds on real Indian clinical transcripts — Hinglish, doctor shorthand, dose changes buried mid-sentence. Extracts eleven structured fields per transcript with precision that surprised the reviewing doctors. Small enough to run on a ₹25,000 tablet. The kind of fine-tuning where you're writing custom post-processors between every round because the failure modes are always new.
The model, wrapped in a full Android product. Runs entirely offline — no cloud, no subscription, no data leaving the hospital. Doctors dictate, the app extracts vitals, medications, dosage changes, diagnoses and follow-ups in real time. Also handles OCR on lab reports, vital history tracking, and report summarisation. Shipped as a branded APK, in daily clinical use.
Most RAG systems return the nearest neighbor. VenueBot returns the right answer. A discovery engine that reasons about constraints across turns, narrows results without restarting, and knows when to say no instead of hallucinating a venue that doesn't exist.
Domain-tuned chunking and a custom reranker built for venue-specific jargon. Not off-the-shelf similarity.
Memory that narrows the result set across turns. A follow-up like "but outdoor" filters the current results; it doesn't restart the search.
Confidence thresholds. When the system doesn't know, it says so. No hallucinated venues.
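A minimal sketch of the last two ideas, narrowing across turns and refusing to answer below a confidence threshold. This is an illustration under stated assumptions, not VenueBot's actual code: the `Venue` and `Session` classes, the tag-based filter, the 0.6 threshold, and the sample venues are all hypothetical, and `score` stands in for whatever a reranker would produce.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Venue:
    name: str
    tags: frozenset
    score: float  # hypothetical reranker score in [0, 1]


@dataclass
class Session:
    candidates: list
    min_confidence: float = 0.6  # assumed threshold, chosen for illustration
    required: set = field(default_factory=set)

    def refine(self, tag):
        """Add a constraint and filter the current candidates (no fresh search)."""
        self.required.add(tag)
        self.candidates = [v for v in self.candidates if tag in v.tags]
        return self.answer()

    def answer(self):
        """Return the best confident match, or None when nothing qualifies."""
        confident = [v for v in self.candidates if v.score >= self.min_confidence]
        if not confident:
            return None  # caller says "I don't know" instead of inventing a venue
        return max(confident, key=lambda v: v.score)


# Made-up sample data, for illustration only.
venues = [
    Venue("Lakeside Lawn", frozenset({"outdoor", "wedding"}), 0.82),
    Venue("Grand Hall", frozenset({"indoor", "wedding"}), 0.91),
    Venue("Rooftop 9", frozenset({"outdoor", "corporate"}), 0.40),
]

session = Session(candidates=venues)
first = session.refine("wedding")     # best wedding venue so far
second = session.refine("outdoor")    # "but outdoor" narrows the same set
third = session.refine("corporate")   # nothing left, so the answer is None
```

Each call to `refine` shrinks the same candidate list, so earlier constraints stay in force, and a `None` result is the signal to say no rather than return a weak match.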
"I take on one or two serious projects at a time. I send working code, not decks. I don't disappear for a week when something breaks. If you want a thirty-person agency — we're not a fit. If you want a senior builder who'll actually ship your thing — we probably are."
You send a paragraph. I send back a real proposal with cost breakdowns, stack reasoning, and a timeline I'll actually hit. Usually within 48 hours.
You get the repo from day one. Weekly written updates, Loom walkthroughs when they help. No black boxes, no big reveal at the end.
Handover isn't a zip file. I stay for the real-user edge cases and document things your next dev can actually read.