Meet-Me: The Digital Twin
<400ms TTFT (Time To First Token)
98% Retrieval Accuracy (Ground Truth)
5/min Rate Limit (SlowAPI)
The Challenge
Professional portfolios are traditionally static, so they fail to capture the nuance of a senior machine learning engineer's decision-making process. I needed a way to give technical recruiters and collaborators an interactive, authoritative source of truth that could explain my architectural choices without manual intervention.
The Solution
I engineered Meet-Me, a localized retrieval-augmented generation (RAG) kernel that serves as my Digital Twin.
- Contextual Brain (Vault Ingestion): The system indexes a curated Obsidian vault using Qdrant for vector storage. It leverages hierarchical tagging (Parent Categories + Retrieval Hooks) to ensure precise categorical alignment.
- Inference Engine (Groq LPU): To achieve sub-second responsiveness, I use Llama-3.3-70B served on Groq, with prefix caching of the static resume data.
- Security & Rate Management: The backend is built on FastAPI. Its middleware stack uses SlowAPI to rate-limit requests (5 per minute) and custom API key validation to protect inference quotas.
- Terminal Interface: The frontend is a custom Astro component with a terminal aesthetic and an amber-themed, high-contrast UI. It uses async generators to stream tokens in real time.
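To make the Security & Rate Management bullet concrete, here is a minimal sketch of a 5-per-minute limit combined with an API key check. It is not SlowAPI itself and not the project's actual middleware: it is a standard-library sliding-window illustration of the same behaviour, and the names `SlidingWindowLimiter`, `is_valid_key`, and `handle_request` are hypothetical.

```python
import hmac
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each client.

    Illustrative stand-in for a rate-limiting library, not SlowAPI's API.
    """

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted calls

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def is_valid_key(presented, expected):
    """Compare keys in constant time so timing does not leak key contents."""
    return hmac.compare_digest(presented.encode(), expected.encode())


def handle_request(limiter, client_id, api_key, expected_key):
    """Return an HTTP-style status code for one incoming request."""
    if not is_valid_key(api_key, expected_key):
        return 401  # rejected before it can consume rate-limit budget
    if not limiter.allow(client_id):
        return 429  # too many requests in the current window
    return 200
```

Checking the key before the limiter means unauthenticated traffic never counts against a legitimate client's quota. Whether the real service orders the checks this way is an assumption.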
The Impact
Meet-Me bridges the gap between high-level summaries and deep-dive technical specs. It enforces a zero-hallucination policy through strict cosine similarity thresholding and verified metadata flags, so that every claim the Twin makes is grounded in my actual project documentation.
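The thresholding idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the production retrieval path: the `verified` field name, the `0.75` threshold, and the helpers `grounded_chunks` and `answer_or_refuse` are hypothetical, and a vector store such as Qdrant would normally compute the similarity scores itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def grounded_chunks(query_vec, chunks, threshold=0.75):
    """Keep only verified chunks whose similarity meets the threshold.

    Each chunk is a dict with "text", "vector", and a boolean "verified"
    flag (hypothetical field names). Results are ordered best match first.
    """
    kept = []
    for chunk in chunks:
        if not chunk.get("verified", False):
            continue  # unverified notes are never used as evidence
        score = cosine_similarity(query_vec, chunk["vector"])
        if score >= threshold:
            kept.append((score, chunk["text"]))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in kept]


def answer_or_refuse(query_vec, chunks, threshold=0.75):
    """Return grounding context, or None when nothing clears the bar.

    A None result signals the caller to decline instead of letting the
    model answer without supporting documentation.
    """
    context = grounded_chunks(query_vec, chunks, threshold)
    return context if context else None
```

Returning `None` when no chunk qualifies is what turns a similarity threshold into a refusal rule: the model is only asked to answer when there is verified context to answer from.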