Welcome to My Blog!

Hello!

Thanks for stopping by. I'm an AI developer focused on building reliable, low-latency systems—think real-time speech-to-text, face recognition, vector search, and scalable LLM services. I care about practical engineering: measurable performance, clear architectures, and code that holds up in production.

What you'll find here

Hands-on guides for deploying and operating AI workloads (Ray/vLLM, Triton, FastAPI, Next.js, etc.).
Fine-tuning & training: LoRA/QLoRA/PEFT, SFT/DPO/ORPO, dataset curation & cleaning, mixed precision, FSDP/DeepSpeed, and experiment tracking (W&B/MLflow).
Computer vision: detection/segmentation (YOLO/RT-DETR), OCR, face recognition with embeddings and re-identification, plus production-grade augmentation pipelines.
LLMs & RAG/agents: vector stores (FAISS/OpenSearch/Supabase), retrieval pipelines, evaluation (e.g., RAGAS), prompt engineering, and lightweight agent patterns.
Benchmarks and notes on latency, throughput, memory, and cost.
System design write-ups: data pipelines, streaming, observability, and scaling patterns.
Code snippets & checklists you can drop into your own projects.

My approach

Build first, optimize second. Ship something that works, then profile and iterate.
Measure everything. If we can't measure it, we can't improve it.
Keep it simple. Prefer boring, proven tools over flashy complexity.

Let's connect

I'd love to hear what you're building and the challenges you're facing.

Leave a comment below to start a thread with the community.
Or reach me directly at mail@201lab.top.

If a post helps you, consider sharing it. Your feedback and questions often shape what I write next. Welcome aboard, and enjoy the read!