I design and build production-grade AI systems across
LLM inference, serving infrastructure, and safe tool execution.
My work spans runtime optimization, auditable AI workflows,
and privacy-preserving ML for text and audio applications.
It sits at the boundary between engineering and applied research,
with a focus on systems that are measurable, reliable,
auditable, and safe by design.
LLM inference and serving systems
Safe and governed tool-using AI runtimes
Benchmarking, runtime validation, and deployment workflows
Privacy-preserving synthetic voice detection
I’m especially interested in AI inference, serving platforms,
runtime enforcement, and privacy-conscious production AI systems.
Now
What I’m actively building and exploring:
PolicyGraph: a reproducible safety runtime for MCP tool execution
Inference systems work around vLLM, TensorRT-LLM, Triton, and Kubernetes
OWASP-aligned evaluation and contract tests for agent safety
Follow-up research on runtime enforcement patterns and privacy-preserving AI
My broader work explores how to make AI systems safer and more enforceable in practice,
through policy-gated execution, runtime validation,
and privacy-preserving system design.
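To make "policy-gated execution" concrete, here is a minimal sketch of the idea: a runtime validates each tool call against a declarative policy before it is allowed to run. All names and the policy shape are hypothetical illustrations, not PolicyGraph's actual API.

```python
# Sketch of policy-gated tool execution (illustrative only; the names
# and policy structure are hypothetical, not PolicyGraph's implementation).
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Tools the agent may invoke, mapped to their permitted argument names.
    allowed_tools: dict[str, set[str]] = field(default_factory=dict)


class PolicyViolation(Exception):
    """Raised when a tool call falls outside the declared policy."""


def gated_call(policy: Policy, tool: str, args: dict, registry: dict):
    """Validate a tool call against the policy, then execute it."""
    if tool not in policy.allowed_tools:
        raise PolicyViolation(f"tool not allowed: {tool}")
    extra = set(args) - policy.allowed_tools[tool]
    if extra:
        raise PolicyViolation(f"unexpected arguments: {sorted(extra)}")
    # Only reached if every check above passed.
    return registry[tool](**args)


# Usage: allow one read-only tool; everything else is blocked.
registry = {"read_file": lambda path: f"<contents of {path}>"}
policy = Policy(allowed_tools={"read_file": {"path"}})

print(gated_call(policy, "read_file", {"path": "notes.txt"}, registry))
try:
    gated_call(policy, "delete_file", {"path": "notes.txt"}, registry)
except PolicyViolation as err:
    print("blocked:", err)
```

The point of the gate is that enforcement happens at the runtime boundary, so every call is checked and every denial is observable, rather than relying on the model to police itself.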
PolicyGraph submitted to TechRxiv
Filed systems patent on privacy-preserving synthetic voice detection
Ongoing follow-up work on runtime enforcement patterns for safe AI systems