Hi, I’m Qi Jia, a senior software engineer focused on AI infrastructure, model deployment, and platform engineering.

I currently work at ShengShu, where I am a core engineer building the company’s inference platform for AI model integration, deployment, and serving.

Before ShengShu, I was a Software Technical Lead at Great Wall Motor. I led engineering teams building an all-in-one AI R&D platform and a data-driven AI pipeline for autonomous driving and intelligent cockpit systems. The work covered data management, model training, model evaluation, inference deployment, vehicle-side data collection, annotation, and model distribution.

Earlier in my career, I worked at Momenta, was a founding engineer at Beijing DanDiao Technology, and started as a software engineer at Beijing LingMiao Technology. I hold a bachelor’s degree in Computer Software Engineering from Inner Mongolia University, along with a Stanford Online Professional Certificate in Artificial Intelligence and the Linux Foundation’s Certified Kubernetes Administrator (CKA) certification.

On this blog, I write about the systems and engineering practices behind large-scale AI applications: inference platforms, cloud-native infrastructure, Kubernetes, RAG, vector databases, generative models, and the bridge between research workflows and production deployment.

On GitHub, I use the handle kuafou, where my interests center on vLLM, vLLM-Omni, SGLang, FastVideo, and PyTorch.

You can also find me on LinkedIn and GitHub.