About

Hi, I’m Qi Jia, a senior software engineer focused on AI infrastructure and diffusion models.

I currently work at ShengShu, where I build the inference platform for AI model integration, deployment, and serving.

Before ShengShu, I worked at Great Wall Motor as an engineering manager focused on AI platform development. I led engineering teams building an all-in-one AI R&D platform and a data-driven AI pipeline for intelligent cockpit systems. The work covered data management, model training, model evaluation, inference deployment, vehicle-side data collection, annotation, and model distribution.

Earlier in my career, I worked at Momenta, helping to build the first-generation Closed Loop Automation Platform.

On this blog, I write about the systems and engineering practices behind large-scale AI applications: inference platforms, cloud-native infrastructure, Kubernetes, RAG, vector databases, generative models, and the bridge between research workflows and production deployment.

On GitHub, I use the handle kuafou, and I am a contributor to vLLM-Omni, SGLang, FastVideo, PyTorch, and others.

You can also find me on LinkedIn and GitHub.