LLM Inference Optimization: vLLM, TGI, and Production Serving Strategies in 2026
on Llm, Inference, Vllm, Tgi, Mlops, Production, Ai
on Gitops, Kubernetes, Argocd, Flux, Devops, Ci/cd, Platform engineering
on Ai, Llm, Prompt engineering, Machine learning, Productivity
In 2023, prompt engineering felt like casting spells. “Pretend you are a senior engineer…” “Think step by step…” “You will be tipped $200 for a good answer.” Some of these worked. Most were cargo cult. By 2026, we have actual empirical data on what moves the needle — and the picture is more nuanced than the discourse suggests.
on Devops, Iac, Opentofu, Terraform, Cloud
In August 2023, HashiCorp announced it was changing Terraform’s license from MPL 2.0 (open source) to the Business Source License (BSL). The community reaction was swift and decisive: within weeks, the OpenTF Foundation forked Terraform and launched OpenTofu. By 2026, OpenTofu has not only caught up with Terraform but diverged meaningfully, adding features the HashiCorp roadmap deprioritized. Here’s the state of IaC in 2026.
on Next.js, React, Frontend, Performance, Web development
Next.js 15 has been in production for over a year now, and the dust has settled on the App Router patterns that actually work. What started as a confusing paradigm shift — “where did getServerSideProps go?” — has matured into a powerful model for building fast, scalable web applications. But the learning curve is real, and the pitfalls are subtle. Here’s what I’ve learned from deploying it at scale.