Mastering In-Place Vertical Scaling for Pod-Level Resources in Kubernetes 1.36

<h2>Introduction</h2>

<p>Kubernetes v1.36 brings a powerful new capability: <strong>in-place vertical scaling for pod-level resources</strong> has graduated to Beta, meaning it is enabled by default via the <code>InPlacePodLevelResourcesVerticalScaling</code> feature gate. This feature lets you adjust the aggregate resource budget (<code>.spec.resources</code>) of a running Pod without necessarily restarting its containers. In this how-to guide, you'll learn how to use it to simplify resource management for complex Pods, such as those with sidecars, and to scale their shared pool of CPU and memory on the fly.</p>

<h2>What You Need</h2>

<ul>
<li>A Kubernetes cluster running version 1.36 or later (with the feature gate enabled by default).</li>
<li><code>kubectl</code> installed and configured to access your cluster.</li>
<li>An understanding of basic Kubernetes Pod and resource concepts.</li>
<li>Permission to patch Pods (edit or update resources).</li>
<li>A test environment (e.g., minikube, kind, or a cloud cluster) where you can safely experiment.</li>
</ul>

<h2>Step 1: Define a Pod with Pod-Level Resources and No Container-Level Limits</h2>

<p>To take advantage of in-place vertical scaling at the Pod level, your Pod specification must include a <code>spec.resources</code> block (which defines the aggregate budget) and containers that <strong>do not</strong> set individual resource limits. This way, the containers inherit the pod-level budget.</p>

<p>Create a file named <code>shared-pool-pod.yaml</code> with the following content:</p>

<pre><code>apiVersion: v1
kind: Pod
metadata:
  name: shared-pool-app
spec:
  resources:          # Pod-level limits
    limits:
      cpu: "2"
      memory: "4Gi"
  containers:
  - name: main-app
    image: nginx:latest
    # No container-level limits or requests – they inherit from the pod level
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
  - name: sidecar
    image: busybox:latest
    command: ["sleep", "3600"]
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
</code></pre>

<p><strong>Important:</strong> <code>resizePolicy</code> is set at the container level. As of v1.36, a pod-level <code>resizePolicy</code> is not supported, so the Kubelet evaluates each container separately. Use <code>NotRequired</code> to avoid container restarts during non-disruptive updates.</p>

<p>Apply the Pod:</p>

<pre><code>kubectl apply -f shared-pool-pod.yaml</code></pre>

<h2>Step 2: Perform an In-Place Resize via the Resize Subresource</h2>

<p>Now that your Pod is running, you can grow the shared CPU pool from 2 to 4 CPUs without restarting the containers. Use the <code>resize</code> subresource to send a patch:</p>

<pre><code>kubectl patch pod shared-pool-app --subresource resize --patch \
  '{"spec":{"resources":{"limits":{"cpu":"4"}}}}'</code></pre>

<p>The patch only updates the pod-level <code>limits.cpu</code>; memory can be updated the same way, as shown below. The Kubelet will receive the change and attempt to apply it to the cgroups of each container that inherits from the pod-level budget.</p>
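<p>For example, a patch of the same shape should bump the shared memory pool as well – a minimal sketch, assuming memory resizes go through the same subresource (the <code>6Gi</code> value is purely illustrative):</p>

<pre><code># Illustrative memory resize via the same subresource
kubectl patch pod shared-pool-app --subresource resize --patch \
  '{"spec":{"resources":{"limits":{"memory":"6Gi"}}}}'</code></pre>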
<h2>Step 3: Verify the Resize</h2>

<p>Check the updated resource status of the Pod:</p>

<pre><code>kubectl describe pod shared-pool-app</code></pre>

<p>Look for the pod-level <code>Resources</code> section – it should now show <code>cpu: 4</code>. You can also inspect a container's cgroup inside the Pod to confirm the change (on a node using cgroup v2):</p>

<pre><code>kubectl exec shared-pool-app -c main-app -- cat /sys/fs/cgroup/cpu.max</code></pre>

<p>If the resize succeeded without a restart, the container's CPU limit reflects the new value immediately.</p>

<h2>Step 4: Understand When a Restart Is Required</h2>

<p>Not every resource type or scenario allows a non-disruptive change. Each container's <code>resizePolicy</code> determines the behavior:</p>

<ul>
<li><strong>Non-disruptive (<code>NotRequired</code>):</strong> The Kubelet updates cgroup limits dynamically via the Container Runtime Interface (CRI). This works for CPU and memory, but may not be supported for other resources such as hugepages or ephemeral storage.</li>
<li><strong>Disruptive (<code>RestartContainer</code>):</strong> If you set <code>restartPolicy: RestartContainer</code> for a specific resource, the Kubelet restarts that container to apply the new pod-level budget safely. Use this when the runtime doesn't support dynamic updates or when you want a clean slate.</li>
</ul>

<p><strong>Note:</strong> When you change the pod-level resources, every container that inherits from that budget sees a resize event. The Kubelet consults each container's <code>resizePolicy</code> individually; if a container requires a restart, it is restarted independently of the others.</p>
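<p>For instance, if your runtime can't adjust the sidecar's memory limit on the fly, you could pin that one resource to a restart. A sketch, reusing the container-level fields from Step 1:</p>

<pre><code>  - name: sidecar
    image: busybox:latest
    command: ["sleep", "3600"]
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired      # CPU changes are applied in place
    - resourceName: memory
      restartPolicy: RestartContainer # memory changes restart this container only
</code></pre>

<p>With this policy, a pod-level CPU resize leaves the sidecar running, while a memory resize restarts only the sidecar – <code>main-app</code> keeps its <code>NotRequired</code> behavior.</p>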
<h2>Step 5: Safely Reduce Resources (Quick Guide)</h2>

<p>You can also shrink the pod-level resource budget. Be careful, though: reducing CPU or memory below the current usage of any container may cause throttling or OOM kills. The Kubelet will apply the reduction, so monitor usage before shrinking.</p>

<p>Example patch to reduce CPU to 1.5 cores:</p>

<pre><code>kubectl patch pod shared-pool-app --subresource resize --patch \
  '{"spec":{"resources":{"limits":{"cpu":"1500m"}}}}'</code></pre>

<p>Always test reductions in a non-production environment first.</p>

<h2>Tips and Best Practices</h2>

<ul>
<li><strong>Start with <code>NotRequired</code> for known supported resources.</strong> This avoids unnecessary restarts and keeps your services running during scaling events.</li>
<li><strong>Use pod-level resources when containers share a common resource pool.</strong> This is ideal for sidecar patterns where you don't want to calculate per-container limits manually.</li>
<li><strong>Monitor container-level usage with <code>kubectl top pod</code> or an observability platform</strong> to make informed scaling decisions. Pod-level limits are an aggregate: a single hungry container can consume most of the shared budget and starve its neighbors.</li>
<li><strong>Test scaling operations against a stress container</strong> that generates CPU load, to verify that in-place changes take effect without disruption (see the sketch after this section).</li>
<li><strong>Remember that <code>resizePolicy</code> is per container, not per Pod.</strong> If you want to enforce a restart for all containers, set <code>RestartContainer</code> in each container's policy.</li>
<li><strong>Keep an eye on the Kubelet logs</strong> for messages about resize attempts, especially if a change fails to apply in place and falls back to a restart.</li>
<li><strong>Combine this with the Horizontal Pod Autoscaler (HPA)</strong> – you can trigger vertical adjustments based on metrics while HPA handles replica-count changes.</li>
</ul>

<p>With the steps above, you can confidently use in-place vertical scaling for pod-level resources in Kubernetes 1.36. The feature simplifies operations for multi-container Pods, reduces downtime, and gives you finer control over how co-located containers share a resource budget.</p>
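<p>To close, here's the stress test referenced in the tips above – a minimal sketch in which the busybox busy-loop, the <code>kubectl top</code> check, and the <code>cpu: "3"</code> target are illustrative choices, not part of the feature itself:</p>

<pre><code># Generate CPU load in the sidecar (illustrative busy-loop)
kubectl exec shared-pool-app -c sidecar -- sh -c 'while true; do :; done' &

# In another terminal, watch per-container usage while you resize
kubectl top pod shared-pool-app --containers

# Resize the pod-level CPU limit, then confirm no containers restarted
kubectl patch pod shared-pool-app --subresource resize --patch \
  '{"spec":{"resources":{"limits":{"cpu":"3"}}}}'
kubectl get pod shared-pool-app -o jsonpath='{.status.containerStatuses[*].restartCount}'</code></pre>

<p>If the restart counts stay at zero while the cgroup limit changes, the resize was applied in place.</p>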
