
Kubernetes Probes - A Complete Guide to Container Health Checks

7 min read
Author: James Smith
Senior Kubernetes Engineer
Kubernetes is my thing—I love building scalable systems and fine-tuning container workflows for better performance.

This article was last updated on January 9, 2025, to include advanced techniques for configuring Kubernetes probes, such as optimizing probe timings, debugging probe failures, and best practices for production-ready configurations, along with simplified explanations to enhance clarity.

Introduction

TL;DR

  • What is a Liveness Probe?
    A liveness probe checks that a container is alive and doing its job. When it fails, Kubernetes restarts the container to recover functionality.

  • What is a Readiness Probe?
    A readiness probe determines whether a container is ready to handle traffic. When it fails, the pod is removed from the Service's endpoints until it becomes healthy again.

  • What is a Startup Probe?
    A startup probe buys slow-starting applications time to initialize. When it fails, Kubernetes kills the container and restarts it according to the pod's restart policy.

Kubernetes probes are a kind of check-up for your containers. Liveness probes verify that a container is alive, like checking a pulse; readiness probes check whether it's ready for traffic, like checking whether someone has woken up; and startup probes give slow-starting applications extra time, like waiting for that person to actually get out of bed.

After countless nights spent debugging container issues, I've learned that properly configured probes are what keep applications healthy in Kubernetes. Let me share what I've learned the hard way, so you can avoid my mistakes.

Steps we'll cover:

  • How Kubernetes Container Health Checks Work
  • Configuring Different Types of Kubernetes Probes
  • Example Kubernetes Probe Configuration for Production
  • Common Kubernetes Probe Issues and Solutions
  • Best Practices for Kubernetes Health Checks
  • How to Debug Kubernetes Probe Issues

How Kubernetes Container Health Checks Work


Think of container health like human health. Just as a doctor checks different aspects of your health, Kubernetes relies on different types of probes to make sure your containers are healthy.

I remember a painful incident where our production service kept crashing because we hadn't set up our probes correctly. The containers were technically "alive" but not ready for traffic, like a person who's technically awake but still groggy and shouldn't be driving.

Configuring Different Types of Kubernetes Probes

How to Configure Kubernetes Liveness Probes

This is like checking if someone is breathing. If the liveness probe fails, Kubernetes restarts the container, much as you would call emergency services if someone stopped breathing.

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15  # Wait before first check
  periodSeconds: 10        # Check every 10 seconds

How to Configure Kubernetes Readiness Probes

Think of this as checking whether someone is awake and ready for work. A failing readiness probe means Kubernetes will not send traffic to the container: "go back to bed, you're not ready for the day yet."

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

How to Configure Kubernetes Startup Probes

This is like giving someone extra time to wake up. I use startup probes for applications that are slow to initialize, like Java services that need to warm up the JVM.

startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30  # 30 × 10s = up to 5 minutes to start
  periodSeconds: 10

A Complete Probe Configuration, Field by Field

Not sure which settings to use? Here's a liveness probe with every timing field spelled out, as a starting template:

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 5      # Fail an attempt if no response within 5 seconds
  failureThreshold: 3    # Restart after 3 consecutive failures
  successThreshold: 1    # One success marks the container healthy again
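
A quick sanity check I run on any probe config is the restart math. With the values above, a container that hangs is detected after roughly:

  detection time ≈ failureThreshold × periodSeconds
                 = 3 × 10
                 = 30 seconds (plus up to 5s of timeoutSeconds per attempt)

And nothing is checked at all during the first initialDelaySeconds = 15 seconds, so budget for both windows.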

Example Kubernetes Probe Configuration for Production

Now let me share a config that saved our team many sleepless nights. It's a typical web app with a slow-starting backend:

apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
    - name: backend
      image: my-backend:1.0
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        failureThreshold: 30  # Allow up to 5 minutes for startup
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
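
To try this out, save the manifest (the file name web-app.yaml below is just an assumption) and watch the pod work through its probes:

# Apply the manifest
kubectl apply -f web-app.yaml
# READY stays 0/1 until the startup and readiness probes pass, then flips to 1/1
kubectl get pod web-app -w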

Common Kubernetes Probe Issues and Solutions

Here are some mistakes I've made (so you don't have to):

Overly Aggressive Timing

  • Problem: Timeouts that are too short cause Kubernetes to restart healthy containers
  • Solution: Start with longer timeouts and tighten them based on monitoring

Incorrect Health Check Endpoints

  • Problem: Checking endpoints that don't reflect the application's true health
  • Solution: Implement health check endpoints that verify critical dependencies, as in the sketch below
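
For example, a readiness probe can exercise a real dependency instead of returning a bare 200. Here's a minimal sketch assuming a Postgres dependency, a pg_isready binary inside the image, and a DB_HOST environment variable; adapt it to whatever your service actually depends on:

readinessProbe:
  exec:
    command:
      - sh
      - -c
      - pg_isready -h "$DB_HOST"  # exits non-zero when the database is unreachable
  periodSeconds: 10
  timeoutSeconds: 5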

Missing Startup Probes

  • Problem: Applications fail because they need more time to start
  • Solution: Add startup probes for slow-starting applications

Comparing Kubernetes Probe Types

Probe Type | Purpose                                      | Example Use Case                  | Action on Failure
Liveness   | Checks if the container is alive             | Restarting crashed applications   | Restarts the container
Readiness  | Checks if the container is ready for traffic | Temporarily removing from service | Removes the pod from Service endpoints
Startup    | Gives slow-starting apps enough time         | Long-initialization applications  | Kills the container; the pod's restart policy applies

How to Configure Kubernetes Probes in 3 Steps

  1. Define the probe in your container spec: Add a livenessProbe, readinessProbe, or startupProbe configuration.
  2. Specify the probe type: Choose between HTTP, TCP, or command-based probes based on your application's requirements (examples follow this list).
  3. Set timings cautiously: Use initialDelaySeconds, periodSeconds, and timeoutSeconds conservatively to avoid false positives.
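
Here's what the three mechanisms from step 2 look like side by side. These are sketches with placeholder paths and ports; pick exactly one handler per probe:

# HTTP probe: passes on any 2xx or 3xx response
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080

# TCP probe: passes if the port accepts a connection
livenessProbe:
  tcpSocket:
    port: 5432

# Command probe: passes if the command exits with status 0
livenessProbe:
  exec:
    command:
      - cat
      - /tmp/healthy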

Best Practices for Kubernetes Health Checks

After breaking production multiple times (yes, I admit it), here's what I've learned:

Start Conservative

livenessProbe:
  initialDelaySeconds: 30  # Start with longer delays
  timeoutSeconds: 5        # Keep timeouts reasonable
  periodSeconds: 30        # Don't check too frequently
  # (timing fragment: pair with an httpGet, tcpSocket, or exec handler)

Use Different Endpoints

  • Liveness: Basic "is it running?" check
  • Readiness: Deep health check including dependencies
  • Startup: Same as liveness but a bit more tolerant

Monitor Probe Results

# Check probe status
kubectl describe pod my-pod | grep -A 5 "Liveness"
kubectl describe pod my-pod | grep -A 5 "Readiness"

How to Debug Kubernetes Probe Issues

When things go wrong (and they will), here's how I debug probe issues:

# Get probe failure events
kubectl get events --field-selector reason=Unhealthy
# Check pod status
kubectl describe pod <pod-name>
# View container logs
kubectl logs <pod-name> --previous # See logs from previous container if it crashed
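
One more trick that has saved me more than once: hit the endpoint yourself and look at the raw response, instead of guessing from events. This assumes curl is present in the image; if it isn't, port-forward and test from your machine:

# Probe the endpoint from inside the container (assumes curl in the image)
kubectl exec <pod-name> -- curl -si http://localhost:8080/healthz
# No curl in the image? Forward the port and check locally
kubectl port-forward <pod-name> 8080:8080
curl -si http://localhost:8080/healthz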

Conclusion

Proper probe configuration is like a good health insurance policy: you hope you never need it, but you'll be glad you have it when the worst happens. Configure conservatively, monitor closely, and adjust based on real-world behavior.

Remember: it's better for health checks to take a little longer than for false positives to trigger unnecessary restarts. Trust me, your 3 AM self will thank you for being cautious. Need help monitoring the health of your containers? Check out CICube for advanced Kubernetes insights.