7 Mind-Blowing Kubernetes Hacks

DavidW (skyDragon)
Published in overcast blog
Apr 7, 2024 · 13 min read


Kubernetes harbors capabilities that even seasoned developers might not be fully aware of. These hacks delve into the more esoteric, yet incredibly potent tricks that can significantly empower those who master them. These are not your everyday tips but profound insights into making Kubernetes do amazing things.

1. Ephemeral Containers for Real-Time Troubleshooting

Ephemeral containers in Kubernetes allow for the dynamic insertion of a troubleshooting container into a running pod, providing a temporary environment for real-time debugging without altering the pod’s original setup. This functionality is invaluable for diagnosing issues in a live environment where traditional debugging methods might disrupt service.

What is an Ephemeral Container?

An ephemeral container is a type of container that you can add to a running pod temporarily. It’s designed to execute debugging tools that aren’t included in the pod’s main containers, offering a powerful way to investigate and troubleshoot issues in real-time. Once you’re done with an ephemeral container, it can be removed, leaving the original pod unaffected.

How to Use Ephemeral Containers

To use an ephemeral container, you employ the kubectl debug command, which allows you to inject a debug container into an existing pod. For example, to start a debugging session with an Ubuntu container in a pod named myapp-pod, you would use:

kubectl debug myapp-pod -it --image=ubuntu --target=myapp-container

In this command, --target specifies the container within the pod where you want to run debugging commands. This opens an interactive shell within the Ubuntu container, where you can install and use debugging tools as needed.
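
If the target container's image has no shell or package manager (for example, a distroless image), or you'd rather not touch the live pod at all, kubectl debug can also operate on a copy of the pod. A minimal sketch, where the copy name myapp-debug and the busybox image are illustrative:

kubectl debug myapp-pod -it --image=busybox --copy-to=myapp-debug --share-processes

This creates a new pod named myapp-debug with the debug container added and process namespace sharing enabled, so you can inspect the application's processes from the debug shell while the original pod keeps serving traffic.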

When to Use Ephemeral Containers

Use ephemeral containers when you need to troubleshoot or debug live applications within your Kubernetes cluster without affecting their operation. They are particularly useful for:

  • Inspecting running processes in a container.
  • Examining file systems or configurations within a pod.
  • Capturing network traffic between containers.

Best Practices for Ephemeral Containers

  • Limit Access: Given their powerful nature, ensure that only authorized personnel can inject ephemeral containers into pods (see the RBAC sketch after this list).
  • Audit Usage: Keep logs of when and why ephemeral containers are used to ensure accountability and for future reference in troubleshooting.
  • Minimal Impact: Use ephemeral containers sparingly and aim to keep their lifetime short to minimize impact on the running applications.
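
One way to act on the "limit access" point is RBAC: kubectl debug adds ephemeral containers by patching the pods/ephemeralcontainers subresource, so you can grant that ability only to a dedicated role. A minimal sketch, assuming a role name of ephemeral-debugger and the default namespace (exact verbs may vary slightly by cluster version):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ephemeral-debugger   # illustrative name
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["patch", "update"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create"]   # required for the interactive (-it) session

Bind this role only to the engineers who are expected to run live debugging sessions.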


2. Dynamic Admission Control for Customized Governance

Dynamic Admission Control in Kubernetes enables the enforcement of complex governance rules and policies on Kubernetes objects in real-time. Through MutatingAdmissionWebhook and ValidatingAdmissionWebhook, administrators can modify incoming requests to the Kubernetes API server or reject them altogether, ensuring adherence to organizational policies and enhancing the cluster’s security and integrity.

What is Dynamic Admission Control?

Dynamic Admission Control refers to a Kubernetes feature that allows administrators to intercept, inspect, and modify requests to the Kubernetes API server before the object creation, modification, or deletion is finalized. This is achieved using admission controllers, which are plugins that act as gatekeepers to enforce governance rules.

How to Use Dynamic Admission Control

To utilize Dynamic Admission Control, you define admission webhook servers that Kubernetes should call into before completing certain API requests. For example, to ensure that all Pods have resource limits specified, you can set up a ValidatingAdmissionWebhook.

Create a Webhook Server: First, you’ll need a webhook server that can validate or mutate Kubernetes objects. Here’s a simplified example of what the server might expect for a Pod resource limit check:

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/validate-pods', methods=['POST'])
def validate():
    review = request.get_json()
    uid = review["request"]["uid"]

    def admission_response(allowed, message=None):
        # v1 AdmissionReview responses must echo the request UID.
        response = {"uid": uid, "allowed": allowed}
        if message:
            response["status"] = {"message": message}
        return jsonify({
            "apiVersion": "admission.k8s.io/v1",
            "kind": "AdmissionReview",
            "response": response
        })

    try:
        # Example validation: ensure every container sets a CPU limit.
        containers = review["request"]["object"]["spec"]["containers"]
        for container in containers:
            limits = container.get("resources", {}).get("limits", {})
            if "cpu" not in limits:
                return admission_response(False, "Missing CPU resource limits")
        return admission_response(True)
    except KeyError as e:
        return admission_response(False, f"Error processing request: {e}")

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=443, ssl_context=('cert.pem', 'key.pem'))
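
Kubernetes reaches this server through a Service, so the webhook also needs a Deployment running the code above plus a Service exposing port 443. A minimal Service sketch matching the names used in the registration below (the app label is illustrative):

apiVersion: v1
kind: Service
metadata:
  name: webhook-service
  namespace: default
spec:
  selector:
    app: pod-validator   # hypothetical label on the webhook server's pods
  ports:
  - port: 443
    targetPort: 443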

Register the Webhook with Kubernetes: Create a ValidatingWebhookConfiguration that points to your webhook server:

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: "pod-validator-webhook"
webhooks:
- name: "validator.example.com"
  rules:
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
  clientConfig:
    service:
      name: "webhook-service"
      namespace: "default"
      path: "/validate-pods"
    caBundle: "<CA_BUNDLE>"
  admissionReviewVersions: ["v1"]
  sideEffects: None
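
The <CA_BUNDLE> placeholder must be the base64-encoded CA certificate that signed the webhook server's TLS certificate (the cert.pem/key.pem pair above), otherwise the API server will refuse to call the webhook. Assuming that CA lives in a file named ca.crt, one way to produce the value:

cat ca.crt | base64 | tr -d '\n'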

When to Use Dynamic Admission Control

Dynamic Admission Control is most effective in scenarios where you need to enforce specific organizational policies, such as:

  • Ensuring all containers have resource limits to avoid resource hogging.
  • Enforcing naming conventions for resources for better organization and management.
  • Preventing the use of certain container images from untrusted registries.

Best Practices for Dynamic Admission Control

  • Use for Policy Enforcement: Focus on governance policies that need to be enforced universally across your cluster.
  • Test Thoroughly: Thoroughly test your webhooks in a development environment to ensure they don’t inadvertently block legitimate requests.
  • Monitor and Audit: Keep detailed logs of admission control decisions to audit behavior and troubleshoot rejected requests.


3. Kustomize for Advanced Configuration Management

Kustomize introduces a streamlined, template-free approach to managing Kubernetes application configurations, facilitating more sophisticated deployment strategies directly from kubectl. It stands out by enabling the creation of customized overlays that adjust base configurations for different environments, such as development, staging, and production, without duplication.

What is Kustomize?

Kustomize is a tool built into kubectl for customizing Kubernetes resource configurations. It lets you define a base configuration and create variations called overlays for different environments. This approach keeps configurations DRY (Don't Repeat Yourself) and simplifies the management of complex Kubernetes applications.

How to Use Kustomize

Kustomize structures configurations into bases and overlays. A base contains the common resource definitions, while overlays modify these resources for specific purposes.

  1. Create a Base Configuration: Start by creating a directory for your base with a kustomization.yaml file that references your Kubernetes resources:
# kustomization.yaml in the base directory
resources:
- deployment.yaml
- service.yaml

Here, deployment.yaml and service.yaml are your standard Kubernetes resource files.

Create an Overlay: For each environment, create an overlay that specifies customizations. For example, to create a development overlay:

# kustomization.yaml in the development overlay directory
bases:
- ../../base
patchesStrategicMerge:
- deployment_patch.yaml

deployment_patch.yaml contains changes specific to the development environment, such as a different number of replicas:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2

Apply the Configuration: Use kubectl to apply the overlay configuration to your cluster:

kubectl apply -k overlays/development/

This command merges the base with the development overlay and applies the result to your cluster.
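
Putting the pieces together, the commands above assume a directory layout roughly like this (names are illustrative apart from the paths already used):

my-app/
├── base/
│   ├── kustomization.yaml
│   ├── deployment.yaml
│   └── service.yaml
└── overlays/
    └── development/
        ├── kustomization.yaml
        └── deployment_patch.yaml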

When to Use Kustomize

Use Kustomize when you need to manage complex configurations across multiple environments without duplicating resource definitions. It’s ideal for:

  • Deploying the same application with slight variations across environments.
  • Managing feature flags or environment-specific parameters.
  • Implementing a GitOps workflow where configuration changes are versioned and reviewed.

Best Practices for Kustomize

  • Organize Resource Definitions: Keep your base configurations organized and minimal. Use overlays for environment-specific customizations.
  • Version Everything: Store your Kustomize configurations in version control to track changes and roll back when necessary.
  • Review Changes Carefully: Before applying changes, use kubectl kustomize <directory> to generate the merged configurations and review them for accuracy.
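
For example, to preview the fully merged development manifests before applying them:

kubectl kustomize overlays/development/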


4. User-Defined Aggregate Metrics for Custom Monitoring

Beyond the standard monitoring capabilities for CPU and memory utilization, Kubernetes supports the collection of custom and aggregate metrics. This advanced monitoring feature allows you to define and collect metrics that are specifically relevant to your application’s performance and health, enabling you to tailor auto-scaling and monitoring strategies to your operational requirements.

What are User-Defined Aggregate Metrics?

User-defined aggregate metrics in Kubernetes are custom metrics that you define to monitor the performance of your applications beyond the default system metrics. These can include anything from the number of active users, transaction throughput, to error rates or custom application performance indicators. Aggregate metrics combine data from multiple sources or instances, providing a holistic view of system behavior.

How to Use User-Defined Aggregate Metrics

To collect and use user-defined aggregate metrics in Kubernetes, you typically integrate with a monitoring solution that supports custom metrics, such as Prometheus, and then configure Kubernetes to use these metrics for horizontal pod auto-scaling (HPA) or for monitoring and alerting purposes.

  1. Collect Metrics with Prometheus: First, ensure that Prometheus is collecting the custom metrics from your application. This usually involves instrumenting your application to expose a metrics endpoint that Prometheus can scrape.
from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8000)
    # Generate some requests.
    while True:
        process_request(random.random())
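
How Prometheus discovers this endpoint depends on your setup. A widespread (but not built-in) convention, assuming your Prometheus uses the usual annotation-based relabeling, is to annotate the pod template of the Deployment running this code:

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"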

Configure HPA to Use Custom Metrics: With your metrics being collected by Prometheus, you can configure the Kubernetes HPA to scale your deployment based on these metrics. This involves deploying the Prometheus Adapter for Kubernetes Metrics APIs, which allows HPA to query Prometheus for your custom metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: request_processing_seconds
      target:
        type: AverageValue
        averageValue: 500m   # 0.5 seconds; Kubernetes quantities use the "m" (milli) suffix
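
The HPA can only see request_processing_seconds if the Prometheus Adapter exposes it through the custom metrics API. A rough sketch of an adapter rule for this metric follows; field names come from the adapter's rule configuration and may need adjusting for your adapter version, and note that the Summary above actually produces _sum and _count series, which the rule turns into an average latency:

rules:
- seriesQuery: 'request_processing_seconds_sum{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "request_processing_seconds_sum"
    as: "request_processing_seconds"
  # Average request latency per pod over the last 2 minutes.
  metricsQuery: 'rate(request_processing_seconds_sum{<<.LabelMatchers>>}[2m]) / rate(request_processing_seconds_count{<<.LabelMatchers>>}[2m])'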

When to Use User-Defined Aggregate Metrics

Consider using user-defined aggregate metrics in scenarios where:

  • Standard metrics do not provide sufficient insight into application performance.
  • You need to monitor business-critical operations that reflect the application’s success.
  • Custom scaling behavior is required that standard metrics cannot provide.

Best Practices for User-Defined Aggregate Metrics

  • Clearly Define Metrics: Ensure that your custom metrics are well-defined, meaningful, and directly related to operational objectives or performance indicators.
  • Avoid Metric Overload: Collect only the metrics you need to avoid overwhelming your monitoring system and to keep the focus on key performance indicators.
  • Regularly Review and Adjust: As your application and operational requirements evolve, review your custom metrics to ensure they remain relevant and adjust your monitoring and scaling configurations accordingly.


5. API Priority and Fairness for Request Management

The API Priority and Fairness (APF) feature in Kubernetes ensures that requests to the Kubernetes API server are handled fairly and efficiently, preventing important requests from being starved by less critical ones. By prioritizing and isolating requests, Kubernetes can maintain cluster stability and responsiveness, even under heavy load.

What is API Priority and Fairness?

API Priority and Fairness is a Kubernetes feature that categorizes incoming API requests into different priority levels. Each request is then queued and processed based on its priority, ensuring that critical operations are not delayed by bulk, less urgent requests. This feature is crucial for clusters under high load, preventing API server overload and ensuring that critical cluster operations can proceed without interruption.

How to Use API Priority and Fairness

To utilize the API Priority and Fairness feature, you must define PriorityLevelConfiguration and FlowSchema objects. These objects determine how requests are categorized and the order in which they are served.

  1. Define Priority Levels: PriorityLevelConfiguration objects allow you to define different priority levels for request processing. For example, creating a high-priority level might look like this:
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: PriorityLevelConfiguration
metadata:
  name: high-priority
spec:
  type: Limited
  limited:
    assuredConcurrencyShares: 100
    limitResponse:
      type: Queue
      queuing:
        queues: 50
        handSize: 5
        queueLengthLimit: 10

Categorize Requests with Flow Schemas: FlowSchema objects define how requests are matched to priority levels. You can create a FlowSchema to route requests from specific users or service accounts to the high-priority level:

apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: FlowSchema
metadata:
  name: system-operations-high-priority
spec:
  priorityLevelConfiguration:
    name: high-priority
  matchingPrecedence: 500
  rules:
  - subjects:
    - kind: Group
      group:
        name: "system:masters"
    resourceRules:
    - verbs: ["*"]
      apiGroups: ["*"]
      resources: ["*"]
      clusterScope: true
      namespaces: ["*"]

When to Use API Priority and Fairness

The APF feature is especially useful in scenarios where your Kubernetes API server faces high demand. This includes:

  • Large clusters with extensive automation and numerous concurrent operations.
  • Environments where critical operations must be guaranteed processing resources, regardless of overall load.
  • Clusters experiencing performance issues due to non-essential operations consuming excessive API server resources.

Best Practices for API Priority and Fairness

  • Monitor API Server Performance: Regularly monitor your API server’s performance to understand how request prioritization impacts overall cluster operations.
  • Use with Caution: Prioritizing certain requests over others can have significant impacts on cluster behavior. Ensure that priority levels and flow schemas are configured to reflect the true importance of different operations.
  • Review and Adjust Configurations Regularly: As your cluster’s workload changes, review your APF configurations to ensure they continue to meet your operational needs effectively.


6. Transparent Multi-cluster Services with Submariner

Submariner is a tool that bridges the networking gap between separate Kubernetes clusters, allowing for secure and efficient cross-cluster communication. By establishing a seamless overlay network, Submariner enables pods and services in different clusters to interact as if they were within a single cluster, simplifying the architecture for multi-cluster deployments.

What is Submariner?

Submariner is an open-source project that connects multiple Kubernetes clusters, enabling pods in separate clusters to communicate with each other directly. It creates an encrypted tunnel between clusters, leveraging the underlying network infrastructure while maintaining security and isolation.

How to Use Submariner

To deploy Submariner, you need two or more Kubernetes clusters. Here’s a simplified overview of setting up Submariner:

Deploy the Broker: The subctl CLI (which installs and manages the Submariner Operator on the clusters it touches) first needs a broker, which coordinates the exchange of connection information between clusters. Deploy it on one of your clusters:

subctl deploy-broker --kubeconfig <path-to-broker-kubeconfig>

This generates a broker-info.subm file used in the next step.

Join Clusters: With the broker deployed, use subctl to join each cluster to the mesh network:

subctl join --kubeconfig <path-to-kubeconfig> broker-info.subm --clusterid <unique-cluster-id>

This command configures the cluster to participate in the Submariner overlay network, connecting it to other clusters.
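
Once a cluster has joined, subctl can confirm that the tunnels are up and, if you want a service reachable from other clusters, export it (the service and namespace names here are illustrative):

subctl show connections --kubeconfig <path-to-kubeconfig>
subctl export service my-service --namespace my-namespace --kubeconfig <path-to-kubeconfig>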

When to Use Submariner

Submariner is particularly useful in scenarios such as:

  • Multi-Cloud Deployments: When your Kubernetes clusters are spread across different cloud providers.
  • Hybrid Cloud and On-Premises Environments: Connecting clusters across on-premises datacenters and cloud environments.
  • Disaster Recovery: Facilitating synchronization and backup across clusters located in different geographical regions for high availability.

Best Practices for Submariner

  • Network Overlap Considerations: Ensure there is no overlap in Pod and Service CIDRs between the clusters to avoid networking conflicts.
  • Secure Configuration: Use Submariner’s built-in encryption to secure cross-cluster traffic, especially when crossing the public internet.
  • Monitor Inter-cluster Traffic: Implement monitoring and logging to track the flow of traffic between clusters for troubleshooting and performance analysis.


7. Advanced Scheduling with Pod Topology Spread Constraints

Pod Topology Spread Constraints enhance Kubernetes’ scheduling capabilities by allowing you to distribute pods evenly across different topology domains, such as nodes, zones, and regions, improving your cluster’s resilience and performance.

What are Pod Topology Spread Constraints?

Pod Topology Spread Constraints are scheduling rules that you can apply to pods to control how they are spread across your cluster’s topology. They offer a way to achieve high availability and efficient resource utilization by evenly distributing pods based on specified criteria.

How to Use Pod Topology Spread Constraints

To use Pod Topology Spread Constraints, you add them to your pod specifications. Here’s an example of spreading pods across different availability zones:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 6
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: nginx
        image: nginx
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "topology.kubernetes.io/zone"
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: myapp

In this example, maxSkew defines the maximum difference in pod numbers between any two zones. topologyKey specifies that the spreading should consider the cluster's zones, and whenUnsatisfiable: ScheduleAnyway means that pods should still be scheduled even if the constraint can’t be perfectly satisfied.
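
If an even spread is a hard requirement rather than a preference, switch to whenUnsatisfiable: DoNotSchedule. Constraints can also be stacked, for example a strict spread across zones plus a softer spread across nodes. A sketch of the same pod spec fragment with both:

topologySpreadConstraints:
- maxSkew: 1
  topologyKey: "topology.kubernetes.io/zone"
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: myapp
- maxSkew: 2
  topologyKey: "kubernetes.io/hostname"
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: myapp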

When to Use Pod Topology Spread Constraints

Consider using Pod Topology Spread Constraints in scenarios such as:

  • High Availability Deployments: To ensure your application remains available even if an entire zone or node fails.
  • Load Balancing Across Zones: For applications that benefit from being close to users or resources spread across different geographical locations.
  • Efficient Resource Utilization: To prevent resource hotspots by distributing workloads more evenly across the cluster.

Best Practices for Pod Topology Spread Constraints

  • Balanced Cluster State: Ensure your nodes are labeled accurately according to your topology spread constraints to avoid scheduling imbalances.
  • Compatibility with Other Scheduling Policies: Consider how topology spread constraints interact with other scheduling policies like taints and affinities to avoid unintended scheduling behavior.
  • Use with Node Affinity: Combine topology spread constraints with node affinity to finely control pod placement according to both topology and node attributes.

