Challenge Structure
Understanding the anatomy of a Kubeasy challenge and how all components work together.
Every Kubeasy challenge follows a consistent structure. This page walks through each component in turn.
Directory Structure
Each challenge is a folder in the challenges repository:
```
challenges/
└── pod-evicted/
    ├── challenge.yaml    # Metadata, description, AND validations
    ├── manifests/        # Initial broken state
    │   ├── deployment.yaml
    │   ├── service.yaml
    │   └── ...
    ├── policies/         # Kyverno policies (prevent bypasses)
    │   └── protect.yaml
    └── image/            # Optional: custom Docker images
        └── Dockerfile
```
Components
1. challenge.yaml
The challenge.yaml file contains everything about the challenge: metadata, description, and validations.
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
estimated_time: 15
initial_situation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
  Understand why Kubernetes is killing the application.
validations:
  - key: pod-running
    title: "Pod Ready"
    description: "The pod must be running and healthy"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      conditions:
        - type: Ready
          status: "True"
```
Metadata Fields
| Field | Required | Description |
|---|---|---|
| `title` | Yes | Challenge name (shown in UI) |
| `description` | Yes | Brief description of symptoms (NOT the cause!) |
| `theme` | Yes | Category for grouping |
| `difficulty` | Yes | `easy`, `medium`, or `hard` |
| `estimated_time` | Yes | Minutes to complete |
| `initial_situation` | Yes | What the user will find |
| `objective` | Yes | What needs to be achieved |
| `validations` | Yes | Success criteria (see below) |
Writing Good Descriptions
Describe symptoms, not causes:
```yaml
# BAD - Reveals the problem
description: |
  The ConfigMap has invalid JSON syntax.
  Fix the JSON formatting error.
```

```yaml
# GOOD - Maintains mystery
description: |
  A microservice keeps crashing shortly after deployment.
  The team swears the code hasn't changed.
```

State goals, not methods:
```yaml
# BAD - Tells user what to do
objective: |
  Increase the memory limit to 256Mi.
```

```yaml
# GOOD - States the outcome
objective: |
  Make the pod run stably without being evicted.
```

2. manifests/ Directory
Contains the initial broken state - Kubernetes resources that learners need to fix.
```yaml
# manifests/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-processor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: data-processor
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
        - name: processor
          image: kubeasy/data-processor:v1
          resources:
            limits:
              memory: "32Mi"  # BUG: Too low!
              cpu: "100m"
```

Design principles:
- Keep it minimal - Only include what's needed
- Make it realistic - Mirror production problems
- One problem at a time - Don't combine unrelated issues
- Clear naming - Be descriptive (see the Service sketch below)
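To see these principles in a second file, here is what the `service.yaml` from the directory tree above might look like. This is a minimal sketch, not taken from the actual challenge; the port numbers are assumptions.

```yaml
# manifests/service.yaml -- hypothetical companion to the Deployment above
apiVersion: v1
kind: Service
metadata:
  name: data-processor        # clear, descriptive naming
spec:
  selector:
    app: data-processor       # matches the Deployment's pod labels
  ports:
    - port: 80                # assumed port
      targetPort: 8080        # assumed container port
```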
3. policies/ Directory
Kyverno policies that prevent bypasses, so users can't cheat around the problem instead of solving the challenge properly.
```yaml
# policies/protect.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-data-processor
spec:
  validationFailureAction: Enforce
  rules:
    - name: preserve-image
      match:
        resources:
          kinds: ["Deployment"]
          names: ["data-processor"]
          namespaces: ["challenge-*"]
      validate:
        message: "Cannot change the application image"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - name: processor
                    image: "kubeasy/data-processor:v1"
```

What to protect:
- Container images (prevent replacing the app)
- Critical volume mounts
- Essential labels (ensure validations can find resources; see the sketch below)
What NOT to protect:
- Resource limits/requests (user should be able to change these)
- Environment variables
- Probe configurations
- Non-essential labels and annotations (the essential ones belong in the list above)
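For example, pinning the essential `app` label (so validations can always find the pod) might look like the rule below. This is a sketch modeled on the `preserve-image` rule above, not part of the actual challenge policy.

```yaml
# Hypothetical additional rule for policies/protect.yaml
- name: preserve-app-label
  match:
    resources:
      kinds: ["Deployment"]
      names: ["data-processor"]
      namespaces: ["challenge-*"]
  validate:
    message: "Cannot remove the app label used by validations"
    pattern:
      spec:
        template:
          metadata:
            labels:
              app: data-processor
```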
4. Validations
Defined in challenge.yaml, validations determine when the challenge is solved.
```yaml
validations:
  - key: unique-identifier
    title: "User-Friendly Title"
    description: "What this checks"
    order: 1
    type: status|log|event|metrics|rbac|connectivity
    spec:
      # Type-specific configuration
```

Available types:
| Type | Purpose |
|---|---|
| `status` | Check resource conditions (Ready, Available) |
| `log` | Find strings in container logs |
| `event` | Detect forbidden K8s events (OOMKilled, Evicted) |
| `metrics` | Check pod metrics (restart count) |
| `rbac` | Test ServiceAccount permissions |
| `connectivity` | HTTP connectivity tests |
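For example, a log validation might look like the sketch below. The layout mirrors the status example above, but the type-specific fields (such as `contains`) are assumptions; treat the Validation Rules page as authoritative.

```yaml
# Hypothetical log validation -- type-specific fields are assumptions
- key: startup-logged
  title: "Application Logs"
  description: "The application must log a successful start"
  order: 2
  type: log
  spec:
    target:
      kind: Pod
      labelSelector:
        app: data-processor
    contains: "processing started"   # assumed field name
```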
See Validation Rules for detailed examples.
5. image/ Directory (Optional)
If your challenge needs a custom application, you can create your own Docker image. Add an `image/` directory with a Dockerfile to your challenge folder:
```
my-challenge/
├── challenge.yaml
├── manifests/
├── policies/
└── image/
    ├── Dockerfile
    └── app.py          # Your application files
```

Example Dockerfile:
```dockerfile
# image/Dockerfile
FROM python:3.11-slim
COPY app.py /app/
CMD ["python", "/app/app.py"]
```

Automatic Build & Publish
When you merge your challenge to `main`, the CI automatically:
- Detects challenges with an `image/` directory
- Builds the Docker image (multi-arch: `linux/amd64` and `linux/arm64`)
- Pushes to GitHub Container Registry: `ghcr.io/kubeasy-dev/<challenge-name>:latest`
Using your custom image in manifests:
```yaml
# manifests/deployment.yaml
spec:
  containers:
    - name: app
      image: ghcr.io/kubeasy-dev/my-challenge:latest
```

When to Use Custom Images
- Your challenge needs specific application behavior (memory leaks, slow responses, etc.)
- You need to simulate a realistic application workload
- Standard images (nginx, python, busybox) don't fit your scenario
Best Practices
- Keep images small (use slim/alpine base images)
- Don't include secrets or sensitive data
- Make the application behavior predictable and reproducible
- Add a `.dockerignore` file to exclude unnecessary files
Themes
Challenges are grouped by theme:
| Theme | Description |
|---|---|
| `rbac-security` | Permissions, roles, security contexts |
| `networking` | Services, ingress, network policies |
| `volumes-secrets` | Storage, ConfigMaps, Secrets |
| `resources-scaling` | Limits, requests, HPA, scaling |
| `monitoring-debugging` | Probes, logging, events |
How It All Works Together
1. User starts challenge (`kubeasy challenge start pod-evicted`)
   - CLI creates an ArgoCD Application pointing to `manifests/` (sketched below)
   - ArgoCD deploys the resources to a dedicated namespace
   - Kyverno policies are applied
2. User investigates and fixes
   - Uses `kubectl` to explore the problem
   - Modifies resources to fix the issue
   - Kyverno validates changes (prevents bypasses)
3. User submits solution (`kubeasy challenge submit pod-evicted`)
   - CLI loads the validations from `challenge.yaml`
   - Executes each validation against the cluster
   - Sends results to the backend
   - XP is awarded if all validations pass
Best Practices
Challenge Design
- One concept per challenge - Don't mix RBAC + networking + storage
- Realistic scenarios - Use problems that occur in production
- Clear error messages - When users check logs, they should see helpful errors
- No red herrings - Don't add confusing complexity
Validation Design
- Check outcomes, not implementations - "Pod is healthy" not "Memory is 256Mi" (see the sketch below)
- Don't reveal solutions - Validation titles should be generic
- Accept multiple solutions - If there are valid alternatives, allow them
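Applying the BAD/GOOD pattern from the description examples above, the sketch below contrasts the two approaches. The BAD entry is hypothetical (this guide defines no field-level check type); the GOOD entry reuses the outcome-based status check from earlier.

```yaml
# BAD - Names the fix in the title and pins one implementation
# (hypothetical entry, shown only for contrast)
- key: memory-limit-256mi
  title: "Memory Limit Is 256Mi"   # reveals the solution

# GOOD - Checks the outcome; any valid fix passes
- key: pod-ready
  title: "Pod Ready"
  description: "The pod must be running and healthy"
  order: 1
  type: status
  spec:
    target:
      kind: Pod
      labelSelector:
        app: data-processor
    conditions:
      - type: Ready
        status: "True"
```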
Documentation
- Describe symptoms - Not the root cause
- State goals - Not the method to achieve them
- Never include solutions - Let users figure it out
Example: Complete Challenge
```
pod-evicted/
├── challenge.yaml
├── manifests/
│   └── deployment.yaml
└── policies/
    └── protect.yaml
```

challenge.yaml:
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
estimated_time: 15
initial_situation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
validations:
  - key: pod-ready
    title: "Pod Ready"
    description: "The pod must be running"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      conditions:
        - type: Ready
          status: "True"
  - key: no-oom
    title: "No Crash Events"
    description: "No eviction or crash events"
    order: 2
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      forbiddenReasons:
        - "OOMKilled"
        - "Evicted"
      sinceSeconds: 300
```

Next Steps
- Creating Your First Challenge - Step-by-step guide
- Validation Rules - Detailed validation reference
- Testing Challenges - How to test locally