Challenge Structure
Understanding the anatomy of a Kubeasy challenge and how all components work together.
Every Kubeasy challenge follows a consistent structure. This page walks through each component in turn.
Directory Structure
Each challenge is a folder in the challenges repository:
```
challenges/
└── pod-evicted/
    ├── challenge.yaml    # Metadata, description, AND validations
    ├── manifests/        # Initial broken state
    │   ├── deployment.yaml
    │   ├── service.yaml
    │   └── ...
    ├── policies/         # Kyverno policies (prevent bypasses)
    │   └── protect.yaml
    └── image/            # Optional: custom Docker images
        └── Dockerfile
```
Components
1. challenge.yaml
The challenge.yaml file contains everything about the challenge: metadata, description, and validations.
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
estimated_time: 15
initial_situation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
  Understand why Kubernetes is killing the application.
validations:
  - key: pod-running
    title: "Pod Ready"
    description: "The pod must be running and healthy"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      conditions:
        - type: Ready
          status: "True"
```
Metadata Fields
| Field | Required | Description |
|---|---|---|
| `title` | Yes | Challenge name (shown in UI) |
| `description` | Yes | Brief description of symptoms (NOT the cause!) |
| `theme` | Yes | Category for grouping |
| `difficulty` | Yes | `easy`, `medium`, or `hard` |
| `estimated_time` | Yes | Minutes to complete |
| `initial_situation` | Yes | What the user will find |
| `objective` | Yes | What needs to be achieved |
| `validations` | Yes | Success criteria (see below) |
Writing Good Descriptions
Describe symptoms, not causes:
```yaml
# BAD - Reveals the problem
description: |
  The ConfigMap has invalid JSON syntax.
  Fix the JSON formatting error.
```

```yaml
# GOOD - Maintains mystery
description: |
  A microservice keeps crashing shortly after deployment.
  The team swears the code hasn't changed.
```

State goals, not methods:
```yaml
# BAD - Tells user what to do
objective: |
  Increase the memory limit to 256Mi.
```

```yaml
# GOOD - States the outcome
objective: |
  Make the pod run stably without being evicted.
```

2. manifests/ Directory
Contains the initial broken state - Kubernetes resources that learners need to fix.
```yaml
# manifests/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-processor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: data-processor
  template:
    metadata:
      labels:
        app: data-processor
    spec:
      containers:
        - name: processor
          image: kubeasy/data-processor:v1
          resources:
            limits:
              memory: "32Mi"  # BUG: Too low!
              cpu: "100m"
```

Design principles:
- Keep it minimal - Only include what's needed
- Make it realistic - Mirror production problems
- One problem at a time - Don't combine unrelated issues
- Clear naming - Be descriptive (see the Service sketch below)
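To see these principles in a second file, here is what the `service.yaml` from the directory tree above might look like. This is a minimal sketch, not taken from the actual challenge; the port numbers are assumptions.

```yaml
# manifests/service.yaml -- hypothetical companion to the Deployment above
apiVersion: v1
kind: Service
metadata:
  name: data-processor        # clear, descriptive naming
spec:
  selector:
    app: data-processor       # matches the Deployment's pod labels
  ports:
    - port: 80                # assumed port
      targetPort: 8080        # assumed container port
```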
3. policies/ Directory
Kyverno policies that prevent bypasses, so users can't cheat around the problem instead of solving the challenge properly.
```yaml
# policies/protect.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-data-processor
spec:
  validationFailureAction: Enforce
  rules:
    - name: preserve-image
      match:
        resources:
          kinds: ["Deployment"]
          names: ["data-processor"]
          namespaces: ["challenge-*"]
      validate:
        message: "Cannot change the application image"
        pattern:
          spec:
            template:
              spec:
                containers:
                  - name: processor
                    image: "kubeasy/data-processor:v1"
```

What to protect:
- Container images (prevent replacing the app)
- Critical volume mounts
- Essential labels (ensure validations can find resources; see the sketch below)
What NOT to protect:
- Resource limits/requests (user should be able to change these)
- Environment variables
- Probe configurations
- Non-essential labels and annotations (the essential ones belong in the list above)
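For example, pinning the essential `app` label (so validations can always find the pod) might look like the rule below. This is a sketch modeled on the `preserve-image` rule above, not part of the actual challenge policy.

```yaml
# Hypothetical additional rule for policies/protect.yaml
- name: preserve-app-label
  match:
    resources:
      kinds: ["Deployment"]
      names: ["data-processor"]
      namespaces: ["challenge-*"]
  validate:
    message: "Cannot remove the app label used by validations"
    pattern:
      spec:
        template:
          metadata:
            labels:
              app: data-processor
```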
4. Validations
Defined in challenge.yaml, validations determine when the challenge is solved.
```yaml
validations:
  - key: unique-identifier
    title: "User-Friendly Title"
    description: "What this checks"
    order: 1
    type: status|log|event|metrics|rbac|connectivity
    spec:
      # Type-specific configuration
```

Available types:
| Type | Purpose |
|---|---|
| `status` | Check resource conditions (Ready, Available) |
| `log` | Find strings in container logs |
| `event` | Detect forbidden K8s events (OOMKilled, Evicted) |
| `metrics` | Check pod metrics (restart count) |
| `rbac` | Test ServiceAccount permissions |
| `connectivity` | HTTP connectivity tests |
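For example, a log validation might look like the sketch below. The layout mirrors the status example above, but the type-specific fields (such as `contains`) are assumptions; treat the Validation Rules page as authoritative.

```yaml
# Hypothetical log validation -- type-specific fields are assumptions
- key: startup-logged
  title: "Application Logs"
  description: "The application must log a successful start"
  order: 2
  type: log
  spec:
    target:
      kind: Pod
      labelSelector:
        app: data-processor
    contains: "processing started"   # assumed field name
```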
See Validation Rules for detailed examples.
5. image/ Directory (Optional)
If your challenge needs a custom application, you can create your own Docker image. Add an `image/` directory with a Dockerfile to your challenge folder:
```
my-challenge/
├── challenge.yaml
├── manifests/
├── policies/
└── image/
    ├── Dockerfile
    └── app.py          # Your application files
```

Example Dockerfile:
```dockerfile
# image/Dockerfile
FROM python:3.11-slim
COPY app.py /app/
CMD ["python", "/app/app.py"]
```

Automatic Build & Publish
When you merge your challenge to `main`, the CI automatically:
- Detects challenges with an `image/` directory
- Builds the Docker image (multi-arch: `linux/amd64` and `linux/arm64`)
- Pushes to GitHub Container Registry: `ghcr.io/kubeasy-dev/<challenge-name>:latest`
Using your custom image in manifests:
```yaml
# manifests/deployment.yaml
spec:
  containers:
    - name: app
      image: ghcr.io/kubeasy-dev/my-challenge:latest
```

When to Use Custom Images
- Your challenge needs specific application behavior (memory leaks, slow responses, etc.)
- You need to simulate a realistic application workload
- Standard images (nginx, python, busybox) don't fit your scenario
Best Practices
- Keep images small (use slim/alpine base images)
- Don't include secrets or sensitive data
- Make the application behavior predictable and reproducible
- Add a `.dockerignore` file to exclude unnecessary files
Themes
Challenges are grouped by theme:
| Theme | Description |
|---|---|
| `rbac-security` | Permissions, roles, security contexts |
| `networking` | Services, ingress, network policies |
| `volumes-secrets` | Storage, ConfigMaps, Secrets |
| `resources-scaling` | Limits, requests, HPA, scaling |
| `monitoring-debugging` | Probes, logging, events |
How It All Works Together
1. User starts challenge (`kubeasy challenge start pod-evicted`)
   - CLI creates an ArgoCD Application pointing to `manifests/` (sketched below)
   - ArgoCD deploys the resources to a dedicated namespace
   - Kyverno policies are applied
2. User investigates and fixes
   - Uses `kubectl` to explore the problem
   - Modifies resources to fix the issue
   - Kyverno validates changes (prevents bypasses)
3. User submits solution (`kubeasy challenge submit pod-evicted`)
   - CLI loads the validations from `challenge.yaml`
   - Executes each validation against the cluster
   - Sends results to the backend
   - XP is awarded if all validations pass
Best Practices
Challenge Design
- One concept per challenge - Don't mix RBAC + networking + storage
- Realistic scenarios - Use problems that occur in production
- Clear error messages - When users check logs, they should see helpful errors
- No red herrings - Don't add confusing complexity
Validation Design
- Check outcomes, not implementations - "Pod is healthy" not "Memory is 256Mi" (see the sketch below)
- Don't reveal solutions - Validation titles should be generic
- Accept multiple solutions - If there are valid alternatives, allow them
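Applying the BAD/GOOD pattern from the description examples above, the sketch below contrasts the two approaches. The BAD entry is hypothetical (this guide defines no field-level check type); the GOOD entry reuses the outcome-based status check from earlier.

```yaml
# BAD - Names the fix in the title and pins one implementation
# (hypothetical entry, shown only for contrast)
- key: memory-limit-256mi
  title: "Memory Limit Is 256Mi"   # reveals the solution

# GOOD - Checks the outcome; any valid fix passes
- key: pod-ready
  title: "Pod Ready"
  description: "The pod must be running and healthy"
  order: 1
  type: status
  spec:
    target:
      kind: Pod
      labelSelector:
        app: data-processor
    conditions:
      - type: Ready
        status: "True"
```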
Documentation
- Describe symptoms - Not the root cause
- State goals - Not the method to achieve them
- Never include solutions - Let users figure it out
Example: Complete Challenge
```
pod-evicted/
├── challenge.yaml
├── manifests/
│   └── deployment.yaml
└── policies/
    └── protect.yaml
```

challenge.yaml:
```yaml
title: Pod Evicted
description: |
  A data processing pod keeps crashing and getting evicted.
  It was working fine yesterday, but now Kubernetes keeps killing it.
theme: resources-scaling
difficulty: easy
estimated_time: 15
initial_situation: |
  A data processing application is deployed as a single pod.
  The pod starts successfully but after a few seconds gets killed.
  It enters a CrashLoopBackOff state and keeps restarting.
objective: |
  Fix the pod so it can run without being evicted.
validations:
  - key: pod-ready
    title: "Pod Ready"
    description: "The pod must be running"
    order: 1
    type: status
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      conditions:
        - type: Ready
          status: "True"
  - key: no-oom
    title: "No Crash Events"
    description: "No eviction or crash events"
    order: 2
    type: event
    spec:
      target:
        kind: Pod
        labelSelector:
          app: data-processor
      forbiddenReasons:
        - "OOMKilled"
        - "Evicted"
      sinceSeconds: 300
```

Next Steps
- Creating Your First Challenge - Step-by-step guide
- Validation Rules - Detailed validation reference
- Testing Challenges - How to test locally