KopiaMaintenance CRD Reference

Overview 

The KopiaMaintenance Custom Resource Definition (CRD) provides streamlined management of Kopia repository maintenance operations in VolSync. This namespace-scoped resource offers a simple, direct approach to configuring maintenance schedules for your Kopia repositories.

What is KopiaMaintenance?

KopiaMaintenance is a Kubernetes custom resource that manages automated maintenance operations for Kopia repositories. It creates and manages CronJobs that perform essential repository maintenance tasks including:

Garbage collection of unused data blocks
Repository compaction and optimization
Index maintenance for improved performance
Verification of repository integrity
Automatic maintenance ownership management

Key Features 

Namespace-scoped: Each KopiaMaintenance resource manages repositories within its namespace
Direct repository configuration: Explicit 1:1 mapping between maintenance resources and repositories
Simple API: Focused design without complex selectors or priority systems
Resource management: Configure CPU and memory limits for maintenance operations
Flexible scheduling: Support for standard cron expressions and aliases

When to Use KopiaMaintenance 

Use KopiaMaintenance when you need:

Automated maintenance for Kopia repositories
Namespace-isolated maintenance management
Clear, explicit maintenance configuration
Control over maintenance resource consumption
Simple deployment without cross-namespace complexity

Continue using embedded maintenanceCronJob in ReplicationSource when:

You have existing configurations that work well
You prefer configuration alongside your backup definitions
You need minimal setup for single repositories

API Specification 

Basic Structure 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: <maintenance-name>
  namespace: <target-namespace>
spec:
  repository:
    repository: <repository-secret-name>
    customCA:  # Optional
      configMapName: <ca-configmap-name>
      key: <ca-cert-key>
  trigger:  # New trigger support
    schedule: "0 2 * * *"  # Scheduled trigger
    # OR (mutually exclusive with schedule)
    # manual: "trigger-1"  # Manual trigger
  enabled: true
  suspend: false
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  resources:
    requests:
      memory: "256Mi"
      cpu: "100m"
    limits:
      memory: "1Gi"
      cpu: "500m"
  # Cache configuration (new)
  cacheCapacity: 10Gi
  cacheStorageClassName: fast-ssd
  cacheAccessModes:
    - ReadWriteOnce
  # OR use an existing PVC (the cache fields above are then ignored)
  # cachePVC: existing-cache-pvc

Field Reference 

Required Fields 

repository (KopiaRepositorySpec, required): Defines the repository configuration for maintenance. The repository secret must exist in the same namespace as the KopiaMaintenance resource.
repository.repository (string, required): Name of the secret containing the repository configuration. The secret must contain the Kopia repository connection details (URL, credentials, etc.).

Optional Fields 

repository.customCA (ReplicationSourceKopiaCA, optional)

Custom CA configuration for repository access.

configMapName: Name of ConfigMap containing CA certificate
key: Key within ConfigMap containing the certificate (default: "ca.crt")
secretName: Alternative to ConfigMap, name of Secret containing CA certificate

trigger (KopiaMaintenanceTriggerSpec, optional)

Defines when maintenance will be performed. Supports scheduled and manual triggers.

schedule: Cron schedule for maintenance execution (mutually exclusive with manual)
manual: String value for manual trigger (mutually exclusive with schedule)
Default: If no trigger is specified, the schedule "0 2 * * *" (daily at 2 AM) is used

schedule (string, optional, deprecated)

Cron schedule for maintenance execution.

DEPRECATED: Use trigger.schedule instead. This field will be removed in a future version.
Default: "0 2 * * *" (daily at 2 AM)
Supports standard cron expressions and aliases (@daily, @weekly, @monthly)
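
The aliases map onto ordinary five-field cron expressions. The sketch below shows the standard cron meaning of the three aliases listed above; expand_cron_alias is a hypothetical helper written for illustration and is not part of VolSync.

```shell
#!/bin/sh
# expand_cron_alias SCHEDULE
# Prints the five-field cron expression for a known alias,
# or the input unchanged when it is not one of the listed aliases.
# Hypothetical helper for illustration only.
expand_cron_alias() {
  case "$1" in
    @daily)   echo "0 0 * * *" ;;   # every day at midnight
    @weekly)  echo "0 0 * * 0" ;;   # Sundays at midnight
    @monthly) echo "0 0 1 * *" ;;   # first day of the month at midnight
    *)        echo "$1" ;;          # already an explicit expression
  esac
}
```

For example, expand_cron_alias "@weekly" prints the same expression you would write by hand for a Sunday-midnight run.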

enabled (boolean, optional)

Determines if maintenance should be performed.

Default: true
When false, no maintenance jobs will be created

suspend (boolean, optional)

Temporarily stop maintenance without deleting configuration.

Default: false
When true, prevents new Jobs from being created while allowing existing Jobs to complete
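
Suspension can be toggled without editing the manifest by patching spec.suspend. The following is a minimal sketch; suspend_patch is a hypothetical helper that only builds the JSON merge patch, and the kubectl invocation in the comment assumes cluster access and placeholder names.

```shell
#!/bin/sh
# suspend_patch true|false
# Prints a JSON merge patch that sets spec.suspend to the given value.
# Hypothetical helper for illustration only.
suspend_patch() {
  case "$1" in
    true|false)
      printf '{"spec":{"suspend":%s}}\n' "$1"
      ;;
    *)
      echo "usage: suspend_patch true|false" >&2
      return 1
      ;;
  esac
}

# Example (requires cluster access; <name> and <namespace> are placeholders):
#   kubectl patch kopiamaintenance <name> -n <namespace> \
#     --type merge -p "$(suspend_patch true)"
```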

successfulJobsHistoryLimit (integer, optional)

Number of successful maintenance Jobs to retain.

Default: 3
Minimum: 0

failedJobsHistoryLimit (integer, optional)

Number of failed maintenance Jobs to retain.

Default: 1
Minimum: 0

resources (ResourceRequirements, optional)

Compute resources for maintenance containers.

Default requests: 256Mi memory
Default limits: 1Gi memory
Configure based on repository size and performance requirements

serviceAccountName (string, optional)

Custom ServiceAccount for maintenance jobs. If not specified, uses default maintenance ServiceAccount.

podSecurityContext (PodSecurityContext, optional)

Pod-level security context for maintenance jobs. Allows configuring security settings such as runAsUser, fsGroup, and other standard Kubernetes pod security options. Container automatically inherits these settings. Default: runAsUser: 1000, fsGroup: 1000, runAsNonRoot: true

containerSecurityContext (SecurityContext, optional)

Container-level security context for maintenance jobs. For advanced use cases where you need fine-grained control over container security.

IMPORTANT: For setting the user ID, use podSecurityContext.runAsUser instead. The container automatically inherits runAsUser from the pod-level context.

Use this field only for advanced security controls like capabilities, privileged mode, SELinux, or seccomp profiles.

Default: Security hardening settings are applied automatically (readOnlyRootFilesystem, allowPrivilegeEscalation: false, capabilities dropped)

moverPodLabels (map[string]string, optional)

Additional labels for maintenance pods. Applied alongside VolSync-managed labels.

affinity (Affinity, optional)

Pod affinity rules for maintenance jobs. Supports nodeAffinity, podAffinity, and podAntiAffinity.

cacheCapacity (Quantity, optional)

Size of the Kopia metadata cache volume. If specified without cachePVC, a new PVC will be created.

cacheStorageClassName (string, optional)

StorageClass for the Kopia metadata cache volume. Only used when creating a new cache PVC.

cacheAccessModes ([]PersistentVolumeAccessMode, optional)

Access modes for the Kopia metadata cache volume. Default: [ReadWriteOnce]

cachePVC (string, optional)

Name of an existing PVC to use for Kopia cache. If specified, other cache configuration fields are ignored.
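
The cache fields resolve in a fixed order: an existing cachePVC wins, otherwise cacheCapacity creates a new PVC, otherwise no persistent cache is configured. The sketch below restates that order as a small function; cache_mode and its output labels are hypothetical names chosen for illustration, and the "emptydir" result assumes the EmptyDir fallback described in this document.

```shell
#!/bin/sh
# cache_mode CACHE_PVC CACHE_CAPACITY
# Prints which cache source applies for the given field values.
# Pass an empty string for a field that is not set.
# Hypothetical helper for illustration only.
cache_mode() {
  if [ -n "$1" ]; then
    echo "existing-pvc"   # cachePVC set: other cache fields are ignored
  elif [ -n "$2" ]; then
    echo "new-pvc"        # cacheCapacity set without cachePVC: a PVC is created
  else
    echo "emptydir"       # no cache fields set
  fi
}
```

For example, cache_mode "shared-kopia-cache" "10Gi" reports existing-pvc, because cachePVC takes priority over the sizing fields.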

Status Fields 

The KopiaMaintenance controller updates these status fields:

activeCronJob (string): Name of the currently active CronJob managing maintenance. Empty if no CronJob is active.
lastReconcileTime (Time): Timestamp of the last successful reconciliation.
lastMaintenanceTime (Time): Timestamp of the last successful maintenance operation.
nextScheduledMaintenance (Time): Next scheduled maintenance execution time.
maintenanceFailures (integer): Count of consecutive maintenance failures.
lastManualSync (string): Set to the last spec.trigger.manual value when manual maintenance completes. Used to track completion of manual triggers.
conditions ([]Condition): Current state observations of the maintenance configuration. Common conditions: Ready, Reconciling, Error.
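
Several of these fields can be read in one call with a jsonpath template. This is a minimal sketch that assumes kubectl access to the cluster; maintenance_status is a hypothetical helper, and the field paths are taken from the list above.

```shell
#!/bin/sh
# maintenance_status NAME NAMESPACE
# Prints activeCronJob, lastMaintenanceTime and maintenanceFailures
# for one KopiaMaintenance resource, separated by tabs.
# Hypothetical helper for illustration only.
maintenance_status() {
  kubectl get kopiamaintenance "$1" -n "$2" \
    -o jsonpath='{.status.activeCronJob}{"\t"}{.status.lastMaintenanceTime}{"\t"}{.status.maintenanceFailures}{"\n"}'
}

# Example (requires cluster access):
#   maintenance_status daily-maintenance my-app
```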

Configuration Examples 

Trigger Configuration 

Scheduled Trigger (Recommended)

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: scheduled-maintenance
  namespace: my-app
spec:
  repository:
    repository: kopia-repository-secret
  trigger:
    schedule: "0 3 * * *"  # 3 AM daily
  enabled: true

Manual Trigger for On-Demand Maintenance 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: manual-maintenance
  namespace: my-app
spec:
  repository:
    repository: kopia-repository-secret
  trigger:
    manual: "run-maintenance-2024-01-15"  # Change this value to trigger
  enabled: true

# To trigger maintenance:
# 1. Update spec.trigger.manual to a new value
# 2. Wait for status.lastManualSync to match the new value
# 3. Maintenance has completed when values match

Basic Daily Maintenance (Legacy)

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: daily-maintenance
  namespace: my-app
spec:
  repository:
    repository: kopia-repository-secret
  schedule: "0 3 * * *"  # 3 AM daily (deprecated field)
  enabled: true
  successfulJobsHistoryLimit: 3  # Keep last 3 successful jobs
  failedJobsHistoryLimit: 1       # Keep last failed job

Weekly Maintenance with Resource Limits 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: weekly-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * 0"  # 2 AM on Sundays
  resources:
    requests:
      memory: "512Mi"
      cpu: "200m"
    limits:
      memory: "2Gi"
      cpu: "1"
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 2

Maintenance with Custom CA 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: secure-maintenance
  namespace: secure-backups
spec:
  repository:
    repository: private-s3-config
    customCA:
      configMapName: company-ca-bundle
      key: ca-bundle.crt
  trigger:
    schedule: "0 1 * * 1,4"  # 1 AM on Mondays and Thursdays
  moverPodLabels:
    environment: production
    team: platform

High-Performance Maintenance with Cache 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: large-repo-maintenance
  namespace: data-warehouse
spec:
  repository:
    repository: warehouse-backup-config
  trigger:
    schedule: "0 0 * * 6"  # Midnight on Saturdays
  resources:
    requests:
      memory: "2Gi"
      cpu: "1"
    limits:
      memory: "8Gi"
      cpu: "4"
  # Cache configuration for better performance
  cacheCapacity: 20Gi
  cacheStorageClassName: fast-ssd
  cacheAccessModes:
    - ReadWriteOnce
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-type
            operator: In
            values: ["high-memory"]

Cache Configuration Examples 

Using Existing Cache PVC 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: maintenance-with-existing-cache
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * *"
  cachePVC: shared-kopia-cache  # Use existing PVC

Creating New Cache PVC 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: maintenance-with-new-cache
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * *"
  cacheCapacity: 15Gi            # Create new PVC with this size
  cacheStorageClassName: fast    # Use this storage class
  cacheAccessModes:
    - ReadWriteOnce

No Cache (EmptyDir Fallback)

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: maintenance-no-cache
  namespace: testing
spec:
  repository:
    repository: test-backup-config
  trigger:
    schedule: "0 4 * * *"
  # No cache configuration - will use EmptyDir

Temporarily Suspended Maintenance 

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: suspended-maintenance
  namespace: testing
spec:
  repository:
    repository: test-backup-config
  trigger:
    schedule: "0 4 * * *"
  enabled: true
  suspend: true  # Temporarily suspended
  successfulJobsHistoryLimit: 10  # Keep more history during suspension

Pod Security Configuration 

Overview 

The podSecurityContext field allows you to customize pod-level security settings for maintenance jobs. This is particularly useful when repository directories have specific ownership requirements or when you need to comply with security policies.

When to Use Pod Security Context 

You should configure podSecurityContext when:

Repository ownership differs from defaults: Your repository directory is owned by a user other than UID 1000
Permission errors occur: You see “permission denied” errors when accessing repository files
Security compliance: Your organization requires specific security context settings
Storage system requirements: Your storage backend requires specific user/group IDs

Common Use Case: Permission Denied Errors 

Problem: Maintenance jobs fail with permission errors when accessing the repository.

Error Example:

ERROR error connecting to repository: unable to read format blob:
error determining sharded path: error getting sharding parameters for storage:
unable to complete GetBlobFromPath:/repository/.shards despite 10 retries:
open /repository/.shards: permission denied

Cause: The repository directory is owned by a user (e.g., UID 2000) that differs from the default maintenance job user (UID 1000).

Solution: Configure podSecurityContext to match the repository ownership:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: my-maintenance
  namespace: backup-ns
spec:
  repository:
    repository: my-repo-secret
  podSecurityContext:
    runAsUser: 2000      # Match repository directory owner
    fsGroup: 2000        # Match repository directory group
    runAsNonRoot: true   # Security best practice

Configuration Examples 

Matching Repository File Ownership 

When your repository files are owned by a specific user:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: custom-user-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-secret
  podSecurityContext:
    runAsUser: 2000
    fsGroup: 2000
    runAsNonRoot: true
  trigger:
    schedule: "0 2 * * *"

Additional Security Settings 

For enhanced security compliance:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: secure-maintenance
  namespace: production
spec:
  repository:
    repository: secure-repo-secret
  podSecurityContext:
    runAsUser: 3000
    runAsGroup: 3000
    fsGroup: 3000
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
    supplementalGroups:
      - 4000
  trigger:
    schedule: "0 3 * * 0"

Default Security Context 

When podSecurityContext is not specified, the following defaults are used:

podSecurityContext:
  runAsUser: 1000
  fsGroup: 1000
  runAsNonRoot: true

This default configuration works for most scenarios where repository directories are created by VolSync with standard ownership.

Determining Required User/Group IDs 

To identify the correct user and group IDs for your repository:

For filesystem-based repositories (repositoryPVC):

# Create a temporary pod to check ownership
# Run it in the namespace that contains the repository PVC.
# The command is set inside --overrides because the override replaces
# the generated container definition.
kubectl run debug --rm --attach --image=busybox --restart=Never \
  -n <namespace> \
  --overrides='
  {
    "spec": {
      "containers": [{
        "name": "debug",
        "image": "busybox",
        "command": ["sh", "-c", "ls -lnd /repository"],
        "volumeMounts": [{
          "name": "repo",
          "mountPath": "/repository"
        }]
      }],
      "volumes": [{
        "name": "repo",
        "persistentVolumeClaim": {
          "claimName": "your-repository-pvc"
        }
      }]
    }
  }'

# Look for the numeric user and group IDs in the output
# Example output: drwxr-xr-x 2 2000 2000 4096 Jan 20 10:00 /repository

For object storage repositories (S3, Azure, GCS):

Object storage typically doesn’t require specific UIDs, but you may need to match the user that created the repository if filesystem caching is used.
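
For filesystem-based repositories, a numeric ownership listing can be turned directly into a podSecurityContext fragment. The sketch below assumes the usual `ls -ln` column layout (UID in the third column, GID in the fourth); security_context_from_ls is a hypothetical helper, and it refuses UID 0 because this document recommends running as non-root.

```shell
#!/bin/sh
# security_context_from_ls
# Reads `ls -ln` style output on stdin and prints a podSecurityContext
# fragment built from the first entry with numeric UID and GID columns.
# Hypothetical helper for illustration only.
security_context_from_ls() {
  awk '
    NF >= 4 && $3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ {
      if ($3 == 0) {
        # Owned by root: do not suggest a root security context
        exit 1
      }
      printf "podSecurityContext:\n"
      printf "  runAsUser: %s\n", $3
      printf "  fsGroup: %s\n", $4
      printf "  runAsNonRoot: true\n"
      found = 1
      exit 0
    }
    END { if (!found) exit 1 }
  '
}

# Example:
#   echo "drwxr-xr-x 2 2000 2000 4096 Jan 20 10:00 /repository" | security_context_from_ls
```

The printed fragment can be pasted under spec in the KopiaMaintenance resource.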

Available Security Context Fields 

The podSecurityContext field supports all standard Kubernetes PodSecurityContext options:

Field	Description
runAsUser	UID to run the pod processes
runAsGroup	Primary GID for pod processes
fsGroup	Special supplemental group for volume ownership
runAsNonRoot	Ensures containers run as non-root (recommended: true)
supplementalGroups	Additional groups for the first process
fsGroupChangePolicy	How volume ownership is changed (OnRootMismatch, Always)
seccompProfile	Seccomp profile (e.g., RuntimeDefault)
seLinuxOptions	SELinux options for containers
windowsOptions	Windows-specific security settings

Container-Level Security 

KopiaMaintenance supports both pod-level and container-level security context configuration. This provides flexibility for advanced use cases while keeping simple scenarios straightforward.

Security Context Inheritance 

How it works:

Pod-level settings (podSecurityContext) apply to all containers and control volume permissions
Container-level settings (containerSecurityContext) provide fine-grained container controls
The container inherits runAsUser from the pod-level context, so there is no need to set it twice

Default behavior (when containerSecurityContext is not specified):

# Container security context (applied automatically)
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  privileged: false
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  # runAsUser: <inherited from pod-level>

These defaults provide defense-in-depth security by:

Preventing privilege escalation
Dropping all Linux capabilities
Making the root filesystem read-only
Ensuring non-root execution

Simple configuration (recommended for most users):

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
spec:
  podSecurityContext:
    runAsUser: 2000      # Container inherits this
    fsGroup: 2000
    runAsNonRoot: true

Advanced configuration (for custom capabilities, SELinux, etc.):

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
spec:
  podSecurityContext:
    runAsUser: 2000      # Still set user here
    fsGroup: 2000
  containerSecurityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop: ["ALL"]
      add: ["NET_BIND_SERVICE"]  # Advanced: add specific capability
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    # Don't set runAsUser here - it's inherited from pod level

Backward Compatibility 

Existing KopiaMaintenance resources continue to work without changes:

If podSecurityContext is not specified, the default values are applied
No migration is required for existing configurations
You can add podSecurityContext to existing resources at any time

Troubleshooting Pod Security Issues 

Maintenance Jobs Fail with Permission Errors

# Check the maintenance job logs
kubectl logs -n <namespace> job/<maintenance-job-name>

# Verify pod security context
kubectl get pod <maintenance-pod> -n <namespace> -o jsonpath='{.spec.securityContext}'

# Check repository directory permissions (for filesystem repos; the pod must still be running)
kubectl exec -n <namespace> <maintenance-pod> -- ls -ln /repository

Solution: Configure podSecurityContext to match repository ownership.

Jobs Won’t Start Due to Security Policy Violations

# Check pod security admission warnings
kubectl describe pod <maintenance-pod> -n <namespace>

Solution: Adjust podSecurityContext to comply with cluster security policies (Pod Security Standards, OPA policies, etc.).

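SELinux Context Errors

Solution: Set seLinuxOptions in podSecurityContext to a context that your cluster policy and storage allow. The values below are illustrative: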

podSecurityContext:
  seLinuxOptions:
    level: "s0:c123,c456"
    role: "system_r"
    type: "container_t"
    user: "system_u"

Best Practices 

Trigger Selection 

Scheduled Triggers

Use scheduled triggers for:

Regular, predictable maintenance windows
Production environments with consistent backup patterns
Repositories that grow at a steady rate

Example schedules:

"0 2 * * *" - Daily at 2 AM
"0 3 * * 0" - Weekly on Sunday at 3 AM
"0 4 1 * *" - Monthly on the 1st at 4 AM
"@daily" - Once per day at midnight
"@weekly" - Once per week on Sunday at midnight

Manual Triggers

Use manual triggers for:

On-demand maintenance after large data changes
Testing and troubleshooting
Maintenance coordination with other operations
CI/CD pipeline integration

To use manual triggers:

Set spec.trigger.manual to a unique value
Apply the resource
Monitor status.lastManualSync
When lastManualSync matches your trigger value, maintenance is complete
Update spec.trigger.manual to a new value for next trigger

Job History Management 

KopiaMaintenance allows you to control how many completed Job records are retained for successful and failed maintenance operations. This helps balance between having debugging history and reducing cluster resource usage.

Configuration Fields 

successfulJobsHistoryLimit (integer, default: 3)

Controls how many successful maintenance Job records to keep. These records are useful for:

Tracking maintenance execution patterns
Verifying maintenance is running on schedule
Reviewing historical performance and duration
Troubleshooting intermittent issues

Set to 0 to delete successful jobs immediately after completion.

failedJobsHistoryLimit (integer, default: 1)

Controls how many failed maintenance Job records to keep. Failed jobs are crucial for:

Diagnosing what went wrong during maintenance
Identifying patterns in failures
Providing logs for troubleshooting
Understanding error conditions

Set to 0 to delete failed jobs immediately (not recommended).

When to Customize 

Increase history limits when:

Debugging maintenance issues and need more historical context
Running maintenance infrequently (weekly/monthly) and want long-term history
Tracking performance trends over time
Working in development/testing environments

Decrease history limits when:

Running maintenance very frequently (hourly) and don’t need extensive history
Cluster has limited resources and job records consume too much memory
Using external monitoring and don’t need Kubernetes job history
Operating in resource-constrained environments

Example Configurations 

Minimal History (Resource Constrained):

spec:
  successfulJobsHistoryLimit: 1   # Keep only last success
  failedJobsHistoryLimit: 0       # Delete failures immediately

Extended History (Debugging):

spec:
  successfulJobsHistoryLimit: 10  # Keep 10 successful runs
  failedJobsHistoryLimit: 5       # Keep 5 failed runs for analysis

Balanced Default (Recommended):

spec:
  successfulJobsHistoryLimit: 3   # Default: last 3 successful runs
  failedJobsHistoryLimit: 1       # Default: last failed run

Cache Configuration 

Kopia uses a metadata cache to improve performance. KopiaMaintenance supports four cache scenarios:

1. Existing PVC (Recommended for Production)

Best when you want full control over the cache PVC:

spec:
  cachePVC: my-cache-pvc  # Must exist in same namespace

2. Auto-Created PVC

Best for automatic cache management:

spec:
  cacheCapacity: 10Gi
  cacheStorageClassName: fast-ssd
  cacheAccessModes:
    - ReadWriteOnce

3. EmptyDir (Default)

When no cache configuration is provided, uses ephemeral storage. Suitable for:

Small repositories
Testing environments
When persistence isn’t critical

4. No Cache

Kopia will operate without cache if explicitly disabled in repository configuration.

Cache Sizing Guidelines:

Small repos (<100GB): 1-2Gi cache
Medium repos (100GB-1TB): 5-10Gi cache
Large repos (>1TB): 15-30Gi cache
Very large repos: 50Gi+ cache

Repository Secret Management 

Keep secrets in the same namespace: The repository secret must exist in the same namespace as the KopiaMaintenance resource
Use descriptive secret names: Choose names that clearly identify the repository purpose (e.g., prod-s3-backup-config, dev-gcs-repo)
Secure sensitive data: Ensure repository secrets are properly protected with RBAC

Scheduling Considerations 

Avoid peak hours: Schedule maintenance during low-activity periods
Stagger multiple maintenances: If managing multiple repositories, use different schedules to avoid resource contention
Consider repository size: Large repositories may need weekly rather than daily maintenance
Account for time zones: Schedules are interpreted in the controller’s timezone

Resource Allocation 

Start conservative: Begin with default resources and adjust based on observed usage
Monitor maintenance jobs: Check job completion times and resource consumption
Scale for repository size: Larger repositories require more memory and CPU
Use node affinity: Direct maintenance to appropriate nodes for large-scale operations

Resource Recommendations by Repository Size:

Repository Size	Memory (Request/Limit)	CPU (Request/Limit)
Small (<100GB)	256Mi / 1Gi	100m / 500m
Medium (100GB-1TB)	512Mi / 2Gi	200m / 1
Large (1TB-10TB)	1Gi / 4Gi	500m / 2
Very Large (>10TB)	2Gi / 8Gi	1 / 4
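
The table can be expressed as a simple lookup. The sketch below is illustrative only: recommend_resources is a hypothetical helper, it assumes 1TB = 1000GB, and it assigns the boundary values 1TB and 10TB to the smaller tier, since the table's ranges overlap at those points.

```shell
#!/bin/sh
# recommend_resources SIZE_GB
# Prints the request/limit pairs from the table for a repository size in GB.
# Hypothetical helper; tier boundaries are assumptions (see lead-in).
recommend_resources() {
  size_gb=$1
  case "$size_gb" in
    ''|*[!0-9]*)
      echo "size must be a non-negative integer (GB)" >&2
      return 1
      ;;
  esac
  if [ "$size_gb" -lt 100 ]; then
    echo "memory=256Mi/1Gi cpu=100m/500m"   # Small
  elif [ "$size_gb" -le 1000 ]; then
    echo "memory=512Mi/2Gi cpu=200m/1"      # Medium
  elif [ "$size_gb" -le 10000 ]; then
    echo "memory=1Gi/4Gi cpu=500m/2"        # Large
  else
    echo "memory=2Gi/8Gi cpu=1/4"           # Very large
  fi
}
```

Treat the result as a starting point and adjust it based on observed usage, as recommended above.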

Maintenance Ownership 

Kopia requires a single user to own maintenance operations. KopiaMaintenance automatically:

Sets identity: Uses maintenance@volsync as the maintenance identity
Claims ownership: Automatically claims or reclaims maintenance ownership
Handles conflicts: Retries if another user currently owns maintenance
Ensures reliability: Prevents maintenance failures due to ownership issues

Naming Conventions 

Use descriptive names: prod-daily-maintenance, staging-weekly-cleanup
Include frequency: Indicate maintenance schedule in the name when relevant
Match repository purpose: Align maintenance names with repository naming

Migration Guide 

Migrating from maintenanceIntervalDays 

The maintenanceIntervalDays field has been removed from ReplicationSource. All maintenance operations must now be configured through the KopiaMaintenance CRD.

Old Configuration (No Longer Supported):

apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: my-backup
spec:
  sourcePVC: my-data
  kopia:
    repository: kopia-config
    maintenanceIntervalDays: 7  # REMOVED - NO LONGER SUPPORTED

New Configuration (Required):

Create a separate KopiaMaintenance resource:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: my-maintenance
  namespace: same-as-replicationsource
spec:
  repository:
    repository: kopia-config  # Same secret as ReplicationSource
  trigger:
    schedule: "0 2 * * 0"      # Weekly on Sunday at 2 AM
  # Optional: Add cache for better performance
  cacheCapacity: 10Gi
  cacheStorageClassName: fast-ssd
  cacheAccessModes:
    - ReadWriteOnce

Migration Benefits:

Independent scheduling: Maintenance no longer tied to backup frequency
Better performance: Dedicated cache configuration for maintenance
Resource control: Specify CPU/memory limits for maintenance jobs
Flexible triggers: Support for both scheduled and manual maintenance

Migrating from Deprecated schedule Field 

The schedule field is deprecated in favor of trigger.schedule. Here’s how to migrate:

Old Configuration:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: my-maintenance
spec:
  repository:
    repository: backup-config
  schedule: "0 2 * * *"  # Deprecated field

New Configuration:

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: my-maintenance
spec:
  repository:
    repository: backup-config
  trigger:
    schedule: "0 2 * * *"  # New field location

Backward Compatibility:

The deprecated schedule field continues to work
If both fields are set, trigger.schedule takes precedence
The controller will log warnings when using the deprecated field
Plan to migrate before the field is removed in a future version
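
The precedence rules above can be sketched as a small function. This is an illustration of the documented behavior, not the controller's code: effective_schedule is a hypothetical helper, and it assumes no manual trigger is set.

```shell
#!/bin/sh
# effective_schedule TRIGGER_SCHEDULE LEGACY_SCHEDULE
# Prints the schedule that applies, given spec.trigger.schedule and the
# deprecated spec.schedule (pass an empty string for an unset field).
# Hypothetical helper for illustration only.
effective_schedule() {
  if [ -n "$1" ]; then
    echo "$1"                 # trigger.schedule takes precedence
  elif [ -n "$2" ]; then
    echo "warning: deprecated schedule field in use" >&2
    echo "$2"                 # deprecated field still works
  else
    echo "0 2 * * *"          # documented default: daily at 2 AM
  fi
}
```

For example, effective_schedule "0 3 * * *" "0 2 * * 0" prints the trigger.schedule value, because it wins when both fields are set.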

From Embedded maintenanceCronJob 

If you’re currently using embedded maintenance configuration in ReplicationSource:

Before (Embedded Configuration):

apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production
spec:
  sourcePVC: app-data
  kopia:
    repository: prod-backup-config
    maintenanceCronJob:
      enabled: true
      schedule: "0 2 * * *"
      resources:
        requests:
          memory: "256Mi"

After (Separate KopiaMaintenance):

# Step 1: Create KopiaMaintenance resource
apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: prod-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * *"
  resources:
    requests:
      memory: "256Mi"
    limits:
      memory: "1Gi"

---
# Step 2: Remove maintenanceCronJob from ReplicationSource
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production
spec:
  sourcePVC: app-data
  kopia:
    repository: prod-backup-config
    # maintenanceCronJob section removed

Migration Steps 

Create KopiaMaintenance resources before modifying ReplicationSources
Verify CronJob creation using kubectl get cronjobs -n <namespace>
Remove embedded configuration from ReplicationSources
Monitor maintenance execution to ensure continuity

Adding Cache to Existing Maintenance 

To add cache support to existing maintenance configurations:

Step 1: Create a cache PVC (if not using auto-creation)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kopia-cache
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi

Step 2: Update KopiaMaintenance to use cache

apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: prod-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * *"
  cachePVC: kopia-cache  # Add this line

Step 3: Monitor performance improvement

# Compare maintenance job start and completion times before and after adding the cache
kubectl get jobs -n production -l volsync.backube/kopia-maintenance=true \
  -o custom-columns=NAME:.metadata.name,START:.status.startTime,COMPLETED:.status.completionTime

Advanced Usage 

Combining Manual and Scheduled Triggers 

While you cannot use both triggers simultaneously in a single resource, you can create separate resources for different trigger types:

# Regular scheduled maintenance
apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: scheduled-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    schedule: "0 2 * * *"
---
# On-demand maintenance for the same repository
apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: manual-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    manual: "on-demand-1"
  enabled: false  # Enable only when needed

Automating Manual Triggers 

You can automate manual triggers using kubectl or CI/CD pipelines:

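#!/bin/bash
# Script to trigger manual maintenance and wait for it to complete
set -euo pipefail

NAMESPACE="production"
MAINTENANCE_NAME="manual-maintenance"
TRIGGER_VALUE="manual-$(date +%Y%m%d-%H%M%S)"
TIMEOUT_SECONDS=3600  # Give up after one hour; adjust for your repository size

# Update the trigger
kubectl patch kopiamaintenance "$MAINTENANCE_NAME" -n "$NAMESPACE" \
  --type merge -p "{\"spec\":{\"trigger\":{\"manual\":\"$TRIGGER_VALUE\"}}}"

# Wait for completion: status.lastManualSync matches the trigger value when done
DEADLINE=$(( $(date +%s) + TIMEOUT_SECONDS ))
while true; do
  LAST_SYNC=$(kubectl get kopiamaintenance "$MAINTENANCE_NAME" -n "$NAMESPACE" \
    -o jsonpath='{.status.lastManualSync}')
  if [ "$LAST_SYNC" = "$TRIGGER_VALUE" ]; then
    echo "Maintenance completed"
    break
  fi
  if [ "$(date +%s)" -ge "$DEADLINE" ]; then
    echo "Timed out waiting for maintenance to complete" >&2
    exit 1
  fi
  echo "Waiting for maintenance to complete..."
  sleep 30
done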

Performance Tuning with Cache 

Cache Warming Strategy:

For optimal performance, pre-warm the cache before heavy maintenance:

apiVersion: batch/v1
kind: Job
metadata:
  name: cache-warmer
  namespace: production
spec:
  template:
    spec:
      restartPolicy: Never  # Jobs require Never or OnFailure
      containers:
      - name: kopia
        image: kopia/kopia:latest
        command:
        - kopia
        - repository
        - status
        - --config-file=/tmp/repository/config
        volumeMounts:
        - name: cache
          mountPath: /cache
        - name: repository-config
          mountPath: /tmp/repository
      volumes:
      - name: cache
        persistentVolumeClaim:
          claimName: kopia-cache
      - name: repository-config
        secret:
          secretName: prod-backup-config

Troubleshooting 

Common Issues 

Maintenance Not Running 

Symptoms:

No CronJob created in namespace
status.activeCronJob is empty

Solutions:

Verify repository secret exists:

kubectl get secret <repository-secret> -n <namespace>

Check KopiaMaintenance status:

kubectl describe kopiamaintenance <name> -n <namespace>

Review controller logs for errors:

kubectl logs -n volsync-system deployment/volsync | grep -i kopiamaintenance

Authentication Failures 

Symptoms:

Maintenance jobs fail with authentication errors
Repository access denied messages

Solutions:

Verify secret contains required fields:

kubectl get secret <repository-secret> -n <namespace> -o jsonpath='{.data}' | jq 'keys'

Check secret data is valid and not corrupted
Ensure custom CA is properly configured if using self-signed certificates
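
The key check above can be scripted. The following is a minimal sketch, not part of VolSync: missing_keys is a hypothetical helper, and the key names KOPIA_REPOSITORY and KOPIA_PASSWORD in the usage comment are assumptions about what your repository secret should contain. Adjust them to match your backend.

```shell
# Hypothetical helper: print each required key that is absent from the secret.
# Usage: missing_keys "<present keys, space-separated>" <required-key>...
missing_keys() {
  present=" $1 "
  shift
  for key in "$@"; do
    case "$present" in
      *" $key "*) ;;                # key is present, nothing to report
      *) printf '%s\n' "$key" ;;    # key is missing
    esac
  done
}

# Example wiring (requires kubectl and jq; the key names are assumptions):
# present=$(kubectl get secret <repository-secret> -n <namespace> -o json \
#   | jq -r '.data | keys | join(" ")')
# missing_keys "$present" KOPIA_REPOSITORY KOPIA_PASSWORD
```

An empty result means every listed key exists. It does not confirm that the values themselves are correct.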

Resource Exhaustion 

Symptoms:

Maintenance jobs killed or evicted
Out of memory errors

Solutions:

Increase resource limits:

resources:
  requests:
    memory: "1Gi"
  limits:
    memory: "4Gi"

Monitor actual usage:

kubectl top pod -n <namespace> -l job-name=<maintenance-job>

Schedule Not Working 

Symptoms:

Jobs not running at expected times
Incorrect execution frequency

Solutions:

Validate cron expression using online validators or tools
Check controller timezone configuration
Verify suspend is not set to true
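
As a quick local sanity check before reaching for an external validator, the sketch below tests only the shape of a schedule: either a common alias or exactly five whitespace-separated fields. cron_shape_ok is a hypothetical helper, and the alias list is an assumption about which aliases you use. It does not validate the value range of each field.

```shell
# Hypothetical helper: succeed when the schedule is a known alias
# or consists of exactly five whitespace-separated fields.
cron_shape_ok() {
  case "$1" in
    @yearly|@annually|@monthly|@weekly|@daily|@midnight|@hourly) return 0 ;;
  esac
  # Count fields with awk so that "*" is never glob-expanded by the shell
  field_count=$(printf '%s\n' "$1" | awk '{ print NF }')
  [ "$field_count" -eq 5 ]
}

# Usage:
# cron_shape_ok "0 2 * * *" && echo "shape looks valid"
```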

Job History for Debugging 

The job history limits control how much historical data you have available for troubleshooting:

# View recent maintenance jobs, oldest first
kubectl get jobs -n <namespace> -l volsync.backube/kopia-maintenance=true \
  --sort-by=.metadata.creationTimestamp

# Check the outcome of each retained job
kubectl get jobs -n <namespace> -l volsync.backube/kopia-maintenance=true \
  -o custom-columns=NAME:.metadata.name,SUCCEEDED:.status.succeeded,FAILED:.status.failed,START:.status.startTime

# View logs from a specific job
kubectl logs -n <namespace> job/<maintenance-job-name>

# If you need more history, increase the limits:
kubectl patch kopiamaintenance <name> -n <namespace> --type merge \
  -p '{"spec":{"successfulJobsHistoryLimit":10,"failedJobsHistoryLimit":5}}'

Tip

If you’re troubleshooting maintenance issues and the job history has been cleaned up, consider temporarily increasing successfulJobsHistoryLimit and failedJobsHistoryLimit to capture more execution history.

Debugging Commands 

# Check KopiaMaintenance resources
kubectl get kopiamaintenance -A

# View detailed status with trigger info
kubectl get kopiamaintenance <name> -n <namespace> -o yaml | grep -A5 trigger

# Check trigger status (requested trigger -> last completed trigger)
kubectl get kopiamaintenance <name> -n <namespace> \
  -o jsonpath='{.spec.trigger.manual} -> {.status.lastManualSync}{"\n"}'

# View cache configuration (kubectl jsonpath does not support partial-name wildcards)
kubectl get kopiamaintenance <name> -n <namespace> -o yaml | grep -i cache

# Check created CronJobs (for scheduled triggers)
kubectl get cronjobs -n <namespace> -l volsync.backube/kopia-maintenance=true

# Check Jobs (for manual triggers)
kubectl get jobs -n <namespace> -l volsync.backube/kopia-maintenance=true

# View maintenance job logs
kubectl logs -n <namespace> job/<maintenance-job-name>

# Check events for errors
kubectl get events -n <namespace> --field-selector involvedObject.name=<maintenance-name>

# Monitor cache PVC usage
kubectl exec -n <namespace> <pod-name> -- df -h /cache
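
To interpret the trigger check, compare spec.trigger.manual with status.lastManualSync, as the automation script in this guide does. The sketch below assumes that equality means the run has finished. trigger_state and its output labels are hypothetical names used only for illustration.

```shell
# Hypothetical helper: classify a manual trigger from the two field values.
# Usage: trigger_state "<spec.trigger.manual>" "<status.lastManualSync>"
trigger_state() {
  if [ -z "$1" ]; then
    echo "no-manual-trigger"   # spec.trigger.manual is unset
  elif [ "$1" = "$2" ]; then
    echo "completed"           # status has caught up with the requested trigger
  else
    echo "pending"             # requested trigger has not been recorded yet
  fi
}

# Example wiring (requires kubectl):
# spec=$(kubectl get kopiamaintenance <name> -n <namespace> -o jsonpath='{.spec.trigger.manual}')
# status=$(kubectl get kopiamaintenance <name> -n <namespace> -o jsonpath='{.status.lastManualSync}')
# trigger_state "$spec" "$status"
```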

Limitations 

Current Limitations 

Namespace Isolation: Repository secret must exist in the same namespace as KopiaMaintenance
No Cross-Namespace Management: Cannot manage repositories in different namespaces
Single Repository: Each KopiaMaintenance manages exactly one repository
No Repository Discovery: No automatic detection of repositories or ReplicationSources

Design Rationale 

The simplified design provides:

Clear ownership: Namespace-scoped resources have clear ownership boundaries
Better security: No cross-namespace secret access reduces attack surface
Simpler RBAC: Namespace-level permissions are easier to manage
Predictable behavior: Direct configuration eliminates matching complexity

Performance Considerations 

Cache Impact on Performance 

The Kopia cache significantly improves maintenance performance:

Performance Comparison:

Repository Size	Without Cache	With Cache
100GB	15-20 minutes	5-8 minutes
1TB	2-3 hours	30-45 minutes
10TB	8-12 hours	2-3 hours

Cache Optimization Tips:

Use SSD storage for cache PVCs when possible
Size appropriately: 1-2% of repository size is usually sufficient
Monitor cache hit rates through Kopia logs
Persistent cache is crucial for large repositories
Share cache between maintenance and backup operations when possible
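
The 1-2% sizing guideline is simple arithmetic. The sketch below rounds up to a whole GiB with a floor of 1 GiB. cache_size_gib is a hypothetical helper, and the rounding and minimum are assumptions rather than VolSync behavior.

```shell
# Hypothetical helper: suggest a cache size in GiB.
# Usage: cache_size_gib <repository-size-in-GiB> [percent, default 2]
cache_size_gib() {
  repo_gib="$1"
  percent="${2:-2}"
  # Integer ceiling of repo_gib * percent / 100
  size=$(( (repo_gib * percent + 99) / 100 ))
  if [ "$size" -lt 1 ]; then
    size=1
  fi
  echo "$size"
}

# Usage: a 1 TiB (1024 GiB) repository at 2%
# cache_size_gib 1024 2
```

Treat the result as a starting point for cacheCapacity and adjust it after observing actual cache usage.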

Scheduling Optimization 

Best Practices for Scheduling:

Avoid backup windows: Don’t run maintenance during active backups
Stagger maintenance: Spread maintenance across different times for multiple repositories
Consider time zones: Schedule based on application usage patterns
Frequency guidelines:
- Daily: Small, frequently changing repositories
- Weekly: Medium-sized, moderate change rate
- Monthly: Large, slow-changing archives

Example Staggered Schedule:

# Repository 1: 2 AM
trigger:
  schedule: "0 2 * * *"

# Repository 2: 3 AM
trigger:
  schedule: "0 3 * * *"

# Repository 3: 4 AM
trigger:
  schedule: "0 4 * * *"
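
For more than a few repositories, staggered schedules like the ones above can be generated rather than written by hand. This is a minimal sketch: staggered_schedule is a hypothetical helper that assumes daily runs spaced one hour apart, wrapping around at midnight.

```shell
# Hypothetical helper: daily cron expression offset by one hour per repository.
# Usage: staggered_schedule <zero-based index> [start hour, default 2]
staggered_schedule() {
  index="$1"
  start_hour="${2:-2}"
  hour=$(( (start_hour + index) % 24 ))   # wrap around after 23:00
  echo "0 $hour * * *"
}

# Usage: print a schedule for each repository in a list
# i=0
# for repo in repo-1 repo-2 repo-3; do
#   echo "$repo: $(staggered_schedule "$i")"
#   i=$((i + 1))
# done
```

With more than 24 repositories the hours repeat, so a finer offset (for example, in minutes) would be needed.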

Monitoring and Observability 

Key Metrics to Monitor 

Maintenance Health Metrics:

volsync_kopia_maintenance_last_run_timestamp_seconds: Last successful maintenance
volsync_kopia_maintenance_duration_seconds: Maintenance duration
volsync_kopia_maintenance_cronjob_failures_total: Failed maintenance count

Repository Health Metrics:

Repository size growth rate
Deduplication ratio
Number of snapshots
Orphaned blocks count

Prometheus Queries 

Alert on Missing Maintenance (no successful run in the last 3 days, i.e. 259200 seconds):

time() - volsync_kopia_maintenance_last_run_timestamp_seconds > 259200

Track Maintenance Duration Trends:

rate(volsync_kopia_maintenance_duration_seconds[1d])

Monitor Cache Effectiveness:

# Check cache hit ratio in maintenance logs
kubectl logs -n <namespace> job/<maintenance-job> | grep -i "cache hit"

Integration with CI/CD 

GitOps Integration Example:

# In your GitOps repository
apiVersion: volsync.backube/v1alpha1
kind: KopiaMaintenance
metadata:
  name: post-deployment-maintenance
  namespace: production
spec:
  repository:
    repository: prod-backup-config
  trigger:
    manual: "deployment-${CI_COMMIT_SHA}"  # Placeholder: substitute in your pipeline (e.g. with envsubst) before applying; Kubernetes does not expand it
  cacheCapacity: 20Gi
  resources:
    requests:
      memory: "2Gi"
    limits:
      memory: "4Gi"

Jenkins Pipeline Example:

stage('Trigger Maintenance') {
  steps {
    script {
      def triggerValue = "jenkins-${env.BUILD_NUMBER}"
      sh """
        kubectl patch kopiamaintenance manual-maintenance \
          -n production \
          --type merge \
          -p '{"spec":{"trigger":{"manual":"${triggerValue}"}}}'
      """

      // Wait for completion
      timeout(time: 30, unit: 'MINUTES') {
        waitUntil {
          def status = sh(
            script: "kubectl get kopiamaintenance manual-maintenance -n production -o jsonpath='{.status.lastManualSync}'",
            returnStdout: true
          ).trim()
          return status == triggerValue
        }
      }
    }
  }
}

Next Steps 

Review Backup Configuration for repository setup
Explore Troubleshooting Guide for detailed debugging
Set up monitoring with the examples in /examples/kopia/maintenance-alerts
Learn about Kopia’s maintenance operations in detail
Understand cache architecture in Kopia’s performance guide

Support 

For issues or questions:

GitHub Issues: https://github.com/backube/volsync/issues
GitHub Discussions: https://github.com/backube/volsync/discussions
Documentation: https://volsync.readthedocs.io/