Multi-tenancy and Shared Repositories

The VolSync Kopia mover implements multi-tenancy through the automatic generation of unique usernames and hostnames for each backup client. This ensures that multiple ReplicationSources and ReplicationDestinations can safely share the same Kopia repository without conflicts.

Each Kopia client requires a unique identity consisting of:

  • Username: Identifies the tenant/user in the repository

  • Hostname: Identifies the specific host/instance within that tenant

Simplified Multi-Tenancy with Namespace-Only Hostnames

VolSync’s hostname generation uses namespace-only identification by design, providing simplified multi-tenant isolation:

  • Hostname = Namespace: The hostname is ALWAYS just the namespace name (unless explicitly customized)

  • Username = ReplicationSource/ReplicationDestination name: By default, the username is derived from the object name

  • Unique identity guaranteed: Since Kubernetes prevents duplicate object names in a namespace, the combination of namespace (hostname) + object name (username) is always unique

  • Clear tenant boundaries: Each namespace represents a distinct tenant with a single hostname

  • Multi-source support: Multiple ReplicationSources in the same namespace share the hostname but have different usernames

  • No collision risk: There’s no security issue or collision risk because Kubernetes enforces unique object names

This design enables powerful multi-tenancy features:

  • Namespace as tenant boundary: All backups from a namespace share the same hostname, representing a logical tenant

  • Per-source isolation: Each ReplicationSource has its own username, maintaining separate snapshot histories

  • Repository-level policies: Administrators can apply retention policies based on namespace (hostname) patterns

  • No collision possible: Even with shared hostnames, unique usernames prevent any conflicts

  • Simplified access control: Control repository access at the namespace level

Understanding identity generation

VolSync automatically generates usernames and hostnames from your Kubernetes resources. This design is intentional and relies on Kubernetes’ built-in uniqueness guarantees.

Key Design Principle: The hostname is ALWAYS just the namespace name (unless explicitly customized). This is not a limitation but a deliberate design choice that simplifies multi-tenancy and ensures predictable behavior.

How Identity Overrides Work Internally

When you specify custom username or hostname fields in your ReplicationSource/ReplicationDestination, VolSync applies these overrides at repository connection time, not at snapshot creation time:

  1. Environment Variables: The custom values are set as KOPIA_OVERRIDE_USERNAME and KOPIA_OVERRIDE_HOSTNAME environment variables

  2. Repository Connection: These variables are used with the --override-username and --override-hostname flags when running kopia repository connect or kopia repository create

  3. Persistent Identity: Once connected with the override identity, all subsequent operations (including kopia snapshot create) automatically use that identity

  4. No Snapshot Flags: The --override-username and --override-hostname flags do NOT exist for kopia snapshot create - they were removed in Kopia v0.6.0

This design ensures that the identity is established once at connection time and consistently used for all operations.
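The connection-time behavior described above can be sketched in Python. This is only an illustration, not VolSync's mover script: the function names and the S3 bucket name are hypothetical, and only the environment variable names and the `--override-*` flags are taken from the steps above.

```python
from typing import Mapping


def build_connect_args(env: Mapping[str, str], bucket: str) -> list[str]:
    """Build a `kopia repository connect` argv, adding identity overrides if set."""
    # Hypothetical S3 backend; the bucket name is a placeholder.
    args = ["kopia", "repository", "connect", "s3", f"--bucket={bucket}"]

    username = env.get("KOPIA_OVERRIDE_USERNAME", "")
    hostname = env.get("KOPIA_OVERRIDE_HOSTNAME", "")

    # Identity overrides are attached to the connect command only.
    if username:
        args.append(f"--override-username={username}")
    if hostname:
        args.append(f"--override-hostname={hostname}")
    return args


def build_snapshot_args(path: str) -> list[str]:
    """Snapshot creation carries no identity flags; it reuses the connected identity."""
    return ["kopia", "snapshot", "create", path]


if __name__ == "__main__":
    env = {
        "KOPIA_OVERRIDE_USERNAME": "custom-user",
        "KOPIA_OVERRIDE_HOSTNAME": "custom-host",
    }
    print(" ".join(build_connect_args(env, "example-bucket")))
    print(" ".join(build_snapshot_args("/data")))
```

Keeping the override logic in one place mirrors the point of the list: the identity is decided once, at connection time, and the snapshot command never needs to know about it.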

Username generation logic

The username generation follows this priority order:

  1. Custom Username (Highest Priority)

    If spec.kopia.username is specified, it is used exactly as provided without any sanitization or modification.

  2. Default Username Generation

    When no custom username is provided, VolSync generates one from the ReplicationSource/ReplicationDestination name:

    1. With Namespace: If the combined length of {objectName}-{namespace} ≤ 50 characters, use this format

    2. Object Name Only: If the combined name is too long, use just the sanitized object name

    3. Sanitization: Remove invalid characters and apply character restrictions

    4. Fallback: Use “volsync-default” if sanitization results in an empty string
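As a rough illustration, the priority order listed above could be expressed as follows. This is a sketch of one reading of the list, not VolSync's actual code: the function names are hypothetical, and it assumes the 50-character check is applied to the combined name before sanitization, which the list does not spell out.

```python
import re
from typing import Optional

MAX_USERNAME_LENGTH = 50
DEFAULT_IDENTITY = "volsync-default"


def sanitize_username(value: str) -> str:
    """Keep a-z, A-Z, 0-9, '-' and '_', then trim leading/trailing '-' and '_'."""
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "", value)
    return cleaned.strip("-_")


def generate_username(object_name: str, namespace: str,
                      custom: Optional[str] = None) -> str:
    # 1. A custom username has the highest priority and is used verbatim.
    if custom:
        return custom
    # 2. Prefer "{objectName}-{namespace}" when it fits within the limit
    #    (assumption: the length is measured before sanitization).
    combined = f"{object_name}-{namespace}"
    candidate = combined if len(combined) <= MAX_USERNAME_LENGTH else object_name
    # 3. Sanitize; 4. fall back when nothing is left.
    return sanitize_username(candidate) or DEFAULT_IDENTITY


print(generate_username("app-data", "prod"))  # app-data-prod
print(generate_username(
    "very-long-application-backup-with-detailed-name", "production-environment"
))  # very-long-application-backup-with-detailed-name
print(generate_username("app.service.backup", "dev-test"))  # appservicebackup-dev-test
```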

Username examples

# Example 1: Custom username (no modifications applied)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production
spec:
  kopia:
    username: "backup-user.company.com"  # Used exactly as-is
    # Generated username: backup-user.company.com

# Example 2: Default generation with namespace
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-data
  namespace: prod
spec:
  kopia:
    # No username specified
    # Generated username: app-data-prod (≤50 chars)

# Example 3: Long names - object name only
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: very-long-application-backup-with-detailed-name
  namespace: production-environment
spec:
  kopia:
    # Combined length > 50 chars
    # Generated username: very-long-application-backup-with-detailed-name

# Example 4: Special characters sanitized
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app.service.backup
  namespace: dev-test
spec:
  kopia:
    # Dots are not allowed in usernames, so they are removed
    # Generated username: appservicebackup-dev-test

Hostname generation logic

The hostname generation follows this simple priority order:

  1. Custom Hostname (Highest Priority)

    If spec.kopia.hostname is specified, it is used exactly as provided without modification.

  2. Namespace-Only Hostname (Default)

    When no custom hostname is provided, the hostname is ALWAYS just the namespace name:

    • Format: {namespace} - This is the only format used

    • Intentional design: PVC names are NEVER included in the hostname

    • Multi-tenancy benefit: All ReplicationSources in a namespace share the same hostname

    • No collisions: Combined with unique usernames (from object names), identities are always unique

    • Predictable: You always know the hostname will be the namespace name

  3. Fallback Hostname

    If namespace is empty or becomes empty after sanitization, use “volsync-default”

  4. Sanitization

    For all generated hostnames:

    • Replace underscores with hyphens

    • Remove invalid characters (only alphanumeric, dots, and hyphens allowed)

    • Trim leading/trailing hyphens and dots

    • Use “volsync-default” if sanitization results in empty string
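The hostname rules above are simple enough to sketch in a few lines of Python. The function names are hypothetical and this is an illustration of the listed rules rather than VolSync's implementation.

```python
import re
from typing import Optional

DEFAULT_IDENTITY = "volsync-default"


def sanitize_hostname(value: str) -> str:
    """Apply the hostname sanitization steps in the order listed."""
    value = value.replace("_", "-")               # underscores become hyphens
    value = re.sub(r"[^a-zA-Z0-9.-]", "", value)  # drop anything else that is invalid
    return value.strip("-.")                      # trim leading/trailing '-' and '.'


def generate_hostname(namespace: str, custom: Optional[str] = None) -> str:
    # A custom hostname is used exactly as provided.
    if custom:
        return custom
    # Otherwise the hostname is the sanitized namespace, with a fallback.
    return sanitize_hostname(namespace) or DEFAULT_IDENTITY


print(generate_hostname("prod"))  # prod
print(generate_hostname("production", custom="mysql-primary.production.local"))
print(generate_hostname(""))  # volsync-default
```

Note that the object name and PVC name are not parameters at all, which is the namespace-only design in code form.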

Hostname examples

# Example 1: Custom hostname (unchanged behavior)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: db-backup
  namespace: production
spec:
  sourcePVC: mysql-data
  kopia:
    hostname: "mysql-primary.production.local"  # Used exactly as-is
    # Generated hostname: mysql-primary.production.local

# Example 2: Namespace-only hostname (default and intentional behavior)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: prod
spec:
  sourcePVC: app-data
  kopia:
    # No hostname specified
    # Generated hostname: prod (ALWAYS just namespace)
    # Generated username: app-backup (from object name)
    # Full identity: app-backup@prod (guaranteed unique)

# Example 3: Multiple sources in same namespace - demonstrating multi-tenancy design
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production-environment
spec:
  sourcePVC: long-application-storage-pvc-name-v2
  kopia:
    # No hostname specified
    # Generated hostname: production-environment (namespace)
    # Generated username: app-backup (object name)
    # Full identity: app-backup@production-environment
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: db-backup  # Different name = different username
  namespace: production-environment
spec:
  sourcePVC: database-pvc
  kopia:
    # No hostname specified
    # Generated hostname: production-environment (same namespace = same hostname)
    # Generated username: db-backup (different object name)
    # Full identity: db-backup@production-environment
    # Result: Both sources share hostname but have unique identities

Character sanitization rules

Username Sanitization

Allowed Characters: a-z, A-Z, 0-9, - (hyphen), _ (underscore)

Sanitization Process:

  1. Remove all characters not in the allowed set

  2. Trim leading and trailing hyphens and underscores

  3. If result is empty, use “volsync-default”

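Examples (derived from the rules above):

  • app.service.backup → appservicebackup

  • _my-app_ → my-app

  • @@@ → volsync-default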

Hostname Sanitization

Allowed Characters: a-z, A-Z, 0-9, . (dot), - (hyphen)

Sanitization Process:

  1. Replace underscores (_) with hyphens (-)

  2. Remove all characters not in the allowed set

  3. Trim leading and trailing hyphens and dots

  4. If result is empty, use “volsync-default”

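Examples (derived from the rules above):

  • my_namespace → my-namespace

  • -prod. → prod

  • team@west → teamwest

  • ___ → volsync-default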

Customization guide

When to use custom values

Custom Username:

  • Multi-tenant environments: Use meaningful tenant identifiers like tenant-a, dept-finance

  • Account-based identification: user.company.com (preserved exactly; the API validation pattern does not allow @)

  • Legacy compatibility: Match existing Kopia repository users

  • Regulatory compliance: Meet specific naming requirements

Custom Hostname:

  • Infrastructure alignment: Match actual hostnames like web01.prod.company.com

  • Logical grouping: primary-db, backup-replica, cache-layer

  • Environment consistency: app.production, app.staging, app.development

Configuration examples

Scenario 1: Multi-Environment Setup

# Production environment
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: webapp-backup
  namespace: production
spec:
  kopia:
    username: "webapp-prod"
    hostname: "webapp.production.cluster"
---
# Staging environment
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: webapp-backup
  namespace: staging
spec:
  kopia:
    username: "webapp-staging"
    hostname: "webapp.staging.cluster"

Scenario 2: Department-Based Tenancy

# Finance department backup
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: accounting-db
  namespace: finance
spec:
  kopia:
    username: "finance-dept"
    hostname: "accounting-primary"
---
# HR department backup
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: employee-db
  namespace: hr
spec:
  kopia:
    username: "hr-dept"
    hostname: "hr-primary"

Troubleshooting Multi-Tenant Repositories

Using Discovery Features

VolSync provides enhanced discovery features to help manage multi-tenant repositories:

Discovering All Tenants/Identities

To see all identities (tenants) in a shared repository:

# Create a temporary ReplicationDestination for discovery
cat <<EOF | kubectl apply -f -
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: tenant-discovery
  namespace: default
spec:
  trigger:
    manual: discover
  kopia:
    repository: kopia-config
    destinationPVC: temp-discovery
    copyMethod: Direct
EOF

# Wait for status to populate
sleep 10

# View all tenants/identities
kubectl get replicationdestination tenant-discovery -o json | \
  jq '.status.kopia.availableIdentities[] |
      {identity: .identity, snapshots: .snapshotCount, latest: .latestSnapshot}'

# Clean up
kubectl delete replicationdestination tenant-discovery

Example output showing multiple tenants:

{
  "identity": "finance-dept@accounting-primary",
  "snapshots": 45,
  "latest": "2024-01-20T10:00:00Z"
}
{
  "identity": "hr-dept@hr-primary",
  "snapshots": 30,
  "latest": "2024-01-20T09:30:00Z"
}
{
  "identity": "webapp-backup@production",
  "snapshots": 60,
  "latest": "2024-01-20T11:00:00Z"
}
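If you save that output to a file, a short script can group the identities by hostname to show which usernames belong to each tenant. The following is a minimal sketch: the function names are hypothetical, the records are made-up sample data in the same shape as the jq projection above, and it assumes identities always have the form username@hostname.

```python
import json
from collections import defaultdict


def parse_json_stream(text: str) -> list[dict]:
    """Parse concatenated JSON objects, as jq prints them for a stream of results."""
    decoder = json.JSONDecoder()
    records, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
    return records


def group_by_hostname(records: list[dict]) -> dict[str, list[str]]:
    """Map each hostname (tenant) to the usernames found under it."""
    tenants: dict[str, list[str]] = defaultdict(list)
    for record in records:
        # Split on the last '@' so the hostname is always the final part.
        username, _, hostname = record["identity"].rpartition("@")
        tenants[hostname].append(username)
    return dict(tenants)


# Made-up sample data for illustration only.
sample = """
{"identity": "webapp-backup@production", "snapshots": 60, "latest": "2024-01-20T11:00:00Z"}
{"identity": "db-backup@production", "snapshots": 12, "latest": "2024-01-20T10:00:00Z"}
{"identity": "hr-dept@hr-primary", "snapshots": 30, "latest": "2024-01-20T09:30:00Z"}
"""

for hostname, usernames in group_by_hostname(parse_json_stream(sample)).items():
    print(hostname, usernames)
```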

Common Issues

Issue 1: Repository Access Conflicts

Problem: Multiple backups seem to interfere with each other

Solution: Use the discovery features to verify unique identities:

# Check what identity a source is using
kubectl describe replicationsource my-backup -n my-namespace

# Use discovery to see all identities
kubectl get replicationdestination <discovery-dest> -o json | \
  jq '.status.kopia.availableIdentities[].identity'

Alternative Solution: Use the sourceIdentity field for cross-namespace restores, or when the destination name differs from the source name:

# ⚠️ sourceIdentity only needed when:
# - Cross-namespace restore (different namespaces)
# - Destination name ≠ source ReplicationSource name
# ✅ NOT needed for same namespace + matching names
spec:
  kopia:
    sourceIdentity:
      sourceName: my-backup        # Source ReplicationSource name
      sourceNamespace: my-namespace # Source namespace
      # sourcePVCName: optional - auto-discovered if not provided

Issue 2: Understanding Namespace-Only Hostnames

Question: Why is the hostname just the namespace and not including PVC names?

Answer: This is an intentional design, not a bug or limitation.

Design Benefits:

  • Predictable: Hostname is ALWAYS just the namespace: {namespace}

  • Multi-tenancy: All ReplicationSources in a namespace belong to the same “tenant”

  • No collisions: Unique usernames (from object names) ensure unique identities

  • Simplified management: One hostname per namespace makes policy management easier

  • Kubernetes-native: Leverages Kubernetes’ built-in name uniqueness guarantees

Issue 3: Multiple ReplicationSources Share Same Hostname

Observation: Multiple ReplicationSources in the same namespace have the same hostname

Explanation: This is the intended multi-tenancy design

How it works:

  • All ReplicationSources in a namespace share the same hostname (the namespace name)

  • Each ReplicationSource has a unique username (from its object name)

  • Result: Each source has a unique identity like webapp-backup@production and db-backup@production

  • This design treats the namespace as the tenant boundary

  • No collision risk because Kubernetes enforces unique object names within a namespace

  • If you need separate hostnames, use custom hostname configuration

Verify the hostname:

# Check what identity was actually generated
kubectl get replicationdestination <name> -o jsonpath='{.status.kopia.requestedIdentity}'
# The hostname part (after @) will always be just the namespace

Issue 4: Identifying Snapshots from Wrong Tenant

Problem: Restored the wrong tenant’s data

Solution: Use the enhanced error reporting to identify the correct tenant:

# View error message with available identities
kubectl describe replicationdestination <name> | grep -A 10 "Message:"

# List all available identities with details
kubectl get replicationdestination <name> -o json | \
  jq '.status.kopia.availableIdentities[] |
      select(.identity | contains("<namespace>"))'

The error message will show all available identities, making it easy to identify the correct one for your tenant/namespace.

Character Validation Patterns

The API enforces validation patterns for custom usernames and hostnames:

Pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$|^[a-zA-Z0-9]$

Requirements:

  • Must start and end with alphanumeric character

  • Middle characters can include ., _, -

  • Single character names are allowed

  • Cannot be empty

Valid Examples:

  • user1

  • backup-user

  • tenant.backup_job

  • a (single character)

Invalid Examples:

  • -backup-user (starts with hyphen)

  • backup-user- (ends with hyphen)

  • .backup.user. (starts/ends with dot)

  • backup user (contains space)

  • "" (empty string)
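To check a value before applying a manifest, the pattern above can be tested locally. This is a small sketch using Python's standard `re` module; the helper name is hypothetical, and the pattern is copied verbatim from this section.

```python
import re

# Pattern copied from the validation rule above.
IDENTITY_PATTERN = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$|^[a-zA-Z0-9]$"
)


def is_valid_identity_field(value: str) -> bool:
    """Return True if value satisfies the username/hostname validation pattern."""
    # fullmatch avoids Python's '$' also matching just before a trailing newline.
    return IDENTITY_PATTERN.fullmatch(value) is not None


valid = ["user1", "backup-user", "tenant.backup_job", "a"]
invalid = ["-backup-user", "backup-user-", ".backup.user.", "backup user", ""]

for value in valid + invalid:
    print(repr(value), is_valid_identity_field(value))
```

The API server remains the authority on what is accepted; a local check like this only catches obvious mistakes earlier.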

Troubleshooting Identity Override Issues

Issue: Trying to use --override-username or --override-hostname with kopia snapshot create

Problem: You see errors when trying to pass --override-username or --override-hostname as additional arguments.

Solution: These flags don’t exist for kopia snapshot create (removed in Kopia v0.6.0). Instead:

  1. Use the username and hostname fields in your ReplicationSource spec

  2. These are applied at repository connection time via environment variables

  3. Once connected, all snapshots automatically use the override identity

# Correct approach
spec:
  kopia:
    username: "custom-user"
    hostname: "custom-host"
    # DO NOT add these to additionalArgs:
    # additionalArgs:
    #   - "--override-username=custom-user"  # WRONG - flag doesn't exist
    #   - "--override-hostname=custom-host"  # WRONG - flag doesn't exist

Issue: Identity not being applied as expected

Problem: Snapshots are created with different identity than configured.

Debugging Steps:

  1. Check the mover pod logs to see the identity being used:

kubectl logs <mover-pod> | grep -E "KOPIA_OVERRIDE|Using.*override|Creating snapshot for"

  2. Verify environment variables are set:

kubectl exec <mover-pod> -- env | grep KOPIA_OVERRIDE

  3. Confirm the identity at connection time:

kubectl logs <mover-pod> | grep "repository connect"

The logs should show the override flags being applied during repository connection, not during snapshot creation.

Identity Configuration for ReplicationDestination

Note

Kopia ReplicationDestination has flexible identity configuration

Identity is optional. When it is not provided, VolSync automatically generates one:

  • Username: <destination-name>

  • Hostname: <namespace>

This is sufficient for simple same-namespace restores where the destination name matches the source name.

For more complex scenarios, you can still provide:

  1. sourceIdentity for cross-namespace restores or different names

  2. Both username AND hostname for custom identity control

The system validates that you either provide both username and hostname together, or neither (for automatic identity).
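The rules in this note can be summarized in a short sketch. It is an illustration under stated assumptions, not VolSync's code: the function name is hypothetical, the precedence of an explicit username/hostname pair over sourceIdentity is assumed, and so is the mapping of sourceIdentity to username = source name and hostname = source namespace (it follows the defaults given above).

```python
from typing import Optional, Tuple


def resolve_destination_identity(
    dest_name: str,
    dest_namespace: str,
    username: Optional[str] = None,
    hostname: Optional[str] = None,
    source_name: Optional[str] = None,
    source_namespace: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (username, hostname) for a ReplicationDestination."""
    # Username and hostname must be provided together, or both omitted.
    if bool(username) != bool(hostname):
        raise ValueError("username and hostname must be set together, or both omitted")
    # Explicit custom identity (assumed to take precedence).
    if username and hostname:
        return username, hostname
    # sourceIdentity: point at the source's name and namespace (assumed mapping).
    if source_name:
        return source_name, source_namespace or dest_namespace
    # Automatic identity: destination name and namespace.
    return dest_name, dest_namespace


print(resolve_destination_identity("webapp-backup", "production"))
print(resolve_destination_identity(
    "restore-data", "staging",
    source_name="webapp-backup", source_namespace="production",
))
```

Both calls resolve to the same identity, which is why sourceIdentity is only needed when the destination's own name or namespace differs from the source's.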

Simplified Restore with sourceIdentity

For ReplicationDestination resources, the sourceIdentity field provides a streamlined approach to restoring from specific sources in multi-tenant repositories:

Traditional Approach (Manual Identity)

# You need to know the exact username and hostname
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    # Must match exactly what the source used
    username: "webapp-backup-production"
    hostname: "production"

Simplified Approach (sourceIdentity with Auto-Discovery)

# Just specify the source name and namespace
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    sourceIdentity:
      sourceName: webapp-backup
      sourceNamespace: production
      # sourcePVCName is optional - auto-discovered but doesn't affect hostname
    # VolSync automatically:
    # 1. Fetches the ReplicationSource configuration
    # 2. Discovers the sourcePVC name from the source
    # 3. Generates matching username/hostname

Approach with Explicit PVC Name

# Optionally specify the source PVC name explicitly
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    sourceIdentity:
      sourceName: webapp-backup
      sourceNamespace: production
      sourcePVCName: webapp-data  # Optional - for reference only, doesn't affect hostname

This is especially useful in multi-tenant scenarios where:

  • Multiple teams share the same repository

  • You need to restore data across namespaces

  • Identity generation rules have changed over time

  • You want to avoid manual identity management errors

Best practices for shared repositories

Repository Configuration Strategy

Single Repository Approach (Strongly Recommended)

For optimal storage efficiency and deduplication benefits, use a single Kopia repository for all your PVCs within an organization or cluster:

# Single shared repository for ALL PVCs
apiVersion: v1
kind: Secret
metadata:
  name: kopia-repository-shared
type: Opaque
stringData:
  KOPIA_REPOSITORY: s3://company-backups  # No path prefixes!
  KOPIA_PASSWORD: secure-repository-password
  # ... credentials

This approach maximizes deduplication across all your data. Kopia’s content-defined chunking means that duplicate data blocks (like OS files, common libraries, or repeated patterns) are stored only once across ALL your backups, regardless of which PVC they come from.
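The effect of sharing one repository can be illustrated with a toy content-addressed store. This sketch is not how Kopia works internally: it uses fixed-size chunks instead of content-defined chunking, and the class and method names are hypothetical. It only shows why identical blocks from different backups are stored once.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def add(self, data: bytes) -> list[str]:
        """Store data and return the chunk hashes needed to reconstruct it."""
        refs = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            refs.append(digest)
        return refs

    def stored_bytes(self) -> int:
        """Total size of the unique chunks held by the store."""
        return sum(len(chunk) for chunk in self.chunks.values())


shared = ChunkStore()
shared.add(b"AAAABBBBCCCC")   # backup from one PVC
shared.add(b"AAAABBBBDDDD")   # backup from another PVC with overlapping data
print(shared.stored_bytes())  # 16 bytes stored for 24 bytes of input
```

With two separate stores, the overlapping chunks would be stored twice, which is the cost that separate repositories or path prefixes introduce.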

Benefits of Single Repository:

  • Maximum deduplication: storage reductions of 50-80% are commonly cited, though actual savings depend on how much data the PVCs have in common

  • Simplified management: One repository to monitor and maintain

  • Automatic isolation: Each ReplicationSource gets a unique identity

  • Cost optimization: Significant reduction in cloud storage costs

  • Performance: Kopia handles thousands of clients in a single repository efficiently

When Multiple Repositories Might Be Needed:

Only use separate repositories when you have clear requirements such as:

  • Compliance: Legal requirements for data separation (HIPAA, PCI-DSS, GDPR)

  • Organizational boundaries: Different departments with separate budgets

  • Geographic constraints: Data residency requirements

  • Incompatible retention: Vastly different retention policy requirements

Warning

Avoid using bucket path prefixes like s3://bucket/app1 and s3://bucket/app2 unless absolutely necessary. This prevents deduplication between the paths and increases storage costs.

Naming Strategies

Environment-Based:

# Pattern: {app}-{env}
spec:
  kopia:
    username: "webapp-prod"
    hostname: "web01.production"

Department-Based:

# Pattern: {dept}-{resource}
spec:
  kopia:
    username: "finance-database"
    hostname: "accounting-primary"

Function-Based:

# Pattern: {function}-{instance}
spec:
  kopia:
    username: "backup-agent"
    hostname: "web-tier-01"

Security Considerations

Username Security:

  • Use descriptive but not sensitive information

  • Avoid including secrets or passwords

  • Consider audit trail requirements