Multi-tenancy and Shared Repositories
The VolSync Kopia mover implements multi-tenancy through the automatic generation of unique usernames and hostnames for each backup client. This ensures that multiple ReplicationSources and ReplicationDestinations can safely share the same Kopia repository without conflicts.
Each Kopia client requires a unique identity consisting of:
Username: Identifies the tenant/user in the repository
Hostname: Identifies the specific host/instance within that tenant
Simplified Multi-Tenancy with Namespace-Only Hostnames
VolSync’s hostname generation uses namespace-only identification by design, providing simplified multi-tenant isolation:
Hostname = Namespace: The hostname is ALWAYS just the namespace name (unless explicitly customized)
Username = ReplicationSource/ReplicationDestination name: By default, the username is derived from the object name
Unique identity guaranteed: Since Kubernetes prevents two objects of the same kind from sharing a name within a namespace, the combination of namespace (hostname) + object name (username) is always unique
Clear tenant boundaries: Each namespace represents a distinct tenant with a single hostname
Multi-source support: Multiple ReplicationSources in the same namespace share the hostname but have different usernames
No collision risk: There’s no security issue or collision risk because Kubernetes enforces unique object names
This design enables powerful multi-tenancy features:
Namespace as tenant boundary: All backups from a namespace share the same hostname, representing a logical tenant
Per-source isolation: Each ReplicationSource has its own username, maintaining separate snapshot histories
Repository-level policies: Administrators can apply retention policies based on namespace (hostname) patterns
No collision possible: Even with shared hostnames, unique usernames prevent any conflicts
Simplified access control: Control repository access at the namespace level
Understanding identity generation
VolSync automatically generates usernames and hostnames based on your Kubernetes resources. This is an intentional design that leverages Kubernetes’ built-in uniqueness guarantees:
Key Design Principle: The hostname is ALWAYS just the namespace name (unless explicitly customized). This is not a limitation but a deliberate design choice that simplifies multi-tenancy and ensures predictable behavior.
How Identity Overrides Work Internally
When you specify custom username or hostname fields in your ReplicationSource/ReplicationDestination, VolSync applies these overrides at repository connection time, not at snapshot creation time:
Environment Variables: The custom values are set as KOPIA_OVERRIDE_USERNAME and KOPIA_OVERRIDE_HOSTNAME environment variables
Repository Connection: These variables are used with the --override-username and --override-hostname flags when running kopia repository connect or kopia repository create
Persistent Identity: Once connected with the override identity, all subsequent operations (including kopia snapshot create) automatically use that identity
No Snapshot Flags: The --override-username and --override-hostname flags do NOT exist for kopia snapshot create - they were removed in Kopia v0.6.0
This design ensures that the identity is established once at connection time and consistently used for all operations.
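As a rough illustration of the connection-time flow described above, the following shell sketch turns the two environment variables into connect flags. The helper name build_identity_args is hypothetical and not part of VolSync, and the commented kopia line uses a placeholder bucket; the mover's real script may be structured differently.

```shell
# Sketch only: emit one connect flag per override variable that is set.
build_identity_args() {
  if [ -n "${KOPIA_OVERRIDE_USERNAME:-}" ]; then
    printf '%s\n' "--override-username=${KOPIA_OVERRIDE_USERNAME}"
  fi
  if [ -n "${KOPIA_OVERRIDE_HOSTNAME:-}" ]; then
    printf '%s\n' "--override-hostname=${KOPIA_OVERRIDE_HOSTNAME}"
  fi
}

# Example: show the flags that would be passed at connection time.
(
  KOPIA_OVERRIDE_USERNAME=custom-user
  KOPIA_OVERRIDE_HOSTNAME=custom-host
  build_identity_args
)
# The flags belong on the connect command only (placeholder bucket shown):
#   kopia repository connect s3 --bucket=<bucket> $(build_identity_args)
# and never on `kopia snapshot create`.
```

When neither variable is set the helper prints nothing, so the connect command falls back to whatever identity Kopia would otherwise use.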
Username generation logic
The username generation follows this priority order:
Custom Username (Highest Priority)
If spec.kopia.username is specified, it is used exactly as provided without any sanitization or modification.
Default Username Generation
When no custom username is provided, VolSync generates one from the ReplicationSource/ReplicationDestination name:
With Namespace: If the combined length of {objectName}-{namespace} is ≤ 50 characters, use this format
Object Name Only: If the combined name is too long, use just the sanitized object name
Sanitization: Remove invalid characters and apply character restrictions
Fallback: Use “volsync-default” if sanitization results in an empty string
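The priority order above can be sketched in a few lines of shell. This is an illustration of the rules as written in this section, not VolSync's actual implementation; the function names are hypothetical, and sanitizing the object name and namespace separately before measuring the combined length is an assumption.

```shell
# Sketch only: mirrors the priority list above; not VolSync's actual code.
sanitize_username() {
  # Keep a-z, A-Z, 0-9, hyphen and underscore, then trim leading and
  # trailing hyphens/underscores.
  printf '%s' "$1" | LC_ALL=C tr -cd 'a-zA-Z0-9_-' | sed -e 's/^[-_]*//' -e 's/[-_]*$//'
}

generate_username() (
  # $1 = object name, $2 = namespace, $3 = custom username (optional)
  if [ -n "${3:-}" ]; then
    printf '%s\n' "$3"              # custom value wins, used as-is
  else
    name="$(sanitize_username "$1")"
    ns="$(sanitize_username "$2")"
    combined="${name}-${ns}"
    if [ -n "$name" ] && [ -n "$ns" ] && [ "${#combined}" -le 50 ]; then
      printf '%s\n' "$combined"     # {objectName}-{namespace}
    else
      # Too long (or nothing left of the namespace): object name only,
      # falling back to the default when that is empty as well.
      printf '%s\n' "${name:-volsync-default}"
    fi
  fi
)

generate_username app-data prod
generate_username 'app@service.backup' dev-test
```

The two calls print app-data-prod and appservicebackup-dev-test.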
Username examples
# Example 1: Custom username (no modifications applied)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production
spec:
  kopia:
    username: "backup-user@company.com" # Used exactly as-is
# Generated username: backup-user@company.com
---
# Example 2: Default generation with namespace
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-data
  namespace: prod
spec:
  kopia:
    # No username specified
# Generated username: app-data-prod (≤50 chars)
---
# Example 3: Long names - object name only
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: very-long-application-backup-with-detailed-name
  namespace: production-environment
spec:
  kopia:
    # Combined length > 50 chars
# Generated username: very-long-application-backup-with-detailed-name
---
# Example 4: Special characters sanitized
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app@service.backup
  namespace: dev-test
spec:
  kopia:
    # Special chars removed: @ and . are invalid
# Generated username: appservicebackup-dev-test
Hostname generation logic
The hostname generation follows this simple priority order:
Custom Hostname (Highest Priority)
If spec.kopia.hostname is specified, it is used exactly as provided without modification.
Namespace-Only Hostname (Default)
When no custom hostname is provided, the hostname is ALWAYS just the namespace name:
Format: {namespace} - This is the only format used
Intentional design: PVC names are NEVER included in the hostname
Multi-tenancy benefit: All ReplicationSources in a namespace share the same hostname
No collisions: Combined with unique usernames (from object names), identities are always unique
Predictable: You always know the hostname will be the namespace name
Fallback Hostname
If namespace is empty or becomes empty after sanitization, use “volsync-default”
Sanitization
For all generated hostnames:
Replace underscores with hyphens
Remove invalid characters (only alphanumeric, dots, and hyphens allowed)
Trim leading/trailing hyphens and dots
Use “volsync-default” if sanitization results in empty string
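The hostname rules above are simple enough to express as a short shell sketch. The function names are hypothetical and the code only illustrates the documented behavior (custom value first, then the sanitized namespace, then the fallback); it is not taken from VolSync.

```shell
# Sketch only: illustrates the documented hostname rules.
sanitize_hostname() {
  # 1. underscores become hyphens
  # 2. drop anything that is not alphanumeric, dot or hyphen
  # 3. trim leading/trailing hyphens and dots
  printf '%s' "$1" \
    | LC_ALL=C tr '_' '-' \
    | LC_ALL=C tr -cd 'a-zA-Z0-9.-' \
    | sed -e 's/^[-.]*//' -e 's/[-.]*$//'
}

generate_hostname() (
  # $1 = namespace, $2 = custom hostname (optional)
  if [ -n "${2:-}" ]; then
    printf '%s\n' "$2"                        # custom value, used as-is
  else
    host="$(sanitize_hostname "$1")"
    printf '%s\n' "${host:-volsync-default}"  # fallback when empty
  fi
)

generate_hostname prod
generate_hostname 'my_team.ns-'
```

The two calls print prod and my-team.ns; the PVC name never appears because it is not an input.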
Hostname examples
# Example 1: Custom hostname (unchanged behavior)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: db-backup
  namespace: production
spec:
  sourcePVC: mysql-data
  kopia:
    hostname: "mysql-primary.production.local" # Used exactly as-is
# Generated hostname: mysql-primary.production.local
---
# Example 2: Namespace-only hostname (default and intentional behavior)
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: prod
spec:
  sourcePVC: app-data
  kopia:
    # No hostname specified
# Generated hostname: prod (ALWAYS just namespace)
# Generated username: app-backup (from object name)
# Full identity: app-backup@prod (guaranteed unique)
---
# Example 3: Multiple sources in same namespace - demonstrating multi-tenancy design
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: app-backup
  namespace: production-environment
spec:
  sourcePVC: long-application-storage-pvc-name-v2
  kopia:
    # No hostname specified
# Generated hostname: production-environment (namespace)
# Generated username: app-backup (object name)
# Full identity: app-backup@production-environment
---
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: db-backup # Different name = different username
  namespace: production-environment
spec:
  sourcePVC: database-pvc
  kopia:
    # No hostname specified
# Generated hostname: production-environment (same namespace = same hostname)
# Generated username: db-backup (different object name)
# Full identity: db-backup@production-environment
# Result: Both sources share hostname but have unique identities
Character sanitization rules
Username Sanitization
Allowed Characters: a-z, A-Z, 0-9, - (hyphen), _ (underscore)
Sanitization Process:
Remove all characters not in the allowed set
Trim leading and trailing hyphens and underscores
If result is empty, use “volsync-default”
Hostname Sanitization
Allowed Characters: a-z, A-Z, 0-9, . (dot), - (hyphen)
Sanitization Process:
Replace underscores (_) with hyphens (-)
Remove all characters not in the allowed set
Trim leading and trailing hyphens and dots
If result is empty, use “volsync-default”
Customization guide
When to use custom values
Custom Username:
Multi-tenant environments: Use meaningful tenant identifiers like tenant-a, dept-finance
Email-based identification: user@company.com (will be preserved exactly)
Legacy compatibility: Match existing Kopia repository users
Regulatory compliance: Meet specific naming requirements
Custom Hostname:
Infrastructure alignment: Match actual hostnames like web01.prod.company.com
Logical grouping: primary-db, backup-replica, cache-layer
Environment consistency: app.production, app.staging, app.development
Configuration examples
Scenario 1: Multi-Environment Setup
# Production environment
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: webapp-backup
  namespace: production
spec:
  kopia:
    username: "webapp-prod"
    hostname: "webapp.production.cluster"
---
# Staging environment
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: webapp-backup
  namespace: staging
spec:
  kopia:
    username: "webapp-staging"
    hostname: "webapp.staging.cluster"
Scenario 2: Department-Based Tenancy
# Finance department backup
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: accounting-db
  namespace: finance
spec:
  kopia:
    username: "finance-dept"
    hostname: "accounting-primary"
---
# HR department backup
apiVersion: volsync.backube/v1alpha1
kind: ReplicationSource
metadata:
  name: employee-db
  namespace: hr
spec:
  kopia:
    username: "hr-dept"
    hostname: "hr-primary"
Troubleshooting Multi-Tenant Repositories
Using Discovery Features
VolSync provides enhanced discovery features to help manage multi-tenant repositories:
Discovering All Tenants/Identities
To see all identities (tenants) in a shared repository:
# Create a temporary ReplicationDestination for discovery
cat <<EOF | kubectl apply -f -
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: tenant-discovery
  namespace: default
spec:
  trigger:
    manual: discover
  kopia:
    repository: kopia-config
    destinationPVC: temp-discovery
    copyMethod: Direct
EOF
# Wait for status to populate
sleep 10
# View all tenants/identities (same namespace the object was created in)
kubectl get replicationdestination tenant-discovery -n default -o json | \
  jq '.status.kopia.availableIdentities[] |
    {identity: .identity, snapshots: .snapshotCount, latest: .latestSnapshot}'
# Clean up
kubectl delete replicationdestination tenant-discovery -n default
Example output showing multiple tenants:
{
  "identity": "finance-dept@finance-accounting-data",
  "snapshots": 45,
  "latest": "2024-01-20T10:00:00Z"
}
{
  "identity": "hr-dept@hr-employee-data",
  "snapshots": 30,
  "latest": "2024-01-20T09:30:00Z"
}
{
  "identity": "webapp-backup@production-webapp-data",
  "snapshots": 60,
  "latest": "2024-01-20T11:00:00Z"
}
Common Issues
Issue 1: Repository Access Conflicts
Problem: Multiple backups seem to interfere with each other
Solution: Use the discovery features to verify unique identities:
# Check what identity a source is using
kubectl describe replicationsource my-backup -n my-namespace
# Use discovery to see all identities
kubectl get replicationdestination <discovery-dest> -o json | \
jq '.status.kopia.availableIdentities[].identity'
Alternative Solution: Use the sourceIdentity field for cross-namespace restores or when the destination name differs from the source name:
# ⚠️ sourceIdentity only needed when:
# - Cross-namespace restore (different namespaces)
# - Destination name ≠ source ReplicationSource name
# ✅ NOT needed for same namespace + matching names
spec:
  kopia:
    sourceIdentity:
      sourceName: my-backup # Source ReplicationSource name
      sourceNamespace: my-namespace # Source namespace
      # sourcePVCName: optional - auto-discovered if not provided
Issue 2: Understanding Namespace-Only Hostnames
Question: Why is the hostname just the namespace and not including PVC names?
Answer: This is intentional design, not a bug or limitation
Design Benefits:
- Predictable: Hostname is ALWAYS just the namespace: {namespace}
- Multi-tenancy: All ReplicationSources in a namespace belong to the same “tenant”
- No collisions: Unique usernames (from object names) ensure unique identities
- Simplified management: One hostname per namespace makes policy management easier
- Kubernetes-native: Leverages Kubernetes’ built-in name uniqueness guarantees
Issue 3: Multiple ReplicationSources Share Same Hostname
Observation: Multiple ReplicationSources in the same namespace have the same hostname
Explanation: This is the intended multi-tenancy design
How it works:
All ReplicationSources in a namespace share the same hostname (the namespace name)
Each ReplicationSource has a unique username (from its object name)
Result: Each source has a unique identity like webapp-backup@production and db-backup@production
This design treats the namespace as the tenant boundary
No collision risk because Kubernetes enforces unique object names within a namespace
If you need separate hostnames, use custom hostname configuration
Verify the hostname:
# Check what identity was actually generated
kubectl get replicationdestination <name> -o jsonpath='{.status.kopia.requestedIdentity}'
# The hostname part (after @) will always be just the namespace
Issue 4: Identifying Snapshots from Wrong Tenant
Problem: Restored wrong tenant’s data
Solution: Use the enhanced error reporting to identify correct tenant:
# View error message with available identities
kubectl describe replicationdestination <name> | grep -A 10 "Message:"
# List all available identities with details
kubectl get replicationdestination <name> -o json | \
jq '.status.kopia.availableIdentities[] |
select(.identity | contains("<namespace>"))'
The error message will show all available identities, making it easy to identify the correct one for your tenant/namespace.
Character Validation Patterns
The API enforces validation patterns for custom usernames and hostnames:
Pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$|^[a-zA-Z0-9]$
Requirements:
Must start and end with alphanumeric character
Middle characters can include ., _, -
Single character names are allowed
Cannot be empty
Valid Examples:
user1
backup-user
tenant.backup_job
a (single character)
Invalid Examples:
-backup-user (starts with hyphen)
backup-user- (ends with hyphen)
.backup.user. (starts/ends with dot)
backup user (contains space)
"" (empty string)
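To check a candidate value against the pattern before applying a manifest, a small shell helper can be used. This is a local convenience sketch with a hypothetical function name; it only re-applies the regular expression quoted above with grep -E, while the authoritative check remains the API server's own validation.

```shell
# Sketch only: re-apply the documented pattern locally with grep -E.
is_valid_identity() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq \
    '^[a-zA-Z0-9][a-zA-Z0-9._-]*[a-zA-Z0-9]$|^[a-zA-Z0-9]$'
}

for candidate in user1 tenant.backup_job a -backup-user 'backup user'; do
  if is_valid_identity "$candidate"; then
    echo "valid:   '$candidate'"
  else
    echo "invalid: '$candidate'"
  fi
done
```

The first three candidates are reported as valid and the last two as invalid, matching the lists above.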
Troubleshooting Identity Override Issues
Issue: Trying to use --override-username or --override-hostname with kopia snapshot create
Problem: You see errors when trying to pass --override-username or --override-hostname as additional arguments.
Solution: These flags don’t exist for kopia snapshot create (removed in Kopia v0.6.0). Instead:
Use the username and hostname fields in your ReplicationSource spec
These are applied at repository connection time via environment variables
Once connected, all snapshots automatically use the override identity
# Correct approach
spec:
  kopia:
    username: "custom-user"
    hostname: "custom-host"
    # DO NOT add these to additionalArgs:
    # additionalArgs:
    #   - "--override-username=custom-user" # WRONG - flag doesn't exist
    #   - "--override-hostname=custom-host" # WRONG - flag doesn't exist
Issue: Identity not being applied as expected
Problem: Snapshots are created with different identity than configured.
Debugging Steps:
Check the mover pod logs to see the identity being used:
kubectl logs <mover-pod> | grep -E "KOPIA_OVERRIDE|Using.*override|Creating snapshot for"
Verify environment variables are set:
kubectl exec <mover-pod> -- env | grep KOPIA_OVERRIDE
Confirm the identity at connection time:
kubectl logs <mover-pod> | grep "repository connect"
The logs should show the override flags being applied during repository connection, not during snapshot creation.
Identity Configuration for ReplicationDestination
Note
Kopia ReplicationDestination has flexible identity configuration
Identity is now OPTIONAL! When not provided, VolSync automatically generates an identity:
Username: <destination-name>
Hostname: <namespace>
This works perfectly for simple same-namespace restores when the destination name matches the source name.
For more complex scenarios, you can still provide:
sourceIdentity for cross-namespace restores or different names
Both username AND hostname for custom identity control
The system validates that you either provide both username and hostname together, or neither (for automatic identity).
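The both-or-neither rule can be pictured with a short shell sketch. The function name is hypothetical and the code covers only the username/hostname rule stated above; how sourceIdentity interacts with these fields is not modeled here.

```shell
# Sketch only: the "both or neither" rule for username and hostname.
resolve_destination_identity() (
  # $1 = destination name, $2 = namespace,
  # $3 = username (optional), $4 = hostname (optional)
  user="${3:-}"
  host="${4:-}"
  if [ -n "$user" ] && [ -n "$host" ]; then
    printf '%s@%s\n' "$user" "$host"   # explicit custom identity
  elif [ -z "$user" ] && [ -z "$host" ]; then
    printf '%s@%s\n' "$1" "$2"         # automatic: <destination-name>@<namespace>
  else
    echo "username and hostname must be provided together" >&2
    exit 1
  fi
)

resolve_destination_identity restore-data production
resolve_destination_identity restore-data production custom-user custom-host
```

The two calls print restore-data@production and custom-user@custom-host; supplying only one of the two optional values returns a non-zero status.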
Simplified Restore with sourceIdentity
For ReplicationDestination resources, the sourceIdentity field provides a streamlined approach to restoring from specific sources in multi-tenant repositories:
Traditional Approach (Manual Identity)
# You need to know the exact username and hostname
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    # Must match exactly what the source used
    username: "webapp-backup-production"
    hostname: "production-webapp-pvc"
Simplified Approach (sourceIdentity with Auto-Discovery)
# Just specify the source name and namespace
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    sourceIdentity:
      sourceName: webapp-backup
      sourceNamespace: production
      # sourcePVCName is optional - auto-discovered but doesn't affect hostname
# VolSync automatically:
# 1. Fetches the ReplicationSource configuration
# 2. Discovers the sourcePVC name from the source
# 3. Generates matching username/hostname
Approach with Explicit PVC Name
# Optionally specify the source PVC name explicitly
apiVersion: volsync.backube/v1alpha1
kind: ReplicationDestination
metadata:
  name: restore-data
spec:
  kopia:
    sourceIdentity:
      sourceName: webapp-backup
      sourceNamespace: production
      sourcePVCName: webapp-data # Optional - for reference only, doesn't affect hostname
This is especially useful in multi-tenant scenarios where:
Multiple teams share the same repository
You need to restore data across namespaces
Identity generation rules have changed over time
You want to avoid manual identity management errors
Best practices for shared repositories
Repository Configuration Strategy
Single Repository Approach (Strongly Recommended)
For optimal storage efficiency and deduplication benefits, use a single Kopia repository for all your PVCs within an organization or cluster:
# Single shared repository for ALL PVCs
apiVersion: v1
kind: Secret
metadata:
  name: kopia-repository-shared
type: Opaque
stringData:
  KOPIA_REPOSITORY: s3://company-backups # No path prefixes!
  KOPIA_PASSWORD: secure-repository-password
  # ... credentials
This approach maximizes deduplication across all your data. Kopia’s content-defined chunking means that duplicate data blocks (like OS files, common libraries, or repeated patterns) are stored only once across ALL your backups, regardless of which PVC they come from.
Benefits of Single Repository:
Maximum deduplication: 50-80% storage reduction is common
Simplified management: One repository to monitor and maintain
Automatic isolation: Each ReplicationSource gets a unique identity
Cost optimization: Significant reduction in cloud storage costs
Performance: Kopia handles thousands of clients in a single repository efficiently
When Multiple Repositories Might Be Needed:
Only use separate repositories when you have clear requirements such as:
Compliance: Legal requirements for data separation (HIPAA, PCI-DSS, GDPR)
Organizational boundaries: Different departments with separate budgets
Geographic constraints: Data residency requirements
Incompatible retention: Vastly different retention policy requirements
Warning
Avoid using bucket path prefixes like s3://bucket/app1 and s3://bucket/app2 unless absolutely necessary. Each prefix is a separate repository, so data is not deduplicated between the paths and storage costs increase.
Naming Strategies
Environment-Based:
# Pattern: {app}-{env}
spec:
  kopia:
    username: "webapp-prod"
    hostname: "web01.production"
Department-Based:
# Pattern: {dept}-{resource}
spec:
  kopia:
    username: "finance-database"
    hostname: "accounting-primary"
Function-Based:
# Pattern: {function}-{instance}
spec:
  kopia:
    username: "backup-agent"
    hostname: "web-tier-01"
Security Considerations
Username Security:
Use descriptive but not sensitive information
Avoid including secrets or passwords
Consider audit trail requirements