
Support Backup/Restore to Amazon S3

Open vanillagranilla opened this issue 1 year ago • 10 comments

Summarize Functionality

S3 is now a supported backup destination in SQL Server 2022, similar to backup to Azure blob storage URL.

  1. For Backup-DbaDatabase: support backing up directly to S3 URL(s)
  2. For commands that Restore backups (Restore-DbaDatabase and Invoke-DbaAdvancedRestore), support restoring directly from S3 URL(s)
  3. For commands that list/find backups, verify that S3 URLs are accepted as input/output: Find-DbaBackup, Format-DbaBackupInformation, Get-DbaBackupInformation, and probably others.

Is there a command that is similar or close to what you are looking for?

Yes

Technical Details

Microsoft summary: https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/sql-server-backup-to-url-s3-compatible-object-storage?view=sql-server-ver16#restore-from-url

Just starting to investigate the work involved. We're open to starting to implement some of this functionality but would like to discuss first.

vanillagranilla avatar Feb 22 '24 19:02 vanillagranilla

Has there been any update to this? This would be helpful in the dbatools platform. I noticed the Ola SQL maintenance solutions also just recently made support for AWS S3 backups officially available.

cjerrington avatar Nov 25 '24 16:11 cjerrington

I would just add to this that Test-DbaLastBackup needs this added as well, assuming that just adding it to the other cmdlets will fix this cmdlet.

unclebiguns avatar Mar 13 '25 14:03 unclebiguns

Just chiming in to say I would be very interested in this as well. It seems a bit odd that Backup-DbaDatabase can use Azure Blob Storage as a backup destination (if I understand correctly) but not S3, given that S3 is supported by so many different storage solutions.

stefhak avatar Aug 12 '25 13:08 stefhak

It's more about the popularity of S3 as the TARGET system of direct backups from SQL Server than about S3 being supported by many different solutions. Backup to Azure Blob has been available since 2012; S3 has been available since 2022, and adoption is probably not that high in comparison (fewer than 10 people chimed in here in more than a year).

In 2025, "free" solutions for S3 are basically non-existent (at least since MinIO started closing down features of the community edition), and if you're on "enterprise-y" ones, the backup is usually orchestrated by an "intelligent agent" (see Veeam, Rubrik, Commvault, etc.) that is able to do "diffs always". Other platforms don't need to leverage direct S3 at all and use local caching software, deduplicating before storing on any object storage, hence cutting costs immensely (see NetApp, Nasuni, DataDomain, etc.).

niphlod avatar Aug 12 '25 14:08 niphlod

One in ten would still be a large number. Regarding on-prem S3 storage services: many "unified" systems have their own S3 service on board (not "free", but you don't pay anything extra for it; if you buy a NAS as you normally would, you now get S3 as well), and many users have dedicated on-premises storage systems (which have the advantage of being "different media").

The fact that MinIO (again) crippled their OSS is unfortunate, but even in the case of MinIO the paid version isn't crippled and it may be fair to say that, if you can afford a commercial DB, you may be able to afford non-free S3 storage software.

I think it may be more than 10% of dbatools users who would use S3 if they could, including in mixed scenarios (frequent backups to unified/dedicated S3, less frequent backups to Azure Blob in the cloud).

scaleoutsean avatar Aug 13 '25 03:08 scaleoutsean

It's not "one in ten". It's, literally, eleven people ^_^. This is not to say "never gonna happen", but it's not high on the priority list.

niphlod avatar Aug 13 '25 07:08 niphlod

@niphlod I really cannot comment on how others do things, or on the impact of fewer (or no) "free" S3 solutions (but I appreciate you sharing your insights). I just wanted to say that for the work I have been tasked with, support for backup to S3 would be helpful. My comment that "it seems a bit odd" may have been wrong; I realize that from your input.

stefhak avatar Aug 13 '25 07:08 stefhak

if you're on "enterprise-y" ones, usually the backup is orchestrated by an "intelligent agent" (see Veeam, Rubrik, CommVault, etc) that is able to do "diffs always".

Without weighing in on whether S3 backup/restore support is needed in dbatools (I just had this in my watch list): we have done quite large evaluations with both Veeam and Rubrik a couple of times and came to the conclusion that using those for SQL backups is a waste of money and adds unnecessary complexity (even though we still use Veeam for other backups).

The only truly enterprise-ready solution is Ola Hallengren's maintenance solution, and he recently added support for direct S3 backups. I understood that the request for that came from AWS (our DBA is a friend of Ola's).

So we are in the process of migrating from our old solution, where backups were first taken to an SMB share and then synced to on-prem S3, to direct S3 backups.

To anyone who plans to do the same, I would recommend:

  • Make sure that you have a fast, NVMe-based S3 solution.
  • Make sure that you have CU 20 installed. Bug reference 4100859 is the one I found while stress testing the target solution. Without that patch, if the S3 endpoint goes down behind a load balancer, the backup process gets stuck in such a way that all backups fail until you restart the SQL instance.
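A quick way to check this with dbatools itself is Test-DbaBuild; a minimal sketch (the instance name and build string are placeholders — substitute the actual SQL Server 2022 CU 20 build number from Microsoft's build reference):

```powershell
# Sketch: verify the instance is at or above a given build before relying on
# direct S3 backups. '16.0.4000' is a placeholder, not the real CU 20 build.
Test-DbaBuild -SqlInstance sql2022 -MinimumBuild '16.0.4000'
```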

During that bug investigation we also got word from Microsoft that the only officially supported backend storage is AWS S3, but they fixed the bug anyway, as I was able to prove that the same issue happens there too if you have a load balancer between SQL Server and AWS S3.

Other platforms do not need to leverage direct S3 and use local caching software, deduplicating before storing on any object storage, hence cutting costs immensely (see NetApp, Nasuni, DataDomain, etc.).

Based on what I have seen, none of those deduplication technologies gets even close to the level of disk savings you get with SQL Server's own compression and smart use of full, diff, and log backups. Using TDE also makes those technologies even less effective.

And in SQL Server 2025 you will be able to use ZSTD compression.

Lastly, one big reason for us to use S3 to store SQL backups has always been WORM support enforced by the storage system: ransomware is a big issue nowadays, and protection is much cheaper to implement this way than by using solutions like Rubrik.

I hope this helps someone and does not disturb the discussion too much (as it is a bit off-topic).

olljanat avatar Aug 15 '25 12:08 olljanat

@claude there has been a lot of advancements in recent years with this. can you check out https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/sql-server-backup-and-restore-with-s3-compatible-object-storage?view=sql-server-ver17 and see what we can do within the confines of what microsoft allows us to do without 3rd party software. also sql server 2025 is now out. let's see what we can do. ultrathink about a solution, these are our two most important commands and the solution needs to be well thought out.

potatoqualitee avatar Nov 28 '25 16:11 potatoqualitee

Claude finished @potatoqualitee's task


S3 Backup/Restore Support - Research Complete ✓

I've completed a comprehensive analysis of Microsoft's S3 backup/restore capabilities and developed a detailed implementation plan for dbatools.


Key Findings from Microsoft Documentation

SQL Server Version Support

  • Minimum Version: SQL Server 2022 (VersionMajor 16) only
  • SQL Server 2025 (VersionMajor 17) continues support with potential enhancements
  • Not available in SQL Server Express editions
  • S3 support is fundamentally different from Azure Blob (different protocol, credential model, and feature set)

S3 URL Format

Microsoft supports two URL styles:

  • Path style: s3://<endpoint>:<port>/<bucket>/<backup_file_name>
  • Virtual host style: s3://<bucket>.<domain>/<backup_file_name>
  • Always HTTPS: The s3:// prefix automatically uses HTTPS; HTTP is not supported
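The two styles above can be told apart and split into components client-side; a minimal sketch with a hypothetical helper name, assuming path-style URLs always carry an explicit port (as in Microsoft's examples):

```powershell
# Hypothetical helper: classify an S3 backup URL and split it into its parts.
function ConvertFrom-S3Url {
    param([Parameter(Mandatory)][string]$Url)

    if ($Url -notmatch '^s3://') {
        throw "Not an S3 URL: $Url"
    }
    $rest = $Url.Substring(5)                  # drop the 's3://' prefix
    $hostPart, $pathPart = $rest.Split('/', 2) # host[:port] vs remainder

    if ($hostPart -match ':\d+$') {
        # Path style: s3://<endpoint>:<port>/<bucket>/<backup_file_name>
        $bucket, $file = $pathPart.Split('/', 2)
        [pscustomobject]@{ Style = 'Path'; Endpoint = $hostPart; Bucket = $bucket; File = $file }
    } else {
        # Virtual host style: s3://<bucket>.<domain>/<backup_file_name>
        $bucket = $hostPart.Split('.')[0]
        [pscustomobject]@{ Style = 'VirtualHost'; Endpoint = $hostPart; Bucket = $bucket; File = $pathPart }
    }
}
```

The port-suffix heuristic is an assumption of this sketch, not something SQL Server itself documents; real validation can be left to the engine, per the recommendation in the Open Questions below.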

Credential Requirements

S3 credentials have a specific structure that differs from Azure:

CREATE CREDENTIAL [s3://<endpoint>:<port>/<bucket>]
WITH
    IDENTITY = 'S3 Access Key',
    SECRET = '<AccessKeyID>:<SecretKeyID>';

Critical constraints:

  • IDENTITY must always be exactly 'S3 Access Key'
  • SECRET format is AccessKeyID:SecretKeyID (colon-separated)
  • Access Key ID and Secret Key cannot contain colons
  • Only alphanumeric values allowed in keys
  • Credential name is the full S3 URL (bucket-level)
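The colon rule can be pre-checked client-side before creating the credential; a minimal sketch with a hypothetical helper name, deliberately validating only the structural constraint (the alphanumeric-keys rule is left to SQL Server, since provider key formats vary):

```powershell
# Hypothetical helper: check that a secret is exactly 'AccessKeyID:SecretKeyID'.
function Test-S3SecretFormat {
    param([Parameter(Mandatory)][string]$Secret)

    $parts = $Secret.Split(':')
    # Exactly one colon separator: neither key may itself contain a colon
    if ($parts.Count -ne 2) { return $false }
    # Neither key may be empty
    foreach ($part in $parts) {
        if ([string]::IsNullOrEmpty($part)) { return $false }
    }
    return $true
}
```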

Technical Limitations

  1. TLS Required: S3 endpoint must use a TLS certificate trusted by SQL Server
  2. Buckets must pre-exist: Cannot be created from SQL Server
  3. Multipart upload orphans: Temporary uncommitted data files aren't removed on failure (S3 REST API limitation)
  4. URL length: Total URL limited to 259 characters (254 usable)
  5. Credential name length: Limited to 128 characters (UTF-16)

Advanced Features

  • Region support: Optional via BACKUP_OPTIONS or RESTORE_OPTIONS JSON
    BACKUP_OPTIONS = '{"s3": {"region":"us-west-2"}}'
    
  • MAXTRANSFERSIZE: 5-20 MB (default 10 MB), determines multipart upload part size
  • COMPRESSION must be specified in order to use a non-default MAXTRANSFERSIZE
  • Encryption supported: Works with TDE and backup encryption
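Since Backup-DbaDatabase exposes no BACKUP_OPTIONS parameter today, a region-scoped S3 backup can be sketched as raw T-SQL issued through Invoke-DbaQuery; the instance name, bucket, and URL below are placeholders:

```powershell
# Sketch: direct S3 backup with an explicit region via BACKUP_OPTIONS.
# Requires SQL Server 2022+, a pre-created bucket, and a matching credential.
$query = @"
BACKUP DATABASE [AdventureWorks]
TO URL = 's3://s3.us-west-2.amazonaws.com:443/my-backup-bucket/AdventureWorks.bak'
WITH BACKUP_OPTIONS = '{"s3": {"region":"us-west-2"}}', COMPRESSION, CHECKSUM;
"@
Invoke-DbaQuery -SqlInstance sql2022 -Query $query
```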

Comparison: Azure Blob vs S3

Feature                Azure Blob Storage                                                S3-Compatible Storage
--------------------   ---------------------------------------------------------------   -----------------------
URL prefix             https://                                                          s3:// (auto-HTTPS)
Minimum SQL version    SQL 2012+                                                         SQL 2022+
Credential identity    Container URL, "SHARED ACCESS SIGNATURE", or "Managed Identity"   Always "S3 Access Key"
Credential secret      Access key or SAS token                                           AccessKeyID:SecretKeyID
Credential name        Flexible (container URL for SAS)                                  Must match the S3 URL
Multiple credentials   Yes (per container)                                               Yes (per bucket)
Striping support       Yes (multiple URLs)                                               Yes (multiple URLs)
Express Edition        Supported                                                         Not supported
HTTP support           Yes                                                               No (HTTPS only)

Key architectural difference: Azure uses HTTP-based blob storage APIs, while S3 uses a different REST API with stricter requirements.


Implementation Strategy

Core Principles

  1. Parallel implementation: Add S3 support alongside existing Azure support, not as a replacement
  2. Version gating: Add checks for SQL Server 2022+ when S3 URLs are detected
  3. Graceful degradation: Clear error messages for unsupported scenarios
  4. Consistent patterns: Mirror existing Azure implementation patterns for familiarity
  5. URL detection: Auto-detect S3 URLs via s3:// prefix

Affected Commands & Changes Required

1. Backup-DbaDatabase (HIGH PRIORITY)

Current state: Supports Azure via AzureBaseUrl and AzureCredential parameters

Proposed changes:

  • Add new parameters: -S3BaseUrl and -S3Credential (mirroring Azure pattern)
  • Alternative approach: Keep existing parameters and auto-detect based on URL prefix
    • s3:// → S3 mode
    • https:// → Azure mode
  • Add SQL Server version check (2022+) when S3 URLs detected
  • Add Express Edition check (not supported)
  • Handle S3-specific credential validation
  • Support S3 region parameter via BACKUP_OPTIONS JSON
  • Update examples and help documentation

Validation requirements:

if ($S3BaseUrl) {
    # S3 backup requires SQL Server 2022 (VersionMajor 16) or higher
    if ($server.VersionMajor -lt 16) {
        Stop-Function -Message "S3 backup requires SQL Server 2022 (version 16) or higher. Current version: $($server.VersionMajor)"
        return
    }
    # S3 backup is not available in Express editions
    if ($server.Edition -like '*Express*') {
        Stop-Function -Message "S3 backup is not supported in SQL Server Express editions"
        return
    }
    # Validate credential exists
    # Validate URL format
}
2. Restore-DbaDatabase (HIGH PRIORITY)

Current state: Supports Azure via AzureCredential parameter and auto-detects URLs in Path

Proposed changes:

  • Add S3 URL detection in Path parameter processing
  • Add new parameter: -S3Credential (or auto-detect from URL)
  • Add SQL Server version check (2022+) when S3 URLs detected
  • Support S3 region parameter via RESTORE_OPTIONS JSON
  • Handle S3-specific URL patterns (path style vs virtual host style)
  • Update examples and help documentation

URL detection logic:

if ($Path -match '^s3://') {
    $isS3 = $true
    # Validate SQL Server 2022+
    # Extract bucket and validate credential
}
3. Get-DbaBackupInformation (MEDIUM PRIORITY)

Current state: Has AzureCredential parameter

Proposed changes:

  • Add S3 URL detection
  • Add -S3Credential parameter
  • Handle S3 URLs in backup header reads
  • Ensure Read-DbaBackupHeader works with S3 URLs
4. Format-DbaBackupInformation (LOW PRIORITY)

Current state: Processes backup information objects

Proposed changes:

  • Ensure S3 URLs are handled in path processing
  • Validate URL format preservation through pipeline
  • May not require changes if it processes BackupHistory objects
5. Find-DbaBackup (LOW PRIORITY)

Current state: Scans file systems for backup files

Proposed changes:

  • Document that S3 scanning requires different approach (cannot use file system commands)
  • May require integration with S3-compatible APIs (out of scope for native SQL Server integration)
  • Consider marking S3 bucket scanning as future enhancement
6. Invoke-DbaAdvancedRestore (MEDIUM PRIORITY)

Current state: Advanced restore wrapper

Proposed changes:

  • Add S3 support by leveraging Restore-DbaDatabase changes
  • Pass through S3 credentials appropriately
7. Test-DbaLastBackup (MEDIUM PRIORITY)

Current state: Tests backup restorability

Proposed changes:

  • Support S3 URLs if they're in backup history
  • Leverage Restore-DbaDatabase S3 support
8. New-DbaCredential (HELPER COMMAND)

Current state: Creates SQL Server credentials

Proposed changes:

  • Add example for S3 credential creation
  • Add parameter validation for S3 credential format
  • Helper function to validate AccessKeyID:SecretKeyID format

Example:

# New helper or validation
$s3Params = @{
    SqlInstance   = "server1"
    Name          = "s3://endpoint:port/bucket"
    Identity      = "S3 Access Key"
    SecurePassword = (ConvertTo-SecureString "AKIAIOSFODNN7EXAMPLE:wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" -AsPlainText -Force)
}
New-DbaCredential @s3Params

Technical Implementation Details

URL Detection Function

function Test-DbaS3Url {
    param([string]$Url)
    
    if ($Url -match '^s3://') {
        # Extract components
        # Path style: s3://endpoint:port/bucket/path
        # Virtual host: s3://bucket.domain/path
        return $true
    }
    return $false
}

Credential Validation

function Test-DbaS3Credential {
    param(
        $Server,
        [string]$CredentialName
    )
    
    $credential = Get-DbaCredential -SqlInstance $Server -Name $CredentialName
    
    if (-not $credential) {
        throw "S3 credential '$CredentialName' not found"
    }
    
    if ($credential.Identity -ne 'S3 Access Key') {
        throw "S3 credential must have Identity = 'S3 Access Key'"
    }
    
    # Secret format: AccessKeyID:SecretKeyID
    # Cannot validate without exposing secret
}

Backup Device Type

# In Backup-DbaDatabase
$device = New-Object Microsoft.SqlServer.Management.Smo.BackupDeviceItem
if ($S3BaseUrl) {
    $device.DeviceType = "URL"  # Same as Azure
} else {
    $device.DeviceType = "File"
}

Testing Requirements

Test Scenarios

  1. Backup to S3:

    • Single file backup
    • Striped backup (multiple S3 URLs)
    • Full, Differential, Log backups
    • With compression
    • With encryption
    • With checksum
  2. Restore from S3:

    • Single file restore
    • Striped restore
    • Point-in-time recovery
    • With NORECOVERY
    • With file relocation
  3. Credential Management:

    • Create S3 credentials
    • Validate credential format
    • Auto-detect credentials from URL
  4. Error Handling:

    • SQL Server < 2022
    • Express Edition
    • Missing credentials
    • Invalid URL format
    • Malformed AccessKeyID:SecretKeyID
  5. Integration Tests:

    • Backup → Restore roundtrip
    • Backup history queries
    • Get-DbaBackupInformation with S3 URLs

Test Infrastructure Needs

  • S3-compatible storage endpoint (MinIO, AWS S3, or compatible)
  • SQL Server 2022+ instance
  • Valid S3 credentials
  • Test buckets pre-created
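For the S3-compatible endpoint, a minimal local MinIO sketch, assuming Docker is available; because s3:// is HTTPS-only, the ./certs folder must hold a certificate/key pair that the SQL Server host trusts:

```powershell
# Sketch: local S3-compatible endpoint for integration tests (assumes Docker).
# MinIO serves HTTPS automatically when ./certs contains public.crt/private.key;
# that certificate must be trusted by the SQL Server host.
docker run -d --name minio-test `
    -p 9000:9000 `
    -v "${PWD}/certs:/root/.minio/certs" `
    -v "${PWD}/data:/data" `
    -e MINIO_ROOT_USER=testaccesskey `
    -e MINIO_ROOT_PASSWORD=testsecretkey12345 `
    minio/minio server /data

# Buckets must pre-exist (SQL Server cannot create them); create the test
# bucket with the MinIO console or client before running backup tests.
```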

Documentation Requirements

Parameter Help Updates

  • Add S3 examples to all affected commands
  • Document SQL Server 2022+ requirement
  • Document Express Edition limitation
  • Provide S3 credential creation examples

New Examples Needed

# Example: Backup to S3
Backup-DbaDatabase -SqlInstance sql2022 `
    -Database AdventureWorks `
    -S3BaseUrl "s3://s3.amazonaws.com:443/my-backup-bucket" `
    -S3Credential "s3://s3.amazonaws.com:443/my-backup-bucket" `
    -Type Full -Checksum

# Example: Restore from S3
Restore-DbaDatabase -SqlInstance sql2022 `
    -Path "s3://s3.amazonaws.com:443/my-backup-bucket/AdventureWorks.bak" `
    -DatabaseName "AdventureWorks_Restored" `
    -S3Credential "s3://s3.amazonaws.com:443/my-backup-bucket"

# Example: Create S3 credential
$s3Secret = ConvertTo-SecureString "AKIAEXAMPLE:SecretKeyEXAMPLE" -AsPlainText -Force
New-DbaCredential -SqlInstance sql2022 `
    -Name "s3://s3.amazonaws.com:443/my-backup-bucket" `
    -Identity "S3 Access Key" `
    -SecurePassword $s3Secret

Recommended Implementation Phases

Phase 1: Core Backup/Restore (MVP)

  • [ ] Add S3 support to Backup-DbaDatabase
  • [ ] Add S3 support to Restore-DbaDatabase
  • [ ] Add S3 credential example to New-DbaCredential
  • [ ] Basic integration tests
  • [ ] Documentation updates

Estimated complexity: Moderate (2-3 days with testing)

Phase 2: Backup Information Commands

  • [ ] Add S3 support to Get-DbaBackupInformation
  • [ ] Update Format-DbaBackupInformation for S3 URLs
  • [ ] Update Test-DbaLastBackup
  • [ ] Update Invoke-DbaAdvancedRestore

Estimated complexity: Low-Moderate (1-2 days)

Phase 3: Advanced Features & Polish

  • [ ] S3 region support via BACKUP_OPTIONS/RESTORE_OPTIONS
  • [ ] Enhanced error messages
  • [ ] Performance optimization
  • [ ] Comprehensive test coverage

Estimated complexity: Low (1 day)

Phase 4: Future Enhancements (Optional)

  • [ ] S3 bucket scanning for Find-DbaBackup (requires S3 API integration)
  • [ ] S3 credential helper cmdlet
  • [ ] Support for S3-compatible providers (MinIO, StorageGRID, etc.)

Risks and Considerations

Technical Risks

  1. Limited testing infrastructure: Need S3-compatible storage for testing
  2. Credential security: AccessKeyID:SecretKeyID in plain text in credential secret
  3. SMO support: Verify SMO handles S3 URLs correctly (likely yes, since it's URL type)
  4. SQL Server bugs: CU 20 required for stability (per community feedback)

Adoption Considerations

  1. User demand: Moderate but growing (11+ users interested)
  2. SQL Server 2022 requirement: Limits audience to newer versions only
  3. S3 provider costs: Many free options gone (MinIO OSS limited)
  4. Azure preference: Many users already on Azure Blob

Mitigation Strategies

  • Provide clear version requirements in documentation
  • Offer MinIO Docker compose example for testing
  • Link to Microsoft's official S3 documentation
  • Maintain parallel Azure support (don't break existing functionality)

Open Questions

  1. Parameter naming: Should we use -S3BaseUrl / -S3Credential or reuse -AzureBaseUrl / -AzureCredential with auto-detection?

    • Recommendation: Separate parameters for clarity and explicit user intent
  2. URL validation: How strict should URL format validation be?

    • Recommendation: Validate prefix and basic structure, let SQL Server handle detailed validation
  3. Region parameter: Expose as separate -S3Region parameter or require JSON in -BackupOptions?

    • Recommendation: Start with documentation showing JSON format, add separate parameter in Phase 3 if requested
  4. Backward compatibility: Any risk to existing Azure functionality?

    • Recommendation: No risk if we add separate code paths with proper URL detection

Conclusion

S3 backup/restore support is feasible and well-defined within dbatools architecture. The implementation should:

  1. Mirror Azure patterns for consistency
  2. Gate on SQL Server 2022+ for version safety
  3. Provide clear error messages for unsupported scenarios
  4. Start with core backup/restore (Phase 1) and iterate
  5. Maintain excellent documentation given the new credential model

The most important commands are Backup-DbaDatabase and Restore-DbaDatabase. Starting with these two in Phase 1 provides immediate value to users while keeping scope manageable.

Ready to proceed with implementation when you are. I can start with Phase 1 (core backup/restore support) or we can discuss the approach further.


claude[bot] avatar Nov 28 '25 16:11 claude[bot]