passwordless-server
PAS-148 | Data export
Ticket
- Closes PAS-148
Description
We need to be able to export individual applications with all their relevant data, so customers can migrate to self-hosted instances.
Shape
Applications with a lot of users, or enterprise applications with event logging enabled, can end up having a lot of records.
Considerations:
- We need to be able to enable paging easily at a later stage or possibly offload the data export process using a serverless approach.
- CSV:
  - RFC-4180 compliant
  - Easy to enable paging later
  - CsvHelper includes functionality that should help us add paging later.
  - UTF-8 encoded
Remarks:
- Wasn't meant to include paging at this stage.
- Wasn't meant to include encryption at this stage.
The export is split into one file per entity type. Every file is a CSV document compliant with RFC-4180. Splitting into one file per entity type also makes it easier, in theory, to enable paging at a later stage once a file grows past a certain size.
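As an illustration of what RFC-4180 compliance and UTF-8 encoding mean for each exported file, here is a minimal hand-rolled sketch. It is not the CsvHelper-based code in this PR; `CsvSketch` and its method names are hypothetical and exist only to show the quoting and line-ending rules.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; the PR itself relies on CsvHelper.
public static class CsvSketch
{
    // RFC 4180: a field containing a comma, double quote, CR or LF must be
    // enclosed in double quotes, and embedded double quotes are doubled.
    public static string EscapeField(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins escaped fields with commas and terminates every record with CRLF.
    public static string ToCsv(IEnumerable<string> header, IEnumerable<IEnumerable<string>> rows)
    {
        var sb = new StringBuilder();
        foreach (var record in new[] { header }.Concat(rows))
        {
            sb.Append(string.Join(",", record.Select(EscapeField))).Append("\r\n");
        }
        return sb.ToString();
    }

    // Encodes the document as UTF-8 without a byte order mark.
    public static byte[] ToUtf8Bytes(string csv) => new UTF8Encoding(false).GetBytes(csv);
}
```

For example, `CsvSketch.ToCsv(new[] { "Id", "Name" }, new[] { new[] { "1", "Doe, Jane" } })` yields the header record followed by `1,"Doe, Jane"`, each terminated by CRLF.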
POST /backup/schedule
: Schedules a new backup job.
GET /backup/jobs
: Retrieves a list of backup jobs with their relevant status past or present.
Example response:
[
{ "jobId": "guid", "status": "pending", "createdAt": "ISO-8601", "lastUpdatedAt": "ISO-8601" },
{ "jobId": "guid", "status": "failed", "createdAt": "ISO-8601", "lastUpdatedAt": "ISO-8601" },
{ "jobId": "guid", "status": "running", "createdAt": "ISO-8601", "lastUpdatedAt": "ISO-8601" },
{ "jobId": "guid", "status": "completed", "createdAt": "ISO-8601", "lastUpdatedAt": "ISO-8601" }
]
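One way a response item of this shape could be produced is sketched below with `System.Text.Json`. The `BackupJobResponse` record and the `BackupJobJson` helper are assumptions made for illustration, and the `JobStatus` enum here is a stand-in whose members are taken from the example response; the real enum in the PR may differ.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Stand-in enum; the member list and order are assumed from the example response.
public enum JobStatus { Pending, Running, Completed, Failed }

// Hypothetical response shape; property names mirror the JSON keys in the example.
public record BackupJobResponse(Guid JobId, JobStatus Status, DateTime CreatedAt, DateTime? LastUpdatedAt);

public static class BackupJobJson
{
    private static readonly JsonSerializerOptions Options = new()
    {
        // Emit camelCase keys such as "jobId" and "lastUpdatedAt".
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        // Emit enum values as camelCase strings such as "pending".
        Converters = { new JsonStringEnumConverter(JsonNamingPolicy.CamelCase) }
    };

    // DateTime values are written in ISO 8601 format by System.Text.Json.
    public static string Serialize(BackupJobResponse job) => JsonSerializer.Serialize(job, Options);
}
```

Serializing the status as a string rather than a number keeps the payload readable for API consumers and stable if enum members are reordered.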
The exported data is stored as a UTF-8 encoded byte array.
public class ArchiveJob : PerTenant
{
    public Guid Id { get; set; }
    public AccountMetaInformation Application { get; set; }
    public DateTime CreatedAt { get; set; }
    public DateTime? UpdatedAt { get; set; }
    public JobStatus Status { get; set; } = JobStatus.Pending;
    public List<Archive> Archives { get; set; } = new();
}

public class Archive : PerTenant
{
    public Guid Id { get; set; }

    /// <summary>
    /// The identifier of the backup job that created this archive.
    /// </summary>
    public Guid JobId { get; set; }
    public DateTime CreatedAt { get; set; }
    public Type? Entity { get; set; }

    [MaxLength(100 * 1024 * 1024, ErrorMessage = "Data cannot be larger than 100MB.")]
    public byte[] Data { get; set; }
    public AccountMetaInformation Application { get; set; } = null!;
}
The upload process would likely involve a user creating a new organization and importing an old app by uploading the exported files. The restoration would only be able to start if the schemas match and all documents were uploaded successfully.
ApplicationEvents
An additional migration for `ApplicationEvents` had to be included, as it was not inheriting from `PerTenant` according to the conventions we had laid out for `DbTenantContext`. This was interfering with the generics I had implemented in `BackupWorkerService`, and would otherwise have significantly impacted maintainability as well.
To prevent any unnecessary migrations from being executed, which could result in data loss, the database column was manually mapped to its original name. Only the model snapshot was updated, to reflect the mapping from the column name to the CLR property.
Screenshots
n/a
Checklist
I did the following to ensure that my changes were tested thoroughly:
- [ ] Unit tests
- [ ] Integration tests
I did the following to ensure that my changes did not introduce new security vulnerabilities:
- [ ] Secured endpoints with secret key.
- [ ] Sanitization for macros was skipped, given these backups are not meant to be opened in a program like Microsoft Excel.
- [ ] In this implementation, a DoS attack would be possible with very large customers.
Codecov Report
Attention: Patch coverage is 47.23502% with 1374 lines in your changes missing coverage. Please review.
Project coverage is 33.91%. Comparing base (a194a43) to head (0574e1e). Report is 135 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #465 +/- ##
==========================================
+ Coverage 32.63% 33.91% +1.28%
==========================================
Files 504 525 +21
Lines 26394 28985 +2591
Branches 819 833 +14
==========================================
+ Hits 8613 9831 +1218
- Misses 17670 19038 +1368
- Partials 111 116 +5
Re-iterating earlier comment: Let's review this PR but not merge it until we've shipped some of the things currently in main to prod.