AzureStorageExplorer
Find accurate progress reports for table import
Initially, progress was reported based on the number of bytes read from the import file, but this was inaccurate, because the number of bytes read doesn't correlate well with the number of entities read or uploaded.
A solution will likely require a refactor of the CSV parser to read less greedily and/or report the number of lines or records that have been read.
If 1.21 and older did not report a progress percentage for import, then feel free to move this to a later milestone.
Progress percentage was not reported initially, so moving to 1.24.0.
This was actually resolved when working on performance improvements for table import. Progress is now calculated as follows:
$$\frac{\text{bytes read}}{\text{file size}} \times \frac{\text{entities uploaded}}{\text{total entities}}$$
Ideally, progress would be determined by the number of entities. However, the total number of entities cannot be known when the import starts, and it increases as more of the file is read. The file size provides a better basis, but the import is not complete until we've uploaded everything that has been read from the file.
The answer is to combine the two. The file size is the dominant factor, adjusted slightly by the number of entities. When the entire file has been read, the dominant factor becomes the number of entities, which is good, because by then we do know how many entities need to be uploaded.
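As a rough illustration of the formula above, here is a minimal TypeScript sketch. It is not the actual Storage Explorer code: the function name `computeImportProgress`, its parameter names, and the zero-handling and clamping are all assumptions made for this example. "Total entities" is taken to mean the number of entities parsed from the file so far.

```typescript
// Hypothetical helper, not taken from the Storage Explorer code base.
// Computes (bytes read / file size) * (entities uploaded / total entities).
function computeImportProgress(
    bytesRead: number,
    fileSize: number,
    entitiesUploaded: number,
    totalEntities: number // entities parsed from the file so far (assumption)
): number {
    // Assumption: with nothing to measure against, report no progress
    // rather than dividing by zero.
    if (fileSize <= 0 || totalEntities <= 0) {
        return 0;
    }

    const fileRatio = bytesRead / fileSize;
    const entityRatio = entitiesUploaded / totalEntities;

    // Clamp to [0, 1] in case the counters briefly overshoot.
    return Math.min(1, Math.max(0, fileRatio * entityRatio));
}

// Halfway through the file, with 400 of the 500 parsed entities uploaded.
const midImport = computeImportProgress(50, 100, 400, 500);

// File fully read, with 900 of the 1000 parsed entities uploaded.
const fileDone = computeImportProgress(100, 100, 900, 1000);

console.log(midImport, fileDone);
```

Once the whole file has been read, the first ratio is 1, so the result depends only on how many of the parsed entities have been uploaded.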