Storage uploads partial files on network failure
[REQUIRED] Step 2: Describe your environment
- Android Studio version:
Android Studio Koala | 2024.1.1 Patch 1
Build #AI-241.18034.62.2411.12071903, built on July 10, 2024
Runtime version: 17.0.11+0-17.0.11b1207.24-11852314 aarch64
VM: OpenJDK 64-Bit Server VM by JetBrains s.r.o.
macOS 14.1
GC: G1 Young Generation, G1 Old Generation
Memory: 8192M
Cores: 12
Metal Rendering is ON
Registry:
debugger.new.tool.window.layout=true
ide.experimental.ui=true
Non-Bundled Plugins:
de.santiv.fastscrolling (1.2)
IdeaVIM (2.15.3)
- Firebase Component: Storage
- Component version:
33.1.2
[REQUIRED] Step 3: Describe the problem
I upload files from disk using putStream from a BufferedInputStream backed by a FileInputStream pointing to a file in my application's private storage directory. I do not use resumable uploads.
I've noticed several times now that a partial file ends up in Firebase Storage. I can tell it's partial because it's an audio file that was and remains playable on the local device, but cannot be played once it's downloaded from storage (in Chrome or anywhere else). To mitigate this, I calculate an MD5 hash on the device and compare it to the MD5 hash in the StorageMetadata returned by the completed Firebase Task. I've found that these hashes sometimes do not match, and that a mismatch corresponds with a partial upload.
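For reference, a minimal sketch of what the local hash helper could look like. The name base64Md5 matches the call in the code further down, but this body is an assumption, not the exact implementation from my app (which returns a nullable value). It assumes StorageMetadata reports the MD5 as a Base64 string, and it uses java.util.Base64, which on Android requires API 26+; android.util.Base64 with NO_WRAP would be the alternative on older devices.

```kotlin
import java.io.File
import java.io.InputStream
import java.security.MessageDigest
import java.util.Base64

// Hypothetical helper: streams the input through MD5 and returns the
// digest Base64-encoded, so it can be compared to a Base64 MD5 string.
fun base64Md5(input: InputStream): String {
    val digest = MessageDigest.getInstance("MD5")
    val buffer = ByteArray(8192)
    while (true) {
        val read = input.read(buffer)
        if (read < 0) break // end of stream
        digest.update(buffer, 0, read)
    }
    return Base64.getEncoder().encodeToString(digest.digest())
}

// Convenience overload for a file on disk; closes the stream when done.
fun base64Md5(file: File): String = file.inputStream().use { base64Md5(it) }
```

Hashing in fixed-size chunks keeps memory use flat regardless of how large the audio file is.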
I can also tell from logs that the user's network conditions degrade during the upload, and errors like this one are logged:
error sending network request POST https://firebasestorage.googleapis.com/v0/b/<redacted>&uploadType=resumable&upload_id=<redacted>&upload_protocol=resumable
I'd expect a failure to upload any part of the file to make the upload task return a failed result, not a successful one. I'd also expect the Firebase client SDK to validate automatically that the uploaded hash matches the hash of the input data, or to provide an API where I can ask it to do so (as it does for iOS and Node).
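One way to narrow down where the bytes go missing, sketched under assumptions: wrap the stream handed to putStream so that it counts and hashes exactly the bytes the consumer reads. If the byte count and hash of what was read match the local file but not the stored object's metadata, the loss happened after the stream was read. HashingInputStream is a hypothetical class, not part of the Firebase SDK, and I have not confirmed how the SDK reads or buffers the stream internally.

```kotlin
import java.io.InputStream
import java.security.DigestInputStream
import java.security.MessageDigest
import java.util.Base64

// Hypothetical wrapper: records how many bytes were read from the
// source and the MD5 of exactly those bytes.
class HashingInputStream(source: InputStream) : InputStream() {
    private val digest = MessageDigest.getInstance("MD5")
    private val delegate = DigestInputStream(source, digest)

    var bytesRead: Long = 0
        private set

    override fun read(): Int {
        val b = delegate.read()
        if (b >= 0) bytesRead++
        return b
    }

    override fun read(b: ByteArray, off: Int, len: Int): Int {
        val n = delegate.read(b, off, len)
        if (n > 0) bytesRead += n
        return n
    }

    override fun close() = delegate.close()

    // Call once, after the consumer has finished reading:
    // MessageDigest.digest() resets the digest state.
    fun base64Md5(): String = Base64.getEncoder().encodeToString(digest.digest())
}
```

In the upload code this would wrap the BufferedInputStream before it is passed to putStream, and bytesRead and base64Md5() would be logged next to the hash from the returned metadata.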
Steps to reproduce:
I haven't been able to reproduce it locally yet, probably because the files I'm trying with are too small and/or my network is too good. I'll keep trying though.
But here are some example logs that show the network quality and the storage logs during an upload that resulted in a partial file:
Network Event
info
03:15:28.933
{
action: NETWORK_CAPABILITIES_CHANGED,
download_bandwidth: 11658,
network_type: cellular,
upload_bandwidth: 11658,
vpn_active: false
}
Logcat
warning
03:15:13.098
[Filtered]
{
tag: StorageUtil
}
Device Event
info
03:15:10.165
{
action: BATTERY_CHANGED,
charging: false,
level: 96
}
Network Event
info
03:14:54.212
{
action: NETWORK_CAPABILITIES_CHANGED,
download_bandwidth: 8245,
network_type: cellular,
upload_bandwidth: 8245,
vpn_active: false
}
Device Event
info
03:14:09.824
{
action: BATTERY_CHANGED,
charging: false,
level: 97
}
Logcat
warning
03:14:09.321
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:09.319
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Network Event
info
03:14:08.964
{
action: NETWORK_CAPABILITIES_CHANGED,
download_bandwidth: 14,
network_type: cellular,
upload_bandwidth: 14,
vpn_active: false
}
Network Event
info
03:14:08.964
{
action: NETWORK_AVAILABLE
}
Logcat
warning
03:14:08.154
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:08.152
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Logcat
warning
03:14:06.970
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:06.967
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Logcat
warning
03:14:05.724
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:05.718
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Logcat
warning
03:14:04.537
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:04.533
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Logcat
warning
03:14:03.286
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:03.283
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Logcat
warning
03:14:02.254
[Filtered]
{
tag: StorageUtil
}
Logcat
warning
03:14:02.251
network unavailable, sleeping.
{
tag: ExponenentialBackoff
}
Relevant Code:
Here's the code I'm using now. The MD5 logic detects the partial upload, but a partial file is still stored in Firebase storage.
val ourMd5 = base64Md5(file)
BufferedInputStream(FileInputStream(file)).use {
    suspendCancellableCoroutine { continuation ->
        val task = ref.putStream(it, metadata)
        continuation.invokeOnCancellation { error ->
            isCancelled = task.cancel()
        }
        task.addOnProgressListener {
            ...
        }
        task.addOnCanceledListener {
            ...
        }
        task.addOnPausedListener {
            ...
        }
        task.addOnCompleteListener { completeTask ->
            if (completeTask.isCanceled) {
                continuation.resumeWithException(
                    completeTask.exception ?: UploadException("Cancelled")
                )
                return@addOnCompleteListener
            }
            if (!completeTask.isSuccessful) {
                continuation.resumeWithException(
                    completeTask.exception ?: RuntimeException("Unknown failure")
                )
                return@addOnCompleteListener
            }
            // I don't think this is strictly necessary, but technically result can
            // throw, so let's catch it.
            val result =
                try {
                    completeTask.result
                } catch (e: Exception) {
                    continuation.resumeWithException(e)
                    return@addOnCompleteListener
                }
            val resultMetadata = result.metadata
            if (resultMetadata == null) {
                continuation.resumeWithException(UploadException("Missing metadata!"))
                return@addOnCompleteListener
            }
            if (ourMd5 != null && ourMd5 != resultMetadata.md5Hash) {
                continuation.resumeWithException(UploadException("Mismatched hashes!"))
                return@addOnCompleteListener
            }
            val progress: FileUploadStatus = FileUploadProgress(
                file = andyFile,
                bytesTransferred = totalBytes,
                bytesTotal = totalBytes,
            )
            trySend(progress)
            continuation.resumeWith(Result.success(resultMetadata.path))
        }
    }
}
Hi @sjudd, thank you for reaching out. Firebase storage uploads the file in chunks, and adjusts the size of each chunk based on upload performance. In addition to that, Firebase Storage is using an exponential backoff strategy for retrying failed operations.
I'll go ahead and mark this as a feature request. While we are unable to promise any timeline for this, we'll definitely keep it on our radar.
P.S. For folks who find this useful, adding an emoji thumbs up on the original post can help us prioritize adding this to the roadmap.
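To make the retry behavior mentioned above concrete, here is a generic sketch of retrying an operation with exponential backoff. It is an illustration only and not the Firebase SDK's implementation; retryWithBackoff, its parameters, and the default delays are all hypothetical. The property that matters for this issue is the last line: once retries are exhausted, the caller gets a failure, never a success.

```kotlin
// Generic illustration, not Firebase code: retry `operation` up to
// `maxAttempts` times, doubling the delay (capped at maxDelayMs)
// after each failure.
fun <T> retryWithBackoff(
    maxAttempts: Int,
    initialDelayMs: Long = 1_000,
    maxDelayMs: Long = 30_000,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    operation: (attempt: Int) -> T,
): T {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMs = initialDelayMs
    var lastError: Exception? = null
    for (attempt in 1..maxAttempts) {
        try {
            return operation(attempt)
        } catch (e: Exception) {
            lastError = e
            if (attempt < maxAttempts) {
                sleep(delayMs)
                delayMs = (delayMs * 2).coerceAtMost(maxDelayMs)
            }
        }
    }
    // Retries exhausted: surface the failure instead of reporting success.
    throw IllegalStateException("Gave up after $maxAttempts attempts", lastError)
}
```

The sleep parameter is injectable only so the delays can be observed without actually waiting.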
@lehcar09 thanks for the response. I don't think this is a feature request, but rather a significant bug in the implementation. It appears that Firebase fairly often uploads partial / corrupt files, presumably because a chunk fails to upload and there's no hash check to catch this kind of issue.
I can't think of any valid reason why Firebase would indicate that a file uploaded successfully when the file is corrupt and the hash returned by Firebase's API (in the metadata) does not match the hash calculated locally.
I agree that adding an explicit hash check is a feature request. But failing to upload the entire file while indicating that the file uploaded successfully is a bug (barring extremely rare circumstances like bit flips or broken hardware).