geo-deep-learning

Concurrency problem: parallel inferences in production

Open micpilon opened this issue 2 years ago • 0 comments

In the production inference process, one step checks whether each asset is already present in the data repository, so that it is not downloaded to HPC every time.

When we use a new image that has never been loaded in HPC and run all our models in parallel, the first of the N runs downloads the assets into the data folder as expected, but the other runs detect the TIFs while they are still being written to that directory. Those runs skip the download and then hit read errors in later steps of the process, because some tiles are still incomplete.
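The failure mode described above can be reproduced in a few lines. This is only an illustrative sketch: the file name `tile.tif`, the 1 MiB size, and the helper `looks_present` are hypothetical and are not taken from the geo-deep-learning code.

```python
import os
import tempfile


def looks_present(path):
    """Naive presence check: the path exists, so the download is skipped."""
    return os.path.exists(path)


# Simulate a run that is halfway through writing a 1 MiB asset.
expected_size = 1024 * 1024
workdir = tempfile.mkdtemp()
asset = os.path.join(workdir, "tile.tif")  # hypothetical file name

with open(asset, "wb") as f:
    f.write(b"\0" * (expected_size // 2))
    f.flush()
    # A second run checking at this moment considers the asset present...
    seen_by_other_run = looks_present(asset)
    # ...even though only half of the bytes have been written so far.
    size_seen = os.path.getsize(asset)

print(seen_by_other_run, size_seen < expected_size)
```

An existence check alone cannot distinguish a finished file from one that is still being written, which is why the other runs start reading incomplete tiles.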

We should find a more robust way to validate the presence of assets in HPC, so that assets that are still being downloaded are not treated as available.

Proposed implementation: poll the file size until it stops changing.

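import os
import time

if not os.path.exists(val):
    # Download file
    ...
else:
    # Another run may still be writing the file:
    # wait until its size stops changing between two checks.
    file_size = -1
    while file_size != os.path.getsize(val):
        file_size = os.path.getsize(val)
        time.sleep(1)

    # Path exists and file size is stable. Good to go.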
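The size-polling idea above could be wrapped in a helper that also gives up after a while, so a run does not wait forever. This is a minimal sketch, not the project's implementation: the function name `wait_until_stable` and the parameters `interval`, `stable_checks`, and `timeout` are assumptions, as is the choice to treat an empty file as not yet ready.

```python
import os
import time


def wait_until_stable(path, interval=1.0, stable_checks=3, timeout=600.0):
    """Return True once the size of `path` is non-zero and unchanged for
    `stable_checks` consecutive polls; return False if `timeout` elapses."""
    deadline = time.monotonic() + timeout
    last_size = -1
    unchanged = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file missing or not accessible yet
        if size > 0 and size == last_size:
            unchanged += 1
            if unchanged >= stable_checks:
                return True
        else:
            unchanged = 0  # size changed (or file empty): start counting again
        last_size = size
        time.sleep(interval)
    return False
```

A caller would check `wait_until_stable(val)` before reading the asset and handle a `False` result explicitly. A stable size is only a heuristic: a stalled or interrupted download also stops growing, so this reduces the race but does not prove the file is complete.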

micpilon, Oct 12 '22 13:10