Embedding factory script looping through each MGRS tile
Jupyter notebook script to generate GeoParquet embedding files on a per MGRS tile basis.
Steps:
- The script first generates an mgrs_world.txt file listing MGRS tile codes such as 12ABC. This command needs to be run first:
aws s3 ls s3://clay-tiles-02/02/ | tr -s ' ' | cut -d ' ' -f 3 | cut -d '/' -f 1 > mgrs_world.txt
- A for-loop then goes through each MGRS tile, running the model's prediction step to generate GeoParquet files that are uploaded to S3.
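
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `read_mgrs_codes` and `embed_all_tiles`, the `process_tile` callback, and the assumption that every tile code is two digits followed by three uppercase letters (like 12ABC) are all hypothetical. The model prediction and S3 upload are left to whatever callable is passed in.

```python
import re
from pathlib import Path

# Assumption: tile codes look like "12ABC" (two digits, three uppercase letters).
MGRS_PATTERN = re.compile(r"^\d{2}[A-Z]{3}$")


def read_mgrs_codes(path):
    """Read mgrs_world.txt and return the tile codes, skipping blank or malformed lines."""
    codes = []
    for line in Path(path).read_text().splitlines():
        code = line.strip()
        if MGRS_PATTERN.match(code):
            codes.append(code)
    return codes


def embed_all_tiles(codes, process_tile):
    """Call process_tile(code) for each MGRS tile, in order.

    process_tile is a hypothetical stand-in for the per-tile work
    (model prediction, writing GeoParquet, uploading to S3).
    Returns the list of tile codes that were processed.
    """
    processed = []
    for code in codes:
        process_tile(code)
        processed.append(code)
    return processed
```

A call such as `embed_all_tiles(read_mgrs_codes("mgrs_world.txt"), my_tile_function)` would then visit each tile once, where `my_tile_function` is whatever per-tile routine the notebook defines.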
Notes:
- There were about 947019 rows of embeddings generated from the clay-small-70MT-1100T-10E.ckpt model checkpoint in Dec 2023.
- Embeddings were generated using a g5.4xlarge EC2 instance with 1 NVIDIA A10G GPU that allows for bfloat16 dtype calculations.
Closes #120