Output files sometimes missing in us-east-2

Open eafpres opened this issue 3 years ago • 2 comments

sample application

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Sep  1 17:55:26 2022

@author: eafpres
"""
#
#%% libraries
#
import pandas as pd
import matplotlib.pyplot as plt
import sys
import os
#
#%% configure
#
my_os = sys.platform
print('found OS ', my_os)
print('user is: ', os.environ.get('USER'))
print('working directory is: ', os.getcwd())
#
#%% data
#
data = pd.read_csv('data/parabolic_data.csv')
#
#%% stats
#
print('data summary')
print(data.describe())
#
# save summary
#
data.describe().to_csv('output/data_summary.csv')
#
#%% visualize
#
fig, ax = plt.subplots(figsize = (9, 9))
ax.scatter(data['x'], data['y'])
plt.savefig('output/parabolic.jpg')
#
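
One way to tell whether the files are missing because the script never wrote them or because the artifact upload failed is to have the script report what it left in the output folder before it exits. The following is a minimal sketch using only the standard library. The helper name report_outputs is hypothetical and not part of dstack; the file names are the two the sample application writes.

```python
import os


def report_outputs(output_dir, expected):
    """Return {filename: size in bytes, or None if the file is absent}."""
    found = {}
    for name in expected:
        path = os.path.join(output_dir, name)
        # isfile() is False for a missing file and for a missing directory.
        found[name] = os.path.getsize(path) if os.path.isfile(path) else None
    return found


if __name__ == "__main__":
    # Intended to run as the last step of the sample application.
    sizes = report_outputs("output", ["data_summary.csv", "parabolic.jpg"])
    for name, size in sizes.items():
        status = "MISSING" if size is None else f"{size} bytes"
        print(f"output/{name}: {status}")
```

If the run log shows both files with non-zero sizes but dstack artifacts list is still empty, the problem is more likely in the upload step than in the script.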

configure in us-east-1

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack config --aws-profile default
Configure AWS backend:

Region name (us-east-2): us-east-1
S3 bucket name (eaf-test-dstack-20221012): eaf-test-dstack
The bucket 'eaf-test-dstack' doesn't exist. Create it? [y/n]: y
OK

add a tag for the data folder

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack tags add test-dstack-data -a data
Uploading artifact 'data': 100%|████████████████████████████████████████████████████████████████████████| 640/640 [00:01<00:00, 364B/s]
OK

run the workflow

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack run read_data
 RUN          WORKFLOW          STATUS     APPS  ARTIFACTS           SUBMITTED   TAG
 green-eel-1  read_data         Submitted        output              47 sec ago

Provisioning... It may take up to a minute. ✓

To interrupt, press Ctrl+C.

Collecting pandas==1.3.4
  Downloading pandas-1.3.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.5 MB)
Collecting matplotlib==3.5.0
  Downloading matplotlib-3.5.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
Collecting pytz>=2017.3
  Downloading pytz-2022.4-py2.py3-none-any.whl (500 kB)
Collecting python-dateutil>=2.7.3
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting numpy>=1.17.3
  Downloading numpy-1.23.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Collecting setuptools-scm>=4
  Downloading setuptools_scm-7.0.5-py3-none-any.whl (42 kB)
Collecting packaging>=20.0
  Downloading packaging-21.3-py3-none-any.whl (40 kB)
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.4.4-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.2 MB)
Collecting pyparsing>=2.2.1
  Downloading pyparsing-3.0.9-py3-none-any.whl (98 kB)
Collecting pillow>=6.2.0
  Downloading Pillow-9.2.0-cp38-cp38-manylinux_2_28_x86_64.whl (3.2 MB)
Collecting cycler>=0.10
  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting fonttools>=4.22.0
  Downloading fonttools-4.37.4-py3-none-any.whl (960 kB)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas==1.3.4->-r requirements.txt (line 1)) (1.16.0)
Collecting typing-extensions
  Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting tomli>=1.0.0
  Downloading tomli-2.0.1-py3-none-any.whl (12 kB)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.8/site-packages (from setuptools-scm>=4->matplotlib==3.5.0->-r requirements.txt (line 2)) (61.2.0)
Installing collected packages: pyparsing, typing-extensions, tomli, packaging, setuptools-scm, pytz, python-dateutil, pillow, numpy, kiwisolver, fonttools, cycler, pandas, matplotlib
Successfully installed cycler-0.11.0 fonttools-4.37.4 kiwisolver-1.4.4 matplotlib-3.5.0 numpy-1.23.4 packaging-21.3 pandas-1.3.4 pillow-9.2.0 pyparsing-3.0.9 python-dateutil-2.8.2 pytz-2022.4 setuptools-scm-7.0.5 tomli-2.0.1 typing-extensions-4.4.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
found OS  linux
user is:  None
working directory is:  /workflow
data summary
               x            y
count  76.000000    76.000000
mean   38.500000  1974.026316
std    22.083176  1754.755717
min     1.000000    11.000000
25%    19.750000   401.000000
50%    38.500000  1493.000000
75%    57.250000  3288.750000
max    76.000000  5787.000000

check the result

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack artifacts list green-eel-1
 ARTIFACT  FILE              SIZE
 output    data_summary.csv  170.0B
           parabolic.jpg     29.3KiB

configure for us-east-2

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack config --aws-profile eaf
Configure AWS backend:

Region name (us-east-1): us-east-2
S3 bucket name (eaf-test-dstack): eaf-test-dstack-2
The bucket 'eaf-test-dstack-2' doesn't exist. Create it? [y/n]: y
OK

add the tag

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack tags add test-dstack-data -a data
Uploading artifact 'data': 100%|████████████████████████████████████████████████████████████████████████| 640/640 [00:01<00:00, 357B/s]
OK

run the workflow

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack run read_data
 RUN          WORKFLOW          STATUS     APPS  ARTIFACTS           SUBMITTED   TAG
 bad-moose-1  read_data         Submitted        output              50 sec ago

Provisioning... It may take up to a minute. ✓

To interrupt, press Ctrl+C.

output is missing here (no pip or script log lines)

check for results

(python38) PS C:\eaf llc\aa-Analytics and BI\dstack> dstack artifacts list bad-moose-1
 ARTIFACT  FILE  SIZE

The pip output showing that requirements.txt was processed is not missing every time; sometimes it shows up.
The script's own output is not missing every time either; sometimes it shows up.
But every time in us-east-2 the output folder is empty, whereas in us-east-1 it contains the files.
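
To check independently of the dstack CLI whether anything reached the bucket, the objects can be listed straight from S3. This is a sketch only: it assumes boto3 is installed, and it takes the key prefix as a parameter because how dstack lays out artifacts inside the bucket is not stated here. The function name list_bucket_files is hypothetical.

```python
def list_bucket_files(s3_client, bucket, prefix=""):
    """Return (key, size) pairs for every object under prefix in bucket."""
    paginator = s3_client.get_paginator("list_objects_v2")
    files = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matches.
        for obj in page.get("Contents", []):
            files.append((obj["Key"], obj["Size"]))
    return files


# Usage (needs boto3 and AWS credentials; profile and bucket as configured above):
#   import boto3
#   session = boto3.Session(profile_name="eaf", region_name="us-east-2")
#   for key, size in list_bucket_files(session.client("s3"), "eaf-test-dstack-2"):
#       print(size, key)
```

Running the same listing against the us-east-1 bucket gives a baseline for which keys a successful run produces.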

eafpres, Oct 12 '22 21:10

Posting it here from Slack for history:

What I did:

  1. Configured us-east-2
  2. Ran download from dstack-examples
  3. Checked output artifacts

All worked well.

If possible, to help me reproduce this issue, please create a small public Git repo so I can reproduce it exactly as you do it.

Also, in case it's possible, please attach the runner logs from CloudWatch for the corresponding runs. CloudWatch Group: /dstack/runners<bucket-name>
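
For anyone collecting those logs, a rough sketch of pulling them with boto3 rather than the console follows. It assumes boto3 is installed and that the caller copies the exact log group name from CloudWatch; the function name fetch_log_messages is hypothetical.

```python
def fetch_log_messages(logs_client, log_group, start_ms=None):
    """Return (stream name, message) pairs for events in a log group."""
    paginator = logs_client.get_paginator("filter_log_events")
    kwargs = {"logGroupName": log_group}
    if start_ms is not None:
        # startTime is in milliseconds since the Unix epoch.
        kwargs["startTime"] = start_ms
    messages = []
    for page in paginator.paginate(**kwargs):
        for event in page.get("events", []):
            messages.append((event["logStreamName"], event["message"]))
    return messages


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   session = boto3.Session(profile_name="eaf", region_name="us-east-2")
#   group = "..."  # paste the runner log group name from CloudWatch
#   for stream, message in fetch_log_messages(session.client("logs"), group):
#       print(stream, message)
```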

peterschmidt85, Oct 19 '22 19:10

I will organize a repo and update here when ready

eafpres, Oct 19 '22 19:10

I believe this is resolved in the current update, so I am closing.

eafpres, Nov 27 '22 18:11