OG-USA
Windows GH Action tests not running
For the last few PRs (#73, #74, #75, and #76), the Windows operating system tests in `build_and_test.yml` have not been running on GitHub Actions. The Checkout, Setup Miniconda, and Build sections all run, but the job stalls at the Test section. PR #76 temporarily removes the Windows tests from the `build_and_test.yml` GH Action, but we need to get those tests back in.
We should watch closely whether the implementation of the OG-ZAF testing suite has the same issues. If not, we should look at the differences between OG-USA, OG-ZAF, and OG-Core. In the latter two, the GH Actions work on all three operating systems. We suspect that the issue might be some dependencies in OG-USA that are not in those other two repositories.
cc: @jdebacker
I tried reinstating the Windows tests in PR #89, but they still get stuck in the "Test" stage of `build_and_test.yml`.
I found a Windows machine, downloaded this repo, and followed the instructions to install using the Anaconda Prompt CLI.
Findings:
- When I tried to create the environment (`conda env create -f environment.yml`), I got a `pip` failure. It seemed to relate to the `rpy2` package, so I did a `conda install R`, which installed R, and I was then able to create the environment.
- I installed the `ogusa` package from source in the `ogusa-dev` virtual environment without issue.
- Next, I ran `pytest`. Tests were collected, but none ran and I got the command prompt back.
- I tried to run individual test modules, e.g., `pytest tests/test_income.py`. I was able to run all tests successfully EXCEPT:
  - The tests requiring the `puf.csv` file in `tests/test_get_micro_data.py` failed, which would be expected given that I did not have this file.
  - When I try to run `tests/test_psid_data_setup.py`, I see just `collecting...` and then the command prompt is back.
- When I remove `tests/test_psid_data_setup.py` from the `/tests` directory, I can run `pytest` and all remaining tests run as expected.
Conclusion:
The failures of the GH Actions tests on Windows are related to the `rpy2` package, which is only imported in the `psid_data_setup.py` module. The failure to properly import this package results in an odd `pytest` failure (odd in that there are no error messages or warnings; `pytest` just doesn't run).
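Since the failure is silent inside `pytest`, one way to confirm it on another machine is to attempt the import in a child process and look at the exit code. This is only a diagnostic sketch: `probe_import` is a hypothetical helper, and the choice of `rpy2.robjects` as the submodule to probe is an assumption about what `psid_data_setup.py` imports.

```python
import subprocess
import sys


def probe_import(module_name, timeout=120):
    """Import a module in a child process; return (exit code, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr.strip()


if __name__ == "__main__":
    # "json" is a control that should always import cleanly.
    # "rpy2.robjects" is an assumed import path, not taken from the repo.
    for name in ("json", "rpy2.robjects"):
        code, err = probe_import(name)
        status = "ok" if code == 0 else f"failed (exit code {code})"
        print(f"{name}: {status}")
        if err:
            # Show only the last line of any traceback
            print("  " + err.splitlines()[-1])
```

A nonzero exit code with empty stderr would be consistent with the interpreter dying during import, which would explain why `pytest` returns to the prompt without reporting anything.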
Solution
The `rpy2` package is a headache. Its only use in OG-USA is to convert an R dataframe to a Pandas DataFrame. Instead of this, we should just save the data out of R (i.e., from the `./OG-USA/data/PSID/psid_download.R` script) as a CSV file. That way we can read it into a Pandas DataFrame without issue and we can drop `rpy2` as a dependency.
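A minimal sketch of the Python side of this change, assuming the R script writes a plain CSV (for example with `write.csv(..., row.names = FALSE)`). The function name `load_psid_csv`, the column names, and the values are all made up for illustration; an in-memory buffer stands in for the real file.

```python
import io

import pandas as pd

# Stand-in for the CSV that psid_download.R would write. The columns and
# values are hypothetical and only illustrate the mechanics.
csv_text = io.StringIO(
    "ID,year,age,income\n"
    "1,2017,34,52000.0\n"
    "2,2017,51,NA\n"
)


def load_psid_csv(path_or_buffer):
    """Read the R-exported CSV into a pandas DataFrame without rpy2."""
    # R writes missing values as "NA" by default, and pandas parses "NA"
    # as NaN by default, so no extra arguments are needed here.
    return pd.read_csv(path_or_buffer)


df = load_psid_csv(csv_text)
print(df.dtypes)
print(df)
```

With this approach, the module would only need `pandas`, so the import no longer depends on an R installation being present on the test runner.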
The only potential problem I foresee with this is that CSV files are larger (often much larger) than .RData files. It could be that the CSV file with the same data exceeds the 100MB file limit for GH. If this is the case, we can compress it or use a different file format (but ideally one that can be created and output from R so that it can be done in `psid_download.R`).
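To get a feel for the size question, the sketch below gzips a CSV and checks both versions against the limit. It uses only the standard library. The helper names, the temporary demo file, and its contents are hypothetical, and the limit is taken as 100 MiB, which is an assumption about how the 100MB figure is measured.

```python
import gzip
import os
import shutil
import tempfile

# The 100MB limit mentioned above, assumed here to mean 100 MiB
GITHUB_FILE_LIMIT_BYTES = 100 * 1024 * 1024


def gzip_file(src_path):
    """Write a gzip-compressed copy of src_path and return its path."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


def fits_on_github(path):
    """Return True if the file is within the assumed per-file limit."""
    return os.path.getsize(path) <= GITHUB_FILE_LIMIT_BYTES


with tempfile.TemporaryDirectory() as tmp_dir:
    # Small made-up CSV standing in for the exported PSID data
    csv_path = os.path.join(tmp_dir, "psid_demo.csv")
    with open(csv_path, "w", newline="") as f:
        f.write("ID,year,age,income\n")
        for i in range(50000):
            f.write(f"{i},2017,{20 + i % 60},{30000 + i % 5000}.0\n")

    gz_path = gzip_file(csv_path)
    raw_size = os.path.getsize(csv_path)
    gz_size = os.path.getsize(gz_path)
    raw_ok = fits_on_github(csv_path)
    gz_ok = fits_on_github(gz_path)

print(f"raw: {raw_size} bytes, within limit: {raw_ok}")
print(f"gzip: {gz_size} bytes, within limit: {gz_ok}")
```

If the compressed route is taken, `pandas.read_csv` infers gzip compression from a `.gz` file extension by default, so the reading code would not need to change beyond the file name.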
cc @rickecon