Expected failing test_feature_availability_profiler tests on Linux (others?)
This is probably more of a question than an issue, but an issue seemed more appropriate than StackOverflow for a unit test. With current master (installed in a virtualenv with all dependencies) I get a few errors and failures, including one error and one failure in test_feature_availability_profiler. Specifically:
healthcareai.tests.test_feature_availability_profiler.TestFeatureAvailabilityProfiler
healthcareai.tests.test_feature_availability_profiler.TestFeatureAvailabilityProfilerError3
raise exceptions about the fact that the elements are not date types, instead of the expected exception.
I noticed some changes in this area in cb4c162. Are the failing tests expected in master, or is it likely a platform issue?
Debugging follows, feel free to ignore
Digging in, it looks as though feature_availability_profiler wants to verify that the dtype of the Series is datetime64[ns]. Because the initial element type is int, the dtype becomes object once datetimes are mixed in, whereas if the column is instantiated only with datetimes the error goes away. My quick hackery:
    def setUp(self):
        self.df = pd.DataFrame(np.random.randn(1000, 2),
                               columns=['AdmitDTS',
                                        'LastLoadDTS'])
        # generate load date
        self.df['LastLoadDTS'] = pd.datetime(2015, 5, 20)
        # generate datetime objects for admit date
        delta = pd.datetime(2015, 5, 20) - pd.datetime(2015, 5, 1)
        int_delta = (delta.days * 24 * 60 * 60) + delta.seconds

        def test_time(random_second):
            return pd.datetime(2015, 5, 1) + timedelta(seconds=random_second)

        admit = [test_time(randrange(int_delta)) for _ in range(1000)]
        self.df['AdmitDTS'] = pd.Series.from_array(admit)
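
For anyone who wants to see the dtype behaviour described above in isolation, here is a minimal sketch that does not touch healthcareai at all. The variable names are made up for illustration, and `pd.api.types.is_datetime64_any_dtype` is only a stand-in for whatever check the profiler actually performs (I have not copied its code here):

```python
from datetime import datetime, timedelta

import pandas as pd

start = datetime(2015, 5, 1)

# An int placeholder mixed with datetimes: pandas cannot pick a single
# datetime dtype, so the Series falls back to dtype object.
mixed = pd.Series([0, start, start + timedelta(days=1)])

# Only datetimes: pandas infers a datetime64 dtype for the whole Series.
dates_only = pd.Series([start + timedelta(days=i) for i in range(3)])

# Stand-in for the profiler's type check (assumption, not the real code).
mixed_is_datetime = pd.api.types.is_datetime64_any_dtype(mixed)
dates_only_is_datetime = pd.api.types.is_datetime64_any_dtype(dates_only)

print(mixed.dtype, mixed_is_datetime)
print(dates_only.dtype, dates_only_is_datetime)
```

If the first Series reports a non-datetime dtype and the second a datetime one on your platform too, that would match what I am seeing in the test fixture.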
Proposed: https://github.com/HealthCatalyst/healthcareai-py/pull/467