GoodreadsScraper

The data parsing step adds spurious values

Open havanagrawal opened this issue 6 years ago • 0 comments

If the dateutil.parser.parse function cannot find a component of the timestamp (day, month, or year), it fills in that component from its default argument, which is the current date when no default is passed.

This causes problems in later steps of the analysis: the filled-in values look like real data and show up as spurious patterns in time series. This can be fixed either by:

  1. Collecting day, month and year in separate fields, using NaN where applicable
  2. Using NaN if the entire date cannot be captured.
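A minimal sketch of option 1, assuming python-dateutil is installed. The helper name `split_date_components` and the two sentinel defaults are hypothetical, not part of the scraper. The idea is to parse the string twice with different `default` values: any component that changes between the two results was not in the input and is reported as NaN.

```python
import math
from datetime import datetime

from dateutil import parser  # third-party: python-dateutil

# Two hypothetical sentinel defaults that differ in year, month and day.
# Both years are leap years and both months have 31 days, so a parsed
# day such as 29 or 31 stays valid whichever default fills the gaps.
_DEFAULT_A = datetime(2000, 1, 1)
_DEFAULT_B = datetime(2004, 3, 2)


def split_date_components(text):
    """Return year, month and day as separate fields, NaN where missing."""
    missing = {"year": math.nan, "month": math.nan, "day": math.nan}
    try:
        a = parser.parse(text, default=_DEFAULT_A)
        b = parser.parse(text, default=_DEFAULT_B)
    except (ValueError, OverflowError):
        # Nothing date-like could be parsed: every component is missing.
        return missing

    result = {}
    for field in ("year", "month", "day"):
        value_a, value_b = getattr(a, field), getattr(b, field)
        # A component that follows the default was not present in the text.
        result[field] = value_a if value_a == value_b else math.nan
    return result


if __name__ == "__main__":
    print(split_date_components("Dec 2017"))
    print(split_date_components("December 7, 2017"))
```

Option 2 follows from the same helper: treat the date as NaN whenever any of the three fields comes back as NaN.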

havanagrawal, Dec 07 '17 18:12