LocalOccurrenceData shouldn't force a value column

Open goldingn opened this issue 8 years ago • 4 comments

For presence-only data, there is normally no column of 1s to use as the value column. In these cases we'd rather pass a single value that is applied to all records.

I'm trying to write a workflow to reproduce this paper by grabbing this dataset of occurrences. The following runs:

library(zoon)
LoadModule(LocalOccurrenceData)

df <- LocalOccurrenceData(filename = 'http://datadryad.org/bitstream/handle/10255/dryad.88856/albopictus.csv',
                          occurrenceType = 'presence',
                          columns = c(long = 'X',
                                      lat = 'Y',
                                      value = 'YEAR'))
head(df)
  longitude latitude value     type fold
1 -121.5833 37.08698  2001 presence    1
2 -118.0833 34.08313  2001 presence    1
3 -121.2500 38.08333  2001 presence    1
4 -121.5833 37.08332  2001 presence    1
5 -118.2500 34.08359  2001 presence    1
6 -117.9167 34.08771  2001 presence    1

but obviously we don't want the year as the value...

A nicer interface would be:

df <- LocalOccurrenceData(filename = 'http://datadryad.org/bitstream/handle/10255/dryad.88856/albopictus.csv',
                          occurrenceType = 'presence',
                          coords = c(longitude = 'X',
                                     latitude = 'Y'),
                          value = 1)

where value is optionally a column name or a numeric value to apply to all records.
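
As a rough sketch of what "a column name or a numeric value" could mean in practice, something like the helper below might do. Note that `resolve_value` and the toy `occ` data frame are made up for illustration and are not part of zoon; the rule that a single string means a column name is an assumption.

```r
# Hypothetical helper, not part of zoon: turn `value` into one number per record.
resolve_value <- function(data, value = 1) {
  if (is.character(value) && length(value) == 1) {
    # A single string is treated as a column name (assumed convention).
    if (!value %in% names(data)) {
      stop("column '", value, "' not found in data")
    }
    return(data[[value]])
  }
  if (is.numeric(value) && length(value) == 1) {
    # A single number is repeated for every record.
    return(rep(value, nrow(data)))
  }
  stop("value must be a single column name or a single number")
}

# Toy records standing in for the downloaded CSV.
occ <- data.frame(X = c(-121.58, -118.08),
                  Y = c(37.09, 34.08),
                  YEAR = c(2001, 2001))

resolve_value(occ, 1)       # constant applied to every record
resolve_value(occ, "YEAR")  # values taken from a column
```

Anything else (a vector of numbers, a missing column) is rejected with an error rather than guessed at, which seems safer for a data-loading module.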

Also correcting lat/long to latitude/longitude whilst we're at it...

goldingn, Sep 16 '15 14:09

Worth mentioning in the docs of LocalOccurrenceData that it can be used for online data like this.

timcdlucas, Sep 16 '15 14:09

Yup, just discussing that in the office - possibly also a name change?

goldingn, Sep 16 '15 14:09

OccurrenceTable or something, maybe OccurrenceDataFrame. A name change seems reasonable.

timcdlucas, Sep 16 '15 14:09

OccurrenceTable seems best to me. I also think this should work:

df <- LocalOccurrenceData(filename = 'http://datadryad.org/bitstream/handle/10255/dryad.88856/albopictus.csv',
                          occurrenceType = 'presence',
                          coords = c(longitude = 'X',
                                     latitude = 'Y'))

where value defaults to 1, i.e. presence-only.
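
A minimal sketch of how that default could look, assuming the module were a plain R function working on a data frame that has already been read in. `occurrence_table` is a hypothetical name and this is not zoon's implementation; the output columns simply mirror the `head(df)` output earlier in the thread.

```r
# Hypothetical sketch, not zoon's implementation: takes a data frame already
# in memory so it runs without any download.
occurrence_table <- function(data,
                             coords = c(longitude = "longitude",
                                        latitude = "latitude"),
                             value = 1,
                             occurrenceType = "presence") {
  # coords must name both axes, and the named columns must exist in data.
  stopifnot(all(c("longitude", "latitude") %in% names(coords)),
            all(coords %in% names(data)))
  data.frame(longitude = data[[coords[["longitude"]]]],
             latitude  = data[[coords[["latitude"]]]],
             value     = value,          # length-1 default is recycled to every row
             type      = occurrenceType,
             fold      = 1,
             stringsAsFactors = FALSE)
}

# Toy records standing in for the downloaded CSV.
occ <- data.frame(X = c(-121.58, -118.08), Y = c(37.09, 34.08))

# No value argument: every record gets value 1.
out <- occurrence_table(occ, coords = c(longitude = "X", latitude = "Y"))
```

This version only covers the constant default; accepting a column name for `value` would need an extra branch.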

AugustT, Sep 29 '15 09:09