Need a locatable unique subscribers set
Unique subscribers is very useful for creating subsets of subscribers who are active, but would be much more useful if those subscribers were all locatable.
Perhaps more generally, it would be useful to have a way to run queries on only the set of locatable events (e.g. IntereventInterval and TotalActivePeriodsSubscriber, as well as UniqueSubscribers). The main use case I'm imagining here is to use these non-location-related queries to define subsets of subscribers that will appear sufficiently often in the outputs of SubscriberLocations queries.
I think there are also two slightly different definitions of "locatable" that we may want to separate here:
- Exclude events whose cell ID does not appear in infrastructure.cells (this is currently implicitly the case for the outputs of a JoinToLocation query, due to the INNER JOIN of query to spatial unit).
- Exclude events for cells that do appear in infrastructure.cells, but don't map to any location for a specified spatial unit - e.g. events at a cell whose location falls outside the admin0 boundary. These events are currently not excluded from the output of a JoinToLocation query (and hence also the output of a SubscriberLocations query) if a mapping table is specified for the spatial unit, because the spatial unit join clause uses a LEFT JOIN, but are excluded for a polygon spatial unit without mapping table, because the point-in-polygon join clause uses an INNER JOIN.

I think we need to make this consistent one way or another, and perhaps also user-controllable via a parameter.
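To make the difference between the two join behaviours concrete, here is a minimal sketch using an in-memory SQLite database. The table names (events, cells, mapping) and their columns are hypothetical stand-ins chosen for illustration, not the actual schema or the SQL that JoinToLocation generates. It only shows how an INNER JOIN to the mapping drops events at unmapped cells, while a LEFT JOIN keeps them with a null location.

```python
import sqlite3

# Hypothetical toy schema, for illustration only (not the real schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (subscriber TEXT, cell_id TEXT);
    CREATE TABLE cells (cell_id TEXT);
    CREATE TABLE mapping (cell_id TEXT, location_id TEXT);
    -- 'c1' is a known cell with a mapped location,
    -- 'c2' is a known cell with no mapped location,
    -- 'c3' does not appear in the cells table at all.
    INSERT INTO events VALUES ('a', 'c1'), ('b', 'c2'), ('c', 'c3');
    INSERT INTO cells VALUES ('c1'), ('c2');
    INSERT INTO mapping VALUES ('c1', 'region_1');
    """
)

# INNER JOIN to the mapping: only events at mapped cells survive.
inner_rows = conn.execute(
    """
    SELECT e.subscriber, m.location_id
    FROM events e
    INNER JOIN cells c ON e.cell_id = c.cell_id
    INNER JOIN mapping m ON c.cell_id = m.cell_id
    ORDER BY e.subscriber
    """
).fetchall()

# LEFT JOIN to the mapping: events at known but unmapped cells are kept,
# with a NULL location_id.
left_rows = conn.execute(
    """
    SELECT e.subscriber, m.location_id
    FROM events e
    INNER JOIN cells c ON e.cell_id = c.cell_id
    LEFT JOIN mapping m ON c.cell_id = m.cell_id
    ORDER BY e.subscriber
    """
).fetchall()

print(inner_rows)  # [('a', 'region_1')]
print(left_rows)   # [('a', 'region_1'), ('b', None)]
```

In both cases the event at 'c3' is dropped by the join to cells, which corresponds to the first definition of "locatable"; the two queries differ only in how they treat the second.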
One final note (which I feel may be best handled as part of this issue rather than its own issue): I think the ignore_nulls argument to SubscriberLocations has no effect at all - it filters the output of the location-joined events query to exclude any rows with a null location_id, but these are already excluded by the INNER JOIN in JoinToLocation (as mentioned in my first point above).
> I think there are also two slightly different definitions of "locatable" that we may want to separate here
The situation is simplified as of #5361: "locatable" in the sense of "cell ID appears in the result of a specified SpatialUnit query" now unambiguously means "cell ID corresponds to one of the relevant geography elements" (depending on the type of spatial unit, this could mean that the cell ID appears in infrastructure.cells, appears in the specified mapping table, or has a known point location that falls within one of the specified polygons). So the set of locatable cells can be entirely specified by a SpatialUnit object, and what remains for this issue is to allow more queries to be run only on the events at these locatable cells.
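As a rough illustration of what "run a query only on events at locatable cells" could mean, the sketch below filters events against a set of locatable cell IDs before computing unique subscribers. The Event type, the function name, and the idea of passing the locatable cells as a plain Python set are all assumptions made for this example; in practice the set would come from a SpatialUnit query and the filtering would happen in SQL.

```python
from typing import Iterable, NamedTuple, Set


class Event(NamedTuple):
    """Hypothetical minimal event record: who, and at which cell."""

    subscriber: str
    cell_id: str


def locatable_unique_subscribers(
    events: Iterable[Event], locatable_cells: Set[str]
) -> Set[str]:
    """Return subscribers with at least one event at a locatable cell."""
    return {e.subscriber for e in events if e.cell_id in locatable_cells}


events = [
    Event("a", "c1"),
    Event("b", "c2"),
    Event("a", "c9"),  # 'a' also has an event at a non-locatable cell
    Event("c", "c9"),  # 'c' only ever appears at a non-locatable cell
]
# Stand-in for the cell IDs that a spatial unit query would return.
locatable_cells = {"c1", "c2"}

result = locatable_unique_subscribers(events, locatable_cells)
print(sorted(result))  # ['a', 'b']
```

The same pre-filter could in principle sit in front of other non-location queries (such as IntereventInterval or TotalActivePeriodsSubscriber), so that their results are computed only over events that a subsequent location query would retain.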