bdit_data-sources
bdit_data-sources copied to clipboard
Reconcile Bluetooth routes that have changed over time
Offshoot of #62
Three things to consider:
- When route's analysis_id changed but the geometry did not
- When route's analysis_id changed and the geometry did
- Something else
- [x] For 1 there should be a lookup table for deprecated analysis_ids linking to the new segment_name. This should silently span those report changes for the historical Open Data release.
- [x]
bluetooth.report_active_dates
has been created to show start and end dates of routes.
There's a puzzle remaining with the routes on College, where new routes were created on 2017-09-17, but the old routes were never decommissioned. I thought the routes were the same, but there seems to be subtle differences in the 5-minute aggregation (with our filtering out of WiFi observations).
WITH old_routes AS (SELECT analysis_id, datetime_bin, segment_name, tt, obs
FROM bluetooth.aggr_5min
INNER JOIN bluetooth.segments USING (analysis_id)
WHERE segment_name IN ('CO_BA_CO_UN','CO_UN_CO_PA','CO_PA_CO_UN','CO_UN_CO_BA') AND duplicate)
, new_routes AS (SELECT analysis_id, datetime_bin, segment_name, tt, obs
FROM bluetooth.aggr_5min
INNER JOIN bluetooth.segments USING (analysis_id)
WHERE segment_name IN ('CO_BA_CO_UN','CO_UN_CO_PA','CO_PA_CO_UN','CO_UN_CO_BA') AND NOT duplicate)
SELECT Date_trunc('month', datetime_bin)::DATE AS mnth, segment_name, COUNT(1)
FROM old_routes
INNER JOIN new_routes USING (segment_name, datetime_bin)
WHERE new_routes.obs != old_routes.obs OR new_routes.tt != old_routes.tt
GROUP BY mnth, segment_name
ORDER BY mnth
The results of the above query, not aggregated by month are below
segment_name | count |
---|---|
CO_BA_CO_UN | 5835 |
CO_PA_CO_UN | 1486 |
CO_UN_CO_BA | 6255 |
CO_UN_CO_PA | 1421 |
I thought this might be due to some changing in processing over time (see #103, or since move_raw_data()
was created), but the differences persist until today.
I'm going to remove observations for the old duplicate College routes from bluetooth.observations
and park them in a duplicate_route_observations
table and then delete the aggregates based on them.
SELECT *
INTO bluetooth.duplicate_route_observations
FROM bluetooth.observations
INNER JOIN bluetooth.segments USING (analysis_id)
WHERE duplicate AND measured_timestamp >= '2017-09-14';
DELETE FROM bluetooth.observations
USING bluetooth.segments
WHERE segments.analysis_id = observations.analysis_id AND duplicate AND measured_timestamp >= '2017-09-14';
DELETE FROM bluetooth.aggr_5min aggr
USING bluetooth.segments
WHERE segments.analysis_id = aggr.analysis_id AND duplicate AND datetime_bin >= '2017-09-14';
DELETE FROM bluetooth.aggr_5min_alldevices aggr
USING bluetooth.segments
WHERE segments.analysis_id = aggr.analysis_id AND duplicate AND datetime_bin >= '2017-09-14';
DELETE FROM bluetooth.aggr_15min aggr
USING bluetooth.segments
WHERE segments.analysis_id = aggr.analysis_id AND duplicate AND datetime_bin >= '2017-09-14';
DELETE FROM bluetooth.aggr_15min_alldevices aggr
USING bluetooth.segments
WHERE segments.analysis_id = aggr.analysis_id AND duplicate AND datetime_bin >= '2017-09-14';
DELETE FROM bluetooth.aggr_15min_temp aggr
USING bluetooth.segments
WHERE segments.analysis_id = aggr.analysis_id AND duplicate AND datetime_bin >= '2017-09-14';
UPDATE bluetooth.all_analyses a
SET pull_data=FALSE
FROM bluetooth.segments s
WHERE duplicate AND a.analysis_id = s.analysis_id;
Dundas routes
The western end of Dundas is surprisingly confusing. My understanding based on the below table, which you can produce with the query at the bottom of this comment. The westernmost detector was at Roncesvalles, installed 2015-12-03. At the same time a detector was installed, from West to East at Sterling, Dovercourt, and Spadina. On 2017-09-19 detectors were installed at Dufferin (between Sterling and Dovercourt) and Bathurst (between Dovercourt and Spadina) On 2017-12-17 the detector at Roncesvalles was removed. This has left this strange situation where Sterling-Dovercourt and Dovercourt-Spadina overlap with Dufferin-Bathurst and Bathurst-Spadina (see map below table).
analysis_id | direction | start_crossstreet | end_crossstreet | report_name | start_date | end_date |
---|---|---|---|---|---|---|
1412879 | EB | Dovercourt | Spadina | DT- Dundas - AD3 to AD4 | 2015-12-03 | |
1453262 | EB | Dufferin | Bathurst | DT-0006. Dundas-EB_Dufferin-to-Bathurst | 2017-09-19 | |
1453239 | EB | Roncesvalles | Dufferin | DT-0005. Dundas-EB_Roncesvalles-to-Dufferin | 2017-10-05 | 2017-12-17 |
1412641 | EB | Roncesvalles | Sterling | DT- Dundas - AD1 to AD2 | 2015-12-03 | 2017-12-17 |
1412818 | EB | Sterling | Dovercourt | DT- Dundas - AD2 to AD3 | 2015-12-03 |
SELECT segment_name, analysis_id, direction, start_crossstreet,
end_crossstreet, a.report_name, start_date, end_date, length, bluetooth, wifi
FROM bluetooth.segments_dir_corrected segs
INNER JOIN bluetooth.all_analyses a USING (analysis_id)
JOIN bluetooth.report_active_dates USING(analysis_id)
WHERE street = 'Dundas'
ORDER BY direction, start_crossstreet
The below query identifies the analyses for which we've stopped receiving data and when.
SELECT a.analysis_id, a.report_name, segment_name,
start_date, end_date
FROM bluetooth.all_analyses a
LEFT OUTER JOIN bluetooth.segments seg USING(analysis_id )
JOIN bluetooth.report_active_dates USING(analysis_id)
WHERE pull_data
ORDER BY end_date, report_name, start_date
analysis_id | report_name | segment_name | start_date | end_date |
---|---|---|---|---|
1419415 | GTA - Lawrence AL1-AL2 EB | 2016-04-09 | 2017-04-19 | |
1419433 | GTA - Lawrence AL2-AL1 WB | 2016-04-09 | 2017-04-19 | |
1419444 | GTA - Sheppard AS1-AS2 EB | 2016-04-09 | 2017-07-05 | |
1419450 | GTA - Sheppard AS2-AS1 WB | 2016-04-09 | 2017-07-06 | |
1428103 | Shep Heron Hill to Pharmacy | 2017-04-19 | 2017-07-06 | |
1428150 | Shep Pharmacy to 401 | 2017-04-19 | 2017-07-06 | |
1428139 | Shep Pharmacy to Clydesdale | 2017-04-19 | 2017-07-06 | |
1428120 | Shep Pharmacy to Heron Hill | 2017-04-19 | 2017-07-06 | |
1427570 | VP 401 to Pharmacy | 2017-04-19 | 2017-07-06 | |
1428092 | VP Clydesdale to Pharmacy | 2017-04-19 | 2017-07-06 | |
1432987 | DTB - Adelaide to Richmond at Duncan and Simcoe | 2017-04-19 | 2017-07-21 | |
1432982 | DTB - Richmond -BR2-BR3 | RM_SB_RM_DN | 2016-10-01 | 2017-07-21 |
1432978 | DTB - Richmond -BR3-BR4 | RM_DC_RM_BA | 2016-10-01 | 2017-07-21 |
1432992 | DTB - Richmond to Adelaide at Duncan and Simcoe | 2017-04-19 | 2017-07-21 | |
1453239 | DT-0005. Dundas-EB_Roncesvalles-to-Dufferin | DU_RO_DU_DF | 2017-10-05 | 2017-12-17 |
1453507 | DT-0018. Dundas-WB_Dufferin-to-Roncesvalles | DU_DF_DU_RO | 2017-10-05 | 2017-12-17 |
1454832 | DT-0083. Roncesvalles-SB_Dundas-to-Queen | DU_RO_QU_RO | 2017-09-21 | 2017-12-17 |
1454853 | DT-0084. Roncesvalles-NB_Queen-to-Dundas | QU_RO_DU_RO | 2017-09-21 | 2017-12-17 |
1412641 | DT- Dundas - AD1 to AD2 | DU_RO_DU_SL | 2015-12-03 | 2017-12-17 |
1412684 | DT- Dundas - AD2 to AD1 | DU_SL_DU_RO | 2015-12-03 | 2017-12-17 |
1448452 | Miln Milner to 401Morn | 2017-08-16 | 2017-12-27 | |
1448469 | Morn 401Morn to McLevin | 2017-08-15 | 2017-12-27 | |
1448474 | Morn 401Morn to Meadowvale | 2017-08-15 | 2017-12-27 | |
1448479 | Morn 401Morn to Milner | 2017-08-16 | 2017-12-27 | |
1448484 | Morn 401Morn to Morningside | 2017-08-15 | 2017-12-27 | |
1448489 | Morn 401Morn to Neilson | 2017-08-15 | 2017-12-27 | |
1448494 | Morn 401Morn to Sheppard | 2017-08-15 | 2017-12-27 | |
1448499 | Morn McLevin to 401Morn | 2017-08-15 | 2017-12-27 | |
1448531 | Morn Morningside to 401Morn | 2017-08-15 | 2017-12-27 | |
1448567 | Morn Sheppard to 401Morn | 2017-10-30 | 2017-12-27 | |
1448665 | SE Meadowvale to 401Morn | 2017-08-16 | 2017-12-27 | |
1448715 | SE Neilson to 401Morn | 2017-08-16 | 2017-12-27 | |
1453667 | DT-0027. Queen-EB_Parliament-to-Broadview | QU_PA_QU_BV | 2017-10-02 | 2018-02-15 |
1453680 | DT-0028. Queen-WB_Broadview-to-Parliament | QU_BV_QU_PA | 2017-10-02 | 2018-02-15 |
1454293 | DT-0057. King-EB_Parliament-to-Broadview | KN_PA_KN_BV | 2017-10-02 | 2018-02-15 |
1454311 | DT-0058. King-WB_Broadview-to-Parliament | KN_BV_KN_PA | 2017-10-02 | 2018-02-15 |
1455710 | DT-0139. Broadview-SB_Queen-to-Eastern | QU_BV_EA_BV | 2017-10-02 | 2018-02-15 |
1455724 | DT-0140. Broadview-NB_Eastern-to-Queen | EA_BV_QU_BV | 2017-10-02 | 2018-02-15 |
- The routes without a
segment_name
were used for different travel time studies, and haven't yet been coded in tosegments
so won't investigate right now, see #102 - Richmond has already been looked at, detectors were moved and new routes created
- Dundas has been investigated above
- Richmond & Duncan detector appears to have been removed but Adelaide one has stayed up.
- Broadview detector is down
Removing observations from the Sherbourne-Parliament segments of Richmond and Adelaide because they are wholly contained by the newer King Street Pilot Jarvis-Parliament segments, which are also longer.
I'm leaving the other Richmond and Adelaide routes alone because, while they overlap in time, they are very staggered in space.
WITH ,insertion as (INSERT INTO bluetooth.duplicate_route_observations(
analysis_id, id, user_id, measured_time, measured_time_no_filter, startpoint_number, startpoint_name, endpoint_number, endpoint_name, measured_timestamp, outlier_level, cod, device_class)
SELECT analysis_id, id, user_id, measured_time, measured_time_no_filter, startpoint_number, startpoint_name, endpoint_number, endpoint_name, measured_timestamp, outlier_level, cod, device_class
FROM bluetooth.observations
WHERE analysis_id IN (1427970, 1427959) AND measured_timestamp >= '2017-10-02'
RETURNING id)
DELETE FROM bluetooth.observations
USING insertion
WHERE observations.id = insertion.id;
DELETE FROM bluetooth.observations
USING insertion
WHERE observations.id = insertion.id;
The reader at College/Bathurst was last operational on October 10th, however it's still marked as operational among multiple sources. We should fix this in our map, spreadsheet and database tables, and find out if other readers, marked as operational, aren't.