center-randomize
Calculating route distance (driving as mode of transit) instead of Haversine distance
This PR implements route distance, with driving as the mode of transit, in place of Haversine distance. Route distance is calculated using the distancematrix.ai API (similar to the Google Maps API for finding the best route), which allows 1000 requests for free.
Added calculate_route_distance.py, which calculates the route distance from every school to every center and saves it in a distance.tsv file in the results folder. The result is saved because calculating route distance is slow, owing to the large number of API requests. Use the accurate API for better accuracy (I used the fast API for speed).
Added school_center_using_route_distance.py (so as not to affect the original file), which uses the previously calculated route distance when considering the distance between schools and centers.
Result: when testing on a small set of schools, the allocated students were less fragmented than with Haversine distance (screenshots for reference).
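A minimal sketch of the precompute-and-save step described above, not the PR's actual code. The column names, the `write_distance_table` helper and the `distance_km` callback are all hypothetical; the callback stands in for whatever function wraps the routing API, so no real API call is shown here.

```python
import csv
import io
from typing import Callable, Iterable, TextIO, Tuple

Point = Tuple[str, float, float]  # (id, latitude, longitude); assumed layout


def write_distance_table(
    schools: Iterable[Point],
    centers: Iterable[Point],
    distance_km: Callable[[float, float, float, float], float],
    out: TextIO,
) -> int:
    """Write one TSV row per (school, center) pair and return the row count."""
    centers = list(centers)  # reused for every school
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["school_id", "center_id", "distance_km"])  # hypothetical header
    rows = 0
    for s_id, s_lat, s_lon in schools:
        for c_id, c_lat, c_lon in centers:
            # One call per pair: this is where the slow API request would happen.
            d = distance_km(s_lat, s_lon, c_lat, c_lon)
            writer.writerow([s_id, c_id, f"{d:.3f}"])
            rows += 1
    return rows


# Usage with a fake distance function instead of a real API client.
demo_out = io.StringIO()
demo_rows = write_distance_table(
    [("s1", 27.70, 85.30), ("s2", 27.71, 85.32)],
    [("c1", 27.72, 85.33)],
    lambda lat1, lon1, lat2, lon2: 1.5,
    demo_out,
)
```

Because every pair costs one request, the number of rows written equals the number of API calls, which is why the table is computed once and reused.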
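Roughly, reusing a saved table could look like the sketch below. It assumes a TSV with `school_id`, `center_id` and `distance_km` columns (hypothetical names, not confirmed by the PR), and it falls back to Haversine distance when a pair is missing from the table; that fallback is an illustrative choice, not something the PR states.

```python
import csv
import io
import math
from typing import Dict, TextIO, Tuple

Point = Tuple[str, float, float]  # (id, latitude, longitude); assumed layout


def load_route_distances(src: TextIO) -> Dict[Tuple[str, str], float]:
    """Read the precomputed table into a {(school_id, center_id): km} dict."""
    reader = csv.DictReader(src, delimiter="\t")
    return {(r["school_id"], r["center_id"]): float(r["distance_km"]) for r in reader}


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def distance_between(table: Dict[Tuple[str, str], float], school: Point, center: Point) -> float:
    """Prefer the cached route distance; otherwise use Haversine (assumed fallback)."""
    key = (school[0], center[0])
    if key in table:
        return table[key]
    return haversine_km(school[1], school[2], center[1], center[2])


# Usage with an in-memory table standing in for results/distance.tsv.
demo_table = load_route_distances(
    io.StringIO("school_id\tcenter_id\tdistance_km\ns1\tc1\t2.750\n")
)
```

Keeping the lookup behind one function means the allocation logic does not need to know which kind of distance it received.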
Using Route Distance:
Using Haversine Distance
- Revert back to original name of school_center.py.
Thanks for the guide, but this method is very slow because of the API calls it has to make: a total of 57,024 requests are needed for the given sample data, which would probably take a whole day. So until this PR is merged, I decided to go with a separate file.
Distancematrix.ai gives only 1000 requests for free. It would be easier with the Google Maps API, but that is paid.
Students will need to travel from where they live. Distance from school was probably used because it is the only feasible approximation. So I wonder whether a precise method like this is warranted here, especially given the cost of implementation. Also, this uses driving distance; a more appropriate measure would be the route and availability of public transport. Just offering a different perspective. This is an interesting solution nonetheless, and thank you for contributing.
The threshold distance is set to 2 km, so public transport not being available is not much of an issue.
The calculation can be sped up by using the Google Maps API, which is much faster (but paid), and by distributing the task across processes running in parallel. With 10 processes running simultaneously, that is just 5,702.4 calls per process (57,024 / 10), which would complete in approximately 1 hour. The calculation needs to be done only once and the result can be reused.
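The per-process figure can be checked with a small sketch. `split_evenly` is a hypothetical helper, and the 10-worker count is the one proposed in the comment above; whole calls cannot be split, so some workers take one extra.

```python
from typing import List


def split_evenly(total: int, workers: int) -> List[int]:
    """Split `total` tasks across `workers`, giving the remainder to the first few."""
    base, extra = divmod(total, workers)
    return [base + 1 if i < extra else base for i in range(workers)]


TOTAL_CALLS = 57_024  # request count quoted for the sample data
WORKERS = 10          # process count proposed in the comment

calls_per_worker = split_evenly(TOTAL_CALLS, WORKERS)
average_calls = TOTAL_CALLS / WORKERS
```

Here `average_calls` is 5702.4, so four workers make 5,703 calls and six make 5,702. Finishing in about an hour would require each process to sustain roughly 1.6 calls per second, which depends on the API's latency and rate limits.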
The problem is that this is just sample data. More API calls would need to be made for the actual data, and there are around 35,000 schools (rough estimate) in Nepal. I don't think this approach is economically feasible, and transport limitation remains a factor in some rural areas.