Calculating route distance (driving as mode of transit) instead of Haversine distance

Open ArunShresthaa opened this issue 10 months ago • 4 comments

This PR implements route distance, with driving as the mode of transit, in place of Haversine distance. Route distance is calculated using the distancematrix.ai API (similar to the Google Maps API for finding the best route), which allows 1000 free requests.

Added calculate_route_distance.py, which calculates the route distance from every school to every center and saves it to a distance.tsv file in the results folder. The results are saved because calculating route distances is slow: it requires numerous API requests. Use the accurate API for better accuracy (I used the fast API for speed).
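For illustration, the precompute-and-save step could look roughly like the sketch below. This is not the code from the PR: the function name `write_distance_table`, the column names, and the `fetch_distance_km` callable (standing in for the actual API request) are all assumptions made for the example.

```python
import csv
import io


def write_distance_table(schools, centers, fetch_distance_km, out):
    """Write one TSV row per (school, center) pair and return the row count.

    schools / centers: iterables of (id, latitude, longitude) tuples.
    fetch_distance_km: hypothetical callable wrapping the routing API call;
        it takes two (lat, lon) tuples and returns a distance in km.
    out: any writable text file object.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Column names are assumed for this sketch, not taken from the PR.
    writer.writerow(["school_id", "center_id", "distance_km"])
    rows = 0
    for s_id, s_lat, s_lon in schools:
        for c_id, c_lat, c_lon in centers:
            km = fetch_distance_km((s_lat, s_lon), (c_lat, c_lon))
            writer.writerow([s_id, c_id, f"{km:.3f}"])
            rows += 1
    return rows


def fake_fetch(origin, destination):
    # Stand-in for the real API request so the sketch runs offline.
    return 1.5


buf = io.StringIO()
n = write_distance_table(
    [("s1", 27.70, 85.30), ("s2", 27.71, 85.31)],
    [("c1", 27.72, 85.32)],
    fake_fetch,
    buf,
)
```

Passing the API call in as a function keeps the slow, rate-limited part separate from the file writing, so the loop can be tested without spending any of the free requests.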

Added school_center_using_route_distance.py (so that the original file is not affected), which uses the previously calculated route distances when considering the distance between schools and centers.
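A minimal sketch of how the saved distances might be consumed, assuming a TSV with `school_id`, `center_id`, and `distance_km` columns (the real layout of distance.tsv may differ). The fallback to Haversine distance for pairs missing from the table is an assumption of this example, not something the PR states.

```python
import csv
import io
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def load_route_distances(tsv_file):
    """Read the precomputed table into a {(school_id, center_id): km} dict."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return {
        (row["school_id"], row["center_id"]): float(row["distance_km"])
        for row in reader
    }


def distance_km(school, center, route_table):
    """Prefer the cached route distance; fall back to Haversine if absent.

    school / center: (id, latitude, longitude) tuples.
    """
    key = (school[0], center[0])
    if key in route_table:
        return route_table[key]
    return haversine_km(school[1], school[2], center[1], center[2])


# Tiny in-memory table standing in for the saved file.
sample = io.StringIO("school_id\tcenter_id\tdistance_km\ns1\tc1\t2.500\n")
table = load_route_distances(sample)
```

Loading the table once into a dictionary means the allocation loop never touches the API, which is the point of calculating the distances ahead of time.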

Result: when testing on a small set of schools, the allocated students were less fragmented than with Haversine distance (screenshots for reference).

Using Route Distance: image

Using Haversine Distance: image

ArunShresthaa avatar Apr 24 '24 17:04 ArunShresthaa

  • Revert back to original name of school_center.py.

Thanks for the guide, but this method is very slow because of the API calls involved: a total of 57,024 requests need to be made for the given sample data, which would probably take a whole day. So until this PR is merged, I decided to go with a separate file.

Distancematrix.ai gives only 1000 requests for free. It would be easier with the Google Maps API, but it's paid.

ArunShresthaa avatar Apr 25 '24 03:04 ArunShresthaa

Students will need to travel from where they live. Distance from school was probably used because it is the only feasible approximation. So I wonder whether a precise method like this is warranted here, especially given the cost of implementation. Also, this takes driving distance; a more appropriate measure would be the route and availability of public transport. Just offering a different perspective. This is an interesting solution nonetheless, and thank you for contributing.

sapradhan avatar Apr 25 '24 17:04 sapradhan

Students will need to travel from where they live. Distance from school was probably used because it is the only feasible approximation. So I wonder whether a precise method like this is warranted here, especially given the cost of implementation. Also, this takes driving distance; a more appropriate measure would be the route and availability of public transport. Just offering a different perspective. This is an interesting solution nonetheless, and thank you for contributing.

The threshold distance is set to 2 km, so public transport not being available is not much of an issue.

The calculation can be sped up by using the Google Maps API, which is much faster (paid), and by distributing the task and running it in parallel. If 10 processes run simultaneously, that is just 5,702.4 calls per process (57,024 / 10), which would complete in approximately 1 hour. The calculation needs to be done only once, and the results can be reused.
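To make the arithmetic concrete, here is a rough sketch of the split. It uses a thread pool rather than separate processes, since the work is waiting on network responses; that substitution, the function name `fetch_all`, and the `fetch_distance_km` callable are assumptions for illustration only. Finishing in about an hour implies a budget of roughly 0.63 s per call on each worker.

```python
from concurrent.futures import ThreadPoolExecutor

TOTAL_CALLS = 57_024  # requests needed for the sample data
WORKERS = 10

calls_per_worker = TOTAL_CALLS / WORKERS
# Time each call may take if every worker is to finish within one hour.
seconds_per_call = 3600 / calls_per_worker


def fetch_all(pairs, fetch_distance_km, workers=WORKERS):
    """Fetch distances for all (school, center) pairs concurrently.

    fetch_distance_km is a hypothetical callable wrapping the API request;
    it receives one school and one center and returns a distance in km.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input pairs.
        distances = pool.map(lambda pair: fetch_distance_km(*pair), pairs)
        return dict(zip(pairs, distances))


def fake_fetch(school, center):
    # Offline stand-in for the real request.
    return 1.0


pairs = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
result = fetch_all(pairs, fake_fetch)
```

Whether 10 concurrent workers are allowed at all depends on the API's rate limits, which this sketch does not model.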

ArunShresthaa avatar Apr 25 '24 17:04 ArunShresthaa

The problem is that this is just sample data. More API calls would be needed for the actual data, and there are around 35,000 schools in Nepal (rough estimate). I don't think this approach is economically feasible, while transport limitations remain a factor in some rural areas.

LuluW8071 avatar Apr 26 '24 00:04 LuluW8071