geocoder
geocoder copied to clipboard
Proposed solution to the AR includes problem
I would like to propose the following solution to the AR includes problem (see 'Known Issue' at the end of the README), and to open a discussion with Alex and the geocoder community.
Proposed solution:
- Remove the
:select
from the AR query. - Instead, add
distance
andbearing
instance methods to the AR model, to be calculated application-side. The formulae would translate easily into pure Ruby code.
This would primarily enable the includes
scope to work correctly, but would also improve performance:
- The bearing is currently always calculated, but may not be required.
- The database must currently calculate the distance twice per row: once for the
SELECT
and again for theORDER BY
. - When used in conjunction with pagination, many of the calculated distances and bearings are discarded if they don't appear on the current page, only to be recalculated when the page changes.
Comments welcome.
You're right that calculating distance twice per row is sub-optimal. However, unless I'm missing something, doesn't this require you to load all geocoded objects from the database to find which are within a given radius? I think instantiating ActiveRecord objects is a much worse performance problem than linearly increasing the number of DB calculations.
Also, doesn't this kind of kill the scope
functionality? Let me know if I'm missing something.
Removing the :select
would stop the query from returning the distance
and bearing
columns, but it would still be possible to filter by radius and sort by distance.
The scope functionality would work as expected:
@items = Item.recent.near('London', 50).includes(:user).paginate
<% @items.each do |item| %>
<%= item.user.name # Eager loaded %>
<%= item.description %>
<%= item.distance # Calls the distance method on Item %>
<%= item.bearing # Calls the bearing method on Item %>
<% end %>
Interesting. Would item.distance
and item.bearing
actually work as above or would they need to take an argument? Requiring an argument (basically using distance_to
or bearing_to
instead) isn't a problem in itself but it would break current implementations so we'd need a plan for a gentle transition.
It's a good idea...let's keep discussing.
Ideally, it would retain compatibility by requiring no argument and would read the origin location from the near
scope. I'm not sure whether it is possible to access the scope from an object (item
) or a collection of objects (@items
). Do you know?
I've just had a quick play, and it seems to be possible:
class Item < ActiveRecord::Base
scope :test, where(:name => 'Test')
end
> items = Item.test
> items.where_values_hash
=> {:name=>"Test"}
or
> items = Item.test
> items.scope_for_create
=> {:name=>"Test"}
I'm not sure of the difference at this stage.
Offhand, I can't think of a way to do it. Once the scope is "executed" an array of objects is returned and I don't know of a way to add data to those objects except with a select
in the scope. I'm not sure there isn't a way, but I don't know of one right now.
It's not necessarily a showstopper, though. We could introduce deprecation warnings and plan on changing the implementation in version 1.1.
My next task is to move the tests from geocoder_test (the ones that run against a real database) into the main geocoder repository and make it easier to run them against the various back-end stores. When that's done let's work to make sure related functionality is thoroughly tested and then start changing the implementation.
Posted that last comment before I saw yours. As you've shown, it's possible to get information about the scope from the scope itself, but how do we give each of the returned objects access to that scope? Seems like we need to inject something into each one...
+1 on this, from my digging around it seems using select at all is just asking for problems - too many parts of AR modify/replace/depend on the select
@jerrett: Do you have any ideas for accessing distance and bearing calculations for each returned object?
Good ideas? Not really, but you could probably get away with a subquery in a :conditions.. :D
I'll give it some thought and poke around a bit - doing it in the model as suggested doesn't seem terrible if that's possible, but for sure if it's possible to do it in sql (esp. if it could be done once for order and selection) does seem ideal...
and that won't work either, cause it'd be in the where :|
I have the same problem using with AR.includes
Restaurant.includes(:dishes).near(lat, lng, 100).where("dishes.name = ?", dish_name)
Geocorder doesn't add the distance column in the query if I add AR.includes.
How can I use near and includes together? I would like to retrieve all restaurants near to a specific lat and lng AND containing a specific dish ORDERED by distance.
@phstc: I think you could get away with what you are trying to do without includes, but with joins (which would actually be a more optimized version of the initial query, since you're trying to filter based on a relationship.
@kevinelliott In my case I used the workaround bellow
selected_dish = Dish.find_by_name(dish_name).first
selected_dish.restaurants.near(lat, lng, 100)
It works as a join.
Not sure if this helps or not (http://stackoverflow.com/a/9505584/1116573) because it sounds like you can mix a custom sql statement with the active relation stuff.
Also can be found here (http://apidock.com/rails/ActiveRecord/AssociationPreload/ClassMethods/preload_associations#note_content-158), but it says deprecated and I can't find what it has been replaced by.
If someone can suggest what the new syntax should be then maybe it could be suggested as a workaround if it can't be worked into the gem? I'm a bit new to rails so hopefully I'm not just talking twaddle.
as @amnesia7 suggested, it seems we can manually add an includes
query without messing the original query :
What we want (but doesn't work) :
City.near("Omaha, NE", 20).includes(:venues)
Workaround with joins + select :
City.near("Omaha, NE", 20, :select => "cities.*, venues.*").joins(:venues)
Workaround with eager-loading (a la includes
) :
(Rails 3.1+)
query = City.near("Omaha, NE", 20)
ActiveRecord::Associations::Preloader.new(query, [:venues]).run
The downside is you can't add any conditions on venues fields in the original query. In this case, the only solution is to use both joins + select for conditions, and ActiveRecord::Associations::Preloader for eager-loading
+1
+1
+1
Any progress here?
While we wait, this is how I get around the problem of distance
not existing when you includes
:
# First initialization of the ARel chain
@museums = Museum.active.geocoded.near([latitude, longitude], distance)
# Later filtered by category (optionally), which uses a join
@museums = @museums.joins(:categories).where(:categories => { :id => category_ids })
... more stuff perhaps ...
# Now right before we return flow to the view, let's force AR to preload our "includes" so we're eager loading
ActiveRecord::Associations::Preloader.new(@museums, [:categories, :hours]).run
So as long as you use joins
and relationships to query your data in completion without an includes
, and then at the very end you preload the data, then you're able to use distance
in any of your conditions. The key here is that you're doing all of your conditions without includes BEFORE ARel compiles and executes, followed by an explicit preload run.
Thanks @kevinelliott !
+1
I found out today that you don't have to overwrite the whole SELECT
clause when you want to create your own custom named columns to the retrieved result set.
I tried this simple tweak to the current lib/geocoder/stores/active_record.rb
file, scope :near
definition and it works as suspected, not overwriting my own select clauses and giving me the column name for later using with ORDER BY clauses :)
diff --git a/lib/geocoder/stores/active_record.rb b/lib/geocoder/stores/active_record.rb
index 7c48653..c369fd4 100644
--- a/lib/geocoder/stores/active_record.rb
+++ b/lib/geocoder/stores/active_record.rb
@@ -30,25 +30,26 @@ module Geocoder::Store
##
# Find all objects within a radius of the given location.
# Location may be either a string to geocode or an array of
# coordinates (<tt>[lat,lon]</tt>). Also takes an options hash
# (see Geocoder::Store::ActiveRecord::ClassMethods.near_scope_options
# for details).
#
scope :near, lambda{ |location, *args|
latitude, longitude = Geocoder::Calculations.extract_coordinates(location)
if Geocoder::Calculations.coordinates_present?(latitude, longitude)
options = near_scope_options(latitude, longitude, *args)
- select(options[:select]).where(options[:conditions]).
+ scoped.select_values << options[:select]
+ where(options[:conditions]).
order(options[:order])
else
# If no lat/lon given we don't want any results, but we still
# need distance and bearing columns so you can add, for example:
# .order("distance")
select(select_clause(nil, "NULL", "NULL")).where(false_condition)
end
}
##
# Find all objects within the area of a given bounding box.
# Bounds must be an array of locations specifying the southwest
Maybe this could be a resolution to the age-old problem with custom select difficulties?
Any progress on this? I'm not sure if this has anything to do with polymorphic but that would be nice feature. So you could make a polymorphic Location class and then add a has_one to and class you need to have locations.
order by distance
#on your model def self.order_by_distance(latitude, longitude) #https://github.com/alexreisner/geocoder/blob/8c000fdf7ee7a8bac9b9bb17429e94d2b371f4da/lib/geocoder/stores/active_record.rb#L155 distance = distance_sql(latitude, longitude, order: "distance") near([latitude, longitude], 1000000, order: "(#{distance}) ASC") end
I haven't got access to an app with geocoder to test this at the moment but I thought I'd throw it out there incase someone else can check it.
I remembered about this issue and have recently used .select()
, .joins()
and .includes()
all together in the same ActiveRecord call, ie
Post.select('extra-fields-here').joins(:author).includes(:author)
so I thought someone might want to check if it could resolve the issue here as well since for me it seems to have given me the benefit of overriding select, joining tables and eager loading in one go. I'm not sure if it worked by fluke or design but it has worked in my case, although unrelated to geocoder, and I'd be interested to know if it helps here.
Disregard my comment above. Although it allows me to include extra fields in the .select()
to use for ordering the results they are not callable in the view so if distance was included in the select the view would not be able to output it.
You could try .joins(:categories).preload(:categories)
http://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-preload
Thanks @amnesia7 , it works, though it sucks in terms of performance (N+1 problem).
@rubyconvict maybe try .joins(:categories).eager_load(:categories)
instead.
No, I tried, this one is not any better (N+1) and I have one search scenario where it causes "ERROR: PG::Error: ERROR: column "distance" does not exist".