squeel icon indicating copy to clipboard operation
squeel copied to clipboard

best way to do a self join?

Open ashanbrown opened this issue 11 years ago • 9 comments

Sorry but I couldn't find the answer to this in the readme: Is this the best way to do a self join in squeel?

joins { MyClass.as(same_table).on {...} }

This generates an nested subquery, so I'm wondering if there is a better way. I'm not sure if sql optimizes out the subquery so I'm a little worried about this. Thanks!

ashanbrown avatar Sep 09 '14 21:09 ashanbrown

Would you please explain more about self join? Thank you very much, I guess you could use an association in model to join self :)

bigxiang avatar Sep 10 '14 03:09 bigxiang

Sure. The most common case where I use for a self join is something that I can currently write like this:


class Event < ActiveRecord::Base
  scope :first_time, -> {
    joins { Event.where {}.as(first_events).
        on { (first_events.user_id == events.user_id) & (first_events.time < events.time) }.outer }.
        where { first_events.id == nil }
  }
end

ashanbrown avatar Sep 10 '14 03:09 ashanbrown

I think we can avoid self join from different levels.

First, it's better to add a column like first_time_flag, and set the value after a user creates the event first time, so we easily find all events that are created first time.

Second, if we can't add a column, we can use an aggregation instead of the complicated join.

class Event < ActiveRecord::Base
  scope :first_time, -> do 
    first_time_for_users_relation = Event.group { user_id }.select { MIN(time) }
    where { time.in(first_time_for_users_relation) }
  end
end

Third, I really don't like a time comparison personally because I'm not sure what the precision of time is and whether there is a dump time, Could id represent the sequence of events are created? If so, I suggest that we use id directly.

Hope it helps :)

bigxiang avatar Sep 10 '14 03:09 bigxiang

I your suggestions all design decisions that I can understand, but in the end they don't help me answer my questions, which is how to best represent a self join. ;) I think there are plenty of valid reasons to do a self join but I don't really want to debate them here. I wonder why I can't write something like this:

class Event < ActiveRecord::Base
  scope :first_time, -> {
    joins { as(first_events).
        on { (first_events.user_id == user_id) & (first_events.time < time) }.outer }.
        where { first_events.id == nil }
  }
end

This incorporates features such as self providing an as and columns being assumed to belong to self when no other table is indicated (rather than to first_events, as I believe the are now). In general, I can't seem to get an as when I want one; for example, on an association. I also would like to be able to apply a scope to an association but that doesn't seem to be supported (add a scope to an association chain and squeel assumes it is another association).

ashanbrown avatar Sep 10 '14 03:09 ashanbrown

I'm also running into this problem - and was rather hoping I could alias table names as I see fit. Not really sure why I can't. I've dumbed this query down to provide a minimal example - my real query is quite a bit more complicated and self-joins a couple of times.

Let's say that you have a students table, and any given student can be the mentor of another student - so you have the mentored_by_id foreign key to self-join the table to itself. Let's further suppose that you want to ask the question "does this student have a mentor relationship with a mentor?" (IE - is there a 3-student mentor-to-mentee chain, starting at student X). The squeel should look like this:

Student.where(id: 10).joins{
  students.as(mentee1).on{
    students.id == mentee1.mentored_by_id
  }
}.joins{
  students.as(mentee2).on{
    mentee1.id == mentee2.mentored_by_id
  }
}.exists?

This is fairly close to my use-case except I have multiple join conditions for each relationship (using just relationships with a nested keypath like students.mentee.mentee wouldn't work... and reads oddly).

The only problem, of course, is that I can't alias a table name like this. I get the error:

only can convert ActiveRecord::Relation to a join node

I have attempted to do it this way (where mentee is the name of the rails association to self-join):

Student.where(id: 10).joins{
  student = self
  joining_student = student.mentee
  joining_student.on{student.id == joining_student.mentored_by_id}
}.references(:all)

But of course, that didn't quite work, it fails with this error:

Association named 'on' was not found on Student; perhaps you misspelled it?

Remember, my join condition is more complicated than just id-to-foreign key, so straight keypaths isn't going to work either, and it looks like on is not supported for keypaths to customize the JOIN ON conditions which I was attempting to do here.

nzifnab avatar Sep 17 '14 04:09 nzifnab

@nzifnab I am not sure your issue is the same as @dontfidget 's. Maybe you could provide a minimal gist in another issue thread to show your real goal including the schema, models, your approach, final query you want?

bigxiang avatar Oct 26 '14 15:10 bigxiang

@dontfidget I will keep an eye on your issue to see if I can provide a MyClass.as() and I'm not very sure what this means:

I also would like to be able to apply a scope to an association but that doesn't seem to be supported (add a scope to an association chain and squeel assumes it is another association)

bigxiang avatar Oct 26 '14 15:10 bigxiang

I would love to be able to do a table alias, specifically for a dependent subquery. I would like to run something like this:

    Appointment.where do
      appointment_id == Appointment.as('client_appointments').select(:appointment_id).where do
        (client_appointments.client_id == appointments.client_id)
      end.order(starts_at: :asc).limit(1)
    end

Any progress on the table aliases?

CyborgMaster avatar May 15 '15 00:05 CyborgMaster

So I've got a workaround for now. I can write the subquery like this:

      appointment_id == Appointment.joins('as client_appointments')
      .select(:appointment_id).where do
        (client_appointments.client_id == appointments.client_id) &
          (client_appointments.source_id == appointments.source_id)
      end.order(starts_at: :asc).limit(1)

Really hacky, but it works for now.

CyborgMaster avatar May 15 '15 00:05 CyborgMaster