Add way to generate DataFrame from active_record with aggregated fields
Recently I ran into a speed problem when using daru, and found that one way to speed things up was to do more of the work in the database: grouping and aggregating in ActiveRecord/the database instead of on the dataframe.
Here is what I wanted to use:
active_record = Provider.left_joins(zip: :district).group(:id)
However, I found out that I cannot pass a field like ANY_VALUE(district.id), because it gets converted to a symbol in from_activerecord, and pluck subsequently tries to resolve it as table.column.
(At least that's how I understand it works.)
So, we found a way to bypass this, and I was thinking about adding it to daru, as something like this:
# Load dataframe from AR::Relation
#
# @param relation [ActiveRecord::Relation] A relation to be used to load the contents of dataframe
# @param fields [Array] Field names, or SQL expressions when with_sql_methods is true
# @param with_sql_methods [Boolean] Enables giving fields with SQL methods
#
# @return A dataframe containing the data in the given relation
def from_activerecord(relation, *fields, with_sql_methods: false)
  fields = relation.klass.column_names if fields.empty?
  # Keep SQL expressions as strings; symbols would be treated as column names
  fields = if with_sql_methods
             fields.map(&:to_s)
           else
             fields.map(&:to_sym)
           end

  result = relation.pluck(*fields).transpose
  Daru::DataFrame.new(result, order: fields).tap(&:update)
end
Now I can create a new DataFrame like this:
data_frame = Daru::DataFrame.from_activerecord(active_record,
                                               ["ANY_VALUE(district.id)"],
                                               with_sql_methods: true)
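Two details may be worth handling in a real implementation: the fields can arrive wrapped in an array (as in the call above, where an array is passed to a splat parameter), and pluck returns a flat array rather than an array of rows when only one column is requested, which transpose cannot process. Below is a minimal sketch of how that could be handled, using only the Ruby standard library. The helpers normalize_fields and rows_to_columns are hypothetical names made up for illustration; they are not part of daru or daru-io.

```ruby
# Hypothetical helper (not part of daru): normalizes the field list.
# Accepts both f("a", "b") and f(["a", "b"]) style input via flatten.
def normalize_fields(fields, with_sql_methods: false)
  fields = fields.flatten
  # SQL expressions stay strings; plain column names become symbols
  with_sql_methods ? fields.map(&:to_s) : fields.map(&:to_sym)
end

# Hypothetical helper (not part of daru): turns pluck output into columns.
# pluck yields a flat array for one column and an array of rows for several.
def rows_to_columns(plucked, field_count)
  return [plucked] if field_count == 1
  # An empty result still needs one (empty) column per field
  return Array.new(field_count) { [] } if plucked.empty?

  plucked.transpose
end

# Literal arrays stand in for what relation.pluck would return
fields = normalize_fields([["ANY_VALUE(district.id)", "COUNT(*)"]],
                          with_sql_methods: true)
columns = rows_to_columns([[1, 10], [2, 20]], fields.size)
```

The resulting columns could then be passed to Daru::DataFrame.new with order: fields, in the same way the proposed method does.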
What do you think about that?
@janmpeterka - Thanks for this feature request suggestion! 🎉
You'd have to contribute this to the daru-io repository. The ActiveRecord importer currently lives there, and it supports only plain field names, not SQL methods. You could probably add a has_sql_methods flag / keyword argument; the rest of the logic you will likely find in the existing importer itself 😄
Would you like to contribute this feature, @janmpeterka?
Thanks, I will look into it. Not sure if I will be able to write the implementation myself, quite new to Ruby :)
Well, daru-io is quite inactive, so no contributing there.