lazy load agate

Open dwreeves opened this issue 1 year ago • 16 comments

TLDR: lazy-loading agate speeds up dbt's load time by about 3.5%. dbt only uses agate in a select few situations, so most agate imports are unnecessary at load time and can be placed behind TYPE_CHECKING.
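
For readers unfamiliar with the pattern, here is a minimal sketch of what placing agate behind TYPE_CHECKING can look like. It is not the code from this PR: `build_table` is a hypothetical helper made up for illustration, and the only agate call assumed is the `agate.Table(rows, column_names)` constructor.

```python
from __future__ import annotations  # annotations stay as strings at runtime

import sys
from typing import TYPE_CHECKING, Any, Sequence

if TYPE_CHECKING:
    # Seen by type checkers only; this branch never executes at runtime.
    import agate


def build_table(rows: Sequence[Sequence[Any]], column_names: Sequence[str]) -> agate.Table:
    """Hypothetical helper: agate is imported only when a table is actually built."""
    import agate  # deferred import, paid on first call instead of at module load

    return agate.Table(rows, column_names)


# Defining the function does not import agate; only calling it would.
agate_loaded = any(name == "agate" or name.startswith("agate.") for name in sys.modules)
print(agate_loaded)
```

The annotation still reads `agate.Table` for type checkers, while the import cost is deferred to the few code paths that actually need a table.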

This is already implemented in dbt-adapters and dbt-core:

  • https://github.com/dbt-labs/dbt-adapters/pull/126
  • https://github.com/dbt-labs/dbt-core/pull/9744

Note: I usually run the following check against each adapter. It should print an empty list:

import sys
import dbt.adapters.databricks.impl
print([i for i in sys.modules if "agate" in i])

However, because dbt-databricks has not yet been migrated to the dbt-core 1.8 ecosystem, this check is not successful:

>>> print([i for i in sys.modules if "agate" in i])
['agate.exceptions', 'agate.csv_py3', 'agate.aggregations.base', 'agate.data_types.base', 'agate.data_types.boolean', 'agate.data_types.date', 'agate.data_types.date_time', 'agate.data_types.number', 'agate.data_types.text', 'agate.data_types.time_delta', 'agate.data_types', 'agate.aggregations.all', 'agate.aggregations.any', 'agate.warns', 'agate.utils', 'agate.aggregations.count', 'agate.aggregations.has_nulls', 'agate.aggregations.percentiles', 'agate.aggregations.deciles', 'agate.aggregations.first', 'agate.aggregations.iqr', 'agate.aggregations.median', 'agate.aggregations.mad', 'agate.aggregations.max', 'agate.aggregations.max_length', 'agate.aggregations.max_precision', 'agate.aggregations.sum', 'agate.aggregations.mean', 'agate.aggregations.min', 'agate.aggregations.mode', 'agate.aggregations.quartiles', 'agate.aggregations.quintiles', 'agate.aggregations.variance', 'agate.aggregations.stdev', 'agate.aggregations.summary', 'agate.aggregations', 'agate.mapped_sequence', 'agate.columns', 'agate.computations.base', 'agate.computations.change', 'agate.computations.formula', 'agate.computations.percent', 'agate.computations.percent_change', 'agate.computations.rank', 'agate.computations.percentile_rank', 'agate.computations.slug', 'agate.computations', 'agate.config', 'agate.rows', 'agate.type_tester', 'agate.table.aggregate', 'agate.table.bar_chart', 'agate.table.bins', 'agate.table.column_chart', 'agate.table.compute', 'agate.table.denormalize', 'agate.table.distinct', 'agate.table.exclude', 'agate.table.find', 'agate.table.from_csv', 'agate.fixed', 'agate.table.from_fixed', 'agate.table.from_json', 'agate.table.from_object', 'agate.tableset.aggregate', 'agate.tableset.bar_chart', 'agate.tableset.column_chart', 'agate.tableset.from_csv', 'agate.tableset.from_json', 'agate.tableset.having', 'agate.tableset.line_chart', 'agate.tableset.merge', 'agate.tableset.print_structure', 'agate.tableset.proxy_methods', 'agate.tableset.scatterplot', 'agate.tableset.to_csv', 
'agate.tableset.to_json', 'agate.tableset', 'agate.table.group_by', 'agate.table.homogenize', 'agate.table.join', 'agate.table.limit', 'agate.table.line_chart', 'agate.table.merge', 'agate.table.normalize', 'agate.table.order_by', 'agate.table.pivot', 'agate.table.print_bars', 'agate.table.print_html', 'agate.table.print_structure', 'agate.table.print_table', 'agate.table.rename', 'agate.table.scatterplot', 'agate.table.select', 'agate.table.to_csv', 'agate.table.to_json', 'agate.table.where', 'agate.table', 'agate.testcase', 'agate', 'dbt.clients.agate_helper']

So, the actual code change is not blocked by #619, but it won't "work" until #619 is resolved.
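
The check above depends on what the current interpreter has already imported, so it can give a misleading result in a long-running session. A rough sketch of the same check run in a fresh interpreter follows; `modules_loaded_by` is a hypothetical helper name, not part of dbt, and the stdlib `json` module is used as a stand-in target so the sketch runs without any adapter installed.

```python
import json
import subprocess
import sys


def modules_loaded_by(target: str, needle: str) -> list[str]:
    """Import `target` in a fresh interpreter and list loaded module names containing `needle`."""
    snippet = (
        "import importlib, json, sys\n"
        f"importlib.import_module({target!r})\n"
        f"print(json.dumps(sorted(m for m in sys.modules if {needle!r} in m)))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,  # raise if the import itself fails in the child process
    )
    return json.loads(result.stdout)


# Stand-in target: a stdlib module, which should pull in nothing agate-related.
stdlib_check = modules_loaded_by("json", "agate")
print(stdlib_check)
```

For this adapter, the equivalent call would be `modules_loaded_by("dbt.adapters.databricks.impl", "agate")`, which requires dbt-databricks to be installed.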

Checklist

  • [x] I have run this code in development and it appears to resolve the stated issue
  • [x] This PR includes tests, or tests are not required/relevant for this PR
  • [x] I have updated the CHANGELOG.md and added information about my change to the "dbt-databricks next" section.

dwreeves · Mar 31 '24 15:03