"ReferenceError: ref is not defined" when using ref function in SQLX config block
-- filtered_postcodes.sqlx
config {
{
type: "view",
assertions: {
uniqueKey: ['postcode'],
rowConditions: [
`${ref("check_postcode")}(postcode)`,
`postcode IN (SELECT postcode FROM ${ref("known_postcodes")}`
],
},
}
SELECT *,
FROM ${ref("customer_postcodes")}
WHERE sales > 10000
When trying to use the ref() function in the config block, for example to add a simple rowCondition assertion, and then running dataform compile I get a compilation error:
Compilation errors:
definitions/filtered_postcodes.sqlx: ReferenceError: ref is not defined
I would expect I could use the references in the config block just as I can in the SQLX body, to reference custom functions (like check_postcode which checks for valid values) or tables (for example so I can check that “foreign keys” exist in the source table) for simple data tests.
(The reference should also add a dependency of the current file on the references in the config block, so in the example filtered_postcodes should not only depend on customer_postcodes but also check_postcode and imported_postcodes.)
How check_postcode is defined? If it is a JS function there is a different syntax to use it, please follow https://cloud.google.com/dataform/docs/javascript-in-dataform
-- check_postcode.sqlx
config {
type: "operations",
hasOutput: true
}
CREATE OR REPLACE FUNCTION ${self()} (postcode STRING)
RETURNS BOOL
AS (
-- German postal codes are 5 digits
REGEXP_CONTAINS(postcode, '^[0-9]{5}$')
);
It's a user-defined BigQuery function in SQL. This is of course a very simple example, we have much more complex check functions that we implement in SQL and want to reuse in various actions/models. The issue is in general with referencing SQL actions/models inside the config block and not only in the SQL part.
Technically you could use functions for resolving dependencies in a config block like this
config {
{
type: "view",
assertions: {
uniqueKey: ['postcode'],
rowConditions: [
dataform.resolve("check_postcode") + `(postcode)`,
`postcode IN (SELECT postcode FROM ${ref("known_postcodes")}`
],
},
}
But javascript in the config block is executed before actions in the graph are registered, so you won't be able to resolve other dependencies there.
So I additionally need to manually declare the dependency with dependencies: ['check_postcode'] and it should work right?
Addition: It's also not possible to use self() in the config block. So if you want to check the latest results of an incremental model that's not easily possible.
config {
type: "incremental",
description: "Tracking of row counts over time before and after transformation.",
assertions: {
rowConditions: [
`updated_at != (SELECT MAX(updated_at) FROM ${self()}) OR count_before = count_after`,
]
}
}