sling-cli
Sling gets stuck for unknown reason
Sling version:
sling==1.4.23
What is the Operating System?
Linux
Do you have a CLI Pro/Platform subscription?
No
Description of the issue
Running sling in batches (parquet) gets stuck for no apparent reason, and the run eventually times out.
I wrote my own script that loads the parquet files from the batch that failed, and it worked smoothly.
Replication Configuration
NUM_BATCHES = 10  # these tables are too large to fit in a single batch

for i in range(NUM_BATCHES):
    defaults = {
        'mode': 'incremental',
        'object': 'slow_features.noa_orders_info_sync',  # where to write to
        'update_key': 'shopper_id',  # not used - we do manual batching logic
        'sql': f"SELECT * FROM {{stream_full_name}} WHERE MOD(ABS(FNV_HASH(shopper_id)), {NUM_BATCHES}) = {i} --{{incremental_value}}",
        'source_options': {
            # Parquet format fixes bugs we had with export of large CSVs,
            # but it uses DuckDB to read the parquets, which requires a stronger machine
            'format': 'parquet',
        },
    }
    streams = {
        'analytics_v2.shopper_orders_info': sling.ReplicationStream(
            **defaults
        ),
    }
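As an aside, the disjointness of the batch predicates can be modeled locally. The sketch below is a hypothetical Python model of MOD(ABS(FNV_HASH(shopper_id)), NUM_BATCHES), assuming Redshift's FNV_HASH is 64-bit FNV-1a returning a signed BIGINT; verify that assumption against your cluster before relying on exact hash values.

```python
# Hypothetical local model of the batching predicate used in the config:
# MOD(ABS(FNV_HASH(shopper_id)), NUM_BATCHES) = i
FNV_OFFSET = 14695981039346656037
FNV_PRIME = 1099511628211
MASK = (1 << 64) - 1

def fnv1a_64(data: bytes) -> int:
    # Plain 64-bit FNV-1a over the raw bytes
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & MASK
    return h

def batch_of(shopper_id: str, num_batches: int) -> int:
    # Reinterpret the unsigned hash as a signed BIGINT, mirroring ABS() on it
    h = fnv1a_64(shopper_id.encode())
    signed = h - (1 << 64) if h >= (1 << 63) else h
    return abs(signed) % num_batches

ids = [f"shopper-{n}" for n in range(1000)]
batches = [[s for s in ids if batch_of(s, 10) == i] for i in range(10)]
# Every id lands in exactly one batch, so the 10 WHERE clauses partition the table
assert sum(len(b) for b in batches) == len(ids)
```

Whatever the exact hash, the key property the batching relies on is that the predicate is deterministic per shopper_id, so the N queries cover the table without overlap.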
Log Output
2025-10-19 21:29:49 DBG select * from read_parquet(['sling_temp/stream/1760909308449.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1760909308449.parquet/u01-0000_part_00.parquet', 'sling_temp/stream/1760909308449.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1760909308449.parquet/u01-0002_part_00.parquet'])
2025-10-19 21:29:49 DBG applying column casing (normalize) for target type ([REDACTED])
2025-10-19 21:29:49 DBG casting column 'shopper_id' as 'text'
2025-10-19 21:29:49 DBG casting column 'store_id' as 'text'
2025-10-19 21:29:49 DBG casting column 'store_name' as 'text'
2025-10-19 21:29:49 DBG casting column 'orders_info' as 'text'
2025-10-19 21:29:49 INF writing to target database [mode: incremental]
2025-10-19 21:29:49 INF streaming data (direct insert)
2025-10-19 21:30:43 INF inserted 3527602 rows into "slow_features"."noa_orders_info_sync" in 134 secs [26,138 r/s] [1.4 GB]
2025-10-19 21:30:43 DBG closed "redshift" connection (conn-redshift-8bX)
2025-10-19 21:30:43 DBG closed "[REDACTED]" connection (conn-[REDACTED]-wis)
2025-10-19 21:30:43 INF execution succeeded
FYI there is a new sling version released (1.4.24). Please run pip install -U sling.
running batch number: 7 out of 10
2025-10-19 21:30:43 INF Sling CLI | https://slingdata.io/
2025-10-19 21:30:43 INF Sling Replication | SLING_RS -> SLING_PG | analytics_v2.shopper_orders_info
2025-10-19 21:30:43 DBG Sling version: 1.4.23 (linux amd64)
2025-10-19 21:30:43 DBG type is db-db
2025-10-19 21:30:43 DBG using: {"columns":{},"mode":"incremental","select":[],"transforms":null}
2025-10-19 21:30:43 DBG using source options: {"empty_as_null":false,"format":"parquet","datetime_format":"AUTO","max_decimals":-1}
2025-10-19 21:30:43 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"normalize"}
2025-10-19 21:30:43 INF connecting to source database (redshift)
2025-10-19 21:30:43 DBG opened "redshift" connection (conn-redshift-7ES)
2025-10-19 21:30:43 INF connecting to target database ([REDACTED])
2025-10-19 21:30:43 DBG opened "[REDACTED]" connection (conn-[REDACTED]-GV0)
2025-10-19 21:30:43 INF getting checkpoint value (shopper_id)
2025-10-19 21:30:43 DBG select max("shopper_id") as max_val from "slow_features"."noa_orders_info_sync"
2025-10-19 21:30:43 INF reading from source database
2025-10-19 21:30:43 INF unloading from redshift to s3
2025-10-19 21:30:43 DBG opened "s3" connection (conn-s3-0pA)
2025-10-19 21:30:43 DBG unload ('SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null')
to 's3://[REDACTED]/sling_temp/stream/1760909443793.parquet/u01-'
credentials 'aws_access_key_id=;aws_secret_access_key='
allowoverwrite PARQUET PARALLEL
2025-10-19 21:31:31 DBG Unloaded to s3://[REDACTED]/sling_temp/stream/1760909443793.parquet
2025-10-19 21:31:31 DBG opened "s3" connection (conn-s3-s8L)
2025-10-19 21:31:31 DBG copying 4 files from remote to /tmp/duck.temp.1760909491369.i4E for local processing
2025-10-19 21:31:32 DBG opened "file" connection (conn-file-oSu)
2025-10-19 21:31:32 DBG reading datastream from s3://[REDACTED]/sling_temp/stream/1760909443793.parquet [format=parquet, nodes=4]
2025-10-19 21:31:32 DBG select * from read_parquet(['sling_temp/stream/1760909443793.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1760909443793.parquet/u01-0000_part_00.parquet', 'sling_temp/stream/1760909443793.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1760909443793.parquet/u01-0002_part_00.parquet'])
2025-10-19 21:31:33 DBG applying column casing (normalize) for target type ([REDACTED])
2025-10-19 21:31:33 DBG casting column 'shopper_id' as 'text'
2025-10-19 21:31:33 DBG casting column 'store_id' as 'text'
2025-10-19 21:31:33 DBG casting column 'store_name' as 'text'
2025-10-19 21:31:33 DBG casting column 'orders_info' as 'text'
2025-10-19 21:31:33 INF writing to target database [mode: incremental]
2025-10-19 21:31:33 INF streaming data (direct insert)
These are the last two runs.
Up until the last batch, everything works smoothly. There are 4 parquet files, each about 120 MB.
hey @noat28, I'm wondering if something is happening in the DuckDB process while reading those parquet files.
Can you answer the following questions:
- Does it get stuck deterministically? As in, can you reproduce the hang 100% of the time?
- You mentioned "I tried to create my own script, using the batch that failed, loading the parquet files, and it worked smoothly". Can you share that script?
- One more thing, can you pass the env var SLING_KEEP_TEMP=true and run it? This will keep the temporary parquet files on your machine. When it hangs, stop the process. Then can you run the duckdb CLI manually and try to read the parquet files with select * from read_parquet([...])? Does that work fine? Maybe read all the data and write them to a test CSV file or something.
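As a concrete sketch of the suggested repro (all paths are placeholders for the temp directory SLING_KEEP_TEMP leaves behind, and run_batches.py stands in for whatever script drives the batches):

```shell
# Keep sling's temp parquet files around and enable trace logging
SLING_KEEP_TEMP=true DEBUG=TRACE python run_batches.py

# After stopping the hung process, read the leftover files directly with the
# duckdb CLI and dump them to a CSV, mimicking sling's read path:
duckdb -c "COPY (
    SELECT * FROM read_parquet(['/tmp/duck.temp.<id>/sling_temp/stream/<ts>.parquet/u01-*.parquet'])
) TO 'test.csv' (HEADER, DELIMITER ',')"
```

If the duckdb CLI read also hangs, that would point at the files or DuckDB itself rather than sling's streaming logic.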
- Yes, I tried multiple times, and it reproduced every time.
- Sure
import json

import pandas as pd
import sqlalchemy
from sqlalchemy.dialects.postgresql import JSONB

from utils.data_manager import write_data_to_postgres, Mode


def clean_orders_info(x):
    if isinstance(x, (bytes, bytearray)):
        x = x.decode('utf-8')
    if isinstance(x, str):
        try:
            x = json.loads(x)
        except json.JSONDecodeError:
            return None
    return x


df = pd.read_parquet('u01-0002_part_00.parquet')
df['orders_info'] = df['orders_info'].apply(clean_orders_info)
assert df['orders_info'].apply(lambda x: isinstance(x, dict)).all()

columns_dtypes = {
    'shopper_id': sqlalchemy.types.VARCHAR,
    'store_id': sqlalchemy.types.VARCHAR,
    'store_name': sqlalchemy.types.VARCHAR,
    'orders_info': JSONB,
}

write_data_to_postgres(
    df=df,
    table_name='noa_test',
    mode=Mode.APPEND,
    scope="features",
    schema_name='features',
    database_name='prod_features',
    dtype=columns_dtypes,
)
- That's what I did: I ran the broken parquet files through my script, and it worked smoothly. They also don't seem big, about 123 MB per file.
@noat28 thanks, but in your script you're reading with pandas. Sling uses the DuckDB CLI to read your parquet files. Could you try reading with DuckDB so it goes through the same logic path as sling?
Also, if you could share the trace output (running with v1.4.24, env var DEBUG=TRACE), that could help.
Hi @flarco
- I tried using the duckdb CLI (hopefully the way you intended); I was able to load the 4 problematic parquet files successfully with the same script:

con = duckdb.connect(database=':memory:')
stream_full_name = 'noa_orders_info_sync'
run_queries_on_pg(TABLES_INIT_DDL)
for i in files:
    print(i)
    df = con.execute(f"""
        SELECT * FROM '{i}'
    """).df()
    df['orders_info'] = df['orders_info'].apply(clean_orders_info)
    assert df['orders_info'].apply(lambda x: isinstance(x, dict)).all()
    write_data_to_postgres(
        df=df,
        table_name='noa_orders_info_sync',
        mode=Mode.APPEND,
        scope="feature-store-write",
        schema_name='slow_features',
        database_name='prod_features',
        dtype=columns_dtypes,
    )
I wonder if you intended me to use it for the insert as well.
I also tried upgrading the library on my cluster and ran with TRACE: still stuck and timed out. Not sure how to tackle it.
Run timed out

Downloading sling binary (v1.4.24) for linux/amd64...
running batch number: 1 out of 10
2025-10-26 22:19:00 TRC no type detected for JAVA_OPTS
2025-10-26 22:19:00 INF Sling CLI | https://slingdata.io/
2025-10-26 22:19:00 INF Sling Replication | SLING_RS -> SLING_PG | analytics_v2.shopper_orders_info
2025-10-26 22:19:00 DBG Sling version: 1.4.24 (linux amd64)
2025-10-26 22:19:00 DBG type is db-db
2025-10-26 22:19:00 DBG using: {"columns":{},"mode":"incremental","select":[],"transforms":null}
2025-10-26 22:19:00 DBG using source options: {"empty_as_null":false,"format":"parquet","datetime_format":"AUTO","max_decimals":-1}
2025-10-26 22:19:00 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"normalize"}
2025-10-26 22:19:00 INF connecting to source database (redshift)
2025-10-26 22:19:00 TRC driver=[REDACTED] conn_url=[REDACTED]://[REDACTED]:[REDACTED]@[REDACTED]:5439/prod
2025-10-26 22:19:00 DBG opened "redshift" connection (conn-redshift-sling_rs-t3f)
2025-10-26 22:19:00 INF connecting to target database ([REDACTED])
2025-10-26 22:19:00 TRC driver=[REDACTED] conn_url=[REDACTED]ql://[REDACTED]:[REDACTED]@[REDACTED]:5432/prod_features
2025-10-26 22:19:00 DBG opened "[REDACTED]" connection (conn-[REDACTED]-sling_pg-GSE)
2025-10-26 22:19:00 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:19:00 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC query responded in 0.008205 secs
2025-10-26 22:19:00 TRC query returned 4 rows
2025-10-26 22:19:00 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 INF getting checkpoint value (shopper_id)
2025-10-26 22:19:00 DBG select max("shopper_id") as max_val from "slow_features"."noa_orders_info_sync" [sling_pg-GSE]
2025-10-26 22:19:00 INF reading from source database
2025-10-26 22:19:00 TRC GetSQLColumns: select * from ( SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 0 --null ) as t limit 0 offset 0
2025-10-26 22:19:00 TRC select * from (
SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 0 --null
) as t limit 0 offset 0
/* GetSQLColumns */ /* nD */ conn=conn-redshift-sling_rs-t3f
2025-10-26 22:19:00 TRC database col => "shopper_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_name" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "orders_info" native_type= generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC query responded in 0.013226 secs
2025-10-26 22:19:00 TRC GetSQLColumns: select * from ( SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 0 --null ) as t limit 0 offset 0
2025-10-26 22:19:00 TRC select * from (
SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 0 --null
) as t limit 0 offset 0
/* GetSQLColumns */ /* nD */ conn=conn-redshift-sling_rs-t3f
2025-10-26 22:19:00 TRC database col => "shopper_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "store_name" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:19:00 TRC database col => "orders_info" native_type= generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:19:00 TRC query responded in 0.007952 secs
2025-10-26 22:19:00 INF unloading from redshift to s3
2025-10-26 22:19:00 DBG opened "s3" connection (conn-s3-aFH)
2025-10-26 22:19:01 DBG unload ('SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 0 --null')
to 's3://[REDACTED]/sling_temp/stream/1761517140927.parquet/u01-'
credentials 'aws_access_key_id=;aws_secret_access_key='
allowoverwrite PARQUET PARALLEL [sling_rs-t3f]
2025-10-26 22:26:57 DBG Unloaded to s3://[REDACTED]/sling_temp/stream/1761517140927.parquet
2025-10-26 22:26:57 DBG opened "s3" connection (conn-s3-khT)
2025-10-26 22:26:57 TRC listing path: s3://[REDACTED]/sling_temp/stream/1761517140927.parquet
2025-10-26 22:26:57 DBG copying 4 files from remote to /tmp/duck.temp.1761517617287.yI7 for local processing
2025-10-26 22:26:58 TRC copied s3://[REDACTED]/sling_temp/stream/1761517140927.parquet/u01-0003_part_00.parquet to /tmp/duck.temp.1761517617287.yI7/sling_temp/stream/1761517140927.parquet/u01-0003_part_00.parquet [130277587 bytes]
2025-10-26 22:26:58 TRC copied s3://[REDACTED]/sling_temp/stream/1761517140927.parquet/u01-0002_part_00.parquet to /tmp/duck.temp.1761517617287.yI7/sling_temp/stream/1761517140927.parquet/u01-0002_part_00.parquet [129860519 bytes]
2025-10-26 22:26:58 TRC copied s3://[REDACTED]/sling_temp/stream/1761517140927.parquet/u01-0001_part_00.parquet to /tmp/duck.temp.1761517617287.yI7/sling_temp/stream/1761517140927.parquet/u01-0001_part_00.parquet [130288480 bytes]
2025-10-26 22:26:58 TRC copied s3://[REDACTED]/sling_temp/stream/1761517140927.parquet/u01-0000_part_00.parquet to /tmp/duck.temp.1761517617287.yI7/sling_temp/stream/1761517140927.parquet/u01-0000_part_00.parquet [130492895 bytes]
2025-10-26 22:26:58 DBG opened "file" connection (conn-file-rYr)
2025-10-26 22:26:58 DBG reading datastream from s3://[REDACTED]/sling_temp/stream/1761517140927.parquet [format=parquet, nodes=4]
2025-10-26 22:26:58 INF downloading duckdb 1.3.2 for linux/amd64
2025-10-26 22:26:58 TRC unzipping to: /root/.sling/bin/duckdb/1.3.2/duckdb
2025-10-26 22:26:59 TRC Proc command -> /root/.sling/bin/duckdb/1.3.2/duckdb -csv -nullvalue \N
2025-10-26 22:26:59 TRC INSTALL json; LOAD json; /* nD */ conn=conn-file-rYr
2025-10-26 22:26:59 TRC select 1 /* nD */ conn=conn-file-rYr
2025-10-26 22:26:59 TRC describe select * from read_parquet(['sling_temp/stream/1761517140927.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0002_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0000_part_00.parquet']) /* nD */ conn=conn-file-rYr
2025-10-26 22:26:59 TRC applying transforms to ds.1761517619991.a9z: []
2025-10-26 22:27:00 TRC duckdb describe got 4 column(s)
2025-10-26 22:27:00 DBG select * from read_parquet(['sling_temp/stream/1761517140927.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0002_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1761517140927.parquet/u01-0000_part_00.parquet']) [rYr]
2025-10-26 22:27:00 TRC applying transforms to ds.1761517618788.Fca: []
2025-10-26 22:27:00 TRC number-cols detected: 4
2025-10-26 22:27:00 TRC new ds.Start ds.1761517618788.Fca
2025-10-26 22:27:00 TRC first row of ds.1761517618788.Fca => []interface {}{"e08c9074-57d5-4435-9b66-b7753f73ba13", "f84221b4-783d-4328-820f-81b024c08e23", "carpe", "{"5227715264645":{"occurred_at":"2024-10-01 06:05:03","subtotal":24.95,"discount_value":11.22,"variant_id_to_amount":{"40363478581381":1}}}"}
2025-10-26 22:27:00 DBG applying column casing (normalize) for target type ([REDACTED])
2025-10-26 22:27:00 DBG casting column 'shopper_id' as 'text'
2025-10-26 22:27:00 DBG casting column 'store_id' as 'text'
2025-10-26 22:27:00 DBG casting column 'store_name' as 'text'
2025-10-26 22:27:00 DBG casting column 'orders_info' as 'text'
2025-10-26 22:27:00 TRC ds.1761517618788.Fca.StreamURL => s3://[REDACTED]/sling_temp/stream/1761517140927.parquet
2025-10-26 22:27:00 TRC new batch ds.1761517618788.Fca-0
2025-10-26 22:27:00 TRC 1 datastreams pushed [ds.1761517618788.Fca]
2025-10-26 22:27:00 TRC pushed 1 datastreams
2025-10-26 22:27:00 TRC []string{"shopper_id [text | VARCHAR]", "store_id [text | VARCHAR]", "store_name [text | VARCHAR]", "orders_info [text | JSON]"}
2025-10-26 22:27:00 INF writing to target database [mode: incremental]
2025-10-26 22:27:00 TRC select schema_name
from information_schema.schemata
where catalog_name = ( select current_catalog )
order by schema_name /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:27:00 TRC database col => "schema_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC query responded in 0.011114 secs
2025-10-26 22:27:00 TRC query returned 637 rows
2025-10-26 22:27:00 TRC try pausing ds.1761517618788.Fca
2025-10-26 22:27:00 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:27:00 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC query responded in 0.002111 secs
2025-10-26 22:27:00 TRC query returned 4 rows
2025-10-26 22:27:00 TRC begin
2025-10-26 22:27:00 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:27:00 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC query responded in 0.001546 secs
2025-10-26 22:27:00 TRC query returned 4 rows
2025-10-26 22:27:00 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:27:00 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC query responded in 0.001375 secs
2025-10-26 22:27:00 TRC query returned 4 rows
2025-10-26 22:27:00 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC unpausing ds.1761517618788.Fca
2025-10-26 22:27:00 INF streaming data (direct insert)
2025-10-26 22:27:00 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /int2/ THEN 16
WHEN 23 /int4/ THEN 32
WHEN 20 /int8/ THEN 64
WHEN 1700 /numeric/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-GSE
2025-10-26 22:27:00 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC query responded in 0.001304 secs
2025-10-26 22:27:00 TRC query returned 4 rows
2025-10-26 22:27:00 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:00 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
2025-10-26 22:27:40 TRC closed ds.1761517618788.Fca-0
2025-10-26 22:27:40 TRC Pushed 3539197 rows for ds.1761517618788.Fca
2025-10-26 22:27:40 TRC commiting
2025-10-26 22:27:40 TRC COPY 3539197 ROWS
2025-10-26 22:27:40 TRC executing defer functions
2025-10-26 22:27:41 INF inserted 3539197 rows into "slow_features"."noa_orders_info_sync" in 520 secs [6,804 r/s] [1.4 GB]
2025-10-26 22:27:41 TRC task-execution cleanup
2025-10-26 22:27:41 DBG closed "redshift" connection (conn-redshift-sling_rs-t3f)
2025-10-26 22:27:41 DBG closed "[REDACTED]" connection (conn-[REDACTED]-sling_pg-GSE)
2025-10-26 22:27:41 TRC executing defer functions
2025-10-26 22:27:41 INF execution succeeded
running batch number: 2 out of 10
2025-10-26 22:27:41 TRC no type detected for JAVA_OPTS
2025-10-26 22:27:41 INF Sling CLI | https://slingdata.io/
2025-10-26 22:27:41 INF Sling Replication | SLING_RS -> SLING_PG | analytics_v2.shopper_orders_info
2025-10-26 22:27:42 DBG Sling version: 1.4.24 (linux amd64)
2025-10-26 22:27:42 DBG type is db-db
2025-10-26 22:27:42 DBG using: {"columns":{},"mode":"incremental","select":[],"transforms":null}
2025-10-26 22:27:42 DBG using source options: {"empty_as_null":false,"format":"parquet","datetime_format":"AUTO","max_decimals":-1}
2025-10-26 22:27:42 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"normalize"}
2025-10-26 22:27:42 INF connecting to source database (redshift)
2025-10-26 22:27:42 TRC driver=[REDACTED] conn_url=[REDACTED]://[REDACTED]:[REDACTED]@[REDACTED]:5439/prod
2025-10-26 22:27:42 DBG opened "redshift" connection (conn-redshift-sling_rs-nm5)
2025-10-26 22:27:42 INF connecting to target database ([REDACTED])
2025-10-26 22:27:42 TRC driver=[REDACTED] conn_url=[REDACTED]ql://[REDACTED]:[REDACTED]@[REDACTED]:5432/prod_features
2025-10-26 22:27:42 DBG opened "[REDACTED]" connection (conn-[REDACTED]-sling_pg-2lT)
2025-10-26 22:27:42 TRC SELECT pg_attribute.attname AS column_name, pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type, CASE pg_attribute.
*** WARNING: max output size exceeded, skipping output. ***
gres-sling_pg-hAB)
2025-10-26 22:52:51 TRC executing defer functions
2025-10-26 22:52:51 INF execution succeeded
running batch number: 7 out of 10
2025-10-26 22:52:52 TRC no type detected for JAVA_OPTS
2025-10-26 22:52:52 INF Sling CLI | https://slingdata.io/
2025-10-26 22:52:52 INF Sling Replication | SLING_RS -> SLING_PG | analytics_v2.shopper_orders_info
2025-10-26 22:52:52 DBG Sling version: 1.4.24 (linux amd64)
2025-10-26 22:52:52 DBG type is db-db
2025-10-26 22:52:52 DBG using: {"columns":{},"mode":"incremental","select":[],"transforms":null}
2025-10-26 22:52:52 DBG using source options: {"empty_as_null":false,"format":"parquet","datetime_format":"AUTO","max_decimals":-1}
2025-10-26 22:52:52 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"normalize"}
2025-10-26 22:52:52 INF connecting to source database (redshift)
2025-10-26 22:52:52 TRC driver=[REDACTED] conn_url=[REDACTED]://[REDACTED]:[REDACTED]@[REDACTED]:5439/prod
2025-10-26 22:52:52 DBG opened "redshift" connection (conn-redshift-sling_rs-oXa)
2025-10-26 22:52:52 INF connecting to target database ([REDACTED])
2025-10-26 22:52:52 TRC driver=[REDACTED] conn_url=[REDACTED]ql://[REDACTED]:[REDACTED]@[REDACTED]:5432/prod_features
2025-10-26 22:52:52 DBG opened "[REDACTED]" connection (conn-[REDACTED]-sling_pg-B2p)
2025-10-26 22:52:52 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:52:52 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC query responded in 0.011798 secs
2025-10-26 22:52:52 TRC query returned 4 rows
2025-10-26 22:52:52 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 INF getting checkpoint value (shopper_id)
2025-10-26 22:52:52 DBG select max("shopper_id") as max_val from "slow_features"."noa_orders_info_sync" [sling_pg-B2p]
2025-10-26 22:52:52 INF reading from source database
2025-10-26 22:52:52 TRC GetSQLColumns: select * from
( SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null ) as t limit 0 offset 0
2025-10-26 22:52:52 TRC select * from ( SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null ) as t limit 0 offset 0 /* GetSQLColumns / / nD */ conn=conn-redshift-sling_rs-oXa 2025-10-26 22:52:52 TRC database col => "shopper_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true 2025-10-26 22:52:52 TRC database col => "store_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true 2025-10-26 22:52:52 TRC database col => "store_name" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true 2025-10-26 22:52:52 TRC database col => "orders_info" native_type= generic_type=text length=0 precision=0 scale=0 sourced=true 2025-10-26 22:52:52 TRC query responded in 0.011679 secs 2025-10-26 22:52:52 TRC GetSQLColumns: select * from ( SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null ) as t limit 0 offset 0
2025-10-26 22:52:52 TRC select * from (
SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null
) as t limit 0 offset 0
/* GetSQLColumns */ /* nD */ conn=conn-redshift-sling_rs-oXa
2025-10-26 22:52:52 TRC database col => "shopper_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "store_id" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "store_name" native_type=VARCHAR generic_type=text length=65535 precision=65535 scale=0 sourced=true
2025-10-26 22:52:52 TRC database col => "orders_info" native_type= generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:52:52 TRC query responded in 0.006748 secs
2025-10-26 22:52:52 INF unloading from redshift to s3
2025-10-26 22:52:52 DBG opened "s3" connection (conn-s3-dMI)
2025-10-26 22:52:52 DBG unload ('SELECT * FROM "analytics_v2"."shopper_orders_info" WHERE MOD(ABS(FNV_HASH(shopper_id)), 10) = 6 --null')
to 's3://[REDACTED]/sling_temp/stream/1761519172463.parquet/u01-'
credentials 'aws_access_key_id=;aws_secret_access_key='
allowoverwrite PARQUET PARALLEL [sling_rs-oXa]
2025-10-26 22:56:22 DBG Unloaded to s3://[REDACTED]/sling_temp/stream/1761519172463.parquet
2025-10-26 22:56:22 DBG opened "s3" connection (conn-s3-fjh)
2025-10-26 22:56:22 TRC listing path: s3://[REDACTED]/sling_temp/stream/1761519172463.parquet
2025-10-26 22:56:22 DBG copying 4 files from remote to /tmp/duck.temp.1761519382476.pmX for local processing
2025-10-26 22:56:23 TRC copied s3://[REDACTED]/sling_temp/stream/1761519172463.parquet/u01-0003_part_00.parquet to /tmp/duck.temp.1761519382476.pmX/sling_temp/stream/1761519172463.parquet/u01-0003_part_00.parquet [130279387 bytes]
2025-10-26 22:56:23 TRC copied s3://[REDACTED]/sling_temp/stream/1761519172463.parquet/u01-0001_part_00.parquet to /tmp/duck.temp.1761519382476.pmX/sling_temp/stream/1761519172463.parquet/u01-0001_part_00.parquet [129859206 bytes]
2025-10-26 22:56:23 TRC copied s3://[REDACTED]/sling_temp/stream/1761519172463.parquet/u01-0000_part_00.parquet to /tmp/duck.temp.1761519382476.pmX/sling_temp/stream/1761519172463.parquet/u01-0000_part_00.parquet [130034629 bytes]
2025-10-26 22:56:24 TRC copied s3://[REDACTED]/sling_temp/stream/1761519172463.parquet/u01-0002_part_00.parquet to /tmp/duck.temp.1761519382476.pmX/sling_temp/stream/1761519172463.parquet/u01-0002_part_00.parquet [130136384 bytes]
2025-10-26 22:56:24 DBG opened "file" connection (conn-file-BOL)
2025-10-26 22:56:24 DBG reading datastream from s3://[REDACTED]/sling_temp/stream/1761519172463.parquet [format=parquet, nodes=4]
2025-10-26 22:56:24 TRC Proc command -> /root/.sling/bin/duckdb/1.3.2/duckdb -csv -nullvalue \N
2025-10-26 22:56:24 TRC INSTALL json; LOAD json; /* nD */ conn=conn-file-BOL
2025-10-26 22:56:24 TRC select 1 /* nD */ conn=conn-file-BOL
2025-10-26 22:56:24 TRC describe select * from read_parquet(['sling_temp/stream/1761519172463.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0000_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0002_part_00.parquet']) /* nD */ conn=conn-file-BOL
2025-10-26 22:56:24 TRC applying transforms to ds.1761519384423.YtY: []
2025-10-26 22:56:24 TRC duckdb describe got 4 column(s)
2025-10-26 22:56:24 DBG select * from read_parquet(['sling_temp/stream/1761519172463.parquet/u01-0003_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0001_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0000_part_00.parquet', 'sling_temp/stream/1761519172463.parquet/u01-0002_part_00.parquet']) [BOL]
2025-10-26 22:56:24 TRC applying transforms to ds.1761519384188.0Bx: []
2025-10-26 22:56:24 TRC number-cols detected: 4
2025-10-26 22:56:24 TRC new ds.Start ds.1761519384188.0Bx
2025-10-26 22:56:24 TRC first row of ds.1761519384188.0Bx => []interface {}{"d00684a5-786e-42af-b4b2-caf296dd22f3", "000d9020-2cab-4438-802a-d6d71d9107ba", "dwc", "{"6774506455095":{"occurred_at":"2025-09-24 15:26:39","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436627623949":1,"40216565153847":1,"40216565121079":2,"3436706430989":1}},"6655193514039":{"occurred_at":"2025-07-26 15:42:38","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436706430989":1,"40216565153847":1,"3436627623949":1,"40216565121079":2}},"6543366127671":{"occurred_at":"2025-05-27 16:40:12","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436706430989":1,"40216565121079":2,"40216565153847":1,"3436627623949":1}},"6425134465079":{"occurred_at":"2025-03-28 06:27:37","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"40216565121079":2,"40216565153847":1,"3436627623949":1,"3436706430989":1}},"6304766427191":{"occurred_at":"2025-01-27 08:11:37","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436706430989":1,"40216565121079":2,"40216565153847":1,"3436627623949":1}},"6175991398455":{"occurred_at":"2024-11-28 07:00:55","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436627623949":1,"40216565121079":2,"3436706430989":1,"40216565153847":1}},"6059321950263":{"occurred_at":"2024-09-29 07:02:25","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"40216565121079":2,"3436706430989":1,"40216565153847":1,"3436627623949":1}},"5946354958391":{"occurred_at":"2024-07-31 05:58:12","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"40216565121079":2,"3436627623949":1,"40216565153847":1,"3436706430989":1}},"5817151258679":{"occurred_at":"2024-06-01 07:16:02","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436706430989":1,"40216565121079":2,"3436627623949":1,"40216565153847":1}},"5691138441271":{"occurred_at":"2024-04-02 
05:58:05","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436706430989":1,"40216565153847":1,"40216565121079":2,"3436627623949":1}},"5580214534199":{"occurred_at":"2024-02-02 07:51:40","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"40216565153847":1,"40216565121079":2,"3436706430989":1,"3436627623949":1}},"5469605920823":{"occurred_at":"2023-12-04 06:52:17","subtotal":175.9,"discount_value":26.4,"variant_id_to_amount":{"3436627623949":1,"40216565121079":2,"3436706430989":1,"40216565153847":1}},"5346291351607":{"occurred_at":"2023-10-05 05:49:17","subtotal":287.84,"discount_value":43.2,"variant_id_to_amount":{"3436706430989":2,"40216565121079":2,"3436627623949":2,"40216565153847":2}},"5239264018487":{"occurred_at":"2023-08-06 05:45:26","subtotal":143.92,"discount_value":0,"variant_id_to_amount":{"638749409":2,"638652625":2,"2413985169421":2,"2414002503693":2}}}"}
2025-10-26 22:56:24 DBG applying column casing (normalize) for target type ([REDACTED])
2025-10-26 22:56:24 DBG casting column 'shopper_id' as 'text'
2025-10-26 22:56:24 DBG casting column 'store_id' as 'text'
2025-10-26 22:56:24 DBG casting column 'store_name' as 'text'
2025-10-26 22:56:24 DBG casting column 'orders_info' as 'text'
2025-10-26 22:56:24 TRC ds.1761519384188.0Bx.StreamURL => s3://[REDACTED]/sling_temp/stream/1761519172463.parquet
2025-10-26 22:56:24 TRC new batch ds.1761519384188.0Bx-0
2025-10-26 22:56:24 TRC 1 datastreams pushed [ds.1761519384188.0Bx]
2025-10-26 22:56:24 TRC pushed 1 datastreams
2025-10-26 22:56:24 TRC []string{"shopper_id [text | VARCHAR]", "store_id [text | VARCHAR]", "store_name [text | VARCHAR]", "orders_info [text | JSON]"}
2025-10-26 22:56:24 INF writing to target database [mode: incremental]
2025-10-26 22:56:24 TRC select schema_name
from information_schema.schemata
where catalog_name = ( select current_catalog )
order by schema_name /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:56:24 TRC database col => "schema_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC query responded in 0.011299 secs
2025-10-26 22:56:24 TRC query returned 637 rows
2025-10-26 22:56:24 TRC try pausing ds.1761519384188.0Bx
2025-10-26 22:56:24 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:56:24 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC query responded in 0.001860 secs
2025-10-26 22:56:24 TRC query returned 4 rows
2025-10-26 22:56:24 TRC begin
2025-10-26 22:56:24 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:56:24 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC query responded in 0.001482 secs
2025-10-26 22:56:24 TRC query returned 4 rows
2025-10-26 22:56:24 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:56:24 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC query responded in 0.001696 secs
2025-10-26 22:56:24 TRC query returned 4 rows
2025-10-26 22:56:24 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC unpausing ds.1761519384188.0Bx
2025-10-26 22:56:24 INF streaming data (direct insert)
2025-10-26 22:56:24 TRC SELECT
pg_attribute.attname AS column_name,
pg_catalog.format_type(pg_attribute.atttypid, pg_attribute.atttypmod) as data_type,
CASE pg_attribute.atttypid
WHEN 21 /*int2*/ THEN 16
WHEN 23 /*int4*/ THEN 32
WHEN 20 /*int8*/ THEN 64
WHEN 1700 /*numeric*/ THEN
CASE WHEN pg_attribute.atttypmod = -1
THEN null
ELSE ((pg_attribute.atttypmod - 4) >> 16) & 65535 -- calculate the precision
END
WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
ELSE null
END AS precision,
CASE
WHEN pg_attribute.atttypid IN (21, 23, 20) THEN 0
WHEN pg_attribute.atttypid IN (1700) THEN
CASE
WHEN pg_attribute.atttypmod = -1 THEN null
ELSE (pg_attribute.atttypmod - 4) & 65535 -- calculate the scale
END
ELSE null
END AS scale
from pg_catalog.pg_class
INNER JOIN pg_catalog.pg_namespace ON pg_class.relnamespace = pg_namespace.oid
INNER JOIN pg_catalog.pg_attribute ON pg_class.oid = pg_attribute.attrelid
where 1=1
and pg_class.relkind in ('r', 'v', 'm', 'f', 'p')
and pg_namespace.nspname = 'slow_features'
and pg_class.relname = 'noa_orders_info_sync'
and pg_attribute.attnum >= 1
and not pg_attribute.attisdropped
ORDER BY pg_attribute.attnum /* nD */ conn=conn-[REDACTED]-sling_pg-B2p
2025-10-26 22:56:24 TRC database col => "column_name" native_type=NAME generic_type=string length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "data_type" native_type=TEXT generic_type=text length=2147483647 precision=2147483647 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "precision" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "scale" native_type=INT4 generic_type=integer length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC query responded in 0.001465 secs
2025-10-26 22:56:24 TRC query returned 4 rows
2025-10-26 22:56:24 TRC database col => "shopper_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "store_id" native_type=uuid generic_type=uuid length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "store_name" native_type=text generic_type=text length=0 precision=0 scale=0 sourced=true
2025-10-26 22:56:24 TRC database col => "orders_info" native_type=jsonb generic_type=json length=0 precision=0 scale=0 sourced=true
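As an aside on the catalog query repeated throughout the log above: for NUMERIC columns, Postgres packs precision and scale into the single `atttypmod` integer, and the query's CASE expressions unpack them with a shift and mask. A minimal Python sketch of that arithmetic (illustrative only, not Sling's code):

```python
# Decode precision/scale from Postgres's atttypmod for NUMERIC columns,
# mirroring the ((atttypmod - 4) >> 16) & 65535 and (atttypmod - 4) & 65535
# expressions in the catalog query.
VARHDRSZ = 4  # header offset Postgres adds before packing

def numeric_precision_scale(atttypmod: int):
    if atttypmod == -1:
        return None, None  # unconstrained NUMERIC
    packed = atttypmod - VARHDRSZ
    precision = (packed >> 16) & 65535
    scale = packed & 65535
    return precision, scale

# NUMERIC(12,4) stores atttypmod = ((12 << 16) | 4) + 4
print(numeric_precision_scale(((12 << 16) | 4) + VARHDRSZ))  # (12, 4)
```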
thanks @noat28 , that didn't help me pin down the issue. Could you try with the latest dev build? There is a new function to dump the program's stack after the timeout.
Can you run with these env vars:
- SLING_TIMEOUT=45 => sets the timeout to end the job run after 45 minutes (or whatever value you think is best, so it fires after the hang happens)
- SLING_TIMEOUT_STACK=true => prints the stack of the goroutines upon timeout.
SLING_TIMEOUT_STACK is brand new, so it only works in the latest dev build.
If you're running with Python, just set the env var SLING_BINARY=/path/to/devbuild/binary/sling. Python will use that binary to run sling. Share the log stack when it times out.
This should be more helpful in pinning down why it's hanging. Let me know if you have any questions.
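If you're driving sling from Python as in the original report, the same env vars can be set in the script before the replication runs; a minimal sketch (the timeout value and binary path are placeholders):

```python
import os

# Placeholder values for illustration; adjust the timeout and path to taste.
os.environ["SLING_TIMEOUT"] = "45"          # end the job run after 45 minutes
os.environ["SLING_TIMEOUT_STACK"] = "true"  # dump goroutine stacks on timeout
os.environ["SLING_BINARY"] = "/path/to/devbuild/binary/sling"  # use the dev build
```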
Parquet Data Import to DuckLake Encounters Stalling/Lag
The value of rich_format might be too long
Environment Information:
- Database/Engine: DuckDB 1.4.2
- Sling: v1.4.27 (Self-compiled macOS version, last commit 98f2e3d7aa05ed9d849c4d14519f10a14b05f406)
- Data Source: S3 Parquet File
- File Path: s3://sling-test/sling-test/test1.parquet
- Number of Records: 1 row
Original Query:
SELECT
blog_id,
length(rich_format),
typeof(rich_format),
typeof(blog_id)
FROM
read_parquet(
's3://sling-test/sling-test/test1.parquet' )
Query Result:
{
"blog_id": 1961612358729793536,
"length(rich_format)": 65330,
"typeof(rich_format)": "VARCHAR",
"typeof(blog_id)": "DECIMAL(20,0)"
}
Configuration Information (S3/Stream Configuration Snippet):
source: S3_CONN
target: DUCKLAKE_CONN

defaults:
  mode: full-refresh
  source_options:
    format: parquet
  target_options:
    column_typing:
      decimal:
        min_precision: 30
        max_precision: 60
        min_scale: 0
        max_scale: 9

streams:
  "sling-test/test1.parquet":
    object: "dw_biz.x_blog"
    single: true
    select: ["blog_id", "rich_format"]
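For context, the column_typing bounds above constrain the precision/scale assigned to decimal columns, so the source's DECIMAL(20,0) blog_id would be widened to at least precision 30. A rough sketch of that clamping (illustrative only, not Sling's actual implementation):

```python
# Clamp an incoming decimal type into configured min/max bounds,
# e.g. DECIMAL(20,0) under min_precision=30 / max_precision=60.
def clamp_decimal(precision, scale,
                  min_p=30, max_p=60, min_s=0, max_s=9):
    p = min(max(precision, min_p), max_p)
    s = min(max(scale, min_s), max_s)
    return p, s

print(clamp_decimal(20, 0))  # (30, 0)
```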
Sling Command:
SLING_TIMEOUT=5 SLING_TIMEOUT_STACK=true sling run -r local_test/snapshot_import_test.yaml --trace
Stack:
4:52PM INF success!
4:52PM INF success!
2025-11-19 16:52:27 INF Sling CLI | https://slingdata.io
2025-11-19 16:52:27 DBG setting timeout for 5 minutes
2025-11-19 16:52:27 DBG opened "s3" connection (conn-s3-iEL)
2025-11-19 16:52:27 DBG opened "s3" connection (conn-s3-79N)
2025-11-19 16:52:27 TRC Could not successfully get format values. Blank values for: object_temp_full_name, object_temp_schema, object_temp_table
2025-11-19 16:52:27 TRC object format map: {"DD":"19","DDD":"19D","HH":"16","MM":"11","MMM":"Nov","YY":"25","YYYY":"2025","hh":"04","mm":"52","object":{"full_name":"\"dw_biz\".\"x_blog\"","name":"\"dw_biz\".\"x_blog\"","schema":"dw_biz","table":"x_blog","temp_full_name":"\"main\".\"x_blog_sling_duckdb_tmp\"","temp_schema":"main","temp_table":"x_blog_sling_duckdb_tmp"},"object.full_name":"\"dw_biz\".\"x_blog\"","object.name":"\"dw_biz\".\"x_blog\"","object.schema":"dw_biz","object.table":"x_blog","object.temp_full_name":"\"main\".\"x_blog_sling_duckdb_tmp\"","object.temp_schema":"main","object.temp_table":"x_blog_sling_duckdb_tmp","object_full_name":"\"dw_biz\".\"x_blog\"","object_name":"\"dw_biz\".\"x_blog\"","object_schema":"dw_biz","object_table":"x_blog","object_temp_full_name":"\"main\".\"x_blog_sling_duckdb_tmp\"","object_temp_schema":"main","object_temp_table":"x_blog_sling_duckdb_tmp","run_timestamp":"2025_11_19_165227","source":{"bucket":"sling-test","kind":"file","name":"s3_conn","type":"s3"},"source.bucket":"sling-test","source.kind":"file","source.name":"s3_conn","source.type":"s3","source_bucket":"sling-test","source_kind":"file","source_name":"s3_conn","source_type":"s3","ss":"27","stream":{"file_ext":"parquet","file_folder":"postdb.x_blog","file_name":"test1","file_path":"sling-test/test1.parquet","full_name":"s3://sling-test/sling-test/test1.parquet","name":"sling-test/test1.parquet"},"stream.file_ext":"parquet","stream.file_folder":"postdb.x_blog","stream.file_name":"test1","stream.file_path":"sling-test/test1.parquet","stream.full_name":"s3://sling-test/sling-test/test1.parquet","stream.name":"sling-test/test1.parquet","stream_file_ext":"parquet","stream_file_folder":"postdb.x_blog","stream_file_name":"test1","stream_file_path":"sling-test/test1.parquet","stream_full_name":"s3://sling-test/sling-test/test1.parquet","stream_name":"sling-test/test1.parquet","target":{"kind":"database","name":"ducklake_conn","schema":"main","table":"x_blog","type":"duc
klake"},"target.kind":"database","target.name":"ducklake_conn","target.schema":"main","target.table":"x_blog","target.type":"ducklake","target_kind":"database","target_name":"ducklake_conn","target_schema":"main","target_table":"x_blog","target_type":"ducklake","timestamp":{"DD":"19","DDD":"19D","HH":"16","MM":"11","MMM":"Nov","YY":"25","YYYY":"2025","date":"2025-11-19","datetime":"2025-11-19 16:52:27","file_name":"2025_11_19_165227","hh":"04","mm":"52","rfc3339":"2025-11-19T16:52:27+08:00","ss":"27"}}
2025-11-19 16:52:27 TRC len(selectStreams) = 0, len(matchedStreams) = 0, len(replication.Streams) = 1
2025-11-19 16:52:27 INF Sling Replication | S3_CONN -> DUCKLAKE_CONN | sling-test/test1.parquet
2025-11-19 16:52:27 TRC srcFileProvided: true, tgtFileProvided: false, srcDbProvided: false, tgtDbProvided: true, srcApiProvided: false, srcStreamProvided: true
2025-11-19 16:52:27 TRC using Config:
{
"source": {
"conn": "S3_CONN",
"type": "s3",
"stream": "sling-test/test1.parquet",
"select": [
"blog_id",
"rich_format"
],
"primary_key": [],
"options": {
"empty_as_null": false,
"header": true,
"fields_per_rec": -1,
"compression": "auto",
"format": "parquet",
"null_if": "NULL",
"datetime_format": "AUTO",
"skip_blank_lines": false,
"max_decimals": -1
}
},
"target": {
"conn": "DUCKLAKE_CONN",
"type": "ducklake",
"object": "\"dw_biz\".\"x_blog\"",
"options": {
"datetime_format": "auto",
"file_max_rows": 0,
"file_max_bytes": 0,
"max_decimals": -1,
"use_bulk": true,
"add_new_columns": true,
"adjust_column_type": true,
"column_casing": "normalize",
"column_typing": {
"decimal": {
"min_precision": 30,
"max_precision": 60,
"min_scale": 0,
"max_scale": 9
}
},
"table_tmp": "\"main\".\"x_blog_sling_duckdb_tmp\""
}
},
"mode": "full-refresh",
"options": {},
"stream_name": "sling-test/test1.parquet",
"replication_stream": {
"mode": "full-refresh",
"object": "\"dw_biz\".\"x_blog\"",
"select": [
"blog_id",
"rich_format"
],
"source_options": {
"format": "parquet"
},
"target_options": {
"adjust_column_type": true,
"column_typing": {
"decimal": {
"min_precision": 30,
"max_precision": 60,
"min_scale": 0,
"max_scale": 9
}
}
},
"single": true,
"hooks": {}
},
"incremental_val": null,
"incremental_val_str": ""
}
2025-11-19 16:52:27 DBG Sling version: dev (darwin arm64)
2025-11-19 16:52:27 DBG type is file-db
2025-11-19 16:52:27 DBG using: {"columns":null,"mode":"full-refresh","select":["blog_id","rich_format"],"transforms":null}
2025-11-19 16:52:27 DBG using source options: {"empty_as_null":false,"header":true,"fields_per_rec":-1,"compression":"auto","format":"parquet","null_if":"NULL","datetime_format":"AUTO","skip_blank_lines":false,"max_decimals":-1}
2025-11-19 16:52:27 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"file_max_bytes":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":true,"column_casing":"normalize","column_typing":{"decimal":{"min_precision":30,"max_precision":60,"min_scale":0,"max_scale":9}},"table_tmp":"\"main\".\"x_blog_sling_duckdb_tmp\""}
2025-11-19 16:52:27 INF connecting to target database (ducklake)
2025-11-19 16:52:27 TRC The file ducklake: does not exist, however it will be created if needed.
2025-11-19 16:52:27 TRC Proc command -> /Users/DemoUser/.sling/bin/duckdb/1.4.2/duckdb -csv -nullvalue \N
2025-11-19 16:52:27 TRC INSTALL json; LOAD json; /* nD */
2025-11-19 16:52:27 TRC select 1 /* nD */
2025-11-19 16:52:27 TRC duckdb scanner: SOF marker seen
2025-11-19 16:52:27 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:27 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:27 DBG opened "ducklake" connection (conn-ducklake-ducklake_conn-ak0)
2025-11-19 16:52:27 TRC CREATE SECRET IF NOT EXISTS s3_secret (TYPE s3, REGION 'us-east-1', URL_STYLE 'path', SECRET 'minioadmin', ENDPOINT '127.0.0.1:9000', USE_SSL 'false', KEY_ID 'minioadmin') /* nD */
2025-11-19 16:52:27 TRC INSTALL json; LOAD json;INSTALL arrow from community; LOAD arrow ;INSTALL ducklake; LOAD ducklake;INSTALL postgres; LOAD postgres;INSTALL aws; LOAD aws;INSTALL httpfs; LOAD httpfs;SET http_timeout = 9999;CREATE SECRET IF NOT EXISTS s3_secret (TYPE s3, REGION 'us-east-1', URL_STYLE 'path', SECRET 'minioadmin', ENDPOINT '127.0.0.1:9000', USE_SSL 'false', KEY_ID 'minioadmin'); /* nD */
2025-11-19 16:52:27 TRC ATTACH IF NOT EXISTS 'ducklake:postgres:dbname=ducklake_catalog host=127.0.0.1 user=postgres' AS ducklake (DATA_PATH 's3://ducklake') /* nD */
2025-11-19 16:52:28 TRC duckdb scanner: SOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC USE ducklake; /* nD */
2025-11-19 16:52:28 TRC duckdb scanner: SOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 INF reading from source file system (s3)
2025-11-19 16:52:28 DBG opened "s3" connection (conn-s3-3Cd)
2025-11-19 16:52:28 TRC listing path: s3://sling-test/sling-test/test1.parquet
2025-11-19 16:52:28 DBG copying 1 files from remote to /var/folders/25/xjhjr5p90rq0ndcfmg9456h00000gn/T/duck.temp.1763542348135.A3t for local processing
2025-11-19 16:52:28 TRC copied s3://sling-test/sling-test/test1.parquet to /var/folders/25/xjhjr5p90rq0ndcfmg9456h00000gn/T/duck.temp.1763542348135.A3t/sling-test/test1.parquet [51855 bytes]
2025-11-19 16:52:28 DBG opened "file" connection (conn-file-wyW)
2025-11-19 16:52:28 DBG reading datastream from s3://sling-test/sling-test/test1.parquet [format=parquet, nodes=1]
2025-11-19 16:52:28 TRC Proc command -> /Users/DemoUser/.sling/bin/duckdb/1.4.2/duckdb -csv -nullvalue \N
2025-11-19 16:52:28 TRC INSTALL json; LOAD json; /* nD */ conn=conn-file-wyW
2025-11-19 16:52:28 TRC select 1 /* nD */ conn=conn-file-wyW
2025-11-19 16:52:28 TRC duckdb scanner: SOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC describe select "blog_id","rich_format" from read_parquet(['sling-test/test1.parquet']) /* nD */ conn=conn-file-wyW
2025-11-19 16:52:28 TRC duckdb scanner: SOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC duckdb scanner: EOF marker seen
2025-11-19 16:52:28 TRC duckdb describe got 2 column(s)
2025-11-19 16:52:28 DBG select "blog_id","rich_format" from read_parquet(['sling-test/test1.parquet']) [wyW]
2025-11-19 16:52:28 TRC duckdb scanner: SOF marker seen
2025-11-19 16:57:27 TRC task-execution cleanup
2025-11-19 16:57:27 TRC pushed 0 datastreams
goroutine 606 [running]:
main.dumpRuntimeStack()
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:787 +0x44
main.setTimeout.func1()
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:765 +0x28
created by time.goFunc
/opt/homebrew/Cellar/go/1.25.4/libexec/src/time/sleep.go:215 +0x38
goroutine 1 [runnable]:
github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).Execute(0x140010c0000)
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:167 +0x348
main.runTask(0x14000ab8848, 0x1400100aa88)
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:432 +0x5ec
main.replicationRun({0x16d23b225, 0x24}, 0x14000a68848, {0x10c5d46a0, 0x0, 0x0})
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:572 +0xcd8
main.processRun(0x10c546280)
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:239 +0x1d04
github.com/flarco/g.CliProcess()
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/cli.go:287 +0x3f0
main.cliInit(0x108375720?)
/Users/DemoUser/work/sling-cli/cmd/sling/sling_cli.go:554 +0x1cc
main.main()
/Users/DemoUser/work/sling-cli/cmd/sling/sling_cli.go:518 +0x1e0
goroutine 10 [select]:
go.opencensus.io/stats/view.(*worker).start(0x1400067c880)
/Users/DemoUser/go/pkg/mod/[email protected]/stats/view/worker.go:292 +0x84
created by go.opencensus.io/stats/view.init.0 in goroutine 1
/Users/DemoUser/go/pkg/mod/[email protected]/stats/view/worker.go:34 +0x94
goroutine 13 [select, 5 minutes]:
database/sql.(*DB).connectionOpener(0x14000edf520, {0x1091202d0, 0x1400111bb80})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/database/sql/sql.go:1261 +0x80
created by database/sql.OpenDB in goroutine 1
/opt/homebrew/Cellar/go/1.25.4/libexec/src/database/sql/sql.go:841 +0x114
goroutine 23 [syscall, 5 minutes]:
os/signal.signal_recv()
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/sigqueue.go:149 +0x2c
os/signal.loop()
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/signal/signal_unix.go:23 +0x1c
created by os/signal.Notify.func1.1 in goroutine 1
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/signal/signal.go:152 +0x28
goroutine 82 [select, 5 minutes]:
main.main.func2()
/Users/DemoUser/work/sling-cli/cmd/sling/sling_cli.go:494 +0x68
created by main.main in goroutine 1
/Users/DemoUser/work/sling-cli/cmd/sling/sling_cli.go:493 +0x1d8
goroutine 29 [sleep]:
time.Sleep(0x5f5e100)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/time.go:363 +0x150
github.com/slingdata-io/sling-cli/core/sling.NewTask.func1()
/Users/DemoUser/work/sling-cli/core/sling/task.go:125 +0xa4
created by github.com/slingdata-io/sling-cli/core/sling.NewTask in goroutine 1
/Users/DemoUser/work/sling-cli/core/sling/task.go:108 +0x314
goroutine 30 [runnable]:
github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).Execute.func1()
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:83 +0x104
created by github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).Execute in goroutine 1
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:77 +0x29c
goroutine 31 [runnable]:
runtime.CallersFrames(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/symtab.go:81
runtime.Caller(0x3)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/extern.go:321 +0x68
github.com/flarco/g.getCallerStack(0x10728f6a8?)
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/error.go:151 +0x60
github.com/flarco/g.NewError(0x140011f05b0?, {0x108594e40, 0x14000e6ebc0}, {0x14000a73088, 0x1, 0x1})
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/error.go:172 +0x38
github.com/flarco/g.Error(...)
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/error.go:279
github.com/slingdata-io/sling-cli/core/dbio/iop.(*Dataflow).WaitReady(0x140011f05b0)
/Users/DemoUser/work/sling-cli/core/dbio/iop/dataflow.go:700 +0xb4
github.com/slingdata-io/sling-cli/core/dbio/filesys.GetDataflowViaDuckDB({0x109198060, 0x14000c02000}, {0x14000aa3e80, 0x78}, {0x140011c4750, 0x1, 0x1}, {0x0, {0x140010734a0, 0x2, ...}, ...})
/Users/DemoUser/work/sling-cli/core/dbio/filesys/fs.go:1287 +0x38c
github.com/slingdata-io/sling-cli/core/dbio/filesys.(*BaseFileSysClient).ReadDataflow(0x1400029ef08, {0x14000aa3e80, 0x78}, {0x140011c4090?, 0x14000020160?, 0x15?})
/Users/DemoUser/work/sling-cli/core/dbio/filesys/fs.go:678 +0xc7c
github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).ReadFromFile(0x140010c0000, 0x14000ab8848)
/Users/DemoUser/work/sling-cli/core/sling/task_run_read.go:351 +0xd50
github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).runFileToDB(0x140010c0000)
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:502 +0xa68
github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).Execute.func2()
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:140 +0x6f4
created by github.com/slingdata-io/sling-cli/core/sling.(*TaskExecution).Execute in goroutine 1
/Users/DemoUser/work/sling-cli/core/sling/task_run.go:97 +0x304
goroutine 100 [IO wait, 5 minutes]:
internal/poll.runtime_pollWait(0x119617000, 0x72)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/netpoll.go:351 +0xa0
internal/poll.(*pollDesc).wait(0x1400181e0c0?, 0x140013ed000?, 0x1)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28
internal/poll.(*pollDesc).waitRead(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x1400181e0c0, {0x140013ed000, 0x1000, 0x1000})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_unix.go:165 +0x1e0
os.(*File).read(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file_posix.go:29
os.(*File).Read(0x1400121c020, {0x140013ed000?, 0x0?, 0x0?})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file.go:144 +0x68
bufio.(*Scanner).Scan(0x14001802080)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/bufio/scan.go:219 +0x738
github.com/flarco/g/process.(*Proc).scanAndWait.func1()
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:414 +0x40
created by github.com/flarco/g/process.(*Proc).scanAndWait in goroutine 40
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:413 +0xa8
goroutine 40 [syscall, 5 minutes]:
syscall.syscall6(0x100000000?, 0x119a8d478?, 0x10e15c5c0?, 0x90?, 0x14000100008?, 0x14001556990?, 0x1400050c5f8?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/sys_darwin.go:60 +0x40
syscall.wait4(0x1400050c628?, 0x102cb5f6c?, 0x90?, 0x108f72ba0?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/syscall/zsyscall_darwin_arm64.go:44 +0x4c
syscall.Wait4(0x1400050c658?, 0x1400050c664, 0x1400050c6a8?, 0x102c3e55c?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/syscall/syscall_bsd.go:144 +0x28
os.(*Process).pidWait.func1(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:64
os.ignoringEINTR2[...](...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file_posix.go:266
os.(*Process).pidWait(0x14001780180)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:63 +0x9c
os.(*Process).wait(0x1400050c6e8?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:28 +0x24
os.(*Process).Wait(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec.go:340
os/exec.(*Cmd).Wait(0x14001782000)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec/exec.go:922 +0x38
github.com/flarco/g/process.(*Proc).scanAndWait(0x1400175a000)
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:457 +0x110
created by github.com/flarco/g/process.(*Proc).Start in goroutine 31
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:378 +0x6a0
goroutine 101 [IO wait, 4 minutes]:
internal/poll.runtime_pollWait(0x119616e00, 0x72)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/netpoll.go:351 +0xa0
internal/poll.(*pollDesc).wait(0x1400181e000?, 0x140013f618b?, 0x1)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28
internal/poll.(*pollDesc).waitRead(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x1400181e000, {0x140013f618b, 0xe75, 0xe75})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_unix.go:165 +0x1e0
os.(*File).read(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file_posix.go:29
os.(*File).Read(0x1400121c010, {0x140013f618b?, 0x109096708?, 0x1082a3380?})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file.go:144 +0x68
bufio.(*Scanner).Scan(0x14001802100)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/bufio/scan.go:219 +0x738
github.com/flarco/g/process.(*Proc).scanAndWait.func2()
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:436 +0x40
created by github.com/flarco/g/process.(*Proc).scanAndWait in goroutine 40
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:435 +0x104
goroutine 73 [runnable]:
github.com/slingdata-io/sling-cli/core/dbio/iop.(*Dataflow).PushStreamChan(0x140011f05b0, 0x140000e8930)
/Users/DemoUser/work/sling-cli/core/dbio/iop/dataflow.go:624 +0x12c
created by github.com/slingdata-io/sling-cli/core/dbio/filesys.GetDataflowViaDuckDB in goroutine 31
/Users/DemoUser/work/sling-cli/core/dbio/filesys/fs.go:1283 +0x384
goroutine 74 [select, 4 minutes]:
io.(*pipe).read(0x14001a908a0, {0x14000c34014, 0xfec, 0x102c81edc?})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/io/pipe.go:57 +0x7c
io.(*PipeReader).Read(0x108375960?, {0x14000c34014?, 0x14000a77578?, 0x102bd5474?})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/io/pipe.go:134 +0x24
bufio.(*Reader).fill(0x14001798d20)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/bufio/bufio.go:113 +0xe0
bufio.(*Reader).Peek(0x14001798d20, 0x1000)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/bufio/bufio.go:152 +0x58
github.com/flarco/g.Peek({0x1090bc968, 0x14001798d20?}, 0x0)
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/io.go:34 +0x134
github.com/slingdata-io/sling-cli/core/dbio/iop.(*CSV).getReader(0x1400177c300)
/Users/DemoUser/work/sling-cli/core/dbio/iop/csv.go:190 +0x30
github.com/slingdata-io/sling-cli/core/dbio/iop.(*Datastream).ConsumeCsvReader(0x140010e0608, {0x1090bcc88, 0x14001a908a0})
/Users/DemoUser/work/sling-cli/core/dbio/iop/datastream.go:1453 +0x1f8
github.com/slingdata-io/sling-cli/core/dbio/iop.(*DuckDb).StreamContext(0x1400181f320, {0x1091202d0, 0x1400176a7d0}, {0x14001a60320, 0x94}, {0x14001a41cf0, 0x1, 0x7?})
/Users/DemoUser/work/sling-cli/core/dbio/iop/duckdb.go:835 +0x804
github.com/slingdata-io/sling-cli/core/dbio/iop.(*DuckDb).Stream(...)
/Users/DemoUser/work/sling-cli/core/dbio/iop/duckdb.go:763
github.com/slingdata-io/sling-cli/core/dbio/iop.(*Datastream).ConsumeParquetReaderDuckDb(0x140010e0608, {0x14000080540, 0xb2}, {0x0, {0x140010734a0, 0x2, 0x2}, {0x0, 0x0}, {0x1400101a0c0, ...}, ...})
/Users/DemoUser/work/sling-cli/core/dbio/iop/datastream.go:1668 +0x188
github.com/slingdata-io/sling-cli/core/dbio/filesys.GetDataflowViaDuckDB.func1.1()
/Users/DemoUser/work/sling-cli/core/dbio/filesys/fs.go:1266 +0x5b8
created by github.com/slingdata-io/sling-cli/core/dbio/filesys.GetDataflowViaDuckDB.func1 in goroutine 72
/Users/DemoUser/work/sling-cli/core/dbio/filesys/fs.go:1245 +0x1c4
goroutine 138 [chan send, 4 minutes]:
github.com/flarco/g/process.(*Proc).scanAndWait.func2()
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:454 +0x2e0
created by github.com/flarco/g/process.(*Proc).scanAndWait in goroutine 79
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:435 +0x104
goroutine 137 [IO wait, 4 minutes]:
internal/poll.runtime_pollWait(0x119616a00, 0x72)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/netpoll.go:351 +0xa0
internal/poll.(*pollDesc).wait(0x1400181fa40?, 0x140016de000?, 0x1)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x28
internal/poll.(*pollDesc).waitRead(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x1400181fa40, {0x140016de000, 0x1000, 0x1000})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/internal/poll/fd_unix.go:165 +0x1e0
os.(*File).read(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file_posix.go:29
os.(*File).Read(0x1400121c508, {0x140016de000?, 0x1400050be48?, 0x102caced4?})
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file.go:144 +0x68
bufio.(*Scanner).Scan(0x140018c0380)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/bufio/scan.go:219 +0x738
github.com/flarco/g/process.(*Proc).scanAndWait.func1()
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:414 +0x40
created by github.com/flarco/g/process.(*Proc).scanAndWait in goroutine 79
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:413 +0xa8
goroutine 79 [syscall, 4 minutes]:
syscall.syscall6(0x900101018a4000?, 0x119a8c618?, 0x10e15ca78?, 0x90?, 0x14000700808?, 0x140017f4090?, 0x1400050d5f8?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/runtime/sys_darwin.go:60 +0x40
syscall.wait4(0x1400050d628?, 0x102cb5f6c?, 0x90?, 0x108f72ba0?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/syscall/zsyscall_darwin_arm64.go:44 +0x4c
syscall.Wait4(0x1400050d658?, 0x1400050d664, 0x10c5d9220?, 0x10e15caa8?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/syscall/syscall_bsd.go:144 +0x28
os.(*Process).pidWait.func1(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:64
os.ignoringEINTR2[...](...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/file_posix.go:266
os.(*Process).pidWait(0x140017a53c0)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:63 +0x9c
os.(*Process).wait(0x1400050d6e8?)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec_unix.go:28 +0x24
os.(*Process).Wait(...)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec.go:340
os/exec.(*Cmd).Wait(0x14001782480)
/opt/homebrew/Cellar/go/1.25.4/libexec/src/os/exec/exec.go:922 +0x38
github.com/flarco/g/process.(*Proc).scanAndWait(0x14001444820)
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:457 +0x110
created by github.com/flarco/g/process.(*Proc).Start in goroutine 74
/Users/DemoUser/go/pkg/mod/github.com/flarco/[email protected]/process/process.go:378 +0x6a0
panic: SLING_TIMEOUT = 5 mins reached!
goroutine 606 [running]:
main.setTimeout.func1()
/Users/DemoUser/work/sling-cli/cmd/sling/sling_run.go:770 +0x14c
created by time.goFunc
/opt/homebrew/Cellar/go/1.25.4/libexec/src/time/sleep.go:215 +0x38
@AndyCokeZero thanks, super-helpful. I'll try to reproduce.
@AndyCokeZero I can't reproduce the hang. I have a Python file using pyarrow to generate the large values (over 66K chars), and it works as normal for me... Do you mind attaching the test file you used?
@flarco Sure, I'll attach the test file now. test1.parquet.zip
@AndyCokeZero thank you again.
Was able to reproduce. It all came down to the buffer size for one line, when duckDB CLI sends to Sling CLI. Buffer size was 64kb, and hung. Now default is changed to 10MB. Furthermore, it can be specified in the env.yaml entry with max_buffer_size or env var SLING_MAX_BUFFER_SIZE. Fixed for next release (or use dev build). Closing.
@flarco Verified the fix on my setup. Appreciate the update!