
Slack `ts` and `thread_ts` inconsistent types

Open zilto opened this issue 10 months ago • 4 comments

dlt version

0.4.7

Source name

slack

Describe the problem

get_messages() and get_thread_replies() don't return the same data types for the ts and thread_ts fields: the former returns them as timestamps, the latter as strings.

This is problematic when trying to join tables of messages and replies based on their thread_ts (thread id), which is a very common operation.

The mismatch occurs because get_messages() passes datetime_fields=MSG_DATETIME_FIELDS, whereas get_thread_replies() doesn't.
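To illustrate why the mismatch breaks joins, here is a minimal, self-contained sketch. It does not use the real dlt Slack source: convert_fields() is a hypothetical stand-in for whatever conversion datetime_fields triggers, and the sample records and the contents of MSG_DATETIME_FIELDS are assumptions made up for the example.

```python
from datetime import datetime, timezone

# Assumed contents; the real default list in the source may hold other fields too.
MSG_DATETIME_FIELDS = ("ts", "thread_ts")


def convert_fields(item, datetime_fields=None):
    """Hypothetical helper: parse the listed fields into UTC datetimes."""
    out = dict(item)
    for field in datetime_fields or ():
        if field in out:
            out[field] = datetime.fromtimestamp(float(out[field]), tz=timezone.utc)
    return out


# Made-up raw records in the shape Slack returns them (ts values are strings).
raw_message = {"ts": "1712272800.123456", "thread_ts": "1712272800.123456", "text": "parent"}
raw_reply = {"ts": "1712272860.000200", "thread_ts": "1712272800.123456", "text": "reply"}

# Messages path: fields are converted. Replies path: nothing is passed, so strings remain.
message = convert_fields(raw_message, datetime_fields=MSG_DATETIME_FIELDS)
reply = convert_fields(raw_reply)

# A naive join on thread_ts finds no parent because the key types differ.
messages_by_thread = {message["thread_ts"]: message}
joined = [(messages_by_thread.get(r["thread_ts"]), r) for r in [reply]]
```

In this sketch the reply ends up paired with None instead of its parent message. In a SQL destination, the equivalent join would need an explicit cast on one side.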

Expected behavior

  1. ts and thread_ts should both receive the same type from MSG_DATETIME_FIELDS

  2. More importantly, according to Slack specs, ts and thread_ts are not timestamps and string is actually the proper type. (see ref)

There are a few additional fields that describe the author (such as user or bot_id), but there's also an additional ts field. The ts value is essentially the ID of the message, guaranteed unique within the context of a channel or conversation.

They look like UNIX/epoch timestamps, hence ts, with specified milliseconds. But they're actually message IDs, even if they're partially composed in seconds-since-the-epoch.
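One way to act on this, sketched under assumptions: keep the original string as the identifier and derive a separate datetime from it when a real timestamp is needed. The function name split_slack_ts() and the sample value are hypothetical, and treating the fractional digits as sub-second precision is done purely for illustration.

```python
from datetime import datetime, timezone
from decimal import Decimal


def split_slack_ts(ts: str):
    """Return the untouched string ID plus a UTC datetime derived from it."""
    seconds = Decimal(ts)  # Decimal avoids float rounding of the fractional digits
    whole = int(seconds)
    micros = int((seconds - whole) * 1_000_000)
    created_at = datetime.fromtimestamp(whole, tz=timezone.utc).replace(microsecond=micros)
    return ts, created_at


# Made-up sample value in Slack's "seconds.suffix" shape.
message_id, created_at = split_slack_ts("1712272800.123456")
```

The string stays usable as a join key, while created_at can be stored in a separate column for time-based filtering.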

Given that ts and thread_ts do not exactly represent timestamps but are rather unique IDs that can be sorted chronologically, I suggest simply removing them from the default values of MSG_DATETIME_FIELDS.

This would be a breaking change for the messages tables, but not for the replies tables, so it would be the right time to introduce the change to the defaults if accepted.

Steps to reproduce

dlt init slack

How you are using the source?

I run this source for fun.

Operating system

Linux

Runtime environment

Local

Python version

3.10.9

dlt destination

duckdb

Additional information

As a workaround, I manually change the type of ts and thread_ts in the messages table from timestamp to string.
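For readers who want something similar in plain Python, the sketch below normalizes a value back to a Slack-style string. It is not necessarily what was done here: to_slack_ts() is a hypothetical helper, and it assumes a timezone-aware datetime whose microseconds survived the earlier conversion.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_slack_ts(value):
    """Hypothetical helper: return ts/thread_ts as a 'seconds.microseconds' string."""
    if isinstance(value, str):
        return value  # already in Slack's string form
    # Integer arithmetic on the timedelta avoids float rounding.
    delta = value - EPOCH
    return f"{delta.days * 86400 + delta.seconds}.{delta.microseconds:06d}"


# Made-up sample: a converted message key and an untouched reply key.
message_key = to_slack_ts(datetime(2024, 4, 4, 23, 20, 0, 123456, tzinfo=timezone.utc))
reply_key = to_slack_ts("1712272800.123456")
```

If the original conversion dropped or rounded the fractional digits, the rebuilt string may not match the reply's thread_ts, which is another argument for never converting these fields in the first place.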

zilto · Apr 04 '24 23:04