`pull_requests` table contains many duplicates since upgrade to v2
Hi, we upgraded our GitHub integration to v2 on Stitch a week ago. Since then, the pull_requests table seems to contain many duplicates of some pull requests. This is odd, since the primary key is indeed id.
Could you please help us with this?
Hello 👋 This is interesting. As far as I can tell, the tap is doing everything it needs to in order to enable deduplication downstream via ID. You can see that here when it writes the SCHEMA message per stream with the key_properties for that stream and here where it specifies the key_properties for the PullRequests stream object. Nevertheless, when I just tested it, I also got duplicates for the most recent record (from the inclusive query on the incremental extraction).
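For anyone following along, here is a rough sketch of what key-based deduplication looks like in principle. It is not the tap's code or Stitch's loader. It only assumes the Singer message shapes (a SCHEMA message carrying `key_properties`, followed by RECORD messages); the helper names `schema_message`, `record_message`, and `dedupe_by_key` and the sample records are hypothetical.

```python
def schema_message(stream, schema, key_properties):
    # Shape of a Singer SCHEMA message; key_properties names the primary key fields.
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    }


def record_message(stream, record):
    # Shape of a Singer RECORD message.
    return {"type": "RECORD", "stream": stream, "record": record}


def dedupe_by_key(messages):
    # Hypothetical downstream step: keep one row per (stream, key) pair,
    # letting the most recently seen record win.
    key_props = {}
    rows = {}
    for msg in messages:
        if msg["type"] == "SCHEMA":
            key_props[msg["stream"]] = msg["key_properties"]
        elif msg["type"] == "RECORD":
            keys = key_props[msg["stream"]]
            row_key = (msg["stream"],) + tuple(msg["record"][k] for k in keys)
            rows[row_key] = msg["record"]
    return list(rows.values())


messages = [
    schema_message("pull_requests", {"type": "object"}, ["id"]),
    record_message("pull_requests", {"id": 1, "title": "first"}),
    record_message("pull_requests", {"id": 2, "title": "second"}),
    # The inclusive bookmark query re-emits the most recent record.
    record_message("pull_requests", {"id": 2, "title": "second (re-emitted)"}),
]

deduped = dedupe_by_key(messages)
```

If the destination honours `key_properties` like this, the re-emitted record from the inclusive query overwrites the earlier row instead of adding a new one, so a re-emitted record by itself should not produce a duplicate row.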
This might be something better served through Stitch's support channels (this tap is marked as supported by Stitch in the docs) rather than leaving it up to a community contributor to pick up.
@Nicoowr did you get anywhere with this? Also running into the same issue.
@BenPeddie Our team is in contact with Stitch support, but there is nothing new yet. We will keep this thread updated.
Still no update? Why are you in contact with Stitch?