airbyte
Source Stripe: missing data in Incremental sync mode
Connector Name
source-stripe
Connector Version
5.0.1
What step the error happened?
During the sync
Relevant information
Hello Airbyte community,
I identified an issue with the source Stripe Charges stream in Incremental sync mode that leads to missing data in the destination.
The missing payments systematically have:
- a Blocked outcome type
- an empty failure message
NB: some other Blocked payments with a non-empty failure message are properly retrieved.
Please find attached an example:
Another common pattern is the 402 Stripe API Error, which could be an interesting lead to follow.
If this lead is confirmed, it possibly impacts other streams as well.
To be noted:
- I tried with several destinations and the behavior is always the same:
  - Google Cloud Storage in AVRO
  - Google Cloud Storage in JSONL
  - BigQuery with GCS staging
  - BigQuery with standard inserts
- I tried a manual Incremental run locally from source-stripe in Airbyte's repo using the command below, and the payment concerned is still NOT retrieved (unlike the others):
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
- If I run a Full refresh sync, the data is properly retrieved BUT with a wrong updated field value; the right value is only available through an Incremental sync. So this workaround does not look acceptable.
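As an alternative to the plain `grep` above, here is a minimal sketch that parses the connector's stdout and keeps only RECORD messages for a given stream and id. It assumes the usual Airbyte message layout (`type`, `record.stream`, `record.data`); the function name `find_records` and the ids in the usage comment are hypothetical.

```python
import json
from typing import Iterable, Iterator


def find_records(lines: Iterable[str], stream: str, record_id: str) -> Iterator[dict]:
    """Yield the data of RECORD messages matching a stream name and an id.

    Assumption: each output line is either free text or one JSON message
    shaped like {"type": "RECORD", "record": {"stream": ..., "data": {...}}}.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON output
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if message.get("type") != "RECORD":
            continue  # ignore LOG, STATE, DEBUG, ... messages
        record = message.get("record", {})
        data = record.get("data", {})
        if record.get("stream") == stream and data.get("id") == record_id:
            yield data


# Hypothetical usage, with the connector output saved to a file first:
#   with open("output.jsonl") as fh:
#       print(list(find_records(fh, "charges", "py_xxx")))
```

Unlike `grep`, this does not match the id when it only appears inside a log or debug line, so an empty result means no record with that id was emitted for the stream.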
@davydov-d, as a Stripe expert, could you help investigate? :) I would be happy to contribute as well, but I am not sure where to start... This is currently a blocker on our side for consuming Stripe data via Airbyte.
Thanks a lot for your help!
EDIT: please check the comment below for new elements to go further...
Relevant log output
No response
Contribute
- [ ] Yes, I want to contribute
A few new elements after deeper testing of Full refresh vs Incremental: it seems to be an issue with pagination management in Incremental sync.
Considering a start date set to 2023-11-29T00:00:00Z (in config.json for Full refresh), equivalent to timestamp 1701216000 (in state.json for Incremental), I get the logs below:
Full refresh
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
=> FOUND
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json --debug | grep "https://api.stripe.com/v1/charges"
=> generates:
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701302401&created%5Blte%5D=1701388801&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701388802&created%5Blte%5D=1701452651&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701302401&created%5Blte%5D=1701388801&limit=100&expand%5B%5D=data.refunds&starting_after=py_3OI1uCCBk8J2diIe1k9P14b5"}}
=> pagination management looks fine
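To make the debug URLs easier to read, here is a small sketch that decodes the `created[gte]` / `created[lte]` window and the `starting_after` cursor from a logged request URL. The helper name `describe_request` is hypothetical; only the Python standard library is used.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def describe_request(url: str) -> dict:
    """Return the time window (as UTC strings) and page cursor of a logged URL."""
    # parse_qs percent-decodes the keys, so "created%5Bgte%5D" becomes "created[gte]".
    query = parse_qs(urlsplit(url).query)

    def as_utc(key: str) -> str:
        seconds = int(query[key][0])
        moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "created_gte": as_utc("created[gte]"),
        "created_lte": as_utc("created[lte]"),
        # None when the request is the first page of its window.
        "starting_after": query.get("starting_after", [None])[0],
    }


first = describe_request(
    "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000"
    "&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds"
)
print(first)
```

For the first logged request this shows a one-day window starting at 2023-11-29T00:00:00Z, which matches the configured start date; a request carrying `starting_after` is a follow-up page within the same window.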
Incremental
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
=> NOT FOUND
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json --debug | grep "https://api.stripe.com/v1/charges"
=> generates:
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds", "request_body": "None"}}
=> pagination management looks broken
... or is it simply a different logging display when debugging is enabled?
Alternative: a bug on Stripe's side?
Another idea: could there be an issue on Stripe's side in logging the charge creation event that follows the PaymentIntent into the Events object, even though the charge really exists in the Charges object?
=> To prevent similar issues in general, for an Incremental load, should the connector also check the stream's own object (here Charges) for occurrences in the same time window, in addition to the Events object? (99% of the time it would produce duplicates of the same objects, but 1% of the time it would work around the identified issue.)
or perhaps @girarda, as you seem to know the Stripe connector well
Hi Kev-datams.
Thanks for your research on the bug and the key insights. I think you are right about the events: it seems the events do not exist. I am not sure whether you were able to check the Events endpoint, as we read the different events to properly follow the changes and insert them accordingly.
I will be taking over this issue and post my findings.
Hi kev-datams.
I have tried to reproduce the behavior you are reporting. I used the Stripe sandbox to create multiple pages of data. I couldn't find any problem with the behavior of payment_intents as shown in your dashboard, and everything works as expected.
What I did notice is that you were testing with the Charges stream. According to Stripe, this stream is deprecated and payment_intents must be used instead:
[image: image.png]
So the events may not always be caught through this stream when a payment fails at creation, but they do appear in payment_intents. In my case, I checked both streams and I still see all the payment intents I tested (successful and failed). I even made a Blocked payment just like the one you have, and I see it in the events, payment_intents and charges streams:
[image: image.png]
Can you confirm whether you can see the payment intent in your payment_intents stream?
In any case, since I have not been able to reproduce it, I will keep this ticket open for a week or so. If the behavior appears again, we can reopen the ticket.
Hi @kev-datams, can you check the comment above, please?