airbyte
Source Stripe: missing data in Incremental sync mode
Connector Name
source-stripe
Connector Version
5.0.1
What step the error happened?
During the sync
Relevant information
Hello Airbyte's community,
I identified an issue on source Stripe Charges stream using Incremental sync mode, leading to missing data in destination.
The missing payments all appear to have:
- a Blocked outcome type
- an empty failure message
NB: other Blocked payments that do have a failure message are retrieved correctly.
Please find attached an example:
Another common pattern is the 402 Stripe API Error, which could be an interesting lead to follow.
If this lead is confirmed, the issue may also affect other streams.
Also worth noting:
- I tried with many destinations and the behavior is always the same:
- Google Cloud Storage in AVRO
- Google Cloud Storage in JSONL
- BigQuery with GCS staging
- BigQuery with standard inserts
- I ran a manual Incremental sync locally from source-stripe in Airbyte's repo using the command below, and the payment in question is still NOT retrieved (unlike the others):
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
- If I run a Full refresh sync, the data is retrieved properly BUT with a wrong updated field value; the correct value is only available through the Incremental sync. So this workaround does not look acceptable.
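As a stricter alternative to the grep check above, here is a minimal sketch that parses the connector output as Airbyte protocol messages and looks for one record id in one stream. The function name find_records, the stream name "charges" and the id py_example are assumptions for illustration only.

```python
import json
from typing import Iterable, Iterator


def find_records(lines: Iterable[str], stream: str, record_id: str) -> Iterator[dict]:
    """Yield the data of RECORD messages matching the given stream and id."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON message
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue
        if message.get("type") != "RECORD":
            continue  # ignore LOG, STATE, DEBUG, etc.
        record = message.get("record", {})
        data = record.get("data", {})
        if record.get("stream") == stream and data.get("id") == record_id:
            yield data


# Hypothetical sample output, standing in for the real connector stdout.
sample_output = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "Starting sync"}}',
    '{"type": "RECORD", "record": {"stream": "charges", "data": {"id": "py_example"}, "emitted_at": 0}}',
    '{"type": "RECORD", "record": {"stream": "charges", "data": {"id": "py_other"}, "emitted_at": 0}}',
]
matches = list(find_records(sample_output, "charges", "py_example"))
print("FOUND" if matches else "NOT FOUND")
```

To use it on a real run, sys.stdin can be passed as the lines argument and the read command piped into the script. Unlike grep, this only matches actual records, not log lines that happen to mention the id.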
@davydov-d, as a Stripe expert, could you help investigate? :) I would be happy to contribute too, but I am not sure where to start. This is currently blocking us from consuming Stripe data via Airbyte.
Thanks a lot for your help!
EDIT: please check the comment below, which adds new elements to go further.
Relevant log output
No response
Contribute
- [ ] Yes, I want to contribute
A few new elements after deeper testing of Full refresh vs Incremental: it seems to be an issue with pagination handling in Incremental sync.
Considering a start date set to 2023-11-29T00:00:00Z (in config.json for Full refresh), equivalent to timestamp 1701216000 (in state.json for Incremental), I get the logs below:
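For anyone wanting to double-check the equivalence between the config start date and the state timestamp, a minimal sketch follows. The helper names iso_to_epoch and epoch_to_iso are mine, not part of the connector.

```python
from datetime import datetime, timezone


def iso_to_epoch(iso_utc: str) -> int:
    """Convert an ISO 8601 UTC timestamp ending in 'Z' to Unix seconds."""
    parsed = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%SZ")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def epoch_to_iso(epoch: int) -> str:
    """Convert Unix seconds to an ISO 8601 UTC timestamp ending in 'Z'."""
    as_utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return as_utc.strftime("%Y-%m-%dT%H:%M:%SZ")


print(iso_to_epoch("2023-11-29T00:00:00Z"))  # 1701216000
print(epoch_to_iso(1701302400))  # 2023-11-30T00:00:00Z
```

The second value is the created[lte] bound of the first request in the logs, i.e. the start date plus exactly one day.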
Full refresh
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
=> FOUND
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json --debug | grep "https://api.stripe.com/v1/charges"
=> generates:
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701302401&created%5Blte%5D=1701388801&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701388802&created%5Blte%5D=1701452651&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds"}}
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "request_body": "None", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701302401&created%5Blte%5D=1701388801&limit=100&expand%5B%5D=data.refunds&starting_after=py_3OI1uCCBk8J2diIe1k9P14b5"}}
=> pagination management looks fine
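To make the logged URLs easier to compare, here is a small sketch that decodes the query string of one of the requests above into its time window and pagination cursor. The function name describe_request is an assumption of mine; the URL is copied from the last Full refresh log line.

```python
from urllib.parse import parse_qs, urlsplit


def describe_request(url: str) -> dict:
    """Extract the created window, page size and cursor from a logged URL."""
    query = parse_qs(urlsplit(url).query)  # percent-encoded keys are decoded
    return {
        "created_gte": int(query["created[gte]"][0]),
        "created_lte": int(query["created[lte]"][0]),
        "limit": int(query["limit"][0]),
        # starting_after is only present on follow-up pages
        "starting_after": query.get("starting_after", [None])[0],
    }


url = (
    "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701302401"
    "&created%5Blte%5D=1701388801&limit=100&expand%5B%5D=data.refunds"
    "&starting_after=py_3OI1uCCBk8J2diIe1k9P14b5"
)
info = describe_request(url)
print(info)
```

A request whose starting_after is None is the first page of its window; a non-empty value means the connector asked for a further page of the same window.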
Incremental
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json | grep py_***my_id***
=> NOT FOUND
python main.py read --config secrets/config.json --catalog secrets/configured_catalog.json --state secrets/state.json --debug | grep "https://api.stripe.com/v1/charges"
=> generates:
{"type": "DEBUG", "message": "Making outbound API request", "data": {"headers": "{'User-Agent': 'python-requests/2.28.2', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Stripe-Version': '2022-11-15', 'Stripe-Account': 'xxx', 'Authorization': 'Bearer ****'}", "url": "https://api.stripe.com/v1/charges?created%5Bgte%5D=1701216000&created%5Blte%5D=1701302400&limit=100&expand%5B%5D=data.refunds", "request_body": "None"}}
=> pagination management looks broken
... or is it simply a different logging display when debugging is enabled?
Alternative: a bug on Stripe's side?
Other idea: could Stripe fail to log the charge creation event that follows the PaymentIntent in the Events object, even though the charge really exists in the Charges object?
=> To guard against similar issues in general, for Incremental loads the connector would have to check the stream's own object (here Charges) for records in the same time window, in addition to the Events object. (99% of the time it would produce duplicates of the same objects, but the remaining 1% would work around the identified issue.)
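To illustrate the idea only, here is a minimal sketch of merging the two sources and deduplicating by id. It is not how the connector works today; merge_by_id and the sample records are hypothetical, and it assumes the Events-based version of a record should win because it carries the right updated value.

```python
from typing import Dict, Iterable, List


def merge_by_id(from_events: Iterable[dict], from_listing: Iterable[dict]) -> List[dict]:
    """Union of both sources, one record per id, preferring the Events version."""
    merged: Dict[str, dict] = {}
    for record in from_listing:
        merged[record["id"]] = record
    for record in from_events:
        # Overwrite the listing version: the Events one has the correct `updated`.
        merged[record["id"]] = record
    return list(merged.values())


# Hypothetical records for the same time window.
events_records = [{"id": "py_1", "updated": 1701220000}]
listing_records = [
    {"id": "py_1", "created": 1701216500},
    {"id": "py_2", "created": 1701217000},  # has no matching event
]
result = merge_by_id(events_records, listing_records)
print([record["id"] for record in result])
```

Records found only through the direct listing (py_2 here) would be recovered, but they would still lack a reliable updated value, which is the limitation already observed with Full refresh.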
Or perhaps @girarda, as you seem to know the Stripe connector well?
Hi Kev-datams.
Thanks for your research on the bug and the key insights. I think you are right about the events: it seems the events do not exist. I am not sure whether you were able to check the Events endpoint, as we rely on the different events to follow the changes and insert them accordingly.
I will be taking over this issue and will post my findings.
Hi kev-datams.
I have tried to recreate the behavior you are reporting. I used the Stripe sandbox to create multiple pages. I couldn't find any problems on the behavior of payment_intents as shown in your dashboard and all works as expected.
What I did realize is you were testing with the Charges stream. According to Stripe, this stream is deprecated and payment_intents must be used instead:
So the events may not always be caught through this stream when a payment fails on creation, but they do appear in payment_intents. In my case, I checked both streams and I still see all the payment intents I tested (successful and failed). I even made a Blocked payment just like the one you have, and I see it in the events, payment_intents and charges streams:
Can you confirm whether you can see the payment intent in your payment_intents stream?
In any case, since I have not been able to reproduce it, I will keep this ticket open for a week or so. If the behavior appears again, we can reopen the ticket.
Hi @kev-datams, can you check the comment above, please?