rest_api: passing value for path parameters not working as expected
dlt version
0.4.12
Source name
rest_api
Describe the problem
Configuring an endpoint like this:
{
    "name": "user",
    "endpoint": {
        "path": "users/{id}",
        "params": {
            "id": 2,
        },
    },
},
results in a URL built like this:
https://reqres.in/api/users/%7Bid%7D?id=2
The expected URL is:
https://reqres.in/api/users/2
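To make the difference concrete, here is a minimal sketch using only the Python standard library. It does not reproduce dlt's internals; it only assumes that the path template is sent without substitution and that `params` is encoded as a query string. Both helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://reqres.in/api"
PATH = "users/{id}"
PARAMS = {"id": 2}


def observed_url(base_url: str, path: str, params: dict) -> str:
    """Hypothetical helper: keep the placeholder as-is, send params as a query string."""
    # quote() percent-encodes the braces: "{" becomes %7B and "}" becomes %7D
    return f"{base_url}/{quote(path)}?{urlencode(params)}"


def expected_url(base_url: str, path: str, params: dict) -> str:
    """Hypothetical helper: substitute params into the path placeholders."""
    return f"{base_url}/{path.format(**params)}"


print(observed_url(BASE_URL, PATH, PARAMS))
print(expected_url(BASE_URL, PATH, PARAMS))
```

The first call yields the URL reported above, the second yields the expected one.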
Expected behavior
No response
Steps to reproduce
I am using the reqres.in testing api, with the following configuration:
import dlt
from rest_api import RESTAPIConfig, rest_api_source, RESTClient, DltResource


def load_reqres_in():
    reqres_in_config: RESTAPIConfig = {
        "client": {
            "base_url": "https://reqres.in/api",
        },
        "resources": [
            {
                "name": "user",
                "endpoint": {
                    "path": "users/{id}",
                    "params": {
                        "id": 2,
                    },
                },
            },
        ],
    }
    pipeline = dlt.pipeline(
        pipeline_name="reqres_in",
        destination="duckdb",
    )
    reqres_in_source = rest_api_source(reqres_in_config)
    load_info = pipeline.run(reqres_in_source)
    print(load_info)


if __name__ == "__main__":
    load_reqres_in()
How you are using the source?
I run this source in production.
Operating system
Linux
Runtime environment
Local
Python version
3.10.9
dlt destination
duckdb
Additional information
No response
Hey @francescomucio could you share a use case for having the path interpolated from param value?
...
"endpoint": {
    "path": "users/{id}",
    "params": {
        "id": 2,
    },
},
...
Hi @burnash,
I found this problem while testing whether an API returns a specific item, but I can also see it being used for automatically generated resources, or for partitioning a data load so that only one resource is fetched at a time (along with its follow-up calls).
For example, using the Datadog API I can imagine downloading the results of a set of test runs, but not all of them. The workflow would be:
- Call https://api.datadoghq.com/api/v1/synthetics/tests/{public_id}/results to get the latest test result IDs
- Call https://api.datadoghq.com/api/v1/synthetics/tests/{public_id}/results/{result_id} to get the details of a specific test
This can be overkill if we need to download the results of all the tests; public_id (the ID of the test) can be part of a list of tests that we need to download with dlt.
I hope it makes sense
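As a rough illustration of the two-step workflow above, the sketch below expands the two Datadog path templates for a chosen list of tests. It is plain Python, not dlt configuration; the `expand` helper and the ID values are made up for the example.

```python
from typing import Dict, Iterable, List

BASE_URL = "https://api.datadoghq.com/api/v1"
RESULTS_PATH = "synthetics/tests/{public_id}/results"
DETAIL_PATH = "synthetics/tests/{public_id}/results/{result_id}"


def expand(path: str, values: Iterable[Dict[str, str]]) -> List[str]:
    """Hypothetical helper: build one URL per set of path parameter values."""
    return [f"{BASE_URL}/{path.format(**v)}" for v in values]


# Made-up test IDs: only the tests we actually want to download.
public_ids = ["abc-123", "def-456"]

# Step 1: one "latest results" URL per selected test.
step_one = expand(RESULTS_PATH, [{"public_id": p} for p in public_ids])

# Step 2: one detail URL per (test, result) pair returned by step 1.
# The result ID here is invented; in practice it would come from the step 1 response.
step_two = expand(DETAIL_PATH, [{"public_id": "abc-123", "result_id": "42"}])

print(step_one)
print(step_two)
```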
Thanks for elaborating @francescomucio. I believe a similar case has just been reported in the community Slack: https://dlthub-community.slack.com/archives/C04DQA7JJN6/p1719291805818969
I'm thinking about how to fit this into the current rest_api config. Let me know if you are open to updating https://github.com/dlt-hub/verified-sources/pull/499, as my idea is a bit different: most likely we'd need to adjust the child resource, not the parent.
@francescomucio I'm unable to reproduce it now with the reqres.in example above:
2025-02-18 10:19:15,978|[INFO]|26403|8365117248|dlt|client.py|_send_request:124|Making GET request to https://reqres.in/api/users/2 with params={}, json=None
Most likely this has been fixed already. Closing this for now.