tranco-list-api
tranco-list-api copied to clipboard
tranco-list-api
This is an early work in progress, anything from database schema, JSON format or API endpoints are candidates for change
This is code that can be used to create a read-only HTTP API for the Tranco top one million sites list.
The parts involved:
- PostgreSQL is expected as the storage backend.
tldbwritercontinously checks for the latest list ID and loads it into the database.tlapidresponds to HTTP requests based on the database contents.
Building containers (substitute "eest" with your own account or registry)
$ docker build -t eest/tldbwriter:vX.Y.Z -f Dockerfile-tldbwriter .
$ docker push eest/tldbwriter:vX.Y.Z
$ docker build -t eest/tlapid:vX.Y.Z -f Dockerfile-tlapid .
$ docker push eest/tlapid:vX.Y.Z
Managing config secrets used by containers
Using the pipe-to-kubectl method allows later updates to the secret if the config needs to change.
$ kubectl create secret generic tldbwriter --from-file=config=./tldbwriter.toml -n tlapi --dry-run=client -o yaml | kubectl apply -f -
$ kubectl create secret generic tlapid --from-file=config=./tlapid.toml -n tlapi --dry-run=client -o yaml | kubectl apply -f -
After a new configuration file is uploaded the affected deployment will need to be restarted using something like this:
$ kubectl rollout restart deployment tlapid -n tlapi
Currently supported endpoints
Get the top ten sites
$ curl -s https://example.com/api/sites | jq .
{
"list": "66NX",
"last-modified": "Fri, 06 Sep 2019 22:15:50 GMT",
"reference": "https://tranco-list.eu/list/66NX",
"sites": [
{
"site": "google.com",
"rank": 1
},
{
"site": "facebook.com",
"rank": 2
},
{
"site": "netflix.com",
"rank": 3
},
{
"site": "youtube.com",
"rank": 4
},
{
"site": "microsoft.com",
"rank": 5
},
{
"site": "amazon.com",
"rank": 6
},
{
"site": "twitter.com",
"rank": 7
},
{
"site": "tmall.com",
"rank": 8
},
{
"site": "instagram.com",
"rank": 9
},
{
"site": "linkedin.com",
"rank": 10
}
]
}
Set a starting point and/or number of results to get
$ curl -s 'https://example.com/api/sites?start=50&count=2' | jq .
{
"list": "66NX",
"last-modified": "Fri, 06 Sep 2019 22:15:50 GMT",
"reference": "https://tranco-list.eu/list/66NX",
"sites": [
{
"site": "mozilla.org",
"rank": 50
},
{
"site": "googleusercontent.com",
"rank": 51
}
]
}
Fetch a site by name
$ curl -s https://example.com/api/site/google.com | jq .
{
"list": "66NX",
"last-modified": "Fri, 06 Sep 2019 22:15:50 GMT",
"reference": "https://tranco-list.eu/list/66NX",
"sites": [
{
"site": "google.com",
"rank": 1
}
]
}
Fetch a site by rank
$ curl -s https://example.com/api/rank/1 | jq .
{
"list": "66NX",
"last-modified": "Fri, 06 Sep 2019 22:15:50 GMT",
"reference": "https://tranco-list.eu/list/66NX",
"sites": [
{
"site": "google.com",
"rank": 1
}
]
}