micromasters
Async task to update course metadata
The source of truth for course titles and dates is the MITx Course Management database in QuickBase. We should be pulling all data from there. Data available from edX's APIs may not be updated in a timely manner.
Acceptance criteria
- [ ] Use course catalog API, i.e. https://api.edx.org/catalog/v1/catalogs/
- Docs at http://course-catalog-api-guide.readthedocs.io/en/latest/index.html
- edX has swagger documents describing the library; perhaps they can be used to generate the client code. See https://github.com/edx/api-manager/
- [x] Request service account from edX
- [ ] API client code should be added to the library at https://github.com/mitodl/edx-api-client
- [ ] Update course title, thumbnails and course runs.
- [ ] course runs include short_description, long_description, start (date), end (date), enrollment_start, enrollment_end
- [ ] seats include price and upgrade_deadline
Running the update every six hours is adequate.
Two pieces of data that I know we need, price and upgrade_deadline, can be found in the catalog API.
Docs are at http://course-catalog-api-guide.readthedocs.io/en/latest/course_catalog/catalog.html#cc-api-seats
My personal edX account has been enabled as a service account for testing, but at the moment I can't access any catalogs. I'll update if/when that gets fixed.
I had to ask the product manager at edX to create a catalog that my user can access.
It's a simple read-only API. You can request a list of catalogs, and then you can request all the courses in a catalog. As far as I know, there's no way to query the catalog. For example:
curl -X GET -H "Authorization: JWT {access_token}" https://api.edx.org/catalog/v1/catalogs/
{
"count": 1,
"next": null,
"previous": null,
"results": [{
"id": 34,
"name": "All courses",
"query": "*",
"courses_count": 1811,
"viewers": ["pdpinch"]
}]
}
curl -X GET -H "Authorization: JWT {access_token}" https://api.edx.org/catalog/v1/catalogs/34/courses/
The response to the courses endpoint is paginated.
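A minimal sketch of walking the paginated response, assuming the courses endpoint uses the same `next` / `results` envelope shown in the catalogs response above (I have not confirmed that here). The helper names `fetch_json` and `iter_results` are hypothetical and are not part of edx-api-client.

```python
import json
import urllib.request
from typing import Callable, Iterator, Optional


def fetch_json(url: str, access_token: str) -> dict:
    """GET a URL with the JWT header from the curl examples and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"JWT {access_token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_results(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item from a paginated listing by following `next` links.

    Assumes each page is a dict with a `results` list and a `next` URL
    that is null on the last page.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")
```

Passing the fetch function in as an argument keeps the pagination logic testable without network access. A real call might look like `iter_results(courses_url, lambda u: fetch_json(u, token))`.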
The courses are keyed on partial course_keys, i.e. `{org}+{number}`. The course object contains a list of `course_runs`. Each entry in `course_runs` contains a list of `seats`. `seats` of type `verified` have a `price` and an `upgrade_deadline`.
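The nesting described above could be flattened roughly as follows. This is only a sketch: the field names `course_runs`, `seats`, `price`, and `upgrade_deadline` come from the discussion, while the `type` and `key` field names and the sample values are assumptions for illustration.

```python
from typing import Iterator


def iter_verified_seats(course: dict) -> Iterator[dict]:
    """Yield price and upgrade_deadline for each verified seat in a course.

    Assumes each seat carries its kind in a `type` field and each
    course run is identified by a `key` field.
    """
    for run in course.get("course_runs", []):
        for seat in run.get("seats", []):
            if seat.get("type") == "verified":
                yield {
                    "run_key": run.get("key"),
                    "price": seat.get("price"),
                    "upgrade_deadline": seat.get("upgrade_deadline"),
                }


# Hypothetical course object shaped like the description above.
sample_course = {
    "key": "ExampleX+101",
    "course_runs": [
        {
            "key": "course-v1:ExampleX+101+2017",
            "seats": [
                {"type": "audit", "price": "0.00", "upgrade_deadline": None},
                {
                    "type": "verified",
                    "price": "49.00",
                    "upgrade_deadline": "2017-06-01T00:00:00Z",
                },
            ],
        }
    ],
}

verified = list(iter_verified_seats(sample_course))
```

Using `.get()` throughout means a course with no runs, or a run with no seats, simply yields nothing rather than raising.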
I updated the description to include some of the info mentioned in the comments.
I requested the accounts from edX and I have a prod account now, but no stage account yet.
I need to take another look at the edX catalog API to see if it can do what we need. I'm concerned that it doesn't have enough historical data about past runs.
Another option would be to gather data from the MITx Business Management (BM) system aka QuickBase.
The MITx Course Management database in QuickBase is clearly a more reliable choice than the edX APIs. I am updating the description to reflect that.