datagov-wptheme
Implement Quality Assurance features and user ratings
https://github.com/ckan/ckanext-qa is already installed, but it's not exposed
- Implement automated CKAN QA features in template for dataset and resource
- Implement feature for user-provided feedback/comments (e.g. built-in features in CKAN)
Fully implementing QA extension should also help address other issues like #55
I'll plan to update this issue with a mockup, but here's what they look like on data.gov.uk for reference
For detecting file formats
For checking on broken links
Some of this is displayed on resource pages on catalog.data.gov, but it's missing most of the detail, eg
It looks like ckanext-qa is still installed, but maybe not fully enabled, not properly configured, or the results are not fully displayed in the template. I seem to recall there being some potential issue with the ckanext-archiver extension, which is a prerequisite. That is also installed, but maybe not fully enabled.
Hi, I've been working on this issue with the help of Fuhu and Kishore. I've been working with the ckanext-archiver and ckanext-qa extensions.
Here is what the resource page currently looks like.
Currently, what is shown is the QA openness score. Since we will not be downloading cached copies, the ckanext-qa extension looks at file extensions for scoring.
The ckanext-archiver gathers information about the "brokenness" of links. I've attached three screenshots so you can see examples of a failed link, an inconclusive link, and a successful link.
Please let me know of any other features or information that should be included. Also, the format, design, and location of the content can be modified.
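As a rough illustration of scoring by file extension, here is a minimal sketch. It is not the ckanext-qa implementation: the function name `openness_score` and the `EXTENSION_SCORES` table are hypothetical, and the star values only loosely follow the 5-star open data scheme.

```python
import os
from urllib.parse import urlparse

# Hypothetical extension-to-stars table (an assumption for illustration,
# not the mapping shipped with ckanext-qa).
EXTENSION_SCORES = {
    ".pdf": 1,   # available online, but not structured
    ".xls": 2,   # structured, proprietary format
    ".csv": 3,   # structured, non-proprietary format
    ".json": 3,
    ".xml": 3,
    ".rdf": 4,   # uses URIs to identify things
}


def openness_score(resource_url):
    """Guess an openness score from the URL's file extension only."""
    path = urlparse(resource_url).path          # drops query string and fragment
    extension = os.path.splitext(path)[1].lower()
    # Unknown or missing extensions fall back to the lowest score.
    return EXTENSION_SCORES.get(extension, 1)


print(openness_score("http://example.com/data/complaints.csv"))
```

Because nothing is downloaded, a score derived this way only reflects what the URL claims to be, not what the file actually contains.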
Will try to get this on BSP-dev after some other bugs/issues are figured out. Currently resources aren't being archived after harvesting.
Now it shows how many times the link has failed. It also tells the user the last time the link was successful; if there has never been a successful attempt, it says there is no record of the link working since the first check.
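The message logic described here can be sketched as follows. This is only an illustration: `link_status_message` and its parameters are hypothetical names, not fields or helpers taken from ckanext-archiver, and the wording of the messages is an assumption.

```python
from datetime import datetime


def link_status_message(failure_count, last_success=None):
    """Build the text shown to the user for a resource link.

    failure_count: number of consecutive failed checks (hypothetical field).
    last_success:  datetime of the last successful check, or None if the
                   link has never been seen working.
    """
    if failure_count == 0:
        return "Link is working."
    times = "time" if failure_count == 1 else "times"
    message = f"Link has failed {failure_count} {times}."
    if last_success is None:
        return message + " No record of it working since the first check."
    return message + f" Last successful check: {last_success:%Y-%m-%d}."


print(link_status_message(3))
print(link_status_message(1, datetime(2017, 5, 1)))
```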
@philipashlock @hkdctol
The link quality check and openness rating info will be moved to the "Resource Quality" section that is currently on catalog production; the openness rating is the same thing as Resource Quality. See the current snapshot in catalog
Also, we currently don't have any license info for any of the geospatial datasets. Should we default to "https://creativecommons.org/publicdomain/zero/1.0/" or "http://www.usa.gov/publicdomain/label/1.0/"? Without a license value, the openness rating would not be calculated.
Here are the latest updates/formatting to the resources page. We put the archiver and quality information into respective tables near the bottom of the page. Brief descriptions are also shown in the top right of the page.
If you hover over the stars, a tooltip describing the criteria for openness scores is shown. If you click the stars, it leads you to this page: http://5stardata.info/en/.
The More details link scrolls the user down to the tables.
Also added a small change to the package page. Under the resource link, there is information on the status of broken links.
@John123Yu Can we also display openness rating on the dataset page, along with the link check status?
CC: @philipashlock @JJediny @hkdctol
This is how the dataset page looks with openness score next to link status.
@John123Yu @FuhuXia @hkdctol @JJediny We are good to go ahead with the US Govt work license "http://www.usa.gov/publicdomain/label/1.0/" for Federal Geospatial datasets.
We will need to perform a one time bulk update for geospatial harvest sources under federal organizations and also address the new datasets during harvesting process.
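A minimal sketch of the bulk update step, assuming datasets are available as plain dictionaries. The `is_federal_geospatial` and `license_url` keys are hypothetical placeholders, not the actual catalog schema, and writing the results back to CKAN is left out.

```python
# Default license agreed on in this thread for Federal geospatial datasets.
DEFAULT_LICENSE_URL = "http://www.usa.gov/publicdomain/label/1.0/"


def apply_default_license(datasets, default=DEFAULT_LICENSE_URL):
    """Return updated copies of the datasets that need the default license.

    Only datasets flagged as federal geospatial (hypothetical key) that have
    no license value are changed; the input dictionaries are not modified.
    """
    updated = []
    for dataset in datasets:
        if dataset.get("is_federal_geospatial") and not dataset.get("license_url"):
            updated.append({**dataset, "license_url": default})
    return updated


sample = [
    {"name": "a", "is_federal_geospatial": True, "license_url": ""},
    {"name": "b", "is_federal_geospatial": True, "license_url": "https://example.org/l"},
    {"name": "c", "is_federal_geospatial": False},
]
for dataset in apply_default_license(sample):
    print(dataset["name"], dataset["license_url"])
```

The same check could run during harvesting so that new datasets receive the default before the QA job scores them.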
@kvuppala sure, if the bulk update is the only thing left I think you can close next week.
@hkdctol New QA extension changes are not yet deployed to production. We will move them along with the harvesting process changes that apply the default license, and also the bulk update. After that we will rerun the link check on the catalog for resources.
@John123Yu @FuhuXia what is it that we have to review/ok to go to production?
@FuhuXia @hkdctol I just missed Fuhu, so I was unable to speak with him directly, but I believe he is currently reindexing Solr to have the correct license for Federal Organization Spatial Datasets. The quality assurance plugin will not run for datasets without licenses. I can speak with him again tomorrow morning to confirm what else needs to be done, and we will get back to you asap.
Kishore mentioned that you will be reviewing the front-end changes. Are there any front-end changes that you, Phil, or John would like to see? I can make those changes without causing much delay.
Fuhu set up a personal tigbox for me to test on. You can look at the current frontend changes using these two urls:
http://34.227.176.216/dataset/consumer-complaint-database
http://34.227.176.216/dataset/consumer-complaint-database/resource/2f297213-7198-4be1-af1e-2d2623e7f6e9
@FuhuXia @hkdctol We are currently applying the latest changes to BSP Dev. When that finishes, we will let you know so you can check the changes. After that we can move to push the changes to production.
Changes are now pushed to production, and the qa check jobs are running.