dataportals.org

Bulk import

Open marks opened this issue 8 years ago • 18 comments

How would one bulk submit data portals to this list?

marks avatar Aug 11 '15 13:08 marks

@marks the underlying DB is basically a spreadsheet so if the data is in the right structure one can copy and paste into the spreadsheet.

rufuspollock avatar Aug 11 '15 13:08 rufuspollock

Right - it's just a bit unclear without digging into the code where the underlying db lives. The only clear path is through the google spreadsheet :-\

For example, I can probably get you a description, lat/long, etc. for all these data portals. A lot of the ones on the map are currently out of date

http://api.us.socrata.com/api/catalog/v1/domains

marks avatar Aug 11 '15 13:08 marks

@marks this is a great idea and much appreciated. Let me know if you need any further help.

Stephen-Gates avatar Aug 19 '15 20:08 Stephen-Gates

What's the best way to get this started? Provide the content that's in the google form as a spreadsheet?

marks avatar Aug 19 '15 22:08 marks

@marks I'll create a Google sheet and share it with you. There are some differences between the online form and what's published that I'm discovering as a new Editor of DataPortals.org. I'll try and make things efficient for both of us. (May take a couple of days).

Stephen-Gates avatar Aug 19 '15 23:08 Stephen-Gates

@rgrp @mattfullerton please review this cleaned up copy of the database - personal information removed. If you think this is OK to ask @marks to fill out, say so here, then Mark can take a copy and share back here.

Validation rules appear to have changed over time. I've added validation based on the current form.

Stephen-Gates avatar Aug 21 '15 12:08 Stephen-Gates

@marks You'll note in the database, columns for api_endpoint, api_type, and full_metadata_download with descriptions at http://goo.gl/PfHswi

Is there a standard way to access Socrata to download the full metadata?

Stephen-Gates avatar Aug 26 '15 05:08 Stephen-Gates

Socrata portals have a data.json which includes a description of each dataset and from which a count could be derived. Is that what you're asking?

marks avatar Aug 26 '15 11:08 marks
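As a rough illustration of deriving a count from a portal's data.json, here is a minimal Python sketch. It assumes the file follows the Project Open Data layout, where v1.1 catalogs keep entries under a top-level `dataset` list and older v1.0 files are a bare list. The helper names `catalog_entries` and `count_datasets` and the sample catalog are made up for this example; fetching the file from a real portal is left out.

```python
import json


def catalog_entries(catalog):
    """Return the list of dataset entries from a parsed data.json."""
    # Assumption: v1.1 catalogs are objects with a "dataset" list,
    # while v1.0 catalogs are a bare list of entries.
    if isinstance(catalog, dict):
        return catalog.get("dataset", [])
    return list(catalog)


def count_datasets(catalog):
    """Count the datasets described in a parsed data.json."""
    return len(catalog_entries(catalog))


# Hypothetical sample standing in for a downloaded data.json.
sample = json.loads(
    '{"dataset": ['
    '{"title": "A", "description": "First dataset"},'
    '{"title": "B", "description": "Second dataset"}'
    ']}'
)

for entry in catalog_entries(sample):
    print(entry.get("title", ""), "-", entry.get("description", ""))
print("datasets:", count_datasets(sample))
```
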

The ideal case for the last one is a dump of all the metadata for all datasets, i.e. the same as what you would get if you grabbed the metadata from each one individually. If it is that or nearly that I would put it in. If it isn't possible, leave blank.

mattfullerton avatar Aug 26 '15 12:08 mattfullerton

@mattfullerton I might have misspoken. I won't be able to get all of this information out of one single API, but I am committed to working with the right people to compile this and get it contributed. Do you literally want a CSV with the different fields that are marked as "In Production DB" or what?

marks avatar Aug 26 '15 21:08 marks

My take is a row per portal in a CSV file or equivalent will be fine. I didn't expect to go to one place and get catalog contents for many portals. Just complete as much as you can.

Stephen-Gates avatar Aug 26 '15 22:08 Stephen-Gates

Sorry if I was confusing things here - I was only explaining what goes in the full_metadata_download field. The list of catalogs can be in whatever format you like. CSV would be the easiest.

mattfullerton avatar Aug 27 '15 07:08 mattfullerton
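A "row per portal" CSV could be produced along these lines. This is only a sketch: the thread names the columns `api_endpoint`, `api_type`, and `full_metadata_download`, but `name` and `url` are assumed here for illustration, as are the helper `portals_to_csv` and the example.org portal. Fields that are not known are simply left blank, as suggested above.

```python
import csv
import io

# api_endpoint, api_type and full_metadata_download come from the thread;
# "name" and "url" are assumed column names, not the real schema.
FIELDS = ["name", "url", "api_endpoint", "api_type", "full_metadata_download"]


def portals_to_csv(portals):
    """Serialize a list of portal dicts to CSV text, one row per portal."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for portal in portals:
        # Missing fields are written as empty cells.
        writer.writerow({field: portal.get(field, "") for field in FIELDS})
    return buf.getvalue()


# Hypothetical portal record; the values are placeholders.
rows = [
    {"name": "Example Portal", "url": "https://data.example.org", "api_type": "Socrata"},
]
csv_text = portals_to_csv(rows)
print(csv_text)
```
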

@marks @Stephen-Gates @mattfullerton are we good to go here? It would be great to get this imported.

rufuspollock avatar Nov 18 '15 13:11 rufuspollock

@rgrp - just to be clear, you want folks to submit PRs to the portals.csv file in this repo?

marks avatar Nov 18 '15 14:11 marks

@marks yes - or you can give us a CSV and we'll perform the merge for you :-) - obviously you doing the PR is probably preferable.

rufuspollock avatar Nov 19 '15 11:11 rufuspollock
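For anyone preparing such a merge by hand, the following Python sketch shows one way to append contributed rows to an existing CSV while skipping portals that are already listed. It assumes both files share the same header and that a `url` column identifies a portal; neither is confirmed for the real portals.csv, and `merge_portals` is a hypothetical helper.

```python
import csv
import io


def _normalize(url):
    """Normalize a URL for comparison (case and trailing slash)."""
    return url.strip().rstrip("/").lower()


def merge_portals(existing_csv, contributed_csv, key="url"):
    """Return existing rows plus contributed rows whose key is not yet present."""
    # Assumption: both CSV texts have the same columns, including `key`.
    existing = list(csv.DictReader(io.StringIO(existing_csv)))
    contributed = list(csv.DictReader(io.StringIO(contributed_csv)))
    seen = {_normalize(row[key]) for row in existing}
    merged = list(existing)
    for row in contributed:
        normalized = _normalize(row[key])
        if normalized not in seen:
            seen.add(normalized)
            merged.append(row)
    return merged


# Placeholder data; real files would be read from disk.
existing_text = "name,url\nA,https://a.example.org/\n"
contributed_text = "name,url\nA again,https://A.example.org\nB,https://b.example.org\n"
merged_rows = merge_portals(existing_text, contributed_text)
print([row["name"] for row in merged_rows])
```

Existing rows win over contributed duplicates here; updating out-of-date entries in place would need a different rule.
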

@marks is this complete?

rufuspollock avatar Mar 14 '16 12:03 rufuspollock

I've made some corrections but run into other issues and time constraints. Feel free to close this issue if you want to keep the backlog clean

Thanks

marks avatar Mar 14 '16 12:03 marks

@marks we can keep it open for the time being - also, if you are busy, any chance you could post the list of portals you wanted added somewhere as CSV so someone else can help integrate?

rufuspollock avatar Mar 15 '16 11:03 rufuspollock