refine-client-py
Create New Project in Open Refine using JSON or XML
I'm wondering whether there are any examples of Python code that uses OpenRefine to create a new project from JSON or XML files. We are currently running the Python code from: https://github.com/maxogden/refine-python
This works perfectly for creating projects from CSV or TSV files. Only afterwards did we discover: https://github.com/PaulMakepeace/refine-client-py/.
Reviewing that code, it looks like the project format plays a big part here. However, there are two points that we can't get past:
- Line 151 in https://github.com/PaulMakepeace/refine-client-py/blob/master/google/refine/refine.py lists the different project formats, but I didn't see a JSON option. Is there one?
- Line 190 is where we started creating a new project from XML, but we didn't see a way to select a specific XML element to start with, similar to what you can do in the desktop version.
Love what y'all are doing here, and we want to expand to other formats. Are there any examples out there, or any advice? Thanks!
That giant hash of defaults essentially just mirrors the defaults that OpenRefine has. I derived it by making new projects by hand with OR and capturing the http stream (see "Useful Tools" in the README). This was about 1.5 years ago and I haven't tracked any subsequent developments, or I possibly missed the two parameters you mention.
Unfortunately I am not spending much time with OR these days but if you want to add these I'd suggest doing what I did and figuring out what the defaults are, either from OR source or capturing the stream; neither should be too difficult. Alternatively, the developers on the forum may be able to help.
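As a minimal sketch of the "capture the stream" approach: once a proxy has recorded the project-creation request, the import settings can be read back out of its URL. This assumes the settings travel as a JSON-encoded `options` query parameter. The helper name `extract_import_options` and the captured URL below are made up for illustration and are not taken from a real session.

```python
import json
from urllib.parse import urlsplit, parse_qs


def extract_import_options(captured_url):
    """Return the decoded `options` query parameter of a captured URL as a dict.

    Assumes the parameter holds URL-encoded JSON; returns {} if it is absent.
    """
    query = parse_qs(urlsplit(captured_url).query)
    raw = query.get("options", [])
    return json.loads(raw[0]) if raw else {}


# Hypothetical captured request URL (illustrative values only).
captured = (
    "http://127.0.0.1:3333/command/core/create-project-from-upload"
    "?options=%7B%22separator%22%3A%22%2C%22%2C%22headerLines%22%3A1%7D"
)
options = extract_import_options(captured)
print(options)
```

Comparing the decoded dictionaries for a CSV import and an XML or JSON import made by hand should show which keys differ between formats.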
Thanks, Paul. I've used Fiddler to try to capture the details behind the desktop app. Unfortunately, it only shows that "get-models" and "get-rows" are called, with no parameter details. I haven't checked the source code yet. I will do that, and if I come across a solution, I'll post it.
Don't forget that new_project_defaults isn't actually used (see https://github.com/PaulMakepeace/refine-client-py/blob/master/google/refine/refine.py#L150); you may pass anything to the new_project method's project_format parameter.
It's been a while, but I think you need to go earlier in the project creation process if you're seeing just get-models etc.
Happy to look at a pull request in a branch (ideally with tests) if you come up with something!
(Sent in reply to bkoncsol's comment of Tue, Jun 16, 2015, 12:40 PM: https://github.com/PaulMakepeace/refine-client-py/issues/8#issuecomment-112543121)
You need to provide a recordPath in the options parameter. That's the "click in that tree" step when you import XML or JSON files via the GUI. It's been a while since this issue was raised (June 2015), but maybe someone is interested in this extension of the CLI:
- repo: https://github.com/felixlohmeier/openrefine-client
- pull request: https://github.com/PaulMakepeace/refine-client-py/pull/12
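To make the recordPath idea concrete, here is a rough sketch of how such an options payload might be assembled. Several things are assumptions rather than facts from this thread: that recordPath is a list of node names leading to the record element, that anonymous JSON nodes are written as `_`, that the options are sent JSON-encoded in an `options` query parameter, and the endpoint path used below. The helper names and the element names `library` and `book` are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_import_options(record_path, **extra):
    """Build an import options dict containing a recordPath.

    record_path: sequence of node names from the document root down to
    the element that should become one record (assumed shape).
    """
    options = {"recordPath": list(record_path)}
    options.update(extra)
    return options


def creation_url(server, options):
    """Return a project-creation URL carrying the options as JSON.

    The endpoint path is an assumption, not confirmed by this thread.
    """
    query = urlencode({"options": json.dumps(options)})
    return server.rstrip("/") + "/command/core/create-project-from-upload?" + query


# XML: treat each <book> inside <library> as one record (hypothetical names).
xml_options = build_import_options(["library", "book"])

# JSON: a top-level array of objects, assuming "_" names anonymous nodes.
json_options = build_import_options(["_", "_"])

print(creation_url("http://127.0.0.1:3333", xml_options))
```

With the client library, the same dictionary would be what ends up in the options sent alongside the project_format value; check the pull request above for how it is actually wired in.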
The new project defaults work with pull request #17.