refine-client-py

Create New Project in Open Refine using JSON or XML

Open bkoncsol opened this issue 9 years ago • 5 comments

I'm wondering if there are any examples of Python code that uses OpenRefine to create a new project from JSON or XML files. We are currently running the Python code from: https://github.com/maxogden/refine-python

This works perfectly for creating projects from CSV or TSV files. Only afterwards did we discover: https://github.com/PaulMakepeace/refine-client-py/.

Reviewing that code, it looks like the project format plays a big part here. However, there are two points that we can't get past:

  1. Line 151 in https://github.com/PaulMakepeace/refine-client-py/blob/master/google/refine/refine.py lists the different project formats, but I didn't see a JSON option. Is there one?

  2. Line 190 is where we started creating a new project from XML, but we didn't see a way to select a specific XML element to start with, similar to what you can do in the desktop version.

Love what y'all are doing here, and we want to expand to other formats. Are there any examples out there, or any advice? Thanks!

bkoncsol avatar Jun 16 '15 19:06 bkoncsol

That giant hash of defaults essentially just mirrors the defaults that OpenRefine has. I derived it by making new projects by hand with OR and capturing the HTTP stream (see "Useful Tools" in the README). This was about 1.5 years ago and I haven't tracked any subsequent developments, so I may simply have missed the two parameters you mention.

Unfortunately, I am not spending much time with OR these days, but if you want to add these I'd suggest doing what I did and figuring out what the defaults are, either from the OR source or by capturing the stream; neither should be too difficult. Alternatively, the developers on the forum may be able to help.
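As a rough illustration of the "capture the stream" approach, here is a minimal Python 3 sketch that pulls an options object out of a captured request body. It assumes the body is form-encoded and carries a JSON-valued field named `options` next to a `format` field; both field names, and the sample body itself, are assumptions made up for illustration rather than something taken from a real capture.

```python
import json
from urllib.parse import parse_qs, urlencode


def options_from_captured_body(body):
    """Decode a form-encoded request body and return (format, options dict).

    Assumes hypothetical field names 'format' and 'options'; adjust them to
    whatever the captured request actually contains.
    """
    fields = parse_qs(body)
    project_format = fields["format"][0]
    options = json.loads(fields["options"][0])
    return project_format, options


# Stand-in for a body copied out of an HTTP capture tool (made-up values).
captured_body = urlencode({
    "format": "text/xml",
    "options": json.dumps({"recordPath": ["library", "book"]}),
})

project_format, options = options_from_captured_body(captured_body)
print(project_format)
print(json.dumps(options, indent=2, sort_keys=True))
```

Printing the decoded options this way makes it easier to compare what the GUI sends against the defaults hard-coded in a client.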

paulmakepeace avatar Jun 16 '15 19:06 paulmakepeace

Thanks, Paul. I've used Fiddler to try to capture the details behind the desktop app. Unfortunately, it only shows that "get-models" and "get-rows" are called, with no parameter details. I haven't checked the source code yet. I'll do that, and if I come across a solution, I'll post it.

bkoncsol avatar Jun 16 '15 19:06 bkoncsol

Don't forget that new_project_defaults isn't actually used (see https://github.com/PaulMakepeace/refine-client-py/blob/master/google/refine/refine.py#L150): you may pass anything to the new_project method's project_format parameter.
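Because project_format is passed through as given, a caller could pick the format string from the file extension before calling new_project. The sketch below is only that, a sketch: the helper name is hypothetical, and the identifiers "text/xml" and "text/json" are assumptions about what OpenRefine's importers expect, so check them against the OR source or a captured request.

```python
import os

# Assumed OpenRefine format identifiers, keyed by file extension.
FORMAT_BY_EXTENSION = {
    ".csv": "text/line-based/*sv",
    ".tsv": "text/line-based/*sv",
    ".xml": "text/xml",
    ".json": "text/json",
}


def guess_project_format(filename):
    """Return the assumed format identifier for a file, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in FORMAT_BY_EXTENSION:
        raise ValueError("no known OpenRefine format for %r" % extension)
    return FORMAT_BY_EXTENSION[extension]


print(guess_project_format("records.xml"))
print(guess_project_format("export.JSON"))
```

The returned string is what would be handed to the project_format parameter mentioned above.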

It's been a while but I think you need to go earlier in the project creation process if you're seeing just get-models etc.

Happy to look at a pull request in a branch (ideally with tests) if you come up with something!

paulmakepeace avatar Jun 16 '15 21:06 paulmakepeace

You need to provide a recordPath in the options parameter. That's the "click in that tree" thing when you import XML or JSON files via the GUI. It's been a while (June 2015) since this issue was raised, but maybe someone is interested in this extension of the CLI:

  • repo: https://github.com/felixlohmeier/openrefine-client
  • pull request: https://github.com/PaulMakepeace/refine-client-py/pull/12
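To make the recordPath idea concrete, here is a minimal Python 3 sketch that lists the element paths in an XML document (roughly the tree you would click through in the GUI) and builds an options object for one of them. The helper names and the sample document are hypothetical, and the exact shape OpenRefine expects beyond the recordPath key is an assumption; namespaced tags are also left as ElementTree reports them, which may not match what OpenRefine wants.

```python
import json
import xml.etree.ElementTree as ET


def candidate_record_paths(xml_text):
    """Return every distinct element path in the document, root first."""
    root = ET.fromstring(xml_text)
    seen = []

    def walk(element, prefix):
        path = prefix + [element.tag]
        if path not in seen:
            seen.append(path)
        for child in element:
            walk(child, path)

    walk(root, [])
    return seen


def import_options(record_path):
    """Build an options dict holding only the record path (assumed key name)."""
    return {"recordPath": list(record_path)}


sample = (
    "<library>"
    "<book><title>A</title></book>"
    "<book><title>B</title></book>"
    "</library>"
)

paths = candidate_record_paths(sample)
for path in paths:
    print("/".join(path))

# Pick the path whose last element is one record, here each <book>.
options_json = json.dumps(import_options(["library", "book"]))
print(options_json)
```

Choosing `["library", "book"]` corresponds to clicking a `<book>` element in the import preview, so that each book becomes one record.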

felixlohmeier avatar Feb 01 '17 23:02 felixlohmeier

The new project defaults work with pull request #17.

daniel-butler avatar May 20 '18 04:05 daniel-butler