Add raw json output to JsonLdExtractor

Open Granitosaurus opened this issue 6 years ago • 6 comments

Sometimes the JSON-LD schema is preferred as a raw JSON string rather than a Python dict. This PR implements an as_json bool kwarg that determines whether to return a Python dictionary or a JSON string.

Some use cases:

1. Use an alternative JSON library
2. Pass custom json.loads kwargs
3. Use object hook functions
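As a rough illustration of the second use case, here is a minimal sketch of what having the raw string allows. The sample string is made up for illustration and stands in for whatever the extractor would return; no extruct call is shown.

```python
import json
from decimal import Decimal

# Made-up sample standing in for the raw JSON string of a JSON-LD script tag.
raw = '{"@type": "Offer", "price": 9.99, "priceCurrency": "EUR"}'

# Default parsing turns the price into a binary float.
as_float = json.loads(raw)

# With the raw string in hand, the caller can choose the number type instead.
as_decimal = json.loads(raw, parse_float=Decimal)

print(type(as_float["price"]).__name__)  # float
print(as_decimal["price"])               # 9.99
```

The same applies to any other json.loads keyword, or to a different parser entirely.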

My personal use case is to strip away @-prefixed keys with an object hook:

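import json

def strip_jsonld(data):
    # Drop JSON-LD keyword keys such as "@context" and "@type".
    return {k: v for k, v in data.items() if not k.startswith('@')}

# `script` is the raw JSON string taken from the JSON-LD script tag.
data = json.loads(script, object_hook=strip_jsonld)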
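Since json.loads calls object_hook for every decoded JSON object, the hook above also cleans nested objects. A small self-contained sketch, using a made-up sample string in place of real extractor output:

```python
import json

def strip_jsonld(data):
    # Drop keys that start with "@", i.e. the JSON-LD keywords.
    return {k: v for k, v in data.items() if not k.startswith('@')}

# Made-up sample; a real string would come from a JSON-LD script tag.
raw = (
    '{"@context": "https://schema.org", "@type": "Product", "name": "Widget", '
    '"offers": {"@type": "Offer", "price": "9.99"}}'
)

# The hook runs on the inner "offers" object first, then on the outer one.
cleaned = json.loads(raw, object_hook=strip_jsonld)
print(cleaned)  # {'name': 'Widget', 'offers': {'price': '9.99'}}
```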

Granitosaurus avatar Jan 25 '19 07:01 Granitosaurus

Codecov Report

Merging #103 into master will decrease coverage by 0.32%. The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master     #103      +/-   ##
==========================================
- Coverage    87.3%   86.98%   -0.33%     
==========================================
  Files          11       11              
  Lines         457      461       +4     
  Branches       97       98       +1     
==========================================
+ Hits          399      401       +2     
- Misses         52       53       +1     
- Partials        6        7       +1

Impacted Files      Coverage Δ
extruct/jsonld.py   100% <100%> (ø) :arrow_up:
extruct/rdfa.py     91.3% <60%> (-8.7%) :arrow_down:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 3ab5592...ff22c72.

codecov[bot] avatar Jan 25 '19 07:01 codecov[bot]

Also added this feature to the RDFa extractor and updated the README with short examples.

Granitosaurus avatar Jan 29 '19 01:01 Granitosaurus

This looks mostly ready. What about adding a test covering the usage of parse_json=False?

Gallaecio avatar Dec 17 '19 18:12 Gallaecio

Sorry that it took me so long to attend to this, but the tests gave me a bit of a headache :D

Should be good to go now!

Granitosaurus avatar Jan 04 '20 08:01 Granitosaurus

TestRDFa.test_wikipedia_xhtml_rdfa_no_prefix seems to be failing after your changes.

Gallaecio avatar Jan 07 '20 11:01 Gallaecio

> TestRDFa.test_wikipedia_xhtml_rdfa_no_prefix seems to be failing after your changes.

It has been failing on master for me too. I actually updated the fixtures in this PR to prevent it from failing, but the test just seems to be flawed in some sense: it keeps breaking either on Travis or on my machine locally. :man_shrugging:

Granitosaurus avatar Jan 16 '20 05:01 Granitosaurus