add support for JSON Lines format in 'target'

Open apalala opened this issue 6 years ago • 3 comments

Try to read a JSON object per input line when reading the whole input as JSON fails. This makes glom behave more like jq.
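A minimal sketch of the fallback described above, not the code from this pull request. The function name `load_target` is hypothetical, and the sketch assumes the whole input is already available as a string:

```python
import json


def load_target(text):
    """Parse text as a single JSON document, falling back to JSON Lines.

    Hypothetical helper for illustration; not part of glom's CLI.
    """
    try:
        # First attempt: the whole input is one JSON document.
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        # Fallback: parse one JSON value per non-blank line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Note that a one-line JSON Lines input parses successfully as a single document, so in this sketch it comes back as that one value rather than as a one-element list.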

apalala avatar May 20 '18 12:05 apalala

Codecov Report

Merging #24 into master will decrease coverage by 0.59%. The diff coverage is 27.27%.

@@            Coverage Diff            @@
##           master      #24     +/-   ##
=========================================
- Coverage   83.72%   83.12%   -0.6%     
=========================================
  Files           9        9             
  Lines         805      812      +7     
  Branches      133      136      +3     
=========================================
+ Hits          674      675      +1     
- Misses         91       96      +5     
- Partials       40       41      +1
Impacted Files    Coverage Δ
glom/cli.py       39.17% <27.27%> (-1.94%) ↓

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d06e3d1...5e053fb.

codecov-io avatar May 20 '18 12:05 codecov-io

Hey Juancarlo! First off, thanks for this; we definitely want this sort of direction in the CLI. I think I see an issue with the implementation, however.

To really make this functionality production-ready, I think we need to switch to a streaming approach, which means changing the interface to accept a file-like object and reading one line at a time. As it stands, this will hold the whole dataset in memory.
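A rough sketch of what such a streaming reader could look like, assuming the input arrives as any iterable of text lines (an open file, `sys.stdin`, or `io.StringIO`). The name `iter_json_lines` is made up for illustration and is not an existing glom function:

```python
import io
import json


def iter_json_lines(fileobj):
    """Yield one parsed JSON value per non-blank line of a file-like object.

    Hypothetical helper: only the current line is held in memory,
    rather than the whole dataset.
    """
    for line in fileobj:
        line = line.strip()
        if line:
            yield json.loads(line)


# Example usage with an in-memory stream standing in for a real file.
stream = io.StringIO('{"a": 1}\n\n{"a": 2}\n')
records = list(iter_json_lines(stream))
```

Because the reader is a generator, each record could be handed to `glom(record, spec)` as soon as it is parsed, so memory use would stay roughly proportional to the largest single line.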

Feel free to take a swing at that, but if you don't have the time, I was planning on doing it myself in the near future anyways, and I'll be sure to give you due credit :)

mahmoud avatar May 23 '18 01:05 mahmoud

Hey Mahmoud!

You are right! In fact, I have memory problems with the software I use for JSONL data because it doesn't stream the lines.

I'll take a shot at it.

apalala avatar May 23 '18 02:05 apalala