
Aggregations

Open dcmoura opened this issue 2 years ago • 12 comments

What would be the most efficient way of calculating the average of a JSON property using fx? Thanks

dcmoura avatar Apr 19 '22 11:04 dcmoura

Using js, python or ruby?

antonmedv avatar Apr 19 '22 12:04 antonmedv

Whatever would be more efficient :-) My question is: if you have to calculate an average from a large file (1GB-10GB of JSON lines), which option would you recommend for the best processing time, without running out of memory?

dcmoura avatar Apr 19 '22 14:04 dcmoura

Something like fx data.json 'sum' is OK.

antonmedv avatar Apr 19 '22 15:04 antonmedv

Let's say we want the average of the overall property. With jq I would do one of the following:

jq -n '[inputs.overall] | add/length' data.json

or

jq -n 'def sigma(s): reduce s as $x([0,0]; [.[0]+$x, .[1]+1]); sigma(inputs | .overall) | .[0] / .[1]' data.json

dcmoura avatar Apr 19 '22 17:04 dcmoura

So it’s an array where each element contains an overall field?

antonmedv avatar Apr 19 '22 17:04 antonmedv

JSON lines, e.g.

{"overall": 2.0, "another": "bla"}
{"overall": 4.0, "another": "bla bla"}
{"overall": 1.0}
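For comparison, here is a minimal Node.js sketch of the same running sum/count idea as the jq reduce above. It is independent of fx: createAverager and averageOfFile are hypothetical names made up for illustration, and skipping records that have no numeric value for the property is an assumption about the desired behaviour.

```javascript
const fs = require('fs');
const readline = require('readline');

// Hypothetical helper (not part of fx): keeps only a running sum and count,
// so memory use does not depend on how many lines are fed in.
function createAverager(property) {
  let sum = 0;
  let count = 0;
  return {
    // Feed one JSON line. Blank lines and records without a numeric
    // `property` are skipped (an assumption, not fx or jq behaviour).
    add(line) {
      if (line.trim() === '') return;
      const value = JSON.parse(line)?.[property];
      if (typeof value === 'number') {
        sum += value;
        count += 1;
      }
    },
    // Average of the values seen so far, or NaN if none were counted.
    result() {
      return count === 0 ? NaN : sum / count;
    },
  };
}

// Streams the file line by line instead of loading it whole.
async function averageOfFile(path, property) {
  const averager = createAverager(property);
  const lines = readline.createInterface({
    input: fs.createReadStream(path),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  for await (const line of lines) averager.add(line);
  return averager.result();
}
```

With this sketch, averageOfFile('data.json', 'overall') returns a promise for the average; only one line plus two numbers are held in memory at a time.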

dcmoura avatar Apr 20 '22 07:04 dcmoura

I see. This feature hasn’t been ported from the nodejs version to the go version yet. Right now fx works only on a single JSON value.

antonmedv avatar Apr 20 '22 07:04 antonmedv

OK, thank you

dcmoura avatar Apr 20 '22 09:04 dcmoura

Actually I have created a PR to update the documentation concerning reducers. It describes how you can add your own functions/global data to the scope and namespace of the reducers.

You could then define a global variable to store your result, and then specify a reducer that updates that variable as it iterates over the JSON data (map/each). You could specify the reducer at the command line, or have a dedicated function loaded from the .fxrc.js file that operates on it.

PR #203 contains the updated reducers.md and examples. I do not know when it will be committed so you can use the PR link to see the file if it hasn't been updated yet.

digitallyserviced avatar May 04 '22 05:05 digitallyserviced

Actually fx now lacks support for aggregation.

I’m planning to add support via a jq-style --slurp arg.

antonmedv avatar May 04 '22 06:05 antonmedv

@dcmoura @antonmedv it lacks aggregation only in the sense that there are no specific arguments or options that make fx aware of it.

It does support aggregation, because reducers in the node version are just ... NodeJS ... which can operate on the data however the hell you want, with a nodejs namespace/scope preloaded via .fxrc.js with any modules/libraries/custom code or data.

The documentation update in PR #203 actually details using .fxrc.js to provide more functionality to the command line reducer scripts as well as additional data sources that can be included in the namespace used at the command line.

See example usages

digitallyserviced avatar May 04 '22 14:05 digitallyserviced

I’m also thinking of extending .fxrc.js to JS reducers as well.

antonmedv avatar May 04 '22 18:05 antonmedv

Done!

antonmedv avatar Sep 19 '23 08:09 antonmedv