json-machine
json-machine copied to clipboard
Recursive iteration
Enable iteration inside iteration to lazily iterate any structure.
foreach (RecursiveJsonMachine::fromFile('universe.json', '/results') as $gallaxy) {
foreach($gallaxy as $solarSystem) {
foreach($solarSystem as $solarSystemObject) {
// process star, planet, asteroid ...
}
}
}
~The solution lies in yield from
.~ (It doesn't. yield from
also tries to rewind the generator. It must be a good old iteration via Iterator
methods.) "Child" generator will be given a reference to the top level SyntaxCheckedTokens
/Tokens
generator and will ~yield from
~ yield
from it until the end of its iteration level. Then the control will be returned back to the higher-level generator. All will be happening on the same instance of SyntaxCheckedTokens
/Tokens
How this will reflect in public API must be well thought through.
I like (and need) this. How is this going? Can I help? Thanks
Thank you. No progress whatsoever :) The idea persists though.
Could a nested JSON pointer - as explained in the readme - help you in the meantime?
I tried to implement it and turned this Issue into PR. Can you try it and throw in some suggestions?
Codecov Report
Attention: 2 lines
in your changes are missing coverage. Please review.
Comparison is base (
5147f38
) 100.00% compared to head (01fc434
) 99.67%. Report is 1 commits behind head on master.
Files | Patch % | Lines |
---|---|---|
src/Parser.php | 95.65% | 2 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #36 +/- ##
=============================================
- Coverage 100.00% 99.67% -0.33%
- Complexity 185 220 +35
=============================================
Files 17 19 +2
Lines 526 607 +81
=============================================
+ Hits 526 605 +79
- Misses 0 2 +2
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
I ponder that a better name for this feature might be something like lazy_vectors
rather than recursive
. Because for example lazy_scalars
option (as stream resources) may arrive later. Moreover, recursive
is somehow vague and does not exactly express what the option does. Any ideas on the name anyone? Ping @fwolfsjaeger @cerbero90
Hey @halaxa, I was looking into a very similar solution a couple of days ago. Great timing!
I'd say that what we are leveraging here are technically nested iterators, so probably naming them as what they actually are may cause less confusion.
I see that you generate iterators for objects too, wouldn't iterators for arrays be enough?
I agree that recursive is not the right word to use. As @cerbero90 suggested, nested iterator would make sense.
Also, by returning an actual Iterator object it would be possible to use SPL iterators to further filter the data.
So the name of the feature and the option would be nested iterators?
The point about not iteratig objects made me think of two optinons instead of just one. One for arrays as iterators and second for objects as iterators. That way users can decide what they want.
Also, what about providing a method called for example $nestedIterator->materialize()
so users can decide which level/item should be accessible via object/array?
The nested iterator should also be RecursiveIterator
, not just Generator
instance so it's compatible with spl recursive iterators.
Looking at the recursive example in the readme in this PR, I think a very useful method might also be $user->advanceTo('friends')
with support for syntax sugar $user['friends']
. The example would then simplify to something like
foreach ($users as $user) {
foreach ($user['friends'] as $friend) {
$friend = $friend->materialize(); // or maybe $friend->decode()?
$friend['name'], $friend['avatar'];
}
}
First of all, thanks for get back to this. It would help a lot.
Regarding the naming, nested iterators
, makes sense to me.
The example you show in previous commit with syntax sugar looks promising. However, the ideal would be being able to materialize one item without materialize its soons. EG:
foreach ($users as $user) {
echo $user['name']; // $user['name']->materialize() should work too
foreach ($user['friends'] as $friend) { // 'friends' instanceof Traversable, not an array/object
$friend = $friend->materialize();
$friend['name'], $friend['avatar'];
}
}
So, I would differenciate between materialize()
and decode()
:
- materialize: In case of an array/object, if items/props aren't parsed, only scalar data types.
-
decode: Equals to
json_decode()
From here, it shouldn't be difficult decode omiting one or more props $user->decode(['omit' => ['friends']])
Thanks for the feedback.
I'm not sure about $user['name']->materialize()
as it's expected to return string and not an object.
As for the decode()
method, I like it less and less. Its first meaning - simply decoding a value - is superfluous as decoding is implicit and transparent from the moment user specifies a decoder
option. Its possible second meaning - similarity to the word materialize - may be misleading. Especially so, because I plan to stack decoding as another generator on top of Parser
, and then the decode
option might be null
as PassthruDecoder
will be a thing of the past. What would the user think the method decode()
does when they explicitly specified no decoding?
What I'm currently wondering about is the format of the key of advanceTo($key)
. Should it be in the expected decoded format according to a decoder provided or should it be always in pure JSON string format, so it does not change when a decoder is swapped?
Additionally, if we later stack decoding on top of Parser
, the decoding traversable should know methods advanceTo()
and materialize()
. Probably not. Not SRP enough. So we can put another generator on top of it to handle those syntax sugar methods. Unfortunately, the problems don't end there. Too late to explain myself for today. Good night ;)
So, after some thinking about it, I would lean towards the separate RecursiveItems
facade class (similar to Items
) to wrap this functionality in. It will better fit the single responsibility principle. It itself may also implement PHP's RecursiveIterator
and Items
won't become a mess with the additional recursive
option or with some other API extensions that might come in the future. Any takes on this?