add missing requirements for internal file pointers
I am proposing the following amendment to PSR-17.
I am aware of the amendments clause of the bylaws:
Following the rules of the workflow bylaw, once a PSR has been “Accepted” the PSR meaning cannot change, backwards compatibility must remain at 100%, and any confusion that arises from original wording can be clarified through errata.
The rules for errata are covered in the workflow bylaw, and only allow non-backwards compatible clarification to be added to the meta document. Sometimes, modifications will be necessary in PSR document itself, and this document outlines those cases.
The question here is what exactly is implied by "backwards compatibility".
In the usual sense of "backwards compatibility" with regards to a class, for example, the addition of a new feature (such as a new method or optional argument) maintains "backwards compatibility", in the sense that it doesn't change something that already exists.
In the same sense, the omission of this important detail in the specification doesn't change something that already exists, but rather adds something that was missing and thereby addresses a problem with the existing specification.
While this will render some existing implementations "buggy", these were already "buggy", in the sense that they can't actually be used with existing emitters or with predictable results, as per the circumstances described in proposed section 5.7 of PSR-17-http-factory-meta.md.
The issue was debated on the forum here:
https://groups.google.com/forum/#!topic/php-fig/S5YIw-Pu1yM
https://groups.google.com/forum/#!topic/php-fig/mv9-MWDVVBM
@Crell I've added an Errata section per your feedback on the forum.
I also added erratum references in the spec itself.
Let me know if there's anything else I should do here?
the pointer should be placed at the end of the stream. This is consistent with how resource handling works natively in PHP.
No, it isn't.
When you open a file, the file pointer is at the beginning of the stream.
This is true of PHP and any other language I can think of, and probably for the same reasons I cited in the additions to the meta: unpredictable performance, most likely use-case, and the fact that some types of streams can't be rewinded and/or aren't repeatable.
From the standpoint of PSR-17 it makes more sense to be able to add onto a given stream...
Onto a given stream, yes - not onto existing content being put into a new stream.
...than it does to unintentionally perform a partial overwrite by forgetting to manually seek to the end of the stream
Yes, that is an unlikely use-case.
It's also an extremely unlikely use-case to create a stream with existing content with the intent to write more content to the same stream.
There are basically two likely use-cases for createStream():
- Create a stream with existing content with the intent to read/emit that content.
- Create an empty stream with the intent to write to the stream.
A third use-case where you put some content into a stream and then continue writing to it is first of all unlikely - if given a stream with some existing content, you will almost certainly corrupt the stream by appending to it; there are almost no data formats that would permit it. If creating your own stream with existing content, it's more likely you will start with an empty stream and then write everything in increments - which you can just do, so there isn't any need for a third use either.
emitters can use the the
isSeekable()method to determine ifrewind()can be called.
And then? Throw an exception? ... And then what's the alternative?
Despite the assertion that "rewinding the stream could have unacceptable performance implications" there is no proof given for that reasoning
Suppose I'm reading from an S3 stream over the network - reading the entire stream twice definitely has unacceptable performance implications. (As it happens, this is a real use-case for us.)
This is also consistent with PSR-7
It is not.
PSR-7 does not specify anything about creation of streams.
I'm not sure what your point is with regards to __toString() - is is worded as "MUST attempt", implying what we know, that it may not be possible. PSR-7 also has getContents() which does not attempt to rewind - I'm not sure why you think consistency with __toString() is more important, they're just two different methods for two different use-cases.
(Note that the usefulness of __toString() is quite limited in the first place - it works only for smaller streams; an emitter can't safely rely on it, since it will have unpredictable memory usage and/or will simply crash for longer streams as you hit PHP's memory limit. An emitter will generally read and pass on the stream content in chunks, so __toString() is probably mostly useful for e.g. debugging use-cases.)
Forcing the stream pointer to be at the beginning is awkward for existing resources because the only way to append to the stream would be to manually seek to the end of the stream
But that is not what the proposed amendment says - for existing resources, it says literally the opposite.
Yes, for createStream() and createStreamFromFile(), it says:
The stream MUST be created with the current position of the file read/write pointer at the beginning of the stream.
This is consistent with opening a file with fopen() - this is useful for existing content, as well as for creating an empty stream to be populated with new content.
For createStreamFromResource(), it says:
The current read/write position of the given PHP resource MUST NOT be modified.
This is useful for existing resources.
It is only wrapping an existing PHP stream handle, and therefore doesn't make any assumptions about what state you may or may not want that stream to be in.
please update this PR to document that streams created with
StreamFactoryInterfaceSHOULD always place the pointer at the beginning of the stream
The current meta wouldn't make any sense, and I wouldn't know what to put there instead.
We cannot use MUST because that would constitute a breaking change in the specification.
It constitutes a breaking change only in implementations that get this wrong in the first place.
We currently have a 50/50 split among implementations on this point, which is problematic only because we can't count on streams created with existing content being positioned so they're ready to read, and because streams aren't consistently rewindable.
You can seek to the end of any stream, if that's what you want - but you can't necessarily seek back to the beginning, and if you have to try, this can have performance implications, or simply might not work at all.
The language opens streams with the cursor at zero for those reasons.
That said, the note regarding openStreamFromFile() probably needs to change, as this is really a wrapper around fopen(), which lets you specify with the $mode argument how the file pointer will be positioned - e.g. a will position the file pointer at the end of the file, while r or w will position the file pointer at the start.
I'm not sure how to word that exactly, but it probably shouldn't prescribe that this factory function do anything other than pass the arguments to fopen() and leave the stream as-is?
And createStreamFromResource() is basically a given, so I think maybe at this point we're only really debating createStream(), which doesn't let you specify a mode? I'd assume rw, you seem to assume a, which as explained would be problematic.
Wanting to start with createStream("some content") and then write("more content") simply isn't necessary - you can just createStream("") and then write() everything. This can't come at the expense of predictable performance, and emitters have to be able to assume streams are ready to read and emit, since they can't necessarily rewind or repeat all streams.
I've updated the spec and meta with regards to createStreamFromFile() behaving like fopen().
Regarding MUST vs SHOULD, please give this further consideration.
Adding an option to harmonize this behavior across implementations doesn't really solve the problem - it's almost as bad as being unspecified: since the behavior remains unpredictable, an emitter would still need to try to rewind the stream, which is the problem we're trying to address.
Again, whether this is a "breaking change" is a matter of interpretation. It was previously unspecified, so it's not a breaking change in the sense that it's reverting a previous decision - no decision was made, and this is an amendment to address that problem and achieve consistency.
Alternatively, I think we should consider creating a new PSR that does provide consistency.
The PSR isn't useful as it is, and adding optional consistency doesn't solve the problem.
@shadowhand as Editor/Maintainer for this PSR, can you indicate if this PR can be merged, closed, or needs more work?
Returning to this (after nearly 3 years), I agree with the proposed changes and with the argument made by @mindplay-dk as to why.
In its current form, I believe this PR is not acceptable as an amendment. It introduces backward incompatible requirements to the specification, by using the key word 'MUST'. In doing so, all current implementations that do not meet these requirements suddenly would be non-compliant.
From the PSR Workflow bylaw:
Errata clarifications may only be added to the meta document, not to the spec itself, as items in their own section. The spec itself may only be minimally edited to point readers toward a relevant errata if appropriate.
The PSR Amendments bylaw offers the option to add annotations to the original spec:
If Errata is added which is deemed important enough by whoever is initiating the errata vote, annotations may be placed in or near the offending line so that readers know to view the errata for more information, with a link containing an anchor to that specific piece of errata.
As such, I believe one of two things should happen:
- Amend the PR to be in line with the bylaws: Create an errata in the meta document and add annotations to the spec.
- Start a new PSR to supersede PSR-17.
I would propose to bring this as a discussion to the list.
Sorry, this was 3 years ago - I have no recollection of anything and I'm not currently using PHP.
Do what you want. 🙂
Sorry, this was 3 years ago - I have no recollection of anything and I'm not currently using PHP.
Thanks for letting us know!
Do what you want. 🙂
@shadowhand How do you want to proceed?
@vdelau given that this sat for years and no one pushed for this to be resolved, I don't think this needs to happen.
That said, I do agree that not defining this in the spec was an oversight. If there is a way forward that involves no vote, or a simple CC acceptance vote, I think we should complete this.
I think the least intrusive way is to add an clarifying errata in the meta document and reference it from the main spec. I expect that the errata would be considered substantial and thus require a CC approval vote.
In any case, any change made has to be 100% backward compatible. That means that we can not guarantee that conforming implementation will implement these changes and users can not rely on the interface doing The Right Thing. Any errata would amount to a recommendation or a 'SHOULD' level requirement.
@vdelau do you think this PR, with the suggestions I have made, are a reasonable change for the errata?