URI.js
Support Matrix parameters
Please add support for matrix parameters: http://www.w3.org/DesignIssues/MatrixURIs.html
What kind of API would you expect this to have?
I wonder if @annevk came across any matrix params in the wild in his URL tests.
This seems like it's just an idea from timbl, not something that actually happened.
Yeah, I thought so. I came across this (or something similar?) a couple of years ago, but never saw anyone actually use it. Thanks @annevk!
@rodneyrehm I first heard of matrix parameters in O'Reilly's RESTful Web Services book and have used them in production during the past 4-5 years. They are well supported by JAX-RS and most REST frameworks I've come across.
I am expecting to be able to get/set matrix parameters the same way as I set query parameters, but in addition to adding/removing parameters against the last path segment (as query parameters currently do) I want additional methods to be able to add/remove matrix parameters against non-terminal path segments.
Ok, then let's toy with this for a bit. As MatrixURIs is a bit vague at times, I'll quickly re-iterate what I understood.
The character used to delimit keys and values is `=`, and the character used to delimit key-value pairs is `;`. Both fall into the category of reserved characters, more precisely into the subset `sub-delims`. Because `sub-delims` is a possible component of every segment group (`segment`, `segment-nz`, `segment-nz-nc`) as per the Collected URI ABNF, the notation fits general URI rules.
For the most part, the matrix notation fits encoding/decoding of x-www-form-urlencoded, with the exceptions:
- spaces being encoded as `%20` instead of `+`
- `;` instead of `&` being the key-value-pair delimiter
- keys are unique (for compatibility, last declaration wins)
- `;key`, unlike `&key`, does not convey a null value (`;key=` and `&key=` behave the same, though)
- `;key` has to be removed from the segment during normalization
- order of keys does not matter, so by convention we'll sort them alphabetically
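The rules above can be sketched for a single path segment as follows. This is only an illustration under the assumptions listed in this comment; `parseMatrixSegment` is a hypothetical helper name, not an existing URI.js method.

```javascript
// Hypothetical helper, not part of URI.js: parse one raw path segment such as
// "two;some=m%C3%B6re;data=;bla" into its name and its matrix parameters.
function parseMatrixSegment(rawSegment) {
  const [rawName, ...rawPairs] = rawSegment.split(';');
  const found = {};
  for (const pair of rawPairs) {
    const eq = pair.indexOf('=');
    if (eq === -1) continue; // trailing ";" or bare ";key": dropped
    const key = decodeURIComponent(pair.slice(0, eq));
    // decodeURIComponent leaves "+" alone, so spaces must arrive as %20.
    found[key] = decodeURIComponent(pair.slice(eq + 1)); // last declaration wins
  }
  // Order carries no meaning, so expose the keys alphabetically by convention.
  const matrix = {};
  for (const key of Object.keys(found).sort()) matrix[key] = found[key];
  return { segment: decodeURIComponent(rawName), matrix };
}

console.log(parseMatrixSegment('two;some=m%C3%B6re;data=;bla'));
```

Splitting on `;` and `=` before decoding means a percent-encoded `%3B` or `%3D` inside a key or value is not mistaken for a delimiter.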
Supporting the MatrixURI notation has implications on the following existing methods:
- `.normalizePath()` to add `normalizePathMatrix()` to sort and unique-ify matrix params and remove trailing `;`
- `.relativeTo()` and `.absoluteTo()` because `URI(';foo=bar').absoluteTo('//foo.com/one;key=val')` should yield `'//foo.com/one;key=val;foo=bar'`
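As a rough sketch of what the normalization step could do for one segment, assuming it operates on the still-encoded string: `normalizeMatrixSegment` is a hypothetical name used for illustration, and nothing like it exists in URI.js yet.

```javascript
// Hypothetical sketch of a per-segment normalizePathMatrix() step.
// It works on the raw (encoded) segment and never decodes anything.
function normalizeMatrixSegment(rawSegment) {
  const [name, ...pairs] = rawSegment.split(';');
  const params = new Map();
  for (const pair of pairs) {
    const eq = pair.indexOf('=');
    if (eq === -1) continue; // drops the empty piece after a trailing ";" and bare ";key"
    params.set(pair.slice(0, eq), pair.slice(eq + 1)); // unique-ify: last declaration wins
  }
  const sortedKeys = [...params.keys()].sort();
  return [name, ...sortedKeys.map((key) => `${key}=${params.get(key)}`)].join(';');
}

console.log(normalizeMatrixSegment('one;b=2;a=1;b=3;')); // one;a=1;b=3
```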
Supporting the MatrixURI notation calls for additional methods:
- `.matrix()` analogous to `.segmentCoded()` for access to individual components and the entire path
URI('//foo.com/one;key=val/two;some=m%C3%B6re;data=;bla/thr%3Fe').matrix(true);
// would return the following construct:
[
{
segment: 'one',
matrix: {
key: 'val',
},
},
{
segment: 'two',
matrix: {
data: '',
// note: decoded value
some: 'möre',
},
},
{
// note: decoded value
segment: 'thr?e',
matrix: {},
},
];
@rodneyrehm
Great analysis. This brings up some questions:
- Since keys are unique, I guess it's up to the user to decide how to pack multiple values per key-value pair?
- You said that `;key=` does not convey a null value. What does it convey then? :)
Also, you might want to post a new answer to http://stackoverflow.com/q/401981/14731 with some of these points.
I might get back to the SO question later, for now let's first figure out what exactly is going on and is expected to happen.
Interesting fact: MatrixURIs do seem to be used in various places:
- Ruby On Rails uses them to denote page state like `some/resource;edit`
- Piwik "supports" them
- AngularJS seems to support them
- RFC 3986 Section 3.3 explains the concept without calling it MatrixURI in the last paragraph
Some more resources to (re)visit later:
Since keys are unique, I guess it's up to the user to decide how to pack multiple values per key-value pair?
After looking around for a bit, it seems that the comma (`,`, also in `sub-delims`) was intended for just that. Something similar has not been defined for x-www-form-urlencoded because it does not mandate keys being unique, so multiple values could be provided simply by repeating the key over and over again.
You said that `;key=` does not convey a null value. What does it convey then? :)
It conveys the empty string, which is to be interpreted as "key exists without value".
Adding the comma to my example above, we end up with:
URI('//foo.com/one;key=val/two;some=m%C3%B6re;data=;list=1,2,3;bla/thr%3Fe').matrix(true);
// would return the following construct:
[
{
segment: 'one',
matrix: {
key: 'val',
},
},
{
segment: 'two',
matrix: {
data: '',
// note: "," delimits multiple values within a key-value component
list: ['1', '2', '3'],
// note: decoded value
some: 'möre',
// note: key 'bla' is removed because it was null
},
},
{
// note: decoded value
segment: 'thr?e',
matrix: {},
},
];
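A function producing that construct for the path part could look roughly like this. `pathMatrix` is a hypothetical stand-in for the proposed `.matrix(true)`; it takes only the path, not the full URI, and it assumes that a value containing a comma always becomes an array.

```javascript
// Hypothetical stand-in for the proposed .matrix(true); not a URI.js API.
function pathMatrix(path) {
  return path
    .split('/')
    .filter((raw) => raw !== '') // ignore the leading slash and empty segments
    .map((raw) => {
      const [name, ...pairs] = raw.split(';');
      const found = {};
      for (const pair of pairs) {
        const eq = pair.indexOf('=');
        if (eq === -1) continue; // bare ";bla" conveys null, so the key is dropped
        const key = decodeURIComponent(pair.slice(0, eq));
        // Split on "," before decoding so an encoded %2C stays inside a value.
        const values = pair
          .slice(eq + 1)
          .split(',')
          .map((value) => decodeURIComponent(value));
        found[key] = values.length > 1 ? values : values[0];
      }
      const matrix = {};
      for (const key of Object.keys(found).sort()) matrix[key] = found[key];
      return { segment: decodeURIComponent(name), matrix };
    });
}

console.log(
  JSON.stringify(pathMatrix('/one;key=val/two;some=m%C3%B6re;data=;list=1,2,3;bla/thr%3Fe'), null, 2)
);
```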
You said that `;key=` does not convey a null value. What does it convey then? :)
It conveys the empty string, which is to be interpreted as "key exists without value".
If that's the case, then I question your earlier statement:
;key has to be removed from the segment during normalization
It seems to me that this contradicts:
there must be a syntax for removing an attribute, hopefully distinguishing a removed attribute from one whose value if the empty string
found at http://www.w3.org/DesignIssues/MatrixURIs.html. Where did you read that keys with empty values should be removed during normalization?
See the section on resolving relative URLs at the end of that document.
@rodneyrehm Okay, I see. I see now that a relative path of ";roads" causes a key to be removed whereas ";roads=" does not. I thought the two forms were equivalent, but they are not.
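The distinction can be sketched as a small resolver that applies a matrix-only relative reference to the last segment of a base path. `resolveMatrixReference` is a hypothetical helper, and the `/map;roads=main;scale=1` path in the usage lines is made up for illustration. The sketch keeps insertion order instead of sorting, so that it reproduces the `absoluteTo()` result proposed earlier in the thread.

```javascript
// Hypothetical sketch: apply a reference such as ";foo=bar" or ";roads"
// to the matrix parameters of the last segment of basePath.
function resolveMatrixReference(basePath, reference) {
  const slash = basePath.lastIndexOf('/');
  const [name, ...basePairs] = basePath.slice(slash + 1).split(';');
  const params = new Map();
  const apply = (pair, allowRemoval) => {
    if (pair === '') return;
    const eq = pair.indexOf('=');
    if (eq === -1) {
      if (allowRemoval) params.delete(pair); // ";roads" removes the key
      return;
    }
    params.set(pair.slice(0, eq), pair.slice(eq + 1)); // ";roads=" keeps the key, empty
  };
  basePairs.forEach((pair) => apply(pair, false));
  reference.split(';').forEach((pair) => apply(pair, true));
  const tail = [...params].map(([key, value]) => `;${key}=${value}`).join('');
  return basePath.slice(0, slash + 1) + name + tail;
}

console.log(resolveMatrixReference('//foo.com/one;key=val', ';foo=bar'));
console.log(resolveMatrixReference('/map;roads=main;scale=1', ';roads'));
console.log(resolveMatrixReference('/map;roads=main;scale=1', ';roads='));
```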
What is the status on this feature guys? Quite eager to use Matrix Parameters in my URIs and your library would be a perfect fit if it had the support :)
The status is: we have figured out what "Matrix Parameters" are supposed to be. There is no clear specification, only a thought-document.
As far as I know, nobody has started implementing anything yet.
Could the matrix function take a few optional configuration options with the contrived edge-cases so that the user decides if they want for example to use last parameter wins or alternatively treat it as a list in the same way the query parameters would do? Similarly so there could be options to configure other edge cases like that, there can't be that many?
The problem with that is that two users would interpret the same string as different resources, something that we already see happening with query strings. You want to avoid these situations like the plague. (That said, I'd go with the proposed comma and be done with it.)
You're welcome to try implementing this. I'll not get around to it for quite some time.
What is the website for URL Matrix? Someone recommended that I use it to check my site http://vuanhhospital.com.vn/detal/kham-suc-khoe-tong-quat-svvmmqpnxv , but I cannot find it. :(
Angular 2 uses Matrix URL notation: https://angular.io/docs/ts/latest/guide/router.html
Angular is very popular, and Angular 2 is sure to be another game changer. It would be really great if URI.js could support Matrix notation.
As the author of one of the documents referenced by @rodneyrehm I figured I'd weigh in. (Thanks @Laurian for alerting me.)
It is true, per @annevk, that path parameters (aka matrix parameters) look like something thought up by timbl but never really put into practice. We probably have the early-90s work on CGI and `cgi-lib.pl` (and its descendants) to thank for that. Consider:
- Before there was a key-value `QUERY_STRING` environment variable, the query parameters were the `argv` of the script. This behaviour still exists (e.g. `?one+two+three` generates `["one", "two", "three"]`).
- Query strings have been amenable to manipulation by HTML forms (with `method="GET"`) since forever.
- There is a clean mapping between `argv` and `QUERY_STRING`; not so for path parameters, because there's one list of them for every path segment. (Moreover the `/` delimiter supersedes sub-delimiters, so you can't have a literal `/` in a path parameter.)
- So what does it mean for, e.g., the second-to-last path segment to have parameters?
- Moreover, what does it mean to have a system of parameters in general which is completely orthogonal to the system of query parameters?
(Aside: For the longest time, there was no effective distinction, from the point of view of the CGI-and-its-descendants API, between parameters which came in from the URI, and parameters which came in from the request body. I am happy to see that many frameworks have finally teased these back apart.)
URI query parameters are historically meant to manipulate a resource's response. We imagine the resource to be a function (hopefully with no side effects under `GET`) and the parameters tell the function how to produce representations. This way it's possible (in theory, less in practice) to conceive of a 1:1 mapping of the domain (the set of query strings) to the range (the set of, e.g., byte segments) of the resource.
So that position is filled. What role should path parameters play, then? There's a set for each path segment, they aren't manipulable by ordinary HTML, and there is already a perfectly good infrastructure around query strings.
In my own work, I've used path parameters to represent successive operations over resource representations. So I already have a resource which generates some representation by whatever indiscriminate means (could be a file, script, framework, etc.), and the path parameters represent functions which take these representations as input, along with a list of arguments (unlike query parameters which represent lists of arguments by repeating the parameter key). In other words, I used the fact that these parameters are relatively untouched by standard or convention and the fact that there's already a perfectly good parametrization mechanism in query strings to come up with a completely orthogonal semantics and processing model.
Here is an example of a picture being turned into a black and white avatar:
/a-picture;crop=100,100,400,400;scale=32;desaturate
We can imagine the picture's pixel data passing through the `crop` function with the given parameters, say `x1` `y1` `x2` `y2` (defined elsewhere), then `scale` with `w` (an optional `h` being omitted because the scaling is square), and finally `desaturate`, which takes no additional arguments. Not only does the URI path segment indicate that the resource is a derivation, but it also tells the story of what the derivation is. Furthermore, you would be able to clip off the individual parameters at `;` and see the penultimate state of the resource's representation, in a manner analogous to clipping off path segments at the `/` to see the "directory" under the given resource.
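Read this way, a segment becomes an ordered list of operations rather than a keyed object. A minimal sketch of that reading follows; `parseOperations` and `intermediateStates` are hypothetical helper names, both take a single raw segment without the leading `/`, and arguments are left as strings because their meaning is defined elsewhere.

```javascript
// Hypothetical reading of a segment as an ordered pipeline of operations.
// Unlike the keyed-object view, order and repeated names are preserved.
function parseOperations(rawSegment) {
  const [resource, ...steps] = rawSegment.split(';');
  const operations = steps
    .filter((step) => step !== '')
    .map((step) => {
      const eq = step.indexOf('=');
      const name = eq === -1 ? step : step.slice(0, eq);
      const hasArgs = eq !== -1 && eq < step.length - 1;
      const args = hasArgs
        ? step.slice(eq + 1).split(',').map((arg) => decodeURIComponent(arg))
        : [];
      return { name: decodeURIComponent(name), args };
    });
  return { resource: decodeURIComponent(resource), operations };
}

// Each intermediate state is the raw segment clipped at one of its ";".
function intermediateStates(rawSegment) {
  const parts = rawSegment.split(';').filter((part, i) => i === 0 || part !== '');
  return parts.map((_, i) => parts.slice(0, i + 1).join(';'));
}

const segment = 'a-picture;crop=100,100,400,400;scale=32;desaturate';
console.log(parseOperations(segment));
console.log(intermediateStates(segment));
```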
Anyway, that was the best I could come up with for path parameters.
Similarly to @doriantaylor, I'm looking into matrix params as a sequence of operations in two specific cases:
- composition of Media Fragment URIs into a playlist: `/base/video1.mp4;t=0,10/video2.mp4;t=23,56`, as the query string format will limit me to one fragment per URL http://www.w3.org/TR/media-frags/#standardisation-URI-queries
- parameters for a Capability URL segment which would enable some operations on the resource past that segment: `/edit;key=ab;ttl=100;hmac=cafebabe/folder/file.extension`, similar to the Tahoe-LAFS example from http://w3ctag.github.io/capability-urls/#tahoe-lafs
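For the playlist case, a reader of such a path might do something like the following. This is a sketch only: `parsePlaylist` is a hypothetical name, and it assumes every clip carries a `t=start,end` parameter with both values given in seconds.

```javascript
// Hypothetical sketch: read the playlist path as a list of clips,
// assuming "t=start,end" in seconds on every segment that is a clip.
function parsePlaylist(path) {
  const clips = [];
  for (const raw of path.split('/')) {
    const [file, ...pairs] = raw.split(';');
    const t = pairs.find((pair) => pair.startsWith('t='));
    if (t === undefined) continue; // e.g. "base" carries no time range
    const [start, end] = t.slice(2).split(',').map(Number);
    clips.push({ file: decodeURIComponent(file), start, end });
  }
  return clips;
}

console.log(parsePlaylist('/base/video1.mp4;t=0,10/video2.mp4;t=23,56'));
```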
As far as I understand Parsing matrix URLs, @doriantaylor's image manipulation example is in conflict with at least two of three implications cited:
/a-picture;crop=100,100,400,400;scale=32;desaturate
- attributes can only occur once - not sure
- there must be a syntax for removing an attribute - violated by `;desaturate`
- attributes are unordered - I'm pretty sure `;crop=…;scale=…` generates a different image than `;scale=…;crop=…`
Correct me if I'm wrong, but this seems to be using the syntax without the semantics. Which is fine, I'm not judging. It does, however, point out how different interpretations make this a tough topic.
Regardless of Matrix URLs I have problems reconciling @Laurian's playlist example (`/base/video1.mp4;t=0,10/video2.mp4;t=23,56`) with the hierarchical nature of the path. Technically this is not an illegal use of the path segment, but it does feel wrong. (I'd probably have preferred `:` as the file delimiter, mostly because it's the delimiter in Linux, e.g. `PATH=some-path:$PATH`)
@rodneyrehm indeed if I saw that document (I can't remember), it would have been over eight years ago when I wrote mine, and I would have disagreed with it just as much then. ;)
When I reread timbl's Matrix URIs document, I'm basically seeing a description of query string parameters as they already are, with the exception of constraints 1 and 2.
Constraint 1, that an attribute ought only occur once, actually contradicts TBL himself when he wrote that URIs should only ever need to be compared lexically (source forthcoming; see also Fielding). Constraint 3, as well as query parameters, which can vary as n! of the number of keys (to say nothing of semantically-equivalent lexical perturbations in the values), also already violate this other constraint.
Constraint 2, that there must be syntax for removing an attribute, is presumably to be analogous to `../` in relative path URIs. My comment there is: good luck retrofitting that into `$EVERYTHING`. You can already do relative URIs with query strings; you just have to supply the entire query string, à la `href="?foo=bar"`. In other words, it isn't clear to me what is gained by the ability to prune a path parameter through a relative URI, especially considering what it would cost to realize it.
When I did my original path-parameter work, my motivation stemmed from the fact that I had a bunch of resources that were related in ways that were both purely deterministic, and parametrized, so I was initially looking for a reasonable, less ad-hoc way to generate an addressing scheme. I later came up with the idea that they could represent functions over representations (including things like @Laurian's time slicing). The other cool thing would be that you could develop these filter functions separately from the application.
A side note on parameter sequence: If you really wanted to, you could treat the sequence of query parameters as significant—it's just an (almost universally-held) implementation convention that they aren't. Indeed web browsers append query parameters to form target URIs in the order they appear in the document. (Of course the HTML specs have admonished us for aeons not to bank on that.)
So, I may have read that document 8-9 years ago, I may not have. (I did read several TBL dispatches around then). It's definitely interesting, but in the 20 years since it was slapdashed out, a lot of code has been written.
If the world survived for two decades without matrix parameters as described in that document, then there probably wasn't a serious need for them—as described. The syntax, however, has been sitting patiently all this time waiting for a useful semantics. But then, I suppose to paraphrase Andreessen, "rough consensus and running code wins".
Anyway, for my immediate purposes I will play with `URI.parseQuery` on `uri.segment()` and I will make a transcoding operation from matrix params to query string format. I will report back.
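Such a transcoding step might look roughly like this. `matrixToQuery` is a hypothetical helper; the sketch feeds its output to the standard `URLSearchParams` so that it runs without URI.js, but the resulting string is the kind of input a query parser such as `URI.parseQuery` expects. Note that most query parsers do not treat a bare `bla` as "removed", so that distinction would need separate handling.

```javascript
// Hypothetical helper: turn the matrix part of one raw segment into a
// query-string-shaped value that a regular query parser can read.
function matrixToQuery(rawSegment) {
  const semi = rawSegment.indexOf(';');
  if (semi === -1) return '';
  return rawSegment
    .slice(semi + 1)
    .split(';')
    .filter((pair) => pair !== '')
    // "+" and "&" are literal in matrix notation but special in query strings.
    .map((pair) => pair.replace(/\+/g, '%2B').replace(/&/g, '%26'))
    .join('&');
}

const query = matrixToQuery('two;some=m%C3%B6re;data=;bla');
console.log(query);
console.log([...new URLSearchParams(query)]);
```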
Hi guys, I recently had to use matrix URIs at my internship. I found this particular thread extremely informative. Seeing that there wasn't much support out there for Matrix URIs in Node, I've developed an express-compliant npm module (matrix-parser) for this purpose.
It's fully functional (although not optimized, but I'm working on it) and I've tried my best to implement the rules discussed by @rodneyrehm and @cowwoc. Since I'm not so experienced with JS, I'd appreciate it if you could have a look at the module and give me feedback.
I've written lots of tests to cover edge cases; you can read their descriptions to understand what rules I'm following.
If you think that this is something URI.js could use, please let me know. I'd like to make a contribution to this repo as well :)
One of my colleagues is trying to pass an ISO timestamp value to a server which I developed.
For paths like the following:
.../someThings;min=yyyy-MM-ddTHH:mm:ssZ/someOtherThings
the colons of the time part, `HH:mm:ss`, are being escaped.
Can anybody please help me to help him?
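One possible explanation, assuming the client builds the value with something like `encodeURIComponent`: that function escapes `:` even though RFC 3986 allows a literal `:` inside a path segment (the exception being the first segment of a relative reference that has no leading slash). A sketch of a workaround follows; `encodeMatrixValue` is a hypothetical helper and the timestamp is an illustrative value. Either form should decode to the same string on a server that percent-decodes path parameters.

```javascript
// Hypothetical helper: percent-encode a matrix value but keep ":" literal.
// encodeURIComponent always emits uppercase hex, so matching %3A is enough.
function encodeMatrixValue(value) {
  return encodeURIComponent(value).replace(/%3A/g, ':');
}

const timestamp = '2016-03-01T12:30:00Z'; // illustrative value only
console.log(encodeURIComponent(timestamp)); // colons escaped as %3A
console.log(encodeMatrixValue(timestamp));  // colons kept literal
console.log(decodeURIComponent(encodeURIComponent(timestamp)) === timestamp);
```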