raml-spec
Resolution of resource type parameters using libraries
The following RAML defines a type with the same name in the root RAML file and in the dependent libraries, and that name is passed as a resource type parameter. What would the final types of headers hA, hA2, hB, hB2 be?
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: Monkey

#%RAML 1.0 Library
# libA.raml
uses:
  b: libB.raml
types:
  Monkey: number
resourceTypes:
  rtA:
    type:
      b.rtB:
        kindB: <<kind>>
    post:
      headers:
        hA: <<kind>>
        hA2: Monkey

#%RAML 1.0 Library
# libB.raml
types:
  Monkey: boolean
resourceTypes:
  rtB:
    post:
      headers:
        hB: <<kindB>>
        hB2: Monkey
It is definitely not clarified in the spec at this moment. As far as I know, the current JS parser implementation tracks the parameter origin and first attempts to resolve the type from that origin; if that fails, it attempts to resolve a locally defined type with the given name.
So in your example:
hA - type from api.raml, hA2 - type from libA.raml, hB - type from libA.raml, hB2 - type from libB.raml
Regards, Pavel
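A minimal Python sketch of the lookup order described in this comment, not the JS parser's actual code. It assumes each RAML unit can be modelled as a dictionary of locally declared type names, and that the "origin" of a parameter is the file in which its value was written. `TYPES` and `resolve` are hypothetical names introduced only for illustration.

```python
# Hypothetical model: every RAML unit maps its locally declared type names
# to their definitions.
TYPES = {
    "api.raml": {"Monkey": "string"},
    "libA.raml": {"Monkey": "number"},
    "libB.raml": {"Monkey": "boolean"},
}

def resolve(name, origin, local):
    """Try the unit the parameter value came from, then the local unit."""
    for unit in (origin, local):
        if name in TYPES[unit]:
            return unit, TYPES[unit][name]
    raise KeyError(name)

# hA: the value "Monkey" was written in api.raml; rtA lives in libA.raml.
hA = resolve("Monkey", origin="api.raml", local="libA.raml")
# hB: kindB was assigned inside libA.raml; rtB lives in libB.raml.
hB = resolve("Monkey", origin="libA.raml", local="libB.raml")
print(hA, hB)
```

Under these assumptions the sketch picks the api.raml type for hA and the libA.raml type for hB, which matches the list above.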
It looks a bit counter-intuitive to treat a parameter that resolves to value X differently from the literal value X. What will happen when you mix things?
e.g.: type: suffix-<<param>>
I'm not able to figure out an elegant way to resolve this, but whatever decision is taken needs to be clarified in the spec.
Don't forget to create a PR with that clarification ;)
After discussing with several RAML users, we arrived at the following proposal:
References are resolved first in the context where they were defined and, if not present there, they are looked up in the parent context. Contexts are defined for root RAML files and typed fragments (e.g. libraries, data types, resource types, etc.).
In the example provided before the resulting type of the headers would be:
hA: number (from libA)
hA2: number (from libA)
hB: boolean (from libB)
hB2: boolean (from libB)
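The proposal above can be sketched in Python as a plain lexical lookup with a parent fallback. This is an illustration under assumptions: the "parent context" is taken to be the file that imports the library, and `Parrot` is a made-up type that exists only in api.raml to show the fallback. `TYPES`, `PARENT` and `resolve_lexical` are hypothetical names.

```python
TYPES = {
    "api.raml": {"Monkey": "string", "Parrot": "string"},  # Parrot is hypothetical
    "libA.raml": {"Monkey": "number"},
    "libB.raml": {"Monkey": "boolean"},
}
# Assumed parent chain: libB is used by libA, which is used by api.raml.
PARENT = {"libB.raml": "libA.raml", "libA.raml": "api.raml", "api.raml": None}

def resolve_lexical(name, unit):
    """Look the name up where the reference is written, then walk up the parents."""
    while unit is not None:
        if name in TYPES[unit]:
            return unit, TYPES[unit][name]
        unit = PARENT[unit]
    raise KeyError(name)

hA = resolve_lexical("Monkey", "libA.raml")   # <<kind>> expands inside rtA
hB = resolve_lexical("Monkey", "libB.raml")   # <<kindB>> expands inside rtB
fallback = resolve_lexical("Parrot", "libB.raml")
print(hA, hB, fallback)
```

With this rule the value's origin plays no role: only the location of the expanded reference matters, which is why hA and hB end up pointing at the library types.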
The issue with the latest proposed approach is that the user would have to closely investigate not only the code of the resource type / trait in order to use it, but also the whole code of the library.
By passing Monkey as a parameter in api.raml, the user definitely wanted to use the string one, because no other Monkey is available in that context under the name Monkey.
The user may have no idea that there is another Monkey defined in the library; there can be a thousand lines between the rtA resource type and the Monkey definition in the libA.raml code.
As a result, the user can be greatly confused and spend a lot of time finding the other Monkey, which causes the API to behave differently from the user's expectations.
I vote for the following variant:
hA: Monkey (type: string from api.raml)
hA2: Monkey (type: number from libA.raml)
hB: Monkey (type: string from api.raml)
hB2: Monkey (type: boolean from libB.raml)
@usarid , @sichvoge what are your thoughts?
@ddenisenko And what about Santiago's other scenario, type: suffix-<<param>>?
Is this supposed to be resolved in the param context or in the library?
I edited my original proposal to make it clear that we do not replace Monkey type with string/number/boolean. string/number/boolean were listed just to clarify, which Monkey we take in each case.
So, the proposal looks like this:
hA: Monkey (type: string from api.raml)
hA2: Monkey (type: number from libA.raml)
hB: Monkey (type: string from api.raml)
hB2: Monkey (type: boolean from libB.raml)
Regarding the prefixes/suffixes.
We perform the textual substitution first, resolving second.
So in case of the following RAML:
api.raml:
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: onkey

#%RAML 1.0 Library
# libA.raml
types:
  Monkey: number
resourceTypes:
  rtA:
    post:
      headers:
        hA: M<<kind>>
        hA2: Monkey
We will still have:
hA: Monkey (type: string from api.raml)
hA2: Monkey (type: number from libA.raml)
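The "textual substitution first" step can be sketched in Python as below. This covers only the string phase; which Monkey the result points to is decided afterwards. `PLACEHOLDER` and `substitute` are hypothetical names, and the regular expression is an assumption that handles plain `<<name>>` placeholders without functions.

```python
import re

# Matches <<name>> with optional inner whitespace (no functions handled here).
PLACEHOLDER = re.compile(r"<<\s*(\w+)\s*>>")

def substitute(template, params):
    """Splice parameter values into the template as plain text."""
    return PLACEHOLDER.sub(lambda m: params[m.group(1)], template)

params = {"kind": "onkey"}
hA = substitute("M<<kind>>", params)
hA2 = substitute("Monkey", params)
print(hA, hA2)  # Monkey Monkey
```

Both headers end up with the same text; under this proposal they differ only in the context from which the resolver starts looking.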
This I don't get, because we can get more creative and make the hA2 key a parameter:
<<keyName>>: Monkey
Is this still resolved to libA.raml? Also, what happens if it has more than one parameter from different sources? There can be a lot of use cases here and people can get really creative.
Yes, the hA2 key can be a parameter. This is not a problem for either approach, and both should support it.
Basically, both approaches perform textual substitution, and both must understand the semantics (like recognizing type references), because both must finally print the JSON, where each type and other reference must be prefixed by the proper namespace.
For example, for the approach proposed by Santiago, I would expect the final expanded RAML to look effectively like the following:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: a.Monkey
      hA2: a.Monkey
      hB: b.Monkey
      hB2: b.Monkey
I omitted type for simplicity. The import (uses instruction) of libB.raml must be added automatically.
So, again, to provide the final result, both approaches must understand that a type reference is a type reference, etc.
There are more interesting cases, where the approach I proposed (and Pavel actually meant it too) is certainly more complex to implement.
For example:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey:
    properties:
      propA: string
/users:
  type:
    a.rtA:
      param1: Monkey

#%RAML 1.0 Library
# libA.raml
uses:
  b: libB.raml
types:
  Monkey:
    properties:
      propB: number
resourceTypes:
  rtA:
    type:
      b.rtB:
        param1: <<param1>>
        param2: Monkey

#%RAML 1.0 Library
# libB.raml
types:
  Monkey:
    properties:
      propC: boolean
resourceTypes:
  rtB:
    post:
      headers:
        hB: <<param1>> | <<param2>> | Monkey
In this case, with my approach, the expanded RAML should look like:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hB: Monkey | a.Monkey | b.Monkey
Following Santiago's approach, the result should look like:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hB: b.Monkey | b.Monkey | b.Monkey
So both approaches are handling "static" parts in the same way. Both approaches have to recognize references and perform the proper automatic imports (if needed) and assign the proper namespaces in the expanded RAML.
The key difference between the approaches is that the one I propose makes the parser track parameters AND the sources of parameter values, and resolves names (only for parameters) starting from the scope the parameter value was defined in and ending in the scope the resource type/trait is defined in.
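A rough Python sketch of that source tracking, applied to the hB header from the example above. It assumes every part of the union already carries the chain of scopes to search, from the file where the value was written down to the file where rtB is defined. `PREFIX`, `DECLARES` and `qualify` are hypothetical names, and the namespace prefixes mirror the uses aliases seen from the root file.

```python
# Namespace each unit gets when referenced from the root file (assumed aliases).
PREFIX = {"api.raml": "", "libA.raml": "a.", "libB.raml": "b."}
# All three units declare a type called Monkey in this example.
DECLARES = {unit: {"Monkey"} for unit in PREFIX}

def qualify(name, chain):
    """Return the first declaration found along the chain, namespaced for the root."""
    for unit in chain:
        if name in DECLARES[unit]:
            return PREFIX[unit] + name
    raise KeyError(name)

parts = [
    ("Monkey", ["api.raml", "libA.raml", "libB.raml"]),  # <<param1>>: written in api.raml
    ("Monkey", ["libA.raml", "libB.raml"]),              # <<param2>>: written in libA.raml
    ("Monkey", ["libB.raml"]),                           # literal inside rtB
]
hB = " | ".join(qualify(name, chain) for name, chain in parts)
print(hB)  # Monkey | a.Monkey | b.Monkey
```

Without the per-part chains, all three parts would start from libB.raml and the result would collapse to b.Monkey three times, which is the other approach's output.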
And yes, I agree that users can be quite creative, for example:
<<keyParam>>: Prefix<<param1>>And<<param2>>Suffix
This one does not contradict either approach, but it is weird from any point of view.
Finally I had a chance to read through the different comments and the three suggested options from @svacas, @petrochenko-pavel-a, and @ddenisenko. In my opinion, the second option, from Pavel, looks like the best approach from a user's perspective. The reason is that using a library does not actually mean that I know everything about it. The only things I definitely know are which resource types are available and, if one defines a parameter, that I have to provide it. All of that is in my known context. For example:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: Monkey
The only visible information I have is that the provided Monkey kind is the one that I have defined in my context. I don't know that there is another one inside the library, and I might not really care about that. My expectation is that the processor picks the one that I have defined and not the one that is hidden inside the library. Obviously there are many really strange corner cases where people can get very creative, but my decision would always be that the processor first looks inside the local context where I have used the resource type and, if there is nothing there, goes to the context where the resource type has been defined, narrowing down the information that needs to be placed instead of the parameter. With Santiago's approach, and Denis is correct here, the consumer would need to learn everything about the library and its content before he/she can start using it. That is not always the expectation. I just want to use it and define things that are visible to me. Again, there are many very weird corner cases, like the one with the prefix and suffix, which is indeed difficult to handle. Let's assume you have:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: Monkey

#%RAML 1.0 Library
# libA.raml
types:
  post-Monkey: number
resourceTypes:
  rtA:
    post:
      headers:
        hA: post-<<kind>>
        hA2: Monkey
What would the consumer expect? Obviously he/she would expect the type to resolve to Monkey, since that is the context he/she works in, but it actually resolves to post-Monkey. That can get really confusing for the end consumer of the resource type rtA, since he/she does not know there is a prefix. So how can you help the consumer understand that the parameter is prefixed with post- and might not get resolved to what he/she would expect? But these are corner cases, and more definition problems that might need to be clarified / fixed and documented during the design phase.
All in all, my vote goes to @petrochenko-pavel-a option as it seems the correct one from a consumer expectation perspective:
hA: string (from api.raml)
hA2: number (from libA.raml) (the user does not know anything about `hA2` so why should it get resolved to the one in api.raml)
hB: number (from libA.raml) (again the processor should resolve first where the resource type has been **used** and here it's being used inside the libA.raml context)
hB2: boolean (from libB.raml)
I'll try to summarize your approach to check if we are on the same page:
- References from explicit parameters are resolved in the context where the value is assigned and, if not present there, in the context of the resource type/trait.
- Literal references are resolved in the context where they are defined and cannot be overridden by the context where the resource type/trait is used.
Points to discuss:
- I guess implicit parameters (resourcePathName, methodName) are first resolved in the context where the resource type/trait is used, like the explicit ones.
- We need to define what to do with mixed references composed of parameters from different contexts and literal values (an extreme case, but it needs to be addressed).
- Denis' and Christian's approaches differ regarding the propagation of the type through the resource type inheritance chain. In Denis' case, if the value assigned to a parameter X comes from a previous parameter Y, then Y's context is used, which may be the expected outcome.
Okay, so now we have 3 different approaches.
I combined the samples above, and will post the effective expanded RAML for each approach, for the ease of comparison.
I hope the authors will correct me if my understanding of how their results should look turns out to be wrong.
Original RAML:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: Monkey

#%RAML 1.0 Library
# libA.raml
uses:
  b: libB.raml
types:
  Monkey: number
resourceTypes:
  rtA:
    type:
      b.rtB:
        kindB: <<kind>>
        kindB2: Monkey
    post:
      headers:
        hA: <<kind>>
        hA2: Monkey

#%RAML 1.0 Library
# libB.raml
types:
  Monkey: boolean
resourceTypes:
  rtB:
    post:
      headers:
        hB: <<kindB>>
        hB2: Monkey
        hB3: <<kindB>> | <<kindB2>> | Monkey
Expanded RAML:
@svacas approach:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: a.Monkey
      hA2: a.Monkey
      hB: b.Monkey
      hB2: b.Monkey
      hB3: b.Monkey | b.Monkey | b.Monkey
@sichvoge approach:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: Monkey
      hA2: a.Monkey
      hB: a.Monkey
      hB2: b.Monkey
      hB3: a.Monkey | a.Monkey | b.Monkey
@ddenisenko approach:
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
  b: libB.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: Monkey
      hA2: a.Monkey
      hB: Monkey
      hB2: b.Monkey
      hB3: Monkey | a.Monkey | b.Monkey
references from explicit parameters are resolved in the context where the value is assigned, and if not present there in the context of the resource type/trait.
Yes. If the value is absent in the value assignment context, we can fall back either to the next context in the chain between the value assignment context and the particular resource type / trait context, or straight to the resource type / trait context itself. It doesn't matter much, as this case is probably a rare one.
literal references are resolved in the context where they are defined and cannot be overridden by the context where the resource type/trait is used.
Yes.
All in all, let's wait for a decision regarding the general approach, then discuss the details.
Hi,
Is there any update on this? Libraries are pretty tough to use if the library has to know about the types defined in the RAML file that uses it. I understand that there are some subtle use cases, some of which are listed here, e.g. where both the library user and the library define similarly named types. However, the more general case, where the type is only defined by the library user, should be supported.
In my use case (as attached), validation fails with both the JavaScript and Java parsers. Unfortunately, I can't really grok anything from the error messages, which differ widely between the Java and JavaScript parsers. I'm attaching the RAML zip and log files containing the error output of the Java and JavaScript parsers, along with how I ran the parsers.
It would be nice if this were resolved in the spec, and in the parsers, before Mule 3.8.1 is released.
customer.raml.zip customer.raml.error.cmdline.txt customer.raml.error.java.txt customer.raml.error.javascript.txt
Thanks.
@usarid
I've been talking to @usarid and one other independent "source" ( @antoniogarrote ) who is very experienced with languages as well to get a different perspective. Antonio created a Lisp program (that supports macros) that shows how the expansion should work in his opinion.
His conclusion aligns with @ddenisenko's results, and I asked him to give us more details here today.
Looking at the different approaches and exercises, @ddenisenko's approach seems to be the strongest, as not only he but also Antonio and Uri came to the same conclusion. IMHO, we should move forward with that as a base, start discussing the different corner cases, and close the gaps in interpretation to get to a formal algorithm that describes the resolution, which we can then put into the specification.
I'd like to discuss the following questions:
- We haven’t really defined what a parameter actually takes. Is it a literal, a reference or an object?
- What if we define a trait and a type with the same key name?
- What if the type is object and we want to provide a literal which might have the same value vs a type key name (eg “Monkey” vs Monkey)?
- What if the type is object, how do we handle post-<<param>> or hello<<param>>world? Are we saying that in these cases a param is a string by default?
- What about functions, for example <<param | !singularize>>? Handling params as string again?
(please bear in mind that those are only my questions, please feel free to ask your own)
That is the result from a quick 1:1 chat with Denis.
- We haven’t really defined what a parameter actually takes. Is it a literal, a reference or an object?
A parameter is always a string. The string can later be treated as a reference, or not, depending on where exactly that string is applied, how it looks in its surroundings (for example, two parameters concatenated together), etc. But there is also a "source of the value" associated with the parameter application, so if its application turns out to be a reference, we can later use that source to resolve the reference.
In other words, there are two phases here.
Phase 1
Treat parameters as pure strings and do not care about the values at all, except for remembering the source.
@ddenisenko compared to @antoniogarrote you basically set scopes right? (looking at the gist he added a symbol in front of each value of the relevant nodes which basically represents the scope)
Phase 2
At the second phase, we check all the places where references can be. _If the reference is found, we resolve it and patch the string if needed. And that happens whether there was a parameter value in that string or not_, the only difference is the way we juggle with the reference resolving context.
Example
#%RAML 1.0
# api.raml
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  type:
    a.rtA:
      kind: Monkey

#%RAML 1.0 Library
# libA.raml
types:
  Monkey2: number
resourceTypes:
  rtA:
    post:
      headers:
        hA: <<kind>>
        hA2: Monkey2
After phase 1:
```yaml
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: Monkey   # We remember that the source of this one is root.raml, so we don't need to patch it to fit in root.raml
      hA2: Monkey2 # We remember that the source of this one is libA.raml, so we need to patch it to a.Monkey2 to fit in root.raml
```
After phase 2 (end result):
#%RAML 1.0
title: type shadows
uses:
  a: libA.raml
types:
  Monkey: string
/users:
  post:
    headers:
      hA: Monkey
      hA2: a.Monkey2
Notice that we analyze both hA and hA2 simply because both are potential references. Then we resolve both and decide whether to patch the value so that it is a valid reference, in the context of root.raml, to what we actually want to reference.
During that patching we do not change hA: Monkey, as it is already a correct reference in the context of root.raml that points to the Monkey type from root.raml. But we need to patch hA2: Monkey2 to become hA2: a.Monkey2, even though it is not related to parameters at all, so that in the context of root.raml it correctly points to the Monkey2 type in libA.raml. At this point we could also automatically generate an import of libA.raml if it is absent.
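The two phases can be sketched in Python as follows. This is a simplified illustration, not the parser's code: it only handles values that are either a single whole `<<param>>` or a literal, and it assumes the referenced name really exists in the remembered source unit. `Ref`, `ALIAS_IN_ROOT`, `phase1` and `phase2` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Ref:
    text: str    # the string left after phase 1
    source: str  # unit in which the string (or the parameter value) was written

# Alias under which each unit is visible from the root file (None = root itself).
ALIAS_IN_ROOT = {"api.raml": None, "libA.raml": "a"}

def phase1(template_value, template_unit, params):
    """Substitute as plain text and remember where the text came from.

    params maps a parameter name to (value, unit that supplied the value).
    """
    if template_value.startswith("<<") and template_value.endswith(">>"):
        value, unit = params[template_value[2:-2].strip()]
        return Ref(value, unit)
    return Ref(template_value, template_unit)

def phase2(ref):
    """Patch the reference so that it is valid when read from the root file."""
    alias = ALIAS_IN_ROOT[ref.source]
    return ref.text if alias is None else f"{alias}.{ref.text}"

params = {"kind": ("Monkey", "api.raml")}
hA = phase2(phase1("<<kind>>", "libA.raml", params))
hA2 = phase2(phase1("Monkey2", "libA.raml", params))
print(hA, hA2)  # Monkey a.Monkey2
```

Note that phase 2 treats both values the same way; the only thing the parameter changes is the remembered source.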
- What if we define a trait and a type with the same key name?
We don't care. Everything is a string at the first phase. At the second phase we know whether this reference points to a trait or a type, because we know the language structure.
- What if the type is object and we want to provide a literal which might have the same value vs a type key name (eg “Monkey” vs Monkey)?
If your string "Monkey" goes to a description, we do not resolve and patch it.
If your string "Monkey" goes to type: <<MonkeyTypeHere>> or is: <<MonkeyTraitHere>>, we will resolve and patch it.
- What if the type is object, how do we handle post-<<param>> or hello<<param>>world? Are we saying that in these cases a param is a string by default?
That is a very good question.
At the first phase it is all simple and clear:
type: HereWillBeThe<<param>> with param==Monkey becomes type: HereWillBeTheMonkey,
type: <<param1>>WillBeThe<<param2>> with param1==Here,param2==Monkey becomes type: HereWillBeTheMonkey,
_And we need to decide what to do on the second phase._
We have two options:
- As soon as the reference value is not completely filled by a parameter, we just ignore the fact that a parameter was used here at all. This does not mean this reference will never be patched. It just means that we do not start calculating where the reference points to from the context the parameter value comes from; instead we start the calculation from the current context.
- Establish a more complicated rule: if we have the context chain [root.raml] <- [Unit where param1 value is set] <- [Some other unit] <- [Unit where param2 value is set] <- [Current library unit with type: <<param1>>WillBeThe<<param2>> inside], we pick the unit that was used as a source of some parameter value and is closest to the expansion root (root.raml), which is [Unit where param1 value is set], and use its context to start resolving the HereWillBeTheMonkey type. If we can't find the type HereWillBeTheMonkey in that unit, we then proceed to [Some other unit], and so on all the way to [Current library unit with type: <<param1>>WillBeThe<<param2>> inside] in the end.
I do not know which approach is better. But I should mention that, while this one looks a bit artificial and it is not very important how we treat this kind of case, there is also a case where we will have to break our own rules of treating everything as strings and then resolving references as a whole, and that is type expressions:
_Here is the exceptional case_:
Imagine type: <<param1>> | <<param2>>. In this case we still first work on pure strings level: type: FirstType | SecondType. But then we are not resolving FirstType | SecondType as a whole, instead we resolve FirstType and SecondType separately, and start resolving each one in its own context: where FirstType value comes from and where SecondType value comes from.
- What about functions, for example <<param | !singularize>>? Handling params as string again?
Functions are resolved during the first "string" phase.
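A small Python sketch of applying functions during the string phase. The `singularize` helper here is deliberately naive (it only strips a trailing "s") and stands in for whatever inflection rules a real processor uses. `FUNCTIONS` and `expand` are hypothetical names.

```python
import re

def singularize(word):
    # Naive stand-in: real inflection rules are more involved.
    return word[:-1] if word.endswith("s") else word

FUNCTIONS = {
    "!singularize": singularize,
    "!uppercase": str.upper,
    "!lowercase": str.lower,
}

def expand(template, params):
    """Replace <<name | !fn | ...>> with the transformed parameter value."""
    def repl(match):
        name, *funcs = [part.strip() for part in match.group(1).split("|")]
        value = params[name]
        for fn in funcs:
            value = FUNCTIONS[fn](value)
        return value
    return re.sub(r"<<(.*?)>>", repl, template)

result = expand("<<resourcePathName | !singularize>>", {"resourcePathName": "users"})
print(result)  # user
```

Because the function runs before any reference is looked up, the resolver in the second phase only ever sees the final string.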
Feedback?
Hi all, I'll try to explain a little bit the code Christian linked before.
I just came across this discussion while working on something else with Christian. I'm lacking most of the context, so apologies in advance.
When looking at the problem of the variables in resource types, I was making some assumptions:
- Certain nodes in the AST parsed from the RAML files are actually variables, in this case type identifiers and resourceTypes identifiers
- Variables are lexically scoped
- A RAML file opens a new lexical scope
The first section of the file Christian linked is the in-memory AST I would expect an interpreter for RAML would build (the READ phase in the READ - EVAL - PRINT - LOOP sequence) for the provided files:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; AST and variable resolution ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Each file creates a scope
;; Variables are lexically qualified within their scopes, aliases are expanded

;; api.raml
(types [api.raml/Monkey string])
(endpoint api.raml//users
  {:type (libA.raml/rtA {:kind api.raml/Monkey})})

;; libA.raml
(types [libA.raml/Monkey number])
(resource-types libA.raml/rtA [kind]
  `{:type (libB.raml/rtB :kindB ~kind)
    :post {:headers {:hA  ~kind
                     :hA2 libA.raml/Monkey}}})

;; libB.raml
(types [libB.raml/Monkey boolean])
(resource-types libB.raml/rtB [kindB]
  `{:post {:headers {:hB  ~kindB
                     :hB2 libB.raml/Monkey}}})
Variable identifiers become fully qualified using the scope where they are defined.
Monkey in api.raml becomes api.raml/Monkey while Monkey in libA.raml becomes libA.raml/Monkey. This is because I'm assuming files open a new scope but can also be used as a namespace for the variables.
Once you have parsed the files and created the in-memory AST, you can go on with the EVAL phase. Here is where you need to deal with the resourceTypes.
One possibility is to consider each resourceType as a template or macro, where arguments are treated as nodes of the AST that are cloned lexically and replaced in the sub-tree of the macro (resourceType), then the output of the replacement is inserted in the source AST at the position where the resourceType is applied. So the EVAL-1 phase just transforms the input AST into an output AST where all the resourceTemplates have been expanded. The only problem with this approach is that the output of expanding resourceTemplates cannot be just inserted in the original AST at the resourceTemplate application point, application must be defined as the merging of maps for the source and the result of the resourceTemplate expansion. Rules for key collisions must be defined.
This macro expansion phase in two steps is what I was trying to show in the next two sections of the file:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Macro Expansion Time (1) ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Recursive macro expansion
(types [api.raml/Monkey string])
(types [libA.raml/Monkey number])
(types [libB.raml/Monkey boolean])
(endpoint api.raml//users
  {:type (libB.raml/rtB :kindB api.raml/Monkey)
   :post {:headers {:hA  api.raml/Monkey
                    :hA2 libA.raml/Monkey}}})

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Macro Expansion Time (2) ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(types [api.raml/Monkey string])
(types [libA.raml/Monkey number])
(types [libB.raml/Monkey boolean])
(endpoint api.raml//users
  {:post {:headers {:hA  api.raml/Monkey
                    :hA2 libA.raml/Monkey
                    :hB  api.raml/Monkey
                    :hB2 libB.raml/Monkey}}})
After the full tree has been expanded, we can proceed to the second stage of the EVAL phase. In EVAL-2 the input is the expanded AST and the output is the RAML data model built from it. In this case the evaluation is just the resolution of the remaining variables. This is what the last section of the file shows:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Evaluation Time ;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;; Final computation of the RAML model
(endpoint /users
  {:post {:headers {:hA  string
                    :hA2 number
                    :hB  string
                    :hB2 boolean}}})
This is just a way of defining operationally the way the parser works. A different output could be generated by defining different rules, for example, using dynamic scope instead of lexical scopes, or by defining different rules for the start of a scope.
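The same read / expand / evaluate pipeline can be mimicked in a few lines of Python, treating each resource type as a function over already-qualified names. This is only an analogy for the Lisp listing above: `rtA`, `rtB` and `TYPES` are hypothetical Python stand-ins, and merging is reduced to a plain dictionary update with no collision rules.

```python
# READ: every type name is qualified with the file that declares it.
TYPES = {
    "api.raml/Monkey": "string",
    "libA.raml/Monkey": "number",
    "libB.raml/Monkey": "boolean",
}

def rtB(kindB):
    # Literal references inside rtB were qualified with libB.raml at read time.
    return {"hB": kindB, "hB2": "libB.raml/Monkey"}

def rtA(kind):
    headers = rtB(kindB=kind)  # parent resource type, expanded recursively
    headers.update({"hA": kind, "hA2": "libA.raml/Monkey"})
    return headers

# EVAL-1: expand the templates; the argument keeps its api.raml qualification.
expanded = rtA(kind="api.raml/Monkey")
# EVAL-2: resolve the remaining qualified names to their definitions.
model = {header: TYPES[qualified] for header, qualified in expanded.items()}
print(model)
```

Since the argument is qualified before it is passed, it keeps pointing at api.raml/Monkey no matter how deep in the chain it is finally used.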
Again, I'm missing a lot of the context, so sorry if I'm missing something important. I hope it can be of any help.
The approach that makes sense to me, and I believe will be the expected behavior, is loosely described as: passing a parameter entails passing a literal ("Monkey" or "[ 1, 2, 3 ]") as well as the file context at the point the parameter was passed; then the parser understands what kind of parameter value is needed from the place where the parameter is used, and resolves that value using the passed-in literal and context. Let's go with this one. This conclusion is the one described most recently by @antoniogarrote and the same one (I believe) that @ddenisenko was describing throughout this, and it seems unambiguous and applicable to the "creative" cases that e.g. @machaval brought up.
So the answer to @svacas original question "Which would be the final types of headers hA, hA2, hB, hB2" is:
hA: string
hA2: number
hB: string
hB2: boolean
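One way to picture "a literal plus the file context" is the Python sketch below. It is an assumption-laden illustration rather than a specified algorithm: `Param`, `REFERENCE_SITES` and `use_param` are hypothetical names, and the set of places that expect a type reference is invented for the example.

```python
from typing import NamedTuple

class Param(NamedTuple):
    literal: str  # e.g. "Monkey"
    context: str  # file in which the value was written

TYPES = {
    "api.raml": {"Monkey": "string"},
    "libA.raml": {"Monkey": "number"},
    "libB.raml": {"Monkey": "boolean"},
}
# Hypothetical: places where the parser expects a type reference.
REFERENCE_SITES = {"type", "headers"}

def use_param(param, site):
    """The use site decides whether the literal is a reference or plain text."""
    if site in REFERENCE_SITES:
        return TYPES[param.context][param.literal]
    return param.literal

kind = Param("Monkey", "api.raml")
as_header = use_param(kind, "headers")
as_text = use_param(kind, "description")
print(as_header, as_text)  # string Monkey
```

Collision and fallback rules are intentionally left out, since those were still open at this point in the discussion.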
What I'm not sure about is whether there are any other rules needed, e.g. collisions or fallbacks, but I'm hoping that can now be worked out given the above decision, and then a clear, simple language can be found to describe this and clarify in the spec.