crystal icon indicating copy to clipboard operation
crystal copied to clipboard

Improve `UUID#inspect` to return Crystal expression

Open straight-shoota opened this issue 3 years ago • 23 comments

This patch changes UUID#inspect to return a string representation that is also a valid Crystal expression for creating a UUID instance. This serves as a convenience feature for developers. You can easily print a UUID (p! uuid) and use the result to re-construct the same UUID in Crystal code. Without that, you have to manually edit the current format to form a call to UUID.new. Considering the diff is quite minimal, I think it's a nice treat to make this easier for users.

straight-shoota avatar Mar 07 '22 20:03 straight-shoota

I understand the rationale behind this, but this could be said about many other types. Current convention used in #inspect is #<Name ...> for classes and Name(...) for structs. It would be a bit weird to have UUID as the only type which exhibits different behavior.

Sija avatar Mar 07 '22 22:03 Sija

Many types have inspect formats that are valid Crystal expressions. All types that have literals or literal-like expressions (such as Set{1, 2}). It should work fine for many simple data types like UUID. "Crystal-expressiveness" is an extremely useful property. I don't think it should be limited to types that have literal or literal-like expressions. This obviously doesn't work for complex types with transient state. They can't be represented by a Crystal expression. But others can and IMO it makes great sense to utilize that.

straight-shoota avatar Mar 08 '22 10:03 straight-shoota

I personally think the rule should be:

  • If there's a way to make the string representation not involve method calls, let's do that (this is the case of URI::Params, which can be written as URI::Params{...}
  • Otherwise, let's avoid putting method calls in inspect

I personally think seeing an array of URIs like this:

[URI.new("..."), URI.new("...")]

instead of this:

[URI("..."), URI("...")]

looks very strange.

Maybe we can add an URI.[](String) method that is an alias of new. Then the output would be:

[URI["..."], URI["..."]]

which is more, doesn't involve method calls, and it's also valid Crystal code.

asterite avatar Mar 08 '22 12:03 asterite

I'd be happy with UUID["..."] as well. I'm not sure if avoiding methods calls should be a strong goal, but in any case it's a bit more concise. In fact, I have been working on a similar solution for BitArray which would make the format of BitArray#inspect work as a constructor.

straight-shoota avatar Mar 08 '22 12:03 straight-shoota

In Ruby methods can start with an uppercase letter, so you can do this:

require "uri"

uri = URI("https://example.com")

In Crystal that syntax conflicts with generics. Maybe there's a way to allow it, but I'm not sure. And maybe it would just add confusion.

So we could use the [] class method instead, maybe also use that as a convention. I was thinking what other types could do this... there's Dir, but Dir already has a [] method with a different meaning, so maybe it's not a good idea (though maybe it will work, because it will be a pattern that will only match that file).

But one thing that I'm noticing is that in Ruby these types don't have an inspect output that is valid Ruby code. For URI, it's:

#<URI::HTTPS https://hello>

If you have a struct:

❯ irb
irb(main):001:0> Point = Struct.new(:x, :y)
=> Point
irb(main):002:0> Point.new(1, 2)
=> #<struct Point x=1, y=2>

I'm not saying that we shouldn't do this because Ruby doesn't do it. I'm just trying to think why Ruby doesn't do it... maybe the case where everything you output becoming a valid Crystal expression is really hard.

Maybe the solution is to review this case by case 🤷

asterite avatar Mar 08 '22 13:03 asterite

I'm just trying to think why Ruby doesn't do it

I think it's more of a coincidence that some Ruby inspects produce Ruby code and others don't. That is, I think they chose a format for inspect that conveys the most meaning to a programmer and sometimes that format looks like code and sometimes it doesn't. The documentation even states "Returns a string containing a human-readable representation of obj". I'd argue most Ruby inspects don't produce valid code. For example, Array and Hash appear to produce valid code, but that breaks down if you have an Array of objects or Array of Dates.

In my opinion, inspect is not about producing code, but producing something human-readable. If we want a method that produces valid code, perhaps a new method is in order? (offhand, thinking .to_crystal)

Fryguy avatar Mar 08 '22 15:03 Fryguy

In my opinion, inspect is not about producing code, but producing something human-readable.

Agreed. But it doesn't hurt when the human-readable representation is also valid code. I believe there is even a good argument that a valid expression can be a very good human-readable representation. Because that's exactly what a human would write to produce an object. When a value can be represented as reasonably concise code, there's no need for an artificial syntax to describe the object. Using the already existing syntax for construction is much easier. Adding a method for such as UUID.[] doesn't make use of existing API. So the question is whether it makes sense to introduce this method, regardless of #inspect. And I think it does because it serves as a concise way for producing an object. If we agree on that, we can add this constructor and then use it's for #inspect. The result would be concise and consistent syntax for producing and consuming objects. You could think of it as serialization for humans. #inspect is the serializer, and a constructor expression is the deserializer.

straight-shoota avatar Mar 08 '22 17:03 straight-shoota

If we want a method that produces valid code, perhaps a new method is in order? (offhand, thinking .to_crystal)

I don't think there's a reason for that. Simple data objects that can be easily represented with a Crystal expression should use that for #inspect. Objects can't be represented that way (say, a Socket, for example) can use some other form of representation for #inspect, whatever is reasonable. Having two methods produce different human-readable representations, #inspect and #to_crystal (Crystal code is human-readable), doesn't make much sense in my opinion.

straight-shoota avatar Mar 08 '22 17:03 straight-shoota

I think we can try using the URI.new(...) approach. It's just a few more letters compared to URI.[] and it doesn't involve adding a new method just for this. I know I was kind of against it, but I think we should try it anyway. We can always revert it, doing changes to inspect doesn't matter much regarding backwards compatibility.

asterite avatar Mar 08 '22 17:03 asterite

@straight-shoota #inspect is meant for human inspection, not as a serializer as you described it - where this idea comes from anyway?

Sija avatar Mar 08 '22 17:03 Sija

Well, inspecting means understanding. That needs an effectual representation. Serialization means turning structured data into a serialized presentation. The human-readable representation of #inspect is a form of serializing.

straight-shoota avatar Mar 08 '22 17:03 straight-shoota

Well, inspecting means understanding. That needs an effectual representation. Serialization means turning structured data into a serialized presentation. The human-readable representation of #inspect is a form of serializing.

Again, #inspect is meant for humans, not machines and it's not a serialization, but rather a textual representation using a format that resembles a source code. Serialization is meant for storage and transmission, not inspection.

Sija avatar Mar 08 '22 18:03 Sija

Again, #inspect is meant for humans, not machines and it's not a serialization, but rather a textual representation using a format that resembles a source code

I think @straight-shoota 's point is: if we can achieve both in a way that is still pretty much human-readable, then why not? Specially considering that those humans will be Crystal programmers, if they see a valid Crystal expression they will understand it, maybe even easier than understanding a custom output.

asterite avatar Mar 08 '22 18:03 asterite

Just to clarify, I don't think there's anything wrong with updating specific inspects on a case-by-case basis to emit a string that happens to work directly in crystal. However, specifically for UUID.new, it just feels weird to me personally, but probably only because I've never seen method calls emitted in .inspects before (in Ruby or Crystal). I think the UUID[] style would feel a little nicer, but there's nothing really wrong with the UUID.new style.

Fryguy avatar Mar 08 '22 18:03 Fryguy

For primitives and container(-like) types it certainly makes sense to have a format that works both ways. For others it seems to me like trying to conflate two different concepts (inspection and serialization) into one, which is simply at best confusing and non-coherent. Next, we'll be having questions why other types are not following such pattern, and should #inspect return serialized objects, or changing constructors in order to fit into such practice (already proposed), and so on.

Sija avatar Mar 08 '22 18:03 Sija

@straight-shoota btw, why won't you post an RFC before opening a PR with breaking changes - as you like to advise so often? Not doing so - like on many other occasions - is creating a community of double-standards - and it's not that I believe we need an RFC for every change, it's about following the same standards you put out for others.

Sija avatar Mar 08 '22 18:03 Sija

.new is problematic because it is defined as the constructor by the language, and so it carries a connotation not seen in alternative string representations, namely that the constructed object never occupies the same heap (or stack) space as any other Crystal object. Thus it is fallacious for an object to say that it belongs to space that isn't occupied by itself. (On the other hand, if this new object is indeed interned then that constructor is no longer trivial enough to warrant a .new representation anyway.)

If we use T[...] then this connotation disappears. T{...} technically involves an argless .new call, but is often viewed as a distinct literal per se, so it is probably fine too. I have no strong opinion over other constructor-like method calls; apparently this has already been done for non-native Paths.

HertzDevil avatar Mar 08 '22 19:03 HertzDevil

@Sija I honestly believed this is merely a simple DX improvement that doesn't warrant any prior discussion (like #11880 for example). Apparently, I was wrong about that.

For primitives and container(-like) types it certainly makes sense to have a format that works both ways.

What makes primitives and container-like types special in this regard? Why couldn't this same concept apply to other types?

trying to conflate two different concepts (inspection and serialization) into one, which is simply at best confusing and non-coherent.

As I already argued before, I don't think inspection and serialization are that different. And then I'd again ask why combining both would be confusing and incoherent, when it's apparently not for primitives and container-like types?

Next, we'll be having questions why other types are not following such pattern

Maybe those questions would be rightfully asked? Why don't I have a simple way to inspect URI or Time and copy that representation into code and have it produce an object with the same value? Or what about BitArray (actually a container-like type!) which currently has no way of construction from a specific value? Probably every record could have a default #inspect implementation that produces a Crystal expression representing the record object.

Related: https://github.com/crystal-lang/crystal/issues/9966 / #9974

straight-shoota avatar Mar 08 '22 20:03 straight-shoota

To me, the ideal case would be to have something that works for every struct consistently. And if I had the God of Syntax here with me, I'll ask they to add Struct(@fld₁ = v₁, ..., @fldₙ = vₙ) as a constructor for each struct with the obvious semantics, and making it a "code smell" if you put it in your code. For generics it would simply be Generic(Class)(@fld₁ = v₁, ..., @fldₙ = vₙ) (like what inspect outputs).

beta-ziliani avatar Mar 10 '22 19:03 beta-ziliani

To me, the ideal case would be to have something that works for every struct consistently. And if I had the God of Syntax here with me, I'll ask they to add Struct(@fld₁ = v₁, ..., @fldₙ = vₙ) as a constructor for each struct with the obvious semantics, and making it a "code smell" if you put it in your code. For generics it would simply be Generic(Class)(@fld₁ = v₁, ..., @fldₙ = vₙ) (like what inspect outputs).

beta-ziliani avatar Mar 10 '22 19:03 beta-ziliani

@beta-ziliani I hope the gods are not favourable to you :P

I don't understand why a inspect representation that also works as a Crystal expression - in whatever syntax - would be considered a code smell. It shouldn't matter whether I write an expression like URI::Params{"key" => "13809138213109", "secret" => "jdUad$"} (or whatever struct-only example) or copy it from some inspect output.

straight-shoota avatar Mar 10 '22 19:03 straight-shoota

For the same reason visibility exists in Ruby/Crystal. You want to prevent the creation of arbitrary objects.

I want to respect the Gods of Valid Code.

beta-ziliani avatar Mar 10 '22 19:03 beta-ziliani

Elixir seems to favor valid Elixir expressions now: https://github.com/elixir-lang/elixir/blob/v1.14/CHANGELOG.md#expression-based-inspection-and-inspect-improvements

HertzDevil avatar Aug 18 '22 13:08 HertzDevil

I still agree with https://github.com/crystal-lang/crystal/pull/11879#issuecomment-1061708913 that we should prefer [...] over .new(...) and alias the constructor if necessary. #13044 uses [...] on a type that doesn't even have a public constructor yet, and we should follow suit here.

HertzDevil avatar Feb 13 '23 19:02 HertzDevil

It's related, but a separate feature. I think it makes sense to add it on its own.

straight-shoota avatar Feb 14 '23 10:02 straight-shoota

Then that separate PR should come before this one, as that's the whole premise of this PR. We do not merely want the output of #inspect to be valid Crystal syntax, we want it to actually be able to evaluate to something.

HertzDevil avatar Feb 14 '23 12:02 HertzDevil

I don't think the order matters much, really. We have other inspect formats that look like valid syntax, but are not functional. It's completely fine to merge this before adding the [] constructor.

straight-shoota avatar Feb 14 '23 20:02 straight-shoota