Gradualizer icon indicating copy to clipboard operation
Gradualizer copied to clipboard

Interchangeable record and tuple types

Open timCF opened this issue 5 years ago • 10 comments

Hello! Maybe you can help me to understand how Erlang record types are represented in Gradualizer, and we can figure out why there is issue in this particular example:

Let's say we have this call

https://github.com/timCF/hm-crypto/blob/3ca5ff5cd02056401eb1f081f460d3f9ef50ddad/lib/hm_crypto.ex#L51

:public_key.sign function accepts one of 3 Erlang records as last argument:

#'RSAPrivateKey'{} | #'DSAPrivateKey'{} | #'ECPrivateKey'{}

so it expects what &parse_pem/1 will return one of them, let's now take a look to &parse_pem/1

https://github.com/timCF/hm-crypto/blob/3ca5ff5cd02056401eb1f081f460d3f9ef50ddad/lib/hm_crypto/public_key.ex#L62

and it returns exactly one of them:

https://github.com/timCF/hm-crypto/blob/3ca5ff5cd02056401eb1f081f460d3f9ef50ddad/lib/hm_crypto/public_key.ex#L4-L49

Elixir does not have special syntax for Erlang records, so some sugar from Record module was used to generate some macro which will represent Erlang records as proper tuples in source code. If we decompile generated .beam file back to Erlang, types and spec will look like

-type rsa_key() :: binary() | parsed_rsa_key().

-type parsed_rsa_key() :: {'RSAPrivateKey', any(),
			   any(), any(), any(), any(), any(), any(), any(),
			   any(), any()} |
			  {'DSAPrivateKey', any(), any(), any(), any(), any(),
			   any()} |
			  {'ECPrivateKey', any(), any(), any(), any()}.

-spec parse_pem(rsa_key()) -> parsed_rsa_key().

Looks correct, but if I try to apply Gradualizer, it says

The function 'Elixir.HmCrypto.PublicKey':parse_pem on line 51 is expected to return #'RSAPrivateKey'{} | #'DSAPrivateKey'{} | #'ECPrivateKey'{} but it returns parsed_rsa_key()

timCF avatar Dec 23 '18 12:12 timCF

I'm not sure how interchangeably Gradulaizer handles records and tuples.

I workaround idea: what if you use the exported type from public_key itself (so that you avoid handling record types directly in Elixir), ie:

-spec parse_pem(rsa_key()) -> :public_key.private_key().

Another workaround idea for records for which there is no type exported: Generate an Erlang module that exports these types:

-module('Elixir.HmCrypto.PublicKey.RecordTypes').
-include_lib("public_key/include/public_key.hrl").
-export_type([rsa_public_key/0, rsa_private_key/0, ...]).
-type rsa_public_key()  :: #'RSAPublicKey'{}.
-type rsa_private_key() :: #'RSAPrivateKey'{}.
...

I have to admit the typespec for some public_key functions are not that fine grained, some decode functions are declared to return term() which is not too helpful.

gomoripeti avatar Dec 23 '18 13:12 gomoripeti

Interesting topic! I didn't know Elixir doesn't use Erlang records.

Currently, a record expression only matches a record type and the same for tuples, with the exception that tuple() is any record or tuple.

Perhaps we should support tuples and records interchangeably, in the long run. We could treat the type #rec{} as syntactic sugar for {rec, ...} and expand it to a tuple type when normalizing types.

Another thing we don't support yet is subtypes of records, i.e. records where one or more fields have a specialized type, e.g.

-record(rec, {foo :: a|b|c, bar :: any()}).
-type rec_subtype() :: #rec{foo :: a}.

zuiderkwast avatar Dec 23 '18 14:12 zuiderkwast

Thanks, currently I will try to use tiny additional Erlang files for these type annotations! But idea about #rec{} as syntactic sugar for {rec, ...} looks very reasonable as well as idea about record subtypes :)

timCF avatar Dec 23 '18 19:12 timCF

I tried it, anyway, if I have code like

  @spec parse_pem(binary() | :public_key.private_key()) :: :public_key.private_key()
  def parse_pem(rsa_private_key() = pem), do: pem
  def parse_pem(dsa_private_key() = pem), do: pem
  def parse_pem(ec_private_key() = pem), do: pem

  def parse_pem(pem_string) when is_binary(pem_string) and pem_string != "" do
    [pem_entry] = :public_key.pem_decode(pem_string)
    :public_key.pem_entry_decode(pem_entry)
  end

Gradualizer gives error

The pattern {'RSAPrivateKey',_,_,_,_,_,_,_,_,_,_} = _pem@1 on line 38 doesn't have the type binary() | public_key:private_key()

Because actual generated Erlang code is

-spec parse_pem(binary() |
		public_key:private_key()) -> public_key:private_key().

parse_pem({'RSAPrivateKey', _, _, _, _, _, _, _, _, _,
	   _} =
	      _pem@1) ->
    _pem@1;
parse_pem({'DSAPrivateKey', _, _, _, _, _, _} =
	      _pem@1) ->
    _pem@1;
parse_pem({'ECPrivateKey', _, _, _, _} = _pem@1) ->
    _pem@1;
parse_pem(_pem_string@1)
    when erlang:is_binary(_pem_string@1) andalso
	   _pem_string@1 /= <<>> ->
    [_pem_entry@1] = public_key:pem_decode(_pem_string@1),
    public_key:pem_entry_decode(_pem_entry@1).

Of course I can refactor my code to avoid this problem, but it can appear in other, less trivial cases. Btw, if I will put clause with is_binary/1 guard before others, I will get other error

The variable '_pem_string@1' on line 40 has type <<_:_*8>> |
          #'RSAPrivateKey'{} |
          #'DSAPrivateKey'{} |
          #'ECPrivateKey'{} but is expected to have type binary()

which means that Gradualizer expect variable pem_string can be not only binary here

[pem_entry] = :public_key.pem_decode(pem_string)

(but technically this is not correct because of is_binary/1 guard)

timCF avatar Dec 23 '18 21:12 timCF

Just a note: If we convert a record type to a tuple type, we could annotate the tuple type with erl_anno:set_record(true, Anno). It is an annotation which exists specifically for this purpose.

record A Boolean indicating if the origin of the abstract code is a record. Used by Dialyzer to assign types to tuple elements.

zuiderkwast avatar Dec 31 '18 01:12 zuiderkwast

Personally I would really like to avoid treating records as their tuple representation as far as possible. Records as implemented in erlang is a "leaky abstraction", but I think it'd be nice if we could maintain as much of the abstraction as possible. That being said, if there's a lot of code out there which exploits the fact that records are tuples then we'll probably have to allow that. I added support for treating records as subtypes of tuple() because I came across code which exploited this fact heavily.

josefs avatar Jan 02 '19 08:01 josefs

This feels like a perfect candidate for a flag. You could enable it when type checking existing code exploiting that records are tuples, but leave it disabled when developing new code where you want to maintain the abstraction.

UlfNorell avatar Jan 02 '19 08:01 UlfNorell

-gradualizer([records_as_tuples]).

An option sounds great to me.

If we want an option like this passed around in Env, either it needs to be passed around to normalize/2 and other functions or we need to handle options in another way. Perhaps using application config params, e.g. application:get_env(gradualizer, records_as_tuples).

zuiderkwast avatar Jan 02 '19 11:01 zuiderkwast

I noticed in my code base a frequent pattern using lists:key* with records (which basically makes use of leaky record abstraction)

-spec f([#r{}]) -> #r{} | false.
f(RL) ->
    lists:keyfind(1, #r.f, RL).

Is there any way this could be made valid without records_as_tuples? Could the spec of lists:keyfind/3 be overriden so it works on all records as well? Not having a generic record() type (only record(Name)) is intentional, right? (Although I found a trace here https://github.com/josefs/Gradualizer/blob/master/src/typechecker.erl#L1141, what is this user type and where should it come from?)

gomoripeti avatar Dec 14 '20 00:12 gomoripeti

@gomoripeti as you noticed, there is no general records type. I suppose we could add such a type and let all field accesses be allowed and return any(). But I'm not convinced it's worth the hassle, just to make it separate from tuples. I agree with @UlfNorell that adding a flag for this is a good way forward.

The code that you refer to is the "record environment" which contains a map of all the known records and its fields.

josefs avatar Dec 16 '20 10:12 josefs