language-rust
                                
Deal with Idents/Names efficiently
It may be the case that identifiers or names should be interned, or use Text, but in any case the current situation should change.
Might be worth considering an Invalid constructor given the number of times Rust relies on that idea. Either that, or adjust the AST to get rid of this pattern (and have mkIdent prevent this case at runtime).
data Ident = Invalid | Ident String -- or some ID in a global symbol table
For example, an Arg requires a pattern, but for cases like fn f(i32) -> i32, that pattern is an IdentP of an invalid (empty) Ident. Similarly, in FieldPat, there is an edge case where there is no identifier; Rust fills it in with the same identifier as the pattern.
For now, Ident stayed the same, and instead the AST was adjusted to avoid having to fill Ident holes with an empty/filler/invalid Ident. Still not ruling out the possibility of adding the Invalid :: Ident variant, especially since it would match more closely the Rust AST.
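To make the "adjust the AST" option concrete, here is a minimal sketch of what avoiding the filler Ident could look like. The types are deliberately simplified and hypothetical (the real Arg, Pat and Ty have more constructors and fields); the only point is that the pattern becomes optional instead of being an IdentP wrapping an empty Ident.

```haskell
-- Hypothetical, heavily simplified AST fragments for illustration only.
newtype Ident = Ident String deriving (Eq, Show)

data Ty = PathTy Ident deriving (Eq, Show)

data Pat = IdentP Ident deriving (Eq, Show)

-- The pattern is optional, so `fn f(i32) -> i32` needs no filler Ident.
data Arg = Arg (Maybe Pat) Ty deriving (Eq, Show)

-- Corresponds to the argument in `fn f(i32) -> i32`.
unnamedArg :: Arg
unnamedArg = Arg Nothing (PathTy (Ident "i32"))

-- Corresponds to the argument in `fn f(x: i32) -> i32`.
namedArg :: Arg
namedArg = Arg (Just (IdentP (Ident "x"))) (PathTy (Ident "i32"))

-- Consumers must handle the missing-name case explicitly.
argName :: Arg -> Maybe String
argName (Arg (Just (IdentP (Ident n))) _) = Just n
argName (Arg Nothing _)                   = Nothing
```

The trade-off is that this drifts from the Rust AST, which is exactly why the Invalid variant is still on the table.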
Reopening this for the more general problem of how to efficiently deal with Ident and Name.
This commit removed the Name data type and just changed it to be a synonym for String. After a couple more simplifications, things ended up pretty nice. This issue is now repurposed for the question of whether it is worth trying to compact strings.
For example, have something like
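import qualified Data.IntMap as IntMap
import Data.Hashable (hash)  -- from the hashable package

data Ident = Ident String Int  -- assumed shape: the name plus its cached hash

mkIdent :: IntMap.IntMap Ident -> String -> (IntMap.IntMap Ident, Ident)
mkIdent cache s = case IntMap.lookup h cache of
    Just ident@(Ident s' _) | s == s' -> (cache, ident)  -- hit: reuse the shared Ident
    _ -> let ident = Ident s h in (IntMap.insert h ident cache, ident)  -- miss or hash collision: overwrite the slot
  where h = hash s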
The cost of the IntMap probably dwarfs any benefits of using it though... Still something to keep in mind.
There are really two ideas to keep track of here:
- Compacting common Names (like the previous comment)
- Representing Names in a format that saves bytes and pointer indirection (like Text)
I should benchmark this before doing anything.
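For reference when benchmarking, here is a rough sketch that combines both ideas: names are stored as Text, and each distinct name is interned to a small integer key. Symbol, SymbolTable, emptyTable and intern are all hypothetical names, not anything that exists in the library, and it assumes the text and containers packages. It uses a Map keyed on the name rather than an IntMap keyed on a hash, which sidesteps collisions at the cost of comparing names on lookup.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Text as T

-- Hypothetical interned identifier: a small integer key into a symbol table.
newtype Symbol = Symbol Int deriving (Eq, Ord, Show)

-- Hypothetical symbol table: known names plus the next unused key.
data SymbolTable = SymbolTable
  { symbols :: Map.Map T.Text Symbol
  , nextId  :: !Int
  }

emptyTable :: SymbolTable
emptyTable = SymbolTable Map.empty 0

-- Return the existing Symbol for a name, or allocate a fresh one.
intern :: T.Text -> SymbolTable -> (Symbol, SymbolTable)
intern name table@(SymbolTable syms n) =
  case Map.lookup name syms of
    Just sym -> (sym, table)
    Nothing  ->
      let sym = Symbol n
      in (sym, SymbolTable (Map.insert name sym syms) (n + 1))
```

Comparing two interned identifiers is then a single Int comparison, but the table has to be threaded through the parser (or hidden in a State monad), so whether this beats plain String or Text is something only the benchmark can answer.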