
Use a GF grammar for printing GF grammars

Open heatherleaf opened this issue 7 years ago • 3 comments

Most (all?) output formats can be described as a concrete syntax of a GF grammar.

This relates to issue #26, and should make it simpler to add and modify output formats.

Idea: write one GF grammar for canonical GF (i.e. what's left after compilation and partial evaluation), and express output formats such as Haskell, JavaScript, JSON, YAML etc. as concrete syntaxes. The same can be done for outputting PGF/PMCFG grammars, and possibly for speech grammars (which need some grammar transformations such as left-recursion elimination).

This would affect the modules GF.Compiler, GF.Compile.Export, and others. In fact, several modules can probably be removed (such as GF.Compile.PGFtoXxx).

One extra feature would be that we don’t have to recompile GF to add a new output format. So, anyone can create their own :)
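
As a minimal sketch of the idea (not part of the proposal itself), here is a toy abstract syntax with two concrete syntaxes acting as "output formats". The module names `Decl`, `DeclSrc` and `DeclYaml`, and the functions `catDecl`, `idS` and `idNP`, are made up for illustration. Each module would live in its own file.

```gf
-- Decl.gf: hypothetical toy abstract syntax for a single category declaration
abstract Decl = {
  flags startcat = Decl ;
  cat Decl ; Id ;
  fun
    catDecl : Id -> Decl ;
    idS, idNP : Id ;
}

-- DeclSrc.gf: renders the tree in GF source style
concrete DeclSrc of Decl = {
  lincat Decl, Id = {s : Str} ;
  lin
    catDecl c = {s = "cat" ++ c.s ++ ";"} ;
    idS = {s = "S"} ;
    idNP = {s = "NP"} ;
}

-- DeclYaml.gf: renders the same tree in a YAML-like style
concrete DeclYaml of Decl = {
  lincat Decl, Id = {s : Str} ;
  lin
    catDecl c = {s = "cat:" ++ c.s} ;
    idS = {s = "S"} ;
    idNP = {s = "NP"} ;
}
```

With both concrete syntaxes loaded in the GF shell, linearizing the tree `catDecl idS` should give `cat S ;` for `DeclSrc` and `cat: S` for `DeclYaml`. Adding a format then means adding one more concrete syntax. Note that tokens are joined with spaces, so whitespace-sensitive formats would need more care than this sketch shows.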

heatherleaf avatar Jan 11 '19 12:01 heatherleaf

This sounds like a good idea with one clarification.

Every time I make major changes to the compiler or the runtime, I need to update a lot of things, including the various exports to Haskell, Python, Prolog, etc. I believe that many of those exports are not used by anyone. When I update them I make sure that they compile, but when necessary I have had to change the export format, and this has happened several times in the last few years. Yet no one has complained that this broke their code. I therefore believe they were one-time experiments that were not used afterwards.

Moving these exports out of the compiler is a good idea. There should be an abstract syntax defined somewhere in another repository. The only responsibility of the compiler would be to produce one abstract syntax tree representing the GF grammar. When necessary, the compiler maintainers should also update the abstract syntax that describes GF grammars.

The maintenance of the export formats should be left to their respective maintainers, who are the only ones who know how each format should be used.

krangelov avatar Jan 11 '19 14:01 krangelov

That's exactly what I meant. I think.

Also: all of the current export formats (except Haskell concrete syntax) operate on the PGF grammar, but it's sometimes more useful to print the compiled GF grammar instead. So we need two GF grammars, and people can decide whether they want to render the GF or the PGF.

heatherleaf avatar Jan 11 '19 15:01 heatherleaf

I made a very quick abstract syntax for canonical GF, i.e. what's left after compilation and partial evaluation, or what batchCompile returns here:

https://github.com/GrammaticalFramework/gf-core/blob/f32d222e7120b2cdbcf7959f2230d01588ee1aa0/src/compiler/GF/Compiler.hs#L47-L52

I know that I'm skipping over esoteric things such as lindef, data, def, printname, etc., that I have probably missed a lot of things, and that there are probably better ways of doing it (ping @Thomas-H @krangelov @aarneranta @johnjcamilleri @inariksit et al). But here it is anyway:

abstract GFCanonical = {

cat
  Grammar ; Abstract ; Concrete ; [Concrete]{0} ;

  CatDef ; [CatDef]{0} ;

  FunDef ; [FunDef]{0} ;
  SimpleType ; ComplexType ; [ComplexType]{0} ;
  TypeApplication ; TypeBinding ; [TypeBinding]{0} ; 

  ParamDef ; [ParamDef]{0} ;
  ParamValue ; [ParamValue]{0} ; ParamType ;

  LincatDef ; [LincatDef]{0} ;
  LinType ; [LinType]{0} ;
  RecordRowType ; [RecordRowType]{0} ;

  LinDef ; [LinDef]{0} ;
  LinValue ; [LinValue]{0} ;
  TableRowValue ; [TableRowValue]{0} ;
  RecordRowValue ; [RecordRowValue]{0} ;

  CatId ; [CatId]{0} ;
  FunId ;
  ParamId ; [ParamId]{0} ;
  ValueId ;
  LabelId ;
  VarId ; [VarId]{0} ;

fun

  grammar : Abstract -> [Concrete] -> Grammar ;
  abs : [CatDef] -> [FunDef] -> Abstract ;
  cnc : [ParamDef] -> [LincatDef] -> [LinDef] -> Concrete ;

  -- abstract category declarations

  simpleCatDef : CatId -> CatDef ;
  complexCatDef : CatId -> [CatId] -> CatDef ;

  -- abstract function declarations

  simpleFunDef : FunId -> SimpleType -> FunDef ;
  complexFunDef : FunId -> ComplexType -> FunDef ;

  simpleType : [CatId] -> CatId -> SimpleType ;
  complexType : [TypeBinding] -> TypeApplication -> ComplexType ;

  nobinding : ComplexType -> TypeBinding ;
  binding : VarId -> ComplexType -> TypeBinding ;

  noapplication : CatId -> TypeApplication ;
  application : CatId -> [ComplexType] -> TypeApplication ;

  -- concrete param declarations

  paramDef : ParamId -> [ParamValue] -> ParamDef ;

  simpleParamValue : ParamId -> ParamValue ;
  complexParamValue : ParamId -> [ParamId] -> ParamValue ;

  -- concrete lincat definitions

  lincatDef : CatId -> LinType -> LincatDef ;

  strType, intType, floatType : LinType ;
  paramType : ParamType -> LinType ;
  tableType : ParamType -> LinType -> LinType ;
  recordType : [RecordRowType] -> LinType ;
  tupleType : [LinType] -> LinType ;

  recordRowType : LabelId -> LinType -> RecordRowType ;

  -- concrete linearisation definitions

  linDef : FunId -> [VarId] -> LinValue -> LinDef ;

  strConstant : String -> LinValue ;
  intConstant : Int -> LinValue ;
  floatConstant : Float -> LinValue ;
  paramConstant : ParamValue -> LinValue ;

  tableValue : [TableRowValue] -> LinValue ;
  recordValue : [RecordRowValue] -> LinValue ;
  tupleValue : [LinValue] -> LinValue ;

  tableRowValue : ParamValue -> LinValue -> TableRowValue ;
  recordRowValue : LabelId -> LinValue -> RecordRowValue ;

  -- identifiers

  catid : String -> CatId ;
  funid : String -> FunId ;
  paramid : String -> ParamId ;
  valueid : String -> ValueId ;
  labelid : String -> LabelId ;
  varid : String -> VarId ;
  anonymous : VarId ;

}
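
To make the shape of these trees concrete, here is a sketch of how a very small grammar might be encoded with the abstract syntax above. The example grammar (one category `S`, one function `hello` linearized as the string "hello") is an assumption chosen for illustration. The `Base…` and `Cons…` constructors are the ones GF generates for list categories such as `[CatDef]{0}`.

```gf
-- Tree of type Grammar for a hypothetical one-category, one-function grammar.
-- Abstract part: cat S ; fun hello : S ;
-- Concrete part: lincat S = Str ; lin hello = "hello" ;
grammar
  (abs
    (ConsCatDef (simpleCatDef (catid "S")) BaseCatDef)
    (ConsFunDef
      (simpleFunDef (funid "hello") (simpleType BaseCatId (catid "S")))
      BaseFunDef))
  (ConsConcrete
    (cnc
      BaseParamDef
      (ConsLincatDef (lincatDef (catid "S") strType) BaseLincatDef)
      (ConsLinDef
        (linDef (funid "hello") BaseVarId (strConstant "hello"))
        BaseLinDef))
    BaseConcrete)
```

An output format would then be a concrete syntax that linearizes trees like this one. The sketch also shows that, as written, the abstract syntax has no place for the names of the abstract and concrete modules.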

heatherleaf avatar Jan 11 '19 15:01 heatherleaf