fortran-src icon indicating copy to clipboard operation
fortran-src copied to clipboard

support for deprecated or proprietary constructs? `record //` and `structure //`.

Open smoothdeveloper opened this issue 9 months ago • 8 comments

I'm trying out the parser, but it doesn't handle some of the idioms in the code I'm dealing with (there will be a stream of this), here is a first example array declaration:

record /myRecordType/ variableName(*)

relying on such a structure definition:

structure /myRecordType/
  integer fieldA
  real fieldB
end structure

I'm interested in learning to extend the parser but wonder how such non-standard extensions would be integrated?

Also, my haskell skills are pretty low, I'm trying to understand how the .x & .y files are processed, and would like to improve the code to have a verbose command line flag, that would trace things as the lexer and parser are processing; which would help figuring out the flow of things.

edit:

  • https://fortranwiki.org/fortran/show/Modernizing+Old+Fortran#structure_record_map_union_statements

smoothdeveloper avatar Feb 15 '25 12:02 smoothdeveloper

You'll want to pass --version=77l which will parse it in legacy mode, there is support for various old DEC constructs it's just gated behind this flag (fortran-src has been used extensively on old proprietary code bases with a lot of these structures). See the lexer code here and the parser code here for structures, the / { legacy77P} in Lexer.x means it's only getting lexed with that flag.

For reference the .x files use alex for lexing, and the .y files use happy for parsing.

RaoulHC avatar Feb 15 '25 12:02 RaoulHC

@RaoulHC, thanks for all the pointers (erm, targets)!

I was just looking in the older lexer and got hopeful seeing both record and structure, giving -v 77l helps but I'm still getting lexing failed, this time it outputs a AlexInput {...} stuff I can actually diagnose a bit better.

I'll read up on alex and happy to gain better understanding.

Now, the codebase I'd like to process is using a mix of idioms used indiscriminately, those constructs are used in free form files.

Is there a sense that we don't want the constructs supported in later lexers, due to added complexity?

Would you recommend me to create my own custom lexer / parser pair, by copying the 2003 and integrating the bits I encounter and are in the 77l grammar?

Do you suggest a particular approach if we'd like the tool to have --verbose flag that outputs information about the processing as it is ongoing? I think it would be useful to the user and the maintainers as well.

smoothdeveloper avatar Feb 15 '25 13:02 smoothdeveloper

Ah yes it doesn't support that, perhaps because it wasn't ever encountered in free form files. The lexing of fixed and free form is very different, so you'd have to add support for those tokens to the free form lexer, and then perhaps you can just add a combination here that parses the 77 syntax tree after a free form lexer pass. There were a lot of newer features added to the old parser (dealt with a lot of fortran 90 and newer constructs being used in fixed form legacy code), so even if you've got a horrific mix of standards it should mainly work.

RaoulHC avatar Feb 15 '25 13:02 RaoulHC

@RaoulHC thanks.

I've started something in #305, note that in the codebase I'm looking at, I also have things like [allocatable](:).

Intel Fortran compiler support a bunch of those qualifiers, which I think in the standard, are supposed to occur before the variable/argument names.

Is it also something you are fine supporting in the lexer?

Any idea, what is the amount of work, once the lexer part is fixed and it compiles, to make sure things remain ok for consumers and to maintain the necessary data to handle those nuances, in the AST?

smoothdeveloper avatar Feb 15 '25 23:02 smoothdeveloper

module mixed_idioms_structure_and_records_declarations
    implicit none
    structure /MyThing/
        integer field1
        real field2
    end structure
contains
    subroutine something ()
        record /MyThing/ myThings[allocatable](:)
        
        integer i, max
        max = 100
        allocate(myThings(max))

        do i = 1, max
            myThings(i).field1 = i
            myThings(i).field2 = sqrt(real(i))
            print *, "mything(", i, ").field1=", myThings(i).field1
            print *, "mything(", i, ").field2=", myThings(i).field2
        end do
    end subroutine
end module

program test
    use mixed_idioms_structure_and_records_declarations
    call something()
end program

FYI, this is a small program that compiles with Intel IFX, with no extra settings.

ifx test.decls.f90 -o test.decls.exe test.decls.exe

mything( 1 ).field1= 1 mything( 1 ).field2= 1.000000 ... mything( 100 ).field1= 100 mything( 100 ).field2= 10.00000

smoothdeveloper avatar Feb 15 '25 23:02 smoothdeveloper

I believe that syntax is not standard, record's themselves are not standard and allocatable only got added in Fortran 90 so the standard has syntax of form integer, allocatable, dimension(:) :: var. Not to say we shouldn't support it, fortran-src supports various bits of non-standard syntax as it has been used in a lot of legacy code bases, just not an Intel one afaik. So I'm happy with adding that to the lexer.

As for your PR I've only had a skim, but I wonder if we want something slightly less granular than for specific constructs. Might make sense to have an IFX/intel flag, that adds supports for all the extra stuff it allows.

I believe because all of the constructs are already supported in the AST (and there's only one AST here for all standards), once the lexer and parser support it, there shouldn't be any other work

RaoulHC avatar Feb 16 '25 12:02 RaoulHC

@RaoulHC thanks for all the feedback, ok I was able to get things to type check again in my branch, the default parsers instantiation is a bit gnarly :)

Understood from your comments:

  • seem to be non standard, but we can add parsing & AST support for it
  • from user perspective, rather than having a bunch of new flags for non standard things (like I started with DecStructure), user can pass a compiler flavour flag, here --compiler-flavour=intel

Maybe it is worth still having the fine grained predicates and internal representation of those options, a particular compiler flavour would initialise the list of internal options; I don't think it makes the lexer / parser more difficult, and it still allows consumers of the library to tailor their parser, but you please let me know whatever is best.

smoothdeveloper avatar Feb 16 '25 20:02 smoothdeveloper

I'm trying to add this in Free/Lexer.x

<0> "structure"    / { legacyDECStructureP }      { toSC scT >> addSpan TStructure }
<0> "endstructure" / { legacyDECStructureP }      { toSC scT >> addSpan TEndStructure }

And in Free/Fortran90.y

  structure                   { TStructure _ }
  endstructure                { TEndStructure _ }
-- ...

-- this is straight copy from the Fortran77.y
MAYBE_NAME :: { Maybe Name }
: '/' NAME '/' { Just $2 }
| {- empty -}  { Nothing }

STRUCTURE_DECLARATIONS :: { [StructureItem A0] }
: STRUCTURE_DECLARATIONS STRUCTURE_DECLARATION_STATEMENT
  { if isNothing $2 then $1 else fromJust $2 : $1 }
| STRUCTURE_DECLARATION_STATEMENT { if isNothing $1 then [] else [fromJust $1] }

STRUCTURE_DECLARATION_STATEMENT :: { Maybe (StructureItem A0) }
: DECLARATION_STATEMENT NEWLINE
  { let StDeclaration () s t attrs decls = $1
    in Just $ StructFields () s t attrs decls }

| structure MAYBE_NAME NAME NEWLINE STRUCTURE_DECLARATIONS endstructure NEWLINE
  { Just $ StructStructure () (getTransSpan $1 $7) $2 $3 (fromReverseList $5)}

I'm not yet able to make sense of the issues in the generated code, which looks like this:

Image

I wanted to ask:

In cabal repl, when I :load app/Main.hs, if I've made changes in the .x or .y, does it do all which is needed or I need to jig something else?

Would you mind providing more guidance about how to edit the parser? I hope after few more efforts to make first progress here, I'll be more autonomous and able to contribute, but it is a bit steep for me still 🙂.

There is something else I'm encountering in the fortran codebase that I'll intend to use the tool with, which is $ IF preprocessor things, should I open a ticket specific to it?

smoothdeveloper avatar Feb 19 '25 04:02 smoothdeveloper

Made some progress in #305; to parse my example, the remaining thing is the [allocatable] attribute.

This syntax doesn't seem to be documented on intel website https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2023-0/allocatable.html.

I think it will be a bit challenging due to how attributes are represented for declarations, so looking for suggestions if there are AST adjustments we would want:

For example:

type(MyType) :: a[allocatable](:), b

would be similar to / pretty printed:

type(MyType), allocatable :: a(:)
type(MyType) :: b

in the AST:

data Statement a  =
    StDeclaration
    -- ^ Declare variable(s) at a given type.
        a SrcSpan
        (TypeSpec a)                -- ^ Type specification
        (Maybe (AList Attribute a)) -- ^ Attributes
        (AList Declarator a)        -- ^ Declarators
   | ...

My current idea would be to generate several StDeclaration if any of the "declarator" has attached attributes, maybe I can attempt to group them by distinct set of attributes.

Conceptually, it would be something like this:

[
  StDeclaration span typeSpec (Just [Allocatable]) [a]
  StDeclaration span typeSpec None [b]
]

For my sample input.

Does that make sense?

Aside, in the pretty printer code, I'm facing some challenge with the indentation / reusing of the existing primitives, there seems to be assumption the dec structure would only be printed with a fixed indentation, but this is secondary concern for now.

smoothdeveloper avatar Mar 07 '25 13:03 smoothdeveloper

Thanks for this @smoothdeveloper - I need to digest a bit what the plan is here but I think it relates to this issue which I have just created which we've come across in a different place https://github.com/camfort/fortran-src/issues/307. Does it sound like solving this issue would do also what you are thinking, or not quite enough?

dorchard avatar Apr 01 '25 15:04 dorchard