ford: Support for non-ASCII characters

I had trouble running the tool when the documentation included non-ASCII characters such as the degree symbol or the Spanish letter ñ. I checked that the .md and the source files were saved using UTF-8 encoding.

Another little point: if I remember well, in the example in the online user manual I saw "title" instead of "project" as the keyword for the project name.

Aug 19 '15 07:08 gebeele

Are you using non-ascii characters in Fortran source files or FORD markdown files? A number of compilers, including gfortran don't allow non-ascii characters in Fortran sources, so I'm not sure whether or not it's reasonable to expect FORD to accommodate them. Of course I defer to @cmacmackin's judgement and expertise.

Aug 19 '15 13:08 zbeekman

@zbeekman I was under the impression that non-ASCII characters were part of the 2003 standard, and that many compilers had implemented them. This bug seems to arise, in part, due to some sloppy design in the Markdown library. From what I'm reading online, there should be workarounds.

As for the Wiki, are you referring to the block of Markdown near the top of the Project File Options page? That isn't meant to be an example of FORD meta-data. As I say on that page, it is taken directly from Markdown documentation explaining how to format meta-data within Markdown. I've added a few words to try to make that clearer.

Aug 19 '15 13:08 cmacmackin

Fixed the handling of unicode both in the project file and anywhere in the source code (including documentation).

Aug 19 '15 14:08 cmacmackin

Thanks! I was getting the error either when I used special characters in the project file (my surname has the letter ñ):

author: Gustavo Baños ...

or in documentation comments within the Fortran source:

function get_stresses_angle(this, angle)
!! Get stress components (sxx,syy,sxy) at certain angle (º)

I'll try again...

The mention of the "title" keyword was regarding the options page, right. OK, understood: that was an extract of a Markdown example, not necessarily applicable to FORD.

Aug 19 '15 15:08 gebeele

To be clear, I haven't released the corrections yet. There is a major feature I'm looking to implement prior to the next release.



Chris MacMackin Saint Mary's University Curriculum Vitae http://ap.smu.ca/%7Ecmacmack/CV.pdf

Aug 19 '15 15:08 cmacmackin

TL;DR

I'm pretty sure that UTF-8 encoded source files are not yet supported by gfortran, so don't insert raw characters outside of the ASCII range anywhere other than in your comments, where they are safe even though gfortran processes them as multiple (bad) ASCII characters. (If you want non-ASCII literal constants, use -fbackslash.) However, after doing some more research, I see no reason to prevent inclusion of such characters in comments and FORD markup.

Long Version

I'm having trouble finding a more authoritative source at the moment, but it was at least mentioned in the gfortran 4.4 release notes that UTF-8 characters are not allowed in the source code yet. Furthermore, trying to pass -finput-charset=UTF-8 to gfortran 5.1 yields the following warning:

$ gfortran -finput-charset=UTF-8 t.f90
f951: Warning: command line option '-finput-charset=UTF-8' is valid for C/C++/ObjC/ObjC++ but not for Fortran

Additionally, through personal experience with various tests for json-fortran, I found that embedding UTF-8 encoded characters in the source code was unreliable. Sometimes it works because UTF-8 encoding is backwards compatible with ASCII: ASCII characters are written as a single byte, whereas characters outside of the ASCII set are written using multiple bytes. The effect of this is that, sometimes, you can embed UTF-8 encoded characters outside of the ASCII set and have them work as expected in some contexts, but not others.
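To make the byte-length point above concrete, here is a small Python sketch (Python only because FORD itself is a Python tool). The helper name `utf8_length` is hypothetical and purely illustrative; it is not part of FORD or gfortran.

```python
# Illustrative only: how many bytes UTF-8 uses for a few characters.
# Non-ASCII characters are written as escapes so this file stays pure ASCII.

def utf8_length(ch):
    """Return the number of bytes UTF-8 needs to encode one character."""
    return len(ch.encode("utf-8"))

samples = [
    "a",        # plain ASCII letter
    "\u00f1",   # n with tilde, as in the surname mentioned above
    "\u00ba",   # the ordinal/degree-like symbol used in the doc comment
]

for ch in samples:
    # Print the code point, the byte count, and the raw bytes in hex.
    print("U+%04X" % ord(ch), utf8_length(ch), ch.encode("utf-8").hex())
```

A tool that treats each byte as one character would therefore see two characters where the author typed one, which is one plausible reason for the "works in some contexts, but not others" behaviour.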

Now, support for UTF-8 encoded source files is different from support for ISO 10646/Unicode in programs. For gfortran, the safest way, in my experience, to embed character literal constants outside of the ASCII character set is to use the \uXXXX hex representation and then pass gfortran the -fbackslash flag. And, provided you open I/O units with the proper encoding (UTF-8) and use the proper character kind (ISO 10646), a program compiled with gfortran can read and write non-ASCII characters to external and internal I/O units.
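As a rough aid for the \uXXXX approach described above, the following Python sketch rewrites a string so that its non-ASCII characters become backslash escapes. The function name `to_backslash_escapes` is hypothetical, and the sketch assumes the result will be pasted into a character literal of the ISO 10646 kind and compiled with -fbackslash; it has not been checked against any particular gfortran version.

```python
def to_backslash_escapes(text):
    """Replace non-ASCII characters with \\uXXXX / \\UXXXXXXXX escapes.

    A literal backslash is doubled, because with -fbackslash a single
    backslash would start an escape sequence.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")           # keep a real backslash as '\\'
        elif cp < 0x80:
            out.append(ch)               # ASCII passes through unchanged
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)   # 4 hex digits for the basic plane
        else:
            out.append("\\U%08x" % cp)   # 8 hex digits for everything else
    return "".join(out)


# Example: the surname from the project file, written with an escape
# so that this snippet itself stays pure ASCII.
print(to_backslash_escapes("Ba\u00f1os"))
```

The source file that receives the escaped literal then contains only ASCII bytes, which sidesteps the question of whether the compiler accepts UTF-8 input at all.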

At any rate, @gebeele's example only uses non-ASCII characters in FORD markup comments and in the FORD project file, which gfortran should be able to handle without incident. (In UTF-8, every byte belonging to a character outside of the ASCII set starts with a 1 bit, whereas every ASCII byte starts with a 0 bit, so there is no way, with correctly UTF-8 encoded characters, to accidentally insert a byte that would look like a newline or anything else that would break gfortran's comment parsing.)
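The high-bit argument above can be checked directly. This is a minimal Python sketch with hypothetical helper names; it only inspects the UTF-8 bytes of a sample comment line and says nothing about how gfortran actually parses comments.

```python
def multibyte_bytes_all_high(text):
    """True if every byte of every non-ASCII character has its top bit set."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True


def newline_bytes(text):
    """Count bytes equal to the ASCII newline in the UTF-8 encoding of text."""
    return text.encode("utf-8").count(b"\n")


# A doc comment like the one in the report; \u00ba is the symbol in "(º)".
comment = "!! Get stress components at certain angle (\u00ba)\n"

print(multibyte_bytes_all_high(comment))
# The only newline byte is the real line ending typed by the author.
print(newline_bytes(comment) == comment.count("\n"))
```

Because no byte of a multi-byte sequence falls in the ASCII range, a byte-oriented scanner looking for the end of a comment line cannot be fooled by a correctly encoded non-ASCII character.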

Aug 19 '15 15:08 zbeekman

Although this issue was closed a year ago, I am still facing the same trouble as @gebeele, so I wonder whether the solution has finally been included in any release of FORD, or whether I am doing something wrong. I have some Spanish non-ASCII characters in the project file and also in the comments (ñ, for example, is part of my surname). All files are UTF-8 encoded, so what can I do to get the HTML documentation displayed correctly?

Sep 26 '16 07:09 anuf

Hi @anuf. I tried FORD again a few months ago and had the same problem. This, and the lack of an option for not showing the source code in the HTML, pushed me to explore other ways to document my Fortran. Coming back to the topic, I had what could be a similar issue with some Python code of mine using matplotlib and UTF-8 text files as the data source, which I managed to fix using the io library (io.open(pathtofile, encoding='utf-8')). But this might not be the issue with FORD.
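For readers unfamiliar with the io.open fix mentioned above, here is a self-contained Python sketch of the general effect. It writes a UTF-8 file and reads it back twice, once with the right encoding and once with a deliberately wrong one. The function name `roundtrip` is hypothetical, and, as noted above, this may or may not be what went wrong inside FORD.

```python
import io
import os
import tempfile


def roundtrip(text, read_encoding):
    """Write text as UTF-8 to a temp file, then read it back with read_encoding."""
    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        with io.open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        with io.open(path, "r", encoding=read_encoding) as handle:
            return handle.read()
    finally:
        os.remove(path)


line = "author: Gustavo Ba\u00f1os"   # \u00f1 is the letter in the surname

# Reading with the matching encoding returns the original text.
print(roundtrip(line, "utf-8") == line)

# Reading the same bytes as Latin-1 turns the one letter into two
# unrelated characters, the typical garbled output of an encoding mismatch.
print(len(roundtrip(line, "latin-1")) == len(line) + 1)
```

Passing the encoding explicitly avoids depending on the platform default, which differs between systems.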

Sep 30 '16 17:09 gebeele

@gebeele I am working on a patch to disable all sources.

@anuf if you post a minimal working example that raises the error, I can try to see what happens (not sure when I can... but I'll try). Encoding in Python is an art that I have not completely understood...

Sep 30 '16 19:09 szaghi

@szaghi Example of trouble: In my ford_project_file.md I have:

screenshot_ford07

and in HTML I get:

screenshot_ord06

Moreover, in a Fortran comment I have:

screenshot_ford03

And in the HTML documentation generated by FORD I get:

screenshot_ford01

These are the issues I am facing. Thanks @gebeele, @szaghi for help.

Oct 03 '16 06:10 anuf

@anuf OK, I'll try to take a look at it soon.

Oct 03 '16 07:10 szaghi

Hi. During these vacations I had a look at the fortran-lang and stdlib initiatives and saw that FORD was being used for the API documentation... I have just tried to update the FORD package in my Python distribution and rerun the example I tried some years ago... and I obtained similar results:

Non-ASCII characters (in doc-comments in Fortran and in metadata in the FORD md file, like the letter "ñ" or the degree symbol "º") are not properly shown in the output HTML.
The Fortran source is not hidden ("source: false" not working?).

I attach the files I have just used... Did I do something wrong? (*) Many thanks in advance.

ford_example.zip

(*) The link to FORD documentation https://github.com/cmacmackin/ford/wiki is apparently empty of contents now


Jan 05 '21 12:01 gebeele

Was that problem solved? I have the same problem with non-ascii characters, too.

Sep 30 '21 14:09 tuncaen

@tuncaen I think this might still be a problem. I've reopened the issue so we can track it. At the very least, it needs some tests adding to check that it does work.

Sep 30 '21 14:09 ZedThree

We do now explicitly open everything as utf-8 by default, so this has been resolved

Mar 24 '23 16:03 ZedThree
