SynFacilSyn
Demo program FacilSyntaxTool.
Hi Tito, I have written a small demo tool to check the capabilities of your nice highlighter units for my configuration files. If you want, you can merge the code from https://github.com/bigopensky/FacilSyntaxTool. The examples and the corresponding XML syntax rules are in the directory data. The tool has been tested under Linux only. At the moment I'm looking for a way to define keywords within blocks, due to the "multi-syntax" nature of the config files (like in org-babel). Best regards
Hi. Excellent. I will include your tool in the source code, with a reference to your GitHub repository.
I wonder if SynFacilSyn will have the capability to highlight a multi-language block schema, where we have different tokens within the several block levels. I have attached a small example that mixes SQL and Pascal pseudo code in different blocks.
Configuration
<?xml version="1.0"?>
<language name="MULTI.LANG.CONFIG" ext="conf" colorBlock="Block">
<symbols> </symbols>
<attribute name="GLOBAL" forecol="#404040" bold="TRUE" backcol="#EEEEEE"/>
<attribute name="COMMENT" forecol="#404004"/>
<attribute name="DESC" forecol="#A04040" bold="TRUE"/>
<attribute name="SCRIPT" forecol="#004000" bold="TRUE"/>
<attribute name="DB.QUERY" forecol="#804040" bold="TRUE"/>
<attribute name="NUMBER" forecol="#800080" bold="TRUE"/>
<attribute name="LITERAL" forecol="#000080" bold="TRUE"/>
<token attribute="GLOBAL">
MAGIC: BEGIN END.MAGIC END
</token>
<token attribute="NUMBER" charsStart="-+0..9" content= "-+0..9.E"/>
<string start="'" end="'"/>
<comment start="#" />
<block name="GLOBAL" start="MAGIC:" end="END.MAGIC" backCol="#EEEEEE">
<comment start="#" />
<token attribute="GLOBAL">
DESCRIPTION: END.DESCRIPTION SCRIPT: END.SCRIPT DB.QUERY: END.DB.QUERY
</token>
<block name="DESCRIPTION" start="DESCRIPTION:" end="END.DESCRIPTION" backCol="#EEEEFF">
<comment mode="*VERBATIM*" />
</block>
<block name="" start="DB.QUERY" end="END.DB.QUERY" backCol="#FFEEEE">
<comment start="--" />
<token name="LITERAL" attribute="LITERAL" start="&quot;" end="&quot;"/>
<token attribute="DB.QUERY">
AS SELECT FROM LEFT RIGHT CROSS JOIN WHERE AND NOT IS NULL LIKE OR
IN GROUP BY ORDER LIMIT NUMERIC TIMESTAMP TIMEZONE WITH
FLOAT8 VARCHAR CHAR UNION
</token>
</block>
<block name="" start="SCRIPT" end="END.SCRIPT" backCol="#EEFFEE">
<comment start="{" end="}"/>
<token attribute="SCRIPT">
PROCEDURE FUNCTION TYPE CONST VAR BEGIN END
WHILE DO TO FOR REPEAT UNTIL RECORD IF THEN ELSE
</token>
</block>
</block>
<example>
MAGIC: TEST.TOOL BEGIN
DESCRIPTION: BEGIN
* NAME
Making fancy stuff on the moon
* SYNOPSIS
Call it like 123
* COPYRIGHT
(c) 2024 Man in the moon
END.DESCRIPTION
DB.QUERY: SHADOW BEGIN
SELECT * FROM SHADOWS;
END.DB.QUERY
DB.QUERY: LIGHTS BEGIN
SELECT * FROM LIGHTS;
END.DB.QUERY
SCRIPT: LIGHT.AND.SHADOW BEGIN
USES DB.QUERIES;
VAR SHADOWS: DB.QUERY;
LIGHTS: DB.QUERY;
DEFAULT: DB.CONNECTION;
BEGIN
DEFAULT := DB.CONNECTION.Create('pg:service=test_db');
SHADOWS = DB.QUERY.Create( 'DEFAULT', 'SHADOW');
LIGHTS = DB.QUERY.Create( 'DEFAULT', 'LIGHT');
WRITELN(SHADOWS.DATA);
WRITELN(LIGHT.DATA);
FREE.ALL([SHADOW, LIGHT, DEFAULT]);
END;
END.SCRIPT
END.MAGIC
</example>
</language>
However, SynFacilSyn has no formal support for multiple languages. It's possible to emulate it using blocks, but that is very limited. Supporting multiple languages was one of my objectives when I created SynFacilSyn. I stopped developing it when I found some issues:
- The SynEdit control is wonderful, but it is somewhat old-fashioned these days (although it has been improved recently).
- The design of folding in SynEdit is rather complex, and it affects the block processing in SynFacilSyn.
- SynFacilSyn was designed to be fast. Supporting multiple languages involves some complex tasks that can be difficult to integrate into the current design.
I think it's possible to include multi-language support in SynFacilSyn, but it's probably a lot of work. Anyway, I will support anyone who wants to improve the highlighter.
Hi, thank you for your fast answer. I'm no expert in how the scanning, assigning and rendering in the TSynEdit environment works. I'm aware from other projects like Emacs (tree-sitter, for example) that highlighting corresponds to parsing the content and assigning aspects of the syntax tree to rendering attributes. What I cannot see is how the syntax tree is associated with the text you can modify in TSynEdit. I guess this is important because we do not want to re-scan the whole thing if we are working within a string or a block and do not change the structure. In Emacs they call that aspect narrowing. Do you know if there is a block rendering/structure "overlaid" on the text in TSynEdit?
BTW: For my config files I have a small hand-crafted parser, and I'm trying to decide how to adjust some block features. I have seen that it is complicated to define a block for:
{The old Algol 68 stuff}
TYPE.TOKEN: BLOCK.NAME BEGIN
END.BLOCK.NAME
by using
<block name="name" start="TYPE.TOKEN:" end="END.BLOCK.NAME" backCol="red" />
but the END tag can easily be changed to END.TYPE.TOKEN, which is known by the <block .../> context.
So the code will become
{The old Algol 68 stuff}
TYPE.TOKEN: BLOCK.NAME BEGIN
END.TYPE.TOKEN
or
{The old Algol 68 stuff}
TYPE.TOKEN: BLOCK.NAME BEGIN
END
> What I cannot see is how the syntax tree is associated with the text you can modify in TSynEdit. I guess this is important because we do not want to re-scan the whole thing if we are working within a string or a block and do not change the structure. In Emacs they call that aspect narrowing. Do you know if there is a block rendering/structure "overlaid" on the text in TSynEdit?
Well, I don't have all the details in mind (it's been a long time since I last looked at highlighters). But AFAIR no syntax tree is created in SynEdit (just a stack for folding marks), and that's a problem, because if you need one, you have to create it yourself. Moreover, SynEdit calls the highlighter in a strict sequence when the text is modified, and that limits the way a highlighter can be implemented. I found several obstacles when trying to fit my highlighter blocks to SynEdit, and I finally decided not to try any further.
> BTW: For my config files I have a small hand-crafted parser, and I'm trying to decide how to adjust some block features. I have seen that it is complicated to define a block for:
Is it not possible to just use "END" as the delimiter? I think it's possible to use several delimiters for blocks, but it's more complicated.
Hi, thank you for your response. Thinking about the matter really helps me to sort out what's there and what can be done.
On the last topic: yes, it is possible to use END. I will try to comment the config parser and bring it into my repository.
The config is organized in blocks, and at least it is working like this:
1. The syntax is line oriented, which makes tokenizing very easy. I have the following lines:
2. It always starts with a MAGIC: line/block (implicit block) and a NAME to determine the type of the config (which parser can be used).
3. A block starts with an opening TOKEN: with an optional NAME and BEGIN; the NAME is given if the registered token belongs to a named block and omitted in an anonymous block. If the TOKEN: fits the definition, the block is registered; otherwise an error is thrown.
4. Scalar (TOKEN: VALUE) or record (TOKEN: VALUE1 VALUE2 VALUE3 ...) related stuff is written on one line (and I have the backslash thing to join lines during low-level reading).
5. I have verbatim blocks for descriptions (markdown) or scripting languages, where the syntax differs from the structure in 1, 2, 3 and 4. They can be nested in blocks but cannot contain blocks.
6. When a block token is opened, the scanner generates the expected block-end syntax (END, END.TOKEN or END.TOKEN.NAME), so END or END.TOKEN is definitely OK.
7. If the scanner reaches something that looks like END.*, the stack is consulted and the TOKEN or NAME behind the dot is checked; the scanner closes the block and pops the entry if the definition matches, or throws an error at the corresponding line.
8. The next line is read according to the block/scalar/record DEFINITION stored on the stack. If the token is not valid: error.
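The stack handling described in the list can be sketched roughly as follows. This is only a minimal illustration in Python, not the actual parser: the names `scan_blocks`, `expected_ends`, `ConfigSyntaxError` and the two regular expressions are assumptions made up for this sketch, and scalar/record lines are simply skipped instead of being checked against a definition.

```python
import re


class ConfigSyntaxError(Exception):
    """Raised with the 1-based line number where the block structure breaks."""


# Hypothetical line shapes, modeled on the examples in this thread.
OPEN_RE = re.compile(r"^([A-Z][A-Z0-9.]*):(?:\s+(\S+))?\s+BEGIN$")
END_RE = re.compile(r"^END(?:\.\S+)?$")


def expected_ends(token, name):
    """Closing forms accepted for a block: END, END.TOKEN, END.TOKEN.NAME."""
    ends = {"END", "END." + token}
    if name:
        ends.add("END." + token + "." + name)
    return ends


def scan_blocks(lines, verbatim=frozenset()):
    """Return (token, name, first_line, last_line) for every closed block.

    Tokens listed in `verbatim` open blocks whose content is not scanned;
    only one of their own expected END forms closes them.
    """
    stack, closed = [], []
    for no, raw in enumerate(lines, start=1):
        line = raw.strip()
        if stack and stack[-1][0] in verbatim:
            # Inside a verbatim block: everything is content until its END.
            if line in stack[-1][3]:
                token, name, start, _ = stack.pop()
                closed.append((token, name, start, no))
            continue
        opened = OPEN_RE.match(line)
        if opened:
            token, name = opened.group(1), opened.group(2)
            stack.append((token, name, no, expected_ends(token, name)))
        elif END_RE.match(line):
            if not stack:
                raise ConfigSyntaxError(f"line {no}: {line} without open block")
            token, name, start, ends = stack.pop()
            if line not in ends:
                raise ConfigSyntaxError(
                    f"line {no}: {line} does not close {token} "
                    f"opened at line {start}")
            closed.append((token, name, start, no))
        # Any other line (scalar, record, comment) is ignored in this sketch.
    if stack:
        token, _, start, _ = stack[-1]
        raise ConfigSyntaxError(f"line {start}: {token} is never closed")
    return closed
```

Because each stack entry carries its own set of accepted END forms, a misspelled closing line is reported together with the block it was supposed to close. A bare END inside a verbatim script block would still close that block, which is one reason the longer END.TOKEN form is safer there.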
The definition is made by a syntax tree for each configuration like this:
function TRstbMgmConfig.DefineSyntax(
const aApp, aMagic,
aVersion: TString): TSyntaxDefinition;
var
lSecRoot: TSection;
lSecCur: TSection;
begin
{ Create the MAGIC config }
Result := TSyntaxDefinition.Create(aApp, aMagic, aVersion);
{ Create the root node }
{ Parent, Name, Occurrence, Type }
lSecRoot:=TSection.Create(NIL, SKEY_RUNTIME, OC_ONE, TP_ROOT);
{ Add a scalar type file with default }
{ Token, Type, Default, Mandatory = TRUE }
lSecRoot.AddValue( PKEY_USER_CFG, TP_FILE, CStrNone);
lSecRoot.AddValue( PKEY_USER_ETC, TP_PATH, CStrNone);
lSecRoot.AddValue(lPKEY_CUR_USER, TP_STRING, CStrNone);
...
{ Add a new node within root }
lSecCur := TSection.Create(lSecRoot, SKEY_COMMON, OC_ONE);
lSecCur.AddValue(PKEY_VERSION, TP_TOKEN, DEF_RSTB_VERSION);
lSecCur.AddValue(PKEY_SYS_ID_ROOT, TP_STRING, CStrNone);
lSecCur.AddValue(PKEY_SYS_ID_CMGN, TP_STRING, CStrNone);
...
{ Define a script region for mark down }
lSecCur := TSection.Create(lSecRoot, SKEY_DESC, OC_ONE, TP_MD_SCRIPT);
{ Define a script region for DB.QUERY with multiple occurrences but at least one }
lSecCur := TSection.Create(lSecRoot, SKEY_DB_QUERY, OC_ONE_MULTI, TP_SQL_SCRIPT);
...
Result.SetRoot(lSecRoot);
end;
and everything is predefined like in a DTD (XML), but attributes are scalars. It is also possible to attach the syntax intentions (keywords, comments etc.) to the type TOKEN (TP_ROOT, TP_STRING, TP_PATH, TP_FILE, TP_MD_SCRIPT ...), which are CONST at the moment, but I can use predefined objects here.
I use the more detailed variant of the END token because, when you mix up the structure, it is easier to find out where you are within the structure (not the line) of the many BEGIN .. END blocks. We usually use a similar technique in programming by putting a comment behind the END keyword, which is then shown together with the offending line.
Conclusion
I guess that with a line-oriented setup it could be easy to follow the folding approach of TSynEdit, and I will dig deeper into this matter. Do you have any advice on documentation beyond the source code itself, especially for the code folding and the rendering queue?
BTW, I use the TOKEN: form at the beginning of the line because it is very grep/regexp friendly and incremental (look-ahead is only needed for END.*), and I don't need an XML parser to screen the matter. Usually I use a BLOCK reference structure (type names on a stack) rather than a deeply nested structure. Scalars coded as ${/BLOCK/NAME} that have been parsed before can be interpolated during the reading process. The :{VAR.NAME} form is intended for runtime variables, used when the config is read by an application.
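To illustrate the two reference forms, here is a rough sketch in Python. It assumes, hypothetically, that the scalars parsed so far are kept in a dictionary keyed by their /BLOCK/NAME path; the names `interpolate` and `REF_RE` and the sample key are made up for this example and are not taken from the real parser.

```python
import re

# Matches ${/BLOCK/NAME}; the :{VAR.NAME} runtime form does not start with "$"
# and is therefore left untouched.
REF_RE = re.compile(r"\$\{(/[^}]+)\}")


def interpolate(value, parsed):
    """Replace every ${/BLOCK/NAME} in value with the scalar parsed earlier."""
    def resolve(match):
        path = match.group(1)
        if path not in parsed:
            # Only scalars that were read before this line can be referenced.
            raise KeyError(path)
        return parsed[path]
    return REF_RE.sub(resolve, value)


# Usage with a hypothetical, already parsed scalar:
parsed = {"/COMMON/VERSION": "1.2"}
line = interpolate("tool-${/COMMON/VERSION}-:{RUN.ID}", parsed)
```

Resolving the ${...} references while reading keeps the process incremental, whereas :{RUN.ID} stays in the value until the application fills it in at runtime.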
> I guess that with a line-oriented setup it could be easy to follow the folding approach of TSynEdit, and I will dig deeper into this matter.
In some way, that's the limitation I found too in the SynEdit folding design.
> Do you have any advice on documentation beyond the source code itself, especially for the code folding and the rendering queue?
Sadly, SynEdit is not well documented. I wrote a small book about that https://github.com/t-edson/SynFacilSyn/blob/1.21/Docs/La%20Biblia%20del%20SynEdit%20-%20Rev7.pdf but it's not detailed on code folding. You can ask on the Lazarus forum and get help from some experts. In fact, there are several questions there related to folding and SynEdit in general, and you can get useful information from some of the posts.
Are you creating your own SynEdit highlighter? Folding must be implemented in the highlighter, as SynFacilSyn does.