Modules RFC
This is a complex feature that requires joint support from both the CLEO runtime and the compilers. See also the Functions RFC.
Idea
Be able to export SCM functions from CLEO scripts and import them in other scripts.
Goals
- Have modular code: library code can be updated independently from the main code
- Third-party utils can be written in edit modes different from the main script, or even in different languages
- DRY (Don't repeat yourself)
- Improve development velocity
- Help with migrating from legacy modes to the new ones (custom->SBL/SRC)
Design
Export
CLEO scripts could export pure SCM functions. In the sense of this document, a pure function is one that depends only on its inputs. Functions are exported using the export keyword:
:fun1
...
:fun2
...
:fun3
...
export @fun1
export @fun2
export @fun3
The export line can be anywhere in the script. The label must mark the start of an SCM function. Duplicate entries are allowed:
export @fun1
export @fun1
By default, an exported function is named after its label (e.g. fun1, fun2). This name is important because another script uses it for the import. To give the export a custom name, the as keyword can be used:
export @fun1 as matrix_mult
It should not be possible to name different functions with the same export name:
export @fun1 as matrix_mult
export @fun2 as matrix_mult
If @fun1 and @fun2 mark different locations, this is a hard compilation error, because the import of 'matrix_mult' would otherwise be ambiguous.
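The export naming rules above (default name from the label, optional alias, duplicates allowed only when they point to the same location) can be sketched in Python. This is a rough model, not the compiler's code: build_export_names and label_offsets are hypothetical names, and comparing names case-insensitively is an assumption based on the case-insensitive lookup described for the runtime.

```python
def build_export_names(exports, label_offsets):
    """Map each export name to a label offset, rejecting ambiguous names.

    exports: list of (label, alias_or_None) pairs, one per export line.
    label_offsets: hypothetical map from label name to its offset.
    """
    table = {}
    for label, alias in exports:
        # Assumption: names are compared case-insensitively.
        name = (alias or label).lower()
        offset = label_offsets[label]
        if name in table and table[name] != offset:
            # Same export name, different location: hard error.
            raise ValueError(f"export name '{name}' is ambiguous")
        table[name] = offset
    return table
```

Exporting the same label twice passes, while two different labels exported under the same name raise an error.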
Import
A script can import functions using the import keyword:
import fun1 from "scripts.s"
cleo_call fun1 3 1 2 3
This is syntactic sugar for:
cleo_call "fun1@scripts.s" 3 1 2 3
@ separates the name of the invoked function from the file name.
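A small Python sketch of how the function name and the file name could be combined into one cleo_call argument and split apart again. The helper names are hypothetical. The sketch assumes that function names never contain @ and that the 255 limit from "Possible limitations" applies to the combined string.

```python
MAX_IMPORT_ARG = 255  # limit quoted under "Possible limitations"


def make_call_target(func_name, module_path):
    """Compiler side: build the string argument for cleo_call."""
    target = f"{func_name}@{module_path}"
    if len(target) > MAX_IMPORT_ARG:
        raise ValueError("import argument is too long")
    return target


def split_call_target(target):
    """Runtime side: split on the first @ into (function, file)."""
    name, sep, path = target.partition("@")
    if not sep or not name or not path:
        raise ValueError("expected 'function@file'")
    return name, path
```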
A label used in an import statement cannot be used in the script as a regular label, and vice versa; otherwise the call destination is ambiguous.
Note that this is not relevant if the function syntax is implemented (see below).
import fun1 from "scripts.s"
cleo_call fun1 3 1 2 3
:fun1 // error because :fun1 is an external label
:fun1
...
import fun1 from "scripts.s" // error because :fun1 is a local label
cleo_call fun1 3 1 2 3
Import can also use the as keyword to avoid name collisions:
import @fun1 as matrix_mult from "scripts.s"
Duplicate imports are allowed:
import fun1 from "scripts.s"
import fun1 from "scripts.s"
Duplicate aliases are not allowed:
import fun1 as fun1 from "scripts.s"
import fun2 as fun1 from "scripts.s" // error
Import names are case-insensitive: import FUN1 and import fun1 are equivalent.
An import statement can be used anywhere before cleo_call, preferably at the top of the script.
The path separator can be either \\ or /. The runtime normalizes them.
import fun1 from "folder\\scripts.s"
import fun1 from "folder/scripts.s"
Because a single \ serves as an escape character, a double \\ is required.
CLEO Library support
Changes have to be made to the cleo_call and cleo_return commands.
If the first argument to cleo_call is a string, CLEO treats it as an import path (NEW).
If a string is given, CLEO resolves the path similarly to 0A92 and 0A94. See the post below for the path resolution algorithm.
Once the path is resolved and the file is found, it gets loaded into the game process and a pointer to it is obtained (P). If the file is already loaded, P is returned immediately.
Then cleo_call saves all current lvars, the gosub stack and the base IP (NEW).
Then the base IP of the current script is set to P (NEW).
Then, as with any SCM function, local variables are reset and input arguments are passed.
Then CLEO finds the export table. The runtime uses the jump offset to find the section with id 01 (see Future extensions) and scans the memory after it. Each "row" is a pair of a null-terminated string and a 32-bit offset. When the requested function name matches an exported name, the current IP of the script is set to the found offset. The search is case-insensitive.
Execution continues inside the loaded file; all offsets work relative to P.
When the script encounters a cleo_return, results are stored in the shared ScriptParams.
Then the base IP is restored alongside other things (NEW).
Then the stored values get copied into the host script variables.
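The save and restore steps marked NEW can be modelled roughly as follows. This Python sketch is illustrative only and does not mirror the CLEO source: the Script class, module_call, module_return and NUM_LVARS are hypothetical, and starting the callee with an empty gosub stack is an assumption.

```python
NUM_LVARS = 32  # assumption: size of the local variable block


class Script:
    def __init__(self):
        self.base_ip = 0
        self.ip = 0
        self.lvars = [0] * NUM_LVARS
        self.gosub_stack = []
        self.call_frames = []  # saved caller state, one entry per call


def module_call(script, module_base, func_offset, args):
    # Save lvars, gosub stack, base IP and return position of the caller.
    script.call_frames.append(
        (script.base_ip, script.ip, script.lvars, script.gosub_stack)
    )
    # Switch the base IP to the loaded module (P) and jump to the export.
    script.base_ip = module_base
    script.ip = func_offset
    # Reset local variables and pass the input arguments.
    script.lvars = list(args) + [0] * (NUM_LVARS - len(args))
    script.gosub_stack = []  # assumption: callee starts with an empty stack


def module_return(script, results):
    shared_params = list(results)  # stands in for the shared ScriptParams
    frame = script.call_frames.pop()
    script.base_ip, script.ip, script.lvars, script.gosub_stack = frame
    return shared_params  # the caller copies these into its own variables
```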
Compiler support
Exported functions should be discoverable by the runtime. When fun@scripts.s gets loaded, there must be a way to find the offset of 'fun' inside scripts.s.
To achieve this, the compiler constructs an export table. It is located at the start of the script, where it is trivial to find, and is bypassed during normal execution with a jump instruction. This technique is similar to what a main.scm header uses.
This source code:
:fun
...
:fun2
...
export @fun
export @fun2
is transformed by the compiler into:
0002: @after_table
hex
"fun" 00 @fun 00 00
"fun2" 00 @fun2 00 00
end
:after_table
:fun
...
:fun2
See "Compatibility with Function syntax" below for the structure of the export table. The last two 00 bytes are reserved for input and output.
As with the main.scm header, the table is constructed after a full pass over the code, so the compiler knows all functions that need to be exported.
Disassembler support
- No extra work is planned. The export table will appear as a regular jump and a hex..end block (extra info is needed). A stretch goal would be to reconstruct the export table in its original form.
Possible limitations
- The length of the import argument is limited to 255. Very long file names or function names will result in a compilation error.
- There is no limit on the number of functions exported from a single file
Compatibility with future extensions
The header extension in CLEO scripts could be used to store more than just an export table, so there needs to be a clear distinction between different sections. It could be a service byte after the jump instruction:
02 00 01 xx xx xx xx 01
where the last 01 is the marker for an export table.
The runtime logic should then jump +8, not +7.
Custom Headers proposal
Stretch goal: https://gist.github.com/x87/5d0bd6bdd0062380628eb35103894e1b
IDE Support (stretch goal)
The IDE should scan the export table and offer a list of available functions for autocomplete, and also display the function signature if function syntax is implemented.
Compatibility with Function syntax
https://github.com/sannybuilder/dev/issues/263
Export
export function fun1(x: int, y: int, z: int)
//
end
export function fun2(): int
//
end
Import Opaque Function
import fun1, fun2 from "scripts.s"
/// implicitly declares functions
/// function fun1(...): ...
/// function fun2(...): ...
fun1(1, 2, 3) // transforms into a cleo_call 3 1 2 3
int x = fun2() // transforms into a cleo_call 0 x
Note that combining imports in one statement (import fun1, fun2) is currently not supported. Each import has to be on its own line.
Import with Function Declaration (not supported)
import function fun1(a: int, b: int, c: int) from "scripts.s"
/// explicitly declares a function
/// function fun1(a: int, b: int, c: int)
fun1(1, 2, 3) // transforms into a cleo_call 3 1 2 3
Export table
The export table should store function signatures (the number of input and output params and their types). Types are encoded as a single byte using the Sanny Builder types (in decimal):
- 01 - int
- 02 - float
- 03 - string, short string
- 04 - longstring
- 20+ - class ids
Each line in the export table contains:
function name, 00, offset, N inputs, input 1 type, input 2 type, ... input N type, N outputs, output 1 type, output 2 type, ... output N type, flags (1 byte), address (4 bytes)
0002: @after_table
hex
"fun1" 00 @fun1 03 01 01 01 00 00 00000000 // 3 input args: i i i, 0 output args, 00 flags, 00000000 - address
"fun2" 00 @fun2 00 01 01 02 AABBCCDD // 0 input args, 1 output arg 02 flags, AABBCCDD - address
end
:after_table
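To make the row layout concrete, here is a hedged Python sketch that writes and reads a header in this format. It assumes the 8-byte prefix from "Compatibility with future extensions" (02 00 01, a 4-byte jump offset, then the 01 section marker), little-endian integers, and that the absolute value of the jump offset marks the end of the table. These are assumptions for illustration, and build_header and parse_header are hypothetical names.

```python
import struct

EXPORT_SECTION = 0x01  # marker byte for an export table


def build_header(functions):
    """Compiler side: encode the export rows behind a jump instruction."""
    rows = b""
    for f in functions:
        rows += f["name"].encode("ascii") + b"\x00"
        rows += struct.pack("<i", f["offset"])
        rows += bytes([len(f["inputs"]), *f["inputs"]])
        rows += bytes([len(f["outputs"]), *f["outputs"]])
        rows += bytes([f["flags"]]) + struct.pack("<I", f["address"])
    end = 8 + len(rows)
    # 02 00 = opcode 0002, 01 = 32-bit argument, then the jump offset.
    # Assumption: local jump targets are stored as negative offsets.
    prefix = b"\x02\x00\x01" + struct.pack("<i", -end)
    return prefix + bytes([EXPORT_SECTION]) + rows


def parse_header(data):
    """Runtime side: return {lowercased name: row} or {} if no table."""
    if len(data) < 8 or data[:3] != b"\x02\x00\x01":
        return {}
    if data[7] != EXPORT_SECTION:
        return {}
    end = abs(struct.unpack_from("<i", data, 3)[0])
    pos, table = 8, {}
    while pos < end:
        nul = data.index(b"\x00", pos)
        name = data[pos:nul].decode("ascii").lower()
        pos = nul + 1
        (offset,) = struct.unpack_from("<i", data, pos)
        pos += 4
        n_in = data[pos]
        inputs = list(data[pos + 1:pos + 1 + n_in])
        pos += 1 + n_in
        n_out = data[pos]
        outputs = list(data[pos + 1:pos + 1 + n_out])
        pos += 1 + n_out
        flags = data[pos]
        (address,) = struct.unpack_from("<I", data, pos + 1)
        pos += 5  # flags byte plus 4 address bytes
        table[name] = {"offset": offset, "inputs": inputs,
                       "outputs": outputs, "flags": flags,
                       "address": address}
    return table
```

A runtime that does not care about flags and address still has to advance past those 5 bytes to reach the next row, which is what the final pos += 5 models.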
Reserve space for possible extensions?
Imports should perhaps also support the 'as' feature. It would solve the problem of name collisions between multiple modules and local code, and would also make it possible to imitate namespaces by giving imported functions a prefix.
I do not like the fact that addressing a module by name depends on the current working directory. As discussed before regarding other topics, the current working directory is a global property shared between scripts.
If I use import @fun1 from "scripts.s" then I expect @fun1 to always lead to the imported module. Currently the intention is to encode the module and export name as a string param for cleo_call.
This will lead to a problem where calling:
0A99: set_current_directory 0
fun1()
0A99: set_current_directory 1
fun1()
will fail in one case, or run a different module if it happens to exist.
I was thinking about it, and a solution might be to include the directory in the path itself, like: "0:\cleo\script.s" "1:\MPACK6" "2:\script.s" // cleo dir?
This would also solve problems in other opcodes receiving file paths, if supported everywhere.
0A99: set_current_directory 0
fun1()
0A99: set_current_directory 1
fun1()
In the compiled code this will look like:
0A99: set_current_directory 0
cleo_call "fun1@scripts.s"
0A99: set_current_directory 1
cleo_call "fun1@scripts.s"
Path resolution only happens during the first call; CLEO then remembers that "scripts.s" is associated with, for example, "D:\Games\SA\CLEO\scripts.s". The second call does not trigger a new path resolution, and CLEO uses the already loaded module. The second 0A99 plays no role there.
So, you can not have two modules named "utils.s" in different locations?
I think it's more of a runtime problem than a compiler one. The script contains only the file name as given in the import statement.
import X from Y // this Y goes as is into all X calls -> cleo_call "X@Y"
What is your proposal on how to resolve Y?
I gave a solution for the runtime. The problem is that both the compiler and the running script need to locate the same file, but of course the game environment looks different from the development one.
What are the solutions? Force the module's target location and store it in the module itself, so the compiler can read it? Create some kind of unique GUID for modules? That is hard to do, as these should be unique but still stay the same after altering the module's code.
Or something like this in Sanny:
{$MODULE mod_scripts="include\scripts.s", "0:\cleo\scripts.s"}
import @fun1 from mod_scripts
import @fun2 from mod_scripts
We can't enforce a module path, as it limits the usage. You should be able to use a module from wherever your script is located.
If there is a file called utils.s, you can copy it to the CLEO folder and import it in any CS file using
import X from 'utils.s'
or copy it to Documents\GTA San Andreas User Files\MPACK5 and, using the same statement, import the module's functions in scr.scm.
Also, modules can import other modules, which means those files should be located in the same place.
With that being said, path resolution could work like this:
- if the path is absolute, it gets resolved as is.
import X from "D:\Games\SA\utils.s"
This probably should be forbidden.
- if the path is relative, it gets resolved relative to the current script's file (regardless of cwd).
Path Resolution
When we encounter a 0AB1 with a string argument (module call), we need to determine the script directory. We can't rely on cwd, as it could change at runtime.
- If the current script is not custom
  - if this is not a mission pack
    - the directory is "game\data\scripts"
  - if this is a mission pack
    - the directory is "Documents\GTA San Andreas User Files\MPACKx", where x is the mission pack id
- If the current script is custom
  - the directory is the script's directory (see below for the meaning)
For custom scripts, the directory is stored on the CCustomScript struct and is changed in two cases:
- on initial load, based on szFileName
- during 0AB1, set to the module directory.
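The decision tree above can be written down as a short Python sketch. The function and parameter names are hypothetical, and the directory strings are copied from the list as is (relative, not expanded to real locations on disk).

```python
def script_directory(is_custom, mission_pack_id=None, base_dir=None):
    """Pick the directory that relative module paths are resolved against."""
    if is_custom:
        # Custom scripts carry their own directory (baseDir).
        return base_dir
    if mission_pack_id is None:
        # Regular main.scm script.
        return r"game\data\scripts"
    # Mission pack: the pack id is appended to the folder name.
    return rf"Documents\GTA San Andreas User Files\MPACK{mission_pack_id}"
```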
To illustrate, consider this example.
File my.cs is located in D:\Games\SA\CLEO. When CLEO loads this file, the CCustomScript constructor sets the script's baseDir to D:\Games\SA\CLEO.
Then there is a command cleo_call "fun1@utils\a\u.s".
CLEO loads the module utils\a\u.s relative to the baseDir, from the file D:\Games\SA\CLEO\utils\a\u.s.
Then CLEO stores the current baseDir on the new ScmFunction struct.
Then CLEO sets the current script's baseDir to D:\Games\SA\CLEO\utils\a.
Then it runs the function.
Imagine there is another import inside the u.s file, e.g. import from "extra\f.s". This import gets resolved relative to the current baseDir, which is D:\Games\SA\CLEO\utils\a.
CLEO loads the new module from the file D:\Games\SA\CLEO\utils\a\extra\f.s.
Then CLEO stores the current baseDir on the new ScmFunction struct.
Then CLEO sets the current script's baseDir to D:\Games\SA\CLEO\utils\a\extra.
Then it runs the function.
The inner function returns and the baseDir is restored to D:\Games\SA\CLEO\utils\a.
The outer function returns and the baseDir is restored to D:\Games\SA\CLEO.
We are now back in my.cs and the code flow continues.
For SCM files (main.scm or mission packs) the algorithm is the same.
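The baseDir bookkeeping in this walkthrough behaves like a stack. A minimal Python sketch, using the standard ntpath module so that Windows paths are handled the same way on any OS. The class and method names are hypothetical, and the saved_dirs list stands in for the value stored on each ScmFunction struct.

```python
import ntpath


class ScriptDirs:
    def __init__(self, script_file):
        # On initial load, baseDir is derived from the script's file name.
        self.base_dir = ntpath.dirname(script_file)
        self.saved_dirs = []  # one saved baseDir per active module call

    def call(self, target):
        """Handle cleo_call "name@module": resolve, save, switch baseDir."""
        name, _, module = target.partition("@")
        module_file = ntpath.normpath(ntpath.join(self.base_dir, module))
        self.saved_dirs.append(self.base_dir)
        self.base_dir = ntpath.dirname(module_file)
        return name, module_file

    def ret(self):
        """Handle cleo_return: restore the caller's baseDir."""
        self.base_dir = self.saved_dirs.pop()


dirs = ScriptDirs(r"D:\Games\SA\CLEO\my.cs")
dirs.call(r"fun1@utils\a\u.s")   # baseDir becomes ...\CLEO\utils\a
dirs.ret()                       # baseDir is ...\CLEO again
```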
Path Normalization
Because paths can use both \ and / as separators, the runtime replaces all / with \.
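In Python terms the normalization amounts to a single replacement. The function name is hypothetical, and the runtime itself is not written in Python.

```python
def normalize_path(path):
    """Replace every forward slash with a backslash."""
    return path.replace("/", "\\")
```

With this, "folder/scripts.s" and "folder\scripts.s" end up as the same string, so the runtime can treat them as the same module.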
CLEO Implementation https://github.com/cleolibrary/CLEO4/pull/101
@MiranDMC We need an update in CLEO5. The export table is now 5 bytes longer for each function: I added a flags byte and 4 bytes for the address (used for static foreign functions). None of those are relevant to CLEO, so you just need to skip the extra 5 bytes.
https://github.com/cleolibrary/CLEO5/pull/121
Added Sanny Builder documentation here https://docs.sannybuilder.com/language/import-export
Initial implementation is released in 4.0.0
See https://github.com/sannybuilder/dev/issues/264#issuecomment-1719602868 for outstanding improvements