Tag-based OO-like code

Making PAWN more OO-like has been discussed over and over again. While I'd personally be against a fully-fledged rewrite à la SourcePawn Transitional Syntax, I think I have a relatively light-weight proposal to introduce a good set of OO features at very little cost in terms of changes (and nothing breaking). Basically, just map this:

native Tag.Function(a, b);
stock Tag.Function(a, b)
{
}

To this

native Tag_Function(Tag:this, a, b);
stock Tag_Function(Tag:this, a, b)
{
}

This follows long-standing SA:MP Module_Function conventions, plus a this convention, and would not introduce new requirements for passing around "objects", as this is still just a cell - what you want to do with that in terms of treating it as a handle to more data is entirely up to the library author (or whoever). It would even remain compatible with old code as calling Tag_Function(handle, 0, 0) would call the correct function even if it was declared with the new Tag.Function() syntax.
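
As a rough illustration of the name mapping only, here is a minimal C sketch (C being what the compiler is written in). The helper name `sketch_mangle` and the constant `SKETCH_NAMEMAX` are made up for this example and are not part of the compiler; the limit of 31 is an assumption taken from the name length limit mentioned later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 31 is the identifier length limit mentioned elsewhere in
 * this thread; the constant name is invented for this sketch. */
#define SKETCH_NAMEMAX 31

/* Hypothetical helper: rewrite "Tag.Function" as "Tag_Function".
 * Returns 0 on success, or -1 if there is not exactly one dot, if the
 * tag or method part is empty, or if the name exceeds SKETCH_NAMEMAX. */
int sketch_mangle(const char *dotted, char *out)
{
    const char *dot;
    size_t len;

    assert(dotted != NULL && out != NULL);
    dot = strchr(dotted, '.');
    len = strlen(dotted);
    if (dot == NULL || dot == dotted || dot[1] == '\0')
        return -1;                     /* no dot, or an empty half */
    if (strchr(dot + 1, '.') != NULL)
        return -1;                     /* more than one dot */
    if (len > SKETCH_NAMEMAX)
        return -1;                     /* mapped name would be too long */
    memcpy(out, dotted, len + 1);      /* copy including the terminator */
    out[dot - dotted] = '_';           /* the dot becomes an underscore */
    return 0;
}
```

The caller supplies a buffer of at least `SKETCH_NAMEMAX + 1` bytes. Adding the `Tag:this` parameter is a separate step that this sketch does not model.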

Some libraries already use a macro to support Library::Method() syntax, which is why I didn't suggest adopting :: for this (plus it would probably be more complex to get working around the existing tag syntax). That could even be an avenue for writing global libraries:

#if __COMPILER_MODIFIED
    #define this,) this)
    #define Object::%0(%1) Object_%0(Object:this,%1)
#else
    #define Object::%0(%1) Object.%0(%1)
#endif

Natives would work the same:

native File.Write(const string[]) = fwrite;

Would compile as:

native File_Write(File:this, const string[]) = fwrite;

Clearly entirely compatible with the existing native.

To call the methods, use a strongly tagged variable (I'd suggest against allowing this on weak tags - far too likely to cause unexpected calls):

new StrongTag:x;

x.Method();

Compiles as (sort of):

new StrongTag:x;

tagof(x)_Method(x);

Or more accurately:

new StrongTag:x;

StrongTag_Method(x);

Obviously that would require the tag of a variable to be known at compile time, and this could not work for run-time polymorphism:

Func({ Tag1, Tag2 }:x)
{
    x.Method();
}

Tag returns (and hopefully chaining) could work, but I know that would be much harder to implement as it would require more manipulation of the AST:

Tag:Tag.M1()
{
    return this;
}

Tag:Tag.M2()
{
    return this;
}

// Call:

new Tag:x;
x.M1().M2();

// Becomes:

Tag_M2(Tag_M1(x));

Or, since there is no reason to keep the same tag:

Tag2:Tag1.M1()
{
    return that;
}

Float:Tag2.M2()
{
    return 1.0;
}

// Call:

new Tag1:x;
x.M1().M2();

// Becomes:

Tag2_M2(Tag1_M1(x));
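
To make the chaining rule concrete, here is a small C sketch of how each link could be resolved against the tag returned by the previous call. Everything in it (`sketch_method`, `sketch_find`, `sketch_chain`, the fixed table) is hypothetical and only models the Tag1/Tag2 example above as string rewriting; a real implementation would work on the compiler's own expression handling instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical method table entry: which tag owns the method and which
 * tag it returns. None of these names exist in the real compiler. */
struct sketch_method {
    const char *tag;     /* tag the method is declared on */
    const char *name;    /* method name after the dot     */
    const char *rettag;  /* tag of the value it returns   */
};

/* The two methods from the example above. */
static const struct sketch_method sketch_methods[] = {
    { "Tag1", "M1", "Tag2"  },
    { "Tag2", "M2", "Float" },
};

/* Look up `name` on `tag`; returns NULL when there is no such method. */
const struct sketch_method *sketch_find(const char *tag, const char *name)
{
    size_t i;
    for (i = 0; i < sizeof sketch_methods / sizeof sketch_methods[0]; i++)
        if (strcmp(sketch_methods[i].tag, tag) == 0 &&
            strcmp(sketch_methods[i].name, name) == 0)
            return &sketch_methods[i];
    return NULL;
}

/* Rewrite var.names[0]().names[1]()... into nested calls.
 * Returns 0 on success, -1 on an unknown method or a too-small buffer. */
int sketch_chain(const char *var, const char *tag,
                 const char *const *names, size_t count,
                 char *out, size_t outsize)
{
    char tmp[256];
    size_t i;
    int n;

    n = snprintf(out, outsize, "%s", var);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    for (i = 0; i < count; i++) {
        const struct sketch_method *m = sketch_find(tag, names[i]);
        if (m == NULL)
            return -1;
        /* Wrap the expression built so far in the next call. */
        n = snprintf(tmp, sizeof tmp, "%s_%s(%s)", m->tag, m->name, out);
        if (n < 0 || (size_t)n >= sizeof tmp || (size_t)n >= outsize)
            return -1;
        memcpy(out, tmp, (size_t)n + 1);
        tag = m->rettag;  /* the next method is resolved on the returned tag */
    }
    return 0;
}
```

Calling `sketch_chain("x", "Tag1", names, 2, ...)` with `names` holding `"M1"` and `"M2"` builds the nested form shown above. Asking for `M2` directly on a `Tag1` variable fails, because the lookup is always done on the tag known at that point in the chain.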

This is not full objects, and is mostly based around tag-based handles, but I think it goes a long way. I wrote something about supporting operators as well, but they already have overloading and wouldn't need the dot-call syntax.

Jan 05 '18 18:01 Y-Less

I really like this proposal! I've seen many "OO" attempts in Pawn and this seems like the most elegant. This reminds me of how Golang and Python handle things (Golang being my preferred option, Python's self as the first argument has always annoyed me).

This was actually along the lines of something I wanted to introduce using the pawn-parser. My only concern is that I'm not sure how I or others feel about introducing rather large changes (even though they are purely additive) to the compiler.

Something I was planning to introduce to sampctl was the idea of "plugins", which would act similarly to webpack-style plugins that "transpile" JavaScript. A common practice in the JS world is to build language syntax extensions and then transpile them back to earlier versions (such as ES5) using build toolchain plugins. The idea with sampctl was to introduce pre- and post-build commands that could be anything from search-and-replace apps to fully featured language transpilation, which would facilitate things like this and the ideas discussed here: https://github.com/Southclaws/ScavengeSurvive/issues/441

Jan 05 '18 19:01 Southclaws

I just realised that this is basically https://en.m.wikipedia.org/wiki/Uniform_Function_Call_Syntax

Aug 30 '18 21:08 Y-Less

#pragma rational Float

#define this. THIS<Object>
#define THIS<%3>%0(%1) %3_%0(%3:THIS_:this,%1)
#define THIS_:this,) this)

native Object_SetPos(Object:this, Float:x, Float:y, Float:z) = SetObjectPos;
native Object:Object_Create(modelid, Float:x, Float:y, Float:z, Float:rx = 0.0, Float:ry = 0.0, Float:rz = 0.0, Float:drawDistance = 0.0) = CreateObject;

main()
{
	new Object:this = Object_Create(1337, 0.0, 0.0, 4.0);
	this.SetPos(5.5, 6.6, 7.7);
}

Oct 19 '18 11:10 Y-Less

Basically, just map this:

native Tag.Function(a, b);
stock Tag.Function(a, b)
{
}

To this

native Tag_Function(Tag:this, a, b);
stock Tag_Function(Tag:this, a, b)
{
}

Is this really necessary? We already have problems with a 31-character name limit when hooking callbacks with long names; with member variables/functions this might become even worse. Aside from a (supposedly) light-weight implementation, are there any other benefits we would have from mapping Tag.Function to Tag_Function?

I think it shouldn't be very hard to make the compiler treat Tag.Function as two distinct identifiers. Here's how I see implementing this:

At the top of struct symbol add a new pointer symbol *tagfuncnext which would be used for organizing a list of functions associated with the tag. So, for example, in Tag.Function the global table (glbtab) would contain only Tag, but not Function, then Tag's tagfuncnext would point at Function.
In function hier1() (file sc3.c), add code for operator . so that it has the same priority as (), [] and {}.
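
A minimal C sketch of the first step, to show the shape of the data structure. Only the field name `tagfuncnext` comes from the description above; `sketch_symbol` is a cut-down stand-in for the real `struct symbol`, and the two helper functions are invented for illustration. The change to hier1() is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the compiler's `struct symbol`. Only
 * `tagfuncnext` is taken from the proposal; the rest is illustrative. */
struct sketch_symbol {
    struct sketch_symbol *tagfuncnext; /* next function attached to the tag */
    const char *name;                  /* "Tag" for the tag, "Function" for a method */
};

/* Attach `func` to the front of `tag`'s list of functions. */
void sketch_add_method(struct sketch_symbol *tag, struct sketch_symbol *func)
{
    assert(tag != NULL && func != NULL);
    func->tagfuncnext = tag->tagfuncnext;
    tag->tagfuncnext = func;
}

/* Find a function attached to `tag` by its short name (the part after
 * the dot). Returns NULL if the tag has no such function. */
struct sketch_symbol *sketch_find_method(const struct sketch_symbol *tag,
                                         const char *name)
{
    struct sketch_symbol *f;

    assert(tag != NULL && name != NULL);
    for (f = tag->tagfuncnext; f != NULL; f = f->tagfuncnext)
        if (strcmp(f->name, name) == 0)
            return f;
    return NULL;
}
```

Because the tag and the function are stored as separate symbols, each name is measured against the length limit on its own rather than as one combined `Tag_Function` identifier.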

This would solve the potential problem with the name length limit, and might open up a better way for function hooking. Also, the mechanism described above can be reused later for struct-like enumerations (#609), so we'll need it anyway.

@Y-Less @YashasSamaga Would it be OK if I start implementing tag-based functions in the way described above? Any objections/suggestions?

May 30 '21 09:05 Daniel-Cortez

I think this should be an opt-in experimental feature controlled by a compiler flag for now. Maybe a few select libraries could provide optional, conditionally compiled OO-like code and give feedback.

I think it shouldn't be very hard to make the compiler treat Tag.Function as two distinct identifiers.

How would native functions work given that they have a tight identifier length limit? One way would be to not allow method overloads to be native functions. Instead, the user will have to use a normal identifier and then create a proxy to invoke that native.

native Tag.Function(); // error

native NormalNativeFunctionName();
stock Tag.Function() = NormalNativeFunctionName;

This would completely decouple native functions from the tag based OO system.

Aside from a (supposedly) light-weight implementation, are there any other benefits we would have from mapping Tag.Function to Tag_Function?

I think another disadvantage of a simple lightweight mapping is error handling. It might be easier to have nicer diagnostics for OO code instead of some direct mapping. For example, "could not find method 'Method' for tag 'SomeTag'" instead of "undefined symbol 'SomeTag_Method'".

There is also an option of mangling the name for internal use like it's done for user defined operators. If we restrict methods to be non-native non-public functions, we can store the function name as "Tag.Method". The length limit might cause a problem though. I always wondered if we could increase the identifier length for local functions (non-native & non-public). The mangled name will allow a version of Tag_Function to exist independently of Tag.Function. I can't think of a use case though.

I think it shouldn't be very hard to make the compiler treat Tag.Function as two distinct identifiers. ...

I think it's easier if we use mangled names with dots and extend the identifier limit for local identifiers (in a separate PR). The name limit extension can be kept for internal use or be made external. We can have a new flag to distinguish methods from normal functions (or simply [ab]use the presence of a dot in the mangled name).

It has been a really long time since I have worked with the compiler or the language itself. I am out of touch and cannot reason clearly about pawn or the compiler atm. Apologies for the same. I hope my comments make sense.

May 30 '21 11:05 YashasSamaga

These are some very good ideas, but regardless of the implementation there's still one major issue with this idea - pre-processor function hooking just doesn't work. The defines would be based on the canonical name (which may or may not contain .), but that name is generated based on the tag name at a later stage in the compilation, after pre-processing. You could base the hook name on just the part after the ., which would work as-is right now, but will have far more collisions if there are two members with the same name:

Player.MySetPos()
{
}

#define SetPos MySetPos

playerid.SetPos(); // Great.

vehicleid.SetPos(); // Incorrectly replaced.

native Tag.Function(); // error

native NormalNativeFunctionName();
stock Tag.Function() = NormalNativeFunctionName;

There's direct support for this in the compiler with native renaming already:

native Tag.Function() = NormalNativeFunctionName;

The length limit might cause a problem though. I always wondered if we could increase the identifier length for local functions (non-native & non-public).

That's something that's been considered a few times, and would be useful. Even natives can have longer names with the internal/external names.

Aside from a (supposedly) light-weight implementation, are there any other benefits we would have from mapping Tag.Function to Tag_Function?

Not really. If the "correct" way is better and not hard, there's no other reason not to do it.

@Y-Less @YashasSamaga Would it be OK if I start implementing tag-based functions in the way described above? Any objections/suggestions?

I have no objections. I think there are still some issues to solve, but a prototype might help with that.

May 30 '21 11:05 Y-Less

These are some very good ideas, but regardless of the implementation there's still one major issue with this idea - pre-processor function hooking just doesn't work. The defines would be based on the canonical name (which may or may not contain .), but that name is generated based on the tag name at a later stage in the compilation, after pre-processing.

If it all boils down to the preprocessor, then maybe we could implement a new hooking mechanism as a language feature? Doing this would kill two birds with one stone: solve the problem with hooks being preprocessor-dependent and have an easy-to-use hooking method, without the need to write all of that #if defined garbage after every hook. @Y-Less WDYT?

Jul 20 '21 11:07 Daniel-Cortez

Well maybe, but I don't think that really addresses the root issue, that's just a patch for one symptom.

Jul 20 '21 19:07 Y-Less

https://wiki.alliedmods.net/SourcePawn_Transitional_Syntax#Methodmaps

Feb 18 '22 17:02 Y-Less

I think I've solved the pre-processor issue: __tagof(). This is basically a pre-processor-time version of tagof(), which returns the tag as a symbol. Thus:

this.Func()

Becomes:

__tagof(this)Func(this)

And from there normal pre-processor rules take over:

TagFunc(this)

And you can redefine this however you like.
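
Roughly, in C terms, the expansion could look like the sketch below. The variable table, `sketch_tagof` and `sketch_expand` are all hypothetical names for illustration; the `playerid`/`Player` pairing is borrowed from the hooking example earlier in the thread. The receiver is looked up once and written once into the parameter list, and an empty argument list does not leave a trailing comma.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table of variables and their compile-time tags. */
struct sketch_var {
    const char *name;
    const char *tag;
};

static const struct sketch_var sketch_vars[] = {
    { "this",     "Tag"    },
    { "playerid", "Player" },
};

/* Stand-in for __tagof(): the tag of `var` as text, or NULL if unknown. */
const char *sketch_tagof(const char *var)
{
    size_t i;
    for (i = 0; i < sizeof sketch_vars / sizeof sketch_vars[0]; i++)
        if (strcmp(sketch_vars[i].name, var) == 0)
            return sketch_vars[i].tag;
    return NULL;
}

/* Expand recv.method(args) into <tag><method>(recv[, args]).
 * `args` is the raw argument text, "" or NULL for no arguments.
 * Returns 0 on success, -1 on an unknown variable or a small buffer. */
int sketch_expand(const char *recv, const char *method, const char *args,
                  char *out, size_t outsize)
{
    const char *tag = sketch_tagof(recv);
    int n;

    if (tag == NULL)
        return -1;
    if (args == NULL || args[0] == '\0')
        n = snprintf(out, outsize, "%s%s(%s)", tag, method, recv);
    else
        n = snprintf(out, outsize, "%s%s(%s, %s)", tag, method, recv, args);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

The output is plain text again, so any `#define` on the resulting name would apply to it in the usual way.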

Apr 03 '22 01:04 Y-Less

We just need to be careful not to parse this (or any other symbol) multiple times in the pre-processor, to avoid possibly incorrect macro replacements. I.e. replacing it once when it is first seen before . and a second time at its new site in the parameters.

This is even more complex with chaining:

this.GetAngle().ToInt();

Becomes:

Float_ToInt(Entity_GetAngle(this));

Apr 03 '22 11:04 Y-Less
