mtasa-blue
Implement lua-vec - vectors as native Lua datatype
This PR aims to fix the issue we have with userdata vectors (#321). As discussed on Discord (and maybe even in the issue itself), using Lua tables would be only marginally better, and the performance benefit would probably be lost when converting to and from tables. Note: I've tested this with bytecode as well (compiled all default resources, and they all worked perfectly), so there are no backwards compatibility issues here.
This code (Lua-vec) was "stolen" from here.
I need opinions: although I've already discussed this with @Woovie, it'd be great if others could give their feedback.
I think naming the datatype `vector` would be totally okay (the function table will be named `vector` in lower case, just so we keep Lua's naming convention. We can just alias it to `Vector` in the form of a global variable).
Features in `lua-vec`:
- Nice indexing of vectors, either with indices, `.x`, or even `.xxyz`, `.xx` (like in HLSL)
- Really fast (even compared to a Lua table implementation): from 4x up to 10x faster than Vector4.
- Very memory friendly. Our current Vector implementation is a memory hog ([1]). `Lua-vec` uses no additional memory at all! It's free! [2]

[1]: A Vector4 needs 4 bytes for `lua_newuserdata`, another `sizeof(unsigned long)` + `sizeof(void*)` + `sizeof(CLuaVector4D)` (which is 16 + 4 bytes) in `CIdArray`
[2]: Its value is stored in a `GCObject` union. This is where the userdata is stored as well, so, compared to the userdata implementation, we use no additional memory
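The swizzle-style indexing listed above (`.x`, `.xx`, `.xxyz`) can be sketched roughly as follows. This is only an illustration of the idea, not lua-vec's actual lookup code: `Vec4`, `componentIndex` and `swizzle` are hypothetical names, and it assumes the only valid component letters are `x`, `y`, `z` and `w`.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a native vector value: four float components.
using Vec4 = std::array<float, 4>;

// Map one swizzle character to a component index, or -1 if it is not x/y/z/w.
inline int componentIndex(char c) {
    switch (c) {
        case 'x': return 0;
        case 'y': return 1;
        case 'z': return 2;
        case 'w': return 3;
        default:  return -1;
    }
}

// Resolve a swizzle key such as "x", "xx" or "xxyz" against a source vector.
// On success, `out` holds the selected components in key order.
// Returns false for an empty key, a key longer than four characters,
// or a key containing an unknown letter.
inline bool swizzle(const Vec4& v, const std::string& key, std::vector<float>& out) {
    if (key.empty() || key.size() > 4) return false;
    std::vector<float> picked;
    picked.reserve(key.size());
    for (char c : key) {
        const int i = componentIndex(c);
        if (i < 0) return false;
        picked.push_back(v[static_cast<std::size_t>(i)]);
    }
    out = picked;
    return true;
}
```

With a source vector of `(1, 2, 3, 4)`, the key `xxyz` selects `(1, 1, 2, 3)`, while a key such as `xq` is rejected.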
Todo:
- Make `vec` callable (add `__call` metamethod), right now vectors can only be created with `vec.new`
- Make it possible for `ArgumentParser` and `CScriptArgReader` to read the new type as a `Vector2/3/4D`
- Add support to construct `Vector2/3/4` from `native vector`
- Add support to construct `native vector` from existing `Vector2/3/4` (only way to implement this cleanly is to use `.x` `.y` `.z`)
- @qaisjp suggested that it would be nice to have a separate branch to keep track of our modifications to Lua.
- Implement `LUA_TVEC` in `CLuaArgument` (`WriteToBitStream` and `ReadFromBitStream`)
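For the last Todo item, one possible shape of the write/read pair is sketched below. Everything here is an assumption made for illustration: `NativeVec`, `ByteStream`, `writeVec` and `readVec` are hypothetical names, the wire layout (one count byte followed by raw floats in host byte order) is invented, and MTA's real bit stream class is not modelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a native vector: 2 to 4 float components.
struct NativeVec {
    std::uint8_t count;  // number of components in use (2, 3 or 4)
    float        v[4];
};

// Hypothetical byte buffer; MTA's real bit stream class is not modelled here.
using ByteStream = std::vector<std::uint8_t>;

// Append the component count, then each used component as raw bytes
// (host byte order, which is an assumption of this sketch).
inline void writeVec(ByteStream& out, const NativeVec& vec) {
    out.push_back(vec.count);
    for (std::uint8_t i = 0; i < vec.count; ++i) {
        std::uint8_t raw[sizeof(float)];
        std::memcpy(raw, &vec.v[i], sizeof(float));
        out.insert(out.end(), raw, raw + sizeof(float));
    }
}

// Read a vector back starting at `pos`; advances `pos` on success.
// Returns false (leaving `pos` untouched) on a bad count or truncated input.
inline bool readVec(const ByteStream& in, std::size_t& pos, NativeVec& vec) {
    if (pos >= in.size()) return false;
    const std::uint8_t count = in[pos];
    if (count < 2 || count > 4) return false;
    const std::size_t needed = 1 + count * sizeof(float);
    if (in.size() - pos < needed) return false;
    vec = NativeVec{count, {0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::uint8_t i = 0; i < count; ++i) {
        std::memcpy(&vec.v[i], &in[pos + 1 + i * sizeof(float)], sizeof(float));
    }
    pos += needed;
    return true;
}
```

Writing the component count first lets a single type tag cover 2, 3 and 4 component vectors; whether the real implementation would want that, or a fixed four floats, is an open design choice.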
Tests:
Resource used: vectest.zip
Scene rendering with `smallpt.lua` (from Lua-vec):
Command: `runsmallpttests 250 250 4 false`
Settings: w: 250, h: 250, samples: 4, do_force_gc_collect: false
============[vec]============
Writing to file... took 176 ticks
Tracing took 115081 ticks
Memory used: 3.3 gb
============[stdlua]============
Writing to file... took 291 ticks
Tracing took 371085 ticks
Memory used: <= 3.3 gb
============[Vector4]============
Ran out of memory. Needs around 16 gigs.
Command: `runsmallpttests 250 250 4 true`
Settings: w: 250, h: 250, samples: 4, do_force_gc_collect: true
============[vec]============
Collect garbage total: 19830 ticks
Writing to file... took 165 ticks
Tracing took 103906 ticks
============[Smallpt - Vector4]============
Collect garbage: 25859 ticks
Writing to file: 338 ticks
Tracing: 475007 ticks
============[Smallpt - stdlua]============
Collect garbage: 15044 ticks
Writing to file: 344 ticks
Tracing: 360707 ticks
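
To put the tracing times above in relative terms, the ratios can be computed directly from the reported tick counts. The constants are copied from the two smallpt runs; the names (`kVecNoGc`, `speedup`, and so on) are made up for this sketch.

```cpp
// Tracing times in ticks, copied from the smallpt runs above.
constexpr double kVecNoGc    = 115081.0;  // vec, do_force_gc_collect: false
constexpr double kStdluaNoGc = 371085.0;  // stdlua, do_force_gc_collect: false
constexpr double kVecGc      = 103906.0;  // vec, do_force_gc_collect: true
constexpr double kVector4Gc  = 475007.0;  // Vector4, do_force_gc_collect: true
constexpr double kStdluaGc   = 360707.0;  // stdlua, do_force_gc_collect: true

// How many times longer the baseline took than the native vector run.
constexpr double speedup(double baselineTicks, double vecTicks) {
    return baselineTicks / vecTicks;
}
```

By these numbers, tracing with native vectors was roughly 3.2x faster than stdlua without forced collection, and roughly 4.6x faster than Vector4 and 3.5x faster than stdlua with forced collection.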
Synthetic add/create/mul/sub benchmark:
Iterations: 10000000
============[native-vector - GC: OFF]=======
=> create: 2694 ms
=> sub: 1692 ms
=> mul: 1681 ms
=> eg: 535 ms
=> add: 1716 ms
============[native-vector - GC: ON]========
=> create: 2228 ms
=> sub: 1338 ms
=> mul: 1313 ms
=> eg: 507 ms
=> add: 1327 ms
===========================================
============[Vector2 - GC: OFF]============
=> create: 9966 ms
=> sub: 10291 ms
=> mul: 10085 ms
=> eg: 2138 ms
=> add: 9660 ms
============[Vector2 - GC: ON]============
=> create: 30965 ms
=> sub: 21562 ms
=> mul: 21275 ms
=> eg: 5635 ms
=> add: 17737 ms
===========================================
============[Vector3 - GC: OFF]============
=> create: 22923 ms
=> sub: 11093 ms
=> mul: 12490 ms
=> eg: 2196 ms
=> add: 13755 ms
============[Vector3 - GC: ON]============
=> create: 47758 ms
=> sub: 15084 ms
=> mul: 15346 ms
=> eg: 4736 ms
=> add: 13413 ms
===========================================
============[Vector4 - GC: OFF]============
=> create: 14349 ms
=> sub: 31588 ms
=> mul: 24745 ms
=> eg: 2147 ms
=> add: 12531 ms
============[Vector4 - GC: ON]============
=> create: 50227 ms
=> sub: 16102 ms
=> mul: 15704 ms
=> eg: 4504 ms
=> add: 13207 ms
===========================================
So, what's people's opinion on this?
I'd be really happy to see this in 1.6.
Also, it seems like we could add support for matrices as well without problems. A `GCObject` is 180 bytes, a regular 3x3 matrix is.. 48, which means that the union's size would stay the same.
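The size argument rests on how unions work: a union is as large as its largest member, so adding a smaller member does not grow it. A minimal sketch, using hypothetical stand-in types sized to the 180 and 48 byte figures quoted above (the real `GCObject` members are Lua's internal structs and are not reproduced here):

```cpp
#include <cstddef>

// Hypothetical stand-ins; sizes assume a 4-byte float.
struct LargestMember { float pad[45]; };  // 180 bytes, the GCObject figure quoted above
struct Vec4Payload   { float v[4]; };     // 16 bytes
struct MatrixPayload { float m[12]; };    // 48 bytes, the matrix figure quoted above

union ObjectWithoutMatrix {
    LargestMember big;
    Vec4Payload   vec;
};

union ObjectWithMatrix {
    LargestMember big;
    Vec4Payload   vec;
    MatrixPayload mat;  // smaller than `big`, so it fits in the existing storage
};

static_assert(sizeof(MatrixPayload) == 48, "stand-in matrix should be 48 bytes");
static_assert(sizeof(ObjectWithMatrix) == sizeof(ObjectWithoutMatrix),
              "adding a smaller member must not grow the union");
```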
I like the changes.
> @qaisjp suggested that it would be nice to have a separate branch to keep track of our modifications to Lua.
We can have a submodule for Lua. In this way, it will be easy to track the changes.
So, we should create a separate repo?
> So, we should create a separate repo?
Hmm, that doesn't seem like a good idea. Let's just create a new branch for Lua. I'd do it like this:
- Download Lua 5.1 original source and commit the code to the branch.
- Copy the Lua source from MTA vendor to the new branch.
- And then create a PR for lua-vec changes.
My approach would be:
- Fork Lua repo to multitheftauto org. This keeps original history instead of destroying the Lua 5.1 history, and makes it easier to cherry-pick improvements later.
- Cherry-pick our custom 5.1 mods from mtasa-blue to a branch on the Lua repo
- Replace vendor/lua in mtasa-blue with a submodule that targets that Lua repo
- Create a PR for lua-vec changes in the Lua repo.
This approach is compatible with the submodule system proposed in #319.
AFAIK there is no official Lua repo on GitHub, if that's what you meant by "Fork Lua repo". I don't think the Lua history is necessary. I'll just create a new repo with the 5.1 source code from the Lua website.
Okay I think I did it. Is this what you meant?
> AFAIK there is no official Lua repo on GitHub, if that's what you meant by "Fork Lua repo". I don't think the Lua history is necessary. I'll just create a new repo with the 5.1 source code from the Lua website.
There's an official mirror on GitHub of the Lua repo, here. It has a tag for every released Lua version, all the way back to 5.1. I agree that keeping the history may come in handy in the future for some cherry-picks, for updating Lua more easily, or whatever, but I would also understand not keeping the original commit history if it's deemed too verbose.
Please look at MTA org's /lua repo. I have filtered our commits to vendor/lua + added base Lua at the top (source code for 5.1.5 downloaded from Lua website). Would that fit?
I noticed you call the vector type in Lua just "vec", which is inconsistent with the other types: "number", "table", ... The type should be "vector".
Since Lua uses doubles, shouldn't the vectors use doubles too?
The additional memory required would bring no benefit. MTA uses float vectors everywhere, so the additional precision would be lost. Yeah, the type name will be changed once this PR gets reviewed or something.. Tbh, we didn't really agree on a name yet. I think vector (perhaps with a capital V) is acceptable. Since it's not vanilla Lua, we might as well stick to our naming convention.
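The point about lost precision can be illustrated with a small sketch: a value stored as a double gains nothing if it is narrowed to float as soon as it reaches a float-based vector. `survivesFloat` is a hypothetical helper name, and the size check assumes the usual 4-byte float and 8-byte double.

```cpp
// True when a double survives a round trip through float unchanged.
inline bool survivesFloat(double d) {
    const float narrowed = static_cast<float>(d);
    return static_cast<double>(narrowed) == d;
}

// Four double components take twice the storage of four float components
// (assuming the usual 8-byte double and 4-byte float).
static_assert(sizeof(double[4]) == 2 * sizeof(float[4]),
              "expected double to be twice the size of float");
```

A value like `0.5` is exactly representable in both types and survives, while `0.1` does not, so any extra digits kept on the Lua side would be dropped on the way into a float vector anyway.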
Maybe cVector, where c stands for custom. If someone has a better idea, they should suggest it.
Btw, my knowledge of instruction sets is limited, so I'm only asking whether it's possible: could the vectors somehow utilize something like SSE2 for even higher performance?
> Please look at MTA org's /lua repo. I have filtered our commits to vendor/lua + added base Lua at the top (source code for 5.1.5 downloaded from Lua website). Would that fit?
I'm not sure if you were talking to me, but yeah, it looks good to me. I was just pointing out that the original approach that @qaisjp suggested was possible due to the existence of that mirror.
Anyways, the submodule stuff #2112
The main bottleneck is probably still the Lua side, rather than the actual maths. Not worth the headache. But where I think we might benefit from SSE is our vector stuff.. although I wonder why MSVC doesn't use SSE by default there.. Should be pretty trivial, I believe.
Edit: Turns out there are 2 lua-vec branches: one which makes it a first-class value like a number, bool, etc., and a second one (this one) which treats it as a `GCObject`. Obviously the first one is faster, but comes at the expense of making `TValue` be 16 bytes instead of 8 (which translates to basically 2x memory usage for Lua values) and probably a performance loss, as it would take 4 memory fetches to move the value into the register. The GC'd version is probably already fast enough IMHO, although it uses some hackery (some kind of free list?).
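The trade-off between the two branches can be sketched with two tagged-value layouts. These are hypothetical stand-ins, not Lua's or lua-vec's actual `TValue` definitions, and the exact sizes depend on platform and on how the tag is packed, so the sketch only shows the direction of the change rather than reproducing the figures above.

```cpp
#include <cstddef>

struct GCHeader;  // opaque, hypothetical stand-in for a collectable object

// Boxed variant: the value slot holds a number, a bool, or a pointer to a
// garbage-collected object (which is where the vector would live).
struct BoxedValue {
    union { double n; GCHeader* gc; bool b; } value;
    int tag;
};

// Inline variant: the value slot must also fit four floats directly,
// so every value pays for the larger slot, not just vectors.
struct InlineValue {
    union { double n; GCHeader* gc; bool b; float vec[4]; } value;
    int tag;
};

static_assert(sizeof(InlineValue) > sizeof(BoxedValue),
              "storing four floats inline must enlarge every value slot");
```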
@Pirulax A great amount of time has passed, but do you still want to get this pull request merged?
This is very useful and would be a good update. You don't plan to continue dealing with this PR in the future?