fa icon indicating copy to clipboard operation
fa copied to clipboard

Refactor all import paths to lower case

Open Garanas opened this issue 2 years ago • 4 comments

Description

This is a non-trivial issue and it likely requires an external program to process the excessive amount of changes that it proposes. But before we dive into that, lets make sure we understand what the issue is.

The file import system is defined in Lua, it can be found here. For ease of reading, here's a copy:

function import(name)
    local modules = __modules

    -- (1) attempt to find the module without lowering the name
    local existing = modules[name]
    if existing then
        return existing
    end

    -- (2) attempt to find the module when lowered the name
    name = StringLower(name)
    existing = modules[name]
    if existing then
        return existing
    end

    -- inform the devs that we're loading this module for the first time
    if informDevOfLoad then
        SPEW("Loading module '", name, "'")
    end
    
    -- (...)
    
end

When we import a file multiple times then we return the same instance of that file. We do this via caching. The caching requires a key, the key used is the path in question. A file should be referenced by only one key.

We've introduced an early exit (1) to check if we can find the module without lowering the key (the name / path). If that doesn't work, we lower it (to guarantee there's one key to a file) and (2) check if we can find the module this time. If we can't, then we're seeing this module for the first time so we'll have to create a new module.

The issue is that lowering a string has a complexity of O(n) where n is the length of the string. Hence, we'd really like to prevent that. As a few examples of possible inputs:

  • 35: /lua/ui/game/layouts/score_mini.lua
  • 26: /lua/terranprojectiles.lua
  • 64: /projectiles/adfshielddisruptor01/adfshielddisruptor01_script.lua

Note that these have all already been lowered by the code - imagine having to do that each time the Shield Disruptor creates a projectile!

Course of action

Every single import statement in the game repository should be lower case. There are 4600 instances over 1401 files, therefore it is an impossible task to do by hand. Luckily, it isn't difficult to automate the task either. To solve this issue you'll need to write a smart regular expression (1) / (2) that takes the input and returns the same characters, but lowered.

As useful tool to help you: https://www.regexpal.com/

Other approaches are welcome too, but this one is easiest to re-apply. Note that any solution should be idempotent.

Test plan

Technically this should have no impact on the game - the same (cached) modules will be loaded. Therefore to test the changes, you should be able to run a replay of the game without the changes, but then with the changes.

Learning goals

Become more comfortable with regular expressions.

Garanas avatar Aug 18 '22 10:08 Garanas

Interestingly, the engine uses the mixed-case filenames. I doubt we want to touch that though.

Hdt80bro avatar Sep 03 '22 22:09 Hdt80bro

Is Thier a reasoning why we want them all to be lower can and not a form of camel case ?

MrRowey avatar Sep 04 '22 03:09 MrRowey

It's slightly faster, since it's the version of the import name that's already cached

Hdt80bro avatar Sep 04 '22 05:09 Hdt80bro

Is Thier a reasoning why we want them all to be lower can and not a form of camel case ?

Yes, it prevents lowering the string entirely.

Interestingly, the engine uses the mixed-case filenames. I doubt we want to touch that though.

Let's not 😄

Garanas avatar Sep 04 '22 14:09 Garanas

@Garanas you mention that an import runs every time a projectile is fired. Why is that? Wouldn't it be better to just import it once? Especially if that import always returns the same thing because the results are cached.

I'm assuming this is so that we can reload files when debugging, but it seems like we should only do that when debugging and in production releases we should just import files once, then we don't even need to run the import function.

What if we found a way to pre process some files for production builds vs debug builds?

For example in C# you can use #IF DEBUG to pick which code gets compiled, this has no performance impact at runtime. It seems like something like this would be really beneficial to not have to make decisions between Lua development experience and runtime performance. I believe we could solve some issues like this with some pre processing.

Last thing, if we were to change the import function so that it doesn't need to lower case the file name, wouldn't this break mods?

hahn-kev avatar Nov 03 '22 05:11 hahn-kev

you mention that an import runs every time a projectile is fired. Why is that?

It appears that the blueprint file is passed into the engine as a string and it calls import on it. Same things happens for effects. I don't think there's a way around that.

What if we found a way to pre process---

NO Lua is already an interpreted scripting language so it doesn't make sense to have a preprocessor. In fact, Lua used to have preprocessing long ago and they removed because preprocessing is awful to deal with--it makes code ugly, hard to reason about, hard to maintain, and - this is the worst part - do entirely different things for different people (no need to undo all of the painstaking work done to get our language to simply do what it says it'll do). In general, introducing more facets people have to worry in development about always makes debugging harder long-term, and using a 40-year-old macro system designed to ease people out of FORTRAN and overcome unspecified language design <-> hardware <-> environment mismatches is one of the worst (we're lucky to not have to deal with any of those). Along the same lines, at one point I actually suggested adding a minifier at deployment, but not only does our deployment system not support anything like that, getting it to a point where it could would not be the effort (well, at least for that).

Deep breath

Okay, with that obligatory rant about the CPP out of the way (hope that didn't come across personally), lets talk about what you're actually suggesting: improving the tools we have at our disposal that improve the development process.

This is a great idea! One of the things I just did actually was to revamp our unit testing capabilities and add a workflow to support it. Adding more tools is highly encouraged. Admittedly, your specific suggestion for a way to replace efficient code to help with debugging seems to be based on a guess about how projectiles were made that didn't end up being right, so I don't immediately see a use for it, but there's probably still several I haven't thought of. It should be easy set up in pure Lua by swapping functions, so we wouldn't need a separate preprocessor for that (again, sorry for fixating on that).

Actually, we currently already do something similar for LazyVar and Layouts with flags embedded in the logic. It's not pretty or organized though (e.g. you have to change the file variable every time you want to use the extended debug features). It probably also has overhead we could avoid with a system like you're suggesting, but at the very least it hasn't seemed to be an issue so far (probably because they're in UI code). Making it all behave in a unified manner is something that's been on my mind, so a new system for the future might be good too.

Last thing, if we were to change the import function so that it doesn't need to lower case the file name, wouldn't this break mods?

Yes, and our code base as well. We can't have that.

Hdt80bro avatar Nov 03 '22 07:11 Hdt80bro

ok that makes sense to me about how the blueprints work. But if that's the case then refactoring all our imports won't change that. We could change all blueprints to make sure the effects are lowercase, but otherwise all the rest of our imports only happen the first time a file is loaded and then never again, meaning that there's essentially no performance benefit at run time? only maybe at load time, and the first time each file is loaded, which for 1,401 files isn't going to be very many times at most.

Yes, I'm more concerned with improving the dev process, and I'd hate to have to choose between dev process and performance, so a way to have both would be great. Maybe we can do that here? I'm not sure.

hahn-kev avatar Nov 03 '22 07:11 hahn-kev

But if that's the case then refactoring all our imports won't change that. We could change all blueprints to make sure the effects are lowercase, but otherwise all the rest of our imports only happen the first time a file is loaded and then never again, meaning that there's essentially no performance benefit at run time?

You're right that it's an extremely small gain, and most of it happens when loading (note that any improvement to loading times is appreciated however). There are still places that use the import function inside of function calls that don't happen while loading a file though--running a quick count, I see about 1000 such uses (out of about 5000 imports in total).

Hdt80bro avatar Nov 03 '22 08:11 Hdt80bro