premake-core Performance issues on large projects

What seems to be the problem? Premake currently struggles on large solutions. Our current solution features 230 projects and 5 platforms, with 4 configurations each. Project file generation takes 2-3 minutes. That isn't extraordinarily bad, since you generally don't regenerate as you iterate, but it still takes up 30-40% of our incremental build times on our build machines.

What did you expect to happen? For it to perform better.

What have you tried so far? An early win was findProject, which is very slow. We optimised it by adding an already-lowercased form and a hashed form of the name to the prj at creation time, then comparing against those. This way a string comparison only occurs if the hash comparison has already succeeded, and it doesn't have to lowercase the string constantly. In the profile information at the bottom, that shows up as components/findProjectOptimization.lua. This reduced the findProject time from 19.3 seconds to 6.8 seconds.
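A minimal sketch of the idea described above, not the actual override: the field names nameLower and nameHash, the helper hashName, and the flat projects list are all hypothetical and chosen for illustration only. The lookup lowercases and hashes the query once, then compares numbers first and strings only on a hash match.

```lua
-- Hypothetical helper: a simple djb2-style string hash kept within 32 bits.
local function hashName(s)
    local h = 5381
    for i = 1, #s do
        h = (h * 33 + s:byte(i)) % 4294967296
    end
    return h
end

-- Hypothetical project constructor: store the lowercased name and its hash
-- once, at creation time, instead of recomputing them on every lookup.
local function newProject(name)
    local lowered = name:lower()
    return { name = name, nameLower = lowered, nameHash = hashName(lowered) }
end

-- Case-insensitive lookup: the query is lowercased and hashed once, and the
-- string comparison only runs when the hashes already match.
local function findProject(projects, name)
    local lowered = name:lower()
    local hash = hashName(lowered)
    for _, prj in ipairs(projects) do
        if prj.nameHash == hash and prj.nameLower == lowered then
            return prj
        end
    end
    return nil
end

local projects = { newProject("Core"), newProject("Renderer") }
local found = findProject(projects, "renderer")
print(found and found.name)
```

A table keyed by the lowercased name would avoid the linear scan entirely; whether that fits depends on how the projects are stored, which this sketch does not assume.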

How can we reproduce this? Our project is under NDA, so we cannot share the specifics. The easiest reproduction would be to download a ton of open source libraries and add a number of pseudo platforms.

What version of Premake are you using? Version at commit ad89dfcb0d334e0b898214aa452f18eae961ab10

Anything else we should know?

Here is a portion of the pepperfish_profiler results. I've removed everything that takes less than 1 second. As you can see, a significant amount of time is spent in configset. All our custom overrides are in OurStuff/components/**. https://pastebin.com/jDsGsdw2

Two big culprits are _fetchMerged and <function: 000001F215E59230>(?) in configset, plus compile, which alone took 60 seconds in self.

(ETA: I believe 000001F215E59230 is the local function remove in _fetchMerged)

Mar 27 '20 15:03 ClxS

Further findings. The main reason fetchMerged is so high impact is __index in context.lua, which is in turn highly used by bakeFiles.

----------------------- L:bakeFiles@src/base/oven.lua:298 ----------------------
Sample count: 17472
Time spend total: 75.485s
Time spent in children: 75.485s
Time spent in self: 0.000s
Time spent per sample: 0.00432s/sample
Time spent in self per sample: 0.00000s/sample

Child L:for iterator@src/base/project.lua:21 sampled 1 times. Took 0.00s
Child L:__index@src/base/context.lua:58 sampled 1056 times. Took 12.32s
Child L:foreachi@src/base/table.lua:73 sampled 16320 times. Took 62.45s
Child C:sort@=[C]:? sampled 95 times. Took 0.71s

If I'm reading this correctly, does it create a fcfg for every file, for every configuration? If so, this is likely a root cause of our problem. (It accounts for at least 75s, with another 60s in compile, which would leave us with a very reasonable 20s.)

-- Start by building a comprehensive list of all the files contained by the
-- project. Some files may only be included in a subset of configurations so
-- I need to look at them all.

for cfg in p.project.eachconfig(prj) do
    local function addFile(fname, i)

        -- If this is the first time I've seen this file, start a new
        -- file configuration for it. Track both by key for quick lookups
        -- and indexed for ordered iteration.
        local fcfg = files[fname]
        if not fcfg then
            fcfg = p.fileconfig.new(fname, prj)
            fcfg.order = i
            files[fname] = fcfg
            table.insert(files, fcfg)
        end

        p.fileconfig.addconfig(fcfg, cfg)
    end

    table.foreachi(cfg.files, addFile)

    -- If this project uses NuGet, we need to add the generated
    -- packages.config file to the project. Is there a better place to
    -- do this?

    if #prj.nuget > 0 and (_ACTION < "vs2017" or p.project.iscpp(prj)) then
        addFile("packages.config")
    end
end

As we have 10k+ files and 5 platforms × 4 configurations = 20 configurations, that would be around 200k fcfgs being created. Is there anything we can do to avoid this, even if it's just a local override?

Mar 27 '20 17:03 ClxS

> An early win was findProject, which is very slow. We optimised it by adding an already-lowercased form and a hashed form of the name to the prj at creation time, then comparing against those. This way a string comparison only occurs if the hash comparison has already succeeded, and it doesn't have to lowercase the string constantly.

Nice find. It might be worth spinning this specific optimization off to its own issue or PR so it can be acted on separately from this discussion.

> If I'm reading correctly, does this create a fcfg for every file, for every configuration? If right this is likely a root of our problem

You are right, and nonsense like this is exactly what I'm trying to address with premake-next. I personally don't see a way to fix it without an overhaul of all of the configuration handling internals. Open to ideas and incremental improvements though.

Mar 30 '20 14:03 starkos

@starkos I'll start experimenting over the next couple of months. As long as things are just expecting fcfgs being passed as function args, and not specifically looking at prj._.files all over, I have a few ideas. I'll get back to you if any of them work!

Apr 15 '20 21:04 ClxS

Reviving this old thread 🙄. In my case the bottleneck seems to be that we have a lot of filters that apply to certain configurations, mostly at project level, but each fileconfig calls context.copyFilters(fcfg, prj). That means that for each file we try to match mostly useless filters. Is there a way to add only the filters that could affect files here, instead of adding them all? Off the top of my head I can think of language (?) and `filter {"files:" ... }`. Are there any other filters that could affect files?
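The question above can be illustrated with a rough sketch. It assumes a hypothetical block shape in which each block carries its filter terms as plain strings in a terms field; premake's real blocks store compiled criteria, so the names affectsFiles and selectFileFilters and the data layout are illustration only. It also only looks for a "files:" prefix, so which other terms could affect files remains the open question.

```lua
-- Hypothetical: a block "affects files" if any of its terms starts with "files:".
local function affectsFiles(block)
    for _, term in ipairs(block.terms) do
        if term:lower():find("^files:") then
            return true
        end
    end
    return false
end

-- Keep only the blocks that could match differently from one file to the next,
-- so the per-file context has fewer filters to test.
local function selectFileFilters(blocks)
    local selected = {}
    for _, block in ipairs(blocks) do
        if affectsFiles(block) then
            table.insert(selected, block)
        end
    end
    return selected
end

local blocks = {
    { terms = { "configurations:Debug" } },
    { terms = { "files:**.cpp", "platforms:x64" } },
    { terms = { "system:windows" } },
}
local fileBlocks = selectFileFilters(blocks)
print(#fileBlocks)
```

Blocks that are dropped this way would still have to reach the file through the project-level configuration, so this is only viable if file configs inherit those values from the project rather than re-evaluating them.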

Mar 26 '22 14:03 mihaisebea

Upon further inspection: in the previous version I was using 🙄, we were calling configset.compile 3 times per file. Now it seems we are calling configset.compile 6 times per file: for the root, for the solution, and for the project (?), each twice.

In oven.bakeFiles:

local function addFile(fname, i)

    -- If this is the first time I've seen this file, start a new
    -- file configuration for it. Track both by key for quick lookups
    -- and indexed for ordered iteration.
    local fcfg = files[fname]
    if not fcfg then
        fcfg = p.fileconfig.new(fname, prj)  -- <<< this calls context compile
        fcfg.order = i
        files[fname] = fcfg
        table.insert(files, fcfg)
    end

    p.fileconfig.addconfig(fcfg, cfg)  -- <<< this ALSO calls context compile
end

table.foreachi(cfg.files, addFile)

This seems to have been introduced by @tvandijck in commit 32885d1673b50194ad7137261634775d772a4534.

Mar 26 '22 16:03 mihaisebea

Further investigation into configset.compile:


if abspath and block._basedir and block._basedir ~= basedir then
    basedir = block._basedir
    filter.files = path.getrelative(basedir, abspath)
end

if criteria.matches(block._criteria, filter) then
    table.insert(result.blocks, block)
end

About 40% of the time is spent in this code: roughly 20% in path.getrelative and the remaining 20% in the criteria.matches test. It seems that on Windows block._basedir is lowercased in addFilter for some blocks but not for others. Adding a print of block._basedir <==> basedir inside the branch above that makes the files relative results in:

C:/Work/****       <==>   c:/work/***** 
c:/work/****        <==>   C:/Work/*****

Actual filenames are omitted for brevity :) The question is: do we still need to lowercase the basedir, @starkos? Or are there other places where we are missing a tolower? Also, would tolower cause issues on case-sensitive filesystems?

Mar 28 '22 16:03 mihaisebea

After even further investigation, it seems this problem comes from the export plugin, which does block._basedir = block.basedir. Lowercasing that value reduces the cost of path.getrelative.
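To illustrate why mixed-case base directories are costly, here is a small stand-in for the loop in configset.compile. It only counts how often the basedir comparison fails, which is when path.getrelative would run. The function countRecomputes and its normalize hook are hypothetical, and the sample paths are made up.

```lua
-- Counts how many times the "basedir changed" branch is taken. In the real
-- loop, each hit would trigger a path.getrelative call.
local function countRecomputes(blocks, normalize)
    local basedir = nil
    local recomputes = 0
    for _, block in ipairs(blocks) do
        local dir = block._basedir
        if dir and normalize then
            dir = normalize(dir)  -- hypothetical normalization hook
        end
        if dir and dir ~= basedir then
            basedir = dir
            recomputes = recomputes + 1
        end
    end
    return recomputes
end

-- Same directory, but stored with alternating case (made-up paths).
local blocks = {
    { _basedir = "C:/Work/proj" },
    { _basedir = "c:/work/proj" },
    { _basedir = "C:/Work/proj" },
    { _basedir = "c:/work/proj" },
}

print(countRecomputes(blocks))                -- mixed case: every block differs from the last
print(countRecomputes(blocks, string.lower))  -- consistent case: only the first block differs
```

With consistent casing the relative path is computed once per run of blocks that share a base directory. Lowercasing is only one way to get that consistency, and whether it is safe on case-sensitive filesystems is the open question raised above.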

Mar 28 '22 20:03 mihaisebea
