
Oxide: Conda-provided toolchain performance consistently different than fresh-built

Open tcal-x opened this issue 3 years ago • 13 comments

We have nightly actions that build 3 designs, each with 3 different seeds. One of these actions gets Yosys and Nextpnr-Nexus via Conda packages. The other clones the Yosys and Nextpnr repositories and builds them fresh.

The performance (achieved maximum frequency of the placed and routed design) is usually worse with the Conda-provided package, and there is no explanation for it.

See https://github.com/google/CFU-Playground/actions/workflows/fmax-trials.yml (Conda) and https://github.com/google/CFU-Playground/actions/workflows/fmax-trials-fresh-build.yml (fresh-built).

For the middle design with fresh-built tools, the (prelim/final) fmax in MHz were (70/84), (61/83), (62/82).

Using Conda-provided tools, the values were (66/73), (54/76), (64/75).

They should be identical unless there was a significant commit between the runs (the git hashes are printed out in each run), but that is not the case here. The fresh-built results have been the same for the last few days, and the Conda packages were built within the last day.
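To put a number on the gap, here is a minimal sketch (not part of the CI workflows) that averages the final fmax values quoted above. The variable and function names are illustrative only.

```python
from statistics import mean

# (preliminary, final) fmax in MHz, copied from the figures quoted above.
fresh_built = [(70, 84), (61, 83), (62, 82)]
conda = [(66, 73), (54, 76), (64, 75)]

def mean_final(runs):
    """Average the final fmax over all seeds."""
    return mean(final for _prelim, final in runs)

fresh_mean = mean_final(fresh_built)
conda_mean = mean_final(conda)
# Relative shortfall of the Conda toolchain, as a percentage of fresh-built.
gap_percent = 100 * (fresh_mean - conda_mean) / fresh_mean

print(f"fresh-built: {fresh_mean:.1f} MHz, Conda: {conda_mean:.1f} MHz, "
      f"gap: {gap_percent:.1f}%")
```

With only three seeds per toolchain this is a rough indicator, not a statistically solid comparison.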

Are the tools built with different flags? Could there be some other executable in the Conda packages that somehow affects performance? You can run both ways locally (look at the Github actions for each).

tcal-x avatar Oct 02 '21 04:10 tcal-x

This certainly seems weird. @PiotrZierhoffer - can you get someone to investigate?

My bet is that the versions are not as close as @tcal-x thinks they are.

mithro avatar Oct 02 '21 04:10 mithro

I suppose the prjoxide executable/database is a potential source of difference as well. If the nextpnr-nexus Conda build in turn uses the prjoxide Conda package, that might be a bit old.

tcal-x avatar Oct 02 '21 05:10 tcal-x

Yeah, actually the Yosys Conda package is a bit old (3 days). Piotr mentioned that some packages weren't getting approved as a new 'main' because of an unrelated CI failure.

tcal-x avatar Oct 02 '21 05:10 tcal-x

@PiotrZierhoffer, I see the Litex-Hub Yosys 'main' issue has been resolved, so we are now getting an up-to-date Yosys version. I am still seeing differences between the Conda-provided tools and the fresh-built ones.

The yosys --version printouts are quite different. Does this mean they were compiled with different flags? Do you know the story behind all of the flags in the Conda build?

From fresh-built:

Yosys 0.10+10 (git sha1 f3ef579a, clang 10.0.0-4ubuntu1 -fPIC -Os)
nextpnr-nexus -- Next Generation Place and Route (Version 9c32e2d8)

From Conda-provided:

Yosys 0.10+10 (git sha1 abc57006, x86_64-conda_cos6-linux-gnu-gcc 1.24.0.133_b0863d8_dirty -fvisibility-inlines-hidden -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -fdebug-prefix-map=/home/runner/work/conda-eda/conda-eda/workdir/conda-env/conda-bld/yosys_1633388921977/work=/usr/local/src/conda/yosys-0.9_5622_gabc57006 -fdebug-prefix-map=/home/runner/work/CFU-Playground/CFU-Playground/env/conda/envs/cfu-common=/usr/local/src/conda-prefix -fPIC -Os -fno-merge-constants)
nextpnr-nexus -- Next Generation Place and Route (Version 0.0.0-3848-g9c32e2d8)
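The two banners are easier to compare field by field. The sketch below is one way to split them apart; `parse_banner` is a hypothetical helper, the regular expression is an assumption about the banner layout based only on the two examples above, and the Conda banner is abbreviated by leaving out its two long -fdebug-prefix-map flags.

```python
import re

# Version banners copied from the thread. The Conda banner is abbreviated:
# its two long -fdebug-prefix-map=... flags are omitted for readability.
FRESH = "Yosys 0.10+10 (git sha1 f3ef579a, clang 10.0.0-4ubuntu1 -fPIC -Os)"
CONDA = (
    "Yosys 0.10+10 (git sha1 abc57006, x86_64-conda_cos6-linux-gnu-gcc "
    "1.24.0.133_b0863d8_dirty -fvisibility-inlines-hidden -fmessage-length=0 "
    "-march=nocona -mtune=haswell -ftree-vectorize -fPIC "
    "-fstack-protector-strong -fno-plt -O2 -ffunction-sections "
    "-fPIC -Os -fno-merge-constants)"
)

# Assumed layout: "Yosys <version> (git sha1 <sha1>, <compiler> <version> <flags>)"
BANNER = re.compile(
    r"Yosys (?P<version>\S+) \(git sha1 (?P<sha1>\w+), "
    r"(?P<compiler>\S+) (?P<compiler_version>\S+) (?P<flags>.*)\)"
)

def parse_banner(text):
    """Split a Yosys version banner into its fields (hypothetical helper)."""
    match = BANNER.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised banner: {text!r}")
    fields = match.groupdict()
    fields["flags"] = fields["flags"].split()
    return fields

fresh, conda = parse_banner(FRESH), parse_banner(CONDA)
for key in ("version", "sha1", "compiler"):
    marker = "same" if fresh[key] == conda[key] else "DIFFERENT"
    print(f"{key:9} {fresh[key]:10} {conda[key]:35} {marker}")
print("flags only in Conda build:",
      sorted(set(conda["flags"]) - set(fresh["flags"])))
```

Besides the compiler, the comparison also shows that the two git sha1 values differ even though both banners report 0.10+10.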

tcal-x avatar Oct 05 '21 21:10 tcal-x

First of all, I see clang vs gcc, so it's a different toolchain. The flags come mainly from Conda; on top of those we add -std=c++11 -Os -fno-merge-constants.

Do you still observe the performance difference here?

PiotrZierhoffer avatar Oct 05 '21 22:10 PiotrZierhoffer

Hi @PiotrZierhoffer, yes, there is still a difference looking at the latest workflows (https://github.com/google/CFU-Playground/actions/workflows/fmax-trials.yml and https://github.com/google/CFU-Playground/actions/workflows/fmax-trials-fresh-build.yml).

The Yosys compile flags might have nothing to do with it; they matter only if they change the Yosys output, and I don't see any -D<something> flags that would.

Can you have someone try to get to the bottom of it? E.g. see if the Yosys output differs, if so why, if not then what is different, etc. Maybe the difference can be reproduced on your machine, maybe not -- that would be 'interesting' too if there's no difference locally.
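A first step along those lines could be a byte-level comparison of the synthesis outputs from the two Yosys builds. The sketch below assumes each build writes its netlist to a file; the file names and the stand-in contents are hypothetical and only there to make the example runnable.

```python
import hashlib
import tempfile
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def outputs_match(fresh_path, conda_path):
    """True when both synthesis outputs are byte-for-byte identical."""
    return file_digest(fresh_path) == file_digest(conda_path)

# Demo with stand-in files. In practice these would be the netlists written by
# each Yosys build (the names and contents here are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    fresh = Path(tmp, "fresh.json")
    conda = Path(tmp, "conda.json")
    fresh.write_text('{"cells": ["a", "b"]}')
    conda.write_text('{"cells": ["b", "a"]}')
    demo_result = outputs_match(fresh, conda)
    print("identical" if demo_result else "Yosys outputs differ")
```

A mismatch is only a starting point: output files may embed tool version text, so a textual diff would still be needed to tell cosmetic differences from structural ones.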

tcal-x avatar Oct 06 '21 04:10 tcal-x

@tcal-x it seems the problem does not exist anymore (see the latest runs):

conda: https://github.com/google/CFU-Playground/runs/3882511524?check_suite_focus=true#step:19:1
fresh build: https://github.com/google/CFU-Playground/runs/3882327394?check_suite_focus=true#step:19:1

in both cases the results were:

Info: Max frequency for clock 'por_clk$glb_clk': 71.47 MHz (PASS at 70.72 MHz)
Info: Max frequency for clock 'por_clk$glb_clk': 76.56 MHz (PASS at 70.72 MHz)

kgugala avatar Oct 14 '21 08:10 kgugala

Should the results not be identical if given identical input / versions?

mithro avatar Oct 14 '21 14:10 mithro

> Should the results not be identical if given identical input / versions?

Even though I should know what is going on, it confused me at first as well.
Then I remembered that each run gives out two max freq lines: one preliminary and one final.

To make it clearer:

Conda results:

Info: Max frequency for clock 'por_clk$glb_clk': 71.47 MHz (PASS at 70.72 MHz)
Info: Max frequency for clock 'por_clk$glb_clk': 76.56 MHz (PASS at 70.72 MHz)

Fresh build results:

Info: Max frequency for clock 'por_clk$glb_clk': 71.47 MHz (PASS at 70.72 MHz)
Info: Max frequency for clock 'por_clk$glb_clk': 76.56 MHz (PASS at 70.72 MHz)

Thanks @kgugala ; I will check the runs again tomorrow, and assuming they still match, I'll close this.

tcal-x avatar Oct 14 '21 16:10 tcal-x

I see identical results with the most recent runs; I'll close this.

tcal-x avatar Oct 18 '21 19:10 tcal-x

I'm again seeing significant performance (critical path / fmax) differences for HPS between locally-built tools and Conda-provided tools. I see it both in CI and building locally.

Using locally-built and installed yosys and nextpnr-nexus:

seed-1/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 89.08 MHz (PASS at 53.50 MHz)
seed-2/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 80.06 MHz (PASS at 53.50 MHz)
seed-3/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 85.35 MHz (PASS at 53.50 MHz)
seed-4/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 68.47 MHz (PASS at 53.50 MHz)
seed-5/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 82.35 MHz (PASS at 53.50 MHz)
seed-6/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 81.95 MHz (PASS at 53.50 MHz)
seed-7/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 81.95 MHz (PASS at 53.50 MHz)
seed-8/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 86.63 MHz (PASS at 53.50 MHz)

Using Conda-provided tools:

seed-1/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 79.23 MHz (PASS at 53.50 MHz)
seed-2/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 79.17 MHz (PASS at 53.50 MHz)
seed-3/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 71.94 MHz (PASS at 53.50 MHz)
seed-4/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 71.09 MHz (PASS at 53.50 MHz)
seed-5/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 73.35 MHz (PASS at 53.50 MHz)
seed-6/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 74.83 MHz (PASS at 53.50 MHz)
seed-7/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 73.35 MHz (PASS at 53.50 MHz)
seed-8/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 69.44 MHz (PASS at 53.50 MHz)
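The two sets of log lines can be summarized per seed with a short script. This is a sketch only: `parse_fmax` is a hypothetical helper, and the regular expression assumes the grep-style `seed-N/nextpnr-nexus.log:` prefix shown above.

```python
import re
from statistics import mean

LOCAL = """\
seed-1/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 89.08 MHz (PASS at 53.50 MHz)
seed-2/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 80.06 MHz (PASS at 53.50 MHz)
seed-3/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 85.35 MHz (PASS at 53.50 MHz)
seed-4/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 68.47 MHz (PASS at 53.50 MHz)
seed-5/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 82.35 MHz (PASS at 53.50 MHz)
seed-6/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 81.95 MHz (PASS at 53.50 MHz)
seed-7/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 81.95 MHz (PASS at 53.50 MHz)
seed-8/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 86.63 MHz (PASS at 53.50 MHz)
"""
CONDA = """\
seed-1/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 79.23 MHz (PASS at 53.50 MHz)
seed-2/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 79.17 MHz (PASS at 53.50 MHz)
seed-3/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 71.94 MHz (PASS at 53.50 MHz)
seed-4/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 71.09 MHz (PASS at 53.50 MHz)
seed-5/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 73.35 MHz (PASS at 53.50 MHz)
seed-6/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 74.83 MHz (PASS at 53.50 MHz)
seed-7/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 73.35 MHz (PASS at 53.50 MHz)
seed-8/nextpnr-nexus.log:Info: Max frequency for clock 'clkout$glb_clk': 69.44 MHz (PASS at 53.50 MHz)
"""

# Captures the seed number and the achieved fmax from each grep-style line.
FMAX = re.compile(r"seed-(\d+)/.*?Max frequency for clock '[^']+': ([\d.]+) MHz")

def parse_fmax(grep_output):
    """Map seed number -> fmax in MHz (hypothetical helper)."""
    return {int(m.group(1)): float(m.group(2)) for m in FMAX.finditer(grep_output)}

local, conda = parse_fmax(LOCAL), parse_fmax(CONDA)
# Seeds where the Conda-provided tools reached a higher fmax than the local build.
conda_better = [seed for seed in sorted(local) if conda[seed] > local[seed]]

print(f"local mean: {mean(local.values()):.2f} MHz")
print(f"Conda mean: {mean(conda.values()):.2f} MHz")
print("seeds where Conda is faster:", conda_better)
```

Comparing seed by seed as well as by mean helps separate a systematic shift from ordinary seed-to-seed noise.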

tcal-x avatar Feb 25 '22 18:02 tcal-x

The difference is entirely due to whether gcc or clang is used to build Yosys. With a local build, clang is the default. If I instead build Yosys with:

make config-gcc
make -j8
sudo make install

then I get exactly the same results as when using the Conda package.

I'll file an issue on Yosys to see if this is expected behavior.

tcal-x avatar Feb 26 '22 04:02 tcal-x

I opened https://github.com/YosysHQ/yosys/issues/3218 last week.

tcal-x avatar Mar 02 '22 17:03 tcal-x