ggml : improve CI + add more tests
The current state of the testing framework is pretty bad - we have a few simple test tools in tests/, but these are not properly maintained and are quite rudimentary. Additionally, GitHub Actions does not allow running heavy workloads, so it is difficult to run integration tests even on small models such as GPT-2 - not to mention that there is no GPU support.
Ideally, it would be awesome to have a CI that can build the code on as many different hardware configurations as possible and run performance and accuracy tests for various models. This would allow quicker iteration on changes to the core library.
I posted a discussion in llama.cpp on this topic - hopefully we can gather some insight on how to build such a CI in the cloud:
https://github.com/ggerganov/llama.cpp/discussions/1985
Extra related issues:
- https://github.com/ggerganov/llama.cpp/issues/2631
- https://github.com/ggerganov/llama.cpp/issues/2634
TODOs:
- [ ] Add Metal CI to llama.cpp using the new macos-13 runners: https://github.com/ggerganov/ggml/pull/514
I'd be interested in helping with the 'add more tests' part of this, but I have some unanswered questions, so it would be reasonable to have some directions here. Obvious question: do we have any means to measure test coverage yet?
I guess we can focus on CPU-only testing for now. The most straightforward approach is to have a unit test for each function in the ggml.h API. Some functions, like ggml_rope() and ggml_alibi(), should be cross-validated against reference Python implementations, since it is otherwise difficult to judge whether they compute things correctly. Such tests are lightweight and can be part of the existing GitHub Actions.
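To illustrate the cross-validation idea, a Python reference for rotary position embeddings could look roughly like this. This is a sketch of the standard RoPE formula only - the exact rotation convention, dtype, and parameters used by ggml_rope() may differ, so tolerances and pairing would need to be checked against the actual implementation:

```python
import numpy as np

def rope_ref(x, pos, base=10000.0):
    """Reference RoPE applied to one vector at sequence position `pos`.

    x: 1-D array with an even number of dimensions. Consecutive pairs
    (x[i], x[i+1]) are rotated by an angle pos * base**(-i/d), so the
    rotation frequency decreases for higher dimensions.
    """
    d = x.shape[0]
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = np.empty(d, dtype=np.float64)
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = np.cos(theta), np.sin(theta)
        out[i]     = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out
```

A C-side unit test could then run ggml_rope() on the same input and compare against values produced by such a reference within a small tolerance.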
Regarding GPU tests - when the cloud CI framework is ready, we will simply run "integration" tests in the cloud. For example, the CI can fetch certain model data and run text generation and perplexity calculations on different GPUs - whatever is available for rent. We can figure out the details later.
Test coverage would be nice - I've used lcov in the past. Maybe we can integrate it into the GitHub Actions CI.
@ggerganov: My attempts to get lcov working on Windows failed miserably, but I got clang/llvm coverage analysis working. Here is a first summary:
| Filename | Regions | Missed Regions | Cover | Functions | Missed Functions | Executed | Lines | Missed Lines | Cover | Branches | Missed Branches | Cover |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| src\ggml.c | 10461 | 5746 | 45.07% | 512 | 216 | 57.81% | 9985 | 4703 | 52.90% | 4848 | 2775 | 42.76% |
| tests\test-grad0.c | 464 | 68 | 85.34% | 11 | 1 | 90.91% | 818 | 68 | 91.69% | 322 | 62 | 80.75% |
| include\ggml\ggml.h (contains no functions) | 0 | 0 | - | 0 | 0 | - | 0 | 0 | - | 0 | 0 | - |
| TOTAL | 10925 | 5814 | 46.78% | 523 | 217 | 58.51% | 10803 | 4771 | 55.84% | 5170 | 2837 | 45.13% |
The report is based on the merged profile data of all currently active tests, run by ctest; llvm-cov seems to require specifying a particular executable, however. If this is an acceptable way forward, I'll try to clean up the CMake changes and propose a PR.
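For reference, wiring clang's source-based coverage into the build might look something like the following CMake fragment. This is only a sketch - the GGML_COVERAGE option name is hypothetical and the actual flags in the eventual PR may differ:

```cmake
option(GGML_COVERAGE "Build with clang source-based coverage" OFF)

if (GGML_COVERAGE)
    if (NOT CMAKE_C_COMPILER_ID MATCHES "Clang")
        message(FATAL_ERROR "GGML_COVERAGE requires clang")
    endif()
    # instrument all targets for profile collection and coverage mapping
    add_compile_options(-fprofile-instr-generate -fcoverage-mapping)
    add_link_options(-fprofile-instr-generate)
endif()
```

After running ctest with LLVM_PROFILE_FILE set, the resulting .profraw files can be merged with llvm-profdata merge and summarized with llvm-cov report, which produces tables like the one above.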
Yes, this looks even better. Let's give it a try
Hi @ggerganov, I published a PR to support multiple platforms and OSes a few days ago. Let me know if it's something you find relevant.
@alonfaraj
Thank you very much! I'm currently looking at the PR - sorry for the delay
In the meantime, I've made progress on the Azure cloud CI idea and hacked together a simple framework using Bash + Git:
https://github.com/ggml-org/ci
Currently, I am able to very easily attach new nodes from the cloud and have them run various tests. The tests are implemented in the ci/run.sh script. At the moment I've rented just 3 CPU instances:
The ggml-2 instance is a high-performance one and can run heavier workloads, such as MPT 7B inference.
The results are summarized neatly in Github README.md files for each commit.
If this strategy turns out to be effective, I will probably scale it up and add GPU and bare-metal nodes.
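The per-step structure of such a run script could be as simple as a helper that executes each step, captures its log, and reports a status line for the summary. This is a hypothetical sketch, not the actual ci/run.sh:

```shell
#!/bin/bash
# Hypothetical CI step runner (sketch only - the real ci/run.sh may differ).
# Runs a command, redirects its output to a per-step log file, and prints
# "ok <name>" or "fail <name>" so results can be collected into a README.
run_step() {
    local name="$1"; shift
    if "$@" > "out_${name}.log" 2>&1; then
        echo "ok ${name}"
    else
        echo "fail ${name}"
    fi
}
```

For example, `run_step build cmake --build .` would log the build output to out_build.log and emit a one-line status for the commit summary.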
Looks good! I will take a deeper look as well.
Add Metal CI to llama.cpp using the new macos-13 runners: https://github.com/ggerganov/ggml/pull/514
@ggerganov, how are things going... and how are you progressing on the CI?
I've recently finished an Azure Architecture/DevOps contract and got familiar with CI/CD on Azure, Azure Infrastructure-as-Code (IaC), various Azure services, etc.
Re-reading this roadmap item, it seems the solution may be:
- a GitHub Action starts a CI process on Azure
- create the Azure "webworker" infrastructure - multiple approaches, from shared to dedicated, CPU or GPU
- [optionally] run unit tests
- run performance tests
- return reports
- destroy the Azure "webworker" infrastructure
A yaml settings file + GitHub Secrets to manage the config.
The CI could be run on the forked repo, using the GitHub Secrets (and hence the Azure credentials) of the fork's GitHub account.
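As a rough sketch, the GitHub Actions side of such a setup could look like the workflow below. The workflow name, secret name, and the three ci/azure-*.sh scripts are all hypothetical placeholders for whatever provisioning/teardown scripts would actually live in the repo:

```yaml
name: azure-ci
on: [push, pull_request]

jobs:
  azure-perf:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}   # from GitHub Secrets
      - name: Provision webworker                   # hypothetical script
        run: ./ci/azure-provision.sh
      - name: Run tests and collect reports         # hypothetical script
        run: ./ci/azure-run.sh
      - name: Destroy webworker
        if: always()                                # tear down even on failure
        run: ./ci/azure-destroy.sh
```

Keeping the teardown step under `if: always()` ensures the rented infrastructure is destroyed even when a test step fails.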
Do you still have a large Azure allocation?