Establishing the test infrastructure

Open k00ni opened this issue 9 months ago • 1 comments

@CodeWithKyrian wrote in https://github.com/CodeWithKyrian/transformers-php/pull/36#issuecomment-2120853914:

If you notice, there aren't many tests in the library, which I'm not proud of. This is because I want to take my time to decide on the best structure for the tests. Classes like Tensor and Image can be easily tested, and I included tests for the basic tokenizer because the config file sizes are relatively small. However, the overall testing structure of the library is still largely undecided. [...] Since you seem to have a keen interest in testing, I would really appreciate your suggestions on how best to structure tests for the project.

I suggest selecting appropriate objectives and approaches/styles for the library first and then deriving how to set up the test infrastructure from them.

Based on the documentation and your summary in #34, my assumption is you have the following objectives etc.:

  1. Mirror the functionality of the HuggingFace Python library as closely as possible
  2. Use plain PHP wherever possible, combined with access to C libraries (and others) using the FFI extension
  3. Imitate the code style so the transition for people from Python to PHP (and other languages) is as smooth as possible. Here is an example of the code style: https://codewithkyrian.github.io/transformers-php/summarization#running-a-pipeline-session
  4. You provide the library as Open Source on GitHub, so I assume you want to attract (or at least hope for) people who help you develop/maintain it.

Feel free to object to any of these points or complement/update them, so there is no misunderstanding.

A few remarks on the side:

  • Even though this library has a good example to follow (= HuggingFace Python library), which outlines the direction of the development, it's still very young. So there is a lot of room for decisions.
  • It seems as if you want to transfer the Python (functional) style of doing things to the test environment as well. That's probably why you use the Pest test framework. Is this style of coding important to you, or are you willing to change it (at least partly)? There is no need to justify it, because you are the author (Benevolent dictator for life, BDFL) of the library and are free to decide how things are supposed to be done.

Suggested test environment

1. Decide which framework to use: Pest or PHPUnit

I don't know the Pest framework well, but while writing tests for #36 it seemed cumbersome to use, because I personally don't like the functional style of doing things (point 3). This opinion is shared by others as well, but on the other hand, Pest provides very nice output (reference).

PHPUnit is widely known and represents a major style of writing code in PHP. It is used by big PHP projects such as the Symfony framework or Doctrine. If my assumption (point 4) is correct, you should consider using a tool which is likely to be better known to PHP developers. In my experience, people are either not familiar with testing at all or, if they are, they at least know PHPUnit.

✔ I would switch to PHPUnit, but explore whether Pest provides improvements in the output.

2. Decide which role a test plays

Tests can play various roles. Not only can they represent certain aspects of a piece of software, they can also be used to show that a certain functionality is provided. I would take a very broad view of tests and use them for everything that suits this library, which means:

  • Write a test to show that a function behaves as intended (e.g. check the return value for a given input).
  • Write a test to show that a (set of) function(s) stays within certain limits (e.g. memory limits for certain inputs).
  • Write a test to show that a certain misbehavior (bug) was fixed (e.g. each pull request containing a fix should demonstrate that it works).
  • Write a test to keep track of current problems. In a project of mine I used a skipped test to remind me that a certain functionality still doesn't work. The idea was that if it ever got fixed, I would get notified through my tests.

(and more ...)
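The four roles above could look roughly like this in PHPUnit. This is only a sketch: it assumes PHPUnit is installed via Composer, and `normalizeWhitespace()` is a made-up stand-in, not a function of this library. The 16 MiB limit is an arbitrary number for illustration.

```php
<?php

declare(strict_types=1);

use PHPUnit\Framework\TestCase;

// Hypothetical function under test; it only stands in for real library code.
function normalizeWhitespace(string $text): string
{
    return trim((string) preg_replace('/\s+/', ' ', $text));
}

final class TestRolesTest extends TestCase
{
    // Role 1: the function behaves as intended for a given input.
    public function testReturnsExpectedValue(): void
    {
        self::assertSame('a b', normalizeWhitespace("  a \n b "));
    }

    // Role 2: the function stays within a limit (here: growth of peak memory).
    public function testStaysBelowMemoryLimit(): void
    {
        $before = memory_get_peak_usage();
        normalizeWhitespace(str_repeat('word ', 100_000));
        $used = memory_get_peak_usage() - $before;

        self::assertLessThan(16 * 1024 * 1024, $used);
    }

    // Role 3: regression test that documents a (hypothetical) fixed bug.
    public function testWhitespaceOnlyInputYieldsEmptyString(): void
    {
        self::assertSame('', normalizeWhitespace(" \t\n"));
    }

    // Role 4: keep track of a known problem. The test is skipped while the
    // problem exists and fails loudly as soon as the behavior changes.
    public function testKnownProblemNonBreakingSpace(): void
    {
        // Without the /u modifier, \s does not treat U+00A0 as whitespace.
        if (normalizeWhitespace("a\u{00A0}b") !== 'a b') {
            $this->markTestSkipped('Known problem: non-breaking spaces are not normalized.');
        }

        self::fail('Known problem seems to be fixed: turn this into a regular test.');
    }
}
```

The last test is the "reminder" variant: as long as the problem exists it shows up as skipped in the report, and once someone fixes it the test fails and asks to be converted.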

3. Folder structure

The following structure has served us well in various projects I (help to) maintain on GitHub (e.g. PDFParser, EasyRdf fork). The reason is that it's flexible enough to grow with the project but has enough structure to avoid files "flying around". The basic structure is:

tests
|
`--- files
|       |
|       `--- forIssue33.txt
|       `--- ...
|
`--- TestCase.php        <== Root class for each test


tests
|
`--- IntegrationTests   <== Majority of tests: because everything is entangled and people usually don't care
|     |                 or don't know better, integration tests are a good fit.
|     |                 This folder usually contains unit tests, system tests ... too.
|     |
|     `--- Class1Test.php
|     `--- ...
|
`---- ModelDependentTests   <== if it makes sense, test the library using certain models to check boundaries etc.
|     `--- ModelXTest.php
|     `--- ...
|
`---- PerformanceTests
...

Each folder in tests should represent a major area of interest to the library. I can imagine overall performance or memory usage is of high priority. In these cases a separate test area (such as PerformanceTests) might help, so long-running tests or tests which need a special environment don't pollute, for instance, the integration tests (folder IntegrationTests). Also, not all tests have to run on each new commit. Some might only run when a new PR is created. These things might take some time to observe and configure properly.
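To make the structure a bit more concrete, here is a rough sketch of the root TestCase.php plus one test from PerformanceTests. It assumes PHPUnit 10 or newer (for the `Group` attribute). The namespace `Tests`, the helper `fixturePath()`, the class name and the 500 ms budget are made up for illustration. In a real project the two classes would live in separate files.

```php
<?php

declare(strict_types=1);

namespace Tests; // hypothetical namespace

use PHPUnit\Framework\Attributes\Group;
use PHPUnit\Framework\TestCase as PHPUnitTestCase;
use RuntimeException;

// tests/TestCase.php: root class for each test.
abstract class TestCase extends PHPUnitTestCase
{
    // Hypothetical helper: resolves a file stored in tests/files.
    protected function fixturePath(string $name): string
    {
        $path = __DIR__ . '/files/' . $name;

        if (!is_file($path)) {
            throw new RuntimeException('Missing test file: ' . $path);
        }

        return $path;
    }
}

// Would live in tests/PerformanceTests; the group marks it as long-running.
#[Group('performance')]
final class ExamplePerformanceTest extends TestCase
{
    public function testFixtureIsReadWithinTimeBudget(): void
    {
        $start = hrtime(true);
        $content = file_get_contents($this->fixturePath('forIssue33.txt'));
        $elapsedMs = (hrtime(true) - $start) / 1e6;

        self::assertNotFalse($content);
        self::assertLessThan(500, $elapsedMs);
    }
}
```

With a group in place, a quick run on each commit could use `vendor/bin/phpunit --exclude-group performance`, while a PR run could execute everything.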

All tests should run as part of the Continuous Integration pipeline here (using GitHub Actions). This is well documented and people use it for a wide variety of things (e.g. compilations, tests, data aggregation).

4. Include static code analysis

I will keep this one short. Static code analyzers such as PHPStan just use the source code (+ some config) and don't need custom test cases. They apply certain rules (based on a given configuration). One of their major benefits is the ability to find errors the developer usually doesn't think about. They also help to establish type safety in the code base. Feel free to ask for more info.
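As a small illustration of the kind of error meant here, take the following plain PHP snippet. Both functions are made up and not part of this library. The happy path works, so a quick manual check would not reveal anything, but a static analyzer with a sufficiently strict configuration can be expected to point out that `strlen()` may receive null.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns null when the list is empty.
 *
 * @param list<string> $tokens
 */
function firstToken(array $tokens): ?string
{
    return $tokens[0] ?? null;
}

/**
 * Hypothetical caller that forgets the null case.
 *
 * @param list<string> $tokens
 */
function firstTokenLength(array $tokens): int
{
    // strlen() expects a string, but firstToken() may return null.
    return strlen(firstToken($tokens));
}

// Happy path: prints 5.
echo firstTokenLength(['hello', 'world']), PHP_EOL;

// Edge case nobody thought about: with strict types this throws a TypeError.
try {
    firstTokenLength([]);
} catch (TypeError $e) {
    echo 'TypeError: ', $e->getMessage(), PHP_EOL;
}
```

The point is that the analyzer reports the second case without anyone having to write a test for the empty list first.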


Please read this as a suggestion and feel free to do whatever you want with it, no hard feelings. I might send further PRs in the future, but it really depends on my available time. Just wanted to add this so there is no misunderstanding.

k00ni, May 22 '24 08:05