Open-Assistant
Open-Assistant copied to clipboard
Write instructions and examples for unit-testing python code (backend and bot)
use pytest for this
note: this issue concerns unit tests, meaning if e.g. a database is required, it should be mocked
Hi @yk, that sounds like something I could look into. I worked with pytest and did some testing that required DB/API mocking. If no one else is interested, I might've taken a look. (Though I would need some time to start, like fork/set up the project, read contribution guidelines, etc.)
Also, I guess that would help me learn more about the codebase, as I'm pretty interested in building an open AI assistant, and wish to contribute something.
@devforfu Hi Ilia, I'm in the same boat as you regarding this project. This seems like a good task to get started with. If you're interested, let's connect by email or Discord to work together on setting up the project and getting started. Email: [email protected] Discord: Nick DiSanto#4045
Hey @devforfu @NickDiSanto thanks to both of you for speaking up :) very cool! I'll assign both of you and leave it up to you how to organize it. Remember, this is not about writing all the tests, but about giving people a bit of a framework, some instructions, and some examples on how tests are done.
@yk Sounds good! Yes, that makes sense. I guess that would be helpful to establish some testing foundation.
@NickDiSanto That's great! Yes, let's connect via Discord. My tag there is Ilia#9789.
@devforfu & @NickDiSanto , I started working on #205, we should connect -- lomz#2555
Langchain has a pretty thorough set of unit and integration tests for a very similar problem. https://github.com/hwchase17/langchain/tree/master/tests/unit_tests there are some pretty good examples for database testing too.
I'm starting to look at adding tests for the discord bot. The critical flow I'm looking to test is _handle_task (found here)
To test _handle_task , I'm finding we'd either need to A) do a lot of mocking of Discord methods or B) encapsulate interactions with Discord inside of a TaskCoordinator class, something that would handle sending messages, receiving replies, etc.
Option B would be a nice idea if we intended to have multiple front ends for _handle_task other than a Discord bot. We could write a TaskCoordinator for completing tasks through the CLI, or through a desktop app, or anything.
Option A will get the job done but ties the _handle_task code pretty tightly to Discord, and mock.patching is kinda fickle. It is a discord bot, though.
Just wanted to put some thoughts down and poll the group to figure out which direction we should go in! My preference would be Option B because that feels easier and more fun to test and also gives us some flexibility. Would love thoughts.
@jack-michaud I think this is a general problem with adding tests after writing the code. In general, tests are just other code. Re-useable code that's easy to reason about (so has fewer bugs) is easy to test. If it's difficult to test, it's usually best to just fix the code.
In this case I'd be tempted to split the logic out into short functions and pass mocks or fixtures in using dependency injection. For example, this bunch of code could be a function that takes a ctx, oasst_api and returns an event:
await ctx.author.send("Please type your response here:")
try:
event = await ctx.bot.wait_for(
hikari.DMMessageCreateEvent, timeout=MAX_TASK_TIME, predicate=lambda e: e.author.id == ctx.author.id
)
except asyncio.TimeoutError:
await ctx.author.send("Task timed out. Exiting")
await oasst_api.nack_task(task.id, reason="timed out")
logger.info(f"Task {task.id} timed out")
return
I'd then throw fixtures into conftest.py with names like connected_api, flaky_connection, ctx_with_lazy_user and so on, with mocks or test classes that do the minimal needed.
@bitplane I totally agree that what I'm running into is a "smell" - it's hard to test and it's hard to reason about. The tests we'd write with mock.patch would fail as we change the code.
In this case I'd be tempted to split the logic out into short functions and pass mocks or fixtures in using dependency injection. For example, this bunch of code could be a function that takes a ctx, oasst_api and returns an event... I'd then throw fixtures into conftest.py with names like connected_api, flaky_connection, ctx_with_lazy_user and so on, with mocks or test classes that do the minimal needed.
Yes, it looks like oasst_api is already being injected. We would also probably create a DiscordTaskCoordinator(ctx) since ctx is a Discord specific object. The principles remain the same, except we can mock the Discord methods as well. The function signature would be task_coordinator, oasst_api -> event -- it would no longer take in a ctx, and that means it's something that we can easily test since we own all of those interfaces.
If it's difficult to test, it's usually best to just fix the code.
I'd like @yk or @andreaskoepf to weigh in here. If we're trying to focus on delivering an MVP, our time would be spent better elsewhere and not refactoring yet.
I fully trust you, be pragmatic and decide what you think is best. Also, @andreaskoepf probably knows way more about that part of the codebase than me
One option could be to add some unit test coverage for some easier/trivial parts, do a bit of refactoring so they're easier to test, and put a long description of what you changed and why in the commit messages. Then create a few more unit test issues that reference the pull request as an example of the kind of stuff that needs to be done.
That way we get the ability to spread the work out a bit and not get bogged down on refactoring now, it gives new devs some nice easy first issues to pick up, and we've still got some unit tests for CI and coverage reports.
Oh I forgot to put something in here... I got some basic unit tests merged in last weekend and plan to do some more this weekend. Nothing end-to-end or exhaustive, but a step in the right direction.
~~I'll open a ticket for the next batch~~ I didn't but I referenced this one
Closing all old discord bot issues. Discord data collection never took off. It might be more viable to develop a discord bot that communicates with the inference system.