
Port `network_peers` to the database generator

stgraber opened this issue 8 months ago · 21 comments

Incus uses a built-in code generator for most of its database interactions. Unfortunately not all entities have yet been moved over to that mechanism, causing us to have far more SQL code in our code tree than we'd really like.

The goal here is to define new code-generated functions in internal/server/db/cluster/ and then remove the old implementation from internal/server/db/, transitioning everything over to the new generated functions.

A recent example of the database generator being used can be found here: https://github.com/lxc/incus/pull/655

Note that some of the syntax has since evolved, so you may want to look at the current version of networks_integrations.go to see how to invoke the generator.

stgraber avatar Mar 17 '25 21:03 stgraber

Hey @stgraber , I'm a UT student who needs to contribute to an open source project for Professor Chidambaram's Virtualization course. I'm interested in this issue, but I'd like to learn more about it before taking it on.

From my understanding, we are refactoring the code to use a built-in interface to create databases, is that right?

NathanChase22 avatar Mar 25 '25 20:03 NathanChase22

Yeah, that's right. We're moving away from hand-written database code to programmatically generated code. We've been in an in-between state for a number of years now, and that's making working on the code a bit annoying.

So we're now looking at moving a bunch of the remaining bits over to the generated code, starting with those that follow a pretty common pattern.

Basically the way I'd normally approach this is:

  • Add a basic new setup in internal/server/db/cluster/ based on what's been done for network_integrations for example
  • Run make update-schema to generate all the new functions (.mapper.go files)
  • Remove the old DB code (in this case internal/server/db/network_peers.go)
  • Build the code again and deal with the failures. It will fail to build on every place that was calling the old logic. All of those need to then be updated one by one to use the new generated functions.
  • Send the PR and hope all the tests still clear, if not, track down the failures until things are green again.
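
As a rough illustration of the first step, the new file in internal/server/db/cluster/ pairs a plain struct with generator directives, loosely modeled on networks_integrations.go. Note this is a sketch only: the directive spellings, entity name, and struct fields below are guesses for illustration; check the current networks_integrations.go for the real syntax, since it has evolved over time.

```go
package cluster

// Sketch only: directive names and fields are illustrative guesses
// modeled on networks_integrations.go, not verified syntax.
//
//generate-database:mapper target networks_peers.mapper.go
//
//generate-database:mapper stmt -e network_peer objects
//generate-database:mapper stmt -e network_peer objects-by-Name
//generate-database:mapper stmt -e network_peer create struct=NetworkPeer
//generate-database:mapper stmt -e network_peer update struct=NetworkPeer
//generate-database:mapper stmt -e network_peer delete-by-Name
//
//generate-database:mapper method -i -e network_peer GetMany
//generate-database:mapper method -i -e network_peer GetOne
//generate-database:mapper method -i -e network_peer Create
//generate-database:mapper method -i -e network_peer Update
//generate-database:mapper method -i -e network_peer DeleteOne-by-Name

// NetworkPeer is the database representation of a network peering
// (the field set here is invented for illustration).
type NetworkPeer struct {
	ID          int
	Name        string
	Description string
}
```

Running make update-schema then emits the corresponding .mapper.go file with the generated statements and methods.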

stgraber avatar Mar 25 '25 20:03 stgraber

How much of the code base would you say is dependent on the old DB code? Would we have to adjust any build files like Makefiles to accommodate the new functions?

What sort of tests are currently present for the database? Are we expecting no change to the end behavior after migrating?

It looks like this issue is similar to some of the other issues; how much variation is there between this issue and, say, #1802 in the approach?

NathanChase22 avatar Mar 25 '25 20:03 NathanChase22

We've opened an issue for each of the current files but the general approach and work needed to resolve it should be near identical from one to the next.

There usually aren't any changes needed to the Makefile or anything like that. It's really about changing all the call sites for the old functions to the new functions.

The tests should all be at a higher level than this, so no changes should be needed to the tests, but they will instead be a good way to ensure that the change didn't introduce any unforeseen regression.

stgraber avatar Mar 25 '25 21:03 stgraber

@stgraber Just heard back from my professor; we're willing to take on this issue.

NathanChase22 avatar Mar 27 '25 20:03 NathanChase22

Excellent, I assigned it to you!

If you have other team members working on this with you, you can have them comment in here too so I can add them to the issue.

stgraber avatar Mar 27 '25 21:03 stgraber

Yes there's also my partner, @oronila .

If we run into any issues setting up the environment, who should we contact for questions / help, and how?

NathanChase22 avatar Mar 27 '25 21:03 NathanChase22

GitHub only lets me add people who commented as assignees (to prevent this being used to spam people).

As for questions and interactions, the best approach is to comment in here or in the pull request once you have it open. That way I and the other maintainers (and some of the other contributors) will be notified and can help you out.

stgraber avatar Mar 27 '25 23:03 stgraber

Hi, I am the other member that needs to be added.

oronila avatar Mar 27 '25 23:03 oronila

Perfect, got it assigned to the two of you!

stgraber avatar Mar 28 '25 00:03 stgraber

Forked and booted up the development container; however, when I ran the script that checks for missing dependencies, it said I was missing 'incusd'. What is it and how should I download it?

And what tests should I be expected to run? Just want to make sure I have everything setup and tests working before I do anything further

NathanChase22 avatar Mar 29 '25 04:03 NathanChase22

Starting from a clean Ubuntu 24.04 system, I did:

  • apt install acl attr autoconf automake dnsmasq-base git libacl1-dev libcap-dev liblxc1 lxc-dev libsqlite3-dev libtool libudev-dev liblz4-dev libuv1-dev make pkg-config rsync squashfs-tools tar tcl xz-utils ebtables git busybox-static curl gettext jq sqlite3 socat bind9-dnsutils shellcheck flake8 pipx
  • pipx install codespell
  • apt remove --purge uidmap
  • rm /etc/subuid /etc/subgid
  • curl -L https://go.dev/dl/go1.24.1.linux-amd64.tar.gz | tar -zxf - -C /usr/local/
  • export PATH=${PATH}:/root/go/bin:/usr/local/go/bin/:/root/.local/bin/
  • git clone https://github.com/lxc/incus
  • cd incus/
  • make deps
  • export CGO_CFLAGS="-I/root/go/deps/raft/include/ -I/root/go/deps/cowsql/include/"
  • export CGO_LDFLAGS="-L/root/go/deps/raft/.libs -L/root/go/deps/cowsql/.libs/"
  • export LD_LIBRARY_PATH="/root/go/deps/raft/.libs/:/root/go/deps/cowsql/.libs/"
  • export CGO_LDFLAGS_ALLOW="(-Wl,-wrap,pthread_create)|(-Wl,-z,now)"
  • make
  • make static-analysis
  • make unit-test
  • cd test
  • INCUS_TMPFS=1 ./main.sh filtering

This installed all the needed dependencies for building Incus and running tests, then tweaked the system to behave properly, and lastly built Incus, ran the static analysis, ran the unit tests, and ran one of the system tests.

Note that if running as a regular user, you'll need to run the apt commands through sudo and you'll need to update any path containing /root/ with your user's home directory instead.

Lastly, some of the steps change the user's environment. Those need to be applied in any terminal you open, or you won't be able to find some commands or build Incus. To make this persistent, you can add the following to your user's bashrc:

export PATH=${PATH}:/root/go/bin:/usr/local/go/bin/:/root/.local/bin/
export CGO_CFLAGS="-I/root/go/deps/raft/include/ -I/root/go/deps/cowsql/include/"
export CGO_LDFLAGS="-L/root/go/deps/raft/.libs -L/root/go/deps/cowsql/.libs/"
export LD_LIBRARY_PATH="/root/go/deps/raft/.libs/:/root/go/deps/cowsql/.libs/"
export CGO_LDFLAGS_ALLOW="(-Wl,-wrap,pthread_create)|(-Wl,-z,now)"

stgraber avatar Mar 31 '25 05:03 stgraber

@stgraber If I was running this on the development container would you have different instructions?

NathanChase22 avatar Mar 31 '25 17:03 NathanChase22

Ah, no, I've sadly never used the devcontainer stuff. @breml contributed that but I don't know if he ever ran all the tests and everything from that environment.

stgraber avatar Mar 31 '25 21:03 stgraber

Both of us were able to get the environment up and running, both in the devcontainer (the only change needed was the Go version in the Dockerfile) and in an Ubuntu VM. We both ran into errors because our host machines don't support KVM, but the test ran and reported success.

For the sake of our task, do we need to be able to run KVM?

Here are our output files after running the test.

development container log

UTM VM log

NathanChase22 avatar Apr 01 '25 07:04 NathanChase22

I documented in https://github.com/lxc/incus/pull/1566 which targets of the Makefile work in the devcontainer. The unit tests should work, but it was not possible to run the complete suite of integration tests.

breml avatar Apr 01 '25 14:04 breml

For the sake of our task, do we need to be able to run KVM?

Nope, you won't be needing KVM.

Usually what I do these days is I make sure that I can:

  • Build the code (make)
  • Run the static analysis (make static-analysis)
  • Run the unit tests (make unit-test)

If that's all good, then I push to GitHub and let it run all the system tests for me, as you get a full run including complex things like Ceph and Linstor in just around an hour, whereas running all that locally would take a long time, not to mention require a bunch of setup for some of the more complex storage options. Not that much of that is really relevant in this case, as the network logic doesn't vary based on storage.

stgraber avatar Apr 01 '25 15:04 stgraber

After reviewing the network_integrations example, we now understand how the code generator can be invoked. The following is a general outline of how we plan to proceed.

PLAN OUTLINE:

  1. Identify all network_peers methods by the type of operation each performs
  2. Identify all SQL queries by the type of operation each performs
  3. Identify all auxiliary variables, structs, methods, etc. that don't fall under the code-gen scheme (in the example, there were methods like ToAPI())
  4. Create code-gen directives and add the auxiliary variables to network_peers.go
  5. Invoke code gen and check the correctness of the method headers in the mapper interface file (they should be the same)
  6. Check code correctness (it should follow similar logic)
  7. Remove the old code and compile
  8. Deal with failures wherever the old logic was called
  9. Run the unit tests and see if they pass
  10. Create the PR and have the system tests run

NathanChase22 avatar Apr 03 '25 03:04 NathanChase22

We've been working on this issue and got some of the statements to be successfully replicated. However, based on the given generator documentation and our experimentation, it looks like some of the SQL queries in the original file cannot be replicated using the generator directives.

For SQL statements used in functions like CreateNetworkPeer() or UpdateNetworkPeer() that cannot be generated because their logic extends beyond the abilities of the generator (according to the documentation, it seems you can't perform an UPDATE that filters with a WHERE on multiple keys), what's the best course of action?

EDIT: We're going to rewrite the logic around the call sites as the earlier comments suggested

NathanChase22 avatar Apr 10 '25 01:04 NathanChase22

Yeah, the generated functions are less flexible but the loss of flexibility becomes a gain in consistency, making most of our endpoints generally follow the same pattern.

You may indeed find yourself occasionally needing to open a transaction, perform a Get with the desired filter, alter the returned object with the changes to be performed, then feed that to Update along with the row ID that was returned by Get.

This turns what used to be one query into two, but it also handles some extra cases that may not have been previously handled, like ensuring that the record does currently exist, that there is only a single match for that particular condition (if desired), ...

stgraber avatar Apr 10 '25 03:04 stgraber

Coming back around to this issue. I've been doing a lot of deep digging to understand the code. Some of the methods seem pretty easy to replace with the generator, but some of the more complex SQL queries (around 15 lines, inside methods of 50-75 lines) seem unreasonable to inline into the other files with just the generator code. This is similar to what Nathan mentioned in the comment above, but I don't think we'd be able to recreate some of these methods with two queries. If I'm mistaken, please let me know (I'm just not certain my thought process is correct). If not, should I create helper methods for these long queries/methods in cluster/network_peers.go?

oronila avatar May 03 '25 00:05 oronila