
Comprehensive list of enhancements

Open koolzz opened this issue 6 years ago • 7 comments

Possible new features/updates for onvm: WIP, planned, and finished

Description:
This will serve as a pinned issue with tasks that need to be implemented. This is easier to edit than the projects board and provides a quick overview. We might utilize the Projects board for tasks that are actually assigned and under development.

When a task is WIP, it will have a checkbox next to it and an owner/Projects link after it. When a task is finished, the box will be checked and the task moved to the DONE section.

Adding or Claiming tasks:
If anyone has other ideas for OpenNetVM, add a comment with a description and I'll append them to the issue. Similarly, if you want to claim and start a task, drop a comment and I'll assign it/move it to the project board.

Overall improvements, scripts, docs, other:

  • Any reason to maintain a wiki section in the main repo? Would updating docs be better? Does anyone care or want to do it? Discuss
  • Docker documentation & setup needs an update

NF related:

  • Add a shutdown callback, similar to nf_setup but invoked on shutdown; this can be used to print out average stats and other useful info. #179
  • Improved router NF: routes based on rules defined in the config, but can also match on other parameters (port/flags/seq) from the ip_hdr, udp_hdr, packet payload, etc. Could use lpm?
  • Based on the NF-to-NF messaging we can implement complex experiments that require multiple NFs that communicate with each other. #297
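
For the router NF idea above, longest-prefix matching (LPM) picks the most specific rule whose prefix contains the destination address. Below is a minimal Python sketch of the lookup logic only. The rule table, the service IDs, and the `build_table`/`lookup` helpers are all made up for illustration; an actual NF would be written in C and would more likely rely on an existing LPM implementation than on a linear scan like this.

```python
import ipaddress

# Hypothetical rule table: CIDR prefix -> destination service ID.
RULES = {
    "10.0.0.0/8": 2,
    "10.1.0.0/16": 3,
    "0.0.0.0/0": 1,  # default route
}

def build_table(rules):
    """Parse prefixes once and sort them from most to least specific."""
    table = [(ipaddress.ip_network(prefix), dst) for prefix, dst in rules.items()]
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table

def lookup(table, addr):
    """Return the service ID of the longest matching prefix, or None."""
    ip = ipaddress.ip_address(addr)
    for net, dst in table:
        if ip in net:
            return dst
    return None

TABLE = build_table(RULES)
print(lookup(TABLE, "10.1.2.3"))     # most specific match is 10.1.0.0/16
print(lookup(TABLE, "10.9.9.9"))     # only 10.0.0.0/8 and the default match
print(lookup(TABLE, "192.168.0.1"))  # falls through to the default route
```

Because the table is sorted by prefix length, the first hit is always the longest match; matching on ports or flags would be an extra filter on top of this.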

Scripting:

  • We could provide helper scripts to launch linear and circular chains of NFs, probably using Python & JSON for this #184
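
A rough sketch of what such a helper could look like, in Python. The JSON layout (`topology`, `nfs`, `name`, `service_id`) and the command template are hypothetical and only meant to show how linear and circular chains differ; the script prints placeholder command lines instead of launching anything, since the real per-NF launch invocation is not pinned down here.

```python
import json

# Hypothetical chain description; the field names are illustrative only.
CONFIG = json.loads("""
{
  "topology": "circular",
  "nfs": [
    {"name": "speed_tester", "service_id": 1},
    {"name": "basic_monitor", "service_id": 2},
    {"name": "firewall", "service_id": 3}
  ]
}
""")

def next_hops(config):
    """Return (name, service_id, destination_id) for every NF in the chain.

    In a linear chain the last NF has no destination (None); in a circular
    chain it points back at the first NF.
    """
    nfs = config["nfs"]
    hops = []
    for i, nf in enumerate(nfs):
        if i + 1 < len(nfs):
            dst = nfs[i + 1]["service_id"]
        elif config["topology"] == "circular":
            dst = nfs[0]["service_id"]
        else:
            dst = None
        hops.append((nf["name"], nf["service_id"], dst))
    return hops

def launch_commands(config, template="launch {name} --service-id {sid} --dest {dst}"):
    """Render one placeholder command line per NF (the template is made up)."""
    return [template.format(name=n, sid=s, dst=d) for n, s, d in next_hops(config)]

for cmd in launch_commands(CONFIG):
    print(cmd)
```

Keeping the chain computation (`next_hops`) separate from the command rendering means the same JSON file could drive whatever launch mechanism we settle on.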

Web stats (currently all assigned to @kevindweb, link):

  • [ ] Fix the bug where you click on one panel but the other one is highlighted
  • [ ] Show all allocated cores (not just the ones used)

Cores/Scaling/Magic:

  • [ ] Automatic scaling for NFs. Take speed tester, for example: based on some metrics in onvm_mgr (some queue data might be useful), decide when speed tester is overloaded and scale out more speed testers. Assigned to @ratnadeepb, link
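
One way the scaling decision could look, as a Python sketch. The queue-occupancy metric, the thresholds, and every name below (`NFStats`, `scale_decision`) are assumptions for illustration; onvm_mgr is written in C and the real metrics would have to come from its own stats.

```python
from dataclasses import dataclass

@dataclass
class NFStats:
    """Hypothetical per-service snapshot; not an actual onvm_mgr structure."""
    service_id: int
    instances: int
    rx_queue_used: int   # packets currently waiting in the RX queue
    rx_queue_size: int   # queue capacity

def scale_decision(stats, high=0.8, max_instances=4):
    """Return how many extra instances to start for this service (0 or 1).

    Scale up by one when the RX queue is more than `high` full and the
    instance cap has not been reached. Both thresholds are placeholders.
    """
    occupancy = stats.rx_queue_used / stats.rx_queue_size
    if occupancy > high and stats.instances < max_instances:
        return 1
    return 0

overloaded = NFStats(service_id=1, instances=1, rx_queue_used=900, rx_queue_size=1024)
idle = NFStats(service_id=1, instances=1, rx_queue_used=10, rx_queue_size=1024)
print(scale_decision(overloaded), scale_decision(idle))
```

A real policy would probably want hysteresis (for example, requiring the queue to stay full across several samples) so that short bursts do not trigger scaling.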

Performance:

  • Overall performance tweaks and monitoring which parts of the code lead to slowdowns
  • Separate the NF startup message processing from the stats; currently, it's all being done in one thread

Multi-Host ONVM

  • service chaining across hosts (VXLAN encapsulation)
  • See Phill's old code

CloudLab:

  • Update main profile
  • Provide complex templates; a 3-node setup with a client, an onvm middlebox with a local server, and a remote server is more fun than just a bare onvm installation

mTCP:

  • Fix scaling: with the current version, scaling probably won't work because nf_info is stored in a global variable
  • Performance tests, optimizations & other things

mOS

  • Essentially mTCP for middleboxes. We already had a working port at https://github.com/Grace-Liu/onvm-mos and might want to update it

Snort

  • Update Snort. An imaginary prize goes to whoever gets it up and running; currently, @ratnadeepb is trying to set it up

Cluster

  • Have a webpage that shows whether the nodes are up; it can just ping all nodes

Other endpoint application: We could explore other applications, such as the dpdk nginx module that can be used with onvm as examples of complex NFs. Before doing this we have to think about what benefits it would give us to port something.

Also, we should look at f-stack, it seems similar to our project.

DONE (As of Spring 2020):

  • [x] Check ports before trying to send (running basic monitor with speed_tester or load_generator will crash the onvm_mgr) #160
  • [x] Check that no NFs are running before starting the onvm_mgr (when the mgr dies but an NF stays alive, relaunching the mgr will fail without a very descriptive error message) #180
  • [x] Half of the nodes are not really alive, fix them
  • [x] Migration of NFs to another core, Assigned to @koolzz pr link #87
  • [x] onvm_mgr launch script should check if any NFs are running before launching #180

DONE (Spring 2019):

  • [x] Update the good old ./install.sh / setup_environment.sh scripts as they fail to deal with hugepages. Assigned to @koolzz
  • [x] Ubuntu 18.04.01 update? Assigned to @dennisafa, link
  • [x] Get ~~a sysadmin~~ sysadmins @kevindweb @dennisafa
  • [x] Ensuring we only have one PR being tested at once for CI. @koolzz
  • [x] Finish the Firewall NF #225 Assigned to @dennisafa, pr link
  • [x] Web stats improvements:
    • Add new events? The easy one is scaling
    • After the pthread pr we will have core info -> we need to put that onto the web page.
  • [x] Reuse instance IDs (basically reuse onvm_nf structs); it doesn't make sense for us not to do this. Assigned to @koolzz, pr here

koolzz avatar Feb 06 '19 05:02 koolzz

@twood02 I would be interested in working on the auto-scaling problem. It seems to me that having the onvm manager identify load conditions and find opportunities to remedy them by automatically scaling some or all of the running NFs could be an interesting problem to work on.

ratnadeepb avatar Feb 07 '19 01:02 ratnadeepb

@twood02 @koolzz As noted yesterday, I would like to work on the main website, additional stats, and other improvements to the online platform. This can include the onvm_web functionality that was just merged, possibly including the pthreads work that we talked about in the meeting. I will keep in touch about my progress. In addition, @dennisafa and I are learning about the gwcloudlab and getting started with systems administration by giving access to nodes, which we can build upon in the future.

kevindweb avatar Feb 07 '19 16:02 kevindweb

I'll make cards in the project board by the end of the week

koolzz avatar Feb 07 '19 18:02 koolzz

I'll work on testing out ONVM on Ubuntu 18.04.1 LTS as well as sysadmin work and the firewall code. I found a blank 18.04 profile on cloudlab that I'll initiate today.

dennisafa avatar Feb 07 '19 19:02 dennisafa

If people are looking for even more tasks and we want to expand CI further beyond #299, we can add the following:

  1. Dispatching different workloads to CI worker nodes
  2. Ensuring we only have one PR being tested at once

AaronCoplan avatar Feb 08 '19 05:02 AaronCoplan

@koolzz I will work on the "onvm_mgr launch script should check if any NFs are running before launching" scripting task, if you want to update this. #180 fixes this, if you want to check that and update here.

kevindweb avatar Jan 06 '20 05:01 kevindweb

@kevindweb

> @koolzz I will work on the onvm_mgr launch script should check if any NFs are running before launching scripting task if you want to update this. #180 fixes this if you want to check that and update here.

You should have enough repo permissions to edit the issue, lmk if that's not the case

koolzz avatar Jan 06 '20 12:01 koolzz