flux icon indicating copy to clipboard operation
flux copied to clipboard

[WIP] Refactor serviceManager and explorerService

Open MorningLightMountain713 opened this issue 1 year ago • 1 comments

This is far from finished - but have tested the blockProcessor, and serviceManager - both work.

Tests will be failing Tests will be missing No linting Probably many bug still

I will udpate this pull with tasks once I have them figured out.

What this pull aims to do:

Simplify a lot of the code. I have found the serviceManager and explorerService overly complex and hard to follow. (same with the trySpawningGlobalApplications but I'll get this tuned first) Obviously, a lot of work has gone into this code over the years, and you can see how things have evolved over time, with different requirements.

Complex / "tricky" code is hard to maintain, and hard to debug.

serviceManager

I have added in some high level apis to set timers and things, with human readable strings, so if you want to set a recurring function, you can do that with runEvery and runAfter

The actual config has been abstracted away into a couple of Maps, and the config / logic seperated. Ideally, I would like to move the config off entirely into a yaml config file but will leave that for now.

For example:

The below will get loaded into a Map, this will run the app hashes check every thirty minutes, but will wait a random amount of time between 15m and 30m before starting. It also stores the timer in a map, so it can be cancelled on demand

[appsService.continuousFluxAppHashesCheck, { schedule: '30m', afterDelay: randomMsBetween('15m', '30m') }],

I have probably messed up the timing of some of the elements above, but that is easy to move things around and I'll dial that in during testing.

Also now keep track of all the timeouts and intervals, so that you can actually call stopFluxFunctions and it will shut down gracefully. Obviously, this still needs work as there are still some routines that aren't captured by this.

explorerService

Thie was quite complex, due to the recursion being used and the requirement to know when the block processing has finished for db access (and to not torch the db).

I've removed the recursion and introduced an awaitable lock, in conjunction with a custom abortController This allows for a simple public API; for example, you can await stopBlockProcessing and it will:

  • interrupt the while loop
  • if there are any sleeps being awaited on, interrupt those
  • if there are any db routines running, wait for them to finish

This is how simple stopBlockProcessing is now:

async function stopBlockProcessing() {
  if (bpc.locked) {
    await bpc.abort();
    bpc = new BlockProcessorController();
  }
}

Linting

I also suggest disabling the eslint no-restricted-syntax rule for the ZelBack folder (leave in place for the front end) it doesn't really serve any purpose, all it's doing is peppering code comments everywhere when someone uses it. For NodeJS using a for loop isn't expensive, and modern JS they all run about the same speed. Should keep the no-await-in-loop as that ones pretty good - makes you think wether you should be awaiting or gathering the promises at the end.

randomStuff

I've added a sleep method to the global namespace. This makes a lot of sense to me, it's more of a universally accepted and understood method than delay, it also better describes what is happening as you are forfeiting control of the main loop. (in my opinon) I like the feel of having it on the global namespace - not sure if this is a bad practice or not.

The nodemon stuff - Just playing around here at the moment. I don't think Flux should be aiming to use Nodemon longterm (it's a development tool right?). I could be wrong here. Maybe I'm missing something, but quite easy to use chokidar and do this in house. In my opinon, we should be doing everything we can to not restart the FluxOS service externally. Reloading modules can be done on the fly. The api should be up all the time.

Happy for any feedback - maybe this isn't a direction you want to go in, all good, just let me know.

MorningLightMountain713 avatar Feb 24 '24 16:02 MorningLightMountain713

Update summary. Have a bunch more to do but probably stop here and get tests working (a lot will be broken), so we can look at maybe merging into develop as I'm starting to get a few merge conflicts when rebasing.

Have rebased this branch onto develop, so it's current.

I still have some work to do to make this backwards compatible with loggers, eg flux_node_viewer.sh and maybe for an interum will log both types.

Additional features:

  • ALL child_process commands now go through one function. Completely removed the need for node-cmd. This makes it a LOT easier to handle errors and logging, as that is all done in the function. Highly configurable. I was doing this via a thread to test some console logging stuff, but turned out unnecessary. Can also run commands "exclusively" so any other calls to run that command will wait for this one to finish. All commands are now run without a shell. Alongside the performance gains, this is also much easier to debug / read / less error prone.

  • All child_process commands are now logged to debug.log.

  • As you can store the executable bit in git, have made all the helper scripts executable so they don't need to be run with bash. Can now run them directly like so:

           const cwd = path.join(__dirname, '../../../helpers');
           const scriptPath = path.join(cwd, 'fluxwatchtower.sh');
           const { stdout, error } = await serviceHelper.runCommand(scriptPath, { cwd });
    
  • Fixes console output - For me (I'm assuming everyone else too), running node app.js was unusable - the stdout logs would end up missing carriage returns and it was a garbled mess, have fixed this, so they are pretty. I believe this is an underlying NodeJS error, I've narrowed it down to the instantiation of child_processes and threads.

  • Restored ability to use SIGINT (ctrl +c) when running via console. Prior, when the logs would start missing carridge returns, interrupts would not work - now they do.

  • Uses pino for file logging (had to implemet custom console logger to get around \r issues, otherwise would use Pino for that too). Pino is very fast, better than Winston.

  • The reason to use Pino (apart from performance) is for structured logs (json) so they can be exported. I.e. if operators want to aggregate logs on something like Splunk. Can still be piped into pino-pretty to tail logs. i.e. tail -f debug.log | pino-pretty

  • Can now log objects as well as strings with the format log.info(myObject, myString)

  • Disables console logs when no TTY is attached. (Unless overridden with --force-stdout)

  • Many improvements / bugfixes. E.g. outbound websockets now return a promise, so you can await it, instead of just sleep for 500ms. (I want to rework all the network modules but will leave that for now)

  • Moves some files around for better separation. I.e, log module is now sharable between GUI/Backend. Use entrypoints.

  • Ability to run GUI and Backend on separate processes. (default is together)

  • Logs all inbound http requests (both gui and api) to debug log, using express middleware.

  • Add help option

Screenshots:

Example of broken console logs:

Screenshot 2024-03-13 at 10 47 31 AM

New console logs:

Screenshot 2024-03-12 at 8 11 10 AM

More console logs (color is wrong as I had something selected):

Screenshot 2024-03-12 at 8 10 03 AM

File logging (piped through pino-pretty):

Screenshot 2024-03-12 at 8 10 12 AM

Raw file logging (exportable / filterable JSON) i.e. you can specify what log level you want to filter at):

Screenshot 2024-03-12 at 8 10 42 AM

Help:

Screenshot 2024-03-13 at 11 10 37 AM

MorningLightMountain713 avatar Mar 13 '24 11:03 MorningLightMountain713

⚠️ GitGuardian has uncovered 1 secret following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

🔎 Detected hardcoded secret in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
- Generic High Entropy Secret 55a33a3ffbeb70b0a9b69587b879a706cdc07432 tests/unit/fluxCommunicationMessagesSender.test.js View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secret safely. Learn here the best practices.
  3. Revoke and rotate this secret.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Our GitHub checks need improvements? Share your feedbacks!

gitguardian[bot] avatar Mar 18 '24 17:03 gitguardian[bot]

I will retire this PR. I still have the branch locally. Some of this has already been merged. I will look to get the remainder of this split out and completed once I've finsished current work.

MorningLightMountain713 avatar May 22 '24 13:05 MorningLightMountain713