serialize, compress, store and share experience

Open evilsocket opened this issue 4 years ago • 7 comments

Inside the log files, among other things, is the key information the AI needs at every epoch:

  • the observation
  • the current policy the AI picked
  • the reward score after the policy has been applied on that epoch

This information should be stored separately. That way, even if we delete the root.nn file (for example after breaking changes to the AI module), the AI can be retrained on its previous experience in just a few seconds.
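As a rough sketch of what "stored separately" could look like: one record per epoch, written as gzip-compressed JSON lines. The `Experience` class, its field types, and the file layout are assumptions made for illustration, not the project's actual format.

```python
import gzip
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Experience:
    """One epoch of experience (hypothetical layout)."""
    epoch: int
    observation: List[float]  # assumed: observation flattened to a list of floats
    policy: dict              # the parameters the AI picked for this epoch
    reward: float             # reward score after the policy was applied


def save_experiences(path, experiences):
    """Serialize and compress: one JSON object per line in a gzip file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for exp in experiences:
            fh.write(json.dumps(asdict(exp)) + "\n")


def load_experiences(path):
    """Read the compressed file back into Experience records."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return [Experience(**json.loads(line)) for line in fh if line.strip()]
```

A line-oriented format like this can be appended to epoch by epoch, and it stays readable even if the model file itself is deleted.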

Basically the gym wrapper would be replaying previous training epochs instead of running in the current one.
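A minimal sketch of that replay idea, assuming each stored epoch is a dict with `observation`, `policy` and `reward` keys. `ReplayEnv` is a hypothetical name; it only mimics the usual `reset()` / `step()` shape and does not subclass the real gym classes.

```python
class ReplayEnv:
    """Feeds recorded epochs back to the agent instead of running live ones."""

    def __init__(self, records):
        if not records:
            raise ValueError("need at least one recorded epoch")
        self._records = list(records)
        self._i = 0

    def reset(self):
        # Start again from the first recorded observation.
        self._i = 0
        return self._records[0]["observation"]

    def step(self, action=None):
        # The action is ignored: the outcome of each epoch is already recorded.
        rec = self._records[self._i]
        self._i += 1
        done = self._i >= len(self._records)
        # On the last epoch there is no next observation, so repeat the current one.
        next_obs = rec["observation"] if done else self._records[self._i]["observation"]
        info = {"recorded_policy": rec["policy"]}
        return next_obs, rec["reward"], done, info
```

Because the action is ignored, how the recorded policy gets fed to the learner is left open here; it is exposed through `info` so the training code can decide.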

It'd also be interesting to share these files among users in order to have some sort of federated learning.

evilsocket avatar Nov 12 '19 12:11 evilsocket

Having a "Self taught baby" branch and a "Federated learning" branch would be cool: one for people who want to be with their units from the start, the other for those who want a better-trained unit with less effort. I would probably still go with the self-taught baby, though.

Arttumiro avatar Nov 12 '19 12:11 Arttumiro

For federated learning, be warned that you might get a "tainted" pool. So, I believe that having a "booster" pack might be the way to go. Basically, you get 5000 randomly selected epochs from the top 1000 and use them to kickstart a brain.
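A sketch of how such a booster pack might be sampled. It assumes "top 1000" means the 1000 units with the highest mean reward, which the comment does not actually specify; `build_booster_pack` and the record layout are hypothetical.

```python
import random
from statistics import mean


def build_booster_pack(pools, top_units=1000, pack_size=5000, seed=None):
    """pools maps a unit id to its list of epoch records (dicts with a 'reward' key)."""
    rng = random.Random(seed)
    # Rank units by mean reward, best first (assumed ranking criterion).
    ranked = sorted(
        (uid for uid, epochs in pools.items() if epochs),
        key=lambda uid: mean(e["reward"] for e in pools[uid]),
        reverse=True,
    )
    # Pool together every epoch from the top-ranked units.
    candidates = [e for uid in ranked[:top_units] for e in pools[uid]]
    # Sample without replacement, capped by what is available.
    return rng.sample(candidates, min(pack_size, len(candidates)))
```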

mrseeker avatar Nov 15 '19 11:11 mrseeker

can you elaborate a bit more on the tainted pool?

evilsocket avatar Nov 15 '19 11:11 evilsocket

Tainted pool: When someone pushes log files to the brain with values that are impossible to achieve, in order to "skew" the results of a federated learning experience. An example of a "tainted" pool would be an upload in which every result is positive.
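To make the two symptoms concrete, here is a crude heuristic check over an upload's reward values. The reward bounds are placeholders, not the real range of the reward function, and `looks_tainted` is a hypothetical helper.

```python
def looks_tainted(rewards, min_reward=-1.0, max_reward=1.0):
    """Flag an upload whose rewards are out of range or suspiciously all positive."""
    rewards = list(rewards)
    if not rewards:
        return True  # nothing to learn from; treat as unusable
    # Values outside the achievable range (bounds are assumed placeholders).
    if any(r < min_reward or r > max_reward for r in rewards):
        return True
    # An upload that never contains a zero or negative result.
    return all(r > 0 for r in rewards)
```

A small honest upload could also be all positive, so this would only be a first filter.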

mrseeker avatar Nov 15 '19 11:11 mrseeker

that's relatively easy to work around ... my idea was to upload all the experiences to the grid, then quickly train a model on top of them server-side and benchmark it to determine whether that experience pool would improve or degrade its performance, so that only actually useful (and less noisy) experience buffers will be used
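Roughly, that server-side check could look like the sketch below. `train` and `benchmark` are stand-ins passed in by the caller (no such grid API is implied), and comparing against a baseline pool is one assumed way to measure "improve or decrease".

```python
def accept_pool(baseline_pool, candidate_pool, train, benchmark, min_gain=0.0):
    """Keep a candidate pool only if adding it improves the benchmark score.

    train(pool) -> model and benchmark(model) -> float are supplied by the caller.
    """
    baseline_score = benchmark(train(baseline_pool))
    candidate_score = benchmark(train(baseline_pool + candidate_pool))
    return candidate_score - baseline_score > min_gain
```

Raising `min_gain` above zero would make the filter stricter against noisy pools that only help marginally.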

evilsocket avatar Nov 15 '19 11:11 evilsocket

Is there some branch that is working on this issue?

NOMADooo avatar Apr 24 '22 13:04 NOMADooo

This went stale I believe?

mrseeker avatar Apr 24 '22 13:04 mrseeker