Serializing arcade types to disk (spritelist, etc)

Open gran4 opened this issue 2 years ago • 11 comments

Enhancement request:

Can we add basic saving and loading using pickle?

What should be added/changed?

Add basic saving and loading using pickle?

What would it help with?

It would let people save more easily and make saving and loading viable in game jams. I have code for it, but @einarf keeps telling me to discuss things like this first.

Here's the idea:

Create a function that copies a SpriteList and converts it to a plain list, calling a new .save function on each sprite. Give the save func the game so it can access other relevant SpriteLists?

Then, using that function, the user can gather any relevant information and save it with pickle. (Sprites in SpriteLists won't save)

Then, when loading it in, the user can put the Sprites into the relevant SpriteList(s) again. For each sprite, the user can access a new .load function. Similar to save, give the load func the game so it can access other relevant SpriteLists?
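A rough sketch of how the idea above could look. This is only one reading of the proposal: SaveableSprite and Game are hypothetical stand-ins rather than arcade classes, a plain list stands in for a SpriteList, and making .load a classmethod is an assumption.

```python
import pickle


class SaveableSprite:
    """Hypothetical stand-in for an arcade.Sprite subclass."""

    def __init__(self, x, y, kind):
        self.x = x
        self.y = y
        self.kind = kind

    def save(self, game):
        # Return only plain, picklable data; `game` gives access to other lists.
        return {"x": self.x, "y": self.y, "kind": self.kind}

    @classmethod
    def load(cls, data, game):
        # Rebuild a sprite from the plain data produced by save().
        return cls(data["x"], data["y"], data["kind"])


class Game:
    """Hypothetical game object; `enemies` stands in for a SpriteList."""

    def __init__(self):
        self.enemies = []


game = Game()
game.enemies.append(SaveableSprite(10, 20, "slime"))
game.enemies.append(SaveableSprite(30, 40, "bat"))

# Saving: convert the sprites to a plain list of dicts, then pickle that.
blob = pickle.dumps([sprite.save(game) for sprite in game.enemies])

# Loading: rebuild each sprite and put it back into the relevant list.
new_game = Game()
for data in pickle.loads(blob):
    new_game.enemies.append(SaveableSprite.load(data, new_game))
```

Only plain dicts go through pickle here, so nothing in the saved data depends on how the sprite classes are stored internally.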

gran4 avatar Apr 18 '23 16:04 gran4

It worked for my game

gran4 avatar Apr 18 '23 16:04 gran4

tl;dr:

  1. Using pickle for loading games is a huge security risk
  2. Using pickle for loading games creates compatibility problems
  3. Attempting a generalized save system doesn't seem worth the added complexity & other costs

It's better to use pickle only for caching the results of expensive calculations to disk locally, such as various AI / pathing data.
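A minimal sketch of that kind of local caching, assuming the cache file is only ever written and read by the same program on the same machine. load_or_compute and build_distance_table are hypothetical names, and the "expensive" calculation is a trivial placeholder.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_compute(cache_path, compute):
    """Return the cached result if present, otherwise compute and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Only acceptable because this file was written locally by this program.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    with cache_path.open("wb") as f:
        pickle.dump(result, f)
    return result


def build_distance_table():
    """Hypothetical stand-in for an expensive pathing calculation."""
    size = 4
    return {(x, y): x + y for x in range(size) for y in range(size)}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "pathing.pickle"
    first = load_or_compute(cache_file, build_distance_table)   # computes
    second = load_or_compute(cache_file, build_distance_table)  # reads the cache
```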

1. Security Issues

Saved game files are sometimes shared between arbitrary users, who should be treated as untrusted data sources. The pickle doc specifically warns against using it in such contexts:

Warning: The pickle module is not secure. Only unpickle data you trust.

It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.

This means using pickle for transferable game data is an exploitable security vulnerability which allows executing arbitrary code.
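To make the warning concrete, here is a deliberately harmless sketch of the mechanism. NotReallyASave is a hypothetical class; a real attack would substitute something far more damaging for the arithmetic expression.

```python
import pickle


class NotReallyASave:
    """Hypothetical object whose pickled form runs code when loaded."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" instead of object state.
        return (eval, ("6 * 7",))


payload = pickle.dumps(NotReallyASave())

# Loading never recreates NotReallyASave; it simply calls eval("6 * 7").
result = pickle.loads(payload)
```

The program that loads the payload does not need the class at all, so a "save file" built this way runs its code in any game that unpickles it.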

I strongly advise you to use a different save mechanism for your game. A JSON-based format is probably a decent choice, especially if you want something human readable to make debugging easier.

2. Compatibility Issues

The pickle module only supports saving certain types of data. It doesn't support any of the following:

These can show up in user, arcade, and dependency code. Anything using pyglet features risks being affected because pyglet makes heavy use of ctypes pointers internally.

Additionally, pyglet frequently uses platform-specific classes, which will likely create problems when attempting to deserialize data created on other operating systems.
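The ctypes limitation can be checked with the standard library alone. This small sketch uses a bare ctypes pointer rather than any pyglet or arcade object, so it only illustrates the underlying restriction.

```python
import ctypes
import pickle

value = ctypes.c_int(5)
pointer = ctypes.pointer(value)

try:
    pickle.dumps(pointer)
    pickled_ok = True
except Exception as error:  # CPython raises ValueError for ctypes pointers
    pickled_ok = False
    message = str(error)
```

Any object that holds such a pointer as an attribute fails in the same way unless it customizes how it is pickled.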

3. The Price of Generalized Save Systems

tl;dr: Even without the concerns raised by pickle, trying to build a generalized save system doesn't seem appropriate for arcade right now.

To my understanding, the only reliable way of making a generalized save system is heavily restricting the data types of the game or engine to a reliable & trusted subset of types.

For example, it's theoretically possible to write a game engine to easily map to JSON's 7 data types. This could be very useful for building large, ECS-focused games which need strong consistency, such as 2D games in the style of the Elder Scrolls or Fallout series.

In addition to noting how famous these games are for their numerous bugs, you should also keep the following in mind:

  1. The required restrictions conflict with arcade's goals of beginner-friendliness
  2. Generality often makes things unwieldy (e.g. Java's XML-based Spring framework)
  3. Bridging the first two requires writing a lot of adapter code
  4. The smaller games most arcade users build don't require items 2 & 3 of this list
  5. It may take a lot of work to maintain this system if it were built

Combined with the issues in the previous two sections, it seems inappropriate to attempt a generalized save system, especially given the issues it may create for the Rust & web projects.

pushfoo avatar Apr 19 '23 00:04 pushfoo

What if it was in the arcade examples so people could use the code if they wanted to?

gran4 avatar Apr 19 '23 02:04 gran4

Pickle isn't human-readable, making it very easy to decipher and decrypt. And it has very little security, as @pushfoo stated. Anybody can just use a script to read and modify the contents. It's more useful to create a custom decrypting algorithm for your game, so people can't cheat. Also, pickle can only support a few data types, making it not very useful for saving data in a game. Unless there is a very good reason, we shouldn't add an example to arcade's library.

eschan145 avatar Apr 19 '23 16:04 eschan145

The security problem with pickle is that it allows for arbitrary code execution. If your game used it and someone shared a modified save file, for example, they could make it run any code they wanted on your system. Pickle should generally only be used when you fully trust both where the content comes from and where it is consumed.

Making a fully generalized save/load system I would say is borderline not possible. Each game really just needs to know what it needs in order to re-load and serialize that data to disk some way.

Trying to prevent cheating is a mostly useless endeavour because there is simply not a way in existence to save the data such that the client isn’t able to modify it. The only way to achieve that would be if you are exclusively storing the save files on a server you control instead of locally with the client.

I really don’t see a meaningful way to include this functionality in Arcade; the best I think we can do is have some examples.

Cleptomania avatar Apr 19 '23 20:04 Cleptomania

EDIT: Clepto posted as I was writing this, and already covered the main points.

What if it was in the arcade examples so people could use the code if they wanted to?

I'll get to that at the end of this comment.

tl;dr

  1. Anti-cheat obfuscation isn't worth it for arcade or single-player Python games
  2. The larger problem is that different game types need to store entirely different data
  3. To cover saving in the doc, we need to either improve the GUI widgets or figure out how to make legible & minimal examples

Why Anti-Cheat is Misguided

Anybody can just use a script to read and modify the contents.

This isn't the type of security I meant. The same is true for JSON.

It's more useful to create a custom decrypting algorithm for your game, so people can't cheat.

In my opinion, this is misguided on multiple levels.

First, writing your own cryptography tends to be a bad idea for things that matter unless you're an expert. It's a much better idea to use more proven crypto libraries to wrap serialized data.
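As a rough sketch of wrapping serialized data with a proven primitive instead of a homemade scheme, here is Python's standard hmac module used to detect edits to JSON save data. SECRET_KEY, sign, and verify are hypothetical names, and the key ships with the game.

```python
import hashlib
import hmac
import json

# Hypothetical key; it lives in the game's own code on the player's machine.
SECRET_KEY = b"hypothetical-key-shipped-with-the-game"


def sign(data):
    """Serialize data to JSON and attach an HMAC-SHA256 tag."""
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return {"body": body.decode("utf-8"), "tag": tag}


def verify(wrapped):
    """Return the saved data, or raise ValueError if the body was changed."""
    body = wrapped["body"].encode("utf-8")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, wrapped["tag"]):
        raise ValueError("save data was modified")
    return json.loads(body)


wrapped = sign({"level": 3, "gold": 120})
restored = verify(wrapped)

# Editing the body without recomputing the tag is detected.
tampered = dict(wrapped, body=wrapped["body"].replace("120", "999999"))
try:
    verify(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True
```

This catches casual edits only. Anyone who reads the key out of the game can recompute the tag.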

Second, even if you use well-known crypto, the key and the encrypted data are both on the user's system. This means you don't have security, just obfuscation. Consider:

  • Attaching a debugger to a Python process is trivial
  • Messing with the stack and variable names makes debugging harder

Cheaters have more time than you, and can be motivated to cheat even more by the addition of anti-cheat features. Anti-cheat isn't worth the effort for single-player Python or arcade projects.

The Problem with Save Game Formats

The larger problem is that different types of games might need completely different types of save data structures to be efficient:

  • Single player puzzle games might get away with only storing an int, a tuple of ints for the board size, a list for the current board state, and a list of Tuple[str, int] for the local high scores
  • An action survival game might need to store position and velocity data in addition to player character data
  • An RPG might need to store complex position, map, and interrelation data that is tedious to serialize

As I outlined earlier, trying to make a system which covers all these cases will be very hard. The result will not only be less efficient than storing only the data needed, but it may also be more brittle.

For example, imagine if you had a perfectly general save game format which serializes all arcade objects to and from disk based on their properties. It's easy to get up and running, but it will break if you change the name or image size of a texture file. The greater abstraction involved in reading and writing only the necessary data can make your game and its logic more durable and less complicated.
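As a minimal sketch of storing only the necessary data, here is what the puzzle-game case could look like with JSON. The field names and the save_puzzle / load_puzzle helpers are hypothetical and not part of arcade.

```python
import json
import tempfile
from pathlib import Path


def save_puzzle(path, level, board_size, board, high_scores):
    """Write only the data the puzzle game needs, as human-readable JSON."""
    data = {
        "level": level,
        "board_size": list(board_size),
        "board": board,
        "high_scores": [[name, score] for name, score in high_scores],
    }
    Path(path).write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_puzzle(path):
    """Read the save back, restoring tuples that JSON stores as lists."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return (
        data["level"],
        tuple(data["board_size"]),
        data["board"],
        [(name, score) for name, score in data["high_scores"]],
    )


with tempfile.TemporaryDirectory() as tmp:
    save_file = Path(tmp) / "save.json"
    save_puzzle(save_file, 3, (2, 2), [[0, 1], [1, 0]], [("AAA", 900), ("BOB", 450)])
    loaded = load_puzzle(save_file)
```

Nothing in this file refers to textures or arcade objects, so renaming an image does not invalidate old saves.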

I'll think about adding helper functions, decorators, or types in the future to make game saving & loading easier, but I'd like to discuss this with @Cleptomania after I gain a better understanding of our Tiled parser. That will have to wait until after PyCon since there are more urgent tickets to address right now.

Legibly Presenting Save Examples

What if it was in the arcade examples so people could use the code if they wanted to?

This may be useful if there were small, very specific examples for different mini games. Since this is a very broad topic, they may be better as a programming guide section. I'd be willing to work on writing one based on these comments.

However, there are some difficulties with this:

  1. Where would the games save to? Standardizing this is hard because people may want different behavior.
  2. How would we present this to the user? A View + menu? Rows of buttons?
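For the first question, one possible default is a per-user data directory chosen by platform convention. This is only a sketch: default_save_dir and the game name are hypothetical, it does not create the directory, and some games will want different behavior.

```python
import os
import sys
from pathlib import Path


def default_save_dir(game_name):
    """Guess a per-user save directory following common platform conventions."""
    home = Path.home()
    if sys.platform == "win32":
        base = Path(os.environ.get("APPDATA") or home / "AppData" / "Roaming")
    elif sys.platform == "darwin":
        base = home / "Library" / "Application Support"
    else:
        # Linux and similar: follow the XDG base directory convention.
        base = Path(os.environ.get("XDG_DATA_HOME") or home / ".local" / "share")
    return base / game_name


save_dir = default_save_dir("my_arcade_game")
```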

I'll think about this since it might still be a good idea, and you should think about it too. Please feel free to post mockups or prototypes in this thread or on Discord.

pushfoo avatar Apr 19 '23 20:04 pushfoo

The Problem with Save Game Formats

The larger problem is that different types of games might need completely different types of save data structures to be efficient:

  • Single player puzzle games might get away with only storing an int, a tuple of ints for the board size, a list for the current board state, and a list of Tuple[str, int] for the local high scores
  • An action survival game might need to store position and velocity data in addition to player character data
  • An RPG might need to store complex position, map, and interrelation data that is tedious to serialize

As I outlined earlier, trying to make a system which covers all these cases will be very hard. The result will not only be less efficient than storing only the data needed, but it may also be more brittle.

Well, I think we have to make the user do the saving based on their use case. All I was talking about adding was a way to allow SpriteLists to be pickled.

It's more useful to create a custom decrypting algorithm for your game, so people can't cheat.

To me, it seems like removing the C stuff from a SpriteList would still be useful if you have your own decrypting algorithm. I think this might have a use case... Maybe

gran4 avatar Apr 20 '23 16:04 gran4

You can still pickle lazy spritelists. It will include textures and whatnot. The amount of stuff it will save might be very overkill.

I don't think this is a terrible idea, but doing it the right way will require quite a bit of work. It's not uncommon in games to "bake" data to greatly reduce loading times.

einarf avatar Apr 20 '23 18:04 einarf

You can still pickle lazy spritelists. It will include textures and whatnot. The amount of stuff it will save might be very overkill.

I don't think this is a terrible idea, but doing it the right way will require quite a bit of work. It's not uncommon in games to "bake" data to greatly reduce loading times.

Like what could be serialized? Would we use pickle or something else?

gran4 avatar Apr 29 '23 19:04 gran4

Like what could be serialized? Would we use pickle or something else?

You can pickle lazy spritelists: SpriteList(lazy=True). They only contain members that can be pickled. The GPU resources are created on the first draw() or when you call initialize().

Have you compared the loading time of a pickled spritelist to the time it takes to create it programmatically?
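One way to measure this could be sketched with timeit. Plain dicts stand in for sprites here and arcade is not imported, so the numbers only illustrate the method and say nothing about real SpriteList performance.

```python
import pickle
import timeit


def build_sprites(count):
    """Stand-in for creating sprites programmatically (plain dicts, not arcade)."""
    return [
        {"x": float(i % 100), "y": float(i // 100), "texture": "tile.png"}
        for i in range(count)
    ]


blob = pickle.dumps(build_sprites(10_000))

# Run each approach the same number of times and compare the totals.
build_time = timeit.timeit(lambda: build_sprites(10_000), number=20)
load_time = timeit.timeit(lambda: pickle.loads(blob), number=20)

print(f"build: {build_time:.4f}s  unpickle: {load_time:.4f}s")
```

For a real comparison, the two lambdas would be replaced with the game's own setup code and a pickle.loads call on a lazy SpriteList.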

einarf avatar Apr 29 '23 20:04 einarf

Nope, I don't know how to accurately check that. Also,

You can pickle lazy spritelists: SpriteList(lazy=True). They only contain members that can be pickled. The GPU resources are created on the first draw() or when you call initialize().

You could use it on any SpriteList if you take the Sprites out of it. Creating and using a .save and .load method on the sprites gives it flexibility. It was enough for my game (I used pickle because it felt easier). Combined with giving the sprites the game information and SpriteLists, it is flexible. Is there anywhere the arcade library can help with it though?

It doesn't seem like it.

gran4 avatar Apr 30 '23 03:04 gran4

I don't think this will ever happen. It's a huge pain to maintain and we can barely maintain what we have now.

If someone wants to create and maintain an external library here, I'm all for it.

einarf avatar Jul 12 '24 16:07 einarf