Allow choosing between compile-time and runtime generation of large attack tables.
I'm trying to use shakmaty in a WebAssembly project that will run in a browser. Unfortunately the static attack tables (particularly shakmaty::bootstrap::ATTACKS, which is 700KB on its own) are increasing my binary size by an order of magnitude (~80KB -> ~800KB). To address this, I would like a Cargo feature that controls whether the tables are generated at compile time and embedded in the binary, or are generated at runtime.
With this PR, when the runtime-lut feature is enabled, ATTACKS and RAYS are boxed (Box<[u64; _]>) and use the lazy_static macro to defer initialisation to first use at runtime.
The code is not really the prettiest, but I believe the maybe_const_fn macro is the only way to avoid writing out the table generation functions twice, once for array and once for Box. It would have been possible to simply call Box::new(init_magics()), which semantically works, but causes the whole LUT to be pushed onto the stack while it's generated, which triggered a stack overflow for me building against the wasm32-unknown-unknown target.
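To illustrate the stack issue, here is a minimal sketch of filling a boxed array directly on the heap. It is not the code in this PR: it uses std's OnceLock instead of lazy_static so it has no dependencies, and TABLE_LEN, entry, init_boxed and table are hypothetical names with a placeholder fill function standing in for the real magic table generation.

```rust
use std::convert::TryFrom;
use std::sync::OnceLock;

// Hypothetical length, chosen to give roughly 700 KB of u64 entries.
const TABLE_LEN: usize = 88_772;

// Placeholder for the real per-entry computation (assumption, not shakmaty code).
fn entry(i: usize) -> u64 {
    (i as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

// Build the table in a heap-allocated Vec, then convert it to a boxed array.
// Unlike Box::new(make_array()), no full-size array is ever placed on the stack.
fn init_boxed() -> Box<[u64; TABLE_LEN]> {
    let mut v = vec![0u64; TABLE_LEN];
    for (i, slot) in v.iter_mut().enumerate() {
        *slot = entry(i);
    }
    Box::<[u64; TABLE_LEN]>::try_from(v.into_boxed_slice())
        .expect("vector length equals TABLE_LEN")
}

// Lazily initialised on first access, similar in spirit to lazy_static.
static TABLE: OnceLock<Box<[u64; TABLE_LEN]>> = OnceLock::new();

fn table() -> &'static [u64; TABLE_LEN] {
    TABLE.get_or_init(init_boxed)
}
```

The first call to table() pays the generation cost; later calls return the same allocation.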
Possible concerns:
- New (small, popular) dependency on lazy_static when feature is enabled.
- The ATTACKS and RAYS variables now have different types depending on a feature flag. It would be possible to make them both &'static [u64; _] (using Box::leak for the box version), but since they are internal-only and Box derefs to an array anyway this didn't seem necessary.
- I didn't bother to implement runtime generation for individual piece tables (KNIGHT_ATTACKS etc.) since they are much smaller. If the consistency is desirable though this approach could be extended to all static tables.
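For reference, the Box::leak alternative from the second point could look roughly like the sketch below, where both configurations expose the same &'static [u64; N] type. This is an illustration only: N, entry and table are hypothetical, the table is deliberately tiny, and OnceLock from std stands in for lazy_static.

```rust
// Hypothetical small table; the real ATTACKS table is far larger.
const N: usize = 64;

// Placeholder per-entry computation, written as a const fn so that the
// compile-time and runtime paths can share it.
const fn entry(i: usize) -> u64 {
    (i as u64).wrapping_mul(0x0101_0101_0101_0101)
}

// Default: the table is computed at compile time and embedded in the binary.
#[cfg(not(feature = "runtime-lut"))]
static TABLE: [u64; N] = {
    let mut t = [0u64; N];
    let mut i = 0;
    while i < N {
        t[i] = entry(i);
        i += 1;
    }
    t
};

#[cfg(not(feature = "runtime-lut"))]
fn table() -> &'static [u64; N] {
    &TABLE
}

// With the feature enabled: generate on the heap at first use and leak the
// box, so callers see the same &'static [u64; N] type in both configurations.
#[cfg(feature = "runtime-lut")]
fn table() -> &'static [u64; N] {
    use std::convert::TryFrom;
    use std::sync::OnceLock;
    static TABLE: OnceLock<&'static [u64; N]> = OnceLock::new();
    *TABLE.get_or_init(|| {
        let v: Vec<u64> = (0..N).map(entry).collect();
        let boxed = Box::<[u64; N]>::try_from(v.into_boxed_slice())
            .expect("vector length equals N");
        let leaked: &'static [u64; N] = Box::leak(boxed);
        leaked
    })
}
```

The leaked allocation is never freed, which is fine for a table that lives for the whole program anyway.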
Exploring the space of possible tradeoffs, there's also Hyperbola Quintessence, which does not need such large tables even at runtime. For example, https://github.com/niklasf/chessops uses it to reduce bootstrap time, which could be important in web applications.
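As a rough sketch of the idea (not the chessops or shakmaty implementation), Hyperbola Quintessence computes sliding attacks along one line with the o ^ (o - 2s) subtraction trick, applied once forwards and once on a reversed board. The function names are hypothetical, the line masks are computed on the fly rather than read from small per-square tables, and only rook attacks are shown. Byte swapping is enough to reverse a file because each rank holds at most one bit of it, while ranks need a full bit reversal.

```rust
// Attacks along one line. `s` is the slider's bit, `mask` is the line without
// the slider's own square, and `rev` mirrors the line (hypothetical helper).
fn line_attacks(s: u64, mask: u64, occupied: u64, rev: fn(u64) -> u64) -> u64 {
    let o = occupied & mask;
    // Positive direction: borrow propagates up to the first blocker.
    let forward = o.wrapping_sub(s << 1);
    // Negative direction: same subtraction on the mirrored board.
    let backward = rev(rev(o).wrapping_sub(rev(s) << 1));
    (forward ^ backward) & mask
}

// Rook attacks for square index `sq` (0 = a1, 63 = h8) given all occupied squares.
pub fn rook_attacks(sq: u32, occupied: u64) -> u64 {
    let s = 1u64 << sq;
    let file = (0x0101_0101_0101_0101u64 << (sq & 7)) & !s;
    let rank = (0xFFu64 << (sq & !7)) & !s;
    line_attacks(s, file, occupied, u64::swap_bytes)
        | line_attacks(s, rank, occupied, u64::reverse_bits)
}
```

For example, a rook on a1 with blockers on a4 and d1 attacks a2, a3, a4, b1, c1 and d1, with no lookup table beyond the two masks.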
I have now implemented enough of it for a benchmark, and see "only" a ~20% perft performance drop. Maybe that's the way to go then, instead of runtime-generated magic tables?