Allow loading chess board from image.

Open therealmitchconnors opened this issue 7 years ago • 24 comments

Allow users to select an image of a chess board or diagram (from camera, file, or URL), and use it to load up the Board Editor with the pictured configuration. This feature concept was discussed originally in #526. This issue is for further discussion of the feature.

therealmitchconnors avatar Feb 21 '17 04:02 therealmitchconnors

My initial plan is to use @Elucidation's tensorflow_chessbot Convolutional Neural Network model as a basis for this feature. I will convert his code from Python to Java, using TensorFlow for Android and nd4j in lieu of TensorFlow on Python and numpy (with some C++/JNI for loading the trained model). The Java program will then need to be packaged into a Cordova Plugin. This plan will not work for iOS, as a separate plugin back end would need to be written in Xcode, and I do not have any OSX or iOS devices on which to develop and test.

therealmitchconnors avatar Feb 21 '17 04:02 therealmitchconnors

One problem I am running into early on: Building android tensorflow applications requires a pretty specific build environment. Bazel and JDK 8 are required, along with a few others. @veloce, what is your current build environment and automation like? For this feature, we could modify your build environment to meet the prerequisites of tensorflow for android, or I could manage building the cordova plugin, and you could install the jar (not sure exactly how this works with Cordova). Or we could come up with something else. Thoughts?

therealmitchconnors avatar Feb 21 '17 04:02 therealmitchconnors

I would recommend that you host a service that allows an image to be uploaded and responds with a FEN. That service could then be incorporated into lila or simply routed through lila. Trying to do the work on the phone just strikes me as the wrong approach (but I'm not an ML guy).

freefal avatar Mar 02 '17 02:03 freefal

As a soft counter to that, among several existing technologies already on phones, face-detect on most smartphones is now done using an ML model, and in many cases that runs in real time on a live viewfinder. The hard part is training such a model, and that's usually done offline. The actual prediction step using a trained model is usually quite light in terms of computation, just a couple of sets of matrix multiplications.
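To make the "couple of sets of matrix multiplications" point concrete, here is a minimal sketch of one dense layer followed by picking the highest-scoring class. The function names, shapes, and weights are purely illustrative assumptions; this is not code from tensorflow_chessbot, whose CNN has more layers than this.

```typescript
// Hypothetical single dense layer: scores[i] = bias[i] + sum_j weights[i][j] * input[j].
// Shapes and names are illustrative only.
function denseForward(weights: number[][], bias: number[], input: number[]): number[] {
  return weights.map((row, i) =>
    row.reduce((sum, w, j) => sum + w * input[j], bias[i])
  );
}

// Index of the largest score, i.e. the predicted class.
function argmax(values: number[]): number {
  let best = 0;
  for (let i = 1; i < values.length; i++) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}
```

Prediction with a trained model is then roughly a chain of such calls, with no gradient computation involved, which is why it tends to be cheap compared with training.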

Whether this is the appropriate solution for your app, I guess would depend on both how difficult it is getting tensorflow for android working for a good range of android versions, and the pros/cons of running a server, which can easily update a model, but is a hosted service with associated issues. It may be better initially to host a server if you are planning to run on devices that are not android such as iOS. Having it run on the phone will be much quicker though, and not reliant on an internet connection and a server.

Elucidation avatar Mar 02 '17 03:03 Elucidation

I'd like to add that I already have a trained model, courtesy of @Elucidation. For an example of an Android app using an already trained model, see https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android . The trick will be integrating this with the cordova app and existing build environment...

I would also be interested in hosting something on the lichess site that allows additional training of the model, if they have the processing power...

therealmitchconnors avatar Mar 02 '17 04:03 therealmitchconnors

As I said, I'm no ML expert. I simply suspect that your model will change over time and sending PRs to lichobile every time you improve it a bit seems inefficient. If you implement it as a simple HTTP API that takes image data as input and responds with a FEN, you get the following benefits:

  1. Android and iOS can both be supported without any Android hooks. Having single-platform features is undesirable. Most (all?) of our cordova hooks have implementations on both Android and iOS.
  2. Lila (the lichess website) could potentially allow users to directly upload images to the website and get a board editor with the position.
  3. You can improve your model at any time on the server and users will immediately get the benefit. We write the app once for the API and should never have to update the app again.
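As a rough sketch of the client side of such an API, the app could sanity-check the returned FEN before opening the board editor. The JSON shape `{"fen": "..."}` and the function names are assumptions made for illustration; nothing in this thread defines the actual response format.

```typescript
// Assumed response shape: {"fen": "<FEN string>"}. The field name is hypothetical.
const PIECES = "prnbqkPRNBQK";

// Basic check of the piece-placement field: 8 ranks, each describing 8 squares.
// It is deliberately lenient (for example, it does not reject adjacent digits).
function isValidPlacement(placement: string): boolean {
  const ranks = placement.split("/");
  if (ranks.length !== 8) return false;
  for (let r = 0; r < ranks.length; r++) {
    let squares = 0;
    for (let i = 0; i < ranks[r].length; i++) {
      const ch = ranks[r].charAt(i);
      if (ch >= "1" && ch <= "8") squares += Number(ch);
      else if (PIECES.indexOf(ch) !== -1) squares += 1;
      else return false;
    }
    if (squares !== 8) return false;
  }
  return true;
}

// Parse the (assumed) JSON body and return the FEN, or throw if it looks wrong.
function parseFenResponse(body: string): string {
  const data = JSON.parse(body) as { fen?: unknown } | null;
  if (data === null || typeof data.fen !== "string") {
    throw new Error("response has no fen field");
  }
  const placement = data.fen.split(" ")[0];
  if (!isValidPlacement(placement)) throw new Error("invalid FEN placement");
  return data.fen;
}
```

Keeping this check in the app means a model update on the server cannot hand the board editor a malformed position.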

The reason I'm steering you away from Android is that I think you are going to spend lots of time making it work and I actually think you're better off not doing it directly on the phone. But with all that said, you are the one implementing this, so feel free to do as you wish.

freefal avatar Mar 02 '17 04:03 freefal

I'm not that familiar with cordova plugins and build. Cordova uses gradle and the plugin system has a mechanism to add extra dependencies.

See this doc: https://cordova.apache.org/docs/en/latest/guide/platforms/android/plugin.html#adding-dependency-libraries

I can't help you more than this, sorry. You'll have to figure out the rest on your own.

veloce avatar Mar 02 '17 07:03 veloce

@freefal, you make great points, both in regards to the need to update the model, as well as about cross-platform availability. Let me take a step back for a moment and explain why I have arrived at this architecture, and perhaps we can find ways to improve it.

Ideally, my goal would be that users anywhere could utilize this feature, as well as submit feedback when the model makes a mistake. There are a handful of hurdles to overcome, mostly related to the computationally complex nature of Machine Learning. As I'm sure you are aware, Lila runs on donated hardware, and we need to make sure any feature we propose there would not overwhelm their systems. This is why the Stockfish Engine executes primarily in JavaScript on the client side, rather than in a client/server configuration. It seems to me (I do not have data to back this up) that executing the ML Model on Lila's servers could potentially be a significant strain on them, and so I have pursued client-side options for this feature. Do we have a way of testing these assumptions?

Ideally, I would develop this feature in Javascript, so that it could be run in the browser, or natively in Cordova (on both iOS and Android), but I know of no Tensorflow libraries for Javascript (https://github.com/node-tensorflow/node-tensorflow looks awesome, but appears to have stalled in the design phase). This leaves us the option of developing a Cordova Plugin that abstracts the native Tensorflow implementation away, which is complex, but feasible for Android and possible for iOS. Sadly, this divides the project into three distinct code bases, so I'd love another option if you see a viable one. This Issue in particular focuses on Android because I do not have any Apple devices, and so my ability to develop and test the iOS side of the Plugin is severely limited.

Finally, regarding providing additional training to the model, I believe this is only feasible through a server model, and is computationally complex enough that it would likely be prohibitively expensive to run on a truly free service. Additionally, protecting the service from adversarial examples and denial of service attacks would be challenging. Once the model has been trained, I suspect that distribution of the updates would be trivial, as we could have the app check for updates and download them periodically.

I'd love to hear your (and others') feedback on this, as if it is feasible to do this on Lila's servers, things get much easier very quickly... Do you know someone from Lila who could help us understand the computational limitations they have, as well as how much bandwidth we can use (uploading images could become expensive on AWS, not sure if they pay per KB or not).

therealmitchconnors avatar Mar 02 '17 16:03 therealmitchconnors

@Elucidation, perhaps you could weigh in on what environment you are running the chess bot from, and what sort of processing power you need to run and train the model...

therealmitchconnors avatar Mar 02 '17 16:03 therealmitchconnors

Right now /u/ChessFenBot runs as a docker image on a google cloud instance, running a ubuntu variant.

I happened to already have an n1-standard-1 instance running for other work. It runs several of my bots and other projects.

Over the last couple of weeks the average CPU utilization for this bot has been < 2%, memory usage hovers around 170 MB, avg 100Kb/s network input. This could probably run from a raspberry pi.

Over the last year chessfenbot has only processed around 800 chessboard images (and discarded many more failures say 10-100x), so a very light load. I will be moving my bots over to a micro-instance at some point.

Training the model is done offline. I use my home desktop, for example, and I could also train on a cloud instance if I needed to. The problem is simple enough that training currently takes no more than a couple of minutes to a few hours on a CPU-only instance. Retraining a model is simple to do offline; it creates new checkpoint files such as the one here. Updating a model through the store may be annoying, since checkpoints are on the order of 50 MB (un-optimized), for example.

Keep in mind that I chose CNNs for fun and ease of development rather than final accuracy. A K-nearest-neighbor (KNN) type approach to image search would probably work more efficiently in the majority of cases here; I assume this is how existing chess OCR apps work. I have a feeling KNN would have issues with edge cases where boards haven't been seen before, which a CNN would handle more gracefully. On the plus side, KNN is something you could implement in javascript.
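For a sense of how small a KNN classifier can be, here is a minimal sketch that labels a tile by majority vote among its k nearest labeled examples. The `LabeledTile` type, the feature vectors, and the labels are hypothetical; how existing chess OCR apps actually work is not known from this thread.

```typescript
// Hypothetical labeled example: a feature vector for one tile plus its piece label.
interface LabeledTile {
  features: number[];
  label: string;
}

// Squared Euclidean distance between two equal-length feature vectors.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Classify a query tile by majority vote among its k nearest examples.
// Ties go to the label that reaches the winning count first, nearest first.
function knnClassify(examples: LabeledTile[], query: number[], k: number): string {
  const nearest = examples
    .map((e) => ({ label: e.label, dist: squaredDistance(e.features, query) }))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k);
  const votes: { [label: string]: number } = {};
  let best = "";
  let bestCount = 0;
  for (let i = 0; i < nearest.length; i++) {
    const label = nearest[i].label;
    votes[label] = (votes[label] || 0) + 1;
    if (votes[label] > bestCount) {
      best = label;
      bestCount = votes[label];
    }
  }
  return best;
}
```

In practice the features would be raw or downsampled tile pixels, and the example set would need to cover each board and piece style the app expects to see.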

Elucidation avatar Mar 03 '17 05:03 Elucidation

It sounds like it would be worthwhile to discuss this feature with the Lila teams before moving forward. Is the Lila channel the best place for this discussion? @Sam, would it be useful to upload only the gradients and other metadata, rather than the full image? I am trying to find ways to prevent massive bandwidth usage by this feature...

therealmitchconnors avatar Mar 13 '17 18:03 therealmitchconnors

Probably not, since the ML model takes in the chess tile images. You could however find and crop the chessboard from the original image, rescale to 256x256 px and pass only that. That would make bandwidth usage much simpler.
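As an illustration of the tile step, here is a sketch that splits an already cropped and rescaled board into its 64 tiles. It assumes a square, row-major, one-byte-per-pixel (grayscale) buffer; that layout and the function name are assumptions, not details taken from the model's code.

```typescript
// Split a square grayscale board (row-major, one byte per pixel) into 64 tiles,
// ordered row by row starting from the top-left corner of the image.
function splitIntoTiles(board: Uint8Array, size: number): Uint8Array[] {
  if (size % 8 !== 0 || board.length !== size * size) {
    throw new Error("expected a square board whose side is a multiple of 8");
  }
  const tileSize = size / 8;
  const tiles: Uint8Array[] = [];
  for (let tileRow = 0; tileRow < 8; tileRow++) {
    for (let tileCol = 0; tileCol < 8; tileCol++) {
      const tile = new Uint8Array(tileSize * tileSize);
      for (let y = 0; y < tileSize; y++) {
        // Copy one pixel row of this tile out of the full board image.
        const srcStart = (tileRow * tileSize + y) * size + tileCol * tileSize;
        tile.set(board.subarray(srcStart, srcStart + tileSize), y * tileSize);
      }
      tiles.push(tile);
    }
  }
  return tiles;
}
```

For a 256x256 board this yields 64 tiles of 32x32 pixels. At one byte per pixel the whole board is 65,536 bytes before any image compression.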

Elucidation avatar Mar 14 '17 20:03 Elucidation

This thread has gone dead for a very long time, as I have been distracted. I have wrapped the prediction service in grpc, which can be called from any client with relatively few dependencies (ideal for avoiding writing my own cordova plugin).

https://github.com/therealmitchconnors/tensorflow_chessbot/tree/serve

PM me for an IP Address that is currently serving. Next step is to get the server on a load balancer, and figure out how to call grpc from lichobile... Then I have to figure out how to hand ownership of the service over to the lila folks, and get it running on their hardware.

therealmitchconnors avatar Dec 09 '17 04:12 therealmitchconnors

PM me for an IP Address that is currently serving.

I have no idea what you mean by that, sorry.

For the grpc client in lichobile, I don't really know grpc, but I think you'll have to write a cordova plugin. There is actually a web implementation of grpc (https://github.com/improbable-eng/grpc-web) but you need to use both their client and server.

To take a picture of a board you have this cordova plugin.

veloce avatar Dec 09 '17 10:12 veloce

For grpc, I was planning on using the node grpc client, rather than a cordova plugin. I think that would mean it is accessible from all target platforms, since it's a part of node. Does that make sense?

Regarding an IP Address, I am currently serving this grpc service from AWS, and if anyone wanted to play with it, I can provide details, but not in public, as there is no security around the service yet, and I'd rather not get a deluge of random traffic.

Thanks for the help. I hope to have a merge request to you inside of a week.

therealmitchconnors avatar Dec 09 '17 23:12 therealmitchconnors

For grpc, I was planning on using the node grpc client, rather than a cordova plugin. I think that would mean it is accessible from all target platforms, since it's a part of node. Does that make sense?

No, I'm afraid you can't do that. Cordova uses the node.js platform for its command line interface tools, but a cordova application is not a node.js application. Lichobile, like any cordova application, is a web application running in a webview, wrapped in a cordova layer that allows access to many native APIs.

If you read this, you'll see that while grpc can work in a webapp (browser or webview), you have to use grpc web protocol, which is a bit different from the native grpc protocol.

Thus if you want to use grpc with a cordova app, you have 2 choices:

  1. Use this project: it provides you with a Go package that wraps a grpc.Server to allow communication with a web client (I don't know the details), and a typescript web client that allows you to write typescript code directly integrated into lichobile.
  2. Write a cordova plugin that would use the native grpc protocol.

I'd go for 1 of course; like you said, you don't want to write a cordova plugin if not necessary. It might require some changes to your existing server, or not; you'll have to figure it out. I think you first need to try it in a simple example web app (just a browser page including the typescript grpc web client) to test it and make sure it works.

When you have that, then integration with lichobile can be started, with the use of the cordova camera plugin. I can help you starting from this point, if you have questions regarding lichobile code.

At that point, there will also be the question of hosting the server code. I can't help you with that, and I don't know whether lichess can host it. I suggest you contact the lichess team directly about that ([email protected]).

veloce avatar Dec 10 '17 11:12 veloce

Thanks for clarifying. I will see about the grpc-web stuff...

therealmitchconnors avatar Dec 11 '17 20:12 therealmitchconnors

I was wondering if you might provide input regarding an alternative design. I chose to use grpc for this service because it can represent binary data in an extremely compact format, but perhaps it would be simpler to stick with the api format already in use, and encode the images into b64 json?

This would mean a 33% increase in payload size (from ~1MB to ~1.3MB), and some performance degradation, but would allow us to implement the API in a manner consistent with everything else...
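The 33% figure follows from base64 encoding every 3 bytes as 4 characters. A small sketch of the arithmetic (the function names are illustrative, and the few extra bytes of JSON quoting and field names are ignored):

```typescript
// Length of the padded base64 encoding of byteCount bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Relative size increase of base64 over the raw bytes (about 1/3 for large inputs).
function base64Overhead(byteCount: number): number {
  return base64Length(byteCount) / byteCount - 1;
}
```

For a 1,000,000-byte image, `base64Length` gives 1,333,336 characters, in line with the ~1MB to ~1.3MB estimate.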

therealmitchconnors avatar Dec 11 '17 22:12 therealmitchconnors

Indeed. And actually you can even upload an image in binary format with a regular xhr, as explained here.

This feature is relatively new in browsers, but the good news is that it is fully supported on lichobile, because it runs on iOS >= 10.2 and android chrome 53.
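A binary upload over a regular xhr could look roughly like the sketch below. The `XhrLike` interface is a hypothetical minimal slice of the XMLHttpRequest API, declared here so the sketch stays self-contained; the endpoint URL, the `application/octet-stream` content type, and the plain-text response are all assumptions, since no server contract has been agreed in this thread.

```typescript
// Minimal slice of the XMLHttpRequest API that this sketch relies on.
interface XhrLike {
  status: number;
  responseText: string;
  onload: ((...args: any[]) => any) | null;
  onerror: ((...args: any[]) => any) | null;
  open(method: string, url: string): void;
  setRequestHeader(name: string, value: string): void;
  send(body: Uint8Array): void;
}

// POST raw image bytes and resolve with the response body (assumed to carry the FEN).
function uploadBoardImage(
  createXhr: () => XhrLike,
  url: string,
  image: Uint8Array
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const xhr = createXhr();
    xhr.open("POST", url);
    // Send the bytes as-is rather than base64-encoding them into JSON.
    xhr.setRequestHeader("Content-Type", "application/octet-stream");
    xhr.onload = () => {
      if (xhr.status >= 200 && xhr.status < 300) resolve(xhr.responseText);
      else reject(new Error("upload failed with status " + xhr.status));
    };
    xhr.onerror = () => reject(new Error("network error"));
    xhr.send(image);
  });
}
```

In the webview, the factory argument would simply return `new XMLHttpRequest()`. Injecting it keeps the function easy to exercise with a fake outside the app.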

veloce avatar Dec 11 '17 23:12 veloce

Another useful link: https://cordova.apache.org/blog/2017/10/18/from-filetransfer-to-xhr2.html

I currently use the deprecated file-transfer-plugin, and have created #736 to address it.

veloce avatar Dec 11 '17 23:12 veloce

Hi @veloce @therealmitchconnors. May I ask what the status of this is?

pociej avatar Feb 13 '21 10:02 pociej

@pociej it's stalled unfortunately. I just never found the time...

therealmitchconnors avatar Feb 13 '21 14:02 therealmitchconnors

Not only scan position - but play a full game with the camera :) Has been done many times on a PC with a webcam. But would be absolutely awesome to have via a phone camera. Thereby making a laptop redundant and making chess truly smart and mobile at the same time.

infinitless avatar Nov 23 '21 11:11 infinitless

I would absolutely love a feature like this.

jamesqo avatar Feb 05 '22 01:02 jamesqo