0xDeCA10B

[demo] Track accuracy over time.

Open juharris opened this issue 4 years ago • 8 comments

In database? On blockchain?

We now track the accuracy in the database (managed by the server). This is okay but it's centralized so it would be good to add some proof to that database.

juharris avatar Jun 17 '20 19:06 juharris

Hi @juharris, I have knowledge of databases and blockchain.
I would like to contribute to this project. Could you please explain the issue in more detail and where to start?

hkaur008 avatar Apr 24 '21 20:04 hkaur008

Hey, thanks for reaching out! As people add data and train a model, the model's accuracy on some test set will change, and I would like to track how that accuracy changes over time. I think there's a lot to be done for this issue, but we can break it down into steps. Ideally, for the highest transparency, we would compute the test set evaluation on-chain, but that would be very expensive and arguably wasteful. So what steps can we take toward transparency? I think as a start, you can store the accuracy and timestamp in a table that you can set up in demo/server.js. Maybe you can also store a hash of the test set data that was used and some other metadata about the test set? I think that's a decent start, and once that is done, you can get an idea of other ways to store test set metrics. You can also get into zero-knowledge proofs or use hashes to prove that the right computation was done to perform the evaluation.
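A minimal sketch of that first step, written in Python for brevity even though the demo server is JavaScript. It shows one way to hash a test set and bundle the digest with the accuracy and timestamp. The function names, the record fields, and the sample data are all assumptions made for illustration; none of them come from the project.

```python
import hashlib
import json
import time


def hash_test_set(samples):
    """Return a SHA-256 hex digest of the test set in a canonical JSON form."""
    # Sorting keys and fixing separators makes the digest independent of
    # the key order inside each sample. The order of the samples still matters.
    canonical = json.dumps(samples, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_accuracy_record(model_id, accuracy, samples, timestamp=None):
    """Bundle one evaluation result with metadata about the test set used."""
    return {
        "model_id": model_id,
        "accuracy": accuracy,
        "timestamp": int(time.time()) if timestamp is None else timestamp,
        "test_set_hash": hash_test_set(samples),
        "test_set_size": len(samples),
    }


# Hypothetical test set, only for illustration.
test_set = [
    {"data": "great movie", "label": 1},
    {"data": "terrible plot", "label": 0},
]
record = make_accuracy_record(
    model_id=1, accuracy=0.5, samples=test_set, timestamp=1619300000
)
print(record)
```

Anyone holding the same test set can recompute the digest and compare it with the stored one, which gives a cheap consistency check without putting the evaluation itself on-chain.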

juharris avatar Apr 24 '21 21:04 juharris

I think I just need to maintain a table of accuracy, timestamp, test set hash, and other metadata for the time being, then improve it with hashing to prove that changes were made, or get into zero-knowledge proofs as well. Could you please assign this issue to me?

hkaur008 avatar Apr 25 '21 17:04 hkaur008

Whenever a new training sample is added (that is, the data set of a particular model changes), the accuracy of the model changes. The changed accuracy then needs to be recorded with a timestamp in an SQLite table. I think every model uses the same data table, but every model will have a different accuracy for the same dataset. Do I need to maintain a separate table for each model to track its accuracy with timestamps? Please correct me if I am wrong.

hkaur008 avatar May 03 '21 20:05 hkaur008

Using a new table for each model will be hard to manage, so they should all use the same table. You can use a column with a dataset name to help keep track of which dataset the model was tested against.
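A rough sketch of the single shared table, using Python's built-in `sqlite3` module rather than the JavaScript in demo/server.js. The table name `accuracy`, the column names, the helper functions, and the dataset name `imdb-test` are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for all models; model_id and dataset_name tell the rows apart.
conn.execute(
    """
    CREATE TABLE accuracy (
        model_id INTEGER NOT NULL,
        dataset_name TEXT NOT NULL,
        accuracy REAL NOT NULL,
        timestamp INTEGER NOT NULL
    )
    """
)


def record_accuracy(conn, model_id, dataset_name, accuracy, timestamp):
    """Append one accuracy measurement for a model on a named dataset."""
    conn.execute(
        "INSERT INTO accuracy (model_id, dataset_name, accuracy, timestamp) "
        "VALUES (?, ?, ?, ?)",
        (model_id, dataset_name, accuracy, timestamp),
    )


def accuracy_history(conn, model_id, dataset_name):
    """Return (timestamp, accuracy) pairs for one model, oldest first."""
    rows = conn.execute(
        "SELECT timestamp, accuracy FROM accuracy "
        "WHERE model_id = ? AND dataset_name = ? ORDER BY timestamp",
        (model_id, dataset_name),
    )
    return rows.fetchall()


# Two models evaluated against the same (hypothetical) dataset.
record_accuracy(conn, 1, "imdb-test", 0.81, 1620070000)
record_accuracy(conn, 2, "imdb-test", 0.77, 1620070100)
record_accuracy(conn, 1, "imdb-test", 0.83, 1620073600)

history = accuracy_history(conn, 1, "imdb-test")
print(history)
```

Filtering on `model_id` and `dataset_name` recovers each model's history from the shared table, so per-model tables are not needed.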

juharris avatar May 03 '21 20:05 juharris

So we can create a table with the following columns: transaction_hash, model ID, accuracy, and timestamp. Since the model already stores metadata and transaction_hash is the primary key that locates the particular data we are talking about, I only need to store the accuracy and timestamp.

We have the following APIs:

// Health
// Get all models.
// Get model with specific ID.
// Insert a new model.
// DATA MANAGEMENT
// Insert a training sample.
// Get original training data.

When do I need to call the function that checks accuracy, and from which API?

hkaur008 avatar May 04 '21 06:05 hkaur008

Thanks for the update! I don't think a transaction hash is appropriate for the location of the data. I'm not sure what we should use. You can just add a "data_location" column and we can figure out what to put in it later. The data might be on-chain or at a URL; it can vary.

I believe I answered the question about the functions in the PR: you should make two new functions.
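As a small illustration of the "data_location" idea, the column can be a nullable free-form text field so its contents can be decided later. This is a Python `sqlite3` sketch, not the server's actual code; the table layout and the example URL are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical existing table, reduced to a few columns for the example.
conn.execute(
    "CREATE TABLE accuracy (model_id INTEGER, accuracy REAL, timestamp INTEGER)"
)

# Free-form and nullable: the value could later be a URL, an on-chain
# reference, or something else.
conn.execute("ALTER TABLE accuracy ADD COLUMN data_location TEXT")

conn.executemany(
    "INSERT INTO accuracy (model_id, accuracy, timestamp, data_location) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, 0.81, 1620170000, "https://example.com/test-set.json"),
        (2, 0.77, 1620170100, None),  # location not decided yet
    ],
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(accuracy)")]
locations = [
    row[0]
    for row in conn.execute("SELECT data_location FROM accuracy ORDER BY model_id")
]
print(columns, locations)
```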

juharris avatar May 05 '21 00:05 juharris

What is left in this issue?

hkaur008 avatar Jul 15 '21 18:07 hkaur008