ttyper Results history option

This is an enhancement request.

The results could be saved to a file and there could be an option to show all of them by date. This could be helpful in tracking overall progress.

And maybe later, the results could be visualized in a graph too?

Jun 09 '21 18:06 TahsinTariq

I would like to propose a JSON data structure for this, which would save all the data necessary to replay the trial and do thorough analysis.

{
    "word": [
        {"start_datetime": "2021-08-02 17:36:22.2", "keystrokes": "word", "timings": [0.2, 0.1, 0.1, 0.2], "trial_hash": "a4f8e2"},
        {"start_datetime": "2021-08-02 17:37:25.7", "keystrokes": "wort\bd", "timings": [0.3, 0.1, 0.1, 0.3, 0.1, 0.1], "trial_hash": "a4f8e2"}
    ]
}

Alternatively, it would also be possible to put the word into each record instead of using it as the key. The datetime string should be formatted so that common parsers can read it easily. Alternatively, one could save the int64 representation of the timestamp.
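A minimal sketch of reading this structure in Python, assuming the layout shown above. The `replay` helper is a hypothetical name, and it assumes that `\b` in `keystrokes` stands for a single backspace keystroke.

```python
import json
from datetime import datetime

# The example records from above; the raw string keeps "\b" as a JSON escape.
raw_json = r'''
{
    "word": [
        {"start_datetime": "2021-08-02 17:36:22.2", "keystrokes": "word", "timings": [0.2, 0.1, 0.1, 0.2], "trial_hash": "a4f8e2"},
        {"start_datetime": "2021-08-02 17:37:25.7", "keystrokes": "wort\bd", "timings": [0.3, 0.1, 0.1, 0.3, 0.1, 0.1], "trial_hash": "a4f8e2"}
    ]
}
'''


def replay(keystrokes):
    """Rebuild the typed text, treating a backspace as 'delete the last character'."""
    typed = []
    for key in keystrokes:
        if key == "\b":
            if typed:
                typed.pop()
        else:
            typed.append(key)
    return "".join(typed)


history = json.loads(raw_json)
second_attempt = history["word"][1]

# "%f" accepts one to six fractional digits, so ".7" parses as 700000 microseconds.
started = datetime.strptime(second_attempt["start_datetime"], "%Y-%m-%d %H:%M:%S.%f")
final_text = replay(second_attempt["keystrokes"])
```

Each entry in `timings` lines up with one keystroke, including the backspace, so the two lists have the same length.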

Zipping this would lead to very small files. Also they could be saved separately for each trial and then concatenated if needed.

Having the really detailed data available would be very convenient for in-depth analysis:

Which words are the hardest for me? Where do I spend time?
Which sequences (like the "-ion" ending) are the hardest for me?
How did I improve over time?
At which time of day do I perform best?
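As a rough sketch of the first question, the per-word records could be ranked by average time per character. The function names are hypothetical, and the "nation" record is made up purely for illustration.

```python
from statistics import mean

# Same shape as the proposed JSON: word -> list of attempts.
sample_history = {
    "word": [
        {"keystrokes": "word", "timings": [0.2, 0.1, 0.1, 0.2]},
        {"keystrokes": "wort\bd", "timings": [0.3, 0.1, 0.1, 0.3, 0.1, 0.1]},
    ],
    "nation": [  # made-up record, not from the thread
        {"keystrokes": "nation", "timings": [0.2, 0.2, 0.3, 0.4, 0.3, 0.2]},
    ],
}


def mean_seconds_per_char(history):
    """Average total time per attempt, normalized by the length of the target word."""
    scores = {}
    for word, attempts in history.items():
        total_per_attempt = [sum(attempt["timings"]) for attempt in attempts]
        scores[word] = mean(total_per_attempt) / len(word)
    return scores


def hardest_words(history):
    """Words sorted from slowest to fastest per character."""
    scores = mean_seconds_per_char(history)
    return sorted(scores, key=scores.get, reverse=True)


ranking = hardest_words(sample_history)
```

Because corrections add extra keystrokes and therefore extra time, mistakes are penalized without any special handling.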

Providing an easy-to-parse data structure makes it easy to analyze the data in other frameworks or languages.

Best regards Julian

EDIT: Thinking this through one more time, I would actually prefer a purely CSV-based data structure. It is easier to parse than JSON (for example with Excel), and concatenating is also much easier. Also, when zipping it, it should make no difference in file size (zipping uses a dictionary). Finally, reading from one or multiple CSV files is embarrassingly parallelizable.

start_datetime,keystrokes,timings,trial_hash
2021-08-02 17:36:22.2,word,0.2 0.1 0.1 0.2,a4f8e2

I'm not sure how best to represent the timings, but joining them with spaces should work.
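A minimal sketch of parsing that CSV layout with Python's standard library, assuming the timings are space-separated as in the row above. `read_trials` is a hypothetical helper name.

```python
import csv
import io

csv_text = """start_datetime,keystrokes,timings,trial_hash
2021-08-02 17:36:22.2,word,0.2 0.1 0.1 0.2,a4f8e2
"""


def read_trials(text):
    """Parse the CSV and turn the space-joined timings cell into a list of floats."""
    trials = []
    for row in csv.DictReader(io.StringIO(text)):
        row["timings"] = [float(value) for value in row["timings"].split()]
        trials.append(row)
    return trials


trials = read_trials(csv_text)
```

Keystrokes that contain a comma or a quote would need CSV quoting when written; `csv.writer` applies it automatically.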

Aug 02 '21 10:08 JulianWgs

I've been thinking about this issue for a little bit, and I think it would be best to store the results internally using a binary file format or JSON, and then provide multiple exporters for different file formats.

Using serde, this would be quite easy to implement for most JSON-like formats. It also has the advantage of minimizing uncompressed file sizes, and if we store the results internally, we could easily implement automatic compression if file size is still an issue.

As for @JulianWgs' suggestion of using CSV, I definitely think providing a CSV exporter could be useful, but I also think a lot of thought needs to be put into the encoding since our data is essentially a two-dimensional list of characters (keystrokes, really) annotated with a duration, plus some metadata about the test; there's simply no good way to encode a two-dimensional list of tuples in a CSV file. @JulianWgs' proposed encoding would certainly work, but I'm not sure if it would be much easier to parse than JSON because of the space-delimited timings, which won't work well with e.g. Excel.
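One possible encoding for such an exporter, sketched here as an assumption rather than something decided in this thread, is to write one row per keystroke so that no cell has to hold a list. The column names and the `export_long_csv` function are hypothetical, and the sketch is in Python for brevity rather than in Rust with serde.

```python
import csv
import io


def export_long_csv(trials):
    """Write one CSV row per keystroke: (trial, start time, position, key, seconds)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["trial_hash", "start_datetime", "index", "key", "seconds"])
    for trial in trials:
        pairs = zip(trial["keystrokes"], trial["timings"])
        for index, (key, seconds) in enumerate(pairs):
            # Escape control characters such as backspace so they stay visible in a spreadsheet.
            printable = key.encode("unicode_escape").decode("ascii")
            writer.writerow([trial["trial_hash"], trial["start_datetime"], index, printable, seconds])
    return out.getvalue()


example_trial = {
    "start_datetime": "2021-08-02 17:37:25.7",
    "keystrokes": "wort\bd",
    "timings": [0.3, 0.1, 0.1, 0.3, 0.1, 0.1],
    "trial_hash": "a4f8e2",
}
long_csv = export_long_csv([example_trial])
```

The trade-off is that the per-test metadata is repeated on every row, which makes the file larger but keeps every cell a plain scalar.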

As for performance concerns, I don't think there really are any. It's true that CSVs are stupidly easy to append to, but if we use a file per test we won't ever need to append, and each results object should almost never be larger than a few dozen KB.

Dec 12 '21 06:12 max-niederman
