LMdiff
A diff tool for language models
Should also allow `Top-10 Diff`, corresponding to the search results. It is returned as part of the following packet: ``` return { "text": text, "tokens": tokens, "m1": {...
> how difficult would it be to show distribution over all the data and highlight the picked examples? 
Create datasets and analysis results for `gpt-gen` and `distillgpt2-gen`. Questions - How would you generate diverse phrases (one per line)? Would you have a prompt dataset?
Consider porting documentation to `mkdocs` to have a professional feel for this tool
When querying the API for text snippets from a pre-computed corpus, some snippets are duplicates, which violates the uniqueness requirement for the list.
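One possible workaround is to drop repeated snippets before they reach the list. The sketch below is only an illustration, not the project's actual fix: the function name `dedupe_snippets` is hypothetical, and it assumes each snippet is a dict with a `"text"` field and that two snippets with identical text count as duplicates.

```python
def dedupe_snippets(snippets):
    """Return snippets with repeated texts removed, keeping first occurrences.

    Assumption: each snippet is a dict with a "text" key, and identical
    text means the snippets are duplicates.
    """
    seen = set()
    unique = []
    for snippet in snippets:
        key = snippet["text"]
        if key in seen:
            continue  # already emitted a snippet with this text
        seen.add(key)
        unique.append(snippet)
    return unique
```

Keeping the first occurrence preserves the order returned by the API, so any ranking of the results is unaffected.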
1. `mrpc` is a sufficient description of `glue_mrpc` since glue is the task name encompassing several datasets
2. We would like to provide a popup near the dataset name that...