data-measurements-tool

Developing tools to automatically analyze datasets

Results 11 data-measurements-tool issues

Hello fellow maintainer, I have added some standardized Markdown to README.md to improve the docs. I also fixed the unnecessary **`** signs in the document. Please review and approve.

I am looking for a library that can help measure dataset quality. This project is very useful for me. But I find that the latest commit was submitted 5...

Hi, I am trying to use _HuggingFaceM4/OBELICS_ with the data-measurements-tool. The dataset is loaded, but due to its huge size (approx. 378 GB), I am unable to get results....

Hello! I'm a cybersecurity researcher developing Packj [1]. Our tool has detected a supply-chain vulnerability in this repository. In order for me to disclose it, kindly enable GitHub Private vulnerability...

The current nPMI class needs to be refactored to become a generic "associations" module, which exposes nPMI along with other association measurements.
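As a rough illustration of what such a generic module might look like, here is a minimal sketch. It is not the project's actual code: the names `pmi`, `npmi`, `ASSOCIATION_MEASURES`, and `association` are hypothetical, and the functions assume the caller has already estimated the joint and marginal probabilities. It uses the standard definitions, PMI = log(p(x,y) / (p(x)p(y))) and nPMI = PMI / (-log p(x,y)).

```python
import math
from typing import Callable, Dict

# Hypothetical sketch of an "associations" module; names are illustrative only.


def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Pointwise mutual information: log(p(x,y) / (p(x) * p(y)))."""
    if p_xy == 0.0:
        # The pair never co-occurs; PMI tends to negative infinity.
        return float("-inf")
    return math.log(p_xy / (p_x * p_y))


def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized PMI, bounded in [-1, 1]."""
    if p_xy == 0.0:
        # Limit value when the pair never co-occurs.
        return -1.0
    if p_xy == 1.0:
        # -log(1) is 0, so define complete co-occurrence as 1.
        return 1.0
    return pmi(p_xy, p_x, p_y) / -math.log(p_xy)


# Registry so that further association measures can be added in one place.
ASSOCIATION_MEASURES: Dict[str, Callable[[float, float, float], float]] = {
    "pmi": pmi,
    "npmi": npmi,
}


def association(name: str, p_xy: float, p_x: float, p_y: float) -> float:
    """Look up a measure by name and apply it to the given probabilities."""
    return ASSOCIATION_MEASURES[name](p_xy, p_x, p_y)
```

With a registry like this, a caller could request `association("npmi", p_xy, p_x, p_y)` today and another measure later without changing the calling code.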

When I run

```
python3 run_data_measurements.py --dataset="hate_speech_offensive" --config="default" --split="train" --label_field="label" --feature="tweet"
```

the `dset_peek.json` file is not cached, which prevents me from running the UI in `live` mode. Snapshot of...

Updated files:
- app.py
- dataset_util.py

New files:
- styles.css and index.html: stylisation code for streamlit's component

Apologies for the cached data 🙈 The approach we discussed had some additional problems with necessary files and updates that were part of the previous commit with the...

UI updates (see [figma](https://www.figma.com/file/xvBWSyzuURvtIaMKAjAPZY/Data-measurements-tool)):
- sidebar
- tabs instead of drop-downs
- colouring

To-dos to come:
- fixing bold text
- slight adjusting of colours