Improved Semantic Deduplication Docs

Open ryantwolf opened this issue 1 year ago • 0 comments

As I am revisiting the semantic deduplication documentation, there are a few things we should add:

  • Documentation of the CLI
  • If the user generates IDs with add_id, as we recommend, id_col_type in the config should be set to a string type.
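
To illustrate the second point, here is a minimal sketch of why IDs generated this way call for a string id_col_type. It does not call NeMo-Curator. `make_prefixed_id` and `infer_id_col_type` are hypothetical helpers, and the ID format shown (a prefix, a hyphen, and a zero-padded counter) is an assumption for illustration, not a confirmed description of what add_id emits.

```python
# Hypothetical stand-in for the IDs that add_id produces. The format
# "<prefix>-<zero-padded counter>" is an assumption for illustration only.
def make_prefixed_id(prefix: str, index: int, width: int = 10) -> str:
    return f"{prefix}-{index:0{width}d}"


def infer_id_col_type(ids: list[str]) -> str:
    """Return "int" only if every ID parses as an integer, otherwise "str"."""
    try:
        for value in ids:
            int(value)
    except ValueError:
        return "str"
    return "int"


ids = [make_prefixed_id("doc_id", i) for i in range(3)]
print(ids[0])                   # a prefixed, non-numeric ID
print(infer_id_col_type(ids))   # prefixed IDs cannot be read as integers
```

Under that assumption, any prefixed ID fails integer parsing, so a config that declares an integer id_col_type would not match the ID column. Purely numeric IDs would still work with an integer type.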

ryantwolf · Aug 09 '24 23:08