Add agent example use case to generate query, positive and negative examples

Open zechengz opened this issue 4 months ago • 2 comments

Description

Add an agent example use case that generates a query together with positive and negative document examples, which can be used for contrastive learning of a text embedding model (text encoder).

Motivation and Context

A recent paper [link] uses an "agent" to generate tasks and, for each task, a query with positive and (hard) negative document examples. These examples can then be used to fine-tune text embedding models via contrastive learning. Text encoders trained this way, including SFR-Embedding-Mistral and e5-mistral-7b-instruct, achieve strong text embedding performance on the MTEB leaderboard.
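To make the contrastive learning step concrete, here is a minimal sketch of how one (query, positive, hard negative) triplet could feed a loss. It assumes an InfoNCE-style objective over cosine similarities, which is a common choice but is not stated in this issue; the function names and the temperature value are illustrative only.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(
    query: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.05,  # assumed value, not taken from the issue
) -> float:
    """InfoNCE-style loss for one query: low when the positive document
    is more similar to the query than every negative document."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]
```

With toy 2-d embeddings, a positive aligned with the query gives a loss close to zero, while swapping the positive and the negative gives a large loss, which is the signal that pulls queries toward positives during fine-tuning.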

The whole generation includes two steps.

  • Task generation: [image]
  • Document generation: [image]
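The two steps can be sketched roughly as follows. This is a minimal illustration, not the implementation in this change: the prompt wording, the JSON keys, and the `llm` callable (any function that maps a prompt string to a response string) are all assumptions, and a stand-in `fake_llm` is used so the sketch runs without a model.

```python
import json
from typing import Callable, Dict, List

# Hypothetical prompt templates; the wording is illustrative only.
TASK_PROMPT = (
    "Brainstorm {n} text retrieval tasks. "
    "Return a JSON list of strings, one task description per item."
)
EXAMPLE_PROMPT = (
    "You have been assigned a retrieval task: {task}\n"
    "Return a JSON object with the keys 'user_query', "
    "'positive_document' and 'hard_negative_document'."
)
# Assumed key names for one generated training example.
REQUIRED_KEYS = ("user_query", "positive_document", "hard_negative_document")


def generate_tasks(llm: Callable[[str], str], n: int) -> List[str]:
    """Step 1: ask the model for a list of task descriptions."""
    tasks = json.loads(llm(TASK_PROMPT.format(n=n)))
    return [str(task) for task in tasks][:n]


def generate_example(llm: Callable[[str], str], task: str) -> Dict[str, str]:
    """Step 2: ask the model for a query, a positive and a hard negative."""
    record = json.loads(llm(EXAMPLE_PROMPT.format(task=task)))
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise ValueError(f"model response is missing keys: {missing}")
    example = {"task": task}
    example.update({key: str(record[key]) for key in REQUIRED_KEYS})
    return example


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("Brainstorm"):
        return json.dumps(["Retrieve a recipe given a list of ingredients."])
    return json.dumps({
        "user_query": "What can I cook with eggs and spinach?",
        "positive_document": "Spinach omelette: whisk eggs, add spinach.",
        "hard_negative_document": "Spinach is a leafy green vegetable.",
    })


if __name__ == "__main__":
    for task in generate_tasks(fake_llm, n=1):
        print(generate_example(fake_llm, task))
```

Replacing `fake_llm` with a call into a real chat agent would produce the records that a contrastive fine-tuning script consumes; validating the keys up front keeps malformed model responses out of the training set.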

  • [ ] I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds core functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation (update in the documentation)
  • [x] Example (update in the folder of example)

Implemented Tasks

  • [x] Create a new EMBEDDING task type
  • [x] Create a new prompt for the EMBEDDING task type
  • [x] Create an example that includes task generation and single-agent generation of query, positive, and negative documents

Checklist

Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [ ] I have read the CONTRIBUTION guide. (required)
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the tests accordingly. (required for a bug fix or a new feature)
  • [ ] I have updated the documentation accordingly.

zechengz • Mar 04 '24 05:03