Adding Legal Documents as Dataset for improving model's legal advice ability

Open ElJaviLuki opened this issue 2 years ago • 3 comments

Statement

The current Open Assistant model cannot provide accurate legal advice because its dataset lacks legal documents. This could lead to incorrect information being given and cause harm to users seeking legal advice.

Proposed Solution

To improve the model's legal advice skills, I propose adding legal documents from several countries to the dataset. This includes constitutions, federal and state laws (where the country has them), jurisprudence, traditions and customs, legal doctrine, and any other source of law. These documents would give the model the information it needs to provide accurate legal advice to users.

Steps to Implement

  1. Obtain a large number of legal documents from the sources of law of different countries (USA, European Union (EU law and the laws of each member state), Russia, China, India, Australia, and many more).
  2. Clean and pre-process the data to ensure it is in a format that can be used by the model.
  3. Integrate the legal documents into the model's existing dataset.
  4. Train the model using the new legal documents dataset.
  5. Test the model's performance to ensure that it is providing accurate legal advice.
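
As a rough illustration of step 2, here is a minimal sketch of cleaning one raw document and turning it into a JSON Lines record. Everything in it is an assumption made for illustration: the function names, the record fields (`country`, `source_type`, `as_of`, `text`) and the rule that a line holding only a number is a page number. It is not the format used by Open Assistant's existing dataset.

```python
import json
import re
import unicodedata


def clean_legal_text(raw: str) -> str:
    """Normalize Unicode, drop page-number lines, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    # Assumption: a line holding only a number such as "12" or "- 12 -"
    # is a page number and can be dropped.
    page_number = re.compile(r"\s*-?\s*\d+\s*-?\s*")
    lines = [ln for ln in text.splitlines() if not page_number.fullmatch(ln)]
    # Collapse runs of spaces and tabs inside each line.
    lines = [re.sub(r"[ \t]+", " ", ln).strip() for ln in lines]
    # Drop lines that are empty after stripping.
    return "\n".join(ln for ln in lines if ln)


def to_record(raw: str, country: str, source_type: str, as_of: str) -> dict:
    """Wrap a cleaned document with hypothetical metadata fields."""
    return {
        "country": country,          # e.g. an ISO country code
        "source_type": source_type,  # e.g. "constitution", "statute"
        "as_of": as_of,              # date the text was known to be in force
        "text": clean_legal_text(raw),
    }


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

Keeping an `as_of` date on every record would make it possible to filter out or flag laws that may have been amended or repealed since the text was collected.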

Benefits

Adding legal documents to the model's dataset would improve the accuracy of the legal advice it provides and increase the trust users have in the information provided by Open Assistant. This would also increase the value of Open Assistant as a tool for providing legal advice.

Open Questions

  • Are there any legal concerns regarding the use of legal documents in the model's dataset?
  • What is the best format for integrating legal documents into the model's dataset?
  • How can the accuracy of the model's legal advice be tested and validated?
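
On the last question, one possible starting point (a sketch only, not a validated methodology) is to check whether the model's answer mentions the legal provision a reviewer expects it to cite. The `citation_accuracy` function and the `answer_fn` callable below are hypothetical names; `answer_fn` stands in for whatever call returns the model's answer to a question. A substring match like this is a very coarse proxy and would not replace review by qualified lawyers.

```python
from typing import Callable, Iterable, Tuple


def citation_accuracy(
    cases: Iterable[Tuple[str, str]],
    answer_fn: Callable[[str], str],
) -> float:
    """Fraction of answers that mention the expected citation.

    Each case is (question, expected_citation). The comparison is a
    case-insensitive substring match, which is only a rough signal.
    """
    cases = list(cases)
    if not cases:
        return 0.0
    hits = sum(
        1
        for question, expected in cases
        if expected.lower() in answer_fn(question).lower()
    )
    return hits / len(cases)


if __name__ == "__main__":
    # Stub standing in for the model; a real run would query the model here.
    def stub_answer(question: str) -> str:
        return "This is covered by Article 5 of the example code."

    examples = [
        ("Which article applies to case A?", "Article 5"),
        ("Which article applies to case B?", "Article 9"),
    ]
    print(citation_accuracy(examples, stub_answer))
```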

ElJaviLuki avatar Feb 10 '23 14:02 ElJaviLuki

Great idea! I've assigned it to you.

huu4ontocord avatar Feb 10 '23 17:02 huu4ontocord

Could be useful: https://lang.org.ua/en/corpora/#anchor7 (under the Corpus of laws and legal acts header)

large (more than 9 Gb) corpus of laws and legal acts of Ukraine

The cutoff date is approx. 2016-2017, though. So while it may be useful for the model to learn the structure of legal documents, some laws or legal acts may have become outdated since then.

nmeln avatar Feb 13 '23 14:02 nmeln

@ElJaviLuki - Hi, can you give us a status update on this issue?

huu4ontocord avatar Feb 24 '23 06:02 huu4ontocord