mem0 [REFACTOR] Config

With all the PRs and Issues, it has become obvious that we as maintainers need to decide on a way to configure embedchain.

Jul 06 '23 10:07 cachho

1. Passing a dictionary of options to the method: This approach is simple and flexible, and it makes clear which options apply to each individual call to the query method.

Here's an example:

naval_chat_bot.query(
    "What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?", 
    options={"num_documents": 5, "llm_model": "chatgpt-3.5"}
)

In the query method, you'd unpack the options dictionary like so:

def query(self, question, options=None):
    if options is None:
        options = {}
    num_documents = options.get('num_documents', 1)
    llm_model = options.get('llm_model', 'default_model')
    # ...
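
One consequence of this approach can be sketched as follows. The `query` function here is a stand-in that only returns the resolved options, not embedchain's actual implementation. Because `dict.get()` falls back to a default, a misspelled key is ignored silently and no error is raised.

```python
def query(question, options=None):
    """Stand-in for the query method above; returns the resolved options."""
    if options is None:
        options = {}
    num_documents = options.get("num_documents", 1)
    llm_model = options.get("llm_model", "default_model")
    return {"num_documents": num_documents, "llm_model": llm_model}


# The key is misspelled ("num_document" instead of "num_documents"),
# so the default value is used and nothing signals the mistake.
resolved = query("example question", options={"num_document": 5})
print(resolved)
```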

2. Using a configuration class: This is a more structured approach that defines the available configuration options in one place and makes it easy to provide default values for them. It also lets you add methods to the configuration class to manipulate or validate the configuration data.

Here's how you'd use a configuration class:

class QueryConfig:
    def __init__(self, num_documents=1, llm_model='default_model'):
        self.num_documents = num_documents
        self.llm_model = llm_model

config = QueryConfig(num_documents=5, llm_model="chatgpt-3.5")
naval_chat_bot.query("What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?", config)

In the query method, you'd access the properties of the config object:

def query(self, question, config=None):
    if config is None:
        config = QueryConfig()
    num_documents = config.num_documents
    llm_model = config.llm_model
    # ...
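
As a minimal sketch of the validation point above: the `validate` method and its rules are hypothetical and chosen only for illustration. The example also shows that a misspelled keyword argument fails immediately with a `TypeError`, because the constructor only accepts the names it declares.

```python
class QueryConfig:
    def __init__(self, num_documents=1, llm_model="default_model"):
        self.num_documents = num_documents
        self.llm_model = llm_model
        self.validate()

    def validate(self):
        # Hypothetical rules, for illustration only.
        if not isinstance(self.num_documents, int) or self.num_documents < 1:
            raise ValueError("num_documents must be a positive integer")
        if not isinstance(self.llm_model, str) or not self.llm_model:
            raise ValueError("llm_model must be a non-empty string")


# A valid configuration is accepted as before.
config = QueryConfig(num_documents=5, llm_model="chatgpt-3.5")

# An invalid value is rejected when the config is created.
error_message = None
try:
    QueryConfig(num_documents=0)
except ValueError as exc:
    error_message = str(exc)

# A misspelled option name is rejected by Python itself.
typo_rejected = False
try:
    QueryConfig(num_document=5)
except TypeError:
    typo_rejected = True
```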

3. Using setter methods in the class: This approach changes the state of the App object itself. That makes it easy to change the configuration dynamically without passing the same options on every query. However, it also means the configuration persists across queries, so a setting changed for one query applies to every later query until it is changed again.

Here's an example:

class App:
    def __init__(self):
        self.config = QueryConfig()

    def set_num_documents(self, num_documents):
        self.config.num_documents = num_documents

    def set_llm_model(self, llm_model):
        self.config.llm_model = llm_model

    def query(self, question):
        num_documents = self.config.num_documents
        llm_model = self.config.llm_model
        # ...

naval_chat_bot = App()

naval_chat_bot.set_num_documents(5)
naval_chat_bot.set_llm_model("chatgpt-3.5")

naval_chat_bot.query("What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?")
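
The persistence described above can be sketched like this. The `query` body is a stand-in that only returns the number of documents it would use, so that the effect of the setter is visible.

```python
class QueryConfig:
    def __init__(self, num_documents=1, llm_model="default_model"):
        self.num_documents = num_documents
        self.llm_model = llm_model


class App:
    def __init__(self):
        self.config = QueryConfig()

    def set_num_documents(self, num_documents):
        self.config.num_documents = num_documents

    def query(self, question):
        # Stand-in body: report which setting this call would use.
        return self.config.num_documents


bot = App()
first = bot.query("first question")    # uses the default
bot.set_num_documents(5)
second = bot.query("second question")  # uses the new setting
third = bot.query("third question")    # the setting is still in effect
print(first, second, third)
```

Any caller that wants the default back has to reset it explicitly, which is the trade-off of keeping configuration as object state.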

Jul 06 '23 10:07 cachho

I'm strongly leaning towards 2. It has the best developer experience and user experience.
As for 3., I was tagged in a PR that tried to introduce it.
1. is stupid because dict.get() is always a hassle to type and has bad IntelliSense support.

Chroma, for instance, also uses 2.

Jul 06 '23 10:07 cachho

Maintainers have decided to roll with variant 2.

Jul 06 '23 10:07 cachho

Yes, as discussed on a call, let's go with approach 2. It's cleaner. We will have config classes like:

InitConfig
QueryConfig
ChatConfig

Jul 06 '23 14:07 taranjeet

PR is ready, #158

Jul 06 '23 15:07 cachho

@cachho: feel free to close this if everything is addressed. For ChatConfig, I have created a new issue, https://github.com/embedchain/embedchain/issues/167, to track it.

Jul 06 '23 19:07 taranjeet

Closing this because all configs so far exist. Any questions regarding chat are tracked in #167.

Jul 06 '23 20:07 cachho