llmware icon indicating copy to clipboard operation
llmware copied to clipboard

Refactor logging functionality and add features

Open MacOS opened this issue 1 year ago • 4 comments

Proposed new feature or change

This issue is a first dump of my thoughts. Feel free to comment on it and improve it.

Motive

Logging functionality is currently implemented with the print function and the logging package. In addition, some of these are commented out for no apparent reason.

The issue is that users want the logging to be done by one and only one way. In addition, the users want to inspect the logs in a log file.

I therefore propose to implement a coherent logging system. In this process, the following should be done.

  1. uncomment all commented out print and logging code,
  2. move all the logging to the logging package,
  3. add a central logging object for each module, and
  4. add that the logs are written to log files,
  5. remove all update strings in the messages,

Example

Actual

As stated above, the logic is spread out throughout the code base. Here are some examples for each of the five points above.

print used for logging. https://github.com/llmware-ai/llmware/blob/e3435db316bc9c2b955bd8c50c80a55f526da2a0/llmware/embeddings.py#L448

An commented out print statement. https://github.com/llmware-ai/llmware/blob/e3435db316bc9c2b955bd8c50c80a55f526da2a0/llmware/embeddings.py#L294

A logging statement with update in it. https://github.com/llmware-ai/llmware/blob/e3435db316bc9c2b955bd8c50c80a55f526da2a0/llmware/embeddings.py#L114-L115

Desired

It is desirable to clean up the logging functionality and add logging to a file. The output of such a log message, should, for example,

2024-02-11T16:04:47.556391+00:00:embedding:INFO: Milvus - Embeddings Created: 10 of 50

or

2024-02-11T16:04:47.556391+00:00:embedding:WARNING: unable to determine if embeddings have been properly counted and captured.   Please check if databases connected.

Generally, the form should hence be

<DATETIME>:<MODULE_NAME>:<LOGGING_LEVEL>:<LOGGING_MESSAGE>

Existing solutions I looked for

Within the package, none, since it is currently not implemented at all.

Outside of the package, yes. A blueprint might be to take a look at how flask does it.

Proposed solution

In __init__.py add the following.

import os
import logging

from llmware.configs import LLMWareConfig


def create_logger():
    logging.basicConfig(level=LLMWareConfig.get_logging_level())

    logger = logging.getLogger(__name__)

    # maybe some config logic here, for example to set where the logs should be written to,
    # depending on the operating system. In addtion, setting the format of the log messages - see above.
    ....

    return logger

And then add in each module

from __init___ import create_logger

logger = create_logger()

Technical Analysis

The proposed efforts, and hence the changes, would clean the codebase (e.g. moving print class to the logger), streamline the logging functionality (e.g. adding the create_logger function to configure the logger centrally), and add new features for permanent logging to a file, or files.

Additional links

Logging HOWTO Flask source code

MacOS avatar Feb 11 '24 16:02 MacOS

@MacOS - agree 100% with your assessment - and would appreciate your help and support to drive this logging improvement. As you note, it will require updates across the code-base.

doberst avatar Mar 13 '24 09:03 doberst

Great!

I think it is a good idea to chop this up in smaller junks. May I suggest the following steps as a starting point for discussions, in that order.

  1. A PR that deals with the commented out statements. Either by deleting them or adding them back.
  2. A PR that unifies the LOGGING_MESSAGE.
  3. A PR that moves all the logging to the logging package.
  4. A PR that adds a central logging object for each module, and
  5. A PR that adds that the logs can optionally be are written to log files,

Besides, it is really important that we agree on the logging format. The format I suggested has the obvious problem with the :, as it separates the parts of the logging message and the parts of the DATETIME.

MacOS avatar Mar 13 '24 09:03 MacOS