
multiprocess / multithreading: all logs appear in the sys.stdout sink, but some logs are missing from the logfile sink

Open jose-atlin opened this issue 2 years ago • 2 comments

model.py

import random
import time

from loguru import logger


class FakeModel:
    def predict(self, thread_num: int, trace_id: str):
        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-3] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-4] Sleeping - end sleep={sleep} secs")
        return {}

fastapi_app.py

import random
import sys
import time
from threading import Thread
from typing import List

from fastapi import FastAPI

from app.model import FakeModel
from loguru import logger

app = FastAPI()
model = FakeModel()

# Setup logger sinks
logger.remove(0)

logger.add(
    sink=sys.stdout,
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

logger.add(
    sink=open("logfile.log", "w"),
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)


def thread_predicts(thread_num: int, trace_id: str):
    with logger.contextualize(trace_id=trace_id, thread_num=thread_num):
        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-1] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-2] Sleeping - end sleep={sleep} secs")

        result = model.predict(thread_num, trace_id)

        sleep = random.randint(0, 4)
        logger.info(f"[{trace_id}-{thread_num}-5] Sleeping - start sleep={sleep} secs")
        time.sleep(sleep)
        logger.info(f"[{trace_id}-{thread_num}-6] Sleeping - end sleep={sleep} secs")
        return result


@app.get("/trace/{trace_id}")
def predict(trace_id: str):
    threads_: List[Thread] = []
    for thread_num in range(2):
        thread_ = Thread(target=thread_predicts, args=(thread_num, trace_id))
        threads_.append(thread_)

    for thread_ in threads_:
        thread_.start()

    for thread_ in threads_:
        thread_.join()

locustfile.py

import time
import uuid

from locust import HttpUser, between, task


class FakePerformanceTest(HttpUser):
    wait_time = between(1, 5)

    @task
    def test_fake_predict(self):
        for _ in range(5):
            trace_id = str(uuid.uuid4())
            self.client.get(f"/trace/{trace_id}", name="/trace")
            time.sleep(1)

There are times when the logs for a thread are completely missing from logfile.log but are available in sys.stdout. I don't know if the logs below will be of any use.

These are the logs that appeared in logfile.log for a particular trace_id 51173d6d-0b43-4076-9ef0-daee310e21f6:

[51173d6d-0b43-4076-9ef0-daee310e21f6-0-2] Sleeping - end sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}

These were generated in sys.stdout, but not all of them were available in logfile.log:

[51173d6d-0b43-4076-9ef0-daee310e21f6-0-1] Sleeping - start sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-1] Sleeping - start sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-2] Sleeping - end sleep=3 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-4] Sleeping - end sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-5] Sleeping - start sleep=2 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-2] Sleeping - end sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-3] Sleeping - start sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-4] Sleeping - end sleep=0 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-5] Sleeping - start sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}
[51173d6d-0b43-4076-9ef0-daee310e21f6-0-6] Sleeping - end sleep=2 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 0}
[51173d6d-0b43-4076-9ef0-daee310e21f6-1-6] Sleeping - end sleep=4 secs | {'trace_id': '51173d6d-0b43-4076-9ef0-daee310e21f6', 'thread_num': 1}

Is there something I am missing? Thank you.

jose-atlin avatar Mar 14 '23 04:03 jose-atlin

Two additional observations; in both of these cases the problem does not occur:

  • Running the uvicorn server with --workers=1 does not produce this problem; it only appears when the number of processes is more than 1. uvicorn --workers=2 fastapi_app:app causes the problem, while uvicorn --workers=1 fastapi_app:app runs fine.

  • The problem also goes away if the threads in the predict() endpoint of fastapi_app.py are removed and thread_predicts() is called directly as a simple method call:

@app.get("/trace/{trace_id}")
def predict(trace_id: str):
    thread_predicts(thread_num=0, trace_id=trace_id)

This works fine too (a standalone sketch exercising only the threading side follows below).
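For reference, a minimal standalone sketch (assuming a single process; the file name, thread count, and message count are arbitrary) that exercises only the threading side, and can be used to check that threads alone do not drop records:

import sys
from threading import Thread

from loguru import logger

logger.remove()
logger.add(sys.stdout, level="INFO", format="{message} | {extra}", enqueue=True)
logger.add("single_process.log", level="INFO", format="{message} | {extra}",
           mode="w", enqueue=True)


def worker(thread_num: int):
    with logger.contextualize(thread_num=thread_num):
        for i in range(100):
            logger.info(f"[{thread_num}-{i}] message")


threads = [Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Wait for the enqueue worker to flush all pending records before counting.
logger.complete()

with open("single_process.log") as f:
    print(sum(1 for _ in f))  # expected: 400 lines when only one process writes the file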

jose-atlin avatar Mar 14 '23 04:03 jose-atlin

Hi @jose-atlin. Sorry for the late reply.

This is a recurring problem caused by the use of several workers, and therefore several processes. We need a way to synchronize the logger between processes, and I don't know if FastAPI allows that.

I definitely need to take a look at possible workarounds, but I'm not very familiar with FastAPI and Gunicorn.

In the meantime, maybe this blog post can help you: Unify Python logging for a Gunicorn/Uvicorn/FastAPI application.
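As a stopgap, here is a minimal sketch of one possible mitigation; it assumes that letting Loguru open the file itself by path, in append mode, with enqueue=True, behaves better than passing an already-opened file object, since open("logfile.log", "w") truncates the file in every worker process that imports the module. This is an untested sketch, not a confirmed fix:

import sys

from loguru import logger

logger.remove()

logger.add(
    sink=sys.stdout,
    level="INFO",
    format="{message} | {extra}",
    enqueue=True,
)

# Pass a path instead of an open file object: mode="a" appends, so each worker
# process does not wipe the file already written by the others at import time.
logger.add(
    "logfile.log",
    level="INFO",
    format="{message} | {extra}",
    mode="a",
    enqueue=True,
)

Note that each worker process still holds its own independent handlers, so fully synchronizing the log output across processes may still require the approach described in the blog post above.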

Delgan avatar Apr 07 '23 08:04 Delgan