feature: Allow for New Relic integration
Feature request
We are using BentoML for our model serving and have found it to be a great library. However, we are running into issues when integrating it with New Relic.
Motivation
We have been using BentoML for our work. It is a great repository, thank you for building it. However, when we try to integrate BentoML with New Relic monitoring we get the following error. We tried mounting a FastAPI app and a custom middleware, but these efforts have not been fruitful. Please help us integrate New Relic with BentoML.
Debug-mode output from BentoML:
/usr/local/lib/python3.8/site-packages/uvicorn/lifespan/on.py:97: RuntimeWarning: coroutine 'middleware_wrapper' was never awaited
self.logger.error(msg, exc_info=exc)
Object allocated at (most recent call last):
File "/usr/local/lib/python3.8/site-packages/asgiref/compatibility.py", lineno 34
instance = application(scope)
Complete error message:
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/newrelic/api/asgi_application.py", line 387, in nr_async_asgi
    return await coro
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 148, in __next__
    return self.send(None)
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 120, in send
    return self.__wrapped__.send(value)
  File "/usr/local/lib/python3.9/site-packages/newrelic/api/asgi_application.py", line 70, in __call__
    return await self.app(scope, receive, self.send_inject_browser_agent)
  File "/usr/local/lib/python3.9/site-packages/newrelic/api/asgi_application.py", line 330, in nr_async_asgi
    return await wrapped(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "<string>", line 5, in wrapper
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.9/site-packages/newrelic/hooks/framework_starlette.py", line 108, in middleware_wrapper
    return await FunctionTraceWrapper(wrapped, name=name)(*args, **kwargs)
  File "<string>", line 5, in wrapper
  File "/usr/local/lib/python3.9/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 581, in __call__
    await self.app(scope, otel_receive, otel_send)
  File "/usr/local/lib/python3.9/site-packages/asgiref/compatibility.py", line 35, in new_application
    return await instance(receive, send)
TypeError: 'coroutine' object is not callable
/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py:416: RuntimeWarning: coroutine 'middleware_wrapper' was never awaited
self.transport.close()
Object allocated at (most recent call last):
File "/usr/local/lib/python3.9/site-packages/asgiref/compatibility.py", lineno 34
instance = application(scope)
Other
No response
@kkkarthik6 thanks for sharing the use case. Are you using the @asgi_application decorator to integrate New Relic? From a quick Google search, it seems New Relic also offers a Python agent for integrating with ASGI applications like Uvicorn, which BentoML is built on top of. Have you evaluated whether that works for your use case?
Update: New Relic's Python Agent turns out to be the incorrect integration point with BentoML. See the conversation later in this thread: https://github.com/bentoml/BentoML/issues/3463#issuecomment-1410076941
Hi @ssheng, thanks for the reply. We have the following setup and also tried mounting FastAPI as our ASGI middleware. It always seems to fail to boot with the warning above.
import newrelic.agent
newrelic.agent.initialize('/app/newrelic.ini')
import numpy as np
import bentoml
from bentoml.io import NumpyNdarray
import time
import tracemalloc
tracemalloc.start()
from bentoml import Service
async def sleep():
    time.sleep(1)

class ACVService(Service):
    @Service.asgi_app.setter
    def asgi_app(self):
        from bentoml._internal.server.http_app import HTTPAppFactory
        return HTTPAppFactory(self)(enable_metrics=False)

svc = ACVService("data-bentoml-monitoring")
asgi_bento = svc.asgi_app

@newrelic.agent.asgi_application()
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
async def classify(input_series: np.ndarray) -> np.ndarray:
    await sleep()
    return input_series
We start the app using this command:
BENTOML_DEBUG=TRUE uvicorn src.service:asgi_bento --port 5000 --host 0.0.0.0 --reload
Error log:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.8/site-packages/newrelic/api/asgi_application.py", line 363, in nr_async_asgi
    return await coro
  File "/usr/local/lib/python3.8/site-packages/newrelic/common/async_proxy.py", line 148, in __next__
    return self.send(None)
  File "/usr/local/lib/python3.8/site-packages/newrelic/common/async_proxy.py", line 120, in send
    return self.__wrapped__.send(value)
  File "/usr/local/lib/python3.8/site-packages/newrelic/api/asgi_application.py", line 363, in nr_async_asgi
    return await coro
  File "/usr/local/lib/python3.8/site-packages/newrelic/common/async_proxy.py", line 148, in __next__
    return self.send(None)
  File "/usr/local/lib/python3.8/site-packages/newrelic/common/async_proxy.py", line 120, in send
    return self.__wrapped__.send(value)
  File "/usr/local/lib/python3.8/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "<string>", line 5, in wrapper
  File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.8/site-packages/newrelic/hooks/framework_starlette.py", line 112, in middleware_wrapper
    return await FunctionTraceWrapper(wrapped, name=name)(*args, **kwargs)
  File "<string>", line 5, in wrapper
  File "/usr/local/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 111, in __call__
    self._setup()
  File "/usr/local/lib/python3.8/site-packages/simple_di/__init__.py", line 139, in _
    return func(*_inject_args(bind.args), **_inject_kwargs(bind.kwargs))
  File "/usr/local/lib/python3.8/site-packages/bentoml/_internal/server/http/instruments.py", line 51, in _setup
    self.legacy_metrics_request_duration = metrics_client.Histogram(
  File "/usr/local/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 151, in Histogram
    return partial(self.prometheus_client.Histogram, registry=self.registry)
  File "/usr/local/lib/python3.8/site-packages/bentoml/_internal/server/metrics/prometheus.py", line 59, in prometheus_client
    assert os.path.isdir(self.multiproc_dir)
I'll share an experiment I ran:
# run-newrelic-instrumented-bento.sh
pip install newrelic
NEW_RELIC_LICENSE_KEY="MY KEY" \
NEW_RELIC_APP_NAME="bento-tensorflow-poc" \
NEW_RELIC_APPLICATION_LOGGING_ENABLED=true \
NEW_RELIC_APPLICATION_LOGGING_LOCAL_DECORATING_ENABLED=true \
NEW_RELIC_APPLICATION_LOGGING_FORWARDING_ENABLED=true \
newrelic-admin run-program uvicorn poc_api.service:svc.asgi_app --host 0.0.0.0 --port 5000 --workers 1
Requirements
bentoml==1.0.13
rich
loguru # the newrelic-admin CLI instruments this logging library with trace IDs, etc.
I used the train.py from the BentoML Keras example.
# src/poc_model/service.py
import numpy as np
from numpy.typing import NDArray
from typing import List
from PIL.Image import Image as PILImage
from textwrap import dedent
import bentoml
from bentoml.io import Image, JSON
from bentoml.io import NumpyNdarray
from bentoml.exceptions import InternalServerError
from pydantic import BaseModel, Field
# endpoint that inputs and outputs a string
from bentoml.io import Text
from loguru import logger as LOGGER
mnist_classifier: bentoml.Model = bentoml.tensorflow.get("tensorflow_mnist:latest")
mnist_classifier_runner: bentoml.Runner = mnist_classifier.to_runner()
svc = bentoml.Service(
    name="tensorflow_mnist_service",
    runners=[mnist_classifier_runner],
)
from rich import inspect
print(inspect(svc, methods=True))
class ClassifyDigitResponse(BaseModel):
    confidences: List[float] = Field(..., description=dedent("""\
        Confidence scores for each digit class.
        The `0`th element is the confidence score for digit `0`,
        the `1`st element is the confidence score for digit `1`,
        the `2`nd element is the confidence score for digit `2`,
        and so on.
        """))
"""
Request:
curl -X 'POST' \
'http://0.0.0.0:5000/classify-digit' \
-H 'accept: application/json' \
-H 'Content-Type: image/jp2' \
--data-binary '@7.png'
Response:
{
"confidences": [
-3.8217179775238037,
1.1907492876052856,
-1.0903446674346924,
-0.28582924604415894,
-3.335996627807617,
-5.478107929229736,
-13.418190956115723,
14.586523056030273,
-1.2501418590545654,
1.6832548379898071
]
}
Response headers:
content-length: 213
content-type: application/json
date: Tue, 31 Jan 2023 07:55:31 GMT
server: uvicorn
x-bentoml-request-id: 1925001177826861353
"""
@svc.api(
    route="/classify-digit",
    input=Image(),
    output=JSON(pydantic_model=ClassifyDigitResponse),
)
async def classify_digit(f: PILImage) -> ClassifyDigitResponse:
    assert isinstance(f, PILImage)
    arr = np.array(f) / 255.0
    assert arr.shape == (28, 28)
    # We are using a greyscale image and our TensorFlow model expects extra batch and channel dimensions
    arr = np.expand_dims(arr, (0, 3)).astype("float32")  # reshape to [1, 28, 28, 1]
    confidences_arr: NDArray = await mnist_classifier_runner.async_run(arr)  # [1, 10]
    confidences = list(confidences_arr[0])
    return ClassifyDigitResponse(confidences=confidences)
"""
Request:
curl -X 'POST' \
'http://0.0.0.0:5000/newrelic-log' \
-H 'accept: text/plain' \
-H 'Content-Type: text/plain' \
-d 'None'
Response: "None"
Response Headers:
content-length: 4
content-type: text/plain; charset=utf-8
date: Tue, 31 Jan 2023 07:50:05 GMT
server: uvicorn
x-bentoml-request-id: 283141936243368829
"""
@svc.api(input=Text(), output=Text(), route="/newrelic-log")
def health(text: str):
    LOGGER.info(f"This is an INFO log. {text}")
    return text
"""
Request:
curl -X 'POST' \
'http://0.0.0.0:5000/newrelic-error' \
-H 'accept: text/plain' \
-H 'Content-Type: text/plain' \
-d 'None'
Response: error
Response headers:
content-length: 2
content-type: application/json
date: Tue, 31 Jan 2023 07:48:40 GMT
server: uvicorn
Note: the bento trace id wasn't returned here ^^^. That'd be bad if I were trying to debug this error.
Also, the NewRelic trace ID header isn't present in either endpoint's response.
"""
@svc.api(input=Text(), output=Text(), route="/newrelic-error")
def error(text: str):
    LOGGER.error(f"This is an ERROR log. {text}")
    exc = InternalServerError(message="Fail!")
    LOGGER.exception(exc)
    raise exc
Here are some of the logs I got. You can see that the instrumentation affected loguru:
2023-01-31 00:48:40.786 | ERROR | poc_api.service:error:76 - Fail! NR-LINKING|MzM1NjA0OHxBUE18QVBQTElDQVRJT058MTIwODc1MDAzOQ|PRVLOERIDDOCH|d1b88a559281729ffa23b7a5fd0cf8d0|9954d0a586ef6f8a|bento-tensorflow-poc|
NoneType: None
Exception on /newrelic-error [POST] NR-LINKING|MzM1NjA0OHxBUE18QVBQTElDQVRJT058MTIwODc1MDAzOQ|PRVLOERIDDOCH|d1b88a559281729ffa23b7a5fd0cf8d0|9954d0a586ef6f8a|bento-tensorflow-poc|
Traceback (most recent call last):
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/bentoml/_internal/server/http_app.py", line 336, in api_func
output = await run_in_threadpool(api.func, input_data)
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/newrelic/hooks/framework_starlette.py", line 213, in wrap_run_in_threadpool
return await wrapped(func, *args, **kwargs)
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/Users/eric/repos/extra/bento-tensorflow-poc/venv/lib/python3.8/site-packages/newrelic/core/context.py", line 85, in _context_wrapper
return wrapped(*args, **kwargs)
File "/Users/eric/repos/extra/bento-tensorflow-poc/api/src/poc_api/service.py", line 77, in error
raise exc
bentoml.exceptions.InternalServerError: Fail!
INFO: 127.0.0.1:64048 - "POST /newrelic-error HTTP/1.1" 500 Internal Server Error
2023-01-31 00:50:06.017 | INFO | poc_api.service:health:69 - This is an INFO log. None NR-LINKING|MzM1NjA0OHxBUE18QVBQTElDQVRJT058MTIwODc1MDAzOQ|PRVLOERIDDOCH|0aa3fc64ef1857294ba0c6cf8283ab43|61c25b67f8b9959e|bento-tensorflow-poc|
Here's the logs page in NewRelic: [screenshot omitted]
Here's the tracing page: [screenshots omitted]
Summary page: [screenshot omitted]
I've done this same setup with FastAPI before. Unlike FastAPI, with Bento I'm not seeing:
- Any evidence of errors, even though most of my requests were to the /newrelic-error endpoint
- Any metrics regarding which path I hit. /newrelic-error shows up in some of the logs, but FastAPI's NewRelic instrumentation gives me a full "Access Logs report", i.e. which paths returned which status codes and in what ratio
- Any trace ID in the response headers; with FastAPI the response headers include the trace ID, and FastAPI also knows how to handle a trace ID that is already present in the request headers
Can you think of a way I could properly configure NewRelic to have these features? These may be deal breakers for us. At the end of the day, we need to be able to:
- know which endpoints are getting errors
- be able to look up errors using a trace ID
- be able to correlate logs from up- and downstream services using the trace ID
And it's a major plus if:
- we can get a good traceback when we correlate the logs to the error trace ID
There's an ongoing NewRelic support thread here, though as of right now, there isn't much there.
Moving some of the conversation we had with @phitoduck here. Hopefully it will be helpful to @kkkarthik6.
Thanks for the insightful post, @phitoduck. It seems like you are integrating with New Relic through the Python agent. I haven't read deeply into the implementation, but it appears the instrumentation is automatically injected into the ASGI application when it is started with newrelic-admin run-program uvicorn.
Treating BentoML as a simple ASGI application will not work out of the box. BentoML's runner architecture spawns multiple processes to optimize system resource utilization: instead of a single uvicorn process, BentoML runs several ASGI processes.
I will need to look into New Relic further to investigate a better integration. Since BentoML is fully compliant with the OpenTelemetry and Prometheus standards, I'm hopeful that New Relic properly supports both.
New Relic supports the OpenTelemetry Protocol (OTLP) for ingesting trace data. In BentoML, you can point the OTLP exporter in the Tracing configuration at New Relic's endpoint.
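For example, something along these lines (a sketch only: the field names assume the BentoML 1.0.x tracing configuration schema and New Relic's documented OTLP ingest endpoint, so double-check both against your versions):

# bentoml_configuration.yaml, passed to the server via the BENTOML_CONFIG env var
# (assumes the BentoML 1.0.x config schema; field names may differ in other versions)
tracing:
  type: otlp          # export spans over the OpenTelemetry Protocol
  sample_rate: 1.0    # trace every request while experimenting
  otlp:
    protocol: grpc
    url: https://otlp.nr-data.net:4317   # New Relic's OTLP ingest endpoint

New Relic authenticates OTLP requests with an api-key header carrying the license key; since BentoML uses the OpenTelemetry SDK exporters under the hood, setting the standard OTEL_EXPORTER_OTLP_HEADERS="api-key=<NEW_RELIC_LICENSE_KEY>" environment variable should be one way to supply it.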
New Relic supports exporting Prometheus metrics through their Prometheus OpenMetrics integration for Docker.
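That integration runs as a container that scrapes Prometheus endpoints and forwards the samples to New Relic. A sketch of its scrape config (assumptions: BentoML's default /metrics endpoint on port 3000, the documented newrelic/nri-prometheus config format, and bento-api-server as a placeholder hostname):

# config.yaml mounted into the newrelic/nri-prometheus container,
# which takes the license key via its LICENSE_KEY environment variable
cluster_name: bento-services
targets:
  - description: BentoML HTTP API server metrics
    urls: ["http://bento-api-server:3000/metrics"]   # placeholder host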
I opened a ticket with the NewRelic support team. Here's an email I just got from them:
Hi,
It looks like our devs have analyzed the issue; per their recommendation, we don't support the way BentoML internally creates multiple ASGI apps, and we suggest using OTEL or Prometheus OpenMetrics to report trace data.
However, I can definitely see the advantage of supporting BentoML. I'll be submitting a feature request on your behalf for this capability. At the moment, I don't have a timeline for this feature, but submitting this request will certainly help with its prioritization. We take requests like this very seriously and use them to prioritize what we implement next. Since we track feature requests internally, I'll mark this ticket to be closed.
We appreciate you taking the time to bring this to our attention, and hope that you will continue to do so in the future. If you have any further questions or comments about this, please let us know by replying to this ticket and we'll be happy to continue the conversation.
Best,
Raji Durai
Technical Support Engineer
New Relic, Inc | Portland
Hi @phitoduck, I see you got a response from NewRelic suggesting OpenTelemetry for reporting metrics, and that you created an example in this repo: https://github.com/phitoduck/bentoml-opentelemetry-newrelic. Does it work for you, or was it just an experiment? I definitely need to monitor our BentoML services with NewRelic! Thank you for investigating this.