opentelemetry-python

sdk: Implement basic os resource detector

Open Zirak opened this issue 1 year ago • 2 comments

Description

Implement basic os resource detector.

Based on OS resource semantics: https://opentelemetry.io/docs/specs/semconv/resource/os/

Currently implements os.type and os.version, aiming to match what other runtimes (such as Java and Node) report.

I have not yet tested on some more exotic OSs such as hp-ux, aix, or z/os.
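
To illustrate the idea, here is a minimal, stdlib-only sketch of deriving os.type and os.version from platform.system and platform.release. It is not the code in this PR: the function name `detect_os_attributes` and the override table are hypothetical, and the two overrides shown (SunOS and HP-UX) are assumptions about what those platforms report. A real detector would wrap the result in the SDK's Resource type.

```python
import platform

# Hypothetical overrides: platform.system() values (lowercased) whose
# semantic-convention os.type value is not simply the lowercased name.
# These two entries are assumptions, not taken from the PR.
_OS_TYPE_OVERRIDES = {
    "sunos": "solaris",
    "hp-ux": "hpux",
}


def detect_os_attributes(system=None, release=None):
    """Return a dict with os.type and os.version.

    `system` and `release` default to the values reported by the
    platform module; they are parameters only to make testing easy.
    """
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release

    os_type = system.lower()
    os_type = _OS_TYPE_OVERRIDES.get(os_type, os_type)

    return {"os.type": os_type, "os.version": release}
```

Calling `detect_os_attributes()` with no arguments reports the current machine; passing explicit values is convenient for checking how a given platform string would be mapped.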

Type of change

Please delete options that are not relevant.

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • [x] Checked platform.system and platform.release on a variety of operating systems
  • [x] Ran java and node agents in similar environments, seeing values are in alignment
  • [x] Replicated non-trivial cases as unit test patches

Does This PR Require a Contrib Repo Change?

Answer the following question based on these examples of changes that would require a Contrib Repo Change:

  • The OTel specification has changed which prompted this PR to update the method interfaces of opentelemetry-api/ or opentelemetry-sdk/

  • The method interfaces of test/util have changed

  • Scripts in scripts/ that were copied over to the Contrib repo have changed

  • Configuration files that were copied over to the Contrib repo have changed (when consistency between repositories is applicable) such as in

    • pyproject.toml
    • isort.cfg
    • .flake8
  • When a new .github/CODEOWNER is added

  • Major changes to project information, such as in:

    • README.md
    • CONTRIBUTING.md
  • [ ] Yes. - Link to PR:

  • [x] No.

Checklist:

  • [x] Followed the style guidelines of this project
  • [ ] Changelogs have been updated
  • [x] Unit tests have been added
  • [ ] Documentation has been updated

I'm unsure what the two unchecked items above (changelog & documentation) would actually require. It seems like opening a PR is a prerequisite to generating a changelog entry, and I haven't seen any special documentation around resource detectors. Is my understanding correct?

Zirak, Jun 23 '24 06:06

Great points, thank you! I've completely missed the entrypoint.

Regarding making it a default, I've done the following simple diff:

@@ -180,11 +180,13 @@ class Resource:
         resource = _DEFAULT_RESOURCE
 
         otel_experimental_resource_detectors = environ.get(
-            OTEL_EXPERIMENTAL_RESOURCE_DETECTORS, "otel"
+            OTEL_EXPERIMENTAL_RESOURCE_DETECTORS, "otel,os"
         ).split(",")
 
         if "otel" not in otel_experimental_resource_detectors:
             otel_experimental_resource_detectors.append("otel")
+        if "os" not in otel_experimental_resource_detectors:
+            otel_experimental_resource_detectors.append("os")
 
         for resource_detector in otel_experimental_resource_detectors:
             resource_detectors.append(
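
As a rough, standalone illustration of the name-resolution logic in the diff above: the sketch below reads the detector list from a mapping and appends any always-on names that are missing. The function name `resolve_detector_names` and its parameters are hypothetical; only the environment variable name and the "otel"/"os" values come from the diff. Like the diff, it does not strip whitespace around names.

```python
def resolve_detector_names(
    environ,
    default="otel,os",
    always_on=("otel", "os"),
):
    """Return the list of detector names to load.

    `environ` is any mapping (for example os.environ). Names listed in
    `always_on` are appended if the user-supplied list omits them.
    """
    names = environ.get(
        "OTEL_EXPERIMENTAL_RESOURCE_DETECTORS", default
    ).split(",")

    for name in always_on:
        if name not in names:
            names.append(name)

    return names
```

With this shape, a user who sets the variable to a single custom detector still gets the always-on detectors appended after it.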

Running locally it looks good (yay!), but I'm having trouble with testing. A lot of tests expect _DEFAULT_RESOURCE to be the baseline for all future resources. I'm currently tinkering with decorating the entire TestResources class with a platform.uname patch, along with extending _DEFAULT_RESOURCE as part of __init__ and rewriting existing test cases to use it (Edit: This has since been pushed), e.g.

@@ -61,12 +62,26 @@ except ImportError:
     psutil = None
 
 
+@patch("platform.uname", lambda: platform.uname_result(
+            system="Linux",
+            node="node",
+            release="1.2.3",
+            version="4.5.6",
+            machine="x86_64",
+            processor="x86_64"
+        ))
 class TestResources(unittest.TestCase):
     def setUp(self) -> None:
         environ[OTEL_RESOURCE_ATTRIBUTES] = ""
+        self.mock_platform = {
+            OS_TYPE: "linux",
+            OS_VERSION: "1.2.3",
+        }
+        self.default_resource = _DEFAULT_RESOURCE.merge(Resource(self.mock_platform))
@@ -86,6 +101,7 @@ class TestResources(unittest.TestCase):
             TELEMETRY_SDK_VERSION: _OPENTELEMETRY_SDK_VERSION,
             SERVICE_NAME: "unknown_service",
         }
+        expected_attributes.update(self.mock_platform)
@@ -431,7 +447,7 @@ class TestResources(unittest.TestCase):
         resource_detector.raise_on_error = False
         self.assertEqual(
             get_aggregated_resources([resource_detector]),
-            _DEFAULT_RESOURCE.merge(
+            self.default_resource.merge(

It feels a bit icky. Am I missing a better, simpler way?
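
For reference, the class-level patching pattern described above can be sketched in isolation as follows. This is a simplified, hypothetical example rather than the PR's test code: it patches platform.system and platform.release directly instead of platform.uname, and `detect_os_attributes` and `TestOsAttributes` are made-up names.

```python
import platform
import unittest
from unittest.mock import patch


def detect_os_attributes():
    # Hypothetical stand-in for the detector under test.
    return {
        "os.type": platform.system().lower(),
        "os.version": platform.release(),
    }


# Applied to a class, patch() wraps every method whose name starts with
# "test". Because a replacement object is passed explicitly, no extra
# mock argument is injected into the test methods.
@patch("platform.system", lambda: "Linux")
@patch("platform.release", lambda: "1.2.3")
class TestOsAttributes(unittest.TestCase):
    def setUp(self):
        # Expected values mirror the patched platform functions.
        self.mock_platform = {"os.type": "linux", "os.version": "1.2.3"}

    def test_detects_patched_values(self):
        self.assertEqual(detect_os_attributes(), self.mock_platform)
```

Note that the class decorator does not cover setUp, so anything computed there sees the real platform values unless it is patched separately.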

Zirak, Jun 25 '24 06:06

@Zirak Please add an entry in the changelog

xrmx, Jul 01 '24 10:07

Noticed the tests failing on python 3.8 - that's strange, will take a look

Zirak, Jul 03 '24 16:07

@Zirak lint and docs are failing too

xrmx, Jul 09 '24 07:07

Apologies for the wait, life got in the way. I've pushed 4 commits:

  • Catching up with main (lmk if there's another preferred way of doing so)
  • Linting, I somehow missed that in the commit amendments
  • Fix the code on python 3.8 (including pypy), very good catch from the robots
  • Actually write docs in rst, my first real time writing in rst so it was an adventure, lmk if it can be improved

Zirak, Jul 14 '24 19:07

I'm not sure I agree with @xrmx's comment about loading the resource detector by default. We have OtelResourceDetector loaded by default, and it populates service.name, which is already marked stable as an attribute. Even though some of the fields are required, I believe they are required IF the resource detector exists, not required by default as part of the SDK. The attributes are also marked as experimental in the semantic conventions, so I'm not too eager to make this the default behavior until they are stable.

lzchen, Jul 15 '24 17:07

Thanks @lzchen. How are discussions like this usually handled? Comments here, the CNCF slack, SIG topic, etc.? I'd be happy to present it at a SIG if necessary.

Zirak, Jul 15 '24 20:07

I'm fine with not enabling it by default.

xrmx, Jul 16 '24 13:07

Coolio, will revert it to not be a default

Zirak, Jul 17 '24 08:07

@lzchen It's no longer a default, could you take another look?

Or @xrmx, what's the next step?

Zirak, Jul 25 '24 09:07