azure-sdk-for-python icon indicating copy to clipboard operation
azure-sdk-for-python copied to clipboard

[AzureML] Review dependencies list

Open annatisch opened this issue 2 years ago • 3 comments

The dependencies list needs to be trimmed down to remove any libraries that are not required for basic usage. This might involve moving somethings to optional feature sets, others can just be enabled if certain dependencies have been installed (e.g. how azure-core handles the aiohttp dependency) The versions of dependencies also need to be reviewed to prevent dependency conflicts.

I have started compiling some notes before:

# This should be easy to align to latest core
"azure-core<2.0.0,>=1.19.1,!=1.22.0",
"msrest>=0.6.18",

# I believe this is deprecated - we should make sure no usage of this is in the public API so we can remove
# this dependency as soon as is practical.
"azure-common<2.0.0,>=1.1",

# This one is a little unfortunate in that it looks like a solid, well-maintained library - though it does use
# C extensions which means it's only compatible with a subset of platforms. Fortunately it does have
# a full complement of wheels, so we could probably GA with it - but we should look at what it might
# mean for someone on an unsupported platform.
"pyyaml<7.0.0,>=5.1.0",

"azure-identity",
"azure-mgmt-core<2.0.0,>=1.2.0",

# We shouldn't need a dependency on mashmellow. If customers are using marshmellow for their own models
# they should be able to use that without us needing to take a dependency on it.
"marshmallow<4.0.0,>=3.5",

# This looks like a solid library that we could probably add as approved dependency.
"jsonschema<5.0.0,>=4.0.0",

# These two look like they're only related to console output - which is generally only in the jurisdiction of the
# CLI, not the SDK. We could take a closer look at this is see what scenarios it's used for and whether these
# make sense as SDK dependencies. The versions should be given a minimum, and only capped at a major version.
"tqdm<=4.63.0",
"colorama<=0.4.4",

# Seems like a solid dependency - but we should fix the version to give it a minimum, and if it needs a
# max it should be a major version.
"pyjwt<=2.3.0",

# The storage libraries are pretty big - do we really need to preinstall all 3 of these?
# Maybe we could default to setting up support for Blobs (or whichever is most popular) and then make it easy
# to swap this out for an alternative Storage endpoint as needed so the Blobs SDK could be uninstalled and
# replaced with FileShare etc.
"azure-storage-blob<13.0.0,>=12.10.0",
"azure-storage-file-share<13.0.0",
"azure-storage-file-datalake<=12.6.0",

# What's this one used for?
"pydash<=4.9.0",

# I would avoid this one as we shouldn't GA with a dependency on a beta library. It means we are locked
# to an unstable version.
"pathspec==0.9.*",

# We should try to remove this one as it's not a stable library - however assuming it's not publicly
# exposed this could probably not urgent.
"isodate",

# Docker should not be installed by default - I would simply make the local endpoint feature predicated
# on a customer choosing to install docker. Similar to how core checks for aiohttp:
# https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/azure/core/pipeline/transport/__init__.py#L71.
"docker",

"typing-extensions>=4.0.1",

# Please remove this one - our SDKs don't determine telemetry configuration on behalf of customers.
"applicationinsights<=0.11.10",

annatisch avatar Jul 04 '22 00:07 annatisch

Some further updates to my comments above:

  • The dependency on Azure-Identity should be removed - we can just vendor any code from identity that we need.
  • We should take a closer look at how isodate is being used - as removing it later could be breaking.
  • Pathspec should be removed.

Adding @azureml-github for greater visibility so we can start making progress on this.

annatisch avatar Jul 20 '22 00:07 annatisch

@annatisch Regarding azure-core dependency. This is just pinning the major version removes a yanked version. Any issues for lower version being pinned ?

This should be easy to align to latest core

"azure-core<2.0.0,>=1.19.1,!=1.22.0",

singankit avatar Aug 06 '22 01:08 singankit

@singankit - no there's no issue with pinning the lower version - this should be fine.

annatisch avatar Aug 08 '22 04:08 annatisch

azure-storage

  • Should be able to remove unused storage libs
  • Multiple deps because provider may be unknown to user pathspec
  • Investigate possibility of replacing or vendoring the one function that is being used. azure-identity
  • Vendoring any base classes as needed.

marshmallow

  • Used for schema tools, not customer facing (i.e. could be refactored out without breaking changes)
  • Chat to Johan about this one - do we need archboard approval. applicationinsights
  • Can raise with archboard for further review tqdm

annatisch avatar Aug 16 '22 22:08 annatisch

dependency on azure-identity has been removed.

singankit avatar Aug 26 '22 02:08 singankit

We can have major version upper bounds.

storage dependency - this should be good to have pathspec - @annatisch had a PR to vendor it. Currently under Mozilla 2.0. Check with CELA contact for the license tqdm. colorama - good for GA. Should showing progress bar be enable by default?
app insights - Needs separate discussion

singankit avatar Sep 01 '22 21:09 singankit

azure-common - Leave it since generated code uses it.

singankit avatar Sep 01 '22 22:09 singankit

@annatisch @singankit What work is left for this?

luigiw avatar Dec 01 '22 19:12 luigiw

I think this is resolved now.

annatisch avatar Dec 01 '22 19:12 annatisch