azure-sdk-for-python
azure-sdk-for-python copied to clipboard
[AzureML] Review dependencies list
The dependencies list needs to be trimmed down to remove any libraries that are not required for basic usage. This might involve moving somethings to optional feature sets, others can just be enabled if certain dependencies have been installed (e.g. how azure-core handles the aiohttp dependency) The versions of dependencies also need to be reviewed to prevent dependency conflicts.
I have started compiling some notes before:
# This should be easy to align to latest core
"azure-core<2.0.0,>=1.19.1,!=1.22.0",
"msrest>=0.6.18",
# I believe this is deprecated - we should make sure no usage of this is in the public API so we can remove
# this dependency as soon as is practical.
"azure-common<2.0.0,>=1.1",
# This one is a little unfortunate in that it looks like a solid, well-maintained library - though it does use
# C extensions which means it's only compatible with a subset of platforms. Fortunately it does have
# a full complement of wheels, so we could probably GA with it - but we should look at what it might
# mean for someone on an unsupported platform.
"pyyaml<7.0.0,>=5.1.0",
"azure-identity",
"azure-mgmt-core<2.0.0,>=1.2.0",
# We shouldn't need a dependency on mashmellow. If customers are using marshmellow for their own models
# they should be able to use that without us needing to take a dependency on it.
"marshmallow<4.0.0,>=3.5",
# This looks like a solid library that we could probably add as approved dependency.
"jsonschema<5.0.0,>=4.0.0",
# These two look like they're only related to console output - which is generally only in the jurisdiction of the
# CLI, not the SDK. We could take a closer look at this is see what scenarios it's used for and whether these
# make sense as SDK dependencies. The versions should be given a minimum, and only capped at a major version.
"tqdm<=4.63.0",
"colorama<=0.4.4",
# Seems like a solid dependency - but we should fix the version to give it a minimum, and if it needs a
# max it should be a major version.
"pyjwt<=2.3.0",
# The storage libraries are pretty big - do we really need to preinstall all 3 of these?
# Maybe we could default to setting up support for Blobs (or whichever is most popular) and then make it easy
# to swap this out for an alternative Storage endpoint as needed so the Blobs SDK could be uninstalled and
# replaced with FileShare etc.
"azure-storage-blob<13.0.0,>=12.10.0",
"azure-storage-file-share<13.0.0",
"azure-storage-file-datalake<=12.6.0",
# What's this one used for?
"pydash<=4.9.0",
# I would avoid this one as we shouldn't GA with a dependency on a beta library. It means we are locked
# to an unstable version.
"pathspec==0.9.*",
# We should try to remove this one as it's not a stable library - however assuming it's not publicly
# exposed this could probably not urgent.
"isodate",
# Docker should not be installed by default - I would simply make the local endpoint feature predicated
# on a customer choosing to install docker. Similar to how core checks for aiohttp:
# https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/azure/core/pipeline/transport/__init__.py#L71.
"docker",
"typing-extensions>=4.0.1",
# Please remove this one - our SDKs don't determine telemetry configuration on behalf of customers.
"applicationinsights<=0.11.10",
Some further updates to my comments above:
- The dependency on Azure-Identity should be removed - we can just vendor any code from identity that we need.
- We should take a closer look at how isodate is being used - as removing it later could be breaking.
- Pathspec should be removed.
Adding @azureml-github for greater visibility so we can start making progress on this.
@annatisch Regarding azure-core dependency. This is just pinning the major version removes a yanked version. Any issues for lower version being pinned ?
This should be easy to align to latest core
"azure-core<2.0.0,>=1.19.1,!=1.22.0",
@singankit - no there's no issue with pinning the lower version - this should be fine.
azure-storage
- Should be able to remove unused storage libs
- Multiple deps because provider may be unknown to user pathspec
- Investigate possibility of replacing or vendoring the one function that is being used. azure-identity
- Vendoring any base classes as needed.
marshmallow
- Used for schema tools, not customer facing (i.e. could be refactored out without breaking changes)
- Chat to Johan about this one - do we need archboard approval. applicationinsights
- Can raise with archboard for further review tqdm
dependency on azure-identity has been removed.
We can have major version upper bounds.
storage dependency - this should be good to have
pathspec - @annatisch had a PR to vendor it. Currently under Mozilla 2.0. Check with CELA contact for the license
tqdm. colorama - good for GA. Should showing progress bar be enable by default?
app insights - Needs separate discussion
azure-common - Leave it since generated code uses it.
@annatisch @singankit What work is left for this?
I think this is resolved now.