packaging
packaging copied to clipboard
Adding a wheel management API
I'm the maintainer of wheel
. I'm wondering if you would be interested in receiving an API for reading and writing wheels in this project. I've been wondering for a while if it makes any sense for PyPA to have so many packaging projects.
This is what I had in mind:
- The new "wheel public API" would go into this project instead of
wheel
-
setuptools
would adopt this API and ditchwheel
as a dependency for building wheels -
wheel
would depend onpackaging
at run time (currently it vendors this project) and become nothing but a CLI tool - (optional)
auditwheel
andwheel
would be merged (preferably intowheel
) (optional, to be discussed later)
The goal is to get rid of at least one PyPA project.
Do you have an issue or something that shows what you're proposing as a "wheel public API"?
And for reading or writing wheel files, we have purposefully taken a sans-I/O approach, so it would be more like "processing and creating zipfile.ZipFile
instances that represent wheel files".
There's a long time draft PR here: https://github.com/pypa/wheel/pull/472
And for reading or writing wheel files, we have purposefully taken a sans-I/O approach, so it would be more like "processing and creating zipfile.ZipFile instances that represent wheel files".
But ZipFile
does I/O, so how does that work? And is there ongoing work towards this in packaging
already?
There's a long time draft PR here: pypa/wheel#472
It looks like https://github.com/pypa/wheel/pull/472/files#diff-8ca627810a744afb59bbb1d28a71f6ee66c9878506cfbbf2b6b980053a13c2b8 is the key bit. We already deal with wheel file names, so the key bit seems to be the parts that parse/create all the details of a wheel file.
But
ZipFile
does I/O, so how does that work?
ZipFile
can do whatever it wants, but the point is we as a project wouldn't directly be doing any I/O and would be defining more of an interface that we would work with.
And is there ongoing work towards this in
packaging
already?
Nope. Closest thing we're working on right now is parsing metadata.
ZipFile
can do whatever it wants, but the point is we as a project wouldn't directly be doing any I/O and would be defining more of an interface that we would work with.
Ok, so to clarify, does that mean you don't want to have an I/O implementation in this project? I was hoping to reduce the amount of separate projects, and having wheel
only host a thin I/O implementation + a CLI tool that few people use doesn't seem sensible to me.
FWIW, an abstracted interface for interacting with wheels was created as part of the installer
project (https://installer.pypa.io/en/stable/concepts/).
I am doubtful that we can create a sans-io implementation of the wheel format -- wheels can and do contain files that are GBs in size, which would make operating in-memory with such wheels memory-intensive, if we insist on operating with only complete strings/bytes. If we're doing things incrementally, it's better to rely on existing I/O patterns.
Notably, that's basically what we have with https://github.com/pypa/installer/blob/5f281b8e312bac4da00670b643dd4499e1c66a9a/src/installer/sources.py#L139 and even the wheel public API PR ~basically implements that interface (thanks @agronholm for doing that!).
@pradyunsg if installer
already has an API then I think it more sense to rely on that API then us moving it here and duplicating it or going through the hassle of deprecating in installer
. Thoughts?
packaging
should probably not depend on installer
?
I suspect wheel
also shouldn't depend on installer
?
Probably installer
also does not cover the "creating" part of the API thar wheel covers, just the "reading" part.
packaging
should probably not depend oninstaller
?
Correct, and I wasn't proposing that. We actually can't have any dependencies at this point.
Quoting https://installer.pypa.io/en/stable/concepts/ (about WheelSource
):
This protocol/abstraction is designed to be implementable without a direct dependency on this library. This allows for other libraries in the Python packaging ecosystem to provide implementations of the protocol, allowing for more code reuse opportunities.
Do I get a cookie for this not being needless overengineering? 🙃
Probably installer also does not cover the "creating" part of the API thar wheel covers, just the "reading" part.
That's correct. The installation usecase only cares about reading, so that was the scoping there.
I'm wondering if you would be interested in receiving an API for reading and writing wheels in this project.
I just realised that this was never answered directly -- yes, I think that's a good idea. If @brettcannon and @dstufft agree (or at least are OK with going down this route), we'd have concensus among that packaging
maintainers. :)
I'm wondering if you would be interested in receiving an API for reading and writing wheels in this project.
I just realised that this was never answered directly -- yes, I think that's a good idea. If @brettcannon and @dstufft agree (or at least are OK with going down this route), we'd have concensus among that
packaging
maintainers. :)
Sorry, missed the notification for this comment!
My only real concern is starting to have to care about file I/O which we have avoided thus far. If we can manage to keep a sans-I/O approach then I'm fine with it. If we can't then I think we need to discuss what that change in stance means and how we are going to keep it under control (i.e., I don't want to take on any networking, for instance).
I don't quite get the apprehension about file I/O. Abstracting it out is fine, but what's the problem with providing a implementation that handles the file I/O? I was hoping to get all of that moved out of wheel
into packaging
.
@agronholm not getting in the middle of I/O has been a project policy because people want to use this project in odd places and ways. For instance, if we try go do file I/O we then will have people asking about how to make it work with files stored in some cloud blob storage, some async framework, etc. We have been able to avoid those situations by simply not engaging in any I/O in any way and thus not forcing people into a specific I/O approach.
And even abstractions are a bit tenuous. For instance, if we have a creation API for wheels, how do you pass in the files you want to write and their locations? The simple way is to take a directory and just slurp it all into the zip file. But that requires an API to something that can read a directory and then read those contents. We could define our own protocol that mirrors pathlib.Path
and that's what we operate with, but we would have to maintain that protocol as there's no abstract path protocol (yet; probably Python 3.13).
I personally think we could do this in stages. The obvious stuff that requires no I/O we can do without any controversy in the first stage. And then for anything that we think would require file I/O we would have to be very thoughtful about whatever protocols we wanted to support to make it happen and not corner ourselves in some way that isn't appropriately flexible and in a second stage.
@brettcannon I'm not sure I follow. Why can't we have an IO[bytes]
object, taken as input?
The detail of dealing with the filesystem or network is then a problem for the IO object passed in, and if it can't provide a compatible API, then we can't work with that. Taking in an IO object as an argument implies that we don't have to deal with the details of the IO here (leaving that to the object at hand).
I would like to add that during the 5 years I've maintained wheel
, nobody has come asking for these things :smiley: , even when they thought wheel has a public API. But indeed, declaring an interface that works with file descriptors should be good enough, and we're not in any way obligated to fulfill every request coming our way.
Why can't we have an
IO[bytes]
object, taken as input?
I'm assuming you mean typing.IO[bytes]
(which is typing.BinaryIO
)?
And we definitely can go with that protocol, we just to have to agree that's the protocol we want to go with and understand the implications. Specifically, that protocol is big and it isn't async. I'm personally fine with all of that, to be clear, but I want us to be explicit that this is how we want to support file I/O in the project going forward, especially if we don't want to put in the effort to specify smaller, stricter protocols (which I'm once again fine not doing for the simplicity of it all).
We also have to understand that protocol has no way to walk the file system (e.g. pathlib.Path.iterdir()
). So if we wanted a create_wheel()
that took in a directory like src/
and create a wheel from that, we don't have a way to express such an operation. Now we can obviously work around that by having whatever "create a wheel" API we have build the wheel up file-by-file via some add_file()
method or something.
Once again, I'm not saying "no" to trying to make this work at all (I think it makes sense for a wheel API to exist here and it makes sense for it to be able to read and create wheels), I just want us to all agree on where the edges of this project are going to be pushed out to since this will change some things for us.
Ah, OK. I understand the concern now. 😅
I agree that the choice of having I/O-based API (and the corresponding I/O mechanism) should be an explicit design choice. And, I'd be on-board for having a sans-I/O implementation (with a filesystem-based I/O helper provided out of the box).
However, I can't really come up with any sans-I/O design where we can have tooling handle wheels with large files in them, without a bunch of memory overhead in a sans-I/O model. Given that, I'm OK with a non-async I/O model for composing/creating a wheel and processing/reading a wheel.
The reading portion of a wheel API is covered by the installer's WheelSource
API (the one piece of that which might be worth improving is the RECORD file validation, which is currently coupled to the WheelSource
, and coupled to the I/O model -- I'm happy to evolve installer's WheelSource
to have a better approach for that).
It seems like it should be possible to design a sans I/O approach here? Maybe not with the zipfile module itself, but that's just a limitation of that module then. But in theory, we should be able to define an abstraction that is passed and emits chunks rather than whole files at once.
Are you thinking of something like:
from functools import partial
from shutil import COPY_BUFSIZE
wheel = ...
with open(large_model_filename, "rb") as model_file:
# something before to initialize the file, somehow?
for chunk in iter(partial(model_file.read, COPY_BUFSIZE), b''):
wheel.<something>(chunk, "purelib", "awesome-model.safetensors")
# something after to finalize the file, somehow?
If we put it in a context manager... I'd argue we're basically re-creating:
with open(large_model_filename, "rb") as model_file:
with wheel.open(destination, "purelib", "wb") as dest_file:
shutil.copyfileobj(model_file, dest_file)
... except without the ability to use shutil. 😉 (and, no, I don't think that's a good API)
Maybe not with the zipfile module itself, but that's just a limitation of that module then.
Yea, that's part of what I was hinting at -- there's only one-shot mechanisms to write to a zip (in-memory with io.BytesIO
, or on disk directly) with Zipfile
and we don't want to take on non-standard-lib dependencies on this project. This means that even if we had an API that took chunks one-at-a-time, we need to write things onto the zipfile on a file-by-file basis for things to actually work. In other words, we either need to write in one-shot or somehow buffer a large file.
That leaves us with the only way to do things (without taking an I/O object) as pretending that it's an I/O-ish object (while not using standard I/O APIs) while allowing for potential misuse, or a stateful object when performing the "add a file" operation.
I don't think a sans-IO model based on chunks would be better than:
with open(large_model_filename, "rb") as model_file:
wheel.<something>(model_file, "purelib", "awesome-model.safetensors")
and having a loop in there that's basically https://github.com/pypa/installer/blob/444e5295f2403ab303dcf14160591074396f1d21/src/installer/utils.py#L116.
I think it's a fine API? Like yea you're kinda doing it weird for shutil, but it also means you can do:
from functools import partial
from shutil import COPY_BUFSIZE
async def make_wheel():
wheel = ...
session = aioboto3.Session()
async with session.client("s3") as s3:
s3_ob = await s3.get_object(Bucket=bucket, Key=blob_s3_key)
stream = s3_ob["Body"]
while chunk := stream.read(COPY_BUFSIZE):
wheel.<something>(chunk, "purelib", "awesome-model.safetensors")
I think allowing that is worth the slightly worse implementation when you couple yourself to an IO implementation (and let's be honest, all Sans I/O APIs are at least a little bit less streamlined than ones that tie yourself to an IO implementation, because they have to be to actually function).
Stepping back a minute though, can we define what it is exactly that this API is trying to do? Looking at the installer code linked, I see a few things:
- Parses the wheel filename.
- Determine what the .dist-info directory name of the wheel is.
- Gets a list of the filenames that are inside of the .dist-info directory.
- Gets the contents of a file from the
.dist-info
directory. - Validates the
RECORD
file against the contents of the zip file. - Iterate over the files in the Wheel, validating that all of the files are in the
RECORD
.
Reading that, to me none of the interesting bits are actually driving the ZipFile
implementation, so we could reimplement that in a sans i/o way like this:
import zipfile
from shutil import COPY_BUFSIZE
filename = "..."
with zipfile.ZipFile(filename, "r") as zfp:
wheel = SansIOWheelFile(filename, zfp.namelist()) # Parses the filename and assigns attributes to wheel.version etc
wheel.dist_info_dir # Searches the list of filenames it's stored internally and returns the correct one.
wheel.dist_info_filenames # Searches the list of filenames it's stored internally and returns the correct ones.
# Equivalent to installer.sources.WheelFile.read_dist_info, but letting the caller do the I/O
zfp.read(wheel.dist_info_filename("METADATA"))
# Equivalent to installer.sources.WheelFile.validate_record(validate_contents=True) but
# letting the caller do the I/O.
#
# Note: Context manager and for loop only need for the validate_contents = True case
with wheel.record_validator(zfp.read(wheel.dist_info_filename("RECORD"))) as validator:
for file_validator in validator:
with zfp.open(file_validator.filename, "r") as stream:
for chunk in stream.read(COPY_BUFSIZE):
_chunk = file_validator.read(chunk) # Does no I/O, just updates the internal hash and returns the chunk
# Upon validator.__exit__ all of the hashes will get finalized and we'll collect our issues
# Equivalent to installer.sources.WheelFile.get_contents but letting the caller do the I/O
for filename in wheel.get_filenames(zfp.read(wheel.dist_info_filename("RECORD"))):
zfp.read(filename) # ... Or use zfp.open or whatever
Like all Sans I/O APIs, it's a little more verbose than one that is married to a specific I/O implementation, but it's not like.. terrible? It seems perfectly fine, and most of these things actually exist in installer already (particularly the reading of chunks to validate the files in records) but by turning the API "inside out" we divorce the logic of what we're doing from the logic of how I/O happens. Making it possible to drive it from an asynchronous I/O case, but also making testing significantly easier since we separate it from the need to do I/O at all.
In fact, you can see that the installer.sources.WheelFile
doesn't even want to drive the I/O in all cases, because it's returning a file object from its get_contents
, because the actually interesting part of that method is validating that the filenames are in the RECORD
not calling ZipFile.open()
.
You can tweak the exact API a bit, for instance you could have the various methods that need to access the list of filenames to each pass in the filenames rather than requiring it passed into the constructor... or you could also record the RECORD
contents to be passed into the constructor so the RECORD
consuming methods don't need to implement them.
Of course, given a SansIOWheelFile
like above, there's nothing stopping us from also including a synchronous WheelFile
that just acts as a wrapper around SansIOWheelFile
and does the boilerplate I/O stuff. Something like:
import zipfile
class WheelFile:
def __init__(self, zfp: zipfile.ZipFile):
self._zipfile = zfp
self._wheel = SansIOWheelFile(os.path.basename(zfp.filename), zfp.namelist())
@property
def dist_info_dir(self):
return self._wheel.dist_info_dir
@property
def dist_info_filenames(self):
return self._wheel.dist_info_filenames
def read_dist_info(self, filename):
self._zipfile.read(self._wheel.dist_info_filename(filename))
def validate_record(self):
with wheel.record_validator(zfp.read(wheel.dist_info_filename("RECORD"))) as validator:
for file_validator in validator:
with self._zipfile.open(file_validator.filename, "r") as stream:
for chunk in stream.read(COPY_BUFSIZE):
file_validator.read(chunk)
def get_contents(self):
for filename in wheel.get_filenames(zfp.read(wheel.dist_info_filename("RECORD"))):
yield self._zipfile.open(filename, "r")
Stepping back a minute though, can we define what it is exactly that this API is trying to do? Looking at the installer code linked, I see a few things: Looking at the installer code linked, I see a few things:
@dstufft , the ones you wrote, plus:
- Creating a Python representation of a wheel, that you can use to:
- add files
- automatically create a
WHEEL
file - automatically create a
RECORD
file once you are done adding files.
Lastly:
- We want to create an actual file from this representation (it does not have to do IO, but it should be easy to integrate with something else that does).
I think this thread started from a comment in the setuptools issue tracker, when @agronholm and I were discussing moving bdist_wheel
to setuptools
. So one of the primary use cases is creating wheels.
Other part of the discussion that preceeded this thread is that, wheel
currently impements the macosx_libfile module to calculate the most adequate platform tag for the current machine. I think is an important piece necessary to create wheels.
This is related to some existing functionality in packaging
, and conceptually feels like a natural extension.
So maybe diserves a different thread, but when talking about porting code from pypa/wheel
to pypa/packaging
, that is also part of the work imho.
Anyway, this is the final requirement:
- calculate the most appropriate platform tag to use for the current machine when creating a binary wheel.
Sounds good so far! Having a sans-I/O base implementation, plus a helper for actually working with wheels in the file system would work great in my opinion. I'm not opposed to an async helper either if someone wants that.
Anyway, this is the final requirement:
- calculate the most appropriate platform tag to use for the current machine when creating a binary wheel.
Just to be clear, the current machine should preferably be a non-factor when creating wheels. I would like the implementation to determine the platform tags from the metadata and files being presented to the API, if possible.
I assume the inputs for creating a wheel are the files to include + the relevant metadata files, sans the wheel specific ones like WHEEL
and RECORD
?
import contextlib
import os
import zipfile
from shutil import COPY_BUFSIZE
# Kinda sucks I feel like read/create should be different classes, but both kinda
# want to be called WheelFile
from something import SansIOWheelMaker
# Maybe we can have some defaults for the various tags? Or maybe we should
# require them to be passed in. Probably require them to be passed in? I definitely
# think any serious tag generation should live outside this, and get passed in.
#
# This should also include stuff like the Generator, If root should be purelib,
# etc. Basically all the stuff that goes into WHEEL and/or influences where stuff
# gets placed in the wheel.
wheel = SansIOWheelMaker("myproject", "myversion", python_tag="...")
with contextlib.ExitStack() as stack:
# The library has computed the wheel filename for us, so we can just go
# ahead and open it.
wheelfile = stack.enter_context(zipfile.ZipFile(wheel.filename, "w"))
# Real things will probably do something smarter than just scandir, like
# checking for files committed in VCS, or looking at the package names or
# whatever.
for entry in stack.enter_context(os.scandir("build/")):
# Zip files can contain directories, but the RECORD file doesn't mention
# them, so it's unclear if wheels can/should contain directory records
# or not, and if they do what they should be in RECORD, but if they
# can we can just do the right thing here.
if entry.is_dir():
continue
# We'll need to open the existing file, and also the zip file to add it
with contextlib.ExitStack() as fstack:
# In reality, we'd have to have logic to determine if this is a purelib
# or something else.
file = fstack.enter_context(wheel.add_file(entry.name, "purelib"))
sfp = fstack.enter_context(open(entry.path, "rb"))
zfp = fstack.enter_context(wheelfile.open(file.name, "w"))
while chunk := sfp.read(COPY_BUFSIZE):
# file.write does not hold onto any data, it's just taking chunks
# so that it can run them through a hasher for RECORD, and returns
# the chunk so the caller can do something with (presumably write
# it to a zip file).
zfp.write(file.write(chunk))
# When the `file` context manager ends, the RECORD file hash will
# be finalized. We could make it not a context manager, and just
# finalize whenever we emit the RECORD file, or offer both.
# We also need to pass the metadata in somehow, which we could make a
# different function for it, or we could just use a "metadata" sigil to
# wheel.add_file. Here we pick a different function, because there's no
# promise that sysconfig schemes don't add a `metadata` type in the future.
#
# Another option for METADATA would be to have it take a Metadata class
# instance, or possibly even have the SansIOWheelMaker take it and it can
# use that for the project name / version / etc, then METADATA can be treated
# like WHEEL and RECORD is, as "finalizing" files.
with contextlib.ExitStack() as fstack:
file = fstack.enter_context(wheel.add_metadata_file("METADATA"))
zfp = fstack.enter_context(wheelfile.open(file.name, "w"))
zfp.write(file.write(b"...."))
# After this, calling anything to add any kind of file to the wheel should
# be an error, as should trying to write to any existing file objects, etc.
# This emits the RECORD and WHEEL file (and METDATA, or any other future
# files that may be mandatory).
#
# We could make SansIOWheelFile a context manager and have its __exit__
# error if this method hasn't been called.
for filename, filedata in wheel.finalize():
with wheelfile.open(filename, "w") as zfp:
zfp.write(filedata)
I think that API looks perfectly fine?
What's interesting is I tried to make a synchronous API like I did for reading, and that actually feels much harder to do in a generic way, because there's a lot of little fiddly bits that are hard to get right generically, because for each file you basically need to know:
- If the file should be included.
- What
sysconfig
scheme this file should be a part of. - What is the filename, relative to the root of the
sysconfig
scheme directory for the filessysconfig
scheme.
However, that becomes hard to do without ending up with API that looks similar to the above or that makes assumptions about the above based on something. I kind of sketched it out the best I could, and came up with this:
class WheelMaker:
def add_files(self, path, callback):
with os.scandir(path) as it:
for entry in it:
if entry.is_dir():
continue
if meta := callback(path, entry.name):
scheme, arcname = meta
with contextlib.ExitStack() as fstack:
file = fstack.enter_context(self._wheel.add_file(arcname, scheme))
sfp = fstack.enter_context(open(entry.path, "rb"))
zfp = fstack.enter_context(self._zipfilefile.open(file.name, "w"))
while chunk := sfp.read(COPY_BUFSIZE):
zfp.write(file.write(chunk))
You would have to pass a callback that ws something like:
def callback(base, filepath):
if filepath.endswith("*.pyc"): # Check in VCS, or whatever makes sense
return None
# (Scheme, Filepath Relative To Scheme Dir)
return "purelib", filepath
The real callback would likely be much more complicated than that of course, and it works? But it assumes that you have the files on disk in a layout that makes sense to be shuffled right into the wheel (which may be a reasonable assumption?) or that you will call add_files
multiple times or something.
Another option is to add (either instead of, or in addition to) a add_file
API, but that can take a path or a file object (and if you do it in addition to, you can refactor add_files
to use it), and that can end up looking something like:
class WheelMaker:
def add_files(self, path, callback):
with os.scandir(path) as it:
for entry in it:
if entry.is_dir():
continue
if meta := callback(path, entry.name):
scheme, arcname = meta
self.add_file(entry.path, arcname, scheme)
def add_file(self, path_or_obj, arcname, scheme):
with contextlib.ExitStack() as fstack:
if isinstance(path_or_obj, str): # PathLike I guess?
sfp = fstack.enter_context(open(path_or_obj, "rb"))
else:
sfp = path_or_obj
file = fstack.enter_context(self._wheel.add_file(arcname, scheme))
zfp = fstack.enter_context(self._zipfilefile.open(file.name, "w"))
while chunk := sfp.read(COPY_BUFSIZE):
zfp.write(file.write(chunk))
Even with all of that, you still need a way to add metadata files... which I guess you could just treat as a special type of scheme? But like earlier there's no guarantee that "metadata" doesn't become a scheme name in the future. We could add an add_file
like API that just adds the metadata file, so something like:
class WheelMaker:
def add_metadata_file(self, filename, path_or_obj):
with contextlib.ExitStack() as fstack:
if isinstance(path_or_obj, str): # PathLike I guess?
sfp = fstack.enter_context(open(path_or_obj, "rb"))
else:
sfp = path_or_obj
file = fstack.enter_context(self._wheel.add_metadata_file(filename))
zfp = fstack.enter_context(self._zipfile.open(file.name, "w"))
while chunk := sfp.read(COPY_BUFSIZE):
zfp.write(file.write(chunk))
Since we can do synchronous I/O, we wouldn't need the .finalize()
method, and could just emit RECORD
and WHEEL
in __exit__
?
Oh, but in the above I was assuming that they could pass us a fp that we could wrap with a ZipFile
, but they won't actually know the filename (unless we require they figure it out themselves) until after the WheelMaker has been instantiated, or we end up needing an explicit WheelMaker().open()
that takes the fp
, or we need WheelMaker
to take a directory it will put the file in?
Anyways, all together, the sync API use could end up looking something like:
import contextlib
import io
from something import WheelMaker
class FileFilter:
def __init__(self, *args, **kwargs):
# Some information stored here like the VCS root, or the package name
# or whatever this caller uses to classify and filter files.
pass
def __call__(self, base, filepath):
# Real filtering/classification Logic
pass
wheel = WheelMaker("myproject", "myversion", python_tag="...")
with contextlib.ExitStack() as stack:
wfp = stack.enter_context(open(wheel.filename, "wb"))
# This returns an object, and we probably move all of the add_* stuff to an
# inner class that is returned from __enter__, unless we want to just return
# self and have all of the add_* methods have to check if we have a file
# we're operating on at the moment.
wheel = stack.enter_context(wheel.open(wfp))
# Assume that we have a build directory that has stuff in the correct place
# since that is our best case, more complicated cases will have to use the
# wheel.add_file() directly.
wheel.add_files("build/", FileFilter())
# This file is generated in memory or whatever, for some reason it's not in
# the build directory
wheel.add_file(io.BytesIO(b"..."), "mypkg/__generated__.py", "purelib")
# Add whatever metadata you want.
wheel.add_metadata_file("METADATA", io.BytesIO(b"..."))
# When the wheel context manager exits, it will write the RECORD and WHEEL and
# any other files to the archive, then close the internal zipfile, the underlying
# file object will still be open, which in the above will be closed by ExitStack
# next.
It gets a little awkward due to the circular dependency between the WheelMaker.__init__
being the thing that computes the filename and also wanting to have the caller pass in a path or fileobj to operate on, but overall it seems ok?
I saw #805, and wanted to highlight that there is a Wheel editing API in setuptools-ext
(full disclosure: I have contributed to it in the past). That API also supports read of existing wheels, and write of new wheels (to some extent - this could be improved though IMO). It is lazy in the sense that when you call the wheel.write()
method to add a file, it holds that in-memory until the wheel.write_wheel()
method is called (this can obviously be memory intensive, but given we are talking about wheels
I think that is reasonable [even considering the size of tensorflow etc.]).
I would be happy to discuss the api in more detail if there is interest (at the least, I think it is interesting to consider when designing a wheel api from scratch). An example of its use for modifying wheel metadata can be found at https://github.com/wimglenn/setuptools-ext/blob/0.9/setuptools_ext.py#L221-L233.
Finally, it is a bit of a surprise to me that we would go from a specialised project (wheel
) to a general one (packaging
). My guess is that, given there is no definitive answer to what a good API looks like for wheels, you would want to expose an experimental API and build experience from there. I don't personally believe packaging
is the best place to do that. Just my 2 cents from an avid packaging
user.
My question is, since it's not a lot of code, why should we want to have so many scattered packaging infrastructure projects? I'm trying to undo at least some of the insanity. You have so far given no rationale for putting this interface to wheel
- where I don't want it. I want wheel
to become nothing but a CLI tool for manipulating wheels. Perhaps one day even that could be eliminated entirely.
If you have suggestions/criticism about the API itself, feel free to voice your opinions in the PR.