pycdlib
UDF format ISO Large file support
Hello, First, thank you Chris for this wonderful module. I am having trouble with the UDF ISO's breaking large files up into smaller ones. I imagine I am not enabling some feature to support large files or something like that but I am not sure what I am doing wrong.
I am running version 1.11.0:
C:\Users\User\Desktop\test>pip show pycdlib
Name: pycdlib
Version: 1.11.0
Summary: Pure python ISO manipulation library
Home-page: http://github.com/clalancette/pycdlib
Author: Chris Lalancette
Author-email: [email protected]
License: LGPLv2
Location: c:\users\user\venv\mpmod\lib\site-packages
Requires:
Required-by: media-processor
Here I have a directory with a large file in it:
C:\Users\User\Desktop\test>dir bigzip
Volume in drive C has no label.
Volume Serial Number is 6000-8F6B
Directory of C:\Users\User\Desktop\test\bigzip
04/27/2021 07:33 AM <DIR> .
04/27/2021 07:33 AM <DIR> ..
07/17/2020 08:51 AM 8,367,733,776 Windows-10vm.zip
1 File(s) 8,367,733,776 bytes
2 Dir(s) 45,585,305,600 bytes free
Here is my code:
import argparse
import pathlib

import pycdlib


def dir2iso(source, destination, filter=None):
    """Create an ISO from a given source directory."""
    if filter is None:
        # Default filter accepts every path.
        filter = lambda x: True
    new_iso = pycdlib.PyCdlib()
    new_iso.new(udf="2.60")
    for eachitem in pathlib.Path(source).rglob("*"):
        # Path inside the ISO, relative to the source directory.
        udf_path = "/" + eachitem.relative_to(source).as_posix()
        if eachitem.is_dir() and filter(eachitem):
            new_iso.add_directory(udf_path=udf_path)
        elif eachitem.is_file() and filter(eachitem):
            new_iso.add_file(str(eachitem), udf_path=udf_path)
    new_iso.write(destination)
    new_iso.close()
    return "Created", []


def dir2iso_cli():
    parser = argparse.ArgumentParser()
    parser.add_argument("source", help="The path to the directory to turn into an ISO")
    parser.add_argument("destination", help="The destination ISO file to create (including path).")
    args = parser.parse_args()
    dir2iso(args.source, args.destination)


if __name__ == "__main__":
    dir2iso_cli()
I execute that program and pass it the directory containing the 1 large zip file and here is the resulting iso:
E:\>dir
Volume in drive E is CDROM
Volume Serial Number is 5957-8578
Directory of E:\
04/04/2021 01:02 AM 4,294,965,248 Windows-10vm.zip
04/04/2021 01:02 AM 4,072,768,528 Windows-10vm.zip
2 File(s) 8,367,733,776 bytes
0 Dir(s) 0 bytes free
I would appreciate your help.
Ah, interesting.
The issue here is a limitation of ISOs. Regular ISO9660 can only create files of up to 4GB. However, it allows "splitting" files into smaller files, so you can effectively get larger file sizes.
UDF does not have the 4GB file limitation. However, pycdlib treats all ISOs as ISO9660 compatible, with optional UDF support. So it still splits up all files into smaller chunks so that they are still viable from the ISO9660 perspective.
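The two sizes in the directory listing of the resulting ISO earlier in this thread are consistent with splitting at the largest multiple of the 2048-byte sector size that still fits in a 32-bit length field. Here is a minimal sketch of that arithmetic. `MAX_EXTENT` and `split_into_extents` are hypothetical names used for illustration only; this is not pycdlib's API, and the cap is an assumption inferred from the reported sizes rather than from the library's source.

```python
SECTOR_SIZE = 2048                  # ISO9660 logical sector size in bytes
# Assumed per-extent cap: largest sector-aligned size below 2**32.
MAX_EXTENT = 2**32 - SECTOR_SIZE    # 4,294,965,248 bytes


def split_into_extents(file_size):
    """Return the extent sizes needed to hold file_size bytes."""
    extents = []
    remaining = file_size
    while remaining > 0:
        chunk = min(remaining, MAX_EXTENT)
        extents.append(chunk)
        remaining -= chunk
    return extents


if __name__ == "__main__":
    # Size of Windows-10vm.zip from the original report.
    print(split_into_extents(8_367_733_776))
    # prints [4294965248, 4072768528]
```

Those two values match the 4,294,965,248-byte and 4,072,768,528-byte entries shown on drive E:.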
I'm not sure how to resolve this, to be honest. We could add a "UDF-only" mode, but it's actually quite a lot of work and I've been stuck trying to do that for years now (see #19, for instance). Otherwise, in order to maintain compatibility with older ISO9660, we kind of have to keep doing this splitting.
I'm open to other ideas, but I can't think of how to fix this right now.
Interesting. Forgive my ignorance of the various ISO standards and limitations. When I changed my ISO format to Joliet, it stopped splitting the files, and I now have a single 8GB file. Is that expected behavior?
Mark
The Joliet format didn't solve my issue after all. With Joliet, large files are no longer split into multiple files inside the ISO. However, the files inside the ISO appear to be limited to 8GB. I tried modifying the following lines of the code above:
new_iso = pycdlib.PyCdlib()
#new_iso.new(udf="2.60")
new_iso.new(joliet=3)
Now files are truncated (see below).
How do I use this library to create an ISO that contains 20GB files? Is it possible?
File lengths in ISO are truncated. File hashes don't match (for obvious reasons).
PS C:\Users\User\Desktop\source> ls *.ova
Directory: C:\Users\User\Desktop\source
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a---- 5/24/2021 2:35 PM 12123749376 VirtualMachine.ova
PS C:\Users\User\Desktop\source> Get-FileHash -Algorithm md5 *.ova
Algorithm Hash Path
--------- ---- ----
MD5 511F9BCF8863BF4FD319212A62D95836 .\VirtualMachine.ova
PS E:\> dir *.ova
Directory: E:\
Mode LastWriteTime Length Name
---- ------------- ------ ----
--r--- 6/8/2021 10:27 AM 7828784128 VirtualMachine.ova
PS E:\> Get-FileHash -Algorithm md5 *.ova
Algorithm Hash Path
--------- ---- ----
MD5 C3B0F7272D50C7115A7E31C206A5BC11 E:\VirtualMachine.ova
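The same size and hash comparison can be done from Python, which is handy when the ISO is built from a Python script anyway. This is a rough sketch that uses only the standard library; `file_digest` and `compare_files` are hypothetical helper names, not part of pycdlib.

```python
import hashlib
import os


def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large files never sit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def compare_files(original, copy):
    """Report whether the size and hash of two files agree."""
    original_size = os.path.getsize(original)
    copy_size = os.path.getsize(copy)
    same_size = original_size == copy_size
    return {
        "original_size": original_size,
        "copy_size": copy_size,
        "missing_bytes": original_size - copy_size,
        # Hash only when sizes agree; a size mismatch already proves a difference.
        "same_hash": same_size and file_digest(original) == file_digest(copy),
    }
```

For example, `compare_files(r"C:\Users\User\Desktop\source\VirtualMachine.ova", r"E:\VirtualMachine.ova")` would compare the source file with the copy on the mounted ISO. For the sizes in the listing above, the difference is 12,123,749,376 - 7,828,784,128 = 4,294,965,248 bytes, which is exactly 2^32 - 2048. That may point at the 4GB extent handling discussed earlier, though the numbers alone do not establish the cause.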
If anyone else needs to create ISOs on Windows that contain files larger than 8GB, here is the nasty, dirty, traitorous solution I came up with. If there is a way to do this with this or any other native Python module, I'd appreciate the heads-up.
https://github.com/MarkBaggett/pxpowershell
Specifically the dir2iso function in https://github.com/MarkBaggett/pxpowershell/blob/main/pxpowershell/example_dir2iso.py