vaex
[BUG-REPORT] Exception in pyinstaller bundled app for vaex >=4.6.0
Description
I'm facing two exceptions when using latest vaex versions (4.6.0 and 4.7.0) after bundling using pyinstaller 4.7.
First exception
Traceback (most recent call last):
File "main.py", line 1, in <module>
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
File "vaex\__init__.py", line 43, in <module>
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
File "vaex\dataset.py", line 13, in <module>
File "<frozen importlib._bootstrap>", line 983, in _find_and_load
File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "PyInstaller\loader\pyimod03_importers.py", line 476, in exec_module
File "frozendict\__init__.py", line 22, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\git_repos\\pyinstaller_problem2_minmal\\dist\\main\\frozendict\\VERSION'
It's caused by the VERSION file of frozendict (new in versions >2.0) not being bundled. That's actually a pyinstaller/frozendict issue. I just wanted to post the solution here as others will likely face the same issue. It can be solved by using the following hook file:
hook-frozendict.py:
from pathlib import Path
import frozendict
datas = [(Path(frozendict.__path__[0]) / 'VERSION', 'frozendict')]
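To confirm whether a data file such as frozendict's VERSION actually ended up in the bundle, a small check along these lines could be run from inside the frozen app. This is only a sketch: `bundle_root` and `missing_data_files` are hypothetical helper names, and it assumes PyInstaller's documented behaviour of setting `sys.frozen` and `sys._MEIPASS` in bundled apps.

```python
import sys
from pathlib import Path


def bundle_root():
    """Return the bundle directory when frozen by PyInstaller, else None."""
    # PyInstaller sets sys.frozen and sys._MEIPASS inside a bundled app.
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return Path(sys._MEIPASS)
    return None


def missing_data_files(relative_paths, root):
    """Return those relative paths that do not exist under root."""
    return [p for p in relative_paths if not (Path(root) / p).exists()]


if __name__ == "__main__":
    root = bundle_root()
    if root is None:
        print("Not running from a PyInstaller bundle")
    else:
        # 'frozendict/VERSION' is the file named in the traceback above.
        print("Missing:", missing_data_files(["frozendict/VERSION"], root))
```

If the hook file is picked up, the list of missing files should be empty in the bundled app.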
Second exception
Hello world
Traceback (most recent call last):
File "main.py", line 6, in <module>
File "vaex\dataframe.py", line 928, in count
File "vaex\dataframe.py", line 902, in _compute_agg
File "vaex\dataframe.py", line 1672, in _delay
File "vaex\dataframe.py", line 412, in execute
File "vaex\execution.py", line 181, in execute
File "vaex\execution.py", line 186, in run
File "vaex\asyncio.py", line 51, in just_run
File "nest_asyncio.py", line 81, in run_until_complete
File "asyncio\futures.py", line 181, in result
File "asyncio\tasks.py", line 249, in __step
File "vaex\execution.py", line 334, in execute_async
File "vaex\memory.py", line 37, in create_tracker
ValueError: No memory tracker found with name default
[1272] Failed to execute script 'main' due to unhandled exception!
For this one I have not found a solution yet and would appreciate some help. I have trouble understanding how the embedded importing in vaex/memory.py works (and I guess so does pyinstaller). Any hints on how to solve this?
This is the relevant code section from vaex/memory.py:
def create_tracker():
    memory_tracker_type = vaex.settings.main.memory_tracker.type
    if not _memory_tracker_types:
        with lock:
            if not _memory_tracker_types:
                for entry in pkg_resources.iter_entry_points(group="vaex.memory.tracker"):
                    _memory_tracker_types[entry.name] = entry.load()
    cls = _memory_tracker_types.get(memory_tracker_type)
    if cls is not None:
        return cls()
    raise ValueError(f"No memory tracker found with name {memory_tracker_type}")
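The loop above only fills the registry if the entry-point metadata of the installed distributions is visible at runtime. To see what a given environment exposes for a group, something like the following can be run both as a plain script and inside the bundled app. It is a sketch using the standard library's `importlib.metadata` (Python 3.8+, so not the 3.7.9 from this report) rather than the `pkg_resources` call vaex uses here; `list_entry_points` is a hypothetical helper name.

```python
from importlib import metadata


def list_entry_points(group):
    """Return {name: 'module:attr'} for all entry points in the given group."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable interface
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: plain dict mapping group name to entry points
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# An empty dict here means no metadata for this group is visible, which is
# the situation in which create_tracker() raises the ValueError shown above.
print(list_entry_points("vaex.memory.tracker"))
```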
Steps to reproduce are the following:
main.py:
import vaex
print("Hello world")
df = vaex.from_dict({'A':[1,2,3]})
print(df.count())
Executing python main.py, the script runs fine.
Bundle using pyinstaller 4.7 (with the above-mentioned hook-frozendict.py):
pyinstaller --onedir --additional-hooks-dir=. main.py
Output of main.exe is:
Hello world
Traceback (most recent call last):
File "main.py", line 6, in <module>
File "vaex\dataframe.py", line 928, in count
File "vaex\dataframe.py", line 902, in _compute_agg
File "vaex\dataframe.py", line 1672, in _delay
File "vaex\dataframe.py", line 412, in execute
File "vaex\execution.py", line 181, in execute
File "vaex\execution.py", line 186, in run
File "vaex\asyncio.py", line 51, in just_run
File "nest_asyncio.py", line 81, in run_until_complete
File "asyncio\futures.py", line 181, in result
File "asyncio\tasks.py", line 249, in __step
File "vaex\execution.py", line 334, in execute_async
File "vaex\memory.py", line 37, in create_tracker
ValueError: No memory tracker found with name default
[1272] Failed to execute script 'main' due to unhandled exception!
Software information
- Vaex version (import vaex; vaex.__version__): {'vaex-core': '4.7.0'}
- Vaex was installed via: pip
- Python: 3.7.9
- Pyinstaller: 4.7
- OS: Win10
Hi,
thanks for sharing this. I think pyinstaller is not picking up entry points for some reason. Those are listed in https://github.com/vaexio/vaex/blob/1b04e089a60d838362aad71ee4fdef9dc6e174be/packages/vaex-core/setup.py#L185. Does this help you?
Regards,
Maarten Breddels
I'm having the same error. What you're mentioning is included in entry_points.txt but I don't know how to solve that.
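For context, entry_points.txt is an INI-style file in a distribution's metadata directory: each section is an entry-point group and each line maps a name to a `module:attribute` target. The sketch below parses such text with the standard library. The sample target `hypothetical_pkg.memory:HypotheticalTracker` is a made-up placeholder, not the real vaex value, and `parse_entry_points` is a hypothetical helper name.

```python
import configparser

# Sample content; the group and name come from this thread, the target
# on the right-hand side is a hypothetical placeholder.
SAMPLE = """\
[vaex.memory.tracker]
default = hypothetical_pkg.memory:HypotheticalTracker
"""


def parse_entry_points(text):
    """Parse entry_points.txt content into {group: {name: target}}."""
    # Only '=' is a delimiter, because targets contain ':'.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry-point names case-sensitive
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


print(parse_entry_points(SAMPLE))
```

If this file is not part of the bundle, lookups by group name have nothing to read from.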
I succeeded in following the hints from https://github.com/pyinstaller/pyinstaller/issues/3050 and added the following to my .spec file:
import os
import pkg_resources

# Helper function to make iter_entry_points work e.g. for vaex
# copied and modified from https://github.com/pyinstaller/pyinstaller/issues/3050
def prepare_entrypoints(ep_packages):
    hook_ep_packages = dict()
    hiddenimports = set()
    runtime_hooks = list()

    if not ep_packages:
        return list(hiddenimports), runtime_hooks

    # Collect "name = module:attr" strings per entry-point group and remember
    # the modules so that PyInstaller bundles them as hidden imports.
    for ep_package in ep_packages:
        for ep in pkg_resources.iter_entry_points(ep_package):
            if ep_package in hook_ep_packages:
                package_entry_point = hook_ep_packages[ep_package]
            else:
                package_entry_point = []
                hook_ep_packages[ep_package] = package_entry_point
            package_entry_point.append("{} = {}:{}".format(ep.name, ep.module_name, ep.attrs[0]))
            hiddenimports.add(ep.module_name)

    os.makedirs('./generated', exist_ok=True)

    # Write a runtime hook that replaces pkg_resources.iter_entry_points with
    # a version that serves the entry points collected above.
    with open("./generated/pkg_resources_hook.py", "w") as f:
        f.write("""# Runtime hook generated from spec file to support pkg_resources entrypoints.
ep_packages = {}

if ep_packages:
    import pkg_resources
    default_iter_entry_points = pkg_resources.iter_entry_points

    def hook_iter_entry_points(group, name=None):
        if group in ep_packages and ep_packages[group]:
            eps = ep_packages[group]
            for ep in eps:
                parsedEp = pkg_resources.EntryPoint.parse(ep)
                parsedEp.dist = pkg_resources.Distribution()
                yield parsedEp
        else:
            # This function is a generator, so delegate with "yield from";
            # a plain "return" would discard the default results.
            yield from default_iter_entry_points(group, name)

    pkg_resources.iter_entry_points = hook_iter_entry_points
""".format(hook_ep_packages))

    runtime_hooks.append("./generated/pkg_resources_hook.py")

    return list(hiddenimports), runtime_hooks

# List of packages that should have their "Distutils entrypoints" included.
ep_packages = ["vaex.memory.tracker"]
hiddenimports, runtime_hooks = prepare_entrypoints(ep_packages)
and then add the hiddenimports and runtime_hooks to the arguments of Analysis like so:
a = Analysis(
    ...
    hiddenimports=hiddenimports,
    runtime_hooks=runtime_hooks,
)
Hope that helps
I am using the Auto PY to EXE GUI and am facing the same issue ("Exception in Tkinter callback ..."). Can someone explain how to resolve it using the GUI?
Traceback (most recent call last):
File "tkinter\__init__.py", line 1702, in __call__
File "KPI_Automation_GUI.py", line 302, in startConversion
startConversion_mf4()
File "KPI_Automation_GUI.py", line 215, in startConversion_mf4
match_extract_txt.match_and_Extract(textfilelist, str(DriveEnv.get()))
File "match_extract_txt.py", line 2142, in match_and_Extract
df.export_hdf5(Databasehdf5FilePath_temp, progress=True, chunk_size=1000000, parallel=True, mode='w')
File "vaex\dataframe.py", line 6907, in export_hdf5
with vaex.utils.progressbars(progress, title="export(hdf5)") as progressbar:
File "vaex\utils.py", line 988, in progressbars
return tree(*args, **kwargs)
File "vaex\progress.py", line 206, in tree
return ProgressTree(bar=bar(title=title), next=next, name=name)
File "vaex\progress.py", line 181, in bar
return _progressbar_registry[type_name](title=title)
File "vaex\utils.py", line 75, in __getitem__
raise NameError(f'No {self.typename} registered with name {name!r} under entry_point {self.entry_points!r}')
NameError: No progressbar registered with name 'simple' under entry_point 'vaex.progressbar'
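This NameError looks like the same mechanism as the memory tracker error, only for a different entry-point group ('vaex.progressbar'). Assuming the .spec helper posted earlier in this thread is used, the group list could be extended as sketched below. This is untested with the Auto PY to EXE GUI, and the list only contains groups named in error messages in this thread.

```python
# Entry-point groups taken from the error messages in this thread. Add any
# other group that shows up in a similar "under entry_point ..." message.
ep_packages = [
    "vaex.memory.tracker",  # ValueError: No memory tracker found with name default
    "vaex.progressbar",     # NameError: No progressbar registered with name 'simple'
]

for group in ep_packages:
    print(group)
```

The list would then be passed to prepare_entrypoints(ep_packages) in the .spec file as before.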
It seems Python 3.10 breaks the fix above for PyInstaller, because the new importlib.metadata is used instead of pkg_resources. For now, I've fixed this for my own project by editing dataset.py and memory.py to add the code from the generated Python hook and to set entry_points to the hook function. utils.py may also need to be overridden in some use cases, but that turned out not to be needed for bare hdf5/csv access.
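As a rough illustration of what such a hook function can look like with importlib.metadata, the sketch below builds `EntryPoint` objects from a hard-coded table instead of reading distribution metadata. It is not the code from vaex or from the generated hook: `HARDCODED` and `hooked_entry_points` are hypothetical names, and the demo entry points at `json:dumps` from the standard library only so that the example runs anywhere.

```python
import json
from importlib.metadata import EntryPoint

# Hypothetical table of "name = module:attr" strings per group. In a real
# build this would be generated at build time, as in the .spec helper above.
HARDCODED = {
    "demo.group": ["dumps = json:dumps"],
}


def hooked_entry_points(group):
    """Return EntryPoint objects for a group from the hard-coded table."""
    result = []
    for spec in HARDCODED.get(group, []):
        name, _, value = (part.strip() for part in spec.partition("="))
        result.append(EntryPoint(name=name, value=value, group=group))
    return result


for ep in hooked_entry_points("demo.group"):
    # load() imports the module and returns the named attribute.
    print(ep.name, ep.load() is json.dumps)
```

Whether assigning such a function in place of the original lookup is enough depends on how the consuming module imports entry_points, which is the open question discussed below.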
If any hidden imports are missing after using the entry points fix above, they can be identified by using something like this:
import json
import sys

# Dump the names of all modules imported so far, one sorted JSON list per run.
with open('modules.txt', 'w') as modlist:
    print(json.dumps(sorted(sys.modules.keys()), indent=4), file=modlist)
and doing a diff between the module lists of the running Python version and the compiled exe version, then filtering the results. It's possible that simply changing the import in the hook file would fix the problem, but I haven't tested that yet, as it was unclear whether the direct import of entry_points would override the hook when the module was imported a second time. Either way, combining the fix given above with one of these two possibilities will yield a working app. To avoid breaking the development process, I installed a separate Python 3.10 in parallel to the development environment for the build, so that it doesn't matter if the installed files are edited.
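The diff-and-filter step could look roughly like this. It is a sketch under the assumption that both runs wrote a JSON list of module names as above; `missing_modules`, `load_module_list` and the file names in the usage comment are hypothetical.

```python
import json


def load_module_list(path):
    """Load a JSON list of module names written by the snippet above."""
    with open(path) as f:
        return json.load(f)


def missing_modules(dev_modules, frozen_modules, prefixes=()):
    """Return modules present in the dev run but absent from the frozen run."""
    missing = sorted(set(dev_modules) - set(frozen_modules))
    if prefixes:
        # Keep only the packages of interest, e.g. everything under "vaex".
        missing = [m for m in missing if m.startswith(tuple(prefixes))]
    return missing


# Example usage with hypothetical file names:
# dev = load_module_list("modules_dev.txt")
# exe = load_module_list("modules_exe.txt")
# print(missing_modules(dev, exe, prefixes=("vaex",)))
```

Each name in the result is a candidate for a --hidden-import argument or for the hiddenimports list in the .spec file.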
EDIT: For those trying to build apps that rely on vaex-viz: the lazy accessors are set in __init__.py. Applying the same monkey patch for vaex.dataframe.accessor and vaex.expression.accessor in __init__.py will resolve the problem.
Based on the discussion above and looking in the related issues I have not been able to find a solution to this problem. I'm running Vaex 4.16 on python 3.10 in conda with the following basic example:
import vaex as vx
from vaex.hdf5.dataset import Hdf5MemoryMapped, AmuseHdf5MemoryMapped, Hdf5MemoryMappedGadget
vx.settings.main.memory_tracker.type = 'default'
vx.dataset.opener_classes = [Hdf5MemoryMapped, AmuseHdf5MemoryMapped, Hdf5MemoryMappedGadget]
df = vx.open(r"c:\20220613.hdf5")
print(df.head)
df.select(df['date'] >= "2022-06-01", mode='and')
print("count: ", df.count(selection=True))
df.select(df['starttime'] >= "2022-06-13 14:00:00", mode='and')
print("count: ", df.count(selection=True))
df.close()
This will give the correct result when running as a script. When running as an executable it gives the correct output for the df.head but the df.count results in the memory tracker issue as mentioned in this thread. I would really appreciate some help solving this as I am currently not able to package my solution as an executable.
It's been more than a year. Does anyone know the solution? Thank you in advance.
The following command line fixed the issues with Vaex 4.17 on Python 3.10:
pyinstaller --hidden-import vaex.viz --hidden-import vaex.astro.legacy --recursive-copy-metadata vaex