Nuitka
Collect submodules without following the imports in them.
In standalone mode, it is currently impossible to add a package folder manually once part of that package has been included in the final executable.
What I actually need is a --collect-submodules option that includes the other module files of a package without checking for imports in them. What --include-package currently does is also include whatever is imported inside the submodules.
This would avoid ending up with multiple import search paths for the same package (one in the binary and one in the directory). For example:
Package A:
    a.py -> no dependencies
    b.py -> depends on large package B
Here app.py only imports A.a, and Nuitka only collects that. In the future, app.py would also dynamically import A.b.
If you try to create a folder A in the executable dir and put b.py in it, you still cannot import it, because the namespace A already exists inside the executable (the first path in sys.path).
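The shadowing effect described above can be reproduced with plain Python, without Nuitka. The sketch below is only an analogy: it assumes the package embedded in the binary behaves like a module object whose `__path__` does not cover the directory on disk, which is a stand-in and not a statement about how Nuitka's loader is implemented. The function name `try_import_shadowed_submodule` is made up for this illustration.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def try_import_shadowed_submodule():
    """Return the error message raised when importing A.b while A is shadowed."""
    with tempfile.TemporaryDirectory() as tmp:
        # A/b.py exists on disk, like a file placed next to the executable.
        pkg_dir = Path(tmp) / "A"
        pkg_dir.mkdir()
        (pkg_dir / "b.py").write_text("VALUE = 1\n")
        sys.path.insert(0, tmp)

        # Assumption: stand-in for a package that already lives inside the
        # binary and does not search the directory on disk.
        embedded = types.ModuleType("A")
        embedded.__path__ = []
        sys.modules["A"] = embedded
        try:
            # Submodules are searched via the parent's __path__, not sys.path.
            importlib.import_module("A.b")
        except ModuleNotFoundError as exc:
            return str(exc)
        finally:
            sys.modules.pop("A", None)
            sys.modules.pop("A.b", None)
            sys.path.remove(tmp)
    return None


print(try_import_shadowed_submodule())
```

Even though `A/b.py` is reachable through `sys.path`, the import fails, because the already known package `A` decides where its submodules are looked up.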
I tried --include-package-data, expecting it to include .py files, but it didn't.
Code is not data files. Data file options ignore DLLs, extension modules and Python code. You can specify a pattern to force it to include code files, but then you are on your own when you try to use them in any way.
I am not sure I like the collect approach. But ever since I added no-auto-follow to the Yaml, plugins now get told which package wants to include a module, and the decision is actually limited to that module, and so Nuitka actually is capable of doing what you say you want here. But I somehow feel you are probably just confused, and tell me a technical mechanism that you believe to be the solution, rather than what the actual problem is.
Nuitka actually is capable of doing what you say you want here.
Yes, it is as simple as adding a few lines.
But I somehow feel you are probably just confused, and tell me a technical mechanism that you believe to be the solution, rather than what the actual problem is.
As I said in the description above: this way, B does not need to be included in the binary. Later I will be able to manually put package B in the app directory, and the dynamic import of A.b will then not fail when app.py tries it.
I did it in the linked PR; currently it is implemented only for the interpreter dependencies (the encodings and importlib submodules are auto-included there).
The PR changes are not general, they only affect the stdlib packages required for startup. Also I need to review it, but encodings are already included by default, it seemed to be done that way. Not sure, why importlib should be treated that way, so far I never had an issue with it being incomplete.
Maybe you are encountering a bug that needs a resolution? The noautofollow is being used in stdlib a lot these days, e.g. we are doing these.
- module-name: 'json.decoder'
  anti-bloat:
    - description: 'avoid _json module usage'
      no-auto-follow:
        '_json': 'may slow down by using fallback implementation'
      when: 'not has_builtin_module("_json")'
So, if I get you correctly, you have something like an external Python application that you want to include, with or without its dependencies included. You do not want to, or do not know how to, use multidist for that. And quite generally, I guess, it would be sweet if Nuitka finally allowed including modules as source code rather than as compiled code, since you do not care as much about that other program as about, say, the main one.
If this is about how only the needed stdlib is included, e.g. with importlib, and the external program is supposed to use that, what I would like to see is that we end up using multidist, doing a full dependency analysis of the second program, and then excluding things that you do not want to follow. But in the meantime, if you include data files with /*.py it will do it, and you just need to add your own --include-package=importlib, which may or may not have bugs.
The yaml package configuration e.g. has this
- module-name: 'pysnmp.smi'
  data-files:
    patterns:
      - 'mibs/**/*.py'
I believe command line patterns do that as well, I recall using it recently like that for a customer. It will however, not work to use * or just directory names. The patterns in --include-data-dir I believe ought to work to allow code, but I am not 100% sure of that now.
I do not see, where your change comes in handy in any of that. Not including all of stdlib anymore, seems to be a problem for you, and I can see how that is bad. I could see how we add that as an option back, but I am not fond of doing that, happy to have gotten rid of that.
encodings are already included by default, it seemed to be done that way.
Yes, partly included; some submodules are excluded by conditions.
Not sure, why importlib should be treated that way, so far I never had an issue with it being incomplete.
Assume I want to manually put submodules in the app dir: the current approach blocks that because of the multiple search paths!
Maybe you are encountering a bug that needs a resolution? The noautofollow is being used in stdlib a lot these days, e.g. we are doing these.
I guess you misunderstood. Let me give a more obvious example, using Nuitka 1.8rc9 and this app.py:
import mypack.mysub
exec(input("> "))
mypack/
    __init__.py -> empty
    mysub.py -> print("mysub hello")
    extra.py -> print("extra hello")
Try creating a standalone build of app.py, then put a test.py in app.dist that contains:
import mypack.extra
Run app.exe and enter import test:
E:\app.dist>app
mysub hello
> import test
Traceback (most recent call last):
  File "E:\app.py", line 3, in <module>
  File "<string>", line 1, in <module>
  File "E:\app.dist\test.py", line 1, in <module>
    import mypack.extra
ModuleNotFoundError: No module named 'mypack.extra'
Here test.py was added manually, but I cannot add mypack.extra manually.
It will however, not work to use * or just directory names,
Correct, I tried all of these, and the Yaml file, before. The Yaml file is processed per module; I want the --collect-submodules option to apply to all collected packages. As I don't want to re-compile, I can then put files manually in the directory.
I do not see, where your change comes in handy in any of that. Not including all of stdlib anymore, seems to be a problem for you, and I can see how that is bad. I could see how we add that as an option back, but I am not fond of doing that, happy to have gotten rid of that.
It's WIP and only done for the interpreter dependencies, to show that it works fine in the base code. We should expand it elsewhere, as well as adding CLI options.
So, do you want to specify what to include, or do you imagine including all modules that you have in your Python installation? You could still do that externally by building a command line, but surely it will explode due to length issues on even the more forgiving OSes, or by using project options, which do not have that kind of limit.
Plugins can at this time not contribute to the list of root modules, which arguably is an omission, and ought to be easy to add. You would walk there with pkgutil.walk_packages() or whatever does the trick, and add all of those, then decide their compilation mode as well, and maybe that will already work. With the command line options of plugins, you can then control that, and we could even add that plugin.
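As a rough sketch of the walking step mentioned above, the following lists every module below a package using the standard library only. It does not touch any Nuitka plugin API; the helper name `list_submodules` is made up for this illustration, and how a plugin would hand these names over as root modules is left open here.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted full names of all modules below a package."""
    package = importlib.import_module(package_name)
    # Packages have __path__; plain modules have no submodules to walk.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(
        info.name
        for info in pkgutil.walk_packages(search_path, prefix=package_name + ".")
    )


print(list_submodules("json"))
```

Note that `pkgutil.walk_packages()` imports the sub-packages it descends into, so a real implementation might prefer `pkgutil.iter_modules()` per directory or a plain file system scan to avoid side effects.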
I am not sure, what you said so far really requires core changes. Accepting patterns for the inclusion options like * I am sure will be a not so nice new user trap, because something will tell them, yeah, that's solving my issues, where it probably is not the right solution.
I guess with https://github.com/Nuitka/Nuitka/pull/2383/commits/07140a6d57327c092bab167b400fd98e4603323e and https://github.com/Nuitka/Nuitka/pull/2383/commits/474d96e2e50ab259b0fc29b91a14c61e6a4e5b30, this is now more understandable (ModuleName with a dont_follow attribute).
So, do you want to specify what to include, or do you imagine including all modules that you have in your Python installation?
We can do both: --sub-collect=[ all | stdlib | specificPackage ]
I am not sure, what you said so far really requires core changes. Accepting patterns for the inclusion options like * I am sure will be a not so nice new user trap, because something will tell them, yeah, that's solving my issues, where it probably is not the right solution.
We don't need * as an option. This option is for preventing multiple module search paths for those who add more packages to the directory manually.
Including all (recursive) submodules without following the imports in them would be enough, so the package name alone would suffice.
It seems you are re-implementing nuitka.importing.Recursion.decideRecursion in a worse way (module names are strings, they are not supposed to carry usage information). This is the typical pattern of how it is used:
# This will get back to all other plugins allowing them to inhibit it though.
decision, decision_reason = Recursion.decideRecursion(
    using_module_name=module.getFullName(),
    module_filename=module_filename,
    module_name=full_name,
    module_kind=module_kind,
)
if decision:
    imported_module = Recursion.recurseTo(
        module_name=full_name,
        module_filename=module_filename,
        module_kind=module_kind,
        source_ref=module.getSourceReference(),
        reason="implicit import",
        using_module_name=module.module_name,
    )
    addUsedModule(
        module=imported_module,
        using_module=module,
        usage_tag="plugin:" + plugin.plugin_name,
        reason=decision_reason,
        source_ref=module.source_ref,
    )
At the point where the stdlib is scanned, these decisions should be asked for and used. It is actually a bug not to do it, which makes e.g. --nofollow-import-to=textwrap not have an effect.
Including all of stdlib would be an include option that makes the decision function always return yes. Following is not including, so a --include-stdlib could be added to force including all, and a --noinclude-stdlib to disable the compromise that Nuitka currently implements in a hard-coded way, which should be used when neither of those is given.
The stdlib scan has historically had 2 phases: one where it picks technically needed stuff, plus one picking up the stdlib module names for inclusion generally, even without anything else using them, where the latter is based on a file system scan.
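To make the proposal above more concrete, here is a minimal sketch of such a decision function. Everything in it is hypothetical: --include-stdlib and --noinclude-stdlib are only suggested in this thread, and `StdlibMode`, `decide_stdlib_module`, `technically_needed` and `default_selection` are names invented for illustration, not Nuitka code.

```python
from enum import Enum


class StdlibMode(Enum):
    INCLUDE_ALL = "include"    # hypothetical --include-stdlib
    NO_INCLUDE = "noinclude"   # hypothetical --noinclude-stdlib
    DEFAULT = "default"        # neither given: the current compromise


def decide_stdlib_module(module_name, mode, technically_needed, default_selection):
    """Return (decision, reason) for one stdlib module found by the scan."""
    # Phase 1: modules needed for interpreter startup are always taken.
    if module_name in technically_needed:
        return True, "technically required"
    # Phase 2: general inclusion, driven by the chosen mode.
    if mode is StdlibMode.INCLUDE_ALL:
        return True, "all of stdlib requested"
    if mode is StdlibMode.NO_INCLUDE:
        return False, "stdlib inclusion disabled"
    if module_name in default_selection:
        return True, "default stdlib selection"
    return False, "not in default stdlib selection"


print(decide_stdlib_module("textwrap", StdlibMode.NO_INCLUDE, {"encodings"}, {"textwrap"}))
```

Returning the reason together with the decision mirrors the `decision, decision_reason` pair in the pattern quoted earlier, so a user-level option such as --nofollow-import-to could still be consulted around it.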
I have a similar issue with the pygame package.
All submodules like pygame.sprite, pygame.display, pygame.mixer etc. are missing and result in ModuleNotFoundError.
@ArtBIT can you open a new issue with a minimal reproducible example? thank you.
This issue never bore any fruit. From my understanding, it was an attempt to have an extra implementation of how to decide recursion rather than using a plugin.