RFC: hyperopt facelift
This is a continuation of a discussion started in #348, where I promised to summarise my thoughts on what could be cleaned up in the current codebase to hopefully make things easier for everyone.
Preface: while trying to build a conda package for hyperopt, I had to run the tests on circleci / appveyor / travis on all possible platforms, and many of them were failing randomly on different boxes. I then spent a good few days digging through the codebase and trying to make sense of it, also noting the existing todos in the comments.
Note 1: while some of the points below may seem too intrusive, please hear me out first. Note 2: apologies for the wall of text.
- MIT?
I've noticed that some of the files claim they are MIT-licensed, e.g. hyperopt/ipy.py, while the whole project is 3-clause BSD.
Would the authors consider switching the project to the somewhat friendlier MIT license (unless there are reasons we're not aware of that prohibit doing so)?
- Drop Python 2 support
(What?!)
In all seriousness, I suggest dropping Python 2 support for good. Before jumping to pros/cons, let's look at the current state of things in Python world:
- Python 2 is EOL in 2020
- Many big Python projects are dropping Python 2 support significantly before 2020; if you haven't seen this before, check out http://www.python3statement.org/ for a partial list of projects and an approximate timeline
- IPython stopped supporting Python 2 as of version 6.0
- Most importantly (for hyperopt), NumPy will switch to Python 3.x-only releases within a year
Pros:

- Dependency on the `six` package can be removed
- Dependency on the `future` package can be removed
- Decluttering the imports; all of the below can be removed (taken from `hyperopt/fmin.py` as an example):

  ```python
  from __future__ import print_function
  from __future__ import absolute_import
  from future import standard_library
  from builtins import str
  from builtins import object
  import six.moves.cPickle as pickle
  standard_library.install_aliases()
  ```

  which becomes just

  ```python
  import pickle
  ```

- Type annotations can (and should) be used throughout the code; this helps general readability and allows IDEs and static checkers like mypy to verify type correctness. E.g.,

  ```python
  def __init__(self, symbol_table, apply_name, o_len, pure): ...
  ```

  can be annotated as

  ```python
  def __init__(self, symbol_table: SymbolTable, apply_name: str,
               o_len: int, pure: bool) -> None: ...
  ```

  You could also use more complex annotations, including generics such as `Dict[K, V]` or `Iterator[T]`. Note that many docstrings saying 'this expects an argument of such and such type and returns such type' can essentially be replaced with a type annotation; docstrings can then be used for describing the effects and/or side effects, i.e. what the function actually does.
- Enables the use of `asyncio` if needed
- Enums should really be treated as such; e.g.,

  ```python
  # hyperopt/base.py
  JOB_STATE_NEW = 0
  JOB_STATE_RUNNING = 1
  JOB_STATE_DONE = 2
  JOB_STATE_ERROR = 3
  JOB_STATES = [
      JOB_STATE_NEW,
      JOB_STATE_RUNNING,
      JOB_STATE_DONE,
      JOB_STATE_ERROR]
  ```

  should probably just be

  ```python
  import enum

  class JobState(enum.IntEnum):
      NEW = 0
      RUNNING = 1
      DONE = 2
      ERROR = 3
  ```

  (The same goes for string enums.) For the sake of backwards compatibility, functions expecting those enums could take either the enums or the contained values (at least during the transition period).
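Since `IntEnum` members compare equal to their underlying ints, the backwards-compatible behaviour mentioned above mostly comes for free; a quick sketch:

```python
import enum

class JobState(enum.IntEnum):
    NEW = 0
    RUNNING = 1
    DONE = 2
    ERROR = 3

# IntEnum members compare equal to plain integers, so existing code
# that still passes raw ints keeps working during the transition.
assert JobState.NEW == 0
assert JobState(2) is JobState.DONE   # round-trip from a raw value
assert JobState.ERROR in [3, 4]       # membership via == also works
```

Functions could then normalise incoming values with `JobState(value)`, which accepts both raw ints and enum members.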
Cons:
- No Python 2 support. Only a con if you're still using Python 2 in 2018 in the first place...
- Simplifying setup.py + versioning
- `distribute_setup.py` can be removed as it's not needed
- All of `setup.py` except the `setup()` call itself can be removed; it's not needed (I've just checked).
- The preferred way of distributing scripts these days (like `hyperopt-mongo-worker`) is through entry point hooks and not through `scripts`. E.g., you could have a `hyperopt/mongo/worker.py` with a `main()` function, and then register it in `setup.py`:

  ```python
  entry_points={
      'console_scripts': [
          'hyperopt-mongo-worker = hyperopt.mongo.worker:main',
      ],
  }
  ```

  This way, it's just a normal part of the codebase and not a standalone file.
- Looking at `RELEASE.txt` (update the version in `setup.py` -> update the version in `hyperopt/__init__.py` -> sdist -> git tag), I think the versioning process could be simplified by using a tag-based scheme from the start. A good tool for that is `setuptools_scm`, made by the same people who maintain pip and setuptools. Basically, in `setup.py`, instead of hard-coding a version number you do this:

  ```python
  # setup.py
  setup(
      use_scm_version=True,
      setup_requires=['setuptools_scm'],
  )
  ```

  That's all. Now the project will be versioned according to the current git tag (plus there's a scheme for assigning versions to commits at a non-zero distance from a tag; e.g., if the current tag is v0.1.1 and the distance to it is 2, the assigned version will be something like v0.1.2-dev2+ac6f8bac). You could also provide the importable version number like so:

  ```python
  # hyperopt/__init__.py
  from pkg_resources import get_distribution, DistributionNotFound

  try:
      __version__ = get_distribution(__name__).version
  except DistributionNotFound:
      pass
  ```

  The release process would then look like this: git tag -> build sdist, wheel, or whatever else.
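For illustration, the worker module implied by the entry-point approach could look roughly like this (module layout, names, and argument handling are hypothetical, not hyperopt's actual code):

```python
# hyperopt/mongo/worker.py (hypothetical sketch)
import sys

def main(argv=None):
    """Console-script entry point replacing the standalone
    hyperopt-mongo-worker script."""
    argv = sys.argv[1:] if argv is None else argv
    # A real worker would parse argv, connect to mongo, and run the
    # polling loop here; this stub only shows the entry-point wiring.
    return 0
```

With this in place, installing the package creates the `hyperopt-mongo-worker` executable automatically on every platform, instead of shipping a standalone script file.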
- Various style/consistency/refactoring fixes
A list of things I've noticed, in no particular order, just so I don't forget:
- If Python 2 is dropped, `class A(object):` becomes just `class A:`
- The 'flake8 requires a non-blank line' comment at the end of most source files is wrong; it doesn't (anymore)
- Calling the superclass constructor is just `super().__init__()` in Python 3 (instead of `Base.__init__()` or `super(Foo, self).__init__()`)
- Use absolute imports everywhere, e.g. `from hyperopt.base import foo` instead of `from .base import foo`
- Someone should go through all `TODO` and `XXX` comments and either remove or resolve them (or convert them to GH issues); many of them have been there for years
- Comply with PEP 8 for function/argument naming, e.g. functions like `SONify` or arguments like `N`
- Avoid circular imports. E.g., `fmin` imports `base`, which imports `fmin`. There's even a note saying "Stop-gap implementation! fmin should have been a Trials method in the first place but for now it's still sitting in another file." Should it be refactored to become a Trials method, then?
- There are many references to 'bandit' in the codebase (e.g. in the command-line tool and in plotting); my understanding is that these are leftovers from previous versions, when hyperopt was restricted to bandit optimization. Should these be fixed?
- Multiprocessing parallelization backend
It would be nice to have a multiprocessing backend that 'just works'. As for parallelization itself, it could be done via e.g. joblib. However, this will involve a non-trivial amount of work, e.g. implementing an (in-memory?) shared database which would collect results across processes. Another option is using distributed (dask), as suggested in #282.
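A toy sketch of the idea, using stdlib threads as a stand-in for joblib or real processes, and a plain list as the shared in-memory 'database' (`evaluate` and `run_batch` are hypothetical names, not a proposed API):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(params):
    # Hypothetical objective; a real backend would call the user's
    # function (and a process-based backend would also need a stable
    # pickling method, e.g. cloudpickle).
    return (params['x'] - 3) ** 2

def run_batch(candidates, n_workers=2):
    """Evaluate a batch of candidate points in parallel and collect
    the results into an in-memory 'trials' store."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        losses = list(pool.map(evaluate, candidates))
    return [{'params': p, 'loss': l} for p, l in zip(candidates, losses)]
```

A real backend would additionally feed completed results back into the suggestion algorithm between batches; fitting that into the existing `Trials` interface is where most of the work would go.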
Note: looking at hyperopt/ipy.py, it says 'WARNING: IPythonTrials is not as complete, stable'. Would it become redundant if a joblib backend was implemented?
- Better separation of mongo stuff (make `pymongo` optional?)
I understand that the Mongo backend is an essential part of hyperopt; however, quite often it won't be needed (people just import `fmin` and `hp` and go on optimizing). It would be nice if, at the very least, the core code didn't try to import `pymongo` unless you tried to create a `MongoTrials`.
It might be nice to provide a clean separation, so that mongo-related code is not intertwined with the core codebase; also, `pymongo` could be made a separate dependency, enabled via a `[mongo]` setuptools extra (which would also install the mongo backend), like so:

```shell
$ pip install hyperopt[mongo]
```
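In `setup.py`, that could be declared via `extras_require`; a sketch (the dependency lists here are illustrative, not hyperopt's actual requirements):

```python
# setup.py (sketch): pymongo becomes an optional extra
from setuptools import setup, find_packages

setup(
    name='hyperopt',
    packages=find_packages(),
    install_requires=['numpy', 'scipy'],  # core dependencies only
    extras_require={
        # pulled in only via `pip install hyperopt[mongo]`
        'mongo': ['pymongo'],
    },
)
```

The mongo-specific modules would then guard their `import pymongo` so that a plain install never touches it.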
- Miscellaneous QoL improvements
Random hiccups I've stumbled upon while using hyperopt myself:
- Integer parameters. As of late, hyperopt is extremely popular for tuning hyperparameters of forest learners (such as `xgboost` or `lightgbm`). However, there are parameters that are integers (e.g., maximum depth of a tree, or number of estimators). While you can specify `hp.quniform(min, max, 1)`, you will still have to convert the value to int manually in the objective function. It would be nice if hyperopt supported this natively, as it's a very common use case; e.g., `hp.quniformint(min, max, 1)` and `hp.qloguniformint(min, max)` (with step defaulting to 1).
- Cleaner access to the current state of `Trials` from the objective function. Maybe I'm missing something, but the objective function currently receives the space and doesn't have direct access to the trials object? E.g., if you wanted to log something like:

  ```
  Iteration #10: Score: 123 (current best: 234)
  ```

  you would need to be able to pull both the current best score and the number of iterations from the trials object. You can capture the trials object manually, of course, by doing something like:

  ```python
  def make_objective():
      trials = Trials()

      def objective(space):
          do_stuff()
          log(trials)

      return objective, trials
  ```

  but it would be neat if hyperopt supported this directly. If the objective function is meant to be a simple function of the space, maybe `fmin` could alternatively accept a hypothetical `Objective` subclass with a richer API, providing access to the trials object. Or maybe it could be done differently.
- Ability to easily provide initial search point(s), as noted in #295.
- Accessing `Trials.best_trial` will throw an exception (argmin of an empty array) if there have been no successful trials recorded; the code does something like this:

  ```python
  @property
  def best_trial(self):
      candidates = [t for t in self.trials
                    if t['result']['status'] == STATUS_OK]
      ...
  ```

  However, there's no direct way to check whether there are any successful trials other than copy/pasting this; it might be nice to provide `Trials.successful_trials` (optionally, also `Trials.has_best`) properties.
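A sketch of what the proposed properties might look like, written against a stand-in class rather than hyperopt's actual `Trials` (the `STATUS_OK` value and trial-dict layout follow hyperopt's conventions; the class itself is hypothetical):

```python
STATUS_OK = 'ok'  # mirrors hyperopt.STATUS_OK

class TrialsSketch:
    """Stand-in for hyperopt's Trials, showing the proposed helpers."""

    def __init__(self, trials):
        self.trials = trials

    @property
    def successful_trials(self):
        # The filter that best_trial currently forces callers to copy/paste.
        return [t for t in self.trials
                if t['result']['status'] == STATUS_OK]

    @property
    def has_best(self):
        return bool(self.successful_trials)

    @property
    def best_trial(self):
        if not self.has_best:  # explicit check instead of argmin on []
            raise ValueError('no successful trials recorded yet')
        return min(self.successful_trials,
                   key=lambda t: t['result']['loss'])
```

Callers could then test `trials.has_best` before touching `best_trial`, instead of reimplementing the status filter.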
- Tests
- First and foremost, please don't test in-source-tree; e.g., you would never catch packaging errors this way. Ideally, the tests should be moved to a `/tests/` folder which would not have an `__init__.py` in it. In-tree tests could still be easily run via `PYTHONPATH=. <run-tests>` if need be.
- Drop hyperopt's dependency on `nose`. There's no reason to install it if you're just using the package.
- Set up automated tests on Travis (Linux / OS X) and AppVeyor (Windows). Could possibly use `tox` as well (on Travis) to simplify testing across interpreter versions, both locally and on CI. On Windows, though, you'll have to do it manually, since Python is normally installed via conda, which is not compatible with venv.
- If using tox for testing, all additional test dependencies should be listed there (e.g. pytest, matplotlib, mongo, etc).
- Optionally, set up automated coverage tracking, e.g. on codecov.
To the tests themselves:
- Would you consider switching to pytest? It's arguably a better test framework than `nose`, with a better test runner and an ecosystem of plugins. I could help with the migration if need be, once the test failures on master are fixed. Main pros: shared fixtures, fixture parametrization, less boilerplate, expression expansion on failures, native support for exact/approximate matching of numpy arrays (in the most recent versions).
- Would it make sense to have some property-checked tests? E.g., via hypothesis. Main pro: failing-case shrinkage.
- Ideally, test files should not import each other (e.g. stuff from `test_domains` being imported in other test files). The fixtures could be provided in the form of proper (possibly parametrized) fixtures; other shared stuff could also be exposed via conftest.
- matplotlib tests may fail if `DISPLAY` is not set; running plotting tests opens a new plotting window, which is not nice. Plotting tests should configure the matplotlib backend to avoid both of these problems.
- There are test failures noted in #315, most of which are due to hyperopt not catching up with the latest NumPy conventions.
- I've observed other test failures on different platforms as well; here's the list: `test_basic*`, `test_mu_is_used*`, `test_cdf*`, `test_pdf_logpdf*`, `test_random*`, `TestExperimentWithThreads*`, `test_plot*`, `test_q1lognormal*` (exact failures can be reproduced if needed by re-enabling these tests in https://github.com/conda-forge/staged-recipes/pull/4710 and re-running the builds on all platforms).
hey @aldanor, thanks for all your suggestions.
- About the license, I think only @jaberg can decide this.
- I agree with you that fading out python2 makes sense, but it's quite a lot of work (I only recently ported hyperopt to python3), and I'd rather prioritise the other points first.
- nice to have :)
- A lot of this overlaps with 1., but of course the TODOs etc. have to be cleaned. agreed.
- yeah, personally I'm in favour of replacing mongodb, if possible. That will hurt a lot, though. :)
- sure, I think it's reasonable to make this optional.
- yes, accessing `Trials` more conveniently would be great.
- personally I'm also in favour of pytest + tox for CI and moving tests to a separate folder.
@aldanor how confident are you that you can pull off replacing mongo?
I'd be happy if we could mostly focus on fixing failing tests and setting up CI, before making a new release. Afterwards we could begin to tackle other things.
Just to reply quickly to a few points:
fading out python2 makes sense, but it's quite a lot of work (I only recently ported hyperopt to python3)
It's actually not a whole lot of work, at least the initial cleanup, compared to other points like fixing up the tests or especially point (4.). In fact, I've already hacked it together in a local py3-only branch to see if it was feasible before posting this; it took me less than an hour :) Of course, there's more detailed py3-related work as outlined in (3.) that will take more time, but again, it's fairly mechanical.
@aldanor how confident are you that you can pull off replacing mongo?
With a simple/naive in-memory db + joblib? Again, it's a bit of work but not that hard; I would perhaps volunteer to try and hack it together later (once we have a stable release with tests passing and all that). Basically: a stable method of pickling (e.g. cloudpickle), a multiprocessing-safe queue to transfer hand-pickled stuff in and out, and a parallelization backend. Most of the work would really be fitting this to the existing Trials interface and all the existing conventions.
I broke out the license question to new issue ^^ to discuss there.
Re: dropping python2 support, I'd be fine with that. I sent an email to hyperopt-discuss to see if anyone objects. If no one objects in a week, then let's say python2 support is dropped.
As a suggestion: if some of the other cleanup can be done without Python 2 support getting in the way, why not make a final Python 2 release for those who are stuck with it for various reasons, before moving to Python 3?
This post has been 😴 for 18+ months! Here's a small summary:
- License --> discussed in issue XXX (is this solved?)
- drop python2 --> @jaberg seems to have decided, but hyperopt is still available to install via pip
- Code style --> I suggested `black` to solve most of the issues you mention --> see https://github.com/hyperopt/hyperopt/pull/556. @maxpumperla @jaberg while black requires Python 3.6+, could we at least merge this PR (for now just blackening the codebase, without requiring users to install black)?
- Parallelization --> ?
- Separation of mongo stuff --> ?
- Misc QoL --> ?
- Pytest --> marked as not a priority for now but a nice to have (maybe we can open an issue and mark it as Help wanted)
Other requests:
- Support for Dask
Maybe @aldanor could include this (or something similar) as a TL;DR at the beginning of the issue, to keep track of the different branches this issue has created?
Life would be a lot easier if we could work with dask and hyperopt together.