augur icon indicating copy to clipboard operation
augur copied to clipboard

filter: Split filter.py into smaller files

Open victorlin opened this issue 2 years ago • 9 comments

Description of proposed changes

No functional changes here, just splitting a big file up into smaller pieces.

Organization is up for debate.

Related issue(s)

  • Supersedes #937

Testing

Tests updated to reflect new organization of files (see #996).

victorlin avatar Jul 01 '22 22:07 victorlin

Codecov Report

Base: 61.92% // Head: 62.09% // Increases project coverage by +0.16% :tada:

Coverage data is based on head (b98580c) compared to base (9b59bd7). Patch coverage: 95.94% of modified lines in pull request are covered.

:exclamation: Current head b98580c differs from pull request most recent head 574355b. Consider uploading reports for the commit 574355b to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #997      +/-   ##
==========================================
+ Coverage   61.92%   62.09%   +0.16%     
==========================================
  Files          52       58       +6     
  Lines        6333     6361      +28     
  Branches     1559     1559              
==========================================
+ Hits         3922     3950      +28     
  Misses       2137     2137              
  Partials      274      274              
Impacted Files Coverage Δ
augur/filter/io.py 77.50% <77.50%> (ø)
augur/filter/validate_arguments.py 86.95% <86.95%> (ø)
augur/filter/_run.py 96.33% <96.33%> (ø)
augur/filter/include_exclude_rules.py 97.27% <97.27%> (ø)
augur/filter/subsample.py 98.77% <98.77%> (ø)
augur/__version__.py 100.00% <100.00%> (ø)
augur/filter/__init__.py 100.00% <100.00%> (ø)
augur/filter/errors.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Jul 01 '22 23:07 codecov[bot]

…but I expect that to be a small straightforward change.

Untested, but here are two ways to make that change.

Smallest diff:

diff --git a/augur/__init__.py b/augur/__init__.py
index 31ee6826..a69dcf44 100644
--- a/augur/__init__.py
+++ b/augur/__init__.py
@@ -142,4 +142,9 @@ def command_name(command):
     package     = command.__package__
     module_name = command.__name__
 
+    # If the command module is also a package, e.g. augur/x/__init__.py instead
+    # of augur/x.py, then use the parent package.
+    if package == module_name:
+        package = package.rsplit(".", 1)[0]
+
     return remove_prefix(package, module_name).lstrip(".").replace("_", "-")

Reconsidering some unnecessary machinery:

diff --git a/augur/__init__.py b/augur/__init__.py
index 31ee6826..0a7186fa 100644
--- a/augur/__init__.py
+++ b/augur/__init__.py
@@ -19,7 +19,7 @@ recursion_limit = os.environ.get("AUGUR_RECURSION_LIMIT")
 if recursion_limit:
     sys.setrecursionlimit(int(recursion_limit))
 
-command_strings = [
+COMMAND_NAMES = [
     "parse",
     "index",
     "filter",
@@ -29,10 +29,10 @@ command_strings = [
     "refine",
     "ancestral",
     "translate",
-    "reconstruct_sequences",
+    "reconstruct-sequences",
     "clades",
     "traits",
-    "sequence_traits",
+    "sequence-traits",
     "lbi",
     "distance",
     "titers",
@@ -44,7 +44,7 @@ command_strings = [
     "measurements",
 ]
 
-COMMANDS = [importlib.import_module('augur.' + c) for c in command_strings]
+COMMANDS = [(c, importlib.import_module('augur.' + c.replace("-", "_"))) for c in COMMAND_NAMES]
 
 def make_parser():
     parser = argparse.ArgumentParser(
@@ -56,10 +56,10 @@ def make_parser():
     add_default_command(parser)
     add_version_alias(parser)
 
-    for command in COMMANDS:
+    for command_name, command in COMMANDS:
         # Add a subparser for each command.
         subparser = subparsers.add_parser(
-            command_name(command),
+            command_name,
             help        = first_line(command.__doc__),
             description = command.__doc__)
 
@@ -129,17 +129,3 @@ def add_version_alias(parser):
         nargs  = 0,
         help   = argparse.SUPPRESS,
         action = run_version_command)
-
-
-def command_name(command):
-    """
-    Returns a short name for a command module.
-    """
-
-    def remove_prefix(prefix, string):
-        return re.sub('^' + re.escape(prefix), '', string)
-
-    package     = command.__package__
-    module_name = command.__name__
-
-    return remove_prefix(package, module_name).lstrip(".").replace("_", "-")

tsibley avatar Jul 05 '22 21:07 tsibley

Will get back to this once #1002 is done due to overlapping desires.

victorlin avatar Jul 12 '22 18:07 victorlin

Rebased on latest changes with the notable difference of using augur/filter/ instead of augur/support/filter.

victorlin avatar Jul 20 '22 22:07 victorlin

rules.py seems a little vague, but I can't think of a better name 🤔

How about include_exclude_rules.py?

Can you comment more on why you decided to re-export FilterException in augur.filter? It's not clear to me why this is necessary.

Motivation was from https://github.com/nextstrain/augur/pull/997#discussion_r914206200 but I also don't see it as necessary. Maybe @tsibley can chime in here?

victorlin avatar Jul 21 '22 22:07 victorlin

I suggested it under the thinking that augur.filter is the "facade" of the API, while everything else can be thought of as "behind the curtain". Exception classes are just as much a part of that module's API for callers as the other classes, functions, etc. in it.

tsibley avatar Jul 22 '22 22:07 tsibley

@tsibley I'm still not convinced... or at least the following lines should change under your thinking.

https://github.com/nextstrain/augur/blob/f01c45022e35fa8827a9e06f8cc2373570047200/augur/filter/filter_and_subsample.py#L11

https://github.com/nextstrain/augur/blob/f01c45022e35fa8827a9e06f8cc2373570047200/tests/filter/test_subsample.py#L5

Would you suggest to change them to these?

from augur import AugurError
from augur.filter import FilterException

In my opinion, these are to be used internally, and we should just reference them from the original location so that we as developers can clearly see that FilterException is defined in augur/filter/errors.py.

victorlin avatar Jul 28 '22 18:07 victorlin

@victorlin

In my opinion, these are to be used internally, and we should just reference them from the original location so that we as developers can clearly see that FilterException is defined in augur/filter/errors.py.

Ah, I have no issue with internal usage referencing the original location they're defined, and indeed agree it's preferable.

My point with https://github.com/nextstrain/augur/pull/997#discussion_r914206200 was that if we consider at least some of the functions in augur.filter to be part of Augur's public Python API, then we should keep FilterException available via augur.filter by re-exporting it (rather than making external callers dig around internals to find it). But if none of augur.filter is part of Augur's public Python API—and indeed I don't think we've ever fully clarified what is and isn't—then my point is moot. :-)

If you still don't agree, that's fine, and feel free to drop the re-export!

tsibley avatar Jul 28 '22 18:07 tsibley

@tsibley Ok I understand now! Indeed, augur.filter.FilterException is the first thing listed on the augur.filter module page. Your small flag on FilterException raises a bigger flag here – pretty much everything else on that page won't be available anymore (look at the new preview page)!

So before continuing here, I think we first need to clarify what part of augur.filter (and other modules) is part of Augur's Python Development API: #1011

victorlin avatar Jul 28 '22 22:07 victorlin

About to rebase this onto the latest changes, but dropping the re-export of FilterException (8ab610b34fc8d58046bf77750a02c998c6f45610) given work in #1087 that does not include augur.filter in the public API.

victorlin avatar Nov 04 '22 19:11 victorlin

Removed changelog entry b98580cda722f06051a5aec7e5f02c461b7105f7 given updated changelog intentions proposed in #1089.

victorlin avatar Nov 07 '22 21:11 victorlin