Feature Request: Add option to skip over SyntaxError in broken cells
Summary
My use case for pipreqsnb is to create requirements.txt files for both the .py files and the Python notebooks in a given folder. Currently, pipreqsnb raises an error and stops executing when it encounters a SyntaxError in a notebook cell. In a Python development environment with messy or in-progress notebooks, this can happen often, which makes dependency generation difficult. It would be useful to have an option to skip over cells with errors when parsing the Python notebook files in a folder.
Proposal
Add a feature (e.g., --ignore-errors flag or default behavior) to catch and skip over cells that cannot be parsed by ast.parse() due to syntax errors.
Example patch:
def get_import_string_from_source(source):
    imports = []
    splitted = source.splitlines()
    try:
        tree = ast.parse(source)
    except SyntaxError as e:
        # The source cannot be parsed: warn and report no imports for it.
        print(f"Warning: Skipping invalid code due to SyntaxError: {e}")
        return []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            # Keep the source line on which the import statement starts.
            imports.append(splitted[node.lineno - 1])
    return imports
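If the skipping should be opt-in rather than the default, the same idea could be gated behind a parameter. The following is only a minimal sketch: `get_import_lines` and its `ignore_errors` parameter are hypothetical names standing in for the proposed --ignore-errors flag, not existing pipreqsnb API.

```python
import ast
import sys


def get_import_lines(source, ignore_errors=False):
    """Return the source lines on which import statements start.

    `ignore_errors` is a hypothetical parameter standing in for the
    proposed --ignore-errors flag; it is not part of pipreqsnb today.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        if not ignore_errors:
            raise  # keep the current behaviour: stop on unparsable code
        print(f"Warning: Skipping invalid code due to SyntaxError: {exc}",
              file=sys.stderr)
        return []
    lines = source.splitlines()
    return [
        lines[node.lineno - 1]
        for node in ast.walk(tree)
        if isinstance(node, (ast.Import, ast.ImportFrom))
    ]


good_cell = "import os\nfrom pathlib import Path\nprint(os.getcwd())"
broken_cell = "import numpy as np\nvalues = [1 2 3]"  # missing commas

print(get_import_lines(good_cell))
print(get_import_lines(broken_cell, ignore_errors=True))
```

With `ignore_errors=False` the SyntaxError still propagates, so existing users would see no change unless they opt in.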
Benefits
- Improves robustness of pipreqsnb when working with incomplete or in-progress notebooks, which is a very common use case in data science and research.
- Prevents a single bad code cell from stopping dependency generation for the entire project.
Testing
I have tested the proposed update locally on my computer and confirmed that it:
- Successfully creates the requirements.txt file from the cells with no syntax errors, where it previously stopped execution because of the error
- Prints a warning message for each cell with invalid syntax that it skips over, e.g. 'Warning: Skipping cell due to SyntaxError: invalid syntax. Perhaps you forgot a comma? (<unknown>, line 1)'
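For anyone who wants to reproduce this without a real notebook on disk, here is a rough, self-contained sketch. It assumes cells are parsed one at a time from the notebook JSON, which may not match how pipreqsnb actually processes notebooks; `imports_from_notebook` and the demo notebook are made up for illustration.

```python
import ast
import json


def imports_from_notebook(notebook_json):
    """Collect import lines from code cells, skipping cells that fail to parse.

    Hypothetical helper for illustration; not a pipreqsnb function.
    """
    notebook = json.loads(notebook_json)
    imports, skipped = [], 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a string or a list of lines
            source = "".join(source)
        try:
            tree = ast.parse(source)
        except SyntaxError:
            skipped += 1  # count the broken cell and move on
            continue
        lines = source.splitlines()
        for node in ast.walk(tree):
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                imports.append(lines[node.lineno - 1])
    return imports, skipped


# In-memory stand-in for a messy notebook: two valid code cells,
# one broken code cell, and one markdown cell.
demo = json.dumps({
    "cells": [
        {"cell_type": "code",
         "source": ["import pandas as pd\n", "df = pd.DataFrame()"]},
        {"cell_type": "code",
         "source": "import numpy as np\nvalues = [1 2 3]"},
        {"cell_type": "markdown", "source": "# notes"},
        {"cell_type": "code", "source": "from pathlib import Path"},
    ]
})

imports, skipped = imports_from_notebook(demo)
print(imports, skipped)
```

One thing this makes visible: the import in the broken cell is dropped together with the rest of that cell, so the generated requirements.txt can be incomplete. That is probably an acceptable trade-off, but it is a good reason to keep the warning.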
Hey, thanks for the feedback.
I think pipreqs supports notebooks now. Can you check if you can solve this with pipreqs only?
- https://github.com/bndr/pipreqs Check their readme
--scan-notebooks Look for imports in jupyter notebook files.