snakemake-wrappers
fix: gffread issue with reading gff files, writing fasta files
The gffread wrapper raises ValueError: Unknown annotation format for gff files. In addition to this, the wrapper assumes that any fasta output contains the spliced exons of each transcript (produced by -w), and not e.g. spliced coding sequences or protein fasta sequences. This PR fixes both of these issues.
Unfortunately it also introduces a breaking change to how fasta files are produced. I think defaulting to -w output for fasta files is a bad solution, but could implement this if it's required.
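For illustration, the input side of the fix can be sketched roughly as follows. This is a minimal sketch and not the actual wrapper code: the function name `input_format_flags` and the two lookup tables are hypothetical, and it assumes that gffread reads GTF and GFF3 without an extra flag while BED and TLF inputs need `--in-bed` and `--in-tlf`.

```python
from pathlib import Path

# Assumption for this sketch: gffread reads these formats without an extra flag.
NATIVE_EXTENSIONS = {".gtf", ".gff", ".gff3"}

# Assumption for this sketch: these formats need an explicit input flag.
FLAGGED_EXTENSIONS = {".bed": "--in-bed", ".tlf": "--in-tlf"}


def input_format_flags(annotation: str) -> str:
    """Return the extra gffread input flag implied by the file extension.

    Hypothetical helper; the real wrapper may structure this differently.
    """
    suffix = Path(annotation).suffix.lower()
    if suffix in NATIVE_EXTENSIONS:
        # GTF/GFF input: nothing to add to the command line.
        return ""
    if suffix in FLAGGED_EXTENSIONS:
        return FLAGGED_EXTENSIONS[suffix]
    # Anything else is rejected, mirroring the error quoted above.
    raise ValueError(f"Unknown annotation format: {annotation}")
```

With a check like this, `annotation.gff3` is accepted in the same way as `annotation.gtf`, instead of falling through to the `ValueError` branch.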
QC
- [x] I confirm that I have followed the documentation for contributing to snakemake-wrappers.
While the contributions guidelines are more extensive, please particularly ensure that:
- [x] test.py was updated to call any added or updated example rules in a Snakefile
- [x] input: and output: file paths in the rules can be chosen arbitrarily
- [x] wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:)
- [x] temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to
- [x] the meta.yaml contains a link to the documentation of the respective tool or command under url:
- [x] conda environments use a minimal amount of channels and packages, in recommended ordering
Summary by CodeRabbit
- New Features
  - Introduced a new testing rule for GFF3 files, enhancing compatibility with multiple annotation formats.
  - Added a new GFF3 annotation file defining genomic features for two manually created genes.
  - Expanded test coverage to include validation for GFF and GTF formats.
- Bug Fixes
  - Improved input validation and error handling for various annotation file formats.
- Documentation
  - Updated test functions to reflect new naming conventions and added tests for additional bioinformatics tools.
Walkthrough
The pull request introduces modifications to the bioinformatics pipeline's handling of GFF and GTF files. It renames the existing rule test_gffread to test_gffread_gtf, adds a new rule test_gffread_gff for GFF3 files, and updates the associated input files and parameters. A new GFF3 annotation file is created, and the wrapper script is enhanced to support multiple annotation formats. Additionally, the test suite is expanded with new functions to validate these changes, improving coverage for bioinformatics tools.
Changes
| File | Change Summary |
|---|---|
| bio/gffread/test/Snakefile | Renamed rule test_gffread to test_gffread_gtf. Added new rule test_gffread_gff for GFF3 files. Updated input files and parameters for both rules. |
| bio/gffread/test/annotation.gff3 | Introduced a new GFF3 file defining genomic features for two genes with transcripts, exons, and CDS. |
| bio/gffread/wrapper.py | Expanded annotation format checks to include .gff and .gff3. Updated output flag handling and added validation for fasta output flags. |
| test.py | Renamed test function test_gffread to test_gffread_gtf. Added new test function test_gffread_gff. |
| bio/gffread/meta.yaml | Added fasta_flag and extra parameters in params section. Updated notes on format detection. |
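
The fasta output validation summarised in the table could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in wrapper.py: the helper name `fasta_output_args`, the list of fasta extensions, and the exact set of accepted flags are hypothetical. The sketch assumes the accepted values are gffread's `-w` (spliced exons), `-x` (spliced CDS) and `-y` (protein translation), and that no default is applied when `fasta_flag` is missing.

```python
# Assumption: these are the gffread fasta output flags the wrapper accepts.
FASTA_FLAGS = {
    "-w": "spliced exons of each transcript",
    "-x": "spliced CDS of each transcript",
    "-y": "protein translation of each CDS",
}

# Assumption: file extensions treated as fasta output in this sketch.
FASTA_EXTENSIONS = (".fa", ".fasta", ".fna", ".faa")


def fasta_output_args(records: str, fasta_flag=None) -> str:
    """Build the gffread argument for a fasta output file.

    Hypothetical helper: returns an empty string for non-fasta outputs and
    raises ValueError if a fasta output lacks a valid fasta_flag.
    """
    if not records.endswith(FASTA_EXTENSIONS):
        # Non-fasta outputs are handled elsewhere in this sketch.
        return ""
    if fasta_flag not in FASTA_FLAGS:
        valid = ", ".join(sorted(FASTA_FLAGS))
        raise ValueError(
            f"fasta output requires params.fasta_flag to be one of: {valid}"
        )
    # gffread takes the output path as the argument of the chosen flag.
    return f"{fasta_flag} {records}"
```

Requiring an explicit flag instead of defaulting to `-w` is what makes the change breaking for existing rules that write fasta output.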
Possibly related PRs
- #3367: This PR directly relates to the main PR, as it also modifies the Snakefile by renaming the rule test_gffread to test_gffread_gtf and introducing a new rule test_gffread_gff, aligning with the changes made in the main PR.
Looks good. Can you just update the meta.yaml file?
thanks,
I ended up renaming the parameter to fasta_flag instead of out_flag. I thought this was a better description, I hope you don't mind.