mokapot
Fast and flexible semi-supervised learning for peptide detection in Python
Hi! I am having an issue with getting protein level confidence. I have an MS Amanda output file that I am reading using the `psm_utils` package, and then converting that...
Can't get protein level confidence. "IndexError: index -3 is out of bounds for axis 0 with size 1"
I'm trying to run mokapot on this file (TestAX.pin): https://drive.google.com/file/d/1T4447c9Y24hx3qHla_cW55HEg4A_XEuW/view?usp=sharing

Which works great to give peptide confidence:

```python
import mokapot

psms = mokapot.read_pin("TestAX.pin")
results, models = mokapot.brew(psms)
print(results)
```

```
A mokapot.confidence.LinearConfidence...
```
This is a huge, in-progress PR swapping out the Pandas backend for Polars. The goal is to improve the speed and scalability of mokapot. In addition to this huge change,...
This PR partially addresses #101:

- `pyinstaller.spec` instructs PyInstaller to freeze mokapot with all its dependencies (including Python) in an executable.
- `innosetup.iss` instructs InnoSetup to bundle everything in a...
The PsmSchema definition is currently implemented via dataclasses. It's great to have the ability to validate a dataframe with a schema! Pydantic is a library for defining schemas and validation...
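As a rough illustration of the dataclass approach described above, the sketch below checks that a table contains the columns a schema names. `ExampleSchema`, its field names, and its `validate` method are hypothetical and are not mokapot's actual `PsmSchema`; a plain dict of columns stands in for a dataframe.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ExampleSchema:
    """Hypothetical schema: each field holds the name of a required column."""

    target: str
    spectrum: str
    peptide: str

    def validate(self, columns):
        """Raise ValueError if any column named by the schema is missing."""
        required = [getattr(self, f.name) for f in fields(self)]
        missing = [name for name in required if name not in columns]
        if missing:
            raise ValueError(f"missing columns: {missing}")


# A dict of column name -> values stands in for a dataframe (assumption).
table = {"Label": [1, -1], "ScanNr": [10, 11], "Peptide": ["AAK", "CCR"]}
schema = ExampleSchema(target="Label", spectrum="ScanNr", peptide="Peptide")
schema.validate(table)  # all required columns are present, so nothing is raised
```

A library such as Pydantic would add type coercion and richer error reporting on top of this kind of check, at the cost of an extra dependency.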
Hi, thanks for the great tool and for making it open source. Do you know whether it is easily possible to create executables of mokapot for different platforms, or do you have any experience with this?...
Mokapot is a workflow that broadly consists of the following steps:

- data preprocessing: optionally subsetting the input data and then doing a 3-fold split to generate training data...
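The 3-fold split in the preprocessing step can be pictured roughly as follows. This is a minimal sketch and not mokapot's implementation: `split_folds` and `train_test_pairs` are hypothetical helpers, and both the shuffled round-robin assignment and the standard cross-validation arrangement (each fold held out once) are assumptions.

```python
import random


def split_folds(items, n_folds=3, seed=1):
    """Shuffle items and deal them round-robin into n_folds disjoint folds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]


def train_test_pairs(folds):
    """Yield (train, test) pairs in which each fold is held out exactly once."""
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Integers stand in for PSM identifiers (assumption).
folds = split_folds(range(10))
pairs = list(train_test_pairs(folds))
```

In a real PSM dataset, one would likely want all PSMs from the same spectrum to land in the same fold; the sketch ignores that grouping for brevity.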
For very large datasets, single-threaded IO operations are currently a speed bottleneck. PyArrow datasets natively support:

- partitioning a dataframe
- [multi-threaded read](https://arrow.apache.org/cookbook/py/io.html#reading-partitioned-data)
- [multi-threaded](https://github.com/apache/arrow/blob/main/python/pyarrow/dataset.py#L873C1-L875C62) [write](https://arrow.apache.org/cookbook/py/io.html#writing-partitioned-datasets)
- [specifying number of...
@sambenfredj's pull request introduces streaming at several places in the workflow, but those intermediary file formats are not yet specified or documented. In addition, switching to a binary format...
Hi Will, thanks for your time in advance: I am trying to use mokapot on Sage results of timsTOF DDA data - nothing special. The issue I encounter is the...