Merged
Signed-off-by: Josh Loecker <joshloecker@icloud.com>
Co-authored-by: SarahNakamura <snakamura@unomaha.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…dicates an incorrect taxon id Signed-off-by: Josh Loecker <joshloecker@icloud.com>
# Conflicts:
#	main/como/rnaseq_gen.py
#	main/como/rnaseq_preprocess.py
* fix(fpkm): update imports for zFPKM calculation improvements
* fix(fpkm): use Salmon quantification instead of STAR quantification
* chore: ruff formatting
* chore: fill with integers for faster processing
* chore: remove unnecessary async function usage
* fix: remove nonexistent genes from conversion
* refactor: use more explicit (albeit longer) code to create gene_info dataframe object
* chore: import required modules
* refactor: optional argument for fragment data
* refactor: improve handling for single cell data
* chore: generalize data type input
* chore: ruff formatting
* chore: simplify FPKM/RPKM calculations; properly compute per-gene FPKM scores
* refactor: move zfpkm calculation to external package
* chore: use np.bool for boolean array
* chore: ruff formatting
* feat: allow setting negative zFPKM results to 0
* feat: simplification to use external zfpkm package
* feat: allow providing the fragment size filepath (from rnaseq preprocessing)
* chore(ruff): reduce max line length
* chore(ruff): mark unsorted imports as fixable
* chore(uv): lock pyproject file
* fix: rename count to quant in testing files
* feat: add single cell normalization using scanpy defaults
* fix: test new quant information
* fix: test new quant information
* chore: use quant files instead of strand files
* chore: use quant files instead of strand files
* chore: updated COMO_input files for naiveB to use updated FastqToGeneCounts information
* feat: added Salmon quantification data for naive B
* chore: use `_read_file` function to read data
* fix(tests): remove 1 from expected gene names to fix header
* fix(tests): use `endswith` instead of `is in`
* fix(tests): Use missing file appropriately
* chore(uv): Use dependency groups
* revert: use synchronous programming for more deterministic usage
* chore(type): fix pyrefly type errors
* chore(type): fix ruff & pyrefly issues
* refactor: rename `_log_and_raise_error` to `log_and_raise_error`
* refactor: rename `_read_file` to `read_file`

Signed-off-by: Josh Loecker <joshloecker@icloud.com>
* fix(fpkm): update imports for zFPKM calculation improvements
* fix(fpkm): use Salmon quantification instead of STAR quantification
* chore: ruff formatting
* chore: fill with integers for faster processing
* chore: remove unnecessary async function usage
* fix: remove nonexistent genes from conversion
* refactor: use more explicit (albeit longer) code to create gene_info dataframe object
* chore: import required modules
* refactor: optional argument for fragment data
* refactor: improve handling for single cell data
* chore: generalize data type input
* chore: ruff formatting
* chore: simplify FPKM/RPKM calculations; properly compute per-gene FPKM scores
* refactor: move zfpkm calculation to external package
* chore: use np.bool for boolean array
* chore: ruff formatting
* feat: allow setting negative zFPKM results to 0
* feat: simplification to use external zfpkm package
* feat: allow providing the fragment size filepath (from rnaseq preprocessing)
* chore(ruff): reduce max line length
* chore(ruff): mark unsorted imports as fixable
* chore(uv): lock pyproject file
* chore: remove zfpkm related code as it is now an external package
* revert: remove local pyproject requirement
* chore: remove zfpkm testing files
* chore: force sync lock
* Purge `fast_bioservices` (#246)
  * feat: use bioservices' `MyGeneInfo`
  * refactor: convert from fast_bioservices to COMO's built-in pipeline
  * refactor: remove fast-bioservices from pyproject
  * chore: use bioservices over fast_bioservices
  * feat(test): added unit tests for identifier conversion pipeline

Signed-off-by: Josh Loecker <joshloecker@icloud.com>
* revert: use `raise <error>` instead of `log_and_raise` function
* chore: remove nonexistent import

Signed-off-by: Josh Loecker <joshloecker@icloud.com>
…elated tests Signed-off-by: Josh Loecker <joshloecker@icloud.com>
Signed-off-by: Josh Loecker <joshloecker@icloud.com>

# Conflicts:
#	.github/workflows/continuous_integration.yml
#	main/COMO.ipynb
#	main/como/create_context_specific_model.py
#	main/como/rnaseq.py
#	main/como/rnaseq_gen.py
#	main/como/rnaseq_preprocess.py
#	pyproject.toml
#	uv.lock
This is a large pull request containing a wide variety of features, improvements, bug fixes, and other changes. They are too numerous to list in full, but the main components include: