New module: lsa/cosine (calculate cosine similarity)#10002
New module: lsa/cosine (calculate cosine similarity)#10002miguelrosell wants to merge 16 commits intonf-core:masterfrom
Conversation
pinin4fjords
left a comment
There was a problem hiding this comment.
Thanks for the PR!
There's a few things to do to make this fit in with repository standards, and I think there's some complexity to be stripped.
In particular, since there's no actual software called 'cosimflow', I think this needs to be named like lsa/cosine in my view, since it's principally a wrapper around that function. Or possibly it could be a script under custom/.
I've simplified the script a bit to apply some of my suggestions for you:
modules/nf-core/cosimflow/main.nf
Outdated
| ) | ||
|
|
||
| # AQUÍ PASAMOS LOS ARGUMENTOS DE NEXTFLOW DIRECTAMENTE | ||
| args_list <- c("--input", "$expression_matrix", "--out_prefix", "$prefix") |
There was a problem hiding this comment.
The script constructs c("--input", "$expression_matrix", "--out_prefix", "$prefix") then parses them back into opt$input and opt$out_prefix - getting the same values it started with. Only task.ext.args actually needs parsing. Feed $args directly to optparse for configurable options (--method, --min_gene_mean), and use Nextflow template variables for input/prefix. See attached template.
| assert process.success | ||
|
|
||
| // 4. Verificamos que se generen los archivos | ||
| assert process.out.matrix |
There was a problem hiding this comment.
if the outputs are deterministic, they should be snapshotted, if not you should snapshot file names at least, see e.g. how deseq2/differential module does it:
{ assert snapshot(
process.out.results,
...
file(process.out.dispersion_plot_png[0][1]).name,
...
).match() },
…odules into nf-core-cosimflow
|
@pinin4fjords Thank you for all of corrections, comments and suggestions, I really appreciate it, for your patience and the detailed help!! I hope I have applied everything you requested: -Renamed it to lsa/cosine. -I switched the input to CSV-only (removed readxl dependency) as suggested. -I updated meta.yml (license/maintainers) and fixed the variable scope in main.nf so the R template works correctly.
If there's anything that I have missed or that needs a second look at, I'm all ears! Again, thank you! |
|
Hi @pinin4fjords! I applied both changes:
Like always, I'm always here and willing to change anything I might have missed if necessary. |
|
Hi @pinin4fjords! I just updated the branch with master to resolve the out-of-date status. Looks like the workflows are currently awaiting approval to run. Is the next step approving if the checks pass or am i missing something? Have a good day and thank youu!! |
If you go on the nf-core Slack to the #github-invitations channel you can get yourself added to the org so that workflows will run for you! |
|
Hello, @pinin4fjords! I managed to fix all the schema requirements for the meta.yml and aligned the process name as well. I think all linting and tests pass with 0 errors now! Have a good Friday! |
Hi @ewels and @pinin4fjords!
Following up on our chat in the proposals channel, here’s the PR for the cosimflow module. It’s a tool for calculating pairwise cosine similarity from expression matrices (outputting both the matrix and a heatmap).
I think it could be a handy addition for the rnaseq pipeline, maybe as a downstream QC step to see how samples cluster together beyond the usual PCA.
PR checklist:
Closes: xxx
Tests performed locally:
nf-core modules test cosimflow --profile condaQuick note: I ended up putting the R script directly inside the
main.nfscript block. It was the cleanest way I found to handle the dollar signs and variable escaping for R/Nextflow without everything blowing up.