fix: preserve sample_query_options during cohort normalization by kshirajahere · Pull Request #1101 · malariagen/malariagen-data-python

kshirajahere · 2026-03-11T17:34:03Z

Summary

Before this change, sample_query_options could still be lost inside shared cohort normalisation even after the recent caller-level forwarding fixes. Any multi-cohort path that rebuilt cohort queries and then rechecked cohort sizes would fail for valid parameterised pandas queries such as country in @countries_list.

After this change, cohort normalisation preserves the full query context, and the shared query helpers now treat engine="python" as a default rather than forcing it as a duplicate keyword.

Closes #1100.
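
The failure mode described in this summary can be reproduced with plain pandas. The sketch below is an illustration only, not the library's code: the toy data frame, the `cohort_sizes` function and the cohort labels are hypothetical stand-ins for the shared cohort normalisation step.

```python
import pandas as pd

# Toy sample metadata; columns and values are made up for illustration.
df_samples = pd.DataFrame(
    {
        "sample_id": ["s1", "s2", "s3", "s4", "s5"],
        "country": ["Ghana", "Ghana", "Mali", "Mali", "Kenya"],
        "year": [2012, 2014, 2012, 2014, 2012],
    }
)


def cohort_sizes(df, sample_query, cohort_queries, sample_query_options=None):
    """Hypothetical stand-in for a cohort normalisation step.

    Combines the sample query with each cohort query and counts the
    matching samples, forwarding the query options on every evaluation.
    """
    query_kwargs = {"engine": "python", **(sample_query_options or {})}
    sizes = {}
    for label, cohort_query in cohort_queries.items():
        combined = f"({sample_query}) and ({cohort_query})"
        sizes[label] = len(df.query(combined, **query_kwargs))
    return sizes


sample_query = "country in @countries_list"
cohorts = {"early": "year == 2012", "late": "year == 2014"}
options = {"local_dict": {"countries_list": ["Ghana", "Mali"]}}

# With the options forwarded, the parameterised query resolves.
sizes = cohort_sizes(df_samples, sample_query, cohorts, options)

# With the options dropped, pandas cannot resolve @countries_list and
# raises a NameError subclass.
try:
    cohort_sizes(df_samples, sample_query, cohorts)
    dropped_options_error = None
except NameError as exc:
    dropped_options_error = exc
```

The second call mirrors what happens when a derived cohort query is re-evaluated without the caller's `local_dict`: the query itself is valid, but the variable it refers to is no longer in scope.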

Why this matters

This is a framework-level correctness fix, not just a surface patch:

- pairwise_average_fst() was still broken for local_dict-backed filters because _setup_cohort_queries() re-applied the combined cohort query without sample_query_options.
- Multi-cohort H12 plotting had the same underlying problem for variable-backed sample filters.
- Once query options are forwarded correctly, sample_metadata() and _filter_sample_dataset() must also accept an explicit engine option without raising TypeError.

The net effect is that the public sample_query_options contract now behaves consistently across direct metadata selection and higher-level cohort analyses.
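
The TypeError mentioned above is ordinary Python behaviour when the same keyword is supplied twice. The following minimal sketch shows the pattern in isolation; `evaluate` and `query_forcing_engine` are hypothetical names standing in for a query evaluator and a caller that always forces the engine, and are not taken from the library.

```python
def evaluate(expr, **kwargs):
    """Stand-in for a query evaluator that accepts keyword options."""
    return kwargs


def query_forcing_engine(expr, sample_query_options):
    # The engine is always passed explicitly, so a caller-supplied
    # "engine" key arrives a second time through the unpacked dict.
    return evaluate(expr, engine="python", **sample_query_options)


user_options = {
    "engine": "python",
    "local_dict": {"countries_list": ["Ghana"]},
}

try:
    query_forcing_engine("country in @countries_list", user_options)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

The call fails before the evaluator runs at all, which is why forwarding the options correctly is not enough on its own: the engine also has to become a default that the caller's options can override.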

Exact changes

- Add _prep_sample_query_options() in AnophelesBase to preserve user-supplied query context while defaulting the engine to python.
- Use that helper in shared query-evaluation paths instead of passing engine="python" as a duplicate keyword.
- Pass sample_query_options through _setup_cohort_queries() when validating derived cohort queries.
- Add regressions for:
  - direct sample_metadata() queries using both engine and local_dict
  - pairwise_average_fst() with local_dict-backed sample_query
  - multi-cohort H12 plotting with local_dict-backed sample_query
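
For readers without the diff open, a helper of the kind described in the first item might look roughly like the sketch below. This is an assumption about its shape, not the actual implementation: the real helper is a method on AnophelesBase, and the standalone function name and signature here are hypothetical.

```python
from typing import Any, Dict, Optional


def prep_sample_query_options(
    sample_query_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a copy of the options with the engine defaulting to "python".

    The caller's dict is never mutated, and an explicit engine is kept.
    """
    prepared = dict(sample_query_options or {})
    prepared.setdefault("engine", "python")
    return prepared


# No options at all: only the default engine is set.
defaults_only = prep_sample_query_options(None)

# User-supplied context is preserved and the input is left untouched.
user_options = {"local_dict": {"countries_list": ["Ghana", "Mali"]}}
with_context = prep_sample_query_options(user_options)

# An explicit engine takes precedence over the default.
explicit_engine = prep_sample_query_options({"engine": "numexpr"})
```

Returning a fresh dict keeps the helper safe to call repeatedly along a call chain, since no caller sees an engine key appear in the options it passed in.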

Tests run

ruff check malariagen_data/anoph/base.py malariagen_data/anoph/sample_metadata.py tests/anoph/test_fst.py tests/anoph/test_h12.py tests/anoph/test_sample_metadata.py
pytest tests/anoph/test_h12.py tests/anoph/test_fst.py -q
pytest tests/anoph/test_sample_metadata.py -k 'sample_metadata_with_query or sample_metadata_with_indices or query_options' -q

fix: preserve sample_query_options during cohort normalization

a50d211

kshirajahere wants to merge 1 commit into malariagen:master from kshirajahere:GH1100-preserve-sample-query-options-cohort-normalisation
