How should I summarize thousands of per-variant quantile scores into one prioritization score?

Audrey · October 8, 2025, 5:02pm

Hello! I am using AlphaGenome to prioritize noncoding variants with potential regulatory effects. As the docs note, a single genome contains many variants and most will have little impact.

In the preprint’s Variant scoring section, it says the goal is a single informative scalar value per variant. In practice, I’m getting thousands of quantile scores per variant (often ~5k–40k) across output types, ontologies/tissues, and scorer types.

This scoring documentation briefly describes aggregating scores, but I don’t see any concrete guidance on best practices for getting this single score. If we take the mean of the scores, surely the output types that aren’t expected to show an effect will pull down the overall score. Should we take the maximum score for each variant instead? That seems like it may overestimate some variants’ effects, but maybe not…

Anyway, my question:

Is there a built-in aggregator that implements the single score per variant described in the preprint? If so, what is the recommended configuration in terms of which outputs and ontologies to include, based on variant type/location?
If not, what best practices do you recommend to avoid aggregating the scores inappropriately (e.g., taking the mean across all scores)? Maybe there is some way to take an adjusted max quantile, or to perform a weighted aggregation based on relevant outputs and ontologies, etc.

Thanks!

Jun_Cheng · November 6, 2025, 8:01pm

Hi,

We are working on providing a recommended way of aggregating the scores. It depends on your use case. For splicing variants, we take the max across tissues/cell types, then compute a weighted sum of the three heads, i.e.:

splice_sites.max(tissues/cell-types) + splice_site_usage.max(tissues/cell-types) + 0.2*splice_junctions.max(tissues/cell-types)

For expression, check how we did the aggregation for the TraitGym eval.

Regards,

Jun
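
For readers who want to try the splicing aggregation described in the reply above, here is a minimal pandas sketch. It is not an official AlphaGenome aggregator. It assumes the scores have already been flattened into a long table with one row per (variant, output type, tissue/cell type); the column names `variant_id`, `output_type`, and `score`, the function name `aggregate_splice_scores`, and the choice to count a missing head as 0 are all assumptions made for illustration. Only the max-then-weighted-sum recipe and the 1 / 1 / 0.2 weights come from the reply.

```python
import pandas as pd

# Weights taken from the reply above: splice_sites and splice_site_usage
# count fully, splice_junctions is down-weighted to 0.2.
SPLICE_WEIGHTS = {
    "splice_sites": 1.0,
    "splice_site_usage": 1.0,
    "splice_junctions": 0.2,
}


def aggregate_splice_scores(scores: pd.DataFrame) -> pd.Series:
    """Collapse per-track splicing scores into one value per variant.

    `scores` is a hypothetical long-format table with columns
    `variant_id`, `output_type`, and `score` (one row per track).
    """
    # Keep only the three splicing heads; other output types are ignored.
    splice = scores[scores["output_type"].isin(list(SPLICE_WEIGHTS))]

    # Step 1: max across tissues/cell types, separately for each head.
    per_head_max = (
        splice.groupby(["variant_id", "output_type"])["score"]
        .max()
        .unstack("output_type")
        .reindex(columns=list(SPLICE_WEIGHTS))
        .fillna(0.0)  # assumption: a head with no score contributes 0
    )

    # Step 2: weighted sum of the three per-head maxima.
    weights = pd.Series(SPLICE_WEIGHTS)
    combined = per_head_max.mul(weights, axis=1).sum(axis=1)
    return combined.sort_values(ascending=False)


# Toy data (made up) to show the shape of the input and output.
example = pd.DataFrame(
    {
        "variant_id": ["v1", "v1", "v1", "v1", "v1", "v2", "v2", "v2"],
        "output_type": [
            "splice_sites", "splice_sites", "splice_site_usage",
            "splice_junctions", "splice_junctions",
            "splice_sites", "splice_site_usage", "rna_seq",
        ],
        "score": [0.9, 0.2, 0.5, 1.0, 0.5, 0.1, 0.3, 0.99],
    }
)

ranked = aggregate_splice_scores(example)
print(ranked)
```

Taking the max within each head first keeps tissues with no effect from diluting the signal, which is the concern raised about averaging in the original question. If your scores are signed, you may want to take the max of the absolute value instead; that is a judgment call and is not stated in the reply.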
