Best way to aggregate AlphaGenome scores across tracks and modalities for variant prioritization?

Hi AlphaGenome team/community,

We are working on a research project involving genome variant analysis, and we are exploring how to integrate AlphaGenome as an additional annotation layer for variant prioritization in a research-only context.

For practical downstream interpretation, we would ideally like to reduce the AlphaGenome output to a small number of columns per variant, rather than storing thousands of rows across output types, tracks, tissues/cell types and scorers.

We have seen that for splicing there is a recommended merged score combining SPLICE_SITES, SPLICE_SITE_USAGE and SPLICE_JUNCTIONS, using the maximum effect across tracks/genes and combining them into a single alphagenome_splicing score. Would a similar strategy be meaningful for other modalities, such as expression or regulatory outputs?

More specifically, we are considering something like:

  • alphagenome_priority_score: maximum or combined score across selected relevant modalities

  • top_modality: splicing / expression / chromatin accessibility / TF / histone, etc.

  • top_output_type

  • top_track or tissue/cell type where the maximum effect was observed

  • modality-specific scores, e.g. splicing_score, expression_score, regulatory_score

Our main question is whether quantile_score values are comparable across output types, scorers and tracks, so that taking the maximum quantile score per variant would be a reasonable prioritization strategy. If not, would you recommend summarizing scores separately by modality instead?

We are also trying to avoid generating very large outputs for many tissues or cell types that are not relevant to our disease context. Would the best approach be to restrict ontology_terms / tracks before scoring, or to score broadly and then aggregate/filter afterwards?

In short, what would be the recommended way to obtain a compact, clinically usable AlphaGenome annotation table with one row per variant, while preserving enough information to know which modality and track drove the prioritization?

Any guidance or examples of best practices for pipeline integration would be very helpful.

Thank you!

Hello, I’m not part of the AlphaGenome support team; I’m just another AlphaGenome user. However, I noticed that the Terms of Service state: ‘You must not use AlphaGenome API or outputs for clinical purposes or rely on them for medical or other professional advice.’ (https://deepmind.google.com/science/alphagenome/terms) This is deeply frustrating to me too.

Hi there,

Thanks for reaching out.

Yes: the quantile score calibration procedure was implemented to make scores comparable across different assays and genomic contexts. By mapping raw predicted effects to a percentile rank based on a common background (348,126 gnomAD SNPs), the calibration removes the unit-specific biases of different modalities.
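
As a rough illustration of why a percentile rank is unit-free, here is a generic empirical-quantile sketch. It is not AlphaGenome's actual calibration code; the function name `empirical_quantile` and the toy background values are made up for the example.

```python
import numpy as np


def empirical_quantile(raw_scores, background):
    """Return, for each raw score, the fraction of background scores <= it.

    Generic percentile-rank sketch; NOT the AlphaGenome implementation.
    """
    bg = np.sort(np.asarray(background, dtype=float))
    # side="right" counts background values that are <= the query value.
    ranks = np.searchsorted(bg, np.asarray(raw_scores, dtype=float), side="right")
    return ranks / bg.size


# Toy background of four raw effects for one hypothetical track.
background = [1.0, 2.0, 3.0, 4.0]
quantiles = empirical_quantile([0.5, 2.5, 10.0], background)
print(quantiles)
```

Because the output depends only on rank within the background, two tracks with very different raw units end up on the same 0 to 1 scale.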

We currently don’t filter score_variant requests by ontology term on the server side. Scoring all tracks incurs no additional compute overhead, so results need to be filtered to the ontology terms of interest on the client side (see this issue).
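
A minimal sketch of that client-side filtering with pandas, assuming the scores have already been flattened into a long table with one row per variant and track. The column names (`variant_id`, `output_type`, `ontology_curie`, `quantile_score`), the helper `filter_by_ontology` and the toy rows are assumptions for illustration; check them against your own table.

```python
import pandas as pd


def filter_by_ontology(scores, keep_curies, keep_missing=True):
    """Keep rows whose ontology_curie is in keep_curies.

    keep_missing=True also keeps rows with no ontology term at all
    (hypothetical option, useful if some tracks are not tissue-specific).
    """
    mask = scores["ontology_curie"].isin(list(keep_curies))
    if keep_missing:
        mask = mask | scores["ontology_curie"].isna()
    return scores.loc[mask].reset_index(drop=True)


# Toy long-format table; values are invented for the example.
scores = pd.DataFrame(
    {
        "variant_id": ["A", "A", "B", "B"],
        "output_type": ["RNA_SEQ", "RNA_SEQ", "ATAC", "SPLICE_SITES"],
        "ontology_curie": ["UBERON:0002048", "UBERON:0000955", "UBERON:0002048", None],
        "quantile_score": [0.30, 0.95, 0.80, -0.99],
    }
)

lung_only = filter_by_ontology(scores, ["UBERON:0002048"])
print(lung_only)
```

Filtering before any max aggregation matters: otherwise a strong effect in an irrelevant tissue can end up driving the per-variant score.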

We discuss methods for track aggregation here. Taking the max quantile score across all modalities will identify the strongest predicted regulatory effect for a variant, regardless of whether that effect is on splicing, expression, accessibility, etc. Since, technically, only one strong molecular impact of a variant is required to classify it as ‘functional’ (e.g., loss of a splice donor site), max aggregation is often the most appropriate.
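
For the one-row-per-variant table asked about above, max aggregation could be sketched as follows. This is only an illustration under stated assumptions: the input is a long pandas table with `variant_id`, `output_type`, `track_name` and `quantile_score` columns, the `MODALITY` grouping and the function `summarize_variants` are hypothetical, and absolute values are taken on the assumption that quantile scores may be signed.

```python
import pandas as pd

# Hypothetical grouping of output types into coarser modalities.
MODALITY = {
    "RNA_SEQ": "expression",
    "CAGE": "expression",
    "ATAC": "accessibility",
    "DNASE": "accessibility",
    "SPLICE_SITES": "splicing",
    "SPLICE_SITE_USAGE": "splicing",
    "SPLICE_JUNCTIONS": "splicing",
}


def summarize_variants(scores):
    """Collapse a long score table to one row per variant via max |quantile|."""
    df = scores.dropna(subset=["quantile_score"]).reset_index(drop=True)
    df["modality"] = df["output_type"].map(MODALITY).fillna("other")
    df["abs_q"] = df["quantile_score"].abs()

    # Row holding the largest |quantile_score| for each variant.
    top = df.loc[df.groupby("variant_id")["abs_q"].idxmax()]
    summary = top.set_index("variant_id")[
        ["abs_q", "modality", "output_type", "track_name"]
    ].rename(
        columns={
            "abs_q": "alphagenome_priority_score",
            "modality": "top_modality",
            "output_type": "top_output_type",
            "track_name": "top_track",
        }
    )

    # Per-modality maxima as wide columns, e.g. splicing_score.
    per_modality = df.pivot_table(
        index="variant_id", columns="modality", values="abs_q", aggfunc="max"
    ).add_suffix("_score")
    per_modality.columns.name = None

    return summary.join(per_modality).reset_index()


# Toy input; values are invented for the example.
scores = pd.DataFrame(
    {
        "variant_id": ["A", "A", "B", "B"],
        "output_type": ["RNA_SEQ", "SPLICE_SITES", "ATAC", "DNASE"],
        "track_name": ["t1", "t2", "t3", "t4"],
        "quantile_score": [0.30, -0.99, 0.80, 0.50],
    }
)

summary = summarize_variants(scores)
print(summary)
```

Keeping the per-modality columns next to the overall maximum preserves which modality and track drove the prioritization, and lets you fall back to modality-specific ranking if the cross-modality maximum turns out to be too noisy.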

That said, cross-modality aggregation is a complex task and remains an open research question that we are still exploring, though max aggregation across genes is a reasonable approach.

Please note that AlphaGenome is strictly a research tool and is not validated for clinical diagnosis, medical treatment, or personal medical advice.

Kind regards,
Tumi