ISM implementation/efficiency

Hi. I have been using alphagenome’s score_ism_variants function and just realized that it does quite a lot of unnecessary forward passes:

Forward passes in score_ism_variants
ism_variants generates 3 × L variants (3 non-reference substitutions at each of L positions in the ism_interval).
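To make the 3 × L count concrete, here's a minimal sketch of the variant generation (make_ism_variants is an illustrative name of my own, not the actual alphagenome API):

```python
# Illustrative sketch only: for each position in the ISM interval,
# substitute each of the 3 non-reference bases.
BASES = "ACGT"

def make_ism_variants(reference: str, start: int, end: int):
    """Return (position, alt_base, variant_sequence) for every non-ref substitution."""
    variants = []
    for pos in range(start, end):
        ref_base = reference[pos]
        for alt in BASES:
            if alt == ref_base:
                continue  # skip the reference allele itself
            variants.append((pos, alt, reference[:pos] + alt + reference[pos + 1:]))
    return variants

# 3 positions x 3 non-reference bases = 9 variants for this toy interval
variants = make_ism_variants("ACGTACGT", start=2, end=5)
```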

Each variant is scored via score_variant, whose call to _predict_variant makes 4 forward passes:

apply_fn(reference_sequence, …) — main model, ref
apply_fn(alternate_sequence, …) — main model, alt
junctions_apply_fn(ref_embeddings, …) — splice junction head, ref
junctions_apply_fn(alt_embeddings, …) — splice junction head, alt

All forward passes use batch size = 1 (sequence is encoded with [np.newaxis] giving shape [1, S, 4]).

Each of the 3L variants gets its own thread, and within that thread _predict_variant runs 4 forward passes serially: main-ref → main-alt → junction-ref → junction-alt. The parallelism is only across variants (threads), not within a single variant’s scoring.

This seems quite wasteful: the reference prediction is recomputed 3L times (once per variant), and everything runs at batch size 1 when in theory more sequences could fit in memory (at least for smaller context sizes).
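As a sketch of what sharing the reference pass could look like (counted_model and fake_model here are hypothetical stand-ins, not the real apply_fn):

```python
# Hypothetical sketch: compute the reference prediction once and reuse it
# for all 3L variants, instead of re-running the model on the same
# reference sequence for every variant.
import numpy as np

def fake_model(seq_onehot):
    # Stand-in for an expensive forward pass; any deterministic function works.
    return seq_onehot.sum(axis=-1)

calls = {"n": 0}

def counted_model(seq):
    calls["n"] += 1
    return fake_model(seq)

def score_ism_shared_ref(ref_onehot, variant_onehots):
    ref_pred = counted_model(ref_onehot)  # one forward pass, shared across variants
    return [counted_model(v) - ref_pred for v in variant_onehots]

ref = np.eye(4)[np.array([0, 1, 2, 3])]        # toy 4-bp one-hot sequence
variant_seqs = [ref.copy() for _ in range(9)]  # toy stand-ins for 3L variants
scores = score_ism_shared_ref(ref, variant_seqs)
# 1 (ref) + 9 (variants) = 10 model calls instead of 18
```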

I was wondering if you have considered making the ISM workflow more efficient. Is there at least a way to increase the batch size, or to do a single forward pass for the ref allele? I'm quite bottlenecked by this at the moment, so any help or suggestion (or code changes) would be much appreciated :slight_smile: Thank you!

Hi @valeh,

Thanks for the post! For the open-source model, we prioritized readability and correctness over speed. For this reason, we opted for a simpler single-device setup and chose not to share the reference prediction between the 3 alleles in ISM. We'll look into adding this, but we're keen to keep the code somewhat simple to make it easier for folks to follow. Increasing the batch size can be difficult, as you can quickly run out of GPU memory, though this very much depends on the diversity of variant scoring you're doing.
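To illustrate the shape of batching across variants (with a toy stand-in model, not the real apply_fn; the GPU-memory caveat still applies, since a larger batch multiplies activation memory):

```python
# Hypothetical sketch: run the model on stacked chunks of variant
# sequences instead of one sequence at a time.
import numpy as np

def batched_predict(model, sequences, batch_size):
    """Score sequences in chunks of batch_size rather than batch size 1."""
    outputs = []
    for i in range(0, len(sequences), batch_size):
        batch = np.stack(sequences[i:i + batch_size])  # shape [B, S, 4]
        outputs.append(model(batch))                   # one forward pass per chunk
    return np.concatenate(outputs, axis=0)

passes = {"n": 0}

def toy_model(batch):
    passes["n"] += 1
    return batch.sum(axis=(1, 2))  # one scalar "score" per sequence

seqs = [np.eye(4)[np.arange(4)] for _ in range(10)]  # ten toy [4, 4] one-hots
preds = batched_predict(toy_model, seqs, batch_size=4)
# 10 sequences at batch_size=4 -> 3 forward passes instead of 10
```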

Note that there are only 2 full forward passes per variant prediction: the junction predictions re-use the trunk embeddings from the first forward pass :smiley:
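As a toy illustration of why the junction head doesn't add full passes (trunk and junction_head here are stand-in functions, not the real model):

```python
# Illustrative two-stage setup: the splice-junction head consumes cached
# trunk embeddings, so ref and alt each cost one trunk pass and the
# junction predictions are a cheap head on top.
import numpy as np

trunk_calls = {"n": 0}

def trunk(seq):
    trunk_calls["n"] += 1
    return seq * 2.0  # stand-in for the expensive trunk

def junction_head(embeddings):
    return embeddings.sum()  # cheap head; does not re-run the trunk

def predict_variant(ref_seq, alt_seq):
    ref_emb = trunk(ref_seq)   # full pass 1
    alt_emb = trunk(alt_seq)   # full pass 2
    return junction_head(ref_emb), junction_head(alt_emb)

ref_score, alt_score = predict_variant(np.ones(4), np.zeros(4))
# the trunk ran exactly twice for this variant
```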

Hi @tward Thank you for the response (and the clarification regarding the junction embeddings). I do need the speed, since I'm scoring a very large number of variants. I've also made some code changes to the open-source model: I need to do the aggregation a bit differently than the CenterMaskScorer currently does, and I also want to expose (return) the raw unaggregated predictions from the ISM forward passes. (I suppose if I can at least have the latter change, I won't need the former, since I can do the aggregation on the raw tracks post-hoc.)
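For the post-hoc aggregation, something like this minimal center-mask mean over raw tracks is what I have in mind (center_mask_aggregate is my own sketch, not the CenterMaskScorer API):

```python
# Sketch only: given raw unaggregated tracks of shape [L, T]
# (positions x tracks), average over a centered window of positions.
import numpy as np

def center_mask_aggregate(tracks, mask_width):
    """Mean over a centered window along the positional axis."""
    length = tracks.shape[0]
    start = (length - mask_width) // 2
    return tracks[start:start + mask_width].mean(axis=0)

tracks = np.arange(8.0).reshape(8, 1)  # toy [L=8, T=1] track
agg = center_mask_aggregate(tracks, mask_width=4)
# mean of the 4 centered positions (values 2, 3, 4, 5) -> 3.5
```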

Is there any way I can keep my code changes but also use the faster implementation?

Thanks!