About the 64 Models in the Teacher Ensemble

In the paper it says “Additionally, 64 models were trained using all available reference genome intervals (all-folds)”.

May I ask how it ended up at 64 from 4-fold cross-validation? And what are the differences between these 64 models?

Thank you so much!

Hi!
The 64 “all-folds” models are separate from the four cross-validation models. They were trained on all the data without a holdout set.

The difference between these 64 models comes from different random initializations during training, leading to distinct final models.
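To make that concrete, here is a minimal sketch of how such a seed-varied ensemble could be produced. The `train_model` function, hyperparameter names, and weight initialization are hypothetical stand-ins for illustration, not the paper's actual code.

```python
import random

NUM_ENSEMBLE_MODELS = 64  # size of the "all-folds" teacher ensemble

# One fixed hyperparameter configuration shared by every ensemble member
# (illustrative names and values, not the paper's actual settings).
HPARAMS = {"learning_rate": 1e-4, "batch_size": 64}

def train_model(hparams, seed):
    """Hypothetical training routine: the seed controls weight
    initialization (and typically data shuffling), so different seeds
    yield distinct final models even with identical hyperparameters."""
    rng = random.Random(seed)
    initial_weights = [rng.gauss(0.0, 0.02) for _ in range(10)]
    # ... real code would build and fit a network on all intervals here ...
    return {"seed": seed, "hparams": hparams, "weights": initial_weights}

# 64 "all-folds" models: same data, same hyperparameters, different seeds.
ensemble = [train_model(HPARAMS, seed=s) for s in range(NUM_ENSEMBLE_MODELS)]
```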

Am I right that the 4-fold cross-validation loop was used to determine the best hyperparameter configuration, and then all 64 “all-folds” models were trained with this SAME hyperparameter configuration but with different random initializations? Thank you so much!

Yes, all pre-trained (i.e. not distilled) models were obtained with the same set of hyper-parameters and different random seeds. These hyper-parameters were found based only on fold-0 trained models. The design space is very large, so only a few configurations are in the ballpark of the optimal ones.
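A rough sketch of that two-stage workflow is below; the candidate configurations and the validation objective are invented purely for illustration, and the real search would evaluate actual fold-0 trained models.

```python
# Hypothetical two-stage workflow: tune on fold-0, then reuse the winner.

def fold0_validation_loss(hparams):
    """Stand-in for training on fold-0 and scoring on its held-out split."""
    return abs(hparams["learning_rate"] - 1e-4)  # toy objective for illustration

# Stage 1: pick hyper-parameters based only on fold-0 trained models.
candidates = [{"learning_rate": lr} for lr in (1e-3, 3e-4, 1e-4, 3e-5)]
best_hparams = min(candidates, key=fold0_validation_loss)

# Stage 2: the chosen configuration is reused for every "all-folds" model;
# only the random seed changes across the 64 training runs.
seeds = list(range(64))
print(best_hparams, seeds[:5])
```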