Hi everyone,
Thank you for the great work! We are currently trying to fine-tune the entire AlphaGenome model on a new dataset using a single H100 GPU. Even with batch_size=1 and gradient accumulation, we run into memory limits almost immediately.
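For context, our training loop uses standard gradient accumulation: run micro-batches of size 1 and average their gradients before taking a single optimizer step. A minimal self-contained sketch with a toy linear model (the model, data, and function names here are stand-ins for illustration, not the actual AlphaGenome API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the model: one linear layer, pred = x @ w (NOT the
# real AlphaGenome architecture -- just enough to show the technique).
w = rng.normal(size=(4, 1))
xs = rng.normal(size=(8, 4))                 # 8 training samples
ys = xs @ rng.normal(size=(4, 1))            # synthetic targets


def grad_mse(w, x, y):
    """Gradient of mean-squared-error loss for the toy linear model."""
    return 2.0 * x.T @ (x @ w - y) / len(x)


# Gradient accumulation: size-1 micro-batches, gradients scaled by
# 1/accum_steps so the sum matches one full-batch gradient.
accum_steps = len(xs)
accum = np.zeros_like(w)
for i in range(accum_steps):
    x_i, y_i = xs[i : i + 1], ys[i : i + 1]  # batch_size = 1
    accum += grad_mse(w, x_i, y_i) / accum_steps

full_batch = grad_mse(w, xs, ys)             # reference gradient
assert np.allclose(accum, full_batch)        # same optimizer update
```

The accumulated update is mathematically identical to the full-batch one, which is why we expected it to trade memory for time; in practice the activations of a single 1M-bp forward pass already exceed the card's memory.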
Based on the distillation setup described in the paper—where both a frozen teacher and an unfrozen student are loaded and gradients are computed for the student—we expected full fine-tuning to be possible on comparable hardware.
Inference works without any issues. However, as soon as we enable gradient computation on a 1 Mbp input sequence, we immediately hit out-of-memory errors, even when training with only a single head.
Is full fine-tuning on sequences of this length feasible on a single H100? If so, what techniques or adjustments are typically required to avoid OOM in this setting?
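One technique we have been considering is activation (gradient) checkpointing: store activations only at segment boundaries during the forward pass and recompute the rest inside each segment during backward, trading extra compute for memory. A framework-agnostic sketch with a toy chain of layers (the layer and function names are illustrative, not AlphaGenome's):

```python
import numpy as np


def layer(h):
    """One toy layer: elementwise tanh (stands in for a real block)."""
    return np.tanh(h)


def layer_grad(h_in, g_out):
    """Backward through one layer, given that layer's INPUT activation."""
    return g_out * (1.0 - np.tanh(h_in) ** 2)


def backward_checkpointed(x, n_layers, seg, g_out):
    """Backward pass storing only every `seg`-th activation.

    Memory: O(n_layers / seg) stored activations instead of O(n_layers),
    at the cost of one extra forward pass over each segment.
    """
    # Forward: keep activations only at segment boundaries.
    ckpts = {0: x}
    h = x
    for k in range(n_layers):
        h = layer(h)
        if (k + 1) % seg == 0:
            ckpts[k + 1] = h
    # Backward: recompute activations inside each segment on demand.
    g = g_out
    for k in reversed(range(n_layers)):
        start = (k // seg) * seg
        h = ckpts[start]
        for _ in range(start, k):      # recompute layer k's input
            h = layer(h)
        g = layer_grad(h, g)
    return g


def backward_full(x, n_layers, g_out):
    """Reference backward that stores every activation."""
    acts = [x]
    for _ in range(n_layers):
        acts.append(layer(acts[-1]))
    g = g_out
    for k in reversed(range(n_layers)):
        g = layer_grad(acts[k], g)
    return g


x = np.linspace(-1.0, 1.0, 5)
g_ref = backward_full(x, n_layers=8, g_out=np.ones_like(x))
g_ck = backward_checkpointed(x, n_layers=8, seg=4, g_out=np.ones_like(x))
assert np.allclose(g_ref, g_ck)        # same gradients, less memory
```

We would be glad to know whether this (or something like mixed precision) is the intended route here, or whether full fine-tuning at this sequence length simply needs more than one GPU.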
Thanks in advance for any guidance.
Best,
Moon