Mouse reference genome

What is the mouse (mm10) equivalent of these lines of code, which load the transcript annotation for the reference genome?

gtf = pd.read_feather(
    'https://storage.googleapis.com/alphagenome/reference/gencode/'
    'hg38/gencode.v46.annotation.gtf.gz.feather'
)

Hello @Seren_Ford, welcome to the community!

The mouse gtf is available from https://storage.googleapis.com/alphagenome/reference/gencode/mm10/gencode.vM23.annotation.gtf.gz.feather

So the following should work:

import pandas as pd


df = pd.read_feather(
    'https://storage.googleapis.com/alphagenome/reference/gencode/mm10/'
    'gencode.vM23.annotation.gtf.gz.feather'
)

Hope that helps!


Dear tward,

Thank you for sharing the GTF feather file. Could you also share the polyA site (PAS), splice site start, and splice site end feather files, please?

So the splice site starts/ends are located here: https://storage.googleapis.com/alphagenome/reference/gencode/mm10/gencode.vM23.splice_sites_starts.feather and https://storage.googleapis.com/alphagenome/reference/gencode/mm10/gencode.vM23.splice_sites_ends.feather.
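As a minimal sketch, assuming the splice site files can be read with pd.read_feather in the same way as the GTF feather file: the helper names mm10_url and load_mm10 and the dictionary keys below are made up for illustration, while the base URL and file names are copied from the links in this thread.

```python
# Base URL copied from the links in this thread.
BASE_URL = 'https://storage.googleapis.com/alphagenome/reference/gencode/mm10/'

# File names copied from the links in this thread; the keys are illustrative only.
MM10_FILES = {
    'gtf': 'gencode.vM23.annotation.gtf.gz.feather',
    'splice_site_starts': 'gencode.vM23.splice_sites_starts.feather',
    'splice_site_ends': 'gencode.vM23.splice_sites_ends.feather',
}


def mm10_url(name):
    """Return the full URL of one mm10 feather file (hypothetical helper)."""
    return BASE_URL + MM10_FILES[name]


def load_mm10(name):
    """Download one mm10 feather file into a DataFrame (needs network access)."""
    # Imported here so that building URLs does not require pandas.
    import pandas as pd
    return pd.read_feather(mm10_url(name))
```

For example, load_mm10('splice_site_starts') and load_mm10('splice_site_ends') would each return a DataFrame. The column layout of those files is not described here, so it is worth inspecting the result (for instance with .head()) before relying on it.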

We have not processed mouse polyA sites, so we don't have a feather file for those.

NB: The script used to process the GTF and splice sites is available at alphagenome/scripts/process_gtf.py on the main branch of the google-deepmind/alphagenome GitHub repository.

Thank you, Tward, for providing us with the data and code!
