Right syntax for dna_model.predict_sequence

Hi,

I want to predict DNase-seq for a 1 kb sequence in a specific tissue.

What is the right syntax for dna_model.predict_sequence if I only want to run it on a 1 kb sequence, without including ".center(2048, 'N')"? I couldn't find any AlphaGenome documentation that explains the code, so any help would be appreciated.

Thank you.

Regards.

Hi Muskan_Sharma, you should be able to pad the sequence with Ns on both sides before calling predict_sequence. An example is:

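def pad_sequence(dna_sequence: str, target_length: int = 2**11) -> str:
    """Pad dna_sequence with 'N' on both sides up to target_length (default 2048)."""
    current_length = len(dna_sequence)
    if current_length > target_length:
        raise ValueError(
            f"Sequence length {current_length} exceeds target length {target_length}."
        )

    # Calculate the total padding required and split it between the two sides.
    # If the padding is odd, the extra N goes on the right.
    padding_needed = target_length - current_length
    pad_left = padding_needed // 2
    pad_right = padding_needed - pad_left
    padded_sequence = 'N' * pad_left + dna_sequence + 'N' * pad_right

    return padded_sequence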
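
For your 1 kb case, it may also help to keep track of where the original sequence sits inside the padded one, so you can slice the prediction back down afterwards. Here is a minimal sketch: pad_with_offsets and my_sequence are names I made up for illustration, and the commented-out call at the end is an assumption based on the quick-start pattern you quoted (including the example ontology term and the .dnase.values attribute), so please check it against the version of the client you have installed.

```python
def pad_with_offsets(dna_sequence: str, target_length: int = 2048):
    """Center-pad with 'N' and report where the original sequence sits.

    Returns (padded_sequence, start, end) such that
    padded_sequence[start:end] == dna_sequence.
    """
    padding_needed = target_length - len(dna_sequence)
    if padding_needed < 0:
        raise ValueError("sequence is longer than target_length")
    start = padding_needed // 2
    end = start + len(dna_sequence)
    padded = 'N' * start + dna_sequence + 'N' * (target_length - end)
    return padded, start, end


# Stand-in 1 kb sequence, for illustration only.
my_sequence = 'ACGT' * 250
padded, start, end = pad_with_offsets(my_sequence)
print(len(padded), start, end)  # 2048 524 1524

# Assumed usage, not verified here: pass the padded sequence to the model,
# then keep only the positions that correspond to your original 1 kb.
# output = dna_model.predict_sequence(
#     sequence=padded,
#     requested_outputs=[dna_client.OutputType.DNASE],
#     ontology_terms=['UBERON:0002048'],  # example term; use your tissue's
# )
# dnase_1kb = output.dnase.values[start:end]  # assumes one value per base
```

The slice at the end only makes sense if the DNase track is returned at single-base resolution, so treat that step as something to confirm on your side.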

Hope that answers your question.