Hi,
I want to predict DNase-seq for a 1 kb sequence in a specific tissue.
What is the right syntax for dna_model.predict_sequence if I want to predict on just the 1 kb sequence, without using .center(2048, 'N')? I couldn't find any AlphaGenome documentation that explains this code, so any help would be much appreciated.
Thank you.
Regards.
Hi Muskan_Sharma, you should be able to pad the sequence with Ns on both sides before calling predict_sequence. For example:
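def pad_sequence(dna_sequence: str, target_length: int = 2**11) -> str:
    """Pad dna_sequence with 'N' on both sides up to target_length."""
    current_length = len(dna_sequence)
    # Calculate the total padding required.
    padding_needed = target_length - current_length
    if padding_needed < 0:
        raise ValueError("dna_sequence is longer than target_length")
    # Split the padding evenly; any odd base goes on the right.
    pad_left = padding_needed // 2
    pad_right = padding_needed - pad_left
    padded_sequence = 'N' * pad_left + dna_sequence + 'N' * pad_right
    return padded_sequence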
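To make that concrete, here is a rough sketch of how the padding might be used for a 1 kb input. The sequence below is a made-up placeholder, and the commented-out predict_sequence call is only my assumption of the call shape (argument names and the ontology term are illustrative), so please check it against the client you have installed. The start/end offsets show where your original bases sit inside the padded string, which could be handy if you later want to look only at the part of the prediction that covers your real sequence (assuming the track has one value per base).

```python
def pad_sequence(dna_sequence: str, target_length: int = 2**11) -> str:
    """Centre dna_sequence in a string of target_length, padding with 'N'."""
    padding_needed = target_length - len(dna_sequence)
    if padding_needed < 0:
        raise ValueError("dna_sequence is longer than target_length")
    pad_left = padding_needed // 2
    pad_right = padding_needed - pad_left
    return "N" * pad_left + dna_sequence + "N" * pad_right


# Hypothetical 1 kb input; replace with your own sequence.
sequence_1kb = "ACGT" * 250
padded = pad_sequence(sequence_1kb)

# Offsets of the original bases inside the padded string.
start = (len(padded) - len(sequence_1kb)) // 2
end = start + len(sequence_1kb)
print(len(padded), start, end)

# Assumed call shape (not verified here; needs a configured dna_model):
# output = dna_model.predict_sequence(
#     sequence=padded,
#     requested_outputs=[dna_client.OutputType.DNASE],
#     ontology_terms=['UBERON:0002048'],  # example term; use your tissue
# )
```

With the default target_length of 2**11 = 2048 this should give the same string as sequence_1kb.center(2048, 'N'); the helper just makes the left/right split explicit.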
Hope that answers your question.