Hello,
I am interested in gene expression in embryonic heart tissue. I am computing variant prediction scores like this:
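from alphagenome.data import genome
from alphagenome.models import dna_client, variant_scorers

# dna_model is the client created earlier with dna_client.create(...);
# chrom, pos, ref and alt describe the variant of interest.
variant = genome.Variant(
    chromosome=chrom,
    position=pos,
    reference_bases=ref,
    alternate_bases=alt,
)
# Score the variant within a 1 Mb window centred on it.
interval = variant.reference_interval.resize(dna_client.SEQUENCE_LENGTH_1MB)
variant_scores = dna_model.score_variant(
    interval=interval,
    variant=variant,
    variant_scorers=[variant_scorers.RECOMMENDED_VARIANT_SCORERS['RNA_SEQ']],
)
df_scores = variant_scorers.tidy_scores(variant_scores)
ontology_terms = ['UBERON:0000948']  # heart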
However, df_scores has no biosample_life_stage column to filter on, even though that field appears elsewhere.
I can see this info in variant_scores:
variant_scores[0].var[variant_scores[0].var['ontology_curie'] == 'UBERON:0000948']
| | name | strand | Assay title | ontology_curie | biosample_name | biosample_type | biosample_life_stage | gtex_tissue | data_source | endedness | genetically_modified | nonzero_mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 182 | UBERON:0000948 polyA plus RNA-seq | + | polyA plus RNA-seq | UBERON:0000948 | heart | tissue | adult | | encode | paired | False | 0.340610 |
| 183 | UBERON:0000948 total RNA-seq | + | total RNA-seq | UBERON:0000948 | heart | tissue | adult | | encode | paired | False | 0.087664 |
| 453 | UBERON:0000948 polyA plus RNA-seq | - | polyA plus RNA-seq | UBERON:0000948 | heart | tissue | adult | | encode | paired | False | 0.340610 |
| 454 | UBERON:0000948 total RNA-seq | - | total RNA-seq | UBERON:0000948 | heart | tissue | adult | | encode | paired | False | 0.087664 |
| 595 | UBERON:0000948 polyA plus RNA-seq | . | polyA plus RNA-seq | UBERON:0000948 | heart | tissue | embryonic | | encode | single | False | 0.467385 |
From this I infer that, for heart, the only unstranded track (strand '.') is the embryonic one, so filtering df_scores to unstranded tracks gives me the embryonic tissue track:
df_scores[(df_scores['ontology_curie'] == 'UBERON:0000948') & (df_scores['track_strand'] == '.')]
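What I would rather do is carry biosample_life_stage over from the track metadata and filter on it explicitly. Below is a minimal sketch of that idea, not a confirmed approach. It uses plain dictionaries in place of the real `variant_scores[0].var` and `df_scores` objects so that it runs on its own. It assumes that the (name, strand) pair identifies a track uniquely and that it corresponds to `track_name` and `track_strand` in `df_scores`. The helper `add_life_stage` is hypothetical, and the `raw_score` values are placeholders, not real results.

```python
# Stand-in for variant_scores[0].var: names, strands and life stages are
# copied from the metadata table in this post.
track_metadata = [
    {"name": "UBERON:0000948 polyA plus RNA-seq", "strand": "+", "biosample_life_stage": "adult"},
    {"name": "UBERON:0000948 total RNA-seq", "strand": "+", "biosample_life_stage": "adult"},
    {"name": "UBERON:0000948 polyA plus RNA-seq", "strand": "-", "biosample_life_stage": "adult"},
    {"name": "UBERON:0000948 total RNA-seq", "strand": "-", "biosample_life_stage": "adult"},
    {"name": "UBERON:0000948 polyA plus RNA-seq", "strand": ".", "biosample_life_stage": "embryonic"},
]

# Stand-in for df_scores rows. raw_score is a placeholder (0.0), not real output.
score_rows = [
    {"track_name": t["name"], "track_strand": t["strand"], "raw_score": 0.0}
    for t in track_metadata
]


def add_life_stage(score_rows, track_metadata):
    """Attach biosample_life_stage to each score row, keyed on (name, strand)."""
    life_stage = {}
    for track in track_metadata:
        key = (track["name"], track["strand"])
        # Fail loudly if the uniqueness assumption does not hold.
        if key in life_stage:
            raise ValueError(f"duplicate track key: {key}")
        life_stage[key] = track["biosample_life_stage"]
    return [
        {**row, "biosample_life_stage": life_stage[(row["track_name"], row["track_strand"])]}
        for row in score_rows
    ]


annotated = add_life_stage(score_rows, track_metadata)
# Filter on life stage directly instead of relying on strand.
embryonic = [row for row in annotated if row["biosample_life_stage"] == "embryonic"]
```

With the real objects, I assume the equivalent would be a pandas merge of `df_scores` with the `name`, `strand` and `biosample_life_stage` columns of `variant_scores[0].var`, but I have not confirmed that the keys line up this way.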
However, I'm not sure how robust filtering on strand is in general, since other tissues or assays may not follow the same pattern. Is there a better way to filter directly to the track I want based on life stage? Or am I misunderstanding the output? Please let me know. Thanks!