3. How to use gffpandas¶
3.1. Example Tutorial:¶
The following GFF3 file will be used as example, to show how gffpandas has to be used. It contains a header and eleven annotation entries:
##gff-version 3
##sequence-region NC_016810.1 1 20
NC_016810.1 RefSeq region 1 4000 . + . Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=genomic;mol_type=genomic DNA;serovar=Typhimurium;strain=SL1344
NC_016810.1 RefSeq gene 1 20 . + . ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_tag=SL1344_0001
NC_016810.1 RefSeq CDS 13 235 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene1;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 20 . + . ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_tag=SL1344_0002
NC_016810.1 RefSeq CDS 341 523 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene2;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 600 . - . ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_tag=SL1344_0003
NC_016810.1 RefSeq CDS 21 345 . - 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene3;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 41 255 . + . ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_tag=SL1344_0004
NC_016810.1 RefSeq CDS 61 195 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene4;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 170 546 . + . ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_tag=SL1344_0005
NC_016810.1 RefSeq CDS 34 335 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene5;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
The library can be imported as the following:
import gffpandas.gffpandas as gffpd
First step is to read in the GFF3 file with the method called ‘read_gff3’. Then a dataframe (.df) or rather header (.header) can be returned:
>>> annotation = gffpd.read_gff3('annotation.gff')
>>> print(annotation.header)
>>> print(annotation.df)
Out[1]:
##gff-version 3
##sequence-region NC_016810.1 1 20
seq_id source type start end score strand phase \
0 NC_016810.1 RefSeq region 1 4000 . + .
1 NC_016810.1 RefSeq gene 1 20 . + .
2 NC_016810.1 RefSeq CDS 13 235 . + 0
3 NC_016810.1 RefSeq gene 1 20 . + .
4 NC_016810.1 RefSeq CDS 341 523 . + 0
5 NC_016810.1 RefSeq gene 1 600 . - .
6 NC_016810.1 RefSeq CDS 21 345 . - 0
7 NC_016810.1 RefSeq gene 41 255 . + .
8 NC_016810.1 RefSeq CDS 61 195 . + 0
9 NC_016810.1 RefSeq gene 170 546 . + .
10 NC_016810.1 RefSeq CDS 34 335 . + 0
attributes
0 Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=ge...
1 ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_...
2 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
3 ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_...
4 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
5 ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_...
6 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
7 ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_...
8 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
9 ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_...
10 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
The created data frame contains all eleven annotation entries and can be changed now. Depending on which annotation entries are desired, different options of gffpandas can be used and/or combined.
In this example, the user wants to return a GFF3 file, but only its coding sequences (‘CDS’), which base pair length (bp) is minimal 10 bp long and maximal 250 bp long. Therefore, the following functions will be combined:
>>> combined_df = annotation.filter_feature_of_type(['CDS']).filter_by_length(10, 250).to_gff3('temp.gff')
>>> gff_content = open('temp.gff').read()
>>> print(gff_content)
Out[2]:
##gff-version 3
##sequence-region NC_016810.1 1 20
NC_016810.1 RefSeq CDS 13 235 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene1;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq CDS 341 523 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene2;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq CDS 61 195 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene4;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
3.2. Methods included in gffpandas:¶
In this subsection, the possible functions of gffpandas will be presented.
3.2.1. filter_feature_of_type¶
For example:
>>> filtered_df = annotation.filter_feature_of_type(['gene'])
>>> print(filtered_df.df)
Out[2]:
seq_id source type start end score strand phase \
1 NC_016810.1 RefSeq gene 1 20 . + .
3 NC_016810.1 RefSeq gene 1 20 . + .
5 NC_016810.1 RefSeq gene 1 600 . - .
7 NC_016810.1 RefSeq gene 41 255 . + .
9 NC_016810.1 RefSeq gene 170 546 . + .
attributes
1 ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_...
3 ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_...
5 ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_...
7 ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_...
9 ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_...
3.2.2. filter_by_length¶
For example:
>>> filtered_by_length = annotation.filter_by_length(min_length=10, max_length=300)
>>> print(filtered_by_length.df)
Out[3]:
seq_id source type start end score strand phase \
1 NC_016810.1 RefSeq gene 1 20 . + .
2 NC_016810.1 RefSeq CDS 13 235 . + 0
3 NC_016810.1 RefSeq gene 1 20 . + .
4 NC_016810.1 RefSeq CDS 341 523 . + 0
7 NC_016810.1 RefSeq gene 41 255 . + .
8 NC_016810.1 RefSeq CDS 61 195 . + 0
attributes
1 ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_...
2 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
3 ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_...
4 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
7 ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_...
8 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
3.2.3. get_feature_by_attribute¶
For example:
>>> feature_by_attribute = annotation.get_feature_by_attribute('gbkey', ['CDS'])
>>> print(feature_by_attribute.df)
Out[4]:
seq_id source type start end score strand phase \
2 NC_016810.1 RefSeq CDS 13 235 . + 0
4 NC_016810.1 RefSeq CDS 341 523 . + 0
6 NC_016810.1 RefSeq CDS 21 345 . - 0
8 NC_016810.1 RefSeq CDS 61 195 . + 0
10 NC_016810.1 RefSeq CDS 34 335 . + 0
attributes
2 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
4 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
6 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
8 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
10 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
3.2.4. attributes_to_columns¶
For example:
>>> attr_to_columns = annotation.attributes_to_columns()
>>> print(attr_to_columns)
Out[5]:
seq_id source type start end score strand phase \
0 NC_016810.1 RefSeq region 1 4000 . + .
1 NC_016810.1 RefSeq gene 1 20 . + .
2 NC_016810.1 RefSeq CDS 13 235 . + 0
3 NC_016810.1 RefSeq gene 1 20 . + .
4 NC_016810.1 RefSeq CDS 341 523 . + 0
5 NC_016810.1 RefSeq gene 1 600 . - .
6 NC_016810.1 RefSeq CDS 21 345 . - 0
7 NC_016810.1 RefSeq gene 41 255 . + .
8 NC_016810.1 RefSeq CDS 61 195 . + 0
9 NC_016810.1 RefSeq gene 170 546 . + .
10 NC_016810.1 RefSeq CDS 34 335 . + 0
attributes \
0 Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=ge...
1 ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_...
2 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
3 ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_...
4 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
5 ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_...
6 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
7 ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_...
8 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
9 ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_...
10 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
Dbxref ... gbkey \
0 taxon:216597 ... Src
1 None ... Gene
2 UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_00517... ... CDS
3 None ... Gene
4 UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_00517... ... CDS
5 None ... Gene
6 UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_00517... ... CDS
7 None ... Gene
8 UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_00517... ... CDS
9 None ... Gene
10 UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_00517... ... CDS
gene genome locus_tag mol_type product \
0 None genomic None genomic DNA None
1 thrL None SL1344_0001 None None
2 None None None None thr operon leader peptide
3 thrA None SL1344_0002 None None
4 None None None None thr operon leader peptide
5 thrX None SL1344_0003 None None
6 None None None None thr operon leader peptide
7 thrB None SL1344_0004 None None
8 None None None None thr operon leader peptide
9 thrC None SL1344_0005 None None
10 None None None None thr operon leader peptide
protein_id serovar strain transl_table
0 None Typhimurium SL1344 None
1 None None None None
2 YP_005179941.1 None None 11
3 None None None None
4 YP_005179941.1 None None 11
5 None None None None
6 YP_005179941.1 None None 11
7 None None None None
8 YP_005179941.1 None None 11
9 None None None None
10 YP_005179941.1 None None 11
3.2.5. overlaps_with¶
For example:
>>> overlapings = annotation.overlaps_with(seq_id='NC_016811.1', type='gene',
start=40, end=300, strand='+')
>>> no_overlap = annotation.overlaps_with(seq_id='NC_016811.1', start=1, end=4000,
strand='+', complement=True)
>>> print(overlapings.df)
>>> print(no_overlap.df)
Out[6]:
seq_id source type start end score strand phase \
0 NC_016810.1 RefSeq region 1 4000 . + .
2 NC_016810.1 RefSeq CDS 13 235 . + 0
7 NC_016810.1 RefSeq gene 41 255 . + .
8 NC_016810.1 RefSeq CDS 61 195 . + 0
9 NC_016810.1 RefSeq gene 170 546 . + .
10 NC_016810.1 RefSeq CDS 34 335 . + 0
attributes
0 Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=ge...
2 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
7 ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_...
8 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
9 ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_...
10 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:Y...
Out[7]:
Empty DataFrame
Columns: [seq_id, source, type, start, end, score, strand, phase, attributes]
Index: []
3.2.6. find_duplicated_entries¶
For example:
>>> redundant_entries = annotation.find_duplicated_entries(seq_id='NC_016811.1', type='gene')
>>> print(redundant_entries.df)
Out[8]:
seq_id source type start end score strand phase \
3 NC_016810.1 RefSeq gene 1 20 . + .
attributes
3 ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_...
The following methods of the library won’t return a data frame:
3.2.7. to_gff3¶
For example:
>>> annotation.to_gff3('temp.gff')
>>> gff3_file = open('temp.gff').read()
>>> print(gff3_file)
Out[9]:
##gff-version 3
##sequence-region NC_016810.1 1 20
NC_016810.1 RefSeq region 1 4000 . + . Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=genomic;mol_type=genomic DNA;serovar=Typhimurium;strain=SL1344
NC_016810.1 RefSeq gene 1 20 . + . ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_tag=SL1344_0001
NC_016810.1 RefSeq CDS 13 235 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene1;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 20 . + . ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_tag=SL1344_0002
NC_016810.1 RefSeq CDS 341 523 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene2;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 600 . - . ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_tag=SL1344_0003
NC_016810.1 RefSeq CDS 21 345 . - 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene3;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 41 255 . + . ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_tag=SL1344_0004
NC_016810.1 RefSeq CDS 61 195 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene4;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 170 546 . + . ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_tag=SL1344_0005
NC_016810.1 RefSeq CDS 34 335 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene5;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
3.2.8. to_csv¶
For example:
>>> annotation.to_csv('temp.csv')
>>> csv_file = open('temp.csv').read()
>>> print(csv_file)
Out[9]:
seq_id,source,type,start,end,score,strand,phase,attributes
NC_016810.1,RefSeq,region,1,4000,.,+,.,Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=genomic;mol_type=genomic DNA;serovar=Typhimurium;strain=SL1344
NC_016810.1,RefSeq,gene,1,20,.,+,.,ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_tag=SL1344_0001
NC_016810.1,RefSeq,CDS,13,235,.,+,0,Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene1;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1,RefSeq,gene,1,20,.,+,.,ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_tag=SL1344_0002
NC_016810.1,RefSeq,CDS,341,523,.,+,0,Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene2;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1,RefSeq,gene,1,600,.,-,.,ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_tag=SL1344_0003
NC_016810.1,RefSeq,CDS,21,345,.,-,0,Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene3;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1,RefSeq,gene,41,255,.,+,.,ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_tag=SL1344_0004
NC_016810.1,RefSeq,CDS,61,195,.,+,0,Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene4;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1,RefSeq,gene,170,546,.,+,.,ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_tag=SL1344_0005
NC_016810.1,RefSeq,CDS,34,335,.,+,0,Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene5;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
3.2.9. to_tsv¶
For example:
>>> annotation.to_tsv('temp.tsv')
>>> tsv_file = open('temp.tsv').read()
>>> print(tsv_file)
Out[10]:
seq_id source type start end score strand phase attributes
NC_016810.1 RefSeq region 1 4000 . + . Dbxref=taxon:216597;ID=id0;gbkey=Src;genome=genomic;mol_type=genomic DNA;serovar=Typhimurium;strain=SL1344
NC_016810.1 RefSeq gene 1 20 . + . ID=gene1;Name=thrL;gbkey=Gene;gene=thrL;locus_tag=SL1344_0001
NC_016810.1 RefSeq CDS 13 235 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene1;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 20 . + . ID=gene2;Name=thrA;gbkey=Gene;gene=thrA;locus_tag=SL1344_0002
NC_016810.1 RefSeq CDS 341 523 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene2;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 1 600 . - . ID=gene3;Name=thrX;gbkey=Gene;gene=thrX;locus_tag=SL1344_0003
NC_016810.1 RefSeq CDS 21 345 . - 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene3;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 41 255 . + . ID=gene4;Name=thrB;gbkey=Gene;gene=thrB;locus_tag=SL1344_0004
NC_016810.1 RefSeq CDS 61 195 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene4;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
NC_016810.1 RefSeq gene 170 546 . + . ID=gene5;Name=thrC;gbkey=Gene;gene=thrC;locus_tag=SL1344_0005
NC_016810.1 RefSeq CDS 34 335 . + 0 Dbxref=UniProtKB%252FTrEMBL:E1W7M4%2CGenbank:YP_005179941.1;ID=cds0;Name=YP_005179941.1;Parent=gene5;gbkey=CDS;product=thr operon leader peptide;protein_id=YP_005179941.1;transl_table=11
3.2.10. stats_dic¶
For example:
>>> statistics = annotation.stats_dic()
>>> print(statistics.df)
Out[11]:
{'Maximal_bp_length': 599, 'Minimal_bp_length': 19, 'Counted_strands': + 9
- 2
Name: strand, dtype: int64, 'Counted_feature_types': gene 5
CDS 5
region 1
Name: type, dtype: int64}