
Input Format:


The data submitted by the user should be in FASTA format. The FASTA header has the form ">Accession_id|sequence name", for example: >5546|Sb1CysPrx01

An example of the protein sequence format:
>1064|OsPrx54|
MALLLLRRGGGFAAATVLAVVVVALVLSCGGGAEAAVRDLRVGYYAETCPDAEAVVRDTMARARAHEARSVASVMRLQFH
>1049|OsPrx39|
MAATLRWGGGGLAVAAFAAVVALSGLLGVAANYGGGGGFLFPQFYQHTCPQMEAVVGGIVARAHAEDPRMAASLLRMHFH
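
Before submitting, it can help to check that every header follows the ">Accession_id|sequence name" layout. The following is a minimal Python sketch of such a check; it is not part of GECA, and the function name parse_fasta is hypothetical. It assumes that a trailing "|" after the sequence name, as in the example above, is acceptable.

```python
def parse_fasta(text):
    """Return a list of (accession_id, sequence_name, sequence) tuples."""
    records = []
    accession = name = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if accession is not None:
                records.append((accession, name, "".join(chunks)))
            # Header layout: >Accession_id|sequence name (trailing "|" tolerated).
            fields = [field.strip() for field in line[1:].split("|")]
            if len(fields) < 2 or not fields[0] or not fields[1]:
                raise ValueError(f"unexpected FASTA header: {line!r}")
            accession, name = fields[0], fields[1]
            chunks = []
        else:
            if accession is None:
                raise ValueError("sequence data found before any header")
            chunks.append(line)
    if accession is not None:
        records.append((accession, name, "".join(chunks)))
    return records


# Shortened sequences, for illustration only.
example = """>1064|OsPrx54|
MALLLLRRGG
>1049|OsPrx39|
MAATLRWGGG
"""
for accession, name, sequence in parse_fasta(example):
    print(accession, name, len(sequence))
```

Running the same check on the protein and genomic files and comparing the (accession, name) pairs is one way to confirm that both inputs use identical headers.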

An example of the genomic sequence format:

>1064|OsPrx54|
ATGGCGGCGACATTGCGTTGGGGCGGCGGCGGGCTCGCGGTGGCGGCGTTTGCGGCGGTGGTCGCGTTGTCCGGCCTCCT
>1049|OsPrx39|
ATGGGCGCTGTGGCTGCGGTTCGTGCCGCGGTCCTGGTCGTGGCCGTGGCCCTCGCCGCGGCGGCGGCCGGCGCGTCGGC

The gene structure information can be in GenBank format or GFF3 format:
* GenBank format should be preceded by the same FASTA header as the protein and genomic input. For example:
>1064|OsPrx54
join(203122798..203122892,203126718..203126874,203130660..203130806,203131568..203131714,203133072..203133200)
>1049|OsPrx39
complement(join(1..672))
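
To illustrate how such a location line maps to exon coordinates, here is a minimal Python sketch. It is not GECA's own parser: the function name parse_location is hypothetical, and it only handles the forms shown above (a plain range, join(...), and complement(...) wrapped around either).

```python
import re


def parse_location(location):
    """Return (strand, [(start, end), ...]) for a GenBank-style location."""
    text = location.replace(" ", "")
    strand = "+"
    # complement(...) marks a gene on the reverse strand.
    if text.startswith("complement(") and text.endswith(")"):
        strand = "-"
        text = text[len("complement("):-1]
    # join(...) lists the exon ranges separated by commas.
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    exons = []
    for part in text.split(","):
        match = re.fullmatch(r"(\d+)\.\.(\d+)", part)
        if match is None:
            raise ValueError(f"unsupported location segment: {part!r}")
        exons.append((int(match.group(1)), int(match.group(2))))
    return strand, exons


strand, exons = parse_location("complement(join(1..672))")
print(strand, exons)
```

Other GenBank location operators (for example partial positions written with "<" or ">") are deliberately rejected here rather than guessed at.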

* GFF3 format has an advantage over the GenBank format, as it can contain information about domains. These domains are identified by their domain ID (smart#, pfam#, ...) and can be displayed over the protein sequences with the new version of GECA.
An example of the gene structures in GFF3 format:
##gff-version 3
##query 1064|OsPrx39 1 623
1064|OsPrx39 Scipio protein_match 1 1246 0.987 + 0 ID=1;Query=1|AtPIN1 1 415
1064|OsPrx39 Scipio protein_match 1399 1633 0.987 + 2 ID=1;Query=1|AtPIN1 416 494
1064|OsPrx39 Scipio protein_match 3039 3102 0.987 + 1 ID=1;Query=1|AtPIN1 602 623;Mismatches=619 621 622 623;Gap=M16 I1 M5
1064|OsPrx39 GenBank Region 1 623 . + . ID=GenBank:Region:AF171223_1:8:32;Dbxref=CDD:197784;Note= smart00557;region_name=ZnF_RBZ
# protein sequence = [MITAADFYHVMTAMVPLYVAMILAYGSVKWWKIFTPDQCSGINRFVALFAVPLLSFHFIAANNPYAMNLRFLAADSLQ
# KVIVLSLLFLWCKLSRNGSLDWTITLFSLSTLPNTLVMGIPLLKGMYGNFSGDLMVQIVVLQCIIWYTLMLFLFEYRGAKLLISE]
#
##gff-version 3
##query 1049|OsPrx54 1 648
1049|OsPrx54 Scipio protein_match 1 270 0.988 + 0 ID=3;Query=2|AtPIN2 1 90
1049|OsPrx54 Scipio protein_match 519 821 0.988 + 0 ID=3;Query=2|AtPIN2 91 191
1049|OsPrx54 GenBank Region 340 427 . + . ID=GenBank:Region:AF171223_1:8:32;Dbxref=CDD:197784;Note=Zinc finger domain; smart00587;region_name=ZnF_RBZ
1049|OsPrx54 GenBank Region 550 580 . + . ID=GenBank:Region:AF171223_1:8:32;Dbxref=CDD:197784;Note=Zinc finger domain; pfam00757;region_name=ZnF_RBZ
# protein sequence = [MITGKDMYDVLAAMVPLYVAMILAYGSVRWWGIFTPDQCSGINRFVAVFAVPLLSFHFISSNDPYAMNYHFLAADSLQ]
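
As an illustration of how the domain lines in the example above can be recognised, the following minimal Python sketch reads GFF3 feature lines and picks out those whose attributes mention a smart# or pfam# identifier. It is an assumption about a reasonable way to read this input, not a description of how GECA processes it; the names parse_gff3_features and domain_features are hypothetical. GFF3 columns are normally tab-separated, so the sketch falls back to whitespace splitting only for text pasted without tabs.

```python
import re

# Matches identifiers such as smart00557 or pfam00757.
DOMAIN_ID = re.compile(r"\b(?:smart|pfam)\d+\b", re.IGNORECASE)


def parse_gff3_features(text):
    """Return one dict per feature line, skipping comments and directives."""
    features = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Nine columns; the ninth (attributes) may itself contain spaces.
        cols = line.split("\t") if "\t" in line else line.split(None, 8)
        if len(cols) != 9:
            raise ValueError(f"expected 9 columns, got {len(cols)}: {line!r}")
        seqid, source, ftype, start, end, score, strand, phase, attrs = cols
        features.append({
            "seqid": seqid, "source": source, "type": ftype,
            "start": int(start), "end": int(end),
            "strand": strand, "attributes": attrs,
        })
    return features


def domain_features(features):
    """Return (seqid, domain_id, start, end) for features naming a domain ID."""
    hits = []
    for feature in features:
        match = DOMAIN_ID.search(feature["attributes"])
        if match:
            hits.append((feature["seqid"], match.group(0),
                         feature["start"], feature["end"]))
    return hits


gff = """##gff-version 3
1049|OsPrx54 Scipio protein_match 1 270 0.988 + 0 ID=3;Query=2|AtPIN2 1 90
1049|OsPrx54 GenBank Region 340 427 . + . ID=GenBank:Region:AF171223_1:8:32;Dbxref=CDD:197784;Note=Zinc finger domain; smart00587;region_name=ZnF_RBZ
"""
for hit in domain_features(parse_gff3_features(gff)):
    print(hit)
```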

GECA results:


Once the data is submitted, you will be directed to the GECA Results page. This page gives access to the different results generated by GECA, which are:
* the multiple alignment file
* the CIWOG results
* the sequences of the common introns detected
* the image generated by GECA










