A. What is GECA?

GECA (Gene Evolution/Conservation Analysis) is a collection of Perl
scripts that align exon/intron structures and detect common introns
and similarities between sequences in order to provide information
about gene evolution and/or conservation.

A local version of GECA can be installed on UNIX platforms and requires
Perl, MAFFT and CIWOG to be installed first. The strategy relies on a simple
fact: by aligning the common introns of closely related sequences, we align
the exon structure of the respective genes. Once the sequences are aligned,
they are compared, amino acid by amino acid, to search for similarity
between the sequences.
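
The residue-by-residue comparison described above can be sketched as
follows. This is only an illustration, not the code shipped with GECA: the
function name compare_aligned is hypothetical, gaps are assumed to be
written as "-", and the similarity table is a small excerpt of
positive-scoring BLOSUM62 pairs rather than the full matrix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative excerpt of residue pairs with a positive BLOSUM62 score.
# Assumption: a real run would consult the complete matrix.
my %SIMILAR = map { $_ => 1 } qw(IL LI IV VI LV VL KR RK DE ED FY YF ST TS);

# Compare two aligned sequences column by column.
# Columns where either sequence has a gap ('-') are skipped.
# Returns (identical, similar, compared).
sub compare_aligned {
    my ($seq_a, $seq_b) = @_;
    die "Aligned sequences must have the same length\n"
        unless length($seq_a) == length($seq_b);

    my ($identical, $similar, $compared) = (0, 0, 0);
    for my $i (0 .. length($seq_a) - 1) {
        my $res_a = uc substr($seq_a, $i, 1);
        my $res_b = uc substr($seq_b, $i, 1);
        next if $res_a eq '-' || $res_b eq '-';
        $compared++;
        if ($res_a eq $res_b) {
            $identical++;
        }
        elsif ($SIMILAR{"$res_a$res_b"}) {
            $similar++;
        }
    }
    return ($identical, $similar, $compared);
}

my ($id, $sim, $n) = compare_aligned('MKV-LDESF', 'MRVALEE-Y');
printf "identical=%d similar=%d compared=%d\n", $id, $sim, $n;
```

For the two toy sequences above, the script prints
"identical=4 similar=3 compared=7".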

A web-based version is available at "".

Comments and questions are welcome.

B. Installing GECA

The GECA distribution contains several directories and files. Source
code and documentation files are included in the distribution.

The distribution is archived and compressed in a single file using the
command tar -zcvf. The compressed file name is GECA.tar.gz (or
something similar depending on compiled binaries included). The GECA
files can be extracted following these instructions:

% tar -zxvf GECA.tar.gz

After executing this command, the directory GECA will be created in your working directory.

GECA requires the following software to be installed first:

-CIWOG (CIWOG_Plants_Version_1_loose [5/11/2009])

-MAFFT : Multiple alignment program for amino acid or nucleotide sequences


-Perl : minimum version tested 5.8.8

-Perl modules.

+ Some of these modules are included in Linux installations.
- Quick check to see whether a module is installed:
% perl -e "use xxx;"
- If the response is "Can't locate xxx.pm in @INC ..." then you need to install this module.
+ All of the modules can be obtained from CPAN or installed using the cpan shell:
% cpan
cpan> install xxx

Where xxx is one of the following:

a. GD
- GD has several prerequisites.
- It is recommended to install GD according to the README at

b. GD::Text::Align

c. DBI

d. Bio::SeqIO

e. Bio::AlignIO

f. Getopt::Long
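
To check all of the modules listed above in one go, a small script along
these lines can be used. It is a convenience sketch and not part of the
GECA distribution; the function name module_available is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Modules listed in this README.
my @REQUIRED = qw(GD GD::Text::Align DBI Bio::SeqIO Bio::AlignIO Getopt::Long);

# Return 1 if the named module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    # Turn Foo::Bar into Foo/Bar.pm, the form expected by require.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (@REQUIRED) {
    printf "%-18s %s\n", $module, module_available($module) ? 'ok' : 'MISSING';
}
```

Any module reported as MISSING can then be installed from the cpan shell
as described above.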


- minimum version tested 4.1.9

GECA itself does not require any installation; all you need to do is configure the "".

C. Configuring GECA

The configuration file contains all the variables used by the different scripts.
After installing CIWOG and MAFFT, you need to specify the paths to these two programs,
the connection settings for the CIWOG database, and a temporary directory that
will hold all the intermediate files generated by GECA.
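
As an illustration of the kind of settings involved, the sketch below
defines them in a Perl hash and checks them before a run. Every key name,
path and value here is a placeholder chosen for this example (the
temporary location is assumed to be a directory); the actual variable
names are those found in the GECA configuration file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical settings: adapt the names and values to your own installation.
my %config = (
    ciwog_dir   => '/path/to/CIWOG',   # CIWOG installation directory
    mafft_bin   => '/path/to/mafft',   # MAFFT executable
    db_name     => 'ciwog',            # CIWOG database name (placeholder)
    db_host     => 'localhost',        # database host (placeholder)
    db_user     => 'ciwog_user',       # database user (placeholder)
    db_password => 'change_me',        # database password (placeholder)
    tmp_dir     => '/tmp/geca',        # where intermediate files are written
);

# Return a list of human-readable problems found in the settings.
sub check_config {
    my (%cfg) = @_;
    my @problems;
    push @problems, "CIWOG directory not found: $cfg{ciwog_dir}"
        unless -d $cfg{ciwog_dir};
    push @problems, "MAFFT executable not found: $cfg{mafft_bin}"
        unless -f $cfg{mafft_bin} && -x $cfg{mafft_bin};
    push @problems, "temporary directory not writable: $cfg{tmp_dir}"
        unless -d $cfg{tmp_dir} && -w $cfg{tmp_dir};
    for my $key (qw(db_name db_host db_user)) {
        push @problems, "missing database setting: $key"
            unless defined $cfg{$key} && length $cfg{$key};
    }
    return @problems;
}

my @problems = check_config(%config);
print @problems ? join("\n", @problems) . "\n" : "Configuration looks usable\n";
```

Run with the placeholder values, the script simply lists the paths that
do not exist on your machine.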

D. File Listing

The GECA distribution contains a set of independent programs that are used by

* : a parser to extract the protein sequence from the
alignment file and to detect the presence of possible internal stop codons.

* : a parser that searches for identities and similarities
between the protein sequences using a BLOSUM62 matrix (Henikoff S et al., 1992).

* : the configuration file for GECA.

*Geca_input.php : the web interface for GECA.

* : extracts the data from Geca_input.php and launches GECA.

* : creates the GECA result.

* : formats the MAFFT output.

* & : create the structure and alignment files for CIWOG.

*README.txt : This file.

*User's Manual.
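
The internal stop codon check mentioned for the first parser in the list
above could look roughly like the sketch below. It is not the GECA parser
itself: the function name is hypothetical, and it assumes that stop codons
appear as "*" in the translated sequence and that gaps are written as "-".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 1-based positions of stop symbols ('*') that are not the
# last residue of the sequence. Alignment gaps ('-') are removed first,
# so positions refer to the ungapped protein.
sub internal_stop_positions {
    my ($protein) = @_;
    (my $seq = $protein) =~ s/-//g;
    my @positions;
    while ($seq =~ /\*/g) {
        my $pos = pos($seq);    # offset just after the '*', i.e. its 1-based position
        push @positions, $pos if $pos < length $seq;
    }
    return @positions;
}

my @stops = internal_stop_positions('MK-T*LVA*');
print @stops
    ? "Internal stop codon(s) at position(s): @stops\n"
    : "No internal stop codon found\n";
```

A trailing "*" is treated as the normal end of the protein and is not
reported.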

E. Running GECA

Several variables must be set by the user to specify the paths where GECA can find:

-The alignment files and structure files used by CIWOG (usually they are in yourLocalFile/CIWOG/input/).

-The CIWOG configuration file "", which can be found in "CIWOG/conf/".

To run the demo version of GECA, please visit
"" and follow the instructions on our home page.

To run GECA from the source code, once all the requirements are met, you need to make
Apache aware of GECA. Since this should already have been done for CIWOG, all you need to
do is place the PHP files (get_fasta_geca.php and geca_input.php) in /ciwog/cgi.

F. Authors and help

GECA was written by Nizar Fawal (UMR 5546 CNRS/Universite Paul Sabatier).

The GECA home page is at "".