miRExpress source code download

How to install miRExpress on your machine
  • 1. Download the miRExpress package (miRExpress.tgz) to your Linux machine.
  • 2. Type tar -zxvf miRExpress.tgz
  • 3. Change into the unpacked directory.
  • 4. Type ./configure
  • 5. Type make
  • 6. Type make install
Usage:

miRExpress accepts next-generation sequencing data in FASTQ format as query sequences; each input sequence must be shorter than 64 nucleotides. miRExpress includes the miRNA sequences from miRBase in FASTA format as reference sequences. To process raw sequencing data and construct miRNA expression profiles, use the following procedure:

Raw_data_parse -> Trim_adapter -> alignmentSIMD (or alignmentSIMDthread) -> analysis

Each command is described below.

Command:

"Raw_data_parse" reads raw sequences in FASTQ format and outputs the unique sequences with their counts, separated by a tab (\t).

Raw_data_parse [-i raw_data] [-o output file name, optional]


 -i       raw data sequence file in FASTQ format.
 -o       output file name. Default is input file name plus .merge
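
As an illustration of the collapsing step described above, here is a minimal Python sketch. It is not the Raw_data_parse implementation. It assumes plain 4-line FASTQ records, writes the count before the sequence (the column order shown for the Trim_adapter input), and sorts by descending count; the function names are hypothetical.

```python
from collections import Counter
import io


def collapse_fastq(handle):
    """Count identical reads in a FASTQ stream (assumes 4-line records)."""
    counts = Counter()
    for line_no, line in enumerate(handle):
        if line_no % 4 == 1:  # second line of each record holds the sequence
            seq = line.strip()
            if seq:
                counts[seq] += 1
    # Most frequent first; ties keep the order of first appearance.
    return counts.most_common()


def write_table(pairs, out):
    """Write one 'count<TAB>sequence' line per unique sequence."""
    for seq, count in pairs:
        out.write(f"{count}\t{seq}\n")


# Tiny in-memory example with made-up reads.
example = "@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n@r3\nACGT\n+\nIIII\n"
buffer = io.StringIO()
write_table(collapse_fastq(io.StringIO(example)), buffer)
print(buffer.getvalue(), end="")
```

Collapsing identical reads first means the later steps handle each distinct sequence once instead of once per read.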

"Trim_adapter" processes a sequence file whose reads may or may not contain adaptor sequence, trimming them according to the adaptor sequences you supply.
The input sequence file has the following format:

        ============================================
        Counts          Sequences
        71      GCGGAAATAGCTTAATGGTAGAGCTCGTATGCCGT
        4       AGATTAAGCCATGCATGTTCGTATGCCGTCTTCTG
        1       TCGAACAAGTAGGTGTAACTGTTCGTATGCCGTCT
        1       AGAGAAGATTGGATAGACGGGAAGTAGTATGCCGT
        ============================================
        Counts and sequences are separated by a tab (\t).

Trim_adapter [-i input file] [-t 3' adaptor sequence file] [-h 5' adaptor sequence file, optional] [-o output file name, optional]

  -i     input sequence file
  -t     3' adaptor sequence file
  -h     5' adaptor sequence file
  -o     output file name. Default is input file name plus .trim
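
To make the idea of 3' adaptor trimming concrete, here is a minimal Python sketch. The matching rules Trim_adapter actually uses are not described here, so everything below is an assumption: exact matching only, a hypothetical minimum overlap of 5 bases, merging of reads that become identical after trimming, and dropping of reads that are adaptor only. The adaptor string is also assumed; it was chosen only because its beginning matches the tail of the example reads above.

```python
from collections import Counter

# Assumed adaptor for illustration; substitute the one used for your library.
ADAPTOR_3P = "TCGTATGCCGTCTTCTGCTTG"


def trim_3prime(read, adaptor, min_overlap=5):
    """Cut the read at the first exact adaptor match.

    A match is either the full adaptor inside the read or an adaptor prefix
    of at least min_overlap bases that runs to the end of the read.
    """
    for start in range(len(read)):
        tail = read[start:start + len(adaptor)]
        if len(tail) < min_overlap:
            break  # remaining suffix is too short to call a match
        if adaptor.startswith(tail):
            return read[:start]
    return read


def trim_table(rows, adaptor, min_overlap=5):
    """Trim every (count, sequence) row and merge reads that become equal."""
    merged = Counter()
    for count, seq in rows:
        trimmed = trim_3prime(seq, adaptor, min_overlap)
        if trimmed:  # assumption: reads that are adaptor only are dropped
            merged[trimmed] += count
    return merged.most_common()


rows = [
    (71, "GCGGAAATAGCTTAATGGTAGAGCTCGTATGCCGT"),
    (4, "AGATTAAGCCATGCATGTTCGTATGCCGTCTTCTG"),
]
for seq, count in trim_table(rows, ADAPTOR_3P):
    print(f"{count}\t{seq}")
```

A real trimmer would usually tolerate some mismatches; exact matching keeps the sketch short.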

"statistics_reads" reports, for each sequence length, the number of sequences and their total counts.

statistics_reads [-i input file] [-o output file name, optional]

 -i     input sequence file
 -o     output file name. Default is input file name plus .len
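
The per-length summary can be sketched in Python as follows. This is an illustration only: the layout of the .len file is not specified here, so the sketch returns a dictionary instead of reproducing that file, and the function name is hypothetical.

```python
from collections import defaultdict


def length_stats(rows):
    """Map each sequence length to (unique sequences, total read count)."""
    stats = defaultdict(lambda: [0, 0])
    for count, seq in rows:
        entry = stats[len(seq)]
        entry[0] += 1      # one more distinct sequence of this length
        entry[1] += count  # add the reads this sequence represents
    return {length: tuple(values) for length, values in sorted(stats.items())}


# Made-up (count, sequence) rows in the tab-separated table's column order.
rows = [(71, "ACGTA"), (4, "ACGT"), (1, "TTTT")]
for length, (n_unique, total) in length_stats(rows).items():
    print(f"{length}\t{n_unique}\t{total}")
```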

"alignmentSIMD" aligns the query sequences against the reference sequences.

alignmentSIMD [-r reference sequence file] [-i input sequence file] [-o output directory] [-t alignment identity, optional]

 -r     reference sequence file in FASTA format
 -i     input file
 -o     output directory
 -t     alignment identity between query and reference sequences. Default value is 1.

"alignmentSIMDthread" aligns the query sequences against the reference sequences using multiple processors.

alignmentSIMDthread [-r reference sequence file] [-i input sequence file] [-o output directory] [-t alignment identity, optional] [-u number of threads, optional]

 -r     reference sequence file in FASTA format
 -i     input file
 -o     output directory
 -t     alignment identity between query and reference sequences. Default value is 1.
 -u     number of threads to create, depending on the number of CPUs available. Default is 1.

"analysis" processes the alignment results and constructs miRNA expression profiles.

analysis [-r reference sequence file] [-d alignment result directory] [-o output file name] [-t output file for comparing result, optional]

 -r     reference sequence file in FASTA format (this must be the same file
        that was used for alignmentSIMD)
 -d     alignment result directory (this must be the same directory that was
        used as the alignmentSIMD output directory)
 -o     output file name
 -t     output file for comparing result
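
Putting the commands together, the following Python sketch builds the command lines for the whole procedure using only the options documented above. All file and directory names are hypothetical placeholders, and the helper name build_pipeline is not part of miRExpress. Output names are passed explicitly with -o so that the sketch does not depend on the default suffixes.

```python
import shlex


def build_pipeline(fastq, adaptor_3p, reference, align_dir, profile, threads=1):
    """Return the argument lists for the four pipeline stages, in order."""
    merged = fastq + ".merge"   # explicit output name for Raw_data_parse
    trimmed = fastq + ".trim"   # explicit output name for Trim_adapter
    if threads > 1:
        align = ["alignmentSIMDthread", "-r", reference, "-i", trimmed,
                 "-o", align_dir, "-u", str(threads)]
    else:
        align = ["alignmentSIMD", "-r", reference, "-i", trimmed,
                 "-o", align_dir]
    return [
        ["Raw_data_parse", "-i", fastq, "-o", merged],
        ["Trim_adapter", "-i", merged, "-t", adaptor_3p, "-o", trimmed],
        align,
        # analysis must receive the same reference and alignment directory.
        ["analysis", "-r", reference, "-d", align_dir, "-o", profile],
    ]


commands = build_pipeline("reads.fastq", "adaptor_3p.txt", "miRNA.fa",
                          "align_out", "profile.txt", threads=4)
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run the stages, each list can be passed to subprocess.run(cmd, check=True) once the miRExpress programs are on your PATH. Whether the output directory has to exist beforehand is not stated here, so check that before running the alignment step.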

Department of Biological Science and Technology & Institute of Bioinformatics and System Biology, National Chiao Tung University, Taiwan. Contact: Dr. Hsien-Da Huang