Tool Usage

Command:

rgt-ODIN <BAM> <BAM> <FASTA> <CHROM SIZES>

Required Input

Option Name Type Default Description
BAM File None Two BAM files to find differential peaks in.
FASTA File None The genome in FASTA format.
CHROM SIZES File None File that gives the chromosome sizes. See here how to obtain it.
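If no chromosome sizes file is at hand, the lengths can be derived from the FASTA file itself. The following is a minimal sketch, assuming the CHROM SIZES file uses the common two-column, tab-separated layout (sequence name, length in bp); this documentation does not spell out the layout, and the function names chrom_sizes_from_fasta and format_chrom_sizes are hypothetical helpers, not part of ODIN.

```python
def chrom_sizes_from_fasta(fasta_lines):
    """Return (name, length) pairs, one per FASTA record, in file order."""
    sizes = []
    name, length = None, 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if name is not None:
                sizes.append((name, length))
            header = line[1:].split()
            # Keep only the first word of the header as the sequence name.
            name, length = (header[0] if header else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        sizes.append((name, length))
    return sizes


def format_chrom_sizes(sizes):
    """Render the pairs as tab-separated lines (assumed file layout)."""
    return "".join(f"{name}\t{length}\n" for name, length in sizes)
```

The first function accepts any iterable of lines, so an open file handle can be passed directly; the result of format_chrom_sizes can then be written to the file given as <CHROM SIZES>.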

Output

ODIN creates the following output files. The file(s)

  • *.bw give the postprocessed ChIP-seq signal (in bigWig format),
  • *-setup.info gives information about the setting,
  • *-diffpeaks.bed describes the estimated differential peaks in a proprietary BED format,
  • *-uncor-diffpeaks.bed gives the same information as above, but without p-value correction,
  • *-diffpeaks.narrowPeak describes the estimated differential peaks in narrowPeak format, and
  • *-uncor-diffpeaks.narrowPeak gives the same information as above, but without p-value correction.

Because the narrowPeak format cannot store the counts of signal 1 and signal 2 for each peak, we additionally output a proprietary BED format.

The 11th column in the BED file gives a comma-separated list for each differential peak. The list contains the counts in the first sample, the counts in the second sample, and the calculated p-value. For downstream analysis of this BED file, we provide two tools. The first tool separates the BED file into differential peaks that gain signal 1 and those that gain signal 2. The second tool filters the BED file by p-value.
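For custom downstream scripts, the 11th column can be decoded in a few lines. This is a minimal sketch, assuming that the BED file is tab-separated and that the list holds exactly three values in the order given above (counts in the first sample, counts in the second sample, p-value). The function name parse_diffpeak_line is hypothetical; whether the stored p-value is raw or transformed is not stated here, so the value is returned unchanged.

```python
def parse_diffpeak_line(line):
    """Split one *-diffpeaks.bed line and decode its 11th column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-separated columns")
    # Assumed layout of column 11: counts1,counts2,p-value
    count1, count2, pvalue = fields[10].split(",")
    return {
        "chrom": fields[0],
        "start": int(fields[1]),
        "end": int(fields[2]),
        "count1": float(count1),
        "count2": float(count2),
        "pvalue": float(pvalue),
    }
```

A list of parsed peaks can then be filtered with an ordinary list comprehension on the "pvalue" key.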

In both files, *-diffpeaks.bed and *-diffpeaks.narrowPeak, the strand column (6th column) indicates whether the peak gains signal 1 (positive strand) or gains signal 2 (negative strand).
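The strand convention makes it straightforward to separate the two groups of peaks in a script. The sketch below is an independent illustration, not the implementation of the provided tool; it assumes tab-separated lines, and the function name split_by_gain is hypothetical.

```python
def split_by_gain(lines):
    """Separate peak lines into (gain signal 1, gain signal 2) by strand."""
    gain1, gain2 = [], []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            raise ValueError("expected at least 6 tab-separated columns")
        strand = fields[5]  # 6th column
        if strand == "+":
            gain1.append(line)  # positive strand: peak gains signal 1
        elif strand == "-":
            gain2.append(line)  # negative strand: peak gains signal 2
        else:
            raise ValueError(f"unexpected strand value: {strand!r}")
    return gain1, gain2
```

Because only the 6th column is inspected, the same function works for both the BED and the narrowPeak output.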

To output valid narrowPeak and BED files, we add several dummy columns. This makes it possible to use the files in other tools, such as the IGV browser.

For the BED file, columns 7 and 8 give the same information as columns 2 and 3; column 9 gives a colour code for the peaks (red for a differential peak in signal 1, and green for a differential peak in signal 2). Column 10 is always 0.

For the narrowPeak file, columns 5, 7, 9 and 10 do not give any information.
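When reading the narrowPeak output in a script, the dummy columns can simply be dropped. The following is a minimal sketch, assuming a tab-separated file with the ten columns of the general narrowPeak layout; reading column 4 as the peak name and column 8 as the p-value follows that general layout and is an assumption, since the contents of these columns are not described here. The function name read_narrowpeak_line is hypothetical.

```python
# 0-based indices of the columns that are not dummies (1, 2, 3, 4, 6, 8).
INFORMATIVE_COLUMNS = {
    "chrom": 0,
    "start": 1,
    "end": 2,
    "name": 3,    # assumed meaning, per the general narrowPeak layout
    "strand": 5,
    "pvalue": 7,  # assumed meaning, per the general narrowPeak layout
}


def read_narrowpeak_line(line):
    """Keep only the informative columns of one narrowPeak line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError("expected exactly 10 tab-separated columns")
    record = {key: fields[idx] for key, idx in INFORMATIVE_COLUMNS.items()}
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    return record  # name, strand and pvalue are left as raw strings
```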

Options

Option Name Type Default Description
--input-1 File None Input-DNA file in BAM format of the ChIP-seq experiment for the first BAM file. See Section 2.1.5 in our paper for details.
--input-2 File None Input-DNA file in BAM format of the ChIP-seq experiment for the second BAM file. See Section 2.1.5 in our paper for details.
-p, --pvalue float 0.1 p-value cutoff; keep only differential peaks with a p-value higher than the cutoff.
--no-correction Boolean False Do not apply Benjamini/Hochberg multiple testing correction to the p-values.
-m, --merge Boolean False Merge peaks which have a distance less than the estimated fragment size (recommended for histones). See Section 2.3 in our paper for details.
-n, --name String None Prefix for all files that will be created. If not given, use “exp-<BAM>-<BAM>”.
--ext-1 Integer None Extension size for first BAM file. If option is not chosen, estimate extension size. See Section 2.1.2 in our paper for details.
--ext-2 Integer None Extension size for second BAM file. If option is not chosen, estimate extension size. See Section 2.1.2 in our paper for details.
--ext-input-1 Integer None Extension size for first input-DNA BAM file. If option is not chosen, estimate extension size. See Section 2.1.2 in our paper for details.
--ext-input-2 Integer None Extension size for second input-DNA BAM file. If option is not chosen, estimate extension size. See Section 2.1.2 in our paper for details.
--distr String binom HMM’s emission distribution [binomial (binom), (constrained) mixture of Poissons (poisson-c)]. See Sections 2.1.2 and 2.2.2 in our paper for details.
--mag Integer 3 Magnitude of Poisson mixture model. We did experiments with a magnitude of up to 4.
-v Boolean False Verbose mode.
--version Boolean False Show version.
--cite Boolean False Show BibTeX entry of our paper.
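When ODIN is called from a pipeline, assembling the argument list programmatically avoids quoting mistakes. The sketch below only uses the short options documented above (-n, -p, -m, -v) and the four required inputs in their documented order. Placing the options before the positional arguments is an assumption, and the function name build_odin_command is hypothetical.

```python
import shlex


def build_odin_command(bam1, bam2, fasta, chrom_sizes,
                       name=None, pvalue=None, merge=False, verbose=False):
    """Assemble an rgt-ODIN call as an argument list (nothing is executed)."""
    cmd = ["rgt-ODIN"]
    if name is not None:
        cmd += ["-n", name]         # prefix for all created files
    if pvalue is not None:
        cmd += ["-p", str(pvalue)]  # p-value cutoff
    if merge:
        cmd.append("-m")            # merge nearby peaks
    if verbose:
        cmd.append("-v")            # verbose mode
    # Required inputs: <BAM> <BAM> <FASTA> <CHROM SIZES>
    cmd += [bam1, bam2, fasta, chrom_sizes]
    return cmd


def as_shell_string(cmd):
    """Quote the argument list so it can be pasted into a shell."""
    return shlex.join(cmd)
```

The returned list can be passed to subprocess.run as is; as_shell_string is only meant for logging or for copying the call into a terminal.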

 

Advanced Options

Option Name Type Default Description
--region File genome Regions (in BED format) in which to search for differential peaks (DPs).
--deadzones File None Consider regions (in BED format) with poor mappability, so-called deadzones. See Section 4.1 in our paper for details.
--no-gc-content float 0.05 Turn off GC-content calculation (faster, but less accurate). See Sections 2.1.4 and 4.1 in our paper for details.
--const-chrom String None Constrain HMM learning process to chromosome.
--factor-input-1 float None Normalization factor for first input. If option is not chosen, estimate factor. See Section 2.1.6 in our paper for details.
--factor-input-2 float None Normalization factor for second input. If option is not chosen, estimate factor. See Section 2.1.6 in our paper for details.
-c float 0.7 Threshold that each observation’s posterior probability must exceed to be considered as a differential peak. See Section 2.2.1 in our paper for details.
-f float None Minimum fold change which a potential differential peak must exhibit. See Section 2.2.1 in our paper for details.
-b Integer 100 Size of underlying bins for creating the signal. See Section 2.1.3 in our paper for details.
-s Integer 50 Step size with which the window consecutively slides across the genome to create the signal. See Sections 2.1.2 and 2.1.3 in our paper for details.
--debug Boolean False Output debug information. Warning: space consuming! Not intended for regular users.
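To illustrate how the bin size (-b) and the step size (-s) interact, the sketch below lists the start coordinates of windows of the given size that slide across a chromosome in the given steps. It only illustrates the general sliding-window idea with the default values 100 and 50; how ODIN treats a partial window at the chromosome end is not described here, so dropping it is an assumption. The function name window_starts is hypothetical.

```python
def window_starts(chrom_length, bin_size=100, step_size=50):
    """Start coordinates (0-based) of all full windows on one chromosome."""
    if bin_size <= 0 or step_size <= 0:
        raise ValueError("bin_size and step_size must be positive")
    # The last full window starts at chrom_length - bin_size.
    return list(range(0, chrom_length - bin_size + 1, step_size))
```

With the defaults, consecutive windows overlap by half their length, so every position away from the chromosome ends is covered by two windows.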