Tool Usage

TDF can be executed with the following command:

rgt-TDF {promotertest,regiontest} [required inputs] [options]

Where:

  • {promotertest,regiontest}: Selects the test to apply, either the promoter test or the genomic region test.
  • [required inputs]: Required input files and paths.
  • [options]: Additional input parameters or output options.

General Inputs for both tests

The inputs below are common to both tests:

Required Input for both tests

Option Name     Type     Description
-h, --help               Show the help message and exit
-r              PATH     Input file name for the RNA sequence (in FASTA format)
-rn             String   Define the RNA name
-o              PATH     Output directory name for all the results and temporary files
-organism       String   Define the organism (hg19 or mm9)
-genome_path    PATH     Define the path of the genome FASTA file
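
As a rough illustration, the required inputs above can be assembled into a command line programmatically. The Python sketch below uses only the flags listed in this section. The helper name build_tdf_command and all file paths are hypothetical placeholders. It prints the command and does not run TDF.

```python
import shlex

def build_tdf_command(test, rna_fasta, rna_name, out_dir, organism, genome_path):
    """Assemble an rgt-TDF command line from the required inputs (hypothetical helper)."""
    if test not in ("promotertest", "regiontest"):
        raise ValueError("test must be 'promotertest' or 'regiontest'")
    if organism not in ("hg19", "mm9"):
        raise ValueError("organism must be 'hg19' or 'mm9'")
    return [
        "rgt-TDF", test,
        "-r", rna_fasta,
        "-rn", rna_name,
        "-o", out_dir,
        "-organism", organism,
        "-genome_path", genome_path,
    ]

# Placeholder file names, for illustration only.
cmd = build_tdf_command(
    "promotertest", "lncRNA.fa", "my_lncRNA", "results/", "hg19", "genome/hg19.fa"
)
print(shlex.join(cmd))
```

The test-specific inputs described further down (for example -de, or -bed and -bg) still have to be appended before the command is complete.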

Options

Option Name   Type      Default   Description
-showdbs      Boolean   False     Show the plots and statistics of DBS (DNA binding sites)
-a            Float     0.05      Define the significance level for rejecting the null hypothesis
-ccf          Integer   20        Define the cut-off value for promoter counts
-rt           Boolean   False     Remove temporary files (fa, txp, etc.)
-log          Boolean   False     Set the plots in log scale
-ac           PATH      None      Input file for RNA accessibility
-accf         Integer   500       Define the cut-off value for RNA accessibility
-obed         Boolean   False     Output the BED files for DNA binding sites
-showpa       Boolean   False     Show parallel and antiparallel bindings in the plot separately
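
The optional parameters can be turned into command-line arguments in a similar way. The following Python sketch rests on an assumption not stated in this section: that Boolean options are plain switches, passed without a value when enabled and omitted otherwise. The helper name options_to_args is hypothetical.

```python
def options_to_args(options):
    """Convert a dict of option names to a flat argument list (hypothetical helper).

    Assumption: Boolean options are switches that are passed without a value
    when True and left out when False. Options set to None are skipped.
    """
    args = []
    for name, value in options.items():
        if value is None or value is False:
            continue  # option not requested
        if value is True:
            args.append(f"-{name}")  # bare switch
        else:
            args.extend([f"-{name}", str(value)])  # flag followed by its value
    return args

extra = options_to_args({"a": 0.01, "ccf": 20, "rt": True, "log": False, "ac": None})
print(extra)
```

Checking identity against True and False (rather than truthiness) keeps numeric values such as 0 from being dropped by mistake.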

Options for Triplexator

The arguments of Triplexator can be adjusted with the options below. The default values in TDF differ from those of Triplexator itself. For detailed information about these arguments, please check Triplexator's manual.

Option Name   Type      Default   Description
-l            Integer   15        Define the minimum length of the triplex
-e            Integer   20        Set the maximal error rate tolerated (in %)
-c            Integer   2         Set the tolerated number of consecutive errors with respect to the
                                  canonical triplex rules, as such errors were found to greatly
                                  destabilize triplexes in vitro
-fr           String    off       Activate the filtering of low-complexity regions and repeats in the
                                  sequence data
-fm           Integer   0         Method to quickly discard non-hits: '0' = greedy approach;
                                  '1' = q-gram filtering
-of           Integer   1         Define the output format of Triplexator
-mf           Boolean   False     Merge overlapping features into a cluster and report the spanning region
-rm           Boolean   False     Set the multiprocessing

Particular Inputs for promoter test

Required Input for promoter test

The target promoters can be defined in two ways:

  1. A gene list, which contains gene symbols or Ensembl IDs, one gene per line in plain text format. The argument, -de, should be used;
  2. Two BED files containing the regions of target promoters and non-target promoters (background). Two arguments, -bed and -bg, should be used together.

Option Name   Type   Description
-de           PATH   Input file for gene list (gene symbols or Ensembl ID)
-bed          PATH   Input BED file of the promoter regions of genes
-bg           PATH   Input BED file of the promoter regions of background genes
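
The two ways of defining target promoters can be checked before a run is launched. The Python sketch below returns the matching arguments for either a gene list or a pair of BED files. Treating the two ways as mutually exclusive is an assumption of this sketch, and the helper name promoter_target_args and the file names are hypothetical.

```python
def promoter_target_args(gene_list=None, bed=None, background=None):
    """Return the promoter-test arguments for one of the two input styles.

    Assumption: a gene list (-de) and the BED pair (-bed with -bg) are not
    combined in the same run.
    """
    if gene_list is not None and bed is None and background is None:
        return ["-de", gene_list]
    if gene_list is None and bed is not None and background is not None:
        return ["-bed", bed, "-bg", background]
    raise ValueError("give either a gene list (-de) or both -bed and -bg")

# Placeholder file names, for illustration only.
by_genes = promoter_target_args(gene_list="target_genes.txt")
by_regions = promoter_target_args(bed="targets.bed", background="background.bed")
print(by_genes)
print(by_regions)
```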

Options for promoter test

Option Name   Type      Default   Description
-pl           Integer   1000      Define the promoter length
-score        Boolean   False     Load the score column from the input gene list or BED file for analysis
-scoreh       Boolean   False     Use the header of scores from the given gene list or BED file

Particular Inputs for region set test

Required Input for region set test

Option Name   Type   Description
-bed          PATH   Input BED file for the regions of interest on DNA

Options for region set test

Option Name   Type      Default   Description
-n            Integer   10000     Number of iterations (randomization)
-f            PATH      None      Input BED file as mask for randomization
-score        Boolean   False     Load the score column from the input BED file
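
Putting the pieces together, a region set test command might be composed as in the Python sketch below. It only uses flags documented in this section. The helper name build_regiontest_command and the file names are hypothetical, and -score is assumed to be a bare switch. The command is printed, not executed.

```python
import shlex

def build_regiontest_command(rna_fasta, rna_name, out_dir, organism, genome_path,
                             bed, iterations=10000, mask_bed=None, use_score=False):
    """Assemble an rgt-TDF regiontest command line (hypothetical helper)."""
    if iterations < 1:
        raise ValueError("iterations must be a positive integer")
    cmd = [
        "rgt-TDF", "regiontest",
        "-r", rna_fasta,
        "-rn", rna_name,
        "-o", out_dir,
        "-organism", organism,
        "-genome_path", genome_path,
        "-bed", bed,
        "-n", str(iterations),
    ]
    if mask_bed is not None:
        cmd += ["-f", mask_bed]   # optional mask for the randomization
    if use_score:
        cmd.append("-score")      # assumed to be a switch without a value
    return cmd

# Placeholder file names, for illustration only.
region_cmd = build_regiontest_command(
    "lncRNA.fa", "my_lncRNA", "results/", "mm9", "genome/mm9.fa",
    bed="regions.bed", iterations=1000, mask_bed="mask.bed",
)
print(shlex.join(region_cmd))
```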