External software and databases
The TCW uses the following external software packages, all of which are included with the TCW package except the optional DE methods. Please cite in your publications any program that is used in your analysis.
runAS (Annotation Setup)
- UniProt: though other databases can be used for annotation, the TCW has the best support for this database.
Dimmer EC, Huntley RP, Alam-Faruque Y, Sawford T, O'Donovan C, et al. (2012) The UniProt-GO Annotation database in 2011. Nucleic Acids Res 40: D565-570.
- Gene Ontology (GO): if UniProt is used, the TCW uses it along with the GO database for GO annotations.
The GO consortium (2012) The Gene Ontology: enhancements for 2011. Nucleic Acids Res 40: D559-564.
runSingleTCW (single species database)
- BLAST: used for assembly and annotation.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-3402.
- DIAMOND: may be used in place of BLAST for very fast annotation.
Buchfink B, Reuter K, Drost HG (2021) Sensitive protein alignments at tree-of-life scale using DIAMOND. Nature Methods 18: 366-368. https://doi.org/10.1038/s41592-021-01101-x.
- CAP3: used for assembly.
Huang X, Madan A (1999) CAP3: A DNA sequence assembly program. Genome Res 9: 868-877.
- TransDecoder: the scripts for creating the Markov model and scoring a sequence were re-programmed in Java for the TCW ORF finder.
Haas BJ, Papanicolaou A, Yassour M, et al. (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols 8: 1494-1512.
- UniProt and Gene Ontology (GO).
runDE (Differential Expression)
The following are all R packages.
- edgeR: differential expression (DE) analysis
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26: 139-140.
- DESeq2: differential expression (DE) analysis
Love MI, Huber W and Anders S (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15:550.
- GOSeq: GO differential expression
Young MD, Wakefield MJ, Smyth GK, Oshlack A (2010) Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol 11: R14.
runMultiTCW (comparison database created from single-species sTCW databases)
- OrthoMCL: computes orthologous and paralogous clusters.
Li L, Stoeckert CJ, Jr., Roos DS (2003) OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13: 2178-2189.
- KaKs_Calculator: computes Ka/Ks values.
Zhang Z, Li J, Zhao XQ, Wang J, Wong GK, Yu J (2006) KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genomics Proteomics Bioinformatics 4: 259-263.
- MAFFT: computes multiple alignment for scoring.
Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30: 772-780.
- MUSCLE: computes multiple alignment of a cluster.
Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792-1797.
- MstatX: scores the multiple alignment.
Guillaume Collet (2012) https://github.com/gcollet/MstatX.
- BLAST and DIAMOND for comparisons.
viewSingleTCW (view sTCW databases)
- BLAST and DIAMOND for comparisons.
viewMultiTCW (view mTCW databases)
- BLAST and DIAMOND for comparisons.
- MAFFT and MUSCLE for multiple alignments.