Written by Jamie Hatfield, 01/2002
Contents
Description
Command line options
Sample usage
Description
ESD reads an incremental update downloaded from GenBank
(ftp://ncbi.nlm.nih.gov/genbank/daily-nc/; filenames have the format
ncMMDD.flat.gz). GenBank posts one of these files every morning at
around 7am, covering the previous day's additions. ESD scans the file
for entries belonging to the organism that you specify on the command
line after the filename. For each matching entry, it saves the sequence
data to a file named after either the sequence's accession number or
its clone name.
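
To make the scan described above more concrete, here is a minimal sketch
of how matching entries could be located in a GenBank flat file. It is
not ESD's actual implementation. The helper name list_accessions is
hypothetical, and the sketch assumes the standard flat-file layout: an
ACCESSION line near the top of each entry, an indented ORGANISM line
after it, and a line containing only // at the end of the entry. It
requires an exact match on the organism name, which may be stricter
than what ESD does.

```shell
# Hypothetical helper (not part of ESD): print the accession number of
# every entry in a GenBank flat file whose ORGANISM line matches exactly.
list_accessions() {
    flatfile=$1
    organism=$2
    awk -v org="$organism" '
        $1 == "ACCESSION" { acc = $2 }              # remember the primary accession
        $1 == "ORGANISM" {
            name = $0
            sub(/^[ \t]*ORGANISM[ \t]+/, "", name)   # strip the keyword and padding
            sub(/[ \t\r]+$/, "", name)               # strip trailing whitespace
            if (name == org) print acc
        }
        $0 == "//" { acc = "" }                      # end of entry: reset state
    ' "$flatfile"
}
```

Calling list_accessions ncMMDD.flat 'Oryza sativa' on an uncompressed
update file would list one accession number per matching entry.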
Command line options
esd <file> '<organism>' {c | a}
    <file>        the daily update file to extract sequences from
    '<organism>'  the official organism name (e.g., 'Oryza sativa')
                  (Don't forget the single quotes around the organism name!)
    a             name sequence files by the accession number
    c             name sequence files by the clone name
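
The single quotes matter because most organism names contain a space,
and an unquoted name reaches the program as two separate arguments. The
short demonstration below illustrates this. count_args is a
hypothetical stand-in for esd that only reports how many arguments it
received; it says nothing about how esd itself parses them.

```shell
# count_args is a hypothetical stand-in for esd: it only reports how
# many arguments the shell passed to it.
count_args() {
    echo "$#"
}

count_args ncMMDD.flat Oryza sativa a      # prints 4: the name was split in two
count_args ncMMDD.flat 'Oryza sativa' a    # prints 3: file, organism, naming mode
```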
Sample usage
## Get the daily incremental update
cd fsd
ftp ncbi.nlm.nih.gov
user anonymous
cd /genbank/daily-nc
binary
get ncMMDD.flat.gz
bye
## Extract the sequence files for your organism
gunzip ncMMDD.flat.gz
esd ncMMDD.flat 'Organism' a
cd ..
## Perform a simulated digest. See fsd documentation for more options
fsd b f . d fsd c 180000 80
## Input the simulated digest clones into fpc
fpc -batch updcor
## Update the remarks for those clones
fpc -batch mergerm fsd/remark.ace
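
When scripting the download step, the ncMMDD placeholder has to be
filled in with a real month and day. The sketch below is one way to
build the filename. daily_filename is a hypothetical helper, not part
of ESD, fsd, or fpc, and it assumes GNU date (the -d option is not
available in every date implementation). This page does not say whether
MMDD is the posting date or the date of the additions, so check the
directory listing before relying on a particular date.

```shell
# daily_filename is a hypothetical helper, not part of ESD or fsd.
# It turns a date (YYYY-MM-DD) into a name of the form ncMMDD.flat.gz.
# Assumes GNU date, which accepts -d with a date string.
daily_filename() {
    date -d "$1" +nc%m%d.flat.gz
}

daily_filename 2002-01-15    # prints nc0115.flat.gz
```

With GNU date, relative strings such as "yesterday" are also accepted
in place of an explicit date.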