About the Book
Gene sequence data is among the most abundant types of biological data available, and if you're
interested in analyzing it, you'll find a wealth of computational methods and
tools to help you. In fact, finding the data is not the challenge at all; rather,
it is dealing with the plethora of flat file formats used to process the
sequence entries and trying to remember what their specific field codes mean. If
you survive by surrounding yourself with well-thumbed hard copies of readme
files or remembering exactly where to look for the details when you need them,
then Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases
is for you. This book is a handy resource, as well as an invaluable reference,
for anyone who needs to know about the practical aspects and mechanics of
sequence analysis.
Sequence Analysis in a Nutshell: A Guide to Common Tools and Databases
pulls together all of the vital information about the most commonly used
databases, analytical tools, and tables used in sequence analysis. The book is
partitioned into three fundamental areas to help you maximize your use of the
content. The first section, "Databases," contains examples of flat files
from key databases (GenBank, EMBL, SWISS-PROT), the definitions of the codes or
fields used in each database, and the sequence feature types/terms and
qualifiers for the nucleotide and protein databases.
The second section, "Tools," provides the command-line syntax for
popular applications such as ReadSeq, MEME/MAST, BLAST, ClustalW, and the EMBOSS
suite of analytical tools. The third section, "Appendixes,"
concentrates on information essential to understanding the individual components
that make up a biological sequence. The tables in this section include
nucleotide and protein codes, genetic codes, and other relevant
information.
Written in O'Reilly's enormously popular, straightforward "Nutshell"
format, this book draws together essential information for bioinformaticians in
industry and academia, as well as for students. If sequence analysis is part of
your daily life, you'll want this easy-to-use book on your desk.
Table of Contents
Preface
I. Data Formats
  1. FASTA Format
    NCBI's Sequence Identifier Syntax
    NCBI's Non-Redundant Database Syntax
    References
  2. GenBank/EMBL/DDBJ
    Example Flat Files
    GenBank Example Flat File
    DDBJ Example Flat File
    GenBank/DDBJ Field Definitions
    EMBL Example Flat File
    EMBL Field Definitions
    DDBJ/EMBL/GenBank Feature Table
    References
  3. SWISS-PROT
    SWISS-PROT Example Flat File
    SWISS-PROT Field Definitions
    SWISS-PROT Feature Table
    References
  4. Pfam
    Pfam Example Flat File
    Pfam Field Definitions
    References
  5. PROSITE
    PROSITE Example Flat File
    PROSITE Field Definitions
    References
II. Tools
  6. Readseq
    Supported Formats
    Command-Line Options
    References
  7. BLAST
    formatdb
    blastall
    megablast
    blastpgp
    PSI-BLAST
    PHI-BLAST
    bl2seq
    References
  8. BLAT
    Command-Line Options
    References
  9. ClustalW
    Command-Line Options
    References
  10. HMMER
    hmmalign
    hmmbuild
    hmmcalibrate
    hmmconvert
    hmmemit
    hmmfetch
    hmmindex
    hmmpfam
    hmmsearch
    References
  11. MEME/MAST
    MEME
    MAST
    References
  12. EMBOSS
    Common Themes
    List of All EMBOSS Programs
    Details of EMBOSS Programs
    References
III. Appendixes
  A. Nucleotide and Amino Acid Tables
  B. Genetic Codes
  C. Resources
  D. Future Plans
Index