Documentation read from 05/19/2023 16:24:01 version of /vol/kmer-server-prod/FIGdisk.server.rhel6/FIG/bin/svr_fasta.
svr_fasta <gene_ids.tbl >sequences.tbl
Produce DNA or protein strings for genes.
This script takes as input a tab-delimited file with gene IDs at the end of each line. For each gene ID, the gene's DNA or protein sequence is written to the output file. If the --fasta option is specified, the sequence is written in FASTA format.
This is a pipe command: the input is taken from the standard input and the output is written to the standard output. The columns of data preceding the ID column are supplied as the comment for each FASTA string. In addition, if the incoming ID is not a FIG ID, the output gene's FIG ID is prefixed to the comment.
Note that because some gene IDs correspond to multiple genes, there may be more output items than input lines.
Database source of the IDs specified: SEED for FIG IDs, GENE for standard gene identifiers, or LocusTag for locus tags. In addition, you may specify RefSeq, CMR, NCBI, Trembl, or UniProt for IDs from those databases. Use mixed to allow mixed ID types (though this may cause problems when the same ID has different meanings in different databases). Use prefixed to allow IDs with a prefix indicating the ID type (e.g. uni|P00934 for a UniProt ID, gi|135813 for an NCBI identifier, and so forth). The default is SEED.
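The prefixed ID form described above can be illustrated with a short Python sketch. This is not the script's own parsing code. The helper names split_prefixed_id and database_for are hypothetical, and the prefix table holds only the two prefixes this page names (uni and gi).

```python
# Illustrative only: the two prefixes named in the documentation above.
# Any other prefix is treated as unknown in this sketch.
KNOWN_PREFIXES = {"uni": "UniProt", "gi": "NCBI"}


def split_prefixed_id(raw_id):
    """Split 'prefix|rest' into (prefix, bare_id).

    Hypothetical helper: returns (None, raw_id) when no '|' is present.
    """
    prefix, sep, rest = raw_id.partition("|")
    if not sep:
        return (None, raw_id)
    return (prefix, rest)


def database_for(prefix):
    """Look up the database for a prefix, or None if it is not in the table."""
    return KNOWN_PREFIXES.get(prefix)


for example in ("uni|P00934", "gi|135813"):
    prefix, bare = split_prefixed_id(example)
    print(example, "->", database_for(prefix), bare)
```

Splitting on the first "|" only keeps any later "|" characters inside the bare ID, which seems the safer choice when the ID syntax of a given database is not known in advance.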
If specified, the output FASTA sequences will be protein sequences; otherwise, they will be DNA sequences. The default is FALSE.
If specified, the output sequences will be in FASTA format; otherwise they will be simple character strings. The default is FALSE, in which case the output file looks the same as the input file but with the DNA or protein sequence appended to the end of each line.
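The difference between the two output modes can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's actual formatter: format_record is a hypothetical function, the gene ID is assumed to be the last field, the 60-character line width is an assumption, and the example ID and sequence are invented. The sketch also ignores the FIG ID prefixing of comments for non-FIG input IDs.

```python
def format_record(fields, sequence, fasta=False, width=60):
    """Format one output item from the fields of one input line.

    Hypothetical sketch: 'fields' are the tab-separated columns of the
    input line, with the gene ID assumed to be the last one.
    """
    if not fasta:
        # Default mode: echo the input line with the sequence appended.
        return "\t".join(list(fields) + [sequence])

    gene_id = fields[-1]
    # Columns before the ID become the FASTA comment.
    comment = " ".join(fields[:-1])
    header = ">" + gene_id + (" " + comment if comment else "")
    # Wrap the sequence; the width of 60 is an assumption of this sketch.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + body)


# Invented example data, for illustration only.
row = ["sample note", "fig|83333.1.peg.4"]
print(format_record(row, "MKRISTT"))
print(format_record(row, "MKRISTT", fasta=True))
```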
The URL for the Sapling server, if it is to be different from the default.
Column index. If specified, indicates that the input IDs should be taken from the indicated column instead of the last column. The first column is column 1.
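The column rule above (last column by default, otherwise a 1-based index) can be sketched in Python. The function name pick_id is hypothetical, and the error handling for an out-of-range index is an assumption of this sketch rather than documented behavior of the script.

```python
def pick_id(line, column=None):
    """Return the ID field from one tab-delimited input line.

    column=None selects the last column; otherwise 'column' is 1-based,
    so column=1 is the first field.
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    if not 1 <= column <= len(fields):
        # Assumed behavior for this sketch only.
        raise ValueError("column %d is out of range for a line with %d fields"
                         % (column, len(fields)))
    return fields[column - 1]


print(pick_id("note\tother\tfig|83333.1.peg.4"))
print(pick_id("fig|83333.1.peg.4\tnote\tother", column=1))
```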