Egglib 2.1.11
C++ library reference manual
Ms Class Reference

ms-like sequence format parser.

#include <Ms.hpp>

Static Public Member Functions

static DataMatrix get (std::string str, unsigned int ns, bool separated=false)
 Imports a sequence alignment.
 
static DataMatrix get (std::istream &stream, unsigned int ns, bool separated=false)
 Imports a sequence alignment.
 
static std::string format (DataMatrix &dataMatrix, bool separated=false)
 Exports a sequence alignment.
 
static void format (std::ostream &stream, DataMatrix &dataMatrix, bool separated=false)
 Exports a sequence alignment.
 
static double tMRCA ()
 Returns the last tMRCA read by any Ms instance.
 
static double prob ()
 Returns the last "prob" read by any Ms instance.
 
static std::string trees ()
 Returns the tree string found in the last simulation read by any Ms instance.
 

Detailed Description

ms-like sequence format parser

The class provides parsing (input) and formatting (output) operations in the ms format, that is, the format used by Richard Hudson's program ms to output genotypes and by the associated program samplestat to read them. Both types of operations are available through static methods that take either a string or a stream (which can be a stream to or from a file or a string). In either case, types from the STL are used. Although ms deals only with data coded as 0 and 1, the class Ms can both import and export data coded by other integer values. All methods have an option named "separated". If this option is true, the parser or formatter applies a slight modification of the format: the genotypes of each individual are separated by a white space ("1 0 1 1" instead of "1011"), which allows genotype values larger than 9 (for example "1 0 11 1").
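
To make the effect of the "separated" option concrete, here is a minimal standalone sketch of how one individual's genotypes could be written as a row. It does not use EggLib: encodeRow is a hypothetical helper introduced only for illustration, and the actual implementation in Ms may differ.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of EggLib): encodes the genotypes of one
// individual as a single ms-style row.
std::string encodeRow(const std::vector<unsigned int>& genotypes, bool separated) {
    std::ostringstream out;
    for (std::size_t i = 0; i < genotypes.size(); ++i) {
        // With separated=true, a single space is written between sites.
        if (separated && i > 0) out << ' ';
        out << genotypes[i];
    }
    return out.str();
}
```

With separated set to false, the values 1, 0, 1, 1 are written as "1011". With separated set to true, the values 1, 0, 11, 1 are written as "1 0 11 1", which stays unambiguous even though 11 takes two characters.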

Member Function Documentation

std::string format ( DataMatrix dataMatrix,
bool  separated = false 
)
static

Exports a sequence alignment.

Internally creates a stringstream, calls the overloaded method and returns the outcome.

Parameters
dataMatrix  the alignment object to write.
separated  true if a white space separator must be placed between the genotype at each site.

void format ( std::ostream &  stream,
DataMatrix dataMatrix,
bool  separated = false 
)
static

Exports a sequence alignment.

Writes the formatted string to the stream 'on the fly'. The formatted string is guaranteed to start with a // line and to end with an empty line. The client is expected to write any header and to add an additional blank line between simulations if needed. The method throws a SeqlibRuntimeError if the stream is not writable. The data matrix should contain only values in the range 0-9 if separated is false (the default), and any non-negative integer (>=0) if separated is true. Note that output generated with separated=true is never compatible with the original ms format, and that output generated with separated=false is compatible with the original ms format only if all alleles are 0 or 1 (which is not checked by this formatter).

Parameters
stream  the stream (file or string stream) where to write the output.
dataMatrix  the alignment object to write.
separated  true if a white space separator must be placed between the genotype at each site.

DataMatrix get ( std::string  str,
unsigned int  ns,
bool  separated = false 
)
static

Imports a sequence alignment.

Creates an istringstream from the string and calls the overloaded method.

Parameters
str  the string to parse.
ns  the expected number of sequences.
separated  true if a white space separator is placed between the genotype at each site.
Returns
A sequence alignment as a data matrix.

DataMatrix get ( std::istream &  stream,
unsigned int  ns,
bool  separated = false 
)
static

Imports a sequence alignment.

Attempts to generate a DataMatrix object from the stream. Reads only one simulation and throws a SeqlibFormatError exception in case of format error.

Allows any number of blank lines before the //, but no other data. Supports \r at the end of lines (before the \n). Accepted symbols are all integers (0-9).

Parameters
stream  the stream to parse.
ns  the expected number of sequences.
separated  true if a white space separator is placed between the genotype at each site.
Returns
A sequence alignment as a data matrix.

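The row-level part of the parsing rules above (optional \r before the end of line, digits 0-9, optional white space separators) can be illustrated with a short standalone sketch. parseRow is a hypothetical helper, not an EggLib function, and std::runtime_error stands in for SeqlibFormatError.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of EggLib): parses one genotype row that has
// already been read with std::getline, so the trailing '\n' is gone.
std::vector<unsigned int> parseRow(std::string line, bool separated) {
    // Tolerate Windows line endings: drop a '\r' left before the '\n'.
    if (!line.empty() && line[line.size() - 1] == '\r') line.erase(line.size() - 1);

    std::vector<unsigned int> row;
    if (separated) {
        // Values are white-space separated and may have several digits.
        std::istringstream in(line);
        unsigned int value;
        while (in >> value) row.push_back(value);
        // Extraction stopped before the end of the line: invalid token.
        if (!in.eof()) throw std::runtime_error("invalid genotype token");
    } else {
        // One character per site; only the digits 0-9 are accepted.
        for (std::size_t i = 0; i < line.size(); ++i) {
            if (!std::isdigit(static_cast<unsigned char>(line[i])))
                throw std::runtime_error("invalid genotype character");
            row.push_back(static_cast<unsigned int>(line[i] - '0'));
        }
    }
    return row;
}
```

A full parser would additionally skip blank lines up to the // line, read the header lines of the simulation, and check that exactly ns rows are present.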
double prob ( )
static

Returns the last "prob" read by any Ms instance.

"prob" is returned by ms when a fixed number of segregating sites is used in conjunction with a theta value. If a "prob" value was present in the last simulation read by any Ms instance, it will be returned by this method. A value of -1 is returned if no simulation was read, or if the last simulation didn't contain a "prob" value or if the last simulation provoked an exception before reaching the "prob" line.
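
The -1 convention described above can be sketched as follows. This is a standalone illustration, not the EggLib implementation: findProb is a hypothetical helper, and the sketch assumes that ms writes the value on a line starting with "prob:" inside the simulation block.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of EggLib): scans the lines of one simulation
// block and returns the value found on a "prob:" line, or -1 if there is none.
double findProb(const std::string& block) {
    const std::string tag = "prob:";
    std::istringstream in(block);
    std::string line;
    while (std::getline(in, line)) {
        // Assumption: the value follows the "prob:" tag on the same line.
        if (line.compare(0, tag.size(), tag) == 0) {
            std::istringstream value(line.substr(tag.size()));
            double p;
            if (value >> p) return p;
        }
    }
    return -1.0;  // sentinel: no "prob" value in this block
}
```

Because a valid "prob" value is never negative, -1 cannot be confused with a value actually read from a simulation.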

double tMRCA ( )
static

Returns the last tMRCA read by any Ms instance.

If a tMRCA value was present in the last simulation read by any Ms instance, it will be returned by this method. A value of -1. is returned if no simulation was read, or if the last simulation didn't contain a tMRCA value or if the last simulation provoked an exception before reaching the tMRCA line.

std::string trees ( )
static

Returns the tree string found in the last simulation read by any Ms instance.

If one or more trees were present in the last simulation read by any Ms instance, they will be returned as a single string by this method. An empty string is returned if no simulation was read, or if the last simulation didn't contain any tree value or if the last simulation provoked an exception before reaching the tree line.

Note: the trees are returned as a single line.

