Egglib 2.1.11
C++ library reference manual
Align Class Reference

Handles a sequence alignment.

#include <Align.hpp>

Inherits Container and CharMatrix.

Public Member Functions

 Align ()
 Creates an empty alignment.
 
 Align (unsigned int number_of_sequences, unsigned int alignment_length, char const *const *const cstring_array)
 Creates an alignment from a data matrix.
 
 Align (unsigned int number_of_sequences, unsigned int alignment_length)
 Creates an alignment with given dimensions.
 
 Align (const Align &align)
 Copy constructor.
 
 Align (const Container &container)
 Copy constructor accepting a Container object.
 
Align & operator= (const Align &align)
 Copy operator.
 
Align & operator= (const Container &container)
 Copy operator accepting a Container object.
 
virtual ~Align ()
 Destructor.
 
virtual unsigned int append (const char *name, const char *sequence, unsigned int group=0)
 Adds a sequence.
 
virtual unsigned int removePosition (unsigned int pos)
 Removes a position (column) of the alignment.
 
virtual unsigned int remove (unsigned int pos)
 Removes a sequence from the alignment.
 
virtual void sequence (unsigned int seq, const char *sequence)
 Replaces a sequence string.
 
virtual const char * sequence (unsigned int pos) const
 Gets the sequence string of a given sequence.
 
virtual unsigned int ls () const
 Alignment length.
 
virtual unsigned int ls (unsigned int pos) const
 Length of a given sequence.
 
char character (unsigned int s, unsigned int p) const
 Fast, unchecked accessor.
 
virtual char get (unsigned int sequence, unsigned int position) const
 Gets a nucleotide.
 
virtual void set (unsigned int sequence, unsigned position, char ch)
 Sets a matrix position to a new character.
 
void binSwitch (unsigned int pos)
 Reverses a given column in binary data.
 
Align vslice (std::vector< unsigned int > list_of_sites)
 Extracts specified positions (columns) of the alignment.
 
Align vslice (unsigned int a, unsigned int b)
 Extracts a range of positions (columns).
 
virtual void clear ()
 Deletes all the content of the object.
 
unsigned int numberOfSequences () const
 Same as ns().
 
unsigned int numberOfSites () const
 Same as ls().
 
unsigned int populationLabel (unsigned int sequenceIndex) const
 Gets a group label (insecure).
 
double sitePosition (unsigned int position) const
 Simply returns the passed value.
 
- Public Member Functions inherited from Container
 Container ()
 Creates an empty object.
 
 Container (const Container &source)
 Copy constructor.
 
Container & operator= (const Container &source)
 Assignment operator.
 
 Container (unsigned int number_of_sequences, char const *const *const cstring_array)
 Creates an object from a data matrix.
 
virtual ~Container ()
 Destructor.
 
virtual void name (unsigned int pos, const char *name)
 Changes the name of a given sequence.
 
virtual void group (unsigned int pos, unsigned int group)
 Changes the group index of a given sequence.
 
Container hslice (unsigned int a, unsigned int b) const
 Extracts a range of sequences.
 
unsigned int ns () const
 Gets the number of sequences.
 
virtual const char * name (unsigned int pos) const
 Gets the name of a given sequence.
 
virtual unsigned int group (unsigned int pos) const
 Gets the group index of a given sequence.
 
bool isEqual () const
 Checks if all lengths are equal.
 
unsigned int equalize (char ch='?')
 Equalizes sequence lengths.
 
int find (const char *string, bool strict=true) const
 Finds a sequence by its name.
 

Protected Member Functions

virtual void appendSequence (unsigned int pos, const char *sequence)
 This function is not available for alignments.
 
virtual void init ()
 
virtual void setFromSource (unsigned int number_of_sequences, unsigned int alignment_length, const char *const *const cstring_array)
 
virtual void copyObject (const Container &)
 
virtual void copyObject (const Align &)
 
- Protected Member Functions inherited from Container
virtual void setFromSource (unsigned int number_of_sequences, const char *const *const cstring_array)
 
virtual void getNamesAndGroups (const Container &)
 

Protected Attributes

unsigned int _ls
 
- Protected Attributes inherited from Container
unsigned int _ns
 
unsigned int * lnames
 
char ** names
 
char ** sequences
 
unsigned int * groups
 

Detailed Description

Handles a sequence alignment.

Creation from a file or string stream should be performed using the class Fasta. Align objects can be created by deep copy from both Align and Container objects. In the latter case, sequence lengths are artificially equalized by padding with "?" characters. Align objects can be converted to DataMatrix objects, and vice versa, using the dedicated class DMAConverter.

Each sequence is represented by two strings (name and sequence) and an integer (group), all of which can be accessed or modified by index. The order of sequences is guaranteed to be preserved, as if Align were a list of triplets (name, sequence, group).

The data matrix is implemented as a contiguous array (char**), which allows efficient access and modification of data. For very large data matrices, you can claim the required memory up front using the constructor Align(unsigned int, unsigned int).

Constructor & Destructor Documentation

Align ( )

Creates an empty alignment.

Align ( unsigned int  number_of_sequences,
unsigned int  alignment_length,
char const *const *const  cstring_array 
)

Creates an alignment from a data matrix.

Allows you to create an object from data stored in a char** array. The array's dimensions must be passed to the constructor, so there is no need to terminate each sequence with a null character.

Parameters
number_of_sequences  the number of sequences (the length of the first dimension of the array).
alignment_length  the length of sequences (the length of all lines of the array).
cstring_array  the pointer to the data matrix.

Align ( unsigned int  number_of_sequences,
unsigned int  alignment_length 
)

Creates an alignment with given dimensions.

Directly allocates a data matrix of a given size. Names are empty strings, groups are 0, and all characters are "?".

Parameters
number_of_sequences  the number of sequences (the length of the first dimension of the array).
alignment_length  the length of sequences (the length of all lines of the array).

Align ( const Align &  align)

Copy constructor.

Align ( const Container &  container)

Copy constructor accepting a Container object.

All but the longest sequences are padded with ? to match the longest sequence's length.
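
The padding rule can be illustrated without EggLib itself. The following is a minimal sketch, not the library's code: `padToLongest` is a hypothetical helper working on plain `std::string` rows, and it only mirrors the behaviour described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of EggLib): pads every row with `fill`
// until all rows are as long as the longest one.
std::vector<std::string> padToLongest(std::vector<std::string> rows, char fill = '?') {
    std::size_t longest = 0;
    for (const std::string& row : rows) {
        longest = std::max(longest, row.size());
    }
    for (std::string& row : rows) {
        row.resize(longest, fill);  // appends `fill` only where the row is shorter
    }
    return rows;
}
```

With the rows "ACGT" and "AC" as input, the second row would come back as "AC??" while the first is left unchanged.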

~Align ( )
virtual

Destructor.

Member Function Documentation

unsigned int append ( const char *  name,
const char *  sequence,
unsigned int  group = 0 
)
virtual

Adds a sequence.

If the object already contains at least one sequence, the new sequence must have the same length. If it does not, an EggUnalignedError is thrown.

Parameters
name  the name of the sequence.
sequence  the sequence string.
group  the group index of the sequence.
Returns
The new number of sequences.

Reimplemented from Container.
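
The length contract of append() can be sketched with a small stand-in type. Everything below is an assumption made for illustration: `MiniAlign` is hypothetical, and `std::length_error` merely stands in for EggUnalignedError, whose definition is not part of this page.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an alignment: one name, sequence and group per row.
struct MiniAlign {
    std::vector<std::string> names;
    std::vector<std::string> sequences;
    std::vector<unsigned int> groups;

    // Mirrors the documented contract: the first sequence fixes the alignment
    // length, and later ones must match it. std::length_error is a stand-in
    // for EggUnalignedError.
    std::size_t append(const std::string& name, const std::string& sequence,
                       unsigned int group = 0) {
        if (!sequences.empty() && sequence.size() != sequences.front().size()) {
            throw std::length_error("sequence length differs from alignment length");
        }
        names.push_back(name);
        sequences.push_back(sequence);
        groups.push_back(group);
        return sequences.size();  // the new number of sequences
    }
};
```

The check runs before any vector is modified, so a rejected sequence leaves the stand-in object unchanged. Whether EggLib gives the same guarantee is not stated here.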

void binSwitch ( unsigned int  pos)

Reverses a given column in binary data.

The specified column must contain only "0" and "1" characters. Each "0" is replaced by "1" and vice versa.
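
As an illustration of the column flip, here is a minimal sketch on plain strings. `binSwitchColumn` is a hypothetical helper, not EggLib code, and throwing `std::invalid_argument` for a non-binary column is an assumption: this page does not say how binSwitch reacts to such data.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not EggLib code): flips '0' <-> '1' in column `pos`
// of every row. row.at() throws std::out_of_range if a row is too short.
void binSwitchColumn(std::vector<std::string>& rows, std::size_t pos) {
    // First pass: validate, so that a bad column leaves the data untouched.
    for (const std::string& row : rows) {
        if (row.at(pos) != '0' && row.at(pos) != '1') {
            throw std::invalid_argument("column contains non-binary data");
        }
    }
    // Second pass: swap the two states.
    for (std::string& row : rows) {
        row[pos] = (row[pos] == '0') ? '1' : '0';
    }
}
```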

char character ( unsigned int  s,
unsigned int  p 
) const
inline virtual

Fast, unchecked accessor.

This accessor doesn't perform out-of-bound checking!

Parameters
s  the index of the sequence (line).
p  the position in the alignment (column).
Returns
The character at the given position.

Implements CharMatrix.

void clear ( )
virtual

Deletes all the content of the object.

Reimplemented from Container.

char get ( unsigned int  sequence,
unsigned int  position 
) const
virtual

Gets a nucleotide.

This accessor performs out-of-bounds checking. The specified position must exist.

Parameters
sequence  the index of the sequence (line).
position  the position in the alignment (column).
Returns
the character at the given position.

Reimplemented from Container.
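
The difference between character() and get() is the usual trade-off between unchecked and checked access. The sketch below shows that pattern on a hypothetical `Matrix` type. It is not EggLib's implementation, and `std::out_of_range` is only a stand-in for whatever exception the library throws.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the data matrix: equal-length rows of characters.
struct Matrix {
    std::vector<std::string> rows;

    // Unchecked access, in the spirit of character(): the caller guarantees
    // that both indices are valid; anything else is undefined behaviour.
    char unchecked(std::size_t s, std::size_t p) const { return rows[s][p]; }

    // Checked access, in the spirit of get(): invalid indices are reported.
    // std::out_of_range is a stand-in for the library's own exception type.
    char checked(std::size_t s, std::size_t p) const {
        if (s >= rows.size() || p >= rows[s].size()) {
            throw std::out_of_range("invalid sequence or position index");
        }
        return rows[s][p];
    }
};
```

A common design is to use the checked form at API boundaries and the unchecked form inside loops whose bounds have already been validated.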

unsigned int ls ( ) const
virtual

Alignment length.

Returns 0 if the alignment is empty.

unsigned int ls ( unsigned int  pos) const
virtual

Length of a given sequence.

Calling this function is exactly the same as calling ls() (without arguments), regardless of the index provided, except that an exception is thrown if the index is out of bounds. Provided for compatibility with Container.

Parameters
pos  the index of the sequence.
Returns
the length of the alignment.

Reimplemented from Container.

unsigned int numberOfSequences ( ) const
inline virtual

Same as ns()

Implements CharMatrix.

unsigned int numberOfSites ( ) const
inline virtual

Same as ls()

Implements CharMatrix.

Align & operator= ( const Align &  align)

Copy operator.

Align & operator= ( const Container &  container)

Copy operator accepting a Container object.

All but the longest sequences are padded with ? to match the longest sequence's length.

unsigned int populationLabel ( unsigned int  sequenceIndex) const
inline virtual

Gets a group label (insecure)

Implements CharMatrix.

unsigned int remove ( unsigned int  pos)
virtual

Removes a sequence from the alignment.

Parameters
pos  the index of the sequence to remove.
Returns
The new number of sequences.

Reimplemented from Container.

unsigned int removePosition ( unsigned int  pos)
virtual

Removes a position (column) of the alignment.

Parameters
pos  the position to remove in the alignment.
Returns
The new length of the alignment.

void sequence ( unsigned int  seq,
const char *  sequence 
)
virtual

Replaces a sequence string.

The new sequence must have the same length as the alignment. If it does not, an EggUnalignedError is thrown.

Parameters
seq  the index of the sequence to change.
sequence  the new sequence.

Reimplemented from Container.

virtual const char* sequence ( unsigned int  pos) const
inline virtual

Gets the sequence string of a given sequence.

Parameters
pos  the index of the sequence.
Returns
The sequence string for that particular sequence.

Reimplemented from Container.

void set ( unsigned int  sequence,
unsigned  position,
char  ch 
)
virtual

Sets a matrix position to a new character.

This modifier performs out-of-bounds checking. The specified position must exist.

Parameters
sequence  the index of the sequence (line).
position  the position in the alignment (column).
ch  the new character value.

Reimplemented from Container.

double sitePosition ( unsigned int  position) const
inline virtual

Simply returns the passed value.

Implements CharMatrix.

Align vslice ( std::vector< unsigned int >  list_of_sites)

Extracts specified positions (columns) of the alignment.

All the specified sites are extracted in the specified order. This function is suitable for bootstrapping (resampling with replacement, so the same site may be drawn more than once) and for permutations.

This function doesn't perform out-of-bound checking.

Parameters
list_of_sites  a vector containing alignment positions.
Returns
A copy of the object containing the specified set of positions.
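
A bootstrap resample along these lines can be sketched on plain strings. Both helpers below are hypothetical and are not EggLib code: `sliceColumns` imitates the column extraction, and `bootstrapSites` draws the site list that would be passed to it.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper (not EggLib code): builds new rows from the listed
// columns, in the listed order. Like the documented overload, it does no
// bounds checking, so every index must be smaller than the row length.
std::vector<std::string> sliceColumns(const std::vector<std::string>& rows,
                                      const std::vector<unsigned int>& sites) {
    std::vector<std::string> out(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i) {
        out[i].reserve(sites.size());
        for (unsigned int site : sites) {
            out[i].push_back(rows[i][site]);
        }
    }
    return out;
}

// Draws `length` column indices uniformly, with replacement, from [0, length).
std::vector<unsigned int> bootstrapSites(unsigned int length, std::mt19937& rng) {
    std::vector<unsigned int> sites;
    if (length == 0) return sites;  // nothing to draw from
    std::uniform_int_distribution<unsigned int> pick(0, length - 1);
    sites.reserve(length);
    for (unsigned int i = 0; i < length; ++i) {
        sites.push_back(pick(rng));
    }
    return sites;
}
```

A permutation would use the same `sliceColumns` call with a shuffled list of all indices instead of a resampled one.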

Align vslice ( unsigned int  a,
unsigned int  b 
)

Extracts a range of positions (columns)

Parameters
a  the first position.
b  the index immediately past the last position to extract.
Returns
A copy of the object containing the specified range of positions.

Positions a to b-1 are extracted, provided that the indices fit in the current length of sequences. To extract all positions, use align.vslice(0, align.ls()).

Note: invalid ranges are silently tolerated. If a>=ls or b<=a, an empty object is returned. If b>ls, ls is substituted for b.
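
The tolerant range handling can be sketched as follows. `sliceRange` is a hypothetical helper on plain strings, not EggLib code. It assumes that an end index past the alignment length is clamped to that length, and it models the "empty object" as a result with no rows at all, which may differ from what the library returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not EggLib code): extracts columns a .. b-1 from
// equal-length rows, tolerating invalid ranges instead of reporting them.
std::vector<std::string> sliceRange(const std::vector<std::string>& rows,
                                    std::size_t a, std::size_t b) {
    const std::size_t ls = rows.empty() ? 0 : rows.front().size();
    b = std::min(b, ls);  // an end past the alignment is clamped to ls
    std::vector<std::string> out;
    if (a >= ls || b <= a) {
        return out;  // invalid range: empty result, no error
    }
    out.reserve(rows.size());
    for (const std::string& row : rows) {
        out.push_back(row.substr(a, b - a));
    }
    return out;
}
```

Because nothing is reported for an invalid range, callers that need to tell "no columns requested" apart from "bad indices" would have to validate a and b themselves.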

