Chapter 6: Q27E (page 198)

Alignment with gap penalties. The alignment algorithm of Exercise 6.26 helps to identify DNA sequences that are close to one another. The discrepancies between these closely matched sequences are often caused by errors in DNA replication. However, a closer look at the biological replication process reveals that the scoring function we considered earlier has a qualitative problem: nature often inserts or removes entire substrings of nucleotides (creating long gaps), rather than editing just one position at a time. Therefore, the penalty for a gap of length 10 should not be 10 times the penalty for a gap of length 1, but something significantly smaller.
Repeat Exercise 6.26, but this time use a modified scoring function in which the penalty for a gap of length k is c₀ + c₁k, where c₀ and c₁ are given constants (and c₀ is larger than c₁).
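As a quick arithmetic check of why this penalty favors long gaps, take the illustrative constants $c_{0} = 3$ and $c_{1} = 1$ (values assumed here for the example, not given in the exercise). One gap of length 10 then costs

$c_{0} + 10 c_{1} = 3 + 10 = 13,$

whereas ten separate gaps of length 1 cost

$10 (c_{0} + c_{1}) = 10 \cdot 4 = 40.$

The fixed charge $c_{0}$ is paid once per gap, so a single long gap is much cheaper than many short ones.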

Short Answer


The alignment algorithm with the modified scoring function is given by the following recurrence:

$s(i, j, 0) = \max_{0 \leq k < 3} \left( s(i - 1, j - 1, k) + \delta(x[i], y[j]) \right)$

$s(i, j, 1) = \max \left( \begin{matrix} s(i, j - 1, 0) - (c_{0} + c_{1}) \\ s(i, j - 1, 1) - c_{1} \\ s(i, j - 1, 2) - (c_{0} + c_{1}) \end{matrix} \right) + \delta(-, y[j])$

$s(i, j, 2) = \max \left( \begin{matrix} s(i - 1, j, 0) - (c_{0} + c_{1}) \\ s(i - 1, j, 1) - (c_{0} + c_{1}) \\ s(i - 1, j, 2) - c_{1} \end{matrix} \right) + \delta(x[i], -)$
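A three-state recurrence of this kind (state 0: $x[i]$ aligned with $y[j]$; state 1: a gap aligned with $y[j]$; state 2: $x[i]$ aligned with a gap) can be sketched in Python as below. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `affine_align_score`, `gap_free_delta`, and `NEG_INF`, the base cases, and the example scores are all chosen here for illustration and are not given in the exercise.

```python
NEG_INF = float("-inf")  # marks states that cannot occur


def affine_align_score(x, y, delta, c0, c1):
    """Best alignment score when a gap of length k costs c0 + c1 * k.

    delta(a, b) is a caller-supplied scoring function (hypothetical here);
    "-" denotes a gap. State 0: x[i] vs y[j]; 1: "-" vs y[j]; 2: x[i] vs "-".
    """
    n, m = len(x), len(y)
    open_cost = c0 + c1   # first position of a new gap
    extend_cost = c1      # each further position of the same gap
    # s[i][j][k]: best score for x[:i] and y[:j] ending in state k
    s = [[[NEG_INF] * 3 for _ in range(m + 1)] for _ in range(n + 1)]
    s[0][0][0] = 0  # assumed base case: the empty alignment scores 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                # any previous state may precede a character-character pair
                s[i][j][0] = max(s[i - 1][j - 1]) + delta(x[i - 1], y[j - 1])
            if j > 0:
                p = s[i][j - 1]
                s[i][j][1] = max(
                    p[0] - open_cost,
                    p[1] - extend_cost,  # continue the same gap in x
                    p[2] - open_cost,
                ) + delta("-", y[j - 1])
            if i > 0:
                p = s[i - 1][j]
                s[i][j][2] = max(
                    p[0] - open_cost,
                    p[1] - open_cost,
                    p[2] - extend_cost,  # continue the same gap in y
                ) + delta(x[i - 1], "-")
    return max(s[n][m])


def gap_free_delta(a, b):
    # Illustrative scores only: gaps are charged solely through c0 and c1.
    if a == "-" or b == "-":
        return 0
    return 1 if a == b else -1
```

With the illustrative values $c_{0} = 3$ and $c_{1} = 1$, `affine_align_score("AGGGC", "AC", gap_free_delta, 3, 1)` evaluates to -4: two matches (+2) and one gap of length 3 (penalty 3 + 3 = 6). The table has 3(n + 1)(m + 1) entries, each filled in constant time, so the running time stays O(nm).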

Step by step solution

Explain Alignment with gap penalties:

An alignment provides a way of identifying the regions of similarity between two strings. For strings of lengths n and m, a matrix M of size n x m is constructed to find the best alignment score. For any pair of positions in the two strings, one of two cases can occur. In the first case, the two characters are aligned with each other without a gap, and the given scoring function determines whether the pair counts as a match or a mismatch. In the second case, a character is aligned with a gap.

Give the alignment algorithm with a modified scoring function

Consider the recurrence for the alignment algorithm of Exercise 6.26:

$s(i, j) = \max \left( \begin{matrix} s(i - 1, j) + \delta(x[i], -) \\ s(i, j - 1) + \delta(-, y[j]) \\ s(i - 1, j - 1) + \delta(x[i], y[j]) \end{matrix} \right)$
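The Exercise 6.26 recurrence can be filled in row by row. The following is a minimal Python sketch: the function names `align_score` and `simple_delta`, the base cases for empty prefixes, and the example scores are assumptions made for illustration, not part of the exercise.

```python
def align_score(x, y, delta):
    """Best alignment score of x and y under the scoring function delta(a, b).

    "-" stands for a gap. delta is a caller-supplied callable (hypothetical
    here); the exercise only assumes that some scoring function is given.
    """
    n, m = len(x), len(y)
    # s[i][j] = best score for aligning the prefixes x[:i] and y[:j]
    s = [[0] * (m + 1) for _ in range(n + 1)]
    # Assumed base cases: a prefix aligned against nothing is all gaps.
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + delta(x[i - 1], "-")
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + delta("-", y[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(
                s[i - 1][j] + delta(x[i - 1], "-"),           # x[i] vs gap
                s[i][j - 1] + delta("-", y[j - 1]),           # gap vs y[j]
                s[i - 1][j - 1] + delta(x[i - 1], y[j - 1]),  # x[i] vs y[j]
            )
    return s[n][m]


def simple_delta(a, b):
    # Illustrative scores only: +1 match, -1 mismatch, -2 against a gap.
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1
```

Here every gap position is charged independently through `delta`, which is exactly the behavior the modified scoring function is meant to change.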

Redefine the subproblem as $s(i, j, k)$, where $k \in \{0, 1, 2\}$ indicates which of the three possible alignments occurs at the last position:

$x[i] \leftrightarrow y[j]$ for $k = 0$, $- \leftrightarrow y[j]$ for $k = 1$, and $x[i] \leftrightarrow -$ for $k = 2$. Applying the modified scoring function to these subproblems gives the three-case recurrence stated in the short answer.