Warning: foreach() argument must be of type array|object, bool given in /var/www/html/web/app/themes/studypress-core-theme/template-parts/header/mobile-offcanvas.php on line 20

Alignment with gap penalties. The alignment algorithm of Exercise 6.26 helps to identify DNA sequences that are close to one another. The discrepancies between these closely matched sequences are often caused by errors in DNA replication. However, a closer look at the biological replication process reveals that the scoring function we considered earlier has a qualitative problem: nature often inserts or removes entire substrings of nucleotides (creating long gaps), rather than editing just one position at a time. Therefore, the penalty for a gap of length 10 should not be 10 times the penalty for a gap of length 1, but something significantly smaller.

Repeat Exercise 6.26, but this time use a modified scoring function in which the penalty for a gap of length k is c0 + c1k, where c0 and c1 are given constants (and c0 is larger than c1).

Short Answer

Expert verified

The alignment algorithm with a modified scoring function is as follows,

s(i,j,0)=max0k<asi-1,j-1,k+δxi,yjs(i,j,1)=maxs(i,j-1,0)-(c0+_c1)s(i,j-1,1)-c1s(i,j-1,2)-(c0+_c1)+δ-,yjs(i,j,2)=maxs(i,j-1,j,0)-(c0+_c1)s(i,j-1,j,1)-(c0+_c1)s(i,j-1,j,2)-c1+δxi,-

Step by step solution

01

Explain Alignment with gap penalties:

An array alignment provides a way of identifying the regions of similarities between the strings. The matrix M of size n x m is constructed to find the best aligning score of the given strings. The given case can occur for any two characters of the strings.The first case is to align the sequence with no gaps. The match or miss is determined by the given scoring function. The other case is to align the sequence with gaps.

02

Give the alignment algorithm with a modified scoring function

Consider the alignment algorithm of Exercise 6.26 as follows,

s(i,j)=maxs(i-1,j)+δxi,-s(i,j-1)+δ(-,yj)s(i-1,j-1)+δxi,yj

Redefine the subproblem as, s(i,j,k) where,k0,12 corresponding to the three matches at the las position:

x[i]y[j],-y[j],x[i]-, So by adding the modified scoring function the alignment algorithm can be provided as follows,

s(i,j,0)=max0k<asi-1,j-1,k+δxi,yjs(i,j,1)=maxs(i,j-1,0)-(c0+_c1)s(i,j-1,1)-c1s(i,j-1,2)-(c0+_c1)+δ-,yjs(i,j,2)=maxs(i,j-1,j,0)-(c0+_c1)s(i,j-1,j,1)-(c0+_c1)s(i,j-1,j,2)-c1+δxi,-

Therefore, The alignment algorithm with modified scoring function has been provided.

Unlock Step-by-Step Solutions & Ace Your Exams!

  • Full Textbook Solutions

    Get detailed explanations and key concepts

  • Unlimited Al creation

    Al flashcards, explanations, exams and more...

  • Ads-free access

    To over 500 millions flashcards

  • Money-back guarantee

    We refund you if you fail your exam.

Over 30 million students worldwide already upgrade their learning with Vaia!

One App. One Place for Learning.

All the tools & learning materials you need for study success - in one app.

Get started for free

Most popular questions from this chapter

Pebbling a checkerboard. We are given a checkerboard which has 4 rows and ncolumns, and has an integer written in each square. We are also given a set of 2n pebbles, and we want to place some or all of these on the checkerboard (each pebble can be placed on exactly one square) so as to maximize the sum of the integers in the squares that are covered by pebbles. There is one constraint: for a placement of pebbles to be legal, no two of them can be on horizontally or vertically adjacent squares (diagonal adjacency is fine).

  1. Determine the number of legal patterns that can occur in any column (in isolation, ignoring the pebbles in adjacent columns) and describe these patterns.

Call two patterns compatible if they can be placed on adjacent columns to form a legal placement. Let us consider subproblems consisting of the first columns 1kn. Each subproblem can be assigned a type, which is the pattern occurring in the last column.

  1. Using the notions of compatibility and type, give an O(n)-time dynamic programming algorithm for computing an optimal placement.

Consider the following variation on the change-making problem (Exercise 6.17): you are given denominations x1,x2,...,xn, and you want to make change for a value v, but you are allowed to use each denomination at most once. For instance, if the denominations are 1,5,10,20,then you can make change for 16=1+15and for 31=1+10+20but not for 40(because you can’t use 20 twice).

Input: Positive integers; x1,x2,...,xnanother integer v.

Output: Can you make change for v, using each denominationxi at most once?Show how to solve this problem in time O(nV).

Time and space complexity of dynamic programming. Our dynamic programming algorithm for computing the edit distance between strings of length m and n creates a table of size n×mand therefore needs O (mn) time and space. In practice, it will run out of space long before it runs out of time. How can this space requirement be reduced?

  1. Show that if we just want to compute the value of the edit distance (rather than the optimal sequence of edits), then only O(n) space is needed, because only a small portion of the table needs to be maintained at any given time.
  2. Now suppose that we also want the optimal sequence of edits. As we saw earlier, this problem can be recast in terms of a corresponding grid-shaped dag, in which the goal is to find the optimal path from node (0,0) to node (n,m). It will be convenient to work with this formulation, and while we’re talking about convenience, we might as well also assume that is a power of 2.
    Let’s start with a small addition to the edit distance algorithm that will turn out to be very useful. The optimal path in the dag must pass through an intermediate node (k,m2) for some k; show how the algorithm can be modified to also return this value k.
  3. Now consider a recursive scheme:
    Procedure find-path((0,0)(n,m))
    Compute the value kabove
    find-path ((0,0)k,m2)
    find-path k,m2n,m
    concatenate these two paths, with kin the middle.
    Show that this scheme can be made to run inO (mn) time and O(n) space.

You are going on a long trip. You start on the road at mile post 0. Along the way there aren hotels, at mile posts a1<a2<...<an , where eachai is measured from the starting point. The only places you are allowed to stop are at these hotels, but you can choose which of the hotels you stop at. You must stop at the final hotel (at distance an), which is your destination. You’d ideally like to travel miles a day, but this may not be possible (depending on the spacing of the hotels). If you travel x miles during a day, the penalty for that day is (200x)2. You want to plan your trip so as to minimize the total penalty- that is, the sum, over all travel days, of the daily penalties.Give an efficient algorithm that determines the optimal sequence of hotels at which to stop

Optimal binary search trees. Suppose we know the frequency with which keywords occur in programs of a certain language, for instance:

begin5%do40%else8%end4%

if10%then10%while23%

We want to organize them in a binary search tree, so that the keyword in the root is alphabetically bigger than all the keywords in the left subtree and smaller than all the keywords in the right subtree (and this holds for all nodes). Figure 6.12 has a nicely-balanced example on the left. In this case, when a keyword is being looked up, the number of comparisons needed is at most three: for instance, in finding “while”, only the three nodes “end”, “then”, and “while” get examined. But since we know the frequency 196 Algorithms with which keywords are accessed, we can use an even more fine-tuned cost function, the average number of comparisons to look up a word. For the search tree on the left, it is

cost=1(0.04)+2(0.40+0.10)+3(0.05+0.08+0.10+0.23)=2.42

By this measure, the best search tree is the one on the right, which has a cost of Give an efficient algorithm for the following task. Input: n words (in sorted order); frequencies of these words: p1,p2,...,pn.

Output: The binary search tree of lowest cost (defined above as the expected number of comparisons in looking up a word).

Figure 6.12 Two binary search trees for the keywords of a programming language.

See all solutions

Recommended explanations on Computer Science Textbooks

View all explanations

What do you think about this solution?

We value your feedback to improve our textbook solutions.

Study anywhere. Anytime. Across all devices.

Sign-up for free