Chapter 6: Q26E (page 197)
Sequence alignment. When a new gene is discovered, a standard approach to understanding its function is to look through a database of known genes and find close matches. The closeness of two genes is measured by the extent to which they are aligned. To formalize this, think of a gene as being a long string over an alphabet Σ = {A, C, G, T}. Consider two genes (strings) x and y. An alignment of x and y is a way of matching up these two strings by writing them in columns, for instance:
Here the “_” indicates a “gap.” The characters of each string must appear in order, and each column must contain a character from at least one of the strings. The score of an alignment is specified by a scoring matrix δ of size (|Σ| + 1) × (|Σ| + 1), where the extra row and column are to accommodate gaps. For instance the preceding alignment has the following score:
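In general, the score of an alignment is the sum of the scoring-matrix entries over its columns. As a sketch, write δ for the scoring matrix and write the k-th column as the pair (a_k, b_k), where each entry is either a character or the gap symbol and at most one of the two is a gap. The column notation a_k, b_k and the column count ℓ are introduced here for illustration only. An alignment with ℓ columns then has score:

\[
\text{score} = \sum_{k=1}^{\ell} \delta(a_k, b_k)
\]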
Give a dynamic programming algorithm that takes as input two strings x[1…n] and y[1…m] and a scoring matrix δ and returns the highest-scoring alignment. The running time should be O(mn).
Short Answer
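A dynamic programming algorithm over the prefixes of the two strings fills an (n + 1) × (m + 1) table of best scores, with constant work per entry, and so runs in O(nm) time.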
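One way such an algorithm could look is sketched below. Let S(i, j) be the best score of aligning the first i characters of x with the first j characters of y. The last column of such an alignment either pairs x[i] with y[j], pairs x[i] with a gap, or pairs a gap with y[j], so S(i, j) is the maximum of S(i-1, j-1) + δ(x[i], y[j]), S(i-1, j) + δ(x[i], gap), and S(i, j-1) + δ(gap, y[j]). The names `align` and `simple_delta`, the dictionary representation of the scoring matrix, and the +1/-1/-1 scores are assumptions made for this illustration, not part of the original exercise.

```python
def simple_delta(alphabet, match=1, mismatch=-1, gap_penalty=-1, gap="-"):
    """Build a hypothetical scoring matrix as a dict keyed by (a, b) pairs.

    The match/mismatch/gap values are illustrative assumptions only.
    """
    delta = {}
    for a in alphabet:
        delta[a, gap] = gap_penalty   # character of x against a gap
        delta[gap, a] = gap_penalty   # gap against a character of y
        for b in alphabet:
            delta[a, b] = match if a == b else mismatch
    return delta


def align(x, y, delta, gap="-"):
    """Return (best_score, aligned_x, aligned_y) for strings x and y."""
    n, m = len(x), len(y)
    # S[i][j] = best score of aligning x[:i] with y[:j]
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):         # x[:i] against the empty string
        S[i][0] = S[i - 1][0] + delta[x[i - 1], gap]
    for j in range(1, m + 1):         # the empty string against y[:j]
        S[0][j] = S[0][j - 1] + delta[gap, y[j - 1]]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j - 1] + delta[x[i - 1], y[j - 1]],  # pair x[i], y[j]
                S[i - 1][j] + delta[x[i - 1], gap],           # x[i] with a gap
                S[i][j - 1] + delta[gap, y[j - 1]],           # gap with y[j]
            )

    # Trace back from S[n][m] to recover one optimal alignment.
    ax, ay = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and S[i][j] == S[i - 1][j - 1] + delta[x[i - 1], y[j - 1]]:
            ax.append(x[i - 1])
            ay.append(y[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and S[i][j] == S[i - 1][j] + delta[x[i - 1], gap]:
            ax.append(x[i - 1])
            ay.append(gap)
            i -= 1
        else:
            ax.append(gap)
            ay.append(y[j - 1])
            j -= 1
    return S[n][m], "".join(reversed(ax)), "".join(reversed(ay))
```

Filling the table takes constant work for each of the (n + 1)(m + 1) entries, and the traceback takes at most n + m steps, so the total is O(nm). With the illustrative scores above, `align("ACGT", "AGT", simple_delta("ACGT"))` returns a score of 2 with the alignment ACGT over A-GT. When several alignments tie for the best score, this sketch returns only one of them.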