Chapter 21: Problem 2
Alignment tools and methods
HIV virions wrap themselves with a lipid bilayer membrane as they bud off from infected cells; in this viral membrane envelope are "spikes" composed of two different proteins (actually glycoproteins), gp41 and gp120. The "gp" denotes glycoprotein, and the number indicates the molecular weight in kilodaltons. gp120 and gp41 together form the trimeric envelope spike on the surface of HIV that functions in viral entry into a host cell. The primary receptor for gp120 is CD4, a protein found mainly on the white blood cells known as T-lymphocytes. gp120 avoids detection by the host immune system through a number of strategies, including rapid changes in sequence due to mutations.
In this exercise, you will grab a sequence for gp120 and "blast" it to find related proteins. Use the "Search database" on the LANL HIV Sequence Database site (www.hiv.lanl.gov/) to find a sequence for gp120; the complete gp120 molecule has about 500 amino acids, so a complete DNA sequence will have roughly 1500 base pairs. On the BLAST website (www.ncbi.nlm.nih.gov/blast), open a new window and select "blastx" under the "Basic BLAST" heading. Copy and paste your approximately 1500-nucleotide sequence of gp120 into the top box under "Enter Query Sequence." For the database, select "Protein Data Bank proteins (pdb)." What we are doing is having BLAST translate the gp120 nucleotide sequence into an amino acid sequence and then compare it with the amino acid sequences of proteins in the PDB. Note that some proteins appear multiple times in the PDB because their structures have been analyzed and determined more than once or in different contexts. Finally, push the "BLAST" button and wait for your results to appear on a new page.
(a) How does BLAST determine the ranking of the results from your search?
(b) For your top search result, identify the percentage of sequence identity with your query sequence. Explain how this number is determined.
(c) You will notice that BLAST has used gaps in many of your alignments. What is the evolutionary significance of these gaps?
(d) For your top hit, how many alignments with this high a score or better would have been expected by chance?
(e) Looking at your ranked list of results and using only the "E-value," which hits do you expect to (possibly) have a genuine evolutionary relationship with your gp120 sequence, and why?
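The translation step that blastx performs before searching the PDB can be illustrated with a short Python sketch. This is only a conceptual illustration, not BLAST's actual implementation: it assumes the standard genetic code, and the function names (`translate`, `six_frame_translations`) and the toy input sequence are hypothetical choices made for this example. It translates a nucleotide query in all six reading frames (three per strand), which is why a roughly 1500-nucleotide gp120 sequence yields candidate protein sequences of about 500 residues each.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate both strands at offsets 0, 1, 2 (frames +1..+3, -1..-3)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Toy sequence for illustration only; paste a real gp120 sequence instead.
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame is then compared against the protein database, so a hit may be reported in any of the six frames; for a correctly oriented coding sequence, the true protein usually appears in one forward frame with few or no internal stop codons.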
Short Answer
Step by step solution
Key Concepts
These are the key concepts you need to understand to accurately answer the question.