Protein mutation rates Random mutations lead to amino acid substitutions in proteins that are described by the Poisson probability distribution \(p_{s}(t)\). Namely, the probability that \(s\) substitutions at a given amino acid position in a protein occur over an evolutionary time \(t\) is \[p_{s}(t)=\frac{e^{-\lambda t}(\lambda t)^{s}}{s !}\] where \(\lambda\) is the rate of amino acid substitutions per site per unit time. For example, some proteins like fibrinopeptides evolve rapidly, with \(\lambda_{F}=9\) substitutions per site per \(10^{9}\) years. Histones, on the other hand, evolve slowly, with \(\lambda_{H}=0.01\) substitutions per site per \(10^{9}\) years.

(a) What is the probability that a fibrinopeptide has no mutations at a given site in 1 billion years? What is this probability for a histone?

(b) We want to compute the average number of mutations \(\langle s\rangle\) over time \(t\), \[\langle s\rangle=\sum_{s=0}^{\infty} s\, p_{s}(t)\] First, using the fact that probabilities must sum to 1, compute the sum \(\sigma=\sum_{s=0}^{\infty}(\lambda t)^{s} / s !\). Then, write an expression for \(\langle s\rangle\), making use of the identity \[\sum_{s=0}^{\infty} s \frac{(\lambda t)^{s}}{s !}=(\lambda t) \sum_{s=1}^{\infty} \frac{(\lambda t)^{s-1}}{(s-1) !}=\lambda t \sigma\]

(c) Using your answer in (b), determine the ratio of the expected number of mutations in a fibrinopeptide to that of a histone, \(\langle s\rangle_{F} /\langle s\rangle_{H}\). (Adapted from Problem 1.16 of K. Dill and S. Bromberg, Molecular Driving Forces, 2nd ed. Garland Science, 2011.)

Short Answer

The probability of no substitution at a given site in 1 billion years is \(e^{-9}\approx 1.2\times 10^{-4}\) for a fibrinopeptide and \(e^{-0.01}\approx 0.990\) for a histone. Because the probabilities must sum to 1, \(\sigma=e^{\lambda t}\), and the identity given in the problem then yields an average number of substitutions \(\langle s\rangle=\lambda t\). The expected number of mutations in a fibrinopeptide is therefore \(\lambda_{F}/\lambda_{H}=900\) times that in a histone over the same period.

Step by step solution

01

- Understanding the Notation and Formulas

First, make sure to understand the symbols and notation. Here, \(s\) is the number of substitutions that occur at a given amino acid position over an evolutionary time \(t\). The symbol \(\lambda\) is the rate of amino acid substitutions per site per unit time. The formula \[p_{s}(t)=\frac{e^{-\lambda t}(\lambda t)^{s}}{s!}\] is the Poisson distribution. It gives the probability that \(s\) substitutions occur at an amino acid position in a protein over an evolutionary time \(t\). Because \(\lambda\) is quoted per \(10^{9}\) years, \(t\) must be expressed in the same unit, so that the product \(\lambda t\) is a dimensionless number.
02

- Calculate No Mutations Probability for Fibrinopeptide

This step substitutes the given values into the Poisson distribution formula to find the probability of no mutations. For fibrinopeptides, \(\lambda_{F}=9\) substitutions per site per \(10^{9}\) years and \(t = 10^{9}\) years, so \(\lambda_{F} t = 9\). Since we are looking for no mutations, \(s = 0\). Substituting these, we get \[p_{0}(t)=\frac{e^{-9}\, 9^{0}}{0!}=e^{-9}\approx 1.2\times 10^{-4}\] A given fibrinopeptide site therefore has only about a 0.012% chance of remaining unchanged over 1 billion years.
03

- Calculate No Mutations Probability for Histone

Following the same approach as in step 2, we substitute the given values for histones into the Poisson distribution formula. Here \(\lambda_{H} = 0.01\) substitutions per site per \(10^{9}\) years and \(t = 10^{9}\) years, so \(\lambda_{H} t = 0.01\), and we again want no mutations (\(s = 0\)). The expression becomes \[p_{0}(t)=\frac{e^{-0.01}\, (0.01)^{0}}{0!}=e^{-0.01}\approx 0.990\] A given histone site therefore remains unchanged over 1 billion years with a probability of about 99%.
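
The two no-mutation probabilities from steps 2 and 3 can be checked numerically. The sketch below assumes time is measured in units of \(10^{9}\) years, so that \(\lambda t\) equals 9 for fibrinopeptides and 0.01 for histones; the function and variable names are illustrative and do not come from the textbook.

```python
import math

def p_no_substitution(rate_per_gyr: float, t_gyr: float) -> float:
    """Poisson probability of zero substitutions: p_0(t) = exp(-lambda * t)."""
    return math.exp(-rate_per_gyr * t_gyr)

LAMBDA_F = 9.0   # fibrinopeptide rate, substitutions per site per 10^9 years
LAMBDA_H = 0.01  # histone rate, same units
T_GYR = 1.0      # 1 billion years, expressed in units of 10^9 years (assumption)

p0_fibrinopeptide = p_no_substitution(LAMBDA_F, T_GYR)
p0_histone = p_no_substitution(LAMBDA_H, T_GYR)

print(f"fibrinopeptide: {p0_fibrinopeptide:.3e}")
print(f"histone:        {p0_histone:.4f}")
```

The fibrinopeptide value comes out near \(1.2\times 10^{-4}\) and the histone value near 0.990, so over the same period a histone site is almost certain to be unchanged while a fibrinopeptide site almost never is.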
04

- Evaluate the Infinite Series

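We evaluate the summation \(\sigma = \sum_{s=0}^{\infty} (\lambda t)^s / s!\) using the fact that the probabilities must sum to 1: \[\sum_{s=0}^{\infty} p_{s}(t)=e^{-\lambda t}\sum_{s=0}^{\infty}\frac{(\lambda t)^{s}}{s!}=e^{-\lambda t}\sigma=1\] so \(\sigma = e^{\lambda t}\). This is not a geometric series; it is the Taylor series of the exponential function evaluated at \(\lambda t\).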
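
As a rough numerical check of this step, the partial sums of \(\sum_{s}(\lambda t)^{s}/s!\) can be compared with \(e^{\lambda t}\). This is only a sketch: the helper name `sigma_partial` and the choice of 60 terms are assumptions made for illustration.

```python
import math

def sigma_partial(x: float, n_terms: int) -> float:
    """Sum of x**s / s! for s = 0 .. n_terms - 1, built term by term."""
    total = 0.0
    term = 1.0  # x**0 / 0!
    for s in range(n_terms):
        total += term
        term *= x / (s + 1)  # turns x**s / s! into x**(s+1) / (s+1)!
    return total

# x stands for lambda * t: 0.01 for histones and 9 for fibrinopeptides
for x in (0.01, 9.0):
    print(f"x = {x}: partial sum = {sigma_partial(x, 60):.6f}, exp(x) = {math.exp(x):.6f}")
```

Building each term from the previous one avoids computing large powers and factorials separately, and 60 terms are more than enough for the two values of \(\lambda t\) in this problem.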
05

- Write Expression for Average Number of Mutations

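The average is \(\langle s\rangle=\sum_{s=0}^{\infty} s\, p_{s}(t)=e^{-\lambda t}\sum_{s=0}^{\infty} s \frac{(\lambda t)^{s}}{s!}\). Using the given identity, the sum equals \(\lambda t \sigma\), so \[\langle s\rangle = e^{-\lambda t}\, \lambda t\, \sigma\] Substituting \(\sigma = e^{\lambda t}\) from Step 4, the exponentials cancel and we get \(\langle s\rangle = \lambda t\). The average number of substitutions is simply the rate multiplied by the elapsed time.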
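
The average can also be checked by summing \(s\,p_{s}(t)\) directly, keeping the normalizing factor \(e^{-\lambda t}\) that belongs to the Poisson distribution. In this minimal sketch the function name and the truncation at 100 terms are assumptions; `x` stands for \(\lambda t\).

```python
import math

def poisson_mean_by_sum(x: float, n_terms: int = 100) -> float:
    """Approximate <s> = sum over s of s * exp(-x) * x**s / s!, with x = lambda * t."""
    mean = 0.0
    p = math.exp(-x)  # p_0
    for s in range(n_terms):
        mean += s * p
        p *= x / (s + 1)  # p_{s+1} = p_s * x / (s + 1)
    return mean

for x in (0.01, 9.0):
    print(f"lambda*t = {x}: <s> from the sum = {poisson_mean_by_sum(x):.6f}")
```

For both values the truncated sum agrees with \(\lambda t\) itself, which is the expected mean of a Poisson distribution.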
06

- Compute Ratio of Expected Number of Mutations

Finally, we apply \(\langle s \rangle = \lambda t\) to fibrinopeptides and histones and compute the ratio: \[\frac{\langle s\rangle_F}{\langle s\rangle_H} = \frac{\lambda_F t}{\lambda_H t} = \frac{\lambda_F}{\lambda_H} = \frac{9}{0.01} = 900\] The time \(t\) cancels, so over any given period a fibrinopeptide site is expected to accumulate 900 times as many substitutions as a histone site.
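
A short sketch of the ratio, assuming the Poisson mean \(\langle s\rangle=\lambda t\) and times measured in units of \(10^{9}\) years. The names below are illustrative only.

```python
LAMBDA_F = 9.0   # fibrinopeptide rate, substitutions per site per 10^9 years
LAMBDA_H = 0.01  # histone rate, same units

def expected_substitutions(rate: float, t: float) -> float:
    """Mean of a Poisson distribution with parameter rate * t."""
    return rate * t

# The elapsed time cancels, so the ratio is the same for every t
for t in (0.5, 1.0, 2.0):
    ratio = expected_substitutions(LAMBDA_F, t) / expected_substitutions(LAMBDA_H, t)
    print(f"t = {t} x 10^9 years: ratio = {ratio:.1f}")
```

Looping over several times makes the cancellation of \(t\) visible: the ratio depends only on the two rates.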

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Amino Acid Substitutions
In the study of molecular biology and evolutionary biology, amino acid substitutions are a core focus. These are changes in the protein sequence that occur when one amino acid in a protein is replaced by another. This process of change is pivotal as proteins are the workhorse molecules in cells, taking on a vast array of functions from catalyzing metabolic reactions to DNA replication.

Substitutions can happen due to errors in DNA replication or as a result of environmental damage to the DNA. Some substitutions may have little to no effect on the function of a protein, while others can be detrimental or, less commonly, advantageous, potentially leading to evolutionary change. The rate at which these substitutions occur is central to understanding both the mechanisms of protein function and the forces that drive evolutionary processes.

Considering the exercise provided, we focused on calculating the likelihood of these substitutions over an extended period—1 billion years—to illuminate how often we might expect these changes to occur in two types of proteins with widely differing mutation rates.
Poisson Probability Distribution
The Poisson probability distribution is a discrete probability distribution that expresses the likelihood of a given number of events occurring in a fixed interval of time or space, provided these events happen with a known constant mean rate and independently of the time since the last event. In our scenario, it is used to model the probability of a certain number of amino acid substitutions occurring within a given period.

The formula \(p_{s}(t) = \frac{e^{-\lambda t}(\lambda t)^{s}}{s!}\) succinctly captures the essence of this distribution. It tells us the probability that \(s\) substitutions will occur in time \(t\), with \(\lambda\) being the substitution rate. In our example, the exercise asks to apply the Poisson distribution to calculate the probability of no mutations occurring in both fast-evolving proteins like fibrinopeptides and slow-evolving proteins like histones.
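
To see what the formula implies beyond the \(s=0\) case, the probabilities for a few values of \(s\) can be tabulated. This is a sketch using only the Python standard library; the function name `poisson_pmf` and the selection of \(s\) values are assumptions made for illustration, and time is taken in units of \(10^{9}\) years.

```python
import math

def poisson_pmf(s: int, rate: float, t: float) -> float:
    """p_s(t) = exp(-rate*t) * (rate*t)**s / s!"""
    x = rate * t
    return math.exp(-x) * x**s / math.factorial(s)

T_GYR = 1.0  # 1 billion years, in units of 10^9 years
for name, rate in (("fibrinopeptide", 9.0), ("histone", 0.01)):
    row = ", ".join(f"p_{s} = {poisson_pmf(s, rate, T_GYR):.2e}" for s in (0, 1, 2, 8, 9))
    print(f"{name}: {row}")
```

For the fibrinopeptide the probability is concentrated around \(s\approx\lambda t=9\) (with \(p_{8}=p_{9}\), since \(\lambda t\) is an integer), while for the histone almost all of the probability sits at \(s=0\).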

Understanding the implications of the Poisson distribution enriches students' knowledge of statistical models and their critical applications in biological systems, like measuring mutation rates in evolutionary studies.
Evolutionary Biology
At the crux of evolutionary biology lies the study of how organisms change over time. Mutations, including amino acid substitutions in proteins, are the raw material of genetic variation which, in turn, is essential for evolution by natural selection. Evolutionary biologists investigate the rates at which mutations occur and how these rates contribute to the diversity and adaptability of life forms.

In our exercise, the emphasis is on contrasting the substitution rates of fibrinopeptides and histones, which illustrates the different evolutionary pressures and constraints acting on different proteins. The probabilities obtained from the Poisson distribution reflect these dynamics: fibrinopeptides accumulate substitutions quickly, which is commonly attributed to weak functional constraint on their sequence, whereas histones change very slowly and preserve the structural integrity of cellular components over time. Such insights help students understand how molecular changes are quantified and what they imply for the broader evolutionary landscape.
Molecular Biology
Molecular biology delves into the molecular underpinnings of the processes that govern life, including how genetic information is replicated, expressed, and regulated. Within this discipline, understanding protein mutation rates is crucial as it impacts on gene expression and function. The exercise in question marries molecular biology with mathematics by using the Poisson probability distribution to predict mutations.

This integrative approach helps students grasp how quantitative analysis can provide insights into biological mechanisms—such as the expected number of amino acid substitutions over time. When students calculate mutation rates, they are engaging with one of the pillars of molecular biology, the mutation process, which can translate into changes in phenotype and affect organism functions and survival. Hence, mastering these calculations is not merely an academic exercise, but a window into the molecular forces that shape life.

Most popular questions from this chapter

Restriction enzymes and sequences (a) Restriction enzymes are proteins that recognize specific sequences at which they cut the DNA. Two commonly used restriction enzymes are HindIII and EcoRI. Look up the recognition sequences that these enzymes each cut and make a sketch of the pattern of cutting they carry out. Consider the approximately 48,000 bp genome of lambda phage and make an estimate of the lengths of the fragments that you would get if the DNA is cut with both the HindIII and EcoRI restriction enzymes. There is a precise mathematical way to do this and it depends upon the length of the recognition sequence (a 5 cutter will have shorter fragments than an 8 cutter); explain that. (b) Find the actual fragment lengths obtained in the lambda genome using these restriction enzymes by going to the New England Biolabs website (www.neb.com) and looking up the tables identifying the sites on the lambda genome that are cut by these different enzymes. How do these cutting patterns compare with your results from (a)? (c) Plot the number of cuts in the lambda genome as a function of the length of the recognition sequence of several commercially available type II restriction enzymes. You can download the list of type II restriction enzymes from the book's website. Combine this plot with a curve showing your theoretical expectation.

Alignment tools and methods HIV virions wrap themselves with a lipid bilayer membrane as they bud off from infected cells; in this viral membrane envelope are "spikes" composed of two different proteins (actually glycoproteins), gp41 and gp120. The gp denotes glycoprotein, and the number indicates their molecular weight in kilodaltons. gp120 and gp41 together form the trimeric envelope spike on the surface of HIV that functions in viral entry into a host cell. The primary receptor for gp120 is CD4, a protein found mainly on the white blood cells known as T-lymphocytes. gp120 avoids detection by the host immune system through a number of strategies, including rapid changes in sequence due to mutations. In this exercise, you will grab a sequence for gp120 and "blast" it to find related proteins. Use the "Search database" on the LANL HIV Sequence Database site (www.hiv.lanl.gov/) to find a sequence for gp120; the complete gp120 molecule has about 500 amino acids, so a complete DNA sequence will have roughly 1500 base pairs. With the BLAST website (www.ncbi.nlm.nih.gov/blast), open a new window and select "blastx" under the "Basic BLAST" heading. Copy and paste your approximately 1500 nucleotide sequence of gp120 into the top box under "Enter Query Sequence." For the database, select "Protein Data Bank proteins (pdb)." What we are doing is having BLAST translate the gp120 nucleotide sequence into an amino acid sequence and then compare it with amino acid sequences of proteins in the PDB. Note that there are some proteins that appear multiple times in the PDB because their structures have been analyzed and determined more than once or in different contexts. Finally, push the "BLAST" button and wait for your results to appear on a new page. (a) How does BLAST determine the ranking of the results from your search? (b) For your top search result, identify the percentage of sequence identity with your query sequence. Explain how this number is determined. (c) You will notice that BLAST has used gaps in many of your alignments. What is the evolutionary significance of these gaps? (d) For your top hit, how many alignments with this high a score or better would have been expected by chance? (e) Looking at your ranked list of results and using only the "E-value," which hits do you expect to (possibly) have a genuine evolutionary relationship with your gp120 sequence and why?

The molecular clock In eukaryotes, the majority of individual point mutations are thought to be "neutral" and have little or no effect on phenotype. Only a small fraction of the genome codes for proteins and critical DNA regulatory sequences. Even within coding regions, the redundancy of the genetic code is sufficient to render many mutations "synonymous" (that is, they do not change the amino acid, and hence the protein, encoded by the DNA). The slow accumulation of neutral mutations between two populations can be used as a "molecular clock" to estimate the length of time that has passed since the existence of their last common ancestor. In these estimates, it is common to make the simplifying approximations that (1) most mutations are neutral and (2) the rate of accumulation of neutral mutations is just the average point mutation rate per generation (that is, ignoring other kinds of mutations such as deletions, inversions, etc., as well as variations in and correlations among mutations). (a) With a crude estimate of the point mutation rate of humans of \(10^{-8}\) per base pair per generation, what fraction of the possible nucleotide differences would you expect there to be between chimpanzees and humans given that the fossil record and radiochemical dating indicate their lineages diverged about six million years ago? Compare your estimate with the observed result from sequencing of about \(1.5\%\). (b) Some parasitic organisms (lice are an example) have specialized and co-evolved with humans and chimps separately. A natural hypothesis is that the most recent common ancestor of the human and chimp parasites existed at the same time as that of the human and chimp themselves. How might you test this from DNA sequence data and other information? What are likely to be the largest causes of uncertainty in the estimates? (Problem courtesy of Daniel Fisher.)

Mutual information by another name In the chapter, we introduced the concept of mutual information as the average decrease in the missing information associated with one variable when the value of another variable is known. In terms of probability distributions, this can be written mathematically as \[I=\sum_{y} p(y)\left[-\sum_{x} p(x) \log _{2} p(x)+\sum_{x} p(x | y) \log _{2} p(x | y)\right]\] where the expression in square brackets is the difference in missing information, \(S_{x}-S_{x|y}\), associated with the probability of \(x\), \(p(x)\), and with the probability of \(x\) conditioned on \(y\), \(p(x | y)\). Using the relation between the conditional probability \(p(x | y)\) and the joint probability \(p(x, y)\), \[p(x | y)=\frac{p(x, y)}{p(y)}\] show that the formula for mutual information given in Equation 21.77 can be used to derive the formula used in the chapter (Equation 21.17), namely \[I=\sum_{x, y} p(x, y) \log _{2}\left[\frac{p(x, y)}{p(x) p(y)}\right]\]

Fidelity of protein synthesis The average mass of proteins in the cell is 30,000 Da, and the average mass of an amino acid is 120 Da. In eukaryotic cells, the translation rate for a single ribosome is roughly 40 amino acids per second. (a) The largest known polypeptide chain made by any cell is titin. It is made by muscle cells and has an average weight of \(3 \times 10^{6}\) Da. Estimate the translation time for titin and compare it with that of a typical protein. (b) Protein synthesis is very accurate: for every 10,000 amino acids joined together, only one mistake is made. What is the probability of an error occurring for one amino acid addition? What fraction of average sized proteins are synthesized error-free?
