Restriction enzymes and sequences

(a) Restriction enzymes are proteins that recognize specific sequences at which they cut the DNA. Two commonly used restriction enzymes are HindIII and EcoRI. Look up the recognition sequences that these enzymes each cut and make a sketch of the pattern of cutting they carry out. Consider the approximately 48,000 bp genome of lambda phage and make an estimate of the lengths of the fragments that you would get if the DNA is cut with both the HindIII and EcoRI restriction enzymes. There is a precise mathematical way to do this and it depends upon the length of the recognition sequence (a 5 cutter will have shorter fragments than an 8 cutter); explain that.

(b) Find the actual fragment lengths obtained in the lambda genome using these restriction enzymes by going to the New England Biolabs website (www.neb.com) and looking up the tables identifying the sites on the lambda genome that are cut by these different enzymes. How do these cutting patterns compare with your results from (a)?

(c) Plot the number of cuts in the lambda genome as a function of the length of the recognition sequence of several commercially available type II restriction enzymes. You can download the list of type II restriction enzymes from the book's website. Combine this plot with a curve showing your theoretical expectation.

Short Answer

The expected fragment lengths for the HindIII and EcoRI restriction enzymes are estimated from the genome length and the length of each recognition sequence. These estimates are then checked against the lambda site tables at www.neb.com. The number of cuts in the lambda genome is plotted against recognition sequence length for several type II restriction enzymes and compared with a theoretical expectation curve. The comparison shows how recognition sequence length controls how finely restriction enzymes fragment the genome.

Step by step solution

01

Identify Recognition Sequences

Look up and identify the recognition sequences for HindIII and EcoRI restriction enzymes. HindIII cuts the DNA at the recognition sequence 'AAGCTT' and EcoRI cuts the DNA at the sequence 'GAATTC'.
02

Sketch Cutting Patterns

Create sketches for both enzymes showing how and where they would cut the lambda phage genome. Be sure to include the sequences you identified earlier.
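
As a starting point for the sketch, the snippet below draws each double-stranded site with a caret marking the cut on each strand. It assumes the commonly quoted cut positions (G^AATTC for EcoRI and A^AGCTT for HindIII); the helper name `sketch_cut` is hypothetical and only meant as an illustration.

```python
# Hypothetical helper for drawing staggered ("sticky-end") cuts.
# Assumed cut positions: EcoRI G^AATTC, HindIII A^AGCTT.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def sketch_cut(site: str, offset: int) -> str:
    """Draw a palindromic site with the top strand cut after `offset` bases.

    By symmetry the bottom strand is cut the same distance from its own
    5' end, which is `offset` bases from the right-hand end as drawn.
    """
    bottom = site.translate(COMPLEMENT)   # bottom strand, written 3'->5'
    mirror = len(site) - offset           # cut position on the bottom strand
    top_line = "5'-" + site[:offset] + "^" + site[offset:] + "-3'"
    bottom_line = "3'-" + bottom[:mirror] + "^" + bottom[mirror:] + "-5'"
    return top_line + "\n" + bottom_line


print("EcoRI")
print(sketch_cut("GAATTC", 1))
print("HindIII")
print(sketch_cut("AAGCTT", 1))
```

Both enzymes leave four-base 5' overhangs (AATT for EcoRI, AGCT for HindIII), which is what the offset carets show.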
03

Estimate Fragment Lengths

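To estimate fragment lengths, assume a random sequence in which the four bases A, C, G and T are equally likely. A specific 6-bp site then occurs with probability 1/4^6 at any position, so HindIII (6-bp site) cuts on average once every 4^6 = 4096 bp, and so does EcoRI (6-bp site). Because both sites are palindromic, they read the same on either strand, so no extra factor of two is needed for the second strand. With both enzymes present, a site for one or the other appears about every 4096/2 = 2048 bp, giving roughly 48,000/2048 ≈ 23 cuts and an average fragment length of about 2 kb. More generally, an n-cutter has sites every 4^n bp on average, which is why a 5 cutter produces shorter fragments than an 8 cutter.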
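
The arithmetic for this estimate can be sketched as follows. This is a minimal illustration that assumes equal base frequencies and independent sites; the function names are hypothetical.

```python
GENOME_LENGTH = 48_000  # approximate lambda genome length from the problem


def expected_spacing(site_length: int, n_enzymes: int = 1) -> float:
    """Mean distance (bp) between sites in random DNA.

    Assumes equal base frequencies and that `n_enzymes` distinct sites of
    the same length occur independently of each other.
    """
    return 4 ** site_length / n_enzymes


def expected_cuts(genome_length: int, site_length: int, n_enzymes: int = 1) -> float:
    """Expected number of cut sites in a genome of the given length."""
    return genome_length / expected_spacing(site_length, n_enzymes)


single = expected_cuts(GENOME_LENGTH, 6)               # HindIII or EcoRI alone
double = expected_cuts(GENOME_LENGTH, 6, n_enzymes=2)  # both enzymes together
print(f"single digest: {single:.1f} cuts, one every {expected_spacing(6):.0f} bp")
print(f"double digest: {double:.1f} cuts, one every {expected_spacing(6, 2):.0f} bp")
```

Since the lambda genome is linear, N cuts give N + 1 fragments, so the double digest is expected to yield about 24 fragments under these assumptions.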
04

Validate Estimates

Go to www.neb.com and cross-verify the cutting sites and resulting fragment lengths for the HindIII and EcoRI enzymes by looking at the tables identifying the lambda genome sites.
05

Interpret Comparisons

The fragment lengths estimated in the previous steps should be compared with the actual fragment lengths obtained when the enzymes cut the lambda genome. Discuss whether the mathematical model agrees with the data listed on the website.
06

Plotting Number of Cuts

Using the list of type II restriction enzymes, plot the number of cuts each enzyme makes in the lambda genome against the length of its recognition sequence. Label the axes clearly to show the number of cuts and the length of the recognition sequence.
07

Theoretical Expectation Curve

Next, draw a curve showing the theoretical expectation of cuts in the lambda genome. This curve should represent the expected number of cuts for a given length of the recognition sequence. Overlay this curve on the plot from the previous step.
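
The points of such a curve could be generated as in the sketch below, which assumes the simple random-sequence model (equal base frequencies, expected cuts = genome length / 4^n). The function name `theoretical_curve` is hypothetical, and the plotting step itself is left to whatever plotting tool is at hand.

```python
import math


def theoretical_curve(genome_length: int, site_lengths) -> list:
    """Return (n, expected number of cuts) pairs for an n-bp recognition site."""
    return [(n, genome_length / 4 ** n) for n in site_lengths]


points = theoretical_curve(48_000, range(4, 9))
for n, cuts in points:
    # log10 is shown because the curve is a straight line on a semi-log plot
    print(f"{n}-cutter: {cuts:8.2f} expected cuts (log10 = {math.log10(cuts):.2f})")
```

Each extra base in the site lowers the expected count by a factor of four, so on a logarithmic vertical axis the theoretical points fall on a straight line.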
08

Compare Plots

Finally, compare the initial plot with the theoretical expectation curve. Discuss any similarities or differences, and consider potential reasons for these differences.

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Recognition Sequences
Restriction enzymes, also known as restriction endonucleases, are proteins that recognize specific sequences of nucleotides in DNA, known as recognition sequences. These sequences are typically palindromic, meaning the sequence read 5' to 3' on one strand is identical to the sequence read 5' to 3' on the complementary strand. Each restriction enzyme has a defined recognition sequence, which determines where it cuts the DNA. For instance:
  • HindIII recognizes the sequence 'AAGCTT'.
  • EcoRI recognizes the sequence 'GAATTC'.
This specificity forms the basis for using restriction enzymes in DNA manipulation and cloning. Recognition sequences usually consist of 4 to 8 nucleotides, and the number of nucleotides determines the frequency of cuts along a DNA sequence. This is because, if the four nucleotides are equally likely, every additional nucleotide reduces the probability of the sequence occurring at a given position by a factor of four.
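
The palindrome property can be checked directly: a site is palindromic in this sense when it equals its own reverse complement. The snippet below is a small sketch of that check; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site == reverse_complement(site)


for name, site in [("HindIII", "AAGCTT"), ("EcoRI", "GAATTC")]:
    print(name, site, is_palindromic(site))
```

Both sites pass the check, whereas an arbitrary sequence such as GATTACA does not.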
DNA Fragmentation
DNA fragmentation is the process of cutting DNA into smaller pieces using restriction enzymes. Each enzyme cleaves the DNA at its specific recognition sequence, resulting in a pattern of fragments. The lambda phage genome, which is about 48,000 base pairs in length, can be used as a model to see how DNA fragmentation works. The length of the fragments for each enzyme can be estimated based on the sequence it recognizes and cuts.
  • For a 6-base pair recognition sequence, such as those for HindIII and EcoRI, the enzyme cuts, on average, once every 4^6 = 4096 base pairs.
  • This results in approximately 48,000/4096, or about 12 cut sites per enzyme (and so about 13 fragments, since the lambda genome is linear), assuming random distribution of the recognition sequences along the genome.
However, the actual number and size of fragments depend on the exact locations of the recognition sequences in the genome.
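
Once the real cut positions have been looked up, turning them into fragment lengths is a matter of sorting and differencing. The sketch below assumes a linear genome; the positions in the example are made-up toy values, not lambda data, and the function name is hypothetical.

```python
def fragment_lengths(genome_length: int, cut_positions) -> list:
    """Fragment lengths for a linear genome cut at the given positions."""
    boundaries = [0] + sorted(cut_positions) + [genome_length]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Toy example with made-up positions (NOT real lambda sites).
enzyme_a_sites = [700, 150]
enzyme_b_sites = [400]

print(fragment_lengths(1000, enzyme_a_sites))                    # single digest
print(fragment_lengths(1000, enzyme_a_sites + enzyme_b_sites))   # double digest
```

For a double digest the two site lists are simply merged before differencing, and the fragment lengths always sum to the genome length.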
Lambda Phage Genome
The lambda phage is a virus that infects bacteria, and its genome is frequently used in molecular biology experiments due to its manageable size and well-studied sequence. Its genome is approximately 48,000 base pairs long, making it an ideal candidate for studying DNA fragmentation with restriction enzymes.
  • The sequence of the lambda phage genome is fully mapped, making it easy to plan and predict how restriction enzymes will cut it.
  • This prediction is important in both theoretical exercises and practical applications, such as cloning or genomic mapping.
Using databases like the one provided by New England Biolabs can offer clarity by specifying the actual cut patterns of various restriction enzymes across the lambda phage genome.
Mathematical Estimation
Mathematical estimation in the context of restriction enzymes involves predicting the number and sizes of DNA fragments created when the DNA is digested with a restriction enzyme. This estimation is a powerful tool in genomics because it can predict experimental outcomes.
  • The estimate rests on the probability of finding the recognition sequence at a given position in a sequence of random nucleotides, (1/4)^n, where n is the length of the recognition sequence; the expected spacing between sites is therefore 4^n base pairs.
  • This helps in estimating the average number of cuts and thereby the average size of the resulting fragments.
For example, a 6-base pair recognition sequence will statistically appear once every 4096 base pairs in a random sequence, providing an estimation basis for expected fragment sizes when analyzing complex genomes like that of the lambda phage.
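
The random-sequence estimate can be sanity-checked with a short simulation. The sketch below generates random genomes with equal base frequencies and counts occurrences of a site; it is an illustration of the model only, the function names are hypothetical, and the trial count and seed are arbitrary choices.

```python
import random


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences of `site` in `sequence`, allowing overlaps."""
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def mean_sites_in_random_dna(site: str, genome_length: int, trials: int, seed: int = 0) -> float:
    """Average site count over `trials` random genomes (equal base frequencies)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        genome = "".join(rng.choices("ACGT", k=genome_length))
        total += count_sites(genome, site)
    return total / trials


mean_count = mean_sites_in_random_dna("GAATTC", 48_000, trials=20)
print(f"simulated mean: {mean_count:.1f} sites; model: {48_000 / 4 ** 6:.1f}")
```

The simulated mean should land near the model value of about 12, while individual genomes scatter around it, which is one reason a real genome such as lambda need not match the estimate exactly.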
