Biochemistry 201 RNA Processing

LECTURE: THE SHIFTING RNA PARADIGM

General References

J. E. Dahlberg, J. N. Abelson, et al. 1989. Methods in Enzymology Vol 180, 181

R. F. Gesteland, J. F. Atkins, et al. 1993. The RNA World

Wolfram Saenger, 1984. Principles of Nucleic Acid Structure

I. The Paradigm: DNA ----------------> RNA --------------> proteins

information storage --> information transfer ---> function

"Molecular Biology of the Cell" (1983 Edition)

Alberts, Bray, Lewis, Raff, Roberts & Watson

‘The molecular processes that underlie protein synthesis are very complex. Although we can describe many of them, they do not make conceptual sense in the way that DNA transcription, DNA repair, and DNA replication do. For example, we now know that not one but three main classes of RNA molecules (mRNA, tRNA, and rRNA) are involved in protein synthesis, but we do not fully understand why this must be so. Thus, the details of protein synthesis must largely be learned as fact without an obvious conceptual framework.’

-Same holds for RNA processing, especially pre-mRNA splicing, which involves many proteins and RNAs.

"Just So Stories and Rube Goldberg Machines: Speculations on the Origin of the Protein Synthetic Machinery" (1979, Steenbock Symposium) C.R. Woese

‘From an evolutionary viewpoint a chicken-and-egg paradox results from assuming that the defining interactions in translation reside in the protein components: A translation mechanism is already required to evolve those proteins that would be required to build that translation mechanism in the first place. The paradox could be avoided, of course, if nucleic acids were to define the mechanics of translation initially.

Ribosomal RNA can no longer be viewed as a static structural element, as a scaffolding upon which the function defining proteins are positioned. Ribosomal RNA is capable of mechanism.’

And even earlier, in 1968: Orgel and Crick independently come down in favor of functional roles for nucleic acids in early evolution:

"At present there is no evidence that polynucleotides have even limited catalytic activity. In biological systems we know that catalytic functions are performed by proteins and never by polynucleotides....even if polynucleotides are able to catalyze chemical reactions, and used this ability in the early stages of the evolution of life, the function would subsequently have been taken over by the much more versatile polypeptides. Thus the question of the catalytic activity of polynucleotides remains open..."

Orgel, J. Mol. Biol. (1968) 38, 381-393: "Evolution of the Genetic Apparatus"

see also Crick, J. Mol. Biol. (1968) 38, 367-379.

Basic idea: RNA has structure; therefore it can have function.

Hypothesis: Life began with RNA

Corollary: In order to understand modern day biology, we must understand the pathway of its origins.

But to understand & evaluate this hypothesis, must understand what structure RNA has and what it can do (i.e., on a molecular level):

II. What do they mean, RNA has structure?

-DNA is boring, RNA is not: A and B form DNA vs tRNA.

(Figures 1-3)

-Watson-Crick base pairs (Figure 1B) represent only a small fraction of the potential base-base interactions (Figure 4). "Anything can base pair with anything" (e.g., a U.C base pair bridged by a water molecule: Holbrook, Cheong, Tinoco & Kim, Nature (1991) 353, 579-581: "Crystal structure of an RNA double helix incorporating a track of non-Watson-Crick base pairs"; reference included).

-Multiple secondary & tertiary structures possible (Figure 4 & 5)

-And these can come together to form complex structures: e.g., tRNA (Figure 6)

Lessons from tRNA:

-Stacking of helices to give the overall L shape. (Figure 7)

-Several types of interactions bring the two sets of stacked helices together.

-Additional stacking interactions give intercalation (e.g., A9 between G45 and G46)

(Figure 8)

-Long range base pairs help anchor "L" shape (e.g., G15.C48) (Figure 7)

-Base triples also help anchor "L" shape (e.g., C13.G22-G46) (Figure 9)

-Phosphates and 2' hydroxyls can interact with bases.

[e.g., P9 with the base triple above; 2'OH of U8 with N1 of A21 (Figure 10)]

-There are NETWORKS of interactions (e.g., C13.G22-G46.A9)

-Metal ions are bound at many positions.

Forces involved include: hydrogen bonding, hydrophobic effect, and dipole effects for base pairing and base stacking. Simple, but we still can’t predict energy and structure.

{It is really not possible to become comfortable with structure without staring at structures in private, even cross-eyed. This brief discussion was an attempt to introduce some of the things to look for. A book with a wealth of structural information on nucleic acids is Saenger, "Principles of Nucleic Acid Structure" Springer-Verlag (1984); Chapter 3 of Cantor & Schimmel's Biophysical Chemistry has a nice treatment of tRNA structure. A more general review with leading references is Chastain & Tinoco (1991) Prog. Nucleic Acids Res. Molec. Biol. 41, 131-177: "Structural elements in RNA".}

III. So RNA has structure, what about function?

1982: Cech finds that RNA can have proteinesque function: an intron from pre-rRNA (r = ribosomal) excises itself and ligates the exons in the absence of protein.

Kruger et al. (1982) Cell 31, 147-157: "Self-splicing RNA: Autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena."

-Looks & acts like a real enzyme (Figure 11A,B & C)

Golden, Gooding, Podell & Cech (1998) Science 282, 259-264: "A preorganized active site in the crystal structure of the Tetrahymena ribozyme."

Narlikar & Herschlag (1997) Annu. Rev. Biochem. 66, 19-59: "Mechanistic aspects of enzymatic catalysis: Lessons from comparison of RNA and protein enzymes."

Conclusions (post-1982):

-Life began with RNA serving both the informational role of DNA and the functional role of proteins.

Next Question: How do you go from RNA to proteins? (Figure 12)

-Start small, with helper peptides that are in the primordial goo.

-Any organism that could scavenge such preexisting peptides (or peptide-like molecules) would have an advantage. Indeed there is evidence that peptides can help RNA function [Herschlag, J. Biol. Chem. (1995) 270, 20871-20874: "RNA Chaperones and the RNA Folding Problem"]

-However, this might be an evolutionary dead end, because ultimately the organism would need to make its own peptides.

-Alternatively, such an organism might evolve better RNAs, which can more fully utilize the assistance of the peptides. These better RNAs might then come up with the breakthrough of figuring out how to perform peptide synthesis.

-Once the process has begun, there would be selective pressure for better and better peptide synthesis machinery, to make longer peptides and to make them more accurately.

-Ultimately, proteins would replace RNA for most functions, since proteins are better than RNA!

-Given the complexity of switching from an RNA world to a protein world, it is reasonable that there will be vestiges of the RNA world.

-This has an important implication: life is not the best possible design; rather, the chosen mechanisms are dependent on how we got here.

The most likely vestigial candidate might be the machinery that is responsible for making protein from RNA codes.

Hypothesis: The ribosome is a "ribozyme" (RNA enzyme).

-Noller provides evidence that the ribosome is a "ribozyme"

(Noller, Hoffarth & Zimniak, Science (1992) 256, 1416-1419: "Unusual resistance of peptidyl transferase to protein extraction procedures".)

-Although not proof, there is substantial data implicating ribosomal RNA in function.

-Soon, we will have a structural picture, which will guide experiments to answer this question definitively.

Next question: How did the organism initially decide which amino acids to put in the peptides, i.e., what was the origin of the code?

-It is hard to imagine a scenario other than one in which there was a direct interaction of amino acid with RNA.

Hypothesis: Vestiges of these interactions remain in modern day tRNAs: Do codons bind their cognate amino acids?

Test: Yarus tested all 20 amino acids as potential inhibitors of self-splicing by the group I self-splicing RNAs. He found that only Arg was a good inhibitor, with Ki = 2 mM.

-This established that a complex RNA molecule can recognize an amino acid. Furthermore, there is modest (~2-fold) greater binding of L-Arg than of D-Arg. (The L-amino acids are those found in modern day proteins.)

(Yarus, Science (1988) 240, 1751-1758: "A specific amino acid binding site composed of RNA".)

-Arg is a competitive inhibitor of guanosine in this reaction. [Guanosine is a required cofactor in splicing of group I introns.] Is the binding of Arg related to the genetic code, or does it simply arise because Arg and guanosine look alike?

-Michel identified the guanosine (and Arg) binding site within the intron. Remarkably, or coincidentally, one strand of the intron at this site is AGA, a codon for Arg. Furthermore, across more than 100 sequenced group I introns this trinucleotide varies only among AGA, CGA, and AGG, all of which are Arg codons!

[Yarus & Christian, Nature (1989) 342, 349-350: "Genetic Code Origins"; Yarus, New Biologist (1991) 3, 183-189: "An RNA-amino acid complex and the origin of the genetic code"; Knight & Landweber (1998) Chemistry & Biology 5, 215: "Rhyme or reason: RNA•Arginine interactions and the genetic code."]
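
{A quick check of the codon observation above. The sketch below builds the standard genetic code table (using the conventional TCAG ordering of the 64 codons) and confirms that the three trinucleotides seen at the guanosine/Arg site are all Arg codons. The function and variable names are illustrative only and do not come from the cited papers. It also reports what fraction of all codons encode Arg, which is the per-trinucleotide chance of hitting an Arg codon if sequences were random; this is a crude baseline, not a statistical analysis of the intron data.}

```python
# Standard genetic code, listed in the conventional TCAG order
# (first base slowest, third base fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_codon(rna_codon):
    """Return the one-letter amino acid for an RNA (or DNA) codon."""
    return CODON_TABLE[rna_codon.upper().replace("U", "T")]


# All codons that encode arginine (one-letter code 'R'), written as RNA.
arg_codons = sorted(
    codon.replace("T", "U") for codon, aa in CODON_TABLE.items() if aa == "R"
)

# Trinucleotides reported at the binding site in the lecture notes.
observed = ["AGA", "CGA", "AGG"]
all_arg = all(translate_codon(c) == "R" for c in observed)

# Fraction of the 64 codons that encode Arg (random-sequence baseline).
arg_fraction = len(arg_codons) / len(CODON_TABLE)

print("Arg codons:", arg_codons)
print("All observed trinucleotides are Arg codons:", all_arg)
print("Fraction of codons encoding Arg:", arg_fraction)
```

{Arg is one of the amino acids with six codons, so the random baseline is not negligible; this is part of why the notes say "remarkably, or coincidentally."}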

Hypothesis: If it happened once, it can happen again.

Test: Szostak is actually trying to create life in a test tube. Even more remarkably, he's had considerable success in reengineering a group I intron into an RNA-dependent RNA polymerase that's made of RNA! (I.e., it's the RNA molecule that's catalyzing the synthesis of another RNA molecule, using a third RNA molecule as a template.)

Doudna, Couture & Szostak, Science (1991) 251, 1605-1608: "A multisubunit ribozyme that is a catalyst of and template for complementary strand RNA synthesis".

New Paradigm: Life began with RNA.

-Not proven, but does help make sense of modern-day biology.

IV. Now, turn, at last, to RNA processing.

Pre-mRNA splicing:

5% of mRNAs in yeast have introns

Essentially all mammalian mRNAs have multiple introns

The Discovery:

-Introns were first inferred from the absence of contiguous open reading frames (ORFs) in sequenced genes. (An ORF is a sequence that can be read into protein without being interrupted by stop codons.)

-The introns were shown to be contained in precursor RNA by comparison of precursor RNA or mRNA hybridized to genomic DNA. The hybrids were visualized by electron microscopy (EM) (Figure 13)

The loop in B shows that the mRNA is missing a sequence that is present in the middle of the precursor RNA. Tilghman et al., Proc. Natl. Acad. Sci. USA (1978) 75, 1309-1313: "The intervening sequence of a mouse β-globin gene is transcribed within the 15S β-globin mRNA precursor."

see also: Perry, J. Cell Biol. (1981) 91, 28s-38s: "RNA processing comes of age."

-Nobel prize in '93 to Sharp & Roberts for the discovery of splicing.
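
{The ORF criterion used in the discovery above can be sketched in a few lines. The example below scans each of the three forward reading frames for the longest run of codons uninterrupted by a stop codon, then compares a toy "mRNA" with the corresponding toy "genomic" sequence that carries an intervening sequence. Everything here is hypothetical: the sequences are invented for illustration, the function names are made up, and real ORF finders also consider start codons and the reverse strand.}

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_stretch(dna, frame):
    """Longest run of consecutive non-stop codons in one reading frame."""
    best = current = 0
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        if codon in STOP_CODONS:
            best = max(best, current)
            current = 0
        else:
            current += 1
    return max(best, current)


def has_contiguous_orf(dna, min_codons):
    """True if any forward frame stays open for at least min_codons codons."""
    return any(longest_open_stretch(dna, f) >= min_codons for f in range(3))


# Invented toy sequences: two 5-codon exons and an 18-nt intervening
# sequence that happens to contain a stop codon in every reading frame.
exon1 = "ATGGCTGCTGCTGCT"
intron = "GTAAGTTAGCCTGACCAG"
exon2 = "GCTGCTGCTGCTGCT"

mrna = exon1 + exon2               # spliced message
genomic = exon1 + intron + exon2   # gene as found in the DNA

print("mRNA has a 10-codon ORF:", has_contiguous_orf(mrna, 10))
print("genomic DNA has a 10-codon ORF:", has_contiguous_orf(genomic, 10))
print("longest open stretch per frame (genomic):",
      [longest_open_stretch(genomic, f) for f in range(3)])
```

{In this toy case the spliced message reads through all ten codons, while the genomic sequence is interrupted in every frame; that is the kind of discrepancy that first suggested intervening sequences.}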

The pathway:

-Pre-mRNA splicing is incredibly complex (Figure 14a&b).

(Some of the early references: Ruskin, Krainer, Maniatis & Green, Cell (1984) 38, 317-331: "Excision of an intact intron as a novel lariat structure during pre-mRNA splicing in vitro"; Brody & Abelson, Science (1985) 228, 963-967: "The spliceosome: yeast pre-mRNA associates with a 40S complex in a splicing dependent reaction"; Frendewey & Keller, Cell (1985) 42, 355-367: "Stepwise assembly of a pre-mRNA splicing complex requires U-snRNPs and specific intron sequences"; Pikielny, Rymond & Rosbash, Nature (1986) 341-345: "Electrophoresis of ribonucleoproteins reveals an ordered assembly pathway of yeast splicing complexes".)

-This process involves >50 proteins, 5 RNA/protein complexes (snRNP's or snurp's), and ATP hydrolysis at several steps. (e.g., see Figure 15)

{Some recent information about steps in the splicing pathway will be discussed in the next lecture and in the literature discussion session.}

The mechanism:

-Protein or RNA catalysis?

-Cf group II intron self-splicing (Figure 16):

Same chemical pathway.

But pre-mRNA splicing differs from group II self-splicing in two fundamental ways:

1. cell extract required (i.e., many factors)

2. ATP dependent.

Hypothesis: Since group II introns and nuclear pre-mRNA introns follow the same chemical pathway in splicing, the two classes of introns are evolutionarily related:

-It has been suggested that nuclear introns arose by invasion from mitochondrial group II introns (via a reverse transcriptase mechanism).

-Is the spliceosome a group II intron in pieces? (Jacquier, TIBS (1990) 15, 351-354: "Self-splicing group II and nuclear introns: how similar are they?")

(Figure 17A & B)

Some pertinent observations:

-Group II introns in pieces and trans splicing of pre-mRNA in Trypanosomes and Euglena may represent intermediate stages in the conversion of group II's to spliceosomes. (Sharp, Science (1991) 254, 663: "Five easy pieces", and references therein.)

-Similarities in the consensus sequences for splice sites and branchpoint (Figure 18).

-Adenosine residue used in branch for both.

-This adenosine occurs as a bulged residue in both secondary structures (Figure 19).

-Domain V of the group II and snRNA U6 are the most conserved phylogenetically, and are therefore reasonable candidates for the catalytic centers.

-Mutagenesis data testing potential base pairs identified by phylogenetic comparisons suggest that there is base pairing between U2 (which is at the branchpoint) and U6, right at the highly conserved region of U6. Thus, it is reasonable that U6 could be catalytic. There is a potential homology in secondary structure to the group II intron (Figure 20).

Madhani & Guthrie (1992) Cell 71, 803-817: "A novel basepairing interaction between U2 and U6 snRNAs suggests a mechanism for the catalytic activation of the spliceosome."

Madhani & Guthrie (1994) Annu. Rev. Genet. 28, 1-26: "Dynamic RNA-RNA interactions in the spliceosome."

-Conservation of ‘active site’ between ‘old’ and newly discovered ‘AT-AC’ spliceosome (Figure 21A & B).

Tarn & Steitz (1996) Cell 84, 801-811: "A novel spliceosome containing U11, U12, and U5 snRNPs excises a minor class (AT-AC) of introns in vitro."

Tarn & Steitz (1996) Science 273, 1824-1832: "Highly diverged U4 and U6 small nuclear RNAs required for splicing rare AT-AC introns."

The evolutionary view: By reconstructing introns and splicing from the RNA world and beyond, we end up with a different view of modern biology than we would had we assumed that life is somehow perfect. For example, it seems reasonable that splicing could, in principle, be accomplished by proteins in the absence of RNA. Furthermore, evolution could not have sampled all possible RNA and protein sequences, so any solution to the problem of life arrived at through evolution must represent a local rather than a global optimum.
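
{The point about sampling sequence space can be made concrete with some back-of-the-envelope arithmetic. The sketch below counts the possible sequences of a given length and estimates the mass of a collection holding one copy of every 100-nucleotide RNA. The chain length of 100, the average nucleotide mass of about 330 Da, and the function names are assumptions chosen for illustration; the numbers are order-of-magnitude estimates only.}

```python
AVG_NT_MASS_DA = 330.0         # approximate average mass of one nucleotide (assumed)
DALTON_IN_GRAMS = 1.66054e-24  # grams per dalton
EARTH_MASS_GRAMS = 5.97e27     # approximate mass of the Earth


def sequence_space(alphabet_size, length):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


def mass_of_one_copy_each(length):
    """Grams of RNA needed to hold one copy of every sequence of this length."""
    per_molecule = length * AVG_NT_MASS_DA * DALTON_IN_GRAMS
    return sequence_space(4, length) * per_molecule


rna_100 = sequence_space(4, 100)       # 4 nucleotides, 100 positions
protein_100 = sequence_space(20, 100)  # 20 amino acids, 100 positions
library_mass = mass_of_one_copy_each(100)

print(f"RNA 100-mers:      {float(rna_100):.2e}")
print(f"Protein 100-mers:  {float(protein_100):.2e}")
print(f"Mass of one copy of each RNA 100-mer: {library_mass:.2e} g")
print(f"That is {library_mass / EARTH_MASS_GRAMS:.1e} Earth masses")
```

{Even for a short 100-mer, a complete RNA library would outweigh the Earth by many orders of magnitude, so evolution can only have explored a tiny corner of sequence space.}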

Bottom line: To understand modern-day biology, we must understand its origins.

A molecular view: Molecular interactions are the language of biology. To understand what biology does and what it can and cannot do, we need to consider biological questions at a molecular level.

-More on this in the next lecture.