RNA editing with CRISPR-Cas13
Once a disease-causing mutation has been transcribed, it can be difficult to stop. Gene editing techniques that change the DNA sequence don't affect mRNA transcripts that have been released into the cell. A new type of gene editing technique fixes this problem by targeting messenger RNA. These new tools are programmable and can be directed to edit mRNA molecules. But several issues remain: How can we make sure the editor is changing the right base? How can we avoid breaking the molecule when we edit it? And how can we make sure the editors aren't also changing other molecules and causing unintended problems?
Nucleic acid editing holds promise for treating genetic disease, particularly at the RNA level, where disease-relevant sequences can be rescued to yield functional protein products. Type VI CRISPR-Cas systems contain the programmable single-effector RNA-guided ribonuclease Cas13. We profiled type VI systems in order to engineer a Cas13 ortholog capable of robust knockdown and demonstrated RNA editing by using catalytically inactive Cas13 (dCas13) to direct adenosine-to-inosine deaminase activity by ADAR2 (adenosine deaminase acting on RNA type 2) to transcripts in mammalian cells. This system, referred to as RNA Editing for Programmable A to I Replacement (REPAIR), which has no strict sequence constraints, can be used to edit full-length transcripts containing pathogenic mutations. We further engineered this system to create a high-specificity variant and minimized the system to facilitate viral delivery. REPAIR presents a promising RNA-editing platform with broad applicability for research, therapeutics, and biotechnology.
Precise nucleic acid–editing technologies are valuable for studying cellular function and as novel therapeutics. Current editing tools, based on programmable nucleases such as the prokaryotic CRISPR-associated nucleases Cas9 (1–4) or Cpf1 (5), have been widely adopted for mediating targeted DNA cleavage, which in turn drives targeted gene disruption through nonhomologous end joining (NHEJ) or precise gene editing through template-dependent homology-directed repair (HDR) (6). NHEJ uses host machineries that are active in both dividing and post-mitotic cells and provides efficient gene disruption by generating a mixture of insertion or deletion (indel) mutations that can lead to frame shifts in protein-coding genes. HDR, in contrast, is mediated by host machineries whose expression is largely limited to replicating cells. Accordingly, the development of gene-editing capabilities for post-mitotic cells remains a major challenge. DNA base editors, consisting of a fusion between Cas9 nickase and cytidine deaminase, can mediate efficient cytidine-to-uridine conversions within a target window and substantially reduce the formation of double-strand break–induced indels (7, 8). However, the potential targeting sites of DNA base editors are limited by the requirement of Cas9 for a protospacer adjacent motif (PAM) at the editing site (9). Here, we describe the development of a precise and flexible RNA base editing technology using the type VI CRISPR-associated RNA-guided ribonuclease (RNase) Cas13 (10–13).
Cas13 enzymes have two higher eukaryotes and prokaryotes nucleotide-binding (HEPN) endoRNase domains that mediate precise RNA cleavage with a preference for targets with protospacer flanking sites (PFSs) observed biochemically and in bacteria (10, 11). Three Cas13 protein families have been identified to date: Cas13a (previously known as C2c2), Cas13b, and Cas13c (12, 13). We recently reported that Cas13a enzymes can be adapted as tools for nucleic acid detection (14) as well as mammalian and plant cell RNA knockdown and transcript tracking (15), and observed that the biochemical PFS was not required for RNA interference with Cas13a (15). The programmable nature of Cas13 enzymes makes them an attractive starting point to develop tools for RNA binding and perturbation applications.
The adenosine deaminase acting on RNA (ADAR) family of enzymes mediates endogenous editing of transcripts via hydrolytic deamination of adenosine to inosine, a nucleobase that is functionally equivalent to guanosine in translation and splicing (16, 17). There are two functional human ADAR orthologs, ADAR1 and ADAR2, which consist of N-terminal double-stranded RNA–binding domains and a C-terminal catalytic deamination domain. Endogenous target sites of ADAR1 and ADAR2 contain substantial double-stranded identity, and the catalytic domains require duplexed regions for efficient editing in vitro and in vivo (18, 19). The ADAR catalytic domain is capable of deaminating target adenosines without any protein cofactors in vitro (20). ADAR1 has been found to target mainly repetitive regions, whereas ADAR2 mainly targets nonrepetitive coding regions (17). Although ADAR proteins have preferred motifs for editing that could restrict the potential flexibility of targeting, hyperactive mutants, such as ADAR2(E488Q) (21), relax sequence constraints and increase adenosine-to-inosine editing rates. (Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr. In the mutants, other amino acids were substituted at certain locations; for example, E488Q indicates that glutamic acid at position 488 was replaced by glutamine.) ADARs preferentially deaminate adenosines mispaired with cytidine bases in RNA duplexes (22), providing a promising opportunity for precise base editing. Although previous approaches have engineered targeted ADAR fusions via RNA guides (23–26), the specificity of these approaches has not been reported, and their respective targeting mechanisms rely on RNA-RNA hybridization without the assistance of protein partners that may enhance target recognition and stringency.
We assayed a subset of the family of Cas13 enzymes for RNA knockdown activity in mammalian cells and identified the Cas13b ortholog from Prevotella sp. P5-125 (PspCas13b) as the most efficient and specific for mammalian cell applications. We then fused the ADAR2 deaminase domain with the E488Q mutation (ADAR2DD) to catalytically inactive PspCas13b and demonstrated RNA editing for programmable A to I (G) replacement (REPAIR) of reporter and endogenous transcripts as well as disease-relevant mutations. Last, we used a rational mutagenesis scheme to improve the specificity of dCas13b-ADAR2DD fusions in order to generate REPAIRv2 with more than 919-fold higher specificity.
Comprehensive characterization of Cas13 family members in mammalian cells
We previously developed Leptotrichia wadei Cas13a (LwaCas13a) for mammalian knockdown applications, but it required a monomeric superfolder green fluorescent protein (msfGFP) stabilization domain for efficient knockdown, and although the specificity was high, knockdown levels were not consistently below 50% (15). We sought to identify a more robust RNA-targeting CRISPR system by characterizing a genetically diverse set of Cas13 family members in order to assess their RNA knockdown activity in mammalian cells (Fig. 1A). We generated mammalian codon-optimized versions of multiple Cas13 proteins, including 21 orthologs of Cas13a, 15 of Cas13b, and seven of Cas13c, and cloned them into an expression vector with N- and C-terminal nuclear localization signal (NLS) sequences and a C-terminal msfGFP to enhance protein stability (table S1). To assay interference in mammalian cells, we designed a dual-reporter construct expressing the independent Gaussia (Gluc) and Cypridina (Cluc) luciferases under separate promoters, allowing one luciferase to function as a measure of Cas13 interference activity and the other to serve as an internal control. For each Cas13 ortholog, we designed PFS-compatible guide RNAs, using the Cas13b PFS motifs derived from an ampicillin interference assay (fig. S1, table S2, and supplementary text) and the 3′ H (not G) PFS from previous reports of Cas13a activity (10).
We transfected human embryonic kidney (HEK) 293FT cells with Cas13-expression, guide RNA, and reporter plasmids and then quantified levels of Cas13 expression and the targeted Gluc 48 hours later (Fig. 1B and fig. S2A). Testing two guide RNAs for each Cas13 ortholog revealed a range of activity levels, including five Cas13b orthologs with similar or increased interference across both guide RNAs relative to the recently characterized LwaCas13a (Fig. 1B), and we observed only a weak correlation between Cas13 expression and interference activity (fig. S2, B to D). We selected the top five Cas13b orthologs and the top two Cas13a orthologs for further engineering.
We next tested Cas13-mediated knockdown of Gluc without msfGFP to select orthologs that do not require stabilization domains for robust activity. We hypothesized that Cas13 activity could be affected by subcellular localization, as we previously reported for optimization of LwaCas13a (15). Therefore, we tested the interference activity of the seven selected Cas13 orthologs C-terminally fused to one of six different localization tags without msfGFP. Using the luciferase reporter assay, we identified the top three Cas13b designs with the highest level of interference activity: Cas13b from Prevotella sp. P5-125 (PspCas13b) and Cas13b from Porphyromonas gulae (PguCas13b) C-terminally fused to the HIV Rev nuclear export sequence (NES), and Cas13b from Riemerella anatipestifer (RanCas13b) C-terminally fused to the mitogen-activated protein kinase NES (fig. S3A). To further distinguish activity levels of the top orthologs, we compared the three optimized Cas13b constructs with the optimal LwaCas13a-msfGFP fusion and to short hairpin–mediated RNA (shRNA) for their ability to knock down the endogenous KRAS (V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog) transcript by using position-matched guides (fig. S3B). We observed the highest levels of interference for PspCas13b (average knockdown, 62.9%) and thus selected this for further comparison with LwaCas13a.
To more rigorously define the activity of PspCas13b and LwaCas13a, we designed position-matched guides tiling along both Gluc and Cluc transcripts and assayed their activity using our luciferase reporter assay. We tested 93 and 20 position-matched guides targeting Gluc and Cluc, respectively, and found that PspCas13b had consistently increased levels of knockdown relative to LwaCas13a (average of 92.3% for PspCas13b versus 40.1% knockdown for LwaCas13a) (Fig. 1, C and D).
Specificity of Cas13 mammalian interference activity
To characterize the interference specificities of PspCas13b and LwaCas13a, we designed a plasmid library of luciferase targets containing single mismatches and double mismatches throughout the target sequence and the three flanking 5′ and 3′ base pairs (fig. S3C). We transfected HEK293FT cells with either LwaCas13a or PspCas13b, a fixed guide RNA targeting the unmodified target sequence, and the mismatched target library corresponding to the appropriate system. We then performed targeted RNA sequencing (RNA-seq) of uncleaved transcripts in order to quantify depletion of mismatched target sequences. We found that LwaCas13a and PspCas13b had a central region that was relatively intolerant to single mismatches, extending from base pairs 12 to 26 for the PspCas13b target and 13 to 24 for the LwaCas13a target (fig. S3D). Double mismatches were even less tolerated than single mutations, with little knockdown activity observed over a larger window, extending from base pairs 12 to 29 for PspCas13b and 8 to 27 for LwaCas13a in their respective targets (fig. S3E). Additionally, because there are mismatches included in the three nucleotides flanking the 5′ and 3′ ends of the target sequence, we could assess PFS constraints on Cas13 knockdown activity. Sequencing showed that almost all PFS combinations allowed robust knockdown, indicating that a PFS constraint for interference in mammalian cells likely does not exist for either enzyme tested. These results indicate that Cas13a and Cas13b display similar sequence constraints and sensitivities against mismatches.
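The single-mismatch portion of such a target library can be enumerated with a short Python sketch. This is an illustration only: the function name and the toy sequence are hypothetical, and the construction of the actual library (including the double mismatches and the flanking bases) is described in the supplementary materials rather than reproduced here.

```python
BASES = "ACGU"

def single_mismatch_variants(target):
    """Return (position, variant) pairs covering every single-base substitution."""
    variants = []
    for i, original in enumerate(target):
        for base in BASES:
            if base != original:
                # Swap in one alternative base at position i, keep the rest intact.
                variants.append((i, target[:i] + base + target[i + 1:]))
    return variants

# Toy 4-nt target (hypothetical); a sequence of length n yields 3 * n variants.
toy_variants = single_mismatch_variants("ACGU")
print(len(toy_variants))
```

Depletion of each variant relative to the unmodified target could then be tabulated by position to locate mismatch-intolerant regions.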
We next characterized the interference specificity of PspCas13b and LwaCas13a across the mRNA fraction of the transcriptome. We performed transcriptome-wide mRNA sequencing to detect significant differentially expressed genes. LwaCas13a and PspCas13b demonstrated robust knockdown of Gluc (Fig. 1, E and F) and were highly specific compared with a position-matched shRNA, which showed hundreds of off-targets (Fig. 1G), a finding consistent with our previous characterization of LwaCas13a specificity in mammalian cells (15).
Cas13-ADAR fusions enable targeted RNA editing
Given that PspCas13b achieved consistent, robust, and specific knockdown of mRNA in mammalian cells, we envisioned that it could be adapted as an RNA-binding platform to recruit RNA-modifying domains, such as ADARDD, for programmable RNA editing. To engineer a PspCas13b lacking nuclease activity (dPspCas13b, referred to as dCas13b hereafter), we mutated conserved catalytic residues in the HEPN domains and observed loss of luciferase RNA knockdown (fig. S4A). We hypothesized that a dCas13b-ADARDD fusion could be recruited by a guide RNA to target adenosines, with the hybridized RNA creating the required duplex substrate for ADAR activity (Fig. 2A). To enhance target adenosine deamination rates, we introduced two additional modifications to our initial RNA editing design: We introduced a mismatched cytidine opposite the target adenosine, which has been previously reported to increase deamination frequency, and fused dCas13b with the deaminase domains of human ADAR1 or ADAR2 containing hyperactivating mutations in order to enhance catalytic activity [ADAR1DD(E1008Q) (27) or ADAR2DD(E488Q)] (21).
To test the activity of dCas13b-ADARDD, we generated an RNA-editing reporter on Cluc by introducing a nonsense mutation [W85X (UGG→UAG)], which could functionally be repaired to the wild-type codon through A→I editing (Fig. 2B) and then be detected as restoration of Cluc luminescence. We evenly tiled guides with spacers 30, 50, 70, or 84 nucleotides (nt) long across the target adenosine so as to determine the optimal guide placement and design (Fig. 2C). We found that dCas13b-ADAR1DD(E1008Q) required longer guides to repair the Cluc reporter, whereas dCas13b-ADAR2DD(E488Q) was functional with all guide lengths tested (Fig. 2C). We also found that the hyperactive E488Q mutation improved editing efficiency as wild-type ADAR2DD displayed reduced luciferase restoration (fig. S4B). From this demonstration of activity, we chose dCas13b-ADAR2DD(E488Q) for further characterization and designated this system RNA Editing for Programmable A to I Replacement version 1 (REPAIRv1).
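The logic of the W85X reporter can be checked with a minimal Python sketch, assuming the standard genetic code and that inosine is read as guanosine during translation (as noted above for ADAR editing). The helper names are illustrative and are not taken from the study.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate_codon(codon):
    """Translate one RNA codon, reading inosine (I) as guanosine (G)."""
    return CODON_TABLE[codon.replace("I", "G")]

def edit_adenosine(codon, position):
    """Return the codon after A-to-I deamination at a 0-based position."""
    if codon[position] != "A":
        raise ValueError("target base is not adenosine")
    return codon[:position] + "I" + codon[position + 1:]

wild_type = "UGG"                      # tryptophan
mutant = "UAG"                         # premature stop (W85X)
repaired = edit_adenosine(mutant, 1)   # UIG, decoded like UGG
print(translate_codon(wild_type), translate_codon(mutant), translate_codon(repaired))
```

The edited codon is decoded as tryptophan again, which is why restoration of Cluc luminescence can serve as a readout of editing.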
To validate that restoration of luciferase activity was due to bona fide editing events, we directly measured REPAIRv1-mediated editing of Cluc transcripts via reverse transcription and targeted next-generation sequencing. We tested 30- and 50-nt spacers around the target site and found that both guide lengths resulted in the expected A-to-I edit, with 50-nt spacers achieving higher editing percentages (Fig. 2, D and E, and fig. S4C). We also observed that 50-nt spacers had an increased propensity for editing at nontargeted adenosines within the sequencing window, likely because of increased regions of duplexed RNA (Fig. 2E and fig. S4C).
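Because inosine is read as guanosine by reverse transcriptase, an A-to-I edit appears as an A-to-G change in the sequenced cDNA. A percent-editing value at one position could therefore be computed along the following lines; this is a sketch under that assumption, with hypothetical read counts, and it is not the analysis pipeline used in the study.

```python
from collections import Counter

def editing_rate(base_calls):
    """Fraction of A/G base calls at a reference-A position that read as G."""
    counts = Counter(base_calls)
    covered = counts["A"] + counts["G"]
    if covered == 0:
        return None  # no informative reads at this position
    return counts["G"] / covered

# Hypothetical pileup at the target adenosine: 72 unedited and 28 edited reads.
pileup = "A" * 72 + "G" * 28
print(editing_rate(pileup))
```

Applying the same function to every adenosine in the sequencing window would also expose edits at nontargeted positions within the guide:target duplex.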
We next targeted an endogenous gene, PPIB. We designed 50-nt spacers tiling PPIB and found that we could edit the PPIB transcript with up to 28% editing efficiency (fig. S4D). To test whether REPAIR could be further optimized, we modified the linker between dCas13b and ADAR2DD(E488Q) (fig. S4E and table S3) and found that linker choice modestly affected luciferase activity restoration. Additionally, we tested the ability of dCas13b and guide alone to mediate editing events, finding that the ADARDD is required for editing (fig. S5, A to D).
Defining the sequence parameters for RNA editing
Given that we could achieve precise RNA editing at a test site, we wanted to characterize the sequence constraints for programming the system against any RNA target in the transcriptome. Sequence constraints could arise from dCas13b-targeting limitations, such as the PFS, or from ADAR sequence preferences (28). To investigate PFS constraints on REPAIRv1, we designed a plasmid library that carries a series of four randomized nucleotides at the 5′ end of a target site on the Cluc transcript (Fig. 3A). We targeted the center adenosine within either a UAG or AAC motif and found that for both motifs, all PFSs demonstrated detectable levels of RNA editing, with a majority of the PFSs having >50% editing at the target site (Fig. 3B). Next, we sought to determine whether the ADAR2DD in REPAIRv1 had any sequence constraints immediately flanking the targeted base, as has been reported previously for ADAR2DD (28). We tested every possible combination of 5′ and 3′ flanking nucleotides directly surrounding the target adenosine (Fig. 3C) and found that REPAIRv1 was capable of editing all motifs (Fig. 3D). Last, we analyzed whether the identity of the base opposite the target A in the spacer sequence affected editing efficiency and found that an A-C mismatch had the highest luciferase restoration, in agreement with previous reports of ADAR2 activity, with A-G, A-U, and A-A having drastically reduced REPAIRv1 activity (fig. S5E).
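The size of the sequence space probed in these experiments follows directly from the number of randomized positions. The Python sketch below enumerates it; the variable names are illustrative.

```python
from itertools import product

BASES = "ACGU"

# Four randomized nucleotides at the 5' end of the target site: 4**4 sequences.
pfs_library = ["".join(p) for p in product(BASES, repeat=4)]

# Every 5' and 3' neighbor of the target adenosine: 4 * 4 three-base motifs.
flanking_motifs = [five + "A" + three for five, three in product(BASES, repeat=2)]

# Candidate bases placed opposite the target A in the spacer.
opposite_bases = list("CGUA")

print(len(pfs_library), len(flanking_motifs), len(opposite_bases))
```

Both motifs used for the PFS screen, UAG and AAC, are members of `flanking_motifs`.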
Correction of disease-relevant human mutations using REPAIRv1
To demonstrate the broad applicability of the REPAIRv1 system for RNA editing in mammalian cells, we designed REPAIRv1 guides against two disease-relevant mutations: 878G→A (AVPR2W293X) in X-linked nephrogenic diabetes insipidus and 1517G→A (FANCC W506X) in Fanconi anemia. We transfected expression constructs for cDNA of genes carrying these mutations into HEK293FT cells and tested whether REPAIRv1 could correct the mutations. Using guide RNAs containing 50-nt spacers, we were able to achieve 35% correction of AVPR2 and 23% correction of FANCC (Fig. 4, A to D). We then tested the ability of REPAIRv1 to correct 34 different disease-relevant G→A mutations (table S4) and found that we were able to achieve substantial editing at 33 sites with up to 28% editing efficiency (Fig. 4E). The mutations we chose are only a fraction of the pathogenic G-to-A mutations (5739) in the ClinVar database, which also includes an additional 11,943 G-to-A variants (Fig. 4F and fig. S6). Because there are no strict sequence constraints (Fig. 3), REPAIRv1 is capable of potentially editing all of these disease-relevant mutations, especially given that we observed editing regardless of the target motif (Figs. 3C and 4G).
Delivering the REPAIRv1 system to diseased cells is a prerequisite for therapeutic use, and we therefore sought to design REPAIRv1 constructs that could be packaged into therapeutically relevant viral vectors, such as adeno-associated viral (AAV) vectors. AAV vectors have a packaging limit of 4.7 kb, which cannot accommodate the large size of dCas13b-ADARDD [4473 base pairs (bp)] along with promoter and expression regulatory elements. To reduce the size, we tested a variety of N-terminal and C-terminal truncations of dCas13 fused to ADAR2DD(E488Q) for RNA-editing activity. We found that all C-terminal truncations tested were still functional and able to restore luciferase signal (fig. S7), and the largest truncation, C-terminal Δ984–1090 (total size of the fusion protein, 4152 bp) was small enough to fit within the packaging limit of AAV vectors.
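The size figures quoted above can be reconciled with a few lines of arithmetic. The Python sketch below assumes that Δ984–1090 denotes an inclusive range of amino acid residues and that each residue corresponds to three base pairs of coding sequence; under those assumptions it reproduces the reported 4152 bp.

```python
AAV_PACKAGING_LIMIT_BP = 4700   # 4.7 kb, as stated in the text
FULL_FUSION_BP = 4473           # dCas13b-ADARDD

# Assumption: residues 984 through 1090 inclusive are deleted.
deleted_residues = 1090 - 984 + 1
deleted_bp = deleted_residues * 3
truncated_bp = FULL_FUSION_BP - deleted_bp

# Space left for promoter and regulatory elements in each case.
headroom_full = AAV_PACKAGING_LIMIT_BP - FULL_FUSION_BP
headroom_truncated = AAV_PACKAGING_LIMIT_BP - truncated_bp

print(deleted_residues, deleted_bp, truncated_bp, headroom_full, headroom_truncated)
```

The truncation frees 321 bp, raising the space available for promoter and regulatory elements from 227 bp to 548 bp.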
Transcriptome-wide specificity of REPAIRv1
Although RNA knockdown with PspCas13b was highly specific in our luciferase tiling experiments, we observed off-target adenosine editing within the guide:target duplex (Fig. 2E). To see whether this was a widespread phenomenon, we tiled an endogenous transcript, KRAS, and measured the degree of off-target editing near the target adenosine (Fig. 5A). We found that for KRAS, although the on-target editing rate was 23%, there were many sites around the target site that also had detectable A-to-I edits (Fig. 5B).
Because of the observed off-target editing within the guide:target duplex, we initially evaluated transcriptome-wide off-targets by performing RNA-seq on all mRNAs with 12.5x coverage. Of all the editing sites across the transcriptome, the on-target editing site had the highest editing rate, with 89% A-to-I conversion. We also found that there was a substantial number of A-to-I off-target events, with 1732 off-targets in the targeting guide condition and 925 off-targets in the nontargeting guide condition, with 828 off-targets shared between the targeting and nontargeting guide conditions (Fig. 5, C and D). Given the high number of overlapping off-targets between the targeting and nontargeting guide conditions, we reasoned that the off-targets may arise from ADARDD. To test this hypothesis, we repeated the Cluc-targeting experiment, this time comparing transcriptome changes for REPAIRv1 with a targeting guide, REPAIRv1 with a nontargeting guide, REPAIRv1 alone, or ADARDD(E488Q) alone (fig. S8). We found differentially expressed genes and off-target editing events in each condition (fig. S8, B and C). There was a high degree of overlap in the off-target editing events between ADARDD(E488Q) and all REPAIRv1 off-target edits, supporting the hypothesis that REPAIR off-target edits are driven by dCas13b-independent ADARDD(E488Q) editing events (fig. S8D).
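The degree of overlap between the two guide conditions can be quantified from the counts reported above. The following Python sketch uses those three numbers; the function name and the choice of summary statistics are illustrative and are not taken from the study.

```python
def overlap_summary(n_targeting, n_nontargeting, n_shared):
    """Summarize the overlap between two sets of off-target sites from their sizes."""
    union = n_targeting + n_nontargeting - n_shared
    return {
        "shared_fraction_of_targeting": n_shared / n_targeting,
        "shared_fraction_of_nontargeting": n_shared / n_nontargeting,
        "jaccard_index": n_shared / union,
        "targeting_only": n_targeting - n_shared,
    }

# Counts reported in the text: 1732 targeting, 925 nontargeting, 828 shared.
summary = overlap_summary(1732, 925, 828)
for key, value in summary.items():
    print(key, round(value, 3))
```

Roughly 90% of the off-targets seen with the nontargeting guide also appear with the targeting guide, which is the observation that motivates attributing them to ADARDD rather than to guide-directed binding.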
Next, we sought to compare two RNA-guided ADAR systems that have been described previously (fig. S9A). The first uses a fusion of ADAR2DD to the small viral protein lambda N (λN), which binds to the BoxB-λ RNA hairpin (24). A guide RNA with double BoxB-λ hairpins guides ADAR2DD(E488Q) to edit sites encoded in the guide RNA (25). The second design uses full-length ADAR2 (ADAR2) and a guide RNA with a hairpin that the double-strand RNA (dsRNA)–binding domains (dsRBDs) of ADAR2 recognize (23, 26). We analyzed the editing efficiency of these two systems compared with REPAIRv1 and found that the BoxB-ADAR2 and full-length ADAR2 systems demonstrated 50 and 34.5% editing rates, respectively, compared with the 89% editing rate achieved by REPAIRv1 (fig. S9, B to E). Additionally, the BoxB and full-length ADAR2 systems created 1814 and 66 observed off-targets, respectively, in the targeting guide conditions, compared with the 2111 off-targets in the REPAIRv1 targeting guide condition. All the conditions with the two ADAR2DD-based systems (REPAIRv1 and BoxB) showed a high percentage of overlap in their off-targets, whereas the full-length ADAR2 system had a largely distinct set of off-targets (fig. S9F). The overlap in off-targets between the targeting and nontargeting conditions and between REPAIRv1 and BoxB conditions suggests that ADAR2DD drives off-targets independent of dCas13 targeting (fig. S9F).
Improving specificity of REPAIR through rational protein engineering
To improve the specificity of REPAIRv1, we used structure-guided protein engineering of ADAR2DD(E488Q). Because of the guide-independent nature of the off-targets, we hypothesized that destabilizing ADAR2DD(E488Q)–RNA binding would selectively decrease off-target editing, but maintain on-target editing because of increased local concentration from dCas13b tethering of ADAR2DD(E488Q) to the target site. We mutated residues in ADAR2DD(E488Q) previously determined to contact the duplex region of the target RNA (Fig. 6A) (19). To assess efficiency and specificity, we tested 17 single mutants with both targeting and nontargeting guides, under the assumption that background luciferase restoration in the nontargeting condition would be indicative of broader off-target activity. We found that mutations at the selected residues had substantial effects on the luciferase activity for targeting and nontargeting guides (Fig. 6, A and B, and fig. S10A). A majority of mutants either significantly improved the luciferase activity for the targeting guide or increased the ratio of targeting to nontargeting guide activity, which we termed the specificity score (Fig. 6, A and B).
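The specificity score defined above is a simple ratio and can be written out as a short Python sketch. The variant names and luciferase values below are hypothetical placeholders chosen for illustration; they are not measurements from the study.

```python
def specificity_score(targeting_signal, nontargeting_signal):
    """Ratio of luciferase restoration with a targeting versus a nontargeting guide."""
    if nontargeting_signal <= 0:
        raise ValueError("nontargeting signal must be positive")
    return targeting_signal / nontargeting_signal

# Hypothetical (targeting, nontargeting) luciferase readings for two variants.
hypothetical_mutants = {
    "variant_1": (0.60, 0.30),
    "variant_2": (0.45, 0.05),
}

# Rank variants from most to least specific.
ranked = sorted(
    hypothetical_mutants,
    key=lambda name: specificity_score(*hypothetical_mutants[name]),
    reverse=True,
)
print(ranked)
```

In this toy example the variant with lower on-target signal still ranks first because its background activity is much lower, which illustrates why efficiency and specificity were assessed together.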
We selected a subset of these mutants (Fig. 6B) for transcriptome-wide specificity profiling by next-generation sequencing. As expected, off-targets measured from transcriptome-wide sequencing correlated with our specificity score (fig. S10B) for mutants. We found that with the exception of ADAR2DD(E488Q/R455E), all sequenced REPAIRv1 mutants could effectively edit the reporter transcript (Fig. 6C), with many mutants showing reduction in the number of off-targets (Fig. 6C and figs. S10C and S11). We further explored motifs surrounding off-targets for the various specificity mutants and found that REPAIRv1 and most of the engineered variants exhibited a strong 3′ G preference for their off-target edits, which is in agreement with the characterized ADAR2 motif (fig. S12A) (28).
We focused on the mutant ADAR2DD(E488Q/T375G), because it had the highest percent editing of the four mutants with the lowest numbers of transcriptome-wide off-targets, and termed it REPAIRv2. Compared with REPAIRv1, REPAIRv2 exhibited increased specificity, with a reduction from 18,385 to 20 transcriptome-wide off-targets with high-coverage sequencing (125x coverage, 10 ng of REPAIR vector transfected) (Fig. 6D). In the region surrounding the targeted adenosine in Cluc, REPAIRv2 also had reduced off-target editing, visible in sequencing traces (Fig. 6E). In motifs derived from the off-target sites, REPAIRv1 presented a strong preference toward 3′ G but showed off-target edits for all motifs (fig. S12B); by contrast, REPAIRv2 only edited the strongest off-target motifs (fig. S12C). The distribution of edits on transcripts was heavily skewed for REPAIRv1, with highly edited genes having more than 60 edits (fig. S13A), whereas REPAIRv2 only edited one transcript (EEF1A1) multiple times (fig. S13B). REPAIRv1 off-target edits were predicted to result in numerous variants, including 1000 missense base changes (fig. S13C), with 93 events in genes related to cancer processes (fig. S13D). In contrast, REPAIRv2 only had six predicted missense changes (fig. S13E), none of which were in cancer-related genes (fig. S13F). Analysis of the sequence surrounding off-target edits for REPAIRv1 or -v2 did not reveal homology to guide sequences, suggesting that off-targets are likely dCas13b-independent (fig. S14), which is consistent with the high overlap of off-targets between REPAIRv1 and the ADAR2 deaminase domain (fig. S8D). To directly compare REPAIRv2 with other programmable ADAR systems, we repeated our Cluc-targeting experiments with all systems at two different dosages of ADAR vector, finding that REPAIRv2 had comparable on-target editing with that of BoxB and ADAR2 but with substantially fewer off-target editing events at both dosages (fig. S15). REPAIRv2 had enhanced specificity compared with REPAIRv1 at both dosages (fig. S15B), a finding that also extended to two guides targeting distinct sites on PPIB (fig. S16, A to D). It is also worth noting that in general, the lower-dosage condition (10 ng REPAIR vector) had fewer off-targets than that of the higher dosage condition (150 ng REPAIR vector) (fig. S5).
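The fold improvement in specificity follows from the two off-target counts measured at high coverage. A minimal Python check, with illustrative names:

```python
def fold_reduction(before, after):
    """Fold reduction in off-target count; requires a nonzero count after engineering."""
    if after <= 0:
        raise ValueError("count after engineering must be positive")
    return before / after

repairv1_off_targets = 18385   # REPAIRv1, 125x coverage, 10 ng vector
repairv2_off_targets = 20      # REPAIRv2, same conditions

fold = fold_reduction(repairv1_off_targets, repairv2_off_targets)
print(fold)
```

The ratio is 919.25, which corresponds to the "more than 919-fold" figure quoted for REPAIRv2.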
To assess editing specificity with greater sensitivity, we sequenced the low-dosage condition (10 ng of transfected DNA) of REPAIRv1 and v2 at much higher sequencing depth (125x coverage of the transcriptome). Increased numbers of off-targets were found at higher sequencing depths corresponding to detection of rarer off-target events (fig. S17). Furthermore, we speculated that different transcriptome states could also potentially alter the number of off-targeting events. Therefore, we tested REPAIRv2 activity in the osteosarcoma U2OS cell line, observing six and seven off-targets for the targeting and nontargeting guide, respectively (fig. S18).
We targeted REPAIRv2 to endogenous genes to test whether the specificity-enhancing mutations reduced nearby edits in target transcripts while maintaining high-efficiency on-target editing. For guides targeting either KRAS or PPIB, we found that REPAIRv2 had no detectable off-target edits, unlike REPAIRv1, and could effectively edit the on-target adenosine at efficiencies of 27.1% (KRAS) or 13% (PPIB) (Fig. 6F). This specificity extended to additional target sites, including regions that demonstrate high levels of background in nontargeting conditions for REPAIRv1, such as other KRAS or PPIB target sites (fig. S19). Overall, REPAIRv2 eliminated off-targets in duplexed regions around the edited adenosine and showed dramatically enhanced transcriptome-wide specificity.
We show here that the RNA-guided RNA-targeting type VI-B CRISPR effector Cas13b is capable of highly efficient and specific RNA knockdown, providing the basis for improved tools for interrogating essential genes and noncoding RNA as well as controlling cellular processes at the transcript level. Catalytically inactive Cas13b (dCas13b) retains programmable RNA-binding capability, which we leveraged here by fusing dCas13b to the adenosine deaminase domain of ADAR2 to achieve precise A-to-I edits, a system we term REPAIRv1. Further engineering of the system produced REPAIRv2, which has dramatically higher specificity than previously described RNA-editing platforms (25, 29) while maintaining high levels of on-target efficacy.
Although Cas13b exhibits high fidelity, our initial results with dCas13b-ADAR2DD(E488Q) fusions revealed a substantial number of off-target RNA editing events. To address this, we used a rational mutagenesis strategy to vary the ADAR2DD residues that contact the RNA duplex, identifying a variant, ADAR2DD(E488Q/T375G), that is capable of precise, efficient, and highly specific editing when fused to dCas13b. Editing efficiency with this variant was comparable with or better than that achieved with two currently available systems, BoxB-ADAR2DD(E488Q) or ADAR2 editing. Moreover, the REPAIRv2 system created only 20 observable off-targets in the whole transcriptome, which is at least an order of magnitude better than both alternative editing technologies. Although it is possible that ADAR could deaminate adenosine bases on the DNA strand in RNA-DNA heteroduplexes (20), it is unlikely to do so in this case because Cas13b does not bind DNA efficiently and because REPAIR is cytoplasmically localized. Additionally, the lack of homology of off-target sites to the guide sequence and the strong overlap of off-targets with the ADARDD(E488Q)–only condition suggest that off-targets are not mediated by off-target guide binding. Deeper sequencing and novel inosine enrichment methods could further refine our understanding of REPAIR specificity in the future.
The REPAIR system offers many advantages compared with other nucleic acid–editing tools. First, the exact target site can be encoded in the guide by placing a cytidine within the guide across from the desired adenosine to create a favorable A-C mismatch ideal for ADAR-editing activity. Second, Cas13 has no targeting sequence constraints, such as a PFS or PAM, and no motif preference surrounding the target adenosine, allowing any adenosine in the transcriptome to be potentially targeted with the REPAIR system. The lack of motif for ADAR editing, in contrast with previous literature, is likely due to the increased local concentration of REPAIR at the target site owing to dCas13b binding. DNA base editors can target either the sense or antisense strand, whereas the REPAIR system is limited to transcribed sequences, constraining the total number of possible editing sites. However, because of the less constrained nature of targeting with REPAIR, this system can effect more edits within ClinVar (Fig. 4C) than Cas9-DNA base editors. Third, the REPAIR system directly deaminates target adenosines to inosines and does not rely on endogenous repair pathways to generate desired editing outcomes. Therefore, REPAIR should be able to mediate efficient RNA editing even in post-mitotic cells such as neurons. Fourth, in contrast to DNA editing, RNA editing is transient and can be more easily reversed, allowing the potential for temporal control over editing outcomes. The transient nature of REPAIR-mediated edits will likely be useful for treating diseases caused by temporary changes in cell state, such as local inflammation, and could also be used to treat disease by modifying the function of proteins involved in disease-related signal transduction. For instance, REPAIR editing would allow the recoding of some serine, threonine, and tyrosine residues that are the targets of kinases (fig. S20). 
Phosphorylation of these residues in disease-relevant proteins affects disease progression for many disorders, including Alzheimer’s disease and multiple other neurodegenerative conditions (30). REPAIR might also be used to transiently or even chronically change the sequence of expressed, risk-modifying G-to-A variants so as to decrease the chance of entering a disease state for patients. For instance, REPAIR could be used to functionally mimic A-to-G alleles of IFIH1 that protect against autoimmune disorders such as type I diabetes, immunoglobulin A deficiency, psoriasis, and systemic lupus erythematosus (31, 32).
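The first advantage listed above, encoding the target site by placing a cytidine in the guide across from the desired adenosine, can be illustrated with a minimal sketch. The function name `design_guide` and the parameters `guide_length` and `target_offset` are hypothetical and chosen only for illustration; they are not values or tools reported here, and the sketch ignores everything except base pairing (for example, the direct repeat and any effects of guide length or mismatch position on editing efficiency).

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def design_guide(transcript, target_index, guide_length=30, target_offset=15):
    """Return a guide spacer (5'->3') with a C opposite the target adenosine.

    transcript    : target RNA written 5'->3' (DNA-style T is read as U)
    target_index  : 0-based position of the adenosine to edit
    guide_length  : hypothetical spacer length (illustrative default)
    target_offset : hypothetical 0-based position of the target adenosine
                    within the transcript window covered by the guide
    """
    transcript = transcript.upper().replace("T", "U")
    if not 0 <= target_offset < guide_length:
        raise ValueError("target_offset must fall inside the guide")
    if not 0 <= target_index < len(transcript):
        raise ValueError("target_index is outside the transcript")
    if transcript[target_index] != "A":
        raise ValueError("target base must be an adenosine")

    start = target_index - target_offset
    end = start + guide_length
    if start < 0 or end > len(transcript):
        raise ValueError("guide window runs off the transcript")

    window = transcript[start:end]
    # The reverse complement of the window is the perfectly matched guide.
    guide = [COMPLEMENT[base] for base in reversed(window)]
    # Guide position that pairs with the target adenosine.
    mismatch_index = guide_length - 1 - target_offset
    # Replace the matching U with C to create the A-C mismatch.
    guide[mismatch_index] = "C"
    return "".join(guide)
```

For example, `design_guide("GCUAGCAUCGGA", 6, guide_length=7, target_offset=3)` covers the window `AGCAUCG` and returns its reverse complement with a single substitution opposite the target adenosine.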
The REPAIR system provides multiple opportunities for additional engineering. Cas13b possesses pre–CRISPR-RNA (crRNA) processing activity (13), allowing for multiplex editing of multiple variants—any one of which alone may not affect disease, but together might have additive effects and disease-modifying potential. Extension of our rational design approach, such as combining promising mutations and directed evolution, could further increase the specificity and efficiency of the system, while unbiased screening approaches could identify additional residues for improving REPAIR activity and specificity.
Currently, the base conversions achievable by REPAIR are limited to generating inosine from adenosine; additional fusions of dCas13 with other catalytic RNA editing domains, such as APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like), could enable cytidine-to-uridine editing. Additionally, mutagenesis of ADAR could relax the substrate preference to target cytidine, allowing for the enhanced specificity conferred by the duplexed RNA substrate requirement to be exploited by C-to-U editors. Adenosine-to-inosine editing on DNA substrates may also be possible with catalytically inactive DNA-targeting CRISPR effectors, such as dCas9 or dCpf1, either through formation of DNA-RNA heteroduplex targets (20) or mutagenesis of the ADAR domain.
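To make the scope of these base conversions concrete, the following sketch enumerates which amino acid changes a single-base edit can produce under the standard genetic code. It assumes that inosine is read as guanosine during translation, so an A-to-I edit is modeled as A-to-G; the function name `recoding_outcomes` is hypothetical, and the sketch considers only the coding consequence of a single edited base, not editing efficiency or sequence context.

```python
from itertools import product

# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def recoding_outcomes(amino_acid, edit_from="A", edit_to="G"):
    """List (codon, edited_codon, new_amino_acid) for nonsynonymous edits.

    Each codon encoding `amino_acid` is edited at one position at a time,
    replacing `edit_from` with `edit_to`; synonymous results are skipped.
    The defaults model A-to-I editing, with inosine read as G (assumption).
    """
    outcomes = []
    for codon, aa in CODON_TABLE.items():
        if aa != amino_acid:
            continue
        for pos, base in enumerate(codon):
            if base != edit_from:
                continue
            edited = codon[:pos] + edit_to + codon[pos + 1:]
            new_aa = CODON_TABLE[edited]
            if new_aa != amino_acid:
                outcomes.append((codon, edited, new_aa))
    return outcomes


if __name__ == "__main__":
    # Kinase-target residues: serine (S), threonine (T), tyrosine (Y).
    for residue in "STY":
        print(residue, recoding_outcomes(residue))
```

Calling `recoding_outcomes("S", edit_from="C", edit_to="U")` applies the same enumeration to a hypothetical C-to-U editor, which illustrates how a second deaminase activity would reach codon changes that A-to-I editing cannot.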
We have demonstrated the use of the PspCas13b enzyme as both an RNA-knockdown and RNA-editing tool. The dCas13b platform for programmable RNA binding has many applications, including live transcript imaging, splicing modification, targeted localization of transcripts, pulldown of RNA-binding proteins, and epitranscriptomic modifications. We used dCas13 to create REPAIR, adding to the existing suite of nucleic acid–editing technologies. REPAIR provides a new approach for treating genetic disease or mimicking protective alleles and establishes RNA editing as a useful tool for modifying genetic function.
Materials and Methods
Figs. S1 to S20
Tables S1 to S9
This is an article distributed under the terms of the Science Journals Default License.
REFERENCES AND NOTES
1. P. D. Hsu, E. S. Lander, F. Zhang, Cell 157, 1262–1278 (2014).
2. A. C. Komor, A. H. Badran, D. R. Liu, Cell 168, 20–36 (2017).
3. L. Cong et al., Science 339, 819–823 (2013).
4. P. Mali et al., Science 339, 823–826 (2013).
5. B. Zetsche et al., Cell 163, 759–771 (2015).
6. H. Kim, J. S. Kim, Nat. Rev. Genet. 15, 321–334 (2014).
7. A. C. Komor, Y. B. Kim, M. S. Packer, J. A. Zuris, D. R. Liu, Nature 533, 420–424 (2016).
8. K. Nishida et al., Science 353, aaf8729 (2016).
9. Y. B. Kim et al., Nat. Biotechnol. 35, 371–376 (2017).
10. O. O. Abudayyeh et al., Science 353, aaf5573 (2016).
11. S. Shmakov et al., Mol. Cell 60, 385–397 (2015).
12. S. Shmakov et al., Nat. Rev. Microbiol. 15, 169–182 (2017).
13. A. A. Smargon et al., Mol. Cell 65, 618–630.e7 (2017).
14. J. S. Gootenberg et al., Science 356, 438–442 (2017).
15. O. O. Abudayyeh et al., Nature 550, 280–284 (2017).
16. K. Nishikura, Annu. Rev. Biochem. 79, 321–349 (2010).
17. M. H. Tan et al., Nature 550, 249–254 (2017).
18. B. L. Bass, H. Weintraub, Cell 55, 1089–1098 (1988).
19. M. M. Matthews et al., Nat. Struct. Mol. Biol. 23, 426–433 (2016).
20. Y. Zheng, C. Lorenzo, P. A. Beal, Nucleic Acids Res. 45, 3369–3377 (2017).
21. A. Kuttan, B. L. Bass, Proc. Natl. Acad. Sci. U.S.A. 109, E3295–E3304 (2012).
22. S. K. Wong, S. Sato, D. W. Lazinski, RNA 7, 846–858 (2001).
23. M. Fukuda et al., Sci. Rep. 7, 41478 (2017).
24. M. F. Montiel-Gonzalez, I. Vallecillo-Viejo, G. A. Yudowski, J. J. Rosenthal, Proc. Natl. Acad. Sci. U.S.A. 110, 18285–18290 (2013).
25. M. F. Montiel-González, I. C. Vallecillo-Viejo, J. J. Rosenthal, Nucleic Acids Res. 44, e157 (2016).
26. J. Wettengel, P. Reautschnig, S. Geisler, P. J. Kahle, T. Stafforst, Nucleic Acids Res. 45, 2797–2808 (2017).
27. Y. Wang, J. Havel, P. A. Beal, ACS Chem. Biol. 10, 2512–2519 (2015).
28. K. A. Lehmann, B. L. Bass, Biochemistry 39, 12875–12884 (2000).
29. T. Stafforst, M. F. Schneider, Angew. Chem. Int. Ed. Engl. 51, 11166–11169 (2012).
30. C. Ballatore, V. M. Lee, J. Q. Trojanowski, Nat. Rev. Neurosci. 8, 663–672 (2007).
31. Y. Li et al., J. Invest. Dermatol. 130, 2768–2772 (2010).
32. R. C. Ferreira et al., Nat. Genet. 42, 777–780 (2010).
For references 33–38, please see the Supplementary Materials section.