CRISPECTOR, a statistical CRISPR genome editing analysis software for research

Amit I, Iancu O, Levy-Jurgenson A, et al. CRISPECTOR provides accurate estimation of genome editing translocation and off-target activity from comparative NGS data. Nat Commun. 2021;12(1):3042.

Citation summary: Amit et al., working with scientists at Integrated DNA Technologies (IDT), have developed a specialized research analysis software, called CRISPECTOR, to analyze NGS reads generated from PCR products amplified from DNA edited in CRISPR genome editing experiments [1]. Previous software programs could not effectively resolve very low CRISPR editing signal from noise and could not adequately perform statistical quantification or detect chromosomal translocations. The CRISPECTOR software can accomplish all three of these research objectives while also, importantly, enabling multiplex amplicon analysis.

Background

Cas9 and other CRISPR-associated endonucleases show varying levels of off-target editing depending on the guide RNA (gRNA) sequence used in a given genome editing research experiment. Even the highest-fidelity Cas9 enzymes available, in combination with the best-designed gRNA sequences, have been shown to cause low levels of off-target editing in some cases. Researchers can nominate sites of off-target effects (OTEs) by a variety of empirical next-generation sequencing (NGS) methods, including GUIDE-seq [2], CIRCLE-seq [3], SITE-seq [4], and DISCOVER-seq [5].

Software tools already exist for analyzing research data obtained with these methods, but until now such tools were not designed to provide intrinsic statistical analysis that separates signal from noise when editing levels are low, nor to detect translocations. Moreover, complex multiplexed PCR setups that amplify many potential CRISPR targets at once require more robust software than the simpler tools designed to analyze a single amplicon.

Experiment

The researchers performed CRISPR genome editing and investigated five on-target sites in addition to 226 off-target sites (nominated by GUIDE-seq [2]) across multiple cell lines, for a total of 1161 nominated sites to be analyzed. The team also designed control samples, which did not undergo CRISPR genome editing. As in previous studies, once the CRISPR genome editing was completed, off-target sites were amplified by multiplex rhAmpSeq PCR, and the amplicons from this multiplex assay were sequenced by NGS. Data generated from this large and complex set of amplicons were then analyzed with CRISPECTOR as described below.

The researchers built the CRISPECTOR software for NGS research data analysis around a sample comparison approach, with the following analytical process. First, all NGS reads are mapped to PCR amplicons. The software then aligns reads from CRISPR-edited and paired control samples to a reference genome and compares sequence variations. Some of the deviations found in CRISPR-edited samples represent genuine editing events, while others reflect problems with the experimental process, such as sequencing errors. To determine which deviations are genuine edits, the authors implemented a Bayesian classification approach that predicts real editing events based on their enrichment in the CRISPR-edited sample relative to the control. The researchers also devised parameters that can be fine-tuned to increase the accuracy of calling true edits in low signal-to-noise experiments (where editing levels are only marginally higher than observed noise).
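The idea of calling an edit from its enrichment over a paired control can be illustrated with a small sketch. This is not the CRISPECTOR classifier; it is a simplified stand-in that assumes a uniform Beta(1, 1) prior on the indel rate of each sample and estimates, by Monte Carlo sampling, the posterior probability that the rate in the edited sample exceeds the rate in the control. The function names, the read counts, and the 0.95 threshold are all hypothetical.

```python
import random


def prob_enriched(edited_hits, edited_total, control_hits, control_total,
                  draws=20000, seed=0):
    """Posterior probability that the edited-sample indel rate exceeds the control rate.

    Assumption: independent Beta(1, 1) priors on both rates, so each posterior is
    Beta(1 + hits, 1 + total - hits). This is an illustrative model only.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    wins = 0
    for _ in range(draws):
        p_edit = rng.betavariate(1 + edited_hits, 1 + edited_total - edited_hits)
        p_ctrl = rng.betavariate(1 + control_hits, 1 + control_total - control_hits)
        if p_edit > p_ctrl:
            wins += 1
    return wins / draws


def call_edit(edited_hits, edited_total, control_hits, control_total,
              threshold=0.95):
    """Call a site edited when the enrichment probability reaches the threshold.

    The threshold is a hypothetical tunable parameter: raising it trades
    sensitivity for fewer false calls in low signal-to-noise data.
    """
    p = prob_enriched(edited_hits, edited_total, control_hits, control_total)
    return p >= threshold


if __name__ == "__main__":
    # Hypothetical counts: reads carrying an indel near the cut site / total reads.
    print(call_edit(60, 10000, 5, 10000))  # clear enrichment over the control
    print(call_edit(5, 10000, 5, 10000))   # indistinguishable from control noise
```

In this toy model the control sample supplies the noise estimate, so a site with the same indel rate in both samples is not called, however many indel reads it contains.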

Results and conclusion

Using CRISPECTOR, the researchers were able not only to sort through NGS data from multiplex PCR-amplified samples but also to identify translocations. This was accomplished by modifying the mapped-read parsing step so that the software identifies reads carrying primer sequences normally associated with different PCR amplicons. Such reads suggest that the primers amplified targets that had become contiguous in the multiplex pool, which would have resulted from a translocation event. As a positive control, synthetic translocation data were generated from an IDT MiniGene construct to demonstrate that the identifications were successful. Additionally, ddPCR was used to confirm some of the translocations identified by CRISPECTOR.
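The primer-mismatch idea described above can be sketched as follows. This is a minimal illustration, not CRISPECTOR's implementation: it assumes exact primer matches at the two ends of each read, and the site names, primer sequences, and function names are invented for the example.

```python
from collections import Counter


def classify_read(read, primers):
    """Return (site matching the read start, site matching the read end).

    `primers` maps a site name to (forward_primer, reverse_primer_as_read), where
    the second element is the sequence expected at the 3' end of a read, that is,
    the reverse complement of the reverse primer. Either result may be None.
    """
    start = next((s for s, (fwd, _) in primers.items() if read.startswith(fwd)), None)
    end = next((s for s, (_, rev) in primers.items() if read.endswith(rev)), None)
    return start, end


def count_translocation_candidates(reads, primers):
    """Count reads whose two primers belong to different amplicons."""
    counts = Counter()
    for read in reads:
        start, end = classify_read(read, primers)
        if start is not None and end is not None and start != end:
            counts[(start, end)] += 1
    return counts


if __name__ == "__main__":
    # Toy primers and reads, invented for illustration only.
    primers = {
        "siteA": ("ACGTAC", "GGATCC"),
        "siteB": ("TTGCAA", "CCTAGG"),
    }
    reads = [
        "ACGTAC" + "TTTT" + "GGATCC",  # ordinary siteA amplicon
        "TTGCAA" + "CCCC" + "CCTAGG",  # ordinary siteB amplicon
        "ACGTAC" + "AAAA" + "CCTAGG",  # siteA start joined to siteB end
    ]
    print(count_translocation_candidates(reads, primers))
```

A real pipeline would need to tolerate sequencing errors in the primers and to separate genuine translocations from PCR artifacts, for example by comparison with the control sample.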

In conclusion, the CRISPECTOR software introduces several key benefits for future CRISPR research. These include statistical comparisons of rhAmpSeq data from CRISPR-cut and control samples, accurate detection of CRISPR edits even when only low levels of CRISPR cutting occur, and calling of translocations. This well thought-out design should make CRISPECTOR a valuable tool for CRISPR research going forward.

CRISPECTOR is freely available for non-commercial research use only at https://github.com/YakhiniGroup/crispector

RUO - IDT products discussed on this page are for Research Use Only. Not for use in diagnostic procedures. Unless otherwise agreed to in writing, IDT does not intend these products to be used in clinical applications and does not warrant their fitness or suitability for any clinical diagnostic use. Purchaser is solely responsible for all decisions regarding the use of these products and any associated regulatory or legal obligations.

References

[1] Amit I, Iancu O, Levy-Jurgenson A, et al. Nat Commun. 2021;12(1):3042.

[2] Tsai SQ, Zheng Z, Nguyen NT, et al. Nat Biotechnol. 2015;33(2):187-197.

[3] Tsai SQ, Nguyen NT, Malagon-Lopez J, et al. Nat Methods. 2017;14(6):607-614.

[4] Cameron P, Settle AH, Fuller CK, et al. Prot Exch. 2017.

[5] Wienert B, Wyman SK, Richardson CD, et al. Science. 2019;364(6437):286-289.

Published Jul 30, 2021