Reversion mutations are secondary mutations that reverse the deleterious effects of an original pathogenic mutation, partially or fully restoring the gene’s function. They are a key mechanism by which cancer cells develop resistance to targeted therapies such as PARP inhibitors, which target DNA damage repair in cancers with BRCA1/2 mutations. Detecting reversion mutations can help explain treatment failure and predict resistance. Monitoring reversions through blood tests (ctDNA) during treatment can offer early warning of acquired resistance.
The revert package detects reversions of a specific pathogenic mutation from BAM files of DNA-seq data. revert locally realigns reads in the flanking windows surrounding the pathogenic mutation, with permissive gap opening for soft-clipped reads and adjustments that account for the pathogenic mutation. It then identifies reversion mutations that restore the open reading frame of the reference gene or the reference sequence. Examples include secondary indels that convert the original frameshift insertion or deletion into in-frame indels, secondary SNVs that restore the codon altered by the original nonsense or missense SNV, indels or SNVs that replace the original pathogenic mutation, and secondary SNVs that create a cryptic splice donor/acceptor site or a cryptic start/stop codon.
The revert package is designed to be applicable to most
types of DNA-seq data such as ctDNA, WES, WGS and targeted amplicon
sequencing (TAS). To start using revert quickly, see the Examples section.
revert takes the following inputs:

- A BAM file containing the aligned reads to be processed (see below for recommendations on BAM file preparation)
- A directory path to which the output files are written
- The reference genome version (hg19, hg38 or mm10), or a FASTA file containing the open reading frames of the reference sequences
- The genomic position of the pathogenic mutation in HGVS-like syntax for a substitution, deletion, insertion, deletion-insertion (delins) or duplication, e.g., “chr13:g.32913778T>G”, “chr13:g.32913319_32913320delTG”, “chr17:g.41244706_41244707insT”, “chr17:g.41244936delinsAA”, “Brca2_5805_wt:117del”
- The gene name and Ensembl transcript ID of the pathogenic mutation, when the reference is hg19, hg38 or mm10

Other parameters and their defaults:

| Parameter | Description | Default |
|---|---|---|
| detection.window | length of the flanking regions added to both ends of the pathogenic mutation locus for detecting reversion mutations | 100 |
| splice.region | length of the splice junction region to be considered in introns | 8 |
| check.soft.clipping | whether to realign soft-clipped reads | TRUE |
| softClippedReads.realign.window | length of the flanking regions added to both ends of the pathogenic mutation locus for realigning soft-clipped reads | 1000 |
| softClippedReads.realign.match | the scoring for a nucleotide match when realigning soft-clipped reads | 1 |
| softClippedReads.realign.mismatch | the scoring for a nucleotide mismatch when realigning soft-clipped reads | 4 |
| softClippedReads.realign.gapOpening | the cost of opening a gap when realigning soft-clipped reads | 6 |
| softClippedReads.realign.gapExtension | the incremental cost incurred along the length of a gap when realigning soft-clipped reads | 0 |
| check.wildtype.reads | whether to process wild-type reads as revertant-to-wildtype reads | FALSE |
| is.paired.end | whether the reads in the BAM file are paired-end (TRUE) or single-end (FALSE) | TRUE |
| keep.duplicate.reads | whether duplicate reads in the BAM file are processed (TRUE) or discarded (FALSE) | TRUE |
| keep.secondary.alignment | whether secondary alignments in the BAM file are processed (TRUE) or discarded (FALSE) | TRUE |
| keep.supplementary.alignment | whether supplementary alignments in the BAM file are processed (TRUE) or discarded (FALSE) | TRUE |
| minimum.mapping.quality | minimum mapping quality required for a read in the BAM file to be processed | 0 |
| verbose | whether to print progress logging to stdout | TRUE |
| out.failed.reads | whether to write the names of failed reads to the ‘.failed_reads.txt’ file | FALSE |
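
As a rough sketch of how the defaults in the table might be overridden, the snippet below builds an argument list and passes it to getReversions(). It assumes that the parameters in the table are accepted as named arguments of getReversions(), which this document implies but does not state outright. The override values (150, 20, FALSE, TRUE) are arbitrary illustrations, not recommendations.

```r
# Required inputs, copied from the frameshift deletion example in this document.
args <- list(
  bam.file      = system.file("extdata", "toy_data_1.bam", package = "revert"),
  out.dir       = tempdir(),
  reference     = "hg19",
  pathog.mut    = "chr13:g.32913319_32913320delTG",
  gene.name     = "BRCA2",
  transcript.id = "ENST00000544455",
  # Overrides of documented defaults. ASSUMPTION: these table entries are
  # named arguments of getReversions(); the values are illustrative only.
  detection.window        = 150,
  minimum.mapping.quality = 20,
  keep.duplicate.reads    = FALSE,
  out.failed.reads        = TRUE
)

# Run only when the package is available, so the sketch is safe to source.
if (requireNamespace("revert", quietly = TRUE)) {
  do.call(revert::getReversions, args)
}
```

Keeping the arguments in a list makes it easy to reuse the same settings across several BAM files by changing only `bam.file` and `pathog.mut`.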
Many state-of-the-art NGS aligners offer clipping modes that improve
alignment accuracy by focusing on the high-confidence, well-aligned part
of a read and either discarding (hard-clipping) or ignoring
(soft-clipping) the unaligned part. Unaligned segments can be caused by
adapters, large indels or translocations; large indels and
translocations may indicate large genomic rearrangements (LGRs) that
partially restore the gene’s function. The revert package realigns
soft-clipped reads in flanking windows surrounding the pathogenic
mutation, with permissive gap opening, to identify such LGR reversions.
To improve sensitivity for reversion detection, it is recommended to
generate the BAM files with standard aligners in soft-clipping mode,
e.g., by enabling -Y for bwa mem or --local for bowtie2.
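
To make the recommendation concrete, the following sketch assembles example alignment commands as strings from R. The file names and the bowtie2 index name are hypothetical placeholders, and the sorting and indexing steps with samtools are an assumption about a typical workflow rather than a requirement stated by revert. Only the -Y and --local options come from the text above.

```r
# Hypothetical input names; replace with your own reference and FASTQ files.
ref       <- "reference.fa"        # FASTA indexed with 'bwa index'
bt2_index <- "reference_index"     # prefix of a bowtie2 index
fq1       <- "sample_R1.fastq.gz"
fq2       <- "sample_R2.fastq.gz"

# ASSUMPTION: coordinate-sort and index the output with samtools.
to_bam <- "| samtools sort -o sample.bam - && samtools index sample.bam"

# bwa mem: -Y uses soft clipping for supplementary alignments.
bwa_cmd <- paste("bwa mem -Y", ref, fq1, fq2, to_bam)

# bowtie2: --local enables local alignment, which soft-clips read ends.
bowtie2_cmd <- paste("bowtie2 --local -x", bt2_index, "-1", fq1, "-2", fq2, to_bam)

# Print the commands; run one in a shell, or from R with system(bwa_cmd).
cat(bwa_cmd, bowtie2_cmd, sep = "\n")
```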
The function getReversions() writes the following result
files to the output directory:
‘.reversions.txt’ contains all reversions identified for the pathogenic mutation from the BAM file.
| Column | Description |
|---|---|
| pathogenic_mutation | the original pathogenic mutation |
| pathogenic_mutation_left_aligned | left-aligned position of the pathogenic mutation if it is an insertion or deletion |
| reversion_id | unique identifier of the reversion |
| reversion_frequency | number of reads carrying the reversion |
| pathogenic_mutation_retained | whether the pathogenic locus retained the original mutation (Yes), acquired a different mutation (No), or reverted to wild type (WT) |
| reversion | the reversion for pathogenic mutation, consisting of one or more mutations |
| reads_total | number of total reads aligned to the pathogenic mutation locus |
| reads_wildtype | number of reads exhibiting wild type at the pathogenic mutation locus |
| reads_withPathogenicMutation | number of reads carrying the pathogenic mutation |
| reads_withReplacementMutation | number of reads carrying a different mutation but not the pathogenic mutation at the pathogenic locus |
| mutations_in_reversion | number of mutations included in the reversion |
‘.split_mutations.txt’ contains information on each individual mutation in a reversion.
| Column | Description |
|---|---|
| reversion_id | unique identifier of a reversion, corresponding to the ‘reversion_id’ in ‘.reversions.txt’ |
| mutation_id | unique identifier of each single mutation in a reversion |
| mutation_type | SNV, INS, DEL, DELINS or WT (self-revertant mutation represented by MT>WT) |
| mutation | genomic position of the mutation in HGVS-like syntax |
| mutation_length_change | length of the reference sequence change caused by the mutation |
| pathogenic_mutation | the original pathogenic mutation |
| distance_to_pathogenic_mutation | distance in reference sequence between the mutation and the pathogenic mutation |
‘.revert_assembly.bam’ contains all reads realigned to the pathogenic mutation. An RG tag is added to each realigned read, assigning it to one of two read groups, ‘Revertant’ or ‘NonRevertant’. The revert-assembled BAM file can be loaded into IGV to visualize reversions.
‘.revert_assembly.bam.bai’ is the index file for ‘.revert_assembly.bam’.
‘.revert_settings.txt’ contains the summary of running parameters and processed reads.
‘.failed_reads.txt’ (optional) contains the names of reads that failed reversion detection.
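
The two text outputs can be combined through their shared ‘reversion_id’ column. The sketch below uses small invented tables in place of real output, so every value in it is made up for illustration; only the column names come from the tables above. It also assumes that ‘mutation_length_change’ is a signed number (negative for deletions), and the commented read.delim() line assumes tab-delimited files with a header row and a hypothetical file prefix.

```r
# Toy stand-in for '.reversions.txt' (invented rows, documented column names).
reversions <- data.frame(
  reversion_id           = c("rev1", "rev2"),
  reversion_frequency    = c(12L, 3L),
  reads_total            = c(200L, 200L),
  mutations_in_reversion = c(1L, 2L),
  stringsAsFactors = FALSE
)

# Toy stand-in for '.split_mutations.txt' (invented rows).
split_mutations <- data.frame(
  reversion_id           = c("rev1", "rev2", "rev2"),
  mutation_id            = c("m1", "m2", "m3"),
  mutation_type          = c("DEL", "INS", "SNV"),
  mutation_length_change = c(-1L, 2L, 0L),
  stringsAsFactors = FALSE
)

# With real output, the toy tables would be replaced by something like:
# reversions <- read.delim(file.path(out.dir, "<prefix>.reversions.txt"))

# Fraction of reads at the locus that carry each reversion.
reversions$read_fraction <- reversions$reversion_frequency / reversions$reads_total

# Net length change per reversion, summed over its component mutations.
net_change <- aggregate(mutation_length_change ~ reversion_id,
                        data = split_mutations, FUN = sum)

# One row per reversion with read support and net length change.
summary_tbl <- merge(reversions, net_change, by = "reversion_id")
print(summary_tbl)
```

Added to the length change of the original pathogenic mutation, the net length change gives a quick way to check whether a frameshift has been brought back in frame.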
Reversion detection for a frameshift deletion

```r
library(revert)

getReversions(
    bam.file = system.file("extdata", "toy_data_1.bam", package = "revert"),
    out.dir = tempdir(),
    reference = "hg19",
    pathog.mut = "chr13:g.32913319_32913320delTG",
    gene.name = "BRCA2",
    transcript.id = "ENST00000544455"
)
```

Reversion detection for a frameshift insertion

```r
getReversions(
    bam.file = system.file("extdata", "toy_data_2.bam", package = "revert"),
    out.dir = tempdir(),
    reference = "hg19",
    pathog.mut = "chr17:g.41244706_41244707insT",
    gene.name = "BRCA1",
    transcript.id = "ENST00000357654"
)
```

Reversion detection for a frameshift deletion-insertion

```r
getReversions(
    bam.file = system.file("extdata", "toy_data_3.bam", package = "revert"),
    out.dir = tempdir(),
    reference = "hg19",
    pathog.mut = "chr17:g.41244936delinsAA",
    gene.name = "BRCA1",
    transcript.id = "ENST00000357654"
)
```

Reversion detection for a nonsense SNV

```r
getReversions(
    bam.file = system.file("extdata", "toy_data_4.bam", package = "revert"),
    out.dir = tempdir(),
    reference = "hg19",
    pathog.mut = "chr13:g.32913778T>G",
    gene.name = "BRCA2",
    transcript.id = "ENST00000544455"
)
```

Reversion detection for a splice-acceptor SNV

```r
getReversions(
    bam.file = system.file("extdata", "toy_data_5.bam", package = "revert"),
    out.dir = tempdir(),
    reference = "hg19",
    pathog.mut = "chr13:g.32928997G>A",
    gene.name = "BRCA2",
    transcript.id = "ENST00000544455"
)
```

Reversion detection for a targeted deletion with customised reference sequence

Development of revert was supported by Breast Cancer Now.