SNP to Amino Acid Change Using R


I have a set of single nucleotide changes such as SNPs or missense mutations that I want to annotate in terms of the gene affected and amino acid residue change.

Take, for example, a T>C change on chromosome 14 at position 73637755 (assembly GRCh37): http://www.ncbi.nlm.nih.gov/clinvar/variation/18145/

One way to annotate this SNP is with the myvariant.info API: myvariant.info/v1/query?q=chr14%3A73637755-73637755
We can then parse the returned JSON to confirm that the affected gene is PSEN1 and that the amino acid change is L to P (leucine to proline).
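
The same query can also be issued from R. The following is a minimal sketch, assuming the jsonlite package is installed and the service is reachable. The helper names build_query_url and fetch_variant are hypothetical (they are not part of any package), and nothing is assumed about the layout of the returned fields, so inspect the result with str() before extracting anything.

```r
# Build the myvariant.info query URL for a single genomic position.
# build_query_url() is a hypothetical helper written for this example.
build_query_url <- function(chrom, pos,
                            base = "https://myvariant.info/v1/query") {
  region <- sprintf("%s:%d-%d", chrom, as.integer(pos), as.integer(pos))
  # reserved = TRUE percent-encodes the ':' in the region string
  paste0(base, "?q=", utils::URLencode(region, reserved = TRUE))
}

# Fetch and parse the response (hypothetical helper; needs jsonlite and network access).
fetch_variant <- function(chrom, pos) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) {
    stop("Package 'jsonlite' is required for this sketch")
  }
  jsonlite::fromJSON(build_query_url(chrom, pos), simplifyVector = FALSE)
}

url <- build_query_url("chr14", 73637755)
print(url)

# Only hit the network in an interactive session.
if (interactive()) {
  res <- fetch_variant("chr14", 73637755)
  str(res, max.level = 2)  # explore the structure before pulling out fields
}
```

Keeping the URL construction separate from the fetch makes it easy to loop over many positions and to check the query string without touching the network.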

We can also get the same annotation in R using Bioconductor packages built on hg19, the UCSC name for GRCh37:

# Load relevant libraries
library(BSgenome.Hsapiens.UCSC.hg19)  # Full genome sequences for Homo sapiens (UCSC version hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)  # Known gene annotation data (UCSC version hg19) stored as TxDb objects
library(VariantAnnotation)  # Provides predictCoding() for annotating coding variants

# Encode the SNP
gr <- GRanges(seqnames='chr14', IRanges(73637755, width=1))
alt <- DNAStringSet(x = 'C')

# Annotate
annot <- predictCoding(
	query = gr,
	varAllele = alt,
	subject = TxDb.Hsapiens.UCSC.hg19.knownGene, 
	seqSource = BSgenome.Hsapiens.UCSC.hg19)

The resulting output conveys the same general information, with one row per overlapping transcript (TXID), so the residue number differs between isoforms (113 or 109). Staying within R can be convenient for downstream analyses:

GRanges object with 4 ranges and 12 metadata columns:
      seqnames               ranges strand |      varAllele     CDSLOC    PROTEINLOC   QUERYID        TXID     CDSID      GENEID   CONSEQUENCE       REFCODON       VARCODON         REFAA         VARAA
  [1]    chr14 [73637755, 73637755]      + |              C [338, 338]           113         1       51630    152942        5663 nonsynonymous            CTA            CCA             L             P
  [2]    chr14 [73637755, 73637755]      + |              C [338, 338]           113         1       51631    152942        5663 nonsynonymous            CTA            CCA             L             P
  [3]    chr14 [73637755, 73637755]      + |              C [326, 326]           109         1       51632    152942        5663 nonsynonymous            CTA            CCA             L             P
  [4]    chr14 [73637755, 73637755]      + |              C [326, 326]           109         1       51633    152942        5663 nonsynonymous            CTA            CCA             L             P
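
For downstream use it can help to collapse these columns into a compact protein-level label such as L113P. Below is a minimal sketch in base R. The helper aa_change_label is hypothetical, and the input vectors are copied by hand from the output above. With a real annot object one would presumably pass as.character(annot$REFAA), unlist(annot$PROTEINLOC) and as.character(annot$VARAA), assuming each row maps to a single protein position. The optional last step, which maps the Entrez GENEID to a gene symbol, assumes the Bioconductor package org.Hs.eg.db is installed and is skipped otherwise.

```r
# Combine reference residue, protein position and variant residue
# into labels such as "L113P". Hypothetical helper for illustration.
aa_change_label <- function(ref_aa, protein_loc, var_aa) {
  stopifnot(length(ref_aa) == length(protein_loc),
            length(ref_aa) == length(var_aa))
  paste0(ref_aa, protein_loc, var_aa)
}

# Values copied from the REFAA, PROTEINLOC and VARAA columns above.
ref_aa      <- c("L", "L", "L", "L")
protein_loc <- c(113L, 113L, 109L, 109L)
var_aa      <- c("P", "P", "P", "P")

# One label per distinct protein-level change across transcripts.
labels <- unique(aa_change_label(ref_aa, protein_loc, var_aa))
print(labels)

# Optional: translate the Entrez gene ID (GENEID column) to a gene symbol.
# Assumes org.Hs.eg.db is installed; silently skipped if it is not.
if (requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  symbol <- AnnotationDbi::mapIds(org.Hs.eg.db::org.Hs.eg.db,
                                  keys = "5663",
                                  column = "SYMBOL",
                                  keytype = "ENTREZID")
  print(symbol)
}
```

Working on plain character and integer vectors keeps the helper independent of the Bioconductor classes, so it can be reused on any table that carries these three columns.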

So, what do you think?