Polyjuice

Polyjuice is a tool for exploring genomic polymorphisms.

The name is also a tribute to the awesome Polyjuice Potion in the Harry Potter series.

Whereas the real potion is brewed with traces of a person, such as hair strands or toenail clippings, to create a temporary, imperfect copy of that person, this library takes in traces of a mutation, such as HGVS strings, to create imperfect guesses of possible genomic coordinates.

Overview

Polyjuice has a library module and a web service module. It can answer questions like:

  • For a gene, what are the exon coordinates?
  • For a gene, what base is at coding sequence position P across different transcripts?
  • For a gene, what codon is at position P across different transcripts?
  • For a gene and an HGVS string, what are the possible genomic coordinates?
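
To make these questions concrete, here is a minimal Python sketch of the underlying arithmetic: mapping a 1-based coding sequence position to a genomic coordinate by walking a transcript's CDS segments, and finding the codon that contains the position. It is not Polyjuice's implementation (the tool itself is built with sbt), and the names `cds_to_genomic` and `codon_number` are hypothetical. It assumes 1-based, inclusive segment coordinates.

```python
from typing import List, Tuple


def cds_to_genomic(cds_segments: List[Tuple[int, int]], strand: str, pos: int) -> int:
    """Map a 1-based CDS position to a 1-based genomic coordinate.

    cds_segments holds (start, end) genomic pairs, 1-based and inclusive.
    This is an illustrative helper, not part of Polyjuice.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if pos < 1:
        raise ValueError("CDS positions are 1-based")
    # Walk segments in transcription order: ascending on '+', descending on '-'.
    ordered = sorted(cds_segments, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the coding sequence")


def codon_number(pos: int) -> int:
    """Return the 1-based codon index that contains CDS position pos."""
    return (pos - 1) // 3 + 1
```

Because transcripts of the same gene can have different CDS segments, the same position P may land on different genomic coordinates in each transcript, which is why the answers are reported per transcript.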

It can also generate a VCF file from a set of HGVS strings.
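
As a rough picture of what such output involves, the Python sketch below writes already-resolved variants as a minimal VCF 4.2 document. The `Variant` type and `to_vcf` function are hypothetical, and putting the HGVS string in the ID column is a choice made for this example, not something the source states about Polyjuice's output.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """A resolved variant; this type is illustrative, not Polyjuice's."""
    chrom: str
    pos: int   # 1-based genomic position
    ref: str
    alt: str
    hgvs: str  # the HGVS string the variant was resolved from


def to_vcf(variants: Iterable[Variant]) -> str:
    """Render variants as a minimal VCF 4.2 document."""
    lines = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    # Sort by chromosome name (as a string) and then by position.
    for v in sorted(variants, key=lambda v: (v.chrom, v.pos)):
        # QUAL, FILTER and INFO are left as '.', meaning "missing".
        lines.append("\t".join([v.chrom, str(v.pos), v.hgvs, v.ref, v.alt, ".", ".", "."]))
    return "\n".join(lines) + "\n"
```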

Every gene endpoint has a corresponding transcript endpoint. The difference is that a gene endpoint shows results for all matching transcripts.

Limitations

This tool only handles coding regions when resolving CDS positions and codons. The following cases are NOT handled:

  • Mutations that cross into intronic or UTR regions.
  • For DNA coding sequences: conversions, copy number variations, allele combinations, complex mutations.
  • For proteins: anything other than simple substitutions.
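
To illustrate the protein limitation, the following Python sketch accepts only simple protein substitutions such as `p.Val600Glu` and rejects everything else. The function name and regular expression are assumptions made for this example. Whether Polyjuice accepts one-letter amino acid codes, and how it treats stop or silent changes, is not stated here, so this sketch simply allows both one-letter and three-letter codes for the 20 standard amino acids.

```python
import re
from typing import Optional, Tuple

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Three-letter codes come first so that 'Ala' is tried before 'A'.
_AA = "|".join(list(THREE_TO_ONE) + list(THREE_TO_ONE.values()))
_SUBSTITUTION = re.compile(rf"^p\.({_AA})(\d+)({_AA})$")


def parse_simple_protein_substitution(hgvs: str) -> Optional[Tuple[str, int, str]]:
    """Return (ref, position, alt) in one-letter codes, or None if the
    string is not a simple substitution. Hypothetical helper."""
    match = _SUBSTITUTION.match(hgvs)
    if match is None:
        return None
    ref, pos, alt = match.groups()
    return THREE_TO_ONE.get(ref, ref), int(pos), THREE_TO_ONE.get(alt, alt)
```

With this check, `p.Val600Glu` and `p.V600E` both parse to the same triple, while a deletion such as `p.Glu746_Ala750del` returns `None`.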

Related work

Some other tools that deal with transcripts and HGVS notations:

Usage

The tool requires two files from Ensembl: the CDS FASTA file and the GFF3 file.
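
For orientation, the GFF3 file is a nine-column, tab-separated format in which each coding segment appears as a row of type `CDS`. The Python sketch below pulls the CDS segments for one transcript out of such a file. It is an illustration, not Polyjuice's parser: the function name is hypothetical, and it assumes the `Parent=transcript:<id>` attribute convention used in Ensembl's GFF3 files.

```python
from typing import Iterable, Iterator, Tuple


def cds_segments(lines: Iterable[str], transcript_id: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, strand) for each CDS row of one transcript.

    Coordinates are 1-based and inclusive, as in GFF3.
    """
    for line in lines:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # Assumption: the parent is written as 'transcript:<id>'.
        if attrs.get("Parent") == f"transcript:{transcript_id}":
            yield int(cols[3]), int(cols[4]), cols[6]
```

A gzipped file can be streamed with `gzip.open(path, "rt")` from the standard library and passed in directly as `lines`.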

See the web service module for more information on how to use the service endpoints.

Building manually

  • Download the two files from Ensembl
  • Install Java 8 and sbt
  • Build an application jar
sbt assembly
  • Create a properties file with the following properties specified, for example:
service.port=8080
service.host=localhost
ensembl.build=GRCh38.build91
geneList=EGFR,BRAF,ERBB2
ensembl.cdsFastaPath=/path/to/cds.fa.gz
ensembl.featureGff3Path=/path/to/gff3.gz
  • Run the application
java -jar -Xmx128m -Dconfig.file=/path/to/file.properties /path/to/assembly.jar