Polyjuice is a tool for exploring genomic polymorphisms.
The name is also a tribute to the awesome Polyjuice Potion in the Harry Potter series.
Whereas the real potion is brewed with traces of a person, such as hair strands or toenail clippings, to create a temporary, imperfect copy of that person, this library takes in traces of a mutation, such as HGVS strings, to create imperfect guesses of possible genomic coordinates.
Polyjuice can answer questions such as:
- For a gene, what are the exon coordinates?
- For a gene, what base is at coding sequence position P across different transcripts?
- For a gene, what codon is at position P across different transcripts?
- For a gene and an HGVS string, what are the possible genomic coordinates?
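To make the coding-sequence position and codon questions above concrete, here is a minimal sketch of the underlying arithmetic. It is not Polyjuice's implementation: the function name `codon_for_cds_position` and the toy transcript sequences are made up for illustration. It assumes 1-based CDS positions that start at the first base of the start codon.

```python
from typing import Tuple


def codon_for_cds_position(cds: str, position: int) -> Tuple[int, int, str]:
    """Return (codon_number, offset_in_codon, codon) for a 1-based CDS position.

    codon_number is 1-based; offset_in_codon is 0, 1 or 2.
    """
    if position < 1 or position > len(cds):
        raise ValueError(f"position {position} is outside the CDS (1..{len(cds)})")
    codon_number = (position - 1) // 3 + 1
    offset_in_codon = (position - 1) % 3
    start = (codon_number - 1) * 3          # 0-based index of the codon's first base
    return codon_number, offset_in_codon, cds[start:start + 3]


# Toy sequences (hypothetical, not real Ensembl transcripts).
transcripts = {"TX_A": "ATGGCCAAGTGA", "TX_B": "ATGGCCGAGTGA"}

for name, cds in transcripts.items():
    base = cds[7 - 1]                        # base at CDS position 7
    print(name, base, codon_for_cds_position(cds, 7))
```

The same position can land on different bases and codons in different transcripts of one gene, which is why the gene-level questions report one answer per transcript.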
Polyjuice can also generate a VCF file from a set of HGVS strings.
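As a rough illustration of what such a VCF file looks like once genomic coordinates have been resolved, the sketch below writes a minimal VCF 4.2 file. It does not reproduce Polyjuice's actual output: the `write_vcf` helper, the `HGVS` INFO key, and the record values are all assumptions made for this example.

```python
import io


def write_vcf(records, out):
    """Write (chrom, pos, ref, alt, hgvs) tuples as a minimal VCF 4.2 file.

    The HGVS INFO field is a hypothetical annotation chosen for this sketch.
    """
    out.write("##fileformat=VCFv4.2\n")
    out.write('##INFO=<ID=HGVS,Number=1,Type=String,Description="Source HGVS string">\n')
    out.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    # Sort by chromosome name, then position, so records are grouped per chromosome.
    for chrom, pos, ref, alt, hgvs in sorted(records, key=lambda r: (r[0], r[1])):
        out.write(f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\t.\tHGVS={hgvs}\n")


# Made-up records for illustration only; not real variant coordinates.
records = [
    ("7", 250, "C", "G", "GENE1:c.25C>G"),
    ("7", 100, "A", "T", "GENE1:c.10A>T"),
]

buffer = io.StringIO()
write_vcf(records, buffer)
print(buffer.getvalue(), end="")
```

Because an HGVS string may map to several candidate coordinates, a real export could contain more than one record per input string.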
Every gene endpoint has a corresponding transcript endpoint. The difference is that the gene endpoint shows results for all matching transcripts.
This tool handles only coding regions, for both CDS positions and codons. The following cases are NOT handled:
- Mutations that cross intronic or UTR regions.
- For DNA coding sequence: conversions, copy number variations, allele combinations, complex mutations.
- For Protein: anything but simple substitutions.
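The limits above can be pre-checked on the input side. The following is a minimal sketch, not Polyjuice's parser, of how a caller might flag HGVS strings that fall outside the supported cases before submitting them. The function names and regular expressions are assumptions; they cover only the common HGVS spellings of intronic offsets (`c.88+1G>T`), UTR positions (`c.-14G>C`, `c.*46T>A`) and simple protein substitutions (`p.Val600Glu`, `p.V600E`).

```python
import re

# Intronic offsets look like "88+1" or "89-2"; UTR positions start with "-" or "*".
INTRON_OR_UTR = re.compile(r"\d[+-]\d|c\.[-*]")

# One- or three-letter amino acid, a position, then another amino acid.
AMINO_ACID = r"(?:[A-Z][a-z]{2}|[A-Z])"
PROTEIN_SUBSTITUTION = re.compile(rf"p\.{AMINO_ACID}\d+{AMINO_ACID}")


def description(hgvs: str) -> str:
    """Drop an optional 'REFERENCE:' prefix, e.g. 'BRAF:c.1799T>A' -> 'c.1799T>A'."""
    return hgvs.split(":", 1)[-1]


def touches_intron_or_utr(hgvs: str) -> bool:
    """True if a coding-DNA description refers to an intronic or UTR position."""
    desc = description(hgvs)
    return desc.startswith("c.") and INTRON_OR_UTR.search(desc) is not None


def is_simple_protein_substitution(hgvs: str) -> bool:
    """True only for descriptions of the form p.<aa><position><aa>."""
    return PROTEIN_SUBSTITUTION.fullmatch(description(hgvs)) is not None


for example in ["BRAF:c.1799T>A", "c.88+1G>T", "c.-14G>C", "p.Val600Glu", "p.Gly12_Val14del"]:
    print(example, touches_intron_or_utr(example), is_simple_protein_substitution(example))
```

A check like this is deliberately conservative: it only recognises the patterns listed, so anything it does not match should be treated as unknown rather than as supported.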
Some other tools that deal with transcripts and HGVS notations:
The tool requires two files from Ensembl: the CDS FASTA file and the GFF3 file.
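For readers unfamiliar with GFF3, the sketch below shows how exon coordinates can be read from such a file. It is an illustration under stated assumptions, not how Polyjuice loads its data: the `parse_gff3_exons` helper and the sample lines are made up, and it relies only on the standard nine tab-separated GFF3 columns and a `Parent` attribute linking each exon to its transcript.

```python
def parse_gff3_exons(lines):
    """Group exon features by their Parent attribute.

    Returns {parent_id: [(seqid, start, end, strand), ...]} with the
    1-based, inclusive coordinates used by GFF3.
    """
    exons = {}
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue                          # skip directives, comments, blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue                          # keep only well-formed exon features
        seqid, _source, _type, start, end, _score, strand, _phase, attrs = cols
        attributes = dict(kv.split("=", 1) for kv in attrs.split(";") if "=" in kv)
        parent = attributes.get("Parent", "")
        exons.setdefault(parent, []).append((seqid, int(start), int(end), strand))
    return exons


# Hypothetical sample lines; identifiers and coordinates are not real Ensembl data.
sample = [
    "##gff-version 3",
    "\t".join(["7", "example", "mRNA", "100", "400", ".", "+", ".", "ID=transcript:TX1"]),
    "\t".join(["7", "example", "exon", "100", "200", ".", "+", ".", "Parent=transcript:TX1;rank=1"]),
    "\t".join(["7", "example", "exon", "300", "400", ".", "+", ".", "Parent=transcript:TX1;rank=2"]),
]

print(parse_gff3_exons(sample))
```

For a gzip-compressed file, the same function can be fed the handle returned by `gzip.open(path, "rt")` from the Python standard library.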
See the web service module for more information on how to use the service endpoints.
- Create a properties file with the following properties specified, for example:
    service.port=8080
    service.host=localhost
    ensembl.build=GRCh38.build91
    geneList=EGFR,BRAF,ERBB2
    ensembl.cdsFastaPath=/path/to/cds.fa.gz
    ensembl.featureGff3Path=/path/to/gff3.gz
- Run the application
    java -Xmx128m -Dconfig.file=/path/to/file.properties -jar /path/to/assembly.jar