hammerlab / filter-bam

Stand-alone utility for filtering a BAM file to specific genomic regions, using Apache Spark.

Usage

sbt assembly
spark-submit \
  --properties-file <spark properties file> \
  target/scala-2.11/filter-bam-assembly-1.0.0-SNAPSHOT.jar \
  in.bam \
  <regions> \
  out.bam \
  [-c] [--include-unmapped-mates]

Parameters

  • in.bam: the input BAM file to filter.
  • <regions>: comma-separated list of genomic regions, in the format accepted by hammerlab/genomic-loci.
  • out.bam: the path where the filtered BAM file is written.
  • -c/--count: print the number of reads that passed the filter, in addition to writing them to out.bam.
  • --include-unmapped-mates: also keep unmapped reads whose mate-contig and mate-start fields point to a position that overlaps <regions>.
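
Example

The following is a minimal sketch of a complete invocation with both optional flags enabled. The properties file name (spark.properties) and the region string are hypothetical placeholders, and the region syntax shown (contig:start-end, or a bare contig name) is an assumption about what hammerlab/genomic-loci accepts; check that project's documentation before relying on it. The script only assembles and prints the command, so it can be inspected before anything is submitted.

```shell
#!/bin/sh
# Hypothetical inputs; substitute your own paths and regions.
props="spark.properties"           # assumed name of a Spark properties file
regions="chr1:10000-20000,chr2"    # assumed region syntax, see the note above

# Assemble the command as positional parameters so each argument stays one word.
set -- spark-submit \
  --properties-file "$props" \
  target/scala-2.11/filter-bam-assembly-1.0.0-SNAPSHOT.jar \
  in.bam \
  "$regions" \
  out.bam \
  -c --include-unmapped-mates

# Print the command for inspection.
cmd_line="$*"
echo "$cmd_line"

# To actually submit the job, uncomment the next line (requires sbt assembly
# to have been run and spark-submit to be on the PATH):
# "$@"
```

Quoting "$regions" keeps the whole comma-separated list as a single argument, which matters if a region string ever contains characters the shell would otherwise interpret.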