Quick start
If you’re short on time, this section shows a quick way to run JLOH on files that most NGS data analyses already produce. If you’re working with a hybrid organism, make sure you also read the section Working with hybrids.
Requirements
JLOH is built to work as an addition to any common variant calling pipeline. These pipelines typically consist of two steps: 1) read mapping and 2) variant calling. Regardless of the tool used to map the reads, the most common alignment format is SAM (or its binary version, BAM). Likewise, regardless of the tool used to call variants, the output is most often in VCF format.
These are the files you need as input to infer LOH blocks.
| File type | Description |
|---|---|
| FASTA | The reference genome sequence where you map your genomic reads. |
| BAM | The output of the mapping step. |
| VCF | The file containing all the single-nucleotide polymorphisms (SNPs) called from the BAM file onto the FASTA file. |
Together, these three files let JLOH highlight the positions where the genotype represented by the reads has lost heterozygosity compared to the genotype represented by the reference.
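Before running anything, it can help to confirm that the three inputs are actually in place. The helper below is a hypothetical convenience function, not part of JLOH; it only checks that each file exists and is non-empty, and says nothing about whether the contents are valid.

```shell
# Hypothetical pre-flight check (not a JLOH command).
# Returns non-zero if any of the given files is missing or empty.
# Usage: check_inputs my_variants.vcf my_mappings.bam my_reference_genome.fasta
check_inputs() {
  status=0
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```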
Calculating SNP density
The first step is done with jloh stats; for details, see Modelling SNP density. Run the command:
jloh stats --vcf my_variants.vcf
Then choose the SNP density thresholds for heterozygous and homozygous SNPs based on its output.
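To get a feel for what "SNP density" means here, the sketch below computes a rough genome-wide average of heterozygous and homozygous SNPs per kbp directly from a VCF. It is not what jloh stats does internally and is no substitute for it. It assumes a single-sample VCF with the genotype in the tenth column, counts only biallelic single-nucleotide records, and needs the genome length in bp as a second argument. The function name snp_density is made up for this example.

```shell
# Rough sanity check, NOT the jloh stats algorithm.
# Usage: snp_density my_variants.vcf <genome_length_in_bp>
snp_density() {
  awk -v len="$2" '
    BEGIN { FS = "\t" }
    /^#/ { next }                                  # skip header lines
    length($4) != 1 || length($5) != 1 || $5 == "." { next }   # keep biallelic SNPs only
    {
      # locate the GT subfield using the FORMAT column
      nf = split($9, fmt, ":")
      split($10, smp, ":")
      gt = ""
      for (i = 1; i <= nf; i++) if (fmt[i] == "GT") gt = smp[i]
      # accept diploid calls only, phased or unphased
      if (split(gt, allele, "[/|]") != 2) next
      if (allele[1] == "." || allele[2] == ".") next
      if (allele[1] != allele[2]) het++            # e.g. 0/1
      else if (allele[1] != "0") hom++             # e.g. 1/1
    }
    END {
      printf "het_snps_per_kbp\t%.3f\n", het * 1000 / len
      printf "hom_snps_per_kbp\t%.3f\n", hom * 1000 / len
    }
  ' "$1"
}
```

A genome-wide average hides local variation, so treat these numbers only as an order-of-magnitude check against the distribution reported by jloh stats.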
Extracting blocks
The second step is the inference of LOH blocks. This step is done with jloh extract. At the very minimum, the program requires these parameters:
jloh extract --vcf my_variants.vcf --bam my_mappings.bam --ref my_reference_genome.fasta
Besides the three input parameters, we encourage you to set at least two other parameters, even though they have default values:
--min-snps-kbp <N,N>: heterozygous/homozygous minimum SNP/kbp densities to label a region as heterozygous/homozygous
--threads: number of parallel operations
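Putting the pieces together, a full invocation might look like the sketch below. The wrapper function run_jloh_extract and the JLOH_BIN variable are hypothetical conveniences for this example. The flags are the ones listed above, with the heterozygous threshold first and the homozygous threshold second. The numbers you pass are yours to choose from the jloh stats output; none are recommended here.

```shell
# Hypothetical wrapper around jloh extract (flags as documented above).
# $1 = minimum heterozygous SNPs/kbp
# $2 = minimum homozygous SNPs/kbp
# $3 = number of threads
# JLOH_BIN is an assumed override for the executable name; defaults to "jloh".
run_jloh_extract() {
  "${JLOH_BIN:-jloh}" extract \
    --vcf my_variants.vcf \
    --bam my_mappings.bam \
    --ref my_reference_genome.fasta \
    --min-snps-kbp "$1,$2" \
    --threads "$3"
}
```

For instance, `run_jloh_extract 5 3 4` would pass `--min-snps-kbp 5,3 --threads 4`; these values are placeholders for illustration only.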
Among the output files is a table with the inferred LOH blocks, whose name ends with *LOH_blocks.tsv. This is the main output of the program.
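Once the run has finished, the block table can be located by its suffix. The helper below is a hypothetical convenience, not a JLOH command; it assumes only the *LOH_blocks.tsv naming described above and makes no assumption about the columns inside the file.

```shell
# Hypothetical helper: list LOH block tables under a directory
# (defaults to the current directory).
# Usage: find_loh_blocks path/to/output_dir
find_loh_blocks() {
  find "${1:-.}" -type f -name '*LOH_blocks.tsv'
}
```

Piping a result into `head` is a quick way to inspect the first few rows before loading the table elsewhere.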