Disambiguation
disambiguate
For mixed or explant samples, provide a list of genome_build identifiers to check and remove from the alignment. Currently supports cleaning a single organism. For example, with genome_build: hg19 and disambiguate: [mm10], it will align to hg19 and mm10, run disambiguation, and discard reads confidently aligned to mm10 and not hg19. Affects fusion detection when star is chosen as the aligner. The aligner must be set to a non-false value for this to run.
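The hg19/mm10 case described above can be sketched as the following fragment. It is trimmed to the three keys involved and is not a complete sample entry: a real configuration also needs the usual fields such as analysis, description and files. The choice of bwa as the aligner here is only an assumption for illustration.

```yaml
details:
- algorithm:
    aligner: bwa          # assumed for illustration; must not be false, or disambiguation is skipped
    disambiguate: [mm10]  # organism whose reads are removed (a single organism is supported)
  genome_build: hg19      # organism whose reads are kept
```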
Example config:
details:
- algorithm:
    aligner: bwa
    background: /path/to/project/config/1000g_pon.hg38.vcf.gz
    disambiguate: mm10
    exclude_regions: [lcr]
    mark_duplicates: true
    platform: illumina
    quality_format: standard
    realign: false
    recalibrate: false
    tools_on:
    - noalt_calling
    variantcaller:
    - vardict
    - mutect2
    vcfanno: somatic
  analysis: variant2
  description: SAMPLE
  files:
  - /path/to/project/input/sample_1.fq.gz
  - /path/to/project/input/sample_2.fq.gz
  genome_build: hg38
  metadata:
    batch: bSAMPLE
    phenotype: tumor
upload:
  dir: ../final
The resulting final/project/multiqc/multiqc_report.html will contain hg38 and mm10 columns with the number of reads aligned to each genome. In addition, final/sample contains sample-ready.bam (reads aligned to hg38) and sample-disambiguate-mm10.bam (reads aligned to mm10).
Output
final/project/multiqc: Reads = single reads in the final BAM (mapped to the human genome); hg38 = single reads mapped to the human genome (same as Reads); mm10 = single reads mapped to the mouse genome; Ambig. = single reads mapped ambiguously; Total reads = hg38 + mm10 + Ambig.