DEEPOMICS® FFPE is designed to determine whether a given variant call is a true somatic variant or an FFPE-induced artifact.


  • If you have a VCF called from an FFPE sample with Mutect2, go to the “Analysis” page and run DEEPOMICS® FFPE on it.
  • If you would like to try the tool but do not have a VCF, you can use the example VCF file [Download] (genome assembly: GRCh37/hg19, Homo sapiens (human)).
  • If you would like to prepare your own VCF, follow the “Tutorial” page.

The input file for DEEPOMICS® FFPE must be a Variant Call Format (VCF) file generated by Mutect2. We assume that DEEPOMICS® FFPE is not very sensitive to the options used for variant calling, but we list the command-line options we used in case they help you create your own VCF. Make sure that the VCF contains its proper header lines, which start with two hash characters (##). If you did NOT edit the VCF yourself, you do not need to worry about this.
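The header requirement can be checked before uploading. The following is a minimal sketch, assuming an uncompressed VCF; the function name check_vcf_header and the demo file contents are hypothetical and not part of DEEPOMICS® FFPE.

```shell
#!/bin/sh
# Hypothetical helper (not part of DEEPOMICS FFPE): returns 0 when the file
# starts with a "##fileformat=VCF" line and also has a "#CHROM" column line.
check_vcf_header() {
    vcf="$1"
    # The VCF specification requires ##fileformat as the first line.
    head -n 1 "$vcf" | grep -q '^##fileformat=VCF' || return 1
    # The column header line must be present as well.
    grep -q '^#CHROM' "$vcf" || return 1
    return 0
}

# Demo on a tiny made-up VCF (illustrative data only).
demo_vcf="$(mktemp)"
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$demo_vcf"
printf 'chr1\t100\t.\tC\tT\t.\tPASS\tDP=30\n' >> "$demo_vcf"

if check_vcf_header "$demo_vcf"; then
    echo "header OK"
else
    echo "header missing"
fi
rm -f "$demo_vcf"
```

For a real file, call check_vcf_header with the path of your own VCF in place of the demo file.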


1. How to prepare an input file

Note: In the following command lines, each {user option} in curly braces must be replaced with an appropriate value by the user.


1.1 Removing adapter sequences.



cutadapt -q 20 \
    -a {left_adapter} \
    -A {right_adapter} \
    --minimum-length 50 \
    -o {output_R1.fastq.gz} \
    -p {output_R2.fastq.gz} \
    {input_R1.fastq.gz} {input_R2.fastq.gz}


1.2 Aligning paired-end sequencing reads to the reference genome.



bwa mem -M \
    -R @RG\\tID:{sample_name}\\tPL:illumina\\tSM:{sample_name} \
    -t 24 \
    -k 18 \
    {reference.fasta} \
    {input_R1.fastq.gz} {input_R2.fastq.gz} | \
    samtools view -Sb -F 0x100 > {output.bam}
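The unquoted -R argument above relies on the shell reducing each \\t to \t, which bwa mem then converts into a tab. As a sketch of an equivalent and perhaps less fragile form, the read-group string can be built in a quoted shell variable; the sample name sample01 is a hypothetical placeholder.

```shell
#!/bin/sh
# Hypothetical sample name; replace with your own.
sample_name="sample01"

# Inside double quotes the shell leaves "\t" untouched, so bwa mem
# receives the literal backslash-t sequences it expects in -R.
read_group="@RG\tID:${sample_name}\tPL:illumina\tSM:${sample_name}"

# Show the string exactly as it would be passed on the command line,
# e.g. bwa mem -M -R "$read_group" ...
printf '%s\n' "$read_group"
```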




1.3 Sorting an alignment file and removing sequencing duplicates.



picard SortSam \
    -Xmx50g \
    TMP_DIR={output.tmp} \
    I={input.bam} \
    O={output.bam} \
    SO=coordinate

picard MarkDuplicates \
    -Xmx50g \
    TMP_DIR={output.tmp} \
    I={input.bam} \
    O={output.bam} \
    M={output_metrics_dedup.txt} \
    CREATE_INDEX=true \
    CREATE_MD5_FILE=true \
    REMOVE_SEQUENCING_DUPLICATES=true




1.4 Local realignment and base quality recalibration



gatk BaseRecalibrator \
    --tmp-dir {output.tmp} \
    -R {reference.fasta} \
    -I {input.bam} \
    -O {output.recaltable} \
    --known-sites {dbsnp_gatk.vcf.gz}

gatk ApplyBQSR \
    --tmp-dir {output.tmp} \
    -R {reference.fasta} \
    -I {input.bam} \
    --bqsr-recal-file {input.recaltable} \
    -O {output.bam}




1.5 Variant calling



gatk Mutect2 \
    --tmp-dir {output.tmp} \
    --reference {reference.fasta} \
    --germline-resource {gnomad_af.vcf.gz} \
    --panel-of-normals {1000g_db.vcf.gz} \
    --input {input.bam} \
    -tumor {tumor_sample_name} \
    --intervals {variant_target.bed} \
    --output {output.vcf} \
    -bamout {output.bam} \
    --annotation AlleleFraction \
    --annotation AS_FisherStrand \
    --annotation TandemRepeat

gatk FilterMutectCalls \
    --tmp-dir {output.tmp} \
    --reference {reference.fasta} \
    --variant {input.vcf} \
    --output {output_filt.vcf}



2. How to use the output

Once DEEPOMICS® FFPE has finished, you will receive the URL of the output file (your email address is required to receive it).

The “IS_VARIANT” field in the “INFO” column of the output VCF indicates whether DEEPOMICS® FFPE classified the given variant as a true somatic variant (IS_VARIANT=variant) or an FFPE-induced artifact (IS_VARIANT=artifact).
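To get a quick overview of the classification, the two labels can be tallied from the INFO column. This is a minimal sketch, assuming an uncompressed, tab-separated output VCF; the function name count_calls and the demo records are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: counts data lines per IS_VARIANT label.
# INFO is the 8th tab-separated column of a VCF data line.
count_calls() {
    awk -F '\t' '
        /^#/ { next }                                   # skip header lines
        $8 ~ /(^|;)IS_VARIANT=variant(;|$)/  { v++ }    # true somatic variants
        $8 ~ /(^|;)IS_VARIANT=artifact(;|$)/ { a++ }    # FFPE-induced artifacts
        END { printf "variant=%d artifact=%d\n", v, a }
    ' "$1"
}

# Demo on a tiny made-up VCF (illustrative data only).
demo_vcf="$(mktemp)"
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$demo_vcf"
printf 'chr1\t100\t.\tC\tT\t.\tPASS\tDP=30;IS_VARIANT=variant\n' >> "$demo_vcf"
printf 'chr1\t200\t.\tG\tA\t.\tPASS\tDP=12;IS_VARIANT=artifact\n' >> "$demo_vcf"
printf 'chr2\t300\t.\tA\tG\t.\tPASS\tIS_VARIANT=variant;DP=40\n' >> "$demo_vcf"

count_calls "$demo_vcf"   # prints: variant=2 artifact=1
rm -f "$demo_vcf"
```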



If you need a VCF with the artifacts removed, run the command below. (Note that the output of this command still includes the header lines.)



$ grep -v "IS_VARIANT=artifact" {path/to/output_VCF} > {variants_w-header.vcf}




If you need a file containing only the true variants, without the header lines, run the following command.



$ grep "IS_VARIANT=variant" {path/to/output_VCF} > {variants_wo-header.vcf}
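The grep commands match the label anywhere on a line. As an alternative sketch, awk can restrict the match to the INFO column and make header handling explicit; it assumes an uncompressed, tab-separated VCF, and the function name filter_true_variants and its mode argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints only records labelled IS_VARIANT=variant.
#   $1: path to the output VCF
#   $2: "keep-header" (default) or "no-header"
filter_true_variants() {
    awk -F '\t' -v mode="${2:-keep-header}" '
        /^#/ { if (mode == "keep-header") print; next }   # header lines
        $8 ~ /(^|;)IS_VARIANT=variant(;|$)/ { print }      # match INFO column only
    ' "$1"
}

# Demo on a tiny made-up VCF (illustrative data only).
demo_vcf="$(mktemp)"
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$demo_vcf"
printf 'chr1\t100\t.\tC\tT\t.\tPASS\tDP=30;IS_VARIANT=variant\n' >> "$demo_vcf"
printf 'chr1\t200\t.\tG\tA\t.\tPASS\tDP=12;IS_VARIANT=artifact\n' >> "$demo_vcf"

filter_true_variants "$demo_vcf"             # header lines plus the chr1:100 record
filter_true_variants "$demo_vcf" no-header   # the chr1:100 record only
rm -f "$demo_vcf"
```

Redirect the output to a file, as in the grep commands, to save the filtered result.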