Commit eb8a1fb3 authored by s1734289

Add document describing running process of CNV-calling pipeline

parent a38bdb8d
This document describes how to run the cnv-calling workflow of the Nextflow variant pipeline.
Currently, the pipeline expects a folder 'ExomeDepth_assets' in 'trio-whole-exome/pipeline'. This folder contains the Rscript to run ExomeDepth, a script to build a reference ExomeDepth object, and paths to sex-specific references. A future development will be to move the R scripts into the bin folder and the reference objects into the assets folder.
The existing pipeline currently takes roughly 10 minutes per family to run.
required input files:
ped file
sample sheet - with an extra column containing the path to the sample bam file
reference fasta - HG38 is the assembly that the pipeline has been designed for
reference bedfile - exome refseq file
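
Because the sample sheet carries the BAM paths, a quick check before launching can catch missing files early. The sketch below is not part of the pipeline. It assumes that the sample sheet is tab-separated, has a single header line, and holds the BAM path in its last column; the function name check_bam_paths is made up for illustration.

```shell
# Hypothetical helper, not part of the pipeline.
# Assumptions: tab-separated sample sheet, one header line,
# BAM path stored in the last column.
check_bam_paths() {
    missing=0
    # Collect the last field of every non-header, non-empty line.
    bams=$(awk -F'\t' 'NR > 1 && NF > 0 { print $NF }' "$1")
    while IFS= read -r bam; do
        [ -n "$bam" ] || continue
        if [ ! -f "$bam" ]; then
            echo "missing BAM: $bam" >&2
            missing=1
        fi
    done <<EOF
$bams
EOF
    # Returns 0 when every listed BAM exists, 1 otherwise.
    return "$missing"
}
```

For example, running check_bam_paths [path to sample sheet] before nextflow run reports every missing BAM on stderr and returns a non-zero status if any are absent.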
output files:
outputs are stored in a folder with the family ID
exome_calls_[bam file name].csv - a csv file produced by ExomeDepth
[individual id]cnv_calls_all_chr.bed - a sorted bedfile with the chromosome and location of variants and variant type. If no variants are present for a chromosome, the start and end both have the value 0
unique_proband/[individual id]intersects.txt - details the intersects between the variants from the proband and the parents
unique_proband/[individual id]proband_only.bed - this bedfile contains the locations of CNVs that occur only in the proband and in neither parent, together with the variant type
unique_proband/[individual id]VEP_output.vcf - a vcf containing the output from the VEP command
future output - graph visualising the location of the variants
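
Downstream tools may need the placeholder rows (start and end both 0) removed from [individual id]cnv_calls_all_chr.bed. A minimal sketch follows, assuming the bedfile is tab-separated with the start in column 2 and the end in column 3. The function name drop_placeholder_rows is hypothetical and not part of the pipeline.

```shell
# Hypothetical helper, not part of the pipeline.
# Assumptions: tab-separated bedfile, start in column 2, end in column 3.
drop_placeholder_rows() {
    # Print every row except those where start and end are both 0,
    # which mark chromosomes with no variants.
    awk -F'\t' '!($2 == 0 && $3 == 0)' "$1"
}
```

The filtered rows are written to standard output, so the original bedfile is left untouched.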
Assuming the reference fasta and bedfiles are already defined in a config file, the pipeline can be run using the command:
nextflow run main.nf -c [path to config] \
--workflow cnv-calling \
--ped_file [path to ped file] \
--sample_sheet [path to sample sheet]
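
The command above can be wrapped so that unreadable inputs are reported before Nextflow starts. This is only a sketch: the function name run_cnv_calling and the NEXTFLOW environment variable (used to override the executable) are assumptions introduced here, not part of the pipeline.

```shell
# Hypothetical wrapper, not part of the pipeline.
# NEXTFLOW is an assumed override for the executable; defaults to "nextflow".
run_cnv_calling() {
    config="$1"
    ped_file="$2"
    sample_sheet="$3"
    # Stop early if any input file cannot be read.
    for f in "$config" "$ped_file" "$sample_sheet"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input: $f" >&2
            return 1
        fi
    done
    # Same invocation as documented above, with the paths filled in.
    "${NEXTFLOW:-nextflow}" run main.nf -c "$config" \
        --workflow cnv-calling \
        --ped_file "$ped_file" \
        --sample_sheet "$sample_sheet"
}
```

It would be called as run_cnv_calling [path to config] [path to ped file] [path to sample sheet] from the directory containing main.nf.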
Likely problems moving from eddie to ULTRA
pathing errors: where absolute paths are used, the format on eddie is usually /exports/igmm/eddie/IGMM-VariantAnalysis/, while on Ultra the format will likely be home/u035/u035/shared/ - for example, in the VEP command pointing to the G2P plugin
modules: the modules loaded by eddie.config are anaconda/5.3.1, singularity, igmm/apps/BEDTools, igmm/apps/samtools/1.6, R/3.5.3 and igmm/apps/vep/100
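
One way to locate the hard-coded eddie paths mentioned above is to search the pipeline directory for the eddie prefix. This is a minimal sketch; the function name find_eddie_paths is hypothetical, and the search also descends into any work or .git directories under the given folder.

```shell
# Hypothetical helper, not part of the pipeline.
# Lists every line under the given directory (default: current directory)
# that contains the absolute eddie path prefix.
find_eddie_paths() {
    # -r: recurse, -n: show line numbers, -F: treat the pattern as a fixed string
    grep -rnF '/exports/igmm/eddie/' "${1:-.}"
}
```

Each match is printed as file:line:content, and the function returns a non-zero status when no matches are found.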