From 328b1584cd303f6dfd779ee3f1351e9aef18bcd2 Mon Sep 17 00:00:00 2001
From: ameyner2 <alison.meynert@igmm.ed.ac.uk>
Date: Tue, 25 Aug 2020 16:11:49 +0100
Subject: [PATCH] Install data targets in stages, use previously compiled gnomaAD v3

---
 docs/Software_installation.md | 39 ++++++++++++++---------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/docs/Software_installation.md b/docs/Software_installation.md
index 13bd346..b4b158d 100644
--- a/docs/Software_installation.md
+++ b/docs/Software_installation.md
@@ -6,6 +6,8 @@ Downloaded Aspera Connect version 3.7.4.147727 from https://downloads.asperasoft
 
 ## bcbio
 
+Start with installing the base software, and add datatargets.
+
 This will take a long time, and may require multiple runs if it fails on a step. It will resume if needed. Run on a screen session and log each attempt. It's important to set the limit on the number of concurrently open files to as high as possible (4096 on ultra).
 
 ```
@@ -14,47 +16,36 @@ wget https://raw.github.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_ins
 
 ulimit -n 4096
 
+DATE=`date +%Y%m%d%H%M`
 python bcbio_nextgen_install.py /home/u035/project/software/bcbio \
   --tooldir /home/u035/project/software/bcbio/tools \
   --genomes hg38 --aligners bwa \
-  --datatarget variation --datatarget gnomad \
-  --datatarget vep --datatarget dbnsfp --cores 32 &> bcbio_install_YYYYMMDDhhmm.log
+  --cores 32 &> bcbio_install_base_${DATE}.log
 ```
 
-The bcbio gnomAD installation was failing because the software *vt* had problems. Solution: replace vt with version from bcbio-1.2.0. See https://github.com/bcbio/bcbio-nextgen/issues/3327 and https://github.com/bcbio/bcbio-nextgen/issues/3328.
+Fix an issue with bcbio & vt/samtools/htslib. See https://github.com/bcbio/bcbio-nextgen/issues/3327 and https://github.com/bcbio/bcbio-nextgen/issues/3328.
 
 ```
-cd /home/u035/project/software/bcbio/anaconda/bin
-mv vt vt.bcbio-1.2.3
-scp ameyner2@eddie.ecdf.ed.ac.uk:/exports/igmm/software/pkg/el7/apps/bcbio/1.2.0/bin/vt ./
-chmod a+x vt
-
-cd ../../genomes/Hsapiens/hg38/txtmp/
-qsub /home/u035/project/scripts/bcbio_gnomad_install.sh
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u development --tools &> bcbio_install_upgrade_tools_${DATE}.log
 ```
 
-This will take a few days (4-6) to run. When complete:
+Install datatarget variation
 
 ```
-cd ../../genomes/Hsapiens/hg38/txtmp/
-mv variation/gnomad* ../variation/
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u skip --datatarget variation &> bcbio_install_datatarget_variation_${DATE}.log
 ```
 
-Now restart the installation without the gnomad datatarget.
+Install datatarget vep
 
 ```
-cd /home/u035/project/software/install
-wget https://raw.github.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
-
-ulimit -n 4096
-
-python bcbio_nextgen_install.py /home/u035/project/software/bcbio \
-  --tooldir /home/u035/project/software/bcbio/tools \
-  --genomes hg38 --aligners bwa \
-  --datatarget variation \
-  --datatarget vep --datatarget dbnsfp --cores 32 &> bcbio_install_YYYYMMDDhhmm.log
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u skip --datatarget vep &> bcbio_install_datatarget_vep_${DATE}.log
 ```
 
+We already had gnomAD 3.0 compiled and downloaded from another bcbio installation, so this gets copied to /home/u035/project/software/bcbio/genomes/Hsapiens/hg38/variation.
+
 Put a fake file name for genotype2phenotype field in genomes/hg38/seq/hg38-resources.yaml
 
 Increase JVM memory for GATK in galaxy/bcbio_system.yaml
-- 
GitLab