From 328b1584cd303f6dfd779ee3f1351e9aef18bcd2 Mon Sep 17 00:00:00 2001
From: ameyner2 <alison.meynert@igmm.ed.ac.uk>
Date: Tue, 25 Aug 2020 16:11:49 +0100
Subject: [PATCH] Install data targets in stages, use previously compiled
 gnomAD v3

---
 docs/Software_installation.md | 39 ++++++++++++++---------------------
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/docs/Software_installation.md b/docs/Software_installation.md
index 13bd346..b4b158d 100644
--- a/docs/Software_installation.md
+++ b/docs/Software_installation.md
@@ -6,6 +6,8 @@ Downloaded Aspera Connect version 3.7.4.147727 from https://downloads.asperasoft
 
 ## bcbio
 
+Start by installing the base software, then add the data targets in stages.
+
 This will take a long time, and may require multiple runs if it fails on a step. It will resume if needed. Run on a screen session and log each attempt. It's important to set the limit on the number of concurrently open files to as high as possible (4096 on ultra).
 
 ```
@@ -14,47 +16,36 @@ wget https://raw.github.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_ins
 
 ulimit -n 4096
 
+DATE=`date +%Y%m%d%H%M`
 python bcbio_nextgen_install.py /home/u035/project/software/bcbio \
   --tooldir /home/u035/project/software/bcbio/tools \
   --genomes hg38 --aligners bwa \
-  --datatarget variation --datatarget gnomad \
-  --datatarget vep --datatarget dbnsfp --cores 32 &> bcbio_install_YYYYMMDDhhmm.log
+  --cores 32 &> bcbio_install_base_${DATE}.log
 ```
 
-The bcbio gnomAD installation was failing because the software *vt* had problems. Solution: replace vt with version from bcbio-1.2.0. See https://github.com/bcbio/bcbio-nextgen/issues/3327 and https://github.com/bcbio/bcbio-nextgen/issues/3328.
+Upgrade the tools to the development version to fix an issue with bcbio and vt/samtools/htslib. See https://github.com/bcbio/bcbio-nextgen/issues/3327 and https://github.com/bcbio/bcbio-nextgen/issues/3328.
 
 ```
-cd /home/u035/project/software/bcbio/anaconda/bin
-mv vt vt.bcbio-1.2.3
-scp ameyner2@eddie.ecdf.ed.ac.uk:/exports/igmm/software/pkg/el7/apps/bcbio/1.2.0/bin/vt ./
-chmod a+x vt
-
-cd ../../genomes/Hsapiens/hg38/txtmp/
-qsub /home/u035/project/scripts/bcbio_gnomad_install.sh 
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u development --tools &> bcbio_install_upgrade_tools_${DATE}.log
 ```
 
-This will take a few days (4-6) to run. When complete:
+Install the `variation` data target.
 
 ```
-cd ../../genomes/Hsapiens/hg38/txtmp/
-mv variation/gnomad* ../variation/
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u skip --datatarget variation &> bcbio_install_datatarget_variation_${DATE}.log
 ```
 
-Now restart the installation without the gnomad datatarget.
+Install the `vep` data target.
 
 ```
-cd /home/u035/project/software/install
-wget https://raw.github.com/bcbio/bcbio-nextgen/master/scripts/bcbio_nextgen_install.py
-
-ulimit -n 4096
-
-python bcbio_nextgen_install.py /home/u035/project/software/bcbio \
-  --tooldir /home/u035/project/software/bcbio/tools \
-  --genomes hg38 --aligners bwa \
-  --datatarget variation \
-  --datatarget vep --datatarget dbnsfp --cores 32 &> bcbio_install_YYYYMMDDhhmm.log
+DATE=`date +%Y%m%d%H%M`
+/home/u035/project/software/bcbio/tools/bcbio_nextgen.py upgrade -u skip --datatarget vep &> bcbio_install_datatarget_vep_${DATE}.log
 ```
 
+We already had gnomAD 3.0 downloaded and compiled from another bcbio installation, so copy it to /home/u035/project/software/bcbio/genomes/Hsapiens/hg38/variation rather than installing the `gnomad` data target.
+
 Put a fake file name for genotype2phenotype field in genomes/hg38/seq/hg38-resources.yaml
 
 Increase JVM memory for GATK in galaxy/bcbio_system.yaml
-- 
GitLab