Commit 7dd81783 authored by ameyner2

Documentation for including reads for samples sequenced on previous runs.

parent d17af75f
Pipeline #30196 passed
@@ -156,6 +156,38 @@ ped_file=<input_ped_file>
cp $ped_file $project_id.ped
```
2. If samples from previous sequencing runs are to be included in this analysis, the FASTQ files for these samples need to be in the `$READS_DIR/$project_id` folder.
```
mkdir -p $READS_DIR/$project_id
cd $READS_DIR
```
Check whether the reads have already been merged and are available in a previous run's folder. For each sample:
```
find . -name '*<sample_id>*'
cp <prev_project_id>/*<sample_id>* $project_id/
```
If the reads are not found there, look in the original data folder and copy the original files.
```
cd $DOWNLOAD_DIR
find . -name '*<sample_id>*'
cp -R path/to/files/*<sample_id>* $READS_DIR/$project_id/
```
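The lookup order described above (previous run folders first, then the original download folder) could be wrapped in a small helper. This is a minimal sketch, not part of the pipeline: the function name `copy_sample_reads` and its argument order are hypothetical, and it assumes that file names contain the sample identifier and no newlines.
```shell
# Hypothetical helper (not part of the pipeline scripts).
# Usage: copy_sample_reads <sample_id> <reads_dir> <download_dir> <project_id>
copy_sample_reads() {
    sample_id=$1
    reads_dir=$2
    download_dir=$3
    project_id=$4
    dest="$reads_dir/$project_id"
    mkdir -p "$dest"

    # 1. Look for reads in previous run folders, skipping the destination itself.
    found=$(find "$reads_dir" -path "$dest" -prune -o -type f -name "*${sample_id}*" -print)

    # 2. If nothing was found, fall back to the original download folder.
    if [ -z "$found" ]; then
        found=$(find "$download_dir" -type f -name "*${sample_id}*" -print)
    fi

    if [ -z "$found" ]; then
        echo "No reads found for $sample_id" >&2
        return 1
    fi

    # Copy every match into the project reads folder (one path per line).
    printf '%s\n' "$found" | while IFS= read -r f; do
        cp "$f" "$dest/"
    done
}
```
It would be called once per sample, for example `copy_sample_reads <sample_id> "$READS_DIR" "$DOWNLOAD_DIR" "$project_id"`. Because the match is a substring match, a short identifier may also pick up files from other samples, so the copied files are worth checking by eye.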
Move to the project reads folder. If there is only one file per read end, rename the files to match the `<sample_id>_<R[12]>.gz` pattern. If there are two files per read end, merge them with the given script.
```
cd $READS_DIR/$project_id
python $SCRIPTS/merge_and_rename_NGI_fastq_files.py file1_R1.gz:file2_R1.gz <sample_name> 1 .
python $SCRIPTS/merge_and_rename_NGI_fastq_files.py file1_R2.gz:file2_R2.gz <sample_name> 2 .
```
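For the single-file case, the rename can be scripted. The sketch below is an illustration only: the function name `rename_single_reads` is hypothetical, and it assumes the original file names contain the sample identifier followed somewhere by `_R1` or `_R2` and end in `.gz`. It refuses to rename when a read end has more than one file, since those need the merge script instead.
```shell
# Hypothetical helper (not part of the pipeline scripts).
# Usage: rename_single_reads <sample_id> <dir>
rename_single_reads() {
    sample_id=$1
    dir=$2
    for read_end in 1 2; do
        target="$dir/${sample_id}_R${read_end}.gz"
        # Expand the glob into the positional parameters to count the matches.
        set -- "$dir"/*"$sample_id"*_R"$read_end"*.gz
        if [ ! -e "$1" ]; then
            # An unmatched glob is left as a literal pattern that does not exist.
            echo "No R$read_end file for $sample_id" >&2
            return 1
        elif [ "$#" -gt 1 ]; then
            echo "$# R$read_end files for $sample_id: merge instead" >&2
            return 1
        elif [ "$1" != "$target" ]; then
            mv "$1" "$target"
        fi
    done
}
```
For example, `rename_single_reads <sample_id> .` run from the project reads folder. Since `mv` is used, no original files are left behind for renamed samples.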
Remove the original files.
3. In the params folder, create the symlinks to the reads and the bcbio configuration files. If specifying a common sample suffix, ensure it includes any joining characters, e.g. `-` or `_`, so that the family identifier can be cleanly separated from the suffix. Get the number of families from the batch.
*Edinburgh Genomics data*
......