Commit 5439c70a authored by sgrannem

Update README.md

parent 1a4a1454
......@@ -197,9 +197,9 @@ Notebooks 6.1.1 and 6.1.2 process the prediction results so that it can be used
### 10. Predicting RNA-binding residues for proteins using our XGBoost models.
Notebook 6.2 takes all the prediction results available in the large table (produced by notebook 5.0), feeds them to our XGBoost models, and calculates an RNA-binding probability for each amino acid in each protein. These probabilities are then written to PDB files, where the RNA-binding probability for each amino acid is stored in the B-factor column.
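As an illustration of the last step, here is a minimal sketch of how per-residue probabilities could be written into the B-factor column of a PDB file. It is not the code used in notebook 6.2: the function name `set_bfactors` and the `probs` mapping are hypothetical, and the probabilities are assumed to have been computed already by the models. The sketch relies only on the fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26, B-factor in columns 61-66).

```python
def set_bfactors(pdb_lines, probs, default=0.0):
    """Return a copy of the PDB lines with B-factors replaced by probabilities.

    pdb_lines: iterable of PDB text lines.
    probs: hypothetical mapping (chain_id, residue_number) -> probability.
    default: value written for residues that have no prediction.
    """
    out = []
    for line in pdb_lines:
        # Only coordinate records carry a B-factor field.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            chain = line[21]           # column 22
            resnum = int(line[22:26])  # columns 23-26
            p = probs.get((chain, resnum), default)
            # Columns 61-66 hold the B-factor, formatted as %6.2f.
            line = line[:60] + f"{p:6.2f}" + line[66:]
        out.append(line)
    return out
```

Every atom of a residue receives the same value, so colouring a structure by B-factor in a molecular viewer then shows the predicted RNA-binding probability per residue.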
-### 10. Analysis of cross-linked peptide and amino acid sequences
+### 11. Analysis of cross-linked peptide and amino acid sequences
Notebooks 6.3 and 6.4 compare the cross-linking data to the GT-PLIP and GT-Distance ground truth datasets as well as to the predictions from the different tools that pyRBDome employs. Notebook 6.3 determines whether cross-linked peptides and amino acids (where available) are significantly enriched for predicted RNA-binding sites compared to the random peptide datasets and the peptides generated by Trypsin/Lys-C digestion of the protein sequences. Notebook 6.4 performs similar analyses, but compares the cross-linking data to the ground truth datasets.
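To make the idea of an enrichment comparison concrete, the sketch below computes a one-sided Fisher's exact test on a 2x2 table of peptide counts. This is an assumption for illustration only: the README does not state which statistical test the notebooks use, and the function name and the table layout are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: cross-linked peptides overlapping a predicted RNA-binding site
    b: cross-linked peptides not overlapping one
    c: background (e.g. random) peptides overlapping a predicted site
    d: background peptides not overlapping one

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this many overlapping cross-linked peptides by chance.
    """
    row1 = a + b          # number of cross-linked peptides
    col1 = a + c          # number of overlapping peptides overall
    n = a + b + c + d     # all peptides
    upper = min(row1, col1)
    favourable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, row1)
```

A small p-value suggests that the cross-linked peptides overlap predicted sites more often than the background set does. With many proteins or tools tested, a multiple-testing correction would normally be applied on top.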
-### 11. Making the PDF and pymol session output files
+### 12. Making the PDF and pymol session output files
The series 7 notebooks gather all the prediction and cross-linking information from the PDB files produced by notebook 4.0 and place it in a large table that stores the RNA-binding probabilities provided by each algorithm as well as the locations of cross-linked peptides and amino acid residues. The notebooks in the pyRBDome analyses of the ground truth dataset also contain extra code that adds, for each amino acid, the distances to RNA molecules in all protein-RNA structures that were analysed. Notebook 7.1 takes all the analysis results and, for each protein, produces PDF files summarising all the results in the protein sequences. The score bars in the PDF files indicate the XGBoost RNA-binding probabilities for each amino acid. Notebook 7.2 generates PyMOL session files that enable the user to conveniently load all PDB files into a single PyMOL session.
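For readers who want to assemble a comparable session by hand, the following sketch generates a PyMOL command script (`.pml`) that loads a set of PDB files, colours them by B-factor, and saves a session file. It is not the code from notebook 7.2: the function name `build_pml`, the file paths, and the colour palette are hypothetical choices. It uses only the standard PyMOL commands `load`, `spectrum`, and `save`.

```python
from pathlib import Path


def build_pml(pdb_paths, session_name="session.pse"):
    """Return the text of a PyMOL script that loads all PDB files into one session.

    pdb_paths: paths to PDB files whose B-factor column holds probabilities.
    session_name: hypothetical name of the session file PyMOL should write.
    """
    lines = []
    for path in pdb_paths:
        obj = Path(path).stem  # object name shown in the PyMOL side panel
        lines.append(f"load {path}, {obj}")
    # Colour every atom by B-factor, assumed here to lie between 0 and 1.
    lines.append("spectrum b, blue_white_red, minimum=0, maximum=1")
    lines.append(f"save {session_name}")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and running it with `pymol -cq script.pml` should produce the session without opening the graphical interface, assuming PyMOL is installed and the paths contain no characters that need quoting.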