Switching nuc_est_conv and max projection order
Summary
We want to investigate whether swapping the order of operations when computing nuc_est_conv across z-stacks improves identification of protein localisation.
Current behaviour/setbacks
nuc_est_conv isn't computed by default during extraction.
Desired behaviour/advantages
- Compute nuc_est_conv as an additional measure for an experiment of interest. Then go through the usual extraction and post-processing routines.
- Investigate whether (a) finding the max projection across z-stacks and then computing nuc_est_conv, or (b) computing nuc_est_conv for each z-stack and then taking the maximum across z-stacks, does better at identifying changes in protein localisation.
Also see https://www.wiki.ed.ac.uk/display/SWAIN/z-stacks+and+nucEstConv -- which suggests that swapping the order may improve things. However, this was based on the MATLAB version of the image segmentation & analysis pipeline.
Implementation sketch
I will split this into two parts based on the two parts in 'Desired behaviour/advantages'.
Part 1: computing nuc_est_conv
I have identified 3 options. These options are not mutually exclusive -- the solution may well be a combination of all three.
Option 1: Add nuc_est_conv as a default measure in extraction
How: uncomment line 38 in https://git.ecdf.ed.ac.uk/swain-lab/aliby/aliby/-/blob/master/extraction/core/functions/defaults.py, then run the whole pipeline again on the desired experiment.
Pros: easy, takes literally 3 seconds to implement
Cons: re-segmenting takes time and may not be desired if cell outlines have already been identified. We may also have to re-do this with multiple experiments, making the data output inconsistent between experiments.
Discussion: Do we want to re-integrate nuc_est_conv permanently into the pipeline? @amuoz commented out the measure in cd1b134e, but no reason was given.
Option 2: Define an Extractor object, adding nuc_est_conv as part of the parameters, and re-extract images that have cell outlines already defined
How: Specify parameters by defining an ExtractorParameters object (https://git.ecdf.ed.ac.uk/swain-lab/aliby/aliby/-/blob/master/extraction/core/extractor.py), adding nuc_est_conv as a measure in addition to the existing defaults. Then define an Extractor object (same file) with these parameters. Use this object, with the images and parameters as arguments, to re-do extraction (or perhaps only the nuc_est_conv part). The new information should be written to the HDF5 file, after which post-processing can be run again.
Pros: This is the ideal case. It should require the fewest resources and is the least redundant way to solve the problem. Plus, it takes advantage of aliby's modularity and parameters-process paradigm.
Cons:
Discussion: Arin has attempted to do this, but has struggled to find the method within Extractor to achieve this. His attempt was based on https://git.ecdf.ed.ac.uk/swain-lab/aliby/skeletons/-/blob/master/notebooks/4.%20Re-postprocessing.ipynb, but apparently PostProcessor and Extractor objects are structured in quite different ways. Here is a sketch:
import h5py
from pathlib import Path

from aliby.pipeline import PipelineParameters, Pipeline
from extraction.core.extractor import ExtractorParameters, Extractor
from pathos.multiprocessing import Pool

folder = Path("/home/jupyter-arin/data/23174_2022_03_25_flavin_htb2_glucose_limitation_hard_delft_04_02")

# Start from the default pipeline parameters for this experiment.
pipeline_params = PipelineParameters.default(
    general={
        "expt_id": 23174,  # should match the experiment so that channels match
        "distributed": 10,  # doesn't matter
        "server_info": {
            "host": "*****",  # credentials redacted
            "username": "*****",
            "password": "*****",
        },
    },
)

# Add nuc_est_conv under the np_max reduction of the mCherry channel.
extractor_params_dict = pipeline_params.to_dict()['extraction']
extractor_params_dict['tree']['mCherry']['np_max'].update({'nuc_est_conv'})


def extract_file(filepath):
    try:
        # Remove any existing extraction results before re-extracting.
        with h5py.File(filepath, "a") as f:
            if "extraction" in f:
                del f["/extraction"]
        extractor = Extractor(
            ExtractorParameters.from_dict(extractor_params_dict), filepath
        )
        extractor.run()
        print(filepath, " PASSED\n")
    except Exception as e:
        print(filepath, " FAILED\n")
        print(e)


with Pool(1) as p:
    results = p.map(extract_file, folder.rglob("*.h5"))
which currently fails.
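One thing worth ruling out is the type of the container at tree['mCherry']['np_max'] once the parameters have been through to_dict(): .update({...}) only exists on sets, so it would raise AttributeError if the measures come back as a list. The helper below is a minimal sketch that adds a measure in either case. add_measure is a hypothetical name, and the toy trees only mimic the channel / reduction / measures layout implied by the sketch above, which may not be what aliby actually produces.

```python
def add_measure(tree, channel, reduction, measure):
    """Add `measure` under tree[channel][reduction], creating levels as needed.

    Works whether the measures are stored as a set or as a list.
    """
    measures = tree.setdefault(channel, {}).setdefault(reduction, set())
    if isinstance(measures, set):
        measures.add(measure)
    elif measure not in measures:
        measures.append(measure)
    return tree


# Toy trees standing in for extractor_params_dict["tree"] (layout is an assumption).
tree_with_sets = {"mCherry": {"np_max": {"mean", "median"}}}
tree_with_lists = {"mCherry": {"np_max": ["mean", "median"]}}

add_measure(tree_with_sets, "mCherry", "np_max", "nuc_est_conv")
add_measure(tree_with_lists, "mCherry", "np_max", "nuc_est_conv")
```

If this turns out not to be the cause, the helper is still harmless: it leaves existing measures untouched and does not add duplicates.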
Option 3: Use the nuc_est_conv function on its own and apply it directly to images.
How: Import it from https://git.ecdf.ed.ac.uk/swain-lab/aliby/aliby/-/blob/master/extraction/core/functions/custom/localisation.py
Pros:
Cons: Doesn't take advantage of how things are organised in aliby.
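A rough sketch of what Option 3 could look like: loop over the cell masks of one image and call a measure on each. The real nuc_est_conv is deliberately not imported here because its exact signature is not spelled out in this issue. brightest_pixel is a stand-in with an assumed (cell_mask, image) signature, and measure_cells is a hypothetical helper.

```python
import numpy as np


def measure_cells(masks, image, measure):
    """Apply `measure(cell_mask, image)` to each 2D boolean mask in `masks`."""
    return np.array([measure(mask, image) for mask in masks])


def brightest_pixel(cell_mask, image):
    """Stand-in measure (NOT nuc_est_conv): brightest pixel inside the cell."""
    return float(image[cell_mask].max())


# Synthetic 8x8 image with two non-overlapping square cells.
image = np.arange(64, dtype=float).reshape(8, 8)
masks = np.zeros((2, 8, 8), dtype=bool)
masks[0, 0:3, 0:3] = True
masks[1, 5:8, 5:8] = True

values = measure_cells(masks, image, brightest_pixel)
```

Swapping brightest_pixel for the imported nuc_est_conv would be the actual change, provided its arguments match the signature assumed here.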
Part 2: find max projection
The original method should be implemented within nuc_conv_3d in https://git.ecdf.ed.ac.uk/swain-lab/aliby/aliby/-/blob/master/extraction/core/functions/custom/localisation.py.
The alternative method should be: assuming that nuc_est_conv is computed separately for each z-stack, we just need to call numpy.max on the outputs from each.
Then the results from each method can be plotted and thus compared.
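The two orderings can be sketched on a synthetic image as follows. toy_measure is only a stand-in for nuc_est_conv with an assumed (cell_mask, image) signature, the image is assumed to be laid out as (z, y, x), and both function names are hypothetical. The example is built so that two bright spots sit in different z-sections, which is the kind of situation where the two orderings can disagree.

```python
import numpy as np


def toy_measure(cell_mask, image, n_brightest=2):
    """Stand-in (NOT nuc_est_conv): mean of the brightest pixels minus the cell median."""
    pixels = np.sort(image[cell_mask])
    return float(pixels[-n_brightest:].mean() - np.median(pixels))


def project_then_measure(cell_mask, stack, measure=toy_measure):
    """(a) Max-project the (z, y, x) image along z, then measure once."""
    return measure(cell_mask, stack.max(axis=0))


def measure_then_max(cell_mask, stack, measure=toy_measure):
    """(b) Measure each z-section separately, then take the maximum."""
    return max(measure(cell_mask, section) for section in stack)


# Synthetic cell: 3 z-sections of a 4x4 image with a flat background of 1.
stack = np.ones((3, 4, 4))
stack[0, 0, 0] = 10.0  # bright spot in z-section 0
stack[2, 3, 3] = 10.0  # second bright spot, in z-section 2
cell_mask = np.ones((4, 4), dtype=bool)

value_a = project_then_measure(cell_mask, stack)
value_b = measure_then_max(cell_mask, stack)
```

Here the projection merges the two spots into one image, so (a) scores higher than (b). Running both functions per cell and per time point would give two time series that can be plotted side by side.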