Number of time points stored as an array in metadata, crashing extractor
Summary
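time_settings/ntimepoints is stored in the HDF5 metadata as a size-1 array rather than a scalar, so Extractor.run() raises a TypeError when called without the tps argument.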
Steps to reproduce
import h5py
from extraction.core.extractor import ExtractorParameters, Extractor
from agora.io.bridge import image_creds_from_h5, parameters_from_h5
from aliby.io.omero import Image
# Assumed module path for Tiler and TilerParameters; the original snippet omitted this import
from aliby.tile.tiler import Tiler, TilerParameters

filepath = "/path/to/file.h5"
image_id, creds = image_creds_from_h5(filepath)
with Image(image_id, **creds) as image:
    # Get and extend pipeline parameters
    pipeline_params = parameters_from_h5(filepath)
    # new_measure is a placeholder for the measure being added
    pipeline_params['extraction']['tree']['mCherry']['np_max'].update({new_measure})
    # Construct extractor
    extractor = Extractor.from_tiler(
        parameters=ExtractorParameters.from_dict(pipeline_params['extraction']),
        store=filepath,
        tiler=Tiler.from_hdf5(image, filepath, TilerParameters.from_dict(pipeline_params['tiler'])),
    )
    # Run without the tps argument; this is the call that raises the TypeError
    extractor.run()
TL;DR: Define an extractor (given parameters, a local file and a tiler object), then run it without specifying the tps argument (time points).
What is the current bug behavior?
Extractor.run() raises a TypeError.
What is the expected correct behavior?
The extractor obtains the number of time points from the HDF5 metadata and, within .run(), defines tps as the list of time points (e.g. 0 to 100). It then runs correctly.
Logs/Traceback
Traceback (most recent call last):
  File "/opt/tljh/user/envs/aliby/lib/python3.7/pdb.py", line 1699, in main
    pdb._runscript(mainpyfile)
  File "/opt/tljh/user/envs/aliby/lib/python3.7/pdb.py", line 1568, in _runscript
    self.run(statement)
  File "/opt/tljh/user/envs/aliby/lib/python3.7/bdb.py", line 578, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/home/jupyter-arin/scratch-arin/nucEstConv.py", line 45, in <module>
    reextract_with_measure(fp, NEW_MEASURE)
  File "/home/jupyter-arin/scratch-arin/nucEstConv.py", line 31, in reextract_with_measure
    extractor.run()
  File "/home/jupyter-arin/.local/lib/python3.7/site-packages/extraction/core/extractor.py", line 444, in run
    tps = list(range(self.meta["time_settings/ntimepoints"]))
TypeError: only integer scalar arrays can be converted to a scalar index
Possible fixes
On the surface, the fix should just involve changing line 444 of https://git.ecdf.ed.ac.uk/swain-lab/aliby/aliby/-/blob/master/extraction/core/extractor.py from this:
tps = list(range(self.meta["time_settings/ntimepoints"]))
to this:
tps = list(range(self.meta["time_settings/ntimepoints"].item()))
The bug occurs because self.meta["time_settings/ntimepoints"] is an array of size 1 rather than a scalar, and range() cannot use such an array as an integer. Converting it to a scalar with .item() should therefore fix it.
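If the metadata cannot be relied on to hold a scalar, a slightly more defensive variant of the same fix is to normalise the value before passing it to range(). The sketch below is only an illustration: as_int_scalar is a hypothetical helper, not part of aliby, and it assumes the stored value is a plain integer, a one-element list or tuple, or an object with a NumPy-style .item() method. A one-element list stands in for the size-1 array here.

```python
import operator


def as_int_scalar(value):
    """Return value as a Python int, unwrapping a size-1 container if needed.

    Hypothetical helper for illustration only; not part of aliby.
    """
    if hasattr(value, "item"):
        # NumPy arrays and NumPy scalars expose .item(); for arrays it
        # raises ValueError unless the array holds exactly one element.
        value = value.item()
    elif isinstance(value, (list, tuple)):
        if len(value) != 1:
            raise ValueError(f"expected exactly one element, got {len(value)}")
        value = value[0]
    # operator.index() rejects floats and other non-integer values
    return operator.index(value)


# Stand-in for self.meta; the one-element list mimics the size-1 array
meta = {"time_settings/ntimepoints": [100]}
tps = list(range(as_int_scalar(meta["time_settings/ntimepoints"])))
```

With something like this, line 444 would wrap the metadata lookup in as_int_scalar(...) instead of calling .item() directly, and a genuinely malformed value (several elements, or a float) would fail with a clearer error.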
A better fix may be to correct the metadata itself, since it doesn't make sense to store time_settings/ntimepoints as an array.
However, I think there might be more to it: why hasn't this been an issue with the pipeline before? Is this sort of logic more widespread in the code base?
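One way to start answering both points might be to audit the metadata for other entries stored as size-1 containers. The sketch below assumes the metadata has already been loaded into a plain dict; unwrap_size_one and the example/... keys are made up for illustration, and how the metadata is actually read from and written back to the HDF5 file is left out because I haven't checked it.

```python
def unwrap_size_one(meta):
    """Return (clean_meta, changed_keys) with size-1 containers unwrapped.

    Hypothetical helper for illustration only; not part of aliby.
    """
    clean, changed = {}, []
    for key, value in meta.items():
        if isinstance(value, (list, tuple)) and len(value) == 1:
            # One-element list or tuple: keep only the element
            clean[key] = value[0]
            changed.append(key)
        elif getattr(value, "ndim", 0) >= 1 and getattr(value, "size", 0) == 1:
            # NumPy-style array holding exactly one element
            clean[key] = value.item()
            changed.append(key)
        else:
            clean[key] = value
    return clean, changed


# Stand-in metadata; only the first key comes from the traceback above
meta = {
    "time_settings/ntimepoints": [100],
    "example/scalar": 3,
    "example/list": ["a", "b"],
}
clean_meta, changed_keys = unwrap_size_one(meta)
```

The changed_keys list shows which entries follow the same pattern as time_settings/ntimepoints. It would need reviewing before anything is written back, because an entry that is legitimately a one-element list would be flattened too.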