
Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Target:
  • swain-lab/aliby/aliby-mirror
  • swain-lab/aliby/alibylite
Commits on Source (38)
Showing with 1804 additions and 1173 deletions
## Summary
(Summarize the bug encountered concisely)
{Summarize the bug encountered concisely}
I confirm that I have (if relevant):
- [ ] Read the troubleshooting guide: https://gitlab.com/aliby/aliby/-/wikis/Troubleshooting-(basic)
- [ ] Updated aliby and aliby-baby.
- [ ] Tried the unit test.
- [ ] Tried a scaled-down version of my experiment (distributed=0, filter=0, tps=10)
- [ ] Tried re-postprocessing.
## Steps to reproduce
(How one can reproduce the issue - this is very important)
{How one can reproduce the issue - this is very important}
- aliby version: 0.1.{...}, or if development/unreleased version, commit SHA: {...}
- platform(s):
- [ ] Jura
- [ ] Other Linux, please specify distribution and version: {...}
- [ ] MacOS, please specify version: {...}
- [ ] Windows, please specify version: {...}
- experiment ID: {...}
- Any special things you need to know about this experiment: {...}
## What is the current bug behavior?
......@@ -19,6 +35,12 @@
(Paste any relevant logs - please use code blocks (```) to format console output, logs, and code, as
it's very hard to read otherwise.)
```
{PASTE YOUR ERROR MESSAGE HERE!!}
```
## Possible fixes
(If you can, link to the line of code that might be responsible for the problem)
......@@ -11,11 +11,12 @@ End-to-end processing of cell microscopy time-lapses. ALIBY automates segmentati
## Quickstart Documentation
Installation of [VS Studio](https://visualstudio.microsoft.com/downloads/#microsoft-visual-c-redistributable-for-visual-studio-2022) (Microsoft Visual C++ redistributable) is required on Windows. Native MacOS support is under work, but you can use containers (e.g., Docker, Podman) in the meantime.
For analysing local data
To analyse local data
```bash
pip install aliby # aliby[network] if you want to access an OMERO server
pip install aliby
```
Add any of the optional flags `omero` and `utils` (e.g., `pip install aliby[omero, utils]`). `omero` provides tools to connect with an OMERO server and `utils` provides visualisation, user interface and additional deep learning tools.
See our [installation instructions]( https://aliby.readthedocs.io/en/latest/INSTALL.html ) for more details.
### CLI
......@@ -80,12 +81,18 @@ It fetches the metadata from the Image object, and uses the TilerParameters valu
```python
fpath = "h5/location"
trap_id = 9
trange = list(range(0, 30))
tile_id = 9
trange = range(0, 10)
ncols = 8
riv = remoteImageViewer(fpath)
trap_tps = riv.get_trap_timepoints(trap_id, trange, ncols)
trap_tps = [riv.tiler.get_tiles_timepoint(tile_id, t) for t in trange]
# You can also access labelled traps
m_ts = riv.get_labelled_trap(tile_id=0, tps=[0])
# And plot them directly
riv.plot_labelled_trap(trap_id=0, channels=[0, 1, 2, 3], trange=range(10))
```
Depending on the network speed, this can take several seconds at the moment.
......@@ -95,8 +102,8 @@ For a speed-up: take fewer z-positions if you can.
Alternatively, if you want to get all the traps at a given timepoint:
```python
timepoint = 0
seg_expt.get_tiles_timepoints(timepoint, tile_size=96, channels=None,
timepoint = (4,6)
tiler.get_tiles_timepoint(timepoint, channels=None,
z=[0,1,2,3,4])
```
......
......@@ -4,7 +4,6 @@
contain the root `toctree` directive.
.. toctree::
:hidden:
Home page <self>
Installation <INSTALL.md>
......
2022-10-10 15:31:27,350 - INFO
Swain Lab microscope experiment log file
GIT commit: e5d5e33 fix: changes to a few issues with focus control on Batman.
Microscope name: Batman
Date: 2022-10-10 15:31:27
Log file path: D:\AcquisitionDataBatman\Swain Lab\Ivan\RAW DATA\2022\Oct\10-Oct-2022\pH_med_to_low00\pH_med_to_low.log
Micromanager config file: C:\Users\Public\Microscope control\Micromanager config files\Batman_python_15_4_22.cfg
Omero project: Default project
Omero tags:
Experiment details: Effect on growth and cytoplasmic pH of switch from normal pH (4.25) media to higher pH (5.69). Switching is run using the Oxygen software
-----Acquisition settings-----
2022-10-10 15:31:27,350 - INFO Image Configs:
Image config,Channel,Description,Exposure (ms), Number of Z sections,Z spacing (um),Sectioning method
brightfield1,Brightfield,Default bright field config,30,5,0.6,PIFOC
pHluorin405_0_4,pHluorin405,Phluorin excitation from 405 LED 0.4v and 10ms exposure,5,1,0.6,PIFOC
pHluorin488_0_4,GFPFast,Phluorin excitation from 488 LED 0.4v,10,1,0.6,PIFOC
cy5,cy5,Default cy5,30,1,0.6,PIFOC
Device properties:
Image config,device,property,value
pHluorin405_0_4,DTOL-DAC-1,Volts,0.4
pHluorin488_0_4,DTOL-DAC-2,Volts,0.4
cy5,DTOL-DAC-3,Volts,4
2022-10-10 15:31:27,353 - INFO
group: YST_247 field: position
Name, X, Y, Z, Autofocus offset
YST_247_001,-8968,-3319,2731.125040696934,123.25
YST_247_002,-8953,-3091,2731.3000406995416,123.25
YST_247_003,-8954,-2849,2731.600040704012,122.8
YST_247_004,-8941,-2611,2730.7750406917185,122.8
YST_247_005,-8697,-2541,2731.4500407017767,118.6
group: YST_247 field: time
start: 0
interval: 300
frames: 180
group: YST_247 field: config
brightfield1: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin405_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin488_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
cy5: 0xfffffffffffffffffffffffffffffffffffffffffffff
2022-10-10 15:31:27,356 - INFO
group: YST_1510 field: position
Name,X,Y,Z,Autofocus offset
YST_1510_001,-6450,-230,2343.300034917891,112.55
YST_1510_002,-6450,-436,2343.350034918636,112.55
YST_1510_003,-6450,-639,2344.000034928322,116.8
YST_1510_004,-6450,-831,2344.250034932047,116.8
YST_1510_005,-6848,-536,2343.3250349182636,110
group: YST_1510 field: time
start: 0
interval: 300
frames: 180
group: YST_1510 field: config
brightfield1: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin405_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin488_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
cy5: 0xfffffffffffffffffffffffffffffffffffffffffffff
2022-10-10 15:31:27,359 - INFO
group: YST_1511 field: position
Name, X, Y, Z, Autofocus offset
YST_1511_001,-10618,-1675,2716.900040484965,118.7
YST_1511_002,-10618,-1914,2717.2250404898077,122.45
YST_1511_003,-10367,-1695,2718.2500405050814,120.95
YST_1511_004,-10367,-1937,2718.8250405136496,120.95
YST_1511_005,-10092,-1757,2719.975040530786,119.45
group: YST_1511 field: time
start: 0
interval: 300
frames: 180
group: YST_1511 field: config
brightfield1: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin405_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin488_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
cy5: 0xfffffffffffffffffffffffffffffffffffffffffffff
2022-10-10 15:31:27,362 - INFO
group: YST_1512 field: position
Name,X,Y,Z,Autofocus offset
YST_1512_001,-8173,-2510,2339.0750348549336,115.65
YST_1512_002,-8173,-2718,2338.0250348392874,110.8
YST_1512_003,-8173,-2963,2336.625034818426,110.8
YST_1512_004,-8457,-2963,2336.350034814328,110.9
YST_1512_005,-8481,-2706,2337.575034832582,113.3
group: YST_1512 field: time
start: 0
interval: 300
frames: 180
group: YST_1512 field: config
brightfield1: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin405_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin488_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
cy5: 0xfffffffffffffffffffffffffffffffffffffffffffff
2022-10-10 15:31:27,365 - INFO
group: YST_1513 field: position
Name,X,Y,Z,Autofocus offset
YST_1513_001,-6978,-2596,2339.8750348668545,113.3
YST_1513_002,-6978,-2380,2340.500034876168,113.3
YST_1513_003,-6971,-2163,2340.8750348817557,113.3
YST_1513_004,-6971,-1892,2341.2500348873436,113.3
YST_1513_005,-6692,-1892,2341.550034891814,113.3
group: YST_1513 field: time
start: 0
interval: 300
frames: 180
group: YST_1513 field: config
brightfield1: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin405_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
pHluorin488_0_4: 0xfffffffffffffffffffffffffffffffffffffffffffff
cy5: 0xfffffffffffffffffffffffffffffffffffffffffffff
2022-10-10 15:31:27,365 - INFO
2022-10-10 15:31:27,365 - INFO
-----Experiment started-----
Source diff could not be displayed: it is too large.
[tool.poetry]
name = "aliby"
version = "0.1.58"
version = "0.1.64"
description = "Process and analyse live-cell imaging data"
authors = ["Alan Munoz <alan.munoz@ed.ac.uk>"]
packages = [
......@@ -14,7 +14,7 @@ readme = "README.md"
[tool.poetry.scripts]
aliby-run = "aliby.bin.run:run"
aliby-annotate = "aliby.bin.annotate:annotate_image"
aliby-annotate = "aliby.bin.annotate:annotate"
aliby-visualise = "aliby.bin.visualise:napari_overlay"
[build-system]
......@@ -45,15 +45,14 @@ tqdm = "^4.62.3" # progress bars
xmltodict = "^0.13.0" # read ome-tiff metadata
zarr = "^2.14.0"
GitPython = "^3.1.27"
h5py = "2.10" # File I/O
# Networking
omero-py = { version = ">=5.6.2", optional = true } # contact omero server
[tool.poetry.extras]
omero = ["omero-py"]
utils = ["napari", "torch", "pytorch-lightning", "torchvision", "trio", "grid-strategy"]
# Baby segmentation
aliby-baby = {version = "^0.1.17", optional=true}
# Postprocessing
[tool.poetry.group.pp.dependencies]
......@@ -61,10 +60,9 @@ leidenalg = "^0.8.8"
more-itertools = "^8.12.0"
pycatch22 = "^0.4.2"
[tool.poetry.group.pp]
optional = true
[tool.poetry.group.baby.dependencies]
aliby-baby = "^0.1.15"
h5py = "2.10" # File I/O
[tool.poetry.group.dev]
optional = true
......@@ -104,13 +102,18 @@ pytest = "^6.2.5"
[tool.poetry.group.utils]
optional = true
# Dependency groups can only be used by a poetry installation, not pip
[tool.poetry.group.utils.dependencies]
napari = ">=0.4.16"
torch = "^1.13.1"
pytorch-lightning = "^1.9.3"
torchvision = "^0.14.1"
trio = "^0.22.0"
grid-strategy = "^0.0.1"
napari = {version = ">=0.4.16", optional=true}
Torch = {version = "^1.13.1", optional=true}
pytorch-lightning = {version = "^1.9.3", optional=true}
torchvision = {version = "^0.14.1", optional=true}
trio = {version = "^0.22.0", optional=true}
grid-strategy = {version = "^0.0.1", optional=true}
[tool.poetry.extras]
omero = ["omero-py"]
baby = ["aliby-baby"]
[tool.black]
line-length = 79
......
......@@ -202,7 +202,7 @@ class ProcessABC(ABC):
def run(self):
pass
def _log(self, message: str, level: str = "warn"):
def _log(self, message: str, level: str = "warning"):
# Log messages in the corresponding level
logger = logging.getLogger("aliby")
getattr(logger, level)(f"{self.__class__.__name__}: {message}")
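A minimal sketch of the getattr-based dispatch above (logger name taken from the snippet; the message is illustrative):
```python
import logging

logger = logging.getLogger("aliby")
level = "warning"
# getattr(logger, level) resolves to logger.warning, so the call below
# is equivalent to logger.warning("Tiler: tile 9 has no cells")
getattr(logger, level)("Tiler: tile 9 has no cells")
```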
......
......@@ -47,20 +47,25 @@ class Signal(BridgeH5):
def __getitem__(self, dsets: t.Union[str, t.Collection]):
"""Get and potentially pre-process data from h5 file and return as a dataframe."""
if isinstance(dsets, str): # no pre-processing
df = self.apply_prepost(dsets)
return self.add_name(df, dsets)
return self.get(dsets)
elif isinstance(dsets, list): # pre-processing
is_bgd = [dset.endswith("imBackground") for dset in dsets]
# Check we are not comparing tile-indexed and cell-indexed data
assert sum(is_bgd) == 0 or sum(is_bgd) == len(
dsets
), "Tile data and cell data can't be mixed"
return [
self.add_name(self.apply_prepost(dset), dset) for dset in dsets
]
return [self.get(dset) for dset in dsets]
else:
raise Exception(f"Invalid type {type(dsets)} to get datasets")
def get(self, dsets: t.Union[str, t.Collection], **kwargs):
"""Get and potentially pre-process data from h5 file and return as a dataframe."""
if isinstance(dsets, str): # no pre-processing
df = self.get_raw(dsets, **kwargs)
prepost_applied = self.apply_prepost(dsets, **kwargs)
return self.add_name(prepost_applied, dsets)
@staticmethod
def add_name(df, name):
"""Add column of identical strings to a dataframe."""
......@@ -129,18 +134,24 @@ class Signal(BridgeH5):
Returns an array with three columns: the tile id, the mother label, and the daughter label.
"""
if lineage_location is None:
lineage_location = "postprocessing/lineage"
if merged:
lineage_location += "_merged"
lineage_location = "modifiers/lineage_merged"
with h5py.File(self.filename, "r") as f:
# if lineage_location not in f:
# lineage_location = lineage_location.split("_")[0]
if lineage_location not in f:
lineage_location = "postprocessing/lineage"
tile_mo_da = f[lineage_location]
lineage = np.array(
(
tile_mo_da["trap"],
tile_mo_da["mother_label"],
tile_mo_da["daughter_label"],
)
).T
if isinstance(tile_mo_da, h5py.Dataset):
lineage = tile_mo_da[()]
else:
lineage = np.array(
(
tile_mo_da["trap"],
tile_mo_da["mother_label"],
tile_mo_da["daughter_label"],
)
).T
return lineage
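For reference, a hypothetical lineage array in the layout described above (one row per tile/trap id, mother label, daughter label):
```python
import numpy as np

# illustrative values only
lineage = np.array(
    [
        [3, 1, 2],  # trap 3: mother 1, daughter 2
        [3, 1, 4],  # trap 3: mother 1, daughter 4
    ]
)
```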
@_first_arg_str_to_df
......@@ -309,7 +320,9 @@ class Signal(BridgeH5):
with h5py.File(self.filename, "r") as f:
picks = set()
if path in f:
picks = set(zip(*[f[path + name] for name in names]))
picks = set(
zip(*[f[path + name] for name in names if name in f[path]])
)
return picks
def dataset_to_df(self, f: h5py.File, path: str) -> pd.DataFrame:
......
......@@ -109,18 +109,26 @@ def _assoc_indices_to_3d(ndarray: np.ndarray):
Convert the last column to a new row while repeating all previous indices.
This is useful when converting a signal multiindex before comparing association.
Assumes the input array has shape (N,3)
"""
result = ndarray
if len(ndarray) and ndarray.ndim > 1:
columns = np.arange(ndarray.shape[1])
result = np.stack(
(
ndarray[:, np.delete(columns, -1)],
ndarray[:, np.delete(columns, -2)],
),
axis=1,
)
if ndarray.shape[1] == 3: # Faster indexing for single positions
result = np.transpose(
np.hstack((ndarray[:, [0]], ndarray)).reshape(-1, 2, 2),
axes=[0, 2, 1],
)
else: # 20% slower but more general indexing
columns = np.arange(ndarray.shape[1])
result = np.stack(
(
ndarray[:, np.delete(columns, -1)],
ndarray[:, np.delete(columns, -2)],
),
axis=1,
)
return result
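A worked sketch of the faster (N, 3) branch above, showing how each (trap, mother, daughter) row becomes a pair of (trap, cell) indices (values illustrative):
```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])  # columns: trap, mother_label, daughter_label
out = np.transpose(
    np.hstack((arr[:, [0]], arr)).reshape(-1, 2, 2), axes=[0, 2, 1]
)
# out[0] -> [[1, 2], [1, 3]]: (trap, mother) and (trap, daughter)
# out[1] -> [[4, 5], [4, 6]]
```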
......
......@@ -6,6 +6,8 @@ import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from agora.utils.indexing import validate_association
index_row = t.Tuple[str, str, int, int]
......@@ -120,7 +122,9 @@ def bidirectional_retainment_filter(
def melt_reset(df: pd.DataFrame, additional_ids: t.Dict[str, pd.Series] = {}):
new_df = add_index_levels(df, additional_ids)
return new_df.melt(ignore_index=False).reset_index()
return new_df.melt(
ignore_index=False, var_name="time (minutes)", value_name="signal"
).reset_index()
# Drop the cells that, if used, would reduce the information the most
......@@ -175,3 +179,67 @@ def drop_mother_label(index: pd.MultiIndex) -> np.ndarray:
def get_index_as_np(signal: pd.DataFrame):
# Get mother labels from multiindex dataframe
return np.array(signal.index.to_list())
def standard_filtering(
raw: pd.DataFrame,
lin: np.ndarray,
presence_high: float = 0.8,
presence_low: int = 7,
):
# Get all mothers
_, valid_indices = validate_association(
lin, np.array(raw.index.to_list()), match_column=0
)
in_lineage = raw.loc[valid_indices]
# Filter mothers by presence
present = in_lineage.loc[
in_lineage.notna().sum(axis=1) > (in_lineage.shape[1] * presence_high)
]
# Get indices
indices = np.array(present.index.to_list())
to_cast = np.stack((lin[:, :2], lin[:, [0, 2]]), axis=1)
ndin = to_cast[..., None] == indices.T[None, ...]
# use indices to fetch all daughters
valid_association = ndin.all(axis=2)[:, 0].any(axis=-1)
# Remove repeats
mothers, daughters = np.split(to_cast[valid_association], 2, axis=1)
mothers = mothers[:, 0]
daughters = daughters[:, 0]
d_m_dict = {tuple(d): m[-1] for m, d in zip(mothers, daughters)}
# assuming unique sorts
raw_mothers = raw.loc[_as_tuples(mothers)]
raw_mothers["mother_label"] = 0
raw_daughters = raw.loc[_as_tuples(daughters)]
raw_daughters["mother_label"] = d_m_dict.values()
concat = pd.concat((raw_mothers, raw_daughters)).sort_index()
concat.set_index("mother_label", append=True, inplace=True)
# Last filter to remove tracklets that are too short
removed_buds = concat.notna().sum(axis=1) <= presence_low
filt = concat.loc[~removed_buds]
# We check that no mothers are left child-less
m_d_dict = {tuple(m): [] for m in mothers}
for (trap, d), m in d_m_dict.items():
m_d_dict[(trap, m)].append(d)
for trap, daughter, mother in concat.index[removed_buds]:
idx_to_delete = m_d_dict[(trap, mother)].index(daughter)
del m_d_dict[(trap, mother)][idx_to_delete]
bud_free = []
for m, d in m_d_dict.items():
if not d:
bud_free.append(m)
final_result = filt.drop(bud_free)
# In the end, we get the mothers present for more than the presence_high fraction of the experiment
# and their tracklets present for more than presence_low time points
return final_result
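A hedged call sketch for `standard_filtering` (variable names hypothetical: `raw` is a cell-by-timepoint signal indexed by (trap, cell_label) and `lin` is an (N, 3) lineage array as above):
```python
# keep mothers present for >80% of time points and buds with more than 7 points
filtered = standard_filtering(raw, lin, presence_high=0.8, presence_low=7)
```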
......@@ -9,7 +9,7 @@ import numpy as np
import pandas as pd
from utils_find_1st import cmp_larger, find_1st
from agora.utils.indexing import validate_association
from agora.utils.indexing import compare_indices, validate_association
def apply_merges(data: pd.DataFrame, merges: np.ndarray):
......@@ -31,23 +31,29 @@ def apply_merges(data: pd.DataFrame, merges: np.ndarray):
"""
indices = data.index
if "mother_label" in indices.names:
indices = indices.droplevel("mother_label")
valid_merges, indices = validate_association(
merges, np.array(list(data.index))
merges, np.array(list(indices))
)
# Assign non-merged
merged = data.loc[~indices]
# Implement the merges and drop source rows.
# TODO Use matrices to perform merges in batch
# for efficiency
if valid_merges.any():
to_merge = data.loc[indices]
for target, source in merges[valid_merges]:
target, source = tuple(target), tuple(source)
targets, sources = zip(*merges[valid_merges])
for source, target in zip(sources, targets):
target = tuple(target)
to_merge.loc[target] = join_tracks_pair(
to_merge.loc[target].values,
to_merge.loc[source].values,
to_merge.loc[tuple(source)].values,
)
to_merge.drop(source, inplace=True)
to_merge.drop(map(tuple, sources), inplace=True)
merged = pd.concat((merged, to_merge), names=data.index.names)
return merged
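For orientation, a hypothetical `merges` array for `apply_merges`: shape (N, 2, 2), each row pairing the two (trap, cell_label) track indices to be joined (the loop above unpacks them as target, then source):
```python
import numpy as np

# illustrative values only: join track (trap 3, cell 9) into track (trap 3, cell 5)
merges = np.array([[[3, 5], [3, 9]]])
```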
......@@ -57,7 +63,84 @@ def join_tracks_pair(target: np.ndarray, source: np.ndarray) -> np.ndarray:
"""
Join two tracks and return the new value of the target.
"""
target_copy = copy(target)
target_copy = target
end = find_1st(target_copy[::-1], 0, cmp_larger)
target_copy[-end:] = source[-end:]
return target_copy
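A worked sketch of the joining step above: the target keeps its values up to its last non-zero entry and takes the source's values for the trailing stretch (arrays illustrative):
```python
import numpy as np
from utils_find_1st import cmp_larger, find_1st  # same helper as above

target = np.array([2.0, 3.0, 4.0, 0.0, 0.0])
source = np.array([0.0, 0.0, 0.0, 5.0, 6.0])
end = find_1st(target[::-1], 0, cmp_larger)  # two trailing zeros -> end == 2
target[-end:] = source[-end:]
# target is now [2., 3., 4., 5., 6.]
```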
def group_merges(merges: np.ndarray) -> t.List[t.Tuple]:
# Return a list where the cell is present as source and target
# (multimerges)
sources_targets = compare_indices(merges[:, 0, :], merges[:, 1, :])
is_multimerge = sources_targets.any(axis=0) | sources_targets.any(axis=1)
is_monomerge = ~is_multimerge
multimerge_subsets = union_find(zip(*np.where(sources_targets)))
merge_groups = [merges[np.array(tuple(x))] for x in multimerge_subsets]
sorted_merges = list(map(sort_association, merge_groups))
# Ensure that source and target are at the edges
return [
*sorted_merges,
*[[event] for event in merges[is_monomerge]],
]
def union_find(lsts):
sets = [set(lst) for lst in lsts if lst]
merged = True
while merged:
merged = False
results = []
while sets:
common, rest = sets[0], sets[1:]
sets = []
for x in rest:
if x.isdisjoint(common):
sets.append(x)
else:
merged = True
common |= x
results.append(common)
sets = results
return sets
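A quick sketch of `union_find`: overlapping pairs are merged into connected groups (input values illustrative):
```python
groups = union_find([(0, 1), (1, 2), (4, 5)])
# groups == [{0, 1, 2}, {4, 5}]
```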
def sort_association(array: np.ndarray):
# Sort the internal associations
order = np.where(
(array[:, 0, ..., None] == array[:, 1].T[None, ...]).all(axis=1)
)
res = []
[res.append(x) for x in np.flip(order).flatten() if x not in res]
sorted_array = array[np.array(res)]
return sorted_array
def merge_association(
association: np.ndarray, merges: np.ndarray
) -> np.ndarray:
grouped_merges = group_merges(merges)
flat_indices = association.reshape(-1, 2)
comparison_mat = compare_indices(merges[:, 0], flat_indices)
valid_indices = comparison_mat.any(axis=0)
if valid_indices.any(): # Where valid, perform transformation
replacement_d = {}
for dataset in grouped_merges:
for k in dataset:
replacement_d[tuple(k[0])] = dataset[-1][1]
flat_indices[valid_indices] = [
replacement_d[tuple(i)] for i in flat_indices[valid_indices]
]
merged_indices = flat_indices.reshape(-1, 2, 2)
return merged_indices
#!/usr/bin/env jupyter
"""
Command Line Interface utilities.
"""
......@@ -5,10 +5,18 @@ Currently only works on UNIX-like systems due to using "/" to split addresses.
Usage example
From python
$ python dev_async_annotator.py --image_path path/to/folder/with/h5files --results_path path/to/folder/with/images/zarr --pos position_name --ncells max_n_to_annotate
$ python annotator.py --image_path path/to/folder/with/h5files --results_path path/to/folder/with/images/zarr --pos position_name --ncells max_n_to_annotate
As executable (installed via poetry)
$ dev_async_annotator.py --image_path path/to/folder/with/h5files --results_path path/to/folder/with/images/zarr --pos position_name --ncells max_n_to_annotate
$ annotator.py --image_path path/to/folder/with/h5files --results_path path/to/folder/with/images/zarr --pos position_name --ncells max_n_to_annotate
During annotation:
- Assign a (binary) label by typing '1' or '2'.
- Type 'u' to undo.
- Type 's' to skip.
- Type 'q' to quit.
File will be saved in: ./YYYY-MM-DD_annotation/annotation.csv, where YYYY-MM-DD is the current date.
"""
import argparse
......
#!/usr/bin/env jupyter
"""
Models that link regions of interest, such as mothers and buds.
"""
#!/usr/bin/env jupyter
"""
Extracted from the baby repository. Bud Tracker algorithm to link
cell outlines as mothers and buds.
"""
# /usr/bin/env jupyter
import pickle
from os.path import join
......
......@@ -508,11 +508,12 @@ class Pipeline(ProcessABC):
frac_clogged_traps = self.check_earlystop(
filename, earlystop, steps["tiler"].tile_size
)
self._log(
f"{name}:Clogged_traps:{frac_clogged_traps}"
)
frac = np.round(frac_clogged_traps * 100)
pbar.set_postfix_str(f"{frac} Clogged")
if frac_clogged_traps > 0.3:
self._log(
f"{name}:Clogged_traps:{frac_clogged_traps}"
)
frac = np.round(frac_clogged_traps * 100)
pbar.set_postfix_str(f"{frac} Clogged")
else:
# stop if too many traps are clogged
self._log(
......@@ -567,7 +568,7 @@ class Pipeline(ProcessABC):
"""
# get the area of the cells organised by trap and cell number
s = Signal(filename)
df = s["/extraction/general/None/area"]
df = s.get_raw("/extraction/general/None/area")
# check the latest time points only
cells_used = df[
df.columns[-1 - es_parameters["ntps_to_eval"] : -1]
......
......@@ -6,12 +6,12 @@ Example of usage:
fpath = "/home/alan/Documents/dev/skeletons/scripts/data/16543_2019_07_16_aggregates_CTP_switch_2_0glu_0_0glu_URA7young_URA8young_URA8old_01/URA8_young018.h5"
trap_id = 9
trange = list(range(0, 30))
tile_id = 9
trange = list(range(0, 10))
ncols = 8
riv = remoteImageViewer(fpath)
riv.plot_labelled_trap(trap_id, trange, [0], ncols=ncols)
riv.plot_labelled_trap(tile_id, trange, [0], ncols=ncols)
"""
......@@ -224,7 +224,7 @@ class RemoteImageViewer(BaseImageViewer):
def get_labelled_trap(
self,
trap_id: int,
tile_id: int,
tps: t.Union[range, t.Collection[int]],
channels=None,
concatenate=True,
......@@ -234,12 +234,12 @@ class RemoteImageViewer(BaseImageViewer):
Core method to fetch traps and labels together
"""
imgs = self.get_pos_timepoints(tps, channels=channels, **kwargs)
imgs_list = [x[trap_id] for x in imgs.values()]
imgs_list = [x[tile_id] for x in imgs.values()]
outlines = [
self.cells.at_time(tp, kind="edgemask").get(trap_id, [])
self.cells.at_time(tp, kind="edgemask").get(tile_id, [])
for tp in tps
]
lbls = [self.cells.labels_at_time(tp).get(trap_id, []) for tp in tps]
lbls = [self.cells.labels_at_time(tp).get(tile_id, []) for tp in tps]
lbld_outlines = [
np.stack([mask * lbl for mask, lbl in zip(maskset, lblset)]).max(
axis=0
......@@ -253,7 +253,7 @@ class RemoteImageViewer(BaseImageViewer):
imgs_list = np.concatenate(imgs_list, axis=1)
return lbld_outlines, imgs_list
def get_images(self, trap_id, trange, channels, **kwargs):
def get_images(self, tile_id, trange, channels, **kwargs):
"""
Wrapper to fetch images
"""
......@@ -262,13 +262,13 @@ class RemoteImageViewer(BaseImageViewer):
for ch in self._find_channels(channels):
out, imgs[ch] = self.get_labelled_trap(
trap_id, trange, channels=[ch], **kwargs
tile_id, trange, channels=[ch], **kwargs
)
return out, imgs
def plot_labelled_trap(
self,
trap_id: int,
tile_id: int,
channels,
trange: t.Union[range, t.Collection[int]],
remove_axis: bool = False,
......@@ -288,7 +288,7 @@ class RemoteImageViewer(BaseImageViewer):
Parameters
----------
trap_id : int
tile_id : int
Identifier of trap
channels : Union[str, int]
Channels to use
......@@ -325,7 +325,7 @@ class RemoteImageViewer(BaseImageViewer):
nrows = int(np.ceil(len(trange) / ncols))
width = self.tiler.tile_size * ncols
out, images = self.get_images(trap_id, trange, channels, **kwargs)
out, images = self.get_images(tile_id, trange, channels, **kwargs)
# dilation makes outlines easier to see
out = dilation(out).astype(float)
......
......@@ -42,14 +42,17 @@ def get_process(process, suffix="") -> PostProcessABC or ParametersABC or None:
"""
base_location = "postprocessor.core"
possible_locations = ("processes", "multisignal", "reshapers")
valid_syntaxes = (_to_snake_case(process), _to_pascal_case(process))
valid_syntaxes = (
_to_snake_case(process),
_to_pascal_case(_to_snake_case(process)),
)
found = None
for possible_location, process_syntax in product(
possible_locations, valid_syntaxes
):
location = f"{base_location}.{possible_location}.{process.lower()}.{process_syntax}{suffix}"
location = f"{base_location}.{possible_location}.{_to_snake_case(process)}.{process_syntax}{suffix}"
found = locate(location)
if found is not None:
break
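An illustrative trace of how the loop above resolves a process name (the process name is hypothetical; the module layout follows the code shown):
```python
# get_process("buddings") probes, in order,
#   postprocessor.core.processes.buddings.buddings
#   postprocessor.core.processes.buddings.Buddings
#   postprocessor.core.multisignal.buddings.buddings
#   postprocessor.core.multisignal.buddings.Buddings
#   postprocessor.core.reshapers.buddings.buddings
#   postprocessor.core.reshapers.buddings.Buddings
# and returns the first object that locate() can import.
```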
......
......@@ -18,7 +18,7 @@ from postprocessor.core.processes.savgol import non_uniform_savgol
def load_test_dset():
# Load development dataset to test functions
"""Load development dataset to test functions."""
return pd.DataFrame(
{
("a", 1, 1): [2, 5, np.nan, 6, 8] + [np.nan] * 5,
......@@ -31,7 +31,7 @@ def load_test_dset():
def max_ntps(track: pd.Series) -> int:
# Get number of timepoints
"""Get number of time points."""
indices = np.where(track.notna())
return np.max(indices) - np.min(indices)
......@@ -84,9 +84,7 @@ def clean_tracks(
"""
ntps = get_tracks_ntps(tracks)
grs = get_avg_grs(tracks)
growing_long_tracks = tracks.loc[(ntps >= min_len) & (grs > min_gr)]
return growing_long_tracks
......@@ -111,7 +109,6 @@ def merge_tracks(
joinable_pairs = get_joinable(tracks, **kwargs)
if joinable_pairs:
tracks = join_tracks(tracks, joinable_pairs, drop=drop)
return (tracks, joinable_pairs)
......@@ -135,28 +132,23 @@ def get_joint_ids(merging_seqs) -> dict:
2 ab cd
3 abcd
We shold get:
We should get:
output {a:a, b:a, c:a, d:a}
"""
if not merging_seqs:
return {}
targets, origins = list(zip(*merging_seqs))
static_tracks = set(targets).difference(origins)
joint = {track_id: track_id for track_id in static_tracks}
for target, origin in merging_seqs:
joint[origin] = target
moved_target = [
k for k, v in joint.items() if joint[v] != v and v in joint.values()
]
for orig in moved_target:
joint[orig] = rec_bottom(joint, orig)
return {
k: v for k, v in joint.items() if k != v
} # remove ids that point to themselves
......@@ -184,14 +176,11 @@ def join_tracks(tracks, joinable_pairs, drop=True) -> pd.DataFrame:
:param drop: bool indicating whether or not to drop moved rows
"""
tmp = copy(tracks)
for target, source in joinable_pairs:
tmp.loc[target] = join_track_pair(tmp.loc[target], tmp.loc[source])
if drop:
tmp = tmp.drop(source)
return tmp
......@@ -206,7 +195,7 @@ def get_joinable(tracks, smooth=False, tol=0.1, window=5, degree=3) -> dict:
"""
Get the pair of track (without repeats) that have a smaller error than the
tolerance. If there is a track that can be assigned to two or more other
ones, it chooses the one with a lowest error.
ones, choose the one with lowest error.
:param tracks: (m x n) dataframe where rows are cell tracks and
columns are timepoints
......@@ -324,7 +313,6 @@ def get_means(x, i):
v = x[~np.isnan(x)][:i]
else:
v = x[~np.isnan(x)][i:]
return np.nanmean(v)
......@@ -335,12 +323,12 @@ def get_last_i(x, i):
v = x[~np.isnan(x)][:i]
else:
v = x[~np.isnan(x)][i:]
return v
def localid_to_idx(local_ids, contig_trap):
"""Fetch then original ids from a nested list with joinable local_ids
"""
Fetch the original ids from a nested list with joinable local_ids.
input
:param local_ids: list of list of pairs with cell ids to be joint
......@@ -392,7 +380,6 @@ def get_dMetric(
dMetric = np.abs(np.subtract.outer(post, pre))
else:
dMetric = np.abs(np.subtract.outer(pre, post))
dMetric[np.isnan(dMetric)] = (
tol + 1 + np.nanmax(dMetric)
) # nans will be filtered
......@@ -404,28 +391,26 @@ def solve_matrices(
):
"""
Solve the distance matrices obtained in get_dMetric and/or merged from
independent dMetric matrices
independent dMetric matrices.
"""
ids = solve_matrix(dMetric)
if not len(ids[0]):
return []
pre, post = prepost
norm = (
np.array(pre)[ids[len(pre) > len(post)]] if tol < 1 else 1
) # relative or absolute tol
result = dMetric[ids] / norm
ids = ids if len(pre) < len(post) else ids[::-1]
return [idx for idx, res in zip(zip(*ids), result) if res <= tol]
def get_closest_pairs(
pre: List[float], post: List[float], tol: Union[float, int] = 1
):
"""Calculate a cost matrix the Hungarian algorithm to pick the best set of
options
"""
Calculate a cost matrix for the Hungarian algorithm to pick the best set of
options.
input
:param pre: list of floats with edges on left
......@@ -437,7 +422,6 @@ def get_closest_pairs(
"""
dMetric = get_dMetric(pre, post, tol)
return solve_matrices(dMetric, pre, post, tol)
......@@ -465,9 +449,7 @@ def solve_matrix(dMetric):
tmp[:, j] += np.nan
glob_is.append(i)
glob_js.append(j)
std = sorted(tmp[~np.isnan(tmp)])
return (np.array(glob_is), np.array(glob_js))
......@@ -475,7 +457,6 @@ def plot_joinable(tracks, joinable_pairs):
"""
Convenience plotting function for debugging and data vis
"""
nx = 8
ny = 8
_, axes = plt.subplots(nx, ny)
......@@ -493,7 +474,6 @@ def plot_joinable(tracks, joinable_pairs):
# except:
# pass
ax.plot(post_srs.index, post_srs.values, "g")
plt.show()
......@@ -506,21 +486,17 @@ def get_contiguous_pairs(tracks: pd.DataFrame) -> list:
:param min_dgr: float minimum difference in growth rate from
the interpolation
"""
mins, maxes = [
tracks.notna().apply(np.where, axis=1).apply(fn)
for fn in (np.min, np.max)
]
mins_d = mins.groupby(mins).apply(lambda x: x.index.tolist())
mins_d.index = mins_d.index - 1 # make indices equal
# TODO add support for skipping time points
maxes_d = maxes.groupby(maxes).apply(lambda x: x.index.tolist())
common = sorted(
set(mins_d.index).intersection(maxes_d.index), reverse=True
)
return [(maxes_d[t], mins_d[t]) for t in common]
......
......@@ -21,7 +21,7 @@ class LineageProcessParameters(ParametersABC):
class LineageProcess(PostProcessABC):
"""
Lineage process that must be passed a (N,3) lineage matrix (where the coliumns are trap, mother, daughter respectively)
Lineage process that must be passed a (N,3) lineage matrix (where the columns are trap, mother, daughter respectively)
"""
def __init__(self, parameters: LineageProcessParameters):
......@@ -40,7 +40,7 @@ class LineageProcess(PostProcessABC):
def as_function(
cls,
data: pd.DataFrame,
lineage: t.Union[t.Dict[t.Tuple[int], t.List[int]]],
lineage: t.Union[t.Dict[t.Tuple[int], t.List[int]]] = None,
*extra_data,
**kwargs,
):
......@@ -60,7 +60,7 @@ class LineageProcess(PostProcessABC):
elif hasattr(self, "lineage"):
lineage = self.lineage
elif hasattr(self, "cells"):
with h5py.File(self.cells.filename, "a") as f:
with h5py.File(self.cells.filename, "r") as f:
if (lineage_loc := "modifiers/lineage_merged") in f and merged:
lineage = f.get(lineage_loc)[()]
elif (lineage_loc := "modifiers/lineage") in f:
......