
mi process estimates mutual information

Arin Wongprommoon requested to merge mi into master

Users can now use the mi process to estimate mutual information. In other words, the estimateMI() function (https://git.ecdf.ed.ac.uk/pswain/mutual-information/-/blob/master/MIdecoding.py) is now integrated into the pipeline. Thus, future user scripts will be simpler and consistent with other post-processes. Users will no longer need to import MIdecoding.

The input is changed from a list of arrays to a pandas DataFrame for 2 reasons:

  • Consistency with other processes.
  • Consistency with the format of data from the aliby pipeline.

Additionally, the input currently has to be a multi-indexed DataFrame with the first column being 'strain'. This should match the output of grouper; data from other sources should be straightforward to wrangle into this structure.
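For illustration, here is a minimal sketch of how data in the old list-of-arrays format might be reshaped into such a DataFrame. It assumes each array holds one strain, with rows as cells and columns as time points, and it reads "first column" as the first index level. The helper name arrays_to_strain_frame, the 'cellID' level and the strain labels are hypothetical and not part of the pipeline.

```python
import numpy as np
import pandas as pd


def arrays_to_strain_frame(arrays, strain_names):
    """Stack per-strain arrays (rows = cells, columns = time points)
    into one DataFrame indexed by ('strain', 'cellID')."""
    frames = []
    for name, arr in zip(strain_names, arrays):
        # One index entry per cell, with the strain label as the first level
        index = pd.MultiIndex.from_product(
            [[name], range(arr.shape[0])], names=["strain", "cellID"]
        )
        frames.append(pd.DataFrame(arr, index=index))
    return pd.concat(frames)


# Hypothetical data: 3 cells of one strain and 2 of another, 5 time points each
rng = np.random.default_rng(0)
arrays = [rng.normal(size=(3, 5)), rng.normal(size=(2, 5))]
signal = arrays_to_strain_frame(arrays, ["wild_type", "mutant"])
```

The resulting signal keeps every time series as a row, so selecting one strain is a plain signal.loc["mutant"].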

I also decided to remove the 'verbose' parameter for 3 reasons:

  • Removing it cleans up the output.
  • Printing does not allow for easy storage of the hyperparameters.
  • Users will rarely want to know the hyperparameters.

Otherwise, miParameters corresponds to the remaining parameters of estimateMI(), and the main functionality, which is now mi.run(), is unchanged.

To verify the process, I trimmed the SFP1 data and ran it through estimateMI(). I then wrote test_mi.py, which includes the test input and the output data (MI IQR). I put the test data in constants rather than CSV files to reduce the testing overhead.
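The constants-based pattern can be sketched as follows. This is not the content of test_mi.py: the quartiles function is a hypothetical stand-in for the process under test, and the numbers are toy values chosen so the expected result can be checked by hand.

```python
import numpy as np

# Test data kept as module-level constants instead of CSV fixtures,
# so the test needs no file I/O.
TEST_INPUT = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
EXPECTED_QUARTILES = np.array([2.0, 3.0, 4.0])


def quartiles(values):
    """Hypothetical stand-in for the process under test:
    returns the 25th, 50th and 75th percentiles."""
    return np.percentile(values, [25, 50, 75])


def test_quartiles_match_constants():
    # Compare with a tolerance rather than exact equality,
    # which is the safer choice for floating-point outputs.
    np.testing.assert_allclose(quartiles(TEST_INPUT), EXPECTED_QUARTILES)


test_quartiles_match_constants()
```

Keeping the input and the expected output next to the assertion makes the test self-contained, at the cost of longer test modules when the data are large.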

This commit addresses issue #2 (closed).
