A probabilistic atlas for cell identification
Bubnis G, Ban S, DiFranco MD, Kato S
arXiv
2019
Abstract
We propose a general framework for a collaborative machine learning system to assist bioscience researchers with the task of labeling specific cell identities from microscopic still or video imaging. The distinguishing features of this approach versus prior approaches include: (1) use of a statistical model of cell features that is iteratively improved, (2) generation of probabilistic guesses at cell ID rather than single best-guesses for each cell, (3) tracking of joint probabilities of features within and across cells, and (4) ability to exploit multi-modal features, such as cell position, morphology, reporter intensities, and activity. We provide an example implementation of such a system applicable to labeling fluorescently tagged C. elegans neurons. As a proof of concept, we use a generative spring-mass model to simulate sequences of cell imaging datasets with variable cell positions and fluorescence intensities. Training on synthetic data, we find that atlases that track inter-cell positional correlations give higher labeling accuracies than those that treat cell positions independently. Tracking an additional feature type, fluorescence intensity, boosts accuracy relative to a position-only atlas, suggesting that multiple cell features could be leveraged to improve automated label predictions.
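To make the idea of probabilistic cell ID guesses concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy atlas in which each candidate label has an independent Gaussian over each feature (x, y, intensity) and a uniform prior over labels. All names (`ATLAS`, `label_probabilities`, the cell labels) and all numeric values are hypothetical. It treats cells independently, so it corresponds to the simpler baseline the abstract contrasts with correlation-tracking atlases; it only illustrates how adding a second feature type can sharpen a probability distribution over labels.

```python
import math

# Hypothetical toy atlas: per-label mean and standard deviation of each feature.
# Feature order is (x, y, intensity); every number here is made up for illustration.
ATLAS = {
    "cellA": {"mean": (0.0, 0.0, 1.0), "std": (1.0, 1.0, 0.2)},
    "cellB": {"mean": (0.5, 0.0, 3.0), "std": (1.0, 1.0, 0.2)},
    "cellC": {"mean": (5.0, 5.0, 1.0), "std": (1.0, 1.0, 0.2)},
}


def log_gaussian(value, mean, std):
    """Log density of a 1D Gaussian evaluated at `value`."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((value - mean) / std) ** 2


def label_probabilities(observation, atlas, feature_indices):
    """Return a probability for every label instead of a single best guess.

    Assumes independent features and a uniform prior over labels.
    """
    log_likes = {}
    for label, params in atlas.items():
        log_likes[label] = sum(
            log_gaussian(observation[i], params["mean"][i], params["std"][i])
            for i in feature_indices
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_likes.values())
    weights = {label: math.exp(ll - peak) for label, ll in log_likes.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# An observed cell that sits exactly halfway between cellA and cellB in position.
observed = (0.25, 0.0, 2.9)

position_only = label_probabilities(observed, ATLAS, feature_indices=(0, 1))
with_intensity = label_probabilities(observed, ATLAS, feature_indices=(0, 1, 2))

print("position only: ", position_only)
print("with intensity:", with_intensity)
```

In this toy setup, position alone cannot separate `cellA` from `cellB`, while including intensity concentrates the probability on `cellB`. An atlas that tracks inter-cell correlations would replace the independent per-cell Gaussians with a joint distribution over several cells, which this sketch does not attempt.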