Biomedicine and Machine Learning

Using cancer bioimaging to improve Machine Learning research

The Bhargava Lab at UIUC develops methods and produces valuable bioimaging data to improve breast cancer treatments. Tissue samples are stained for molecular markers and then imaged to obtain multispectral images, to which expert annotations are added. Data for an individual patient range from 1 to 4 TB and include a large number of processed infrared image files, related spectral values, stained data for molecular markers, and information on image correction and processing.

Patient data are costly and very time-consuming to collect, and their collection requires regulatory oversight and collaboration with clinicians in healthcare settings. For researchers who cannot produce these data themselves, public access to high-quality resources such as this one is critical to accelerating translational science. In addition to known interest within the Cancer Research community, the relatively small community of researchers who produce and use infrared imagery wants to use this data set to test their methods. The data resource will also be of value to Machine Learning researchers, who require high-quality, well-curated data to develop and test algorithms and new techniques.

Reproduced with permission from the authors. Mittal, S., Yeh, K., Leslie, L.S., Kenkel, S., Kajdacsy-Balla, A. and Bhargava, R., 2018. Simultaneous cancer and tumor microenvironment subtyping using confocal infrared microscopy for all-digital molecular histopathology. Proceedings of the National Academy of Sciences, 115(25), pp.E5651-E5660.