Document Type

Conference Paper


Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Publication Details

Published in: 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)


Vibrational spectra of biological species are affected by many extraneous interfering factors that must be removed through preprocessing before analysis. The present study was conducted to optimise the preprocessing methodology and variable subset selection during regression of confocal Raman microspectroscopy (CRM) and Fourier transform infrared microspectroscopy (FTIRM) spectra against ionizing radiation dose. Skin cells were γ-irradiated in vitro, and their CRM and FTIRM spectra were used to retrospectively predict the radiation dose using linear and nonlinear partial least squares (PLS) regression algorithms in addition to support vector regression (SVR). The optimal preprocessing methodology, which comprised combinations of spectral filtering, baseline subtraction, scaling and normalization options, was selected using a genetic algorithm (GA). The root mean squared error of prediction (RMSEP) served as the fitness criterion for selecting the preprocessing chromosome, and was calculated on an independent set of test spectra randomly selected from the dataset on each pass of the algorithm. The results indicated that GA selection of the optimal preprocessing methodology substantially improved the predictive capacity of the regression algorithms over baseline methodologies. The optimal preprocessing chromosomes were nevertheless similar across the regression algorithms, suggesting a common optimal preprocessing methodology for radiobiological analyses with biospectroscopy. Feature selection of both FTIRM and CRM spectra using genetic algorithms and multivariate regression provided further decreases in RMSEP, but only with nonlinear multivariate regression algorithms.
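
The GA-driven selection of a preprocessing chromosome described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the option tables (`SMOOTH`, `BASELINE`, `NORMALISE`), the function names, the GA settings, and in particular the regression step, which is a single-band ordinary least squares fit standing in for the PLS and SVR models used in the study. What the sketch does keep from the abstract is the idea that each chromosome encodes one combination of preprocessing options and that its fitness is the RMSEP on a test set drawn at random on every evaluation.

```python
import math
import random

# Hypothetical option tables: the abstract names the categories only.
SMOOTH = ("none", "moving_average")
BASELINE = ("none", "subtract_min")
NORMALISE = ("none", "vector_norm")
GENES = (SMOOTH, BASELINE, NORMALISE)


def preprocess(spectrum, chrom):
    """Apply the preprocessing steps selected by the chromosome."""
    s = list(spectrum)
    if SMOOTH[chrom[0]] == "moving_average":
        # 3-point moving average, truncated at the edges
        s = [sum(s[max(0, i - 1):i + 2]) / len(s[max(0, i - 1):i + 2])
             for i in range(len(s))]
    if BASELINE[chrom[1]] == "subtract_min":
        lowest = min(s)
        s = [v - lowest for v in s]
    if NORMALISE[chrom[2]] == "vector_norm":
        norm = math.sqrt(sum(v * v for v in s)) or 1.0
        s = [v / norm for v in s]
    return s


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def fit_line(x, y):
    """Single-feature least squares; a stand-in for PLS/SVR, not the paper's model."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def fitness(chrom, spectra, doses, rng, band=2, test_fraction=0.3):
    """RMSEP on a test set drawn at random for this evaluation."""
    idx = list(range(len(spectra)))
    rng.shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    feat = [preprocess(s, chrom)[band] for s in spectra]
    slope, intercept = fit_line([feat[i] for i in train], [doses[i] for i in train])
    pred = [slope * feat[i] + intercept for i in test]
    return rmsep([doses[i] for i in test], pred)


def run_ga(spectra, doses, pop_size=8, generations=10, mutation_rate=0.2, seed=0):
    """Return (lowest RMSEP seen, chromosome) after a simple elitist GA."""
    rng = random.Random(seed)
    pop = [tuple(rng.randrange(len(g)) for g in GENES) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted((fitness(c, spectra, doses, rng), c) for c in pop)
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        parents = [c for _, c in scored[:pop_size // 2]]  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(GENES))  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for g, options in enumerate(GENES):
                if rng.random() < mutation_rate:
                    child[g] = rng.randrange(len(options))
            children.append(tuple(child))
        pop = parents + children
    return best
```

Because the test set is redrawn for every evaluation, the fitness value is noisy, so the chromosome returned is the one with the lowest RMSEP observed rather than a guaranteed optimum. A fuller treatment would replace `fit_line` with a multivariate model and average the RMSEP over repeated splits.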