This item is available under a Creative Commons License for non-commercial use only
This thesis reviews the current state of photometric classification in astronomy and identifies two main gaps: a dependence on handcrafted rules, and a lack of interpretability in the more successful classifiers. To address these gaps, Deep Learning and Computer Vision were used to create a more interpretable model, with unsupervised training used to reduce human bias.
The main contribution is an investigation into the impact of unsupervised feature extraction from multi-wavelength image data on the classification task. The feature extraction is performed by an unsupervised Deep Belief Network, which extracts lower-dimensionality features from the multi-wavelength image data captured by the Sloan Digital Sky Survey. These features are used in a Random Forest classifier alongside 10 color values, calculated from the differences in magnitude between the wavelength bands. The results are compared with those of a separate Random Forest classifier trained on only the 10 color values.
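The 10 color values can be illustrated with a minimal sketch. It assumes that the colors are all pairwise magnitude differences between the five Sloan Digital Sky Survey bands (u, g, r, i, z), which gives exactly 10 combinations; the abstract does not spell out the pairing, and the function name and magnitude values below are hypothetical.

```python
from itertools import combinations

# The five SDSS photometric bands, ordered from shortest to longest wavelength.
SDSS_BANDS = ("u", "g", "r", "i", "z")


def color_features(magnitudes):
    """Return every pairwise magnitude difference, keyed like 'u-g'.

    Assumption: the 10 colors are all C(5, 2) band pairs, with the
    shorter-wavelength band listed first in each pair.
    """
    return {
        f"{a}-{b}": magnitudes[a] - magnitudes[b]
        for a, b in combinations(SDSS_BANDS, 2)
    }


# Hypothetical magnitudes for a single object (not taken from the thesis).
example = {"u": 19.5, "g": 18.0, "r": 17.5, "i": 17.25, "z": 17.0}
colors = color_features(example)

for name, value in colors.items():
    print(f"{name}: {value:.2f}")
```

In the baseline described above, a vector like `colors` would be the classifier's only input; in the extended model it would be concatenated with the features extracted by the Deep Belief Network.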
A statistically significant increase of 0.0361 in the macro-averaged F1 score over the baseline result was achieved, indicating that the novel features were useful in the classification task. The individual F1 scores increased by 0.0627 for stars, 0.0206 for galaxies and 0.0252 for QSOs. Relative to the baseline results, the novel features gave an increase of 4.35% in the overall result, 7.81% for the star class, 2.20% for the galaxy class and 3.32% for the QSO class.
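The relationship between these figures can be checked with a short sketch. The macro-averaged F1 score is the unweighted mean of the per-class F1 scores, so the macro-averaged gain should equal the mean of the three per-class gains, up to rounding in the reported values. The sketch also assumes that each relative increase is the absolute gain divided by the baseline score, which lets an approximate baseline be recovered; the variable and function names are illustrative only.

```python
def macro_average(per_class):
    """Unweighted mean over classes, as used for macro-averaged F1."""
    return sum(per_class.values()) / len(per_class)


def implied_baseline(absolute_gain, relative_gain):
    """Baseline score implied by gain / baseline = relative_gain (an assumption)."""
    return absolute_gain / relative_gain


# Per-class F1 gains and the macro-averaged gain as reported in the abstract.
f1_gain = {"star": 0.0627, "galaxy": 0.0206, "qso": 0.0252}
reported_macro_gain = 0.0361

# Mean of the per-class gains; expected to match the reported macro gain
# to within the rounding of the four-decimal inputs.
implied_macro_gain = macro_average(f1_gain)
difference = abs(implied_macro_gain - reported_macro_gain)

# Approximate baseline macro F1 implied by the 4.35% relative increase.
baseline_macro_f1 = implied_baseline(reported_macro_gain, 0.0435)

print(f"mean of per-class gains: {implied_macro_gain:.5f}")
print(f"difference from reported gain: {difference:.5f}")
print(f"implied baseline macro F1: {baseline_macro_f1:.3f}")
```

Because the inputs are rounded, the recovered baseline is only an estimate and should not be read as a result reported by the thesis.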
An analysis of the model suggests that further improvements might be achieved by implementing a sparsity goal for the unsupervised feature-extraction training.
Lindh, A. (2016) Investigating the Impact of Unsupervised Feature-Extraction from Multi-Wavelength Image Data for Photometric Classification of Stars, Galaxies and QSOs, Dissertation, Dublin Institute of Technology.