Articles

Understanding the Molecular Information Contained in Principal Component Analysis of Vibrational Spectra of Biological Systems

Franck Bonnier, Technological University Dublin
Hugh Byrne, Technological University Dublin

Document Type

Article

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

Biochemistry and molecular biology

Publication Details

Analyst, 2012, 137, 322. DOI: 10.1039/c1an15821j

Abstract

K-means clustering followed by Principal Component Analysis (PCA) is employed to analyse Raman spectroscopic maps of single biological cells. K-means clustering successfully identifies regions of cellular cytoplasm, nucleus and nucleoli, but the mean spectra do not differentiate their biochemical composition. The loadings of the principal components identified by PCA shed further light on the spectral basis for differentiation, but they are complex and, as the number of spectra per cluster is imbalanced, particularly in the case of the nucleoli, the loadings under-represent the basis for differentiation of some cellular regions. Analysis of pure bio-molecules, both structurally and spectrally distinct, in the case of histone, ceramide and RNA, and similar, in the case of the proteins albumin, collagen and histone, shows the relatively strong representation of spectrally sharp features in the spectral loadings, and the systematic variation of the loadings as one cluster becomes reduced in number. The more complex cellular environment is simulated by weighted sums of spectra, illustrating that although the loadings become increasingly complex, their origin in a weighted sum of the constituent molecular components is still evident. Returning to the cellular analysis, the number of spectra per cluster is artificially balanced by increasing the weighting of the spectra of the smaller clusters. While this renders the PCA loadings more complex for the three-way analysis, a pairwise analysis illustrates clear differences between the identified subcellular regions, and notably the molecular differences between nuclear and nucleolar regions are elucidated. Overall, the study demonstrates how appropriate consideration of the data available can improve the understanding of the information delivered by PCA.
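
The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the synthetic Gaussian "spectra", the cluster sizes, and the helper names `pca_loadings` and `balance_by_replication` are all assumptions made for illustration, and the weighting is implemented here simply by repeating the spectra of the smaller clusters, which is only one possible reading of "increasing the weighting". Cluster labels are taken as given rather than computed by K-means.

```python
import numpy as np

def pca_loadings(spectra, n_components=2):
    """Return the first PCA loadings (one per row) of mean-centred spectra via SVD."""
    centred = spectra - spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:n_components]

def balance_by_replication(spectra, labels):
    """Repeat spectra of smaller clusters so every cluster matches the largest one.

    Assumption: replication stands in for the weighting mentioned in the abstract.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    target = max(int(np.sum(labels == k)) for k in classes)
    idx = []
    for k in classes:
        members = np.flatnonzero(labels == k)
        # np.resize cycles through the members until `target` indices are produced
        idx.append(np.resize(members, target))
    idx = np.concatenate(idx)
    return spectra[idx], labels[idx]

# Synthetic, hypothetical data: three clusters with very different sizes,
# each a single Gaussian band plus noise (cluster 2 is small and spectrally sharp).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)

def band(centre, width):
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

means = {0: band(0.3, 0.05), 1: band(0.5, 0.05), 2: band(0.7, 0.01)}
sizes = {0: 300, 1: 150, 2: 10}
spectra = np.vstack([means[k] + 0.01 * rng.standard_normal((n, x.size))
                     for k, n in sizes.items()])
labels = np.concatenate([np.full(n, k) for k, n in sizes.items()])

loadings_raw = pca_loadings(spectra)                # dominated by the large clusters
balanced, balanced_labels = balance_by_replication(spectra, labels)
loadings_balanced = pca_loadings(balanced)          # small cluster now carries equal weight
```

Comparing `loadings_raw` with `loadings_balanced` (for example by plotting both against `x`) gives a feel for how cluster size affects which spectral features the leading loadings represent. A pairwise analysis in the same spirit would pass only the spectra of two clusters to `pca_loadings`.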

DOI

https://doi.org/10.1039/c1an15821j

Recommended Citation

Bonnier, F. & Byrne, H. (2012) Understanding the Molecular Information Contained in Principal Component Analysis of Vibrational Spectra of Biological Systems. Analyst, 2012, 137, 322. doi:10.1039/c1an15821j

Included in

Biological and Chemical Physics Commons
