Author ORCID Identifier
0000-0003-2870-8134
Document Type
Article
Disciplines
Statistics
Abstract
Graph Neural Networks (GNNs) have established themselves as powerful tools for graph-structured data. However, when feature separability among nodes is low, conventional neighborhood aggregation strategies often degrade performance through over-smoothing and noisy information propagation. In this work, we introduce a novel GNN framework that refines the aggregation process by integrating feature similarity and neighborhood entropy into node message passing. Unlike standard models that aggregate neighbor information uniformly, our model dynamically adjusts neighbor influence, prioritizing nodes with high similarity and low entropy. We evaluate the model on synthetic graphs generated using the Stochastic Block Model (SBM), varying homophily and class imbalance to simulate challenging classification scenarios. Our experiments demonstrate that the method consistently outperforms the GraphSAGE model across Accuracy, Balanced Accuracy, F1 Score, and Matthews Correlation Coefficient, particularly in graphs with weak class signal and high structural noise. Statistical significance tests confirm the robustness of the observed improvements. Furthermore, the optimized aggregation weights show how the model adapts to different graph structures, enhancing interpretability. These findings suggest that integrating structural information selectively during aggregation can significantly improve GNN performance in complex graph environments, offering a path toward more explainable graph-based models.
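The aggregation idea described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the use of cosine similarity, the definition of a neighbor's entropy as the normalized Shannon entropy of its softmax-scaled similarities to its own neighbors, the score alpha * similarity - beta * entropy, and the names aggregate, neighborhood_entropy, alpha, and beta. It is not the authors' implementation.

```python
import numpy as np

def cosine_sim(a, b, eps=1e-12):
    # Cosine similarity between two feature vectors (assumed similarity measure).
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

def softmax(z):
    # Numerically stable softmax over a 1-D sequence of scores.
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def neighborhood_entropy(X, adj, j):
    # Hypothetical definition: normalized Shannon entropy of node j's
    # similarity distribution over its own neighbors, in [0, 1].
    nbrs = adj[j]
    if len(nbrs) < 2:
        return 0.0
    p = softmax([cosine_sim(X[j], X[k]) for k in nbrs])
    h = -np.sum(p * np.log(p))
    return float(h / np.log(len(nbrs)))

def neighbor_weights(X, adj, i, alpha=1.0, beta=1.0):
    # Higher similarity raises a neighbor's weight; higher entropy lowers it.
    # alpha and beta are illustrative trade-off parameters.
    scores = [
        alpha * cosine_sim(X[i], X[j]) - beta * neighborhood_entropy(X, adj, j)
        for j in adj[i]
    ]
    return softmax(scores)

def aggregate(X, adj, i, alpha=1.0, beta=1.0):
    # Weighted sum of neighbor features; falls back to the node's own
    # features when it has no neighbors.
    nbrs = adj[i]
    if len(nbrs) == 0:
        return X[i].copy()
    w = neighbor_weights(X, adj, i, alpha, beta)
    return w @ X[nbrs]

# Toy graph: node 0 has one similar neighbor (1) and one dissimilar neighbor (2).
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}

weighted = aggregate(X, adj, 0)                       # similarity-entropy weighting
uniform = aggregate(X, adj, 0, alpha=0.0, beta=0.0)   # reduces to a plain mean
print(weighted, uniform)
```

With alpha and beta both set to zero the weights become uniform, so the sketch reduces to plain mean aggregation over neighbors, which makes the effect of the two terms easy to compare on a given graph.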
DOI
https://doi.org/10.21427/fggr-jf50
Recommended Citation
Bernhardt, Brian Daniel; Marciano, Chiara; and Guarracino, Mario Rosario, "Improving Node Classification for Graphs with Weak Feature Signals: A Similarity-Entropy Aggregation Approach" (2025). SAML-25 Workshop on Statistical and Machine Learning. 13.
https://arrow.tudublin.ie/saml/13
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.
Publication Details
Statistical and Machine Learning: Methods and Applications (SAML-25) on June 5th and 6th, 2025 at TU Dublin, Ireland.
doi:10.21427/fggr-jf50