Doctoral

A Unified Framework for Evaluating Training Efficiency in Deep (Bayesian) Neural Networks: Metrics, Overtraining, Stopping Criteria, and Grokking Computer Science

Eduardo Cueto Mendoza

Document Type

Theses, Ph.D

Disciplines

Computer Sciences

Abstract

Measuring the training efficiency of artificial neural networks is an open research problem, and the current literature reports several attempts to define measures or create reporting frameworks. Current methods lack generality because they require measurements of the hardware or software; thus, comparing efficiency between different systems can be difficult. Similarly, current metrics and frameworks generally do not propose using the metrics to directly improve training efficiency. This thesis presents three main contributions: (1) a novel framework that quantifies the training efficiency of a neural architecture on a learning task as the average ratio of model accuracy to total energy consumption during training, (2) the definition and analysis of a novel efficiency-based stopping criterion for neural network training, and (3) experiments that provide evidence that grokking, the sudden increase in the training accuracy of a deep neural network during extremely long training runs, does not alter the dynamics of efficiency. The experimental framework evaluates Convolutional Neural Networks (CNNs) and Bayesian CNNs (BCNNs) across multiple model sizes and convergence conditions on the MNIST and CIFAR-10 datasets. Results show that training efficiency declines as training progresses and varies by architecture, and that CNNs generally outperform BCNNs in efficiency, especially as task complexity increases.
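
As an illustration of the first contribution, the sketch below shows one possible reading of "the average ratio of model accuracy to total energy consumption during training". It is not taken from the thesis: the function name `training_efficiency`, the per-epoch granularity, the use of joules, and the choice to divide each epoch's accuracy by the cumulative energy spent up to that epoch are all assumptions made for this example.

```python
from typing import Sequence


def training_efficiency(accuracy: Sequence[float],
                        energy_joules: Sequence[float]) -> float:
    """Hypothetical efficiency score: the mean, over epochs, of
    accuracy divided by the cumulative energy consumed so far.

    accuracy[i]      -- model accuracy measured after epoch i (0..1)
    energy_joules[i] -- energy consumed during epoch i (assumed unit: joules)
    """
    if len(accuracy) != len(energy_joules) or len(accuracy) == 0:
        raise ValueError("need one accuracy and one energy value per epoch")

    cumulative_energy = 0.0
    ratio_sum = 0.0
    for acc, energy in zip(accuracy, energy_joules):
        if energy <= 0:
            raise ValueError("per-epoch energy must be positive")
        cumulative_energy += energy            # total energy spent so far
        ratio_sum += acc / cumulative_energy   # accuracy per joule at this epoch

    # Average the per-epoch ratios (assumption: unweighted mean).
    return ratio_sum / len(accuracy)


# Two epochs at 10 J each: the ratios are 0.5/10 and 0.8/20.
score = training_efficiency([0.5, 0.8], [10.0, 10.0])
```

Under this reading, each additional epoch adds to the energy total, so the per-epoch ratio falls unless accuracy grows at least as fast as the energy spent, which is in line with the reported decline in efficiency as training progresses.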

DOI

https://doi.org/10.21427/zzyp-0q41

Recommended Citation

Cueto Mendoza, Eduardo, "A Unified Framework for Evaluating Training Efficiency in Deep (Bayesian) Neural Networks: Metrics, Overtraining, Stopping Criteria, and Grokking Computer Science" (2026). Doctoral. 6.
https://arrow.tudublin.ie/compdidadoc/6

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Included in

Computer Sciences Commons
