Keywords:

Neural Networks, Weight Initialization, MNIST Dataset, Xavier/Glorot Initialization, He Initialization, Model Performance, Training Efficiency

Impact of Weight Initialization Techniques on Neural Network Efficiency and Performance: A Case Study with MNIST Dataset

Authors

Chitra Desai
Department of Computer Science, National Defence Academy, Pune

Abstract

This manuscript investigates the impact of weight initialization on the efficiency and performance of deep learning models, using a specific neural network architecture applied to the MNIST dataset of handwritten digits as a case study. Appropriate weight initialization is essential for rapid convergence and strong generalization, both of which are critical for learning complex data patterns effectively. The study evaluates several weight initialization methods, including random, Xavier/Glorot, and He initialization, within a network consisting of a flatten layer, a dense layer of 128 neurons with ReLU activation, and a final dense output layer. The examination is grounded in the foundational theory behind each strategy, assessing its effect on the training process and on subsequent model performance. Through detailed analysis, the research clarifies how these initialization techniques influence convergence speed and overall model performance on tasks such as image recognition. By combining empirical observations with theoretical insights, the study offers guidance for the strategic selection of weight initialization methods, thereby optimizing the training and effectiveness of deep learning models.
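For reference, the two principled initializers named above are usually stated as follows (these are the standard textbook forms; the abstract itself does not reproduce them). Xavier/Glorot uniform initialization draws weights as $W \sim \mathcal{U}\!\left(-\sqrt{6/(n_{\mathrm{in}}+n_{\mathrm{out}})},\ \sqrt{6/(n_{\mathrm{in}}+n_{\mathrm{out}})}\right)$, while He initialization, designed for ReLU units, draws $W \sim \mathcal{N}\!\left(0,\ 2/n_{\mathrm{in}}\right)$, where $n_{\mathrm{in}}$ and $n_{\mathrm{out}}$ are the fan-in and fan-out of the layer.

The sketch below shows how an experiment of this kind might be set up in Keras. It is a minimal illustration, not the authors' code: the optimizer (Adam), epoch count, softmax output activation, and the standard deviation of the random-normal baseline are assumptions, since the abstract specifies only the Flatten / Dense(128, ReLU) / Dense-output architecture and the three initializer families.

import tensorflow as tf

# Load MNIST and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def build_model(initializer):
    # Architecture from the abstract: Flatten -> Dense(128, ReLU) -> Dense output.
    # The 10-unit softmax head is an assumption (10 MNIST classes).
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu",
                              kernel_initializer=initializer),
        tf.keras.layers.Dense(10, activation="softmax",
                              kernel_initializer=initializer),
    ])

# The three initializer families compared in the study; the random-normal
# stddev of 0.05 is an assumed baseline, not taken from the paper.
initializers = {
    "random_normal":  tf.keras.initializers.RandomNormal(stddev=0.05),
    "glorot_uniform": tf.keras.initializers.GlorotUniform(),  # Xavier/Glorot
    "he_normal":      tf.keras.initializers.HeNormal(),
}

for name, init in initializers.items():
    model = build_model(init)
    model.compile(optimizer="adam",                      # assumed optimizer
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5,                # assumed epoch count
              validation_split=0.1, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: test accuracy = {acc:.4f}")

Comparing the per-epoch validation curves across the three runs is what reveals the convergence-speed differences the abstract refers to; the final test accuracies capture the generalization comparison.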

Article Details

Published

2024-04-09

Section

Articles

License

Copyright (c) 2024 International Journal of Engineering and Computer Science

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

How to Cite

Desai, C. (2024). Impact of Weight Initialization Techniques on Neural Network Efficiency and Performance: A Case Study with MNIST Dataset. International Journal of Engineering and Computer Science, 13(04), 26115-26120. https://doi.org/10.18535/ijecs/v13i04.4809