Glia Cell Inspired Reinforcement Learning Agent for Neural Network Optimization
Fagioli A.; Foresti G. L.
2025-01-01
Abstract
Human brain functions, such as the internal workings of neurons, have historically inspired the design of neural networks. Recently, researchers have increased network parameters to enhance performance and emulate the brain’s complex connectivity, with models like Megatron-Turing NLG reaching 530 billion parameters. These large models, though powerful, require high-end hardware, making them impractical for resource-limited devices. Inspired by glial cells, which create, maintain, and destroy synapses based on their performance, this paper introduces a reinforcement learning agent to optimize neural network structures by adding or pruning nodes in the dense layers of a multi-layer perceptron based on specific reward functions that account for their effectiveness. Experiments on the Fashion-MNIST and CIFAR-10 datasets demonstrate that the RL agent can reduce model parameters by up to 80.95% without losing accuracy. Drawing from neuroscience, this method explores the potential to create efficient, high-performing models suitable for various hardware platforms without a loss of generality.
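The core idea in the abstract — an agent that adds or prunes units in a dense layer, rewarded for keeping accuracy while shrinking the parameter count — can be illustrated with a toy sketch. This is not the authors' implementation: the `accuracy` curve is a hypothetical stand-in for validation accuracy, the reward weighting is arbitrary, and a simple greedy acceptance rule stands in for the actual RL policy.

```python
# Toy sketch (assumed, not the paper's method): grow/prune one hidden
# layer of an MLP, rewarding parameter savings minus any accuracy drop.
import random

def param_count(width, n_in=784, n_out=10):
    # Parameters of a one-hidden-layer MLP: weights + biases for both layers.
    return n_in * width + width + width * n_out + n_out

def accuracy(width):
    # Hypothetical saturating accuracy curve standing in for a real
    # validation run; wider layers help with diminishing returns.
    return 0.9 * width / (width + 32)

def reward(width, base_width=512):
    # Effectiveness reward: fraction of parameters saved, penalized
    # (weight 10, arbitrary) by any accuracy lost versus the base model.
    acc_drop = accuracy(base_width) - accuracy(width)
    saved = 1 - param_count(width) / param_count(base_width)
    return saved - 10.0 * max(acc_drop, 0.0)

def optimize(width=512, steps=200, seed=0):
    # Greedy stand-in for the RL policy: propose adding or pruning
    # 16 units and accept whenever the reward does not decrease.
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = max(16, width + rng.choice((-16, 16)))
        if reward(candidate) >= reward(width):
            width = candidate
    return width

final = optimize()
reduction = 1 - param_count(final) / param_count(512)
```

Under these assumed curves the agent settles on a substantially narrower layer while the toy accuracy stays close to the base model's, mirroring the accuracy-preserving compression the abstract reports.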


