ChatGPT has fascinated the public as it has become clear that generative artificial intelligence (AI) can be useful in everyday life. Behind the scenes, scientists are continually advancing AI for potential applications so vast that it may change life as we know it by accelerating scientific and technological progress.
In research recently published in the Journal of Machine Learning Research, Fenglei Fan, Ph.D. ’23, a former Rensselaer doctoral student and current research assistant professor of mathematics at The Chinese University of Hong Kong; Rongjie Lai, a former Rensselaer associate professor and now professor of mathematics at Purdue University; and Ge Wang, Ph.D., Clark & Crossan Endowed Chair Professor and director of the Biomedical Imaging Center at Rensselaer, found that analyzing the topology of artificial neural networks illuminates how best to harness the power of AI in the future.
ChatGPT, which is all the buzz in the AI world, is a deep neural network, meaning one with many layers; such networks are also referred to as deep learning algorithms. Wang and his collaborators found that network width, the number of neurons in a layer, plays a significant role alongside depth. Interestingly, they found that one type of network may be converted into the other to accomplish a given task, such as regression or classification, which are critical elements of machine learning. (Machine learning is a subset of AI that makes computer-generated predictions without explicit instructions.) In other words, a deep neural network may be converted into a wide neural network, and vice versa.
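To make the depth-versus-width distinction concrete, here is a minimal sketch, an illustration only and not the construction from the paper, comparing a deep, narrow network with a shallow, wide one. The layer widths are assumptions chosen so that the two networks happen to have the same number of trainable parameters; PyTorch is used for brevity.

```python
import torch.nn as nn

def make_mlp(widths):
    """Build a fully connected network from a list of layer widths."""
    layers = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(w_in, w_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # no activation after the output layer

# Hypothetical sizes: both networks map one input to one output and
# happen to have exactly 865 trainable parameters each.
deep_net = make_mlp([1, 16, 16, 16, 16, 1])  # five weight layers, narrow
wide_net = make_mlp([1, 288, 1])             # one hidden layer, very wide

for name, net in [("deep", deep_net), ("wide", wide_net)]:
    n_params = sum(p.numel() for p in net.parameters())
    print(f"{name} network: {n_params} parameters")
```

With suitable training, either shape can fit the same simple regression or classification target; the research described here studies quantitatively when and how one shape can be traded for the other.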
“Early in the technology, scientists focused on very wide and shallow networks (one or two layers) to do universal approximation,” said Wang. “Later, deep neural networks (many layers working in a feedforward fashion) were proven to be very powerful. However, we were not fully convinced that the focus should be solely on deep learning rather than on both wide and deep learning. We feel that depth is just one dimension and width is another, and both need to be considered and combined.”
In their research, the team examined the relationship between deep and wide neural networks. Using quantitative analysis, they found that the two lie on a continuum: a deep network can be converted into a wide one and back again. Considering both dimensions gives a fuller picture and avoids biasing network design toward one shape. Their research hints at a future of machine learning in which networks are both deep and wide, interconnected with favorable dynamics and an optimized ratio of width to depth. Networks will become increasingly complex, and when their dynamics reach the desired states, they will produce remarkable outcomes.
“It’s like playing with LEGO bricks,” said Wang. “You can build a very tall skyscraper or you can build a flat large building with many rooms on the same level. With networks, the number of neurons and their interconnection are what matter most. In 3D space, neurons can be arranged in myriad ways. It’s just like the structure of our brains. The neurons just need to be interconnected in various ways to facilitate diverse tasks.”
“Comprehending the conversion between the depth and width of neural networks remains a dynamic and evolving field,” said Lai. “Both wide and deep networks offer their distinct advantages and drawbacks. Shallow networks, typically, are more straightforward to grasp. Our exploration into the symmetries inherent in these two network types illuminates a new perspective for understanding deep networks through the lens of wide networks.”
“Dr. Wang’s research on the relationship between wide and deep neural networks opens new paths to harness the potential of AI,” said Shekhar Garde, Ph.D., dean of Rensselaer’s School of Engineering. “AI is impacting almost every aspect of our society, from medicine to new materials to finance. It is an exciting time for the field, and Dr. Wang is at the forefront of thought on the subject.”