Neural networks that mimic how the human brain works now create art, power computer vision, and drive many other applications. According to new research, a Chinese-made neural-network microchip called Taichi, which uses photons instead of electrons, can perform AI tasks as well as electronic chips can while using 1,000 times less energy.
AI typically relies on artificial neural networks for applications such as analyzing medical scans and generating images. In these systems, data is fed into circuit components called neurons, similar to neurons in the human brain, that work together to solve problems such as facial recognition. A neural net is called “deep” if it has multiple layers of these neurons.
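To make that structure concrete, here is a minimal sketch in Python (a generic toy network, not a model of any particular system): each layer multiplies its inputs by a matrix of weights, the parameters that stand in for synapses, and passes the result through a simple nonlinearity.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)  # simple nonlinearity applied at each neuron

# A toy "deep" network: two layers of neurons, weights chosen at random.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 8))   # layer 1: 8 inputs -> 16 neurons
W2 = rng.normal(size=(4, 16))   # layer 2: 16 neurons -> 4 outputs

x = rng.normal(size=8)          # input data (e.g., pixel features)
hidden = relu(W1 @ x)           # first layer of neurons
output = W2 @ hidden            # second layer produces the answer
print(output.shape)             # (4,)
```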
“Optical neural networks are no longer toy models. They can now be applied to real-world tasks.” —Lu Fang, Tsinghua University, Beijing
As neural networks grow in size and power, they consume more energy when run on traditional electronics. For example, according to a 2022 Nature study, OpenAI spent US $4.6 million to run 9,200 GPUs for two weeks to train the state-of-the-art neural network GPT-3.
Because of the drawbacks of electronic computing, some researchers are investigating optical computing as a promising basis for next-generation AI. This photonic approach uses light to perform calculations faster and with less power than electronic approaches.
Now, scientists at Beijing's Tsinghua University and the Beijing National Research Center for Information Science and Technology have developed Taichi, a photonic microchip that can perform as well as electronic devices on advanced AI tasks while proving to be much more energy efficient.
“Optical neural networks are no longer just toy models,” says Lu Fang, an associate professor of electronics engineering at Tsinghua University. “They can now be applied to real-world tasks.”
How do optical neural networks work?
Two strategies for developing optical neural networks involve either scattering light in specific patterns within a microchip or making light waves precisely interfere with one another within a device. When input flows into these optical neural networks in the form of light, the output light encodes the results of the complex operations performed within the devices.
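At an abstract level, both strategies compute the same way: input data is encoded as complex light-field amplitudes, the device applies a linear transform as the light propagates through it, and photodetectors read out intensities at the output. A rough numerical sketch of that principle (purely illustrative, with arbitrary sizes and values):

```python
import numpy as np

rng = np.random.default_rng(1)

# Input data encoded on light: a vector of complex field amplitudes.
x = rng.normal(size=8) + 1j * rng.normal(size=8)

# The optical system (scattering or interference) acts as a complex-valued
# linear transform on the field as it propagates through the device.
T = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))

y = T @ x                 # the computation happens during propagation
readout = np.abs(y) ** 2  # photodetectors measure intensity, |field|^2
print(readout)
```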
Both photonic computing approaches have significant advantages and disadvantages, Fang explains. For example, optical neural networks that rely on scattering and diffraction can pack many neurons close together and consume virtually no energy, because diffraction-based neural nets perform their operations as light beams scatter through the optical layers that represent the network. However, one drawback of diffraction-based neural networks is that they cannot be reconfigured: each chain of operations is essentially limited to one specific task.
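One simplified way to picture a diffraction-based network, offered as a sketch rather than a faithful physical model: each layer is a fixed phase mask the light passes through, with free-space propagation in between. Because the masks are physically etched, the network's "weights" are frozen at fabrication, which is why the device cannot be retargeted to a new task.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64  # light field sampled at 64 points (1-D for simplicity)

# Each diffractive layer is an etched phase mask: fixed once fabricated.
phase_masks = [np.exp(1j * rng.uniform(0, 2 * np.pi, N)) for _ in range(3)]

# Free-space propagation between layers, modeled crudely in the Fourier
# domain (each spatial frequency picks up a different phase delay).
freqs = np.fft.fftfreq(N)
propagator = np.exp(-1j * np.pi * freqs**2 * 100.0)

def propagate(field):
    return np.fft.ifft(np.fft.fft(field) * propagator)

field = rng.normal(size=N) + 0j       # input image encoded as a light field
for mask in phase_masks:              # light scatters through each layer...
    field = propagate(field * mask)   # ...with no energy spent on compute
intensity = np.abs(field) ** 2        # detector-plane output
print(intensity.sum())
```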
Taichi boasts 13.96 million parameters.
In contrast, optical neural networks that rely on interference can be easily reconfigured. Interference-based neural nets send multiple beams through a mesh of channels, and the way those beams interfere where the channels intersect performs the device's operations. However, interferometers are bulky, which limits how much such neural nets can scale up, and they also consume a lot of energy.
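Interference-based meshes are typically built from Mach-Zehnder interferometers, each of which acts as a small 2-by-2 matrix operation whose behavior is set by tunable phase shifters; cascading many of them implements larger operations. Here is a sketch of one such unit using a standard textbook parameterization (an assumption for illustration, not Taichi's specific design):

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary of a Mach-Zehnder interferometer with two phase shifters.

    theta sets the splitting ratio, phi the relative output phase. Tuning
    these (e.g., via on-chip heaters) reprograms the operation.
    """
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)  # 50:50 beam splitter
    shift = np.diag([np.exp(1j * theta), 1.0])      # internal phase arm
    out_phase = np.diag([np.exp(1j * phi), 1.0])    # external phase arm
    return out_phase @ bs @ shift @ bs

# Two beams enter; the interferometer mixes them. Changing theta and phi
# changes the computation without changing the hardware.
beams = np.array([1.0 + 0j, 0.0 + 0j])
print(np.abs(mzi(np.pi / 2, 0.0) @ beams) ** 2)  # output power split
```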
Additionally, current photonic chips suffer from unavoidable errors. Attempting to scale up optical neural networks by increasing the number of neuron layers in these devices typically compounds this unavoidable noise exponentially. Until now, this has limited optical neural networks to basic AI tasks, such as simple pattern recognition, making them generally unsuitable for advanced real-world applications, Fang says.
The researchers say that Taichi, by contrast, is a hybrid design that combines both the diffraction and interference approaches. It contains a cluster of diffraction units that compress data for large-scale input and output in a compact space, but the chip also includes an array of interferometers for reconfigurable computation. The encoding protocol developed for Taichi splits difficult tasks and large network models into sub-problems and sub-models that can be distributed across different modules, Fang says.
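That division of labor can be sketched numerically, with made-up sizes rather than the chip's actual dimensions: a fixed diffractive stage compresses a large optical input, while a reconfigurable interferometer stage is retuned for each sub-task.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stage 1: a fixed diffractive front end that compresses a large optical
# input (say, 256 modes) down to a compact representation (16 modes).
D = rng.normal(size=(16, 256)) + 1j * rng.normal(size=(16, 256))

# Stage 2: a reconfigurable interferometer mesh, idealized here as a
# 16x16 unitary matrix that we are free to retune for each sub-task.
U_task_a = np.linalg.qr(rng.normal(size=(16, 16))
                        + 1j * rng.normal(size=(16, 16)))[0]
U_task_b = np.linalg.qr(rng.normal(size=(16, 16))
                        + 1j * rng.normal(size=(16, 16)))[0]

x = rng.normal(size=256) + 0j          # large-scale optical input
compressed = D @ x                     # diffractive units: big I/O, no tuning
out_a = np.abs(U_task_a @ compressed)  # same hardware, sub-task A...
out_b = np.abs(U_task_b @ compressed)  # ...retuned for sub-task B
```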
How does Taichi blend both types of neural networks?
Previous research has typically sought to expand the capacity of optical neural networks by increasing the number of neuron layers, mimicking what is often done with their electronic counterparts. Instead, Taichi's architecture scales up by distributing computing across multiple chiplets operating in parallel. This means that Taichi can avoid the problem of exponentially accumulating errors that occur when optical neural networks stack many layers of neurons.
“This ‘shallow in depth but wide’ architecture ensures network scale,” says Fang.
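A toy numerical experiment illustrates the intuition, using a simplified additive-noise model (in real photonic hardware, errors can compound much faster): stacking noisy layers in series accumulates error, while running shallow modules in parallel and combining their outputs averages it out.

```python
import numpy as np

rng = np.random.default_rng(4)

def noisy_layer(x, sigma=0.05):
    """One optical layer: an ideal operation plus unavoidable physical noise."""
    return x + rng.normal(0, sigma, size=x.shape)

x = np.ones(1000)

# Deep: 16 noisy layers in series; each layer's error feeds the next.
deep = x.copy()
for _ in range(16):
    deep = noisy_layer(deep)

# Wide: 16 shallow modules in parallel, each only one noisy layer,
# with their results combined (averaged) at the end.
wide = np.mean([noisy_layer(x) for _ in range(16)], axis=0)

print("deep error:", np.std(deep - x))  # errors accumulate with depth
print("wide error:", np.std(wide - x))  # parallel errors average out
```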
Taichi created music clips in the style of Bach and art in the style of Van Gogh and Munch.
For example, previous optical neural networks typically had only a few thousand parameters, the connections between neurons that mimic the synapses linking biological neurons in the human brain. In contrast, Taichi boasts 13.96 million parameters.
Previous optical neural networks were often limited to classifying data along just a dozen or so categories, for example, determining whether an image depicts one of 10 digits. In contrast, in a test using the Omniglot database, which includes 1,623 different handwritten characters from 50 different alphabets, Taichi achieved 91.89 percent accuracy, on par with its electronic counterparts.
Scientists also tested Taichi on an advanced AI task: content generation. They discovered that it could create music clips in the style of Johann Sebastian Bach and generate images of figures and landscapes in the style of Vincent van Gogh and Edvard Munch.
Overall, the researchers found that Taichi displayed an energy efficiency of up to about 160 trillion operations per second per watt and an area efficiency of nearly 880 trillion multiply-accumulate operations (the most basic operations in neural networks) per second per square millimeter. That makes it more than 1,000 times more energy efficient than the NVIDIA H100, one of the latest electronic GPUs, and roughly 100 times more energy efficient and 10 times more area efficient than other optical neural networks to date.
Although the Taichi chip itself is compact and energy efficient, Fang cautions that it relies on many other systems, such as laser light sources and high-speed data coupling. These other systems are much bulkier than the chip, “taking up almost the entire table,” she points out. In the future, Fang and her colleagues aim to integrate more of these modules onto the chip, making the overall system more compact and energy efficient.
The scientists detailed their findings online on April 11 in the journal Science.