How Cornell ECE researchers are working to realize revolutionary new architectures.
Computer engineering researchers are starting to grapple with the implications of what has come to be seen as the end of, or the breaking of, Moore’s law.
The observation that transistor density on an integrated circuit doubles about every two years is named after Gordon Moore, whose 1965 paper first described and predicted this exponential growth. Moore’s law allowed the semiconductor industry to transform the world by building ever-smaller transistors at ever-greater densities, creating the ubiquitous and relatively inexpensive computing environment we live in today.
Transistors already exist at the nanometer scale, and the industry is approaching the physical limits of how small they can be built and how densely they can be arranged on a chip. Moore’s law essentially ends here.
Historically, the processor side of the industry focused on complexity and speed, while the memory side focused on storage density. Eventually, problems arose as processor and memory speeds diverged: processors increasingly stalled, waiting for data to arrive from memory. Clever engineers have found ways to hide this bottleneck over the years, but something has to change.
The demands of modern computing require real-time speed and massive scalability. The delays inherent in the traditional von Neumann architecture, the model on which most of today’s computers operate, pose a fundamental roadblock in terms of time and energy. Computer architects are searching for ways around this roadblock, to reduce the time and energy it takes to move data between the processor and memory. One solution is to do processing much closer to, or even within, the memory itself.
This requires researchers to find new computer architectures and to innovate across the system stack, from the materials and devices used to build the chips to the applications running on the computer. The old, evolutionary approach is no longer adequate. ECE researchers are working to realize revolutionary new architectures to build the next generation of computers.
“For the past two decades, my research emphasis has been power-efficient computer architecture,” said David Albonesi, professor and associate director of ECE. “Much of this focus has been on dynamically adapting cache architectures to fit the needs of the application programs.”
One approach, for which Albonesi recently received the International Symposium on Microarchitecture Test of Time Award, is to dynamically turn off portions of the processor cache according to the present characteristics of the running application. “Figuring this out on the scale of milliseconds is extremely challenging,” he said, “and our recent work adapts algorithms from other problem domains to the fine timescales of computer systems, including cache power management.”
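As a rough illustration of the idea (not Albonesi’s published mechanism), the sketch below shows a controller that periodically power-gates or re-enables cache ways based on the miss rate observed over the last interval; the class names, thresholds and intervals are purely illustrative assumptions.

```python
# Illustrative sketch of dynamic cache resizing (not the published algorithm).
# Assumption: a set-associative cache whose ways can be power-gated, and a
# periodic controller that sizes the cache from recent miss-rate feedback.

from dataclasses import dataclass

@dataclass
class CacheConfig:
    total_ways: int = 8        # physical ways available
    active_ways: int = 8       # ways currently powered on
    miss_budget: float = 0.02  # tolerated miss rate before growing the cache

def adjust_cache(cfg: CacheConfig, misses: int, accesses: int) -> CacheConfig:
    """Resize the active cache portion once per monitoring interval."""
    if accesses == 0:
        return cfg
    miss_rate = misses / accesses
    if miss_rate > cfg.miss_budget and cfg.active_ways < cfg.total_ways:
        cfg.active_ways += 1   # working set no longer fits: power a way back on
    elif miss_rate < cfg.miss_budget / 2 and cfg.active_ways > 1:
        cfg.active_ways -= 1   # cache is over-provisioned: gate a way to save power
    return cfg

# Example: phases with few misses let the controller shrink the cache,
# and a miss spike makes it grow again.
cfg = CacheConfig()
for misses, accesses in [(50, 10_000), (40, 10_000), (600, 10_000)]:
    cfg = adjust_cache(cfg, misses, accesses)
    print(cfg.active_ways)
```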
Professor Edward Suh discussed the potential security benefits of in-memory or near-memory computing. “In today's computer systems, data needs to be moved to a processor, which is often another chip,” he said. It’s been shown that the data can be read, or even changed, during the transfer. “There are techniques to encrypt and protect data in memory,” Suh continued, “but they come with some overhead in terms of both performance and energy consumption. If data stays inside memory, it will be quite difficult to tamper with.”
Olalekan Afuye and Xiang Li, both Ph.D. students in ECE, are the lead co-authors of a paper titled “Modeling and Circuit Design of Associative Memories with Spin Orbit Torque FETs” published in the IEEE Journal on Exploratory Solid-State Computational Devices and Circuits. “We explore building associative memories (CAM/TCAM) using the built-in logic functionality of SOTFETs,” Afuye said, “which are a new kind of transistor/memory being developed by multiple materials engineering groups at Cornell.”
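For readers unfamiliar with associative memories, the short sketch below models only the lookup behavior a CAM/TCAM provides, independent of the SOTFET circuits in the paper: every stored word is compared against a search key, and ‘X’ bits act as wildcards that match either value.

```python
# Behavioral sketch of a ternary content-addressable memory (TCAM).
# This models only the lookup semantics, not the SOTFET circuits:
# each stored word is checked against the key, with 'X' as "don't care".

def tcam_match(entries, key):
    """Return indices of all entries matching the key; 'X' is a wildcard."""
    matches = []
    for i, word in enumerate(entries):
        if all(w in ('X', k) for w, k in zip(word, key)):
            matches.append(i)
    return matches

# Example: a 4-bit TCAM with wildcard bits, as used in routing tables.
entries = ["10XX", "1011", "0X01"]
print(tcam_match(entries, "1011"))  # -> [0, 1]
print(tcam_match(entries, "0101"))  # -> [2]
```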
Several other major research initiatives currently underway focus on optimizing the memory system to improve the performance and efficiency of computer systems. The most high-profile efforts are two centers: DEEP3M and CRISP.
Huili Grace Xing, professor of electrical and computer engineering and materials science and engineering, is leading an interdisciplinary team to investigate Durable, Energy-Efficient, Pausable Processing in Polymorphic Memories (DEEP3M), where computational capabilities are pushed directly into high-capacity memories. This center is jointly funded by the National Science Foundation (NSF) and the Semiconductor Research Corporation (SRC).
Professor José Martínez, electrical and computer engineering, is assistant director of CRISP, the Center for Research on Intelligent Storage and Processing in Memory. CRISP is part of the Joint University Microelectronics Program (JUMP, also funded by SRC), which supports the creation of intelligent memory and storage architectures that do as much of the computing as possible as close to the bits as possible.
DEEP3M
An ultimate memory would be suited to all systems, with the desired features of non-volatility, low-power operation, infinite endurance, nanosecond write time, sub-nanosecond read time, good scalability, and more. The DEEP3M team’s approach builds on recent breakthroughs in the physics of magnetic switching and advanced materials and re-imagines the memory device as a computing element itself.
Jointly funded by the NSF and SRC, the project supports research by ECE faculty members Alyssa Apsel, Christopher Batten, Debdeep Jena, José Martínez, Alyosha Molnar and Christoph Studer, along with Daniel Ralph in the Department of Physics and Darrell Schlom in Materials Science and Engineering (MSE). Huili Grace Xing, the William L. Quackenbush Professor of Electrical and Computer Engineering in both ECE and MSE, is DEEP3M’s principal investigator.
“My focus is primarily on understanding the physics of this new device we have proposed and then converting that physical understanding into a mathematical model,” Xing said. “The official name of the device, coined by Professor Ralph, is a SOTFET.” The acronym stands for Spin-Orbit-Torque Field-Effect Transistor.
The goal is to use the spin-orbit torque to reorient the poles of the transistor’s magnetic layer like a switch, translating the magnetic orientation into ones and zeros. This technique can be used to create memory as well as computational devices.
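As a loose, purely behavioral illustration of that idea (not a device model), the toy sketch below stores a bit as the sign of a magnetization state, flips it only when a write current exceeds a switching threshold, and reads it back without disturbing it; the threshold and names are assumptions made for the example.

```python
# Toy behavioral model of a magnetic bit cell of the kind described above:
# the stored value is the orientation of a magnetic layer, flipped by a
# write current above a switching threshold and read without erasing it.
# Thresholds and names are illustrative assumptions, not device physics.

class MagneticBit:
    SWITCH_THRESHOLD = 1.0  # arbitrary units of write current

    def __init__(self):
        self.magnetization = -1   # -1 or +1: the two stable orientations

    def write(self, current: float) -> None:
        """A large enough current reorients the magnetic layer."""
        if abs(current) >= self.SWITCH_THRESHOLD:
            self.magnetization = 1 if current > 0 else -1
        # below threshold, the state is untouched (non-volatile storage)

    def read(self) -> int:
        """Reading senses the orientation without disturbing it."""
        return 1 if self.magnetization > 0 else 0

bit = MagneticBit()
bit.write(+1.5); assert bit.read() == 1
bit.write(-0.2); assert bit.read() == 1   # sub-threshold current: value retained
bit.write(-1.5); assert bit.read() == 0
```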
“The whole field of spintronics that’s been developed over the last 20 years is trying to figure out if there are ways to manipulate the spins and gain new functionality that you can't get with just a charge,” Ralph explained. “We take direct advantage of that. With spins you can transfer angular momentum and apply torques that can reorient magnets. The DEEP3M project is a way of trying to improve on the spin orbit torque, to use it to make a better memory for writing and for reading.”
It doesn’t take much electrical current to flip this magnetic switch, but two other characteristics make this approach very attractive to researchers.
“One is infinite endurance—you can write zeros and ones an infinite amount of times,” Xing explained. “The other one is non-volatility, meaning when there's no power applied to this device, the ones and zeros that are stored in the device remain.” Non-volatility is enormously important to energy efficiency.
“Non-volatility is very useful, especially if you look at big data,” Xing said, “because in big data not all the data will actively participate in computation, communication or decision making. The data are all there and zero energy is spent to store it.”
Achieving significant benefits in performance and energy efficiency will require radically new interdisciplinary approaches spanning materials, devices, circuits, and architectures. The DEEP3M team argues that new materials and devices centered around spin-orbit torque offer compelling benefits compared to both existing and emerging memory technologies in terms of endurance, density, performance, and energy efficiency.
The project has the potential to impact multiple disciplines, re-weighting the cost-benefit equation of many circuit and architecture designs. It will also encourage significant interdisciplinary collaboration, which will lead to further innovations.
In the emerging era of big data, search and pattern matching become critical low-level functions; the research and technology development proposed by DEEP3M will enable such functionality, with long-term impacts in virtually every field of science and technology, and beyond into the broader economy. In the future, students at all levels will learn to view a larger picture of the combined materials/devices/circuits/architectures approach to research.
“The best non-volatile memory is magnetic memory because of its infinite endurance and non-volatility,” said Xing. “The best logic is the semiconductor device because it creates a huge dynamic range in terms of resistance. So now we’re bringing those two elements into one device.”
This points to a potential paradigm shift in how we build computers in the future, combining the most desired features of memory and logic in a single device.
CRISP
The “von Neumann bottleneck” is a feature-turned-problem that is almost as old as the modern computer itself.
Most modern computers operate using a von Neumann architecture, named after computer scientist John von Neumann. He proposed in 1945 that programs and data should both reside in a computer’s memory, and that the central processing unit could access them as needed over a memory bus. Von Neumann’s paradigm allowed processor and memory technology to evolve largely independently at breakneck pace, the former emphasizing processing speed and the latter favoring storage density. Soon enough, however, this created a fundamental bottleneck that has grown steadily worse over the years, forcing computer architects to concoct a myriad of engineering tricks such as caching, prefetching and speculative execution.
“The faster processors got relative to memory, the more critical this problem of busing data around became,” said José Martínez, professor of electrical and computer engineering. “Today’s processors often find themselves twiddling their thumbs, waiting for data they’ve requested from memory so they can get something done.”
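A back-of-envelope sketch makes the point concrete. Using round, illustrative numbers for clock speed and DRAM latency (not measurements of any particular system), even a handful of cache misses per thousand instructions can leave a processor stalled most of the time:

```python
# Back-of-envelope sketch of the bottleneck described above: how much time
# a processor spends waiting on memory. The latency and clock figures are
# round illustrative numbers, not measurements.

CLOCK_GHZ = 3.0           # processor clock
DRAM_LATENCY_NS = 100.0   # typical order of magnitude for a DRAM access

stall_cycles_per_miss = DRAM_LATENCY_NS * CLOCK_GHZ   # ~300 cycles

def stall_fraction(misses_per_1000_instructions: float,
                   cycles_per_instruction: float = 1.0) -> float:
    """Fraction of time spent stalled on memory, ignoring the overlap
    tricks (caching, prefetching, speculation) that hide some of it."""
    work = 1000 * cycles_per_instruction
    stalls = misses_per_1000_instructions * stall_cycles_per_miss
    return stalls / (work + stalls)

# Even a few misses per thousand instructions dominate execution time.
for mpki in (1, 5, 20):
    print(mpki, f"{stall_fraction(mpki):.0%}")   # ~23%, ~60%, ~86%
```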
Zhiru Zhang, associate professor of electrical and computer engineering, and Martínez are working to develop a radically new computer architecture through the Center for Research on Intelligent Storage and Processing in Memory (CRISP). CRISP is an eight-university endeavor, led by the University of Virginia; Martínez is CRISP’s assistant director. The center is funded with a $27.5 million grant as part of the Joint University Microelectronics Program (JUMP). A $200-million, five-year national program, JUMP is managed by North Carolina-based Semiconductor Research Corporation, a consortium that includes engineers and scientists from technology companies, universities and government agencies.
CRISP was formed in 2018 as interest in solving the von Neumann bottleneck began to grow. The increasing use of “big data” presents new opportunities to leverage vast sets of digital information for business, health care, science, environmental protection and a wealth of other societal needs. The center aims to develop a new type of computer architecture that treats processing and storage as one and the same mechanism, rather than two separate components. This can be achieved by building processing capabilities right inside memory storage, and by pairing processors with memory layers in “vertical silicon stacks,” according to Martínez.
“Memory is deeply hierarchical, and at each level there’s an opportunity for adding computing capabilities,” said Martínez, who adds that consideration must be given to data structure and usage patterns. “Organizing the computation around this deeply hierarchical system is a big challenge. I could be physically very close to some stored data, but if the probability of such data being relevant to me is low, that proximity most likely does nothing for me.”
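The sketch below illustrates the general near-data idea in miniature (it is not CRISP’s design): a memory module with a little local compute can return a single reduced result instead of streaming every element across the bus to the processor.

```python
# Sketch of the processing-in-memory idea: instead of streaming every
# element across the bus to the processor, push a small reduction into
# the module that holds the data and ship back only the result.
# The classes and method names are illustrative, not an actual design.

class MemoryStack:
    """A memory module ('vertical stack') with a little local compute."""
    def __init__(self, data):
        self.data = data

    def read_all(self):
        # conventional path: every element crosses the memory bus
        return list(self.data)

    def local_sum(self, predicate):
        # near-data path: filter and reduce where the data lives,
        # so only one number crosses the bus
        return sum(x for x in self.data if predicate(x))

stack = MemoryStack(range(1_000_000))

# von Neumann style: 1,000,000 values travel to the CPU, then get summed.
cpu_side = sum(x for x in stack.read_all() if x % 2 == 0)

# Near-data style: the stack returns a single value.
near_data = stack.local_sum(lambda x: x % 2 == 0)
assert cpu_side == near_data
```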
The center takes a vertical approach to the problem, spanning hardware, system and applications research themes. This vertical approach allows the center to tackle another critical challenge: creating a programming framework that is intuitive enough for programmers to use productively and effectively.
“We are essentially blurring the boundaries between computing and storage. This introduces a whole host of new challenges to hardware architecture design, as well as software programming. Our goal is to achieve transparent acceleration where the programmers do not have to reason about the low-level hardware details to optimize communication and data placement,” said Zhang.
Zhang and Martínez both conduct their research at Cornell’s Computer Systems Laboratory. As part of the project, they envision co-designing the architecture with new compiler and run-time systems that can automatically translate programs into machine code for their new architectures. “We cannot afford to determine the architecture or the run-time system before attacking the other one. We need to design both at the same time,” said Martínez.
This article is featured in the 2020 issue of ECE Connections. Jessica Edmister contributed to this article.