Having a slow connection is always frustrating, but just imagine how supercomputers feel. All those cores doing all kinds of processing at lightning speed, but in the end they’re all waiting on an outdated network interface to stay in sync. DARPA doesn’t like it. So DARPA wants to change it — specifically by making a new network interface a hundred times faster.
The problem is this. As DARPA estimates it, processors and memory on a computer or server can in a general sense work at a speed of roughly 10^14 bits per second — that’s comfortably into the terabit region — and networking hardware like switches and fiber are capable of about the same.
“The true bottleneck for processor throughput is the network interface used to connect a machine to an external network, such as an Ethernet, therefore severely limiting a processor’s data ingest capability,” explained DARPA’s Jonathan Smith in a news post by the agency about the project.
That network interface usually takes the form of a card (making it a NIC) and handles accepting data from the network and passing it on to the computer’s own systems, or vice versa. Unfortunately its performance is typically more in the gigabit range.
That delta between the NIC and the other components of the network means a fundamental limit in how quickly information can be shared between different computing units — like the hundreds or thousands of servers and GPUs that make up supercomputers and datacenters. The faster one unit can share its information with another, the faster they can move on to the next task.
Think of it like this: You run an apple farm, and every apple needs to be inspected and polished. You’ve got people inspecting apples and people polishing apples, and both can do 14 apples a minute. But the conveyor belts between the departments only carry 10 apples per minute. You can see how things would pile up, and how frustrating it would be for everyone involved!
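For anyone who wants the arithmetic spelled out, here is a minimal sketch of the analogy in Python. The function names are made up for illustration, and it assumes the simplest possible model: stages in series with fixed rates, so sustained throughput is capped by the slowest stage. Real network stacks are far messier than this.

```python
def pipeline_throughput(stage_rates):
    """Sustained throughput of stages in series is capped by the slowest stage."""
    return min(stage_rates)


def backlog_growth(producer_rate, link_rate):
    """How fast work piles up in front of a slower link (0 if the link keeps up)."""
    return max(0, producer_rate - link_rate)


# Numbers from the apple-farm analogy, in apples per minute.
inspect_rate, belt_rate, polish_rate = 14, 10, 14

# The whole farm can only move as fast as the conveyor belt.
print(pipeline_throughput([inspect_rate, belt_rate, polish_rate]))  # 10

# Inspected apples pile up at the belt at this rate per minute.
print(backlog_growth(inspect_rate, belt_rate))  # 4
```

Swap the belt for a NIC and the apples for bits and the picture is the same: speeding up the inspectors or polishers changes nothing until the belt itself gets faster.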
With the FastNIC program, DARPA wants to “reinvent the network stack” and improve throughput by a factor of 100. After all, if they can crack this problem, their supercomputers will be at an immense advantage over others in the world, in particular those in China, which has vied with the U.S. in the high performance computing arena for years. But it’s not going to be easy.
“There is a lot of expense and complexity involved in building a network stack,” said Smith. The first step will be physically redesigning the interface. “It starts with the hardware; if you cannot get that right, you are stuck. Software can’t make things faster than the physical layer will allow so we have to first change the physical layer.”
The other main part will, naturally, be redoing the software side to deal with the immense increase in the scale of the data the interface will have to handle. Even a 2x or 4x change would necessitate systematic improvements; 100x will involve pretty much a ground-up redo of the system.
The agency’s researchers — bolstered, of course, by any private industry folks who want to chip in, so to speak — aim to demonstrate a 10 terabit connection, though there’s no timeline just yet. But the good news