The race to build faster and larger artificial intelligence systems is reshaping the architecture of global computing infrastructure. As AI models become increasingly complex and data intensive, technology companies are finding new ways to connect thousands of GPUs together to create massive AI clusters capable of handling advanced workloads.
According to Goldman Sachs, the industry is currently relying on three major approaches to expand AI computing capacity: scale up, scale out and scale across.
The first approach, known as scale up, focuses on increasing computing power within the same equipment environment, typically inside a server rack. Companies are packing more GPUs, memory and networking capabilities into a tightly integrated system to improve performance and reduce communication delays between processors. A prominent example is Nvidia’s Vera Rubin rack architecture, which can connect up to 72 GPUs within a single rack.
The concept is now evolving further with “supernodes,” where multiple racks are interconnected using ultra high speed networking technologies. These systems are designed to make communication between racks nearly as fast as connections inside a single rack, enabling AI workloads to scale more efficiently.
The second method, scale out, is the traditional model of expanding computing infrastructure by adding more servers and networking them through advanced switching technologies. The global financial services firm said that this approach allows companies to build extremely large AI clusters spread across vast data centre environments. Modern AI systems today are capable of supporting scale out networks with more than 1,00,000 GPUs connected together. Such massive clusters are increasingly becoming essential for training large language models, generative AI applications and enterprise AI workloads.
The third approach, scale across, represents the next frontier in AI networking. Goldman Sachs in its report in April highlighted that instead of limiting infrastructure to a single data centre, companies are now connecting servers across multiple geographic locations. This allows computing workloads to be distributed between different data centres while maintaining high speed communication. Nvidia has recently introduced scale across networking solutions powered by its in house Ethernet switches and network interface controller (NIC) technologies to support this transition.

