19 June 2017

Universal ultra-high-density, ‘hot water’ cooled RSC Tornado solution: ready to support the Intel® Xeon® Processor Scalable Family, the world's first Intel® Omni-Path fabric-based, 100% ‘hot water’ liquid-cooled switches, and improved RSC BasIS functionality

Frankfurt am Main (Germany), International Supercomputing Conference (ISC’17), June 19, 2017. RSC Group, the leading developer and integrator of innovative solutions for high-performance computing (HPC) and data centers in Russia and the CIS, has demonstrated its ultra-high-density, scalable and energy-efficient RSC Tornado cluster solution with direct liquid cooling (all cabinet elements, including high-speed interconnects, are liquid cooled) at the ISC’17 international exhibition. This RSC solution, based on the 72-core Intel® Xeon Phi™ 7290 processor, has set the world computing density record for the x86 architecture: 1.41 Petaflops per cabinet, or over 490 Teraflops/m³.

RSC has showcased a full set of components for modern HPC systems of different scale with 100% liquid cooling in ‘hot water’ mode, including high-performance RSC Tornado computing nodes based on the 72-core Intel® Xeon Phi™ 7290 processor and Intel® Server Board S7200AP, and on the Intel® Xeon® E5-2697A v4 processor and Intel® Server Board S2600KPR(F), with Intel® SSD DC S3520 Series, Intel® SSD DC P3520 Series solid-state drives with NVMe interface in high-density M.2 formats, and the latest Intel® Optane™ SSD DC P4800X Series. The next generation of RSC Tornado is ready to support the newest Intel® Xeon® Processor Scalable Family (code-named Skylake-SP), which is expected to launch in the second half of the year.

The RSC Tornado solution based on Intel server processors offers a leading footprint and computing density (up to 153 nodes in one standard 80 cm x 80 cm x 42U cabinet) and high energy efficiency, and provides stable operation of computing nodes in ‘hot water’ mode with a cooling agent temperature of up to +65 °C at the inlets of switching nodes and interconnects. Operation in ‘hot water’ mode enables all-year free cooling (24x365) using only dry coolers running at ambient air temperatures of up to +50 °C, and completely eliminates the freon circuit and chillers. As a result, the average PUE (power usage effectiveness) of the system is less than 1.06. Cooling consumes less than 6% of total consumed power, which is an outstanding result for the HPC industry.
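The relationship between the PUE figure and the cooling share quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the standard definition of PUE (total facility power divided by IT equipment power) and assumes, for simplicity, that all non-IT overhead is cooling. The function names and the 100 kW IT load are hypothetical and do not come from RSC.

```python
def overhead_share_of_total(pue: float) -> float:
    """Fraction of total facility power spent on non-IT overhead.

    PUE = total_power / it_power, so overhead / total = (PUE - 1) / PUE.
    Assumption: all overhead is treated as cooling.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return (pue - 1.0) / pue


def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power for a given IT load and PUE."""
    return it_power_kw * pue


if __name__ == "__main__":
    pue = 1.06                 # upper bound quoted in the text
    it_load_kw = 100.0         # hypothetical IT load, for illustration only
    share = overhead_share_of_total(pue)
    total = facility_power_kw(it_load_kw, pue)
    print(f"Overhead share of total power: {share:.1%}")
    print(f"Facility power for {it_load_kw:.0f} kW of IT load: {total:.0f} kW")
```

Under these assumptions, a PUE of 1.06 corresponds to overhead of about 5.7% of total power, which is consistent with the "less than 6%" figure.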

At ISC’17, RSC specialists also introduced the world's first 100% ‘hot water’ liquid-cooled 48-port Intel® Omni-Path Edge Switch 100 Series for high-speed interconnects (with non-blocking switching at up to 100 Gbps per port). Intel® Omni-Path Architecture (Intel® OPA) is a comprehensive solution for high-speed switching and data transfer that improves application performance in both entry-level HPC clusters and large-scale supercomputer projects at minimum expense. The 48-port Intel OPA switch enables connection of 26% more servers than competing solutions with a lower budget and power consumption reduced to 60%, providing more energy-efficient switching and system infrastructure.

RSC BasIS integrated software stack for cluster system management

The innovative management and monitoring system based on the RSC BasIS integrated software stack also provides high availability, fault tolerance and ease of use for HPC systems based on RSC solutions. This system is an open and easily expandable platform built on open source software and a micro-agent architecture. It enables control of entire data centers and their individual elements, such as computing nodes, interconnects, infrastructure components, workloads and processes. All system elements (computing nodes, power supplies, hydraulic regulation modules, etc.) contain an integrated management module that provides broad capabilities for detailed telemetry and flexible management. The cabinet design supports hot-swap replacement of computing nodes, power supplies and hydraulic regulation modules (with redundancy) without interrupting system operation. Most components of the system (such as computing nodes, power supplies, network and infrastructure components, etc.) are software-defined, which significantly simplifies and speeds up initial deployment, maintenance and future upgrades. Liquid cooling of all components ensures their longevity.

At ISC’17, RSC also presented new RSC BasIS functionality for monitoring and control of geographically distributed data centers.

Unique projects at JSCC RAS and SSCC SB RAS

This year RSC Group completed upgrades of the computing resources of the Joint Supercomputer Center of the Russian Academy of Sciences (JSCC RAS) and of the Siberian Supercomputer Center of the Siberian Branch of the Russian Academy of Sciences (SSCC SB RAS), which is hosted by the Institute of Computational Mathematics and Mathematical Geophysics. Both projects are unique: they are the world's first deployments of server nodes with ‘hot water’ liquid cooling based on the most powerful 72-core Intel® Xeon Phi™ 7290 processors and 16-core Intel® Xeon® E5-2697A v4 processors. In addition, for the first time in Russia and the CIS region, the communication subsystems of both cluster systems were built on Intel® Omni-Path Architecture.

The joint resources of JSCC RAS (Moscow) and SSCC SB RAS (Novosibirsk) will be used as the basis for a geographically distributed computing facility for solving relevant tasks in fundamental and applied sciences, including advanced research in the fields of AI (Artificial Intelligence), ML/DL (Machine Learning, Deep Learning), Big Data and others.

The total peak performance of these computing facilities is currently about 1.1 Petaflops. The innovative management and monitoring system based on the RSC BasIS integrated software stack also provides high availability, fault tolerance and ease of use for the computing resources at JSCC RAS and SSCC SB RAS.