EnterTheGrid - PrimeurWeekly

EnterTheGrid - PrimeurMagazine is the largest Grid and Supercomputer information source in the world. PrimeurWeekly delivers the news each week in your e-mail box.

PrimeurWeekly 21 June 2010
It takes three to tango in exascale computing: memory, photonic interconnects and embedded processors
Hamburg 01 June 2010 Free moments for speakers, session chairs and participants were rare at ISC'10 in Hamburg. EnterTheGrid/Primeur Magazine managed to catch John Shalf for a short interview during lunchtime on June 1, 2010. John Shalf, who has a background in electrical and computer engineering, currently works at the Lawrence Berkeley National Laboratory (LBL) in California, where he leads the Advanced Technologies Group at NERSC, the National Energy Research Scientific Computing Center. His research looks very promising for the future of supercomputing, since he and his colleagues are working on one of the breakthroughs required to reach exascale computing: developing processors that are energy-efficient and easier to program. Their approach takes embedded processors, which are now used in mobile devices, and reorganizes them into effective "many-core" computing components for supercomputers. The Berkeley research is being performed in the Green Flash project.

Primeur Magazine: Applying processors from mobile devices to supercomputers sounds like an original approach to solve the power efficiency problem. Where did the idea come from?

John Shalf: The idea is not entirely original, since it was also central to the design of IBM's BlueGene and the SiCortex machines, but our target is devices containing hundreds of cores. What is unique about our approach is that we are determining new ways to organize these cores so that they can be effective and easy to program in the more extreme many-core chips we will need for future energy-efficient supercomputing designs. Power is already the key limiting factor for supercomputing. At Berkeley, we investigated numerous possible architecture options during the course of our discussions for the "View from Berkeley" (http://view.eecs.berkeley.edu). Of all the approaches, the most practical option turned out to be large arrays of simpler cores, rather than continuing with modest-size multicore processors containing dozens of complex cores. The embedded market has a longer history of power-efficiency expertise built on these very simple core designs. Not only the processor itself but also the design techniques developed by the embedded technology vendors play a crucial role here. The hardware and software co-design methodology that is commonly used to develop energy-efficient mobile devices has to be adapted to the design of supercomputers. This is the problem the researchers are trying to solve over the course of the next decade.

The vendors of handheld devices need energy efficiency for long battery life, yet still have to deliver performance. Given that power is the leading design constraint for HPC, our needs are aligned with those of the embedded/handheld computing market. The delivered application performance per unit of energy for these designs is far greater than what can be achieved using conventional designs. Embedded processors are much more energy-efficient than traditional processors because of their simplicity, but also because they are tailored to the application. The approach reduces waste by removing any features that are unnecessary for the targeted applications. This application-targeted approach, combined with the co-design process, can achieve a solution that is hundreds of times more energy-efficient than one built from conventional desktop components. As a result, there is a lot more energy left to do useful work. Embedded processors are also inexpensive. Their vendors are used to keeping prices under control, because of the competition in this mass market.

At the high end of desktop computing it takes 4 years to design a new chip, whereas in the embedded market a design firm such as Tensilica may produce upwards of 200 new designs per year. They have developed sophisticated tools to accelerate the turnaround of such tailored processor designs. The software tools, such as the debuggers, also play their part. Everything is customized. The processor and the software are matched to each other. Berkeley Lab is collaborating with Tensilica to explore the use of this company's processor cores as the building blocks in supercomputing design.

Primeur Magazine: So the Tensilica processors are providing more power efficiency but what about the power sufficiency?

John Shalf: Currently, the chip (e.g. an AMD Opteron chip or an Intel Nehalem) is considered the commodity in supercomputing. These chips are the building blocks of our current supercomputer designs. In the embedded market, the circuits on the chip are the commodity. The Intellectual Property (IP) blocks are stitched together to create ASIC designs from pre-designed and pre-verified components, but can be configured in novel ways to target the application requirements. So 100 cores are put on a chip together with large numbers of memory controllers in order to provide lots of memory bandwidth, and the communication between those cores can be organized to target high-level programming languages such as Unified Parallel C (UPC). The chip is as powerful as a graphics processing unit (GPU) but it works at a fraction of the watts. If you can add some features, you can make it easier to program for science. A lot is being gained from what we throw away. An Intel processor, for instance, supports some 500 instructions, but we need only 80. So we take only what we need from the embedded processor market. Seymour Cray said: "Only put into a supercomputer what is absolutely necessary." COTS technology has limited our ability to adhere to his advice, but the embedded tools give us much more flexibility to return to that kind of design philosophy.

Primeur Magazine: Are your processors already used in supercomputers right now?

John Shalf: Our design is still experimental. We have all the logic to build the chip, but for now we do cycle-accurate simulation of the design using an FPGA (field-programmable gate array) simulator called RAMP (Research Accelerator for MultiProcessors). The simulation runs 10 times slower than the real chip would, but it accurately predicts the real hardware performance because it is in fact the real circuit design. Intel's Larrabee is based on some of the same design principles of using large arrays of simpler cores. It addresses how many-core processing could be made easier to program. But the Berkeley solution uses far less power than Intel's Larrabee because its cores are much simpler and more closely tailored to the target scientific applications. Each one of the Tensilica processors we use in our design occupies only 2 square millimeters on a chip, with full IEEE double precision floating point, and consumes only 130 milliwatts.

Intel and Microsoft have co-funded the Par Lab on the Berkeley campus to study many-core processors and software for handheld devices. We work closely with the campus researchers, but our focus is more on how to scale up this approach to large-scale supercomputing systems.
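
As a rough illustration of the figures quoted in the interview, the sketch below multiplies the per-core power (130 milliwatts) and area (2 square millimeters) by the 100 cores per chip mentioned earlier. It is only a back-of-the-envelope estimate: it counts the cores alone and assumes, for simplicity, that power and area scale linearly, ignoring memory controllers, interconnect and any other on-chip logic. The function name `chip_estimate` is hypothetical and is not part of the Green Flash project.

```python
# Back-of-the-envelope estimate using figures quoted in the interview.
# Assumption: linear scaling, cores only (no memory controllers or interconnect).
CORE_POWER_W = 0.130   # 130 milliwatts per Tensilica core
CORE_AREA_MM2 = 2.0    # 2 square millimeters per Tensilica core


def chip_estimate(num_cores):
    """Return (total core power in watts, total core area in mm^2)."""
    return num_cores * CORE_POWER_W, num_cores * CORE_AREA_MM2


# 100 cores per chip is the figure mentioned in the interview.
power_w, area_mm2 = chip_estimate(100)
print(f"{power_w:.1f} W, {area_mm2:.0f} mm^2")
```

Under these assumptions the cores of a 100-core chip would draw about 13 watts and occupy about 200 square millimeters. A real chip would need more of both for memory controllers and communication.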

Primeur Magazine: Can the solution be commercialized?

John Shalf: Our goal is to influence the HPC market to adopt a radically different design methodology. As such, we have a high burden of proof to demonstrate the value of our approach. So we want to build prototype systems to demonstrate the effectiveness of our approach to industry. It can be commercialized but not by Berkeley as such. So Berkeley seeks a partner to commercialize it. The idea has to become commodity, so it is important to us to make all of the results of this research public and broadly accessible.

Primeur Magazine: What are your future plans and work in this area?

John Shalf: We need to demonstrate a scaled-up design within the next five years. We have already been working on it for three years, and now there is a concrete design in simulation. This is used to research programming models and hardware support for many-core. However, demonstrating that our energy models and design processes can hit the target power and performance will require a full-scale prototype design to validate the models. LBL is talking with different partners, but it is a question of funding and that takes time.

Primeur Magazine: How do you see the future of supercomputing in general?

John Shalf: Power consumption is the largest impediment to future performance improvements in supercomputing systems of all scales - not just exascale. Therefore it is essential to pursue the new technologies and chip organizations that make systems both energy-efficient and easy to program. We are on a critical path to exascale computing, but we still need three or four miracles to reach it. The embedded processor approach is one issue to solve. The second one is energy-efficient memory technology. However, there are very few memory vendors left. Actually, only Micron intends to address this problem. The third one is more energy-efficient interconnect technology, which may well involve scaling down photonics to work at chip scale. Indeed, we had a talk today from Luca Carloni, who is studying "silicon photonics": tiny optical switches and waveguides that can be integrated directly onto a CMOS chip design. The last one is storage technology. We know that disk technology is not going to scale at the rate we need to meet storage performance requirements at exascale, but the heir apparent to existing mechanical technology is not clear. Nonvolatile solid-state memory technologies, such as flash, phase-change memories, and other NVRAM technologies, are making great strides, but it is not yet apparent which approach will win in the marketplace in the long run.

Primeur Magazine: Thank you for your time and best of success with the energy-efficient processor miracle!

More information about the embedded processor research in the Green Flash project at Berkeley is available at http://www.lbl.gov/cs/html/greenflash.html

Leslie Versweyveld

EnterTheGrid - Primeur

James Stewartstraat 248

1325 JN Almere

The Netherlands

http://enterthegrid.com/primeur

mailto:primeur [AT] enterthegrid [DOT] com

© EnterTheGrid - PrimeurWeekly