Reconfigurable Computing: Custom Supercomputers on Demand?

RC creates an unprecedented opportunity for orders of magnitude improvement in GFlops-per-dollar, GFlops-per-watt, and just GFlops.

Programmable Logic DesignLine

Reconfigurable computing (RC) is demonstrating spectacular speedups, 50-100X or more in some cases, on some specific high performance computing (HPC) applications, and generally at far lower power consumption than a single CPU or GPU.

It's one of the most flexible, adaptable computing technologies available. However, it is only slowly being utilized in blades and desktops around the world. RC creates an unprecedented opportunity for orders of magnitude improvement in GFlops-per-dollar, GFlops-per-watt, and just GFlops.

It's no silver bullet, though. Realizing the potential of RC requires understanding the basic technology, then making sure that it's the right vehicle for the specific application.

Reconfigurable computing fabric
RC typically depends on field programmable gate arrays (FPGAs). For now, consider a high performance FPGA to be a bag of loose computer parts. Today's largest FPGAs include 500 or more block multipliers, on-chip RAM totaling a few MB, and pools of uncommitted arithmetic, control, and connectivity resources. RAM-based switches and lookup tables control connectivity and function, allowing easy redefinition of the computation.

An accelerator board attaches to an existing computer's main system interconnect, such as HyperTransport in an AMD system, NUMAlink in Silicon Graphics Altix systems, or PCI in a typical workstation. The board contains one or more FPGAs for application computing, plus some amount of SRAM and/or DRAM, arranged in several independently addressable banks. The block diagram shown in Fig 1 is similar to that of a graphics accelerator, with a computing engine, on-board memory, and a connection to the system interconnect.


Fig 1. FPGAs typically include low-latency on-board buffers, as well as access to system memory.

Creating the computer
The von Neumann programming model distributes an algorithm across time: one function unit performs a sequence of operations, one at a time, to carry out a specific computation. Speed comes from performing many operations, including memory accesses, in rapid succession.

RC gets away from the von Neumann model; it distributes an algorithm spatially across the configurable computing fabric, as shown in Fig 2. Speed comes from performing tens to hundreds of operations in parallel, using pipelining, broadside parallelism, or a combination of both.


Fig 2. Programming an FPGA means configuring it into an application-specific processor.
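The difference between distributing work across time and across space can be put into rough numbers. The following is a minimal sketch, not a model of any real device: it assumes every operation takes exactly one clock cycle, that both machines run at the same clock rate, and that memory never stalls the computation. The function names and the `lanes` parameter (the number of side-by-side copies of the pipeline, i.e. broadside parallelism) are illustrative only.

```python
def sequential_cycles(n_items: int, n_stages: int) -> int:
    """One function unit performs every operation, one at a time."""
    return n_items * n_stages


def pipelined_cycles(n_items: int, n_stages: int, lanes: int = 1) -> int:
    """Each stage is its own hardware unit; `lanes` copies run side by side.

    The first result appears after n_stages cycles (pipeline fill). After
    that, every lane delivers one result per cycle.
    """
    items_per_lane = -(-n_items // lanes)  # ceiling division
    return n_stages + items_per_lane - 1


# Hypothetical workload: 1000 data items, 10 operations per item.
n_items, n_stages = 1000, 10
seq = sequential_cycles(n_items, n_stages)
pipe = pipelined_cycles(n_items, n_stages)
wide = pipelined_cycles(n_items, n_stages, lanes=4)
print(seq, pipe, wide)  # 10000 1009 259
```

Under these idealized assumptions, the pipeline alone approaches a speedup equal to the number of stages, and replicating it multiplies that again. Real gains depend on clock rates, memory bandwidth, and how much of the application fits this pattern.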

Programming an FPGA means implementing the control structure of an application as well as the data path in the reconfigurable fabric. Compilers exist for turning C or C-like languages into FPGA "bit files" or executable images: Handel-C from Agility Design Solutions, Mitrion-C from Mitrionics, and Impulse-C from Impulse Accelerated Technologies are just a few of the tools commercially available, and research tools exist in many commercial and academic labs.

High-level descriptions rarely exploit the full potential of an FPGA, however. The biggest reason is that C-like programming languages have sequential execution built deeply into their basic structure, making it extremely difficult to automate the extraction of FPGA-friendly parallelism.
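A small example shows why sequential semantics get in the way. The sketch below is written in Python rather than C purely for brevity, and the function names are made up for illustration. Both functions compute the same sum, but the first expresses it as a chain in which every step depends on the previous one, while the second arranges the same additions as a tree whose levels could each be computed in parallel hardware. A compiler given only the first form has to discover for itself that the additions can be regrouped.

```python
def sequential_sum(xs):
    """Sum as usually written: each addition waits for the one before it."""
    acc, steps = 0, 0
    for x in xs:
        acc += x      # loop-carried dependence on the previous iteration
        steps += 1    # one dependent step per element
    return acc, steps


def tree_sum(xs):
    """Same sum by pairwise reduction: additions within a level are independent."""
    level, depth = list(xs), 0
    while len(level) > 1:
        pairs = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # an odd element passes through unchanged
            pairs.append(level[-1])
        level, depth = pairs, depth + 1
    return (level[0] if level else 0), depth


data = list(range(1, 9))
print(sequential_sum(data))  # (36, 8): eight dependent steps
print(tree_sum(data))        # (36, 3): three levels of parallel adders
```

The regrouping is only valid when the operation is associative. That holds for integers, but floating-point addition can give slightly different results when reordered, which is one reason a tool may not make this transformation automatically.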

Just as von Neumann programmers may fall back to an assembler for performance-critical kernels, FPGA programmers can use hardware description languages (HDLs) like Verilog or VHDL to expose more of the algorithm to the FPGA fabric. This step generally requires specialized programming skills. In C-like languages, execution is sequential by default and parallelism is the programmer's responsibility. HDLs are the reverse: they are natively parallel, and sequential execution is largely up to the developer.

When considering the "grain" of a processing element (PE), x86-compatible and similar processors traditionally stand at one end of the spectrum (i.e., coarse-grained, with one PE that is big and complex). The continuum runs through common dual cores, multi-cores on the order of ten PEs (such as Cell Broadband Engine and the UltraSPARC T2 from Sun Microsystems), and many-cores on the order of 10^2 PEs (like Intel's Polaris or products from Clearspeed and Tilera).

As the number of PEs increases, the size and power of each PE decreases. FPGAs are sometimes considered the fine-grained extreme: on the order of 10^5 PEs of one-bit functionality, fixed at the time they are programmed. This, however, does not reflect how developers really program FPGAs.

Even in HDLs, design quanta are typically not individual bits of logic, but register arrays, RAM buffers, arithmetic units, or entire filters. As a result, any given FPGA can implement PEs of different sizes at different times, and occupy a different point on the axes of PE complexity vs. number of PEs per chip. In practice, FPGAs represent variable-grain computing, typically with 10 to 10^3 PEs custom tailored to the specific application.
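The trade-off between PE complexity and PE count can be illustrated with simple arithmetic. The sketch below takes the figure of 500 block multipliers mentioned earlier as the only resource budget and asks how many PEs fit if each one consumes a given number of multipliers. The per-PE sizes are hypothetical, and the calculation deliberately ignores on-chip RAM, uncommitted logic, and routing, any of which may be the real limit in practice.

```python
BLOCK_MULTIPLIERS = 500  # "500 or more block multipliers" on a large FPGA


def pes_per_chip(multipliers_per_pe: int, budget: int = BLOCK_MULTIPLIERS) -> int:
    """Number of identical PEs that fit, counting block multipliers only."""
    return budget // multipliers_per_pe


# Hypothetical PE sizes, from a single multiply up to a wide filter stage.
for size in (1, 4, 16, 50):
    print(f"{size:>2} multipliers per PE -> {pes_per_chip(size)} PEs")
```

The same chip lands anywhere from hundreds of simple PEs to about ten complex ones, depending only on how the application is mapped.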
