|
Multi-mode sensor processing – such as radar beamforming and electro-optical (E-O) or infrared (IR) image processing – presents formidable computing problems. It requires extremely high data throughput and processing power, and these requirements change dynamically with operating conditions.
Multi-mode sensor processing has traditionally been implemented using DSPs or FPGAs, but their lack of run-time programmability and reconfigurability forces designers to provision each type of device for the worst case. This creates a tremendous need for a solution that is upwardly scalable, reconfigurable, and programmable, all while reducing development costs and time to market.
The Massively Parallel Processor Array (MPPA) solution
Ambric (www.ambric.com) has developed a new computing architecture and device – a Massively Parallel Processor Array (MPPA) – that can be reconfigured in real time to adapt on demand to the dynamic computational and functional requirements of multi-mode sensor platforms.
Performance of up to one TeraOPS
The ultra-high-performance Ambric Am2045 MPPA features 336 32-bit RISC processors delivering up to one teraOPS. It also features a programmable 32-bit communication fabric for high-performance inter-processor connections. The Am2045 also includes a four-lane PCI Express interface and four GPIO ports, with an aggregate I/O bandwidth of 29 Gbps.
Scalability with no need to change design methodology
The Ambric Am2045 MPPA may be scaled to application requirements simply by reallocating computing resources (processors and/or memories) available within the chip or across multiple chips. The Ambric Structural Object Programming Model enables applications to add or reconfigure resources with no major design overhaul. This keeps development on schedule.
Easier development
Ambric provides aDesigner, a comprehensive suite of software development tools that can typically be mastered after just one day of training. The software tools let designers easily turn a "whiteboard" architecture design into an implementation on the Ambric devices. Programs may be developed in Java or in assembly language. A multi-processor simulator lets a designer debug the code at the source level and then validate the design at his or her desktop, even without the actual hardware. The same software tool can then be used for on-chip debugging. For design tuning, the Ambric performance profiling tools provide the designer valuable data about processor utilization, inter-processor communication, and other important statistics.
Dynamic behavior and re-configurability of the Am2045
As shown in the examples below, the run-time behavior of an Am2045 MPPA can adapt in several ways to changing demands and workloads in high-performance applications. In addition, the entire device can be reconfigured in about 10 milliseconds.
The Ambric Structural Object Programming Model
The Ambric Am2045 MPPA was designed to implement the Ambric Structural Object Programming Model, which creates objects that run concurrently on an asynchronous array of processors and memories, as shown in Fig 1. All objects are strictly encapsulated, execute with no side effects on each other, and have no implicitly shared memory.

Fig 1. Ambric Structural Object Programming Model.
Objects can be aggregated into hierarchies of composite objects and run independently at their own rates. Inter-processor communication is synchronized by the Ambric channels, with no need for a real-time operating system.
A producer processor automatically stalls when the output channel is full, and a consumer processor stalls when the input channel is empty. Stalled processors automatically resume execution after the necessary condition is satisfied.
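The stall-and-resume behavior described above can be approximated on a desktop JVM with a bounded blocking queue. The following is only a minimal sketch of the idea, not Ambric's API or aDesigner code: the class name `ChannelSketch`, the method `runPipeline`, the `END_OF_STREAM` marker, and the use of Java threads in place of processors are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative only: two threads stand in for two processors, and a bounded
// queue stands in for a channel. None of these names come from Ambric's tools.
class ChannelSketch {
    // Hypothetical end-of-stream marker so the consumer knows when to stop.
    private static final int END_OF_STREAM = Integer.MIN_VALUE;

    /** Sends 0..wordCount-1 through a bounded channel and returns the consumer's sum. */
    static long runPipeline(int channelDepth, int wordCount) {
        BlockingQueue<Integer> channel = new ArrayBlockingQueue<>(channelDepth);
        long[] sum = new long[1];

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < wordCount; i++) {
                    channel.put(i);          // blocks ("stalls") while the channel is full
                }
                channel.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    int word = channel.take(); // blocks ("stalls") while the channel is empty
                    if (word == END_OF_STREAM) {
                        break;
                    }
                    sum[0] += word;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();               // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        // A depth of 1 forces the producer to stall on nearly every word.
        System.out.println(runPipeline(1, 10)); // prints 45
    }
}
```

Neither thread polls or takes a lock explicitly; the channel alone paces them, which mirrors the point that no real-time operating system is needed to synchronize communicating objects.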
The core of the Am2045 is a 9-row by 5-column array of building blocks called brics, which are surrounded by external SDRAM interfaces, a PCIe interface, and four general-purpose I/O (GPIO) ports. Each bric has two Compute Unit (CU) processor clusters and two RAM Unit (RU) memory clusters.

Fig 2. Ambric Compute Unit and RAM Unit.
Fig 2 shows a CU and RU pair. A CU contains two SRs (32-bit streaming RISC processors) and two SRDs (32-bit streaming RISC processors with DSP extensions). A hierarchical, configurable network of channels interconnects all these resources. Ambric channel communication is 32 bits wide, with an extra bit that structures packets. Each CU is connected by several access engines to an RU that has 8 KB of SRAM.
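To make the idea of a 32-bit channel word with one extra structuring bit concrete, the sketch below carries each word in a Java `long`, using the low 32 bits for data and bit 32 as the flag. The article does not say how the hardware interprets that bit, so the convention used here (the flag marks the last word of a packet) is purely an assumption, as are the names `ChannelWord`, `pack`, `packetize`, and `countPackets`.

```java
// Illustrative only: a 33-bit channel word modeled in a 64-bit long.
// Assumption (not from the article): the extra bit marks the LAST word of a packet.
final class ChannelWord {
    private static final long DATA_MASK = 0xFFFF_FFFFL; // low 32 bits carry the data
    private static final long FLAG_BIT = 1L << 32;      // bit 32 is the structuring bit

    private ChannelWord() { }

    /** Combines a 32-bit data value and the one-bit flag into a single word. */
    static long pack(int data, boolean flag) {
        long word = data & DATA_MASK;   // mask removes sign extension of negative ints
        return flag ? (word | FLAG_BIT) : word;
    }

    /** Extracts the 32-bit data value. */
    static int data(long word) {
        return (int) (word & DATA_MASK);
    }

    /** Extracts the structuring bit. */
    static boolean flag(long word) {
        return (word & FLAG_BIT) != 0;
    }

    /** Frames a payload as one packet: only the final word has the flag set. */
    static long[] packetize(int[] payload) {
        long[] words = new long[payload.length];
        for (int i = 0; i < payload.length; i++) {
            words[i] = pack(payload[i], i == payload.length - 1);
        }
        return words;
    }

    /** Counts complete packets in a word stream under the same assumption. */
    static int countPackets(long[] stream) {
        int packets = 0;
        for (long word : stream) {
            if (flag(word)) {
                packets++;
            }
        }
        return packets;
    }
}
```

Under this convention a receiver can find packet boundaries without reserving any 32-bit data value as a delimiter, which is one plausible reason to carry the structuring bit alongside the data.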
|