This creates a "virtual ASIC" that can be reconfigured on the fly, erased and
rebuilt with virtually zero latency. These virtual ASICs can also execute, be
modified, or be erased fully in parallel with one another, even while exchanging
data across the array -- all under the control of a unique configuration manager
that insulates the programmer/designer from timing and data-dependency issues.
Through this approach, which currently involves 25 international patents (both
granted and pending), the XPP-based IP core combines the performance of an ASIC
with the flexibility of a DSP -- all based on PACT's parallel reconfigurable
computing technology. The core enables efficient reconfiguration strategies,
which can be performed in parallel with data processing to achieve the highest
possible application performance: 50,000 MIPS, or the equivalent of 80
Pentium(R) 4 chips running in parallel at 1.3 GHz. The initial clock speed of
the core is 100 MHz, with less than 10% of the power draw of leading DSP
designs, setting new benchmarks for flexibility and performance in
silicon/software design and implementation.
PACT enables customers to tailor their products to specific market demands, and
the core is designed to meet exponentially growing demands for bandwidth and
performance. This is especially relevant in the rapidly emerging markets for 3G+
base stations, handsets, Internet appliances, and wireless-linked PDAs, which
are the primary initial markets for the PACT core.
PACT is currently in licensing discussions with major semiconductor
manufacturers, including the leading CPU and DSP vendors, regarding the XPU-128.
The extreme processing capability of the PACT core is based on the unique
combination of the following features:
Parallel Processing
Up to 128 individual 32-bit ALU processing elements are arranged in a single
core.
Algorithm Mapping
Algorithms are mapped directly onto the array and executed as a pipeline, in
contrast to classical sequential programming.
Automatic DataFlow Synchronization
Data and control information are transferred as packets, which are synchronized
by an intelligent network.
Point-to-Point Interconnect
A massive number of independent data channels provides extreme bandwidth without
bottlenecks.
Distributed Memories
A large number of internal RAMs and external memory interfaces allow algorithms
to access memory independently, eliminating the problems associated with shared
memories.
Event Network
Execution flags for conditional program execution and external events are
transferred as packets by the independent Event Network.
Dynamic Runtime Reconfiguration
Larger programs can be split into segments, which are then configured and
executed one after another.
Partial Reconfiguration
Any detail of a configuration can be exchanged at runtime. This allows other
processes to interact with the running program.
Modular Design
The XPU Architecture is based on a small number of different types of PAEs. The
IP Model allows the combination of these modules in a very simple and efficient
way.
Arithmetic Granularity
The architecture provides flexibility similar to that of an FPGA, but with the
arithmetic capabilities of a DSP.
Scalability
The modular design and regularity of the architecture allow systems on chip to
be designed according to the customer's needs. Furthermore, XPP devices can be
cascaded for linear speedup of algorithms.
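
To illustrate the Algorithm Mapping and Automatic DataFlow Synchronization
features listed above, the following is a minimal software sketch, not PACT's
implementation. It assumes each processing element is a simple function, each
point-to-point channel holds at most one packet, and an element fires only when
its input holds a packet and its output channel is free. The name run_pipeline
and the cycle model are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def run_pipeline(
    stages: List[Callable[[int], int]], inputs: List[int]
) -> Tuple[List[int], int]:
    """Simulate stages joined by single-packet point-to-point channels.

    channels[i] feeds stages[i]; channels[-1] is the output port.
    Returns the output packets in order and the number of cycles used.
    (Illustrative model only; the cycle accounting is an assumption.)
    """
    channels: List[Optional[int]] = [None] * (len(stages) + 1)
    pending = list(inputs)
    outputs: List[int] = []
    cycles = 0

    while len(outputs) < len(inputs):
        cycles += 1

        # Drain the output port if a packet arrived there last cycle.
        if channels[-1] is not None:
            outputs.append(channels[-1])
            channels[-1] = None

        # Visit stages from last to first: a stage fires only when its
        # input holds a packet and its output channel is free, so packets
        # never overwrite each other and no explicit scheduling is needed.
        for i in range(len(stages) - 1, -1, -1):
            if channels[i] is not None and channels[i + 1] is None:
                channels[i + 1] = stages[i](channels[i])
                channels[i] = None

        # Feed the next input packet when the first channel is free.
        if pending and channels[0] is None:
            channels[0] = pending.pop(0)

    return outputs, cycles


if __name__ == "__main__":
    # A three-step algorithm, one step per (hypothetical) processing element.
    algorithm = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    results, used = run_pipeline(algorithm, [1, 2, 3, 4])
    print(results, used)
```

Once the pipeline is full, every stage works on a different packet in the same
cycle, which is the contrast with sequential execution that the feature list
draws.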