Omniscient code compilation comes to the PIC32 RISC CPU

Compiler boosts PIC32 interrupt handling and reduces code size by 40%



Courtesy of Embedded.com

Phoenix, Az. - At Microchip Technology's MASTERS Conference here Wednesday, HI-TECH Software will take the wraps off an "omniscient" ANSI C compiler for 32-bit MCU code that it claims boosts real-time response by more than 25% and nearly doubles code density.

The new HI-TECH C PRO compiler for Microchip's PIC32 MCU uses a new technique called omniscient code generation (OCG) to optimize stack and register allocation across all code modules prior to generating the object code. Smaller code generally executes more quickly and requires smaller, less expensive flash memory for storage.

The compiler collects comprehensive data on every register, stack, pointer, object and variable declaration across the entire program. It uses this information to optimize register usage, stack allocations and pointers across the whole program. It also ensures consistent variable and object declarations between modules and deletes unused variables and functions.

According to CEO and company founder Clyde Stubbs, the compiler's performance on the PIC32 bears out the company's belief that the OCG technique should yield even better performance and code density improvements on 32-bit register-based MCUs than those achieved on the 8- and 16-bit MCUs where the company has previously focused its OCG efforts.

Because PIC32 is based on a MIPS Technologies 32-bit core, he believes that the performance improvements achieved should be repeatable on most other MIPS architectural derivatives, as well as many other RISC-based designs. "Right now we are being somewhat conservative and are confining ourselves to architectures that have a clear and large following in the embedded systems market."

Next on the company's agenda is the 32-bit RISC ARM architecture, with a particular focus on the ARM Cortex-M3, which is targeted specifically at embedded applications. There, as with most other 32-bit RISC CPUs, said Stubbs, code is most often generated one module at a time, using variations of GNU Compiler Collection (GCC) techniques.

Because GCC generates code one module at a time, he said, no comprehensive cross-module data is available. "But without knowing how objects are used across the whole program, it is impossible to achieve the same level of optimization as an OCG compiler," said Stubbs.

In code density benchmarks, the company's OCG compiler produced code that can be as much as 40% smaller than that generated using industry-leading GCC-based PIC32 compilers. "The smaller code size can cut device costs by reducing the amount of on-chip flash required," he said.

Stubbs pointed out that GCC-based 32-bit compilers are constrained as to which registers can be used to store parameters for called functions. "Whenever a function is called from another code module, the parameters of that function are usually stored in the registers," said Stubbs, noting that GCC-based compilers reserve four specific registers for this purpose.

The problem is that if the function has more than four parameters, the additional parameters must be stored on and passed to the called function using the stack (in RAM) - a cycle intensive process that degrades performance and leads to increased RAM usage.
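To make the four-register limit concrete, here is a minimal C sketch. The function names are hypothetical, and the comments assume the conventional MIPS o32 calling convention, in which the first four integer arguments travel in registers $a0-$a3; the article itself does not name the ABI.

```c
#include <stdint.h>

/* Hypothetical example: four integer arguments fit in the four
 * argument registers the article describes (assumed here to be
 * $a0-$a3 under the MIPS o32 convention). */
int32_t sum4(int32_t a, int32_t b, int32_t c, int32_t d)
{
    return a + b + c + d;
}

/* Six arguments: with a fixed set of four argument registers,
 * e and f would be written to the stack by the caller and read
 * back from RAM by the callee. */
int32_t sum6(int32_t a, int32_t b, int32_t c, int32_t d,
             int32_t e, int32_t f)
{
    return a + b + c + d + e + f;
}
```

Both functions compute the same kind of result; the difference the article points to lies only in how the extra two arguments of sum6 reach the callee.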

Faster Interrupt Handling.
By comparison, he said, interrupt-intensive code generated by omniscient code compilation typically requires 26% fewer cycles for the PIC32 to execute than code compiled using a non-OCG compiler.

By reducing the number of CPU cycles spent moving data between the registers and stack, HI-TECH's OCG compiler effectively gives the CPU a 26% performance boost. More important, called functions frequently call other functions, which may, in turn, call other functions.

"This is particularly true for interrupt-intensive applications," said Stubbs. "For example, if the code calls a function, which then calls a second function, the parameters for the first function will have to be saved to the stack to make room for the parameters for the second function."

If this second function calls a third function, the parameters for the second function will also have to be saved to the stack to make room for the parameters of the third function.

"Data will have to be shifted continuously between the stack and the registers," he said. "The penalty for this is at least a cycle every time data is moved to or from the stack - or 8 cycles to move the data for a single four-parameter function to the stack and back to the registers."
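The arithmetic in Stubbs's example can be written out as a small C sketch. It assumes his lower bound of one cycle per move. The function names are hypothetical, and the chain model (every function except the innermost one saves and restores its parameters) is a simplification of the nested-call case described above.

```c
#include <stddef.h>

/* Cycles to move `params` parameters to the stack and back again,
 * at `cycles_per_move` cycles per transfer (assumed lower bound: 1). */
unsigned spill_cycles(unsigned params, unsigned cycles_per_move)
{
    return 2u * params * cycles_per_move;
}

/* Simplified model of a call chain: params[i] is the parameter
 * count of the function at nesting level i. Every function except
 * the innermost one has its parameters saved and later restored. */
unsigned chain_spill_cycles(const unsigned *params, size_t depth,
                            unsigned cycles_per_move)
{
    unsigned total = 0;
    for (size_t i = 0; i + 1 < depth; ++i)
        total += spill_cycles(params[i], cycles_per_move);
    return total;
}
```

Under these assumptions, spill_cycles(4, 1) reproduces the 8-cycle figure quoted for a single four-parameter function, and a three-deep chain of such functions costs twice that.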

Even if other registers are available, the GCC compiler allocates the extra parameters to the stack once the fixed set of four registers is full. This process wastes both cycles and RAM. It also results in code bloat due to the extra instructions required to save function parameters to the stack.

In contrast, with OCG compilation, said Stubbs, there is perfect knowledge of the register usage of each function. At any point in the program, it knows which registers are available and which registers are not available, and can optimize register usage without any arbitrary constraints.

"When there are two or three deep function calls, it allocates parameters for different functions into non-overlapping register sets, often eliminating the need to store parameters into memory completely," he said.
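The idea of non-overlapping register sets can be illustrated with a toy allocator in C. This is only a sketch under stated assumptions, not HI-TECH's actual algorithm: the function name, the contiguous-block strategy and the pool size passed by the caller are all hypothetical.

```c
#include <stddef.h>

/* Toy model: walk a call chain from the outermost function inward
 * and give each function its own contiguous block of registers
 * from a pool of `pool_size` registers.
 *
 * first_reg[i] receives the first register index assigned to
 * function i, or -1 if the pool cannot hold its parameters and
 * they would have to go to the stack instead.
 *
 * Returns the number of functions whose parameters stay entirely
 * in registers. */
size_t assign_disjoint(const unsigned *params, size_t depth,
                       unsigned pool_size, int *first_reg)
{
    unsigned next = 0;   /* next unassigned register in the pool */
    size_t in_regs = 0;

    for (size_t i = 0; i < depth; ++i) {
        if (params[i] <= pool_size - next) {
            first_reg[i] = (int)next;
            next += params[i];
            ++in_regs;
        } else {
            first_reg[i] = -1;
        }
    }
    return in_regs;
}
```

With a large enough pool, a three-deep chain needs no stack traffic for parameters at all in this model; with a small pool, the innermost functions are the first to fall back to the stack.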

"This results in better utilization of the available registers, fewer cycles wasted moving parameters between the stack and the registers, and less RAM usage. It also contributes to smaller code size by reducing or eliminating the need for code to save registers to the stack."

With the use of OCG, the HI-TECH C PRO knows the register usage of every function in the entire program, including interrupts and any functions that are called by the interrupt code.

"It also knows exactly which registers need to be saved and restored for each interrupt routine. The OCG compiler saves only those registers that are necessary, reducing the size of the interrupt context switching code, and decreasing the number of cycles required to execute the interrupt routine."
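One way to picture how whole-program knowledge shrinks interrupt context-save code is the C sketch below. It is a hypothetical model, not the compiler's internals: each function carries a bitmask of the registers it writes, and the set an interrupt routine must save is the union over everything reachable from it. The struct layout, the names and the 32-function limit are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CALLEES 4

struct func_info {
    uint32_t uses;                 /* bit n set: function writes register n */
    size_t   callees[MAX_CALLEES]; /* indices into the function table */
    size_t   n_callees;
};

/* Union of the registers written by function `idx` and by every
 * function reachable from it. `visited` is a bitmask of table
 * indices already processed, which guards against recursion
 * (this sketch therefore supports at most 32 functions). */
static uint32_t regs_to_save(const struct func_info *tab, size_t idx,
                             uint32_t *visited)
{
    if (*visited & (UINT32_C(1) << idx))
        return 0;
    *visited |= UINT32_C(1) << idx;

    uint32_t mask = tab[idx].uses;
    for (size_t i = 0; i < tab[idx].n_callees; ++i)
        mask |= regs_to_save(tab, tab[idx].callees[i], visited);
    return mask;
}

/* Number of registers the interrupt routine at index `isr` must
 * save and restore in this model. */
unsigned save_count(const struct func_info *tab, size_t isr)
{
    uint32_t visited = 0;
    uint32_t mask = regs_to_save(tab, isr, &visited);
    unsigned n = 0;
    while (mask) {       /* clear the lowest set bit each pass */
        mask &= mask - 1;
        ++n;
    }
    return n;
}
```

A compiler that sees one module at a time cannot build this table for functions defined elsewhere, so it has to save conservatively; that is the contrast the article draws.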

Improving Memory Optimization.
Since the HI-TECH C PRO compiler knows the usage of every instance of every variable in the program, it has the ability to optimize the allocation of every variable between either the stack or the registers. The optimization is based on the frequency of use of each variable.

Variables that are used intensively can be allocated permanently to registers, which have no cycle penalty at all. All register and stack allocations are always optimized to elicit the best overall performance for the entire program. This highly refined optimization of memory both boosts performance and minimizes power consumption by keeping frequently used data in locations that have the shortest access time.
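Frequency-based allocation can be sketched in C as a simple ranking. The article does not describe the compiler's actual heuristic, so this is an assumed simplification: variables are sorted by a static use count and the most-used ones receive the available registers. All names here are hypothetical.

```c
#include <stdlib.h>

struct var_info {
    const char *name;
    unsigned    uses;    /* accesses counted across the whole program */
    int         in_reg;  /* set by allocate(): 1 = register, 0 = stack */
};

/* qsort comparator: higher use counts sort first. */
static int by_uses_desc(const void *a, const void *b)
{
    const struct var_info *x = a;
    const struct var_info *y = b;
    return (x->uses < y->uses) - (x->uses > y->uses);
}

/* Sort the variables by use count and place the `free_regs`
 * most-used ones in registers; the rest go to the stack. */
void allocate(struct var_info *vars, size_t n, size_t free_regs)
{
    qsort(vars, n, sizeof vars[0], by_uses_desc);
    for (size_t i = 0; i < n; ++i)
        vars[i].in_reg = (i < free_regs);
}
```

The point of the sketch is only that the use counts must cover the whole program; a per-module count could rank a variable low even though other modules access it heavily.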

HI-TECH C PRO for the PIC32 MCU Family is available now through September 30, 2008 for the introductory price of US$1595, after which it will sell for $1995. A fully functional 45-day trial version can be downloaded, free of charge, at HI-TECH's website.



 





