dsPIC33A FPU Coprocessor

Last modified by Microchip on 2025/02/05 14:33

Overview

The dsPIC33A FPU Coprocessor implements the most common floating-point operations in hardware for both single-precision (32-bit) and double-precision (64-bit) data formats. It is intended to significantly accelerate the floating-point operations generated by the C compiler, compared with executing the equivalent software library routines, and is designed to comply with the IEEE 754-2008 and IEEE 754-2019 floating-point standards.

Key Features

  • Comprehensive IEEE 754-2008/2019 compliant instruction set
    • Supports both Single and Double Precision operations for most instructions
    • Supports all required rounding modes
  • Closely coupled to dsPIC33A CPU core
    • Instructions issued from CPU core as part of application instruction stream
    • Independent instruction pipeline and hazard management
  • 32 x 32-bit data registers (F-regs)
    • May be used to hold 32-bit Single Precision or 64-bit Double Precision values
    • Base plus 7 partial FPU register contexts
  • Optional subnormal handling for improved performance
    • Subnormal result “Flush-To-Zero” (FTZ) mode
    • Subnormal operand “Subnormals-Are-Zero” (SAZ) mode
  • Comprehensive exception implementation and reporting structure
    • IEEE 754-2019 compliant exception implementation
    • Additional exceptions supported for Huge Integer results and Subnormal operands
  • Debug features supported:
    • Exception address capture register (FEAR)
    • Exception break signaling
    • NaN propagation
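
The two subnormal-handling modes listed above can be illustrated with a small host-side software model. This is only a sketch of the general FTZ/SAZ idea, not the hardware implementation: the helper names (`f32_flush`, `model_mul`) are hypothetical, the mode flags are plain function arguments rather than real FCR bits, and it assumes that a flushed value keeps the sign of the original.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw IEEE 754 binary32 bit pattern and back. */
static uint32_t f32_bits(float x) { uint32_t u; memcpy(&u, &x, sizeof u); return u; }
static float f32_from_bits(uint32_t u) { float x; memcpy(&x, &u, sizeof x); return x; }

/* A binary32 value is subnormal when its exponent field is zero
   and its fraction field is non-zero. */
static int f32_is_subnormal(float x)
{
    uint32_t u = f32_bits(x);
    return (u & 0x7F800000u) == 0 && (u & 0x007FFFFFu) != 0;
}

/* Replace a subnormal value with zero.
   Assumption: the zero keeps the sign of the original value. */
static float f32_flush(float x)
{
    if (!f32_is_subnormal(x)) {
        return x;
    }
    return f32_from_bits(f32_bits(x) & 0x80000000u);
}

/* Hypothetical model of a multiply:
   saz != 0 treats subnormal operands as zero,
   ftz != 0 flushes a subnormal result to zero. */
static float model_mul(float a, float b, int saz, int ftz)
{
    assert(sizeof(float) == sizeof(uint32_t));
    if (saz) {
        a = f32_flush(a);
        b = f32_flush(b);
    }
    float r = a * b;
    return ftz ? f32_flush(r) : r;
}
```

In this model, SAZ acts on the inputs before the operation and FTZ acts on the output after it, which is why the two modes are listed separately.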

Block Diagram

Figure: FPU coprocessor block diagram

Architectural Overview

The FPU coprocessor relies on the associated dsPIC33A CPU for all instruction fetches, most instruction decoding, and all operand movement to and from system memory. The FPU contains no local memory other than its own register set. The CPU fetches all instructions, including FPU instructions, through the I-bus and issues each FPU instruction to the FPU execution pipeline.

There are two groups of Functional Blocks, one for Single Precision (SP) operations and one for Double Precision (DP) operations. Each Functional Block consists of one or more Execute pipeline stages, and each stage represents one cycle of execution toward completing the associated instruction. The FPU can dispatch one instruction per cycle into the corresponding Functional Block.

All data movement to and from the FPU coprocessor goes through the F-registers. Dedicated CPU instructions move data to and from the F-regs, FSR, and FCR, either through W-regs or through indirect moves.

FPU Pipeline

The F-stage and A-stage of the CPU pipeline fetch FPU instructions and issue them to the FPU execution pipeline, as shown here:

Figure: CPU and FPU pipeline

Once the CPU issues an instruction to the FPU coprocessor, the instruction goes through the following stages of the FPU pipeline:

  • Read:
    • The RD-stage of the FPU pipeline receives instructions issued by the CPU. The RD-stage is subject to hazard checks and can therefore be stalled.
  • Execute:
    • Each instruction passes through one or more execute stages, depending on the Functional Block targeted by the operation. When the instruction enters the X[0]-stage, it is registered, so the current FPU instruction is committed and the RD-stage is free to receive another instruction issued by the CPU.
  • Write-back:
    • The WB-stage captures each SP or DP result in dedicated registers as it exits the execute stage.
    • Instructions must retire in the same order in which they were issued.
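
The effect of in-order retirement on timing can be sketched with a toy model. Everything numeric here is an assumption made for illustration: the function name `model_retire_cycles` is hypothetical, the execute depths passed in are made up rather than taken from the device documentation, RD-stage stalls are not modeled, and the model assumes at most one instruction writes back per cycle.

```c
#include <assert.h>
#include <stddef.h>

/* Toy timing model (assumptions, not device data):
   - instruction i occupies the RD-stage in cycle i (no RD stalls modeled),
   - it then spends exec_stages[i] cycles in its execute stages,
   - it writes back in the following cycle, unless an older instruction
     has not yet retired, in which case it waits so that retirement
     stays in issue order (one write-back per cycle). */
static void model_retire_cycles(const unsigned *exec_stages, size_t n,
                                unsigned *retire)
{
    unsigned prev = 0;
    for (size_t i = 0; i < n; i++) {
        assert(exec_stages[i] >= 1);
        unsigned rd = (unsigned)i;
        unsigned earliest = rd + exec_stages[i] + 1;
        unsigned wb = (i > 0 && earliest <= prev) ? prev + 1 : earliest;
        retire[i] = wb;
        prev = wb;
    }
}
```

With hypothetical execute depths of 4, 1, and 1, this model gives write-back cycles 5, 6, and 7: the two short instructions finish executing early but wait behind the longer one so that they retire in issue order.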

FPU Registers

As with the CPU, the FPU programmer’s model is summarized below at a high level, highlighting the multiple contexts for the floating-point working registers, the floating-point control register, and the floating-point status register.

Figure: FPU registers

The following registers are available to the programmer:

  • Thirty-two 32-bit working registers ("F-registers"), labeled F0 through F31.
    • These working registers can hold up to 32 Single Precision values or up to 16 Double Precision values.
  • FPU Control and Status registers (FCR and FSR).
  • An FPU Exception Address capture Register (FEAR), which is used to aid system debug.
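
The "32 Single Precision or 16 Double Precision values" figure follows from each Double Precision value needing 64 bits, that is, two 32-bit registers. The host-side sketch below models this. The type and function names are hypothetical, and the pairing rule is an assumption made for illustration: Double Precision value p is taken to occupy F(2p) for the low word and F(2p+1) for the high word, which the text above does not specify.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_FREGS 32

/* Software model of the F-register file: 32 registers of 32 bits each. */
typedef struct {
    uint32_t f[NUM_FREGS];
} fregs_model;

/* Single Precision: one value per register, F0 through F31. */
static void write_sp(fregs_model *m, unsigned n, float v)
{
    assert(n < NUM_FREGS && sizeof v == sizeof m->f[n]);
    memcpy(&m->f[n], &v, sizeof v);
}

static float read_sp(const fregs_model *m, unsigned n)
{
    float v;
    assert(n < NUM_FREGS);
    memcpy(&v, &m->f[n], sizeof v);
    return v;
}

/* Double Precision: one value per register pair, so 16 values in total.
   Assumption: pair p uses F(2p) as the low word and F(2p+1) as the high word. */
static void write_dp(fregs_model *m, unsigned pair, double v)
{
    uint64_t bits;
    assert(pair < NUM_FREGS / 2 && sizeof v == sizeof bits);
    memcpy(&bits, &v, sizeof bits);
    m->f[2 * pair]     = (uint32_t)bits;
    m->f[2 * pair + 1] = (uint32_t)(bits >> 32);
}

static double read_dp(const fregs_model *m, unsigned pair)
{
    double v;
    assert(pair < NUM_FREGS / 2);
    uint64_t bits = ((uint64_t)m->f[2 * pair + 1] << 32) | m->f[2 * pair];
    memcpy(&v, &bits, sizeof v);
    return v;
}
```

Because both views share the same storage in this model, writing a Single Precision value into one half of a pair changes the Double Precision value held in that pair.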

Note that there are 7 additional register contexts available for the F0-F7, FSR, and FCR registers, which are mapped to interrupt priority levels 1 through 7. This reduces exception latency because the CPU switches to the new context automatically, with no need to save and restore these FPU registers.
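
The partial-context scheme can be pictured as banked storage selected by interrupt priority level. The sketch below is a host-side model only. The names are hypothetical, and it assumes that context index 0 is the base context, that contexts 1 through 7 correspond directly to interrupt priority levels 1 through 7, and that F8-F31 exist as a single copy shared by all contexts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_FREGS     32
#define NUM_BANKED     8   /* F0-F7 are replicated per context */
#define NUM_CONTEXTS   8   /* base context plus 7 additional contexts */

typedef struct {
    uint32_t banked[NUM_CONTEXTS][NUM_BANKED];  /* F0-F7, one set per context */
    uint32_t fsr[NUM_CONTEXTS];                 /* status register per context */
    uint32_t fcr[NUM_CONTEXTS];                 /* control register per context */
    uint32_t shared[NUM_FREGS - NUM_BANKED];    /* F8-F31 (assumed single copy) */
} fpu_ctx_model;

static void fpu_ctx_reset(fpu_ctx_model *m)
{
    memset(m, 0, sizeof *m);
}

/* Return the storage slot for register Fn as seen from a given context.
   Assumption: the context index equals the interrupt priority level,
   with 0 meaning the base context. */
static uint32_t *freg_slot(fpu_ctx_model *m, unsigned ctx, unsigned n)
{
    assert(ctx < NUM_CONTEXTS && n < NUM_FREGS);
    return (n < NUM_BANKED) ? &m->banked[ctx][n]
                            : &m->shared[n - NUM_BANKED];
}
```

In this model, an interrupt handler that uses only F0-F7 works on its own copies and leaves the interrupted code's values intact. Anything it places in F8-F31 would be visible to every context, so under this assumption those registers would still need to be saved and restored in software.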

Other than data movement in and out of the FPU working registers, all FPU instructions are register-to-register operations within the FPU register set. The F-regs are not memory mapped and can only be accessed by the CPU using specific instructions.

CPU Access of FPU Registers

Figure: CPU access of FPU registers

Dedicated CPU instructions support data movement into and out of the FPU coprocessor registers. The supported data movements are as follows:

  • W-regs to F-regs direct move (and vice versa)
    • Example: mov.l w0, f0
  • F-regs to F-regs direct move
    • Example: mov.l f0, f1
  • Indirect move to/from memory
    • Example: mov.l [w0++], f0
  • Direct literal moves
    • Example: mov.l #0x3F800000, f0
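
The literal in the last example is the raw IEEE 754 Single Precision bit pattern of the value being loaded; 0x3F800000 encodes 1.0. The host-side helpers below are a minimal sketch for converting between a C float and such a bit pattern. The function names are hypothetical, and the code assumes the host's float is a 32-bit IEEE 754 binary32 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw binary32 bit pattern of a float.
   memcpy is used because it is a well-defined way to reinterpret
   the bytes of an object in C. */
static uint32_t float_to_bits(float x)
{
    uint32_t u;
    assert(sizeof x == sizeof u);   /* assumes a 32-bit float */
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a float from a raw binary32 bit pattern. */
static float bits_to_float(uint32_t u)
{
    float x;
    assert(sizeof x == sizeof u);
    memcpy(&x, &u, sizeof x);
    return x;
}
```

For example, float_to_bits(1.0f) returns 0x3F800000, which matches the literal used in the move above, and bits_to_float(0x3F000000) returns 0.5.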
