Read-Only Objects and MPLAB® XC8 compiler for AVR® MCUs

Last modified by Microchip on 2023/11/09 09:13

In this article, we will look at objects that are read but never written by your program. There are several options available to have such objects stored in different memory areas when using the MPLAB® XC8 compiler and building for an AVR® device.

Ordinary objects, such as foobar shown below, are placed on the data stack or assigned a static address, depending on where in the source code they are defined. In both cases, they are allocated storage in the AVR’s data memory, or RAM.

int foobar;

Microcontrollers often have relatively small amounts of RAM, as larger amounts are expensive to implement. If you define a large amount of data that is never written, such as a look-up table, it can quickly fill the device’s RAM and prevent the program from building. This can occur even if a large amount of program memory, or Flash memory, remains free.

Storing read-only objects in program memory can free up RAM for other objects, and there are several ways in which this can be done. This article explores your options, assuming you are using version 2.05 or later of the XC8 compiler.

Placing Objects in Program Memory

The following table summarizes the ways in which a C object with static storage duration (globally accessible or static local objects) can be placed in Flash program memory. Each method is discussed in the subsequent sections.

Method             Default location       How to read in C source
const              anywhere in Flash      direct/indirect
const __memx       anywhere in Flash      direct/indirect
const __flash      Flash segment 0        direct/indirect
const __flashn     Flash segment 0 [1]    direct/indirect
const (progmem)    Flash segment 0        library function

[1] The sections created by the __flashn specifiers are currently placed in segment 0 by default, but they must be moved to the nth segment for direct and indirect access to work correctly.

Using the const Qualifier

An object with static storage duration (globally accessible or static local objects) can be placed in Flash memory simply by qualifying it with const. The definition below shows how an initial value should be supplied so that the object's value is programmed into memory before the program executes.

const int foobar = value;

This is the recommended way to have objects placed in Flash memory.

The const qualifier is used in C primarily to indicate that an object is read-only and hence cannot be written by your source code once the program is running. If an object cannot be written, there is no requirement for it to be located in RAM. Thus, const-qualified objects that have static storage duration are located in Flash memory by default. However, any const-qualified object that has automatic storage duration (auto and parameter objects) is always placed in RAM. If you prefer to have all const-qualified objects located in RAM, the compiler's -mno-const-data-in-progmem option can be used to disable this feature. The preprocessor macro __AVR_CONST_DATA_IN_PROGMEM__ is defined whenever this feature is in effect.
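
As a minimal sketch of the behavior described above, the fragment below defines a const-qualified look-up table with static storage duration and uses the __AVR_CONST_DATA_IN_PROGMEM__ macro to record whether the feature is in effect. The names sample_table, const_data_in_flash and sample_lookup, and the table values, are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical look-up table. It is const and has static storage duration,
   so it is placed in Flash by default when building for an AVR device. */
static const uint8_t sample_table[8] = { 0, 49, 90, 117, 127, 117, 90, 49 };

/* Record at compile time whether const data is being placed in Flash.
   The compiler defines the macro only when the feature is in effect. */
#ifdef __AVR_CONST_DATA_IN_PROGMEM__
enum { const_data_in_flash = 1 };
#else
enum { const_data_in_flash = 0 };  /* for example, -mno-const-data-in-progmem */
#endif

/* Direct read using the object's identifier; no special syntax is needed. */
uint8_t sample_lookup(uint8_t index)
{
    return sample_table[index & 7u];  /* mask keeps the index within 0..7 */
}
```

The source is the same whether the table ends up in Flash or RAM; only the generated access code differs.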

Within the C source code, objects qualified const can be read directly, using the object’s identifier, or indirectly, using a pointer loaded with the object's address (discussed below). The sequence of instructions used to access the object will depend on the target device.

If you are using any device in the ATtiny or ATxmega3 families, the Flash program memory is mapped into the data space. This means that the code generated by the compiler to access const-qualified objects can use the 8-bit AVR MCU’s lds or ld instructions, which read from the data memory space.

For other devices, the Flash memory is not mapped into the data space, so the compiler generates alternative instructions to read the object’s value. Objects qualified const could be located anywhere in the available Flash, and the read sequence used to access them varies with the number of Flash segments the target device implements. For example, the compiler produces a compact code sequence for devices with only one Flash segment and might call a library routine when the target device has multiple Flash segments.

Using just the const qualifier to define read-only objects has several advantages. It does not use any non-standard C keywords, so it is more portable, especially since the same syntax is used to locate objects in program memory when compiling for 8-bit PIC® devices with the same compiler. In addition, a compiler option allows you to control where these objects are placed.

Using the __memx Specifier

An object can also be placed in Flash memory by using the __memx specifier along with the const qualifier, for example:

__memx const int foobar = value;

This method of placing objects in Flash is largely redundant, as it operates no differently than just using the const qualifier, and it is less portable.

Using the __flash or __flashn Specifiers

An object can also be placed in Flash memory by using the const qualifier and one of the __flash or __flashn specifiers (where n is a number from 1 through 5), for example:

__flash const int foobar = value;
__flash2 const int foobar2 = value;

These specifiers indicate that the objects should be located in different program memory sections. These sections can be individually positioned using the project’s linker script. The linker script must be modified so that sections generated by the __flashn specifier are located in the nth Flash segment; this ensures that ordinary direct or indirect reads of objects within these sections are handled correctly. Not all of the __flashn specifiers are available on devices with smaller amounts of Flash memory.

Since there are more restrictions on where Flash-specified objects are located, the code to access them is often more efficient than that produced when accessing objects positioned in Flash memory using other means. This is particularly true if the target device has a large amount of Flash memory implemented over several segments.
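
The following sketch shows a direct read of an object defined with __flash. It assumes the compiler predefines the __FLASH macro when the __flash address space is available, as AVR GCC documents; where the macro is absent, the code falls back to a plain const definition. The names SEED_SPACE, seed_table and read_seed are hypothetical.

```c
#include <stdint.h>

#ifdef __FLASH
/* Named address space available: place the table in Flash segment 0. */
#define SEED_SPACE __flash
#else
/* Fallback (assumption, for compilers without __flash): plain const. */
#define SEED_SPACE
#endif

/* Hypothetical table of 16-bit constants. */
static const SEED_SPACE uint16_t seed_table[4] = {
    0x1021, 0x8005, 0xA001, 0x8408
};

/* Direct read by identifier; the compiler selects the access sequence
   from the address space named in the definition. */
uint16_t read_seed(uint8_t index)
{
    return seed_table[index & 3u];  /* mask keeps the index within 0..3 */
}
```

A __flashn definition would look the same apart from the specifier, but it also depends on the linker script change described above.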

Using the progmem Attribute

An object can also be placed in Flash memory by using the progmem attribute and the const qualifier. Provided you include <avr/pgmspace.h> in your source files, you can instead use the PROGMEM macro shortcut, which is used more like a qualifier. Examples of both methods are shown below.

#include <avr/pgmspace.h>
const int __attribute__((progmem)) foobar = value;
const int PROGMEM foobar2 = value;

Prior to the AVR GCC compiler supporting named address spaces, this was the only way in which objects could be placed in Flash, and it can still be used today for compatibility with legacy projects or to improve the portability of code migrated from other platforms.

The compiler is not aware of the special placement of objects that use the progmem attribute. Thus, any direct or indirect read of such objects results in code that incorrectly accesses RAM instead of program memory. To read objects defined in this way, you must use special library functions (discussed at the end of this article), such as pgm_read_byte(), and you must ensure that you use the correct function for the object being read. This requirement is one of the reasons why this method of object placement is not recommended for new projects.
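
A minimal sketch of reading a progmem object through a library function is shown below. It uses pgm_read_word() from <avr/pgmspace.h> because the elements are 16 bits wide. The host-side stand-in macros in the #else branch are hypothetical and exist only so the fragment can be compiled off-target; the names thresholds and read_threshold are also illustrative.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host-side stand-ins (hypothetical, for off-target testing only). */
#define PROGMEM
#define pgm_read_word(addr) (*(const uint16_t *)(addr))
#endif

/* Hypothetical table placed in Flash with the PROGMEM macro. */
static const uint16_t PROGMEM thresholds[3] = { 100, 250, 900 };

/* The caller must pass an index in the range 0..2. */
uint16_t read_threshold(uint8_t index)
{
    /* A plain thresholds[index] read is what the text above warns against;
       the address is passed to the library function instead. */
    return pgm_read_word(&thresholds[index]);
}
```

The function chosen must match the element size: pgm_read_byte() for 8-bit objects and pgm_read_word() for 16-bit objects.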

Defining Pointers to Program Memory

It is common to access objects indirectly via a pointer. Pointers that can read objects in program memory need to be defined in such a way that this information is conveyed to the compiler. The following table summarizes the features of pointers defined in different ways.

Pointer definition    Pointer size (bits)    Accessible objects
const *               16/24                  RAM-based, const, __memx, __flash, __flashn, progmem
const __memx *        24                     RAM-based, const, __memx, __flash, __flashn, progmem
const __flash *       16                     __flash
const __flashn *      16                     __flashn

Pointers to const

Objects in Flash memory can be accessed via a pointer to const, for example:

const int * ip;

This is the recommended way to indirectly access objects placed in Flash.

For any device in the ATtiny or ATxmega3 families, these pointers are 16 bits wide and can target objects in RAM or in program memory, since the program memory is mapped into the data space.

For other devices, the pointers are 24 bits wide. They can target objects in RAM or in program memory, and the MSb in the address they hold is used to determine which memory space is to be accessed. The indirect read sequence varies based on the number of Flash segments the target device implements. For example, the compiler produces a compact code sequence for devices with only one Flash segment and might call a library routine when the target device has multiple Flash segments.

Using pointers to const to read objects indirectly has several advantages. It does not use any non-standard C keywords, so it is more portable, especially because the same syntax is used to read objects in program memory when compiling for 8-bit PIC devices using the same XC8 compiler. If functions require a parameter to hold an address, making those parameters pointers to const allows the functions to work when passed the address of any type-compatible object.
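
The last point can be sketched as follows: one function with a pointer to const parameter is passed the address of a const object and the address of an ordinary RAM object. All names here (sum_bytes, flash_table, ram_buffer, sum_both) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums count bytes starting at data. The parameter is a pointer to const,
   so it accepts the address of a const object or of a RAM object. */
uint16_t sum_bytes(const uint8_t *data, size_t count)
{
    uint16_t total = 0;
    while (count--) {
        total += *data++;
    }
    return total;
}

static const uint8_t flash_table[4] = { 1, 2, 3, 4 };  /* Flash by default on AVR */
static uint8_t ram_buffer[4] = { 10, 20, 30, 40 };     /* always in RAM */

/* Calls the same function with both kinds of object. */
uint16_t sum_both(void)
{
    return sum_bytes(flash_table, sizeof flash_table)
         + sum_bytes(ram_buffer, sizeof ram_buffer);
}
```

No keyword other than const is required, so the same source also builds with a host compiler.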

Pointers to __memx

Objects in Flash memory can be accessed via a pointer that qualifies its target type with both const and __memx, for example:

const __memx int * ip;

This type of pointer and its use to access objects in Flash is now largely redundant, as it operates similarly to pointers to const, which do not require the use of non-standard keywords.

Pointers to __flash or __flashn

Objects in Flash memory can be accessed via a pointer that qualifies its target type with both const and one of the __flash or __flashn specifiers (where n is a number from 1 through 5), for example:

const __flash int * ip;
const __flash1 int * ip1;

These pointers are 16 bits wide and can target objects only in the program memory space, so they cannot be used to access RAM-based objects. Indeed, they can correctly access only Flash objects that have been defined using the same specifier as the pointer, so a pointer to __flash1 must be assigned only the addresses of objects that are defined using the __flash1 specifier.

When a __flash pointer is dereferenced, the 16-bit address is used by the lpm instruction. When the __flashn pointers are dereferenced, the elpm instruction uses the 16-bit address together with a constant n loaded to the RAMPZ register. For example, dereferences of a pointer to __flash1 will load 1 into RAMPZ, and then use the 16-bit address stored in the pointer with the elpm instruction to access the target indirectly.
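
A sketch of an indirect read through a pointer to __flash follows. As with the earlier definitions, it assumes the __FLASH macro is predefined when the address space is available and otherwise falls back to a plain pointer to const. The names MSG_SPACE, greeting, flash_strlen and greeting_length are hypothetical.

```c
#include <stdint.h>

#ifdef __FLASH
#define MSG_SPACE __flash
#else
#define MSG_SPACE   /* fallback (assumption): plain pointer to const */
#endif

/* Object and pointer use the same specifier, as the text requires. */
static const MSG_SPACE char greeting[] = "ready";

/* Counts characters through a pointer to __flash. On a device where
   __flash is available, the pointer holds a 16-bit Flash address. */
uint8_t flash_strlen(const MSG_SPACE char *s)
{
    uint8_t n = 0;
    while (*s++) {
        n++;
    }
    return n;
}

uint8_t greeting_length(void)
{
    return flash_strlen(greeting);
}
```

Passing the address of a RAM-based string to flash_strlen() would not be valid on an AVR device, since the pointer can target program memory only.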

Program Memory Objects and Library Functions

Where possible, make function parameters that hold address arguments pointers to const. Functions using these parameters can be made to work with any address, regardless of which memory space that address targets. However, there are many existing library functions that can read only some of the objects defined in the ways discussed above.

The pgm_read_xxx_far() library functions accept 24-bit address parameters, which allow them to read program memory addresses up to and beyond 64 K, even reading objects that span a 64 K boundary. They allow reading of Flash objects defined using any of the above methods. The pgm_read_xxx_near() functions use 16-bit addresses to read program memory, so they are able to access objects only in the lowest 64 K Flash segment, such as those defined by the __flash specifier or the progmem attribute. None of these functions can read from RAM.

Similar limitations apply to any of the suffixed string functions. The string functions suffixed with _PF, such as strlen_PF(), use 24-bit addresses to allow them to access objects anywhere in the Flash memory, even if those objects straddle a 64 K segment boundary. They do not operate on RAM-based objects, so you would need to use the regular string functions, such as strlen(), to operate on strings in RAM. Those library functions suffixed with _P, such as strlen_P(), use 16-bit addresses and can work with objects that are located only in the lowest 64 K Flash segment, such as those defined by the __flash specifier or the progmem attribute.
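
The pairing of string function and object location can be sketched as below: strlen_P() is applied to a string defined with PROGMEM, and the regular strlen() to a string in RAM. The host-side stand-ins in the #else branch are hypothetical and only allow off-target compilation; banner, ram_text and total_length are illustrative names.

```c
#include <stddef.h>
#include <string.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host-side stand-ins (hypothetical, for off-target testing only). */
#define PROGMEM
#define strlen_P(s) strlen(s)
#endif

static const char PROGMEM banner[] = "XC8";  /* lowest 64 K Flash segment */
static char ram_text[] = "AVR MCU";          /* RAM-based string */

/* Uses the _P variant for the progmem string and the regular
   function for the RAM string. */
size_t total_length(void)
{
    return strlen_P(banner) + strlen(ram_text);
}
```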

Library Differences When Using Const in Program Memory

If const objects are placed into Flash (which occurs by default, unless you issue the -mno-const-data-in-progmem option), those standard library functions that take a pointer to const parameter and potentially return the same pointer with an offset, e.g., strchr() or memchr(), return an address that has a pointer to const type. For example, the prototype for strchr() is usually:

char * strchr(const char *, int);

However, when the const objects in Flash memory feature is enabled, it becomes:

const char * strchr(const char *, int);

The other difference in library code when const objects are placed in Flash is that arguments to the %s and %p placeholders in the printf or scanf family of functions must have a pointer to const type. This ensures that the correct number of bytes is read off the argument stack for each argument.

If you are writing routines that use a variable number of arguments, you might also need to enforce this same restriction.
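
As a sketch of code that is valid under either strchr() prototype, the fragment below stores the returned address in a pointer to const and passes a pointer to const as the %s argument. The function names base_name and print_base_name are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Returns the text after the first '/' in path, or path itself if there
   is no '/'. The result of strchr() is held in a pointer to const, which
   is valid whether the function returns char * or const char *. */
const char *base_name(const char *path)
{
    const char *slash = strchr(path, '/');
    return slash ? slash + 1 : path;
}

/* The %s argument has pointer to const type. */
void print_base_name(const char *path)
{
    printf("%s\n", base_name(path));
}
```

Assigning the result of strchr() to a plain char * would discard the const qualifier when the const objects in Flash memory feature is enabled, so declaring the receiving pointer as const char * keeps the code valid in both configurations.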