Adding to the CCM discussion in this thread:
In F4 MCUs we have 64KB of CCM memory.
There are 2 restrictions on using CCM:
- CCM can only be used for data; no code can be run from it.
- The DMA controllers cannot access it.
The advantage is that CCM is used exclusively by the CPU on a separate bus, so it doesn't share bandwidth with any other memory or peripheral.
In theory you could have fast DMA going on in the normal RAM, while the CPU runs from flash using CCM data with no penalty in CPU or DMA performance.
With that in mind, there are 3 possible uses for CCM:
- Normal user variables.
As long as we don't use variables allocated there for DMA, all is good.
In the past as proof of concept I modified Steve's F4 USB code to allocate its buffers in CCM. Was not much trouble, and gives the option to use large buffers without taking from normal RAM.
I also allocated the Heap and Stack to CCM, and that provided a small speed gain in one of the CPU benchmarks.
I have not pushed the CPU+DMA combination the way racemaniac did, but this should allow more concurrent operations with no penalty.
Now, allocating all the normal data to CCM is very risky, because if a user allocates a buffer for DMA use there, the code will crash.
Heap and Stack can be used for variables too, so that's somewhat risky as well, but most people using DMA will use a globally allocated buffer (not always though).
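To make that risk concrete, here is a minimal sketch of a check a DMA setup routine could run on a buffer before starting a transfer. The names `is_in_ccm`, `dma_buffer_ok` and `dma_require_ok` are hypothetical, nothing like them exists in the core today. It assumes the CCM window starts at 0x10000000 (as on F405/F407 parts) and is 64KB as mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed CCM window: base as on F405/F407-class parts, size as quoted in this post. */
#define CCM_BASE 0x10000000u
#define CCM_SIZE (64u * 1024u)

/* Returns 1 if the address lies inside the CCM window, 0 otherwise. */
static int is_in_ccm(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= CCM_BASE && a < (uintptr_t)CCM_BASE + CCM_SIZE;
}

/* Returns 1 if [buf, buf+len) does not overlap the CCM window, 0 if it does. */
static int dma_buffer_ok(const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;
    uintptr_t end = start + len;
    int overlaps = start < (uintptr_t)CCM_BASE + CCM_SIZE && end > CCM_BASE;
    return !overlaps;
}

/* Hypothetical debug-build guard: traps if a CCM buffer is about to go to DMA. */
static inline void dma_require_ok(const void *buf, size_t len)
{
    assert(dma_buffer_ok(buf, len));
}
```

A check like this would at least turn a silent DMA failure into an obvious one during development, whether the buffer is global, on the stack or from the heap.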
With all those conditions in mind, I have been thinking that a good compromise for using CCM without causing much pain would be to offer it as a board option. Selecting between stlink upload and bootloader upload already makes the build use a different linker script with different addresses; in the same way, we could add an option that uses a script allocating Stack and Heap to CCM.
We can also add a #define in the core, similar to how __FLASH__ is defined to allow the user to force a variable to flash (that's not used often since the linker will place RO data in flash anyway, but it's in the core):
#define __attr_flash __attribute__((section (".USER_FLASH")))
#define __FLASH__ __attr_flash
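A CCM counterpart modeled on those two lines might look roughly like the sketch below. `__attr_ccm`, `__CCM__` and the `.ccmram` section name are all hypothetical, and the linker script would need a matching output section mapped to the CCM region for this to do anything. The `#if` is only there so the same file also builds on a PC for testing. Keep in mind that startup code normally only zeroes .bss and copies .data, so unless the startup is extended too, variables placed this way should be treated as uninitialized.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical CCM attribute, following the pattern of __FLASH__. */
#if defined(__arm__) && defined(__GNUC__)
#define __attr_ccm __attribute__((section(".ccmram")))
#else
#define __attr_ccm /* host build: no special placement */
#endif
#define __CCM__ __attr_ccm

/* CPU-only scratch buffer: a candidate for CCM because no DMA touches it. */
uint8_t scratch[4096] __CCM__;

/* Buffer handed to DMA: left without the attribute so it stays in normal RAM. */
uint8_t dma_buf[512];

/* Fills the scratch buffer with one value and returns the sum of its bytes. */
uint32_t scratch_fill_and_sum(uint8_t value)
{
    uint32_t sum = 0;
    memset(scratch, value, sizeof scratch); /* explicit init, see note above */
    for (size_t i = 0; i < sizeof scratch; i++) {
        sum += scratch[i];
    }
    return sum;
}
```

With this approach the user opts in per variable, so nothing moves to CCM by accident and existing sketches that use DMA keep working unchanged.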
I'm not sure if anyone is using all the RAM in an F4 MCU and needs the extra memory CCM provides, but the speed gains may be more interesting to some.
What are everyone's thoughts on this?
Should we add a menu option to allocate heap and stack to CCM?
Or should we add defines and options in the linker script to allocate data to CCM with an attribute?
Or does no one have any interest in using CCM at the moment?