Fast 32 bit GPIO possible on the STM32F4?

killingtime
Posts: 3
Joined: Mon Jun 05, 2017 2:55 pm

Fast 32 bit GPIO possible on the STM32F4?

Post by killingtime » Mon Jun 05, 2017 3:26 pm

Hello,

Looking at the data sheet for the STM32F4 series MCU I see the output pins are divided into 16 bit ports.

I'm aware that it's possible to manipulate GPIO ports directly by writing to various registers but I've not seen any discussion or examples on how you'd *quickly* get a 32bit variable out onto the hardware pins.

For MCUs with a 32 bit port (e.g. the Atmel SAM3X8E, I know) it's just a case of writing a 32 bit variable to a 32 bit port and you're done.

For MCUs with 16 bit ports the only method I can see would be to use bit masking & shifting, then casting to a smaller variable if the compiler demands it (16 bit) then variable assignment to the register. If data throughput is important (and it is) then this is a huge overhead.

So my question is: does this MCU have any inbuilt support for working with 32 bit data on the GPIO? A way of linking port registers together perhaps?

If not, any techniques you're aware of that are quick in terms of cpu cycles would be welcome.

Thanks.

Ollie
Posts: 193
Joined: Thu Feb 25, 2016 7:27 pm

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by Ollie » Mon Jun 05, 2017 7:12 pm

Your question already implies the answer. On the STM32F4, the ports are 16 bits wide. All 16 pins of a port can be written in a single assignment statement. In a similar way, selected pins in the port can be set and/or cleared in a single assignment statement. For 32 pin manipulation, you have to use two assignment statements.

I wonder about the practical feasibility of this. The pins on the boards are not arranged in nice clusters, and it will be a challenge to get 32 pins out in any meaningful wiring scheme. Perhaps you can identify a way to replace the parallel operation with fast serial processing. If that is not feasible, my recommendation is to solve the problem with an FPGA.
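To make the "two assign statements" idea concrete, here is a minimal sketch in C. It is not taken from any STM32 header: `sim_port_t`, `write32()` and `bsrr_word()` are hypothetical names, and the port struct is a plain-memory stand-in so the code runs on a PC. On real hardware the two writes would go to the memory-mapped output registers of two GPIO ports. The second helper assumes the usual F4 layout of the bit set/reset register (low 16 bits set pins, high 16 bits reset pins).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one GPIO port's output data register.
   On real hardware this would be a memory-mapped register. */
typedef struct {
    volatile uint32_t ODR;   /* only the low 16 bits drive pins */
} sim_port_t;

/* Two simulated ports: one carries bits 0..15, the other bits 16..31. */
sim_port_t sim_port_lo, sim_port_hi;

/* Put a 32-bit value on two 16-bit ports with two assignments.
   Each line is one mask or one shift followed by one store. */
void write32(sim_port_t *lo, sim_port_t *hi, uint32_t value)
{
    assert(lo && hi);
    lo->ODR = value & 0xFFFFu;   /* low half-word  */
    hi->ODR = value >> 16;       /* high half-word */
}

/* Build a set/reset word: pins in set_mask go high, pins in
   clear_mask go low, all other pins keep their state. */
uint32_t bsrr_word(uint16_t set_mask, uint16_t clear_mask)
{
    return (uint32_t)set_mask | ((uint32_t)clear_mask << 16);
}
```

Note that the two halves change at slightly different times, so whatever is connected to the pins would probably need a strobe or latch signal that fires after the second write.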

Rick Kimball
Posts: 1038
Joined: Tue Apr 28, 2015 1:26 am
Location: Eastern NC, US

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by Rick Kimball » Mon Jun 05, 2017 8:14 pm

Have you looked at the FMC peripheral? I think it has a 32bit data register.

There is a reference to it on this post:

http://vjordan.info/log/fpga/stm32f4-ge ... -fsmc.html
-rick

stevestrong
Posts: 1747
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by stevestrong » Tue Jun 06, 2017 8:48 am

With the trick to use address lines FSMC_A0..A15 as data lines, you should be able to get 32 bits as data out from FSMC (STM32F4VET6).

killingtime
Posts: 3
Joined: Mon Jun 05, 2017 2:55 pm

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by killingtime » Tue Jun 06, 2017 4:19 pm

Thanks for the replies.

The FSMC does look to be the only option. It only has a 16 bit data bus, but if the address lines can be used as well then that may do the trick. How complicated this is to set up is another matter.

I've done some more research on bit shifting. Unless the cpu has an instruction to deal specifically with it, it can be quite a drain. I've seen 1 cpu cycle per shift discussed (that's 16 cycles) but without looking at compiler debug info for this platform I'd be guessing.

There is another more general option; use of the UNION construct in C. This essentially allows you to split larger variables into smaller ones by using multiple variables mapped to the same memory space. You're just reading different parts of the same memory space so no bit shifting is involved. You could work on a 32 bit variable as part of your program and split them into two 16 bit variables just before writing (twice) to the necessary GPIO registers. You're not dependent on a chip specific FSMC with this method.
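A minimal sketch of the union idea, for comparison with the shift-and-mask version. The names `word32_t`, `split_union()` and `split_shift()` are made up for illustration. It assumes a little-endian target (which the STM32F4 is), so `half[0]` holds the low 16 bits. Reading a union member other than the one last written is allowed in C but not in C++.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit word viewed as two 16-bit halves (same memory). */
typedef union {
    uint32_t word;
    uint16_t half[2];
} word32_t;

/* The overlay only works if both views are exactly 4 bytes. */
static_assert(sizeof(word32_t) == 4, "unexpected padding in word32_t");

/* Split via the union. ASSUMPTION: little-endian, so half[0] is
   the low half-word and half[1] is the high half-word. */
void split_union(uint32_t value, uint16_t *lo, uint16_t *hi)
{
    word32_t w;
    w.word = value;
    *lo = w.half[0];
    *hi = w.half[1];
}

/* Split via mask and shift. Independent of byte order. */
void split_shift(uint32_t value, uint16_t *lo, uint16_t *hi)
{
    *lo = (uint16_t)(value & 0xFFFFu);
    *hi = (uint16_t)(value >> 16);
}
```

Both functions give the same halves on a little-endian machine. Whether the union version is actually faster depends on the compiler and flags, so it would be worth comparing the generated assembly before choosing one.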

stevestrong
Posts: 1747
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by stevestrong » Tue Jun 06, 2017 5:01 pm

Regarding union, check this post.
Take care to implement it in C, not in C++.

Rick Kimball
Posts: 1038
Joined: Tue Apr 28, 2015 1:26 am
Location: Eastern NC, US

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by Rick Kimball » Tue Jun 06, 2017 5:51 pm

killingtime wrote: For MCUs with a 32 bit port (e.g. the Atmel SAM3X8E, I know) it's just a case of writing a 32 bit variable to a 32 bit port and you're done.
How many cycles do you think that takes? What is the maximum speed of the PIO on the atmel?

[Edit] I found this on the arduino site, seems to indicate ~16MHz ish https://forum.arduino.cc/index.php?topi ... msg2466434 [/Edit]

[Edit] Turns out that to run the Arduino DUE @ 84MHz, its flash wait state has to be set to 4. So maybe you could force your code into SRAM to make it run faster, I don't know. Maybe its SRAM has wait states too? : ) My point is this: being able to write 32 bits to the PIO with a single asm instruction doesn't always make you first on race day. [/Edit]
killingtime wrote:I've done some more research on bit shifting. Unless the cpu has an instruction to deal specifically with it, it can be quite a drain. I've seen 1 cpu cycle per shift discussed (that's 16 cycles) but without looking at compiler debug info for this platform I'd be guessing.
Have you looked at the arm reference manual on the cortex-m4 architecture? That provides you a cycle count for each instruction. As you noted it also heavily depends on the compiler flags you use that determine how many cycles are actually used. With all that said the shifting features of the cortex-m4 are efficient and extensive.
killingtime wrote:So my question is; does this MCU have any inbuilt support for working with 32bit data on the GPIO? A way of linking port registers together perhaps?
Do you have a real world problem you are trying to solve? Asking about "the STM32F4" is too generic. There are various versions of the STM32F4 chips and they don't all have the same set of peripheral features. Depending on the problem you are trying to solve, one chip might address it better than another.
-rick

killingtime
Posts: 3
Joined: Mon Jun 05, 2017 2:55 pm

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by killingtime » Wed Jun 07, 2017 1:02 pm

Hello Rick,

You raise some interesting points. I've not looked (yet) at the cycle count for different instructions or the exact architecture differences between MCU variants. The wait states on memory are something I'd not considered either.

I'm at the planning stage of a project and looking to pick the correct MCU for the job. Parallel operation of external peripherals would be faster for my application (offtopic: LED matrix displays), so I've been looking at different MCU offerings from different manufacturers and wondered how devs using STM32 MCUs got 32 bit data out given the 16 bit ports. My curiosity was piqued.

Regards.

Ollie
Posts: 193
Joined: Thu Feb 25, 2016 7:27 pm

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by Ollie » Wed Jun 07, 2017 2:09 pm

The normal practice for LED matrix displays is to use external driver chips that are connected to the MCU through a serial interface. The driver chips provide the right voltage and current. Without them, you could have problems with the available source/sink current on the MCU pins, and the wiring of the required resistors is a mess.

victor_pv
Posts: 1681
Joined: Mon Apr 27, 2015 12:12 pm

Re: Fast 32 bit GPIO possible on the STM32F4?

Post by victor_pv » Thu Jun 29, 2017 12:55 pm

killingtime wrote:
Tue Jun 06, 2017 4:19 pm
Thanks for the replies.

The FSMC does look to be the only option. It only has a 16 bit data bus, but if the address lines can be used as well then that may do the trick. How complicated this is to set up is another matter.

I've done some more research on bit shifting. Unless the cpu has an instruction to deal specifically with it, it can be quite a drain. I've seen 1 cpu cycle per shift discussed (that's 16 cycles) but without looking at compiler debug info for this platform I'd be guessing.

There is another more general option; use of the UNION construct in C. This essentially allows you to split larger variables into smaller ones by using multiple variables mapped to the same memory space. You're just reading different parts of the same memory space so no bit shifting is involved. You could work on a 32 bit variable as part of your program and split them into two 16 bit variables just before writing (twice) to the necessary GPIO registers. You're not dependent on a chip specific FSMC with this method.

Two things come to mind:
1. The Cortex-M has a shift instruction that takes 1 cycle and can shift by any number of positions in that 1 cycle, so a 16 bit shift takes 1 cycle.
2. The DMA controller can do I/O without CPU intervention, so if you write correctly formatted data to a buffer and fire a DMA transfer, it goes as fast as the port can manage, possibly faster.
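
As a rough sketch of the "write correctly formatted data to a buffer" step: the function below splits an array of 32 bit words into two 16 bit buffers, one per port. `split_for_dma()` is a hypothetical helper, and the two-buffer layout is an assumption (one DMA transfer per 16 bit port). The DMA setup itself depends on the chip and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pre-format 32-bit words into two 16-bit buffers.
   ASSUMPTION: lo_buf feeds the port carrying bits 0..15 and
   hi_buf feeds the port carrying bits 16..31. Each element costs
   one mask and one shift, done once, ahead of the transfer. */
void split_for_dma(const uint32_t *src, uint16_t *lo_buf,
                   uint16_t *hi_buf, size_t count)
{
    assert(src && lo_buf && hi_buf);
    for (size_t i = 0; i < count; ++i) {
        lo_buf[i] = (uint16_t)(src[i] & 0xFFFFu);
        hi_buf[i] = (uint16_t)(src[i] >> 16);
    }
}
```

With this layout the CPU does the splitting once, and the transfers can then run from the two buffers without further shifting or masking.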
