Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

victor_pv
Posts: 954
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby victor_pv » Mon Feb 27, 2017 5:14 pm

Ollie wrote:I agree that it will be many moons before we use the Arduino IDE for serious or semi-serious F4/F7/H7 programming. On that frontier, developers need to be aware of the capabilities and restrictions of the different peripheral buses and the DMA channels on them.

I propose that we use STM terminology for these concepts. In that sense, we need to recognize that there are only a few DMA channels and that each of them has multiple streams. Streams in different channels can be active at the same time, but within a channel only one stream can be active at any point in time.


The problem with STM is that for the F1 they call them Controllers and Channels, and several peripherals can use the same channel, but not at the same time. They don't use the word "stream" in the F1 documentation at all. There are some differences between the F1 and F4 DMA controllers, but to be honest I haven't looked at the F4 at all, so I don't know what they are.

Leaflabs had started to use that terminology to be consistent between the F1 and F4 ports, so we can go ahead and use it, but they were calling something "tubes". Again, I haven't looked at that much; I've been using the old methods on the F1. We can switch to the "tubes" methods, which should be compatible with both F1 and F4, and perhaps add a wiki page (if Leaflabs didn't have one) explaining the differences between the F1 and F4 DMA. Otherwise, if you start talking about streams or tubes in the F1 context, anyone who tries to look them up in the F1 reference manual won't find anything.

Ollie
Posts: 130
Joined: Thu Feb 25, 2016 7:27 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby Ollie » Mon Feb 27, 2017 5:50 pm

Sorry, now I have my foot in my mouth. The F1 and other STM MCUs share the same name for the controllers on the peripheral buses. So here is the summary for F4/F7/H7:
1) DMA Controllers have multiple streams
2) Streams have multiple channels
3) The stream channels are hard-wired to peripheral devices
- stream configuration selects one of the devices
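
The hierarchy in the list above can be pictured as a small data model. This is only an illustrative sketch: the names (`DmaController`, `DmaStreamConfig`, `channel_select`) are invented for the example and are not libmaple or HAL identifiers. The counts (8 streams per controller, 8 selectable request channels per stream) are assumed from the F4 family and should be checked against the reference manual for the actual part.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of an F4-style DMA controller; all names are illustrative.
constexpr int kStreamsPerController = 8;  // assumed from the F4 family
constexpr int kChannelsPerStream = 8;     // assumed: chosen by a 3-bit select field

struct DmaStreamConfig {
    bool enabled = false;
    uint8_t channel_select = 0;  // which hard-wired request line feeds this stream
};

struct DmaController {
    std::array<DmaStreamConfig, kStreamsPerController> streams{};

    // Route one of the stream's hard-wired request lines to it and enable it.
    bool select(int stream, int channel) {
        if (stream < 0 || stream >= kStreamsPerController) return false;
        if (channel < 0 || channel >= kChannelsPerStream) return false;
        streams[stream].channel_select = static_cast<uint8_t>(channel);
        streams[stream].enabled = true;
        return true;
    }
};
```

In this model a stream serves exactly one peripheral at a time, because `channel_select` holds a single value.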

victor_pv
Posts: 954
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby victor_pv » Mon Feb 27, 2017 7:21 pm

I have to read the reference manual to better understand the similarities and differences, but I think streams are somewhat similar to the mechanism on the F1, where several devices can use a channel but only one at a time. On the F1, though, the different peripheral request lines are just OR'ed, so it is technically possible to enable the requests from 2 devices at the same time, but the transactions will only go to 1 register address, so that would just mess up the transfers. That may be different with streams if they provide some kind of arbitration.

racemaniac
Posts: 327
Joined: Sat Nov 07, 2015 9:09 am

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby racemaniac » Tue Feb 28, 2017 8:09 am

victor_pv wrote:From what I have found, they have not integrated any DMA functions in the normal SPI library.
There is another library, DmaSpi, which is the one that has DMA capabilities:

https://github.com/crteensy/DmaSpi

That one works similarly to what Steve was working on: it queues transfers with a series of properties (buffer pointer, size, and a pin object to control CS), then services those transfers in order.

Not sure if all that is worth the effort: as Steve tested, for small transfers it doesn't improve performance due to all the overhead, and the libraries using it have to be heavily modified.

Nice to see how we all come to the same results ^^. For my lightsaber I made a sound-streaming system via SPI DMA and basically came to the same result: a FIFO of DMA requests (in my case specifying only the address of the buffer and the address on the device to DMA between), with the sound streams adding requests to the FIFO whenever they needed new data :).

It's not that complicated to make such a system, and it is indeed very powerful :).
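
A request FIFO of the kind described here could look roughly like the following. This is a minimal sketch, not the code from the lightsaber project or from DmaSpi: `DmaRequest` and `RequestFifo` are hypothetical names, and the fields are guessed from the description (a RAM buffer plus an address on the device).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical request descriptor: a RAM buffer plus an address on the SPI device.
struct DmaRequest {
    uint8_t* buffer;
    uint32_t device_address;
    size_t length;
};

// Fixed-size ring buffer; one slot stays unused so head == tail means "empty".
template <size_t N>
class RequestFifo {
public:
    bool push(const DmaRequest& r) {
        size_t next = (head_ + 1) % N;
        if (next == tail_) return false;  // full
        slots_[head_] = r;
        head_ = next;
        return true;
    }
    bool pop(DmaRequest& out) {
        if (tail_ == head_) return false;  // empty
        out = slots_[tail_];
        tail_ = (tail_ + 1) % N;
        return true;
    }
    bool empty() const { return head_ == tail_; }

private:
    DmaRequest slots_[N] = {};
    size_t head_ = 0;
    size_t tail_ = 0;
};
```

On real hardware `pop` would typically run from the DMA transfer-complete interrupt while `push` runs from the main loop, so the indices would need to be protected or made volatile/atomic; that part is left out of the sketch.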

victor_pv
Posts: 954
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby victor_pv » Tue Feb 28, 2017 1:37 pm

But you didn't hand over control of the GPIO pins for CS to the DMA FIFO, is that right?
Were you sending all the transfers to the same device or to different ones?
For the I2S library I have used a double buffer managed by the library, similar to what Arduino does, but I don't like it since it involves a memcpy from the application buffer to the library buffer, a complete waste of time. I did it that way just to maintain compatibility with the API as much as possible. When I have time I plan on adding functions that manage a FIFO holding just a pointer and a size instead.

racemaniac
Posts: 327
Joined: Sat Nov 07, 2015 9:09 am

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby racemaniac » Tue Feb 28, 2017 1:48 pm

In my case it was indeed 1 memory chip containing all the sound samples, but multiple sounds could be streamed from that chip at the same time :). So the CS pin was indeed fixed in my solution. But adding that wouldn't be much work: you add the CS pin to the struct of the data request you put in the FIFO, and use that pin when setting up the next DMA transfer (and keep it in memory until the DMA is done so you can deassert it again ^^).
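
The CS handling described above might be sketched like this. Everything here is hypothetical: `SpiDmaRequest`, `SpiDmaDriver` and `mock_pin_write` are made-up names, GPIO is replaced by an array so the example runs on a PC, and an active-low CS is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for real GPIO: pin levels live in an array (mock, not a real API).
static bool g_pin_level[32];
inline void mock_pin_write(uint8_t pin, bool level) { g_pin_level[pin] = level; }

// Hypothetical descriptor with the CS pin carried alongside the buffer.
struct SpiDmaRequest {
    uint8_t* buffer;
    size_t length;
    uint8_t cs_pin;
};

struct SpiDmaDriver {
    bool busy = false;
    uint8_t active_cs = 0;  // remembered so the completion handler can release it

    // Called when the bus is free and a request is pending.
    void start(const SpiDmaRequest& r) {
        active_cs = r.cs_pin;
        mock_pin_write(active_cs, false);  // assumed active-low CS: select device
        busy = true;
        // A real driver would program and enable the DMA channels here.
    }

    // Called from the DMA transfer-complete interrupt.
    void on_complete() {
        mock_pin_write(active_cs, true);  // deselect the device again
        busy = false;
    }
};
```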

And concerning the buffer, I had a similar issue with my solution: it was an SPI flash chip, and in 1 transfer you need to send the command (a read command + an address) and then you get your data. My solution was to do an SPI transfer (so both send and receive) with the same memory address for sending and receiving data. The first x bytes would contain the command and the address on the SPI chip, followed by enough room for the data I was requesting from the chip.
It was of course a bit of overhead: I worked in blocks of 128 or 256 bytes, and the command + address was probably about 10 bytes or so. For me it felt worth the bit of extra RAM usage versus having to do a memcpy (or a DMA memory-to-memory transfer) afterwards.
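
The in-place buffer layout could be sketched as follows. The sizes and opcode are assumptions for illustration: a 4-byte header (1 command byte + 3 address bytes) and the `0x03` read opcode are common for SPI NOR flash, but the post does not say which chip or header length was actually used, and `InPlaceBlock` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kHeaderSize = 4;       // assumed: 1 command byte + 3 address bytes
constexpr size_t kBlockSize = 256;      // payload size, one of the sizes mentioned
constexpr uint8_t kReadCommand = 0x03;  // assumed read opcode; check your chip

// One buffer is both the TX source and the RX destination of a full-duplex transfer.
struct InPlaceBlock {
    uint8_t raw[kHeaderSize + kBlockSize];

    // Fill the header; the payload area only supplies dummy bytes to clock data in.
    void prepare(uint32_t flash_address) {
        raw[0] = kReadCommand;
        raw[1] = static_cast<uint8_t>(flash_address >> 16);
        raw[2] = static_cast<uint8_t>(flash_address >> 8);
        raw[3] = static_cast<uint8_t>(flash_address);
    }

    // After the transfer, the received data starts right behind the header.
    uint8_t* payload() { return raw + kHeaderSize; }
    size_t transfer_length() const { return sizeof raw; }
};
```

The header bytes are overwritten by whatever the chip returns while the command is clocked out, so the consumer reads from `payload()` and never needs a copy.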

victor_pv
Posts: 954
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby victor_pv » Mon Mar 06, 2017 7:50 pm

It seems like I have a working version now that allows you to:
-Set callback functions that will be called when a DMA transfer completes. If the callbacks are set, dmaSend and dmaTransfer are non-blocking.
-Set all the DMA-related settings with one function (enable the DMA controller, set the transfer address, destination, data size, etc.), and then use a second function to reload the DMA transfer size, which needs to be reloaded before enabling the channel again since the value is not kept at the end of a transmission. So if the buffer address and data size are reused, only the second function needs to be called repeatedly.
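
The two-step pattern in the list above (full setup once, then reload and start per transfer, with a completion callback) can be illustrated with a mock. This is not the library's actual API: `MockDmaChannel`, `setup`, `reload_and_start` and `simulate_complete` are hypothetical names, and the hardware is simulated so the sketch runs on a PC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using TransferCallback = void (*)();

// Hypothetical stand-in for one DMA channel; not the library's real types.
struct MockDmaChannel {
    const void* memory = nullptr;
    size_t element_size = 0;
    size_t count = 0;  // counts down to 0 during a transfer, like the hardware
    bool enabled = false;
    TransferCallback on_complete = nullptr;

    // Step 1: full setup, needed once per buffer/size combination.
    void setup(const void* mem, size_t elem_size, TransferCallback cb) {
        memory = mem;
        element_size = elem_size;
        on_complete = cb;
    }

    // Step 2: reload the count and enable; repeated for every transfer.
    void reload_and_start(size_t n) {
        count = n;
        enabled = true;  // returns immediately, so the caller is not blocked
    }

    // Simulates the hardware finishing and raising the transfer-complete IRQ.
    void simulate_complete() {
        count = 0;
        enabled = false;
        if (on_complete) on_complete();
    }
};

static int g_completions = 0;
inline void count_completion() { g_completions++; }
```

The count has to be reloaded each time because it ends at zero after a transfer, which is the reason the post gives for the second function.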

I have tested the callback with the sdfat library and with an ILI SPI display. Now the weird thing:
When using sdfat at SPI DIV/2 speed with callbacks, sometimes the DMA RX never completes, and leaves either 1 or 2 bytes pending.
So let's say I want to receive 512 bytes. For that, the RX DMA is set to 512 bytes, the DR is read if RXNE is set, the RX DMA is enabled, and then the TX DMA is set up and enabled for 512 bytes.
After each byte goes out, one comes in, and the DMA controller reads it from DR, stores it in the RX buffer, and decrements the count of pending RX DMA requests.

All works fine if I do not use callbacks and just block until RX is completed. It also works fine if I set the port to 18 MHz (DIV/4) while using callbacks.
But if I use callbacks and set the port to DIV/2 speed, then sometimes the RX never completes. The TX buffer has all been sent, and 1 or 2 bytes are still pending in RX. Since TX is completed, it's not producing a clock any more, and RX will never get the last bytes in.

I have run it through the debugger, and sometimes it completes several transmissions correctly before one fails, but it is a different number of transmissions each time. Sometimes it goes for longer, sometimes for shorter.
I have tried setting the RX DMA priority to very high and the TX to medium, in case the DMA controller was servicing a TX while an RX was pending, which would overwrite the DR register and lose the RX byte, but that did not help.

Other than setting a callback for the transfer-complete event, the DMA setup is exactly the same whether blocking or not, so I can't figure out what is happening. One possibility is that the transfers do not always complete even when blocking, but since there is a timeout check, sometimes the transfer is just terminated on timeout and not because RX actually completed.
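
One way to check whether a timeout is hiding incomplete transfers is to have the blocking wait report how it ended and how many bytes were left. The sketch below is hypothetical: `wait_for_rx` is not a library function, and `read_remaining` stands in for reading the RX DMA channel's remaining-count register.

```cpp
#include <cassert>
#include <cstdint>

enum class WaitResult { Completed, TimedOut };

// Poll a "remaining count" reader until it reaches zero or the poll budget
// runs out. On return, `leftover` holds the last count that was read.
template <typename ReadRemaining>
WaitResult wait_for_rx(ReadRemaining read_remaining, uint32_t max_polls,
                       uint32_t& leftover) {
    for (uint32_t i = 0; i < max_polls; ++i) {
        leftover = read_remaining();
        if (leftover == 0) return WaitResult::Completed;
    }
    // Budget exhausted: take one final reading before declaring a timeout.
    leftover = read_remaining();
    return leftover == 0 ? WaitResult::Completed : WaitResult::TimedOut;
}
```

Logging every `TimedOut` result together with `leftover` would show whether the blocking path also ends with 1 or 2 bytes pending.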

I need to test that theory by removing the timeout, but has anyone experienced any issue when receiving data with the sdfat library at the max SPI port speed when using DMA, or noticed any corruption in the data read?

stevestrong
Posts: 734
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby stevestrong » Mon Mar 06, 2017 8:30 pm

I think it is normal that Rx bytes are still to be received when Tx is ready, since TXE comes earlier than RXNE, meaning that TXE is set before the last byte is received.
Do you wait in the Tx end callback function for TXE to be set and BSY to be cleared?
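
The end-of-transmission check being asked about can be written as a small predicate on the SPI status register. The bit positions below are those of the STM32F1 `SPI_SR` register (RXNE bit 0, TXE bit 1, BSY bit 7); the function names are made up for this sketch, and the register is passed in as a plain value so the code can run off-target.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions of the STM32F1 SPI status register (SPI_SR).
constexpr uint32_t kSrRxne = 1u << 0;
constexpr uint32_t kSrTxe = 1u << 1;
constexpr uint32_t kSrBsy = 1u << 7;

// True once the transmit buffer is empty AND the shifter has gone idle.
inline bool spi_idle(uint32_t sr) {
    return (sr & kSrTxe) != 0 && (sr & kSrBsy) == 0;
}

// Spin on a status register until idle; returns false if the budget runs out.
inline bool wait_spi_idle(const volatile uint32_t& sr, uint32_t max_polls) {
    while (max_polls--) {
        if (spi_idle(sr)) return true;
    }
    return false;
}
```

TXE alone is not enough, because the last byte is still being shifted out (and the last RX byte shifted in) while BSY is set.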

victor_pv
Posts: 954
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby victor_pv » Mon Mar 06, 2017 10:11 pm

Yes, the callback waits, but the callback doesn't even get called, since it is set to be called when the DMA RX channel completes. Since that sometimes doesn't complete, the callback is never called.

According to the datasheet, TX and RX can be executed in full duplex without any problem, but there are timing considerations. I suspect that driving the port at 36 MHz with no pauses is too much and some bytes get lost by overlapping.
That's why I was wondering if someone had noticed any kind of issue at 36 MHz SPI speed. All works fine at 18 MHz; I have left the MCU running for a full day without an issue at 18 MHz.

Figure 240 in the reference manual describes full duplex in continuous transfer mode (no pause between the last bit of one transmission and the first bit of the next). As the graph shows, there is a small period between RXNE and TXE, and the DR needs to be read within that period or the byte is lost (it may not be the last byte that is lost but some byte in the middle, which would still leave 1 RX byte missing).
I suspect that either the DMA controller cannot service all DMA transactions fast enough and some fall past that period, or the SPI peripheral itself doesn't raise the flags with enough time between them.
At first I thought that could be due to the RX channel priority not being high enough to service those requests before anything else, but adjusting priorities did not help much.

Thinking about it, I now have 4 DMA channels running at the same time (though not all at full speed), plus the sketch has some parts with very intensive I/O to and from RAM (moving and formatting data between buffers). Perhaps I only get the failure at random times when there are too many requests to the RAM bus and the RX DMA has to wait a cycle or 2 to get the bus. From Figure 240, the time between RXNE and TXE is 1 bit time, which is 2 CPU cycles at DIV/2, correct? That is enough time for only 2 RAM accesses, so if the DMA access to the RAM waits more than 2 CPU cycles, the data would be lost. Does that make sense?
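
The bit-time estimate can be checked with a little arithmetic. The sketch assumes a 72 MHz core clock and an SPI port clocked from a bus running at core speed, which is what the 36 MHz and 18 MHz figures in the post imply; the constant and function names are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCpuHz = 72000000;   // assumed core clock
constexpr uint32_t kSpiBusHz = kCpuHz;  // assumed: SPI bus clock equals core clock

// CPU cycles that elapse during one SPI bit for a given prescaler.
constexpr uint32_t cycles_per_bit(uint32_t prescaler) {
    return kCpuHz / (kSpiBusHz / prescaler);
}

// CPU cycles per byte: the budget in which one RX and one TX DMA access must fit.
constexpr uint32_t cycles_per_byte(uint32_t prescaler) {
    return 8 * cycles_per_bit(prescaler);
}

static_assert(cycles_per_bit(2) == 2, "DIV/2: 36 MHz SCK, 2 cycles per bit");
static_assert(cycles_per_bit(4) == 4, "DIV/4: 18 MHz SCK, 4 cycles per bit");
static_assert(cycles_per_byte(2) == 16, "DIV/2: 16 cycles per byte");
```

So under these assumptions 1 bit time at DIV/2 is indeed 2 CPU cycles (about 27.8 ns), and the window doubles at DIV/4. Whether that window is really what the DMA has to meet is the open question in the post.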

When I block instead of using the callback, there is only 1 other DMA going on at 88 kHz max, which is not as much load as 1 DMA at 88 kHz + 1 at 18 MHz for SPI2 + code reading and writing to RAM.

Could also be that my connections to the SD card module are not "clean" and some bits get lost because of non-clean edges, but if that was the case I would expect the same issue to happen when I block, and not only when I continue running without blocking.

stevestrong
Posts: 734
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Postby stevestrong » Tue Mar 07, 2017 1:03 am

OK, so you run both SPI ports in DMA mode...
You could try leaving only the Rx part in DMA mode and the Tx part in "normal" mode to see if Rx bytes are still lost.
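
The suggested experiment (RX on DMA, TX fed by the CPU) might be structured like this. It is only a toy model for the control flow: `MockSpiWithRxDma` and `receive_polled_tx` are hypothetical, the "slave" simply answers 0, 1, 2, ..., and no real registers are touched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy SPI model: every byte written is answered by one received byte, which a
// simulated RX DMA channel stores in the destination buffer.
struct MockSpiWithRxDma {
    uint8_t* rx_dest = nullptr;
    size_t rx_remaining = 0;
    uint8_t next_reply = 0;  // the fake slave answers 0, 1, 2, ...

    void start_rx_dma(uint8_t* dest, size_t n) {
        rx_dest = dest;
        rx_remaining = n;
    }

    // Polled TX: real code would wait for TXE before writing DR.
    void write_byte(uint8_t /*value*/) {
        if (rx_remaining > 0) {
            *rx_dest++ = next_reply++;
            --rx_remaining;
        }
    }
};

// Receive n bytes with RX on (simulated) DMA while the CPU feeds dummy TX bytes.
inline size_t receive_polled_tx(MockSpiWithRxDma& spi, uint8_t* dest, size_t n) {
    spi.start_rx_dma(dest, n);
    for (size_t i = 0; i < n; ++i) spi.write_byte(0xFF);
    return spi.rx_remaining;  // non-zero would mean lost RX bytes
}
```

On the real part, a non-zero return value with polled TX would point at the RX side, while a clean run would suggest the TX DMA traffic is what pushes the RX reads past their window.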

