Non blocking SPI DMA - Added callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post here first, or if you can't find a relevant section!
Post Reply
victor_pv
Posts: 1339
Joined: Mon Apr 27, 2015 12:12 pm

Non blocking SPI DMA - Added callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by victor_pv » Tue Feb 21, 2017 8:05 pm

UPDATE:
Working code in this post, and an explanation how to modify sdFat to use it in the next post after that:
http://www.stm32duino.com/viewtopic.php ... =60#p30306

======================================================


This is something Roger has suggested several times, and it seems to only make sense.

Currently the dma TX/RX functions we added to the SPI library block until the DMA transfer is completed or has timed out.
This change would add 2 functions to the SPI library, similar to the Arduino official I2S library:

Code: Select all

onTransmit(handler);
onReceive(handler);
https://www.arduino.cc/en/Reference/I2SOnReceive

If the user code calls those functions and sets a handler function to be called on completion, the SPI dma functions will not block after setting up the transfer, and instead will set that handler function as the call back to the core, and return immediately.

The user code will be responsible for not initiation another transfer until confirming the previous one is completed.
If the user code doesn't set a handler, or sets the handler back to NULL (default value), then the SPI dma functions will block as currently until the DMA is completed.

The core currently has the functionality to manage the callback, but is not used since we decided to block.

One thing to decide is whether the SPI will implement it's own ISR to disable the spi DMA requests on the corresponding channel when the transfer is completed, or leave that to the user code. If the spi dma requests are not disabled after a transfer, then if the program wants to use the same dma channel for another peripheral, it can find the spi peripheral causing extra dma requests, so would be undesirable.

The main benefit would be to increase parallelism. If dmaSend returns immediately the CPU can go to work on something else.
This will require a bit more RAM and flash usage.

Anyone has ideas, suggestions, opinions?
Last edited by victor_pv on Tue Jun 27, 2017 3:43 am, edited 2 times in total.

User avatar
RogerClark
Posts: 6130
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia
Contact:

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by RogerClark » Tue Feb 21, 2017 8:34 pm

Victor

Its probably worth PM'ing @stevstrong if he does not see this thread, as he has done a lot of work in SPI recently.

I think I have one pending PR from stev, but it seems to slow the SPI down, hence I have not actioned it yet. But it may be necessary as it contains bug fixes.

I think there is a general problem with callbacks into C++ classes, as only static functions addresses can be accessed.
So it would need to be a shared ISR for all instances.

Other APIs I work with, let you pass a pointer to the callback function, into the Transfer function. Which seems logical to me.

victor_pv
Posts: 1339
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by victor_pv » Tue Feb 21, 2017 8:50 pm

RogerClark wrote:Victor

Its probably worth PM'ing @stevstrong if he does not see this thread, as he has done a lot of work in SPI recently.

I think I have one pending PR from stev, but it seems to slow the SPI down, hence I have not actioned it yet. But it may be necessary as it contains bug fixes.

I think there is a general problem with callbacks into C++ classes, as only static functions addresses can be accessed.
So it would need to be a shared ISR for all instances.

Other APIs I work with, let you pass a pointer to the callback function, into the Transfer function. Which seems logical to me.
Hi Roger,

You are late ;) I Pm'ed him already, and told him i will use the version of his PR as a base since looks like his changes didn't break anything and improved speed. Also by doing that I will be testing his changes with a couple of displays and sdfat.

Re.: callbacks, when trying to use a class member, the "this" argument is not passed to the function when called from an interrupt, so it fails when trying to access any class member. Grumpyoldpizza has a very elegant hack around it. When setting callbacks, he passes the pointer to the function, and the pointer to "this". Both get saved in a table, and when calling back the user ISRs his core ISR function passes those parameters, then he cast the "this" pointer to a class pointer and all works. Is much easier to understand if you look at his code.

Since I didn't want to hack around our core to do that, I used a different method in the I2S library, but achieves the same. It just needs as many static functions and pointer variables, as class objects can be spanned. But given that I2S ports (and SPI) are a finite resource, and we now the max number of them, all the possibilities are taken in account in the class code.
So we would need up to 3 very short ISRs, and up to 3 void* (4 bytes each), so it's a small cost. In return every SPI device callbacks can properly be serviced by a class member function with access to all the class members and variables.
All that would be needed if we want the SPI class to include a first layer of ISR that will disable dma requests for that port, to avoid the problem I described above. If we instead decide to just leave that to the user code, we don't need any ISR or pointer variables to the SPI "this" pointers.

Now, if we want to use other classes members as callbacks for the SPI class (let's say a ILI9341 class member), then the hackaround described above needs to be implement in the ILI9341 class itself. In any case, since taking advantage of dma callbacks requires modifications to any library, adding those little bits of code should not be a big deal.
Any library not modified using dmaSend as today will get the same behavior, dmaSend will block. Unless the user specifically sets a call back, the code will block as of today.
I'll write something in the wiki explaining how to use class members in callbacks with our current code.
Last edited by victor_pv on Wed Feb 22, 2017 1:05 am, edited 3 times in total.

User avatar
RogerClark
Posts: 6130
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia
Contact:

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by RogerClark » Tue Feb 21, 2017 8:56 pm

Hi victor

I have not tested @stevstrongs latest PR, but the previous one was slower than what we had before :-(

Re: one ISR per SPI channel

Sounds OK to me.


thanks

roger

stevestrong
Posts: 1346
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by stevestrong » Tue Feb 21, 2017 9:14 pm

While trying to optimze the ILI9486 lib i have already implemented the non-blocking DMA with callback at job end. This included a short isr at job end which called the cb function if set by the user within the isr.
Unfortunatelly the DMA slows down the CPU very strongly, so at the end, including the ovehead to init the DMA, no time saving could take place compared to the blocking non-DMA version. I tested it with the Adafruit graphics test. You can follow the results from that thread.
My version was designed to reserve the respective DMA channel (i think channel 3) only for SPI. Still, i was not happy with the result.

victor_pv
Posts: 1339
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by victor_pv » Tue Feb 21, 2017 9:32 pm

stevestrong wrote:While trying to optimze the ILI9486 lib i have already implemented the non-blocking DMA with callback at job end. This included a short isr at job end which called the cb function if set by the user within the isr.
Unfortunatelly the DMA slows down the CPU very strongly, so at the end, including the ovehead to init the DMA, no time saving could take place compared to the blocking non-DMA version. I tested it with the Adafruit graphics test. You can follow the results from that thread.
My version was designed to reserve the respective DMA channel (i think channel 3) only for SPI. Still, i was not happy with the result.
I do not think the callback will help for normal pixel drawing of fast displays where half the time is spent in setting coordinates, and the other half with small transfers, and there are limited things to do while waiting for a pixel to be drawn.
But for writing and reading big blocks, and working with slower devices that can not operate at 36Mhz, should be beneficial.
Things like reading from an SPI flash to a buffer, while the code formats the data in another buffer and sends it to another device with another DMA transfer.

stevestrong
Posts: 1346
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by stevestrong » Tue Feb 21, 2017 10:00 pm

Sure, it depends on the application.
The display lib was also partially writing larger blocks, no overall speed gain achieved however.
Once again, the cpu is slowed down strongly by the DMA running in background.
Dont understand me wrong, i see the theoretical benefit, that is why i also tested it. Still, the results did not convince me.

As i dont have any other application where saved time would play larger role than in the display lib, i have given up to push it into the repo.
But feel free to get it run. If you think it could help i could share my local version.

victor_pv
Posts: 1339
Joined: Mon Apr 27, 2015 12:12 pm

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by victor_pv » Wed Feb 22, 2017 1:02 am

stevestrong wrote:Sure, it depends on the application.
The display lib was also partially writing larger blocks, no overall speed gain achieved however.
Once again, the cpu is slowed down strongly by the DMA running in background.
Dont understand me wrong, i see the theoretical benefit, that is why i also tested it. Still, the results did not convince me.

As i dont have any other application where saved time would play larger role than in the display lib, i have given up to push it into the repo.
But feel free to get it run. If you think it could help i could share my local version.
If you can share yours, sure will help.
About the comments above from Roger, that a modified copy was slower, was that a previous one or the last one in your PR? I want to start with a good version so I dont have to start over midway.

stevestrong
Posts: 1346
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by stevestrong » Wed Feb 22, 2017 9:50 am

Please check my comments posted here: https://github.com/rogerclarkmelbourne/ ... -277516813

racemaniac
Posts: 414
Joined: Sat Nov 07, 2015 9:09 am

Re: Planning to add callback to the SPI DMA functions (dmaSend, dmaTransfer...)

Post by racemaniac » Wed Feb 22, 2017 10:20 am

stevestrong wrote:Sure, it depends on the application.
The display lib was also partially writing larger blocks, no overall speed gain achieved however.
Once again, the cpu is slowed down strongly by the DMA running in background.
Dont understand me wrong, i see the theoretical benefit, that is why i also tested it. Still, the results did not convince me.

As i dont have any other application where saved time would play larger role than in the display lib, i have given up to push it into the repo.
But feel free to get it run. If you think it could help i could share my local version.
what kind of slowdown do you see when the DMA is active?
and how can DMA not give a speedup even if the cpu slows down? your transfer will take equally long as with a blocking transfer, and even if the cpu slows down, at least it can do some work during the transfer.

Post Reply

Who is online

Users browsing this forum: Bing [Bot] and 1 guest