Guide for under- and over-clocking the blue-pill needed

Rick Kimball
Posts: 865
Joined: Tue Apr 28, 2015 1:26 am
Location: Eastern NC, US

Post by Rick Kimball » Sun Jan 08, 2017 7:12 pm

You really have to look at the assembler code that is setting the GPIO to determine if it is going to be any different.
-rick

victor_pv
Posts: 1263
Joined: Mon Apr 27, 2015 12:12 pm

Post by victor_pv » Sun Jan 08, 2017 7:22 pm

stevestrong wrote: Dear experts,

I would need a step-by-step guide for over- and under-clocking.
I have found some relevant info here and here, but it is only partial. I don't have a complete picture about:
- where to change CPU clock? Are flash wait-states involved in final CPU operating speed? How?
- where to change peripheral (SPI, GPIO?) clock? Is it related to changing CPU clock? How?
- can I decrease flash wait-states (to zero maybe) if I under-clock to 48MHz?
- under which circumstances (which CPU/peripheral clock combinations) is USB still working?

Example:
- would it be possible to underclock to 48MHz, decrease flash-wait states (to zero?), set SPI to 24/48 MHz ?

I would greatly appreciate it if someone could collect and post this (and more?) info here and/or on the wiki.
Steve, I played a bit with speed changes around the time Roger discovered the GD32 MCUs. Those allow the clock to be divided by 2 for the USB peripheral, so USB still works with a main clock of 96 MHz.
While Roger was testing on the GD32, I did some testing on the STM32 to see if it had an undocumented DIV/2 setting for the USB, but it wasn't working. Anyway, I played a bit with the PLL, dividers, etc., so from what I remember from that time, I will give you some info:
1.- To change the clock, you change the main clock PLL multiplier. Easier than replacing the XTAL.
2.- The USB peripheral only has div/1 and div/1.5. Since it needs to run at 48 MHz, only main clocks of 48 MHz and 72 MHz work for it.
3.- The rest of the peripherals, most if not all, derive their frequency from the main clock. They are set up during the init() routines that are called before main(). SPI port settings in particular work as DIV/x, and the minimum I believe is DIV/2, so depending on your main clock, the SPI port speed will be different while applying the same divider. Most peripherals work like that.
I don't remember if the libmaple setupclocks() function had any calculation to allow for different main clocks, or if it was all hard-coded. Sorry.
4.- Wait states. Yes, they affect performance. Wait states are wasted cycles added per transaction to the flash. A single flash transaction can read 8 or 16 bytes, I don't remember which, but more than 1 word, so adding 1 wait state doesn't add 1 cycle to a single 32-bit read, but rather about a quarter of a cycle per word, since 4 words are read at once. The datasheet explains this better than me.
5.- STM specifies how many wait states to add depending on main clock speed, so as you increase the main clock, the flash access speed scales more slowly due to the added wait states.
6.- To run reliably, over the full temp range, in a commercial product... blah blah, you should follow the datasheet, but it seems that you can reduce the number of wait states a bit without crashing.
7.- Same with max frequency. In theory it is 72 MHz. I have overclocked a couple of MCUs for a few days without issues, but who knows what would happen over a wider temperature and voltage range. If you don't care about a crash, or you just want to see how fast you can go, I believe I went to 128 MHz without adding more wait states than at 72 MHz. Not sure if even more could be added, I don't remember off the top of my head.
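As a rough sketch of the points above, the clock plan can be worked out on paper (or on a PC) before touching any registers. The function names here are made up for illustration and are not libmaple API. The 8 MHz crystal is an assumption that fits the usual blue-pill board, and the wait-state thresholds are the documented ones as I recall them from RM0008 (0 up to 24 MHz, 1 up to 48 MHz, 2 up to 72 MHz), so check them against your copy.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 MHz crystal feeding the PLL undivided. */
#define HSE_HZ 8000000u

/* Main clock for a given PLL multiplier (the F103 PLL offers x2..x16). */
uint32_t sysclk_hz(unsigned pll_mul)
{
    assert(pll_mul >= 2 && pll_mul <= 16);
    return HSE_HZ * pll_mul;
}

/* USB prescaler needed to reach 48 MHz, times ten:
 * 10 means div/1, 15 means div/1.5, 0 means USB cannot work. */
unsigned usb_div_x10(uint32_t sysclk)
{
    if (sysclk == 48000000u) return 10;
    if (sysclk == 72000000u) return 15;
    return 0;
}

/* Documented flash wait states for a main clock. Anything above
 * 72 MHz is out of spec; 2 is returned there only because the post
 * reports overclocking without adding more than at 72 MHz. */
unsigned documented_wait_states(uint32_t sysclk)
{
    if (sysclk <= 24000000u) return 0;
    if (sysclk <= 48000000u) return 1;
    return 2;
}
```

For example, sysclk_hz(6) gives 48 MHz with USB on div/1, sysclk_hz(9) gives the stock 72 MHz with USB on div/1.5, and sysclk_hz(16) gives 128 MHz, where usb_div_x10() returns 0 because neither divider lands on 48 MHz.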

You can underclock to 48 MHz, set wait states to 0, and run the SPI at 24 MHz. I believe the SPI minimum divider is /2.
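The SPI arithmetic is simple enough to sketch. This assumes the SPI port's bus clock equals the main clock, which is not true for every port or bus prescaler setting, and the helper names are hypothetical, not libmaple functions. The prescaler steps are powers of two from /2 to /256.

```c
#include <assert.h>
#include <stdint.h>

/* SPI clock for a given bus clock and prescaler.
 * The divider must be a power of two between 2 and 256. */
uint32_t spi_clock_hz(uint32_t pclk_hz, unsigned divider)
{
    assert(divider >= 2 && divider <= 256);
    assert((divider & (divider - 1)) == 0);
    return pclk_hz / divider;
}

/* Smallest divider that keeps the SPI clock at or below target_hz.
 * Returns 0 if even /256 is too fast. */
unsigned spi_divider_for(uint32_t pclk_hz, uint32_t target_hz)
{
    for (unsigned d = 2; d <= 256; d *= 2) {
        if (pclk_hz / d <= target_hz) {
            return d;
        }
    }
    return 0;
}
```

So spi_clock_hz(48000000, 2) is the 24 MHz mentioned above, and the same /2 setting at 72 MHz gives 36 MHz, which is why a sketch that hard-codes the divider changes SPI speed when you change the main clock.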
On toggling pins: GPIO is not affected by wait states, and neither is RAM or any other peripheral, only flash. BUT if the code toggling a pin is in flash, it may get slower as you add wait states. Then again, since the flash reads 128 bits at once, if it is a very tight loop that falls within 4 instructions, you may end up with a single flash read every 4 instructions. On the other hand, if you jump back and forth and force 1 flash read for every 1 or 2 instructions, the wait states will impact you. Also, if the loop toggling the pin has to wait on something else, then having the flash run faster or slower may not matter, if that wait for something else is even longer.

Try a loop that does a few jumps back and forth across more than 4 addresses, forcing several flash reads, and make sure the compiler doesn't unroll it. You should see a measurable difference depending on how many wait states you have.
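To see why the tight loop and the jumpy loop behave differently, here is a toy model that runs on a PC, not a measurement. It takes the 128-bit (16-byte) read width from the description above, which is itself a "don't remember" figure, ignores the prefetch buffer completely, and charges one cycle per instruction plus the wait states for every new flash read. All names and addresses are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from the post: one flash read delivers 16 bytes. */
#define LINE_BYTES 16u

/* Count flash reads for a trace of instruction addresses. A new read
 * is charged whenever the fetch leaves the line currently held. */
unsigned flash_reads(const uint32_t *pc, size_t n)
{
    unsigned reads = 0;
    uint32_t held = UINT32_MAX;   /* no line held yet */
    for (size_t i = 0; i < n; i++) {
        uint32_t line = pc[i] / LINE_BYTES;
        if (line != held) {
            reads++;
            held = line;
        }
    }
    return reads;
}

/* Very rough cycle estimate: one cycle per instruction plus
 * wait_states stall cycles for every flash read. */
unsigned estimated_cycles(const uint32_t *pc, size_t n, unsigned wait_states)
{
    return (unsigned)n + flash_reads(pc, n) * wait_states;
}
```

In this model a trace that stays inside one 16-byte line costs a single flash read no matter how often it loops, so the wait states barely show. A trace that alternates between two distant addresses pays the wait states on every instruction. Real hardware will differ, so measure on the board before trusting any number from this.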

stevestrong
Posts: 1164
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Post by stevestrong » Sun Jan 08, 2017 7:46 pm

victor_pv wrote: On toggling pins: GPIO is not affected by wait states, and neither is RAM or any other peripheral, only flash. BUT if the code toggling a pin is in flash, it may get slower as you add wait states. Then again, since the flash reads 128 bits at once, if it is a very tight loop that falls within 4 instructions, you may end up with a single flash read every 4 instructions. On the other hand, if you jump back and forth and force 1 flash read for every 1 or 2 instructions, the wait states will impact you.
Yepp, I think this is in line with the RM0008, page 59:
[Attachment: stm32_reading_flash.jpg]
I think it is the prefetch buffer, combined with compiler optimization. So the toggling will not suffer, but the graphicstest has different runtimes, where the wait states do have influence. So everything is all right.

RogerClark
Posts: 5955
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia

Post by RogerClark » Sun Jan 08, 2017 8:09 pm

@stevestrong

Yes. I think there is probably a bug with the USB divider setting for 48 MHz.

It should be set to DIV 1.

On the GD32 I was focusing on the higher speeds and may not have fully tested the 48 MHz option.



However I am not sure whether anyone would use the 48MHz option.

I did some tests on a project which needs to run from a battery and do pulse timing using millis().

So I tried running the Maple Mini at 48 MHz to see if the current consumption is lower, and it draws about 1/3 less current than at 72 MHz.

But this was not low enough for me, so I ended up dropping right down to 8 MHz and losing USB, and this dropped the current by about 90%.

Hence, I concluded that the 48MHz option would probably not be used by anyone, as I could not see any benefits; because if the board is running from USB, then saving 10mA is not likely to be important.

What is probably more important is overclocking, but as it kills USB, perhaps it would be better to have a function to set the overclocking speed, and perhaps force USB reset when returned to 72MHz

Overall, however, as the F4 boards are now becoming a lot cheaper and STM's F4 core will be here some time this year, I hope ;-), anyone wanting higher speeds would be better off overclocking an F4 board.

stevestrong
Posts: 1164
Joined: Mon Oct 19, 2015 12:06 am
Location: Munich, Germany

Post by stevestrong » Sun Jan 08, 2017 9:34 pm

RogerClark wrote: Overall, however, as the F4 boards are now becoming a lot cheaper and STM's F4 core will be here some time this year, I hope ;-), anyone wanting higher speeds would be better off overclocking an F4 board.
Yeah, I also hope that my F4 boards will finally manage to get out from the drawer... 8-)

martinayotte
Posts: 1172
Joined: Mon Apr 27, 2015 1:45 pm

Post by martinayotte » Sun Jan 08, 2017 9:49 pm

You can remove dust on your F4 by simply using our own F4 core ... ;)

RogerClark
Posts: 5955
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia

Post by RogerClark » Sun Jan 08, 2017 11:04 pm

martinayotte wrote: You can remove dust on your F4 by simply using our own F4 core ... ;)
@danielef has also had a go at porting STM's core to the F4, but he had to comment out a load of the "Analog" related files because the HAL structs for the Analog pin setup are completely different from the F1.

With our F4 core, the main problem I find is uploading. The built-in DFU in the F4 is a pain to use, as it doesn't seem to work with our DFU Util (on the PC) or any other PC DFU-Util I can find, apart from STM's own special version, and that version does not have a CLI or take binary files (arrrgghhh).

I think we'd need to write a DFU or other upload protocol, before the F4 is going to be widely usable.

martinayotte
Posts: 1172
Joined: Mon Apr 27, 2015 1:45 pm

Post by martinayotte » Sun Jan 08, 2017 11:21 pm

@Roger, I've never had an issue with DFU, maybe it is because I'm on Linux.

RogerClark
Posts: 5955
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia

Post by RogerClark » Mon Jan 09, 2017 12:25 am

Martin

Yes. Linux seems to work OK, but Windows doesn't work unless you use STM's own GUI tool and their special drivers.

martinayotte
Posts: 1172
Joined: Mon Apr 27, 2015 1:45 pm

Post by martinayotte » Mon Jan 09, 2017 1:11 am

Then you should do what Ray did a year ago: switch to Linux and forget crappy MS Windows ... :lol:
I'm still running WinXP/WinVista/Win7, but inside VMs, so if you still need Windows for your consulting contracts, maybe that solution could fit your needs. ;)
