Code size

Posted: Sun Sep 04, 2022 12:16 pm
by herbk
Hi,

I'm not an "experienced programmer", just someone who has realized a few small projects with an Arduino Nano or Pro Mini...
Because the number of analog input pins is a bit too low for some of my next projects, I bought a few STM32F103C6T8 from Ali.

After installing the needed boards and libs in the Arduino IDE, I ran a few tests to find out what I have to change in my sketches. Doing so, I noticed that the code the Arduino IDE creates for upload is much bigger (about 10x) than for an Arduino. E.g. the "Blink without Delay" sketch uses 860 bytes if I compile it for the Nano, but more than 11,500 bytes if I compile it for the STM32F103x.

Can that be true? Or am I doing something wrong?
What is your experience in this regard?

Re: Code size

Posted: Sun Sep 04, 2022 12:32 pm
by GonzoG
It is bigger because STM32 MCUs have much more configuration than an ATmega. All this configuration needs code, and this code needs space.
If you disable features that you don't use, your code will be smaller.
USB needs about 10-13kB
UART needs about 2-3kB.
ADC (using any analog input) will take about 3kB.
and so on...
There are some features that can be disabled to save space:
https://github.com/stm32duino/wiki/wiki
Advanced usages -> Customization
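
For the official STM32 core, one way to follow that wiki page is a `hal_conf_extra.h` file placed next to the sketch. The following is only a minimal sketch of such a file, assuming the `HAL_xxx_MODULE_DISABLED` switches described on the wiki's customization pages. The exact names, and which modules can safely be dropped, depend on the core version, so check them against the wiki before relying on this.

```cpp
// hal_conf_extra.h - lives in the sketch folder.
// Assumption: the core picks this file up automatically, as the
// stm32duino wiki describes. Each define asks the core not to build
// the matching HAL module, so only list what the sketch never uses.
#define HAL_I2C_MODULE_DISABLED  // sketch does not use Wire
#define HAL_SPI_MODULE_DISABLED  // sketch does not use SPI
// Other switches follow the same pattern (for example ADC or RTC),
// but disabling them removes the corresponding Arduino functions.
```

As far as I know, USB and U(S)ART support are not controlled from this file but from the board options in the IDE's Tools menu.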

Re: Code size

Posted: Sun Sep 04, 2022 3:49 pm
by ag123
imho, stm32f103 is 'far' more elaborate vs nano. if you insist on 'smallest' size, you could try doing it the CMake route and omit USB CDC Serial.
viewtopic.php?f=10&t=1648
USB Serial is a feature of stm32f103c8, and it gives you a 'com' port which is as fast as a full speed 12 Mbps device can provide.
normally, due to all that usb multiplexing, you'd see speeds like 1-2 Mbps, but that is still about 10x faster than even the common 115200 bps speed for uarts.
there are also a lot of other configs due to all that 'extra' stuff, as well as due to the Arm Cortex-M core itself.
i.e. it isn't an apples-to-apples comparison between binary sizes for stm32f103 vs nano.
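
To put rough numbers on the "10x" figure, here is a small worked calculation. It assumes 8N1 framing on the uart (10 bits on the wire per byte) and takes the lower 1 Mbps figure as the usb payload rate; both are assumptions for illustration, not measurements.

```cpp
#include <stdint.h>

// Payload bytes per second on a uart with 8N1 framing:
// every byte costs 10 bits (start + 8 data + stop).
constexpr uint32_t uartBytesPerSecond(uint32_t baud) {
  return baud / 10;
}

// Payload bytes per second for a link quoted in payload bits per second.
constexpr uint32_t linkBytesPerSecond(uint32_t bitsPerSecond) {
  return bitsPerSecond / 8;
}

static_assert(uartBytesPerSecond(115200) == 11520, "uart at 115200 bps, 8N1");
static_assert(linkBytesPerSecond(1000000) == 125000, "usb-serial at 1 Mbps payload");
// 125000 / 11520 is a bit under 11, i.e. "about 10x".
static_assert(linkBytesPerSecond(1000000) / uartBytesPerSecond(115200) == 10,
              "integer ratio");
```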

most of the time you could do with a 'fatter' binary on stm32f103 c8 and cb. Do not get any of the c6 chips that have been floating around in the online 'flea markets', e.g. ebay, aliexpress. a c6 device (32k flash, 10k sram) has much less flash and memory than a c8 (64k flash) or cb (128k flash).
If you are new to the platform, get either a Nucleo board with a comfortable amount of on-chip flash / memory, e.g. Nucleo f401re or Nucleo f411re
https://github.com/stm32duino/Arduino_Core_STM32
or things like those stm32f401/f411 'black pill' boards. check that they have either cc or ce in the suffixes for the chips. those provide 256k (cc) or 512k (ce) flash and 64k sram or more, and they are *fast* (at least faster than stm32f103).
it is possible to do 'a lot more' things (and with less hassle) on those chips with more resources.
with things like 64k sram you can allocate a huge buffer, say 10k, for some specific purpose, say a DAQ buffer to capture adc samples.
On an ATmega328, which has 2k sram, that would already blow the whole sram budget, but on an stm32f4xx chip with more resources you can have your 10k buffer and your sketch continues to run 'professionally' (even as an arduino sketch). you could literally make a digital storage oscilloscope out of that single stm32f4xx chip/board.
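
The buffer arithmetic can be checked at compile time. This is just a sketch under assumptions: the 5000-sample depth, the 16k reserve for stack and core variables, and the use of pin `A0` are illustrative choices, not values from any datasheet.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical capture depth: 5000 samples of 16 bits each.
constexpr size_t kSampleCount = 5000;
constexpr size_t kBufferBytes = kSampleCount * sizeof(uint16_t);

// True if the buffer fits into sramBytes while leaving reserveBytes
// free for the stack, the heap and the core's own variables.
constexpr bool fitsInSram(size_t bufferBytes, size_t sramBytes,
                          size_t reserveBytes) {
  return bufferBytes + reserveBytes <= sramBytes;
}

static_assert(kBufferBytes == 10000, "5000 x 2 bytes is the '10k' buffer");
static_assert(!fitsInSram(kBufferBytes, 2048, 0),
              "does not fit the 2k sram of an ATmega328");
static_assert(fitsInSram(kBufferBytes, 65536, 16384),
              "fits 64k sram with 16k kept in reserve");

#ifdef ARDUINO
#include <Arduino.h>

uint16_t samples[kSampleCount];  // global, so it counts in the IDE's RAM figure

void setup() {}

void loop() {
  // Fill the buffer once per pass; A0 is an assumed analog input.
  for (size_t i = 0; i < kSampleCount; i++) {
    samples[i] = analogRead(A0);
  }
}
#endif
```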

Re: Code size

Posted: Sun Sep 04, 2022 10:08 pm
by dannyf
code size depends on a lot of factors, for example, generally bigger chips aren't as space efficient - low code density; you should remove unused sections...

my own ports of arduino on CM3 generally run somewhere around 4K - GPIO tables and ISRs are the bulk.

LPC210x arduino clone runs in under 2K, with microlib (a Keil thing). on the other end, PIC32 arduino clone runs around 14K.

Re: Code size

Posted: Mon Sep 05, 2022 5:27 pm
by herbk
Thank you all for the info, it eliminates my concern that every sketch grows 10 times... :lol:

Disabling unneeded features sounds good, and I think it helps save energy too, true?

Now I first have to figure out where the differences between the Arduinos and the STMs are, e.g. where (and what) I need to change in the sketches. Questions on that:
How can I address the pins? Is there a need to define an analog pin for 5V or 3.3V input... it has both pins...?
Is there a tutorial for the STMs somewhere (e.g. like the Arduino Reference)?
Which is the best address to put in the Boards Manager? Atm I'm at "https://github.com/stm32duino/BoardMana ... index.json", but there are no example sketches with it and I know I had some examples in the IDE before I changed to it.

Re: Code size

Posted: Mon Sep 05, 2022 5:52 pm
by ag123
herbk wrote: Mon Sep 05, 2022 5:27 pm Thank you all for the Infos, it eliminates my concerns that every sketch grows 10 times... :lol:
the binaries tend to stay around "similar" sizes for an 'average' sketch. it is the initial bundle that is "fat", e.g. if you include the "normal" usb-(cdc)-serial, it may take say 20-30k, but after that the 'growth', say from 20k towards 60k, is slower, as it depends only on your sketch.

the 'official' core
https://github.com/stm32duino/Arduino_Core_STM32
tends to be a little more bloated as it tries to do a lot of things, which includes trying to cater to pretty much the whole stm32 series.
i.e. it supports *a lot of* chips, you can take a look at the boards list on the core repository page.
and on top of that, it tries to bundle all the basic things (analogRead/analogWrite, gpio digitalRead/digitalWrite, uart, etc.), plus FPU support (single precision floating point hardware) if you use the stm32f4xx series (this is 'a lot' faster than stm32f103 when code doing single precision floating point calcs is benchmarked).

Inevitably, the linker cannot always figure out which dependencies are really unused, so it may include a little extra bloat rather than leave something out and have your sketch crash on a missing dependency.

Hence, get a board/chip that has a 'comfortable' amount of flash and sram, e.g. at least 64k flash and 20k sram (this is literally quite cramped) - that would be the stm32f103 c8 blue pill based boards. a f103 c6 based board/chip may run out of flash or sram too soon.

And things like the Nucleo F401RE and F411RE have 512k flash and 96k sram or more (check ST's specs for the exact part).
even things like the 'F401 or F411 blackpill' (it needs to be a CC or CE suffix) have 256-512k flash and 64k sram or more, and they have that FPU.
This is so much room that you can probably run pretty 'bulky' apps on it, like micropython etc., and it'd still dance.

"Ordinary" sketches would then use so little of that resource that you can literally throw many things at it.
Add an LCD library (e.g. Adafruit ILI9341, this is a huge one), throw in a few more libraries (add SPI), sensors (add Wire (i2c)), an SD card + FAT FS library (this is another huge one), add usb-(CDC)-serial, add uart (Serial), play with hardware timers, add a big sketch to multitask between them, and you would likely still have spare flash and sram. It probably feels like there are 'megabytes' of it when it is only perhaps 256k flash and 64k sram.

I think something like a chip with RE in suffix 512k flash 96k sram is probably enough to fit the whole marlin firmware 2.0
https://marlinfw.org/
or even micropython
https://micropython.org/
or circuit python etc
https://circuitpython.org/
but for those 'huge' apps, they may even exhaust 512k flash and 96k sram, but more than likely the 'smaller' configs can fit and run a 3d printer.
that could depend on the feature set 'configured', e.g. to add lots of fancy features + wifi (say add a esp8266 dongle - ESP-01) + web server etc 96k sram could be 'too cramped'
herbk wrote: Disabling unneeded features sounds good, and I think it helps save energy too, true?

not really,
this is for the 'advanced' tinkerers:
stm32 has many peripherals (uart, spi, timers, i2c, gpio banks etc). what saves power is not clocking them: a peripheral that isn't clocked is effectively *switched off*. leaving its code out of the binary mainly saves flash. and for most stm32 chips you can play with and vary the system clocks, under clock, over clock, whatever you please.
herbk wrote: Now I first have to figure out where the differences between the Arduinos and the STMs are, e.g. where (and what) I need to change in the sketches. Questions on that:
How can I address the pins? Is there a need to define an analog pin for 5V or 3.3V input... it has both pins...?
Is there a tutorial for the STMs somewhere (e.g. like the Arduino Reference)?

stm32 pins are named PAxx, PBxx, PCxx etc., following the definitions in the chip's datasheet.
e.g. on some 'pill' boards the led is at PC13, and it is active low. check the schematics for your board.
so blinking the led is

Code:

void setup() {
	pinMode(PC13, OUTPUT);
}

void loop() {
	digitalWrite(PC13, LOW); // turn on led, active low
	delay(10);
	digitalWrite(PC13, HIGH); // turn off led
	delay(10);
}

all stm32 pins run at 3.3v. the analog (e.g. ADC) pins normally can only take up to 3.3v, but some gpio pins can tolerate 5v.
you need to check the datasheet for your chip part number
e.g. for stm32f103c8
https://www.st.com/resource/en/datashee ... f103c8.pdf
and the reference manual is this
https://www.st.com/resource/en/referenc ... ronics.pdf
and do review the wiki if you are running the 'official' core.
https://github.com/stm32duino/wiki/wiki
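
As a rough illustration of addressing an analog pin by its stm32 name, here is a minimal sketch that reads a pin and converts the count to millivolts. It assumes a 3.3v ADC reference, the Arduino-style default of 10-bit readings, and PA0 as an analog-capable pin; `adcToMillivolts` is a helper made up for this example, so check all three assumptions against your board and core version.

```cpp
#include <stdint.h>

// Convert a raw ADC count to millivolts.
// vrefMillivolts: ADC reference voltage, 3300 assumed here.
// bits: ADC resolution in use, 10 assumed as the default.
// Full scale (2^bits - 1) maps to the reference voltage.
constexpr uint32_t adcToMillivolts(uint32_t raw,
                                   uint32_t vrefMillivolts = 3300,
                                   uint32_t bits = 10) {
  return raw * vrefMillivolts / ((1UL << bits) - 1);
}

static_assert(adcToMillivolts(0) == 0, "0 counts is 0 mV");
static_assert(adcToMillivolts(1023) == 3300, "10-bit full scale is Vref");
static_assert(adcToMillivolts(2048, 3300, 12) == 1650, "12-bit mid scale");

#ifdef ARDUINO
#include <Arduino.h>

void setup() {
  Serial.begin(115200);
}

void loop() {
  uint32_t raw = analogRead(PA0);  // PA0: assumed analog-capable pin
  Serial.println(adcToMillivolts(raw));
  delay(500);
}
#endif
```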
herbk wrote: Which is the best address to put in the Boards Manager? Atm I'm at "https://github.com/stm32duino/BoardMana ... index.json", but there are no example sketches with it and I know I had some examples in the IDE before I changed to it.

here are some "samples"
https://github.com/stm32duino/STM32Examples
and in the wiki
https://github.com/stm32duino/wiki/wiki
and the api reference
https://github.com/stm32duino/wiki/wiki/API

Re: Code size

Posted: Mon Sep 05, 2022 5:55 pm
by GonzoG
STM32 Wiki:
https://github.com/stm32duino/wiki/wiki

As to converting an Arduino sketch to STM32: if you're not using a Nucleo with Arduino headers, you will need to change pin names to the standard names:
eg. PA1, PB3, etc. (arduino also uses this naming).
You might also need to change libraries as some libraries are hardware oriented and will work only for arduino boards.

STM32 boards are 3.3V and everything should be 3.3V, but, depending on the MCU, some or all I/O pins are 5V tolerant in digital mode. In analog mode you need to stick with 3.3V.
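
If a sketch that used to measure a 5V signal on an Arduino is moved over, the signal has to be scaled down before it reaches the analog pin. The calculation below sketches a plain resistor divider; the 10k/18k values are hypothetical, and the helper ignores the loading by the ADC input.

```cpp
#include <stdint.h>

// Output of a resistor divider in millivolts:
//   vin -- rTop --+-- rBottom -- GND, ADC pin connected at '+'.
constexpr uint32_t dividerOutMillivolts(uint32_t vinMillivolts,
                                        uint32_t rTopOhms,
                                        uint32_t rBottomOhms) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(vinMillivolts) * rBottomOhms /
      (rTopOhms + rBottomOhms));
}

// 5V through 10k (top) and 18k (bottom) gives about 3.21V.
static_assert(dividerOutMillivolts(5000, 10000, 18000) == 3214,
              "5000 * 18 / 28, rounded down");
static_assert(dividerOutMillivolts(5000, 10000, 18000) <= 3300,
              "stays within the 3.3V analog range");
```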

Re: Code size

Posted: Mon Sep 05, 2022 7:40 pm
by dannyf

Code:

	digitalWrite(PC13, LOW); // turn on led, active low
	delay(10);
	digitalWrite(PC13, HIGH); // turn off led
	delay(10);
I often implement a ticks() (macro or function, depending on the hardware) that tracks the number of cpu ticks elapsed, for timing purposes. for the CM3 chips, you use

Code:

(DWT->CYCCNT)
for that.

so the above code becomes:

Code:

#define DLY_10ms  (F_CPU / 100)  //cpu cycles in 10ms
#define DLY_1Sec  (F_CPU)  //cpu cycles in 1 second
#define pinFlip(pin)  digitalWrite(pin, !digitalRead(pin))  //flip a digital pin

	//tick0 and tick1 are uint32_t variables holding the last timing points
	//blink led0 every 10ms
	if (ticks() - tick0 > DLY_10ms) {
		tick0+= DLY_10ms;  //advance the timing point
		pinFlip(LED0);  //flip the pin
	}
	
	//blink led1 every second
	if (ticks() - tick1 > DLY_1Sec) {
		tick1+= DLY_1Sec;  //advance the timing point
		pinFlip(LED1);  //flip the pin
	}

the advantage is that 1) such timing is universal, regardless of how fast the cpu runs, and 2) other tasks can carry on. for example, you can have multiple leds blinking at different rates - in the above example, led0 flips every 10ms, and led1 flips every second.
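
For anyone who wants to try this, below is one possible complete version, offered as a sketch under assumptions rather than as the code used above. The pins (PC13, PB12) and the helper name `elapsedTicks` are made up for illustration, and the cycle counter is switched on through the CMSIS register names in case the core has not already done so. Because the counter is 32 bits wide, it wraps after 2^32 / F_CPU seconds (roughly a minute at 72 MHz), so each interval must be shorter than that.

```cpp
#include <stdint.h>

// Cycles elapsed between two readings of a free-running 32-bit counter.
// Unsigned subtraction stays correct when the counter wraps around.
constexpr uint32_t elapsedTicks(uint32_t now, uint32_t start) {
  return now - start;
}

static_assert(elapsedTicks(5, 0xFFFFFFF0u) == 21, "wraparound is handled");

#ifdef ARDUINO_ARCH_STM32
#include <Arduino.h>

const uint32_t kLed0 = PC13;  // assumed pins, check your board
const uint32_t kLed1 = PB12;

static uint32_t tick0, tick1;  // last timing points

static inline uint32_t ticks() { return DWT->CYCCNT; }

void setup() {
  pinMode(kLed0, OUTPUT);
  pinMode(kLed1, OUTPUT);
  // Enable the Cortex-M cycle counter.
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;
  DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;
  tick0 = tick1 = ticks();
}

void loop() {
  const uint32_t dly10ms = F_CPU / 100;  // cpu cycles in 10 ms
  const uint32_t dly1s = F_CPU;          // cpu cycles in 1 second

  if (elapsedTicks(ticks(), tick0) >= dly10ms) {
    tick0 += dly10ms;  // advance the timing point, no drift
    digitalWrite(kLed0, !digitalRead(kLed0));
  }
  if (elapsedTicks(ticks(), tick1) >= dly1s) {
    tick1 += dly1s;
    digitalWrite(kLed1, !digitalRead(kLed1));
  }
  // other tasks can run here, nothing blocks
}
#endif
```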

Re: Code size

Posted: Sun Sep 11, 2022 11:55 am
by herbk
OK, I understand (I hope so...): the board needs about 10K of flash for itself, and the size of the program comes on top. But because the flash size is 64K (I know it depends on the board), I still have more than on an Arduino Nano, for example.
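
Spelled out with the sketch sizes quoted at the start of the thread, the comparison looks roughly like this. The 30720 bytes are what the IDE typically reports as usable on a Nano (32K minus bootloader), and the 32K line is for a c6 part; treat both as assumptions to check against your own build output.

```cpp
#include <stdint.h>

// Flash left for more code once the reported sketch size is subtracted.
constexpr uint32_t flashLeft(uint32_t flashBytes, uint32_t sketchBytes) {
  return sketchBytes >= flashBytes ? 0 : flashBytes - sketchBytes;
}

constexpr uint32_t kNanoSketch = 860;     // Nano build, from the first post
constexpr uint32_t kStm32Sketch = 11500;  // STM32F103 build, approximate

static_assert(flashLeft(30720, kNanoSketch) == 29860, "Nano");
static_assert(flashLeft(64 * 1024, kStm32Sketch) == 54036, "f103 c8, 64K");
static_assert(flashLeft(32 * 1024, kStm32Sketch) == 21268, "f103 c6, 32K");
```

So a 64K part still leaves more room than the Nano, while a 32K c6 part leaves less.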

@ ag123 :
Power or speed I usually don't need very much... The main reasons why I want to migrate to the STM boards are the larger number of analog inputs, the better ADC and the integrated RTC. It would be perfect if I could find a board with an integrated SD card slot too... ;)
I found one ( https://de.aliexpress.com/i/40001037427 ... pt=glo2deu ) but it seems to be overkill for my needs.
Power also means power consumption, which I don't like, because most of my projects need to run for a long time on one battery.

@ dannyf :
Looks like the same thing I do on Arduinos with millis()...

Re: Code size

Posted: Sun Sep 11, 2022 1:18 pm
by ag123
These are just some personal notes. I stuck with stm32 mainly because it is among the mcus that have rather "adequate" sram and because it is *fast*, in particular the stm32f4xx and stm32h7xx series. Compared to chips like the atmega328, stm32, and for that matter even the stm32f103c8, has more resources, and that certainly makes a difference.
But of course these days there are lots of choices for such chips, e.g. the Raspberry Pi Pico, whose RP2040 chip bundles a generous 264kB sram, and likewise things like esp32 etc. For now, the other reasons I'm still doing stm32 are mainly platform familiarity and that stm32 mostly has a lot of peripherals and moderately fast and decent ADCs. oh, and other stuff like DMA! That gives you videos like this
[image]
(note for the 'official' core: to play with DMA, you'd need to go 'beyond' the *duino api, for now. i.e. it is for 'advanced' users/tinkerers. it isn't there, but you can attempt it e.g. using Cube MX/IDE's HAL, or go hard core and play with the hardware registers directly - check the reference manuals on ST's web for your chip.)

Note that these days, online "flea market" merchants are selling stm32f103 c6 rather than c8 or cb. just my 2 cents: those who bought those 'cheaper' 'c6 pill boards' are unfortunately "short changed", in the sense that they'd find things pretty cramped, e.g. to run stm32duino.

For those looking in the "online flea markets" for "development boards" or 'pill' boards, this is probably a good reference resource
https://stm32-base.org/boards/
at least you could use those as 'starting points' and search for part numbers

Preferably, just go straight to stm32f401 or stm32f411 'pill' boards, check that the suffix is CC or CE (check ST's web for the chip specs for flash and sram), or even better get a Nucleo F401RE or F411RE. One could also go one up for an even faster and more resource-endowed MCU like the stm32f405rg. You can find such a board from Micropython
https://store.micropython.org/product/PYBLITEv1.0
or from Adafruit
https://www.adafruit.com/product/4382
or from Olimex (note get the original ST one)
https://www.olimex.com/Products/ARM/ST/STM32-H405/
there are probably more from others.
There are also others, in particular those STM32F407VE or STM32F407 ZE or ZG boards
https://www.aliexpress.com/wholesale?ca ... 32f407vet6
https://www.aliexpress.com/wholesale?ca ... 32f407zgt6
https://www.olimex.com/Products/ARM/ST/ ... e-hardware
STM32F405 and STM32F407 are 168 MHz chips, if *fast* is what you want!

Then for those who want to be 'speed devils' there are the STM32H7xx chips, one'd only need to search around.
e.g. note this isn't the only one and is just an *example*. search with similar part numbers and you may find boards around, there are also Nucleo offerings i think.
viewtopic.php?f=28&t=1348
For 'beginners' my guess is that starting with a H7 may be 'overwhelming', and it has quite a different design vs say stm32f103xx and stm32f4xx.
H7xx are (very) complex and (very) fast chips packed with peripherals. They probably go 'beyond' what is deemed an 'mcu'. They are like single chip computers, just without the MMU.

The difference between say stm32duino vs perhaps a more 'from scratch' way of doing things (e.g. a Cube MX or Cube IDE based template generator) is that the *duino style has a *default bundle* of things, e.g. adc, uart (or for that matter usb-serial) etc. are part of the default bundle, partly because of the Arduino api.
This accounts for some binary bloat. In addition, the 'official' stm32duino core attempts to cater to nearly the whole "spectrum" of stm32 socs. that also contributes some binary bloat, as the linker won't always be able to tell if a particular function would be used, and there is various if-else code to cater for cross series differences. But this is really a tradeoff: the ability to "cross series" is no mean feat, and the amount of effort you save can be 'enormous' compared to creating a different core for each soc.

But even that is improving as now there is a 'CMake' approach
viewtopic.php?f=10&t=1648
CMake is well known as an "industrial strength" build tool for large projects. One of its design features is that it lets you 'swap libraries' that share the same api. This could solve some of the binary bloat issues, and the builds can be much faster.
https://en.wikipedia.org/wiki/CMake
'dependency injection'? CMake mimics it halfway by letting you swap libraries by changing the build configuration, i.e. the CMakeLists.txt.