Any Parallel in Binary Sizes of IDE for Board Created to Standard PC Binary?

Development environment specific, Arduino, Eclipse, VS2013,Em::Blocks etc
mrburnette
Posts: 1894
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Post by mrburnette » Wed Oct 26, 2016 2:04 pm

So with this very small sample set the magic multiplier is about ~.666 seems evil to me
@Rick
Oh, yeah ... that's evil :evil:

@john,
One can spend far too much time "thinking" about code size. In my opinion, you would be better advised to think about architecture. Most microcontrollers are of Harvard architecture and most microprocessors are of Von Neumann architecture. However, instruction pipelining, flash buffering, prefetching, and other enhancements can really mess with one's mind when doing cross-platform programming.

I'm not saying that the binary size that gets stored in flash is not a concern, but I will say that in my 6 years, it has not been a concern. SRAM is always a concern especially if you come from a big-iron machine environment. But IMO I have run out of I/O pins before I have run out of flash.

I do not know how the ArduinoIDE dudes and dudettes decide on compiler flags, maybe they flip a coin. But I have had to use a few pragma directives from time to time to get what I want.

Libraries are a two-edged sword: they make it easy for non-programmers to get started and get a project completed but in the Arduino world they hide methods and techniques that may not be "good programming practices." I am pretty outspoken on the whole library issue and my usual response is to never use a library unless you understand how the library code functions. Obviously, this adds some time to a weekend project but in the long run, it conditions your mind to know the limitations of 3rd party code. You can easily get burned by relying on a library to be perfect.

Based upon what I have seen in the last 3 years, changes in uC devices are going to continue to come even faster than in the past ... or said another way, change is accelerating. I rarely pick up an 8-bit device any more because 32-bit devices are so inexpensive and because these devices come with more resources and run at a higher clock, programming is actually easier because one does not have to make mid-stream decisions on code because of device limitations.

The other thing that is happening very quickly is that devices such as the ESP8266 ($4 w/ breakout) can host their own web-session and where I now use serial-USB to an enumerated PC serial port to a monitor program ... I can utilize a WiFi connection on my router and use AJAX/JavaScript and monitor in my Chromium browser... and with XML I can establish 2-way communication and interact with GUI slider controls, etc. Yes, I could type into the ASCII monitor the value of a control and send that over USB-serial (and repeat, and repeat and ...), but the browser surely does make the test session more fun.


Ray
Attachment: SliderControl.jpg (28.26 KiB)

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Wed Oct 26, 2016 7:14 pm

Hello Rick,

Your Size and Evil magic multiplier of ~0.666 is interesting and not a complete surprise. Case in point: a few years ago I compiled the exact same code on an Intel T5600 processor and a 700MHz Pi processor. The program does integer-only, CPU-bound calculations for as many iterations as I give it. I usually run the test for at least 10 minutes on one of the test processors and then give the same iteration count to the other to compare. Amazingly, the Pi was much faster than expected for 0.7GHz vs 1.8GHz, et al. The T5600 was only about 30% faster. My guess is that this is likely RISC vs CISC. With that also seems to go smaller code size, due to the RISC instruction set.
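A minimal sketch of that kind of fixed-iteration, integer-only comparison is below. It is not the original program: the xorshift32 workload, the function names and the seed are all assumptions chosen only to keep the CPU busy with integer work and to return a value that can be compared across machines.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical integer-only workload: a xorshift32 generator stepped
   `iterations` times. A stand-in for the real program, which is not shown. */
uint32_t integer_workload(uint32_t seed, uint64_t iterations) {
    uint32_t x = seed ? seed : 1u;   /* xorshift state must be nonzero */
    for (uint64_t i = 0; i < iterations; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
    }
    return x;
}

/* Returns the CPU seconds used to run a fixed iteration count on this
   machine. The workload result is passed back so it can be checked for
   equality between the two processors being compared. */
double time_workload(uint64_t iterations, uint32_t *result_out) {
    clock_t start = clock();
    uint32_t result = integer_workload(2463534242u, iterations);
    clock_t end = clock();
    if (result_out) {
        *result_out = result;
    }
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running time_workload() with the same iteration count on both machines and dividing the two times gives the speed ratio. The returned result should match on both, which confirms that the same work was done.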

The size command is really useful to know about and use. Thank you for that and for your size test of the code you had lying about.


Regards,

John L. Males
Toronto, Ontario
Canada
26 October 2016 15:15

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Wed Oct 26, 2016 8:09 pm

Hello Ray,

As always you have great insight and perspective. Your last post is much appreciated for its points and thoughts.

The only reason for my code size question is to allow me to know what MCU choices I have. Cost is not the only factor: there is also headroom (memory and pins) if I need it, power demand (AKA battery sizing), physical size, SD/RTC and pins. My big pin requirement for this important personal project is that I need at least 10 ADC pins. As you know, that is a challenge to start with; then subtract out I2C, SPI, et al depending on the MCU design. I need about 10 digital pins. Add in what I will need in pins for SD/SPI, RTC on I2C, and maybe a future need for a display/touch screen. The latter I am not inclined to have due to power draw. CPU processing power is now important for the project, meaning an ATmega based MCU likely will not be sufficient. The size allows me to know what my possible choices are for an MCU that will have to handle -25C to 70C or a bit more.

I thought a Nano would do the job, but it will not for memory or processing, as the scale of the requirements I need increased more than I expected. Hence an STM32 based solution, which I only learned existed because I am looking at MCUs for the first time. I have not worked with MCUs before and want to avoid spending too long on the basics of working out what I need and finding the MCUs that fit the criteria. Then I will choose one or two from that list. I have one post about support for a few MCUs in a different subject/thread I created. I have one more new subject to create for the 4xx type boards, like the one I posted for the 103 boards. I looked at the Maple Mini as a small board in the Nano-like range of MCU sizes. Sadly there are not enough pins left when I add in SD and RTC. There seem to be a number of STM32 boards with SDIO based SD, which may or may not have support yet in the libraries. Not yet, is my understanding from the libraries via this site.

Memory aside, I have looked at and checked pin out diagrams to see whether a possible MCU of interest fits my requirements or not. Like memory, I would say the pins, and how many are available for my requirements, are key to the decision. A few MCUs that looked like great possibilities were sadly quickly ruled out based on the pin out diagram assignments. Again, this saves time and money on an MCU I cannot use for what I need at this time.

Rick's simple and very informative test gives me a sense of the flash part and use of flash. I already know, based on what Rick pointed out about what is allocated to RAM, that the .data and .bss figures from the size command are not actually accurate. Long story related to another special personal project of mine. The size and Evil Magic multiplier of ~0.666 is helpful as a rough guide, and I would only use such a value that way, as there are many variables affecting the end size of a binary.

That brings me to your points about libraries and knowing what one is using. I agree with your points 100%. At the start I will likely use libraries as is. Later I will take the source and make things more granular to reduce/eliminate dead code from libraries. I started on a machine with 8K of memory, and core memory at that. I was able to run Fortran, COBOL and, to the amazement of many, APL! All on 8K. So I have a sense of memory use and of being wise most of the time in memory use. On the flip side I also worked for some years with IBM mainframes and many processor and networking technologies in between.

The level of what is possible with MCUs and the choices is great. There are already lots of choices, and your point about more MCUs with more ability is great. The choices are more than fine for my needs now. I would just like to find one with RTC and SD on board to simplify the physical design, which will already have a mess of boards/circuits to add; on top of that it saves me a few steps of the hardware solution.

I have left your point that I would be better off considering architecture to last, so I could factor it into the comments I have made. I am not aware at this time of the impact of the differences you noted from a real life practical use point of view. I do need multitasking, which so far seems to be a "method" rather than something on the MCU architecture side of things. I tend to write code for cross-platform use and the ease of just recompile and go. Much of my professional career involved a few different architectures in the solution I had to test. That reinforced my bias to cross-platform code. Having done assembler programming for nearly 7 years, it was great to write code in other languages, during that time and since, that works on almost any platform. I do not know if the STM32 MCUs have a more intelligent alternative to manual multitasking, one that is more native as in many architectures. I have read about manual-like multitasking in MCUs so far. I have not dug deep into whether there are architectures that allow it, let alone price and sizing (maybe way oversized for my needs). I find RISC based designs to be more efficient and faster than most, if not all, CISC based designs. If you have some simple thoughts to point me on this, let me know. I will then do the digging and research.
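For reference, the "manual" multitasking I have read about usually looks something like the sketch below: a cooperative loop that runs each task when its period has elapsed. This is only an illustration under assumptions. The task names, periods and counters are made up, and on an Arduino-style core the tick would presumably come from millis().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn)(void);

typedef struct {
    task_fn  run;        /* work to do when the task is due */
    uint32_t period_ms;  /* how often the task should run */
    uint32_t last_ms;    /* tick value when the task last ran */
} task_t;

/* Hypothetical tasks: each just counts how often it has been called. */
unsigned sample_runs = 0;
unsigned log_runs = 0;
void sample_adc_task(void) { sample_runs++; }
void log_to_sd_task(void)  { log_runs++; }

/* Runs every task whose period has elapsed and returns how many ran.
   The unsigned subtraction stays correct when the tick counter wraps. */
size_t scheduler_poll(task_t *tasks, size_t count, uint32_t now_ms) {
    size_t ran = 0;
    for (size_t i = 0; i < count; i++) {
        assert(tasks[i].run != NULL);
        if ((uint32_t)(now_ms - tasks[i].last_ms) >= tasks[i].period_ms) {
            tasks[i].last_ms = now_ms;
            tasks[i].run();
            ran++;
        }
    }
    return ran;
}
```

The main loop would call scheduler_poll() repeatedly with the current tick. Each task has to return quickly, since nothing preempts it; that is the main difference from an RTOS.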

Hoping I have not made too many typo errors in my reply.

Regards,

John L. Males
Toronto, Ontario
Canada
26 October 2016 16:09

Rick Kimball
Posts: 1060
Joined: Tue Apr 28, 2015 1:26 am
Location: Eastern NC, US

Post by Rick Kimball » Wed Oct 26, 2016 10:45 pm

keypunch wrote:...At start I will likely use libraries as is. Later I will take the source and make things more granular to reduce/eliminate dead code from libraries.
While paper tape and punch cards have remained the same for 50 plus years, C compilers and linkers have forged forward, removing the human optimization drudgery. It is a waste of time to worry about eliminating dead code any more, at least if you are using gcc. There are a couple of command line args that take care of all this for you.

# compiler
$ gcc ... bunch of args ... -ffunction-sections -fdata-sections ... more args foo.c
# linker
$ gcc ... bunch of args -Wl,--gc-sections .. many.o files -o finalapp.elf

The key is the -ffunction-sections and -fdata-sections arguments at compile time. Those args put every function and every data variable in its own section, even those in libraries. You combine that with the -Wl,--gc-sections argument at link time. That flag tells the linker to track every function and variable at link time to see if they are actually used or not (gc-sections meaning garbage collection of sections). If it finds a function or a variable that isn't used, it doesn't link that code into the final object.

so a function like this:

extern "C" {

int some_function(int num) {
    return num * num;
}

}

ends up in an .elf section named ".text.some_function". The linker script includes both .text and all .text.* sections as valid code sections.

You can use:

$ arm-none-eabi-gcc -Os -ffunction-sections -fdata-sections -mthumb -mcpu=cortex-m3 -c ringbuffer.c
$ arm-none-eabi-readelf -a ringbuffer.o
... this will show you all the info about the different sections in the ringbuffer.o code.
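To try this on something small, here is a sketch of a file with one function that an application would call and one that nothing references. The file name (demo.c) and all identifiers are made up for illustration.

```c
/* demo.c: hypothetical file for inspecting per-function sections. */

/* Assume application code calls this, so the linker keeps its section. */
int used_square(int num) {
    return num * num;
}

/* Assume nothing references this; with -Wl,--gc-sections at link time
   its section is a candidate for removal. */
int never_called_cube(int num) {
    return num * num * num;
}

/* Data gets the same per-symbol treatment under -fdata-sections. */
int lookup_table[4] = {1, 4, 9, 16};
```

$ arm-none-eabi-gcc -Os -ffunction-sections -fdata-sections -mthumb -mcpu=cortex-m3 -c demo.c
$ arm-none-eabi-readelf -S demo.o
... the section list should include .text.used_square, .text.never_called_cube and .data.lookup_table as separate sections, which is what lets the linker drop the unused ones individually.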

-rick

mrburnette
Posts: 1894
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Post by mrburnette » Fri Oct 28, 2016 1:26 pm

keypunch wrote:<...>
I do need multitasking that so far seems a "method" vs in the MCU architecture side of things.<...>

@john,

If you intend on using one of the RTOS handlers, I would suggest that you get acclimated to the paradigm ASAP. FreeRTOS has been used with the Maple Mini clones and there is a bit of experience with that tool.

The reason that I suggest you come to grips with whichever RTOS you elect to use is that the programming paradigm shifts a bit. Obviously, there are lots of behind the scenes concerns, too. Of course, your SRAM is also affected.

Your analog requirements may limit your uC choice. Rather than let the tail wag the dog, maybe you should consider an external A/D and use the SPI or I2C bus to retrieve all of the conversions at once.

Prose is OK for discussion, but I think you are at the point where you need to start yourself a matrix of requirements. A matrix is also far easier on the eyes and the brain.

Ray

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Tue Nov 08, 2016 9:04 am

@Rick

Sorry for my delayed reply. I was not receiving notifications of replies. I have sorted that out going forward, but seems there are very limited choices backwards.

WRT your reply on the gcc options "-ffunction-sections -fdata-sections" and the linker option "-Wl,--gc-sections": I was not aware of these options. Every so often I review all the gcc options to decide, incrementally, what makes sense for my use. These options I had never taken notice of in past years. I have one source code set I build both dynamically and statically linked. I will try the options you have suggested to see what impact they have on the static build of the source, which I have used daily for years. I suspect these options will have little effect on the dynamic binary until run time. I will let you know the results.

Thanks so much for bringing these options to my attention.


Regards,

John L. Males
Toronto, Ontario
Canada
08 November 2016 04:04
08 November 2016 06:10 Minor typo correction removed word misplaced left over from prior edit. jlm
Last edited by keypunch on Tue Nov 08, 2016 11:11 am, edited 1 time in total.

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Tue Nov 08, 2016 11:09 am

@Ray,

Thank you for your reply. Sorry for my delayed response. I have sorted out the problem of not being notified of replies going forward.

I have had a matrix for a few months that has allowed me to know what the uC requirements are before I screen the pin out map of a uC. This allows me to screen and shortlist possible uCs. Then I consider board size, compatibility with libraries, et al. This matrix has also allowed me to consider using I2C based ADCs, as you suggested.
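The screening step can be sketched in code as a requirements record checked against each candidate. This is only an illustration. The field names and the functions are made up, and the candidate entries someone feeds in would have to come from the real datasheets and pin out maps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of the matrix: what a candidate board/uC offers. */
typedef struct {
    const char *name;
    int adc_pins;
    int digital_pins;
    int ram_kib;
    int temp_min_c;   /* lowest rated operating temperature */
    int temp_max_c;   /* highest rated operating temperature */
} mcu_spec_t;

/* What the project needs (values are supplied by the caller). */
typedef struct {
    int min_adc_pins;
    int min_digital_pins;
    int min_ram_kib;
    int temp_low_c;   /* coldest temperature the part must handle */
    int temp_high_c;  /* hottest temperature the part must handle */
} requirements_t;

/* True when the candidate satisfies every requirement. */
bool mcu_meets(const mcu_spec_t *mcu, const requirements_t *req) {
    assert(mcu != NULL && req != NULL);
    return mcu->adc_pins >= req->min_adc_pins
        && mcu->digital_pins >= req->min_digital_pins
        && mcu->ram_kib >= req->min_ram_kib
        && mcu->temp_min_c <= req->temp_low_c
        && mcu->temp_max_c >= req->temp_high_c;
}

/* Copies pointers to the passing candidates into `out` (which must have
   room for `count` entries) and returns how many passed. */
size_t shortlist(const mcu_spec_t *candidates, size_t count,
                 const requirements_t *req, const mcu_spec_t **out) {
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (mcu_meets(&candidates[i], req)) {
            out[kept++] = &candidates[i];
        }
    }
    return kept;
}
```

With the figures mentioned earlier in this thread, the requirements would be something like 10 ADC pins, 10 digital pins and -25C to 70C. Only the boards that pass go on to the pin out and library checks.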

WRT your reply:

I read the FreeRTOS link you provided. Insightful points and requirements to consider. Your points on "programming paradigm shifts a bit" and the added memory overhead (I will assume mostly RAM) apply for sure. I may have a few things that are simple to implement in the application and its logic that may allow me to achieve what I wish for "multitasking". There is one chip I can add whose great side effect will be to make one aspect of any multitasking a bit easier. The primary purpose of the chip I would like to use is that it allows much better control, behaviour, consistency and far more devices than the multiple devices via GPIO pins approach that 99.9% use, with its challenges. Not to mention the risk of damage to the MCU that the latter 99.9% approach has.

You are correct that my analog requirements limit my uC choices. I have considered an ADC via I2C. The ADS1115 is one easy choice in terms of cost and availability, and it is well known. Each ADS1115 adds 4 single-ended ADC channels, with a limit of 4 ADS1115s based on the I2C addresses the ADS1115 can have. Some STM32s have 16 ADC pins. More than I need, but higher than the 8 or 9 ADC pins of an STM32F103C8 board like the Maple Mini or similar, which may be a bit too low for my final needs.
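The 4 chips x 4 channels arithmetic gives up to 16 single-ended inputs on one I2C bus. A minimal sketch of mapping a logical channel number to a chip and input follows. The function and type names are made up; the address range 0x48 to 0x4B and the single-ended MUX codes are as I understand them from the ADS1115 datasheet, so check them there before relying on this.

```c
#include <assert.h>
#include <stdint.h>

#define ADS1115_BASE_ADDR      0x48u  /* ADDR pin tied to GND */
#define ADS1115_MAX_DEVICES    4u     /* addresses 0x48..0x4B */
#define ADS1115_CH_PER_DEVICE  4u     /* single-ended inputs AIN0..AIN3 */

typedef struct {
    uint8_t i2c_addr;  /* 7-bit I2C address of the chip to talk to */
    uint8_t mux_bits;  /* value for the MUX[2:0] field of the config register */
} ads1115_channel_t;

/* Maps logical channel 0..15 to a chip address and a single-ended
   MUX setting (100b = AIN0 vs GND, up to 111b = AIN3 vs GND). */
ads1115_channel_t ads1115_map_channel(uint8_t logical_channel) {
    assert(logical_channel < ADS1115_MAX_DEVICES * ADS1115_CH_PER_DEVICE);
    ads1115_channel_t ch;
    ch.i2c_addr = (uint8_t)(ADS1115_BASE_ADDR
                            + logical_channel / ADS1115_CH_PER_DEVICE);
    ch.mux_bits = (uint8_t)(0x4u + logical_channel % ADS1115_CH_PER_DEVICE);
    return ch;
}
```

Logical channel 9, for example, lands on the third chip (0x4A), input AIN1. The actual I2C transfer is left out, as it depends on the Wire library of the core in use.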

I would like to have an SD card and RTC as well. I recently received some SPI SD boards and I2C RTC boards based on the DS3231. My research says the STM32 RTCs are quite "accurate". The drift of the DS3231 is still too much for my needs. There are libraries that allow the drift, which is chip/crystal specific, to be reduced greatly. I will test the STM32 RTC to see if it is close to or better than a DS3231 and decide which I will use. Having the battery for the STM32 RTC on board, like an SD, just simplifies the physical and weather (protected in a weatherproof case, but we have winter here!) elements of the design.

As you know, on many STM32 boards the SD is SDIO based and not SPI based. So far there is no stm32duino.com SDIO SD library. This of course is due to what is needed in libraries and what time allows for priority. This means I would have to see if I can write an SDIO library myself, without being certain of the bugs I may have to find and work out. This factors in when the important data is being logged to the SDIO SD card vs what seems to be a well tested SPI SD. There is a balance of time vs stability. I am likely to side with an SPI SD card implementation, either already on the STM32 board or added via an SD board I currently have. I will already need to have a "few" other boards that will feed the analog signals and parallel digital signals (in many cases). I am trying to keep the hand wiring mess this will create from getting way out of hand (given the number of sensors with boards or circuits I DIY build from existing schematics). I am trying not to create what may be a one-off custom board design (cost and iteration factors inclusive), even though that would enable better physical/weather and signal related reliability. That may be a longer term consideration, perhaps a must. For now I would like to avoid custom daughter board(s).

This project will need to handle more than 80% of the 20% part of the 80/20 rule, meaning I have an 80/80 rule for this project. I already know I have some challenges in making the sensors robust for weather (temperature, physical, salt, precipitation, et al). I learned that quickly in the proof of concept of one sensor that will be used. The sensor is fine; the physical protection design was not, though it has been just fine for the 10 years the design was created for and will be for years to come. Earthquakes, tornadoes, hurricanes, et al are currently not a factor in the physical considerations. The usual day to day variation, from humid summers to winter rain and snow, is among the primary physical considerations for this project.

As there may be a need for more than the 20K RAM of an STM32F103C8 like uC, the increased RAM requirements, especially if I need to consider FreeRTOS, go hand in hand with having 16 ADC pins. I may have found a SPI based uC, but it is based on an STM32F103C8T6 uC. That holds me to 20K RAM. The application design for the uC will push close to the 20K limit, but I may be able to use some techniques and other considerations to reduce it, perhaps to a 4-5K (perhaps less) RAM footprint before libraries are factored into the RAM use. I have no idea if the 128K Flash will be a limiting consideration if I need to use FreeRTOS or a similar approach. I will assume the Maple Mini that has been shipped will be one way for me to figure out the sizing and whether I need to use an RTOS or not.

malloc or similar are not an issue for my code's RAM use, simply because malloc is not used by my code. I cannot speak for whether the uC libraries linked to my code use malloc, with the usual memory hole effects that can result.

Another side plus of more pins is having more than one I2C channel and more than one SPI channel on the uC. This has some future considerations I like that are not pin requirements for more sensors. These could allow me to implement a couple of unique simple elements that would be a small and great plus.

I now have some framework code, created over about a week as of a week or so ago, that gives me a better idea of the Flash/RAM requirement for the base framework code. Once a few other items arrive I can add in actual code for the devices. This will lead to the binary sizes and how much Flash and RAM the binary uses. Then I can see if I can make further code changes to reduce the Flash and RAM requirements of the code. Add in the suggestions from Rick for compiler and linker options to reduce or eliminate dead code from libraries, and maybe an STM32F103C8 device with 20K RAM will be more possible, with room to spare for growth elements such as maybe a display if battery requirements allow for it. A display is not high on my requirements list, but could be helpful, since I cannot read by eye the bits written to the SD at the time.
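As a rough way to turn the three numbers printed by the size command into Flash and RAM figures, the usual rule of thumb is Flash = text + data and static RAM = data + bss. The sketch below assumes that rule; the names are made up, and the RAM figure covers only statically allocated memory, not stack or heap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The three columns reported by the size command, in bytes. */
typedef struct {
    uint32_t text;  /* code and read-only data */
    uint32_t data;  /* initialized variables (stored in flash, copied to RAM) */
    uint32_t bss;   /* zero-initialized variables (RAM only) */
} size_report_t;

/* Flash holds the code plus the initial values of .data. */
uint32_t flash_bytes(const size_report_t *s) {
    assert(s != NULL);
    return s->text + s->data;
}

/* Static RAM is .data plus .bss; stack and heap come on top of this. */
uint32_t static_ram_bytes(const size_report_t *s) {
    assert(s != NULL);
    return s->data + s->bss;
}

/* True when both figures fit within the given device limits. */
bool fits_device(const size_report_t *s, uint32_t flash_limit,
                 uint32_t ram_limit) {
    return flash_bytes(s) <= flash_limit && static_ram_bytes(s) <= ram_limit;
}
```

For a 128K Flash / 20K RAM part the limits would be 128 * 1024 and 20 * 1024. Headroom for the stack still has to be judged separately.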

I hope my articulated reply and some prior ones have now given a sense of the process and thinking being taken into account in the design/uC selection process, which is of course WIP. For sure I have learned much in my research as part of the WIP since the Spring months, before I discovered stm32duino.com.

As always Ray, you have excellent points and suggestions. I find your thoughts and suggestions valuable, not just in replies to me, but also in the replies you have made to others and in the points and thoughts in some of the projects you have built and posted. Yes, I have seen a few and read most of what I have discovered. Time, at the time, was the only reason for my not yet reading those posts I have not read.

The same goes for the many others that have posted replies to me as well. It also applies to the many posts I have read, which are many but really only a fraction of the posts on stm32duino.com. Lots of good information and considerations to think about. Some excellent points and thoughts from other posts, not made by me or in reply to me, have already factored into the current design/consideration state of my project. Of course many are not aware that they have helped at least me, and they are to be thanked by me as well.


Regards,

John L. Males
Toronto, Ontario
Canada
08 November 2016 04:05 - 05:44/06:09

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Mon May 29, 2017 6:42 am

I have done a basic set of tests to compare binary sizes for different boards using the same source code. The tests used the Arduino IDE, Linux 64 bit Version 1.8.2, the Arduino Zero compiler listed via boards manager as version 1.6.15, and the latest Arduino_STM32-master (appears to be 29 May 2017 "Latest commit 920b57a 6 hours ago"). The tests only took about an hour and were done about 3 hours ago.

Data was entered into a spreadsheet from the Arduino IDE output of a verify compile of the same source file.

The results are rather interesting. Generally speaking the STM32 flash images are about 5 to almost 6 times larger than that of the Arduino Uno used as a reference. The one exception was the STM Nucleo F103RB, at about 3.2 times the size of the Uno flash file, much like the 32 bit Arduino M0 and Arduino Zero boards. RAM usage was rather high for the STM32 boards compared to the Arduino boards.

I would expect the difference between an 8 bit and a 32 bit processor to be about 4 times. This would mean 32K Flash/2K RAM for an 8 bit uC would be about equal to 128K Flash/8K RAM, in my opinion. While a few results seem to demonstrate this factor of 4 expectation, many are above it. Comments and thoughts on whether my expectation of a factor of 4 difference makes sense or not are most welcome.
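The arithmetic behind that expectation, and the ratio used to compare each board against the Uno reference, can be written as two small helpers. The names are made up for illustration, and the factor is passed in as a parameter because the factor of 4 is my assumption, not a measured constant.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a build relative to the reference build,
   e.g. 4.0 means four times larger than the reference. */
double size_ratio(uint32_t bytes, uint32_t reference_bytes) {
    assert(reference_bytes != 0);
    return (double)bytes / (double)reference_bytes;
}

/* Scales an 8 bit part's memory by the expected factor to get
   a rough 32 bit equivalent. */
uint32_t scaled_equivalent(uint32_t bytes_8bit, uint32_t factor) {
    return bytes_8bit * factor;
}
```

With a factor of 4, scaled_equivalent(32 * 1024, 4) gives 128K and scaled_equivalent(2 * 1024, 4) gives 8K, matching the Flash and RAM figures above.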

Two boards did not compile due to a missing header file. Some of the STM32 compiles had warning messages.

I have attached PDFs of two bar charts from the spreadsheet data, created from the number of bytes of flash and the amount of RAM that would be used, as given by the Arduino IDE. The Arduino Zero and M0 compiles did not seem to indicate how much RAM is used in the verify compile output, or I am missing where to find that value.


Regards,

John L. Males
Toronto, Ontario
Canada
29 May 2017 06:41 EDT
29 May 2017 06:52 (Added embedded url links)
Attachments
20160528-Tests-STM32VsArduino-BytesCompare-Percent.txt.pdf (12.29 KiB)
20160528-Tests-STM32VsArduino-BytesCompare.txt.pdf (11.96 KiB)

RogerClark
Posts: 7500
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia

Post by RogerClark » Mon May 29, 2017 7:21 am

What code did you compile? e.g Blink?

Which upload method did you select?

Most options compile the USB serial code into the sketch, which makes it a lot bigger.

If you want to compare with AVR you will need to select upload via Serial as this will not include the USB serial code.

keypunch
Posts: 67
Joined: Tue Aug 02, 2016 2:26 am

Post by keypunch » Mon May 29, 2017 7:54 am

RogerClark wrote:
What code did you compile? e.g Blink?
Code was for BMP280, with serial.print statements commented out.
Which upload method did you select?
No upload was done. The test was for binary size metrics only. I only compiled, i.e. the "Verify" function of the Arduino IDE that allows one to test a compile and avoid unnecessary flash upload/wear in that part of the coding process.
Most options compile the USB serial code into the sketch, which makes it a lot bigger.
The test was the same for all boards. This was not a test for the smallest binary, but of what the binary size metrics were. The expectation was for about a factor of 4 difference, i.e. 8 bit vs 32 bit uC.
If you want to compare with AVR you will need to select upload via Serial as this will not include the USB serial code.
I will try a set of tests with upload via serial and see what the binary sizes created are.


Regards,

John L. Males
Toronto, Ontario
Canada
29 May 2017 03:54 EDT
