Very lean Serial.printf()

mrburnette
Posts: 1774
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Re: Very lean Serial.printf()

Postby mrburnette » Thu Apr 14, 2016 11:32 pm

Slammer wrote:<...>
No other implementation is so small. I tried almost everything; there are smaller implementations, but they don't support all types of integers or width specifiers, or they use static variables....


Looking smaller == better, IMO. And, I think the member function implementation is much cleaner. Still, there may be those that would balk at 826 extra Bytes but I think for the STM32F103 and above, it is not a problem.

Yes, there are smaller implementations but your implementation seems more useful for the Joes and Jills of the world. Very nice. As I showed earlier, there are options available outside printf() for simple formatting needs.

Now, how fast is it?

Ray

Slammer
Posts: 241
Joined: Tue Mar 01, 2016 10:35 pm
Location: Athens, Greece

Re: Very lean Serial.printf()

Postby Slammer » Fri Apr 15, 2016 12:21 am

And for those who say that the STM32 wastes more memory than AVR: the same function on the Uno needs 0x40C + 0x2A = 1078 bytes.

mrburnette

Re: Very lean Serial.printf()

Postby mrburnette » Fri Apr 15, 2016 2:31 pm

mrburnette wrote:<...>
Now, how fast is it?

Ray



To answer my own question for the statement ... Serial.printf("Loop #: % 5d %08X %ld %p %f\r\n",n,n,n,&n,n++);

Loop #: 160 000000A0 160 P F uS=28815

Loop #: 161 000000A1 161 P F uS=28865

Loop #: 162 000000A2 162 P F uS=30876

Loop #: 163 000000A3 163 P F uS=31865

Loop #: 164 000000A4 164 P F uS=28858



The sketch:


/*
  BlinkNcount for Maple Mini by m. ray burnette
  Compiled on Linux Mint 32-bit
  Arduino 1.7.9 using Maple Mini Bootloader 2.0 tested on 20160415
    Sketch uses 15,260 bytes (12%) of program storage space. Maximum is 122,880 bytes.
    Global variables use 2,560 bytes of dynamic memory.
  Turns on an LED on for one second, then off for one second, repeatedly.
  Counts and displays the count on the attached serial monitor
  This example code is in the public domain.
 */

#define BOARD_LED_PIN PB1                                                     // Maple Mini pin# 33 is status LED

int n=0;

void setup()
{
    pinMode(BOARD_LED_PIN, OUTPUT);                                           // digital pin as an output.
    Serial.begin();                                                           // BAUD has no effect on USB serial
                                                                              // wait to be connected.
    while (!(Serial.isConnected() && (Serial.getDTR() || Serial.getRTS())))
    {
        digitalWrite(BOARD_LED_PIN,!digitalRead(BOARD_LED_PIN));              // Flip LED state
        delay(50);
    }
    Serial.printf("-PROGRAM START-\r\n");
}

void loop()
{
    long uScount;
    digitalWrite(BOARD_LED_PIN, HIGH); delay(500);                            // LED_on  + half-second
    uScount = micros();
    Serial.printf("Loop #: % 5d %08X %ld %p %f\r\n",n,n,n,&n,n++);            // exercise NEW function
    uScount = micros() - uScount;
    Serial.print("\t\t\t\t uS="); Serial.println( uScount);
    digitalWrite(BOARD_LED_PIN, LOW); delay(500);                             // LED_off + half-second
}


Real Flash Impact:
Using the above code, I commented out the two statements using .printf() and compiled (this leaves in the Serial.print calls). The Serial.h and Serial.cpp libs remain modified at this point with Slammer's changes:
Sketch uses 15,196 bytes (12%) of program storage space. Maximum is 122,880 bytes.
Global variables use 2,560 bytes of dynamic memory.


Reverting the library changes (overwriting w/ backup), the new compile figures are (sketch remains unchanged, still w/ comments):
Sketch uses 14,100 bytes (11%) of program storage space. Maximum is 122,880 bytes.
Global variables use 2,560 bytes of dynamic memory.


Therefore, 15196 - 14100 = 1096 bytes, which can be attributed to the coarse granularity of linking: the unused printf() member functions end up in flash because I used Serial.print() in the sketch.

I am not going to draw a conclusion on this, as I feel that each programmer must weigh the value of Serial.printf() against the permanent loss of about 1K of flash just for having potential access to the functionality. 1096 bytes is not a great deal of flash to give up, but if we as a forum implement this in the core, that flash is lost forever to this functionality (assuming you use Serial) unless the end user edits it out. With no Print class utilized, the compile yields:
Sketch uses 12,180 bytes (9%) of program storage space. Maximum is 122,880 bytes.
Global variables use 2,560 bytes of dynamic memory.
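
For reference, the four compile reports in this post can be lined up to separate the individual costs. This is only a rough sketch of the arithmetic: the constant and function names are made up for illustration, and reading the 64-byte difference as the cost of the two printf() call sites plus their format strings is an assumption, not something measured separately.

```cpp
#include <cassert>

// Flash sizes in bytes, copied from the compile reports quoted in this post.
constexpr int kNoPrintClass  = 12180;  // sketch with no Print class utilized
constexpr int kStockPrint    = 14100;  // stock library, Serial.print() only
constexpr int kPatchedUnused = 15196;  // patched library, printf() calls commented out
constexpr int kPatchedUsed   = 15260;  // patched library, printf() calls active

// Cost of pulling in the Print class at all.
constexpr int printClassCost() { return kStockPrint - kNoPrintClass; }

// Cost of having printf() in the library even when the sketch never calls it.
constexpr int printfResidentCost() { return kPatchedUnused - kStockPrint; }

// Assumed cost of the two call sites and their format strings.
constexpr int printfCallSiteCost() { return kPatchedUsed - kPatchedUnused; }

static_assert(printfResidentCost() == 1096, "matches the figure quoted in the post");
```

Under these numbers the Print class itself costs more flash than the printf() addition does.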




Ray

Slammer

Re: Very lean Serial.printf()

Postby Slammer » Fri Apr 15, 2016 9:21 pm

Actually, this method does not measure the printf itself, but the time the MCU needs to write the characters to the UART.
Our implementation of serial write (usart_putc) is not interrupt-based and does not use a buffer. As a result, while writing a character the MCU just waits for the transmission to end (I don't know the internals of the STM32, so it waits either for the current character to be pushed out or for the previous one, but the result is almost the same when you push multiple bytes to the UART).
In a typical application a ring buffer should be used: usart_putc is normally the entry point to the ring buffer and does not block. The TX interrupt triggers the sending of the next character until the buffer is empty.
The time a character needs to leave the UART is not small. At 115200 baud a character needs almost 1/11520 sec = 87 usec, which is a really long time for a 72MHz MPU (at 9600 it is an eternity....)
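
To make the ring-buffer idea concrete, here is a minimal host-side sketch in C++. It is not the libmaple or STM32 core code: TxRing, tx_putc and tx_isr are hypothetical names, and the TX interrupt is simulated by calling tx_isr by hand instead of from hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size transmit ring buffer (illustrative names only).
struct TxRing {
    static constexpr std::size_t kSize = 64;   // power of two, so masking can wrap the index
    static_assert((kSize & (kSize - 1)) == 0, "kSize must be a power of two");

    std::uint8_t data[kSize];
    volatile std::size_t head = 0;             // next slot to write (main code)
    volatile std::size_t tail = 0;             // next slot to send (interrupt handler)

    bool empty() const { return head == tail; }
    bool full() const { return ((head + 1) & (kSize - 1)) == tail; }
};

// Non-blocking put: queues the byte and returns at once.
// Returns false when the buffer is full so the caller can decide to wait or drop.
bool tx_putc(TxRing& rb, std::uint8_t c) {
    if (rb.full()) return false;
    rb.data[rb.head] = c;
    rb.head = (rb.head + 1) & (TxRing::kSize - 1);
    assert(rb.head < TxRing::kSize);           // index invariant
    // On real hardware, the "TX register empty" interrupt would be enabled here.
    return true;
}

// Stand-in for the TX interrupt handler: called each time the UART can accept
// a byte. Returns true and stores the byte in 'out' if one was pending.
bool tx_isr(TxRing& rb, std::uint8_t& out) {
    if (rb.empty()) return false;              // a real handler would disable the interrupt
    out = rb.data[rb.tail];
    rb.tail = (rb.tail + 1) & (TxRing::kSize - 1);
    return true;
}
```

With this layout one slot always stays free, so a 64-entry buffer holds at most 63 pending bytes. The caller only stalls once the buffer is full, instead of on every character.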

mrburnette

Re: Very lean Serial.printf()

Postby mrburnette » Fri Apr 15, 2016 9:36 pm

Slammer wrote:Actually, this method does not measure the printf itself, but the time the MCU needs to write the characters to the UART.
Our implementation of serial write (usart_putc) is not interrupt-based and does not use a buffer. As a result, while writing a character the MCU just waits for the transmission to end (I don't know the internals of the STM32, so it waits either for the current character to be pushed out or for the previous one, but the result is almost the same when you push multiple bytes to the UART).
In a typical application a ring buffer should be used: usart_putc is normally the entry point to the ring buffer and does not block. The TX interrupt triggers the sending of the next character until the buffer is empty.


Yes, you are correct, technically; and since we are dealing with technology, that makes you 100% correct. I should have concocted a differential test, but since .printf() has formatting capabilities that cannot easily be replicated (the purpose of your code), a fair and impartial test is difficult. Honestly, I have been more than a little distracted with a dirt-cheap NodeMCU and did not give enough attention to this issue.

For now, everyone: please know that the timing is an aggregate of base code + Slammer's functions. The take-away here is that the time spent in printf() itself is smaller than the overall number of uS printed. Sorry if anyone was confused by these measurements.

To do such testing properly, one approach is to inherit from Print, add your printf() worker functions, and conduct the timing within the derived class. Which leads me to ask: for printf(), why not take the inheritance approach and build it as a class derived from Print? That should avoid any need to modify the core Print code.
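
The inheritance idea could look roughly like the sketch below. Several things here are assumptions: the Print class is a minimal stand-in so the code compiles on a PC (on the board you would derive from the real Arduino Print), PrintfPrint and NullSink are hypothetical names, and the formatting uses the standard library's vsnprintf rather than Slammer's lean formatter, so on the MCU it would cost considerably more flash.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Minimal stand-in for the Arduino Print base class (assumption: PC build only).
class Print {
public:
    virtual ~Print() {}
    virtual std::size_t write(std::uint8_t c) = 0;
};

// Adds printf() by inheritance, leaving the base class untouched.
class PrintfPrint : public Print {
public:
    // Formats into a fixed stack buffer, then pushes the bytes through write().
    // Output longer than the buffer is truncated.
    int printf(const char* fmt, ...) {
        char buf[128];
        va_list ap;
        va_start(ap, fmt);
        int n = std::vsnprintf(buf, sizeof buf, fmt, ap);
        va_end(ap);
        if (n < 0) return n;                               // formatting error
        std::size_t len = static_cast<std::size_t>(n);
        if (len >= sizeof buf) len = sizeof buf - 1;       // truncated output
        for (std::size_t i = 0; i < len; ++i) {
            write(static_cast<std::uint8_t>(buf[i]));
        }
        return static_cast<int>(len);
    }
};

// Sink that discards every byte and only counts it, so that timing a printf()
// call on it would cover formatting without any transmission wait.
class NullSink : public PrintfPrint {
public:
    std::size_t count = 0;
    std::size_t write(std::uint8_t) override { ++count; return 1; }
};
```

On the board, wrapping a call such as sink.printf(...) between two micros() readings would then time the formatting alone, and the same class layout works with Slammer's formatter dropped in place of vsnprintf.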

Ray

Slammer

Re: Very lean Serial.printf()

Postby Slammer » Fri Apr 15, 2016 9:56 pm

The NodeMCU really is a beast for $2.
Its program memory is so huge for MCU applications that a full 50K-60K version of printf is almost nothing.... that's why Print.printf() is included in the core of the esp8266.
On the other hand, the MPUs I have used most in the last 25 years of my professional life are the ATmega8 and 89C52... On these machines even the Print class is a luxury... you have to live with basic itoa and ltoa....
Anyway, it may be more useful for our community to try to improve some core functions, like buffered transmit on the UART... (lol, I want something to keep my nights busy....)

PS: I am afraid that measuring the transmit time of Serial.write is more complex. My previous post about the UART is technically correct, but the Serial.XXX functions do not use a real UART; they use an emulated device over USB that acts as a UART. Timing this device is not an easy task...
The concept is the same because of the blocking nature of uart_putc, but the timing is unknown.
Last edited by Slammer on Fri Apr 15, 2016 10:13 pm, edited 1 time in total.

mrburnette

Re: Very lean Serial.printf()

Postby mrburnette » Fri Apr 15, 2016 10:10 pm

Slammer wrote:<...>
On the other hand, the MPUs I have used most in the last 25 years of my professional life are the ATmega8 and 89C52... On these machines even the Print class is a luxury... you have to live with basic itoa and ltoa....
Anyway, it may be more useful for our community to try to improve some core functions, like buffered transmit on the UART... (lol, I want something to keep my nights busy....)


The smallest of the stuff I have played with in the 6 years since retirement is the tiny85. I have 2 or 3 tubes of 328P-PU and would likely have used 'em all had it not been for the STM32F103 about 18 months ago. The ESP8266 is a distraction, but a fun one. I have a clone of the NodeMCU made by LoLin and it has annoyed me severely over the past couple of days. But I think it may be tamed now.

I hope you do not think I do not appreciate the printf() code, I do. I'm just one of those people that complains - even when something is good! :lol:

Ray

Slammer

Re: Very lean Serial.printf()

Postby Slammer » Fri Apr 15, 2016 11:00 pm

Look at the results of this code (loop body only):


    long uScount;
    digitalWrite(BOARD_LED_PIN, HIGH); delay(500);                            // LED_on  + half-second
    uScount = micros();
    Serial1.println("This is my big fat text.... 50 characters long....");
    uScount = micros() - uScount;
    Serial.print("\t\t\t\t printf Serial1 : uS="); Serial.println( uScount);
    uScount = micros();
    Serial.println("This is my big fat text.... 50 characters long....");
    uScount = micros() - uScount;
    Serial.print("\t\t\t\t printf USB : uS="); Serial.println( uScount);
    digitalWrite(BOARD_LED_PIN, LOW); delay(500);                             // LED_off + half-second


This code outputs a string of 50 + 2 (for the line ending) characters to Serial and Serial1 (@115200) and measures the time. The result reveals the truth about UART/USB timing:


This is my big fat text.... 50 characters long....
                                 printf USB : uS=97
                                 printf Serial1 : uS=4347
This is my big fat text.... 50 characters long....
                                 printf USB : uS=107
                                 printf Serial1 : uS=4349
This is my big fat text.... 50 characters long....
                                 printf USB : uS=107
                                 printf Serial1 : uS=4344
This is my big fat text.... 50 characters long....
                                 printf USB : uS=97
                                 printf Serial1 : uS=4345
This is my big fat text.... 50 characters long....
                                 printf USB : uS=97
                                 printf Serial1 : uS=4347
This is my big fat text.... 50 characters long....
                                 printf USB : uS=107
                                 printf Serial1 : uS=4349


Serial1 is a real UART, and according to the calculation the 52 bytes require the following (we only have to count 50 character times, because the first character is pushed at once since the register is empty, and after the last one is pushed the function exits without waiting for it):
10 bits/115200 x 50 = 4340 uS
As you can see, the measured time is about 4344-4349 uS. We can say roughly that the overhead of all functions except putc is 4-9 uS......
This proves how important serial buffering is..... the CPU is frozen in the putc function for about 99.9% of the time.
If you look at the duration on the USB Serial, we have a very different situation. It is only about 100 uS. It is like a UART with a baud rate of 5 Mbit.
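
The arithmetic above can be checked with a few lines of C++. This is a rough sketch under the assumption of 8N1 framing (1 start + 8 data + 1 stop = 10 bits per character), which is what reproduces the 4340 uS figure; the function names are made up for illustration.

```cpp
#include <cassert>

// Microseconds a blocking UART write spends waiting for 'chars' frames at
// 'baud'. Default framing is 8N1, i.e. 10 bits on the wire per character.
double uart_wait_us(unsigned chars, unsigned long baud, unsigned bits_per_frame = 10) {
    assert(baud > 0);
    return 1e6 * bits_per_frame * chars / static_cast<double>(baud);
}

// Equivalent bit rate (bits per second) if 'chars' frames leave in 'micros'.
double effective_bps(unsigned chars, double micros, unsigned bits_per_frame = 10) {
    assert(micros > 0.0);
    return 1e6 * bits_per_frame * chars / micros;
}
```

Here uart_wait_us(50, 115200) comes to roughly 4340 uS and uart_wait_us(1, 115200) to roughly 87 uS, while effective_bps(52, 100.0) gives 5.2 Mbit/s for the USB case, in line with the "5Mbit" estimate.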

mrburnette

Re: Very lean Serial.printf()

Postby mrburnette » Sat Apr 16, 2016 2:12 am

The result reveals the truth about Uart/USB timing


Yes, agree.

Ray

victor_pv
Posts: 1055
Joined: Mon Apr 27, 2015 12:12 pm

Re: Very lean Serial.printf()

Postby victor_pv » Sun Jan 15, 2017 4:44 pm

Slammer, I just used your printf function in a small SWO library, I hope you don't mind, I gave you credit ;)

Now, about printf in the Print class: I understand Ray's point that adding 1KB of code here and 1KB there eventually adds up to a good amount that people may not need. But I thought that if a sketch doesn't use printf, the compiler/linker would not include it, so unless the sketch actually uses it, it makes no difference whether the function is present in the Print class or not.
Am I missing something?

Besides that, one other observation: in a test sketch for SWO I tried using sprintf to convert a message to a string so I could then print it with println, and it increased the sketch by roughly 15KB of flash and 1.5KB of RAM. By comparison, Slammer's printf increases the size by only about 1KB.
Given that, if printf doesn't take space when not used, and adding it to the core may save people from having to use sprintf, my vote would go to including it in the core.


EDIT: I compiled my SWO test sketch using println and using printf. In both cases printf is part of the SWO class. It really seems that it doesn't add to the code unless used.
Then I went one step further and included it in the Print class instead, and I get a similar result: the code size does not grow when it is not used. So I see no harm in adding it to the core. I am not sure why Ray's test showed increased size when not used.


printf in SWO class but not used in the sketch (only println):
Sketch uses 15,940 bytes (3%) of program storage space. Maximum is 524,288 bytes.
Global variables use 2,952 bytes of dynamic memory.


printf in SWO class and used:
Sketch uses 16,900 bytes (3%) of program storage space. Maximum is 524,288 bytes.
Global variables use 2,952 bytes of dynamic memory.


printf in print class and used:
Sketch uses 16,908 bytes (3%) of program storage space. Maximum is 524,288 bytes.
Global variables use 2,952 bytes of dynamic memory.


printf in print class and not used:
Sketch uses 15,940 bytes (3%) of program storage space. Maximum is 524,288 bytes.
Global variables use 2,952 bytes of dynamic memory.

