ESP8266 Sans RF

mrburnette
Posts: 1885
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Re: ESP8266 Sans RF

Post by mrburnette » Tue Apr 19, 2016 4:08 pm

WereCatf wrote:
mrburnette wrote:With the ESP8266, delay(0) is a synonym for yield();
Must be some overhead somewhere; with yield() at 160MHz I get 1860164us, with delay(0) I get 1799641us. With neither yield() nor delay(0) I get 380114us, which is a considerable difference already.

Yep. I was profiling some of this last week for my own curiosity. But, with Arduino, leaving the yield out is not an option if loop() does not repeat every 30-50ms.

So benchmarks must include this rather nasty task switching time.

Ray
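
As a rough check of the figures quoted above, here is a minimal sketch of the arithmetic. It assumes all three timings were taken at 160MHz on the same loop; the helper names are made up for illustration.

```cpp
#include <cstdint>

// Timings quoted above, in microseconds (assumed: all at 160MHz).
constexpr uint32_t kWithYieldUs  = 1860164;  // yield() in the inner loop
constexpr uint32_t kWithDelay0Us = 1799641;  // delay(0) in the inner loop
constexpr uint32_t kBareLoopUs   = 380114;   // neither call

// Extra time attributable to the call, relative to the bare loop.
constexpr uint32_t overheadUs(uint32_t withCallUs, uint32_t bareUs) {
  return withCallUs - bareUs;
}

// Share of the total run time spent on that overhead, in percent.
constexpr double overheadPercent(uint32_t withCallUs, uint32_t bareUs) {
  return 100.0 * overheadUs(withCallUs, bareUs) / withCallUs;
}
```

With these numbers the yield() run spends roughly four fifths of its time outside the prime search itself, which is why the choice of watchdog call dominates the benchmark.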

martinayotte
Posts: 1229
Joined: Mon Apr 27, 2015 1:45 pm

Re: ESP8266 Sans RF

Post by martinayotte » Tue Apr 19, 2016 4:14 pm

mrburnette wrote:
Slammer wrote:<...>
Something like this is used when we want something in IRAM

Code: Select all

#define IRAM0     __attribute__((section(".iram0.text")))
void IRAM0 myfunc(void)
{
}
The code will run much faster.
Results @80MHz ... multiple resets, same results:
Ray, this myfunc() is still empty ...
It is the code from the loop() that should be placed in there, and the loop() will call myfunc() .

WereCatf
Posts: 166
Joined: Sat Apr 16, 2016 5:30 pm

Re: ESP8266 Sans RF

Post by WereCatf » Tue Apr 19, 2016 4:16 pm

mrburnette wrote:Yep. I was profiling some of this last week for my own curiosity. But, with Arduino, leaving the yield out is not an option if loop() does not repeat every 30-50ms.

So benchmarks must include this rather nasty task switching time.
Hm, well, you're not comparing CPU speeds then, you're comparing implementations of the delay() function. On the ESP in a tight loop like this, with WiFi off, calling ESP.wdtFeed() is a much faster way of ensuring the watchdog doesn't kick in than calling either yield() or delay(), but even then it still has some overhead.

Other Arduinos don't offer a way of just feeding the watchdog, then?
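
One way to cut that overhead further is to feed the watchdog only every so many iterations instead of on every pass. A minimal sketch, assuming WiFi is off so no background work depends on yield(); `countPrimesBelow` and `feedEvery` are hypothetical names, and a safe interval would have to be found by experiment. Off the ESP8266 the feed is a no-op, so the same workload can be checked on a PC.

```cpp
#include <cstdint>

#ifdef ARDUINO_ARCH_ESP8266
#include <Arduino.h>
static inline void feedWatchdog() { ESP.wdtFeed(); }
#else
static inline void feedWatchdog() {}  // host build: nothing to feed
#endif

// Counts the primes below `limit` by trial division and feeds the
// watchdog once every `feedEvery` odd candidates (0 or 1 = every one).
uint32_t countPrimesBelow(uint32_t limit, uint32_t feedEvery) {
  uint32_t count = (limit > 2) ? 1 : 0;  // account for 2
  uint32_t sinceFeed = 0;
  for (uint32_t n = 3; n < limit; n += 2) {
    bool isPrime = true;
    for (uint32_t d = 3; d * d <= n; d += 2) {
      if (n % d == 0) { isPrime = false; break; }
    }
    if (isPrime) ++count;
    if (++sinceFeed >= feedEvery) {
      feedWatchdog();
      sinceFeed = 0;
    }
  }
  return count;
}
```

Returning the count makes it easy to confirm that two boards really did the same work before comparing their times.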

WereCatf
Posts: 166
Joined: Sat Apr 16, 2016 5:30 pm

Re: ESP8266 Sans RF

Post by WereCatf » Tue Apr 19, 2016 4:17 pm

martinayotte wrote:It is the code from the loop() that should be placed in there, and the loop() will call myfunc() .
You can just do

Code: Select all

void ICACHE_RAM_ATTR loop()
-- the difference in execution-time is minimal, it only changes things a little.

mrburnette
Posts: 1885
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Re: ESP8266 Sans RF

Post by mrburnette » Tue Apr 19, 2016 4:22 pm

martinayotte wrote:
mrburnette wrote:
Slammer wrote:<...>
Something like this is used when we want something in IRAM

Code: Select all

#define IRAM0     __attribute__((section(".iram0.text")))
void IRAM0 myfunc(void)
{
}
The code will run much faster.
Results @80MHz ... multiple resets, same results:
Ray, this myfunc() is still empty ...
It is the code from the loop() that should be placed in there, and the loop() will call myfunc() .
I'm an idiot! Was trying to post some results quickly so I could run to kitchen and make myself a sandwich! I'm stupid, stupid, .... but not hungry anymore :lol:

Slammer
Posts: 255
Joined: Tue Mar 01, 2016 10:35 pm
Location: Athens, Greece

Re: ESP8266 Sans RF

Post by Slammer » Tue Apr 19, 2016 4:51 pm

Watch the program and data size before/after adding IRAM0 or ICACHE_RAM_ATTR to check whether the function is really located in iram0.
Maybe the linker script does not accept your function, as there are some limitations.... anyway I am a newbie with Esp8266, I am just blinking a led.... :roll:
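
Another way to check placement is to look at the function's address. A minimal sketch, assuming the usual ESP8266 address map (32 KiB of instruction RAM starting at 0x40100000 and cached flash mapped from 0x40200000; these ranges are not stated in this thread and should be verified against the linker script in use). The helper names are hypothetical.

```cpp
#include <cstdint>

// Assumed ESP8266 address ranges (verify against your linker script).
constexpr uint32_t kIramStart  = 0x40100000u;  // instruction RAM
constexpr uint32_t kIramEnd    = 0x40108000u;  // one past the end (32 KiB)
constexpr uint32_t kFlashStart = 0x40200000u;  // memory-mapped flash
constexpr uint32_t kFlashEnd   = 0x40300000u;  // one past the end (1 MiB window)

// True if the address lies in instruction RAM.
constexpr bool isInIram(uint32_t addr) {
  return addr >= kIramStart && addr < kIramEnd;
}

// True if the address lies in the cached flash window.
constexpr bool isInFlash(uint32_t addr) {
  return addr >= kFlashStart && addr < kFlashEnd;
}
```

On the board, one could print the address of the function in hex over Serial and pass it to these helpers; an address at 0x402xxxxx would mean the attribute was not honoured.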

mrburnette
Posts: 1885
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Re: ESP8266 Sans RF

Post by mrburnette » Tue Apr 19, 2016 5:24 pm

WereCatf wrote:
mrburnette wrote: <...>
Hm, well, you're not comparing CPU-speeds then,<...>

With RISC architecture, clock speed is directly compared. What I am trying to compare is effective computing power - a kind of "workload" indication. Some folks call 'em benchmarks, but everyone in Enterprise computing knows that the fastest clocked machines may not be the fastest computing machine; throughput involves Systems Architecture which is not pure clock. Therefore, in the enterprise, one would run a specific benchmark on a computer based upon the type of workload that computer will generally experience the majority of the time.

From TechTarget:
A defined workload can be specified as a benchmark when evaluating a computer system in terms of performance (how easily the computer handles the workload), which in turn is generally divided into response time (the time between a user request and a response to the request from the system) and throughput (how much work is accomplished over a period of time).
So, if a sketch is considered a workload, then it can rightfully be considered a benchmark for performance and we would recognize that sketch as the benchmarking application. A different performance will likely be obtained by reconfiguring the workload - maybe a different algorithm, a sketch that utilizes floating point, long integers, etc.

My intent is to measure an identical workload on the STM32duino and as-close-as-possible, the same workload on the ESP8266, but without the RF section being "on" and consuming 50+mA of wasted current.
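
To keep the workload identical across boards, the timing harness can be kept separate from the workload and the result checked on each platform. A minimal sketch, not the sketch used in this thread: `workload`, `runBenchmark` and `BenchResult` are hypothetical names, and the placeholder workload only stands in for the real one. It builds under Arduino (using micros()) or on a PC (using std::chrono).

```cpp
#include <cstdint>

#ifdef ARDUINO
#include <Arduino.h>
static inline uint32_t nowMicros() { return micros(); }
#else
#include <chrono>
static inline uint32_t nowMicros() {
  using namespace std::chrono;
  return static_cast<uint32_t>(
      duration_cast<microseconds>(steady_clock::now().time_since_epoch())
          .count());
}
#endif

// Placeholder integer workload; returns a checksum so every platform
// can prove it executed the same computation.
uint32_t workload(uint32_t n) {
  uint32_t acc = 0;
  for (uint32_t i = 1; i <= n; ++i) acc = acc * 31u + (i % 7u);
  return acc;
}

struct BenchResult {
  uint32_t elapsedUs;  // wall time for the run
  uint32_t checksum;   // must match across platforms
};

BenchResult runBenchmark(uint32_t n) {
  const uint32_t start = nowMicros();
  const uint32_t checksum = workload(n);
  const uint32_t elapsed = nowMicros() - start;  // unsigned math survives wrap
  return BenchResult{elapsed, checksum};
}
```

Only runs whose checksums agree are worth comparing on elapsed time.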

My sandwich is consumed as is the nice bowl of butter pecan ice cream. Like my momma told me as a boy at the beach, "... after eating, you must rest your tummy so you do not get cramps!" Momma was right, I can already feel the cramps in my fingers - so, I'm going to break for a while (or day) and come back to this with a clear mind... ice cream always gives me a sugar high: if you eat ice cream, do not type; if you type, do not eat ice cream... it may cause typos.


Ray

This sketch compiles, links, & uploads but fails with stack dump:

Code: Select all

/*
  PrimeNos: by Nick Gammon
  ESP8266 Arduino port m. ray burnette
  OS version Linux Mint 17.3 32-bit tested on 20160419
  Compiled under Arduino 1.6.8
    Sketch uses 232,706 bytes (22%) of program storage space. Maximum is 1,044,464 bytes.
    Global variables use 32,063 bytes (39%) of dynamic memory, leaving 49,857 bytes for local variables. Maximum is 81,920 bytes.
  PUBLIC DOMAIN EXAMPLE
  RF sleep reference: https://github.com/esp8266/Arduino/issues/460#issuecomment-168970936
*/

extern "C" {
#include "user_interface.h"
}

#define IRAM0         __attribute__((section(".iram0.text")))
#define BOARD_LED_PIN  2
#define FREQUENCY     80                                    // valid 80, 160

#include <ESP8266WiFi.h>                                    // for the low power function
#include <Streaming.h>


// just add more primes to the prime table for larger search
// byte data type to save memory - use a larger datatype with prime table entries above 255 :)
byte primes[]= { 
    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101,
    103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 
    199, 211, 223, 227, 229, 233, 239, 241, 251 };

// if you change the datatype of primes array to int, change next line to 
// "const int TopPrimeIndex = (sizeof(primes)/2) - 1;"

const unsigned int  TopPrimeIndex = sizeof(primes) - 1;      
const unsigned long TopPrimeSquared = (long)primes[TopPrimeIndex] * (long)primes[TopPrimeIndex];
int primeFlag;
unsigned long t = 0;

// prototype
void myfunc(void);

void setup()                   
{
    
  WiFi.forceSleepBegin();                                   // turn off ESP8266 RF
  delay(1);                                                 // give RF section time to shutdown
  // yield();                                               // generally same as delay(1) but inconsistent
  system_update_cpu_freq(FREQUENCY);

  pinMode(BOARD_LED_PIN, OUTPUT);
  Serial.begin(115200);

  Serial << endl << endl;
  Serial << "Prime Number Generator by Nick Gammon\r\n";
  Serial << "ESP8266 port by Ray Burnette\r\n\r\n";
  Serial << "Number of primes in prime table = ";
  Serial << TopPrimeIndex << endl << endl;
  Serial << "Last prime in table =  ";
  Serial << (unsigned int)primes[TopPrimeIndex] << endl;
  Serial << "Calculating primes through ";
  Serial << TopPrimeSquared << endl << endl;
  Serial << "Starting uS timer now... ";
  t = micros();
}


void loop() {
    myfunc();
    for(;;) yield() ;                                       // infinite loop with yield to prevent WDT
}


void IRAM0 myfunc(void)
{
 for (long x = 1; x < TopPrimeSquared; x+=2)             // skips even numbers, including 2, which is prime, but it makes algorithm tad faster
      {
            for (long j=0; j < TopPrimeIndex; j++){
            primeFlag = true;

            if (x == primes[j]) break;

            if (x % primes[j] == 0){                        // if the test number modulo (next number from prime table) == 0 
                primeFlag = false;                          //  then test number is not prime, bailout and check the next number
                break;
            }
            delay(0);
        }
        if (primeFlag == true)                              // found a prime
        {
            digitalWrite(BOARD_LED_PIN,!digitalRead(BOARD_LED_PIN));
            yield();
            // Serial.println(x);
        }
      }
    Serial << "uS = " << micros()-t << endl;
    Serial << "Sketch terminating.  Press Reset to rerun." << endl;
}
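
A hand-typed prime table like the one in this sketch is easy to get wrong, and a bad entry silently changes the workload. A minimal host-side check, offered as a sketch only; `isPrime` and `firstBadEntry` are hypothetical helpers, not part of the original program.

```cpp
#include <cstddef>
#include <cstdint>

// Trial-division primality test; adequate for byte-sized table entries.
bool isPrime(uint32_t n) {
  if (n < 2) return false;
  for (uint32_t d = 2; d * d <= n; ++d) {
    if (n % d == 0) return false;
  }
  return true;
}

// Returns the index of the first entry that is not prime, is out of
// order, or skips a prime since the previous entry. Returns `count`
// if the table passes all three checks.
size_t firstBadEntry(const uint8_t* table, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (!isPrime(table[i])) return i;
    if (i > 0) {
      if (table[i] <= table[i - 1]) return i;
      for (uint32_t n = table[i - 1] + 1u; n < table[i]; ++n) {
        if (isPrime(n)) return i;  // a prime was skipped
      }
    }
  }
  return count;
}
```

Calling `firstBadEntry(primes, sizeof(primes))` on a PC before flashing would point at the first suspicious entry.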

Output:

Code: Select all

Prime Number Generator by Nick Gammon
ESP8266 port by Ray Burnette

Number of primes in prime table = 53

Last prime in table =  251
Calculating primes through 63001

Starting uS timer now... 
Exception (0):
epc1=0x40107299 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000000 depc=0x00000000

ctx: cont 
sp: 3ffefb20 end: 3ffefcf0 offset: 01a0

>>>stack>>>
3ffefcc0:  3fffdad0 00000000 3ffeec90 4020247e  
3ffefcd0:  feefeffe feefeffe 3ffeecb5 40202c18  
3ffefce0:  feefeffe feefeffe 3ffeecd0 40100958  
<<<stack<<<

 ets Jan  8 2013,rst cause:2, boot mode:(1,6)


 ets Jan  8 2013,rst cause:4, boot mode:(1,6)

wdt reset
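
For reading boot lines like the ones above, a small decoder can help. This is a sketch under assumptions: the cause names follow the meanings commonly documented for the ESP8266 boot message (1 normal boot, 2 reset pin, 3 software reset, 4 watchdog reset) and are not taken from this thread; `parseResetCause` and `resetCauseName` are hypothetical helpers.

```cpp
#include <cstdlib>
#include <cstring>

// Extracts the number after "rst cause:" from a boot line,
// or returns -1 if the line does not contain that key.
int parseResetCause(const char* line) {
  const char* key = "rst cause:";
  const char* p = std::strstr(line, key);
  if (p == nullptr) return -1;
  return std::atoi(p + std::strlen(key));
}

// Maps a cause number to a short description (assumed meanings).
const char* resetCauseName(int cause) {
  switch (cause) {
    case 1: return "normal boot";
    case 2: return "reset pin";
    case 3: return "software reset";
    case 4: return "watchdog reset";
    default: return "unknown";
  }
}
```

Under those assumptions, the second boot line in the output above decodes to a watchdog reset, which agrees with the "wdt reset" line that follows it.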

RogerClark
Posts: 7481
Joined: Mon Apr 27, 2015 10:36 am
Location: Melbourne, Australia

Re: ESP8266 Sans RF

Post by RogerClark » Tue Apr 19, 2016 9:41 pm

Lol...

I think the quality of the ESP8266 postings here is far better than on esp8266.com ;-)

I think virtually all of us use the ESP8266 as well as the STM32.

I designed a PCB that has a Maple mini, an ESP-12, an ILI9341 and an nRF905, which I use for various projects (though it was originally designed just as a display).

If the ESP8266 had a better ADC and USB, I could probably have removed the STM32 from my design entirely.

But there are still some things like real time data collection that the ESP8266 is not ideally suited for.

martinayotte
Posts: 1229
Joined: Mon Apr 27, 2015 1:45 pm

Re: ESP8266 Sans RF

Post by martinayotte » Wed Apr 20, 2016 7:41 pm

Hi Ray,

I figured it out! The macro should not be:

Code: Select all

#define IRAM0         __attribute__((section(".iram0.text")))
but (note .iram0.text vs .iram.text):

Code: Select all

#define IRAM0         __attribute__((section(".iram.text")))
Unfortunately, it didn't give much speed improvement, nor did it with the previous "dhry21a".

Maybe that is because the Flash Cache is already doing a nice job of keeping the repeatedly executed code in the cache, so in both cases it was already in RAM after the first iteration.

Martin.
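
Putting the corrected section name together with the earlier advice (hot code goes in the function, loop() calls it), a minimal sketch could look like this. `IRAM_FN` and `hotWork` are hypothetical names; the macro expands to nothing off the ESP8266 so the function can be tested on a PC, and the section name is the one given above, which may differ in other core versions.

```cpp
#include <cstdint>

#ifdef ARDUINO_ARCH_ESP8266
#include <Arduino.h>
#define IRAM_FN __attribute__((section(".iram.text")))
#else
#define IRAM_FN  // host build: no special section
#endif

// The hot loop lives in the IRAM function, not in loop() itself.
// Placeholder work: sum of the odd numbers below `limit`.
uint32_t IRAM_FN hotWork(uint32_t limit) {
  uint32_t sum = 0;
  for (uint32_t n = 1; n < limit; n += 2) sum += n;
  return sum;
}

#ifdef ARDUINO_ARCH_ESP8266
void setup() { Serial.begin(115200); }

void loop() {
  Serial.println(hotWork(1000));  // loop() only calls into IRAM code
  delay(1000);                    // returns to the core regularly
}
#endif
```

As noted above, the flash cache may already hide most of the difference, so timing both variants is the only way to know whether the placement pays off.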

mrburnette
Posts: 1885
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

Re: ESP8266 Sans RF

Post by mrburnette » Wed Apr 20, 2016 8:09 pm

martinayotte wrote: <...>

Code: Select all

#define IRAM0         __attribute__((section(".iram.text")))
Unfortunately, it didn't give much speed improvement, nor did it with the previous "dhry21a". Maybe that is because the Flash Cache is already doing a nice job of keeping the repeatedly executed code in the cache, so in both cases it was already in RAM after the first iteration / Martin.
Wow ... thank you Martin. I have not tried it yet, but will. I got 2 new NodeMCU boards in yesterday, ordered less than 2 weeks ago. I'm going to write up the findings in the Off Topic / ESP8266 based Wireless module with CH340 thread... but the essential point is that the 1st one received is defective, which explains why it would not respect the Boards entry for NodeMCU 1.0. The second one is identically marked but came in a completely different antistatic bag, and it works perfectly. My guess is the first one was a rework and the second one is prime.

Ray

New, good NodeMCU attached:
Attachments
NewNodeMCU-Works.jpg (240.04 KiB) Viewed 666 times
