ESP8266 Sans RF

mrburnette
Posts: 1829
Joined: Mon Apr 27, 2015 12:50 pm
Location: Greater Atlanta

ESP8266 Sans RF

Post by mrburnette » Tue Apr 19, 2016 2:51 pm

20160507: Title changed after someone noticed I had mistyped as ESP8622 :oops:

I became obsessed last week with the thought that the very cheap ESP8266 could be useful if it did not have RF consuming so much power.

Crazy? Probably. But after playing around with the idea and finding a very easy mechanism for disabling the RF section, I am now finding that a non-RF ESP8266-12E module in the NodeMCU form factor draws only about 20mA at 80MHz and under 40mA (30'ish) at 160MHz. That is pretty remarkable for the cost.

The specific unit I am using in my test is this one - excepting that I paid $3.11 with free shipping.

So what kind of performance does $3 and pennies provide? I am working on more examples, but I took an STM32duino example that I had ported from the 8-bit AVR world, originally written by the renowned Nick Gammon, and ported that STM32 version to the ESP8266. After adding a single unsigned long variable to time one pass of the program, these are the results on the STM32duino Maple Mini with bootloader 2.0, where power consumption was 40+mA:

Code: Select all

Prime Number Generator
Number of primes in prime table = 53
Last prime in table =  251
Calculating primes through 63001

Total microseconds  = 199887 uS

Using the same programming logic with minor modifications on the ESP8266 set for 80MHz, with power consumption of 20+mA:

Code: Select all

@80MHz

Prime Number Generator by Nick Gammon
ESP8266 port by Ray Burnette
Number of primes in prime table = 53
Last prime in table =  251
Calculating primes through 63001

Starting uS timer now... uS = 3619257
Sketch terminating.

And again with the ESP8266 set for 160MHz and consuming 35+mA:

Code: Select all

@160MHz

Prime Number Generator by Nick Gammon
ESP8266 port by Ray Burnette
Number of primes in prime table = 53
Last prime in table =  251
Calculating primes through 63001

Starting uS timer now... uS = 1813374
Sketch terminating.

Even at 160MHz, the ESP8266 with the RF section off delivers slower computing performance than the STM32F103 running at 72MHz, at similar power consumption.

But one must remember that the ESP8266 has significantly more RAM and flash; a less expensive part with fair performance may mean that an RF-disabled ESP8266 finds its way into a future project.
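
To put the current readings in context, here is a rough sketch of the charge each board draws per benchmark pass (current times run time). It uses only the figures quoted in this post, taking the "20+", "35+" and "40+" mA readings as 20, 35 and 40 mA, so treat it as an order-of-magnitude comparison. The function names are made up for illustration.

```cpp
#include <cstdio>

// Charge drawn during one benchmark pass: current (mA) x run time (s) = mA*s.
// All inputs used below are the approximate figures quoted in this post.
double chargePerRun_mAs(double current_mA, unsigned long elapsed_us) {
    return current_mA * (static_cast<double>(elapsed_us) / 1e6);
}

// Print the three configurations side by side.
void printChargeComparison() {
    std::printf("STM32F103 @72MHz : %.1f mA*s\n", chargePerRun_mAs(40.0, 199887UL));
    std::printf("ESP8266   @80MHz : %.1f mA*s\n", chargePerRun_mAs(20.0, 3619257UL));
    std::printf("ESP8266   @160MHz: %.1f mA*s\n", chargePerRun_mAs(35.0, 1813374UL));
}
```

With those inputs the STM32 comes out near 8 mA*s per pass against roughly 72 and 63 mA*s for the ESP8266, so per unit of work the ESP8266 needs about eight to nine times the charge even though the instantaneous currents are similar.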

STM32duino:
Sketch uses 14,508 bytes (11%) of program storage space. Maximum is 122,880 bytes.
Global variables use 2,632 bytes of dynamic memory. (Bootloader 2.0 leaves 20K available)

ESP8266-duino:
Sketch uses 232,782 bytes (22%) of program storage space. Maximum is 1,044,464 bytes.
Global variables use 31,995 bytes (39%) of dynamic memory, leaving 49,925 bytes for local variables. Maximum is 81,920 bytes.

Very interesting, I think. I am still attempting to validate the remarkable performance of the STM32F103 at less than 0.2 seconds. At first, I thought maybe micros() had overflowed, but repeated testing using millis() yielded 200 ms. Can it be that much faster? I think so.
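
On the overflow worry: on these Arduino cores micros() returns a 32-bit unsigned count, which only wraps after 2^32 microseconds (about 71.6 minutes), and unsigned subtraction gives the right elapsed time even across a wrap. A host-side sketch of that arithmetic follows, with uint32_t standing in for the value micros() returns; elapsedMicros is a hypothetical helper, not something from the sketch.

```cpp
#include <cstdint>

// Elapsed time between two micros()-style readings.
// Unsigned 32-bit subtraction is modulo 2^32, so the result stays correct
// even if the counter wrapped once between 'start' and 'now'.
uint32_t elapsedMicros(uint32_t start, uint32_t now) {
    return now - start;
}
```

For example, a start reading of 4294967000 followed by a reading of 199591 after the wrap still gives 199887, so a short benchmark like this one cannot be shortened by an overflow.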

Ray


ESP8266 test code

Code: Select all

/*
  PrimeNos: by Nick Gammon
  ESP8266 Arduino port m. ray burnette
  OS version Linux Mint 17.3 32-bit tested on 20160419
  Compiled under Arduino 1.6.8
    Sketch uses 232,782 bytes (22%) of program storage space. Maximum is 1,044,464 bytes.
    Global variables use 31,995 bytes (39%) of dynamic memory, leaving 49,925 bytes for local variables. Maximum is 81,920 bytes.
  PUBLIC DOMAIN EXAMPLE
  RF sleep reference: https://github.com/esp8266/Arduino/issues/460#issuecomment-168970936
*/

extern "C" {
#include "user_interface.h"
}

#include <ESP8266WiFi.h>                                    // for the low power function
#include <Streaming.h>
#define BOARD_LED_PIN  2
#define FREQUENCY     80                                    // valid 80, 160

// just add more primes to the prime table for larger search
// byte data type to save memory - use a larger datatype with prime table entries above 255 :)
byte primes[]= { 
    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101,
    103, 107, 109, 113, 127, 131,  137 , 139, 149, 151, 157, 163, 167, 173,  179, 181, 191, 193, 197, 
    199, 211, 223, 227, 229, 233, 239, 241, 251 };

// dividing by the element size keeps the index correct if the datatype of the primes array is changed
// (on the ESP8266 an int is 4 bytes, so the AVR-era "sizeof(primes)/2" would be wrong here)

const unsigned int  TopPrimeIndex = sizeof(primes) / sizeof(primes[0]) - 1;      
const unsigned long TopPrimeSquared = (long)primes[TopPrimeIndex] * (long)primes[TopPrimeIndex];
int primeFlag;
unsigned long t = 0;


void setup()                   
{
    
  WiFi.forceSleepBegin();                                   // turn off ESP8266 RF
  delay(1);                                                 // give RF section time to shutdown
  // yield();                                               // generally same as delay(1) but inconsistent
  system_update_cpu_freq(FREQUENCY);

  pinMode(BOARD_LED_PIN, OUTPUT);
  Serial.begin(115200);

  Serial << endl << endl;
  Serial << "Prime Number Generator by Nick Gammon\r\n";
  Serial << "ESP8266 port by Ray Burnette\r\n\r\n";
  Serial << "Number of primes in prime table = ";
  Serial << TopPrimeIndex << endl << endl;
  Serial << "Last prime in table =  ";
  Serial << (unsigned int)primes[TopPrimeIndex] << endl;
  Serial << "Calculating primes through ";
  Serial << TopPrimeSquared << endl << endl;
  Serial << "Starting uS timer now... ";
  t = micros();
}


void loop()
{
    for (long x = 1; x < TopPrimeSquared; x+=2)             // skips even numbers, including 2, which is prime, but it makes algorithm tad faster
    {
        for (long j=0; j < TopPrimeIndex; j++){
            primeFlag = true;

            if (x == primes[j]) break;

            if (x % primes[j] == 0){                        // if the test number modulo (next number from prime table) == 0 
                primeFlag = false;                          //  then test number is not prime, bailout and check the next number
                break;
            }
            delay(0);
        }
        if (primeFlag == true)                              // found a prime
        {
            digitalWrite(BOARD_LED_PIN,!digitalRead(BOARD_LED_PIN));
            yield();
            // Serial.println(x);
        }
    }
    Serial << "uS = " << micros()-t << endl;
    Serial << "Sketch terminating.  Press Reset to rerun." << endl;
    for(;;) yield() ;                                       // infinite loop with yield to prevent WDT
}
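
For anyone wanting to check the sketch's logic on a PC, here is a minimal host-side version of the same table-driven trial division. It is a sketch under stated assumptions, not the code that produced the timings above: it assumes the table is meant to hold every prime up to 251, it tests against all table entries including the last one, and it treats 0 and 1 as non-prime. The names isPrimeByTable and countPrimesBelow are made up here.

```cpp
#include <cstddef>
#include <cstdint>

// Every prime up to 251 (54 entries), enough to test numbers below 251*251 = 63001.
static const uint8_t kPrimes[] = {
    2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
    73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151,
    157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233,
    239, 241, 251 };
static const std::size_t kPrimeCount = sizeof(kPrimes) / sizeof(kPrimes[0]);

// Trial division against the table. Only meaningful for x below 63001.
bool isPrimeByTable(unsigned long x) {
    if (x < 2) return false;                 // 0 and 1 are not prime
    for (std::size_t j = 0; j < kPrimeCount; ++j) {
        unsigned long p = kPrimes[j];
        if (x == p) return true;             // x is itself a table entry
        if (x % p == 0) return false;        // divisible by a smaller prime
    }
    return true;                             // no table prime divides x
}

// Count primes in [2, limit), e.g. to compare against a known value.
unsigned long countPrimesBelow(unsigned long limit) {
    unsigned long count = 0;
    for (unsigned long x = 2; x < limit; ++x) {
        if (isPrimeByTable(x)) ++count;
    }
    return count;
}
```

A quick sanity check is countPrimesBelow(100), which should give the 25 primes below 100.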
Last edited by mrburnette on Sat May 07, 2016 6:18 pm, edited 1 time in total.

martinayotte
Posts: 1229
Joined: Mon Apr 27, 2015 1:45 pm

Re: ESP8622 Sans RF

Post by martinayotte » Tue Apr 19, 2016 3:28 pm

That is strange!
With the Dhrystone benchmarks done last year, the ESP at 80MHz was only about 10% slower than the STM32F103:
viewtopic.php?f=3&t=76&start=30#p5435

WereCatf
Posts: 166
Joined: Sat Apr 16, 2016 5:30 pm

Re: ESP8622 Sans RF

Post by WereCatf » Tue Apr 19, 2016 3:29 pm

Why do you have a call to

Code: Select all

system_update_cpu_freq(FREQUENCY);

Are you not using the Arduino IDE? It lets you set the CPU speed right from the menu. Also, it doesn't look like you're doing it right: http://bbs.espressif.com/viewtopic.php?t=889 for example says you first select the clock speed by setting a bit in a register, *then* update the system variable with

Code: Select all

system_update_cpu_freq(FREQUENCY);
-- as far as I understood, that function-call itself does not change the CPU-speed, only the divisors the system uses for timers and such.

Slammer
Posts: 251
Joined: Tue Mar 01, 2016 10:35 pm
Location: Athens, Greece

Re: ESP8622 Sans RF

Post by Slammer » Tue Apr 19, 2016 3:36 pm

As a CPU, the ESP8266 is a special piece of hardware.
The core has 80KB of DRAM (data RAM) and 32KB of IRAM (instruction RAM). The DRAM and IRAM segments are separate (Harvard architecture).
Programs are stored in flash memory and, thanks to the fast interface (quad SPI), instructions can be fetched and executed "in place", in other words without copying them to RAM first. On some occasions you cannot execute your code from SPI flash directly, mostly because it needs to run as fast as possible or because it handles the flash memory itself (e.g. writing to it). In that case, you can have some code, 32k in total, copied from flash to IRAM at startup. This code will run from IRAM, not from flash.
Unfortunately the IRAM space is mostly occupied by SDK code, so of the 32k not a lot is left for your own application: about 2k, maybe 3k.
Even with the fast SPI, execution from flash is slow compared with the normal flash (on the CPU bus) of our STM32.
There are some compiler directives to force generated code to execute from IRAM, but you have to respect the limits of the memory.
Something like this is used when we want something in IRAM:

Code: Select all

#define IRAM0     __attribute__((section(".iram0.text")))
void IRAM0 myfunc(void)
{
}
The code will run much faster.
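
Applied to the prime test in this thread, the idea might look like the sketch below. It assumes the build defines ESP8266 (the Arduino core normally does) and that the linker script accepts the `.iram0.text` section name used above; on any other target the macro expands to nothing, so the same file still compiles. divisibleByAny is a hypothetical helper, not part of the posted sketch.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumption: ESP8266 is defined by the toolchain/core when building for the chip.
#if defined(ESP8266)
  #define IRAM0 __attribute__((section(".iram0.text")))
#else
  #define IRAM0   /* no special placement on other targets */
#endif

// Hot inner test: true if any table entry other than x itself divides x.
// Only the code is placed in IRAM; the table itself stays wherever the caller put it.
bool IRAM0 divisibleByAny(unsigned long x, const uint8_t *table, size_t count) {
    for (size_t j = 0; j < count; ++j) {
        if (x != table[j] && x % table[j] == 0) {
            return true;
        }
    }
    return false;
}
```

Keeping the IRAM-resident function this small matters, since only a few kilobytes are free. Whether it actually speeds things up depends on where the time is really being spent.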


Re: ESP8622 Sans RF

Post by mrburnette » Tue Apr 19, 2016 3:42 pm

WereCatf wrote:Why do you have a call to

Code: Select all

system_update_cpu_freq(FREQUENCY);
You're not using the Arduino IDE as that allows you to set the CPU-speed right from the menu. Also, <...>

My skin is pretty thick - I worked for years for a bunch of corporate bastards (non-sexist use of bastard as I include females in my use) but...
IMO source code should be totally self-documenting. Therefore, changing the uC clock from the IDE is simply not proper if it can be changed from code. The test results do show that something is happening 2x as fast going from 80MHz to 160MHz.
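
The two timings from the first post can be checked against the clock ratio directly. A minimal sketch using only those numbers follows; speedup is a made-up helper name.

```cpp
// Ratio of two run times: how many times faster the second run was.
double speedup(unsigned long slow_us, unsigned long fast_us) {
    return static_cast<double>(slow_us) / static_cast<double>(fast_us);
}
```

speedup(3619257UL, 1813374UL) comes out at about 1.996, which is very close to the 2.0 expected if the core clock really doubled.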


Re: ESP8622 Sans RF

Post by mrburnette » Tue Apr 19, 2016 3:45 pm

Slammer wrote:As a CPU, the ESP8266 is a special piece of hardware.
The core has 80KB of DRAM (data RAM) and 32KB of IRAM (instruction RAM). The DRAM and IRAM segments are separate (Harvard architecture).
Programs are stored in flash memory and, thanks to the fast interface (quad SPI), instructions can be fetched and executed "in place", in other words without copying them to RAM first. On some occasions you cannot execute your code from SPI flash directly, mostly because it needs to run as fast as possible or because it handles the flash memory itself (e.g. writing to it). In that case, you can have some code, 32k in total, copied from flash to IRAM at startup. This code will run from IRAM, not from flash.<...>

I knew this in the back of my mind from forum posts over on esp8266.com but did not consciously consider it here. It is a good explanation of what may be happening, and I will try the change and post back. Thanks!

Ray


Re: ESP8622 Sans RF

Post by WereCatf » Tue Apr 19, 2016 3:56 pm

This may sound stupid, but what does delay(0) exactly do? I would imagine passing a zero to a delay()-function would do absolutely nothing, but removing that line speeds things up considerably, so it is obviously doing something.


Re: ESP8622 Sans RF

Post by mrburnette » Tue Apr 19, 2016 3:59 pm

Slammer wrote:<...>
Something like this is used when we want something in IRAM

Code: Select all

#define IRAM0     __attribute__((section(".iram0.text")))
void IRAM0 myfunc(void)
{
}
The code will run much faster.
Results @80MHz ... multiple resets, same results:

Code: Select all

Number of primes in prime table = 53

Last prime in table =  251
Calculating primes through 63001

Starting uS timer now... uS = 3613884
Sketch terminating.  Press Reset to rerun.

Full sketch: Moved to later post... 'cause this 1 was silly wrong.
Last edited by mrburnette on Tue Apr 19, 2016 5:41 pm, edited 1 time in total.


Re: ESP8622 Sans RF

Post by mrburnette » Tue Apr 19, 2016 4:01 pm

WereCatf wrote:...but what does delay(0) exactly do?<...>

With the ESP8266, delay(0) is a synonym for yield(). However, Roger was tearing into the delay() function a few days back and there seems to be more complexity than that - so I think something else happens too. Have not had the time to dig myself.

I used it in the ESP sketch because without it, I was getting WDT resets! The Arduino implementation on ESP8266 is a quasi-task-switching arrangement, less complex than an RTOS, but Arduino sketches must routinely yield() so the RF side and native APIs can get some clock ticks. The Arduino implementation automatically has a yield() embedded in the loop(), so in most cases it happens without any extra programming. But if a delay is created with some looping inside of loop(), the WDT will let you know!


Ray
Last edited by mrburnette on Tue Apr 19, 2016 4:06 pm, edited 1 time in total.


Re: ESP8622 Sans RF

Post by WereCatf » Tue Apr 19, 2016 4:04 pm

mrburnette wrote:With the ESP8266, delay(0) is a synonym for yield();

There must be some overhead somewhere: with yield() at 160MHz I get 1860164 uS, with delay(0) I get 1799641 uS. With neither yield() nor delay(0) I get 380114 uS, which is a considerable difference already.
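
One way to keep the watchdog fed without paying for a yield on every divisor test is to yield once per candidate number, or only every N candidates. The sketch below is host-side: yieldStub() stands in for the core's yield() and just counts calls, so the two placements can be compared. The function names, the shortened divisor table, and any particular interval are illustrative assumptions, and whether a given interval is frequent enough to avoid WDT resets on real hardware would need testing.

```cpp
#include <cstddef>
#include <cstdint>

static unsigned long g_yieldCalls = 0;
static void yieldStub() { ++g_yieldCalls; }     // stand-in for yield()/delay(0)

// Shortened divisor table, just to keep the example small.
static const uint8_t kTable[] = {2, 3, 5, 7, 11, 13};
static const std::size_t kTableLen = sizeof(kTable) / sizeof(kTable[0]);

// Yield inside the inner loop after every divisor that does not divide x
// (the placement used in the posted sketch). Returns the number of yields.
unsigned long yieldsPerDivisor(unsigned long limit) {
    g_yieldCalls = 0;
    for (unsigned long x = 1; x < limit; x += 2) {
        for (std::size_t j = 0; j < kTableLen; ++j) {
            if (x == kTable[j] || x % kTable[j] == 0) break;
            yieldStub();
        }
    }
    return g_yieldCalls;
}

// Yield only once every 'interval' candidate numbers. Returns the number of yields.
unsigned long yieldsPerInterval(unsigned long limit, unsigned long interval) {
    g_yieldCalls = 0;
    unsigned long sinceYield = 0;
    for (unsigned long x = 1; x < limit; x += 2) {
        for (std::size_t j = 0; j < kTableLen; ++j) {
            if (x == kTable[j] || x % kTable[j] == 0) break;
        }
        if (++sinceYield >= interval) {
            sinceYield = 0;
            yieldStub();
        }
    }
    return g_yieldCalls;
}
```

For the range used in this thread (odd numbers below 63001), yielding once per candidate means 31500 calls, and yielding every 256 candidates means 123, against well over 31500 for the per-divisor placement.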
