[STM32GENERIC/HAL] SerialUSB TX/RX speed problem

Post by danieleff » Tue Aug 15, 2017 6:21 am

Has anyone actually tried just raising the timeout from milliseconds to seconds?
replace the SerialUSB code with the one from STM
SerialUSB sits on top of the whole STM code.
And the STM CDC code does not have buffered writes (CDC_Transmit_FS() sends immediately, which was the initial problem). That is why I had to hack USBSerial_Tx_Handler into the STM CDC code: when the current TX finishes, it checks whether there is more to send and sends it from the USB interrupt instead of from SerialUSBClass::write. (SerialUSBClass::write will not send if it knows there is an ongoing transmission, via the transmission variable.)
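
A minimal host-side simulation of the scheme described above, as a sketch only: `BufferedTx`, `start_next`, `on_tx_complete` and `kMaxChunk` are hypothetical names, not the STM32GENERIC or STM CDC code, and the `sent` vector stands in for the USB hardware. It shows write() starting a transfer only when none is in flight, and the transfer-complete handler chaining the next one.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <deque>
#include <vector>

// Hypothetical per-transfer chunk size, chosen only for this illustration.
constexpr size_t kMaxChunk = 64;

struct BufferedTx {
    std::deque<uint8_t> pending;             // queued by write(), not yet handed to USB
    bool transmitting = false;               // true while a transfer is in flight
    std::vector<std::vector<uint8_t>> sent;  // stand-in for the hardware: one entry per transfer

    // Hand the next chunk to the "hardware", or go idle if nothing is queued.
    void start_next() {
        if (pending.empty()) {
            transmitting = false;
            return;
        }
        size_t n = pending.size() < kMaxChunk ? pending.size() : kMaxChunk;
        sent.emplace_back(pending.begin(), pending.begin() + n);
        pending.erase(pending.begin(), pending.begin() + n);
        transmitting = true;
    }

    // Application side: always queue; only kick the hardware when it is idle.
    size_t write(const uint8_t *data, size_t len) {
        assert(data != nullptr || len == 0);
        pending.insert(pending.end(), data, data + len);
        if (!transmitting) start_next();
        return len;
    }

    // Would be called from the USB "transfer complete" interrupt.
    void on_tx_complete() { start_next(); }
};
```

On a real MCU the queue would be a fixed-size buffer rather than a `std::deque`, and write() would need to guard the check of `transmitting` against the interrupt firing in between.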

Post by victor_pv » Wed Aug 16, 2017 2:07 am

danieleff wrote:
Tue Aug 15, 2017 6:21 am
Has anyone actually tried just raising the timeout from milliseconds to seconds?
replace the SerialUSB code with the one from STM
SerialUSB sits on top of the whole STM code.
And the STM CDC code does not have buffered writes (CDC_Transmit_FS() sends immediately, which was the initial problem). That is why I had to hack USBSerial_Tx_Handler into the STM CDC code: when the current TX finishes, it checks whether there is more to send and sends it from the USB interrupt instead of from SerialUSBClass::write. (SerialUSBClass::write will not send if it knows there is an ongoing transmission, via the transmission variable.)
Can't we use the head and tail to determine if there is more in the buffer rather than the transmission variable?
If we only have one function pulling data (TX_Handler) and one function adding data (SerialUSB::write), and we don't allow the head to hit the tail, then we can always know what's currently in the buffer, even with interrupts and whatnot.
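
A minimal sketch of that head/tail idea, assuming a single producer and a single consumer. `TxRing`, `push` and `pop` are hypothetical names for illustration, not the actual SerialUSB code. One slot is always left empty, so `head == tail` can only mean "empty" and no separate flag is needed.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical single-producer/single-consumer byte ring.
template <size_t N>
struct TxRing {
    static_assert(N >= 2, "need at least one usable slot");

    uint8_t data[N];
    volatile size_t head = 0;  // written only by the producer (write side)
    volatile size_t tail = 0;  // written only by the consumer (TX handler side)

    size_t used() const { return (head + N - tail) % N; }
    size_t free_space() const { return N - 1 - used(); }

    // Producer: queue as much as fits and report how many bytes were taken.
    size_t push(const uint8_t *src, size_t len) {
        assert(src != nullptr || len == 0);
        size_t n = 0;
        size_t h = head;
        while (n < len) {
            size_t next = (h + 1) % N;
            if (next == tail) break;  // head would hit the tail: buffer is full
            data[h] = src[n++];
            h = next;
        }
        head = h;  // publish the new bytes only after they are stored
        return n;
    }

    // Consumer: pull up to max bytes out of the ring.
    size_t pop(uint8_t *dst, size_t max) {
        size_t n = 0;
        size_t t = tail;
        while (n < max && t != head) {
            dst[n++] = data[t];
            t = (t + 1) % N;
        }
        tail = t;
        return n;
    }
};
```

Because each index has exactly one writer, the other side can read it at any time and at worst sees a slightly stale value, which only makes the buffer look a little fuller or emptier than it is. On cores with reordering or caches, memory barriers may also be needed; that is outside this sketch.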


About the code, I didn't know that's what STM uses, since it shows Vassilis as the author. I thought STM had written their own.

Post by victor_pv » Wed Aug 16, 2017 4:46 pm

Everyone in this thread, can we first agree what's the desirable behavior for USB TX and RX in case the buffers fill?
My preference:
For TX, write should return right away, with the return value indicating how many bytes it could queue. If 0 bytes, return 0; for X bytes, return X. (So this involves taking out the timeout and leaving timeouts or retries to the application.)

For RX, if buffer is full, NAK the last host packet so it does not send another one. Once the buffer has enough capacity to receive at least 1 more packet, issue the ACK for the previous packet so the host can send a new one. If the application starts reading bytes from the RX buffer at the point it gets enough space for another packet, then issue the NAK and keep going.

The above is how I have modified Steve's F4 RX code to work (not the TX for now).
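
A minimal sketch of the RX side of that proposal, as a host-side simulation. `RxFlow`, `on_packet` and `rearm_endpoint` are hypothetical names, and the buffer size is an assumption. The 64-byte packet size is the full-speed bulk maximum. The idea is that the OUT endpoint is only re-armed while the buffer has room for one more full packet, so the host keeps getting NAKs until the application has read enough.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

constexpr size_t kPacketSize = 64;   // full-speed bulk max packet size
constexpr size_t kRxCapacity = 256;  // hypothetical RX buffer size

struct RxFlow {
    uint8_t buf[kRxCapacity];
    size_t head = 0;
    size_t tail = 0;
    size_t count = 0;
    bool armed = true;  // true: the endpoint will accept (ACK) the next packet

    // Stand-in for the driver call that re-enables reception on the OUT endpoint.
    void rearm_endpoint() { armed = true; }

    // Would be called from the USB "packet received" interrupt.
    void on_packet(const uint8_t *pkt, size_t len) {
        assert(armed && len <= kPacketSize && len <= kRxCapacity - count);
        armed = false;  // the hardware NAKs further packets until re-armed
        for (size_t i = 0; i < len; i++) {
            buf[head] = pkt[i];
            head = (head + 1) % kRxCapacity;
        }
        count += len;
        // Only accept another packet if a full one is guaranteed to fit.
        if (kRxCapacity - count >= kPacketSize) rearm_endpoint();
    }

    // Application side: reading frees space and may resume reception.
    size_t read(uint8_t *dst, size_t max) {
        size_t n = 0;
        while (n < max && count > 0) {
            dst[n++] = buf[tail];
            tail = (tail + 1) % kRxCapacity;
            count--;
        }
        if (!armed && kRxCapacity - count >= kPacketSize) rearm_endpoint();
        return n;
    }
};
```

In a real driver `count` is shared between the interrupt and the application, so updates to it would need to be protected. With the STM CDC class the re-arm step would presumably map to `USBD_CDC_ReceivePacket()`, but that mapping is an assumption here.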

Post by danieleff » Thu Aug 17, 2017 7:33 am

victor_pv wrote:
Wed Aug 16, 2017 4:46 pm
Everyone in this thread, can we first agree what's the desirable behavior for USB TX and RX in case the buffers fill?
My preference:
For TX, write should return right away, with the return value indicating how many bytes it could queue. If 0 bytes, return 0; for X bytes, return X. (So this involves taking out the timeout and leaving timeouts or retries to the application.)
The return values are OK, but a small timeout should be there. Nobody will ever ever do retries with Serial.print/write(...).
victor_pv wrote:
Wed Aug 16, 2017 4:46 pm
For RX, if buffer is full, NAK the last host packet so it does not send another one. Once the buffer has enough capacity to receive at least 1 more packet, issue the ACK for the previous packet so the host can send a new one. If the application starts reading bytes from the RX buffer at the point it gets enough space for another packet, then issue the NAK and keep going.

The above is how I have modified Steve's F4 RX code to work (not the TX for now).
You can try to do this, but that is deeply inside STM CDC code. (I think. I did not check actually)

As for the dropped data, I was finally able to set up a test where I can actually see the problem. (Using `for (i = 0; i < TXCHARS; i++) Serial.write('0' + (i % 10));` plus TeraTerm, I do not need to log and can see the 0123456789 pattern get corrupted.)
BTW, the problem persists even if I comment out the timeout, so it's not that.

Post by danieleff » Thu Aug 17, 2017 6:06 pm

The following code:

Code:

// USBSERIAL TX PROBLEM DEMONSTRATION
// Pito 7/2017

#include "Arduino.h"

void setup() {
  Serial.begin(115200);
  delay(3000);
}

#define TXCHARS 1000000

void loop() {
  uint32_t i;
  uint32_t elapsed = micros();

  for (i = 0; i < TXCHARS; i++) {
    Serial.write('0' + (i % 10));
  }

  elapsed = micros() - elapsed;
  Serial.println("***");
  Serial.print("USB TX speed = ");
  Serial.print((1000.0 * TXCHARS) / elapsed, 2);
  Serial.println(" KBytes/sec");
  delay(5000);
}

With `#define CDC_SERIAL_BUFFER_SIZE 1024 //USBSerial buffer data length` on an F407VE, the output in TeraTerm gets corrupted:
teraterm.png

However if I open Arduino IDE serial monitor, copy paste to Notepad, the Col shows 1000000 received:
notepad.png
So the bytes are sent, but not received in TeraTerm. I do not know whether there is a control flag that tells the device that data it sent was dropped by the USB host application. If there is, it is way below the buffering level of SerialUSB.cpp, inside the usb/*.c files (STM code), which I am not familiar with.

Post by victor_pv » Thu Aug 17, 2017 6:47 pm

Different applications may pull the RX packets from the host buffer at different rates, so it's possible that the buffer fills up when using TeraTerm and the TX code then drops bytes, while the Arduino IDE pulls them faster and never lets the TX buffer fill up.
We can add some flag in the TX code to see if the buffer ever gets full.

When using the Arduino IDE, do you ever see byte sequences out of order? (Probably just replacing 0123456789 with "" or "ok" and seeing what you have left.)
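
The same check can be done programmatically on captured output. This is a minimal sketch with hypothetical names (`PatternReport`, `check_pattern`). It assumes the capture comes from the test sketch above, so it stops at the first `*` to skip the "***" and speed lines that follow the pattern.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string>

struct PatternReport {
    size_t digits = 0;  // digit characters seen before the "***" marker
    size_t breaks = 0;  // places where a digit was not (previous + 1) % 10
};

// Scan captured terminal output for breaks in the repeating 0123456789 pattern.
// Non-digit characters (line breaks added by the terminal, etc.) are ignored.
PatternReport check_pattern(const std::string &captured) {
    PatternReport r;
    int expected = -1;  // -1: no digit seen yet, so any digit is accepted
    for (char c : captured) {
        if (c == '*') break;  // end of the pattern in the test sketch's output
        if (c < '0' || c > '9') continue;
        int d = c - '0';
        if (expected >= 0 && d != expected) ++r.breaks;
        expected = (d + 1) % 10;
        ++r.digits;
    }
    return r;
}
```

A loss of exactly a multiple of 10 bytes leaves the pattern intact, so `digits` should also be compared against the number of characters sent (TXCHARS).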

Post by danieleff » Fri Aug 18, 2017 5:09 am

This is not about the TX buffer. The following code bypasses the whole SerialUSB class when sending, with no Serial.xxx call at all apart from Serial.begin():

Code:

#include "Arduino.h"

char buffer[200];

#define TX 1000

void setup() {
  Serial.begin(115200);
  delay(3000);

  memset(buffer, '.', sizeof(buffer));

  for(size_t i=0; i<sizeof(buffer) / 10; i++) {
      buffer[i * 10] = '0' + (i % 10);
  }

  buffer[sizeof(buffer) - 2] = '\r';
  buffer[sizeof(buffer) - 1] = '\n';

}

void loop() {
    for(size_t i=0; i<TX; i++) {

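        // Overwrite the start of the line with a sequence number and a timestamp.
        // Note: sprintf also stores a terminating '\0' at buffer[19], which is sent with the rest.
        sprintf(buffer, "[%6d %10lu]", (int)i, (unsigned long)micros());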

        while(CDC_Transmit_FS((uint8_t*)buffer, 200) != USBD_OK);

        USBD_CDC_HandleTypeDef *hcdc = (USBD_CDC_HandleTypeDef*)hUsbDeviceFS.pClassData;
        while (hcdc->TxState != 0); // Wait for USB transfer to finish

    }

    delay(5000);
}

In the Arduino IDE Serial Monitor it is fine.
In TeraTerm:
teraterm2.png
The chip is just sending data too fast.
Frankly, I don't think it is possible (at this level) to know that, oops, the first half of the message was not delivered.

Post by stevestrong » Fri Aug 18, 2017 7:00 am

I agree, too, that no one will ever try to resend data when outputting over serial.
That is, I think sending over serial should be kept as simple as possible: either blocking or non-blocking.

The F4 serial USB was originally configured as non-blocking, but I changed this because I was missing some data on the host side. After that change, reception on the host side was OK.

The chip should send data as fast as it can.
Also, the host should read that data as fast as it can. Failing to do that will of course result in data loss.
If at least one host application is able to read all data, this means that the chip is working fine.

It looks like teraterm has problems on Win10?

Post by Pito » Fri Aug 18, 2017 6:22 pm

CommEcho is losing data too (see above from my Win7_64bit).
Could you try on your machines - the CommEcho - whether it returns 1mil chars ok, plz?

Update:
with latest libmaple F1 I get 1mil chars from Teraterm when logged into file (217kB/s when not logged, 102kB/s when logged into file).
with latest libmaple F4 I get 1mil chars from Teraterm when logged into file (170kB/s when not logged, 102kB/s when logged into file).
Pukao Hats Cleaning Services Ltd.

Post by victor_pv » Fri Aug 18, 2017 8:50 pm

The problem with blocking TX is this: on a serial port, even if there is nothing at the other end, sending the data takes a finite amount of time at the set baud rate, and then the call returns, so it never blocks permanently.
But with USB, if the other end is not receiving, it could block forever unless we have a timeout. So larger or smaller, I think some timeout is needed. We could set the timeout as a multiple of the time per byte at a certain minimum rate, so the timeout is not the same when sending 1 byte as when sending 1000. That would more closely resemble what the UART driver does.

On the other hand, with the timeout, if the host is slow getting data, we may lose bytes. But I think anyone who does not want to lose bytes should check the returned value to confirm the data was sent. If it is not critical, then ignore the return value.
For example, if we decide to use a timeout that simulates a minimum rate of 100 KB/s, it would be 10 µs per byte. If the host is slower than that (or disconnected), transmissions may time out, but if the host keeps up at least that rate, they will complete within the allowed timeout period.
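
The arithmetic above can be sketched as a small helper. `tx_timeout_us` and its `floor_us` parameter are hypothetical. The floor is an added assumption so that very short writes still get a usable minimum wait, and it is not part of the proposal itself.

```cpp
#include <assert.h>
#include <stdint.h>

// Timeout in microseconds for sending len bytes at a minimum acceptable rate.
// min_rate_bytes_per_s = 100000 (100 KB/s) gives 10 us per byte.
uint32_t tx_timeout_us(uint32_t len, uint32_t min_rate_bytes_per_s, uint32_t floor_us) {
    assert(min_rate_bytes_per_s > 0);
    // 64-bit intermediate so len * 1000000 cannot overflow.
    uint64_t t = (uint64_t)len * 1000000ULL / min_rate_bytes_per_s;
    if (t > UINT32_MAX) t = UINT32_MAX;
    return t < floor_us ? floor_us : (uint32_t)t;
}
```

For example, `tx_timeout_us(1000, 100000, 1000)` allows 10000 µs for a 1000-byte write, while a 1-byte write gets the 1000 µs floor instead of 10 µs.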

But on RX, unless we stop sending ACKs to the host until there is room in the buffer, it is very possible that data will be lost no matter what the application does, since the host is potentially much faster.
The F1 core does that, and I have modified the libmaple F4 core locally to do the same. I will send a PR so more people can test that RX.

On Generic TX I will try to repeat some of the tests Pito has done and see if there is any difference, but I am not convinced it is just TeraTerm.
