Page 1 of 2

MicroSD hangup

Posted: Wed Nov 14, 2018 10:43 am
by PaulFreund
Hi,

I've got a little problem and maybe someone of you has more experience working with MicroSD cards.

I'm initializing the MicroSD like this:
#define DEVICE_MICROSD_MISO GPIO_NUM_26;
#define DEVICE_MICROSD_MOSI GPIO_NUM_25;
#define DEVICE_MICROSD_SCK GPIO_NUM_33;
#define DEVICE_MICROSD_CS GPIO_NUM_32;

void DeviceManager::initSD() {
sdmmc_host_t hostConfig;
memset(&hostConfig, 0, sizeof(hostConfig));
hostConfig.flags = SDMMC_HOST_FLAG_SPI;
hostConfig.slot = VSPI_HOST;
hostConfig.max_freq_khz = SDMMC_FREQ_DEFAULT;
hostConfig.io_voltage = 3.3f;
hostConfig.init = &sdspi_host_init;
hostConfig.set_bus_width = NULL;
hostConfig.set_card_clk = &sdspi_host_set_card_clk;
hostConfig.do_transaction = &sdspi_host_do_transaction;
hostConfig.deinit = &sdspi_host_deinit;
hostConfig.command_timeout_ms = 0;

sdspi_slot_config_t slotConfig;
memset(&slotConfig, 0, sizeof(slotConfig));
slotConfig.gpio_miso = DEVICE_MICROSD_MISO;
slotConfig.gpio_mosi = DEVICE_MICROSD_MOSI;
slotConfig.gpio_sck = DEVICE_MICROSD_SCK;
slotConfig.gpio_cs = DEVICE_MICROSD_CS;
slotConfig.gpio_cd = SDMMC_SLOT_NO_CD;
slotConfig.gpio_wp = SDMMC_SLOT_NO_WP;
slotConfig.dma_channel = 1;

esp_vfs_fat_sdmmc_mount_config_t mountConfig;
memset(&mountConfig, 0, sizeof(mountConfig));
mountConfig.format_if_mount_failed = true;
mountConfig.max_files = 10;
mountConfig.allocation_unit_size = 0; // sector size

esp_err_t ret = esp_vfs_fat_sdmmc_mount("/sd", &hostConfig, &slotConfig, &mountConfig, &this->microSDCard);

if (ret != ESP_OK) {
if (ret == ESP_FAIL) {
ESP_LOGE(LOGTAG, "esp_vfs_fat_sdmmc_mount failed");
} else {
ESP_LOGE(LOGTAG, "Initializing MicroSD card failed with error: %d", ret);
}
return;
}
}
I'm accessing the MicroSD only via fopen/fwrite/fflush/fclose and after some random time (sometimes days sometimes hours) it stops working. Soft reset via esp_restart does not fix the error and on next boot I get:
E (203) DeviceManager: Mounting MicroSD card
E (203) sdmmc_io: sdmmc_io_reset: unexpected return: 0x104
E (203) DeviceManager: Initializing MicroSD card failed with error: 260
This leaves me to believe that the MicroSD is in an invalid state because once I disconnect power and connect power again it is working again, again for a longer period.

The errors occuring would be bearable but I need a dead safe way to reset the MicroSD into a working state.

Re: MicroSD hangup

Posted: Wed Nov 14, 2018 1:48 pm
by ESP_igrr
Could you please try increasing log level to Debug or even Verbose, and capture the log when the error happens? The interesting part is where the card "stops working", that is before you do software reset.

If you have a logic analyzer, observing SPI lines when the issue happens may also prove useful to figure out what the issue is.

Re: MicroSD hangup

Posted: Wed Nov 14, 2018 2:42 pm
by PaulFreund
I will try but the error sometimes happens after days and most devices are not accessible, only a small test setup on my desk. As the other devices are in "the field" the question stands if it could be helpful to set the pins to gpio first and turn them low or something in that regard.

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 2:51 am
by ESP_igrr
PaulFreund wrote:
Wed Nov 14, 2018 2:42 pm
if it could be helpful to set the pins to gpio first and turn them low or something in that regard.
Possibly, but the driver is supposed to be doing this — sending 40 80 clocks, followed by CMD0. After this, a well behaved card should go into SPI mode. Apparently it doesn't in this case, hence the need to troubleshoot...

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 7:28 am
by PaulFreund
Thank you very much for following up with this. I was actually able to capture the issue over night.

I also tried removing the MicroSD while the electronic is powered and doing a soft reset with esp_restart() which makes the system work again.

This happens immediately before it stops working:
V (1706031) sdmmc_cmd: cmd response 00000000 00000000 00000000 00000000 err=0x0 state=0
D (1706041) httpd-freertos: Sel add listen 50
D (1706051) httpd-freertos: select ret
V (1706051) sdmmc_cmd: sending cmd slot=2 op=17 arg=4246 flags=1c50 data=0x3ffc61d4 blklen=512 datalen=512 timeout=1000
V (1706061) sdspi_host: sdspi_host_start_command: slot=2, CMD17, arg=0x00004246 flags=0x5, data=0x3ffc61d4, data_size=512 crc=0x03
V (1706071) sdspi_transaction: r1 = 0x00 hw_cmd.r[0]=0x323a2272
V (1706071) sdmmc_cmd: cmd response 00000000 00000000 00000000 00000000 err=0x0 state=0
V (1706081) sdmmc_cmd: sending cmd slot=2 op=17 arg=4247 flags=1c50 data=0x3ffc61d4 blklen=512 datalen=512 timeout=1000
V (1706101) sdspi_host: sdspi_host_start_command: slot=2, CMD17, arg=0x00004247 flags=0x5, data=0x3ffc61d4, data_size=512 crc=0x0a
V (1706111) sdspi_transaction: r1 = 0x00 hw_cmd.r[0]=0x656c222c
V (1706121) sdmmc_cmd: cmd response 00000000 00000000 00000000 00000000 err=0x0 state=0
D (1706131) httpd-freertos: Sel add listen 50
D (1706131) httpd-freertos: select ret
V (1706131) sdmmc_cmd: sending cmd slot=2 op=17 arg=4248 flags=1c50 data=0x3ffc61d4 blklen=512 datalen=512 timeout=1000
V (1706141) sdspi_host: sdspi_host_start_command: slot=2, CMD17, arg=0x00004248 flags=0x5, data=0x3ffc61d4, data_size=512 crc=0x7d
V (1706151) sdspi_transaction: r1 = 0x00 hw_cmd.r[0]=0x6c222c36
V (1706161) sdmmc_cmd: cmd response 00000000 00000000 00000000 00000000 err=0x0 state=0
V (1706161) sdmmc_cmd: sending cmd slot=2 op=17 arg=4249 flags=1c50 data=0x3ffc61d4 blklen=512 datalen=512 timeout=1000
V (1706181) sdspi_host: sdspi_host_start_command: slot=2, CMD17, arg=0x00004249 flags=0x5, data=0x3ffc61d4, data_size=512 crc=0x74
***ERROR*** A stack overflow in task esphttpd has been detected.
abort() was called at PC 0x40090370 on core 0

Backtrace: 0x4009005f:0x3ffde5d0 0x40090359:0x3ffde5f0 0x40090370:0x3ffde610 0x4008e60d:0x3ffde630 0x4008f204:0x3ffde660 0x4008f1ba:0x3f467b03

Rebooting...
And here is the full next initialization that fails:
E (381) DeviceManager: Mounting MicroSD card
D (391) sdspi_host: sdspi_host_init_slot: SPI3 miso=26 mosi=25 sck=33 cs=32 cd=-1 wp=-1, dma_ch=1
D (401) spi: SPI2 use gpio matrix.
V (401) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): checking args
V (411) intr_alloc: esp_intr_alloc_intrstatus (cpu 0): Args okay. Resulting flags 0xC0E
D (411) intr_alloc: Connected src 31 to int 13 CMD52, arg=0x80000c08 flags=0x4, data=0x0, data_size=0 crc=0x4f
D (461) spi_master: SPI device changed from 3 to 0
V (461) sdspi_transaction: r1 = 0x75 hw_cmd.r[0]=0xffffffff
D (471) sdspi_transaction: cmd=52, R1 response: command not supported
D (471) sdmmc_cmd: cmd=52, sdmmc_req_run returned 0x106
V (481) sdmmc_io: sdmmc_io_rw_direct: sdmmc_send_cmd returned 0x106
V (491) sdmmc_cmd: sending cmd slot=2 op=0 arg=0 flags=20 data=0x0 blklen=0 datalen=0 timeout=1000
V (491) sdspi_host: sdspi_host_start_command: slot=2, CMD0, arg=0x00000000 flags=0x40, data=0x0, data_size=0 crc=0x4a
V (511) sdspi_host: start_command_default: ignoring response byte
V (511) sdspi_transaction: r1 = 0x00 hw_cmd.r[0]=0xffffffff
V (521) sdmmc_cmd: cmd response 00000000 00000000 00000000 00000000 err=0x0 state=0
V (541) sdmmc_cmd: sending cmd slot=2 op=0 arg=0 flags=1c20 data=0x0 blklen=0 datalen=0 timeout=1000
V (541) sdspi_host: sdspi_host_start_command: slot=2, CMD0, arg=0x00000000 flags=0x4, data=0x0, data_size=0 crc=0x4a
V (541) sdspi_transaction: r1 = 0x7b hw_cmd.r[0]=0xffffffff
D (551) sdspi_transaction: cmd=0, R1 response: command CRC error
D (561) sdmmc_cmd: cmd=0, sdmmc_req_run returned 0x109
D (561) sdmmc_init: sdmmc_card_init: sdmmc_send_cmd_go_idle_state returned 0x109
D (571) vfs_fat_sdmmc: sdmmc_card_init failed 0x(109)
I (571) gpio: GPIO[25]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (581) gpio: GPIO[26]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
I (591) gpio: GPIO[33]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0
V (601) intr_alloc: esp_intr_free: Disabling int, killing handler
E (601) DeviceManager: Initializing MicroSD card failed with error: 265
E (611) DeviceConfig: [loadConfig] no config file found
E (611) DeviceConfig: [storeConfig] fopen failed

As the esphttpd task is going into stack overflow multiple times directly related to sdmmc operations the question arises if I have to mutex fopen/fclose/fwrite/fread/ftell/fflush because it is written from a different task.

Thanks,

Paul

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 10:20 am
by ESP_igrr
Since the error mentions stack overflow... Did you try monitoring the used stack size in the httpd task, and possibly increasing it?

The fact that SD card doesn't re-initialize after such condition is still a bug; Thank you for the logs, we'll try to reproduce the issue.

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 10:40 am
by PaulFreund
I will definitely look into the stack but I monitor the min free and so far was no problem. I have now added mutexed access to whole SPI, le'ts see if that helps.

The MicroSD is SDSDQAF3-008G-XI (https://eu.mouser.com/ProductDetail/San ... NydrkX78BC) by the way

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 4:16 pm
by PaulFreund
:( Mutexing did unfortunately not solve the issue, also doubling the esphttpd task stack size did not help

Re: MicroSD hangup

Posted: Thu Nov 15, 2018 7:43 pm
by PeterR
I had (what I believe to be) multicore & general reentrancy/race issues with IDF 3.1 SPI (& even 3.2-dev).
Essentially my SPI transaction ended up hung on what I believe was a race condition.
I have since moved to 3.2-rc1. This version adds a lot of C++11 atomics to the SPI driver and what seem to be ISR/Task race guards.
The type&scope of the 3.2-rc1 SPI changes certainly suggests that someone thinks that there were a few RT gaps in 3.1...
Maybe SD repeats the SPI issues?
Quick win & so may be worth a look.
PS I am new to ESP32 but what confuses me (given the 3.2-rc1 changes) is why we don't get a clearer documention warning of driver issues which seem to have entered the RC process.
Maybe I'm not looking in the right place.
EDIT: But note that 3.2-rc1 is slower and uses more stack/resources. So increase stack again!

Re: MicroSD hangup

Posted: Fri Nov 16, 2018 2:19 am
by ESP_igrr
PaulFreund:
Interested that increasing the stack size did not help. So this might be that the beginning of the buffer allocated for the stack gets corrupted by some other code, and this triggers stack overflow check. Could you please try enabling heap poisoning, and see if you get heap corruption errors logged?

PeterR:
We have not released 3.2-rc1 yet. There's a -beta1 version which is not a release candidate. Changes in SPI driver between 3.1 and 3.2 are related to supporting new functionality: polling mode for short transactions, where ISR is not used to start the transaction and handle its results. If you are seeing race conditions with SPI driver in 3.1, please open an issue on GitHub along with code to reproduce. We can then investigate the issue and fix it in one of the subsequent 3.1.x bugfix releases.