ESP32 IDF CAN issues
Posted: Sat May 16, 2020 6:05 pm
I have received reports that our ESP32 product's CAN fault tolerance is significantly worse than that of other devices.
I have performed some basic tests and find that, when faced with a marginal CAN bus, the ESP is likely to reject more frames (it refuses frames which other devices accept), but it also allows frames which other devices reject and/or corrupts a valid frame. The end result is that invalid data enters the ESP application.
CAN overflows are not presently handled in the IDF. An overflow typically results in 0x88 in the frame's last data byte. The overflow flag must be manually reset if you are to detect the condition and ignore the bad frame (otherwise, once you have had one overflow, you will ignore all frames).
It appears that you may recover with:
Code:
can_ll_set_cmd_clear_data_overrun(can_context.dev);
Next, and particularly when receiving at a high frame rate: if you add and remove the CAN termination resistor, as if you had a loose CAN connector in a car driving along an English road (bumpy and full of holes), then you may quite 'reliably' cause spurious data to appear in the 'valid' frames passed to you by the IDF.
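To make the overrun recovery concrete, here is a minimal sketch of the logic I mean: check the overrun flag, discard the suspect frame, then clear the flag so that later frames are not ignored as well. It is a mock that runs on a PC. `fake_can_dev_t`, `FAKE_STATUS_DATA_OVERRUN`, `fake_clear_data_overrun` and `read_frame_checked` are hypothetical names made up for illustration, not IDF types or functions. On real hardware the clear step would be the `can_ll_set_cmd_clear_data_overrun(can_context.dev)` call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the CAN controller. These are NOT the IDF types. */
#define FAKE_STATUS_DATA_OVERRUN (1u << 1)

typedef struct {
    uint32_t status;      /* mock status register */
    uint8_t  rx_data[8];  /* mock receive buffer */
    uint8_t  rx_len;      /* number of valid bytes in rx_data */
} fake_can_dev_t;

typedef struct {
    uint8_t data[8];
    uint8_t len;
} rx_frame_t;

/* Mock of the clear step. On the ESP32 this would be
 * can_ll_set_cmd_clear_data_overrun(can_context.dev). */
static void fake_clear_data_overrun(fake_can_dev_t *dev)
{
    dev->status &= ~FAKE_STATUS_DATA_OVERRUN;
}

/* Copies the pending frame into *out and returns true if no overrun is flagged.
 * If an overrun is flagged, the frame is dropped, the counter is bumped and
 * the flag is cleared so that the next frame can be accepted again. */
static bool read_frame_checked(fake_can_dev_t *dev, rx_frame_t *out,
                               uint32_t *overrun_count)
{
    assert(dev != NULL && out != NULL && overrun_count != NULL);

    if (dev->status & FAKE_STATUS_DATA_OVERRUN) {
        (*overrun_count)++;
        fake_clear_data_overrun(dev);
        return false;  /* do not pass the suspect frame to the application */
    }

    out->len = dev->rx_len > 8 ? 8 : dev->rx_len;
    memcpy(out->data, dev->rx_data, out->len);
    return true;
}
```

Keeping a count of the dropped frames is only a suggestion. It makes it easier to compare the ESP against the error counts reported by other monitors.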
In my IDF 4.1 build I see the corrupt values 0x00, 0x35 and 0x40, but at random positions in the frame, such that some frames may contain only 0x00 or 0x40.
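For anyone wanting to reproduce this, a rough diagnostic sketch follows. It assumes the test transmitter sends a known, fixed payload, which is an assumption of the sketch and not something the IDF provides. The helper names `is_suspect_byte` and `suspect_positions` are hypothetical. The function reports which byte positions differ from the expected payload and hold one of the three values above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Byte values seen in the corrupted frames described in this post. */
static bool is_suspect_byte(uint8_t b)
{
    return b == 0x00 || b == 0x35 || b == 0x40;
}

/* Compares a received payload against the payload the test transmitter is
 * known to send (an assumption of this sketch). Returns a bitmask in which
 * bit i is set when byte i differs from the expected byte AND holds one of
 * the suspect values. Only the first 8 bytes are examined. */
static uint8_t suspect_positions(const uint8_t *rx, const uint8_t *expected,
                                 size_t len)
{
    assert(rx != NULL && expected != NULL);

    uint8_t mask = 0;
    for (size_t i = 0; i < len && i < 8; i++) {
        if (rx[i] != expected[i] && is_suspect_byte(rx[i])) {
            mask |= (uint8_t)(1u << i);
        }
    }
    return mask;
}
```

Logging this mask per frame should show whether the corrupt positions really are random or follow a pattern.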
In an earlier IDF build I saw other values, but that build lacked the overrun patch above. I am fairly certain that the pattern is IDF/hardware determined, as I have made quite a few diagnostic changes and the pattern remains the same. Not random pointer stuff.
Third-party CAN bus monitoring systems did not spot the erroneous data; all values were as expected. EDIT: Third-party monitors do not show corrupt frames. It's not clear whether this is a corrupt frame getting through or a reaction of the ESP/IDF to an earlier corrupt frame. The third-party monitors (and I have quite a few types) reported lower frame error counts than the ESP. The ESP seemed periodically 'phased' and blanked out whilst the others continued. I don't want to go too deep on that point, as the main goal is just to have valid frames.
Late the other night I found a cryptic note in the CAN driver along the lines that there were issues in dealing with overrun because of the hardware. I forgot to note the reference, but I am sure your ticket system will take you there.
So the question is: how does one reliably receive CAN frames from the ESP?
Failing that, what triggers would you suggest? 0x00 0x35 0x40 must mean something to someone.
PS Same result on an EVB.