
Problem parsing Regex pattern with escape characters

Posted: Sat Nov 09, 2024 8:14 pm
by iParcelBox
I'm trying to use regex.h on an ESP32S3 in ESP-IDF 5.3.1 to extract elements from a log message so that I can reformat it to be IETF compliant. However, whenever I try to use a regex pattern including any escape characters, it fails.

For example, I'm trying to parse a simple log message such as the below:

I (1460) ipb_lfs: Initializing LittleFS
If I use the following code, the regex compiles and matches, and I'm able to separate the log level from the rest of the message:

Code:

const char *regex_pattern = "([IDWE]) (.+)";
regex_t regex;
int ret = regcomp(&regex, regex_pattern, REG_EXTENDED);
if (ret != 0)
{
    printf("Failed to compile regex\n");
    return -1;
}
// matches[0] is the whole match; matches[1] and matches[2] are the capture groups
regmatch_t matches[3];

ret = regexec(&regex, clean_message, 3, matches, 0);
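For anyone reading along, here is a minimal sketch of how the capture groups could be copied out after a successful regexec(), using the rm_so and rm_eo offsets in each regmatch_t. The helper names copy_group() and split_level() are made up for illustration and are not part of regex.h; the pattern is the one from the snippet above.

```c
#include <regex.h>
#include <string.h>

/* Copy capture group `idx` out of `src` into `dst`, always NUL-terminating.
 * Returns 0 on success, -1 if the group did not take part in the match. */
static int copy_group(const char *src, const regmatch_t *m, size_t idx,
                      char *dst, size_t dst_len)
{
    if (dst_len == 0 || m[idx].rm_so < 0)
        return -1;
    size_t len = (size_t)(m[idx].rm_eo - m[idx].rm_so);
    if (len >= dst_len)
        len = dst_len - 1; /* truncate rather than overflow */
    memcpy(dst, src + m[idx].rm_so, len);
    dst[len] = '\0';
    return 0;
}

/* split_level() is a hypothetical helper, not part of regex.h.
 * Returns 0 if the message matched and both groups were copied, else -1. */
int split_level(const char *msg, char *level, size_t level_len,
                char *rest, size_t rest_len)
{
    regex_t regex;
    regmatch_t matches[3]; /* [0] whole match, [1] level, [2] rest */

    if (regcomp(&regex, "([IDWE]) (.+)", REG_EXTENDED) != 0)
        return -1;

    int ret = regexec(&regex, msg, 3, matches, 0);
    regfree(&regex); /* matches[] holds plain offsets, so freeing here is safe */

    if (ret != 0)
        return -1;
    if (copy_group(msg, matches, 1, level, level_len) != 0 ||
        copy_group(msg, matches, 2, rest, rest_len) != 0)
        return -1;
    return 0;
}
```

Called on the sample log line, this should leave "I" in level and "(1460) ipb_lfs: Initializing LittleFS" in rest. Note that the pattern is not anchored, so it matches at the first place where one of I, D, W or E is followed by a space.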
However, as soon as I change the regex pattern to something even slightly more complicated, such as:

Code:

const char *regex_pattern = "([IDWE]) (\\S+) (.+)";
It fails to parse the message.

It seems the problem is linked to the inclusion of any escape characters.

Any suggestions on how to solve this?

Re: Problem parsing Regex pattern with escape characters

Posted: Sat Nov 09, 2024 11:38 pm
by MicroController
Have you tried, e.g.

Code:

const char *regex_pattern = "([IDWE]) (\\S+) (.+)";
?

Re: Problem parsing Regex pattern with escape characters

Posted: Sun Nov 10, 2024 7:32 am
by iParcelBox
Apologies, mine was a typo. Yes, the pattern below is the one I was trying and failing with:

Code:

const char *regex_pattern = "([IDWE]) (\\S+) (.+)";
I’ve corrected the original post above.

Re: Problem parsing Regex pattern with escape characters

Posted: Tue Nov 12, 2024 7:06 am
by iParcelBox
For others trying to resolve the same issue, I was unable to get regex to work with any escaped characters, so for my specific use case I ended up modifying it as follows:

Code:

const char *regex_pattern = "([IDWE]) .([0-9]+). ([A-Za-z0-9_-]+): ?(.+)?";


It's not as elegant, but it works....
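
A possible explanation, offered as an assumption rather than something confirmed in this thread: \S is a Perl-style shorthand that POSIX extended regular expressions do not define, so a regex.h implementation is free not to support it. The POSIX way to write "non-whitespace" is the bracket expression [^[:space:]], which also needs no backslash in the C string. The sketch below uses that form of the original three-group pattern and reports compile failures through regerror(). The helper names match_fields() and print_group() are hypothetical, and this has not been verified on ESP-IDF 5.3.1.

```c
#include <regex.h>
#include <stdio.h>

/* match_fields() is a hypothetical helper, not part of regex.h.
 * Returns 0 on match, 1 on no match, -1 if the pattern fails to compile. */
int match_fields(const char *msg, regmatch_t m[4])
{
    /* [^[:space:]]+ is the POSIX spelling of "one or more non-whitespace
     * characters"; no backslash is needed in the C string. */
    static const char *pattern = "([IDWE]) ([^[:space:]]+) (.+)";
    regex_t re;

    int rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0) {
        char err[128];
        regerror(rc, &re, err, sizeof err); /* human-readable reason */
        printf("regcomp failed: %s\n", err);
        return -1;
    }

    rc = regexec(&re, msg, 4, m, 0);
    regfree(&re);
    return rc == 0 ? 0 : 1;
}

/* Print capture group `idx` using the offsets filled in by regexec(). */
void print_group(const char *msg, const regmatch_t *m, int idx)
{
    printf("group %d: %.*s\n", idx,
           (int)(m[idx].rm_eo - m[idx].rm_so), msg + m[idx].rm_so);
}
```

For the sample line "I (1460) ipb_lfs: Initializing LittleFS", groups 1 to 3 should come out as "I", "(1460)" and "ipb_lfs: Initializing LittleFS". Printing the regerror() text is also a quick way to tell whether a pattern is being rejected at compile time or is simply not matching.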