ESP32 CAM direct access to image data

Post by **mrt_duino** » Sat May 04, 2019 5:55 pm

Hi! I am using an ESP32-CAM with an OV2640 and it works fine with the CameraWebServer example. Now I want to read raw pixel data to detect a laser pointer on a single-colored background, but everything in the example seems built around web streaming, and I am lost in all the web stuff that I know nothing about. Can you tell me how a regular Arduino guy can read simple pixel data and check colors without doing anything fancy? (A minimal example, perhaps?)

By the way, I plan to point a laser at a dark grey (maybe white) sheet of paper and read a 160x120 image line by line to find where the brightest pixels are. For a given threshold I will find the horizontal and vertical start/end coordinates to get a rectangle, and use its center as the final result. If you have better ideas, I'm all ears!
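A minimal sketch of the plan described above, assuming an 8-bit grayscale image stored row by row with no padding. The type `bright_box_t` and the function `find_bright_box` are hypothetical names made up for this illustration; they are not part of the camera driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: bounding rectangle of all "bright" pixels. */
typedef struct {
    int found;            /* 0 if no pixel reached the threshold */
    size_t x_min, x_max;  /* inclusive column range */
    size_t y_min, y_max;  /* inclusive row range */
    size_t cx, cy;        /* center of the rectangle */
} bright_box_t;

/* Scan a width*height grayscale buffer line by line and collect the
 * bounding rectangle of every pixel whose value is >= threshold. */
static bright_box_t find_bright_box(const uint8_t *buf, size_t width,
                                    size_t height, uint8_t threshold)
{
    bright_box_t box = {0};
    assert(buf != NULL);

    for (size_t y = 0; y < height; y++) {
        const uint8_t *row = buf + y * width;  /* start of line y */
        for (size_t x = 0; x < width; x++) {
            if (row[x] < threshold) {
                continue;
            }
            if (!box.found) {
                /* First bright pixel: the rectangle is just this pixel. */
                box.found = 1;
                box.x_min = box.x_max = x;
                box.y_min = box.y_max = y;
            } else {
                if (x < box.x_min) box.x_min = x;
                if (x > box.x_max) box.x_max = x;
                if (y < box.y_min) box.y_min = y;
                if (y > box.y_max) box.y_max = y;
            }
        }
    }
    if (box.found) {
        box.cx = (box.x_min + box.x_max) / 2;
        box.cy = (box.y_min + box.y_max) / 2;
    }
    return box;
}
```

With a real frame, `buf`, `width` and `height` would come from the frame buffer fields (`fb->buf`, `fb->width`, `fb->height`). A single stray bright pixel will stretch the rectangle, so the threshold probably needs tuning against the actual background.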

Post by **ZycaRko** » Mon May 06, 2019 2:37 pm

You should look for the frame-buffer data. If you set the pixel format properly in the camera config (an uncompressed format), you should be able to access the frame-buffer data directly:

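// pixel format: choose an uncompressed one before initializing the camera
config.pixel_format = PIXFORMAT_RAW; // or PIXFORMAT_GRAYSCALE ?

// capture camera frame
camera_fb_t *fb = esp_camera_fb_get();
if (fb != NULL) {
    const uint8_t *data = fb->buf; // pixel bytes
    size_t size = fb->len;         // buffer length in bytes
    // ... process data[0 .. size-1] here ...
    esp_camera_fb_return(fb);      // give the buffer back to the driver
}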

Post by **XenonXenon** » Tue Jun 11, 2019 12:25 pm

I am also interested in this.
Can someone please tell me where the real, detailed documentation is on the functions provided?
For example where do I find these answers:
1. What is the format of PIXFORMAT_GRAYSCALE, for example?
2. Where is the list of all the possible formats?
3. Where is the list of available utility functions?

The closest I've found is https://github.com/espressif/esp32-came ... /README.md
- which is very far from what I am looking for.

It seems the only way to make progress is to scratch around on the web trying to find examples which you can bend to work in your application, but that's a miserable way to write code, in my view, so I'm hoping there is some quality documentation somewhere?

Thanks
X

Post by **mrt_duino** » Tue Jun 11, 2019 2:06 pm

It's a little late maybe, but I must thank ZycaRko: his code did the trick for me. As XenonXenon said, this isn't something you can pick up just by reading a manual, the way you can in other situations.
I didn't want to mess with bit shifting in my limited time, so I used GRAYSCALE. I read the pixels into uint8 variables and kept updating the brightest pixel so far whenever I found a brighter one. In the end I was able to see the coordinates on a 128x96 OLED. Still, I believe things like this should be easier to figure out on my own.
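The "keep updating the brightest pixel so far" approach can be sketched roughly as follows. This is an illustration, not the code actually used in this post; it assumes a grayscale buffer of one byte per pixel stored row by row, and `brightest_t` and `find_brightest` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: position and value of the brightest pixel. */
typedef struct {
    size_t x, y;
    uint8_t value;
} brightest_t;

/* Walk the buffer once, remembering the brightest pixel seen so far.
 * On ties the first occurrence wins because the comparison is strict. */
static brightest_t find_brightest(const uint8_t *buf, size_t width, size_t height)
{
    brightest_t best = {0, 0, 0};
    assert(buf != NULL && width > 0);

    for (size_t i = 0; i < width * height; i++) {
        if (buf[i] > best.value) {
            best.value = buf[i];
            best.x = i % width;  /* column within the line */
            best.y = i / width;  /* line number */
        }
    }
    return best;
}
```

If every pixel is 0, the result stays at its initial value (position 0,0 with value 0), so a caller may want to check `value` against a minimum before trusting the coordinates.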

Post by **ESP_Me-no-dev** » Tue Jun 11, 2019 2:09 pm

I'm really sorry you find this so frustrating. I had no idea that "GRAYSCALE", "RGB565" or any other of the formats need explanation.
1. GRAYSCALE means one value from 0 to 255 for each pixel.
2. RGB565 means a 16-bit value per pixel, made up of 5 bits red, 6 bits green and 5 bits blue.
3. RGB888 means that there are 3 bytes per pixel, one for red, one for green and one for blue. Each varies between 0 and 255.
4. YUV means that each pixel has its Y channel as a separate byte (0 to 255) and each two adjacent pixels share their U and V values.
5. JPEG means that the image is encoded into JPEG format either by the camera itself or in software.
When you get a frame from the camera, it will contain the frame buffer in the format you selected.
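As an illustration of the RGB565 description above, here is a small sketch that unpacks one pixel into 8-bit red, green and blue values. The byte order is an assumption (high byte first in the buffer) and should be checked against real frames; `rgb888_t` and `rgb565_pixel` are hypothetical names, not driver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type holding one unpacked pixel. */
typedef struct {
    uint8_t r, g, b;
} rgb888_t;

/* Unpack pixel number `index` from an RGB565 buffer (2 bytes per pixel).
 * ASSUMPTION: the high byte of each 16-bit value is stored first. */
static rgb888_t rgb565_pixel(const uint8_t *buf, size_t index)
{
    assert(buf != NULL);
    uint16_t px = (uint16_t)(((uint16_t)buf[2 * index] << 8) | buf[2 * index + 1]);

    uint8_t r5 = (uint8_t)((px >> 11) & 0x1F);  /* top 5 bits */
    uint8_t g6 = (uint8_t)((px >> 5) & 0x3F);   /* middle 6 bits */
    uint8_t b5 = (uint8_t)(px & 0x1F);          /* bottom 5 bits */

    rgb888_t out;
    /* Scale to 0..255 by repeating the upper bits in the lower bits. */
    out.r = (uint8_t)((r5 << 3) | (r5 >> 2));
    out.g = (uint8_t)((g6 << 2) | (g6 >> 4));
    out.b = (uint8_t)((b5 << 3) | (b5 >> 2));
    return out;
}
```

If the colors come out swapped on real hardware, the two bytes are probably stored in the other order and the first line of the function needs to be reversed.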

API functions are documented here: https://github.com/espressif/esp32-came ... p_camera.h
Formats and Resolutions are listed here (formats explained above are supported): https://github.com/espressif/esp32-came ... e/sensor.h
JPEG Decoder (one function that is shown in the examples): https://github.com/espressif/esp32-came ... g_decode.h
IMG Converters API (documented): https://github.com/espressif/esp32-came ... nverters.h
Examples using the camera driver can be found in the ESP-WHO repository: https://github.com/espressif/esp-who/tr ... web_server

Overall everything is there and documented well enough. Someone just needs to spend the time to go through the headers and examples.

Post by **XenonXenon** » Tue Jun 11, 2019 6:18 pm

Me-no-dev thanks for your prompt reply.

I knew what "greyscale" meant in the general sense, but I had no way to know that, in this case, it means 8 bits, so your explanation is helpful.

Anyhoooo - should I assume that this 8-bit data is organised as a 2D array of bytes?
Are there headers or footers or checksums or other things - or just the raw bytes?
Where is the definition that describes the structure which is returned in the frame buffer?
Can I define a custom resolution? I only want a thin horizontal strip of greyscale image from the sensor.
I don't see that information in any of the links you gave me.

You say that camera.h is documented at that link, but it is LISTED there and that isn't the same thing. Yes a smart chap can look at the code and work many things out, but that is not the same as documentation. Of course, if you're a hobbyist giving us your time for free then that's OK and very much appreciated but if you're an employee then maybe that's not the best way to grow a community of product-users.

Then again, maybe everyone else is far smarter than me and I should creep away into the shadows...

Many thanks
Chris

Post by **ESP_Me-no-dev** » Wed Jun 12, 2019 7:02 am

I don't agree that things are not well documented. The camera driver is a rather basic thing: you set up the camera and start grabbing frame buffers from it. The structure of the frame buffer is in the header. In every case that I have seen, the frame buffer contains an array of pixels of size (width * height * bytes per pixel). This is true for 99.9% of the frame buffers in the world. You can easily extract a line or whatever you need. Setting a custom resolution, however, is beyond the normal use of the sensors (you need to read the datasheet of the particular sensor in order to know what to do), and in that case you do need to poke around in the code, read docs and so on. It is not a trivial task and not the same on every sensor, so it is not something to be documented here. If you need full information on the sensor, you need to read its manufacturer's documentation.
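To make the (width * height * bytes per pixel) layout concrete, here is a rough sketch that copies a horizontal strip of lines out of such a buffer. It crops in software after the frame has been captured; it does not change the sensor resolution. It assumes a tightly packed, uncompressed buffer with no per-line padding, and `copy_strip` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `row_count` lines starting at `first_row` into `dst`.
 * Returns the number of bytes copied, or 0 if the request does not fit
 * or `len` does not match width * height * bytes_per_pixel. */
static size_t copy_strip(const uint8_t *buf, size_t len,
                         size_t width, size_t height, size_t bytes_per_pixel,
                         size_t first_row, size_t row_count, uint8_t *dst)
{
    size_t row_bytes = width * bytes_per_pixel;  /* bytes in one line */
    assert(buf != NULL && dst != NULL);

    if (len != row_bytes * height) {
        return 0;  /* not a tightly packed uncompressed image */
    }
    if (first_row > height || row_count > height - first_row) {
        return 0;  /* strip would run past the last line */
    }
    /* Line y starts at byte offset y * row_bytes. */
    memcpy(dst, buf + first_row * row_bytes, row_count * row_bytes);
    return row_count * row_bytes;
}
```

For a grayscale frame, `bytes_per_pixel` would be 1 and the other arguments would come from `fb->buf`, `fb->len`, `fb->width` and `fb->height`. The length check also rejects JPEG frames, whose size does not follow this formula.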

Post by **XenonXenon** » Wed Jun 12, 2019 8:19 am

OK well thanks again.

Post by **XenonXenon** » Wed Jun 12, 2019 2:38 pm

Here is an example of what I'm struggling with.
From the code examples I can see that:

Code:

fb = esp_camera_fb_get();  - gets a frame buffer from the camera.

and from the files you linked for me I can see that a frame buffer is this structure:

Code:

typedef struct {
    uint8_t * buf;       // Pointer to the pixel data
    size_t len;          // Length of the buffer in bytes
    size_t width;        // Width of the buffer in pixels
    size_t height;       // Height of the buffer in pixels
    pixformat_t format;  // Format of the pixel data
} camera_fb_t;

From other code I see that:

Code:

dl_matrix3du_t *image_matrix = dl_matrix3du_alloc(1, fb->width, fb->height, 3);

- declares a variable "image_matrix" as a pointer to a dl_matrix3du_t, and from other examples I can see that:

Code:

typedef struct
{
    int w; // Width
    int h; // Height
    int c; // Channel
    int n; // Number, to record filter's out_channels. input and output must be 1
    int stride;
    uc_t *item;
} dl_matrix3du_t;

But I cannot find the code for dl_matrix3du_alloc anywhere, and whilst the calling context makes clear that parameters 2 & 3 are width & height, I do not know what parameters 1 & 4 are.

I also cannot find what "stride" might be, nor the meaning of Channel.

Could you tell me how to find these answers in the documentation please?
Perhaps I am not seeing something I should add to my list of essential bookmarks.

Post by **XenonXenon** » Wed Jun 12, 2019 3:18 pm

From studying generally about image formats online I'm now guessing that "Channel" is the R, G or B channel.
If so, are the channel numbers 0,1,2 or 1,2,3 or something else? Are they in R then G then B order?
From your earlier comments, perhaps the order is obvious to experienced programmers in this field and therefore not worth spelling out. And I accept that you can't document everything for every kind of idiot who fancies himself a programmer.

I'm also guessing that "Stride" is the number of bytes which each pixel takes? Again - probably obvious to those experienced in this field - but not to me.
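For comparison, in much image-processing code "stride" means the distance from the start of one line to the start of the next, not the size of one pixel, and "channels" is the number of values stored per pixel. The sketch below illustrates that common convention with a made-up type; whether dl_matrix3du_t follows it, and in which order its channels are stored, is not confirmed anywhere in this thread. `image3_t` and `pixel_at` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an interleaved multi-channel image.
 * This is NOT the real dl_matrix3du_t. */
typedef struct {
    int w;          /* width in pixels */
    int h;          /* height in pixels */
    int c;          /* values per pixel, e.g. 3 for three color channels */
    int stride;     /* elements from one line to the next; here w * c */
    uint8_t *item;  /* pointer to the first byte of the pixel data */
} image3_t;

/* Address of channel `ch` of the pixel at column x, line y. */
static uint8_t *pixel_at(const image3_t *img, int x, int y, int ch)
{
    assert(img != NULL && img->item != NULL);
    assert(x >= 0 && x < img->w && y >= 0 && y < img->h);
    assert(ch >= 0 && ch < img->c);
    return img->item + (size_t)y * (size_t)img->stride
                     + (size_t)x * (size_t)img->c
                     + (size_t)ch;
}
```

Under this convention the channel index runs from 0 to c - 1, and which color sits at index 0 depends on the library that filled the buffer.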

In the sample code I see this:

Code:

out_buf = image_matrix->item;

I'm struggling with the type "uc_t", which I couldn't find in 5 minutes of searching online.
I note that out_buf is declared as:

Code:

uint8_t * out_buf;

- and not uc_t *, yet I see the assignment:

Code:

out_buf = image_matrix->item;

- which compiles and works so ... there's that...

But what is "item"? I'm guessing it's a pointer to the byte array itself,
but uint8_t is a single byte so it can't be a variable address, presumably?
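On the pointer question: in C, `uint8_t` is a single byte, but `uint8_t *` is a variable that holds the address of a byte, usually the first byte of an array. The sketch below assumes that `uc_t` is simply an alias for `unsigned char`, which would explain why the assignment compiles; that is a guess, and `demo_uc_t`, `demo_matrix_t` and `sum_items` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: stands in for uc_t, guessed to be an unsigned char alias. */
typedef unsigned char demo_uc_t;

/* Hypothetical cut-down matrix: a byte count plus a pointer to the bytes. */
typedef struct {
    size_t count;
    demo_uc_t *item;  /* address of the first byte, not the bytes themselves */
} demo_matrix_t;

/* Add up all bytes, reading them through a uint8_t pointer. */
static unsigned sum_items(const demo_matrix_t *m)
{
    assert(m != NULL && m->item != NULL);
    /* Same shape as the assignment in the sample code. It needs no cast
     * on platforms where uint8_t is itself an alias for unsigned char. */
    uint8_t *out_buf = m->item;

    unsigned total = 0;
    for (size_t i = 0; i < m->count; i++) {
        total += out_buf[i];  /* out_buf[i] is the i-th byte after the address */
    }
    return total;
}
```

Indexing the pointer with `out_buf[i]` reads the byte `i` positions after the address it holds, which is how a single pointer gives access to the whole array.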

Perhaps it is my lack of experience which leaves me struggling with the available guidance, rather than the guidance itself.
I have 15 years programming but not in C or C++ and not in image processing.
