My intended application requires that I be able to grab a frame in bitmap format at the highest possible resolution: ideally PIXFORMAT_RGB888, though PIXFORMAT_RGB565 might do, and ideally FRAMESIZE_UXGA, though a slightly smaller size might do. My app needs to be able to scan through the bitmap (accessing individual pixels). It would also help, during development, to be able to modify pixels in the bitmap and to save the frame buffer out to a bitmap file.
GETTING STARTED
I started off with a simple program (found on the internet) for grabbing a single JPEG frame and writing it out to a JPEG file. The essence of that operation is as follows:
Code: Select all
//init with high specs to pre-allocate larger buffers
config.frame_size = FRAMESIZE_UXGA;
config.pixel_format = PIXFORMAT_JPEG;
esp_err_t err = esp_camera_init(&config);
sensor_t * sensor = esp_camera_sensor_get();
sensor->set_framesize(sensor, FRAMESIZE_UXGA);
sensor->pixformat = PIXFORMAT_JPEG;
camera_fb_t * FrameBuffer = esp_camera_fb_get();
Code: Select all
sensor->set_framesize(sensor, FRAMESIZE_INDEX_VGA);
I noted that the example code also uses that sensor variable to set the pixel format before actually doing the frame grab. So I tried this:
Code: Select all
config.pixel_format = PIXFORMAT_JPEG;
esp_err_t err = esp_camera_init(&config);
sensor_t * sensor = esp_camera_sensor_get();
sensor->set_framesize(sensor, FRAMESIZE_UXGA);
sensor->pixformat = PIXFORMAT_RGB888;
camera_fb_t * FrameBuffer = esp_camera_fb_get();
And it worked. Well, I THOUGHT it worked. It succeeded in doing the esp_camera_fb_get(); and subsequently storing the image on the SD card (with me being sure to give the file a BMP extension). And in fact I could even bring up the file on my PC in my paint program and see the image.
BUT !!!! I had also added some code to the example program to calculate and show (via the Serial Monitor) the expected number of bytes that should end up in the frame, using the simple calculation Height * Width * BytesPerPixel, where the Height is the number of image rows, the Width is the number of pixels per row, and BytesPerPixel is 3 (for the RGB888 format). And I also printed out the FrameBuffer->len, i.e. the ACTUAL number of bytes in the returned frame buffer.
TOO FEW BYTES IN FRAME BUFFER
And THEY DO NOT MATCH. Indeed, the numbers are nowhere close to each other. The actual number of bytes in the frame buffer is significantly, SIGNIFICANTLY lower than the expected value. If what's in the frame buffer is indeed a bitmap then the resultant frame buffer should contain AT LEAST the number of bytes that I calculated, since a BITMAP frame buffer should contain pixel data for every single pixel in the image.
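A quick way to tell what is really in the buffer is to compare its length with the raw size and to look at the first two bytes. The sketch below is only an illustration: the helper names are made up for this post, and the only outside fact it relies on is that a JPEG stream begins with the bytes 0xFF 0xD8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for an uncompressed frame: one entry per pixel.
   (Hypothetical helper name, not part of the camera library.) */
static size_t expected_raw_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* A JPEG stream starts with the SOI marker 0xFF 0xD8.
   (Hypothetical helper name, not part of the camera library.) */
static bool looks_like_jpeg(const uint8_t *buf, size_t len)
{
    return len >= 2 && buf[0] == 0xFF && buf[1] == 0xD8;
}
```

In the program above this would be called as looks_like_jpeg(FrameBuffer->buf, FrameBuffer->len). If it returns true and FrameBuffer->len is far below expected_raw_bytes(...), the buffer holds compressed JPEG data no matter what extension the file is given.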
I opened the image (the one that had been saved to my SD card) in my paint program. I then used the Save As function in that paint program to save it (whatever it was that it had loaded) out as a genuine BITMAP file. That file ended up being pretty darn close to the size I had calculated, and it was a genuine bitmap file. It appears that what had gotten loaded into the frame buffer was NOT the PIXFORMAT_RGB888 that I had specified with the
Code: Select all
sensor->pixformat = PIXFORMAT_RGB888;
but rather the PIXFORMAT_JPEG that I had specified in the config object with
Code: Select all
config.pixel_format = PIXFORMAT_JPEG;
sensor->pixformat DOES NOTHING !!
In the code:
Code: Select all
sensor->set_framesize(sensor, FRAMESIZE_UXGA);
sensor->pixformat = PIXFORMAT_RGB888;
QUESTION # 1
What is the purpose of the pixformat item in the sensor_t object? Why does it allow me to change that item?
OBSERVATION # 1
It appears that whatever PIXFORMAT is specified in the config object is the format that will be used when retrieving frames from the camera and saving out to files. Trying to change it via the pixformat item in the sensor object has no impact.
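One plausible explanation, offered as a guess rather than as documented behaviour: in the esp32-camera driver, sensor_t holds plain data fields such as pixformat alongside function pointers such as set_framesize and set_pixformat. Assigning to a plain field only changes a number in RAM; nothing is sent to the sensor. The sketch below uses simplified stand-in types (mock_sensor_t, fmt_t and hw_format are invented for illustration, they are NOT the real library types) to show the difference between writing the field and calling the setter.

```c
/* Simplified stand-ins, NOT the real esp32-camera types. */
typedef enum { FMT_JPEG, FMT_RGB565, FMT_RGB888 } fmt_t;

typedef struct mock_sensor {
    fmt_t pixformat;   /* bookkeeping copy of the last format applied */
    fmt_t hw_format;   /* stands in for the sensor's hardware registers */
    int (*set_pixformat)(struct mock_sensor *s, fmt_t f);
} mock_sensor_t;

/* The setter is the only code path that touches the "hardware". */
static int mock_set_pixformat(mock_sensor_t *s, fmt_t f)
{
    s->hw_format = f;  /* the "register write" */
    s->pixformat = f;  /* keep the bookkeeping field in sync */
    return 0;
}

static mock_sensor_t mock_sensor_create(fmt_t initial)
{
    mock_sensor_t s = { initial, initial, mock_set_pixformat };
    return s;
}
```

With this mock, s.pixformat = FMT_RGB888 leaves hw_format untouched, while s.set_pixformat(&s, FMT_RGB565) changes both. Even on the real driver, calling the setter after esp_camera_init() may not be enough, since the frame buffers appear to be sized from the config object at init time, which matches the observation above.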
EXPERIMENT # 1
Having figured out that I needed to set PIXFORMAT_RGB888 right at the beginning, in the config object, I tried the following. It takes the same approach as the example code: start with the largest frame size and later (if necessary) downsize to the desired smaller size.
Code: Select all
//init with high specs to pre-allocate larger buffers
config.frame_size = FRAMESIZE_UXGA; // MAX SIZE
config.pixel_format = PIXFORMAT_RGB888; // START with the intended format
esp_err_t err = esp_camera_init(&config);
sensor_t * sensor = esp_camera_sensor_get();
sensor->set_framesize(sensor, FRAMESIZE_UXGA); // MAX SIZE
sensor->pixformat = PIXFORMAT_RGB888; // Apparently irrelevant
camera_fb_t * FrameBuffer = esp_camera_fb_get();
Code: Select all
esp_camera_init(&config);
[E][camera.c:250] camera_fb_init(): Allocating 5625 KB frame buffer Failed
[E][camera.c:1161] camera_init(): Failed to allocate frame buffer
CONCLUSION # 2
The camera simply does not have enough memory to handle the largest possible RGB888 bitmap.
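The 5625 KB in the error log is exactly 1600 x 1200 x 3 / 1024, so the driver is asking for the full raw frame. A small table like the sketch below makes it easy to see what each frame size needs before trying it. The names frame_dim_t, FRAME_DIMS, frame_bytes and fits_in_budget are made up for this post, and the memory budget is left as a parameter because the amount actually available on a given board is not something I can state with certainty.

```c
#include <stddef.h>

/* Hypothetical table of the frame sizes discussed in this post. */
typedef struct { const char *name; size_t width, height; } frame_dim_t;

static const frame_dim_t FRAME_DIMS[] = {
    {"QVGA",  320,  240},
    {"CIF",   400,  296},
    {"VGA",   640,  480},
    {"SVGA",  800,  600},
    {"XGA",  1024,  768},
    {"SXGA", 1280, 1024},
    {"UXGA", 1600, 1200},
};
#define FRAME_DIM_COUNT (sizeof FRAME_DIMS / sizeof FRAME_DIMS[0])

/* Raw size of one frame: 3 bytes per pixel for RGB888, 2 for RGB565. */
static size_t frame_bytes(const frame_dim_t *d, size_t bytes_per_pixel)
{
    return d->width * d->height * bytes_per_pixel;
}

/* Returns 1 if one raw frame fits in the given budget (in bytes), else 0. */
static int fits_in_budget(const frame_dim_t *d, size_t bytes_per_pixel, size_t budget_bytes)
{
    return frame_bytes(d, bytes_per_pixel) <= budget_bytes;
}
```

The budget passed in would be whatever free external RAM the platform reports at run time; looping over FRAME_DIMS and printing frame_bytes(...) / 1024 gives the KB figures that show up in the allocation errors.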
QUESTION # 2
WHY? Why isn't enough memory provided to handle ALL possible combinations of formats and sizes?
Why don't the examples include notes about this, something like :
FRAMESIZE_QVGA, // 320x240
FRAMESIZE_CIF, // 400x296
FRAMESIZE_VGA, // 640x480
FRAMESIZE_SVGA, // 800x600
FRAMESIZE_XGA, // 1024x768
FRAMESIZE_SXGA, // 1280x1024
FRAMESIZE_UXGA // 1600x1200 Not useable for PIXFORMAT_RGB888, insufficient memory
Why isn't there comprehensive documentation about the ESP32-CAM that includes information like this? As far as I can tell there's no such documentation at all.
EXPERIMENT # 2
I needed to find out what frame size I COULD get with PIXFORMAT_RGB888. So I tried each one in turn.
FRAMESIZE_QVGA, // 320x240 ALLOCATED 230,400 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_CIF, // 400x296 ALLOCATED 355,200 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_VGA, // 640x480 ALLOCATED 921,600 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_SVGA, // 800x600 ALLOCATED 1,440,000 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_XGA, // 1024x768 NOPE Failed to allocate 2304 KB
FRAMESIZE_SXGA, // 1280x1024 NOPE Failed to allocate 3840 KB
FRAMESIZE_UXGA // 1600x1200 NOPE Failed to allocate 5625 KB
For the three largest frame sizes it failed to allocate the necessary buffer size. I will point out that it did indeed appear to be trying to allocate the CORRECT SIZES for the frame buffer, i.e. sizes that my program calculated to be the necessary size.
For ALL others it succeeded in allocating the buffers BUT then failed during the:
Code: Select all
camera_fb_t * FrameBuffer = esp_camera_fb_get();
[E][camera.c:1344] esp_camera_fb_get(): Failed to get the frame on time!
OBSERVATION # 2
It is IMPOSSIBLE to grab a frame buffer in PIXFORMAT_RGB888 at ANY SIZE!! It either fails to allocate enough memory or it times out during the call to esp_camera_fb_get();
QUESTION # 3
HOW CAN THAT BE?? I realize that the ESP32-CAM does not include enough memory to handle the larger frame sizes for RGB888. I think it's crazy that it doesn't, but that surely seems to be the case.
But then why is it failing with a timeout after it successfully allocates the necessary (smaller) memory? The only thing that should be occurring is the simple TRANSFER of pixel data from the camera over into the frame buffer. It's not like it's trying to format the data for JPEG, which likely takes a lot of calculations. It's just pixel-in (from camera image memory), pixel-out (to frame buffer).
I would expect that, when it's not transferring the ENTIRE image, it would (could!) simply grab every Nth pixel in every Nth row in order to create the smaller frame buffer. Now, MAYBE, just maybe, it's trying to AVERAGE over a square of pixels in order to create each output pixel, and that certainly could take extra time. But WHY TIMEOUT?? Is it just not calculating how much time it should take? Would it actually SUCCEED if it were allowed just a bit more time?
SUGGESTION # 2
Do a better job of estimating the necessary time for accomplishing the pixel transfer and formatting in order to eliminate that timeout error.
Provide an OPTION that would allow the programmer to choose between an EVERY Nth PIXEL method of grabbing pixels for lower frame sizes and an AVERAGE ACROSS ADJACENT PIXELS method. There surely are times when the programmer would want one of those methods over the other. Make it an OPTION!
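To make the two methods concrete, here is a minimal sketch of both on a single-channel, 8-bit image held in ordinary memory. It is only meant to show the difference between the two approaches; it says nothing about how the sensor or the driver actually scales images, and the function names are made up for this post.

```c
#include <stddef.h>
#include <stdint.h>

/* Method 1: keep every n-th pixel of every n-th row.
   dst must hold (w / n) * (h / n) bytes. */
static void downscale_skip(const uint8_t *src, size_t w, size_t h, size_t n, uint8_t *dst)
{
    size_t ow = w / n, oh = h / n;
    for (size_t y = 0; y < oh; y++)
        for (size_t x = 0; x < ow; x++)
            dst[y * ow + x] = src[(y * n) * w + (x * n)];
}

/* Method 2: average each n-by-n block of source pixels.
   dst must hold (w / n) * (h / n) bytes. */
static void downscale_average(const uint8_t *src, size_t w, size_t h, size_t n, uint8_t *dst)
{
    size_t ow = w / n, oh = h / n;
    for (size_t y = 0; y < oh; y++) {
        for (size_t x = 0; x < ow; x++) {
            uint32_t sum = 0;
            for (size_t dy = 0; dy < n; dy++)
                for (size_t dx = 0; dx < n; dx++)
                    sum += src[(y * n + dy) * w + (x * n + dx)];
            dst[y * ow + x] = (uint8_t)(sum / (n * n));
        }
    }
}
```

The skip method reads one source pixel per output pixel, while the averaging method reads n * n of them, which is where the extra time would go.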
EXPERIMENT # 3
OK, so I CAN'T actually directly load a frame buffer with PIXFORMAT_RGB888, no matter what the size. How about the PIXFORMAT_RGB565 format? It has only 2 bytes per pixel so maybe, just maybe, it will work. As with the RGB888 format I tried each frame size in turn, and here are the results:
FRAMESIZE_QVGA, // 320x240 ALLOCATED 153,600 bytes and SUCCESS filling frame buffer, SUCCESS saving to file, BUT see below...
FRAMESIZE_CIF, // 400x296 ALLOCATED 236,800 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_VGA, // 640x480 ALLOCATED 614,400 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_SVGA, // 800x600 ALLOCATED 960,000 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_XGA, // 1024x768 ALLOCATED 1,572,864 bytes but esp_camera_fb_get() TIMEOUT FAILURE
FRAMESIZE_SXGA, // 1280x1024 NOPE Failed to allocate 2560 KB
FRAMESIZE_UXGA // 1600x1200 NOPE Failed to allocate 3750 KB
So, for six of the frame sizes it either failed to allocate enough memory or, having successfully allocated the memory, it timed out when trying to grab the frame. Only the smallest frame size, FRAMESIZE_QVGA, succeeded in allocating memory AND grabbing a frame from the camera.
OBSERVATION # 3
This is ridiculous for all the same reasons as when trying to grab the image in RGB888 format.
Oh, but wait. We DID actually have a success in grabbing that one image, at the very smallest size. And it saved out to the file as well. Let's have a look at the file using my paint program on my PC. What's that? It won't display? It says it's NOT a BITMAP file, the format of the file is not recognized?? [ And I DID make sure I put a .BMP extension on the file name ]
Let's look at how it saves it out to the file, all it does is this:
file.write(FrameBuffer->buf, FrameBuffer->len);
It's just saving the contents of the FrameBuffer. But hang on a minute. My program calculates how many bytes should be in the Frame Buffer in order to hold every single one of the pixels from the image and that's exactly how big the frame buffer IS. So the Frame Buffer holds JUST the pixel data. It does NOT include any HEADER data that normally appears in a file to identify what kind of file it is, how big it is, the particular format of the data, and so on. Which means that the file that gets created from this PIXFORMAT_RGB565 Frame Buffer does NOT HAVE THE NECESSARY HEADER DATA to identify it as bitmap data. So of course my paint program can't display anything.
CONCLUSION # 3
YOU CAN SAVE JPEG AS A VALID FILE
It surely APPEARS that when you grab a Frame Buffer in PIXFORMAT_JPEG it puts not only the compressed JPEG image data into the Frame Buffer but also the JPEG HEADER information at the beginning. If it didn't then opening up a saved JPEG file would FAIL just like it does when trying to open a saved BMP file.
YOU CANNOT SAVE BMP AS A VALID FILE
And when you have PIXFORMAT_RGB565 data in the Frame Buffer it does NOT have the necessary header info, so OF COURSE IT FAILS to be displayed by my paint program. It's simply NOT A VALID BMP FILE!!!
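For reference, the missing piece is small. A 24-bit BMP file is a 54-byte header followed by the pixel rows, with each row padded to a multiple of 4 bytes and each pixel stored in blue, green, red order. The sketch below builds only that header; the function names are made up for this post. RGB565 data would first have to be expanded to 3 bytes per pixel, which is not shown here. The esp32-camera library also has a frame2bmp() converter in img_converters.h, which is not used in this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BMP_HEADER_SIZE 54u

/* BMP header fields are little-endian. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Bytes per row in a 24-bit BMP: rows are padded to a multiple of 4. */
static uint32_t bmp24_row_bytes(uint32_t width)
{
    return (width * 3u + 3u) & ~3u;
}

/* Fill a 54-byte header for a 24-bit, uncompressed, top-down BMP. */
static void bmp24_write_header(uint8_t out[BMP_HEADER_SIZE], uint32_t width, uint32_t height)
{
    uint32_t image_bytes = bmp24_row_bytes(width) * height;
    memset(out, 0, BMP_HEADER_SIZE);
    out[0] = 'B';
    out[1] = 'M';
    put_le32(out + 2, BMP_HEADER_SIZE + image_bytes); /* total file size */
    put_le32(out + 10, BMP_HEADER_SIZE);              /* offset to pixel data */
    put_le32(out + 14, 40);                           /* BITMAPINFOHEADER size */
    put_le32(out + 18, width);
    put_le32(out + 22, 0u - height);                  /* negative height = top row first */
    put_le16(out + 26, 1);                            /* colour planes */
    put_le16(out + 28, 24);                           /* bits per pixel */
    put_le32(out + 34, image_bytes);                  /* size of the pixel data */
}
```

Writing these 54 bytes with file.write() before the (converted) pixel rows should give a file a paint program can open, assuming the rows are written in BGR order with the padding described above.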
QUESTION # 4
WHY? WHY? WHY?
OBSERVATION # (I've lost track)
Now, my whole objective is to have pixel data, ideally in RGB888 but if need be just in RGB565 format, so that my program can then scan through the data (easily) and analyze what it sees. I also would like to be able to modify the pixel data to, for example, draw a box around pixels of interest. SO, it is indeed useful to have a Frame Buffer that ONLY has the pixel data (and no header data in it).
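For what it's worth, scanning and modifying a header-less RGB565 buffer takes only a few lines. The sketch below assumes 2 bytes per pixel with the high byte stored first and rows stored one after another with no padding. That byte order is my assumption about the camera's output and should be checked against a known test image. The type rgb_t and the function names are made up for this post.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t;

/* Read pixel (x, y). Assumption: 2 bytes per pixel, high byte first. */
static rgb_t rgb565_get(const uint8_t *buf, size_t width, size_t x, size_t y)
{
    const uint8_t *p = buf + (y * width + x) * 2;
    uint16_t v = (uint16_t)((p[0] << 8) | p[1]);
    rgb_t c;
    c.r = (uint8_t)(((v >> 11) & 0x1F) << 3); /* 5 bits of red   */
    c.g = (uint8_t)(((v >> 5) & 0x3F) << 2);  /* 6 bits of green */
    c.b = (uint8_t)((v & 0x1F) << 3);         /* 5 bits of blue  */
    return c;
}

/* Write pixel (x, y), dropping the low bits of each channel. */
static void rgb565_set(uint8_t *buf, size_t width, size_t x, size_t y, rgb_t c)
{
    uint16_t v = (uint16_t)(((c.r >> 3) << 11) | ((c.g >> 2) << 5) | (c.b >> 3));
    uint8_t *p = buf + (y * width + x) * 2;
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

/* Draw a one-pixel-wide box outline from (x0, y0) to (x1, y1) inclusive. */
static void rgb565_draw_box(uint8_t *buf, size_t width,
                            size_t x0, size_t y0, size_t x1, size_t y1, rgb_t c)
{
    for (size_t x = x0; x <= x1; x++) {
        rgb565_set(buf, width, x, y0, c);
        rgb565_set(buf, width, x, y1, c);
    }
    for (size_t y = y0; y <= y1; y++) {
        rgb565_set(buf, width, x0, y, c);
        rgb565_set(buf, width, x1, y, c);
    }
}
```

With a grabbed frame these would be called on FrameBuffer->buf with FrameBuffer->width as the width. The caller is responsible for keeping the coordinates inside the image.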
But a Frame Buffer with only pixel data is DIFFERENT from one with header data as well. Yet there's no obvious distinction made between the Frame Buffer made using PIXFORMAT_JPEG and one made using PIXFORMAT_RGB565 or PIXFORMAT_RGB888. There's no documentation telling you what to expect and what you can DO with them. It's perfectly valid, from a programming standpoint, to SAVE the Frame Buffer out to a file. But doing so with PIXFORMAT_JPEG creates a valid JPEG file while doing so with PIXFORMAT_RGB565 or PIXFORMAT_RGB888 creates an invalid, unusable file. WHY???
SUGGESTION
DOCUMENTATION !!
At the very least there needs to be complete, comprehensive, thorough DOCUMENTATION of how this all works, what formats can be handled, which ones can't. But there isn't. It's left to the programmer to wade through the examples and TRY to figure out what's going on and how to accomplish what they need to do. But even those examples don't provide enough info. It looks like they don't even try.
Oh, and look through the actual core code of the functions that manipulate the camera? Barely a comment here or there. No overview of how it all works. No explanations of WHAT each method is trying to accomplish or HOW it's trying to accomplish it. There's just the bare code. Totally insufficient. I would be embarrassed to hand in code like that to my boss.
BITMAP DATA ACCESS AND FILES !!
It's crazy that getting actual BITMAP data directly from the camera is nearly impossible, limited to only PIXFORMAT_RGB565 and the very smallest frame size. THAT'S NUTS!
There needs to be a way to get genuine BITMAP data in a format that can then be sent out to a file with that file being a VALID BITMAP file. But there also needs to be a way to ACCESS the actual pixel data in the program in order to analyze and/or modify it.
OBSERVATION: LAST
I've come across some code on various sites and in YouTube videos suggesting that "the best way to get BITMAP data" is to first grab it as JPEG data and then use one of the conversion functions that are provided. Whoa. So many issues with that.
First of all, JPEG is a LOSSY format. Camera image sensor -to- JPEG -to- BMP means you don't get the full, actual, original pixel data. Yet that's what we want (or at least some of us do).
Secondly, when you've loaded a reasonably large JPEG image into memory and you select one of the conversions, like JPEG to RGB565, a second buffer has to be allocated for the BITMAP data, in addition to the already allocated JPEG buffer, and that often results in a memory failure. There just isn't enough memory on the ESP32-CAM to accommodate the target BITMAP buffer. So it fails.
I did a FEW tests and it's pretty clear that it only succeeds if you've grabbed one of the SMALLEST JPEG frame buffers. But, dang it, I want a BITMAP that's the BIGGEST possible size. Why can't I do that??? Oh, right, the designers didn't provide enough memory on the board to handle it.
SUGGESTION : FINAL
So here's my final suggestion: since we can't grab an entire FRAMESIZE_UXGA (1600x1200) frame, how about a new function that grabs a PORTION of the specified frame? When you indicate that you want FRAMESIZE_UXGA it implicitly means that you want the ACTUAL pixel data available from the camera. You don't want it averaged or picking up pixels at regular intervals, you want the actual resolution of the camera. So how about this:
Code: Select all
esp_camera_fb_get(size_t UpperLeftX, size_t UpperLeftY, size_t Height, size_t Width);
Taking that approach, the program could get any subsection of the image it wants AND at the resolution it wants (resolution effectively being specified by the requested frame size).
In my particular app I'd be perfectly fine with getting just a small subsection as long as it had the highest resolution. If I scan that and don't find what I'm looking for I can load a different subsection and keep doing that until I find it. I can then remember what subsection it's in and only load that subsection on the next pass as I'm tracking its movement. Frankly, it would make tracking objects MUCH FASTER!
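The bookkeeping for that kind of scan is simple. The sketch below only works out which rectangles to request; the windowed grab itself is the proposed function and does not exist as far as I know. The names rect_t, tile_count and tile_rect are made up for this post.

```c
#include <stddef.h>

typedef struct { size_t x, y, w, h; } rect_t;

/* Number of tiles needed to cover the full frame (edge tiles may be smaller). */
static size_t tile_count(size_t full_w, size_t full_h, size_t tile_w, size_t tile_h)
{
    size_t cols = (full_w + tile_w - 1) / tile_w;
    size_t rows = (full_h + tile_h - 1) / tile_h;
    return cols * rows;
}

/* Rectangle of tile number `index`, counting left to right, top to bottom. */
static rect_t tile_rect(size_t full_w, size_t full_h, size_t tile_w, size_t tile_h, size_t index)
{
    size_t cols = (full_w + tile_w - 1) / tile_w;
    rect_t r;
    r.x = (index % cols) * tile_w;
    r.y = (index / cols) * tile_h;
    r.w = (r.x + tile_w > full_w) ? full_w - r.x : tile_w; /* clip at right edge  */
    r.h = (r.y + tile_h > full_h) ? full_h - r.y : tile_h; /* clip at bottom edge */
    return r;
}
```

Covering 1600x1200 with 400x300 tiles takes 16 requests, and each RGB888 tile would need only 400 x 300 x 3 = 360,000 bytes. Once the object is found, the program would remember the tile index and request only that tile on the next pass.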
So, please, whoever is in charge of the core code, consider adding this function.
MY APP
As far as my app is concerned, and as the ESP32-CAM stands, I can't use the ESP32-CAM for this project. It just doesn't do what I need it to do. It's just utterly amazing to me that it contains a super high resolution camera but the programmer really does NOT have access to its full resolution.