Inputting audio to an ESP32 from an INMP441 I2S microphone: success
Posted: Wed Apr 15, 2020 9:15 am
I'm working on a baby monitor. I've got this microphone: https://invensense.tdk.com/products/digital/inmp441/
So I got this audio input to work! Here I'll share my findings.
The end result is an array of 16-bit signed samples that are good to go. In a quiet room the peak samples hover around 7 after converting to int16, and around 200 to 300 when I speak normally half a metre away.
The microphone is a 24-bit one, but if you use bits_per_sample = I2S_BITS_PER_SAMPLE_24BIT it doesn't work (maybe an ESP bug?). Anyway, 32 bits works with some workarounds.
Here's the setup code (note that this mic is not PDM):
Code:
// At the top of the sketch:
#include <driver/i2s.h> // ESP-IDF I2S driver: i2s_config_t, i2s_driver_install, i2s_read.
#include <esp_now.h>    // Provides ESP_NOW_MAX_DATA_LEN (250), used for the buffer sizes.

// In setup():
i2s_config_t i2s_config = {
  .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
  .sample_rate = 11025, // or 44100 if you like
  .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT, // 24BIT didn't work for me.
  .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT, // Ground the L/R pin on the INMP441.
  .communication_format = i2s_comm_format_t(I2S_COMM_FORMAT_I2S | I2S_COMM_FORMAT_I2S_MSB),
  .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
  .dma_buf_count = 4,
  .dma_buf_len = ESP_NOW_MAX_DATA_LEN * 4, // 250 * 4 = 1000.
  .use_apll = false,
  .tx_desc_auto_clear = false,
  .fixed_mclk = 0,
};
if (ESP_OK != i2s_driver_install(I2S_NUM_0, &i2s_config, 0, NULL)) {
  Serial.println("i2s_driver_install: error");
}
i2s_pin_config_t pin_config = {
  .bck_io_num = 14, // Bit Clock.
  .ws_io_num = 15, // Word Select aka left/right clock aka LRCL.
  .data_out_num = -1, // Not used, we only receive.
  .data_in_num = 34, // Data-out of the mic. (someone used 23 on forums).
};
if (ESP_OK != i2s_set_pin(I2S_NUM_0, &pin_config)) {
  Serial.println("i2s_set_pin: error");
}
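Here's the code to input the data:
Code:
size_t bytesRead = 0;
uint8_t buffer32[ESP_NOW_MAX_DATA_LEN * 4] = {0}; // Room for 250 samples of 4 bytes each.
// Wait up to 100 ticks for data; bytesRead reports how many bytes actually arrived.
if (ESP_OK != i2s_read(I2S_NUM_0, buffer32, sizeof(buffer32), &bytesRead, 100)) {
  Serial.println("i2s_read: error");
}
int samplesRead = bytesRead / 4; // 4 bytes per 32-bit sample.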
The 4 bytes that make up each sample in buffer32 are as follows:
Byte 1: E0 or 00, regardless of the sign of the other bytes. Discard it!
Byte 2: Least significant byte of the sample. You can also discard this because it's just fuzz, even in a quiet room.
Byte 3: Middle byte of the sample.
Byte 4: Most significant byte of the sample (this one carries the sign).
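If you'd rather keep the full 24 bits, the layout above can be unpacked roughly like this. It's only a sketch: unpack24 is a made-up helper name, not part of any library, and it assumes the byte order described above (padding byte first, most significant byte last).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: rebuild one signed 24-bit sample from the 4 bytes
// read for it, assuming the byte order described above.
int32_t unpack24(const uint8_t *p) {
  // Place the three data bytes in the top 24 bits; p[0] is padding and is dropped.
  uint32_t u = ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) | ((uint32_t)p[1] << 8);
  int32_t s;
  std::memcpy(&s, &u, sizeof(s)); // Reinterpret the bit pattern as signed.
  return s / 256;                 // The low 8 bits are zero, so this division is exact.
}
```

For example, unpack24(&buffer32[i * 4]) would give sample i as a value between -8388608 and 8388607.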
Now what I recommend is converting it to 16-bit signed samples:
Code:
int16_t buffer16[ESP_NOW_MAX_DATA_LEN] = {0};
for (int i = 0; i < samplesRead; i++) {
  uint8_t mid = buffer32[i * 4 + 2]; // Middle byte.
  uint8_t msb = buffer32[i * 4 + 3]; // Most significant byte, carries the sign.
  uint16_t raw = (((uint32_t)msb) << 8) + ((uint32_t)mid);
  memcpy(&buffer16[i], &raw, sizeof(raw)); // Copy the bit pattern so it is reinterpreted as signed, not converted.
}
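To sanity-check the levels coming out of that loop, something like the following can report the peak of a buffer. This is a rough sketch; peakLevel is a made-up name for illustration, not an existing API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: largest absolute sample value in a buffer.
// Returns int32_t because the magnitude of -32768 does not fit in an int16_t.
int32_t peakLevel(const int16_t *samples, size_t count) {
  int32_t peak = 0;
  for (size_t i = 0; i < count; i++) {
    int32_t v = samples[i];
    if (v < 0) v = -v;
    if (v > peak) peak = v;
  }
  return peak;
}
```

Printing peakLevel(buffer16, samplesRead) after each read is one way to compare a quiet room against normal speech.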
If you'd like to play them back on the built-in DAC, what I've found is that the DAC expects unsigned samples, so basically you cast each sample to an int32_t, add 0x8000, then cast back to a uint16_t.
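That signed-to-unsigned step could look something like this. It's a minimal sketch and toDacSample is a made-up helper name; it only covers the offset conversion, not the I2S output setup.

```cpp
#include <cstdint>

// Hypothetical helper: shift a signed 16-bit sample into the unsigned range.
uint16_t toDacSample(int16_t sample) {
  int32_t shifted = (int32_t)sample + 0x8000; // -32768..32767 becomes 0..65535.
  return (uint16_t)shifted;
}
```

Silence (a sample of 0) ends up at 0x8000, the middle of the unsigned range.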
Hope that helps someone