New I2S driver microphone bytes format

Posted: Sun Feb 09, 2025 1:10 am
by messydr

I want to move from a setup using the Arduino IDE with the legacy I2S driver ("driver/i2s.h") to ESP-IDF with the new driver ("driver/i2s_std.h").

The hardware / logic setup is:
  • ESP32-C3
  • MEMS microphone INMP441
  • GPIO numbers used: WS 0, SCK 1, SD 2
  • 16-bit, mono, 44.1 kHz
  • data is sent over a network socket to a host
Highlights of the working Arduino IDE setup with the legacy I2S driver:

const i2s_config_t i2s_config = {.mode = i2s_mode_t(I2S_MODE_MASTER | I2S_MODE_RX),
				 .sample_rate = 44100,
				 .bits_per_sample = i2s_bits_per_sample_t(16),
				 .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
				 .communication_format = i2s_comm_format_t(I2S_COMM_FORMAT_STAND_I2S),
				 .intr_alloc_flags = 0,
				 .dma_desc_num = 10,
				 .dma_frame_num = 1024,
				 .use_apll = false};

int16_t volume_samples[1024];
size_t bytesIn = 0;

i2s_read(I2S_ESP32_PORT, &volume_samples, 2048, &bytesIn, portMAX_DELAY);
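
For reference, the buffer sizes above follow from the 16-bit mono format. A minimal sketch of the arithmetic, using only the numbers already given in this post (the constant names are just for illustration):

```python
# Numbers taken from the setup described above; names are illustrative.
SAMPLE_RATE_HZ = 44100    # 44.1 kHz
BYTES_PER_SAMPLE = 2      # 16-bit mono
SAMPLES_PER_READ = 1024   # length of volume_samples

bytes_per_read = SAMPLES_PER_READ * BYTES_PER_SAMPLE     # size passed to i2s_read
ms_per_read = 1000 * SAMPLES_PER_READ / SAMPLE_RATE_HZ   # audio duration of one read
bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE     # stream rate on the socket

print(bytes_per_read)         # 2048
print(round(ms_per_read, 1))  # 23.2
print(bytes_per_second)       # 88200
```

So each read of 2048 bytes should carry roughly 23 ms of audio if every sample really occupies 2 bytes.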
Audio data is:
  • sent as:

	send(sockfd, volume_samples, bytesIn, 0);
  • received (for testing) in a Python script and stored as:

	with open(filename, "ab") as file:
		data = current_client_socket.recv(2048)
		if data:
			file.write(data)
  • decoded to signed int16 in a Python script as:

	with open(args.filename, "rb") as file:
		volumes = np.fromfile(file, dtype=np.int16)
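
The host side of the steps above can be sketched with only the standard library. This is a rough illustration, not my actual script: `receive_all` and `decode_int16_le` are hypothetical helper names, and it assumes the ESP32-C3 sends samples in little-endian byte order. Note that `recv()` may return fewer bytes than requested, so the sketch loops until the peer closes the connection.

```python
import socket
import struct

def receive_all(conn, chunk_size=2048):
    """Read from conn until the peer closes the connection."""
    chunks = []
    while True:
        data = conn.recv(chunk_size)  # may return fewer than chunk_size bytes
        if not data:
            break
        chunks.append(data)
    return b"".join(chunks)

def decode_int16_le(raw):
    """Decode little-endian signed 16-bit samples, dropping a trailing odd byte."""
    count = len(raw) // 2
    return list(struct.unpack(f"<{count}h", raw[:count * 2]))

# Small self-test with a local socket pair standing in for the ESP32 and the host.
sender, receiver = socket.socketpair()
sender.sendall(struct.pack("<4h", -3, 0, 7, -32768))
sender.close()
print(decode_int16_le(receive_all(receiver)))  # [-3, 0, 7, -32768]
receiver.close()
```

On a little-endian host, `decode_int16_le` should give the same values as `np.fromfile(..., dtype=np.int16)` for a file with an even number of bytes.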
The legacy setup works fine. After decoding to int16, the audio looks like this example:
Result with Arduino IDE and old (legacy) I2S driver
I want to replicate this setup with ESP-IDF and the new I2S driver as follows:
  • config

i2s_chan_config_t rx_chan_cfg = {
		.id = I2S_NUM_AUTO,
		.role = I2S_ROLE_MASTER,
		.dma_desc_num = 10,
		.dma_frame_num = 1024,
};
i2s_std_config_t rx_std_cfg = {
		.clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(44100),
		.gpio_cfg = {
			.mclk = I2S_GPIO_UNUSED,
			.bclk = GPIO_NUM_1,
			.ws = GPIO_NUM_0,
			.dout = I2S_GPIO_UNUSED,
			.din = GPIO_NUM_2,
			.invert_flags = {
					.mclk_inv = false,
					.bclk_inv = false,
					.ws_inv = false,
			},
		},
};
rx_std_cfg.slot_cfg.slot_mask = I2S_STD_SLOT_LEFT;
ESP_ERROR_CHECK(i2s_channel_init_std_mode(rx_chan, &rx_std_cfg));
  • reading data

i2s_channel_read(rx_chan, &volume_samples, 2048, &bytesIn, portMAX_DELAY);
  • data is sent, received and decoded in the same way as outlined above
Unfortunately, with ESP-IDF and the new I2S driver I can't get proper data. After decoding, I get something like this:
Result with ESP-IDF and new I2S driver
Strangely, although the data contains only positive numbers (and in a very narrow range), the audio (for example, human speech) is understandable, but obviously very distorted.
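
To put numbers on "only positive and in a very narrow range", a quick summary of the decoded samples can help when comparing the two drivers. A minimal sketch follows; `summarize` is a hypothetical helper, and the sample values are made up for illustration, not taken from my recordings. It only describes the symptom and says nothing about the cause.

```python
def summarize(samples):
    """Return basic statistics for a non-empty sequence of decoded int16 samples."""
    lowest = min(samples)
    highest = max(samples)
    return {
        "min": lowest,
        "max": highest,
        "mean": sum(samples) / len(samples),   # a mean far from 0 suggests an offset
        "peak_to_peak": highest - lowest,      # a small span means a narrow range
        "all_non_negative": lowest >= 0,
    }

# Made-up values that mimic the symptom: positive only, narrow range.
stats = summarize([1200, 1210, 1190, 1205, 1195])
print(stats["min"], stats["max"], stats["mean"], stats["peak_to_peak"])
# 1190 1210 1200.0 20
```

Running the same summary on a recording from the legacy driver and one from the new driver would show whether the difference is mainly an offset, a scale factor, or both.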

How can I get the correct signed int16 values of audio data with the new I2S driver?
Could this be a bug? I tried to use the same settings with both drivers (old and new), so I'd expect them to give the same results.