MEMS I2S Audio and ESP-DSP FFT
Posted: Tue Aug 27, 2019 6:50 am
I'd like to get feedback on an attempt to pull audio off a single MEMS I2S microphone on one channel and run it through the ESP-DSP FFT library...
The MEMS microphone produces I2S digital audio: 32-bit words per channel, with 24 bits of MSB-aligned data per word, of which 18 bits are significant. Data on non-connected channels is tri-state; adding a pull-down resistor to the data line clears up the unconnected channel, which makes it easier to tell what's what.
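As a rough illustration of that word layout, the sketch below pulls a signed sample out of one raw 32-bit word. It assumes the data is two's complement and left-justified in the word, as described above; the helper names are made up for this example and are not part of ESP-IDF or ESP-DSP.

```cpp
#include <cstdint>

// Hypothetical helpers, not ESP-IDF API. Assumes two's-complement data,
// MSB-aligned in the 32-bit word (significant bits are 31..14).

// Keep all 18 significant bits: range -131072..131071, too wide for int16_t.
inline int32_t sample_18bit(uint32_t raw)
{
    // Right-shifting a negative signed value is an arithmetic shift on
    // GCC/Clang (and guaranteed from C++20), so the sign is preserved.
    return static_cast<int32_t>(raw) >> 14;
}

// Keep only the top 16 bits so the result always fits an int16_t FFT input.
inline int16_t sample_16bit(uint32_t raw)
{
    return static_cast<int16_t>(static_cast<int32_t>(raw) >> 16);
}
```

Shifting right by 14 keeps the full 18-bit value, which can exceed the 16-bit range on loud input; shifting by 16 drops the two lowest bits in exchange for a guaranteed fit. With this layout the low 16 bits of the raw word carry only the two least significant sample bits.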
The I2S appears to be acting as one would expect, with values increasing in amplitude with sound pressure. I read off the uint32_t values, right shift them 14 bits and cast them into an int with 16-bit range. To this end I decided to try the ESP-DSP sc16 FFT, following the logic of the ESP-DSP 32-bit float FFT basic_math demo. My FFT code looks like this:
From here I sum the y_cf array into 1024 processed values (half of the 2048 raw data values) using the formula in the demo where the processed value for bucket i is:
Looking at the resulting data, the values range between 0 and 25, with a mean of around 15, but when I apply sound pressure I see no variation. As the raw data from I2S does vary, I'm wondering if my use of the FFT is flawed.
Here's the I2S config:
const i2s_config_t i2s_config = {
    .mode = i2s_mode_t( I2S_MODE_MASTER | I2S_MODE_RX ), // Receive as master
    .sample_rate = 22050, // 22.05 kHz sampling gives an 11.025 kHz maximum audio frequency (Nyquist)
    .bits_per_sample = I2S_BITS_PER_SAMPLE_32BIT, // 32 clock pulses per channel
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT, // Only capture the left channel
    .communication_format = i2s_comm_format_t( I2S_COMM_FORMAT_I2S | I2S_COMM_FORMAT_I2S_MSB ), // Philips format
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1, // Interrupt level 1
    .dma_buf_count = 9, // (9-1)*256 = 2048 samples per read (1 buffer in use and unavailable during read)
    .dma_buf_len = 256,
    .use_apll = false,
    .tx_desc_auto_clear = false,
    .fixed_mclk = 0 };
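To put numbers on the Nyquist comment in the config, here is a small sketch of the frequency arithmetic for this sample rate with a 2048-point FFT. The constants mirror the values in the post; the helper names are assumptions for illustration only.

```cpp
constexpr float kSampleRateHz = 22050.0f; // .sample_rate from the config
constexpr int   kFftSize      = 2048;     // samples per FFT frame

// Highest representable frequency (Nyquist): half the sample rate.
constexpr float nyquist_hz()
{
    return kSampleRateHz / 2.0f;
}

// Frequency of FFT bin k, meaningful for 0 <= k <= kFftSize / 2.
constexpr float bin_to_hz(int k)
{
    return static_cast<float>(k) * kSampleRateHz / static_cast<float>(kFftSize);
}
```

With these values each bin is roughly 10.8 Hz wide, and one 2048-sample frame covers about 93 ms of audio.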
Here's the FFT code:
// Create the complex vector
for ( int i = 0; i < 2048; i++ )
{
    int val = ( *(uint16_t*) &m_samples [ i ] ); // m_samples are the raw 32 bit words
    y_cf [ i * 2 + 0 ] = val; // Real part; no windowing here, although the basic_math demo applies one
    y_cf [ i * 2 + 1 ] = 0;   // Imaginary part
}
dsps_fft2r_sc16( data, length ); // The FFT calc
dsps_bit_rev_sc16_ansi( data, length ); // Bit reverse, as used in the 32-bit float demo - Do not know what this does or why
dsps_cplx2reC_sc16( data, length ); // Convert one complex vector to two complex vectors - Do not know what this does or why
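A likely explanation for the bit-reverse step: an in-place radix-2 FFT typically leaves its output in bit-reversed index order, so a reordering pass is needed before bin k sits at index k. The sketch below shows the index mapping only; it is a generic illustration with a hypothetical helper name, not the ESP-DSP implementation.

```cpp
#include <cstdint>

// Reverse the lowest `bits` bits of `index`.
// For a 2048-point FFT, bits = 11 because 2^11 = 2048.
inline uint32_t reverse_bits(uint32_t index, unsigned bits)
{
    uint32_t out = 0;
    for (unsigned b = 0; b < bits; ++b) {
        out = (out << 1) | (index & 1u); // move the current low bit into the result
        index >>= 1;
    }
    return out;
}
```

For example, with 11 bits index 1 maps to index 1024, and applying the mapping twice returns the original index, which is why the reordering can be done as a series of swaps.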
Here's the formula for bucket i:
10 * log10f(
    ( y_cf [ i * 2 + 0 ] * y_cf [ i * 2 + 0 ]
    + y_cf [ i * 2 + 1 ] * y_cf [ i * 2 + 1 ] ) / 2048 )
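For reference, the same formula can be written as a small standalone function. This is a sketch that assumes y_cf holds int16_t values (as the sc16 routines would suggest); in that case the expression above divides integers, so any bin whose power is below 2048 truncates to 0 before the logarithm. The function name and the floor value are illustrative choices, not taken from the demo.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Power of one complex FFT bin in dB, normalised by the FFT length n.
// kFloor is an arbitrary illustrative floor that avoids log10(0) on empty bins.
inline float bin_power_db(int16_t re, int16_t im, int n)
{
    assert(n > 0);
    constexpr float kFloor = 1e-6f;
    // Convert to float before dividing so small powers are not truncated to 0.
    const float power = static_cast<float>(re) * static_cast<float>(re)
                      + static_cast<float>(im) * static_cast<float>(im);
    return 10.0f * std::log10(std::fmax(power / static_cast<float>(n), kFloor));
}
```

It would be called once per bucket, for example bin_power_db( y_cf [ i * 2 + 0 ], y_cf [ i * 2 + 1 ], 2048 ) for i from 0 to 1023.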
Any thoughts appreciated!