Unable to do simple audio transformations

Post by **MicroController** » Thu Oct 12, 2023 8:47 pm

My 2 cents:

Cent #1: You read from indices "j + ...", but you write the data to "sIndex + j + ..." - effectively overwriting all data with the data from the first 4 frames. That's totally not what you want.

Cent #2: I'm still not sure about the sample values you posted. Is it plausible that your input's amplitude is within a few percent of the maximum possible 'loudness'? Also notice that e.g. the first sample (-8186112), when multiplied by 1.1, yields -9004723, which is beyond the range of a 24-bit signed value (-8388608 to 8388607).
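
To make the overflow in Cent #2 concrete, here is a minimal sketch of saturating a scaled sample to the signed 24-bit range instead of letting it wrap. The names `INT24_MIN`, `INT24_MAX`, `clamp24` and `scale24` are made up for illustration and do not come from the code discussed in this thread; the sketch also assumes moderate gains, so that the scaled value still fits in an `int32_t`.

```c
#include <stdint.h>

// Limits of a signed 24-bit sample (hypothetical names, not from the thread's code).
#define INT24_MAX  8388607
#define INT24_MIN (-8388608)

// Saturate v to the signed 24-bit range instead of letting it wrap around.
static int32_t clamp24(int32_t v) {
    if (v > INT24_MAX) return INT24_MAX;
    if (v < INT24_MIN) return INT24_MIN;
    return v;
}

// Apply a float gain to one sample and saturate the result.
// Assumption: the gain is small enough that sample * gain fits in an int32_t.
static int32_t scale24(int32_t sample, float gain) {
    return clamp24((int32_t)((float)sample * gain));
}
```

With the values from the post, `scale24(-8186112, 1.1f)` saturates at -8388608 rather than producing a value that no longer fits in three bytes.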

Post by **MicroController** » Thu Oct 12, 2023 9:29 pm

Here's how I would probably draft it:

Code:

static const uint32_t bytes_per_sample = 3;
static const uint32_t samples_per_frame = 2;

// Read a big-endian, signed 24-bit value from src
static int32_t read24(const uint8_t* const src) {
    int32_t r = (int8_t)src[0]; // takes care of the sign-extension
    r = (r<<8) | src[1];
    r = (r<<8) | src[2];
    return r;
}

// Write a signed 24-bit value as big-endian to dest
static void write24(const int32_t v, uint8_t* const dest) {
    dest[0] = v >> 16;
    dest[1] = v >> 8;
    dest[2] = v;
}

void setAudioGain(uint8_t *buf, const size_t bLength, const float gain) {

    uint32_t frames_left = bLength/(bytes_per_sample * samples_per_frame);

    while(frames_left) {

        int32_t left = read24(buf);
        left = (int32_t)(left * gain);
        write24(left, buf);

        buf += bytes_per_sample;

        int32_t right = read24(buf);
        right = (int32_t)(right * gain);
        write24(right, buf);

        buf += bytes_per_sample;

        frames_left -= 1;

    }
}

Post by **SlimyRedstone** » Fri Oct 13, 2023 5:24 pm

So, I've adapted your draft to my code, and there is more strange behavior.
When the gain is exactly 0.5 or 1.5, sound comes out, but as before it is saturated; I can hear myself, though as a glitchy version.
If the gain is 1.0, the sound is normal; with any other value (apart from 0.5 and 1.5) there is no sound at all.
Here is the code that I wrote based on your draft:

Code:

#define CONVERT8bits(n,o) (uint8_t)((n>>8*o)&0xFF)
 
static int32_t read24(const uint8_t *buffer) {
  int32_t value = (int8_t)buffer[0];
  value = (value<<8) | buffer[1];
  value = (value<<8) | buffer[2];
  return value;
}
 
void setAudioGain(uint8_t *buffer, size_t bLength, float gain) {
  int32_t *sample = calloc(2,sizeof(int32_t)); // L : R
  for (uint32_t sIndex = 0; sIndex < bLength; sIndex += 6) {
 
    sample[Left]  = read24(buffer+0);
    sample[Right] = read24(buffer+3);
 
    sample[Left]  = (int32_t)(sample[Left] * gain);
    sample[Right] = (int32_t)(sample[Right] * gain);
 
    *((uint8_t *)(buffer + 0 )) = CONVERT8bits(sample[Left],2);   //  Left Channel MSB
    *((uint8_t *)(buffer + 1 )) = CONVERT8bits(sample[Left],1);   //  Left Channel
    *((uint8_t *)(buffer + 2 )) = CONVERT8bits(sample[Left],0);   //  Left Channel LSB
 
    *((uint8_t *)(buffer + 3 )) = CONVERT8bits(sample[Right],2);  //  Right Channel MSB
    *((uint8_t *)(buffer + 4 )) = CONVERT8bits(sample[Right],1);  //  Right Channel
    *((uint8_t *)(buffer + 5 )) = CONVERT8bits(sample[Right],0);  //  Right Channel LSB
 
    buffer += 6;
  }
  free(sample);
}

Thank you for pointing out the simple mistake of writing into parts of the buffer that still held unprocessed data. I've removed the second for-loop.

Post by **SlimyRedstone** » Fri Oct 13, 2023 5:46 pm

I don't know how or why, but I finally managed to do it.
For two months I tried every kind of witchcraft, and then, while clearing out the bloated lines and the commented-out ones that definitely did not work, it suddenly worked!
Thank you both (MicroController and ESP_Sprite).
Here is the working code (for one channel). If you want more, PM me; I'll be glad to help.

Code:

void setAudioGain(uint8_t *buffer, size_t bLength, double gain) {
  int32_t sample = 0;
  for (uint32_t sIndex = 0; sIndex < bLength; sIndex += 6) {
    sample = (int32_t)(((int8_t)buffer[sIndex + 2])<<16 | (buffer[sIndex + 1])<<8 | (buffer[sIndex + 0]));
 
    sample = (int32_t)((double)(sample)  * gain);
 
    buffer[sIndex + 0] = sample  >>  0;
    buffer[sIndex + 1] = sample  >>  8;
    buffer[sIndex + 2] = sample  >> 16;
  }
}

Next step, an audio mixer.

Post by **MicroController** » Fri Oct 13, 2023 8:27 pm

Code:

sample = (int32_t)(((int8_t)buffer[sIndex + 2])<<16 | (buffer[sIndex + 1])<<8 | (buffer[sIndex + 0]));

So the byte order actually was little-endian all along.
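
For reference, a little-endian counterpart to the earlier `read24`/`write24` draft could look roughly like the sketch below. The names `read24le` and `write24le` are hypothetical; the sketch assembles the bytes as unsigned values and sign-extends afterwards, which avoids left-shifting a negative number.

```c
#include <stdint.h>

// Read a little-endian, signed 24-bit value from src (hypothetical helper name).
static int32_t read24le(const uint8_t *src) {
    // Assemble the three bytes as an unsigned 24-bit value, least significant byte first.
    uint32_t u = (uint32_t)src[0] | ((uint32_t)src[1] << 8) | ((uint32_t)src[2] << 16);
    // Sign-extend from bit 23: flip the sign bit, then subtract its weight.
    return (int32_t)(u ^ 0x800000u) - 0x800000;
}

// Write the low 24 bits of v to dest, least significant byte first.
static void write24le(int32_t v, uint8_t *dest) {
    uint32_t u = (uint32_t)v;
    dest[0] = (uint8_t)(u & 0xFFu);
    dest[1] = (uint8_t)((u >> 8) & 0xFFu);
    dest[2] = (uint8_t)((u >> 16) & 0xFFu);
}
```

As a check, the bytes 0x00 0x17 0x83 read in this order decode to -8186112, the first sample value mentioned earlier in the thread.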

Btw, for performance reasons, I would definitely steer clear of using double, and generally would avoid/reduce floating point calculations, see also here. Fixed-point integer calculations are an alternative worth exploring.

Post by **SlimyRedstone** » Fri Oct 13, 2023 9:11 pm

Do you mean instead of doing that:

Code:

float gain = 2.00;
sample[Left]  = (int32_t)((double)(sample[Left])  * gain);

I should do it in this manner:

Code:

uint32_t gain = 200; // actual gain is  -> gain/100
sample[Left]  = (int32_t)((int64_t)(sample[Left])  * gain / 100);

Post by **MicroController** » Fri Oct 13, 2023 9:31 pm

Yes, exactly.
Usually one chooses a power of two as scale, e.g. 256=2^8, so that scaling can be handled via cheap bit-shifting instead of (integer) multiplication and division.
Like

Code:

const uint32_t scale_bits = 10; // Taking 10 bits as an example specifically because 24+10 bits won't fit into an int32_t
const uint32_t gain_factor = (uint32_t)(gain * (1<<scale_bits));
for (...) {
  ...
  sample = (int32_t)(((int64_t)sample * gain_factor) >> scale_bits);
  ...
}

One must always be aware that multiplications 'add' bitlengths and appropriate data types must be chosen for intermediate values.
However, 64-bit multiplication on a 32-bit machine may or may not be faster than using the FPU for a float multiplication; needs benchmarking. If the scale is <= 8 bits, no int64_t intermediate is needed and things should be faster than float.
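
The fixed-point idea above can be sketched as a small, self-contained set of functions. This is only an illustration under stated assumptions: `SCALE_BITS`, `gain_to_fixed`, `apply_gain_fixed` and `apply_gain_buffer` are made-up names, the samples are assumed to be already unpacked into `int32_t` values, and the gain is assumed to be non-negative.

```c
#include <stddef.h>
#include <stdint.h>

// Scale of 10 bits: a gain of 1.0 is represented as 1 << 10 = 1024.
#define SCALE_BITS 10

// Convert a float gain to a fixed-point factor once, outside the sample loop.
// Assumption: gain is non-negative and small enough to fit in a uint32_t.
static uint32_t gain_to_fixed(float gain) {
    return (uint32_t)(gain * (float)(1u << SCALE_BITS));
}

// Scale one sample. A 24-bit sample times a factor of 10 or more bits can
// exceed 32 bits, hence the int64_t intermediate.
// Note: right-shifting a negative value is implementation-defined in C;
// GCC and Clang perform an arithmetic shift.
static int32_t apply_gain_fixed(int32_t sample, uint32_t gain_factor) {
    int64_t wide = (int64_t)sample * (int64_t)gain_factor;
    return (int32_t)(wide >> SCALE_BITS);
}

// Apply the gain to a whole buffer of already unpacked samples.
static void apply_gain_buffer(int32_t *samples, size_t count, float gain) {
    const uint32_t gain_factor = gain_to_fixed(gain);
    for (size_t i = 0; i < count; i++) {
        samples[i] = apply_gain_fixed(samples[i], gain_factor);
    }
}
```

For example, a gain of 0.5 becomes the factor 512, and a sample of 1000 comes out as 500. Whether this beats a float multiplication on a given chip still needs benchmarking, as noted above.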

Post by **CityHunter71** » Tue Oct 22, 2024 1:04 pm

In my opinion there are two problems:

1) When initializing I2S, use the Philips format with this configuration (input and output):

void init_microphone(void)
{
    i2s_chan_config_t chan_cfg = I2S_CHANNEL_DEFAULT_CONFIG(I2S_NUM_AUTO, I2S_ROLE_MASTER);
    ESP_ERROR_CHECK(i2s_new_channel(&chan_cfg, NULL, &rx_handle));

    i2s_std_config_t std_rx_cfg = {
        .clk_cfg = I2S_STD_CLK_DEFAULT_CONFIG(CONFIG_EXAMPLE_MIC_SAMPLE_RATE),
        .slot_cfg = I2S_STD_PHILIPS_SLOT_DEFAULT_CONFIG(CONFIG_EXAMPLE_MIC_BIT_SAMPLE, CONFIG_EXAMPLE_MIC_NUM_CHANNELS),
        .gpio_cfg = {
            .mclk = I2S_GPIO_UNUSED,
            .bclk = CONFIG_EXAMPLE_I2S_MIC_BCLK_GPIO,
            .ws = CONFIG_EXAMPLE_I2S_MIC_WS_GPIO,
            .din = CONFIG_EXAMPLE_I2S_MIC_DATA_GPIO,
            .invert_flags = {
                .mclk_inv = false,
                .bclk_inv = false,
                .ws_inv = false,
            },
        },
    };

    ESP_ERROR_CHECK(i2s_channel_init_std_mode(rx_handle, &std_rx_cfg));
    ESP_ERROR_CHECK(i2s_channel_enable(rx_handle));
}

The Philips format is well suited to editing the buffer.

2) The second point is to use a 32-bit data buffer.

This function applies the gain and tests for the highest gain available:

void apply_volume_control(wave_info_t *wave_info, bool Normalize)
{
    // If Normalize is true, set the volume scale to 1.0 so the signal can be normalized
    float volume_scale = Normalize ? 1.0f : wave_info->volume_level / 9.0f;

    // Find the peak of the signal and apply the volume scaling in a single pass
    int32_t max_sample = 0; // int32_t so that a peak of 32768 (from INT16_MIN) still fits
    for (size_t i = 0; i < wave_info->buffer_size / sizeof(int16_t); i++) {
        // Step 1: scale the volume according to the user setting or the normalization
        int32_t sample = (int32_t)(wave_info->buffer[i] * volume_scale);

        // Track the current peak
        if (abs(sample) > max_sample) {
            max_sample = abs(sample);
        }

        // Temporarily write the sample back to the buffer
        wave_info->buffer[i] = (int16_t)sample;
    }

    // If Normalize is true, compute an amplification factor to normalize the signal
    float amplification_factor = 1.0f;
    if (Normalize && max_sample > 0) {
        // Normalize the audio so that the highest peak reaches INT16_MAX
        amplification_factor = (float)INT16_MAX / max_sample;
    } else if (!Normalize && max_sample > 0) {
        // When not normalizing, apply the amplification factor only if the signal is low
        amplification_factor = (float)INT16_MAX / max_sample;
    }

    // Apply the amplification and the soft clipping
    for (size_t i = 0; i < wave_info->buffer_size / sizeof(int16_t); i++) {
        int32_t sample = wave_info->buffer[i];

        // Step 2: apply the amplification
        sample = (int32_t)(sample * amplification_factor);

        // Step 3: apply soft clipping to avoid distortion
        if (sample > INT16_MAX) {
            sample = INT16_MAX - ((sample - INT16_MAX) / 2); // Soft clipping above INT16_MAX
        } else if (sample < INT16_MIN) {
            sample = INT16_MIN + ((sample - INT16_MIN) / 2); // Soft clipping below INT16_MIN
        }

        // Step 4: update the buffer with the modified sample
        wave_info->buffer[i] = (int16_t)sample;
    }
}
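
The peak-normalization step in the function above can be isolated into a small, testable sketch. This is not the poster's code: `find_peak` and `normalize_peak` are hypothetical names, the sketch works on a plain `int16_t` array because `wave_info_t` is not defined in this thread, and it uses a simple hard clamp as a safety net rather than the soft clipping shown above.

```c
#include <stddef.h>
#include <stdint.h>

// Return the largest absolute sample value (0 for an empty or silent buffer).
static int32_t find_peak(const int16_t *samples, size_t count) {
    int32_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        int32_t v = samples[i];
        if (v < 0) {
            v = -v; // safe in int32_t even for INT16_MIN
        }
        if (v > peak) {
            peak = v;
        }
    }
    return peak;
}

// Scale the buffer in place so that its peak reaches INT16_MAX.
static void normalize_peak(int16_t *samples, size_t count) {
    const int32_t peak = find_peak(samples, count);
    if (peak == 0) {
        return; // nothing to scale
    }
    const float factor = (float)INT16_MAX / (float)peak;
    for (size_t i = 0; i < count; i++) {
        int32_t v = (int32_t)((float)samples[i] * factor);
        // Hard clamp in case rounding pushes a value just out of range.
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        samples[i] = (int16_t)v;
    }
}
```

For a buffer holding 16384, -8192 and 0, the peak is 16384, the factor is 32767/16384, and the buffer becomes 32767, -16383 and 0.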

I have the same problem.

At the moment I am trying to convert 48 kHz audio to 44100 Hz.

Best Regards.
