Periodic Wrong Pixels using DMA+SPI - ESP32

Hello. I am currently having a problem using DMA memory with SPI on the ESP32 and the ILI9488 TFT display. When running LVGL and setting up the draw buffers using normal memory allocation, everything works fine, albeit very slowly: too slowly for the application that I am developing. So I am trying to use the DMA capability that the ESP32 has, and for the most part it works quite well. I get much faster refreshes and good stability.

However, I am plagued by random pixels (well, not really random) that show up on the screen like this:

What MCU/Processor/Board and compiler are you using?

ESP32 custom board (verified working hardware layout) and ILI9488. Using ESP-IDF 4.4

What do you want to achieve?

Using DMA capabilities with SPI without inducing the artifacts that appear in the screenshot.

What have you tried so far?

I have tried to modify the flush function by adding 1 to the size of the buffer being read, thinking that maybe it is missing the last pixel in the buffer, but that seems like a hack, and it has not worked.

Code to reproduce

Here is my flush function:

void ili9488_flush(lv_disp_drv_t * drv, const lv_area_t * area, lv_color_t * color_map)
{
    uint32_t size = (lv_area_get_width(area) * lv_area_get_height(area));
    lv_color16_t * buffer_16bit = (lv_color16_t *) color_map;
    uint8_t *mybuf;

    do {
        mybuf = (uint8_t *)heap_caps_malloc(3 * size * sizeof(uint8_t), MALLOC_CAP_DMA);
        if (mybuf == NULL) {
            LV_LOG_WARN("Could not allocate enough DMA memory!");
            ESP_LOGW(TAG, "Could not allocate enough DMA memory!");
        }
    } while (mybuf == NULL);


    uint32_t LD = 0;
    uint32_t j = 0;

    for (uint32_t i = 0; i < size; i++) {
        LD = buffer_16bit[i].full;
        mybuf[j] = (uint8_t) (((LD & 0xF800) >> 8) | ((LD & 0x8000) >> 13));
        j++;
        mybuf[j] = (uint8_t) ((LD & 0x07E0) >> 3);
        j++;
        mybuf[j] = (uint8_t) (((LD & 0x001F) << 3) | ((LD & 0x0010) >> 2));
        j++;
    }

    /* Column addresses */
    uint8_t xb[] = {
        (uint8_t) ((area->x1 >> 8) & 0xFF),
        (uint8_t) (area->x1 & 0xFF),
        (uint8_t) ((area->x2 >> 8) & 0xFF),
        (uint8_t) (area->x2 & 0xFF),
    };

    /* Page addresses */
    uint8_t yb[] = {
        (uint8_t) ((area->y1 >> 8) & 0xFF),
        (uint8_t) (area->y1 & 0xFF),
        (uint8_t) ((area->y2 >> 8) & 0xFF),
        (uint8_t) (area->y2 & 0xFF),
    };

    /* Send the column address window */
    ili9488_send_cmd(drv, ILI9488_CMD_COLUMN_ADDRESS_SET);
    ili9488_send_data(drv, xb, 4);

    /* Send the page address window */
    ili9488_send_cmd(drv, ILI9488_CMD_PAGE_ADDRESS_SET);
    ili9488_send_data(drv, yb, 4);

    /* Memory write */
    ili9488_send_cmd(drv, ILI9488_CMD_MEMORY_WRITE);

    ili9488_send_color(drv, (void *) mybuf, size * 3);
    free(mybuf);
}

And this is the initialization code for the buffers when initializing LVGL:

lv_color_t* TFT::buf1 = NULL;
lv_color_t* TFT::buf2 = NULL;

void TFT::display_init()
{
    uint32_t display_buffer_size = (HOR_RES * 40);

    /* Initialize the working buffer depending on the selected display.
    * NOTE: buf2 == NULL when using monochrome displays.
    * */
    buf1 = static_cast<lv_color_t*>(heap_caps_malloc((display_buffer_size) * sizeof(lv_color_t), MALLOC_CAP_DMA));
    assert(buf1 != NULL);

    buf2 = static_cast<lv_color_t*>(heap_caps_malloc((display_buffer_size) * sizeof(lv_color_t), MALLOC_CAP_DMA));
    assert(buf2 != NULL);

    /* Initialize the LVGL subsystem */
    lv_init();

    /* Initialize the Message Subscription/Publish Framework for LVGL */
    lv_msg_init();

    /* Initialize the display driver for LVGL */
    lv_disp_drv_init(&disp_drv);

    disp_drv.hor_res = HOR_RES;                 /*Set the horizontal resolution in pixels*/
    disp_drv.ver_res = VER_RES;                 /*Set the vertical resolution in pixels*/

    /* Initialize SPI, I2C for the screen */
    lvgl_interface_init();

    /* Initialize the GPIO pins that are used to control the Display */
    lvgl_display_gpios_init();

    /* Initialize the display and activate */
    disp_driver_init(&disp_drv);

    uint32_t size_in_px = display_buffer_size;
    lv_disp_draw_buf_init(&disp_buf, buf1, buf2, size_in_px);
    
    disp_drv.flush_cb = disp_driver_flush;
    disp_drv.draw_buf = &disp_buf;

    tftDisplay = lv_disp_drv_register(&disp_drv);

    /* Register an input device when enabled on the menuconfig */
    #if CONFIG_LV_TOUCH_CONTROLLER != TOUCH_CONTROLLER_NONE
        lv_indev_drv_init(&indev_drv);
        indev_drv.read_cb = &touch_driver_read;
        indev_drv.type = LV_INDEV_TYPE_POINTER;
        lv_indev_drv_register(&indev_drv);
    #endif

    /* Create and start a periodic timer interrupt to call lv_tick_inc */
    const esp_timer_create_args_t periodic_timer_args = {
        .callback = &lv_tick_task,
        .name = "periodic_gui"
    };
    esp_timer_handle_t periodic_timer;
    ESP_ERROR_CHECK(esp_timer_create(&periodic_timer_args, &periodic_timer));
    ESP_ERROR_CHECK(esp_timer_start_periodic(periodic_timer, LV_TICK_PERIOD_MS * 1000));
}

Any suggestions or pushes in a better direction would be so greatly appreciated. I suspect I am missing the bigger picture here. I will post any and all other code that is pertinent if needed.
Thanks.

Did you resolve it?
I am having the same problem.

As a matter of fact, it has been solved. Note in this chunk of code within the ili9488_flush function:

uint8_t *mybuf;

    do {
        mybuf = (uint8_t *)heap_caps_malloc(3 * size * sizeof(uint8_t), MALLOC_CAP_DMA);
        if (mybuf == NULL) {
            LV_LOG_WARN("Could not allocate enough DMA memory!");
            ESP_LOGW(TAG, "Could not allocate enough DMA memory!");
        }
    } while (mybuf == NULL);

This is the wrong way to do it. First, this code block should be completely removed from the flush function. Then, declare uint8_t * mybuf as a global variable, and then place the following line in your ili9488_init function, as the last instruction in that function:

mybuf = (uint8_t *)heap_caps_malloc(3 * size * sizeof(uint8_t), MALLOC_CAP_DMA);

Let me know if that helps.

On top of that, this construct cannot work in a normal DMA offload situation. You can't free memory that the DMA hardware is still using for a background transfer in non-blocking mode.
Your flush starts this transfer and doesn't wait for it to end. You need to check that it has ended on the next flush… And Adam, in init you don't have the size info…

I am also looking for a clean and simple DMA +…
As far as I know, this is currently implemented only in Lovyan, but that is a little hard to configure.
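
The two points above (never touch a buffer the DMA may still be reading, and check for completion at the start of the next flush) can be sketched roughly as follows. This is a host-side model, not driver code: `dma_port_t`, `dma_tx_t`, `dma_tx_send` and the `fake_*` stand-ins are hypothetical names made up for illustration. On ESP-IDF the two hooks would presumably wrap `spi_device_queue_trans` and `spi_device_get_trans_result`. The size question has an upper bound that is known at init time: LVGL should not pass the flush callback more pixels than the draw buffer holds, so 3 bytes times `display_buffer_size` from the init code above ought to be enough for a buffer that is allocated once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transport hooks. On real hardware these would wrap the SPI
 * driver's "queue a transfer" and "wait for the queued transfer" calls. */
typedef struct {
    void (*start)(const uint8_t *data, size_t len); /* returns immediately */
    void (*wait_done)(void);                        /* blocks until finished */
} dma_port_t;

typedef struct {
    dma_port_t port;
    uint8_t   *buf;       /* allocated once at init, never freed per flush */
    size_t     capacity;  /* bytes, e.g. 3 * draw buffer size in pixels */
    bool       in_flight; /* true while the hardware may still read buf */
} dma_tx_t;

/* Copy len bytes into the shared buffer and start a transfer.
 * Returns false if the payload does not fit into the buffer. */
static bool dma_tx_send(dma_tx_t *tx, const uint8_t *src, size_t len)
{
    if (len > tx->capacity) {
        return false;
    }
    if (tx->in_flight) {
        tx->port.wait_done();  /* the previous transfer must finish first */
        tx->in_flight = false;
    }
    memcpy(tx->buf, src, len); /* safe: nothing is reading buf right now */
    tx->port.start(tx->buf, len);
    tx->in_flight = true;
    return true;
}

/* Host-side stand-ins so the sketch can be exercised without hardware. */
static int fake_starts;
static int fake_waits;

static void fake_start(const uint8_t *data, size_t len)
{
    (void)data;
    (void)len;
    fake_starts++;
}

static void fake_wait_done(void)
{
    fake_waits++;
}
```

In a real flush function the `memcpy` would be replaced by the RGB565 to 3-byte conversion loop writing directly into `tx->buf`. The important part is the ordering: wait, then write, then start.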

You need to add a 1-5 µs delay (sleep_us(1);) before setting the CS pin to 1 after the DMA transfer, so that the last byte has time to go out. Sending it should take less than 1 µs at 10 MHz. If CS is set too quickly, it has already changed while the DMA is still transferring data from the FIFO to the SPI hardware.

My take on this.

From esp32_drivers → disp_spi.h, line 55 :

*	Important! 
	All buffers should also be 32-bit aligned and DMA capable to prevent extra allocations and copying.
	When DMA reading (even in polling mode) the ESP32 always read in 4-byte chunks even if less is requested.
	Extra space will be zero filled. Always ensure the out buffer is large enough to hold at least 4 bytes!
*/

So one can either write an algorithm that checks the DMA buffer size alignment for every flush size, or do it like me and just make sure there is enough DMA buffer allocated by adding 10 bytes (mind the parentheses):

do {
        mybuf = (uint8_t *)heap_caps_malloc((3 * size * sizeof(uint8_t)+10), MALLOC_CAP_DMA);
        if (mybuf == NULL) {
            LV_LOG_WARN("Could not allocate enough DMA memory!");
            ESP_LOGW(TAG, "Could not allocate enough DMA memory!");
        }
    } while (mybuf == NULL);
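
For the first option mentioned above, checking the alignment for every flush size instead of padding by a fixed 10 bytes, a minimal sketch could look like this. The helper names `dma_padded_size` and `rgb666_dma_bytes` are made up for illustration and are not part of esp32_drivers. The sketch rounds the byte count up to the next multiple of 4 and never returns less than 4, following the note quoted from disp_spi.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Round a byte count up to the next multiple of 4, with a minimum of 4,
 * so that a DMA read in 4-byte chunks never runs past the allocation. */
static size_t dma_padded_size(size_t nbytes)
{
    size_t padded = (nbytes + 3u) & ~(size_t)3u;
    return padded < 4u ? 4u : padded;
}

/* Bytes to allocate for `pixels` pixels at 3 bytes per pixel, padded. */
static size_t rgb666_dma_bytes(uint32_t pixels)
{
    return dma_padded_size((size_t)pixels * 3u);
}
```

The result of `rgb666_dma_bytes(size)` would then be passed to heap_caps_malloc in place of `3 * size * sizeof(uint8_t) + 10`, while the length given to the send function stays `size * 3`.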

To sum it up:

It works on my computer :smile: