Corrupted graphics when activating DMA2D on an STM32F750

Basic information:

MCU: STM32F750
Display: 800x480, 16-bit
LVGL: 7.7.2 (not modified)

So. We’ve used LVGL for a while in this project and it works fine, although it’s a bit slow, so I’ve sat down to try to speed things up. The first thing I did was to activate LV_USE_GPU_STM32_DMA2D, and it compiles and runs, but everything looks like BLEEP. I’ve tried a make clean and rebuilding everything (including unpacking LVGL from the zip archive), but no luck.
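(By “activate” I mean these lines in lv_conf.h; the CMSIS include shown is my guess for an F7 part, so adjust to your chip:)

#define LV_USE_GPU_STM32_DMA2D 1
/* CMSIS header of the target processor */
#define LV_GPU_DMA2D_CMSIS_INCLUDE "stm32f7xx.h"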

It looks OK if I turn off DMA and run entirely without acceleration. I’ve tried googling for a clue, but so far I’ve had no luck. The function for writing pixels is dumb as a rock:

static void system_display_lv_flush(lv_disp_drv_t *disp, const lv_area_t *area,
                                    lv_color_t *color_p) {
  for (uint16_t y = area->y1; y <= area->y2; y++) {
    for (uint16_t x = area->x1; x <= area->x2; x++) {
      drv_framebuffer_set_pixel(x, y, color_p->full);
      color_p++;
    }
  }
  lv_disp_flush_ready(disp);
}

The device setup looks like:

  static lv_disp_drv_t disp_drv;
  lv_disp_drv_init(&disp_drv);
  disp_drv.hor_res = 480;
  disp_drv.ver_res = 800;
  disp_drv.flush_cb = system_display_lv_flush;
  disp_drv.buffer = &disp_buf;
  lv_disp_drv_register(&disp_drv);

So, pretty basic, and it works fine (albeit slowly) when DMA is off.

LVGL is initialised as:

  lv_init();
  lv_gpu_stm32_dma2d_init();

The second line doesn’t seem to make a difference in the behaviour.

Attaching my LVGL config in case it gives some insight.

Any help or a pointer to a tutorial etc. would be appreciated. My Google karma seems to suck.

lv_conf.h (24.7 KB)

I don’t have a specific answer for you, but here’s something I need to investigate myself that might help you now.

I use a different processor (SAMA5D27). It, too, works OK but rather slowly when not using DMA, and DMA gives a very similar visual effect to yours.

My hunch is that DMA is so fast that I need to hold off calling lv_disp_flush_ready() until the vertical sync period; otherwise the display memory is being overwritten while it is actually being displayed.

Not using DMA is so slow in comparison that the chances of the actual memory location being written to coinciding with it being read are slim to none.
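If I’m right, the shape of the fix would be something like this (only a sketch; the start-of-frame/vsync interrupt and its handler name are hypothetical and processor-specific):

#include <stdbool.h>
#include "lvgl/lvgl.h"

static volatile bool flush_pending = false;
static lv_disp_drv_t *pending_drv;

static void my_flush_cb(lv_disp_drv_t *disp, const lv_area_t *area,
                        lv_color_t *color_p) {
  /* ... copy or DMA the area into the framebuffer ... */
  pending_drv = disp;
  flush_pending = true; /* defer lv_disp_flush_ready() to the vsync IRQ */
}

/* Hypothetical vertical-sync / start-of-frame interrupt handler */
void LCD_VSYNC_IRQHandler(void) {
  /* ... clear the interrupt flag (processor-specific) ... */
  if (flush_pending) {
    flush_pending = false;
    lv_disp_flush_ready(pending_drv);
  }
}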

This could be codswallop, but it might help you if I’m right.


Worth trying.

Any pointer to how to achieve that? I.e., waiting on the refresh?

My processor has a “start of frame” interrupt but no timing diagram to suggest when that is! I suspect it will be processor-specific, though.

Thanks.

Doubt I’ll have time to look into it today (got some other stuff thrown at me with higher priority), but will let you know.

The cache needs to be flushed on the STM32F7 (with SCB_CleanInvalidateDCache()) before you write to the framebuffer, especially when using DMA2D. I have never seen it produce this type of buffer repetition on the display before, but it might be worth a shot.

I’m an expert at breaking things. Last night I spun my Mac into a kernel panic. :wink:

So, flushing the cache first in system_display_lv_flush?

Sounds easy enough. I’ll try and let you know.

Exactly. My previous post was inaccurate; it should be invalidated before you start writing (to pick up on the changes from DMA2D) and cleaned after you finish writing (so that the changes get reflected in memory, otherwise the LCD doesn’t see them).
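In code, roughly (a sketch using the CMSIS by-address variants; fb and fb_bytes stand in for your framebuffer region, and on the M7 the address should be 32-byte aligned):

/* Before the CPU writes: drop stale lines so we see what DMA2D wrote */
SCB_InvalidateDCache_byAddr((uint32_t *)fb, fb_bytes);

/* ... CPU writes to the framebuffer ... */

/* After writing: push the dirty lines out to RAM so the LCD sees them */
SCB_CleanDCache_byAddr((uint32_t *)fb, fb_bytes);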

Sorry, no cookie I’m afraid.

static void system_display_lv_flush(lv_disp_drv_t *disp, const lv_area_t *area,
                                    lv_color_t *color_p) {
  SCB_CleanInvalidateDCache();
  for (uint16_t y = area->y1; y <= area->y2; y++) {
    for (uint16_t x = area->x1; x <= area->x2; x++) {
      drv_framebuffer_set_pixel(x, y, color_p->full);
      color_p++;
    }
  }
  SCB_CleanInvalidateDCache();
  lv_disp_flush_ready(disp);
}

Still the same problem. :frowning:

Hmm, at the top of your first post you write 800 x 600,
whereas your code shows:

  disp_drv.hor_res = 480;
  disp_drv.ver_res = 800;

Yep. Sorry. Brain fart.

Old PC VGA habit: 800 in width used to mean 600 in height. 800x480 is correct.

Is it 800 x 480, or is it 480 x 800?
That’s a difference.

800 in width, 480 in height. Listen to the code; it works. My brain is more dubious. :slight_smile:

Oh, here’s a fun tidbit I just realised.
The code sets the background to red with the following code:

  static lv_style_t my_red_style;
  lv_style_init(&my_red_style);
  lv_style_set_bg_color(&my_red_style, LV_STATE_DEFAULT, BACKGROUND_COLOR);
  lv_obj_add_style(system_display, LV_OBJ_PART_MAIN, &my_red_style);

This works as expected when I run without LV_USE_GPU_STM32_DMA2D activated, and the background has the right color (i.e. red when the code is in developer mode), but when I activate DMA2D it doesn’t! Even more curiously, it goes black, even though the screen is white when LVGL starts.

Weird.

If something as simple as a background has the wrong color, that means the DMA2D peripheral is being configured incorrectly, which is likely to be a bug in LVGL.

What type of buffering are you using for disp_buf (one buffer/two small buffers/two fullscreen buffers)?

Sorry for the late response; other things came in with higher priority for a few days.

I tried upgrading LVGL to 7.10.0 to see if that changed anything, but alas, still the same behaviour.

The buffer is initialised as:

  static lv_disp_buf_t disp_buf;
  static lv_color_t buf[LV_HOR_RES_MAX * 10];
  lv_disp_buf_init(&disp_buf, buf, NULL, LV_HOR_RES_MAX * 10);

I tried bouncing it up from 10 to 130, which changes the behaviour. It still looks like :poop: and the background color is still black instead of red, but there is less repetition of the buffer (so to speak). Instead of the multitude of repetitions as in the picture above, I now have 3 repetitions. The noise in the upper left corner is, however, still there.

Going to try initialising it in SDRAM instead of internal RAM and throwing some more RAM at it to see what happens. Where can I read more about fullscreen buffers? Would that be beneficial? How do I initialise two buffers instead of one?
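(Edit: from what I can see in the v7 docs, two buffers just means passing a second, non-NULL buffer to lv_disp_buf_init, so LVGL can render into one while the other is being flushed. Something like:

  static lv_disp_buf_t disp_buf;
  static lv_color_t buf1[LV_HOR_RES_MAX * 10];
  static lv_color_t buf2[LV_HOR_RES_MAX * 10];
  lv_disp_buf_init(&disp_buf, buf1, buf2, LV_HOR_RES_MAX * 10);

Two fullscreen buffers would presumably be the same with LV_HOR_RES_MAX * LV_VER_RES_MAX as the size.)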

So, what I’ve done so far is:

Upgraded to 7.10.0.
Made no difference.

Increased the buffer size after the tip from embeddedt.
Made a HUGE difference! Even without DMA.

Enabled DMA to transfer data in flush_cb.
Makes a noticeable difference but corrupts the data. I also get a DMA transmission error.

Enabled lv_gpu_stm32_dma2d_init.
Also makes a noticeable difference, but the background color isn’t drawn and invisible objects only get erased where there are other objects, so popups etc. leave “residue” on the screen.

The relevant parts of the code now look like this:

#define DISP_BUF_SIZE (LV_HOR_RES_MAX * LV_VER_RES_MAX)
#define USE_DMA (1)
#if USE_DMA
static DMA_HandleTypeDef pixel_dma;
#endif

static void system_display_lv_flush(lv_disp_drv_t *disp, const lv_area_t *area,
                                    lv_color_t *color_p) {
#if USE_DMA
  int width = 1 + area->x2 - area->x1;

  for (uint16_t y = area->y1; y <= area->y2; y++) {
    uint32_t FBIndex = (uint32_t)(drv_framebuffer_get_ptr() + y * LV_HOR_RES_MAX + area->x1);

    // Start the DMA transfer using polling mode
    HAL_StatusTypeDef hal_status =
        HAL_DMA_Start(&pixel_dma, (uint32_t)color_p, FBIndex, width);
    if (hal_status != HAL_OK) {
      system_fault_panic("HAL_DMA_Start falied",
                         (hal_status << 16) | pixel_dma.ErrorCode);
    }
    hal_status = HAL_DMA_PollForTransfer(&pixel_dma, HAL_DMA_FULL_TRANSFER, 1000);
#if 0
    // This keeps panicking, so we definitely have problems with DMA.
    if (hal_status != HAL_OK) {
      system_fault_panic(
          "HAL_DMA_PollForTransfer falied",
          (hal_status << 16) | pixel_dma.ErrorCode);
    }
#endif
    color_p += width;

#else
  for (uint16_t y = area->y1; y <= area->y2; y++) {
    for (uint16_t x = area->x1; x <= area->x2; x++) {
      drv_framebuffer_set_pixel(x, y, color_p->full);
      color_p++;
    }
#endif
  }
  lv_disp_flush_ready(disp);
}
void system_display_create() {
  lv_init();
#if LV_USE_GPU_STM32_DMA2D
  lv_gpu_stm32_dma2d_init();
#endif

#if USE_DMA
  __HAL_RCC_DMA2_CLK_ENABLE();
  NVIC_ClearPendingIRQ(DMA2_Stream0_IRQn);
  HAL_NVIC_DisableIRQ(DMA2_Stream0_IRQn); // DMA IRQ Disable

  // Configure DMA request pixel_dma on DMA2_Stream0
  pixel_dma.Instance = DMA2_Stream0;
  pixel_dma.Init.Channel = DMA_CHANNEL_0;
  pixel_dma.Init.Direction = DMA_MEMORY_TO_MEMORY;
  pixel_dma.Init.PeriphInc = DMA_PINC_ENABLE;
  pixel_dma.Init.MemInc = DMA_MINC_ENABLE;
  pixel_dma.Init.PeriphDataAlignment = DMA_PDATAALIGN_WORD;
  pixel_dma.Init.MemDataAlignment = DMA_MDATAALIGN_WORD;
  pixel_dma.Init.Mode = DMA_NORMAL;
  pixel_dma.Init.Priority = DMA_PRIORITY_LOW;
  pixel_dma.Init.FIFOMode = DMA_FIFOMODE_ENABLE;
  pixel_dma.Init.FIFOThreshold = DMA_FIFO_THRESHOLD_FULL;
  pixel_dma.Init.MemBurst = DMA_MBURST_SINGLE;
  pixel_dma.Init.PeriphBurst = DMA_PBURST_SINGLE;

  HAL_StatusTypeDef hal_status = HAL_DMA_Init(&pixel_dma);
  if (hal_status != HAL_OK) {
    system_fault_panic("HAL_DMA_Init falied",
                       (hal_status << 16) | pixel_dma.ErrorCode);
  }
#endif

  static lv_disp_buf_t disp_buf;
  lv_color_t *buf = system_sdram_alloc_top(DISP_BUF_SIZE * sizeof(lv_color_t));
  memset(buf, 0, DISP_BUF_SIZE * sizeof(lv_color_t));
  lv_disp_buf_init(&disp_buf, buf, NULL, DISP_BUF_SIZE);

  static lv_disp_drv_t disp_drv;
  lv_disp_drv_init(&disp_drv);
  disp_drv.hor_res = LV_HOR_RES_MAX;
  disp_drv.ver_res = LV_VER_RES_MAX;
  disp_drv.flush_cb = system_display_lv_flush;
  disp_drv.buffer = &disp_buf;
  lv_disp_drv_register(&disp_drv);
}

This is how it looks with both DMAs turned off:


The graphics are a bit slow, but everything gets drawn correctly.

This is with only USE_DMA enabled:


Some pixels in the text are missing, and there are other weird artefacts, like parts of graphics/text being drawn as an extra copy in the wrong place (like the text here).

This is with only LV_USE_GPU_STM32_DMA2D enabled:


The background is never drawn (probably why popups are leaving residue) and there’s weird digital noise in the upper left corner of the screen. That digital noise is only visible when LV_USE_GPU_STM32_DMA2D is enabled; it’s completely gone when I disable it.

With both USE_DMA and LV_USE_GPU_STM32_DMA2D active, the graphics are IMPRESSIVELY fast, so I’d really like to get this working.

Any tip or suggestion is welcome.

Start with LV_USE_GPU_STM32_DMA2D off so that your code controls all of the DMA commands. Much simpler to debug. :wink:

The first potential problem I see is that you’re requesting word alignment when initializing DMA. If you are using 16-bit color, I believe you have to choose halfword alignment.

I also notice that in the F769 example project’s driver, the FIFO threshold is set to be 1/4 full instead of completely full. Not sure if that will help or not, but it is different from your driver.
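For 16-bit color that would mean something like this in your init (standard HAL constants; untested here):

  pixel_dma.Init.PeriphDataAlignment = DMA_PDATAALIGN_HALFWORD;
  pixel_dma.Init.MemDataAlignment = DMA_MDATAALIGN_HALFWORD;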

<Homer>D'oh!</>

Of course it should be HALFWORD! Thanks for that one; it solved one of the bugs. I don’t get the “shadows” anymore, and HAL_DMA_PollForTransfer no longer returns an error. Improvement!

However, I still get lost bits so I’ll look into that example code of yours and see if that gives me any ideas.

In the pics above I’ve only run one DMA or the other, not both at the same time. Nothing really spectacular happens when I run both, though; it just means that both bugs appear on top of each other.

Pondering changing the graphics to 32-bit instead of 16-bit to see if that solves anything. It might be worth wasting some RAM if it means the DMAs work, even though we only have 16 bits to the display. But that’ll have to wait until Wednesday or so; I’ll get back with a report (and possibly a bug report? :wink: ).

Doesn’t it need a call to invalidate_cache() (or SCB_CleanInvalidateDCache()) when using DMA within the flush function?
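I mean something along these lines around the transfer (a sketch, assuming the draw buffer is in cacheable RAM; buf_bytes is a stand-in for the transfer size in bytes):

  /* Before the DMA reads the draw buffer: flush the CPU's writes to RAM */
  SCB_CleanDCache_byAddr((uint32_t *)color_p, buf_bytes);

  /* ... HAL_DMA_Start() / HAL_DMA_PollForTransfer() as above ... */

  /* If the CPU reads back anything the DMA wrote, invalidate that region too */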