8080 Display Interface with FMC on STM32U575

Description

I’m currently developing a battery-powered, cost-sensitive device that uses a 480x320 TFT display driven by an STM32U575 MCU. Unlike more powerful STM32F7/H5/U5G9 MCUs commonly used with LVGL, the STM32U575 does not have an integrated LTDC peripheral.

To work around this, I drive the display through the FMC interface. The first implementation was straightforward: writing pixel data in a for() loop. However, this method was very slow, and each screen update could be seen being drawn.

To optimize performance, I transitioned to using DMA transfers, which significantly improved display update speeds. Recently, I’ve integrated DMA2D, allowing the entire display buffer to be updated in a single operation. The visual update speed is now excellent, as the display refreshes faster than the human eye can perceive.

Current Issue: Despite the significant improvements, animations (particularly scrolling animations) still appear choppy and are not as fluent as desired.

Hardware and Compiler Details

  • MCU: STM32U575VGT6Q
  • Compiler: GNU++17
  • Clock Speed: 100 MHz

Goal

I would like to achieve smoother, more fluid animations, especially for scrolling.

Efforts Made So Far

  • Implemented partial buffering to reduce memory usage and enhance performance.
  • Optimized hardware paths (FMC, DMA, DMA2D integration) to maximize transfer speed.
  • Confirmed efficient utilization of DMA2D for complete screen buffer transfers.
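
For reference, the memory cost of the partial-buffering option can be estimated with a small helper. This is only a sketch: `partial_buf_bytes` and its parameters are hypothetical names, 2 bytes per pixel assumes 16-bit color, and the 1/10-of-screen figure is the commonly quoted LVGL starting point rather than a measured optimum for this board.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a draw buffer covering 1/fraction of the screen,
 * rounded up to whole lines so that every flush covers complete rows.
 * Hypothetical helper for sizing only, not part of LVGL. */
static size_t partial_buf_bytes(uint32_t width, uint32_t height,
                                uint32_t bytes_per_pixel, uint32_t fraction)
{
    if (fraction == 0u) {
        return 0u; /* invalid request */
    }
    uint32_t lines = (height + fraction - 1u) / fraction; /* ceil(height / fraction) */
    return (size_t)width * lines * bytes_per_pixel;
}
```

For a 480x320 panel at 16 bits per pixel, a 1/10 buffer comes to 30720 bytes and a full frame to 307200 bytes, so two partial buffers cost far less RAM than one full frame buffer.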

Questions

  • Are there recommended LVGL configuration settings specifically tailored for MCUs without an LTDC peripheral?
  • Could additional DMA2D optimizations further improve animation fluency?
  • Are there known software or hardware techniques to smooth animations further on resource-limited MCUs?
  • Could I further optimize my buffer management (e.g., double buffering or smaller partial buffers)?
  • Would adjusting the LVGL refresh rate or tick rate help improve animation smoothness?
  • Are there specific compiler or code-level optimizations known to enhance LVGL animation performance?
  • Can interrupt priorities or DMA channel management further enhance responsiveness and fluidity?

Any suggestions or experiences from the community would be greatly appreciated!

Hello MootSeeker,

I used an 8080 display interface via FMC in the past and it was very fast, even with a simple “for” loop. My LCD was also 480x320 and achieved about 60 FPS*, if I remember correctly.
*When refreshing only about half of the screen; a full-screen refresh would run at roughly half that rate.
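
Frame-rate figures like these can be sanity-checked with a little arithmetic. The sketch below is not from the project; `lcd_frame_time_us`, `lcd_max_fps` and the `ns_per_pixel` parameter are hypothetical, and the per-pixel cost has to be measured on the actual hardware.

```c
#include <stdint.h>

/* Time to push one frame over the parallel bus, in microseconds.
 * ns_per_pixel is the average cost of one 16-bit write (bus cycle plus
 * any CPU or DMA overhead); it is an assumed value you measure yourself. */
static uint32_t lcd_frame_time_us(uint32_t width, uint32_t height,
                                  uint32_t ns_per_pixel)
{
    uint64_t total_ns = (uint64_t)width * height * ns_per_pixel;
    return (uint32_t)(total_ns / 1000u);
}

/* Upper bound on frames per second if the bus were the only bottleneck. */
static uint32_t lcd_max_fps(uint32_t width, uint32_t height,
                            uint32_t ns_per_pixel)
{
    uint32_t frame_us = lcd_frame_time_us(width, height, ns_per_pixel);
    return frame_us ? 1000000u / frame_us : 0u;
}
```

With an assumed 100 ns per pixel, a full 480x320 frame takes 15360 µs, which caps the bus at about 65 full frames per second. If the measured transfer time is well inside the frame period and animations still stutter, the transfer is probably not the limiting factor.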

The project used an STM32G4 at 170MHz.

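// FMC memory-mapped addresses (NOR/SRAM bank 1). With a 16-bit bus, byte
// offset 2 toggles FMC address line A0, assumed here to drive the LCD D/C pin.
#define LCD_CMD_ADDR	(*(volatile uint16_t*)(0x60000000))
#define LCD_DATA_ADDR	(*(volatile uint16_t*)(0x60000002))

// Write a w x h block of 16-bit pixels from buf to the window at (x, y).
void lcd_bmp( tLcd *pThis, int16_t x, int16_t y, uint16_t w, uint16_t h, uint16_t *buf )
{
    uint32_t i = 0;

    lcd_set_window( pThis, x, y, w, h );

    // Start the memory write, then stream every pixel to the data address.
    LCD_CMD_ADDR = LCD_REG_MEMORY_WRITE;
    for( i = (uint32_t)w * h ; i ; i-- )
    {
        LCD_DATA_ADDR = *buf++;
    }
}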

You can find the code here; there are versions with LVGL and with another UI library called Nuklear.
https://github.com/jgpeiro/black_scope/blob/master/software/stm32g4_scope/Lib/Lcd/lcd.c

Hi @jgpeiro,
I’m quite sure that running a for-loop at maximum speed isn’t the solution I’m looking for.

I don’t think the write speed itself is the issue, as writing the buffer via the DMA2D peripheral is already quite fast. My suspicion is that there’s some software configuration or setting I’m missing, which is preventing LVGL from updating faster.
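
One software detail worth checking in this situation is whether the flush callback blocks until the transfer has finished. If it only starts the transfer and returns, LVGL can render into the second buffer while the first is still being sent. Below is a minimal host-side sketch of that handshake; `lcd_xfer_t`, `lcd_xfer_start` and `lcd_xfer_complete_isr` are hypothetical names, and the real DMA/DMA2D programming is only indicated in comments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State of the single in-flight display transfer (hypothetical type). */
typedef struct {
    volatile bool busy;   /* true while a transfer is in flight */
    uint32_t started;     /* number of transfers started */
    uint32_t completed;   /* number of transfers completed */
} lcd_xfer_t;

/* Called from the LVGL flush callback: start the transfer and return
 * immediately instead of waiting for it to finish. */
static bool lcd_xfer_start(lcd_xfer_t *x, const uint16_t *px, size_t count)
{
    if (x->busy || px == NULL || count == 0u) {
        return false; /* refuse to overlap transfers or send nothing */
    }
    x->busy = true;
    x->started++;
    /* On target: program the DMA or DMA2D channel with px/count here. */
    return true;
}

/* Called from the transfer-complete interrupt handler. */
static void lcd_xfer_complete_isr(lcd_xfer_t *x)
{
    if (!x->busy) {
        return; /* spurious interrupt */
    }
    x->busy = false;
    x->completed++;
    /* On target: tell LVGL the buffer is free again here, for example
     * with lv_display_flush_ready() in LVGL v9 (lv_disp_flush_ready()
     * in v8). */
}
```

The point of the split is that the "flush ready" notification comes from the interrupt and not from the end of the flush callback, so rendering and transfer can overlap when two draw buffers are configured.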

I also pushed my lv_conf.h if this would help:
lv_conf.h (36.3 KB)

Yes, you are right, the for loop is not the best solution. I always aim to use every MCU feature to get the maximum FPS, but the problem is usually the implementation cost (plus the loss of portability) relative to the improvement gained, which is why in my case I simply used a for loop.

I don't know of any option in lv_conf.h that would help; I normally leave everything at its default except the color depth, but I think there is an option to enable DMA2D on the STM32.
At least there is some API related to it here: dma2d (LVGL documentation)

Anyway, the most I do is use double buffering, so I can't help you any further.

Maybe the tear signal could help.

I haven’t implemented this in my driver yet; I think it would be a bit complicated and time-consuming to do.
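
For what it's worth, the basic idea behind using the tear signal can be sketched in a few lines: a flush is not started immediately but held until the next TE (tearing effect) edge from the panel. Everything here is hypothetical (`te_gate_t`, `te_gate_request`, `te_gate_on_te_edge`, `demo_start`), and it assumes the TE pin is wired to an external interrupt on the MCU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Callback that actually starts the pixel transfer (hypothetical). */
typedef void (*te_start_fn)(void *ctx);

/* Holds one pending flush until the panel signals a TE edge. */
typedef struct {
    volatile bool pending;  /* a flush is waiting for the next TE edge */
    te_start_fn start;      /* how to start the transfer */
    void *ctx;              /* user data passed to start() */
} te_gate_t;

/* Called from the flush path: remember the request, do not start yet. */
static void te_gate_request(te_gate_t *g)
{
    g->pending = true;
}

/* Called from the TE pin interrupt: start the pending flush, if any. */
static void te_gate_on_te_edge(te_gate_t *g)
{
    if (!g->pending) {
        return; /* nothing waiting, ignore this edge */
    }
    g->pending = false;
    if (g->start != NULL) {
        g->start(g->ctx);
    }
}

/* Example start callback for host testing: counts how often it ran. */
static void demo_start(void *ctx)
{
    (*(int *)ctx)++;
}
```

Whether this removes visible tearing also depends on the transfer being fast enough relative to the panel's own scan, which this sketch does not check.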