I’m currently developing a battery-powered, cost-sensitive device that uses a 480x320 TFT display driven by an STM32U575 MCU. Unlike more powerful STM32F7/H5/U5G9 MCUs commonly used with LVGL, the STM32U575 does not have an integrated LTDC peripheral.
To address this, I initially used the FMC interface to drive the display. At first, the implementation was straightforward: writing pixel data in a for() loop. However, this method was very slow, and each screen update was clearly visible as it was drawn.
To optimize performance, I transitioned to using DMA transfers, which significantly improved display update speeds. Recently, I’ve integrated DMA2D, allowing the entire display buffer to be updated in a single operation. The visual update speed is now excellent, as the display refreshes faster than the human eye can perceive.
Current Issue: Despite these improvements, animations (particularly scrolling animations) still appear choppy and are not as fluid as desired.
Hardware and Compiler Details
MCU: STM32U575VGT6Q
Compiler: GNU++17
Clock Speed: 100 MHz
Goal
I would like to achieve smoother, more fluid animations, especially for scrolling.
Efforts Made So Far
Implemented partial buffering to reduce memory usage and enhance performance.
Optimized hardware paths (FMC, DMA, DMA2D integration) to maximize transfer speed.
Confirmed efficient utilization of DMA2D for complete screen buffer transfers.
Questions
Are there recommended LVGL configuration settings specifically tailored for MCUs without an LTDC peripheral?
Could additional DMA2D optimizations further improve animation fluidity?
Are there known software or hardware techniques to smooth animations further on resource-limited MCUs?
Could I further optimize my buffer management (e.g., double buffering or smaller partial buffers)?
Would adjusting the LVGL refresh rate or tick rate help improve animation smoothness?
Are there specific compiler or code-level optimizations known to enhance LVGL animation performance?
Can interrupt priorities or DMA channel management further enhance responsiveness and fluidity?
Any suggestions or experiences from the community would be greatly appreciated!
I used an 8080 display interface via FMC in the past and it was very fast, even with a simple “for” loop. My LCD was also 480x320 and achieved around 60 FPS*, if I remember correctly.
*That was when refreshing only about half of the screen; a full-screen refresh would run at about half that rate.
The project used an STM32G4 at 170 MHz.
// FMC memory-mapped addresses of the LCD command and data registers
#define LCD_CMD_ADDR  (*(volatile uint16_t *)(0x60000000))
#define LCD_DATA_ADDR (*(volatile uint16_t *)(0x60000002))

// Write a w x h block of 16-bit pixels from buf to the window at (x, y)
void lcd_bmp( tLcd *pThis, int16_t x, int16_t y, uint16_t w, uint16_t h, uint16_t *buf )
{
    uint32_t i;

    lcd_set_window( pThis, x, y, w, h );
    LCD_CMD_ADDR = LCD_REG_MEMORY_WRITE;
    for( i = (uint32_t)w * h ; i ; i-- )
    {
        LCD_DATA_ADDR = *buf++;
    }
}
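As a rough sanity check on the frame rates mentioned above, here is a minimal sketch that estimates the upper bound on FPS when every pixel costs one 16-bit bus write. The function name lcd_max_fps and the 100 ns write cycle used in the note below are assumptions for illustration only; the real cycle time depends on the FMC timing configuration and on loop or DMA overhead.

```c
#include <stdint.h>

/* Upper bound on frames per second when each pixel costs one bus write.
 * write_cycle_ns is a hypothetical figure: take the real value from the
 * FMC timing setup (address setup + data setup + bus turnaround). */
static double lcd_max_fps(uint32_t w, uint32_t h, double write_cycle_ns)
{
    double frame_ns = (double)w * (double)h * write_cycle_ns;
    return 1e9 / frame_ns;
}
```

With an assumed 100 ns per write, a full 480x320 frame takes about 15.4 ms, so the bound is roughly 65 FPS. Going the other way, 60 FPS on half the screen works out to about 217 ns per pixel, loop overhead included.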
Hi @jgpeiro,
I’m quite sure that running a for-loop at maximum speed isn’t the solution I’m looking for.
I don’t think the write speed itself is the issue, as writing the buffer via the DMA2D peripheral is already quite fast. My suspicion is that there’s some software configuration or setting I’m missing, which is preventing LVGL from updating faster.
I’ve also attached my lv_conf.h in case it helps: lv_conf.h (36.3 KB)
Yes, you are right, the for loop is not the best solution. I always aim to use every MCU feature to get the maximum FPS, but the problem is usually the implementation cost (plus the loss of portability) relative to the improvement obtained, which is why in my case I simply used a for loop.
I don’t know of any option in lv_conf.h that would help. I normally leave everything at the defaults except the color depth, but I think there is an option to enable DMA2D on the STM32.
At least there is some API related to it here: dma2d (LVGL documentation)
Anyway, the most I do is use double buffering, so I can’t help much beyond that.
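To make the double-buffering idea concrete, here is a minimal, library-independent sketch of a ping-pong scheme: the CPU renders into one buffer while the other is being transferred. All names (pingpong_t, pp_submit, fake_start_dma) and the 32-line buffer height are hypothetical. In a real port, fake_start_dma would start the DMA/DMA2D transfer and pp_dma_done_isr would run from the transfer-complete interrupt, which is also the usual place to call LVGL's flush-ready function (lv_display_flush_ready in v9, lv_disp_flush_ready in v8).

```c
#include <stdbool.h>
#include <stdint.h>

#define BUF_PIXELS (480u * 32u)      /* hypothetical: 32 lines per partial buffer */

typedef struct {
    uint16_t buf[2][BUF_PIXELS];
    uint8_t render_idx;              /* buffer the CPU may draw into */
    volatile bool dma_busy;          /* set at transfer start, cleared by the ISR */
} pingpong_t;

/* Stand-in for the real DMA start routine: it only records what was sent. */
static const uint16_t *last_sent;
static void fake_start_dma(const uint16_t *src, uint32_t pixels)
{
    (void)pixels;
    last_sent = src;
}

/* Buffer that is currently safe to render into. */
static uint16_t *pp_render_buf(pingpong_t *pp)
{
    return pp->buf[pp->render_idx];
}

/* Hand the rendered buffer to the DMA and switch to the other one.
 * Returns false if the previous transfer has not finished yet. */
static bool pp_submit(pingpong_t *pp,
                      void (*start_dma)(const uint16_t *, uint32_t))
{
    if (pp->dma_busy) {
        return false;
    }
    pp->dma_busy = true;
    start_dma(pp->buf[pp->render_idx], BUF_PIXELS);
    pp->render_idx ^= 1u;            /* CPU continues in the other buffer */
    return true;
}

/* Call from the DMA transfer-complete interrupt. */
static void pp_dma_done_isr(pingpong_t *pp)
{
    pp->dma_busy = false;
}
```

The benefit only appears if the submit call returns right away. If the code waits for the transfer to finish before rendering the next area, the second buffer is never used in parallel.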
Some displays have an output signal called tearing effect (TE).
This signal tells the external controller (in your case the STM32U575 MCU) when to send pixels to the display, so that writes stay in sync with the display’s “internal” controller (ILI9341, SSD1963, …).
This way the tearing artifacts on the screen disappear.
You can search for “screen tearing” on Google.
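The TE synchronization described above can be sketched as follows, assuming the TE pin is wired to an MCU input that can raise an edge interrupt. The function names and the start_transfer_stub stand-in are hypothetical; in a real project the stub would start the DMA/DMA2D transfer of the finished frame.

```c
#include <stdbool.h>

static volatile bool frame_armed;            /* a finished frame is waiting */
static volatile unsigned transfers_started;  /* for illustration only */

/* Stand-in for starting the real DMA/DMA2D transfer. */
static void start_transfer_stub(void)
{
    transfers_started++;
}

/* Call when rendering of a frame (or area) has finished. */
void lcd_frame_ready(void)
{
    frame_armed = true;
}

/* Hypothetical handler: call from the GPIO/EXTI interrupt on the TE pin.
 * The transfer starts on the pulse itself, never on a stale one. */
void lcd_te_isr(void)
{
    if (frame_armed) {
        frame_armed = false;
        start_transfer_stub();
    }
}
```

Starting the transfer from the interrupt, rather than polling a flag later, keeps the write aligned with the panel's scan. Whether tearing disappears completely still depends on how the transfer time compares with the panel's refresh period.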
The TE signal is a specific solution for panels that have an internal frame buffer (such as RAM in COG panels). On “dumb” RGB displays without internal RAM, the tearing problem is addressed by synchronizing with the VSYNC and HSYNC signals, and the TE signal is generally not implemented on these panels.