I’m wondering what the best practice is for getting “nice” performance on a large-ish parallel LCD that doesn’t have its own VRAM.
Right now I have an STM32F429I with 256Mb of SDRAM and LTDC (parallel RGB interface, RGB565), and I’m using LVGL 8.x as follows:
- Init SDRAM
- Init LTDC
- Create 2 LTDC layers, both full-screen
- Layer 1 framebuffer at the start of SDRAM, layer 2 framebuffer right after layer 1.
- When registering the LVGL display driver, I set direct_mode and full_refresh to 1.
- In flush_cb, I check whether color_p points to the 1st or 2nd framebuffer, disable / enable the appropriate layers, and configure LTDC to reload its registers on the next vsync.
- I also wait for the actual reconfiguration to happen by polling for LTDC_FLAG_RR. I’ve tried doing this both in an IRQ and with a plain while () inside flush_cb, and performance was about the same.
- I have an SPI-based touchscreen, but I think that part is working as well as it can, so there’s not much to optimize there.
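As a rough illustration of the framebuffer layout and the color_p check described in the list above, here is a minimal sketch in plain C. The SDRAM base address (FMC SDRAM bank 2), the 800x480 resolution, and the helper names fb_addr and fb_index_for are all assumptions made for the example, not details of the actual setup.

```c
#include <assert.h>
#include <stdint.h>

/* All values below are assumptions for illustration only. */
#define SDRAM_BASE   ((uintptr_t)0xD0000000u) /* FMC SDRAM bank 2 mapping */
#define LCD_WIDTH    800u
#define LCD_HEIGHT   480u
#define BYTES_PER_PX 2u                        /* RGB565 */
#define FB_SIZE      (LCD_WIDTH * LCD_HEIGHT * BYTES_PER_PX)

/* Start address of framebuffer 0 or 1, laid out back to back in SDRAM. */
static uintptr_t fb_addr(unsigned index)
{
    assert(index < 2u);
    return SDRAM_BASE + (uintptr_t)index * FB_SIZE;
}

/* Which framebuffer an address (e.g. flush_cb's color_p) falls into:
 * 0 or 1, or -1 if it lies outside both buffers. */
static int fb_index_for(uintptr_t addr)
{
    if (addr < SDRAM_BASE || addr >= SDRAM_BASE + 2u * FB_SIZE) {
        return -1;
    }
    return (int)((addr - SDRAM_BASE) / FB_SIZE);
}
```

Inside flush_cb, something like fb_index_for((uintptr_t)color_p) would then tell which layer to enable and which to disable before requesting the reload.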
I am not using an RTOS, just a main loop with some stuff happening in IRQs.
I have a SysTick interrupt which calls lv_tick_inc() every 1 ms, and lv_timer_handler() every 5 ms.
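The timing described above could be arranged roughly as in the sketch below. It assumes the 5 ms call is deferred to the main loop through a flag rather than made from the interrupt itself, which may not match the actual setup. The helper names on_systick_1ms and lvgl_handler_due are hypothetical, and the LVGL calls are only indicated in comments so the snippet stays self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

#define LVGL_HANDLER_PERIOD_MS 5u

static volatile uint32_t ms_ticks;
static volatile bool lvgl_due;

/* Hypothetical helper: call from SysTick_Handler once per millisecond. */
static void on_systick_1ms(void)
{
    ms_ticks++;
    /* On the target, lv_tick_inc(1) would be called here. */
    if (ms_ticks % LVGL_HANDLER_PERIOD_MS == 0u) {
        lvgl_due = true;
    }
}

/* Hypothetical helper: poll from the main loop. Returns true once per
 * 5 ms period; the caller would then run lv_timer_handler(). */
static bool lvgl_handler_due(void)
{
    if (!lvgl_due) {
        return false;
    }
    lvgl_due = false;
    return true;
}
```

With this split, the rendering work runs outside interrupt context, and the interrupt only does bookkeeping.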
Now, while setting all this up I’ve seen lots of examples of people doing stuff with buffers in SRAM, DMA2D and all that jazz. Before I go and spend a bunch of time setting that up, I’m wondering what IS the actual best practice for this?
Is having 2 framebuffers in SDRAM bad because drawing and refresh both have to go through the (limited) AHB bus to access external RAM? Should I have a small buffer in SRAM and DMA2D it over? Will performance actually improve a lot?
What about caching images (icons, bitmaps, etc.) for drawing, also in SDRAM? Would that result in triple copies on each redraw?
image (SDRAM) → LVGL composite/draw/whatever → back to SDRAM → vsync → LTDC reads it out on refresh.
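To put rough numbers on that chain, here is a back-of-envelope sketch in C. The 800x480 resolution, the 60 Hz panel refresh, and the function names are assumptions for illustration. It only counts one scan-out read per refresh, one write per redrawn pixel, and an optional number of source reads per redrawn pixel.

```c
#include <stdint.h>

/* Size of one full frame in bytes. */
static uint32_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t bytes_per_px)
{
    return width * height * bytes_per_px;
}

/* Estimated SDRAM traffic in bytes per second:
 *   LTDC scan-out:  one frame read per panel refresh
 *   drawing:        one frame written per redraw, plus
 *                   source_reads_per_redraw frames read (e.g. images
 *                   that also live in SDRAM) */
static uint64_t sdram_traffic_bytes_per_s(uint32_t frame,
                                          uint32_t refresh_hz,
                                          uint32_t redraw_fps,
                                          uint32_t source_reads_per_redraw)
{
    uint64_t scanout = (uint64_t)frame * refresh_hz;
    uint64_t draw = (uint64_t)frame * redraw_fps
                    * (1u + source_reads_per_redraw);
    return scanout + draw;
}
```

Under these assumptions one frame is 768,000 bytes, scan-out alone is about 46 MB/s, and a full redraw at 30 fps with one source read per pixel brings the total to about 92 MB/s. Blending and overlapping widgets would likely push the real figure higher, so this is best read as a lower bound to compare against whatever the SDRAM configuration can actually sustain.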
I’ve run the benchmark demo, and pretty much every metric shows 32-34 fps or so, while the screen feels kinda chunky at times (it doesn’t really look like 30 fps, more like 15 or so).
Has anyone set up something similar and can comment on whether I should bother looking into DMA2D and such, or is this the best it’s going to get given the limited memory bandwidth of the STM32 SDRAM controller?
Thanks.