Frame buffers and tearing effect

Hi

I’m a first timer with LVGL, and displays/graphics libraries altogether, and need some advice regarding choice of frame buffers.

I am looking to use LVGL with a 5”, 800x480, no-touch, IPS TFT display (with a standard RAM-less onboard display driver IC, like EK9716B or ST7262E43).
A STM32H7 with LTDC display peripheral will be used with 16bit RGB565 interfacing to the display.

If possible I will prefer using only internal RAM for the frame and draw buffers, avoiding the more complex PCB layout for an external SDRAM, and the need for a larger pin count uC.

At the moment the STM32H7 with the largest RAM is the STM32H7A3 with 1.4MB (the newer STM32U5 has up to 3MB of RAM, but is slower than the H7 and lacks some other features needed in my application).

I have been looking at the LVGL docs about buffers, but am in doubt about which configuration to use when using the LTDC peripheral with an 800x480 resolution RAM-less display driver IC.
Maybe I’m not understanding it correctly, but some of the info regarding buffers seems to apply only when using a dedicated display controller IC with internal RAM (e.g. ILI9341), as used in some lower resolution displays.

My main concern is to avoid any tearing effects. Is it possible to use only a single full-size frame buffer (plus one or two smaller 1/10 size draw buffers) and still avoid tearing?
Or is it necessary to use two full-size frame buffers (double buffering) to avoid tearing?

A single full-size frame buffer uses (800x480x16bit)/(8x1024) = 750kB. And the two 1/10 size draw buffers use 2x750kB/10 = 150kB. A total of 900kB which will fit fine in the 1.4MB of RAM.
But if two full-size frame buffers are necessary to avoid tearing it will occupy 2x750kB = 1.46MB, making external SDRAM necessary.
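The buffer arithmetic above can be checked with a small sketch. The function names are hypothetical and not part of LVGL or the ST HAL; sizes are in bytes, so 768000 bytes corresponds to the 750kB quoted above (counting 1kB as 1024 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for one full-size frame buffer. */
size_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return (size_t)width * height * (bits_per_pixel / 8u);
}

/* Total bytes for `full_buffers` full-size frame buffers plus
 * `draw_buffers` partial draw buffers, each 1/`divisor` of a full frame. */
size_t total_buffer_bytes(size_t frame_bytes, unsigned full_buffers,
                          unsigned draw_buffers, unsigned divisor)
{
    return frame_bytes * full_buffers + (frame_bytes / divisor) * draw_buffers;
}
```

For 800x480 at 16 bits per pixel, one frame buffer plus two 1/10 draw buffers comes to 921600 bytes (900kB), while two full-size frame buffers alone need 1536000 bytes (1500kB).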

I would like to understand this tearing effect better and how to avoid it. If using only a single full-size frame buffer, are there use cases where tearing will occur, and others where it won’t? I.e. does it depend on the specific UI design?

I found the below answer on the NXP forum (https://community.nxp.com/t5/i-MX-Processors/Display-is-Flickering-in-Single-Frame-Buffer-Configuration/td-p/1730002)
“Using a single frame buffer means that all rendering have to be done during the period of VSW (VSYNC Pulse Width). If the rendering can’t be done before the end of VBP (VSYNC Back Porch) tearing will likely be observed on the screen”

My question is, does LVGL automatically make sure rendering only happens in the time period mentioned above, or should this be implemented somehow in the user code?

In my case the UI will be very minimalistic, with only rectangle boxes and text (quite similar to the examples below). No animations are used, except that, if possible, the text should have a smooth ‘scrolling/crossfading’ effect when changing from one parameter to another.
The only animation-like graphic is the thin red ‘saw/triangle’ waveform (top-left in the second picture), which should change its slope when adjusting the corresponding parameter.

image
image

Best regards,
Pete

Welcome,

If I understand correctly, some form of double buffering is necessary to avoid tearing completely.

The other two methods will introduce some sort of tearing, even if you take the vsync signal of your display into account.

Single frame buffer

A single frame buffer (that is, LVGL drawing directly into the buffer the display is scanned out from) will always introduce tearing: LVGL constantly draws something to this buffer and does not take your vsync signal into account at all.

Two smaller frame buffers

This makes LVGL render into the smaller buffers as a sort of “in between” step, and LVGL uses a callback to notify you when a smaller buffer is ready to be copied to the screen. The copy could technically be timed to match the display’s vsync signal, but it will still introduce tearing, because it basically writes horizontal strips of data into your main display buffer.
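A minimal sketch of the strip copy described above, assuming an RGB565 frame buffer and inclusive area coordinates. The function name and parameters are hypothetical; in a real port this work would sit inside the display callback and would typically be handed to a DMA engine rather than done with `memcpy`.

```c
#include <stdint.h>
#include <string.h>

/* Copy a rendered rectangle (inclusive coordinates, RGB565) from a small
 * draw buffer into the full-size frame buffer that the display scans out.
 * `src` holds the rectangle's pixels row by row with no padding. */
void copy_area_to_framebuffer(uint16_t *framebuffer, uint32_t fb_width,
                              uint32_t x1, uint32_t y1,
                              uint32_t x2, uint32_t y2,
                              const uint16_t *src)
{
    uint32_t area_width = x2 - x1 + 1u;

    for (uint32_t y = y1; y <= y2; y++) {
        /* One row of the area goes to offset (y * fb_width + x1). */
        memcpy(&framebuffer[(size_t)y * fb_width + x1], src,
               area_width * sizeof(uint16_t));
        src += area_width;
    }
}
```

Because the display keeps scanning out the frame buffer while rows are being written, a strip that is only partly copied when the scan passes over it shows up as a tear.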

Double buffering should fix both of these issues, as changing what is displayed on the screen can be done near instantly by just changing a pointer to memory after LVGL has notified you it is done rendering and on a vsync signal.
However! If you are able to test your display/memory configuration before investing in external SDRAM, you may find that properly implementing your two smaller buffers makes the tearing almost invisible, even in motion. Another trick is to make your smaller buffers just large enough to cover the animated area on your display; done right, this keeps any tearing on your triangle wave from being visible.
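The double-buffering swap mentioned above (change a pointer once rendering is done, but only on a vsync) can be modeled roughly as follows. This is a sketch of the idea only: all names are hypothetical, and a real port would also have to point the display controller at the new front buffer inside the vsync handler.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t *front;            /* buffer currently scanned out to the display */
    uint16_t *back;             /* buffer the renderer is allowed to draw into */
    volatile bool swap_pending; /* a finished frame is waiting for vsync */
} double_buffer_t;

void db_init(double_buffer_t *db, uint16_t *buf_a, uint16_t *buf_b)
{
    db->front = buf_a;
    db->back = buf_b;
    db->swap_pending = false;
}

/* Call when the renderer has finished a complete frame in the back buffer. */
void db_frame_rendered(double_buffer_t *db)
{
    db->swap_pending = true;
}

/* True while the back buffer holds a finished frame that is not shown yet,
 * so the renderer must not draw into it. */
bool db_renderer_must_wait(const double_buffer_t *db)
{
    return db->swap_pending;
}

/* Call from the vsync interrupt. Swaps the buffers only if a finished
 * frame is waiting; returns true if a swap happened. */
bool db_on_vsync(double_buffer_t *db)
{
    if (!db->swap_pending) {
        return false;
    }
    uint16_t *tmp = db->front;
    db->front = db->back;
    db->back = tmp;
    db->swap_pending = false;
    return true;
}
```

The display never scans out a buffer that is being drawn into, which is why this approach avoids tearing at the cost of a second full-size buffer.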

As for your question about whether LVGL automatically restricts rendering to the vsync period: LVGL does no such thing; handling timing is up to you.

Timing and vertical sync are among the trickiest parts of making your display look fluid, and they are almost the only thing LVGL does not handle for you (unless you use a certified board with up-to-date example code). I really hope this helps. If it is possible for you to experiment beforehand, I suggest trying that first.

Hi Tinus

Thanks a lot for helping and the detailed explanation.

As suggested, I think I will start out getting the basics working on a chip with enough internal RAM to hold two full-size buffers (which keeps the code simpler, since no external SDRAM is involved). The STM32U5F7 has 3MB of internal RAM, enough for two buffers even if using RGB888 format.

When that works, I will try experimenting with single buffer mode and how that affects tearing, and then maybe move to a STM32H7 with external SDRAM. Hopefully ST will make an H7 with larger RAM in the future.

Though I’m still in doubt whether it’s possible at all to avoid tearing (or at least keep it barely visible) with a single full-size frame buffer (+ some smaller draw buffers).
All display development boards I’ve looked at (for 800x480 or larger resolution) seem to use two full-size buffers (double buffering).
Do you know of any boards or application notes (be it ST, NXP, ESP, Renesas etc) using only a single frame buffer and still avoiding noticeable tearing?

Best regards,
Pete

Hello Pete,

Yes, good idea to try it first on the STM32U5F7.

I do not know of any existing examples that show this. I have made one myself for the Renesas RA6M3G processor (about a year before this board was even officially supported), but I will not be able to share it. Besides, the implementation is specific to Renesas RA and its FSP software.

I can tell you what I did however.
You will have to use DMA (direct memory access) for both of your smaller buffers. This lets you offload the copying of data from these smaller buffers to your main screen buffer, so you can render and copy data at the same time, which greatly speeds things up.
I have found in my testing that the memory used is a major limiting factor here; internal RAM will probably be much faster than external SDRAM. I have found that placing the large framebuffer for your entire display in external SDRAM and the smaller ones in internal RAM is a good tradeoff between memory usage and speed, although putting everything in internal RAM will be the fastest option.
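The render-while-copying scheme with two small buffers can be sketched as simple bookkeeping. The names below are hypothetical and no actual DMA peripheral is touched; the point is only that the renderer picks whichever buffer is not currently being copied.

```c
#include <stdbool.h>

#define DRAW_BUFFER_COUNT 2

typedef struct {
    /* true while a DMA transfer is still copying that buffer to the screen */
    volatile bool dma_busy[DRAW_BUFFER_COUNT];
} draw_buffers_t;

/* Returns the index of a draw buffer that is free for rendering,
 * or -1 if both buffers are still being copied. */
int acquire_render_buffer(const draw_buffers_t *bufs)
{
    for (int i = 0; i < DRAW_BUFFER_COUNT; i++) {
        if (!bufs->dma_busy[i]) {
            return i;
        }
    }
    return -1;
}

/* Call after rendering into buffer `index`. A real port would start the
 * DMA transfer to the main frame buffer here. */
void start_dma_copy(draw_buffers_t *bufs, int index)
{
    bufs->dma_busy[index] = true;
}

/* Call from the DMA transfer-complete interrupt for buffer `index`. */
void dma_copy_complete(draw_buffers_t *bufs, int index)
{
    bufs->dma_busy[index] = false;
}
```

While buffer 0 is being copied, the renderer works in buffer 1, and vice versa; the renderer only stalls when both transfers are still in flight.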

One thing I have noticed about the two STM32 MCUs you mentioned is that the STM32U5F7 has much more internal RAM but a relatively slow clock speed of 160 MHz. I do not know how much this impacts rendering speed on such a large display, but on the Renesas RA6M3G I used (240 MHz Cortex-M4) it became a limiting factor on a 480x800 16-bit display: I was able to draw quickly enough to the screen, but LVGL took its time rendering complex screens with many buttons and a large list.

So if the performance on the STM32U5F7 is lackluster, I suggest trying out the other MCU you mentioned, as it has a much higher clock speed, and putting the larger framebuffer in external RAM may not be that bad for performance/tearing.

Kind regards

Hello Tinus

Thanks for the pointers. I think you’re right that the STM32U5 is a bit slow at 160MHz. I’m hoping it will be adequate for my rather simple UI with no animations.
It also comforts me that the LVGL-approved Riverdi 5” 800x480 display uses the STM32U5.
However, in the article below Gabor mentions that 33 FPS with 90-96% CPU load was obtained with this board, so there will not be much CPU left for other tasks. But I guess it will be the most trouble-free way to do my initial LVGL tests, and I can later shift to an STM32H7 when/if ST releases one with more internal RAM.

Thanks again for helping,

Pete
