Best MCU for LVGL? Would the new Teensy 4.1 be enough?

A powerful CPU can compensate for the lack of a blending accelerator, but the display interface also needs to be fast, otherwise it will limit the frame rate.

The complexity of the UI matters as well. Are you blending several translucent images together, or just scrolling a list of text and symbols?

As a real-world example, I tinker with LittlevGL on a 200 MHz STM32F746 board with a 480x272 screen. The CPU has a blending accelerator, but I don’t use it. I run it with a buffer 1/4 the size of the display’s actual framebuffer, so about 65 KB. At the moment, I’m running the widgets demo from the site. Even with the processor cache disabled, frame rates are very reasonable, and when I enable cache, the UI is always buttery smooth. From the sound of it, the Teensy has roughly 3x the processing power of my current setup. What would your UI look like in comparison to this?
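For anyone wanting to check the buffer arithmetic for their own panel, here is a rough sketch. It assumes 16 bits per pixel, and the function names are made up for illustration; they are not part of LVGL.

```c
#include <stdint.h>

/* Bytes needed for one full framebuffer at the given resolution and depth.
 * Assumes bits_per_pixel is a multiple of 8 (e.g. 16 for RGB565). */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return width * height * (bits_per_pixel / 8u);
}

/* Bytes needed for a partial draw buffer that is 1/divisor of a full frame. */
uint32_t draw_buffer_bytes(uint32_t width, uint32_t height,
                           uint32_t bits_per_pixel, uint32_t divisor)
{
    return framebuffer_bytes(width, height, bits_per_pixel) / divisor;
}
```

With 480x272 at 16 bpp, the full frame comes to 261,120 bytes and a quarter-size buffer to 65,280 bytes, which matches the roughly 65 KB figure above.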

If someone did end up getting one of these, it would be great if they could run this benchmark and share their results and hardware configuration.

I just ordered a Teensy 4.1 and it should arrive in a week. Once I get it up and running, if I do, I’ll report back my experience. I’ll also try to run the benchmark on it.

Thanks

Hi Jason.
Which LCD panel are you going to connect to this board? The board exposes nothing suitable except SPI. The processor itself has the necessary interfaces, but they are not broken out on the board. And as far as I know, SPI panels with a resolution of 800x480 do not exist.

I assume it’s because refreshing them at full speed using the relatively slow SPI interface is near-impossible, although I’m no display interface expert. :slightly_smiling_face:
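To put a rough number on that, here is a back-of-the-envelope sketch. The 30 MHz SPI clock is only an assumed example value, not a figure for any particular display or MCU, and the function name is made up. It ignores command overhead and gaps between transfers, so real results would be lower.

```c
#include <stdint.h>

/* Upper bound on full-screen refreshes per second over a 1-bit serial link,
 * returned in milli-frames per second (4882 means about 4.9 fps).
 * Ignores command bytes, chip-select gaps, and rendering time. */
uint32_t spi_max_mfps(uint32_t width, uint32_t height,
                      uint32_t bits_per_pixel, uint32_t spi_clock_hz)
{
    uint64_t bits_per_frame = (uint64_t)width * height * bits_per_pixel;
    return (uint32_t)(((uint64_t)spi_clock_hz * 1000u) / bits_per_frame);
}
```

At an assumed 30 MHz, an 800x480 panel at 16 bpp tops out below 5 full frames per second, while a 320x240 panel can reach about 24. Partial updates help, since only the changed areas need to be sent, but full-screen animations stay bound by this limit.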

I mean, there are smart displays with SPI and an on-board screen buffer, but they top out at 320x240 resolution. I was just trying, not very successfully, to hint to Jason that this particular module is not suitable, even though the controller is good.

@embig71
I think you lose most of the performance due to VSYNC. To mitigate this, triple buffering can be used. Otherwise you spend time waiting for VSYNC. The other issue is that you render directly into external SRAM, where random writes are very slow compared to the internal SRAM.

I added a new comment in the related topic to figure out what causes the issue.

@kisvegabor, are there any examples I can reference that show how to properly implement triple buffering?

Throughout this thread, I see the STM32F line mentioned a lot. From what I’ve read, those MCUs have a lower clock speed than the Teensy 4. What features do the STM chips have that the Teensy 4 doesn’t, making them a better option? If there are any. I’m unsure, and this seems like a reasonable question to help me understand.

@embig71, I was looking at the following 800x480 boards. The first one might not be SPI, but I can also work with 8-bit parallel if it’s an option and/or if it offers greater performance.

This is off the top of my head:

  • They have a graphics accelerator built-in, meaning that they can handle extra uses of blending, etc. at higher resolutions.
  • The Discovery boards also tend to ship with higher-speed display interfaces, but I don’t know if actual hardware designs end up utilizing those.
  • They’re extremely popular, which means that more people are familiar with them, thus they get more recommendations.

I’m planning to play with it soon, but not now. :frowning:
However, it would only be put to good use if the MCU has an LCD/TFT driver peripheral (so buffers can be swapped with one command), which is not the case for the Teensy 4.

I think it’s more important than having a GPU. Sending the pixels to the display via SPI or a parallel port takes a lot of time for larger screens.
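On the triple buffering question above, here is a minimal sketch of just the bookkeeping, under some assumptions: the three framebuffers are identified by index, the names are hypothetical, and this is not taken from LVGL or any vendor SDK. The renderer never waits for VSYNC; it hands a finished frame over and immediately gets another buffer to draw into, while the VSYNC handler picks up the newest finished frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Indices of the three framebuffers and their current roles. */
typedef struct {
    uint8_t front;         /* being scanned out to the panel */
    uint8_t pending;       /* finished frame waiting for the next VSYNC */
    uint8_t back;          /* being drawn into by the renderer */
    bool    pending_ready; /* true if 'pending' holds an undisplayed frame */
} triple_buf_t;

void tb_init(triple_buf_t *tb)
{
    tb->front = 0;
    tb->pending = 1;
    tb->back = 2;
    tb->pending_ready = false;
}

/* Renderer calls this when a frame is complete.
 * Returns the index of the buffer to draw the next frame into. */
uint8_t tb_frame_done(triple_buf_t *tb)
{
    uint8_t finished = tb->back;
    tb->back = tb->pending;   /* reuse the slot; an undisplayed older frame is dropped */
    tb->pending = finished;
    tb->pending_ready = true;
    return tb->back;
}

/* Called once per VSYNC. Returns the index of the buffer to scan out next. */
uint8_t tb_on_vsync(triple_buf_t *tb)
{
    if (tb->pending_ready) {
        uint8_t old_front = tb->front;
        tb->front = tb->pending;
        tb->pending = old_front;
        tb->pending_ready = false;
    }
    return tb->front;
}
```

In a real system, `tb_on_vsync` would run in the display controller's interrupt and would point the controller at the returned buffer, so the shared state needs protection against races (for example, briefly disabling that interrupt in `tb_frame_done`). Since each buffer is a full frame, unchanged regions also have to be kept in sync between buffers somehow; that part is left out here.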

Is the “higher-speed display interface” something similar to SPI, but something most common MCUs do not offer? Is there a specific name for this display port option so I know what to look for when exploring MCUs?

Thanks again for the support and information.

Another question for you all,

Does lvgl support the RA8875 chipset? I honestly know nothing about it other than that it appears to be a dedicated display IC that supports SPI. Would using a chip like this offer any improvements compared to the other standard display options?

Probably the best option is a dedicated TFT-LCD peripheral which directly sends the frames using HSYNC, VSYNC, and 16 or 24 pins for colors.

In the case of ST, it’s called LTDC. This PDF is specific to ST, but the first few pages summarize the different options to drive a TFT.
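As a rough illustration of what such a peripheral has to sustain, the pixel clock can be estimated from the panel timings. The blanking values used in the note below are assumed, illustrative numbers, not taken from any datasheet, and the function name is made up.

```c
#include <stdint.h>

/* Pixel clock needed by an RGB (HSYNC/VSYNC) interface:
 * total pixels per line * total lines per frame * frames per second.
 * The blanking values cover sync pulse plus front and back porch. */
uint32_t rgb_pixel_clock_hz(uint32_t h_active, uint32_t h_blank,
                            uint32_t v_active, uint32_t v_blank,
                            uint32_t refresh_hz)
{
    return (h_active + h_blank) * (v_active + v_blank) * refresh_hz;
}
```

With assumed blanking of 45 pixels and 14 lines, a 480x272 panel at 60 Hz needs about 9 MHz. An 800x480 panel with assumed blanking of 256 pixels and 45 lines needs about 33 MHz, on 16 or 24 data pins in parallel, which is far more bandwidth than a single SPI data line provides.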

Yes, but not all features. RA8875 has a built-in drawing engine that can’t be used directly by LVGL. However, you can use it like any other display controller: select rectangular area (called window) and copy the LVGL rendered image there.
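The window-and-copy approach can be sketched as below. This is a host-side simulation with a tiny in-memory panel, and every name in it is hypothetical; a real RA8875 driver would send register writes over SPI or the parallel bus instead. In LVGL this logic would sit in the display driver's flush callback, which then tells LVGL the flush is finished (`lv_disp_flush_ready` in v7).

```c
#include <stddef.h>
#include <stdint.h>

enum { PANEL_W = 16, PANEL_H = 8 };      /* tiny simulated panel, not a real size */

uint16_t panel_ram[PANEL_H][PANEL_W];    /* stands in for the controller's display RAM */
static uint16_t win_x1, win_y1, win_x2, win_y2, cur_x, cur_y;

/* Select the rectangular window; following pixel writes fill it row by row. */
void panel_set_window(uint16_t x1, uint16_t y1, uint16_t x2, uint16_t y2)
{
    win_x1 = x1; win_y1 = y1; win_x2 = x2; win_y2 = y2;
    cur_x = x1;  cur_y = y1;
}

/* Write pixels at the cursor, wrapping to the next row at the window's right edge. */
void panel_write_pixels(const uint16_t *pixels, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        panel_ram[cur_y][cur_x] = pixels[i];
        if (cur_x < win_x2) {
            cur_x++;
        } else {
            cur_x = win_x1;
            cur_y = (cur_y < win_y2) ? (uint16_t)(cur_y + 1) : win_y1;
        }
    }
}

/* Copy one rendered area (inclusive corner coordinates) to the panel. */
void flush_area(uint16_t x1, uint16_t y1, uint16_t x2, uint16_t y2,
                const uint16_t *pixels)
{
    size_t count = (size_t)(x2 - x1 + 1) * (size_t)(y2 - y1 + 1);
    panel_set_window(x1, y1, x2, y2);
    panel_write_pixels(pixels, count);
}
```

The point is that the controller only ever sees "set window, stream pixels", so its own drawing engine stays idle and the transfer speed of the bus is what matters.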

Does this mean that the power of the RA8875 is useless in lvgl’s case? Without that engine, does the RA8875 offer any performance benefits compared to not using it?

Also, thanks for the PDF on STM’s LTDC tech. It was very informative. I’ll be focusing some time on learning more about the STM32F7x family.

The STM32F4 also has an LTDC controller, but does not have the Cortex-M7 core and caching abilities that the F7 does. Typically the F4 is not used for large resolutions like 800x480 as they require a lot of RAM for the framebuffer and the processor ends up being a bottleneck. However, I believe the F4 is cheaper than the F7.

I’m not sure the RA8875 will work well for your use case as it appears to interface through SPI, has its own memory buffer for the display (useful mainly when the host MCU already has RAM dedicated to an application), and its own drawing system, none of which LVGL can take advantage of at the moment. (Given that the RA8875 doesn’t seem to have any antialiasing or custom font support, this is unlikely to change, as you lose most of the control and flexibility LVGL gives you.)

Hi folks,

As I look into exactly the same type of application:

For an SPI interface, you must run it at the maximum frequency, which means hardware SPI with DMA.

There are 5 and 7 inch screens with SPI. The Surenoo store has 800x480 LCD “modules” with SPI. Crystalfontz has its sunlight-readable 5 inch EVE display.

I will actually use the 5 inch Surenoo with an ESP32. I need a 1 Hz frame rate …

For the Teensy 4.1, you might be able to use the FlexIO block to build an alternate 16-bit MCU interface. That means the real frame buffer has to remain in the screen, so the LCD module you buy needs to have internal RAM. A 16/24-bit RGB LCD with PCLK cannot be used unless you build a real LCD pixel interface with FlexIO. There is an app note on the NXP website about using FlexIO like that for Kinetis CPUs (the CPU is different, but the FlexIO block is the same).

Hope it helps a bit.

I’m testing 5" 800x480 displays using an LT768 video processor/accelerator on an ESP32. It seems to be fast for playing video, and it also has panning, scrolling, PIP, and other HW acceleration.
It is this one: https://www.aliexpress.com/item/4000282154239.html?spm=a2g0o.productlist.0.0.2d745798XCI0r1&algo_pvid=46993936-b698-4ef2-9524-f50deb133ce8&algo_expid=46993936-b698-4ef2-9524-f50deb133ce8-0&btsid=0be3769015916778431468077e2d09&ws_ab_test=searchweb0_0,searchweb201602_,searchweb201603_
But it’s quite new, and I don’t think there is LVGL support for it yet.
Before that, I was working with RA8875 5" 800x480 displays.
The LT768 display can be used in parallel or SPI mode, and there are resistive and capacitive touch versions available.

It seems very slow from the video on aliexpress. Have I missed something?

I’m using LVGL with the i.MX RT 1052 boards and I’m pleased with the performance.
This is for a 480x272 display with a touch controller. It has the advantage that the CPU is able to transmit the data directly to the display, which is fast compared to using an SPI connection. SPI works well for me on smaller displays too.
If selecting a display controller type: I like the SSD (Solomon Systech) ones which (at least some of them) have a ‘dual RAM’: you can write to the second RAM of the display and then switch the content: that way the update is ‘immediate’. You still will need time to write the data, but at least it is less visible to the human eye.

On a side note: if using the Teensy, keep in mind you won’t be able to debug it (at least the ones I have used).

I hope this helps.

Hello,

I am using the evkbimxrt1050 board, 480x272 resolution panel, 16-bit pixel. I tried the lvgl7 benchmark, the result is:

Weighted FPS: 68
Opa. speed: 92%

The panel is a dumb panel driven by the MCU’s LCDIF. If a smart panel is used, I guess it will be slower because of the SPI transfer speed.

It’s strange that, even with only I-Cache enabled, the weaker STM32F746NG still gets double the frame rate that you are getting, with a very similar display setup (480x272, 16bpp, LTDC interface). Do you have cache enabled? Does the i.MX RT have any graphics accelerator to take advantage of?