High CPU usage with shadow opacity animations

wiklod · February 4, 2022, 10:09pm

Hi,

I am seeing a rather strange performance hit from … animations? It is hard for me to track down where it comes from, because I get unexpected results whenever I try to isolate the cause of the slowdown.

I am designing a UI that will display data from two identical power management systems. Basically, my screen is divided in half, and both sides (I will call them panels) show the same content. I have designed some animations to help the viewer understand the power flow.

I uploaded the video for reference.

The problem is that when both panels run the most “advanced” version of the animations, CPU usage jumps to 100% and the frame rate drops from 33 to 21-24 (0:00-0:08 in the video). At first I assumed the animations were simply too “expensive” and that I would have to get rid of them. But then I realized that with the “advanced” animation on one panel and a static screen on the other, CPU usage is only around 0-4%, which confused me.

I experimented a bit and found that disabling the animation that makes the “shadow” of the power source glow (the blue one at the beginning of the video) reduces CPU usage to 12-20%, even with both panels in the “most advanced” configuration. So I assumed this animation was the cause of the slowdown. But running only this “glowing” animation on both sides of the screen (as on the right side at 0:08-0:19) gives me literally 0% CPU usage. At the end of the video I show another configuration with 2x “glowing shadow” animations and 2x “6 moving points” animations, and there I get around 20% CPU usage (although it is very similar to the configuration at the beginning).

I am wondering if this has something to do with how the display is being refreshed?
I am using a Raspberry Pi Zero 2 with a framebuffer display driver; my display is 1024x600 px.
The code for the animations is below:

def anim_shadow_opa(obj, val):
    obj.set_style_shadow_opa(val,0)
    obj.invalidate()

def shadow_pulse_animation(obj,level):
	a = lv.anim_t()
	a.init()
	a.set_var(obj)
	a.set_values(level, 255)
	a.set_time(1500)
	a.set_playback_time(1500)
	a.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a.set_path_cb(lv.anim_t.path_ease_in_out)
	a.set_custom_exec_cb(lambda a,val: anim_shadow_opa(obj,val))
	anim = a.start()
	return anim

def anim_x_y_predefined_path(obj,val,path,color1,color2):
	obj.set_pos(path[val][0],path[val][1])
	color_ratio = round((val/len(path))*255)
	obj.set_color(color2.color_mix(color1,color_ratio))

def one_to_two_animation(l11,l12,l13,l21,l22,l23,source_color,dest_color_upper,dest_color_lower,starting_point):
	path_lower_abs = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0), (13, 0), (14, 0), (15, 0), (16, 0), (17, 0), (18, 0), (19, 0), (20, 0), (21, 0), (22, 0), (23, 0), (24, 0), (25, 0), (26, 0), (27, 0), (28, 0), (29, 0), (30, 0), (31, 0), (32, 0), (33, 0), (34, 0), (35, 0), (36, 0), (37, 0), (38, 0), (39, 0), (40, 0), (41, 0), (42, 0), (43, 0), (44, 0), (45, 0), (46, 0), (47, 0), (48, 0), (49, 0), (50, 0), (51, 0), (52, 0), (53, 0), (54, 0), (55, 0), (56, 0), (57, 0), (58, 0), (59, 0), (60, 0), (61, 0), (62, 0), (63, 0), (64, 0), (65, 0), (66, 0), (67, 0), (68, 0), (69, 0), (70, 0), (71, 0), (72, 0), (73, 0), (74, 0), (75, 0), (76, 0), (77, 0), (78, 0), (79, 0), (80, 0), (81, 0), (82, 0), (83, 0), (84, 0), (85, 0), (86, 0), (87, 0), (88, 0), (89, 0), (90, 0), (91, 0), (92, 0), (93, 0), (94, 0), (95, 0), (96, 0), (97, 0), (98, 0), (99, 0), (100, 0), (101, 1), (102, 1), (103, 1), (104, 2), (105, 3), (106, 3), (107, 4), (108, 5), (109, 6), (109, 7), (110, 8), (111, 9), (111, 10), (111, 11), (112, 12), (112, 13), (112, 14), (112, 15), (112, 14), (112, 15), (112, 16), (112, 17), (112, 18), (112, 19), (112, 20), (112, 21), (112, 22), (112, 23), (112, 24), (112, 25), (112, 26), (112, 27), (112, 28), (112, 29), (112, 30), (112, 31), (112, 32), (112, 33), (112, 34), (112, 35), (112, 36), (112, 37), (112, 38), (112, 39), (112, 40), (112, 41), (112, 42), (112, 43), (112, 44), (112, 45), (112, 46), (112, 47), (112, 48), (112, 49), (112, 50), (113, 51), (113, 52), (113, 53), (114, 54), (115, 55), (115, 56), (116, 57), (117, 58), (118, 59), (119, 59), (120, 60), (121, 61), (122, 61), (123, 61), (124, 62), (125, 62), (126, 62), (127, 62), (128, 62), (129, 62), (130, 62), (131, 62), (132, 62), (133, 62), (134, 62), (135, 62), (136, 62), (137, 62), (138, 62), (139, 62), (140, 62), (141, 62), (142, 62), (143, 62), (144, 62), (145, 62), (146, 62), (147, 62), (148, 62), (149, 62), (150, 62), (151, 62), (152, 62), (153, 62), 
(154, 62), (155, 62), (156, 62), (157, 62), (158, 62), (159, 62), (160, 62), (161, 62), (162, 62), (163, 62), (164, 62), (165, 62), (166, 62), (167, 62), (168, 62), (169, 62), (170, 62), (171, 62), (172, 62), (173, 62), (174, 62), (175, 62), (176, 62), (177, 62), (178, 62), (179, 62), (180, 62), (181, 62), (182, 62), (183, 62), (184, 62), (185, 62), (186, 62), (187, 62), (188, 62), (189, 62), (190, 62), (191, 62), (192, 62), (193, 62), (194, 62), (195, 62), (196, 62), (197, 62), (198, 62), (199, 62), (200, 62), (201, 62), (202, 62), (203, 62), (204, 62), (205, 62), (206, 62), (207, 62)]
	path_upper_abs = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0), (13, 0), (14, 0), (15, 0), (16, 0), (17, 0), (18, 0), (19, 0), (20, 0), (21, 0), (22, 0), (23, 0), (24, 0), (25, 0), (26, 0), (27, 0), (28, 0), (29, 0), (30, 0), (31, 0), (32, 0), (33, 0), (34, 0), (35, 0), (36, 0), (37, 0), (38, 0), (39, 0), (40, 0), (41, 0), (42, 0), (43, 0), (44, 0), (45, 0), (46, 0), (47, 0), (48, 0), (49, 0), (50, 0), (51, 0), (52, 0), (53, 0), (54, 0), (55, 0), (56, 0), (57, 0), (58, 0), (59, 0), (60, 0), (61, 0), (62, 0), (63, 0), (64, 0), (65, 0), (66, 0), (67, 0), (68, 0), (69, 0), (70, 0), (71, 0), (72, 0), (73, 0), (74, 0), (75, 0), (76, 0), (77, 0), (78, 0), (79, 0), (80, 0), (81, 0), (82, 0), (83, 0), (84, 0), (85, 0), (86, 0), (87, 0), (88, 0), (89, 0), (90, 0), (91, 0), (92, 0), (93, 0), (94, 0), (95, 0), (96, 0), (97, 0), (98, 0), (99, 0), (100, 0), (101, -1), (102, -1), (103, -1), (104, -2), (105, -3), (106, -3), (107, -4), (108, -5), (109, -6), (109, -7), (110, -8), (111, -9), (111, -10), (111, -11), (112, -12), (112, -13), (112, -14), (112, -15), (112, -14), (112, -15), (112, -16), (112, -17), (112, -18), (112, -19), (112, -20), (112, -21), (112, -22), (112, -23), (112, -24), (112, -25), (112, -26), (112, -27), (112, -28), (112, -29), (112, -30), (112, -31), (112, -32), (112, -33), (112, -34), (112, -35), (112, -36), (112, -37), (112, -38), (112, -39), (112, -40), (112, -41), (112, -42), (112, -43), (112, -44), (112, -45), (112, -46), (112, -47), (112, -48), (112, -49), (112, -50), (113, -51), (113, -52), (113, -53), (114, -54), (115, -55), (115, -56), (116, -57), (117, -58), (118, -59), (119, -59), (120, -60), (121, -61), (122, -61), (123, -61), (124, -62), (125, -62), (126, -62), (127, -62), (128, -62), (129, -62), (130, -62), (131, -62), (132, -62), (133, -62), (134, -62), (135, -62), (136, -62), (137, -62), (138, -62), (139, -62), (140, -62), (141, -62), (142, -62), (143, -62), (144, -62), (145, -62), 
(146, -62), (147, -62), (148, -62), (149, -62), (150, -62), (151, -62), (152, -62), (153, -62), (154, -62), (155, -62), (156, -62), (157, -62), (158, -62), (159, -62), (160, -62), (161, -62), (162, -62), (163, -62), (164, -62), (165, -62), (166, -62), (167, -62), (168, -62), (169, -62), (170, -62), (171, -62), (172, -62), (173, -62), (174, -62), (175, -62), (176, -62), (177, -62), (178, -62), (179, -62), (180, -62), (181, -62), (182, -62), (183, -62), (184, -62), (185, -62), (186, -62), (187, -62), (188, -62), (189, -62), (190, -62), (191, -62), (192, -62), (193, -62), (194, -62), (195, -62), (196, -62), (197, -62), (198, -62), (199, -62), (200, -62), (201, -62), (202, -62), (203, -62), (204, -62), (205, -62), (206, -62), (207, -62)]
	path_upper = [(x[0]+starting_point[0],x[1]+starting_point[1]) for x in path_upper_abs]
	path_lower = [(x[0]+starting_point[0],x[1]+starting_point[1]) for x in path_lower_abs]	

	interval = 6000
	steps = len(path_lower)

	a11 = lv.anim_t()
	a11.init()
	a11.set_var(l11)
	a11.set_values(0, steps-1)
	a11.set_delay(0)
	a11.set_time(3*interval)
	a11.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a11.set_path_cb(lv.anim_t.path_ease_in_out)
	a11.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l11,val,path_upper,source_color,dest_color_upper))
	anim11 = a11.start()

	a12 = lv.anim_t()
	a12.init()
	a12.set_var(l12)
	a12.set_values(0, steps-1)
	a12.set_delay(1*interval)
	a12.set_time(3*interval)
	a12.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a12.set_path_cb(lv.anim_t.path_ease_in_out)
	a12.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l12,val,path_upper,source_color,dest_color_upper))
	anim12 = a12.start()

	a13 = lv.anim_t()
	a13.init()
	a13.set_var(l13)
	a13.set_values(0, steps-1)
	a13.set_delay(2*interval)
	a13.set_time(3*interval)
	a13.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a13.set_path_cb(lv.anim_t.path_ease_in_out)
	a13.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l13,val,path_upper,source_color,dest_color_upper))
	anim13 = a13.start()

	a21 = lv.anim_t()
	a21.init()
	a21.set_var(l21)
	a21.set_values(0, steps-1)
	a21.set_delay(int(0.5*interval))
	a21.set_time(3*interval)
	a21.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a21.set_path_cb(lv.anim_t.path_ease_in_out)
	a21.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l21,val,path_lower,source_color,dest_color_lower))
	anim21 = a21.start()

	a22 = lv.anim_t()
	a22.init()
	a22.set_var(l22)
	a22.set_values(0, steps-1)
	a22.set_delay(int(1.5*interval))
	a22.set_time(3*interval)
	a22.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a22.set_path_cb(lv.anim_t.path_ease_in_out)
	a22.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l22,val,path_lower,source_color,dest_color_lower))
	anim22 = a22.start()

	a23 = lv.anim_t()
	a23.init()
	a23.set_var(l23)
	a23.set_values(0, steps-1)
	a23.set_delay(int(2.5*interval))
	a23.set_time(3*interval)
	a23.set_repeat_count(lv.ANIM_REPEAT.INFINITE)
	a23.set_path_cb(lv.anim_t.path_ease_in_out)
	a23.set_custom_exec_cb(lambda a,val: anim_x_y_predefined_path(l23,val,path_lower,source_color,dest_color_lower))
	anim23 = a23.start()

	return [anim11, anim12, anim13, anim21, anim22, anim23]

The “moving points” animations work by moving 6 lv_led’s along a predefined path and gradually changing their color.
Maybe there is a better way to make these animations? Maybe there are some adjustments I should make in my framebuffer driver? I was expecting that with such a CPU (Broadcom BCM2710A1, 1 GHz) I wouldn’t have problems. Any advice is much appreciated!
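
The six hand-written animation blocks above differ only in the LED, the path, and the start delay. As a minimal sketch, assuming the same `interval` value as in the code above, the delays could be generated from a small table. The helper name `led_schedule` and the "upper"/"lower" labels are hypothetical, and this only shortens the setup code; it is not expected to change CPU usage.

```python
def led_schedule(interval):
    """Return (row_label, delay_ms) pairs for the six moving LEDs.

    Hypothetical helper: reproduces the delays used in one_to_two_animation,
    i.e. 0, 1, 2 intervals for the upper row and 0.5, 1.5, 2.5 for the lower.
    """
    schedule = []
    for i in range(3):
        schedule.append(("upper", i * interval))
    for i in range(3):
        schedule.append(("lower", int((i + 0.5) * interval)))
    return schedule


# With interval = 6000 ms, as in the original code.
for row, delay in led_schedule(6000):
    print(row, delay)
```

Each entry could then drive one loop iteration that creates an lv.anim_t, calls set_delay(delay), and picks path_upper or path_lower by label, as the blocks above do by hand.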

amirgon · February 5, 2022, 8:10pm

Hi @wiklod !

It could be helpful to have a simpler example which we could run on our side in the simulator and see the problem ourselves.
With the information you provided, it’s a bit hard to say if the problem is related to LVGL, to Micropython or to the display driver.

Take into account that Micropython is much slower than C. When you implement animation callbacks in Micropython, these callbacks are called many times a second, and the Micropython performance penalty can become more evident. You can check whether modifying the animation callbacks affects your CPU usage.
If this is really the case and the Micropython callbacks are the bottleneck, you can try to optimize them, use the “native” or “viper” decorators, or even implement them in C.
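
As a rough illustration of the decorator route, the sketch below marks a small numeric helper with @micropython.native. The helper name `color_ratio` and the CPython stand-in class are assumptions added for illustration; the arithmetic mirrors the color-ratio step in anim_x_y_predefined_path. Whether this gives any measurable speedup depends on where the time is actually spent.

```python
try:
    import micropython  # present on MicroPython builds
except ImportError:
    # Stand-in so the sketch also runs on CPython, where the module is missing.
    class micropython:
        @staticmethod
        def native(func):
            return func


@micropython.native
def color_ratio(val, path_len):
    # Map a path index in 0..path_len to a mix ratio in 0..255,
    # the same formula the original callback uses.
    return round((val / path_len) * 255)
```

The decorator has to be written literally as @micropython.native, because MicroPython handles it at compile time rather than as an ordinary function call.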

wiklod · February 7, 2022, 8:31am

I’ll try to prepare something later this week

After a year of using mainly Julia, I started to forget that dynamism and a high level of abstraction usually come at the price of performance.

But what you said about callbacks is quite intriguing. Since these callbacks mostly just assign a new value to the object, I am surprised that C can outperform Micropython by this much. Maybe I should focus on implementing the whole animations in C? I have some background in C, so rewriting the callbacks wouldn’t be a problem. BTW, did you mean something like Micropython external C modules, or native machine code in .mpy files?

I’ll experiment with that (and decorators) a little and come here with the results.

amirgon · February 11, 2022, 9:58pm

Before jumping to that, I recommend first profiling your code and checking whether these callbacks are really the source of the problem you are seeing.
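
One lightweight way to check this, sketched below under stated assumptions, is to wrap a callback so that it accumulates its own run time. The names `timed` and `stats` are hypothetical. Note that this measures only the time spent inside the Python callback, not the time LVGL later spends redrawing, so a low number here points away from the callbacks rather than proving the animation is cheap.

```python
import time

# MicroPython provides ticks_us/ticks_diff; CPython does not, so fall back
# to perf_counter there.
if hasattr(time, "ticks_us"):
    def _now():
        return time.ticks_us()

    def _elapsed_us(start):
        return time.ticks_diff(time.ticks_us(), start)
else:
    def _now():
        return time.perf_counter()

    def _elapsed_us(start):
        return int((time.perf_counter() - start) * 1000000)


def timed(func, stats):
    """Return a wrapper around func that records call count and total time."""
    def wrapper(*args):
        start = _now()
        result = func(*args)
        stats["calls"] += 1
        stats["total_us"] += _elapsed_us(start)
        return result
    return wrapper


stats = {"calls": 0, "total_us": 0}
add = timed(lambda a, b: a + b, stats)
add(1, 2)
print(stats["calls"])  # number of wrapped calls so far
```

For example, passing timed(lambda a, val: anim_shadow_opa(obj, val), stats) to set_custom_exec_cb would collect timings for the shadow callback, and stats["total_us"] / stats["calls"] gives the average cost per call.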

wiklod · February 16, 2022, 3:02pm

So I made some quick tests and found out that the disproportionate CPU usage reported by LVGL has nothing to do with what Linux top reports. So the mystery of very different CPU usage with almost the same animations on screen seems to be solved.

Nevertheless, the animations are still consuming a lot of CPU power. I didn’t profile my code quantitatively, but I disabled the animations one by one and observed the CPU usage of the micropython process with top. I found that:

with only one “moving points” animation enabled I get around 10% CPU usage
with only two “moving points” animations enabled I get around 60% CPU usage
with only one “shadow opacity” animation enabled I get around 40% CPU usage
with only two “shadow opacity” animations enabled I get around 70% CPU usage
with everything enabled I get constant 100%
with all animations disabled I get up to 5%…

So, in my understanding, this shows that something in these animations is really expensive in terms of CPU time…

I made quick demo in online simulator for you to see the setup.

I added @micropython.native decorators but it didn’t affect the CPU usage at all…

amirgon · February 16, 2022, 9:57pm

Thank you for the demo, it is very useful for figuring out where the problem is.

I’ve profiled your demo on Linux with perf and hotspot:

It’s very easy to see that LVGL is eating all the cycles (and not Micropython in this case so there is no point in optimizing the Micropython callbacks).
Object drawing is consuming ~90% of the cpu cycles, where most work is done on shadows and blurred corners.

@embeddedt , @kisvegabor - Is that expected and normal? Perhaps worth optimizing these functions.

kisvegabor · February 17, 2022, 5:21pm

LV_SHADOW_CACHE_SIZE should solve exactly this issue.

amirgon · February 18, 2022, 11:29am

@kisvegabor - Are you sure? I’ve tried setting LV_SHADOW_CACHE_SIZE to 100 and got more-or-less the same results when profiling…

@wiklod - Could you try some values for LV_SHADOW_CACHE_SIZE on your side and see if it helps?
You set it in lv_binding_micropython/lv_conf.h:

github.com

lvgl/lv_binding_micropython/blob/81dadf150f12620240e970fe477df6591b0180de/lv_conf.h#L118

     *-----------*/

    /*Enable complex draw engine.
     *Required to draw shadow, gradient, rounded corners, circles, arc, skew lines, image transformations or any masks*/
    #define LV_DRAW_COMPLEX 1
    #if LV_DRAW_COMPLEX != 0

        /*Allow buffering some shadow calculation.
         *LV_SHADOW_CACHE_SIZE is the max. shadow size to buffer, where shadow size is `shadow_width + radius`
         *Caching has LV_SHADOW_CACHE_SIZE^2 RAM cost*/
        #define LV_SHADOW_CACHE_SIZE 0

        /* Set number of maximally cached circle data.
         * The circumference of 1/4 circle are saved for anti-aliasing
         * radius * 4 bytes are used per circle (the most often used radiuses are saved)
         * 0: to disable caching */
        #define LV_CIRCLE_CACHE_SIZE 4
    #endif /*LV_DRAW_COMPLEX*/

    /*Default image cache size. Image caching keeps the images opened.
     *If only the built-in image formats are used there is no real advantage of caching. (I.e. if no new image decoder is added)

wiklod · February 18, 2022, 11:35am

I have set it to 180 and don’t see any improvement

kisvegabor · February 18, 2022, 1:51pm

What is the size of the shadow (shadow_width style parameter)?

wiklod · February 18, 2022, 1:55pm

In demo it is 50.
In my app I have 30 but this is probably irrelevant.

kisvegabor · February 18, 2022, 2:12pm

It’s important because LV_SHADOW_CACHE_SIZE must be greater than shadow_width + radius. From the video LV_SHADOW_CACHE_SIZE = 180 should be enough, but please confirm it.
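
The size rule can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, the radius of 100 is a made-up value (the actual radius is not stated in this thread), and the RAM cost is simply the squared cache size as described in the lv_conf.h comment, in whatever unit LVGL uses per cache entry.

```python
def cache_is_large_enough(cache_size, shadow_width, radius):
    # Rule stated above: LV_SHADOW_CACHE_SIZE must be greater than
    # shadow_width + radius for the shadow to be cached.
    return cache_size > shadow_width + radius


def cache_ram_cost(cache_size):
    # lv_conf.h describes the cost as LV_SHADOW_CACHE_SIZE^2.
    return cache_size ** 2


# shadow_width = 30 is from the thread; radius = 100 is an assumed value.
print(cache_is_large_enough(180, 30, 100))
print(cache_ram_cost(180))
```

Because the cost grows quadratically, a very large experimental value such as 1000 costs about 31 times as much as 180, so it is worth computing the smallest value that satisfies the rule.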

Have you enabled compiler optimization when building Micropython?

Anyway, it should work smoothly on a 1 GHz processor. But if we can’t make it fast enough, a workaround could be to use an image as the shadow.

wiklod · February 18, 2022, 2:19pm

I confirm that I have shadow_width set to 30.

Yes, compiler optimization is set to -O3.

Using an image as the shadow is my backup plan.

EDIT:
I tried setting the cache size to 1000 as an experiment, and there is still no improvement.

kisvegabor · February 18, 2022, 3:23pm

radius also matters, and with circle widgets it can be a large value.

Don’t you see shadow_blur_corner shrinking, even in the profiler?