FreeRTOS getting thread safe - or thread safe in general?

Is there a plan to make LVGL thread-safe on FreeRTOS?

I have a method that is “ok” and relatively automated in some ways, but I still trip up on thread safety myself once in a while, trying to remember whether I’m in a locking thread that will handle the mutex take/give automatically or not. So I know people who use my C++ library (LVGLPlusPlus) will trip up on it too.

I suppose at a higher level - is there a plan for making the library thread safe, generally?

Thanks!
bob

Hi Bob,

We have been thinking about it a lot but haven’t had a good idea for it so far. Maybe you will have some :slight_smile:

So the goal is to allow setting and getting all widget, style, animation, etc. properties from multiple threads concurrently.

To make LVGL fully thread-safe, all top-level API functions (e.g. lv_obj_set_width()) would need to begin with a mutex take and end with a mutex give (sketched after the list below). However:

  • it’s a lot of boilerplate
  • hard to maintain and read
  • error prone
  • could be slow
  • some corner cases might not be covered
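Just to illustrate what that would mean, here is a rough sketch on FreeRTOS. LV_GLOBAL_LOCK()/LV_GLOBAL_UNLOCK() and lv_global_mutex are made-up names used only for illustration, and a recursive mutex would be needed because public API functions call each other internally:

#include "FreeRTOS.h"
#include "semphr.h"
#include "lvgl.h"

static SemaphoreHandle_t lv_global_mutex;   /* xSemaphoreCreateRecursiveMutex() at start-up */

#define LV_GLOBAL_LOCK()    xSemaphoreTakeRecursive(lv_global_mutex, portMAX_DELAY)
#define LV_GLOBAL_UNLOCK()  xSemaphoreGiveRecursive(lv_global_mutex)

void lv_obj_set_width(lv_obj_t * obj, lv_coord_t w)
{
    LV_GLOBAL_LOCK();        /* take on entry... */
    /* ...the real body of the function... */
    LV_GLOBAL_UNLOCK();      /* ...and give again on every return path */
}

Every public function, and every early-return path inside it, would need this same wrapping, which is exactly where the boilerplate and maintenance cost come from.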

Due to this we found that we can leave it to the user, as usually most of the UI calls are handled from LVGL events and timers, which are called from lv_timer_handler, so no mutex is needed there.
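For example, a callback registered with lv_timer_create() runs from lv_timer_handler(), so it can touch widgets without any locking (my_label and the 500 ms period below are just placeholders):

#include "lvgl.h"

static lv_obj_t * my_label;                     /* created elsewhere, in the LVGL thread */

static void refresh_cb(lv_timer_t * timer)
{
    /* Runs inside lv_timer_handler(), i.e. in the LVGL thread: no mutex needed */
    lv_label_set_text_fmt(my_label, "Tick: %d", (int)lv_tick_get());
}

void create_refresh_timer(void)
{
    lv_timer_create(refresh_cb, 500, NULL);     /* fire roughly every 500 ms */
}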

There is one small piece of good news though: in v9 we are integrating an OSAL into LVGL, and FreeRTOS will be supported too. So mutexes can be managed via a common lv_os_mutex_... API.

Thanks for this insight into where safety is needed. I’m feeling like I’m biting off a bit too much at the moment, but I’d love to revisit this around v9, as that sounds like a better integration point given the work that’s already been done there. And maybe some of this will become a bit more natural in an automated version of LVGLPlusPlus’s ‘generated bindings’ approach.

On the performance front, my gut says that this wouldn’t necessarily be burdensome or a big penalty so long as usage patterns of the library are “reasonable”. Do you have a sense at all of what a user doing ‘heavy usage’ of LVGL looks like? Thinking that if the construct for mutexes were done well, it may be very simple to build the library in a thread-safe form with some performance penalties, and also in a non-thread-safe version which has no such penalties (sketched below). Measurements could even be taken to learn how extreme the cases would have to get before the cost is actually noticeable.
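Broad brush strokes of what I mean by the two build flavours, with a hypothetical configuration switch (LV_USE_THREAD_SAFE, lv_global_lock() and lv_global_unlock() are made-up names, not real LVGL options):

#if LV_USE_THREAD_SAFE
    #define LV_API_LOCK()     lv_global_lock()    /* e.g. a recursive OS mutex */
    #define LV_API_UNLOCK()   lv_global_unlock()
#else
    /* the non-thread-safe build compiles the locking away entirely */
    #define LV_API_LOCK()     do { } while(0)
    #define LV_API_UNLOCK()   do { } while(0)
#endif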

–bob


Hello Bob,

You have probably seen this bit of documentation already: Operating system and interrupts — LVGL documentation

As I understand it, locking and unlocking a mutex around the lv_task_handler call (also known as lv_timer_handler) will mostly make it thread-safe, as most of the LVGL functionality is handled via that function call.
Again, if I understand it correctly, this means one can safely have a separate, single thread for only LVGL and separate other threads for handling other functionality.
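A minimal sketch of that arrangement with FreeRTOS, roughly as the documentation describes it (gui_mutex, read_sensor() and sensor_label are placeholders of mine, not names from LVGL):

#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"
#include "lvgl.h"

static SemaphoreHandle_t gui_mutex;     /* gui_mutex = xSemaphoreCreateMutex(); before the tasks start */
static lv_obj_t * sensor_label;         /* created by the LVGL task */

extern int read_sensor(void);           /* hypothetical helper */

/* The single LVGL task owns lv_task_handler()/lv_timer_handler() */
static void gui_task(void *param)
{
    for (;;) {
        xSemaphoreTake(gui_mutex, portMAX_DELAY);
        lv_timer_handler();
        xSemaphoreGive(gui_mutex);
        vTaskDelay(pdMS_TO_TICKS(5));
    }
}

/* Any other task takes the same mutex while it touches LVGL objects */
static void sensor_task(void *param)
{
    for (;;) {
        int value = read_sensor();
        xSemaphoreTake(gui_mutex, portMAX_DELAY);
        lv_label_set_text_fmt(sensor_label, "%d", value);
        xSemaphoreGive(gui_mutex);
        vTaskDelay(pdMS_TO_TICKS(100));
    }
}

The key point is that only one task ever calls lv_timer_handler(); everyone else just borrows the same mutex for the short time they touch widgets.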

Do you mean calling LVGL itself from multiple threads?

At the moment I have a separate thread for handling touch screen inputs and another for LVGL, with no issues on FreeRTOS so far. So now I am curious whether more complex systems will cause issues with LVGL when used in this manner.

Kind regards


Performance is only one possible issue, but it might occur e.g. when 500 objects are created and each has 5 children, so 3,000 widgets in total. Let’s say there are 10 LVGL API calls per widget, so 30,000 mutex lock/unlock pairs. I’m also not sure how fast mutex locks/unlocks are, but it seems like a lot.

@Tinus indeed I’m aware of how mutexes need to be used in order to make my code thread-safe with LVGL when I’ve got multiple threads. The desirable end game for me would be for LVGL to be thread-safe within itself, so there’s no “handle with care” needed on this topic. Even though I fully understand what’s required, I’ve still tripped myself up a few times, and after 5 minutes of scratching my head I go ohhhhhh DRAT. :stuck_out_tongue_winking_eye:


@kisvegabor Do you really have use cases that you know of which are both this large and also this “busy” in terms of calls? That’s a stunning set of numbers. :open_mouth:

I’m not certain of the actual “cost” of the mutex, and I’d expect it might be OS dependent. It might not be terribly hard to come up with a “unit test” that essentially benchmarks mutex usage in such a scenario, if that’s a realistic scenario, and to run that benchmark on a variety of hosts to see whether the cost is appreciable.
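Something along these lines might be enough for a first datapoint on FreeRTOS; the tick-based timing is coarse, so a cycle counter (e.g. DWT->CYCCNT on Cortex-M) would give a finer answer, and the 30,000 figure just mirrors the scenario above:

#include <stdio.h>
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

void benchmark_mutex(void)
{
    SemaphoreHandle_t m = xSemaphoreCreateMutex();
    const uint32_t n = 30000;                    /* lock/unlock pairs, as in the scenario above */

    TickType_t start = xTaskGetTickCount();
    for (uint32_t i = 0; i < n; i++) {
        xSemaphoreTake(m, portMAX_DELAY);        /* uncontended take... */
        xSemaphoreGive(m);                       /* ...and give */
    }
    TickType_t end = xTaskGetTickCount();

    printf("%lu take/give pairs took %lu ticks\n",
           (unsigned long)n, (unsigned long)(end - start));
    vSemaphoreDelete(m);
}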

–bob

It might be a little bit exaggerated, but hundreds of list items are not uncommon. E.g. in one project I needed to create a list item for downloading the map of each country.

However, regardless of performance issues, the other things that I listed still suggest to me that “API level mutexes” are not the best idea.

Ok - great point on the large lists. I hadn’t considered that each of those entries would be an lv_obj_t item to manipulate.

As for the bigger picture, then: can we identify the critical data structures and accesses where collisions can happen, so that we only put mutexes around those areas, or essentially create self-locking areas?

And a more global question - is the library necessarily always in C due to technology constraints on its hosts? Or is C a project-related decision? I ask because it seems that a handful of well-designed classes could help automate what needs to be locked and when. Just talking in broad brush strokes here.

–bob

I think any object/animation/timer/etc can be a collision point. Imagine that 2 threads want to set the same label’s text at the same time.

Migrating to C++ comes up from time to time, but we have decided to stick with C for now because of the simplicity of C.

I can tell you that using a mutex is not fast at all. It slows things down quite a bit, depending on how they are used. The best way to use a mutex is to check if it is locked, and if it is, move on to updating something else and come back to update that object later…
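A rough sketch of that “check and come back later” pattern with FreeRTOS (gui_mutex and the update callback are placeholders):

#include <stdbool.h>
#include "FreeRTOS.h"
#include "semphr.h"

extern SemaphoreHandle_t gui_mutex;              /* shared with the LVGL task */

static bool try_update_gui(void (*update)(void))
{
    if (xSemaphoreTake(gui_mutex, 0) == pdTRUE) {    /* non-blocking attempt */
        update();                                    /* safe to touch LVGL here */
        xSemaphoreGive(gui_mutex);
        return true;
    }
    return false;        /* mutex is busy: go update something else and retry later */
}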

The design of LVGL is what makes it difficult to use with threads, because if one thread is doing anything with any LVGL object, all other threads are unable to do anything in LVGL until that single thread is finished. This is due to using globals to store different pieces of state, for example convenience functions like lv_disp_get_default, which fetch the lv_disp_t from a global.

Could this be handled differently? Sure: add a field to lv_disp_t named “is_default” with a boolean value. When setting the default display, iterate over the displays setting them all to false, and then set the one passed to the set_default function to true. When getting the default, iterate over the displays looking for the one that has the “is_default” field set to true and return it.
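A sketch of that idea with made-up types and storage, just to show the shape of it (this is not LVGL’s actual code):

#include <stdbool.h>
#include <stddef.h>

typedef struct {
    /* ...driver, buffers, etc... */
    bool is_default;
} my_disp_t;

static my_disp_t disp_list[4];     /* however the displays are actually stored */
static int       disp_count;

void my_disp_set_default(my_disp_t *disp)
{
    for (int i = 0; i < disp_count; i++)
        disp_list[i].is_default = false;       /* clear the previous default */
    disp->is_default = true;                   /* flag the new one */
}

my_disp_t * my_disp_get_default(void)
{
    for (int i = 0; i < disp_count; i++) {
        if (disp_list[i].is_default)           /* find the flagged display */
            return &disp_list[i];
    }
    return NULL;
}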

Depending on the use case it could save memory or it could cost more. In all use cases there is going to be a performance hit due to iterating over an array, but I don’t think that would make a noticeable difference in most use cases; in some edge cases it might.

That is only a single example, as there are several places where this is happening. The same approach explained above could be applied to most if not all of the other globals.

I think that is the biggest hang-up with using multiple threads in parallel to update LVGL. The only thing you would not be able to do is refresh the display and make changes at the same time. I don’t recall seeing anything like setting the value of a slider directly updating the frame buffer: the new value gets stored, and when the refresh timer expires and lv_task_handler is called, that is when the new value gets written to the frame buffer.

That could be dealt with pretty easily though, using two mutexes and an array that holds a marker for each thread. After the last thread has finished whatever it is changing in LVGL, it sets the marker for that thread and then releases the mutex, which allows the display update thread to run. Using an array of bools that is allocated ahead of time, with each thread changing a specific index in that array, would not cause any corruption issues. It would be arranged like this:

mutex1 locked
mutex2 locked
array [False, False, False]

display update thread at mutex1 (locked)

Threads 1–3 collect sensor data and then each thread updates LVGL. Once finished updating LVGL, a thread sets its flag in the array to True. Each thread checks whether all of the flags are True, and if they are, mutex1 is unlocked.

Once the display update thread has finished, it locks mutex1 and unlocks mutex2. Each of the three threads then sets its flag to False in the array, and the last thread to set its flag to False also locks mutex2. It goes around and around like that.

Synchronous threading is what you would call it, I guess.

The above way of doing it would be pretty fast too.

In Python it looks like this; in C it would be pretty similar in layout, except structures and functions would be used instead of classes and methods.

import threading

# All worker threads; each worker appends itself in __init__ so the
# "are all flags set?" check below can iterate over every worker.
threads = []

# Protects the check-and-release of the display thread's lock so that
# only one worker per round can wake the display thread.
flag_guard = threading.Lock()


class ThreadWorker(threading.Thread):

    def __init__(self, lock1):
        threading.Thread.__init__(self)
        self.lvgl_update_func = None
        self.sensor_func = None
        self.flag = False                  # True once this worker's LVGL update is done
        self.lock1 = lock1                 # the display thread's lock (mutex1)
        self.lock = threading.Lock()       # this worker's own lock (the mutex2 role)
        threads.append(self)

    def start(self, sensor_func, lvgl_update_func):
        self.sensor_func = sensor_func
        self.lvgl_update_func = lvgl_update_func
        threading.Thread.start(self)

    def run(self):
        while True:
            self.sensor_func()             # collect sensor data outside any lock
            self.lock.acquire()            # wait until the display thread lets us run
            self.lvgl_update_func()        # apply this worker's LVGL changes
            with flag_guard:
                self.flag = True           # mark this worker as finished for this round
                if all(t.flag for t in threads):
                    for t in threads:
                        t.flag = False     # reset the flags for the next round
                    self.lock1.release()   # last one to finish wakes the display thread


class DisplayUpdateThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)
        self.lock = threading.Lock()
        self.update_func = None
        self.lock.acquire()                # start locked; the last worker releases it

    def start(self, update_func):
        self.update_func = update_func
        threading.Thread.start(self)

    def run(self):
        while True:
            self.lock.acquire()            # blocks until every worker has finished
            self.update_func()             # refresh the display (e.g. lv_task_handler)
            for t in threads:
                t.lock.release()           # let each worker start its next round


def lv_update_func_1():
    pass

def lv_update_func_2():
    pass

def lv_update_func_3():
    pass


def sensor_update_func_1():
    pass

def sensor_update_func_2():
    pass

def sensor_update_func_3():
    pass


def display_update_func():
    pass


disp_thread = DisplayUpdateThread()
disp_thread.start(display_update_func)

t1 = ThreadWorker(disp_thread.lock)
t2 = ThreadWorker(disp_thread.lock)
t3 = ThreadWorker(disp_thread.lock)

t1.start(sensor_update_func_1, lv_update_func_1)
t2.start(sensor_update_func_2, lv_update_func_2)
t3.start(sensor_update_func_3, lv_update_func_3)

Hi @bobwolff68,

Sorry I am a bit late to the party here…

I have created two large projects with LVGL and FreeRTOS and my approach doesn’t require mutexes…

I create a FreeRTOS thread for LVGL which carries out the initialisation. I then create a global ‘GUI’ message queue using FreeRTOS to receive messages/events from other parts of the system to update the GUI at runtime. This global message queue is managed by an LVGL timer, which processes the queue and performs the work within the execution path of the LVGL thread only, which negates the need for a mutex. The timer is called periodically by LVGL and updates screens, widgets etc. on demand. I have found that in my systems a period of 10 ms usually gives good performance, but depending on your CPU and other system load you may need to change this. Finally, the thread goes on to call lv_task_handler() periodically as required for the system to function. Code snippet for the GUI task:

// These are initialised in main.c
	cpu0_globals->gui.msg_q = xQueueCreate( 128, sizeof( otg_sysmsg_t ) );								// Create GUI message queue
	cpu0_globals->sys_action_q = xQueueCreate( 128, sizeof( otg_sysmsg_t ) );							// Create System message queue


void gui_update_task(void *p) {

	// Initialise VGA Hardware
	set_vga_prams( VGA_1440X900_60HZ_CVTRA );
	// Initialise GUI
	lv_init();
	lv_theme_default_init(cpu0_globals->gui.disp, shmem_p->personality == OTG_IDU ? confp->sys.IDU_gui_colour : confp->sys.ODU_gui_colour,
	  get_theme_secondary_colour(), (((shmem_p->personality == OTG_IDU) ? confp->sys.IDU_style : confp->sys.ODU_style) ? 0 : 1), LV_FONT_DEFAULT);
	lv_disp_drv_init((lv_disp_drv_t*)&cpu0_globals->gui.disp_drv);
	lv_disp_draw_buf_init(&cpu0_globals->gui.disp_buf, (void*)LV_VBUF1_ADR, (void*)LV_VBUF2_ADR, (HOR_RES_MAX*VER_RES_MAX));
	cpu0_globals->gui.disp_drv.flush_cb = vga_disp_flush;
	cpu0_globals->gui.disp_drv.hor_res = HOR_RES_MAX;                 /*Set the horizontal resolution in pixels*/
	cpu0_globals->gui.disp_drv.ver_res = VER_RES_MAX;                 /*Set the vertical resolution in pixels*/
	cpu0_globals->gui.disp_drv.draw_buf = &cpu0_globals->gui.disp_buf;
	cpu0_globals->gui.disp_drv.full_refresh = pdFALSE;
	cpu0_globals->gui.disp_drv.direct_mode = pdTRUE;
	cpu0_globals->gui.disp = lv_disp_drv_register((lv_disp_drv_t*)&cpu0_globals->gui.disp_drv);
	lv_disp_set_bg_opa(NULL, LV_OPA_TRANSP);
	startup_gui_create();
	lv_timer_create((lv_timer_cb_t)process_msg_q, 10, NULL);	// Check for GUI thread messages every 10ms
	while(1) {
		lv_task_handler();
		vTaskDelay(pdMS_TO_TICKS(4));
	}
}

I also have a system manager thread, which consists of a second global FreeRTOS ‘System’ queue and a task which receives system functions triggered by GUI actions; these can be actions which require file-system access, network requests or whatever. By doing this the GUI can be kept responsive during various background tasks, by scheduling those other tasks at a lower priority. For example, if a configuration needs saving to flash, this may take a second or two. If you click a button in LVGL and call the code to save to flash directly in the LVGL event handler, it will block the GUI until the save returns. If you instead queue an event request, the GUI won’t block: the system manager will pull the event from the queue and execute the save to flash at a lower priority in the background, leaving the GUI responsive during the flash update, which is much more pleasing to the user.

Code snippet for the LVGL timer function that receives and processes requests to update parts of the GUI:

static void process_msg_q ( lv_timer_t *timer ) {

	char					*pmsg = NULL;
	otg_sysmsg_t			msg = { 0,  NULL, 0, 0, shmem_p->personality, 0, NULL };
	otg_eventdb_entry_t		*db_entry;

	if( ( xQueueReceive( cpu0_globals->gui.msg_q, &msg, 0 ) ) ) {
		if( msg.id < LOGMSG_ID_END ){
			if( (db_entry = get_log_msg_by_id( msg.id ) ) == NULL ) return;
			if( cpu0_globals->gui.screen.main.stup_ta != NULL && (db_entry->category == log_cat_none) ) {
				lv_textarea_add_text( cpu0_globals->gui.screen.main.stup_ta, db_entry->msg );	// All startup messages are sent raw to pseudo console
				lv_textarea_add_text( cpu0_globals->gui.screen.main.stup_ta, "\n" );
			} else {
				if( msg.extra_data != NULL ) {
					pmsg = strstr( msg.extra_data,"->" ); // This strips the date off the front of messages so we don't print it on screen
					if( pmsg != NULL ) {
						pmsg += 2;
					} else pmsg = msg.extra_data;
				}
				show_sys_message( pmsg );
			}
		} else {
			switch( msg.id ) {

				case LOAD_MAIN_GUI:
					lv_obj_del( cpu0_globals->gui.screen.main.stup_ta );
					cpu0_globals->gui.screen.main.stup_ta = NULL;
					lv_obj_del( cpu0_globals->gui.screen.main.bgrd );
					main_gui_create();
					break;

				case RESET_GUI_LOG:
					reset_gui_log();
					break;

				case REFRESH_GUI_LOG:
					refresh_gui_log();
					break;

				case UPDATE_GUI_SYS_SCR:
					sys_scrupdate();
					break;

				case UPDATE_GUI_SITE_SCR:
					site_estupdate();
					site_sysupdate();
					site_antupdate();
					break;

				case UPDATE_GUI_SAT_SCR:
					sat_scrupdate();
					break;

				case UPDATE_GUI_TRK_SCR:
					track_scrupdate();
					break;

				case UPDATE_GUI_SIM_SCR:
					sim_scrupdate();
					break;

				case UPDATE_GUI_THEME:
					update_gui_theme( pdTRUE );
					break;

				default:
					break;
			}
		}
		if( msg.free_extra & FREE_EXTRA ) vPortFree( msg.extra_data );
	}
}

Code snippet for system manager:

static void sysmansup( void *p ) {

	otg_sysmsg_t		msg = { 0, NULL, 0, 0, shmem_p->personality, log_src_sysman, NULL };
	otg_netcom_pkt_t	net_msg = { { 0, 0, 0, 0 }, NULL, NULL };

	while( 1 ) {
		if( xQueueReceive( cpu0_globals->sys_action_q, &msg, pdMS_TO_TICKS(111) ) ) {
			switch(msg.id) {

				case SAVE_CONFIG:
					if( !save_config( confp, log_src_sysman ) ) {
						msg.id = INF_SAVE_CONFIG_OK;
					} else {
						msg.id = ERR_SAVE_CONFIG;
					}
					break;

				case REM_SAVE_CONFIG:
					if( !save_config( confp, log_src_sysman ) ) {
						msg.id = INF_RSAVE_CONFIG_OK;
					} else {
						msg.id = ERR_RSAVE_CONFIG;
					}
					break;

				case SAVE_CONFIG_TLE:
					if( !save_config( confp, log_src_sysman ) ) {
						msg.id = INF_TLE_SAVE_CONFIG_OK;
					} else {
						msg.id = ERR_TLE_SAVE_CONFIG;
					}
					break;


				case CLR_LOG_REQ:
					msg.id = LOG_CLEAR_EVENT;
					q_event( &msg );		// Special case so we queue here.
					break;

				case SYS_REBOOT:
					msg.id = WARN_USR_REBOOT;
					q_event( &msg );
					vTaskDelay(pdMS_TO_TICKS(1000));
					msg.id = WARN_REBOOT_INPROG;
					msg.extra_data = NULL;
					msg.edat_size = 0;
					q_event( &msg );
					vTaskDelay(pdMS_TO_TICKS(4000));
					system_reboot();
					break;

				case ERASE_FP_FLASH:
					if( erase_factory_prams(log_src_sysman) ) {
						msg.id = INF_FP_FL_ERASE_OK;
					} else {
						msg.id = ERR_FP_FL_ERASE_FAIL;
					}
					break;

				case PROG_FP_FLASH:
					if( save_factory_prams(log_src_sysman) ) {
						msg.id = INF_FP_FL_PROG_OK;
					} else {
						msg.id = ERR_FP_FL_PROG_FAIL;
					}
					break;

				case LOAD_EVENT_LOG:
					load_event_log();
					break;

				case UPDATE_CONFIG_TLES:
					update_config_tles(log_src_sysman);
					break;

				case GET_SYSTEM_TEMPERATURE:
					shmem_p->sys_temp = get_sys_temp( log_src_sysman );					// Update System Temperature as per TEMP_POLL_COUNT
					break;

				default:
					break;
			}
			if( msg.id < LOGMSG_ID_END ) q_event( &msg );
		}
		if( xQueueReceive( cpu0_globals->event_q, &msg, 0 ) ) {
			process_event( &msg );
		}
	}
	cpu0_globals->spawn_stat &= ~SYSMD_RUN;
	vTaskDelete(NULL);
}

I have created a full port of LVGL and FreeRTOS here which uses the described methodology on the Xilinx Zynq platform, but the core parts discussed above could easily be ported to other platforms. If this is a useful approach for you to eliminate the use of mutexes (I, being quite old, tend to forget to add the calls! :slight_smile: ), and you want to discuss it further or have questions, please don’t hesitate to comment here and I will do my best to respond quickly.

Kind Regards,

Pete


Pete’s solution is actually the best way to go about it. If his approach can be made to share the workload between CPU cores, it would be a fantastic solution. I know that FreeRTOS does have mechanisms for working with multi-core processors; how that functions is above my pay grade.
