I think buffering may still be very useful. Currently the single-threaded Python interpreter seems to throttle the multi-threaded httpd; using buffers may help relax this, as Python would not be blocked during the long transmission time.
httpd websocket support was added to espidf in March 2020 … grumble …
Any plans to update espidf?
I believe this is dependent on upstream, as our fork of MicroPython mostly adds LVGL-related changes.
Apparently someone opened a PR to update to ESP-IDF 4.1 a few months ago, but there hasn't been any movement on it: https://github.com/micropython/micropython/pull/6413
If you want ESP-IDF updated quickly, you may be on your own, as upstream has historically released updates only sparsely. It sounds like they are planning some updates to the ESP32 port in the next release, but that's not scheduled for release till April.
Maybe it's possible to just include the updated httpd for a start … although it's a super ugly solution. But before doing anything like that I need to prepare something clean enough to release.
So the basic functionality (GET and POST) of the http_server is working now, and I don't need the espidf patched anymore. Things are starting to look nicer, and I was curious how the httpd performs under high lvgl graphics load and how the "core_id" parameter in the httpd config influences this … but …
The httpd works fine under negligible load on the lvgl side. But once I switch to the "Chart" demo page, the httpd stops working. What stops is related to the scheduler: httpd-internal error messages still work flawlessly, and debug info tells me that mp_sched_schedule is being called even under high load. But in that case the scheduled function is never called.
It's my understanding that lvgl processing itself uses the same mechanism, and it is obviously still working, as the chart graphic is still animated nicely. But my scheduled function is never called, and reducing the load doesn't make the previously scheduled function run. It's lost. Might lvesp32's attempts to call the scheduler itself somehow overwrite my own attempts? But why then only under high load …
Further investigation shows that the call to mp_sched_schedule() returns false, which in turn means that the scheduler queue is full.
Sounds like some job is filling the scheduler queue. Might this be the screen update? If so, would it make sense to limit that so at most one of these jobs is pending? They seem to get lost frequently anyway.
Successfully tried this in lvesp32 as a quick hack, and it indeed makes my callback work, with no visible negative effect on the Chart demo. Maybe something similar would make sense.
static bool schedule_in_progress = false;

STATIC mp_obj_t mp_lv_task_handler(mp_obj_t arg)
{
    lv_task_handler();
    schedule_in_progress = false;
    return mp_const_none;
}
STATIC MP_DEFINE_CONST_FUN_OBJ_1(mp_lv_task_handler_obj, mp_lv_task_handler);

static void vTimerCallback(TimerHandle_t pxTimer)
{
    lv_tick_inc(portTICK_RATE_MS);
    if (schedule_in_progress)
        return;
    schedule_in_progress = true;
    mp_sched_schedule((mp_obj_t)&mp_lv_task_handler_obj, mp_const_none);
}
Edit … ok, this is not perfect, and everything blocks after some time. But it's at least the right direction.
IMO it would be cleanest for the next lv_task_handler call to be scheduled inside mp_lv_task_handler, though this would result in all the idle CPU time being taken by lv_task_handler. Right now it gets scheduled at a fixed rate, which results in missed calls like you've noted.
Maybe. But I still wonder why my approach hangs after a while. I changed from the bool to a proper binary semaphore and also caught the case where mp_sched_schedule fails (which actually doesn't happen). Still everything locks up after a while. It feels like there are cases where lv_task_handler() never gets called or never returns … if that's the case, then your approach would also hang.
Edit: A little more debugging shows that there seem to be rare cases where something has successfully been scheduled (mp_sched_schedule returned true) but the scheduled function is still never called …
Edit^2: In that locked case mp_sched_num_pending() constantly returns 1. So the function definitely is pending. But the scheduler never tries to run it.
The following solution works. But it's rather ugly and may still cause trouble if a third job is being scheduled; then there's once more not enough room in the queue.
static void vTimerCallback(TimerHandle_t pxTimer)
{
    lv_tick_inc(portTICK_RATE_MS);
    // never try to use the last free seat ...
    if (mp_sched_num_pending() >= MICROPY_SCHEDULER_DEPTH - 1)
        return;
    mp_sched_schedule((mp_obj_t)&mp_lv_task_handler_obj, mp_const_none);
}
Is there, by chance, a way to check whether &mp_lv_task_handler_obj is pending? Then you could simply not schedule a new one until the previous one is no longer pending.
I don't know. But I assume that this would still expose my problem, as I see that there is a job waiting in the scheduler when I get into the locked state. I am pretty sure it's the lvgl job being stuck. The question is: why doesn't the scheduler run it, and why does scheduling another job cause the stuck one to be run as well?
Maybe this happens when an exception is thrown.
This can happen if lv_task_handler calls some callback which raises an exception without catching it. In that case I believe lv_task_handler won't return.
A general question is whether LVGL always keeps its state consistent when a callback function doesn't return. This could also happen in other bindings, such as C++.
Another option is that the MicroPython thread is blocked, but I doubt this is the case, because I would expect the scheduler to be blocked too, and apparently it's not.
Are you using the _thread module? Maybe one thread is blocked and others still run?
I'm not sure catching Python exceptions in C code is the best idea. But catching exceptions in Python is straightforward, so here are some ideas:
- Try this with lv_async. Since ILI9341 imports lvesp32 (unfortunately), you need to call lvesp32.deinit() after initializing the display and only then call lv_async(). In a current project I'm using uasyncio with this technique. The advantage is that you don't need to rely on the MicroPython scheduler. I think that ili9xxx should not import lvesp32, but changing that now would break backward compatibility for anyone assuming ili9xxx imports lvesp32, so maybe it's better to do this change only in the next major release.
- Replace lvesp32 with a Python implementation that uses a timer, as done with stm32. It might be possible to call lv_task_handler directly, although I'm not sure. The docs, at least, warn that the callback might be called in interrupt context, so we would still need to call schedule in order to call lv_task_handler. On the other hand, this scheduling is (apparently) not needed for stm32, so maybe we can get away with that on esp32 as well.
- Replace lvesp32 with a Python implementation that uses a FreeRTOS timer, like lvesp32 does. That would require exposing xTimerCreate on espidf, with callback conventions etc.
- Another thought: if the MicroPython thread's (FreeRTOS task) priority is higher than the httpd priority, MicroPython could block httpd indefinitely, since FreeRTOS uses strict priority. If you don't want to change thread priorities, a simple thread wait (esp.task_delay_ms) could give the lower-priority thread an opportunity to run.
- This problem reminds me of the issues we had with lvesp32 + bluetooth. Increasing the timer period there seemed to help to some extent.
It does not. The expectation is that control will always flow back through the call chain until lv_task_handler returns, since this is how C works (unless there's a crash, obviously).
That is a problem now that you mention it. It means that throwing exceptions out of an event handler without catching them will lead to a hang. Is there a way for the binding to detect this?
It's possible, of course, at a price.
It would mean wrapping every callback with exception-handling code, which costs both program memory and cycles.
When doing that, it's also not clear what the callback return value should be in case of an exception.
Other options are:
- By convention, require anyone writing a callback to catch exceptions.
- Change LVGL assumptions regarding callbacks.
This problem is not limited to MicroPython; it's relevant to any binding that can throw exceptions, such as C++.
This is more easily solvable: we could adopt a convention of having callbacks return 0, NULL, or nothing (depending on their normal return type) as a default or error state.
Wouldn't this be the same cost as handling it within the binding itself?
Nevertheless, I think this is the best option, as I don't see an easy way to make LVGL handle this case. The assumption in a standard C program is that control passes in and out of the function at some point. I think preventing that from being an issue would significantly complicate LVGL's event loop.
I have this problem with the advanced demo, and the only change over the official MicroPython lvgl version is the attempt in modlvesp32 to keep the scheduler from overflowing. I will redo the entire setup with a fresh download, but IMHO there are no exceptions or the like involved.
I still think it's worth trying with uasyncio and lv_async, where lv_task_handler can be called directly without scheduling.
I just restarted with an entirely fresh setup, both to find this lockup problem and to make sure that my current http version runs with an unpatched espidf. And guess what? No locks so far … dunno what I did previously.
The following is from the modlvesp32 I am now using. IMO it really makes sense to do it that way. With the previous version I'd expect other schedule attempts to fail as well. The same schedule mechanism is used for interrupt handling, right? You should then see lots of lost interrupts in high graphics load situations. I really think this should be fixed.
static SemaphoreHandle_t schedule_in_progress;

STATIC mp_obj_t mp_lv_task_handler(mp_obj_t arg)
{
    lv_task_handler();
    xSemaphoreGive(schedule_in_progress);
    return mp_const_none;
}
STATIC MP_DEFINE_CONST_FUN_OBJ_1(mp_lv_task_handler_obj, mp_lv_task_handler);

static void vTimerCallback(TimerHandle_t pxTimer)
{
    lv_tick_inc(portTICK_RATE_MS);
    if (!xSemaphoreTake(schedule_in_progress, 0))
        return;
    if (!mp_sched_schedule((mp_obj_t)&mp_lv_task_handler_obj, mp_const_none))
        xSemaphoreGive(schedule_in_progress);
}
STATIC mp_obj_t mp_init_lvesp32(void)
{
    if (xTimer)
        return mp_const_none;

    lv_init();

    // create a binary semaphore to make sure only one callback is
    // being scheduled at a time
    schedule_in_progress = xSemaphoreCreateBinary();
    xSemaphoreGive(schedule_in_progress);

    xTimer = xTimerCreate(
        "lvgl_timer",
        1,              // The timer period in ticks.
        pdTRUE,         // The timer auto-reloads when it expires.
        NULL,           // User data passed to the callback.
        vTimerCallback  // Callback function.
    );

    if (xTimer == NULL || xTimerStart(xTimer, 0) != pdPASS) {
        ESP_LOGE(TAG, "Failed creating or starting LVGL timer!");
    }
    return mp_const_none;
}
In general I agree that modlvesp32 should be fixed so that the scheduler queue does not overflow, but I'm not sure a blocking semaphore here is a good idea.
We should expect that in certain situations the previous call to lv_task_handler might not complete before it's time to schedule the next call. This can happen with high FPS and heavy rendering, but also when a user callback takes too long. On such occasions it's fine to skip the next call to lv_task_handler, and possibly lose a frame or two, but we should not skip the call to lv_tick_inc.
The problems I see with your suggestion are:
- You are blocking both lv_task_handler and lv_tick_inc.
- You are blocking a FreeRTOS timer, which is bad because it affects other unrelated timers and the FreeRTOS command queue in general.
A different approach could be to use a counter and simply skip calls to lv_task_handler if the previous one hasn't finished yet (or keep one or two calls "in flight").
The problem with that approach is that it breaks down once lv_task_handler is allowed not to return, due to an exception thrown in a callback, as discussed above.
In that case it might be worth catching MicroPython exceptions in C and decrementing the counter before propagating them further (a kind of "finally" block in C).