Heap use in MicroPython

I’m trying to debug a heap memory allocation failure.

Traceback (most recent call last):
File "<stdin>", line 29, in <module>
MemoryError: memory allocation failed, allocating 76 bytes

Here are 3 tests:

TEST 1: Baseline

import gc
import time
import urandom
import lvgl as lv
import ILI9341 as ili
import lvesp32

# Initialize the ILI9341 driver
# spihost:  1=HSPI 2=VSPI
# display has exclusive use of this SPI bus
disp = ili.display(spihost=2, miso=19, mosi=23, clk=18, cs=5, dc=21, rst=4, backlight=22, mhz=40, share=ili.EXCLUSIVE)
disp.init()

# Register display driver to LittlevGL
disp_buf1 = lv.disp_buf_t()
buf1_1 = bytearray(480*10)
lv.disp_buf_init(disp_buf1,buf1_1, None, len(buf1_1)//4)
disp_drv = lv.disp_drv_t()
lv.disp_drv_init(disp_drv)
disp_drv.buffer = disp_buf1
disp_drv.flush_cb = disp.flush
disp_drv.hor_res = 320  
disp_drv.ver_res = 240
disp_drv.rotated = 0
lv.disp_drv_register(disp_drv)  

while True:
    print('1',gc.mem_free())
    screen = lv.obj()
    print('2',gc.mem_free())
    l = [urandom.random(), urandom.random()]
    print('3',gc.mem_free(),'\n')
    time.sleep_ms(1)
    gc.collect()

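The gc.mem_free() output shows that the memory allocated by lv.obj() is never freed: free heap at the top of the loop drops by 96 bytes on every pass, even after gc.collect(). Eventually, allocation fails.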

1 99664
2 99568
3 99504

1 99568
2 99472
3 99408
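To put a number on the leak, here is a quick back-of-the-envelope check in plain Python using only the readings above. It assumes the loss per pass stays constant, which is an assumption; fragmentation could make the real failure happen sooner.

```python
# gc.mem_free() readings copied from the TEST 1 output above
first_pass = {"1": 99664, "2": 99568, "3": 99504}
second_pass = {"1": 99568, "2": 99472, "3": 99408}

# Heap consumed by lv.obj() within one pass
screen_cost = first_pass["1"] - first_pass["2"]
# Heap consumed by the two-element list within one pass
list_cost = first_pass["2"] - first_pass["3"]
# Heap still missing at the top of the next pass, after gc.collect()
leak_per_pass = first_pass["1"] - second_pass["1"]
# Rough upper bound on the remaining passes, assuming a constant leak
passes_left = second_pass["1"] // leak_per_pass

print(screen_cost, list_cost, leak_per_pass, passes_left)
```

The list is reclaimed by gc.collect(), but the bytes taken by lv.obj() are not, so the leak per pass equals the cost of one screen.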

TEST 2: comment out "screen = lv.obj()"

import gc
import time
import urandom
import lvgl as lv
import ILI9341 as ili
import lvesp32

# Initialize the ILI9341 driver
# spihost:  1=HSPI 2=VSPI
# display has exclusive use of this SPI bus
disp = ili.display(spihost=2, miso=19, mosi=23, clk=18, cs=5, dc=21, rst=4, backlight=22, mhz=40, share=ili.EXCLUSIVE)
disp.init()

# Register display driver to LittlevGL
disp_buf1 = lv.disp_buf_t()
buf1_1 = bytearray(480*10)
lv.disp_buf_init(disp_buf1,buf1_1, None, len(buf1_1)//4)
disp_drv = lv.disp_drv_t()
lv.disp_drv_init(disp_drv)
disp_drv.buffer = disp_buf1
disp_drv.flush_cb = disp.flush
disp_drv.hor_res = 320  
disp_drv.ver_res = 240
disp_drv.rotated = 0
lv.disp_drv_register(disp_drv)  

while True:
    print('1',gc.mem_free())
    #screen = lv.obj()
    print('2',gc.mem_free())
    l = [urandom.random(), urandom.random()]
    print('3',gc.mem_free(),'\n')
    time.sleep_ms(1)
    gc.collect()

The GC output shows that free heap returns to the same value on every pass.

1 104768
2 104768
3 104704

1 104768
2 104768
3 104704

TEST 3: comment out driver registration

import gc
import time
import urandom
import lvgl as lv
import ILI9341 as ili
import lvesp32

# Initialize the ILI9341 driver
# spihost:  1=HSPI 2=VSPI
# display has exclusive use of this SPI bus
disp = ili.display(spihost=2, miso=19, mosi=23, clk=18, cs=5, dc=21, rst=4, backlight=22, mhz=40, share=ili.EXCLUSIVE)
disp.init()

'''
# Register display driver to LittlevGL
disp_buf1 = lv.disp_buf_t()
buf1_1 = bytearray(480*10)
lv.disp_buf_init(disp_buf1,buf1_1, None, len(buf1_1)//4)
disp_drv = lv.disp_drv_t()
lv.disp_drv_init(disp_drv)
disp_drv.buffer = disp_buf1
disp_drv.flush_cb = disp.flush
disp_drv.hor_res = 320  
disp_drv.ver_res = 240
disp_drv.rotated = 0
lv.disp_drv_register(disp_drv)  
'''

while True:
    print('1',gc.mem_free())
    screen = lv.obj()
    print('2',gc.mem_free())
    l = [urandom.random(), urandom.random()]
    print('3',gc.mem_free(),'\n')
    time.sleep_ms(1)
    gc.collect()

The GC output shows that free heap returns to the same value on every pass.

1 110576
2 110560
3 110496

1 110576
2 110560
3 110496

Does anyone have an insight into this behavior? Perhaps I don’t have the driver registration correctly configured?

Are you using the latest version?
Could you use the micropython version of the ili9341 driver? I no longer maintain the C version.

When you create an lv.obj with no parameters, you are actually calling lv_obj_create(NULL,NULL)

In that case:

  • If the display is not initialized, lv_obj_create returns immediately with a warning. You didn’t register the log callback, so you didn’t see the warning. No object is created in that case and no RAM is allocated.
  • If the display is initialized, the new object is created as a screen of the default display by lv_ll_ins_head(&disp->scr_ll). Since the new object is referenced by the active display, it will not be collected by gc.

If you wish to delete a screen (objects with no parents) after it was created and assigned to a display, you can call screen.delete() which maps to lv_obj_del.
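The effect can be reproduced with a toy model in desktop CPython. This is only a sketch of the reference-holding behavior described above: Screen and Display are hypothetical stand-ins, not LittlevGL types, and weakref is used purely as a probe to see whether the object is still alive.

```python
import gc
import weakref


class Screen:
    """Stand-in for an lv.obj screen; not the real LittlevGL type."""


class Display:
    """Stand-in for a display that keeps its own list of screens."""

    def __init__(self):
        self.screens = []

    def add(self, screen):
        # Mirrors the idea of inserting at the head of the screen list
        self.screens.insert(0, screen)

    def remove(self, screen):
        self.screens.remove(screen)


display = Display()
screen = Screen()
display.add(screen)
probe = weakref.ref(screen)

# Drop the only reference held by user code
screen = None
gc.collect()
still_alive_while_registered = probe() is not None

# Once the display lets go of it, the object can be reclaimed
display.remove(probe())
gc.collect()
alive_after_removal = probe() is not None

print(still_alive_while_registered, alive_after_removal)
```

In the model, dropping the user reference is not enough; the object only goes away after the display releases it, which is the role screen.delete() plays for a real screen.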

@amirgon Thanks for digging into this problem and finding the root cause !

I didn’t know littlevgl/micropython worked this way. So, every screen gets referenced to the display (which never gets unallocated)…each added screen consumes heap space … until memory allocation fails.

I tried to use the delete() method as a workaround. Deleting screens solves the allocation issue, but I now see a consistent crash in my application code. I narrowed down the problem to the addition of lv.scr_load(screen). I made a simple test case to reproduce the problem, below. I suspect the lv_task_handler() tries to access memory related to the now deleted screen. No crash happens when lv.scr_load(screen) is removed.

I’ll start looking for a solution or workaround.

import gc
import time
import urandom
import lvgl as lv
import ILI9341 as ili
import lvesp32

# Initialize the ILI9341 driver
# spihost:  1=HSPI 2=VSPI
# display has exclusive use of this SPI bus
disp = ili.display(spihost=2, miso=19, mosi=23, clk=18, cs=5, dc=21, rst=4, backlight=22, mhz=40, share=ili.EXCLUSIVE)
disp.init()

# Register display driver to LittlevGL
disp_buf1 = lv.disp_buf_t()
buf1_1 = bytearray(480*10)
lv.disp_buf_init(disp_buf1,buf1_1, None, len(buf1_1)//4)
disp_drv = lv.disp_drv_t()
lv.disp_drv_init(disp_drv)
disp_drv.buffer = disp_buf1
disp_drv.flush_cb = disp.flush
disp_drv.hor_res = 320  
disp_drv.ver_res = 240
disp_drv.rotated = 0
lv.disp_drv_register(disp_drv)  

screen = None
while True:
    print('1',gc.mem_free())
    screen = lv.obj()
    lv.scr_load(screen)   
    print('2',gc.mem_free())
    l = [urandom.random(), urandom.random()]
    print('3',gc.mem_free(),'\n')
    print('before delete')
    screen.delete()
    print('after delete')
    gc.collect()
    time.sleep(1)

The result:

struct lv_disp_t
1 104576
2 104480
3 104416

before delete
0
after delete
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC      : 0x401ba68e  PS      : 0x00060b30  A0      : 0x8010c126  A1      : 0x3ffc26b0
A2      : 0xbbbbbbbf  A3      : 0x3ffc5040  A4      : 0xbbbbbbbb  A5      : 0x00000010
A6      : 0x00000001  A7      : 0x3ffc2890  A8      : 0x00000000  A9      : 0x0000013f
A10     : 0xffffbbbb  A11     : 0x00000001  A12     : 0x00000000  A13     : 0x3ffc2820
A14     : 0x00000001  A15     : 0x00004a82  SAR     : 0x00000006  EXCCAUSE: 0x0000001c
EXCVADDR: 0xbbbbbbbf  LBEG    : 0x400d2d1c  LEND    : 0x400d2d7a  LCOUNT  : 0x00000000

ELF file SHA256: 0000000000000000000000000000000000000000000000000000000000000000

Note: I compiled a build with the espidf binding, but the problem doesn’t seem driver related so I’m going to stick with the C driver for now. The espidf binding is quite interesting, but it uses 50k of flash space.

I don’t think you are supposed to delete an active screen.
I noticed in your test that you load the screen and then delete it. Please try to load a different screen before deleting the first one.

btw, when ESP32 crashes it also provides a very useful backtrace (stack trace). It’s provided in a raw format but can be very easily parsed using addr2line from the Xtensa toolchain. In a lot of cases such backtrace can give a very good clue where the problem is.
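As a rough sketch of that workflow, the helper below pulls the program-counter addresses out of a backtrace line and builds an addr2line command. Assumptions: the line uses the usual "Backtrace: PC:SP PC:SP ..." layout, the tool is on your PATH as xtensa-esp32-elf-addr2line, and firmware.elf is a placeholder for whatever ELF your build produced. In the sample line, the first pair reuses the PC and A1 values from the register dump earlier in this thread; the second pair is made up for illustration.

```python
import re

# Matches "PC:SP" pairs; only the PC half is captured
PAIR = re.compile(r"(0x[0-9a-fA-F]{8}):0x[0-9a-fA-F]{8}")


def addr2line_command(backtrace_line, elf_path,
                      tool="xtensa-esp32-elf-addr2line"):
    """Return an argv list that resolves each PC in the backtrace line."""
    pcs = PAIR.findall(backtrace_line)
    # -p pretty-print, -f function names, -i inlined frames,
    # -a show addresses, -C demangle, -e ELF file to read symbols from
    return [tool, "-pfiaC", "-e", elf_path] + pcs


# Illustrative input only; the second pair is not from a real crash
sample = "Backtrace: 0x401ba68e:0x3ffc26b0 0x4010c123:0x3ffc26d0"
command = addr2line_command(sample, "firmware.elf")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run on the development machine, or the printed line can be pasted into a shell.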

With some effort, the size of the espidf binding could be reduced significantly. For now I have just naively thrown lots of APIs into it.

Anyway, if you stick to the C driver and fix/improve anything in it - I would be interested in that (a pull request could be great!). I’m not using it any more but others might still be. It has several issues related to performance, DMA, interoperability with other devices on the same SPI bus etc.

Correct. We should probably just log an error in that case instead of blindly deleting it and letting crashes happen later.

I really like this idea. It’s not clear to a user that this would lead to a crash.

Would it be possible to have a new method or option to flag the screen for delete, and then delete it later when it’s safe to do so?

I read a bit more of the documentation and found a clear warning about deleting screens. But, perhaps this warning could be bolder and mention that the application will crash?
https://docs.littlevgl.com/en/html/overview/display.html

Screens can be deleted with lv_obj_del(scr) , but ensure that you do not delete the currently loaded screen.

I’ll try to change the design to delete a screen after the next screen becomes active.
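One way to get "delete it later, when it is safe" at the application level is a small wrapper that remembers which screen it loaded last and deletes it only after the next one has been loaded. This is a sketch, not part of the binding: ScreenSwitcher is a hypothetical name, and the lv module is passed in as a parameter so the class only relies on lv.scr_load() and the screen's delete() method, both mentioned in this thread.

```python
class ScreenSwitcher:
    """Loads screens and deletes the previous one once it is inactive."""

    def __init__(self, lv):
        self._lv = lv          # the lvgl module (or a stand-in for tests)
        self._current = None   # last screen loaded through this object

    def show(self, new_screen):
        previous = self._current
        # Make the new screen active first...
        self._lv.scr_load(new_screen)
        self._current = new_screen
        # ...then delete the old one, which is no longer the active screen
        if previous is not None and previous is not new_screen:
            previous.delete()
```

Usage would look like switcher = ScreenSwitcher(lv) once at startup, then switcher.show(lv.obj()) whenever a new screen is built. The screen that was active before the switcher was created is not tracked and would need to be deleted separately.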

Thanks for this tip. That will be really useful.

I think the way I’m using LittlevGL is at the core of the memory allocation problems I encounter. All of the screens in my current project have fixed layouts; only the screen data changes. In the present design, the application creates a new screen each time the screen changes. Knowing that screen memory is not garbage collected, I’ll investigate changing the design approach: create all the screens on startup and then activate the appropriate screen when it is needed (rather than creating a new screen each time).

Is that a better LittlevGL “design pattern”?

In that case, why don’t you just change the information that the objects are displaying instead of recreating a whole new screen?

I think that @kisvegabor and @embeddedt (and possibly others) have much more experience than I do for commenting on LittlevGL design patterns, but I can think of a few pros and cons for working this way or the other:

When creating all the screens in advance -

  • Consumes more RAM because all screens need to be allocated simultaneously
  • Performance will be better when switching screens.
  • Predictable - Most RAM is allocated at the beginning so you don’t have to worry about running out of memory in the middle.
  • Simpler design, you don’t have to take care of deleting screens

When creating screens only when needed -

  • Consumes less RAM in total because only one screen is allocated simultaneously.
    However, repeatedly allocating and freeing objects could increase heap fragmentation with MicroPython’s gc. I’m not sure heap fragmentation is a real problem here, because you would be allocating and deallocating objects of the same size, so the gc might be able to “fill up the holes”. It’s worth experimenting with this a little.
  • Depending on the complexity of your screens, performance might lag when switching screens.
  • It’s less predictable how much RAM is needed and whether you could run out of RAM in the middle.
  • You need to take care of deallocating screens.
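For comparison, the "create everything in advance" option could be sketched roughly like this. ScreenRegistry and the builder functions are hypothetical names; the lv module is passed in as a parameter, and the only binding call used is lv.scr_load().

```python
class ScreenRegistry:
    """Builds every screen once at startup and switches between them."""

    def __init__(self, lv, builders):
        self._lv = lv
        # Call each builder exactly once, so all RAM is allocated up front
        self._screens = {name: build() for name, build in builders.items()}

    def show(self, name):
        # No allocation and no deletion here, just a switch
        self._lv.scr_load(self._screens[name])

    def get(self, name):
        # Gives access to a screen so its child objects can be updated
        return self._screens[name]
```

With hypothetical builders build_home and build_sound that each return a populated lv.obj(), startup would be registry = ScreenRegistry(lv, {"home": build_home, "sound": build_sound}), and a switch is registry.show("sound"). The trade-off is exactly the one listed above: more RAM held at all times, but nothing to delete and no allocation while switching.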

When working with micropython, if RAM is an issue, you might want to consider freezing your modules or at least some of them (usually those that don’t change much).
Frozen modules use less RAM and also load faster.

I’m not sure we’re quite on the same page here. For a given active screen, the program updates the appropriate objects when information changes (as you indicated). The memory leak issue comes up when screens are changed. When the program moves to a new screen, that screen is created. The previous screen never gets garbage collected. I’m wondering if a better approach is to create all screens at the start of the program and to switch to a given screen when it is needed.

Hopefully it’s now clearer what I’m asking …

Thanks for reflecting on this problem. Looking at the pros and cons, I think creating screens in advance offers the least risk. Memory leaks are showstoppers for an always-on, memory constrained embedded device. I guess anything that allocates memory must be created at the start, not just screens. I’ll likely put some debugging code into the micropython alloc routines to get a better view on what leads to memory allocation in the micropython heap.

Definitely. I use a manifest to freeze every module that is imported externally. And, finally, when coding is done I freeze the mainline code as well – main.py has one line "import " (where the main program is frozen in flash).

You should be able to switch to a different screen with lv.scr_load and then delete the previous screen (which is now just an ordinary object with no attachment to the display). That should ensure that the previous screen is garbage collected. If garbage collection doesn’t occur given those steps, that may be a bug with the MicroPython binding.

These suggestions were golden. The memory leak issue is solved. Whenever a new screen needs to be loaded, the program first gets a reference to the active screen, loads the new screen, and finally deletes the previous screen. With this approach the heap used by the previous screen becomes free. And, the program no longer crashes after deleting a screen.

Here is a code snippet:

prev_screen = lv.scr_act()
lv.scr_load(sound_screen)        
prev_screen.delete()

Thank you both !
