Standalone binary .mpy module for lvgl (so we can "just use it" in micropython)

kdschlosser · November 27, 2023, 12:21am

I got it to compile as a user c module for Unix. YAY!!!

EDIT

I just ran a quick test on the Unix port. I am pleased to say that it is functional so I am now pretty hopeful about the ESP32 port working properly.

amirgon · November 27, 2023, 6:50am

@kisvegabor it’s a long thread and I’m not following. Could you summarize the question?

kdschlosser · November 27, 2023, 6:56am

@amirgon

It was about where in the micropython build system the changes were made to get it to compile with LVGL for unix.

I already figured it out and got it working, sorry to bother you.

matt.trentini · November 27, 2023, 11:03am

Hello folks, I’ve scheduled a zoom meeting (hopefully that’s ok with you all?) for noon CET/10pm AEDT tomorrow, Tuesday the 28th November.

Look forward to chatting with you in ~24 hours!

kisvegabor · November 27, 2023, 11:56am

Thank you Matt. I’ve sent a Google Calendar invite to be sure we are on the same page.

bdbarnett · November 27, 2023, 5:18pm

I have had my head in the sand for the last 3 weeks developing a driver and architecture that do most of these things, including compiling as a USER_C_MODULE, using DMA, and being non-blocking. I currently only have it implemented for ESP32 targets, but the C code for ESP32-specific calls is separated from the common code, so adding other targets should be straightforward. Most of the plumbing is essentially the same as @kdschlosser’s implementation because they are both based on ESP_LCD in ESP-IDF. The difference is that my approach is essentially a replacement for framebuf.FrameBuffer, like @andrewleech mentioned, that will compile in bare Micropython yet exposes functions for allocating DMA buffers, blitting to the display, and registering a callback function to be run when the blit is complete, so it will work with LVGL or any other graphics library that can call blit. The buffers can be full screen (resources permitting) or partial. These drivers remove lv_micropython’s dependency on ESP-IDF! I haven’t tried compiling as an .mpy yet, but I don’t see why that wouldn’t work since the drivers compile as a USER_C_MODULE now.

I wrote a helper Devices class in python that makes it easy to attach non-LVGL drivers for display, touch and rotary encoders to be used with LVGL. This is awesome because we can use ANY Micropython touchscreen driver that has a function to read the touch coordinates (which is all touchscreen drivers). Likewise with rotary encoders–we need to know the function that reads the position and the function that reads the button pin value and that’s it. Now we’ve got a lot bigger developer base writing drivers because they don’t have to be LVGL specific! I think the Devices class, or some variation of it, may help in standardizing an API and facilitate generic naming like @gitcnd mentioned. Here’s an example of a complete display_driver.py (or whatever you would name it, using that name for historical purposes and because most lvmp examples use that name to load the drivers):

from lvmp_devices import Devices
from mpdisplay import allocate_buffer
from display_config import display_drv
from i2c_config import touch_drv  # Add ', rtc'  or any other I2C devices on the same bus
from encoder_config import enc1_funcs, enc2_funcs

devices = Devices(
    display_drv = display_drv,
    bgr = True, # for LVGL 8 only.  LVGL 9 removed lv.COLOR_FORMAT.NATIVE_REVERSE
    factor = 10,
    blit_func = display_drv.blit,
    alloc_buf_func = allocate_buffer,
    reg_ready_cb_func = display_drv.register_cb,
    touch_read_func = touch_drv.get_positions,
    touch_rotation = 5,
    enc_funcs = [enc1_funcs, enc2_funcs],
    )

What’s more, if you have a second display and touchscreen, you just do the same thing, like:

devices2 = Devices(...

Usage could be either

import display_driver

or better yet

from display_driver import devices
devices.list() # not required, lists all the devices and functions setup by Devices

Since Devices takes care of lv.init and starting the eventloop if that hasn’t already been done, the above display_driver.py is complete. Since the class is created in python rather than C, other hardware devices that aren’t LVGL specific can be tacked on to the devices object, which is what I do with the external RTC on one of my displays since the driver for it is loaded at the same time as the touchscreen driver in i2c_config.py.

devices.rtc = rtc
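
To make the idea concrete, here is a minimal sketch of how a helper like this could wrap arbitrary driver functions. It is not the actual lvmp_devices code: the class name DevicesSketch, the methods read_touch, read_encoder and attached, the assumption that a touch function returns an (x, y) tuple or None, and the fake drivers are all hypothetical and only illustrate the "any callable will do" approach described above.

```python
class DevicesSketch:
    """Hypothetical stand-in for a Devices-style helper (not the real class)."""

    def __init__(self, touch_read_func=None, enc_funcs=()):
        # Any callable that returns (x, y), or None when not touched, is accepted.
        self._touch_read = touch_read_func
        # Each encoder is a (read_position, read_button) pair of callables.
        self._encoders = list(enc_funcs)

    def read_touch(self):
        """Return (pressed, x, y) in a library-neutral form."""
        point = self._touch_read() if self._touch_read else None
        if point is None:
            return (False, 0, 0)
        x, y = point
        return (True, x, y)

    def read_encoder(self, index):
        """Return (position, button_pressed) for one encoder."""
        read_position, read_button = self._encoders[index]
        return (read_position(), bool(read_button()))

    def attached(self):
        """Names of extra, non-graphics devices tacked onto the object."""
        return sorted(name for name in vars(self) if not name.startswith("_"))


# Fake drivers standing in for real touch and encoder drivers.
def fake_touch():
    return (120, 45)


devices = DevicesSketch(
    touch_read_func=fake_touch,
    enc_funcs=[(lambda: 7, lambda: 0)],
)
devices.rtc = "rtc placeholder"  # extra hardware attached at runtime

print(devices.read_touch())
print(devices.read_encoder(0))
print(devices.attached())
```

Because the helper only stores callables, the touch or encoder driver itself never needs to know which graphics library ends up polling it.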

Please take a look at the Display Driver announcement.

kdschlosser · November 27, 2023, 6:48pm

If people could use the same name they use in the forum here for the zoom it would make it easier to know who is who. Or use your real name and in parentheses your forum name.

I am an old goat and can’t remember names so it would make it easier for me that way. It might help others as well.

bdbarnett · November 30, 2023, 11:07pm

Sorry for taking a couple of days to get to this. @kisvegabor asked us to summarize what we got out of our meeting Tuesday. I’ve been dragging my feet about it, hoping someone else might summarize so I wouldn’t need to.

@kisvegabor, @andrewleech, @matt.trentini, Patrick Joy, Robin Reiter and I met Tuesday to discuss the items in this thread. We didn’t get far into the weeds during that initial meeting, but it looks like all our goals are the same: to make getting started with LVGL on Micropython as easy as possible for the end-user. We don’t necessarily agree on how to get there, but we’re still all on the same team and have plans to meet again. I am very grateful for having the opportunity to be a part of the discussion and to meet with several of the people I’ve seen in both Micropython and LVGL forums. You guys have helped create two fantastic projects that I spend a great deal of my time with. It’s OK, my wife already knows.

@andrewleech has ideas to implement non-blocking IO functionality for peripherals and has posted an RFC about it here. If Micropython had this functionality built in, any driver, regardless of whether it was display based or not, could see significant improvements in speed without the driver having to implement its own bus. For instance, the way Micropython is now, when a driver sends out some data through the SPI bus, execution is stalled, or blocked, from doing anything else until that transaction has completed. The drivers @amirgon created as well as the drivers @kisvegabor and I are working on now work around this by implementing the SPI calls in C rather than using Micropython’s classes and functions. Having non-blocking IO exposed to the end-user in Micropython is pretty huge.

@kisvegabor makes a great argument that LVGL for Micropython will be easier for users to get started if it brings its own drivers with it. I can’t argue with that. Having the drivers baked in means one less thing the user has to download and compile in. He also agrees with @andrewleech’s proposal for Micropython to handle the low level IO and LVGL to provide the drivers that use that IO.

My idea is to have a display driver for Micropython that doesn’t care which graphics library uses it, whether that’s LVGL or something else. My goals to develop mpdisplay haven’t changed much. My primary use case for mpdisplay will continue to be LVGL, but, as has been the case from the start, mpdisplay will work with other libraries as well. This will create an alternative to the drivers LVGL for Micropython brings with it, which I don’t think is a bad thing at all.

@kdschlosser wasn’t able to make the meeting. (If my information is correct, the meeting was at 3:00am his time!) He is working on a project that excites us all, which is making the lv_binding_micropython compile as a user C module. He has already given me some fantastic tips on making mpdisplay better and I hope to stay engaged with him and the lv_micropython community.

matt.trentini · November 30, 2023, 11:42pm

Thanks @bdbarnett, it was on @andrewleech and me to summarise - I’d spoken with Andrew but I’d dropped the ball in actually writing up the summary (had a couple of busy nights!). Thanks for putting that together, here are some additional notes that I was going to tidy up a little, but raw seems fine on the back of your post:

lv_bindings
- Remove dependency on ESP-IDF
- Separate out drivers (which force the dependency)
  - Add a layer of abstraction
Aim to include LVGL as a submodule in MicroPython (libs dir)
- Start with implementing as a C User Module (proof of concept)
- Motivation: Allow LVGL to be built - the same way - for all ports/boards
  - Reduce maintenance burden on LVGL team
  - Easier for new (and experienced!) users
- Medium-term: Aim to provide native module (binary mpy) support for the LVGL submodule
Document and provide an example to use the LVGL ↔ driver interface
- Build some example drivers (ILI934x etc)
- Work towards separate interface between display driver code and bus driver
  - To allow drivers to be used for different bus implementations (and across ports)
- Over time, build more mature bus (SPI/I2C) drivers including DMA support

On a personal level, it was great that everyone was positive about a collaboration and open to listening. @bdbarnett and @kdschlosser have made excellent progress, we just need to build on it!

We aim to have a follow-up meeting in two weeks.

kdschlosser · December 1, 2023, 5:29am

Meeting was at 4:00 AM my time. I didn’t sleep the night before and I made it to about 3:00 before I nodded off. Wife woke me up at 3:45 but I was not having it at that point. Sorry about that guys. I have a medical condition that messes with my sleep patterns, and it just happened to be one of those times. It’s not by my choice and it is the reason why I am retired.

bdbarnett:

@andrewleech has ideas to implement non-blocking IO functionality for peripherals and has posted an RFC about it here. If Micropython had this functionality built in, any driver, regardless of whether it was display based or not, could see significant improvements in speed without the driver having to implement its own bus. For instance, the way Micropython is now, when a driver sends out some data through the SPI bus, execution is stalled, or blocked, from doing anything else until that transaction has completed. The drivers @amirgon created as well as the drivers @kisvegabor and I are working on now work around this by implementing the SPI calls in C rather than using Micropython’s classes and functions. Having non-blocking IO exposed to the end-user in Micropython is pretty huge.

There is already non blocking IO being done in the current binding. The current binding only has SPI so it only works for that.

I am not understanding how you will not need bus drivers. How are you going to control how the data gets sent? That doesn’t make any sense to me at all. You must define the transport mechanism in order to be able to communicate with the display IC, unless you are going to write several display drivers for the same display, one for each of the bus types the driver IC supports. That is simply way too much work and also way too much repetitive code across the different driver ICs.

You MUST deal with MicroPython’s way of working in order to expose the parts and pieces to the user. That is the way it works. No way around that. The user has to be able to instruct the driver as to what pins they have the display attached to, so you MUST use the MicroPython API to get that to work.

Several of the SDKs for the different MCUs already have display features coded into them. Writing your own software-based solution is not only going to be slower, it is also going to be a lot more code to have to maintain. Using what is built into the SDKs and writing a small layer in between that allows it to be accessed from MicroPython is the best way. Far less headache this way.

If you take a look at the binding I have already got up and running you will see there is non blocking IO for I2C, SPI, I8080 and RGB. Not all boards support DSI or other forms of display busses. Those are the ones that are pretty much able to be supported across the majority of the boards.

asyncio is blocking. It has to stop the execution of user code to check whether there is anything coming in or going out, and also to receive or send the data. It is not going to run in parallel with user code. It uses tasks and switches between tasks. The problem here is that you are sending large amounts of data. You cannot stop doing that to switch to user code to do something. There is no blocking IO with how everything currently works. The program is not sitting there waiting for data or waiting to send data. Blocking means it is sitting there waiting, as in idle and not doing anything; transmitting data is being active, not waiting. So currently there is no IO blocking occurring, and because there is no IO blocking occurring asyncio will do nothing to help. Asyncio manages idle time by being able to do something during that time. There should be no idle time on an MCU if the user’s code is running properly. It loops until the end of time, actively doing something the entire time.

You cannot bake drivers into it. The entire point of using Python is rapid development, and having to compile over and over again defeats the sole purpose of using Python. It is a runtime language and as such the drivers need to load at runtime, not compile time. You also have the problem of it getting too fat and not being able to do OTA updates of the firmware. It is already on the edge of not being able to do it. If all kinds of extra stuff get baked into the firmware it is not going to be able to be flashed as an OTA update. There isn’t an unlimited amount of storage on these MCUs and you have to remember that Micropython already takes up a large amount of space, and adding LVGL to it pushes it into the 2.5 MB area. It’s already getting to be a chunky monkey, so keep that in mind when deciding to compile things into the firmware or freeze anything into the firmware. You need to leave a considerable amount of space for the user to be able to drop their files onto as well. It needs to fit and also run a user’s application. SD card readers to store images are not always an option for storage, and python source files will not load from an SD card reader.

kdschlosser · December 1, 2023, 5:45am

I am currently thinking about how to trim things down so OTA updates can work without issue. There is always a tradeoff when you do something like this, and that tradeoff would be more memory use. If you have 8 MB of flash to work with and you want to do OTA updates, that is going to leave 3 MB for user code, including any binaries like images: 2.5 MB for each of the two OTA partitions at a minimum. Images take up a lot of space. It’s what they do; 3 MB doesn’t go very far. I am thinking that if we can get LVGL to compile as a C module that is not baked into the firmware, this will shrink the footprint of the firmware. Take this as an example.

Let’s say LVGL takes up 1 MB of space in the firmware. The compiled firmware currently takes up 2.5 MB; without LVGL it takes up 1.5 MB. So now the partitions get set up as firmware1 = 1.5 MB, firmware2 = 1.5 MB and user code = 5 MB. 1 MB of the user code space will be used up by the LVGL C extension, so you end up with 4 MB of user code space, 1 MB more than when baking LVGL into the firmware.

That only holds true when doing OTA-based updates of the firmware. If not, there is less of an issue, but there can still be an issue if the firmware gets too fat. If you have a board that has 4 MB of flash you are going to be left with 1.5 MB for user code. Again, that space doesn’t go very far, as we all know.
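
The arithmetic above can be checked with a few lines of Python. This is only a rough sketch using the round numbers from this post; the function name user_space_mb is made up for illustration, and it ignores the bootloader, partition table, NVS and filesystem overhead that a real flash layout would also need.

```python
def user_space_mb(flash_mb, firmware_mb, firmware_slots, external_module_mb=0.0):
    """Flash left for user files after firmware slots and any loadable module."""
    return flash_mb - firmware_mb * firmware_slots - external_module_mb


# 8 MB flash, LVGL baked into the firmware, two 2.5 MB OTA slots.
baked_in = user_space_mb(8, 2.5, 2)

# 8 MB flash, LVGL shipped as a separate ~1 MB module, two 1.5 MB OTA slots.
separate = user_space_mb(8, 1.5, 2, external_module_mb=1.0)

# 4 MB flash, no OTA (one firmware slot), LVGL baked in.
small_board = user_space_mb(4, 2.5, 1)

print(baked_in, separate, small_board)  # 3.0 4.0 1.5
```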

bdbarnett · December 1, 2023, 5:54am

Hey @kdschlosser, I can relate to not getting any sleep. Except my only excuse is the best time for me to code is after the wife is in bed. The meeting was 5:00am my time, and I only made it because I was excited about the invitation. Hope to catch you on the next one.

Don’t take my summary as being 100% correct. You can see more about what @andrewleech is discussing in the link to his RFC. He isn’t discussing non-blocking functionality for LVGL. You and I are already doing that, but in C code. He is proposing for it to be included in Micropython, not just for display io, but all io, removing the need to implement them in our own drivers. He can explain it way better than I can, so I’ll stop talking about it.

Likewise, I probably misunderstood what @kisvegabor was saying simply because I don’t understand yet. I am working on compiling the new lv_binding_micropython you’re working on. I probably will understand better when I reach that point.

I’m regretting posting my summary now. It was not my intention to stir the pot. I was just trying to keep the dialog moving. @matt.trentini’s summary is much better than mine.

kdschlosser · December 1, 2023, 7:06am

@bdbarnett

No NO it’s good that you did post it. It is how you understood things. So there might need to be some more discussion on those things so everyone gets a better understanding of what is happening.

There is, technically speaking, zero blocking IO taking place at any point in time with regards to MCUs and micropython. Nothing sits there and waits unless it is specifically told to sit there and wait. If a blocking call is to be made then there is a function to check if anything has been received. Blocking IO only takes place when code execution stops because something is being waited for. Transmitting data is not blocking IO because it is not waiting.

asyncio, which is what was mentioned in that article you linked to, is what is intended to be used. asyncio only has a benefit if the user code execution stops because it is waiting for something to happen. It uses that time to go and do something else. If the user code is actively sending data then guess what? Nothing else is going to be able to be done. Think of asyncio as micromanaging the time when no work is being performed. The thing is, MCUs and their SDKs are designed in a manner that this would only occur if the user explicitly causes it, either unintentionally or intentionally.
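
The point about cooperative task switching can be seen in a small sketch. It runs on desktop Python's asyncio rather than on an MCU, and the task names busy_sender and blinker are made up for illustration. A task that works without ever awaiting keeps the event loop to itself until it finishes, so the other task only runs afterwards.

```python
import asyncio

events = []


async def busy_sender():
    # Stand-in for CPU-driven transmission: no await inside, so it never yields.
    events.append("send start")
    for _ in range(3):
        sum(range(10_000))
    events.append("send done")


async def blinker():
    for _ in range(2):
        events.append("blink")
        await asyncio.sleep(0)  # explicit yield point back to the event loop


async def main():
    await asyncio.gather(busy_sender(), blinker())


asyncio.run(main())
print(events)  # ['send start', 'send done', 'blink', 'blink']
```

The picture changes only if the sending work itself has points where it waits on something, for example a hardware "transfer complete" event.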

The current code in LVGL has ZERO idle time. I do not believe there is anything in LVGL that causes any kind of a wait or a spinning wheel that would be considered a really long time.

bdbarnett · December 1, 2023, 8:38am

@kdschlosser
Thank you for your encouragement and understanding!

I’m probably using the wrong terminology, so I’ll use an excerpt from the .blit method in my code to try to explain it:

    esp_lcd_panel_draw_bitmap(self->panel_handle, x, y, x + w, y + h, buf);
    // if ready_cb_func wasn't registered, wait for esp_lcd_panel_draw_bitmap to complete.
    // esp_lcd_panel_draw_bitmap calls lcd_panel_done when complete, which sets lcd_panel_active = False
    // and calls ready_cb_func if it is registered.  SEE cb_isr COMMENTS BELOW!!!!
    if (self->ready_cb_func == mp_const_none) { 
        while (lcd_panel_active) {
        }
    }
    return mp_const_none;

In this case, if a callback is registered, like display.flush_cb, then the blit method can return immediately and let Python code continue to execute, which in turn lets LVGL continue rendering. The callback will be executed by lcd_panel_done, or bus_trans_done_cb, or whatever on_color_trans_done points to in the panel io config. In your code, there is always a callback, so it is never blocking. Since I’m trying to keep my code graphics library independent, if a callback isn’t registered, it has to block (not return immediately) until the flag lcd_panel_active is set to false by the lcd_panel_done callback. That prevents it from returning to Micropython code and the graphics library from trying to use the buffer before esp_lcd_panel_draw_bitmap is finished with it. In lv_micropython, I’ll always register a callback on the ESP32 architecture and any other like it that has something equivalent to on_color_trans_done. If the architecture doesn’t have something equivalent, or if the library doesn’t have a callback to call, I’ve got to make it block in a while loop. I could set a flag in C and check for that flag periodically before calling display.flush_cb, but that’s still depending on functionality written in C.

As I understand it, SPI writes using Micropython calls (not C) block until they are finished. They don’t use a while loop like my C code does, but the result is the same: Micropython code execution doesn’t continue with the next line until the write is completed. I think what @andrewleech is discussing is making it so that Micropython’s SPI writes have the option to return immediately, and there being some sort of mechanism the caller can check periodically to see if the write has completed. That way the caller (in python code) can keep doing work while the write is in progress, and then take control of the buffer when the write is completed. This would give drivers (including but not limited to display drivers) written all in Python a significant speed boost. That’s not a benefit for you and me the way we currently have our drivers written on the ESP32 architecture using ESP_LCD, but it would be super beneficial to have one mechanism that worked across several or all architectures. Again, my understanding may be completely off. I’m still learning, but I’m a quick learner for an old fart.
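
A rough sketch of that "return immediately, check later" pattern follows. It is not a MicroPython API: the FakeBus class and its write_nonblocking and done methods are hypothetical, and a desktop Python thread stands in for the DMA hardware that would move the data in the background.

```python
import threading
import time


class FakeBus:
    """Hypothetical bus object; a thread plays the role of DMA hardware."""

    def __init__(self):
        self._done = threading.Event()
        self._done.set()  # idle until a write is started
        self.sent = []

    def write_nonblocking(self, buf):
        # Start the transfer and return to the caller straight away.
        self._done.clear()
        threading.Thread(target=self._transfer, args=(buf,)).start()

    def _transfer(self, buf):
        time.sleep(0.05)             # pretend the bus is busy shifting data out
        self.sent.append(bytes(buf))  # the buffer is read during the transfer
        self._done.set()

    def done(self):
        return self._done.is_set()


bus = FakeBus()
buf = bytearray(b"\x01\x02\x03")
bus.write_nonblocking(buf)

work = 0
while not bus.done():
    work += 1          # the caller keeps doing useful work in the meantime
    time.sleep(0.001)

buf[0] = 0xFF          # only now is it safe to reuse the buffer
print(work > 0, bus.sent)
```

The caller must leave the buffer alone until done() reports completion, which is the same ownership rule the C drivers enforce with their callback or while loop.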

andrewleech · December 1, 2023, 9:22am

My micropython discussion post is basically about planning how to add a background / dma / non-blocking mode to port drivers like SPI in micropython, in a way that can be implemented in all ports and exposed in both C and python.

I know there are C bus drivers like yours that have been written for esp32 etc. I want to use that as example code and bring that into micropython officially so it doesn’t need to be rewritten as separate modules for all ports.

In the discussion piece I include talk about asyncio because the same underlying background IO functionality can be used to provide asyncio access to peripheral functions, so I want to meet both use cases with the one API. Lots of micropython users want asyncio access to peripherals, so this also gains more interest. PS asyncio is about a lot more than just managing idle time; its biggest benefit is managing background / concurrent functionality and making it easy to use in application code.
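
As a rough illustration of what asyncio access to a background transfer could look like from application code, here is a sketch for desktop Python. Nothing here is the proposed API: background_write and application are made-up names, and asyncio.sleep stands in for waiting on a hardware "transfer complete" event.

```python
import asyncio


async def background_write(log, nbytes):
    log.append("write queued")
    # Placeholder for awaiting a hardware completion event; the CPU is free here.
    await asyncio.sleep(0.2)
    log.append("write complete")
    return nbytes


async def application(log):
    for step in range(3):
        log.append(f"app step {step}")
        await asyncio.sleep(0.001)


async def main():
    log = []
    written, _ = await asyncio.gather(background_write(log, 512), application(log))
    return written, log


written, log = asyncio.run(main())
print(written, log)
```

Because the write only waits, rather than pushing bytes with the CPU, the application task gets to run while the transfer is in flight.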

kdschlosser · December 1, 2023, 9:50am

OK now that makes sense. I get what you are wanting to do now.

The problem is that MicroPython only has support for I2C and SPI. There is nothing for RGB or I8080 for any MCU, so you would need to have those added to micropython as well. This has been requested many times and it has not been done. CAN for the esp32 has been requested, the code has been written, and there has been a PR for it for a long time (years). New PRs have been made to update the code with each MicroPython release, but it never gets added.

The reason why it doesn’t get added is because of the API differences between the different board manufacturers and CAN not being implemented across the board for all boards. The same thing would hold true with display busses. It will not get added because not all boards support it, and putting together a common API between the ones that do may not be possible. I don’t know enough about any of the other boards to know if we can put together a common API across them to make this work. At this point in the game it would be ideal if someone that knows STM32 could jump in and see if we can come up with something.

If we are not able to come up with some kind of a common API between the different busses for the different boards, this may end up not being able to be done. That is something that needs to be checked into now. The question of getting LVGL to compile as a user C module has been answered: it can be done and it does work. That was step 1. Step 2 is getting an API hammered out.

kdschlosser · December 1, 2023, 10:03am

When you want to send data over SPI there has to be a call made to micropython code. It is in that code that the board-specific SDK functions get called to transfer the data.

This is the function that gets called when you want to send data over SPI for an esp32


STATIC void machine_hw_spi_transfer(mp_obj_base_t *self_in, size_t len, const uint8_t *src, uint8_t *dest) {
    machine_hw_spi_obj_t *self = MP_OBJ_TO_PTR(self_in);

    if (self->state == MACHINE_HW_SPI_STATE_DEINIT) {
        mp_raise_msg(&mp_type_OSError, MP_ERROR_TEXT("transfer on deinitialized SPI"));
        return;
    }

    // Round to nearest whole set of bits
    int bits_to_send = len * 8 / self->bits * self->bits;

    if (!bits_to_send) {
        mp_raise_ValueError(MP_ERROR_TEXT("buffer too short"));
    }

    if (len <= 4) {
        spi_transaction_t transaction = { 0 };

        if (src != NULL) {
            memcpy(&transaction.tx_data, src, len);
        }

        transaction.flags = SPI_TRANS_USE_TXDATA | SPI_TRANS_USE_RXDATA;
        transaction.length = bits_to_send;
        spi_device_transmit(self->spi, &transaction);

        if (dest != NULL) {
            memcpy(dest, &transaction.rx_data, len);
        }
    } else {
        int offset = 0;
        int bits_remaining = bits_to_send;
        int optimum_word_size = 8 * self->bits / gcd(8, self->bits);
        int max_transaction_bits = MP_HW_SPI_MAX_XFER_BITS / optimum_word_size * optimum_word_size;
        spi_transaction_t *transaction, *result, transactions[2];
        int i = 0;

        spi_device_acquire_bus(self->spi, portMAX_DELAY);

        while (bits_remaining) {
            transaction = transactions + i++ % 2;
            memset(transaction, 0, sizeof(spi_transaction_t));

            transaction->length =
                bits_remaining > max_transaction_bits ? max_transaction_bits : bits_remaining;

            if (src != NULL) {
                transaction->tx_buffer = src + offset;
            }
            if (dest != NULL) {
                transaction->rx_buffer = dest + offset;
            }

            spi_device_queue_trans(self->spi, transaction, portMAX_DELAY);
            bits_remaining -= transaction->length;

            if (offset > 0) {
                // wait for previously queued transaction
                MP_THREAD_GIL_EXIT();
                spi_device_get_trans_result(self->spi, &result, portMAX_DELAY);
                MP_THREAD_GIL_ENTER();
            }

            // doesn't need ceil(); loop ends when bits_remaining is 0
            offset += transaction->length / 8;
        }

        // wait for last transaction
        MP_THREAD_GIL_EXIT();
        spi_device_get_trans_result(self->spi, &result, portMAX_DELAY);
        MP_THREAD_GIL_ENTER();
        spi_device_release_bus(self->spi);
    }
}

This function call

            spi_device_queue_trans(self->spi, transaction, portMAX_DELAY);

is the function in the ESP-IDF that sends the data. The sending of the data is not handled by MicroPython. It is handled by the ESP-IDF SDK code.
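
To make the sizing logic in that function easier to follow, here is a Python sketch of just the arithmetic: rounding to whole words, then splitting a long transfer into transactions. The name plan_transfer is made up, and the default max_xfer_bits of 4092 * 8 is an assumption about MP_HW_SPI_MAX_XFER_BITS, so check the port’s source for the real value. No hardware calls are modelled.

```python
from math import gcd


def plan_transfer(length_bytes, bits_per_word=8, max_xfer_bits=4092 * 8):
    """Return the list of transaction lengths (in bits) for one transfer."""
    # Round down to a whole number of words, as the C code does.
    bits_to_send = length_bytes * 8 // bits_per_word * bits_per_word
    if not bits_to_send:
        raise ValueError("buffer too short")

    if length_bytes <= 4:
        # Short transfers go out as a single transaction.
        return [bits_to_send]

    # Keep every transaction aligned to both bytes and words.
    optimum_word_size = 8 * bits_per_word // gcd(8, bits_per_word)
    max_transaction_bits = max_xfer_bits // optimum_word_size * optimum_word_size

    chunks = []
    bits_remaining = bits_to_send
    while bits_remaining:
        chunk = min(bits_remaining, max_transaction_bits)
        chunks.append(chunk)
        bits_remaining -= chunk
    return chunks


print(plan_transfer(10000))  # [32736, 32736, 14528]
print(plan_transfer(3))      # [24]
```

In the C code each of these chunks becomes one spi_device_queue_trans call, with the previous transaction’s result collected while the next one is already queued.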

kdschlosser · December 1, 2023, 10:13am

bdbarnett:

    esp_lcd_panel_draw_bitmap(self->panel_handle, x, y, x + w, y + h, buf);
    // if ready_cb_func wasn't registered, wait for esp_lcd_panel_draw_bitmap to complete.
    // esp_lcd_panel_draw_bitmap calls lcd_panel_done when complete, which sets lcd_panel_active = False
    // and calls ready_cb_func if it is registered.  SEE cb_isr COMMENTS BELOW!!!!
    if (self->ready_cb_func == mp_const_none) { 
        while (lcd_panel_active) {
        }
    }
    return mp_const_none;

In that code example it is pointless to loop like that if DMA has not been used for the frame buffers. That is because the call to the underlying SDK function for sending the data is not going to return until all of the data has been sent. You would want to loop like that if DMA has been set and no callback function has been given for when the transfer has completed. If the user has the flush ready function being called in their flush function, then as soon as the code above exits and returns to the flush function, LVGL will be told that it is OK to write to the buffer when the data could still be sending. So the need to stall would have to be there if there is no callback supplied. That would keep the buffer data from getting corrupted due to LVGL writing to the buffer while it is still sending. This would only be needed if DMA memory is used and no callback is provided. If it is a non-DMA buffer there is no issue, and if a callback has been provided there is no issue.

I actually have my code written so that a callback should always be supplied for the transfer done event. If the buffer is not DMA, the callback gets called once the buffer has emptied, before the function returns. The flush ready function in LVGL gets called from that callback function and not from the flush function. Everything is handled in C code to keep the API the same across the different busses regardless of the kind of memory the buffer has been placed into.
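
The rule described above (stall only when the buffer is in DMA memory and no transfer-done callback was supplied) fits in a one-line predicate. The function name must_stall is made up for illustration; this is a summary of the reasoning in this post, not code from either driver.

```python
def must_stall(buffer_is_dma, has_done_callback):
    """True when the blit call has to wait for the transfer before returning."""
    # A non-DMA send only returns once the data is out, and a callback
    # reports completion by itself, so neither case needs a wait loop.
    return buffer_is_dma and not has_done_callback


table = {
    (dma, callback): must_stall(dma, callback)
    for dma in (False, True)
    for callback in (False, True)
}

for (dma, callback), stall in table.items():
    print(f"dma={dma!s:5} callback={callback!s:5} stall={stall}")
```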

kdschlosser · December 1, 2023, 10:30am

I corrected the code so that loop will function properly. I had started to add it to the RGB bus. Dunno why, because I have no control over what memory that uses, but I can check for a user supplied callback and, if there is none, stall the RGB function from returning until the transfer is flagged as completed.

kisvegabor · December 1, 2023, 10:39am

Thank you for the summaries guys. I would just like to highlight one point: even though LVGL will have built-in display controller drivers, the user might or might not want to use them. If their device is supported, great: they will have a tested, integrated, C-speed driver out of the box. Otherwise they can add a Python driver provided by the vendor or someone else.

I’m sorry, but I probably won’t be able to follow this discussion very deeply. Please tag me if there is anything that requires my attention.