Define special types for interfaces

glueckm · November 28, 2023, 3:26pm

Hi,

I’m currently in the process of generating a wrapper for CPython.

Since I try to generate as much of the wrapper code as possible from the source code, it would be really helpful to use real types in the interface instead of “generic” types like void **.

For example,

void lv_animimg_set_src(lv_obj_t * img, const void * dsc[], size_t num);

How should the wrapper generator know what to expect from the dsc parameter?

If a new typedef were introduced

typedef void * lv_image_src_t;

and then used in the interface like this:

void lv_animimg_set_src(lv_obj_t * img, const lv_image_src_t dsc[], size_t num);

The binding generator would then have the chance to handle the different types of parameters in a better way.
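A minimal Python sketch of how a generator might benefit from such a typedef. The lookup table and the converter names are hypothetical and only serve as illustration; only lv_image_src_t comes from the proposal above.

```python
# Hypothetical lookup table a binding generator could keep:
# C type name -> name of the conversion routine to emit.
TYPE_HANDLERS = {
    "lv_obj_t *": "convert_lv_obj",
    "size_t": "convert_size",
    "lv_image_src_t": "convert_image_src",  # the proposed typedef
}


def pick_handler(c_type: str) -> str:
    """Return the converter for a C parameter type, or a generic fallback."""
    # Drop a 'const' qualifier and surrounding whitespace before the lookup.
    normalized = c_type.replace("const ", "").strip()
    return TYPE_HANDLERS.get(normalized, "convert_opaque_pointer")
```

With a plain const void * the generator can only fall back to the opaque handler, while the named type selects a specific one.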

Personally I also think that this gives a bit more clarity to the user as to what type of data is allowed, if special types are used consistently throughout the library.

What are your thoughts on this?

Martin

kisvegabor · November 29, 2023, 10:09am

Hi,

I’ll cc a few guys who are deeply into MicroPython.

@kdschlosser @matt.trentini @bdbarnett

kdschlosser · November 29, 2023, 1:49pm

Have a look at this CPython binding I wrote.

This one uses ctypes

and this one uses CFFI

The issues I ran into were keeping references to the different LVGL objects and knowing when to delete those references. The single biggest issue is keeping things Pythonic and dealing with things like casting done behind the scenes. I had not figured out how to handle getting an object from LVGL through functions like lv_obj_get_child and determining what the object instance is, so that I could return the correct object to the user. Because of how FFI works (ctypes uses FFI), the returned object for some reason ends up having a different memory location than the original. The only way I could come up with to identify things is to add an ID to the objects; that ID could then be used to return the correct Python instance for the object. I know I could construct a new object if I wanted to, but the issue there is that equality testing would not work properly.

glueckm · November 29, 2023, 1:58pm

Hi,
thanks for the links, I will definitely look at them.
I have implemented a richcompare function that compares the addresses of the underlying C objects to check for equality.
This will work for a == b, but not for a is b.
So you still get a new Python wrapper from lv_obj_get_child, but at least you can compare it with a previous object to see if they are the same object.
And since no information other than the C object is stored in the Python wrapper, I do not see a problem with that approach.
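As a rough pure-Python analogue of that richcompare approach (the real implementation is a C extension type; the class name and the addresses below are made up for illustration):

```python
class LvObjWrapper:
    """Thin Python wrapper that only stores the address of a C object."""

    def __init__(self, address: int):
        self.address = address

    def __eq__(self, other):
        # Two wrappers are equal when they point at the same C object.
        if not isinstance(other, LvObjWrapper):
            return NotImplemented
        return self.address == other.address

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.address)


first = LvObjWrapper(0x2000_1000)
second = LvObjWrapper(0x2000_1000)  # a fresh wrapper for the same C object
```

Here first == second holds, while first is second does not, which matches the behaviour described above.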

But as I said, I will take a look at your approach.

Thanks,
Martin

kdschlosser · November 29, 2023, 2:02pm

I had also brought this exact same thing up for buffer objects. That is mainly where you see the use of void *. There really is no way of knowing what uint8_t * blah; is in a structure. It could be an array or it could be a pointer to a single integer. An array can only be specified if the length is given, so that wouldn’t work for a variable-length array. My thought was to add types like so:

typedef struct {
    uint8_t *buf;
    uint32_t len;
} uint8_buffer_t;

typedef struct {
    void *buf;
    uint32_t len;
} void_buffer_t;

void * is hard to deal with in CPython because it can be anything. So long as you have the [ ]'s you at least know that it is an array of pointers. The real hangup is with the structures. If the fields had a type like what is seen above, then it would be easy to determine what the field really is. Defining a class for void with a method to cast to another object type is easy to provide, but not knowing whether you are dealing with an array or with a simple pointer is what causes problems.
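To illustrate why a pointer-plus-length type is easier for a binding, here is a ctypes sketch that mirrors the proposed uint8_buffer_t. The Python class and helper names are assumptions for this example and are not part of LVGL or any existing binding.

```python
import ctypes


class Uint8Buffer(ctypes.Structure):
    """ctypes mirror of the proposed uint8_buffer_t (pointer plus length)."""

    _fields_ = [
        ("buf", ctypes.POINTER(ctypes.c_uint8)),
        ("len", ctypes.c_uint32),
    ]


def buffer_from_bytes(data: bytes):
    """Build a Uint8Buffer; also return the backing array to keep it alive."""
    backing = (ctypes.c_uint8 * len(data)).from_buffer_copy(data)
    pointer = ctypes.cast(backing, ctypes.POINTER(ctypes.c_uint8))
    return Uint8Buffer(pointer, len(data)), backing


def buffer_to_bytes(struct: Uint8Buffer) -> bytes:
    """The length travels with the pointer, so the copy is unambiguous."""
    return bytes(struct.buf[:struct.len])
```

With a bare uint8_t * field the second helper could not be written, because nothing says how many elements to read.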

kdschlosser · November 29, 2023, 2:09pm

My intention was to bury all of the funky ctypes or CFFI syntax so the user would not have to deal with it. The syntax really makes no sense to someone who doesn’t know C. An example is:

uint8_array = (ctypes.c_uint8 * 10)()

It gets even more complicated when you have to cast and need to know what to cast to:

ctypes.cast(obj, ctypes.POINTER(some_ctype)).contents

Having to do that with an object that is not a pointer:

ctypes.cast(ctypes.addressof(obj), ctypes.POINTER(some_ctype)).contents

This sometimes works and sometimes doesn’t, because addressof doesn’t always point to the correct memory location and id needs to be used instead:

ctypes.cast(id(obj), ctypes.POINTER(some_ctype)).contents
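One possible way to bury that syntax is a pair of small helpers. This is only a sketch: the helper names are invented here, and it covers just the addressof form, which requires a ctypes instance as input.

```python
import ctypes


def new_array(ctype, length):
    """Hide the '(ctype * n)()' spelling behind a plain function call."""
    return (ctype * length)()


def view_as(obj, ctype):
    """Reinterpret the memory of a ctypes instance as another ctypes type.

    Uses ctypes.addressof, so 'obj' must be a ctypes instance. The caller
    must keep 'obj' alive for as long as the returned view is used.
    """
    return ctypes.cast(ctypes.addressof(obj), ctypes.POINTER(ctype)).contents
```

A user could then write new_array(ctypes.c_uint8, 10) or view_as(value, ctypes.c_uint8 * 4) without touching the cast syntax directly.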

glueckm · November 29, 2023, 2:33pm

My approach was to generate real classes as C extension types and not use ctypes/CFFI at all.
The problem is that many functions of LVGL are inline functions, and for them CFFI does not work.

I have started experimenting with the Buffer Protocol.
For a first test, using the Buffer Protocol on a lv_image_dsc_t, I was able to create an array of one image dsc, one path to an image on the file system, and one string used as a symbol, and use that array for the lv_animimg_set_src function; that worked fine.
The problem is that a unicode string in Python does not support the Buffer Protocol, so one has to use the bytes type for that:
b"A:assets/image001.png"

For "storage" I would use the built-in bytearray type:

buffer = bytearray(100)  # create a buffer with a size of 100 bytes
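The difference between str and the bytes-like types can be shown in a few lines. The helper below is a hypothetical convenience and not part of any binding; it assumes UTF-8 is an acceptable encoding for paths and symbols.

```python
def as_buffer(value) -> memoryview:
    """Return a memoryview over 'value', encoding str to UTF-8 bytes first."""
    if isinstance(value, str):
        # str does not expose the buffer protocol, so convert explicitly.
        value = value.encode("utf-8")
    return memoryview(value)
```

Calling memoryview() directly on a str raises TypeError, whereas bytes, bytearray and array.array are accepted as they are. A wrapper that did such a conversion internally would let users pass ordinary strings.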

Using real extension types I was able to almost recreate the MicroPython interface, which I think would be really helpful if one could reuse the code from MicroPython exactly.

kdschlosser · November 29, 2023, 4:12pm

You can use an array.array, bytes, or bytearray when needing to pass a buffer. If you need to pass a const string then it should be passed as an array.array… You can also pass memoryview objects if you are concerned about memory use.

I got around the inline problem by having the script that generates the code create C functions that get compiled to expose those functions. I prefixed the function names with py_. Simple solution.

Are you using pycparser to read the code or are you using clang?
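A sketch of what such a generator step could look like. The function make_wrapper and the inline function name lv_example_inline are hypothetical; only the py_ prefix comes from the description above.

```python
def make_wrapper(return_type: str, name: str, params: list[tuple[str, str]]) -> str:
    """Emit C source for a non-inline function that forwards to an inline one."""
    signature = ", ".join(f"{ctype} {pname}" for ctype, pname in params) or "void"
    arguments = ", ".join(pname for _, pname in params)
    call = f"{name}({arguments});"
    if return_type != "void":
        call = "return " + call
    return f"{return_type} py_{name}({signature})\n{{\n    {call}\n}}\n"


# 'lv_example_inline' is a made-up name standing in for a real inline function.
source = make_wrapper("int32_t", "lv_example_inline", [("const lv_obj_t *", "obj")])
```

The generated text is an ordinary C function, so it ends up as a real symbol in the compiled library that ctypes or CFFI can look up.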

kdschlosser · November 29, 2023, 4:14pm

Someone before has tried going the C extension route. They ran into an issue with memory leaks. I don’t know anything more about it; I never looked at what they had done. You should, however, generate a stub file so code completion and IntelliSense will work in an IDE.

glueckm · November 29, 2023, 6:26pm

Yeah, the reference counting could be an issue here, especially with the callbacks.
As long as everything is done via Python interaction it is not that difficult.
But as soon as you, for example, add a callback via Python and later on it is removed via C, it will be difficult to get the reference counting right.
Right now I’m trying to figure out when exactly LV_EVENT_DELETE is sent and if all the event bindings are still present at that time.
If the event bindings are still available, then it should not be a problem to get the reference counting for the objects right.

Another thing is the use of the user_data.
Right now the user data is used as a bridge between the C world and the Python world.

For example, when we attach an event callback to LV_EVENT_CLICKED, we need to use a special C function which uses the user_data as the real callback.
This way we can handle the reference counting during the delete and decrease it for Python callbacks.
But how do we know if the user_data of an event_cb is the reference to the Python object or a user_data from a plain C event binding done by some internals of LVGL?

Another option would be to use the address of the underlying C object as a key into a global dictionary and store all the stuff in there.
In this case we could also store the Python wrapper object in this dict and so always return the exact same instance → then a is b would also work as expected.

But the problem with that is that for animations, for example, this approach does not work, because when you start an animation, a copy of the animation is created and added to the list of running animations.
And because the link between the Python instances/callbacks would be based on the address of the C object, the link will be broken if a copy is made.
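The address-keyed dictionary, and the way a copy escapes it, can be sketched in plain Python. All names and addresses here are invented for illustration; a real wrapper would get the addresses from the C side.

```python
class Wrapper:
    def __init__(self, address: int):
        self.address = address
        self.callbacks = []  # Python-side state tied to this C object


_instances = {}  # C object address -> Wrapper


def wrapper_for(address: int) -> Wrapper:
    """Return the one Wrapper registered for this address, creating it once."""
    if address not in _instances:
        _instances[address] = Wrapper(address)
    return _instances[address]


original = wrapper_for(0x1000)
original.callbacks.append(print)

same = wrapper_for(0x1000)    # same address -> identical instance
copied = wrapper_for(0x2000)  # a C-side copy lives at a new address
```

Looking up the original address returns the identical instance, so a is b works. The copy at the new address gets a fresh wrapper with no callbacks attached, which is the broken link described above.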

When I started with this I was playing around with the idea of adding an additional field to all the structs for holding a reference to the extra stuff needed for the Python wrapper. But I abandoned that because I thought that when the MicroPython bindings do not need this, it must be possible for CPython as well.

Not so easy after all… And any input is more than welcome.

glueckm · November 29, 2023, 6:33pm

Yeah I know, it’s just that, for example, when specifying a filename or a URL it feels unpythonic to write:

image.set_src (b"A:image.png")

and not

image.set_src ("A:image.png")

I started with pycparser. Right now I’m not sure how to continue. I’m currently playing with the idea of using the JSON file generated by the MicroPython binding generator, just to make sure the interfaces are as compatible as possible.

Therefore it may make sense to split the MicroPython binding generator into two parts:

The first one generates the JSON file.
The second one could then take the JSON file and generate the MicroPython bindings, or the CPython bindings, or a C++ binding, maybe one for Rust…
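As a rough idea of what the second part could look like, here is a sketch that turns a JSON description into stub lines. The JSON layout is entirely hypothetical and is not the format the MicroPython generator actually produces.

```python
import json

# Hypothetical JSON layout; the real generator output may look different.
API_JSON = """
{
  "functions": [
    {"name": "lv_animimg_set_src",
     "args": [{"name": "img", "type": "lv_obj_t"},
              {"name": "dsc", "type": "list"},
              {"name": "num", "type": "int"}],
     "return": "None"}
  ]
}
"""


def emit_stub(api: dict) -> str:
    """Turn the parsed description into lines of a .pyi-style stub."""
    lines = []
    for func in api["functions"]:
        args = ", ".join(f"{a['name']}: {a['type']}" for a in func["args"])
        lines.append(f"def {func['name']}({args}) -> {func['return']}: ...")
    return "\n".join(lines)


stub = emit_stub(json.loads(API_JSON))
```

Other back ends (C++, Rust, and so on) would read the same parsed dictionary and only differ in the emit step.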

kdschlosser · November 29, 2023, 6:42pm

The really hard part is collecting the correct object from, say, an event. This is because the object attached to an event is an lv_obj_t, which can be any widget. You have to know what the object actually is in order to provide the correct Python representation for it.

kdschlosser · November 29, 2023, 6:43pm

There is not enough information to be able to properly use the JSON output. It was easier to learn pycparser than it was to try and figure out the gen_mpy script.

kdschlosser · November 29, 2023, 6:46pm

You are hitting the same roadblocks I did. There is no really easy way to do this. It is easier to accomplish in MicroPython than it is in CPython. It is almost easier to hard-code the entire thing than to have it generated.

You can also use something like SWIG to generate the code.

glueckm · November 29, 2023, 6:51pm

I think I solved that part by looking at how the MicroPython bindings did it:
I created a dict mapping the address of the lv_obj_class to the address of a function which creates the Python object.
That works well actually.
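In Python terms the lookup could look roughly like this. The wrapper classes and the class addresses are made up; in a real binding the keys would be the addresses of the C class descriptors.

```python
class Obj:
    def __init__(self, address):
        self.address = address


class Button(Obj):
    pass


class Label(Obj):
    pass


# Made-up addresses standing in for the C class descriptors.
BUTTON_CLASS_ADDR = 0x0800_0100
LABEL_CLASS_ADDR = 0x0800_0200

_factories = {BUTTON_CLASS_ADDR: Button, LABEL_CLASS_ADDR: Label}


def wrap(obj_address: int, class_address: int) -> Obj:
    """Pick the wrapper type from the object's class pointer."""
    factory = _factories.get(class_address, Obj)  # unknown class -> base wrapper
    return factory(obj_address)
```

An event handler that only receives a generic object pointer can then read its class pointer and hand the right wrapper type back to the user.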

glueckm · November 29, 2023, 6:53pm

As I stated, it may be possible to enhance the JSON generator to add the information currently missing, so that we don’t reinvent the parser for all the bindings we have in the future…

glueckm · November 29, 2023, 7:01pm

I just did a quick search through the code, and the only place I found in the current master code base is the animation, where a new animation is created.
And the problem with that copy is that within the lv_anim_start function the callback get_value_cb is already used.
So the wrapper does not have a chance of fixing anything before any of the callbacks are used.

@kisvegabor
Is there any possibility to add an additional callback to the animation which will be called once the copy is created but before the get_value_cb is used?

kdschlosser · November 29, 2023, 8:16pm

Why is there a need to do this? The reason it is done this way is so that an animation can be initialized inside of a function. The original structure is able to go out of scope and get GC’d. Because the animation was copied, it is able to keep on running without any issues.

glueckm · November 29, 2023, 8:27pm

The “get GC’d” part is the problem. Where do we store the Python callbacks? Currently I do it in the user_data field. Now the user_data field is shared by two animations, but the wrapper framework has no chance of increasing the ref count for the wrappers.
So when the original animation now goes out of scope, it would decrease the ref count of the callbacks, and possibly the ref count goes to 0 and the callback gets deleted, which will cause a crash.

kdschlosser · November 29, 2023, 9:48pm

MicroPython stores a dictionary in the user_data fields. IDK how it is stored structure-wise; I am thinking something along the lines of the object and the Python function being what is used. You would basically create a C function for the callback, and that is what gets set as the callback function in the structure or passed to a function call.

The MicroPython binding keeps all of the inner workings hidden, so a user is not able to directly access the fields in the obj_t structure.