Adding esp_http_server.h to the generator

Till_Harbaum · February 11, 2021, 8:38pm

The malloc is definitely related to this. This runs forever:

esp_err_t http_server_internal_handler(httpd_req_t *req) {
  printf("req\n");
  httpd_resp_sendstr(req, "<h1>Micropython test</h1>");
  return ESP_OK;
}

This crashes after some time or triggers those weird python error messages:

esp_err_t http_server_internal_handler(httpd_req_t *req) {
  printf("req\n");
  httpd_resp_sendstr(req, "<h1>Micropython test</h1>");

  // Wrap req in a MicroPython object (allocates on the MP heap)
  void *p = NEW_PTR_OBJ(httpd_req_t, req);
  printf("p = %p\n", p);
  return ESP_OK;
}

Nothing else involved, no scheduler, no semaphores, no python callback … just the pure object creation …

amirgon · February 11, 2021, 8:59pm

Sure, but under the hood the Python object you are passing is also created with m_new_obj, so I don’t see how that would help.
On the C side a Python object is represented by mp_obj_t.

Maybe the object is garbage collected?
In that case its memory gets allocated to another object, so when you write to it you overwrite some other Python object.
To prevent this, a reference to your object must be preserved somewhere on the Python side.

You can check whether gc is involved by disabling garbage collection with gc.disable() and seeing whether the problem still happens.
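A minimal sketch of that check, assuming the server start-up and the test traffic are wrapped in a hypothetical callable `run_test` (it is not part of the patch discussed here). Note that `gc.disable()` only switches off automatic collection; an explicit `gc.collect()` would still run.

```python
import gc

def run_with_gc_disabled(run_test):
    """Call run_test() with automatic garbage collection switched off.

    run_test is a hypothetical callable that starts the server and
    exercises it; it is an assumption of this sketch.
    """
    was_enabled = gc.isenabled()
    gc.disable()              # no automatic collections from here on
    try:
        return run_test()
    finally:
        if was_enabled:
            gc.enable()       # restore the previous state
```

If the crash disappears while gc is disabled, a collected (and then reused) object becomes the prime suspect; if it persists, gc is probably not the cause.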

Till_Harbaum · February 11, 2021, 9:04pm

Maybe. But that would not hurt: I am not touching the object ever again. I just create it and then forget about it.

Till_Harbaum · February 11, 2021, 9:20pm

I am testing a very ugly solution which so far looks pretty good … I am not generating a new object for every callback. Instead I create it once and re-use it. So I keep a reference and replace the embedded pointer on every callback invocation.

But now I likely have a gc problem, since I’ll never know when gc will destroy this object … I can of course keep a reference on the Python side, but that will look confusing as I would keep a reference to this object for no apparent reason.

amirgon · February 11, 2021, 9:28pm

You can keep it as another member of handler_data_t next to user_data.
The user won’t care about your object in the same way he doesn’t care about user_data, but gc will not collect it as long as the user holds handler_data_t.

Still… I wonder what we are missing here that is causing this issue.

Till_Harbaum · February 11, 2021, 9:33pm

I am doing exactly that. But how should gc know that this void pointer actually points to one of the objects it’s about to delete? MP IMO does not know anything about pointers I store inside handler_data_t.

amirgon · February 11, 2021, 9:49pm

It knows.
The gc scans all memories it allocated (handler_data_t included) looking for pointers to other memories it allocated, and marks them (it’s a “mark and sweep” gc).

Till_Harbaum · February 11, 2021, 9:51pm

So if I store random data which coincidentally equals the pointer to some object then this object will not be deleted?

amirgon · February 11, 2021, 9:54pm

Correct.
But the chances for that are low and the consequences are mild (some memory would not be freed).
If anything, the disadvantage is performance: every time gc runs a collection, all allocated RAM is actually read.
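The two points above (a pointer stored in handler_data_t keeps the object alive, and a random word that equals an address does the same) can be sketched with a toy conservative mark-and-sweep collector. This is a simplified model for illustration, not MicroPython's actual gc code; the addresses and block names are hypothetical.

```python
def mark_and_sweep(heap, roots):
    """Toy conservative collector.

    heap:  dict mapping a block address to the list of words stored in it
    roots: iterable of words (stack, registers, globals)
    Any word that equals the address of a heap block is treated as a
    pointer, whether or not it was meant as one.
    Returns the set of addresses that were freed.
    """
    marked = set()
    pending = [w for w in roots if w in heap]
    while pending:
        addr = pending.pop()
        if addr in marked:
            continue
        marked.add(addr)
        # Scan every word of the live block for further "pointers".
        pending.extend(w for w in heap[addr] if w in heap)
    freed = set(heap) - marked
    for addr in freed:
        del heap[addr]
    return freed


# Hypothetical addresses, chosen only for illustration.
HANDLER_DATA, REQ_OBJ, UNRELATED, ORPHAN = 0x1000, 0x2000, 0x3000, 0x4000

heap = {
    HANDLER_DATA: [0xCAFE, REQ_OBJ],  # user_data-like word + kept object
    REQ_OBJ:      [0x3000],           # plain integer that equals UNRELATED
    UNRELATED:    [0],
    ORPHAN:       [0],
}
freed = mark_and_sweep(heap, roots=[HANDLER_DATA])
```

Here only `ORPHAN` is freed. `REQ_OBJ` survives because `HANDLER_DATA` holds its address, and `UNRELATED` survives purely by coincidence, which is the mild "some memory would not be freed" case.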

Till_Harbaum · February 12, 2021, 10:42am

But this actually makes my “dirty” solution at least “ok”. You are right that we should understand what the problem with this object creation is: I am still doing it once, and it may well still do harm and overwrite the wrong memory area. The problem may just have become less obvious, but it may still be there.

Anyway, things are starting to become usable, and the httpd performs pretty well even when lvgl is under load and each httpd request requires a callback into python.

I still think I would like to add the ability to serve files without any callback into python. But in order to do that I would have to access vfs from httpd …

Till_Harbaum · February 13, 2021, 2:11pm

Here’s another patch. This time the object is created in the python task. This runs very stable and quite fast.

http_server.patch.txt (11.2 KB)

uraich · February 13, 2021, 8:38pm

Just tried it and managed to compile the patched lv_micropython. Server is running. I can start playing with it.
Thanks!

Till_Harbaum · February 14, 2021, 9:25am

With LVGL running in the background it still crashes quite fast, even with a minimal single-label screen without touch driver.

What happens is quite interesting: The args pointer given to the scheduler doesn’t arrive in the handler. Instead “6” arrives, which imho is MP’s representation for “None”. This happens with gc disabled.

mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x3f81bc00)
...
mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x3f81bc00)
...
mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x3f81bc00)
...
mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x3f81bc00)
...
mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x3f81bc00)
...
mp_sched_schedule(0x3f458b50,0x3f81bc00)
http_server_handler_cb(0x6)

… and reboot as dereferencing 6 isn’t a good idea. Now I need to figure out where this can get lost.

Edit: This is not a permanent thing. If I allow the handler to return if arg is wrong then the subsequent calls are often fine again. So there’s nothing permanently messed up.

Till_Harbaum · February 14, 2021, 11:16am

Using uasyncio for the lvgl handling doesn’t change anything (assuming I did it correctly). Attached are my two simple http_servers, each serving a single simple page and running a small scrolling label in lvgl. One is classic style using the scheduler and one uses uasyncio.

I’ve checked that MP_STATE_VM(sched_queue) is consistent in the good and the failing scheduler invocations. It is …

http_server_lvgl.py.txt (1.3 KB) http_server_lvgl_async.py.txt (1.4 KB)

amirgon · February 14, 2021, 12:07pm

Do you have an option to connect a debugger through JTAG?

Till_Harbaum · February 14, 2021, 2:27pm

No, I don’t. The ESP32 modules don’t expose the JTAG pins, do they?

Anyway, I was wrong about the uasyncio test. The lvesp32.deinit() also needs to be put after the display has been initialized. Otherwise it starts to call the scheduler again.

And guess what? Now that the scheduler is not used by lvgl anymore, the httpd runs somewhat stable. I still think we should understand why using the scheduler the normal way leads to this problem.

Here’s a server that works with lvgl:
http_server_lvgl.py.txt (8.9 KB)

amirgon · February 14, 2021, 2:40pm

Actually they do!

ESP32 PORT  FT232H PORT  COLOR
==========  ==========   ======
GPIO13      AD0 (TCK)    Purple
GPIO12      AD1 (TDI)    Blue
GPIO15      AD2 (TDO)    Green
GPIO14      AD3 (TMS)    Yellow
GND         GND          Black

I agree, and I think that a debugger could be helpful for that.

Till_Harbaum · February 14, 2021, 3:21pm

I just ordered an ft232h adapter. These GPIOs are being used on my custom board, but I should easily be able to do a breadboard setup that exposes the same issues.

My current suspicion is that the MP scheduler is not multicore safe. What I do see in these problematic situations is that function pointers and arg pointers do get messed up … as if the scheduler queue is written while the scheduler runs entries from it. There are “atomic” macros which are supposed to handle that.
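To make the suspicion concrete, here is a toy model of a scheduler queue in which the function and its argument are stored and removed as one unit under a lock. It is only a sketch of the invariant in question, not MicroPython's scheduler code; the class and method names are hypothetical.

```python
import threading

class SchedQueue:
    """Toy fixed-size queue of (func, arg) pairs guarded by one lock."""

    def __init__(self, depth=4):
        self._slots = [None] * depth
        self._idx = 0          # index of the oldest entry
        self._len = 0          # number of queued entries
        self._lock = threading.Lock()

    def schedule(self, func, arg):
        """Queue func(arg); return False if the queue is full."""
        with self._lock:
            if self._len == len(self._slots):
                return False
            tail = (self._idx + self._len) % len(self._slots)
            # func and arg are stored together while the lock is held,
            # so a consumer can never observe a half-written slot.
            self._slots[tail] = (func, arg)
            self._len += 1
            return True

    def run_pending(self):
        """Run all queued entries; return how many were run."""
        count = 0
        while True:
            with self._lock:
                if self._len == 0:
                    return count
                func, arg = self._slots[self._idx]
                self._slots[self._idx] = None
                self._idx = (self._idx + 1) % len(self._slots)
                self._len -= 1
            func(arg)          # call outside the lock
            count += 1
```

If a producer on one core could update the slot or the indices while the consumer on the other core is in the middle of reading them, the consumer could pair a function with the wrong argument, which would match the mismatched pointers seen in the log.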

amirgon · February 14, 2021, 3:29pm

Under the hood it uses a mutex exposed by ESP32-FreeRTOS, which is multicore safe, but maybe there’s some MP code which should be protected and is not.

Till_Harbaum · February 14, 2021, 8:00pm

Funny side note: I think I am tracing a bug in the esp-idf which is not passing the user_ctx to the handler if websocket is being used …