On STM32 dual-port DMA

I came across https://github.com/lvgl/lv_port_stm32f746_disco_atollic/pull/12 . I can’t answer there as that conversation is locked, so let me explain here:

The fix revolves around STM32 DMA initialization:

MemDataAlignment should be inited with DMA_MDATAALIGN_HALFWORD.

I also wonder how was it working because these defines have different values.

The dual-port DMA used in 'F7 (and 'F2/'F4, and as “general-purpose” DMA1/DMA2 in 'H7) has an internal 16-byte (4-word) FIFO. If it is enabled, the DMA can read data of one width on the source port and write a different width on the destination port; the FIFO allows the data to be rearranged (split or assembled) as needed. The FIFO can be disabled (and that’s the default state, rather confusingly called Direct mode by ST; see the even more confusingly named DMA_SxFCR.DMDIS (Direct Mode Disable = FIFO enable) bit).

In the code in question, DMA is used to transfer from LVGL’s “assembly” buffer into the LCD controller’s framebuffer. Memory-to-memory (M2M) mode is used, which requires FIFO to be on, and that’s set appropriately. In M2M mode, implicitly, the Peripheral port is source, and Memory port is destination.

Now in Cube/HAL, the data width on each port is given by DMA_SxCR.MSIZE/DMA_SxCR.PSIZE, filled from the init struct’s MemDataAlignment/PeriphDataAlignment fields (characteristically, the fields’ names are even more confusing, thanks to the obvious incompetence of Cube/HAL’s authors - even if datum size does imply the required address alignment, as DMA is not designed to perform unaligned bus transfers, nothing in the name hints at data width).
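To illustrate why the Mem and Periph defines cannot be swapped: as far as I recall from the reference manual, PSIZE sits in bits 12:11 and MSIZE in bits 14:13 of DMA_SxCR, with 0 = byte, 1 = halfword, 2 = word. Below is a minimal host-side sketch of that encoding. The enum and function names are made up for illustration and are not Cube/HAL identifiers, and the bit positions should be checked against the RM for your part.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only (not Cube/HAL identifiers). */
enum dma_width { DMA_W_BYTE = 0, DMA_W_HALFWORD = 1, DMA_W_WORD = 2 };

/* Assumed field positions in DMA_SxCR (verify against the reference manual). */
#define SXCR_PSIZE_POS 11u
#define SXCR_MSIZE_POS 13u

/* Returns only the PSIZE/MSIZE bits of DMA_SxCR for the given port widths. */
uint32_t sxcr_size_bits(enum dma_width psize, enum dma_width msize)
{
    return ((uint32_t)psize << SXCR_PSIZE_POS) |
           ((uint32_t)msize << SXCR_MSIZE_POS);
}
```

With these assumptions, halfword on both ports gives 0x2800, while halfword on the peripheral port and word on the memory port gives 0x4800. The same width therefore has a different value depending on which field it goes into.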

So, if you set PSIZE to halfword and MSIZE to word, DMA will read halfwords from the peripheral port (their number given by the NDTR setting), assemble them into words and write those on the memory port. If NDTR is odd, the last word write will overwrite one halfword beyond the intended size.
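The overrun described above can be put into numbers with a small host-side model. This is only a sketch of the behaviour as described in this post (halfword reads packed into word writes), not HAL code and not a cycle-accurate model of the DMA; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: PSIZE = halfword (source), MSIZE = word (destination).
 * The DMA reads ndtr halfwords and writes them out as whole words. */
size_t dma_model_bytes_written(size_t ndtr_halfwords)
{
    /* Two halfwords fill one word; an odd count still ends in a full
     * word write, as described in the text above. */
    size_t words = (ndtr_halfwords + 1u) / 2u;
    return words * sizeof(uint32_t);
}

/* Bytes written past the end of the intended destination area. */
size_t dma_model_overrun_bytes(size_t ndtr_halfwords)
{
    return dma_model_bytes_written(ndtr_halfwords)
         - ndtr_halfwords * sizeof(uint16_t);
}
```

For NDTR = 15 the model writes 32 bytes where 30 were intended, i.e. a 2-byte (one halfword) overrun; for an even NDTR the overrun is 0.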

So, the fix above indeed corrects that, and also allows the destination (the LTDC framebuffer) to reside on word-unaligned halfword-aligned addresses.

However, reading and writing halfwords takes roughly twice as much time as reading and writing words (not exactly twice, as the process is complex and involves many variables inside and outside the DMA), so in that sense it’s a pessimisation step; on the contrary, PSIZE ought to be set to word.

The rationale is that I’m yet to see a display whose resolution in either direction is anything other than a multiple of 8, or even of a (much) higher power of 2. There’s little to absolutely no reason for the framebuffer on the destination (Memory) side to be anything other than word-aligned and word-sized. On the source (Peripheral) side, it’s LVGL’s “assembly” buffer, which is user-settable, but again, it’s not unreasonable to expect the user to make it word-aligned and word-sized.
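One way to act on this tradeoff is to decide per transfer whether word width is safe. Here is a minimal sketch, assuming the caller knows both byte addresses and the byte count; the function name is hypothetical and nothing like it is claimed to exist in LVGL or Cube/HAL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: word (32-bit) transfers are only safe if the source
 * address, the destination address and the byte count are all multiples
 * of 4. Otherwise fall back to halfword transfers. */
bool can_use_word_transfers(uintptr_t src, uintptr_t dst, size_t nbytes)
{
    return ((src | dst | (uintptr_t)nbytes) & 3u) == 0u;
}
```

If the check passes, PSIZE and MSIZE can both be set to word and NDTR to nbytes / 4; if not, halfword on both ports with NDTR = nbytes / 2 remains the safe choice.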

There’s no “absolutely perfect” solution in the real microcontroller world; there are always tradeoffs, and thinking is required.

2 eurocents



Thanks for telling this.

What a “luck”. Anyway, in the current version for CubeIDE it’s fixed: https://github.com/lvgl/lv_port_stm32f746_disco/blob/master/hal_stm_lvgl/tft/tft.c#L428-L429

Actually, a half-word aligned transfer can happen even if the buffer itself is aligned.

E.g. if the buffer’s address is 0x1000 and a 15x20 area is rendered, DMA transfers the first line into the framebuffer from 0x1000. The next line then begins 15 pixels later, i.e. at 0x101E with 16-bit pixels, which is halfword-aligned but not word-aligned.

Besides, the destination area could be unaligned too, e.g. when a button at the (3;0) coordinate is redrawn. Even if the source buffer is aligned, the destination address will be frame buffer + 3 pixels.
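The two cases can be checked with a bit of address arithmetic. This sketch assumes 16-bit (RGB565) pixels and a tightly packed source buffer with no padding between lines; the function names and the 480-pixel screen width in the usage note are illustrative assumptions, not taken from the port.

```c
#include <stdint.h>

#define BYTES_PER_PIXEL 2u /* assumption: 16-bit (RGB565) colour */

/* Byte address of a given line in a tightly packed source buffer
 * holding an area that is area_w pixels wide. */
uintptr_t src_line_addr(uintptr_t buf, uint32_t area_w, uint32_t line)
{
    return buf + (uintptr_t)line * area_w * BYTES_PER_PIXEL;
}

/* Byte address of pixel (x, y) in a framebuffer screen_w pixels wide. */
uintptr_t dst_line_addr(uintptr_t fb, uint32_t screen_w, uint32_t x, uint32_t y)
{
    return fb + ((uintptr_t)y * screen_w + x) * BYTES_PER_PIXEL;
}
```

With the buffer at 0x1000 and a 15-pixel-wide area, line 1 starts at 0x101E. With a word-aligned framebuffer on a 480-pixel-wide screen, pixel (3;0) lies 6 bytes in. Both addresses are even but not multiples of 4, so only halfword transfers are safe there.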

The first issue (source address alignment) could be solved by introducing the concept of stride in LVGL.
See the related conversation here: https://github.com/lvgl/lvgl/issues/1858
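As a rough idea of what a stride would do here: each line of the source buffer is padded so that the next line starts on an aligned address again. This is only a sketch with a hypothetical function name; it is not how LVGL implements or plans to implement stride.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes per line after padding the line up to a
 * multiple of align. align must be a non-zero power of two. */
uint32_t stride_bytes(uint32_t width_px, uint32_t bytes_per_px, uint32_t align)
{
    assert(align != 0u && (align & (align - 1u)) == 0u);
    uint32_t row = width_px * bytes_per_px;
    /* Round up to the next multiple of align. */
    return (row + align - 1u) & ~(align - 1u);
}
```

For the 15-pixel-wide area with 16-bit pixels, the raw line is 30 bytes and the word-aligned stride is 32 bytes, so every line of a word-aligned buffer starts word-aligned. The cost is 2 unused bytes per line.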

However, I have no idea how to solve the destination buffer issue.

I see and stand corrected.

Introducing the concept of stride in LVGL.

Nice. I like things being optional…

IMO much of the “back-end” is platform-dependent and thus a moving target, and I don’t think there is a universally perfect solution for all possible cases.