I managed to get it ALMOST working with the following code.
for (x_fill_act = x2_flush; x_fill_act >= x1_flush; x_fill_act--)
{
    for (y_fill_act = act_y1; y_fill_act <= y2_fill; y_fill_act++)
    {
        memcpy((uint32_t)&my_fb[(271 - x_fill_act) * (272) + y_fill_act],
               (uint32_t)(buf_to_flush + x_fill_act + (y_fill_act - act_y1) * (x2_flush - x1_flush + 1)),
               2);
    }
}
However, there is still a problem with the copying algorithm: the output comes out wrong, as shown in the attached picture.
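For comparison, here is a minimal sketch of the same per-pixel rotated copy with the source index taken relative to the flushed area, i.e. using (x - x1) rather than x for the column. It assumes that the flush buffer holds only the flushed rectangle in row-major order and that pixels are 16 bits wide (which matches the 2-byte memcpy above); if either assumption is wrong for this setup, the indexing will differ. The function name and the parameters fb_stride and logical_w are hypothetical and stand in for the hard-coded 272 and 271 + 1 above, so that the framebuffer row length can be checked independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original code.
 * Copies the rectangle (x1,y1)-(x2,y2) from buf into fb, rotated so that
 * logical column x lands in framebuffer row (logical_w - 1 - x) and
 * logical row y lands in framebuffer column y.
 * Assumption: buf contains ONLY the rectangle, row-major, 16-bit pixels.
 * Assumption: fb_stride is the number of pixels in one framebuffer row. */
static void copy_area_rotated(uint16_t *fb, size_t fb_stride, int32_t logical_w,
                              const uint16_t *buf,
                              int32_t x1, int32_t y1, int32_t x2, int32_t y2)
{
    const int32_t area_w = x2 - x1 + 1;
    assert(fb != NULL && buf != NULL && area_w > 0);

    for (int32_t y = y1; y <= y2; y++) {
        for (int32_t x = x1; x <= x2; x++) {
            /* Source offset is relative to the top-left corner of the area. */
            size_t src = (size_t)(y - y1) * (size_t)area_w + (size_t)(x - x1);
            /* Destination offset in the rotated framebuffer. */
            size_t dst = (size_t)(logical_w - 1 - x) * fb_stride + (size_t)y;
            fb[dst] = buf[src];
        }
    }
}
```

With the variable names from the snippet above, the call would look roughly like copy_area_rotated(my_fb, 272, 272, buf_to_flush, x1_flush, act_y1, x2_flush, y2_fill), assuming my_fb and buf_to_flush are (or can be cast to) 16-bit pixel pointers. Whether 272 is the right value for both the stride and the logical width depends on the panel dimensions, which are not stated here.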