How to optimize PNG tile display performance

I am displaying PNG map tiles on an 800x480 screen.
The hardware should be fast enough (Cortex-A7 at 1.2 GHz, Linux framebuffer driver, RGB TFT interface).
Each PNG tile is 256x256. The tiles display correctly and look perfect.

I preallocate a list of structures holding enough PNG image objects, set LV_OBJ_FLAG_HIDDEN on all of them, and put them in free_list.
To draw a tile, I change the object's PNG source, clear LV_OBJ_FLAG_HIDDEN, and link the object into show_list.
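
To make the scheme above concrete, here is a minimal sketch of the free_list/show_list pool without any LVGL calls. All names (tile_slot_t, tile_pool_t, pool_init, pool_show, pool_hide_all) are hypothetical and are not my real code, the hidden field only stands in for LV_OBJ_FLAG_HIDDEN, and POOL_SIZE of 15 is just an illustrative size.

```c
#include <stddef.h>
#include <stdbool.h>

#define POOL_SIZE 15 /* illustrative pool size, an assumption */

/* One preallocated slot; in the real code this would also hold the lv_obj and PNG buffer. */
typedef struct tile_slot {
    struct tile_slot *next;
    int zoom, xtile, ytile;
    bool hidden; /* stands in for LV_OBJ_FLAG_HIDDEN */
} tile_slot_t;

typedef struct {
    tile_slot_t slots[POOL_SIZE];
    tile_slot_t *free_list; /* hidden slots, ready for reuse */
    tile_slot_t *show_list; /* slots currently on screen */
} tile_pool_t;

/* Mark every slot hidden and push it onto free_list. */
static void pool_init(tile_pool_t *p)
{
    p->free_list = NULL;
    p->show_list = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        p->slots[i].hidden = true;
        p->slots[i].next = p->free_list;
        p->free_list = &p->slots[i];
    }
}

/* Take one slot from free_list, fill it, unhide it, link it into show_list.
 * Returns NULL when the pool is exhausted. */
static tile_slot_t *pool_show(tile_pool_t *p, int zoom, int xtile, int ytile)
{
    tile_slot_t *s = p->free_list;
    if (s == NULL)
        return NULL;
    p->free_list = s->next;
    s->zoom = zoom;
    s->xtile = xtile;
    s->ytile = ytile;
    s->hidden = false;
    s->next = p->show_list;
    p->show_list = s;
    return s;
}

/* Hide every shown slot and return it to free_list (e.g. before a full redraw). */
static void pool_hide_all(tile_pool_t *p)
{
    while (p->show_list != NULL) {
        tile_slot_t *s = p->show_list;
        p->show_list = s->next;
        s->hidden = true;
        s->next = p->free_list;
        p->free_list = s;
    }
}
```

The point of the pool is that no objects are created or deleted while the map moves; only the source, position and hidden flag of existing objects change.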

The problem:
When the tiles change (that is, when the map center changes), the map moves, but not as smoothly as on the PC simulator.
Maybe LVGL also decodes the PNG before drawing. How can I optimize this?
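
To give a feeling for how much decode work one full screen could mean, here is a rough estimate. It is only a sketch: tiles_along and decoded_bytes are hypothetical helper names, the count is an upper bound for a tile grid shifted by an arbitrary offset, and 4 bytes per decoded pixel is an assumption (the real value depends on the color depth configured in LVGL).

```c
#include <stdint.h>

/* Upper bound on the number of tiles that can touch a span of `screen`
 * pixels when the tile grid is shifted by an arbitrary offset:
 * ceil(screen / tile) + 1. */
static uint32_t tiles_along(uint32_t screen, uint32_t tile)
{
    return (screen + tile - 1) / tile + 1;
}

/* Bytes needed to hold cols x rows square tiles after decoding,
 * with bytes_per_px bytes per decoded pixel (assumed, not measured). */
static uint32_t decoded_bytes(uint32_t cols, uint32_t rows,
                              uint32_t tile, uint32_t bytes_per_px)
{
    return cols * rows * tile * tile * bytes_per_px;
}
```

For my case, tiles_along(800, 256) is 5 and tiles_along(480, 256) is 3, so up to 15 tiles can be visible at once. With the assumed 4 bytes per pixel, decoded_bytes(5, 3, 256, 4) is 3932160 bytes, about 3.75 MiB of pixel data that would have to be decoded again whenever nothing is cached.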

//read tile and show it
        char tilebuf[64*1024];
        int32_t ercd = me->tileio->read(me->tileio, tile->zoom, tile->xtile, tile->ytile, tilebuf, sizeof(tilebuf));
        if(ercd > 0) {//ercd is used as the PNG length in bytes
                OBJ_IMG * obj_img = list_head(&me->list_free, OBJ_IMG,node);//get first free element
                list_insert(&me->list_show, &obj_img->node);

                //copy the PNG file data and describe it to lvgl
                obj_img->tile_len = ercd;
                memcpy(obj_img->tile_buf, tilebuf, obj_img->tile_len);
                obj_img->img_dsc.header.always_zero = 0;
                obj_img->img_dsc.header.w = tdim;
                obj_img->img_dsc.header.h = tdim;
                obj_img->img_dsc.data_size = obj_img->tile_len;
                obj_img->img_dsc.header.cf = LV_IMG_CF_RAW_ALPHA;//member name missing in the original post; header.cf assumed
                obj_img->img_dsc.data = obj_img->tile_buf;//member name missing in the original post; data assumed

                //remember which tile this object shows
                obj_img->xtile = tile->xtile;
                obj_img->ytile = tile->ytile;
                obj_img->zoom = tile->zoom;

                //lv_img_set_src(obj_img->img, &obj_img->img_dsc);
                lv_obj_set_pos(obj_img->img, tile->pos_x, tile->pos_y);
                lv_obj_clear_flag(obj_img->img, LV_OBJ_FLAG_HIDDEN);
                lv_obj_invalidate(obj_img->img);//let lvgl refresh the obj
        }