DVP to KPU data transfer using DMA through Standalone SDK

  • Can you please provide an example of how this is done when using the latest KMODEL? Currently, run_kmodel in nncase.cpp calls

    kernels::k210::kpu_upload(src, mem.data(), shape);

    This copies the input with memcpy.

    The memcpy currently takes 17 ms when processing a single frame and 27 ms per frame when processing continuously from the camera. The model's inference time comes on top of that.

    Incidentally, there is a kpu_upload_dma in the same class, which is unused. Any attempt to call it results in a core dump, presumably because I am using it incorrectly.

    I hope a proper DMA transfer will cut this overhead drastically.

  • For anyone looking for an answer: refer here for DMA usage.
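
Before swapping the copy for DMA, it may help to time the plain memcpy in isolation, so the 17 ms and 27 ms figures quoted in the question can be compared against the same measurement after the change. The following is only a host-side sketch, not code from nncase or the Standalone SDK: the frame geometry (320x240x3) and the helper names time_frame_copy_us and average_frame_copy_us are assumptions made for illustration, and std::chrono stands in for whatever timer is available on the target board.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical frame geometry; the original post does not state the input size.
constexpr std::size_t kWidth = 320;
constexpr std::size_t kHeight = 240;
constexpr std::size_t kChannels = 3;
constexpr std::size_t kFrameBytes = kWidth * kHeight * kChannels;

// Copies one frame with std::memcpy and returns the elapsed time in microseconds.
// Both buffers must hold exactly kFrameBytes bytes.
double time_frame_copy_us(const std::vector<std::uint8_t> &src,
                          std::vector<std::uint8_t> &dst)
{
    assert(src.size() == kFrameBytes && dst.size() == kFrameBytes);
    const auto start = std::chrono::steady_clock::now();
    std::memcpy(dst.data(), src.data(), kFrameBytes);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Averages the copy time over `frames` back-to-back copies, which loosely mimics
// the continuous-capture case. Returns 0 when `frames` is 0.
double average_frame_copy_us(std::size_t frames)
{
    if (frames == 0)
        return 0.0;
    std::vector<std::uint8_t> src(kFrameBytes, 0x5A);
    std::vector<std::uint8_t> dst(kFrameBytes, 0x00);
    double total_us = 0.0;
    for (std::size_t i = 0; i < frames; ++i)
        total_us += time_frame_copy_us(src, dst);
    return total_us / static_cast<double>(frames);
}
```

Calling average_frame_copy_us(1) and average_frame_copy_us(100) gives a single-frame and a continuous figure to compare. Host numbers will not match the K210, so the same two measurements would need to be repeated on the board, once with the memcpy path and once with the DMA path.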