Understanding which mathematical operations the KPU implements, and how they perform

Hi all,
I would like to understand which operations the KPU supports. Can I use the KPU to do matrix multiplications? How about dot products?
The datasheet is not clear in this respect. For example, when talking about fixed-point models it says "Supports the fixed-point model that the mainstream training framework trains according to specific restriction rules". What are the restrictions?
My goal is to understand and prove the performance capabilities of the KPU (how many operations per second in various scenarios).
Are there simple programs or models that can help me exercise the KPU this way?

Hi @adioltean ,
The file linked below shows the activation (transfer) functions that the K210 supports, including linear, ReLU and leaky ReLU.
https://github.com/kendryte/kendrytemodelcompiler/blob/master/k210_layer.py
Matrix multiplication is also supported (it is the main calculation in a convolutional neural network!).
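To make the link between convolution and matrix multiplication concrete, here is a minimal pure-Python sketch showing that a 1x1 convolution over flattened pixels gives the same numbers as a matrix product. The helper names (`matmul`, `conv1x1`, `transpose`) are made up for this illustration, and I have not checked how (or whether) the K210 toolchain exposes this mapping, so treat it as the arithmetic idea only.

```python
def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n), given as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def conv1x1(feature_map, weights):
    """1x1 convolution.

    feature_map: [pixel][in_channel]   (spatial positions flattened)
    weights:     [out_channel][in_channel]
    returns:     [pixel][out_channel]
    """
    return [[sum(w * x for w, x in zip(kernel, pixel)) for kernel in weights]
            for pixel in feature_map]


def transpose(mat):
    """Swap rows and columns of a nested-list matrix."""
    return [list(row) for row in zip(*mat)]


A = [[1, 2, 3],
     [4, 5, 6]]      # 2 "pixels", 3 input channels each
B = [[7, 8],
     [9, 10],
     [11, 12]]       # 3 x 2 matrix

# Treat each column of B as one 1x1 kernel over the 3 input channels.
via_conv = conv1x1(A, transpose(B))
print(via_conv)      # same values as matmul(A, B)
```

Each output value is also a dot product (one kernel against one pixel's channels), which is why dot products and matrix multiplications come down to the same multiply-accumulate work.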
But I'm not sure whether a 16-point linear interpolation alone can reproduce the results that GPUs give.
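One way to get a feeling for that doubt is to build a 16-segment piecewise-linear table and measure how far it is from the exact function. The sketch below is only an illustration: the uniform breakpoints on [-8, 8], the slope of 0.1, and the use of sigmoid as a smooth comparison function are all my assumptions, not the KPU's actual table format.

```python
import math


def build_table(fn, lo, hi, segments=16):
    """Sample fn at segments+1 evenly spaced breakpoints on [lo, hi]."""
    step = (hi - lo) / segments
    xs = [lo + i * step for i in range(segments + 1)]
    return xs, [fn(x) for x in xs]


def interpolate(table, x):
    """Evaluate the piecewise-linear approximation at x (clamped to the range)."""
    xs, ys = table
    x = min(max(x, xs[0]), xs[-1])
    step = xs[1] - xs[0]
    i = min(int((x - xs[0]) / step), len(xs) - 2)   # index of the segment
    t = (x - xs[i]) / step                          # position inside it, 0..1
    return ys[i] + t * (ys[i + 1] - ys[i])


def max_error(fn, table, samples=2001):
    """Largest absolute gap between fn and its table over a dense grid."""
    xs, _ = table
    lo, hi = xs[0], xs[-1]
    points = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    return max(abs(fn(p) - interpolate(table, p)) for p in points)


def leaky_relu(x, slope=0.1):
    return x if x >= 0 else slope * x


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


leaky_err = max_error(leaky_relu, build_table(leaky_relu, -8.0, 8.0))
sigmoid_err = max_error(sigmoid, build_table(sigmoid, -8.0, 8.0))
print(f"leaky ReLU max error: {leaky_err:.2e}")
print(f"sigmoid    max error: {sigmoid_err:.2e}")
```

With this layout the kink of leaky ReLU falls exactly on a breakpoint, so it is reproduced up to rounding, while a smooth function such as sigmoid shows a small but nonzero error. Whether that error matters depends on the model.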
The restriction is mainly bit precision: the KPU only supports 8- or 16-bit calculations, with no floating-point support.
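To show what the precision restriction means in practice, here is a small sketch that quantizes two float matrices to 8-bit integers, multiplies them with integer arithmetic only, and rescales the result. The symmetric, one-scale-per-matrix scheme and the helper names are assumptions for illustration; the thread does not describe the scheme the K210 toolchain really uses.

```python
def quantize(matrix, bits=8):
    """Symmetric quantization: signed integers plus a single float scale."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    peak = max(abs(v) for row in matrix for v in row)
    scale = peak / qmax if peak else 1.0
    q = [[max(-qmax, min(qmax, round(v / scale))) for v in row]
         for row in matrix]
    return q, scale


def matmul(a, b):
    """Matrix product on nested lists; works for ints and floats alike."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


A = [[0.12, -0.50, 0.33],
     [0.90, 0.07, -0.41]]
B = [[0.25, -0.80],
     [0.60, 0.10],
     [-0.35, 0.45]]

qa, scale_a = quantize(A)
qb, scale_b = quantize(B)

# Integer-only multiply-accumulate, then one float rescale per output value.
approx = [[v * scale_a * scale_b for v in row] for row in matmul(qa, qb)]
exact = matmul(A, B)

worst = max(abs(e - a)
            for row_e, row_a in zip(exact, approx)
            for e, a in zip(row_e, row_a))
print(f"worst absolute error with 8-bit operands: {worst:.4f}")
```

Python integers do not overflow, so the sum here plays the role of a wide accumulator; on real hardware the accumulator width is one more thing to check.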
Anyway, matrix multiplication is the main operation in an NN or CNN, and even 8- or 16-bit precision is enough for inference.
Maybe it could also be used for other tasks, such as DSP.
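On the operations-per-second question, a rough way to turn a timed layer into a throughput figure is to count its multiply-accumulates and divide by the measured time. The sketch below uses the standard MAC count for a convolution layer; the layer shape and the elapsed time are placeholders I made up, not measurements from a K210.

```python
def conv_macs(out_h, out_w, in_ch, out_ch, kernel):
    """Multiply-accumulates in one convolution layer with a square kernel."""
    return out_h * out_w * in_ch * out_ch * kernel * kernel


def ops_per_second(macs, seconds):
    """Throughput, counting each MAC as two operations (multiply and add)."""
    return 2 * macs / seconds


# Hypothetical layer shape, chosen only for illustration.
macs = conv_macs(out_h=56, out_w=56, in_ch=32, out_ch=64, kernel=3)

# Placeholder: replace with the time you actually measure on the device.
elapsed_s = 1.0

print(f"MACs in layer: {macs}")
print(f"throughput:    {ops_per_second(macs, elapsed_s) / 1e9:.3f} GOPS")
```

Running the same calculation for a few layer shapes (different kernel sizes and channel counts) would give the "various scenarios" comparison asked about above.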