An interesting expansion on custom AI models: there are lots of emerging quantization schemes that can't be exploited to full efficiency on conventional GPUs or CPUs, such as BitNet 1.58, which depends on extremely cheap ternary operations; those, in contrast, can be accelerated quite aggressively on FPGAs and ASICs.
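To make the ternary point concrete: with weights restricted to {-1, 0, +1}, every multiply collapses into an add, a subtract, or a skip. A rough NumPy sketch, purely illustrative; the actual speedup only appears when this maps onto FPGA/ASIC logic instead of floating-point units:

    import numpy as np

    def ternary_matvec(W, x):
        """Matrix-vector product where W holds only {-1, 0, +1}.
        Multiplies collapse into adds/subtracts/skips, which is what
        dedicated hardware can exploit far better than GPU FP units."""
        out = np.zeros(W.shape[0], dtype=x.dtype)
        for i in range(W.shape[0]):
            out[i] = x[W[i] == 1].sum() - x[W[i] == -1].sum()
        return out

    # example: 4 outputs, 8 inputs
    W = np.random.choice([-1, 0, 1], size=(4, 8))
    x = np.random.randn(8).astype(np.float32)
    assert np.allclose(ternary_matvec(W, x), W @ x)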
@beetlebox4858 · 5 months ago
In the case of binary and ternary networks, FPGAs have a huge advantage over GPUs because of their ability to exploit XOR operations. The problem is that accuracy drops badly below the 4-bit mark.
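For the binary case, the XOR trick boils down to something like this small Python sketch (illustrative only), with both {-1, +1} vectors packed into machine words:

    def binary_dot(a_bits, b_bits, n):
        """Dot product of two {-1,+1} vectors packed as n-bit integers
        (bit=1 means +1, bit=0 means -1). XOR counts disagreements, so
        dot = n - 2 * popcount(a XOR b); on an FPGA the XOR and popcount
        map straight onto LUTs, which is where the speedup comes from."""
        disagreements = bin((a_bits ^ b_bits) & ((1 << n) - 1)).count("1")
        return n - 2 * disagreements

    # two 4-element vectors that disagree in 2 positions -> dot product 0
    print(binary_dot(0b1011, 0b1101, 4))  # 0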
@novantha1 · 5 months ago
@@beetlebox4858 In the case of BitNet specifically, the accuracy actually isn't terribly far off full precision when trained with quantization-aware training as described in the paper; the main issue is having to train the model from scratch, but it does work pretty well. And, as you noted, it's very efficient on FPGAs.
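For reference, the quantization-aware training trick amounts to something like the simplified PyTorch sketch below: absmean ternary quantization in the forward pass with a straight-through estimator in the backward pass. The real recipe in the paper has more detail; this is just the core idea.

    import torch

    class TernaryQuant(torch.autograd.Function):
        """Quantize weights to {-1, 0, +1} in the forward pass, but pass
        gradients straight through in the backward pass (STE), which is
        the usual trick behind quantization-aware training."""
        @staticmethod
        def forward(ctx, w):
            scale = w.abs().mean()  # per-tensor absmean scale (simplified)
            return torch.round(torch.clamp(w / (scale + 1e-8), -1, 1)) * scale

        @staticmethod
        def backward(ctx, grad_output):
            return grad_output      # straight-through estimator

    w = torch.randn(8, requires_grad=True)
    q = TernaryQuant.apply(w)
    q.sum().backward()              # gradients still reach the full-precision weights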
@souravjamwal77 · 5 months ago
I am always interested in applying current trends at the core or very base hardware level. Last year, for example, I built a custom Lambda image that had lower latency than an 8 GB EC2 instance with a GPU. Thanks for the video.
@beetlebox4858 · 5 months ago
EC2 actually has FPGA instances, the F1 family, that you can experiment with.
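For anyone curious, launching one with boto3 looks roughly like this; the AMI ID and key pair name below are placeholders, and in practice you would pick the FPGA Developer AMI from the AWS Marketplace:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    response = ec2.run_instances(
        ImageId="ami-xxxxxxxxxxxxxxxxx",   # placeholder, not a real AMI
        InstanceType="f1.2xlarge",         # smallest F1 size, one FPGA
        MinCount=1,
        MaxCount=1,
        KeyName="my-keypair",              # placeholder key pair name
    )
    print(response["Instances"][0]["InstanceId"])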
@souravjamwal77 · 5 months ago
@@beetlebox4858 thanks
@juanpauloazaelpalacios-vil5422 · 4 months ago
Currently using the Brevitas/FINN frameworks to deploy a Tiny-YOLO model on a Kria board! Love the hardware design challenge of it all.
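For context, a quantized layer in the style that FINN consumes is built with Brevitas roughly like this; the bit widths and arguments are illustrative, and the actual Kria flow still needs the full FINN export and synthesis steps:

    import torch
    from brevitas.nn import QuantConv2d, QuantReLU

    # One quantized conv block; bit widths chosen arbitrarily for illustration.
    block = torch.nn.Sequential(
        QuantConv2d(3, 16, kernel_size=3, padding=1, weight_bit_width=4),
        QuantReLU(bit_width=4),
    )

    x = torch.randn(1, 3, 32, 32)
    print(block(x).shape)   # torch.Size([1, 16, 32, 32])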
@beetlebox4858 · 4 months ago
Sounds very cool. We are looking at developing a Vitis AI and MLOps video as well.
@juanpauloazaelpalacios-vil5422 · 4 months ago
@@beetlebox4858 Looking forward to it, and thanks for putting the work into these videos by the way! High quality stuff!