Wait - are companies actually happy running their models at full precision after creation?! That's an insanely inefficient waste of resources for no gain given modern (2022+) quantisation techniques.
@savelist12 days ago
They are no different: take any model, give it a good prompt, and you get the same result.