Inside TensorFlow: Quantization aware training

15,320 views

TensorFlow


Comments: 43
@foolmarks 4 years ago
GitHub link doesn't work. Audio is terrible.
@autripat 3 years ago
Hey all, at 14:26, are we missing the quantize_annotate_layer wrapper over the Conv2D layer (inside Sequential), like this: quantize_annotate_layer(tf.keras.layers.Conv2D(32, 5, input_shape=(28, 28, 1)))
@gauravsingh-jm6dk 4 months ago
Yes, you are right. quantize_annotate_layer is for annotation only, so that quantize_apply knows which layers to quantize. There is another way that gives you finer granularity: you can use the QuantizeWrapper class directly. It gives you the freedom to quantize the model as you need, because you can set the configuration parameters for quantization yourself.
@shubhammane6357 2 years ago
I tried QAT and as a result I got an .h5 model with quantize wrapper layers. I want to remove them and get back my original model with the modified weights. How can I do that?
@athreyamurali1439 4 years ago
Hey can you re-upload with better audio, please?
@alias15vapour 3 years ago
Sorry about that. I recorded the audio locally so that it would be better, but forgot that AirPods audio compression over Bluetooth loses quality.
@athreyamurali1439 3 years ago
@alias15vapour All good, it happens. The topic seems really interesting tho, so I'd really appreciate it if you could re-upload or re-record it sometime. Thanks!
@alias15vapour 3 years ago
@athreyamurali1439 Thanks. This takes a bunch of post-production work, so a re-upload is a bit unlikely tbh, but I (or someone else on the team) will definitely do this, and do a better job, for the next version.
@lisali6120 3 years ago
Thanks for sharing! Does it support mixed precision?
@bryanlozano8905 3 years ago
it should, he mentioned custom quantization for specific layers
@alias15vapour 3 years ago
QAT emulates model execution at reduced precision during training, so that model accuracy is preserved after quantization. If that's your goal, you can totally do it like Bryan mentioned. But it's unlike mixed precision for training.
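To make the emulation mentioned above concrete, here is a minimal sketch of the quantize-then-dequantize step (often called fake quantization) applied to a list of float values. It is plain Python for illustration only, assuming an asymmetric 8-bit affine scheme; the name `fake_quantize` and its parameters are hypothetical, and this is not the tensorflow_model_optimization implementation.

```python
def fake_quantize(values, num_bits=8):
    """Round floats onto an integer grid and map them back to floats.

    Hypothetical helper: emulates the precision loss of integer
    inference while keeping the data in floating point.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    # The range always includes 0.0 so that zero stays exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return list(values)  # every value is zero, nothing to quantize
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    result = []
    for v in values:
        q = round(v / scale) + zero_point   # quantize to an integer
        q = max(qmin, min(qmax, q))         # clamp to the integer range
        result.append((q - zero_point) * scale)  # dequantize back to float
    return result


weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
emulated = fake_quantize(weights)
max_error = max(abs(w - e) for w, e in zip(weights, emulated))
print(emulated, max_error)
```

Because the forward pass sees these rounded values during training, the weights can adapt to the rounding error before the model is converted.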
@sunnyguha2 4 years ago
Get better microphone
@Lisa-hb3js 10 months ago
I got this error whatever I do (the same if the network only contains Dense layers...) : ValueError: Unable to clone model. This generally happens if you used custom Keras layers or objects in your model. Please specify them via `quantize_scope` for your calls to `quantize_model` and `quantize_apply`. [Layer supplied to wrapper is not a supported layer type. Please ensure wrapped layer is a valid Keras layer.].
@opydas4548 7 months ago
Have you found a solution?
@gauravsingh-jm6dk 4 months ago
You get this error when you define a custom layer. For example, say I define my custom layer as class CustomLayer(tf.keras.layers.Layer) and do some operations inside it. Then, whenever you call quantize_apply, you need to declare the custom layer in quantize_scope.
@sanjoetv5748 11 months ago
I am having a problem when I convert my .h5 to tflite. When I test the tflite model in my mobile app, the accuracy is much lower than when I run the .h5 in Jupyter. My question: can quantization aware training help reduce the accuracy loss when I convert the model to tflite afterwards? Please, someone help!
@gauravsingh-jm6dk 4 months ago
Yes, if QAT is done properly, it will improve your accuracy for sure.
@Hav0c1000 3 years ago
Hey Pulkit, Say I wanted to constrain quantization parameters to power of 2 values. Would that be supported?
@yoloswaggins2161 4 years ago
Can this be used for tensor cores on Nvidia GPUs or is it only for embedded devices?
@alias15vapour 3 years ago
By default it supports the TFLite Quantization spec. If you want to use it for Nvidia, you would have to write custom quantization configs specific to NVidia. But it absolutely can be done.
@yoloswaggins2161 3 years ago
@alias15vapour Thanks for the answer. Would that mean writing CUDA kernels for this, or could you wrap it with something higher level like TensorRT?
@alias15vapour 3 years ago
@yoloswaggins2161 You wouldn't need to write any kernels. You would just need to arrange the TF graph so that it emulates the quantization on Nvidia chips. It would just reuse their existing kernels. It is possible to use TensorRT, but you would need to know the deep internals of TensorRT to construct the graph correctly.
@yoloswaggins2161 3 years ago
@alias15vapour I see, thank you.
@morekaccino 4 years ago
I can't hear anything
@andresfernandoaranda5498 4 years ago
same
@anishdeepak1826 2 years ago
I have trained an ssd_mobilenet_v2 model using the Object Detection API and saved the model as a .pb file. How do I apply quantization to my model? I don't have an .h5 file.
@gauravsingh-jm6dk 4 months ago
For PTQ (post-training quantization): if you are doing it on a GPU, use TensorRT; if you are doing it on an Intel CPU, use OpenVINO. If you want to do QAT (quantization aware training), you can refer to the tensorflow_model_optimization library, and if you have a GPU you can also use the Nvidia Toolkit for quantization aware training.
@rupeshmohanasundaram6718 10 months ago
Does QAT work for object detection? If so, how?
@sreeragm8366 4 years ago
Are there any scenarios in which quantization shouldn't be done? For example, in case I want to convert the model to other formats that support optimization, such as TensorRT.
@alias15vapour 3 years ago
That depends on your needs. If you want to use TensorRT for optimization that works fine as well. Quantization is useful if performance is a concern for you.
@PremKumar-qi3cd 4 years ago
When I try to post-quantize (int8) a SimpleRNN model for time series data, it throws an error saying only a single graph is supported. So do RNNs and LSTMs support quantization and conversion to tflite models? And if yes, how can I address the error? Thanks in advance. :)
@raisaalphonse4094 3 years ago
I'm using QAT on a functional model only, but quantize_model raises: ValueError: `to_quantize` can only either be a tf.keras Sequential or Functional model. I'm not really sure why I'm getting this error. Could anyone please help me out with this?
@gauravsingh-jm6dk 4 months ago
If you run model.summary(), you will probably see a layer that contains sub-layers: a Keras Model declared as a class and acting as a single layer. That is what this error is about. Build a proper functional model; only then can you use QAT.
@nataliameira2283 4 years ago
Documentation → goo.gle/2WMUZze ---> ERROR (Sorry, we couldn't find that page.)
@sairamvarma6208 4 years ago
The Github link in the description doesn't work
@alias15vapour 3 years ago
Sorry about that, there's a typo. Just use the link below.
@ramamunireddyyanamala973 2 years ago
Very good Sir
@travelsome 4 years ago
Waiting for a video for sequential modelling
@rushikeshgandhmal 4 years ago
Hey, how should I start learning deep learning? Could you suggest something?
@gokulakrishnanm 1 year ago
@rushikeshgandhmal how's your learning journey 🎉
@bryanlozano8905 3 years ago
Bruh, is someone weed-whacking outside?
@alias15vapour 3 years ago
Unfortunately, yes. They started that the moment I started recording :(