Presented by
AIFoundry.org
In-person and virtual community events of AIFoundry.org

Data-Free Quantization-Aware Training Research Project Weekly Stand-up

About Event

Quantization is a technique applied to reduce the size of Large Language Models, and of machine learning models in general. It reduces the precision of the weights and thus the size of the data, allowing the model to fit more easily in the memory of the system running inference.
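For illustration only (this is a minimal sketch, not code from the project), the idea is to store weights as low-precision integers plus a scale factor instead of 32-bit floats:

```python
# Minimal sketch of symmetric int8 weight quantization: keep only the
# int8 values plus one float scale per tensor, roughly a 4x size reduction.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Quantize a float32 weight tensor to int8 with a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0      # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original weights for inference."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```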

Most quantized models you can find on Hugging Face are created by a technique called Post-Training Quantization, or PTQ for lack of imagination. It does pretty well.

Another technique is Quantization-Aware Training, or QAT. In other words, you train your model with the understanding that it will later be quantized. This is expensive, since it requires additional training.
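Again as a rough sketch (not the project's method), QAT is usually done by inserting a "fake quantization" step into the forward pass during training, so the loss already reflects the precision the model will have after quantization:

```python
# Sketch of fake quantization: weights are rounded to the quantized grid
# on the forward pass, but kept in float storage so training can continue.
import numpy as np

def fake_quantize(weights: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Simulate low-precision weights while keeping float values for training."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(weights).max() / qmax
    return np.clip(np.round(weights / scale), -qmax, qmax) * scale

w = np.random.randn(4, 4).astype(np.float32)
print(fake_quantize(w, num_bits=4))
```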

But what if there's a way we could achieve Quantization-Aware Training without data?

That is what this open research project is investigating.

Check out the proposal on our GitHub here: https://github.com/aifoundry-org/.github/wiki/Proposal-on-QAT-LLM-quantization

Drop by our weekly stand-up meeting at the AIFoundry.org Discord Server to learn more and see how you can participate.
