Data-Free Quantization-Aware Training Research Project Weekly Stand-up
Quantization is a technique for reducing the size of machine learning models in general, and Large Language Models in particular. It lowers the numerical precision of the weights, shrinking the amount of data so the model fits more easily in the memory of the system running inference.
Most quantized models you can find on Hugging Face are created with a technique called Post-Training Quantization, or PTQ for lack of imagination. It does pretty well.
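Roughly speaking, PTQ takes the weights of an already-trained model and rounds them onto a coarser grid. Here is a minimal sketch of the idea using symmetric int8 round-to-nearest; the function names and the per-tensor scaling are illustrative assumptions, not the scheme used by any particular model or by this project.

```python
# Sketch of post-training quantization: round trained float weights to int8.
# Illustrative only -- a single per-tensor scale, symmetric around zero.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float weights onto int8 levels using a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0            # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for use at inference time."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)         # stand-in for trained weights
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max rounding error:", np.abs(w - w_hat).max())
```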
Another technique is Quantization-Aware Training, or QAT: in other words, train the model with the understanding that it will later be quantized. This is expensive, since it requires additional training, and that training in turn requires data.
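A common way to do QAT is to "fake-quantize" the weights in the forward pass while keeping a full-precision copy for the gradient updates (the straight-through estimator). The sketch below is an assumption about how such a layer might look in PyTorch, not this project's actual method; the names and the int8 scheme are made up for illustration.

```python
# Sketch of quantization-aware training: the forward pass sees fake-quantized
# weights, while gradients flow to the full-precision copy via the
# straight-through estimator. Illustrative only.
import torch
import torch.nn as nn

def fake_quant(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward uses w_q, backward treats it as identity.
    return w + (w_q - w).detach()

class QATLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ fake_quant(self.weight).t() + self.bias

# One toy training step: the model learns weights that survive quantization,
# which is exactly the part that normally needs training data.
layer = QATLinear(16, 4)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, target = torch.randn(8, 16), torch.randn(8, 4)
loss = nn.functional.mse_loss(layer(x), target)
loss.backward()
opt.step()
```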
But what if we could achieve Quantization-Aware Training without data?
That is what this open research project is investigating.
Check out the proposal on our GitHub here: https://github.com/aifoundry-org/.github/wiki/Proposal-on-QAT-LLM-quantization
Drop by our weekly stand-up meeting at the AIFoundry.org Discord Server to learn more and see how you can participate.