AI on the Edge: How to Run Neural Networks on a Smartphone
Aleksei Naumov, Lead AI Engineer at Terra Quantum, will host an exclusive workshop for the Global Talent Club on deploying high-performance AI models directly on user devices, such as smartphones with limited memory (6-12 GB).
Terra Quantum is one of Europe’s top deep-tech companies and has secured over $100M in funding.
Aleksei has led numerous research projects focused on compressing large language models (LLMs) and deploying neural networks on devices with constrained computational resources. His team’s latest work, TQCompressor, was presented at IEEE MIPR 2024 (California, USA), a leading conference on AI applications for consumer multimedia and efficient AI that brings together experts from NVIDIA, Google, Amazon, UC Berkeley, and other industry leaders.
Why on-device AI is important:
- Running AI on-device leads to significant reductions in cloud computing costs
- Processing data locally on the user’s smartphone keeps it private and secure
Workshop agenda:
- Language model compression methods for mobile devices: quantization, distillation, pruning, and matrix decompositions
- How to improve the performance of sub-billion-parameter LLMs on smartphones. Discussion of recent studies, MobileLLM (Meta) and TQCompressor (Terra Quantum): layer sharing and in-depth model scaling
- The future of on-device AI: developing new models for smartphones versus adapting existing ones
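To give a feel for the first agenda item, here is a minimal Python sketch of why quantization matters on a phone with 6-12 GB of memory. It is not taken from the workshop material or from TQCompressor. The function names (`weight_memory_gb`, `quantize_int8`, `dequantize`) and the sample weights are hypothetical, and the scheme shown is plain symmetric per-tensor int8 quantization, the simplest variant of the technique.

```python
def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integers and the shared scale."""
    return [v * scale for v in q]


if __name__ == "__main__":
    # Weight footprint of a hypothetical 1-billion-parameter model.
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: {weight_memory_gb(1_000_000_000, bits)} GB")

    # Round trip on a few illustrative (made-up) weights.
    weights = [0.5, -1.27, 0.0, 0.31]
    q, scale = quantize_int8(weights)
    print(q, dequantize(q, scale))
```

Under these assumptions, a 1-billion-parameter model needs about 2 GB for 16-bit weights, 1 GB for 8-bit, and 0.5 GB for 4-bit, before counting activations or the rest of the app. Each dequantized weight differs from the original by at most half the scale.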
The workshop will be held in Russian.