#gemma-3n

from InfoQ
3 weeks ago

Gemma 3n Introduces Novel Techniques for Enhanced Mobile AI Inference

Gemma 3n uses Per-Layer Embeddings (PLE) to reduce accelerator memory usage: only the core transformer weights are loaded into VRAM, while the per-layer embedding parameters stay on the CPU.
Mobile UX