OWQ: Lessons learned from activation outliers for weight quantization in large language models. (arXiv:2306.02272v3 [cs.CL] UPDATED)



Large language models (LLMs) with hundreds of billions of parameters require powerful server-grade GPUs for inference, limiting their practical deployment. The paper's proposed method, outlier-aware weight quantization (OWQ), reduces this footprint by keeping the small set of weight columns most sensitive to quantization error (those tied to activation outliers) in high precision, while quantizing the remaining dense weights to low bit-width.
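To make the mechanism concrete, here is a minimal, self-contained Python/NumPy sketch of the general outlier-aware idea the title refers to. It is not the paper's exact OWQ algorithm: the sensitivity proxy (per-channel activation magnitude), the 3-bit round-to-nearest quantizer, and every function name here are illustrative assumptions.

```python
import numpy as np

def quantize_rtn(w: np.ndarray, bits: int = 3) -> np.ndarray:
    """Per-input-channel round-to-nearest uniform quantization,
    returned in dequantized form for readability."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=0) / qmax          # one scale per column
    scale = np.where(scale == 0, 1.0, scale)      # avoid divide-by-zero
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def owq_like_quantize(w, act_scale, bits=3, n_outlier_cols=8):
    """Mixed-precision sketch: weight columns fed by outlier-heavy
    input channels stay in full precision; the rest are quantized."""
    # Crude sensitivity proxy (assumption): typical activation magnitude
    # of each input channel. The actual OWQ method uses a more
    # principled, Hessian-informed sensitivity measure.
    outlier_cols = np.argsort(act_scale)[-n_outlier_cols:]
    w_hat = quantize_rtn(w, bits=bits)
    w_hat[:, outlier_cols] = w[:, outlier_cols]   # keep outliers unquantized
    return w_hat, outlier_cols

# Toy usage: a 256x512 linear layer with two outlier input channels.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512)).astype(np.float32)
act_scale = rng.random(512).astype(np.float32)
act_scale[[3, 97]] = 50.0                         # simulated activation outliers
W_hat, kept = owq_like_quantize(W, act_scale)
print("columns kept in full precision:", np.sort(kept))
```

Because only a handful of columns are stored in full precision, the memory overhead of the mixed-precision format stays small, which is the intuition behind approaching full-precision accuracy at 3-4 bits per weight.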


