Paper page — FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design



Microsoft presents FP6-LLM: efficiently serving large language models through FP6-centric algorithm-system co-design.


