Microsoft presents FP6-LLM
Efficiently serving large language models through FP6-centric algorithm-system co-design.
Source link