Projects I've built or joined. Give them a star if you like them!
This project converts the Qwen3 model to INT8 format to speed up inference and significantly reduce GPU memory consumption, making the model more efficient to deploy.
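To make the idea of INT8 conversion concrete, here is a minimal sketch of symmetric per-tensor "absmax" quantization in plain Python. It is an illustration only, not this project's actual conversion code: the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, and a real Qwen3 conversion would operate on model tensors with a quantization library. The point it shows is that each weight is stored as one 8-bit integer plus a shared scale, instead of a 16-bit or 32-bit float.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor absmax quantization to the range [-127, 127].

    Hypothetical helper for illustration; not taken from the project.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp into the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the INT8 values."""
    return [v * scale for v in q]


# Made-up sample weights, purely for demonstration.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

print(q)
print(restored)
```

The restored values differ from the originals by at most half a quantization step (`scale / 2`), which is the accuracy traded for the smaller storage size.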