Word Count Precision issues with Qwen/Qwen2.5-72B-Instruct: Managing Overlength Responses

#17
by VenkateshNestor - opened

I asked Qwen/Qwen2.5-72B-Instruct to generate a 100-150 word intro for an essay with specific headings and subheadings. Although the prompt set a clear word count limit, the model returned responses of around 250-300 words.

Even after I asked for a rewrite of 50-60 words with the same structure, the model still exceeded the limit. Qwen struggles with word count precision, which makes it difficult to enforce strict response length requirements.

Any thoughts on this?

Qwen org

I tried this with vLLM, Qwen2.5-72B-Instruct, and the default sampling parameters from generation_config.json. The result is in the attached screenshot:
[image.png]

Any cases you could share?
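When sharing a case, it may help to include a measured word count alongside the prompt and the response, so the numbers can be compared directly. Below is a minimal sketch of one way to do that. The helper names `count_words` and `check_length` are hypothetical, and the definition of a "word" (a whitespace-separated token containing at least one letter or digit, so that bare markdown markers such as `##` or `-` are skipped) is an assumption; other word counters may give slightly different totals.

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated tokens that contain a letter or digit.

    Assumption: tokens made only of punctuation (e.g. '##', '-', '*')
    are treated as formatting and are not counted as words.
    """
    return sum(1 for token in text.split() if re.search(r"\w", token))


def check_length(text: str, min_words: int, max_words: int) -> dict:
    """Report the word count and whether it falls inside [min_words, max_words]."""
    n = count_words(text)
    return {
        "words": n,
        "within_limit": min_words <= n <= max_words,
        "excess": max(0, n - max_words),  # how many words over the upper limit
    }


if __name__ == "__main__":
    # Paste the model output here; this short string is only a placeholder.
    response = "## Introduction\n\nThis is a placeholder response."
    print(check_length(response, min_words=100, max_words=150))
```

Posting the prompt, the sampling parameters, and this kind of count for each response would make it easier to tell whether the difference comes from the model or from the serving setup.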

If you're looking for an easy way to access this model via API, you can use Crazyrouter, which provides an OpenAI-compatible endpoint for 600+ models including this one. Just pip install openai and change the base URL.
