Update README.md
#13 by puneeshkhanna - opened
README.md CHANGED
@@ -105,6 +105,7 @@ vllm serve tiiuae/Falcon-H1R-7B \
 Additional flags:

 * You can reduce `--max-model-len` to preserve memory. Default value is `262144` which is quite large but not necessary for most scenarios.
+* For function calling, append `--enable-auto-tool-choice` and `--tool-call-parser hermes` to the vllm serve command.


 vLLM client execution code:
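As a rough illustration of how the flags touched by this change combine, here is a minimal Python sketch that assembles a `vllm serve` argument list. The helper name `build_serve_command` and the `32768` context length are hypothetical and chosen only for the example. The README's full serve command is truncated in this diff, so any other flags it passes are not reproduced here.

```python
import shlex


def build_serve_command(model, max_model_len=None, enable_tools=False):
    """Assemble a `vllm serve` argument list (hypothetical helper, not part of vLLM)."""
    cmd = ["vllm", "serve", model]
    if max_model_len is not None:
        # Lowering this below the 262144 default mentioned in the README saves memory.
        cmd += ["--max-model-len", str(max_model_len)]
    if enable_tools:
        # The two flags this PR documents for function calling.
        cmd += ["--enable-auto-tool-choice", "--tool-call-parser", "hermes"]
    return cmd


if __name__ == "__main__":
    # 32768 is an illustrative value, not a recommendation from the README.
    command = build_serve_command(
        "tiiuae/Falcon-H1R-7B", max_model_len=32768, enable_tools=True
    )
    print(shlex.join(command))
```

The printed string can be pasted into a shell, or the list can be passed to `subprocess.run` directly, which avoids quoting issues.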