What about latencies?

by LorenzoCevolaniAXA - opened Apr 17, 2024

Do you have a latency benchmark comparing the full Mixtral on a 48xlarge against the Medusa-modified Mixtral AWQ here on a 12xlarge?
