Together

company

Verified

https://together.ai

togethercompute

togethercomputer

Inference Provider

3,268,656 monthly requests

AI & ML interests

Foundation Models, Decentralized Computing, Open Source AI.

Recent Activity

juewang published a dataset 1 day ago

togethercomputer/aurora

zelc updated a dataset 1 day ago

togethercomputer/aurora

biyuan authored a paper 8 months ago

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Papers

Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking

Articles

Fine-tune Any LLM from the Hugging Face Hub with Together AI

juewang

published a dataset 1 day ago

togethercomputer/aurora

Viewer • Updated 1 day ago • 1.24M • 5 • 1

zelc

updated a dataset 1 day ago

togethercomputer/aurora

Viewer • Updated 1 day ago • 1.24M • 5 • 1

posted an update 8 months ago

🚀 Full-Quality Wan2.2 Video Generation on a single 24GB GPU — Powered by DFloat11

We just released the DFloat11 compressed Wan2.2 models. Now you can run full-quality Wan2.2 video generation on a single 24GB GPU, thanks to DFloat11 compression and CPU offloading.

🔗 Image-to-Video: DFloat11/Wan2.2-I2V-A14B-DF11
🔗 Text-to-Video: DFloat11/Wan2.2-T2V-A14B-DF11

authored a paper 11 months ago

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Paper • 2504.11651 • Published Apr 15, 2025 • 31

danielepaliotta

authored a paper over 1 year ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 42

authored a paper almost 2 years ago

Mixture-of-Agents Enhances Large Language Model Capabilities

Paper • 2406.04692 • Published Jun 7, 2024 • 59

authored a paper over 2 years ago

Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Paper • 2312.08361 • Published Dec 13, 2023 • 27

authored a paper over 2 years ago

FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores

Paper • 2311.05908 • Published Nov 10, 2023 • 14

authored a paper over 2 years ago

PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 55

authored a paper over 2 years ago

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Paper • 2302.06646 • Published Feb 13, 2023 • 2

updated a Space over 2 years ago

About Together

authored 3 papers almost 3 years ago

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

Paper • 2305.14342 • Published May 23, 2023

Anticipatory Music Transformer

Paper • 2306.08620 • Published Jun 14, 2023 • 10

SQuAD: 100,000+ Questions for Machine Comprehension of Text

Paper • 1606.05250 • Published Jun 16, 2016 • 4

updated a dataset almost 3 years ago

togethercomputer/RedPajama-Data-Instruct

Preview • Updated Jun 6, 2023 • 91 • 35

updated 2 models almost 3 years ago

togethercomputer/RedPajama-INCITE-7B-Base

Text Generation • Updated Jun 6, 2023 • 1.1k • 91

togethercomputer/RedPajama-INCITE-7B-Chat

Text Generation • Updated Jun 5, 2023 • 1.1k • 93

authored 3 papers almost 3 years ago

Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs

Paper • 2305.02440 • Published May 3, 2023 • 1

Truncation Sampling as Language Model Desmoothing

Paper • 2210.15191 • Published Oct 27, 2022 • 1

Lexinvariant Language Models

Paper • 2305.16349 • Published May 24, 2023 • 2