Pintora-Coder-7B

Introduction

Pintora-Coder-7B is a fine-tuned version of Qwen2.5-Coder-7B that adds support for the Pintora diagram language.

The model supports two tasks:

  1. Generating diagrams from scratch.
  2. Editing existing diagrams.

Training Details

The model was trained in two stages:

  1. Continued pretraining on the pintora-instruct dataset.
  2. Instruction fine-tuning on the pintora-edit-instruct dataset.
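
The card does not publish the exact training configuration. As a rough, hypothetical sketch only, the instruction fine-tuning stage could look like the following with TRL's SFTTrainer; the dataset path, expected columns, and hyperparameters are assumptions, not the author's actual setup.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical Hub path; the card only names the dataset "pintora-edit-instruct".
train_data = load_dataset("huytd189/pintora-edit-instruct", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-7B",    # base model named in the card
    train_dataset=train_data,          # assumed to provide a "text" or chat-style column
    args=SFTConfig(output_dir="pintora-coder-7b-sft"),
)
trainer.train()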

Examples

Example 1: Generate diagram from scratch

Instruction:

Create a component diagram for a platform to buy and sell games between users.

Output code:

componentDiagram

package "Services" {
    [Game Service]
    [User Service]
}
database "Storage" {
    [Games Table]
    [Users Table]
}
[Game Service] --> [Games Table]
[User Service] --> [Users Table]
[Game Service] --> [User Service] : for authentication

Preview: rendered component diagram (image not included here).

Example 2: Edit existing diagram

Instruction:

Add shopping cart and checkout service to this component diagram

componentDiagram

package "Services" {
    [Game Service]
    [User Service]
}
database "Storage" {
    [Games Table]
    [Users Table]
}
[Game Service] --> [Games Table]
[User Service] --> [Users Table]
[Game Service] --> [User Service] : for authentication

Output code:

componentDiagram
package "Services" {
    [Game Service]
    [User Service]
    [Shopping Cart Service]
    [Checkout Service]
}
database "Storage" {
    [Games Table]
    [Users Table]
    [Cart Items Table]
    [Orders Table]
}
[Game Service] --> [Games Table]
[User Service] --> [Users Table]
[Shopping Cart Service] --> [Cart Items Table]
[Checkout Service] --> [Cart Items Table]
[Checkout Service] --> [Orders Table]
[Game Service] --> [User Service] : for authentication
[Shopping Cart Service] --> [User Service] : for authentication
[Checkout Service] --> [User Service] : for authentication

Preview: rendered component diagram with the added services (image not included here).

Running the Model

The snippet below loads the model with Transformers and runs both examples: generating a diagram from scratch and editing an existing one.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
model_name = "huytd189/pintora-coder-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")

# Prompt template: the three fields are the instruction, the existing diagram
# (empty when generating from scratch), and the response (left empty at inference)
edit_prompt = """Pintora Diagram Edit Instruction

### Instruction:
{}

{}

### Response:
{}"""

# Example 1: Generate from scratch
inputs = tokenizer([
    edit_prompt.format(
        "Create a component diagram for a platform to buy and sell games between users.",
        "",
        ""
    )
], return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])

print("\n" + "="*80 + "\n")

# Example 2: Edit existing diagram
inputs = tokenizer([
    edit_prompt.format(
        "Add shopping cart and checkout service to this component diagram",
        """componentDiagram

package "Services" {
    [Game Service]
    [User Service]
}
database "Storage" {
    [Games Table]
    [Users Table]
}
[Game Service] --> [Games Table]
[User Service] --> [Users Table]
[Game Service] --> [User Service] : for authentication""",
        ""
    )
], return_tensors="pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
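
The decoded output contains the full prompt followed by the model's completion. A small post-processing step, not part of the original example, can isolate just the generated Pintora code by splitting on the response marker used in the template:

# Keep only the text after the final "### Response:" marker
full_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
diagram_code = full_text.split("### Response:")[-1].strip()
print(diagram_code)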