LLaMA 2 7B Chat Fine-Tuned

This is a fine-tuned version of the LLaMA 2 7B chat model, adapted to produce better conversational responses in chat-style text generation.

Model Details

  • Model Name: LLaMA 2 7B Chat Fine-Tuned
  • Base Model: NousResearch/Llama-2-7b-chat-hf
  • Architecture: LlamaForCausalLM
  • Special Token IDs:
    • pad_token_id: 0
    • bos_token_id: 1
    • eos_token_id: 2
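As a sketch of how these special-token IDs come into play, the snippet below (plain Python, no model download; the example token IDs other than the special ones are made up for illustration) builds the attention mask the model expects, masking out padding positions:

```python
# Special-token IDs from the tokenizer configuration above.
PAD_TOKEN_ID = 0
BOS_TOKEN_ID = 1
EOS_TOKEN_ID = 2

def attention_mask(token_ids):
    """Return 1 for real tokens and 0 for padding positions."""
    return [0 if t == PAD_TOKEN_ID else 1 for t in token_ids]

# A BOS-prefixed sequence, right-padded to length 6.
ids = [BOS_TOKEN_ID, 306, 626, 263, EOS_TOKEN_ID, PAD_TOKEN_ID]
print(attention_mask(ids))  # -> [1, 1, 1, 1, 1, 0]
```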

Supported Tasks

This model supports the following task:

  • Text Generation

Configuration

  • Model Configuration (config.json):

    • Hidden Size: 4096
    • Number of Layers: 32
    • Number of Attention Heads: 32
    • Vocab Size: 32000
    • Token Type IDs: Not used
  • Generation Configuration (generation_config.json):

    • Sampling Temperature: 0.9
    • Top-p (nucleus sampling): 0.6
    • Pad Token ID: 32000
    • Bos Token ID: 1
    • Eos Token ID: 2
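A quick sanity check on the architecture numbers above: with 32 attention heads over a hidden size of 4096, each head operates on a 128-dimensional slice, and the input embedding table is vocab_size × hidden_size. A minimal sketch in plain Python (the per-head split is the standard LlamaForCausalLM layout):

```python
# Values from config.json above.
hidden_size = 4096
num_layers = 32
num_heads = 32
vocab_size = 32000

# The hidden state is split evenly across attention heads.
assert hidden_size % num_heads == 0
head_dim = hidden_size // num_heads
print(head_dim)  # -> 128

# Input embedding parameter count: one hidden_size vector per vocab entry.
print(vocab_size * hidden_size)  # -> 131072000
```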

Usage

To use this model for text generation via the Hugging Face Inference API, use the following Python snippet:

```python
import requests

api_url = "https://api-inference.huggingface.co/models/rahul77/llama-2-7b-chat-finetune"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",  # Replace with your Hugging Face API token
    "Content-Type": "application/json",
}

data = {
    "inputs": "What is a large language model?",
    "parameters": {
        "max_length": 50  # Upper bound on the generated sequence length
    },
}

response = requests.post(api_url, headers=headers, json=data)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
    print(response.json())
```
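On success, the Inference API's text-generation endpoint typically returns a JSON list of objects carrying a `generated_text` field. A small sketch of pulling the text out of such a payload (the sample below is illustrative, not a real API response):

```python
# Illustrative payload shaped like a typical text-generation response.
sample_response = [
    {"generated_text": "What is a large language model? A large language model is ..."}
]

def extract_text(payload):
    """Return the generated text from a text-generation API payload."""
    if isinstance(payload, list) and payload and "generated_text" in payload[0]:
        return payload[0]["generated_text"]
    raise ValueError(f"Unexpected response shape: {payload!r}")

print(extract_text(sample_response))
```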


License

  • apache-2.0