# LLaMA 2 7B Chat Fine-Tuned
This is a fine-tuned version of the LLaMA 2 7B model, tuned for chat-based tasks to improve the quality of its conversational responses.
## Model Details
- Model Name: LLaMA 2 7B Chat Fine-Tuned
- Base Model: NousResearch/Llama-2-7b-chat-hf
- Architecture: LlamaForCausalLM
- Tokenization: Supported
  - `pad_token_id`: 0
  - `bos_token_id`: 1
  - `eos_token_id`: 2
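These special-token IDs determine how a prompt is framed when a sequence is built by hand: BOS at the front, EOS at the end, and PAD filling the remainder of a fixed-length batch. A minimal pure-Python sketch (`frame_and_pad` and `attention_mask` are illustrative helpers, not part of the tokenizer API):

```python
BOS_TOKEN_ID = 1  # bos_token_id from the tokenizer config
EOS_TOKEN_ID = 2  # eos_token_id
PAD_TOKEN_ID = 0  # pad_token_id

def frame_and_pad(token_ids, max_length):
    """Wrap raw token IDs in BOS/EOS and right-pad to max_length."""
    seq = [BOS_TOKEN_ID] + list(token_ids) + [EOS_TOKEN_ID]
    if len(seq) > max_length:
        raise ValueError("sequence longer than max_length")
    return seq + [PAD_TOKEN_ID] * (max_length - len(seq))

def attention_mask(seq):
    """1 for real tokens, 0 for padding positions."""
    return [0 if t == PAD_TOKEN_ID else 1 for t in seq]
```

For example, `frame_and_pad([5, 6, 7], 8)` yields `[1, 5, 6, 7, 2, 0, 0, 0]`.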
## Supported Tasks
This model supports the following task:
- Text Generation
## Configuration
Model Configuration (`config.json`):
- Hidden Size: 4096
- Number of Layers: 32
- Number of Attention Heads: 32
- Vocab Size: 32000
- Token Type IDs: Not used
Generation Configuration (`generation_config.json`):
- Sampling Temperature: 0.9
- Top-p (nucleus sampling): 0.6
- Pad Token ID: 32000
- Bos Token ID: 1
- Eos Token ID: 2
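The two sampling knobs above work together: logits are first divided by the temperature, then sampling is restricted to the smallest set of tokens whose cumulative probability reaches top-p. A self-contained sketch of that procedure, using this model's defaults (an illustration of nucleus sampling, not the Hugging Face implementation):

```python
import math
import random

def sample_top_p(logits, temperature=0.9, top_p=0.6, rng=random):
    """Draw one token index via temperature-scaled nucleus (top-p) sampling."""
    # Temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of tokens (sorted by probability) whose
    # cumulative mass reaches top_p -- the "nucleus".
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cum = [], 0.0
    for i in order:
        nucleus.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalize over the nucleus and sample from it.
    mass = sum(probs[i] for i in nucleus)
    r = rng.random() * mass
    for i in nucleus:
        r -= probs[i]
        if r <= 0:
            return i
    return nucleus[-1]
```

With a sharply peaked distribution the nucleus collapses to a single token, so the output becomes deterministic; a lower top-p therefore trades diversity for focus.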
## Usage
To use this model for text generation via the Hugging Face API, use the following Python code snippet:
```python
import requests

api_url = "https://api-inference.huggingface.co/models/rahul77/llama-2-7b-chat-finetune"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",  # Replace with your Hugging Face API token
    "Content-Type": "application/json",
}

data = {
    "inputs": "What is a large language model?",
    "parameters": {
        "max_length": 50
    }
}

response = requests.post(api_url, headers=headers, json=data)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
    print(response.json())
```
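The request above pins only `max_length`; the Inference API also accepts sampling parameters, so a request can mirror the defaults from `generation_config.json`. A small helper sketching that (`max_new_tokens`, `temperature`, `top_p`, and `do_sample` are standard text-generation parameters; `build_payload` itself is a hypothetical convenience function):

```python
def build_payload(prompt, max_new_tokens=50, temperature=0.9, top_p=0.6):
    """Build an Inference API payload whose sampling parameters
    mirror this model's generation_config.json defaults."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "do_sample": True,  # enable sampling so temperature/top_p take effect
        },
    }
```

Passing this dict as `json=build_payload("What is a large language model?")` keeps API calls consistent with the model's configured generation behavior.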
## License

This model is released under the Apache 2.0 license.