Overview

This SpaRTA adapter specializes the google/gemma-2b-it instruction-following model for sentiment classification of English sentences.

Adapter Description

  • PEFT method: SpaRTA @ 99.8% sparsity

  • Base Model: google/gemma-2b-it

  • Task: Text Classification (Sentiment Analysis)

  • Language: English

Inputs and Outputs

  • Input:

    A text string containing the sentence to be classified as having positive or negative sentiment, wrapped in an instruction and formatted with the model's (google/gemma-2b-it) chat template as follows:

    
    input_template = ("<start_of_turn>user\n"
                      "Determine the sentiment of the following sentence about a movie. "
                      "The sentiment can only be classified as positive or negative.\n"
                      "Sentence: {sentence}"
                      "<end_of_turn>\n<start_of_turn>model\n"
                      "The sentiment of the sentence is")
    
    sentence = "I loved the movie. It was great."
    
    model_input = input_template.format(sentence=sentence)
                
    print(model_input)
    
    <start_of_turn>user
    Determine the sentiment of the following sentence about a movie. The sentiment can only be classified as positive or negative.
    Sentence: I loved the movie. It was great.<end_of_turn>
    <start_of_turn>model
    The sentiment of the sentence is
    

    We need to use this input template because the adapter was trained with it.

  • Output:

    One of two tokens representing the sentiment class of the input: token id 0 for negative sentiment and 1 for positive (see the sketch below).
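
    Concretely, the two class ids correspond to the answer words "negative" and "positive". As a small illustration of this convention (a hypothetical sketch, not the peft-sparta API), the matching Gemma token ids can be looked up like this, assuming each answer word tokenizes to a single token:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")

    # Class-id convention used by this card.
    id2label = {0: "negative", 1: "positive"}

    # Token ids of the two answer words; the leading space matters for
    # SentencePiece vocabularies such as Gemma's.
    answer_token_ids = {
        cls: tokenizer.encode(" " + label, add_special_tokens=False)[0]
        for cls, label in id2label.items()
    }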

How to Use

For instructions on how to load and use this adapter to classify input sentences, see https://pypi.org/project/peft-sparta/.
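
The linked package documents the exact loading API. Purely as a hedged sketch, assuming the adapter can be attached with the generic peft call PeftModel.from_pretrained (the actual peft-sparta interface may differ), classification could look like this:

    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: generic peft loading; consult the peft-sparta docs for the real call.
    tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
    base = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")
    model = PeftModel.from_pretrained(base, "jesusriosal/sparta-gemma_2b_it-sst2")
    model.eval()

    input_template = ("<start_of_turn>user\n"
                      "Determine the sentiment of the following sentence about a movie. "
                      "The sentiment can only be classified as positive or negative.\n"
                      "Sentence: {sentence}"
                      "<end_of_turn>\n<start_of_turn>model\n"
                      "The sentiment of the sentence is")

    prompt = input_template.format(sentence="I loved the movie. It was great.")
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]

    # Read the class off the next-token logits of the two answer words
    # (0 = negative, 1 = positive, as described above).
    neg_id = tokenizer.encode(" negative", add_special_tokens=False)[0]
    pos_id = tokenizer.encode(" positive", add_special_tokens=False)[0]
    print(int(next_token_logits[pos_id] > next_token_logits[neg_id]))  # expected: 1 (positive)

If the adapter instead requires the package's own loader, only the two loading lines should change; the prompt formatting and logit readout stay the same.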

Training Details

Training Procedure

The adapter was trained on the SST-2 dataset at 99.8% sparsity; that is, 99.8% of the Gemma-2B-IT model parameters were frozen and only the remaining 0.2% were trained. The trainable parameters were chosen at random from the self-attention value (Wv) and output (Wo) projection matrices, giving a total of approximately 5 million trainable parameters. We used a handcrafted instruction and the model's (google/gemma-2b-it) chat template to process the raw inputs (text sentences) in the training set; see the model input template above for details. The selection scheme is sketched below.
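
To make the selection concrete, here is a minimal PyTorch sketch of the general idea (not the peft-sparta implementation): freeze every parameter, then re-enable updates for a fixed random subset of entries in the value and output projections by masking out the gradients of all other entries. The module names (v_proj, o_proj) follow the Hugging Face Gemma implementation, and density stands for whatever within-matrix fraction yields the reported ~5 million trainable entries.

    import torch
    from torch import nn

    def sparsify_(model: nn.Module, targets=("v_proj", "o_proj"),
                  density=0.002, seed=0):
        """Freeze all parameters, then allow updates on a random `density`
        fraction of the entries of the targeted projection matrices."""
        gen = torch.Generator().manual_seed(seed)
        for param in model.parameters():
            param.requires_grad = False
        for name, param in model.named_parameters():
            if name.endswith(".weight") and any(t in name for t in targets):
                # Fixed random binary mask selecting the trainable entries.
                mask = (torch.rand(param.shape, generator=gen) < density)
                mask = mask.to(device=param.device, dtype=param.dtype)
                param.requires_grad = True
                # Zero the gradients of frozen entries on every backward pass.
                param.register_hook(lambda grad, m=mask: grad * m)

With optimizers that apply no weight decay to these tensors, entries whose gradients are always zero never move, so only the masked subset is effectively trained.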

Training Data

The Gemma 2B IT model was fine-tuned with SpaRTA on the SST-2 (Stanford Sentiment Treebank) dataset. We used 66,349 examples of the training split for training and held out the remainder for validation; the official validation split (872 examples) served as the test set and was never seen during training.

Intended Use

Binary sentiment classification (positive/negative).

Performance Evaluation (on Test Set)

  • Balanced accuracy: 96.0%
  • Per class accuracy:
    • negative sentiment: 96.7%
    • positive sentiment: 95.3%
  • MCC: 0.920
  • F1-score: 0.960