
Pretrained Reward Model Classifier

Overview

This is a specialized binary classifier that evaluates text chunks and predicts whether they would be "Chosen" (A) or "Rejected" (B).

How It Works

  1. Text is split into consecutive chunks of exactly 64 tokens using the Qwen 2.5 tokenizer (see the sketch after this list)
  2. Each chunk is then evaluated in the context of the chunks that precede it, using the format described under Input Format
  3. Only token IDs 32 and 33 have non-zero weights in the LM head, so the model can only emit A (Chosen) or B (Rejected)
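
A minimal sketch of the chunking step, assuming the standard Hugging Face transformers tokenizer API; the helper name split_into_chunks is illustrative:

```python
from transformers import AutoTokenizer

# The base model's tokenizer (Qwen 2.5).
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

def split_into_chunks(text: str, chunk_size: int = 64) -> list[str]:
    """Split text into consecutive chunks of exactly `chunk_size` tokens
    (the final chunk may be shorter)."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    return [
        tokenizer.decode(ids[i:i + chunk_size])
        for i in range(0, len(ids), chunk_size)
    ]
```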

Input Format

The model expects input in this precise format:

```
[Original text from previous 64-token chunks]

<<JUDGEMENT_REGION>>
[Next 64-token chunk to evaluate]
<</JUDGEMENT_REGION>>

<<JUDGEMENT>>
```
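
As a sketch, assembling this prompt from the chunks produced above might look like the following; format_for_judgement is an illustrative helper, not part of the model's tooling:

```python
def format_for_judgement(context_chunks: list[str], candidate_chunk: str) -> str:
    """Wrap the next chunk in judgement tags, preceded by all prior chunks."""
    # Chunks are consecutive token spans, so plain concatenation
    # reconstructs the original context text.
    context = "".join(context_chunks)
    return (
        f"{context}\n\n"
        "<<JUDGEMENT_REGION>>\n"
        f"{candidate_chunk}\n"
        "<</JUDGEMENT_REGION>>\n\n"
        "<<JUDGEMENT>>"
    )
```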

Example

Original paragraph:

The city council meeting started promptly at 6 PM with all members present. Mayor Johnson opened by addressing concerns about the new parking regulations downtown. Citizens expressed both support and opposition during the public comment period. The council ultimately voted 4-2 to implement the regulations starting next month.

Formatted for prediction:

```
The city council meeting started promptly at 6 PM with all members present. Mayor Johnson opened by addressing concerns about the new parking regulations downtown.

<<JUDGEMENT_REGION>>
Citizens expressed both support and opposition during the public comment period. The council ultimately voted 4-2 to implement the regulations starting next month.
<</JUDGEMENT_REGION>>

<<JUDGEMENT>>
```

Output

The model predicts whether the chunk in the JUDGEMENT_REGION would be:

  • A: Chosen (preferred content)
  • B: Rejected (less preferred content)

The prediction is based on the relative probabilities between these two tokens only.
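
A sketch of reading the prediction, assuming standard transformers and PyTorch APIs and reusing the tokenizer loaded in the chunking sketch. The softmax is taken over the two logits only, matching the note above that only the relative probabilities of A and B matter:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Quest-AI/pretrain-rm-baseline-7b", torch_dtype=torch.bfloat16
)
model.eval()

A_TOKEN, B_TOKEN = 32, 33  # token IDs for "A" (Chosen) and "B" (Rejected)

def predict(prompt: str) -> tuple[float, float]:
    """Return (P(A), P(B)), normalized over the two tokens only."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits
    probs = torch.softmax(logits[[A_TOKEN, B_TOKEN]], dim=-1)
    return probs[0].item(), probs[1].item()
```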

Analysis

For practical use, per-chunk results should be aggregated by taking the mean of the log ratio between the two probabilities:

log_ratio = log(P(A) / P(B))

This log ratio approach provides a more stable and interpretable signal across multiple evaluations than using raw probabilities alone.
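
Putting the pieces together, a document-level score might be computed as follows, reusing the illustrative helpers sketched above:

```python
import math

def score_document(text: str, chunk_size: int = 64) -> float:
    """Mean log(P(A) / P(B)) over the document's chunks."""
    chunks = split_into_chunks(text, chunk_size)
    log_ratios = []
    # Scores every chunk after the first; whether the first chunk should
    # also be scored (with empty context) is an assumption left open here.
    for i in range(1, len(chunks)):
        prompt = format_for_judgement(chunks[:i], chunks[i])
        p_a, p_b = predict(prompt)
        log_ratios.append(math.log(p_a / p_b))
    return sum(log_ratios) / len(log_ratios) if log_ratios else 0.0
```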

Model Details

Model size: 7.62B parameters (BF16, Safetensors)
Base model: Qwen/Qwen2.5-7B