NVIDIA has released HelpSteer2, an open-source dataset designed to train state-of-the-art reward models for aligning LLMs with human preferences. Permissively licensed under CC BY 4.0, the dataset contains 10,681 prompt-response pairs, each annotated across five attributes (helpfulness, correctness, coherence, complexity, and verbosity) on a five-point Likert scale by over 1,000 US-based annotators.
When used to train a reward model on top of NVIDIA's Nemotron-4-340B base model, HelpSteer2 achieves a state-of-the-art 92.0% accuracy on RewardBench's primary dataset, outperforming all other open and proprietary models as of June 12, 2024.
It is highly data-efficient, requiring only 10,000 response pairs compared to the millions used in other preference datasets, thus significantly reducing computational costs.
The dataset enables training reward models that can effectively align large language models like Llama 3 70B to match or exceed the performance of models such as Llama 3 70B Instruct and GPT-4 on major alignment metrics. The paper also introduces SteerLM 2.0, a novel model alignment approach that leverages multi-attribute reward predictions to train LLMs on complex, multi-requirement instructions.
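To make the multi-attribute idea concrete, the sketch below shows one way a five-attribute reward model's outputs could be collapsed into a single scalar for ranking candidate responses. The attribute names follow HelpSteer2's annotation schema, but the weights and the `scalar_reward` helper are purely illustrative assumptions, not the aggregation used by NVIDIA.

```python
# Illustrative sketch: combining HelpSteer2-style attribute scores
# (helpfulness, correctness, coherence, complexity, verbosity; each 0-4)
# into one scalar reward. The weights below are hypothetical, chosen only
# to show the mechanics; they are not NVIDIA's coefficients.

ATTRIBUTES = ["helpfulness", "correctness", "coherence", "complexity", "verbosity"]

# Hypothetical weights: emphasize helpfulness and correctness, lightly
# penalize verbosity. A real setup would tune these on validation data.
WEIGHTS = {
    "helpfulness": 1.0,
    "correctness": 0.8,
    "coherence": 0.3,
    "complexity": 0.1,
    "verbosity": -0.2,
}

def scalar_reward(attribute_scores: dict[str, float]) -> float:
    """Collapse per-attribute predictions into a single reward signal."""
    return sum(WEIGHTS[a] * attribute_scores[a] for a in ATTRIBUTES)

# Example: scores a multi-attribute reward model might emit for one response.
example = {"helpfulness": 3.6, "correctness": 3.9, "coherence": 4.0,
           "complexity": 1.2, "verbosity": 2.5}
print(scalar_reward(example))  # weighted sum used to rank candidate responses
```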
“High-quality preference data is crucial for aligning AI systems with human values, but existing datasets are often proprietary or of inconsistent quality,” said Zhilin Wang, senior research scientist at NVIDIA. “HelpSteer2 provides an open, permissively licensed alternative for both commercial and academic use.”
The HelpSteer2 dataset is available on the Hugging Face Hub, and the code is open-sourced in NVIDIA’s NeMo-Aligner GitHub repository.
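For readers who want to inspect the data directly, the snippet below loads it with the Hugging Face `datasets` library. It assumes the dataset is published under the `nvidia/HelpSteer2` repository id with a `train` split and per-attribute label columns; check the dataset card for the exact schema before relying on these names.

```python
# Minimal sketch: loading HelpSteer2 from the Hugging Face Hub.
# The repository id "nvidia/HelpSteer2" and the field names below are
# assumptions to verify against the dataset card.
from datasets import load_dataset

ds = load_dataset("nvidia/HelpSteer2", split="train")

# Each record pairs a prompt/response with five Likert-scale attribute labels.
row = ds[0]
print(row["prompt"][:80])
print({k: row[k] for k in ("helpfulness", "correctness", "coherence",
                           "complexity", "verbosity")})
```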
Sentiment Analysis Datasets
HelpSteer2 trains and guides models to behave in ways that people prefer. Beyond preference data, there are many sentiment analysis datasets with applications in various fields, helping enterprises accurately understand and learn from their clients or customers.
Some examples include Amazon product data, the Multi-Domain Sentiment Dataset, and Sentiment140.