Primus
Paper • 2502.11191
Note: Start by reading the 🚀 Primus paper! To the best of our knowledge, we are the 🏄🏽‍♂️ first to release datasets covering cybersecurity pretraining, instruction fine-tuning (IFT), and reasoning distillation. We are also the first to pretrain an LLM on a large-scale cybersecurity corpus.
trend-cybertron/Llama-Primus-Base
Text Generation
Note: Built on Llama-3.1-8B-Instruct and continually pretrained on 2.77B tokens of cybersecurity text. It achieves a 🚀 15.88% improvement in aggregate score across multiple cybersecurity benchmarks.
trend-cybertron/Llama-Primus-Merged
Text Generation
Note: Instruct model! It retains nearly the same instruction-following capability as Llama-3.1-8B-Instruct while achieving a 🚀 14.84% improvement across multiple cybersecurity benchmarks.
trend-cybertron/Llama-Primus-Reasoning
Text Generation
Note: Distilled on reasoning and reflection data from o1-preview for cybersecurity tasks, achieving a 🚀 10% improvement on CISSP.
trend-cybertron/Primus-Seed
Note: Includes high-quality cybersecurity texts manually collected from reputable sources such as Wikipedia, MITRE, cybersecurity company websites, cyber threat intelligence (CTI), and more.
trend-cybertron/Primus-FineWeb
Note: Includes 2.57B tokens of cybersecurity texts filtered from FineWeb.
trend-cybertron/Primus-Instruct
Note: Includes approximately 1K QA pairs covering common cybersecurity business scenarios.
trend-cybertron/Primus-Reasoning
Note: Includes reasoning and reflection data generated by o1-preview on cybersecurity tasks for distillation.
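
The three model repositories listed above can probably be loaded like any other Llama-family checkpoint. The following is a minimal sketch, not an official usage guide: it assumes the repositories are accessible from your environment and work with the standard `transformers` Auto classes. The `PRIMUS_MODELS` table and the `load_primus` helper are hypothetical names introduced only for illustration; the repository IDs themselves are copied from the listing.

```python
# Repository IDs copied from the collection listing; the short keys are
# hypothetical labels chosen for this sketch.
PRIMUS_MODELS = {
    "base": "trend-cybertron/Llama-Primus-Base",
    "merged": "trend-cybertron/Llama-Primus-Merged",
    "reasoning": "trend-cybertron/Llama-Primus-Reasoning",
}


def load_primus(variant: str = "merged"):
    """Return (tokenizer, model) for one Primus variant.

    Assumption: the repository is reachable and follows the usual
    Llama checkpoint layout, so the Auto classes can resolve it.
    """
    if variant not in PRIMUS_MODELS:
        raise ValueError(
            f"unknown variant {variant!r}; choose from {sorted(PRIMUS_MODELS)}"
        )
    # Imported lazily so the ID table is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = PRIMUS_MODELS[variant]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def generate(variant: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Run one generation call (the returned text includes the prompt)."""
    tokenizer, model = load_primus(variant)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("merged", "What is MITRE ATT&CK?")` would download the 8B-parameter checkpoint on first use, so expect a large download and significant memory use. The "merged" variant is the default here because the listing describes it as the instruct model; the base and reasoning variants may need different prompting.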