HF China Mirror Site


Collections

Discover the best community collections!

Collections including paper arxiv:2501.19393

LLM and Reasoning Papers

A dump of papers from the LLM reasoning domain

Internal Consistency and Self-Feedback in Large Language Models: A Survey

Paper • 2407.14507 • Published Jul 19, 2024 • 46
Large Language Models are Zero-Shot Reasoners

Paper • 2205.11916 • Published May 24, 2022 • 1
Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Paper • 2201.11903 • Published Jan 28, 2022 • 11

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111
LIMO: Less is More for Reasoning

Paper • 2502.03387 • Published Feb 5 • 58

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 203
Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5 • 56
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 103
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 108

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 106
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 44
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16 • 29
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 19

test time compute

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111

Large Language Models Think Too Fast To Explore Effectively

Paper • 2501.18009 • Published Jan 29 • 23
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 21
SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Paper • 2502.20545 • Published 14 days ago • 20

RL+reason model

Updated about 12 hours ago

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 25
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 26
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 108
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking

Paper • 2501.09751 • Published Jan 16 • 47
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16 • 37
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 346
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111
