xiaoqijian

mx1024

AI & ML interests

None yet

Recent Activity

commented on an article about 9 hours ago
Open R1: Update #3
upvoted an article about 9 hours ago
Open R1: Update #3
liked a model 17 days ago
qihoo360/TinyR1-32B-Preview

Organizations

OpenReasoning

mx1024's activity

commented on Open R1: Update #3 about 9 hours ago

How is packing implemented in your code? Have you tried using a 4D attention mask to avoid the overlap between samples that you mentioned?
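
For readers unfamiliar with the idea behind this question, here is a minimal sketch of a block-diagonal causal mask for one packed row, assuming the packed samples' lengths are known. It is not the Open R1 implementation; the function name `packed_causal_mask` and the nested-list layout `[1][1][T][T]` (batch, head, query, key) are hypothetical choices for illustration, and how a given framework consumes such a mask is not covered here.

```python
def packed_causal_mask(lengths):
    """Build a [1][1][T][T] boolean mask for one packed row (illustrative only).

    lengths: token counts of the samples packed back to back.
    mask[0][0][q][k] is True when query q may attend to key k, i.e. when both
    tokens belong to the same sample and k is not in the future of q.
    """
    # Segment id of every token position, e.g. [2, 3] -> [0, 0, 1, 1, 1].
    segment = [idx for idx, n in enumerate(lengths) for _ in range(n)]
    total = len(segment)
    mask2d = [
        [segment[q] == segment[k] and k <= q for k in range(total)]
        for q in range(total)
    ]
    # Add singleton batch and head dimensions to get the 4D layout.
    return [[mask2d]]


if __name__ == "__main__":
    # Two samples of length 2 and 3 packed into one row of 5 tokens.
    for row in packed_causal_mask([2, 3])[0][0]:
        print("".join("x" if allowed else "." for allowed in row))
```

Each packed sample gets its own lower-triangular block, so tokens of the second sample never attend to tokens of the first.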
