Model Card for Model TwinDoc/RedWhale-2-12B

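This model was built by applying TLI to Llama 3.1 8B to scale it up to 12B parameters, and then pretraining the result. Pretraining was carried out on a Korean corpus.
TLI is a model up-scaling method that duplicates transformer layers.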
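The layer-duplication idea can be illustrated with a minimal sketch. This is not the authors' TLI implementation: which layers are duplicated and how the copies are initialized are not described here, so the function `duplicate_layers`, the chosen indices, and the toy layer objects are all hypothetical. The helper `llama_param_count` is likewise illustrative, and its default values are assumed from the published Llama 3.1 8B configuration (hidden size 4096, 32 layers, grouped-query attention with 8 KV heads, untied embeddings).

```python
import copy


def duplicate_layers(layers, indices):
    """Return a new list in which each layer at `indices` is followed by a deep copy.

    Hypothetical helper: the real TLI placement scheme is not specified here.
    """
    chosen = set(indices)
    scaled = []
    for i, layer in enumerate(layers):
        scaled.append(layer)
        if i in chosen:
            # The copy starts with identical weights but is an independent object,
            # so later training can update it separately from the original.
            scaled.append(copy.deepcopy(layer))
    return scaled


def llama_param_count(num_layers, hidden=4096, intermediate=14336,
                      n_heads=32, n_kv_heads=8, vocab=128256):
    """Rough parameter count for a Llama-style decoder (assumed config values)."""
    kv_dim = (hidden // n_heads) * n_kv_heads
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # q, o + k, v projections
    mlp = 3 * hidden * intermediate                   # gate, up, down projections
    norms = 2 * hidden                                # two RMSNorm weights per layer
    embeddings = 2 * vocab * hidden                   # input embedding + untied output head
    return num_layers * (attn + mlp + norms) + embeddings + hidden  # + final norm


# Toy stand-ins for transformer blocks
toy = [{"name": f"layer{i}"} for i in range(4)]
scaled = duplicate_layers(toy, indices=[1, 3])
print([layer["name"] for layer in scaled])
# → ['layer0', 'layer1', 'layer1', 'layer2', 'layer3', 'layer3']

print(llama_param_count(32))
# → 8030261248
```

Under these assumed values, each additional layer contributes 218,112,000 parameters, which gives a rough sense of how many duplicated layers separate an 8B model from a 12B one. The actual layer count of RedWhale-2-12B is not stated in this card.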

Model Details

Model Description

Developed by: AgileSoda
Model type: Llama
Language(s) (NLP): Korean
License: [More Information Needed]
Finetuned from model [optional]: TwinDoc/RedWhale-2-12B-Instruct
Foundation Model: RedWhale-2-12B-TLI

Model Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

RedWhale-2-12B is used in the same way as meta-llama/Llama-3.1-8B. Refer to the official documentation of the serving engine you intend to use. The following is an example.

Direct Use

Usage with Transformers: the example code was written with transformers == 4.48.1.

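from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# bfloat16 weights; device_map="auto" places the model across the available GPUs
loading_args = {"torch_dtype": torch.bfloat16, "device_map": "auto"}
model = AutoModelForCausalLM.from_pretrained("TwinDoc/RedWhale-2-12B", **loading_args)
tokenizer = AutoTokenizer.from_pretrained("TwinDoc/RedWhale-2-12B")

# Base-model prompt: an unfinished sentence ("The capital of South Korea is ") for the model to continue
text = "대한민국의 수도는 "
# Move the input tensors to the model's device before generating
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)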

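>>> print(tokenizer.decode(outputs[0]))
"<|begin_of_text|>대한민국의 수도는 1000만여 명 이상이 거주하고 있는 서울로 대표되는 도심지이다. 본 연구에서는 서울의 중심을 나타내는 4대문 안을 도심지로 정의하고, 그 경계를 북악산, 인왕산, 남산, 낙산으로 구분하는 4산의 산줄기와 도로로 구성되는 8개의 변을 경계로 정한다. 국토 공간적 관점에서 우리나라의"

English gloss of the output above: "The capital of South Korea is an urban center represented by Seoul, where more than 10 million people live. In this study, the area inside the Four Great Gates, which marks the center of Seoul, is defined as the urban center, and its boundary is set as the eight sides formed by roads and by the ridgelines of the four mountains Bugaksan, Inwangsan, Namsan, and Naksan. From the perspective of national territorial space, our country's"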

Out-of-Scope Use

Because this model has only been pretrained, it cannot follow instructions. Rather than applying it directly to a specific task, we recommend using it as a base model for fine-tuning.

