Auto Classes
In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you are supplying to the from_pretrained() method. AutoClasses are here to do this job for you so that you automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.
Instantiating one of AutoConfig, AutoModel, and AutoTokenizer directly creates an object of the class for the relevant architecture. For instance,
model = AutoModel.from_pretrained("bert-base-cased")
creates a model that is an instance of BertModel.
There is one class of AutoModel for each task, and for each backend (PyTorch, TensorFlow, or Flax).
Extending the Auto Classes
Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a custom model class NewModel, make sure you also have a NewModelConfig; you can then add both to the auto classes like this:
from transformers import AutoConfig, AutoModel
AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
You will then be able to use the auto classes like you would usually do!
If your NewModelConfig is a subclass of PretrainedConfig, make sure its model_type attribute is set to the same key you use when registering the config (here "new-model").
Likewise, if your NewModel is a subclass of PreTrainedModel, make sure its config_class attribute is set to the same class you use when registering the model (here NewModelConfig).
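The registration flow above can be illustrated with a toy registry. This is a minimal sketch in plain Python, not the transformers implementation: ToyAutoRegistry and its methods are hypothetical names, and the consistency checks simply mirror the two rules stated in the text (the config's model_type must match the registration key, and the model's config_class must match the registered config).

```python
class ToyAutoRegistry:
    """Toy stand-in for the auto-class registration idea (hypothetical, not library code)."""

    def __init__(self):
        self.config_by_model_type = {}
        self.model_by_config = {}

    def register_config(self, model_type, config_class):
        # The config's model_type must match the key used for registration.
        if getattr(config_class, "model_type", None) != model_type:
            raise ValueError("model_type attribute does not match the registration key")
        self.config_by_model_type[model_type] = config_class

    def register_model(self, config_class, model_class):
        # The model's config_class must be the config it is registered under.
        if getattr(model_class, "config_class", None) is not config_class:
            raise ValueError("config_class attribute does not match the registered config")
        self.model_by_config[config_class] = model_class

    def model_for(self, model_type):
        # Look up the config by key, then the model by config class.
        config = self.config_by_model_type[model_type]()
        return self.model_by_config[type(config)](config)


class NewModelConfig:
    model_type = "new-model"


class NewModel:
    config_class = NewModelConfig

    def __init__(self, config):
        self.config = config


registry = ToyAutoRegistry()
registry.register_config("new-model", NewModelConfig)
registry.register_model(NewModelConfig, NewModel)
model = registry.model_for("new-model")
print(type(model).__name__)
```

The two-step lookup (key to config class, config class to model class) is why both attributes have to agree with what was registered.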
This is a generic configuration class that will be instantiated as one of the configuration classes of the library when created with the from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained( pretrained_model_name_or_path, **kwargs )
- pretrained_model_name_or_path — Can be either:
  - A string, the model id of a pretrained model configuration hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing a configuration file saved using the save_pretrained() method of a configuration or of a model.
  - A path or url to a saved configuration JSON file.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files and override the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- return_unused_kwargs (optional, defaults to False) — If False, then this function returns just the final configuration object. If True, then this function returns a Tuple(config, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of kwargs which has not been used to update config and is otherwise ignored.
- trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not configuration attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the configuration classes of the library from a pretrained model configuration.
The configuration class to instantiate is selected based on the model_type property of the config object that is loaded, or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertConfig (ALBERT model)
- bart — BartConfig (BART model)
- beit — BeitConfig (BEiT model)
- bert — BertConfig (BERT model)
- bert-generation — BertGenerationConfig (Bert Generation model)
- big_bird — BigBirdConfig (BigBird model)
- bigbird_pegasus — BigBirdPegasusConfig (BigBird-Pegasus model)
- blenderbot — BlenderbotConfig (Blenderbot model)
- blenderbot-small — BlenderbotSmallConfig (BlenderbotSmall model)
- bloom — BloomConfig (BLOOM model)
- camembert — CamembertConfig (CamemBERT model)
- canine — CanineConfig (CANINE model)
- clip — CLIPConfig (CLIP model)
- convbert — ConvBertConfig (ConvBERT model)
- convnext — ConvNextConfig (ConvNeXT model)
- ctrl — CTRLConfig (CTRL model)
- cvt — CvtConfig (CvT model)
- data2vec-audio — Data2VecAudioConfig (Data2VecAudio model)
- data2vec-text — Data2VecTextConfig (Data2VecText model)
- data2vec-vision — Data2VecVisionConfig (Data2VecVision model)
- deberta — DebertaConfig (DeBERTa model)
- deberta-v2 — DebertaV2Config (DeBERTa-v2 model)
- decision_transformer — DecisionTransformerConfig (Decision Transformer model)
- deit — DeiTConfig (DeiT model)
- detr — DetrConfig (DETR model)
- distilbert — DistilBertConfig (DistilBERT model)
- dpr — DPRConfig (DPR model)
- dpt — DPTConfig (DPT model)
- electra — ElectraConfig (ELECTRA model)
- encoder-decoder — EncoderDecoderConfig (Encoder decoder model)
- flaubert — FlaubertConfig (FlauBERT model)
- flava — FlavaConfig (FLAVA model)
- fnet — FNetConfig (FNet model)
- fsmt — FSMTConfig (FairSeq Machine-Translation model)
- funnel — FunnelConfig (Funnel Transformer model)
- glpn — GLPNConfig (GLPN model)
- gpt2 — GPT2Config (OpenAI GPT-2 model)
- gpt_neo — GPTNeoConfig (GPT Neo model)
- gpt_neox — GPTNeoXConfig (GPT NeoX model)
- gptj — GPTJConfig (GPT-J model)
- hubert — HubertConfig (Hubert model)
- ibert — IBertConfig (I-BERT model)
- imagegpt — ImageGPTConfig (ImageGPT model)
- layoutlm — LayoutLMConfig (LayoutLM model)
- layoutlmv2 — LayoutLMv2Config (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3Config (LayoutLMv3 model)
- led — LEDConfig (LED model)
- levit — LevitConfig (LeViT model)
- longformer — LongformerConfig (Longformer model)
- longt5 — LongT5Config (LongT5 model)
- luke — LukeConfig (LUKE model)
- lxmert — LxmertConfig (LXMERT model)
- m2m_100 — M2M100Config (M2M100 model)
- marian — MarianConfig (Marian model)
- maskformer — MaskFormerConfig (MaskFormer model)
- mbart — MBartConfig (mBART model)
- mctct — MCTCTConfig (M-CTC-T model)
- megatron-bert — MegatronBertConfig (Megatron-BERT model)
- mobilebert — MobileBertConfig (MobileBERT model)
- mpnet — MPNetConfig (MPNet model)
- mt5 — MT5Config (MT5 model)
- nystromformer — NystromformerConfig (Nyströmformer model)
- openai-gpt — OpenAIGPTConfig (OpenAI GPT model)
- opt — OPTConfig (OPT model)
- pegasus — PegasusConfig (Pegasus model)
- perceiver — PerceiverConfig (Perceiver model)
- plbart — PLBartConfig (PLBart model)
- poolformer — PoolFormerConfig (PoolFormer model)
- prophetnet — ProphetNetConfig (ProphetNet model)
- qdqbert — QDQBertConfig (QDQBert model)
- rag — RagConfig (RAG model)
- realm — RealmConfig (REALM model)
- reformer — ReformerConfig (Reformer model)
- regnet — RegNetConfig (RegNet model)
- rembert — RemBertConfig (RemBERT model)
- resnet — ResNetConfig (ResNet model)
- retribert — RetriBertConfig (RetriBERT model)
- roberta — RobertaConfig (RoBERTa model)
- roformer — RoFormerConfig (RoFormer model)
- segformer — SegformerConfig (SegFormer model)
- sew — SEWConfig (SEW model)
- sew-d — SEWDConfig (SEW-D model)
- speech-encoder-decoder — SpeechEncoderDecoderConfig (Speech Encoder decoder model)
- speech_to_text — Speech2TextConfig (Speech2Text model)
- speech_to_text_2 — Speech2Text2Config (Speech2Text2 model)
- splinter — SplinterConfig (Splinter model)
- squeezebert — SqueezeBertConfig (SqueezeBERT model)
- swin — SwinConfig (Swin Transformer model)
- t5 — T5Config (T5 model)
- tapas — TapasConfig (TAPAS model)
- trajectory_transformer — TrajectoryTransformerConfig (Trajectory Transformer model)
- transfo-xl — TransfoXLConfig (Transformer-XL model)
- trocr — TrOCRConfig (TrOCR model)
- unispeech — UniSpeechConfig (UniSpeech model)
- unispeech-sat — UniSpeechSatConfig (UniSpeechSat model)
- van — VanConfig (VAN model)
- vilt — ViltConfig (ViLT model)
- vision-encoder-decoder — VisionEncoderDecoderConfig (Vision Encoder decoder model)
- vision-text-dual-encoder — VisionTextDualEncoderConfig (VisionTextDualEncoder model)
- visual_bert — VisualBertConfig (VisualBERT model)
- vit — ViTConfig (ViT model)
- vit_mae — ViTMAEConfig (ViTMAE model)
- wav2vec2 — Wav2Vec2Config (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerConfig (Wav2Vec2-Conformer model)
- wavlm — WavLMConfig (WavLM model)
- xglm — XGLMConfig (XGLM model)
- xlm — XLMConfig (XLM model)
- xlm-prophetnet — XLMProphetNetConfig (XLM-ProphetNet model)
- xlm-roberta — XLMRobertaConfig (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLConfig (XLM-RoBERTa-XL model)
- xlnet — XLNetConfig (XLNet model)
- yolos — YolosConfig (YOLOS model)
- yoso — YosoConfig (YOSO model)
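The selection rule described for this mapping (use model_type when the loaded config has it, otherwise fall back to pattern matching on the name or path) can be sketched in plain Python. This is a simplified illustration rather than the library's code: TOY_CONFIG_MAPPING and resolve_config_name are hypothetical names, the mapping holds only a few entries from the list, and trying longer keys first is an assumption of this sketch so that "xlm-roberta" is not shadowed by "roberta".

```python
# A few entries from the list above, stored as names to keep the sketch dependency-free.
TOY_CONFIG_MAPPING = {
    "bert": "BertConfig",
    "bert-generation": "BertGenerationConfig",
    "roberta": "RobertaConfig",
    "xlm-roberta": "XLMRobertaConfig",
}


def resolve_config_name(config_dict, pretrained_model_name_or_path):
    """Return the configuration class name for a loaded config dictionary."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        # Preferred route: the loaded config states its own model_type.
        return TOY_CONFIG_MAPPING[model_type]
    # Fallback: look for a known key inside the name or path (longest keys first).
    for key in sorted(TOY_CONFIG_MAPPING, key=len, reverse=True):
        if key in str(pretrained_model_name_or_path):
            return TOY_CONFIG_MAPPING[key]
    raise ValueError(f"Unrecognized model: {pretrained_model_name_or_path}")


print(resolve_config_name({"model_type": "bert"}, "any-name"))
print(resolve_config_name({}, "dbmdz/bert-base-german-cased"))
print(resolve_config_name({}, "xlm-roberta-base"))
```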
>>> from transformers import AutoConfig
>>> # Download configuration from the Hub and cache it.
>>> config = AutoConfig.from_pretrained("bert-base-uncased")
>>> # Download a user-uploaded configuration from the Hub and cache it.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If the configuration file is in a directory (e.g., was saved using *save_pretrained('./test/bert_saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")
>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")
>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True
>>> config, unused_kwargs = AutoConfig.from_pretrained(
... "bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True
>>> unused_kwargs
{'foo': False}
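The handling of extra keyword arguments shown in the example above can be mimicked with a small stand-alone function. This is only a sketch of the documented behavior under simplifying assumptions: ToyConfig and apply_overrides are hypothetical names, and "is a configuration attribute" is approximated with hasattr.

```python
class ToyConfig:
    """Hypothetical configuration object with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768


def apply_overrides(config, return_unused_kwargs=False, **kwargs):
    """Override known attributes; optionally hand back the kwargs that matched nothing."""
    unused_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # Key is a configuration attribute: override the loaded value.
            setattr(config, key, value)
        else:
            # Key is not a configuration attribute: collect it instead.
            unused_kwargs[key] = value
    if return_unused_kwargs:
        return config, unused_kwargs
    # Without the flag, unmatched kwargs are silently dropped.
    return config


config, unused_kwargs = apply_overrides(
    ToyConfig(), return_unused_kwargs=True, output_attentions=True, foo=False
)
print(config.output_attentions, unused_kwargs)
```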
register( model_type, config )
- model_type — The model type like “bert” or “gpt”.
- config (PretrainedConfig) — The config to register.
Register a new configuration for this class.
This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained( pretrained_model_name_or_path, *inputs, **kwargs )
- pretrained_model_name_or_path — Can be either:
  - A string, the model id of a predefined tokenizer hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing vocabulary files required by the tokenizer, for instance saved using the save_pretrained() method.
  - A path or url to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet). (Not applicable to all derived classes)
- inputs (additional positional arguments, optional) — Will be passed along to the Tokenizer __init__() method.
- config (PretrainedConfig, optional) — The configuration object used to determine the tokenizer class to instantiate.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files and override the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- subfolder (optional) — In case the relevant files are located inside a subfolder of the model repo (e.g. for facebook/rag-token-base), specify it here.
- use_fast (optional, defaults to True) — Whether or not to try to load the fast version of the tokenizer.
- tokenizer_type (optional) — Tokenizer type to be loaded.
- trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Will be passed to the Tokenizer __init__() method. Can be used to set special tokens like bos_token. See parameters in the __init__() for more details.
Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.
The tokenizer class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertTokenizer or AlbertTokenizerFast (ALBERT model)
- bart — BartTokenizer or BartTokenizerFast (BART model)
- barthez — BarthezTokenizer or BarthezTokenizerFast (BARThez model)
- bartpho — BartphoTokenizer (BARTpho model)
- bert — BertTokenizer or BertTokenizerFast (BERT model)
- bert-generation — BertGenerationTokenizer (Bert Generation model)
- bert-japanese — BertJapaneseTokenizer (BertJapanese model)
- bertweet — BertweetTokenizer (BERTweet model)
- big_bird — BigBirdTokenizer or BigBirdTokenizerFast (BigBird model)
- bigbird_pegasus — PegasusTokenizer or PegasusTokenizerFast (BigBird-Pegasus model)
- blenderbot — BlenderbotTokenizer or BlenderbotTokenizerFast (Blenderbot model)
- blenderbot-small — BlenderbotSmallTokenizer (BlenderbotSmall model)
- bloom — BloomTokenizerFast (BLOOM model)
- byt5 — ByT5Tokenizer (ByT5 model)
- camembert — CamembertTokenizer or CamembertTokenizerFast (CamemBERT model)
- canine — CanineTokenizer (CANINE model)
- clip — CLIPTokenizer or CLIPTokenizerFast (CLIP model)
- convbert — ConvBertTokenizer or ConvBertTokenizerFast (ConvBERT model)
- cpm — CpmTokenizer or CpmTokenizerFast (CPM model)
- ctrl — CTRLTokenizer (CTRL model)
- data2vec-text — RobertaTokenizer or RobertaTokenizerFast (Data2VecText model)
- deberta — DebertaTokenizer or DebertaTokenizerFast (DeBERTa model)
- deberta-v2 — DebertaV2Tokenizer or DebertaV2TokenizerFast (DeBERTa-v2 model)
- distilbert — DistilBertTokenizer or DistilBertTokenizerFast (DistilBERT model)
- dpr — DPRQuestionEncoderTokenizer or DPRQuestionEncoderTokenizerFast (DPR model)
- electra — ElectraTokenizer or ElectraTokenizerFast (ELECTRA model)
- flaubert — FlaubertTokenizer (FlauBERT model)
- fnet — FNetTokenizer or FNetTokenizerFast (FNet model)
- fsmt — FSMTTokenizer (FairSeq Machine-Translation model)
- funnel — FunnelTokenizer or FunnelTokenizerFast (Funnel Transformer model)
- gpt2 — GPT2Tokenizer or GPT2TokenizerFast (OpenAI GPT-2 model)
- gpt_neo — GPT2Tokenizer or GPT2TokenizerFast (GPT Neo model)
- gpt_neox — GPTNeoXTokenizerFast (GPT NeoX model)
- gptj — GPT2Tokenizer or GPT2TokenizerFast (GPT-J model)
- herbert — HerbertTokenizer or HerbertTokenizerFast (HerBERT model)
- hubert — Wav2Vec2CTCTokenizer (Hubert model)
- ibert — RobertaTokenizer or RobertaTokenizerFast (I-BERT model)
- layoutlm — LayoutLMTokenizer or LayoutLMTokenizerFast (LayoutLM model)
- layoutlmv2 — LayoutLMv2Tokenizer or LayoutLMv2TokenizerFast (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3Tokenizer or LayoutLMv3TokenizerFast (LayoutLMv3 model)
- layoutxlm — LayoutXLMTokenizer or LayoutXLMTokenizerFast (LayoutXLM model)
- led — LEDTokenizer or LEDTokenizerFast (LED model)
- longformer — LongformerTokenizer or LongformerTokenizerFast (Longformer model)
- longt5 — T5Tokenizer or T5TokenizerFast (LongT5 model)
- luke — LukeTokenizer (LUKE model)
- lxmert — LxmertTokenizer or LxmertTokenizerFast (LXMERT model)
- m2m_100 — M2M100Tokenizer (M2M100 model)
- marian — MarianTokenizer (Marian model)
- mbart — MBartTokenizer or MBartTokenizerFast (mBART model)
- mbart50 — MBart50Tokenizer or MBart50TokenizerFast (mBART-50 model)
- megatron-bert — BertTokenizer or BertTokenizerFast (Megatron-BERT model)
- mluke — MLukeTokenizer (mLUKE model)
- mobilebert — MobileBertTokenizer or MobileBertTokenizerFast (MobileBERT model)
- mpnet — MPNetTokenizer or MPNetTokenizerFast (MPNet model)
- mt5 — MT5Tokenizer or MT5TokenizerFast (MT5 model)
- nystromformer — AlbertTokenizer or AlbertTokenizerFast (Nyströmformer model)
- openai-gpt — OpenAIGPTTokenizer or OpenAIGPTTokenizerFast (OpenAI GPT model)
- opt — GPT2Tokenizer (OPT model)
- pegasus — PegasusTokenizer or PegasusTokenizerFast (Pegasus model)
- perceiver — PerceiverTokenizer (Perceiver model)
- phobert — PhobertTokenizer (PhoBERT model)
- plbart — PLBartTokenizer (PLBart model)
- prophetnet — ProphetNetTokenizer (ProphetNet model)
- qdqbert — BertTokenizer or BertTokenizerFast (QDQBert model)
- rag — RagTokenizer (RAG model)
- realm — RealmTokenizer or RealmTokenizerFast (REALM model)
- reformer — ReformerTokenizer or ReformerTokenizerFast (Reformer model)
- rembert — RemBertTokenizer or RemBertTokenizerFast (RemBERT model)
- retribert — RetriBertTokenizer or RetriBertTokenizerFast (RetriBERT model)
- roberta — RobertaTokenizer or RobertaTokenizerFast (RoBERTa model)
- roformer — RoFormerTokenizer or RoFormerTokenizerFast (RoFormer model)
- speech_to_text — Speech2TextTokenizer (Speech2Text model)
- speech_to_text_2 — Speech2Text2Tokenizer (Speech2Text2 model)
- splinter — SplinterTokenizer or SplinterTokenizerFast (Splinter model)
- squeezebert — SqueezeBertTokenizer or SqueezeBertTokenizerFast (SqueezeBERT model)
- t5 — T5Tokenizer or T5TokenizerFast (T5 model)
- tapas — TapasTokenizer (TAPAS model)
- tapex — TapexTokenizer (TAPEX model)
- transfo-xl — TransfoXLTokenizer (Transformer-XL model)
- vilt — BertTokenizer or BertTokenizerFast (ViLT model)
- visual_bert — BertTokenizer or BertTokenizerFast (VisualBERT model)
- wav2vec2 — Wav2Vec2CTCTokenizer (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2CTCTokenizer (Wav2Vec2-Conformer model)
- wav2vec2_phoneme — Wav2Vec2PhonemeCTCTokenizer (Wav2Vec2Phoneme model)
- xglm — XGLMTokenizer or XGLMTokenizerFast (XGLM model)
- xlm — XLMTokenizer (XLM model)
- xlm-prophetnet — XLMProphetNetTokenizer (XLM-ProphetNet model)
- xlm-roberta — XLMRobertaTokenizer or XLMRobertaTokenizerFast (XLM-RoBERTa model)
- xlm-roberta-xl — RobertaTokenizer or RobertaTokenizerFast (XLM-RoBERTa-XL model)
- xlnet — XLNetTokenizer or XLNetTokenizerFast (XLNet model)
- yoso — AlbertTokenizer or AlbertTokenizerFast (YOSO model)
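Many entries in this list offer both a slow and a fast tokenizer, and the use_fast parameter decides which one is tried first. The choice can be sketched as below. This is a hedged illustration, not the library's code: TOY_TOKENIZER_MAPPING and pick_tokenizer_name are hypothetical names, the three entries are copied from the list, and falling back to the fast class when no slow class is listed is an assumption of this sketch.

```python
# (slow class name, fast class name); None marks a variant that is not listed.
TOY_TOKENIZER_MAPPING = {
    "bert": ("BertTokenizer", "BertTokenizerFast"),
    "bloom": (None, "BloomTokenizerFast"),
    "ctrl": ("CTRLTokenizer", None),
}


def pick_tokenizer_name(model_type, use_fast=True):
    """Choose between the slow and fast tokenizer names for a model type."""
    slow, fast = TOY_TOKENIZER_MAPPING[model_type]
    # Prefer the fast class when requested, or when it is the only option.
    if fast is not None and (use_fast or slow is None):
        return fast
    if slow is not None:
        return slow
    raise ValueError(f"No tokenizer registered for {model_type!r}")


print(pick_tokenizer_name("bert"))
print(pick_tokenizer_name("bert", use_fast=False))
print(pick_tokenizer_name("ctrl"))
```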
>>> from transformers import AutoTokenizer
>>> # Download vocabulary from the Hub and cache it.
>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
>>> # Download a user-uploaded vocabulary from the Hub and cache it.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")
>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/bert_saved_model/')*)
>>> tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")
>>> # Download vocabulary from the Hub and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("roberta-base", add_prefix_space=True)
register( config_class, slow_tokenizer_class = None, fast_tokenizer_class = None )
- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- slow_tokenizer_class (optional) — The slow tokenizer to register.
- fast_tokenizer_class (optional) — The fast tokenizer to register.
Register a new tokenizer in this mapping.
This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained( pretrained_model_name_or_path, **kwargs )
- pretrained_model_name_or_path — This can be either:
  - a string, the model id of a pretrained feature_extractor hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - a path to a directory containing a feature extractor file saved using the save_pretrained() method.
  - a path or url to a saved feature extractor JSON file.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the feature extractor files and override the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Attempts to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- return_unused_kwargs (optional, defaults to False) — If False, then this function returns just the final feature extractor object. If True, then this function returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.
The feature extractor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitFeatureExtractor (BEiT model)
- clip — CLIPFeatureExtractor (CLIP model)
- convnext — ConvNextFeatureExtractor (ConvNeXT model)
- cvt — ConvNextFeatureExtractor (CvT model)
- data2vec-audio — Wav2Vec2FeatureExtractor (Data2VecAudio model)
- data2vec-vision — BeitFeatureExtractor (Data2VecVision model)
- deit — DeiTFeatureExtractor (DeiT model)
- detr — DetrFeatureExtractor (DETR model)
- dpt — DPTFeatureExtractor (DPT model)
- flava — FlavaFeatureExtractor (FLAVA model)
- glpn — GLPNFeatureExtractor (GLPN model)
- hubert — Wav2Vec2FeatureExtractor (Hubert model)
- imagegpt — ImageGPTFeatureExtractor (ImageGPT model)
- layoutlmv2 — LayoutLMv2FeatureExtractor (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3FeatureExtractor (LayoutLMv3 model)
- levit — LevitFeatureExtractor (LeViT model)
- maskformer — MaskFormerFeatureExtractor (MaskFormer model)
- mctct — MCTCTFeatureExtractor (M-CTC-T model)
- perceiver — PerceiverFeatureExtractor (Perceiver model)
- poolformer — PoolFormerFeatureExtractor (PoolFormer model)
- regnet — ConvNextFeatureExtractor (RegNet model)
- resnet — ConvNextFeatureExtractor (ResNet model)
- segformer — SegformerFeatureExtractor (SegFormer model)
- speech_to_text — Speech2TextFeatureExtractor (Speech2Text model)
- swin — ViTFeatureExtractor (Swin Transformer model)
- van — ConvNextFeatureExtractor (VAN model)
- vilt — ViltFeatureExtractor (ViLT model)
- vit — ViTFeatureExtractor (ViT model)
- vit_mae — ViTFeatureExtractor (ViTMAE model)
- wav2vec2 — Wav2Vec2FeatureExtractor (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2FeatureExtractor (Wav2Vec2-Conformer model)
- yolos — YolosFeatureExtractor (YOLOS model)
Passing use_auth_token=True is required when you want to use a private model.
>>> from transformers import AutoFeatureExtractor
>>> # Download feature extractor from the Hub and cache it.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
register( config_class, feature_extractor_class )
- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- feature_extractor_class — The feature extractor to register.
Register a new feature extractor for this class.
This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_pretrained( pretrained_model_name_or_path, **kwargs )
- pretrained_model_name_or_path — This can be either:
  - a string, the model id of a pretrained feature_extractor hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - a path to a directory containing processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the feature extractor files and override the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Attempts to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- return_unused_kwargs (optional, defaults to False) — If False, then this function returns just the final feature extractor object. If True, then this function returns a Tuple(feature_extractor, unused_kwargs) where unused_kwargs is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs which has not been used to update feature_extractor and is otherwise ignored.
- trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (Dict[str, Any], optional) — The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.
Instantiate one of the processor classes of the library from a pretrained model vocabulary.
The processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible):
- clip — CLIPProcessor (CLIP model)
- flava — (FLAVA model)
- layoutlmv2 — LayoutLMv2Processor (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3Processor (LayoutLMv3 model)
- layoutxlm — LayoutXLMProcessor (LayoutXLM model)
- sew — Wav2Vec2Processor (SEW model)
- sew-d — Wav2Vec2Processor (SEW-D model)
- speech_to_text — Speech2TextProcessor (Speech2Text model)
- speech_to_text_2 — Speech2Text2Processor (Speech2Text2 model)
- trocr — TrOCRProcessor (TrOCR model)
- unispeech — Wav2Vec2Processor (UniSpeech model)
- unispeech-sat — Wav2Vec2Processor (UniSpeechSat model)
- vilt — ViltProcessor (ViLT model)
- vision-text-dual-encoder — VisionTextDualEncoderProcessor (VisionTextDualEncoder model)
- wav2vec2 — Wav2Vec2Processor (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2Processor (Wav2Vec2-Conformer model)
- wavlm — Wav2Vec2Processor (WavLM model)
Passing use_auth_token=True is required when you want to use a private model.
>>> from transformers import AutoProcessor
>>> # Download processor from the Hub and cache it.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")
>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> processor = AutoProcessor.from_pretrained("./test/saved_model/")
register( config_class, processor_class )
- config_class (PretrainedConfig) — The configuration corresponding to the model to register.
- processor_class — The processor to register.
Register a new processor for this class.
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
- config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertModel (ALBERT model)
- BartConfig configuration class: BartModel (BART model)
- BeitConfig configuration class: BeitModel (BEiT model)
- BertConfig configuration class: BertModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdModel (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBird-Pegasus model)
- BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
- BloomConfig configuration class: BloomModel (BLOOM model)
- CLIPConfig configuration class: CLIPModel (CLIP model)
- CTRLConfig configuration class: CTRLModel (CTRL model)
- CamembertConfig configuration class: CamembertModel (CamemBERT model)
- CanineConfig configuration class: CanineModel (CANINE model)
- ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: ConvNextModel (ConvNeXT model)
- CvtConfig configuration class: CvtModel (CvT model)
- DPRConfig configuration class: DPRQuestionEncoder (DPR model)
- DPTConfig configuration class: DPTModel (DPT model)
- Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudio model)
- Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecText model)
- Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVision model)
- DebertaConfig configuration class: DebertaModel (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
- DecisionTransformerConfig configuration class: DecisionTransformerModel (Decision Transformer model)
- DeiTConfig configuration class: DeiTModel (DeiT model)
- DetrConfig configuration class: DetrModel (DETR model)
- DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
- ElectraConfig configuration class: ElectraModel (ELECTRA model)
- FNetConfig configuration class: FNetModel (FNet model)
- FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
- FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
- FlavaConfig configuration class: FlavaModel (FLAVA model)
- FunnelConfig configuration class: FunnelModel or FunnelBaseModel (Funnel Transformer model)
- GLPNConfig configuration class: GLPNModel (GLPN model)
- GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: GPTJModel (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoModel (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXModel (GPT NeoX model)
- HubertConfig configuration class: HubertModel (Hubert model)
- IBertConfig configuration class: IBertModel (I-BERT model)
- ImageGPTConfig configuration class: ImageGPTModel (ImageGPT model)
- LEDConfig configuration class: LEDModel (LED model)
- LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3 model)
- LevitConfig configuration class: LevitModel (LeViT model)
- LongT5Config configuration class: LongT5Model (LongT5 model)
- LongformerConfig configuration class: LongformerModel (Longformer model)
- LukeConfig configuration class: LukeModel (LUKE model)
- LxmertConfig configuration class: LxmertModel (LXMERT model)
- M2M100Config configuration class: M2M100Model (M2M100 model)
- MBartConfig configuration class: MBartModel (mBART model)
- MCTCTConfig configuration class: MCTCTModel (M-CTC-T model)
- MPNetConfig configuration class: MPNetModel (MPNet model)
- MT5Config configuration class: MT5Model (MT5 model)
- MarianConfig configuration class: MarianModel (Marian model)
- MaskFormerConfig configuration class: MaskFormerModel (MaskFormer model)
- MegatronBertConfig configuration class: MegatronBertModel (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
- NystromformerConfig configuration class: NystromformerModel (Nyströmformer model)
- OPTConfig configuration class: OPTModel (OPT model)
- OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
- PLBartConfig configuration class: PLBartModel (PLBart model)
- PegasusConfig configuration class: PegasusModel (Pegasus model)
- PerceiverConfig configuration class: PerceiverModel (Perceiver model)
- PoolFormerConfig configuration class: PoolFormerModel (PoolFormer model)
- ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
- QDQBertConfig configuration class: QDQBertModel (QDQBert model)
- ReformerConfig configuration class: ReformerModel (Reformer model)
- RegNetConfig configuration class: RegNetModel (RegNet model)
- RemBertConfig configuration class: RemBertModel (RemBERT model)
- ResNetConfig configuration class: ResNetModel (ResNet model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RoFormerConfig configuration class: RoFormerModel (RoFormer model)
- RobertaConfig configuration class: RobertaModel (RoBERTa model)
- SEWConfig configuration class: SEWModel (SEW model)
- SEWDConfig configuration class: SEWDModel (SEW-D model)
- SegformerConfig configuration class: SegformerModel (SegFormer model)
- Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
- SplinterConfig configuration class: SplinterModel (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
- SwinConfig configuration class: SwinModel (Swin Transformer model)
- T5Config configuration class: T5Model (T5 model)
- TapasConfig configuration class: TapasModel (TAPAS model)
- TrajectoryTransformerConfig configuration class: TrajectoryTransformerModel (Trajectory Transformer model)
- TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
- UniSpeechConfig configuration class: UniSpeechModel (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSat model)
- VanConfig configuration class: VanModel (VAN model)
- ViTConfig configuration class: ViTModel (ViT model)
- ViTMAEConfig configuration class: ViTMAEModel (ViTMAE model)
- ViltConfig configuration class: ViltModel (ViLT model)
- VisionTextDualEncoderConfig configuration class: VisionTextDualEncoderModel (VisionTextDualEncoder model)
- VisualBertConfig configuration class: VisualBertModel (VisualBERT model)
- Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMModel (WavLM model)
- XGLMConfig configuration class: XGLMModel (XGLM model)
- XLMConfig configuration class: XLMModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetModel (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLModel (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetModel (XLNet model)
- YolosConfig configuration class: YolosModel (YOLOS model)
- YosoConfig configuration class: YosoModel (YOSO model)
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
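To make the note concrete, here is a toy sketch of what building a model from a configuration means: the class is chosen from the type of the configuration, and the weights are freshly initialized rather than read from a checkpoint. ToyConfig, ToyModel, MODEL_MAPPING, and this from_config function are hypothetical and only mimic the behavior described above.

```python
import random


class ToyConfig:
    """Hypothetical configuration holding a single hyperparameter."""

    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size


class ToyModel:
    """Hypothetical model whose weights are created at construction time."""

    def __init__(self, config):
        self.config = config
        # Fresh random weights: nothing is read from a saved checkpoint.
        self.weights = [random.gauss(0.0, 0.02) for _ in range(config.hidden_size)]


# Configuration class -> model class, mirroring the list above in miniature.
MODEL_MAPPING = {ToyConfig: ToyModel}


def from_config(config):
    """Pick the model class from the configuration's type and instantiate it."""
    model_class = MODEL_MAPPING[type(config)]
    return model_class(config)


model = from_config(ToyConfig(hidden_size=8))
print(type(model).__name__, len(model.weights))
```

The configuration controls the shape of the model (eight weights here) but says nothing about their values, which is why pretrained weights require from_pretrained().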
( *model_args **kwargs )
- pretrained_model_name_or_path — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing model weights saved using save_pretrained().
  - A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from a saved weights file.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided, kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
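When no configuration is provided, the keyword arguments are effectively split in two: keys that name a configuration attribute override that attribute, and the rest go to the model's __init__. The sketch below illustrates only that split. split_kwargs is a hypothetical helper, and the attribute names in config_attributes (as well as custom_flag) are assumptions chosen for the example.

```python
def split_kwargs(config_attributes, kwargs):
    """Hypothetical helper: separate config overrides from model __init__ kwargs."""
    config_overrides = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_overrides, model_kwargs


# Assumed attribute names for a toy configuration.
config_attributes = {"output_attentions", "hidden_size"}

overrides, remaining = split_kwargs(
    config_attributes, {"output_attentions": True, "custom_flag": 1}
)
print(overrides)  # {'output_attentions': True}
print(remaining)  # {'custom_flag': 1}
```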
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertModel (ALBERT model)
- bart — BartModel (BART model)
- beit — BeitModel (BEiT model)
- bert — BertModel (BERT model)
- bert-generation — BertGenerationEncoder (Bert Generation model)
- big_bird — BigBirdModel (BigBird model)
- bigbird_pegasus — BigBirdPegasusModel (BigBird-Pegasus model)
- blenderbot — BlenderbotModel (Blenderbot model)
- blenderbot-small — BlenderbotSmallModel (BlenderbotSmall model)
- bloom — BloomModel (BLOOM model)
- camembert — CamembertModel (CamemBERT model)
- canine — CanineModel (CANINE model)
- clip — CLIPModel (CLIP model)
- convbert — ConvBertModel (ConvBERT model)
- convnext — ConvNextModel (ConvNeXT model)
- ctrl — CTRLModel (CTRL model)
- cvt — CvtModel (CvT model)
- data2vec-audio — Data2VecAudioModel (Data2VecAudio model)
- data2vec-text — Data2VecTextModel (Data2VecText model)
- data2vec-vision — Data2VecVisionModel (Data2VecVision model)
- deberta — DebertaModel (DeBERTa model)
- deberta-v2 — DebertaV2Model (DeBERTa-v2 model)
- decision_transformer — DecisionTransformerModel (Decision Transformer model)
- deit — DeiTModel (DeiT model)
- detr — DetrModel (DETR model)
- distilbert — DistilBertModel (DistilBERT model)
- dpr — DPRQuestionEncoder (DPR model)
- dpt — DPTModel (DPT model)
- electra — ElectraModel (ELECTRA model)
- flaubert — FlaubertModel (FlauBERT model)
- flava — FlavaModel (FLAVA model)
- fnet — FNetModel (FNet model)
- fsmt — FSMTModel (FairSeq Machine-Translation model)
- funnel — FunnelModel or FunnelBaseModel (Funnel Transformer model)
- glpn — GLPNModel (GLPN model)
- gpt2 — GPT2Model (OpenAI GPT-2 model)
- gpt_neo — GPTNeoModel (GPT Neo model)
- gpt_neox — GPTNeoXModel (GPT NeoX model)
- gptj — GPTJModel (GPT-J model)
- hubert — HubertModel (Hubert model)
- ibert — IBertModel (I-BERT model)
- imagegpt — ImageGPTModel (ImageGPT model)
- layoutlm — LayoutLMModel (LayoutLM model)
- layoutlmv2 — LayoutLMv2Model (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3Model (LayoutLMv3 model)
- led — LEDModel (LED model)
- levit — LevitModel (LeViT model)
- longformer — LongformerModel (Longformer model)
- longt5 — LongT5Model (LongT5 model)
- luke — LukeModel (LUKE model)
- lxmert — LxmertModel (LXMERT model)
- m2m_100 — M2M100Model (M2M100 model)
- marian — MarianModel (Marian model)
- maskformer — MaskFormerModel (MaskFormer model)
- mbart — MBartModel (mBART model)
- mctct — MCTCTModel (M-CTC-T model)
- megatron-bert — MegatronBertModel (Megatron-BERT model)
- mobilebert — MobileBertModel (MobileBERT model)
- mpnet — MPNetModel (MPNet model)
- mt5 — MT5Model (MT5 model)
- nystromformer — NystromformerModel (Nyströmformer model)
- openai-gpt — OpenAIGPTModel (OpenAI GPT model)
- opt — OPTModel (OPT model)
- pegasus — PegasusModel (Pegasus model)
- perceiver — PerceiverModel (Perceiver model)
- plbart — PLBartModel (PLBart model)
- poolformer — PoolFormerModel (PoolFormer model)
- prophetnet — ProphetNetModel (ProphetNet model)
- qdqbert — QDQBertModel (QDQBert model)
- reformer — ReformerModel (Reformer model)
- regnet — RegNetModel (RegNet model)
- rembert — RemBertModel (RemBERT model)
- resnet — ResNetModel (ResNet model)
- retribert — RetriBertModel (RetriBERT model)
- roberta — RobertaModel (RoBERTa model)
- roformer — RoFormerModel (RoFormer model)
- segformer — SegformerModel (SegFormer model)
- sew — SEWModel (SEW model)
- sew-d — SEWDModel (SEW-D model)
- speech_to_text — Speech2TextModel (Speech2Text model)
- splinter — SplinterModel (Splinter model)
- squeezebert — SqueezeBertModel (SqueezeBERT model)
- swin — SwinModel (Swin Transformer model)
- t5 — T5Model (T5 model)
- tapas — TapasModel (TAPAS model)
- trajectory_transformer — TrajectoryTransformerModel (Trajectory Transformer model)
- transfo-xl — TransfoXLModel (Transformer-XL model)
- unispeech — UniSpeechModel (UniSpeech model)
- unispeech-sat — UniSpeechSatModel (UniSpeechSat model)
- van — VanModel (VAN model)
- vilt — ViltModel (ViLT model)
- vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoder model)
- visual_bert — VisualBertModel (VisualBERT model)
- vit — ViTModel (ViT model)
- vit_mae — ViTMAEModel (ViTMAE model)
- wav2vec2 — Wav2Vec2Model (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
- wavlm — WavLMModel (WavLM model)
- xglm — XGLMModel (XGLM model)
- xlm — XLMModel (XLM model)
- xlm-prophetnet — XLMProphetNetModel (XLM-ProphetNet model)
- xlm-roberta — XLMRobertaModel (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLModel (XLM-RoBERTa-XL model)
- xlnet — XLNetModel (XLNet model)
- yolos — YolosModel (YOLOS model)
- yoso — YosoModel (YOSO model)
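The two-step resolution (model_type first, then pattern matching on the name or path) can be sketched as follows. This is an illustration under stated assumptions rather than the library's algorithm: MODEL_NAMES copies a few rows from the list above, resolve_model_name is a hypothetical helper, and trying longer keys first is a choice made for this sketch so that a name containing "xlm-roberta" is not matched as "roberta" or "bert".

```python
# A few model_type -> class name pairs copied from the list above.
MODEL_NAMES = {
    "bert": "BertModel",
    "roberta": "RobertaModel",
    "xlm-roberta": "XLMRobertaModel",
    "gpt2": "GPT2Model",
}


def resolve_model_name(model_type, name_or_path):
    """Hypothetical helper: prefer model_type, else match on the name or path."""
    if model_type is not None:
        return MODEL_NAMES[model_type]
    # Fallback: substring matching, longest key first (an assumption of this sketch).
    for key in sorted(MODEL_NAMES, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_NAMES[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")


print(resolve_model_name("bert", "any-name"))
print(resolve_model_name(None, "xlm-roberta-base"))
```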
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModel
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModel.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
( **kwargs )
- config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForPreTraining (ALBERT model)
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BertConfig configuration class: BertForPreTraining (BERT model)
- BigBirdConfig configuration class: BigBirdForPreTraining (BigBird model)
- BloomConfig configuration class: BloomForCausalLM (BLOOM model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecText model)
- DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: ElectraForPreTraining (ELECTRA model)
- FNetConfig configuration class: FNetForPreTraining (FNet model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
- FlavaConfig configuration class: FlavaForPreTraining (FLAVA model)
- FunnelConfig configuration class: FunnelForPreTraining (Funnel Transformer model)
- GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForPreTraining (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- SplinterConfig configuration class: SplinterForPreTraining (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- UniSpeechConfig configuration class: UniSpeechForPreTraining (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForPreTraining (UniSpeechSat model)
- ViTMAEConfig configuration class: ViTMAEForPreTraining (ViTMAE model)
- VisualBertConfig configuration class: VisualBertForPreTraining (VisualBERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForPreTraining (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
( *model_args **kwargs )
- pretrained_model_name_or_path — Can be either:
  - A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - A path to a directory containing model weights saved using save_pretrained().
  - A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
- model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
- config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
  - The model is a model provided by the library (loaded with the model id string of a pretrained model).
  - The model was saved using save_pretrained() and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
- state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from a saved weights file.
  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
- cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see docstring of pretrained_model_name_or_path argument).
- force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
- (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
- (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
- revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
- trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
  - If a configuration is provided, kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.
Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForPreTraining (ALBERT model)
- bart — BartForConditionalGeneration (BART model)
- bert — BertForPreTraining (BERT model)
- big_bird — BigBirdForPreTraining (BigBird model)
- bloom — BloomForCausalLM (BLOOM model)
- camembert — CamembertForMaskedLM (CamemBERT model)
- ctrl — CTRLLMHeadModel (CTRL model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
- deberta — DebertaForMaskedLM (DeBERTa model)
- deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
- distilbert — DistilBertForMaskedLM (DistilBERT model)
- electra — ElectraForPreTraining (ELECTRA model)
- flaubert — FlaubertWithLMHeadModel (FlauBERT model)
- flava — FlavaForPreTraining (FLAVA model)
- fnet — FNetForPreTraining (FNet model)
- fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- funnel — FunnelForPreTraining (Funnel Transformer model)
- gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
- ibert — IBertForMaskedLM (I-BERT model)
- layoutlm — LayoutLMForMaskedLM (LayoutLM model)
- longformer — LongformerForMaskedLM (Longformer model)
- lxmert — LxmertForPreTraining (LXMERT model)
- megatron-bert — MegatronBertForPreTraining (Megatron-BERT model)
- mobilebert — MobileBertForPreTraining (MobileBERT model)
- mpnet — MPNetForMaskedLM (MPNet model)
- openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
- retribert — RetriBertModel (RetriBERT model)
- roberta — RobertaForMaskedLM (RoBERTa model)
- splinter — SplinterForPreTraining (Splinter model)
- squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
- t5 — T5ForConditionalGeneration (T5 model)
- tapas — TapasForMaskedLM (TAPAS model)
- transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
- unispeech — UniSpeechForPreTraining (UniSpeech model)
- unispeech-sat — UniSpeechSatForPreTraining (UniSpeechSat model)
- visual_bert — VisualBertForPreTraining (VisualBERT model)
- vit_mae — ViTMAEForPreTraining (ViTMAE model)
- wav2vec2 — Wav2Vec2ForPreTraining (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
- xlm — XLMWithLMHeadModel (XLM model)
- xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- xlnet — XLNetLMHeadModel (XLNet model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForPreTraining
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForPreTraining.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
( **kwargs )
- config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: BartForCausalLM (BART model)
- BertConfig configuration class: BertLMHeadModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdForCausalLM (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBird-Pegasus model)
- BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)
- BloomConfig configuration class: BloomForCausalLM (BLOOM model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecText model)
- ElectraConfig configuration class: ElectraForCausalLM (ELECTRA model)
- GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: GPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForCausalLM (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPT NeoX model)
- MBartConfig configuration class: MBartForCausalLM (mBART model)
- MarianConfig configuration class: MarianForCausalLM (Marian model)
- MegatronBertConfig configuration class: MegatronBertForCausalLM (Megatron-BERT model)
- OPTConfig configuration class: OPTForCausalLM (OPT model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- PLBartConfig configuration class: PLBartForCausalLM (PLBart model)
- PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
- ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
- QDQBertConfig configuration class: QDQBertLMHeadModel (QDQBert model)
- ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
- RemBertConfig configuration class: RemBertForCausalLM (RemBERT model)
- RoFormerConfig configuration class: RoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
- Speech2Text2Config configuration class: Speech2Text2ForCausalLM (Speech2Text2 model)
- TrOCRConfig configuration class: TrOCRForCausalLM (TrOCR model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- XGLMConfig configuration class: XGLMForCausalLM (XGLM model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
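The dispatch described above can be pictured as a lookup from configuration class to model class. The sketch below is a toy illustration under stated assumptions: every name in it (ToyBertConfig, ToyGPT2Config, CAUSAL_LM_MAPPING, this from_config function) is a hypothetical stand-in rather than the library's implementation. The weights_loaded flag only mirrors the note that building from a configuration does not load weights.

```python
# Toy illustration only: these classes are hypothetical stand-ins,
# not the real transformers implementation.
class ToyBertConfig:
    hidden_size = 8


class ToyGPT2Config:
    hidden_size = 16


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Building from a configuration alone never loads trained weights.
        self.weights_loaded = False


class ToyBertLMHeadModel(ToyModel):
    pass


class ToyGPT2LMHeadModel(ToyModel):
    pass


# One mapping per task: configuration class -> model class.
CAUSAL_LM_MAPPING = {
    ToyBertConfig: ToyBertLMHeadModel,
    ToyGPT2Config: ToyGPT2LMHeadModel,
}


def from_config(config):
    """Pick the model class registered for this configuration class."""
    try:
        model_class = CAUSAL_LM_MAPPING[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = from_config(ToyGPT2Config())
print(type(model).__name__, model.weights_loaded)
```

The lookup key is the type of the configuration object, so two configurations of the same class always resolve to the same model class regardless of their attribute values.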
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because a git-based system is used for storing models and other artifacts on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
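The second kwargs case, where no configuration is supplied, amounts to partitioning the keyword arguments into configuration overrides and leftovers for the model. A minimal sketch follows, assuming the configuration can be represented as a plain dictionary; split_kwargs and custom_flag are hypothetical names introduced here for illustration and are not part of the library.

```python
# Hypothetical helper illustrating the documented kwargs behaviour;
# it is not part of transformers.
def split_kwargs(config_attributes, kwargs):
    """Return (updated_config, model_kwargs) without mutating the inputs."""
    updated_config = dict(config_attributes)
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in updated_config:
            # Key matches a configuration attribute: override it.
            updated_config[key] = value
        else:
            # Anything else is forwarded to the model's __init__.
            model_kwargs[key] = value
    return updated_config, model_kwargs


config = {"output_attentions": False, "hidden_size": 768}
new_config, model_kwargs = split_kwargs(
    config, {"output_attentions": True, "custom_flag": 1}
)
print(new_config)
print(model_kwargs)
```

When a configuration object is passed explicitly, no such partition happens: the keyword arguments all go to the model, which is why any configuration updates should already have been applied.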
Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bart — BartForCausalLM (BART model)
- bert — BertLMHeadModel (BERT model)
- bert-generation — BertGenerationDecoder (Bert Generation model)
- big_bird — BigBirdForCausalLM (BigBird model)
- bigbird_pegasus — BigBirdPegasusForCausalLM (BigBird-Pegasus model)
- blenderbot — BlenderbotForCausalLM (Blenderbot model)
- blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmall model)
- bloom — BloomForCausalLM (BLOOM model)
- camembert — CamembertForCausalLM (CamemBERT model)
- ctrl — CTRLLMHeadModel (CTRL model)
- data2vec-text — Data2VecTextForCausalLM (Data2VecText model)
- electra — ElectraForCausalLM (ELECTRA model)
- gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
- gpt_neo — GPTNeoForCausalLM (GPT Neo model)
- gpt_neox — GPTNeoXForCausalLM (GPT NeoX model)
- gptj — GPTJForCausalLM (GPT-J model)
- marian — MarianForCausalLM (Marian model)
- mbart — MBartForCausalLM (mBART model)
- megatron-bert — MegatronBertForCausalLM (Megatron-BERT model)
- openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
- opt — OPTForCausalLM (OPT model)
- pegasus — PegasusForCausalLM (Pegasus model)
- plbart — PLBartForCausalLM (PLBart model)
- prophetnet — ProphetNetForCausalLM (ProphetNet model)
- qdqbert — QDQBertLMHeadModel (QDQBert model)
- reformer — ReformerModelWithLMHead (Reformer model)
- rembert — RemBertForCausalLM (RemBERT model)
- roberta — RobertaForCausalLM (RoBERTa model)
- roformer — RoFormerForCausalLM (RoFormer model)
- speech_to_text_2 — Speech2Text2ForCausalLM (Speech2Text2 model)
- transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
- trocr — TrOCRForCausalLM (TrOCR model)
- xglm — XGLMForCausalLM (XGLM model)
- xlm — XLMWithLMHeadModel (XLM model)
- xlm-prophetnet — XLMProphetNetForCausalLM (XLM-ProphetNet model)
- xlm-roberta — XLMRobertaForCausalLM (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
- xlnet — XLNetLMHeadModel (XLNet model)
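The selection rule above (use model_type when it is available, otherwise match on the name) can be sketched with a small excerpt of the table. This is a toy under stated assumptions: resolve_class_name is a hypothetical function, and trying longer keys first is one plausible way to keep "xlm-roberta" from being shadowed by "roberta" or "bert", not a description of the library's actual matching order.

```python
# Toy sketch: a small excerpt of the model_type -> class-name table above.
CAUSAL_LM_NAMES = {
    "bert": "BertLMHeadModel",
    "gpt2": "GPT2LMHeadModel",
    "xlm-roberta": "XLMRobertaForCausalLM",
    "roberta": "RobertaForCausalLM",
}


def resolve_class_name(name_or_path, model_type=None):
    """Prefer the config's model_type; otherwise match on the name."""
    if model_type is not None:
        return CAUSAL_LM_NAMES[model_type]
    # Assumed fallback: try longer keys first so that "xlm-roberta"
    # is not shadowed by "roberta" or "bert".
    for key in sorted(CAUSAL_LM_NAMES, key=len, reverse=True):
        if key in name_or_path:
            return CAUSAL_LM_NAMES[key]
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")


print(resolve_class_name("some/checkpoint", model_type="gpt2"))
print(resolve_class_name("xlm-roberta-base"))
print(resolve_class_name("bert-base-cased"))
```

Name matching is inherently fragile, which is why the model_type stored in the configuration takes precedence whenever it exists.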
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForCausalLM
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForCausalLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
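A class that refuses direct construction and only exposes factory class methods can be sketched as follows. ToyAutoModel is a hypothetical stand-in, and the exception type and message are assumptions made for illustration; the point is only the pattern of a dispatcher whose __init__ raises.

```python
# Hypothetical stand-in for an auto class; not the library's code.
class ToyAutoModel:
    def __init__(self, *args, **kwargs):
        # Direct construction is refused: the class is only a dispatcher.
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would pick a concrete model class here;
        # this toy simply returns a description of what it would build.
        return {"built_from": config["model_type"]}


try:
    ToyAutoModel()
except EnvironmentError as err:
    message = str(err)

result = ToyAutoModel.from_config({"model_type": "bert"})
print(message)
print(result)
```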
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BertConfig configuration class: BertForMaskedLM (BERT model)
- BigBirdConfig configuration class: BigBirdForMaskedLM (BigBird model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecText model)
- DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
- FNetConfig configuration class: FNetForMaskedLM (FNet model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LukeConfig configuration class: LukeForMaskedLM (LUKE model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForMaskedLM (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
- NystromformerConfig configuration class: NystromformerForMaskedLM (Nyströmformer model)
- PerceiverConfig configuration class: PerceiverForMaskedLM (Perceiver model)
- QDQBertConfig configuration class: QDQBertForMaskedLM (QDQBert model)
- ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
- RemBertConfig configuration class: RemBertForMaskedLM (RemBERT model)
- RoFormerConfig configuration class: RoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- Wav2Vec2Config configuration class: Wav2Vec2ForMaskedLM (Wav2Vec2 model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- YosoConfig configuration class: YosoForMaskedLM (YOSO model)
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because a git-based system is used for storing models and other artifacts on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
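One of the conditions listed for the config parameter is that a file named config.json is found in the supplied local directory. The check can be sketched with the standard library alone. find_local_config is a hypothetical helper, and the single-key JSON payload is a placeholder rather than a complete model configuration.

```python
import json
import tempfile
from pathlib import Path


def find_local_config(directory):
    """Return the parsed config.json from a directory, or None if absent."""
    config_path = Path(directory) / "config.json"
    if not config_path.is_file():
        return None
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    missing = find_local_config(tmp)  # nothing saved yet
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "bert"}), encoding="utf-8"
    )
    found = find_local_config(tmp)

print(missing, found)
```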
Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMaskedLM (ALBERT model)
- bart — BartForConditionalGeneration (BART model)
- bert — BertForMaskedLM (BERT model)
- big_bird — BigBirdForMaskedLM (BigBird model)
- camembert — CamembertForMaskedLM (CamemBERT model)
- convbert — ConvBertForMaskedLM (ConvBERT model)
- data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
- deberta — DebertaForMaskedLM (DeBERTa model)
- deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
- distilbert — DistilBertForMaskedLM (DistilBERT model)
- electra — ElectraForMaskedLM (ELECTRA model)
- flaubert — FlaubertWithLMHeadModel (FlauBERT model)
- fnet — FNetForMaskedLM (FNet model)
- funnel — FunnelForMaskedLM (Funnel Transformer model)
- ibert — IBertForMaskedLM (I-BERT model)
- layoutlm — LayoutLMForMaskedLM (LayoutLM model)
- longformer — LongformerForMaskedLM (Longformer model)
- luke — LukeForMaskedLM (LUKE model)
- mbart — MBartForConditionalGeneration (mBART model)
- megatron-bert — MegatronBertForMaskedLM (Megatron-BERT model)
- mobilebert — MobileBertForMaskedLM (MobileBERT model)
- mpnet — MPNetForMaskedLM (MPNet model)
- nystromformer — NystromformerForMaskedLM (Nyströmformer model)
- perceiver — PerceiverForMaskedLM (Perceiver model)
- qdqbert — QDQBertForMaskedLM (QDQBert model)
- reformer — ReformerForMaskedLM (Reformer model)
- rembert — RemBertForMaskedLM (RemBERT model)
- roberta — RobertaForMaskedLM (RoBERTa model)
- roformer — RoFormerForMaskedLM (RoFormer model)
- squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
- tapas — TapasForMaskedLM (TAPAS model)
- wav2vec2 — Wav2Vec2ForMaskedLM (Wav2Vec2 model)
- xlm — XLMWithLMHeadModel (XLM model)
- xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- yoso — YosoForMaskedLM (YOSO model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
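To see why the mode matters, here is a pure-Python stand-in for a dropout layer. ToyDropout is hypothetical and deliberately simplified (it is not torch.nn.Dropout): in evaluation mode it returns its input unchanged, while in training mode it zeroes each value with probability p and rescales the survivors.

```python
import random


class ToyDropout:
    """Pure-Python stand-in for a dropout layer (not torch.nn.Dropout)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: inputs pass through unchanged.
            return list(values)
        # Training mode: zero each value with probability p and rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
eval_out = layer([1.0, 2.0, 3.0])
layer.train()
train_out = layer([1.0, 2.0, 3.0])
print(eval_out, train_out)
```

Forgetting to switch back to training mode before fine-tuning would leave such layers inactive, which is the situation the note above warns about.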
>>> from transformers import AutoConfig, AutoModelForMaskedLM
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
- BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- LEDConfig configuration class: LEDForConditionalGeneration (LED model)
- LongT5Config configuration class: LongT5ForConditionalGeneration (LongT5 model)
- M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100 model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MT5Config configuration class: MT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: MarianMTModel (Marian model)
- PLBartConfig configuration class: PLBartForConditionalGeneration (PLBart model)
- PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
- ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like t5-base, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/t5_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because a git-based system is used for storing models and other artifacts on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bart — BartForConditionalGeneration (BART model)
- bigbird_pegasus — BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
- blenderbot — BlenderbotForConditionalGeneration (Blenderbot model)
- blenderbot-small — BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- encoder-decoder — EncoderDecoderModel (Encoder decoder model)
- fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- led — LEDForConditionalGeneration (LED model)
- longt5 — LongT5ForConditionalGeneration (LongT5 model)
- m2m_100 — M2M100ForConditionalGeneration (M2M100 model)
- marian — MarianMTModel (Marian model)
- mbart — MBartForConditionalGeneration (mBART model)
- mt5 — MT5ForConditionalGeneration (MT5 model)
- pegasus — PegasusForConditionalGeneration (Pegasus model)
- plbart — PLBartForConditionalGeneration (PLBart model)
- prophetnet — ProphetNetForConditionalGeneration (ProphetNet model)
- t5 — T5ForConditionalGeneration (T5 model)
- xlm-prophetnet — XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)
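Comparing this table with the causal and masked language modeling tables shows that one model_type can resolve to different classes depending on which auto class is used. The sketch below copies a few entries from those three tables; the task labels ("causal-lm", "masked-lm", "seq2seq-lm") and the class_for helper are hypothetical names chosen for illustration.

```python
# Excerpts copied from the three tables in this document.
TASK_MAPPINGS = {
    "causal-lm": {"bart": "BartForCausalLM", "marian": "MarianForCausalLM"},
    "masked-lm": {"bart": "BartForConditionalGeneration"},
    "seq2seq-lm": {"bart": "BartForConditionalGeneration", "marian": "MarianMTModel"},
}


def class_for(task, model_type):
    """Look up the class name an auto class for `task` would select."""
    mapping = TASK_MAPPINGS[task]
    if model_type not in mapping:
        raise ValueError(f"{model_type!r} has no {task} head in this excerpt")
    return mapping[model_type]


for task in TASK_MAPPINGS:
    print(task, class_for(task, "bart"))
```

A model type missing from a task's table (Marian under masked language modeling, for example) has no class to dispatch to, so the lookup fails rather than guessing.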
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
... "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: BartForSequenceClassification (BART model)
- BertConfig configuration class: BertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForSequenceClassification (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
- BloomConfig configuration class: BloomForSequenceClassification (BLOOM model)
- CTRLConfig configuration class: CTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
- CanineConfig configuration class: CanineForSequenceClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForSequenceClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
- FNetConfig configuration class: FNetForSequenceClassification (FNet model)
- FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTJConfig configuration class: GPTJForSequenceClassification (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForSequenceClassification (GPT Neo model)
- IBertConfig configuration class: IBertForSequenceClassification (I-BERT model)
- LEDConfig configuration class: LEDForSequenceClassification (LED model)
- LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
- MBartConfig configuration class: MBartForSequenceClassification (mBART model)
- MPNetConfig configuration class: MPNetForSequenceClassification (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForSequenceClassification (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
- NystromformerConfig configuration class: NystromformerForSequenceClassification (Nyströmformer model)
- OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
- PLBartConfig configuration class: PLBartForSequenceClassification (PLBart model)
- PerceiverConfig configuration class: PerceiverForSequenceClassification (Perceiver model)
- QDQBertConfig configuration class: QDQBertForSequenceClassification (QDQBert model)
- ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
- RemBertConfig configuration class: RemBertForSequenceClassification (RemBERT model)
- RoFormerConfig configuration class: RoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
- SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
- TapasConfig configuration class: TapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLForSequenceClassification (Transformer-XL model)
- XLMConfig configuration class: XLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
- YosoConfig configuration class: YosoForSequenceClassification (YOSO model)
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
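As a rough illustration of the selection rule listed above, the sketch below keeps a dictionary from configuration classes to model classes and looks up `type(config)`. Every name in it (`ToyBertConfig`, `ToyAutoModelForSequenceClassification`, and so on) is a hypothetical stand-in rather than the library's implementation; it only shows that the class of the configuration object decides which architecture is built.

```python
class ToyBertConfig:
    def __init__(self, num_labels=2):
        self.num_labels = num_labels


class ToyGPT2Config:
    def __init__(self, num_labels=2):
        self.num_labels = num_labels


class ToyModel:
    def __init__(self, config):
        self.config = config


class ToyBertForSequenceClassification(ToyModel):
    pass


class ToyGPT2ForSequenceClassification(ToyModel):
    pass


class ToyAutoModelForSequenceClassification:
    # Hypothetical registry: configuration class -> model class.
    _mapping = {
        ToyBertConfig: ToyBertForSequenceClassification,
        ToyGPT2Config: ToyGPT2ForSequenceClassification,
    }

    @classmethod
    def from_config(cls, config):
        model_class = cls._mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        return model_class(config)


model = ToyAutoModelForSequenceClassification.from_config(ToyBertConfig(num_labels=3))
print(type(model).__name__, model.config.num_labels)
```

Keying the lookup on the configuration class, rather than on a string, is a design choice of this sketch; it makes an unknown configuration fail early with a clear error.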
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
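The two kwargs behaviours described above can be pictured with a small sketch. `ToyConfig`, `split_kwargs`, and `toy_from_pretrained` are hypothetical names used only for illustration; the real loading code is more involved, and this is not the library's implementation.

```python
class ToyConfig:
    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config, kwargs):
    """Apply keys that match config attributes; return the rest for the model."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the loaded configuration
        else:
            remaining[key] = value  # forwarded to the model's __init__
    return remaining


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # A configuration supplied by the caller is left untouched and all
        # kwargs go straight to the model.
        return config, kwargs
    config = ToyConfig()  # stands in for an automatically loaded configuration
    return config, split_kwargs(config, kwargs)


config, model_kwargs = toy_from_pretrained(output_attentions=True, custom_flag=1)
print(config.output_attentions, model_kwargs)
```

In this sketch, `output_attentions` updates the configuration because `ToyConfig` has an attribute of that name, while the unknown `custom_flag` is returned for the model constructor.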
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForSequenceClassification (ALBERT model)
- bart — BartForSequenceClassification (BART model)
- bert — BertForSequenceClassification (BERT model)
- big_bird — BigBirdForSequenceClassification (BigBird model)
- bigbird_pegasus — BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
- bloom — BloomForSequenceClassification (BLOOM model)
- camembert — CamembertForSequenceClassification (CamemBERT model)
- canine — CanineForSequenceClassification (CANINE model)
- convbert — ConvBertForSequenceClassification (ConvBERT model)
- ctrl — CTRLForSequenceClassification (CTRL model)
- data2vec-text — Data2VecTextForSequenceClassification (Data2VecText model)
- deberta — DebertaForSequenceClassification (DeBERTa model)
- deberta-v2 — DebertaV2ForSequenceClassification (DeBERTa-v2 model)
- distilbert — DistilBertForSequenceClassification (DistilBERT model)
- electra — ElectraForSequenceClassification (ELECTRA model)
- flaubert — FlaubertForSequenceClassification (FlauBERT model)
- fnet — FNetForSequenceClassification (FNet model)
- funnel — FunnelForSequenceClassification (Funnel Transformer model)
- gpt2 — GPT2ForSequenceClassification (OpenAI GPT-2 model)
- gpt_neo — GPTNeoForSequenceClassification (GPT Neo model)
- gptj — GPTJForSequenceClassification (GPT-J model)
- ibert — IBertForSequenceClassification (I-BERT model)
- layoutlm — LayoutLMForSequenceClassification (LayoutLM model)
- layoutlmv2 — LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- led — LEDForSequenceClassification (LED model)
- longformer — LongformerForSequenceClassification (Longformer model)
- mbart — MBartForSequenceClassification (mBART model)
- megatron-bert — MegatronBertForSequenceClassification (Megatron-BERT model)
- mobilebert — MobileBertForSequenceClassification (MobileBERT model)
- mpnet — MPNetForSequenceClassification (MPNet model)
- nystromformer — NystromformerForSequenceClassification (Nyströmformer model)
- openai-gpt — OpenAIGPTForSequenceClassification (OpenAI GPT model)
- perceiver — PerceiverForSequenceClassification (Perceiver model)
- plbart — PLBartForSequenceClassification (PLBart model)
- qdqbert — QDQBertForSequenceClassification (QDQBert model)
- reformer — ReformerForSequenceClassification (Reformer model)
- rembert — RemBertForSequenceClassification (RemBERT model)
- roberta — RobertaForSequenceClassification (RoBERTa model)
- roformer — RoFormerForSequenceClassification (RoFormer model)
- squeezebert — SqueezeBertForSequenceClassification (SqueezeBERT model)
- tapas — TapasForSequenceClassification (TAPAS model)
- transfo-xl — TransfoXLForSequenceClassification (Transformer-XL model)
- xlm — XLMForSequenceClassification (XLM model)
- xlm-roberta — XLMRobertaForSequenceClassification (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL model)
- xlnet — XLNetForSequenceClassification (XLNet model)
- yoso — YosoForSequenceClassification (YOSO model)
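A minimal sketch of the lookup described above: prefer the explicit model_type, and fall back to matching known keys against the name or path. The dictionary is a hand-picked subset of the list above, `resolve_class_name` is a hypothetical helper, and the longest-key-first matching order is an assumption of this sketch rather than documented behaviour.

```python
# Hypothetical subset of the mapping listed above: model_type -> class name.
MODEL_TYPE_TO_CLASS = {
    "bert": "BertForSequenceClassification",
    "roberta": "RobertaForSequenceClassification",
    "xlm-roberta": "XLMRobertaForSequenceClassification",
    "distilbert": "DistilBertForSequenceClassification",
}


def resolve_class_name(model_type, name_or_path):
    # Preferred path: the config carries an explicit model_type.
    if model_type is not None:
        return MODEL_TYPE_TO_CLASS[model_type]
    # Fallback: look for a known key inside the name, longest key first so
    # that "xlm-roberta" and "distilbert" win over "roberta" and "bert".
    for key in sorted(MODEL_TYPE_TO_CLASS, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_TYPE_TO_CLASS[key]
    raise ValueError(f"Cannot infer a model class from {name_or_path!r}")


print(resolve_class_name("bert", "some/local/path"))
print(resolve_class_name(None, "dbmdz/bert-base-german-cased"))
```

Relying on the name alone is fragile, which is why a model_type stored in the configuration is the preferred signal.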
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForSequenceClassification
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
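The pattern of blocking direct construction while offering class-method factories can be sketched as follows. `ToyAutoModel` is a hypothetical class, and the exception type and message are assumptions made for the example, not a statement of what the library raises.

```python
class ToyAutoModel:
    def __init__(self, *args, **kwargs):
        # Direct construction is blocked; the exception type is an assumption.
        raise OSError(
            f"{type(self).__name__} is meant to be created with "
            "from_pretrained() or from_config()."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would pick a concrete model class here; this toy
        # just returns a plain dict describing what it would build.
        return {"built_from": "config", "config": config}


try:
    ToyAutoModel()
    raised = False
except OSError:
    raised = True

built = ToyAutoModel.from_config({"model_type": "bert"})
print(raised, built["built_from"])
```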
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: BertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: BigBirdForMultipleChoice (BigBird model)
- CamembertConfig configuration class: CamembertForMultipleChoice (CamemBERT model)
- CanineConfig configuration class: CanineForMultipleChoice (CANINE model)
- ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMultipleChoice (Data2VecText model)
- DebertaV2Config configuration class: DebertaV2ForMultipleChoice (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: ElectraForMultipleChoice (ELECTRA model)
- FNetConfig configuration class: FNetForMultipleChoice (FNet model)
- FlaubertConfig configuration class: FlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: FunnelForMultipleChoice (Funnel Transformer model)
- IBertConfig configuration class: IBertForMultipleChoice (I-BERT model)
- LongformerConfig configuration class: LongformerForMultipleChoice (Longformer model)
- MPNetConfig configuration class: MPNetForMultipleChoice (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForMultipleChoice (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBERT model)
- NystromformerConfig configuration class: NystromformerForMultipleChoice (Nyströmformer model)
- QDQBertConfig configuration class: QDQBertForMultipleChoice (QDQBert model)
- RemBertConfig configuration class: RemBertForMultipleChoice (RemBERT model)
- RoFormerConfig configuration class: RoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: RobertaForMultipleChoice (RoBERTa model)
- SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBERT model)
- XLMConfig configuration class: XLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForMultipleChoice (XLNet model)
- YosoConfig configuration class: YosoForMultipleChoice (YOSO model)
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForMultipleChoice (ALBERT model)
- bert — BertForMultipleChoice (BERT model)
- big_bird — BigBirdForMultipleChoice (BigBird model)
- camembert — CamembertForMultipleChoice (CamemBERT model)
- canine — CanineForMultipleChoice (CANINE model)
- convbert — ConvBertForMultipleChoice (ConvBERT model)
- data2vec-text — Data2VecTextForMultipleChoice (Data2VecText model)
- deberta-v2 — DebertaV2ForMultipleChoice (DeBERTa-v2 model)
- distilbert — DistilBertForMultipleChoice (DistilBERT model)
- electra — ElectraForMultipleChoice (ELECTRA model)
- flaubert — FlaubertForMultipleChoice (FlauBERT model)
- fnet — FNetForMultipleChoice (FNet model)
- funnel — FunnelForMultipleChoice (Funnel Transformer model)
- ibert — IBertForMultipleChoice (I-BERT model)
- longformer — LongformerForMultipleChoice (Longformer model)
- megatron-bert — MegatronBertForMultipleChoice (Megatron-BERT model)
- mobilebert — MobileBertForMultipleChoice (MobileBERT model)
- mpnet — MPNetForMultipleChoice (MPNet model)
- nystromformer — NystromformerForMultipleChoice (Nyströmformer model)
- qdqbert — QDQBertForMultipleChoice (QDQBert model)
- rembert — RemBertForMultipleChoice (RemBERT model)
- roberta — RobertaForMultipleChoice (RoBERTa model)
- roformer — RoFormerForMultipleChoice (RoFormer model)
- squeezebert — SqueezeBertForMultipleChoice (SqueezeBERT model)
- xlm — XLMForMultipleChoice (XLM model)
- xlm-roberta — XLMRobertaForMultipleChoice (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
- xlnet — XLNetForMultipleChoice (XLNet model)
- yoso — YosoForMultipleChoice (YOSO model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForMultipleChoice
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForMultipleChoice.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
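The evaluation-mode default mentioned before the example can be pictured with a pure-Python stand-in for a dropout layer. `ToyDropout` is hypothetical and only mimics the idea of a training flag; it is not PyTorch's implementation.

```python
import random


class ToyDropout:
    """Pure-Python stand-in for a dropout layer with a training flag."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values, rng):
        if not self.training:
            return list(values)  # evaluation mode: inputs pass through unchanged
        scale = 1.0 / (1.0 - self.p)
        # Training mode: zero each value with probability p, rescale the rest.
        return [0.0 if rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()  # mirrors the default state after loading
inputs = [1.0, 2.0, 3.0, 4.0]
eval_out = layer(inputs, random.Random(0))

layer.train()  # switch back before fine-tuning
train_out = layer(inputs, random.Random(0))
print(eval_out, train_out)
```

In evaluation mode the output equals the input, which is why a freshly loaded model gives deterministic predictions until model.train() is called.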
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: BertForNextSentencePrediction (BERT model)
- FNetConfig configuration class: FNetForNextSentencePrediction (FNet model)
- MegatronBertConfig configuration class: MegatronBertForNextSentencePrediction (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBERT model)
- QDQBertConfig configuration class: QDQBertForNextSentencePrediction (QDQBert model)
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
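One of the conditions listed for the config parameter above is that a file named config.json is found in the supplied local directory. A minimal sketch of such a check, using only the standard library, might look like this; `read_local_model_type` is a hypothetical helper, and the real loader does considerably more than read one key.

```python
import json
import tempfile
from pathlib import Path


def read_local_model_type(directory):
    """Return model_type from <directory>/config.json, or None if unavailable."""
    config_file = Path(directory) / "config.json"
    if not config_file.is_file():
        return None
    with config_file.open(encoding="utf-8") as handle:
        return json.load(handle).get("model_type")


with tempfile.TemporaryDirectory() as tmp:
    missing = read_local_model_type(tmp)  # no config.json yet
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "bert"}), encoding="utf-8"
    )
    found = read_local_model_type(tmp)

print(missing, found)
```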
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bert — BertForNextSentencePrediction (BERT model)
- fnet — FNetForNextSentencePrediction (FNet model)
- megatron-bert — MegatronBertForNextSentencePrediction (Megatron-BERT model)
- mobilebert — MobileBertForNextSentencePrediction (MobileBERT model)
- qdqbert — QDQBertForNextSentencePrediction (QDQBert model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction
>>> # Download model and configuration from the Hub and cache them.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: BertForTokenClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForTokenClassification (BigBird model)
- BloomConfig configuration class: BloomForTokenClassification (BLOOM model)
- CamembertConfig configuration class: CamembertForTokenClassification (CamemBERT model)
- CanineConfig configuration class: CanineForTokenClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForTokenClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForTokenClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForTokenClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForTokenClassification (ELECTRA model)
- FNetConfig configuration class: FNetForTokenClassification (FNet model)
- FlaubertConfig configuration class: FlaubertForTokenClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForTokenClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForTokenClassification (OpenAI GPT-2 model)
- IBertConfig configuration class: IBertForTokenClassification (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForTokenClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForTokenClassification (LayoutLMv3 model)
- LongformerConfig configuration class: LongformerForTokenClassification (Longformer model)
- MPNetConfig configuration class: MPNetForTokenClassification (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForTokenClassification (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBERT model)
- NystromformerConfig configuration class: NystromformerForTokenClassification (Nyströmformer model)
- QDQBertConfig configuration class: QDQBertForTokenClassification (QDQBert model)
- RemBertConfig configuration class: RemBertForTokenClassification (RemBERT model)
- RoFormerConfig configuration class: RoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForTokenClassification (RoBERTa model)
- SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBERT model)
- XLMConfig configuration class: XLMForTokenClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForTokenClassification (XLNet model)
- YosoConfig configuration class: YosoForTokenClassification (YOSO model)
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
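To make the note above concrete, here is a toy contrast between building from a configuration and loading a checkpoint. `ToyModel`, its dictionary-based config, and the in-memory `checkpoint` are all hypothetical; the sketch only illustrates that a configuration describes the architecture while the weights come from elsewhere.

```python
import random


class ToyModel:
    def __init__(self, config):
        self.config = config
        rng = random.Random()
        # Fresh parameters: sized from the config, values randomly initialized.
        self.weights = [rng.uniform(-0.02, 0.02) for _ in range(config["hidden_size"])]

    @classmethod
    def from_config(cls, config):
        return cls(config)  # architecture only, weights stay freshly initialized

    @classmethod
    def from_pretrained(cls, checkpoint):
        model = cls(checkpoint["config"])
        model.weights = list(checkpoint["weights"])  # overwrite with saved values
        return model


checkpoint = {"config": {"hidden_size": 4}, "weights": [0.1, 0.2, 0.3, 0.4]}
from_cfg = ToyModel.from_config(checkpoint["config"])
from_ckpt = ToyModel.from_pretrained(checkpoint)
print(len(from_cfg.weights), from_ckpt.weights)
```

Both objects share the same shape, but only the one created from the checkpoint carries the saved values.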
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/bert_tf_checkpoint.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (
, optional, defaults to"main"
) — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on, sorevision
can be any identifier allowed by git. -
trust_remote_code (
, optional, defaults toFalse
) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set toTrue
for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine. -
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it being loaded) and initiate the model (e.g.,
). Behaves differently depending on whether aconfig
is provided or automatically loaded:- If a configuration is provided with
will be directly passed to the underlying model’s__init__
method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided,
will be first passed to the configuration class initialization function (from_pretrained()). Each key ofkwargs
that corresponds to a configuration attribute will be used to override said attribute with the suppliedkwargs
value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s__init__
- If a configuration is provided with
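The kwargs handling described above can be pictured with a small stand-alone sketch. It is not the library's implementation: ToyConfig and split_kwargs are hypothetical names invented for illustration, and the sketch only assumes that a key counts as a configuration attribute when the config object already has an attribute of that name.

```python
# Simplified illustration only: ToyConfig and split_kwargs are hypothetical
# names and are not part of the transformers library.
class ToyConfig:
    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def split_kwargs(config, **kwargs):
    """Apply kwargs that match config attributes; return the rest untouched."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the configuration attribute
        else:
            unused[key] = value  # would be forwarded to the model's __init__
    return config, unused


config, model_kwargs = split_kwargs(ToyConfig(), output_attentions=True, my_flag=1)
print(config.output_attentions)  # True
print(model_kwargs)  # {'my_flag': 1}
```

Only output_attentions is consumed by the configuration here; my_flag is left over for the model constructor.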
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForTokenClassification (ALBERT model)
- bert — BertForTokenClassification (BERT model)
- big_bird — BigBirdForTokenClassification (BigBird model)
- bloom — BloomForTokenClassification (BLOOM model)
- camembert — CamembertForTokenClassification (CamemBERT model)
- canine — CanineForTokenClassification (CANINE model)
- convbert — ConvBertForTokenClassification (ConvBERT model)
- data2vec-text — Data2VecTextForTokenClassification (Data2VecText model)
- deberta — DebertaForTokenClassification (DeBERTa model)
- deberta-v2 — DebertaV2ForTokenClassification (DeBERTa-v2 model)
- distilbert — DistilBertForTokenClassification (DistilBERT model)
- electra — ElectraForTokenClassification (ELECTRA model)
- flaubert — FlaubertForTokenClassification (FlauBERT model)
- fnet — FNetForTokenClassification (FNet model)
- funnel — FunnelForTokenClassification (Funnel Transformer model)
- gpt2 — GPT2ForTokenClassification (OpenAI GPT-2 model)
- ibert — IBertForTokenClassification (I-BERT model)
- layoutlm — LayoutLMForTokenClassification (LayoutLM model)
- layoutlmv2 — LayoutLMv2ForTokenClassification (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3ForTokenClassification (LayoutLMv3 model)
- longformer — LongformerForTokenClassification (Longformer model)
- megatron-bert — MegatronBertForTokenClassification (Megatron-BERT model)
- mobilebert — MobileBertForTokenClassification (MobileBERT model)
- mpnet — MPNetForTokenClassification (MPNet model)
- nystromformer — NystromformerForTokenClassification (Nyströmformer model)
- qdqbert — QDQBertForTokenClassification (QDQBert model)
- rembert — RemBertForTokenClassification (RemBERT model)
- roberta — RobertaForTokenClassification (RoBERTa model)
- roformer — RoFormerForTokenClassification (RoFormer model)
- squeezebert — SqueezeBertForTokenClassification (SqueezeBERT model)
- xlm — XLMForTokenClassification (XLM model)
- xlm-roberta — XLMRobertaForTokenClassification (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
- xlnet — XLNetForTokenClassification (XLNet model)
- yoso — YosoForTokenClassification (YOSO model)
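As a rough illustration of the selection rule above, the following sketch resolves a class name from model_type and falls back to substring matching on the name. The trimmed mapping and the resolve_class_name helper are hypothetical, and the longest-key-first matching order is an assumption made here so that a name such as xlm-roberta-base is not mistaken for roberta or bert; the library's actual matching logic may differ.

```python
# Hypothetical, trimmed-down mapping used only for illustration.
MODEL_TYPE_TO_CLASS = {
    "bert": "BertForTokenClassification",
    "roberta": "RobertaForTokenClassification",
    "xlm-roberta": "XLMRobertaForTokenClassification",
}


def resolve_class_name(name_or_path, model_type=None):
    """Pick a class name from model_type, else by matching the name."""
    if model_type is not None:
        return MODEL_TYPE_TO_CLASS[model_type]
    # Fallback: try longer keys first, because "roberta" contains "bert"
    # and "xlm-roberta" contains "roberta".
    for key in sorted(MODEL_TYPE_TO_CLASS, key=len, reverse=True):
        if key in name_or_path:
            return MODEL_TYPE_TO_CLASS[key]
    raise ValueError(f"Unrecognized model: {name_or_path!r}")


print(resolve_class_name("bert-base-cased"))  # BertForTokenClassification
print(resolve_class_name("xlm-roberta-base"))  # XLMRobertaForTokenClassification
print(resolve_class_name("my-checkpoint", model_type="roberta"))  # RobertaForTokenClassification
```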
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForTokenClassification
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForQuestionAnswering (ALBERT model)
- BartConfig configuration class: BartForQuestionAnswering (BART model)
- BertConfig configuration class: BertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: BigBirdForQuestionAnswering (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
- CamembertConfig configuration class: CamembertForQuestionAnswering (CamemBERT model)
- CanineConfig configuration class: CanineForQuestionAnswering (CANINE model)
- ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForQuestionAnswering (Data2VecText model)
- DebertaConfig configuration class: DebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: ElectraForQuestionAnswering (ELECTRA model)
- FNetConfig configuration class: FNetForQuestionAnswering (FNet model)
- FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: FunnelForQuestionAnswering (Funnel Transformer model)
- GPTJConfig configuration class: GPTJForQuestionAnswering (GPT-J model)
- IBertConfig configuration class: IBertForQuestionAnswering (I-BERT model)
- LEDConfig configuration class: LEDForQuestionAnswering (LED model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- LongformerConfig configuration class: LongformerForQuestionAnswering (Longformer model)
- LxmertConfig configuration class: LxmertForQuestionAnswering (LXMERT model)
- MBartConfig configuration class: MBartForQuestionAnswering (mBART model)
- MPNetConfig configuration class: MPNetForQuestionAnswering (MPNet model)
- MegatronBertConfig configuration class: MegatronBertForQuestionAnswering (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBERT model)
- NystromformerConfig configuration class: NystromformerForQuestionAnswering (Nyströmformer model)
- QDQBertConfig configuration class: QDQBertForQuestionAnswering (QDQBert model)
- ReformerConfig configuration class: ReformerForQuestionAnswering (Reformer model)
- RemBertConfig configuration class: RemBertForQuestionAnswering (RemBERT model)
- RoFormerConfig configuration class: RoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: RobertaForQuestionAnswering (RoBERTa model)
- SplinterConfig configuration class: SplinterForQuestionAnswering (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBERT model)
- XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNet model)
- YosoConfig configuration class: YosoForQuestionAnswering (YOSO model)
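The lookup by configuration class listed above can be sketched in a few lines. Everything here (ToyBertConfig, ToyBertForQuestionAnswering, CONFIG_TO_MODEL, and the module-level from_config function) is a hypothetical stand-in rather than library code; the point is only that the model class is chosen from the type of the config object and that the model is built with freshly initialized parameters, not loaded ones.

```python
# Hypothetical stand-ins; none of these names come from the transformers library.
class ToyBertConfig:
    hidden_size = 4


class ToyBertForQuestionAnswering:
    def __init__(self, config):
        self.config = config
        # Freshly initialized parameters: nothing is read from a checkpoint.
        self.weights = [0.0] * config.hidden_size


CONFIG_TO_MODEL = {ToyBertConfig: ToyBertForQuestionAnswering}


def from_config(config):
    """Look up the model class by the exact type of the config and build it."""
    try:
        model_class = CONFIG_TO_MODEL[type(config)]
    except KeyError:
        name = type(config).__name__
        raise ValueError(f"Unrecognized configuration class {name}") from None
    return model_class(config)


model = from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertForQuestionAnswering
```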
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — AlbertForQuestionAnswering (ALBERT model)
- bart — BartForQuestionAnswering (BART model)
- bert — BertForQuestionAnswering (BERT model)
- big_bird — BigBirdForQuestionAnswering (BigBird model)
- bigbird_pegasus — BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
- camembert — CamembertForQuestionAnswering (CamemBERT model)
- canine — CanineForQuestionAnswering (CANINE model)
- convbert — ConvBertForQuestionAnswering (ConvBERT model)
- data2vec-text — Data2VecTextForQuestionAnswering (Data2VecText model)
- deberta — DebertaForQuestionAnswering (DeBERTa model)
- deberta-v2 — DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- distilbert — DistilBertForQuestionAnswering (DistilBERT model)
- electra — ElectraForQuestionAnswering (ELECTRA model)
- flaubert — FlaubertForQuestionAnsweringSimple (FlauBERT model)
- fnet — FNetForQuestionAnswering (FNet model)
- funnel — FunnelForQuestionAnswering (Funnel Transformer model)
- gptj — GPTJForQuestionAnswering (GPT-J model)
- ibert — IBertForQuestionAnswering (I-BERT model)
- layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- led — LEDForQuestionAnswering (LED model)
- longformer — LongformerForQuestionAnswering (Longformer model)
- lxmert — LxmertForQuestionAnswering (LXMERT model)
- mbart — MBartForQuestionAnswering (mBART model)
- megatron-bert — MegatronBertForQuestionAnswering (Megatron-BERT model)
- mobilebert — MobileBertForQuestionAnswering (MobileBERT model)
- mpnet — MPNetForQuestionAnswering (MPNet model)
- nystromformer — NystromformerForQuestionAnswering (Nyströmformer model)
- qdqbert — QDQBertForQuestionAnswering (QDQBert model)
- reformer — ReformerForQuestionAnswering (Reformer model)
- rembert — RemBertForQuestionAnswering (RemBERT model)
- roberta — RobertaForQuestionAnswering (RoBERTa model)
- roformer — RoFormerForQuestionAnswering (RoFormer model)
- splinter — SplinterForQuestionAnswering (Splinter model)
- squeezebert — SqueezeBertForQuestionAnswering (SqueezeBERT model)
- xlm — XLMForQuestionAnsweringSimple (XLM model)
- xlm-roberta — XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- xlm-roberta-xl — XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
- xlnet — XLNetForQuestionAnsweringSimple (XLNet model)
- yoso — YosoForQuestionAnswering (YOSO model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
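To make the evaluation-mode note above concrete, here is a pure-Python stand-in for a dropout layer. ToyDropout is hypothetical and assumes the usual inverted-dropout rule (kept values are scaled by 1 / (1 - p)); it is meant to show why outputs are deterministic after eval() and noisy after train(), not to mirror any real module.

```python
import random


class ToyDropout:
    """Hypothetical pure-Python stand-in for a dropout layer."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: inputs pass through unchanged
        scale = 1.0 / (1.0 - self.p)
        # Training mode: zero each value with probability p, rescale the rest.
        return [0.0 if random.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]

layer.train()
noisy = layer([1.0, 2.0, 3.0])  # each entry is either 0.0 or doubled
```

This is why a freshly loaded model gives repeatable outputs, and why model.train() has to be called before fine-tuning.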
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TapasForQuestionAnswering (TAPAS model)
Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- tapas — TapasForQuestionAnswering (TAPAS model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
... "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForImageClassification (BEiT model)
- ConvNextConfig configuration class: ConvNextForImageClassification (ConvNeXT model)
- CvtConfig configuration class: CvtForImageClassification (CvT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForImageClassification (Data2VecVision model)
- DeiTConfig configuration class: DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
- ImageGPTConfig configuration class: ImageGPTForImageClassification (ImageGPT model)
- LevitConfig configuration class: LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
- PerceiverConfig configuration class: PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
- PoolFormerConfig configuration class: PoolFormerForImageClassification (PoolFormer model)
- RegNetConfig configuration class: RegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: ResNetForImageClassification (ResNet model)
- SegformerConfig configuration class: SegformerForImageClassification (SegFormer model)
- SwinConfig configuration class: SwinForImageClassification (Swin Transformer model)
- VanConfig configuration class: VanForImageClassification (VAN model)
- ViTConfig configuration class: ViTForImageClassification (ViT model)
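Some entries above list more than one candidate class for a single configuration class. The list does not say how one of them is picked, so the sketch below simply assumes that the config carries an architectures list naming the preferred class and that the first candidate is used otherwise. All names are hypothetical stand-ins, and the rule itself is an assumption for illustration.

```python
# Hypothetical stand-ins for a config that maps to several model classes.
class ToyDeiTConfig:
    def __init__(self, architectures=None):
        self.architectures = architectures or []


class ToyForImageClassification:
    pass


class ToyForImageClassificationWithTeacher:
    pass


CONFIG_TO_MODELS = {
    ToyDeiTConfig: (ToyForImageClassification, ToyForImageClassificationWithTeacher),
}


def pick_model_class(config):
    """Prefer a class named in config.architectures, else the first candidate."""
    candidates = CONFIG_TO_MODELS[type(config)]
    by_name = {cls.__name__: cls for cls in candidates}
    for name in config.architectures:
        if name in by_name:
            return by_name[name]
    return candidates[0]  # assumed default when nothing matches


default_cls = pick_model_class(ToyDeiTConfig())
teacher_cls = pick_model_class(ToyDeiTConfig(["ToyForImageClassificationWithTeacher"]))
print(default_cls.__name__)  # ToyForImageClassification
print(teacher_cls.__name__)  # ToyForImageClassificationWithTeacher
```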
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- beit — BeitForImageClassification (BEiT model)
- convnext — ConvNextForImageClassification (ConvNeXT model)
- cvt — CvtForImageClassification (CvT model)
- data2vec-vision — Data2VecVisionForImageClassification (Data2VecVision model)
- deit — DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
- imagegpt — ImageGPTForImageClassification (ImageGPT model)
- levit — LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
- perceiver — PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
- poolformer — PoolFormerForImageClassification (PoolFormer model)
- regnet — RegNetForImageClassification (RegNet model)
- resnet — ResNetForImageClassification (ResNet model)
- segformer — SegformerForImageClassification (SegFormer model)
- swin — SwinForImageClassification (Swin Transformer model)
- van — VanForImageClassification (VAN model)
- vit — ViTForImageClassification (ViT model)
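The selection rule above (use model_type when available, otherwise pattern-match the name or path) could look roughly like the following. This is a toy sketch under stated assumptions: the mapping is abbreviated, pick_class_name is a hypothetical helper, and the library's actual lookup logic may differ.

```python
# Hypothetical, abbreviated mapping; class names are taken from the list above.
MODEL_TYPE_TO_CLASS_NAME = {
    "beit": "BeitForImageClassification",
    "convnext": "ConvNextForImageClassification",
    "vit": "ViTForImageClassification",
}


def pick_class_name(name_or_path, model_type=None):
    """Return the class name for a checkpoint (toy illustration)."""
    if model_type is not None:
        # Preferred route: the config carries an explicit model_type.
        if model_type not in MODEL_TYPE_TO_CLASS_NAME:
            raise ValueError(f"Unsupported model_type: {model_type!r}")
        return MODEL_TYPE_TO_CLASS_NAME[model_type]

    # Fallback: look for a known model type inside the name or path.
    # Longer keys are tried first so a longer key wins over a shorter key it contains.
    lowered = name_or_path.lower()
    for key in sorted(MODEL_TYPE_TO_CLASS_NAME, key=len, reverse=True):
        if key in lowered:
            return MODEL_TYPE_TO_CLASS_NAME[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")


print(pick_class_name("some-checkpoint", model_type="beit"))
print(pick_class_name("google/vit-base-patch16-224"))
```

A name that matches none of the supported types, such as a text-only checkpoint, raises an error in this sketch instead of silently picking a class.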
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForImageClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vit_tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
... "./tf_model/vit_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (Vision Encoder decoder model)
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
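One of the cases listed under config is a local directory that contains a file named config.json. A minimal sketch of that check, using only the standard library, might look like this. find_local_config is a hypothetical helper written for illustration and is not part of the library.

```python
import json
import tempfile
from pathlib import Path


def find_local_config(model_dir):
    """Return the parsed config.json from model_dir, or None if it is absent."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return None
    with config_path.open(encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    # Nothing has been saved yet, so no configuration can be loaded automatically.
    print(find_local_config(tmp))
    # Simulate a saved model directory by writing a tiny config.json.
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "vision-encoder-decoder"}), encoding="utf-8"
    )
    print(find_local_config(tmp)["model_type"])
```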
Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- vision-encoder-decoder — VisionEncoderDecoderModel (Vision Encoder decoder model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForVision2Seq
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVision2Seq.from_pretrained("microsoft/trocr-base-handwritten")
>>> # Update configuration during loading
>>> model = AutoModelForVision2Seq.from_pretrained("microsoft/trocr-base-handwritten", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/trocr_tf_model_config.json")
>>> model = AutoModelForVision2Seq.from_pretrained(
... "./tf_model/trocr_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ViltConfig configuration class: ViltForQuestionAnswering (ViLT model)
Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
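The loading-information parameter above mentions missing and unexpected keys. As a rough illustration of what those two terms mean, the sketch below compares the parameter names a model expects with the names found in a checkpoint. compare_keys and the parameter names are hypothetical, and the dictionary layout is an assumption made for this example.

```python
def compare_keys(expected_keys, checkpoint_keys):
    """Report which parameter names are absent from, or extra in, a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but not present in the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
    }


info = compare_keys(
    expected_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])
print(info["unexpected_keys"])
```

Missing keys typically correspond to a freshly added head that still needs training, while unexpected keys are weights the chosen architecture has no use for.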
Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- vilt — ViltForQuestionAnswering (ViLT model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
... "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForSequenceClassification (Data2VecAudio model)
- HubertConfig configuration class: HubertForSequenceClassification (Hubert model)
- SEWConfig configuration class: SEWForSequenceClassification (SEW model)
- SEWDConfig configuration class: SEWDForSequenceClassification (SEW-D model)
- UniSpeechConfig configuration class: UniSpeechForSequenceClassification (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForSequenceClassification (UniSpeechSat model)
- Wav2Vec2Config configuration class: Wav2Vec2ForSequenceClassification (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForSequenceClassification (WavLM model)
Instantiates one of the model classes of the library (with an audio classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
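Selecting a model class from the configuration class, as described above, amounts to a lookup keyed on the type of the config object. The following is a toy sketch with hypothetical stand-in classes (the Toy-prefixed names are not library classes), and it builds an object without loading any weights, in the same spirit as the note above.

```python
class ToyWav2Vec2Config:
    """Hypothetical stand-in for a configuration class."""


class ToyHubertConfig:
    """Hypothetical stand-in for another configuration class."""


class ToyWav2Vec2ForSequenceClassification:
    def __init__(self, config):
        self.config = config


class ToyHubertForSequenceClassification:
    def __init__(self, config):
        self.config = config


# Each configuration class maps to exactly one model class for this task.
CONFIG_TO_MODEL = {
    ToyWav2Vec2Config: ToyWav2Vec2ForSequenceClassification,
    ToyHubertConfig: ToyHubertForSequenceClassification,
}


def toy_from_config(config):
    """Instantiate the model class registered for this config's type."""
    model_class = CONFIG_TO_MODEL.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
    # Only the architecture is built here; no pretrained weights are involved.
    return model_class(config)


model = toy_from_config(ToyHubertConfig())
print(type(model).__name__)
```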
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForSequenceClassification (Data2VecAudio model)
- hubert — HubertForSequenceClassification (Hubert model)
- sew — SEWForSequenceClassification (SEW model)
- sew-d — SEWDForSequenceClassification (SEW-D model)
- unispeech — UniSpeechForSequenceClassification (UniSpeech model)
- unispeech-sat — UniSpeechSatForSequenceClassification (UniSpeechSat model)
- wav2vec2 — Wav2Vec2ForSequenceClassification (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer model)
- wavlm — WavLMForSequenceClassification (WavLM model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForAudioClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks")
>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/wav2vec2_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
... "./tf_model/wav2vec2_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForAudioFrameClassification (Data2VecAudio model)
- UniSpeechSatConfig configuration class: UniSpeechSatForAudioFrameClassification (UniSpeechSat model)
- Wav2Vec2Config configuration class: Wav2Vec2ForAudioFrameClassification (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForAudioFrameClassification (WavLM model)
Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForAudioFrameClassification (Data2VecAudio model)
- unispeech-sat — UniSpeechSatForAudioFrameClassification (UniSpeechSat model)
- wav2vec2 — Wav2Vec2ForAudioFrameClassification (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer model)
- wavlm — WavLMForAudioFrameClassification (WavLM model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification
>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd")
>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/wav2vec2_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
... "./tf_model/wav2vec2_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
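To see why the evaluation-mode default mentioned above matters, here is a small pure-Python sketch of how a dropout layer behaves in each mode. ToyDropout is a hypothetical, list-based stand-in written for this illustration; it is not the PyTorch module, although it follows the same idea (inverted dropout in training, identity in evaluation).

```python
import random


class ToyDropout:
    """Hypothetical, list-based stand-in for a dropout module."""

    def __init__(self, p=0.5, seed=0):
        self.p = p  # probability of zeroing a value; must be < 1
        self.training = True
        self._rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is a no-op.
            return list(values)
        # Training mode: zero each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # unchanged in evaluation mode
layer.train()
print(layer([1.0, 2.0, 3.0]))  # each entry is either zeroed or doubled
```

Because a freshly loaded model starts in evaluation mode, forgetting to call model.train() before fine-tuning means layers like this one stay inactive.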
This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForCTC (Data2VecAudio model)
- HubertConfig configuration class: HubertForCTC (Hubert model)
- MCTCTConfig configuration class: MCTCTForCTC (M-CTC-T model)
- SEWConfig configuration class: SEWForCTC (SEW model)
- SEWDConfig configuration class: SEWDForCTC (SEW-D model)
- UniSpeechConfig configuration class: UniSpeechForCTC (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForCTC (UniSpeechSat model)
- Wav2Vec2Config configuration class: Wav2Vec2ForCTC (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForCTC (WavLM model)
Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
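The two behaviours of kwargs can be illustrated with a small stand-alone sketch. It is an approximation written for this page, not the library's code: `ToyConfig`, `split_kwargs` and `toy_from_pretrained` are hypothetical names, and `custom_flag` is a made-up keyword that matches no configuration attribute.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=16):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the configuration attribute
        else:
            model_kwargs[key] = value  # forwarded to the model's __init__
    return model_kwargs


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # A configuration was supplied: kwargs go to the model untouched.
        return config, dict(kwargs)
    # Stands in for the automatically loaded configuration.
    config = ToyConfig()
    return config, split_kwargs(config, kwargs)


config, model_kwargs = toy_from_pretrained(output_attentions=True, custom_flag=1)
print(config.output_attentions, model_kwargs)  # True {'custom_flag': 1}
```

When a configuration object is passed explicitly, the same call leaves it unchanged and forwards every keyword to the model.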
Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForCTC (Data2VecAudio model)
- hubert — HubertForCTC (Hubert model)
- mctct — MCTCTForCTC (M-CTC-T model)
- sew — SEWForCTC (SEW model)
- sew-d — SEWDForCTC (SEW-D model)
- unispeech — UniSpeechForCTC (UniSpeech model)
- unispeech-sat — UniSpeechSatForCTC (UniSpeechSat model)
- wav2vec2 — Wav2Vec2ForCTC (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer model)
- wavlm — WavLMForCTC (WavLM model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForCTC
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForCTC.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Speech2TextConfig configuration class: Speech2TextForConditionalGeneration (Speech2Text model)
- SpeechEncoderDecoderConfig configuration class: SpeechEncoderDecoderModel (Speech Encoder decoder model)
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
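One of the cases listed for config is a local directory that contains a config.json file. A rough sketch of that check, using only the standard library, could look like the following. `find_local_config` is a hypothetical helper written for illustration, and the file contents are made up; the library's own loading logic does considerably more.

```python
import json
import os
import tempfile


def find_local_config(path):
    """Return the parsed config.json from a local directory, or None if absent."""
    config_file = os.path.join(path, "config.json")
    if not (os.path.isdir(path) and os.path.isfile(config_file)):
        return None
    with open(config_file, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as save_dir:
    # Simulate a directory written by save_pretrained() (illustrative contents).
    with open(os.path.join(save_dir, "config.json"), "w", encoding="utf-8") as handle:
        json.dump({"model_type": "speech_to_text"}, handle)
    loaded = find_local_config(save_dir)
    missing = find_local_config(os.path.join(save_dir, "no_such_dir"))

print(loaded, missing)  # {'model_type': 'speech_to_text'} None
```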
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- speech-encoder-decoder — SpeechEncoderDecoderModel (Speech Encoder decoder model)
- speech_to_text — Speech2TextForConditionalGeneration (Speech2Text model)
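The list above can be read as a table from model_type to model class, with the checkpoint name as a fallback when model_type is absent. The sketch below illustrates that reading under stated assumptions: `TOY_SPEECH_SEQ2SEQ_MAPPING` and `select_model_class` are hypothetical, the class names are plain strings rather than real classes, and the substring-based fallback is a guess at what "pattern matching" could look like, not the library's actual rule.

```python
# Hypothetical table mirroring the list above: model_type -> model class name.
TOY_SPEECH_SEQ2SEQ_MAPPING = {
    "speech-encoder-decoder": "SpeechEncoderDecoderModel",
    "speech_to_text": "Speech2TextForConditionalGeneration",
}


def select_model_class(config_dict, name_or_path):
    """Pick a class name from model_type, else by matching keys in the name."""
    model_type = config_dict.get("model_type")
    if model_type is not None:
        if model_type not in TOY_SPEECH_SEQ2SEQ_MAPPING:
            raise ValueError(f"Unsupported model_type {model_type!r}")
        return TOY_SPEECH_SEQ2SEQ_MAPPING[model_type]
    # Fallback: try longer keys first so the most specific pattern wins.
    for pattern in sorted(TOY_SPEECH_SEQ2SEQ_MAPPING, key=len, reverse=True):
        if pattern in str(name_or_path):
            return TOY_SPEECH_SEQ2SEQ_MAPPING[pattern]
    raise ValueError(f"Cannot infer a model class for {name_or_path!r}")


by_type = select_model_class({"model_type": "speech-encoder-decoder"}, "any/name")
by_name = select_model_class({}, "./runs/speech_to_text-finetuned")
print(by_type)  # SpeechEncoderDecoderModel
print(by_name)  # Speech2TextForConditionalGeneration
```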
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an audio retrieval via x-vector head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForXVector (Data2VecAudio model)
- UniSpeechSatConfig configuration class: UniSpeechSatForXVector (UniSpeechSat model)
- Wav2Vec2Config configuration class: Wav2Vec2ForXVector (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForXVector (WavLM model)
Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
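The three accepted forms of pretrained_model_name_or_path (model id, local directory, TensorFlow index checkpoint) can be told apart with a few simple checks. The helper below is a hypothetical sketch of such a classification, not how the library resolves the argument: `describe_source` is an invented name, and treating any string ending in `.index` as a TensorFlow checkpoint is an assumption based on the example path used on this page.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Classify the argument into one of the three forms described above."""
    if os.path.isdir(name_or_path):
        return "local_directory"
    if str(name_or_path).endswith(".index"):
        # Assumed convention; this form also needs from_tf=True and a config.
        return "tf_index_checkpoint"
    # Anything else is treated as a model id to look up remotely.
    return "model_id"


with tempfile.TemporaryDirectory() as saved:
    kinds = [
        describe_source(saved),
        describe_source("./tf_model/bert_tf_checkpoint.ckpt.index"),
        describe_source("dbmdz/bert-base-german-cased"),
    ]

print(kinds)
```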
Instantiate one of the model classes of the library (with an audio retrieval via x-vector head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- data2vec-audio — Data2VecAudioForXVector (Data2VecAudio model)
- unispeech-sat — UniSpeechSatForXVector (UniSpeechSat model)
- wav2vec2 — Wav2Vec2ForXVector (Wav2Vec2 model)
- wav2vec2-conformer — Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
- wavlm — WavLMForXVector (WavLM model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForAudioXVector
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: DeiTForMaskedImageModeling (DeiT model)
- SwinConfig configuration class: SwinForMaskedImageModeling (Swin Transformer model)
- ViTConfig configuration class: ViTForMaskedImageModeling (ViT model)
Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- deit — DeiTForMaskedImageModeling (DeiT model)
- swin — SwinForMaskedImageModeling (Swin Transformer model)
- vit — ViTForMaskedImageModeling (ViT model)
The model is set in evaluation mode by default using model.eval() (so for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForObjectDetection (DETR model)
- YolosConfig configuration class: YolosForObjectDetection (YOLOS model)
Instantiates one of the model classes of the library (with an object detection head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from the saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- detr — DetrForObjectDetection (DETR model)
- yolos — YolosForObjectDetection (YOLOS model)
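As a rough illustration of the selection rule above, the lookup can be thought of as a dictionary keyed by model_type, with a fallback that searches the name or path. This is a minimal sketch, not the library's code: `pick_class_name` is a hypothetical helper, the class names are stored as plain strings so the snippet runs without the library installed, and the substring fallback is an assumed simplification of the pattern matching.

```python
# Mapping copied from the list above; values are class names, not classes.
MODEL_TYPE_TO_CLASS_NAME = {
    "detr": "DetrForObjectDetection",
    "yolos": "YolosForObjectDetection",
}


def pick_class_name(model_type, name_or_path):
    """Return the class name for a model_type, else match on the name or path."""
    if model_type in MODEL_TYPE_TO_CLASS_NAME:
        return MODEL_TYPE_TO_CLASS_NAME[model_type]
    # Fallback (assumed behaviour): look for a known key inside the name/path.
    for key, class_name in MODEL_TYPE_TO_CLASS_NAME.items():
        if key in str(name_or_path):
            return class_name
    raise ValueError(f"Unrecognized model: {name_or_path!r}")


print(pick_class_name("yolos", "any/name"))
print(pick_class_name(None, "./my-detr-checkpoint"))
```

A name that matches neither the mapping nor the fallback raises an error, which is also what happens in this sketch for a text-only checkpoint name such as bert-base-cased.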
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForObjectDetection
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50")
>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DETR model)
Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
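The difference between the two construction paths can be illustrated with a toy registry. This is a sketch under stated assumptions rather than the library's implementation: `ToyConfig`, `ToyModel`, `toy_from_config`, `toy_from_pretrained`, and the in-memory `SAVED_CHECKPOINT` are all hypothetical.

```python
import random


class ToyConfig:
    """Hypothetical configuration: it only describes the architecture."""

    def __init__(self, num_weights=4):
        self.num_weights = num_weights


class ToyModel:
    """Hypothetical model whose weights start out randomly initialised."""

    def __init__(self, config):
        self.config = config
        self.weights = [random.random() for _ in range(config.num_weights)]


# Registry keyed by configuration class, mirroring the list above.
CONFIG_TO_MODEL = {ToyConfig: ToyModel}

# Pretend checkpoint: configuration values plus trained weights.
SAVED_CHECKPOINT = {"num_weights": 4, "weights": [0.1, 0.2, 0.3, 0.4]}


def toy_from_config(config):
    # Select the model class from the type of the configuration.
    # No weights are read: the model keeps its random initialisation.
    return CONFIG_TO_MODEL[type(config)](config)


def toy_from_pretrained(checkpoint):
    # Build from the stored configuration, then overwrite with stored weights.
    model = toy_from_config(ToyConfig(checkpoint["num_weights"]))
    model.weights = list(checkpoint["weights"])
    return model


fresh = toy_from_config(ToyConfig(num_weights=4))
loaded = toy_from_pretrained(SAVED_CHECKPOINT)
print(type(fresh).__name__, len(fresh.weights))
print(loaded.weights)
```

Only the second path ends up with the stored weights; the first yields a model of the right shape with untrained values.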
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
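The interaction between the cache and the download-related flags above can be summarised as a small decision function. This is a hedged sketch, not the library's logic: `resolve_source` and its `local_only` parameter are hypothetical names, and the precedence chosen when both flags are set is an assumption of this example.

```python
def resolve_source(is_cached, force_download=False, local_only=False):
    """Decide where a file would come from under the flags described above.

    Returns "cache" or "download". Giving the local-only flag precedence over
    force_download is an assumption of this sketch.
    """
    if local_only:
        if is_cached:
            return "cache"
        raise FileNotFoundError("file is not cached and downloads are disabled")
    if force_download or not is_cached:
        return "download"
    return "cache"


print(resolve_source(is_cached=True))
print(resolve_source(is_cached=True, force_download=True))
print(resolve_source(is_cached=True, local_only=True))
```

With default flags a cached file is reused, forcing the download ignores the cache, and the local-only mode never reaches the network.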
Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path
- detr — DetrForSegmentation (DETR model)
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForImageSegmentation
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForSemanticSegmentation (BEiT model)
- DPTConfig configuration class: DPTForSemanticSegmentation (DPT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForSemanticSegmentation (Data2VecVision model)
- SegformerConfig configuration class: SegformerForSemanticSegmentation (SegFormer model)
Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path
- beit — BeitForSemanticSegmentation (BEiT model)
- data2vec-vision — Data2VecVisionForSemanticSegmentation (Data2VecVision model)
- dpt — DPTForSemanticSegmentation (DPT model)
- segformer — SegformerForSemanticSegmentation (SegFormer model)
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
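The evaluation-mode default described above (dropout deactivated until model.train() is called) can be illustrated with a toy layer. This is a minimal sketch in plain Python, assuming nothing about the real modules: `ToyDropout` is a hypothetical class, not torch.nn.Dropout, and it only mimics the train/eval switch.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer with a training flag."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: the layer is an identity function.
            return list(values)
        # Training mode: zero each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if random.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # inputs pass through unchanged
layer.train()
print(layer([1.0, 2.0, 3.0]))  # each entry is either 0.0 or doubled
```

A freshly loaded model behaves like the first call; fine-tuning code has to switch to the second mode explicitly.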
This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormer model)
Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a TensorFlow index checkpoint file. In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (Dict[str, torch.Tensor], optional) —
A state dictionary to use instead of a state dictionary loaded from saved weights file.
This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path
- maskformer — MaskFormerForInstanceSegmentation (MaskFormer model)
The model is set in evaluation mode by default using model.eval()
(so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with model.train().
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation
>>> # Download model and configuration from the Hub and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade")
>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
... "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
< source >( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertModel (ALBERT model)
- BartConfig configuration class: TFBartModel (BART model)
- BertConfig configuration class: TFBertModel (BERT model)
- BlenderbotConfig configuration class: TFBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallModel (BlenderbotSmall model)
- CLIPConfig configuration class: TFCLIPModel (CLIP model)
- CTRLConfig configuration class: TFCTRLModel (CTRL model)
- CamembertConfig configuration class: TFCamembertModel (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: TFConvNextModel (ConvNeXT model)
- DPRConfig configuration class: TFDPRQuestionEncoder (DPR model)
- Data2VecVisionConfig configuration class: TFData2VecVisionModel (Data2VecVision model)
- DebertaConfig configuration class: TFDebertaModel (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2Model (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertModel (DistilBERT model)
- ElectraConfig configuration class: TFElectraModel (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJModel (GPT-J model)
- HubertConfig configuration class: TFHubertModel (Hubert model)
- LEDConfig configuration class: TFLEDModel (LED model)
- LayoutLMConfig configuration class: TFLayoutLMModel (LayoutLM model)
- LongformerConfig configuration class: TFLongformerModel (Longformer model)
- LxmertConfig configuration class: TFLxmertModel (LXMERT model)
- MBartConfig configuration class: TFMBartModel (mBART model)
- MPNetConfig configuration class: TFMPNetModel (MPNet model)
- MT5Config configuration class: TFMT5Model (MT5 model)
- MarianConfig configuration class: TFMarianModel (Marian model)
- MobileBertConfig configuration class: TFMobileBertModel (MobileBERT model)
- OPTConfig configuration class: TFOPTModel (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTModel (OpenAI GPT model)
- PegasusConfig configuration class: TFPegasusModel (Pegasus model)
- RemBertConfig configuration class: TFRemBertModel (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerModel (RoFormer model)
- RobertaConfig configuration class: TFRobertaModel (RoBERTa model)
- Speech2TextConfig configuration class: TFSpeech2TextModel (Speech2Text model)
- SwinConfig configuration class: TFSwinModel (Swin Transformer model)
- T5Config configuration class: TFT5Model (T5 model)
- TapasConfig configuration class: TFTapasModel (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLModel (Transformer-XL model)
- ViTConfig configuration class: TFViTModel (ViT model)
- ViTMAEConfig configuration class: TFViTMAEModel (ViTMAE model)
- Wav2Vec2Config configuration class: TFWav2Vec2Model (Wav2Vec2 model)
- XLMConfig configuration class: TFXLMModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaModel (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetModel (XLNet model)
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) —
Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) —
Configuration for the model to use instead of an automatically loaded configuration. Configuration can
be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) —
Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type
property of the config object (either
passed as an argument or loaded from pretrained_model_name_or_path
if possible), or when it’s missing, by
falling back to using pattern matching on pretrained_model_name_or_path
- albert — TFAlbertModel (ALBERT model)
- bart — TFBartModel (BART model)
- bert — TFBertModel (BERT model)
- blenderbot — TFBlenderbotModel (Blenderbot model)
- blenderbot-small — TFBlenderbotSmallModel (BlenderbotSmall model)
- camembert — TFCamembertModel (CamemBERT model)
- clip — TFCLIPModel (CLIP model)
- convbert — TFConvBertModel (ConvBERT model)
- convnext — TFConvNextModel (ConvNeXT model)
- ctrl — TFCTRLModel (CTRL model)
- data2vec-vision — TFData2VecVisionModel (Data2VecVision model)
- deberta — TFDebertaModel (DeBERTa model)
- deberta-v2 — TFDebertaV2Model (DeBERTa-v2 model)
- distilbert — TFDistilBertModel (DistilBERT model)
- dpr — TFDPRQuestionEncoder (DPR model)
- electra — TFElectraModel (ELECTRA model)
- flaubert — TFFlaubertModel (FlauBERT model)
- funnel — TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
- gpt2 — TFGPT2Model (OpenAI GPT-2 model)
- gptj — TFGPTJModel (GPT-J model)
- hubert — TFHubertModel (Hubert model)
- layoutlm — TFLayoutLMModel (LayoutLM model)
- led — TFLEDModel (LED model)
- longformer — TFLongformerModel (Longformer model)
- lxmert — TFLxmertModel (LXMERT model)
- marian — TFMarianModel (Marian model)
- mbart — TFMBartModel (mBART model)
- mobilebert — TFMobileBertModel (MobileBERT model)
- mpnet — TFMPNetModel (MPNet model)
- mt5 — TFMT5Model (MT5 model)
- openai-gpt — TFOpenAIGPTModel (OpenAI GPT model)
- opt — TFOPTModel (OPT model)
- pegasus — TFPegasusModel (Pegasus model)
- rembert — TFRemBertModel (RemBERT model)
- roberta — TFRobertaModel (RoBERTa model)
- roformer — TFRoFormerModel (RoFormer model)
- speech_to_text — TFSpeech2TextModel (Speech2Text model)
- swin — TFSwinModel (Swin Transformer model)
- t5 — TFT5Model (T5 model)
- tapas — TFTapasModel (TAPAS model)
- transfo-xl — TFTransfoXLModel (Transformer-XL model)
- vit — TFViTModel (ViT model)
- vit_mae — TFViTMAEModel (ViTMAE model)
- wav2vec2 — TFWav2Vec2Model (Wav2Vec2 model)
- xlm — TFXLMModel (XLM model)
- xlm-roberta — TFXLMRobertaModel (XLM-RoBERTa model)
- xlnet — TFXLNetModel (XLNet model)
>>> from transformers import AutoConfig, TFAutoModel
>>> # Download model and configuration from the Hub and cache.
>>> model = TFAutoModel.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BertConfig configuration class: TFBertForPreTraining (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- ViTMAEConfig configuration class: TFViTMAEForPreTraining (ViTMAE model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
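The configuration-class dispatch described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: every class below (`ToyBertConfig`, `ToyT5Config`, `ToyBertForPreTraining`, `ToyT5ForConditionalGeneration`, `ToyAutoModelForPreTraining`) is a hypothetical stand-in, and the `weights_loaded` flag exists only to make the "no weights are loaded" note visible.

```python
class ToyBertConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self, hidden_size: int = 768):
        self.hidden_size = hidden_size


class ToyT5Config:
    """Hypothetical stand-in for a second configuration class."""

    def __init__(self, d_model: int = 512):
        self.d_model = d_model


class ToyBertForPreTraining:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False  # building from a config never loads weights


class ToyT5ForConditionalGeneration:
    def __init__(self, config):
        self.config = config
        self.weights_loaded = False


class ToyAutoModelForPreTraining:
    # One entry per supported configuration class.
    _model_mapping = {
        ToyBertConfig: ToyBertForPreTraining,
        ToyT5Config: ToyT5ForConditionalGeneration,
    }

    @classmethod
    def from_config(cls, config):
        # Select the model class from the type of the configuration object.
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
        return model_class(config)


model = ToyAutoModelForPreTraining.from_config(ToyBertConfig(hidden_size=128))
print(type(model).__name__, model.weights_loaded)
```

The returned object carries the supplied configuration but starts from freshly initialized state, which is why from_pretrained() is the method to use when the trained weights are needed.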
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForPreTraining (ALBERT model)
- bart — TFBartForConditionalGeneration (BART model)
- bert — TFBertForPreTraining (BERT model)
- camembert — TFCamembertForMaskedLM (CamemBERT model)
- ctrl — TFCTRLLMHeadModel (CTRL model)
- distilbert — TFDistilBertForMaskedLM (DistilBERT model)
- electra — TFElectraForPreTraining (ELECTRA model)
- flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
- funnel — TFFunnelForPreTraining (Funnel Transformer model)
- gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
- layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
- lxmert — TFLxmertForPreTraining (LXMERT model)
- mobilebert — TFMobileBertForPreTraining (MobileBERT model)
- mpnet — TFMPNetForMaskedLM (MPNet model)
- openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- roberta — TFRobertaForMaskedLM (RoBERTa model)
- t5 — TFT5ForConditionalGeneration (T5 model)
- tapas — TFTapasForMaskedLM (TAPAS model)
- transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
- vit_mae — TFViTMAEForPreTraining (ViTMAE model)
- xlm — TFXLMWithLMHeadModel (XLM model)
- xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
- xlnet — TFXLNetLMHeadModel (XLNet model)
>>> from transformers import AutoConfig, TFAutoModelForPreTraining
>>> # Download model and configuration from the Hub and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
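As a rough illustration of the "factory only" behaviour described above, the sketch below shows a class whose constructor always raises while its class method hands back a concrete model. `ToyModel` and `ToyAutoModel` are hypothetical names, and raising `EnvironmentError` is an assumption of this sketch rather than a statement about the library's exact error.

```python
class ToyModel:
    """Stand-in for a concrete architecture class (hypothetical)."""

    def __init__(self, config: dict):
        self.config = config


class ToyAutoModel:
    """Factory-only class: direct construction is rejected."""

    def __init__(self, *args, **kwargs):
        # Assumption of this sketch: misuse is signalled with EnvironmentError.
        raise EnvironmentError(
            "ToyAutoModel must be created with ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config: dict) -> ToyModel:
        # The factory returns a concrete model, never a ToyAutoModel instance.
        return ToyModel(config)


model = ToyAutoModel.from_config({"hidden_size": 8})
print(type(model).__name__)
```

Because the factory returns an instance of the concrete class, code that receives the result never sees the auto class itself.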
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertLMHeadModel (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForCausalLM (CamemBERT model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForCausalLM (GPT-J model)
- OPTConfig configuration class: TFOPTForCausalLM (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForCausalLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForCausalLM (RoBERTa model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
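The two kwargs cases above can be illustrated with a small, library-independent sketch. `ToyConfig`, `split_kwargs`, `load` and the `custom_flag` key are hypothetical names introduced here; the real loading code is more involved, so treat this only as a picture of the routing rule.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions: bool = False, hidden_size: int = 768):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Apply kwargs that match config attributes; return the rest for the model."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the matching config attribute
        else:
            remaining[key] = value  # left for the model's __init__
    return remaining


def load(config=None, **kwargs):
    """Return (config, model_kwargs) following the two cases described above."""
    if config is not None:
        # A config was supplied: kwargs are forwarded to the model untouched.
        return config, dict(kwargs)
    # No config supplied: build one, then let kwargs override its attributes.
    config = ToyConfig()
    return config, split_kwargs(config, kwargs)


config, model_kwargs = load(output_attentions=True, custom_flag=1)
print(config.output_attentions, model_kwargs)
```

In the first case the caller is trusted to have prepared the configuration already, which is why a matching key such as output_attentions is not applied to it.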
Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bert — TFBertLMHeadModel (BERT model)
- camembert — TFCamembertForCausalLM (CamemBERT model)
- ctrl — TFCTRLLMHeadModel (CTRL model)
- gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
- gptj — TFGPTJForCausalLM (GPT-J model)
- openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- opt — TFOPTForCausalLM (OPT model)
- rembert — TFRemBertForCausalLM (RemBERT model)
- roberta — TFRobertaForCausalLM (RoBERTa model)
- roformer — TFRoFormerForCausalLM (RoFormer model)
- transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
- xlm — TFXLMWithLMHeadModel (XLM model)
- xlnet — TFXLNetLMHeadModel (XLNet model)
>>> from transformers import AutoConfig, TFAutoModelForCausalLM
>>> # Download model and configuration from the Hub and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ConvNextConfig configuration class: TFConvNextForImageClassification (ConvNeXT model)
- Data2VecVisionConfig configuration class: TFData2VecVisionForImageClassification (Data2VecVision model)
- SwinConfig configuration class: TFSwinForImageClassification (Swin Transformer model)
- ViTConfig configuration class: TFViTForImageClassification (ViT model)
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- convnext — TFConvNextForImageClassification (ConvNeXT model)
- data2vec-vision — TFData2VecVisionForImageClassification (Data2VecVision model)
- swin — TFSwinForImageClassification (Swin Transformer model)
- vit — TFViTForImageClassification (ViT model)
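The selection rule above (use model_type when it is known, otherwise pattern-match on the name or path) might look roughly like the following. This is a sketch under stated assumptions: `resolve_class_name` and the example path are hypothetical, the mapping holds class names as plain strings copied from the list above, and trying longer keys first is a choice of this sketch, not a documented behaviour.

```python
from typing import Optional

# model_type -> class name, copied from the list above (names only, no real classes).
IMAGE_CLASSIFICATION_MAPPING = {
    "convnext": "TFConvNextForImageClassification",
    "data2vec-vision": "TFData2VecVisionForImageClassification",
    "swin": "TFSwinForImageClassification",
    "vit": "TFViTForImageClassification",
}


def resolve_class_name(name_or_path: str, model_type: Optional[str] = None) -> str:
    """Return the class name for a checkpoint, preferring the config's model_type."""
    if model_type is not None:
        if model_type not in IMAGE_CLASSIFICATION_MAPPING:
            raise ValueError(f"Unsupported model_type: {model_type!r}")
        return IMAGE_CLASSIFICATION_MAPPING[model_type]
    # Fallback: look for a known model_type key inside the name or path.
    # Longer keys are tried first so a longer key wins over a shorter one it contains.
    for key in sorted(IMAGE_CLASSIFICATION_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return IMAGE_CLASSIFICATION_MAPPING[key]
    raise ValueError(f"Could not infer a model type from {name_or_path!r}")


print(resolve_class_name("anything", model_type="swin"))
print(resolve_class_name("./checkpoints/my-vit-finetuned"))
```

The fallback only works when the name or path happens to contain a recognizable key, so a configuration with an explicit model_type is the more dependable route.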
>>> from transformers import AutoConfig, TFAutoModelForImageClassification
>>> # Download model and configuration from the Hub and cache.
>>> model = TFAutoModelForImageClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForImageClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMaskedLM (ALBERT model)
- BertConfig configuration class: TFBertForMaskedLM (BERT model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMaskedLM (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMaskedLM (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMaskedLM (Funnel Transformer model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: TFLongformerForMaskedLM (Longformer model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMaskedLM (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMaskedLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model into a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Models and other artifacts are stored on the Hub in a git-based system, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForMaskedLM (ALBERT model)
- bert — TFBertForMaskedLM (BERT model)
- camembert — TFCamembertForMaskedLM (CamemBERT model)
- convbert — TFConvBertForMaskedLM (ConvBERT model)
- deberta — TFDebertaForMaskedLM (DeBERTa model)
- deberta-v2 — TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
- distilbert — TFDistilBertForMaskedLM (DistilBERT model)
- electra — TFElectraForMaskedLM (ELECTRA model)
- flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
- funnel — TFFunnelForMaskedLM (Funnel Transformer model)
- layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
- longformer — TFLongformerForMaskedLM (Longformer model)
- mobilebert — TFMobileBertForMaskedLM (MobileBERT model)
- mpnet — TFMPNetForMaskedLM (MPNet model)
- rembert — TFRemBertForMaskedLM (RemBERT model)
- roberta — TFRobertaForMaskedLM (RoBERTa model)
- roformer — TFRoFormerForMaskedLM (RoFormer model)
- tapas — TFTapasForMaskedLM (TAPAS model)
- xlm — TFXLMWithLMHeadModel (XLM model)
- xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
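Comparing this list with the pretraining and causal language modeling lists earlier shows that one model_type resolves to a different class per task. The sketch below collects a few of those entries side by side. The task labels (`"pretraining"`, `"causal-lm"`, `"masked-lm"`) and the helper `classes_for` are hypothetical, and the class names are plain strings copied from the lists in this document.

```python
# task label (hypothetical) -> {model_type: class name from the lists in this document}
TASK_MAPPINGS = {
    "pretraining": {
        "bert": "TFBertForPreTraining",
        "roberta": "TFRobertaForMaskedLM",
        "xlnet": "TFXLNetLMHeadModel",
    },
    "causal-lm": {
        "bert": "TFBertLMHeadModel",
        "roberta": "TFRobertaForCausalLM",
        "xlnet": "TFXLNetLMHeadModel",
    },
    "masked-lm": {
        "bert": "TFBertForMaskedLM",
        "roberta": "TFRobertaForMaskedLM",
    },
}


def classes_for(model_type: str) -> dict:
    """Return {task: class name} for every task that supports this model_type."""
    return {
        task: mapping[model_type]
        for task, mapping in TASK_MAPPINGS.items()
        if model_type in mapping
    }


print(classes_for("bert"))
print(classes_for("xlnet"))
```

A model_type missing from a task's mapping, such as xlnet for masked language modeling here, means the corresponding auto class has no class to return for that checkpoint.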
>>> from transformers import AutoConfig, TFAutoModelForMaskedLM
>>> # Download model and configuration from the Hub and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: TFEncoderDecoderModel (Encoder decoder model)
- LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
- MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: TFMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: TFMarianMTModel (Marian model)
- PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
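The handling of keyword arguments when no configuration is supplied can be sketched in plain Python. The example below only illustrates the documented rule (keys that name a configuration attribute override it, the rest go on to the model); `ToyConfig` and `split_kwargs` are hypothetical names, and the library's actual code path may differ.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config, kwargs):
    """Apply kwargs that name a config attribute; return the remaining ones."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # override the configuration attribute
        else:
            remaining[key] = value  # would be forwarded to the model's __init__
    return remaining


config = ToyConfig()
model_kwargs = split_kwargs(config, {"output_attentions": True, "name": "demo"})
print(config.output_attentions, model_kwargs)
```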
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bart — TFBartForConditionalGeneration (BART model)
- blenderbot — TFBlenderbotForConditionalGeneration (Blenderbot model)
- blenderbot-small — TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- encoder-decoder — TFEncoderDecoderModel (Encoder decoder model)
- led — TFLEDForConditionalGeneration (LED model)
- marian — TFMarianMTModel (Marian model)
- mbart — TFMBartForConditionalGeneration (mBART model)
- mt5 — TFMT5ForConditionalGeneration (MT5 model)
- pegasus — TFPegasusForConditionalGeneration (Pegasus model)
- t5 — TFT5ForConditionalGeneration (T5 model)
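The two-step selection described above (use model_type when available, otherwise match patterns in the name or path) can be sketched as follows. This is an assumption-laden illustration: the mapping stores class names as strings, `resolve_model_class` is a hypothetical helper, and trying longer patterns first is a choice made here so that names such as "mt5" or "mbart" are not captured by the shorter "t5" or "bart"; the library's own matching order may be different.

```python
# model_type -> class name, copied from the list above (strings only, no imports).
SEQ2SEQ_CLASS_NAMES = {
    "bart": "TFBartForConditionalGeneration",
    "blenderbot": "TFBlenderbotForConditionalGeneration",
    "blenderbot-small": "TFBlenderbotSmallForConditionalGeneration",
    "encoder-decoder": "TFEncoderDecoderModel",
    "led": "TFLEDForConditionalGeneration",
    "marian": "TFMarianMTModel",
    "mbart": "TFMBartForConditionalGeneration",
    "mt5": "TFMT5ForConditionalGeneration",
    "pegasus": "TFPegasusForConditionalGeneration",
    "t5": "TFT5ForConditionalGeneration",
}


def resolve_model_class(name_or_path, model_type=None):
    """Return the class name for `model_type`, or guess it from the name."""
    if model_type is not None:
        if model_type not in SEQ2SEQ_CLASS_NAMES:
            raise ValueError(f"Unsupported model_type: {model_type}")
        return SEQ2SEQ_CLASS_NAMES[model_type]
    # Fallback: substring matching, longest pattern first (an assumption).
    for pattern in sorted(SEQ2SEQ_CLASS_NAMES, key=len, reverse=True):
        if pattern in name_or_path:
            return SEQ2SEQ_CLASS_NAMES[pattern]
    raise ValueError(f"Could not infer a model class from: {name_or_path}")


print(resolve_model_class("t5-base"))
print(resolve_model_class("./my_checkpoint", model_type="led"))
```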
>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-base", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
- BertConfig configuration class: TFBertForSequenceClassification (BERT model)
- CTRLConfig configuration class: TFCTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForSequenceClassification (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForSequenceClassification (GPT-J model)
- LayoutLMConfig configuration class: TFLayoutLMForSequenceClassification (LayoutLM model)
- LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
- MPNetConfig configuration class: TFMPNetForSequenceClassification (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForSequenceClassification (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
- TapasConfig configuration class: TFTapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLForSequenceClassification (Transformer-XL model)
- XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
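One of the options above returns a dictionary of missing keys, unexpected keys and error messages (output_loading_info in the library). What "missing" and "unexpected" mean can be sketched with set differences between the weight names a model expects and the names found in a checkpoint. The helper `loading_info`, the dictionary keys and the weight names below are hypothetical and chosen for illustration; they are not taken from the library.

```python
def loading_info(model_keys, checkpoint_keys):
    """Compare expected weight names with those present in a checkpoint."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
        # Left empty in this sketch.
        "error_msgs": [],
    }


info = loading_info(
    model_keys=["encoder.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info)
```

In this toy case the classification head weights are reported as missing, which is the situation one would expect when a checkpoint saved without such a head is loaded into a model that has one.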
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForSequenceClassification (ALBERT model)
- bert — TFBertForSequenceClassification (BERT model)
- camembert — TFCamembertForSequenceClassification (CamemBERT model)
- convbert — TFConvBertForSequenceClassification (ConvBERT model)
- ctrl — TFCTRLForSequenceClassification (CTRL model)
- deberta — TFDebertaForSequenceClassification (DeBERTa model)
- deberta-v2 — TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
- distilbert — TFDistilBertForSequenceClassification (DistilBERT model)
- electra — TFElectraForSequenceClassification (ELECTRA model)
- flaubert — TFFlaubertForSequenceClassification (FlauBERT model)
- funnel — TFFunnelForSequenceClassification (Funnel Transformer model)
- gpt2 — TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
- gptj — TFGPTJForSequenceClassification (GPT-J model)
- layoutlm — TFLayoutLMForSequenceClassification (LayoutLM model)
- longformer — TFLongformerForSequenceClassification (Longformer model)
- mobilebert — TFMobileBertForSequenceClassification (MobileBERT model)
- mpnet — TFMPNetForSequenceClassification (MPNet model)
- openai-gpt — TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
- rembert — TFRemBertForSequenceClassification (RemBERT model)
- roberta — TFRobertaForSequenceClassification (RoBERTa model)
- roformer — TFRoFormerForSequenceClassification (RoFormer model)
- tapas — TFTapasForSequenceClassification (TAPAS model)
- transfo-xl — TFTransfoXLForSequenceClassification (Transformer-XL model)
- xlm — TFXLMForSequenceClassification (XLM model)
- xlm-roberta — TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
- xlnet — TFXLNetForSequenceClassification (XLNet model)
>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: TFBertForMultipleChoice (BERT model)
- CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMultipleChoice (ConvBERT model)
- DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
- LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
- MPNetConfig configuration class: TFMPNetForMultipleChoice (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMultipleChoice (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
- XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
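The first argument described above can be a model id, a directory, or a path or URL to a checkpoint file. A rough way to tell these cases apart is sketched below. `classify_source` is a hypothetical helper written for this illustration; it is not how the library resolves the argument, and the file name `weights.bin` is arbitrary.

```python
import os
import tempfile


def classify_source(name_or_path):
    """Guess which kind of source a from_pretrained-style argument refers to."""
    if name_or_path.startswith(("http://", "https://")):
        return "url"
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path):
        return "file"
    # Anything that is not a URL or an existing local path is treated as a model id.
    return "model id"


with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint = os.path.join(tmp_dir, "weights.bin")
    open(checkpoint, "wb").close()  # empty placeholder file
    kinds = [
        classify_source(tmp_dir),
        classify_source(checkpoint),
        classify_source("dbmdz/bert-base-german-cased"),
    ]

print(kinds)
```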
Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForMultipleChoice (ALBERT model)
- bert — TFBertForMultipleChoice (BERT model)
- camembert — TFCamembertForMultipleChoice (CamemBERT model)
- convbert — TFConvBertForMultipleChoice (ConvBERT model)
- distilbert — TFDistilBertForMultipleChoice (DistilBERT model)
- electra — TFElectraForMultipleChoice (ELECTRA model)
- flaubert — TFFlaubertForMultipleChoice (FlauBERT model)
- funnel — TFFunnelForMultipleChoice (Funnel Transformer model)
- longformer — TFLongformerForMultipleChoice (Longformer model)
- mobilebert — TFMobileBertForMultipleChoice (MobileBERT model)
- mpnet — TFMPNetForMultipleChoice (MPNet model)
- rembert — TFRemBertForMultipleChoice (RemBERT model)
- roberta — TFRobertaForMultipleChoice (RoBERTa model)
- roformer — TFRoFormerForMultipleChoice (RoFormer model)
- xlm — TFXLMForMultipleChoice (XLM model)
- xlm-roberta — TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
- xlnet — TFXLNetForMultipleChoice (XLNet model)
>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertForNextSentencePrediction (BERT model)
- MobileBertConfig configuration class: TFMobileBertForNextSentencePrediction (MobileBERT model)
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
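The class description above states that the generic class cannot be instantiated with __init__() and must be created through its class methods. A minimal sketch of that pattern is shown below; `ToyAutoModel`, `ToyConfig` and `ToyModel` are hypothetical names, and the error type and message are assumptions chosen for illustration rather than the library's exact behaviour.

```python
class ToyConfig:
    """Hypothetical configuration object."""


class ToyModel:
    """Hypothetical concrete model built from a configuration."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    """Factory that refuses direct construction."""

    def __init__(self, *args, **kwargs):
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using "
            "ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # Returns an instance of the concrete class, never of the factory itself.
        return ToyModel(config)


model = ToyAutoModel.from_config(ToyConfig())
print(type(model).__name__)
```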
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bert — TFBertForNextSentencePrediction (BERT model)
- mobilebert — TFMobileBertForNextSentencePrediction (MobileBERT model)
>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TFTapasForQuestionAnswering (TAPAS model)
Instantiates one of the model classes of the library (with a table question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
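The note above distinguishes building a model from a configuration (architecture only) from loading pretrained weights. The toy example below sketches that difference: constructing from a configuration yields freshly initialized values, while a separate loading step copies stored values in. All names (`ToyConfig`, `ToyModel`, `from_saved`) and the initialization scheme are hypothetical and are not the library's API.

```python
import random


class ToyConfig:
    """Hypothetical configuration that only describes the architecture."""

    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Building from a configuration gives freshly initialized weights.
        self.weights = [random.gauss(0.0, 0.02) for _ in range(config.hidden_size)]

    @classmethod
    def from_saved(cls, config, saved_weights):
        """Build the architecture, then overwrite the weights with stored values."""
        model = cls(config)
        model.weights = list(saved_weights)
        return model


config = ToyConfig(hidden_size=4)
fresh = ToyModel(config)
restored = ToyModel.from_saved(config, [0.1, 0.2, 0.3, 0.4])
print(len(fresh.weights), restored.weights)
```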
< source >( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- tapas — TFTapasForQuestionAnswering (TAPAS model)
>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")
>>> # Update configuration during loading
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )
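The two kwargs behaviours described for from_pretrained() can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: ToyConfig, resolve_config_and_kwargs, and custom_flag are hypothetical names introduced only for this example.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def resolve_config_and_kwargs(config=None, **kwargs):
    """Return (config, model_kwargs) following the two cases described above."""
    if config is not None:
        # Config supplied by the caller: kwargs go to the model's __init__ untouched.
        return config, dict(kwargs)

    # Config loaded automatically (here: simply built with defaults).
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the matching config attribute
        else:
            model_kwargs[key] = value  # left over for the model's __init__
    return config, model_kwargs


auto_config, auto_rest = resolve_config_and_kwargs(output_attentions=True, custom_flag=1)
print(auto_config.output_attentions, auto_rest)  # True {'custom_flag': 1}

given_config, given_rest = resolve_config_and_kwargs(config=ToyConfig(), output_attentions=True)
print(given_config.output_attentions, given_rest)  # False {'output_attentions': True}
```

In the second call the supplied configuration is left unchanged, which matches the statement that all relevant updates are assumed to have been made before passing config.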
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: TFBertForTokenClassification (BERT model)
- CamembertConfig configuration class: TFCamembertForTokenClassification (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForTokenClassification (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForTokenClassification (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForTokenClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: TFElectraForTokenClassification (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForTokenClassification (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForTokenClassification (Funnel Transformer model)
- LayoutLMConfig configuration class: TFLayoutLMForTokenClassification (LayoutLM model)
- LongformerConfig configuration class: TFLongformerForTokenClassification (Longformer model)
- MPNetConfig configuration class: TFMPNetForTokenClassification (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForTokenClassification (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForTokenClassification (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: TFRobertaForTokenClassification (RoBERTa model)
- XLMConfig configuration class: TFXLMForTokenClassification (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForTokenClassification (XLNet model)
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
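Selecting a model class from the configuration class can be pictured as a dictionary lookup. The sketch below is a simplified illustration in plain Python, not the library's implementation; every class name in it (ToyBertConfig, ToyAutoModelForTokenClassification, and so on) is hypothetical.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyXLNetConfig:
    """Hypothetical configuration class."""


class ToyUnsupportedConfig:
    """Hypothetical configuration class with no token classification model."""


class ToyModel:
    """Hypothetical base model: it only stores its config, no weights are loaded."""

    def __init__(self, config):
        self.config = config


class ToyBertForTokenClassification(ToyModel):
    pass


class ToyXLNetForTokenClassification(ToyModel):
    pass


class ToyAutoModelForTokenClassification:
    # Mapping from configuration class to model class, in the spirit of the list above.
    _model_mapping = {
        ToyBertConfig: ToyBertForTokenClassification,
        ToyXLNetConfig: ToyXLNetForTokenClassification,
    }

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            supported = ", ".join(c.__name__ for c in cls._model_mapping)
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__}; "
                f"expected one of: {supported}."
            )
        return model_class(config)


model = ToyAutoModelForTokenClassification.from_config(ToyXLNetConfig())
print(type(model).__name__)  # ToyXLNetForTokenClassification
```

A configuration class that is absent from the mapping raises an error in this sketch, which is one plausible way to surface an unsupported architecture for a given task.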
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForTokenClassification (ALBERT model)
- bert — TFBertForTokenClassification (BERT model)
- camembert — TFCamembertForTokenClassification (CamemBERT model)
- convbert — TFConvBertForTokenClassification (ConvBERT model)
- deberta — TFDebertaForTokenClassification (DeBERTa model)
- deberta-v2 — TFDebertaV2ForTokenClassification (DeBERTa-v2 model)
- distilbert — TFDistilBertForTokenClassification (DistilBERT model)
- electra — TFElectraForTokenClassification (ELECTRA model)
- flaubert — TFFlaubertForTokenClassification (FlauBERT model)
- funnel — TFFunnelForTokenClassification (Funnel Transformer model)
- layoutlm — TFLayoutLMForTokenClassification (LayoutLM model)
- longformer — TFLongformerForTokenClassification (Longformer model)
- mobilebert — TFMobileBertForTokenClassification (MobileBERT model)
- mpnet — TFMPNetForTokenClassification (MPNet model)
- rembert — TFRemBertForTokenClassification (RemBERT model)
- roberta — TFRobertaForTokenClassification (RoBERTa model)
- roformer — TFRoFormerForTokenClassification (RoFormer model)
- xlm — TFXLMForTokenClassification (XLM model)
- xlm-roberta — TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
- xlnet — TFXLNetForTokenClassification (XLNet model)
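The selection rule above (use model_type when available, otherwise fall back to pattern matching on the name or path) can be sketched as follows. This is an illustrative approximation, not the library's code: select_class_name is a hypothetical helper, the mapping holds only a few entries from the list above, and trying the longest keys first is an assumption made so that overlapping names resolve sensibly.

```python
# A few entries taken from the list above, keyed by model_type.
MODEL_TYPE_TO_CLASS_NAME = {
    "bert": "TFBertForTokenClassification",
    "roberta": "TFRobertaForTokenClassification",
    "xlm": "TFXLMForTokenClassification",
    "xlm-roberta": "TFXLMRobertaForTokenClassification",
}


def select_class_name(name_or_path, model_type=None):
    """Pick a class name from model_type, or from the name/path when it is missing."""
    if model_type is not None:
        return MODEL_TYPE_TO_CLASS_NAME[model_type]

    # Fallback: substring matching, longest keys first. "roberta" contains "bert"
    # and "xlm-roberta" contains both "xlm" and "roberta", so order matters.
    lowered = name_or_path.lower()
    for key in sorted(MODEL_TYPE_TO_CLASS_NAME, key=len, reverse=True):
        if key in lowered:
            return MODEL_TYPE_TO_CLASS_NAME[key]
    raise ValueError(f"Could not infer a model type from {name_or_path!r}.")


print(select_class_name("./my_checkpoint", model_type="bert"))  # TFBertForTokenClassification
print(select_class_name("xlm-roberta-base"))  # TFXLMRobertaForTokenClassification
print(select_class_name("roberta-base"))  # TFRobertaForTokenClassification
```

The "./my_checkpoint" path is a placeholder; when model_type is given, the name or path plays no role in the choice.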
>>> from transformers import AutoConfig, TFAutoModelForTokenClassification
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForTokenClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
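The pattern described here, a class that refuses direct construction and is only used through class-method factories, can be sketched in a few lines. This is a toy illustration rather than the library's code: ToyAutoModel is hypothetical, its from_config simply returns a dictionary, and the exception type is an assumption.

```python
class ToyAutoModel:
    """Hypothetical auto class: only its factory class methods are meant to be called."""

    def __init__(self, *args, **kwargs):
        # Direct instantiation is rejected with a hint about the factories.
        raise EnvironmentError(
            f"{self.__class__.__name__} is designed to be instantiated using "
            "`from_pretrained(...)` or `from_config(...)`."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would return a concrete model; a dict stands in for it here.
        return {"built_from": "config", "config": config}


try:
    ToyAutoModel()
except EnvironmentError as err:
    message = str(err)
    print(message)

model = ToyAutoModel.from_config({"model_type": "toy"})
print(model["built_from"])  # config
```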
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
- BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
- CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForQuestionAnswering (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
- GPTJConfig configuration class: TFGPTJForQuestionAnswering (GPT-J model)
- LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
- MPNetConfig configuration class: TFMPNetForQuestionAnswering (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForQuestionAnswering (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
- XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — TFAlbertForQuestionAnswering (ALBERT model)
- bert — TFBertForQuestionAnswering (BERT model)
- camembert — TFCamembertForQuestionAnswering (CamemBERT model)
- convbert — TFConvBertForQuestionAnswering (ConvBERT model)
- deberta — TFDebertaForQuestionAnswering (DeBERTa model)
- deberta-v2 — TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- distilbert — TFDistilBertForQuestionAnswering (DistilBERT model)
- electra — TFElectraForQuestionAnswering (ELECTRA model)
- flaubert — TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
- funnel — TFFunnelForQuestionAnswering (Funnel Transformer model)
- gptj — TFGPTJForQuestionAnswering (GPT-J model)
- longformer — TFLongformerForQuestionAnswering (Longformer model)
- mobilebert — TFMobileBertForQuestionAnswering (MobileBERT model)
- mpnet — TFMPNetForQuestionAnswering (MPNet model)
- rembert — TFRemBertForQuestionAnswering (RemBERT model)
- roberta — TFRobertaForQuestionAnswering (RoBERTa model)
- roformer — TFRoFormerForQuestionAnswering (RoFormer model)
- xlm — TFXLMForQuestionAnsweringSimple (XLM model)
- xlm-roberta — TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- xlnet — TFXLNetForQuestionAnsweringSimple (XLNet model)
>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering
>>> # Download the model and configuration from the Hub and cache them.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
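The interaction between the cache, force_download, and local_files_only described in the parameter list can be approximated with a small stand-alone sketch. It is not the library's download logic: resolve_file and fake_download are hypothetical, and giving local_files_only precedence over force_download is an assumption of this example.

```python
import tempfile
from pathlib import Path


def resolve_file(cache_path, download, force_download=False, local_files_only=False):
    """Return a local path, calling download() only when the flags allow and require it."""
    cache_path = Path(cache_path)
    if local_files_only:
        # Never touch the network: either the cached file exists or we give up.
        if cache_path.exists():
            return cache_path
        raise FileNotFoundError(f"{cache_path} is not cached and local_files_only=True")
    if force_download or not cache_path.exists():
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_bytes(download())  # overwrites any cached version
    return cache_path


calls = []


def fake_download():
    """Stand-in for a network request; records that it was called."""
    calls.append(1)
    return b"weights"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model" / "weights.bin"
    resolve_file(target, fake_download)  # not cached yet: downloads
    resolve_file(target, fake_download)  # cache hit: no download
    resolve_file(target, fake_download, force_download=True)  # downloads again
    resolve_file(target, fake_download, local_files_only=True)  # cache only

download_count = len(calls)
print(download_count)  # 2
```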
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- VisionEncoderDecoderConfig configuration class: TFVisionEncoderDecoderModel (Vision Encoder decoder model)
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- vision-encoder-decoder — TFVisionEncoderDecoderModel (Vision Encoder decoder model)
>>> from transformers import AutoConfig, TFAutoModelForVision2Seq
>>> # Download the model and configuration from the Hub and cache them.
>>> # "bert-base-cased" is a placeholder: use a checkpoint whose configuration is a VisionEncoderDecoderConfig.
>>> model = TFAutoModelForVision2Seq.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForVision2Seq.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__()
(throws an error).
from_config( **kwargs )
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- Speech2TextConfig configuration class: TFSpeech2TextForConditionalGeneration (Speech2Text model)
Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args, **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- speech_to_text — TFSpeech2TextForConditionalGeneration (Speech2Text model)
>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq
>>> # Download the model and configuration and cache them.
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertModel (ALBERT model)
- BartConfig configuration class: FlaxBartModel (BART model)
- BeitConfig configuration class: FlaxBeitModel (BEiT model)
- BertConfig configuration class: FlaxBertModel (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdModel (BigBird model)
- BlenderbotConfig configuration class: FlaxBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallModel (BlenderbotSmall model)
- CLIPConfig configuration class: FlaxCLIPModel (CLIP model)
- DistilBertConfig configuration class: FlaxDistilBertModel (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraModel (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJModel (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoModel (GPT Neo model)
- LongT5Config configuration class: FlaxLongT5Model (LongT5 model)
- MBartConfig configuration class: FlaxMBartModel (mBART model)
- MT5Config configuration class: FlaxMT5Model (MT5 model)
- MarianConfig configuration class: FlaxMarianModel (Marian model)
- OPTConfig configuration class: FlaxOPTModel (OPT model)
- PegasusConfig configuration class: FlaxPegasusModel (Pegasus model)
- RoFormerConfig configuration class: FlaxRoFormerModel (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaModel (RoBERTa model)
- T5Config configuration class: FlaxT5Model (T5 model)
- ViTConfig configuration class: FlaxViTModel (ViT model)
- VisionTextDualEncoderConfig configuration class: FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2Model (Wav2Vec2 model)
- XGLMConfig configuration class: FlaxXGLMModel (XGLM model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaModel (XLM-RoBERTa model)
Instantiates one of the base model classes of the library from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
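The dispatch described above can be pictured with a small, self-contained sketch. It is not the library's implementation: every name in it (ToyBertConfig, ToyAutoModel, and so on) is hypothetical, and it only illustrates the two documented behaviors, namely that the auto class refuses direct instantiation and that from_config() picks a model class from the configuration class without loading any weights.

```python
# Simplified sketch of an auto class; NOT the transformers implementation.
# All class names below are hypothetical and exist only for illustration.


class ToyBertConfig:
    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyGPT2Config:
    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Weights are freshly initialized; nothing is read from a checkpoint.
        self.weights = [0.0] * config.hidden_size


class ToyBertModel(ToyModel):
    pass


class ToyGPT2Model(ToyModel):
    pass


class ToyAutoModel:
    # Maps each configuration class to the model class it should build.
    _model_mapping = {ToyBertConfig: ToyBertModel, ToyGPT2Config: ToyGPT2Model}

    def __init__(self, *args, **kwargs):
        # Mirrors the documented rule: direct instantiation throws an error.
        raise RuntimeError("Use ToyAutoModel.from_config(config) instead of __init__().")

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}.")
        return model_class(config)


model = ToyAutoModel.from_config(ToyBertConfig(hidden_size=4))
print(type(model).__name__)
```

Keeping the mapping on the class is one possible design; it lets each task-specific auto class carry its own table while sharing the same dispatch code.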
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the base model classes of the library from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertModel (ALBERT model)
- bart — FlaxBartModel (BART model)
- beit — FlaxBeitModel (BEiT model)
- bert — FlaxBertModel (BERT model)
- big_bird — FlaxBigBirdModel (BigBird model)
- blenderbot — FlaxBlenderbotModel (Blenderbot model)
- blenderbot-small — FlaxBlenderbotSmallModel (BlenderbotSmall model)
- clip — FlaxCLIPModel (CLIP model)
- distilbert — FlaxDistilBertModel (DistilBERT model)
- electra — FlaxElectraModel (ELECTRA model)
- gpt2 — FlaxGPT2Model (OpenAI GPT-2 model)
- gpt_neo — FlaxGPTNeoModel (GPT Neo model)
- gptj — FlaxGPTJModel (GPT-J model)
- longt5 — FlaxLongT5Model (LongT5 model)
- marian — FlaxMarianModel (Marian model)
- mbart — FlaxMBartModel (mBART model)
- mt5 — FlaxMT5Model (MT5 model)
- opt — FlaxOPTModel (OPT model)
- pegasus — FlaxPegasusModel (Pegasus model)
- roberta — FlaxRobertaModel (RoBERTa model)
- roformer — FlaxRoFormerModel (RoFormer model)
- t5 — FlaxT5Model (T5 model)
- vision-text-dual-encoder — FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
- vit — FlaxViTModel (ViT model)
- wav2vec2 — FlaxWav2Vec2Model (Wav2Vec2 model)
- xglm — FlaxXGLMModel (XGLM model)
- xlm-roberta — FlaxXLMRobertaModel (XLM-RoBERTa model)
>>> from transformers import AutoConfig, FlaxAutoModel
>>> # Download the model and configuration and cache them.
>>> model = FlaxAutoModel.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModel.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
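The two kwargs behaviors documented for from_pretrained() can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's code: the configuration is modeled as a dictionary, and prepare_init_args and extra_option are hypothetical names introduced only for this example.

```python
# Simplified sketch of the documented kwargs handling; NOT the library code.
# `prepare_init_args` and `extra_option` are hypothetical names.


def prepare_init_args(kwargs, config=None, loaded_config=None):
    """Return (config, model_init_kwargs) following the two documented cases."""
    if config is not None:
        # Case 1: a configuration was provided, so it is left untouched and
        # every keyword argument goes to the model's __init__.
        return config, dict(kwargs)

    # Case 2: the configuration was loaded automatically. Keys that match a
    # configuration attribute override it; the rest go to the model's __init__.
    config = dict(loaded_config or {})
    model_init_kwargs = {}
    for key, value in kwargs.items():
        if key in config:
            config[key] = value
        else:
            model_init_kwargs[key] = value
    return config, model_init_kwargs


loaded = {"output_attentions": False, "hidden_size": 768}
config, model_init_kwargs = prepare_init_args(
    {"output_attentions": True, "extra_option": 1}, loaded_config=loaded
)
print(config)
print(model_init_kwargs)
```

In this sketch, output_attentions=True overrides the loaded value, which matches the "Update configuration during loading" step in the example above, while the unrecognized extra_option is forwarded to the model constructor.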
This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForCausalLM (BART model)
- BertConfig configuration class: FlaxBertForCausalLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForCausalLM (BigBird model)
- ElectraConfig configuration class: FlaxElectraForCausalLM (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoForCausalLM (GPT Neo model)
- OPTConfig configuration class: FlaxOPTForCausalLM (OPT model)
- RobertaConfig configuration class: FlaxRobertaForCausalLM (RoBERTa model)
- XGLMConfig configuration class: FlaxXGLMForCausalLM (XGLM model)
Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bart — FlaxBartForCausalLM (BART model)
- bert — FlaxBertForCausalLM (BERT model)
- big_bird — FlaxBigBirdForCausalLM (BigBird model)
- electra — FlaxElectraForCausalLM (ELECTRA model)
- gpt2 — FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
- gpt_neo — FlaxGPTNeoForCausalLM (GPT Neo model)
- gptj — FlaxGPTJForCausalLM (GPT-J model)
- opt — FlaxOPTForCausalLM (OPT model)
- roberta — FlaxRobertaForCausalLM (RoBERTa model)
- xglm — FlaxXGLMForCausalLM (XGLM model)
>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM
>>> # Download the model and configuration and cache them.
>>> model = FlaxAutoModelForCausalLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForCausalLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
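The selection rule stated for from_pretrained() (use model_type when the config has one, otherwise pattern-match on the name or path) can be sketched as follows. This is a rough illustration rather than the library's code: the mapping copies the causal language modeling list above as plain strings, select_model_class is a hypothetical helper, and trying longer patterns first is an assumption made here so that a name such as "roberta-base" is not captured by the shorter "bert" key.

```python
# Simplified sketch of the documented selection rule; NOT the library code.
# `select_model_class` is a hypothetical helper for illustration only.

CAUSAL_LM_MAPPING = {
    "bart": "FlaxBartForCausalLM",
    "bert": "FlaxBertForCausalLM",
    "big_bird": "FlaxBigBirdForCausalLM",
    "electra": "FlaxElectraForCausalLM",
    "gpt2": "FlaxGPT2LMHeadModel",
    "gpt_neo": "FlaxGPTNeoForCausalLM",
    "gptj": "FlaxGPTJForCausalLM",
    "opt": "FlaxOPTForCausalLM",
    "roberta": "FlaxRobertaForCausalLM",
    "xglm": "FlaxXGLMForCausalLM",
}


def select_model_class(name_or_path, model_type=None, mapping=CAUSAL_LM_MAPPING):
    """Pick a model class name from model_type, else from the name/path."""
    if model_type is not None:
        if model_type not in mapping:
            raise ValueError(f"Unsupported model_type: {model_type!r}")
        return mapping[model_type]

    # Fallback: substring matching on the name or path. Longer keys are tried
    # first (an assumption of this sketch) to avoid ambiguous shorter matches.
    for pattern in sorted(mapping, key=len, reverse=True):
        if pattern in str(name_or_path):
            return mapping[pattern]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")


print(select_model_class("some/checkpoint", model_type="gpt2"))
print(select_model_class("roberta-base"))
```

Substring matching is fragile for short keys, which is one reason an explicit model_type in the configuration takes precedence over the fallback.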
This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForPreTraining (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForPreTraining (BigBird model)
- ElectraConfig configuration class: FlaxElectraForPreTraining (ELECTRA model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a pretraining head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForPreTraining (ALBERT model)
- bart — FlaxBartForConditionalGeneration (BART model)
- bert — FlaxBertForPreTraining (BERT model)
- big_bird — FlaxBigBirdForPreTraining (BigBird model)
- electra — FlaxElectraForPreTraining (ELECTRA model)
- longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
- mbart — FlaxMBartForConditionalGeneration (mBART model)
- mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
- roberta — FlaxRobertaForMaskedLM (RoBERTa model)
- roformer — FlaxRoFormerForMaskedLM (RoFormer model)
- t5 — FlaxT5ForConditionalGeneration (T5 model)
- wav2vec2 — FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
- xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining
>>> # Download the model and configuration and cache them.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForMaskedLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMaskedLM (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMaskedLM (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
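Comparing the lists in this section, the same configuration class resolves to a different model class depending on which auto class is used, and some configuration classes are absent from some lists. The sketch below records an excerpt of those lists as plain dictionaries. The task labels used as keys and the model_class_name helper are hypothetical names for this example; only the class names are taken from the lists above.

```python
# Excerpt of the mappings listed in this section, stored as plain strings.
# The task labels ("base", "causal-lm", ...) and `model_class_name` are
# hypothetical names used only in this sketch.

FLAX_TASK_MAPPINGS = {
    "base": {
        "BertConfig": "FlaxBertModel",
        "BartConfig": "FlaxBartModel",
        "GPT2Config": "FlaxGPT2Model",
    },
    "causal-lm": {
        "BertConfig": "FlaxBertForCausalLM",
        "BartConfig": "FlaxBartForCausalLM",
        "GPT2Config": "FlaxGPT2LMHeadModel",
    },
    "pretraining": {
        "BertConfig": "FlaxBertForPreTraining",
        "BartConfig": "FlaxBartForConditionalGeneration",
    },
    "masked-lm": {
        "BertConfig": "FlaxBertForMaskedLM",
        "BartConfig": "FlaxBartForConditionalGeneration",
    },
}


def model_class_name(task, config_name):
    """Look up the model class listed for a configuration class and task."""
    mapping = FLAX_TASK_MAPPINGS[task]
    if config_name not in mapping:
        raise ValueError(f"{config_name} is not listed for task {task!r}")
    return mapping[config_name]


for task in FLAX_TASK_MAPPINGS:
    print(task, model_class_name(task, "BertConfig"))
```

In this excerpt, GPT2Config appears only under the base and causal language modeling tasks, so looking it up for the masked language modeling task raises an error.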
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForMaskedLM (ALBERT model)
- bart — FlaxBartForConditionalGeneration (BART model)
- bert — FlaxBertForMaskedLM (BERT model)
- big_bird — FlaxBigBirdForMaskedLM (BigBird model)
- distilbert — FlaxDistilBertForMaskedLM (DistilBERT model)
- electra — FlaxElectraForMaskedLM (ELECTRA model)
- mbart — FlaxMBartForConditionalGeneration (mBART model)
- roberta — FlaxRobertaForMaskedLM (RoBERTa model)
- roformer — FlaxRoFormerForMaskedLM (RoFormer model)
- xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM
>>> # Download the model and configuration and cache them.
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: FlaxBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: FlaxEncoderDecoderModel (Encoder decoder model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: FlaxMarianMTModel (Marian model)
- PegasusConfig configuration class: FlaxPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
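The lookup from configuration class to model class listed above can be pictured with a small sketch. Every class here (ToyConfig, ToyBartConfig, ToyT5Config, ToyModel, ToyAutoModel and so on) is a hypothetical stand-in written for illustration, not the library's implementation; it only shows a dispatch table keyed on the configuration class, and a model built from a configuration without any pretrained weights.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class."""

    def __init__(self, **kwargs):
        self.output_attentions = kwargs.get("output_attentions", False)


class ToyBartConfig(ToyConfig):
    pass


class ToyT5Config(ToyConfig):
    pass


class ToyModel:
    """Hypothetical stand-in for a model class."""

    def __init__(self, config):
        self.config = config
        # Building from a configuration never loads pretrained weights.
        self.weights_loaded = False


class ToyBartModel(ToyModel):
    pass


class ToyT5Model(ToyModel):
    pass


class ToyAutoModel:
    # Dispatch table: configuration class -> model class.
    _mapping = {ToyBartConfig: ToyBartModel, ToyT5Config: ToyT5Model}

    def __init__(self):
        # Mirrors the documented behavior: direct instantiation is an error.
        raise EnvironmentError("Use ToyAutoModel.from_config(config) instead.")

    @classmethod
    def from_config(cls, config):
        model_class = cls._mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        return model_class(config)


model = ToyAutoModel.from_config(ToyT5Config())
print(type(model).__name__)
```

In this sketch the selected class depends only on the type of the configuration object, which is why a configuration alone is enough to pick an architecture but not to obtain trained weights.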
Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch checkpoint to a Flax checkpoint once and loading the Flax checkpoint afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
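The two kwargs behaviors described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's code: ToyConfig, split_kwargs and load are hypothetical names, and seed is just an example of a keyword that is not a configuration attribute.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.num_labels = 2


def split_kwargs(config, kwargs):
    """Override matching config attributes and return the leftover kwargs."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key names a configuration attribute
        else:
            unused[key] = value  # left for the model's __init__
    return unused


def load(config=None, **kwargs):
    """Return (config, model_kwargs) following the two documented cases."""
    if config is not None:
        # A configuration was supplied: it is treated as final, so every
        # keyword argument goes to the model untouched.
        return config, kwargs
    config = ToyConfig()
    return config, split_kwargs(config, kwargs)


config, model_kwargs = load(output_attentions=True, seed=0)
print(config.output_attentions, model_kwargs)
```

When a configuration object is passed explicitly, the sketch leaves it unchanged, which matches the assumption stated above that all configuration updates were already made by the caller.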
Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bart — FlaxBartForConditionalGeneration (BART model)
- blenderbot — FlaxBlenderbotForConditionalGeneration (Blenderbot model)
- blenderbot-small — FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- encoder-decoder — FlaxEncoderDecoderModel (Encoder decoder model)
- longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
- marian — FlaxMarianMTModel (Marian model)
- mbart — FlaxMBartForConditionalGeneration (mBART model)
- mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
- pegasus — FlaxPegasusForConditionalGeneration (Pegasus model)
- t5 — FlaxT5ForConditionalGeneration (T5 model)
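The selection rule above, model_type first and name pattern matching as a fallback, might look roughly like the following. The mapping copies the list above; resolve_class_name is a hypothetical helper, and trying longer keys first is this sketch's own tie-breaking choice, not something the source specifies.

```python
# model_type -> class name, copied from the list above.
SEQ2SEQ_MAPPING = {
    "bart": "FlaxBartForConditionalGeneration",
    "blenderbot": "FlaxBlenderbotForConditionalGeneration",
    "blenderbot-small": "FlaxBlenderbotSmallForConditionalGeneration",
    "encoder-decoder": "FlaxEncoderDecoderModel",
    "longt5": "FlaxLongT5ForConditionalGeneration",
    "marian": "FlaxMarianMTModel",
    "mbart": "FlaxMBartForConditionalGeneration",
    "mt5": "FlaxMT5ForConditionalGeneration",
    "pegasus": "FlaxPegasusForConditionalGeneration",
    "t5": "FlaxT5ForConditionalGeneration",
}


def resolve_class_name(name_or_path, model_type=None):
    """Pick a class name from model_type, else from the checkpoint name."""
    if model_type is not None:
        return SEQ2SEQ_MAPPING[model_type]
    # Fallback: substring matching on the name. Longer keys are tried first
    # so that "mt5" is not mistaken for "t5", nor "mbart" for "bart".
    for key in sorted(SEQ2SEQ_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return SEQ2SEQ_MAPPING[key]
    raise ValueError(f"Cannot infer a model class from {name_or_path!r}")


print(resolve_class_name("t5-base"))
print(resolve_class_name("google/mt5-small"))
print(resolve_class_name("my-checkpoint", model_type="bart"))
```

The last call shows why model_type is the preferred signal: a checkpoint name is free-form, so the fallback can only work when the name happens to contain a known key.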
>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM
>>> # Download model and configuration from the Hub and cache them.
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("t5-base")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("t5-base", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax checkpoint (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
... "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: FlaxBartForSequenceClassification (BART model)
- BertConfig configuration class: FlaxBertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForSequenceClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForSequenceClassification (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForSequenceClassification (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForSequenceClassification (RoBERTa model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch checkpoint to a Flax checkpoint once and loading the Flax checkpoint afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
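How cache_dir, force_download and the local-files-only option interact can be illustrated with a toy resolver. This is a sketch under assumptions rather than the library's caching code: resolve_file and fake_fetch are hypothetical, local_files_only is this sketch's name for the local-files-only flag, and giving that flag priority over force_download is a design choice made here for clarity.

```python
import tempfile
from pathlib import Path


def resolve_file(filename, cache_dir, fetch, force_download=False, local_files_only=False):
    """Return (path, source), where source is "cache" or "download"."""
    cached = Path(cache_dir) / filename
    if local_files_only:
        # Never touch the network: the file must already be in the cache.
        if cached.exists():
            return cached, "cache"
        raise FileNotFoundError(f"{filename} is not cached and local_files_only=True")
    if cached.exists() and not force_download:
        return cached, "cache"
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(fetch(filename))  # overwrites any cached copy
    return cached, "download"


def fake_fetch(filename):
    """Hypothetical downloader that returns placeholder bytes."""
    return b"placeholder"


with tempfile.TemporaryDirectory() as cache_dir:
    first = resolve_file("config.json", cache_dir, fake_fetch)[1]
    second = resolve_file("config.json", cache_dir, fake_fetch)[1]
    forced = resolve_file("config.json", cache_dir, fake_fetch, force_download=True)[1]

print(first, second, forced)
```

The first call downloads, the second is served from the cache, and the third downloads again because force_download overrides the cached copy.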
Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForSequenceClassification (ALBERT model)
- bart — FlaxBartForSequenceClassification (BART model)
- bert — FlaxBertForSequenceClassification (BERT model)
- big_bird — FlaxBigBirdForSequenceClassification (BigBird model)
- distilbert — FlaxDistilBertForSequenceClassification (DistilBERT model)
- electra — FlaxElectraForSequenceClassification (ELECTRA model)
- mbart — FlaxMBartForSequenceClassification (mBART model)
- roberta — FlaxRobertaForSequenceClassification (RoBERTa model)
- roformer — FlaxRoFormerForSequenceClassification (RoFormer model)
- xlm-roberta — FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)
>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification
>>> # Download model and configuration from the Hub and cache them.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax checkpoint (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForQuestionAnswering (ALBERT model)
- BartConfig configuration class: FlaxBartForQuestionAnswering (BART model)
- BertConfig configuration class: FlaxBertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForQuestionAnswering (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForQuestionAnswering (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForQuestionAnswering (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForQuestionAnswering (RoBERTa model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a question answering head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch checkpoint to a Flax checkpoint once and loading the Flax checkpoint afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
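The three accepted forms of pretrained_model_name_or_path can be told apart with a few filesystem checks. The helper below, describe_source, is hypothetical and deliberately simplified: it assumes that anything which is not an existing directory or file is a model id, and it enforces the documented requirement that a state_dict file comes with from_pt=True and an explicit config.

```python
import os
import tempfile


def describe_source(name_or_path, from_pt=False, config=None):
    """Classify the argument as a directory, a state_dict file, or a model id."""
    if os.path.isdir(name_or_path):
        return "directory"
    if os.path.isfile(name_or_path):
        if not from_pt or config is None:
            raise ValueError("A state_dict file needs from_pt=True and an explicit config")
        return "state_dict file"
    # Not on disk: treat the string as a model id.
    parts = name_or_path.split("/")
    if len(parts) == 1:
        return "root-level model id"
    if len(parts) == 2 and all(parts):
        return "namespaced model id"
    raise ValueError(f"{name_or_path!r} is neither a local path nor a valid model id")


print(describe_source("dbmdz/bert-base-german-cased"))

with tempfile.TemporaryDirectory() as save_dir:
    print(describe_source(save_dir))
```

The model-id branch assumes no local file or directory with the same name exists in the working directory; a local path always wins in this sketch.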
Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForQuestionAnswering (ALBERT model)
- bart — FlaxBartForQuestionAnswering (BART model)
- bert — FlaxBertForQuestionAnswering (BERT model)
- big_bird — FlaxBigBirdForQuestionAnswering (BigBird model)
- distilbert — FlaxDistilBertForQuestionAnswering (DistilBERT model)
- electra — FlaxElectraForQuestionAnswering (ELECTRA model)
- mbart — FlaxMBartForQuestionAnswering (mBART model)
- roberta — FlaxRobertaForQuestionAnswering (RoBERTa model)
- roformer — FlaxRoFormerForQuestionAnswering (RoFormer model)
- xlm-roberta — FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering
>>> # Download model and configuration from the Hub and cache them.
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax checkpoint (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: FlaxBertForTokenClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForTokenClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForTokenClassification (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForTokenClassification (RoBERTa model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a token classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained( *model_args **kwargs )
pretrained_model_name_or_path — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root level, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or URL to a PyTorch state_dict save file. In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch checkpoint to a Flax checkpoint once and loading the Flax checkpoint afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model’s __init__ method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the description of the pretrained_model_name_or_path argument).
force_download (optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
(optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
(optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored in a git-based system on the Hub, revision can be any identifier allowed by git.
trust_remote_code (optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initialize the model. Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model’s __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ method.
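The safety rule behind trust_remote_code can be reduced to a single guard. The function below is a hypothetical illustration, not the library's logic: has_custom_code stands in for whatever signal tells the loader that a repository ships its own modeling files.

```python
def check_remote_code(has_custom_code, trust_remote_code=False):
    """Refuse to run Hub-defined modeling code unless explicitly trusted.

    Returns True when custom code from the repository would be executed
    locally, and False when only library code is involved.
    """
    if has_custom_code and not trust_remote_code:
        raise ValueError(
            "This repository defines its own modeling code. Read it first, "
            "then pass trust_remote_code=True if you trust it."
        )
    return has_custom_code


print(check_remote_code(has_custom_code=False))
print(check_remote_code(has_custom_code=True, trust_remote_code=True))
```

Keeping the default at False means that executing code from a repository is always an explicit, per-call decision.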
Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForTokenClassification (ALBERT model)
- bert — FlaxBertForTokenClassification (BERT model)
- big_bird — FlaxBigBirdForTokenClassification (BigBird model)
- distilbert — FlaxDistilBertForTokenClassification (DistilBERT model)
- electra — FlaxElectraForTokenClassification (ELECTRA model)
- roberta — FlaxRobertaForTokenClassification (RoBERTa model)
- roformer — FlaxRoFormerForTokenClassification (RoFormer model)
- xlm-roberta — FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)
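Not every architecture appears in every list: for example, bart and mbart have sequence classification and question answering classes above but no token classification class. The sketch below records the model types from the sequence-to-sequence, sequence classification, question answering and token classification lists above and answers which auto classes cover a given model type. SUPPORTED and auto_classes_for are hypothetical names used only for this illustration.

```python
ENCODER_TYPES = {
    "albert", "bert", "big_bird", "distilbert",
    "electra", "roberta", "roformer", "xlm-roberta",
}

# Auto class name -> model types, transcribed from the lists above.
SUPPORTED = {
    "FlaxAutoModelForSeq2SeqLM": {
        "bart", "blenderbot", "blenderbot-small", "encoder-decoder", "longt5",
        "marian", "mbart", "mt5", "pegasus", "t5",
    },
    "FlaxAutoModelForSequenceClassification": ENCODER_TYPES | {"bart", "mbart"},
    "FlaxAutoModelForQuestionAnswering": ENCODER_TYPES | {"bart", "mbart"},
    "FlaxAutoModelForTokenClassification": set(ENCODER_TYPES),
}


def auto_classes_for(model_type):
    """Return the auto classes (sorted by name) that list this model type."""
    return sorted(name for name, types in SUPPORTED.items() if model_type in types)


print(auto_classes_for("bart"))
print(auto_classes_for("bert"))
```

A lookup like this is a quick way to see, before calling from_pretrained(), whether a given head is available for a checkpoint's model type.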
>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification
>>> # Download model and configuration from the Hub and cache them.
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax checkpoint (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
... "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config( **kwargs )
config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: FlaxBertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMultipleChoice (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMultipleChoice (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMultipleChoice (RoBERTa model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)
Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
< source >( *model_args **kwargs )
pretrained_model_name_or_path (
) — Can be either:- A string, the model id of a pretrained model hosted inside a model repo on
Valid model ids can be located at the root-level, like
, or namespaced under a user or organization name, likedbmdz/bert-base-german-cased
. - A path to a directory containing model weights saved using
save_pretrained(), e.g.,
. - A path or url to a PyTorch state_dict save file (e.g,
). In this case,from_pt
should be set toTrue
and a configuration object should be provided asconfig
argument. This loading path is slower than converting the PyTorch model in a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
- A string, the model id of a pretrained model hosted inside a model repo on
Valid model ids can be located at the root-level, like
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
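The second kwargs case above (no configuration supplied) can be pictured with a small, library-free sketch. ToyConfig and split_kwargs are hypothetical names invented for this illustration. They mimic the described behavior and are not the library's implementation.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not PretrainedConfig)."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the keys left for the model."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key names a configuration attribute
        else:
            remaining[key] = value  # forwarded to the model's __init__
    return remaining


config = ToyConfig()
# "seed" is an arbitrary example of a key that is not a configuration attribute.
model_kwargs = split_kwargs(config, {"output_attentions": True, "seed": 0})
print(config.output_attentions)  # True
print(model_kwargs)  # {'seed': 0}
```

When a configuration object is passed explicitly, this splitting step is skipped and every keyword goes straight to the model constructor.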
Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- albert — FlaxAlbertForMultipleChoice (ALBERT model)
- bert — FlaxBertForMultipleChoice (BERT model)
- big_bird — FlaxBigBirdForMultipleChoice (BigBird model)
- distilbert — FlaxDistilBertForMultipleChoice (DistilBERT model)
- electra — FlaxElectraForMultipleChoice (ELECTRA model)
- roberta — FlaxRobertaForMultipleChoice (RoBERTa model)
- roformer — FlaxRoFormerForMultipleChoice (RoFormer model)
- xlm-roberta — FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)
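The selection rule can be sketched as a dictionary lookup with a name-based fallback. The mapping below is copied from the list above. The function resolve_class_name and its longest-key-first matching order are assumptions made for this illustration, not the library's actual code.

```python
# model_type -> class name, as listed above.
MULTIPLE_CHOICE_CLASSES = {
    "albert": "FlaxAlbertForMultipleChoice",
    "bert": "FlaxBertForMultipleChoice",
    "big_bird": "FlaxBigBirdForMultipleChoice",
    "distilbert": "FlaxDistilBertForMultipleChoice",
    "electra": "FlaxElectraForMultipleChoice",
    "roberta": "FlaxRobertaForMultipleChoice",
    "roformer": "FlaxRoFormerForMultipleChoice",
    "xlm-roberta": "FlaxXLMRobertaForMultipleChoice",
}


def resolve_class_name(model_type, name_or_path):
    """Pick a class name from model_type, else by matching the checkpoint name."""
    if model_type in MULTIPLE_CHOICE_CLASSES:
        return MULTIPLE_CHOICE_CLASSES[model_type]
    # Fallback: try longer keys first so that "distilbert" is not mistaken
    # for "bert", nor "xlm-roberta" for "roberta".
    for key in sorted(MULTIPLE_CHOICE_CLASSES, key=len, reverse=True):
        if key in name_or_path:
            return MULTIPLE_CHOICE_CLASSES[key]
    raise ValueError(f"Could not infer a model class for {name_or_path!r}.")


print(resolve_class_name("bert", "bert-base-cased"))
print(resolve_class_name(None, "distilbert-base-uncased"))
```

The first call resolves through model_type. The second has no model_type and falls back to matching on the checkpoint name.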
>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice
>>> # Download model and configuration from the Hub and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
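The factory-only pattern described here can be sketched in plain Python. All class names below (ToyBertConfig, ToyBertModel, ToyAutoModel) are hypothetical and exist only for this illustration. The choice of EnvironmentError is likewise an assumption, not a statement about the library's internals.

```python
class ToyBertConfig:
    """Hypothetical configuration class used only in this sketch."""

    def __init__(self, hidden_size=768):
        self.hidden_size = hidden_size


class ToyBertModel:
    """Hypothetical concrete model that simply stores its configuration."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    """Factory-only class: construct through from_config, never directly."""

    # Configuration class -> model class, mirroring the selection rule.
    _model_mapping = {ToyBertConfig: ToyBertModel}

    def __init__(self, *args, **kwargs):
        raise EnvironmentError(
            "ToyAutoModel must be created with ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__}."
            )
        return model_class(config)


model = ToyAutoModel.from_config(ToyBertConfig(hidden_size=128))
print(type(model).__name__)  # ToyBertModel
```

Calling ToyAutoModel() directly raises the error, which is the behavior the sentence above describes for the auto classes.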
from_config(**kwargs)
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: FlaxBertForNextSentencePrediction (BERT model)
Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/bert_pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- bert — FlaxBertForNextSentencePrediction (BERT model)
>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction
>>> # Download model and configuration from the Hub and cache.
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("bert-base-cased")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: FlaxBeitForImageClassification (BEiT model)
- ViTConfig configuration class: FlaxViTForImageClassification (ViT model)
Instantiates one of the model classes of the library (with an image classification head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
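The difference the note draws between the two entry points can be illustrated with a toy model. ToyModel, its dictionary-based checkpoint, and the seed parameter are hypothetical and only mimic the described behavior: a configuration fixes the architecture, while the trained weights come only from a pretrained checkpoint.

```python
import random


class ToyModel:
    """Hypothetical model holding a config dict and a flat list of weights."""

    def __init__(self, config, weights):
        self.config = config
        self.weights = weights

    @classmethod
    def from_config(cls, config, seed=0):
        # Only the shape comes from the config; weights are freshly initialized.
        rng = random.Random(seed)
        weights = [rng.uniform(-0.02, 0.02) for _ in range(config["hidden_size"])]
        return cls(config, weights)

    @classmethod
    def from_pretrained(cls, checkpoint):
        # Both the configuration and the trained weights come from the checkpoint.
        return cls(checkpoint["config"], list(checkpoint["weights"]))


checkpoint = {"config": {"hidden_size": 4}, "weights": [0.1, 0.2, 0.3, 0.4]}
fresh = ToyModel.from_config(checkpoint["config"])
loaded = ToyModel.from_pretrained(checkpoint)
print(fresh.config == loaded.config)  # True: same architecture
print(fresh.weights == loaded.weights)  # False: only `loaded` has trained weights
```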
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/bert_pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
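The phrase "by protocol or endpoint" in the proxies description above can be read as a two-level lookup: a key naming a specific scheme and host takes precedence over a key naming only the scheme. The sketch below assumes that reading. The helper pick_proxy and the proxy addresses are hypothetical and used for illustration only.

```python
from urllib.parse import urlsplit


def pick_proxy(url, proxies):
    """Return the proxy for url, preferring an endpoint key over a protocol key."""
    parts = urlsplit(url)
    endpoint_key = f"{parts.scheme}://{parts.hostname}"
    if endpoint_key in proxies:
        return proxies[endpoint_key]
    return proxies.get(parts.scheme)  # None when no proxy applies


# Hypothetical proxy addresses, for illustration only.
proxies = {
    "http": "proxy.example:3128",
    "http://hostname": "proxy.example:4012",
}
print(pick_proxy("http://hostname/model.bin", proxies))  # proxy.example:4012
print(pick_proxy("http://other/model.bin", proxies))  # proxy.example:3128
print(pick_proxy("https://other/model.bin", proxies))  # None
```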
Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- beit — FlaxBeitForImageClassification (BEiT model)
- vit — FlaxViTForImageClassification (ViT model)
>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification
>>> # Download model and configuration from the Hub and cache.
>>> # An image classification checkpoint is needed here; BERT is not in the list above.
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )
This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.
This class cannot be instantiated directly using __init__() (throws an error).
from_config(**kwargs)
config (PretrainedConfig) —
The model class to instantiate is selected based on the configuration class:
- VisionEncoderDecoderConfig configuration class: FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)
Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.
Note: Loading a model from its configuration file does not load the model weights. It only affects the model’s configuration. Use from_pretrained() to load the model weights.
from_pretrained(*model_args, **kwargs)
pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on the Hub. Valid model ids can be located at the root-level, like bert-base-cased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
- A path to a directory containing model weights saved using save_pretrained().
- A path or url to a PyTorch state_dict save file (e.g., ./pt_model/bert_pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see docstring of pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received files. Will attempt to resume the download if such a file exists.
proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': '', 'http://hostname': ''}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on the Hub, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model’s __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model’s __init__ function.
Instantiate one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.
The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it’s missing, by falling back to using pattern matching on pretrained_model_name_or_path:
- vision-encoder-decoder — FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)
>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq
>>> # Download model and configuration from the Hub and cache.
>>> # A vision-encoder-decoder checkpoint is needed here; BERT is not in the list above.
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("ydshieh/vit-gpt2-coco-en")
>>> # Update configuration during loading
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("ydshieh/vit-gpt2-coco-en", output_attentions=True)
>>> model.config.output_attentions
True
>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )