Automatic Speech Recognition
ESPnet
English
audio
audio_captioning

RESULTS

Environments

  • date: Fri Nov 29 14:59:50 EST 2024
  • python version: 3.9.20 (main, Oct 3 2024, 07:27:41) [GCC 11.2.0]
  • espnet version: espnet 202409
  • pytorch version: pytorch 2.4.0
  • Git hash: 65ea259e8effab5a43cdff87161a301dc0f20930
    • Commit date: Fri Nov 29 10:54:44 2024 -0500

exp/asr_pt

WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|inference_ctc_weight0.0_hugging_face_decoderTrue_asr_model_latest/evaluation|1045|0|0.0|0.0|0.0|0.0|0.0|100.0|

CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|inference_ctc_weight0.0_hugging_face_decoderTrue_asr_model_latest/evaluation|1045|0|0.0|0.0|0.0|0.0|0.0|100.0|

TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|

exp/asr_pt/inference_ctc_weight0.0_hugging_face_decoderTrue_asr_model_latest

WER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|org/validation|1045|12004|15.7|76.4|7.9|35.7|120.0|100.0|
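
For readers who want to work with these numbers programmatically, the following is a minimal sketch of reading one result row into named fields. The column names are taken from the table header above; `parse_row` is a hypothetical helper and not part of ESPnet, and treating every column after Wrd as a percentage is an assumption based on the values shown.

```python
# Column names as they appear in the table header.
COLUMNS = ["dataset", "Snt", "Wrd", "Corr", "Sub", "Del", "Ins", "Err", "S.Err"]


def parse_row(line: str) -> dict:
    """Parse one result row (hypothetical helper, not an ESPnet API).

    Accepts either whitespace-separated or pipe-separated cells.
    """
    cells = line.replace("|", " ").split()
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    row = {"dataset": cells[0]}
    # Snt and Wrd are counts; the remaining columns are assumed to be percentages.
    row["Snt"] = int(cells[1])
    row["Wrd"] = int(cells[2])
    for name, value in zip(COLUMNS[3:], cells[3:]):
        row[name] = float(value)
    return row


row = parse_row("org/validation 1045 12004 15.7 76.4 7.9 35.7 120.0 100.0")
print(row["Wrd"], row["Err"])
```

The helper raises `ValueError` on rows with a different number of cells, so header-only tables such as the empty TER sections are simply skipped by the caller rather than silently misread.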

CER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|org/validation|1045|65932|45.5|39.4|15.1|39.4|93.9|100.0|
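
The Err values in the org/validation tables above are consistent with the usual definition of the error rate as substitutions plus deletions plus insertions, measured against the reference length. Because insertions are counted against the reference, the value can exceed 100%, as the 120.0 WER shows. A small check, assuming the columns are percentages and using a hypothetical `error_rate` helper:

```python
def error_rate(sub: float, dele: float, ins: float) -> float:
    """Sum the substitution, deletion and insertion percentages (hypothetical helper)."""
    return round(sub + dele + ins, 1)


def reference_coverage(corr: float, sub: float, dele: float) -> float:
    """Corr + Sub + Del should account for every reference token, i.e. 100%."""
    return round(corr + sub + dele, 1)


# Values copied from the org/validation rows of the WER and CER tables.
wer = error_rate(sub=76.4, dele=7.9, ins=35.7)
cer = error_rate(sub=39.4, dele=15.1, ins=39.4)

print(wer, cer)
print(reference_coverage(15.7, 76.4, 7.9), reference_coverage(45.5, 39.4, 15.1))
```

Both sums match the reported Err columns (120.0 and 93.9), and Corr, Sub and Del add up to 100.0 in each row.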

TER

|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|