Fast transcription of YouTube videos with faster_whisper

April 1, 2023, 14:45

This article explains how to transcribe YouTube videos quickly on Google Colab using faster_whisper, a faster reimplementation of Whisper, the speech-to-text model released by OpenAI.

Prerequisites

Run everything on Google Colab
- Set Runtime > Change runtime type > Hardware accelerator to GPU
Download the YouTube video with yt-dlp
- https://github.com/yt-dlp/yt-dlp
Transcribe with faster-whisper
- https://github.com/guillaumekln/faster-whisper
- Uses less memory than the official whisper and is up to 4x faster at the same accuracy
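
Before loading a model it can help to confirm that the GPU runtime is actually active. The following is a minimal sketch, not part of the original notebook: gpu_available and pick_device are hypothetical helper names, and the check simply assumes that a working GPU runtime exposes the nvidia-smi command. The CPU fallback of compute_type="int8" is one reasonable choice, not the only one.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if the nvidia-smi command exists and exits successfully."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(["nvidia-smi"], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def pick_device() -> tuple[str, str]:
    """Hypothetical helper: choose (device, compute_type) for WhisperModel."""
    if gpu_available():
        return "cuda", "float16"  # the settings used later in this article
    return "cpu", "int8"          # assumed fallback when no GPU is present


device, compute_type = pick_device()
print(f"device={device}, compute_type={compute_type}")
```

If this prints device=cpu on Colab, the hardware accelerator is probably not set to GPU yet.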

Steps

Run the following steps on Google Colab.

First, install yt-dlp and faster-whisper.

!pip install yt-dlp
!pip install faster-whisper

Download the target video with yt-dlp. Set url to the YouTube link.

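from yt_dlp import YoutubeDL

# Output file name. No conversion is performed, so the file keeps whatever
# audio format "bestaudio" selects, regardless of the .mp3 extension.
audio_name = "audio.mp3"  #@param {type: "string"}

ydl_opts = {
    "outtmpl": audio_name,  # save the download under this name
    "format": "bestaudio",  # best available audio-only stream
}

# Link to the YouTube video to transcribe
url = "https://youtu.be/0t8CH5BAgLY"  #@param {type: "string"}

with YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])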

As an example, we will transcribe this video by HIKAKIN.
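
The download step above stores the audio under a fixed file name without converting it. If a real MP3 file is wanted, yt-dlp's FFmpegExtractAudio postprocessor can do the conversion. The sketch below is one possible variant, assuming ffmpeg is installed on the machine; build_ydl_opts and download_audio are hypothetical helper names, and the quality value 192 is an arbitrary choice.

```python
def build_ydl_opts(basename: str = "audio") -> dict:
    """Build yt-dlp options that extract the audio track and convert it to MP3."""
    return {
        # %(ext)s lets yt-dlp name the intermediate file by its real extension
        "outtmpl": f"{basename}.%(ext)s",
        "format": "bestaudio/best",
        "postprocessors": [
            {
                "key": "FFmpegExtractAudio",  # requires ffmpeg
                "preferredcodec": "mp3",
                "preferredquality": "192",
            }
        ],
    }


def download_audio(url: str, basename: str = "audio") -> str:
    """Download url and return the name of the converted MP3 file."""
    from yt_dlp import YoutubeDL  # imported here so the options can be built without yt-dlp

    with YoutubeDL(build_ydl_opts(basename)) as ydl:
        ydl.download([url])
    return f"{basename}.mp3"


print(build_ydl_opts()["outtmpl"])
```

Calling download_audio(url) would then return "audio.mp3", which can be passed on as the audio file name for transcription.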

Next, transcribe the audio with faster-whisper. The code prints the transcription as it is produced and also writes it to a file for download.

from faster_whisper import WhisperModel

# Load the large-v2 model on the GPU with half-precision weights
model_size = "large-v2"
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# segments is a generator: transcription runs as it is iterated below
segments, info = model.transcribe(audio=audio_name, beam_size=5)

print(f"Detected language {info.language} with probability {info.language_probability}")

# Print each segment with its timestamps and write the text to output.txt
with open("output.txt", "w", encoding="UTF-8") as f:
    for segment in segments:
        print(f"[{segment.start} -> {segment.end}] {segment.text}")
        f.write(f"{segment.text}\n")
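
The loop above prints the timestamps but writes only the text to the file. If the timestamps should be kept, for example as subtitles, the segments can be formatted as SRT instead. The following is a rough sketch under the assumption that each segment has start, end (in seconds), and text attributes, as used above; format_timestamp, to_srt, and FakeSegment are hypothetical names, and FakeSegment only stands in for real faster-whisper segments so the example runs without a model.

```python
from collections import namedtuple


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render objects with start, end, and text attributes as SRT text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment.start)
        end = format_timestamp(segment.end)
        blocks.append(f"{index}\n{start} --> {end}\n{segment.text.strip()}\n")
    return "\n".join(blocks)


# Stand-in data for illustration only; real segments come from model.transcribe
FakeSegment = namedtuple("FakeSegment", ["start", "end", "text"])
demo = [FakeSegment(0.0, 2.5, " Hello"), FakeSegment(2.5, 3661.5, " world")]
print(to_srt(demo))
```

Because segments is a generator and can be iterated only once, collect it first with list(segments) if the same result should be both printed and converted to SRT.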

Finally, download the resulting file.

from google.colab import files
files.download("output.txt")

That completes the process. Use the output however you like.

The paid section includes a link to a ready-to-use Colab notebook.
Copy it to your own Drive and try it out.
