Trying ORCA Mini 13B on Google Colab

June 26, 2023, 07:16

I tried out "ORCA Mini 13B" on "Google Colab", so here is a summary.

[Note] This was verified on an A100 GPU with Google Colab Pro/Pro+.

1. ORCA Mini

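"ORCA Mini" is a model built on OpenLlama, following the approach described in Microsoft's Orca paper.

Unlike conventional instruction tuning (e.g. Vicuna-13B), "Orca" is trained on the step-by-step reasoning traces of large language models such as GPT-4. This lets a lightweight model reach accuracy comparable to ChatGPT (3.5).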

2. Model list

ORCA Mini is available as the following three models.

・psmathur/orca_mini_13b
・psmathur/orca_mini_7b
・psmathur/orca_mini_3b

3. Running on Colab

The steps for running it on "Google Colab" are as follows.

(1) Install the packages.

# Install the packages
!pip install transformers accelerate sentencepiece
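
As an optional check after installation, the sketch below reports which of the three packages are visible to the current Python runtime. The helper name installed_versions is hypothetical and not part of the original procedure; it only uses the standard-library importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, version

# The three packages installed in step (1).
PACKAGES = ["transformers", "accelerate", "sentencepiece"]


def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

If any package shows up as not installed, rerunning the install cell (and restarting the runtime if Colab asks for it) is the usual remedy.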

(2) Prepare the tokenizer and model.
This time, we use "ORCA Mini 13B".

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Prepare the tokenizer and model
tokenizer = LlamaTokenizer.from_pretrained(
    "psmathur/orca_mini_13b"
)
model = LlamaForCausalLM.from_pretrained(
    "psmathur/orca_mini_13b",
    torch_dtype=torch.float16,
    device_map="auto",
)
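
To get a feel for why a large GPU such as the A100 is useful here, the following sketch gives a back-of-the-envelope estimate of the memory needed just for the float16 weights (2 bytes per parameter). The parameter counts are nominal values read off the model names, which is an assumption rather than an exact figure, and the estimate ignores activations and the generation cache.

```python
# Nominal parameter counts inferred from the model names (an assumption).
NOMINAL_PARAMS = {
    "psmathur/orca_mini_3b": 3e9,
    "psmathur/orca_mini_7b": 7e9,
    "psmathur/orca_mini_13b": 13e9,
}

# torch.float16 stores each parameter in 2 bytes.
BYTES_PER_FP16_PARAM = 2


def estimate_weight_gb(model_id):
    """Approximate size of the float16 weights in GB (1 GB = 1e9 bytes)."""
    return NOMINAL_PARAMS[model_id] * BYTES_PER_FP16_PARAM / 1e9


for model_id in NOMINAL_PARAMS:
    print(f"{model_id}: about {estimate_weight_gb(model_id):.0f} GB of weights")
```

By this rough estimate the 13B model needs about 26 GB for its weights alone, so the smaller 7B or 3B models may be a better fit for GPUs with less memory.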

(3) Define the text generation function.

# Define the text generation function
def generate_text(system, instruction, input=None):
    if input:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
    else:
        prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Response:\n"
    print(prompt)

    tokens = tokenizer.encode(prompt)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to('cuda')

    instance = {'input_ids': tokens,'top_p': 1.0, 'temperature':0.7, 'generate_len': 1024, 'top_k': 50}
    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length+instance['generate_len'],
            use_cache=True,
            do_sample=True,
            top_p=instance['top_p'],
            temperature=instance['temperature'],
            top_k=instance['top_k']
        )
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    return string
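
The prompt format used by the function above can be looked at on its own, without loading the model. The sketch below builds the same string as the two f-strings in generate_text. The names build_prompt and context are hypothetical and introduced only for illustration.

```python
def build_prompt(system, instruction, context=None):
    """Build a prompt with System, User, optional Input, and Response sections."""
    prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n"
    if context:
        # The Input section is only included when extra context is supplied.
        prompt += f"### Input:\n{context}\n\n"
    # The prompt ends with an open Response section for the model to complete.
    return prompt + "### Response:\n"


print(build_prompt("You are a helpful assistant.", "Say hello."))
```

Keeping prompt construction separate like this makes it easy to inspect the exact text sent to the tokenizer before spending GPU time on generation.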

(4) Run inference.
The model is more accurate in English, so I ask a question with accompanying context in English.

# Run inference
system = 'You are an AI assistant that follows instruction extremely well. Help as much as you can.'
instruction = "What is the name of the band Hitori Goto joins?"
input = 'Hitori Goto is a lonely girl who loves the guitar. She was lonely at home and just played every day, but by chance she decided to join the "Kessoku Band" led by Nijika Ijichi. Can Goto, who is unfamiliar with performing in front of people, become a good band member?'
print(generate_text(system, instruction, input))

### System:
You are an AI assistant that follows instruction extremely well. Help as much as you can.

### User:
What is the name of the band Hitori Goto joins?

### Input:
Hitori Goto is a lonely girl who loves the guitar. She was lonely at home and just played every day, but by chance she decided to join the "Kessoku Band" led by Nijika Ijichi. Can Goto, who is unfamiliar with performing in front of people, become a good band member?

### Response:

The name of the band Hitori Goto joins is "Kessoku Band".

