HanBert-Transformers

Issue about transformers version

Open · tmtmaj opened this issue 4 years ago • 1 comment

Hello. The code you released has been really useful to me. Thank you!

I am writing because the toy example you provided cannot be used as-is with transformers 4.0.0.

The following error occurs. (I ran it on Ubuntu, with the same directory setup.)

from tokenization_hanbert import HanBertTokenizer
tokenizer = HanBertTokenizer.from_pretrained('HanBert-54kN-torch')
text = "나는 걸어가고 있는 중입니다. 나는걸어 가고있는 중입니다. 잘 분류되기도 한다. 잘 먹기도 한다."
tokenizer.tokenize(text)
...
AttributeError: 'HanBertTokenizer' object has no attribute 'vocab'

The toy example for the model output also needs a small modification because the output type has changed, but it does not raise an error.

For reference, everything runs without problems on transformers 2.2.2.

I think a transformers version requirement should be added, which is why I am posting this.
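Until a requirement is documented, a small guard could flag installs that differ from the one confirmed here. This is only a rough sketch: the function names and the three labels are hypothetical, and the only data points are the two from this issue (2.2.2 works, 4.0.0 fails). Everything in between is untested, so it is reported as such rather than guessed.

```python
import re

# The only two data points reported in this issue.
KNOWN_GOOD = (2, 2, 2)  # toy example runs without problems
KNOWN_BAD = (4, 0, 0)   # tokenizer raises AttributeError: ... no attribute 'vocab'


def parse_version(version: str) -> tuple:
    """Return the leading major.minor[.patch] of a version string as ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def classify(version: str) -> str:
    """Label a transformers version using only what this issue reports."""
    parsed = parse_version(version)
    if parsed == KNOWN_GOOD:
        return "working"
    if parsed >= KNOWN_BAD:
        return "suspect"   # at or after the version reported as failing
    return "untested"      # nothing is known about these versions


if __name__ == "__main__":
    try:
        import transformers
        print(transformers.__version__, classify(transformers.__version__))
    except ImportError:
        print("transformers is not installed")
```

Running the file directly prints the installed version and its label. Pinning the dependency to the confirmed version (for example with a requirements file) would be the simpler option if supporting newer releases is not a goal.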

tmtmaj, Dec 07 '20 06:12

I think changing the import in tokenization_hanbert.py as follows should fix it: from transformers import PreTrainedTokenizer -> from transformers.tokenization_utils import PreTrainedTokenizer

bzantium, Dec 19 '20 02:12