swift-models

GPT2 and BERT alignment

Open texasmichelle opened this issue 5 years ago • 0 comments

Refactor the GPT-2 and BERT models to make it clear which concepts are shared and which are distinct. For example, the transformer and multi-head attention are shared concepts, but naming collisions have led to model-specific files such as TransformerBERT.swift and structs such as MultiHeadAttentionGPT2. As our collection of transformer-based language models grows, a clear separation becomes increasingly important for code reuse and maintenance.
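One possible shape for this, as a rough sketch only: a single shared attention type whose configuration captures the one real difference between the two models (GPT-2 attends causally, BERT attends bidirectionally), with model-specific code kept in per-model namespaces. Every name below (`AttentionConfiguration`, `MultiHeadAttention`, the `GPT2` and `BERT` enums, `attentionMask`) is hypothetical and not the existing swift-models API, and the sketch uses plain Swift arrays instead of tensors to stay self-contained.

```swift
// Hypothetical sketch: none of these types exist in swift-models today.

/// Settings both models need in order to build an attention layer.
struct AttentionConfiguration {
    var headCount: Int
    var hiddenSize: Int
    /// true for GPT-2 style left-to-right attention,
    /// false for BERT style bidirectional attention.
    var isCausal: Bool

    /// Width of each attention head.
    var headSize: Int { return hiddenSize / headCount }
}

/// One shared type instead of per-model copies such as MultiHeadAttentionGPT2.
struct MultiHeadAttention {
    let configuration: AttentionConfiguration

    /// Returns mask[query][key]: whether position `query` may attend to `key`.
    func attentionMask(sequenceLength: Int) -> [[Bool]] {
        return (0..<sequenceLength).map { query in
            (0..<sequenceLength).map { key in
                !configuration.isCausal || key <= query
            }
        }
    }
}

/// Model-specific pieces live in a namespace and reuse the shared layer.
enum GPT2 {
    static func attention(headCount: Int, hiddenSize: Int) -> MultiHeadAttention {
        return MultiHeadAttention(configuration: AttentionConfiguration(
            headCount: headCount, hiddenSize: hiddenSize, isCausal: true))
    }
}

enum BERT {
    static func attention(headCount: Int, hiddenSize: Int) -> MultiHeadAttention {
        return MultiHeadAttention(configuration: AttentionConfiguration(
            headCount: headCount, hiddenSize: hiddenSize, isCausal: false))
    }
}

// Example values only: 12 heads over a hidden size of 768.
let gpt2Attention = GPT2.attention(headCount: 12, hiddenSize: 768)
let bertAttention = BERT.attention(headCount: 12, hiddenSize: 768)
print(gpt2Attention.attentionMask(sequenceLength: 3))
print(bertAttention.attentionMask(sequenceLength: 3))
```

With this layout the model name appears only in the namespace, so a new transformer-based model would add its own enum without introducing another suffixed copy of the attention struct.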

Investigate alignment with HuggingFace APIs, which do a great job of unifying a wide variety of transformer-based models.

texasmichelle · Apr 02 '20 01:04