Understanding AI – How do Transformer models and GPT work?
20 March 2024 by Alexander Stahlkopf
This video explains step by step how Transformers work and uses Excel functions to show the structure of OpenAI's GPT-2 architecture.
The underlying spreadsheet is available in the download section, so you can try it out yourself.
The "Spreadsheets are all you need" implementation is meant to help you understand how these models work. It only handles light workloads and has the following scope and limitations:
- Full GPT-2 small (124M parameters) model including byte pair encoding, embeddings, multi-headed attention, and multi-layer perceptron stages
- Inference/forward pass only (no training)
- Context is limited to 10 tokens in length
- 10 characters per word limit
- Zero temperature output only
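To make the last limitations more concrete, here is a minimal Python sketch of what "zero temperature output" and a 10-token context limit mean in practice. It is not taken from the spreadsheet: the function names, the made-up logits, and the policy of keeping the most recent tokens are assumptions chosen for illustration.

```python
import math

MAX_CONTEXT = 10  # context limit named in the list above


def truncate_context(token_ids, max_len=MAX_CONTEXT):
    """Keep only the most recent max_len tokens (assumed truncation policy)."""
    return token_ids[-max_len:]


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits):
    """Zero-temperature decoding: always pick the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Made-up logits over a 4-token vocabulary, for illustration only.
logits = [1.0, 3.5, 0.2, 2.9]
print(greedy_next_token(logits))             # index of the largest logit
print(softmax(logits, temperature=0.1))      # almost all mass on that index
print(truncate_context(list(range(15))))     # only the last 10 token ids remain
```

As the temperature approaches zero, the softmax distribution collapses onto the single most likely token, which is why zero-temperature output is deterministic: the same prompt always produces the same continuation.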
It is a nice way to get into the topic and build a deeper understanding.
Alexander Stahlkopf
Alex loves marketing, UX, and bringing ideas to life.
After working as a musician, publisher, manager, and concept developer, building music education and web agency locations, and then growing his independent IT consulting company for eight years, he now combines these experiences to follow his vision: "Make work-life easier and more fun."
He likes board sports and traveling with his camper.