In 2019, when OpenAI released a version of their generative language model (GPT-2), they initially wanted to hold it back and restrict access to it, fearing it would be too powerful. While this wasn't the first transformer-based model, it already demonstrated the potential and capability of the architecture.
After its release it was quickly adopted into various systems, but it wasn't quite there yet, so the hype died down relatively fast. Other models and derivatives also emerged, offering alternatives to the restricted (and expensive) service OpenAI provided for accessing their model.
Last year OpenAI released the next version of the model, vastly improving the quality of the output. Soon after, they followed with ChatGPT, which improved the quality even further.
As before, alternatives keep popping up. While they still struggle to match the quality of ChatGPT, there have been achievements that would have seemed remarkable compared to what we had only two years ago. But now that we've seen what the best models can do, running an inferior model (one that still vastly outperforms GPT-2) on mobile phone hardware gets dismissed as useless. Yet running on commodity hardware something that just a few years ago required specialized hardware is quite impressive.