ChatGPT, beyond the hype
February 2023
Senior Triad consultant Henry Grech-Cini is an expert in computer vision. He gave a crash course in ChatGPT live at Digital Leader’s Public Sector Insight Week. This is what he had to say.
ChatGPT is an application built on GPT-3 (and more recently GPT-4), large-scale language models developed by OpenAI. GPT stands for Generative Pre-trained Transformer. The ‘transformer’ part comes from a 2017 paper called “Attention Is All You Need”, which introduced the transformer architecture, a way of training a model on large amounts of data to generate output. ‘Generative’ refers to producing the next step in the message sequence. ‘Pre-trained’ means the model has already been trained on that data before you use it. Because it is pre-trained on so much text and uses a transformer model, it can give the impression that it can reason. However, it doesn’t reason as humans do.
Appreciating the simplicity of GPT is important, because it helps you understand its capabilities and limitations. Given an input sequence of pieces of text (GPT calls them tokens), it predicts how probable each possible next token is, selects one, and adds it to the sequence. Then it does the same again, and again. Tokens aren’t words. In English text they equate to about 4 characters each. So, you give it information, it predicts and selects the next token, and it repeats.
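The predict-select-repeat loop described above can be sketched in a few lines of Python. This is a toy illustration only, not how GPT is implemented: the probability table, the `<end>` marker, and the function names `predict_next` and `generate` are all made up for the example, and whole words stand in for tokens to keep it readable. A real model calculates the probabilities with a neural network and takes the whole input sequence into account.

```python
# Toy stand-in for the model: for each token, the probability of what comes next.
# These numbers are invented for illustration.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 0.9, "<end>": 0.1},
    "ran": {"away": 0.9, "<end>": 0.1},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}


def predict_next(tokens):
    """Return the probability of each candidate next token.

    This toy version looks only at the last token; a real model
    conditions on the whole sequence it has been given.
    """
    return NEXT_TOKEN_PROBS.get(tokens[-1], {"<end>": 1.0})


def generate(prompt_tokens, max_new_tokens=10):
    """Repeatedly predict, select, and append the next token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = predict_next(tokens)
        # Greedy selection: always take the most probable candidate.
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```

The sketch always picks the single most likely token. Real systems often choose with some randomness instead, which is one reason the same question can produce different answers.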
ChatGPT can give you an illusion of intelligence. It can refer to a limited amount of the conversation that preceded it, but it can’t currently fact-check itself. It is a sophisticated predictive system based on its training data and refinements over time. It does not possess any ability to reason. Don’t share its output without checking it thoroughly. Try challenging its response; it may apologise and change its position.
How useful ChatGPT is depends on what you want to use it for. I think there are many use cases. Here are five:
ChatGPT is particularly good at summarising long texts and articles. For example, you could paste in some text and ask it to summarise the text in 50 words. The transformer architecture makes it well suited to this kind of task. Don’t submit any information that is sensitive, official, or personal, because OpenAI collects all input to improve the system.
The most widely used version of ChatGPT was trained on a vast amount of information on a wide range of topics, from history and science to sports and entertainment, with data running up until 2021. Be wary of using it instead of a search engine, and don’t use it for recent topics. It is good for creating content such as agendas or providing recommendations. Its output is only as good as the guidance you give it. It takes skill and effort to hone your input.
ChatGPT uses a neural network architecture called a transformer. This helps it generate coherent and grammatically correct text, because it captures long-range dependencies between words and so maintains context over long passages. Don’t settle for the first response; refine it with further clarifications as required. Be careful: there are significant risks when using it to generate text about a subject you know little about. There is a high probability that the output will be convincing but false.
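As a rough illustration of how a transformer can relate each word to every other word in a passage, here is a minimal sketch of scaled dot-product attention, the core calculation from the “Attention is all you need” paper. The function names and the example vectors are made up for this sketch. It leaves out the learned weights, multiple attention heads, and masking that real models use, so treat it as a picture of the idea and not of ChatGPT itself.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtracting the maximum avoids overflow
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over plain Python lists of vectors.

    Each query is compared with every key, however far apart they sit
    in the sequence, and the output is a weighted mix of the values.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


# Hypothetical 2-dimensional vectors standing in for three words.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(words, words, words):
    print([round(x, 3) for x in row])
```

Every output row blends information from all three positions, which is the mechanism that lets the model keep track of context across a long passage.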
It’s trained on previous chat sessions between humans and can be used for general webchat, though its depth of engagement can be shallow. There is huge potential here.
ChatGPT was pre-trained on parallel corpora in multiple languages. These are texts in one language paired with their translations in another. This pre-training gives the model a strong grasp of how different languages relate and map to each other. Results can be impressive, though it’s worth noting that dedicated translation models may provide greater accuracy. Check the translation with a native speaker and be careful when translating complex or idiomatic phrases.
There are significant pitfalls associated with using ChatGPT and other generative AI tools. These include issues related to:
Artificial Intelligence (AI) models will improve incrementally over time: