Evolution of Generative Artificial Intelligence for Text (ChatGPT)

By

Ranganathan Rajkumar

Ex VP - Advanced Analytics - Intelligence

Many companies and research organizations that are pioneers in AI have been actively contributing to the growth of generative AI by introducing and applying different AI models to improve the quality of the generated output.

Before discussing the applications of generative AI in text and large language models, let’s see how the concept has evolved over the decades.

RNN sequencing

The sequence-to-sequence (seq2seq) approach, originally built on Recurrent Neural Networks (RNNs), was adopted and developed further by companies such as Google, Facebook, and Microsoft to solve large-scale language problems such as machine translation.

This element-by-element sequence model changed how machines conversed with humans, yet it had limitations: its output often contained grammatical mistakes and made poor semantic sense.

LSTM

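RNNs suffer from a problem called the vanishing gradient: as the error signal is propagated back through many time steps, it shrinks until the earliest inputs barely influence learning. LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) networks were introduced to address this issue.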
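
The effect can be seen with a little arithmetic. The sketch below is an illustration only, assuming a toy scalar RNN of the form h_t = tanh(w * h_{t-1}); the function name, the weight 0.9, and the 50 steps are made up for this example.

```python
import math


def gradient_through_time(weight, steps, h0=0.5):
    """Return |d h_t / d h_0| for t = 1..steps in a toy scalar RNN.

    The recurrence is h_t = tanh(weight * h_{t-1}); no inputs, no bias.
    """
    h = h0
    grad = 1.0
    history = []
    for _ in range(steps):
        h = math.tanh(weight * h)
        # Chain rule: d h_t / d h_{t-1} = weight * (1 - h_t ** 2)
        grad *= weight * (1.0 - h * h)
        history.append(abs(grad))
    return history


grads = gradient_through_time(weight=0.9, steps=50)
print(f"gradient after  1 step : {grads[0]:.3e}")
print(f"gradient after 50 steps: {grads[-1]:.3e}")
```

Each step multiplies the gradient by a factor smaller than 1, so the contribution of the first element fades as the sequence grows.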

Although the overall recurrent structure remains the same, an LSTM preserves the context present in the early part of a statement by mitigating the vanishing gradient problem. To retain that information, it introduces a cell state controlled by gates: a forget gate, an input gate, and an output gate.
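
To make the gates concrete, here is a minimal sketch of one LSTM step using scalars instead of vectors. It follows the standard LSTM equations, but the parameter layout (a dictionary of hypothetical `(w_x, w_h, bias)` tuples) and the chosen values are assumptions for illustration, not how a real library stores weights.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps a gate name to a (w_x, w_h, bias) tuple (illustrative layout).
    Returns the new hidden state h and the new cell state c.
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre_activation("forget"))       # share of the old cell state to keep
    i = sigmoid(pre_activation("input"))        # share of the candidate to write
    g = math.tanh(pre_activation("candidate"))  # candidate value for the cell state
    o = sigmoid(pre_activation("output"))       # share of the cell state to expose
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Forget gate almost fully open, input gate almost closed:
# the cell state should survive many steps nearly unchanged.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}
h, c = 0.0, 1.0
for _ in range(20):
    h, c = lstm_step(1.0, h, c, params)
print(f"cell state after 20 steps: {c:.4f}")
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than squashed at every step, information written early can be carried forward as long as the forget gate stays open.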

Transformer model

While LSTM was a rock star of its time in the evolution of NLP, it had issues such as slow training and limited contextual awareness because it processes text in one direction. Bi-directional LSTMs learned the context in the forward and backward directions separately and concatenated the results. Still, the two directions were not learned jointly, and the model struggled with tasks that deal with long sequences, such as text summarization and Q&A. Enter Transformers. This popular model improved training efficiency because it can process all elements of a sequence in parallel, and many later text models were built on it.

UNILM

The Unified Language Model (UniLM) builds on BERT (Bidirectional Encoder Representations from Transformers), a Transformer model. In this model, every output element is connected to every input element, and the relationships between words are calculated dynamically. The quality of AI-generated content improved as the algorithms were tuned and trained extensively.
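
Connecting every output element to every input element with dynamically calculated weights is usually done with scaled dot-product self-attention. The sketch below is a simplified illustration: it omits the learned query, key, and value projections that real Transformer layers use, and the three toy word vectors are invented.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = vectors."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score the query against every input position.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted mix of all input vectors.
        mixed = [
            sum(w * value[j] for w, value in zip(weights, vectors))
            for j in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up word vectors
outputs, weights = self_attention(toy_vectors)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one position attends to every position in the input. The weights depend on the vectors themselves, which is what makes the relationships dynamic rather than fixed.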

T5

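The Text-to-Text Transfer Transformer (T5) takes text as input and generates target text as output. It goes beyond language translation by framing every task, including translation, summarization, and question answering, as a text-to-text problem. It has a bi-directional encoder and a left-to-right decoder, pre-trained on a mix of unsupervised and supervised tasks.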

BART

Bidirectional and Auto-Regressive Transformers (BART) is a model structure proposed by Facebook and published in 2020. Consider it a generalization of BERT and GPT: it combines a BERT-style bi-directional encoder with a GPT-style left-to-right decoder.
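
The difference between a bi-directional encoder and a left-to-right decoder comes down to which positions each token may look at. The sketch below shows the two visibility patterns as boolean masks; the function names are hypothetical and it is not taken from any specific model's code.

```python
def bidirectional_mask(n):
    """Encoder style: every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def left_to_right_mask(n):
    """Decoder style: position i may attend only to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def render(mask):
    """Draw a mask with 'x' for allowed and '.' for blocked."""
    return "\n".join(
        " ".join("x" if allowed else "." for allowed in row) for row in mask
    )


print("encoder (bi-directional):")
print(render(bidirectional_mask(4)))
print("decoder (left-to-right):")
print(render(left_to_right_mask(4)))
```

The decoder mask is lower-triangular, so a token never sees the words that come after it. That is what allows the decoder to generate text one token at a time.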

GPT: Generative Pre-trained Transformer

GPT is one of the first autoregressive models based on the Transformer architecture. It evolved through GPT, GPT-2, GPT-3, and GPT-3.5 (the series that includes text-davinci-003). OpenAI fine-tuned a GPT-3.5 model, following the approach of InstructGPT, and released it to the public as ChatGPT.
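
Autoregressive means the model produces one token at a time and feeds each new token back in as context for the next. The sketch below shows that loop with greedy decoding. The "model" is a tiny made-up lookup table that only looks at the last token, whereas a real GPT conditions on the whole prefix with a neural network.

```python
def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a language model: next-token probabilities."""
    table = {
        "the": {"model": 0.6, "cat": 0.4},
        "model": {"writes": 0.7, "<end>": 0.3},
        "writes": {"text": 0.9, "<end>": 0.1},
        "text": {"<end>": 1.0},
    }
    return table.get(prefix[-1], {"<end>": 1.0})


def generate(next_token_probs, prompt, max_new_tokens=5, end_token="<end>"):
    """Greedy autoregressive decoding: always pick the most likely next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        distribution = next_token_probs(tokens)
        next_token = max(distribution, key=distribution.get)
        if next_token == end_token:
            break
        tokens.append(next_token)  # the new token becomes part of the context
    return tokens


print(generate(toy_next_token_probs, ["the"]))
# prints ['the', 'model', 'writes', 'text']
```

Production systems usually sample from the distribution instead of always taking the top choice, which is one reason the same prompt can give different answers.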

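A key part of its training is Reinforcement Learning from Human Feedback (RLHF), in which human ratings of model responses are used to steer the model toward the answers people prefer. This feedback continues to be used to refine the model.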
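
One ingredient of RLHF is a reward model trained on human comparisons between two responses to the same prompt. A common way to train it is a pairwise loss that is small when the preferred response receives the higher score. The sketch below shows only that loss; the function name and the scores are illustrative, and the later reinforcement learning stage is not covered.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Written as log(1 + exp(-diff)), which is the same quantity.
    """
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


# The reward model already ranks the human-preferred answer higher: small loss.
print(round(preference_loss(2.0, 0.0), 4))
# The reward model ranks the pair the wrong way round: larger loss.
print(round(preference_loss(0.0, 2.0), 4))
```

Minimizing this loss over many human comparisons teaches the reward model to score responses the way the raters would, and that score is then used as the training signal for the language model.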

This version of GPT converses much like a human, which explains the hype around the bot. With all the effort going into scaling AI-generated content, companies are striving to make it even more human-like.

More large language and generative AI models have been built and released by others: Google (Bard, based on the Language Model for Dialogue Applications, LaMDA), the BigScience project coordinated by Hugging Face (BLOOM), and most recently Meta Research (LLaMA), which was released openly to the research community.

Application of ChatGPT and Generative AI models

With companies expanding their investment in data analytics to derive critical insights from their data, the role of AI bots in data analytics deserves discussion.

The applications of generative AI and ChatGPT are vast: generating new text, answering questions in a human-like conversation, helping developers generate and explain code, and writing newsletters, blogs, social media posts, articles (this post was not written by ChatGPT), and sales reports. Other generative AI models extend this to generating new images and audio.

As the previous paragraph suggests, different generative AI models come into play for different applications. We continue to see ChatGPT experiences shared by people from various backgrounds and industries. How and where can enterprises use ChatGPT?

As you know, ChatGPT is a language model. Its applications are predominantly in text and in tasks that require human-like conversation, such as taking notes in a meeting, composing an email, writing content, and increasing developers’ productivity.

Key challenges in using publicly available AI bots for data analysis

Most data analysis projects deal with sensitive data. Large organizations sign data privacy agreements with customers and governments that prevent them from disclosing sensitive information to publicly available tools.

That’s why organizations must understand what kind of support the data engineering team looks for from AI bots and ensure no sensitive information is disclosed.
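
One practical safeguard is to mask obvious identifiers before any text leaves the organization. The sketch below is a minimal illustration using two regular expressions for email addresses and phone numbers. The patterns, placeholder labels, and sample prompt are assumptions made for this example, and they are nowhere near a complete detector of sensitive data.

```python
import re

# Illustrative patterns only; real data needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


prompt = "Summarize the complaint from jane.doe@example.com, phone +1 555-010-9999."
print(redact(prompt))
# prints: Summarize the complaint from [EMAIL], phone [PHONE].
```

A filter like this reduces accidental leaks but does not replace a data classification policy, since names, account numbers, and free-text details pass straight through.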

A known risk: AI models are still evolving toward better accuracy, which implies that there is definitely room for error. Publicly available conversational bots, even if well trained to perform certain activities, take no responsibility for the output they provide. You need a trained eye to ensure the AI gets the correct data, understands it, and does what it should.

Responsible governance & corporate policies

Technology is evolving fast, and with the entire world working on it, innovations, new tools, and upgrades arrive in the blink of an eye. It’s tempting to try new tools for critical tasks. But every organization must ensure the right policies are in place to responsibly handle sensations like ChatGPT.
