Transformers in a Nutshell.

Neural networks have become a fundamental building block for AI models in natural language processing (NLP), and they are especially relevant to generative AI (Gen-AI), a subfield of AI. Inspired by the human brain, these networks are trained to learn patterns and make predictions, which makes them well suited to generative tasks. ChatGPT by OpenAI is one such generative AI model. RNNs and Transformers are among the most widely used neural network architectures today.

Imagine a waiter taking your order at a restaurant one item at a time, remembering the items sequentially. Now imagine the waiter reading your order off a piece of paper. Do you notice any difference in outcome? By reading it off the paper, the waiter can see the entire order at once and notice that a few items belong together as a combo. This, in essence, is the mechanism at the heart of the Transformer: 'attention'.

An RNN (Recurrent Neural Network) processes each element sequentially, like the waiter in the first scenario. It tries to understand the meaning of a sentence one word at a time, which makes it less effective at identifying the most relevant parts of the sentence when building context. A minimal sketch of this sequential style of processing follows.
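To make the contrast concrete, here is a minimal, illustrative sketch of RNN-style processing in Python with NumPy. This is not any production implementation: the embeddings and weights are made-up toy values. The point is that a single hidden state is updated one word at a time, so earlier words must survive many updates to influence the result.

```python
import numpy as np

def rnn_step(hidden, word_vec, W_h, W_x):
    # The new hidden state depends only on the previous state and the
    # current word -- information about distant words must be carried
    # forward through every intermediate step.
    return np.tanh(hidden @ W_h + word_vec @ W_x)

rng = np.random.default_rng(0)
sentence = rng.normal(size=(5, 4))    # five "words" as 4-dim toy embeddings
W_h = 0.1 * rng.normal(size=(8, 8))   # untrained, made-up weights
W_x = 0.1 * rng.normal(size=(4, 8))

hidden = np.zeros(8)
for word_vec in sentence:             # like the waiter: one item at a time
    hidden = rnn_step(hidden, word_vec, W_h, W_x)

print(hidden.shape)  # (8,) -- the whole sentence squeezed into one state
```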

Transformers, by contrast, process in parallel: they consider all the words at once, which is how the waiter in the second scenario was able to spot the combo. This ability to capture long-range dependencies, or in simple language, to pay attention to the relationships between words, has revolutionized the way we process language. It began in 2017, when researchers at Google introduced the Transformer architecture in their paper 'Attention Is All You Need'. A sketch of the attention computation follows.
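Here is a correspondingly minimal sketch of scaled dot-product attention, the core operation of the Transformer, again in NumPy with toy values. In a real model, the queries (Q), keys (K), and values (V) come from learned linear projections and attention runs over multiple heads; this strips all of that away to show the one idea: every word looks at every other word in a single step.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # how relevant is each word to each other word?
    # Softmax turns each word's scores into weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                # each output is a weighted blend of all words

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))           # three "words" as 4-dim toy embeddings
out = scaled_dot_product_attention(X, X, X)  # self-attention: Q = K = V = X
print(out.shape)  # (3, 4) -- every word's output already mixes the whole sentence
```

Unlike the RNN loop above, nothing here is sequential: the matrix products compute all pairwise relationships at once, which is what makes Transformers parallelizable and effective at long-range dependencies.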

This is a landmark achievement: communication between humans and machines has never been more natural. New applications are emerging well beyond machine translation and text summarization as a direct result of Transformers, and such advances are bound to have a significant impact on data management and analytical reporting.

Here are three reasons why you should care:

Natural Language Querying (NLQ): NLQ lets users ask questions in natural language instead of writing in a query language. Salesforce Einstein, which is integrated into many Salesforce products, uses the Transformer architecture to let users query and analyze data in plain English. (A conceptual sketch of NLQ follows this list.)

Personalization in data experiences: Companies are leveraging the Transformer architecture to provide relevant, tailored products and features. Amazon SageMaker Canvas uses it to let functional or business users build machine learning models without writing code, making data analysis more accessible.

Data Summarization and Visualization: Transformers help summarize large datasets and surface trends, insights, and patterns without anyone having to manually review vast amounts of data. Tableau Voyager can generate summaries from data visualizations to highlight the key takeaways.
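To make the NLQ idea concrete, here is a purely conceptual sketch: a natural-language question goes in and a structured query comes out. The question, table, and generated SQL below are all hypothetical, and the hardcoded lookup is a stand-in for the Transformer model that a product such as Salesforce Einstein would actually use.

```python
def natural_language_to_sql(question: str) -> str:
    # Stand-in for a Transformer-based NLQ model: in a real system, a
    # trained model would generate the SQL from the question and the
    # database schema. Table and column names here are hypothetical.
    examples = {
        "What were total sales by region last quarter?":
            "SELECT region, SUM(sales) AS total_sales "
            "FROM orders WHERE quarter = 'Q1-2024' GROUP BY region;",
    }
    return examples.get(question, "-- a trained model would generate SQL here")

print(natural_language_to_sql("What were total sales by region last quarter?"))
```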

As the Transformer architecture continues to evolve, rapid innovation and industry transformation will follow. It is time to embrace the changing landscape, pay closer attention to it, and leverage it.

Note: This blog is meant to give a simplified overview of the Transformer architecture.

Reference: 'Attention Is All You Need', Vaswani et al., NeurIPS 2017 (neurips.cc)

Source: Various. This write-up is a summary of my understanding.
