This course provides a practical introduction to using transformer-based models for natural language processing (NLP) applications. You will learn to build and train models for text classification using encoder-based architectures like Bidirectional Encoder Representations from Transformers (BERT), and explore core concepts such as positional encoding, word embeddings, and attention mechanisms.



Generative AI Language Modeling with Transformers
This course is part of multiple programs.



Instructors: Joseph Santarcangelo and 2 others
11,077 already enrolled
What you'll learn
- Explain the role of attention mechanisms in transformer models for capturing contextual relationships in text
- Describe the differences in language modeling approaches between decoder-based models like GPT and encoder-based models like BERT
- Implement key components of transformer models, including positional encoding, attention mechanisms, and masking, using PyTorch
- Apply transformer-based models to real-world NLP tasks, such as text classification and language translation, using PyTorch and Hugging Face tools
Skills you'll gain
- Text Mining
- Large Language Modeling
- Machine Learning Methods
- PyTorch (Machine Learning Library)
- Natural Language Processing
- Generative AI
- Deep Learning
- Applied Machine Learning
Details to know

Add to your LinkedIn profile
6 assignments
Build your subject-matter expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate

There are 2 modules in this course
In this module, you will learn how transformers process sequential data using positional encoding and attention mechanisms. You will explore how to implement positional encoding in PyTorch and understand how attention helps models focus on relevant parts of input sequences. You'll dive deeper into self-attention and scaled dot-product attention with multiple heads to see how they contribute to language modeling tasks. The module also explains how the transformer architecture leverages these mechanisms efficiently. Through hands-on labs, you’ll implement these concepts and build transformer encoder layers in PyTorch. Finally, you'll apply transformer models for text classification, including building a data pipeline, defining the model, and training it, while also exploring techniques to optimize transformer training performance.
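For a sense of what the hands-on labs in this module involve, here is a minimal sketch (not taken from the course materials) of sinusoidal positional encoding and scaled dot-product attention in PyTorch; the variable names and dimensions are illustrative assumptions.

```python
import math
import torch
import torch.nn.functional as F

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: even dimensions use sine, odd use cosine,
    # so each position gets a unique, smoothly varying signature.
    position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)        # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                         * (-math.log(10000.0) / d_model))                  # (d_model/2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)
    pe[:, 1::2] = torch.cos(position * div_term)
    return pe                                                               # (seq_len, d_model)

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v, weights

# Toy usage: add positional information to token embeddings, then self-attend.
x = torch.randn(1, 10, 64) + positional_encoding(10, 64)   # (batch, seq_len, d_model)
out, attn = scaled_dot_product_attention(x, x, x)
```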
What's included
6 videos, 4 readings, 2 assignments, 2 app items, 1 plugin
In this module, you will learn how decoder-based models like GPT are trained using causal language modeling and implemented in PyTorch for both training and inference. You will explore encoder-based models, such as Bidirectional Encoder Representations from Transformers (BERT), and understand their pretraining strategies using masked language modeling (MLM) and next sentence prediction (NSP), along with data preparation techniques in PyTorch. You will also examine how transformer architectures are applied to machine translation, including their implementation using PyTorch. Through hands-on labs, you will gain practical experience with decoder models, encoder models, and translation tasks. The module concludes with a cheat sheet, glossary, and summary to help consolidate your understanding of key concepts.
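As a rough illustration of the masking ideas this module covers (a sketch, not the course's own code), the snippet below builds a causal attention mask for GPT-style training and prepares masked inputs and labels for BERT-style masked language modeling; the function names and the 15% masking rate are illustrative assumptions.

```python
import torch

def causal_mask(seq_len):
    # Lower-triangular mask: position i may attend only to positions <= i,
    # which is the constraint causal (autoregressive) language modeling needs.
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

def mlm_inputs_and_labels(token_ids, mask_token_id, mask_prob=0.15):
    # BERT-style MLM data prep: choose ~15% of tokens as prediction targets,
    # replace them with the [MASK] id in the inputs, and mark every other
    # position with -100 so the loss ignores it.
    labels = token_ids.clone()
    selected = torch.rand(token_ids.shape) < mask_prob
    labels[~selected] = -100
    inputs = token_ids.clone()
    inputs[selected] = mask_token_id
    return inputs, labels

print(causal_mask(4))
# tensor([[ True, False, False, False],
#         [ True,  True, False, False],
#         [ True,  True,  True, False],
#         [ True,  True,  True,  True]])
```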
What's included
10 videos, 6 readings, 4 assignments, 4 app items, 2 plugins
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Learner reviews
85 reviews
- 5 stars: 75.58%
- 4 stars: 13.95%
- 3 stars: 3.48%
- 2 stars: 1.16%
- 1 star: 5.81%
Showing 3 of 85
Reviewed on Nov 17, 2024
need assistance from humans, which seems lacking though a coach can give guidance but not to the extent of human touch.
Reviewed on Jan 18, 2025
Exceptional course and all the labs are industry related
Reviewed on Dec 30, 2024
This course gives me a wide picture of what transformers can be.
Frequently asked questions
You can complete this course in about two weeks if you spend 3–5 hours on it per week.
Basic knowledge of Python and familiarity with machine learning and neural network concepts are recommended. Familiarity with text preprocessing steps and with N-gram, Word2Vec, and sequence-to-sequence models is also helpful, as is knowledge of evaluation metrics such as bilingual evaluation understudy (BLEU).
This course is part of the Generative AI Engineering Essentials with LLMs PC specialization. Completing the specialization will give you the skills and confidence to pursue roles such as AI Engineer, NLP Engineer, Machine Learning Engineer, Deep Learning Engineer, and Data Scientist.