How to Scrape ANY YouTube Video Transcript with n8n! (full workflow)
Published on Aug 31, 2025
Introduction
This tutorial walks you through scraping YouTube video transcripts with the n8n automation tool. By following these steps, you will build a workflow that extracts transcripts, stores them in a database, and makes them available for other applications, including AI agents.
Step 1: Gather Required Tools
Before starting, ensure you have the following tools at your disposal:
- n8n: An open-source workflow automation tool.
- Airtable: A database solution to store the scraped transcripts.
- Apify: A web scraping tool to facilitate the extraction of transcripts.
Step 2: Create an Airtable Database
- Sign in to your Airtable account.
- Create a new base for storing the transcripts.
- Define the necessary fields such as:
- Video Title
- Video URL
- Transcript Content
- Ensure the database is organized for easy access and management.
Step 3: Clean the Database
- Review the structure of your Airtable base.
- Remove any unnecessary fields that may complicate data entry or retrieval.
- Ensure that the fields align with the data you will be scraping.
Step 4: Set Up Your n8n Workflow
- Open n8n and create a new workflow.
- Add a Webhook node to trigger the automation when a specific event occurs.
- Configure the Webhook to accept data from your Airtable form submissions.
Step 5: Configure Airtable Automation
- In Airtable, create a form linked to your database.
- Set up automation to trigger when a new form submission is received.
- This automation will send the video URL to your n8n workflow via the Webhook.
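The Airtable side of Steps 4–5 can be sketched as a "Run a script" automation action. This is a minimal sketch, not Airtable's exact editor boilerplate: the webhook URL is a placeholder for your n8n instance, and `videoUrl` stands in for whatever your form field is named.

```javascript
// Hypothetical n8n webhook URL -- replace with the one shown in your Webhook node.
const WEBHOOK_URL = 'https://your-n8n-host/webhook/youtube-transcript';

// Build the JSON body the n8n Webhook node will receive.
function buildWebhookPayload(videoUrl) {
  return { videoUrl };
}

// In the Airtable automation script you would read the submitted form field
// and POST it to the workflow:
async function notifyWorkflow(videoUrl) {
  const res = await fetch(WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildWebhookPayload(videoUrl)),
  });
  return res.ok;
}
```

In n8n, set the Webhook node's HTTP method to POST so it accepts this body.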
Step 6: Set Up the Apify Key
- Sign up for an Apify account if you haven’t already.
- Obtain your API key from the Apify dashboard.
- Store the key securely as you will need it to authenticate your requests in n8n.
Step 7: Create HTTP Request for Transcript Scraping
- In n8n, add an HTTP Request node.
- Configure the node to use the Apify API for scraping transcripts.
- Use the following parameters in your request:
- Method: GET or POST, depending on the endpoint of the actor you choose (check its API documentation).
- URL: your Apify endpoint for transcript scraping.
- Headers: include your Apify API key for authorization.
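As a reference for filling in the HTTP Request node, here is a sketch of the request it would send. The actor ID is a placeholder (substitute whichever transcript-scraping actor you pick on Apify), and this sketch assumes the actor is invoked via Apify's synchronous run endpoint with the video URL in the JSON body.

```javascript
// Read the token from an environment variable rather than hard-coding it
// in the workflow (see Step 6).
const APIFY_TOKEN = process.env.APIFY_TOKEN;

// Hypothetical actor ID -- replace with the actor you chose on Apify.
const ACTOR_ID = 'someuser~youtube-transcript-scraper';

// Mirror of the HTTP Request node's configuration.
function buildRequest(videoUrl) {
  return {
    method: 'POST',
    url: `https://api.apify.com/v2/acts/${ACTOR_ID}/run-sync-get-dataset-items`,
    headers: {
      Authorization: `Bearer ${APIFY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ videoUrl }),
  };
}
```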
Step 8: Test the URL Set Field
- Use the n8n interface to input a sample YouTube video URL.
- Run the workflow to ensure that the URL is processed correctly.
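One thing worth checking at this step is that both common YouTube URL formats are handled. A small Code-node-style helper (an illustrative sketch, not part of the original workflow) shows how the video ID can be normalized before it reaches the scraper:

```javascript
// Extract the video ID from either a full watch URL or a youtu.be short link.
function extractVideoId(url) {
  const u = new URL(url);
  if (u.hostname === 'youtu.be') {
    return u.pathname.slice(1); // youtu.be/<id>
  }
  return u.searchParams.get('v'); // youtube.com/watch?v=<id>
}
```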
Step 9: Test the HTTP Request
- Execute the HTTP Request node to confirm that it retrieves the transcript data.
- Check the response for any errors or issues in the scraping process.
Step 10: Clean the Transcript with Code Node
- Add a Code Node to your workflow to process the raw transcript data.
- Use the following example code to clean the transcript:
// Keep letters, digits, whitespace, and basic punctuation; strip everything else.
const raw = items[0].json.transcript || '';
const cleanedTranscript = raw
  .replace(/[^a-zA-Z0-9\s.,!?'-]/g, '') // drop stray symbols and markup characters
  .replace(/\s+/g, ' ')                 // collapse runs of whitespace
  .trim();

return [{ json: { cleanedTranscript } }];
Step 11: Append Transcript to Airtable
- Add an Airtable node to your workflow.
- Configure it to create a new record in your database with the following fields:
- Video Title
- Video URL
- Cleaned Transcript Content
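The record the Airtable node creates can be sketched as follows. The field names must match the base built in Step 2 exactly; the `item` property names (`title`, `videoUrl`, `cleanedTranscript`) are assumptions standing in for the fields carried through your workflow.

```javascript
// Shape of the record body the Airtable node submits for each processed video.
function buildRecord(item) {
  return {
    fields: {
      'Video Title': item.title,
      'Video URL': item.videoUrl,
      'Transcript Content': item.cleanedTranscript,
    },
  };
}
```

A mismatch between these field names and the Airtable base is the most common reason the final node fails, so double-check spelling and capitalization.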
Conclusion
You have now set up a complete workflow to scrape YouTube video transcripts using n8n. This system allows you to automate the extraction and storage of transcripts in Airtable, making it easier to access and utilize for AI applications. Explore further by integrating AI agents with the collected data to enhance your projects.