How to Scrape ANY YouTube Video Transcript with n8n! (full workflow)
Published on Aug 31, 2025
Introduction
This tutorial walks you through scraping YouTube video transcripts with the n8n automation tool. By following these steps, you will build a workflow that extracts transcripts, stores them in a database, and makes them available for other applications, including AI agents.
Step 1: Gather Required Tools
Before starting, ensure you have the following tools at your disposal:
- n8n: An open-source workflow automation tool.
- Airtable: A database solution to store the scraped transcripts.
- Apify: A web scraping tool to facilitate the extraction of transcripts.
Step 2: Create an Airtable Database
- Sign in to your Airtable account.
- Create a new base for storing the transcripts.
- Define the necessary fields such as:
- Video Title
- Video URL
- Transcript Content
- Ensure the database is organized for easy access and management.
Step 3: Clean the Database
- Review the structure of your Airtable base.
- Remove any unnecessary fields that may complicate data entry or retrieval.
- Ensure that the fields align with the data you will be scraping.
Step 4: Set Up Your n8n Workflow
- Open n8n and create a new workflow.
- Add a Webhook node to trigger the automation when a specific event occurs.
- Configure the Webhook to accept data from your Airtable form submissions.
Step 5: Configure Airtable Automation
- In Airtable, create a form linked to your database.
- Set up automation to trigger when a new form submission is received.
- This automation will send the video URL to your n8n workflow via the Webhook.
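The Airtable side of Steps 4–5 can be sketched as a "Run a script" automation action. This is a minimal sketch, not Airtable's exact editor boilerplate: the webhook URL is a placeholder for your n8n instance, and `videoUrl` stands in for whatever your form field is named.

```javascript
// Hypothetical n8n webhook URL -- replace with the one shown in your Webhook node.
const WEBHOOK_URL = 'https://your-n8n-host/webhook/youtube-transcript';

// Build the JSON body the n8n Webhook node will receive.
function buildWebhookPayload(videoUrl) {
  return { videoUrl };
}

// In the Airtable automation script you would read the submitted form field
// and POST it to the workflow:
async function notifyWorkflow(videoUrl) {
  const res = await fetch(WEBHOOK_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildWebhookPayload(videoUrl)),
  });
  return res.ok;
}
```

In n8n, set the Webhook node's HTTP method to POST so it accepts this body.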
Step 6: Set Up the Apify Key
- Sign up for an Apify account if you haven’t already.
- Obtain your API key from the Apify dashboard.
- Store the key securely as you will need it to authenticate your requests in n8n.
Step 7: Create HTTP Request for Transcript Scraping
- In n8n, add an HTTP Request node.
- Configure the node to use the Apify API for scraping transcripts.
- Use the following parameters in your request:
- Method: GET or POST, depending on the endpoint of the actor you choose (check its API documentation).
- URL: your Apify endpoint for transcript scraping.
- Headers: include your Apify API key for authorization.
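As a reference for filling in the HTTP Request node, here is a sketch of the request it would send. The actor ID is a placeholder (substitute whichever transcript-scraping actor you pick on Apify), and this sketch assumes the actor is invoked via Apify's synchronous run endpoint with the video URL in the JSON body.

```javascript
// Read the token from an environment variable rather than hard-coding it
// in the workflow (see Step 6).
const APIFY_TOKEN = process.env.APIFY_TOKEN;

// Hypothetical actor ID -- replace with the actor you chose on Apify.
const ACTOR_ID = 'someuser~youtube-transcript-scraper';

// Mirror of the HTTP Request node's configuration.
function buildRequest(videoUrl) {
  return {
    method: 'POST',
    url: `https://api.apify.com/v2/acts/${ACTOR_ID}/run-sync-get-dataset-items`,
    headers: {
      Authorization: `Bearer ${APIFY_TOKEN}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ videoUrl }),
  };
}
```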
Step 8: Test the URL Set Field
- Use the n8n interface to input a sample YouTube video URL.
- Run the workflow to ensure that the URL is processed correctly.
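One thing worth checking at this step is that both common YouTube URL formats are handled. A small Code-node-style helper (an illustrative sketch, not part of the original workflow) shows how the video ID can be normalized before it reaches the scraper:

```javascript
// Extract the video ID from either a full watch URL or a youtu.be short link.
function extractVideoId(url) {
  const u = new URL(url);
  if (u.hostname === 'youtu.be') {
    return u.pathname.slice(1); // youtu.be/<id>
  }
  return u.searchParams.get('v'); // youtube.com/watch?v=<id>
}
```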
Step 9: Test the HTTP Request
- Execute the HTTP Request node to confirm that it retrieves the transcript data.
- Check the response for any errors or issues in the scraping process.
Step 10: Clean the Transcript with Code Node
- Add a Code Node to your workflow to process the raw transcript data.
- Use the following example code to clean the transcript:
// Keep letters, digits, whitespace, and basic punctuation; strip everything else.
const raw = items[0].json.transcript || '';
const cleanedTranscript = raw
  .replace(/[^a-zA-Z0-9\s.,!?'-]/g, '') // drop stray symbols and markup characters
  .replace(/\s+/g, ' ')                 // collapse runs of whitespace
  .trim();

return [{ json: { cleanedTranscript } }];
Step 11: Append Transcript to Airtable
- Add an Airtable node to your workflow.
- Configure it to create a new record in your database with the following fields:
- Video Title
- Video URL
- Cleaned Transcript Content
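The record the Airtable node creates can be sketched as follows. The field names must match the base built in Step 2 exactly; the `item` property names (`title`, `videoUrl`, `cleanedTranscript`) are assumptions standing in for the fields carried through your workflow.

```javascript
// Shape of the record body the Airtable node submits for each processed video.
function buildRecord(item) {
  return {
    fields: {
      'Video Title': item.title,
      'Video URL': item.videoUrl,
      'Transcript Content': item.cleanedTranscript,
    },
  };
}
```

A mismatch between these field names and the Airtable base is the most common reason the final node fails, so double-check spelling and capitalization.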
Conclusion
You have now set up a complete workflow to scrape YouTube video transcripts using n8n. This system allows you to automate the extraction and storage of transcripts in Airtable, making it easier to access and utilize for AI applications. Explore further by integrating AI agents with the collected data to enhance your projects.