Honcho Blog - SEO, PPC & Digital Marketing News & Insight

How To Scrape Youtube Video Views With Python

Written by Admin | May 20, 2021 4:41:14 PM

Getting Views From Youtube Videos

This guide will show you how to scrape video views with Python, whether for a single video or a list of videos. While this is fairly easy to do thanks to the Python file we’ll be using, there are a few parts that may require slightly more knowledge.

We are using an existing Python file created by Python Engineer. I’d highly recommend the Youtube tutorial which covers this file in greater detail. All credit to him for creating this Python file and making all our lives much easier! Link to the Youtube Playlist.

Overview of the process

  1. You will need an API key from https://console.cloud.google.com/ 
  2. You will also need both the Video ID and Channel ID ideally in a list or Data Frame enabling us to iterate over the values
  3. Installation of pandas and tqdm (reading .xlsx files with pandas also requires openpyxl)
  4. The following Python file from Python Engineer: https://github.com/python-engineer/youtube-analyzer

Prerequisites and set-up used

  • Python Version: 3.9.4
  • IDE: Visual Studio Code
  • Libraries: Pandas, TQDM & yt_stats
  • Google Cloud: Youtube Data API V3
  • OS: Big Sur – 11.3.1

How to Get Video & Channel IDs

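Both IDs can be read from Youtube URLs. The Video ID is the value of the v parameter in a watch URL (youtube.com/watch?v=<Video ID>), and the Channel ID is the last part of a channel URL (youtube.com/channel/<Channel ID>).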
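If you have a list of URLs rather than IDs, the extraction can be scripted. The following is a minimal sketch using only the standard library; the function names and the placeholder URLs are my own, not part of yt_stats. Channels that use a custom or handle-style URL do not expose their ID this way, so the sketch returns None for those.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None if absent."""
    parsed = urlparse(url)
    # Short links look like https://youtu.be/<video_id>
    if parsed.netloc.endswith("youtu.be"):
        return parsed.path.lstrip("/") or None
    # Standard links look like https://www.youtube.com/watch?v=<video_id>
    return parse_qs(parsed.query).get("v", [None])[0]


def extract_channel_id(url):
    """Return the channel ID from a /channel/<channel_id> URL, or None."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) >= 2 and parts[0] == "channel":
        return parts[1]
    return None


# Placeholder IDs for illustration only, not real videos or channels.
print(extract_video_id("https://www.youtube.com/watch?v=VIDEO_ID_123"))
print(extract_channel_id("https://www.youtube.com/channel/UC_CHANNEL_ID_456"))
```

The two results can then be pasted into the “Video ID” and “Channel ID” columns used in the next step.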

Step 1: Adding your Channel & Video ID to a Dataframe

In my example, I’ll be using a data frame read from an Excel file with 3 columns: a title, a Video ID and a Channel ID. For demonstration purposes, I have labelled the columns “Title”, “Video ID” and “Channel ID”.

File Name: youtube_video_channel_ID.xlsx

Column 1: “Title”

Column 2: “Video ID”

Column 3: “Channel ID”
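Because the later script looks columns up by their exact heading, a quick check of the headings can save a confusing KeyError. This is a small sketch of that idea; the helper name missing_columns is hypothetical, and a plain list stands in for df.columns so the snippet runs without the spreadsheet.

```python
REQUIRED_COLUMNS = ["Title", "Video ID", "Channel ID"]


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required headings that are not present in `columns`."""
    # Strip stray spaces, which are easy to introduce in a spreadsheet header.
    present = {str(name).strip() for name in columns}
    return [name for name in required if name not in present]


# With pandas this would be called as missing_columns(df.columns).
headings = ["Title", "Video ID", "Channel Id"]  # note the lowercase "d"
print(missing_columns(headings))
```

An empty list means all three headings match exactly, including capitalisation.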

Step 2: Install & Import Files

  1. Save the yt_stats.py file in the directory of your script
  2. Import Pandas, yt_stats and tqdm

Example:

import pandas as pd
from yt_stats import YTstats
from tqdm import tqdm

Step 3: Add your API Key

Assign your API key to the API_KEY variable.

Example:

API_KEY = '#####Your API KEY#####'
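If you plan to share the script or keep it in version control, you may prefer not to hardcode the key. One option, sketched below, is to read it from an environment variable; the variable name YOUTUBE_API_KEY and the helper load_api_key are assumptions of mine, not something yt_stats requires.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Read the API key from an environment variable, failing loudly if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage, once the variable has been set in your shell:
# API_KEY = load_api_key()
```

The rest of the script stays the same, since it only needs API_KEY to hold a string.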

Step 4: Read your Excel/CSV file and create an empty list

Read “youtube_video_channel_ID.xlsx” and add to the “df” object. Then we’ll create an empty list where we’ll append the view data, later on.

Example:

df = pd.read_excel("youtube_video_channel_ID.xlsx")
Views = []

Step 5: Create our script which will retrieve the view data.

The following script loops over each Channel ID and Video ID pair, uses the Youtube Data API V3 to retrieve the video’s “statistics” part, and takes the “viewCount” value from it.

It’s worth noting that depending on your Data Frame headings you may need to alter the script for the correct references.

Example:

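# Loop over each (Video ID, Channel ID) pair; total lets tqdm show progress.
for video_id, channel_id in tqdm(zip(df["Video ID"], df["Channel ID"]), total=len(df)):
    yt = YTstats(API_KEY, channel_id)
    # Request only the "statistics" part for this video.
    stats = yt._get_single_video_data(video_id, 'statistics')
    # viewCount comes back from the API as a string.
    Views.append(stats['viewCount'])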
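To make it clearer what the script is pulling out, here is a sketch that parses a response in the shape the API returns for the “statistics” part. The sample JSON is made up and trimmed for illustration, and view_count_from_response is a hypothetical helper rather than part of yt_stats. Note that viewCount arrives as a string, so it is converted to an integer here.

```python
import json

# Trimmed, made-up response in the shape returned for part=statistics.
SAMPLE_RESPONSE = """
{
  "items": [
    {
      "id": "VIDEO_ID_123",
      "statistics": {"viewCount": "1500", "likeCount": "40"}
    }
  ]
}
"""


def view_count_from_response(raw_json):
    """Return viewCount as an int, or None if the video or field is missing."""
    items = json.loads(raw_json).get("items", [])
    if not items:
        return None  # for example, the video ID was not found
    count = items[0].get("statistics", {}).get("viewCount")
    return int(count) if count is not None else None


print(view_count_from_response(SAMPLE_RESPONSE))
```

Returning None instead of raising lets a long run continue when a single video cannot be found.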

Step 6: Retrieving our data

The data will now be in the Views list. Using Pandas we can add it to our existing data frame as a new column, and you can then output the result to a CSV, Excel or JSON file.

Example:

df["views"] = Views
df.to_excel("output.xlsx")

Output & notes

Once saved to Excel, the output file should contain your original columns plus a new “views” column.

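The Youtube Data API V3 has a daily quota that is counted in units rather than raw queries; the default allocation is 10,000 units a day, and a single video lookup costs 1 unit. If you are planning a large-scale project, you’ll need to request a higher quota from Google.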

More information on the Youtube Data API V3 can be found here:  https://developers.google.com/youtube/v3/determine_quota_cost
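One way to stretch the quota is to ask for several videos in a single request, since the videos endpoint accepts a comma-separated list of IDs. The sketch below only builds the request URLs and does not call the API; it bypasses yt_stats, the helper names are my own, and the batch size of 50 is an assumption you should confirm against the API reference.

```python
from urllib.parse import urlencode

VIDEOS_ENDPOINT = "https://www.googleapis.com/youtube/v3/videos"
BATCH_SIZE = 50  # assumed per-request cap on IDs; check the API reference


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_urls(video_ids, api_key, batch_size=BATCH_SIZE):
    """Return one videos request URL per batch of IDs."""
    urls = []
    for batch in chunked(list(video_ids), batch_size):
        query = urlencode({"part": "statistics", "id": ",".join(batch), "key": api_key})
        urls.append(f"{VIDEOS_ENDPOINT}?{query}")
    return urls


ids = [f"video{i}" for i in range(120)]  # placeholder IDs
print(len(build_batch_urls(ids, "PLACEHOLDER_KEY")))
```

With 120 placeholder IDs this produces 3 URLs instead of 120 separate lookups.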



Full Code:

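import pandas as pd
from yt_stats import YTstats
from tqdm import tqdm

API_KEY = '#####API KEY#####'

# Read the spreadsheet holding the "Video ID" and "Channel ID" columns.
df = pd.read_excel("youtube_video_channel_ID.xlsx")
Views = []

# Loop over each (Video ID, Channel ID) pair; total lets tqdm show progress.
for video_id, channel_id in tqdm(zip(df["Video ID"], df["Channel ID"]), total=len(df)):
    yt = YTstats(API_KEY, channel_id)
    # Request only the "statistics" part for this video.
    stats = yt._get_single_video_data(video_id, 'statistics')
    # viewCount comes back from the API as a string.
    Views.append(stats['viewCount'])

# Add the view counts as a new column and save the result.
df["views"] = Views
df.to_excel("output.xlsx")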