1. How can I scrape, in Python, the content of the 'information panel providing publisher context' on YouTube? The panel appears only below a video and describes why the channel is verified (if it is).
Blue box below a video (see screenshot): Information Panel providing Publisher Context
The scraping happens in the following context: I was sent an Excel file with 40,000 YouTube channel names, one per row, and have to check whether each channel is verified. If it is, I have to provide the reason for the verification (found in the information panel). To record this, I enter yes/no in the column next to the channel name and the verification reason in the column after that.
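To make the first question concrete, here is a minimal sketch of one possible approach, not a confirmed solution. It assumes that the video page embeds its data as a JSON object assigned to a variable named `ytInitialData`, and that the panel sits somewhere under a key such as `clarificationRenderer`. Both names are assumptions about undocumented page internals and would have to be checked against a real page; the helper names are hypothetical too.

```python
import json

MARKER = "ytInitialData"             # assumed name of the embedded JSON variable
PANEL_KEY = "clarificationRenderer"  # hypothetical key; verify on a real page


def extract_embedded_json(html, marker=MARKER):
    """Return the first JSON object that follows `marker` in the HTML, or None."""
    pos = html.find(marker)
    if pos == -1:
        return None
    start = html.find("{", pos)
    if start == -1:
        return None
    try:
        # raw_decode parses exactly one JSON value and ignores trailing text
        obj, _ = json.JSONDecoder().raw_decode(html, start)
    except json.JSONDecodeError:
        return None
    return obj


def find_key(node, key):
    """Depth-first search that yields every value stored under `key`."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


def panel_candidates(html, key=PANEL_KEY):
    """Return all sub-objects stored under `key`, or an empty list."""
    data = extract_embedded_json(html)
    return list(find_key(data, key)) if data is not None else []
```

The HTML itself would still have to be fetched (for example with a browser automation tool or an HTTP client), and the returned objects inspected by hand to see where the panel text lives.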
2. Does it make sense to use a web scraper for this task, or rather RPA or something else? And if a web scraper is the right tool, which libraries / tools can perform these tasks?
So far I have been working with ChatGPT and several websites to understand how to scrape data. I set up the YouTube Data API and found ways to scrape whether a channel is verified or not. However, I have not been able to get the information panel content through the YouTube API so far.
3. My goal is to create an application that performs the following steps:
- Extracts the data (YouTube channel names) from the Excel file
- Scrapes whether the channel is verified and the 'information panel content' from YouTube
- Writes into the Excel file first yes / no and second the info panel content
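The three steps above could be sketched roughly as follows. This is only an outline under stated assumptions: the channel names are in column A with a header in row 1, the results go into columns B and C, and `lookup` is a hypothetical caller-supplied function that returns `(verified, reason)` for a channel name (the actual scraping is not shown). The `"error"` marker for failed lookups is my own addition so that one bad row does not stop a 40,000-row run. Reading and writing the workbook uses the third-party `openpyxl` package.

```python
def annotate_one(name, lookup):
    """Return ('yes' | 'no' | 'error', text) for a single channel name.

    `lookup` is a hypothetical function: name -> (verified: bool, reason or None).
    """
    try:
        verified, reason = lookup(name)
    except Exception as exc:
        # Record the failure instead of aborting the whole run
        return "error", str(exc)
    return ("yes" if verified else "no"), (reason or "")


def process_workbook(path, lookup, out_path):
    """Read names from column A, write yes/no to column B and the reason to C."""
    from openpyxl import load_workbook  # third-party: pip install openpyxl

    wb = load_workbook(path)
    ws = wb.active
    for row_idx in range(2, ws.max_row + 1):  # row 1 is assumed to be a header
        name = ws.cell(row=row_idx, column=1).value
        if not name:
            continue  # skip empty rows, keeping row alignment intact
        flag, reason = annotate_one(str(name), lookup)
        ws.cell(row=row_idx, column=2, value=flag)
        ws.cell(row=row_idx, column=3, value=reason)
    wb.save(out_path)
```

Saving to a separate `out_path` rather than overwriting the input keeps the original file untouched if the run is interrupted.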