I am exploring the YouTube Data API and trying to get the most viewed videos published within a given period. The problem is that the results are not accurate. My code and a sample result are below.
    from datetime import datetime
    import requests
    import json

    def get_most_watched_videos(api_key, year_b, month_b, day_b, hour_b, min_b, year_a, month_a, day_a, hour_a, min_a, category_id):
        published_before = datetime(year_b, month_b, day_b, hour_b, min_b).isoformat("T") + "Z"
        published_after = datetime(year_a, month_a, day_a, hour_a, min_a).isoformat("T") + "Z"
        params = {
            'part': 'snippet',
            'maxResults': 10,
            'order': 'viewCount',
            'type': 'video',
            'publishedBefore': published_before,
            'publishedAfter': published_after,
            'region': 'US',
            #'relevanceLanguage': 'en',
            'key': api_key,
        }
        response = requests.get('https://www.googleapis.com/youtube/v3/search', params=params)
        return response.json()

    year_b, month_b, day_b, hour_b, min_b = 2023, 6, 25, 0, 0
    year_a, month_a, day_a, hour_a, min_a = 2010, 6, 24, 0, 0
    videos = get_most_watched_videos(api_key, year_b, month_b, day_b, hour_b, min_b, year_a, month_a, day_a, hour_a, min_a, category_id)
    video_ids = [video['id']['videoId'] for video in videos['items']]

    # get the view count for each video
    view_counts = []
    video_list = []
    for video_id in video_ids:
        response = requests.get(f'https://www.googleapis.com/youtube/v3/videos?part=statistics&id={video_id}&key={api_key}')
        video = json.loads(response.text)
        video_list.append(video)
        view_count = int(video['items'][0]['statistics']['viewCount'])
        view_counts.append((video_id, view_count))

    # sort the videos by view count
    view_counts.sort(key=lambda x: x[1], reverse=True)

    # print the videos and their view counts
    for video_id, view_count in view_counts:
        print(f'Video ID: {video_id}, View Count: {view_count}')
The results of this function do not match real-world view counts. Am I using the API incorrectly, or does it simply not work?
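One thing I am unsure about is the parameter names. As far as I can tell from the search.list reference, the region filter is called `regionCode` rather than `region`, and the category filter is `videoCategoryId`, which my function accepts as `category_id` but never sends. Below is a minimal sketch of building the parameters with those names. `build_search_params` is a hypothetical helper of mine, the API key is a placeholder, the category value 10 is only an example, and I have not confirmed that this alone fixes the ranking.

```python
from datetime import datetime


def build_search_params(api_key, published_after, published_before,
                        category_id=None, region_code="US", max_results=10):
    """Build query parameters for a search.list request (hypothetical helper)."""
    params = {
        "part": "snippet",
        "maxResults": max_results,
        "order": "viewCount",
        "type": "video",
        # Timestamps formatted like 2010-06-24T00:00:00Z
        "publishedAfter": published_after.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "publishedBefore": published_before.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "regionCode": region_code,
        "key": api_key,
    }
    # Only send the category filter when one was actually given.
    if category_id is not None:
        params["videoCategoryId"] = str(category_id)
    return params


params = build_search_params(
    "YOUR_API_KEY",            # placeholder, not a real key
    datetime(2010, 6, 24),
    datetime(2023, 6, 25),
    category_id=10,            # example value only
)
```

The resulting dictionary can be passed as `params=` to `requests.get` exactly as in my original function.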
I have searched for this issue, for example in the question "youtube data api search by viewCount wrong results", but the answers there recommend using v1 of the API, which is deprecated.
These are the results I get when running the code above. They are clearly wrong, because "Despacito" has more than 17M views:
Video ID: Qpd5dox2GlI, View Count: 17148767
Video ID: 9tXsku7CQQA, View Count: 13613482
Video ID: o3cYy7IGQ_w, View Count: 11185616
Video ID: ygCxKydStJ8, View Count: 9768904
Video ID: F6MfmV-u2kc, View Count: 7319194
Video ID: 6weJp1JkFcg, View Count: 5612655
Video ID: lo7SH3S6zCw, View Count: 4121992
Video ID: iDMren1gIzw, View Count: 4061525
Video ID: 9iNrvndVBqE, View Count: 3842975
Video ID: oYX5v_W3qT0, View Count: 3316633
Sometimes, when I move the start date earlier (so the period is longer), I get lower view counts.
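To make runs over different date ranges easier to compare, the per-video lookup can be done in a single request, since the `id` parameter of videos.list accepts a comma-separated list of IDs. This is only a sketch: `build_videos_params` and `rank_by_views` are hypothetical helpers of mine, and the sample response simply reuses two rows from the output above in the shape I understand the API to return (view counts as strings).

```python
def build_videos_params(api_key, video_ids):
    """Parameters for one videos.list call covering all IDs (hypothetical helper)."""
    return {"part": "statistics", "id": ",".join(video_ids), "key": api_key}


def rank_by_views(videos_response):
    """Return (video_id, view_count) pairs sorted by view count, highest first."""
    ranked = []
    for item in videos_response.get("items", []):
        stats = item.get("statistics", {})
        # Skip items that carry no viewCount field.
        if "viewCount" in stats:
            ranked.append((item["id"], int(stats["viewCount"])))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Sample response built from two rows of the output shown above.
sample_response = {
    "items": [
        {"id": "9tXsku7CQQA", "statistics": {"viewCount": "13613482"}},
        {"id": "Qpd5dox2GlI", "statistics": {"viewCount": "17148767"}},
    ]
}

video_params = build_videos_params("YOUR_API_KEY", ["9tXsku7CQQA", "Qpd5dox2GlI"])
ranking = rank_by_views(sample_response)
for video_id, view_count in ranking:
    print(f"Video ID: {video_id}, View Count: {view_count}")
```

Running the same ranking on the responses for two different date ranges would show directly whether the longer range really returns a lower top view count.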