I have created a Python script to collect over 100,000 comments from the YouTube Data API. Originally I just used a list and appended comments as they came in. However, if I want to scale up the number of comments, I may need faster storage. Is there a better solution, e.g. SQLite, that is faster than a list? I need to both store and access the comments. The order of elements does not matter, as I use these comments for sentiment analysis.
edit: The data structure does not need to be indexed, and the ordering does not matter. I will also not be searching the data structure, just retrieving elements one by one.
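For context, here is a minimal sketch of the SQLite approach I am considering, using Python's built-in sqlite3 module (the table name and one-column schema are my own assumptions, not settled design):

```python
import sqlite3

# In-memory database for illustration; a real run would pass a file path
# like "comments.db" so the data survives the script.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS comments (text TEXT)")

# executemany inserts a whole batch (e.g. one API page of up to 100
# comments) in a single transaction, which is far cheaper than
# committing once per comment.
batch = [("first comment",), ("second comment",), ("third comment",)]
with conn:
    conn.executemany("INSERT INTO comments (text) VALUES (?)", batch)

# Retrieval one row at a time; no index and no ordering required.
stored = [text for (text,) in conn.execute("SELECT text FROM comments")]
```

The main cost to watch is transaction overhead, which is why the sketch batches one page of comments per commit rather than inserting them individually.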
Here is the current code snippet that I use:
    # Fetch comments
    comments = []
    nextPageToken = None
    while len(comments) < LIMIT:
        request = youtube.commentThreads().list(
            part='snippet',
            videoId=video_id,
            maxResults=100,  # You can fetch up to 100 comments per request
            pageToken=nextPageToken
        )
        response = request.execute()
        for item in response['items']:
            comment = item['snippet']['topLevelComment']['snippet']
            comments.append(comment['textDisplay'])
        nextPageToken = response.get('nextPageToken')
        if not nextPageToken:
            break

This works fine, but I can't help but feel I can increase the speed in some way.
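One low-effort variant of the loop above that I have been considering is streaming each page straight to an append-only text file instead of accumulating everything in a list. A sketch with fake pages standing in for the `response['items']` text fields (the helper name and file name are my own, purely illustrative):

```python
from pathlib import Path

def append_comments(path, texts):
    # Append one comment per line so nothing accumulates in memory.
    with open(path, "a", encoding="utf-8") as f:
        for text in texts:
            # Flatten embedded newlines so one line == one comment.
            f.write(text.replace("\n", " ") + "\n")

out = Path("comments.txt")
out.unlink(missing_ok=True)  # start fresh for this demo

# Fake "pages" standing in for successive API responses.
pages = [["great video", "nice\nedit"], ["thanks!"]]
for page in pages:
    append_comments(out, page)

lines = out.read_text(encoding="utf-8").splitlines()
```

In the real loop, `append_comments(out, [c['snippet']['topLevelComment']['snippet']['textDisplay'] for c in response['items']])` would replace the list appends, and the sentiment analysis step would read the file back line by line.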