Channel: Active questions tagged youtube-api - Stack Overflow

Working with Google Takeout's video metadata

Has anyone taken the video metadata from a Google YouTube Takeout file and matched it to the filenames contained in the export itself? I'll describe what I've found below. A solution doesn't need to use the YouTube API or Python. I'm flexible on that.

I can find lots of videos and web pages talking about how to use Takeout to get these files, but none actually say how to connect the videos themselves (the file names) with the metadata, playlists, etc. (keyed on the video ID) that come in the export.

I have been trying to find a way to connect the video metadata (csv files) from a Takeout export that a user has given my department with the video files (mp4) that are actually contained in the export. These will be uploaded to another video service our University uses. There are over 700 videos totaling 81 GB across 4 exports, so this isn't something to do by hand. Just connecting the videos in the 2 services isn't an option for business reasons. Avoiding a script to redownload the videos via the API is also not an option for business reasons.

The Takeout export puts the videos in a videos folder with just the individual filenames (truncated at 55 characters). The metadata are in multiple csvs, but mainly in videos*.csv (capped at 200 entries for each csv, but I can combine them) in the "video metadata" folder. We need only some of the fields from the metadata, and I can carve them out. (I'm using pandas in Python, but there are other ways and languages that I can do this part in.)

The videos*.csv headers are:

Video ID,Approx Duration (ms),Video Audio Language,Video Category,Video Description (Original),Video Description (Original) Language,Channel ID,Tag 1,Video Title (Original),Video Title (Original) Language,Privacy,Video State,Video Create Timestamp
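Because each videos*.csv holds at most 200 entries, the files need to be merged before any matching can start. Below is a minimal sketch of that step using only the standard library (pandas `concat` would do the same job). The function name `load_video_rows` is made up for illustration, and the UTF-8 encoding is an assumption about how Takeout writes the files.

```python
import csv
from pathlib import Path


def load_video_rows(metadata_dir):
    """Read every videos*.csv in metadata_dir into one dict keyed by Video ID."""
    rows_by_id = {}
    for csv_path in sorted(Path(metadata_dir).glob("videos*.csv")):
        # Assumption: the files are UTF-8; utf-8-sig also drops a leading BOM.
        with open(csv_path, newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                # Keep the first row seen for each ID; later duplicates are skipped.
                rows_by_id.setdefault(row["Video ID"], row)
    return rows_by_id
```

Calling `load_video_rows("video metadata")` on each export and merging the resulting dicts would also collapse Video IDs that repeat across exports, though whether the first or the last copy should win is a decision this sketch does not make for you.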

The problem is that nothing in the metadata links the Video ID field to a filename. After way too much digging, the closest I've found is the YouTube Data API v3 (in Python, but I'm not picky about languages).

There are 2 main problems:

  • The returned fileDetails has the fileName that was originally uploaded, but not the file name that is in the Takeout export. This only matches some files due to name changes YouTube makes, like changing "-" and "." to spaces, etc. (I haven't dug for all of them, but even knowing the substitutions wouldn't help at this point, so I didn't bother yet.)
  • The filenames are truncated at 55 characters, and duplicates created by the truncation are resolved by adding (1), (2), etc. at the end:
      video with really long name that I will pad out(1).mp4
      video with really long name that I will pad out(2).mp4
      video with really long name that I will pad out(3).mp4
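The "(1)", "(2)" suffixes can at least be parsed mechanically, which shows which exported files collide on the same truncated name. This is only a sketch of that parsing step: `split_export_name` and `group_by_stem` are hypothetical helper names, and the code assumes the counter always sits directly before the extension, as in the examples above. It does not solve the mapping to Video IDs.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Matches a trailing "(n)" duplicate counter at the end of a file stem.
_COUNTER = re.compile(r"^(?P<stem>.*?)\((?P<n>\d+)\)$")


def split_export_name(filename):
    """Return (stem, counter, extension); counter is 0 when no "(n)" is present."""
    path = PurePath(filename)
    match = _COUNTER.match(path.stem)
    if match:
        return match.group("stem"), int(match.group("n")), path.suffix
    return path.stem, 0, path.suffix


def group_by_stem(filenames):
    """Group exported file names that share the same truncated stem."""
    groups = defaultdict(list)
    for name in filenames:
        stem, counter, _ = split_export_name(name)
        groups[stem].append((counter, name))
    # Order each group by its counter so (1), (2), (3) come out in sequence.
    return {stem: [n for _, n in sorted(items)] for stem, items in groups.items()}
```

One caveat: a video whose real name ends in something like "(2)" would be misread as a duplicate counter, so any group with a single member deserves a second look.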

What I need to do isn't as simple as that, but this is the key issue that keeps me from even starting to address the others (video IDs duplicated across the 4 exports, video checksums showing identical files with different file names, and so much else).
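For the checksum side issue, here is a rough sketch of how identical files with different names could be listed across the exports. The function names are invented for this example, SHA-256 is just one reasonable choice of hash, and the "*.mp4" pattern assumes every exported video uses that extension.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large videos are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(export_dirs, pattern="*.mp4"):
    """Return {digest: [paths]} for every digest shared by more than one file."""
    by_digest = defaultdict(list)
    for export_dir in export_dirs:
        # rglob searches the export folder and all of its subfolders.
        for path in sorted(Path(export_dir).rglob(pattern)):
            by_digest[sha256_of(path)].append(path)
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Hashing 81 GB takes a while, so it may be worth comparing file sizes first and hashing only the files whose sizes match.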

Tried: Matching the video ID field to the filenames included in the Google Takeout export.

Tried: YouTube Data API v3:

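# `youtube` is an authorized YouTube Data API v3 client built elsewhere.
request = youtube.videos().list(
    part="id,fileDetails",
    id="loNfsc4fXXX,O1ene1dYYYY,FMiZ61RDZZZ"
)
response = request.execute()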

Expected: the videoID and the exported file name. Note that the results below give .mov where the exports are .mp4, and the mp4s have different file names (see above).

Results: no mention of the exported video file names in the metadata:

{'kind': 'youtube#videoListResponse',
 'etag': 'UUv5J-eWz99PTUQpQcAKms9dBGY',
 'items': [
    {'kind': 'youtube#video',
     'etag': 'QoUZs_ouTYspN9h0SxOovQbS578',
     'id': 'loNfsc4fXXX',
     'fileDetails': {
        'fileName': 'Screen Recording 2024-12-25 at 12.45.22\u202fPM.mov',
        'fileType': 'video',
        'videoStreams': [{'widthPixels': 2876, 'heightPixels': 2130, 'frameRateFps': 59.50515463917526, 'aspectRatio': 1.3502347417840375, 'codec': 'h264', 'bitrateBps': '21358617'}],
        'durationMs': '56550',
        'bitrateBps': '21358617'}},
    {'kind': 'youtube#video',
     'etag': 'cZWV7tOcYkvzniYFayZ7dKgJXyQ',
     'id': 'O1ene1dYYYY',
     'fileDetails': {
        'fileName': 'Screen Recording 2024-12-25 at 12.45.23\u202fPM.mov',
        'fileType': 'video',
        'videoStreams': [{'widthPixels': 2876, 'heightPixels': 2130, 'frameRateFps': 59.41984732824427, 'aspectRatio': 1.3502347417840375, 'codec': 'h264', 'bitrateBps': '34756285'}],
        'durationMs': '32699',
        'bitrateBps': '34756285'}},
    {'kind': 'youtube#video',
     'etag': 'cZWV7tOcYkvzniYFayZ7dKgJXyQ',
     'id': 'FMiZ61RDZZZ',
     'fileDetails': {
        'fileName': 'Screen Recording 2024-12-25 at 12.45.24\u202fPM.mov',
        'fileType': 'video',
        'videoStreams': [{'widthPixels': 2876, 'heightPixels': 2130, 'frameRateFps': 59.41984732824427, 'aspectRatio': 1.3502347417840375, 'codec': 'h264', 'bitrateBps': '34756285'}],
        'durationMs': '32699',
        'bitrateBps': '34756285'}}],
 'pageInfo': {'totalResults': 2, 'resultsPerPage': 2}}
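To work with a response like the one above, the ID and uploaded file name can be pulled into a plain dict. This is a small sketch only: `uploaded_names_by_id` is a hypothetical helper, and it returns the originally uploaded name from fileDetails, not the name used in the Takeout export, so it does not close the gap described in this question.

```python
def uploaded_names_by_id(response):
    """Map each video ID in a videos.list response to its fileDetails.fileName."""
    names = {}
    for item in response.get("items", []):
        file_name = item.get("fileDetails", {}).get("fileName")
        if file_name is not None:
            # Replace the narrow no-break space (U+202F) with a plain space
            # so later string comparisons are not thrown off by it.
            names[item["id"]] = file_name.replace("\u202f", " ")
    return names
```

Items without fileDetails are skipped rather than raising an error, since that part may be missing from some items in a response.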
