Channel: Active questions tagged youtube-api - Stack Overflow

How to convert a large list of lists (multiple JSON responses from the YouTube API) to a dataframe efficiently in R?


I downloaded the JSON responses from YouTube API v3 calls to the videos endpoint (see data structure example) for around 7500 videos. I made around 150 calls of 50 videos each, so each call is one list.

yt_jsons_ <- purrr::map(list.files('./yt_calls_wget/',full.names = TRUE) , fromJSON)

I get a large list in R, containing all the json calls merged.

View(yt_jsons_[[1]][["items"]])

where 1 is the first of the 150 calls, each containing up to 50 observations (videos). The data that I'm looking for is in these fields:

yt_jsons_[[1]][["items"]][["snippet"]][["publishedAt"]]
 [1] "2022-03-02T14:24:12Z" "2022-03-04T15:50:12Z" "2022-02-20T18:20:08Z"
 [4] "2021-11-17T16:43:40Z" "2022-03-24T17:33:27Z" "2022-03-06T16:14:41Z"
 [7] "2022-03-24T13:50:27Z" "2021-11-20T19:13:58Z" "2022-02-06T15:00:41Z"
[10] "2021-07-14T09:00:22Z" "2022-01-19T12:37:25Z" "2022-03-05T01:24:01Z"
[13] "2022-01-19T22:00:05Z" "2021-11-30T14:49:18Z" "2021-12-28T02:45:01Z"
[16] "2022-03-25T15:02:42Z" "2021-12-22T21:09:51Z" "2022-02-26T00:50:24Z"
[19] "2021-05-12T18:38:18Z" "2021-10-26T15:30:00Z" "2022-02-16T15:59:33Z"
[22] "2022-01-07T18:50:45Z" "2022-03-26T15:59:18Z" "2022-01-31T22:02:17Z"
[25] "2021-12-11T18:53:35Z" "2021-04-24T21:35:39Z" "2022-02-02T20:55:40Z"
[28] "2021-12-27T23:00:11Z" "2022-03-27T09:36:32Z" "2022-03-24T19:48:14Z"
[31] "2021-06-01T02:00:01Z" "2021-03-25T11:08:29Z" "2021-12-09T21:00:26Z"
[34] "2022-03-30T13:00:31Z" "2022-03-22T14:25:49Z" "2022-01-07T19:30:12Z"
[37] "2022-03-26T19:09:17Z" "2021-12-09T01:43:56Z" "2022-01-20T20:00:05Z"
[40] "2022-02-01T01:00:28Z" "2022-02-23T20:03:27Z" "2022-02-23T22:59:23Z"
[43] "2022-01-22T14:41:10Z" "2022-02-11T20:00:05Z"

yt_jsons_[[1]][["items"]][["statistics"]]
   viewCount likeCount favoriteCount commentCount
1       3715       160             0           17
2      14313       876             0           49
3      17376       274             0          127
4       7584       338             0           64
5      13422       508             0           43
6    1535743     33215             0         1464
7       8493       752             0           78
8       <NA>        25             0            0
9        104         5             0            0
10     50342      2995             0          148
11       126        14             0            0
12       189         2             0            1
13        25         4             0            0
14    258485      5876             0          700
15     11063      1221             0          188
16     12825       488             0          155
17      7594       420             0            5
18     79783      5689             0          202
19   2121226     16016             0         1777
20      7453       577             0           54
21     22329      1098             0          103
22        83         4             0            0
23     14329       571             0           83
24       121         6             0            0
25      3120       465             0           28
26      1103      <NA>             0           10
27     23139       140             0            6
28      1189       112             0           15
29     46067      2319             0          288
30     39595      3852             0          215
31     19738       298             0           15
32    494391     17683             0          397
33      7080       262             0           43
34     71213      4917             0          363
35      4793        41             0            9
36     42288      2786             0          871
37      5497       272             0           54
38       883        47             0           20
39      3492       209             0            7
40     11285       302             0           31
41     22962      1772             0          235
42       387        18             0            0
43      1652        95             0           27
44     27904      1189             0           98

yt_jsons_[[1]][["items"]][["topicDetails"]]
   topicCategories
1  https://en.wikipedia.org/wiki/Action-adventure_game, https://en.wikipedia.org/wiki/Action_game, https://en.wikipedia.org/wiki/Video_game_culture
2  NULL
3  https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Vehicle
4  https://en.wikipedia.org/wiki/Society
5  https://en.wikipedia.org/wiki/Society
6  https://en.wikipedia.org/wiki/Lifestyle_(sociology)
7  https://en.wikipedia.org/wiki/Society
8  https://en.wikipedia.org/wiki/Lifestyle_(sociology)
9  https://en.wikipedia.org/wiki/Lifestyle_(sociology)
10 https://en.wikipedia.org/wiki/Society
11 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
12 https://en.wikipedia.org/wiki/Society
13 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
14 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Vehicle
15 https://en.wikipedia.org/wiki/Politics, https://en.wikipedia.org/wiki/Society
16 https://en.wikipedia.org/wiki/Society
17 https://en.wikipedia.org/wiki/Society
18 https://en.wikipedia.org/wiki/Society
19 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Vehicle
20 https://en.wikipedia.org/wiki/Society
21 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Vehicle
22 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
23 https://en.wikipedia.org/wiki/Society
24 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Technology
25 https://en.wikipedia.org/wiki/Society
26 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
27 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
28 https://en.wikipedia.org/wiki/Society
29 NULL
30 https://en.wikipedia.org/wiki/Society
31 NULL
32 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
33 https://en.wikipedia.org/wiki/Society
34 NULL
35 https://en.wikipedia.org/wiki/Society
36 https://en.wikipedia.org/wiki/Society
37 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Motorsport, https://en.wikipedia.org/wiki/Racing_video_game, https://en.wikipedia.org/wiki/Vehicle
38 https://en.wikipedia.org/wiki/Society
39 https://en.wikipedia.org/wiki/Music
40 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
41 https://en.wikipedia.org/wiki/Society
42 https://en.wikipedia.org/wiki/Society
43 https://en.wikipedia.org/wiki/Lifestyle_(sociology)
44 https://en.wikipedia.org/wiki/Lifestyle_(sociology), https://en.wikipedia.org/wiki/Vehicle

I tried

# this did not work..
SOtry <- yt_jsons_ %>%
  map(unlist) %>%
  map(as_tibble) %>%
  bind_rows()
SOtry
# A tibble: 401,564 × 1
   value
   <chr>
 1 youtube#videoListResponse
 2 5OfTreCNJKiXdr9MzkeSZpaSubk
 3 youtube#video
 4 youtube#video
 5 youtube#video
 6 youtube#video
 7 youtube#video
 8 youtube#video
 9 youtube#video
10 youtube#video
# ℹ 401,554 more rows
# ℹ Use `print(n = ...)` to see more rows

but it did not work. I used this block for the Reddit idt3 json files (see data structure example), and it worked. But those were one post per file; in the YouTube json files each file holds a list of posts.
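A likely reason for the single `value` column is that `unlist()` flattens the whole response, including the nested data frames under `items`, into one long character vector, which then becomes a one-column tibble. The sketch below reproduces this with base R only. The `resp` object is a hypothetical miniature of one parsed response with invented values, not the real API output.

```r
# Hypothetical miniature of one parsed response (values are invented).
items <- data.frame(kind = c("youtube#video", "youtube#video"),
                    id = c("a1", "b2"),
                    stringsAsFactors = FALSE)
# Nested data frame column, similar to what fromJSON() produces for "statistics".
items$statistics <- data.frame(viewCount = c("10", "20"),
                               stringsAsFactors = FALSE)
resp <- list(kind = "youtube#videoListResponse", items = items)

# unlist() recurses through the list and every (nested) data frame column,
# so the row/column layout is lost and only one flat named vector remains.
flat <- unlist(resp)
print(flat)
length(flat)
```

Because the per-video structure is already gone after `unlist()`, binding the results row-wise cannot recover it.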

I then tried this

# not yet...
yt_not_quite <- as.data.frame(t(sapply(yt_jsons_, function(x) c(x$items[["statistics"]], x$items[["snippet"]]))))

and I do get all the columns I'd like but

head(yt_not_quite$viewCount)
[[1]]
 [1] "3715"    "14313"   "17376"   "7584"    "13422"   "1535743" "8493"
 [8] NA        "104"     "50342"   "126"     "189"     "25"      "258485"
[15] "11063"   "12825"   "7594"    "79783"   "2121226" "7453"    "22329"
[22] "83"      "14329"   "121"     "3120"    "1103"    "23139"   "1189"
[29] "46067"   "39595"   "19738"   "494391"  "7080"    "71213"   "4793"
[36] "42288"   "5497"    "883"     "3492"    "11285"   "22962"   "387"
[43] "1652"    "27904"

[[2]]
 [1] "473"      "9255"     "229601"   "757"      "13341"    "173"
 [7] "19675"    "12506"    "206659"   "642"      "23756"    "87018"
[13] "10920"    "10"       "2144"     "29667"    "485063"   "436"
[19] "1064"     "35040"    "225344"   "585"      "66893"    "1481145"
[25] "636470"   "10813623" "3727"     "27"       "1514"     "6489"
[31] "435"      "38213"    "55"       "23821"    "1116"     "3584"
[37] "13142"    "271"      "1254"     "18886"    "16313"    "744081"
[43] "47812"    "4122"     "304"      "886"      "654"      "673016"

[[3]]
 [1] "47164"   "2660"    "123"     "4952"    "28"      "5194541" "40587"
 [8] "55"      "341545"  "32999"   "28410"   "232312"  "78703"   "52012"
[15] "107539"  "233602"  "1537"    "97889"   "291250"  "1742"    "4331606"
[22] "58315"   "255241"  "18942"   "13852"   "4636"    "1147"    "49748"
[29] "71897"   "578"     "565"     "209506"  "770"     "514"     "228"

[[4]]
 [1] "79987"   "5857"    "72042"   "11751"   "3201"    "6849"    "71014"
 [8] "192"     "9420"    "155120"  "14011"   "253221"  "5209"    "1968"
[15] "10197"   "81255"   "856"     "5152"    "5353"    "1600"    "242"
[22] "20135"   "60032"   "8127"    "17766"   "1"       "17014"   "807"
[29] "191783"  "4903648" "1525"    "730"     "39839"   "104"     "12428"
[36] "3139"    "11199"   "9611"    "13012"   "46456"   "211132"  "3520"
[43] "468"     "124582"  "18337"   "3266"

[[5]]
 [1] "153416" "6984"   "13870"  "40215"  "6557"   "2818"   "343829" "4382"
 [9] "21651"  "2370"   "3888"   "10953"  "20003"  "8644"   NA       "213"
[17] "807179" "20361"  "21390"  "169421" "8766"   "53595"  "340"    "10506"
[25] "970"    "16253"

[[6]]
 [1] "2234"    "19093"   "104322"  "44101"   "1309843" "4864"    "17798"
 [8] "4485"    "23376"   "963159"  "10385"   "6974"    "45116"   "106527"
[15] "860"     "251528"  "13"      "735"     "16355"   "3566"    "50150"
[22] "3398"    "83611"   "19595"   "2564"    "24"      "11016"   "40530"
[29] "137"     "22480"   "6869"    "2546"    "171143"  "321389"  "58745"
[36] "5275"    "29680"   "26941"   "13220"   "81599"   "42311"   "135"
[43] "2243"    "53860"   "50783"   "3897"

typeof(yt_not_quite)
[1] "list"

I still have each json call 'encapsulated' by row: the number of rows equals the number of calls I imported, and each cell of a column holds a vector covering all the videos I queried in that call.
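This row-per-call shape seems to follow from how `sapply()` simplifies its result: each call returns a list of columns, so the simplified result is a matrix whose cells are lists, and `as.data.frame()` turns that into list columns. A small base R sketch of the effect follows. `make_call()` and its values are hypothetical stand-ins for the parsed responses, not real data.

```r
# Hypothetical stand-in for one parsed API call (values are invented).
make_call <- function(views, dates) {
  items <- data.frame(id = seq_along(views))
  items$statistics <- data.frame(viewCount = views, stringsAsFactors = FALSE)
  items$snippet <- data.frame(publishedAt = dates, stringsAsFactors = FALSE)
  list(items = items)
}

calls <- list(
  make_call(c("10", "20"),
            c("2022-01-01T00:00:00Z", "2022-01-02T00:00:00Z")),
  make_call(c("30", "40", "50"),
            c("2022-02-01T00:00:00Z", "2022-02-02T00:00:00Z",
              "2022-02-03T00:00:00Z"))
)

# Same pattern as in the question: c() on two data frames gives a plain list
# of columns, and sapply() stacks those lists into a list-matrix.
wide <- as.data.frame(t(sapply(calls, function(x)
  c(x$items[["statistics"]], x$items[["snippet"]]))))

dim(wide)                 # one row per call, one column per field
is.list(wide$viewCount)   # each cell is a whole vector, not a single value
lengths(wide$viewCount)   # number of videos held in each cell
```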

I can extract each call individually like

View(as.data.frame(yt_not_quite[1,1], col.names = colnames(yt_not_quite[1])))
as.data.frame(yt_not_quite[1,1], col.names = colnames(yt_not_quite[1]))
   viewCount
1       3715
2      14313
3      17376
4       7584
5      13422
6    1535743
7       8493
8       <NA>
9        104
10     50342
11       126
12       189
13        25
14    258485
15     11063
16     12825
17      7594
18     79783
19   2121226
20      7453
21     22329
22        83
23     14329
24       121
25      3120
26      1103
27     23139
28      1189
29     46067
30     39595
31     19738
32    494391
33      7080
34     71213
35      4793
36     42288
37      5497
38       883
39      3492
40     11285
41     22962
42       387
43      1652
44     27904

But I should be applying this to multiple columns and across all n rows and rbind them together, to get all of the 7500 videos...
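One way to express "apply this to several columns, for every call, then rbind" is to build one small data frame per call and bind them all at once. The following is only a sketch under assumptions: `make_call()`, `flatten_call()` and the toy values are hypothetical, every call is assumed to contain the `snippet`, `statistics` and `topicDetails` fields, and the list of topic URLs is collapsed into a single string per video.

```r
# Hypothetical stand-in for one parsed API call (values are invented).
make_call <- function(views, likes, dates, topics) {
  n <- length(views)
  items <- data.frame(id = seq_len(n))
  items$snippet <- data.frame(publishedAt = dates, stringsAsFactors = FALSE)
  items$statistics <- data.frame(viewCount = views, likeCount = likes,
                                 stringsAsFactors = FALSE)
  td <- data.frame(row.names = seq_len(n))
  td$topicCategories <- topics          # list column; may contain NULL
  items$topicDetails <- td
  list(items = items)
}

yt_jsons_ <- list(
  make_call(c("10", NA), c("1", "2"),
            c("2022-01-01T00:00:00Z", "2022-01-02T00:00:00Z"),
            list(c("u1", "u2"), NULL)),
  make_call(c("30", "40", "50"), c("3", NA, "5"),
            c("2022-02-01T00:00:00Z", "2022-02-02T00:00:00Z",
              "2022-02-03T00:00:00Z"),
            list("u3", "u1", c("u2", "u3")))
)

# Keep only the wanted fields of one call, one row per video.
flatten_call <- function(resp) {
  items <- resp$items
  topics <- items$topicDetails$topicCategories
  data.frame(
    publishedAt = items$snippet$publishedAt,
    viewCount = as.numeric(items$statistics$viewCount),
    likeCount = as.numeric(items$statistics$likeCount),
    topicCategories = vapply(
      topics,
      function(u) if (is.null(u)) NA_character_ else paste(u, collapse = ", "),
      character(1)),
    stringsAsFactors = FALSE
  )
}

# One rbind over all calls instead of growing a data frame in a loop.
yt_df <- do.call(rbind, lapply(yt_jsons_, flatten_call))
str(yt_df)
```

Selecting only the needed fields before binding, and storing counts as numbers rather than strings, should help with the memory budget, though the actual footprint would need to be measured on the real data. `dplyr::bind_rows()`, already used above, could replace `do.call(rbind, ...)`.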

Anyway, I wonder if there is a better way to convert these to a dataframe. I am trying to keep memory usage under 1 GB for shinyapps.

