# coding: utf-8
from __future__ import unicode_literals

import base64

from .common import InfoExtractor
from ..compat import (
    compat_urllib_parse,
    compat_ord,
)
from ..utils import (
    ExtractorError,
    sanitized_Request,
)


class YoukuIE(InfoExtractor):
    IE_NAME = 'youku'
    IE_DESC = '优酷'
    _VALID_URL = r'''(?x)
        (?:
            http://(?:v|player)\.youku\.com/(?:v_show/id_|player\.php/sid/)|
            youku:)
        (?P<id>[A-Za-z0-9]+)(?:\.html|/v\.swf|)
    '''

    _TESTS = [{
        'url': 'http://v.youku.com/v_show/id_XMTc1ODE5Njcy.html',
        'md5': '5f3af4192eabacc4501508d54a8cabd7',
        'info_dict': {
            'id': 'XMTc1ODE5Njcy_part1',
            'title': '★Smile﹗♡ Git Fresh -Booty Music舞蹈.',
            'ext': 'flv'
        }
    }, {
        'url': 'http://player.youku.com/player.php/sid/XNDgyMDQ2NTQw/v.swf',
        'only_matching': True,
    }, {
        'url': 'http://v.youku.com/v_show/id_XODgxNjg1Mzk2_ev_1.html',
        'info_dict': {
            'id': 'XODgxNjg1Mzk2',
            'title': '武媚娘传奇 85',
        },
        'playlist_count': 11,
    }, {
        'url': 'http://v.youku.com/v_show/id_XMTI1OTczNDM5Mg==.html',
        'info_dict': {
            'id': 'XMTI1OTczNDM5Mg',
            'title': '花千骨 04',
        },
        'playlist_count': 13,
        'skip': 'Available in China only',
    }, {
        'url': 'http://v.youku.com/v_show/id_XNjA1NzA2Njgw.html',
        'note': 'Video protected with password',
        'info_dict': {
            'id': 'XNjA1NzA2Njgw',
            'title': '邢義田复旦讲座之想象中的胡人—从“左衽孔子”说起',
        },
        'playlist_count': 19,
        'params': {
            'videopassword': '100600',
        },
    }]

    def construct_video_urls(self, data1, data2):
        # get sid, token
        def yk_t(s1, s2):
            # RC4: key scheduling over s1, then XOR s2 with the keystream
            ls = list(range(256))
            t = 0
            for i in range(256):
                t = (t + ls[i] + compat_ord(s1[i % len(s1)])) % 256
                ls[i], ls[t] = ls[t], ls[i]
            s = bytearray()
            x, y = 0, 0
            for i in range(len(s2)):
                y = (y + 1) % 256
                x = (x + ls[y]) % 256
                ls[x], ls[y] = ls[y], ls[x]
                s.append(compat_ord(s2[i]) ^ ls[(ls[x] + ls[y]) % 256])
            return bytes(s)

        sid, token = yk_t(
            b'becaf9be', base64.b64decode(data2['ep'].encode('ascii'))
        ).decode('ascii').split('_')

        # get oip
        oip = data2['ip']

        # get fileid
        # The backslash is escaped explicitly; the string is unchanged.
        string_ls = list(
            'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890')
        shuffled_string_ls = []
        seed = data1['seed']
        N = len(string_ls)
        # Seeded shuffle: each step draws one remaining character
        for ii in range(N):
            seed = (seed * 0xd3 + 0x754f) % 0x10000
            idx = seed * len(string_ls) // 0x10000
            shuffled_string_ls.append(string_ls[idx])
            del string_ls[idx]

        fileid_dict = {}
        for format in data1['streamtypes']:
            streamfileid = [
                int(i) for i in data1['streamfileids'][format].strip('*').split('*')]
            fileid = ''.join(
                [shuffled_string_ls[i] for i in streamfileid])
            # Characters 8-9 are a placeholder for the segment number
            fileid_dict[format] = fileid[:8] + '%s' + fileid[10:]

        def get_fileid(format, n):
            # Segment number as two uppercase hex digits
            fileid = fileid_dict[format] % hex(int(n))[2:].upper().zfill(2)
            return fileid

        # get ep
        def generate_ep(format, n):
            fileid = get_fileid(format, n)
            ep_t = yk_t(
                b'bf7e5f01',
                ('%s_%s_%s' % (sid, fileid, token)).encode('ascii')
            )
            ep = base64.b64encode(ep_t).decode('ascii')
            return ep

        # generate video_urls
        video_urls_dict = {}
        for format in data1['streamtypes']:
            video_urls = []
            for dt in data1['segs'][format]:
                n = str(int(dt['no']))
                param = {
                    'K': dt['k'],
                    'hd': self.get_hd(format),
                    'myp': 0,
                    'ts': dt['seconds'],
                    'ypp': 0,
                    'ctype': 12,
                    'ev': 1,
                    'token': token,
                    'oip': oip,
                    'ep': generate_ep(format, n)
                }
                video_url = \
                    'http://k.youku.com/player/getFlvPath/' + \
                    'sid/' + sid + \
                    '_' + str(int(n) + 1).zfill(2) + \
                    '/st/' + self.parse_ext_l(format) + \
                    '/fileid/' + get_fileid(format, n) + '?' + \
                    compat_urllib_parse.urlencode(param)
                video_urls.append(video_url)
            video_urls_dict[format] = video_urls

        return video_urls_dict

    def get_hd(self, fm):
        hd_id_dict = {
            'flv': '0',
            'mp4': '1',
            'hd2': '2',
            'hd3': '3',
            '3gp': '0',
            '3gphd': '1'
        }
        return hd_id_dict[fm]

    def parse_ext_l(self, fm):
        ext_dict = {
            'flv': 'flv',
            'mp4': 'mp4',
            'hd2': 'flv',
            'hd3': 'flv',
            '3gp': 'flv',
            '3gphd': 'mp4'
        }
        return ext_dict[fm]

    def get_format_name(self, fm):
        _dict = {
            '3gp': 'h6',
            '3gphd': 'h5',
            'flv': 'h4',
            'mp4': 'h3',
            'hd2': 'h2',
            'hd3': 'h1'
        }
        return _dict[fm]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        def retrieve_data(req_url, note):
            req = sanitized_Request(req_url)

            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
            if cn_verification_proxy:
                req.add_header('Ytdl-request-proxy', cn_verification_proxy)

            raw_data = self._download_json(req, video_id, note=note)
            return raw_data['data'][0]

        video_password = self._downloader.params.get('videopassword', None)

        # request basic data
        basic_data_url = 'http://v.youku.com/player/getPlayList/VideoIDS/%s' % video_id
        if video_password:
            basic_data_url += '?password=%s' % video_password

        data1 = retrieve_data(
            basic_data_url,
            'Downloading JSON metadata 1')
        data2 = retrieve_data(
            'http://v.youku.com/player/getPlayList/VideoIDS/%s/Pf/4/ctype/12/ev/1' % video_id,
            'Downloading JSON metadata 2')

        error_code = data1.get('error_code')
        if error_code:
            error = data1.get('error')
            # The matched text means "this video cannot be watched for
            # copyright reasons"
            if error is not None and '因版权原因无法观看此视频' in error:
                raise ExtractorError(
                    'Youku said: Sorry, this video is available in China only', expected=True)
            else:
                msg = 'Youku server reported error %i' % error_code
                if error is not None:
                    msg += ': ' + error
                raise ExtractorError(msg)

        title = data1['title']

        # generate video_urls_dict
        video_urls_dict = self.construct_video_urls(data1, data2)

        # construct info
        entries = [{
            'id': '%s_part%d' % (video_id, i + 1),
            'title': title,
            'formats': [],
            # some formats are not available for all parts, we have to detect
            # which one has all
        } for i in range(max(len(v) for v in data1['segs'].values()))]
        for fm in data1['streamtypes']:
            video_urls = video_urls_dict[fm]
            for video_url, seg, entry in zip(video_urls, data1['segs'][fm], entries):
                entry['formats'].append({
                    'url': video_url,
                    'format_id': self.get_format_name(fm),
                    'ext': self.parse_ext_l(fm),
                    'filesize': int(seg['size']),
                })

        return {
            '_type': 'multi_video',
            'id': video_id,
            'title': title,
            'entries': entries,
        }
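

The two obfuscation steps in `construct_video_urls` can be tried outside the extractor. The sketch below is a standalone approximation, not part of the extractor: `rc4` mirrors the nested `yk_t` helper (which follows the standard RC4 key-scheduling and keystream steps), and `seeded_shuffle` mirrors the loop that builds `shuffled_string_ls`. Both function names, and the sample inputs in the note that follows, are hypothetical and chosen only for illustration.

```python
def rc4(key, data):
    """XOR data with the RC4 keystream derived from key (both bytes)."""
    key = bytearray(key)
    # Key scheduling: permute 0..255 under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Keystream generation, XORed byte by byte into the output.
    out = bytearray()
    i = j = 0
    for byte in bytearray(data):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def seeded_shuffle(alphabet, seed):
    """Reorder alphabet using the same 16-bit recurrence as the extractor."""
    remaining = list(alphabet)
    shuffled = []
    for _ in range(len(remaining)):
        # Linear congruential step, kept within 16 bits.
        seed = (seed * 0xd3 + 0x754f) % 0x10000
        # Scale the seed to an index into the characters still available.
        idx = seed * len(remaining) // 0x10000
        shuffled.append(remaining.pop(idx))
    return shuffled
```

Because RC4 only XORs the input with a keystream, applying `rc4` twice with the same key returns the original bytes, for example `rc4(b'becaf9be', rc4(b'becaf9be', b'sid_token')) == b'sid_token'`. This is why the extractor can use one helper both to decode `ep` from the server and to encode the `ep` it sends back. `seeded_shuffle` always returns a permutation of its input, so a given `seed` maps each index in `streamfileids` to exactly one character.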