# coding: utf-8
from __future__ import unicode_literals

import base64
import datetime
import hashlib
import re
import time

from .common import InfoExtractor
from ..compat import (
    compat_ord,
    compat_str,
    compat_urllib_parse_urlencode,
)
from ..utils import (
    determine_ext,
    encode_data_uri,
    ExtractorError,
    int_or_none,
    orderedSet,
    parse_iso8601,
    sanitized_Request,
    str_or_none,
    url_basename,
)


class LeIE(InfoExtractor):
    IE_DESC = '乐视网'
    _VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|sports\.le\.com/video)/(?P<id>\d+)\.html'

    _URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html'

    _TESTS = [{
        'url': 'http://www.le.com/ptv/vplay/22005890.html',
        'md5': 'edadcfe5406976f42f9f266057ee5e40',
        'info_dict': {
            'id': '22005890',
            'ext': 'mp4',
            'title': '第87届奥斯卡颁奖礼完美落幕 《鸟人》成最大赢家',
            'description': 'md5:a9cb175fd753e2962176b7beca21a47c',
        },
        'params': {
            'hls_prefer_native': True,
        },
    }, {
        'url': 'http://www.le.com/ptv/vplay/1415246.html',
        'info_dict': {
            'id': '1415246',
            'ext': 'mp4',
            'title': '美人天下01',
            'description': 'md5:f88573d9d7225ada1359eaf0dbf8bcda',
        },
        'params': {
            'hls_prefer_native': True,
        },
    }, {
        'note': 'This video is available only in Mainland China, thus a proxy is needed',
        'url': 'http://www.le.com/ptv/vplay/1118082.html',
        'md5': '2424c74948a62e5f31988438979c5ad1',
        'info_dict': {
            'id': '1118082',
            'ext': 'mp4',
            'title': '与龙共舞 完整版',
            'description': 'md5:7506a5eeb1722bb9d4068f85024e3986',
        },
        'params': {
            'hls_prefer_native': True,
        },
        'skip': 'Only available in China',
    }, {
        'url': 'http://sports.le.com/video/25737697.html',
        'only_matching': True,
    }]

    @staticmethod
    def urshift(val, n):
        # Unsigned 32-bit right shift: negative values are first mapped
        # into the range [0, 2**32)
        return val >> n if val >= 0 else (val + 0x100000000) >> n

    # ror() and calc_time_key() are reversed from an embedded swf file in KLetvPlayer.swf
    def ror(self, param1, param2):
        # Rotate the 32-bit value param1 right by param2 bits, one bit at a time
        _loc3_ = 0
        while _loc3_ < param2:
            param1 = self.urshift(param1, 1) + ((param1 & 1) << 31)
            _loc3_ += 1
        return param1

    def calc_time_key(self, param1):
        _loc2_ = 773625421
        _loc3_ = self.ror(param1, _loc2_ % 13)
        _loc3_ = _loc3_ ^ _loc2_
        _loc3_ = self.ror(_loc3_, _loc2_ % 17)
        return _loc3_

    # see M3U8Encryption class in KLetvPlayer.swf
    @staticmethod
    def decrypt_m3u8(encrypted_data):
        # Data without the 'vc_01' prefix is returned unchanged
        if encrypted_data[:5].decode('utf-8').lower() != 'vc_01':
            return encrypted_data
        encrypted_data = encrypted_data[5:]

        # Split every byte into its high and low nibble
        _loc4_ = bytearray(2 * len(encrypted_data))
        for idx, val in enumerate(encrypted_data):
            b = compat_ord(val)
            _loc4_[2 * idx] = b // 16
            _loc4_[2 * idx + 1] = b % 16
        # Move the last 11 nibbles to the front
        idx = len(_loc4_) - 11
        _loc4_ = _loc4_[idx:] + _loc4_[:idx]
        # Recombine nibble pairs into bytes
        _loc7_ = bytearray(len(encrypted_data))
        for i in range(len(encrypted_data)):
            _loc7_[i] = _loc4_[2 * i] * 16 + _loc4_[2 * i + 1]

        return bytes(_loc7_)

    def _real_extract(self, url):
        media_id = self._match_id(url)
        page = self._download_webpage(url, media_id)
        params = {
            'id': media_id,
            'platid': 1,
            'splatid': 101,
            'format': 1,
            'tkey': self.calc_time_key(int(time.time())),
            'domain': 'www.le.com'
        }
        play_json_req = sanitized_Request(
            'http://api.le.com/mms/out/video/playJson?' + compat_urllib_parse_urlencode(params)
        )
        cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
        if cn_verification_proxy:
            play_json_req.add_header('Ytdl-request-proxy', cn_verification_proxy)

        play_json = self._download_json(
            play_json_req,
            media_id, 'Downloading playJson data')

        # Check for errors
        playstatus = play_json['playstatus']
        if playstatus['status'] == 0:
            flag = playstatus['flag']
            if flag == 1:
                msg = 'Country %s auth error' % playstatus['country']
            else:
                msg = 'Generic error. flag = %d' % flag
            raise ExtractorError(msg, expected=True)

        playurl = play_json['playurl']

        formats = ['350', '1000', '1300', '720p', '1080p']
        dispatch = playurl['dispatch']

        urls = []
        for format_id in formats:
            if format_id in dispatch:
                media_url = playurl['domain'][0] + dispatch[format_id][0]
                media_url += '&' + compat_urllib_parse_urlencode({
                    'm3v': 1,
                    'format': 1,
                    'expect': 3,
                    'rateid': format_id,
                })

                nodes_data = self._download_json(
                    media_url, media_id,
                    'Download JSON metadata for format %s' % format_id)

                req = self._request_webpage(
                    nodes_data['nodelist'][0]['location'], media_id,
                    note='Downloading m3u8 information for format %s' % format_id)

                m3u8_data = self.decrypt_m3u8(req.read())

                url_info_dict = {
                    'url': encode_data_uri(m3u8_data, 'application/vnd.apple.mpegurl'),
                    'ext': determine_ext(dispatch[format_id][1]),
                    'format_id': format_id,
                    'protocol': 'm3u8',
                }

                if format_id[-1:] == 'p':
                    url_info_dict['height'] = int_or_none(format_id[:-1])

                urls.append(url_info_dict)

        publish_time = parse_iso8601(self._html_search_regex(
            r'发布时间&nbsp;([^<>]+) ', page, 'publish time', default=None),
            delimiter=' ', timezone=datetime.timedelta(hours=8))
        description = self._html_search_meta('description', page, fatal=False)

        return {
            'id': media_id,
            'formats': urls,
            'title': playurl['title'],
            'thumbnail': playurl['pic'],
            'description': description,
            'timestamp': publish_time,
        }


class LePlaylistIE(InfoExtractor):
    _VALID_URL = r'https?://[a-z]+\.le\.com/(?!video)[a-z]+/(?P<id>[a-z0-9_]+)'

    _TESTS = [{
        'url': 'http://www.le.com/tv/46177.html',
        'info_dict': {
            'id': '46177',
            'title': '美人天下',
            'description': 'md5:395666ff41b44080396e59570dbac01c'
        },
        'playlist_count': 35
    }, {
        'url': 'http://tv.le.com/izt/wuzetian/index.html',
        'info_dict': {
            'id': 'wuzetian',
            'title': '武媚娘传奇',
            'description': 'md5:e12499475ab3d50219e5bba00b3cb248'
        },
        # This playlist contains some extra videos other than the drama itself
        'playlist_mincount': 96
    }, {
        'url': 'http://tv.le.com/pzt/lswjzzjc/index.shtml',
        # This series is moved to http://www.le.com/tv/10005297.html
        'only_matching': True,
    }, {
        'url': 'http://www.le.com/comic/92063.html',
        'only_matching': True,
    }, {
        'url': 'http://list.le.com/listn/c1009_sc532002_d2_p1_o1.html',
        'only_matching': True,
    }]

    @classmethod
    def suitable(cls, url):
        return False if LeIE.suitable(url) else super(LePlaylistIE, cls).suitable(url)

    def _real_extract(self, url):
        playlist_id = self._match_id(url)
        page = self._download_webpage(url, playlist_id)

        # Currently old domain names are still used in playlists
        media_ids = orderedSet(re.findall(
            r'<a[^>]+href="http://www\.letv\.com/ptv/vplay/(\d+)\.html', page))
        entries = [self.url_result(LeIE._URL_TEMPLATE % media_id, ie='Le')
                   for media_id in media_ids]

        # The keywords are separated by fullwidth commas; the first one is
        # used as the title (an empty separator would raise ValueError)
        title = self._html_search_meta('keywords', page,
                                       fatal=False).split('，')[0]
        description = self._html_search_meta('description', page, fatal=False)

        return self.playlist_result(entries, playlist_id, playlist_title=title,
                                    playlist_description=description)


class LetvCloudIE(InfoExtractor):
    # Most of *.letv.com was changed to *.le.com on 2016/01/02,
    # but yuntv.letv.com is kept, so the extractor name is kept as well
    IE_DESC = '乐视云'
    _VALID_URL = r'https?://yuntv\.letv\.com/bcloud\.html\?.+'

    _TESTS = [{
        'url': 'http://yuntv.letv.com/bcloud.html?uu=p7jnfw5hw9&vu=467623dedf',
        'md5': '26450599afd64c513bc77030ad15db44',
        'info_dict': {
            'id': 'p7jnfw5hw9_467623dedf',
            'ext': 'mp4',
            'title': 'Video p7jnfw5hw9_467623dedf',
        },
    }, {
        'url': 'http://yuntv.letv.com/bcloud.html?uu=p7jnfw5hw9&vu=ec93197892&pu=2c7cd40209&auto_play=1&gpcflag=1&width=640&height=360',
        'md5': 'e03d9cc8d9c13191e1caf277e42dbd31',
        'info_dict': {
            'id': 'p7jnfw5hw9_ec93197892',
            'ext': 'mp4',
            'title': 'Video p7jnfw5hw9_ec93197892',
        },
    }, {
        'url': 'http://yuntv.letv.com/bcloud.html?uu=p7jnfw5hw9&vu=187060b6fd',
        'md5': 'cb988699a776b22d4a41b9d43acfb3ac',
        'info_dict': {
            'id': 'p7jnfw5hw9_187060b6fd',
            'ext': 'mp4',
            'title': 'Video p7jnfw5hw9_187060b6fd',
        },
    }]

    @staticmethod
    def sign_data(obj):
        # The signature is the MD5 of the selected key/value pairs
        # concatenated in order, followed by a player-specific salt
        if obj['cf'] == 'flash':
            salt = '2f9d6924b33a165a6d8b5d3d42f4f987'
            items = ['cf', 'format', 'ran', 'uu', 'ver', 'vu']
        elif obj['cf'] == 'html5':
            salt = 'fbeh5player12c43eccf2bec3300344'
            items = ['cf', 'ran', 'uu', 'bver', 'vu']
        input_data = ''.join([item + obj[item] for item in items]) + salt
        obj['sign'] = hashlib.md5(input_data.encode('utf-8')).hexdigest()

    def _get_formats(self, cf, uu, vu, media_id):
        def get_play_json(cf, timestamp):
            data = {
                'cf': cf,
                'ver': '2.2',
                'bver': 'firefox44.0',
                'format': 'json',
                'uu': uu,
                'vu': vu,
                'ran': compat_str(timestamp),
            }
            self.sign_data(data)
            return self._download_json(
                'http://api.letvcloud.com/gpc.php?' + compat_urllib_parse_urlencode(data),
                media_id, 'Downloading playJson data for type %s' % cf)

        play_json = get_play_json(cf, time.time())
        # The server time may be different from local time
        if play_json.get('code') == 10071:
            play_json = get_play_json(cf, play_json['timestamp'])

        if not play_json.get('data'):
            if play_json.get('message'):
                raise ExtractorError('Letv cloud said: %s' % play_json['message'], expected=True)
            elif play_json.get('code'):
                raise ExtractorError('Letv cloud returned error %d' % play_json['code'], expected=True)
            else:
                raise ExtractorError('Letv cloud returned an unknown error')

        def b64decode(s):
            return base64.b64decode(s.encode('utf-8')).decode('utf-8')

        formats = []
        for media in play_json['data']['video_info']['media'].values():
            play_url = media['play_url']
            url = b64decode(play_url['main_url'])
            decoded_url = b64decode(url_basename(url))
            formats.append({
                'url': url,
                'ext': determine_ext(decoded_url),
                'format_id': str_or_none(play_url.get('vtype')),
                'format_note': str_or_none(play_url.get('definition')),
                'width': int_or_none(play_url.get('vwidth')),
                'height': int_or_none(play_url.get('vheight')),
            })

        return formats

    def _real_extract(self, url):
        uu_mobj = re.search(r'uu=([\w]+)', url)
        vu_mobj = re.search(r'vu=([\w]+)', url)

        if not uu_mobj or not vu_mobj:
            raise ExtractorError('Invalid URL: %s' % url, expected=True)

        uu = uu_mobj.group(1)
        vu = vu_mobj.group(1)
        media_id = uu + '_' + vu

        formats = self._get_formats('flash', uu, vu, media_id) + self._get_formats('html5', uu, vu, media_id)
        self._sort_formats(formats)

        return {
            'id': media_id,
            'title': 'Video %s' % media_id,
            'formats': formats,
        }