release 2016.11.04

[ChangeLog] Actualize
[anvato] Improve formats extraction
2016-11-04 22:07:54 +07:00 · 2016-11-04 21:59:42 +07:00 · 2016-11-04 21:45:24 +07:00 · 2016-11-04 21:33:08 +07:00 · 2016-11-04 21:32:30 +07:00 · 2016-11-04 21:17:56 +07:00
20 changed files with 380 additions and 150 deletions
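The core change in this release, tolerating a malformed RESOLUTION attribute in m3u8 manifests, swaps a strict `split('x')` for a regex search. Below is a minimal standalone sketch of that idea. The helper name `parse_resolution` and the sample inputs are hypothetical and not part of youtube-dl, which inlines this logic in `extractor/common.py`.

```python
import re


def parse_resolution(resolution):
    """Return (width, height) from an m3u8 RESOLUTION value, or None.

    Hypothetical helper for illustration only.
    """
    # Search instead of splitting, so stray characters around the
    # WIDTHxHEIGHT pair (or an upper-case separator) do not raise.
    mobj = re.search(r'(?P<width>\d+)[xX](?P<height>\d+)', resolution)
    if not mobj:
        return None
    return int(mobj.group('width')), int(mobj.group('height'))


print(parse_resolution('1280x720'))
print(parse_resolution('"1280X720"'))
print(parse_resolution('garbage'))
```

With the strict split, an input without exactly one lower-case `x` fails at tuple unpacking; here the width and height are simply left unset.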
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.11.02*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.11.04*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.11.02**
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.11.04**
 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.11.02
+[debug] youtube-dl version 2016.11.04
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.gitignore
+++ b/.gitignore
@@ -30,6 +30,7 @@ updates_key.pem
 *.m4v
 *.mp3
 *.3gp
+*.wav
 *.part
 *.swp
 test/testdata
--- a/19
+++ b/19
@@ -1,3 +1,22 @@
+version 2016.11.04
+Core
+* [extractor/common] Tolerate malformed RESOLUTION attribute in m3u8
+  manifests (#11113)
+* [downloader/ism] Fix AVC Decoder Configuration Record
+Extractors
++ [fox9] Add support for fox9.com (#11110)
++ [anvato] Extract more metadata and improve formats extraction
+* [vodlocker] Improve removed videos detection (#11106)
++ [vzaar] Add support for vzaar.com (#11093)
++ [vice] Add support for uplynk preplay videos (#11101)
+* [tubitv] Fix extraction (#11061)
++ [shahid] Add support for authentication (#11091)
++ [radiocanada] Add subtitles support (#11096)
++ [generic] Add support for ISM manifests
 version 2016.11.02
 Core
--- a/2
+++ b/2
@@ -1,7 +1,7 @@
 all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
 	find . -name "*.pyc" -delete
 	find . -name "*.class" -delete
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -247,6 +247,7 @@
 - **FootyRoom**
 - **Formula1**
 - **FOX**
+- **FOX9**
 - **Foxgay**
 - **foxnews**: Fox News and Fox Business Video
 - **foxnews:article**
@@ -870,6 +871,7 @@
 - **vube**: Vube.com
 - **VuClip**
 - **VyboryMos**
+- **Vzaar**
 - **Walla**
 - **washingtonpost**
 - **washingtonpost:article**
--- a/youtube_dl/downloader/ism.py
+++ b/youtube_dl/downloader/ism.py
@@ -129,7 +129,7 @@ def write_piff_header(stream, params):
        sample_entry_payload += u1616.pack(params['sampling_rate'])
        if fourcc == 'AACL':
-            smaple_entry_box = box(b'mp4a', sample_entry_payload)
+            sample_entry_box = box(b'mp4a', sample_entry_payload)
    else:
        sample_entry_payload = sample_entry_payload
        sample_entry_payload += u16.pack(0)  # pre defined
@@ -149,9 +149,7 @@ def write_piff_header(stream, params):
        if fourcc in ('H264', 'AVC1'):
            sps, pps = codec_private_data.split(u32.pack(1))[1:]
            avcc_payload = u8.pack(1)  # configuration version
-            avcc_payload += sps[1]  # avc profile indication
-            avcc_payload += sps[2]  # profile compatibility
-            avcc_payload += sps[3]  # avc level indication
+            avcc_payload += sps[1:4]  # avc profile indication + profile compatibility + avc level indication
            avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1))  # complete represenation (1) + reserved (11111) + length size minus one
            avcc_payload += u8.pack(1)  # reserved (0) + number of sps (0000001)
            avcc_payload += u16.pack(len(sps))
@@ -160,8 +158,8 @@ def write_piff_header(stream, params):
            avcc_payload += u16.pack(len(pps))
            avcc_payload += pps
            sample_entry_payload += box(b'avcC', avcc_payload)  # AVC Decoder Configuration Record
-            smaple_entry_box = box(b'avc1', sample_entry_payload)  # AVC Simple Entry
+            sample_entry_box = box(b'avc1', sample_entry_payload)  # AVC Simple Entry
-    stsd_payload += smaple_entry_box
+    stsd_payload += sample_entry_box
    stbl_payload = full_box(b'stsd', 0, 0, stsd_payload)  # Sample Description Box
--- a/youtube_dl/extractor/anvato.py
+++ b/youtube_dl/extractor/anvato.py
@@ -157,22 +157,16 @@ class AnvatoIE(InfoExtractor):
            video_data_url, video_id, transform_source=strip_jsonp,
            data=json.dumps(payload).encode('utf-8'))
-    def _extract_anvato_videos(self, webpage, video_id):
-        anvplayer_data = self._parse_json(self._html_search_regex(
-            r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
-            'Anvato player data'), video_id)
-        video_id = anvplayer_data['video']
-        access_key = anvplayer_data['accessKey']
+    def _get_anvato_videos(self, access_key, video_id):
        video_data = self._get_video_json(access_key, video_id)
        formats = []
        for published_url in video_data['published_urls']:
            video_url = published_url['embed_url']
            media_format = published_url.get('format')
            ext = determine_ext(video_url)
-            if ext == 'smil':
+            if ext == 'smil' or media_format == 'smil':
                formats.extend(self._extract_smil_formats(video_url, video_id))
                continue
@@ -183,7 +177,7 @@ class AnvatoIE(InfoExtractor):
                'tbr': tbr if tbr != 0 else None,
            }
-            if ext == 'm3u8':
+            if ext == 'm3u8' or media_format in ('m3u8', 'm3u8-variant'):
                # Not using _extract_m3u8_formats here as individual media
                # playlists are also included in published_urls.
                if tbr is None:
@@ -194,7 +188,7 @@ class AnvatoIE(InfoExtractor):
                        'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])),
                        'ext': 'mp4',
                    })
-            elif ext == 'mp3':
+            elif ext == 'mp3' or media_format == 'mp3':
                a_format['vcodec'] = 'none'
            else:
                a_format.update({
@@ -218,7 +212,19 @@ class AnvatoIE(InfoExtractor):
            'formats': formats,
            'title': video_data.get('def_title'),
            'description': video_data.get('def_description'),
            'tags': video_data.get('def_tags', '').split(','),
            'categories': video_data.get('categories'),
            'thumbnail': video_data.get('thumbnail'),
            'timestamp': int_or_none(video_data.get(
                'ts_published') or video_data.get('ts_added')),
            'uploader': video_data.get('mcp_id'),
            'duration': int_or_none(video_data.get('duration')),
            'subtitles': subtitles,
        }
+    def _extract_anvato_videos(self, webpage, video_id):
+        anvplayer_data = self._parse_json(self._html_search_regex(
+            r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
+            'Anvato player data'), video_id)
+        return self._get_anvato_videos(
+            anvplayer_data['accessKey'], anvplayer_data['video'])
--- a/youtube_dl/extractor/cbslocal.py
+++ b/youtube_dl/extractor/cbslocal.py
@@ -22,6 +22,7 @@ class CBSLocalIE(AnvatoIE):
            'thumbnail': 're:^https?://.*',
            'timestamp': 1463440500,
            'upload_date': '20160516',
+            'uploader': 'CBS',
            'subtitles': {
                'en': 'mincount:5',
            },
@@ -35,6 +36,7 @@ class CBSLocalIE(AnvatoIE):
                'Syndication\\Curb.tv',
                'Content\\News'
            ],
+            'tags': ['CBS 2 News Evening'],
        },
    }, {
        # SendtoNews embed
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -1280,9 +1280,10 @@ class InfoExtractor(object):
                }
                resolution = last_info.get('RESOLUTION')
                if resolution:
-                    width_str, height_str = resolution.split('x')
-                    f['width'] = int(width_str)
-                    f['height'] = int(height_str)
+                    mobj = re.search(r'(?P<width>\d+)[xX](?P<height>\d+)', resolution)
+                    if mobj:
+                        f['width'] = int(mobj.group('width'))
+                        f['height'] = int(mobj.group('height'))
                # Unified Streaming Platform
                mobj = re.search(
                    r'audio.*?(?:%3D|=)(\d+)(?:-video.*?(?:%3D|=)(\d+))?', f['url'])
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -296,6 +296,7 @@ from .footyroom import FootyRoomIE
 from .formula1 import Formula1IE
 from .fourtube import FourTubeIE
 from .fox import FOXIE
+from .fox9 import FOX9IE
 from .foxgay import FoxgayIE
 from .foxnews import (
    FoxNewsIE,
@@ -1101,6 +1102,7 @@ from .vrt import VRTIE
 from .vube import VubeIE
 from .vuclip import VuClipIE
 from .vyborymos import VyboryMosIE
+from .vzaar import VzaarIE
 from .walla import WallaIE
 from .washingtonpost import (
    WashingtonPostIE,
--- a/youtube_dl/extractor/fox9.py
+++ b/youtube_dl/extractor/fox9.py
@@ -0,0 +1,43 @@
+# coding: utf-8
+from __future__ import unicode_literals
+from .anvato import AnvatoIE
+from ..utils import js_to_json
+class FOX9IE(AnvatoIE):
+    _VALID_URL = r'https?://(?:www\.)?fox9\.com/(?:[^/]+/)+(?P<id>\d+)-story'
+    _TESTS = [{
+        'url': 'http://www.fox9.com/news/215123287-story',
+        'md5': 'd6e1b2572c3bab8a849c9103615dd243',
+        'info_dict': {
+            'id': '314473',
+            'ext': 'mp4',
+            'title': 'Bear climbs tree in downtown Duluth',
+            'description': 'md5:6a36bfb5073a411758a752455408ac90',
+            'duration': 51,
+            'timestamp': 1478123580,
+            'upload_date': '20161102',
+            'uploader': 'EPFOX',
+            'categories': ['News', 'Sports'],
+            'tags': ['news', 'video'],
+        },
+    }, {
+        'url': 'http://www.fox9.com/news/investigators/214070684-story',
+        'only_matching': True,
+    }]
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+        video_id = self._parse_json(
+            self._search_regex(
+                r'AnvatoPlaylist\s*\(\s*(\[.+?\])\s*\)\s*;',
+                webpage, 'anvato playlist'),
+            video_id, transform_source=js_to_json)[0]['video']
+        return self._get_anvato_videos(
+            'anvato_epfox_app_web_prod_b3373168e12f423f41504f207000188daf88251b',
+            video_id)
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -1634,6 +1634,10 @@ class GenericIE(InfoExtractor):
            doc = compat_etree_fromstring(webpage.encode('utf-8'))
            if doc.tag == 'rss':
                return self._extract_rss(url, video_id, doc)
+            elif doc.tag == 'SmoothStreamingMedia':
+                info_dict['formats'] = self._parse_ism_formats(doc, url)
+                self._sort_formats(info_dict['formats'])
+                return info_dict
            elif re.match(r'^(?:{[^}]+})?smil$', doc.tag):
                smil = self._parse_smil(doc, url, video_id)
                self._sort_formats(smil['formats'])
@@ -2449,6 +2453,21 @@ class GenericIE(InfoExtractor):
                entry_info_dict['formats'] = self._extract_mpd_formats(video_url, video_id)
            elif ext == 'f4m':
                entry_info_dict['formats'] = self._extract_f4m_formats(video_url, video_id)
+            elif re.search(r'(?i)\.(?:ism|smil)/manifest', video_url) and video_url != url:
+                # Just matching .ism/manifest is not enough to be reliably sure
+                # whether it's actually an ISM manifest or some other streaming
+                # manifest since there are various streaming URL formats
+                # possible (see [1]) as well as some other shenanigans like
+                # .smil/manifest URLs that actually serve an ISM (see [2]) and
+                # so on.
+                # Thus the most reasonable way to solve this is to delegate
+                # to generic extractor in order to look into the contents of
+                # the manifest itself.
+                # 1. https://azure.microsoft.com/en-us/documentation/articles/media-services-deliver-content-overview/#streaming-url-formats
+                # 2. https://svs.itworkscdn.net/lbcivod/smil:itwfcdn/lbci/170976.smil/Manifest
+                entry_info_dict = self.url_result(
+                    smuggle_url(video_url, {'to_generic': True}),
+                    GenericIE.ie_key())
            else:
                entry_info_dict['url'] = video_url
--- a/youtube_dl/extractor/radiocanada.py
+++ b/youtube_dl/extractor/radiocanada.py
@@ -125,6 +125,14 @@ class RadioCanadaIE(InfoExtractor):
                                f4m_id='hds', fatal=False))
        self._sort_formats(formats)
+        subtitles = {}
+        closed_caption_url = get_meta('closedCaption') or get_meta('closedCaptionHTML5')
+        if closed_caption_url:
+            subtitles['fr'] = [{
+                'url': closed_caption_url,
+                'ext': determine_ext(closed_caption_url, 'vtt'),
+            }]
        return {
            'id': video_id,
            'title': get_meta('Title'),
@@ -135,6 +143,7 @@ class RadioCanadaIE(InfoExtractor):
            'season_number': int_or_none('SrcSaison'),
            'episode_number': int_or_none('SrcEpisode'),
            'upload_date': unified_strdate(get_meta('Date')),
+            'subtitles': subtitles,
            'formats': formats,
        }
--- a/youtube_dl/extractor/shahid.py
+++ b/youtube_dl/extractor/shahid.py
@@ -1,17 +1,24 @@
 # coding: utf-8
 from __future__ import unicode_literals
+import re
+import json
 from .common import InfoExtractor
+from ..compat import compat_HTTPError
 from ..utils import (
    ExtractorError,
    int_or_none,
    parse_iso8601,
    str_or_none,
+    urlencode_postdata,
+    clean_html,
 )
 class ShahidIE(InfoExtractor):
-    _VALID_URL = r'https?://shahid\.mbc\.net/ar/episode/(?P<id>\d+)/?'
+    _NETRC_MACHINE = 'shahid'
+    _VALID_URL = r'https?://shahid\.mbc\.net/ar/(?P<type>episode|movie)/(?P<id>\d+)'
    _TESTS = [{
        'url': 'https://shahid.mbc.net/ar/episode/90574/%D8%A7%D9%84%D9%85%D9%84%D9%83-%D8%B9%D8%A8%D8%AF%D8%A7%D9%84%D9%84%D9%87-%D8%A7%D9%84%D8%A5%D9%86%D8%B3%D8%A7%D9%86-%D8%A7%D9%84%D9%85%D9%88%D8%B3%D9%85-1-%D9%83%D9%84%D9%8A%D8%A8-3.html',
        'info_dict': {
@@ -27,18 +34,54 @@ class ShahidIE(InfoExtractor):
            # m3u8 download
            'skip_download': True,
        }
+    }, {
+        'url': 'https://shahid.mbc.net/ar/movie/151746/%D8%A7%D9%84%D9%82%D9%86%D8%A7%D8%B5%D8%A9.html',
+        'only_matching': True
+    }, {
+        # shahid plus subscriber only
+        'url': 'https://shahid.mbc.net/ar/episode/90511/%D9%85%D8%B1%D8%A7%D9%8A%D8%A7-2011-%D8%A7%D9%84%D9%85%D9%88%D8%B3%D9%85-1-%D8%A7%D9%84%D8%AD%D9%84%D9%82%D8%A9-1.html',
+        'only_matching': True
    }]
-    def _call_api(self, path, video_id, note):
-        data = self._download_json(
-            'http://api.shahid.net/api/v1_1/' + path, video_id, note, query={
-                'apiKey': 'sh@hid0nlin3',
-                'hash': 'b2wMCTHpSmyxGqQjJFOycRmLSex+BpTK/ooxy6vHaqs=',
-            }).get('data', {})
+    def _real_initialize(self):
+        email, password = self._get_login_info()
+        if email is None:
+            return
+
+        try:
+            user_data = self._download_json(
+                'https://shahid.mbc.net/wd/service/users/login',
+                None, 'Logging in', data=json.dumps({
+                    'email': email,
+                    'password': password,
+                    'basic': 'false',
+                }).encode('utf-8'), headers={
+                    'Content-Type': 'application/json; charset=UTF-8',
+                })['user']
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError):
+                fail_data = self._parse_json(
+                    e.cause.read().decode('utf-8'), None, fatal=False)
+                if fail_data:
+                    faults = fail_data.get('faults', [])
+                    faults_message = ', '.join([clean_html(fault['userMessage']) for fault in faults if fault.get('userMessage')])
+                    if faults_message:
+                        raise ExtractorError(faults_message, expected=True)
+            raise
+        self._download_webpage(
+            'https://shahid.mbc.net/populateContext',
+            None, 'Populate Context', data=urlencode_postdata({
+                'firstName': user_data['firstName'],
+                'lastName': user_data['lastName'],
+                'userName': user_data['email'],
+                'csg_user_name': user_data['email'],
+                'subscriberId': user_data['id'],
+                'sessionId': user_data['sessionId'],
+            }))
+    def _get_api_data(self, response):
+        data = response.get('data', {})
        error = data.get('error')
        if error:
@@ -49,11 +92,11 @@ class ShahidIE(InfoExtractor):
        return data
    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        player = self._call_api(
-            'Content/Episode/%s' % video_id,
-            video_id, 'Downloading player JSON')
+        page_type, video_id = re.match(self._VALID_URL, url).groups()
+        player = self._get_api_data(self._download_json(
+            'https://shahid.mbc.net/arContent/getPlayerContent-param-.id-%s.type-player.html' % video_id,
+            video_id, 'Downloading player JSON'))
        if player.get('drm'):
            raise ExtractorError('This video is DRM protected.', expected=True)
@@ -61,9 +104,12 @@ class ShahidIE(InfoExtractor):
        formats = self._extract_m3u8_formats(player['url'], video_id, 'mp4')
        self._sort_formats(formats)
-        video = self._call_api(
-            'episode/%s' % video_id, video_id,
-            'Downloading video JSON')['episode']
+        video = self._get_api_data(self._download_json(
+            'http://api.shahid.net/api/v1_1/%s/%s' % (page_type, video_id),
+            video_id, 'Downloading video JSON', query={
+                'apiKey': 'sh@hid0nlin3',
+                'hash': 'b2wMCTHpSmyxGqQjJFOycRmLSex+BpTK/ooxy6vHaqs=',
+            }))[page_type]
        title = video['title']
        categories = [
--- a/youtube_dl/extractor/tubitv.py
+++ b/youtube_dl/extractor/tubitv.py
@@ -9,7 +9,6 @@ from ..utils import (
    int_or_none,
    sanitized_Request,
    urlencode_postdata,
-    parse_iso8601,
 )
@@ -19,17 +18,13 @@ class TubiTvIE(InfoExtractor):
    _NETRC_MACHINE = 'tubitv'
    _TEST = {
        'url': 'http://tubitv.com/video/283829/the_comedian_at_the_friday',
        'md5': '43ac06be9326f41912dc64ccf7a80320',
        'info_dict': {
            'id': '283829',
            'ext': 'mp4',
            'title': 'The Comedian at The Friday',
            'description': 'A stand up comedian is forced to look at the decisions in his life while on a one week trip to the west coast.',
-            'uploader': 'Indie Rights Films',
+            'uploader_id': 'bc168bee0d18dd1cb3b86c68706ab434',
            'upload_date': '20160111',
            'timestamp': 1452555979,
        },
        'params': {
            'skip_download': 'HLS download',
        },
    }
@@ -58,19 +53,28 @@ class TubiTvIE(InfoExtractor):
        video_id = self._match_id(url)
        video_data = self._download_json(
            'http://tubitv.com/oz/videos/%s/content' % video_id, video_id)
-        title = video_data['n']
+        title = video_data['title']
        formats = self._extract_m3u8_formats(
-            video_data['mh'], video_id, 'mp4', 'm3u8_native')
+            self._proto_relative_url(video_data['url']),
+            video_id, 'mp4', 'm3u8_native')
        self._sort_formats(formats)
+        thumbnails = []
+        for thumbnail_url in video_data.get('thumbnails', []):
+            if not thumbnail_url:
+                continue
+            thumbnails.append({
+                'url': self._proto_relative_url(thumbnail_url),
+            })
        subtitles = {}
-        for sub in video_data.get('sb', []):
+        for sub in video_data.get('subtitles', []):
-            sub_url = sub.get('u')
+            sub_url = sub.get('url')
            if not sub_url:
                continue
-            subtitles.setdefault(sub.get('l', 'en'), []).append({
+            subtitles.setdefault(sub.get('lang', 'English'), []).append({
-                'url': sub_url,
+                'url': self._proto_relative_url(sub_url),
            })
        return {
@@ -78,9 +82,8 @@ class TubiTvIE(InfoExtractor):
            'title': title,
            'formats': formats,
            'subtitles': subtitles,
-            'thumbnail': video_data.get('ph'),
+            'thumbnails': thumbnails,
-            'description': video_data.get('d'),
+            'description': video_data.get('description'),
-            'duration': int_or_none(video_data.get('s')),
+            'duration': int_or_none(video_data.get('duration')),
-            'timestamp': parse_iso8601(video_data.get('u')),
+            'uploader_id': video_data.get('publisher_id'),
-            'uploader': video_data.get('on'),
        }
--- a/youtube_dl/extractor/vice.py
+++ b/youtube_dl/extractor/vice.py
@@ -1,12 +1,93 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
+import time
+import hashlib
+import json
+from .adobepass import AdobePassIE
 from .common import InfoExtractor
-from ..utils import ExtractorError
+from ..compat import compat_HTTPError
+from ..utils import (
+    int_or_none,
+    parse_age_limit,
+    str_or_none,
+    parse_duration,
+    ExtractorError,
+    extract_attributes,
+)
+class ViceBaseIE(AdobePassIE):
+    def _extract_preplay_video(self, url, webpage):
+        watch_hub_data = extract_attributes(self._search_regex(
+            r'(?s)(<watch-hub\s*.+?</watch-hub>)', webpage, 'watch hub'))
+        video_id = watch_hub_data['vms-id']
+        title = watch_hub_data['video-title']
+        query = {}
+        is_locked = watch_hub_data.get('video-locked') == '1'
+        if is_locked:
+            resource = self._get_mvpd_resource(
+                'VICELAND', title, video_id,
+                watch_hub_data.get('video-rating'))
+            query['tvetoken'] = self._extract_mvpd_auth(url, video_id, 'VICELAND', resource)
+        # signature generation algorithm is reverse engineered from signatureGenerator in
+        # webpack:///../shared/~/vice-player/dist/js/vice-player.js in
+        # https://www.viceland.com/assets/common/js/web.vendor.bundle.js
+        exp = int(time.time()) + 14400
+        query.update({
+            'exp': exp,
+            'sign': hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest(),
+        })
+        try:
+            host = 'www.viceland' if is_locked else self._PREPLAY_HOST
+            preplay = self._download_json('https://%s.com/en_us/preplay/%s' % (host, video_id), video_id, query=query)
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
+                error = json.loads(e.cause.read().decode())
+                raise ExtractorError('%s said: %s' % (self.IE_NAME, error['details']), expected=True)
+            raise
+        video_data = preplay['video']
+        base = video_data['base']
+        uplynk_preplay_url = preplay['preplayURL']
+        episode = video_data.get('episode', {})
+        channel = video_data.get('channel', {})
+        subtitles = {}
+        cc_url = preplay.get('ccURL')
+        if cc_url:
+            subtitles['en'] = [{
+                'url': cc_url,
+            }]
+        return {
+            '_type': 'url_transparent',
+            'url': uplynk_preplay_url,
+            'id': video_id,
+            'title': title,
+            'description': base.get('body'),
+            'thumbnail': watch_hub_data.get('cover-image') or watch_hub_data.get('thumbnail'),
+            'duration': parse_duration(video_data.get('video_duration') or watch_hub_data.get('video-duration')),
+            'timestamp': int_or_none(video_data.get('created_at')),
+            'age_limit': parse_age_limit(video_data.get('video_rating')),
+            'series': video_data.get('show_title') or watch_hub_data.get('show-title'),
+            'episode_number': int_or_none(episode.get('episode_number') or watch_hub_data.get('episode')),
+            'episode_id': str_or_none(episode.get('id') or video_data.get('episode_id')),
+            'season_number': int_or_none(watch_hub_data.get('season')),
+            'season_id': str_or_none(episode.get('season_id')),
+            'uploader': channel.get('base', {}).get('title') or watch_hub_data.get('channel-title'),
+            'uploader_id': str_or_none(channel.get('id')),
+            'subtitles': subtitles,
+            'ie_key': 'UplynkPreplay',
+        }
-class ViceIE(InfoExtractor):
+class ViceIE(ViceBaseIE):
    _VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:[^/]+/)?videos?/(?P<id>[^/?#&]+)'
    _TESTS = [{
@@ -21,7 +102,7 @@ class ViceIE(InfoExtractor):
        'add_ie': ['Ooyala'],
    }, {
        'url': 'http://www.vice.com/video/how-to-hack-a-car',
-        'md5': '6fb2989a3fed069fb8eab3401fc2d3c9',
+        'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
        'info_dict': {
            'id': '3jstaBeXgAs',
            'ext': 'mp4',
@@ -32,6 +113,22 @@ class ViceIE(InfoExtractor):
            'upload_date': '20140529',
        },
        'add_ie': ['Youtube'],
+    }, {
+        'url': 'https://video.vice.com/en_us/video/the-signal-from-tolva/5816510690b70e6c5fd39a56',
+        'md5': '',
+        'info_dict': {
+            'id': '5816510690b70e6c5fd39a56',
+            'ext': 'mp4',
+            'uploader': 'Waypoint',
+            'title': 'The Signal From Tölva',
+            'uploader_id': '57f7d621e05ca860fa9ccaf9',
+            'timestamp': 1477941983938,
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
+        'add_ie': ['UplynkPreplay'],
    }, {
        'url': 'https://news.vice.com/video/experimenting-on-animals-inside-the-monkey-lab',
        'only_matching': True,
@@ -42,21 +139,21 @@ class ViceIE(InfoExtractor):
        'url': 'https://munchies.vice.com/en/videos/watch-the-trailer-for-our-new-series-the-pizza-show',
        'only_matching': True,
    }]
+    _PREPLAY_HOST = 'video.vice'
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        try:
-            embed_code = self._search_regex(
-                r'embedCode=([^&\'"]+)', webpage,
-                'ooyala embed code', default=None)
-            if embed_code:
-                return self.url_result('ooyala:%s' % embed_code, 'Ooyala')
-            youtube_id = self._search_regex(
-                r'data-youtube-id="([^"]+)"', webpage, 'youtube id')
+        webpage, urlh = self._download_webpage_handle(url, video_id)
+        embed_code = self._search_regex(
+            r'embedCode=([^&\'"]+)', webpage,
+            'ooyala embed code', default=None)
+        if embed_code:
+            return self.url_result('ooyala:%s' % embed_code, 'Ooyala')
+        youtube_id = self._search_regex(
+            r'data-youtube-id="([^"]+)"', webpage, 'youtube id', default=None)
+        if youtube_id:
            return self.url_result(youtube_id, 'Youtube')
-        except ExtractorError:
-            raise ExtractorError('The page doesn\'t contain a video', expected=True)
+        return self._extract_preplay_video(urlh.geturl(), webpage)
 class ViceShowIE(InfoExtractor):
--- a/youtube_dl/extractor/viceland.py
+++ b/youtube_dl/extractor/viceland.py
@@ -1,23 +1,10 @@
 # coding: utf-8
 from __future__ import unicode_literals
-import time
-import hashlib
-import json
-from .adobepass import AdobePassIE
-from ..compat import compat_HTTPError
-from ..utils import (
-    int_or_none,
-    parse_age_limit,
-    str_or_none,
-    parse_duration,
-    ExtractorError,
-    extract_attributes,
-)
+from .vice import ViceBaseIE
-class VicelandIE(AdobePassIE):
+class VicelandIE(ViceBaseIE):
    _VALID_URL = r'https?://(?:www\.)?viceland\.com/[^/]+/video/[^/]+/(?P<id>[a-f0-9]+)'
    _TEST = {
        'url': 'https://www.viceland.com/en_us/video/cyberwar-trailer/57608447973ee7705f6fbd4e',
@@ -38,70 +25,9 @@ class VicelandIE(AdobePassIE):
        },
        'add_ie': ['UplynkPreplay'],
    }
+    _PREPLAY_HOST = 'www.viceland'
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
-        watch_hub_data = extract_attributes(self._search_regex(
-            r'(?s)(<watch-hub\s*.+?</watch-hub>)', webpage, 'watch hub'))
-        video_id = watch_hub_data['vms-id']
-        title = watch_hub_data['video-title']
-        query = {}
-        if watch_hub_data.get('video-locked') == '1':
-            resource = self._get_mvpd_resource(
-                'VICELAND', title, video_id,
-                watch_hub_data.get('video-rating'))
-            query['tvetoken'] = self._extract_mvpd_auth(url, video_id, 'VICELAND', resource)
-        # signature generation algorithm is reverse engineered from signatureGenerator in
-        # webpack:///../shared/~/vice-player/dist/js/vice-player.js in
-        # https://www.viceland.com/assets/common/js/web.vendor.bundle.js
-        exp = int(time.time()) + 14400
-        query.update({
-            'exp': exp,
-            'sign': hashlib.sha512(('%s:GET:%d' % (video_id, exp)).encode()).hexdigest(),
-        })
-        try:
-            preplay = self._download_json('https://www.viceland.com/en_us/preplay/%s' % video_id, video_id, query=query)
-        except ExtractorError as e:
-            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
-                error = json.loads(e.cause.read().decode())
-                raise ExtractorError('%s said: %s' % (self.IE_NAME, error['details']), expected=True)
-            raise
-        video_data = preplay['video']
-        base = video_data['base']
-        uplynk_preplay_url = preplay['preplayURL']
-        episode = video_data.get('episode', {})
-        channel = video_data.get('channel', {})
-        subtitles = {}
-        cc_url = preplay.get('ccURL')
-        if cc_url:
-            subtitles['en'] = [{
-                'url': cc_url,
-            }]
-        return {
-            '_type': 'url_transparent',
-            'url': uplynk_preplay_url,
-            'id': video_id,
-            'title': title,
-            'description': base.get('body'),
-            'thumbnail': watch_hub_data.get('cover-image') or watch_hub_data.get('thumbnail'),
-            'duration': parse_duration(video_data.get('video_duration') or watch_hub_data.get('video-duration')),
-            'timestamp': int_or_none(video_data.get('created_at')),
-            'age_limit': parse_age_limit(video_data.get('video_rating')),
-            'series': video_data.get('show_title') or watch_hub_data.get('show-title'),
-            'episode_number': int_or_none(episode.get('episode_number') or watch_hub_data.get('episode')),
-            'episode_id': str_or_none(episode.get('id') or video_data.get('episode_id')),
-            'season_number': int_or_none(watch_hub_data.get('season')),
-            'season_id': str_or_none(episode.get('season_id')),
-            'uploader': channel.get('base', {}).get('title') or watch_hub_data.get('channel-title'),
-            'uploader_id': str_or_none(channel.get('id')),
-            'subtitles': subtitles,
-            'ie_key': 'UplynkPreplay',
-        }
+        return self._extract_preplay_video(url, webpage)
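Note on the removed signing code in the hunk above: the step can be reproduced outside the extractor. This is a minimal sketch, assuming only what the removed lines show, namely that the signature is the SHA-512 hex digest of `<video_id>:GET:<exp>` and that `exp` lies 14400 seconds (four hours) ahead. The helper name `build_signed_query` and its `now` and `ttl` parameters are hypothetical and not part of youtube-dl.

```python
import hashlib
import time


def build_signed_query(video_id, now=None, ttl=14400):
    """Return the 'exp' and 'sign' query parameters for a preplay request.

    Hypothetical helper: it mirrors the removed extractor lines and
    assumes nothing else about the service.
    """
    if now is None:
        now = time.time()
    exp = int(now) + ttl  # expiry as a Unix timestamp
    payload = '%s:GET:%d' % (video_id, exp)
    return {
        'exp': exp,
        'sign': hashlib.sha512(payload.encode()).hexdigest(),
    }


if __name__ == '__main__':
    # A fixed `now` makes the output reproducible.
    print(build_signed_query('57608447973ee7705f6fbd4e', now=0))
```

Passing `now` explicitly keeps the function deterministic for testing; the removed code always used the current time.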
--- a/youtube_dl/extractor/vodlocker.py
+++ b/youtube_dl/extractor/vodlocker.py
@@ -31,7 +31,8 @@ class VodlockerIE(InfoExtractor):
        if any(p in webpage for p in (
                '>THIS FILE WAS DELETED<',
                '>File Not Found<',
-                'The file you were looking for could not be found, sorry for any inconvenience.<')):
+                'The file you were looking for could not be found, sorry for any inconvenience.<',
+                '>The file was removed')):
            raise ExtractorError('Video %s does not exist' % video_id, expected=True)
        fields = self._hidden_inputs(webpage)
--- a/youtube_dl/extractor/vzaar.py
+++ b/youtube_dl/extractor/vzaar.py
@@ -0,0 +1,55 @@
+# coding: utf-8
+from __future__ import unicode_literals
+from .common import InfoExtractor
+from ..utils import (
+    int_or_none,
+    float_or_none,
+)
+class VzaarIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:(?:www|view)\.)?vzaar\.com/(?:videos/)?(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://vzaar.com/videos/1152805',
+        'md5': 'bde5ddfeb104a6c56a93a06b04901dbf',
+        'info_dict': {
+            'id': '1152805',
+            'ext': 'mp4',
+            'title': 'sample video (public)',
+        },
+    }, {
+        'url': 'https://view.vzaar.com/27272/player',
+        'md5': '3b50012ac9bbce7f445550d54e0508f2',
+        'info_dict': {
+            'id': '27272',
+            'ext': 'mp3',
+            'title': 'MP3',
+        },
+    }]
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        video_data = self._download_json(
+            'http://view.vzaar.com/v2/%s/video' % video_id, video_id)
+        source_url = video_data['sourceUrl']
+        info = {
+            'id': video_id,
+            'title': video_data['videoTitle'],
+            'url': source_url,
+            'thumbnail': self._proto_relative_url(video_data.get('poster')),
+            'duration': float_or_none(video_data.get('videoDuration')),
+        }
+        if 'audio' in source_url:
+            info.update({
+                'vcodec': 'none',
+                'ext': 'mp3',
+            })
+        else:
+            info.update({
+                'width': int_or_none(video_data.get('width')),
+                'height': int_or_none(video_data.get('height')),
+                'ext': 'mp4',
+            })
+        return info
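Note on the new Vzaar extractor: the two test URLs exercise both optional parts of `_VALID_URL`, the `www.`/`view.` host prefix and the `videos/` path segment. Below is a small standalone check, assuming plain `re.match` as a stand-in for the extractor's own `_match_id`; the function name `match_vzaar_id` is hypothetical.

```python
import re

# Pattern copied from VzaarIE._VALID_URL in the diff.
VALID_URL = r'https?://(?:(?:www|view)\.)?vzaar\.com/(?:videos/)?(?P<id>\d+)'


def match_vzaar_id(url):
    """Return the numeric video id, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


if __name__ == '__main__':
    for url in ('https://vzaar.com/videos/1152805',
                'https://view.vzaar.com/27272/player'):
        print(url, '->', match_vzaar_id(url))
```

Both test URLs should yield their numeric id, while a URL on another host returns `None`.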
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2016.11.02'
+__version__ = '2016.11.04'
Author	SHA1	Message	Date
Sergey M․	b30e4c2754	release 2016.11.04	2016-11-04 22:07:54 +07:00
Sergey M․	09ffe34b00	[ChangeLog] Actualize	2016-11-04 21:59:42 +07:00
Sergey M․	640aff1d0c	[anvato] Improve formats extraction	2016-11-04 21:45:24 +07:00
Sergey M․	c897af8aac	[cbslocal] Update test	2016-11-04 21:33:08 +07:00
Sergey M․	f3c705f8ec	[fox9] Add extractor (closes #11110 )	2016-11-04 21:32:30 +07:00
Sergey M․	f93ac1d175	[anvato] Extract more metadata	2016-11-04 21:17:56 +07:00
Sergey M․	c4c9b8440c	[extractor/common] Tolerate malformed RESOLUTION attribute in m3u8 manifests (closes #11113 )	2016-11-04 05:02:31 +07:00
Sergey M․	32f2627aed	[vodlocker] Add another removed file pattern (closes #11106 )	2016-11-03 22:22:40 +07:00
Sergey M․	9d64e1dcdc	[downloader/ism] Fix typo	2016-11-03 22:15:09 +07:00
Remita Amine	10380e55de	[downloader/ism] fix AVC Decoder Configuration Record creation in python 3	2016-11-03 16:08:57 +01:00
Remita Amine	22979993e7	[vice] add coding cookie	2016-11-03 16:07:22 +01:00
Remita Amine	b47ecd0b74	[vzaar] Add new extractor(closes #11093 )	2016-11-03 12:50:41 +01:00
Yen Chi Hsuan	3a86b2c51e	Ignore and clean .wav files	2016-11-03 18:55:55 +08:00
Remita Amine	b811b4c93b	[vice] add support for uplynk preplay videos(#11101 )	2016-11-03 10:37:07 +01:00
Remita Amine	f4dfa9a5ed	[tubitv] fix extraction(closes #11061 )	2016-11-03 09:04:20 +01:00
Remita Amine	3b4b66b50c	[shahid] add support for authentication(closes #11091 )	2016-11-03 00:44:12 +01:00
Sergey M․	4119a96ce5	[extractor/generic] Skip URLs we came from when delegating ISM extraction	2016-11-02 23:43:41 +07:00
Sergey M․	26aae56690	[extractor/generic] Improve ISM extraction	2016-11-02 23:34:37 +07:00
Remita Amine	4f9cd4d36f	[radiocanada] extract subtitle(closes #11096 )	2016-11-02 13:55:40 +01:00
Sergey M․	cc99a77ac1	[extractor/generic] Add support for ISM manifests	2016-11-02 03:01:13 +07:00