release 2016.07.06

[prosiebensat1] Make downloading urls JSON non fatal
[onionstudios] fix info extraction
2016-07-06 00:54:23 +07:00 · 2016-07-06 00:52:48 +07:00 · 2016-07-05 18:05:07 +01:00 · 2016-07-05 23:30:44 +07:00 · 2016-07-05 17:11:45 +01:00 · 2016-07-05 14:45:39 +01:00
34 changed files with 445 additions and 323 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.07.03.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.07.03.1**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.07.06*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.07.06**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.07.03.1
+[debug] youtube-dl version 2016.07.06
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/README.md
+++ b/README.md
@@ -103,9 +103,9 @@ which means you can modify it, redistribute it or use it however you like.
                                     (experimental)
    -6, --force-ipv6                 Make all connections via IPv6
                                     (experimental)
-    --cn-verification-proxy URL      Use this proxy to verify the IP address for
-                                     some Chinese sites. The default proxy
-                                     specified by --proxy (or none, if the
+    --geo-verification-proxy URL     Use this proxy to verify the IP address for
+                                     some geo-restricted sites. The default
+                                     proxy specified by --proxy (or none, if the
                                     options is not present) is used for the
                                     actual downloading. (experimental)

@@ -424,7 +424,7 @@ which means you can modify it, redistribute it or use it however you like.

 # CONFIGURATION

-You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and OS X, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`.
+You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux and OS X, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. Note that by default configuration file may not exist so you may need to create it yourself.

 For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory:
 ```
--- a/devscripts/show-downloads-statistics.py
+++ b/devscripts/show-downloads-statistics.py
@@ -0,0 +1,41 @@
+#!/usr/bin/env python
+from __future__ import unicode_literals
+
+import json
+import os
+import re
+import sys
+
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+
+from youtube_dl.compat import (
+    compat_print,
+    compat_urllib_request,
+)
+from youtube_dl.utils import format_bytes
+
+
+def format_size(bytes):
+    return '%s (%d bytes)' % (format_bytes(bytes), bytes)
+
+
+total_bytes = 0
+
+releases = json.loads(compat_urllib_request.urlopen(
+    'https://api.github.com/repos/rg3/youtube-dl/releases').read().decode('utf-8'))
+
+for release in releases:
+    compat_print(release['name'])
+    for asset in release['assets']:
+        asset_name = asset['name']
+        total_bytes += asset['download_count'] * asset['size']
+        if all(not re.match(p, asset_name) for p in (
+                r'^youtube-dl$',
+                r'^youtube-dl-\d{4}\.\d{2}\.\d{2}(?:\.\d+)?\.tar\.gz$',
+                r'^youtube-dl\.exe$')):
+            continue
+        compat_print(
+            ' %s size: %s downloads: %d'
+            % (asset_name, format_size(asset['size']), asset['download_count']))
+
+compat_print('total downloads traffic: %s' % format_size(total_bytes))
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -857,6 +857,7 @@
 - **youtube:search**: YouTube.com searches
 - **youtube:search:date**: YouTube.com searches, newest videos first
 - **youtube:search_url**: YouTube.com search URLs
+ - **youtube:shared**
 - **youtube:show**: YouTube.com (multi-season) shows
 - **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
 - **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
--- a/test/test_http.py
+++ b/test/test_http.py
@@ -138,27 +138,27 @@ class TestProxy(unittest.TestCase):
        self.proxy_thread.daemon = True
        self.proxy_thread.start()

-        self.cn_proxy = compat_http_server.HTTPServer(
-            ('localhost', 0), _build_proxy_handler('cn'))
-        self.cn_port = http_server_port(self.cn_proxy)
-        self.cn_proxy_thread = threading.Thread(target=self.cn_proxy.serve_forever)
-        self.cn_proxy_thread.daemon = True
-        self.cn_proxy_thread.start()
+        self.geo_proxy = compat_http_server.HTTPServer(
+            ('localhost', 0), _build_proxy_handler('geo'))
+        self.geo_port = http_server_port(self.geo_proxy)
+        self.geo_proxy_thread = threading.Thread(target=self.geo_proxy.serve_forever)
+        self.geo_proxy_thread.daemon = True
+        self.geo_proxy_thread.start()

    def test_proxy(self):
-        cn_proxy = 'localhost:{0}'.format(self.cn_port)
+        geo_proxy = 'localhost:{0}'.format(self.geo_port)
        ydl = YoutubeDL({
            'proxy': 'localhost:{0}'.format(self.port),
-            'cn_verification_proxy': cn_proxy,
+            'geo_verification_proxy': geo_proxy,
        })
        url = 'http://foo.com/bar'
        response = ydl.urlopen(url).read().decode('utf-8')
        self.assertEqual(response, 'normal: {0}'.format(url))

        req = compat_urllib_request.Request(url)
-        req.add_header('Ytdl-request-proxy', cn_proxy)
+        req.add_header('Ytdl-request-proxy', geo_proxy)
        response = ydl.urlopen(req).read().decode('utf-8')
-        self.assertEqual(response, 'cn: {0}'.format(url))
+        self.assertEqual(response, 'geo: {0}'.format(url))

    def test_proxy_with_idn(self):
        ydl = YoutubeDL({
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -405,6 +405,12 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(res_url, url)
        self.assertEqual(res_data, None)

+        smug_url = smuggle_url(url, {'a': 'b'})
+        smug_smug_url = smuggle_url(smug_url, {'c': 'd'})
+        res_url, res_data = unsmuggle_url(smug_smug_url)
+        self.assertEqual(res_url, url)
+        self.assertEqual(res_data, {'a': 'b', 'c': 'd'})
+
    def test_shell_quote(self):
        args = ['ffmpeg', '-i', encodeFilename('ñ€ß\'.mp4')]
        self.assertEqual(shell_quote(args), """ffmpeg -i 'ñ€ß'"'"'.mp4'""")
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -196,8 +196,8 @@ class YoutubeDL(object):
    prefer_insecure:   Use HTTP instead of HTTPS to retrieve information.
                       At the moment, this is only supported by YouTube.
    proxy:             URL of the proxy server to use
-    cn_verification_proxy:  URL of the proxy to use for IP address verification
-                       on Chinese sites. (Experimental)
+    geo_verification_proxy:  URL of the proxy to use for IP address verification
+                       on geo-restricted sites. (Experimental)
    socket_timeout:    Time to wait for unresponsive hosts, in seconds
    bidi_workaround:   Work around buggy terminals without bidirectional text
                       support, using fridibi
@@ -304,6 +304,11 @@ class YoutubeDL(object):
        self.params.update(params)
        self.cache = Cache(self)

+        if self.params.get('cn_verification_proxy') is not None:
+            self.report_warning('--cn-verification-proxy is deprecated. Use --geo-verification-proxy instead.')
+            if self.params.get('geo_verification_proxy') is None:
+                self.params['geo_verification_proxy'] = self.params['cn_verification_proxy']
+
        if params.get('bidi_workaround', False):
            try:
                import pty
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -382,6 +382,8 @@ def _real_main(argv=None):
        'external_downloader_args': external_downloader_args,
        'postprocessor_args': postprocessor_args,
        'cn_verification_proxy': opts.cn_verification_proxy,
+        'geo_verification_proxy': opts.geo_verification_proxy,
+
    }

    with YoutubeDL(ydl_opts) as ydl:
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@@ -585,6 +585,13 @@ class BrightcoveNewIE(InfoExtractor):
                        'format_id': build_format_id('rtmp'),
                    })
                formats.append(f)
+
+        errors = json_data.get('errors')
+        if not formats and errors:
+            error = errors[0]
+            raise ExtractorError(
+                error.get('message') or error.get('error_subcode') or error['error_code'], expected=True)
+
        self._sort_formats(formats)

        subtitles = {}
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -1729,6 +1729,13 @@ class InfoExtractor(object):
    def _mark_watched(self, *args, **kwargs):
        raise NotImplementedError('This method must be implemented by subclasses')

+    def geo_verification_headers(self):
+        headers = {}
+        geo_verification_proxy = self._downloader.params.get('geo_verification_proxy')
+        if geo_verification_proxy:
+            headers['Ytdl-request-proxy'] = geo_verification_proxy
+        return headers
+

 class SearchInfoExtractor(InfoExtractor):
    """
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -1066,6 +1066,7 @@ from .youtube import (
    YoutubeSearchDateIE,
    YoutubeSearchIE,
    YoutubeSearchURLIE,
+    YoutubeSharedVideoIE,
    YoutubeShowIE,
    YoutubeSubscriptionsIE,
    YoutubeTruncatedIDIE,
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -1295,6 +1295,21 @@ class GenericIE(InfoExtractor):
                'uploader': 'cylus cyrus',
            },
        },
+        {
+            # video stored on custom kaltura server
+            'url': 'http://www.expansion.com/multimedia/videos.html?media=EQcM30NHIPv',
+            'md5': '537617d06e64dfed891fa1593c4b30cc',
+            'info_dict': {
+                'id': '0_1iotm5bh',
+                'ext': 'mp4',
+                'title': 'Elecciones británicas: 5 lecciones para Rajoy',
+                'description': 'md5:435a89d68b9760b92ce67ed227055f16',
+                'uploader_id': 'videos.expansion@el-mundo.net',
+                'upload_date': '20150429',
+                'timestamp': 1430303472,
+            },
+            'add_ie': ['Kaltura'],
+        },
    ]

    def report_following_redirect(self, new_url):
--- a/youtube_dl/extractor/iqiyi.py
+++ b/youtube_dl/extractor/iqiyi.py
@@ -165,7 +165,7 @@ class IqiyiIE(InfoExtractor):

    _TESTS = [{
        'url': 'http://www.iqiyi.com/v_19rrojlavg.html',
-        'md5': '5b0591f55961117155430b5d544fdb01',
+        # MD5 checksum differs on my machine and Travis CI
        'info_dict': {
            'id': '9c1fb1b99d192b21c559e5a1a2cb3c73',
            'ext': 'mp4',
@@ -293,14 +293,10 @@ class IqiyiIE(InfoExtractor):
            't': tm,
        }

-        headers = {}
-        cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-        if cn_verification_proxy:
-            headers['Ytdl-request-proxy'] = cn_verification_proxy
        return self._download_json(
            'http://cache.m.iqiyi.com/jp/tmts/%s/%s/' % (tvid, video_id),
            video_id, transform_source=lambda s: remove_start(s, 'var tvInfoJs='),
-            query=params, headers=headers)
+            query=params, headers=self.geo_verification_headers())

    def _extract_playlist(self, webpage):
        PAGE_SIZE = 50
--- a/youtube_dl/extractor/kaltura.py
+++ b/youtube_dl/extractor/kaltura.py
@@ -6,7 +6,6 @@ import base64

 from .common import InfoExtractor
 from ..compat import (
-    compat_urllib_parse_urlencode,
    compat_urlparse,
    compat_parse_qs,
 )
@@ -15,6 +14,7 @@ from ..utils import (
    ExtractorError,
    int_or_none,
    unsmuggle_url,
+    smuggle_url,
 )


@@ -34,7 +34,8 @@ class KalturaIE(InfoExtractor):
                        )(?:/(?P<path>[^?]+))?(?:\?(?P<query>.*))?
                )
                '''
-    _API_BASE = 'http://cdnapi.kaltura.com/api_v3/index.php?'
+    _SERVICE_URL = 'http://cdnapi.kaltura.com'
+    _SERVICE_BASE = '/api_v3/index.php'
    _TESTS = [
        {
            'url': 'kaltura:269692:1_1jc2y3e4',
@@ -88,18 +89,26 @@ class KalturaIE(InfoExtractor):
                    (?P<q3>["\'])(?P<id>.+?)(?P=q3)
                ''', webpage))
        if mobj:
-            return 'kaltura:%(partner_id)s:%(id)s' % mobj.groupdict()
+            embed_info = mobj.groupdict()
+            url = 'kaltura:%(partner_id)s:%(id)s' % embed_info
+            escaped_pid = re.escape(embed_info['partner_id'])
+            service_url = re.search(
+                r'<script[^>]+src=["\']((?:https?:)?//.+?)/p/%s/sp/%s00/embedIframeJs' % (escaped_pid, escaped_pid),
+                webpage)
+            if service_url:
+                url = smuggle_url(url, {'service_url': service_url.group(1)})
+            return url

-    def _kaltura_api_call(self, video_id, actions, *args, **kwargs):
+    def _kaltura_api_call(self, video_id, actions, service_url=None, *args, **kwargs):
        params = actions[0]
        if len(actions) > 1:
            for i, a in enumerate(actions[1:], start=1):
                for k, v in a.items():
                    params['%d:%s' % (i, k)] = v

-        query = compat_urllib_parse_urlencode(params)
-        url = self._API_BASE + query
-        data = self._download_json(url, video_id, *args, **kwargs)
+        data = self._download_json(
+            (service_url or self._SERVICE_URL) + self._SERVICE_BASE,
+            video_id, query=params, *args, **kwargs)

        status = data if len(actions) == 1 else data[0]
        if status.get('objectType') == 'KalturaAPIException':
@@ -108,7 +117,7 @@ class KalturaIE(InfoExtractor):

        return data

-    def _get_kaltura_signature(self, video_id, partner_id):
+    def _get_kaltura_signature(self, video_id, partner_id, service_url=None):
        actions = [{
            'apiVersion': '3.1',
            'expiry': 86400,
@@ -118,10 +127,10 @@ class KalturaIE(InfoExtractor):
            'widgetId': '_%s' % partner_id,
        }]
        return self._kaltura_api_call(
-            video_id, actions, note='Downloading Kaltura signature')['ks']
+            video_id, actions, service_url, note='Downloading Kaltura signature')['ks']

-    def _get_video_info(self, video_id, partner_id):
-        signature = self._get_kaltura_signature(video_id, partner_id)
+    def _get_video_info(self, video_id, partner_id, service_url=None):
+        signature = self._get_kaltura_signature(video_id, partner_id, service_url)
        actions = [
            {
                'action': 'null',
@@ -144,7 +153,7 @@ class KalturaIE(InfoExtractor):
            },
        ]
        return self._kaltura_api_call(
-            video_id, actions, note='Downloading video info JSON')
+            video_id, actions, service_url, note='Downloading video info JSON')

    def _real_extract(self, url):
        url, smuggled_data = unsmuggle_url(url, {})
@@ -153,7 +162,7 @@ class KalturaIE(InfoExtractor):
        partner_id, entry_id = mobj.group('partner_id', 'id')
        ks = None
        if partner_id and entry_id:
-            info, flavor_assets = self._get_video_info(entry_id, partner_id)
+            info, flavor_assets = self._get_video_info(entry_id, partner_id, smuggled_data.get('service_url'))
        else:
            path, query = mobj.group('path', 'query')
            if not path and not query:
@@ -201,12 +210,17 @@ class KalturaIE(InfoExtractor):
                unsigned_url += '?referrer=%s' % referrer
            return unsigned_url

+        data_url = info['dataUrl']
+        if '/flvclipper/' in data_url:
+            data_url = re.sub(r'/flvclipper/.*', '/serveFlavor', data_url)
+
        formats = []
        for f in flavor_assets:
            # Continue if asset is not ready
            if f['status'] != 2:
                continue
-            video_url = sign_url('%s/flavorId/%s' % (info['dataUrl'], f['id']))
+            video_url = sign_url(
+                '%s/flavorId/%s' % (data_url, f['id']))
            formats.append({
                'format_id': '%(fileExt)s-%(bitrate)s' % f,
                'ext': f.get('fileExt'),
@@ -219,9 +233,12 @@ class KalturaIE(InfoExtractor):
                'width': int_or_none(f.get('width')),
                'url': video_url,
            })
-        m3u8_url = sign_url(info['dataUrl'].replace('format/url', 'format/applehttp'))
-        formats.extend(self._extract_m3u8_formats(
-            m3u8_url, entry_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+        if '/playManifest/' in data_url:
+            m3u8_url = sign_url(data_url.replace(
+                'format/url', 'format/applehttp'))
+            formats.extend(self._extract_m3u8_formats(
+                m3u8_url, entry_id, 'mp4', 'm3u8_native',
+                m3u8_id='hls', fatal=False))

        self._check_formats(formats, entry_id)
        self._sort_formats(formats)
--- a/youtube_dl/extractor/kuwo.py
+++ b/youtube_dl/extractor/kuwo.py
@@ -26,11 +26,6 @@ class KuwoBaseIE(InfoExtractor):
    def _get_formats(self, song_id, tolerate_ip_deny=False):
        formats = []
        for file_format in self._FORMATS:
-            headers = {}
-            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-            if cn_verification_proxy:
-                headers['Ytdl-request-proxy'] = cn_verification_proxy
-
            query = {
                'format': file_format['ext'],
                'br': file_format.get('br', ''),
@@ -42,7 +37,7 @@ class KuwoBaseIE(InfoExtractor):
            song_url = self._download_webpage(
                'http://antiserver.kuwo.cn/anti.s',
                song_id, note='Download %s url info' % file_format['format'],
-                query=query, headers=headers,
+                query=query, headers=self.geo_verification_headers(),
            )

            if song_url == 'IPDeny' and not tolerate_ip_deny:
--- a/youtube_dl/extractor/la7.py
+++ b/youtube_dl/extractor/la7.py
@@ -3,8 +3,8 @@ from __future__ import unicode_literals

 from .common import InfoExtractor
 from ..utils import (
-    determine_ext,
    js_to_json,
+    smuggle_url,
 )


@@ -18,13 +18,16 @@ class LA7IE(InfoExtractor):
    _TESTS = [{
        # 'src' is a plain URL
        'url': 'http://www.la7.it/crozza/video/inccool8-02-10-2015-163722',
-        'md5': '6054674766e7988d3e02f2148ff92180',
+        'md5': '8b613ffc0c4bf9b9e377169fc19c214c',
        'info_dict': {
            'id': 'inccool8-02-10-2015-163722',
            'ext': 'mp4',
            'title': 'Inc.Cool8',
            'description': 'Benvenuti nell\'incredibile mondo della INC. COOL. 8. dove “INC.” sta per “Incorporated” “COOL” sta per “fashion” ed Eight sta per il gesto  atletico',
            'thumbnail': 're:^https?://.*',
+            'uploader_id': 'kdla7pillole@iltrovatore.it',
+            'timestamp': 1443814869,
+            'upload_date': '20151002',
        },
    }, {
        # 'src' is a dictionary
@@ -49,26 +52,14 @@ class LA7IE(InfoExtractor):
            self._search_regex(r'videoLa7\(({[^;]+})\);', webpage, 'player data'),
            video_id, transform_source=js_to_json)

-        source = player_data['src']
-        source_urls = source.values() if isinstance(source, dict) else [source]
-
-        formats = []
-        for source_url in source_urls:
-            ext = determine_ext(source_url)
-            if ext == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(
-                    source_url, video_id, ext='mp4',
-                    entry_protocol='m3u8_native', m3u8_id='hls'))
-            else:
-                formats.append({
-                    'url': source_url,
-                })
-        self._sort_formats(formats)
-
        return {
+            '_type': 'url_transparent',
+            'url': smuggle_url('kaltura:103:%s' % player_data['vid'], {
+                'service_url': 'http://kdam.iltrovatore.it',
+            }),
            'id': video_id,
            'title': player_data['title'],
            'description': self._og_search_description(webpage, default=None),
            'thumbnail': player_data.get('poster'),
-            'formats': formats,
+            'ie_key': 'Kaltura',
        }
--- a/youtube_dl/extractor/leeco.py
+++ b/youtube_dl/extractor/leeco.py
@@ -20,7 +20,6 @@ from ..utils import (
    int_or_none,
    orderedSet,
    parse_iso8601,
-    sanitized_Request,
    str_or_none,
    url_basename,
    urshift,
@@ -121,16 +120,11 @@ class LeIE(InfoExtractor):
            'tkey': self.calc_time_key(int(time.time())),
            'domain': 'www.le.com'
        }
-        play_json_req = sanitized_Request(
-            'http://api.le.com/mms/out/video/playJson?' + compat_urllib_parse_urlencode(params)
-        )
-        cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-        if cn_verification_proxy:
-            play_json_req.add_header('Ytdl-request-proxy', cn_verification_proxy)

        play_json = self._download_json(
-            play_json_req,
-            media_id, 'Downloading playJson data')
+            'http://api.le.com/mms/out/video/playJson',
+            media_id, 'Downloading playJson data', query=params,
+            headers=self.geo_verification_headers())

        # Check for errors
        playstatus = play_json['playstatus']
--- a/youtube_dl/extractor/onionstudios.py
+++ b/youtube_dl/extractor/onionstudios.py
@@ -7,6 +7,7 @@ from .common import InfoExtractor
 from ..utils import (
    determine_ext,
    int_or_none,
+    float_or_none,
 )


@@ -15,15 +16,14 @@ class OnionStudiosIE(InfoExtractor):

    _TESTS = [{
        'url': 'http://www.onionstudios.com/videos/hannibal-charges-forward-stops-for-a-cocktail-2937',
-        'md5': 'd4851405d31adfadf71cd7a487b765bb',
+        'md5': 'e49f947c105b8a78a675a0ee1bddedfe',
        'info_dict': {
            'id': '2937',
            'ext': 'mp4',
            'title': 'Hannibal charges forward, stops for a cocktail',
-            'description': 'md5:e786add7f280b7f0fe237b64cc73df76',
            'thumbnail': 're:^https?://.*\.jpg$',
            'uploader': 'The A.V. Club',
-            'uploader_id': 'TheAVClub',
+            'uploader_id': 'the-av-club',
        },
    }, {
        'url': 'http://www.onionstudios.com/embed?id=2855&autoplay=true',
@@ -40,50 +40,39 @@ class OnionStudiosIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

-        webpage = self._download_webpage(
-            'http://www.onionstudios.com/embed?id=%s' % video_id, video_id)
+        video_data = self._download_json(
+            'http://www.onionstudios.com/video/%s.json' % video_id, video_id)
+
+        title = video_data['title']

        formats = []
-        for src in re.findall(r'<source[^>]+src="([^"]+)"', webpage):
-            ext = determine_ext(src)
-            if ext == 'm3u8':
+        for source in video_data.get('sources', []):
+            source_url = source.get('url')
+            if not source_url:
+                continue
+            content_type = source.get('content_type')
+            ext = determine_ext(source_url)
+            if content_type == 'application/x-mpegURL' or ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
-                    src, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
+                    source_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
            else:
-                height = int_or_none(self._search_regex(
-                    r'/(\d+)\.%s' % ext, src, 'height', default=None))
+                tbr = int_or_none(source.get('bitrate'))
                formats.append({
-                    'format_id': ext + ('-%sp' % height if height else ''),
-                    'url': src,
-                    'height': height,
+                    'format_id': ext + ('-%d' % tbr if tbr else ''),
+                    'url': source_url,
+                    'width': int_or_none(source.get('width')),
+                    'tbr': tbr,
                    'ext': ext,
-                    'preference': 1,
                })
        self._sort_formats(formats)

-        title = self._search_regex(
-            r'share_title\s*=\s*(["\'])(?P<title>[^\1]+?)\1',
-            webpage, 'title', group='title')
-        description = self._search_regex(
-            r'share_description\s*=\s*(["\'])(?P<description>[^\'"]+?)\1',
-            webpage, 'description', default=None, group='description')
-        thumbnail = self._search_regex(
-            r'poster\s*=\s*(["\'])(?P<thumbnail>[^\1]+?)\1',
-            webpage, 'thumbnail', default=False, group='thumbnail')
-
-        uploader_id = self._search_regex(
-            r'twitter_handle\s*=\s*(["\'])(?P<uploader_id>[^\1]+?)\1',
-            webpage, 'uploader id', fatal=False, group='uploader_id')
-        uploader = self._search_regex(
-            r'window\.channelName\s*=\s*(["\'])Embedded:(?P<uploader>[^\1]+?)\1',
-            webpage, 'uploader', default=False, group='uploader')
-
        return {
            'id': video_id,
            'title': title,
-            'description': description,
-            'thumbnail': thumbnail,
-            'uploader': uploader,
-            'uploader_id': uploader_id,
+            'thumbnail': video_data.get('poster_url'),
+            'uploader': video_data.get('channel_name'),
+            'uploader_id': video_data.get('channel_slug'),
+            'duration': float_or_none(video_data.get('duration', 1000)),
+            'tags': video_data.get('tags'),
            'formats': formats,
        }
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -82,6 +82,10 @@ class PornHubIE(InfoExtractor):
        # removed by uploader
        'url': 'http://www.pornhub.com/view_video.php?viewkey=ph572716d15a111',
        'only_matching': True,
+    }, {
+        # private video
+        'url': 'http://www.pornhub.com/view_video.php?viewkey=ph56fd731fce6b7',
+        'only_matching': True,
    }, {
        'url': 'https://www.thumbzilla.com/video/ph56c6114abd99a/horny-girlfriend-sex',
        'only_matching': True,
@@ -107,7 +111,7 @@ class PornHubIE(InfoExtractor):
        webpage = self._download_webpage(req, video_id)

        error_msg = self._html_search_regex(
-            r'(?s)<div[^>]+class=(["\']).*?\bremoved\b.*?\1[^>]*>(?P<error>.+?)</div>',
+            r'(?s)<div[^>]+class=(["\']).*?\b(?:removed|userMessageSection)\b.*?\1[^>]*>(?P<error>.+?)</div>',
            webpage, 'error message', default=None, group='error')
        if error_msg:
            error_msg = re.sub(r'\s+', ' ', error_msg)
--- a/youtube_dl/extractor/prosiebensat1.py
+++ b/youtube_dl/extractor/prosiebensat1.py
@@ -5,7 +5,7 @@ import re

 from hashlib import sha1
 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_urlencode
+from ..compat import compat_str
 from ..utils import (
    ExtractorError,
    determine_ext,
@@ -71,6 +71,7 @@ class ProSiebenSat1IE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+            'skip': 'This video is unavailable',
        },
        {
            'url': 'http://www.sixx.de/stars-style/video/sexy-laufen-in-ugg-boots-clip',
@@ -86,6 +87,7 @@ class ProSiebenSat1IE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+            'skip': 'This video is unavailable',
        },
        {
            'url': 'http://www.sat1.de/film/der-ruecktritt/video/im-interview-kai-wiesinger-clip',
@@ -101,6 +103,7 @@ class ProSiebenSat1IE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+            'skip': 'This video is unavailable',
        },
        {
            'url': 'http://www.kabeleins.de/tv/rosins-restaurants/videos/jagd-auf-fertigkost-im-elsthal-teil-2-ganze-folge',
@@ -116,6 +119,7 @@ class ProSiebenSat1IE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+            'skip': 'This video is unavailable',
        },
        {
            'url': 'http://www.ran.de/fussball/bundesliga/video/schalke-toennies-moechte-raul-zurueck-ganze-folge',
@@ -131,6 +135,7 @@ class ProSiebenSat1IE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+            'skip': 'This video is unavailable',
        },
        {
            'url': 'http://www.the-voice-of-germany.de/video/31-andreas-kuemmert-rocket-man-clip',
@@ -227,70 +232,42 @@ class ProSiebenSat1IE(InfoExtractor):
    ]

    def _extract_clip(self, url, webpage):
-        clip_id = self._html_search_regex(self._CLIPID_REGEXES, webpage, 'clip id')
+        clip_id = self._html_search_regex(
+            self._CLIPID_REGEXES, webpage, 'clip id')

        access_token = 'prosieben'
        client_name = 'kolibri-2.0.19-splec4'
        client_location = url

-        videos_api_url = 'http://vas.sim-technik.de/vas/live/v2/videos?%s' % compat_urllib_parse_urlencode({
-            'access_token': access_token,
-            'client_location': client_location,
-            'client_name': client_name,
-            'ids': clip_id,
-        })
-
-        video = self._download_json(videos_api_url, clip_id, 'Downloading videos JSON')[0]
+        video = self._download_json(
+            'http://vas.sim-technik.de/vas/live/v2/videos',
+            clip_id, 'Downloading videos JSON', query={
+                'access_token': access_token,
+                'client_location': client_location,
+                'client_name': client_name,
+                'ids': clip_id,
+            })[0]

        if video.get('is_protected') is True:
            raise ExtractorError('This video is DRM protected.', expected=True)

        duration = float_or_none(video.get('duration'))
-        source_ids = [source['id'] for source in video['sources']]
-        source_ids_str = ','.join(map(str, source_ids))
+        source_ids = [compat_str(source['id']) for source in video['sources']]

        g = '01!8d8F_)r9]4s[qeuXfP%'
+        client_id = g[:2] + sha1(''.join([clip_id, g, access_token, client_location, g, client_name]).encode('utf-8')).hexdigest()

-        client_id = g[:2] + sha1(''.join([clip_id, g, access_token, client_location, g, client_name])
-                                 .encode('utf-8')).hexdigest()
-
-        sources_api_url = 'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources?%s' % (clip_id, compat_urllib_parse_urlencode({
-            'access_token': access_token,
-            'client_id': client_id,
-            'client_location': client_location,
-            'client_name': client_name,
-        }))
-
-        sources = self._download_json(sources_api_url, clip_id, 'Downloading sources JSON')
+        sources = self._download_json(
+            'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources' % clip_id,
+            clip_id, 'Downloading sources JSON', query={
+                'access_token': access_token,
+                'client_id': client_id,
+                'client_location': client_location,
+                'client_name': client_name,
+            })
        server_id = sources['server_id']

-        client_id = g[:2] + sha1(''.join([g, clip_id, access_token, server_id,
-                                          client_location, source_ids_str, g, client_name])
-                                 .encode('utf-8')).hexdigest()
-
-        url_api_url = 'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url?%s' % (clip_id, compat_urllib_parse_urlencode({
-            'access_token': access_token,
-            'client_id': client_id,
-            'client_location': client_location,
-            'client_name': client_name,
-            'server_id': server_id,
-            'source_ids': source_ids_str,
-        }))
-
-        urls = self._download_json(url_api_url, clip_id, 'Downloading urls JSON')
-
        title = self._html_search_regex(self._TITLE_REGEXES, webpage, 'title')
-        description = self._html_search_regex(self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
-        thumbnail = self._og_search_thumbnail(webpage)
-
-        upload_date = unified_strdate(self._html_search_regex(
-            self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
-
-        formats = []
-
-        urls_sources = urls['sources']
-        if isinstance(urls_sources, dict):
-            urls_sources = urls_sources.values()

        def fix_bitrate(bitrate):
            bitrate = int_or_none(bitrate)
@@ -298,37 +275,73 @@ class ProSiebenSat1IE(InfoExtractor):
                return None
            return (bitrate // 1000) if bitrate % 1000 == 0 else bitrate

-        for source in urls_sources:
-            protocol = source['protocol']
-            source_url = source['url']
-            if protocol == 'rtmp' or protocol == 'rtmpe':
-                mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
-                if not mobj:
+        formats = []
+        for source_id in source_ids:
+            client_id = g[:2] + sha1(''.join([g, clip_id, access_token, server_id, client_location, source_id, g, client_name]).encode('utf-8')).hexdigest()
+            urls = self._download_json(
+                'http://vas.sim-technik.de/vas/live/v2/videos/%s/sources/url' % clip_id,
+                clip_id, 'Downloading urls JSON', fatal=False, query={
+                    'access_token': access_token,
+                    'client_id': client_id,
+                    'client_location': client_location,
+                    'client_name': client_name,
+                    'server_id': server_id,
+                    'source_ids': source_id,
+                })
+            if not urls:
+                continue
+            if urls.get('status_code') != 0:
+                raise ExtractorError('This video is unavailable', expected=True)
+            urls_sources = urls['sources']
+            if isinstance(urls_sources, dict):
+                urls_sources = urls_sources.values()
+            for source in urls_sources:
+                source_url = source.get('url')
+                if not source_url:
                    continue
-                path = mobj.group('path')
-                mp4colon_index = path.rfind('mp4:')
-                app = path[:mp4colon_index]
-                play_path = path[mp4colon_index:]
-                formats.append({
-                    'url': '%s/%s' % (mobj.group('url'), app),
-                    'app': app,
-                    'play_path': play_path,
-                    'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
-                    'page_url': 'http://www.prosieben.de',
-                    'vbr': fix_bitrate(source['bitrate']),
-                    'ext': 'mp4',
-                    'format_id': '%s_%s' % (source['cdn'], source['bitrate']),
-                })
-            elif 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
-                formats.extend(self._extract_f4m_formats(source_url, clip_id))
-            else:
-                formats.append({
-                    'url': source_url,
-                    'vbr': fix_bitrate(source['bitrate']),
-                })
-
+                protocol = source.get('protocol')
+                mimetype = source.get('mimetype')
+                if mimetype == 'application/f4m+xml' or 'f4mgenerator' in source_url or determine_ext(source_url) == 'f4m':
+                    formats.extend(self._extract_f4m_formats(
+                        source_url, clip_id, f4m_id='hds', fatal=False))
+                elif mimetype == 'application/x-mpegURL':
+                    formats.extend(self._extract_m3u8_formats(
+                        source_url, clip_id, 'mp4', 'm3u8_native',
+                        m3u8_id='hls', fatal=False))
+                else:
+                    tbr = fix_bitrate(source['bitrate'])
+                    if protocol in ('rtmp', 'rtmpe'):
+                        mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
+                        if not mobj:
+                            continue
+                        path = mobj.group('path')
+                        mp4colon_index = path.rfind('mp4:')
+                        app = path[:mp4colon_index]
+                        play_path = path[mp4colon_index:]
+                        formats.append({
+                            'url': '%s/%s' % (mobj.group('url'), app),
+                            'app': app,
+                            'play_path': play_path,
+                            'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
+                            'page_url': 'http://www.prosieben.de',
+                            'tbr': tbr,
+                            'ext': 'flv',
+                            'format_id': 'rtmp%s' % ('-%d' % tbr if tbr else ''),
+                        })
+                    else:
+                        formats.append({
+                            'url': source_url,
+                            'tbr': tbr,
+                            'format_id': 'http%s' % ('-%d' % tbr if tbr else ''),
+                        })
        self._sort_formats(formats)

+        description = self._html_search_regex(
+            self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
+        thumbnail = self._og_search_thumbnail(webpage)
+        upload_date = unified_strdate(self._html_search_regex(
+            self._UPLOAD_DATE_REGEXES, webpage, 'upload date', default=None))
+
        return {
            'id': clip_id,
            'title': title,
--- a/youtube_dl/extractor/rai.py
+++ b/youtube_dl/extractor/rai.py
@@ -20,17 +20,12 @@ class RaiBaseIE(InfoExtractor):
        formats = []

        for platform in ('mon', 'flash', 'native'):
-            headers = {}
-            # TODO: rename --cn-verification-proxy
-            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-            if cn_verification_proxy:
-                headers['Ytdl-request-proxy'] = cn_verification_proxy
-
            relinker = self._download_xml(
                relinker_url, video_id,
                note='Downloading XML metadata for platform %s' % platform,
                transform_source=fix_xml_ampersands,
-                query={'output': 45, 'pl': platform}, headers=headers)
+                query={'output': 45, 'pl': platform},
+                headers=self.geo_verification_headers())

            media_url = find_xpath_attr(relinker, './url', 'type', 'content').text
            if media_url == 'http://download.rai.it/video_no_available.mp4':
--- a/youtube_dl/extractor/rtvnh.py
+++ b/youtube_dl/extractor/rtvnh.py
@@ -9,7 +9,7 @@ class RTVNHIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?rtvnh\.nl/video/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://www.rtvnh.nl/video/131946',
-        'md5': '6e1d0ab079e2a00b6161442d3ceacfc1',
+        'md5': 'cdbec9f44550763c8afc96050fa747dc',
        'info_dict': {
            'id': '131946',
            'ext': 'mp4',
@@ -29,15 +29,29 @@ class RTVNHIE(InfoExtractor):
            raise ExtractorError(
                '%s returned error code %d' % (self.IE_NAME, status), expected=True)

-        formats = self._extract_smil_formats(
-            'http://www.rtvnh.nl/video/smil?m=' + video_id, video_id, fatal=False)
+        formats = []
+        rtmp_formats = self._extract_smil_formats(
+            'http://www.rtvnh.nl/video/smil?m=' + video_id, video_id)
+        formats.extend(rtmp_formats)

-        for item in meta['source']['fb']:
-            if item.get('type') == 'hls':
-                formats.extend(self._extract_m3u8_formats(
-                    item['file'], video_id, ext='mp4', entry_protocol='m3u8_native'))
-            elif item.get('type') == '':
-                formats.append({'url': item['file']})
+        for rtmp_format in rtmp_formats:
+            rtmp_url = '%s/%s' % (rtmp_format['url'], rtmp_format['play_path'])
+            rtsp_format = rtmp_format.copy()
+            del rtsp_format['play_path']
+            del rtsp_format['ext']
+            rtsp_format.update({
+                'format_id': rtmp_format['format_id'].replace('rtmp', 'rtsp'),
+                'url': rtmp_url.replace('rtmp://', 'rtsp://'),
+                'protocol': 'rtsp',
+            })
+            formats.append(rtsp_format)
+            http_base_url = rtmp_url.replace('rtmp://', 'http://')
+            formats.extend(self._extract_m3u8_formats(
+                http_base_url + '/playlist.m3u8', video_id, 'mp4',
+                'm3u8_native', m3u8_id='hls', fatal=False))
+            formats.extend(self._extract_f4m_formats(
+                http_base_url + '/manifest.f4m',
+                video_id, f4m_id='hds', fatal=False))
        self._sort_formats(formats)

        return {
--- a/youtube_dl/extractor/sandia.py
+++ b/youtube_dl/extractor/sandia.py
@@ -1,18 +1,12 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import itertools
 import json
-import re

 from .common import InfoExtractor
-from ..compat import compat_urlparse
 from ..utils import (
    int_or_none,
-    js_to_json,
    mimetype2ext,
-    sanitized_Request,
-    unified_strdate,
 )


@@ -27,7 +21,8 @@ class SandiaIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Xyce Software Training - Section 1',
            'description': 're:(?s)SAND Number: SAND 2013-7800.{200,}',
-            'upload_date': '20120904',
+            'upload_date': '20120409',
+            'timestamp': 1333983600,
            'duration': 7794,
        }
    }
@@ -35,81 +30,36 @@ class SandiaIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)

-        req = sanitized_Request(url)
-        req.add_header('Cookie', 'MediasitePlayerCaps=ClientPlugins=4')
-        webpage = self._download_webpage(req, video_id)
+        presentation_data = self._download_json(
+            'http://digitalops.sandia.gov/Mediasite/PlayerService/PlayerService.svc/json/GetPlayerOptions',
+            video_id, data=json.dumps({
+                'getPlayerOptionsRequest': {
+                    'ResourceId': video_id,
+                    'QueryString': '',
+                }
+            }), headers={
+                'Content-Type': 'application/json; charset=utf-8',
+            })['d']['Presentation']

-        js_path = self._search_regex(
-            r'<script type="text/javascript" src="(/Mediasite/FileServer/Presentation/[^"]+)"',
-            webpage, 'JS code URL')
-        js_url = compat_urlparse.urljoin(url, js_path)
-
-        js_code = self._download_webpage(
-            js_url, video_id, note='Downloading player')
-
-        def extract_str(key, **args):
-            return self._search_regex(
-                r'Mediasite\.PlaybackManifest\.%s\s*=\s*(.+);\s*?\n' % re.escape(key),
-                js_code, key, **args)
-
-        def extract_data(key, **args):
-            data_json = extract_str(key, **args)
-            if data_json is None:
-                return data_json
-            return self._parse_json(
-                data_json, video_id, transform_source=js_to_json)
+        title = presentation_data['Title']

        formats = []
-        for i in itertools.count():
-            fd = extract_data('VideoUrls[%d]' % i, default=None)
-            if fd is None:
-                break
-            formats.append({
-                'format_id': '%s' % i,
-                'format_note': fd['MimeType'].partition('/')[2],
-                'ext': mimetype2ext(fd['MimeType']),
-                'url': fd['Location'],
-                'protocol': 'f4m' if fd['MimeType'] == 'video/x-mp4-fragmented' else None,
-            })
+        for stream in presentation_data.get('Streams', []):
+            for fd in stream.get('VideoUrls', []):
+                formats.append({
+                    'format_id': fd['MediaType'],
+                    'format_note': fd['MimeType'].partition('/')[2],
+                    'ext': mimetype2ext(fd['MimeType']),
+                    'url': fd['Location'],
+                    'protocol': 'f4m' if fd['MimeType'] == 'video/x-mp4-fragmented' else None,
+                })
        self._sort_formats(formats)

-        slide_baseurl = compat_urlparse.urljoin(
-            url, extract_data('SlideBaseUrl'))
-        slide_template = slide_baseurl + re.sub(
-            r'\{0:D?([0-9+])\}', r'%0\1d', extract_data('SlideImageFileNameTemplate'))
-        slides = []
-        last_slide_time = 0
-        for i in itertools.count(1):
-            sd = extract_str('Slides[%d]' % i, default=None)
-            if sd is None:
-                break
-            timestamp = int_or_none(self._search_regex(
-                r'^Mediasite\.PlaybackManifest\.CreateSlide\("[^"]*"\s*,\s*([0-9]+),',
-                sd, 'slide %s timestamp' % i, fatal=False))
-            slides.append({
-                'url': slide_template % i,
-                'duration': timestamp - last_slide_time,
-            })
-            last_slide_time = timestamp
-        formats.append({
-            'format_id': 'slides',
-            'protocol': 'slideshow',
-            'url': json.dumps(slides),
-            'preference': -10000,  # Downloader not yet written
-        })
-        self._sort_formats(formats)
-
-        title = extract_data('Title')
-        description = extract_data('Description', fatal=False)
-        duration = int_or_none(extract_data(
-            'Duration', fatal=False), scale=1000)
-        upload_date = unified_strdate(extract_data('AirDate', fatal=False))
-
        return {
            'id': video_id,
            'title': title,
-            'description': description,
+            'description': presentation_data.get('Description'),
            'formats': formats,
-            'upload_date': upload_date,
-            'duration': duration,
+            'timestamp': int_or_none(presentation_data.get('UnixTime'), 1000),
+            'duration': int_or_none(presentation_data.get('Duration'), 1000),
        }
--- a/youtube_dl/extractor/slideshare.py
+++ b/youtube_dl/extractor/slideshare.py
@@ -9,6 +9,7 @@ from ..compat import (
 )
 from ..utils import (
    ExtractorError,
+    get_element_by_id,
 )


@@ -40,7 +41,7 @@ class SlideshareIE(InfoExtractor):
        bucket = info['jsplayer']['video_bucket']
        ext = info['jsplayer']['video_extension']
        video_url = compat_urlparse.urljoin(bucket, doc + '-SD.' + ext)
-        description = self._html_search_regex(
+        description = get_element_by_id('slideshow-description-paragraph', webpage) or self._html_search_regex(
            r'(?s)<p[^>]+itemprop="description"[^>]*>(.+?)</p>', webpage,
            'description', fatal=False)

@@ -51,5 +52,5 @@ class SlideshareIE(InfoExtractor):
            'ext': ext,
            'url': video_url,
            'thumbnail': info['slideshow']['pin_image_url'],
-            'description': description,
+            'description': description.strip() if description else None,
        }
--- a/youtube_dl/extractor/sohu.py
+++ b/youtube_dl/extractor/sohu.py
@@ -8,10 +8,7 @@ from ..compat import (
    compat_str,
    compat_urllib_parse_urlencode,
 )
-from ..utils import (
-    ExtractorError,
-    sanitized_Request,
-)
+from ..utils import ExtractorError


 class SohuIE(InfoExtractor):
@@ -96,15 +93,10 @@ class SohuIE(InfoExtractor):
            else:
                base_data_url = 'http://hot.vrs.sohu.com/vrs_flash.action?vid='

-            req = sanitized_Request(base_data_url + vid_id)
-
-            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-            if cn_verification_proxy:
-                req.add_header('Ytdl-request-proxy', cn_verification_proxy)
-
            return self._download_json(
-                req, video_id,
-                'Downloading JSON data for %s' % vid_id)
+                base_data_url + vid_id, video_id,
+                'Downloading JSON data for %s' % vid_id,
+                headers=self.geo_verification_headers())

        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
--- a/youtube_dl/extractor/spiegel.py
+++ b/youtube_dl/extractor/spiegel.py
@@ -4,8 +4,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..compat import compat_urlparse
 from .spiegeltv import SpiegeltvIE
+from ..compat import compat_urlparse
+from ..utils import (
+    extract_attributes,
+    unified_strdate,
+    get_element_by_attribute,
+)


 class SpiegelIE(InfoExtractor):
@@ -19,6 +24,7 @@ class SpiegelIE(InfoExtractor):
            'title': 'Vulkanausbruch in Ecuador: Der "Feuerschlund" ist wieder aktiv',
            'description': 'md5:8029d8310232196eb235d27575a8b9f4',
            'duration': 49,
+            'upload_date': '20130311',
        },
    }, {
        'url': 'http://www.spiegel.de/video/schach-wm-videoanalyse-des-fuenften-spiels-video-1309159.html',
@@ -29,6 +35,7 @@ class SpiegelIE(InfoExtractor):
            'title': 'Schach-WM in der Videoanalyse: Carlsen nutzt die Fehlgriffe des Titelverteidigers',
            'description': 'md5:c2322b65e58f385a820c10fa03b2d088',
            'duration': 983,
+            'upload_date': '20131115',
        },
    }, {
        'url': 'http://www.spiegel.de/video/astronaut-alexander-gerst-von-der-iss-station-beantwortet-fragen-video-1519126-embed.html',
@@ -38,6 +45,7 @@ class SpiegelIE(InfoExtractor):
            'ext': 'mp4',
            'description': 'SPIEGEL ONLINE-Nutzer durften den deutschen Astronauten Alexander Gerst über sein Leben auf der ISS-Station befragen. Hier kommen seine Antworten auf die besten sechs Fragen.',
            'title': 'Fragen an Astronaut Alexander Gerst: "Bekommen Sie die Tageszeiten mit?"',
+            'upload_date': '20140904',
        }
    }, {
        'url': 'http://www.spiegel.de/video/astronaut-alexander-gerst-von-der-iss-station-beantwortet-fragen-video-1519126-iframe.html',
@@ -52,10 +60,10 @@ class SpiegelIE(InfoExtractor):
        if SpiegeltvIE.suitable(handle.geturl()):
            return self.url_result(handle.geturl(), 'Spiegeltv')

-        title = re.sub(r'\s+', ' ', self._html_search_regex(
-            r'(?s)<(?:h1|div) class="module-title"[^>]*>(.*?)</(?:h1|div)>',
-            webpage, 'title'))
-        description = self._html_search_meta('description', webpage, 'description')
+        video_data = extract_attributes(self._search_regex(r'(<div[^>]+id="spVideoElements"[^>]+>)', webpage, 'video element', default=''))
+
+        title = video_data.get('data-video-title') or get_element_by_attribute('class', 'module-title', webpage)
+        description = video_data.get('data-video-teaser') or self._html_search_meta('description', webpage, 'description')

        base_url = self._search_regex(
            [r'server\s*:\s*(["\'])(?P<url>.+?)\1', r'var\s+server\s*=\s*"(?P<url>[^"]+)\"'],
@@ -87,8 +95,9 @@ class SpiegelIE(InfoExtractor):
        return {
            'id': video_id,
            'title': title,
-            'description': description,
+            'description': description.strip() if description else None,
            'duration': duration,
+            'upload_date': unified_strdate(video_data.get('data-video-date')),
            'formats': formats,
        }

--- a/youtube_dl/extractor/stitcher.py
+++ b/youtube_dl/extractor/stitcher.py
@@ -56,7 +56,7 @@ class StitcherIE(InfoExtractor):

        episode = self._parse_json(
            js_to_json(self._search_regex(
-                r'(?s)var\s+stitcher\s*=\s*({.+?});\n', webpage, 'episode config')),
+                r'(?s)var\s+stitcher(?:Config)?\s*=\s*({.+?});\n', webpage, 'episode config')),
            display_id)['config']['episode']

        title = unescapeHTML(episode['title'])
--- a/youtube_dl/extractor/xuite.py
+++ b/youtube_dl/extractor/xuite.py
@@ -67,6 +67,20 @@ class XuiteIE(InfoExtractor):
            'categories': ['電玩動漫'],
        },
        'skip': 'Video removed',
+    }, {
+        # Video with encoded media id
+        # from http://forgetfulbc.blogspot.com/2016/06/date.html
+        'url': 'http://vlog.xuite.net/embed/cE1xbENoLTI3NDQ3MzM2LmZsdg==?ar=0&as=0',
+        'info_dict': {
+            'id': 'cE1xbENoLTI3NDQ3MzM2LmZsdg==',
+            'ext': 'mp4',
+            'title': '男女平權只是口號？專家解釋約會時男生是否該幫女生付錢 (中字)',
+            'description': 'md5:f0abdcb69df300f522a5442ef3146f2a',
+            'timestamp': 1466160960,
+            'upload_date': '20160617',
+            'uploader': 'B.C. & Lowy',
+            'uploader_id': '232279340',
+        },
    }, {
        'url': 'http://vlog.xuite.net/play/S1dDUjdyLTMyOTc3NjcuZmx2/%E5%AD%AB%E7%87%95%E5%A7%BF-%E7%9C%BC%E6%B7%9A%E6%88%90%E8%A9%A9',
        'only_matching': True,
@@ -80,10 +94,9 @@ class XuiteIE(InfoExtractor):
    def base64_encode_utf8(data):
        return base64.b64encode(data.encode('utf-8')).decode('utf-8')

-    def _extract_flv_config(self, media_id):
-        base64_media_id = self.base64_encode_utf8(media_id)
+    def _extract_flv_config(self, encoded_media_id):
        flv_config = self._download_xml(
-            'http://vlog.xuite.net/flash/player?media=%s' % base64_media_id,
+            'http://vlog.xuite.net/flash/player?media=%s' % encoded_media_id,
            'flv config')
        prop_dict = {}
        for prop in flv_config.findall('./property'):
@@ -108,9 +121,14 @@ class XuiteIE(InfoExtractor):
                '%s returned error: %s' % (self.IE_NAME, error_msg),
                expected=True)

-        video_id = self._html_search_regex(
-            r'data-mediaid="(\d+)"', webpage, 'media id')
-        flv_config = self._extract_flv_config(video_id)
+        encoded_media_id = self._search_regex(
+            r'attributes\.name\s*=\s*"([^"]+)"', webpage,
+            'encoded media id', default=None)
+        if encoded_media_id is None:
+            video_id = self._html_search_regex(
+                r'data-mediaid="(\d+)"', webpage, 'media id')
+            encoded_media_id = self.base64_encode_utf8(video_id)
+        flv_config = self._extract_flv_config(encoded_media_id)

        FORMATS = {
            'audio': 'mp3',
--- a/youtube_dl/extractor/yahoo.py
+++ b/youtube_dl/extractor/yahoo.py
@@ -19,6 +19,7 @@ from ..utils import (
    mimetype2ext,
 )

+from .brightcove import BrightcoveNewIE
 from .nbc import NBCSportsVPlayerIE


@@ -227,7 +228,12 @@ class YahooIE(InfoExtractor):
        # Look for NBCSports iframes
        nbc_sports_url = NBCSportsVPlayerIE._extract_url(webpage)
        if nbc_sports_url:
-            return self.url_result(nbc_sports_url, 'NBCSportsVPlayer')
+            return self.url_result(nbc_sports_url, NBCSportsVPlayerIE.ie_key())
+
+        # Look for Brightcove New Studio embeds
+        bc_url = BrightcoveNewIE._extract_url(webpage)
+        if bc_url:
+            return self.url_result(bc_url, BrightcoveNewIE.ie_key())

        # Query result is often embedded in webpage as JSON. Sometimes explicit requests
        # to video API results in a failure with geo restriction reason therefore using
--- a/youtube_dl/extractor/youku.py
+++ b/youtube_dl/extractor/youku.py
@@ -16,7 +16,6 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    get_element_by_attribute,
-    sanitized_Request,
 )


@@ -218,14 +217,10 @@ class YoukuIE(InfoExtractor):
            headers = {
                'Referer': req_url,
            }
+            headers.update(self.geo_verification_headers())
            self._set_cookie('youku.com', 'xreferrer', 'http://www.youku.com')
-            req = sanitized_Request(req_url, headers=headers)

-            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
-            if cn_verification_proxy:
-                req.add_header('Ytdl-request-proxy', cn_verification_proxy)
-
-            raw_data = self._download_json(req, video_id, note=note)
+            raw_data = self._download_json(req_url, video_id, note=note, headers=headers)

            return raw_data['data']

--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -1730,6 +1730,39 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        }


+class YoutubeSharedVideoIE(InfoExtractor):
+    _VALID_URL = r'(?:https?:)?//(?:www\.)?youtube\.com/shared\?ci=(?P<id>[0-9A-Za-z_-]{11})'
+    IE_NAME = 'youtube:shared'
+
+    _TEST = {
+        'url': 'https://www.youtube.com/shared?ci=1nEzmT-M4fU',
+        'info_dict': {
+            'id': 'uPDB5I9wfp8',
+            'ext': 'webm',
+            'title': 'Pocoyo: 90 minutos de episódios completos Português para crianças - PARTE 3',
+            'description': 'md5:d9e4d9346a2dfff4c7dc4c8cec0f546d',
+            'upload_date': '20160219',
+            'uploader': 'Pocoyo - Português (BR)',
+            'uploader_id': 'PocoyoBrazil',
+        },
+        'add_ie': ['Youtube'],
+        'params': {
+            # There are already too many Youtube downloads
+            'skip_download': True,
+        },
+    }
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, video_id)
+
+        real_video_id = self._html_search_meta(
+            'videoId', webpage, 'YouTube video id', fatal=True)
+
+        return self.url_result(real_video_id, YoutubeIE.ie_key())
+
+
 class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
    IE_DESC = 'YouTube.com playlists'
    _VALID_URL = r"""(?x)(?:
@@ -1962,9 +1995,13 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
            channel_playlist_id = self._html_search_meta(
                'channelId', channel_page, 'channel id', default=None)
            if not channel_playlist_id:
-                channel_playlist_id = self._search_regex(
-                    r'data-(?:channel-external-|yt)id="([^"]+)"',
-                    channel_page, 'channel id', default=None)
+                channel_url = self._html_search_meta(
+                    ('al:ios:url', 'twitter:app:url:iphone', 'twitter:app:url:ipad'),
+                    channel_page, 'channel url', default=None)
+                if channel_url:
+                    channel_playlist_id = self._search_regex(
+                        r'vnd\.youtube://user/([0-9A-Za-z_-]+)',
+                        channel_url, 'channel id', default=None)
        if channel_playlist_id and channel_playlist_id.startswith('UC'):
            playlist_id = 'UU' + channel_playlist_id[2:]
            return self.url_result(
@@ -1987,6 +2024,15 @@ class YoutubeChannelIE(YoutubePlaylistBaseInfoExtractor):
                for video_id, video_title in self.extract_videos_from_page(channel_page)]
            return self.playlist_result(entries, channel_id)

+        try:
+            next(self._entries(channel_page, channel_id))
+        except StopIteration:
+            alert_message = self._html_search_regex(
+                r'(?s)<div[^>]+class=(["\']).*?\byt-alert-message\b.*?\1[^>]*>(?P<alert>[^<]+)</div>',
+                channel_page, 'alert', default=None, group='alert')
+            if alert_message:
+                raise ExtractorError('Youtube said: %s' % alert_message, expected=True)
+
        return self.playlist_result(self._entries(channel_page, channel_id), channel_id)


@@ -2000,7 +2046,8 @@ class YoutubeUserIE(YoutubeChannelIE):
        'url': 'https://www.youtube.com/user/TheLinuxFoundation',
        'playlist_mincount': 320,
        'info_dict': {
-            'title': 'TheLinuxFoundation',
+            'id': 'UUfX55Sx5hEFjoC3cNs6mCUQ',
+            'title': 'Uploads from The Linux Foundation',
        }
    }, {
        'url': 'ytuser:phihag',
@@ -2008,6 +2055,10 @@ class YoutubeUserIE(YoutubeChannelIE):
    }, {
        'url': 'https://www.youtube.com/c/gametrailers',
        'only_matching': True,
+    }, {
+        # This channel is not available.
+        'url': 'https://www.youtube.com/user/kananishinoSMEJ/videos',
+        'only_matching': True,
    }]

    @classmethod
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -209,11 +209,16 @@ def parseOpts(overrideArguments=None):
        action='store_const', const='::', dest='source_address',
        help='Make all connections via IPv6 (experimental)',
    )
+    network.add_option(
+        '--geo-verification-proxy',
+        dest='geo_verification_proxy', default=None, metavar='URL',
+        help='Use this proxy to verify the IP address for some geo-restricted sites. '
+        'The default proxy specified by --proxy (or none, if the options is not present) is used for the actual downloading. (experimental)'
+    )
    network.add_option(
        '--cn-verification-proxy',
        dest='cn_verification_proxy', default=None, metavar='URL',
-        help='Use this proxy to verify the IP address for some Chinese sites. '
-        'The default proxy specified by --proxy (or none, if the options is not present) is used for the actual downloading. (experimental)'
+        help=optparse.SUPPRESS_HELP,
    )

    selection = optparse.OptionGroup(parser, 'Video Selection')
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -1444,6 +1444,8 @@ def shell_quote(args):
 def smuggle_url(url, data):
    """ Pass additional data in a URL for internal use. """

+    url, idata = unsmuggle_url(url, {})
+    data.update(idata)
    sdata = compat_urllib_parse_urlencode(
        {'__youtubedl_smuggle': json.dumps(data)})
    return url + '#' + sdata
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.07.03.1'
+__version__ = '2016.07.06'
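
Note on the `youtube_dl/utils.py` hunk above: with the two added lines, `smuggle_url` first unsmuggles its input, so smuggling an already-smuggled URL merges the payloads instead of appending a second fragment. The following is a minimal standalone sketch of that behaviour, not the project's code: it assumes Python 3 and substitutes `urllib.parse` for the `compat_*` helpers, and the `unsmuggle_url` body and the `SMUGGLE_KEY` constant are reconstructed for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode

# Fragment key taken from the hunk above; the constant name is an assumption.
SMUGGLE_KEY = '__youtubedl_smuggle'


def unsmuggle_url(smug_url, default=None):
    """Split a smuggled URL into (plain_url, data); assumed counterpart."""
    if '#' + SMUGGLE_KEY not in smug_url:
        return smug_url, default
    url, _, sdata = smug_url.rpartition('#')
    jsond = parse_qs(sdata)[SMUGGLE_KEY][0]
    return url, json.loads(jsond)


def smuggle_url(url, data):
    """Pass additional data in a URL for internal use."""
    # The two lines added in this release: recover any payload already
    # present, then merge it into the new data. Because update() is called
    # with the old payload, previously smuggled keys win on conflict, and
    # the caller's dict is modified in place.
    url, idata = unsmuggle_url(url, {})
    data.update(idata)
    sdata = urlencode({SMUGGLE_KEY: json.dumps(data)})
    return url + '#' + sdata


once = smuggle_url('http://example.com/v', {'a': 1})
twice = smuggle_url(once, {'b': 2})
plain_url, merged = unsmuggle_url(twice)
```

Here `plain_url` is the original address again and `merged` carries both keys, which is the case the `[test_utils] add test for smuggling a smuggled url` commit covers.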
Author	SHA1	Message	Date
Sergey M․	0e94b4713d	release 2016.07.06	2016-07-06 00:54:23 +07:00
Sergey M․	a6d3b89feb	[prosiebensat1] Make downloading urls JSON non fatal	2016-07-06 00:52:48 +07:00
Remita Amine	6c26815d63	[onionstudios] fix info extraction	2016-07-05 18:05:07 +01:00
Sergey M․	73c4ac2c95	[youtube:channel] Improve channel id extraction and detect unavailable channels (Closes #10009 )	2016-07-05 23:30:44 +07:00
Remita Amine	84f214d840	[prosiebensat1] extract all formats	2016-07-05 17:11:45 +01:00
Remita Amine	e3f88be7a9	[rtvnh] extract all formats	2016-07-05 14:45:39 +01:00
Remita Amine	31af3e35e0	[sandia] remove unused imports	2016-07-05 13:39:24 +01:00
Remita Amine	94a5cff91d	[sendia] fix info extraction	2016-07-05 13:37:46 +01:00
Remita Amine	77082c7b9e	[slideshare] fix description extraction	2016-07-05 12:01:04 +01:00
Remita Amine	252a1f75d2	[spiegel] improve info extraction	2016-07-05 11:46:25 +01:00
Remita Amine	5abf513cf8	[stitcher] fix episode config extraction	2016-07-05 10:44:16 +01:00
Yen Chi Hsuan	c6054e3201	[xuite] Support videos with already encoded media id	2016-07-05 14:26:42 +08:00
Yen Chi Hsuan	4080530624	[youtube:shared] Recognize the new 'shared' URLs Closes #10007	2016-07-05 13:15:05 +08:00
Sergey M․	c25f1a9b63	release 2016.07.05	2016-07-05 06:32:46 +07:00
Remita Amine	dfaa86b75e	[test_utils] add test for smuggling a smuggled url	2016-07-04 21:36:32 +01:00
Remita Amine	d9163ae3b6	[kaltura] fix extraction error for videos from multiple kaltura servers	2016-07-04 21:34:27 +01:00
Remita Amine	dafafe7cf1	[la7] extract more info from a kaltura custom server	2016-07-04 17:59:58 +01:00
Remita Amine	81953d1ae5	[kaltura] add support videos stored on custom kaltura servers(closes #5557 )	2016-07-04 17:59:58 +01:00
Yen Chi Hsuan	3a212ed62e	[iqiyi] Skip an unstable MD5 checksum	2016-07-04 11:25:46 +08:00
Sergey M․	195f084542	[pornhub] Detect private videos (Closes #9987 )	2016-07-04 03:27:00 +07:00
Sergey M․	aa7a455b2e	[README.md] Clarify configuration file may not exist by default	2016-07-04 01:24:33 +07:00
Sergey M․	6a4e659c93	[yahoo] Recognize brightcove embed (Closes #9995 )	2016-07-03 23:00:36 +07:00
Yen Chi Hsuan	40f3666f6b	[test/test_http] Update tests for `38cce791c7`	2016-07-03 23:50:55 +08:00
Remita Amine	dd801bbe18	[brightcove] improve error detection	2016-07-03 16:37:22 +01:00
Yen Chi Hsuan	38cce791c7	Rename --cn-verfication-proxy to --geo-verification-proxy And deprecate the former one Since commit `f138873900`, this option is not limited to China websites, so rename it.	2016-07-03 23:29:56 +08:00
Sergey M․	bf3ae6a543	[devscripts/show-downloads-statictics] Add script for displaying downloads statistics	2016-07-03 22:20:14 +07:00
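
Note on commit `38cce791c7` and the `youtube_dl/options.py` hunk: the old `--cn-verification-proxy` flag stays registered but is hidden from `--help` via `optparse.SUPPRESS_HELP`, while `--geo-verification-proxy` becomes the documented option. A minimal sketch of that deprecation pattern with the standard `optparse` module follows. The `resolve_geo_proxy` helper is hypothetical; how youtube-dl itself maps the old option onto the new one is not shown in this diff.

```python
import optparse

parser = optparse.OptionParser()
network = optparse.OptionGroup(parser, 'Network Options')
network.add_option(
    '--geo-verification-proxy',
    dest='geo_verification_proxy', default=None, metavar='URL',
    help='Use this proxy to verify the IP address for some geo-restricted sites.')
network.add_option(
    '--cn-verification-proxy',
    dest='cn_verification_proxy', default=None, metavar='URL',
    help=optparse.SUPPRESS_HELP)  # still parsed, but omitted from the help text
parser.add_option_group(network)


def resolve_geo_proxy(opts):
    """Hypothetical helper: prefer the new option, fall back to the old one."""
    return opts.geo_verification_proxy or opts.cn_verification_proxy


opts, _ = parser.parse_args(['--cn-verification-proxy', 'http://127.0.0.1:3128'])
proxy = resolve_geo_proxy(opts)
help_text = parser.format_help()
```

Existing command lines that pass the old flag keep working under this pattern, and only the new name is advertised to users.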