release 2015.06.25

[README.md] Fix typo
[extractor/generic] Add test for OnionStudios embeds
2015-06-25 07:43:49 +02:00 · 2015-06-25 01:10:42 +06:00 · 2015-06-24 23:23:16 +06:00 · 2015-06-24 23:19:50 +06:00 · 2015-06-24 23:16:33 +06:00 · 2015-06-24 23:12:13 +06:00
68 changed files with 2346 additions and 533 deletions
--- a/1
+++ b/1
@@ -127,3 +127,4 @@ Julian Richen
 Ping O.
 Mister Hat
 Peter Ding
 jackyzy823
--- a/README.md
+++ b/README.md
@@ -52,8 +52,9 @@ which means you can modify it, redistribute it or use it however you like.
    -i, --ignore-errors              Continue on download errors, for example to skip unavailable videos in a playlist
    --abort-on-error                 Abort downloading of further videos (in the playlist or the command line) if an error occurs
    --dump-user-agent                Display the current browser identification
-    --list-extractors                List all supported extractors and the URLs they would handle
+    --list-extractors                List all supported extractors
    --extractor-descriptions         Output descriptions of all supported extractors
    --force-generic-extractor        Force extraction to use the generic extractor
    --default-search PREFIX          Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple".
                                     Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The
                                     default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.
@@ -223,7 +224,7 @@ which means you can modify it, redistribute it or use it however you like.
                                     parameters replace existing values. Additional templates: %(album)s, %(artist)s. Example: --metadata-from-title "%(artist)s -
                                     %(title)s" matches a title like "Coldplay - Paradise"
    --xattrs                         Write metadata to the video file's xattrs (using dublin core and xdg standards)
-    --fixup POLICY                   Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn(the default;
+    --fixup POLICY                   Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn (the default;
                                     fix file if we can, warn otherwise)
    --prefer-avconv                  Prefer avconv over ffmpeg for running the postprocessors (default)
    --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the postprocessors
@@ -379,7 +380,7 @@ In February 2015, the new YouTube player contained a character sequence in a str
 ### HTTP Error 429: Too Many Requests or 402: Payment Required
-These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--network-address` options](#network-options) to select another IP address.
+These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
 ### SyntaxError: Non-ASCII character ###
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -17,6 +17,7 @@
 - **AcademicEarth:Course**
 - **AddAnime**
 - **AdobeTV**
 - **AdobeTVVideo**
 - **AdultSwim**
 - **Aftenposten**
 - **Aftonbladet**
@@ -110,6 +111,7 @@
 - **dailymotion**
 - **dailymotion:playlist**
 - **dailymotion:user**
 - **DailymotionCloud**
 - **daum.net**
 - **DBTV**
 - **DctpTv**
@@ -120,6 +122,8 @@
 - **divxstage**: DivxStage
 - **Dotsub**
 - **DouyuTV**
 - **dramafever**
 - **dramafever:series**
 - **DRBonanza**
 - **Dropbox**
 - **DrTuber**
@@ -153,6 +157,7 @@
 - **fernsehkritik.tv**
 - **fernsehkritik.tv:postecke**
 - **Firstpost**
 - **FiveTV**
 - **Flickr**
 - **Folketinget**: Folketinget (ft.dk; Danish parliament)
 - **FootyRoom**
@@ -217,6 +222,7 @@
 - **instagram:user**: Instagram user profile
 - **InternetVideoArchive**
 - **IPrima**
 - **iqiyi**
 - **ivi**: ivi.ru
 - **ivi:compilation**: ivi.ru compilations
 - **Izlesene**
@@ -340,6 +346,7 @@
 - **Odnoklassniki**
 - **OktoberfestTV**
 - **on.aol.com**
 - **OnionStudios**
 - **Ooyala**
 - **OoyalaExternal**
 - **OpenFilm**
@@ -353,6 +360,7 @@
 - **PhilharmonieDeParis**: Philharmonie de Paris
 - **Phoenix**
 - **Photobucket**
 - **Pinkbike**
 - **Pladform**
 - **PlanetaPlay**
 - **play.fm**
@@ -407,6 +415,7 @@
 - **rutube:movie**: Rutube movies
 - **rutube:person**: Rutube person videos
 - **RUTV**: RUTV.RU
 - **Ruutu**
 - **safari**: safaribooksonline.com online video
 - **safari:course**: safaribooksonline.com online courses
 - **Sandia**: Sandia National Laboratories
@@ -519,6 +528,8 @@
 - **TV2**
 - **TV2Article**
 - **TV4**: tv4.se and tv4play.se
 - **TVC**
 - **TVCArticle**
 - **tvigle**: Интернет-телевидение Tvigle.ru
 - **tvp.pl**
 - **tvp.pl:Series**
@@ -605,6 +616,7 @@
 - **XBef**
 - **XboxClips**
 - **XHamster**
 - **XHamsterEmbed**
 - **XMinus**
 - **XNXX**
 - **Xstream**
@@ -621,7 +633,7 @@
 - **YesJapan**
 - **Ynet**
 - **YouJizz**
- - **Youku**
+ - **youku**
 - **YouPorn**
 - **YourUpload**
 - **youtube**: YouTube.com
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -119,7 +119,7 @@ class YoutubeDL(object):
    username:          Username for authentication purposes.
    password:          Password for authentication purposes.
-    videopassword:     Password for acces a video.
+    videopassword:     Password for accessing a video.
    usenetrc:          Use netrc for authentication instead.
    verbose:           Print additional info to stdout.
    quiet:             Do not print messages to stdout.
@@ -139,6 +139,7 @@ class YoutubeDL(object):
    outtmpl:           Template for output names.
    restrictfilenames: Do not allow "&" and spaces in file names
    ignoreerrors:      Do not stop on download errors.
    force_generic_extractor: Force downloader to use the generic extractor
    nooverwrites:      Prevent overwriting files.
    playliststart:     Playlist item to start at.
    playlistend:       Playlist item to end at.
@@ -626,13 +627,16 @@ class YoutubeDL(object):
            info_dict.setdefault(key, value)
    def extract_info(self, url, download=True, ie_key=None, extra_info={},
-                     process=True):
+                     process=True, force_generic_extractor=False):
        '''
        Returns a list with a dictionary for each video we find.
        If 'download', also downloads the videos.
        extra_info is a dict containing the extra values to add to each result
        '''
        if not ie_key and force_generic_extractor:
            ie_key = 'Generic'
        if ie_key:
            ies = [self.get_info_extractor(ie_key)]
        else:
@@ -1016,13 +1020,13 @@ class YoutubeDL(object):
            info_dict['display_id'] = info_dict['id']
        if info_dict.get('upload_date') is None and info_dict.get('timestamp') is not None:
-            # Working around negative timestamps in Windows
-            # (see http://bugs.python.org/issue1646728)
-            if info_dict['timestamp'] < 0 and os.name == 'nt':
-                info_dict['timestamp'] = 0
-            upload_date = datetime.datetime.utcfromtimestamp(
-                info_dict['timestamp'])
-            info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
+            # Working around out-of-range timestamp values (e.g. negative ones on Windows,
+            # see http://bugs.python.org/issue1646728)
+            try:
+                upload_date = datetime.datetime.utcfromtimestamp(info_dict['timestamp'])
+                info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
+            except (ValueError, OverflowError, OSError):
+                pass
        if self.params.get('listsubtitles', False):
            if 'automatic_captions' in info_dict:
@@ -1033,12 +1037,6 @@ class YoutubeDL(object):
            info_dict['id'], info_dict.get('subtitles'),
            info_dict.get('automatic_captions'))
        # This extractors handle format selection themselves
        if info_dict['extractor'] in ['Youku']:
            if download:
                self.process_info(info_dict)
            return info_dict
        # We now pick which formats have to be downloaded
        if info_dict.get('formats') is None:
            # There's only one format available
@@ -1499,7 +1497,8 @@ class YoutubeDL(object):
        for url in url_list:
            try:
                # It also downloads the videos
-                res = self.extract_info(url)
+                res = self.extract_info(
                    url, force_generic_extractor=self.params.get('force_generic_extractor', False))
            except UnavailableVideoError:
                self.report_error('unable to download video')
            except MaxDownloadsReached:
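The reworked `upload_date` handling in the YoutubeDL.py hunk above can be tried outside youtube-dl. What follows is a minimal sketch, not the project's code: `upload_date_from_timestamp` is a hypothetical helper, and it uses the timezone-aware `datetime.fromtimestamp` rather than `utcfromtimestamp` (an assumption made so that it runs without deprecation warnings on recent Python 3).

```python
import datetime


def upload_date_from_timestamp(timestamp):
    """Return a YYYYMMDD string for a Unix timestamp, or None if it is out of range.

    Hypothetical helper for illustration only; it mirrors the try/except idea
    of the hunk rather than reproducing youtube-dl's code.
    """
    try:
        moment = datetime.datetime.fromtimestamp(timestamp, datetime.timezone.utc)
    except (ValueError, OverflowError, OSError):
        # Out-of-range values (for example negative ones on some Windows
        # builds) yield no upload date instead of aborting the extraction.
        return None
    return moment.strftime('%Y%m%d')


print(upload_date_from_timestamp(1303099200))  # an ordinary timestamp
print(upload_date_from_timestamp(10 ** 30))    # far outside the supported range
```

Catching the exception instead of testing `os.name` covers every platform limit with a single code path, at the cost of silently leaving `upload_date` unset.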
--- a/youtube_dl/__init__.py
+++ b/youtube_dl/__init__.py
@@ -293,6 +293,7 @@ def _real_main(argv=None):
        'autonumber_size': opts.autonumber_size,
        'restrictfilenames': opts.restrictfilenames,
        'ignoreerrors': opts.ignoreerrors,
        'force_generic_extractor': opts.force_generic_extractor,
        'ratelimit': opts.ratelimit,
        'nooverwrites': opts.nooverwrites,
        'retries': opts_retries,
--- a/youtube_dl/extractor/__init__.py
+++ b/youtube_dl/extractor/__init__.py
@@ -4,7 +4,10 @@ from .abc import ABCIE
 from .abc7news import Abc7NewsIE
 from .academicearth import AcademicEarthCourseIE
 from .addanime import AddAnimeIE
-from .adobetv import AdobeTVIE
+from .adobetv import (
    AdobeTVIE,
    AdobeTVVideoIE,
 )
 from .adultswim import AdultSwimIE
 from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
@@ -103,6 +106,7 @@ from .dailymotion import (
    DailymotionIE,
    DailymotionPlaylistIE,
    DailymotionUserIE,
    DailymotionCloudIE,
 )
 from .daum import DaumIE
 from .dbtv import DBTVIE
@@ -112,6 +116,10 @@ from .dfb import DFBIE
 from .dhm import DHMIE
 from .dotsub import DotsubIE
 from .douyutv import DouyuTVIE
 from .dramafever import (
    DramaFeverIE,
    DramaFeverSeriesIE,
 )
 from .dreisat import DreiSatIE
 from .drbonanza import DRBonanzaIE
 from .drtuber import DrTuberIE
@@ -152,6 +160,7 @@ from .fc2 import FC2IE
 from .firstpost import FirstpostIE
 from .firsttv import FirstTVIE
 from .fivemin import FiveMinIE
 from .fivetv import FiveTVIE
 from .fktv import (
    FKTVIE,
    FKTVPosteckeIE,
@@ -229,6 +238,7 @@ from .infoq import InfoQIE
 from .instagram import InstagramIE, InstagramUserIE
 from .internetvideoarchive import InternetVideoArchiveIE
 from .iprima import IPrimaIE
 from .iqiyi import IqiyiIE
 from .ivi import (
    IviIE,
    IviCompilationIE
@@ -378,6 +388,7 @@ from .nytimes import (
 from .nuvid import NuvidIE
 from .odnoklassniki import OdnoklassnikiIE
 from .oktoberfesttv import OktoberfestTVIE
 from .onionstudios import OnionStudiosIE
 from .ooyala import (
    OoyalaIE,
    OoyalaExternalIE,
@@ -395,6 +406,7 @@ from .pbs import PBSIE
 from .philharmoniedeparis import PhilharmonieDeParisIE
 from .phoenix import PhoenixIE
 from .photobucket import PhotobucketIE
 from .pinkbike import PinkbikeIE
 from .planetaplay import PlanetaPlayIE
 from .pladform import PladformIE
 from .played import PlayedIE
@@ -453,6 +465,7 @@ from .rutube import (
    RutubePersonIE,
 )
 from .rutv import RUTVIE
 from .ruutu import RuutuIE
 from .sandia import SandiaIE
 from .safari import (
    SafariIE,
@@ -582,6 +595,10 @@ from .tv2 import (
    TV2ArticleIE,
 )
 from .tv4 import TV4IE
 from .tvc import (
    TVCIE,
    TVCArticleIE,
 )
 from .tvigle import TvigleIE
 from .tvp import TvpIE, TvpSeriesIE
 from .tvplay import TVPlayIE
@@ -685,7 +702,10 @@ from .wrzuta import WrzutaIE
 from .wsj import WSJIE
 from .xbef import XBefIE
 from .xboxclips import XboxClipsIE
-from .xhamster import XHamsterIE
+from .xhamster import (
    XHamsterIE,
    XHamsterEmbedIE,
 )
 from .xminus import XMinusIE
 from .xnxx import XNXXIE
 from .xstream import XstreamIE
--- a/youtube_dl/extractor/adobetv.py
+++ b/youtube_dl/extractor/adobetv.py
@@ -5,6 +5,8 @@ from ..utils import (
    parse_duration,
    unified_strdate,
    str_to_int,
    float_or_none,
    ISO639Utils,
 )
@@ -69,3 +71,61 @@ class AdobeTVIE(InfoExtractor):
            'view_count': view_count,
            'formats': formats,
        }
 class AdobeTVVideoIE(InfoExtractor):
    _VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
    _TEST = {
        # From https://helpx.adobe.com/acrobat/how-to/new-experience-acrobat-dc.html?set=acrobat--get-started--essential-beginners
        'url': 'https://video.tv.adobe.com/v/2456/',
        'md5': '43662b577c018ad707a63766462b1e87',
        'info_dict': {
            'id': '2456',
            'ext': 'mp4',
            'title': 'New experience with Acrobat DC',
            'description': 'New experience with Acrobat DC',
            'duration': 248.667,
        },
    }
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        player_params = self._parse_json(self._search_regex(
            r'var\s+bridge\s*=\s*([^;]+);', webpage, 'player parameters'),
            video_id)
        formats = [{
            'url': source['src'],
            'width': source.get('width'),
            'height': source.get('height'),
            'tbr': source.get('bitrate'),
        } for source in player_params['sources']]
        # For both metadata and downloaded files the duration varies among
        # formats. I just pick the max one
        duration = max(filter(None, [
            float_or_none(source.get('duration'), scale=1000)
            for source in player_params['sources']]))
        subtitles = {}
        for translation in player_params.get('translations', []):
            lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
            if lang_id not in subtitles:
                subtitles[lang_id] = []
            subtitles[lang_id].append({
                'url': translation['vttPath'],
                'ext': 'vtt',
            })
        return {
            'id': video_id,
            'formats': formats,
            'title': player_params['title'],
            'description': self._og_search_description(webpage),
            'duration': duration,
            'subtitles': subtitles,
        }
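The duration and subtitle handling in `AdobeTVVideoIE` can be sketched in isolation. The code below is an illustration under stated assumptions: `float_or_none` is a simplified stand-in for the youtube-dl utility of the same name, `longest_duration` and `group_subtitles` are hypothetical helpers, and the sample data is made up. Unlike the extractor, the sketch guards against an empty list, because `max()` raises `ValueError` when no source carries a duration.

```python
def float_or_none(value, scale=1):
    """Simplified stand-in for youtube-dl's helper (assumption, not the real code)."""
    try:
        return float(value) / scale
    except (TypeError, ValueError):
        return None


def longest_duration(sources):
    """Return the largest per-source duration in seconds, or None if none is set."""
    durations = [float_or_none(source.get('duration'), scale=1000) for source in sources]
    durations = [d for d in durations if d]
    # max() raises ValueError on an empty sequence, so guard explicitly.
    return max(durations) if durations else None


def group_subtitles(translations):
    """Group subtitle entries by their language code."""
    subtitles = {}
    for translation in translations:
        subtitles.setdefault(translation['language_w3c'], []).append({
            'url': translation['vttPath'],
            'ext': 'vtt',
        })
    return subtitles


# Made-up sample data shaped like the player parameters in the extractor.
sources = [{'duration': 248667}, {'duration': 248000}, {}]
translations = [
    {'language_w3c': 'en', 'vttPath': 'http://example.com/en.vtt'},
    {'language_w3c': 'de', 'vttPath': 'http://example.com/de.vtt'},
]
print(longest_duration(sources))
print(sorted(group_subtitles(translations)))
```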
--- a/youtube_dl/extractor/bbccouk.py
+++ b/youtube_dl/extractor/bbccouk.py
@@ -129,6 +129,20 @@ class BBCCoUkIE(InfoExtractor):
                'skip_download': True,
            },
            'skip': 'geolocation',
        }, {
            'url': 'http://www.bbc.co.uk/iplayer/episode/b05zmgwn/royal-academy-summer-exhibition',
            'info_dict': {
                'id': 'b05zmgw1',
                'ext': 'flv',
                'description': 'Kirsty Wark and Morgan Quaintance visit the Royal Academy as it prepares for its annual artistic extravaganza, meeting people who have come together to make the show unique.',
                'title': 'Royal Academy Summer Exhibition',
                'duration': 3540,
            },
            'params': {
                # rtmp download
                'skip_download': True,
            },
            'skip': 'geolocation',
        }, {
            'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
            'only_matching': True,
@@ -237,26 +251,11 @@ class BBCCoUkIE(InfoExtractor):
        for connection in self._extract_connections(media):
            captions = self._download_xml(connection.get('href'), programme_id, 'Downloading captions')
            lang = captions.get('{http://www.w3.org/XML/1998/namespace}lang', 'en')
            ps = captions.findall('./{0}body/{0}div/{0}p'.format('{http://www.w3.org/2006/10/ttaf1}'))
            srt = ''
            def _extract_text(p):
                if p.text is not None:
                    stripped_text = p.text.strip()
                    if stripped_text:
                        return stripped_text
                return ' '.join(span.text.strip() for span in p.findall('{http://www.w3.org/2006/10/ttaf1}span'))
            for pos, p in enumerate(ps):
                srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (str(pos), p.get('begin'), p.get('end'), _extract_text(p))
            subtitles[lang] = [
                {
                    'url': connection.get('href'),
                    'ext': 'ttml',
                },
                {
                    'data': srt,
                    'ext': 'srt',
                },
            ]
        return subtitles
@@ -267,7 +266,7 @@ class BBCCoUkIE(InfoExtractor):
                programme_id, 'Downloading media selection XML')
        except ExtractorError as ee:
            if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
-                media_selection = xml.etree.ElementTree.fromstring(ee.cause.read().encode('utf-8'))
+                media_selection = xml.etree.ElementTree.fromstring(ee.cause.read().decode('utf-8'))
            else:
                raise
@@ -362,7 +361,7 @@ class BBCCoUkIE(InfoExtractor):
            formats, subtitles = self._download_media_selector(programme_id)
            title = self._og_search_title(webpage)
            description = self._search_regex(
-                r'<p class="medium-description">([^<]+)</p>',
+                r'<p class="[^"]*medium-description[^"]*">([^<]+)</p>',
                webpage, 'description', fatal=False)
        else:
            programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
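The relaxed description pattern in the last bbccouk.py hunk matters when the `class` attribute lists more than one class. The following sketch compares the old and new patterns on a made-up HTML fragment; the markup is an assumption for illustration, not taken from a real BBC page.

```python
import re

OLD_PATTERN = r'<p class="medium-description">([^<]+)</p>'
NEW_PATTERN = r'<p class="[^"]*medium-description[^"]*">([^<]+)</p>'

# Hypothetical markup with an additional class on the paragraph.
html = '<p class="synopsis medium-description">A look behind the scenes.</p>'

print(re.search(OLD_PATTERN, html))           # no match: the attribute is not an exact string match
print(re.search(NEW_PATTERN, html).group(1))  # the description text
```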
--- a/youtube_dl/extractor/bilibili.py
+++ b/youtube_dl/extractor/bilibili.py
@@ -105,7 +105,7 @@ class BiliBiliIE(InfoExtractor):
                'filesize': int_or_none(
                    lq_durl.find('./size'), get_attr='text'),
            }]
-            if hq_durl:
+            if hq_durl is not None:
                formats.append({
                    'format_id': 'hq',
                    'quality': 2,
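The switch from `if hq_durl:` to `if hq_durl is not None:` reflects how `xml.etree.ElementTree` elements behave: an element's length is its number of children, and truth testing has historically followed that length. The sketch below uses a made-up document (not a real BiliBili response) to show a node that exists but has no children.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <durl> exists but contains only text, no child elements.
doc = ET.fromstring('<video><durl>http://example.com/hq.flv</durl></video>')
hq_durl = doc.find('./durl')
missing = doc.find('./nonexistent')

# find() returns None only when nothing matches, so "is not None" is the
# reliable existence check; len() counts child elements, not text.
print(hq_durl is not None)
print(len(hq_durl))
print(missing is None)
```

Comparing against `None` expresses the intent directly and does not depend on how a given Python version evaluates an element in a boolean context.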
--- a/youtube_dl/extractor/brightcove.py
+++ b/youtube_dl/extractor/brightcove.py
@@ -13,6 +13,7 @@ from ..compat import (
    compat_urllib_parse_urlparse,
    compat_urllib_request,
    compat_urlparse,
    compat_xml_parse_error,
 )
 from ..utils import (
    determine_ext,
@@ -119,7 +120,7 @@ class BrightcoveIE(InfoExtractor):
        try:
            object_doc = xml.etree.ElementTree.fromstring(object_str.encode('utf-8'))
-        except xml.etree.ElementTree.ParseError:
+        except compat_xml_parse_error:
            return
        fv_el = find_xpath_attr(object_doc, './param', 'name', 'flashVars')
@@ -156,6 +157,28 @@ class BrightcoveIE(InfoExtractor):
        linkBase = find_param('linkBaseURL')
        if linkBase is not None:
            params['linkBaseURL'] = linkBase
        return cls._make_brightcove_url(params)
    @classmethod
    def _build_brighcove_url_from_js(cls, object_js):
        # The layout of JS is as follows:
        # customBC.createVideo = function (width, height, playerID, playerKey, videoPlayer, VideoRandomID) {
        #   // build Brightcove <object /> XML
        # }
        m = re.search(
            r'''(?x)customBC\.createVideo\(
                .*?                                                  # skipping width and height
                ["\'](?P<playerID>\d+)["\']\s*,\s*                   # playerID
                ["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s*  # playerKey begins with AQ and is 50 characters
                                                                     # in length, however it's appended to itself
                                                                     # in places, so truncate
                ["\'](?P<videoID>\d+)["\']                           # @videoPlayer
            ''', object_js)
        if m:
            return cls._make_brightcove_url(m.groupdict())
    @classmethod
    def _make_brightcove_url(cls, params):
        data = compat_urllib_parse.urlencode(params)
        return cls._FEDERATED_URL_TEMPLATE % data
@@ -172,7 +195,7 @@ class BrightcoveIE(InfoExtractor):
        """Return a list of all Brightcove URLs from the webpage """
        url_m = re.search(
-            r'<meta\s+property="og:video"\s+content="(https?://(?:secure|c)\.brightcove.com/[^"]+)"',
+            r'<meta\s+property=[\'"]og:video[\'"]\s+content=[\'"](https?://(?:secure|c)\.brightcove.com/[^\'"]+)[\'"]',
            webpage)
        if url_m:
            url = unescapeHTML(url_m.group(1))
@@ -188,7 +211,12 @@ class BrightcoveIE(InfoExtractor):
                [^>]*?>\s*<param\s+name="movie"\s+value="https?://[^/]*brightcove\.com/
            ).+?>\s*</object>''',
            webpage)
-        return list(filter(None, [cls._build_brighcove_url(m) for m in matches]))
+        if matches:
            return list(filter(None, [cls._build_brighcove_url(m) for m in matches]))
        return list(filter(None, [
            cls._build_brighcove_url_from_js(custom_bc)
            for custom_bc in re.findall(r'(customBC\.createVideo\(.+?\);)', webpage)]))
    def _real_extract(self, url):
        url, smuggled_data = unsmuggle_url(url, {})
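The JavaScript fallback added to brightcove.py can be checked against a sample call. This is a sketch under assumptions: the pattern is written with the dot before `createVideo` escaped, the `customBC.createVideo(...)` snippet and all IDs are made up, and the player key is simply `AQ` followed by 48 filler characters, repeated once to imitate the "appended to itself" case described in the comment.

```python
import re

CREATE_VIDEO_RE = re.compile(r'''(?x)
    customBC\.createVideo\(
    .*?                                                  # width and height
    ["\'](?P<playerID>\d+)["\']\s*,\s*                   # playerID
    ["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s*  # first 50 characters of playerKey
    ["\'](?P<videoID>\d+)["\']                           # @videoPlayer
''')

# Hypothetical values; a real key is opaque and site-specific.
player_key = 'AQ' + 'k' * 48
snippet = 'customBC.createVideo("640", "360", "1111111111", "%s", "2222222222", "rnd");' % (
    player_key * 2)

match = CREATE_VIDEO_RE.search(snippet)
params = match.groupdict()
print(params['playerID'], params['videoID'], len(params['playerKey']))
```

The lazy `.*?` skips the quoted width and height because neither is followed by a value starting with `AQ`, so the first numeric argument that is followed by the key becomes `playerID`.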
--- a/youtube_dl/extractor/cbs.py
+++ b/youtube_dl/extractor/cbs.py
@@ -4,12 +4,13 @@ from .common import InfoExtractor
 class CBSIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?cbs\.com/shows/[^/]+/(?:video|artist)/(?P<id>[^/]+)/.*'
+    _VALID_URL = r'https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))/[^/]+/(?P<id>[^/]+)'
    _TESTS = [{
        'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
        'info_dict': {
            'id': '4JUVEwq3wUT7',
            'display_id': 'connect-chat-feat-garth-brooks',
            'ext': 'flv',
            'title': 'Connect Chat feat. Garth Brooks',
            'description': 'Connect with country music singer Garth Brooks, as he chats with fans on Wednesday November 27, 2013. Be sure to tune in to Garth Brooks: Live from Las Vegas, Friday November 29, at 9/8c on CBS!',
@@ -24,6 +25,7 @@ class CBSIE(InfoExtractor):
        'url': 'http://www.cbs.com/shows/liveonletterman/artist/221752/st-vincent/',
        'info_dict': {
            'id': 'WWF_5KqY3PK1',
            'display_id': 'st-vincent',
            'ext': 'flv',
            'title': 'Live on Letterman - St. Vincent',
            'description': 'Live On Letterman: St. Vincent in concert from New York\'s Ed Sullivan Theater on Tuesday, July 16, 2014.',
@@ -34,12 +36,23 @@ class CBSIE(InfoExtractor):
            'skip_download': True,
        },
        '_skip': 'Blocked outside the US',
    }, {
        'url': 'http://colbertlateshow.com/video/8GmB0oY0McANFvp2aEffk9jZZZ2YyXxy/the-colbeard/',
        'only_matching': True,
    }, {
        'url': 'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
        'only_matching': True,
    }]
    def _real_extract(self, url):
-        video_id = self._match_id(url)
+        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        webpage = self._download_webpage(url, display_id)
        real_id = self._search_regex(
-            r"video\.settings\.pid\s*=\s*'([^']+)';",
+            [r"video\.settings\.pid\s*=\s*'([^']+)';", r"cbsplayer\.pid\s*=\s*'([^']+)';"],
            webpage, 'real video ID')
-        return self.url_result('theplatform:%s' % real_id)
+        return {
            '_type': 'url_transparent',
            'ie_key': 'ThePlatform',
            'url': 'theplatform:%s' % real_id,
            'display_id': display_id,
        }
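The new `_VALID_URL` for `CBSIE` captures the trailing slug as `id`, which the extractor then reports as `display_id`. A small sketch, using only the test URLs listed in the hunks above, shows what the pattern extracts; `VALID_URL` and `TEST_URLS` are names chosen for this example.

```python
import re

VALID_URL = (
    r'https?://(?:www\.)?'
    r'(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))'
    r'/[^/]+/(?P<id>[^/]+)'
)

TEST_URLS = [
    'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
    'http://www.cbs.com/shows/liveonletterman/artist/221752/st-vincent/',
    'http://colbertlateshow.com/video/8GmB0oY0McANFvp2aEffk9jZZZ2YyXxy/the-colbeard/',
    'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
]

# re.match anchors at the start of the URL, as youtube-dl does for _VALID_URL.
display_ids = [re.match(VALID_URL, url).group('id') for url in TEST_URLS]
print(display_ids)
```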
--- a/youtube_dl/extractor/cnet.py
+++ b/youtube_dl/extractor/cnet.py
@@ -11,7 +11,7 @@ from ..utils import (
 class CNETIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
        'info_dict': {
            'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
@@ -25,7 +25,20 @@ class CNETIE(InfoExtractor):
        'params': {
            'skip_download': 'requires rtmpdump',
        }
-    }
+    }, {
        'url': 'http://www.cnet.com/videos/whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187/',
        'info_dict': {
            'id': '56527b93-d25d-44e3-b738-f989ce2e49ba',
            'ext': 'flv',
            'description': 'Khail and Ashley wonder what other civic woes can be solved by self-tweeting objects, investigate a new kind of VR camera and watch an origami robot self-assemble, walk, climb, dig and dissolve. #TDPothole',
            'uploader_id': 'b163284d-6b73-44fc-b3e6-3da66c392d40',
            'uploader': 'Ashley Esqueda',
            'title': 'Whiny potholes tweet at local government when hit by cars (Tomorrow Daily 187)',
        },
        'params': {
            'skip_download': True,  # requires rtmpdump
        },
    }]
    def _real_extract(self, url):
        display_id = self._match_id(url)
@@ -42,7 +55,7 @@ class CNETIE(InfoExtractor):
            raise ExtractorError('Cannot find video data')
        mpx_account = data['config']['players']['default']['mpx_account']
-        vid = vdata['files']['rtmp']
+        vid = vdata['files'].get('rtmp', vdata['files']['hds'])
        tp_link = 'http://link.theplatform.com/s/%s/%s' % (mpx_account, vid)
        video_id = vdata['id']
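One detail of the `vdata['files'].get('rtmp', vdata['files']['hds'])` fallback is worth seeing in isolation: Python evaluates the default argument before calling `.get()`, so the `hds` lookup happens even when `rtmp` is present. The sketch below uses made-up dictionaries and two hypothetical helpers; whether CNET ever serves `rtmp` without `hds` is not something the diff tells us.

```python
def pick_eager(files):
    # Same shape as the expression in the hunk: files['hds'] is evaluated
    # before .get() runs, so a missing 'hds' key raises KeyError.
    return files.get('rtmp', files['hds'])


def pick_lazy(files):
    # Alternative sketch: only look up 'hds' when 'rtmp' is absent.
    return files['rtmp'] if 'rtmp' in files else files['hds']


print(pick_eager({'hds': 'hds-id'}))
print(pick_eager({'rtmp': 'rtmp-id', 'hds': 'hds-id'}))
print(pick_lazy({'rtmp': 'rtmp-id'}))

try:
    pick_eager({'rtmp': 'rtmp-id'})
except KeyError as error:
    print('KeyError:', error)
```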
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -846,7 +846,7 @@ class InfoExtractor(object):
    def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
                              entry_protocol='m3u8', preference=None,
-                              m3u8_id=None):
+                              m3u8_id=None, note=None, errnote=None):
        formats = [{
            'format_id': '-'.join(filter(None, [m3u8_id, 'meta'])),
@@ -865,8 +865,8 @@ class InfoExtractor(object):
        m3u8_doc = self._download_webpage(
            m3u8_url, video_id,
-            note='Downloading m3u8 information',
+            note=note or 'Downloading m3u8 information',
-            errnote='Failed to download m3u8 information')
+            errnote=errnote or 'Failed to download m3u8 information')
        last_info = None
        last_media = None
        kv_rex = re.compile(
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -251,3 +251,45 @@ class DailymotionUserIE(DailymotionPlaylistIE):
            'title': full_user,
            'entries': self._extract_entries(user),
        }
 class DailymotionCloudIE(DailymotionBaseInfoExtractor):
    _VALID_URL = r'http://api\.dmcloud\.net/embed/[^/]+/(?P<id>[^/?]+)'
    _TEST = {
        # From http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html
        # Tested at FranceTvInfo_2
        'url': 'http://api.dmcloud.net/embed/4e7343f894a6f677b10006b4/556e03339473995ee145930c?auth=1464865870-0-jyhsm84b-ead4c701fb750cf9367bf4447167a3db&autoplay=1',
        'only_matching': True,
    }
    @classmethod
    def _extract_dmcloud_url(self, webpage):
        mobj = re.search(r'<iframe[^>]+src=[\'"](http://api\.dmcloud\.net/embed/[^/]+/[^\'"]+)[\'"]', webpage)
        if mobj:
            return mobj.group(1)
        mobj = re.search(r'<input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=[\'"](http://api\.dmcloud\.net/embed/[^/]+/[^\'"]+)[\'"]', webpage)
        if mobj:
            return mobj.group(1)
    def _real_extract(self, url):
        video_id = self._match_id(url)
        request = self._build_request(url)
        webpage = self._download_webpage(request, video_id)
        title = self._html_search_regex(r'<title>([^>]+)</title>', webpage, 'title')
        video_info = self._parse_json(self._search_regex(
            r'var\s+info\s*=\s*([^;]+);', webpage, 'video info'), video_id)
        # TODO: parse ios_url, which is in fact a manifest
        video_url = video_info['mp4_url']
        return {
            'id': video_id,
            'url': video_url,
            'title': title,
            'thumbnail': video_info.get('thumbnail_url'),
        }
--- a/youtube_dl/extractor/discovery.py
+++ b/youtube_dl/extractor/discovery.py
@@ -2,19 +2,19 @@ from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..utils import (
    parse_duration,
    parse_iso8601,
    int_or_none,
 )
 from ..compat import compat_str
 class DiscoveryIE(InfoExtractor):
    _VALID_URL = r'http://www\.discovery\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9_\-]*)(?:\.htm)?'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mission-impossible-outtakes.htm',
        'md5': '3c69d77d9b0d82bfd5e5932a60f26504',
        'info_dict': {
-            'id': 'mission-impossible-outtakes',
+            'id': '20769',
-            'ext': 'flv',
+            'ext': 'mp4',
            'title': 'Mission Impossible Outtakes',
            'description': ('Watch Jamie Hyneman and Adam Savage practice being'
                            ' each other -- to the point of confusing Jamie\'s dog -- and '
@@ -24,22 +24,36 @@ class DiscoveryIE(InfoExtractor):
            'timestamp': 1303099200,
            'upload_date': '20110418',
        },
-    }
+        'params': {
            'skip_download': True,  # requires ffmpeg
        }
    }, {
        'url': 'http://www.discovery.com/tv-shows/mythbusters/videos/mythbusters-the-simpsons',
        'info_dict': {
            'id': 'mythbusters-the-simpsons',
            'title': 'MythBusters: The Simpsons',
        },
        'playlist_count': 9,
    }]
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
-        info = self._parse_json(self._search_regex(
-            r'(?s)<script type="application/ld\+json">(.*?)</script>',
-            webpage, 'video info'), video_id)
-        return {
-            'id': video_id,
-            'title': info['name'],
-            'url': info['contentURL'],
-            'description': info.get('description'),
-            'thumbnail': info.get('thumbnailUrl'),
-            'timestamp': parse_iso8601(info.get('uploadDate')),
-            'duration': int_or_none(info.get('duration')),
-        }
+        info = self._download_json(url + '?flat=1', video_id)
+        video_title = info.get('playlist_title') or info.get('video_title')
+        entries = [{
+            'id': compat_str(video_info['id']),
+            'formats': self._extract_m3u8_formats(
+                video_info['src'], video_id, ext='mp4',
+                note='Download m3u8 information for video %d' % (idx + 1)),
+            'title': video_info['title'],
+            'description': video_info.get('description'),
+            'duration': parse_duration(video_info.get('video_length')),
+            'webpage_url': video_info.get('href'),
+            'thumbnail': video_info.get('thumbnailURL'),
+            'alt_title': video_info.get('secondary_title'),
+            'timestamp': parse_iso8601(video_info.get('publishedDate')),
+        } for idx, video_info in enumerate(info['playlist'])]
+        return self.playlist_result(entries, video_id, video_title)
--- a/youtube_dl/extractor/dramafever.py
+++ b/youtube_dl/extractor/dramafever.py
@@ -0,0 +1,197 @@
 # encoding: utf-8
 from __future__ import unicode_literals
 import itertools
 from .common import InfoExtractor
 from ..compat import (
    compat_HTTPError,
    compat_urllib_parse,
    compat_urllib_request,
    compat_urlparse,
 )
 from ..utils import (
    ExtractorError,
    clean_html,
    determine_ext,
    int_or_none,
    parse_iso8601,
 )
 class DramaFeverBaseIE(InfoExtractor):
    _LOGIN_URL = 'https://www.dramafever.com/accounts/login/'
    _NETRC_MACHINE = 'dramafever'
    def _real_initialize(self):
        self._login()
    def _login(self):
        (username, password) = self._get_login_info()
        if username is None:
            return
        login_form = {
            'username': username,
            'password': password,
        }
        request = compat_urllib_request.Request(
            self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
        response = self._download_webpage(
            request, None, 'Logging in as %s' % username)
        if all(logout_pattern not in response
               for logout_pattern in ['href="/accounts/logout/"', '>Log out<']):
            error = self._html_search_regex(
                r'(?s)class="hidden-xs prompt"[^>]*>(.+?)<',
                response, 'error message', default=None)
            if error:
                raise ExtractorError('Unable to log in: %s' % error, expected=True)
            raise ExtractorError('Unable to log in')
 class DramaFeverIE(DramaFeverBaseIE):
    IE_NAME = 'dramafever'
    _VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
    _TEST = {
        'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
        'info_dict': {
            'id': '4512.1',
            'ext': 'flv',
            'title': 'Cooking with Shin 4512.1',
            'description': 'md5:a8eec7942e1664a6896fcd5e1287bfd0',
            'thumbnail': 're:^https?://.*\.jpg',
            'timestamp': 1404336058,
            'upload_date': '20140702',
            'duration': 343,
        }
    }
    def _real_extract(self, url):
        video_id = self._match_id(url).replace('/', '.')
        try:
            feed = self._download_json(
                'http://www.dramafever.com/amp/episode/feed.json?guid=%s' % video_id,
                video_id, 'Downloading episode JSON')['channel']['item']
        except ExtractorError as e:
            if isinstance(e.cause, compat_HTTPError):
                raise ExtractorError(
                    'Currently unavailable in your country.', expected=True)
            raise
        media_group = feed.get('media-group', {})
        formats = []
        for media_content in media_group['media-content']:
            src = media_content.get('@attributes', {}).get('url')
            if not src:
                continue
            ext = determine_ext(src)
            if ext == 'f4m':
                formats.extend(self._extract_f4m_formats(
                    src, video_id, f4m_id='hds'))
            elif ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
                    src, video_id, 'mp4', m3u8_id='hls'))
            else:
                formats.append({
                    'url': src,
                })
        self._sort_formats(formats)
        title = media_group.get('media-title')
        description = media_group.get('media-description')
        duration = int_or_none(media_group['media-content'][0].get('@attributes', {}).get('duration'))
        thumbnail = self._proto_relative_url(
            media_group.get('media-thumbnail', {}).get('@attributes', {}).get('url'))
        timestamp = parse_iso8601(feed.get('pubDate'), ' ')
        subtitles = {}
        for media_subtitle in media_group.get('media-subTitle', []):
            lang = media_subtitle.get('@attributes', {}).get('lang')
            href = media_subtitle.get('@attributes', {}).get('href')
            if not lang or not href:
                continue
            subtitles[lang] = [{
                'ext': 'ttml',
                'url': href,
            }]
        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'timestamp': timestamp,
            'duration': duration,
            'formats': formats,
            'subtitles': subtitles,
        }
 class DramaFeverSeriesIE(DramaFeverBaseIE):
    IE_NAME = 'dramafever:series'
    _VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
    _TESTS = [{
        'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
        'info_dict': {
            'id': '4512',
            'title': 'Cooking with Shin',
            'description': 'md5:84a3f26e3cdc3fb7f500211b3593b5c1',
        },
        'playlist_count': 4,
    }, {
        'url': 'http://www.dramafever.com/drama/124/IRIS/',
        'info_dict': {
            'id': '124',
            'title': 'IRIS',
            'description': 'md5:b3a30e587cf20c59bd1c01ec0ee1b862',
        },
        'playlist_count': 20,
    }]
    _CONSUMER_SECRET = 'DA59dtVXYLxajktV'
    _PAGE_SIZE = 60  # max is 60 (see http://api.drama9.com/#get--api-4-episode-series-)
    def _get_consumer_secret(self, video_id):
        mainjs = self._download_webpage(
            'http://www.dramafever.com/static/51afe95/df2014/scripts/main.js',
            video_id, 'Downloading main.js', fatal=False)
        if not mainjs:
            return self._CONSUMER_SECRET
        return self._search_regex(
            r"var\s+cs\s*=\s*'([^']+)'", mainjs,
            'consumer secret', default=self._CONSUMER_SECRET)
    def _real_extract(self, url):
        series_id = self._match_id(url)
        consumer_secret = self._get_consumer_secret(series_id)
        series = self._download_json(
            'http://www.dramafever.com/api/4/series/query/?cs=%s&series_id=%s'
            % (consumer_secret, series_id),
            series_id, 'Downloading series JSON')['series'][series_id]
        title = clean_html(series['name'])
        description = clean_html(series.get('description') or series.get('description_short'))
        entries = []
        for page_num in itertools.count(1):
            episodes = self._download_json(
                'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_size=%d&page_number=%d'
                % (consumer_secret, series_id, self._PAGE_SIZE, page_num),
                series_id, 'Downloading episodes JSON page #%d' % page_num)
            for episode in episodes.get('value', []):
                episode_url = episode.get('episode_url')
                if not episode_url:
                    continue
                entries.append(self.url_result(
                    compat_urlparse.urljoin(url, episode_url),
                    'DramaFever', episode.get('guid')))
            if page_num == episodes['num_pages']:
                break
        return self.playlist_result(entries, series_id, title, description)
--- a/youtube_dl/extractor/drbonanza.py
+++ b/youtube_dl/extractor/drbonanza.py
@@ -15,7 +15,6 @@ class DRBonanzaIE(InfoExtractor):
    _TESTS = [{
        'url': 'http://www.dr.dk/bonanza/serie/portraetter/Talkshowet.htm?assetId=65517',
        'md5': 'fe330252ddea607635cf2eb2c99a0af3',
        'info_dict': {
            'id': '65517',
            'ext': 'mp4',
@@ -26,6 +25,9 @@ class DRBonanzaIE(InfoExtractor):
            'upload_date': '20110120',
            'duration': 3664,
        },
        'params': {
            'skip_download': True,  # requires rtmp
        },
    }, {
        'url': 'http://www.dr.dk/bonanza/radio/serie/sport/fodbold.htm?assetId=59410',
        'md5': '6dfe039417e76795fb783c52da3de11d',
@@ -93,6 +95,11 @@ class DRBonanzaIE(InfoExtractor):
                        'format_id': file['Type'].replace('Video', ''),
                        'preference': preferencemap.get(file['Type'], -10),
                    })
                    if format['url'].startswith('rtmp'):
                        rtmp_url = format['url']
                        format['rtmp_live'] = True  # --resume does not work
                        if '/bonanza/' in rtmp_url:
                            format['play_path'] = rtmp_url.split('/bonanza/')[1]
                    formats.append(format)
                elif file['Type'] == "Thumb":
                    thumbnail = file['Location']
@@ -111,9 +118,6 @@ class DRBonanzaIE(InfoExtractor):
        description = '%s\n%s\n%s\n' % (
            info['Description'], info['Actors'], info['Colophon'])
        for f in formats:
            f['url'] = f['url'].replace('rtmp://vod-bonanza.gss.dr.dk/bonanza/', 'http://vodfiles.dr.dk/')
            f['url'] = f['url'].replace('mp4:bonanza', 'bonanza')
        self._sort_formats(formats)
        display_id = re.sub(r'[^\w\d-]', '', re.sub(r' ', '-', title.lower())) + '-' + asset_id
--- a/youtube_dl/extractor/faz.py
+++ b/youtube_dl/extractor/faz.py
@@ -6,9 +6,9 @@ from .common import InfoExtractor
 class FazIE(InfoExtractor):
    IE_NAME = 'faz.net'
-    _VALID_URL = r'https?://www\.faz\.net/multimedia/videos/.*?-(?P<id>\d+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?faz\.net/(?:[^/]+/)*.*?-(?P<id>\d+)\.html'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.faz.net/multimedia/videos/stockholm-chemie-nobelpreis-fuer-drei-amerikanische-forscher-12610585.html',
        'info_dict': {
            'id': '12610585',
@@ -16,7 +16,22 @@ class FazIE(InfoExtractor):
            'title': 'Stockholm: Chemie-Nobelpreis für drei amerikanische Forscher',
            'description': 'md5:1453fbf9a0d041d985a47306192ea253',
        },
-    }
+    }, {
        'url': 'http://www.faz.net/aktuell/politik/berlin-gabriel-besteht-zerreissprobe-ueber-datenspeicherung-13659345.html',
        'only_matching': True,
    }, {
        'url': 'http://www.faz.net/berlin-gabriel-besteht-zerreissprobe-ueber-datenspeicherung-13659345.html',
        'only_matching': True,
    }, {
        'url': 'http://www.faz.net/-13659345.html',
        'only_matching': True,
    }, {
        'url': 'http://www.faz.net/aktuell/politik/-13659345.html',
        'only_matching': True,
    }, {
        'url': 'http://www.faz.net/foobarblafasel-13659345.html',
        'only_matching': True,
    }]
    def _real_extract(self, url):
        video_id = self._match_id(url)
--- a/youtube_dl/extractor/fivetv.py
+++ b/youtube_dl/extractor/fivetv.py
@@ -0,0 +1,88 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..utils import int_or_none
 class FiveTVIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                    http://
                        (?:www\.)?5-tv\.ru/
                        (?:
                            (?:[^/]+/)+(?P<id>\d+)|
                            (?P<path>[^/?#]+)(?:[/?#])?
                        )
                    '''
    _TESTS = [{
        'url': 'http://5-tv.ru/news/96814/',
        'md5': 'bbff554ad415ecf5416a2f48c22d9283',
        'info_dict': {
            'id': '96814',
            'ext': 'mp4',
            'title': 'Россияне выбрали имя для общенациональной платежной системы',
            'description': 'md5:a8aa13e2b7ad36789e9f77a74b6de660',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 180,
        },
    }, {
        'url': 'http://5-tv.ru/video/1021729/',
        'info_dict': {
            'id': '1021729',
            'ext': 'mp4',
            'title': '3D принтер',
            'description': 'md5:d76c736d29ef7ec5c0cf7d7c65ffcb41',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 180,
        },
    }, {
        'url': 'http://www.5-tv.ru/glavnoe/#itemDetails',
        'info_dict': {
            'id': 'glavnoe',
            'ext': 'mp4',
            'title': 'Итоги недели с 8 по 14 июня 2015 года',
            'thumbnail': 're:^https?://.*\.jpg$',
        },
    }, {
        'url': 'http://www.5-tv.ru/glavnoe/broadcasts/508645/',
        'only_matching': True,
    }, {
        'url': 'http://5-tv.ru/films/1507502/',
        'only_matching': True,
    }, {
        'url': 'http://5-tv.ru/programs/broadcast/508713/',
        'only_matching': True,
    }, {
        'url': 'http://5-tv.ru/angel/',
        'only_matching': True,
    }, {
        'url': 'http://www.5-tv.ru/schedule/?iframe=true&width=900&height=450',
        'only_matching': True,
    }]
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id') or mobj.group('path')
        webpage = self._download_webpage(url, video_id)
        video_url = self._search_regex(
            r'<a[^>]+?href="([^"]+)"[^>]+?class="videoplayer"',
            webpage, 'video url')
        title = self._og_search_title(webpage, default=None) or self._search_regex(
            r'<title>([^<]+)</title>', webpage, 'title')
        duration = int_or_none(self._og_search_property(
            'video:duration', webpage, 'duration', default=None))
        return {
            'id': video_id,
            'url': video_url,
            'title': title,
            'description': self._og_search_description(webpage, default=None),
            'thumbnail': self._og_search_thumbnail(webpage, default=None),
            'duration': duration,
        }
--- a/youtube_dl/extractor/francetv.py
+++ b/youtube_dl/extractor/francetv.py
@@ -18,6 +18,7 @@ from ..utils import (
    parse_duration,
    determine_ext,
 )
 from .dailymotion import DailymotionCloudIE
 class FranceTVBaseInfoExtractor(InfoExtractor):
@@ -60,7 +61,7 @@ class FranceTVBaseInfoExtractor(InfoExtractor):
                    continue
                video_url_parsed = compat_urllib_parse_urlparse(video_url)
                f4m_url = self._download_webpage(
-                    'http://hdfauth.francetv.fr/esi/urltokengen2.html?url=%s' % video_url_parsed.path,
+                    'http://hdfauth.francetv.fr/esi/TA?url=%s' % video_url_parsed.path,
                    video_id, 'Downloading f4m manifest token', fatal=False)
                if f4m_url:
                    formats.extend(self._extract_f4m_formats(f4m_url, video_id, 1, format_id))
@@ -131,12 +132,26 @@ class FranceTvInfoIE(FranceTVBaseInfoExtractor):
            'skip_download': 'HLS (requires ffmpeg)'
        },
        'skip': 'Ce direct est terminé et sera disponible en rattrapage dans quelques minutes.',
    }, {
        'url': 'http://www.francetvinfo.fr/economie/entreprises/les-entreprises-familiales-le-secret-de-la-reussite_933271.html',
        'md5': 'f485bda6e185e7d15dbc69b72bae993e',
        'info_dict': {
            'id': '556e03339473995ee145930c',
            'ext': 'mp4',
            'title': 'Les entreprises familiales : le secret de la réussite',
            'thumbnail': 're:^https?://.*\.jpe?g$',
        }
    }]
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        page_title = mobj.group('title')
        webpage = self._download_webpage(url, page_title)
        dmcloud_url = DailymotionCloudIE._extract_dmcloud_url(webpage)
        if dmcloud_url:
            return self.url_result(dmcloud_url, 'DailymotionCloud')
        video_id, catalogue = self._search_regex(
            r'id-video=([^@]+@[^"]+)', webpage, 'video id').split('@')
        return self._extract_video(video_id, catalogue)
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -34,6 +34,7 @@ from .brightcove import BrightcoveIE
 from .nbc import NBCSportsVPlayerIE
 from .ooyala import OoyalaIE
 from .rutv import RUTVIE
 from .tvc import TVCIE
 from .sportbox import SportBoxEmbedIE
 from .smotri import SmotriIE
 from .condenast import CondeNastIE
@@ -41,6 +42,11 @@ from .udn import UDNEmbedIE
 from .senateisvp import SenateISVPIE
 from .bliptv import BlipTVIE
 from .svt import SVTIE
 from .pornhub import PornHubIE
 from .xhamster import XHamsterEmbedIE
 from .vimeo import VimeoIE
 from .dailymotion import DailymotionCloudIE
 from .onionstudios import OnionStudiosIE
 class GenericIE(InfoExtractor):
@@ -291,6 +297,15 @@ class GenericIE(InfoExtractor):
                'skip_download': True,
            },
        },
        # TVC embed
        {
            'url': 'http://sch1298sz.mskobr.ru/dou_edu/karamel_ki/filial_galleries/video/iframe_src_http_tvc_ru_video_iframe_id_55304_isplay_false_acc_video_id_channel_brand_id_11_show_episodes_episode_id_32307_frameb/',
            'info_dict': {
                'id': '55304',
                'ext': 'mp4',
                'title': 'Дошкольное воспитание',
            },
        },
        # SportBox embed
        {
            'url': 'http://www.vestifinance.ru/articles/25753',
@@ -322,6 +337,15 @@ class GenericIE(InfoExtractor):
                'skip_download': True,
            },
        },
        # XHamster embed
        {
            'url': 'http://www.numisc.com/forum/showthread.php?11696-FM15-which-pumiscer-was-this-%28-vid-%29-%28-alfa-as-fuck-srx-%29&s=711f5db534502e22260dec8c5e2d66d8',
            'info_dict': {
                'id': 'showthread',
                'title': '[NSFL] [FM15] which pumiscer was this ( vid ) ( alfa as fuck srx )',
            },
            'playlist_mincount': 7,
        },
        # Embedded TED video
        {
            'url': 'http://en.support.wordpress.com/videos/ted-talks/',
@@ -789,6 +813,53 @@ class GenericIE(InfoExtractor):
                # rtmpe downloads
                'skip_download': True,
            }
        },
        # Brightcove URL in single quotes
        {
            'url': 'http://www.sportsnet.ca/baseball/mlb/sn-presents-russell-martin-world-citizen/',
            'md5': '4ae374f1f8b91c889c4b9203c8c752af',
            'info_dict': {
                'id': '4255764656001',
                'ext': 'mp4',
                'title': 'SN Presents: Russell Martin, World Citizen',
                'description': 'To understand why he was the Toronto Blue Jays’ top off-season priority is to appreciate his background and upbringing in Montreal, where he first developed his baseball skills. Written and narrated by Stephen Brunt.',
                'uploader': 'Rogers Sportsnet',
            },
        },
        # Dailymotion Cloud video
        {
            'url': 'http://replay.publicsenat.fr/vod/le-debat/florent-kolandjian,dominique-cena,axel-decourtye,laurence-abeille,bruno-parmentier/175910',
            'md5': '49444254273501a64675a7e68c502681',
            'info_dict': {
                'id': '5585de919473990de4bee11b',
                'ext': 'mp4',
                'title': 'Le débat',
                'thumbnail': 're:^https?://.*\.jpe?g$',
            }
        },
        # OnionStudios embed
        {
            'url': 'http://www.clickhole.com/video/dont-understand-bitcoin-man-will-mumble-explanatio-2537',
            'info_dict': {
                'id': '2855',
                'ext': 'mp4',
                'title': 'Don’t Understand Bitcoin? This Man Will Mumble An Explanation At You',
                'thumbnail': 're:^https?://.*\.jpe?g$',
                'uploader': 'ClickHole',
                'uploader_id': 'clickhole',
            }
        },
        # AdobeTVVideo embed
        {
            'url': 'https://helpx.adobe.com/acrobat/how-to/new-experience-acrobat-dc.html?set=acrobat--get-started--essential-beginners',
            'md5': '43662b577c018ad707a63766462b1e87',
            'info_dict': {
                'id': '2456',
                'ext': 'mp4',
                'title': 'New experience with Acrobat DC',
                'description': 'New experience with Acrobat DC',
                'duration': 248.667,
            },
        }
    ]
@@ -956,7 +1027,9 @@ class GenericIE(InfoExtractor):
            }
        if not self._downloader.params.get('test', False) and not is_intentional:
-            self._downloader.report_warning('Falling back on generic information extractor.')
+            force = self._downloader.params.get('force_generic_extractor', False)
+            self._downloader.report_warning(
+                '%s on generic information extractor.' % ('Forcing' if force else 'Falling back'))
        if not full_response:
            request = compat_urllib_request.Request(url)
@@ -1061,23 +1134,14 @@ class GenericIE(InfoExtractor):
        # Look for embedded rtl.nl player
        matches = re.findall(
-            r'<iframe\s+(?:[a-zA-Z-]+="[^"]+"\s+)*?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+video_embed[^"]+)"',
+            r'<iframe[^>]+?src="((?:https?:)?//(?:www\.)?rtl\.nl/system/videoplayer/[^"]+(?:video_)?embed[^"]+)"',
            webpage)
        if matches:
            return _playlist_from_matches(matches, ie='RtlNl')
-        # Look for embedded (iframe) Vimeo player
-        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//player\.vimeo\.com/video/.+?)\1', webpage)
-        if mobj:
-            player_url = unescapeHTML(mobj.group('url'))
-            surl = smuggle_url(player_url, {'Referer': url})
-            return self.url_result(surl)
-        # Look for embedded (swf embed) Vimeo player
-        mobj = re.search(
-            r'<embed[^>]+?src="((?:https?:)?//(?:www\.)?vimeo\.com/moogaloop\.swf.+?)"', webpage)
-        if mobj:
-            return self.url_result(mobj.group(1))
+        vimeo_url = VimeoIE._extract_vimeo_url(url, webpage)
+        if vimeo_url is not None:
+            return self.url_result(vimeo_url)
        # Look for embedded YouTube player
        matches = re.findall(r'''(?x)
@@ -1289,11 +1353,32 @@ class GenericIE(InfoExtractor):
        if rutv_url:
            return self.url_result(rutv_url, 'RUTV')
        # Look for embedded TVC player
        tvc_url = TVCIE._extract_url(webpage)
        if tvc_url:
            return self.url_result(tvc_url, 'TVC')
        # Look for embedded SportBox player
        sportbox_urls = SportBoxEmbedIE._extract_urls(webpage)
        if sportbox_urls:
            return _playlist_from_matches(sportbox_urls, ie='SportBoxEmbed')
        # Look for embedded PornHub player
        pornhub_url = PornHubIE._extract_url(webpage)
        if pornhub_url:
            return self.url_result(pornhub_url, 'PornHub')
        # Look for embedded XHamster player
        xhamster_urls = XHamsterEmbedIE._extract_urls(webpage)
        if xhamster_urls:
            return _playlist_from_matches(xhamster_urls, ie='XHamsterEmbed')
        # Look for embedded Tvigle player
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//cloud\.tvigle\.ru/video/.+?)\1', webpage)
        if mobj is not None:
            return self.url_result(mobj.group('url'), 'Tvigle')
        # Look for embedded TED player
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>https?://embed(?:-ssl)?\.ted\.com/.+?)\1', webpage)
@@ -1455,6 +1540,25 @@ class GenericIE(InfoExtractor):
        if senate_isvp_url:
            return self.url_result(senate_isvp_url, 'SenateISVP')
        # Look for Dailymotion Cloud videos
        dmcloud_url = DailymotionCloudIE._extract_dmcloud_url(webpage)
        if dmcloud_url:
            return self.url_result(dmcloud_url, 'DailymotionCloud')
        # Look for OnionStudios embeds
        onionstudios_url = OnionStudiosIE._extract_url(webpage)
        if onionstudios_url:
            return self.url_result(onionstudios_url)
        # Look for AdobeTVVideo embeds
        mobj = re.search(
            r'<iframe[^>]+src=[\'"]((?:https?:)?//video\.tv\.adobe\.com/v/\d+[^"]+)[\'"]',
            webpage)
        if mobj is not None:
            return self.url_result(
                self._proto_relative_url(unescapeHTML(mobj.group(1))),
                'AdobeTVVideo')
        def check_video(vurl):
            if YoutubeIE.suitable(vurl):
                return True
--- a/youtube_dl/extractor/imdb.py
+++ b/youtube_dl/extractor/imdb.py
@@ -46,7 +46,7 @@ class ImdbIE(InfoExtractor):
            format_info = info['videoPlayerObject']['video']
            formats.append({
                'format_id': f_id,
-                'url': format_info['url'],
+                'url': format_info['videoInfoList'][0]['videoUrl'],
            })
        return {
--- a/youtube_dl/extractor/instagram.py
+++ b/youtube_dl/extractor/instagram.py
@@ -3,7 +3,10 @@ from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
-from ..utils import int_or_none
+from ..utils import (
+    int_or_none,
+    limit_length,
+)
 class InstagramIE(InfoExtractor):
@@ -100,11 +103,13 @@ class InstagramUserIE(InfoExtractor):
                thumbnails_el = it.get('images', {})
                thumbnail = thumbnails_el.get('thumbnail', {}).get('url')
-                title = it.get('caption', {}).get('text', it['id'])
+                # In some cases caption is null, which corresponds to None
+                # in Python. As a result, it.get('caption', {}) gives None
+                title = (it.get('caption') or {}).get('text', it['id'])
                entries.append({
                    'id': it['id'],
-                    'title': title,
+                    'title': limit_length(title, 80),
                    'formats': formats,
                    'thumbnail': thumbnail,
                    'webpage_url': it.get('link'),
--- a/youtube_dl/extractor/iqiyi.py
+++ b/youtube_dl/extractor/iqiyi.py
@@ -0,0 +1,296 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import hashlib
 import math
 import os.path
 import random
 import re
 import time
 import uuid
 import zlib
 from .common import InfoExtractor
 from ..compat import compat_urllib_parse
 from ..utils import (
    ExtractorError,
    url_basename,
 )
 class IqiyiIE(InfoExtractor):
    IE_NAME = 'iqiyi'
    _VALID_URL = r'http://(?:www\.)iqiyi\.com/v_.+?\.html'
    _TESTS = [{
        'url': 'http://www.iqiyi.com/v_19rrojlavg.html',
        'md5': '2cb594dc2781e6c941a110d8f358118b',
        'info_dict': {
            'id': '9c1fb1b99d192b21c559e5a1a2cb3c73',
            'title': '美国德州空中惊现奇异云团 酷似UFO',
            'ext': 'f4v',
        }
    }, {
        'url': 'http://www.iqiyi.com/v_19rrhnnclk.html',
        'info_dict': {
            'id': 'e3f585b550a280af23c98b6cb2be19fb',
            'title': '名侦探柯南第752集',
        },
        'playlist': [{
            'md5': '7e49376fecaffa115d951634917fe105',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part1',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '41b75ba13bb7ac0e411131f92bc4f6ca',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part2',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '0cee1dd0a3d46a83e71e2badeae2aab0',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part3',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '4f8ad72373b0c491b582e7c196b0b1f9',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part4',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': 'd89ad028bcfad282918e8098e811711d',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part5',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '9cb1e5c95da25dff0660c32ae50903b7',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part6',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '155116e0ff1867bbc9b98df294faabc9',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part7',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }, {
            'md5': '53f5db77622ae14fa493ed2a278a082b',
            'info_dict': {
                'id': 'e3f585b550a280af23c98b6cb2be19fb_part8',
                'ext': 'f4v',
                'title': '名侦探柯南第752集',
            },
        }],
    }]
    _FORMATS_MAP = [
        ('1', 'h6'),
        ('2', 'h5'),
        ('3', 'h4'),
        ('4', 'h3'),
        ('5', 'h2'),
        ('10', 'h1'),
    ]
    def construct_video_urls(self, data, video_id, _uuid):
        def do_xor(x, y):
            a = y % 3
            if a == 1:
                return x ^ 121
            if a == 2:
                return x ^ 72
            return x ^ 103
        def get_encode_code(l):
            a = 0
            b = l.split('-')
            c = len(b)
            s = ''
            for i in range(c - 1, -1, -1):
                a = do_xor(int(b[c - i - 1], 16), i)
                s += chr(a)
            return s[::-1]
        def get_path_key(x, format_id, segment_index):
            mg = ')(*&^flash@#$%a'
            tm = self._download_json(
                'http://data.video.qiyi.com/t?tn=' + str(random.random()), video_id,
                note='Download path key of segment %d for format %s' % (segment_index + 1, format_id)
            )['t']
            t = str(int(math.floor(int(tm) / (600.0))))
            return hashlib.md5((t + mg + x).encode('utf8')).hexdigest()
        video_urls_dict = {}
        for format_item in data['vp']['tkl'][0]['vs']:
            if 0 < int(format_item['bid']) <= 10:
                format_id = self.get_format(format_item['bid'])
            else:
                continue
            video_urls = []
            video_urls_info = format_item['fs']
            if not format_item['fs'][0]['l'].startswith('/'):
                t = get_encode_code(format_item['fs'][0]['l'])
                if t.endswith('mp4'):
                    video_urls_info = format_item['flvs']
            for segment_index, segment in enumerate(video_urls_info):
                vl = segment['l']
                if not vl.startswith('/'):
                    vl = get_encode_code(vl)
                key = get_path_key(
                    vl.split('/')[-1].split('.')[0], format_id, segment_index)
                filesize = segment['b']
                base_url = data['vp']['du'].split('/')
                base_url.insert(-1, key)
                base_url = '/'.join(base_url)
                param = {
                    'su': _uuid,
                    'qyid': uuid.uuid4().hex,
                    'client': '',
                    'z': '',
                    'bt': '',
                    'ct': '',
                    'tn': str(int(time.time()))
                }
                api_video_url = base_url + vl + '?' + \
                    compat_urllib_parse.urlencode(param)
                js = self._download_json(
                    api_video_url, video_id,
                    note='Download video info of segment %d for format %s' % (segment_index + 1, format_id))
                video_url = js['l']
                video_urls.append(
                    (video_url, filesize))
            video_urls_dict[format_id] = video_urls
        return video_urls_dict
    def get_format(self, bid):
        matched_format_ids = [_format_id for _bid, _format_id in self._FORMATS_MAP if _bid == str(bid)]
        return matched_format_ids[0] if len(matched_format_ids) else None
    def get_bid(self, format_id):
        matched_bids = [_bid for _bid, _format_id in self._FORMATS_MAP if _format_id == format_id]
        return matched_bids[0] if len(matched_bids) else None
    def get_raw_data(self, tvid, video_id, enc_key, _uuid):
        tm = str(int(time.time()))
        param = {
            'key': 'fvip',
            'src': hashlib.md5(b'youtube-dl').hexdigest(),
            'tvId': tvid,
            'vid': video_id,
            'vinfo': 1,
            'tm': tm,
            'enc': hashlib.md5(
                (enc_key + tm + tvid).encode('utf8')).hexdigest(),
            'qyid': _uuid,
            'tn': random.random(),
            'um': 0,
            'authkey': hashlib.md5(
                (tm + tvid).encode('utf8')).hexdigest()
        }
        api_url = 'http://cache.video.qiyi.com/vms' + '?' + \
            compat_urllib_parse.urlencode(param)
        raw_data = self._download_json(api_url, video_id)
        return raw_data
    def get_enc_key(self, swf_url, video_id):
        filename, _ = os.path.splitext(url_basename(swf_url))
        enc_key_json = self._downloader.cache.load('iqiyi-enc-key', filename)
        if enc_key_json is not None:
            return enc_key_json[0]
        req = self._request_webpage(
            swf_url, video_id, note='download swf content')
        cn = req.read()
        cn = zlib.decompress(cn[8:])
        pt = re.compile(b'MixerRemote\x08(?P<enc_key>.+?)\$&vv')
        enc_key = self._search_regex(pt, cn, 'enc_key').decode('utf8')
        self._downloader.cache.store('iqiyi-enc-key', filename, [enc_key])
        return enc_key
    def _real_extract(self, url):
        webpage = self._download_webpage(
            url, 'temp_id', note='download video page')
        tvid = self._search_regex(
            r'data-player-tvid\s*=\s*[\'"](\d+)', webpage, 'tvid')
        video_id = self._search_regex(
            r'data-player-videoid\s*=\s*[\'"]([a-f\d]+)', webpage, 'video_id')
        swf_url = self._search_regex(
            r'(http://[^\'"]+MainPlayer[^.]+\.swf)', webpage, 'swf player URL')
        _uuid = uuid.uuid4().hex
        enc_key = self.get_enc_key(swf_url, video_id)
        raw_data = self.get_raw_data(tvid, video_id, enc_key, _uuid)
        if raw_data['code'] != 'A000000':
            raise ExtractorError('Unable to load data. Error code: ' + raw_data['code'])
        if not raw_data['data']['vp']['tkl']:
            raise ExtractorError('iQiyi VIP videos are not supported')
        data = raw_data['data']
        title = data['vi']['vn']
        # generate video_urls_dict
        video_urls_dict = self.construct_video_urls(
            data, video_id, _uuid)
        # construct info
        entries = []
        for format_id in video_urls_dict:
            video_urls = video_urls_dict[format_id]
            for i, video_url_info in enumerate(video_urls):
                if len(entries) < i + 1:
                    entries.append({'formats': []})
                entries[i]['formats'].append(
                    {
                        'url': video_url_info[0],
                        'filesize': video_url_info[-1],
                        'format_id': format_id,
                        'preference': int(self.get_bid(format_id))
                    }
                )
        for i in range(len(entries)):
            self._sort_formats(entries[i]['formats'])
            entries[i].update(
                {
                    'id': '%s_part%d' % (video_id, i + 1),
                    'title': title,
                }
            )
        if len(entries) > 1:
            info = {
                '_type': 'multi_video',
                'id': video_id,
                'title': title,
                'entries': entries,
            }
        else:
            info = entries[0]
            info['id'] = video_id
            info['title'] = title
        return info
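
The grouping step at the end of `_real_extract` above can be sketched roughly as follows. This is a minimal illustration, not the extractor itself: `group_segments` and the sample URLs are hypothetical, and format sorting and the `preference` field are left out.

```python
def group_segments(video_urls_dict, video_id, title):
    """Regroup {format_id: [(url, filesize), ...]} into one entry per segment index."""
    entries = []
    for format_id, segments in video_urls_dict.items():
        for i, (url, filesize) in enumerate(segments):
            # Create the entry for segment i the first time any format reaches it.
            if len(entries) < i + 1:
                entries.append({
                    'id': '%s_part%d' % (video_id, i + 1),
                    'title': title,
                    'formats': [],
                })
            entries[i]['formats'].append({
                'url': url,
                'filesize': filesize,
                'format_id': format_id,
            })
    return entries


# Hypothetical data: two formats, each split into two segments.
sample = {
    'h1': [('http://example.invalid/h1/1.f4v', 100), ('http://example.invalid/h1/2.f4v', 110)],
    'h2': [('http://example.invalid/h2/1.f4v', 200), ('http://example.invalid/h2/2.f4v', 220)],
}
entries = group_segments(sample, 'abc', 'Demo')
```

Each entry then carries every available format for a single segment, which is the shape a `multi_video` result expects.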
--- a/youtube_dl/extractor/izlesene.py
+++ b/youtube_dl/extractor/izlesene.py
@@ -4,6 +4,7 @@ from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..compat import compat_urllib_parse_unquote
 from ..utils import (
    determine_ext,
    float_or_none,
@@ -30,7 +31,7 @@ class IzleseneIE(InfoExtractor):
                'description': 'md5:253753e2655dde93f59f74b572454f6d',
                'thumbnail': 're:^http://.*\.jpg',
                'uploader_id': 'pelikzzle',
-                'timestamp': 1404302298,
+                'timestamp': int,
                'upload_date': '20140702',
                'duration': 95.395,
                'age_limit': 0,
@@ -46,7 +47,7 @@ class IzleseneIE(InfoExtractor):
                'description': 'Tarkan Dortmund 2006 Konseri',
                'thumbnail': 're:^http://.*\.jpg',
                'uploader_id': 'parlayankiz',
-                'timestamp': 1163322193,
+                'timestamp': int,
                'upload_date': '20061112',
                'duration': 253.666,
                'age_limit': 0,
@@ -67,9 +68,9 @@ class IzleseneIE(InfoExtractor):
        uploader = self._html_search_regex(
            r"adduserUsername\s*=\s*'([^']+)';",
-            webpage, 'uploader', fatal=False, default='')
+            webpage, 'uploader', fatal=False)
        timestamp = parse_iso8601(self._html_search_meta(
-            'uploadDate', webpage, 'upload date', fatal=False))
+            'uploadDate', webpage, 'upload date'))
        duration = float_or_none(self._html_search_regex(
            r'"videoduration"\s*:\s*"([^"]+)"',
@@ -86,8 +87,7 @@ class IzleseneIE(InfoExtractor):
        # Might be empty for some videos.
        streams = self._html_search_regex(
-            r'"qualitylevel"\s*:\s*"([^"]+)"',
-            webpage, 'streams', fatal=False, default='')
+            r'"qualitylevel"\s*:\s*"([^"]+)"', webpage, 'streams', default='')
        formats = []
        if streams:
@@ -95,15 +95,15 @@ class IzleseneIE(InfoExtractor):
                quality, url = re.search(r'\[(\w+)\](.+)', stream).groups()
                formats.append({
                    'format_id': '%sp' % quality if quality else 'sd',
-                    'url': url,
+                    'url': compat_urllib_parse_unquote(url),
                    'ext': ext,
                })
        else:
            stream_url = self._search_regex(
-                r'"streamurl"\s?:\s?"([^"]+)"', webpage, 'stream URL')
+                r'"streamurl"\s*:\s*"([^"]+)"', webpage, 'stream URL')
            formats.append({
                'format_id': 'sd',
-                'url': stream_url,
+                'url': compat_urllib_parse_unquote(stream_url),
                'ext': ext,
            })
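
The effect of unquoting the stream URLs can be illustrated with a small sketch. It assumes Python 3, where `urllib.parse.unquote` plays the role of `compat_urllib_parse_unquote`; `parse_stream` and the sample stream string are hypothetical.

```python
import re
from urllib.parse import unquote  # Python 3 counterpart of compat_urllib_parse_unquote


def parse_stream(stream):
    """Split a '[quality]url' item and percent-decode the URL part."""
    quality, url = re.search(r'\[(\w+)\](.+)', stream).groups()
    return {'format_id': '%sp' % quality, 'url': unquote(url)}


# Hypothetical percent-encoded stream entry.
fmt = parse_stream('[360]http%3A%2F%2Fmedia.example.invalid%2Fclip%2F360.mp4')
```

Without the unquote step the downloader would be handed the still-encoded string, which is not a usable URL.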
--- a/youtube_dl/extractor/kickstarter.py
+++ b/youtube_dl/extractor/kickstarter.py
@@ -28,6 +28,14 @@ class KickStarterIE(InfoExtractor):
            'uploader': 'Pebble Technology',
            'title': 'Pebble iOS Notifications',
        }
    }, {
        'url': 'https://www.kickstarter.com/projects/1420158244/power-drive-2000/widget/video.html',
        'info_dict': {
            'id': '1420158244',
            'ext': 'mp4',
            'title': 'Power Drive 2000',
        },
        'expected_warnings': ['OpenGraph description'],
    }]
    def _real_extract(self, url):
@@ -48,10 +56,15 @@ class KickStarterIE(InfoExtractor):
                'title': title,
            }
        thumbnail = self._og_search_thumbnail(webpage, default=None)
        if thumbnail is None:
            thumbnail = self._html_search_regex(
                r'<img[^>]+class="[^"]+\s*poster\s*[^"]+"[^>]+src="([^"]+)"',
                webpage, 'thumbnail image', fatal=False)
        return {
            'id': video_id,
            'url': video_url,
            'title': title,
            'description': self._og_search_description(webpage),
-            'thumbnail': self._og_search_thumbnail(webpage),
+            'thumbnail': thumbnail,
        }
--- a/youtube_dl/extractor/lifenews.py
+++ b/youtube_dl/extractor/lifenews.py
@@ -8,6 +8,7 @@ from ..compat import compat_urlparse
 from ..utils import (
    determine_ext,
    int_or_none,
    remove_end,
    unified_strdate,
    ExtractorError,
 )
@@ -39,7 +40,6 @@ class LifeNewsIE(InfoExtractor):
            'title': 'В Сети появилось видео захвата «Правым сектором» колхозных полей ',
            'description': 'Жители двух поселков Днепропетровской области не простили радикалам угрозу лишения плодородных земель и пошли в лобовую. ',
            'upload_date': '20150402',
            'uploader': 'embed.life.ru',
        }
    }, {
        'url': 'http://lifenews.ru/news/153461',
@@ -50,7 +50,6 @@ class LifeNewsIE(InfoExtractor):
            'title': 'В Москве спасли потерявшегося медвежонка, который спрятался на дереве',
            'description': 'Маленький хищник не смог найти дорогу домой и обрел временное убежище на тополе недалеко от жилого массива, пока его не нашла соседская собака.',
            'upload_date': '20150505',
            'uploader': 'embed.life.ru',
        }
    }, {
        'url': 'http://lifenews.ru/video/13035',
@@ -72,20 +71,20 @@ class LifeNewsIE(InfoExtractor):
        if not videos and not iframe_link:
            raise ExtractorError('No media links available for %s' % video_id)
-        title = self._og_search_title(webpage)
-        TITLE_SUFFIX = ' - Первый по срочным новостям — LIFE | NEWS'
-        if title.endswith(TITLE_SUFFIX):
-            title = title[:-len(TITLE_SUFFIX)]
+        title = remove_end(
+            self._og_search_title(webpage),
+            ' - Первый по срочным новостям — LIFE | NEWS')
        description = self._og_search_description(webpage)
        view_count = self._html_search_regex(
            r'<div class=\'views\'>\s*(\d+)\s*</div>', webpage, 'view count', fatal=False)
        comment_count = self._html_search_regex(
-            r'<div class=\'comments\'>\s*<span class=\'counter\'>\s*(\d+)\s*</span>', webpage, 'comment count', fatal=False)
+            r'=\'commentCount\'[^>]*>\s*(\d+)\s*<',
+            webpage, 'comment count', fatal=False)
        upload_date = self._html_search_regex(
-            r'<time datetime=\'([^\']+)\'>', webpage, 'upload date', fatal=False)
+            r'<time[^>]*datetime=\'([^\']+)\'', webpage, 'upload date', fatal=False)
        if upload_date is not None:
            upload_date = unified_strdate(upload_date)
--- a/youtube_dl/extractor/liveleak.py
+++ b/youtube_dl/extractor/liveleak.py
@@ -40,6 +40,17 @@ class LiveLeakIE(InfoExtractor):
            'title': 'Man is Fatally Struck by Reckless Car While Packing up a Moving Truck',
            'age_limit': 18,
        }
    }, {
        # Covers https://github.com/rg3/youtube-dl/pull/5983
        'url': 'http://www.liveleak.com/view?i=801_1409392012',
        'md5': '0b3bec2d888c20728ca2ad3642f0ef15',
        'info_dict': {
            'id': '801_1409392012',
            'ext': 'mp4',
            'description': "Happened on 27.7.2014. \r\nAt 0:53 you can see people still swimming at near beach.",
            'uploader': 'bony333',
            'title': 'Crazy Hungarian tourist films close call waterspout in Croatia'
        }
    }]
    def _real_extract(self, url):
@@ -85,7 +96,10 @@ class LiveLeakIE(InfoExtractor):
            'url': s['file'],
        } for i, s in enumerate(sources)]
        for i, s in enumerate(sources):
-            orig_url = s['file'].replace('.h264_base.mp4', '')
+            # Removing '.h264_*.mp4' gives the raw video, which is essentially
+            # the same video without the LiveLeak logo at the top (see
+            # https://github.com/rg3/youtube-dl/pull/4768)
+            orig_url = re.sub(r'\.h264_.+?\.mp4', '', s['file'])
            if s['file'] != orig_url:
                formats.append({
                    'format_id': 'original-%s' % i,
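
The difference between the old literal replacement and the new pattern can be seen in a short sketch. The URLs below are hypothetical, as is the `720p` suffix; only the two expressions come from the change above.

```python
import re


def strip_old(file_url):
    # Previous behaviour: only the literal '.h264_base.mp4' suffix is removed.
    return file_url.replace('.h264_base.mp4', '')


def strip_new(file_url):
    # New behaviour: any '.h264_<something>.mp4' suffix is removed.
    return re.sub(r'\.h264_.+?\.mp4', '', file_url)


base = 'http://cdn.example.invalid/v/clip.mp4.h264_base.mp4?token=1'
hd = 'http://cdn.example.invalid/v/clip.mp4.h264_720p.mp4?token=1'
raw = 'http://cdn.example.invalid/v/clip.mp4?token=1'
```

Because the extractor only adds an `original-%s` format when the stripped URL differs from `s['file']`, the old expression would silently skip any source whose suffix is not exactly `base`.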
--- a/youtube_dl/extractor/nfl.py
+++ b/youtube_dl/extractor/nfl.py
@@ -19,7 +19,7 @@ class NFLIE(InfoExtractor):
    _VALID_URL = r'''(?x)https?://
        (?P<host>(?:www\.)?(?:nfl\.com|.*?\.clubs\.nfl\.com))/
        (?:.+?/)*
-        (?P<id>(?:\d[a-z]{2}\d{13}|\w{8}\-(?:\w{4}\-){3}\w{12}))'''
+        (?P<id>(?:[a-z0-9]{16}|\w{8}\-(?:\w{4}\-){3}\w{12}))'''
    _TESTS = [
        {
            'url': 'http://www.nfl.com/videos/nfl-game-highlights/0ap3000000398478/Week-3-Redskins-vs-Eagles-highlights',
@@ -58,6 +58,10 @@ class NFLIE(InfoExtractor):
                'upload_date': '20150202',
            },
        },
        {
            'url': 'http://www.nfl.com/videos/nfl-network-top-ten/09000d5d810a6bd4/Top-10-Gutsiest-Performances-Jack-Youngblood',
            'only_matching': True,
        }
    ]
    @staticmethod
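
A quick check of the old and new ID alternatives against the two test URLs' IDs, as a sketch. Only the first alternative of the `id` group is exercised, and `OLD_ID` and `NEW_ID` are names made up for this example.

```python
import re

OLD_ID = re.compile(r'\d[a-z]{2}\d{13}')   # one digit, two letters, thirteen digits
NEW_ID = re.compile(r'[a-z0-9]{16}')       # any sixteen lowercase alphanumerics

ids = ['0ap3000000398478', '09000d5d810a6bd4']
old_hits = [i for i in ids if OLD_ID.fullmatch(i)]
new_hits = [i for i in ids if NEW_ID.fullmatch(i)]
```

The second ID has hexadecimal letters in positions the old pattern reserved for digits, which is why the `only_matching` test was added.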
--- a/youtube_dl/extractor/niconico.py
+++ b/youtube_dl/extractor/niconico.py
@@ -182,7 +182,6 @@ class NiconicoIE(InfoExtractor):
        extension = xpath_text(video_info, './/movie_type')
        if not extension:
            extension = determine_ext(video_real_url)
        video_format = extension.upper()
        thumbnail = (
            xpath_text(video_info, './/thumbnail_url') or
@@ -241,7 +240,7 @@ class NiconicoIE(InfoExtractor):
            'url': video_real_url,
            'title': title,
            'ext': extension,
-            'format': video_format,
+            'format_id': 'economy' if video_real_url.endswith('low') else 'normal',
            'thumbnail': thumbnail,
            'description': description,
            'uploader': uploader,
--- a/youtube_dl/extractor/noco.py
+++ b/youtube_dl/extractor/noco.py
@@ -166,6 +166,10 @@ class NocoIE(InfoExtractor):
        self._sort_formats(formats)
        timestamp = parse_iso8601(show.get('online_date_start_utc'), ' ')
        if timestamp is not None and timestamp < 0:
            timestamp = None
        uploader = show.get('partner_name')
        uploader_id = show.get('partner_key')
        duration = float_or_none(show.get('duration_ms'), 1000)
@@ -191,7 +195,7 @@ class NocoIE(InfoExtractor):
        if episode_number:
            title += ' #' + compat_str(episode_number)
        if episode:
-            title += ' - ' + episode
+            title += ' - ' + compat_str(episode)
        description = show.get('show_resume') or show.get('family_resume')
--- a/youtube_dl/extractor/onionstudios.py
+++ b/youtube_dl/extractor/onionstudios.py
@@ -0,0 +1,74 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..utils import determine_ext
 class OnionStudiosIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?onionstudios\.com/(?:videos/[^/]+-|embed\?.*\bid=)(?P<id>\d+)(?!-)'
    _TESTS = [{
        'url': 'http://www.onionstudios.com/videos/hannibal-charges-forward-stops-for-a-cocktail-2937',
        'md5': 'd4851405d31adfadf71cd7a487b765bb',
        'info_dict': {
            'id': '2937',
            'ext': 'mp4',
            'title': 'Hannibal charges forward, stops for a cocktail',
            'description': 'md5:545299bda6abf87e5ec666548c6a9448',
            'thumbnail': 're:^https?://.*\.jpg$',
            'uploader': 'The A.V. Club',
            'uploader_id': 'TheAVClub',
        },
    }, {
        'url': 'http://www.onionstudios.com/embed?id=2855&autoplay=true',
        'only_matching': True,
    }]
    @staticmethod
    def _extract_url(webpage):
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?onionstudios\.com/embed.+?)\1', webpage)
        if mobj:
            return mobj.group('url')
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(
            'http://www.onionstudios.com/embed?id=%s' % video_id, video_id)
        formats = []
        for src in re.findall(r'<source[^>]+src="([^"]+)"', webpage):
            if determine_ext(src) != 'm3u8':  # m3u8 always results in 403
                formats.append({
                    'url': src,
                })
        self._sort_formats(formats)
        title = self._search_regex(
            r'share_title\s*=\s*"([^"]+)"', webpage, 'title')
        description = self._search_regex(
            r'share_description\s*=\s*"([^"]+)"', webpage,
            'description', default=None)
        thumbnail = self._search_regex(
            r'poster="([^"]+)"', webpage, 'thumbnail', default=False)
        uploader_id = self._search_regex(
            r'twitter_handle\s*=\s*"([^"]+)"',
            webpage, 'uploader id', fatal=False)
        uploader = self._search_regex(
            r'window\.channelName\s*=\s*"Embedded:([^"]+)"',
            webpage, 'uploader', default=False)
        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'uploader': uploader,
            'uploader_id': uploader_id,
            'formats': formats,
        }
--- a/youtube_dl/extractor/pinkbike.py
+++ b/youtube_dl/extractor/pinkbike.py
@@ -0,0 +1,96 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..utils import (
    int_or_none,
    remove_end,
    remove_start,
    str_to_int,
    unified_strdate,
 )
 class PinkbikeIE(InfoExtractor):
    _VALID_URL = r'https?://(?:(?:www\.)?pinkbike\.com/video/|es\.pinkbike\.org/i/kvid/kvid-y5\.swf\?id=)(?P<id>[0-9]+)'
    _TESTS = [{
        'url': 'http://www.pinkbike.com/video/402811/',
        'md5': '4814b8ca7651034cd87e3361d5c2155a',
        'info_dict': {
            'id': '402811',
            'ext': 'mp4',
            'title': 'Brandon Semenuk - RAW 100',
            'description': 'Official release: www.redbull.ca/rupertwalker',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 100,
            'upload_date': '20150406',
            'uploader': 'revelco',
            'location': 'Victoria, British Columbia, Canada',
            'view_count': int,
            'comment_count': int,
        }
    }, {
        'url': 'http://es.pinkbike.org/i/kvid/kvid-y5.swf?id=406629',
        'only_matching': True,
    }]
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(
            'http://www.pinkbike.com/video/%s' % video_id, video_id)
        formats = []
        for _, format_id, src in re.findall(
                r'data-quality=((?:\\)?["\'])(.+?)\1[^>]+src=\1(.+?)\1', webpage):
            height = int_or_none(self._search_regex(
                r'^(\d+)[pP]$', format_id, 'height', default=None))
            formats.append({
                'url': src,
                'format_id': format_id,
                'height': height,
            })
        self._sort_formats(formats)
        title = remove_end(self._og_search_title(webpage), ' Video - Pinkbike')
        description = self._html_search_regex(
            r'(?s)id="media-description"[^>]*>(.+?)<',
            webpage, 'description', default=None) or remove_start(
            self._og_search_description(webpage), title + '. ')
        thumbnail = self._og_search_thumbnail(webpage)
        duration = int_or_none(self._html_search_meta(
            'video:duration', webpage, 'duration'))
        uploader = self._search_regex(
            r'un:\s*"([^"]+)"', webpage, 'uploader', fatal=False)
        upload_date = unified_strdate(self._search_regex(
            r'class="fullTime"[^>]+title="([^"]+)"',
            webpage, 'upload date', fatal=False))
        location = self._html_search_regex(
            r'(?s)<dt>Location</dt>\s*<dd>(.+?)<',
            webpage, 'location', fatal=False)
        def extract_count(webpage, label):
            return str_to_int(self._search_regex(
                r'<span[^>]+class="stat-num"[^>]*>([\d,.]+)</span>\s*<span[^>]+class="stat-label"[^>]*>%s' % label,
                webpage, label, fatal=False))
        view_count = extract_count(webpage, 'Views')
        comment_count = extract_count(webpage, 'Comments')
        return {
            'id': video_id,
            'title': title,
            'description': description,
            'thumbnail': thumbnail,
            'duration': duration,
            'upload_date': upload_date,
            'uploader': uploader,
            'location': location,
            'view_count': view_count,
            'comment_count': comment_count,
            'formats': formats
        }
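
The `data-quality` pattern in `_real_extract` relies on a backreference so that the same quote character closes both attributes. A minimal sketch of that parsing step, using plain `re` instead of the extractor helpers and a hypothetical HTML snippet:

```python
import re

SOURCE_RE = r'data-quality=((?:\\)?["\'])(.+?)\1[^>]+src=\1(.+?)\1'


def parse_sources(html):
    """Return one format dict per <source> tag, deriving height from labels like '720p'."""
    formats = []
    for _, format_id, src in re.findall(SOURCE_RE, html):
        m = re.search(r'^(\d+)[pP]$', format_id)
        formats.append({
            'url': src,
            'format_id': format_id,
            'height': int(m.group(1)) if m else None,
        })
    return formats


# Hypothetical markup mixing double- and single-quoted attributes.
html = ('<source data-quality="720p" type="video/mp4" src="http://v.example.invalid/720.mp4">'
        "<source data-quality='480P' type='video/mp4' src='http://v.example.invalid/480.mp4'>")
formats = parse_sources(html)
```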
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -19,8 +19,8 @@ from ..aes import (
 class PornHubIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/view_video\.php\?viewkey=(?P<id>[0-9a-f]+)'
+    _VALID_URL = r'https?://(?:www\.)?pornhub\.com/(?:view_video\.php\?viewkey=|embed/)(?P<id>[0-9a-z]+)'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.pornhub.com/view_video.php?viewkey=648719015',
        'md5': '882f488fa1f0026f023f33576004a2ed',
        'info_dict': {
@@ -30,7 +30,17 @@ class PornHubIE(InfoExtractor):
            "title": "Seductive Indian beauty strips down and fingers her pink pussy",
            "age_limit": 18
        }
-    }
+    }, {
        'url': 'http://www.pornhub.com/view_video.php?viewkey=ph557bbb6676d2d',
        'only_matching': True,
    }]
    @classmethod
    def _extract_url(cls, webpage):
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?pornhub\.com/embed/[\da-z]+)\1', webpage)
        if mobj:
            return mobj.group('url')
    def _extract_count(self, pattern, webpage, name):
        return str_to_int(self._search_regex(
@@ -39,7 +49,8 @@ class PornHubIE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        req = compat_urllib_request.Request(url)
+        req = compat_urllib_request.Request(
+            'http://www.pornhub.com/view_video.php?viewkey=%s' % video_id)
        req.add_header('Cookie', 'age_verified=1')
        webpage = self._download_webpage(req, video_id)
--- a/youtube_dl/extractor/pornovoisines.py
+++ b/youtube_dl/extractor/pornovoisines.py
@@ -34,7 +34,7 @@ class PornoVoisinesIE(InfoExtractor):
            'duration': 120,
            'view_count': int,
            'average_rating': float,
-            'categories': ['Débutante', 'Scénario', 'Sodomie'],
+            'categories': ['Débutantes', 'Scénario', 'Sodomie'],
            'age_limit': 18,
        }
    }
@@ -71,7 +71,7 @@ class PornoVoisinesIE(InfoExtractor):
        view_count = int_or_none(self._search_regex(
            r'(\d+) vues', webpage, 'view count', fatal=False))
        average_rating = self._search_regex(
-            r'Note : (\d+,\d+)', webpage, 'average rating', fatal=False)
+            r'Note\s*:\s*(\d+(?:,\d+)?)', webpage, 'average rating', fatal=False)
        if average_rating:
            average_rating = float_or_none(average_rating.replace(',', '.'))
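
The relaxed rating pattern can be compared with the old one in a short sketch. `parse_rating` is a hypothetical helper that stands in for the `_search_regex` and `float_or_none` calls, and the sample strings are invented.

```python
import re

OLD = r'Note : (\d+,\d+)'
NEW = r'Note\s*:\s*(\d+(?:,\d+)?)'


def parse_rating(pattern, text):
    """Extract a French-formatted rating ('4,5') and convert it to a float."""
    m = re.search(pattern, text)
    return float(m.group(1).replace(',', '.')) if m else None
```

For example, `parse_rating(NEW, 'Note: 4')` yields a value, whereas the old pattern requires both the exact spacing and a decimal part.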
--- a/youtube_dl/extractor/prosiebensat1.py
+++ b/youtube_dl/extractor/prosiebensat1.py
@@ -177,6 +177,7 @@ class ProSiebenSat1IE(InfoExtractor):
        r'<header class="clearfix">\s*<h3>(.+?)</h3>',
        r'<!-- start video -->\s*<h1>(.+?)</h1>',
        r'<h1 class="att-name">\s*(.+?)</h1>',
        r'<header class="module_header">\s*<h2>([^<]+)</h2>\s*</header>',
    ]
    _DESCRIPTION_REGEXES = [
        r'<p itemprop="description">\s*(.+?)</p>',
@@ -206,8 +207,8 @@ class ProSiebenSat1IE(InfoExtractor):
    def _extract_clip(self, url, webpage):
        clip_id = self._html_search_regex(self._CLIPID_REGEXES, webpage, 'clip id')
-        access_token = 'testclient'
+        access_token = 'prosieben'
-        client_name = 'kolibri-1.2.5'
+        client_name = 'kolibri-1.12.6'
        client_location = url
        videos_api_url = 'http://vas.sim-technik.de/vas/live/v2/videos?%s' % compat_urllib_parse.urlencode({
@@ -275,13 +276,17 @@ class ProSiebenSat1IE(InfoExtractor):
        for source in urls_sources:
            protocol = source['protocol']
            if protocol == 'rtmp' or protocol == 'rtmpe':
-                mobj = re.search(r'^(?P<url>rtmpe?://[^/]+/(?P<app>[^/]+))/(?P<playpath>.+)$', source['url'])
+                mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source['url'])
                if not mobj:
                    continue
                path = mobj.group('path')
                mp4colon_index = path.rfind('mp4:')
                app = path[:mp4colon_index]
                play_path = path[mp4colon_index:]
                formats.append({
-                    'url': mobj.group('url'),
+                    'url': '%s/%s' % (mobj.group('url'), app),
-                    'app': mobj.group('app'),
+                    'app': app,
-                    'play_path': mobj.group('playpath'),
+                    'play_path': play_path,
                    'player_url': 'http://livepassdl.conviva.com/hf/ver/2.79.0.17083/LivePassModuleMain.swf',
                    'page_url': 'http://www.prosieben.de',
                    'vbr': fix_bitrate(source['bitrate']),
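
The new RTMP handling splits the path at the last `mp4:` marker instead of assuming a single-component app name. A sketch of that split, with a hypothetical `split_rtmp` helper and an invented stream URL:

```python
import re


def split_rtmp(source_url):
    """Split an RTMP URL into connection URL, app and play path at the last 'mp4:'."""
    mobj = re.search(r'^(?P<url>rtmpe?://[^/]+)/(?P<path>.+)$', source_url)
    if not mobj:
        return None
    path = mobj.group('path')
    mp4colon_index = path.rfind('mp4:')
    app = path[:mp4colon_index]
    play_path = path[mp4colon_index:]
    return {
        'url': '%s/%s' % (mobj.group('url'), app),
        'app': app,
        'play_path': play_path,
    }


parts = split_rtmp('rtmpe://stream.example.invalid/vod/archive/mp4:shows/ep1.mp4')
```

Note that this sketch, like the code above, assumes the path contains `mp4:`. If it does not, `rfind` returns -1 and the split falls on the last character.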
--- a/youtube_dl/extractor/qqmusic.py
+++ b/youtube_dl/extractor/qqmusic.py
@@ -18,10 +18,10 @@ class QQMusicIE(InfoExtractor):
    _VALID_URL = r'http://y.qq.com/#type=song&mid=(?P<id>[0-9A-Za-z]+)'
    _TESTS = [{
        'url': 'http://y.qq.com/#type=song&mid=004295Et37taLD',
-        'md5': 'bed90b6db2a7a7a7e11bc585f471f63a',
+        'md5': '9ce1c1c8445f561506d2e3cfb0255705',
        'info_dict': {
            'id': '004295Et37taLD',
-            'ext': 'm4a',
+            'ext': 'mp3',
            'title': '可惜没如果',
            'upload_date': '20141227',
            'creator': '林俊杰',
@@ -29,6 +29,12 @@ class QQMusicIE(InfoExtractor):
        }
    }]
    _FORMATS = {
        'mp3-320': {'prefix': 'M800', 'ext': 'mp3', 'preference': 40, 'abr': 320},
        'mp3-128': {'prefix': 'M500', 'ext': 'mp3', 'preference': 30, 'abr': 128},
        'm4a': {'prefix': 'C200', 'ext': 'm4a', 'preference': 10}
    }
    # Reference: m_r_GetRUin() in top_player.js
    # http://imgcache.gtimg.cn/music/portal_v3/y/top_player.js
    @staticmethod
@@ -68,11 +74,22 @@ class QQMusicIE(InfoExtractor):
            'http://base.music.qq.com/fcgi-bin/fcg_musicexpress.fcg?json=3&guid=%s' % guid,
            mid, note='Retrieve vkey', errnote='Unable to get vkey',
            transform_source=strip_jsonp)['key']
-        song_url = 'http://cc.stream.qqmusic.qq.com/C200%s.m4a?vkey=%s&guid=%s&fromtag=0' % (mid, vkey, guid)
+
        formats = []
        for format_id, details in self._FORMATS.items():
            formats.append({
                'url': 'http://cc.stream.qqmusic.qq.com/%s%s.%s?vkey=%s&guid=%s&fromtag=0'
                       % (details['prefix'], mid, details['ext'], vkey, guid),
                'format': format_id,
                'format_id': format_id,
                'preference': details['preference'],
                'abr': details.get('abr'),
            })
        self._sort_formats(formats)
        return {
            'id': mid,
-            'url': song_url,
+            'formats': formats,
            'title': song_name,
            'upload_date': publish_time,
            'creator': singer,
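
How the `_FORMATS` table expands into a list of formats can be sketched as below. The `vkey` and `guid` values are placeholders, `build_formats` is a hypothetical name, and a plain sort on `preference` stands in for `_sort_formats`.

```python
FORMATS = {
    'mp3-320': {'prefix': 'M800', 'ext': 'mp3', 'preference': 40, 'abr': 320},
    'mp3-128': {'prefix': 'M500', 'ext': 'mp3', 'preference': 30, 'abr': 128},
    'm4a': {'prefix': 'C200', 'ext': 'm4a', 'preference': 10},
}


def build_formats(mid, vkey, guid):
    """Build one format dict per table row, ordered worst to best."""
    formats = []
    for format_id, details in FORMATS.items():
        formats.append({
            'url': 'http://cc.stream.qqmusic.qq.com/%s%s.%s?vkey=%s&guid=%s&fromtag=0'
                   % (details['prefix'], mid, details['ext'], vkey, guid),
            'format_id': format_id,
            'preference': details['preference'],
            'abr': details.get('abr'),
        })
    # Simplified stand-in for _sort_formats: highest preference last.
    formats.sort(key=lambda f: f['preference'])
    return formats


formats = build_formats('004295Et37taLD', 'VKEY', 'GUID')
```

With the best format last, the default selection becomes the 320 kbit/s MP3, which matches the updated `ext` and `md5` in the test.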
--- a/youtube_dl/extractor/rtbf.py
+++ b/youtube_dl/extractor/rtbf.py
@@ -21,6 +21,13 @@ class RTBFIE(InfoExtractor):
        }
    }
    _QUALITIES = [
        ('mobile', 'mobile'),
        ('web', 'SD'),
        ('url', 'MD'),
        ('high', 'HD'),
    ]
    def _real_extract(self, url):
        video_id = self._match_id(url)
@@ -32,14 +39,21 @@ class RTBFIE(InfoExtractor):
                r'data-video="([^"]+)"', webpage, 'data video')),
            video_id)
        video_url = data.get('downloadUrl') or data.get('url')
        if data.get('provider').lower() == 'youtube':
            video_url = data.get('downloadUrl') or data.get('url')
            return self.url_result(video_url, 'Youtube')
        formats = []
        for key, format_id in self._QUALITIES:
            format_url = data['sources'].get(key)
            if format_url:
                formats.append({
                    'format_id': format_id,
                    'url': format_url,
                })
        return {
            'id': video_id,
-            'url': video_url,
+            'formats': formats,
            'title': data['title'],
            'description': data.get('description') or data.get('subtitle'),
            'thumbnail': data.get('thumbnail'),
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dl/extractor/rtlnl.py
@@ -12,10 +12,10 @@ class RtlNlIE(InfoExtractor):
    IE_NAME = 'rtl.nl'
    IE_DESC = 'rtl.nl and rtlxl.nl'
    _VALID_URL = r'''(?x)
-        https?://(www\.)?
+        https?://(?:www\.)?
        (?:
            rtlxl\.nl/\#!/[^/]+/|
-            rtl\.nl/system/videoplayer/[^?#]+?/video_embed\.html\#uuid=
+            rtl\.nl/system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=
        )
        (?P<id>[0-9a-f-]+)'''
@@ -43,6 +43,9 @@ class RtlNlIE(InfoExtractor):
            'upload_date': '20150215',
            'description': 'Er zijn nieuwe beelden vrijgegeven die vlak na de aanslag in Kopenhagen zijn gemaakt. Op de video is goed te zien hoe omstanders zich bekommeren om één van de slachtoffers, terwijl de eerste agenten ter plaatse komen.',
        }
    }, {
        'url': 'http://www.rtl.nl/system/videoplayer/derden/embed.html#!/uuid=bb0353b0-d6a4-1dad-90e9-18fe75b8d1f0',
        'only_matching': True,
    }]
    def _real_extract(self, url):
--- a/youtube_dl/extractor/ruutu.py
+++ b/youtube_dl/extractor/ruutu.py
@@ -0,0 +1,119 @@
 # coding: utf-8
 from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..compat import compat_urllib_parse_urlparse
 from ..utils import (
    determine_ext,
    int_or_none,
    xpath_text,
 )
 class RuutuIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?ruutu\.fi/ohjelmat/(?:[^/?#]+/)*(?P<id>[^/?#]+)'
    _TESTS = [
        {
            'url': 'http://www.ruutu.fi/ohjelmat/oletko-aina-halunnut-tietaa-mita-tapahtuu-vain-hetki-ennen-lahetysta-nyt-se-selvisi',
            'md5': 'ab2093f39be1ca8581963451b3c0234f',
            'info_dict': {
                'id': '2058907',
                'display_id': 'oletko-aina-halunnut-tietaa-mita-tapahtuu-vain-hetki-ennen-lahetysta-nyt-se-selvisi',
                'ext': 'mp4',
                'title': 'Oletko aina halunnut tietää mitä tapahtuu vain hetki ennen lähetystä? - Nyt se selvisi!',
                'description': 'md5:cfc6ccf0e57a814360df464a91ff67d6',
                'thumbnail': 're:^https?://.*\.jpg$',
                'duration': 114,
                'age_limit': 0,
            },
        },
        {
            'url': 'http://www.ruutu.fi/ohjelmat/superpesis/superpesis-katso-koko-kausi-ruudussa',
            'md5': '065a10ae4d5b8cfd9d0c3d332465e3d9',
            'info_dict': {
                'id': '2057306',
                'display_id': 'superpesis-katso-koko-kausi-ruudussa',
                'ext': 'mp4',
                'title': 'Superpesis: katso koko kausi Ruudussa',
                'description': 'md5:44c44a99fdbe5b380ab74ebd75f0af77',
                'thumbnail': 're:^https?://.*\.jpg$',
                'duration': 40,
                'age_limit': 0,
            },
        },
    ]
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        video_id = self._search_regex(
            r'data-media-id="(\d+)"', webpage, 'media id')
        video_xml_url = None
        media_data = self._search_regex(
            r'jQuery\.extend\([^,]+,\s*(.+?)\);', webpage,
            'media data', default=None)
        if media_data:
            media_json = self._parse_json(media_data, display_id, fatal=False)
            if media_json:
                xml_url = media_json.get('ruutuplayer', {}).get('xmlUrl')
                if xml_url:
                    video_xml_url = xml_url.replace('{ID}', video_id)
        if not video_xml_url:
            video_xml_url = 'http://gatling.ruutu.fi/media-xml-cache?id=%s' % video_id
        video_xml = self._download_xml(video_xml_url, video_id)
        formats = []
        processed_urls = []
        def extract_formats(node):
            for child in node:
                if child.tag.endswith('Files'):
                    extract_formats(child)
                elif child.tag.endswith('File'):
                    video_url = child.text
                    if not video_url or video_url in processed_urls or 'NOT_USED' in video_url:
                        continue
                    processed_urls.append(video_url)
                    ext = determine_ext(video_url)
                    if ext == 'm3u8':
                        formats.extend(self._extract_m3u8_formats(
                            video_url, video_id, 'mp4', m3u8_id='hls'))
                    elif ext == 'f4m':
                        formats.extend(self._extract_f4m_formats(
                            video_url, video_id, f4m_id='hds'))
                    else:
                        proto = compat_urllib_parse_urlparse(video_url).scheme
                        if not child.tag.startswith('HTTP') and proto != 'rtmp':
                            continue
                        preference = -1 if proto == 'rtmp' else 1
                        label = child.get('label')
                        tbr = int_or_none(child.get('bitrate'))
                        width, height = [int_or_none(x) for x in child.get('resolution', 'x').split('x')[:2]]
                        formats.append({
                            'format_id': '%s-%s' % (proto, label if label else tbr),
                            'url': video_url,
                            'width': width,
                            'height': height,
                            'tbr': tbr,
                            'preference': preference,
                        })
        extract_formats(video_xml.find('./Clip'))
        self._sort_formats(formats)
        return {
            'id': video_id,
            'display_id': display_id,
            'title': self._og_search_title(webpage),
            'description': self._og_search_description(webpage),
            'thumbnail': self._og_search_thumbnail(webpage),
            'duration': int_or_none(xpath_text(video_xml, './/Runtime', 'duration')),
            'age_limit': int_or_none(xpath_text(video_xml, './/AgeLimit', 'age limit')),
            'formats': formats,
        }
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@@ -83,7 +83,7 @@ class SafariIE(SafariBaseIE):
                                    library/view/[^/]+|
                                    api/v1/book
                                )/
-                                (?P<course_id>\d+)/
+                                (?P<course_id>[^/]+)/
                                    (?:chapter(?:-content)?/)?
                                (?P<part>part\d+)\.html
    '''
@@ -100,6 +100,10 @@ class SafariIE(SafariBaseIE):
    }, {
        'url': 'https://www.safaribooksonline.com/api/v1/book/9780133392838/chapter/part00.html',
        'only_matching': True,
    }, {
        # non-digits in course id
        'url': 'https://www.safaribooksonline.com/library/view/create-a-nodejs/100000006A0210/part00.html',
        'only_matching': True,
    }]
    def _real_extract(self, url):
@@ -122,7 +126,7 @@ class SafariCourseIE(SafariBaseIE):
    IE_NAME = 'safari:course'
    IE_DESC = 'safaribooksonline.com online courses'
-    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/(?:library/view/[^/]+|api/v1/book)/(?P<id>\d+)/?(?:[#?]|$)'
+    _VALID_URL = r'https?://(?:www\.)?safaribooksonline\.com/(?:library/view/[^/]+|api/v1/book)/(?P<id>[^/]+)/?(?:[#?]|$)'
    _TESTS = [{
        'url': 'https://www.safaribooksonline.com/library/view/hadoop-fundamentals-livelessons/9780133392838/',
--- a/youtube_dl/extractor/sohu.py
+++ b/youtube_dl/extractor/sohu.py
@@ -6,9 +6,12 @@ import re
 from .common import InfoExtractor
 from ..compat import (
    compat_str,
-    compat_urllib_request
+    compat_urllib_request,
    compat_urllib_parse,
 )
 from ..utils import (
    ExtractorError,
 )
-from ..utils import ExtractorError
 class SohuIE(InfoExtractor):
@@ -26,7 +29,7 @@ class SohuIE(InfoExtractor):
        'skip': 'Only available in China',
    }, {
        'url': 'http://tv.sohu.com/20150305/n409385080.shtml',
-        'md5': 'ac9a5d322b4bf9ae184d53e4711e4f1a',
+        'md5': '699060e75cf58858dd47fb9c03c42cfb',
        'info_dict': {
            'id': '409385080',
            'ext': 'mp4',
@@ -34,7 +37,7 @@ class SohuIE(InfoExtractor):
        }
    }, {
        'url': 'http://my.tv.sohu.com/us/232799889/78693464.shtml',
-        'md5': '49308ff6dafde5ece51137d04aec311e',
+        'md5': '9bf34be48f2f4dadcb226c74127e203c',
        'info_dict': {
            'id': '78693464',
            'ext': 'mp4',
@@ -48,7 +51,7 @@ class SohuIE(InfoExtractor):
            'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
        },
        'playlist': [{
-            'md5': '492923eac023ba2f13ff69617c32754a',
+            'md5': 'bdbfb8f39924725e6589c146bc1883ad',
            'info_dict': {
                'id': '78910339_part1',
                'ext': 'mp4',
@@ -56,7 +59,7 @@ class SohuIE(InfoExtractor):
                'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
            }
        }, {
-            'md5': 'de604848c0e8e9c4a4dde7e1347c0637',
+            'md5': '3e1f46aaeb95354fd10e7fca9fc1804e',
            'info_dict': {
                'id': '78910339_part2',
                'ext': 'mp4',
@@ -64,7 +67,7 @@ class SohuIE(InfoExtractor):
                'title': '【神探苍实战秘籍】第13期 战争之影 赫卡里姆',
            }
        }, {
-            'md5': '93584716ee0657c0b205b8aa3d27aa13',
+            'md5': '8407e634175fdac706766481b9443450',
            'info_dict': {
                'id': '78910339_part3',
                'ext': 'mp4',
@@ -139,21 +142,42 @@ class SohuIE(InfoExtractor):
        for i in range(part_count):
            formats = []
            for format_id, format_data in formats_json.items():
-                data = format_data['data']
+                allot = format_data['allot']
                data = format_data['data']
                clips_url = data['clipsURL']
                su = data['su']
                # URLs starting with http://newflv.sohu.ccgslb.net/ are not usable,
                # so retry until we get a working URL
                video_url = 'newflv.sohu.ccgslb.net'
                cdnId = None
                retries = 0
-                while 'newflv.sohu.ccgslb.net' in video_url and retries < 5:
-                    download_note = 'Download information from CDN gateway for format ' + format_id
+
+                while 'newflv.sohu.ccgslb.net' in video_url:
                    params = {
                        'prot': 9,
                        'file': clips_url[i],
                        'new': su[i],
                        'prod': 'flash',
                    }
                    if cdnId is not None:
                        params['idc'] = cdnId
                    download_note = 'Downloading %s video URL part %d of %d' % (
                        format_id, i + 1, part_count)
                    if retries > 0:
                        download_note += ' (retry #%d)' % retries
                    part_info = self._parse_json(self._download_webpage(
                        'http://%s/?%s' % (allot, compat_urllib_parse.urlencode(params)),
                        video_id, download_note), video_id)
                    video_url = part_info['url']
                    cdnId = part_info.get('nid')
                    retries += 1
-                    cdn_info = self._download_json(
-                        'http://data.vod.itc.cn/cdnList?new=' + data['su'][i],
-                        video_id, download_note)
-                    video_url = cdn_info['url']
+                    if retries > 5:
+                        raise ExtractorError('Failed to get video URL')
                formats.append({
                    'url': video_url,
--- a/youtube_dl/extractor/soundcloud.py
+++ b/youtube_dl/extractor/soundcloud.py
@@ -29,7 +29,7 @@ class SoundcloudIE(InfoExtractor):
    _VALID_URL = r'''(?x)^(?:https?://)?
                    (?:(?:(?:www\.|m\.)?soundcloud\.com/
                            (?P<uploader>[\w\d-]+)/
-                            (?!sets/|likes/?(?:$|[?#]))
+                            (?!sets/|(?:likes|tracks)/?(?:$|[?#]))
                            (?P<title>[\w\d-]+)/?
                            (?P<token>[^?]+?)?(?:[?].*)?$)
                       |(?:api\.soundcloud\.com/tracks/(?P<track_id>\d+)
@@ -307,6 +307,9 @@ class SoundcloudUserIE(SoundcloudIE):
            'title': 'The Royal Concept',
        },
        'playlist_mincount': 1,
    }, {
        'url': 'https://soundcloud.com/the-akashic-chronicler/tracks',
        'only_matching': True,
    }]
    def _real_extract(self, url):
--- a/youtube_dl/extractor/spankwire.py
+++ b/youtube_dl/extractor/spankwire.py
@@ -27,7 +27,7 @@ class SpankwireIE(InfoExtractor):
            'description': 'Crazy Bitch X rated music video.',
            'uploader': 'oreusz',
            'uploader_id': '124697',
-            'upload_date': '20070508',
+            'upload_date': '20070507',
            'age_limit': 18,
        }
    }
@@ -44,7 +44,7 @@ class SpankwireIE(InfoExtractor):
        title = self._html_search_regex(
            r'<h1>([^<]+)', webpage, 'title')
        description = self._html_search_regex(
-            r'<div\s+id="descriptionContent">([^<]+)<',
+            r'(?s)<div\s+id="descriptionContent">(.+?)</div>',
            webpage, 'description', fatal=False)
        thumbnail = self._html_search_regex(
            r'playerData\.screenShot\s*=\s*["\']([^"\']+)["\']',
@@ -64,12 +64,12 @@ class SpankwireIE(InfoExtractor):
            r'<div id="viewsCounter"><span>([\d,\.]+)</span> views</div>',
            webpage, 'view count', fatal=False))
        comment_count = str_to_int(self._html_search_regex(
-            r'Comments<span[^>]+>\s*\(([\d,\.]+)\)</span>',
+            r'<span\s+id="spCommentCount"[^>]*>([\d,\.]+)</span>',
            webpage, 'comment count', fatal=False))
        video_urls = list(map(
            compat_urllib_parse.unquote,
-            re.findall(r'playerData\.cdnPath[0-9]{3,}\s*=\s*["\']([^"\']+)["\']', webpage)))
+            re.findall(r'playerData\.cdnPath[0-9]{3,}\s*=\s*(?:encodeURIComponent\()?["\']([^"\']+)["\']', webpage)))
        if webpage.find('flashvars.encrypted = "true"') != -1:
            password = self._search_regex(
                r'flashvars\.video_title = "([^"]+)',
--- a/youtube_dl/extractor/spiegeltv.py
+++ b/youtube_dl/extractor/spiegeltv.py
@@ -2,7 +2,11 @@
 from __future__ import unicode_literals
 from .common import InfoExtractor
-from ..utils import float_or_none
+from ..compat import compat_urllib_parse_urlparse
 from ..utils import (
    determine_ext,
    float_or_none,
 )
 class SpiegeltvIE(InfoExtractor):
@@ -17,7 +21,7 @@ class SpiegeltvIE(InfoExtractor):
            'thumbnail': 're:http://.*\.jpg$',
        },
        'params': {
-            # rtmp download
+            # m3u8 download
            'skip_download': True,
        }
    }, {
@@ -53,7 +57,35 @@ class SpiegeltvIE(InfoExtractor):
        server_json = self._download_json(
            'http://spiegeltv-prod-static.s3.amazonaws.com/projectConfigs/projectConfig.json',
            video_id, note='Downloading server information')
-        server = server_json['streamingserver'][0]['endpoint']
+
        format = '16x9' if is_wide else '4x3'
        formats = []
        for streamingserver in server_json['streamingserver']:
            endpoint = streamingserver.get('endpoint')
            if not endpoint:
                continue
            play_path = 'mp4:%s_spiegeltv_0500_%s.m4v' % (uuid, format)
            if endpoint.startswith('rtmp'):
                formats.append({
                    'url': endpoint,
                    'format_id': 'rtmp',
                    'app': compat_urllib_parse_urlparse(endpoint).path[1:],
                    'play_path': play_path,
                    'player_path': 'http://prod-static.spiegel.tv/frontend-076.swf',
                    'ext': 'flv',
                    'rtmp_live': True,
                })
            elif determine_ext(endpoint) == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
                    endpoint.replace('[video]', play_path),
                    video_id, 'm4v',
                    preference=1,  # Prefer HLS since it allows working around geo-restriction
                    m3u8_id='hls'))
            else:
                formats.append({
                    'url': endpoint,
                })
        thumbnails = []
        for image in media_json['images']:
@@ -65,17 +97,12 @@ class SpiegeltvIE(InfoExtractor):
        description = media_json['subtitle']
        duration = float_or_none(media_json.get('duration_in_ms'), scale=1000)
-        format = '16x9' if is_wide else '4x3'
-        url = server + 'mp4:' + uuid + '_spiegeltv_0500_' + format + '.m4v'
        return {
            'id': video_id,
            'title': title,
-            'url': url,
-            'ext': 'm4v',
            'description': description,
            'duration': duration,
            'thumbnails': thumbnails,
-            'rtmp_live': True,
+            'formats': formats,
        }
--- a/youtube_dl/extractor/sunporno.py
+++ b/youtube_dl/extractor/sunporno.py
@@ -44,7 +44,7 @@ class SunPornoIE(InfoExtractor):
            webpage, 'duration', fatal=False))
        view_count = int_or_none(self._html_search_regex(
-            r'class="views">\s*(\d+)\s*<',
+            r'class="views">(?:<noscript>)?\s*(\d+)\s*<',
            webpage, 'view count', fatal=False))
        comment_count = int_or_none(self._html_search_regex(
            r'(\d+)</b> Comments?',
--- a/youtube_dl/extractor/teamcoco.py
+++ b/youtube_dl/extractor/teamcoco.py
@@ -51,6 +51,17 @@ class TeamcocoIE(InfoExtractor):
            'params': {
                'skip_download': True,  # m3u8 downloads
            }
        }, {
            'url': 'http://teamcoco.com/video/full-episode-mon-6-1-joel-mchale-jake-tapper-and-musical-guest-courtney-barnett?playlist=x;eyJ0eXBlIjoidGFnIiwiaWQiOjl9',
            'info_dict': {
                'id': '89341',
                'ext': 'mp4',
                'title': 'Full Episode - Mon. 6/1 - Joel McHale, Jake Tapper, And Musical Guest Courtney Barnett',
                'description': 'Guests: Joel McHale, Jake Tapper, And Musical Guest Courtney Barnett',
            },
            'params': {
                'skip_download': True,  # m3u8 downloads
            }
        }
    ]
    _VIDEO_ID_REGEXES = (
@@ -110,9 +121,23 @@ class TeamcocoIE(InfoExtractor):
        get_quality = qualities(['500k', '480p', '1000k', '720p', '1080p'])
        for filed in data['files']:
            if determine_ext(filed['url']) == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(
-                    filed['url'], video_id, ext='mp4'))
+                # compat_urllib_parse.urljoin does not work here
+                if filed['url'].startswith('/'):
                    m3u8_url = 'http://ht.cdn.turner.com/tbs/big/teamcoco' + filed['url']
                else:
                    m3u8_url = filed['url']
                m3u8_formats = self._extract_m3u8_formats(
                    m3u8_url, video_id, ext='mp4')
                for m3u8_format in m3u8_formats:
                    if m3u8_format not in formats:
                        formats.append(m3u8_format)
            elif determine_ext(filed['url']) == 'f4m':
                # TODO Correct f4m extraction
                continue
            else:
                if filed['url'].startswith('/mp4:protected/'):
                    # TODO Correct extraction for these files
                    continue
                m_format = re.search(r'(\d+(k|p))\.mp4', filed['url'])
                if m_format is not None:
                    format_id = m_format.group(1)
--- a/youtube_dl/extractor/theplatform.py
+++ b/youtube_dl/extractor/theplatform.py
@@ -26,7 +26,7 @@ _x = lambda p: xpath_with_ns(p, {'smil': 'http://www.w3.org/2005/SMIL21/Language
 class ThePlatformIE(InfoExtractor):
    _VALID_URL = r'''(?x)
        (?:https?://(?:link|player)\.theplatform\.com/[sp]/(?P<provider_id>[^/]+)/
-           (?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/)?
+           (?:(?P<media>(?:[^/]+/)+select/media/)|(?P<config>(?:[^/\?]+/(?:swf|config)|onsite)/select/))?
         |theplatform:)(?P<id>[^/\?&]+)'''
    _TESTS = [{
@@ -56,6 +56,17 @@ class ThePlatformIE(InfoExtractor):
            # rtmp download
            'skip_download': True,
        }
    }, {
        'url': 'https://player.theplatform.com/p/D6x-PC/pulse_preview/embed/select/media/yMBg9E8KFxZD',
        'info_dict': {
            'id': 'yMBg9E8KFxZD',
            'ext': 'mp4',
            'description': 'md5:644ad9188d655b742f942bf2e06b002d',
            'title': 'HIGHLIGHTS: USA bag first ever series Cup win',
        }
    }, {
        'url': 'http://player.theplatform.com/p/NnzsPC/widget/select/media/4Y0TlYUr_ZT7',
        'only_matching': True,
    }]
    @staticmethod
@@ -85,6 +96,11 @@ class ThePlatformIE(InfoExtractor):
        if not provider_id:
            provider_id = 'dJ5BDC'
        path = provider_id
        if mobj.group('media'):
            path += '/media'
        path += '/' + video_id
        if smuggled_data.get('force_smil_url', False):
            smil_url = url
        elif mobj.group('config'):
@@ -94,8 +110,7 @@ class ThePlatformIE(InfoExtractor):
            config = self._download_json(config_url, video_id, 'Downloading config')
            smil_url = config['releaseUrl'] + '&format=SMIL&formats=MPEG4&manifest=f4m'
        else:
-            smil_url = ('http://link.theplatform.com/s/{0}/{1}/meta.smil?'
-                        'format=smil&mbr=true'.format(provider_id, video_id))
+            smil_url = 'http://link.theplatform.com/s/%s/meta.smil?format=smil&mbr=true' % path
        sig = smuggled_data.get('sig')
        if sig:
@@ -112,7 +127,7 @@ class ThePlatformIE(InfoExtractor):
        else:
            raise ExtractorError(error_msg, expected=True)
-        info_url = 'http://link.theplatform.com/s/{0}/{1}?format=preview'.format(provider_id, video_id)
+        info_url = 'http://link.theplatform.com/s/%s?format=preview' % path
        info_json = self._download_webpage(info_url, video_id)
        info = json.loads(info_json)
--- a/youtube_dl/extractor/tlc.py
+++ b/youtube_dl/extractor/tlc.py
@@ -12,17 +12,22 @@ class TlcIE(DiscoveryIE):
    IE_NAME = 'tlc.com'
    _VALID_URL = r'http://www\.tlc\.com\/[a-zA-Z0-9\-]*/[a-zA-Z0-9\-]*/videos/(?P<id>[a-zA-Z0-9\-]*)(.htm)?'
-    _TEST = {
+    # DiscoveryIE has _TESTS
    _TESTS = [{
        'url': 'http://www.tlc.com/tv-shows/cake-boss/videos/too-big-to-fly.htm',
        'md5': 'c4038f4a9b44d0b5d74caaa64ed2a01a',
        'info_dict': {
-            'id': '853232',
+            'id': '104493',
            'ext': 'mp4',
-            'title': 'Cake Boss: Too Big to Fly',
+            'title': 'Too Big to Fly',
            'description': 'Buddy has taken on a high flying task.',
            'duration': 119,
            'timestamp': 1393365060,
            'upload_date': '20140225',
        },
-    }
+        'params': {
            'skip_download': True,  # requires ffmpeg
        },
    }]
 class TlcDeIE(InfoExtractor):
--- a/youtube_dl/extractor/tube8.py
+++ b/youtube_dl/extractor/tube8.py
@@ -47,7 +47,7 @@ class Tube8IE(InfoExtractor):
        webpage = self._download_webpage(req, display_id)
        flashvars = json.loads(self._html_search_regex(
-            r'flashvars\s*=\s*({.+?})', webpage, 'flashvars'))
+            r'flashvars\s*=\s*({.+?});\r?\n', webpage, 'flashvars'))
        video_url = flashvars['video_url']
        if flashvars.get('encrypted') is True:
--- a/youtube_dl/extractor/tumblr.py
+++ b/youtube_dl/extractor/tumblr.py
@@ -4,6 +4,8 @@ from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from .pornhub import PornHubIE
 from .vimeo import VimeoIE
 class TumblrIE(InfoExtractor):
@@ -39,6 +41,17 @@ class TumblrIE(InfoExtractor):
            'timestamp': 1430931613,
        },
        'add_ie': ['Vidme'],
    }, {
        'url': 'http://camdamage.tumblr.com/post/98846056295/',
        'md5': 'a9e0c8371ea1ca306d6554e3fecf50b6',
        'info_dict': {
            'id': '105463834',
            'ext': 'mp4',
            'title': 'Cam Damage-HD 720p',
            'uploader': 'John Moyer',
            'uploader_id': 'user32021558',
        },
        'add_ie': ['Vimeo'],
    }]
    def _real_extract(self, url):
@@ -55,6 +68,14 @@ class TumblrIE(InfoExtractor):
        if vid_me_embed_url is not None:
            return self.url_result(vid_me_embed_url, 'Vidme')
        pornhub_url = PornHubIE._extract_url(webpage)
        if pornhub_url:
            return self.url_result(pornhub_url, 'PornHub')
        vimeo_url = VimeoIE._extract_vimeo_url(url, webpage)
        if vimeo_url:
            return self.url_result(vimeo_url, 'Vimeo')
        iframe_url = self._search_regex(
            r'src=\'(https?://www\.tumblr\.com/video/[^\']+)\'',
            webpage, 'iframe url')
--- a/youtube_dl/extractor/turbo.py
+++ b/youtube_dl/extractor/turbo.py
@@ -23,7 +23,7 @@ class TurboIE(InfoExtractor):
            'ext': 'mp4',
            'duration': 3715,
            'title': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia... ',
-            'description': 'Retrouvez dans cette rubrique toutes les vidéos de l\'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia... ',
+            'description': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
            'thumbnail': 're:^https?://.*\.jpg$',
        }
    }
@@ -42,7 +42,7 @@ class TurboIE(InfoExtractor):
        title = xpath_text(item, './title', 'title')
        duration = int_or_none(xpath_text(item, './durate', 'duration'))
        thumbnail = xpath_text(item, './visuel_clip', 'thumbnail')
-        description = self._og_search_description(webpage)
+        description = self._html_search_meta('description', webpage)
        formats = []
        get_quality = qualities(['3g', 'sd', 'hq'])
--- a/youtube_dl/extractor/tvc.py
+++ b/youtube_dl/extractor/tvc.py
@@ -0,0 +1,109 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import re
 from .common import InfoExtractor
 from ..utils import (
    clean_html,
    int_or_none,
 )
 class TVCIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?tvc\.ru/video/iframe/id/(?P<id>\d+)'
    _TEST = {
        'url': 'http://www.tvc.ru/video/iframe/id/74622/isPlay/false/id_stat/channel/?acc_video_id=/channel/brand/id/17/show/episodes/episode_id/39702',
        'md5': 'bbc5ff531d1e90e856f60fc4b3afd708',
        'info_dict': {
            'id': '74622',
            'ext': 'mp4',
            'title': 'События. "События". Эфир от 22.05.2015 14:30',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 1122,
        },
    }
    @classmethod
    def _extract_url(cls, webpage):
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:http:)?//(?:www\.)?tvc\.ru/video/iframe/id/[^"]+)\1', webpage)
        if mobj:
            return mobj.group('url')
    def _real_extract(self, url):
        video_id = self._match_id(url)
        video = self._download_json(
            'http://www.tvc.ru/video/json/id/%s' % video_id, video_id)
        formats = []
        for info in video.get('path', {}).get('quality', []):
            video_url = info.get('url')
            if not video_url:
                continue
            format_id = self._search_regex(
                r'cdnvideo/([^/]+?)(?:-[^/]+?)?/', video_url,
                'format id', default=None)
            formats.append({
                'url': video_url,
                'format_id': format_id,
                'width': int_or_none(info.get('width')),
                'height': int_or_none(info.get('height')),
                'tbr': int_or_none(info.get('bitrate')),
            })
        self._sort_formats(formats)
        return {
            'id': video_id,
            'title': video['title'],
            'thumbnail': video.get('picture'),
            'duration': int_or_none(video.get('duration')),
            'formats': formats,
        }
 class TVCArticleIE(InfoExtractor):
    _VALID_URL = r'http://(?:www\.)?tvc\.ru/(?!video/iframe/id/)(?P<id>[^?#]+)'
    _TESTS = [{
        'url': 'http://www.tvc.ru/channel/brand/id/29/show/episodes/episode_id/39702/',
        'info_dict': {
            'id': '74622',
            'ext': 'mp4',
            'title': 'События. "События". Эфир от 22.05.2015 14:30',
            'description': 'md5:ad7aa7db22903f983e687b8a3e98c6dd',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 1122,
        },
    }, {
        'url': 'http://www.tvc.ru/news/show/id/69944',
        'info_dict': {
            'id': '75399',
            'ext': 'mp4',
            'title': 'Эксперты: в столице встал вопрос о максимально безопасных остановках',
            'description': 'md5:f2098f71e21f309e89f69b525fd9846e',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 278,
        },
    }, {
        'url': 'http://www.tvc.ru/channel/brand/id/47/show/episodes#',
        'info_dict': {
            'id': '2185',
            'ext': 'mp4',
            'title': 'Ещё не поздно. Эфир от 03.08.2013',
            'description': 'md5:51fae9f3f8cfe67abce014e428e5b027',
            'thumbnail': 're:^https?://.*\.jpg$',
            'duration': 3316,
        },
    }]
    def _real_extract(self, url):
        webpage = self._download_webpage(url, self._match_id(url))
        return {
            '_type': 'url_transparent',
            'ie_key': 'TVC',
            'url': self._og_search_video_url(webpage),
            'title': clean_html(self._og_search_title(webpage)),
            'description': clean_html(self._og_search_description(webpage)),
            'thumbnail': self._og_search_thumbnail(webpage),
        }
--- a/youtube_dl/extractor/tvplay.py
+++ b/youtube_dl/extractor/tvplay.py
@@ -26,6 +26,7 @@ class TVPlayIE(InfoExtractor):
           viasat4play\.no/programmer|
           tv6play\.no/programmer|
           tv3play\.dk/programmer|
           play\.novatv\.bg/programi
        )/[^/]+/(?P<id>\d+)
        '''
    _TESTS = [
@@ -173,6 +174,22 @@ class TVPlayIE(InfoExtractor):
                'skip_download': True,
            },
        },
        {
            'url': 'http://play.novatv.bg/programi/zdravei-bulgariya/624952?autostart=true',
            'info_dict': {
                'id': '624952',
                'ext': 'flv',
                'title': 'Здравей, България (12.06.2015 г.) ',
                'description': 'md5:99f3700451ac5bb71a260268b8daefd7',
                'duration': 8838,
                'timestamp': 1434100372,
                'upload_date': '20150612',
            },
            'params': {
                # rtmp download
                'skip_download': True,
            },
        },
    ]
    def _real_extract(self, url):
--- a/youtube_dl/extractor/vbox7.py
+++ b/youtube_dl/extractor/vbox7.py
@@ -5,6 +5,7 @@ from .common import InfoExtractor
 from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
    compat_urlparse,
 )
 from ..utils import (
    ExtractorError,
@@ -26,11 +27,21 @@ class Vbox7IE(InfoExtractor):
    def _real_extract(self, url):
        video_id = self._match_id(url)
-        redirect_page, urlh = self._download_webpage_handle(url, video_id)
-        new_location = self._search_regex(r'window\.location = \'(.*)\';',
-                                          redirect_page, 'redirect location')
-        redirect_url = urlh.geturl() + new_location
-        webpage = self._download_webpage(redirect_url, video_id,
+        # need to get the page 3 times for the correct jsSecretToken cookie
+        # which is necessary for the correct title
+        def get_session_id():
+            redirect_page = self._download_webpage(url, video_id)
+            session_id_url = self._search_regex(
                r'var\s*url\s*=\s*\'([^\']+)\';', redirect_page,
                'session id url')
            self._download_webpage(
                compat_urlparse.urljoin(url, session_id_url), video_id,
                'Getting session id')
        get_session_id()
        get_session_id()
        webpage = self._download_webpage(url, video_id,
                                         'Downloading redirect page')
        title = self._html_search_regex(r'<title>(.*)</title>',
--- a/youtube_dl/extractor/viki.py
+++ b/youtube_dl/extractor/viki.py
@@ -1,5 +1,7 @@
 # coding: utf-8
 from __future__ import unicode_literals
 import json
 import time
 import hmac
 import hashlib
@@ -11,6 +13,7 @@ from ..utils import (
    parse_age_limit,
    parse_iso8601,
 )
 from ..compat import compat_urllib_request
 from .common import InfoExtractor
@@ -23,27 +26,35 @@ class VikiBaseIE(InfoExtractor):
    _APP_VERSION = '2.2.5.1428709186'
    _APP_SECRET = '-$iJ}@p7!G@SyU/je1bEyWg}upLu-6V6-Lg9VD(]siH,r.,m-r|ulZ,U4LC/SeR)'
-    def _prepare_call(self, path, timestamp=None):
+    _NETRC_MACHINE = 'viki'
    _token = None
    def _prepare_call(self, path, timestamp=None, post_data=None):
        path += '?' if '?' not in path else '&'
        if not timestamp:
            timestamp = int(time.time())
        query = self._API_QUERY_TEMPLATE % (path, self._APP, timestamp)
        if self._token:
            query += '&token=%s' % self._token
        sig = hmac.new(
            self._APP_SECRET.encode('ascii'),
            query.encode('ascii'),
            hashlib.sha1
        ).hexdigest()
-        return self._API_URL_TEMPLATE % (query, sig)
+        url = self._API_URL_TEMPLATE % (query, sig)
        return compat_urllib_request.Request(
            url, json.dumps(post_data).encode('utf-8')) if post_data else url
-    def _call_api(self, path, video_id, note, timestamp=None):
+    def _call_api(self, path, video_id, note, timestamp=None, post_data=None):
        resp = self._download_json(
-            self._prepare_call(path, timestamp), video_id, note)
+            self._prepare_call(path, timestamp, post_data), video_id, note)
        error = resp.get('error')
        if error:
            if error == 'invalid timestamp':
                resp = self._download_json(
-                    self._prepare_call(path, int(resp['current_timestamp'])),
+                    self._prepare_call(path, int(resp['current_timestamp']), post_data),
                    video_id, '%s (retry)' % note)
                error = resp.get('error')
            if error:
@@ -56,6 +67,27 @@ class VikiBaseIE(InfoExtractor):
            '%s returned error: %s' % (self.IE_NAME, error),
            expected=True)
    def _real_initialize(self):
        self._login()
    def _login(self):
        (username, password) = self._get_login_info()
        if username is None:
            return
        login_form = {
            'login_id': username,
            'password': password,
        }
        login = self._call_api(
            'sessions.json', None,
            'Logging in as %s' % username, post_data=login_form)
        self._token = login.get('token')
        if not self._token:
            self.report_warning('Unable to get session token, login has probably failed')
 class VikiIE(VikiBaseIE):
    IE_NAME = 'viki'
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@@ -22,6 +22,7 @@ from ..utils import (
    unified_strdate,
    unsmuggle_url,
    urlencode_postdata,
    unescapeHTML,
 )
@@ -173,6 +174,21 @@ class VimeoIE(VimeoBaseInfoExtractor):
        },
    ]
    @staticmethod
    def _extract_vimeo_url(url, webpage):
        # Look for embedded (iframe) Vimeo player
        mobj = re.search(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//player\.vimeo\.com/video/.+?)\1', webpage)
        if mobj:
            player_url = unescapeHTML(mobj.group('url'))
            surl = smuggle_url(player_url, {'Referer': url})
            return surl
        # Look for embedded (swf embed) Vimeo player
        mobj = re.search(
            r'<embed[^>]+?src="((?:https?:)?//(?:www\.)?vimeo\.com/moogaloop\.swf.+?)"', webpage)
        if mobj:
            return mobj.group(1)
    def _verify_video_password(self, url, video_id, webpage):
        password = self._downloader.params.get('videopassword', None)
        if password is None:
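Note on the `_extract_vimeo_url` hunk above: the following is a minimal standalone sketch of the iframe lookup, assuming Python 3 and using the standard library's `html.unescape` in place of youtube-dl's `unescapeHTML`. The function name `find_vimeo_iframe` and the sample page are hypothetical; only the regex is copied from the hunk, and the `smuggle_url` step is left out.

```python
import html
import re

# Regex copied from the hunk: (["\']) captures the opening quote and \1
# requires the same quote character to close the attribute value.
VIMEO_IFRAME_RE = r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//player\.vimeo\.com/video/.+?)\1'


def find_vimeo_iframe(webpage):
    """Return the first embedded Vimeo player URL in webpage, or None."""
    mobj = re.search(VIMEO_IFRAME_RE, webpage)
    if mobj is None:
        return None
    # Attribute values are HTML-escaped in the page source (e.g. &amp;).
    return html.unescape(mobj.group('url'))


# Hypothetical page fragment, single-quoted src with an escaped ampersand.
sample = '<iframe width="500" src=\'//player.vimeo.com/video/12345?title=0&amp;byline=0\'></iframe>'
print(find_vimeo_iframe(sample))
```

The backreference is what lets one pattern accept both single- and double-quoted `src` attributes without matching across a mismatched pair.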
--- a/youtube_dl/extractor/vk.py
+++ b/youtube_dl/extractor/vk.py
@@ -13,6 +13,7 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    orderedSet,
    str_to_int,
    unescapeHTML,
    unified_strdate,
 )
@@ -34,6 +35,7 @@ class VKIE(InfoExtractor):
                'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
                'duration': 195,
                'upload_date': '20120212',
                'view_count': int,
            },
        },
        {
@@ -45,7 +47,8 @@ class VKIE(InfoExtractor):
                'uploader': 'Tom Cruise',
                'title': 'No name',
                'duration': 9,
-                'upload_date': '20130721'
+                'upload_date': '20130721',
                'view_count': int,
            }
        },
        {
@@ -59,6 +62,7 @@ class VKIE(InfoExtractor):
                'title': 'Lin Dan',
                'duration': 101,
                'upload_date': '20120730',
                'view_count': int,
            }
        },
        {
@@ -73,7 +77,8 @@ class VKIE(InfoExtractor):
                'uploader': 'Триллеры',
                'title': '► Бойцовский клуб / Fight Club 1999 [HD 720]',
                'duration': 8352,
-                'upload_date': '20121218'
+                'upload_date': '20121218',
                'view_count': int,
            },
            'skip': 'Requires vk account credentials',
        },
@@ -100,6 +105,7 @@ class VKIE(InfoExtractor):
                'title': 'Книга Илая',
                'duration': 6771,
                'upload_date': '20140626',
                'view_count': int,
            },
            'skip': 'Only works from Russia',
        },
@@ -119,8 +125,8 @@ class VKIE(InfoExtractor):
            'act': 'login',
            'role': 'al_frame',
            'expire': '1',
-            'email': username,
+            'email': username.encode('cp1251'),
-            'pass': password,
+            'pass': password.encode('cp1251'),
        }
        request = compat_urllib_request.Request('https://login.vk.com/?act=login',
@@ -175,25 +181,29 @@ class VKIE(InfoExtractor):
                m_rutube.group(1).replace('\\', ''))
            return self.url_result(rutube_url)
-        m_opts = re.search(r'(?s)var\s+opts\s*=\s*({.*?});', info_page)
+        m_opts = re.search(r'(?s)var\s+opts\s*=\s*({.+?});', info_page)
        if m_opts:
-            m_opts_url = re.search(r"url\s*:\s*'([^']+)", m_opts.group(1))
+            m_opts_url = re.search(r"url\s*:\s*'((?!/\b)[^']+)", m_opts.group(1))
            if m_opts_url:
                opts_url = m_opts_url.group(1)
                if opts_url.startswith('//'):
                    opts_url = 'http:' + opts_url
                return self.url_result(opts_url)
-        data_json = self._search_regex(r'var vars = ({.*?});', info_page, 'vars')
+        data_json = self._search_regex(r'var\s+vars\s*=\s*({.+?});', info_page, 'vars')
        data = json.loads(data_json)
        # Extract upload date
        upload_date = None
-        mobj = re.search(r'id="mv_date_wrap".*?Added ([a-zA-Z]+ [0-9]+), ([0-9]+) at', info_page)
+        mobj = re.search(r'id="mv_date(?:_views)?_wrap"[^>]*>([a-zA-Z]+ [0-9]+), ([0-9]+) at', info_page)
        if mobj is not None:
            mobj.group(1) + ' ' + mobj.group(2)
            upload_date = unified_strdate(mobj.group(1) + ' ' + mobj.group(2))
        view_count = str_to_int(self._search_regex(
            r'"mv_views_count_number"[^>]*>([\d,.]+) views<',
            info_page, 'view count', fatal=False))
        formats = [{
            'format_id': k,
            'url': v,
@@ -210,6 +220,7 @@ class VKIE(InfoExtractor):
            'uploader': data.get('md_author'),
            'duration': data.get('duration'),
            'upload_date': upload_date,
            'view_count': view_count,
        }
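Note on the `view_count` hunk above: a small sketch of how the new regex and the integer conversion fit together, assuming Python 3. The helper name `parse_view_count` and the sample markup are hypothetical, and the separator stripping is a simplified stand-in for `str_to_int`, not a copy of it.

```python
import re

# Regex copied from the hunk; the digits may contain ',' or '.' separators.
VIEWS_RE = r'"mv_views_count_number"[^>]*>([\d,.]+) views<'


def parse_view_count(info_page):
    """Return the view count as an int, or None when the counter is absent."""
    mobj = re.search(VIEWS_RE, info_page)
    if mobj is None:
        return None
    # Simplified stand-in for str_to_int: drop thousands separators.
    return int(re.sub(r'[,.]', '', mobj.group(1)))


# Hypothetical markup for illustration only.
sample = '<span class="mv_views_count_number">1,234,567 views</span>'
print(parse_view_count(sample))
```

Returning `None` for a missing counter mirrors the `fatal=False` behaviour in the hunk, where a failed match should not abort extraction.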
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dl/extractor/xhamster.py
@@ -13,7 +13,6 @@ from ..utils import (
 class XHamsterIE(InfoExtractor):
    """Information Extractor for xHamster"""
    _VALID_URL = r'(?P<proto>https?)://(?:.+?\.)?xhamster\.com/movies/(?P<id>[0-9]+)/(?P<seo>.+?)\.html(?:\?.*)?'
    _TESTS = [
        {
@@ -133,3 +132,36 @@ class XHamsterIE(InfoExtractor):
            'age_limit': age_limit,
            'formats': formats,
        }
 class XHamsterEmbedIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?xhamster\.com/xembed\.php\?video=(?P<id>\d+)'
    _TEST = {
        'url': 'http://xhamster.com/xembed.php?video=3328539',
        'info_dict': {
            'id': '3328539',
            'ext': 'mp4',
            'title': 'Pen Masturbation',
            'upload_date': '20140728',
            'uploader_id': 'anonymous',
            'duration': 5,
            'age_limit': 18,
        }
    }
    @staticmethod
    def _extract_urls(webpage):
        return [url for _, url in re.findall(
            r'<iframe[^>]+?src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?xhamster\.com/xembed\.php\?video=\d+)\1',
            webpage)]
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        video_url = self._search_regex(
            r'href="(https?://xhamster\.com/movies/%s/[^"]+\.html[^"]*)"' % video_id,
            webpage, 'xhamster url')
        return self.url_result(video_url, 'XHamster')
--- a/youtube_dl/extractor/xvideos.py
+++ b/youtube_dl/extractor/xvideos.py
@@ -5,10 +5,12 @@ import re
 from .common import InfoExtractor
 from ..compat import (
    compat_urllib_parse,
    compat_urllib_request,
 )
 from ..utils import (
    clean_html,
    ExtractorError,
    determine_ext,
 )
@@ -25,6 +27,8 @@ class XVideosIE(InfoExtractor):
        }
    }
    _ANDROID_USER_AGENT = 'Mozilla/5.0 (Linux; Android 4.0.4; Galaxy Nexus Build/IMM76B) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.133 Mobile Safari/535.19'
    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
@@ -40,9 +44,30 @@ class XVideosIE(InfoExtractor):
        video_thumbnail = self._search_regex(
            r'url_bigthumb=(.+?)&amp', webpage, 'thumbnail', fatal=False)
        formats = [{
            'url': video_url,
        }]
        android_req = compat_urllib_request.Request(url)
        android_req.add_header('User-Agent', self._ANDROID_USER_AGENT)
        android_webpage = self._download_webpage(android_req, video_id, fatal=False)
        if android_webpage is not None:
            player_params_str = self._search_regex(
                'mobileReplacePlayerDivTwoQual\(([^)]+)\)',
                android_webpage, 'player parameters', default='')
            player_params = list(map(lambda s: s.strip(' \''), player_params_str.split(',')))
            if player_params:
                formats.extend([{
                    'url': param,
                    'preference': -10,
                } for param in player_params if determine_ext(param) == 'mp4'])
        self._sort_formats(formats)
        return {
            'id': video_id,
-            'url': video_url,
+            'formats': formats,
            'title': video_title,
            'ext': 'flv',
            'thumbnail': video_thumbnail,
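Note on the Android-page hunk above: a hedged sketch of the parameter parsing, assuming Python 3. The function name `parse_mp4_params` and the example URLs are hypothetical, and `posixpath.splitext` on the URL path stands in for youtube-dl's `determine_ext`.

```python
import posixpath
import re
from urllib.parse import urlparse


def parse_mp4_params(android_webpage):
    """Return the mp4 URLs passed to mobileReplacePlayerDivTwoQual(...)."""
    mobj = re.search(r'mobileReplacePlayerDivTwoQual\(([^)]+)\)', android_webpage)
    params_str = mobj.group(1) if mobj else ''
    # Each argument is a quoted string; strip spaces and single quotes.
    params = [s.strip(" '") for s in params_str.split(',')]
    # Keep only arguments whose URL path ends in .mp4.
    return [p for p in params
            if posixpath.splitext(urlparse(p).path)[1] == '.mp4']


# Hypothetical page fragment for illustration only.
page = "mobileReplacePlayerDivTwoQual('http://example.com/v/low.3gp', 'http://example.com/v/high.mp4?e=1');"
print(parse_mp4_params(page))
```

One detail worth noting: `''.split(',')` yields `['']`, so a truthiness check on the split list always passes. It is the extension filter that actually discards the empty entry.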
--- a/youtube_dl/extractor/youku.py
+++ b/youtube_dl/extractor/youku.py
@@ -1,123 +1,235 @@
 # coding: utf-8
 from __future__ import unicode_literals
-import math
+import base64
 import random
 import re
 import time
 from .common import InfoExtractor
-from ..utils import (
+from ..utils import ExtractorError
-    ExtractorError,
+
 from ..compat import (
    compat_urllib_parse,
    compat_ord,
    compat_urllib_request,
 )
 class YoukuIE(InfoExtractor):
    IE_NAME = 'youku'
    _VALID_URL = r'''(?x)
        (?:
            http://(?:v|player)\.youku\.com/(?:v_show/id_|player\.php/sid/)|
            youku:)
        (?P<id>[A-Za-z0-9]+)(?:\.html|/v\.swf|)
    '''
-    _TEST = {
+
-        'url': 'http://v.youku.com/v_show/id_XNDgyMDQ2NTQw.html',
+    _TESTS = [{
-        'md5': 'ffe3f2e435663dc2d1eea34faeff5b5b',
+        'url': 'http://v.youku.com/v_show/id_XMTc1ODE5Njcy.html',
-        'params': {
+        'md5': '5f3af4192eabacc4501508d54a8cabd7',
            'test': False
        },
        'info_dict': {
-            'id': 'XNDgyMDQ2NTQw_part00',
+            'id': 'XMTc1ODE5Njcy_part1',
-            'ext': 'flv',
+            'title': '★Smile﹗♡ Git Fresh -Booty Music舞蹈.',
-            'title': 'youtube-dl test video "\'/\\ä↭𝕐'
+            'ext': 'flv'
        }
-    }
+    }, {
        'url': 'http://player.youku.com/player.php/sid/XNDgyMDQ2NTQw/v.swf',
        'only_matching': True,
    }, {
        'url': 'http://v.youku.com/v_show/id_XODgxNjg1Mzk2_ev_1.html',
        'info_dict': {
            'id': 'XODgxNjg1Mzk2',
            'title': '武媚娘传奇 85',
        },
        'playlist_count': 11,
    }, {
        'url': 'http://v.youku.com/v_show/id_XMTI1OTczNDM5Mg==.html',
        'info_dict': {
            'id': 'XMTI1OTczNDM5Mg',
            'title': '花千骨 04',
        },
        'playlist_count': 13,
        'skip': 'Available in China only',
    }]
-    def _gen_sid(self):
+    def construct_video_urls(self, data1, data2):
-        nowTime = int(time.time() * 1000)
+        # get sid, token
-        random1 = random.randint(1000, 1998)
+        def yk_t(s1, s2):
-        random2 = random.randint(1000, 9999)
+            ls = list(range(256))
            t = 0
            for i in range(256):
                t = (t + ls[i] + compat_ord(s1[i % len(s1)])) % 256
                ls[i], ls[t] = ls[t], ls[i]
            s = bytearray()
            x, y = 0, 0
            for i in range(len(s2)):
                y = (y + 1) % 256
                x = (x + ls[y]) % 256
                ls[x], ls[y] = ls[y], ls[x]
                s.append(compat_ord(s2[i]) ^ ls[(ls[x] + ls[y]) % 256])
            return bytes(s)
-        return "%d%d%d" % (nowTime, random1, random2)
+        sid, token = yk_t(
            b'becaf9be', base64.b64decode(data2['ep'].encode('ascii'))
        ).decode('ascii').split('_')
-    def _get_file_ID_mix_string(self, seed):
+        # get oip
-        mixed = []
+        oip = data2['ip']
        source = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\:._-1234567890")
        seed = float(seed)
        for i in range(len(source)):
            seed = (seed * 211 + 30031) % 65536
            index = math.floor(seed / 65536 * len(source))
            mixed.append(source[int(index)])
            source.remove(source[int(index)])
        # return ''.join(mixed)
        return mixed
-    def _get_file_id(self, fileId, seed):
+        # get fileid
-        mixed = self._get_file_ID_mix_string(seed)
+        string_ls = list(
-        ids = fileId.split('*')
+            'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\:._-1234567890')
-        realId = []
+        shuffled_string_ls = []
-        for ch in ids:
+        seed = data1['seed']
-            if ch:
+        N = len(string_ls)
-                realId.append(mixed[int(ch)])
+        for ii in range(N):
-        return ''.join(realId)
+            seed = (seed * 0xd3 + 0x754f) % 0x10000
            idx = seed * len(string_ls) // 0x10000
            shuffled_string_ls.append(string_ls[idx])
            del string_ls[idx]
        fileid_dict = {}
        for format in data1['streamtypes']:
            streamfileid = [
                int(i) for i in data1['streamfileids'][format].strip('*').split('*')]
            fileid = ''.join(
                [shuffled_string_ls[i] for i in streamfileid])
            fileid_dict[format] = fileid[:8] + '%s' + fileid[10:]
        def get_fileid(format, n):
            fileid = fileid_dict[format] % hex(int(n))[2:].upper().zfill(2)
            return fileid
        # get ep
        def generate_ep(format, n):
            fileid = get_fileid(format, n)
            ep_t = yk_t(
                b'bf7e5f01',
                ('%s_%s_%s' % (sid, fileid, token)).encode('ascii')
            )
            ep = base64.b64encode(ep_t).decode('ascii')
            return ep
        # generate video_urls
        video_urls_dict = {}
        for format in data1['streamtypes']:
            video_urls = []
            for dt in data1['segs'][format]:
                n = str(int(dt['no']))
                param = {
                    'K': dt['k'],
                    'hd': self.get_hd(format),
                    'myp': 0,
                    'ts': dt['seconds'],
                    'ypp': 0,
                    'ctype': 12,
                    'ev': 1,
                    'token': token,
                    'oip': oip,
                    'ep': generate_ep(format, n)
                }
                video_url = \
                    'http://k.youku.com/player/getFlvPath/' + \
                    'sid/' + sid + \
                    '_' + str(int(n) + 1).zfill(2) + \
                    '/st/' + self.parse_ext_l(format) + \
                    '/fileid/' + get_fileid(format, n) + '?' + \
                    compat_urllib_parse.urlencode(param)
                video_urls.append(video_url)
            video_urls_dict[format] = video_urls
        return video_urls_dict
    def get_hd(self, fm):
        hd_id_dict = {
            'flv': '0',
            'mp4': '1',
            'hd2': '2',
            'hd3': '3',
            '3gp': '0',
            '3gphd': '1'
        }
        return hd_id_dict[fm]
    def parse_ext_l(self, fm):
        ext_dict = {
            'flv': 'flv',
            'mp4': 'mp4',
            'hd2': 'flv',
            'hd3': 'flv',
            '3gp': 'flv',
            '3gphd': 'mp4'
        }
        return ext_dict[fm]
    def get_format_name(self, fm):
        _dict = {
            '3gp': 'h6',
            '3gphd': 'h5',
            'flv': 'h4',
            'mp4': 'h3',
            'hd2': 'h2',
            'hd3': 'h1'
        }
        return _dict[fm]
    def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
+        video_id = self._match_id(url)
        video_id = mobj.group('id')
-        info_url = 'http://v.youku.com/player/getPlayList/VideoIDS/' + video_id
+        def retrieve_data(req_url, note):
            req = compat_urllib_request.Request(req_url)
-        config = self._download_json(info_url, video_id)
+            cn_verification_proxy = self._downloader.params.get('cn_verification_proxy')
            if cn_verification_proxy:
                req.add_header('Ytdl-request-proxy', cn_verification_proxy)
-        error_code = config['data'][0].get('error_code')
+            raw_data = self._download_json(req, video_id, note=note)
            return raw_data['data'][0]
        # request basic data
        data1 = retrieve_data(
            'http://v.youku.com/player/getPlayList/VideoIDS/%s' % video_id,
            'Downloading JSON metadata 1')
        data2 = retrieve_data(
            'http://v.youku.com/player/getPlayList/VideoIDS/%s/Pf/4/ctype/12/ev/1' % video_id,
            'Downloading JSON metadata 2')
        error_code = data1.get('error_code')
        if error_code:
-            # -8 means blocked outside China.
+            error = data1.get('error')
-            error = config['data'][0].get('error')  # Chinese and English, separated by newline.
+            if error is not None and '因版权原因无法观看此视频' in error:
-            raise ExtractorError(error or 'Server reported error %i' % error_code,
+                raise ExtractorError(
-                                 expected=True)
+                    'Youku said: Sorry, this video is available in China only', expected=True)
        video_title = config['data'][0]['title']
        seed = config['data'][0]['seed']
        format = self._downloader.params.get('format', None)
        supported_format = list(config['data'][0]['streamfileids'].keys())
        # TODO proper format selection
        if format is None or format == 'best':
            if 'hd2' in supported_format:
                format = 'hd2'
            else:
-                format = 'flv'
+                msg = 'Youku server reported error %i' % error_code
-            ext = 'flv'
+                if error is not None:
-        elif format == 'worst':
+                    msg += ': ' + error
-            format = 'mp4'
+                raise ExtractorError(msg)
            ext = 'mp4'
        else:
            format = 'flv'
            ext = 'flv'
-        fileid = config['data'][0]['streamfileids'][format]
+        title = data1['title']
        keys = [s['k'] for s in config['data'][0]['segs'][format]]
        # segs is usually a dictionary, but an empty *list* if an error occurred.
-        files_info = []
+        # generate video_urls_dict
-        sid = self._gen_sid()
+        video_urls_dict = self.construct_video_urls(data1, data2)
        fileid = self._get_file_id(fileid, seed)
-        # column 8,9 of fileid represent the segment number
+        # construct info
-        # fileid[7:9] should be changed
+        entries = [{
-        for index, key in enumerate(keys):
+            'id': '%s_part%d' % (video_id, i + 1),
-            temp_fileid = '%s%02X%s' % (fileid[0:8], index, fileid[10:])
+            'title': title,
-            download_url = 'http://k.youku.com/player/getFlvPath/sid/%s_%02X/st/flv/fileid/%s?k=%s' % (sid, index, temp_fileid, key)
+            'formats': [],
            # some formats are not available for all parts, we have to detect
            # which one has all
        } for i in range(max(len(v) for v in data1['segs'].values()))]
        for fm in data1['streamtypes']:
            video_urls = video_urls_dict[fm]
            for video_url, seg, entry in zip(video_urls, data1['segs'][fm], entries):
                entry['formats'].append({
                    'url': video_url,
                    'format_id': self.get_format_name(fm),
                    'ext': self.parse_ext_l(fm),
                    'filesize': int(seg['size']),
                })
-            info = {
+        return {
-                'id': '%s_part%02d' % (video_id, index),
+            '_type': 'multi_video',
-                'url': download_url,
+            'id': video_id,
-                'uploader': None,
+            'title': title,
-                'upload_date': None,
+            'entries': entries,
-                'title': video_title,
+        }
                'ext': ext,
            }
            files_info.append(info)
        return files_info
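Note on `yk_t` in the Youku hunk above: the helper has the same structure as the RC4 stream cipher (a key-scheduling pass over a 256-entry table, then one keystream byte XORed into each input byte). The sketch below restates it for Python 3 `bytes` only, so `compat_ord` is not needed. The name `rc4_like` and the sample message are hypothetical; the key is the one that appears in the hunk.

```python
def rc4_like(key, data):
    """Apply the keystream transform from the hunk to data (both bytes)."""
    # Key scheduling: permute the table 0..255 under control of the key.
    state = list(range(256))
    t = 0
    for i in range(256):
        t = (t + state[i] + key[i % len(key)]) % 256
        state[i], state[t] = state[t], state[i]

    # Keystream generation: one table swap and one output byte per input byte.
    out = bytearray()
    x = y = 0
    for byte in data:
        y = (y + 1) % 256
        x = (x + state[y]) % 256
        state[x], state[y] = state[y], state[x]
        out.append(byte ^ state[(state[x] + state[y]) % 256])
    return bytes(out)


key = b'becaf9be'              # key used for decoding 'ep' in the hunk
message = b'sid_token'         # hypothetical payload
scrambled = rc4_like(key, message)
print(rc4_like(key, scrambled) == message)
```

Because the transform is a plain XOR against a keystream that depends only on the key, applying it twice with the same key returns the input. That is why the hunk can use a single helper for both decoding `ep` and generating the new `ep` value.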
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -234,6 +234,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        '44': {'ext': 'webm', 'width': 854, 'height': 480},
        '45': {'ext': 'webm', 'width': 1280, 'height': 720},
        '46': {'ext': 'webm', 'width': 1920, 'height': 1080},
        '59': {'ext': 'mp4', 'width': 854, 'height': 480},
        '78': {'ext': 'mp4', 'width': 854, 'height': 480},
        # 3d videos
@@ -785,7 +787,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            s = mobj.group(1)
            dec_s = self._decrypt_signature(s, video_id, player_url, age_gate)
            return '/signature/%s' % dec_s
-        dash_manifest_url = re.sub(r'/s/([\w\.]+)', decrypt_sig, dash_manifest_url)
+        dash_manifest_url = re.sub(r'/s/([a-fA-F0-9\.]+)', decrypt_sig, dash_manifest_url)
        dash_doc = self._download_xml(
            dash_manifest_url, video_id,
            note='Downloading DASH manifest',
@@ -1290,7 +1292,6 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
    def _extract_playlist(self, playlist_id):
        url = self._TEMPLATE_URL % playlist_id
        page = self._download_webpage(url, playlist_id)
        more_widget_html = content_html = page
        for match in re.findall(r'<div class="yt-alert-message">([^<]+)</div>', page):
            match = match.strip()
@@ -1310,36 +1311,36 @@ class YoutubePlaylistIE(YoutubeBaseInfoExtractor):
                self.report_warning('Youtube gives an alert message: ' + match)
        # Extract the video ids from the playlist pages
-        ids = []
+        def _entries():
            more_widget_html = content_html = page
            for page_num in itertools.count(1):
                matches = re.finditer(self._VIDEO_RE, content_html)
                # We remove the duplicates and the link with index 0
                # (it's not the first video of the playlist)
                new_ids = orderedSet(m.group('id') for m in matches if m.group('index') != '0')
                for vid_id in new_ids:
                    yield self.url_result(vid_id, 'Youtube', video_id=vid_id)
-        for page_num in itertools.count(1):
+                mobj = re.search(r'data-uix-load-more-href="/?(?P<more>[^"]+)"', more_widget_html)
-            matches = re.finditer(self._VIDEO_RE, content_html)
+                if not mobj:
-            # We remove the duplicates and the link with index 0
+                    break
            # (it's not the first video of the playlist)
            new_ids = orderedSet(m.group('id') for m in matches if m.group('index') != '0')
            ids.extend(new_ids)
-            mobj = re.search(r'data-uix-load-more-href="/?(?P<more>[^"]+)"', more_widget_html)
+                more = self._download_json(
-            if not mobj:
+                    'https://youtube.com/%s' % mobj.group('more'), playlist_id,
-                break
+                    'Downloading page #%s' % page_num,
-
+                    transform_source=uppercase_escape)
-            more = self._download_json(
+                content_html = more['content_html']
-                'https://youtube.com/%s' % mobj.group('more'), playlist_id,
+                if not content_html.strip():
-                'Downloading page #%s' % page_num,
+                    # Some webpages show a "Load more" button but they don't
-                transform_source=uppercase_escape)
+                    # have more videos
-            content_html = more['content_html']
+                    break
-            if not content_html.strip():
+                more_widget_html = more['load_more_widget_html']
                # Some webpages show a "Load more" button but they don't
                # have more videos
                break
            more_widget_html = more['load_more_widget_html']
        playlist_title = self._html_search_regex(
            r'(?s)<h1 class="pl-header-title[^"]*">\s*(.*?)\s*</h1>',
            page, 'title')
-        url_results = self._ids_to_results(ids)
+        return self.playlist_result(_entries(), playlist_id, playlist_title)
        return self.playlist_result(url_results, playlist_id, playlist_title)
    def _real_extract(self, url):
        # Extract playlist id
@@ -1406,10 +1407,12 @@ class YoutubeChannelIE(InfoExtractor):
        channel_page = self._download_webpage(
            url + '?view=57', channel_id,
            'Downloading channel page', fatal=False)
-        channel_playlist_id = self._search_regex(
+        channel_playlist_id = self._html_search_meta(
-            [r'<meta itemprop="channelId" content="([^"]+)">',
+            'channelId', channel_page, 'channel id', default=None)
-             r'data-channel-external-id="([^"]+)"'],
+        if not channel_playlist_id:
-            channel_page, 'channel id', default=None)
+            channel_playlist_id = self._search_regex(
                r'data-channel-external-id="([^"]+)"',
                channel_page, 'channel id', default=None)
        if channel_playlist_id and channel_playlist_id.startswith('UC'):
            playlist_id = 'UU' + channel_playlist_id[2:]
            return self.url_result(
@@ -1503,7 +1506,7 @@ class YoutubeSearchIE(SearchInfoExtractor, YoutubePlaylistIE):
        for pagenum in itertools.count(1):
            url_query = {
-                'search_query': query,
+                'search_query': query.encode('utf-8'),
                'page': pagenum,
                'spf': 'navigate',
            }
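Note on the `_extract_playlist` hunk above: the change replaces an eagerly built `ids` list with a nested generator, so later pages are only requested when the consumer asks for more entries. The sketch below illustrates that pattern in isolation, assuming Python 3. The page dictionaries, `fetch_more`, and `iter_video_ids` are hypothetical stand-ins for the HTML pages and `_download_json` calls.

```python
import itertools


def iter_video_ids(first_page, fetch_more):
    """Yield ids page by page, fetching the next page only when needed."""
    page = first_page
    for page_num in itertools.count(1):
        for vid_id in page['ids']:
            yield vid_id
        if not page.get('more'):
            break                      # no "load more" link on this page
        page = fetch_more(page['more'], page_num)
        if not page['ids']:
            break                      # "load more" was shown but is empty


# Hypothetical paged data for illustration only.
PAGES = {
    'p1': {'ids': ['a', 'b'], 'more': 'p2'},
    'p2': {'ids': ['c'], 'more': 'p3'},
    'p3': {'ids': [], 'more': None},
}
fetched = []


def fetch_more(token, page_num):
    fetched.append(token)              # record which pages were requested
    return PAGES[token]


first_two = list(itertools.islice(iter_video_ids(PAGES['p1'], fetch_more), 2))
print(first_two, fetched)
```

Taking only the first two ids never triggers a fetch, which is the practical benefit when a caller limits how many playlist items it wants.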
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -145,11 +145,15 @@ def parseOpts(overrideArguments=None):
    general.add_option(
        '--list-extractors',
        action='store_true', dest='list_extractors', default=False,
-        help='List all supported extractors and the URLs they would handle')
+        help='List all supported extractors')
    general.add_option(
        '--extractor-descriptions',
        action='store_true', dest='list_extractor_descriptions', default=False,
        help='Output descriptions of all supported extractors')
    general.add_option(
        '--force-generic-extractor',
        action='store_true', dest='force_generic_extractor', default=False,
        help='Force extraction to use the generic extractor')
    general.add_option(
        '--default-search',
        dest='default_search', metavar='PREFIX',
@@ -725,7 +729,7 @@ def parseOpts(overrideArguments=None):
        metavar='POLICY', dest='fixup', default='detect_or_warn',
        help='Automatically correct known faults of the file. '
             'One of never (do nothing), warn (only emit a warning), '
-             'detect_or_warn(the default; fix file if we can, warn otherwise)')
+             'detect_or_warn (the default; fix file if we can, warn otherwise)')
    postproc.add_option(
        '--prefer-avconv',
        action='store_false', dest='prefer_ffmpeg',
--- a/youtube_dl/postprocessor/embedthumbnail.py
+++ b/youtube_dl/postprocessor/embedthumbnail.py
@@ -35,6 +35,11 @@ class EmbedThumbnailPP(FFmpegPostProcessor):
        thumbnail_filename = info['thumbnails'][-1]['filename']
        if not os.path.exists(encodeFilename(thumbnail_filename)):
            self._downloader.report_warning(
                'Skipping embedding the thumbnail because the file is missing.')
            return [], info
        if info['ext'] == 'mp3':
            options = [
                '-c', 'copy', '-map', '0', '-map', '1',
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dl/postprocessor/ffmpeg.py
@@ -21,6 +21,7 @@ from ..utils import (
    shell_quote,
    subtitles_filename,
    dfxp2srt,
    ISO639Utils,
 )
@@ -307,199 +308,6 @@ class FFmpegVideoConvertorPP(FFmpegPostProcessor):
 class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
    # See http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
    _lang_map = {
        'aa': 'aar',
        'ab': 'abk',
        'ae': 'ave',
        'af': 'afr',
        'ak': 'aka',
        'am': 'amh',
        'an': 'arg',
        'ar': 'ara',
        'as': 'asm',
        'av': 'ava',
        'ay': 'aym',
        'az': 'aze',
        'ba': 'bak',
        'be': 'bel',
        'bg': 'bul',
        'bh': 'bih',
        'bi': 'bis',
        'bm': 'bam',
        'bn': 'ben',
        'bo': 'bod',
        'br': 'bre',
        'bs': 'bos',
        'ca': 'cat',
        'ce': 'che',
        'ch': 'cha',
        'co': 'cos',
        'cr': 'cre',
        'cs': 'ces',
        'cu': 'chu',
        'cv': 'chv',
        'cy': 'cym',
        'da': 'dan',
        'de': 'deu',
        'dv': 'div',
        'dz': 'dzo',
        'ee': 'ewe',
        'el': 'ell',
        'en': 'eng',
        'eo': 'epo',
        'es': 'spa',
        'et': 'est',
        'eu': 'eus',
        'fa': 'fas',
        'ff': 'ful',
        'fi': 'fin',
        'fj': 'fij',
        'fo': 'fao',
        'fr': 'fra',
        'fy': 'fry',
        'ga': 'gle',
        'gd': 'gla',
        'gl': 'glg',
        'gn': 'grn',
        'gu': 'guj',
        'gv': 'glv',
        'ha': 'hau',
        'he': 'heb',
        'hi': 'hin',
        'ho': 'hmo',
        'hr': 'hrv',
        'ht': 'hat',
        'hu': 'hun',
        'hy': 'hye',
        'hz': 'her',
        'ia': 'ina',
        'id': 'ind',
        'ie': 'ile',
        'ig': 'ibo',
        'ii': 'iii',
        'ik': 'ipk',
        'io': 'ido',
        'is': 'isl',
        'it': 'ita',
        'iu': 'iku',
        'ja': 'jpn',
        'jv': 'jav',
        'ka': 'kat',
        'kg': 'kon',
        'ki': 'kik',
        'kj': 'kua',
        'kk': 'kaz',
        'kl': 'kal',
        'km': 'khm',
        'kn': 'kan',
        'ko': 'kor',
        'kr': 'kau',
        'ks': 'kas',
        'ku': 'kur',
        'kv': 'kom',
        'kw': 'cor',
        'ky': 'kir',
        'la': 'lat',
        'lb': 'ltz',
        'lg': 'lug',
        'li': 'lim',
        'ln': 'lin',
        'lo': 'lao',
        'lt': 'lit',
        'lu': 'lub',
        'lv': 'lav',
        'mg': 'mlg',
        'mh': 'mah',
        'mi': 'mri',
        'mk': 'mkd',
        'ml': 'mal',
        'mn': 'mon',
        'mr': 'mar',
        'ms': 'msa',
        'mt': 'mlt',
        'my': 'mya',
        'na': 'nau',
        'nb': 'nob',
        'nd': 'nde',
        'ne': 'nep',
        'ng': 'ndo',
        'nl': 'nld',
        'nn': 'nno',
        'no': 'nor',
        'nr': 'nbl',
        'nv': 'nav',
        'ny': 'nya',
        'oc': 'oci',
        'oj': 'oji',
        'om': 'orm',
        'or': 'ori',
        'os': 'oss',
        'pa': 'pan',
        'pi': 'pli',
        'pl': 'pol',
        'ps': 'pus',
        'pt': 'por',
        'qu': 'que',
        'rm': 'roh',
        'rn': 'run',
        'ro': 'ron',
        'ru': 'rus',
        'rw': 'kin',
        'sa': 'san',
        'sc': 'srd',
        'sd': 'snd',
        'se': 'sme',
        'sg': 'sag',
        'si': 'sin',
        'sk': 'slk',
        'sl': 'slv',
        'sm': 'smo',
        'sn': 'sna',
        'so': 'som',
        'sq': 'sqi',
        'sr': 'srp',
        'ss': 'ssw',
        'st': 'sot',
        'su': 'sun',
        'sv': 'swe',
        'sw': 'swa',
        'ta': 'tam',
        'te': 'tel',
        'tg': 'tgk',
        'th': 'tha',
        'ti': 'tir',
        'tk': 'tuk',
        'tl': 'tgl',
        'tn': 'tsn',
        'to': 'ton',
        'tr': 'tur',
        'ts': 'tso',
        'tt': 'tat',
        'tw': 'twi',
        'ty': 'tah',
        'ug': 'uig',
        'uk': 'ukr',
        'ur': 'urd',
        'uz': 'uzb',
        've': 'ven',
        'vi': 'vie',
        'vo': 'vol',
        'wa': 'wln',
        'wo': 'wol',
        'xh': 'xho',
        'yi': 'yid',
        'yo': 'yor',
        'za': 'zha',
        'zh': 'zho',
        'zu': 'zul',
    }
    @classmethod
    def _conver_lang_code(cls, code):
        """Convert language code from ISO 639-1 to ISO 639-2/T"""
        return cls._lang_map.get(code[:2])
    def run(self, information):
        if information['ext'] not in ['mp4', 'mkv']:
            self._downloader.to_screen('[ffmpeg] Subtitles can only be embedded in mp4 or mkv files')
@@ -525,7 +333,7 @@ class FFmpegEmbedSubtitlePP(FFmpegPostProcessor):
            opts += ['-c:s', 'mov_text']
        for (i, lang) in enumerate(sub_langs):
            opts.extend(['-map', '%d:0' % (i + 1)])
-            lang_code = self._conver_lang_code(lang)
+            lang_code = ISO639Utils.short2long(lang)
            if lang_code is not None:
                opts.extend(['-metadata:s:s:%d' % i, 'language=%s' % lang_code])
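Note on the `ISO639Utils.short2long` call above: the body of `short2long` is not shown in this part of the diff, so the sketch below simply assumes it behaves like the removed `_conver_lang_code` (a lookup on the first two characters). The class name `Iso639Sketch`, the helper `subtitle_language_opts`, and the four-entry table excerpt are illustrative only.

```python
class Iso639Sketch(object):
    # Small excerpt of the ISO 639-1 -> ISO 639-2/T table from the diff.
    _lang_map = {'de': 'deu', 'en': 'eng', 'fr': 'fra', 'zh': 'zho'}

    @classmethod
    def short2long(cls, code):
        """Assumed behaviour: look up the first two characters, else None."""
        return cls._lang_map.get(code[:2])


def subtitle_language_opts(sub_langs):
    """Build ffmpeg stream-mapping options the way the hunk's loop does."""
    opts = []
    for i, lang in enumerate(sub_langs):
        opts.extend(['-map', '%d:0' % (i + 1)])
        lang_code = Iso639Sketch.short2long(lang)
        if lang_code is not None:
            # Only tag the stream when a three-letter code is known.
            opts.extend(['-metadata:s:s:%d' % i, 'language=%s' % lang_code])
    return opts


print(subtitle_language_opts(['en', 'xx']))
```

Unknown codes still get their `-map` entry and are just left without a `language` tag, so one unmapped subtitle track does not prevent the others from being embedded.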
--- a/youtube_dl/update.py
+++ b/youtube_dl/update.py
@@ -50,7 +50,7 @@ def rsa_verify(message, signature, key):
 def update_self(to_screen, verbose):
    """Update the program file with the latest version from the repository"""
-    UPDATE_URL = "http://rg3.github.io/youtube-dl/update/"
+    UPDATE_URL = "https://rg3.github.io/youtube-dl/update/"
    VERSION_URL = UPDATE_URL + 'LATEST_VERSION'
    JSON_URL = UPDATE_URL + 'versions.json'
    UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -1841,7 +1841,10 @@ def srt_subtitles_timecode(seconds):
 def dfxp2srt(dfxp_data):
-    _x = functools.partial(xpath_with_ns, ns_map={'ttml': 'http://www.w3.org/ns/ttml'})
+    _x = functools.partial(xpath_with_ns, ns_map={
+        'ttml': 'http://www.w3.org/ns/ttml',
+        'ttaf1': 'http://www.w3.org/2006/10/ttaf1',
+    })
    def parse_node(node):
        str_or_empty = functools.partial(str_or_none, default='')
@@ -1849,9 +1852,9 @@ def dfxp2srt(dfxp_data):
        out = str_or_empty(node.text)
        for child in node:
-            if child.tag in (_x('ttml:br'), 'br'):
+            if child.tag in (_x('ttml:br'), _x('ttaf1:br'), 'br'):
                out += '\n' + str_or_empty(child.tail)
-            elif child.tag in (_x('ttml:span'), 'span'):
+            elif child.tag in (_x('ttml:span'), _x('ttaf1:span'), 'span'):
                out += str_or_empty(parse_node(child))
            else:
                out += str_or_empty(xml.etree.ElementTree.tostring(child))
@@ -1860,7 +1863,7 @@ def dfxp2srt(dfxp_data):
    dfxp = xml.etree.ElementTree.fromstring(dfxp_data.encode('utf-8'))
    out = []
-    paras = dfxp.findall(_x('.//ttml:p')) or dfxp.findall('.//p')
+    paras = dfxp.findall(_x('.//ttml:p')) or dfxp.findall(_x('.//ttaf1:p')) or dfxp.findall('.//p')
    if not paras:
        raise ValueError('Invalid dfxp/TTML subtitle')
@@ -1879,6 +1882,208 @@ def dfxp2srt(dfxp_data):
    return ''.join(out)
+class ISO639Utils(object):
+    # See http://www.loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
+    _lang_map = {
+        'aa': 'aar',
+        'ab': 'abk',
+        'ae': 'ave',
+        'af': 'afr',
+        'ak': 'aka',
+        'am': 'amh',
+        'an': 'arg',
+        'ar': 'ara',
+        'as': 'asm',
+        'av': 'ava',
+        'ay': 'aym',
+        'az': 'aze',
+        'ba': 'bak',
+        'be': 'bel',
+        'bg': 'bul',
+        'bh': 'bih',
+        'bi': 'bis',
+        'bm': 'bam',
+        'bn': 'ben',
+        'bo': 'bod',
+        'br': 'bre',
+        'bs': 'bos',
+        'ca': 'cat',
+        'ce': 'che',
+        'ch': 'cha',
+        'co': 'cos',
+        'cr': 'cre',
+        'cs': 'ces',
+        'cu': 'chu',
+        'cv': 'chv',
+        'cy': 'cym',
+        'da': 'dan',
+        'de': 'deu',
+        'dv': 'div',
+        'dz': 'dzo',
+        'ee': 'ewe',
+        'el': 'ell',
+        'en': 'eng',
+        'eo': 'epo',
+        'es': 'spa',
+        'et': 'est',
+        'eu': 'eus',
+        'fa': 'fas',
+        'ff': 'ful',
+        'fi': 'fin',
+        'fj': 'fij',
+        'fo': 'fao',
+        'fr': 'fra',
+        'fy': 'fry',
+        'ga': 'gle',
+        'gd': 'gla',
+        'gl': 'glg',
+        'gn': 'grn',
+        'gu': 'guj',
+        'gv': 'glv',
+        'ha': 'hau',
+        'he': 'heb',
+        'hi': 'hin',
+        'ho': 'hmo',
+        'hr': 'hrv',
+        'ht': 'hat',
+        'hu': 'hun',
+        'hy': 'hye',
+        'hz': 'her',
+        'ia': 'ina',
+        'id': 'ind',
+        'ie': 'ile',
+        'ig': 'ibo',
+        'ii': 'iii',
+        'ik': 'ipk',
+        'io': 'ido',
+        'is': 'isl',
+        'it': 'ita',
+        'iu': 'iku',
+        'ja': 'jpn',
+        'jv': 'jav',
+        'ka': 'kat',
+        'kg': 'kon',
+        'ki': 'kik',
+        'kj': 'kua',
+        'kk': 'kaz',
+        'kl': 'kal',
+        'km': 'khm',
+        'kn': 'kan',
+        'ko': 'kor',
+        'kr': 'kau',
+        'ks': 'kas',
+        'ku': 'kur',
+        'kv': 'kom',
+        'kw': 'cor',
+        'ky': 'kir',
+        'la': 'lat',
+        'lb': 'ltz',
+        'lg': 'lug',
+        'li': 'lim',
+        'ln': 'lin',
+        'lo': 'lao',
+        'lt': 'lit',
+        'lu': 'lub',
+        'lv': 'lav',
+        'mg': 'mlg',
+        'mh': 'mah',
+        'mi': 'mri',
+        'mk': 'mkd',
+        'ml': 'mal',
+        'mn': 'mon',
+        'mr': 'mar',
+        'ms': 'msa',
+        'mt': 'mlt',
+        'my': 'mya',
+        'na': 'nau',
+        'nb': 'nob',
+        'nd': 'nde',
+        'ne': 'nep',
+        'ng': 'ndo',
+        'nl': 'nld',
+        'nn': 'nno',
+        'no': 'nor',
+        'nr': 'nbl',
+        'nv': 'nav',
+        'ny': 'nya',
+        'oc': 'oci',
+        'oj': 'oji',
+        'om': 'orm',
+        'or': 'ori',
+        'os': 'oss',
+        'pa': 'pan',
+        'pi': 'pli',
+        'pl': 'pol',
+        'ps': 'pus',
+        'pt': 'por',
+        'qu': 'que',
+        'rm': 'roh',
+        'rn': 'run',
+        'ro': 'ron',
+        'ru': 'rus',
+        'rw': 'kin',
+        'sa': 'san',
+        'sc': 'srd',
+        'sd': 'snd',
+        'se': 'sme',
+        'sg': 'sag',
+        'si': 'sin',
+        'sk': 'slk',
+        'sl': 'slv',
+        'sm': 'smo',
+        'sn': 'sna',
+        'so': 'som',
+        'sq': 'sqi',
+        'sr': 'srp',
+        'ss': 'ssw',
+        'st': 'sot',
+        'su': 'sun',
+        'sv': 'swe',
+        'sw': 'swa',
+        'ta': 'tam',
+        'te': 'tel',
+        'tg': 'tgk',
+        'th': 'tha',
+        'ti': 'tir',
+        'tk': 'tuk',
+        'tl': 'tgl',
+        'tn': 'tsn',
+        'to': 'ton',
+        'tr': 'tur',
+        'ts': 'tso',
+        'tt': 'tat',
+        'tw': 'twi',
+        'ty': 'tah',
+        'ug': 'uig',
+        'uk': 'ukr',
+        'ur': 'urd',
+        'uz': 'uzb',
+        've': 'ven',
+        'vi': 'vie',
+        'vo': 'vol',
+        'wa': 'wln',
+        'wo': 'wol',
+        'xh': 'xho',
+        'yi': 'yid',
+        'yo': 'yor',
+        'za': 'zha',
+        'zh': 'zho',
+        'zu': 'zul',
+    }
+    @classmethod
+    def short2long(cls, code):
+        """Convert language code from ISO 639-1 to ISO 639-2/T"""
+        return cls._lang_map.get(code[:2])
+    @classmethod
+    def long2short(cls, code):
+        """Convert language code from ISO 639-2/T to ISO 639-1"""
+        for short_name, long_name in cls._lang_map.items():
+            if long_name == code:
+                return short_name
 class PerRequestProxyHandler(compat_urllib_request.ProxyHandler):
    def __init__(self, proxies=None):
        # Set default handlers
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals
-__version__ = '2015.06.04.1'
+__version__ = '2015.06.25'