release 2017.07.09

[ChangeLog] Actualize
[dailymail] Add support for embeds
2017-07-09 20:16:38 +07:00 · 2017-07-09 20:15:15 +07:00 · 2017-07-09 20:06:24 +07:00 · 2017-07-09 19:18:12 +07:00 · 2017-07-09 19:17:54 +07:00 · 2017-07-09 19:17:38 +07:00
87 changed files with 1889 additions and 663 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.06.05*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.06.05**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.07.09*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.07.09**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.06.05
+[debug] youtube-dl version 2017.07.09
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/3
+++ b/3
@@ -220,3 +220,6 @@ gritstub
 Adam Voss
 Mike Fährmann
 Jan Kundrát
+Giuseppe Fabiano
+Örn Guðjónsson
+Parmjit Virk
--- a/143
+++ b/143
@@ -1,3 +1,146 @@
+version 2017.07.09
+
+Core
+ [extractor/common] Add support for AMP tags in _parse_html5_media_entries
+ [utils] Support attributes with no values in get_elements_by_attribute
+
+Extractors
+ [dailymail] Add support for embeds
+ [joj] Add support for joj.sk (#13268)
+* [abc.net.au:iview] Extract more formats (#13492, #13489)
+* [egghead:course] Fix extraction (#6635, #13370)
+ [cjsw] Add support for cjsw.com (#13525)
+ [eagleplatform] Add support for referrer protected videos (#13557)
+ [eagleplatform] Add support for another embed pattern (#13557)
+* [veoh] Extend URL regular expression (#13601)
+* [npo:live] Fix live stream id extraction (#13568, #13605)
+* [googledrive] Fix height extraction (#13603)
+ [dailymotion] Add support for new layout (#13580)
+- [yam] Remove extractor
+* [xhamster] Extract all formats and fix duration extraction (#13593)
+ [xhamster] Add support for new URL schema (#13593)
+* [espn] Extend URL regular expression (#13244, #13549)
+* [kaltura] Fix typo in subtitles extraction (#13569)
+* [vier] Adapt extraction to redesign (#13575)
+
+
+version 2017.07.02
+
+Core
+* [extractor/common] Improve _json_ld
+
+Extractors
+ [thisoldhouse] Add more fallbacks for video id
+* [thisoldhouse] Fix video id extraction (#13540, #13541)
+* [xfileshare] Extend format regular expression (#13536)
+* [ted] Fix extraction (#13535)
+ [tastytrade] Add support for tastytrade.com (#13521)
+* [dplayit] Relax video id regular expression (#13524)
+ [generic] Extract more generic metadata (#13527)
+ [bbccouk] Capture and output error message (#13501, #13518)
+* [cbsnews] Relax video info regular expression (#13284, #13503)
+ [facebook] Add support for plugin video embeds and multiple embeds (#13493)
+* [soundcloud] Switch to https for API requests (#13502)
+* [pandatv] Switch to https for API and download URLs
+ [pandatv] Add support for https URLs (#13491)
+ [niconico] Support sp subdomain (#13494)
+
+
+version 2017.06.25
+
+Core
+ [adobepass] Add support for DIRECTV NOW (mso ATTOTT) (#13472)
+* [YoutubeDL] Skip malformed formats for better extraction robustness
+
+Extractors
+ [wsj] Add support for barrons.com (#13470)
+ [ign] Add another video id pattern (#13328)
+ [raiplay:live] Add support for live streams (#13414)
+ [redbulltv] Add support for live videos and segments (#13486)
+ [onetpl] Add support for videos embedded via pulsembed (#13482)
+* [ooyala] Make more robust
+* [ooyala] Skip empty format URLs (#13471, #13476)
+* [hgtv.com:show] Fix typo
+
+
+version 2017.06.23
+
+Core
+* [adobepass] Fix extraction on older python 2.6
+
+Extractors
+* [youtube] Adapt to new automatic captions rendition (#13467)
+* [hgtv.com:show] Relax video config regular expression (#13279, #13461)
+* [drtuber] Fix formats extraction (#12058)
+* [youporn] Fix upload date extraction
+* [youporn] Improve formats extraction
+* [youporn] Fix title extraction (#13456)
+* [googledrive] Fix formats sorting (#13443)
+* [watchindianporn] Fix extraction (#13411, #13415)
+ [vimeo] Add fallback mp4 extension for original format
+ [ruv] Add support for ruv.is (#13396)
+* [viu] Fix extraction on older python 2.6
+* [pandora.tv] Fix upload_date extraction (#12846)
+ [asiancrush] Add support for asiancrush.com (#13420)
+
+
+version 2017.06.18
+
+Core
+* [downloader/common] Use utils.shell_quote for debug command line
+* [utils] Use compat_shlex_quote in shell_quote
+* [postprocessor/execafterdownload] Encode command line (#13407)
+* [compat] Fix compat_shlex_quote on Windows (#5889, #10254)
+* [postprocessor/metadatafromtitle] Fix missing optional meta fields processing
+   in --metadata-from-title (#13408)
+* [extractor/common] Fix json dumping with --geo-bypass
+ [extractor/common] Improve jwplayer subtitles extraction
+ [extractor/common] Improve jwplayer formats extraction (#13379)
+
+Extractors
+* [polskieradio] Fix extraction (#13392)
+ [xfileshare] Add support for fastvideo.me (#13385)
+* [bilibili] Fix extraction of videos with double quotes in titles (#13387)
+* [4tube] Fix extraction (#13381, #13382)
+ [disney] Add support for disneychannel.de (#13383)
+* [npo] Improve URL regular expression (#13376)
+ [corus] Add support for showcase.ca
+ [corus] Add support for history.ca (#13359)
+
+
+version 2017.06.12
+
+Core
+* [utils] Handle compat_HTMLParseError in extract_attributes (#13349)
+ [compat] Introduce compat_HTMLParseError
+* [utils] Improve unified_timestamp
+* [extractor/generic] Ensure format id is unicode string
+* [extractor/common] Return unicode string from _match_id
+ [YoutubeDL] Sanitize more fields (#13313)
+
+Extractors
+ [xfileshare] Add support for rapidvideo.tv (#13348)
+* [xfileshare] Modernize and pass Referer
+ [rutv] Add support for testplayer.vgtrk.com (#13347)
+ [newgrounds] Extract more metadata (#13232)
+ [newgrounds:playlist] Add support for playlists (#10611)
+* [newgrounds] Improve formats and uploader extraction (#13346)
+* [msn] Fix formats extraction
+* [turbo] Ensure format id is string
+* [sexu] Ensure height is int
+* [jove] Ensure comment count is int
+* [golem] Ensure format id is string
+* [gfycat] Ensure filesize is int
+* [foxgay] Ensure height is int
+* [flickr] Ensure format id is string
+* [sohu] Fix numeric fields
+* [safari] Improve authentication detection (#13319)
+* [liveleak] Ensure height is int (#13313)
+* [streamango] Make title optional (#13292)
+* [rtlnl] Improve URL regular expression (#13295)
+* [tvplayer] Fix extraction (#13291)
+
+
 version 2017.06.05

 Core
--- a/2
+++ b/2
@@ -101,7 +101,7 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
 		--exclude '*.pyc' \
 		--exclude '*.pyo' \
 		--exclude '*~' \
-		--exclude '__pycache' \
+		--exclude '__pycache__' \
 		--exclude '.git' \
 		--exclude 'testdata' \
 		--exclude 'docs/_build' \
--- a/README.md
+++ b/README.md
@@ -145,18 +145,18 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
    --max-views COUNT                Do not download any videos with more than
                                     COUNT views
    --match-filter FILTER            Generic video filter. Specify any key (see
-                                     help for -o for a list of available keys)
-                                     to match if the key is present, !key to
-                                     check if the key is not present, key >
-                                     NUMBER (like "comment_count > 12", also
-                                     works with >=, <, <=, !=, =) to compare
-                                     against a number, key = 'LITERAL' (like
-                                     "uploader = 'Mike Smith'", also works with
-                                     !=) to match against a string literal and &
-                                     to require multiple matches. Values which
-                                     are not known are excluded unless you put a
-                                     question mark (?) after the operator. For
-                                     example, to only match videos that have
+                                     the "OUTPUT TEMPLATE" for a list of
+                                     available keys) to match if the key is
+                                     present, !key to check if the key is not
+                                     present, key > NUMBER (like "comment_count
+                                     > 12", also works with >=, <, <=, !=, =) to
+                                     compare against a number, key = 'LITERAL'
+                                     (like "uploader = 'Mike Smith'", also works
+                                     with !=) to match against a string literal
+                                     and & to require multiple matches. Values
+                                     which are not known are excluded unless you
+                                     put a question mark (?) after the operator.
+                                     For example, to only match videos that have
                                     been liked more than 100 times and disliked
                                     less than 50 times (or the dislike
                                     functionality is not available at the given
@@ -277,8 +277,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
    --get-filename                   Simulate, quiet but print output filename
    --get-format                     Simulate, quiet but print output format
    -j, --dump-json                  Simulate, quiet but print JSON information.
-                                     See --output for a description of available
-                                     keys.
+                                     See the "OUTPUT TEMPLATE" for a description
+                                     of available keys.
    -J, --dump-single-json           Simulate, quiet but print JSON information
                                     for each command-line argument. If the URL
                                     refers to a playlist, dump the whole
@@ -535,13 +535,14 @@ The basic usage is not to set any template arguments when downloading a single f
 - `playlist_id` (string): Playlist identifier
 - `playlist_title` (string): Playlist title

-
 Available for the video that belongs to some logical chapter or section:
+
 - `chapter` (string): Name or title of the chapter the video belongs to
 - `chapter_number` (numeric): Number of the chapter the video belongs to
 - `chapter_id` (string): Id of the chapter the video belongs to

 Available for the video that is an episode of some series or programme:
+
 - `series` (string): Title of the series or programme the video episode belongs to
 - `season` (string): Title of the season the video episode belongs to
 - `season_number` (numeric): Number of the season the video episode belongs to
@@ -551,6 +552,7 @@ Available for the video that is an episode of some series or programme:
 - `episode_id` (string): Id of the video episode

 Available for the media that is a track or a part of a music album:
+
 - `track` (string): Title of the track
 - `track_number` (numeric): Number of the track within an album or a disc
 - `track_id` (string): Id of the track
--- a/devscripts/prepare_manpage.py
+++ b/devscripts/prepare_manpage.py
@@ -8,7 +8,7 @@ import re
 ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
 README_FILE = os.path.join(ROOT_DIR, 'README.md')

-PREFIX = '''%YOUTUBE-DL(1)
+PREFIX = r'''%YOUTUBE-DL(1)

 # NAME

--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -67,6 +67,8 @@
 - **arte.tv:info**
 - **arte.tv:magazine**
 - **arte.tv:playlist**
+ - **AsianCrush**
+ - **AsianCrushPlaylist**
 - **AtresPlayer**
 - **ATTTechChannel**
 - **ATVAt**
@@ -152,6 +154,7 @@
 - **chirbit**
 - **chirbit:profile**
 - **Cinchcast**
+ - **CJSW**
 - **Clipfish**
 - **cliphunter**
 - **ClipRs**
@@ -367,6 +370,7 @@
 - **Jamendo**
 - **JamendoAlbum**
 - **JeuxVideo**
+ - **Joj**
 - **Jove**
 - **jpopsuki.tv**
 - **JWPlatform**
@@ -512,6 +516,7 @@
 - **netease:song**: 网易云音乐
 - **Netzkino**
 - **Newgrounds**
+ - **NewgroundsPlaylist**
 - **Newstube**
 - **NextMedia**: 蘋果日報
 - **NextMediaActionNews**: 蘋果日報 - 動新聞
@@ -641,6 +646,7 @@
 - **RadioJavan**
 - **Rai**
 - **RaiPlay**
+ - **RaiPlayLive**
 - **RBMARadio**
 - **RDS**: RDS.ca
 - **RedBullTV**
@@ -685,6 +691,7 @@
 - **rutube:person**: Rutube person videos
 - **RUTV**: RUTV.RU
 - **Ruutu**
+ - **Ruv**
 - **safari**: safaribooksonline.com online video
 - **safari:api**
 - **safari:course**: safaribooksonline.com online courses
@@ -763,6 +770,7 @@
 - **Tagesschau**
 - **tagesschau:player**
 - **Tass**
+ - **TastyTrade**
 - **TBS**
 - **TDSLifeway**
 - **teachertube**: teachertube.com videos
@@ -974,7 +982,7 @@
 - **WSJArticle**
 - **XBef**
 - **XboxClips**
- - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo
+ - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV, FastVideo.me
 - **XHamster**
 - **XHamsterEmbed**
 - **xiami:album**: 虾米音乐 - 专辑
@@ -990,7 +998,6 @@
 - **XVideos**
 - **XXXYMovies**
 - **Yahoo**: Yahoo screen and movies
- - **Yam**: 蕃薯藤yam天空部落
 - **yandexmusic:album**: Яндекс.Музыка - Альбом
 - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
 - **yandexmusic:track**: Яндекс.Музыка - Трек
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -98,6 +98,7 @@ from youtube_dl.compat import (
    compat_chr,
    compat_etree_fromstring,
    compat_getenv,
+    compat_os_name,
    compat_setenv,
    compat_urlparse,
    compat_parse_qs,
@@ -340,6 +341,7 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
        self.assertEqual(unified_timestamp('Feb 7, 2016 at 6:35 pm'), 1454870100)
        self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
+        self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)

    def test_determine_ext(self):
        self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -447,7 +449,9 @@ class TestUtil(unittest.TestCase):

    def test_shell_quote(self):
        args = ['ffmpeg', '-i', encodeFilename('ñ€ß\'.mp4')]
-        self.assertEqual(shell_quote(args), """ffmpeg -i 'ñ€ß'"'"'.mp4'""")
+        self.assertEqual(
+            shell_quote(args),
+            """ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')

    def test_str_to_int(self):
        self.assertEqual(str_to_int('123,456'), 123456)
@@ -915,6 +919,8 @@ class TestUtil(unittest.TestCase):
            supports_outside_bmp = False
        if supports_outside_bmp:
            self.assertEqual(extract_attributes('<e x="Smile &#128512;!">'), {'x': 'Smile \U0001f600!'})
+        # Malformed HTML should not break attributes extraction on older Python
+        self.assertEqual(extract_attributes('<mal"formed/>'), {})

    def test_clean_html(self):
        self.assertEqual(clean_html('a:\nb'), 'a: b')
@@ -929,7 +935,7 @@ class TestUtil(unittest.TestCase):
    def test_args_to_str(self):
        self.assertEqual(
            args_to_str(['foo', 'ba/r', '-baz', '2 be', '']),
-            'foo ba/r -baz \'2 be\' \'\''
+            'foo ba/r -baz \'2 be\' \'\'' if compat_os_name != 'nt' else 'foo ba/r -baz "2 be" ""'
        )

    def test_parse_filesize(self):
@@ -1225,6 +1231,12 @@ part 3</font></u>
        self.assertEqual(get_element_by_attribute('class', 'foo', html), None)
        self.assertEqual(get_element_by_attribute('class', 'no-such-foo', html), None)

+        html = '''
+            <div itemprop="author" itemscope>foo</div>
+        '''
+
+        self.assertEqual(get_element_by_attribute('itemprop', 'author', html), 'foo')
+
    def test_get_elements_by_class(self):
        html = '''
            <span class="foo bar">nice</span><span class="foo bar">also nice</span>
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -58,6 +58,7 @@ from .utils import (
    format_bytes,
    formatSeconds,
    GeoRestrictedError,
+    int_or_none,
    ISO3166Utils,
    locked_file,
    make_HTTPS_handler,
@@ -302,6 +303,17 @@ class YoutubeDL(object):
                        postprocessor.
    """

+    _NUMERIC_FIELDS = set((
+        'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
+        'timestamp', 'upload_year', 'upload_month', 'upload_day',
+        'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
+        'average_rating', 'comment_count', 'age_limit',
+        'start_time', 'end_time',
+        'chapter_number', 'season_number', 'episode_number',
+        'track_number', 'disc_number', 'release_year',
+        'playlist_index',
+    ))
+
    params = None
    _ies = []
    _pps = []
@@ -639,22 +651,11 @@ class YoutubeDL(object):
                    r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
                    outtmpl)

-            NUMERIC_FIELDS = set((
-                'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
-                'timestamp', 'upload_year', 'upload_month', 'upload_day',
-                'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
-                'average_rating', 'comment_count', 'age_limit',
-                'start_time', 'end_time',
-                'chapter_number', 'season_number', 'episode_number',
-                'track_number', 'disc_number', 'release_year',
-                'playlist_index',
-            ))
-
            # Missing numeric fields used together with integer presentation types
            # in format specification will break the argument substitution since
            # string 'NA' is returned for missing fields. We will patch output
            # template for missing fields to meet string presentation type.
-            for numeric_field in NUMERIC_FIELDS:
+            for numeric_field in self._NUMERIC_FIELDS:
                if numeric_field not in template_dict:
                    # As of [1] format syntax is:
                    #  %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
@@ -1345,9 +1346,28 @@ class YoutubeDL(object):
        if 'title' not in info_dict:
            raise ExtractorError('Missing "title" field in extractor result')

-        if not isinstance(info_dict['id'], compat_str):
-            self.report_warning('"id" field is not a string - forcing string conversion')
-            info_dict['id'] = compat_str(info_dict['id'])
+        def report_force_conversion(field, field_not, conversion):
+            self.report_warning(
+                '"%s" field is not %s - forcing %s conversion, there is an error in extractor'
+                % (field, field_not, conversion))
+
+        def sanitize_string_field(info, string_field):
+            field = info.get(string_field)
+            if field is None or isinstance(field, compat_str):
+                return
+            report_force_conversion(string_field, 'a string', 'string')
+            info[string_field] = compat_str(field)
+
+        def sanitize_numeric_fields(info):
+            for numeric_field in self._NUMERIC_FIELDS:
+                field = info.get(numeric_field)
+                if field is None or isinstance(field, compat_numeric_types):
+                    continue
+                report_force_conversion(numeric_field, 'numeric', 'int')
+                info[numeric_field] = int_or_none(field)
+
+        sanitize_string_field(info_dict, 'id')
+        sanitize_numeric_fields(info_dict)

        if 'playlist' not in info_dict:
            # It isn't part of a playlist
@@ -1428,15 +1448,25 @@ class YoutubeDL(object):
        if not formats:
            raise ExtractorError('No video formats found!')

+        def is_wellformed(f):
+            url = f.get('url')
+            valid_url = url and isinstance(url, compat_str)
+            if not valid_url:
+                self.report_warning(
+                    '"url" field is missing or empty - skipping format, '
+                    'there is an error in extractor')
+            return valid_url
+
+        # Filter out malformed formats for better extraction robustness
+        formats = list(filter(is_wellformed, formats))
+
        formats_dict = {}

        # We check that all the formats have the format and format_id fields
        for i, format in enumerate(formats):
-            if 'url' not in format:
-                raise ExtractorError('Missing "url" key in result (index %d)' % i)
-
+            sanitize_string_field(format, 'format_id')
+            sanitize_numeric_fields(format)
            format['url'] = sanitize_url(format['url'])
-
            if format.get('format_id') is None:
                format['format_id'] = compat_str(i)
            else:
@@ -1860,7 +1890,7 @@ class YoutubeDL(object):
                        info_dict.get('protocol') == 'm3u8' and
                        self.params.get('hls_prefer_native')):
                    if fixup_policy == 'warn':
-                        self.report_warning('%s: malformated aac bitstream.' % (
+                        self.report_warning('%s: malformed AAC bitstream detected.' % (
                            info_dict['id']))
                    elif fixup_policy == 'detect_or_warn':
                        fixup_pp = FFmpegFixupM3u8PP(self)
@@ -1869,7 +1899,7 @@ class YoutubeDL(object):
                            info_dict['__postprocessors'].append(fixup_pp)
                        else:
                            self.report_warning(
-                                '%s: malformated aac bitstream. %s'
+                                '%s: malformed AAC bitstream detected. %s'
                                % (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
                    else:
                        assert fixup_policy in ('ignore', 'never')
--- a/youtube_dl/compat.py
+++ b/youtube_dl/compat.py
@@ -2322,6 +2322,19 @@ try:
 except ImportError:  # Python 2
    from HTMLParser import HTMLParser as compat_HTMLParser

+try:  # Python 2
+    from HTMLParser import HTMLParseError as compat_HTMLParseError
+except ImportError:  # Python <3.4
+    try:
+        from html.parser import HTMLParseError as compat_HTMLParseError
+    except ImportError:  # Python >3.4
+
+        # HTMLParseError has been deprecated in Python 3.3 and removed in
+        # Python 3.5. Introducing dummy exception for Python >3.5 for compatible
+        # and uniform cross-version exceptiong handling
+        class compat_HTMLParseError(Exception):
+            pass
+
 try:
    from subprocess import DEVNULL
    compat_subprocess_get_DEVNULL = lambda: DEVNULL
@@ -2604,14 +2617,22 @@ except ImportError:  # Python 2
                parsed_result[name] = [value]
        return parsed_result

-try:
-    from shlex import quote as compat_shlex_quote
-except ImportError:  # Python < 3.3
+
+compat_os_name = os._name if os.name == 'java' else os.name
+
+
+if compat_os_name == 'nt':
    def compat_shlex_quote(s):
-        if re.match(r'^[-_\w./]+$', s):
-            return s
-        else:
-            return "'" + s.replace("'", "'\"'\"'") + "'"
+        return s if re.match(r'^[-_\w./]+$', s) else '"%s"' % s.replace('"', '\\"')
+else:
+    try:
+        from shlex import quote as compat_shlex_quote
+    except ImportError:  # Python < 3.3
+        def compat_shlex_quote(s):
+            if re.match(r'^[-_\w./]+$', s):
+                return s
+            else:
+                return "'" + s.replace("'", "'\"'\"'") + "'"


 try:
@@ -2636,9 +2657,6 @@ def compat_ord(c):
        return ord(c)


-compat_os_name = os._name if os.name == 'java' else os.name
-
-
 if sys.version_info >= (3, 0):
    compat_getenv = os.getenv
    compat_expanduser = os.path.expanduser
@@ -2882,6 +2900,7 @@ else:


 __all__ = [
+    'compat_HTMLParseError',
    'compat_HTMLParser',
    'compat_HTTPError',
    'compat_basestring',
--- a/youtube_dl/downloader/common.py
+++ b/youtube_dl/downloader/common.py
@@ -8,10 +8,11 @@ import random

 from ..compat import compat_os_name
 from ..utils import (
+    decodeArgument,
    encodeFilename,
    error_to_compat_str,
-    decodeArgument,
    format_bytes,
+    shell_quote,
    timeconvert,
 )

@@ -381,10 +382,5 @@ class FileDownloader(object):
        if exe is None:
            exe = os.path.basename(str_args[0])

-        try:
-            import pipes
-            shell_quote = lambda args: ' '.join(map(pipes.quote, str_args))
-        except ImportError:
-            shell_quote = repr
        self.to_screen('[debug] %s command line: %s' % (
            exe, shell_quote(str_args)))
--- a/youtube_dl/extractor/abc.py
+++ b/youtube_dl/extractor/abc.py
@@ -3,11 +3,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
    ExtractorError,
    js_to_json,
    int_or_none,
    parse_iso8601,
+    try_get,
 )


@@ -124,7 +126,20 @@ class ABCIViewIE(InfoExtractor):
        title = video_params.get('title') or video_params['seriesTitle']
        stream = next(s for s in video_params['playlist'] if s.get('type') == 'program')

-        formats = self._extract_akamai_formats(stream['hds-unmetered'], video_id)
+        format_urls = [
+            try_get(stream, lambda x: x['hds-unmetered'], compat_str)]
+
+        # May have higher quality video
+        sd_url = try_get(
+            stream, lambda x: x['streams']['hds']['sd'], compat_str)
+        if sd_url:
+            format_urls.append(sd_url.replace('metered', 'um'))
+
+        formats = []
+        for format_url in format_urls:
+            if format_url:
+                formats.extend(
+                    self._extract_akamai_formats(format_url, video_id))
        self._sort_formats(formats)

        subtitles = {}
--- a/youtube_dl/extractor/abcotvs.py
+++ b/youtube_dl/extractor/abcotvs.py
@@ -22,7 +22,7 @@ class ABCOTVSIE(InfoExtractor):
                'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
                'ext': 'mp4',
                'title': 'East Bay museum celebrates vintage synthesizers',
-                'description': 'md5:a4f10fb2f2a02565c1749d4adbab4b10',
+                'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3',
                'thumbnail': r're:^https?://.*\.jpg$',
                'timestamp': 1421123075,
                'upload_date': '20150113',
--- a/youtube_dl/extractor/adobepass.py
+++ b/youtube_dl/extractor/adobepass.py
@@ -6,12 +6,16 @@ import time
 import xml.etree.ElementTree as etree

 from .common import InfoExtractor
-from ..compat import compat_urlparse
+from ..compat import (
+    compat_kwargs,
+    compat_urlparse,
+)
 from ..utils import (
    unescapeHTML,
    urlencode_postdata,
    unified_timestamp,
    ExtractorError,
+    NO_DEFAULT,
 )


@@ -21,6 +25,11 @@ MSO_INFO = {
        'username_field': 'username',
        'password_field': 'password',
    },
+    'ATTOTT': {
+        'name': 'DIRECTV NOW',
+        'username_field': 'email',
+        'password_field': 'loginpassword',
+    },
    'Rogers': {
        'name': 'Rogers',
        'username_field': 'UserName',
@@ -1313,11 +1322,14 @@ class AdobePassIE(InfoExtractor):
    _USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
    _MVPD_CACHE = 'ap-mvpd'

+    _DOWNLOADING_LOGIN_PAGE = 'Downloading Provider Login Page'
+
    def _download_webpage_handle(self, *args, **kwargs):
        headers = kwargs.get('headers', {})
        headers.update(self.geo_verification_headers())
        kwargs['headers'] = headers
-        return super(AdobePassIE, self)._download_webpage_handle(*args, **kwargs)
+        return super(AdobePassIE, self)._download_webpage_handle(
+            *args, **compat_kwargs(kwargs))

    @staticmethod
    def _get_mvpd_resource(provider_id, title, guid, rating):
@@ -1361,6 +1373,21 @@ class AdobePassIE(InfoExtractor):
                'Use --ap-mso to specify Adobe Pass Multiple-system operator Identifier '
                'and --ap-username and --ap-password or --netrc to provide account credentials.', expected=True)

+        def extract_redirect_url(html, url=None, fatal=False):
+            # TODO: eliminate code duplication with generic extractor and move
+            # redirection code into _download_webpage_handle
+            REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)'
+            redirect_url = self._search_regex(
+                r'(?i)<meta\s+(?=(?:[a-z-]+="[^"]+"\s+)*http-equiv="refresh")'
+                r'(?:[a-z-]+="[^"]+"\s+)*?content="%s' % REDIRECT_REGEX,
+                html, 'meta refresh redirect',
+                default=NO_DEFAULT if fatal else None, fatal=fatal)
+            if not redirect_url:
+                return None
+            if url:
+                redirect_url = compat_urlparse.urljoin(url, unescapeHTML(redirect_url))
+            return redirect_url
+
        mvpd_headers = {
            'ap_42': 'anonymous',
            'ap_11': 'Linux i686',
@@ -1410,16 +1437,15 @@ class AdobePassIE(InfoExtractor):
                        if '<form name="signin"' in provider_redirect_page:
                            provider_login_page_res = provider_redirect_page_res
                        elif 'http-equiv="refresh"' in provider_redirect_page:
-                            oauth_redirect_url = self._html_search_regex(
-                                r'content="0;\s*url=([^\'"]+)',
-                                provider_redirect_page, 'meta refresh redirect')
+                            oauth_redirect_url = extract_redirect_url(
+                                provider_redirect_page, fatal=True)
                            provider_login_page_res = self._download_webpage_handle(
                                oauth_redirect_url, video_id,
-                                'Downloading Provider Login Page')
+                                self._DOWNLOADING_LOGIN_PAGE)
                        else:
                            provider_login_page_res = post_form(
                                provider_redirect_page_res,
-                                'Downloading Provider Login Page')
+                                self._DOWNLOADING_LOGIN_PAGE)

                        mvpd_confirm_page_res = post_form(
                            provider_login_page_res, 'Logging in', {
@@ -1466,8 +1492,17 @@ class AdobePassIE(InfoExtractor):
                            'Content-Type': 'application/x-www-form-urlencoded'
                        })
                else:
+                    # Some providers (e.g. DIRECTV NOW) have another meta refresh
+                    # based redirect that should be followed.
+                    provider_redirect_page, urlh = provider_redirect_page_res
+                    provider_refresh_redirect_url = extract_redirect_url(
+                        provider_redirect_page, url=urlh.geturl())
+                    if provider_refresh_redirect_url:
+                        provider_redirect_page_res = self._download_webpage_handle(
+                            provider_refresh_redirect_url, video_id,
+                            'Downloading Provider Redirect Page (meta refresh)')
                    provider_login_page_res = post_form(
-                        provider_redirect_page_res, 'Downloading Provider Login Page')
+                        provider_redirect_page_res, self._DOWNLOADING_LOGIN_PAGE)
                    mvpd_confirm_page_res = post_form(provider_login_page_res, 'Logging in', {
                        mso_info.get('username_field', 'username'): username,
                        mso_info.get('password_field', 'password'): password,
--- a/youtube_dl/extractor/asiancrush.py
+++ b/youtube_dl/extractor/asiancrush.py
@@ -0,0 +1,93 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from .kaltura import KalturaIE
+from ..utils import (
+    extract_attributes,
+    remove_end,
+    urlencode_postdata,
+)
+
+
+class AsianCrushIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
+    _TESTS = [{
+        'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
+        'md5': 'c3b740e48d0ba002a42c0b72857beae6',
+        'info_dict': {
+            'id': '1_y4tmjm5r',
+            'ext': 'mp4',
+            'title': 'Women Who Flirt',
+            'description': 'md5:3db14e9186197857e7063522cb89a805',
+            'timestamp': 1496936429,
+            'upload_date': '20170608',
+            'uploader_id': 'craig@crifkin.com',
+        },
+    }, {
+        'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        data = self._download_json(
+            'https://www.asiancrush.com/wp-admin/admin-ajax.php', video_id,
+            data=urlencode_postdata({
+                'postid': video_id,
+                'action': 'get_channel_kaltura_vars',
+            }))
+
+        entry_id = data['entry_id']
+
+        return self.url_result(
+            'kaltura:%s:%s' % (data['partner_id'], entry_id),
+            ie=KalturaIE.ie_key(), video_id=entry_id,
+            video_title=data.get('vid_label'))
+
+
+class AsianCrushPlaylistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/series/0+(?P<id>\d+)s\b'
+    _TEST = {
+        'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
+        'info_dict': {
+            'id': '12481',
+            'title': 'Scholar Who Walks the Night',
+            'description': 'md5:7addd7c5132a09fd4741152d96cce886',
+        },
+        'playlist_count': 20,
+    }
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        entries = []
+
+        for mobj in re.finditer(
+                r'<a[^>]+href=(["\'])(?P<url>%s.*?)\1[^>]*>' % AsianCrushIE._VALID_URL,
+                webpage):
+            attrs = extract_attributes(mobj.group(0))
+            if attrs.get('class') == 'clearfix':
+                entries.append(self.url_result(
+                    mobj.group('url'), ie=AsianCrushIE.ie_key()))
+
+        title = remove_end(
+            self._html_search_regex(
+                r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
+                'title', default=None) or self._og_search_title(
+                webpage, default=None) or self._html_search_meta(
+                'twitter:title', webpage, 'title',
+                default=None) or self._search_regex(
+                r'<title>([^<]+)</title>', webpage, 'title', fatal=False),
+            ' | AsianCrush')
+
+        description = self._og_search_description(
+            webpage, default=None) or self._html_search_meta(
+            'twitter:description', webpage, 'description', fatal=False)
+
+        return self.playlist_result(entries, playlist_id, title, description)
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -36,7 +36,7 @@ class BBCCoUkIE(InfoExtractor):
                        (?:
                            programmes/(?!articles/)|
                            iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
-                            music/clips[/#]|
+                            music/(?:clips|audiovideo/popular)[/#]|
                            radio/player/
                        )
                        (?P<id>%s)(?!/(?:episodes|broadcasts|clips))
@@ -229,8 +229,10 @@ class BBCCoUkIE(InfoExtractor):
        }, {
            'url': 'http://www.bbc.co.uk/radio/player/p03cchwf',
            'only_matching': True,
-        }
-    ]
+        }, {
+            'url': 'https://www.bbc.co.uk/music/audiovideo/popular#p055bc55',
+            'only_matching': True,
+        }]

    _USP_RE = r'/([^/]+?)\.ism(?:\.hlsv2\.ism)?/[^/]+\.m3u8'

@@ -523,6 +525,12 @@ class BBCCoUkIE(InfoExtractor):

        webpage = self._download_webpage(url, group_id, 'Downloading video page')

+        error = self._search_regex(
+            r'<div\b[^>]+\bclass=["\']smp__message delta["\'][^>]*>([^<]+)<',
+            webpage, 'error', default=None)
+        if error:
+            raise ExtractorError(error, expected=True)
+
        programme_id = None
        duration = None

--- a/youtube_dl/extractor/bilibili.py
+++ b/youtube_dl/extractor/bilibili.py
@@ -54,6 +54,22 @@ class BiliBiliIE(InfoExtractor):
            'description': '如果你是神明，并且能够让妄想成为现实。那你会进行怎么样的妄想？是淫靡的世界？独裁社会？毁灭性的制裁？还是……2015年，涩谷。从6年前发生的大灾害“涩谷地震”之后复兴了的这个街区里新设立的私立高中...',
        },
        'skip': 'Geo-restricted to China',
+    }, {
+        # Title with double quotes
+        'url': 'http://www.bilibili.com/video/av8903802/',
+        'info_dict': {
+            'id': '8903802',
+            'ext': 'mp4',
+            'title': '阿滴英文｜英文歌分享#6 "Closer',
+            'description': '滴妹今天唱Closer給你聽! 有史以来，被推最多次也是最久的歌曲，其实歌词跟我原本想像差蛮多的，不过还是好听！ 微博@阿滴英文',
+            'uploader': '阿滴英文',
+            'uploader_id': '65880958',
+            'timestamp': 1488382620,
+            'upload_date': '20170301',
+        },
+        'params': {
+            'skip_download': True,  # Test metadata only
+        },
    }]

    _APP_KEY = '84956560bc028eb7'
@@ -135,7 +151,7 @@ class BiliBiliIE(InfoExtractor):
                'formats': formats,
            })

-        title = self._html_search_regex('<h1[^>]+title="([^"]+)">', webpage, 'title')
+        title = self._html_search_regex('<h1[^>]*>([^<]+)</h1>', webpage, 'title')
        description = self._html_search_meta('description', webpage)
        timestamp = unified_timestamp(self._html_search_regex(
            r'<time[^>]+datetime="([^"]+)"', webpage, 'upload time', default=None))
--- a/youtube_dl/extractor/buzzfeed.py
+++ b/youtube_dl/extractor/buzzfeed.py
@@ -84,9 +84,10 @@ class BuzzFeedIE(InfoExtractor):
                continue
            entries.append(self.url_result(video['url']))

-        facebook_url = FacebookIE._extract_url(webpage)
-        if facebook_url:
-            entries.append(self.url_result(facebook_url))
+        facebook_urls = FacebookIE._extract_urls(webpage)
+        entries.extend([
+            self.url_result(facebook_url)
+            for facebook_url in facebook_urls])

        return {
            '_type': 'playlist',
--- a/youtube_dl/extractor/cbsnews.py
+++ b/youtube_dl/extractor/cbsnews.py
@@ -15,19 +15,23 @@ class CBSNewsIE(CBSIE):

    _TESTS = [
        {
-            'url': 'http://www.cbsnews.com/news/tesla-and-spacex-elon-musks-industrial-empire/',
+            # 60 minutes
+            'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/',
            'info_dict': {
-                'id': 'tesla-and-spacex-elon-musks-industrial-empire',
-                'ext': 'flv',
-                'title': 'Tesla and SpaceX: Elon Musk\'s industrial empire',
-                'thumbnail': 'http://beta.img.cbsnews.com/i/2014/03/30/60147937-2f53-4565-ad64-1bdd6eb64679/60-0330-pelley-640x360.jpg',
-                'duration': 791,
+                'id': '_B6Ga3VJrI4iQNKsir_cdFo9Re_YJHE_',
+                'ext': 'mp4',
+                'title': 'Artificial Intelligence',
+                'description': 'md5:8818145f9974431e0fb58a1b8d69613c',
+                'thumbnail': r're:^https?://.*\.jpg$',
+                'duration': 1606,
+                'uploader': 'CBSI-NEW',
+                'timestamp': 1498431900,
+                'upload_date': '20170625',
            },
            'params': {
-                # rtmp download
+                # m3u8 download
                'skip_download': True,
            },
-            'skip': 'Subscribers only',
        },
        {
            'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
@@ -52,6 +56,22 @@ class CBSNewsIE(CBSIE):
                'skip_download': True,
            },
        },
+        {
+            # 48 hours
+            'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/',
+            'info_dict': {
+                'id': 'QpM5BJjBVEAUFi7ydR9LusS69DPLqPJ1',
+                'ext': 'mp4',
+                'title': 'Cold as Ice',
+                'description': 'Can a childhood memory of a friend\'s murder solve a 1957 cold case? "48 Hours" correspondent Erin Moriarty has the latest.',
+                'upload_date': '20170604',
+                'timestamp': 1496538000,
+                'uploader': 'CBSI-NEW',
+            },
+            'params': {
+                'skip_download': True,
+            },
+        },
    ]

    def _real_extract(self, url):
@@ -60,7 +80,7 @@ class CBSNewsIE(CBSIE):
        webpage = self._download_webpage(url, video_id)

        video_info = self._parse_json(self._html_search_regex(
-            r'(?:<ul class="media-list items" id="media-related-items"><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
+            r'(?:<ul class="media-list items" id="media-related-items"[^>]*><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
            webpage, 'video JSON info', default='{}'), video_id, fatal=False)

        if video_info:
--- a/youtube_dl/extractor/cjsw.py
+++ b/youtube_dl/extractor/cjsw.py
@@ -0,0 +1,72 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+    unescapeHTML,
+)
+
+
+class CJSWIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?cjsw\.com/program/(?P<program>[^/]+)/episode/(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'http://cjsw.com/program/freshly-squeezed/episode/20170620',
+        'md5': 'cee14d40f1e9433632c56e3d14977120',
+        'info_dict': {
+            'id': '91d9f016-a2e7-46c5-8dcb-7cbcd7437c41',
+            'ext': 'mp3',
+            'title': 'Freshly Squeezed – Episode June 20, 2017',
+            'description': 'md5:c967d63366c3898a80d0c7b0ff337202',
+            'series': 'Freshly Squeezed',
+            'episode_id': '20170620',
+        },
+    }, {
+        # no description
+        'url': 'http://cjsw.com/program/road-pops/episode/20170707/',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        program, episode_id = mobj.group('program', 'id')
+        audio_id = '%s/%s' % (program, episode_id)
+
+        webpage = self._download_webpage(url, episode_id)
+
+        title = unescapeHTML(self._search_regex(
+            (r'<h1[^>]+class=["\']episode-header__title["\'][^>]*>(?P<title>[^<]+)',
+             r'data-audio-title=(["\'])(?P<title>(?:(?!\1).)+)\1'),
+            webpage, 'title', group='title'))
+
+        audio_url = self._search_regex(
+            r'<button[^>]+data-audio-src=(["\'])(?P<url>(?:(?!\1).)+)\1',
+            webpage, 'audio url', group='url')
+
+        audio_id = self._search_regex(
+            r'/([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})\.mp3',
+            audio_url, 'audio id', default=audio_id)
+
+        formats = [{
+            'url': audio_url,
+            'ext': determine_ext(audio_url, 'mp3'),
+            'vcodec': 'none',
+        }]
+
+        description = self._html_search_regex(
+            r'<p>(?P<description>.+?)</p>', webpage, 'description',
+            default=None)
+        series = self._search_regex(
+            r'data-showname=(["\'])(?P<name>(?:(?!\1).)+)\1', webpage,
+            'series', default=program, group='name')
+
+        return {
+            'id': audio_id,
+            'title': title,
+            'description': description,
+            'formats': formats,
+            'series': series,
+            'episode_id': episode_id,
+        }
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -376,7 +376,7 @@ class InfoExtractor(object):
            cls._VALID_URL_RE = re.compile(cls._VALID_URL)
        m = cls._VALID_URL_RE.match(url)
        assert m
-        return m.group('id')
+        return compat_str(m.group('id'))

    @classmethod
    def working(cls):
@@ -420,7 +420,7 @@ class InfoExtractor(object):
            if country_code:
                self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
                if self._downloader.params.get('verbose', False):
-                    self._downloader.to_stdout(
+                    self._downloader.to_screen(
                        '[debug] Using fake IP %s (%s) as X-Forwarded-For.'
                        % (self._x_forwarded_for_ip, country_code.upper()))

@@ -1002,17 +1002,17 @@ class InfoExtractor(object):
                item_type = e.get('@type')
                if expected_type is not None and expected_type != item_type:
                    return info
-                if item_type == 'TVEpisode':
+                if item_type in ('TVEpisode', 'Episode'):
                    info.update({
                        'episode': unescapeHTML(e.get('name')),
                        'episode_number': int_or_none(e.get('episodeNumber')),
                        'description': unescapeHTML(e.get('description')),
                    })
                    part_of_season = e.get('partOfSeason')
-                    if isinstance(part_of_season, dict) and part_of_season.get('@type') == 'TVSeason':
+                    if isinstance(part_of_season, dict) and part_of_season.get('@type') in ('TVSeason', 'Season', 'CreativeWorkSeason'):
                        info['season_number'] = int_or_none(part_of_season.get('seasonNumber'))
                    part_of_series = e.get('partOfSeries') or e.get('partOfTVSeries')
-                    if isinstance(part_of_series, dict) and part_of_series.get('@type') == 'TVSeries':
+                    if isinstance(part_of_series, dict) and part_of_series.get('@type') in ('TVSeries', 'Series', 'CreativeWorkSeries'):
                        info['series'] = unescapeHTML(part_of_series.get('name'))
                elif item_type == 'Article':
                    info.update({
@@ -1022,10 +1022,10 @@ class InfoExtractor(object):
                    })
                elif item_type == 'VideoObject':
                    extract_video_object(e)
-                elif item_type == 'WebPage':
-                    video = e.get('video')
-                    if isinstance(video, dict) and video.get('@type') == 'VideoObject':
-                        extract_video_object(video)
+                    continue
+                video = e.get('video')
+                if isinstance(video, dict) and video.get('@type') == 'VideoObject':
+                    extract_video_object(video)
                break
        return dict((k, v) for k, v in info.items() if v is not None)

@@ -2132,15 +2132,18 @@ class InfoExtractor(object):
            return is_plain_url, formats

        entries = []
+        # amp-video and amp-audio are very similar to their HTML5 counterparts
+        # so we wll include them right here (see
+        # https://www.ampproject.org/docs/reference/components/amp-video)
        media_tags = [(media_tag, media_type, '')
                      for media_tag, media_type
-                      in re.findall(r'(?s)(<(video|audio)[^>]*/>)', webpage)]
+                      in re.findall(r'(?s)(<(?:amp-)?(video|audio)[^>]*/>)', webpage)]
        media_tags.extend(re.findall(
            # We only allow video|audio followed by a whitespace or '>'.
            # Allowing more characters may end up in significant slow down (see
            # https://github.com/rg3/youtube-dl/issues/11979, example URL:
            # http://www.porntrex.com/maps/videositemap.xml).
-            r'(?s)(<(?P<tag>video|audio)(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
+            r'(?s)(<(?P<tag>(?:amp-)?(?:video|audio))(?:\s+[^>]*)?>)(.*?)</(?P=tag)>', webpage))
        for media_tag, media_type, media_content in media_tags:
            media_info = {
                'formats': [],
@@ -2299,6 +2302,8 @@ class InfoExtractor(object):
            tracks = video_data.get('tracks')
            if tracks and isinstance(tracks, list):
                for track in tracks:
+                    if not isinstance(track, dict):
+                        continue
                    if track.get('kind') != 'captions':
                        continue
                    track_url = urljoin(base_url, track.get('file'))
@@ -2328,6 +2333,8 @@ class InfoExtractor(object):
        urls = []
        formats = []
        for source in jwplayer_sources_data:
+            if not isinstance(source, dict):
+                continue
            source_url = self._proto_relative_url(source.get('file'))
            if not source_url:
                continue
--- a/youtube_dl/extractor/corus.py
+++ b/youtube_dl/extractor/corus.py
@@ -8,7 +8,16 @@ from ..utils import int_or_none


 class CorusIE(ThePlatformFeedIE):
-    _VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:globaltv|etcanada)\.com|(?:hgtv|foodnetwork|slice)\.ca)/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))(?P<id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?
+                        (?P<domain>
+                            (?:globaltv|etcanada)\.com|
+                            (?:hgtv|foodnetwork|slice|history|showcase)\.ca
+                        )
+                        /(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
+                        (?P<id>\d+)
+                    '''
    _TESTS = [{
        'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/',
        'md5': '05dcbca777bf1e58c2acbb57168ad3a6',
@@ -27,6 +36,12 @@ class CorusIE(ThePlatformFeedIE):
    }, {
        'url': 'http://etcanada.com/video/873675331955/meet-the-survivor-game-changers-castaways-part-2/',
        'only_matching': True,
+    }, {
+        'url': 'http://www.history.ca/the-world-without-canada/video/full-episodes/natural-resources/video.html?v=955054659646#video',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.showcase.ca/eyewitness/video/eyewitness++106/video.html?v=955070531919&p=1&s=da#video',
+        'only_matching': True,
    }]

    _TP_FEEDS = {
@@ -50,6 +65,14 @@ class CorusIE(ThePlatformFeedIE):
            'feed_id': '5tUJLgV2YNJ5',
            'account_id': 2414427935,
        },
+        'history': {
+            'feed_id': 'tQFx_TyyEq4J',
+            'account_id': 2369613659,
+        },
+        'showcase': {
+            'feed_id': '9H6qyshBZU3E',
+            'account_id': 2414426607,
+        },
    }

    def _real_extract(self, url):
--- a/youtube_dl/extractor/dailymail.py
+++ b/youtube_dl/extractor/dailymail.py
@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..compat import compat_str
 from ..utils import (
@@ -12,8 +14,8 @@ from ..utils import (


 class DailyMailIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?dailymail\.co\.uk/video/[^/]+/video-(?P<id>[0-9]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?:www\.)?dailymail\.co\.uk/(?:video/[^/]+/video-|embed/video/)(?P<id>[0-9]+)'
+    _TESTS = [{
        'url': 'http://www.dailymail.co.uk/video/tvshowbiz/video-1295863/The-Mountain-appears-sparkling-water-ad-Heavy-Bubbles.html',
        'md5': 'f6129624562251f628296c3a9ffde124',
        'info_dict': {
@@ -22,7 +24,16 @@ class DailyMailIE(InfoExtractor):
            'title': 'The Mountain appears in sparkling water ad for \'Heavy Bubbles\'',
            'description': 'md5:a93d74b6da172dd5dc4d973e0b766a84',
        }
-    }
+    }, {
+        'url': 'http://www.dailymail.co.uk/embed/video/1295863.html',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe\b[^>]+\bsrc=["\'](?P<url>(?:https?:)?//(?:www\.)?dailymail\.co\.uk/embed/video/\d+\.html)',
+            webpage)

    def _real_extract(self, url):
        video_id = self._match_id(url)
--- a/youtube_dl/extractor/dailymotion.py
+++ b/youtube_dl/extractor/dailymotion.py
@@ -147,7 +147,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
        view_count_str = self._search_regex(
            (r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([\s\d,.]+)"',
             r'video_views_count[^>]+>\s+([\s\d\,.]+)'),
-            webpage, 'view count', fatal=False)
+            webpage, 'view count', default=None)
        if view_count_str:
            view_count_str = re.sub(r'\s', '', view_count_str)
        view_count = str_to_int(view_count_str)
@@ -159,7 +159,9 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
            [r'buildPlayer\(({.+?})\);\n',  # See https://github.com/rg3/youtube-dl/issues/7826
             r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);',
             r'buildPlayer\(({.+?})\);',
-             r'var\s+config\s*=\s*({.+?});'],
+             r'var\s+config\s*=\s*({.+?});',
+             # New layout regex (see https://github.com/rg3/youtube-dl/issues/13580)
+             r'__PLAYER_CONFIG__\s*=\s*({.+?});'],
            webpage, 'player v5', default=None)
        if player_v5:
            player = self._parse_json(player_v5, video_id)
--- a/youtube_dl/extractor/disney.py
+++ b/youtube_dl/extractor/disney.py
@@ -15,7 +15,7 @@ from ..utils import (

 class DisneyIE(InfoExtractor):
    _VALID_URL = r'''(?x)
-        https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr)|(?:starwars|marvelkids)\.com))/(?:(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})|(?:[^/]+/)?(?P<display_id>[^/?#]+))'''
+        https?://(?P<domain>(?:[^/]+\.)?(?:disney\.[a-z]{2,3}(?:\.[a-z]{2})?|disney(?:(?:me|latino)\.com|turkiye\.com\.tr|channel\.de)|(?:starwars|marvelkids)\.com))/(?:(?:embed/|(?:[^/]+/)+[\w-]+-)(?P<id>[a-z0-9]{24})|(?:[^/]+/)?(?P<display_id>[^/?#]+))'''
    _TESTS = [{
        # Disney.EmbedVideo
        'url': 'http://video.disney.com/watch/moana-trailer-545ed1857afee5a0ec239977',
@@ -68,6 +68,9 @@ class DisneyIE(InfoExtractor):
    }, {
        'url': 'http://disneyjunior.en.disneyme.com/dj/watch-my-friends-tigger-and-pooh-promo',
        'only_matching': True,
+    }, {
+        'url': 'http://disneychannel.de/sehen/soy-luna-folge-118-5518518987ba27f3cc729268',
+        'only_matching': True,
    }, {
        'url': 'http://disneyjunior.disney.com/galactech-the-galactech-grab-galactech-an-admiral-rescue',
        'only_matching': True,
--- a/youtube_dl/extractor/dplay.py
+++ b/youtube_dl/extractor/dplay.py
@@ -184,7 +184,7 @@ class DPlayItIE(InfoExtractor):
        webpage = self._download_webpage(url, display_id)

        info_url = self._search_regex(
-            r'url\s*:\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
+            r'url\s*[:=]\s*["\']((?:https?:)?//[^/]+/playback/videoPlaybackInfo/\d+)',
            webpage, 'video id')

        title = remove_end(self._og_search_title(webpage), ' | Dplay')
--- a/youtube_dl/extractor/drtuber.py
+++ b/youtube_dl/extractor/drtuber.py
@@ -44,8 +44,23 @@ class DrTuberIE(InfoExtractor):
        webpage = self._download_webpage(
            'http://www.drtuber.com/video/%s' % video_id, display_id)

-        video_url = self._html_search_regex(
-            r'<source src="([^"]+)"', webpage, 'video URL')
+        video_data = self._download_json(
+            'http://www.drtuber.com/player_config_json/', video_id, query={
+                'vid': video_id,
+                'embed': 0,
+                'aid': 0,
+                'domain_id': 0,
+            })
+
+        formats = []
+        for format_id, video_url in video_data['files'].items():
+            if video_url:
+                formats.append({
+                    'format_id': format_id,
+                    'quality': 2 if format_id == 'hq' else 1,
+                    'url': video_url
+                })
+        self._sort_formats(formats)

        title = self._html_search_regex(
            (r'class="title_watch"[^>]*><(?:p|h\d+)[^>]*>([^<]+)<',
@@ -75,7 +90,7 @@ class DrTuberIE(InfoExtractor):
        return {
            'id': video_id,
            'display_id': display_id,
-            'url': video_url,
+            'formats': formats,
            'title': title,
            'thumbnail': thumbnail,
            'like_count': like_count,
--- a/youtube_dl/extractor/eagleplatform.py
+++ b/youtube_dl/extractor/eagleplatform.py
@@ -11,6 +11,7 @@ from ..compat import (
 from ..utils import (
    ExtractorError,
    int_or_none,
+    unsmuggle_url,
 )


@@ -50,6 +51,10 @@ class EaglePlatformIE(InfoExtractor):
            'view_count': int,
        },
        'skip': 'Georestricted',
+    }, {
+        # referrer protected video (https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/)
+        'url': 'tvrainru.media.eagleplatform.com:582306',
+        'only_matching': True,
    }]

    @staticmethod
@@ -60,16 +65,40 @@ class EaglePlatformIE(InfoExtractor):
            webpage)
        if mobj is not None:
            return mobj.group('url')
-        # Basic usage embedding (see http://dultonmedia.github.io/eplayer/)
+        PLAYER_JS_RE = r'''
+                        <script[^>]+
+                            src=(?P<qjs>["\'])(?:https?:)?//(?P<host>(?:(?!(?P=qjs)).)+\.media\.eagleplatform\.com)/player/player\.js(?P=qjs)
+                        .+?
+                    '''
+        # "Basic usage" embedding (see http://dultonmedia.github.io/eplayer/)
        mobj = re.search(
            r'''(?xs)
-                    <script[^>]+
-                        src=(?P<q1>["\'])(?:https?:)?//(?P<host>.+?\.media\.eagleplatform\.com)/player/player\.js(?P=q1)
-                    .+?
+                    %s
                    <div[^>]+
-                        class=(?P<q2>["\'])eagleplayer(?P=q2)[^>]+
+                        class=(?P<qclass>["\'])eagleplayer(?P=qclass)[^>]+
                        data-id=["\'](?P<id>\d+)
-            ''', webpage)
+            ''' % PLAYER_JS_RE, webpage)
+        if mobj is not None:
+            return 'eagleplatform:%(host)s:%(id)s' % mobj.groupdict()
+        # Generalization of "Javascript code usage", "Combined usage" and
+        # "Usage without attaching to DOM" embeddings (see
+        # http://dultonmedia.github.io/eplayer/)
+        mobj = re.search(
+            r'''(?xs)
+                    %s
+                    <script>
+                    .+?
+                    new\s+EaglePlayer\(
+                        (?:[^,]+\s*,\s*)?
+                        {
+                            .+?
+                            \bid\s*:\s*["\']?(?P<id>\d+)
+                            .+?
+                        }
+                    \s*\)
+                    .+?
+                    </script>
+            ''' % PLAYER_JS_RE, webpage)
        if mobj is not None:
            return 'eagleplatform:%(host)s:%(id)s' % mobj.groupdict()

@@ -79,9 +108,10 @@ class EaglePlatformIE(InfoExtractor):
        if status != 200:
            raise ExtractorError(' '.join(response['errors']), expected=True)

-    def _download_json(self, url_or_request, video_id, note='Downloading JSON metadata', *args, **kwargs):
+    def _download_json(self, url_or_request, video_id, *args, **kwargs):
        try:
-            response = super(EaglePlatformIE, self)._download_json(url_or_request, video_id, note)
+            response = super(EaglePlatformIE, self)._download_json(
+                url_or_request, video_id, *args, **kwargs)
        except ExtractorError as ee:
            if isinstance(ee.cause, compat_HTTPError):
                response = self._parse_json(ee.cause.read().decode('utf-8'), video_id)
@@ -93,11 +123,24 @@ class EaglePlatformIE(InfoExtractor):
        return self._download_json(url_or_request, video_id, note)['data'][0]

    def _real_extract(self, url):
+        url, smuggled_data = unsmuggle_url(url, {})
+
        mobj = re.match(self._VALID_URL, url)
        host, video_id = mobj.group('custom_host') or mobj.group('host'), mobj.group('id')

+        headers = {}
+        query = {
+            'id': video_id,
+        }
+
+        referrer = smuggled_data.get('referrer')
+        if referrer:
+            headers['Referer'] = referrer
+            query['referrer'] = referrer
+
        player_data = self._download_json(
-            'http://%s/api/player_data?id=%s' % (host, video_id), video_id)
+            'http://%s/api/player_data' % host, video_id,
+            headers=headers, query=query)

        media = player_data['data']['playlist']['viewports'][0]['medialist'][0]

--- a/youtube_dl/extractor/egghead.py
+++ b/youtube_dl/extractor/egghead.py
@@ -1,15 +1,13 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor


 class EggheadCourseIE(InfoExtractor):
    IE_DESC = 'egghead.io course'
    IE_NAME = 'egghead:course'
-    _VALID_URL = r'https://egghead\.io/courses/(?P<id>[a-zA-Z_0-9-]+)'
+    _VALID_URL = r'https://egghead\.io/courses/(?P<id>[^/?#&]+)'
    _TEST = {
        'url': 'https://egghead.io/courses/professor-frisby-introduces-composable-functional-javascript',
        'playlist_count': 29,
@@ -22,18 +20,16 @@ class EggheadCourseIE(InfoExtractor):

    def _real_extract(self, url):
        playlist_id = self._match_id(url)
-        webpage = self._download_webpage(url, playlist_id)

-        title = self._html_search_regex(r'<h1 class="title">([^<]+)</h1>', webpage, 'title')
-        ul = self._search_regex(r'(?s)<ul class="series-lessons-list">(.*?)</ul>', webpage, 'session list')
+        course = self._download_json(
+            'https://egghead.io/api/v1/series/%s' % playlist_id, playlist_id)

-        found = re.findall(r'(?s)<a class="[^"]*"\s*href="([^"]+)">\s*<li class="item', ul)
-        entries = [self.url_result(m) for m in found]
+        entries = [
+            self.url_result(
+                'wistia:%s' % lesson['wistia_id'], ie='Wistia',
+                video_id=lesson['wistia_id'], video_title=lesson.get('title'))
+            for lesson in course['lessons'] if lesson.get('wistia_id')]

-        return {
-            '_type': 'playlist',
-            'id': playlist_id,
-            'title': title,
-            'description': self._og_search_description(webpage),
-            'entries': entries,
-        }
+        return self.playlist_result(
+            entries, playlist_id, course.get('title'),
+            course.get('description'))
--- a/youtube_dl/extractor/espn.py
+++ b/youtube_dl/extractor/espn.py
@@ -10,7 +10,25 @@ from ..utils import (


 class ESPNIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:espn\.go|(?:www\.)?espn)\.com/video/clip(?:\?.*?\bid=|/_/id/)(?P<id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:
+                            (?:(?:\w+\.)+)?espn\.go|
+                            (?:www\.)?espn
+                        )\.com/
+                        (?:
+                            (?:
+                                video/clip|
+                                watch/player
+                            )
+                            (?:
+                                \?.*?\bid=|
+                                /_/id/
+                            )
+                        )
+                        (?P<id>\d+)
+                    '''
+
    _TESTS = [{
        'url': 'http://espn.go.com/video/clip?id=10365079',
        'info_dict': {
@@ -25,20 +43,34 @@ class ESPNIE(InfoExtractor):
            'skip_download': True,
        },
    }, {
-        # intl video, from http://www.espnfc.us/video/mls-highlights/150/video/2743663/must-see-moments-best-of-the-mls-season
-        'url': 'http://espn.go.com/video/clip?id=2743663',
+        'url': 'https://broadband.espn.go.com/video/clip?id=18910086',
        'info_dict': {
-            'id': '2743663',
+            'id': '18910086',
            'ext': 'mp4',
-            'title': 'Must-See Moments: Best of the MLS season',
-            'description': 'md5:4c2d7232beaea572632bec41004f0aeb',
-            'timestamp': 1449446454,
-            'upload_date': '20151207',
+            'title': 'Kyrie spins around defender for two',
+            'description': 'md5:2b0f5bae9616d26fba8808350f0d2b9b',
+            'timestamp': 1489539155,
+            'upload_date': '20170315',
        },
        'params': {
            'skip_download': True,
        },
        'expected_warnings': ['Unable to download f4m manifest'],
+    }, {
+        'url': 'http://nonredline.sports.espn.go.com/video/clip?id=19744672',
+        'only_matching': True,
+    }, {
+        'url': 'https://cdn.espn.go.com/video/clip/_/id/19771774',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.espn.com/watch/player?id=19141491',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.espn.com/watch/player?bucketId=257&id=19505875',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.espn.com/watch/player/_/id/19141491',
+        'only_matching': True,
    }, {
        'url': 'http://www.espn.com/video/clip?id=10365079',
        'only_matching': True,
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -71,6 +71,10 @@ from .arte import (
    TheOperaPlatformIE,
    ArteTVPlaylistIE,
 )
+from .asiancrush import (
+    AsianCrushIE,
+    AsianCrushPlaylistIE,
+)
 from .atresplayer import AtresPlayerIE
 from .atttechchannel import ATTTechChannelIE
 from .atvat import ATVAtIE
@@ -181,6 +185,7 @@ from .chirbit import (
    ChirbitProfileIE,
 )
 from .cinchcast import CinchcastIE
+from .cjsw import CJSWIE
 from .clipfish import ClipfishIE
 from .cliphunter import CliphunterIE
 from .cliprs import ClipRsIE
@@ -465,6 +470,7 @@ from .jamendo import (
 )
 from .jeuxvideo import JeuxVideoIE
 from .jove import JoveIE
+from .joj import JojIE
 from .jwplatform import JWPlatformIE
 from .jpopsukitv import JpopsukiIE
 from .kaltura import KalturaIE
@@ -636,7 +642,10 @@ from .neteasemusic import (
    NetEaseMusicProgramIE,
    NetEaseMusicDjRadioIE,
 )
-from .newgrounds import NewgroundsIE
+from .newgrounds import (
+    NewgroundsIE,
+    NewgroundsPlaylistIE,
+)
 from .newstube import NewstubeIE
 from .nextmedia import (
    NextMediaIE,
@@ -817,6 +826,7 @@ from .radiobremen import RadioBremenIE
 from .radiofrance import RadioFranceIE
 from .rai import (
    RaiPlayIE,
+    RaiPlayLiveIE,
    RaiIE,
 )
 from .rbmaradio import RBMARadioIE
@@ -868,6 +878,7 @@ from .rutube import (
 )
 from .rutv import RUTVIE
 from .ruutu import RuutuIE
+from .ruv import RuvIE
 from .sandia import SandiaIE
 from .safari import (
    SafariIE,
@@ -964,6 +975,7 @@ from .tagesschau import (
    TagesschauIE,
 )
 from .tass import TassIE
+from .tastytrade import TastyTradeIE
 from .tbs import TBSIE
 from .tdslifeway import TDSLifewayIE
 from .teachertube import (
@@ -1270,7 +1282,6 @@ from .yahoo import (
    YahooIE,
    YahooSearchIE,
 )
-from .yam import YamIE
 from .yandexmusic import (
    YandexMusicTrackIE,
    YandexMusicAlbumIE,
--- a/youtube_dl/extractor/facebook.py
+++ b/youtube_dl/extractor/facebook.py
@@ -203,19 +203,19 @@ class FacebookIE(InfoExtractor):
    }]

    @staticmethod
-    def _extract_url(webpage):
-        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https://www\.facebook\.com/video/embed.+?)\1', webpage)
-        if mobj is not None:
-            return mobj.group('url')
-
+    def _extract_urls(webpage):
+        urls = []
+        for mobj in re.finditer(
+                r'<iframe[^>]+?src=(["\'])(?P<url>https?://www\.facebook\.com/(?:video/embed|plugins/video\.php).+?)\1',
+                webpage):
+            urls.append(mobj.group('url'))
        # Facebook API embed
        # see https://developers.facebook.com/docs/plugins/embedded-video-player
-        mobj = re.search(r'''(?x)<div[^>]+
+        for mobj in re.finditer(r'''(?x)<div[^>]+
                class=(?P<q1>[\'"])[^\'"]*\bfb-(?:video|post)\b[^\'"]*(?P=q1)[^>]+
-                data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''', webpage)
-        if mobj is not None:
-            return mobj.group('url')
+                data-href=(?P<q2>[\'"])(?P<url>(?:https?:)?//(?:www\.)?facebook.com/.+?)(?P=q2)''', webpage):
+            urls.append(mobj.group('url'))
+        return urls

    def _login(self):
        (useremail, password) = self._get_login_info()
--- a/youtube_dl/extractor/flickr.py
+++ b/youtube_dl/extractor/flickr.py
@@ -1,7 +1,10 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_urlencode
+from ..compat import (
+    compat_str,
+    compat_urllib_parse_urlencode,
+)
 from ..utils import (
    ExtractorError,
    int_or_none,
@@ -81,7 +84,7 @@ class FlickrIE(InfoExtractor):

            formats = []
            for stream in streams['stream']:
-                stream_type = str(stream.get('type'))
+                stream_type = compat_str(stream.get('type'))
                formats.append({
                    'format_id': stream_type,
                    'url': stream['_content'],
--- a/youtube_dl/extractor/fourtube.py
+++ b/youtube_dl/extractor/fourtube.py
@@ -85,11 +85,11 @@ class FourTubeIE(InfoExtractor):
            media_id = params[0]
            sources = ['%s' % p for p in params[2]]

-        token_url = 'http://tkn.4tube.com/{0}/desktop/{1}'.format(
+        token_url = 'https://tkn.kodicdn.com/{0}/desktop/{1}'.format(
            media_id, '+'.join(sources))
        headers = {
            b'Content-Type': b'application/x-www-form-urlencoded',
-            b'Origin': b'http://www.4tube.com',
+            b'Origin': b'https://www.4tube.com',
        }
        token_req = sanitized_Request(token_url, b'{}', headers)
        tokens = self._download_json(token_req, video_id)
--- a/youtube_dl/extractor/foxgay.py
+++ b/youtube_dl/extractor/foxgay.py
@@ -5,6 +5,7 @@ import itertools
 from .common import InfoExtractor
 from ..utils import (
    get_element_by_id,
+    int_or_none,
    remove_end,
 )

@@ -46,7 +47,7 @@ class FoxgayIE(InfoExtractor):

        formats = [{
            'url': source,
-            'height': resolution,
+            'height': int_or_none(resolution),
        } for source, resolution in zip(
            video_data['sources'], video_data.get('resolutions', itertools.repeat(None)))]

--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -10,6 +10,7 @@ from .common import InfoExtractor
 from .youtube import YoutubeIE
 from ..compat import (
    compat_etree_fromstring,
+    compat_str,
    compat_urllib_parse_unquote,
    compat_urlparse,
    compat_xml_parse_error,
@@ -56,6 +57,7 @@ from .dailymotion import (
    DailymotionIE,
    DailymotionCloudIE,
 )
+from .dailymail import DailyMailIE
 from .onionstudios import OnionStudiosIE
 from .viewlift import ViewLiftEmbedIE
 from .mtv import MTVServicesEmbeddedIE
@@ -90,6 +92,7 @@ from .anvato import AnvatoIE
 from .washingtonpost import WashingtonPostIE
 from .wistia import WistiaIE
 from .mediaset import MediasetIE
+from .joj import JojIE


 class GenericIE(InfoExtractor):
@@ -758,6 +761,20 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': ['Dailymotion'],
        },
+        # DailyMail embed
+        {
+            'url': 'http://www.bumm.sk/krimi/2017/07/05/biztonsagi-kamera-buktatta-le-az-agg-ferfit-utlegelo-apolot',
+            'info_dict': {
+                'id': '1495629',
+                'ext': 'mp4',
+                'title': 'Care worker punches elderly dementia patient in head 11 times',
+                'description': 'md5:3a743dee84e57e48ec68bf67113199a5',
+            },
+            'add_ie': ['DailyMail'],
+            'params': {
+                'skip_download': True,
+            },
+        },
        # YouTube embed
        {
            'url': 'http://www.badzine.de/ansicht/datum/2014/06/09/so-funktioniert-die-neue-englische-badminton-liga.html',
@@ -1184,7 +1201,7 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': ['Kaltura'],
        },
-        # Eagle.Platform embed (generic URL)
+        # EaglePlatform embed (generic URL)
        {
            'url': 'http://lenta.ru/news/2015/03/06/navalny/',
            # Not checking MD5 as sometimes the direct HTTP link results in 404 and HLS is used
@@ -1198,8 +1215,26 @@ class GenericIE(InfoExtractor):
                'view_count': int,
                'age_limit': 0,
            },
+            'params': {
+                'skip_download': True,
+            },
        },
-        # ClipYou (Eagle.Platform) embed (custom URL)
+        # referrer protected EaglePlatform embed
+        {
+            'url': 'https://tvrain.ru/lite/teleshow/kak_vse_nachinalos/namin-418921/',
+            'info_dict': {
+                'id': '582306',
+                'ext': 'mp4',
+                'title': 'Стас Намин: «Мы нарушили девственность Кремля»',
+                'thumbnail': r're:^https?://.*\.jpg$',
+                'duration': 3382,
+                'view_count': int,
+            },
+            'params': {
+                'skip_download': True,
+            },
+        },
+        # ClipYou (EaglePlatform) embed (custom URL)
        {
            'url': 'http://muz-tv.ru/play/7129/',
            # Not checking MD5 as sometimes the direct HTTP link results in 404 and HLS is used
@@ -1211,6 +1246,9 @@ class GenericIE(InfoExtractor):
                'duration': 216,
                'view_count': int,
            },
+            'params': {
+                'skip_download': True,
+            },
        },
        # Pladform embed
        {
@@ -1521,6 +1559,21 @@ class GenericIE(InfoExtractor):
                'title': 'Facebook video #599637780109885',
            },
        },
+        # Facebook <iframe> embed, plugin video
+        {
+            'url': 'http://5pillarsuk.com/2017/06/07/tariq-ramadan-disagrees-with-pr-exercise-by-imams-refusing-funeral-prayers-for-london-attackers/',
+            'info_dict': {
+                'id': '1754168231264132',
+                'ext': 'mp4',
+                'title': 'About the Imams and Religious leaders refusing to perform funeral prayers for...',
+                'uploader': 'Tariq Ramadan (official)',
+                'timestamp': 1496758379,
+                'upload_date': '20170606',
+            },
+            'params': {
+                'skip_download': True,
+            },
+        },
        # Facebook API embed
        {
            'url': 'http://www.lothype.com/blue-stars-2016-preview-standstill-full-show/',
@@ -1733,6 +1786,26 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': [MediasetIE.ie_key()],
        },
+        {
+            # JOJ.sk embeds
+            'url': 'https://www.noviny.sk/slovensko/238543-slovenskom-sa-prehnala-vlna-silnych-burok',
+            'info_dict': {
+                'id': '238543-slovenskom-sa-prehnala-vlna-silnych-burok',
+                'title': 'Slovenskom sa prehnala vlna silných búrok',
+            },
+            'playlist_mincount': 5,
+            'add_ie': [JojIE.ie_key()],
+        },
+        {
+            # AMP embed (see https://www.ampproject.org/docs/reference/components/amp-video)
+            'url': 'https://tvrain.ru/amp/418921/',
+            'md5': 'cc00413936695987e8de148b67d14f1d',
+            'info_dict': {
+                'id': '418921',
+                'ext': 'mp4',
+                'title': 'Стас Намин: «Мы нарушили девственность Кремля»',
+            },
+        },
        # {
        #     # TODO: find another test
        #     # http://schema.org/VideoObject
@@ -1907,14 +1980,14 @@ class GenericIE(InfoExtractor):
        content_type = head_response.headers.get('Content-Type', '').lower()
        m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
        if m:
-            format_id = m.group('format_id')
+            format_id = compat_str(m.group('format_id'))
            if format_id.endswith('mpegurl'):
                formats = self._extract_m3u8_formats(url, video_id, 'mp4')
            elif format_id == 'f4m':
                formats = self._extract_f4m_formats(url, video_id)
            else:
                formats = [{
-                    'format_id': m.group('format_id'),
+                    'format_id': format_id,
                    'url': url,
                    'vcodec': 'none' if m.group('type') == 'audio' else None
                }]
@@ -2032,6 +2105,13 @@ class GenericIE(InfoExtractor):
        video_description = self._og_search_description(webpage, default=None)
        video_thumbnail = self._og_search_thumbnail(webpage, default=None)

+        info_dict.update({
+            'title': video_title,
+            'description': video_description,
+            'thumbnail': video_thumbnail,
+            'age_limit': age_limit,
+        })
+
        # Look for Brightcove Legacy Studio embeds
        bc_urls = BrightcoveLegacyIE._extract_brightcove_urls(webpage)
        if bc_urls:
@@ -2125,6 +2205,12 @@ class GenericIE(InfoExtractor):
                return self.playlist_from_matches(
                    playlists, video_id, video_title, lambda p: '//dailymotion.com/playlist/%s' % p)

+        # Look for DailyMail embeds
+        dailymail_urls = DailyMailIE._extract_urls(webpage)
+        if dailymail_urls:
+            return self.playlist_from_matches(
+                dailymail_urls, video_id, video_title, ie=DailyMailIE.ie_key())
+
        # Look for embedded Wistia player
        wistia_url = WistiaIE._extract_url(webpage)
        if wistia_url:
@@ -2221,9 +2307,9 @@ class GenericIE(InfoExtractor):
            return self.url_result(mobj.group('url'))

        # Look for embedded Facebook player
-        facebook_url = FacebookIE._extract_url(webpage)
-        if facebook_url is not None:
-            return self.url_result(facebook_url, 'Facebook')
+        facebook_urls = FacebookIE._extract_urls(webpage)
+        if facebook_urls:
+            return self.playlist_from_matches(facebook_urls, video_id, video_title)

        # Look for embedded VK player
        mobj = re.search(r'<iframe[^>]+?src=(["\'])(?P<url>https?://vk\.com/video_ext\.php.+?)\1', webpage)
@@ -2420,12 +2506,12 @@ class GenericIE(InfoExtractor):
        if kaltura_url:
            return self.url_result(smuggle_url(kaltura_url, {'source_url': url}), KalturaIE.ie_key())

-        # Look for Eagle.Platform embeds
+        # Look for EaglePlatform embeds
        eagleplatform_url = EaglePlatformIE._extract_url(webpage)
        if eagleplatform_url:
-            return self.url_result(eagleplatform_url, EaglePlatformIE.ie_key())
+            return self.url_result(smuggle_url(eagleplatform_url, {'referrer': url}), EaglePlatformIE.ie_key())

-        # Look for ClipYou (uses Eagle.Platform) embeds
+        # Look for ClipYou (uses EaglePlatform) embeds
        mobj = re.search(
            r'<iframe[^>]+src="https?://(?P<host>media\.clipyou\.ru)/index/player\?.*\brecord_id=(?P<id>\d+).*"', webpage)
        if mobj is not None:
@@ -2668,18 +2754,32 @@ class GenericIE(InfoExtractor):
            return self.playlist_from_matches(
                mediaset_urls, video_id, video_title, ie=MediasetIE.ie_key())

+        # Look for JOJ.sk embeds
+        joj_urls = JojIE._extract_urls(webpage)
+        if joj_urls:
+            return self.playlist_from_matches(
+                joj_urls, video_id, video_title, ie=JojIE.ie_key())
+
+        def merge_dicts(dict1, dict2):
+            merged = {}
+            for k, v in dict1.items():
+                if v is not None:
+                    merged[k] = v
+            for k, v in dict2.items():
+                if v is None:
+                    continue
+                if (k not in merged or
+                        (isinstance(v, compat_str) and v and
+                            isinstance(merged[k], compat_str) and
+                            not merged[k])):
+                    merged[k] = v
+            return merged
+
        # Looking for http://schema.org/VideoObject
        json_ld = self._search_json_ld(
            webpage, video_id, default={}, expected_type='VideoObject')
        if json_ld.get('url'):
-            info_dict.update({
-                'title': video_title or info_dict['title'],
-                'description': video_description,
-                'thumbnail': video_thumbnail,
-                'age_limit': age_limit
-            })
-            info_dict.update(json_ld)
-            return info_dict
+            return merge_dicts(json_ld, info_dict)

        # Look for HTML5 media
        entries = self._parse_html5_media_entries(url, webpage, video_id, m3u8_id='hls')
@@ -2697,9 +2797,7 @@ class GenericIE(InfoExtractor):
        if jwplayer_data:
            info = self._parse_jwplayer_data(
                jwplayer_data, video_id, require_title=False, base_url=url)
-            if not info.get('title'):
-                info['title'] = video_title
-            return info
+            return merge_dicts(info, info_dict)

        def check_video(vurl):
            if YoutubeIE.suitable(vurl):
--- a/youtube_dl/extractor/gfycat.py
+++ b/youtube_dl/extractor/gfycat.py
@@ -82,7 +82,7 @@ class GfycatIE(InfoExtractor):
            video_url = gfy.get('%sUrl' % format_id)
            if not video_url:
                continue
-            filesize = gfy.get('%sSize' % format_id)
+            filesize = int_or_none(gfy.get('%sSize' % format_id))
            formats.append({
                'url': video_url,
                'format_id': format_id,
--- a/youtube_dl/extractor/golem.py
+++ b/youtube_dl/extractor/golem.py
@@ -3,6 +3,7 @@ from __future__ import unicode_literals

 from .common import InfoExtractor
 from ..compat import (
+    compat_str,
    compat_urlparse,
 )
 from ..utils import (
@@ -46,7 +47,7 @@ class GolemIE(InfoExtractor):
                continue

            formats.append({
-                'format_id': e.tag,
+                'format_id': compat_str(e.tag),
                'url': compat_urlparse.urljoin(self._PREFIX, url),
                'height': self._int(e.get('height'), 'height'),
                'width': self._int(e.get('width'), 'width'),
--- a/youtube_dl/extractor/googledrive.py
+++ b/youtube_dl/extractor/googledrive.py
@@ -69,19 +69,32 @@ class GoogleDriveIE(InfoExtractor):
            r'"fmt_stream_map"\s*,\s*"([^"]+)', webpage, 'fmt stream map').split(',')
        fmt_list = self._search_regex(r'"fmt_list"\s*,\s*"([^"]+)', webpage, 'fmt_list').split(',')

+        resolutions = {}
+        for fmt in fmt_list:
+            mobj = re.search(
+                r'^(?P<format_id>\d+)/(?P<width>\d+)[xX](?P<height>\d+)', fmt)
+            if mobj:
+                resolutions[mobj.group('format_id')] = (
+                    int(mobj.group('width')), int(mobj.group('height')))
+
        formats = []
-        for fmt, fmt_stream in zip(fmt_list, fmt_stream_map):
-            fmt_id, fmt_url = fmt_stream.split('|')
-            resolution = fmt.split('/')[1]
-            width, height = resolution.split('x')
-            formats.append({
-                'url': lowercase_escape(fmt_url),
-                'format_id': fmt_id,
-                'resolution': resolution,
-                'width': int_or_none(width),
-                'height': int_or_none(height),
-                'ext': self._FORMATS_EXT[fmt_id],
-            })
+        for fmt_stream in fmt_stream_map:
+            fmt_stream_split = fmt_stream.split('|')
+            if len(fmt_stream_split) < 2:
+                continue
+            format_id, format_url = fmt_stream_split[:2]
+            f = {
+                'url': lowercase_escape(format_url),
+                'format_id': format_id,
+                'ext': self._FORMATS_EXT[format_id],
+            }
+            resolution = resolutions.get(format_id)
+            if resolution:
+                f.update({
+                    'width': resolution[0],
+                    'height': resolution[1],
+                })
+            formats.append(f)
        self._sort_formats(formats)

        return {
--- a/youtube_dl/extractor/hgtv.py
+++ b/youtube_dl/extractor/hgtv.py
@@ -7,14 +7,19 @@ from .common import InfoExtractor
 class HGTVComShowIE(InfoExtractor):
    IE_NAME = 'hgtv.com:show'
    _VALID_URL = r'https?://(?:www\.)?hgtv\.com/shows/[^/]+/(?P<id>[^/?#&]+)'
-    _TEST = {
-        'url': 'http://www.hgtv.com/shows/flip-or-flop/flip-or-flop-full-episodes-videos',
+    _TESTS = [{
+        # data-module="video"
+        'url': 'http://www.hgtv.com/shows/flip-or-flop/flip-or-flop-full-episodes-season-4-videos',
        'info_dict': {
-            'id': 'flip-or-flop-full-episodes-videos',
+            'id': 'flip-or-flop-full-episodes-season-4-videos',
            'title': 'Flip or Flop Full Episodes',
        },
        'playlist_mincount': 15,
-    }
+    }, {
+        # data-deferred-module="video"
+        'url': 'http://www.hgtv.com/shows/good-bones/episodes/an-old-victorian-house-gets-a-new-facelift',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        display_id = self._match_id(url)
@@ -23,7 +28,7 @@ class HGTVComShowIE(InfoExtractor):

        config = self._parse_json(
            self._search_regex(
-                r'(?s)data-module=["\']video["\'][^>]*>.*?<script[^>]+type=["\']text/x-config["\'][^>]*>(.+?)</script',
+                r'(?s)data-(?:deferred-)?module=["\']video["\'][^>]*>.*?<script[^>]+type=["\']text/x-config["\'][^>]*>(.+?)</script',
                webpage, 'video config'),
            display_id)['channels'][0]

--- a/youtube_dl/extractor/ign.py
+++ b/youtube_dl/extractor/ign.py
@@ -89,6 +89,11 @@ class IGNIE(InfoExtractor):
            'url': 'http://me.ign.com/ar/angry-birds-2/106533/video/lrd-ldyy-lwl-lfylm-angry-birds',
            'only_matching': True,
        },
+        {
+            # videoId pattern
+            'url': 'http://www.ign.com/articles/2017/06/08/new-ducktales-short-donalds-birthday-doesnt-go-as-planned',
+            'only_matching': True,
+        },
    ]

    def _find_video_id(self, webpage):
@@ -98,6 +103,8 @@ class IGNIE(InfoExtractor):
            r'data-video-id="(.+?)"',
            r'<object id="vid_(.+?)"',
            r'<meta name="og:image" content=".*/(.+?)-(.+?)/.+.jpg"',
+            r'videoId&quot;\s*:\s*&quot;(.+?)&quot;',
+            r'videoId["\']\s*:\s*["\']([^"\']+?)["\']',
        ]
        return self._search_regex(res_id, webpage, 'video id', default=None)

--- a/youtube_dl/extractor/joj.py
+++ b/youtube_dl/extractor/joj.py
@@ -0,0 +1,100 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import (
+    int_or_none,
+    js_to_json,
+    try_get,
+)
+
+
+class JojIE(InfoExtractor):
+    _VALID_URL = r'''(?x)
+                    (?:
+                        joj:|
+                        https?://media\.joj\.sk/embed/
+                    )
+                    (?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})
+                '''
+    _TESTS = [{
+        'url': 'https://media.joj.sk/embed/a388ec4c-6019-4a4a-9312-b1bee194e932',
+        'info_dict': {
+            'id': 'a388ec4c-6019-4a4a-9312-b1bee194e932',
+            'ext': 'mp4',
+            'title': 'NOVÉ BÝVANIE',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 3118,
+        }
+    }, {
+        'url': 'joj:a388ec4c-6019-4a4a-9312-b1bee194e932',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_urls(webpage):
+        return re.findall(
+            r'<iframe\b[^>]+\bsrc=["\'](?P<url>(?:https?:)?//media\.joj\.sk/embed/[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
+            webpage)
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+
+        webpage = self._download_webpage(
+            'https://media.joj.sk/embed/%s' % video_id, video_id)
+
+        title = self._search_regex(
+            (r'videoTitle\s*:\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
+             r'<title>(?P<title>[^<]+)'), webpage, 'title',
+            default=None, group='title') or self._og_search_title(webpage)
+
+        bitrates = self._parse_json(
+            self._search_regex(
+                r'(?s)bitrates\s*=\s*({.+?});', webpage, 'bitrates',
+                default='{}'),
+            video_id, transform_source=js_to_json, fatal=False)
+
+        formats = []
+        for format_url in try_get(bitrates, lambda x: x['mp4'], list) or []:
+            if isinstance(format_url, compat_str):
+                height = self._search_regex(
+                    r'(\d+)[pP]\.', format_url, 'height', default=None)
+                formats.append({
+                    'url': format_url,
+                    'format_id': '%sp' % height if height else None,
+                    'height': int(height),
+                })
+        if not formats:
+            playlist = self._download_xml(
+                'https://media.joj.sk/services/Video.php?clip=%s' % video_id,
+                video_id)
+            for file_el in playlist.findall('./files/file'):
+                path = file_el.get('path')
+                if not path:
+                    continue
+                format_id = file_el.get('id') or file_el.get('label')
+                formats.append({
+                    'url': 'http://n16.joj.sk/storage/%s' % path.replace(
+                        'dat/', '', 1),
+                    'format_id': format_id,
+                    'height': int_or_none(self._search_regex(
+                        r'(\d+)[pP]', format_id or path, 'height',
+                        default=None)),
+                })
+        self._sort_formats(formats)
+
+        thumbnail = self._og_search_thumbnail(webpage)
+
+        duration = int_or_none(self._search_regex(
+            r'videoDuration\s*:\s*(\d+)', webpage, 'duration', fatal=False))
+
+        return {
+            'id': video_id,
+            'title': title,
+            'thumbnail': thumbnail,
+            'duration': duration,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/jove.py
+++ b/youtube_dl/extractor/jove.py
@@ -65,9 +65,9 @@ class JoveIE(InfoExtractor):
            webpage, 'description', fatal=False)
        publish_date = unified_strdate(self._html_search_meta(
            'citation_publication_date', webpage, 'publish date', fatal=False))
-        comment_count = self._html_search_regex(
+        comment_count = int(self._html_search_regex(
            r'<meta name="num_comments" content="(\d+) Comments?"',
-            webpage, 'comment count', fatal=False)
+            webpage, 'comment count', fatal=False))

        return {
            'id': video_id,
--- a/youtube_dl/extractor/kaltura.py
+++ b/youtube_dl/extractor/kaltura.py
@@ -324,7 +324,7 @@ class KalturaIE(InfoExtractor):
        if captions:
            for caption in captions.get('objects', []):
                # Continue if caption is not ready
-                if f.get('status') != 2:
+                if caption.get('status') != 2:
                    continue
                if not caption.get('id'):
                    continue
--- a/youtube_dl/extractor/liveleak.py
+++ b/youtube_dl/extractor/liveleak.py
@@ -115,8 +115,9 @@ class LiveLeakIE(InfoExtractor):

        for a_format in info_dict['formats']:
            if not a_format.get('height'):
-                a_format['height'] = self._search_regex(
-                    r'([0-9]+)p\.mp4', a_format['url'], 'height label', default=None)
+                a_format['height'] = int_or_none(self._search_regex(
+                    r'([0-9]+)p\.mp4', a_format['url'], 'height label',
+                    default=None))

        self._sort_formats(info_dict['formats'])

--- a/youtube_dl/extractor/msn.py
+++ b/youtube_dl/extractor/msn.py
@@ -68,10 +68,6 @@ class MSNIE(InfoExtractor):
            format_url = file_.get('url')
            if not format_url:
                continue
-            ext = determine_ext(format_url)
-            if ext == 'ism':
-                formats.extend(self._extract_ism_formats(
-                    format_url + '/Manifest', display_id, 'mss', fatal=False))
            if 'm3u8' in format_url:
                # m3u8_native should not be used here until
                # https://github.com/rg3/youtube-dl/issues/9913 is fixed
@@ -79,6 +75,9 @@ class MSNIE(InfoExtractor):
                    format_url, display_id, 'mp4',
                    m3u8_id='hls', fatal=False)
                formats.extend(m3u8_formats)
+            elif determine_ext(format_url) == 'ism':
+                formats.extend(self._extract_ism_formats(
+                    format_url + '/Manifest', display_id, 'mss', fatal=False))
            else:
                formats.append({
                    'url': format_url,
--- a/youtube_dl/extractor/newgrounds.py
+++ b/youtube_dl/extractor/newgrounds.py
@@ -1,6 +1,15 @@
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
+from ..utils import (
+    extract_attributes,
+    int_or_none,
+    parse_duration,
+    parse_filesize,
+    unified_timestamp,
+)


 class NewgroundsIE(InfoExtractor):
@@ -13,7 +22,10 @@ class NewgroundsIE(InfoExtractor):
            'ext': 'mp3',
            'title': 'B7 - BusMode',
            'uploader': 'Burn7',
-        }
+            'timestamp': 1378878540,
+            'upload_date': '20130911',
+            'duration': 143,
+        },
    }, {
        'url': 'https://www.newgrounds.com/portal/view/673111',
        'md5': '3394735822aab2478c31b1004fe5e5bc',
@@ -22,25 +34,133 @@ class NewgroundsIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Dancin',
            'uploader': 'Squirrelman82',
+            'timestamp': 1460256780,
+            'upload_date': '20160410',
+        },
+    }, {
+        # source format unavailable, additional mp4 formats
+        'url': 'http://www.newgrounds.com/portal/view/689400',
+        'info_dict': {
+            'id': '689400',
+            'ext': 'mp4',
+            'title': 'ZTV News Episode 8',
+            'uploader': 'BennettTheSage',
+            'timestamp': 1487965140,
+            'upload_date': '20170224',
+        },
+        'params': {
+            'skip_download': True,
        },
    }]

    def _real_extract(self, url):
        media_id = self._match_id(url)
+
        webpage = self._download_webpage(url, media_id)

        title = self._html_search_regex(
            r'<title>([^>]+)</title>', webpage, 'title')

-        uploader = self._html_search_regex(
-            r'Author\s*<a[^>]+>([^<]+)', webpage, 'uploader', fatal=False)
+        media_url = self._parse_json(self._search_regex(
+            r'"url"\s*:\s*("[^"]+"),', webpage, ''), media_id)

-        music_url = self._parse_json(self._search_regex(
-            r'"url":("[^"]+"),', webpage, ''), media_id)
+        formats = [{
+            'url': media_url,
+            'format_id': 'source',
+            'quality': 1,
+        }]
+
+        max_resolution = int_or_none(self._search_regex(
+            r'max_resolution["\']\s*:\s*(\d+)', webpage, 'max resolution',
+            default=None))
+        if max_resolution:
+            url_base = media_url.rpartition('.')[0]
+            for resolution in (360, 720, 1080):
+                if resolution > max_resolution:
+                    break
+                formats.append({
+                    'url': '%s.%dp.mp4' % (url_base, resolution),
+                    'format_id': '%dp' % resolution,
+                    'height': resolution,
+                })
+
+        self._check_formats(formats, media_id)
+        self._sort_formats(formats)
+
+        uploader = self._search_regex(
+            r'(?:Author|Writer)\s*<a[^>]+>([^<]+)', webpage, 'uploader',
+            fatal=False)
+
+        timestamp = unified_timestamp(self._search_regex(
+            r'<dt>Uploaded</dt>\s*<dd>([^<]+)', webpage, 'timestamp',
+            default=None))
+        duration = parse_duration(self._search_regex(
+            r'<dd>Song\s*</dd><dd>.+?</dd><dd>([^<]+)', webpage, 'duration',
+            default=None))
+
+        filesize_approx = parse_filesize(self._html_search_regex(
+            r'<dd>Song\s*</dd><dd>(.+?)</dd>', webpage, 'filesize',
+            default=None))
+        if len(formats) == 1:
+            formats[0]['filesize_approx'] = filesize_approx
+
+        if '<dd>Song' in webpage:
+            formats[0]['vcodec'] = 'none'

        return {
            'id': media_id,
            'title': title,
-            'url': music_url,
            'uploader': uploader,
+            'timestamp': timestamp,
+            'duration': duration,
+            'formats': formats,
        }
+
+
+class NewgroundsPlaylistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?newgrounds\.com/(?:collection|[^/]+/search/[^/]+)/(?P<id>[^/?#&]+)'
+    _TESTS = [{
+        'url': 'https://www.newgrounds.com/collection/cats',
+        'info_dict': {
+            'id': 'cats',
+            'title': 'Cats',
+        },
+        'playlist_mincount': 46,
+    }, {
+        'url': 'http://www.newgrounds.com/portal/search/author/ZONE-SAMA',
+        'info_dict': {
+            'id': 'ZONE-SAMA',
+            'title': 'Portal Search: ZONE-SAMA',
+        },
+        'playlist_mincount': 47,
+    }, {
+        'url': 'http://www.newgrounds.com/audio/search/title/cats',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        title = self._search_regex(
+            r'<title>([^>]+)</title>', webpage, 'title', default=None)
+
+        # cut left menu
+        webpage = self._search_regex(
+            r'(?s)<div[^>]+\bclass=["\']column wide(.+)',
+            webpage, 'wide column', default=webpage)
+
+        entries = []
+        for a, path, media_id in re.findall(
+                r'(<a[^>]+\bhref=["\']/?((?:portal/view|audio/listen)/(\d+))[^>]+>)',
+                webpage):
+            a_class = extract_attributes(a).get('class')
+            if a_class not in ('item-portalsubmission', 'item-audiosubmission'):
+                continue
+            entries.append(
+                self.url_result(
+                    'https://www.newgrounds.com/%s' % path,
+                    ie=NewgroundsIE.ie_key(), video_id=media_id))
+
+        return self.playlist_result(entries, playlist_id, title)
--- a/youtube_dl/extractor/niconico.py
+++ b/youtube_dl/extractor/niconico.py
@@ -83,9 +83,12 @@ class NiconicoIE(InfoExtractor):
            'uploader_id': '312',
        },
        'skip': 'The viewing period of the video you were searching for has expired.',
+    }, {
+        'url': 'http://sp.nicovideo.jp/watch/sm28964488?ss_pos=1&cp_in=wt_tg',
+        'only_matching': True,
    }]

-    _VALID_URL = r'https?://(?:www\.|secure\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.|secure\.|sp\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
    _NETRC_MACHINE = 'niconico'

    def _real_initialize(self):
--- a/youtube_dl/extractor/npo.py
+++ b/youtube_dl/extractor/npo.py
@@ -35,7 +35,7 @@ class NPOIE(NPOBaseIE):
                        https?://
                            (?:www\.)?
                            (?:
-                                npo\.nl/(?!live|radio)(?:[^/]+/){2}|
+                                npo\.nl/(?!(?:live|radio)/)(?:[^/]+/){2}|
                                ntr\.nl/(?:[^/]+/){2,}|
                                omroepwnl\.nl/video/fragment/[^/]+__|
                                zapp\.nl/[^/]+/[^/]+/
@@ -150,6 +150,9 @@ class NPOIE(NPOBaseIE):
        # live stream
        'url': 'npo:LI_NL1_4188102',
        'only_matching': True,
+    }, {
+        'url': 'http://www.npo.nl/radio-gaga/13-06-2017/BNN_101383373',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -338,7 +341,7 @@ class NPOLiveIE(NPOBaseIE):
        webpage = self._download_webpage(url, display_id)

        live_id = self._search_regex(
-            r'data-prid="([^"]+)"', webpage, 'live id')
+            [r'media-id="([^"]+)"', r'data-prid="([^"]+)"'], webpage, 'live id')

        return {
            '_type': 'url_transparent',
--- a/youtube_dl/extractor/onet.py
+++ b/youtube_dl/extractor/onet.py
@@ -11,6 +11,7 @@ from ..utils import (
    get_element_by_class,
    int_or_none,
    js_to_json,
+    NO_DEFAULT,
    parse_iso8601,
    remove_start,
    strip_or_none,
@@ -198,6 +199,19 @@ class OnetPlIE(InfoExtractor):
            'upload_date': '20170214',
            'timestamp': 1487078046,
        },
+    }, {
+        # embedded via pulsembed
+        'url': 'http://film.onet.pl/pensjonat-nad-rozlewiskiem-relacja-z-planu-serialu/y428n0',
+        'info_dict': {
+            'id': '501235.965429946',
+            'ext': 'mp4',
+            'title': '"Pensjonat nad rozlewiskiem": relacja z planu serialu',
+            'upload_date': '20170622',
+            'timestamp': 1498159955,
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'http://film.onet.pl/zwiastuny/ghost-in-the-shell-drugi-zwiastun-pl/5q6yl3',
        'only_matching': True,
@@ -212,13 +226,25 @@ class OnetPlIE(InfoExtractor):
        'only_matching': True,
    }]

+    def _search_mvp_id(self, webpage, default=NO_DEFAULT):
+        return self._search_regex(
+            r'data-(?:params-)?mvp=["\'](\d+\.\d+)', webpage, 'mvp id',
+            default=default)
+
    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

-        mvp_id = self._search_regex(
-            r'data-params-mvp=["\'](\d+\.\d+)', webpage, 'mvp id')
+        mvp_id = self._search_mvp_id(webpage, default=None)
+
+        if not mvp_id:
+            pulsembed_url = self._search_regex(
+                r'data-src=(["\'])(?P<url>(?:https?:)?//pulsembed\.eu/.+?)\1',
+                webpage, 'pulsembed url', group='url')
+            webpage = self._download_webpage(
+                pulsembed_url, video_id, 'Downloading pulsembed webpage')
+            mvp_id = self._search_mvp_id(webpage)

        return self.url_result(
            'onetmvp:%s' % mvp_id, OnetMVPIE.ie_key(), video_id=mvp_id)
--- a/youtube_dl/extractor/ooyala.py
+++ b/youtube_dl/extractor/ooyala.py
@@ -3,12 +3,14 @@ import re
 import base64

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
-    int_or_none,
-    float_or_none,
-    ExtractorError,
-    unsmuggle_url,
    determine_ext,
+    ExtractorError,
+    float_or_none,
+    int_or_none,
+    try_get,
+    unsmuggle_url,
 )
 from ..compat import compat_urllib_parse_urlencode

@@ -39,13 +41,15 @@ class OoyalaBaseIE(InfoExtractor):
        formats = []
        if cur_auth_data['authorized']:
            for stream in cur_auth_data['streams']:
-                s_url = base64.b64decode(
-                    stream['url']['data'].encode('ascii')).decode('utf-8')
-                if s_url in urls:
+                url_data = try_get(stream, lambda x: x['url']['data'], compat_str)
+                if not url_data:
+                    continue
+                s_url = base64.b64decode(url_data.encode('ascii')).decode('utf-8')
+                if not s_url or s_url in urls:
                    continue
                urls.append(s_url)
                ext = determine_ext(s_url, None)
-                delivery_type = stream['delivery_type']
+                delivery_type = stream.get('delivery_type')
                if delivery_type == 'hls' or ext == 'm3u8':
                    formats.extend(self._extract_m3u8_formats(
                        re.sub(r'/ip(?:ad|hone)/', '/all/', s_url), embed_code, 'mp4', 'm3u8_native',
@@ -65,7 +69,7 @@ class OoyalaBaseIE(InfoExtractor):
                else:
                    formats.append({
                        'url': s_url,
-                        'ext': ext or stream.get('delivery_type'),
+                        'ext': ext or delivery_type,
                        'vcodec': stream.get('video_codec'),
                        'format_id': delivery_type,
                        'width': int_or_none(stream.get('width')),
@@ -136,6 +140,11 @@ class OoyalaIE(OoyalaBaseIE):
                'title': 'Divide Tool Path.mp4',
                'duration': 204.405,
            }
+        },
+        {
+            # empty stream['url']['data']
+            'url': 'http://player.ooyala.com/player.js?embedCode=w2bnZtYjE6axZ_dw1Cd0hQtXd_ige2Is',
+            'only_matching': True,
        }
    ]

--- a/youtube_dl/extractor/pandatv.py
+++ b/youtube_dl/extractor/pandatv.py
@@ -10,13 +10,13 @@ from ..utils import (

 class PandaTVIE(InfoExtractor):
    IE_DESC = '熊猫TV'
-    _VALID_URL = r'http://(?:www\.)?panda\.tv/(?P<id>[0-9]+)'
-    _TEST = {
-        'url': 'http://www.panda.tv/10091',
+    _VALID_URL = r'https?://(?:www\.)?panda\.tv/(?P<id>[0-9]+)'
+    _TESTS = [{
+        'url': 'http://www.panda.tv/66666',
        'info_dict': {
-            'id': '10091',
+            'id': '66666',
            'title': 're:.+',
-            'uploader': '囚徒',
+            'uploader': '刘杀鸡',
            'ext': 'flv',
            'is_live': True,
        },
@@ -24,13 +24,16 @@ class PandaTVIE(InfoExtractor):
            'skip_download': True,
        },
        'skip': 'Live stream is offline',
-    }
+    }, {
+        'url': 'https://www.panda.tv/66666',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        config = self._download_json(
-            'http://www.panda.tv/api_room?roomid=%s' % video_id, video_id)
+            'https://www.panda.tv/api_room?roomid=%s' % video_id, video_id)

        error_code = config.get('errno', 0)
        if error_code is not 0:
@@ -74,7 +77,7 @@ class PandaTVIE(InfoExtractor):
                continue
            for pref, (ext, pl) in enumerate((('m3u8', '-hls'), ('flv', ''))):
                formats.append({
-                    'url': 'http://pl%s%s.live.panda.tv/live_panda/%s%s%s.%s'
+                    'url': 'https://pl%s%s.live.panda.tv/live_panda/%s%s%s.%s'
                    % (pl, plflag1, room_key, live_panda, suffix[quality], ext),
                    'format_id': '%s-%s' % (k, ext),
                    'quality': quality,
--- a/youtube_dl/extractor/pandoratv.py
+++ b/youtube_dl/extractor/pandoratv.py
@@ -19,7 +19,7 @@ class PandoraTVIE(InfoExtractor):
    IE_NAME = 'pandora.tv'
    IE_DESC = '판도라TV'
    _VALID_URL = r'https?://(?:.+?\.)?channel\.pandora\.tv/channel/video\.ptv\?'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://jp.channel.pandora.tv/channel/video.ptv?c1=&prgid=53294230&ch_userid=mikakim&ref=main&lot=cate_01_2',
        'info_dict': {
            'id': '53294230',
@@ -34,7 +34,26 @@ class PandoraTVIE(InfoExtractor):
            'view_count': int,
            'like_count': int,
        }
-    }
+    }, {
+        'url': 'http://channel.pandora.tv/channel/video.ptv?ch_userid=gogoucc&prgid=54721744',
+        'info_dict': {
+            'id': '54721744',
+            'ext': 'flv',
+            'title': '[HD] JAPAN COUNTDOWN 170423',
+            'description': '[HD] JAPAN COUNTDOWN 170423',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'duration': 1704.9,
+            'upload_date': '20170423',
+            'uploader': 'GOGO_UCC',
+            'uploader_id': 'gogoucc',
+            'view_count': int,
+            'like_count': int,
+        },
+        'params': {
+            # Test metadata only
+            'skip_download': True,
+        },
+    }]

    def _real_extract(self, url):
        qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
@@ -86,7 +105,7 @@ class PandoraTVIE(InfoExtractor):
            'description': info.get('body'),
            'thumbnail': info.get('thumbnail') or info.get('poster'),
            'duration': float_or_none(info.get('runtime'), 1000) or parse_duration(info.get('time')),
-            'upload_date': info['fid'][:8] if isinstance(info.get('fid'), compat_str) else None,
+            'upload_date': info['fid'].split('/')[-1][:8] if isinstance(info.get('fid'), compat_str) else None,
            'uploader': info.get('nickname'),
            'uploader_id': info.get('upload_userid'),
            'view_count': str_to_int(info.get('hit')),
--- a/youtube_dl/extractor/polskieradio.py
+++ b/youtube_dl/extractor/polskieradio.py
@@ -65,7 +65,7 @@ class PolskieRadioIE(InfoExtractor):
        webpage = self._download_webpage(url, playlist_id)

        content = self._search_regex(
-            r'(?s)<div[^>]+class="audio atarticle"[^>]*>(.+?)<script>',
+            r'(?s)<div[^>]+class="\s*this-article\s*"[^>]*>(.+?)<div[^>]+class="tags"[^>]*>',
            webpage, 'content')

        timestamp = unified_timestamp(self._html_search_regex(
--- a/youtube_dl/extractor/rai.py
+++ b/youtube_dl/extractor/rai.py
@@ -191,11 +191,12 @@ class RaiPlayIE(RaiBaseIE):

        info = {
            'id': video_id,
-            'title': title,
+            'title': self._live_title(title) if relinker_info.get(
+                'is_live') else title,
            'alt_title': media.get('subtitle'),
            'description': media.get('description'),
-            'uploader': media.get('channel'),
-            'creator': media.get('editor'),
+            'uploader': strip_or_none(media.get('channel')),
+            'creator': strip_or_none(media.get('editor')),
            'duration': parse_duration(video.get('duration')),
            'timestamp': timestamp,
            'thumbnails': thumbnails,
@@ -208,10 +209,46 @@ class RaiPlayIE(RaiBaseIE):
        }

        info.update(relinker_info)
-
        return info


+class RaiPlayLiveIE(RaiBaseIE):
+    _VALID_URL = r'https?://(?:www\.)?raiplay\.it/dirette/(?P<id>[^/?#&]+)'
+    _TEST = {
+        'url': 'http://www.raiplay.it/dirette/rainews24',
+        'info_dict': {
+            'id': 'd784ad40-e0ae-4a69-aa76-37519d238a9c',
+            'display_id': 'rainews24',
+            'ext': 'mp4',
+            'title': 're:^Diretta di Rai News 24 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
+            'description': 'md5:6eca31500550f9376819f174e5644754',
+            'uploader': 'Rai News 24',
+            'creator': 'Rai News 24',
+            'is_live': True,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        video_id = self._search_regex(
+            r'data-uniquename=["\']ContentItem-(%s)' % RaiBaseIE._UUID_RE,
+            webpage, 'content id')
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': RaiPlayIE.ie_key(),
+            'url': 'http://www.raiplay.it/dirette/ContentItem-%s.html' % video_id,
+            'id': video_id,
+            'display_id': display_id,
+        }
+
+
 class RaiIE(RaiBaseIE):
    _VALID_URL = r'https?://[^/]+\.(?:rai\.(?:it|tv)|rainews\.it)/dl/.+?-(?P<id>%s)(?:-.+?)?\.html' % RaiBaseIE._UUID_RE
    _TESTS = [{
--- a/youtube_dl/extractor/redbulltv.py
+++ b/youtube_dl/extractor/redbulltv.py
@@ -13,7 +13,7 @@ from ..utils import (


 class RedBullTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?redbull\.tv/(?:video|film)/(?P<id>AP-\w+)'
+    _VALID_URL = r'https?://(?:www\.)?redbull\.tv/(?:video|film|live)/(?:AP-\w+/segment/)?(?P<id>AP-\w+)'
    _TESTS = [{
        # film
        'url': 'https://www.redbull.tv/video/AP-1Q756YYX51W11/abc-of-wrc',
@@ -42,6 +42,22 @@ class RedBullTVIE(InfoExtractor):
            'season_number': 2,
            'episode_number': 4,
        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # segment
+        'url': 'https://www.redbull.tv/live/AP-1R5DX49XS1W11/segment/AP-1QSAQJ6V52111/semi-finals',
+        'info_dict': {
+            'id': 'AP-1QSAQJ6V52111',
+            'ext': 'mp4',
+            'title': 'Semi Finals - Vans Park Series Pro Tour',
+            'description': 'md5:306a2783cdafa9e65e39aa62f514fd97',
+            'duration': 11791.991,
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'https://www.redbull.tv/film/AP-1MSKKF5T92111/in-motion',
        'only_matching': True,
@@ -82,7 +98,8 @@ class RedBullTVIE(InfoExtractor):
        title = info['title'].strip()

        formats = self._extract_m3u8_formats(
-            video['url'], video_id, 'mp4', 'm3u8_native')
+            video['url'], video_id, 'mp4', entry_protocol='m3u8_native',
+            m3u8_id='hls')
        self._sort_formats(formats)

        subtitles = {}
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dl/extractor/rtlnl.py
@@ -15,7 +15,7 @@ class RtlNlIE(InfoExtractor):
        https?://(?:www\.)?
        (?:
            rtlxl\.nl/[^\#]*\#!/[^/]+/|
-            rtl\.nl/system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=
+            rtl\.nl/(?:system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=|video/)
        )
        (?P<id>[0-9a-f-]+)'''

@@ -70,6 +70,9 @@ class RtlNlIE(InfoExtractor):
    }, {
        'url': 'http://rtlxl.nl/?_ga=1.204735956.572365465.1466978370#!/rtl-nieuws-132237/3c487912-023b-49ac-903e-2c5d79f8410f',
        'only_matching': True,
+    }, {
+        'url': 'https://www.rtl.nl/video/c603c9c2-601d-4b5e-8175-64f1e942dc7d/',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/rutv.py
+++ b/youtube_dl/extractor/rutv.py
@@ -13,11 +13,15 @@ from ..utils import (
 class RUTVIE(InfoExtractor):
    IE_DESC = 'RUTV.RU'
    _VALID_URL = r'''(?x)
-        https?://player\.(?:rutv\.ru|vgtrk\.com)/
-            (?P<path>flash\d+v/container\.swf\?id=
-            |iframe/(?P<type>swf|video|live)/id/
-            |index/iframe/cast_id/)
-            (?P<id>\d+)'''
+                    https?://
+                        (?:test)?player\.(?:rutv\.ru|vgtrk\.com)/
+                        (?P<path>
+                            flash\d+v/container\.swf\?id=|
+                            iframe/(?P<type>swf|video|live)/id/|
+                            index/iframe/cast_id/
+                        )
+                        (?P<id>\d+)
+                    '''

    _TESTS = [
        {
@@ -99,17 +103,21 @@ class RUTVIE(InfoExtractor):
                'skip_download': True,
            },
        },
+        {
+            'url': 'https://testplayer.vgtrk.com/iframe/live/id/19201/showZoomBtn/false/isPlay/true/',
+            'only_matching': True,
+        },
    ]

    @classmethod
    def _extract_url(cls, webpage):
        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https?://player\.(?:rutv\.ru|vgtrk\.com)/(?:iframe/(?:swf|video|live)/id|index/iframe/cast_id)/.+?)\1', webpage)
+            r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/(?:iframe/(?:swf|video|live)/id|index/iframe/cast_id)/.+?)\1', webpage)
        if mobj:
            return mobj.group('url')

        mobj = re.search(
-            r'<meta[^>]+?property=(["\'])og:video\1[^>]+?content=(["\'])(?P<url>https?://player\.(?:rutv\.ru|vgtrk\.com)/flash\d+v/container\.swf\?id=.+?\2)',
+            r'<meta[^>]+?property=(["\'])og:video\1[^>]+?content=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/flash\d+v/container\.swf\?id=.+?\2)',
            webpage)
        if mobj:
            return mobj.group('url')
--- a/youtube_dl/extractor/ruv.py
+++ b/youtube_dl/extractor/ruv.py
@@ -0,0 +1,101 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+    unified_timestamp,
+)
+
+
+class RuvIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?ruv\.is/(?:sarpurinn/[^/]+|node)/(?P<id>[^/]+(?:/\d+)?)'
+    _TESTS = [{
+        # m3u8
+        'url': 'http://ruv.is/sarpurinn/ruv-aukaras/fh-valur/20170516',
+        'md5': '66347652f4e13e71936817102acc1724',
+        'info_dict': {
+            'id': '1144499',
+            'display_id': 'fh-valur/20170516',
+            'ext': 'mp4',
+            'title': 'FH - Valur',
+            'description': 'Bein útsending frá 3. leik FH og Vals í úrslitum Olísdeildar karla í handbolta.',
+            'timestamp': 1494963600,
+            'upload_date': '20170516',
+        },
+    }, {
+        # mp3
+        'url': 'http://ruv.is/sarpurinn/ras-2/morgunutvarpid/20170619',
+        'md5': '395ea250c8a13e5fdb39d4670ef85378',
+        'info_dict': {
+            'id': '1153630',
+            'display_id': 'morgunutvarpid/20170619',
+            'ext': 'mp3',
+            'title': 'Morgunútvarpið',
+            'description': 'md5:a4cf1202c0a1645ca096b06525915418',
+            'timestamp': 1497855000,
+            'upload_date': '20170619',
+        },
+    }, {
+        'url': 'http://ruv.is/sarpurinn/ruv/frettir/20170614',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.ruv.is/node/1151854',
+        'only_matching': True,
+    }, {
+        'url': 'http://ruv.is/sarpurinn/klippa/secret-soltice-hefst-a-morgun',
+        'only_matching': True,
+    }, {
+        'url': 'http://ruv.is/sarpurinn/ras-1/morgunvaktin/20170619',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        title = self._og_search_title(webpage)
+
+        FIELD_RE = r'video\.%s\s*=\s*(["\'])(?P<url>(?:(?!\1).)+)\1'
+
+        media_url = self._html_search_regex(
+            FIELD_RE % 'src', webpage, 'video URL', group='url')
+
+        video_id = self._search_regex(
+            r'<link\b[^>]+\bhref=["\']https?://www\.ruv\.is/node/(\d+)',
+            webpage, 'video id', default=display_id)
+
+        ext = determine_ext(media_url)
+
+        if ext == 'm3u8':
+            formats = self._extract_m3u8_formats(
+                media_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                m3u8_id='hls')
+        elif ext == 'mp3':
+            formats = [{
+                'format_id': 'mp3',
+                'url': media_url,
+                'vcodec': 'none',
+            }]
+        else:
+            formats = [{
+                'url': media_url,
+            }]
+
+        description = self._og_search_description(webpage, default=None)
+        thumbnail = self._og_search_thumbnail(
+            webpage, default=None) or self._search_regex(
+            FIELD_RE % 'poster', webpage, 'thumbnail', fatal=False)
+        timestamp = unified_timestamp(self._html_search_meta(
+            'article:published_time', webpage, 'timestamp', fatal=False))
+
+        return {
+            'id': video_id,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'timestamp': timestamp,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@@ -16,7 +16,6 @@ from ..utils import (

 class SafariBaseIE(InfoExtractor):
    _LOGIN_URL = 'https://www.safaribooksonline.com/accounts/login/'
-    _SUCCESSFUL_LOGIN_REGEX = r'<a href="/accounts/logout/"[^>]*>Sign Out</a>'
    _NETRC_MACHINE = 'safari'

    _API_BASE = 'https://www.safaribooksonline.com/api/v1'
@@ -28,10 +27,6 @@ class SafariBaseIE(InfoExtractor):
        self._login()

    def _login(self):
-        # We only need to log in once for courses or individual videos
-        if self.LOGGED_IN:
-            return
-
        (username, password) = self._get_login_info()
        if username is None:
            return
@@ -39,11 +34,17 @@ class SafariBaseIE(InfoExtractor):
        headers = std_headers.copy()
        if 'Referer' not in headers:
            headers['Referer'] = self._LOGIN_URL
-        login_page_request = sanitized_Request(self._LOGIN_URL, headers=headers)

        login_page = self._download_webpage(
-            login_page_request, None,
-            'Downloading login form')
+            self._LOGIN_URL, None, 'Downloading login form', headers=headers)
+
+        def is_logged(webpage):
+            return any(re.search(p, webpage) for p in (
+                r'href=["\']/accounts/logout/', r'>Sign Out<'))
+
+        if is_logged(login_page):
+            self.LOGGED_IN = True
+            return

        csrf = self._html_search_regex(
            r"name='csrfmiddlewaretoken'\s+value='([^']+)'",
@@ -62,15 +63,13 @@ class SafariBaseIE(InfoExtractor):
        login_page = self._download_webpage(
            request, None, 'Logging in as %s' % username)

-        if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
+        if not is_logged(login_page):
            raise ExtractorError(
                'Login failed; make sure your credentials are correct and try again.',
                expected=True)

        self.LOGGED_IN = True

-        self.to_screen('Login successful')
-

 class SafariIE(SafariBaseIE):
    IE_NAME = 'safari'
--- a/youtube_dl/extractor/sexu.py
+++ b/youtube_dl/extractor/sexu.py
@@ -32,8 +32,9 @@ class SexuIE(InfoExtractor):
        formats = [{
            'url': source['file'].replace('\\', ''),
            'format_id': source.get('label'),
-            'height': self._search_regex(
-                r'^(\d+)[pP]', source.get('label', ''), 'height', default=None),
+            'height': int(self._search_regex(
+                r'^(\d+)[pP]', source.get('label', ''), 'height',
+                default=None)),
        } for source in sources if source.get('file')]
        self._sort_formats(formats)

--- a/youtube_dl/extractor/sohu.py
+++ b/youtube_dl/extractor/sohu.py
@@ -8,7 +8,11 @@ from ..compat import (
    compat_str,
    compat_urllib_parse_urlencode,
 )
-from ..utils import ExtractorError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    try_get,
+)


 class SohuIE(InfoExtractor):
@@ -169,10 +173,11 @@ class SohuIE(InfoExtractor):
                formats.append({
                    'url': video_url,
                    'format_id': format_id,
-                    'filesize': data['clipsBytes'][i],
-                    'width': data['width'],
-                    'height': data['height'],
-                    'fps': data['fps'],
+                    'filesize': int_or_none(
+                        try_get(data, lambda x: x['clipsBytes'][i])),
+                    'width': int_or_none(data.get('width')),
+                    'height': int_or_none(data.get('height')),
+                    'fps': int_or_none(data.get('fps')),
                })
            self._sort_formats(formats)

--- a/youtube_dl/extractor/soundcloud.py
+++ b/youtube_dl/extractor/soundcloud.py
@@ -136,7 +136,7 @@ class SoundcloudIE(InfoExtractor):

    @classmethod
    def _resolv_url(cls, url):
-        return 'http://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID
+        return 'https://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID

    def _extract_info_dict(self, info, full_title=None, quiet=False, secret_token=None):
        track_id = compat_str(info['id'])
@@ -174,7 +174,7 @@ class SoundcloudIE(InfoExtractor):

        # We have to retrieve the url
        format_dict = self._download_json(
-            'http://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
+            'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
            track_id, 'Downloading track url', query={
                'client_id': self._CLIENT_ID,
                'secret_token': secret_token,
@@ -236,7 +236,7 @@ class SoundcloudIE(InfoExtractor):
        track_id = mobj.group('track_id')

        if track_id is not None:
-            info_json_url = 'http://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
+            info_json_url = 'https://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
            full_title = track_id
            token = mobj.group('secret_token')
            if token:
@@ -261,7 +261,7 @@ class SoundcloudIE(InfoExtractor):

            self.report_resolve(full_title)

-            url = 'http://soundcloud.com/%s' % resolve_title
+            url = 'https://soundcloud.com/%s' % resolve_title
            info_json_url = self._resolv_url(url)
        info = self._download_json(info_json_url, full_title, 'Downloading info JSON')

@@ -290,7 +290,7 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
            'id': '2284613',
            'title': 'The Royal Concept EP',
        },
-        'playlist_mincount': 6,
+        'playlist_mincount': 5,
    }, {
        'url': 'https://soundcloud.com/the-concept-band/sets/the-royal-concept-ep/token',
        'only_matching': True,
@@ -304,7 +304,7 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
        # extract simple title (uploader + slug of song title)
        slug_title = mobj.group('slug_title')
        full_title = '%s/sets/%s' % (uploader, slug_title)
-        url = 'http://soundcloud.com/%s/sets/%s' % (uploader, slug_title)
+        url = 'https://soundcloud.com/%s/sets/%s' % (uploader, slug_title)

        token = mobj.group('token')
        if token:
@@ -380,7 +380,7 @@ class SoundcloudUserIE(SoundcloudPlaylistBaseIE):
        'url': 'https://soundcloud.com/grynpyret/spotlight',
        'info_dict': {
            'id': '7098329',
-            'title': 'GRYNPYRET (Spotlight)',
+            'title': 'Grynpyret (Spotlight)',
        },
        'playlist_mincount': 1,
    }]
@@ -410,7 +410,7 @@ class SoundcloudUserIE(SoundcloudPlaylistBaseIE):
        mobj = re.match(self._VALID_URL, url)
        uploader = mobj.group('user')

-        url = 'http://soundcloud.com/%s/' % uploader
+        url = 'https://soundcloud.com/%s/' % uploader
        resolv_url = self._resolv_url(url)
        user = self._download_json(
            resolv_url, uploader, 'Downloading user info')
@@ -473,7 +473,7 @@ class SoundcloudPlaylistIE(SoundcloudPlaylistBaseIE):
    _VALID_URL = r'https?://api\.soundcloud\.com/playlists/(?P<id>[0-9]+)(?:/?\?secret_token=(?P<token>[^&]+?))?$'
    IE_NAME = 'soundcloud:playlist'
    _TESTS = [{
-        'url': 'http://api.soundcloud.com/playlists/4110309',
+        'url': 'https://api.soundcloud.com/playlists/4110309',
        'info_dict': {
            'id': '4110309',
            'title': 'TILT Brass - Bowery Poetry Club, August \'03 [Non-Site SCR 02]',
--- a/youtube_dl/extractor/streamango.py
+++ b/youtube_dl/extractor/streamango.py
@@ -21,6 +21,17 @@ class StreamangoIE(InfoExtractor):
            'ext': 'mp4',
            'title': '20170315_150006.mp4',
        }
+    }, {
+        # no og:title
+        'url': 'https://streamango.com/embed/foqebrpftarclpob/asdf_asd_2_mp4',
+        'info_dict': {
+            'id': 'foqebrpftarclpob',
+            'ext': 'mp4',
+            'title': 'foqebrpftarclpob',
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'https://streamango.com/embed/clapasobsptpkdfe/20170315_150006_mp4',
        'only_matching': True,
@@ -31,7 +42,7 @@ class StreamangoIE(InfoExtractor):

        webpage = self._download_webpage(url, video_id)

-        title = self._og_search_title(webpage)
+        title = self._og_search_title(webpage, default=video_id)

        formats = []
        for format_ in re.findall(r'({[^}]*\bsrc\s*:\s*[^}]*})', webpage):
--- a/youtube_dl/extractor/tastytrade.py
+++ b/youtube_dl/extractor/tastytrade.py
@@ -0,0 +1,43 @@
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from .ooyala import OoyalaIE
+
+
+class TastyTradeIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?tastytrade\.com/tt/shows/[^/]+/episodes/(?P<id>[^/?#&]+)'
+
+    _TESTS = [{
+        'url': 'https://www.tastytrade.com/tt/shows/market-measures/episodes/correlation-in-short-volatility-06-28-2017',
+        'info_dict': {
+            'id': 'F3bnlzbToeI6pLEfRyrlfooIILUjz4nM',
+            'ext': 'mp4',
+            'title': 'A History of Teaming',
+            'description': 'md5:2a9033db8da81f2edffa4c99888140b3',
+            'duration': 422.255,
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'add_ie': ['Ooyala'],
+    }, {
+        'url': 'https://www.tastytrade.com/tt/shows/daily-dose/episodes/daily-dose-06-30-2017',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+
+        ooyala_code = self._search_regex(
+            r'data-media-id=(["\'])(?P<code>(?:(?!\1).)+)\1',
+            webpage, 'ooyala code', group='code')
+
+        info = self._search_json_ld(webpage, display_id, fatal=False)
+        info.update({
+            '_type': 'url_transparent',
+            'ie_key': OoyalaIE.ie_key(),
+            'url': 'ooyala:%s' % ooyala_code,
+            'display_id': display_id,
+        })
+        return info
--- a/youtube_dl/extractor/ted.py
+++ b/youtube_dl/extractor/ted.py
@@ -6,7 +6,10 @@ import re
 from .common import InfoExtractor

 from ..compat import compat_str
-from ..utils import int_or_none
+from ..utils import (
+    int_or_none,
+    try_get,
+)


 class TEDIE(InfoExtractor):
@@ -113,8 +116,9 @@ class TEDIE(InfoExtractor):
    }

    def _extract_info(self, webpage):
-        info_json = self._search_regex(r'q\("\w+.init",({.+})\)</script>',
-                                       webpage, 'info json')
+        info_json = self._search_regex(
+            r'(?s)q\(\s*"\w+.init"\s*,\s*({.+})\)\s*</script>',
+            webpage, 'info json')
        return json.loads(info_json)

    def _real_extract(self, url):
@@ -136,11 +140,16 @@ class TEDIE(InfoExtractor):
        webpage = self._download_webpage(url, name,
                                         'Downloading playlist webpage')
        info = self._extract_info(webpage)
-        playlist_info = info['playlist']
+
+        playlist_info = try_get(
+            info, lambda x: x['__INITIAL_DATA__']['playlist'],
+            dict) or info['playlist']

        playlist_entries = [
            self.url_result('http://www.ted.com/talks/' + talk['slug'], self.ie_key())
-            for talk in info['talks']
+            for talk in try_get(
+                info, lambda x: x['__INITIAL_DATA__']['talks'],
+                dict) or info['talks']
        ]
        return self.playlist_result(
            playlist_entries,
@@ -149,9 +158,14 @@ class TEDIE(InfoExtractor):

    def _talk_info(self, url, video_name):
        webpage = self._download_webpage(url, video_name)
-        self.report_extraction(video_name)

-        talk_info = self._extract_info(webpage)['talks'][0]
+        info = self._extract_info(webpage)
+
+        talk_info = try_get(
+            info, lambda x: x['__INITIAL_DATA__']['talks'][0],
+            dict) or info['talks'][0]
+
+        title = talk_info['title'].strip()

        external = talk_info.get('external')
        if external:
@@ -165,19 +179,27 @@ class TEDIE(InfoExtractor):
                'url': ext_url or external['uri'],
            }

+        native_downloads = try_get(
+            talk_info, lambda x: x['downloads']['nativeDownloads'],
+            dict) or talk_info['nativeDownloads']
+
        formats = [{
            'url': format_url,
            'format_id': format_id,
            'format': format_id,
-        } for (format_id, format_url) in talk_info['nativeDownloads'].items() if format_url is not None]
+        } for (format_id, format_url) in native_downloads.items() if format_url is not None]
        if formats:
            for f in formats:
                finfo = self._NATIVE_FORMATS.get(f['format_id'])
                if finfo:
                    f.update(finfo)

+        player_talk = talk_info['player_talks'][0]
+
+        resources_ = player_talk.get('resources') or talk_info.get('resources')
+
        http_url = None
-        for format_id, resources in talk_info['resources'].items():
+        for format_id, resources in resources_.items():
            if format_id == 'h264':
                for resource in resources:
                    h264_url = resource.get('file')
@@ -237,14 +259,11 @@ class TEDIE(InfoExtractor):

        video_id = compat_str(talk_info['id'])

-        thumbnail = talk_info['thumb']
-        if not thumbnail.startswith('http'):
-            thumbnail = 'http://' + thumbnail
        return {
            'id': video_id,
-            'title': talk_info['title'].strip(),
-            'uploader': talk_info['speaker'],
-            'thumbnail': thumbnail,
+            'title': title,
+            'uploader': player_talk.get('speaker') or talk_info.get('speaker'),
+            'thumbnail': player_talk.get('thumb') or talk_info.get('thumb'),
            'description': self._og_search_description(webpage),
            'subtitles': self._get_subtitles(video_id, talk_info),
            'formats': formats,
--- a/youtube_dl/extractor/thisoldhouse.py
+++ b/youtube_dl/extractor/thisoldhouse.py
@@ -2,13 +2,15 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
+from ..compat import compat_str
+from ..utils import try_get


 class ThisOldHouseIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode)/(?P<id>[^/?#]+)'
    _TESTS = [{
        'url': 'https://www.thisoldhouse.com/how-to/how-to-build-storage-bench',
-        'md5': '946f05bbaa12a33f9ae35580d2dfcfe3',
+        'md5': '568acf9ca25a639f0c4ff905826b662f',
        'info_dict': {
            'id': '2REGtUDQ',
            'ext': 'mp4',
@@ -28,8 +30,15 @@ class ThisOldHouseIE(InfoExtractor):
    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
-        drupal_settings = self._parse_json(self._search_regex(
-            r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
-            webpage, 'drupal settings'), display_id)
-        video_id = drupal_settings['jwplatform']['video_id']
+        video_id = self._search_regex(
+            (r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1',
+             r'id=(["\'])inline-video-player-(?P<id>(?:(?!\1).)+)\1'),
+            webpage, 'video id', default=None, group='id')
+        if not video_id:
+            drupal_settings = self._parse_json(self._search_regex(
+                r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
+                webpage, 'drupal settings'), display_id)
+            video_id = try_get(
+                drupal_settings, lambda x: x['jwplatform']['video_id'],
+                compat_str) or list(drupal_settings['comScore'])[0]
        return self.url_result('jwplatform:' + video_id, 'JWPlatform', video_id)
--- a/youtube_dl/extractor/turbo.py
+++ b/youtube_dl/extractor/turbo.py
@@ -4,6 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
    ExtractorError,
    int_or_none,
@@ -49,7 +50,7 @@ class TurboIE(InfoExtractor):
        for child in item:
            m = re.search(r'url_video_(?P<quality>.+)', child.tag)
            if m:
-                quality = m.group('quality')
+                quality = compat_str(m.group('quality'))
                formats.append({
                    'format_id': quality,
                    'url': child.text,
--- a/youtube_dl/extractor/tvplayer.py
+++ b/youtube_dl/extractor/tvplayer.py
@@ -48,7 +48,7 @@ class TVPlayerIE(InfoExtractor):
            'https://tvplayer.com/watch/context', display_id,
            'Downloading JSON context', query={
                'resource': resource_id,
-                'nonce': token,
+                'gen': token,
            })

        validate = context['validate']
--- a/youtube_dl/extractor/veoh.py
+++ b/youtube_dl/extractor/veoh.py
@@ -12,47 +12,46 @@ from ..utils import (


 class VeohIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?veoh\.com/(?:watch|iphone/#_Watch)/(?P<id>(?:v|yapi-)[\da-zA-Z]+)'
+    _VALID_URL = r'https?://(?:www\.)?veoh\.com/(?:watch|iphone/#_Watch)/(?P<id>(?:v|e|yapi-)[\da-zA-Z]+)'

-    _TESTS = [
-        {
-            'url': 'http://www.veoh.com/watch/v56314296nk7Zdmz3',
-            'md5': '620e68e6a3cff80086df3348426c9ca3',
-            'info_dict': {
-                'id': '56314296',
-                'ext': 'mp4',
-                'title': 'Straight Backs Are Stronger',
-                'uploader': 'LUMOback',
-                'description': 'At LUMOback, we believe straight backs are stronger.  The LUMOback Posture & Movement Sensor:  It gently vibrates when you slouch, inspiring improved posture and mobility.  Use the app to track your data and improve your posture over time. ',
-            },
+    _TESTS = [{
+        'url': 'http://www.veoh.com/watch/v56314296nk7Zdmz3',
+        'md5': '620e68e6a3cff80086df3348426c9ca3',
+        'info_dict': {
+            'id': '56314296',
+            'ext': 'mp4',
+            'title': 'Straight Backs Are Stronger',
+            'uploader': 'LUMOback',
+            'description': 'At LUMOback, we believe straight backs are stronger.  The LUMOback Posture & Movement Sensor:  It gently vibrates when you slouch, inspiring improved posture and mobility.  Use the app to track your data and improve your posture over time. ',
        },
-        {
-            'url': 'http://www.veoh.com/watch/v27701988pbTc4wzN?h1=Chile+workers+cover+up+to+avoid+skin+damage',
-            'md5': '4a6ff84b87d536a6a71e6aa6c0ad07fa',
-            'info_dict': {
-                'id': '27701988',
-                'ext': 'mp4',
-                'title': 'Chile workers cover up to avoid skin damage',
-                'description': 'md5:2bd151625a60a32822873efc246ba20d',
-                'uploader': 'afp-news',
-                'duration': 123,
-            },
-            'skip': 'This video has been deleted.',
+    }, {
+        'url': 'http://www.veoh.com/watch/v27701988pbTc4wzN?h1=Chile+workers+cover+up+to+avoid+skin+damage',
+        'md5': '4a6ff84b87d536a6a71e6aa6c0ad07fa',
+        'info_dict': {
+            'id': '27701988',
+            'ext': 'mp4',
+            'title': 'Chile workers cover up to avoid skin damage',
+            'description': 'md5:2bd151625a60a32822873efc246ba20d',
+            'uploader': 'afp-news',
+            'duration': 123,
        },
-        {
-            'url': 'http://www.veoh.com/watch/v69525809F6Nc4frX',
-            'md5': '4fde7b9e33577bab2f2f8f260e30e979',
-            'note': 'Embedded ooyala video',
-            'info_dict': {
-                'id': '69525809',
-                'ext': 'mp4',
-                'title': 'Doctors Alter Plan For Preteen\'s Weight Loss Surgery',
-                'description': 'md5:f5a11c51f8fb51d2315bca0937526891',
-                'uploader': 'newsy-videos',
-            },
-            'skip': 'This video has been deleted.',
+        'skip': 'This video has been deleted.',
+    }, {
+        'url': 'http://www.veoh.com/watch/v69525809F6Nc4frX',
+        'md5': '4fde7b9e33577bab2f2f8f260e30e979',
+        'note': 'Embedded ooyala video',
+        'info_dict': {
+            'id': '69525809',
+            'ext': 'mp4',
+            'title': 'Doctors Alter Plan For Preteen\'s Weight Loss Surgery',
+            'description': 'md5:f5a11c51f8fb51d2315bca0937526891',
+            'uploader': 'newsy-videos',
        },
-    ]
+        'skip': 'This video has been deleted.',
+    }, {
+        'url': 'http://www.veoh.com/watch/e152215AJxZktGS',
+        'only_matching': True,
+    }]

    def _extract_formats(self, source):
        formats = []
--- a/youtube_dl/extractor/vier.py
+++ b/youtube_dl/extractor/vier.py
@@ -15,7 +15,21 @@ from ..utils import (
 class VierIE(InfoExtractor):
    IE_NAME = 'vier'
    IE_DESC = 'vier.be and vijf.be'
-    _VALID_URL = r'https?://(?:www\.)?(?P<site>vier|vijf)\.be/(?:[^/]+/videos/(?P<display_id>[^/]+)(?:/(?P<id>\d+))?|video/v3/embed/(?P<embed_id>\d+))'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:www\.)?(?P<site>vier|vijf)\.be/
+                        (?:
+                            (?:
+                                [^/]+/videos|
+                                video(?:/[^/]+)*
+                            )/
+                            (?P<display_id>[^/]+)(?:/(?P<id>\d+))?|
+                            (?:
+                                video/v3/embed|
+                                embed/video/public
+                            )/(?P<embed_id>\d+)
+                        )
+                    '''
    _NETRC_MACHINE = 'vier'
    _TESTS = [{
        'url': 'http://www.vier.be/planb/videos/het-wordt-warm-de-moestuin/16129',
@@ -83,6 +97,15 @@ class VierIE(InfoExtractor):
    }, {
        'url': 'http://www.vier.be/video/v3/embed/16129',
        'only_matching': True,
+    }, {
+        'url': 'https://www.vijf.be/embed/video/public/4093',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.vier.be/video/blockbusters/in-juli-en-augustus-summer-classics',
+        'only_matching': True,
+    }, {
+        'url': 'https://www.vier.be/video/achter-de-rug/2017/achter-de-rug-seizoen-1-aflevering-6',
+        'only_matching': True,
    }]

    def _real_initialize(self):
@@ -133,14 +156,20 @@ class VierIE(InfoExtractor):
        video_id = self._search_regex(
            [r'data-nid="(\d+)"', r'"nid"\s*:\s*"(\d+)"'],
            webpage, 'video id', default=video_id or display_id)
-        application = self._search_regex(
-            [r'data-application="([^"]+)"', r'"application"\s*:\s*"([^"]+)"'],
-            webpage, 'application', default=site + '_vod')
-        filename = self._search_regex(
-            [r'data-filename="([^"]+)"', r'"filename"\s*:\s*"([^"]+)"'],
-            webpage, 'filename')

-        playlist_url = 'http://vod.streamcloud.be/%s/_definst_/mp4:%s.mp4/playlist.m3u8' % (application, filename)
+        playlist_url = self._search_regex(
+            r'data-file=(["\'])(?P<url>(?:https?:)?//[^/]+/.+?\.m3u8.*?)\1',
+            webpage, 'm3u8 url', default=None, group='url')
+
+        if not playlist_url:
+            application = self._search_regex(
+                [r'data-application="([^"]+)"', r'"application"\s*:\s*"([^"]+)"'],
+                webpage, 'application', default=site + '_vod')
+            filename = self._search_regex(
+                [r'data-filename="([^"]+)"', r'"filename"\s*:\s*"([^"]+)"'],
+                webpage, 'filename')
+            playlist_url = 'http://vod.streamcloud.be/%s/_definst_/mp4:%s.mp4/playlist.m3u8' % (application, filename)
+
        formats = self._extract_wowza_formats(
            playlist_url, display_id, skip_protocols=['dash'])
        self._sort_formats(formats)
--- a/youtube_dl/extractor/vimeo.py
+++ b/youtube_dl/extractor/vimeo.py
@@ -615,7 +615,10 @@ class VimeoIE(VimeoBaseInfoExtractor):
                if download_url and not source_file.get('is_cold') and not source_file.get('is_defrosting'):
                    source_name = source_file.get('public_name', 'Original')
                    if self._is_valid_url(download_url, video_id, '%s video' % source_name):
-                        ext = source_file.get('extension', determine_ext(download_url)).lower()
+                        ext = (try_get(
+                            source_file, lambda x: x['extension'],
+                            compat_str) or determine_ext(
+                            download_url, None) or 'mp4').lower()
                        formats.append({
                            'url': download_url,
                            'ext': ext,
--- a/youtube_dl/extractor/viu.py
+++ b/youtube_dl/extractor/viu.py
@@ -4,7 +4,10 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..compat import compat_str
+from ..compat import (
+    compat_kwargs,
+    compat_str,
+)
 from ..utils import (
    ExtractorError,
    int_or_none,
@@ -36,7 +39,8 @@ class ViuBaseIE(InfoExtractor):
        headers.update(kwargs.get('headers', {}))
        kwargs['headers'] = headers
        response = self._download_json(
-            'https://www.viu.com/api/' + path, *args, **kwargs)['response']
+            'https://www.viu.com/api/' + path, *args,
+            **compat_kwargs(kwargs))['response']
        if response.get('status') != 'success':
            raise ExtractorError('%s said: %s' % (
                self.IE_NAME, response['message']), expected=True)
--- a/youtube_dl/extractor/watchindianporn.py
+++ b/youtube_dl/extractor/watchindianporn.py
@@ -4,11 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
-    unified_strdate,
-    parse_duration,
-    int_or_none,
-)
+from ..utils import parse_duration


 class WatchIndianPornIE(InfoExtractor):
@@ -23,11 +19,8 @@ class WatchIndianPornIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Hot milf from kerala shows off her gorgeous large breasts on camera',
            'thumbnail': r're:^https?://.*\.jpg$',
-            'uploader': 'LoveJay',
-            'upload_date': '20160428',
            'duration': 226,
            'view_count': int,
-            'comment_count': int,
            'categories': list,
            'age_limit': 18,
        }
@@ -40,51 +33,36 @@ class WatchIndianPornIE(InfoExtractor):

        webpage = self._download_webpage(url, display_id)

-        video_url = self._html_search_regex(
-            r"url: escape\('([^']+)'\)", webpage, 'url')
+        info_dict = self._parse_html5_media_entries(url, webpage, video_id)[0]

-        title = self._html_search_regex(
-            r'<h2 class="he2"><span>(.*?)</span>',
-            webpage, 'title')
-        thumbnail = self._html_search_regex(
-            r'<span id="container"><img\s+src="([^"]+)"',
-            webpage, 'thumbnail', fatal=False)
-
-        uploader = self._html_search_regex(
-            r'class="aupa">\s*(.*?)</a>',
-            webpage, 'uploader')
-        upload_date = unified_strdate(self._html_search_regex(
-            r'Added: <strong>(.+?)</strong>', webpage, 'upload date', fatal=False))
+        title = self._html_search_regex((
+            r'<title>(.+?)\s*-\s*Indian\s+Porn</title>',
+            r'<h4>(.+?)</h4>'
+        ), webpage, 'title')

        duration = parse_duration(self._search_regex(
-            r'<td>Time:\s*</td>\s*<td align="right"><span>\s*(.+?)\s*</span>',
+            r'Time:\s*<strong>\s*(.+?)\s*</strong>',
            webpage, 'duration', fatal=False))

-        view_count = int_or_none(self._search_regex(
-            r'<td>Views:\s*</td>\s*<td align="right"><span>\s*(\d+)\s*</span>',
+        view_count = int(self._search_regex(
+            r'(?s)Time:\s*<strong>.*?</strong>.*?<strong>\s*(\d+)\s*</strong>',
            webpage, 'view count', fatal=False))
-        comment_count = int_or_none(self._search_regex(
-            r'<td>Comments:\s*</td>\s*<td align="right"><span>\s*(\d+)\s*</span>',
-            webpage, 'comment count', fatal=False))

        categories = re.findall(
-            r'<a href="[^"]+/search/video/desi"><span>([^<]+)</span></a>',
+            r'<a[^>]+class=[\'"]categories[\'"][^>]*>\s*([^<]+)\s*</a>',
            webpage)

-        return {
+        info_dict.update({
            'id': video_id,
            'display_id': display_id,
-            'url': video_url,
            'http_headers': {
                'Referer': url,
            },
            'title': title,
-            'thumbnail': thumbnail,
-            'uploader': uploader,
-            'upload_date': upload_date,
            'duration': duration,
            'view_count': view_count,
-            'comment_count': comment_count,
            'categories': categories,
            'age_limit': 18,
-        }
+        })
+
+        return info_dict
--- a/youtube_dl/extractor/wsj.py
+++ b/youtube_dl/extractor/wsj.py
@@ -13,7 +13,7 @@ class WSJIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                        (?:
                            https?://video-api\.wsj\.com/api-video/player/iframe\.html\?.*?\bguid=|
-                            https?://(?:www\.)?wsj\.com/video/[^/]+/|
+                            https?://(?:www\.)?(?:wsj|barrons)\.com/video/[^/]+/|
                            wsj:
                        )
                        (?P<id>[a-fA-F0-9-]{36})
@@ -35,6 +35,9 @@ class WSJIE(InfoExtractor):
    }, {
        'url': 'http://www.wsj.com/video/can-alphabet-build-a-smarter-city/359DDAA8-9AC1-489C-82E6-0429C1E430E0.html',
        'only_matching': True,
+    }, {
+        'url': 'http://www.barrons.com/video/capitalism-deserves-more-respect-from-millennials/F301217E-6F46-43AE-B8D2-B7180D642EE9.html',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/xfileshare.py
+++ b/youtube_dl/extractor/xfileshare.py
@@ -10,7 +10,6 @@ from ..utils import (
    ExtractorError,
    int_or_none,
    NO_DEFAULT,
-    sanitized_Request,
    urlencode_postdata,
 )

@@ -30,6 +29,8 @@ class XFileShareIE(InfoExtractor):
        (r'vidabc\.com', 'Vid ABC'),
        (r'vidbom\.com', 'VidBom'),
        (r'vidlo\.us', 'vidlo'),
+        (r'rapidvideo\.(?:cool|org)', 'RapidVideo.TV'),
+        (r'fastvideo\.me', 'FastVideo.me'),
    )

    IE_DESC = 'XFileShare based sites: %s' % ', '.join(list(zip(*_SITES))[1])
@@ -109,6 +110,12 @@ class XFileShareIE(InfoExtractor):
        'params': {
            'skip_download': True,
        },
+    }, {
+        'url': 'http://www.rapidvideo.cool/b667kprndr8w',
+        'only_matching': True,
+    }, {
+        'url': 'http://www.fastvideo.me/k8604r8nk8sn/FAST_FURIOUS_8_-_Trailer_italiano_ufficiale.mp4.html',
+        'only_matching': True
    }]

    def _real_extract(self, url):
@@ -130,12 +137,12 @@ class XFileShareIE(InfoExtractor):
            if countdown:
                self._sleep(countdown, video_id)

-            post = urlencode_postdata(fields)
-
-            req = sanitized_Request(url, post)
-            req.add_header('Content-type', 'application/x-www-form-urlencoded')
-
-            webpage = self._download_webpage(req, video_id, 'Downloading video page')
+            webpage = self._download_webpage(
+                url, video_id, 'Downloading video page',
+                data=urlencode_postdata(fields), headers={
+                    'Referer': url,
+                    'Content-type': 'application/x-www-form-urlencoded',
+                })

        title = (self._search_regex(
            (r'style="z-index: [0-9]+;">([^<]+)</span>',
@@ -150,7 +157,7 @@ class XFileShareIE(InfoExtractor):
        def extract_formats(default=NO_DEFAULT):
            urls = []
            for regex in (
-                    r'file\s*:\s*(["\'])(?P<url>http(?:(?!\1).)+\.(?:m3u8|mp4|flv)(?:(?!\1).)*)\1',
+                    r'(?:file|src)\s*:\s*(["\'])(?P<url>http(?:(?!\1).)+\.(?:m3u8|mp4|flv)(?:(?!\1).)*)\1',
                    r'file_link\s*=\s*(["\'])(?P<url>http(?:(?!\1).)+)\1',
                    r'addVariable\((\\?["\'])file\1\s*,\s*(\\?["\'])(?P<url>http(?:(?!\2).)+)\2\)',
                    r'<embed[^>]+src=(["\'])(?P<url>http(?:(?!\1).)+\.(?:m3u8|mp4|flv)(?:(?!\1).)*)\1'):
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dl/extractor/xhamster.py
@@ -3,6 +3,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
    clean_html,
    dict_get,
@@ -14,12 +15,21 @@ from ..utils import (


 class XHamsterIE(InfoExtractor):
-    _VALID_URL = r'(?P<proto>https?)://(?:.+?\.)?xhamster\.com/movies/(?P<id>[0-9]+)/(?P<seo>.*?)\.html(?:\?.*)?'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        (?:.+?\.)?xhamster\.com/
+                        (?:
+                            movies/(?P<id>\d+)/(?P<display_id>[^/]*)\.html|
+                            videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+)
+                        )
+                    '''
+
    _TESTS = [{
        'url': 'http://xhamster.com/movies/1509445/femaleagent_shy_beauty_takes_the_bait.html',
        'md5': '8281348b8d3c53d39fffb377d24eac4e',
        'info_dict': {
            'id': '1509445',
+            'display_id': 'femaleagent_shy_beauty_takes_the_bait',
            'ext': 'mp4',
            'title': 'FemaleAgent Shy beauty takes the bait',
            'upload_date': '20121014',
@@ -32,6 +42,7 @@ class XHamsterIE(InfoExtractor):
        'url': 'http://xhamster.com/movies/2221348/britney_spears_sexy_booty.html?hd',
        'info_dict': {
            'id': '2221348',
+            'display_id': 'britney_spears_sexy_booty',
            'ext': 'mp4',
            'title': 'Britney Spears  Sexy Booty',
            'upload_date': '20130914',
@@ -66,26 +77,18 @@ class XHamsterIE(InfoExtractor):
        # This video is visible for marcoalfa123456's friends only
        'url': 'https://it.xhamster.com/movies/7263980/la_mia_vicina.html',
        'only_matching': True,
+    }, {
+        # new URL schema
+        'url': 'https://pt.xhamster.com/videos/euro-pedal-pumping-7937821',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
-        def extract_video_url(webpage, name):
-            return self._search_regex(
-                [r'''file\s*:\s*(?P<q>["'])(?P<mp4>.+?)(?P=q)''',
-                 r'''<a\s+href=(?P<q>["'])(?P<mp4>.+?)(?P=q)\s+class=["']mp4Thumb''',
-                 r'''<video[^>]+file=(?P<q>["'])(?P<mp4>.+?)(?P=q)[^>]*>'''],
-                webpage, name, group='mp4')
-
-        def is_hd(webpage):
-            return '<div class=\'icon iconHD\'' in webpage
-
        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id') or mobj.group('id_2')
+        display_id = mobj.group('display_id') or mobj.group('display_id_2')

-        video_id = mobj.group('id')
-        seo = mobj.group('seo')
-        proto = mobj.group('proto')
-        mrss_url = '%s://xhamster.com/movies/%s/%s.html' % (proto, video_id, seo)
-        webpage = self._download_webpage(mrss_url, video_id)
+        webpage = self._download_webpage(url, video_id)

        error = self._html_search_regex(
            r'<div[^>]+id=["\']videoClosed["\'][^>]*>(.+?)</div>',
@@ -99,6 +102,39 @@ class XHamsterIE(InfoExtractor):
             r'<title[^>]*>(.+?)(?:,\s*[^,]*?\s*Porn\s*[^,]*?:\s*xHamster[^<]*| - xHamster\.com)</title>'],
            webpage, 'title')

+        formats = []
+        format_urls = set()
+
+        sources = self._parse_json(
+            self._search_regex(
+                r'sources\s*:\s*({.+?})\s*,?\s*\n', webpage, 'sources',
+                default='{}'),
+            video_id, fatal=False)
+        for format_id, format_url in sources.items():
+            if not isinstance(format_url, compat_str):
+                continue
+            if format_url in format_urls:
+                continue
+            format_urls.add(format_url)
+            formats.append({
+                'format_id': format_id,
+                'url': format_url,
+                'height': int_or_none(self._search_regex(
+                    r'^(\d+)[pP]', format_id, 'height', default=None))
+            })
+
+        video_url = self._search_regex(
+            [r'''file\s*:\s*(?P<q>["'])(?P<mp4>.+?)(?P=q)''',
+             r'''<a\s+href=(?P<q>["'])(?P<mp4>.+?)(?P=q)\s+class=["']mp4Thumb''',
+             r'''<video[^>]+file=(?P<q>["'])(?P<mp4>.+?)(?P=q)[^>]*>'''],
+            webpage, 'video url', group='mp4', default=None)
+        if video_url and video_url not in format_urls:
+            formats.append({
+                'url': video_url,
+            })
+
+        self._sort_formats(formats)
+
        # Only a few videos have an description
        mobj = re.search(r'<span>Description: </span>([^<]+)', webpage)
        description = mobj.group(1) if mobj else None
@@ -117,7 +153,8 @@ class XHamsterIE(InfoExtractor):
            webpage, 'thumbnail', fatal=False, group='thumbnail')

        duration = parse_duration(self._search_regex(
-            r'Runtime:\s*</span>\s*([\d:]+)', webpage,
+            [r'<[^<]+\bitemprop=["\']duration["\'][^<]+\bcontent=["\'](.+?)["\']',
+             r'Runtime:\s*</span>\s*([\d:]+)'], webpage,
            'duration', fatal=False))

        view_count = int_or_none(self._search_regex(
@@ -132,30 +169,6 @@ class XHamsterIE(InfoExtractor):

        age_limit = self._rta_search(webpage)

-        hd = is_hd(webpage)
-
-        format_id = 'hd' if hd else 'sd'
-
-        video_url = extract_video_url(webpage, format_id)
-        formats = [{
-            'url': video_url,
-            'format_id': 'hd' if hd else 'sd',
-            'preference': 1,
-        }]
-
-        if not hd:
-            mrss_url = self._search_regex(r'<link rel="canonical" href="([^"]+)', webpage, 'mrss_url')
-            webpage = self._download_webpage(mrss_url + '?hd', video_id, note='Downloading HD webpage')
-            if is_hd(webpage):
-                video_url = extract_video_url(webpage, 'hd')
-                formats.append({
-                    'url': video_url,
-                    'format_id': 'hd',
-                    'preference': 2,
-                })
-
-        self._sort_formats(formats)
-
        categories_html = self._search_regex(
            r'(?s)<table.+?(<span>Categories:.+?)</table>', webpage,
            'categories', default=None)
@@ -164,6 +177,7 @@ class XHamsterIE(InfoExtractor):

        return {
            'id': video_id,
+            'display_id': display_id,
            'title': title,
            'description': description,
            'upload_date': upload_date,
--- a/youtube_dl/extractor/xuite.py
+++ b/youtube_dl/extractor/xuite.py
@@ -1,14 +1,13 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import base64
-
 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_unquote
 from ..utils import (
    ExtractorError,
+    float_or_none,
+    get_element_by_attribute,
    parse_iso8601,
-    parse_duration,
+    remove_end,
 )


@@ -24,6 +23,7 @@ class XuiteIE(InfoExtractor):
            'id': '3860914',
            'ext': 'mp3',
            'title': '孤單南半球-歐德陽',
+            'description': '孤單南半球-歐德陽',
            'thumbnail': r're:^https?://.*\.jpg$',
            'duration': 247.246,
            'timestamp': 1314932940,
@@ -44,7 +44,7 @@ class XuiteIE(InfoExtractor):
            'duration': 596.458,
            'timestamp': 1454242500,
            'upload_date': '20160131',
-            'uploader': 'yan12125',
+            'uploader': '屁姥',
            'uploader_id': '12158353',
            'categories': ['個人短片'],
            'description': 'http://download.blender.org/peach/bigbuckbunny_movies/BigBuckBunny_320x180.mp4',
@@ -72,10 +72,10 @@ class XuiteIE(InfoExtractor):
        # from http://forgetfulbc.blogspot.com/2016/06/date.html
        'url': 'http://vlog.xuite.net/embed/cE1xbENoLTI3NDQ3MzM2LmZsdg==?ar=0&as=0',
        'info_dict': {
-            'id': 'cE1xbENoLTI3NDQ3MzM2LmZsdg==',
+            'id': '27447336',
            'ext': 'mp4',
            'title': '男女平權只是口號？專家解釋約會時男生是否該幫女生付錢 (中字)',
-            'description': 'md5:f0abdcb69df300f522a5442ef3146f2a',
+            'description': 'md5:1223810fa123b179083a3aed53574706',
            'timestamp': 1466160960,
            'upload_date': '20160617',
            'uploader': 'B.C. & Lowy',
@@ -86,29 +86,9 @@ class XuiteIE(InfoExtractor):
        'only_matching': True,
    }]

-    @staticmethod
-    def base64_decode_utf8(data):
-        return base64.b64decode(data.encode('utf-8')).decode('utf-8')
-
-    @staticmethod
-    def base64_encode_utf8(data):
-        return base64.b64encode(data.encode('utf-8')).decode('utf-8')
-
-    def _extract_flv_config(self, encoded_media_id):
-        flv_config = self._download_xml(
-            'http://vlog.xuite.net/flash/player?media=%s' % encoded_media_id,
-            'flv config')
-        prop_dict = {}
-        for prop in flv_config.findall('./property'):
-            prop_id = self.base64_decode_utf8(prop.attrib['id'])
-            # CDATA may be empty in flv config
-            if not prop.text:
-                continue
-            encoded_content = self.base64_decode_utf8(prop.text)
-            prop_dict[prop_id] = compat_urllib_parse_unquote(encoded_content)
-        return prop_dict
-
    def _real_extract(self, url):
+        # /play/ URLs provide embedded video URL and more metadata
+        url = url.replace('/embed/', '/play/')
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)
@@ -121,51 +101,53 @@ class XuiteIE(InfoExtractor):
                '%s returned error: %s' % (self.IE_NAME, error_msg),
                expected=True)

-        encoded_media_id = self._search_regex(
-            r'attributes\.name\s*=\s*"([^"]+)"', webpage,
-            'encoded media id', default=None)
-        if encoded_media_id is None:
-            video_id = self._html_search_regex(
-                r'data-mediaid="(\d+)"', webpage, 'media id')
-            encoded_media_id = self.base64_encode_utf8(video_id)
-        flv_config = self._extract_flv_config(encoded_media_id)
+        media_info = self._parse_json(self._search_regex(
+            r'var\s+mediaInfo\s*=\s*({.*});', webpage, 'media info'), video_id)

-        FORMATS = {
-            'audio': 'mp3',
-            'video': 'mp4',
-        }
+        video_id = media_info['MEDIA_ID']

        formats = []
-        for format_tag in ('src', 'hq_src'):
-            video_url = flv_config.get(format_tag)
+        for key in ('html5Url', 'html5HQUrl'):
+            video_url = media_info.get(key)
            if not video_url:
                continue
            format_id = self._search_regex(
-                r'\bq=(.+?)\b', video_url, 'format id', default=format_tag)
+                r'\bq=(.+?)\b', video_url, 'format id', default=None)
            formats.append({
                'url': video_url,
-                'ext': FORMATS.get(flv_config['type'], 'mp4'),
+                'ext': 'mp4' if format_id.isnumeric() else format_id,
                'format_id': format_id,
                'height': int(format_id) if format_id.isnumeric() else None,
            })
        self._sort_formats(formats)

-        timestamp = flv_config.get('publish_datetime')
+        timestamp = media_info.get('PUBLISH_DATETIME')
        if timestamp:
            timestamp = parse_iso8601(timestamp + ' +0800', ' ')

-        category = flv_config.get('category')
+        category = media_info.get('catName')
        categories = [category] if category else []

+        uploader = media_info.get('NICKNAME')
+        uploader_url = None
+
+        author_div = get_element_by_attribute('itemprop', 'author', webpage)
+        if author_div:
+            uploader = uploader or self._html_search_meta('name', author_div)
+            uploader_url = self._html_search_regex(
+                r'<link[^>]+itemprop="url"[^>]+href="([^"]+)"', author_div,
+                'uploader URL', fatal=False)
+
        return {
            'id': video_id,
-            'title': flv_config['title'],
-            'description': flv_config.get('description'),
-            'thumbnail': flv_config.get('thumb'),
+            'title': media_info['TITLE'],
+            'description': remove_end(media_info.get('metaDesc'), ' (Xuite 影音)'),
+            'thumbnail': media_info.get('ogImageUrl'),
            'timestamp': timestamp,
-            'uploader': flv_config.get('author_name'),
-            'uploader_id': flv_config.get('author_id'),
-            'duration': parse_duration(flv_config.get('duration')),
+            'uploader': uploader,
+            'uploader_id': media_info.get('MEMBER_ID'),
+            'uploader_url': uploader_url,
+            'duration': float_or_none(media_info.get('MEDIA_DURATION'), 1000000),
            'categories': categories,
            'formats': formats,
        }
--- a/youtube_dl/extractor/yam.py
+++ b/youtube_dl/extractor/yam.py
@@ -1,123 +0,0 @@
-# coding: utf-8
-from __future__ import unicode_literals
-
-import re
-
-from .common import InfoExtractor
-from ..compat import compat_urlparse
-from ..utils import (
-    float_or_none,
-    month_by_abbreviation,
-    ExtractorError,
-    get_element_by_attribute,
-)
-
-
-class YamIE(InfoExtractor):
-    IE_DESC = '蕃薯藤yam天空部落'
-    _VALID_URL = r'https?://mymedia\.yam\.com/m/(?P<id>\d+)'
-
-    _TESTS = [{
-        # An audio hosted on Yam
-        'url': 'http://mymedia.yam.com/m/2283921',
-        'md5': 'c011b8e262a52d5473d9c2e3c9963b9c',
-        'info_dict': {
-            'id': '2283921',
-            'ext': 'mp3',
-            'title': '發現 - 趙薇 京華煙雲主題曲',
-            'description': '發現 - 趙薇 京華煙雲主題曲',
-            'uploader_id': 'princekt',
-            'upload_date': '20080807',
-            'duration': 313.0,
-        }
-    }, {
-        # An external video hosted on YouTube
-        'url': 'http://mymedia.yam.com/m/3599430',
-        'md5': '03127cf10d8f35d120a9e8e52e3b17c6',
-        'info_dict': {
-            'id': 'CNpEoQlrIgA',
-            'ext': 'mp4',
-            'upload_date': '20150306',
-            'uploader': '新莊社大瑜伽社',
-            'description': 'md5:11e2e405311633ace874f2e6226c8b17',
-            'uploader_id': '2323agoy',
-            'title': '20090412陽明山二子坪-1',
-        },
-        'skip': 'Video does not exist',
-    }, {
-        'url': 'http://mymedia.yam.com/m/3598173',
-        'info_dict': {
-            'id': '3598173',
-            'ext': 'mp4',
-        },
-        'skip': 'cause Yam system error',
-    }, {
-        'url': 'http://mymedia.yam.com/m/3599437',
-        'info_dict': {
-            'id': '3599437',
-            'ext': 'mp4',
-        },
-        'skip': 'invalid YouTube URL',
-    }, {
-        'url': 'http://mymedia.yam.com/m/2373534',
-        'md5': '7ff74b91b7a817269d83796f8c5890b1',
-        'info_dict': {
-            'id': '2373534',
-            'ext': 'mp3',
-            'title': '林俊傑&蔡卓妍-小酒窩',
-            'description': 'md5:904003395a0fcce6cfb25028ff468420',
-            'upload_date': '20080928',
-            'uploader_id': 'onliner2',
-        }
-    }]
-
-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        page = self._download_webpage(url, video_id)
-
-        # Check for errors
-        system_msg = self._html_search_regex(
-            r'系統訊息(?:<br>|\n|\r)*([^<>]+)<br>', page, 'system message',
-            default=None)
-        if system_msg:
-            raise ExtractorError(system_msg, expected=True)
-
-        # Is it hosted externally on YouTube?
-        youtube_url = self._html_search_regex(
-            r'<embed src="(http://www.youtube.com/[^"]+)"',
-            page, 'YouTube url', default=None)
-        if youtube_url:
-            return self.url_result(youtube_url, 'Youtube')
-
-        title = self._html_search_regex(
-            r'<h1[^>]+class="heading"[^>]*>\s*(.+)\s*</h1>', page, 'title')
-
-        api_page = self._download_webpage(
-            'http://mymedia.yam.com/api/a/?pID=' + video_id, video_id,
-            note='Downloading API page')
-        api_result_obj = compat_urlparse.parse_qs(api_page)
-
-        info_table = get_element_by_attribute('class', 'info', page)
-        uploader_id = self._html_search_regex(
-            r'<!-- 發表作者 -->：[\n ]+<a href="/([a-z0-9]+)"',
-            info_table, 'uploader id', fatal=False)
-        mobj = re.search(r'<!-- 發表於 -->(?P<mon>[A-Z][a-z]{2})\s+' +
-                         r'(?P<day>\d{1,2}), (?P<year>\d{4})', page)
-        if mobj:
-            upload_date = '%s%02d%02d' % (
-                mobj.group('year'),
-                month_by_abbreviation(mobj.group('mon')),
-                int(mobj.group('day')))
-        else:
-            upload_date = None
-        duration = float_or_none(api_result_obj['totaltime'][0], scale=1000)
-
-        return {
-            'id': video_id,
-            'url': api_result_obj['mp3file'][0],
-            'title': title,
-            'description': self._html_search_meta('description', page),
-            'duration': duration,
-            'uploader_id': uploader_id,
-            'upload_date': upload_date,
-        }
--- a/youtube_dl/extractor/youporn.py
+++ b/youtube_dl/extractor/youporn.py
@@ -3,6 +3,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
    int_or_none,
    sanitized_Request,
@@ -26,7 +27,7 @@ class YouPornIE(InfoExtractor):
            'description': 'Love & Sex Answers: http://bit.ly/DanAndJenn -- Is It Unhealthy To Masturbate Daily?',
            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'Ask Dan And Jennifer',
-            'upload_date': '20101221',
+            'upload_date': '20101217',
            'average_rating': int,
            'view_count': int,
            'comment_count': int,
@@ -45,7 +46,7 @@ class YouPornIE(InfoExtractor):
            'description': 'http://sweetlivegirls.com Big Tits Awesome Brunette On amazing webcam show.mp4',
            'thumbnail': r're:^https?://.*\.jpg$',
            'uploader': 'Unknown',
-            'upload_date': '20111125',
+            'upload_date': '20110418',
            'average_rating': int,
            'view_count': int,
            'comment_count': int,
@@ -68,28 +69,46 @@ class YouPornIE(InfoExtractor):
        webpage = self._download_webpage(request, display_id)

        title = self._search_regex(
-            [r'(?:video_titles|videoTitle)\s*[:=]\s*(["\'])(?P<title>.+?)\1',
-             r'<h1[^>]+class=["\']heading\d?["\'][^>]*>([^<])<'],
-            webpage, 'title', group='title')
+            [r'(?:video_titles|videoTitle)\s*[:=]\s*(["\'])(?P<title>(?:(?!\1).)+)\1',
+             r'<h1[^>]+class=["\']heading\d?["\'][^>]*>(?P<title>[^<]+)<'],
+            webpage, 'title', group='title',
+            default=None) or self._og_search_title(
+            webpage, default=None) or self._html_search_meta(
+            'title', webpage, fatal=True)

        links = []

+        # Main source
+        definitions = self._parse_json(
+            self._search_regex(
+                r'mediaDefinition\s*=\s*(\[.+?\]);', webpage,
+                'media definitions', default='[]'),
+            video_id, fatal=False)
+        if definitions:
+            for definition in definitions:
+                if not isinstance(definition, dict):
+                    continue
+                video_url = definition.get('videoUrl')
+                if isinstance(video_url, compat_str) and video_url:
+                    links.append(video_url)
+
+        # Fallback #1, this also contains extra low quality 180p format
+        for _, link in re.findall(r'<a[^>]+href=(["\'])(http.+?)\1[^>]+title=["\']Download [Vv]ideo', webpage):
+            links.append(link)
+
+        # Fallback #2 (unavailable as at 22.06.2017)
        sources = self._search_regex(
            r'(?s)sources\s*:\s*({.+?})', webpage, 'sources', default=None)
        if sources:
            for _, link in re.findall(r'[^:]+\s*:\s*(["\'])(http.+?)\1', sources):
                links.append(link)

-        # Fallback #1
+        # Fallback #3 (unavailable as at 22.06.2017)
        for _, link in re.findall(
-                r'(?:videoUrl|videoSrc|videoIpadUrl|html5PlayerSrc)\s*[:=]\s*(["\'])(http.+?)\1', webpage):
+                r'(?:videoSrc|videoIpadUrl|html5PlayerSrc)\s*[:=]\s*(["\'])(http.+?)\1', webpage):
            links.append(link)

-        # Fallback #2, this also contains extra low quality 180p format
-        for _, link in re.findall(r'<a[^>]+href=(["\'])(http.+?)\1[^>]+title=["\']Download [Vv]ideo', webpage):
-            links.append(link)
-
-        # Fallback #3, encrypted links
+        # Fallback #4, encrypted links (unavailable as at 22.06.2017)
        for _, encrypted_link in re.findall(
                r'encryptedQuality\d{3,4}URL\s*=\s*(["\'])([\da-zA-Z+/=]+)\1', webpage):
            links.append(aes_decrypt_text(encrypted_link, title, 32).decode('utf-8'))
@@ -124,7 +143,8 @@ class YouPornIE(InfoExtractor):
            r'(?s)<div[^>]+class=["\']submitByLink["\'][^>]*>(.+?)</div>',
            webpage, 'uploader', fatal=False)
        upload_date = unified_strdate(self._html_search_regex(
-            r'(?s)<div[^>]+class=["\']videoInfo(?:Date|Time)["\'][^>]*>(.+?)</div>',
+            [r'Date\s+[Aa]dded:\s*<span>([^<]+)',
+             r'(?s)<div[^>]+class=["\']videoInfo(?:Date|Time)["\'][^>]*>(.+?)</div>'],
            webpage, 'upload date', fatal=False))

        age_limit = self._rta_search(webpage)
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -1269,37 +1269,57 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                    sub_lang_list[sub_lang] = sub_formats
                return sub_lang_list

+            def make_captions(sub_url, sub_langs):
+                parsed_sub_url = compat_urllib_parse_urlparse(sub_url)
+                caption_qs = compat_parse_qs(parsed_sub_url.query)
+                captions = {}
+                for sub_lang in sub_langs:
+                    sub_formats = []
+                    for ext in self._SUBTITLE_FORMATS:
+                        caption_qs.update({
+                            'tlang': [sub_lang],
+                            'fmt': [ext],
+                        })
+                        sub_url = compat_urlparse.urlunparse(parsed_sub_url._replace(
+                            query=compat_urllib_parse_urlencode(caption_qs, True)))
+                        sub_formats.append({
+                            'url': sub_url,
+                            'ext': ext,
+                        })
+                    captions[sub_lang] = sub_formats
+                return captions
+
+            # New captions format as of 22.06.2017
+            player_response = args.get('player_response')
+            if player_response and isinstance(player_response, compat_str):
+                player_response = self._parse_json(
+                    player_response, video_id, fatal=False)
+                if player_response:
+                    renderer = player_response['captions']['playerCaptionsTracklistRenderer']
+                    base_url = renderer['captionTracks'][0]['baseUrl']
+                    sub_lang_list = []
+                    for lang in renderer['translationLanguages']:
+                        lang_code = lang.get('languageCode')
+                        if lang_code:
+                            sub_lang_list.append(lang_code)
+                    return make_captions(base_url, sub_lang_list)
+
            # Some videos don't provide ttsurl but rather caption_tracks and
            # caption_translation_languages (e.g. 20LmZk1hakA)
+            # Does not used anymore as of 22.06.2017
            caption_tracks = args['caption_tracks']
            caption_translation_languages = args['caption_translation_languages']
            caption_url = compat_parse_qs(caption_tracks.split(',')[0])['u'][0]
-            parsed_caption_url = compat_urllib_parse_urlparse(caption_url)
-            caption_qs = compat_parse_qs(parsed_caption_url.query)
-
-            sub_lang_list = {}
+            sub_lang_list = []
            for lang in caption_translation_languages.split(','):
                lang_qs = compat_parse_qs(compat_urllib_parse_unquote_plus(lang))
                sub_lang = lang_qs.get('lc', [None])[0]
-                if not sub_lang:
-                    continue
-                sub_formats = []
-                for ext in self._SUBTITLE_FORMATS:
-                    caption_qs.update({
-                        'tlang': [sub_lang],
-                        'fmt': [ext],
-                    })
-                    sub_url = compat_urlparse.urlunparse(parsed_caption_url._replace(
-                        query=compat_urllib_parse_urlencode(caption_qs, True)))
-                    sub_formats.append({
-                        'url': sub_url,
-                        'ext': ext,
-                    })
-                sub_lang_list[sub_lang] = sub_formats
-            return sub_lang_list
+                if sub_lang:
+                    sub_lang_list.append(sub_lang)
+            return make_captions(caption_url, sub_lang_list)
        # An extractor error can be raise by the download process if there are
        # no automatic captions but there are subtitles
-        except (KeyError, ExtractorError):
+        except (KeyError, IndexError, ExtractorError):
            self._downloader.report_warning(err_msg)
            return {}

--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -310,7 +310,7 @@ def parseOpts(overrideArguments=None):
        metavar='FILTER', dest='match_filter', default=None,
        help=(
            'Generic video filter. '
-            'Specify any key (see help for -o for a list of available keys) to '
+            'Specify any key (see the "OUTPUT TEMPLATE" for a list of available keys) to '
            'match if the key is present, '
            '!key to check if the key is not present, '
            'key > NUMBER (like "comment_count > 12", also works with '
@@ -618,7 +618,7 @@ def parseOpts(overrideArguments=None):
    verbosity.add_option(
        '-j', '--dump-json',
        action='store_true', dest='dumpjson', default=False,
-        help='Simulate, quiet but print JSON information. See --output for a description of available keys.')
+        help='Simulate, quiet but print JSON information. See the "OUTPUT TEMPLATE" for a description of available keys.')
    verbosity.add_option(
        '-J', '--dump-single-json',
        action='store_true', dest='dump_single_json', default=False,
--- a/youtube_dl/postprocessor/execafterdownload.py
+++ b/youtube_dl/postprocessor/execafterdownload.py
@@ -4,7 +4,10 @@ import subprocess

 from .common import PostProcessor
 from ..compat import compat_shlex_quote
-from ..utils import PostProcessingError
+from ..utils import (
+    encodeArgument,
+    PostProcessingError,
+)


 class ExecAfterDownloadPP(PostProcessor):
@@ -20,7 +23,7 @@ class ExecAfterDownloadPP(PostProcessor):
        cmd = cmd.replace('{}', compat_shlex_quote(information['filepath']))

        self._downloader.to_screen('[exec] Executing command: %s' % cmd)
-        retCode = subprocess.call(cmd, shell=True)
+        retCode = subprocess.call(encodeArgument(cmd), shell=True)
        if retCode != 0:
            raise PostProcessingError(
                'Command returned error code %d' % retCode)
--- a/youtube_dl/postprocessor/ffmpeg.py
+++ b/youtube_dl/postprocessor/ffmpeg.py
@@ -542,7 +542,7 @@ class FFmpegFixupM3u8PP(FFmpegPostProcessor):
            temp_filename = prepend_extension(filename, 'temp')

            options = ['-c', 'copy', '-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
-            self._downloader.to_screen('[ffmpeg] Fixing malformated aac bitstream in "%s"' % filename)
+            self._downloader.to_screen('[ffmpeg] Fixing malformed AAC bitstream in "%s"' % filename)
            self.run_ffmpeg(filename, temp_filename, options)

            os.remove(encodeFilename(filename))
--- a/youtube_dl/postprocessor/metadatafromtitle.py
+++ b/youtube_dl/postprocessor/metadatafromtitle.py
@@ -35,11 +35,14 @@ class MetadataFromTitlePP(PostProcessor):
        title = info['title']
        match = re.match(self._titleregex, title)
        if match is None:
-            self._downloader.to_screen('[fromtitle] Could not interpret title of video as "%s"' % self._titleformat)
+            self._downloader.to_screen(
+                '[fromtitle] Could not interpret title of video as "%s"'
+                % self._titleformat)
            return [], info
        for attribute, value in match.groupdict().items():
-            value = match.group(attribute)
            info[attribute] = value
-            self._downloader.to_screen('[fromtitle] parsed ' + attribute + ': ' + value)
+            self._downloader.to_screen(
+                '[fromtitle] parsed %s: %s'
+                % (attribute, value if value is not None else 'NA'))

        return [], info
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -22,7 +22,6 @@ import locale
 import math
 import operator
 import os
-import pipes
 import platform
 import random
 import re
@@ -36,6 +35,7 @@ import xml.etree.ElementTree
 import zlib

 from .compat import (
+    compat_HTMLParseError,
    compat_HTMLParser,
    compat_basestring,
    compat_chr,
@@ -365,9 +365,9 @@ def get_elements_by_attribute(attribute, value, html, escape_value=True):
    retlist = []
    for m in re.finditer(r'''(?xs)
        <([a-zA-Z0-9:._-]+)
-         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'))*?
+         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
         \s+%s=['"]?%s['"]?
-         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'))*?
+         (?:\s+[a-zA-Z0-9:._-]+(?:=[a-zA-Z0-9:._-]*|="[^"]*"|='[^']*'|))*?
        \s*>
        (?P<content>.*?)
        </\1>
@@ -409,8 +409,12 @@ def extract_attributes(html_element):
    but the cases in the unit test will work for all of 2.6, 2.7, 3.2-3.5.
    """
    parser = HTMLAttributeParser()
-    parser.feed(html_element)
-    parser.close()
+    try:
+        parser.feed(html_element)
+        parser.close()
+    # Older Python may throw HTMLParseError in case of malformed HTML
+    except compat_HTMLParseError:
+        pass
    return parser.attrs


@@ -1179,7 +1183,7 @@ def unified_timestamp(date_str, day_first=True):
    if date_str is None:
        return None

-    date_str = date_str.replace(',', ' ')
+    date_str = re.sub(r'[,|]', '', date_str)

    pm_delta = 12 if re.search(r'(?i)PM', date_str) else 0
    timezone, date_str = extract_timezone(date_str)
@@ -1530,7 +1534,7 @@ def shell_quote(args):
        if isinstance(a, bytes):
            # We may get a filename encoded with 'encodeFilename'
            a = a.decode(encoding)
-        quoted_args.append(pipes.quote(a))
+        quoted_args.append(compat_shlex_quote(a))
    return ' '.join(quoted_args)


--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2017.06.05'
+__version__ = '2017.07.09'
Author	SHA1	Message	Date
Sergey M․	65c416dda8	release 2017.07.09	2017-07-09 20:16:38 +07:00
Sergey M․	207acd8465	[ChangeLog] Actualize	2017-07-09 20:15:15 +07:00
Sergey M․	71a1db8919	[dailymail] Add support for embeds	2017-07-09 20:06:24 +07:00
Sergey M․	6e925598d6	[csjw] Add coding cookie	2017-07-09 19:18:12 +07:00
Sergey M․	73cf76a93f	[joj] Rewrite and add support for generic embeds (closes #13268 )	2017-07-09 19:17:54 +07:00
luboss	256a746d21	[joj] Add extractor	2017-07-09 19:17:38 +07:00
Sergey M․	58179eb7d9	[abc.net.au:iview] Extract more formats (closes #13492 , closes #13489 )	2017-07-09 17:55:40 +07:00
Sergey M․	485cb37576	[egghead:course] Improve (closes #13370 )	2017-07-09 17:30:49 +07:00
Santiago Calcagno	ed84454d35	[egghead:course] Fix extraction	2017-07-09 17:30:25 +07:00
Sergey M․	a02682fd13	Keep in sync with ffmpeg's current malformed AAC bitstream wording (closes #13587 )	2017-07-09 17:09:44 +07:00
Sergey M․	0d2f0b0357	[csjw] Make description optional	2017-07-09 17:05:11 +07:00
Sergey M․	c319d1c483	[csjw] Fix issues and improve extraction (closes #13525 )	2017-07-09 17:01:05 +07:00
Christopher Smith	d2b9f362fa	[cjsw] Add extractor	2017-07-09 17:01:00 +07:00
Sergey M․	4328ddf82b	[extractor/common] Add support for AMP tags in _parse_html5_media_entries	2017-07-09 16:29:52 +07:00
Sergey M․	250b042c7e	[generic] Add tests for #13557	2017-07-09 16:02:38 +07:00
Sergey M․	665e945246	[eagleplatform] Add support for referrer protected videos (closes #13557 )	2017-07-09 15:57:58 +07:00
Sergey M․	5af2fd7fa0	[eagleplatform] Add support for another embed pattern (#13557 )	2017-07-09 15:55:04 +07:00
mlindner	15237fcd51	[veoh] Extend _VALID_URL	2017-07-09 14:54:52 +07:00
rrooij	7a57730907	[npo:live] Fix live stream id extraction (closes #13568 )	2017-07-09 14:21:40 +07:00
Sergey M․	8b347a389e	[googledrive] Fix height extraction (closes #13603 )	2017-07-09 00:26:13 +07:00
Sergey M․	a49804816c	[dailymotion] Add support for new layout (close #13580 )	2017-07-08 18:12:15 +07:00
Yen Chi Hsuan	eadd313321	[yam] Remove extractor mymedia.yam.com is dead. An wikipedia user also pointed out that Yam's blog service is no longer available. [1] [1] https://zh.wikipedia.org/zh-tw/%E5%A4%A9%E7%A9%BA%E9%83%A8%E8%90%BD	2017-07-08 15:48:05 +08:00
Sergey M․	d852c6bc59	[xhamster] Extract all formats and fix duration extraction (#13593 )	2017-07-07 22:49:11 +07:00
Sergey M․	00e5c36315	[xhamster] Add support for new URL schema (closes #13593 )	2017-07-07 22:27:34 +07:00
Sergey M․	8a04ade86b	Credit @parmjitv for #13322 , #13503 , #13541 , #13549	2017-07-06 23:15:23 +07:00
Sergey M․	ab328411d5	Credit @orng for ruv (#13396 )	2017-07-06 23:15:16 +07:00
Sergey M․	ddeff4be3f	Credit @gfabiano for #13382 , #13385 , #13415	2017-07-06 23:15:09 +07:00
Parmjit Virk	60d4401c5e	[espn] Extend _VALID_URL (fixes #13244 )	2017-07-06 22:55:59 +07:00
Sergey M․	dee2ff1d81	[test_utils] Fix tests under Windows	2017-07-06 00:25:37 +07:00
Sergey M․	6554708252	[kaltura] Fix typo in subtitles extraction (closes #13569 )	2017-07-05 23:20:50 +07:00
Sergey M․	0a2e1b2e30	[vier] Adapt extraction to redesign (#13575 )	2017-07-05 22:52:47 +07:00
Yen Chi Hsuan	babbc04d45	[xuite] Move to the new HTML5 API and reduce # of requests	2017-07-05 23:27:12 +08:00
Yen Chi Hsuan	609ff8ca19	[utils] Support attributes with no values in get_elements_by_attribute()	2017-07-05 23:27:12 +08:00
Sergey M․	b6c9fe4162	release 2017.07.02	2017-07-02 20:17:10 +07:00
Sergey M․	4d9ba27bba	[ChangeLog] Actualize	2017-07-02 20:12:40 +07:00
Sergey M․	50ae3f646e	[thisoldhouse] Add more fallbacks for video id (closes #13541 )	2017-07-02 20:06:15 +07:00
Parmjit Virk	99a7e76240	[thisoldhouse] Update test	2017-07-02 20:05:11 +07:00
Parmjit Virk	a3a6d01a96	[thisoldhouse] Fix video id extraction (closes #13540 )	2017-07-02 20:04:51 +07:00
Sergey M․	02d61a65e2	[xfileshare] Extend format regex (closes #13536 )	2017-07-02 08:00:22 +07:00
Sergey M․	9b35297be1	[extractors] Add import for tastytrade	2017-07-01 18:39:29 +07:00
Sergey M․	4917478803	[ted] Fix extraction (closes #13535 ))	2017-07-01 18:39:01 +07:00
Sergey M․	54faac2235	[tastytrade] Add extractor (closes #13521 )	2017-06-30 22:20:30 +07:00
Sergey M․	c69701c6ab	[extractor/common] Improve _json_ld	2017-06-30 22:19:06 +07:00
Sergey M․	d4f8ce6e91	[dplayit] Relax video id regex (closes #13524 )	2017-06-30 21:55:45 +07:00
Sergey M․	b311b0ead2	[generic] Extract more generic metadata (closes #13527 )	2017-06-30 21:42:04 +07:00
Sergey M․	72d256c434	[bbccouk] Extend _VALID_URL	2017-06-29 22:29:28 +07:00
Sergey M․	b2ed954fc6	[bbccouk] Capture and output error message (closes #13518 )	2017-06-29 22:27:53 +07:00
Sergey M․	a919ca0ad6	[cbsnews] Actualize test	2017-06-28 22:30:12 +07:00
Parmjit Virk	88d6b7c2bd	[cbsnews] Relax video info regex (fixes #13284 )	2017-06-28 22:21:35 +07:00
Sergey M․	fd1c5fba6b	[facebook] Add test for plugin video embed (#13493 )	2017-06-27 22:38:59 +07:00
Sergey M․	0646e34c7d	[facebook] Add support for plugin video embeds and multiple embeds (closes #13493 )	2017-06-27 22:38:54 +07:00
Sergey M․	bf2dc9cc6e	[soundcloud] Fix tests	2017-06-27 21:26:46 +07:00
Viktor Szakats	f1c051009b	[soundcloud] Switch to https for API requests	2017-06-27 21:20:18 +07:00
Sergey M․	33ffb645a6	[pandatv] Switch to https for API and download URLs	2017-06-26 22:11:09 +07:00
Xuan Hu (Sean)	35544690e4	[pandatv] Add support for https URLs	2017-06-26 22:00:31 +07:00
Yen Chi Hsuan	136503e302	[ChangeLog] Update after #13494	2017-06-26 19:56:07 +08:00
Luca Steeb	4a87de72df	[niconico] fix sp subdomain links	2017-06-25 21:30:05 +02:00
Sergey M․	a7ce8f16c4	release 2017.06.25	2017-06-25 05:16:06 +07:00
Sergey M․	a5aea53fc8	[ChangeLog] Actualize	2017-06-25 05:13:12 +07:00
Sergey M․	0c7a631b61	[adobepass] Add support for ATTOTT MSO (DIRECTV NOW) (closes #13472 )	2017-06-25 05:03:17 +07:00
Sergey M․	fd9ee4de8c	[wsj] Add support for barrons.com (closes #13470 )	2017-06-25 02:15:35 +07:00
Argn0	5744cf6c03	[ign] Add another video id pattern (closes #13328 )	2017-06-25 01:59:15 +07:00
Sergey M․	9c48b5a193	[raiplay:live] Improve and add test (closes #13414 )	2017-06-25 01:49:27 +07:00
james	449c665776	[raiplay:live] Add extractor	2017-06-25 01:48:54 +07:00
Sergey M․	23aec3d623	[redbulltv] Restore hls format prefix	2017-06-25 01:10:31 +07:00
Sergey M․	27449ad894	[redbulltv] Add support for lives and segments (closes #13486 ))	2017-06-25 01:09:12 +07:00
Sergey M․	bd65f18153	[onetpl] Add support for videos embedded via pulsembed (closes #13482 )	2017-06-24 18:33:31 +07:00
Sergey M․	73af5cc817	[YoutubeDL] Skip malformed formats for better extraction robustness	2017-06-23 21:18:33 +07:00
Sergey M․	b5f523ed62	[ooyala] Add test for missing stream['url']['data']	2017-06-23 20:56:48 +07:00
Sergey M․	4f4dd8d797	[ooyala] Make more robust	2017-06-23 20:56:21 +07:00
Sergey M․	4cb18ab1b9	[ooyala] Skip empty format URLs (closes #13471 , closes #13476 )	2017-06-23 20:50:48 +07:00
Sergey M․	ac7409eec5	[hgtv.com:show] Fix typo	2017-06-23 02:54:12 +07:00
Sergey M․	170719414d	release 2017.06.23	2017-06-23 02:13:21 +07:00
Sergey M․	38dad4737f	[ChangeLog] Actualize	2017-06-23 02:10:54 +07:00
Sergey M․	ddbb4c5c3e	[youtube] Adapt to new automatic captions rendition (closes #13467 )	2017-06-23 02:00:19 +07:00
Sergey M․	fa3ea7223a	[hgtv.com:show] Relax video config regex and update test (closes #13279 , closes #13461 )	2017-06-23 00:42:42 +07:00
Parmjit Virk	0f4a5a73e7	[drtuber] Fix formats extraction (fixes 12058)	2017-06-23 00:08:36 +07:00
Sergey M․	18166bb8e8	[youporn] Fix upload date extraction	2017-06-22 00:47:02 +07:00
Sergey M․	d4893e764b	[youporn] Improve formats extraction	2017-06-22 00:40:15 +07:00
Sergey M․	97b6e30113	[youporn] Fix title extraction (closes #13456 )	2017-06-22 00:20:45 +07:00
Sergey M․	9be9ec5980	[googledrive] Fix formats' sorting (closes #13443 )	2017-06-20 22:58:33 +07:00
Giuseppe Fabiano	048b55804d	[watchindianporn] Fix extraction (closes #13411 )	2017-06-20 04:30:45 +07:00
Giuseppe Fabiano	6ce79d7ac0	[abcotvs] Fix test md5	2017-06-20 04:07:00 +07:00
Sergey M․	1641ca402d	[vimeo] Add fallback mp4 extension for original format	2017-06-20 01:27:59 +07:00
Sergey M․	85cbcede5b	[ruv] Improve, extract all formats and metadata (closes #13396 )	2017-06-19 23:46:03 +07:00
Orn	a1de83e5f0	[ruv] Add extractor	2017-06-19 23:45:45 +07:00
Sergey M․	fee00b3884	[viu] Fix extraction on older python 2.6	2017-06-19 22:57:37 +07:00
Sergey M․	2d2132ac6e	[adobepass] Fix extraction on older python 2.6	2017-06-19 22:54:53 +07:00
Yen Chi Hsuan	cc2ffe5afe	[pandora.tv] Fix upload_date extraction (closes #12846 )	2017-06-19 16:20:36 +08:00
Sergey M․	560050669b	[asiancrush] Add extractor (closes #13420 )	2017-06-18 20:18:51 +07:00
Sergey M․	eaa006d1bd	release 2017.06.18	2017-06-18 00:16:49 +07:00
Sergey M․	a6f29820c6	[ChangeLog] Actualize	2017-06-18 00:15:43 +07:00
Sergey M․	1433734c35	[downloader/common] Use utils.shell_quote for debug command line	2017-06-17 23:50:21 +07:00
Sergey M․	aefce8e6dc	[utils] Use compat_shlex_quote in shell_quote	2017-06-17 23:48:58 +07:00
Sergey M․	8b6ac49ecc	[postprocessor/execafterdownload] Encode command line (closes #13407 )	2017-06-17 23:16:53 +07:00
Sergey M․	b08e235f09	[compat] Fix compat_shlex_quote on Windows (closes #5889 , closes #10254 )	2017-06-17 23:14:24 +07:00
Sergey M․	be80986ed9	[postprocessor/metadatafromtitle] Fix missing optional meta fields (closes #13408 )	2017-06-17 19:05:10 +07:00
Yen Chi Hsuan	473e87064b	[devscripts/prepare_manpage] Fix deprecated escape sequence on py36	2017-06-17 17:37:25 +08:00
Yen Chi Hsuan	4f90d2aeac	[Makefile] Excluding __pycache__ correctly (#13400 )	2017-06-17 17:09:24 +08:00
Jakub Adam Wieczorek	b230fefc3c	[polskieradio] Fix extraction	2017-06-16 04:57:56 +07:00
Sergey M․	96a2daa1ee	[extractor/common] Improve jwplayer subtitles extraction	2017-06-15 23:40:39 +07:00
gfabiano	0ea6efbb7a	[xfileshare] Add support for fastvideo.me	2017-06-15 21:41:49 +07:00
Yen Chi Hsuan	6a9cb29509	[extractor/common] Fix json dumping with --geo-bypass The line "[debug] Using fake IP %s (%s) as X-Forwarded-For." was printed to stdout even with -j/-J, which breaks the resultant JSON.	2017-06-15 13:04:36 +08:00
Yen Chi Hsuan	ca27037171	[bilibili] Fix extraction of videos with double quotes in titles Closes #13387	2017-06-15 11:19:03 +08:00
gfabiano	0bf4b71b75	[4tube] Fix extraction (closes #13381 )	2017-06-15 04:16:50 +07:00
Marcin Cieślak	5215f45327	[disney] Add support for disneychannel.de	2017-06-15 04:13:04 +07:00
Sergey M․	0a268c6e11	[extractor/common] Improve jwplayer formats extraction (closes #13379 )	2017-06-14 22:02:15 +07:00
Sergey M․	7dd5415cd0	[npo] Improve _VALID_URL (closes #13376 )	2017-06-14 21:33:40 +07:00
Sergey M․	b5dc33daa9	[corus] Add support for showcase.ca	2017-06-13 23:27:27 +07:00
Sergey M․	97fa1f8dc4	[corus] Add support for history.ca (closes #13359 )	2017-06-13 23:16:21 +07:00
Sergey M․	b081f53b08	[compat] Add compat_HTMLParseError to __all__	2017-06-12 02:36:43 +07:00
Sergey M․	cb1e6d8985	release 2017.06.12	2017-06-12 02:23:17 +07:00
Sergey M․	9932ac5c58	[ChangeLog] Actualize	2017-06-12 02:01:15 +07:00
Sergey M․	bf87c36c93	[xfileshare] PEP 8	2017-06-12 02:01:12 +07:00
Sergey M․	b4a3d461e4	[utils] Handle HTMLParseError in extract_attributes (closes #13349 )	2017-06-12 01:52:24 +07:00
Sergey M․	72b409559c	[compat] Introduce compat_HTMLParseError	2017-06-12 01:50:32 +07:00
Sergey M․	534863e057	[xfileshare] Add support for rapidvideo (closes #13348 )	2017-06-12 00:16:47 +07:00
Sergey M․	16bc958287	[xfileshare] Modernize and pass referrer	2017-06-12 00:14:04 +07:00
Sergey M․	624bd0104c	[rutv] Add support for testplayer.vgtrk.com (closes #13347 )	2017-06-11 21:36:19 +07:00
Sergey M․	28a4d6cce8	[newgrounds] Extract more metadata (closes #13232 )	2017-06-11 21:30:06 +07:00
Sergey M․	2ae2ffda5e	[utils] Improve unified_timestamp	2017-06-11 21:27:22 +07:00
Sergey M․	70e7967202	[newgrounds:playlist] Add extractor (closes #10611 )	2017-06-11 20:50:55 +07:00
Sergey M․	6e999fbc12	[newgrounds] Improve formats and uploader extraction (closes #13346 )	2017-06-11 19:44:44 +07:00
Sergey M․	7409af9eb3	[msn] Fix formats extraction	2017-06-11 08:56:53 +07:00
Sergey M․	4e3637034c	[extractor/generic] Ensure format id is unicode string	2017-06-10 23:56:20 +07:00
Sergey M․	1afd0b0da7	[extractor/common] Return unicode string from _match_id	2017-06-09 00:40:03 +07:00
Sergey M․	7515830422	[turbo] Ensure format id is string	2017-06-09 00:31:56 +07:00
Sergey M․	f5521ea209	[sexu] Ensure height is int	2017-06-09 00:30:23 +07:00
Sergey M․	34646967ba	[jove] Ensure comment count is int	2017-06-09 00:29:20 +07:00
Sergey M․	e4d2e76d8e	[golem] Ensure format id is string	2017-06-09 00:27:11 +07:00
Sergey M․	87f5646937	[gfycat] Ensure filesize is int	2017-06-09 00:24:23 +07:00
Sergey M․	cc69a3de1b	[foxgay] Ensure height is int	2017-06-09 00:22:14 +07:00
Sergey M․	15aeeb1188	[flickr] Ensure format id is string	2017-06-09 00:20:07 +07:00
Sergey M․	1693bebe4d	[sohu] Fix numeric fields	2017-06-09 00:16:42 +07:00
Sergey M․	4244a13a1d	[safari] Improve authentication detection (closes #13319 )	2017-06-08 23:20:48 +07:00
Sergey M․	931adf8cc1	[liveleak] Ensure height is int (closes #13313 )	2017-06-08 22:54:30 +07:00
Sergey M․	c996943418	[YoutubeDL] Sanitize more fields (#13313 )	2017-06-08 22:53:14 +07:00
Sergey M․	76e6378358	[README.md] Improve man page formatting	2017-06-08 22:02:42 +07:00
Sergey M․	a355b57f58	[README.md] Clarify output template references (closes #13316 )	2017-06-08 21:52:19 +07:00
Sergey M․	1508da30c2	[streamango] Skip download for test (closes #13292 )	2017-06-07 21:53:40 +07:00
Luca Steeb	eb703e5380	[streamango] Make title optional	2017-06-07 21:53:33 +07:00
Sergey M․	0a3924e746	[rtlnl] Improve _VALID_URL (closes #13295 )	2017-06-06 21:21:44 +07:00
Sergey M․	e1db730d86	[tvplayer] Fix extraction (closes #13291 )	2017-06-06 00:13:57 +07:00