release 2017.06.12

[ChangeLog] Actualize
[xfileshare] PEP 8
2017-06-12 02:23:17 +07:00 · 2017-06-12 02:01:15 +07:00 · 2017-06-12 02:01:12 +07:00 · 2017-06-12 01:52:24 +07:00 · 2017-06-12 01:50:32 +07:00 · 2017-06-12 00:16:47 +07:00
50 changed files with 936 additions and 448 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.05.26*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.05.26**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.06.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.06.12**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2017.05.26
+[debug] youtube-dl version 2017.06.12
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/AUTHORS
+++ b/AUTHORS
@@ -217,3 +217,6 @@ Marvin Ewald
 Frédéric Bournival
 Timendum
 gritstub
+Adam Voss
+Mike Fährmann
+Jan Kundrát
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,72 @@
+version 2017.06.12
+
+Core
+* [utils] Handle compat_HTMLParseError in extract_attributes (#13349)
++ [compat] Introduce compat_HTMLParseError
+* [utils] Improve unified_timestamp
+* [extractor/generic] Ensure format id is unicode string
+* [extractor/common] Return unicode string from _match_id
++ [YoutubeDL] Sanitize more fields (#13313)
+
+Extractors
++ [xfileshare] Add support for rapidvideo.tv (#13348)
+* [xfileshare] Modernize and pass Referer
++ [rutv] Add support for testplayer.vgtrk.com (#13347)
++ [newgrounds] Extract more metadata (#13232)
++ [newgrounds:playlist] Add support for playlists (#10611)
+* [newgrounds] Improve formats and uploader extraction (#13346)
+* [msn] Fix formats extraction
+* [turbo] Ensure format id is string
+* [sexu] Ensure height is int
+* [jove] Ensure comment count is int
+* [golem] Ensure format id is string
+* [gfycat] Ensure filesize is int
+* [foxgay] Ensure height is int
+* [flickr] Ensure format id is string
+* [sohu] Fix numeric fields
+* [safari] Improve authentication detection (#13319)
+* [liveleak] Ensure height is int (#13313)
+* [streamango] Make title optional (#13292)
+* [rtlnl] Improve URL regular expression (#13295)
+* [tvplayer] Fix extraction (#13291)
+
+
+version 2017.06.05
+
+Core
+* [YoutubeDL] Don't emit ANSI escape codes on Windows (#13270)
+
+Extractors
++ [bandcamp:weekly] Add support for bandcamp weekly (#12758)
+* [pornhub:playlist] Fix extraction (#13281)
+- [godtv] Remove extractor (#13175)
+* [safari] Fix typo (#13252)
+* [youtube] Improve chapters extraction (#13247)
+* [1tv] Lower preference for HTTP formats (#13246)
+* [francetv] Relax URL regular expression
+* [drbonanza] Fix extraction (#13231)
+* [packtpub] Fix authentication (#13240)
+
+
+version 2017.05.29
+
+Extractors
+* [youtube] Fix DASH MPD extraction for videos with non-encrypted format URLs
+  (#13211)
+* [xhamster] Fix uploader and like/dislike count extraction (#13216)
++ [xhamster] Extract categories (#11728)
++ [abcnews] Add support for embed URLs (#12851)
+* [gaskrank] Fix extraction (#12493)
+* [medialaan] Fix videos with missing videoUrl (#12774)
+* [dvtv] Fix playlist support
++ [dvtv] Add support for DASH and HLS formats (#3063)
++ [beam:vod] Add support for beam.pro/mixer.com VODs (#13032)
+* [cbsinteractive] Relax URL regular expression (#13213)
+* [adn] Fix formats extraction
++ [youku] Extract more metadata (#10433)
+* [cbsnews] Fix extraction (#13205)
+
+
 version 2017.05.26

 Core
--- a/README.md
+++ b/README.md
@@ -145,18 +145,18 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
    --max-views COUNT                Do not download any videos with more than
                                     COUNT views
    --match-filter FILTER            Generic video filter. Specify any key (see
-                                     help for -o for a list of available keys)
-                                     to match if the key is present, !key to
-                                     check if the key is not present, key >
-                                     NUMBER (like "comment_count > 12", also
-                                     works with >=, <, <=, !=, =) to compare
-                                     against a number, key = 'LITERAL' (like
-                                     "uploader = 'Mike Smith'", also works with
-                                     !=) to match against a string literal and &
-                                     to require multiple matches. Values which
-                                     are not known are excluded unless you put a
-                                     question mark (?) after the operator. For
-                                     example, to only match videos that have
+                                     the "OUTPUT TEMPLATE" for a list of
+                                     available keys) to match if the key is
+                                     present, !key to check if the key is not
+                                     present, key > NUMBER (like "comment_count
+                                     > 12", also works with >=, <, <=, !=, =) to
+                                     compare against a number, key = 'LITERAL'
+                                     (like "uploader = 'Mike Smith'", also works
+                                     with !=) to match against a string literal
+                                     and & to require multiple matches. Values
+                                     which are not known are excluded unless you
+                                     put a question mark (?) after the operator.
+                                     For example, to only match videos that have
                                     been liked more than 100 times and disliked
                                     less than 50 times (or the dislike
                                     functionality is not available at the given
@@ -277,8 +277,8 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
    --get-filename                   Simulate, quiet but print output filename
    --get-format                     Simulate, quiet but print output format
    -j, --dump-json                  Simulate, quiet but print JSON information.
-                                     See --output for a description of available
-                                     keys.
+                                     See the "OUTPUT TEMPLATE" for a description
+                                     of available keys.
    -J, --dump-single-json           Simulate, quiet but print JSON information
                                     for each command-line argument. If the URL
                                     refers to a playlist, dump the whole
@@ -474,7 +474,10 @@ machine twitch login my_twitch_account_name password my_twitch_password
 ```
 To activate authentication with the `.netrc` file you should pass `--netrc` to youtube-dl or place it in the [configuration file](#configuration).

-On Windows you may also need to setup the `%HOME%` environment variable manually.
+On Windows you may also need to set up the `%HOME%` environment variable manually. For example:
+```
+set HOME=%USERPROFILE%
+```

 # OUTPUT TEMPLATE

@@ -532,13 +535,14 @@ The basic usage is not to set any template arguments when downloading a single f
 - `playlist_id` (string): Playlist identifier
 - `playlist_title` (string): Playlist title

-
 Available for the video that belongs to some logical chapter or section:
+
 - `chapter` (string): Name or title of the chapter the video belongs to
 - `chapter_number` (numeric): Number of the chapter the video belongs to
 - `chapter_id` (string): Id of the chapter the video belongs to

 Available for the video that is an episode of some series or programme:
+
 - `series` (string): Title of the series or programme the video episode belongs to
 - `season` (string): Title of the season the video episode belongs to
 - `season_number` (numeric): Number of the season the video episode belongs to
@@ -548,6 +552,7 @@ Available for the video that is an episode of some series or programme:
 - `episode_id` (string): Id of the video episode

 Available for the media that is a track or a part of a music album:
+
 - `track` (string): Title of the track
 - `track_number` (numeric): Number of the track within an album or a disc
 - `track_id` (string): Id of the track
@@ -649,7 +654,7 @@ Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begin
 - `acodec`: Name of the audio codec in use
 - `vcodec`: Name of the video codec in use
 - `container`: Name of the container format
- - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `m3u8`, or `m3u8_native`)
+ - `protocol`: The protocol that will be used for the actual download, lower-case (`http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `mms`, `f4m`, `ism`, `http_dash_segments`, `m3u8`, or `m3u8_native`)
 - `format_id`: A short description of the format

 Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by the video hoster.
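
The `--match-filter` syntax reworded in the README hunk above can be illustrated with a small evaluator. This is a minimal sketch under stated assumptions, not youtube-dl's own implementation: the names `match_part` and `matches` are hypothetical, only the forms named in the help text are handled (`key`, `!key`, numeric and quoted-string comparisons, `&`, and `?` after the operator), and string literals are assumed not to contain `&`.

```python
import operator
import re

# Comparison operators named in the help text.
_OPS = {
    '<=': operator.le,
    '>=': operator.ge,
    '!=': operator.ne,
    '<': operator.lt,
    '>': operator.gt,
    '=': operator.eq,
}

# Two-character operators come first in the alternation so that "<=" is
# not read as "<" followed by "=".
_COMPARISON = re.compile(
    r'^(?P<key>[a-z_]+)\s*'
    r'(?P<op><=|>=|!=|<|>|=)(?P<unknown_ok>\?)?\s*'
    r'(?P<value>.+)$')


def match_part(part, info):
    """Evaluate a single filter clause against a metadata dict."""
    part = part.strip()
    m = _COMPARISON.match(part)
    if m:
        actual = info.get(m.group('key'))
        if actual is None:
            # Unknown values are excluded unless "?" follows the operator.
            return bool(m.group('unknown_ok'))
        raw = m.group('value').strip()
        if len(raw) >= 2 and raw[0] == raw[-1] == "'":
            expected = raw[1:-1]      # quoted string literal
        else:
            expected = float(raw)     # numeric comparison
        return _OPS[m.group('op')](actual, expected)
    if part.startswith('!'):
        return info.get(part[1:]) is None   # "!key": key must be absent
    return info.get(part) is not None       # "key": key must be present


def matches(filter_str, info):
    """All clauses joined by "&" must match."""
    return all(match_part(p, info) for p in filter_str.split('&'))
```

With the example from the help text, `matches("like_count > 100 & dislike_count <? 50", {'like_count': 150})` is true because the missing `dislike_count` is tolerated by the `?`, while the same filter rejects `{'like_count': 150, 'dislike_count': 60}`.
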
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -87,13 +87,13 @@
 - **bambuser:channel**
 - **Bandcamp**
 - **Bandcamp:album**
+ - **Bandcamp:weekly**
 - **bangumi.bilibili.com**: BiliBili番剧
 - **bbc**: BBC
 - **bbc.co.uk**: BBC iPlayer
 - **bbc.co.uk:article**: BBC articles
 - **bbc.co.uk:iplayer:playlist**
 - **bbc.co.uk:playlist**
- - **Beam:live**
 - **Beatport**
 - **Beeg**
 - **BehindKink**
@@ -311,7 +311,6 @@
 - **Go**
 - **Go90**
 - **GodTube**
- - **GodTV**
 - **Golem**
 - **GoogleDrive**
 - **Goshgay**
@@ -453,6 +452,8 @@
 - **mixcloud:playlist**
 - **mixcloud:stream**
 - **mixcloud:user**
+ - **Mixer:live**
+ - **Mixer:vod**
 - **MLB**
 - **Mnet**
 - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
@@ -511,6 +512,7 @@
 - **netease:song**: 网易云音乐
 - **Netzkino**
 - **Newgrounds**
+ - **NewgroundsPlaylist**
 - **Newstube**
 - **NextMedia**: 蘋果日報
 - **NextMediaActionNews**: 蘋果日報 - 動新聞
@@ -973,7 +975,7 @@
 - **WSJArticle**
 - **XBef**
 - **XboxClips**
- - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo
+ - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV
 - **XHamster**
 - **XHamsterEmbed**
 - **xiami:album**: 虾米音乐 - 专辑
--- a/test/test_utils.py
+++ b/test/test_utils.py
@@ -340,6 +340,7 @@ class TestUtil(unittest.TestCase):
        self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
        self.assertEqual(unified_timestamp('Feb 7, 2016 at 6:35 pm'), 1454870100)
        self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
+        self.assertEqual(unified_timestamp('Sep 11, 2013 | 5:49 AM'), 1378878540)

    def test_determine_ext(self):
        self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -915,6 +916,8 @@ class TestUtil(unittest.TestCase):
            supports_outside_bmp = False
        if supports_outside_bmp:
            self.assertEqual(extract_attributes('<e x="Smile &#128512;!">'), {'x': 'Smile \U0001f600!'})
+        # Malformed HTML should not break attributes extraction on older Python
+        self.assertEqual(extract_attributes('<mal"formed/>'), {})

    def test_clean_html(self):
        self.assertEqual(clean_html('a:\nb'), 'a: b')
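
The new `unified_timestamp` test case above uses a `|` between the date and the time. As a rough illustration of what that case expects, the sketch below drops the separator and parses the remainder as UTC. It is an assumption-laden stand-in, not the `unified_timestamp` code in `youtube_dl/utils.py`: the function name `parse_us_timestamp` is hypothetical, it supports a single date format, and `%b` relies on English month names in the active locale.

```python
import calendar
import re
from datetime import datetime


def parse_us_timestamp(text):
    """Parse strings like 'Sep 11, 2013 | 5:49 AM' into a UTC epoch."""
    # Replace the "|" separator (and surrounding spaces) with one space.
    cleaned = re.sub(r'\s*\|\s*', ' ', text)
    parsed = datetime.strptime(cleaned, '%b %d, %Y %I:%M %p')
    # timegm() interprets the struct_time as UTC, unlike time.mktime().
    return calendar.timegm(parsed.timetuple())


print(parse_us_timestamp('Sep 11, 2013 | 5:49 AM'))  # 1378878540
```
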
--- a/test/test_youtube_chapters.py
+++ b/test/test_youtube_chapters.py
@@ -254,6 +254,13 @@ class TestYoutubeChapters(unittest.TestCase):
                'title': '3 - Из серпов луны...[Iz serpov luny]',
            }]
        ),
+        (
+            # https://www.youtube.com/watch?v=xZW70zEasOk
+            # time point more than duration
+            '''● LCS Spring finals: Saturday and Sunday from <a href="#" onclick="yt.www.watch.player.seekTo(13*60+30);return false;">13:30</a> outside the venue! <br />● PAX East: Fri, Sat & Sun - more info in tomorrows video on the main channel!''',
+            283,
+            []
+        ),
    ]

    def test_youtube_chapters(self):
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@@ -58,6 +58,7 @@ from .utils import (
    format_bytes,
    formatSeconds,
    GeoRestrictedError,
+    int_or_none,
    ISO3166Utils,
    locked_file,
    make_HTTPS_handler,
@@ -302,6 +303,17 @@ class YoutubeDL(object):
                        postprocessor.
    """

+    _NUMERIC_FIELDS = set((
+        'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
+        'timestamp', 'upload_year', 'upload_month', 'upload_day',
+        'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
+        'average_rating', 'comment_count', 'age_limit',
+        'start_time', 'end_time',
+        'chapter_number', 'season_number', 'episode_number',
+        'track_number', 'disc_number', 'release_year',
+        'playlist_index',
+    ))
+
    params = None
    _ies = []
    _pps = []
@@ -498,24 +510,25 @@ class YoutubeDL(object):
    def to_console_title(self, message):
        if not self.params.get('consoletitle', False):
            return
-        if compat_os_name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
-            # c_wchar_p() might not be necessary if `message` is
-            # already of type unicode()
-            ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
+        if compat_os_name == 'nt':
+            if ctypes.windll.kernel32.GetConsoleWindow():
+                # c_wchar_p() might not be necessary if `message` is
+                # already of type unicode()
+                ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
        elif 'TERM' in os.environ:
            self._write_string('\033]0;%s\007' % message, self._screen_file)

    def save_console_title(self):
        if not self.params.get('consoletitle', False):
            return
-        if 'TERM' in os.environ:
+        if compat_os_name != 'nt' and 'TERM' in os.environ:
            # Save the title on stack
            self._write_string('\033[22;0t', self._screen_file)

    def restore_console_title(self):
        if not self.params.get('consoletitle', False):
            return
-        if 'TERM' in os.environ:
+        if compat_os_name != 'nt' and 'TERM' in os.environ:
            # Restore the title from stack
            self._write_string('\033[23;0t', self._screen_file)

@@ -638,22 +651,11 @@ class YoutubeDL(object):
                    r'%%(\1)0%dd' % field_size_compat_map[mobj.group('field')],
                    outtmpl)

-            NUMERIC_FIELDS = set((
-                'width', 'height', 'tbr', 'abr', 'asr', 'vbr', 'fps', 'filesize', 'filesize_approx',
-                'timestamp', 'upload_year', 'upload_month', 'upload_day',
-                'duration', 'view_count', 'like_count', 'dislike_count', 'repost_count',
-                'average_rating', 'comment_count', 'age_limit',
-                'start_time', 'end_time',
-                'chapter_number', 'season_number', 'episode_number',
-                'track_number', 'disc_number', 'release_year',
-                'playlist_index',
-            ))
-
            # Missing numeric fields used together with integer presentation types
            # in format specification will break the argument substitution since
            # string 'NA' is returned for missing fields. We will patch output
            # template for missing fields to meet string presentation type.
-            for numeric_field in NUMERIC_FIELDS:
+            for numeric_field in self._NUMERIC_FIELDS:
                if numeric_field not in template_dict:
                    # As of [1] format syntax is:
                    #  %[mapping_key][conversion_flags][minimum_width][.precision][length_modifier]type
@@ -1344,9 +1346,28 @@ class YoutubeDL(object):
        if 'title' not in info_dict:
            raise ExtractorError('Missing "title" field in extractor result')

-        if not isinstance(info_dict['id'], compat_str):
-            self.report_warning('"id" field is not a string - forcing string conversion')
-            info_dict['id'] = compat_str(info_dict['id'])
+        def report_force_conversion(field, field_not, conversion):
+            self.report_warning(
+                '"%s" field is not %s - forcing %s conversion, there is an error in extractor'
+                % (field, field_not, conversion))
+
+        def sanitize_string_field(info, string_field):
+            field = info.get(string_field)
+            if field is None or isinstance(field, compat_str):
+                return
+            report_force_conversion(string_field, 'a string', 'string')
+            info[string_field] = compat_str(field)
+
+        def sanitize_numeric_fields(info):
+            for numeric_field in self._NUMERIC_FIELDS:
+                field = info.get(numeric_field)
+                if field is None or isinstance(field, compat_numeric_types):
+                    continue
+                report_force_conversion(numeric_field, 'numeric', 'int')
+                info[numeric_field] = int_or_none(field)
+
+        sanitize_string_field(info_dict, 'id')
+        sanitize_numeric_fields(info_dict)

        if 'playlist' not in info_dict:
            # It isn't part of a playlist
@@ -1434,6 +1455,8 @@ class YoutubeDL(object):
            if 'url' not in format:
                raise ExtractorError('Missing "url" key in result (index %d)' % i)

+            sanitize_string_field(format, 'format_id')
+            sanitize_numeric_fields(format)
            format['url'] = sanitize_url(format['url'])

            if format.get('format_id') is None:
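
The sanitizing step added to `YoutubeDL.py` above can be tried in isolation. The following is a self-contained sketch rather than the project's code: `NUMERIC_FIELDS` is a shortened, illustrative subset of `_NUMERIC_FIELDS`, `int_or_none` is a simplified stand-in for the helper imported from `utils`, and the `report` callback is a hypothetical replacement for `report_warning`.

```python
# Illustrative subset of the numeric field names listed in the diff.
NUMERIC_FIELDS = {'width', 'height', 'duration', 'view_count'}


def int_or_none(value):
    """Simplified stand-in: convert to int, or return None on failure."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def sanitize_numeric_fields(info, report=print):
    """Coerce non-numeric values of known numeric fields in place."""
    for name in NUMERIC_FIELDS:
        value = info.get(name)
        # Missing fields and values that are already numeric are left alone.
        if value is None or isinstance(value, (int, float)):
            continue
        report('"%s" field is not numeric - forcing int conversion' % name)
        info[name] = int_or_none(value)
    return info


info = {'height': '720', 'duration': 12.5, 'width': 'wide'}
sanitize_numeric_fields(info)
# height becomes 720, duration stays 12.5, width becomes None
```

A value that cannot be converted ends up as `None` instead of raising, which matches the intent stated in the warning text: the extractor is at fault, but the download should still proceed.
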
--- a/youtube_dl/compat.py
+++ b/youtube_dl/compat.py
@@ -2322,6 +2322,19 @@ try:
 except ImportError:  # Python 2
    from HTMLParser import HTMLParser as compat_HTMLParser

+try:  # Python 2
+    from HTMLParser import HTMLParseError as compat_HTMLParseError
+except ImportError:  # Python 3
+    try:
+        from html.parser import HTMLParseError as compat_HTMLParseError
+    except ImportError:  # Python >3.4
+
+        # HTMLParseError was deprecated in Python 3.3 and removed in
+        # Python 3.5. Introduce a dummy exception for Python >3.4 to keep
+        # exception handling compatible and uniform across versions
+        class compat_HTMLParseError(Exception):
+            pass
+
 try:
    from subprocess import DEVNULL
    compat_subprocess_get_DEVNULL = lambda: DEVNULL
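
The ChangeLog entry for this release says `extract_attributes` now handles `compat_HTMLParseError`; that `utils.py` change is not part of the hunks shown here. The sketch below shows how such a fallback exception is typically consumed. It assumes Python 3 only, and `_AttributeCollector` and this `extract_attributes` are simplified, hypothetical stand-ins, not the real helpers.

```python
from html.parser import HTMLParser

try:
    # Present on Python 3.0 to 3.4 only.
    from html.parser import HTMLParseError as compat_HTMLParseError
except ImportError:
    # Dummy class so that "except compat_HTMLParseError" is always valid.
    class compat_HTMLParseError(Exception):
        pass


class _AttributeCollector(HTMLParser):
    """Remember the attributes of the most recent start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs = dict(attrs)


def extract_attributes(html_element):
    """Return the attributes of a single HTML element as a dict."""
    parser = _AttributeCollector()
    try:
        parser.feed(html_element)
        parser.close()
    except compat_HTMLParseError:
        # Older parsers may reject malformed markup; return what was seen.
        pass
    return parser.attrs


print(extract_attributes('<e x="y">'))  # {'x': 'y'}
```
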
--- a/youtube_dl/extractor/abcnews.py
+++ b/youtube_dl/extractor/abcnews.py
@@ -12,7 +12,15 @@ from ..compat import compat_urlparse

 class AbcNewsVideoIE(AMPIE):
    IE_NAME = 'abcnews:video'
-    _VALID_URL = r'https?://abcnews\.go\.com/[^/]+/video/(?P<display_id>[0-9a-z-]+)-(?P<id>\d+)'
+    _VALID_URL = r'''(?x)
+                    https?://
+                        abcnews\.go\.com/
+                        (?:
+                            [^/]+/video/(?P<display_id>[0-9a-z-]+)-|
+                            video/embed\?.*?\bid=
+                        )
+                        (?P<id>\d+)
+                    '''

    _TESTS = [{
        'url': 'http://abcnews.go.com/ThisWeek/video/week-exclusive-irans-foreign-minister-zarif-20411932',
@@ -29,6 +37,9 @@ class AbcNewsVideoIE(AMPIE):
            # m3u8 download
            'skip_download': True,
        },
+    }, {
+        'url': 'http://abcnews.go.com/video/embed?id=46979033',
+        'only_matching': True,
    }, {
        'url': 'http://abcnews.go.com/2020/video/2020-husband-stands-teacher-jail-student-affairs-26119478',
        'only_matching': True,
--- a/youtube_dl/extractor/adn.py
+++ b/youtube_dl/extractor/adn.py
@@ -15,6 +15,7 @@ from ..utils import (
    intlist_to_bytes,
    srt_subtitles_timecode,
    strip_or_none,
+    urljoin,
 )


@@ -31,25 +32,28 @@ class ADNIE(InfoExtractor):
            'description': 'md5:2f7b5aa76edbc1a7a92cedcda8a528d5',
        }
    }
+    _BASE_URL = 'http://animedigitalnetwork.fr'

    def _get_subtitles(self, sub_path, video_id):
        if not sub_path:
            return None

        enc_subtitles = self._download_webpage(
-            'http://animedigitalnetwork.fr/' + sub_path,
-            video_id, fatal=False)
+            urljoin(self._BASE_URL, sub_path),
+            video_id, fatal=False, headers={
+                'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:53.0) Gecko/20100101 Firefox/53.0',
+            })
        if not enc_subtitles:
            return None

        # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
        dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
            bytes_to_intlist(base64.b64decode(enc_subtitles[24:])),
-            bytes_to_intlist(b'\nd\xaf\xd2J\xd0\xfc\xe1\xfc\xdf\xb61\xe8\xe1\xf0\xcc'),
+            bytes_to_intlist(b'\x1b\xe0\x29\x61\x38\x94\x24\x00\x12\xbd\xc5\x80\xac\xce\xbe\xb0'),
            bytes_to_intlist(base64.b64decode(enc_subtitles[:24]))
        ))
        subtitles_json = self._parse_json(
-            dec_subtitles[:-compat_ord(dec_subtitles[-1])],
+            dec_subtitles[:-compat_ord(dec_subtitles[-1])].decode(),
            None, fatal=False)
        if not subtitles_json:
            return None
@@ -103,9 +107,16 @@ class ADNIE(InfoExtractor):
        metas = options.get('metas') or {}
        title = metas.get('title') or video_info['title']
        links = player_config.get('links') or {}
+        if not links:
+            links_url = player_config['linksurl']
+            links_data = self._download_json(urljoin(
+                self._BASE_URL, links_url), video_id)
+            links = links_data.get('links') or {}

        formats = []
        for format_id, qualities in links.items():
+            if not isinstance(qualities, dict):
+                continue
            for load_balancer_url in qualities.values():
                load_balancer_data = self._download_json(
                    load_balancer_url, video_id, fatal=False) or {}
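
The slice `dec_subtitles[:-compat_ord(dec_subtitles[-1])]` in `_get_subtitles` above removes block-cipher padding whose final byte encodes the padding length, and the added `.decode()` turns the remaining bytes into text before JSON parsing. A minimal Python 3 sketch of that step follows. The helper name `strip_padding` and the sample payload are hypothetical, and no decryption is performed here.

```python
import json


def strip_padding(decrypted):
    """Drop trailing padding whose last byte gives the padding length."""
    pad_len = decrypted[-1]  # indexing bytes yields an int on Python 3
    return decrypted[:-pad_len]


# Hypothetical 12-byte payload padded to a 16-byte block with four 0x04 bytes.
padded = b'{"ok": true}' + bytes([4]) * 4
subtitles_json = json.loads(strip_padding(padded).decode())
print(subtitles_json)  # {'ok': True}
```
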
--- a/youtube_dl/extractor/bandcamp.py
+++ b/youtube_dl/extractor/bandcamp.py
@@ -14,14 +14,16 @@ from ..utils import (
    ExtractorError,
    float_or_none,
    int_or_none,
+    KNOWN_EXTENSIONS,
    parse_filesize,
    unescapeHTML,
    update_url_query,
+    unified_strdate,
 )


 class BandcampIE(InfoExtractor):
-    _VALID_URL = r'https?://.*?\.bandcamp\.com/track/(?P<title>.*)'
+    _VALID_URL = r'https?://.*?\.bandcamp\.com/track/(?P<title>[^/?#&]+)'
    _TESTS = [{
        'url': 'http://youtube-dl.bandcamp.com/track/youtube-dl-test-song',
        'md5': 'c557841d5e50261777a6585648adf439',
@@ -155,7 +157,7 @@ class BandcampIE(InfoExtractor):

 class BandcampAlbumIE(InfoExtractor):
    IE_NAME = 'Bandcamp:album'
-    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^?#]+)|/?(?:$|[?#]))'
+    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<album_id>[^/?#&]+))?'

    _TESTS = [{
        'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -222,6 +224,12 @@ class BandcampAlbumIE(InfoExtractor):
        'playlist_count': 2,
    }]

+    @classmethod
+    def suitable(cls, url):
+        return (False
+                if BandcampWeeklyIE.suitable(url) or BandcampIE.suitable(url)
+                else super(BandcampAlbumIE, cls).suitable(url))
+
    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        uploader_id = mobj.group('subdomain')
@@ -250,3 +258,92 @@ class BandcampAlbumIE(InfoExtractor):
            'title': title,
            'entries': entries,
        }
+
+
+class BandcampWeeklyIE(InfoExtractor):
+    IE_NAME = 'Bandcamp:weekly'
+    _VALID_URL = r'https?://(?:www\.)?bandcamp\.com/?\?(?:.*?&)?show=(?P<id>\d+)'
+    _TESTS = [{
+        'url': 'https://bandcamp.com/?show=224',
+        'md5': 'b00df799c733cf7e0c567ed187dea0fd',
+        'info_dict': {
+            'id': '224',
+            'ext': 'opus',
+            'title': 'BC Weekly April 4th 2017 - Magic Moments',
+            'description': 'md5:5d48150916e8e02d030623a48512c874',
+            'duration': 5829.77,
+            'release_date': '20170404',
+            'series': 'Bandcamp Weekly',
+            'episode': 'Magic Moments',
+            'episode_number': 208,
+            'episode_id': '224',
+        }
+    }, {
+        'url': 'https://bandcamp.com/?blah/blah@&show=228',
+        'only_matching': True
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        blob = self._parse_json(
+            self._search_regex(
+                r'data-blob=(["\'])(?P<blob>{.+?})\1', webpage,
+                'blob', group='blob'),
+            video_id, transform_source=unescapeHTML)
+
+        show = blob['bcw_show']
+
+        # This is desired because any invalid show id redirects to `bandcamp.com`
+        # which happens to expose the latest Bandcamp Weekly episode.
+        show_id = int_or_none(show.get('show_id')) or int_or_none(video_id)
+
+        formats = []
+        for format_id, format_url in show['audio_stream'].items():
+            if not isinstance(format_url, compat_str):
+                continue
+            for known_ext in KNOWN_EXTENSIONS:
+                if known_ext in format_id:
+                    ext = known_ext
+                    break
+            else:
+                ext = None
+            formats.append({
+                'format_id': format_id,
+                'url': format_url,
+                'ext': ext,
+                'vcodec': 'none',
+            })
+        self._sort_formats(formats)
+
+        title = show.get('audio_title') or 'Bandcamp Weekly'
+        subtitle = show.get('subtitle')
+        if subtitle:
+            title += ' - %s' % subtitle
+
+        episode_number = None
+        seq = blob.get('bcw_seq')
+
+        if seq and isinstance(seq, list):
+            try:
+                episode_number = next(
+                    int_or_none(e.get('episode_number'))
+                    for e in seq
+                    if isinstance(e, dict) and int_or_none(e.get('id')) == show_id)
+            except StopIteration:
+                pass
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': show.get('desc') or show.get('short_desc'),
+            'duration': float_or_none(show.get('audio_duration')),
+            'is_live': False,
+            'release_date': unified_strdate(show.get('published_date')),
+            'series': 'Bandcamp Weekly',
+            'episode': show.get('subtitle'),
+            'episode_number': episode_number,
+            'episode_id': compat_str(video_id),
+            'formats': formats
+        }
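
`BandcampWeeklyIE` above derives each format's extension from its format id with a `for`/`else` loop, where the `else` branch runs only if no `break` occurred. The sketch below isolates that idiom. `KNOWN_EXTENSIONS` here is a short hypothetical tuple, not the list from `utils`, and the sample format ids are made up for illustration.

```python
# Hypothetical subset; the real KNOWN_EXTENSIONS lives in youtube_dl/utils.py.
KNOWN_EXTENSIONS = ('mp3', 'opus', 'm4a', 'flac')


def guess_ext(format_id):
    """Return the first known extension contained in format_id, else None."""
    for known_ext in KNOWN_EXTENSIONS:
        if known_ext in format_id:
            ext = known_ext
            break
    else:
        # Reached only when the loop finished without hitting "break".
        ext = None
    return ext


print(guess_ext('opus-lo'))  # opus
print(guess_ext('mp3-128'))  # mp3
print(guess_ext('unknown'))  # None
```
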
--- a/youtube_dl/extractor/beampro.py
+++ b/youtube_dl/extractor/beampro.py
@@ -6,18 +6,33 @@ from ..utils import (
    ExtractorError,
    clean_html,
    compat_str,
+    float_or_none,
    int_or_none,
    parse_iso8601,
    try_get,
+    urljoin,
 )


-class BeamProLiveIE(InfoExtractor):
-    IE_NAME = 'Beam:live'
-    _VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
+class BeamProBaseIE(InfoExtractor):
+    _API_BASE = 'https://mixer.com/api/v1'
    _RATINGS = {'family': 0, 'teen': 13, '18+': 18}
+
+    def _extract_channel_info(self, chan):
+        user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
+        return {
+            'uploader': chan.get('token') or try_get(
+                chan, lambda x: x['user']['username'], compat_str),
+            'uploader_id': compat_str(user_id) if user_id else None,
+            'age_limit': self._RATINGS.get(chan.get('audience')),
+        }
+
+
+class BeamProLiveIE(BeamProBaseIE):
+    IE_NAME = 'Mixer:live'
+    _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/(?P<id>[^/?#&]+)'
    _TEST = {
-        'url': 'http://www.beam.pro/niterhayven',
+        'url': 'http://mixer.com/niterhayven',
        'info_dict': {
            'id': '261562',
            'ext': 'mp4',
@@ -38,11 +53,17 @@ class BeamProLiveIE(InfoExtractor):
        },
    }

+    _MANIFEST_URL_TEMPLATE = '%s/channels/%%s/manifest.%%s' % BeamProBaseIE._API_BASE
+
+    @classmethod
+    def suitable(cls, url):
+        return False if BeamProVodIE.suitable(url) else super(BeamProLiveIE, cls).suitable(url)
+
    def _real_extract(self, url):
        channel_name = self._match_id(url)

        chan = self._download_json(
-            'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name)
+            '%s/channels/%s' % (self._API_BASE, channel_name), channel_name)

        if chan.get('online') is False:
            raise ExtractorError(
@@ -50,24 +71,118 @@ class BeamProLiveIE(InfoExtractor):

        channel_id = chan['id']

+        def manifest_url(kind):
+            return self._MANIFEST_URL_TEMPLATE % (channel_id, kind)
+
        formats = self._extract_m3u8_formats(
-            'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
-            channel_name, ext='mp4', m3u8_id='hls', fatal=False)
+            manifest_url('m3u8'), channel_name, ext='mp4', m3u8_id='hls',
+            fatal=False)
+        formats.extend(self._extract_smil_formats(
+            manifest_url('smil'), channel_name, fatal=False))
        self._sort_formats(formats)

-        user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
-
-        return {
+        info = {
            'id': compat_str(chan.get('id') or channel_name),
            'title': self._live_title(chan.get('name') or channel_name),
            'description': clean_html(chan.get('description')),
-            'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str),
+            'thumbnail': try_get(
+                chan, lambda x: x['thumbnail']['url'], compat_str),
            'timestamp': parse_iso8601(chan.get('updatedAt')),
-            'uploader': chan.get('token') or try_get(
-                chan, lambda x: x['user']['username'], compat_str),
-            'uploader_id': compat_str(user_id) if user_id else None,
-            'age_limit': self._RATINGS.get(chan.get('audience')),
            'is_live': True,
            'view_count': int_or_none(chan.get('viewersTotal')),
            'formats': formats,
        }
+        info.update(self._extract_channel_info(chan))
+
+        return info
+
+
+class BeamProVodIE(BeamProBaseIE):
+    IE_NAME = 'Mixer:vod'
+    _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>\d+)'
+    _TEST = {
+        'url': 'https://mixer.com/willow8714?vod=2259830',
+        'md5': 'b2431e6e8347dc92ebafb565d368b76b',
+        'info_dict': {
+            'id': '2259830',
+            'ext': 'mp4',
+            'title': 'willow8714\'s Channel',
+            'duration': 6828.15,
+            'thumbnail': r're:https://.*source\.png$',
+            'timestamp': 1494046474,
+            'upload_date': '20170506',
+            'uploader': 'willow8714',
+            'uploader_id': '6085379',
+            'age_limit': 13,
+            'view_count': int,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }
+
+    @staticmethod
+    def _extract_format(vod, vod_type):
+        if not vod.get('baseUrl'):
+            return []
+
+        if vod_type == 'hls':
+            filename, protocol = 'manifest.m3u8', 'm3u8_native'
+        elif vod_type == 'raw':
+            filename, protocol = 'source.mp4', 'https'
+        else:
+            assert False
+
+        data = vod.get('data') if isinstance(vod.get('data'), dict) else {}
+
+        format_id = [vod_type]
+        if isinstance(data.get('Height'), compat_str):
+            format_id.append('%sp' % data['Height'])
+
+        return [{
+            'url': urljoin(vod['baseUrl'], filename),
+            'format_id': '-'.join(format_id),
+            'ext': 'mp4',
+            'protocol': protocol,
+            'width': int_or_none(data.get('Width')),
+            'height': int_or_none(data.get('Height')),
+            'fps': int_or_none(data.get('Fps')),
+            'tbr': int_or_none(data.get('Bitrate'), 1000),
+        }]
+
+    def _real_extract(self, url):
+        vod_id = self._match_id(url)
+
+        vod_info = self._download_json(
+            '%s/recordings/%s' % (self._API_BASE, vod_id), vod_id)
+
+        state = vod_info.get('state')
+        if state != 'AVAILABLE':
+            raise ExtractorError(
+                'VOD %s is not available (state: %s)' % (vod_id, state),
+                expected=True)
+
+        formats = []
+        thumbnail_url = None
+
+        for vod in vod_info['vods']:
+            vod_type = vod.get('format')
+            if vod_type in ('hls', 'raw'):
+                formats.extend(self._extract_format(vod, vod_type))
+            elif vod_type == 'thumbnail':
+                thumbnail_url = urljoin(vod.get('baseUrl'), 'source.png')
+
+        self._sort_formats(formats)
+
+        info = {
+            'id': vod_id,
+            'title': vod_info.get('name') or vod_id,
+            'duration': float_or_none(vod_info.get('duration')),
+            'thumbnail': thumbnail_url,
+            'timestamp': parse_iso8601(vod_info.get('createdAt')),
+            'view_count': int_or_none(vod_info.get('viewsTotal')),
+            'formats': formats,
+        }
+        info.update(self._extract_channel_info(vod_info.get('channel') or {}))
+
+        return info
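Review note on `BeamProVodIE._extract_format`: the sketch below reproduces the way one entry of `vods` becomes a format dict, so the URL joining and the `tbr` scaling can be checked outside the extractor. It is a minimal sketch under stated assumptions, not the extractor's code: `int_or_none` is a simplified stand-in for the youtube-dl helper, `VOD_FILES` and `build_format` are hypothetical names, the sample payload is invented, and unknown formats are skipped through a table lookup where the diff uses `assert False`.

```python
from urllib.parse import urljoin


def int_or_none(value, scale=1):
    """Simplified stand-in (assumption): int(value) // scale, or None."""
    try:
        return int(value) // scale
    except (TypeError, ValueError):
        return None


# Maps the API's "format" value to (file name under baseUrl, protocol).
VOD_FILES = {
    'hls': ('manifest.m3u8', 'm3u8_native'),
    'raw': ('source.mp4', 'https'),
}


def build_format(vod):
    """Return a list holding zero or one format dict for a single VOD entry."""
    base_url = vod.get('baseUrl')
    vod_type = vod.get('format')
    if not base_url or vod_type not in VOD_FILES:
        return []
    filename, protocol = VOD_FILES[vod_type]
    data = vod.get('data') if isinstance(vod.get('data'), dict) else {}
    format_id = [vod_type]
    if isinstance(data.get('Height'), str):
        format_id.append('%sp' % data['Height'])
    return [{
        'url': urljoin(base_url, filename),
        'format_id': '-'.join(format_id),
        'ext': 'mp4',
        'protocol': protocol,
        'height': int_or_none(data.get('Height')),
        # Bitrate is assumed to arrive in bit/s; tbr is expressed in kbit/s.
        'tbr': int_or_none(data.get('Bitrate'), 1000),
    }]


# Invented sample entry, shaped like the fields the diff reads.
sample = {
    'format': 'hls',
    'baseUrl': 'https://example.invalid/vods/1/',
    'data': {'Height': '720', 'Bitrate': '2500000'},
}
formats = build_format(sample)
```

Note that `urljoin` only appends the file name when `baseUrl` ends with a slash; without it, the last path segment is replaced.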
--- a/youtube_dl/extractor/cbsinteractive.py
+++ b/youtube_dl/extractor/cbsinteractive.py
@@ -8,7 +8,7 @@ from ..utils import int_or_none


 class CBSInteractiveIE(CBSIE):
-    _VALID_URL = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video/share)/(?P<id>[^/?]+)'
+    _VALID_URL = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video(?:/share)?)/(?P<id>[^/?]+)'
    _TESTS = [{
        'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
        'info_dict': {
@@ -60,6 +60,9 @@ class CBSInteractiveIE(CBSIE):
            # m3u8 download
            'skip_download': True,
        },
+    }, {
+        'url': 'http://www.zdnet.com/video/huawei-matebook-x-video/',
+        'only_matching': True,
    }]

    MPX_ACCOUNTS = {
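Review note on the `_VALID_URL` change: the quick check below compares the old and new patterns against the `only_matching` URL added in this hunk, using only the standard `re` module. The two patterns are copied from the diff; the `video/share` URL is a made-up example to confirm the old form is still accepted.

```python
import re

OLD = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video/share)/(?P<id>[^/?]+)'
NEW = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video(?:/share)?)/(?P<id>[^/?]+)'

# URL taken from the new only_matching test case.
plain_url = 'http://www.zdnet.com/video/huawei-matebook-x-video/'
# Hypothetical URL in the older /video/share/ layout.
share_url = 'http://www.zdnet.com/video/share/some-clip/'

old_match = re.match(OLD, plain_url)
new_match = re.match(NEW, plain_url)
share_match = re.match(NEW, share_url)
```

Making `/share` optional keeps a single alternation branch for both layouts instead of adding a third alternative.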
--- a/youtube_dl/extractor/cbsnews.py
+++ b/youtube_dl/extractor/cbsnews.py
@@ -61,11 +61,17 @@ class CBSNewsIE(CBSIE):

        video_info = self._parse_json(self._html_search_regex(
            r'(?:<ul class="media-list items" id="media-related-items"><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
-            webpage, 'video JSON info'), video_id)
+            webpage, 'video JSON info', default='{}'), video_id, fatal=False)

-        item = video_info['item'] if 'item' in video_info else video_info
-        guid = item['mpxRefId']
-        return self._extract_video_info(guid, 'cbsnews')
+        if video_info:
+            item = video_info['item'] if 'item' in video_info else video_info
+        else:
+            state = self._parse_json(self._search_regex(
+                r'data-cbsvideoui-options=(["\'])(?P<json>{.+?})\1', webpage,
+                'playlist JSON info', group='json'), video_id)['state']
+            item = state['playlist'][state['pid']]
+
+        return self._extract_video_info(item['mpxRefId'], 'cbsnews')


 class CBSNewsLiveVideoIE(InfoExtractor):
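Review note on the new fallback in `CBSNewsIE`: when the primary `data-video-info` JSON is missing, the hunk reads `data-cbsvideoui-options`, then picks the playlist entry keyed by `state['pid']`. The sketch below walks through that lookup with the same regular expression and plain `json.loads`. The HTML snippet and its values are invented for illustration; real pages may differ, for instance by HTML-escaping the attribute value.

```python
import json
import re

# Invented page fragment with the attribute the fallback looks for.
html = ('<div data-cbsvideoui-options=\'{"state": {"pid": "p1", '
        '"playlist": {"p1": {"mpxRefId": "abc123"}}}}\'></div>')

# Group 1 captures the opening quote; \1 requires the same quote to close
# the attribute, so the lazy {.+?} stops at the brace right before it.
m = re.search(r'data-cbsvideoui-options=(["\'])(?P<json>{.+?})\1', html)
state = json.loads(m.group('json'))['state']

# The currently playing item is the playlist entry whose key equals pid.
item = state['playlist'][state['pid']]
guid = item['mpxRefId']
```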
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -376,7 +376,7 @@ class InfoExtractor(object):
            cls._VALID_URL_RE = re.compile(cls._VALID_URL)
        m = cls._VALID_URL_RE.match(url)
        assert m
-        return m.group('id')
+        return compat_str(m.group('id'))

    @classmethod
    def working(cls):
--- a/youtube_dl/extractor/drbonanza.py
+++ b/youtube_dl/extractor/drbonanza.py
@@ -1,135 +1,59 @@
 from __future__ import unicode_literals

-import json
 import re

 from .common import InfoExtractor
 from ..utils import (
-    int_or_none,
-    parse_iso8601,
+    js_to_json,
+    parse_duration,
+    unescapeHTML,
 )


 class DRBonanzaIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?dr\.dk/bonanza/(?:[^/]+/)+(?:[^/])+?(?:assetId=(?P<id>\d+))?(?:[#&]|$)'
-
-    _TESTS = [{
-        'url': 'http://www.dr.dk/bonanza/serie/portraetter/Talkshowet.htm?assetId=65517',
+    _VALID_URL = r'https?://(?:www\.)?dr\.dk/bonanza/[^/]+/\d+/[^/]+/(?P<id>\d+)/(?P<display_id>[^/?#&]+)'
+    _TEST = {
+        'url': 'http://www.dr.dk/bonanza/serie/154/matador/40312/matador---0824-komme-fremmede-',
        'info_dict': {
-            'id': '65517',
+            'id': '40312',
+            'display_id': 'matador---0824-komme-fremmede-',
            'ext': 'mp4',
-            'title': 'Talkshowet - Leonard Cohen',
-            'description': 'md5:8f34194fb30cd8c8a30ad8b27b70c0ca',
+            'title': 'MATADOR - 08:24. "Komme fremmede".',
+            'description': 'md5:77b4c1ac4d4c1b9d610ab4395212ff84',
            'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
-            'timestamp': 1295537932,
-            'upload_date': '20110120',
-            'duration': 3664,
+            'duration': 4613,
        },
-        'params': {
-            'skip_download': True,  # requires rtmp
-        },
-    }, {
-        'url': 'http://www.dr.dk/bonanza/radio/serie/sport/fodbold.htm?assetId=59410',
-        'md5': '6dfe039417e76795fb783c52da3de11d',
-        'info_dict': {
-            'id': '59410',
-            'ext': 'mp3',
-            'title': 'EM fodbold 1992 Danmark - Tyskland finale Transmission',
-            'description': 'md5:501e5a195749480552e214fbbed16c4e',
-            'thumbnail': r're:^https?://.*\.(?:gif|jpg)$',
-            'timestamp': 1223274900,
-            'upload_date': '20081006',
-            'duration': 7369,
-        },
-    }]
+    }

    def _real_extract(self, url):
-        url_id = self._match_id(url)
-        webpage = self._download_webpage(url, url_id)
+        mobj = re.match(self._VALID_URL, url)
+        video_id, display_id = mobj.group('id', 'display_id')

-        if url_id:
-            info = json.loads(self._html_search_regex(r'({.*?%s.*})' % url_id, webpage, 'json'))
-        else:
-            # Just fetch the first video on that page
-            info = json.loads(self._html_search_regex(r'bonanzaFunctions.newPlaylist\(({.*})\)', webpage, 'json'))
+        webpage = self._download_webpage(url, display_id)

-        asset_id = str(info['AssetId'])
-        title = info['Title'].rstrip(' \'\"-,.:;!?')
-        duration = int_or_none(info.get('Duration'), scale=1000)
-        # First published online. "FirstPublished" contains the date for original airing.
-        timestamp = parse_iso8601(
-            re.sub(r'\.\d+$', '', info['Created']))
+        info = self._parse_html5_media_entries(
+            url, webpage, display_id, m3u8_id='hls',
+            m3u8_entry_protocol='m3u8_native')[0]
+        self._sort_formats(info['formats'])

-        def parse_filename_info(url):
-            match = re.search(r'/\d+_(?P<width>\d+)x(?P<height>\d+)x(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url)
-            if match:
-                return {
-                    'width': int(match.group('width')),
-                    'height': int(match.group('height')),
-                    'vbr': int(match.group('bitrate')),
-                    'ext': match.group('ext')
-                }
-            match = re.search(r'/\d+_(?P<bitrate>\d+)K\.(?P<ext>\w+)$', url)
-            if match:
-                return {
-                    'vbr': int(match.group('bitrate')),
-                    'ext': match.group(2)
-                }
-            return {}
+        asset = self._parse_json(
+            self._search_regex(
+                r'(?s)currentAsset\s*=\s*({.+?})\s*</script', webpage, 'asset'),
+            display_id, transform_source=js_to_json)

-        video_types = ['VideoHigh', 'VideoMid', 'VideoLow']
-        preferencemap = {
-            'VideoHigh': -1,
-            'VideoMid': -2,
-            'VideoLow': -3,
-            'Audio': -4,
-        }
+        title = unescapeHTML(asset['AssetTitle']).strip()

-        formats = []
-        for file in info['Files']:
-            if info['Type'] == 'Video':
-                if file['Type'] in video_types:
-                    format = parse_filename_info(file['Location'])
-                    format.update({
-                        'url': file['Location'],
-                        'format_id': file['Type'].replace('Video', ''),
-                        'preference': preferencemap.get(file['Type'], -10),
-                    })
-                    if format['url'].startswith('rtmp'):
-                        rtmp_url = format['url']
-                        format['rtmp_live'] = True  # --resume does not work
-                        if '/bonanza/' in rtmp_url:
-                            format['play_path'] = rtmp_url.split('/bonanza/')[1]
-                    formats.append(format)
-                elif file['Type'] == 'Thumb':
-                    thumbnail = file['Location']
-            elif info['Type'] == 'Audio':
-                if file['Type'] == 'Audio':
-                    format = parse_filename_info(file['Location'])
-                    format.update({
-                        'url': file['Location'],
-                        'format_id': file['Type'],
-                        'vcodec': 'none',
-                    })
-                    formats.append(format)
-                elif file['Type'] == 'Thumb':
-                    thumbnail = file['Location']
+        def extract(field):
+            return self._search_regex(
+                r'<div[^>]+>\s*<p>%s:<p>\s*</div>\s*<div[^>]+>\s*<p>([^<]+)</p>' % field,
+                webpage, field, default=None)

-        description = '%s\n%s\n%s\n' % (
-            info['Description'], info['Actors'], info['Colophon'])
-
-        self._sort_formats(formats)
-
-        display_id = re.sub(r'[^\w\d-]', '', re.sub(r' ', '-', title.lower())) + '-' + asset_id
-        display_id = re.sub(r'-+', '-', display_id)
-
-        return {
-            'id': asset_id,
+        info.update({
+            'id': asset.get('AssetId') or video_id,
            'display_id': display_id,
            'title': title,
-            'formats': formats,
-            'description': description,
-            'thumbnail': thumbnail,
-            'timestamp': timestamp,
-            'duration': duration,
-        }
+            'description': extract('Programinfo'),
+            'duration': parse_duration(extract('Tid')),
+            'thumbnail': asset.get('AssetImageUrl'),
+        })
+        return info
--- a/youtube_dl/extractor/dvtv.py
+++ b/youtube_dl/extractor/dvtv.py
@@ -5,9 +5,12 @@ import re

 from .common import InfoExtractor
 from ..utils import (
-    js_to_json,
-    unescapeHTML,
+    determine_ext,
    ExtractorError,
+    int_or_none,
+    js_to_json,
+    mimetype2ext,
+    unescapeHTML,
 )


@@ -24,14 +27,7 @@ class DVTVIE(InfoExtractor):
            'id': 'dc0768de855511e49e4b0025900fea04',
            'ext': 'mp4',
            'title': 'Vondra o Českém století: Při pohledu na Havla mi bylo trapně',
-        }
-    }, {
-        'url': 'http://video.aktualne.cz/dvtv/stropnicky-policie-vrbetice-preventivne-nekontrolovala/r~82ed4322849211e4a10c0025900fea04/',
-        'md5': '6388f1941b48537dbd28791f712af8bf',
-        'info_dict': {
-            'id': '72c02230849211e49f60002590604f2e',
-            'ext': 'mp4',
-            'title': 'Stropnický: Policie Vrbětice preventivně nekontrolovala',
+            'duration': 1484,
        }
    }, {
        'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
@@ -44,55 +40,100 @@ class DVTVIE(InfoExtractor):
            'info_dict': {
                'id': 'b0b40906854d11e4bdad0025900fea04',
                'ext': 'mp4',
-                'title': 'Drtinová Veselovský TV 16. 12. 2014: Témata dne'
+                'title': 'Drtinová Veselovský TV 16. 12. 2014: Témata dne',
+                'description': 'md5:0916925dea8e30fe84222582280b47a0',
+                'timestamp': 1418760010,
+                'upload_date': '20141216',
            }
        }, {
            'md5': '5f7652a08b05009c1292317b449ffea2',
            'info_dict': {
                'id': '420ad9ec854a11e4bdad0025900fea04',
                'ext': 'mp4',
-                'title': 'Školní masakr možná změní boj s Talibanem, říká novinářka'
+                'title': 'Školní masakr možná změní boj s Talibanem, říká novinářka',
+                'description': 'md5:ff2f9f6de73c73d7cef4f756c1c1af42',
+                'timestamp': 1418760010,
+                'upload_date': '20141216',
            }
        }, {
            'md5': '498eb9dfa97169f409126c617e2a3d64',
            'info_dict': {
                'id': '95d35580846a11e4b6d20025900fea04',
                'ext': 'mp4',
-                'title': 'Boj o kliniku: Veřejný zájem, nebo právo na majetek?'
+                'title': 'Boj o kliniku: Veřejný zájem, nebo právo na majetek?',
+                'description': 'md5:889fe610a70fee5511dc3326a089188e',
+                'timestamp': 1418760010,
+                'upload_date': '20141216',
            }
        }, {
            'md5': 'b8dc6b744844032dab6ba3781a7274b9',
            'info_dict': {
                'id': '6fe14d66853511e4833a0025900fea04',
                'ext': 'mp4',
-                'title': 'Pánek: Odmítání syrských uprchlíků je ostudou české vlády'
+                'title': 'Pánek: Odmítání syrských uprchlíků je ostudou české vlády',
+                'description': 'md5:544f86de6d20c4815bea11bf2ac3004f',
+                'timestamp': 1418760010,
+                'upload_date': '20141216',
            }
        }],
+    }, {
+        'url': 'https://video.aktualne.cz/dvtv/zeman-si-jen-leci-mindraky-sobotku-nenavidi-a-babis-se-mu-te/r~960cdb3a365a11e7a83b0025900fea04/',
+        'md5': 'f8efe9656017da948369aa099788c8ea',
+        'info_dict': {
+            'id': '3c496fec365911e7a6500025900fea04',
+            'ext': 'mp4',
+            'title': 'Zeman si jen léčí mindráky, Sobotku nenávidí a Babiš se mu teď hodí, tvrdí Kmenta',
+            'duration': 1103,
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/',
        'only_matching': True,
    }]

    def _parse_video_metadata(self, js, video_id):
-        metadata = self._parse_json(js, video_id, transform_source=js_to_json)
+        data = self._parse_json(js, video_id, transform_source=js_to_json)
+
+        title = unescapeHTML(data['title'])

        formats = []
-        for video in metadata['sources']:
-            ext = video['type'][6:]
-            formats.append({
-                'url': video['file'],
-                'ext': ext,
-                'format_id': '%s-%s' % (ext, video['label']),
-                'height': int(video['label'].rstrip('p')),
-                'fps': 25,
-            })
-
+        for video in data['sources']:
+            video_url = video.get('file')
+            if not video_url:
+                continue
+            video_type = video.get('type')
+            ext = determine_ext(video_url, mimetype2ext(video_type))
+            if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
+                formats.extend(self._extract_m3u8_formats(
+                    video_url, video_id, 'mp4', entry_protocol='m3u8_native',
+                    m3u8_id='hls', fatal=False))
+            elif video_type == 'application/dash+xml' or ext == 'mpd':
+                formats.extend(self._extract_mpd_formats(
+                    video_url, video_id, mpd_id='dash', fatal=False))
+            else:
+                label = video.get('label')
+                height = self._search_regex(
+                    r'^(\d+)[pP]', label or '', 'height', default=None)
+                format_id = ['http']
+                for f in (ext, label):
+                    if f:
+                        format_id.append(f)
+                formats.append({
+                    'url': video_url,
+                    'format_id': '-'.join(format_id),
+                    'height': int_or_none(height),
+                })
        self._sort_formats(formats)

        return {
-            'id': metadata['mediaid'],
-            'title': unescapeHTML(metadata['title']),
-            'thumbnail': self._proto_relative_url(metadata['image'], 'http:'),
+            'id': data.get('mediaid') or video_id,
+            'title': title,
+            'description': data.get('description'),
+            'thumbnail': data.get('image'),
+            'duration': int_or_none(data.get('duration')),
+            'timestamp': int_or_none(data.get('pubtime')),
            'formats': formats
        }

@@ -103,7 +144,7 @@ class DVTVIE(InfoExtractor):

        # single video
        item = self._search_regex(
-            r"(?s)embedData[0-9a-f]{32}\['asset'\]\s*=\s*(\{.+?\});",
+            r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
            webpage, 'video', default=None, fatal=False)

        if item:
@@ -113,6 +154,8 @@ class DVTVIE(InfoExtractor):
        items = re.findall(
            r"(?s)BBX\.context\.assets\['[0-9a-f]{32}'\]\.push\(({.+?})\);",
            webpage)
+        if not items:
+            items = re.findall(r'(?s)var\s+asset\s*=\s*({.+?});\n', webpage)

        if items:
            return {
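Review note on the reworked `_parse_video_metadata`: each source is now routed to HLS, DASH or plain HTTP handling by MIME type or file extension, and the height is parsed from the label only when the label starts with digits followed by `p`. The sketch below isolates those two decisions. It is a simplified illustration: `guess_ext`, `classify` and `label_height` are hypothetical names, `guess_ext` only looks at the URL path (the diff also falls back to `mimetype2ext`), and entries without a `file` are assumed to have been skipped already, as the diff does.

```python
import re
from os.path import splitext
from urllib.parse import urlparse


def guess_ext(url):
    """Simplified stand-in (assumption): extension of the URL path, or None."""
    return splitext(urlparse(url).path)[1].lstrip('.') or None


def classify(source):
    """Return 'hls', 'dash' or 'http' for one entry of the sources list."""
    url, mime = source['file'], source.get('type')
    ext = guess_ext(url)
    if mime == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
        return 'hls'
    if mime == 'application/dash+xml' or ext == 'mpd':
        return 'dash'
    return 'http'


def label_height(label):
    """Parse a leading '720p'-style label; None if missing or not numeric."""
    m = re.match(r'(\d+)[pP]', label or '')
    return int(m.group(1)) if m else None


# Invented sample sources.
kinds = [
    classify({'file': 'https://example.invalid/v/master.m3u8'}),
    classify({'file': 'https://example.invalid/v/manifest',
              'type': 'application/dash+xml'}),
    classify({'file': 'https://example.invalid/v/clip.mp4',
              'type': 'video/mp4', 'label': '720p'}),
]
```

Compared with the removed `int(video['label'].rstrip('p'))`, a missing or non-numeric label now yields `None` instead of raising.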
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -90,7 +90,7 @@ from .azmedien import (
 )
 from .baidu import BaiduVideoIE
 from .bambuser import BambuserIE, BambuserChannelIE
-from .bandcamp import BandcampIE, BandcampAlbumIE
+from .bandcamp import BandcampIE, BandcampAlbumIE, BandcampWeeklyIE
 from .bbc import (
    BBCCoUkIE,
    BBCCoUkArticleIE,
@@ -98,7 +98,10 @@ from .bbc import (
    BBCCoUkPlaylistIE,
    BBCIE,
 )
-from .beampro import BeamProLiveIE
+from .beampro import (
+    BeamProLiveIE,
+    BeamProVodIE,
+)
 from .beeg import BeegIE
 from .behindkink import BehindKinkIE
 from .bellmedia import BellMediaIE
@@ -389,7 +392,6 @@ from .globo import (
 from .go import GoIE
 from .go90 import Go90IE
 from .godtube import GodTubeIE
-from .godtv import GodTVIE
 from .golem import GolemIE
 from .googledrive import GoogleDriveIE
 from .googleplus import GooglePlusIE
@@ -634,7 +636,10 @@ from .neteasemusic import (
    NetEaseMusicProgramIE,
    NetEaseMusicDjRadioIE,
 )
-from .newgrounds import NewgroundsIE
+from .newgrounds import (
+    NewgroundsIE,
+    NewgroundsPlaylistIE,
+)
 from .newstube import NewstubeIE
 from .nextmedia import (
    NextMediaIE,
--- a/youtube_dl/extractor/firsttv.py
+++ b/youtube_dl/extractor/firsttv.py
@@ -102,6 +102,8 @@ class FirstTVIE(InfoExtractor):
                    'format_id': f.get('name'),
                    'tbr': tbr,
                    'source_preference': quality(f.get('name')),
+                    # quality metadata of http formats may be incorrect
+                    'preference': -1,
                })
            # m3u8 URL format is reverse engineered from [1] (search for
            # master.m3u8). dashEdges (that is currently balancer-vod.1tv.ru)
--- a/youtube_dl/extractor/flickr.py
+++ b/youtube_dl/extractor/flickr.py
@@ -1,7 +1,10 @@
 from __future__ import unicode_literals

 from .common import InfoExtractor
-from ..compat import compat_urllib_parse_urlencode
+from ..compat import (
+    compat_str,
+    compat_urllib_parse_urlencode,
+)
 from ..utils import (
    ExtractorError,
    int_or_none,
@@ -81,7 +84,7 @@ class FlickrIE(InfoExtractor):

            formats = []
            for stream in streams['stream']:
-                stream_type = str(stream.get('type'))
+                stream_type = compat_str(stream.get('type'))
                formats.append({
                    'format_id': stream_type,
                    'url': stream['_content'],
--- a/youtube_dl/extractor/foxgay.py
+++ b/youtube_dl/extractor/foxgay.py
@@ -5,6 +5,7 @@ import itertools
 from .common import InfoExtractor
 from ..utils import (
    get_element_by_id,
+    int_or_none,
    remove_end,
 )

@@ -46,7 +47,7 @@ class FoxgayIE(InfoExtractor):

        formats = [{
            'url': source,
-            'height': resolution,
+            'height': int_or_none(resolution),
        } for source, resolution in zip(
            video_data['sources'], video_data.get('resolutions', itertools.repeat(None)))]

--- a/youtube_dl/extractor/francetv.py
+++ b/youtube_dl/extractor/francetv.py
@@ -112,7 +112,7 @@ class FranceTVBaseInfoExtractor(InfoExtractor):


 class FranceTVIE(FranceTVBaseInfoExtractor):
-    _VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)+(?P<id>[^/]+)\.html'
+    _VALID_URL = r'https?://(?:(?:www\.)?france\.tv|mobile\.france\.tv)/(?:[^/]+/)*(?P<id>[^/]+)\.html'

    _TESTS = [{
        'url': 'https://www.france.tv/france-2/13h15-le-dimanche/140921-les-mysteres-de-jesus.html',
@@ -157,6 +157,9 @@ class FranceTVIE(FranceTVBaseInfoExtractor):
    }, {
        'url': 'https://mobile.france.tv/france-5/c-dans-l-air/137347-emission-du-vendredi-12-mai-2017.html',
        'only_matching': True,
+    }, {
+        'url': 'https://www.france.tv/142749-rouge-sang.html',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/gaskrank.py
+++ b/youtube_dl/extractor/gaskrank.py
@@ -6,62 +6,52 @@ from .common import InfoExtractor
 from ..utils import (
    float_or_none,
    int_or_none,
-    js_to_json,
    unified_strdate,
 )


 class GaskrankIE(InfoExtractor):
-    """InfoExtractor for gaskrank.tv"""
-    _VALID_URL = r'https?://(?:www\.)?gaskrank\.tv/tv/(?P<categories>[^/]+)/(?P<id>[^/]+)\.html?'
-    _TESTS = [
-        {
-            'url': 'http://www.gaskrank.tv/tv/motorrad-fun/strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden.htm',
-            'md5': '1ae88dbac97887d85ebd1157a95fc4f9',
-            'info_dict': {
-                'id': '201601/26955',
-                'ext': 'mp4',
-                'title': 'Strike! Einparken können nur Männer - Flurschaden hält sich in Grenzen *lol*',
-                'thumbnail': r're:^https?://.*\.jpg$',
-                'categories': ['motorrad-fun'],
-                'display_id': 'strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden',
-                'uploader_id': 'Bikefun',
-                'upload_date': '20170110',
-                'uploader_url': None,
-            }
-        },
-        {
-            'url': 'http://www.gaskrank.tv/tv/racing/isle-of-man-tt-2011-michael-du-15920.htm',
-            'md5': 'c33ee32c711bc6c8224bfcbe62b23095',
-            'info_dict': {
-                'id': '201106/15920',
-                'ext': 'mp4',
-                'title': 'Isle of Man - Michael Dunlop vs Guy Martin - schwindelig kucken',
-                'thumbnail': r're:^https?://.*\.jpg$',
-                'categories': ['racing'],
-                'display_id': 'isle-of-man-tt-2011-michael-du-15920',
-                'uploader_id': 'IOM',
-                'upload_date': '20160506',
-                'uploader_url': 'www.iomtt.com',
-            }
+    _VALID_URL = r'https?://(?:www\.)?gaskrank\.tv/tv/(?P<categories>[^/]+)/(?P<id>[^/]+)\.htm'
+    _TESTS = [{
+        'url': 'http://www.gaskrank.tv/tv/motorrad-fun/strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden.htm',
+        'md5': '1ae88dbac97887d85ebd1157a95fc4f9',
+        'info_dict': {
+            'id': '201601/26955',
+            'ext': 'mp4',
+            'title': 'Strike! Einparken können nur Männer - Flurschaden hält sich in Grenzen *lol*',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'categories': ['motorrad-fun'],
+            'display_id': 'strike-einparken-durch-anfaenger-crash-mit-groesserem-flurschaden',
+            'uploader_id': 'Bikefun',
+            'upload_date': '20170110',
+            'uploader_url': None,
        }
-    ]
+    }, {
+        'url': 'http://www.gaskrank.tv/tv/racing/isle-of-man-tt-2011-michael-du-15920.htm',
+        'md5': 'c33ee32c711bc6c8224bfcbe62b23095',
+        'info_dict': {
+            'id': '201106/15920',
+            'ext': 'mp4',
+            'title': 'Isle of Man - Michael Dunlop vs Guy Martin - schwindelig kucken',
+            'thumbnail': r're:^https?://.*\.jpg$',
+            'categories': ['racing'],
+            'display_id': 'isle-of-man-tt-2011-michael-du-15920',
+            'uploader_id': 'IOM',
+            'upload_date': '20170523',
+            'uploader_url': 'www.iomtt.com',
+        }
+    }]

    def _real_extract(self, url):
-        """extract information from gaskrank.tv"""
-        def fix_json(code):
-            """Removes trailing comma in json: {{},} --> {{}}"""
-            return re.sub(r',\s*}', r'}', js_to_json(code))
-
        display_id = self._match_id(url)
+
        webpage = self._download_webpage(url, display_id)
+
+        title = self._og_search_title(
+            webpage, default=None) or self._html_search_meta(
+            'title', webpage, fatal=True)
+
        categories = [re.match(self._VALID_URL, url).group('categories')]
-        title = self._search_regex(
-            r'movieName\s*:\s*\'([^\']*)\'',
-            webpage, 'title')
-        thumbnail = self._search_regex(
-            r'poster\s*:\s*\'([^\']*)\'',
-            webpage, 'thumbnail', default=None)

        mobj = re.search(
            r'Video von:\s*(?P<uploader_id>[^|]*?)\s*\|\s*vom:\s*(?P<upload_date>[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9])',
@@ -89,29 +79,14 @@ class GaskrankIE(InfoExtractor):
        if average_rating:
            average_rating = float_or_none(average_rating.replace(',', '.'))

-        playlist = self._parse_json(
-            self._search_regex(
-                r'playlist\s*:\s*\[([^\]]*)\]',
-                webpage, 'playlist', default='{}'),
-            display_id, transform_source=fix_json, fatal=False)
-
        video_id = self._search_regex(
            r'https?://movies\.gaskrank\.tv/([^-]*?)(-[^\.]*)?\.mp4',
-            playlist.get('0').get('src'), 'video id')
+            webpage, 'video id', default=display_id)

-        formats = []
-        for key in playlist:
-            formats.append({
-                'url': playlist[key]['src'],
-                'format_id': key,
-                'quality': playlist[key].get('quality')})
-        self._sort_formats(formats, field_preference=['format_id'])
-
-        return {
+        entry = self._parse_html5_media_entries(url, webpage, video_id)[0]
+        entry.update({
            'id': video_id,
            'title': title,
-            'formats': formats,
-            'thumbnail': thumbnail,
            'categories': categories,
            'display_id': display_id,
            'uploader_id': uploader_id,
@@ -120,4 +95,7 @@ class GaskrankIE(InfoExtractor):
            'tags': tags,
            'view_count': view_count,
            'average_rating': average_rating,
-        }
+        })
+        self._sort_formats(entry['formats'])
+
+        return entry
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -10,6 +10,7 @@ from .common import InfoExtractor
 from .youtube import YoutubeIE
 from ..compat import (
    compat_etree_fromstring,
+    compat_str,
    compat_urllib_parse_unquote,
    compat_urlparse,
    compat_xml_parse_error,
@@ -1907,14 +1908,14 @@ class GenericIE(InfoExtractor):
        content_type = head_response.headers.get('Content-Type', '').lower()
        m = re.match(r'^(?P<type>audio|video|application(?=/(?:ogg$|(?:vnd\.apple\.|x-)?mpegurl)))/(?P<format_id>[^;\s]+)', content_type)
        if m:
-            format_id = m.group('format_id')
+            format_id = compat_str(m.group('format_id'))
            if format_id.endswith('mpegurl'):
                formats = self._extract_m3u8_formats(url, video_id, 'mp4')
            elif format_id == 'f4m':
                formats = self._extract_f4m_formats(url, video_id)
            else:
                formats = [{
-                    'format_id': m.group('format_id'),
+                    'format_id': format_id,
                    'url': url,
                    'vcodec': 'none' if m.group('type') == 'audio' else None
                }]
--- a/youtube_dl/extractor/gfycat.py
+++ b/youtube_dl/extractor/gfycat.py
@@ -82,7 +82,7 @@ class GfycatIE(InfoExtractor):
            video_url = gfy.get('%sUrl' % format_id)
            if not video_url:
                continue
-            filesize = gfy.get('%sSize' % format_id)
+            filesize = int_or_none(gfy.get('%sSize' % format_id))
            formats.append({
                'url': video_url,
                'format_id': format_id,
--- a/youtube_dl/extractor/godtv.py
+++ b/youtube_dl/extractor/godtv.py
@@ -1,66 +0,0 @@
-from __future__ import unicode_literals
-
-from .common import InfoExtractor
-from .ooyala import OoyalaIE
-from ..utils import js_to_json
-
-
-class GodTVIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?god\.tv(?:/[^/]+)*/(?P<id>[^/?#&]+)'
-    _TESTS = [{
-        'url': 'http://god.tv/jesus-image/video/jesus-conference-2016/randy-needham',
-        'info_dict': {
-            'id': 'lpd3g2MzE6D1g8zFAKz8AGpxWcpu6o_3',
-            'ext': 'mp4',
-            'title': 'Randy Needham',
-            'duration': 3615.08,
-        },
-        'params': {
-            'skip_download': True,
-        }
-    }, {
-        'url': 'http://god.tv/playlist/bible-study',
-        'info_dict': {
-            'id': 'bible-study',
-        },
-        'playlist_mincount': 37,
-    }, {
-        'url': 'http://god.tv/node/15097',
-        'only_matching': True,
-    }, {
-        'url': 'http://god.tv/live/africa',
-        'only_matching': True,
-    }, {
-        'url': 'http://god.tv/liveevents',
-        'only_matching': True,
-    }]
-
-    def _real_extract(self, url):
-        display_id = self._match_id(url)
-
-        webpage = self._download_webpage(url, display_id)
-
-        settings = self._parse_json(
-            self._search_regex(
-                r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
-                webpage, 'settings', default='{}'),
-            display_id, transform_source=js_to_json, fatal=False)
-
-        ooyala_id = None
-
-        if settings:
-            playlist = settings.get('playlist')
-            if playlist and isinstance(playlist, list):
-                entries = [
-                    OoyalaIE._build_url_result(video['content_id'])
-                    for video in playlist if video.get('content_id')]
-                if entries:
-                    return self.playlist_result(entries, display_id)
-            ooyala_id = settings.get('ooyala', {}).get('content_id')
-
-        if not ooyala_id:
-            ooyala_id = self._search_regex(
-                r'["\']content_id["\']\s*:\s*(["\'])(?P<id>[\w-]+)\1',
-                webpage, 'ooyala id', group='id')
-
-        return OoyalaIE._build_url_result(ooyala_id)
--- a/youtube_dl/extractor/golem.py
+++ b/youtube_dl/extractor/golem.py
@@ -3,6 +3,7 @@ from __future__ import unicode_literals

 from .common import InfoExtractor
 from ..compat import (
+    compat_str,
    compat_urlparse,
 )
 from ..utils import (
@@ -46,7 +47,7 @@ class GolemIE(InfoExtractor):
                continue

            formats.append({
-                'format_id': e.tag,
+                'format_id': compat_str(e.tag),
                'url': compat_urlparse.urljoin(self._PREFIX, url),
                'height': self._int(e.get('height'), 'height'),
                'width': self._int(e.get('width'), 'width'),
--- a/youtube_dl/extractor/jove.py
+++ b/youtube_dl/extractor/jove.py
@@ -65,9 +65,9 @@ class JoveIE(InfoExtractor):
            webpage, 'description', fatal=False)
        publish_date = unified_strdate(self._html_search_meta(
            'citation_publication_date', webpage, 'publish date', fatal=False))
-        comment_count = self._html_search_regex(
+        comment_count = int(self._html_search_regex(
            r'<meta name="num_comments" content="(\d+) Comments?"',
-            webpage, 'comment count', fatal=False)
+            webpage, 'comment count', fatal=False))

        return {
            'id': video_id,
--- a/youtube_dl/extractor/liveleak.py
+++ b/youtube_dl/extractor/liveleak.py
@@ -115,8 +115,9 @@ class LiveLeakIE(InfoExtractor):

        for a_format in info_dict['formats']:
            if not a_format.get('height'):
-                a_format['height'] = self._search_regex(
-                    r'([0-9]+)p\.mp4', a_format['url'], 'height label', default=None)
+                a_format['height'] = int_or_none(self._search_regex(
+                    r'([0-9]+)p\.mp4', a_format['url'], 'height label',
+                    default=None))

        self._sort_formats(info_dict['formats'])

--- a/youtube_dl/extractor/medialaan.py
+++ b/youtube_dl/extractor/medialaan.py
@@ -17,7 +17,7 @@ from ..utils import (
 class MedialaanIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                    https?://
-                        (?:www\.)?
+                        (?:www\.|nieuws\.)?
                        (?:
                            (?P<site_id>vtm|q2|vtmkzoom)\.be/
                            (?:
@@ -85,6 +85,22 @@ class MedialaanIE(InfoExtractor):
        # clip
        'url': 'http://vtmkzoom.be/k3-dansstudio/een-nieuw-seizoen-van-k3-dansstudio',
        'only_matching': True,
+    }, {
+        # http/s redirect
+        'url': 'https://vtmkzoom.be/video?aid=45724',
+        'info_dict': {
+            'id': '257136373657000',
+            'ext': 'mp4',
+            'title': 'K3 Dansstudio Ushuaia afl.6',
+        },
+        'params': {
+            'skip_download': True,
+        },
+        'skip': 'Requires account credentials',
+    }, {
+        # nieuws.vtm.be
+        'url': 'https://nieuws.vtm.be/stadion/stadion/genk-nog-moeilijk-programma',
+        'only_matching': True,
    }]

    def _real_initialize(self):
@@ -146,6 +162,8 @@ class MedialaanIE(InfoExtractor):
                video_id, transform_source=lambda s: '[%s]' % s, fatal=False)
            if player:
                video = player[-1]
+                if video['videoUrl'] in ('http', 'https'):
+                    return self.url_result(video['url'], MedialaanIE.ie_key())
                info = {
                    'id': video_id,
                    'url': video['videoUrl'],
--- a/youtube_dl/extractor/msn.py
+++ b/youtube_dl/extractor/msn.py
@@ -68,10 +68,6 @@ class MSNIE(InfoExtractor):
            format_url = file_.get('url')
            if not format_url:
                continue
-            ext = determine_ext(format_url)
-            if ext == 'ism':
-                formats.extend(self._extract_ism_formats(
-                    format_url + '/Manifest', display_id, 'mss', fatal=False))
            if 'm3u8' in format_url:
                # m3u8_native should not be used here until
                # https://github.com/rg3/youtube-dl/issues/9913 is fixed
@@ -79,6 +75,9 @@ class MSNIE(InfoExtractor):
                    format_url, display_id, 'mp4',
                    m3u8_id='hls', fatal=False)
                formats.extend(m3u8_formats)
+            elif determine_ext(format_url) == 'ism':
+                formats.extend(self._extract_ism_formats(
+                    format_url + '/Manifest', display_id, 'mss', fatal=False))
            else:
                formats.append({
                    'url': format_url,
--- a/youtube_dl/extractor/newgrounds.py
+++ b/youtube_dl/extractor/newgrounds.py
@@ -1,6 +1,15 @@
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
+from ..utils import (
+    extract_attributes,
+    int_or_none,
+    parse_duration,
+    parse_filesize,
+    unified_timestamp,
+)


 class NewgroundsIE(InfoExtractor):
@@ -13,7 +22,10 @@ class NewgroundsIE(InfoExtractor):
            'ext': 'mp3',
            'title': 'B7 - BusMode',
            'uploader': 'Burn7',
-        }
+            'timestamp': 1378878540,
+            'upload_date': '20130911',
+            'duration': 143,
+        },
    }, {
        'url': 'https://www.newgrounds.com/portal/view/673111',
        'md5': '3394735822aab2478c31b1004fe5e5bc',
@@ -22,25 +34,133 @@ class NewgroundsIE(InfoExtractor):
            'ext': 'mp4',
            'title': 'Dancin',
            'uploader': 'Squirrelman82',
+            'timestamp': 1460256780,
+            'upload_date': '20160410',
+        },
+    }, {
+        # source format unavailable, additional mp4 formats
+        'url': 'http://www.newgrounds.com/portal/view/689400',
+        'info_dict': {
+            'id': '689400',
+            'ext': 'mp4',
+            'title': 'ZTV News Episode 8',
+            'uploader': 'BennettTheSage',
+            'timestamp': 1487965140,
+            'upload_date': '20170224',
+        },
+        'params': {
+            'skip_download': True,
        },
    }]

    def _real_extract(self, url):
        media_id = self._match_id(url)
+
        webpage = self._download_webpage(url, media_id)

        title = self._html_search_regex(
            r'<title>([^>]+)</title>', webpage, 'title')

-        uploader = self._html_search_regex(
-            r'Author\s*<a[^>]+>([^<]+)', webpage, 'uploader', fatal=False)
+        media_url = self._parse_json(self._search_regex(
+            r'"url"\s*:\s*("[^"]+"),', webpage, ''), media_id)

-        music_url = self._parse_json(self._search_regex(
-            r'"url":("[^"]+"),', webpage, ''), media_id)
+        formats = [{
+            'url': media_url,
+            'format_id': 'source',
+            'quality': 1,
+        }]
+
+        max_resolution = int_or_none(self._search_regex(
+            r'max_resolution["\']\s*:\s*(\d+)', webpage, 'max resolution',
+            default=None))
+        if max_resolution:
+            url_base = media_url.rpartition('.')[0]
+            for resolution in (360, 720, 1080):
+                if resolution > max_resolution:
+                    break
+                formats.append({
+                    'url': '%s.%dp.mp4' % (url_base, resolution),
+                    'format_id': '%dp' % resolution,
+                    'height': resolution,
+                })
+
+        self._check_formats(formats, media_id)
+        self._sort_formats(formats)
+
+        uploader = self._search_regex(
+            r'(?:Author|Writer)\s*<a[^>]+>([^<]+)', webpage, 'uploader',
+            fatal=False)
+
+        timestamp = unified_timestamp(self._search_regex(
+            r'<dt>Uploaded</dt>\s*<dd>([^<]+)', webpage, 'timestamp',
+            default=None))
+        duration = parse_duration(self._search_regex(
+            r'<dd>Song\s*</dd><dd>.+?</dd><dd>([^<]+)', webpage, 'duration',
+            default=None))
+
+        filesize_approx = parse_filesize(self._html_search_regex(
+            r'<dd>Song\s*</dd><dd>(.+?)</dd>', webpage, 'filesize',
+            default=None))
+        if len(formats) == 1:
+            formats[0]['filesize_approx'] = filesize_approx
+
+        if '<dd>Song' in webpage:
+            formats[0]['vcodec'] = 'none'

        return {
            'id': media_id,
            'title': title,
-            'url': music_url,
            'uploader': uploader,
+            'timestamp': timestamp,
+            'duration': duration,
+            'formats': formats,
        }
+
+
+class NewgroundsPlaylistIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?newgrounds\.com/(?:collection|[^/]+/search/[^/]+)/(?P<id>[^/?#&]+)'
+    _TESTS = [{
+        'url': 'https://www.newgrounds.com/collection/cats',
+        'info_dict': {
+            'id': 'cats',
+            'title': 'Cats',
+        },
+        'playlist_mincount': 46,
+    }, {
+        'url': 'http://www.newgrounds.com/portal/search/author/ZONE-SAMA',
+        'info_dict': {
+            'id': 'ZONE-SAMA',
+            'title': 'Portal Search: ZONE-SAMA',
+        },
+        'playlist_mincount': 47,
+    }, {
+        'url': 'http://www.newgrounds.com/audio/search/title/cats',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        playlist_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, playlist_id)
+
+        title = self._search_regex(
+            r'<title>([^>]+)</title>', webpage, 'title', default=None)
+
+        # cut left menu
+        webpage = self._search_regex(
+            r'(?s)<div[^>]+\bclass=["\']column wide(.+)',
+            webpage, 'wide column', default=webpage)
+
+        entries = []
+        for a, path, media_id in re.findall(
+                r'(<a[^>]+\bhref=["\']/?((?:portal/view|audio/listen)/(\d+))[^>]+>)',
+                webpage):
+            a_class = extract_attributes(a).get('class')
+            if a_class not in ('item-portalsubmission', 'item-audiosubmission'):
+                continue
+            entries.append(
+                self.url_result(
+                    'https://www.newgrounds.com/%s' % path,
+                    ie=NewgroundsIE.ie_key(), video_id=media_id))
+
+        return self.playlist_result(entries, playlist_id, title)
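Note on the newgrounds.py changes above: extra MP4 URLs are derived from the source URL by dropping its extension and appending `.<height>p.mp4` for each height up to `max_resolution`. A minimal standalone sketch of that derivation, assuming an invented example URL; `build_formats` is a hypothetical helper, not part of the extractor.

```python
def build_formats(media_url, max_resolution=None):
    """List the source format plus one derived MP4 URL per allowed height."""
    formats = [{'url': media_url, 'format_id': 'source', 'quality': 1}]
    if max_resolution:
        # 'http://host/clip.mp4' -> 'http://host/clip'
        url_base = media_url.rpartition('.')[0]
        for resolution in (360, 720, 1080):
            if resolution > max_resolution:
                break
            formats.append({
                'url': '%s.%dp.mp4' % (url_base, resolution),
                'format_id': '%dp' % resolution,
                'height': resolution,
            })
    return formats


formats = build_formats('http://example.com/media/clip.mp4', 720)
```

The derived URLs are only guesses; in the diff, the extractor passes the list to `self._check_formats` before sorting.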
--- a/youtube_dl/extractor/packtpub.py
+++ b/youtube_dl/extractor/packtpub.py
@@ -1,5 +1,6 @@
 from __future__ import unicode_literals

+import json
 import re

 from .common import InfoExtractor
@@ -14,7 +15,6 @@ from ..utils import (
    strip_or_none,
    unified_timestamp,
    urljoin,
-    urlencode_postdata,
 )


@@ -45,22 +45,15 @@ class PacktPubIE(PacktPubBaseIE):
        (username, password) = self._get_login_info()
        if username is None:
            return
-        webpage = self._download_webpage(self._PACKT_BASE, None)
-        login_form = self._form_hidden_inputs(
-            'packt-user-login-form', webpage)
-        login_form.update({
-            'email': username,
-            'password': password,
-        })
-        self._download_webpage(
-            self._PACKT_BASE, None, 'Logging in as %s' % username,
-            data=urlencode_postdata(login_form))
        try:
            self._TOKEN = self._download_json(
-                '%s/users/tokens/sessions' % self._MAPT_REST, None,
-                'Downloading Authorization Token')['data']['token']
+                self._MAPT_REST + '/users/tokens', None,
+                'Downloading Authorization Token', data=json.dumps({
+                    'email': username,
+                    'password': password,
+                }).encode())['data']['access']
        except ExtractorError as e:
-            if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 404):
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code in (400, 401, 404):
                message = self._parse_json(e.cause.read().decode(), None)['message']
                raise ExtractorError(message, expected=True)
            raise
@@ -83,7 +76,7 @@ class PacktPubIE(PacktPubBaseIE):

        headers = {}
        if self._TOKEN:
-            headers['Authorization'] = self._TOKEN
+            headers['Authorization'] = 'Bearer ' + self._TOKEN
        video = self._download_json(
            '%s/users/me/products/%s/chapters/%s/sections/%s'
            % (self._MAPT_REST, course_id, chapter_id, video_id), video_id,
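Note on the packtpub.py changes above: login now sends the credentials as a JSON body and uses the returned token with a `Bearer ` prefix. The sketch below shows just those two pieces with no network access. The helper names, the credentials, and the token value are hypothetical.

```python
import json


def build_login_payload(email, password):
    """Serialize the credentials as a JSON request body, encoded to bytes."""
    return json.dumps({'email': email, 'password': password}).encode()


def build_auth_headers(token):
    """Return request headers, adding a Bearer Authorization header if a token is set."""
    headers = {}
    if token:
        headers['Authorization'] = 'Bearer ' + token
    return headers


payload = build_login_payload('user@example.com', 'secret')
headers = build_auth_headers('abc123')
```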
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -252,11 +252,14 @@ class PornHubPlaylistBaseIE(InfoExtractor):

        playlist = self._parse_json(
            self._search_regex(
-                r'playlistObject\s*=\s*({.+?});', webpage, 'playlist'),
-            playlist_id)
+                r'(?:playlistObject|PLAYLIST_VIEW)\s*=\s*({.+?});', webpage,
+                'playlist', default='{}'),
+            playlist_id, fatal=False) or {}
+        title = playlist.get('title') or self._search_regex(
+            r'>Videos\s+in\s+(.+?)\s+[Pp]laylist<', webpage, 'title', fatal=False)

        return self.playlist_result(
-            entries, playlist_id, playlist.get('title'), playlist.get('description'))
+            entries, playlist_id, title, playlist.get('description'))


 class PornHubPlaylistIE(PornHubPlaylistBaseIE):
@@ -296,6 +299,7 @@ class PornHubUserVideosIE(PornHubPlaylistBaseIE):
            except ExtractorError as e:
                if isinstance(e.cause, compat_HTTPError) and e.cause.code == 404:
                    break
+                raise
            page_entries = self._extract_entries(webpage)
            if not page_entries:
                break
--- a/youtube_dl/extractor/rtlnl.py
+++ b/youtube_dl/extractor/rtlnl.py
@@ -15,7 +15,7 @@ class RtlNlIE(InfoExtractor):
        https?://(?:www\.)?
        (?:
            rtlxl\.nl/[^\#]*\#!/[^/]+/|
-            rtl\.nl/system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=
+            rtl\.nl/(?:system/videoplayer/(?:[^/]+/)+(?:video_)?embed\.html\b.+?\buuid=|video/)
        )
        (?P<id>[0-9a-f-]+)'''

@@ -70,6 +70,9 @@ class RtlNlIE(InfoExtractor):
    }, {
        'url': 'http://rtlxl.nl/?_ga=1.204735956.572365465.1466978370#!/rtl-nieuws-132237/3c487912-023b-49ac-903e-2c5d79f8410f',
        'only_matching': True,
+    }, {
+        'url': 'https://www.rtl.nl/video/c603c9c2-601d-4b5e-8175-64f1e942dc7d/',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
--- a/youtube_dl/extractor/rutv.py
+++ b/youtube_dl/extractor/rutv.py
@@ -13,11 +13,15 @@ from ..utils import (
 class RUTVIE(InfoExtractor):
    IE_DESC = 'RUTV.RU'
    _VALID_URL = r'''(?x)
-        https?://player\.(?:rutv\.ru|vgtrk\.com)/
-            (?P<path>flash\d+v/container\.swf\?id=
-            |iframe/(?P<type>swf|video|live)/id/
-            |index/iframe/cast_id/)
-            (?P<id>\d+)'''
+                    https?://
+                        (?:test)?player\.(?:rutv\.ru|vgtrk\.com)/
+                        (?P<path>
+                            flash\d+v/container\.swf\?id=|
+                            iframe/(?P<type>swf|video|live)/id/|
+                            index/iframe/cast_id/
+                        )
+                        (?P<id>\d+)
+                    '''

    _TESTS = [
        {
@@ -99,17 +103,21 @@ class RUTVIE(InfoExtractor):
                'skip_download': True,
            },
        },
+        {
+            'url': 'https://testplayer.vgtrk.com/iframe/live/id/19201/showZoomBtn/false/isPlay/true/',
+            'only_matching': True,
+        },
    ]

    @classmethod
    def _extract_url(cls, webpage):
        mobj = re.search(
-            r'<iframe[^>]+?src=(["\'])(?P<url>https?://player\.(?:rutv\.ru|vgtrk\.com)/(?:iframe/(?:swf|video|live)/id|index/iframe/cast_id)/.+?)\1', webpage)
+            r'<iframe[^>]+?src=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/(?:iframe/(?:swf|video|live)/id|index/iframe/cast_id)/.+?)\1', webpage)
        if mobj:
            return mobj.group('url')

        mobj = re.search(
-            r'<meta[^>]+?property=(["\'])og:video\1[^>]+?content=(["\'])(?P<url>https?://player\.(?:rutv\.ru|vgtrk\.com)/flash\d+v/container\.swf\?id=.+?\2)',
+            r'<meta[^>]+?property=(["\'])og:video\1[^>]+?content=(["\'])(?P<url>https?://(?:test)?player\.(?:rutv\.ru|vgtrk\.com)/flash\d+v/container\.swf\?id=.+?\2)',
            webpage)
        if mobj:
            return mobj.group('url')
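Note on the rutv.py changes above: the reformatted `_VALID_URL` adds an optional `test` prefix to the player host. The pattern from the diff can be tried standalone as sketched below. The `testplayer.vgtrk.com` URL is the new test case from the diff; the `player.rutv.ru` URL is an invented example.

```python
import re

# Same pattern as the new _VALID_URL, compiled on its own for experimentation.
VALID_URL = re.compile(r'''(?x)
    https?://
        (?:test)?player\.(?:rutv\.ru|vgtrk\.com)/
        (?P<path>
            flash\d+v/container\.swf\?id=|
            iframe/(?P<type>swf|video|live)/id/|
            index/iframe/cast_id/
        )
        (?P<id>\d+)
    ''')

test_host = VALID_URL.match(
    'https://testplayer.vgtrk.com/iframe/live/id/19201/showZoomBtn/false/isPlay/true/')
plain_host = VALID_URL.match('http://player.rutv.ru/iframe/video/id/12345/')
```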
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@@ -16,7 +16,6 @@ from ..utils import (

 class SafariBaseIE(InfoExtractor):
    _LOGIN_URL = 'https://www.safaribooksonline.com/accounts/login/'
-    _SUCCESSFUL_LOGIN_REGEX = r'<a href="/accounts/logout/"[^>]*>Sign Out</a>'
    _NETRC_MACHINE = 'safari'

    _API_BASE = 'https://www.safaribooksonline.com/api/v1'
@@ -28,10 +27,6 @@ class SafariBaseIE(InfoExtractor):
        self._login()

    def _login(self):
-        # We only need to log in once for courses or individual videos
-        if self.LOGGED_IN:
-            return
-
        (username, password) = self._get_login_info()
        if username is None:
            return
@@ -39,11 +34,17 @@ class SafariBaseIE(InfoExtractor):
        headers = std_headers.copy()
        if 'Referer' not in headers:
            headers['Referer'] = self._LOGIN_URL
-        login_page_request = sanitized_Request(self._LOGIN_URL, headers=headers)

        login_page = self._download_webpage(
-            login_page_request, None,
-            'Downloading login form')
+            self._LOGIN_URL, None, 'Downloading login form', headers=headers)
+
+        def is_logged(webpage):
+            return any(re.search(p, webpage) for p in (
+                r'href=["\']/accounts/logout/', r'>Sign Out<'))
+
+        if is_logged(login_page):
+            self.LOGGED_IN = True
+            return

        csrf = self._html_search_regex(
            r"name='csrfmiddlewaretoken'\s+value='([^']+)'",
@@ -62,14 +63,12 @@ class SafariBaseIE(InfoExtractor):
        login_page = self._download_webpage(
            request, None, 'Logging in as %s' % username)

-        if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
+        if not is_logged(login_page):
            raise ExtractorError(
                'Login failed; make sure your credentials are correct and try again.',
                expected=True)

-        SafariBaseIE.LOGGED_IN = True
-
-        self.to_screen('Login successful')
+        self.LOGGED_IN = True


 class SafariIE(SafariBaseIE):
--- a/youtube_dl/extractor/sexu.py
+++ b/youtube_dl/extractor/sexu.py
@@ -32,8 +32,9 @@ class SexuIE(InfoExtractor):
        formats = [{
            'url': source['file'].replace('\\', ''),
            'format_id': source.get('label'),
-            'height': self._search_regex(
-                r'^(\d+)[pP]', source.get('label', ''), 'height', default=None),
+            'height': int(self._search_regex(
+                r'^(\d+)[pP]', source.get('label', ''), 'height',
+                default=None)),
        } for source in sources if source.get('file')]
        self._sort_formats(formats)

--- a/youtube_dl/extractor/sohu.py
+++ b/youtube_dl/extractor/sohu.py
@@ -8,7 +8,11 @@ from ..compat import (
    compat_str,
    compat_urllib_parse_urlencode,
 )
-from ..utils import ExtractorError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    try_get,
+)


 class SohuIE(InfoExtractor):
@@ -169,10 +173,11 @@ class SohuIE(InfoExtractor):
                formats.append({
                    'url': video_url,
                    'format_id': format_id,
-                    'filesize': data['clipsBytes'][i],
-                    'width': data['width'],
-                    'height': data['height'],
-                    'fps': data['fps'],
+                    'filesize': int_or_none(
+                        try_get(data, lambda x: x['clipsBytes'][i])),
+                    'width': int_or_none(data.get('width')),
+                    'height': int_or_none(data.get('height')),
+                    'fps': int_or_none(data.get('fps')),
                })
            self._sort_formats(formats)

--- a/youtube_dl/extractor/streamango.py
+++ b/youtube_dl/extractor/streamango.py
@@ -21,6 +21,17 @@ class StreamangoIE(InfoExtractor):
            'ext': 'mp4',
            'title': '20170315_150006.mp4',
        }
+    }, {
+        # no og:title
+        'url': 'https://streamango.com/embed/foqebrpftarclpob/asdf_asd_2_mp4',
+        'info_dict': {
+            'id': 'foqebrpftarclpob',
+            'ext': 'mp4',
+            'title': 'foqebrpftarclpob',
+        },
+        'params': {
+            'skip_download': True,
+        },
    }, {
        'url': 'https://streamango.com/embed/clapasobsptpkdfe/20170315_150006_mp4',
        'only_matching': True,
@@ -31,7 +42,7 @@ class StreamangoIE(InfoExtractor):

        webpage = self._download_webpage(url, video_id)

-        title = self._og_search_title(webpage)
+        title = self._og_search_title(webpage, default=video_id)

        formats = []
        for format_ in re.findall(r'({[^}]*\bsrc\s*:\s*[^}]*})', webpage):
--- a/youtube_dl/extractor/turbo.py
+++ b/youtube_dl/extractor/turbo.py
@@ -4,6 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
    ExtractorError,
    int_or_none,
@@ -49,7 +50,7 @@ class TurboIE(InfoExtractor):
        for child in item:
            m = re.search(r'url_video_(?P<quality>.+)', child.tag)
            if m:
-                quality = m.group('quality')
+                quality = compat_str(m.group('quality'))
                formats.append({
                    'format_id': quality,
                    'url': child.text,
--- a/youtube_dl/extractor/tvplayer.py
+++ b/youtube_dl/extractor/tvplayer.py
@@ -48,7 +48,7 @@ class TVPlayerIE(InfoExtractor):
            'https://tvplayer.com/watch/context', display_id,
            'Downloading JSON context', query={
                'resource': resource_id,
-                'nonce': token,
+                'gen': token,
            })

        validate = context['validate']
--- a/youtube_dl/extractor/xfileshare.py
+++ b/youtube_dl/extractor/xfileshare.py
@@ -10,7 +10,6 @@ from ..utils import (
    ExtractorError,
    int_or_none,
    NO_DEFAULT,
-    sanitized_Request,
    urlencode_postdata,
 )

@@ -30,6 +29,7 @@ class XFileShareIE(InfoExtractor):
        (r'vidabc\.com', 'Vid ABC'),
        (r'vidbom\.com', 'VidBom'),
        (r'vidlo\.us', 'vidlo'),
+        (r'rapidvideo\.(?:cool|org)', 'RapidVideo.TV'),
    )

    IE_DESC = 'XFileShare based sites: %s' % ', '.join(list(zip(*_SITES))[1])
@@ -109,6 +109,9 @@ class XFileShareIE(InfoExtractor):
        'params': {
            'skip_download': True,
        },
+    }, {
+        'url': 'http://www.rapidvideo.cool/b667kprndr8w',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
@@ -130,12 +133,12 @@ class XFileShareIE(InfoExtractor):
            if countdown:
                self._sleep(countdown, video_id)

-            post = urlencode_postdata(fields)
-
-            req = sanitized_Request(url, post)
-            req.add_header('Content-type', 'application/x-www-form-urlencoded')
-
-            webpage = self._download_webpage(req, video_id, 'Downloading video page')
+            webpage = self._download_webpage(
+                url, video_id, 'Downloading video page',
+                data=urlencode_postdata(fields), headers={
+                    'Referer': url,
+                    'Content-type': 'application/x-www-form-urlencoded',
+                })

        title = (self._search_regex(
            (r'style="z-index: [0-9]+;">([^<]+)</span>',
--- a/youtube_dl/extractor/xhamster.py
+++ b/youtube_dl/extractor/xhamster.py
@@ -4,6 +4,7 @@ import re

 from .common import InfoExtractor
 from ..utils import (
+    clean_html,
    dict_get,
    ExtractorError,
    int_or_none,
@@ -25,6 +26,7 @@ class XHamsterIE(InfoExtractor):
            'uploader': 'Ruseful2011',
            'duration': 893,
            'age_limit': 18,
+            'categories': ['Fake Hub', 'Amateur', 'MILFs', 'POV', 'Boss', 'Office', 'Oral', 'Reality', 'Sexy'],
        },
    }, {
        'url': 'http://xhamster.com/movies/2221348/britney_spears_sexy_booty.html?hd',
@@ -36,6 +38,7 @@ class XHamsterIE(InfoExtractor):
            'uploader': 'jojo747400',
            'duration': 200,
            'age_limit': 18,
+            'categories': ['Britney Spears', 'Celebrities', 'HD Videos', 'Sexy', 'Sexy Booty'],
        },
        'params': {
            'skip_download': True,
@@ -51,6 +54,7 @@ class XHamsterIE(InfoExtractor):
            'uploader': 'parejafree',
            'duration': 72,
            'age_limit': 18,
+            'categories': ['Amateur', 'Blowjobs'],
        },
        'params': {
            'skip_download': True,
@@ -104,7 +108,7 @@ class XHamsterIE(InfoExtractor):
            webpage, 'upload date', fatal=False))

        uploader = self._html_search_regex(
-            r'<span[^>]+itemprop=["\']author[^>]+><a[^>]+href=["\'].+?xhamster\.com/user/[^>]+>(?P<uploader>.+?)</a>',
+            r'<span[^>]+itemprop=["\']author[^>]+><a[^>]+><span[^>]+>([^<]+)',
            webpage, 'uploader', default='anonymous')

        thumbnail = self._search_regex(
@@ -120,7 +124,7 @@ class XHamsterIE(InfoExtractor):
            r'content=["\']User(?:View|Play)s:(\d+)',
            webpage, 'view count', fatal=False))

-        mobj = re.search(r"hint='(?P<likecount>\d+) Likes / (?P<dislikecount>\d+) Dislikes'", webpage)
+        mobj = re.search(r'hint=[\'"](?P<likecount>\d+) Likes / (?P<dislikecount>\d+) Dislikes', webpage)
        (like_count, dislike_count) = (mobj.group('likecount'), mobj.group('dislikecount')) if mobj else (None, None)

        mobj = re.search(r'</label>Comments \((?P<commentcount>\d+)\)</div>', webpage)
@@ -152,6 +156,12 @@ class XHamsterIE(InfoExtractor):

        self._sort_formats(formats)

+        categories_html = self._search_regex(
+            r'(?s)<table.+?(<span>Categories:.+?)</table>', webpage,
+            'categories', default=None)
+        categories = [clean_html(category) for category in re.findall(
+            r'<a[^>]+>(.+?)</a>', categories_html)] if categories_html else None
+
        return {
            'id': video_id,
            'title': title,
@@ -165,6 +175,7 @@ class XHamsterIE(InfoExtractor):
            'dislike_count': int_or_none(dislike_count),
            'comment_count': int_or_none(comment_count),
            'age_limit': age_limit,
+            'categories': categories,
            'formats': formats,
        }

--- a/youtube_dl/extractor/youku.py
+++ b/youtube_dl/extractor/youku.py
@@ -12,6 +12,7 @@ from ..utils import (
    ExtractorError,
    get_element_by_class,
    js_to_json,
+    str_or_none,
    strip_jsonp,
    urljoin,
 )
@@ -36,6 +37,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XMTc1ODE5Njcy',
            'title': '★Smile﹗♡ Git Fresh -Booty Music舞蹈.',
            'ext': 'mp4',
+            'duration': 74.73,
+            'thumbnail': r're:^https?://.*',
+            'uploader': '。躲猫猫、',
+            'uploader_id': '36017967',
+            'uploader_url': 'http://i.youku.com/u/UMTQ0MDcxODY4',
+            'tags': list,
        }
    }, {
        'url': 'http://player.youku.com/player.php/sid/XNDgyMDQ2NTQw/v.swf',
@@ -46,6 +53,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XODgxNjg1Mzk2',
            'ext': 'mp4',
            'title': '武媚娘传奇 85',
+            'duration': 1999.61,
+            'thumbnail': r're:^https?://.*',
+            'uploader': '疯狂豆花',
+            'uploader_id': '62583473',
+            'uploader_url': 'http://i.youku.com/u/UMjUwMzMzODky',
+            'tags': list,
        },
    }, {
        'url': 'http://v.youku.com/v_show/id_XMTI1OTczNDM5Mg==.html',
@@ -53,6 +66,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XMTI1OTczNDM5Mg',
            'ext': 'mp4',
            'title': '花千骨 04',
+            'duration': 2363,
+            'thumbnail': r're:^https?://.*',
+            'uploader': '放剧场-花千骨',
+            'uploader_id': '772849359',
+            'uploader_url': 'http://i.youku.com/u/UMzA5MTM5NzQzNg==',
+            'tags': list,
        },
    }, {
        'url': 'http://v.youku.com/v_show/id_XNjA1NzA2Njgw.html',
@@ -61,6 +80,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XNjA1NzA2Njgw',
            'ext': 'mp4',
            'title': '邢義田复旦讲座之想象中的胡人—从“左衽孔子”说起',
+            'duration': 7264.5,
+            'thumbnail': r're:^https?://.*',
+            'uploader': 'FoxJin1006',
+            'uploader_id': '322014285',
+            'uploader_url': 'http://i.youku.com/u/UMTI4ODA1NzE0MA==',
+            'tags': list,
        },
        'params': {
            'videopassword': '100600',
@@ -72,6 +97,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XOTUxMzg4NDMy',
            'ext': 'mp4',
            'title': '我的世界☆明月庄主☆车震猎杀☆杀人艺术Minecraft',
+            'duration': 702.08,
+            'thumbnail': r're:^https?://.*',
+            'uploader': '明月庄主moon',
+            'uploader_id': '38465621',
+            'uploader_url': 'http://i.youku.com/u/UMTUzODYyNDg0',
+            'tags': list,
        },
    }, {
        'url': 'http://video.tudou.com/v/XMjIyNzAzMTQ4NA==.html?f=46177805',
@@ -79,6 +110,12 @@ class YoukuIE(InfoExtractor):
            'id': 'XMjIyNzAzMTQ4NA',
            'ext': 'mp4',
            'title': '卡马乔国足开大脚长传冲吊集锦',
+            'duration': 289,
+            'thumbnail': r're:^https?://.*',
+            'uploader': '阿卜杜拉之星',
+            'uploader_id': '2382249',
+            'uploader_url': 'http://i.youku.com/u/UOTUyODk5Ng==',
+            'tags': list,
        },
    }, {
        'url': 'http://video.tudou.com/v/XMjE4ODI3OTg2MA==.html',
@@ -154,7 +191,8 @@ class YoukuIE(InfoExtractor):
                raise ExtractorError(msg)

        # get video title
-        title = data['video']['title']
+        video_data = data['video']
+        title = video_data['title']

        formats = [{
            'url': stream['m3u8_url'],
@@ -171,6 +209,12 @@ class YoukuIE(InfoExtractor):
            'id': video_id,
            'title': title,
            'formats': formats,
+            'duration': video_data.get('seconds'),
+            'thumbnail': video_data.get('logo'),
+            'uploader': video_data.get('username'),
+            'uploader_id': str_or_none(video_data.get('userid')),
+            'uploader_url': data.get('uploader', {}).get('homepage'),
+            'tags': video_data.get('tags'),
        }


--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -1353,10 +1353,16 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
            start_time = parse_duration(time_point)
            if start_time is None:
                continue
+            if start_time > duration:
+                break
            end_time = (duration if next_num == len(chapter_lines)
                        else parse_duration(chapter_lines[next_num][1]))
            if end_time is None:
                continue
+            if end_time > duration:
+                end_time = duration
+            if start_time > end_time:
+                break
            chapter_title = re.sub(
                r'<a[^>]+>[^<]+</a>', '', chapter_line).strip(' \t-')
            chapter_title = re.sub(r'\s+', ' ', chapter_title)
@@ -1715,12 +1721,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                format_id = url_data['itag'][0]
                url = url_data['url'][0]

-                if 'sig' in url_data:
-                    url += '&signature=' + url_data['sig'][0]
-                elif 's' in url_data:
-                    encrypted_sig = url_data['s'][0]
+                if 's' in url_data or self._downloader.params.get('youtube_include_dash_manifest', True):
                    ASSETS_RE = r'"assets":.+?"js":\s*("[^"]+")'
-
                    jsplayer_url_json = self._search_regex(
                        ASSETS_RE,
                        embed_webpage if age_gate else video_webpage,
@@ -1741,6 +1743,11 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                            video_webpage, 'age gate player URL')
                        player_url = json.loads(player_url_json)

+                if 'sig' in url_data:
+                    url += '&signature=' + url_data['sig'][0]
+                elif 's' in url_data:
+                    encrypted_sig = url_data['s'][0]
+
                    if self._downloader.params.get('verbose'):
                        if player_url is None:
                            player_version = 'unknown'
--- a/youtube_dl/options.py
+++ b/youtube_dl/options.py
@@ -310,7 +310,7 @@ def parseOpts(overrideArguments=None):
        metavar='FILTER', dest='match_filter', default=None,
        help=(
            'Generic video filter. '
-            'Specify any key (see help for -o for a list of available keys) to '
+            'Specify any key (see the "OUTPUT TEMPLATE" for a list of available keys) to '
            'match if the key is present, '
            '!key to check if the key is not present, '
            'key > NUMBER (like "comment_count > 12", also works with '
@@ -618,7 +618,7 @@ def parseOpts(overrideArguments=None):
    verbosity.add_option(
        '-j', '--dump-json',
        action='store_true', dest='dumpjson', default=False,
-        help='Simulate, quiet but print JSON information. See --output for a description of available keys.')
+        help='Simulate, quiet but print JSON information. See the "OUTPUT TEMPLATE" for a description of available keys.')
    verbosity.add_option(
        '-J', '--dump-single-json',
        action='store_true', dest='dump_single_json', default=False,
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -36,6 +36,7 @@ import xml.etree.ElementTree
 import zlib

 from .compat import (
+    compat_HTMLParseError,
    compat_HTMLParser,
    compat_basestring,
    compat_chr,
@@ -409,8 +410,12 @@ def extract_attributes(html_element):
    but the cases in the unit test will work for all of 2.6, 2.7, 3.2-3.5.
    """
    parser = HTMLAttributeParser()
-    parser.feed(html_element)
-    parser.close()
+    try:
+        parser.feed(html_element)
+        parser.close()
+    # Older Python may throw HTMLParseError in case of malformed HTML
+    except compat_HTMLParseError:
+        pass
    return parser.attrs


@@ -932,14 +937,6 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
        except zlib.error:
            return zlib.decompress(data)

-    @staticmethod
-    def addinfourl_wrapper(stream, headers, url, code):
-        if hasattr(compat_urllib_request.addinfourl, 'getcode'):
-            return compat_urllib_request.addinfourl(stream, headers, url, code)
-        ret = compat_urllib_request.addinfourl(stream, headers, url)
-        ret.code = code
-        return ret
-
    def http_request(self, req):
        # According to RFC 3986, URLs can not contain non-ASCII characters, however this is not
        # always respected by websites, some tend to give out URLs with non percent-encoded
@@ -991,13 +988,13 @@ class YoutubeDLHandler(compat_urllib_request.HTTPHandler):
                    break
                else:
                    raise original_ioerror
-            resp = self.addinfourl_wrapper(uncompressed, old_resp.headers, old_resp.url, old_resp.code)
+            resp = compat_urllib_request.addinfourl(uncompressed, old_resp.headers, old_resp.url, old_resp.code)
            resp.msg = old_resp.msg
            del resp.headers['Content-encoding']
        # deflate
        if resp.headers.get('Content-encoding', '') == 'deflate':
            gz = io.BytesIO(self.deflate(resp.read()))
-            resp = self.addinfourl_wrapper(gz, old_resp.headers, old_resp.url, old_resp.code)
+            resp = compat_urllib_request.addinfourl(gz, old_resp.headers, old_resp.url, old_resp.code)
            resp.msg = old_resp.msg
            del resp.headers['Content-encoding']
        # Percent-encode redirect URL of Location HTTP header to satisfy RFC 3986 (see
@@ -1187,7 +1184,7 @@ def unified_timestamp(date_str, day_first=True):
    if date_str is None:
        return None

-    date_str = date_str.replace(',', ' ')
+    date_str = re.sub(r'[,|]', '', date_str)

    pm_delta = 12 if re.search(r'(?i)PM', date_str) else 0
    timezone, date_str = extract_timezone(date_str)
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2017.05.26'
+__version__ = '2017.06.12'
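
The chapter bounds checks added to `youtube.py` in the diff above can be illustrated in isolation. This is a minimal sketch, not the extractor's actual code: the function name `clamp_chapters` and the `(start_time, title)` input shape are assumptions made for the example, and only the ordering of the checks mirrors the hunk.

```python
def clamp_chapters(points, duration):
    """Build chapters from ordered (start_time, title) pairs, bounded by duration.

    Hypothetical helper: the name and input shape are illustrative only.
    """
    chapters = []
    for next_num, (start_time, title) in enumerate(points, start=1):
        if start_time is None:
            continue
        # A chapter starting after the end of the video is bogus; stop here.
        if start_time > duration:
            break
        # The last chapter runs to the end of the video.
        end_time = duration if next_num == len(points) else points[next_num][0]
        if end_time is None:
            continue
        # Never let a chapter run past the end of the video.
        if end_time > duration:
            end_time = duration
        # Time points that go backwards indicate unusable data.
        if start_time > end_time:
            break
        chapters.append({
            'start_time': start_time,
            'end_time': end_time,
            'title': title,
        })
    return chapters
```

With a 120-second video and time points at 0, 60 and 200 seconds, this sketch keeps the first two chapters, clamps the second to end at 120, and drops the third.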
Author	SHA1	Message	Date
Sergey M․	cb1e6d8985	release 2017.06.12	2017-06-12 02:23:17 +07:00
Sergey M․	9932ac5c58	[ChangeLog] Actualize	2017-06-12 02:01:15 +07:00
Sergey M․	bf87c36c93	[xfileshare] PEP 8	2017-06-12 02:01:12 +07:00
Sergey M․	b4a3d461e4	[utils] Handle HTMLParseError in extract_attributes (closes #13349)	2017-06-12 01:52:24 +07:00
Sergey M․	72b409559c	[compat] Introduce compat_HTMLParseError	2017-06-12 01:50:32 +07:00
Sergey M․	534863e057	[xfileshare] Add support for rapidvideo (closes #13348)	2017-06-12 00:16:47 +07:00
Sergey M․	16bc958287	[xfileshare] Modernize and pass referrer	2017-06-12 00:14:04 +07:00
Sergey M․	624bd0104c	[rutv] Add support for testplayer.vgtrk.com (closes #13347)	2017-06-11 21:36:19 +07:00
Sergey M․	28a4d6cce8	[newgrounds] Extract more metadata (closes #13232)	2017-06-11 21:30:06 +07:00
Sergey M․	2ae2ffda5e	[utils] Improve unified_timestamp	2017-06-11 21:27:22 +07:00
Sergey M․	70e7967202	[newgrounds:playlist] Add extractor (closes #10611)	2017-06-11 20:50:55 +07:00
Sergey M․	6e999fbc12	[newgrounds] Improve formats and uploader extraction (closes #13346)	2017-06-11 19:44:44 +07:00
Sergey M․	7409af9eb3	[msn] Fix formats extraction	2017-06-11 08:56:53 +07:00
Sergey M․	4e3637034c	[extractor/generic] Ensure format id is unicode string	2017-06-10 23:56:20 +07:00
Sergey M․	1afd0b0da7	[extractor/common] Return unicode string from _match_id	2017-06-09 00:40:03 +07:00
Sergey M․	7515830422	[turbo] Ensure format id is string	2017-06-09 00:31:56 +07:00
Sergey M․	f5521ea209	[sexu] Ensure height is int	2017-06-09 00:30:23 +07:00
Sergey M․	34646967ba	[jove] Ensure comment count is int	2017-06-09 00:29:20 +07:00
Sergey M․	e4d2e76d8e	[golem] Ensure format id is string	2017-06-09 00:27:11 +07:00
Sergey M․	87f5646937	[gfycat] Ensure filesize is int	2017-06-09 00:24:23 +07:00
Sergey M․	cc69a3de1b	[foxgay] Ensure height is int	2017-06-09 00:22:14 +07:00
Sergey M․	15aeeb1188	[flickr] Ensure format id is string	2017-06-09 00:20:07 +07:00
Sergey M․	1693bebe4d	[sohu] Fix numeric fields	2017-06-09 00:16:42 +07:00
Sergey M․	4244a13a1d	[safari] Improve authentication detection (closes #13319)	2017-06-08 23:20:48 +07:00
Sergey M․	931adf8cc1	[liveleak] Ensure height is int (closes #13313)	2017-06-08 22:54:30 +07:00
Sergey M․	c996943418	[YoutubeDL] Sanitize more fields (#13313)	2017-06-08 22:53:14 +07:00
Sergey M․	76e6378358	[README.md] Improve man page formatting	2017-06-08 22:02:42 +07:00
Sergey M․	a355b57f58	[README.md] Clarify output template references (closes #13316)	2017-06-08 21:52:19 +07:00
Sergey M․	1508da30c2	[streamango] Skip download for test (closes #13292)	2017-06-07 21:53:40 +07:00
Luca Steeb	eb703e5380	[streamango] Make title optional	2017-06-07 21:53:33 +07:00
Sergey M․	0a3924e746	[rtlnl] Improve _VALID_URL (closes #13295)	2017-06-06 21:21:44 +07:00
Sergey M․	e1db730d86	[tvplayer] Fix extraction (closes #13291)	2017-06-06 00:13:57 +07:00
Sergey M․	537191826f	release 2017.06.05	2017-06-05 00:48:07 +07:00
Sergey M․	130880ba48	[ChangeLog] Actualize	2017-06-05 00:43:38 +07:00
Sergey M․	f8ba3fda4d	Credit @jktjkt for dvtv formats (#13063)	2017-06-05 00:38:44 +07:00
Sergey M․	e1b90cc3db	Credit @mikf for beam:vod (#13032)	2017-06-05 00:35:41 +07:00
Sergey M․	43e6579558	Credit @adamvoss for bandcamp:weekly (#12758)	2017-06-04 23:22:19 +07:00
Sergey M․	6d923aab35	[bandcamp:weekly] Improve and extract more metadata (closes #12758)	2017-06-04 23:21:30 +07:00
Adam Voss	62bafabc09	[bandcamp:weekly] Add extractor	2017-06-04 23:21:07 +07:00
Sergey M․	9edcdac90c	[pornhub:uservideos] Add missing raise	2017-06-04 20:39:55 +07:00
Sergey M․	cd138d8bd4	[pornhub:playlist] Fix extraction (closes #13281)	2017-06-04 15:54:19 +07:00
Sergey M․	cd750b731c	[godtv] Remove extractor (closes #13175)	2017-06-03 22:08:12 +07:00
CeruleanSky	4bede0d8f5	[YoutubeDL] Don't emit ANSI escape codes on Windows	2017-06-03 19:14:23 +07:00
Sergey M․	f129c3f349	[safari] Fix typo (closes #13252)	2017-06-02 01:03:51 +07:00
Sergey M․	39d4c1be4d	[youtube] Improve chapters extraction (closes #13247)	2017-06-01 23:29:45 +07:00
Sergey M․	f7a747ce59	[1tv] Lower preference for http formats (closes #13246)	2017-06-01 22:19:52 +07:00
Sergey M․	4489d41816	[francetv] Relax _VALID_URL	2017-06-01 00:15:15 +07:00
Sergey M․	87b5184a0d	[drbonanza] Fix extraction (closes #13231)	2017-05-31 23:56:32 +07:00
Remita Amine	c56ad5c975	[packtpub] Fix authentication (closes #13240)	2017-05-31 15:44:29 +01:00
Sergey M․	6b7ce85cdc	[README.md] Mention http_dash_segments protocol	2017-05-30 23:50:48 +07:00
Yen Chi Hsuan	d10d0e3cf8	[README.md] Add an example for how to use .netrc on Windows. That's a Python bug: http://bugs.python.org/issue28334. Most likely it will be fixed in Python 3.7: https://github.com/python/cpython/pull/123	2017-05-29 14:58:07 +08:00
Sergey M․	941ea38ef5	release 2017.05.29	2017-05-29 00:42:18 +07:00
Sergey M․	99bea8d298	[ChangeLog] Actualize	2017-05-29 00:33:56 +07:00
Yen Chi Hsuan	a49eccdfa7	[youtube] Parse player_url if format URLs are encrypted or DASH MPDs are requested. Fixes #13211	2017-05-28 20:20:20 +08:00
Sergey M․	a846173d93	[xhamster] Simplify (closes #13216)	2017-05-28 07:55:56 +07:00
fiocfun	78e210dea5	[xhamster] Fix author and like/dislike count extraction	2017-05-28 07:55:07 +07:00
Sergey M․	8555204274	[xhamster] Extract categories (closes #11728)	2017-05-28 07:50:15 +07:00
Sergey M․	164fcbfeb7	[abcnews] Improve and remove duplicate test (closes #12851)	2017-05-28 07:06:56 +07:00
Tithen-Firion	bc22df29c4	[abcnews] Add support for embed URLs	2017-05-28 07:06:29 +07:00
Sergey M․	7e688d2f6a	[gaskrank] Improve (closes #12493)	2017-05-28 06:47:38 +07:00
motophil	5a6d1da442	[gaskrank] Fix extraction	2017-05-28 06:47:30 +07:00
Sergey M․	703751add4	[medialaan] PEP 8 (closes #12774)	2017-05-28 06:27:57 +07:00
midas02	4050be78e5	[medialaan] Fix videos with missing videoUrl. A rough trick to get around the two different JSON styles medialaan seems to be using. Fix for these example videos: https://vtmkzoom.be/video?aid=45724 and https://vtmkzoom.be/video?aid=45425	2017-05-28 06:27:52 +07:00
Sergey M․	4d9fc40100	[dvtv] Improve and fix playlists support (closes #13063)	2017-05-28 06:19:54 +07:00
Jan Kundrát	765522345f	[dvtv] Parse adaptive formats as well. The old code hit an error when it attempted to parse the string "adaptive" for video height. Actually parsing the returned playlists is a good idea because it adds more output formats, including some audio-only ones.	2017-05-28 06:19:46 +07:00
Sergey M․	6bceb36b99	[beam] Improve and add support for mixer.com (closes #13032)	2017-05-28 05:43:04 +07:00
Mike Fährmann	1e0d65f0bd	[beam:vod] Add extractor	2017-05-28 05:42:23 +07:00
Sergey M․	03327bc9a6	[cbsinteractive] Relax _VALID_URL (closes #13213)	2017-05-27 22:37:24 +07:00
Yen Chi Hsuan	b407d8533d	[utils] Drop a compatibility wrapper for Python < 2.6. addinfourl.getcode was added in Python 2.6a1. As youtube-dl now requires 2.6+, this is no longer necessary. See `9b0d46db11`	2017-05-27 23:05:02 +08:00
Remita Amine	20e2c9de04	[adn] fix formats extraction	2017-05-26 20:00:44 +01:00
Yen Chi Hsuan	d16c0121b9	[youku] Extract more metadata (closes #10433)	2017-05-27 00:08:37 +08:00
Sergey M․	7f4c3a7439	[cbsnews] Fix extraction (closes #13205)	2017-05-26 22:42:27 +07:00
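
Commits 72b409559c and b4a3d461e4 in the table above introduce `compat_HTMLParseError` and catch it in `extract_attributes`. The sketch below shows the general pattern under stated assumptions; it is not the project's code. The names `AttributeParser` and `extract_attributes_tolerant` are hypothetical, and the import fallback relies on `HTMLParseError` having been removed from `html.parser` in Python 3.5.

```python
try:  # Python 2
    from HTMLParser import HTMLParser, HTMLParseError
except ImportError:  # Python 3
    from html.parser import HTMLParser
    try:
        # Only present in older Python 3 releases (removed in 3.5).
        from html.parser import HTMLParseError
    except ImportError:
        class HTMLParseError(Exception):
            """Placeholder so the except clause below stays valid."""


class AttributeParser(HTMLParser):
    """Hypothetical parser that records the attributes of a start tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs = dict(attrs)


def extract_attributes_tolerant(html_element):
    """Return the attributes of a single HTML element as a dict.

    Malformed HTML that makes an older parser raise HTMLParseError
    yields whatever attributes were collected before the error.
    """
    parser = AttributeParser()
    try:
        parser.feed(html_element)
        parser.close()
    except HTMLParseError:
        pass
    return parser.attrs
```

On interpreters where the parser never raises `HTMLParseError`, the placeholder class simply never matches, so the `try` block costs nothing.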