Compare commits
98 Commits
2014.05.13...2014.06.04
| SHA1 |
| --- |
| b675b32e6b |
| 6a3fa81ffb |
| 0d69795014 |
| 3374f3fdc2 |
| 4bf0727b1f |
| 263bd4ec50 |
| b7e8b6e37a |
| ceb7a17f34 |
| 1a2f2e1e66 |
| 6803016858 |
| 9b7c4fd981 |
| dc31942f42 |
| 1f6b8f3115 |
| 9c7b79acd9 |
| 9168308579 |
| 7e8fdb1aae |
| 386ba39cac |
| 236d0cd07c |
| ed86f38a11 |
| 6db80ad2db |
| 6ebb46c106 |
| 0f97c9a06f |
| 77fb72646f |
| aae74e3832 |
| 894e730911 |
| 63961d87a6 |
| 87fe568c28 |
| 46531b374d |
| 9e8753911c |
| 5c6b1e578c |
| 8f0c8fb452 |
| b702ecebf0 |
| 950dc95e97 |
| d9dd3584e1 |
| 15a9f36849 |
| d0087d4ff2 |
| cc5ada6f4c |
| dfb2e1a325 |
| 65bab327b4 |
| 9eeb7abc6b |
| c70df21099 |
| 418424e5f5 |
| 8477466125 |
| 865dbd4a26 |
| b1e6f55912 |
| 4d78f3b770 |
| 7f739999e9 |
| 0f8a01d4f3 |
| e2bf499b14 |
| 7cf4547ab6 |
| 8ae980807a |
| eec4d8ef96 |
| 1c783bca88 |
| ac73651f66 |
| e5ceb3bfda |
| c2ef29234c |
| 1a1826c1af |
| c7c6d43fe1 |
| 2902d44f99 |
| d6e4ba287b |
| f50ee8d1c3 |
| 0e67ab0d8e |
| 77541837e5 |
| e3a6576f35 |
| 89bb8e97ee |
| 375696b1b1 |
| 4ea5c7b70d |
| 8dfa187b8a |
| c1ed1f7055 |
| 1514f74967 |
| 2e8323e3f7 |
| 69f8364042 |
| 79981f039b |
| 91994c2c81 |
| 76e92371ac |
| 08af0205f9 |
| a725fb1f43 |
| 05ee2b6dad |
| b74feacac5 |
| 426b52fc5d |
| 5c30b26846 |
| f07b74fc18 |
| a5a45015ba |
| beee53de06 |
| 8712f2bea7 |
| ea102818c9 |
| 0a871f6880 |
| 481efc84a8 |
| 01ed5c9be3 |
| ad3bc6acd5 |
| 5afa7f8bee |
| ec8deefc27 |
| a2d5a4ee64 |
| dffcc2ea0c |
| 1800eeefed |
| d7e7dedbde |
| d19bb9c0aa |
| 3ef79a974a |
CHANGELOG (14 changes)

@@ -1,14 +0,0 @@
-2013.01.02 Codename: GIULIA
-
-* Add support for ComedyCentral clips <nto>
-* Corrected Vimeo description fetching <Nick Daniels>
-* Added the --no-post-overwrites argument <Barbu Paul - Gheorghe>
-* --verbose offers more environment info
-* New info_dict field: uploader_id
-* New updates system, with signature checking
-* New IEs: NBA, JustinTV, FunnyOrDie, TweetReel, Steam, Ustream
-* Fixed IEs: BlipTv
-* Fixed for Python 3 IEs: Xvideo, Youku, XNXX, Dailymotion, Vimeo, InfoQ
-* Simplified IEs and test code
-* Various (Python 3 and other) fixes
-* Revamped and expanded tests
Makefile (8 changes)

@@ -1,7 +1,7 @@
 all: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion

 clean:
-	rm -rf youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz

 cleanall: clean
 	rm -f youtube-dl youtube-dl.exe

@@ -55,7 +55,9 @@ README.txt: README.md
 	pandoc -f markdown -t plain README.md -o README.txt

 youtube-dl.1: README.md
-	pandoc -s -f markdown -t man README.md -o youtube-dl.1
+	python devscripts/prepare_manpage.py >youtube-dl.1.temp.md
+	pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
+	rm -f youtube-dl.1.temp.md

 youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in
 	python devscripts/bash-completion.py

@@ -75,6 +77,6 @@ youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-
 		--exclude 'docs/_build' \
 		-- \
 		bin devscripts test youtube_dl docs \
-		CHANGELOG LICENSE README.md README.txt \
+		LICENSE README.md README.txt \
 		Makefile MANIFEST.in youtube-dl.1 youtube-dl.bash-completion setup.py \
 		youtube-dl
README.md (21 changes)

@@ -1,11 +1,24 @@
-% YOUTUBE-DL(1)
-
-# NAME
 youtube-dl - download videos from youtube.com or other video platforms

 # SYNOPSIS
 **youtube-dl** [OPTIONS] URL [URL...]

+# INSTALLATION
+
+To install it right away for all UNIX users (Linux, OS X, etc.), type:
+
+    sudo curl https://yt-dl.org/latest/youtube-dl -o /usr/local/bin/youtube-dl
+    sudo chmod a+x /usr/local/bin/youtube-dl
+
+If you do not have curl, you can alternatively use a recent wget:
+
+    sudo wget https://yt-dl.org/downloads/2014.05.13/youtube-dl -O /usr/local/bin/youtube-dl
+    sudo chmod a+x /usr/local/bin/youtube-dl
+
+Windows users can [download a .exe file](https://yt-dl.org/latest/youtube-dl.exe) and place it in their home directory or any other location on their [PATH](http://en.wikipedia.org/wiki/PATH_%28variable%29).
+
+Alternatively, refer to the developer instructions below for how to check out and work with the git repository. For further options, including PGP signatures, see https://rg3.github.io/youtube-dl/download.html .
+
 # DESCRIPTION
 **youtube-dl** is a small command-line program to download videos from
 YouTube.com and a few more sites. It requires the Python interpreter, version

@@ -458,7 +471,7 @@ If your report is shorter than two lines, it is almost certainly missing some of

 For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.

-Site support requests must contain an example URL. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
+Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.

 ### Are you using the latest version?
@@ -15,7 +15,7 @@ header = oldreadme[:oldreadme.index('# OPTIONS')]
 footer = oldreadme[oldreadme.index('# CONFIGURATION'):]

 options = helptext[helptext.index(' General Options:') + 19:]
-options = re.sub(r'^ (\w.+)$', r'## \1', options, flags=re.M)
+options = re.sub(r'(?m)^ (\w.+)$', r'## \1', options)
 options = '# OPTIONS\n' + options + '\n'

 with io.open(README_FILE, 'w', encoding='utf-8') as f:
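A note on the regex change in the hunk above: the inline `(?m)` flag embeds `re.MULTILINE` in the pattern itself and gives the same result as passing `flags=re.M`; since `re.sub()` only accepts a `flags` argument on Python 2.7 and later, the inline spelling is the more portable form. A minimal sketch (the sample `options` string is made up for illustration):

```python
import re

# Hypothetical fragment of the --help text being reworked into Markdown headings.
options = ' General Options:\nsome indented body text\n'

with_flag = re.sub(r'^ (\w.+)$', r'## \1', options, flags=re.M)
inline = re.sub(r'(?m)^ (\w.+)$', r'## \1', options)

# Both spellings rewrite ' General Options:' to '## General Options:'.
assert with_flag == inline
```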
devscripts/prepare_manpage.py (new file, 20 lines)

@@ -0,0 +1,20 @@
+
+import io
+import os.path
+import sys
+import re
+
+ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+README_FILE = os.path.join(ROOT_DIR, 'README.md')
+
+with io.open(README_FILE, encoding='utf-8') as f:
+    readme = f.read()
+
+PREFIX = '%YOUTUBE-DL(1)\n\n# NAME\n'
+readme = re.sub(r'(?s)# INSTALLATION.*?(?=# DESCRIPTION)', '', readme)
+readme = PREFIX + readme
+
+if sys.version_info < (3, 0):
+    print(readme.encode('utf-8'))
+else:
+    print(readme)
@@ -45,9 +45,9 @@ fi
 /bin/echo -e "\n### Changing version in version.py..."
 sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py

-/bin/echo -e "\n### Committing CHANGELOG README.md and youtube_dl/version.py..."
+/bin/echo -e "\n### Committing README.md and youtube_dl/version.py..."
 make README.md
-git add CHANGELOG README.md youtube_dl/version.py
+git add README.md youtube_dl/version.py
 git commit -m "release $version"

 /bin/echo -e "\n### Now tagging, signing and pushing..."
@@ -67,7 +67,7 @@ class TestFormatSelection(unittest.TestCase):
         downloaded = ydl.downloaded_info_dicts[0]
         self.assertEqual(downloaded['ext'], 'mp4')

-        # No prefer_free_formats => prefer mp4 and flv for greater compatibilty
+        # No prefer_free_formats => prefer mp4 and flv for greater compatibility
         ydl = YDL()
         ydl.params['prefer_free_formats'] = False
         formats = [

@@ -279,7 +279,7 @@ class TestFormatSelection(unittest.TestCase):
         self.assertEqual(ydl._format_note({}), '')
         assertRegexpMatches(self, ydl._format_note({
             'vbr': 10,
-        }), '^x\s*10k$')
+        }), '^\s*10k$')

 if __name__ == '__main__':
     unittest.main()
@@ -13,7 +13,7 @@ from youtube_dl import YoutubeDL


 def _download_restricted(url, filename, age):
-    """ Returns true iff the file has been downloaded """
+    """ Returns true if the file has been downloaded """

     params = {
         'age_limit': age,
@@ -209,20 +209,20 @@ class TestPlaylists(unittest.TestCase):
     def test_ivi_compilation(self):
         dl = FakeYDL()
         ie = IviCompilationIE(dl)
-        result = ie.extract('http://www.ivi.ru/watch/dezhurnyi_angel')
+        result = ie.extract('http://www.ivi.ru/watch/dvoe_iz_lartsa')
         self.assertIsPlaylist(result)
-        self.assertEqual(result['id'], 'dezhurnyi_angel')
-        self.assertEqual(result['title'], 'Дежурный ангел (2010 - 2012)')
-        self.assertTrue(len(result['entries']) >= 23)
+        self.assertEqual(result['id'], 'dvoe_iz_lartsa')
+        self.assertEqual(result['title'], 'Двое из ларца (2006 - 2008)')
+        self.assertTrue(len(result['entries']) >= 24)

     def test_ivi_compilation_season(self):
         dl = FakeYDL()
         ie = IviCompilationIE(dl)
-        result = ie.extract('http://www.ivi.ru/watch/dezhurnyi_angel/season2')
+        result = ie.extract('http://www.ivi.ru/watch/dvoe_iz_lartsa/season1')
         self.assertIsPlaylist(result)
-        self.assertEqual(result['id'], 'dezhurnyi_angel/season2')
-        self.assertEqual(result['title'], 'Дежурный ангел (2010 - 2012) 2 сезон')
-        self.assertTrue(len(result['entries']) >= 7)
+        self.assertEqual(result['id'], 'dvoe_iz_lartsa/season1')
+        self.assertEqual(result['title'], 'Двое из ларца (2006 - 2008) 1 сезон')
+        self.assertTrue(len(result['entries']) >= 12)

     def test_imdb_list(self):
         dl = FakeYDL()
@@ -56,6 +56,7 @@ __authors__ = (
     'Nicolas Évrard',
     'Jason Normore',
     'Hoje Lee',
+    'Adam Thalhammer',
 )

 __license__ = 'Public Domain'
@@ -194,7 +194,10 @@ from .normalboots import NormalbootsIE
 from .novamov import NovaMovIE
 from .nowness import NownessIE
 from .nowvideo import NowVideoIE
-from .nrk import NRKIE
+from .nrk import (
+    NRKIE,
+    NRKTVIE,
+)
 from .ntv import NTVIE
 from .nytimes import NYTimesIE
 from .nuvid import NuvidIE

@@ -260,6 +263,7 @@ from .stanfordoc import StanfordOpenClassroomIE
 from .steam import SteamIE
 from .streamcloud import StreamcloudIE
 from .streamcz import StreamCZIE
+from .swrmediathek import SWRMediathekIE
 from .syfy import SyfyIE
 from .sztvhu import SztvHuIE
 from .teamcoco import TeamcocoIE
@@ -1,7 +1,6 @@
 # encoding: utf-8
 from __future__ import unicode_literals

-import datetime
 import re

 from .common import InfoExtractor

@@ -16,6 +15,7 @@ class AftonbladetIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'Vulkanutbrott i rymden - nu släpper NASA bilderna',
             'description': 'Jupiters måne mest aktiv av alla himlakroppar',
+            'timestamp': 1394142732,
             'upload_date': '20140306',
         },
     }

@@ -27,17 +27,17 @@ class AftonbladetIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)

         # find internal video meta data
-        META_URL = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
+        meta_url = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
         internal_meta_id = self._html_search_regex(
             r'data-aptomaId="([\w\d]+)"', webpage, 'internal_meta_id')
-        internal_meta_url = META_URL % internal_meta_id
+        internal_meta_url = meta_url % internal_meta_id
         internal_meta_json = self._download_json(
             internal_meta_url, video_id, 'Downloading video meta data')

         # find internal video formats
-        FORMATS_URL = 'http://aftonbladet-play.videodata.drvideo.aptoma.no/actions/video/?id=%s'
+        format_url = 'http://aftonbladet-play.videodata.drvideo.aptoma.no/actions/video/?id=%s'
         internal_video_id = internal_meta_json['videoId']
-        internal_formats_url = FORMATS_URL % internal_video_id
+        internal_formats_url = format_url % internal_video_id
         internal_formats_json = self._download_json(
             internal_formats_url, video_id, 'Downloading video formats')

@@ -54,16 +54,13 @@ class AftonbladetIE(InfoExtractor):
             })
         self._sort_formats(formats)

-        timestamp = datetime.datetime.fromtimestamp(internal_meta_json['timePublished'])
-        upload_date = timestamp.strftime('%Y%m%d')
-
         return {
             'id': video_id,
             'title': internal_meta_json['title'],
             'formats': formats,
             'thumbnail': internal_meta_json['imageUrl'],
             'description': internal_meta_json['shortPreamble'],
-            'upload_date': upload_date,
+            'timestamp': internal_meta_json['timePublished'],
             'duration': internal_meta_json['duration'],
             'view_count': internal_meta_json['views'],
         }
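The Aftonbladet hunks above are one of several changes in this comparison (blinkx and mailru get the same treatment) that stop deriving `upload_date` inside the extractor and return the raw Unix `timestamp` instead. For reference, the removed lines did roughly the following; this sketch only illustrates the conversion that was dropped, not where the date string is derived after the change:

```python
import datetime

# Value taken from the test case above.
timestamp = 1394142732

# What the removed extractor code did: turn the Unix timestamp into the
# YYYYMMDD string stored in 'upload_date'. fromtimestamp() uses the local
# timezone, so the resulting day can differ near midnight UTC.
upload_date = datetime.datetime.fromtimestamp(timestamp).strftime('%Y%m%d')
print(upload_date)  # '20140306' in most timezones
```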
@@ -38,15 +38,19 @@ class ARDIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)

         title = self._html_search_regex(
-            r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>', webpage, 'title')
+            [r'<h1(?:\s+class="boxTopHeadline")?>(.*?)</h1>',
+             r'<meta name="dcterms.title" content="(.*?)"/>',
+             r'<h4 class="headline">(.*?)</h4>'],
+            webpage, 'title')
         description = self._html_search_meta(
             'dcterms.abstract', webpage, 'description')
         thumbnail = self._og_search_thumbnail(webpage)

-        streams = [
-            mo.groupdict()
-            for mo in re.finditer(
-                r'mediaCollection\.addMediaStream\((?P<media_type>\d+), (?P<quality>\d+), "(?P<rtmp_url>[^"]*)", "(?P<video_url>[^"]*)", "[^"]*"\)', webpage)]
+        media_info = self._download_json(
+            'http://www.ardmediathek.de/play/media/%s' % video_id, video_id)
+        # The second element of the _mediaArray contains the standard http urls
+        streams = media_info['_mediaArray'][1]['_mediaStreamArray']
         if not streams:
             if '"fsk"' in webpage:
                 raise ExtractorError('This video is only available after 20:00')

@@ -54,21 +58,12 @@ class ARDIE(InfoExtractor):
         formats = []
         for s in streams:
             format = {
-                'quality': int(s['quality']),
+                'quality': s['_quality'],
+                'url': s['_stream'],
             }
-            if s.get('rtmp_url'):
-                format['protocol'] = 'rtmp'
-                format['url'] = s['rtmp_url']
-                format['playpath'] = s['video_url']
-            else:
-                format['url'] = s['video_url']

-            quality_name = self._search_regex(
-                r'[,.]([a-zA-Z0-9_-]+),?\.mp4', format['url'],
-                'quality name', default='NA')
-            format['format_id'] = '%s-%s-%s-%s' % (
-                determine_ext(format['url']), quality_name, s['media_type'],
-                s['quality'])
+            format['format_id'] = '%s-%s' % (
+                determine_ext(format['url']), format['quality'])

             formats.append(format)

@@ -19,7 +19,7 @@ class BandcampIE(InfoExtractor):
         'md5': 'c557841d5e50261777a6585648adf439',
         'info_dict': {
             "title": "youtube-dl \"'/\\\u00e4\u21ad - youtube-dl test song \"'/\\\u00e4\u21ad",
-            "duration": 10,
+            "duration": 9.8485,
         },
         '_skip': 'There is a limit of 200 free downloads / month for the test song'
     }]

@@ -28,36 +28,32 @@ class BandcampIE(InfoExtractor):
         mobj = re.match(self._VALID_URL, url)
         title = mobj.group('title')
         webpage = self._download_webpage(url, title)
-        # We get the link to the free download page
         m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage)
-        if m_download is None:
+        if not m_download:
             m_trackinfo = re.search(r'trackinfo: (.+),\s*?\n', webpage)
             if m_trackinfo:
                 json_code = m_trackinfo.group(1)
-                data = json.loads(json_code)
-                d = data[0]
+                data = json.loads(json_code)[0]

-                duration = int(round(d['duration']))
                 formats = []
-                for format_id, format_url in d['file'].items():
-                    ext, _, abr_str = format_id.partition('-')
-
+                for format_id, format_url in data['file'].items():
+                    ext, abr_str = format_id.split('-', 1)
                     formats.append({
                         'format_id': format_id,
                         'url': format_url,
-                        'ext': format_id.partition('-')[0],
+                        'ext': ext,
                         'vcodec': 'none',
-                        'acodec': format_id.partition('-')[0],
-                        'abr': int(format_id.partition('-')[2]),
+                        'acodec': ext,
+                        'abr': int(abr_str),
                     })

                 self._sort_formats(formats)

                 return {
-                    'id': compat_str(d['id']),
-                    'title': d['title'],
+                    'id': compat_str(data['id']),
+                    'title': data['title'],
                     'formats': formats,
-                    'duration': duration,
+                    'duration': float(data['duration']),
                 }
             else:
                 raise ExtractorError('No free songs found')
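A side note on the Bandcamp format-id handling rewritten above: `str.partition` and `str.split(..., 1)` carve an id like `'mp3-320'` into the same two pieces, the new spelling just avoids repeating the `partition` call for every field. A quick check:

```python
format_id = 'mp3-320'

ext_old, _, abr_old = format_id.partition('-')   # ('mp3', '-', '320')
ext_new, abr_new = format_id.split('-', 1)       # ['mp3', '320']

assert (ext_old, abr_old) == (ext_new, abr_new) == ('mp3', '320')
```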
@@ -67,11 +63,9 @@ class BandcampIE(InfoExtractor):
             r'var TralbumData = {(.*?)id: (?P<id>\d*?)$',
             webpage, re.MULTILINE | re.DOTALL).group('id')

-        download_webpage = self._download_webpage(download_link, video_id,
-                                                  'Downloading free downloads page')
-        # We get the dictionary of the track from some javascrip code
-        info = re.search(r'items: (.*?),$',
-                         download_webpage, re.MULTILINE).group(1)
+        download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
+        # We get the dictionary of the track from some javascript code
+        info = re.search(r'items: (.*?),$', download_webpage, re.MULTILINE).group(1)
         info = json.loads(info)[0]
         # We pick mp3-320 for now, until format selection can be easily implemented.
         mp3_info = info['downloads']['mp3-320']

@@ -100,7 +94,7 @@ class BandcampIE(InfoExtractor):

 class BandcampAlbumIE(InfoExtractor):
     IE_NAME = 'Bandcamp:album'
-    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))?'
+    _VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))'

     _TEST = {
         'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',

@@ -123,7 +117,7 @@ class BandcampAlbumIE(InfoExtractor):
         'params': {
             'playlistend': 2
         },
-        'skip': 'Bancamp imposes download limits. See test_playlists:test_bandcamp_album for the playlist test'
+        'skip': 'Bandcamp imposes download limits. See test_playlists:test_bandcamp_album for the playlist test'
     }

     def _real_extract(self, url):
@@ -1,6 +1,5 @@
 from __future__ import unicode_literals

-import datetime
 import json
 import re

@@ -19,15 +18,16 @@ class BlinkxIE(InfoExtractor):
         'file': '8aQUy7GV.mp4',
         'md5': '2e9a07364af40163a908edbf10bb2492',
         'info_dict': {
-            "title": "Police Car Rolls Away",
-            "uploader": "stupidvideos.com",
-            "upload_date": "20131215",
-            "description": "A police car gently rolls away from a fight. Maybe it felt weird being around a confrontation and just had to get out of there!",
-            "duration": 14.886,
-            "thumbnails": [{
-                "width": 100,
-                "height": 76,
-                "url": "http://cdn.blinkx.com/stream/b/41/StupidVideos/20131215/1873969261/1873969261_tn_0.jpg",
+            'title': 'Police Car Rolls Away',
+            'uploader': 'stupidvideos.com',
+            'upload_date': '20131215',
+            'timestamp': 1387068000,
+            'description': 'A police car gently rolls away from a fight. Maybe it felt weird being around a confrontation and just had to get out of there!',
+            'duration': 14.886,
+            'thumbnails': [{
+                'width': 100,
+                'height': 76,
+                'url': 'http://cdn.blinkx.com/stream/b/41/StupidVideos/20131215/1873969261/1873969261_tn_0.jpg',
             }],
         },
     }

@@ -41,9 +41,6 @@ class BlinkxIE(InfoExtractor):
             'video=%s' % video_id)
         data_json = self._download_webpage(api_url, display_id)
         data = json.loads(data_json)['api']['results'][0]
-        dt = datetime.datetime.fromtimestamp(data['pubdate_epoch'])
-        pload_date = dt.strftime('%Y%m%d')
-
         duration = None
         thumbnails = []
         formats = []

@@ -64,10 +61,7 @@ class BlinkxIE(InfoExtractor):
             vcodec = remove_start(m['vcodec'], 'ff')
             acodec = remove_start(m['acodec'], 'ff')
             tbr = (int(m['vbr']) + int(m['abr'])) // 1000
-            format_id = (u'%s-%sk-%s' %
-                         (vcodec,
-                          tbr,
-                          m['w']))
+            format_id = u'%s-%sk-%s' % (vcodec, tbr, m['w'])
             formats.append({
                 'format_id': format_id,
                 'url': m['link'],

@@ -88,7 +82,7 @@ class BlinkxIE(InfoExtractor):
             'title': data['title'],
             'formats': formats,
             'uploader': data['channel_name'],
-            'upload_date': pload_date,
+            'timestamp': data['pubdate_epoch'],
             'description': data.get('description'),
             'thumbnails': thumbnails,
             'duration': duration,
@@ -1,102 +1,124 @@
 from __future__ import unicode_literals

-import datetime
 import re

 from .common import InfoExtractor
 from .subtitles import SubtitlesInfoExtractor
 from ..utils import (
-    compat_str,
     compat_urllib_request,
     unescapeHTML,
+    parse_iso8601,
+    compat_urlparse,
+    clean_html,
+    compat_str,
 )


 class BlipTVIE(SubtitlesInfoExtractor):
-    """Information extractor for blip.tv"""
+    _VALID_URL = r'https?://(?:\w+\.)?blip\.tv/(?:(?:.+-|rss/flash/)(?P<id>\d+)|((?:play/|api\.swf#)(?P<lookup_id>[\da-zA-Z]+)))'

-    _VALID_URL = r'https?://(?:\w+\.)?blip\.tv/((.+/)|(play/)|(api\.swf#))(?P<presumptive_id>.+)$'
-
-    _TESTS = [{
-        'url': 'http://blip.tv/cbr/cbr-exclusive-gotham-city-imposters-bats-vs-jokerz-short-3-5796352',
-        'md5': 'c6934ad0b6acf2bd920720ec888eb812',
-        'info_dict': {
-            'id': '5779306',
-            'ext': 'mov',
-            'upload_date': '20111205',
-            'description': 'md5:9bc31f227219cde65e47eeec8d2dc596',
-            'uploader': 'Comic Book Resources - CBR TV',
-            'title': 'CBR EXCLUSIVE: "Gotham City Imposters" Bats VS Jokerz Short 3',
-        }
-    }, {
-        # https://github.com/rg3/youtube-dl/pull/2274
-        'note': 'Video with subtitles',
-        'url': 'http://blip.tv/play/h6Uag5OEVgI.html',
-        'md5': '309f9d25b820b086ca163ffac8031806',
-        'info_dict': {
-            'id': '6586561',
-            'ext': 'mp4',
-            'uploader': 'Red vs. Blue',
-            'description': 'One-Zero-One',
-            'upload_date': '20130614',
-            'title': 'Red vs. Blue Season 11 Episode 1',
-        }
-    }]
+    _TESTS = [
+        {
+            'url': 'http://blip.tv/cbr/cbr-exclusive-gotham-city-imposters-bats-vs-jokerz-short-3-5796352',
+            'md5': 'c6934ad0b6acf2bd920720ec888eb812',
+            'info_dict': {
+                'id': '5779306',
+                'ext': 'mov',
+                'title': 'CBR EXCLUSIVE: "Gotham City Imposters" Bats VS Jokerz Short 3',
+                'description': 'md5:9bc31f227219cde65e47eeec8d2dc596',
+                'timestamp': 1323138843,
+                'upload_date': '20111206',
+                'uploader': 'cbr',
+                'uploader_id': '679425',
+                'duration': 81,
+            }
+        },
+        {
+            # https://github.com/rg3/youtube-dl/pull/2274
+            'note': 'Video with subtitles',
+            'url': 'http://blip.tv/play/h6Uag5OEVgI.html',
+            'md5': '309f9d25b820b086ca163ffac8031806',
+            'info_dict': {
+                'id': '6586561',
+                'ext': 'mp4',
+                'title': 'Red vs. Blue Season 11 Episode 1',
+                'description': 'One-Zero-One',
+                'timestamp': 1371261608,
+                'upload_date': '20130615',
+                'uploader': 'redvsblue',
+                'uploader_id': '792887',
+                'duration': 279,
+            }
+        }
+    ]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
-        presumptive_id = mobj.group('presumptive_id')
+        lookup_id = mobj.group('lookup_id')

         # See https://github.com/rg3/youtube-dl/issues/857
-        embed_mobj = re.match(r'https?://(?:\w+\.)?blip\.tv/(?:play/|api\.swf#)([a-zA-Z0-9]+)', url)
-        if embed_mobj:
-            info_url = 'http://blip.tv/play/%s.x?p=1' % embed_mobj.group(1)
-            info_page = self._download_webpage(info_url, embed_mobj.group(1))
-            video_id = self._search_regex(
-                r'data-episode-id="([0-9]+)', info_page, 'video_id')
-            return self.url_result('http://blip.tv/a/a-' + video_id, 'BlipTV')
-
-        cchar = '&' if '?' in url else '?'
-        json_url = url + cchar + 'skin=json&version=2&no_wrap=1'
-        request = compat_urllib_request.Request(json_url)
-        request.add_header('User-Agent', 'iTunes/10.6.1')
-
-        json_data = self._download_json(request, video_id=presumptive_id)
-
-        if 'Post' in json_data:
-            data = json_data['Post']
+        if lookup_id:
+            info_page = self._download_webpage(
+                'http://blip.tv/play/%s.x?p=1' % lookup_id, lookup_id, 'Resolving lookup id')
+            video_id = self._search_regex(r'data-episode-id="([0-9]+)', info_page, 'video_id')
         else:
-            data = json_data
+            video_id = mobj.group('id')
+
+        rss = self._download_xml('http://blip.tv/rss/flash/%s' % video_id, video_id, 'Downloading video RSS')
+
+        def blip(s):
+            return '{http://blip.tv/dtd/blip/1.0}%s' % s
+
+        def media(s):
+            return '{http://search.yahoo.com/mrss/}%s' % s
+
+        def itunes(s):
+            return '{http://www.itunes.com/dtds/podcast-1.0.dtd}%s' % s
+
+        item = rss.find('channel/item')
+
+        video_id = item.find(blip('item_id')).text
+        title = item.find('./title').text
+        description = clean_html(compat_str(item.find(blip('puredescription')).text))
+        timestamp = parse_iso8601(item.find(blip('datestamp')).text)
+        uploader = item.find(blip('user')).text
+        uploader_id = item.find(blip('userid')).text
+        duration = int(item.find(blip('runtime')).text)
+        media_thumbnail = item.find(media('thumbnail'))
+        thumbnail = media_thumbnail.get('url') if media_thumbnail is not None else item.find(itunes('image')).text
+        categories = [category.text for category in item.findall('category')]

-        video_id = compat_str(data['item_id'])
-        upload_date = datetime.datetime.strptime(data['datestamp'], '%m-%d-%y %H:%M%p').strftime('%Y%m%d')
-        subtitles = {}
         formats = []
-        if 'additionalMedia' in data:
-            for f in data['additionalMedia']:
-                if f.get('file_type_srt') == 1:
-                    LANGS = {
-                        'english': 'en',
-                    }
-                    lang = f['role'].rpartition('-')[-1].strip().lower()
-                    langcode = LANGS.get(lang, lang)
-                    subtitles[langcode] = f['url']
-                    continue
-                if not int(f['media_width']): # filter m3u8
-                    continue
-                formats.append({
-                    'url': f['url'],
-                    'format_id': f['role'],
-                    'width': int(f['media_width']),
-                    'height': int(f['media_height']),
-                })
-        else:
-            formats.append({
-                'url': data['media']['url'],
-                'width': int(data['media']['width']),
-                'height': int(data['media']['height']),
-            })
+        subtitles = {}
+
+        media_group = item.find(media('group'))
+        for media_content in media_group.findall(media('content')):
+            url = media_content.get('url')
+            role = media_content.get(blip('role'))
+            msg = self._download_webpage(
+                url + '?showplayer=20140425131715&referrer=http://blip.tv&mask=7&skin=flashvars&view=url',
+                video_id, 'Resolving URL for %s' % role)
+            real_url = compat_urlparse.parse_qs(msg)['message'][0]
+
+            media_type = media_content.get('type')
+            if media_type == 'text/srt' or url.endswith('.srt'):
+                LANGS = {
+                    'english': 'en',
+                }
+                lang = role.rpartition('-')[-1].strip().lower()
+                langcode = LANGS.get(lang, lang)
+                subtitles[langcode] = url
+            elif media_type.startswith('video/'):
+                formats.append({
+                    'url': real_url,
+                    'format_id': role,
+                    'format_note': media_type,
+                    'vcodec': media_content.get(blip('vcodec')),
+                    'acodec': media_content.get(blip('acodec')),
+                    'filesize': media_content.get('filesize'),
+                    'width': int(media_content.get('width')),
+                    'height': int(media_content.get('height')),
+                })
         self._sort_formats(formats)

         # subtitles

@@ -107,12 +129,14 @@ class BlipTVIE(SubtitlesInfoExtractor):

         return {
             'id': video_id,
-            'uploader': data['display_name'],
-            'upload_date': upload_date,
-            'title': data['title'],
-            'thumbnail': data['thumbnailUrl'],
-            'description': data['description'],
-            'user_agent': 'iTunes/10.6.1',
+            'title': title,
+            'description': description,
+            'timestamp': timestamp,
+            'uploader': uploader,
+            'uploader_id': uploader_id,
+            'duration': duration,
+            'thumbnail': thumbnail,
+            'categories': categories,
             'formats': formats,
             'subtitles': video_subtitles,
         }
@@ -1,10 +1,12 @@
 # encoding: utf-8
 from __future__ import unicode_literals

 import re

 from .common import InfoExtractor
 from ..utils import (
     ExtractorError,
+    int_or_none,
 )

@@ -13,9 +15,10 @@ class CinemassacreIE(InfoExtractor):
     _TESTS = [
         {
             'url': 'http://cinemassacre.com/2012/11/10/avgn-the-movie-trailer/',
-            'file': '19911.mp4',
-            'md5': '782f8504ca95a0eba8fc9177c373eec7',
+            'md5': 'fde81fbafaee331785f58cd6c0d46190',
             'info_dict': {
+                'id': '19911',
+                'ext': 'mp4',
                 'upload_date': '20121110',
                 'title': '“Angry Video Game Nerd: The Movie” – Trailer',
                 'description': 'md5:fb87405fcb42a331742a0dce2708560b',

@@ -23,9 +26,10 @@ class CinemassacreIE(InfoExtractor):
         },
         {
             'url': 'http://cinemassacre.com/2013/10/02/the-mummys-hand-1940',
-            'file': '521be8ef82b16.mp4',
-            'md5': 'dec39ee5118f8d9cc067f45f9cbe3a35',
+            'md5': 'd72f10cd39eac4215048f62ab477a511',
             'info_dict': {
+                'id': '521be8ef82b16',
+                'ext': 'mp4',
                 'upload_date': '20131002',
                 'title': 'The Mummy’s Hand (1940)',
             },

@@ -50,29 +54,40 @@ class CinemassacreIE(InfoExtractor):
             r'<div class="entry-content">(?P<description>.+?)</div>',
             webpage, 'description', flags=re.DOTALL, fatal=False)

-        playerdata = self._download_webpage(playerdata_url, video_id)
+        playerdata = self._download_webpage(playerdata_url, video_id, 'Downloading player webpage')
+        video_thumbnail = self._search_regex(
+            r'image: \'(?P<thumbnail>[^\']+)\'', playerdata, 'thumbnail', fatal=False)
+        sd_url = self._search_regex(r'file: \'([^\']+)\', label: \'SD\'', playerdata, 'sd_file')
+        videolist_url = self._search_regex(r'file: \'([^\']+\.smil)\'}', playerdata, 'videolist_url')

-        sd_url = self._html_search_regex(r'file: \'([^\']+)\', label: \'SD\'', playerdata, 'sd_file')
-        hd_url = self._html_search_regex(
-            r'file: \'([^\']+)\', label: \'HD\'', playerdata, 'hd_file',
-            default=None)
-        video_thumbnail = self._html_search_regex(r'image: \'(?P<thumbnail>[^\']+)\'', playerdata, 'thumbnail', fatal=False)
+        videolist = self._download_xml(videolist_url, video_id, 'Downloading videolist XML')

-        formats = [{
-            'url': sd_url,
-            'ext': 'mp4',
-            'format': 'sd',
-            'format_id': 'sd',
-            'quality': 1,
-        }]
-        if hd_url:
-            formats.append({
-                'url': hd_url,
-                'ext': 'mp4',
-                'format': 'hd',
-                'format_id': 'hd',
-                'quality': 2,
-            })
+        formats = []
+        baseurl = sd_url[:sd_url.rfind('/')+1]
+        for video in videolist.findall('.//video'):
+            src = video.get('src')
+            if not src:
+                continue
+            file_ = src.partition(':')[-1]
+            width = int_or_none(video.get('width'))
+            height = int_or_none(video.get('height'))
+            bitrate = int_or_none(video.get('system-bitrate'))
+            format = {
+                'url': baseurl + file_,
+                'format_id': src.rpartition('.')[0].rpartition('_')[-1],
+            }
+            if width or height:
+                format.update({
+                    'tbr': bitrate // 1000 if bitrate else None,
+                    'width': width,
+                    'height': height,
+                })
+            else:
+                format.update({
+                    'abr': bitrate // 1000 if bitrate else None,
+                    'vcodec': 'none',
+                })
+            formats.append(format)
         self._sort_formats(formats)

         return {
@@ -188,7 +188,7 @@ class ComedyCentralShowsIE(InfoExtractor):
                 })
             formats.append({
                 'format_id': 'rtmp-%s' % format,
-                'url': rtmp_video_url,
+                'url': rtmp_video_url.replace('viacomccstrm', 'viacommtvstrm'),
                 'ext': self._video_extensions.get(format, 'mp4'),
                 'height': h,
                 'width': w,
@@ -113,6 +113,8 @@ class InfoExtractor(object):
     webpage_url:    The url to the video webpage, if given to youtube-dl it
                     should allow to get the same result again. (It will be set
                     by YoutubeDL if it's missing)
+    categories:     A list of categories that the video falls in, for example
+                    ["Sports", "Berlin"]

     Unless mentioned otherwise, the fields should be Unicode strings.

@@ -242,7 +244,7 @@ class InfoExtractor(object):
             url = url_or_request.get_full_url()
         except AttributeError:
             url = url_or_request
-        basen = video_id + '_' + url
+        basen = '%s_%s' % (video_id, url)
         if len(basen) > 240:
             h = u'___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
             basen = basen[:240 - len(h)] + h
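The `basen` value changed above is used to build a dump filename, and the surrounding context shows how over-long names are handled: anything past 240 characters is replaced by `'___'` plus the md5 of the full string, which keeps the name short while still unique. A standalone sketch of that behaviour (`trim_name` is a made-up helper name for illustration, not part of youtube-dl):

```python
import hashlib


def trim_name(basen, limit=240):
    # Same logic as the context lines above: cut long names and append a
    # marker plus the md5 of the original so distinct inputs stay distinct.
    if len(basen) > limit:
        h = '___' + hashlib.md5(basen.encode('utf-8')).hexdigest()
        basen = basen[:limit - len(h)] + h
    return basen


print(len(trim_name('video_' + 'x' * 500)))  # 240
print(trim_name('short_name'))               # unchanged
```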
@@ -3,20 +3,18 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-)


 class EmpflixIE(InfoExtractor):
     _VALID_URL = r'^https?://www\.empflix\.com/videos/.*?-(?P<id>[0-9]+)\.html'
     _TEST = {
         'url': 'http://www.empflix.com/videos/Amateur-Finger-Fuck-33051.html',
-        'md5': '5e5cc160f38ca9857f318eb97146e13e',
+        'md5': 'b1bc15b6412d33902d6e5952035fcabc',
         'info_dict': {
             'id': '33051',
-            'ext': 'flv',
+            'ext': 'mp4',
             'title': 'Amateur Finger Fuck',
+            'description': 'Amateur solo finger fucking.',
             'age_limit': 18,
         }
     }

@@ -30,6 +28,8 @@ class EmpflixIE(InfoExtractor):

         video_title = self._html_search_regex(
             r'name="title" value="(?P<title>[^"]*)"', webpage, 'title')
+        video_description = self._html_search_regex(
+            r'name="description" value="([^"]*)"', webpage, 'description', fatal=False)

         cfg_url = self._html_search_regex(
             r'flashvars\.config = escape\("([^"]+)"',

@@ -37,12 +37,18 @@ class EmpflixIE(InfoExtractor):

         cfg_xml = self._download_xml(
             cfg_url, video_id, note='Downloading metadata')
-        video_url = cfg_xml.find('videoLink').text
+
+        formats = [
+            {
+                'url': item.find('videoLink').text,
+                'format_id': item.find('res').text,
+            } for item in cfg_xml.findall('./quality/item')
+        ]

         return {
             'id': video_id,
-            'url': video_url,
-            'ext': 'flv',
             'title': video_title,
+            'description': video_description,
+            'formats': formats,
             'age_limit': age_limit,
         }
@@ -37,7 +37,7 @@ class ExtremeTubeIE(InfoExtractor):
         webpage = self._download_webpage(req, video_id)

         video_title = self._html_search_regex(
-            r'<h1 [^>]*?title="([^"]+)"[^>]*>\1<', webpage, 'title')
+            r'<h1 [^>]*?title="([^"]+)"[^>]*>', webpage, 'title')
         uploader = self._html_search_regex(
             r'>Posted by:(?=<)(?:\s|<[^>]*>)*(.+?)\|', webpage, 'uploader',
             fatal=False)
@@ -13,7 +13,7 @@ from ..utils import (


 class FC2IE(InfoExtractor):
-    _VALID_URL = r'^http://video\.fc2\.com/(?P<lang>[^/]+)/content/(?P<id>[^/]+)'
+    _VALID_URL = r'^http://video\.fc2\.com/((?P<lang>[^/]+)/)?content/(?P<id>[^/]+)'
     IE_NAME = 'fc2'
     _TEST = {
         'url': 'http://video.fc2.com/en/content/20121103kUan1KHs',

@@ -36,7 +36,7 @@ class FC2IE(InfoExtractor):
         thumbnail = self._og_search_thumbnail(webpage)
         refer = url.replace('/content/', '/a/content/')

-        mimi = hashlib.md5(video_id + '_gGddgPfeaf_gzyr').hexdigest()
+        mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()

         info_url = (
             "http://video.fc2.com/ginfo.php?mimi={1:s}&href={2:s}&v={0:s}&fversion=WIN%2011%2C6%2C602%2C180&from=2&otag=0&upid={0:s}&tk=null&".
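The FC2 change above is a straightforward Python 3 fix: `hashlib.md5()` only accepts bytes there, so the concatenated string has to be encoded before hashing (Python 2 accepted the plain string as well). A minimal illustration, using the video id from the test URL above:

```python
import hashlib

video_id = '20121103kUan1KHs'

# Works on both Python 2 and 3: encode to bytes before hashing.
mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()
print(mimi)
```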
@@ -48,24 +48,36 @@ class PluzzIE(FranceTVBaseInfoExtractor):

 class FranceTvInfoIE(FranceTVBaseInfoExtractor):
     IE_NAME = 'francetvinfo.fr'
-    _VALID_URL = r'https?://www\.francetvinfo\.fr/replay.*/(?P<title>.+)\.html'
+    _VALID_URL = r'https?://www\.francetvinfo\.fr/.*/(?P<title>.+)\.html'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.francetvinfo.fr/replay-jt/france-3/soir-3/jt-grand-soir-3-lundi-26-aout-2013_393427.html',
-        'file': '84981923.mp4',
         'info_dict': {
+            'id': '84981923',
+            'ext': 'mp4',
             'title': 'Soir 3',
         },
         'params': {
             'skip_download': True,
         },
-    }
+    }, {
+        'url': 'http://www.francetvinfo.fr/elections/europeennes/direct-europeennes-regardez-le-debat-entre-les-candidats-a-la-presidence-de-la-commission_600639.html',
+        'info_dict': {
+            'id': 'EV_20019',
+            'ext': 'mp4',
+            'title': 'Débat des candidats à la Commission européenne',
+            'description': 'Débat des candidats à la Commission européenne',
+        },
+        'params': {
+            'skip_download': 'HLS (reqires ffmpeg)'
+        }
+    }]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         page_title = mobj.group('title')
         webpage = self._download_webpage(url, page_title)
-        video_id = self._search_regex(r'id-video=(\d+?)[@"]', webpage, 'video id')
+        video_id = self._search_regex(r'id-video=((?:[^0-9]*?_)?[0-9]+)[@"]', webpage, 'video id')
         return self._extract_video(video_id)

@@ -15,7 +15,7 @@ class GamekingsIE(InfoExtractor):
         'id': '20130811',
         'ext': 'mp4',
         'title': 'Phoenix Wright: Ace Attorney \u2013 Dual Destinies Review',
-        'description': 'md5:632e61a9f97d700e83f43d77ddafb6a4',
+        'description': 'md5:36fd701e57e8c15ac8682a2374c99731',
     }
 }

@@ -15,11 +15,12 @@ from ..utils import (
 class GameSpotIE(InfoExtractor):
     _VALID_URL = r'(?:http://)?(?:www\.)?gamespot\.com/.*-(?P<page_id>\d+)/?'
     _TEST = {
-        "url": "http://www.gamespot.com/arma-iii/videos/arma-iii-community-guide-sitrep-i-6410818/",
-        "file": "gs-2300-6410818.mp4",
-        "md5": "b2a30deaa8654fcccd43713a6b6a4825",
-        "info_dict": {
-            "title": "Arma 3 - Community Guide: SITREP I",
+        'url': 'http://www.gamespot.com/videos/arma-3-community-guide-sitrep-i/2300-6410818/',
+        'md5': 'b2a30deaa8654fcccd43713a6b6a4825',
+        'info_dict': {
+            'id': 'gs-2300-6410818',
+            'ext': 'mp4',
+            'title': 'Arma 3 - Community Guide: SITREP I',
             'description': 'Check out this video where some of the basics of Arma 3 is explained.',
         }
     }
@@ -363,8 +363,13 @@ class GenericIE(InfoExtractor):
                 return self.url_result('http://' + url)
             else:
                 if default_search == 'auto_warning':
-                    self._downloader.report_warning(
-                        'Falling back to youtube search for %s . Set --default-search to "auto" to suppress this warning.' % url)
+                    if re.match(r'^(?:url|URL)$', url):
+                        raise ExtractorError(
+                            'Invalid URL: %r . Call youtube-dl like this: youtube-dl -v "https://www.youtube.com/watch?v=BaW_jenozKc" ' % url,
+                            expected=True)
+                    else:
+                        self._downloader.report_warning(
+                            'Falling back to youtube search for %s . Set --default-search to "auto" to suppress this warning.' % url)
                 return self.url_result('ytsearch:' + url)
         else:
             assert ':' in default_search

@@ -560,7 +565,7 @@ class GenericIE(InfoExtractor):

         # Look for embedded NovaMov-based player
         mobj = re.search(
-            r'''(?x)<iframe[^>]+?src=(["\'])
+            r'''(?x)<(?:pagespeed_)?iframe[^>]+?src=(["\'])
                     (?P<url>http://(?:(?:embed|www)\.)?
                         (?:novamov\.com|
                            nowvideo\.(?:ch|sx|eu|at|ag|co)|

@@ -672,7 +677,7 @@ class GenericIE(InfoExtractor):
         # HTML5 video
         found = re.findall(r'(?s)<video[^<]*(?:>.*?<source.*?)? src="([^"]+)"', webpage)
         if not found:
-            found = re.findall(
+            found = re.search(
                 r'(?i)<meta\s+(?=(?:[a-z-]+="[^"]+"\s+)*http-equiv="refresh")'
                 r'(?:[a-z-]+="[^"]+"\s+)*?content="[0-9]{,2};url=\'([^\']+)\'"',
                 webpage)
@@ -33,14 +33,14 @@ class IviIE(InfoExtractor):
         },
         # Serial's serie
         {
-            'url': 'http://www.ivi.ru/watch/dezhurnyi_angel/74791',
-            'md5': '3e6cc9a848c1d2ebcc6476444967baa9',
+            'url': 'http://www.ivi.ru/watch/dvoe_iz_lartsa/9549',
+            'md5': '221f56b35e3ed815fde2df71032f4b3e',
             'info_dict': {
-                'id': '74791',
+                'id': '9549',
                 'ext': 'mp4',
-                'title': 'Дежурный ангел - 1 серия',
-                'duration': 2490,
-                'thumbnail': 'http://thumbs.ivi.ru/f7.vcp.digitalaccess.ru/contents/8/e/bc2f6c2b6e5d291152fdd32c059141.jpg',
+                'title': 'Двое из ларца - Серия 1',
+                'duration': 2655,
+                'thumbnail': 'http://thumbs.ivi.ru/f15.vcp.digitalaccess.ru/contents/8/4/0068dc0677041f3336b7c2baad8fc0.jpg',
             },
             'skip': 'Only works from Russia',
         }
@@ -2,7 +2,6 @@
|
|||||||
from __future__ import unicode_literals
|
from __future__ import unicode_literals
|
||||||
|
|
||||||
import re
|
import re
|
||||||
import datetime
|
|
||||||
|
|
||||||
from .common import InfoExtractor
|
from .common import InfoExtractor
|
||||||
|
|
||||||
@@ -10,28 +9,48 @@ from .common import InfoExtractor
 class MailRuIE(InfoExtractor):
     IE_NAME = 'mailru'
     IE_DESC = 'Видео@Mail.Ru'
-    _VALID_URL = r'http://(?:www\.)?my\.mail\.ru/video/.*#video=/?(?P<id>[^/]+/[^/]+/[^/]+/\d+)'
+    _VALID_URL = r'http://(?:www\.)?my\.mail\.ru/(?:video/.*#video=/?(?P<idv1>(?:[^/]+/){3}\d+)|(?:(?P<idv2prefix>(?:[^/]+/){2})video/(?P<idv2suffix>[^/]+/\d+))\.html)'

-    _TEST = {
-        'url': 'http://my.mail.ru/video/top#video=/mail/sonypicturesrus/75/76',
-        'md5': 'dea205f03120046894db4ebb6159879a',
-        'info_dict': {
-            'id': '46301138',
-            'ext': 'mp4',
-            'title': 'Новый Человек-Паук. Высокое напряжение. Восстание Электро',
-            'upload_date': '20140224',
-            'uploader': 'sonypicturesrus',
-            'uploader_id': 'sonypicturesrus@mail.ru',
-            'duration': 184,
-        }
-    }
+    _TESTS = [
+        {
+            'url': 'http://my.mail.ru/video/top#video=/mail/sonypicturesrus/75/76',
+            'md5': 'dea205f03120046894db4ebb6159879a',
+            'info_dict': {
+                'id': '46301138',
+                'ext': 'mp4',
+                'title': 'Новый Человек-Паук. Высокое напряжение. Восстание Электро',
+                'timestamp': 1393232740,
+                'upload_date': '20140224',
+                'uploader': 'sonypicturesrus',
+                'uploader_id': 'sonypicturesrus@mail.ru',
+                'duration': 184,
+            },
+        },
+        {
+            'url': 'http://my.mail.ru/corp/hitech/video/news_hi-tech_mail_ru/1263.html',
+            'md5': '00a91a58c3402204dcced523777b475f',
+            'info_dict': {
+                'id': '46843144',
+                'ext': 'mp4',
+                'title': 'Samsung Galaxy S5 Hammer Smash Fail Battery Explosion',
+                'timestamp': 1397217632,
+                'upload_date': '20140411',
+                'uploader': 'hitech',
+                'uploader_id': 'hitech@corp.mail.ru',
+                'duration': 245,
+            },
+        },
+    ]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
+        video_id = mobj.group('idv1')
+
+        if not video_id:
+            video_id = mobj.group('idv2prefix') + mobj.group('idv2suffix')

         video_data = self._download_json(
-            'http://videoapi.my.mail.ru/videos/%s.json?new=1' % video_id, video_id, 'Downloading video JSON')
+            'http://api.video.mail.ru/videos/%s.json?new=1' % video_id, video_id, 'Downloading video JSON')

         author = video_data['author']
         uploader = author['name']
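The reworked _VALID_URL accepts both the old fragment-style links and the newer .../video/<name>/<number>.html pages; _real_extract() then builds the video id from whichever named groups matched, as shown in the hunk above. A quick standalone check of the two test URLs against the new pattern (a sketch that mirrors, but is not, the extractor code):

import re

_VALID_URL = r'http://(?:www\.)?my\.mail\.ru/(?:video/.*#video=/?(?P<idv1>(?:[^/]+/){3}\d+)|(?:(?P<idv2prefix>(?:[^/]+/){2})video/(?P<idv2suffix>[^/]+/\d+))\.html)'

m = re.match(_VALID_URL, 'http://my.mail.ru/video/top#video=/mail/sonypicturesrus/75/76')
assert m.group('idv1') == 'mail/sonypicturesrus/75/76'

m = re.match(_VALID_URL, 'http://my.mail.ru/corp/hitech/video/news_hi-tech_mail_ru/1263.html')
assert m.group('idv1') is None
assert m.group('idv2prefix') + m.group('idv2suffix') == 'corp/hitech/news_hi-tech_mail_ru/1263'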
@@ -40,10 +59,11 @@ class MailRuIE(InfoExtractor):
         movie = video_data['movie']
         content_id = str(movie['contentId'])
         title = movie['title']
+        if title.endswith('.mp4'):
+            title = title[:-4]
         thumbnail = movie['poster']
         duration = movie['duration']

-        upload_date = datetime.datetime.fromtimestamp(video_data['timestamp']).strftime('%Y%m%d')
         view_count = video_data['views_count']

         formats = [
@@ -57,7 +77,7 @@ class MailRuIE(InfoExtractor):
             'id': content_id,
             'title': title,
             'thumbnail': thumbnail,
-            'upload_date': upload_date,
+            'timestamp': video_data['timestamp'],
             'uploader': uploader,
             'uploader_id': uploader_id,
             'duration': duration,
@@ -1,6 +1,7 @@
 from __future__ import unicode_literals

 import re
+import json

 from .common import InfoExtractor
 from ..utils import find_xpath_attr, compat_str
@@ -31,30 +32,68 @@ class NBCIE(InfoExtractor):


 class NBCNewsIE(InfoExtractor):
-    _VALID_URL = r'https?://www\.nbcnews\.com/video/.+?/(?P<id>\d+)'
+    _VALID_URL = r'''(?x)https?://www\.nbcnews\.com/
+        ((video/.+?/(?P<id>\d+))|
+        (feature/[^/]+/(?P<title>.+)))
+        '''

-    _TEST = {
-        'url': 'http://www.nbcnews.com/video/nbc-news/52753292',
-        'md5': '47abaac93c6eaf9ad37ee6c4463a5179',
-        'info_dict': {
-            'id': '52753292',
-            'ext': 'flv',
-            'title': 'Crew emerges after four-month Mars food study',
-            'description': 'md5:24e632ffac72b35f8b67a12d1b6ddfc1',
+    _TESTS = [
+        {
+            'url': 'http://www.nbcnews.com/video/nbc-news/52753292',
+            'md5': '47abaac93c6eaf9ad37ee6c4463a5179',
+            'info_dict': {
+                'id': '52753292',
+                'ext': 'flv',
+                'title': 'Crew emerges after four-month Mars food study',
+                'description': 'md5:24e632ffac72b35f8b67a12d1b6ddfc1',
+            },
         },
-    }
+        {
+            'url': 'http://www.nbcnews.com/feature/edward-snowden-interview/how-twitter-reacted-snowden-interview-n117236',
+            'md5': 'b2421750c9f260783721d898f4c42063',
+            'info_dict': {
+                'id': 'I1wpAI_zmhsQ',
+                'ext': 'flv',
+                'title': 'How Twitter Reacted To The Snowden Interview',
+                'description': 'md5:65a0bd5d76fe114f3c2727aa3a81fe64',
+            },
+            'add_ie': ['ThePlatform'],
+        },
+    ]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         video_id = mobj.group('id')
-        all_info = self._download_xml('http://www.nbcnews.com/id/%s/displaymode/1219' % video_id, video_id)
-        info = all_info.find('video')
+        if video_id is not None:
+            all_info = self._download_xml('http://www.nbcnews.com/id/%s/displaymode/1219' % video_id, video_id)
+            info = all_info.find('video')

             return {
                 'id': video_id,
                 'title': info.find('headline').text,
                 'ext': 'flv',
                 'url': find_xpath_attr(info, 'media', 'type', 'flashVideo').text,
                 'description': compat_str(info.find('caption').text),
                 'thumbnail': find_xpath_attr(info, 'media', 'type', 'thumbnail').text,
             }
+        else:
+            # "feature" pages use theplatform.com
+            title = mobj.group('title')
+            webpage = self._download_webpage(url, title)
+            bootstrap_json = self._search_regex(
+                r'var bootstrapJson = ({.+})\s*$', webpage, 'bootstrap json',
+                flags=re.MULTILINE)
+            bootstrap = json.loads(bootstrap_json)
+            info = bootstrap['results'][0]['video']
+            playlist_url = info['fallbackPlaylistUrl'] + '?form=MPXNBCNewsAPI'
+            mpxid = info['mpxId']
+            all_videos = self._download_json(playlist_url, title)['videos']
+            # The response contains additional videos
+            info = next(v for v in all_videos if v['mpxId'] == mpxid)
+
+            return {
+                '_type': 'url',
+                # We get the best quality video
+                'url': info['videoAssets'][-1]['publicUrl'],
+                'ie_key': 'ThePlatform',
+            }
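For the new "feature" pages the extractor does not return a finished entry: the dict with '_type': 'url' and an 'ie_key' tells the youtube-dl core to re-dispatch the resolved theplatform.com asset URL to ThePlatformIE, which is also why the new test declares 'add_ie': ['ThePlatform']. A minimal illustration of that convention (the URL below is a made-up placeholder):

def delegate(public_url):
    # Instead of format data, hand back another URL plus the extractor that
    # should process it next; the core then runs ThePlatformIE on it.
    return {
        '_type': 'url',
        'url': public_url,            # e.g. info['videoAssets'][-1]['publicUrl']
        'ie_key': 'ThePlatform',
    }

print(delegate('http://link.theplatform.com/s/example'))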
@@ -4,7 +4,11 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import ExtractorError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    qualities,
+)


 class NDRIE(InfoExtractor):
@@ -45,17 +49,16 @@ class NDRIE(InfoExtractor):

         page = self._download_webpage(url, video_id, 'Downloading page')

-        title = self._og_search_title(page)
+        title = self._og_search_title(page).strip()
         description = self._og_search_description(page)
+        if description:
+            description = description.strip()

-        mobj = re.search(
-            r'<div class="duration"><span class="min">(?P<minutes>\d+)</span>:<span class="sec">(?P<seconds>\d+)</span></div>',
-            page)
-        duration = int(mobj.group('minutes')) * 60 + int(mobj.group('seconds')) if mobj else None
+        duration = int_or_none(self._html_search_regex(r'duration: (\d+),\n', page, 'duration', fatal=False))

         formats = []

-        mp3_url = re.search(r'''{src:'(?P<audio>[^']+)', type:"audio/mp3"},''', page)
+        mp3_url = re.search(r'''\{src:'(?P<audio>[^']+)', type:"audio/mp3"},''', page)
         if mp3_url:
             formats.append({
                 'url': mp3_url.group('audio'),
@@ -64,13 +67,15 @@ class NDRIE(InfoExtractor):

         thumbnail = None

-        video_url = re.search(r'''3: {src:'(?P<video>.+?)\.hi\.mp4', type:"video/mp4"},''', page)
+        video_url = re.search(r'''3: \{src:'(?P<video>.+?)\.hi\.mp4', type:"video/mp4"},''', page)
         if video_url:
-            thumbnail = self._html_search_regex(r'(?m)title: "NDR PLAYER",\s*poster: "([^"]+)",',
-                page, 'thumbnail', fatal=False)
-            if thumbnail:
-                thumbnail = 'http://www.ndr.de' + thumbnail
-            for format_id in ['lo', 'hi', 'hq']:
+            thumbnails = re.findall(r'''\d+: \{src: "([^"]+)"(?: \|\| '[^']+')?, quality: '([^']+)'}''', page)
+            if thumbnails:
+                quality_key = qualities(['xs', 's', 'm', 'l', 'xl'])
+                largest = max(thumbnails, key=lambda thumb: quality_key(thumb[1]))
+                thumbnail = 'http://www.ndr.de' + largest[0]
+
+            for format_id in 'lo', 'hi', 'hq':
                 formats.append({
                     'url': '%s.%s.mp4' % (video_url.group('video'), format_id),
                     'format_id': format_id,
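The thumbnail selection above relies on the qualities() helper that this NDR change now imports from ..utils: it turns an ordered list of quality ids into a ranking function, and max() then picks the entry with the best rank. A simplified stand-in (not the library's exact code) showing the idea:

def qualities(quality_ids):
    # Rank an id by its position in the ordered list; unknown ids rank lowest.
    def q(qid):
        try:
            return quality_ids.index(qid)
        except ValueError:
            return -1
    return q

quality_key = qualities(['xs', 's', 'm', 'l', 'xl'])
thumbnails = [('/a.jpg', 's'), ('/b.jpg', 'xl'), ('/c.jpg', 'm')]
largest = max(thumbnails, key=lambda thumb: quality_key(thumb[1]))
assert largest == ('/b.jpg', 'xl')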
@@ -26,7 +26,8 @@ class NocoIE(InfoExtractor):
             'uploader': 'Nolife',
             'uploader_id': 'NOL',
             'duration': 2851.2,
-        }
+        },
+        'skip': 'Requires noco account',
     }

     def _real_extract(self, url):
@@ -4,9 +4,7 @@ import re

 from .brightcove import BrightcoveIE
 from .common import InfoExtractor
-from ..utils import (
-    ExtractorError,
-)
+from ..utils import ExtractorError


 class NownessIE(InfoExtractor):
@@ -14,9 +12,10 @@ class NownessIE(InfoExtractor):

     _TEST = {
         'url': 'http://www.nowness.com/day/2013/6/27/3131/candor--the-art-of-gesticulation',
-        'file': '2520295746001.mp4',
-        'md5': '0ece2f70a7bd252c7b00f3070182d418',
+        'md5': '068bc0202558c2e391924cb8cc470676',
         'info_dict': {
+            'id': '2520295746001',
+            'ext': 'mp4',
             'description': 'Candor: The Art of Gesticulation',
             'uploader': 'Nowness',
             'title': 'Candor: The Art of Gesticulation',
@@ -4,7 +4,11 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import ExtractorError
+from ..utils import (
+    ExtractorError,
+    int_or_none,
+    unified_strdate,
+)


 class NRKIE(InfoExtractor):
@@ -65,3 +69,77 @@ class NRKIE(InfoExtractor):
             'description': data['description'],
             'thumbnail': thumbnail,
         }
+
+
+class NRKTVIE(InfoExtractor):
+    _VALID_URL = r'http://tv\.nrk(?:super)?\.no/(?:serie/[^/]+|program)/(?P<id>[a-z]{4}\d{8})'
+
+    _TESTS = [
+        {
+            'url': 'http://tv.nrk.no/serie/20-spoersmaal-tv/muhh48000314/23-05-2014',
+            'md5': '7b96112fbae1faf09a6f9ae1aff6cb84',
+            'info_dict': {
+                'id': 'muhh48000314',
+                'ext': 'flv',
+                'title': '20 spørsmål',
+                'description': 'md5:bdea103bc35494c143c6a9acdd84887a',
+                'upload_date': '20140523',
+                'duration': 1741.52,
+            }
+        },
+        {
+            'url': 'http://tv.nrk.no/program/mdfp15000514',
+            'md5': '383650ece2b25ecec996ad7b5bb2a384',
+            'info_dict': {
+                'id': 'mdfp15000514',
+                'ext': 'flv',
+                'title': 'Kunnskapskanalen: Grunnlovsjubiléet - Stor ståhei for ingenting',
+                'description': 'md5:654c12511f035aed1e42bdf5db3b206a',
+                'upload_date': '20140524',
+                'duration': 4605.0,
+            }
+        },
+    ]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+
+        page = self._download_webpage(url, video_id)
+
+        title = self._html_search_meta('title', page, 'title')
+        description = self._html_search_meta('description', page, 'description')
+        thumbnail = self._html_search_regex(r'data-posterimage="([^"]+)"', page, 'thumbnail', fatal=False)
+        upload_date = unified_strdate(self._html_search_meta('rightsfrom', page, 'upload date', fatal=False))
+        duration = self._html_search_regex(r'data-duration="([^"]+)"', page, 'duration', fatal=False)
+        if duration:
+            duration = float(duration)
+
+        formats = []
+
+        f4m_url = re.search(r'data-media="([^"]+)"', page)
+        if f4m_url:
+            formats.append({
+                'url': f4m_url.group(1) + '?hdcore=3.1.1&plugin=aasp-3.1.1.69.124',
+                'format_id': 'f4m',
+                'ext': 'flv',
+            })
+
+        m3u8_url = re.search(r'data-hls-media="([^"]+)"', page)
+        if m3u8_url:
+            formats.append({
+                'url': m3u8_url.group(1),
+                'format_id': 'm3u8',
+            })
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'thumbnail': thumbnail,
+            'upload_date': upload_date,
+            'duration': duration,
+            'formats': formats,
+        }
@@ -30,7 +30,7 @@ class NuvidIE(InfoExtractor):
             webpage, 'title').strip()

         url_end = self._html_search_regex(
-            r'href="(/mp4/[^"]+)"[^>]*data-link_type="mp4"',
+            r'href="(/[^"]+)"[^>]*data-link_type="mp4"',
             webpage, 'video_url')
         video_url = 'http://m.nuvid.com' + url_end

@@ -1,10 +1,10 @@
 from __future__ import unicode_literals

-import datetime
 import json
 import re

 from .common import InfoExtractor
+from ..utils import compat_urllib_parse


 class PhotobucketIE(InfoExtractor):
|
|||||||
'file': 'zpsc0c3b9fa.mp4',
|
'file': 'zpsc0c3b9fa.mp4',
|
||||||
'md5': '7dabfb92b0a31f6c16cebc0f8e60ff99',
|
'md5': '7dabfb92b0a31f6c16cebc0f8e60ff99',
|
||||||
'info_dict': {
|
'info_dict': {
|
||||||
|
'timestamp': 1367669341,
|
||||||
'upload_date': '20130504',
|
'upload_date': '20130504',
|
||||||
'uploader': 'rachaneronas',
|
'uploader': 'rachaneronas',
|
||||||
'title': 'Tired of Link Building? Try BacklinkMyDomain.com!',
|
'title': 'Tired of Link Building? Try BacklinkMyDomain.com!',
|
||||||
@@ -32,11 +33,12 @@ class PhotobucketIE(InfoExtractor):
|
|||||||
info_json = self._search_regex(r'Pb\.Data\.Shared\.put\(Pb\.Data\.Shared\.MEDIA, (.*?)\);',
|
info_json = self._search_regex(r'Pb\.Data\.Shared\.put\(Pb\.Data\.Shared\.MEDIA, (.*?)\);',
|
||||||
webpage, 'info json')
|
webpage, 'info json')
|
||||||
info = json.loads(info_json)
|
info = json.loads(info_json)
|
||||||
|
url = compat_urllib_parse.unquote(self._html_search_regex(r'file=(.+\.mp4)', info['linkcodes']['html'], 'url'))
|
||||||
return {
|
return {
|
||||||
'id': video_id,
|
'id': video_id,
|
||||||
'url': info['downloadUrl'],
|
'url': url,
|
||||||
'uploader': info['username'],
|
'uploader': info['username'],
|
||||||
'upload_date': datetime.date.fromtimestamp(info['creationDate']).strftime('%Y%m%d'),
|
'timestamp': info['creationDate'],
|
||||||
'title': info['title'],
|
'title': info['title'],
|
||||||
'ext': video_extension,
|
'ext': video_extension,
|
||||||
'thumbnail': info['thumbUrl'],
|
'thumbnail': info['thumbUrl'],
|
||||||
|
@@ -45,7 +45,7 @@ class PornHubIE(InfoExtractor):

         video_title = self._html_search_regex(r'<h1 [^>]+>([^<]+)', webpage, 'title')
         video_uploader = self._html_search_regex(
-            r'(?s)<div class="video-info-row">\s*From: .+?<(?:a href="/users/|<span class="username)[^>]+>(.+?)<',
+            r'(?s)From: .+?<(?:a href="/users/|<span class="username)[^>]+>(.+?)<',
             webpage, 'uploader', fatal=False)
         thumbnail = self._html_search_regex(r'"image_url":"([^"]+)', webpage, 'thumbnail', fatal=False)
         if thumbnail:
@@ -5,13 +5,16 @@ import re
 import json

 from .common import InfoExtractor
-from ..utils import int_or_none
+from ..utils import (
+    int_or_none,
+    compat_str,
+)


 class StreamCZIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?stream\.cz/.+/(?P<videoid>.+)'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.stream.cz/peklonataliri/765767-ecka-pro-deti',
         'md5': '6d3ca61a8d0633c9c542b92fcb936b0c',
         'info_dict': {
@@ -22,7 +25,18 @@ class StreamCZIE(InfoExtractor):
             'thumbnail': 'http://im.stream.cz/episode/52961d7e19d423f8f06f0100',
             'duration': 256,
         },
-    }
+    }, {
+        'url': 'http://www.stream.cz/blanik/10002447-tri-roky-pro-mazanka',
+        'md5': '246272e753e26bbace7fcd9deca0650c',
+        'info_dict': {
+            'id': '10002447',
+            'ext': 'mp4',
+            'title': 'Kancelář Blaník: Tři roky pro Mazánka',
+            'description': 'md5:9177695a8b756a0a8ab160de4043b392',
+            'thumbnail': 'http://im.stream.cz/episode/537f838c50c11f8d21320000',
+            'duration': 368,
+        },
+    }]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
@@ -57,7 +71,7 @@ class StreamCZIE(InfoExtractor):
         self._sort_formats(formats)

         return {
-            'id': str(jsonData['id']),
+            'id': compat_str(jsonData['episode_id']),
             'title': self._og_search_title(webpage),
             'thumbnail': jsonData['episode_image_original_url'].replace('//', 'http://'),
             'formats': formats,
104 youtube_dl/extractor/swrmediathek.py (new file)
@@ -0,0 +1,104 @@
+# -*- coding: utf-8 -*-
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import parse_duration
+
+
+class SWRMediathekIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?swrmediathek\.de/player\.htm\?show=(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
+
+    _TESTS = [{
+        'url': 'http://swrmediathek.de/player.htm?show=849790d0-dab8-11e3-a953-0026b975f2e6',
+        'md5': '8c5f6f0172753368547ca8413a7768ac',
+        'info_dict': {
+            'id': '849790d0-dab8-11e3-a953-0026b975f2e6',
+            'ext': 'mp4',
+            'title': 'SWR odysso',
+            'description': 'md5:2012e31baad36162e97ce9eb3f157b8a',
+            'thumbnail': 're:^http:.*\.jpg$',
+            'duration': 2602,
+            'upload_date': '20140515',
+            'uploader': 'SWR Fernsehen',
+            'uploader_id': '990030',
+        },
+    }, {
+        'url': 'http://swrmediathek.de/player.htm?show=0e1a8510-ddf2-11e3-9be3-0026b975f2e6',
+        'md5': 'b10ab854f912eecc5a6b55cd6fc1f545',
+        'info_dict': {
+            'id': '0e1a8510-ddf2-11e3-9be3-0026b975f2e6',
+            'ext': 'mp4',
+            'title': 'Nachtcafé - Alltagsdroge Alkohol - zwischen Sektempfang und Komasaufen',
+            'description': 'md5:e0a3adc17e47db2c23aab9ebc36dbee2',
+            'thumbnail': 're:http://.*\.jpg',
+            'duration': 5305,
+            'upload_date': '20140516',
+            'uploader': 'SWR Fernsehen',
+            'uploader_id': '990030',
+        },
+    }, {
+        'url': 'http://swrmediathek.de/player.htm?show=bba23e10-cb93-11e3-bf7f-0026b975f2e6',
+        'md5': '4382e4ef2c9d7ce6852535fa867a0dd3',
+        'info_dict': {
+            'id': 'bba23e10-cb93-11e3-bf7f-0026b975f2e6',
+            'ext': 'mp3',
+            'title': 'Saša Stanišic: Vor dem Fest',
+            'description': 'md5:5b792387dc3fbb171eb709060654e8c9',
+            'thumbnail': 're:http://.*\.jpg',
+            'duration': 3366,
+            'upload_date': '20140520',
+            'uploader': 'SWR 2',
+            'uploader_id': '284670',
+        }
+    }]
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+
+        video = self._download_json(
+            'http://swrmediathek.de/AjaxEntry?ekey=%s' % video_id, video_id, 'Downloading video JSON')
+
+        attr = video['attr']
+        media_type = attr['entry_etype']
+
+        formats = []
+        for entry in video['sub']:
+            if entry['name'] != 'entry_media':
+                continue
+
+            entry_attr = entry['attr']
+            codec = entry_attr['val0']
+            quality = int(entry_attr['val1'])
+
+            fmt = {
+                'url': entry_attr['val2'],
+                'quality': quality,
+            }
+
+            if media_type == 'Video':
+                fmt.update({
+                    'format_note': ['144p', '288p', '544p'][quality-1],
+                    'vcodec': codec,
+                })
+            elif media_type == 'Audio':
+                fmt.update({
+                    'acodec': codec,
+                })
+            formats.append(fmt)
+
+        self._sort_formats(formats)
+
+        return {
+            'id': video_id,
+            'title': attr['entry_title'],
+            'description': attr['entry_descl'],
+            'thumbnail': attr['entry_image_16_9'],
+            'duration': parse_duration(attr['entry_durat']),
+            'upload_date': attr['entry_pdatet'][:-4],
+            'uploader': attr['channel_title'],
+            'uploader_id': attr['channel_idkey'],
+            'formats': formats,
+        }
@@ -1,3 +1,5 @@
+from __future__ import unicode_literals
+
 import re
 import json

@@ -18,17 +20,17 @@ class ThePlatformIE(InfoExtractor):

     _TEST = {
         # from http://www.metacafe.com/watch/cb-e9I_cZgTgIPd/blackberrys_big_bold_z30/
-        u'url': u'http://link.theplatform.com/s/dJ5BDC/e9I_cZgTgIPd/meta.smil?format=smil&Tracking=true&mbr=true',
-        u'info_dict': {
-            u'id': u'e9I_cZgTgIPd',
-            u'ext': u'flv',
-            u'title': u'Blackberry\'s big, bold Z30',
-            u'description': u'The Z30 is Blackberry\'s biggest, baddest mobile messaging device yet.',
-            u'duration': 247,
+        'url': 'http://link.theplatform.com/s/dJ5BDC/e9I_cZgTgIPd/meta.smil?format=smil&Tracking=true&mbr=true',
+        'info_dict': {
+            'id': 'e9I_cZgTgIPd',
+            'ext': 'flv',
+            'title': 'Blackberry\'s big, bold Z30',
+            'description': 'The Z30 is Blackberry\'s biggest, baddest mobile messaging device yet.',
+            'duration': 247,
         },
-        u'params': {
+        'params': {
             # rtmp download
-            u'skip_download': True,
+            'skip_download': True,
         },
     }

@@ -39,7 +41,7 @@ class ThePlatformIE(InfoExtractor):
             error_msg = next(
                 n.attrib['abstract']
                 for n in meta.findall(_x('.//smil:ref'))
-                if n.attrib.get('title') == u'Geographic Restriction')
+                if n.attrib.get('title') == 'Geographic Restriction')
         except StopIteration:
             pass
         else:
@@ -101,8 +103,7 @@ class ThePlatformIE(InfoExtractor):
             config_url = url+ '&form=json'
             config_url = config_url.replace('swf/', 'config/')
             config_url = config_url.replace('onsite/', 'onsite/config/')
-            config_json = self._download_webpage(config_url, video_id, u'Downloading config')
-            config = json.loads(config_json)
+            config = self._download_json(config_url, video_id, 'Downloading config')
             smil_url = config['releaseUrl'] + '&format=SMIL&formats=MPEG4&manifest=f4m'
         else:
             smil_url = ('http://link.theplatform.com/s/dJ5BDC/{0}/meta.smil?'
@@ -11,29 +11,36 @@ from ..utils import (


 class UstreamIE(InfoExtractor):
-    _VALID_URL = r'https?://www\.ustream\.tv/(?P<type>recorded|embed)/(?P<videoID>\d+)'
+    _VALID_URL = r'https?://www\.ustream\.tv/(?P<type>recorded|embed|embed/recorded)/(?P<videoID>\d+)'
     IE_NAME = 'ustream'
     _TEST = {
         'url': 'http://www.ustream.tv/recorded/20274954',
-        'file': '20274954.flv',
         'md5': '088f151799e8f572f84eb62f17d73e5c',
         'info_dict': {
-            "uploader": "Young Americans for Liberty",
-            "title": "Young Americans for Liberty February 7, 2012 2:28 AM",
+            'id': '20274954',
+            'ext': 'flv',
+            'uploader': 'Young Americans for Liberty',
+            'title': 'Young Americans for Liberty February 7, 2012 2:28 AM',
         },
     }

     def _real_extract(self, url):
         m = re.match(self._VALID_URL, url)
+        video_id = m.group('videoID')
+
+        # some sites use this embed format (see: http://github.com/rg3/youtube-dl/issues/2990)
+        if m.group('type') == 'embed/recorded':
+            video_id = m.group('videoID')
+            desktop_url = 'http://www.ustream.tv/recorded/' + video_id
+            return self.url_result(desktop_url, 'Ustream')
         if m.group('type') == 'embed':
             video_id = m.group('videoID')
             webpage = self._download_webpage(url, video_id)
-            desktop_video_id = self._html_search_regex(r'ContentVideoIds=\["([^"]*?)"\]', webpage, 'desktop_video_id')
+            desktop_video_id = self._html_search_regex(
+                r'ContentVideoIds=\["([^"]*?)"\]', webpage, 'desktop_video_id')
             desktop_url = 'http://www.ustream.tv/recorded/' + desktop_video_id
             return self.url_result(desktop_url, 'Ustream')

-        video_id = m.group('videoID')

         video_url = 'http://tcdn.ustream.tv/video/%s' % video_id
         webpage = self._download_webpage(url, video_id)

@@ -16,7 +16,7 @@ class VevoIE(InfoExtractor):
     (currently used by MTVIE)
     """
     _VALID_URL = r'''(?x)
-        (?:https?://www\.vevo\.com/watch/(?:[^/]+/[^/]+/)?|
+        (?:https?://www\.vevo\.com/watch/(?:[^/]+/(?:[^/]+/)?)?|
            https?://cache\.vevo\.com/m/html/embed\.html\?video=|
            https?://videoplayer\.vevo\.com/embed/embedded\?videoId=|
            vevo:)
@@ -242,7 +242,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
                 u"uploader": u"Philipp Hagemeister",
                 u"uploader_id": u"phihag",
                 u"upload_date": u"20121002",
-                u"description": u"test chars: \"'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de ."
+                u"description": u"test chars: \"'/\\ä↭𝕐\ntest URL: https://github.com/rg3/youtube-dl/issues/1892\n\nThis is a test video for youtube-dl.\n\nFor more information, contact phihag@phihag.de .",
+                u"categories": [u'Science & Technology'],
             }
         },
         {
@@ -1136,11 +1137,24 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):

         # upload date
         upload_date = None
-        mobj = re.search(r'id="eow-date.*?>(.*?)</span>', video_webpage, re.DOTALL)
+        mobj = re.search(r'(?s)id="eow-date.*?>(.*?)</span>', video_webpage)
+        if mobj is None:
+            mobj = re.search(
+                r'(?s)id="watch-uploader-info".*?>.*?(?:Published|Uploaded|Streamed live) on (.*?)</strong>',
+                video_webpage)
         if mobj is not None:
             upload_date = ' '.join(re.sub(r'[/,-]', r' ', mobj.group(1)).split())
             upload_date = unified_strdate(upload_date)

+        m_cat_container = get_element_by_id("eow-category", video_webpage)
+        if m_cat_container:
+            category = self._html_search_regex(
+                r'(?s)<a[^<]+>(.*?)</a>', m_cat_container, 'category',
+                default=None)
+            video_categories = None if category is None else [category]
+        else:
+            video_categories = None
+
         # description
         video_description = get_element_by_id("eow-description", video_webpage)
         if video_description:
@@ -1347,6 +1361,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor, SubtitlesInfoExtractor):
             'title': video_title,
             'thumbnail': video_thumbnail,
             'description': video_description,
+            'categories': video_categories,
             'subtitles': video_subtitles,
             'duration': video_duration,
             'age_limit': 18 if age_gate else 0,
@@ -1760,9 +1775,12 @@ class YoutubeFeedsInfoExtractor(YoutubeBaseInfoExtractor):
             feed_entries.extend(
                 self.url_result(video_id, 'Youtube', video_id=video_id)
                 for video_id in ids)
-            if info['paging'] is None:
+            mobj = re.search(
+                r'data-uix-load-more-href="/?[^"]+paging=(?P<paging>\d+)',
+                feed_html)
+            if mobj is None:
                 break
-            paging = info['paging']
+            paging = mobj.group('paging')
         return self.playlist_result(feed_entries, playlist_title=self._PLAYLIST_TITLE)


 class YoutubeSubscriptionsIE(YoutubeFeedsInfoExtractor):
@@ -9,6 +9,7 @@ from .common import AudioConversionError, PostProcessor
 from ..utils import (
     check_executable,
     compat_subprocess_get_DEVNULL,
+    encodeArgument,
     encodeFilename,
     PostProcessingError,
     prepend_extension,
@@ -48,7 +49,7 @@ class FFmpegPostProcessor(PostProcessor):
         for path in input_paths:
             files_cmd.extend(['-i', encodeFilename(path, True)])
         cmd = ([self._get_executable(), '-y'] + files_cmd
-               + opts +
+               + [encodeArgument(o) for o in opts] +
                [encodeFilename(self._ffmpeg_filename_argument(out_path), True)])

         if self._downloader.params.get('verbose', False):
@@ -6,6 +6,7 @@ from .common import PostProcessor
 from ..utils import (
     check_executable,
     hyphenate_date,
+    subprocess_check_output
 )


@@ -57,7 +58,7 @@ class XAttrMetadataPP(PostProcessor):
                 elif user_has_xattr:
                     cmd = ['xattr', '-w', key, value, path]

-                subprocess.check_output(cmd)
+                subprocess_check_output(cmd)

             else:
                 # On Unix, and can't find pyxattr, setfattr, or xattr.
@@ -540,6 +540,16 @@ def encodeFilename(s, for_subprocess=False):
         encoding = 'utf-8'
     return s.encode(encoding, 'ignore')


+def encodeArgument(s):
+    if not isinstance(s, compat_str):
+        # Legacy code that uses byte strings
+        # Uncomment the following line after fixing all post processors
+        #assert False, 'Internal error: %r should be of type %r, is %r' % (s, compat_str, type(s))
+        s = s.decode('ascii')
+    return encodeFilename(s, True)
+
+
 def decodeOption(optval):
     if optval is None:
         return optval
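The new encodeArgument() exists so that FFmpegPostProcessor (see the hunk earlier in this diff) can build its command line from uniformly encoded arguments even when old post-processor code still passes byte strings. A rough usage sketch under that assumption; the file names and options here are made up:

from youtube_dl.utils import encodeArgument, encodeFilename

opts = ['-vcodec', 'copy', b'-acodec', b'copy']   # legacy byte strings may still appear
cmd = ['ffmpeg', '-y', '-i', encodeFilename('input file.mp4', True)]
cmd += [encodeArgument(o) for o in opts]          # every option normalised the same way
cmd += [encodeFilename('output file.mp4', True)]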
@@ -1429,3 +1439,15 @@ def qualities(quality_ids):


 DEFAULT_OUTTMPL = '%(title)s-%(id)s.%(ext)s'
+
+try:
+    subprocess_check_output = subprocess.check_output
+except AttributeError:
+    def subprocess_check_output(*args, **kwargs):
+        assert 'input' not in kwargs
+        p = subprocess.Popen(*args, stdout=subprocess.PIPE, **kwargs)
+        output, _ = p.communicate()
+        ret = p.poll()
+        if ret:
+            raise subprocess.CalledProcessError(ret, p.args, output=output)
+        return output
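subprocess.check_output() only exists from Python 2.7 onwards, so the fallback above provides an equivalent for older interpreters and the xattr post processor can call subprocess_check_output() unconditionally. A rough sketch mirroring that call site (the xattr command and the example key/URL are illustrative; running it requires the xattr tool to be installed):

from youtube_dl.utils import subprocess_check_output

cmd = ['xattr', '-w', 'user.xdg.referrer.url', 'http://example.com/', 'video.mp4']
subprocess_check_output(cmd)  # raises subprocess.CalledProcessError on a non-zero exit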
@@ -1,2 +1,2 @@

-__version__ = '2014.05.13'
+__version__ = '2014.06.04'