release 2016.07.24

[tvp] Update dash format comment
[onet] Enable dash formats
2016-07-24 11:39:50 +07:00 · 2016-07-24 11:03:39 +07:00 · 2016-07-24 10:43:05 +07:00 · 2016-07-24 10:35:55 +07:00 · 2016-07-24 10:29:26 +07:00 · 2016-07-24 10:29:09 +07:00
23 changed files with 487 additions and 76 deletions
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -6,8 +6,8 @@

 ---

-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.07.17*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.07.17**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.07.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.07.24**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.07.17
+[debug] youtube-dl version 2016.07.24
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/1
+++ b/1
@@ -178,3 +178,4 @@ Artur Krysiak
 Jakub Adam Wieczorek
 Aleksandar Topuzović
 Nehal Patel
+Rob van Bekkum
--- a/docs/supportedsites.md
+++ b/docs/supportedsites.md
@@ -46,6 +46,7 @@
 - **archive.org**: archive.org videos
 - **ARD**
 - **ARD:mediathek**
+ - **Arkena**
 - **arte.tv**
 - **arte.tv:+7**
 - **arte.tv:cinema**
@@ -142,6 +143,7 @@
 - **ComCarCoff**
 - **ComedyCentral**
 - **ComedyCentralShows**: The Daily Show / The Colbert Report
+ - **ComedyCentralTV**
 - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
 - **Coub**
 - **Cracked**
@@ -336,6 +338,8 @@
 - **kuwo:song**: 酷我音乐
 - **la7.it**
 - **Laola1Tv**
+ - **Lcp**
+ - **LcpPlay**
 - **Le**: 乐视网
 - **Learnr**
 - **Lecture2Go**
@@ -477,6 +481,7 @@
 - **NYTimes**
 - **NYTimesArticle**
 - **ocw.mit.edu**
+ - **OdaTV**
 - **Odnoklassniki**
 - **OktoberfestTV**
 - **on.aol.com**
--- a/youtube_dl/extractor/ard.py
+++ b/youtube_dl/extractor/ard.py
@@ -62,6 +62,17 @@ class ARDMediathekIE(InfoExtractor):
    }, {
        'url': 'http://mediathek.daserste.de/sendungen_a-z/328454_anne-will/22429276_vertrauen-ist-gut-spionieren-ist-besser-geht',
        'only_matching': True,
+    }, {
+        # audio
+        'url': 'http://mediathek.rbb-online.de/radio/Hörspiel/Vor-dem-Fest/kulturradio/Audio?documentId=30796318&topRessort=radio&bcastId=9839158',
+        'md5': '4e8f00631aac0395fee17368ac0e9867',
+        'info_dict': {
+            'id': '30796318',
+            'ext': 'mp3',
+            'title': 'Vor dem Fest',
+            'description': 'md5:c0c1c8048514deaed2a73b3a60eecacb',
+            'duration': 3287,
+        },
    }]

    def _extract_media_info(self, media_info_url, webpage, video_id):
--- a/youtube_dl/extractor/arkena.py
+++ b/youtube_dl/extractor/arkena.py
@@ -0,0 +1,115 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+import re
+
+from .common import InfoExtractor
+from ..utils import (
+    determine_ext,
+    float_or_none,
+    int_or_none,
+    mimetype2ext,
+    parse_iso8601,
+    strip_jsonp,
+)
+
+
+class ArkenaIE(InfoExtractor):
+    _VALID_URL = r'https?://play\.arkena\.com/(?:config|embed)/avp/v\d/player/media/(?P<id>[^/]+)/[^/]+/(?P<account_id>\d+)'
+    _TESTS = [{
+        'url': 'https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411',
+        'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
+        'info_dict': {
+            'id': 'b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe',
+            'ext': 'mp4',
+            'title': 'Big Buck Bunny',
+            'description': 'Royalty free test video',
+            'timestamp': 1432816365,
+            'upload_date': '20150528',
+            'is_live': False,
+        },
+    }, {
+        'url': 'https://play.arkena.com/config/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411/?callbackMethod=jQuery1111023664739129262213_1469227693893',
+        'only_matching': True,
+    }, {
+        'url': 'http://play.arkena.com/config/avp/v1/player/media/327336/darkmatter/131064/?callbackMethod=jQuery1111002221189684892677_1469227595972',
+        'only_matching': True,
+    }, {
+        'url': 'http://play.arkena.com/embed/avp/v1/player/media/327336/darkmatter/131064/',
+        'only_matching': True,
+    }]
+
+    @staticmethod
+    def _extract_url(webpage):
+        # See https://support.arkena.com/display/PLAY/Ways+to+embed+your+video
+        mobj = re.search(
+            r'<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//play\.arkena\.com/embed/avp/.+?)\1',
+            webpage)
+        if mobj:
+            return mobj.group('url')
+
+    def _real_extract(self, url):
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        account_id = mobj.group('account_id')
+
+        playlist = self._download_json(
+            'https://play.arkena.com/config/avp/v2/player/media/%s/0/%s/?callbackMethod=_'
+            % (video_id, account_id),
+            video_id, transform_source=strip_jsonp)['Playlist'][0]
+
+        media_info = playlist['MediaInfo']
+        title = media_info['Title']
+        media_files = playlist['MediaFiles']
+
+        is_live = False
+        formats = []
+        for kind_case, kind_formats in media_files.items():
+            kind = kind_case.lower()
+            for f in kind_formats:
+                f_url = f.get('Url')
+                if not f_url:
+                    continue
+                is_live = f.get('Live') == 'true'
+                exts = (mimetype2ext(f.get('Type')), determine_ext(f_url, None))
+                if kind == 'm3u8' or 'm3u8' in exts:
+                    formats.extend(self._extract_m3u8_formats(
+                        f_url, video_id, 'mp4',
+                        entry_protocol='m3u8' if is_live else 'm3u8_native',
+                        m3u8_id=kind, fatal=False, live=is_live))
+                elif kind == 'flash' or 'f4m' in exts:
+                    formats.extend(self._extract_f4m_formats(
+                        f_url, video_id, f4m_id=kind, fatal=False))
+                elif kind == 'dash' or 'mpd' in exts:
+                    formats.extend(self._extract_mpd_formats(
+                        f_url, video_id, mpd_id=kind, fatal=False))
+                elif kind == 'silverlight':
+                    # TODO: process when ism is supported (see
+                    # https://github.com/rg3/youtube-dl/issues/8118)
+                    continue
+                else:
+                    tbr = float_or_none(f.get('Bitrate'), 1000)
+                    formats.append({
+                        'url': f_url,
+                        'format_id': '%s-%d' % (kind, tbr) if tbr else kind,
+                        'tbr': tbr,
+                    })
+        self._sort_formats(formats)
+
+        description = media_info.get('Description')
+        video_id = media_info.get('VideoId') or video_id
+        timestamp = parse_iso8601(media_info.get('PublishDate'))
+        thumbnails = [{
+            'url': thumbnail['Url'],
+            'width': int_or_none(thumbnail.get('Size')),
+        } for thumbnail in (media_info.get('Poster') or []) if thumbnail.get('Url')]
+
+        return {
+            'id': video_id,
+            'title': title,
+            'description': description,
+            'timestamp': timestamp,
+            'is_live': is_live,
+            'thumbnails': thumbnails,
+            'formats': formats,
+        }
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@@ -589,7 +589,8 @@ class BBCIE(BBCCoUkIE):
        'info_dict': {
            'id': '150615_telabyad_kentin_cogu',
            'ext': 'mp4',
-            'title': "YPG: Tel Abyad'ın tamamı kontrolümüzde",
+            'title': "Tel Abyad'da IŞİD bayrağı indirildi YPG bayrağı çekildi",
+            'description': 'md5:33a4805a855c9baf7115fcbde57e7025',
            'timestamp': 1434397334,
            'upload_date': '20150615',
        },
@@ -603,6 +604,7 @@ class BBCIE(BBCCoUkIE):
            'id': '150619_video_honduras_militares_hospitales_corrupcion_aw',
            'ext': 'mp4',
            'title': 'Honduras militariza sus hospitales por nuevo escándalo de corrupción',
+            'description': 'md5:1525f17448c4ee262b64b8f0c9ce66c8',
            'timestamp': 1434713142,
            'upload_date': '20150619',
        },
@@ -818,8 +820,20 @@ class BBCIE(BBCCoUkIE):
                        # http://www.bbc.com/turkce/multimedya/2015/10/151010_vid_ankara_patlama_ani)
                        playlist = data_playable.get('otherSettings', {}).get('playlist', {})
                        if playlist:
-                            entries.append(self._extract_from_playlist_sxml(
-                                playlist.get('progressiveDownloadUrl'), playlist_id, timestamp))
+                            for key in ('progressiveDownload', 'streaming'):
+                                playlist_url = playlist.get('%sUrl' % key)
+                                if not playlist_url:
+                                    continue
+                                try:
+                                    entries.append(self._extract_from_playlist_sxml(
+                                        playlist_url, playlist_id, timestamp))
+                                except Exception as e:
+                                    # Some playlist URL may fail with 500, at the same time
+                                    # the other one may work fine (e.g.
+                                    # http://www.bbc.com/turkce/haberler/2015/06/150615_telabyad_kentin_cogu)
+                                    if isinstance(getattr(e, 'cause', None), compat_HTTPError) and e.cause.code == 500:
+                                        continue
+                                    raise

        if entries:
            return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)
@@ -998,10 +1012,10 @@ class BBCCoUkPlaylistBaseIE(InfoExtractor):

 class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
    IE_NAME = 'bbc.co.uk:iplayer:playlist'
-    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/episodes/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
+    _VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/(?:episodes|group)/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
    _URL_TEMPLATE = 'http://www.bbc.co.uk/iplayer/episode/%s'
    _VIDEO_ID_TEMPLATE = r'data-ip-id=["\'](%s)'
-    _TEST = {
+    _TESTS = [{
        'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v',
        'info_dict': {
            'id': 'b05rcz9v',
@@ -1009,7 +1023,17 @@ class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
            'description': 'French thriller serial about a missing teenager.',
        },
        'playlist_mincount': 6,
-    }
+        'skip': 'This programme is not currently available on BBC iPlayer',
+    }, {
+        # Available for over a year unlike 30 days for most other programmes
+        'url': 'http://www.bbc.co.uk/iplayer/group/p02tcc32',
+        'info_dict': {
+            'id': 'p02tcc32',
+            'title': 'Bohemian Icons',
+            'description': 'md5:683e901041b2fe9ba596f2ab04c4dbe7',
+        },
+        'playlist_mincount': 10,
+    }]

    def _extract_title_and_description(self, webpage):
        title = self._search_regex(r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
--- a/youtube_dl/extractor/common.py
+++ b/youtube_dl/extractor/common.py
@@ -1481,6 +1481,13 @@ class InfoExtractor(object):
            compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url, formats_dict=formats_dict)

    def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}):
+        """
+        Parse formats from MPD manifest.
+        References:
+         1. MPEG-DASH Standard, ISO/IEC 23009-1:2014(E),
+            http://standards.iso.org/ittf/PubliclyAvailableStandards/c065274_ISO_IEC_23009-1_2014.zip
+         2. https://en.wikipedia.org/wiki/Dynamic_Adaptive_Streaming_over_HTTP
+        """
        if mpd_doc.get('type') == 'dynamic':
            return []

@@ -1513,8 +1520,16 @@ class InfoExtractor(object):
                        s_e = segment_timeline.findall(_add_ns('S'))
                        if s_e:
                            ms_info['total_number'] = 0
+                            ms_info['s'] = []
                            for s in s_e:
-                                ms_info['total_number'] += 1 + int(s.get('r', '0'))
+                                r = int(s.get('r', 0))
+                                ms_info['total_number'] += 1 + r
+                                ms_info['s'].append({
+                                    't': int(s.get('t', 0)),
+                                    # @d is mandatory (see [1, 5.3.9.6.2, Table 17, page 60])
+                                    'd': int(s.attrib['d']),
+                                    'r': r,
+                                })
                    else:
                        timescale = segment_template.get('timescale')
                        if timescale:
@@ -1551,7 +1566,7 @@ class InfoExtractor(object):
                        continue
                    representation_attrib = adaptation_set.attrib.copy()
                    representation_attrib.update(representation.attrib)
-                    # According to page 41 of ISO/IEC 29001-1:2014, @mimeType is mandatory
+                    # According to [1, 5.3.7.2, Table 9, page 41], @mimeType is mandatory
                    mime_type = representation_attrib['mimeType']
                    content_type = mime_type.split('/')[0]
                    if content_type == 'text':
@@ -1595,16 +1610,40 @@ class InfoExtractor(object):
                                representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration))
                            media_template = representation_ms_info['media_template']
                            media_template = media_template.replace('$RepresentationID$', representation_id)
-                            media_template = re.sub(r'\$(Number|Bandwidth)\$', r'%(\1)d', media_template)
-                            media_template = re.sub(r'\$(Number|Bandwidth)%([^$]+)\$', r'%(\1)\2', media_template)
+                            media_template = re.sub(r'\$(Number|Bandwidth|Time)\$', r'%(\1)d', media_template)
+                            media_template = re.sub(r'\$(Number|Bandwidth|Time)%([^$]+)\$', r'%(\1)\2', media_template)
                            media_template.replace('$$', '$')
-                            representation_ms_info['segment_urls'] = [
-                                media_template % {
-                                    'Number': segment_number,
-                                    'Bandwidth': representation_attrib.get('bandwidth')}
-                                for segment_number in range(
-                                    representation_ms_info['start_number'],
-                                    representation_ms_info['total_number'] + representation_ms_info['start_number'])]
+
+                            # As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$
+                            # can't be used at the same time
+                            if '%(Number' in media_template:
+                                representation_ms_info['segment_urls'] = [
+                                    media_template % {
+                                        'Number': segment_number,
+                                        'Bandwidth': representation_attrib.get('bandwidth'),
+                                    }
+                                    for segment_number in range(
+                                        representation_ms_info['start_number'],
+                                        representation_ms_info['total_number'] + representation_ms_info['start_number'])]
+                            else:
+                                representation_ms_info['segment_urls'] = []
+                                segment_time = 0
+
+                                def add_segment_url():
+                                    representation_ms_info['segment_urls'].append(
+                                        media_template % {
+                                            'Time': segment_time,
+                                            'Bandwidth': representation_attrib.get('bandwidth'),
+                                        }
+                                    )
+
+                                for num, s in enumerate(representation_ms_info['s']):
+                                    segment_time = s.get('t') or segment_time
+                                    add_segment_url()
+                                    for r in range(s.get('r', 0)):
+                                        segment_time += s['d']
+                                        add_segment_url()
+                                    segment_time += s['d']
                        if 'segment_urls' in representation_ms_info:
                            f.update({
                                'segment_urls': representation_ms_info['segment_urls'],
--- a/youtube_dl/extractor/dailymail.py
+++ b/youtube_dl/extractor/dailymail.py
@@ -5,19 +5,20 @@ from .common import InfoExtractor
 from ..utils import (
    int_or_none,
    determine_protocol,
+    unescapeHTML,
 )


 class DailyMailIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?dailymail\.co\.uk/video/[^/]+/video-(?P<id>[0-9]+)'
    _TEST = {
-        'url': 'http://www.dailymail.co.uk/video/sciencetech/video-1288527/Turn-video-impressionist-masterpiece.html',
-        'md5': '2f639d446394f53f3a33658b518b6615',
+        'url': 'http://www.dailymail.co.uk/video/tvshowbiz/video-1295863/The-Mountain-appears-sparkling-water-ad-Heavy-Bubbles.html',
+        'md5': 'f6129624562251f628296c3a9ffde124',
        'info_dict': {
-            'id': '1288527',
+            'id': '1295863',
            'ext': 'mp4',
-            'title': 'Turn any video into an impressionist masterpiece',
-            'description': 'md5:88ddbcb504367987b2708bb38677c9d2',
+            'title': 'The Mountain appears in sparkling water ad for \'Heavy Bubbles\'',
+            'description': 'md5:a93d74b6da172dd5dc4d973e0b766a84',
        }
    }

@@ -26,7 +27,7 @@ class DailyMailIE(InfoExtractor):
        webpage = self._download_webpage(url, video_id)
        video_data = self._parse_json(self._search_regex(
            r"data-opts='({.+?})'", webpage, 'video data'), video_id)
-        title = video_data['title']
+        title = unescapeHTML(video_data['title'])
        video_sources = self._download_json(video_data.get(
            'sources', {}).get('url') or 'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id, video_id)

@@ -55,7 +56,7 @@ class DailyMailIE(InfoExtractor):
        return {
            'id': video_id,
            'title': title,
-            'description': video_data.get('descr'),
+            'description': unescapeHTML(video_data.get('descr')),
            'thumbnail': video_data.get('poster') or video_data.get('thumbnail'),
            'formats': formats,
        }
--- a/youtube_dl/extractor/dcn.py
+++ b/youtube_dl/extractor/dcn.py
@@ -62,11 +62,9 @@ class DCNBaseIE(InfoExtractor):
                r'file\s*:\s*"https?(://[^"]+)/playlist.m3u8',
                r'<a[^>]+href="rtsp(://[^"]+)"'
            ], webpage, 'format url')
-        # TODO: Current DASH formats are broken - $Time$ pattern in
-        # <SegmentTemplate> not implemented yet
-        # formats.extend(self._extract_mpd_formats(
-        #     format_url_base + '/manifest.mpd',
-        #     video_id, mpd_id='dash', fatal=False))
+        formats.extend(self._extract_mpd_formats(
+            format_url_base + '/manifest.mpd',
+            video_id, mpd_id='dash', fatal=False))
        formats.extend(self._extract_m3u8_formats(
            format_url_base + '/playlist.m3u8', video_id, 'mp4',
            m3u8_entry_protocol, m3u8_id='hls', fatal=False))
--- a/youtube_dl/extractor/eporner.py
+++ b/youtube_dl/extractor/eporner.py
@@ -4,19 +4,23 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
+    encode_base_n,
+    ExtractorError,
+    int_or_none,
    parse_duration,
    str_to_int,
 )


 class EpornerIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?eporner\.com/hd-porn/(?P<id>\w+)/(?P<display_id>[\w-]+)'
+    _VALID_URL = r'https?://(?:www\.)?eporner\.com/hd-porn/(?P<id>\w+)(?:/(?P<display_id>[\w-]+))?'
    _TESTS = [{
        'url': 'http://www.eporner.com/hd-porn/95008/Infamous-Tiffany-Teen-Strip-Tease-Video/',
        'md5': '39d486f046212d8e1b911c52ab4691f8',
        'info_dict': {
-            'id': '95008',
+            'id': 'qlDUmNsj6VS',
            'display_id': 'Infamous-Tiffany-Teen-Strip-Tease-Video',
            'ext': 'mp4',
            'title': 'Infamous Tiffany Teen Strip Tease Video',
@@ -28,34 +32,72 @@ class EpornerIE(InfoExtractor):
        # New (May 2016) URL layout
        'url': 'http://www.eporner.com/hd-porn/3YRUtzMcWn0/Star-Wars-XXX-Parody/',
        'only_matching': True,
+    }, {
+        'url': 'http://www.eporner.com/hd-porn/3YRUtzMcWn0',
+        'only_matching': True,
    }]

    def _real_extract(self, url):
        mobj = re.match(self._VALID_URL, url)
        video_id = mobj.group('id')
-        display_id = mobj.group('display_id')
+        display_id = mobj.group('display_id') or video_id

-        webpage = self._download_webpage(url, display_id)
-        title = self._html_search_regex(
-            r'<title>(.*?) - EPORNER', webpage, 'title')
+        webpage, urlh = self._download_webpage_handle(url, display_id)

-        redirect_url = 'http://www.eporner.com/config5/%s' % video_id
-        player_code = self._download_webpage(
-            redirect_url, display_id, note='Downloading player config')
+        video_id = self._match_id(compat_str(urlh.geturl()))

-        sources = self._search_regex(
-            r'(?s)sources\s*:\s*\[\s*({.+?})\s*\]', player_code, 'sources')
+        hash = self._search_regex(
+            r'hash\s*:\s*["\']([\da-f]{32})', webpage, 'hash')
+
+        title = self._og_search_title(webpage, default=None) or self._html_search_regex(
+            r'<title>(.+?) - EPORNER', webpage, 'title')
+
+        # Reverse engineered from vjs.js
+        def calc_hash(s):
+            return ''.join((encode_base_n(int(s[lb:lb + 8], 16), 36) for lb in range(0, 32, 8)))
+
+        video = self._download_json(
+            'http://www.eporner.com/xhr/video/%s' % video_id,
+            display_id, note='Downloading video JSON',
+            query={
+                'hash': calc_hash(hash),
+                'device': 'generic',
+                'domain': 'www.eporner.com',
+                'fallback': 'false',
+            })
+
+        if video.get('available') is False:
+            raise ExtractorError(
+                '%s said: %s' % (self.IE_NAME, video['message']), expected=True)
+
+        sources = video['sources']

        formats = []
-        for video_url, format_id in re.findall(r'file\s*:\s*"([^"]+)",\s*label\s*:\s*"([^"]+)"', sources):
-            fmt = {
-                'url': video_url,
-                'format_id': format_id,
-            }
-            m = re.search(r'^(\d+)', format_id)
-            if m:
-                fmt['height'] = int(m.group(1))
-            formats.append(fmt)
+        for kind, formats_dict in sources.items():
+            if not isinstance(formats_dict, dict):
+                continue
+            for format_id, format_dict in formats_dict.items():
+                if not isinstance(format_dict, dict):
+                    continue
+                src = format_dict.get('src')
+                if not isinstance(src, compat_str) or not src.startswith('http'):
+                    continue
+                if kind == 'hls':
+                    formats.extend(self._extract_m3u8_formats(
+                        src, display_id, 'mp4', entry_protocol='m3u8_native',
+                        m3u8_id=kind, fatal=False))
+                else:
+                    height = int_or_none(self._search_regex(
+                        r'(\d+)[pP]', format_id, 'height', default=None))
+                    fps = int_or_none(self._search_regex(
+                        r'(\d+)fps', format_id, 'fps', default=None))
+
+                    formats.append({
+                        'url': src,
+                        'format_id': format_id,
+                        'height': height,
+                        'fps': fps,
+                    })
        self._sort_formats(formats)

        duration = parse_duration(self._html_search_meta('duration', webpage))
--- a/youtube_dl/extractor/extractors.py
+++ b/youtube_dl/extractor/extractors.py
@@ -44,6 +44,7 @@ from .appletrailers import (
    AppleTrailersSectionIE,
 )
 from .archiveorg import ArchiveOrgIE
+from .arkena import ArkenaIE
 from .ard import (
    ARDIE,
    ARDMediathekIE,
@@ -156,7 +157,11 @@ from .cnn import (
 )
 from .coub import CoubIE
 from .collegerama import CollegeRamaIE
-from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
+from .comedycentral import (
+    ComedyCentralIE,
+    ComedyCentralShowsIE,
+    ComedyCentralTVIE,
+)
 from .comcarcoff import ComCarCoffIE
 from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
 from .commonprotocols import RtmpIE
@@ -393,6 +398,10 @@ from .kuwo import (
 )
 from .la7 import LA7IE
 from .laola1tv import Laola1TvIE
+from .lcp import (
+    LcpPlayIE,
+    LcpIE,
+)
 from .learnr import LearnrIE
 from .lecture2go import Lecture2GoIE
 from .lemonde import LemondeIE
@@ -583,6 +592,7 @@ from .nytimes import (
    NYTimesArticleIE,
 )
 from .nuvid import NuvidIE
+from .odatv import OdaTVIE
 from .odnoklassniki import OdnoklassnikiIE
 from .oktoberfesttv import OktoberfestTVIE
 from .onet import (
--- a/youtube_dl/extractor/facebook.py
+++ b/youtube_dl/extractor/facebook.py
@@ -27,7 +27,7 @@ class FacebookIE(InfoExtractor):
    _VALID_URL = r'''(?x)
                (?:
                    https?://
-                        (?:\w+\.)?facebook\.com/
+                        (?:[\w-]+\.)?facebook\.com/
                        (?:[^#]*?\#!/)?
                        (?:
                            (?:
@@ -127,6 +127,9 @@ class FacebookIE(InfoExtractor):
    }, {
        'url': 'https://www.facebook.com/groups/164828000315060/permalink/764967300301124/',
        'only_matching': True,
+    }, {
+        'url': 'https://zh-hk.facebook.com/peoplespower/videos/1135894589806027/',
+        'only_matching': True,
    }]

    @staticmethod
--- a/youtube_dl/extractor/generic.py
+++ b/youtube_dl/extractor/generic.py
@@ -62,6 +62,7 @@ from .videomore import VideomoreIE
 from .googledrive import GoogleDriveIE
 from .jwplatform import JWPlatformIE
 from .digiteka import DigitekaIE
+from .arkena import ArkenaIE
 from .instagram import InstagramIE
 from .liveleak import LiveLeakIE
 from .threeqsdn import ThreeQSDNIE
@@ -1342,6 +1343,23 @@ class GenericIE(InfoExtractor):
            },
            'add_ie': ['Vimeo'],
        },
+        {
+            'url': 'https://support.arkena.com/display/PLAY/Ways+to+embed+your+video',
+            'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
+            'info_dict': {
+                'id': 'b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe',
+                'ext': 'mp4',
+                'title': 'Big Buck Bunny',
+                'description': 'Royalty free test video',
+                'timestamp': 1432816365,
+                'upload_date': '20150528',
+                'is_live': False,
+            },
+            'params': {
+                'skip_download': True,
+            },
+            'add_ie': [ArkenaIE.ie_key()],
+        },
        # {
        #     # TODO: find another test
        #     # http://schema.org/VideoObject
@@ -2146,6 +2164,11 @@ class GenericIE(InfoExtractor):
        if digiteka_url:
            return self.url_result(self._proto_relative_url(digiteka_url), DigitekaIE.ie_key())

+        # Look for Arkena embeds
+        arkena_url = ArkenaIE._extract_url(webpage)
+        if arkena_url:
+            return self.url_result(arkena_url, ArkenaIE.ie_key())
+
        # Look for Limelight embeds
        mobj = re.search(r'LimelightPlayer\.doLoad(Media|Channel|ChannelList)\(["\'](?P<id>[a-z0-9]{32})', webpage)
        if mobj:
--- a/youtube_dl/extractor/lcp.py
+++ b/youtube_dl/extractor/lcp.py
@@ -0,0 +1,90 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from .arkena import ArkenaIE
+
+
+class LcpPlayIE(ArkenaIE):
+    _VALID_URL = r'https?://play\.lcp\.fr/embed/(?P<id>[^/]+)/(?P<account_id>[^/]+)/[^/]+/[^/]+'
+    _TESTS = [{
+        'url': 'http://play.lcp.fr/embed/327336/131064/darkmatter/0',
+        'md5': 'b8bd9298542929c06c1c15788b1f277a',
+        'info_dict': {
+            'id': '327336',
+            'ext': 'mp4',
+            'title': '327336',
+            'timestamp': 1456391602,
+            'upload_date': '20160225',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }]
+
+
+class LcpIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?lcp\.fr/(?:[^/]+/)*(?P<id>[^/]+)'
+
+    _TESTS = [{
+        # arkena embed
+        'url': 'http://www.lcp.fr/la-politique-en-video/schwartzenberg-prg-preconise-francois-hollande-de-participer-une-primaire',
+        'md5': 'b8bd9298542929c06c1c15788b1f277a',
+        'info_dict': {
+            'id': 'd56d03e9',
+            'ext': 'mp4',
+            'title': 'Schwartzenberg (PRG) préconise à François Hollande de participer à une primaire à gauche',
+            'description': 'md5:96ad55009548da9dea19f4120c6c16a8',
+            'timestamp': 1456488895,
+            'upload_date': '20160226',
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        # dailymotion live stream
+        'url': 'http://www.lcp.fr/le-direct',
+        'info_dict': {
+            'id': 'xji3qy',
+            'ext': 'mp4',
+            'title': 'La Chaine Parlementaire (LCP), Live TNT',
+            'description': 'md5:5c69593f2de0f38bd9a949f2c95e870b',
+            'uploader': 'LCP',
+            'uploader_id': 'xbz33d',
+            'timestamp': 1308923058,
+            'upload_date': '20110624',
+        },
+        'params': {
+            # m3u8 live stream
+            'skip_download': True,
+        },
+    }, {
+        'url': 'http://www.lcp.fr/emissions/277792-les-volontaires',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+
+        webpage = self._download_webpage(url, display_id)
+
+        play_url = self._search_regex(
+            r'<iframe[^>]+src=(["\'])(?P<url>%s?(?:(?!\1).)*)\1' % LcpPlayIE._VALID_URL,
+            webpage, 'play iframe', default=None, group='url')
+
+        if not play_url:
+            return self.url_result(url, 'Generic')
+
+        title = self._og_search_title(webpage, default=None) or self._html_search_meta(
+            'twitter:title', webpage, fatal=True)
+        description = self._html_search_meta(
+            ('description', 'twitter:description'), webpage)
+
+        return {
+            '_type': 'url_transparent',
+            'ie_key': LcpPlayIE.ie_key(),
+            'url': play_url,
+            'display_id': display_id,
+            'title': title,
+            'description': description,
+        }
--- a/youtube_dl/extractor/odatv.py
+++ b/youtube_dl/extractor/odatv.py
@@ -0,0 +1,50 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    NO_DEFAULT,
+    remove_start
+)
+
+
+class OdaTVIE(InfoExtractor):
+    _VALID_URL = r'https?://(?:www\.)?odatv\.com/(?:mob|vid)_video\.php\?.*\bid=(?P<id>[^&]+)'
+    _TESTS = [{
+        'url': 'http://odatv.com/vid_video.php?id=8E388',
+        'md5': 'dc61d052f205c9bf2da3545691485154',
+        'info_dict': {
+            'id': '8E388',
+            'ext': 'mp4',
+            'title': 'Artık Davutoğlu ile devam edemeyiz'
+        }
+    }, {
+        # mobile URL
+        'url': 'http://odatv.com/mob_video.php?id=8E388',
+        'only_matching': True,
+    }, {
+        # no video
+        'url': 'http://odatv.com/mob_video.php?id=8E900',
+        'only_matching': True,
+    }]
+
+    def _real_extract(self, url):
+        video_id = self._match_id(url)
+        webpage = self._download_webpage(url, video_id)
+
+        no_video = 'NO VIDEO!' in webpage
+
+        video_url = self._search_regex(
+            r'mp4\s*:\s*(["\'])(?P<url>http.+?)\1', webpage, 'video url',
+            default=None if no_video else NO_DEFAULT, group='url')
+
+        if no_video:
+            raise ExtractorError('Video %s does not exist' % video_id, expected=True)
+
+        return {
+            'id': video_id,
+            'url': video_url,
+            'title': remove_start(self._og_search_title(webpage), 'Video: '),
+            'thumbnail': self._og_search_thumbnail(webpage),
+        }
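
Note on the odatv hunk above: the `default=None if no_video else NO_DEFAULT` argument makes the URL lookup optional only when the page itself reports that there is no video. In every other case a missing URL is still treated as an error. A minimal sketch of that sentinel pattern follows. `search_regex` and `find_video_url` are hypothetical stand-ins, not youtube-dl's `_search_regex`, and a plain `ValueError` is an assumption standing in for the real error types.

```python
import re

# Sentinel meaning "no default supplied": distinct from None, so that
# None can itself be used as a legitimate default value.
NO_DEFAULT = object()


def search_regex(pattern, text, default=NO_DEFAULT, group=1):
    """Return the requested group, the default, or raise if there is neither."""
    match = re.search(pattern, text)
    if match:
        return match.group(group)
    if default is not NO_DEFAULT:
        return default
    raise ValueError('pattern not found')


def find_video_url(webpage):
    """Hypothetical helper mirroring the lookup in the extractor above."""
    no_video = 'NO VIDEO!' in webpage
    # Only tolerate a missing URL when the page says there is no video.
    return search_regex(
        r'mp4\s*:\s*(["\'])(?P<url>http.+?)\1', webpage,
        default=None if no_video else NO_DEFAULT, group='url')
```

With this shape, a page containing `NO VIDEO!` yields `None` (which the extractor then turns into an expected error), while a page that merely changed layout fails loudly.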
--- a/youtube_dl/extractor/onet.py
+++ b/youtube_dl/extractor/onet.py
@@ -59,11 +59,8 @@ class OnetBaseIE(InfoExtractor):
                        # TODO: Support Microsoft Smooth Streaming
                        continue
                    elif ext == 'mpd':
-                        # TODO: Current DASH formats are broken - $Time$ pattern in
-                        # <SegmentTemplate> not implemented yet
-                        # formats.extend(self._extract_mpd_formats(
-                        #    video_url, video_id, mpd_id='dash', fatal=False))
-                        continue
+                        formats.extend(self._extract_mpd_formats(
+                            video_url, video_id, mpd_id='dash', fatal=False))
                    else:
                        formats.append({
                            'url': video_url,
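
Note on the onet hunk above: the removed TODO says DASH was disabled because the `$Time$` pattern in `<SegmentTemplate>` was not handled. For readers unfamiliar with that placeholder syntax, here is a minimal sketch of template expansion. It is not youtube-dl's implementation; `expand_segment_template` and its parameters are hypothetical names, and the sketch covers only the `$RepresentationID$`, `$Bandwidth$`, `$Number$` and `$Time$` identifiers, the `%0Nd` width tag, and the `$$` escape.

```python
import re

# Matches '$$' (escaped dollar) or '$Name$' / '$Name%05d$'.
_TEMPLATE_RE = re.compile(
    r'\$(?:(?P<name>RepresentationID|Bandwidth|Number|Time)(?P<fmt>%0\d+d)?)?\$')


def expand_segment_template(template, representation_id, bandwidth,
                            number=None, time=None):
    """Fill DASH-style placeholders in a segment URL template."""
    values = {
        'RepresentationID': representation_id,
        'Bandwidth': bandwidth,
        'Number': number,
        'Time': time,
    }

    def replace(match):
        name = match.group('name')
        if name is None:
            # '$$' is the escape sequence for a literal dollar sign.
            return '$'
        value = values[name]
        if value is None:
            raise ValueError('no value supplied for $%s$' % name)
        fmt = match.group('fmt')
        # Apply the zero-padded width tag when present, e.g. '%05d'.
        return fmt % value if fmt else str(value)

    return _TEMPLATE_RE.sub(replace, template)
```

For example, `expand_segment_template('seg-$RepresentationID$-$Time$.m4s', 'v1', 500000, time=90000)` gives `'seg-v1-90000.m4s'`.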
--- a/youtube_dl/extractor/pornhub.py
+++ b/youtube_dl/extractor/pornhub.py
@@ -111,7 +111,7 @@ class PornHubIE(InfoExtractor):
        webpage = self._download_webpage(req, video_id)

        error_msg = self._html_search_regex(
-            r'(?s)<div[^>]+class=(["\']).*?\b(?:removed|userMessageSection)\b.*?\1[^>]*>(?P<error>.+?)</div>',
+            r'(?s)<div[^>]+class=(["\'])(?:(?!\1).)*\b(?:removed|userMessageSection)\b(?:(?!\1).)*\1[^>]*>(?P<error>.+?)</div>',
            webpage, 'error message', default=None, group='error')
        if error_msg:
            error_msg = re.sub(r'\s+', ' ', error_msg)
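
Note on the regex change above: the old pattern used `.*?` on both sides of the class name, so with `(?s)` it could run past the closing quote of the `class` attribute and pick up the word `removed` anywhere later in the page. The new pattern uses `(?:(?!\1).)*`, which cannot cross the quote that opened the attribute. The sketch below compares the two patterns on made-up HTML; the markup is an assumption for illustration, not taken from the site.

```python
import re

OLD = (r'(?s)<div[^>]+class=(["\']).*?\b(?:removed|userMessageSection)\b'
       r'.*?\1[^>]*>(?P<error>.+?)</div>')
NEW = (r'(?s)<div[^>]+class=(["\'])(?:(?!\1).)*\b(?:removed|userMessageSection)\b'
       r'(?:(?!\1).)*\1[^>]*>(?P<error>.+?)</div>')

# Hypothetical page without any error box: "removed" only occurs in body text.
normal_page = ('<div class="title">Some video</div>'
               '<p>nothing was removed here</p>'
               '<div id="x" data-a="1">footer</div>')

# Hypothetical page with a real error box.
error_page = '<div class="video removed">Video has been removed</div>'

# The old pattern leaks out of the class attribute and reports a bogus error.
false_positive = re.search(OLD, normal_page).group('error')

# The new pattern stays inside the quoted attribute value.
no_match = re.search(NEW, normal_page)
real_error = re.search(NEW, error_page).group('error')
```

Here `false_positive` ends up holding unrelated text from a later `<div>`, `no_match` is `None`, and `real_error` holds the message from the error box.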
--- a/youtube_dl/extractor/telegraaf.py
+++ b/youtube_dl/extractor/telegraaf.py
@@ -47,11 +47,10 @@ class TelegraafIE(InfoExtractor):
            ext = determine_ext(manifest_url)
            if ext == 'm3u8':
                formats.extend(self._extract_m3u8_formats(
-                    manifest_url, video_id, ext='mp4', m3u8_id='hls'))
+                    manifest_url, video_id, ext='mp4', m3u8_id='hls', fatal=False))
            elif ext == 'mpd':
-                # TODO: Current DASH formats are broken - $Time$ pattern in
-                # <SegmentTemplate> not implemented yet
-                continue
+                formats.extend(self._extract_mpd_formats(
+                    manifest_url, video_id, mpd_id='dash', fatal=False))
            else:
                self.report_warning('Unknown adaptive format %s' % ext)
        for location in locations.get('progressive', []):
--- a/youtube_dl/extractor/tvp.py
+++ b/youtube_dl/extractor/tvp.py
@@ -89,8 +89,8 @@ class TVPIE(InfoExtractor):
            r'(https?://.+?/video)(?:\.(?:ism|f4m|m3u8)|-\d+\.mp4)',
            video_url, 'video base url', default=None)
        if video_url_base:
-            # TODO: Current DASH formats are broken - $Time$ pattern in
-            # <SegmentTemplate> not implemented yet
+            # TODO: <Group> found instead of <AdaptationSet> in MPD manifest.
+            # It's not mentioned in MPEG-DASH standard. Figure that out.
            # formats.extend(self._extract_mpd_formats(
            #     video_url_base + '.ism/video.mpd',
            #     video_id, mpd_id='dash', fatal=False))
--- a/youtube_dl/extractor/youjizz.py
+++ b/youtube_dl/extractor/youjizz.py
@@ -9,8 +9,8 @@ from ..utils import (


 class YouJizzIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:\w+\.)?youjizz\.com/videos/[^/#?]+-(?P<id>[0-9]+)\.html(?:$|[?#])'
-    _TEST = {
+    _VALID_URL = r'https?://(?:\w+\.)?youjizz\.com/videos/(?:[^/#?]+)?-(?P<id>[0-9]+)\.html(?:$|[?#])'
+    _TESTS = [{
        'url': 'http://www.youjizz.com/videos/zeichentrick-1-2189178.html',
        'md5': '07e15fa469ba384c7693fd246905547c',
        'info_dict': {
@@ -19,7 +19,10 @@ class YouJizzIE(InfoExtractor):
            'title': 'Zeichentrick 1',
            'age_limit': 18,
        }
-    }
+    }, {
+        'url': 'http://www.youjizz.com/videos/-2189178.html',
+        'only_matching': True,
+    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@@ -53,6 +53,7 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
    """Provide base functions for Youtube extractors"""
    _LOGIN_URL = 'https://accounts.google.com/ServiceLogin'
    _TWOFACTOR_URL = 'https://accounts.google.com/signin/challenge'
+    _PASSWORD_CHALLENGE_URL = 'https://accounts.google.com/signin/challenge/sl/password'
    _NETRC_MACHINE = 'youtube'
    # If True it will raise an error if no login info is provided
    _LOGIN_REQUIRED = False
@@ -116,12 +117,10 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
            'hl': 'en_US',
        }

-        login_data = urlencode_postdata(login_form_strs)
-
-        req = sanitized_Request(self._LOGIN_URL, login_data)
        login_results = self._download_webpage(
-            req, None,
-            note='Logging in', errnote='unable to log in', fatal=False)
+            self._PASSWORD_CHALLENGE_URL, None,
+            note='Logging in', errnote='unable to log in', fatal=False,
+            data=urlencode_postdata(login_form_strs))
        if login_results is False:
            return False

@@ -1736,7 +1735,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):


 class YoutubeSharedVideoIE(InfoExtractor):
-    _VALID_URL = r'(?:https?:)?//(?:www\.)?youtube\.com/shared\?ci=(?P<id>[0-9A-Za-z_-]{11})'
+    _VALID_URL = r'(?:https?:)?//(?:www\.)?youtube\.com/shared\?.*\bci=(?P<id>[0-9A-Za-z_-]{11})'
    IE_NAME = 'youtube:shared'

    _TEST = {
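
Note on the `_VALID_URL` change above: inserting `.*\b` before `ci=` lets the parameter appear anywhere in the query string rather than only directly after `?`. A quick check with Python's `re` module follows; the URLs and the 11-character ID are invented for illustration.

```python
import re

OLD = r'(?:https?:)?//(?:www\.)?youtube\.com/shared\?ci=(?P<id>[0-9A-Za-z_-]{11})'
NEW = r'(?:https?:)?//(?:www\.)?youtube\.com/shared\?.*\bci=(?P<id>[0-9A-Za-z_-]{11})'

# Hypothetical share links: ci first, and ci after another parameter.
ci_first = 'https://youtube.com/shared?ci=abcdefghijk'
ci_later = 'https://youtube.com/shared?feature=share&ci=abcdefghijk'

old_first = re.match(OLD, ci_first)
old_later = re.match(OLD, ci_later)   # the old pattern rejects this form
new_first = re.match(NEW, ci_first)
new_later = re.match(NEW, ci_later)
```

Only `old_later` is `None`; the other three matches all capture `abcdefghijk` as `id`.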
--- a/youtube_dl/utils.py
+++ b/youtube_dl/utils.py
@@ -2123,6 +2123,7 @@ def mimetype2ext(mt):
        'dash+xml': 'mpd',
        'f4m': 'f4m',
        'f4m+xml': 'f4m',
+        'hds+xml': 'f4m',
        'vnd.ms-sstr+xml': 'ism',
    }.get(res, res)
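
Note on the `mimetype2ext` hunk above: the table maps a MIME subtype to a file extension and falls back to the subtype itself when there is no entry. The sketch below is a simplified stand-in limited to the entries visible in this hunk. `mimetype_to_ext` is a hypothetical name, and the full MIME strings in the usage note are assumptions.

```python
def mimetype_to_ext(mimetype):
    """Simplified lookup: known subtypes map to an extension, others pass through."""
    if mimetype is None:
        return None
    # Keep only the part after the slash: 'application/dash+xml' -> 'dash+xml'.
    subtype = mimetype.rpartition('/')[2]
    return {
        'dash+xml': 'mpd',
        'f4m': 'f4m',
        'f4m+xml': 'f4m',
        'hds+xml': 'f4m',  # the entry added in this hunk
        'vnd.ms-sstr+xml': 'ism',
    }.get(subtype, subtype)
```

With the new entry, `mimetype_to_ext('application/hds+xml')` gives `'f4m'`, while an unlisted type such as `'video/mp4'` falls through to `'mp4'`.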

--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2016.07.17'
+__version__ = '2016.07.24'
Author	SHA1	Message	Date
Sergey M․	8fdc538b46	release 2016.07.24	2016-07-24 11:39:50 +07:00
Sergey M․	9513c1eb17	[tvp] Update dash format comment	2016-07-24 11:03:39 +07:00
Sergey M․	ae6fff4e64	[onet] Enable dash formats	2016-07-24 10:43:05 +07:00
Sergey M․	5a65668e25	[dcn] Enable dash formats	2016-07-24 10:35:55 +07:00
Sergey M․	f75e6890db	[telegraaf] Make hls non fatal	2016-07-24 10:29:26 +07:00
Sergey M․	d9cb92c840	[telegraaf] Enable dash formats	2016-07-24 10:29:09 +07:00
Sergey M․	94c04a3c79	[arkena] Enable dash formats	2016-07-24 10:28:11 +07:00
Sergey M․	f094834857	[extractor/common] Add support for $ in SegmentTemplate in MPD manifests	2016-07-24 10:27:16 +07:00
Déstin Reed	111de00289	[DailyMail] Improve title and description extraction	2016-07-24 05:37:13 +07:00
Sergey M․	b4a131e1a5	[facebook] Relax _VALID_URL (Closes #10151)	2016-07-24 04:36:49 +07:00
Sergey M․	f1991ce928	[arkena] Skip dash formats	2016-07-23 18:07:55 +07:00
Sergey M․	6548030a17	Credit @rvanbekkum for arkena (#8682)	2016-07-23 18:00:19 +07:00
Sergey M․	3a8947650b	[arkenaplay] Remove extractor	2016-07-23 17:57:55 +07:00
Sergey M․	1979969f91	[extractor/generic] Add support for arkena embeds	2016-07-23 17:56:48 +07:00
Sergey M․	0673741af3	[extractors] Add imports for arkena and lcp	2016-07-23 17:56:29 +07:00
Sergey M․	c8e170b209	[lcp] Improve extraction	2016-07-23 17:56:11 +07:00
Sergey M․	bbe1f3634a	[arkena] Improve extraction (Closes #8682)	2016-07-23 17:55:54 +07:00
Rob van Bekkum	4671dd41b2	[arkena:lcp] Add extractors	2016-07-23 17:01:09 +07:00
Sergey M․	f164b97123	[utils] Add another f4m mimetype to mimetype2ext	2016-07-23 16:48:59 +07:00
Sergey M․	5275efe30d	release 2016.07.22	2016-07-22 23:11:28 +07:00
Sergey M․	b13647cf3c	[eporner] Fix extraction (Closes #10139)	2016-07-22 23:04:13 +07:00
Sergey M․	add7d2a0e2	[pornhub] Make error regex less ambiguous (Closes #10138)	2016-07-22 21:24:09 +07:00
Sergey M․	e298d3a08c	[youtube] Fix authentication (Closes #10140)	2016-07-22 21:05:39 +07:00
Sergey M․	fd8c8c7dcd	[youtube:shared] Relax _VALID_URL	2016-07-21 22:58:34 +07:00
Sergey M․	9158af16cc	[bbc.co.uk:iplayer:playlist] Add support for group URLs	2016-07-21 22:37:36 +07:00
Sergey M․	c6668e4ad1	[bbc.co.uk:iplayer:playlist] Skip unavailable test	2016-07-21 22:34:55 +07:00
Sergey M․	84e8cca48b	[youjizz] Relax _VALID_URL (Closes #10131)	2016-07-20 22:41:13 +07:00
Sergey M․	790b06b7d4	[odatv] Improve (Closes #9285)	2016-07-20 21:43:22 +07:00
skacurt	740d7c49c2	[odatv] Add extractor	2016-07-20 21:42:05 +07:00
Sergey M․	4e51ec5f57	[extractors] Add import for comedycentral.tv	2016-07-19 22:50:37 +07:00
Sergey M․	05087d1b4c	[bbc] Improve extraction from sxml playlists	2016-07-19 22:49:38 +07:00
Sergey M․	a66a73ee90	[ard] Add test for rbb-online	2016-07-18 02:25:31 +07:00