Compare commits

...

43 Commits

Author SHA1 Message Date
Sergey M․
a50fd6e026 release 2016.06.19.1 2016-06-19 03:57:14 +07:00
Sergey M․
6a55bb66ee [vimeo] Fix rented videos (Closes #9830) 2016-06-19 03:56:01 +07:00
Lucas Moura
7c05097633 [jsinterp] Avoid double key lookup for setting new key
In order to add a new key to both __objects and __functions dicts on jsinterp.py, it is
necessary to first verify if a key was present and if not, create the key and
assign it to a value.

However, this can be done with a single step using dict setdefault method.
2016-06-19 03:29:45 +07:00
Sergey M․
589568789f release 2016.06.19 2016-06-19 02:30:29 +07:00
Sergey M․
7577d849a6 [r7] Fix extraction and add support for articles (Closes #9826) 2016-06-19 02:25:34 +07:00
Sergey M․
cb23192bc4 [closertotruth] Update and improve (Closes #8680) 2016-06-19 00:35:29 +07:00
Steven Gosseling
41c1023300 [closertotruth] Add extractor
Removed print statement from code.

Replaced two regex searches with the corret ones.

Removed some unnecessary semicolumns

fixed title extraction

refactored everything to search_regex

processed comments on commit 5650b0d, fixed feedback from flake8

Improved regexes and returns info dict now.

Added support for closertotruth interview URL

Added support for episodes page
2016-06-18 23:19:56 +07:00
Sergey M․
90b6288cce [arte:+7] Simplify _VALID_URL 2016-06-18 22:23:48 +07:00
Sergey M․
c1823c8ad9 [README.md] Remove 'small' from description (#9814) 2016-06-18 22:08:48 +07:00
Sergey M․
d7c6c656c5 [arte:+7] Expand _VALID_URL (Closes #9820) 2016-06-18 21:42:17 +07:00
Yen Chi Hsuan
b0b128049a [extractors] Update references to sportschau (#9799) 2016-06-18 13:43:47 +08:00
Yen Chi Hsuan
e8f13f2637 [sportschau.de] Fix extraction and moved to its own file (closes #9799) 2016-06-18 13:42:58 +08:00
Yen Chi Hsuan
b5aad37f6b [ard] Remove SportschauIE, which is now based on WDR (#9799) 2016-06-18 13:42:39 +08:00
Yen Chi Hsuan
6d0d4fc26d [wdr] Add WDRBaseIE, for Sportschau (#9799) 2016-06-18 13:40:55 +08:00
Yen Chi Hsuan
0278aa443f [br] Skip invalid tests 2016-06-18 12:53:48 +08:00
Yen Chi Hsuan
1f35745758 [azubu] Don't fail on optional fields 2016-06-18 12:39:08 +08:00
Yen Chi Hsuan
573c35272f [bbc] Skip a geo-restricted test case 2016-06-18 12:35:55 +08:00
Yen Chi Hsuan
09e3f91e40 [arte] Update _TESTS and fix for pages with multiple YouTube videos
Some tests are from #6895 and #6613
2016-06-18 12:34:58 +08:00
Yen Chi Hsuan
1b6cf16be7 [aftonbladet] Fix extraction 2016-06-18 12:27:39 +08:00
Yen Chi Hsuan
26264cb056 [adobetv] Use embedded data in the webpage
Sometimes the HTML webpage is returned even with '?format=json'
2016-06-18 12:21:40 +08:00
Yen Chi Hsuan
a72df5f36f [mtvservices] Fix ext for RTMP streams 2016-06-18 12:19:06 +08:00
Yen Chi Hsuan
c878e635de [bet] Moved to MTVServices 2016-06-18 12:17:24 +08:00
Sergey M․
0f47cc2e92 release 2016.06.18.1 2016-06-18 06:20:34 +07:00
Sergey M․
5fc2757682 release 2016.06.18 2016-06-18 06:00:05 +07:00
Sergey M․
e3944c2621 [pornhd] Add working test 2016-06-18 05:50:17 +07:00
Sergey M․
667d96480b [pornhd] Detect removed videos and modernize 2016-06-18 05:42:20 +07:00
Sergey M․
e6fe993c31 [pornhd] Improve formats extraction 2016-06-18 05:37:53 +07:00
Sergey M․
d0d93f76ea [pornhd] Fix metadata extraction 2016-06-18 05:30:46 +07:00
Sergey M․
20a6a154fe [mtv] Use compat_xpath and fix FutureWarning 2016-06-18 04:46:26 +07:00
Sergey M․
f011876076 [nickde] Add extractor (Closes #9778) 2016-06-18 04:40:48 +07:00
Sergey M․
6929569403 [mitele] Extract series metadata and make title more robust (Closes #9758) 2016-06-18 04:06:19 +07:00
Sergey M․
eb451890da [carambatv] Add extractor (Closes #9815) 2016-06-18 03:04:14 +07:00
Sergey M․
ded7511a70 [bbccouk] Add support for playlists (Closes #9812) 2016-06-17 23:42:52 +07:00
Sergey M․
d2161cade5 release 2016.06.16 2016-06-16 22:40:55 +07:00
Sergey M․
27e5fa8198 [cda] Fix extraction (Closes #9803) 2016-06-16 22:33:12 +07:00
Yen Chi Hsuan
efbd1eb51a [wimp] Fix extraction and update _TESTS 2016-06-16 12:27:21 +08:00
Yen Chi Hsuan
369ff75081 [jwplatform] Improved JWPlayer support 2016-06-16 12:26:45 +08:00
Yen Chi Hsuan
47212f7bcb [utils] Don't transform numbers not starting with a zero
Fix test_Viidea and maybe others
2016-06-16 11:00:54 +08:00
Sergey M․
4c93ee8d14 [imdb] Improve _VALID_URL (Closes #9788) 2016-06-15 22:34:55 +07:00
Yen Chi Hsuan
8bc4dbb1af [wrzuta.pl] Detect error and update _TESTS 2016-06-14 11:14:59 +08:00
Sergey M․
6c3760292c [pornhub] Improve title extraction (Closes #9777) 2016-06-14 04:57:59 +07:00
Sergey M․
4cef70db6c [devscripts/release.sh] Add flag for gpg-sign commits 2016-06-14 03:16:56 +07:00
Sergey M․
ff4af6ec59 [lynda] Remove superfluous _NETRC_MACHINE 2016-06-14 02:49:33 +07:00
34 changed files with 799 additions and 334 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.14*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.14**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.06.19.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.06.19.1**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.06.14
[debug] youtube-dl version 2016.06.19.1
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -44,7 +44,7 @@ Or with [MacPorts](https://www.macports.org/):
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
**youtube-dl** is a command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,

View File

@@ -15,6 +15,7 @@
set -e
skip_tests=true
gpg_sign_commits=""
buildserver='localhost:8142'
while true
@@ -24,6 +25,10 @@ case "$1" in
skip_tests=false
shift
;;
--gpg-sign-commits|-S)
gpg_sign_commits="-S"
shift
;;
--buildserver)
buildserver="$2"
shift 2
@@ -69,7 +74,7 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py
git commit -m "release $version"
git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."
git tag -s -m "Release $version" "$version"
@@ -116,7 +121,7 @@ git clone --branch gh-pages --single-branch . build/gh-pages
"$ROOT/devscripts/gh-pages/update-copyright.py"
"$ROOT/devscripts/gh-pages/update-sites.py"
git add *.html *.html.in update
git commit -m "release $version"
git commit $gpg_sign_commits -m "release $version"
git push "$ROOT" gh-pages
git push "$ORIGIN_URL" gh-pages
)

View File

@@ -44,8 +44,8 @@
- **appletrailers:section**
- **archive.org**: archive.org videos
- **ARD**
- **ARD:mediathek**: Saarländischer Rundfunk
- **ARD:mediathek**
- **ARD:mediathek**: Saarländischer Rundfunk
- **arte.tv**
- **arte.tv:+7**
- **arte.tv:cinema**
@@ -74,6 +74,8 @@
- **bbc**: BBC
- **bbc.co.uk**: BBC iPlayer
- **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist**
- **BeatportPro**
- **Beeg**
- **BehindKink**
@@ -104,6 +106,8 @@
- **canalc2.tv**
- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
- **Canvas**
- **CarambaTV**
- **CarambaTVPage**
- **CBC**
- **CBCPlayer**
- **CBS**
@@ -124,6 +128,7 @@
- **cliphunter**
- **ClipRs**
- **Clipsyndicate**
- **CloserToTruth**
- **cloudtime**: CloudTime
- **Cloudy**
- **Clubic**
@@ -432,6 +437,7 @@
- **nhl.com:videocenter**
- **nhl.com:videocenter:category**: NHL videocenter category
- **nick.com**
- **nick.de**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **njoy**: N-JOY
@@ -516,6 +522,7 @@
- **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜
- **R7**
- **R7Article**
- **radio.de**
- **radiobremen**
- **radiocanada**

View File

@@ -640,6 +640,9 @@ class TestUtil(unittest.TestCase):
"1":{"src":"skipped", "type": "application/vnd.apple.mpegURL"}
}''')
inp = '''{"foo":101}'''
self.assertEqual(js_to_json(inp), '''{"foo":101}''')
def test_js_to_json_edgecases(self):
on = js_to_json("{abc_def:'1\\'\\\\2\\\\\\'3\"4'}")
self.assertEqual(json.loads(on), {"abc_def": "1'\\2\\'3\"4"})

View File

@@ -156,7 +156,10 @@ class AdobeTVVideoIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(url + '?format=json', video_id)
webpage = self._download_webpage(url, video_id)
video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
formats = [{
'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),

View File

@@ -24,10 +24,10 @@ class AftonbladetIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
# find internal video meta data
meta_url = 'http://aftonbladet-play.drlib.aptoma.no/video/%s.json'
meta_url = 'http://aftonbladet-play-metadata.cdn.drvideo.aptoma.no/video/%s.json'
player_config = self._parse_json(self._html_search_regex(
r'data-player-config="([^"]+)"', webpage, 'player config'), video_id)
internal_meta_id = player_config['videoId']
internal_meta_id = player_config['aptomaVideoId']
internal_meta_url = meta_url % internal_meta_id
internal_meta_json = self._download_json(
internal_meta_url, video_id, 'Downloading video meta data')

View File

@@ -8,7 +8,6 @@ from .generic import GenericIE
from ..utils import (
determine_ext,
ExtractorError,
get_element_by_attribute,
qualities,
int_or_none,
parse_duration,
@@ -274,41 +273,3 @@ class ARDIE(InfoExtractor):
'upload_date': upload_date,
'thumbnail': thumbnail,
}
class SportschauIE(ARDMediathekIE):
IE_NAME = 'Sportschau'
_VALID_URL = r'(?P<baseurl>https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video(?P<id>[^/#?]+))\.html'
_TESTS = [{
'url': 'http://www.sportschau.de/tourdefrance/videoseppeltkokainhatnichtsmitklassischemdopingzutun100.html',
'info_dict': {
'id': 'seppeltkokainhatnichtsmitklassischemdopingzutun100',
'ext': 'mp4',
'title': 'Seppelt: "Kokain hat nichts mit klassischem Doping zu tun"',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'Der ARD-Doping Experte Hajo Seppelt gibt seine Einschätzung zum ersten Dopingfall der diesjährigen Tour de France um den Italiener Luca Paolini ab.',
},
'params': {
# m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
base_url = mobj.group('baseurl')
webpage = self._download_webpage(url, video_id)
title = get_element_by_attribute('class', 'headline', webpage)
description = self._html_search_meta('description', webpage, 'description')
info = self._extract_media_info(
base_url + '-mc_defaultQuality-h.json', webpage, video_id)
info.update({
'title': title,
'description': description,
})
return info

View File

@@ -180,11 +180,14 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/(?:(?:sendungen|emissions|embed)/)?(?P<id>[^/]+)/(?P<name>[^/?#&]+)'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/[^/]+/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
'only_matching': True,
}, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True,
}]
@classmethod
@@ -240,10 +243,10 @@ class ArteTVPlus7IE(ArteTVBaseIE):
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
embed_url = self._search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1',
webpage, 'embed url', group='url')
return self.url_result(embed_url)
entries = [
self.url_result(url)
for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
return self.playlist_result(entries)
# It also uses the arte_vp_url url from the webpage to extract the information
@@ -252,22 +255,17 @@ class ArteTVCreativeIE(ArteTVPlus7IE):
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
'info_dict': {
'id': '72176',
'id': '057405-001-A',
'ext': 'mp4',
'title': 'Folge 2 - Corporate Design',
'upload_date': '20131004',
'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
'upload_date': '20150716',
},
}, {
'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
'info_dict': {
'id': '160676',
'ext': 'mp4',
'title': 'Monty Python live (mostly)',
'description': 'Événement ! Quarante-cinq ans après leurs premiers succès, les légendaires Monty Python remontent sur scène.\n',
'upload_date': '20140805',
}
'playlist_count': 11,
'add_ie': ['Youtube'],
}, {
'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
'only_matching': True,
@@ -349,14 +347,13 @@ class ArteTVCinemaIE(ArteTVPlus7IE):
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TESTS = [{
'url': 'http://cinema.arte.tv/de/node/38291',
'md5': '6b275511a5107c60bacbeeda368c3aa1',
'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
'info_dict': {
'id': '055876-000_PWA12025-D',
'id': '062494-000-A',
'ext': 'mp4',
'title': 'Tod auf dem Nil',
'upload_date': '20160122',
'description': 'md5:7f749bbb77d800ef2be11d54529b96bc',
'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
'upload_date': '20150807',
},
}]

View File

@@ -46,6 +46,7 @@ class AzubuIE(InfoExtractor):
'uploader_id': 272749,
'view_count': int,
},
'skip': 'Channel offline',
},
]
@@ -56,22 +57,26 @@ class AzubuIE(InfoExtractor):
'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
title = data['title'].strip()
description = data['description']
thumbnail = data['thumbnail']
view_count = data['view_count']
uploader = data['user']['username']
uploader_id = data['user']['id']
description = data.get('description')
thumbnail = data.get('thumbnail')
view_count = data.get('view_count')
user = data.get('user', {})
uploader = user.get('username')
uploader_id = user.get('id')
stream_params = json.loads(data['stream_params'])
timestamp = float_or_none(stream_params['creationDate'], 1000)
duration = float_or_none(stream_params['length'], 1000)
timestamp = float_or_none(stream_params.get('creationDate'), 1000)
duration = float_or_none(stream_params.get('length'), 1000)
renditions = stream_params.get('renditions') or []
video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
if video:
renditions.append(video)
if not renditions and not user.get('channel', {}).get('is_live', True):
raise ExtractorError('%s said: channel is offline.' % self.IE_NAME, expected=True)
formats = [{
'url': fmt['url'],
'width': fmt['frameWidth'],

View File

@@ -31,7 +31,7 @@ class BBCCoUkIE(InfoExtractor):
music/clips[/#]|
radio/player/
)
(?P<id>%s)
(?P<id>%s)(?!/(?:episodes|broadcasts|clips))
''' % _ID_REGEX
_MEDIASELECTOR_URLS = [
@@ -192,6 +192,7 @@ class BBCCoUkIE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'skip': 'Now it\'s really geo-restricted',
}, {
# compact player (https://github.com/rg3/youtube-dl/issues/8147)
'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player',
@@ -698,7 +699,9 @@ class BBCIE(BBCCoUkIE):
@classmethod
def suitable(cls, url):
return False if BBCCoUkIE.suitable(url) or BBCCoUkArticleIE.suitable(url) else super(BBCIE, cls).suitable(url)
EXCLUDE_IE = (BBCCoUkIE, BBCCoUkArticleIE, BBCCoUkIPlayerPlaylistIE, BBCCoUkPlaylistIE)
return (False if any(ie.suitable(url) for ie in EXCLUDE_IE)
else super(BBCIE, cls).suitable(url))
def _extract_from_media_meta(self, media_meta, video_id):
# Direct links to media in media metadata (e.g.
@@ -975,3 +978,72 @@ class BBCCoUkArticleIE(InfoExtractor):
r'<div[^>]+typeof="Clip"[^>]+resource="([^"]+)"', webpage)]
return self.playlist_result(entries, playlist_id, title, description)
class BBCCoUkPlaylistBaseIE(InfoExtractor):
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result(self._URL_TEMPLATE % video_id, BBCCoUkIE.ie_key())
for video_id in re.findall(
self._VIDEO_ID_TEMPLATE % BBCCoUkIE._ID_REGEX, webpage)]
title, description = self._extract_title_and_description(webpage)
return self.playlist_result(entries, playlist_id, title, description)
class BBCCoUkIPlayerPlaylistIE(BBCCoUkPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:iplayer:playlist'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/iplayer/episodes/(?P<id>%s)' % BBCCoUkIE._ID_REGEX
_URL_TEMPLATE = 'http://www.bbc.co.uk/iplayer/episode/%s'
_VIDEO_ID_TEMPLATE = r'data-ip-id=["\'](%s)'
_TEST = {
'url': 'http://www.bbc.co.uk/iplayer/episodes/b05rcz9v',
'info_dict': {
'id': 'b05rcz9v',
'title': 'The Disappearance',
'description': 'French thriller serial about a missing teenager.',
},
'playlist_mincount': 6,
}
def _extract_title_and_description(self, webpage):
title = self._search_regex(r'<h1>([^<]+)</h1>', webpage, 'title', fatal=False)
description = self._search_regex(
r'<p[^>]+class=(["\'])subtitle\1[^>]*>(?P<value>[^<]+)</p>',
webpage, 'description', fatal=False, group='value')
return title, description
class BBCCoUkPlaylistIE(BBCCoUkPlaylistBaseIE):
IE_NAME = 'bbc.co.uk:playlist'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/programmes/(?P<id>%s)/(?:episodes|broadcasts|clips)' % BBCCoUkIE._ID_REGEX
_URL_TEMPLATE = 'http://www.bbc.co.uk/programmes/%s'
_VIDEO_ID_TEMPLATE = r'data-pid=["\'](%s)'
_TESTS = [{
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/clips',
'info_dict': {
'id': 'b05rcz9v',
'title': 'The Disappearance - Clips - BBC Four',
'description': 'French thriller serial about a missing teenager.',
},
'playlist_mincount': 7,
}, {
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/broadcasts/2016/06',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/programmes/b05rcz9v/clips',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/programmes/b055jkys/episodes/player',
'only_matching': True,
}]
def _extract_title_and_description(self, webpage):
title = self._og_search_title(webpage, fatal=False)
description = self._og_search_description(webpage)
return title, description

View File

@@ -1,31 +1,27 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
from ..utils import (
xpath_text,
xpath_with_ns,
int_or_none,
parse_iso8601,
)
from .mtv import MTVServicesInfoExtractor
from ..utils import unified_strdate
from ..compat import compat_urllib_parse_urlencode
class BetIE(InfoExtractor):
class BetIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
'info_dict': {
'id': 'news/national/2014/a-conversation-with-president-obama',
'id': '07e96bd3-8850-3051-b856-271b457f0ab8',
'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
'ext': 'flv',
'title': 'A Conversation With President Obama',
'description': 'md5:699d0652a350cf3e491cd15cc745b5da',
'description': 'President Obama urges persistence in confronting racism and bias.',
'duration': 1534,
'timestamp': 1418075340,
'upload_date': '20141208',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
},
'params': {
# rtmp download
@@ -35,16 +31,17 @@ class BetIE(InfoExtractor):
{
'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
'info_dict': {
'id': 'news/national/2014/justice-for-ferguson-a-community-reacts',
'id': '9f516bf1-7543-39c4-8076-dd441b459ba9',
'display_id': 'justice-for-ferguson-a-community-reacts',
'ext': 'flv',
'title': 'Justice for Ferguson: A Community Reacts',
'description': 'A BET News special.',
'duration': 1696,
'timestamp': 1416942360,
'upload_date': '20141125',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'subtitles': {
'en': 'mincount:2',
}
},
'params': {
# rtmp download
@@ -53,57 +50,32 @@ class BetIE(InfoExtractor):
}
]
_FEED_URL = "http://feeds.mtvnservices.com/od/feed/bet-mrss-player"
def _get_feed_query(self, uri):
return compat_urllib_parse_urlencode({
'uuid': uri,
})
def _extract_mgid(self, webpage):
return self._search_regex(r'data-uri="([^"]+)', webpage, 'mgid')
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
mgid = self._extract_mgid(webpage)
videos_info = self._get_videos_info(mgid)
media_url = compat_urllib_parse_unquote(self._search_regex(
[r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
webpage, 'media URL'))
info_dict = videos_info['entries'][0]
video_id = self._search_regex(
r'/video/(.*)/_jcr_content/', media_url, 'video id')
upload_date = unified_strdate(self._html_search_meta('date', webpage))
description = self._html_search_meta('description', webpage)
mrss = self._download_xml(media_url, display_id)
item = mrss.find('./channel/item')
NS_MAP = {
'dc': 'http://purl.org/dc/elements/1.1/',
'media': 'http://search.yahoo.com/mrss/',
'ka': 'http://kickapps.com/karss',
}
title = xpath_text(item, './title', 'title')
description = xpath_text(
item, './description', 'description', fatal=False)
timestamp = parse_iso8601(xpath_text(
item, xpath_with_ns('./dc:date', NS_MAP),
'upload date', fatal=False))
uploader = xpath_text(
item, xpath_with_ns('./dc:creator', NS_MAP),
'uploader', fatal=False)
media_content = item.find(
xpath_with_ns('./media:content', NS_MAP))
duration = int_or_none(media_content.get('duration'))
smil_url = media_content.get('url')
thumbnail = media_content.find(
xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
formats = self._extract_smil_formats(smil_url, display_id)
self._sort_formats(formats)
return {
'id': video_id,
info_dict.update({
'display_id': display_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'uploader': uploader,
'duration': duration,
'formats': formats,
}
'upload_date': upload_date,
})
return info_dict

View File

@@ -29,7 +29,8 @@ class BRIE(InfoExtractor):
'duration': 180,
'uploader': 'Reinhard Weber',
'upload_date': '20150422',
}
},
'skip': '404 not found',
},
{
'url': 'http://www.br.de/nachrichten/oberbayern/inhalt/muenchner-polizeipraesident-schreiber-gestorben-100.html',
@@ -40,7 +41,8 @@ class BRIE(InfoExtractor):
'title': 'Manfred Schreiber ist tot',
'description': 'md5:b454d867f2a9fc524ebe88c3f5092d97',
'duration': 26,
}
},
'skip': '404 not found',
},
{
'url': 'https://www.br-klassik.de/audio/peeping-tom-premierenkritik-dance-festival-muenchen-100.html',
@@ -51,7 +53,8 @@ class BRIE(InfoExtractor):
'title': 'Kurzweilig und sehr bewegend',
'description': 'md5:0351996e3283d64adeb38ede91fac54e',
'duration': 296,
}
},
'skip': '404 not found',
},
{
'url': 'http://www.br.de/radio/bayern1/service/team/videos/team-video-erdelt100.html',

View File

@@ -0,0 +1,88 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
float_or_none,
int_or_none,
try_get,
)
class CarambaTVIE(InfoExtractor):
_VALID_URL = r'(?:carambatv:|https?://video1\.carambatv\.ru/v/)(?P<id>\d+)'
_TESTS = [{
'url': 'http://video1.carambatv.ru/v/191910501',
'md5': '2f4a81b7cfd5ab866ee2d7270cb34a2a',
'info_dict': {
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg',
'duration': 2678.31,
},
}, {
'url': 'carambatv:191910501',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'http://video1.carambatv.ru/v/%s/videoinfo.js' % video_id,
video_id)
title = video['title']
base_url = video.get('video') or 'http://video1.carambatv.ru/v/%s/' % video_id
formats = [{
'url': base_url + f['fn'],
'height': int_or_none(f.get('height')),
'format_id': '%sp' % f['height'] if f.get('height') else None,
} for f in video['qualities'] if f.get('fn')]
self._sort_formats(formats)
thumbnail = video.get('splash')
duration = float_or_none(try_get(
video, lambda x: x['annotations'][0]['end_time'], compat_str))
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
class CarambaTVPageIE(InfoExtractor):
_VALID_URL = r'https?://carambatv\.ru/(?:[^/]+/)+(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://carambatv.ru/movie/bad-comedian/razborka-v-manile/',
'md5': '',
'info_dict': {
'id': '191910501',
'ext': 'mp4',
'title': '[BadComedian] - Разборка в Маниле (Абсолютный обзор)',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 2678.31,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._og_search_property('video:iframe', webpage, default=None)
if not video_url:
video_id = self._search_regex(
r'(?:video_id|crmb_vuid)\s*[:=]\s*["\']?(\d+)',
webpage, 'video id')
video_url = 'carambatv:%s' % video_id
return self.url_result(video_url, CarambaTVIE.ie_key())

View File

@@ -58,7 +58,8 @@ class CDAIE(InfoExtractor):
def extract_format(page, version):
unpacked = decode_packed_codes(page)
format_url = self._search_regex(
r"url:\\'(.+?)\\'", unpacked, '%s url' % version, fatal=False)
r"(?:file|url)\s*:\s*(\\?[\"'])(?P<url>http.+?)\1", unpacked,
'%s url' % version, fatal=False, group='url')
if not format_url:
return
f = {
@@ -75,7 +76,8 @@ class CDAIE(InfoExtractor):
info_dict['formats'].append(f)
if not info_dict['duration']:
info_dict['duration'] = parse_duration(self._search_regex(
r"duration:\\'(.+?)\\'", unpacked, 'duration', fatal=False))
r"duration\s*:\s*(\\?[\"'])(?P<duration>.+?)\1",
unpacked, 'duration', fatal=False, group='duration'))
extract_format(webpage, 'default')

View File

@@ -0,0 +1,92 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class CloserToTruthIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?closertotruth\.com/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://closertotruth.com/series/solutions-the-mind-body-problem#video-3688',
'info_dict': {
'id': '0_zof1ktre',
'display_id': 'solutions-the-mind-body-problem',
'ext': 'mov',
'title': 'Solutions to the Mind-Body Problem?',
'upload_date': '20140221',
'timestamp': 1392956007,
'uploader_id': 'CTTXML'
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://closertotruth.com/episodes/how-do-brains-work',
'info_dict': {
'id': '0_iuxai6g6',
'display_id': 'how-do-brains-work',
'ext': 'mov',
'title': 'How do Brains Work?',
'upload_date': '20140221',
'timestamp': 1392956024,
'uploader_id': 'CTTXML'
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://closertotruth.com/interviews/1725',
'info_dict': {
'id': '1725',
'title': 'AyaFr-002',
},
'playlist_mincount': 2,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
partner_id = self._search_regex(
r'<script[^>]+src=["\'].*?\b(?:partner_id|p)/(\d+)',
webpage, 'kaltura partner_id')
title = self._search_regex(
r'<title>(.+?)\s*\|\s*.+?</title>', webpage, 'video title')
select = self._search_regex(
r'(?s)<select[^>]+id="select-version"[^>]*>(.+?)</select>',
webpage, 'select version', default=None)
if select:
entry_ids = set()
entries = []
for mobj in re.finditer(
r'<option[^>]+value=(["\'])(?P<id>[0-9a-z_]+)(?:#.+?)?\1[^>]*>(?P<title>[^<]+)',
webpage):
entry_id = mobj.group('id')
if entry_id in entry_ids:
continue
entry_ids.add(entry_id)
entries.append({
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (partner_id, entry_id),
'ie_key': 'Kaltura',
'title': mobj.group('title'),
})
if entries:
return self.playlist_result(entries, display_id, title)
entry_id = self._search_regex(
r'<a[^>]+id=(["\'])embed-kaltura\1[^>]+data-kaltura=(["\'])(?P<id>[0-9a-z_]+)\2',
webpage, 'kaltura entry_id', group='id')
return {
'_type': 'url_transparent',
'display_id': display_id,
'url': 'kaltura:%s:%s' % (partner_id, entry_id),
'ie_key': 'Kaltura',
'title': title
}

View File

@@ -44,7 +44,6 @@ from .archiveorg import ArchiveOrgIE
from .ard import (
ARDIE,
ARDMediathekIE,
SportschauIE,
)
from .arte import (
ArteTvIE,
@@ -71,6 +70,8 @@ from .bandcamp import BandcampIE, BandcampAlbumIE
from .bbc import (
BBCCoUkIE,
BBCCoUkArticleIE,
BBCCoUkIPlayerPlaylistIE,
BBCCoUkPlaylistIE,
BBCIE,
)
from .beeg import BeegIE
@@ -108,6 +109,10 @@ from .camwithher import CamWithHerIE
from .canalplus import CanalplusIE
from .canalc2 import Canalc2IE
from .canvas import CanvasIE
from .carambatv import (
CarambaTVIE,
CarambaTVPageIE,
)
from .cbc import (
CBCIE,
CBCPlayerIE,
@@ -135,6 +140,7 @@ from .cliprs import ClipRsIE
from .clipfish import ClipfishIE
from .cliphunter import CliphunterIE
from .clipsyndicate import ClipsyndicateIE
from .closertotruth import CloserToTruthIE
from .cloudy import CloudyIE
from .clubic import ClubicIE
from .clyp import ClypIE
@@ -512,7 +518,10 @@ from .nhl import (
NHLVideocenterCategoryIE,
NHLIE,
)
from .nick import NickIE
from .nick import (
NickIE,
NickDeIE,
)
from .niconico import NiconicoIE, NiconicoPlaylistIE
from .ninegag import NineGagIE
from .noco import NocoIE
@@ -622,7 +631,10 @@ from .qqmusic import (
QQMusicToplistIE,
QQMusicPlaylistIE,
)
from .r7 import R7IE
from .r7 import (
R7IE,
R7ArticleIE,
)
from .radiocanada import (
RadioCanadaIE,
RadioCanadaAudioVideoIE,
@@ -738,6 +750,7 @@ from .sportbox import (
SportBoxEmbedIE,
)
from .sportdeutschland import SportDeutschlandIE
from .sportschau import SportschauIE
from .srgssr import (
SRGSSRIE,
SRGSSRPlayIE,

View File

@@ -12,7 +12,7 @@ from ..utils import (
class ImdbIE(InfoExtractor):
IE_NAME = 'imdb'
IE_DESC = 'Internet Movie Database trailers'
_VALID_URL = r'https?://(?:www|m)\.imdb\.com/video/[^/]+/vi(?P<id>\d+)'
_VALID_URL = r'https?://(?:www|m)\.imdb\.com/(?:video/[^/]+/|title/tt\d+.*?#lb-)vi(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.imdb.com/video/imdb/vi2524815897',
@@ -25,6 +25,12 @@ class ImdbIE(InfoExtractor):
}, {
'url': 'http://www.imdb.com/video/_/vi2524815897',
'only_matching': True,
}, {
'url': 'http://www.imdb.com/title/tt1667889/?ref_=ext_shr_eml_vi#lb-vi2524815897',
'only_matching': True,
}, {
'url': 'http://www.imdb.com/title/tt1667889/#lb-vi2524815897',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -12,9 +12,35 @@ from ..utils import (
class JWPlatformBaseIE(InfoExtractor):
@staticmethod
def _find_jwplayer_data(webpage):
# TODO: Merge this with JWPlayer-related codes in generic.py
mobj = re.search(
'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\((?P<options>[^)]+)\)',
webpage)
if mobj:
return mobj.group('options')
def _extract_jwplayer_data(self, webpage, video_id, *args, **kwargs):
jwplayer_data = self._parse_json(
self._find_jwplayer_data(webpage), video_id)
return self._parse_jwplayer_data(
jwplayer_data, video_id, *args, **kwargs)
def _parse_jwplayer_data(self, jwplayer_data, video_id, require_title=True, m3u8_id=None, rtmp_params=None):
# JWPlayer backward compatibility: flattened playlists
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
if 'playlist' not in jwplayer_data:
jwplayer_data = {'playlist': [jwplayer_data]}
video_data = jwplayer_data['playlist'][0]
# JWPlayer backward compatibility: flattened sources
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/playlist/item.js#L29-L35
if 'sources' not in video_data:
video_data['sources'] = [video_data]
formats = []
for source in video_data['sources']:
source_url = self._proto_relative_url(source['file'])

View File

@@ -95,7 +95,6 @@ class LyndaIE(LyndaBaseIE):
IE_NAME = 'lynda'
IE_DESC = 'lynda.com videos'
_VALID_URL = r'https?://www\.lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
_NETRC_MACHINE = 'lynda'
_TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'

View File

@@ -1,5 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
@@ -8,6 +11,7 @@ from ..compat import (
from ..utils import (
get_element_by_attribute,
int_or_none,
remove_start,
)
@@ -15,7 +19,7 @@ class MiTeleIE(InfoExtractor):
IE_DESC = 'mitele.es'
_VALID_URL = r'https?://www\.mitele\.es/[^/]+/[^/]+/[^/]+/(?P<id>[^/]+)/'
_TEST = {
_TESTS = [{
'url': 'http://www.mitele.es/programas-tv/diario-de/la-redaccion/programa-144/',
# MD5 is unstable
'info_dict': {
@@ -24,10 +28,31 @@ class MiTeleIE(InfoExtractor):
'ext': 'flv',
'title': 'Tor, la web invisible',
'description': 'md5:3b6fce7eaa41b2d97358726378d9369f',
'series': 'Diario de',
'season': 'La redacción',
'episode': 'Programa 144',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'duration': 2913,
},
}
}, {
# no explicit title
'url': 'http://www.mitele.es/programas-tv/cuarto-milenio/temporada-6/programa-226/',
'info_dict': {
'id': 'eLZSwoEd1S3pVyUm8lc6F',
'display_id': 'programa-226',
'ext': 'flv',
'title': 'Cuarto Milenio - Temporada 6 - Programa 226',
'description': 'md5:50daf9fadefa4e62d9fc866d0c015701',
'series': 'Cuarto Milenio',
'season': 'Temporada 6',
'episode': 'Programa 226',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'duration': 7312,
},
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -70,7 +95,22 @@ class MiTeleIE(InfoExtractor):
self._sort_formats(formats)
title = self._search_regex(
r'class="Destacado-text"[^>]*>\s*<strong>([^<]+)</strong>', webpage, 'title')
r'class="Destacado-text"[^>]*>\s*<strong>([^<]+)</strong>',
webpage, 'title', default=None)
mobj = re.search(r'''(?sx)
class="Destacado-text"[^>]*>.*?<h1>\s*
<span>(?P<series>[^<]+)</span>\s*
<span>(?P<season>[^<]+)</span>\s*
<span>(?P<episode>[^<]+)</span>''', webpage)
series, season, episode = mobj.groups() if mobj else [None] * 3
if not title:
if mobj:
title = '%s - %s - %s' % (series, season, episode)
else:
title = remove_start(self._search_regex(
r'<title>([^<]+)</title>', webpage, 'title'), 'Ver online ')
video_id = self._search_regex(
r'data-media-id\s*=\s*"([^"]+)"', webpage,
@@ -83,6 +123,9 @@ class MiTeleIE(InfoExtractor):
'display_id': display_id,
'title': title,
'description': get_element_by_attribute('class', 'text', webpage),
'series': series,
'season': season,
'episode': episode,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,

View File

@@ -6,6 +6,7 @@ from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
compat_str,
compat_xpath,
)
from ..utils import (
ExtractorError,
@@ -84,9 +85,10 @@ class MTVServicesInfoExtractor(InfoExtractor):
rtmp_video_url = rendition.find('./src').text
if rtmp_video_url.endswith('siteunavail.png'):
continue
new_url = self._transform_rtmp_url(rtmp_video_url)
formats.append({
'ext': ext,
'url': self._transform_rtmp_url(rtmp_video_url),
'ext': 'flv' if new_url.startswith('rtmp') else ext,
'url': new_url,
'format_id': rendition.get('bitrate'),
'width': int(rendition.get('width')),
'height': int(rendition.get('height')),
@@ -139,9 +141,9 @@ class MTVServicesInfoExtractor(InfoExtractor):
itemdoc, './/{http://search.yahoo.com/mrss/}category',
'scheme', 'urn:mtvn:video_title')
if title_el is None:
title_el = itemdoc.find('.//{http://search.yahoo.com/mrss/}title')
title_el = itemdoc.find(compat_xpath('.//{http://search.yahoo.com/mrss/}title'))
if title_el is None:
title_el = itemdoc.find('.//title') or itemdoc.find('./title')
title_el = itemdoc.find(compat_xpath('.//title'))
if title_el.text is None:
title_el = None

View File

@@ -3,6 +3,7 @@ from __future__ import unicode_literals
from .mtv import MTVServicesInfoExtractor
from ..compat import compat_urllib_parse_urlencode
from ..utils import update_url_query
class NickIE(MTVServicesInfoExtractor):
@@ -61,3 +62,26 @@ class NickIE(MTVServicesInfoExtractor):
def _extract_mgid(self, webpage):
return self._search_regex(r'data-contenturi="([^"]+)', webpage, 'mgid')
class NickDeIE(MTVServicesInfoExtractor):
IE_NAME = 'nick.de'
_VALID_URL = r'https?://(?:www\.)?nick\.de/(?:playlist|shows)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.nick.de/playlist/3773-top-videos/videos/episode/17306-zu-wasser-und-zu-land-rauchende-erdnusse',
'only_matching': True,
}, {
'url': 'http://www.nick.de/shows/342-icarly',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mrss_url = update_url_query(self._search_regex(
r'data-mrss=(["\'])(?P<url>http.+?)\1', webpage, 'mrss url', group='url'),
{'siteKey': 'nick.de'})
return self._get_videos_info_from_url(mrss_url, video_id)

View File

@@ -1,19 +1,32 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
js_to_json,
qualities,
)
class PornHdIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?pornhd\.com/(?:[a-z]{2,4}/)?videos/(?P<id>\d+)(?:/(?P<display_id>.+))?'
_TEST = {
_TESTS = [{
'url': 'http://www.pornhd.com/videos/9864/selfie-restroom-masturbation-fun-with-chubby-cutie-hd-porn-video',
'md5': 'c8b964b1f0a4b5f7f28ae3a5c9f86ad5',
'info_dict': {
'id': '9864',
'display_id': 'selfie-restroom-masturbation-fun-with-chubby-cutie-hd-porn-video',
'ext': 'mp4',
'title': 'Restroom selfie masturbation',
'description': 'md5:3748420395e03e31ac96857a8f125b2b',
'thumbnail': 're:^https?://.*\.jpg',
'view_count': int,
'age_limit': 18,
}
}, {
# removed video
'url': 'http://www.pornhd.com/videos/1962/sierra-day-gets-his-cum-all-over-herself-hd-porn-video',
'md5': '956b8ca569f7f4d8ec563e2c41598441',
'info_dict': {
@@ -25,8 +38,9 @@ class PornHdIE(InfoExtractor):
'thumbnail': 're:^https?://.*\.jpg',
'view_count': int,
'age_limit': 18,
}
}
},
'skip': 'Not available anymore',
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
@@ -38,28 +52,38 @@ class PornHdIE(InfoExtractor):
title = self._html_search_regex(
[r'<span[^>]+class=["\']video-name["\'][^>]*>([^<]+)',
r'<title>(.+?) - .*?[Pp]ornHD.*?</title>'], webpage, 'title')
description = self._html_search_regex(
r'<div class="description">([^<]+)</div>', webpage, 'description', fatal=False)
view_count = int_or_none(self._html_search_regex(
r'(\d+) views\s*</span>', webpage, 'view count', fatal=False))
thumbnail = self._search_regex(
r"'poster'\s*:\s*'([^']+)'", webpage, 'thumbnail', fatal=False)
quality = qualities(['sd', 'hd'])
sources = json.loads(js_to_json(self._search_regex(
sources = self._parse_json(js_to_json(self._search_regex(
r"(?s)'sources'\s*:\s*(\{.+?\})\s*\}[;,)]",
webpage, 'sources')))
webpage, 'sources', default='{}')), video_id)
if not sources:
message = self._html_search_regex(
r'(?s)<(div|p)[^>]+class="no-video"[^>]*>(?P<value>.+?)</\1',
webpage, 'error message', group='value')
raise ExtractorError('%s said: %s' % (self.IE_NAME, message), expected=True)
formats = []
for qname, video_url in sources.items():
for format_id, video_url in sources.items():
if not video_url:
continue
height = int_or_none(self._search_regex(
r'^(\d+)[pP]', format_id, 'height', default=None))
formats.append({
'url': video_url,
'format_id': qname,
'quality': quality(qname),
'format_id': format_id,
'height': height,
})
self._sort_formats(formats)
description = self._html_search_regex(
r'<(div|p)[^>]+class="description"[^>]*>(?P<value>[^<]+)</\1',
webpage, 'description', fatal=False, group='value')
view_count = int_or_none(self._html_search_regex(
r'(\d+) views\s*<', webpage, 'view count', fatal=False))
thumbnail = self._search_regex(
r"'poster'\s*:\s*'([^']+)'", webpage, 'thumbnail', fatal=False)
return {
'id': video_id,
'display_id': display_id,

View File

@@ -1,3 +1,4 @@
# coding: utf-8
from __future__ import unicode_literals
import itertools
@@ -39,7 +40,25 @@ class PornHubIE(InfoExtractor):
'dislike_count': int,
'comment_count': int,
'age_limit': 18,
}
},
}, {
# non-ASCII title
'url': 'http://www.pornhub.com/view_video.php?viewkey=1331683002',
'info_dict': {
'id': '1331683002',
'ext': 'mp4',
'title': '重庆婷婷女王足交',
'uploader': 'cj397186295',
'duration': 1753,
'view_count': int,
'like_count': int,
'dislike_count': int,
'comment_count': int,
'age_limit': 18,
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.pornhub.com/view_video.php?viewkey=ph557bbb6676d2d',
'only_matching': True,
@@ -76,19 +95,25 @@ class PornHubIE(InfoExtractor):
'PornHub said: %s' % error_msg,
expected=True, video_id=video_id)
# video_title from flashvars contains whitespace instead of non-ASCII (see
# http://www.pornhub.com/view_video.php?viewkey=1331683002), not relying
# on that anymore.
title = self._html_search_meta(
'twitter:title', webpage, default=None) or self._search_regex(
(r'<h1[^>]+class=["\']title["\'][^>]*>(?P<title>[^<]+)',
r'<div[^>]+data-video-title=(["\'])(?P<title>.+?)\1',
r'shareTitle\s*=\s*(["\'])(?P<title>.+?)\1'),
webpage, 'title', group='title')
flashvars = self._parse_json(
self._search_regex(
r'var\s+flashvars_\d+\s*=\s*({.+?});', webpage, 'flashvars', default='{}'),
video_id)
if flashvars:
video_title = flashvars.get('video_title')
thumbnail = flashvars.get('image_url')
duration = int_or_none(flashvars.get('video_duration'))
else:
video_title, thumbnail, duration = [None] * 3
if not video_title:
video_title = self._html_search_regex(r'<h1 [^>]+>([^<]+)', webpage, 'title')
title, thumbnail, duration = [None] * 3
video_uploader = self._html_search_regex(
r'(?s)From:&nbsp;.+?<(?:a href="/users/|a href="/channels/|span class="username)[^>]+>(.+?)<',
@@ -137,7 +162,7 @@ class PornHubIE(InfoExtractor):
return {
'id': video_id,
'uploader': video_uploader,
'title': video_title,
'title': title,
'thumbnail': thumbnail,
'duration': duration,
'view_count': view_count,

View File

@@ -2,22 +2,19 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
js_to_json,
unescapeHTML,
int_or_none,
)
from ..utils import int_or_none
class R7IE(InfoExtractor):
_VALID_URL = r'''(?x)https?://
_VALID_URL = r'''(?x)
https?://
(?:
(?:[a-zA-Z]+)\.r7\.com(?:/[^/]+)+/idmedia/|
noticias\.r7\.com(?:/[^/]+)+/[^/]+-|
player\.r7\.com/video/i/
)
(?P<id>[\da-f]{24})
'''
'''
_TESTS = [{
'url': 'http://videos.r7.com/policiais-humilham-suspeito-a-beira-da-morte-morre-com-dignidade-/idmedia/54e7050b0cf2ff57e0279389.html',
'md5': '403c4e393617e8e8ddc748978ee8efde',
@@ -25,6 +22,7 @@ class R7IE(InfoExtractor):
'id': '54e7050b0cf2ff57e0279389',
'ext': 'mp4',
'title': 'Policiais humilham suspeito à beira da morte: "Morre com dignidade"',
'description': 'md5:01812008664be76a6479aa58ec865b72',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 98,
'like_count': int,
@@ -44,45 +42,72 @@ class R7IE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://player.r7.com/video/i/%s' % video_id, video_id)
video = self._download_json(
'http://player-api.r7.com/video/i/%s' % video_id, video_id)
item = self._parse_json(js_to_json(self._search_regex(
r'(?s)var\s+item\s*=\s*({.+?});', webpage, 'player')), video_id)
title = unescapeHTML(item['title'])
thumbnail = item.get('init', {}).get('thumbUri')
duration = None
statistics = item.get('statistics', {})
like_count = int_or_none(statistics.get('likes'))
view_count = int_or_none(statistics.get('views'))
title = video['title']
formats = []
for format_key, format_dict in item['playlist'][0].items():
src = format_dict.get('src')
if not src:
continue
format_id = format_dict.get('format') or format_key
if duration is None:
duration = format_dict.get('duration')
if '.f4m' in src:
formats.extend(self._extract_f4m_formats(src, video_id, preference=-1))
elif src.endswith('.m3u8'):
formats.extend(self._extract_m3u8_formats(src, video_id, 'mp4', preference=-2))
else:
formats.append({
'url': src,
'format_id': format_id,
})
media_url_hls = video.get('media_url_hls')
if media_url_hls:
formats.extend(self._extract_m3u8_formats(
media_url_hls, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
media_url = video.get('media_url')
if media_url:
f = {
'url': media_url,
'format_id': 'http',
}
# m3u8 format always matches the http format, let's copy metadata from
# one to another
m3u8_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
formats))
if len(m3u8_formats) == 1:
f_copy = m3u8_formats[0].copy()
f_copy.update(f)
f_copy['protocol'] = 'http'
f = f_copy
formats.append(f)
self._sort_formats(formats)
description = video.get('description')
thumbnail = video.get('thumb')
duration = int_or_none(video.get('media_duration'))
like_count = int_or_none(video.get('likes'))
view_count = int_or_none(video.get('views'))
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'like_count': like_count,
'view_count': view_count,
'formats': formats,
}
class R7ArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:[a-zA-Z]+)\.r7\.com/(?:[^/]+/)+[^/?#&]+-(?P<id>\d+)'
_TEST = {
'url': 'http://tv.r7.com/record-play/balanco-geral/videos/policiais-humilham-suspeito-a-beira-da-morte-morre-com-dignidade-16102015',
'only_matching': True,
}
@classmethod
def suitable(cls, url):
return False if R7IE.suitable(url) else super(R7ArticleIE, cls).suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'<div[^>]+(?:id=["\']player-|class=["\']embed["\'][^>]+id=["\'])([\da-f]{24})',
webpage, 'video id')
return self.url_result('http://player.r7.com/video/i/%s' % video_id, R7IE.ie_key())

View File

@@ -0,0 +1,38 @@
# coding: utf-8
from __future__ import unicode_literals
from .wdr import WDRBaseIE
from ..utils import get_element_by_attribute
class SportschauIE(WDRBaseIE):
IE_NAME = 'Sportschau'
_VALID_URL = r'https?://(?:www\.)?sportschau\.de/(?:[^/]+/)+video-?(?P<id>[^/#?]+)\.html'
_TEST = {
'url': 'http://www.sportschau.de/uefaeuro2016/videos/video-dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100.html',
'info_dict': {
'id': 'mdb-1140188',
'display_id': 'dfb-team-geht-gut-gelaunt-ins-spiel-gegen-polen-100',
'ext': 'mp4',
'title': 'DFB-Team geht gut gelaunt ins Spiel gegen Polen',
'description': 'Vor dem zweiten Gruppenspiel gegen Polen herrscht gute Stimmung im deutschen Team. Insbesondere Bastian Schweinsteiger strotzt vor Optimismus nach seinem Tor gegen die Ukraine.',
'upload_date': '20160615',
},
'skip': 'Geo-restricted to Germany',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = get_element_by_attribute('class', 'headline', webpage)
description = self._html_search_meta('description', webpage, 'description')
info = self._extract_wdr_video(webpage, video_id)
info.update({
'title': title,
'description': description,
})
return info

View File

@@ -8,6 +8,7 @@ import itertools
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urlparse,
)
from ..utils import (
@@ -24,6 +25,7 @@ from ..utils import (
urlencode_postdata,
unescapeHTML,
parse_filesize,
try_get,
)
@@ -445,7 +447,18 @@ class VimeoIE(VimeoBaseInfoExtractor):
if config.get('view') == 4:
config = self._verify_player_video_password(url, video_id)
if '>You rented this title.<' in webpage:
def is_rented():
if '>You rented this title.<' in webpage:
return True
if config.get('user', {}).get('purchased'):
return True
label = try_get(
config, lambda x: x['video']['vod']['purchase_options'][0]['label_string'], compat_str)
if label and label.startswith('You rented this'):
return True
return False
if is_rented():
feature_id = config.get('video', {}).get('vod', {}).get('feature_id')
if feature_id and not data.get('force_feature_id', False):
return self.url_result(smuggle_url(

View File

@@ -15,7 +15,87 @@ from ..utils import (
)
class WDRIE(InfoExtractor):
class WDRBaseIE(InfoExtractor):
def _extract_wdr_video(self, webpage, display_id):
# for wdr.de the data-extension is in a tag with the class "mediaLink"
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
# for wdrmaus its in a link to the page in a multiline "videoLink"-tag
json_metadata = self._html_search_regex(
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
webpage, 'media link', default=None, flags=re.MULTILINE)
if not json_metadata:
return
media_link_obj = self._parse_json(json_metadata, display_id,
transform_source=js_to_json)
jsonp_url = media_link_obj['mediaObj']['url']
metadata = self._download_json(
jsonp_url, 'metadata', transform_source=strip_jsonp)
metadata_tracker_data = metadata['trackerData']
metadata_media_resource = metadata['mediaResource']
formats = []
# check if the metadata contains a direct URL to a file
for kind, media_resource in metadata_media_resource.items():
if kind not in ('dflt', 'alt'):
continue
for tag_name, medium_url in media_resource.items():
if tag_name not in ('videoURL', 'audioURL'):
continue
ext = determine_ext(medium_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
medium_url, display_id, 'mp4', 'm3u8_native',
m3u8_id='hls'))
elif ext == 'f4m':
manifest_url = update_url_query(
medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
formats.extend(self._extract_f4m_formats(
manifest_url, display_id, f4m_id='hds', fatal=False))
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
medium_url, 'stream', fatal=False))
else:
a_format = {
'url': medium_url
}
if ext == 'unknown_video':
urlh = self._request_webpage(
medium_url, display_id, note='Determining extension')
ext = urlhandle_detect_ext(urlh)
a_format['ext'] = ext
formats.append(a_format)
self._sort_formats(formats)
subtitles = {}
caption_url = metadata_media_resource.get('captionURL')
if caption_url:
subtitles['de'] = [{
'url': caption_url,
'ext': 'ttml',
}]
title = metadata_tracker_data['trackerClipTitle']
return {
'id': metadata_tracker_data.get('trackerClipId', display_id),
'display_id': display_id,
'title': title,
'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
'formats': formats,
'subtitles': subtitles,
'upload_date': unified_strdate(metadata_tracker_data.get('trackerClipAirTime')),
}
class WDRIE(WDRBaseIE):
_CURRENT_MAUS_URL = r'https?://(?:www\.)wdrmaus.de/(?:[^/]+/){1,2}[^/?#]+\.php5'
_PAGE_REGEX = r'/(?:mediathek/)?[^/]+/(?P<type>[^/]+)/(?P<display_id>.+)\.html'
_VALID_URL = r'(?P<page_url>https?://(?:www\d\.)?wdr\d?\.de)' + _PAGE_REGEX + '|' + _CURRENT_MAUS_URL
@@ -91,10 +171,10 @@ class WDRIE(InfoExtractor):
},
{
'url': 'http://www.wdrmaus.de/sachgeschichten/sachgeschichten/achterbahn.php5',
# HDS download, MD5 is unstable
'md5': '803138901f6368ee497b4d195bb164f2',
'info_dict': {
'id': 'mdb-186083',
'ext': 'flv',
'ext': 'mp4',
'upload_date': '20130919',
'title': 'Sachgeschichte - Achterbahn ',
'description': '- Die Sendung mit der Maus -',
@@ -120,14 +200,9 @@ class WDRIE(InfoExtractor):
display_id = mobj.group('display_id')
webpage = self._download_webpage(url, display_id)
# for wdr.de the data-extension is in a tag with the class "mediaLink"
# for wdr.de radio players, in a tag with the class "wdrrPlayerPlayBtn"
# for wdrmaus its in a link to the page in a multiline "videoLink"-tag
json_metadata = self._html_search_regex(
r'class=(?:"(?:mediaLink|wdrrPlayerPlayBtn)\b[^"]*"[^>]+|"videoLink\b[^"]*"[\s]*>\n[^\n]*)data-extension="([^"]+)"',
webpage, 'media link', default=None, flags=re.MULTILINE)
info_dict = self._extract_wdr_video(webpage, display_id)
if not json_metadata:
if not info_dict:
entries = [
self.url_result(page_url + href[0], 'WDR')
for href in re.findall(
@@ -140,86 +215,22 @@ class WDRIE(InfoExtractor):
raise ExtractorError('No downloadable streams found', expected=True)
media_link_obj = self._parse_json(json_metadata, display_id,
transform_source=js_to_json)
jsonp_url = media_link_obj['mediaObj']['url']
metadata = self._download_json(
jsonp_url, 'metadata', transform_source=strip_jsonp)
metadata_tracker_data = metadata['trackerData']
metadata_media_resource = metadata['mediaResource']
formats = []
# check if the metadata contains a direct URL to a file
for kind, media_resource in metadata_media_resource.items():
if kind not in ('dflt', 'alt'):
continue
for tag_name, medium_url in media_resource.items():
if tag_name not in ('videoURL', 'audioURL'):
continue
ext = determine_ext(medium_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
medium_url, display_id, 'mp4', 'm3u8_native',
m3u8_id='hls'))
elif ext == 'f4m':
manifest_url = update_url_query(
medium_url, {'hdcore': '3.2.0', 'plugin': 'aasp-3.2.0.77.18'})
formats.extend(self._extract_f4m_formats(
manifest_url, display_id, f4m_id='hds', fatal=False))
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
medium_url, 'stream', fatal=False))
else:
a_format = {
'url': medium_url
}
if ext == 'unknown_video':
urlh = self._request_webpage(
medium_url, display_id, note='Determining extension')
ext = urlhandle_detect_ext(urlh)
a_format['ext'] = ext
formats.append(a_format)
self._sort_formats(formats)
subtitles = {}
caption_url = metadata_media_resource.get('captionURL')
if caption_url:
subtitles['de'] = [{
'url': caption_url,
'ext': 'ttml',
}]
title = metadata_tracker_data.get('trackerClipTitle')
is_live = url_type == 'live'
if is_live:
title = self._live_title(title)
upload_date = None
elif 'trackerClipAirTime' in metadata_tracker_data:
upload_date = metadata_tracker_data['trackerClipAirTime']
else:
upload_date = self._html_search_meta('DC.Date', webpage, 'upload date')
info_dict.update({
'title': self._live_title(info_dict['title']),
'upload_date': None,
})
elif 'upload_date' not in info_dict:
info_dict['upload_date'] = unified_strdate(self._html_search_meta('DC.Date', webpage, 'upload date'))
if upload_date:
upload_date = unified_strdate(upload_date)
return {
'id': metadata_tracker_data.get('trackerClipId', display_id),
'display_id': display_id,
'title': title,
'alt_title': metadata_tracker_data.get('trackerClipSubcategory'),
'formats': formats,
'upload_date': upload_date,
info_dict.update({
'description': self._html_search_meta('Description', webpage),
'is_live': is_live,
'subtitles': subtitles,
}
})
return info_dict
class WDRMobileIE(InfoExtractor):

View File

@@ -1,29 +1,33 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .youtube import YoutubeIE
from .jwplatform import JWPlatformBaseIE
class WimpIE(InfoExtractor):
class WimpIE(JWPlatformBaseIE):
_VALID_URL = r'https?://(?:www\.)?wimp\.com/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.wimp.com/maruexhausted/',
'url': 'http://www.wimp.com/maru-is-exhausted/',
'md5': 'ee21217ffd66d058e8b16be340b74883',
'info_dict': {
'id': 'maruexhausted',
'id': 'maru-is-exhausted',
'ext': 'mp4',
'title': 'Maru is exhausted.',
'description': 'md5:57e099e857c0a4ea312542b684a869b8',
}
}, {
'url': 'http://www.wimp.com/clowncar/',
'md5': '4e2986c793694b55b37cf92521d12bb4',
'md5': '5c31ad862a90dc5b1f023956faec13fe',
'info_dict': {
'id': 'clowncar',
'id': 'cG4CEr2aiSg',
'ext': 'webm',
'title': 'It\'s like a clown car.',
'description': 'md5:0e56db1370a6e49c5c1d19124c0d2fb2',
'title': 'Basset hound clown car...incredible!',
'description': '5 of my Bassets crawled in this dog loo! www.bellinghambassets.com\n\nFor licensing/usage please contact: licensing(at)jukinmediadotcom',
'upload_date': '20140303',
'uploader': 'Gretchen Hoey',
'uploader_id': 'gretchenandjeff1',
},
'add_ie': ['Youtube'],
}]
def _real_extract(self, url):
@@ -41,14 +45,13 @@ class WimpIE(InfoExtractor):
'ie_key': YoutubeIE.ie_key(),
}
video_url = self._search_regex(
r'<video[^>]+>\s*<source[^>]+src=(["\'])(?P<url>.+?)\1',
webpage, 'video URL', group='url')
info_dict = self._extract_jwplayer_data(
webpage, video_id, require_title=False)
return {
info_dict.update({
'id': video_id,
'url': video_url,
'title': self._og_search_title(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'description': self._og_search_description(webpage),
}
})
return info_dict

View File

@@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
qualities,
remove_start,
@@ -27,16 +28,17 @@ class WrzutaIE(InfoExtractor):
'uploader_id': 'laboratoriumdextera',
'description': 'md5:7fb5ef3c21c5893375fda51d9b15d9cd',
},
'skip': 'Redirected to wrzuta.pl',
}, {
'url': 'http://jolka85.wrzuta.pl/audio/063jOPX5ue2/liber_natalia_szroeder_-_teraz_ty',
'md5': 'bc78077859bea7bcfe4295d7d7fc9025',
'url': 'http://vexling.wrzuta.pl/audio/01xBFabGXu6/james_horner_-_into_the_na_39_vi_world_bonus',
'md5': 'f80564fb5a2ec6ec59705ae2bf2ba56d',
'info_dict': {
'id': '063jOPX5ue2',
'ext': 'ogg',
'title': 'Liber & Natalia Szroeder - Teraz Ty',
'duration': 203,
'uploader_id': 'jolka85',
'description': 'md5:2d2b6340f9188c8c4cd891580e481096',
'id': '01xBFabGXu6',
'ext': 'mp3',
'title': 'James Horner - Into The Na\'vi World [Bonus]',
'description': 'md5:30a70718b2cd9df3120fce4445b0263b',
'duration': 95,
'uploader_id': 'vexling',
},
}]
@@ -46,7 +48,10 @@ class WrzutaIE(InfoExtractor):
typ = mobj.group('typ')
uploader = mobj.group('uploader')
webpage = self._download_webpage(url, video_id)
webpage, urlh = self._download_webpage_handle(url, video_id)
if urlh.geturl() == 'http://www.wrzuta.pl/':
raise ExtractorError('Video removed', expected=True)
quality = qualities(['SD', 'MQ', 'HQ', 'HD'])

View File

@@ -131,9 +131,8 @@ class JSInterpreter(object):
if variable in local_vars:
obj = local_vars[variable]
else:
if variable not in self._objects:
self._objects[variable] = self.extract_object(variable)
obj = self._objects[variable]
obj = self._objects.setdefault(
variable, self.extract_object(variable))
if arg_str is None:
# Member access
@@ -204,8 +203,7 @@ class JSInterpreter(object):
argvals = tuple([
int(v) if v.isdigit() else local_vars[v]
for v in m.group('args').split(',')])
if fname not in self._functions:
self._functions[fname] = self.extract_function(fname)
self._functions.setdefault(fname, self.extract_function(fname))
return self._functions[fname](argvals)
raise ExtractorError('Unsupported JS expression %r' % expr)

View File

@@ -1970,7 +1970,7 @@ def js_to_json(code):
'(?:[^'\\]*(?:\\\\|\\['"nurtbfx/\n]))*[^'\\]*'|
/\*.*?\*/|,(?=\s*[\]}])|
[a-zA-Z_][.a-zA-Z_0-9]*|
(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:\s*:)?|
\b(?:0[xX][0-9a-fA-F]+|0+[0-7]+)(?:\s*:)?|
[0-9]+(?=\s*:)
''', fix_kv, code)

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2016.06.14'
__version__ = '2016.06.19.1'