Compare commits

...

38 Commits

Author SHA1 Message Date
Sergey M․
c1c2fe2045 release 2017.01.16 2017-01-16 23:44:04 +07:00
Sergey M․
ddd53c392e [ChangeLog] Actualize 2017-01-16 23:42:04 +07:00
Sergey M․
79fc8496c6 [xiami] Improve extraction (closes #11699)
* Relax _VALID_URLs
* Improve track metadata extraction
2017-01-16 23:31:50 +07:00
Sergey M․
0ce8c66fb0 [options] Include custom conf in final argv (closes #11741) 2017-01-16 22:07:12 +07:00
Sergey M․
906420cae3 [limelight] Improve and make more robust (closes #11737)
+ Add support for direct http for videos hosted on video.llnw.net
* Check handmade http URLs
2017-01-16 21:54:47 +07:00
Yen Chi Hsuan
16e2c8f771 [brightcove] Recognize another player ID
Closes #11688
2017-01-16 00:06:52 +08:00
Yen Chi Hsuan
dcae7b3fdc [niconico] Allow login via cookies
Some codes are borrowed from #7968, which is by @jlhg

Closes #7968
2017-01-15 22:51:54 +08:00
Yen Chi Hsuan
8e4988f1a2 [niconico] Remove codes for downloading anonymously
Apparently Niconico now blocks playing without an account

Closes #11170
2017-01-15 22:10:57 +08:00
Sergey M․
a7acf868a5 [yourupload] Fix extraction (closes #11601) 2017-01-15 10:34:39 +07:00
Sergey M․
6f0be93747 [YoutubeDL] Improve protocol auto determining (closes #11720) 2017-01-15 06:09:32 +07:00
Sergey M․
af62de104f [beam:live] Improve and simplify (#10702, closes #11596) 2017-01-15 06:07:35 +07:00
sh!zeeg
cd55c6ccd7 [beam:live] Add extractor 2017-01-15 06:06:10 +07:00
Sergey M․
621a2800ca [vevo] Improve geo restriction detection 2017-01-15 04:42:05 +07:00
Sergey M․
b80e2ebc8d [dramafever] Add support for URLs with language code (#11714) 2017-01-14 18:27:22 +07:00
Remita Amine
99d537a5e0 [ooyala] fix typo 2017-01-14 07:12:50 +01:00
Sergey M․
8854f3fe78 [README.md] Clarify newline format in cookies section (closes #11709) 2017-01-14 08:48:26 +07:00
Sergey M․
abe8cb763f [cbc] Improve playlist support (closes #11704) 2017-01-14 08:30:00 +07:00
Sergey M․
5d4c7daa49 release 2017.01.14 2017-01-14 07:31:07 +07:00
Sergey M․
0b94510cd0 [ChangeLog] Actualize 2017-01-14 07:30:32 +07:00
Jakub Wilk
4f66c16f33 [brightcove:legacy] Fix misplaced backslash in a regexp 2017-01-14 06:26:11 +07:00
Sergey M․
e54fc0524e [cmt] Add support for video-clips 2017-01-14 06:23:24 +07:00
Sergey M․
adf063dad1 [mtv,cc,cmt,spike] Improve and refactor
- Eliminate _transform_rtmp_url
* Generalize triforce mgid extraction
+ [cmt] Add support for full-episodes (closes #11623)
2017-01-14 06:18:38 +07:00
Remita Amine
5e8eebb600 [mitele] extract dash formats 2017-01-13 23:06:59 +01:00
Remita Amine
9837cb7507 [ooyala] add support for videos with embedToken(#11684) 2017-01-13 23:06:59 +01:00
Sergey M․
fb6a59205e [mixcloud] Fix extraction (closes #11674) 2017-01-13 23:56:16 +07:00
Vijay Singh
06e9363b7a [openload] Fix extraction (closes #10408)
Just a minor fix for openload
2017-01-13 23:40:19 +07:00
Remita Amine
1f393a3241 [tv4] improve extraction(closes #11698)
- remove check for requires_subscription
- extract more formats
- extract subtitles
2017-01-13 10:21:37 +01:00
Remita Amine
c4251b9aaa [common] add possibility to customize akamai manifest host 2017-01-13 10:21:36 +01:00
Sergey M․
3a407e707a [freesound] Improve and remove unrelated metadata (closes #11608) 2017-01-12 23:03:53 +07:00
Sergey M․
cb655f34fb [utils] Add more date formats 2017-01-12 22:39:45 +07:00
sh!zeeg
ed06da4e7b [freesound] Fix extraction and extended (closes #11602) 2017-01-12 22:35:14 +07:00
Sergey M․
365d136b7c [vimeo] Fix tests 2017-01-11 22:57:08 +07:00
Sergey M․
1fd0fc42bd [vimeo:ondemand] Fix test (closes #11651) 2017-01-11 22:51:03 +07:00
Sergey M․
10cd2003b4 [nick] Add support for beta.nick.com (closes #11655) 2017-01-10 22:32:34 +07:00
Sergey M․
cdd11c0540 [mtv] Use native hls by default 2017-01-10 22:31:20 +07:00
Sergey M․
67fc365b86 [mtv,cc] Use hls by default (closes #11641) 2017-01-10 22:30:47 +07:00
Sergey M․
20faad74b6 [mtv] Fix non-hls extraction
method attribute may not be present
2017-01-10 22:27:23 +07:00
Sergey M․
2032d935d1 [mtv] Add default value for use_hls
These methods are used across codebase with old number of arguments
2017-01-10 22:25:33 +07:00
33 changed files with 489 additions and 229 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.10*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.10**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.01.16*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.01.16**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.01.10
[debug] youtube-dl version 2017.01.16
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -1,3 +1,44 @@
version 2017.01.16
Core
* [options] Apply custom config to final composite configuration (#11741)
* [YoutubeDL] Improve protocol auto determining (#11720)
Extractors
* [xiami] Relax URL regular expressions
* [xiami] Improve track metadata extraction (#11699)
+ [limelight] Check hand-make direct HTTP links
+ [limelight] Add support for direct HTTP links at video.llnw.net (#11737)
+ [brightcove] Recognize another player ID pattern (#11688)
+ [niconico] Support login via cookies (#7968)
* [yourupload] Fix extraction (#11601)
+ [beam:live] Add support for beam.pro live streams (#10702, #11596)
* [vevo] Improve geo restriction detection
+ [dramafever] Add support for URLs with language code (#11714)
* [cbc] Improve playlist support (#11704)
version 2017.01.14
Core
+ [common] Add ability to customize akamai manifest host
+ [utils] Add more date formats
Extractors
- [mtv] Eliminate _transform_rtmp_url
* [mtv] Generalize triforce mgid extraction
+ [cmt] Add support for full episodes and video clips (#11623)
+ [mitele] Extract DASH formats
+ [ooyala] Add support for videos with embedToken (#11684)
* [mixcloud] Fix extraction (#11674)
* [openload] Fix extraction (#10408)
* [tv4] Improve extraction (#11698)
* [freesound] Fix and improve extraction (#11602)
+ [nick] Add support for beta.nick.com (#11655)
* [mtv,cc] Use HLS by default with native HLS downloader (#11641)
* [mtv] Fix non-HLS extraction
version 2017.01.10
Extractors

View File

@@ -841,7 +841,7 @@ Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`.
In order to extract cookies from browser use any conforming browser extension for exporting cookies. For example, [cookies.txt](https://chrome.google.com/webstore/detail/cookiestxt/njabckikapfpffapmjgojcnbfjonfjfg) (for Chrome) or [Export Cookies](https://addons.mozilla.org/en-US/firefox/addon/export-cookies/) (for Firefox).
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows and `LF` (`\n`) for Unix and Unix-like systems (Linux, Mac OS, etc.). `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).

View File

@@ -86,6 +86,7 @@
- **bbc.co.uk:article**: BBC articles
- **bbc.co.uk:iplayer:playlist**
- **bbc.co.uk:playlist**
- **Beam:live**
- **Beatport**
- **Beeg**
- **BehindKink**

View File

@@ -295,6 +295,9 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
self.assertEqual(unified_strdate('Feb 7, 2016 at 6:35 pm'), '20160207')
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)

View File

@@ -1363,7 +1363,7 @@ class YoutubeDL(object):
format['ext'] = determine_ext(format['url']).lower()
# Automatically determine protocol if missing (useful for format
# selection purposes)
if 'protocol' not in format:
if format.get('protocol') is None:
format['protocol'] = determine_protocol(format)
# Add HTTP headers, so that external programs can use them from the
# json output

View File

@@ -0,0 +1,73 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
clean_html,
compat_str,
int_or_none,
parse_iso8601,
try_get,
)
class BeamProLiveIE(InfoExtractor):
IE_NAME = 'Beam:live'
_VALID_URL = r'https?://(?:\w+\.)?beam\.pro/(?P<id>[^/?#&]+)'
_RATINGS = {'family': 0, 'teen': 13, '18+': 18}
_TEST = {
'url': 'http://www.beam.pro/niterhayven',
'info_dict': {
'id': '261562',
'ext': 'mp4',
'title': 'Introducing The Witcher 3 // The Grind Starts Now!',
'description': 'md5:0b161ac080f15fe05d18a07adb44a74d',
'thumbnail': r're:https://.*\.jpg$',
'timestamp': 1483477281,
'upload_date': '20170103',
'uploader': 'niterhayven',
'uploader_id': '373396',
'age_limit': 18,
'is_live': True,
'view_count': int,
},
'skip': 'niterhayven is offline',
'params': {
'skip_download': True,
},
}
def _real_extract(self, url):
channel_name = self._match_id(url)
chan = self._download_json(
'https://beam.pro/api/v1/channels/%s' % channel_name, channel_name)
if chan.get('online') is False:
raise ExtractorError(
'{0} is offline'.format(channel_name), expected=True)
channel_id = chan['id']
formats = self._extract_m3u8_formats(
'https://beam.pro/api/v1/channels/%s/manifest.m3u8' % channel_id,
channel_name, ext='mp4', m3u8_id='hls', fatal=False)
self._sort_formats(formats)
user_id = chan.get('userId') or try_get(chan, lambda x: x['user']['id'])
return {
'id': compat_str(chan.get('id') or channel_name),
'title': self._live_title(chan.get('name') or channel_name),
'description': clean_html(chan.get('description')),
'thumbnail': try_get(chan, lambda x: x['thumbnail']['url'], compat_str),
'timestamp': parse_iso8601(chan.get('updatedAt')),
'uploader': chan.get('token') or try_get(
chan, lambda x: x['user']['username'], compat_str),
'uploader_id': compat_str(user_id) if user_id else None,
'age_limit': self._RATINGS.get(chan.get('audience')),
'is_live': True,
'view_count': int_or_none(chan.get('viewersTotal')),
'formats': formats,
}

View File

@@ -179,7 +179,7 @@ class BrightcoveLegacyIE(InfoExtractor):
params = {}
playerID = find_param('playerID')
playerID = find_param('playerID') or find_param('playerId')
if playerID is None:
raise ExtractorError('Cannot find player ID')
params['playerID'] = playerID
@@ -204,7 +204,7 @@ class BrightcoveLegacyIE(InfoExtractor):
# // build Brightcove <object /> XML
# }
m = re.search(
r'''(?x)customBC.\createVideo\(
r'''(?x)customBC\.createVideo\(
.*? # skipping width and height
["\'](?P<playerID>\d+)["\']\s*,\s* # playerID
["\'](?P<playerKey>AQ[^"\']{48})[^"\']*["\']\s*,\s* # playerKey begins with AQ and is 50 characters

View File

@@ -90,36 +90,49 @@ class CBCIE(InfoExtractor):
},
}],
'skip': 'Geo-restricted to Canada',
}, {
# multiple CBC.APP.Caffeine.initInstance(...)
'url': 'http://www.cbc.ca/news/canada/calgary/dog-indoor-exercise-winter-1.3928238',
'info_dict': {
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238',
},
'playlist_mincount': 6,
}]
@classmethod
def suitable(cls, url):
return False if CBCPlayerIE.suitable(url) else super(CBCIE, cls).suitable(url)
def _extract_player_init(self, player_init, display_id):
player_info = self._parse_json(player_init, display_id, js_to_json)
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
feed = self._download_json(
'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
clip_id, fatal=False)
if feed:
media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
if not media_id:
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_init = self._search_regex(
r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage, 'player init',
default=None)
if player_init:
player_info = self._parse_json(player_init, display_id, js_to_json)
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
feed = self._download_json(
'http://tpfeed.cbc.ca/f/ExhSPC/vms_5akSXx4Ng_Zn?byCustomValue={:mpsReleases}{%s}' % clip_id,
clip_id, fatal=False)
if feed:
media_id = try_get(feed, lambda x: x['entries'][0]['guid'], compat_str)
if not media_id:
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
else:
entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
return self.playlist_result(entries)
entries = [
self._extract_player_init(player_init, display_id)
for player_init in re.findall(r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage)]
entries.extend([
self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)])
return self.playlist_result(
entries, display_id,
self._og_search_title(webpage, fatal=False),
self._og_search_description(webpage))
class CBCPlayerIE(InfoExtractor):

View File

@@ -1,13 +1,11 @@
from __future__ import unicode_literals
from .mtv import MTVIE
from ..utils import ExtractorError
class CMTIE(MTVIE):
IE_NAME = 'cmt.com'
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
_FEED_URL = 'http://www.cmt.com/sitewide/apps/player/embed/rss/'
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows|full-episodes|video-clips)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cmt.com/videos/garth-brooks/989124/the-call-featuring-trisha-yearwood.jhtml#artist=30061',
@@ -33,17 +31,24 @@ class CMTIE(MTVIE):
}, {
'url': 'http://www.cmt.com/shows/party-down-south/party-down-south-ep-407-gone-girl/1738172/playlist/#id=1738172',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/full-episodes/537qb3/nashville-the-wayfaring-stranger-season-5-ep-501',
'only_matching': True,
}, {
'url': 'http://www.cmt.com/video-clips/t9e4ci/nashville-juliette-in-2-minutes',
'only_matching': True,
}]
@classmethod
def _transform_rtmp_url(cls, rtmp_video_url):
if 'error_not_available.swf' in rtmp_video_url:
raise ExtractorError(
'%s said: video is not available' % cls.IE_NAME, expected=True)
return super(CMTIE, cls)._transform_rtmp_url(rtmp_video_url)
def _extract_mgid(self, webpage):
return self._search_regex(
mgid = self._search_regex(
r'MTVN\.VIDEO\.contentUri\s*=\s*([\'"])(?P<mgid>.+?)\1',
webpage, 'mgid', group='mgid')
webpage, 'mgid', group='mgid', default=None)
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
return mgid
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mgid = self._extract_mgid(webpage)
return self.url_result('http://media.mtvnservices.com/embed/%s' % mgid)

View File

@@ -48,17 +48,8 @@ class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
feed = self._parse_json(feed_json, playlist_id)
zones = feed['manifest']['zones']
video_zone = zones['t2_lc_promo1']
feed = self._download_json(video_zone['feed'], playlist_id)
mgid = feed['result']['data']['id']
videos_info = self._get_videos_info(mgid, use_hls=True)
mgid = self._extract_triforce_mgid(webpage, data_zone='t2_lc_promo1')
videos_info = self._get_videos_info(mgid)
return videos_info
@@ -94,12 +85,6 @@ class ToshIE(MTVServicesInfoExtractor):
'only_matching': True,
}]
@classmethod
def _transform_rtmp_url(cls, rtmp_video_url):
new_urls = super(ToshIE, cls)._transform_rtmp_url(rtmp_video_url)
new_urls['rtmp'] = rtmp_video_url.replace('viacomccstrm', 'viacommtvstrm')
return new_urls
class ComedyCentralTVIE(MTVServicesInfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comedycentral\.tv/(?:staffeln|shows)/(?P<id>[^/?#&]+)'

View File

@@ -1967,10 +1967,13 @@ class InfoExtractor(object):
entries.append(media_info)
return entries
def _extract_akamai_formats(self, manifest_url, video_id):
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
formats = []
hdcore_sign = 'hdcore=3.7.0'
f4m_url = re.sub(r'(https?://.+?)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
f4m_url = re.sub(r'(https?://[^/+])/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
hds_host = hosts.get('hds')
if hds_host:
f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)
if 'hdcore=' not in f4m_url:
f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign
f4m_formats = self._extract_f4m_formats(
@@ -1978,7 +1981,10 @@ class InfoExtractor(object):
for entry in f4m_formats:
entry.update({'extra_param_to_segment_url': hdcore_sign})
formats.extend(f4m_formats)
m3u8_url = re.sub(r'(https?://.+?)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
m3u8_url = re.sub(r'(https?://[^/]+)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
hls_host = hosts.get('hls')
if hls_host:
m3u8_url = re.sub(r'(https?://)[^/]+', r'\1' + hls_host, m3u8_url)
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))

View File

@@ -66,7 +66,7 @@ class DramaFeverBaseIE(AMPIE):
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/1/Cooking_with_Shin/',
'info_dict': {
@@ -103,6 +103,9 @@ class DramaFeverIE(DramaFeverBaseIE):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -148,7 +151,7 @@ class DramaFeverIE(DramaFeverBaseIE):
class DramaFeverSeriesIE(DramaFeverBaseIE):
IE_NAME = 'dramafever:series'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {

View File

@@ -88,6 +88,7 @@ from .bbc import (
BBCCoUkPlaylistIE,
BBCIE,
)
from .beampro import BeamProLiveIE
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE

View File

@@ -3,10 +3,16 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
float_or_none,
get_element_by_class,
get_element_by_id,
unified_strdate,
)
class FreesoundIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?freesound\.org/people/([^/]+)/sounds/(?P<id>[^/]+)'
_VALID_URL = r'https?://(?:www\.)?freesound\.org/people/[^/]+/sounds/(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.freesound.org/people/miklovan/sounds/194503/',
'md5': '12280ceb42c81f19a515c745eae07650',
@@ -14,26 +20,60 @@ class FreesoundIE(InfoExtractor):
'id': '194503',
'ext': 'mp3',
'title': 'gulls in the city.wav',
'uploader': 'miklovan',
'description': 'the sounds of seagulls in the city',
'duration': 130.233,
'uploader': 'miklovan',
'upload_date': '20130715',
'tags': list,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
music_id = mobj.group('id')
webpage = self._download_webpage(url, music_id)
title = self._html_search_regex(
r'<div id="single_sample_header">.*?<a href="#">(.+?)</a>',
webpage, 'music title', flags=re.DOTALL)
audio_id = self._match_id(url)
webpage = self._download_webpage(url, audio_id)
audio_url = self._og_search_property('audio', webpage, 'song url')
title = self._og_search_property('audio:title', webpage, 'song title')
description = self._html_search_regex(
r'<div id="sound_description">(.*?)</div>', webpage, 'description',
fatal=False, flags=re.DOTALL)
r'(?s)id=["\']sound_description["\'][^>]*>(.+?)</div>',
webpage, 'description', fatal=False)
duration = float_or_none(
get_element_by_class('duration', webpage), scale=1000)
upload_date = unified_strdate(get_element_by_id('sound_date', webpage))
uploader = self._og_search_property(
'audio:artist', webpage, 'uploader', fatal=False)
channels = self._html_search_regex(
r'Channels</dt><dd>(.+?)</dd>', webpage,
'channels info', fatal=False)
tags_str = get_element_by_class('tags', webpage)
tags = re.findall(r'<a[^>]+>([^<]+)', tags_str) if tags_str else None
audio_urls = [audio_url]
LQ_FORMAT = '-lq.mp3'
if LQ_FORMAT in audio_url:
audio_urls.append(audio_url.replace(LQ_FORMAT, '-hq.mp3'))
formats = [{
'url': format_url,
'format_note': channels,
'quality': quality,
} for quality, format_url in enumerate(audio_urls)]
self._sort_formats(formats)
return {
'id': music_id,
'id': audio_id,
'title': title,
'url': self._og_search_property('audio', webpage, 'music url'),
'uploader': self._og_search_property('audio:artist', webpage, 'music uploader'),
'description': description,
'duration': duration,
'uploader': uploader,
'upload_date': upload_date,
'tags': tags,
'formats': formats,
}

View File

@@ -422,6 +422,26 @@ class GenericIE(InfoExtractor):
'skip_download': True, # m3u8 download
},
},
{
# Brightcove with alternative playerID key
'url': 'http://www.nature.com/nmeth/journal/v9/n7/fig_tab/nmeth.2062_SV1.html',
'info_dict': {
'id': 'nmeth.2062_SV1',
'title': 'Simultaneous multiview imaging of the Drosophila syncytial blastoderm : Quantitative high-speed imaging of entire developing embryos with simultaneous multiview light-sheet microscopy : Nature Methods : Nature Research',
},
'playlist': [{
'info_dict': {
'id': '2228375078001',
'ext': 'mp4',
'title': 'nmeth.2062-sv1',
'description': 'nmeth.2062-sv1',
'timestamp': 1363357591,
'upload_date': '20130315',
'uploader': 'Nature Publishing Group',
'uploader_id': '1964492299001',
},
}],
},
# ooyala video
{
'url': 'http://www.rollingstone.com/music/videos/norwegian-dj-cashmere-cat-goes-spartan-on-with-me-premiere-20131219',
@@ -1939,7 +1959,14 @@ class GenericIE(InfoExtractor):
re.search(r'SBN\.VideoLinkset\.ooyala\([\'"](?P<ec>.{32})[\'"]\)', webpage) or
re.search(r'data-ooyala-video-id\s*=\s*[\'"](?P<ec>.{32})[\'"]', webpage))
if mobj is not None:
return OoyalaIE._build_url_result(smuggle_url(mobj.group('ec'), {'domain': url}))
embed_token = self._search_regex(
r'embedToken[\'"]?\s*:\s*[\'"]([^\'"]+)',
webpage, 'ooyala embed token', default=None)
return OoyalaIE._build_url_result(smuggle_url(
mobj.group('ec'), {
'domain': url,
'embed_token': embed_token,
}))
# Look for multiple Ooyala embeds on SBN network websites
mobj = re.search(r'SBN\.VideoLinkset\.entryGroup\((\[.*?\])', webpage)

View File

@@ -59,14 +59,26 @@ class LimelightBaseIE(InfoExtractor):
format_id = 'rtmp'
if stream.get('videoBitRate'):
format_id += '-%d' % int_or_none(stream['videoBitRate'])
http_url = 'http://cpl.delvenetworks.com/' + rtmp.group('playpath')[4:]
urls.append(http_url)
http_fmt = fmt.copy()
http_fmt.update({
'url': http_url,
'format_id': format_id.replace('rtmp', 'http'),
})
formats.append(http_fmt)
http_format_id = format_id.replace('rtmp', 'http')
CDN_HOSTS = (
('delvenetworks.com', 'cpl.delvenetworks.com'),
('video.llnw.net', 's2.content.video.llnw.net'),
)
for cdn_host, http_host in CDN_HOSTS:
if cdn_host not in rtmp.group('host').lower():
continue
http_url = 'http://%s/%s' % (http_host, rtmp.group('playpath')[4:])
urls.append(http_url)
if self._is_valid_url(http_url, video_id, http_format_id):
http_fmt = fmt.copy()
http_fmt.update({
'url': http_url,
'format_id': http_format_id,
})
formats.append(http_fmt)
break
fmt.update({
'url': rtmp.group('url'),
'play_path': rtmp.group('playpath'),

View File

@@ -190,7 +190,7 @@ class MiTeleIE(InfoExtractor):
return {
'_type': 'url_transparent',
# for some reason only HLS is supported
'url': smuggle_url('ooyala:' + embedCode, {'supportedformats': 'm3u8'}),
'url': smuggle_url('ooyala:' + embedCode, {'supportedformats': 'm3u8,dash'}),
'id': video_id,
'title': title,
'description': description,

View File

@@ -16,7 +16,6 @@ from ..utils import (
clean_html,
ExtractorError,
OnDemandPagedList,
parse_count,
str_to_int,
)
@@ -36,7 +35,6 @@ class MixcloudIE(InfoExtractor):
'uploader_id': 'dholbach',
'thumbnail': r're:https?://.*\.jpg',
'view_count': int,
'like_count': int,
},
}, {
'url': 'http://www.mixcloud.com/gillespeterson/caribou-7-inch-vinyl-mix-chat/',
@@ -49,7 +47,6 @@ class MixcloudIE(InfoExtractor):
'uploader_id': 'gillespeterson',
'thumbnail': 're:https?://.*',
'view_count': int,
'like_count': int,
},
}, {
'url': 'https://beta.mixcloud.com/RedLightRadio/nosedrip-15-red-light-radio-01-18-2016/',
@@ -89,26 +86,18 @@ class MixcloudIE(InfoExtractor):
song_url = play_info['stream_url']
PREFIX = (
r'm-play-on-spacebar[^>]+'
r'(?:\s+[a-zA-Z0-9-]+(?:="[^"]+")?)*?\s+')
title = self._html_search_regex(
PREFIX + r'm-title="([^"]+)"', webpage, 'title')
title = self._html_search_regex(r'm-title="([^"]+)"', webpage, 'title')
thumbnail = self._proto_relative_url(self._html_search_regex(
PREFIX + r'm-thumbnail-url="([^"]+)"', webpage, 'thumbnail',
fatal=False))
r'm-thumbnail-url="([^"]+)"', webpage, 'thumbnail', fatal=False))
uploader = self._html_search_regex(
PREFIX + r'm-owner-name="([^"]+)"',
webpage, 'uploader', fatal=False)
r'm-owner-name="([^"]+)"', webpage, 'uploader', fatal=False)
uploader_id = self._search_regex(
r'\s+"profile": "([^"]+)",', webpage, 'uploader id', fatal=False)
description = self._og_search_description(webpage)
like_count = parse_count(self._search_regex(
r'\bbutton-favorite[^>]+>.*?<span[^>]+class=["\']toggle-number[^>]+>\s*([^<]+)',
webpage, 'like count', default=None))
view_count = str_to_int(self._search_regex(
[r'<meta itemprop="interactionCount" content="UserPlays:([0-9]+)"',
r'/listeners/?">([0-9,.]+)</a>'],
r'/listeners/?">([0-9,.]+)</a>',
r'm-tooltip=["\']([\d,.]+) plays'],
webpage, 'play count', default=None))
return {
@@ -120,7 +109,6 @@ class MixcloudIE(InfoExtractor):
'uploader': uploader,
'uploader_id': uploader_id,
'view_count': view_count,
'like_count': like_count,
}

View File

@@ -13,11 +13,11 @@ from ..utils import (
fix_xml_ampersands,
float_or_none,
HEADRequest,
NO_DEFAULT,
RegexNotFoundError,
sanitized_Request,
strip_or_none,
timeconvert,
try_get,
unescapeHTML,
update_url_query,
url_basename,
@@ -42,15 +42,6 @@ class MTVServicesInfoExtractor(InfoExtractor):
# Remove the templates, like &device={device}
return re.sub(r'&[^=]*?={.*?}(?=(&|$))', '', url)
# This was originally implemented for ComedyCentral, but it also works here
@classmethod
def _transform_rtmp_url(cls, rtmp_video_url):
m = re.match(r'^rtmpe?://.*?/(?P<finalid>gsp\..+?/.*)$', rtmp_video_url)
if not m:
return {'rtmp': rtmp_video_url}
base = 'http://viacommtvstrmfs.fplive.net/'
return {'http': base + m.group('finalid')}
def _get_feed_url(self, uri):
return self._FEED_URL
@@ -88,24 +79,31 @@ class MTVServicesInfoExtractor(InfoExtractor):
formats = []
for rendition in mdoc.findall('.//rendition'):
if rendition.attrib['method'] == 'hls':
if rendition.get('method') == 'hls':
hls_url = rendition.find('./src').text
formats.extend(self._extract_m3u8_formats(hls_url, video_id, ext='mp4'))
formats.extend(self._extract_m3u8_formats(
hls_url, video_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id='hls'))
else:
# fms
try:
_, _, ext = rendition.attrib['type'].partition('/')
rtmp_video_url = rendition.find('./src').text
if 'error_not_available.swf' in rtmp_video_url:
raise ExtractorError(
'%s said: video is not available' % self.IE_NAME,
expected=True)
if rtmp_video_url.endswith('siteunavail.png'):
continue
new_urls = self._transform_rtmp_url(rtmp_video_url)
formats.extend([{
'ext': 'flv' if new_url.startswith('rtmp') else ext,
'url': new_url,
'format_id': '-'.join(filter(None, [kind, rendition.get('bitrate')])),
'ext': 'flv' if rtmp_video_url.startswith('rtmp') else ext,
'url': rtmp_video_url,
'format_id': '-'.join(filter(None, [
'rtmp' if rtmp_video_url.startswith('rtmp') else None,
rendition.get('bitrate')])),
'width': int(rendition.get('width')),
'height': int(rendition.get('height')),
} for kind, new_url in new_urls.items()])
}])
except (KeyError, TypeError):
raise ExtractorError('Invalid rendition field.')
self._sort_formats(formats)
@@ -123,7 +121,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
} for typographic in transcript.findall('./typographic')]
return subtitles
def _get_video_info(self, itemdoc, use_hls):
def _get_video_info(self, itemdoc, use_hls=True):
uri = itemdoc.find('guid').text
video_id = self._id_from_uri(uri)
self.report_extraction(video_id)
@@ -193,13 +191,13 @@ class MTVServicesInfoExtractor(InfoExtractor):
data['lang'] = self._LANG
return data
def _get_videos_info(self, uri, use_hls=False):
def _get_videos_info(self, uri, use_hls=True):
video_id = self._id_from_uri(uri)
feed_url = self._get_feed_url(uri)
info_url = update_url_query(feed_url, self._get_feed_query(uri))
return self._get_videos_info_from_url(info_url, video_id, use_hls)
def _get_videos_info_from_url(self, url, video_id, use_hls):
def _get_videos_info_from_url(self, url, video_id, use_hls=True):
idoc = self._download_xml(
url, video_id,
'Downloading info', transform_source=fix_xml_ampersands)
@@ -211,7 +209,28 @@ class MTVServicesInfoExtractor(InfoExtractor):
[self._get_video_info(item, use_hls) for item in idoc.findall('.//item')],
playlist_title=title, playlist_description=description)
def _extract_mgid(self, webpage, default=NO_DEFAULT):
def _extract_triforce_mgid(self, webpage, data_zone=None, video_id=None):
triforce_feed = self._parse_json(self._search_regex(
r'triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage,
'triforce feed', default='{}'), video_id, fatal=False)
data_zone = self._search_regex(
r'data-zone=(["\'])(?P<zone>.+?_lc_promo.*?)\1', webpage,
'data zone', default=data_zone, group='zone')
feed_url = try_get(
triforce_feed, lambda x: x['manifest']['zones'][data_zone]['feed'],
compat_str)
if not feed_url:
return
feed = self._download_json(feed_url, video_id, fatal=False)
if not feed:
return
return try_get(feed, lambda x: x['result']['data']['id'], compat_str)
def _extract_mgid(self, webpage):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
@@ -231,7 +250,11 @@ class MTVServicesInfoExtractor(InfoExtractor):
sm4_embed = self._html_search_meta(
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=default)
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=None)
if not mgid:
mgid = self._extract_triforce_mgid(webpage)
return mgid
def _real_extract(self, url):

View File

@@ -10,7 +10,7 @@ from ..utils import update_url_query
class NickIE(MTVServicesInfoExtractor):
# None of videos on the website are still alive?
IE_NAME = 'nick.com'
_VALID_URL = r'https?://(?:www\.)?nick(?:jr)?\.com/(?:videos/clip|[^/]+/videos)/(?P<id>[^/?#.]+)'
_VALID_URL = r'https?://(?:(?:www|beta)\.)?nick(?:jr)?\.com/(?:[^/]+/)?(?:videos/clip|[^/]+/videos)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://udat.mtvnservices.com/service1/dispatch.htm'
_TESTS = [{
'url': 'http://www.nick.com/videos/clip/alvinnn-and-the-chipmunks-112-full-episode.html',
@@ -57,6 +57,9 @@ class NickIE(MTVServicesInfoExtractor):
}, {
'url': 'http://www.nickjr.com/paw-patrol/videos/pups-save-a-goldrush-s3-ep302-full-episode/',
'only_matching': True,
}, {
'url': 'http://beta.nick.com/nicky-ricky-dicky-and-dawn/videos/nicky-ricky-dicky-dawn-301-full-episode/',
'only_matching': True,
}]
def _get_feed_query(self, uri):

View File

@@ -7,7 +7,6 @@ import datetime
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
compat_urlparse,
)
from ..utils import (
@@ -40,6 +39,7 @@ class NiconicoIE(InfoExtractor):
'description': '(c) copyright 2008, Blender Foundation / www.bigbuckbunny.org',
'duration': 33,
},
'skip': 'Requires an account',
}, {
# File downloaded with and without credentials are different, so omit
# the md5 field
@@ -55,6 +55,7 @@ class NiconicoIE(InfoExtractor):
'timestamp': 1304065916,
'duration': 209,
},
'skip': 'Requires an account',
}, {
# 'video exists but is marked as "deleted"
# md5 is unstable
@@ -65,9 +66,10 @@ class NiconicoIE(InfoExtractor):
'description': 'deleted',
'title': 'ドラえもんエターナル第3話「決戦第3新東京市」前編',
'upload_date': '20071224',
'timestamp': 1198527840, # timestamp field has different value if logged in
'timestamp': int, # timestamp field has different value if logged in
'duration': 304,
},
'skip': 'Requires an account',
}, {
'url': 'http://www.nicovideo.jp/watch/so22543406',
'info_dict': {
@@ -79,13 +81,12 @@ class NiconicoIE(InfoExtractor):
'upload_date': '20140104',
'uploader': 'アニメロチャンネル',
'uploader_id': '312',
}
},
'skip': 'The viewing period of the video you were searching for has expired.',
}]
_VALID_URL = r'https?://(?:www\.|secure\.)?nicovideo\.jp/watch/(?P<id>(?:[a-z]{2})?[0-9]+)'
_NETRC_MACHINE = 'niconico'
# Determine whether the downloader used authentication to download video
_AUTHENTICATED = False
def _real_initialize(self):
self._login()
@@ -109,8 +110,6 @@ class NiconicoIE(InfoExtractor):
if re.search(r'(?i)<h1 class="mb8p4">Log in error</h1>', login_results) is not None:
self._downloader.report_warning('unable to log in: bad username or password')
return False
# Successful login
self._AUTHENTICATED = True
return True
def _real_extract(self, url):
@@ -128,35 +127,19 @@ class NiconicoIE(InfoExtractor):
'http://ext.nicovideo.jp/api/getthumbinfo/' + video_id, video_id,
note='Downloading video info page')
if self._AUTHENTICATED:
# Get flv info
flv_info_webpage = self._download_webpage(
'http://flapi.nicovideo.jp/api/getflv/' + video_id + '?as3=1',
video_id, 'Downloading flv info')
else:
# Get external player info
ext_player_info = self._download_webpage(
'http://ext.nicovideo.jp/thumb_watch/' + video_id, video_id)
thumb_play_key = self._search_regex(
r'\'thumbPlayKey\'\s*:\s*\'(.*?)\'', ext_player_info, 'thumbPlayKey')
# Get flv info
flv_info_data = compat_urllib_parse_urlencode({
'k': thumb_play_key,
'v': video_id
})
flv_info_request = sanitized_Request(
'http://ext.nicovideo.jp/thumb_watch', flv_info_data,
{'Content-Type': 'application/x-www-form-urlencoded'})
flv_info_webpage = self._download_webpage(
flv_info_request, video_id,
note='Downloading flv info', errnote='Unable to download flv info')
# Get flv info
flv_info_webpage = self._download_webpage(
'http://flapi.nicovideo.jp/api/getflv/' + video_id + '?as3=1',
video_id, 'Downloading flv info')
flv_info = compat_urlparse.parse_qs(flv_info_webpage)
if 'url' not in flv_info:
if 'deleted' in flv_info:
raise ExtractorError('The video has been deleted.',
expected=True)
elif 'closed' in flv_info:
raise ExtractorError('Niconico videos now require logging in',
expected=True)
else:
raise ExtractorError('Unable to find video URL')

View File

@@ -18,7 +18,7 @@ class OoyalaBaseIE(InfoExtractor):
_CONTENT_TREE_BASE = _PLAYER_BASE + 'player_api/v1/content_tree/'
_AUTHORIZATION_URL_TEMPLATE = _PLAYER_BASE + 'sas/player_api/v2/authorization/embed_code/%s/%s?'
def _extract(self, content_tree_url, video_id, domain='example.org', supportedformats=None):
def _extract(self, content_tree_url, video_id, domain='example.org', supportedformats=None, embed_token=None):
content_tree = self._download_json(content_tree_url, video_id)['content_tree']
metadata = content_tree[list(content_tree)[0]]
embed_code = metadata['embed_code']
@@ -29,7 +29,8 @@ class OoyalaBaseIE(InfoExtractor):
self._AUTHORIZATION_URL_TEMPLATE % (pcode, embed_code) +
compat_urllib_parse_urlencode({
'domain': domain,
'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds',
'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds,dash,smooth',
'embedToken': embed_token,
}), video_id)
cur_auth_data = auth_data['authorization_data'][embed_code]
@@ -52,6 +53,12 @@ class OoyalaBaseIE(InfoExtractor):
elif delivery_type == 'hds' or ext == 'f4m':
formats.extend(self._extract_f4m_formats(
s_url + '?hdcore=3.7.0', embed_code, f4m_id='hds', fatal=False))
elif delivery_type == 'dash' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
s_url, embed_code, mpd_id='dash', fatal=False))
elif delivery_type == 'smooth':
self._extract_ism_formats(
s_url, embed_code, ism_id='mss', fatal=False)
elif ext == 'smil':
formats.extend(self._extract_smil_formats(
s_url, embed_code, fatal=False))
@@ -146,8 +153,9 @@ class OoyalaIE(OoyalaBaseIE):
embed_code = self._match_id(url)
domain = smuggled_data.get('domain')
supportedformats = smuggled_data.get('supportedformats')
embed_token = smuggled_data.get('embed_token')
content_tree_url = self._CONTENT_TREE_BASE + 'embed_code/%s/%s' % (embed_code, embed_code)
return self._extract(content_tree_url, embed_code, domain, supportedformats)
return self._extract(content_tree_url, embed_code, domain, supportedformats, embed_token)
class OoyalaExternalIE(OoyalaBaseIE):

View File

@@ -64,16 +64,17 @@ class OpenloadIE(InfoExtractor):
raise ExtractorError('File not found', expected=True)
ol_id = self._search_regex(
'<span[^>]+id="[a-zA-Z0-9]+x"[^>]*>([0-9]+)</span>',
'<span[^>]+id="[^"]+"[^>]*>([0-9]+)</span>',
webpage, 'openload ID')
first_two_chars = int(float(ol_id[0:][:2]))
first_three_chars = int(float(ol_id[0:][:3]))
fifth_char = int(float(ol_id[3:5]))
urlcode = ''
num = 2
num = 5
while num < len(ol_id):
urlcode += compat_chr(int(float(ol_id[num:][:3])) -
first_two_chars * int(float(ol_id[num + 3:][:2])))
urlcode += compat_chr(int(float(ol_id[num:][:3])) +
first_three_chars - fifth_char * int(float(ol_id[num + 3:][:2])))
num += 5
video_url = 'https://openload.co/stream/' + urlcode

View File

@@ -46,7 +46,7 @@ class SpikeIE(MTVServicesInfoExtractor):
_CUSTOM_URL_REGEX = re.compile(r'spikenetworkapp://([^/]+/[-a-fA-F0-9]+)')
def _extract_mgid(self, webpage):
mgid = super(SpikeIE, self)._extract_mgid(webpage, default=None)
mgid = super(SpikeIE, self)._extract_mgid(webpage)
if mgid is None:
url_parts = self._search_regex(self._CUSTOM_URL_REGEX, webpage, 'episode_id')
video_type, episode_id = url_parts.split('/', 1)

View File

@@ -4,11 +4,10 @@ from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
try_get,
update_url_query,
determine_ext,
)
@@ -28,7 +27,7 @@ class TV4IE(InfoExtractor):
_TESTS = [
{
'url': 'http://www.tv4.se/kalla-fakta/klipp/kalla-fakta-5-english-subtitles-2491650',
'md5': '909d6454b87b10a25aa04c4bdd416a9b',
'md5': 'cb837212f342d77cec06e6dad190e96d',
'info_dict': {
'id': '2491650',
'ext': 'mp4',
@@ -40,7 +39,7 @@ class TV4IE(InfoExtractor):
},
{
'url': 'http://www.tv4play.se/iframe/video/3054113',
'md5': '77f851c55139ffe0ebd41b6a5552489b',
'md5': 'cb837212f342d77cec06e6dad190e96d',
'info_dict': {
'id': '3054113',
'ext': 'mp4',
@@ -75,11 +74,10 @@ class TV4IE(InfoExtractor):
# If is_geo_restricted is true, it doesn't necessarily mean we can't download it
if info.get('is_geo_restricted'):
self.report_warning('This content might not be available in your country due to licensing restrictions.')
if info.get('requires_subscription'):
raise ExtractorError('This content requires subscription.', expected=True)
title = info['title']
subtitles = {}
formats = []
# http formats are linked with unresolvable host
for kind in ('hls', ''):
@@ -87,26 +85,41 @@ class TV4IE(InfoExtractor):
'https://prima.tv4play.se/api/web/asset/%s/play.json' % video_id,
video_id, 'Downloading sources JSON', query={
'protocol': kind,
'videoFormat': 'MP4+WEBVTTS+WEBVTT',
'videoFormat': 'MP4+WEBVTT',
})
item = try_get(data, lambda x: x['playback']['items']['item'], dict)
manifest_url = item.get('url')
if not isinstance(manifest_url, compat_str):
items = try_get(data, lambda x: x['playback']['items']['item'])
if not items:
continue
if kind == 'hls':
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=kind, fatal=False))
else:
formats.extend(self._extract_f4m_formats(
update_url_query(manifest_url, {'hdcore': '3.8.0'}),
video_id, f4m_id='hds', fatal=False))
if isinstance(items, dict):
items = [items]
for item in items:
manifest_url = item.get('url')
if not isinstance(manifest_url, compat_str):
continue
ext = determine_ext(manifest_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=kind, fatal=False))
elif ext == 'f4m':
formats.extend(self._extract_akamai_formats(
manifest_url, video_id, {
'hls': 'tv4play-i.akamaihd.net',
}))
elif ext == 'webvtt':
subtitles = self._merge_subtitles(
subtitles, {
'sv': [{
'url': manifest_url,
'ext': 'vtt',
}]})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'formats': formats,
'subtitles': subtitles,
'description': info.get('description'),
'timestamp': parse_iso8601(info.get('broadcast_date_time')),
'duration': int_or_none(info.get('duration')),

View File

@@ -206,7 +206,7 @@ class VevoIE(VevoBaseIE):
note='Retrieving oauth token',
errnote='Unable to retrieve oauth token')
if 'THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION' in webpage:
if re.search(r'(?i)THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION', webpage):
self.raise_geo_restricted(
'%s said: This page is currently unavailable in your region' % self.IE_NAME)

View File

@@ -254,7 +254,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'uploader_id': 'user18948128',
'uploader': 'Jaime Marquínez Ferrándiz',
'duration': 10,
'description': 'This is "youtube-dl password protected test video" by on Vimeo, the home for high quality videos and the people who love them.',
'description': 'md5:dca3ea23adb29ee387127bc4ddfce63f',
},
'params': {
'videopassword': 'youtube-dl',
@@ -306,7 +306,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
{
# contains original format
'url': 'https://vimeo.com/33951933',
'md5': '2d9f5475e0537f013d0073e812ab89e6',
'md5': '53c688fa95a55bf4b7293d37a89c5c53',
'info_dict': {
'id': '33951933',
'ext': 'mp4',
@@ -324,7 +324,7 @@ class VimeoIE(VimeoBaseInfoExtractor):
'url': 'https://vimeo.com/channels/tributes/6213729',
'info_dict': {
'id': '6213729',
'ext': 'mp4',
'ext': 'mov',
'title': 'Vimeo Tribute: The Shining',
'uploader': 'Casey Donahue',
'uploader_url': r're:https?://(?:www\.)?vimeo\.com/caseydonahue',
@@ -629,6 +629,9 @@ class VimeoOndemandIE(VimeoBaseInfoExtractor):
'uploader_url': r're:https?://(?:www\.)?vimeo\.com/gumfilms',
'uploader_id': 'gumfilms',
},
'params': {
'format': 'best[protocol=https]',
},
}, {
# requires Referer to be passed along with og:video:url
'url': 'https://vimeo.com/ondemand/36938/126682985',

View File

@@ -16,7 +16,9 @@ class XiamiBaseIE(InfoExtractor):
return webpage
def _extract_track(self, track, track_id=None):
title = track['title']
track_name = track.get('songName') or track.get('name') or track['subName']
artist = track.get('artist') or track.get('artist_name') or track.get('singers')
title = '%s - %s' % (artist, track_name) if artist else track_name
track_url = self._decrypt(track['location'])
subtitles = {}
@@ -31,9 +33,10 @@ class XiamiBaseIE(InfoExtractor):
'thumbnail': track.get('pic') or track.get('album_pic'),
'duration': int_or_none(track.get('length')),
'creator': track.get('artist', '').split(';')[0],
'track': title,
'album': track.get('album_name'),
'artist': track.get('artist'),
'track': track_name,
'track_number': int_or_none(track.get('track')),
'album': track.get('album_name') or track.get('title'),
'artist': artist,
'subtitles': subtitles,
}
@@ -68,14 +71,14 @@ class XiamiBaseIE(InfoExtractor):
class XiamiSongIE(XiamiBaseIE):
IE_NAME = 'xiami:song'
IE_DESC = '虾米音乐'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/song/(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/song/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.xiami.com/song/1775610518',
'md5': '521dd6bea40fd5c9c69f913c232cb57e',
'info_dict': {
'id': '1775610518',
'ext': 'mp3',
'title': 'Woman',
'title': 'HONNE - Woman',
'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
'duration': 265,
'creator': 'HONNE',
@@ -95,7 +98,7 @@ class XiamiSongIE(XiamiBaseIE):
'info_dict': {
'id': '1775256504',
'ext': 'mp3',
'title': '悟空',
'title': '戴荃 - 悟空',
'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
'duration': 200,
'creator': '戴荃',
@@ -109,6 +112,26 @@ class XiamiSongIE(XiamiBaseIE):
},
},
'skip': 'Georestricted',
}, {
'url': 'http://www.xiami.com/song/1775953850',
'info_dict': {
'id': '1775953850',
'ext': 'mp3',
'title': 'До Скону - Чума Пожирает Землю',
'thumbnail': r're:http://img\.xiami\.net/images/album/.*\.jpg',
'duration': 683,
'creator': 'До Скону',
'track': 'Чума Пожирает Землю',
'track_number': 7,
'album': 'Ад',
'artist': 'До Скону',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.xiami.com/song/xLHGwgd07a1',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -124,7 +147,7 @@ class XiamiPlaylistBaseIE(XiamiBaseIE):
class XiamiAlbumIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:album'
IE_DESC = '虾米音乐 - 专辑'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/album/(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/album/(?P<id>[^/?#&]+)'
_TYPE = '1'
_TESTS = [{
'url': 'http://www.xiami.com/album/2100300444',
@@ -136,28 +159,34 @@ class XiamiAlbumIE(XiamiPlaylistBaseIE):
}, {
'url': 'http://www.xiami.com/album/512288?spm=a1z1s.6843761.1110925389.6.hhE9p9',
'only_matching': True,
}, {
'url': 'http://www.xiami.com/album/URVDji2a506',
'only_matching': True,
}]
class XiamiArtistIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:artist'
IE_DESC = '虾米音乐 - 歌手'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/artist/(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/artist/(?P<id>[^/?#&]+)'
_TYPE = '2'
_TEST = {
_TESTS = [{
'url': 'http://www.xiami.com/artist/2132?spm=0.0.0.0.dKaScp',
'info_dict': {
'id': '2132',
},
'playlist_count': 20,
'skip': 'Georestricted',
}
}, {
'url': 'http://www.xiami.com/artist/bC5Tk2K6eb99',
'only_matching': True,
}]
class XiamiCollectionIE(XiamiPlaylistBaseIE):
IE_NAME = 'xiami:collection'
IE_DESC = '虾米音乐 - 精选集'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/collect/(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?xiami\.com/collect/(?P<id>[^/?#&]+)'
_TYPE = '3'
_TEST = {
'url': 'http://www.xiami.com/collect/156527391?spm=a1z1s.2943601.6856193.12.4jpBnr',

View File

@@ -2,44 +2,37 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import urljoin
class YourUploadIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?
(?:yourupload\.com/watch|
embed\.yourupload\.com|
embed\.yucache\.net
)/(?P<id>[A-Za-z0-9]+)
'''
_TESTS = [
{
'url': 'http://yourupload.com/watch/14i14h',
'md5': '5e2c63385454c557f97c4c4131a393cd',
'info_dict': {
'id': '14i14h',
'ext': 'mp4',
'title': 'BigBuckBunny_320x180.mp4',
'thumbnail': r're:^https?://.*\.jpe?g',
}
},
{
'url': 'http://embed.yourupload.com/14i14h',
'only_matching': True,
},
{
'url': 'http://embed.yucache.net/14i14h?client_file_id=803349',
'only_matching': True,
},
]
_VALID_URL = r'https?://(?:www\.)?(?:yourupload\.com/(?:watch|embed)|embed\.yourupload\.com)/(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'http://yourupload.com/watch/14i14h',
'md5': '5e2c63385454c557f97c4c4131a393cd',
'info_dict': {
'id': '14i14h',
'ext': 'mp4',
'title': 'BigBuckBunny_320x180.mp4',
'thumbnail': r're:^https?://.*\.jpe?g',
}
}, {
'url': 'http://www.yourupload.com/embed/14i14h',
'only_matching': True,
}, {
'url': 'http://embed.yourupload.com/14i14h',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
embed_url = 'http://embed.yucache.net/{0:}'.format(video_id)
embed_url = 'http://www.yourupload.com/embed/%s' % video_id
webpage = self._download_webpage(embed_url, video_id)
title = self._og_search_title(webpage)
video_url = self._og_search_video_url(webpage)
video_url = urljoin(embed_url, self._og_search_video_url(webpage))
thumbnail = self._og_search_thumbnail(webpage, default=None)
return {

View File

@@ -867,7 +867,7 @@ def parseOpts(overrideArguments=None):
if '--ignore-config' not in system_conf:
user_conf = _readUserConf()
argv = system_conf + user_conf + command_line_conf
argv = system_conf + user_conf + custom_conf + command_line_conf
opts, args = parser.parse_args(argv)
if opts.verbose:
for conf_label, conf in (

View File

@@ -128,7 +128,13 @@ DATE_FORMATS = (
'%d %B %Y',
'%d %b %Y',
'%B %d %Y',
'%B %dst %Y',
'%B %dnd %Y',
'%B %dth %Y',
'%b %d %Y',
'%b %dst %Y',
'%b %dnd %Y',
'%b %dth %Y',
'%b %dst %Y %I:%M',
'%b %dnd %Y %I:%M',
'%b %dth %Y %I:%M',

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.01.10'
__version__ = '2017.01.16'