Compare commits
90 Commits
2016.09.11
...
2016.09.24
Author | SHA1 | Date | |
---|---|---|---|
![]() |
e6332059ac | ||
![]() |
8eec691e8a | ||
![]() |
24628cf7db | ||
![]() |
71ad00c09f | ||
![]() |
45cae3b021 | ||
![]() |
4ddcb5999d | ||
![]() |
628406db96 | ||
![]() |
e3d6bdc8fc | ||
![]() |
0a439c5c4c | ||
![]() |
1978540a51 | ||
![]() |
12f211d0cb | ||
![]() |
3a5a18705f | ||
![]() |
1ae0ae5db0 | ||
![]() |
f62a77b99a | ||
![]() |
4bfd294e2f | ||
![]() |
e33a7253b2 | ||
![]() |
c38f06818d | ||
![]() |
cb57386873 | ||
![]() |
59fd8f931d | ||
![]() |
70b4cf9b1b | ||
![]() |
cc764a6da8 | ||
![]() |
d8dbf8707d | ||
![]() |
a1da888d0c | ||
![]() |
3acff9423d | ||
![]() |
9ca93b99d1 | ||
![]() |
14ae11efab | ||
![]() |
190d2027d0 | ||
![]() |
26394d021d | ||
![]() |
30d0b549be | ||
![]() |
86f4d14f81 | ||
![]() |
21d21b0c72 | ||
![]() |
b4c1d6e800 | ||
![]() |
a0d5077c8d | ||
![]() |
584d6f3457 | ||
![]() |
e14c82bd6b | ||
![]() |
c51a7f0b2f | ||
![]() |
d05ef09d9d | ||
![]() |
30d9e20938 | ||
![]() |
fc86d4eed0 | ||
![]() |
7d273a387a | ||
![]() |
6ad0219556 | ||
![]() |
98b7506e96 | ||
![]() |
52dc8a9b3f | ||
![]() |
9d8985a165 | ||
![]() |
f5e008d134 | ||
![]() |
e6bf3621e7 | ||
![]() |
490b755769 | ||
![]() |
1dec2c8a0e | ||
![]() |
dcce092e0a | ||
![]() |
32443dd346 | ||
![]() |
2133565cec | ||
![]() |
1da50aa34e | ||
![]() |
d2522b86ac | ||
![]() |
537f753399 | ||
![]() |
c849836854 | ||
![]() |
eb5b1fc021 | ||
![]() |
95be29e1c6 | ||
![]() |
c035dba19e | ||
![]() |
87148bb711 | ||
![]() |
797c636bcb | ||
![]() |
0002962f3f | ||
![]() |
3e4185c396 | ||
![]() |
f6717dec8a | ||
![]() |
a942d6cb48 | ||
![]() |
961516bfd1 | ||
![]() |
6db354a9f4 | ||
![]() |
353f340e11 | ||
![]() |
014b7e6b25 | ||
![]() |
925194022c | ||
![]() |
b690ea15eb | ||
![]() |
5712c0f426 | ||
![]() |
86d68f906e | ||
![]() |
4875ff6847 | ||
![]() |
1b6712ab23 | ||
![]() |
8414c2da31 | ||
![]() |
45396dd2ed | ||
![]() |
7a7309219c | ||
![]() |
fcba157e80 | ||
![]() |
a6ccc3e518 | ||
![]() |
1d16035bb4 | ||
![]() |
e8bcd982cc | ||
![]() |
a5ff05df1a | ||
![]() |
d002e91986 | ||
![]() |
546edb2efa | ||
![]() |
be45730226 | ||
![]() |
ee7e672eb0 | ||
![]() |
0307d6fba6 | ||
![]() |
fc150cba1d | ||
![]() |
d667ab7fad | ||
![]() |
eb87d4545a |
8
.github/ISSUE_TEMPLATE.md
vendored
8
.github/ISSUE_TEMPLATE.md
vendored
@@ -6,8 +6,8 @@
|
||||
|
||||
---
|
||||
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.11*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.11**
|
||||
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.24*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
|
||||
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.24**
|
||||
|
||||
### Before submitting an *issue* make sure you have:
|
||||
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
|
||||
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
|
||||
[debug] User config: []
|
||||
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
|
||||
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
|
||||
[debug] youtube-dl version 2016.09.11
|
||||
[debug] youtube-dl version 2016.09.24
|
||||
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
|
||||
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
|
||||
[debug] Proxy map: {}
|
||||
@@ -55,4 +55,4 @@ $ youtube-dl -v <your command line>
|
||||
### Description of your *issue*, suggested solution and other information
|
||||
|
||||
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
|
||||
If work on your *issue* required an account credentials please provide them or explain how one can obtain them.
|
||||
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.
|
||||
|
2
.github/ISSUE_TEMPLATE_tmpl.md
vendored
2
.github/ISSUE_TEMPLATE_tmpl.md
vendored
@@ -55,4 +55,4 @@ $ youtube-dl -v <your command line>
|
||||
### Description of your *issue*, suggested solution and other information
|
||||
|
||||
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
|
||||
If work on your *issue* required an account credentials please provide them or explain how one can obtain them.
|
||||
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.
|
||||
|
94
ChangeLog
94
ChangeLog
@@ -1,10 +1,100 @@
|
||||
version 2016.09.11
|
||||
version 2016.09.24
|
||||
|
||||
Core
|
||||
+ Add support for watchTVeverywhere.com authentication provider based MSOs for
|
||||
Adobe Pass authentication (#10709)
|
||||
|
||||
Extractors
|
||||
+ [soundcloud:playlist] Provide video id for early playlist entries (#10733)
|
||||
+ [prosiebensat1] Add support for kabeleinsdoku (#10732)
|
||||
* [cbs] Extract info from thunder videoPlayerService (#10728)
|
||||
* [openload] Fix extraction (#10408)
|
||||
+ [ustream] Support the new HLS streams (#10698)
|
||||
+ [ooyala] Extract all HLS formats
|
||||
+ [cartoonnetwork] Add support for Adobe Pass authentication
|
||||
+ [soundcloud] Extract license metadata
|
||||
+ [fox] Add support for Adobe Pass authentication (#8584)
|
||||
+ [tbs] Add support for Adobe Pass authentication (#10642, #10222)
|
||||
+ [trutv] Add support for Adobe Pass authentication (#10519)
|
||||
+ [turner] Add support for Adobe Pass authentication
|
||||
|
||||
|
||||
version 2016.09.19
|
||||
|
||||
Extractors
|
||||
+ [crunchyroll] Check if already authenticated (#10700)
|
||||
- [twitch:stream] Remove fallback to profile extraction when stream is offline
|
||||
* [thisav] Improve title extraction (#10682)
|
||||
* [vyborymos] Improve station info extraction
|
||||
|
||||
|
||||
version 2016.09.18
|
||||
|
||||
Core
|
||||
+ Introduce manifest_url and fragments fields in formats dictionary for
|
||||
fragmented media
|
||||
+ Provide manifest_url field for DASH segments, HLS and HDS
|
||||
+ Provide fragments field for DASH segments
|
||||
* Rework DASH segments downloader to use fragments field
|
||||
+ Add helper method for Wowza Streaming Engine formats extraction
|
||||
|
||||
Extractors
|
||||
+ [vyborymos] Add extractor for vybory.mos.ru (#10692)
|
||||
+ [xfileshare] Add title regular expression for streamin.to (#10646)
|
||||
+ [globo:article] Add support for multiple videos (#10653)
|
||||
+ [thisav] Recognize HTML5 videos (#10447)
|
||||
* [jwplatform] Improve JWPlayer detection
|
||||
+ [mangomolo] Add support for Mangomolo embeds
|
||||
+ [toutv] Add support for authentication (#10669)
|
||||
* [franceinter] Fix upload date extraction
|
||||
* [tv4] Fix HLS and HDS formats extraction (#10659)
|
||||
|
||||
|
||||
version 2016.09.15
|
||||
|
||||
Core
|
||||
* Improve _hidden_inputs
|
||||
+ Introduce improved explicit Adobe Pass support
|
||||
+ Add --ap-mso to provide multiple-system operator identifier
|
||||
+ Add --ap-username to provide MSO account username
|
||||
+ Add --ap-password to provide MSO account password
|
||||
+ Add --ap-list-mso to list all supported MSOs
|
||||
+ Add support for Rogers Cable multiple-system operator (#10606)
|
||||
|
||||
Extractors
|
||||
* [crunchyroll] Fix authentication (#10655)
|
||||
* [twitch] Fix API calls (#10654, #10660)
|
||||
+ [bellmedia] Add support for more Bell Media Television sites
|
||||
* [franceinter] Fix extraction (#10538, #2105)
|
||||
* [kuwo] Improve error detection (#10650)
|
||||
+ [go] Add support for free full episodes (#10439)
|
||||
* [bilibili] Fix extraction for specific videos (#10647)
|
||||
* [nhk] Fix extraction (#10633)
|
||||
* [kaltura] Improve audio detection
|
||||
* [kaltura] Skip chun format
|
||||
+ [vimeo:ondemand] Pass Referer along with embed URL (#10624)
|
||||
+ [nbc] Add support for NBC Olympics (#10361)
|
||||
|
||||
|
||||
version 2016.09.11.1
|
||||
|
||||
Extractors
|
||||
+ [tube8] Extract categories and tags (#10579)
|
||||
+ [pornhub] Extract categories and tags (#10499)
|
||||
+ [foxnews] Support Fox News articles (#10598)
|
||||
* [openload] Temporary fix (#10408)
|
||||
+ [foxnews] Add support Fox News articles (#10598)
|
||||
* [viafree] Improve video id extraction (#10615)
|
||||
* [iwara] Fix extraction after relaunch (#10462, #3215)
|
||||
+ [tfo] Add extractor for tfo.org
|
||||
* [lrt] Fix audio extraction (#10566)
|
||||
* [9now] Fix extraction (#10561)
|
||||
+ [canalplus] Add support for c8.fr (#10577)
|
||||
* [newgrounds] Fix uploader extraction (#10584)
|
||||
+ [polskieradio:category] Add support for category lists (#10576)
|
||||
+ [ketnet] Add extractor for ketnet.be (#10343)
|
||||
+ [canvas] Add support for een.be (#10605)
|
||||
+ [telequebec] Add extractor for telequebec.tv (#1999)
|
||||
* [parliamentliveuk] Fix extraction (#9137)
|
||||
|
||||
|
||||
version 2016.09.08
|
||||
|
2
Makefile
2
Makefile
@@ -1,7 +1,7 @@
|
||||
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
|
||||
|
||||
clean:
|
||||
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
|
||||
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
|
||||
find . -name "*.pyc" -delete
|
||||
find . -name "*.class" -delete
|
||||
|
||||
|
11
README.md
11
README.md
@@ -358,6 +358,17 @@ which means you can modify it, redistribute it or use it however you like.
|
||||
-n, --netrc Use .netrc authentication data
|
||||
--video-password PASSWORD Video password (vimeo, smotri, youku)
|
||||
|
||||
## Adobe Pass Options:
|
||||
--ap-mso MSO Adobe Pass multiple-system operator (TV
|
||||
provider) identifier, use --ap-list-mso for
|
||||
a list of available MSOs
|
||||
--ap-username USERNAME Multiple-system operator account login
|
||||
--ap-password PASSWORD Multiple-system operator account password.
|
||||
If this option is left out, youtube-dl will
|
||||
ask interactively.
|
||||
--ap-list-mso List all supported multiple-system
|
||||
operators
|
||||
|
||||
## Post-processing Options:
|
||||
-x, --extract-audio Convert video files to audio-only files
|
||||
(requires ffmpeg or avconv and ffprobe or
|
||||
|
@@ -60,6 +60,9 @@ if ! type pandoc >/dev/null 2>/dev/null; then echo 'ERROR: pandoc is missing'; e
|
||||
if ! python3 -c 'import rsa' 2>/dev/null; then echo 'ERROR: python3-rsa is missing'; exit 1; fi
|
||||
if ! python3 -c 'import wheel' 2>/dev/null; then echo 'ERROR: wheel is missing'; exit 1; fi
|
||||
|
||||
read -p "Is ChangeLog up to date? (y/n) " -n 1
|
||||
if [[ ! $REPLY =~ ^[Yy]$ ]]; then exit 1; fi
|
||||
|
||||
/bin/echo -e "\n### First of all, testing..."
|
||||
make clean
|
||||
if $skip_tests ; then
|
||||
|
@@ -89,6 +89,7 @@
|
||||
- **BeatportPro**
|
||||
- **Beeg**
|
||||
- **BehindKink**
|
||||
- **BellMedia**
|
||||
- **Bet**
|
||||
- **Bigflix**
|
||||
- **Bild**: Bild.de
|
||||
@@ -169,7 +170,6 @@
|
||||
- **CSNNE**
|
||||
- **CSpan**: C-SPAN
|
||||
- **CtsNews**: 華視新聞
|
||||
- **CTV**
|
||||
- **CTVNews**
|
||||
- **culturebox.francetvinfo.fr**
|
||||
- **CultureUnplugged**
|
||||
@@ -388,6 +388,8 @@
|
||||
- **mailru**: Видео@Mail.Ru
|
||||
- **MakersChannel**
|
||||
- **MakerTV**
|
||||
- **mangomolo:live**
|
||||
- **mangomolo:video**
|
||||
- **MatchTV**
|
||||
- **MDR**: MDR.DE and KiKA
|
||||
- **media.ccc.de**
|
||||
@@ -445,6 +447,7 @@
|
||||
- **NBA**
|
||||
- **NBC**
|
||||
- **NBCNews**
|
||||
- **NBCOlympics**
|
||||
- **NBCSports**
|
||||
- **NBCSportsVPlayer**
|
||||
- **ndr**: NDR.de - Norddeutscher Rundfunk
|
||||
@@ -848,6 +851,7 @@
|
||||
- **VRT**
|
||||
- **vube**: Vube.com
|
||||
- **VuClip**
|
||||
- **VyboryMos**
|
||||
- **Walla**
|
||||
- **washingtonpost**
|
||||
- **washingtonpost:article**
|
||||
|
@@ -40,6 +40,7 @@ from youtube_dl.utils import (
|
||||
js_to_json,
|
||||
limit_length,
|
||||
mimetype2ext,
|
||||
month_by_name,
|
||||
ohdave_rsa_encrypt,
|
||||
OnDemandPagedList,
|
||||
orderedSet,
|
||||
@@ -634,6 +635,14 @@ class TestUtil(unittest.TestCase):
|
||||
self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt')
|
||||
self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html')
|
||||
|
||||
def test_month_by_name(self):
|
||||
self.assertEqual(month_by_name(None), None)
|
||||
self.assertEqual(month_by_name('December', 'en'), 12)
|
||||
self.assertEqual(month_by_name('décembre', 'fr'), 12)
|
||||
self.assertEqual(month_by_name('December'), 12)
|
||||
self.assertEqual(month_by_name('décembre'), None)
|
||||
self.assertEqual(month_by_name('Unknown', 'unknown'), None)
|
||||
|
||||
def test_parse_codecs(self):
|
||||
self.assertEqual(parse_codecs(''), {})
|
||||
self.assertEqual(parse_codecs('avc1.77.30, mp4a.40.2'), {
|
||||
|
@@ -131,6 +131,9 @@ class YoutubeDL(object):
|
||||
username: Username for authentication purposes.
|
||||
password: Password for authentication purposes.
|
||||
videopassword: Password for accessing a video.
|
||||
ap_mso: Adobe Pass multiple-system operator identifier.
|
||||
ap_username: Multiple-system operator account username.
|
||||
ap_password: Multiple-system operator account password.
|
||||
usenetrc: Use netrc for authentication instead.
|
||||
verbose: Print additional info to stdout.
|
||||
quiet: Do not print messages to stdout.
|
||||
|
@@ -34,12 +34,14 @@ from .utils import (
|
||||
setproctitle,
|
||||
std_headers,
|
||||
write_string,
|
||||
render_table,
|
||||
)
|
||||
from .update import update_self
|
||||
from .downloader import (
|
||||
FileDownloader,
|
||||
)
|
||||
from .extractor import gen_extractors, list_extractors
|
||||
from .extractor.adobepass import MSO_INFO
|
||||
from .YoutubeDL import YoutubeDL
|
||||
|
||||
|
||||
@@ -118,18 +120,26 @@ def _real_main(argv=None):
|
||||
desc += ' (Example: "%s%s:%s" )' % (ie.SEARCH_KEY, random.choice(_COUNTS), random.choice(_SEARCHES))
|
||||
write_string(desc + '\n', out=sys.stdout)
|
||||
sys.exit(0)
|
||||
if opts.ap_list_mso:
|
||||
table = [[mso_id, mso_info['name']] for mso_id, mso_info in MSO_INFO.items()]
|
||||
write_string('Supported TV Providers:\n' + render_table(['mso', 'mso name'], table) + '\n', out=sys.stdout)
|
||||
sys.exit(0)
|
||||
|
||||
# Conflicting, missing and erroneous options
|
||||
if opts.usenetrc and (opts.username is not None or opts.password is not None):
|
||||
parser.error('using .netrc conflicts with giving username/password')
|
||||
if opts.password is not None and opts.username is None:
|
||||
parser.error('account username missing\n')
|
||||
if opts.ap_password is not None and opts.ap_username is None:
|
||||
parser.error('TV Provider account username missing\n')
|
||||
if opts.outtmpl is not None and (opts.usetitle or opts.autonumber or opts.useid):
|
||||
parser.error('using output template conflicts with using title, video ID or auto number')
|
||||
if opts.usetitle and opts.useid:
|
||||
parser.error('using title conflicts with using video ID')
|
||||
if opts.username is not None and opts.password is None:
|
||||
opts.password = compat_getpass('Type account password and press [Return]: ')
|
||||
if opts.ap_username is not None and opts.ap_password is None:
|
||||
opts.ap_password = compat_getpass('Type TV provider account password and press [Return]: ')
|
||||
if opts.ratelimit is not None:
|
||||
numeric_limit = FileDownloader.parse_bytes(opts.ratelimit)
|
||||
if numeric_limit is None:
|
||||
@@ -155,6 +165,8 @@ def _real_main(argv=None):
|
||||
parser.error('max sleep interval must be greater than or equal to min sleep interval')
|
||||
else:
|
||||
opts.max_sleep_interval = opts.sleep_interval
|
||||
if opts.ap_mso and opts.ap_mso not in MSO_INFO:
|
||||
parser.error('Unsupported TV Provider, use --ap-list-mso to get a list of supported TV Providers')
|
||||
|
||||
def parse_retries(retries):
|
||||
if retries in ('inf', 'infinite'):
|
||||
@@ -293,6 +305,9 @@ def _real_main(argv=None):
|
||||
'password': opts.password,
|
||||
'twofactor': opts.twofactor,
|
||||
'videopassword': opts.videopassword,
|
||||
'ap_mso': opts.ap_mso,
|
||||
'ap_username': opts.ap_username,
|
||||
'ap_password': opts.ap_password,
|
||||
'quiet': (opts.quiet or any_getting or any_printing),
|
||||
'no_warnings': opts.no_warnings,
|
||||
'forceurl': opts.geturl,
|
||||
|
@@ -1,7 +1,6 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import os
|
||||
import re
|
||||
|
||||
from .fragment import FragmentFD
|
||||
from ..compat import compat_urllib_error
|
||||
@@ -19,34 +18,32 @@ class DashSegmentsFD(FragmentFD):
|
||||
FD_NAME = 'dashsegments'
|
||||
|
||||
def real_download(self, filename, info_dict):
|
||||
base_url = info_dict['url']
|
||||
segment_urls = [info_dict['segment_urls'][0]] if self.params.get('test', False) else info_dict['segment_urls']
|
||||
initialization_url = info_dict.get('initialization_url')
|
||||
segments = info_dict['fragments'][:1] if self.params.get(
|
||||
'test', False) else info_dict['fragments']
|
||||
|
||||
ctx = {
|
||||
'filename': filename,
|
||||
'total_frags': len(segment_urls) + (1 if initialization_url else 0),
|
||||
'total_frags': len(segments),
|
||||
}
|
||||
|
||||
self._prepare_and_start_frag_download(ctx)
|
||||
|
||||
def combine_url(base_url, target_url):
|
||||
if re.match(r'^https?://', target_url):
|
||||
return target_url
|
||||
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
|
||||
|
||||
segments_filenames = []
|
||||
|
||||
fragment_retries = self.params.get('fragment_retries', 0)
|
||||
skip_unavailable_fragments = self.params.get('skip_unavailable_fragments', True)
|
||||
|
||||
def process_segment(segment, tmp_filename, fatal):
|
||||
target_url, segment_name = segment
|
||||
def process_segment(segment, tmp_filename, num):
|
||||
segment_url = segment['url']
|
||||
segment_name = 'Frag%d' % num
|
||||
target_filename = '%s-%s' % (tmp_filename, segment_name)
|
||||
# In DASH, the first segment contains necessary headers to
|
||||
# generate a valid MP4 file, so always abort for the first segment
|
||||
fatal = num == 0 or not skip_unavailable_fragments
|
||||
count = 0
|
||||
while count <= fragment_retries:
|
||||
try:
|
||||
success = ctx['dl'].download(target_filename, {'url': combine_url(base_url, target_url)})
|
||||
success = ctx['dl'].download(target_filename, {'url': segment_url})
|
||||
if not success:
|
||||
return False
|
||||
down, target_sanitized = sanitize_open(target_filename, 'rb')
|
||||
@@ -72,16 +69,8 @@ class DashSegmentsFD(FragmentFD):
|
||||
return False
|
||||
return True
|
||||
|
||||
segments_to_download = [(initialization_url, 'Init')] if initialization_url else []
|
||||
segments_to_download.extend([
|
||||
(segment_url, 'Seg%d' % i)
|
||||
for i, segment_url in enumerate(segment_urls)])
|
||||
|
||||
for i, segment in enumerate(segments_to_download):
|
||||
# In DASH, the first segment contains necessary headers to
|
||||
# generate a valid MP4 file, so always abort for the first segment
|
||||
fatal = i == 0 or not skip_unavailable_fragments
|
||||
if not process_segment(segment, ctx['tmpfilename'], fatal):
|
||||
for i, segment in enumerate(segments):
|
||||
if not process_segment(segment, ctx['tmpfilename'], i):
|
||||
return False
|
||||
|
||||
self._finish_frag_download(ctx)
|
||||
|
@@ -13,7 +13,7 @@ from ..utils import (
|
||||
|
||||
class ABCIE(InfoExtractor):
|
||||
IE_NAME = 'abc.net.au'
|
||||
_VALID_URL = r'https?://www\.abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
|
||||
|
File diff suppressed because it is too large
Load Diff
@@ -4,7 +4,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class AlJazeeraIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
|
||||
|
@@ -50,25 +50,6 @@ class AWAANBaseIE(InfoExtractor):
|
||||
'is_live': is_live,
|
||||
}
|
||||
|
||||
def _extract_video_formats(self, webpage, video_id, m3u8_entry_protocol):
|
||||
formats = []
|
||||
format_url_base = 'http' + self._html_search_regex(
|
||||
[
|
||||
r'file\s*:\s*"https?(://[^"]+)/playlist.m3u8',
|
||||
r'<a[^>]+href="rtsp(://[^"]+)"'
|
||||
], webpage, 'format url')
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
format_url_base + '/manifest.mpd',
|
||||
video_id, mpd_id='dash', fatal=False))
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
format_url_base + '/playlist.m3u8', video_id, 'mp4',
|
||||
m3u8_entry_protocol, m3u8_id='hls', fatal=False))
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
format_url_base + '/manifest.f4m',
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
return formats
|
||||
|
||||
|
||||
class AWAANVideoIE(AWAANBaseIE):
|
||||
IE_NAME = 'awaan:video'
|
||||
@@ -99,16 +80,18 @@ class AWAANVideoIE(AWAANBaseIE):
|
||||
video_id, headers={'Origin': 'http://awaan.ae'})
|
||||
info = self._parse_video_data(video_data, video_id, False)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
'http://admin.mangomolo.com/analytics/index.php/customers/embed/video?' +
|
||||
compat_urllib_parse_urlencode({
|
||||
'id': video_data['id'],
|
||||
'user_id': video_data['user_id'],
|
||||
'signature': video_data['signature'],
|
||||
'countries': 'Q0M=',
|
||||
'filter': 'DENY',
|
||||
}), video_id)
|
||||
info['formats'] = self._extract_video_formats(webpage, video_id, 'm3u8_native')
|
||||
embed_url = 'http://admin.mangomolo.com/analytics/index.php/customers/embed/video?' + compat_urllib_parse_urlencode({
|
||||
'id': video_data['id'],
|
||||
'user_id': video_data['user_id'],
|
||||
'signature': video_data['signature'],
|
||||
'countries': 'Q0M=',
|
||||
'filter': 'DENY',
|
||||
})
|
||||
info.update({
|
||||
'_type': 'url_transparent',
|
||||
'url': embed_url,
|
||||
'ie_key': 'MangomoloVideo',
|
||||
})
|
||||
return info
|
||||
|
||||
|
||||
@@ -138,16 +121,18 @@ class AWAANLiveIE(AWAANBaseIE):
|
||||
channel_id, headers={'Origin': 'http://awaan.ae'})
|
||||
info = self._parse_video_data(channel_data, channel_id, True)
|
||||
|
||||
webpage = self._download_webpage(
|
||||
'http://admin.mangomolo.com/analytics/index.php/customers/embed/index?' +
|
||||
compat_urllib_parse_urlencode({
|
||||
'id': base64.b64encode(channel_data['user_id'].encode()).decode(),
|
||||
'channelid': base64.b64encode(channel_data['id'].encode()).decode(),
|
||||
'signature': channel_data['signature'],
|
||||
'countries': 'Q0M=',
|
||||
'filter': 'DENY',
|
||||
}), channel_id)
|
||||
info['formats'] = self._extract_video_formats(webpage, channel_id, 'm3u8')
|
||||
embed_url = 'http://admin.mangomolo.com/analytics/index.php/customers/embed/index?' + compat_urllib_parse_urlencode({
|
||||
'id': base64.b64encode(channel_data['user_id'].encode()).decode(),
|
||||
'channelid': base64.b64encode(channel_data['id'].encode()).decode(),
|
||||
'signature': channel_data['signature'],
|
||||
'countries': 'Q0M=',
|
||||
'filter': 'DENY',
|
||||
})
|
||||
info.update({
|
||||
'_type': 'url_transparent',
|
||||
'url': embed_url,
|
||||
'ie_key': 'MangomoloLive',
|
||||
})
|
||||
return info
|
||||
|
||||
|
||||
|
@@ -103,7 +103,7 @@ class AzubuIE(InfoExtractor):
|
||||
|
||||
|
||||
class AzubuLiveIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.azubu.tv/(?P<id>[^/]+)$'
|
||||
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/(?P<id>[^/]+)$'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.azubu.tv/MarsTVMDLen',
|
||||
|
@@ -1028,7 +1028,7 @@ class BBCIE(BBCCoUkIE):
|
||||
|
||||
|
||||
class BBCCoUkArticleIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
|
||||
IE_NAME = 'bbc.co.uk:article'
|
||||
IE_DESC = 'BBC articles'
|
||||
|
||||
|
@@ -6,8 +6,25 @@ import re
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
class CTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?(?P<domain>ctv|tsn|bnn|thecomedynetwork)\.ca/.*?(?:\bvid=|-vid|~|%7E)(?P<id>[0-9.]+)'
|
||||
class BellMediaIE(InfoExtractor):
|
||||
_VALID_URL = r'''(?x)https?://(?:www\.)?
|
||||
(?P<domain>
|
||||
(?:
|
||||
ctv|
|
||||
tsn|
|
||||
bnn|
|
||||
thecomedynetwork|
|
||||
discovery|
|
||||
discoveryvelocity|
|
||||
sciencechannel|
|
||||
investigationdiscovery|
|
||||
animalplanet|
|
||||
bravo|
|
||||
mtv|
|
||||
space
|
||||
)\.ca|
|
||||
much\.com
|
||||
)/.*?(?:\bvid=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6})'''
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ctv.ca/video/player?vid=706966',
|
||||
'md5': 'ff2ebbeae0aa2dcc32a830c3fd69b7b0',
|
||||
@@ -32,15 +49,27 @@ class CTVIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://www.ctv.ca/YourMorning/Video/S1E6-Monday-August-29-2016-vid938009',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.much.com/shows/atmidnight/episode948007/tuesday-september-13-2016',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'http://www.much.com/shows/the-almost-impossible-gameshow/928979/episode-6',
|
||||
'only_matching': True,
|
||||
}]
|
||||
_DOMAINS = {
|
||||
'thecomedynetwork': 'comedy',
|
||||
'discoveryvelocity': 'discvel',
|
||||
'sciencechannel': 'discsci',
|
||||
'investigationdiscovery': 'invdisc',
|
||||
'animalplanet': 'aniplan',
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
domain, video_id = re.match(self._VALID_URL, url).groups()
|
||||
if domain == 'thecomedynetwork':
|
||||
domain = 'comedy'
|
||||
domain = domain.split('.')[0]
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'id': video_id,
|
||||
'url': '9c9media:%s_web:%s' % (domain, video_id),
|
||||
'url': '9c9media:%s_web:%s' % (self._DOMAINS.get(domain, domain), video_id),
|
||||
'ie_key': 'NineCNineMedia',
|
||||
}
|
@@ -17,7 +17,7 @@ from ..utils import (
|
||||
class BiliBiliIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/v/)(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
_TEST = {
|
||||
'url': 'http://www.bilibili.tv/video/av1074402/',
|
||||
'md5': '9fa226fe2b8a9a4d5a69b4c6a183417e',
|
||||
'info_dict': {
|
||||
@@ -32,64 +32,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
'uploader': '菊子桑',
|
||||
'uploader_id': '156160',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.bilibili.com/video/av1041170/',
|
||||
'info_dict': {
|
||||
'id': '1041170',
|
||||
'ext': 'mp4',
|
||||
'title': '【BD1080P】刀语【诸神&异域】',
|
||||
'description': '这是个神奇的故事~每个人不留弹幕不给走哦~切利哦!~',
|
||||
'duration': 3382.259,
|
||||
'timestamp': 1396530060,
|
||||
'upload_date': '20140403',
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
'uploader': '枫叶逝去',
|
||||
'uploader_id': '520116',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.bilibili.com/video/av4808130/',
|
||||
'info_dict': {
|
||||
'id': '4808130',
|
||||
'ext': 'mp4',
|
||||
'title': '【长篇】哆啦A梦443【钉铛】',
|
||||
'description': '(2016.05.27)来组合客人的脸吧&amp;寻母六千里锭 抱歉,又轮到周日上班现在才到家 封面www.pixiv.net/member_illust.php?mode=medium&amp;illust_id=56912929',
|
||||
'duration': 1493.995,
|
||||
'timestamp': 1464564180,
|
||||
'upload_date': '20160529',
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
'uploader': '喜欢拉面',
|
||||
'uploader_id': '151066',
|
||||
},
|
||||
}, {
|
||||
# Missing upload time
|
||||
'url': 'http://www.bilibili.com/video/av1867637/',
|
||||
'info_dict': {
|
||||
'id': '1867637',
|
||||
'ext': 'mp4',
|
||||
'title': '【HDTV】【喜剧】岳父岳母真难当 (2014)【法国票房冠军】',
|
||||
'description': '一个信奉天主教的法国旧式传统资产阶级家庭中有四个女儿。三个女儿却分别找了阿拉伯、犹太、中国丈夫,老夫老妻唯独期盼剩下未嫁的小女儿能找一个信奉天主教的法国白人,结果没想到小女儿找了一位非裔黑人……【这次应该不会跳帧了】',
|
||||
'duration': 5760.0,
|
||||
'uploader': '黑夜为猫',
|
||||
'uploader_id': '610729',
|
||||
'thumbnail': 're:^https?://.+\.jpg',
|
||||
},
|
||||
'params': {
|
||||
# Just to test metadata extraction
|
||||
'skip_download': True,
|
||||
},
|
||||
'expected_warnings': ['upload time'],
|
||||
}, {
|
||||
'url': 'http://bangumi.bilibili.com/anime/v/40068',
|
||||
'md5': '08d539a0884f3deb7b698fb13ba69696',
|
||||
'info_dict': {
|
||||
'id': '40068',
|
||||
'ext': 'mp4',
|
||||
'duration': 1402.357,
|
||||
'title': '混沌武士 : 第7集 四面楚歌 A Risky Racket',
|
||||
'description': 'md5:6a9622b911565794c11f25f81d6a97d2',
|
||||
'thumbnail': 're:^http?://.+\.jpg',
|
||||
},
|
||||
}]
|
||||
}
|
||||
|
||||
_APP_KEY = '6f90a59ac58a4123'
|
||||
_BILIBILI_KEY = '0bfd84cc3940035173f35e6777508326'
|
||||
@@ -124,7 +67,7 @@ class BiliBiliIE(InfoExtractor):
|
||||
'url': durl['url'],
|
||||
'filesize': int_or_none(durl['size']),
|
||||
}]
|
||||
for backup_url in durl['backup_url']:
|
||||
for backup_url in durl.get('backup_url', []):
|
||||
formats.append({
|
||||
'url': backup_url,
|
||||
# backup URLs have lower priorities
|
||||
|
@@ -12,7 +12,7 @@ from ..utils import (
|
||||
|
||||
class BpbIE(InfoExtractor):
|
||||
IE_DESC = 'Bundeszentrale für politische Bildung'
|
||||
_VALID_URL = r'https?://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
|
||||
_VALID_URL = r'https?://(?:www\.)?bpb\.de/mediathek/(?P<id>[0-9]+)/'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.bpb.de/mediathek/297/joachim-gauck-zu-1989-und-die-erinnerung-an-die-ddr',
|
||||
|
@@ -112,7 +112,7 @@ class CamdemyIE(InfoExtractor):
|
||||
|
||||
|
||||
class CamdemyFolderIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.camdemy.com/folder/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?camdemy\.com/folder/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
# links with trailing slash
|
||||
'url': 'http://www.camdemy.com/folder/450',
|
||||
|
@@ -71,7 +71,7 @@ class CanvasIE(InfoExtractor):
|
||||
webpage)).strip()
|
||||
|
||||
video_id = self._html_search_regex(
|
||||
r'data-video=(["\'])(?P<id>.+?)\1', webpage, 'video id', group='id')
|
||||
r'data-video=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video id', group='id')
|
||||
|
||||
data = self._download_json(
|
||||
'https://mediazone.vrt.be/api/v1/%s/assets/%s'
|
||||
|
@@ -33,4 +33,10 @@ class CartoonNetworkIE(TurnerBaseIE):
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big',
|
||||
'tokenizer_src': 'http://www.cartoonnetwork.com/cntv/mvpd/processors/services/token_ipadAdobe.do',
|
||||
},
|
||||
}, {
|
||||
'url': url,
|
||||
'site_name': 'CartoonNetwork',
|
||||
'auth_required': self._search_regex(
|
||||
r'_cnglobal\.cvpFullOrPreviewAuth\s*=\s*(true|false);',
|
||||
webpage, 'auth required', default='false') == 'true',
|
||||
})
|
||||
|
@@ -4,7 +4,9 @@ from .theplatform import ThePlatformFeedIE
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
find_xpath_attr,
|
||||
ExtractorError,
|
||||
xpath_element,
|
||||
xpath_text,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
@@ -47,27 +49,49 @@ class CBSIE(CBSBaseIE):
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _extract_video_info(self, guid):
|
||||
path = 'dJ5BDC/media/guid/2198311517/' + guid
|
||||
smil_url = 'http://link.theplatform.com/s/%s?mbr=true' % path
|
||||
formats, subtitles = self._extract_theplatform_smil(smil_url + '&manifest=m3u', guid)
|
||||
for r in ('OnceURL&formats=M3U', 'HLS&formats=M3U', 'RTMP', 'WIFI', '3G'):
|
||||
try:
|
||||
tp_formats, _ = self._extract_theplatform_smil(smil_url + '&assetTypes=' + r, guid, 'Downloading %s SMIL data' % r.split('&')[0])
|
||||
formats.extend(tp_formats)
|
||||
except ExtractorError:
|
||||
def _extract_video_info(self, content_id):
|
||||
items_data = self._download_xml(
|
||||
'http://can.cbs.com/thunder/player/videoPlayerService.php',
|
||||
content_id, query={'partner': 'cbs', 'contentId': content_id})
|
||||
video_data = xpath_element(items_data, './/item')
|
||||
title = xpath_text(video_data, 'videoTitle', 'title', True)
|
||||
tp_path = 'dJ5BDC/media/guid/2198311517/%s' % content_id
|
||||
tp_release_url = 'http://link.theplatform.com/s/' + tp_path
|
||||
|
||||
asset_types = []
|
||||
subtitles = {}
|
||||
formats = []
|
||||
for item in items_data.findall('.//item'):
|
||||
asset_type = xpath_text(item, 'assetType')
|
||||
if not asset_type or asset_type in asset_types:
|
||||
continue
|
||||
asset_types.append(asset_type)
|
||||
query = {
|
||||
'mbr': 'true',
|
||||
'assetTypes': asset_type,
|
||||
}
|
||||
if asset_type.startswith('HLS') or asset_type in ('OnceURL', 'StreamPack'):
|
||||
query['formats'] = 'MPEG4,M3U'
|
||||
elif asset_type in ('RTMP', 'WIFI', '3G'):
|
||||
query['formats'] = 'MPEG4,FLV'
|
||||
tp_formats, tp_subtitles = self._extract_theplatform_smil(
|
||||
update_url_query(tp_release_url, query), content_id,
|
||||
'Downloading %s SMIL data' % asset_type)
|
||||
formats.extend(tp_formats)
|
||||
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
|
||||
self._sort_formats(formats)
|
||||
metadata = self._download_theplatform_metadata(path, guid)
|
||||
info = self._parse_theplatform_metadata(metadata)
|
||||
|
||||
info = self._extract_theplatform_metadata(tp_path, content_id)
|
||||
info.update({
|
||||
'id': guid,
|
||||
'id': content_id,
|
||||
'title': title,
|
||||
'series': xpath_text(video_data, 'seriesTitle'),
|
||||
'season_number': int_or_none(xpath_text(video_data, 'seasonNumber')),
|
||||
'episode_number': int_or_none(xpath_text(video_data, 'episodeNumber')),
|
||||
'duration': int_or_none(xpath_text(video_data, 'videoLength'), 1000),
|
||||
'thumbnail': xpath_text(video_data, 'previewImageURL'),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'series': metadata.get('cbs$SeriesTitle'),
|
||||
'season_number': int_or_none(metadata.get('cbs$SeasonNumber')),
|
||||
'episode': metadata.get('cbs$EpisodeTitle'),
|
||||
'episode_number': int_or_none(metadata.get('cbs$EpisodeNumber')),
|
||||
})
|
||||
return info
|
||||
|
||||
|
@@ -4,7 +4,7 @@ from .cbs import CBSBaseIE
|
||||
|
||||
|
||||
class CBSSportsIE(CBSBaseIE):
|
||||
_VALID_URL = r'https?://www\.cbssports\.com/video/player/[^/]+/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?cbssports\.com/video/player/[^/]+/(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.cbssports.com/video/player/videos/708337219968/0/ben-simmons-the-next-lebron?-not-so-fast',
|
||||
|
@@ -17,7 +17,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class CeskaTelevizeIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.ceskatelevize\.cz/(porady|ivysilani)/(?:[^/]+/)*(?P<id>[^/#?]+)/*(?:[#?].*)?$'
|
||||
_VALID_URL = r'https?://(?:www\.)?ceskatelevize\.cz/(porady|ivysilani)/(?:[^/]+/)*(?P<id>[^/#?]+)/*(?:[#?].*)?$'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220',
|
||||
'info_dict': {
|
||||
|
@@ -65,7 +65,7 @@ class ChirbitIE(InfoExtractor):
|
||||
|
||||
class ChirbitProfileIE(InfoExtractor):
|
||||
IE_NAME = 'chirbit:profile'
|
||||
_VALID_URL = r'https?://(?:www\.)?chirbit.com/(?:rss/)?(?P<id>[^/]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?chirbit\.com/(?:rss/)?(?P<id>[^/]+)'
|
||||
_TEST = {
|
||||
'url': 'http://chirbit.com/ScarletBeauty',
|
||||
'info_dict': {
|
||||
|
@@ -6,7 +6,7 @@ from ..utils import ExtractorError
|
||||
|
||||
class CMTIE(MTVIE):
|
||||
IE_NAME = 'cmt.com'
|
||||
_VALID_URL = r'https?://www\.cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?cmt\.com/(?:videos|shows)/(?:[^/]+/)*(?P<videoid>\d+)'
|
||||
_FEED_URL = 'http://www.cmt.com/sitewide/apps/player/embed/rss/'
|
||||
|
||||
_TESTS = [{
|
||||
|
@@ -87,6 +87,9 @@ class InfoExtractor(object):
|
||||
|
||||
Potential fields:
|
||||
* url Mandatory. The URL of the video file
|
||||
* manifest_url
|
||||
The URL of the manifest file in case of
|
||||
fragmented media (DASH, hls, hds)
|
||||
* ext Will be calculated from URL if missing
|
||||
* format A human-readable description of the format
|
||||
("mp4 container with h264/opus").
|
||||
@@ -115,6 +118,11 @@ class InfoExtractor(object):
|
||||
download, lower-case.
|
||||
"http", "https", "rtsp", "rtmp", "rtmpe",
|
||||
"m3u8", "m3u8_native" or "http_dash_segments".
|
||||
* fragments A list of fragments of the fragmented media,
|
||||
with the following entries:
|
||||
* "url" (mandatory) - fragment's URL
|
||||
* "duration" (optional, int or float)
|
||||
* "filesize" (optional, int)
|
||||
* preference Order number of this format. If this field is
|
||||
present and not None, the formats get sorted
|
||||
by this field, regardless of all other values.
|
||||
@@ -674,33 +682,36 @@ class InfoExtractor(object):
|
||||
username = info[0]
|
||||
password = info[2]
|
||||
else:
|
||||
raise netrc.NetrcParseError('No authenticators for %s' % netrc_machine)
|
||||
raise netrc.NetrcParseError(
|
||||
'No authenticators for %s' % netrc_machine)
|
||||
except (IOError, netrc.NetrcParseError) as err:
|
||||
self._downloader.report_warning('parsing .netrc: %s' % error_to_compat_str(err))
|
||||
self._downloader.report_warning(
|
||||
'parsing .netrc: %s' % error_to_compat_str(err))
|
||||
|
||||
return (username, password)
|
||||
return username, password
|
||||
|
||||
def _get_login_info(self):
|
||||
def _get_login_info(self, username_option='username', password_option='password', netrc_machine=None):
|
||||
"""
|
||||
Get the login info as (username, password)
|
||||
It will look in the netrc file using the _NETRC_MACHINE value
|
||||
First look for the manually specified credentials using username_option
|
||||
and password_option as keys in params dictionary. If no such credentials
|
||||
available look in the netrc file using the netrc_machine or _NETRC_MACHINE
|
||||
value.
|
||||
If there's no info available, return (None, None)
|
||||
"""
|
||||
if self._downloader is None:
|
||||
return (None, None)
|
||||
|
||||
username = None
|
||||
password = None
|
||||
downloader_params = self._downloader.params
|
||||
|
||||
# Attempt to use provided username and password or .netrc data
|
||||
if downloader_params.get('username') is not None:
|
||||
username = downloader_params['username']
|
||||
password = downloader_params['password']
|
||||
if downloader_params.get(username_option) is not None:
|
||||
username = downloader_params[username_option]
|
||||
password = downloader_params[password_option]
|
||||
else:
|
||||
username, password = self._get_netrc_login_info()
|
||||
username, password = self._get_netrc_login_info(netrc_machine)
|
||||
|
||||
return (username, password)
|
||||
return username, password
|
||||
|
||||
def _get_tfa_info(self, note='two-factor verification code'):
|
||||
"""
|
||||
@@ -888,16 +899,16 @@ class InfoExtractor(object):
|
||||
def _hidden_inputs(html):
|
||||
html = re.sub(r'<!--(?:(?!<!--).)*-->', '', html)
|
||||
hidden_inputs = {}
|
||||
for input in re.findall(r'(?i)<input([^>]+)>', html):
|
||||
if not re.search(r'type=(["\'])(?:hidden|submit)\1', input):
|
||||
for input in re.findall(r'(?i)(<input[^>]+>)', html):
|
||||
attrs = extract_attributes(input)
|
||||
if not input:
|
||||
continue
|
||||
name = re.search(r'(?:name|id)=(["\'])(?P<value>.+?)\1', input)
|
||||
if not name:
|
||||
if attrs.get('type') not in ('hidden', 'submit'):
|
||||
continue
|
||||
value = re.search(r'value=(["\'])(?P<value>.*?)\1', input)
|
||||
if not value:
|
||||
continue
|
||||
hidden_inputs[name.group('value')] = value.group('value')
|
||||
name = attrs.get('name') or attrs.get('id')
|
||||
value = attrs.get('value')
|
||||
if name and value is not None:
|
||||
hidden_inputs[name] = value
|
||||
return hidden_inputs
|
||||
|
||||
def _form_hidden_inputs(self, form_id, html):
|
||||
@@ -1139,6 +1150,7 @@ class InfoExtractor(object):
|
||||
formats.append({
|
||||
'format_id': format_id,
|
||||
'url': manifest_url,
|
||||
'manifest_url': manifest_url,
|
||||
'ext': 'flv' if bootstrap_info is not None else None,
|
||||
'tbr': tbr,
|
||||
'width': width,
|
||||
@@ -1244,9 +1256,11 @@ class InfoExtractor(object):
|
||||
# format_id intact.
|
||||
if not live:
|
||||
format_id.append(stream_name if stream_name else '%d' % (tbr if tbr else len(formats)))
|
||||
manifest_url = format_url(line.strip())
|
||||
f = {
|
||||
'format_id': '-'.join(format_id),
|
||||
'url': format_url(line.strip()),
|
||||
'url': manifest_url,
|
||||
'manifest_url': manifest_url,
|
||||
'tbr': tbr,
|
||||
'ext': ext,
|
||||
'fps': float_or_none(last_info.get('FRAME-RATE')),
|
||||
@@ -1518,9 +1532,10 @@ class InfoExtractor(object):
|
||||
mpd_base_url = re.match(r'https?://.+/', urlh.geturl()).group()
|
||||
|
||||
return self._parse_mpd_formats(
|
||||
compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url, formats_dict=formats_dict)
|
||||
compat_etree_fromstring(mpd.encode('utf-8')), mpd_id, mpd_base_url,
|
||||
formats_dict=formats_dict, mpd_url=mpd_url)
|
||||
|
||||
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}):
|
||||
def _parse_mpd_formats(self, mpd_doc, mpd_id=None, mpd_base_url='', formats_dict={}, mpd_url=None):
|
||||
"""
|
||||
Parse formats from MPD manifest.
|
||||
References:
|
||||
@@ -1541,42 +1556,52 @@ class InfoExtractor(object):
|
||||
|
||||
def extract_multisegment_info(element, ms_parent_info):
|
||||
ms_info = ms_parent_info.copy()
|
||||
|
||||
# As per [1, 5.3.9.2.2] SegmentList and SegmentTemplate share some
|
||||
# common attributes and elements. We will only extract relevant
|
||||
# for us.
|
||||
def extract_common(source):
|
||||
segment_timeline = source.find(_add_ns('SegmentTimeline'))
|
||||
if segment_timeline is not None:
|
||||
s_e = segment_timeline.findall(_add_ns('S'))
|
||||
if s_e:
|
||||
ms_info['total_number'] = 0
|
||||
ms_info['s'] = []
|
||||
for s in s_e:
|
||||
r = int(s.get('r', 0))
|
||||
ms_info['total_number'] += 1 + r
|
||||
ms_info['s'].append({
|
||||
't': int(s.get('t', 0)),
|
||||
# @d is mandatory (see [1, 5.3.9.6.2, Table 17, page 60])
|
||||
'd': int(s.attrib['d']),
|
||||
'r': r,
|
||||
})
|
||||
start_number = source.get('startNumber')
|
||||
if start_number:
|
||||
ms_info['start_number'] = int(start_number)
|
||||
timescale = source.get('timescale')
|
||||
if timescale:
|
||||
ms_info['timescale'] = int(timescale)
|
||||
segment_duration = source.get('duration')
|
||||
if segment_duration:
|
||||
ms_info['segment_duration'] = int(segment_duration)
|
||||
|
||||
def extract_Initialization(source):
|
||||
initialization = source.find(_add_ns('Initialization'))
|
||||
if initialization is not None:
|
||||
ms_info['initialization_url'] = initialization.attrib['sourceURL']
|
||||
|
||||
segment_list = element.find(_add_ns('SegmentList'))
|
||||
if segment_list is not None:
|
||||
extract_common(segment_list)
|
||||
extract_Initialization(segment_list)
|
||||
segment_urls_e = segment_list.findall(_add_ns('SegmentURL'))
|
||||
if segment_urls_e:
|
||||
ms_info['segment_urls'] = [segment.attrib['media'] for segment in segment_urls_e]
|
||||
initialization = segment_list.find(_add_ns('Initialization'))
|
||||
if initialization is not None:
|
||||
ms_info['initialization_url'] = initialization.attrib['sourceURL']
|
||||
else:
|
||||
segment_template = element.find(_add_ns('SegmentTemplate'))
|
||||
if segment_template is not None:
|
||||
start_number = segment_template.get('startNumber')
|
||||
if start_number:
|
||||
ms_info['start_number'] = int(start_number)
|
||||
segment_timeline = segment_template.find(_add_ns('SegmentTimeline'))
|
||||
if segment_timeline is not None:
|
||||
s_e = segment_timeline.findall(_add_ns('S'))
|
||||
if s_e:
|
||||
ms_info['total_number'] = 0
|
||||
ms_info['s'] = []
|
||||
for s in s_e:
|
||||
r = int(s.get('r', 0))
|
||||
ms_info['total_number'] += 1 + r
|
||||
ms_info['s'].append({
|
||||
't': int(s.get('t', 0)),
|
||||
# @d is mandatory (see [1, 5.3.9.6.2, Table 17, page 60])
|
||||
'd': int(s.attrib['d']),
|
||||
'r': r,
|
||||
})
|
||||
else:
|
||||
timescale = segment_template.get('timescale')
|
||||
if timescale:
|
||||
ms_info['timescale'] = int(timescale)
|
||||
segment_duration = segment_template.get('duration')
|
||||
if segment_duration:
|
||||
ms_info['segment_duration'] = int(segment_duration)
|
||||
extract_common(segment_template)
|
||||
media_template = segment_template.get('media')
|
||||
if media_template:
|
||||
ms_info['media_template'] = media_template
|
||||
@@ -1584,11 +1609,14 @@ class InfoExtractor(object):
|
||||
if initialization:
|
||||
ms_info['initialization_url'] = initialization
|
||||
else:
|
||||
initialization = segment_template.find(_add_ns('Initialization'))
|
||||
if initialization is not None:
|
||||
ms_info['initialization_url'] = initialization.attrib['sourceURL']
|
||||
extract_Initialization(segment_template)
|
||||
return ms_info
|
||||
|
||||
def combine_url(base_url, target_url):
|
||||
if re.match(r'^https?://', target_url):
|
||||
return target_url
|
||||
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
|
||||
|
||||
mpd_duration = parse_duration(mpd_doc.get('mediaPresentationDuration'))
|
||||
formats = []
|
||||
for period in mpd_doc.findall(_add_ns('Period')):
|
||||
@@ -1631,6 +1659,7 @@ class InfoExtractor(object):
|
||||
f = {
|
||||
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
|
||||
'url': base_url,
|
||||
'manifest_url': mpd_url,
|
||||
'ext': mimetype2ext(mime_type),
|
||||
'width': int_or_none(representation_attrib.get('width')),
|
||||
'height': int_or_none(representation_attrib.get('height')),
|
||||
@@ -1645,9 +1674,7 @@ class InfoExtractor(object):
|
||||
}
|
||||
representation_ms_info = extract_multisegment_info(representation, adaption_set_ms_info)
|
||||
if 'segment_urls' not in representation_ms_info and 'media_template' in representation_ms_info:
|
||||
if 'total_number' not in representation_ms_info and 'segment_duration':
|
||||
segment_duration = float(representation_ms_info['segment_duration']) / float(representation_ms_info['timescale'])
|
||||
representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration))
|
||||
|
||||
media_template = representation_ms_info['media_template']
|
||||
media_template = media_template.replace('$RepresentationID$', representation_id)
|
||||
media_template = re.sub(r'\$(Number|Bandwidth|Time)\$', r'%(\1)d', media_template)
|
||||
@@ -1656,46 +1683,79 @@ class InfoExtractor(object):
|
||||
|
||||
# As per [1, 5.3.9.4.4, Table 16, page 55] $Number$ and $Time$
|
||||
# can't be used at the same time
|
||||
if '%(Number' in media_template:
|
||||
representation_ms_info['segment_urls'] = [
|
||||
media_template % {
|
||||
if '%(Number' in media_template and 's' not in representation_ms_info:
|
||||
segment_duration = None
|
||||
if 'total_number' not in representation_ms_info and 'segment_duration':
|
||||
segment_duration = float_or_none(representation_ms_info['segment_duration'], representation_ms_info['timescale'])
|
||||
representation_ms_info['total_number'] = int(math.ceil(float(period_duration) / segment_duration))
|
||||
representation_ms_info['fragments'] = [{
|
||||
'url': media_template % {
|
||||
'Number': segment_number,
|
||||
'Bandwidth': representation_attrib.get('bandwidth'),
|
||||
}
|
||||
for segment_number in range(
|
||||
representation_ms_info['start_number'],
|
||||
representation_ms_info['total_number'] + representation_ms_info['start_number'])]
|
||||
},
|
||||
'duration': segment_duration,
|
||||
} for segment_number in range(
|
||||
representation_ms_info['start_number'],
|
||||
representation_ms_info['total_number'] + representation_ms_info['start_number'])]
|
||||
else:
|
||||
representation_ms_info['segment_urls'] = []
|
||||
# $Number*$ or $Time$ in media template with S list available
|
||||
# Example $Number*$: http://www.svtplay.se/klipp/9023742/stopptid-om-bjorn-borg
|
||||
# Example $Time$: https://play.arkena.com/embed/avp/v2/player/media/b41dda37-d8e7-4d3f-b1b5-9a9db578bdfe/1/129411
|
||||
representation_ms_info['fragments'] = []
|
||||
segment_time = 0
|
||||
segment_d = None
|
||||
segment_number = representation_ms_info['start_number']
|
||||
|
||||
def add_segment_url():
|
||||
representation_ms_info['segment_urls'].append(
|
||||
media_template % {
|
||||
'Time': segment_time,
|
||||
'Bandwidth': representation_attrib.get('bandwidth'),
|
||||
}
|
||||
)
|
||||
segment_url = media_template % {
|
||||
'Time': segment_time,
|
||||
'Bandwidth': representation_attrib.get('bandwidth'),
|
||||
'Number': segment_number,
|
||||
}
|
||||
representation_ms_info['fragments'].append({
|
||||
'url': segment_url,
|
||||
'duration': float_or_none(segment_d, representation_ms_info['timescale']),
|
||||
})
|
||||
|
||||
for num, s in enumerate(representation_ms_info['s']):
|
||||
segment_time = s.get('t') or segment_time
|
||||
segment_d = s['d']
|
||||
add_segment_url()
|
||||
segment_number += 1
|
||||
for r in range(s.get('r', 0)):
|
||||
segment_time += s['d']
|
||||
segment_time += segment_d
|
||||
add_segment_url()
|
||||
segment_time += s['d']
|
||||
if 'segment_urls' in representation_ms_info:
|
||||
segment_number += 1
|
||||
segment_time += segment_d
|
||||
elif 'segment_urls' in representation_ms_info and 's' in representation_ms_info:
|
||||
# No media template
|
||||
# Example: https://www.youtube.com/watch?v=iXZV5uAYMJI
|
||||
# or any YouTube dashsegments video
|
||||
fragments = []
|
||||
s_num = 0
|
||||
for segment_url in representation_ms_info['segment_urls']:
|
||||
s = representation_ms_info['s'][s_num]
|
||||
for r in range(s.get('r', 0) + 1):
|
||||
fragments.append({
|
||||
'url': segment_url,
|
||||
'duration': float_or_none(s['d'], representation_ms_info['timescale']),
|
||||
})
|
||||
representation_ms_info['fragments'] = fragments
|
||||
# NB: MPD manifest may contain direct URLs to unfragmented media.
|
||||
# No fragments key is present in this case.
|
||||
if 'fragments' in representation_ms_info:
|
||||
f.update({
|
||||
'segment_urls': representation_ms_info['segment_urls'],
|
||||
'fragments': [],
|
||||
'protocol': 'http_dash_segments',
|
||||
})
|
||||
if 'initialization_url' in representation_ms_info:
|
||||
initialization_url = representation_ms_info['initialization_url'].replace('$RepresentationID$', representation_id)
|
||||
f.update({
|
||||
'initialization_url': initialization_url,
|
||||
})
|
||||
if not f.get('url'):
|
||||
f['url'] = initialization_url
|
||||
f['fragments'].append({'url': initialization_url})
|
||||
f['fragments'].extend(representation_ms_info['fragments'])
|
||||
for fragment in f['fragments']:
|
||||
fragment['url'] = combine_url(base_url, fragment['url'])
|
||||
try:
|
||||
existing_format = next(
|
||||
fo for fo in formats
|
||||
@@ -1792,6 +1852,49 @@ class InfoExtractor(object):
|
||||
m3u8_id='hls', fatal=False))
|
||||
return formats
|
||||
|
||||
def _extract_wowza_formats(self, url, video_id, m3u8_entry_protocol='m3u8_native', skip_protocols=[]):
|
||||
url = re.sub(r'/(?:manifest|playlist|jwplayer)\.(?:m3u8|f4m|mpd|smil)', '', url)
|
||||
url_base = self._search_regex(r'(?:https?|rtmp|rtsp)(://[^?]+)', url, 'format url')
|
||||
http_base_url = 'http' + url_base
|
||||
formats = []
|
||||
if 'm3u8' not in skip_protocols:
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
http_base_url + '/playlist.m3u8', video_id, 'mp4',
|
||||
m3u8_entry_protocol, m3u8_id='hls', fatal=False))
|
||||
if 'f4m' not in skip_protocols:
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
http_base_url + '/manifest.f4m',
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
if re.search(r'(?:/smil:|\.smil)', url_base):
|
||||
if 'dash' not in skip_protocols:
|
||||
formats.extend(self._extract_mpd_formats(
|
||||
http_base_url + '/manifest.mpd',
|
||||
video_id, mpd_id='dash', fatal=False))
|
||||
if 'smil' not in skip_protocols:
|
||||
rtmp_formats = self._extract_smil_formats(
|
||||
http_base_url + '/jwplayer.smil',
|
||||
video_id, fatal=False)
|
||||
for rtmp_format in rtmp_formats:
|
||||
rtsp_format = rtmp_format.copy()
|
||||
rtsp_format['url'] = '%s/%s' % (rtmp_format['url'], rtmp_format['play_path'])
|
||||
del rtsp_format['play_path']
|
||||
del rtsp_format['ext']
|
||||
rtsp_format.update({
|
||||
'url': rtsp_format['url'].replace('rtmp://', 'rtsp://'),
|
||||
'format_id': rtmp_format['format_id'].replace('rtmp', 'rtsp'),
|
||||
'protocol': 'rtsp',
|
||||
})
|
||||
formats.extend([rtmp_format, rtsp_format])
|
||||
else:
|
||||
for protocol in ('rtmp', 'rtsp'):
|
||||
if protocol not in skip_protocols:
|
||||
formats.append({
|
||||
'url': protocol + url_base,
|
||||
'format_id': protocol,
|
||||
'protocol': protocol,
|
||||
})
|
||||
return formats
|
||||
|
||||
def _live_title(self, name):
|
||||
""" Generate the title for a live video """
|
||||
now = datetime.datetime.now()
|
||||
|
@@ -7,7 +7,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class CriterionIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.criterion\.com/films/(?P<id>[0-9]+)-.+'
|
||||
_VALID_URL = r'https?://(?:www\.)?criterion\.com/films/(?P<id>[0-9]+)-.+'
|
||||
_TEST = {
|
||||
'url': 'http://www.criterion.com/films/184-le-samourai',
|
||||
'md5': 'bc51beba55685509883a9a7830919ec3',
|
||||
|
@@ -34,22 +34,58 @@ from ..aes import (
|
||||
|
||||
|
||||
class CrunchyrollBaseIE(InfoExtractor):
|
||||
_LOGIN_URL = 'https://www.crunchyroll.com/login'
|
||||
_LOGIN_FORM = 'login_form'
|
||||
_NETRC_MACHINE = 'crunchyroll'
|
||||
|
||||
def _login(self):
|
||||
(username, password) = self._get_login_info()
|
||||
if username is None:
|
||||
return
|
||||
self.report_login()
|
||||
login_url = 'https://www.crunchyroll.com/?a=formhandler'
|
||||
data = urlencode_postdata({
|
||||
'formname': 'RpcApiUser_Login',
|
||||
'name': username,
|
||||
'password': password,
|
||||
|
||||
login_page = self._download_webpage(
|
||||
self._LOGIN_URL, None, 'Downloading login page')
|
||||
|
||||
def is_logged(webpage):
|
||||
return '<title>Redirecting' in webpage
|
||||
|
||||
# Already logged in
|
||||
if is_logged(login_page):
|
||||
return
|
||||
|
||||
login_form_str = self._search_regex(
|
||||
r'(?P<form><form[^>]+?id=(["\'])%s\2[^>]*>)' % self._LOGIN_FORM,
|
||||
login_page, 'login form', group='form')
|
||||
|
||||
post_url = extract_attributes(login_form_str).get('action')
|
||||
if not post_url:
|
||||
post_url = self._LOGIN_URL
|
||||
elif not post_url.startswith('http'):
|
||||
post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
|
||||
|
||||
login_form = self._form_hidden_inputs(self._LOGIN_FORM, login_page)
|
||||
|
||||
login_form.update({
|
||||
'login_form[name]': username,
|
||||
'login_form[password]': password,
|
||||
})
|
||||
login_request = sanitized_Request(login_url, data)
|
||||
login_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
|
||||
self._download_webpage(login_request, None, False, 'Wrong login info')
|
||||
|
||||
response = self._download_webpage(
|
||||
post_url, None, 'Logging in', 'Wrong login info',
|
||||
data=urlencode_postdata(login_form),
|
||||
headers={'Content-Type': 'application/x-www-form-urlencoded'})
|
||||
|
||||
# Successful login
|
||||
if is_logged(response):
|
||||
return
|
||||
|
||||
error = self._html_search_regex(
|
||||
'(?s)<ul[^>]+class=["\']messages["\'][^>]*>(.+?)</ul>',
|
||||
response, 'error message', default=None)
|
||||
if error:
|
||||
raise ExtractorError('Unable to login: %s' % error, expected=True)
|
||||
|
||||
raise ExtractorError('Unable to log in')
|
||||
|
||||
def _real_initialize(self):
|
||||
self._login()
|
||||
|
@@ -6,7 +6,7 @@ from ..compat import compat_str
|
||||
|
||||
|
||||
class DctpTvIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.dctp.tv/(#/)?filme/(?P<id>.+?)/$'
|
||||
_VALID_URL = r'https?://(?:www\.)?dctp\.tv/(#/)?filme/(?P<id>.+?)/$'
|
||||
_TEST = {
|
||||
'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
|
||||
'info_dict': {
|
||||
|
@@ -13,7 +13,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class DemocracynowIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?democracynow.org/(?P<id>[^\?]*)'
|
||||
_VALID_URL = r'https?://(?:www\.)?democracynow\.org/(?P<id>[^\?]*)'
|
||||
IE_NAME = 'democracynow'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.democracynow.org/shows/2015/7/3',
|
||||
|
@@ -4,7 +4,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class EngadgetIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.engadget.com/video/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?engadget\.com/video/(?P<id>[^/?#]+)'
|
||||
|
||||
_TESTS = [{
|
||||
# video with 5min ID
|
||||
|
@@ -8,7 +8,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class ExpoTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.expotv\.com/videos/[^?#]*/(?P<id>[0-9]+)($|[?#])'
|
||||
_VALID_URL = r'https?://(?:www\.)?expotv\.com/videos/[^?#]*/(?P<id>[0-9]+)($|[?#])'
|
||||
_TEST = {
|
||||
'url': 'http://www.expotv.com/videos/reviews/3/40/NYX-Butter-lipstick/667916',
|
||||
'md5': 'fe1d728c3a813ff78f595bc8b7a707a8',
|
||||
|
@@ -93,6 +93,7 @@ from .bbc import (
|
||||
)
|
||||
from .beeg import BeegIE
|
||||
from .behindkink import BehindKinkIE
|
||||
from .bellmedia import BellMediaIE
|
||||
from .beatportpro import BeatportProIE
|
||||
from .bet import BetIE
|
||||
from .bigflix import BigflixIE
|
||||
@@ -195,7 +196,6 @@ from .crunchyroll import (
|
||||
)
|
||||
from .cspan import CSpanIE
|
||||
from .ctsnews import CtsNewsIE
|
||||
from .ctv import CTVIE
|
||||
from .ctvnews import CTVNewsIE
|
||||
from .cultureunplugged import CultureUnpluggedIE
|
||||
from .curiositystream import (
|
||||
@@ -472,6 +472,10 @@ from .macgamestore import MacGameStoreIE
|
||||
from .mailru import MailRuIE
|
||||
from .makerschannel import MakersChannelIE
|
||||
from .makertv import MakerTVIE
|
||||
from .mangomolo import (
|
||||
MangomoloVideoIE,
|
||||
MangomoloLiveIE,
|
||||
)
|
||||
from .matchtv import MatchTVIE
|
||||
from .mdr import MDRIE
|
||||
from .meta import METAIE
|
||||
@@ -534,6 +538,7 @@ from .nbc import (
|
||||
CSNNEIE,
|
||||
NBCIE,
|
||||
NBCNewsIE,
|
||||
NBCOlympicsIE,
|
||||
NBCSportsIE,
|
||||
NBCSportsVPlayerIE,
|
||||
)
|
||||
@@ -1064,6 +1069,7 @@ from .vporn import VpornIE
|
||||
from .vrt import VRTIE
|
||||
from .vube import VubeIE
|
||||
from .vuclip import VuClipIE
|
||||
from .vyborymos import VyboryMosIE
|
||||
from .walla import WallaIE
|
||||
from .washingtonpost import (
|
||||
WashingtonPostIE,
|
||||
|
@@ -1,14 +1,14 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .adobepass import AdobePassIE
|
||||
from ..utils import (
|
||||
smuggle_url,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class FOXIE(InfoExtractor):
|
||||
class FOXIE(AdobePassIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?fox\.com/watch/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.fox.com/watch/255180355939/7684182528',
|
||||
@@ -30,14 +30,26 @@ class FOXIE(InfoExtractor):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
release_url = self._parse_json(self._search_regex(
|
||||
r'"fox_pdk_player"\s*:\s*({[^}]+?})', webpage, 'fox_pdk_player'),
|
||||
video_id)['release_url']
|
||||
settings = self._parse_json(self._search_regex(
|
||||
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
|
||||
webpage, 'drupal settings'), video_id)
|
||||
fox_pdk_player = settings['fox_pdk_player']
|
||||
release_url = fox_pdk_player['release_url']
|
||||
query = {
|
||||
'mbr': 'true',
|
||||
'switch': 'http'
|
||||
}
|
||||
if fox_pdk_player.get('access') == 'locked':
|
||||
ap_p = settings['foxAdobePassProvider']
|
||||
rating = ap_p.get('videoRating')
|
||||
if rating == 'n/a':
|
||||
rating = None
|
||||
resource = self._get_mvpd_resource('fbc-fox', None, ap_p['videoGUID'], rating)
|
||||
query['auth'] = self._extract_mvpd_auth(url, video_id, 'fbc-fox', resource)
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'ie_key': 'ThePlatform',
|
||||
'url': smuggle_url(update_url_query(
|
||||
release_url, {'switch': 'http'}), {'force_smil_url': True}),
|
||||
'url': smuggle_url(update_url_query(release_url, query), {'force_smil_url': True}),
|
||||
'id': video_id,
|
||||
}
|
||||
|
@@ -2,21 +2,21 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none
|
||||
from ..utils import month_by_name
|
||||
|
||||
|
||||
class FranceInterIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?franceinter\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?franceinter\.fr/emissions/(?P<id>[^?#]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.franceinter.fr/player/reecouter?play=793962',
|
||||
'md5': '4764932e466e6f6c79c317d2e74f6884',
|
||||
'url': 'https://www.franceinter.fr/emissions/affaires-sensibles/affaires-sensibles-07-septembre-2016',
|
||||
'md5': '9e54d7bdb6fdc02a841007f8a975c094',
|
||||
'info_dict': {
|
||||
'id': '793962',
|
||||
'id': 'affaires-sensibles/affaires-sensibles-07-septembre-2016',
|
||||
'ext': 'mp3',
|
||||
'title': 'L’Histoire dans les jeux vidéo',
|
||||
'description': 'md5:7e93ddb4451e7530022792240a3049c7',
|
||||
'timestamp': 1387369800,
|
||||
'upload_date': '20131218',
|
||||
'title': 'Affaire Cahuzac : le contentieux du compte en Suisse',
|
||||
'description': 'md5:401969c5d318c061f86bda1fa359292b',
|
||||
'upload_date': '20160907',
|
||||
},
|
||||
}
|
||||
|
||||
@@ -25,23 +25,30 @@ class FranceInterIE(InfoExtractor):
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
path = self._search_regex(
|
||||
r'<a id="player".+?href="([^"]+)"', webpage, 'video url')
|
||||
video_url = 'http://www.franceinter.fr/' + path
|
||||
video_url = self._search_regex(
|
||||
r'(?s)<div[^>]+class=["\']page-diffusion["\'][^>]*>.*?<button[^>]+data-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
|
||||
webpage, 'video url', group='url')
|
||||
|
||||
title = self._html_search_regex(
|
||||
r'<span class="title-diffusion">(.+?)</span>', webpage, 'title')
|
||||
description = self._html_search_regex(
|
||||
r'<span class="description">(.*?)</span>',
|
||||
webpage, 'description', fatal=False)
|
||||
timestamp = int_or_none(self._search_regex(
|
||||
r'data-date="(\d+)"', webpage, 'upload date', fatal=False))
|
||||
title = self._og_search_title(webpage)
|
||||
description = self._og_search_description(webpage)
|
||||
|
||||
upload_date_str = self._search_regex(
|
||||
r'class=["\']cover-emission-period["\'][^>]*>[^<]+\s+(\d{1,2}\s+[^\s]+\s+\d{4})<',
|
||||
webpage, 'upload date', fatal=False)
|
||||
if upload_date_str:
|
||||
upload_date_list = upload_date_str.split()
|
||||
upload_date_list.reverse()
|
||||
upload_date_list[1] = '%02d' % (month_by_name(upload_date_list[1], lang='fr') or 0)
|
||||
upload_date_list[2] = '%02d' % int(upload_date_list[2])
|
||||
upload_date = ''.join(upload_date_list)
|
||||
else:
|
||||
upload_date = None
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'description': description,
|
||||
'timestamp': timestamp,
|
||||
'upload_date': upload_date,
|
||||
'formats': [{
|
||||
'url': video_url,
|
||||
'vcodec': 'none',
|
||||
|
@@ -8,7 +8,7 @@ from .common import InfoExtractor
|
||||
|
||||
class FreespeechIE(InfoExtractor):
|
||||
IE_NAME = 'freespeech.org'
|
||||
_VALID_URL = r'https://www\.freespeech\.org/video/(?P<title>.+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?freespeech\.org/video/(?P<title>.+)'
|
||||
_TEST = {
|
||||
'add_ie': ['Youtube'],
|
||||
'url': 'https://www.freespeech.org/video/obama-romney-campaign-colorado-ahead-debate-0',
|
||||
|
@@ -9,7 +9,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class GameStarIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.gamestar\.de/videos/.*,(?P<id>[0-9]+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?gamestar\.de/videos/.*,(?P<id>[0-9]+)\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.gamestar.de/videos/trailer,3/hobbit-3-die-schlacht-der-fuenf-heere,76110.html',
|
||||
'md5': '96974ecbb7fd8d0d20fca5a00810cea7',
|
||||
|
@@ -1369,6 +1369,11 @@ class GenericIE(InfoExtractor):
|
||||
},
|
||||
'add_ie': ['Vimeo'],
|
||||
},
|
||||
{
|
||||
# generic vimeo embed that requires original URL passed as Referer
|
||||
'url': 'http://racing4everyone.eu/2016/07/30/formula-1-2016-round12-germany/',
|
||||
'only_matching': True,
|
||||
},
|
||||
{
|
||||
'url': 'https://support.arkena.com/display/PLAY/Ways+to+embed+your+video',
|
||||
'md5': 'b96f2f71b359a8ecd05ce4e1daa72365',
|
||||
@@ -1652,7 +1657,9 @@ class GenericIE(InfoExtractor):
|
||||
return self.playlist_result(self._parse_xspf(doc, video_id), video_id)
|
||||
elif re.match(r'(?i)^(?:{[^}]+})?MPD$', doc.tag):
|
||||
info_dict['formats'] = self._parse_mpd_formats(
|
||||
doc, video_id, mpd_base_url=url.rpartition('/')[0])
|
||||
doc, video_id,
|
||||
mpd_base_url=full_response.geturl().rpartition('/')[0],
|
||||
mpd_url=url)
|
||||
self._sort_formats(info_dict['formats'])
|
||||
return info_dict
|
||||
elif re.match(r'^{http://ns\.adobe\.com/f4m/[12]\.0}manifest$', doc.tag):
|
||||
@@ -2249,6 +2256,35 @@ class GenericIE(InfoExtractor):
|
||||
return self.url_result(
|
||||
self._proto_relative_url(unescapeHTML(mobj.group('url'))), 'VODPlatform')
|
||||
|
||||
# Look for Mangomolo embeds
|
||||
mobj = re.search(
|
||||
r'''(?x)<iframe[^>]+src=(["\'])(?P<url>(?:https?:)?//(?:www\.)?admin\.mangomolo\.com/analytics/index\.php/customers/embed/
|
||||
(?:
|
||||
video\?.*?\bid=(?P<video_id>\d+)|
|
||||
index\?.*?\bchannelid=(?P<channel_id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)
|
||||
).+?)\1''', webpage)
|
||||
if mobj is not None:
|
||||
info = {
|
||||
'_type': 'url_transparent',
|
||||
'url': self._proto_relative_url(unescapeHTML(mobj.group('url'))),
|
||||
'title': video_title,
|
||||
'description': video_description,
|
||||
'thumbnail': video_thumbnail,
|
||||
'uploader': video_uploader,
|
||||
}
|
||||
video_id = mobj.group('video_id')
|
||||
if video_id:
|
||||
info.update({
|
||||
'ie_key': 'MangomoloVideo',
|
||||
'id': video_id,
|
||||
})
|
||||
else:
|
||||
info.update({
|
||||
'ie_key': 'MangomoloLive',
|
||||
'id': mobj.group('channel_id'),
|
||||
})
|
||||
return info
|
||||
|
||||
# Look for Instagram embeds
|
||||
instagram_embed_url = InstagramIE._extract_embed_url(webpage)
|
||||
if instagram_embed_url is not None:
|
||||
|
@@ -2,6 +2,7 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import random
|
||||
import re
|
||||
import math
|
||||
|
||||
from .common import InfoExtractor
|
||||
@@ -14,6 +15,7 @@ from ..utils import (
|
||||
ExtractorError,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
orderedSet,
|
||||
str_or_none,
|
||||
)
|
||||
|
||||
@@ -63,6 +65,9 @@ class GloboIE(InfoExtractor):
|
||||
}, {
|
||||
'url': 'http://canaloff.globo.com/programas/desejar-profundo/videos/4518560.html',
|
||||
'only_matching': True,
|
||||
}, {
|
||||
'url': 'globo:3607726',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
class MD5(object):
|
||||
@@ -396,7 +401,7 @@ class GloboIE(InfoExtractor):
|
||||
|
||||
|
||||
class GloboArticleIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/]+)(?:\.html)?'
|
||||
_VALID_URL = r'https?://.+?\.globo\.com/(?:[^/]+/)*(?P<id>[^/.]+)(?:\.html)?'
|
||||
|
||||
_VIDEOID_REGEXES = [
|
||||
r'\bdata-video-id=["\'](\d{7,})',
|
||||
@@ -408,15 +413,20 @@ class GloboArticleIE(InfoExtractor):
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://g1.globo.com/jornal-nacional/noticia/2014/09/novidade-na-fiscalizacao-de-bagagem-pela-receita-provoca-discussoes.html',
|
||||
'md5': '307fdeae4390ccfe6ba1aa198cf6e72b',
|
||||
'info_dict': {
|
||||
'id': '3652183',
|
||||
'ext': 'mp4',
|
||||
'title': 'Receita Federal explica como vai fiscalizar bagagens de quem retorna ao Brasil de avião',
|
||||
'duration': 110.711,
|
||||
'uploader': 'Rede Globo',
|
||||
'uploader_id': '196',
|
||||
}
|
||||
'id': 'novidade-na-fiscalizacao-de-bagagem-pela-receita-provoca-discussoes',
|
||||
'title': 'Novidade na fiscalização de bagagem pela Receita provoca discussões',
|
||||
'description': 'md5:c3c4b4d4c30c32fce460040b1ac46b12',
|
||||
},
|
||||
'playlist_count': 1,
|
||||
}, {
|
||||
'url': 'http://g1.globo.com/pr/parana/noticia/2016/09/mpf-denuncia-lula-marisa-e-mais-seis-na-operacao-lava-jato.html',
|
||||
'info_dict': {
|
||||
'id': 'mpf-denuncia-lula-marisa-e-mais-seis-na-operacao-lava-jato',
|
||||
'title': "Lula era o 'comandante máximo' do esquema da Lava Jato, diz MPF",
|
||||
'description': 'md5:8aa7cc8beda4dc71cc8553e00b77c54c',
|
||||
},
|
||||
'playlist_count': 6,
|
||||
}, {
|
||||
'url': 'http://gq.globo.com/Prazeres/Poder/noticia/2015/10/all-o-desafio-assista-ao-segundo-capitulo-da-serie.html',
|
||||
'only_matching': True,
|
||||
@@ -435,5 +445,12 @@ class GloboArticleIE(InfoExtractor):
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(self._VIDEOID_REGEXES, webpage, 'video id')
|
||||
return self.url_result('globo:%s' % video_id, 'Globo')
|
||||
video_ids = []
|
||||
for video_regex in self._VIDEOID_REGEXES:
|
||||
video_ids.extend(re.findall(video_regex, webpage))
|
||||
entries = [
|
||||
self.url_result('globo:%s' % video_id, GloboIE.ie_key())
|
||||
for video_id in orderedSet(video_ids)]
|
||||
title = self._og_search_title(webpage, fatal=False)
|
||||
description = self._html_search_meta('description', webpage)
|
||||
return self.playlist_result(entries, display_id, title, description)
|
||||
|
@@ -8,6 +8,8 @@ from ..utils import (
|
||||
int_or_none,
|
||||
determine_ext,
|
||||
parse_age_limit,
|
||||
urlencode_postdata,
|
||||
ExtractorError,
|
||||
)
|
||||
|
||||
|
||||
@@ -19,7 +21,7 @@ class GoIE(InfoExtractor):
|
||||
'watchdisneyjunior': '008',
|
||||
'watchdisneyxd': '009',
|
||||
}
|
||||
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/.*?vdka(?P<id>\w+)' % '|'.join(_BRANDS.keys())
|
||||
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|season-\d+/\d+-(?P<display_id>[^/?#]+))' % '|'.join(_BRANDS.keys())
|
||||
_TESTS = [{
|
||||
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
|
||||
'info_dict': {
|
||||
@@ -38,9 +40,13 @@ class GoIE(InfoExtractor):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
sub_domain, video_id = re.match(self._VALID_URL, url).groups()
|
||||
sub_domain, video_id, display_id = re.match(self._VALID_URL, url).groups()
|
||||
if not video_id:
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(r'data-video-id=["\']VDKA(\w+)', webpage, 'video id')
|
||||
brand = self._BRANDS[sub_domain]
|
||||
video_data = self._download_json(
|
||||
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (self._BRANDS[sub_domain], video_id),
|
||||
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (brand, video_id),
|
||||
video_id)['video'][0]
|
||||
title = video_data['title']
|
||||
|
||||
@@ -52,6 +58,21 @@ class GoIE(InfoExtractor):
|
||||
format_id = asset.get('format')
|
||||
ext = determine_ext(asset_url)
|
||||
if ext == 'm3u8':
|
||||
video_type = video_data.get('type')
|
||||
if video_type == 'lf':
|
||||
entitlement = self._download_json(
|
||||
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
|
||||
video_id, data=urlencode_postdata({
|
||||
'video_id': video_data['id'],
|
||||
'video_type': video_type,
|
||||
'brand': brand,
|
||||
'device': '001',
|
||||
}))
|
||||
errors = entitlement.get('errors', {}).get('errors', [])
|
||||
if errors:
|
||||
error_message = ', '.join([error['message'] for error in errors])
|
||||
raise ExtractorError('%s said: %s' % (self.IE_NAME, error_message), expected=True)
|
||||
asset_url += '?' + entitlement['uplynkData']['sessionKey']
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
asset_url, video_id, 'mp4', m3u8_id=format_id or 'hls', fatal=False))
|
||||
else:
|
||||
|
@@ -10,7 +10,7 @@ from ..utils import unified_strdate
|
||||
|
||||
class GooglePlusIE(InfoExtractor):
|
||||
IE_DESC = 'Google Plus'
|
||||
_VALID_URL = r'https://plus\.google\.com/(?:[^/]+/)*?posts/(?P<id>\w+)'
|
||||
_VALID_URL = r'https?://plus\.google\.com/(?:[^/]+/)*?posts/(?P<id>\w+)'
|
||||
IE_NAME = 'plus.google'
|
||||
_TEST = {
|
||||
'url': 'https://plus.google.com/u/0/108897254135232129896/posts/ZButuJc6CtH',
|
||||
|
@@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class GoshgayIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.goshgay\.com/video(?P<id>\d+?)($|/)'
|
||||
_VALID_URL = r'https?://(?:www\.)?goshgay\.com/video(?P<id>\d+?)($|/)'
|
||||
_TEST = {
|
||||
'url': 'http://www.goshgay.com/video299069/diesel_sfw_xxx_video',
|
||||
'md5': '4b6db9a0a333142eb9f15913142b0ed1',
|
||||
|
@@ -5,7 +5,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class HarkIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.hark\.com/clips/(?P<id>.+?)-.+'
|
||||
_VALID_URL = r'https?://(?:www\.)?hark\.com/clips/(?P<id>.+?)-.+'
|
||||
_TEST = {
|
||||
'url': 'http://www.hark.com/clips/mmbzyhkgny-obama-beyond-the-afghan-theater-we-only-target-al-qaeda-on-may-23-2013',
|
||||
'md5': '6783a58491b47b92c7c1af5a77d4cbee',
|
||||
|
@@ -12,7 +12,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class HotNewHipHopIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.hotnewhiphop\.com/.*\.(?P<id>.*)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?hotnewhiphop\.com/.*\.(?P<id>.*)\.html'
|
||||
_TEST = {
|
||||
'url': 'http://www.hotnewhiphop.com/freddie-gibbs-lay-it-down-song.1435540.html',
|
||||
'md5': '2c2cd2f76ef11a9b3b581e8b232f3d96',
|
||||
|
@@ -94,7 +94,7 @@ class ImdbIE(InfoExtractor):
|
||||
class ImdbListIE(InfoExtractor):
|
||||
IE_NAME = 'imdb:list'
|
||||
IE_DESC = 'Internet Movie Database lists'
|
||||
_VALID_URL = r'https?://www\.imdb\.com/list/(?P<id>[\da-zA-Z_-]{11})'
|
||||
_VALID_URL = r'https?://(?:www\.)?imdb\.com/list/(?P<id>[\da-zA-Z_-]{11})'
|
||||
_TEST = {
|
||||
'url': 'http://www.imdb.com/list/JFs9NWw6XI0',
|
||||
'info_dict': {
|
||||
|
@@ -9,6 +9,7 @@ from ..utils import (
|
||||
determine_ext,
|
||||
float_or_none,
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
mimetype2ext,
|
||||
)
|
||||
|
||||
@@ -19,14 +20,15 @@ class JWPlatformBaseIE(InfoExtractor):
|
||||
# TODO: Merge this with JWPlayer-related codes in generic.py
|
||||
|
||||
mobj = re.search(
|
||||
'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\((?P<options>[^)]+)\)',
|
||||
r'jwplayer\((?P<quote>[\'"])[^\'" ]+(?P=quote)\)\.setup\s*\((?P<options>[^)]+)\)',
|
||||
webpage)
|
||||
if mobj:
|
||||
return mobj.group('options')
|
||||
|
||||
def _extract_jwplayer_data(self, webpage, video_id, *args, **kwargs):
|
||||
jwplayer_data = self._parse_json(
|
||||
self._find_jwplayer_data(webpage), video_id)
|
||||
self._find_jwplayer_data(webpage), video_id,
|
||||
transform_source=js_to_json)
|
||||
return self._parse_jwplayer_data(
|
||||
jwplayer_data, video_id, *args, **kwargs)
|
||||
|
||||
|
@@ -262,8 +262,16 @@ class KalturaIE(InfoExtractor):
|
||||
# Continue if asset is not ready
|
||||
if f.get('status') != 2:
|
||||
continue
|
||||
# Original format that's not available (e.g. kaltura:1926081:0_c03e1b5g)
|
||||
# skip for now.
|
||||
if f.get('fileExt') == 'chun':
|
||||
continue
|
||||
video_url = sign_url(
|
||||
'%s/flavorId/%s' % (data_url, f['id']))
|
||||
# audio-only has no videoCodecId (e.g. kaltura:1926081:0_c03e1b5g
|
||||
# -f mp4-56)
|
||||
vcodec = 'none' if 'videoCodecId' not in f and f.get(
|
||||
'frameRate') == 0 else f.get('videoCodecId')
|
||||
formats.append({
|
||||
'format_id': '%(fileExt)s-%(bitrate)s' % f,
|
||||
'ext': f.get('fileExt'),
|
||||
@@ -271,7 +279,7 @@ class KalturaIE(InfoExtractor):
|
||||
'fps': int_or_none(f.get('frameRate')),
|
||||
'filesize_approx': int_or_none(f.get('size'), invscale=1024),
|
||||
'container': f.get('containerFormat'),
|
||||
'vcodec': f.get('videoCodecId'),
|
||||
'vcodec': vcodec,
|
||||
'height': int_or_none(f.get('height')),
|
||||
'width': int_or_none(f.get('width')),
|
||||
'url': video_url,
|
||||
|
@@ -5,7 +5,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class KaraoketvIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.karaoketv\.co\.il/[^/]+/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?karaoketv\.co\.il/[^/]+/(?P<id>\d+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.karaoketv.co.il/%D7%A9%D7%99%D7%A8%D7%99_%D7%A7%D7%A8%D7%99%D7%95%D7%A7%D7%99/58356/%D7%90%D7%99%D7%96%D7%95%D7%9F',
|
||||
'info_dict': {
|
||||
|
@@ -6,7 +6,7 @@ from ..utils import smuggle_url
|
||||
|
||||
|
||||
class KickStarterIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.kickstarter\.com/projects/(?P<id>[^/]*)/.*'
|
||||
_VALID_URL = r'https?://(?:www\.)?kickstarter\.com/projects/(?P<id>[^/]*)/.*'
|
||||
_TESTS = [{
|
||||
'url': 'https://www.kickstarter.com/projects/1404461844/intersection-the-story-of-josh-grant/description',
|
||||
'md5': 'c81addca81327ffa66c642b5d8b08cab',
|
||||
|
@@ -59,7 +59,7 @@ class KuwoBaseIE(InfoExtractor):
|
||||
class KuwoIE(KuwoBaseIE):
|
||||
IE_NAME = 'kuwo:song'
|
||||
IE_DESC = '酷我音乐'
|
||||
_VALID_URL = r'https?://www\.kuwo\.cn/yinyue/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?kuwo\.cn/yinyue/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.kuwo.cn/yinyue/635632/',
|
||||
'info_dict': {
|
||||
@@ -82,7 +82,7 @@ class KuwoIE(KuwoBaseIE):
|
||||
'upload_date': '20150518',
|
||||
},
|
||||
'params': {
|
||||
'format': 'mp3-320'
|
||||
'format': 'mp3-320',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://www.kuwo.cn/yinyue/3197154?catalog=yueku2016',
|
||||
@@ -91,10 +91,10 @@ class KuwoIE(KuwoBaseIE):
|
||||
|
||||
def _real_extract(self, url):
|
||||
song_id = self._match_id(url)
|
||||
webpage = self._download_webpage(
|
||||
webpage, urlh = self._download_webpage_handle(
|
||||
url, song_id, note='Download song detail info',
|
||||
errnote='Unable to get song detail info')
|
||||
if '对不起,该歌曲由于版权问题已被下线,将返回网站首页' in webpage:
|
||||
if song_id not in urlh.geturl() or '对不起,该歌曲由于版权问题已被下线,将返回网站首页' in webpage:
|
||||
raise ExtractorError('this song has been offline because of copyright issues', expected=True)
|
||||
|
||||
song_name = self._html_search_regex(
|
||||
@@ -139,7 +139,7 @@ class KuwoIE(KuwoBaseIE):
|
||||
class KuwoAlbumIE(InfoExtractor):
|
||||
IE_NAME = 'kuwo:album'
|
||||
IE_DESC = '酷我音乐 - 专辑'
|
||||
_VALID_URL = r'https?://www\.kuwo\.cn/album/(?P<id>\d+?)/'
|
||||
_VALID_URL = r'https?://(?:www\.)?kuwo\.cn/album/(?P<id>\d+?)/'
|
||||
_TEST = {
|
||||
'url': 'http://www.kuwo.cn/album/502294/',
|
||||
'info_dict': {
|
||||
@@ -181,7 +181,7 @@ class KuwoChartIE(InfoExtractor):
|
||||
'info_dict': {
|
||||
'id': '香港中文龙虎榜',
|
||||
},
|
||||
'playlist_mincount': 10,
|
||||
'playlist_mincount': 7,
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
@@ -200,7 +200,7 @@ class KuwoChartIE(InfoExtractor):
|
||||
class KuwoSingerIE(InfoExtractor):
|
||||
IE_NAME = 'kuwo:singer'
|
||||
IE_DESC = '酷我音乐 - 歌手'
|
||||
_VALID_URL = r'https?://www\.kuwo\.cn/mingxing/(?P<id>[^/]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?kuwo\.cn/mingxing/(?P<id>[^/]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.kuwo.cn/mingxing/bruno+mars/',
|
||||
'info_dict': {
|
||||
@@ -296,14 +296,14 @@ class KuwoCategoryIE(InfoExtractor):
|
||||
class KuwoMvIE(KuwoBaseIE):
|
||||
IE_NAME = 'kuwo:mv'
|
||||
IE_DESC = '酷我音乐 - MV'
|
||||
_VALID_URL = r'https?://www\.kuwo\.cn/mv/(?P<id>\d+?)/'
|
||||
_VALID_URL = r'https?://(?:www\.)?kuwo\.cn/mv/(?P<id>\d+?)/'
|
||||
_TEST = {
|
||||
'url': 'http://www.kuwo.cn/mv/6480076/',
|
||||
'info_dict': {
|
||||
'id': '6480076',
|
||||
'ext': 'mp4',
|
||||
'title': 'My HouseMV',
|
||||
'creator': 'PM02:00',
|
||||
'creator': '2PM',
|
||||
},
|
||||
# In this video, music URLs (anti.s) are blocked outside China and
|
||||
# USA, while the MV URL (mvurl) is available globally, so force the MV
|
||||
|
@@ -14,7 +14,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class LiTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.litv\.tv/(?:vod|promo)/[^/]+/(?:content\.do)?\?.*?\b(?:content_)?id=(?P<id>[^&]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?litv\.tv/(?:vod|promo)/[^/]+/(?:content\.do)?\?.*?\b(?:content_)?id=(?P<id>[^&]+)'
|
||||
|
||||
_URL_TEMPLATE = 'https://www.litv.tv/vod/%s/content.do?id=%s'
|
||||
|
||||
|
@@ -94,7 +94,7 @@ class LyndaBaseIE(InfoExtractor):
|
||||
class LyndaIE(LyndaBaseIE):
|
||||
IE_NAME = 'lynda'
|
||||
IE_DESC = 'lynda.com videos'
|
||||
_VALID_URL = r'https?://www\.lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
|
||||
|
||||
_TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'
|
||||
|
||||
|
@@ -7,7 +7,7 @@ from ..utils import ExtractorError
|
||||
class MacGameStoreIE(InfoExtractor):
|
||||
IE_NAME = 'macgamestore'
|
||||
IE_DESC = 'MacGameStore trailers'
|
||||
_VALID_URL = r'https?://www\.macgamestore\.com/mediaviewer\.php\?trailer=(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?macgamestore\.com/mediaviewer\.php\?trailer=(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.macgamestore.com/mediaviewer.php?trailer=2450',
|
||||
|
54
youtube_dl/extractor/mangomolo.py
Normal file
54
youtube_dl/extractor/mangomolo.py
Normal file
@@ -0,0 +1,54 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import base64
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_urllib_parse_unquote
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
)
|
||||
|
||||
|
||||
class MangomoloBaseIE(InfoExtractor):
|
||||
def _get_real_id(self, page_id):
|
||||
return page_id
|
||||
|
||||
def _real_extract(self, url):
|
||||
page_id = self._get_real_id(self._match_id(url))
|
||||
webpage = self._download_webpage(url, page_id)
|
||||
hidden_inputs = self._hidden_inputs(webpage)
|
||||
m3u8_entry_protocol = 'm3u8' if self._IS_LIVE else 'm3u8_native'
|
||||
|
||||
format_url = self._html_search_regex(
|
||||
[
|
||||
r'file\s*:\s*"(https?://[^"]+?/playlist.m3u8)',
|
||||
r'<a[^>]+href="(rtsp://[^"]+)"'
|
||||
], webpage, 'format url')
|
||||
formats = self._extract_wowza_formats(
|
||||
format_url, page_id, m3u8_entry_protocol, ['smil'])
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': page_id,
|
||||
'title': self._live_title(page_id) if self._IS_LIVE else page_id,
|
||||
'uploader_id': hidden_inputs.get('userid'),
|
||||
'duration': int_or_none(hidden_inputs.get('duration')),
|
||||
'is_live': self._IS_LIVE,
|
||||
'formats': formats,
|
||||
}
|
||||
|
||||
|
||||
class MangomoloVideoIE(MangomoloBaseIE):
|
||||
IE_NAME = 'mangomolo:video'
|
||||
_VALID_URL = r'https?://admin\.mangomolo\.com/analytics/index\.php/customers/embed/video\?.*?\bid=(?P<id>\d+)'
|
||||
_IS_LIVE = False
|
||||
|
||||
|
||||
class MangomoloLiveIE(MangomoloBaseIE):
|
||||
IE_NAME = 'mangomolo:live'
|
||||
_VALID_URL = r'https?://admin\.mangomolo\.com/analytics/index\.php/customers/embed/index\?.*?\bchannelid=(?P<id>(?:[A-Za-z0-9+/=]|%2B|%2F|%3D)+)'
|
||||
_IS_LIVE = True
|
||||
|
||||
def _get_real_id(self, page_id):
|
||||
return base64.b64decode(compat_urllib_parse_unquote(page_id).encode()).decode()
|
@@ -9,7 +9,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class MetacriticIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.metacritic\.com/.+?/trailers/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?metacritic\.com/.+?/trailers/(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.metacritic.com/game/playstation-4/infamous-second-son/trailers/3698222',
|
||||
|
@@ -6,7 +6,7 @@ from ..utils import int_or_none
|
||||
|
||||
|
||||
class MGTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.mgtv\.com/v/(?:[^/]+/)*(?P<id>\d+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?mgtv\.com/v/(?:[^/]+/)*(?P<id>\d+)\.html'
|
||||
IE_DESC = '芒果TV'
|
||||
|
||||
_TESTS = [{
|
||||
|
@@ -8,7 +8,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class MinistryGridIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.ministrygrid.com/([^/?#]*/)*(?P<id>[^/#?]+)/?(?:$|[?#])'
|
||||
_VALID_URL = r'https?://(?:www\.)?ministrygrid\.com/([^/?#]*/)*(?P<id>[^/#?]+)/?(?:$|[?#])'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.ministrygrid.com/training-viewer/-/training/t4g-2014-conference/the-gospel-by-numbers-4/the-gospel-by-numbers',
|
||||
|
@@ -74,7 +74,7 @@ class MiTeleBaseIE(InfoExtractor):
|
||||
|
||||
class MiTeleIE(MiTeleBaseIE):
|
||||
IE_DESC = 'mitele.es'
|
||||
_VALID_URL = r'https?://www\.mitele\.es/(?:[^/]+/){3}(?P<id>[^/]+)/'
|
||||
_VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/){3}(?P<id>[^/]+)/'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.mitele.es/programas-tv/diario-de/la-redaccion/programa-144/',
|
||||
|
@@ -9,7 +9,7 @@ from ..compat import (
|
||||
|
||||
class MotorsportIE(InfoExtractor):
|
||||
IE_DESC = 'motorsport.com'
|
||||
_VALID_URL = r'https?://www\.motorsport\.com/[^/?#]+/video/(?:[^/?#]+/)(?P<id>[^/]+)/?(?:$|[?#])'
|
||||
_VALID_URL = r'https?://(?:www\.)?motorsport\.com/[^/?#]+/video/(?:[^/?#]+/)(?P<id>[^/]+)/?(?:$|[?#])'
|
||||
_TEST = {
|
||||
'url': 'http://www.motorsport.com/f1/video/main-gallery/red-bull-racing-2014-rules-explained/',
|
||||
'info_dict': {
|
||||
|
@@ -7,7 +7,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class MoviezineIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.moviezine\.se/video/(?P<id>[^?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?moviezine\.se/video/(?P<id>[^?#]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.moviezine.se/video/205866',
|
||||
|
@@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class MySpassIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.myspass\.de/.*'
|
||||
_VALID_URL = r'https?://(?:www\.)?myspass\.de/.*'
|
||||
_TEST = {
|
||||
'url': 'http://www.myspass.de/myspass/shows/tvshows/absolute-mehrheit/Absolute-Mehrheit-vom-17022013-Die-Highlights-Teil-2--/11741/',
|
||||
'md5': '0b49f4844a068f8b33f4b7c88405862b',
|
||||
|
@@ -13,7 +13,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class NBCIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.nbc\.com/(?:[^/]+/)+(?P<id>n?\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?nbc\.com/(?:[^/]+/)+(?P<id>n?\d+)'
|
||||
|
||||
_TESTS = [
|
||||
{
|
||||
@@ -138,7 +138,7 @@ class NBCSportsVPlayerIE(InfoExtractor):
|
||||
|
||||
class NBCSportsIE(InfoExtractor):
|
||||
# Does not include https because its certificate is invalid
|
||||
_VALID_URL = r'https?://www\.nbcsports\.com//?(?:[^/]+/)+(?P<id>[0-9a-z-]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?nbcsports\.com//?(?:[^/]+/)+(?P<id>[0-9a-z-]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.nbcsports.com//college-basketball/ncaab/tom-izzo-michigan-st-has-so-much-respect-duke',
|
||||
@@ -161,7 +161,7 @@ class NBCSportsIE(InfoExtractor):
|
||||
|
||||
|
||||
class CSNNEIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.csnne\.com/video/(?P<id>[0-9a-z-]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?csnne\.com/video/(?P<id>[0-9a-z-]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.csnne.com/video/snc-evening-update-wright-named-red-sox-no-5-starter',
|
||||
@@ -335,3 +335,43 @@ class NBCNewsIE(ThePlatformIE):
|
||||
'url': 'http://feed.theplatform.com/f/2E2eJC/nnd_NBCNews?byId=%s' % video_id,
|
||||
'ie_key': 'ThePlatformFeed',
|
||||
}
|
||||
|
||||
|
||||
class NBCOlympicsIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.nbcolympics\.com/video/(?P<id>[a-z-]+)'
|
||||
|
||||
_TEST = {
|
||||
# Geo-restricted to US
|
||||
'url': 'http://www.nbcolympics.com/video/justin-roses-son-leo-was-tears-after-his-dad-won-gold',
|
||||
'md5': '54fecf846d05429fbaa18af557ee523a',
|
||||
'info_dict': {
|
||||
'id': 'WjTBzDXx5AUq',
|
||||
'display_id': 'justin-roses-son-leo-was-tears-after-his-dad-won-gold',
|
||||
'ext': 'mp4',
|
||||
'title': 'Rose\'s son Leo was in tears after his dad won gold',
|
||||
'description': 'Olympic gold medalist Justin Rose gets emotional talking to the impact his win in men\'s golf has already had on his children.',
|
||||
'timestamp': 1471274964,
|
||||
'upload_date': '20160815',
|
||||
'uploader': 'NBCU-SPORTS',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
|
||||
drupal_settings = self._parse_json(self._search_regex(
|
||||
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
|
||||
webpage, 'drupal settings'), display_id)
|
||||
|
||||
iframe_url = drupal_settings['vod']['iframe_url']
|
||||
theplatform_url = iframe_url.replace(
|
||||
'vplayer.nbcolympics.com', 'player.theplatform.com')
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': theplatform_url,
|
||||
'ie_key': ThePlatformIE.ie_key(),
|
||||
'display_id': display_id,
|
||||
}
|
||||
|
@@ -23,7 +23,7 @@ class NDRBaseIE(InfoExtractor):
|
||||
class NDRIE(NDRBaseIE):
|
||||
IE_NAME = 'ndr'
|
||||
IE_DESC = 'NDR.de - Norddeutscher Rundfunk'
|
||||
_VALID_URL = r'https?://www\.ndr\.de/(?:[^/]+/)*(?P<id>[^/?#]+),[\da-z]+\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?ndr\.de/(?:[^/]+/)*(?P<id>[^/?#]+),[\da-z]+\.html'
|
||||
_TESTS = [{
|
||||
# httpVideo, same content id
|
||||
'url': 'http://www.ndr.de/fernsehen/Party-Poette-und-Parade,hafengeburtstag988.html',
|
||||
@@ -105,7 +105,7 @@ class NDRIE(NDRBaseIE):
|
||||
class NJoyIE(NDRBaseIE):
|
||||
IE_NAME = 'njoy'
|
||||
IE_DESC = 'N-JOY'
|
||||
_VALID_URL = r'https?://www\.n-joy\.de/(?:[^/]+/)*(?:(?P<display_id>[^/?#]+),)?(?P<id>[\da-z]+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?n-joy\.de/(?:[^/]+/)*(?:(?P<display_id>[^/?#]+),)?(?P<id>[\da-z]+)\.html'
|
||||
_TESTS = [{
|
||||
# httpVideo, same content id
|
||||
'url': 'http://www.n-joy.de/entertainment/comedy/comedy_contest/Benaissa-beim-NDR-Comedy-Contest,comedycontest2480.html',
|
||||
@@ -238,7 +238,7 @@ class NDREmbedBaseIE(InfoExtractor):
|
||||
|
||||
class NDREmbedIE(NDREmbedBaseIE):
|
||||
IE_NAME = 'ndr:embed'
|
||||
_VALID_URL = r'https?://www\.ndr\.de/(?:[^/]+/)*(?P<id>[\da-z]+)-(?:player|externalPlayer)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?ndr\.de/(?:[^/]+/)*(?P<id>[\da-z]+)-(?:player|externalPlayer)\.html'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.ndr.de/fernsehen/sendungen/ndr_aktuell/ndraktuell28488-player.html',
|
||||
'md5': '8b9306142fe65bbdefb5ce24edb6b0a9',
|
||||
@@ -332,7 +332,7 @@ class NDREmbedIE(NDREmbedBaseIE):
|
||||
|
||||
class NJoyEmbedIE(NDREmbedBaseIE):
|
||||
IE_NAME = 'njoy:embed'
|
||||
_VALID_URL = r'https?://www\.n-joy\.de/(?:[^/]+/)*(?P<id>[\da-z]+)-(?:player|externalPlayer)_[^/]+\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?n-joy\.de/(?:[^/]+/)*(?P<id>[\da-z]+)-(?:player|externalPlayer)_[^/]+\.html'
|
||||
_TESTS = [{
|
||||
# httpVideo
|
||||
'url': 'http://www.n-joy.de/events/reeperbahnfestival/doku948-player_image-bc168e87-5263-4d6d-bd27-bb643005a6de_theme-n-joy.html',
|
||||
|
@@ -7,7 +7,7 @@ from ..utils import parse_iso8601
|
||||
|
||||
class NextMediaIE(InfoExtractor):
|
||||
IE_DESC = '蘋果日報'
|
||||
_VALID_URL = r'https?://hk.apple.nextmedia.com/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://hk\.apple\.nextmedia\.com/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://hk.apple.nextmedia.com/realtime/news/20141108/53109199',
|
||||
'md5': 'dff9fad7009311c421176d1ac90bfe4f',
|
||||
@@ -68,7 +68,7 @@ class NextMediaIE(InfoExtractor):
|
||||
|
||||
class NextMediaActionNewsIE(NextMediaIE):
|
||||
IE_DESC = '蘋果日報 - 動新聞'
|
||||
_VALID_URL = r'https?://hk.dv.nextmedia.com/actionnews/[^/]+/(?P<date>\d+)/(?P<id>\d+)/\d+'
|
||||
_VALID_URL = r'https?://hk\.dv\.nextmedia\.com/actionnews/[^/]+/(?P<date>\d+)/(?P<id>\d+)/\d+'
|
||||
_TESTS = [{
|
||||
'url': 'http://hk.dv.nextmedia.com/actionnews/hit/20150121/19009428/20061460',
|
||||
'md5': '05fce8ffeed7a5e00665d4b7cf0f9201',
|
||||
@@ -93,7 +93,7 @@ class NextMediaActionNewsIE(NextMediaIE):
|
||||
|
||||
class AppleDailyIE(NextMediaIE):
|
||||
IE_DESC = '臺灣蘋果日報'
|
||||
_VALID_URL = r'https?://(www|ent).appledaily.com.tw/(?:animation|appledaily|enews|realtimenews)/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
|
||||
_VALID_URL = r'https?://(www|ent)\.appledaily\.com\.tw/(?:animation|appledaily|enews|realtimenews)/[^/]+/[^/]+/(?P<date>\d+)/(?P<id>\d+)(/.*)?'
|
||||
_TESTS = [{
|
||||
'url': 'http://ent.appledaily.com.tw/enews/article/entertainment/20150128/36354694',
|
||||
'md5': 'a843ab23d150977cc55ef94f1e2c1e4d',
|
||||
|
@@ -165,7 +165,7 @@ class NFLIE(InfoExtractor):
|
||||
group='config'))
|
||||
# For articles, the id in the url is not the video id
|
||||
video_id = self._search_regex(
|
||||
r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>.+?)\1',
|
||||
r'(?:<nflcs:avplayer[^>]+data-content[Ii]d\s*=\s*|content[Ii]d\s*:\s*)(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'video id', default=video_id, group='id')
|
||||
config = self._download_json(config_url, video_id, 'Downloading player config')
|
||||
url_template = NFLIE.prepend_host(
|
||||
|
@@ -1,14 +1,15 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import ExtractorError
|
||||
|
||||
|
||||
class NhkVodIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/en/vod/(?P<id>.+?)\.html'
|
||||
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/en/vod/(?P<id>[^/]+/[^/?#&]+)'
|
||||
_TEST = {
|
||||
# Videos available only for a limited period of time. Visit
|
||||
# http://www3.nhk.or.jp/nhkworld/en/vod/ for working samples.
|
||||
'url': 'http://www3.nhk.or.jp/nhkworld/en/vod/tokyofashion/20160815.html',
|
||||
'url': 'http://www3.nhk.or.jp/nhkworld/en/vod/tokyofashion/20160815',
|
||||
'info_dict': {
|
||||
'id': 'A1bnNiNTE6nY3jLllS-BIISfcC_PpvF5',
|
||||
'ext': 'flv',
|
||||
@@ -19,25 +20,25 @@ class NhkVodIE(InfoExtractor):
|
||||
},
|
||||
'skip': 'Videos available only for a limited period of time',
|
||||
}
|
||||
_API_URL = 'http://api.nhk.or.jp/nhkworld/vodesdlist/v1/all/all/all.json?apikey=EJfK8jdS57GqlupFgAfAAwr573q01y6k'
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
data = self._download_json(self._API_URL, video_id)
|
||||
|
||||
embed_code = self._search_regex(
|
||||
r'nw_vod_ooplayer\([^,]+,\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'ooyala embed code', group='id')
|
||||
try:
|
||||
episode = next(
|
||||
e for e in data['data']['episodes']
|
||||
if e.get('url') and video_id in e['url'])
|
||||
except StopIteration:
|
||||
raise ExtractorError('Unable to find episode')
|
||||
|
||||
title = self._search_regex(
|
||||
r'<div[^>]+class=["\']episode-detail["\']>\s*<h\d+>([^<]+)',
|
||||
webpage, 'title', default=None)
|
||||
description = self._html_search_regex(
|
||||
r'(?s)<p[^>]+class=["\']description["\'][^>]*>(.+?)</p>',
|
||||
webpage, 'description', default=None)
|
||||
series = self._search_regex(
|
||||
r'<h2[^>]+class=["\']detail-top-player-title[^>]+><a[^>]+>([^<]+)',
|
||||
webpage, 'series', default=None)
|
||||
embed_code = episode['vod_id']
|
||||
|
||||
title = episode.get('sub_title_clean') or episode['sub_title']
|
||||
description = episode.get('description_clean') or episode.get('description')
|
||||
series = episode.get('title_clean') or episode.get('title')
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
|
@@ -252,7 +252,7 @@ class NiconicoIE(InfoExtractor):
|
||||
|
||||
|
||||
class NiconicoPlaylistIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.nicovideo\.jp/mylist/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?nicovideo\.jp/mylist/(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.nicovideo.jp/mylist/27411728',
|
||||
|
@@ -429,7 +429,7 @@ class SchoolTVIE(InfoExtractor):
|
||||
display_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_id = self._search_regex(
|
||||
r'data-mid=(["\'])(?P<id>.+?)\1', webpage, 'video_id', group='id')
|
||||
r'data-mid=(["\'])(?P<id>(?:(?!\1).)+)\1', webpage, 'video_id', group='id')
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'ie_key': 'NPO',
|
||||
|
@@ -5,7 +5,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class OktoberfestTVIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.oktoberfest-tv\.de/[^/]+/[^/]+/video/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?oktoberfest-tv\.de/[^/]+/[^/]+/video/(?P<id>[^/?#]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.oktoberfest-tv.de/de/kameras/video/hb-zelt',
|
||||
|
@@ -47,7 +47,7 @@ class OoyalaBaseIE(InfoExtractor):
|
||||
delivery_type = stream['delivery_type']
|
||||
if delivery_type == 'hls' or ext == 'm3u8':
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
s_url, embed_code, 'mp4', 'm3u8_native',
|
||||
re.sub(r'/ip(?:ad|hone)/', '/all/', s_url), embed_code, 'mp4', 'm3u8_native',
|
||||
m3u8_id='hls', fatal=False))
|
||||
elif delivery_type == 'hds' or ext == 'f4m':
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
|
@@ -13,7 +13,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class OpenloadIE(InfoExtractor):
|
||||
_VALID_URL = r'https://openload.(?:co|io)/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
|
||||
_VALID_URL = r'https?://openload\.(?:co|io)/(?:f|embed)/(?P<id>[a-zA-Z0-9-_]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'https://openload.co/f/kUEfGclsU9o',
|
||||
@@ -51,7 +51,8 @@ class OpenloadIE(InfoExtractor):
|
||||
# declared to be freely used in youtube-dl
|
||||
# See https://github.com/rg3/youtube-dl/issues/10408
|
||||
enc_data = self._html_search_regex(
|
||||
r'<span[^>]+id="hiddenurl"[^>]*>([^<]+)</span>', webpage, 'encrypted data')
|
||||
r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
|
||||
webpage, 'encrypted data')
|
||||
|
||||
video_url_chars = []
|
||||
|
||||
@@ -60,7 +61,7 @@ class OpenloadIE(InfoExtractor):
|
||||
if j >= 33 and j <= 126:
|
||||
j = ((j + 14) % 94) + 33
|
||||
if idx == len(enc_data) - 1:
|
||||
j += 3
|
||||
j += 2
|
||||
video_url_chars += compat_chr(j)
|
||||
|
||||
video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
|
||||
|
@@ -94,7 +94,7 @@ class PeriscopeIE(PeriscopeBaseIE):
|
||||
|
||||
|
||||
class PeriscopeUserIE(PeriscopeBaseIE):
|
||||
_VALID_URL = r'https?://www\.periscope\.tv/(?P<id>[^/]+)/?$'
|
||||
_VALID_URL = r'https?://(?:www\.)?periscope\.tv/(?P<id>[^/]+)/?$'
|
||||
IE_DESC = 'Periscope user videos'
|
||||
IE_NAME = 'periscope:user'
|
||||
|
||||
|
@@ -14,7 +14,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class PlayvidIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.playvid\.com/watch(\?v=|/)(?P<id>.+?)(?:#|$)'
|
||||
_VALID_URL = r'https?://(?:www\.)?playvid\.com/watch(\?v=|/)(?P<id>.+?)(?:#|$)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.playvid.com/watch/RnmBNgtrrJu',
|
||||
'md5': 'ffa2f6b2119af359f544388d8c01eb6c',
|
||||
|
@@ -122,7 +122,7 @@ class ProSiebenSat1BaseIE(InfoExtractor):
|
||||
class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
IE_NAME = 'prosiebensat1'
|
||||
IE_DESC = 'ProSiebenSat.1 Digital'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:(?:prosieben|prosiebenmaxx|sixx|sat1|kabeleins|the-voice-of-germany|7tv)\.(?:de|at|ch)|ran\.de|fem\.com)/(?P<id>.+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:(?:prosieben|prosiebenmaxx|sixx|sat1|kabeleins|the-voice-of-germany|7tv|kabeleinsdoku)\.(?:de|at|ch)|ran\.de|fem\.com)/(?P<id>.+)'
|
||||
|
||||
_TESTS = [
|
||||
{
|
||||
@@ -290,6 +290,11 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
|
||||
'skip_download': True,
|
||||
},
|
||||
},
|
||||
{
|
||||
# geo restricted to Germany
|
||||
'url': 'http://www.kabeleinsdoku.de/tv/mayday-alarm-im-cockpit/video/102-notlandung-im-hudson-river-ganze-folge',
|
||||
'only_matching': True,
|
||||
},
|
||||
]
|
||||
|
||||
_TOKEN = 'prosieben'
|
||||
|
@@ -18,7 +18,7 @@ from ..utils import (
|
||||
class QQMusicIE(InfoExtractor):
|
||||
IE_NAME = 'qqmusic'
|
||||
IE_DESC = 'QQ音乐'
|
||||
_VALID_URL = r'https?://y.qq.com/#type=song&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
_VALID_URL = r'https?://y\.qq\.com/#type=song&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://y.qq.com/#type=song&mid=004295Et37taLD',
|
||||
'md5': '9ce1c1c8445f561506d2e3cfb0255705',
|
||||
@@ -172,7 +172,7 @@ class QQPlaylistBaseIE(InfoExtractor):
|
||||
class QQMusicSingerIE(QQPlaylistBaseIE):
|
||||
IE_NAME = 'qqmusic:singer'
|
||||
IE_DESC = 'QQ音乐 - 歌手'
|
||||
_VALID_URL = r'https?://y.qq.com/#type=singer&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
_VALID_URL = r'https?://y\.qq\.com/#type=singer&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
_TEST = {
|
||||
'url': 'http://y.qq.com/#type=singer&mid=001BLpXF2DyJe2',
|
||||
'info_dict': {
|
||||
@@ -217,7 +217,7 @@ class QQMusicSingerIE(QQPlaylistBaseIE):
|
||||
class QQMusicAlbumIE(QQPlaylistBaseIE):
|
||||
IE_NAME = 'qqmusic:album'
|
||||
IE_DESC = 'QQ音乐 - 专辑'
|
||||
_VALID_URL = r'https?://y.qq.com/#type=album&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
_VALID_URL = r'https?://y\.qq\.com/#type=album&mid=(?P<id>[0-9A-Za-z]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://y.qq.com/#type=album&mid=000gXCTb2AhRR1',
|
||||
|
@@ -13,6 +13,7 @@ from ..utils import (
|
||||
xpath_element,
|
||||
ExtractorError,
|
||||
determine_protocol,
|
||||
unsmuggle_url,
|
||||
)
|
||||
|
||||
|
||||
@@ -35,28 +36,51 @@ class RadioCanadaIE(InfoExtractor):
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
url, smuggled_data = unsmuggle_url(url, {})
|
||||
app_code, video_id = re.match(self._VALID_URL, url).groups()
|
||||
|
||||
device_types = ['ipad', 'android']
|
||||
metadata = self._download_xml(
|
||||
'http://api.radio-canada.ca/metaMedia/v1/index.ashx',
|
||||
video_id, note='Downloading metadata XML', query={
|
||||
'appCode': app_code,
|
||||
'idMedia': video_id,
|
||||
})
|
||||
|
||||
def get_meta(name):
|
||||
el = find_xpath_attr(metadata, './/Meta', 'name', name)
|
||||
return el.text if el is not None else None
|
||||
|
||||
if get_meta('protectionType'):
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
|
||||
device_types = ['ipad']
|
||||
if app_code != 'toutv':
|
||||
device_types.append('flash')
|
||||
if not smuggled_data:
|
||||
device_types.append('android')
|
||||
|
||||
formats = []
|
||||
# TODO: extract f4m formats
|
||||
# f4m formats can be extracted using flashhd device_type but they produce unplayable file
|
||||
for device_type in device_types:
|
||||
v_data = self._download_xml(
|
||||
'http://api.radio-canada.ca/validationMedia/v1/Validation.ashx',
|
||||
video_id, note='Downloading %s XML' % device_type, query={
|
||||
'appCode': app_code,
|
||||
'idMedia': video_id,
|
||||
'connectionType': 'broadband',
|
||||
'multibitrate': 'true',
|
||||
'deviceType': device_type,
|
||||
validation_url = 'http://api.radio-canada.ca/validationMedia/v1/Validation.ashx'
|
||||
query = {
|
||||
'appCode': app_code,
|
||||
'idMedia': video_id,
|
||||
'connectionType': 'broadband',
|
||||
'multibitrate': 'true',
|
||||
'deviceType': device_type,
|
||||
}
|
||||
if smuggled_data:
|
||||
validation_url = 'https://services.radio-canada.ca/media/validation/v2/'
|
||||
query.update(smuggled_data)
|
||||
else:
|
||||
query.update({
|
||||
# paysJ391wsHjbOJwvCs26toz and bypasslock are used to bypass geo-restriction
|
||||
'paysJ391wsHjbOJwvCs26toz': 'CA',
|
||||
'bypasslock': 'NZt5K62gRqfc',
|
||||
}, fatal=False)
|
||||
})
|
||||
v_data = self._download_xml(validation_url, video_id, note='Downloading %s XML' % device_type, query=query, fatal=False)
|
||||
v_url = xpath_text(v_data, 'url')
|
||||
if not v_url:
|
||||
continue
|
||||
@@ -101,17 +125,6 @@ class RadioCanadaIE(InfoExtractor):
|
||||
f4m_id='hds', fatal=False))
|
||||
self._sort_formats(formats)
|
||||
|
||||
metadata = self._download_xml(
|
||||
'http://api.radio-canada.ca/metaMedia/v1/index.ashx',
|
||||
video_id, note='Downloading metadata XML', query={
|
||||
'appCode': app_code,
|
||||
'idMedia': video_id,
|
||||
})
|
||||
|
||||
def get_meta(name):
|
||||
el = find_xpath_attr(metadata, './/Meta', 'name', name)
|
||||
return el.text if el is not None else None
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': get_meta('Title'),
|
||||
|
@@ -5,7 +5,7 @@ from .internetvideoarchive import InternetVideoArchiveIE
|
||||
|
||||
|
||||
class RottenTomatoesIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.rottentomatoes\.com/m/[^/]+/trailers/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?rottentomatoes\.com/m/[^/]+/trailers/(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.rottentomatoes.com/m/toy_story_3/trailers/11028566/',
|
||||
|
@@ -7,7 +7,7 @@ from ..utils import unified_strdate, determine_ext
|
||||
|
||||
|
||||
class RoxwelIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.roxwel\.com/player/(?P<filename>.+?)(\.|\?|$)'
|
||||
_VALID_URL = r'https?://(?:www\.)?roxwel\.com/player/(?P<filename>.+?)(\.|\?|$)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.roxwel.com/player/passionpittakeawalklive.html',
|
||||
|
@@ -64,7 +64,7 @@ def _decrypt_url(png):
|
||||
class RTVEALaCartaIE(InfoExtractor):
|
||||
IE_NAME = 'rtve.es:alacarta'
|
||||
IE_DESC = 'RTVE a la carta'
|
||||
_VALID_URL = r'https?://www\.rtve\.es/(m/)?(alacarta/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?rtve\.es/(m/)?(alacarta/videos|filmoteca)/[^/]+/[^/]+/(?P<id>\d+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.rtve.es/alacarta/videos/balonmano/o-swiss-cup-masculina-final-espana-suecia/2491869/',
|
||||
@@ -184,7 +184,7 @@ class RTVEInfantilIE(InfoExtractor):
|
||||
class RTVELiveIE(InfoExtractor):
|
||||
IE_NAME = 'rtve.es:live'
|
||||
IE_DESC = 'RTVE.es live streams'
|
||||
_VALID_URL = r'https?://www\.rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?rtve\.es/directo/(?P<id>[a-zA-Z0-9-]+)'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.rtve.es/directo/la-1/',
|
||||
@@ -226,7 +226,7 @@ class RTVELiveIE(InfoExtractor):
|
||||
|
||||
class RTVETelevisionIE(InfoExtractor):
|
||||
IE_NAME = 'rtve.es:television'
|
||||
_VALID_URL = r'https?://www\.rtve\.es/television/[^/]+/[^/]+/(?P<id>\d+).shtml'
|
||||
_VALID_URL = r'https?://(?:www\.)?rtve\.es/television/[^/]+/[^/]+/(?P<id>\d+).shtml'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.rtve.es/television/20160628/revolucion-del-movil/1364141.shtml',
|
||||
|
@@ -103,13 +103,13 @@ class SafariIE(SafariBaseIE):
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
reference_id = self._search_regex(
|
||||
r'data-reference-id=(["\'])(?P<id>.+?)\1',
|
||||
r'data-reference-id=(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'kaltura reference id', group='id')
|
||||
partner_id = self._search_regex(
|
||||
r'data-partner-id=(["\'])(?P<id>.+?)\1',
|
||||
r'data-partner-id=(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'kaltura widget id', group='id')
|
||||
ui_id = self._search_regex(
|
||||
r'data-ui-id=(["\'])(?P<id>.+?)\1',
|
||||
r'data-ui-id=(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'kaltura uiconf id', group='id')
|
||||
|
||||
query = {
|
||||
|
@@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class ScreenJunkiesIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www.screenjunkies.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
|
||||
_VALID_URL = r'https?://(?:www\.)?screenjunkies\.com/video/(?P<display_id>[^/]+?)(?:-(?P<id>\d+))?(?:[/?#&]|$)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.screenjunkies.com/video/best-quentin-tarantino-movie-2841915',
|
||||
'md5': '5c2b686bec3d43de42bde9ec047536b0',
|
||||
|
@@ -48,7 +48,7 @@ class SenateISVPIE(InfoExtractor):
|
||||
['arch', '', 'http://ussenate-f.akamaihd.net/']
|
||||
]
|
||||
_IE_NAME = 'senate.gov'
|
||||
_VALID_URL = r'https?://www\.senate\.gov/isvp/?\?(?P<qs>.+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?senate\.gov/isvp/?\?(?P<qs>.+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.senate.gov/isvp/?comm=judiciary&type=live&stt=&filename=judiciary031715&auto_play=false&wmode=transparent&poster=http%3A%2F%2Fwww.judiciary.senate.gov%2Fthemes%2Fjudiciary%2Fimages%2Fvideo-poster-flash-fit.png',
|
||||
'info_dict': {
|
||||
|
@@ -14,7 +14,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class SlideshareIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.slideshare\.net/[^/]+?/(?P<title>.+?)($|\?)'
|
||||
_VALID_URL = r'https?://(?:www\.)?slideshare\.net/[^/]+?/(?P<title>.+?)($|\?)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.slideshare.net/Dataversity/keynote-presentation-managing-scale-and-complexity',
|
||||
|
@@ -53,6 +53,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'uploader': 'E.T. ExTerrestrial Music',
|
||||
'title': 'Lostin Powers - She so Heavy (SneakPreview) Adrian Ackers Blueprint 1',
|
||||
'duration': 143,
|
||||
'license': 'all-rights-reserved',
|
||||
}
|
||||
},
|
||||
# not streamable song
|
||||
@@ -66,6 +67,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'uploader': 'The Royal Concept',
|
||||
'upload_date': '20120521',
|
||||
'duration': 227,
|
||||
'license': 'all-rights-reserved',
|
||||
},
|
||||
'params': {
|
||||
# rtmp
|
||||
@@ -84,6 +86,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'description': 'test chars: \"\'/\\ä↭',
|
||||
'upload_date': '20131209',
|
||||
'duration': 9,
|
||||
'license': 'all-rights-reserved',
|
||||
},
|
||||
},
|
||||
# private link (alt format)
|
||||
@@ -98,6 +101,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'description': 'test chars: \"\'/\\ä↭',
|
||||
'upload_date': '20131209',
|
||||
'duration': 9,
|
||||
'license': 'all-rights-reserved',
|
||||
},
|
||||
},
|
||||
# downloadable song
|
||||
@@ -112,6 +116,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
'uploader': 'oddsamples',
|
||||
'upload_date': '20140109',
|
||||
'duration': 17,
|
||||
'license': 'cc-by-sa',
|
||||
},
|
||||
},
|
||||
]
|
||||
@@ -138,20 +143,20 @@ class SoundcloudIE(InfoExtractor):
|
||||
name = full_title or track_id
|
||||
if quiet:
|
||||
self.report_extraction(name)
|
||||
|
||||
thumbnail = info['artwork_url']
|
||||
if thumbnail is not None:
|
||||
thumbnail = info.get('artwork_url')
|
||||
if isinstance(thumbnail, compat_str):
|
||||
thumbnail = thumbnail.replace('-large', '-t500x500')
|
||||
ext = 'mp3'
|
||||
result = {
|
||||
'id': track_id,
|
||||
'uploader': info['user']['username'],
|
||||
'upload_date': unified_strdate(info['created_at']),
|
||||
'uploader': info.get('user', {}).get('username'),
|
||||
'upload_date': unified_strdate(info.get('created_at')),
|
||||
'title': info['title'],
|
||||
'description': info['description'],
|
||||
'description': info.get('description'),
|
||||
'thumbnail': thumbnail,
|
||||
'duration': int_or_none(info.get('duration'), 1000),
|
||||
'webpage_url': info.get('permalink_url'),
|
||||
'license': info.get('license'),
|
||||
}
|
||||
formats = []
|
||||
if info.get('downloadable', False):
|
||||
@@ -221,7 +226,7 @@ class SoundcloudIE(InfoExtractor):
|
||||
raise ExtractorError('Invalid URL: %s' % url)
|
||||
|
||||
track_id = mobj.group('track_id')
|
||||
token = None
|
||||
|
||||
if track_id is not None:
|
||||
info_json_url = 'http://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
|
||||
full_title = track_id
|
||||
@@ -472,7 +477,11 @@ class SoundcloudPlaylistIE(SoundcloudIE):
|
||||
data = self._download_json(
|
||||
base_url + data, playlist_id, 'Downloading playlist')
|
||||
|
||||
entries = [self.url_result(track['permalink_url'], 'Soundcloud') for track in data['tracks']]
|
||||
entries = [
|
||||
self.url_result(
|
||||
track['permalink_url'], SoundcloudIE.ie_key(),
|
||||
video_id=compat_str(track['id']) if track.get('id') else None)
|
||||
for track in data['tracks'] if track.get('permalink_url')]
|
||||
|
||||
return {
|
||||
'_type': 'playlist',
|
||||
|
@@ -103,7 +103,7 @@ class SpiegelIE(InfoExtractor):
|
||||
|
||||
|
||||
class SpiegelArticleIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?spiegel\.de/(?!video/)[^?#]*?-(?P<id>[0-9]+)\.html'
|
||||
IE_NAME = 'Spiegel:Article'
|
||||
IE_DESC = 'Articles on spiegel.de'
|
||||
_TESTS = [{
|
||||
|
@@ -16,7 +16,7 @@ class SVTBaseIE(InfoExtractor):
|
||||
def _extract_video(self, video_info, video_id):
|
||||
formats = []
|
||||
for vr in video_info['videoReferences']:
|
||||
player_type = vr.get('playerType')
|
||||
player_type = vr.get('playerType') or vr.get('format')
|
||||
vurl = vr['url']
|
||||
ext = determine_ext(vurl)
|
||||
if ext == 'm3u8':
|
||||
|
@@ -8,7 +8,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class SyfyIE(AdobePassIE):
|
||||
_VALID_URL = r'https?://www\.syfy\.com/(?:[^/]+/)?videos/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?syfy\.com/(?:[^/]+/)?videos/(?P<id>[^/?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.syfy.com/theinternetruinedmylife/videos/the-internet-ruined-my-life-season-1-trailer',
|
||||
'info_dict': {
|
||||
|
@@ -4,10 +4,7 @@ from __future__ import unicode_literals
|
||||
import re
|
||||
|
||||
from .turner import TurnerBaseIE
|
||||
from ..utils import (
|
||||
extract_attributes,
|
||||
ExtractorError,
|
||||
)
|
||||
from ..utils import extract_attributes
|
||||
|
||||
|
||||
class TBSIE(TurnerBaseIE):
|
||||
@@ -37,10 +34,6 @@ class TBSIE(TurnerBaseIE):
|
||||
site = domain[:3]
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
video_params = extract_attributes(self._search_regex(r'(<[^>]+id="page-video"[^>]*>)', webpage, 'video params'))
|
||||
if video_params.get('isAuthRequired') == 'true':
|
||||
raise ExtractorError(
|
||||
'This video is only available via cable service provider subscription that'
|
||||
' is not currently supported.', expected=True)
|
||||
query = None
|
||||
clip_id = video_params.get('clipid')
|
||||
if clip_id:
|
||||
@@ -56,4 +49,8 @@ class TBSIE(TurnerBaseIE):
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/%s/big' % site,
|
||||
'tokenizer_src': 'http://www.%s.com/video/processors/services/token_ipadAdobe.do' % domain,
|
||||
},
|
||||
}, {
|
||||
'url': url,
|
||||
'site_name': site.upper(),
|
||||
'auth_required': video_params.get('isAuthRequired') != 'false',
|
||||
})
|
||||
|
@@ -7,7 +7,7 @@ from .ooyala import OoyalaIE
|
||||
|
||||
|
||||
class TeachingChannelIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.teachingchannel\.org/videos/(?P<title>.+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?teachingchannel\.org/videos/(?P<title>.+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'https://www.teachingchannel.org/videos/teacher-teaming-evolution',
|
||||
|
@@ -6,7 +6,7 @@ from .mitele import MiTeleBaseIE
|
||||
|
||||
class TelecincoIE(MiTeleBaseIE):
|
||||
IE_DESC = 'telecinco.es, cuatro.com and mediaset.es'
|
||||
_VALID_URL = r'https?://www\.(?:telecinco\.es|cuatro\.com|mediaset\.es)/(?:[^/]+/)+(?P<id>.+?)\.html'
|
||||
_VALID_URL = r'https?://(?:www\.)?(?:telecinco\.es|cuatro\.com|mediaset\.es)/(?:[^/]+/)+(?P<id>.+?)\.html'
|
||||
|
||||
_TESTS = [{
|
||||
'url': 'http://www.telecinco.es/robinfood/temporada-01/t01xp14/Bacalao-cocochas-pil-pil_0_1876350223.html',
|
||||
|
@@ -5,7 +5,7 @@ from .common import InfoExtractor
|
||||
|
||||
|
||||
class TelewebionIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://www\.telewebion\.com/#!/episode/(?P<id>\d+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?telewebion\.com/#!/episode/(?P<id>\d+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.telewebion.com/#!/episode/1263668/',
|
||||
|
@@ -11,7 +11,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class TheInterceptIE(InfoExtractor):
|
||||
_VALID_URL = r'https://theintercept.com/fieldofvision/(?P<id>[^/?#]+)'
|
||||
_VALID_URL = r'https?://theintercept\.com/fieldofvision/(?P<id>[^/?#]+)'
|
||||
_TESTS = [{
|
||||
'url': 'https://theintercept.com/fieldofvision/thisisacoup-episode-four-surrender-or-die/',
|
||||
'md5': '145f28b41d44aab2f87c0a4ac8ec95bd',
|
||||
|
@@ -7,7 +7,7 @@ from ..utils import qualities
|
||||
|
||||
|
||||
class TheSceneIE(InfoExtractor):
|
||||
_VALID_URL = r'https://thescene\.com/watch/[^/]+/(?P<id>[^/#?]+)'
|
||||
_VALID_URL = r'https?://thescene\.com/watch/[^/]+/(?P<id>[^/#?]+)'
|
||||
|
||||
_TEST = {
|
||||
'url': 'https://thescene.com/watch/vogue/narciso-rodriguez-spring-2013-ready-to-wear',
|
||||
|
@@ -3,13 +3,13 @@ from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import determine_ext
|
||||
from .jwplatform import JWPlatformBaseIE
|
||||
from ..utils import remove_end
|
||||
|
||||
|
||||
class ThisAVIE(InfoExtractor):
|
||||
class ThisAVIE(JWPlatformBaseIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?thisav\.com/video/(?P<id>[0-9]+)/.*'
|
||||
_TEST = {
|
||||
_TESTS = [{
|
||||
'url': 'http://www.thisav.com/video/47734/%98%26sup1%3B%83%9E%83%82---just-fit.html',
|
||||
'md5': '0480f1ef3932d901f0e0e719f188f19b',
|
||||
'info_dict': {
|
||||
@@ -19,29 +19,49 @@ class ThisAVIE(InfoExtractor):
|
||||
'uploader': 'dj7970',
|
||||
'uploader_id': 'dj7970'
|
||||
}
|
||||
}
|
||||
}, {
|
||||
'url': 'http://www.thisav.com/video/242352/nerdy-18yo-big-ass-tattoos-and-glasses.html',
|
||||
'md5': 'ba90c076bd0f80203679e5b60bf523ee',
|
||||
'info_dict': {
|
||||
'id': '242352',
|
||||
'ext': 'mp4',
|
||||
'title': 'Nerdy 18yo Big Ass Tattoos and Glasses',
|
||||
'uploader': 'cybersluts',
|
||||
'uploader_id': 'cybersluts',
|
||||
},
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
|
||||
video_id = mobj.group('id')
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
title = self._html_search_regex(r'<h1>([^<]*)</h1>', webpage, 'title')
|
||||
title = remove_end(self._html_search_regex(
|
||||
r'<title>([^<]+)</title>', webpage, 'title'),
|
||||
' - 視頻 - ThisAV.com-世界第一中文成人娛樂網站')
|
||||
video_url = self._html_search_regex(
|
||||
r"addVariable\('file','([^']+)'\);", webpage, 'video url')
|
||||
r"addVariable\('file','([^']+)'\);", webpage, 'video url', default=None)
|
||||
if video_url:
|
||||
info_dict = {
|
||||
'formats': [{
|
||||
'url': video_url,
|
||||
}],
|
||||
}
|
||||
else:
|
||||
info_dict = self._extract_jwplayer_data(
|
||||
webpage, video_id, require_title=False)
|
||||
uploader = self._html_search_regex(
|
||||
r': <a href="http://www.thisav.com/user/[0-9]+/(?:[^"]+)">([^<]+)</a>',
|
||||
webpage, 'uploader name', fatal=False)
|
||||
uploader_id = self._html_search_regex(
|
||||
r': <a href="http://www.thisav.com/user/[0-9]+/([^"]+)">(?:[^<]+)</a>',
|
||||
webpage, 'uploader id', fatal=False)
|
||||
ext = determine_ext(video_url)
|
||||
|
||||
return {
|
||||
info_dict.update({
|
||||
'id': video_id,
|
||||
'url': video_url,
|
||||
'uploader': uploader,
|
||||
'uploader_id': uploader_id,
|
||||
'title': title,
|
||||
'ext': ext,
|
||||
}
|
||||
})
|
||||
|
||||
return info_dict
|
||||
|
@@ -13,7 +13,7 @@ from ..compat import (
|
||||
|
||||
class TlcDeIE(InfoExtractor):
|
||||
IE_NAME = 'tlc.de'
|
||||
_VALID_URL = r'https?://www\.tlc\.de/(?:[^/]+/)*videos/(?P<title>[^/?#]+)?(?:.*#(?P<id>\d+))?'
|
||||
_VALID_URL = r'https?://(?:www\.)?tlc\.de/(?:[^/]+/)*videos/(?P<title>[^/?#]+)?(?:.*#(?P<id>\d+))?'
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://www.tlc.de/sendungen/breaking-amish/videos/#3235167922001',
|
||||
|
@@ -2,12 +2,22 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
js_to_json,
|
||||
ExtractorError,
|
||||
urlencode_postdata,
|
||||
extract_attributes,
|
||||
smuggle_url,
|
||||
)
|
||||
|
||||
|
||||
class TouTvIE(InfoExtractor):
|
||||
_NETRC_MACHINE = 'toutv'
|
||||
IE_NAME = 'tou.tv'
|
||||
_VALID_URL = r'https?://ici\.tou\.tv/(?P<id>[a-zA-Z0-9_-]+/S[0-9]+E[0-9]+)'
|
||||
_access_token = None
|
||||
_claims = None
|
||||
|
||||
_TEST = {
|
||||
'url': 'http://ici.tou.tv/garfield-tout-court/S2015E17',
|
||||
@@ -22,18 +32,64 @@ class TouTvIE(InfoExtractor):
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': '404 Not Found',
|
||||
}
|
||||
|
||||
def _real_initialize(self):
|
||||
email, password = self._get_login_info()
|
||||
if email is None:
|
||||
return
|
||||
state = 'http://ici.tou.tv//'
|
||||
webpage = self._download_webpage(state, None, 'Downloading homepage')
|
||||
toutvlogin = self._parse_json(self._search_regex(
|
||||
r'(?s)toutvlogin\s*=\s*({.+?});', webpage, 'toutvlogin'), None, js_to_json)
|
||||
authorize_url = toutvlogin['host'] + '/auth/oauth/v2/authorize'
|
||||
login_webpage = self._download_webpage(
|
||||
authorize_url, None, 'Downloading login page', query={
|
||||
'client_id': toutvlogin['clientId'],
|
||||
'redirect_uri': 'https://ici.tou.tv/login/loginCallback',
|
||||
'response_type': 'token',
|
||||
'scope': 'media-drmt openid profile email id.write media-validation.read.privileged',
|
||||
'state': state,
|
||||
})
|
||||
login_form = self._search_regex(
|
||||
r'(?s)(<form[^>]+id="Form-login".+?</form>)', login_webpage, 'login form')
|
||||
form_data = self._hidden_inputs(login_form)
|
||||
form_data.update({
|
||||
'login-email': email,
|
||||
'login-password': password,
|
||||
})
|
||||
post_url = extract_attributes(login_form).get('action') or authorize_url
|
||||
_, urlh = self._download_webpage_handle(
|
||||
post_url, None, 'Logging in', data=urlencode_postdata(form_data))
|
||||
self._access_token = self._search_regex(
|
||||
r'access_token=([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
|
||||
urlh.geturl(), 'access token')
|
||||
self._claims = self._download_json(
|
||||
'https://services.radio-canada.ca/media/validation/v2/getClaims',
|
||||
None, 'Extracting Claims', query={
|
||||
'token': self._access_token,
|
||||
'access_token': self._access_token,
|
||||
})['claims']
|
||||
|
||||
def _real_extract(self, url):
|
||||
path = self._match_id(url)
|
||||
metadata = self._download_json('http://ici.tou.tv/presentation/%s' % path, path)
|
||||
if metadata.get('IsDrm'):
|
||||
raise ExtractorError('This video is DRM protected.', expected=True)
|
||||
video_id = metadata['IdMedia']
|
||||
details = metadata['Details']
|
||||
title = details['OriginalTitle']
|
||||
video_url = 'radiocanada:%s:%s' % (metadata.get('AppCode', 'toutv'), video_id)
|
||||
if self._access_token and self._claims:
|
||||
video_url = smuggle_url(video_url, {
|
||||
'access_token': self._access_token,
|
||||
'claims': self._claims,
|
||||
})
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'url': 'radiocanada:%s:%s' % (metadata.get('AppCode', 'toutv'), video_id),
|
||||
'url': video_url,
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'thumbnail': details.get('ImageUrl'),
|
||||
|
@@ -22,9 +22,17 @@ class TruTVIE(TurnerBaseIE):
|
||||
|
||||
def _real_extract(self, url):
|
||||
path, video_id = re.match(self._VALID_URL, url).groups()
|
||||
auth_required = False
|
||||
if path:
|
||||
data_src = 'http://www.trutv.com/video/cvp/v2/xml/content.xml?id=%s.xml' % path
|
||||
else:
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
video_id = self._search_regex(
|
||||
r"TTV\.TVE\.episodeId\s*=\s*'([^']+)';",
|
||||
webpage, 'video id', default=video_id)
|
||||
auth_required = self._search_regex(
|
||||
r'TTV\.TVE\.authRequired\s*=\s*(true|false);',
|
||||
webpage, 'auth required', default='false') == 'true'
|
||||
data_src = 'http://www.trutv.com/tveverywhere/services/cvpXML.do?titleId=' + video_id
|
||||
return self._extract_cvp_info(
|
||||
data_src, path, {
|
||||
@@ -32,4 +40,8 @@ class TruTVIE(TurnerBaseIE):
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/trutv/big',
|
||||
'tokenizer_src': 'http://www.trutv.com/tveverywhere/processors/services/token_ipadAdobe.do',
|
||||
},
|
||||
}, {
|
||||
'url': url,
|
||||
'site_name': 'truTV',
|
||||
'auth_required': auth_required,
|
||||
})
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user