Compare commits
125 commits: 2016.09.18 ... 2016.10.07
SHA1s of the 125 commits in this comparison:

dd4291f729, 888f8d6ba4, f475e88121, 3c6b3bf221, 38588ab977, 85bcdd081c, 9dcd6fd3aa, 98763ee354,
3d83a1ae92, c0a7b9b348, 831a34caa2, 09b9c45e24, 33898fb19c, 017eb82934, b1d798887e, 0a33bb2cb2,
185744f92f, 7232e54813, 6eb5503b12, 539c881bfc, c1b2a0858c, 215ff6e0f3, dcdb292fdd, c1084ddb0c,
ee5de4e38e, 25291b979a, 567a5996ca, 6c152ce20f, 26406d33c7, 703b3afa93, 99ed78c79e, fd15264172,
bd26441205, b19e275d99, f6ba581f89, 6d2549fb4f, 4da4516973, e1e97c2446, 53a7e3d287, d54739a2e6,
63e0fd5bcc, 9c51a24642, 9bd7bd0b80, 4a76b73c6c, e295618f9e, d7753d1948, eaf9b22f94, a1001f47fc,
1609782258, de6babf922, b0582fc806, af33dd8ee7, 70d7b323b6, a7ee8a00f4, c6eed6b8c0, 3aa3953d28,
efa97bdcf1, 475f8a4580, 93aa0b6318, 0ce26ef228, 0d72ff9c51, a56e74e271, f533490bb7, 8bfda726c2,
8f0cf20ab9, c8f45f763c, dd2cffeeec, cdfcc4ce95, e384552590, 1a2fbe322e, f9dd86a112, 2342733f85,
93933c9819, d75d9e343e, 72c3d02d29, d3dbb46330, fffb9cff94, d3c97bad61, 2d5b4af007, f1ee462c82,
5742c18bc1, ddb19772d5, a3d8b38168, e590b7ff9e, f3625cc4ca, 2d3d29976b, 493353c7fd, 0a078550b9,
f92bb612c6, ddde91952f, 63c583eb2c, 7fd57de6fb, e71a450956, 27e99078d3, 6f126d903f, 7518a61d41,
8e45e1cc4d, f0bc5a8609, a54ffb8aa7, 8add4bfecb, 0711995bca, 5968d7d2fe, e6332059ac, 8eec691e8a,
24628cf7db, 71ad00c09f, 45cae3b021, 4ddcb5999d, 628406db96, e3d6bdc8fc, 0a439c5c4c, 1978540a51,
12f211d0cb, 3a5a18705f, 1ae0ae5db0, f62a77b99a, 4bfd294e2f, e33a7253b2, c38f06818d, cb57386873,
59fd8f931d, 70b4cf9b1b, cc764a6da8, d8dbf8707d, a1da888d0c
.github/ISSUE_TEMPLATE.md (vendored, 6 changed lines)

@@ -6,8 +6,8 @@
 ---
-### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.09.18*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
-- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.09.18**
+### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.10.07*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
+- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.10.07**

 ### Before submitting an *issue* make sure you have:
 - [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections

@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
-[debug] youtube-dl version 2016.09.18
+[debug] youtube-dl version 2016.10.07
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
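Both versions of the template ask reporters to paste the full verbose log produced by `youtube-dl -v <your command line>`. For reference, a minimal sketch of producing the same `[debug]` output from Python rather than the shell, assuming the `youtube_dl` package (2016.10.07) is importable; the URL is simply the example from the template above:

```python
# Minimal sketch: programmatic equivalent of `youtube-dl -v <URL>`.
# Assumes youtube-dl is installed; prints the same [debug] diagnostics.
from __future__ import unicode_literals

import youtube_dl

ydl_opts = {
    'verbose': True,   # same as the -v flag shown in the template
    'simulate': True,  # resolve and print info without downloading
}

with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKcj'])
```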
.github/PULL_REQUEST_TEMPLATE.md (vendored, 5 changed lines)

@@ -10,8 +10,13 @@
- [ ] At least skimmed through [adding new extractor tutorial](https://github.com/rg3/youtube-dl#adding-support-for-a-new-site) and [youtube-dl coding conventions](https://github.com/rg3/youtube-dl#youtube-dl-coding-conventions) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?q=is%3Apr&type=Issues) the bugtracker for similar pull requests

### In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under [Unlicense](http://unlicense.org/). Check one of the following options:
- [ ] I am the original author of this code and I am willing to release it under [Unlicense](http://unlicense.org/)
- [ ] I am not the original author of this code but it is in public domain or released under [Unlicense](http://unlicense.org/) (provide reliable evidence)

### What is the purpose of your *pull request*?
- [ ] Bug fix
- [ ] Improvement
- [ ] New extractor
- [ ] New feature
.gitignore (vendored, 1 changed line)

@@ -29,6 +29,7 @@ updates_key.pem
 *.m4a
 *.m4v
 *.mp3
+*.3gp
 *.part
 *.swp
 test/testdata
AUTHORS (2 changed lines)

@@ -26,7 +26,7 @@ Albert Kim
 Pierre Rudloff
 Huarong Huo
 Ismael Mejía
-Steffan 'Ruirize' James
+Steffan Donal
 Andras Elso
 Jelle van der Waa
 Marcin Cieślak
ChangeLog (100 changed lines)

@@ -1,3 +1,103 @@
version 2016.10.07

Extractors
+ [iprima] Detect geo restriction
* [facebook] Fix video extraction (#10846)
+ [commonprotocols] Support direct MMS links (#10838)
+ [generic] Add support for multiple vimeo embeds (#10862)
+ [nzz] Add support for nzz.ch (#4407)
+ [npo] Detect geo restriction
+ [npo] Add support for 2doc.nl (#10842)
+ [lego] Add support for lego.com (#10369)
+ [tonline] Add support for t-online.de (#10376)
* [techtalks] Relax URL regular expression (#10840)
* [youtube:live] Extend URL regular expression (#10839)
+ [theweatherchannel] Add support for weather.com (#7188)
+ [thisoldhouse] Add support for thisoldhouse.com (#10837)
+ [nhl] Add support for wch2016.com (#10833)
* [pornoxo] Use JWPlatform to improve metadata extraction


version 2016.10.02

Core
* Fix possibly lost extended attributes during post-processing
+ Support pyxattr as well as python-xattr for --xattrs and
  --xattr-set-filesize (#9054)

Extractors
+ [jwplatform] Support DASH streams in JWPlayer
+ [jwplatform] Support old-style JWPlayer playlists
+ [byutv:event] Add extractor
* [periscope:user] Fix extraction (#10820)
* [dctp] Fix extraction (#10734)
+ [instagram] Extract video dimensions (#10790)
+ [tvland] Extend URL regular expression (#10812)
+ [vgtv] Add support for tv.aftonbladet.se (#10800)
- [aftonbladet] Remove extractor
* [vk] Fix timestamp and view count extraction (#10760)
+ [vk] Add support for running and finished live streams (#10799)
+ [leeco] Recognize more Le Sports URLs (#10794)
+ [instagram] Extract comments (#10788)
+ [ketnet] Extract mzsource formats (#10770)
* [limelight:media] Improve HTTP formats extraction


version 2016.09.27

Core
+ Add hdcore query parameter to akamai f4m formats
+ Delegate HLS live streams downloading to ffmpeg
+ Improved support for HTML5 subtitles

Extractors
+ [vk] Add support for dailymotion embeds (#10661)
* [promptfile] Fix extraction (#10634)
* [kaltura] Speed up embed regular expressions (#10764)
+ [npo] Add support for anderetijden.nl (#10754)
+ [prosiebensat1] Add support for advopedia sites
* [mwave] Relax URL regular expression (#10735, #10748)
* [prosiebensat1] Fix playlist support (#10745)
+ [prosiebensat1] Add support for sat1gold sites (#10745)
+ [cbsnews:livevideo] Fix extraction and extract m3u8 formats
+ [brightcove:new] Add support for live streams
* [soundcloud] Generalize playlist entries extraction (#10733)
+ [mtv] Add support for new URL schema (#8169, #9808)
* [einthusan] Fix extraction (#10714)
+ [twitter] Support Periscope embeds (#10737)
+ [openload] Support subtitles (#10625)


version 2016.09.24

Core
+ Add support for watchTVeverywhere.com authentication provider based MSOs for
  Adobe Pass authentication (#10709)

Extractors
+ [soundcloud:playlist] Provide video id for early playlist entries (#10733)
+ [prosiebensat1] Add support for kabeleinsdoku (#10732)
* [cbs] Extract info from thunder videoPlayerService (#10728)
* [openload] Fix extraction (#10408)
+ [ustream] Support the new HLS streams (#10698)
+ [ooyala] Extract all HLS formats
+ [cartoonnetwork] Add support for Adobe Pass authentication
+ [soundcloud] Extract license metadata
+ [fox] Add support for Adobe Pass authentication (#8584)
+ [tbs] Add support for Adobe Pass authentication (#10642, #10222)
+ [trutv] Add support for Adobe Pass authentication (#10519)
+ [turner] Add support for Adobe Pass authentication


version 2016.09.19

Extractors
+ [crunchyroll] Check if already authenticated (#10700)
- [twitch:stream] Remove fallback to profile extraction when stream is offline
* [thisav] Improve title extraction (#10682)
* [vyborymos] Improve station info extraction


version 2016.09.18

Core
Makefile (2 changed lines)

@@ -1,7 +1,7 @@
 all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites

 clean:
-	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
+	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
 	find . -name "*.pyc" -delete
 	find . -name "*.class" -delete
@@ -1,4 +1,4 @@
-# encoding: utf-8
+# coding: utf-8
 from __future__ import unicode_literals

 import re
@@ -1,4 +1,4 @@
-# -*- coding: utf-8 -*-
+# coding: utf-8
 #
 # youtube-dl documentation build configuration file, created by
 # sphinx-quickstart on Fri Mar 14 21:05:43 2014.
@@ -34,12 +34,12 @@
 - **AdultSwim**
 - **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network
 - **AfreecaTV**: afreecatv.com
-- **Aftonbladet**
 - **AirMozilla**
 - **AlJazeera**
 - **Allocine**
 - **AlphaPorno**
 - **AMCNetworks**
+- **anderetijden**: npo.nl and ntr.nl
 - **AnimeOnDemand**
 - **anitube.se**
 - **AnySex**

@@ -111,6 +111,7 @@
 - **bt:vestlendingen**: Bergens Tidende - Vestlendingen
 - **BuzzFeed**
 - **BYUtv**
+- **BYUtvEvent**
 - **Camdemy**
 - **CamdemyFolder**
 - **CamWithHer**

@@ -127,8 +128,8 @@
 - **CBS**
 - **CBSInteractive**
 - **CBSLocal**
-- **CBSNews**: CBS News
-- **CBSNewsLiveVideo**: CBS News Live Videos
+- **cbsnews**: CBS News
+- **cbsnews:livevideo**: CBS News Live Videos
 - **CBSSports**
 - **CCTV**
 - **CDA**

@@ -363,6 +364,7 @@
 - **Le**: 乐视网
 - **Learnr**
 - **Lecture2Go**
+- **LEGO**
 - **Lemonde**
 - **LePlaylist**
 - **LetvCloud**: 乐视云

@@ -424,8 +426,9 @@
 - **MPORA**
 - **MSN**
 - **mtg**: MTG services
-- **MTV**
+- **mtv**
 - **mtv.de**
+- **mtv:video**
 - **mtvservices:embedded**
 - **MuenchenTV**: münchen.tv
 - **MusicPlayOn**

@@ -505,6 +508,7 @@
 - **Nuvid**
 - **NYTimes**
 - **NYTimesArticle**
+- **NZZ**
 - **ocw.mit.edu**
 - **OdaTV**
 - **Odnoklassniki**

@@ -690,6 +694,7 @@
 - **SWRMediathek**
 - **Syfy**
 - **SztvHu**
+- **t-online.de**
 - **Tagesschau**
 - **tagesschau:player**
 - **Tass**

@@ -719,8 +724,10 @@
 - **TheScene**
 - **TheSixtyOne**
 - **TheStar**
+- **TheWeatherChannel**
 - **ThisAmericanLife**
 - **ThisAV**
+- **ThisOldHouse**
 - **tinypic**: tinypic.com videos
 - **tlc.de**
 - **TMZ**

@@ -865,7 +872,7 @@
 - **wholecloud**: WholeCloud
 - **Wimp**
 - **Wistia**
-- **WNL**
+- **wnl**: npo.nl and ntr.nl
 - **WorldStarHipHop**
 - **wrzuta.pl**
 - **wrzuta.pl:playlist**
setup.py (2 changed lines)

@@ -1,5 +1,5 @@
 #!/usr/bin/env python
-# -*- coding: utf-8 -*-
+# coding: utf-8

 from __future__ import print_function
@@ -292,6 +292,7 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(unified_strdate('25-09-2014'), '20140925')
         self.assertEqual(unified_strdate('27.02.2016 17:30'), '20160227')
         self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
+        self.assertEqual(unified_strdate('Feb 7, 2016 at 6:35 pm'), '20160207')

     def test_unified_timestamps(self):
         self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)

@@ -312,6 +313,7 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(unified_timestamp('27.02.2016 17:30'), 1456594200)
         self.assertEqual(unified_timestamp('UNKNOWN DATE FORMAT'), None)
         self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
+        self.assertEqual(unified_timestamp('Feb 7, 2016 at 6:35 pm'), 1454870100)

     def test_determine_ext(self):
         self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -1,5 +1,5 @@
 #!/usr/bin/env python
-# -*- coding: utf-8 -*-
+# coding: utf-8

 from __future__ import absolute_import, unicode_literals
@@ -1,5 +1,5 @@
 #!/usr/bin/env python
-# -*- coding: utf-8 -*-
+# coding: utf-8

 from __future__ import unicode_literals

@@ -266,8 +266,6 @@ def _real_main(argv=None):
         postprocessors.append({
             'key': 'FFmpegEmbedSubtitle',
         })
-    if opts.xattrs:
-        postprocessors.append({'key': 'XAttrMetadata'})
     if opts.embedthumbnail:
         already_have_thumbnail = opts.writethumbnail or opts.write_all_thumbnails
         postprocessors.append({

@@ -276,6 +274,10 @@ def _real_main(argv=None):
         })
         if not already_have_thumbnail:
             opts.writethumbnail = True
+    # XAttrMetadataPP should be run after post-processors that may change file
+    # contents
+    if opts.xattrs:
+        postprocessors.append({'key': 'XAttrMetadata'})
     # Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
     # So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
     if opts.exec_cmd:

@@ -283,12 +285,6 @@ def _real_main(argv=None):
             'key': 'ExecAfterDownload',
             'exec_cmd': opts.exec_cmd,
         })
-    if opts.xattr_set_filesize:
-        try:
-            import xattr
-            xattr  # Confuse flake8
-        except ImportError:
-            parser.error('setting filesize xattr requested but python-xattr is not available')
     external_downloader_args = None
     if opts.external_downloader_args:
         external_downloader_args = compat_shlex_split(opts.external_downloader_args)
@@ -31,7 +31,7 @@ class HlsFD(FragmentFD):
     FD_NAME = 'hlsnative'

     @staticmethod
-    def can_download(manifest):
+    def can_download(manifest, info_dict):
         UNSUPPORTED_FEATURES = (
             r'#EXT-X-KEY:METHOD=(?!NONE|AES-128)',  # encrypted streams [1]
             r'#EXT-X-BYTERANGE',  # playlists composed of byte ranges of media files [2]

@@ -53,6 +53,7 @@ class HlsFD(FragmentFD):
         )
         check_results = [not re.search(feature, manifest) for feature in UNSUPPORTED_FEATURES]
         check_results.append(can_decrypt_frag or '#EXT-X-KEY:METHOD=AES-128' not in manifest)
+        check_results.append(not info_dict.get('is_live'))
         return all(check_results)

     def real_download(self, filename, info_dict):

@@ -62,7 +63,7 @@ class HlsFD(FragmentFD):
         s = manifest.decode('utf-8', 'ignore')

-        if not self.can_download(s):
+        if not self.can_download(s, info_dict):
             self.report_warning(
                 'hlsnative has detected features it does not support, '
                 'extraction will be delegated to ffmpeg')
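The hunks above thread `info_dict` into `can_download()` so the native HLS downloader can also refuse live streams, which, per the ChangeLog entry "Delegate HLS live streams downloading to ffmpeg", then fall back to the ffmpeg-based downloader. A rough illustration of that decision, not the project's actual dispatch code; `ydl`, `params`, `manifest_bytes` and `info_dict` are assumed to be supplied by the caller:

```python
# Hypothetical sketch of the choice implied by the new can_download() signature.
from youtube_dl.downloader.hls import HlsFD
from youtube_dl.downloader.external import FFmpegFD


def pick_hls_downloader(ydl, params, manifest_bytes, info_dict):
    manifest = manifest_bytes.decode('utf-8', 'ignore')
    if HlsFD.can_download(manifest, info_dict):
        # No unsupported playlist features and not a live stream:
        # the pure-Python fragment downloader can handle it.
        return HlsFD(ydl, params)
    # Encrypted or byte-range playlists and is_live streams go to ffmpeg.
    return FFmpegFD(ydl, params)
```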
@@ -13,6 +13,9 @@ from ..utils import (
     encodeFilename,
     sanitize_open,
     sanitized_Request,
+    write_xattr,
+    XAttrMetadataError,
+    XAttrUnavailableError,
 )

@@ -179,9 +182,8 @@ class HttpFD(FileDownloader):
         if self.params.get('xattr_set_filesize', False) and data_len is not None:
             try:
-                import xattr
-                xattr.setxattr(tmpfilename, 'user.ytdl.filesize', str(data_len))
-            except(OSError, IOError, ImportError) as err:
+                write_xattr(tmpfilename, 'user.ytdl.filesize', str(data_len).encode('utf-8'))
+            except (XAttrUnavailableError, XAttrMetadataError) as err:
                 self.report_error('unable to set filesize xattr: %s' % str(err))

         try:
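This hunk replaces the ad-hoc `import xattr` call with the shared `write_xattr()` helper, which is what the ChangeLog entry "Support pyxattr as well as python-xattr for --xattrs and --xattr-set-filesize (#9054)" refers to: both PyPI packages install a module named `xattr` but expose different call styles. A rough sketch of why a helper is needed at all; this is an illustration, not the actual `youtube_dl.utils.write_xattr` implementation, and the `hasattr` probe used to tell the two bindings apart is an assumption:

```python
# Illustrative only: the two xattr bindings share a module name but not an API.
import xattr


def set_user_xattr(path, key, value):
    # value must be bytes for both bindings
    if hasattr(xattr, 'set'):      # assumption: pyxattr exposes set()/get()
        xattr.set(path, key, value)
    else:                          # assumption: python-xattr exposes setxattr()
        xattr.setxattr(path, key, value)


set_user_xattr('video.mp4', 'user.ytdl.filesize', b'123456')
```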
File diff suppressed because it is too large.
@@ -1,64 +0,0 @@
|
||||
# encoding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import int_or_none
|
||||
|
||||
|
||||
class AftonbladetIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
|
||||
'info_dict': {
|
||||
'id': '36015',
|
||||
'ext': 'mp4',
|
||||
'title': 'Vulkanutbrott i rymden - nu släpper NASA bilderna',
|
||||
'description': 'Jupiters måne mest aktiv av alla himlakroppar',
|
||||
'timestamp': 1394142732,
|
||||
'upload_date': '20140306',
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
# find internal video meta data
|
||||
meta_url = 'http://aftonbladet-play-metadata.cdn.drvideo.aptoma.no/video/%s.json'
|
||||
player_config = self._parse_json(self._html_search_regex(
|
||||
r'data-player-config="([^"]+)"', webpage, 'player config'), video_id)
|
||||
internal_meta_id = player_config['aptomaVideoId']
|
||||
internal_meta_url = meta_url % internal_meta_id
|
||||
internal_meta_json = self._download_json(
|
||||
internal_meta_url, video_id, 'Downloading video meta data')
|
||||
|
||||
# find internal video formats
|
||||
format_url = 'http://aftonbladet-play.videodata.drvideo.aptoma.no/actions/video/?id=%s'
|
||||
internal_video_id = internal_meta_json['videoId']
|
||||
internal_formats_url = format_url % internal_video_id
|
||||
internal_formats_json = self._download_json(
|
||||
internal_formats_url, video_id, 'Downloading video formats')
|
||||
|
||||
formats = []
|
||||
for fmt in internal_formats_json['formats']['http']['pseudostreaming']['mp4']:
|
||||
p = fmt['paths'][0]
|
||||
formats.append({
|
||||
'url': 'http://%s:%d/%s/%s' % (p['address'], p['port'], p['path'], p['filename']),
|
||||
'ext': 'mp4',
|
||||
'width': int_or_none(fmt.get('width')),
|
||||
'height': int_or_none(fmt.get('height')),
|
||||
'tbr': int_or_none(fmt.get('bitrate')),
|
||||
'protocol': 'http',
|
||||
})
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': internal_meta_json['title'],
|
||||
'formats': formats,
|
||||
'thumbnail': internal_meta_json.get('imageUrl'),
|
||||
'description': internal_meta_json.get('shortPreamble'),
|
||||
'timestamp': int_or_none(internal_meta_json.get('timePublished')),
|
||||
'duration': int_or_none(internal_meta_json.get('duration')),
|
||||
'view_count': int_or_none(internal_meta_json.get('views')),
|
||||
}
|
@@ -1,4 +1,4 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
@@ -28,6 +28,7 @@ class AMCNetworksIE(ThePlatformIE):
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
'skip': 'Requires TV provider accounts',
|
||||
}, {
|
||||
'url': 'http://www.bbcamerica.com/shows/the-hunt/full-episodes/season-1/episode-01-the-hardest-challenge',
|
||||
'only_matching': True,
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
@@ -66,6 +66,7 @@ class AWAANVideoIE(AWAANBaseIE):
|
||||
'duration': 2041,
|
||||
'timestamp': 1227504126,
|
||||
'upload_date': '20081124',
|
||||
'uploader_id': '71',
|
||||
},
|
||||
}, {
|
||||
'url': 'http://awaan.ae/video/26723981/%D8%AF%D8%A7%D8%B1-%D8%A7%D9%84%D8%B3%D9%84%D8%A7%D9%85:-%D8%AE%D9%8A%D8%B1-%D8%AF%D9%88%D8%B1-%D8%A7%D9%84%D8%A3%D9%86%D8%B5%D8%A7%D8%B1',
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
@@ -621,15 +621,21 @@ class BrightcoveNewIE(InfoExtractor):
|
||||
'url': text_track['src'],
|
||||
})
|
||||
|
||||
is_live = False
|
||||
duration = float_or_none(json_data.get('duration'), 1000)
|
||||
if duration and duration < 0:
|
||||
is_live = True
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'title': self._live_title(title) if is_live else title,
|
||||
'description': clean_html(json_data.get('description')),
|
||||
'thumbnail': json_data.get('thumbnail') or json_data.get('poster'),
|
||||
'duration': float_or_none(json_data.get('duration'), 1000),
|
||||
'duration': duration,
|
||||
'timestamp': parse_iso8601(json_data.get('published_at')),
|
||||
'uploader_id': account_id,
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'tags': json_data.get('tags', []),
|
||||
'is_live': is_live,
|
||||
}
|
||||
|
@@ -1,6 +1,5 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
@@ -8,15 +7,15 @@ from ..utils import ExtractorError
|
||||
|
||||
|
||||
class BYUtvIE(InfoExtractor):
|
||||
_VALID_URL = r'^https?://(?:www\.)?byutv.org/watch/[0-9a-f-]+/(?P<video_id>[^/?#]+)'
|
||||
_TEST = {
|
||||
_VALID_URL = r'https?://(?:www\.)?byutv\.org/watch/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?'
|
||||
_TESTS = [{
|
||||
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5',
|
||||
'md5': '05850eb8c749e2ee05ad5a1c34668493',
|
||||
'info_dict': {
|
||||
'id': 'studio-c-season-5-episode-5',
|
||||
'id': '6587b9a3-89d2-42a6-a7f7-fd2f81840a7d',
|
||||
'display_id': 'studio-c-season-5-episode-5',
|
||||
'ext': 'mp4',
|
||||
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
|
||||
'title': 'Season 5 Episode 5',
|
||||
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'duration': 1486.486,
|
||||
},
|
||||
@@ -24,28 +23,71 @@ class BYUtvIE(InfoExtractor):
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
}, {
|
||||
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d',
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
display_id = mobj.group('display_id') or video_id
|
||||
|
||||
webpage = self._download_webpage(url, display_id)
|
||||
episode_code = self._search_regex(
|
||||
r'(?s)episode:(.*?\}),\s*\n', webpage, 'episode information')
|
||||
|
||||
ep = self._parse_json(
|
||||
episode_code, display_id, transform_source=lambda s:
|
||||
re.sub(r'(\n\s+)([a-zA-Z]+):\s+\'(.*?)\'', r'\1"\2": "\3"', s))
|
||||
|
||||
if ep['providerType'] != 'Ooyala':
|
||||
raise ExtractorError('Unsupported provider %s' % ep['provider'])
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'ie_key': 'Ooyala',
|
||||
'url': 'ooyala:%s' % ep['providerId'],
|
||||
'id': video_id,
|
||||
'display_id': display_id,
|
||||
'title': ep['title'],
|
||||
'description': ep.get('description'),
|
||||
'thumbnail': ep.get('imageThumbnail'),
|
||||
}
|
||||
|
||||
|
||||
class BYUtvEventIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?byutv\.org/watch/event/(?P<id>[0-9a-f-]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.byutv.org/watch/event/29941b9b-8bf6-48d2-aebf-7a87add9e34b',
|
||||
'info_dict': {
|
||||
'id': '29941b9b-8bf6-48d2-aebf-7a87add9e34b',
|
||||
'ext': 'mp4',
|
||||
'title': 'Toledo vs. BYU (9/30/16)',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('video_id')
|
||||
video_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
episode_code = self._search_regex(
|
||||
r'(?s)episode:(.*?\}),\s*\n', webpage, 'episode information')
|
||||
episode_json = re.sub(
|
||||
r'(\n\s+)([a-zA-Z]+):\s+\'(.*?)\'', r'\1"\2": "\3"', episode_code)
|
||||
ep = json.loads(episode_json)
|
||||
|
||||
if ep['providerType'] == 'Ooyala':
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'ie_key': 'Ooyala',
|
||||
'url': 'ooyala:%s' % ep['providerId'],
|
||||
'id': video_id,
|
||||
'title': ep['title'],
|
||||
'description': ep.get('description'),
|
||||
'thumbnail': ep.get('imageThumbnail'),
|
||||
}
|
||||
else:
|
||||
raise ExtractorError('Unsupported provider %s' % ep['provider'])
|
||||
ooyala_id = self._search_regex(
|
||||
r'providerId\s*:\s*(["\'])(?P<id>(?:(?!\1).)+)\1',
|
||||
webpage, 'ooyala id', group='id')
|
||||
|
||||
title = self._search_regex(
|
||||
r'class=["\']description["\'][^>]*>\s*<h1>([^<]+)</h1>', webpage,
|
||||
'title').strip()
|
||||
|
||||
return {
|
||||
'_type': 'url_transparent',
|
||||
'ie_key': 'Ooyala',
|
||||
'url': 'ooyala:%s' % ooyala_id,
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
}
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
@@ -33,4 +33,10 @@ class CartoonNetworkIE(TurnerBaseIE):
|
||||
'media_src': 'http://androidhls-secure.cdn.turner.com/toon/big',
|
||||
'tokenizer_src': 'http://www.cartoonnetwork.com/cntv/mvpd/processors/services/token_ipadAdobe.do',
|
||||
},
|
||||
}, {
|
||||
'url': url,
|
||||
'site_name': 'CartoonNetwork',
|
||||
'auth_required': self._search_regex(
|
||||
r'_cnglobal\.cvpFullOrPreviewAuth\s*=\s*(true|false);',
|
||||
webpage, 'auth required', default='false') == 'true',
|
||||
})
|
||||
|
@@ -4,7 +4,9 @@ from .theplatform import ThePlatformFeedIE
|
||||
from ..utils import (
|
||||
int_or_none,
|
||||
find_xpath_attr,
|
||||
ExtractorError,
|
||||
xpath_element,
|
||||
xpath_text,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
@@ -47,27 +49,49 @@ class CBSIE(CBSBaseIE):
|
||||
'only_matching': True,
|
||||
}]
|
||||
|
||||
def _extract_video_info(self, guid):
|
||||
path = 'dJ5BDC/media/guid/2198311517/' + guid
|
||||
smil_url = 'http://link.theplatform.com/s/%s?mbr=true' % path
|
||||
formats, subtitles = self._extract_theplatform_smil(smil_url + '&manifest=m3u', guid)
|
||||
for r in ('OnceURL&formats=M3U', 'HLS&formats=M3U', 'RTMP', 'WIFI', '3G'):
|
||||
try:
|
||||
tp_formats, _ = self._extract_theplatform_smil(smil_url + '&assetTypes=' + r, guid, 'Downloading %s SMIL data' % r.split('&')[0])
|
||||
formats.extend(tp_formats)
|
||||
except ExtractorError:
|
||||
def _extract_video_info(self, content_id):
|
||||
items_data = self._download_xml(
|
||||
'http://can.cbs.com/thunder/player/videoPlayerService.php',
|
||||
content_id, query={'partner': 'cbs', 'contentId': content_id})
|
||||
video_data = xpath_element(items_data, './/item')
|
||||
title = xpath_text(video_data, 'videoTitle', 'title', True)
|
||||
tp_path = 'dJ5BDC/media/guid/2198311517/%s' % content_id
|
||||
tp_release_url = 'http://link.theplatform.com/s/' + tp_path
|
||||
|
||||
asset_types = []
|
||||
subtitles = {}
|
||||
formats = []
|
||||
for item in items_data.findall('.//item'):
|
||||
asset_type = xpath_text(item, 'assetType')
|
||||
if not asset_type or asset_type in asset_types:
|
||||
continue
|
||||
asset_types.append(asset_type)
|
||||
query = {
|
||||
'mbr': 'true',
|
||||
'assetTypes': asset_type,
|
||||
}
|
||||
if asset_type.startswith('HLS') or asset_type in ('OnceURL', 'StreamPack'):
|
||||
query['formats'] = 'MPEG4,M3U'
|
||||
elif asset_type in ('RTMP', 'WIFI', '3G'):
|
||||
query['formats'] = 'MPEG4,FLV'
|
||||
tp_formats, tp_subtitles = self._extract_theplatform_smil(
|
||||
update_url_query(tp_release_url, query), content_id,
|
||||
'Downloading %s SMIL data' % asset_type)
|
||||
formats.extend(tp_formats)
|
||||
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
|
||||
self._sort_formats(formats)
|
||||
metadata = self._download_theplatform_metadata(path, guid)
|
||||
info = self._parse_theplatform_metadata(metadata)
|
||||
|
||||
info = self._extract_theplatform_metadata(tp_path, content_id)
|
||||
info.update({
|
||||
'id': guid,
|
||||
'id': content_id,
|
||||
'title': title,
|
||||
'series': xpath_text(video_data, 'seriesTitle'),
|
||||
'season_number': int_or_none(xpath_text(video_data, 'seasonNumber')),
|
||||
'episode_number': int_or_none(xpath_text(video_data, 'episodeNumber')),
|
||||
'duration': int_or_none(xpath_text(video_data, 'videoLength'), 1000),
|
||||
'thumbnail': xpath_text(video_data, 'previewImageURL'),
|
||||
'formats': formats,
|
||||
'subtitles': subtitles,
|
||||
'series': metadata.get('cbs$SeriesTitle'),
|
||||
'season_number': int_or_none(metadata.get('cbs$SeasonNumber')),
|
||||
'episode': metadata.get('cbs$EpisodeTitle'),
|
||||
'episode_number': int_or_none(metadata.get('cbs$EpisodeNumber')),
|
||||
})
|
||||
return info
|
||||
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
@@ -9,6 +9,7 @@ from ..utils import (
|
||||
|
||||
|
||||
class CBSNewsIE(CBSIE):
|
||||
IE_NAME = 'cbsnews'
|
||||
IE_DESC = 'CBS News'
|
||||
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)'
|
||||
|
||||
@@ -68,15 +69,16 @@ class CBSNewsIE(CBSIE):
|
||||
|
||||
|
||||
class CBSNewsLiveVideoIE(InfoExtractor):
|
||||
IE_NAME = 'cbsnews:livevideo'
|
||||
IE_DESC = 'CBS News Live Videos'
|
||||
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/live/video/(?P<id>[\da-z_-]+)'
|
||||
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/live/video/(?P<id>[^/?#]+)'
|
||||
|
||||
# Live videos get deleted soon. See http://www.cbsnews.com/live/ for the latest examples
|
||||
_TEST = {
|
||||
'url': 'http://www.cbsnews.com/live/video/clinton-sanders-prepare-to-face-off-in-nh/',
|
||||
'info_dict': {
|
||||
'id': 'clinton-sanders-prepare-to-face-off-in-nh',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': 'Clinton, Sanders Prepare To Face Off In NH',
|
||||
'duration': 334,
|
||||
},
|
||||
@@ -84,25 +86,22 @@ class CBSNewsLiveVideoIE(InfoExtractor):
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
display_id = self._match_id(url)
|
||||
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
video_info = self._download_json(
|
||||
'http://feeds.cbsn.cbsnews.com/rundown/story', display_id, query={
|
||||
'device': 'desktop',
|
||||
'dvr_slug': display_id,
|
||||
})
|
||||
|
||||
video_info = self._parse_json(self._html_search_regex(
|
||||
r'data-story-obj=\'({.+?})\'', webpage, 'video JSON info'), video_id)['story']
|
||||
|
||||
hdcore_sign = 'hdcore=3.3.1'
|
||||
f4m_formats = self._extract_f4m_formats(video_info['url'] + '&' + hdcore_sign, video_id)
|
||||
if f4m_formats:
|
||||
for entry in f4m_formats:
|
||||
# URLs without the extra param induce an 404 error
|
||||
entry.update({'extra_param_to_segment_url': hdcore_sign})
|
||||
self._sort_formats(f4m_formats)
|
||||
formats = self._extract_akamai_formats(video_info['url'], display_id)
|
||||
self._sort_formats(formats)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'id': display_id,
|
||||
'display_id': display_id,
|
||||
'title': video_info['headline'],
|
||||
'thumbnail': video_info.get('thumbnail_url_hd') or video_info.get('thumbnail_url_sd'),
|
||||
'duration': parse_duration(video_info.get('segmentDur')),
|
||||
'formats': f4m_formats,
|
||||
'formats': formats,
|
||||
}
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
@@ -1,9 +1,6 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import json
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..utils import (
|
||||
clean_html,
|
||||
@@ -30,16 +27,14 @@ class ClubicIE(InfoExtractor):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
video_id = self._match_id(url)
|
||||
|
||||
player_url = 'http://player.m6web.fr/v1/player/clubic/%s.html' % video_id
|
||||
player_page = self._download_webpage(player_url, video_id)
|
||||
|
||||
config_json = self._search_regex(
|
||||
config = self._parse_json(self._search_regex(
|
||||
r'(?m)M6\.Player\.config\s*=\s*(\{.+?\});$', player_page,
|
||||
'configuration')
|
||||
config = json.loads(config_json)
|
||||
'configuration'), video_id)
|
||||
|
||||
video_info = config['videoInfo']
|
||||
sources = config['sources']
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -21,6 +21,7 @@ from ..compat import (
|
||||
compat_os_name,
|
||||
compat_str,
|
||||
compat_urllib_error,
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urllib_parse_urlencode,
|
||||
compat_urllib_request,
|
||||
compat_urlparse,
|
||||
@@ -1828,7 +1829,7 @@ class InfoExtractor(object):
|
||||
for track_tag in re.findall(r'<track[^>]+>', media_content):
|
||||
track_attributes = extract_attributes(track_tag)
|
||||
kind = track_attributes.get('kind')
|
||||
if not kind or kind == 'subtitles':
|
||||
if not kind or kind in ('subtitles', 'captions'):
|
||||
src = track_attributes.get('src')
|
||||
if not src:
|
||||
continue
|
||||
@@ -1836,16 +1837,21 @@ class InfoExtractor(object):
|
||||
media_info['subtitles'].setdefault(lang, []).append({
|
||||
'url': absolute_url(src),
|
||||
})
|
||||
if media_info['formats']:
|
||||
if media_info['formats'] or media_info['subtitles']:
|
||||
entries.append(media_info)
|
||||
return entries
|
||||
|
||||
def _extract_akamai_formats(self, manifest_url, video_id):
|
||||
formats = []
|
||||
hdcore_sign = 'hdcore=3.7.0'
|
||||
f4m_url = re.sub(r'(https?://.+?)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
|
||||
formats.extend(self._extract_f4m_formats(
|
||||
update_url_query(f4m_url, {'hdcore': '3.7.0'}),
|
||||
video_id, f4m_id='hds', fatal=False))
|
||||
if 'hdcore=' not in f4m_url:
|
||||
f4m_url += ('&' if '?' in f4m_url else '?') + hdcore_sign
|
||||
f4m_formats = self._extract_f4m_formats(
|
||||
f4m_url, video_id, f4m_id='hds', fatal=False)
|
||||
for entry in f4m_formats:
|
||||
entry.update({'extra_param_to_segment_url': hdcore_sign})
|
||||
formats.extend(f4m_formats)
|
||||
m3u8_url = re.sub(r'(https?://.+?)/z/', r'\1/i/', manifest_url).replace('/manifest.f4m', '/master.m3u8')
|
||||
formats.extend(self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, 'mp4', 'm3u8_native',
|
||||
@@ -2015,6 +2021,12 @@ class InfoExtractor(object):
|
||||
headers['Ytdl-request-proxy'] = geo_verification_proxy
|
||||
return headers
|
||||
|
||||
def _generic_id(self, url):
|
||||
return compat_urllib_parse_unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
|
||||
|
||||
def _generic_title(self, url):
|
||||
return compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0])
|
||||
|
||||
|
||||
class SearchInfoExtractor(InfoExtractor):
|
||||
"""
|
||||
|
@@ -1,13 +1,9 @@
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import os
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import (
|
||||
compat_urllib_parse_unquote,
|
||||
compat_urlparse,
|
||||
)
|
||||
from ..utils import url_basename
|
||||
|
||||
|
||||
class RtmpIE(InfoExtractor):
|
||||
@@ -23,8 +19,8 @@ class RtmpIE(InfoExtractor):
|
||||
}]
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = compat_urllib_parse_unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
|
||||
title = compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0])
|
||||
video_id = self._generic_id(url)
|
||||
title = self._generic_title(url)
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
@@ -34,3 +30,31 @@ class RtmpIE(InfoExtractor):
|
||||
'format_id': compat_urlparse.urlparse(url).scheme,
|
||||
}],
|
||||
}
|
||||
|
||||
|
||||
class MmsIE(InfoExtractor):
|
||||
IE_DESC = False # Do not list
|
||||
_VALID_URL = r'(?i)mms://.+'
|
||||
|
||||
_TEST = {
|
||||
# Direct MMS link
|
||||
'url': 'mms://kentro.kaist.ac.kr/200907/MilesReid(0709).wmv',
|
||||
'info_dict': {
|
||||
'id': 'MilesReid(0709)',
|
||||
'ext': 'wmv',
|
||||
'title': 'MilesReid(0709)',
|
||||
},
|
||||
'params': {
|
||||
'skip_download': True, # rtsp downloads, requiring mplayer or mpv
|
||||
},
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._generic_id(url)
|
||||
title = self._generic_title(url)
|
||||
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'url': url,
|
||||
}
|
||||
|
@@ -1,8 +1,6 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
||||
|
||||
@@ -16,20 +14,20 @@ class CriterionIE(InfoExtractor):
|
||||
'ext': 'mp4',
|
||||
'title': 'Le Samouraï',
|
||||
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
mobj = re.match(self._VALID_URL, url)
|
||||
video_id = mobj.group('id')
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
final_url = self._search_regex(
|
||||
r'so.addVariable\("videoURL", "(.+?)"\)\;', webpage, 'video url')
|
||||
r'so\.addVariable\("videoURL", "(.+?)"\)\;', webpage, 'video url')
|
||||
title = self._og_search_title(webpage)
|
||||
description = self._html_search_meta('description', webpage)
|
||||
thumbnail = self._search_regex(
|
||||
r'so.addVariable\("thumbnailURL", "(.+?)"\)\;',
|
||||
r'so\.addVariable\("thumbnailURL", "(.+?)"\)\;',
|
||||
webpage, 'thumbnail url')
|
||||
|
||||
return {
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import re
|
||||
@@ -46,6 +46,13 @@ class CrunchyrollBaseIE(InfoExtractor):
|
||||
login_page = self._download_webpage(
|
||||
self._LOGIN_URL, None, 'Downloading login page')
|
||||
|
||||
def is_logged(webpage):
|
||||
return '<title>Redirecting' in webpage
|
||||
|
||||
# Already logged in
|
||||
if is_logged(login_page):
|
||||
return
|
||||
|
||||
login_form_str = self._search_regex(
|
||||
r'(?P<form><form[^>]+?id=(["\'])%s\2[^>]*>)' % self._LOGIN_FORM,
|
||||
login_page, 'login form', group='form')
|
||||
@@ -69,7 +76,7 @@ class CrunchyrollBaseIE(InfoExtractor):
|
||||
headers={'Content-Type': 'application/x-www-form-urlencoded'})
|
||||
|
||||
# Successful login
|
||||
if '<title>Redirecting' in response:
|
||||
if is_logged(response):
|
||||
return
|
||||
|
||||
error = self._html_search_regex(
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
|
||||
from __future__ import unicode_literals
|
||||
|
||||
|
@@ -1,61 +1,54 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from ..compat import compat_str
|
||||
from ..utils import unified_strdate
|
||||
|
||||
|
||||
class DctpTvIE(InfoExtractor):
|
||||
_VALID_URL = r'https?://(?:www\.)?dctp\.tv/(#/)?filme/(?P<id>.+?)/$'
|
||||
_TEST = {
|
||||
'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
|
||||
'md5': '174dd4a8a6225cf5655952f969cfbe24',
|
||||
'info_dict': {
|
||||
'id': '1324',
|
||||
'id': '95eaa4f33dad413aa17b4ee613cccc6c',
|
||||
'display_id': 'videoinstallation-fuer-eine-kaufhausfassade',
|
||||
'ext': 'flv',
|
||||
'title': 'Videoinstallation für eine Kaufhausfassade'
|
||||
'ext': 'mp4',
|
||||
'title': 'Videoinstallation für eine Kaufhausfassade',
|
||||
'description': 'Kurzfilm',
|
||||
'upload_date': '20110407',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
},
|
||||
'params': {
|
||||
# rtmp download
|
||||
'skip_download': True,
|
||||
}
|
||||
}
|
||||
|
||||
def _real_extract(self, url):
|
||||
video_id = self._match_id(url)
|
||||
base_url = 'http://dctp-ivms2-restapi.s3.amazonaws.com/'
|
||||
version_json = self._download_json(
|
||||
base_url + 'version.json',
|
||||
video_id, note='Determining file version')
|
||||
version = version_json['version_name']
|
||||
info_json = self._download_json(
|
||||
'{0}{1}/restapi/slugs/{2}.json'.format(base_url, version, video_id),
|
||||
video_id, note='Fetching object ID')
|
||||
object_id = compat_str(info_json['object_id'])
|
||||
meta_json = self._download_json(
|
||||
'{0}{1}/restapi/media/{2}.json'.format(base_url, version, object_id),
|
||||
video_id, note='Downloading metadata')
|
||||
uuid = meta_json['uuid']
|
||||
title = meta_json['title']
|
||||
wide = meta_json['is_wide']
|
||||
if wide:
|
||||
ratio = '16x9'
|
||||
else:
|
||||
ratio = '4x3'
|
||||
play_path = 'mp4:{0}_dctp_0500_{1}.m4v'.format(uuid, ratio)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
object_id = self._html_search_meta('DC.identifier', webpage)
|
||||
|
||||
servers_json = self._download_json(
|
||||
'http://www.dctp.tv/streaming_servers/',
|
||||
'http://www.dctp.tv/elastic_streaming_client/get_streaming_server/',
|
||||
video_id, note='Downloading server list')
|
||||
url = servers_json[0]['endpoint']
|
||||
server = servers_json[0]['server']
|
||||
m3u8_path = self._search_regex(
|
||||
r'\'([^\'"]+/playlist\.m3u8)"', webpage, 'm3u8 path')
|
||||
formats = self._extract_m3u8_formats(
|
||||
'http://%s%s' % (server, m3u8_path), video_id, ext='mp4',
|
||||
entry_protocol='m3u8_native')
|
||||
|
||||
title = self._og_search_title(webpage)
|
||||
description = self._html_search_meta('DC.description', webpage)
|
||||
upload_date = unified_strdate(
|
||||
self._html_search_meta('DC.date.created', webpage))
|
||||
thumbnail = self._og_search_thumbnail(webpage)
|
||||
|
||||
return {
|
||||
'id': object_id,
|
||||
'title': title,
|
||||
'format': 'rtmp',
|
||||
'url': url,
|
||||
'play_path': play_path,
|
||||
'rtmp_real_time': True,
|
||||
'ext': 'flv',
|
||||
'display_id': video_id
|
||||
'formats': formats,
|
||||
'display_id': video_id,
|
||||
'description': description,
|
||||
'upload_date': upload_date,
|
||||
'thumbnail': thumbnail,
|
||||
}
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
import itertools
|
||||
|
@@ -14,7 +14,7 @@ class EinthusanIE(InfoExtractor):
|
||||
_TESTS = [
|
||||
{
|
||||
'url': 'http://www.einthusan.com/movies/watch.php?id=2447',
|
||||
'md5': 'af244f4458cd667205e513d75da5b8b1',
|
||||
'md5': 'd71379996ff5b7f217eca034c34e3461',
|
||||
'info_dict': {
|
||||
'id': '2447',
|
||||
'ext': 'mp4',
|
||||
@@ -25,13 +25,13 @@ class EinthusanIE(InfoExtractor):
|
||||
},
|
||||
{
|
||||
'url': 'http://www.einthusan.com/movies/watch.php?id=1671',
|
||||
'md5': 'ef63c7a803e22315880ed182c10d1c5c',
|
||||
'md5': 'b16a6fd3c67c06eb7c79c8a8615f4213',
|
||||
'info_dict': {
|
||||
'id': '1671',
|
||||
'ext': 'mp4',
|
||||
'title': 'Soodhu Kavvuum',
|
||||
'thumbnail': 're:^https?://.*\.jpg$',
|
||||
'description': 'md5:05d8a0c0281a4240d86d76e14f2f4d51',
|
||||
'description': 'md5:b40f2bf7320b4f9414f3780817b2af8c',
|
||||
}
|
||||
},
|
||||
]
|
||||
@@ -50,9 +50,11 @@ class EinthusanIE(InfoExtractor):
|
||||
video_id = self._search_regex(
|
||||
r'data-movieid=["\'](\d+)', webpage, 'video id', default=video_id)
|
||||
|
||||
video_url = self._download_webpage(
|
||||
m3u8_url = self._download_webpage(
|
||||
'http://cdn.einthusan.com/geturl/%s/hd/London,Washington,Toronto,Dallas,San,Sydney/'
|
||||
% video_id, video_id)
|
||||
% video_id, video_id, headers={'Referer': url})
|
||||
formats = self._extract_m3u8_formats(
|
||||
m3u8_url, video_id, ext='mp4', entry_protocol='m3u8_native')
|
||||
|
||||
description = self._html_search_meta('description', webpage)
|
||||
thumbnail = self._html_search_regex(
|
||||
@@ -64,7 +66,7 @@ class EinthusanIE(InfoExtractor):
|
||||
return {
|
||||
'id': video_id,
|
||||
'title': title,
|
||||
'url': video_url,
|
||||
'formats': formats,
|
||||
'thumbnail': thumbnail,
|
||||
'description': description,
|
||||
}
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -31,7 +31,6 @@ from .aenetworks import (
|
||||
HistoryTopicIE,
|
||||
)
|
||||
from .afreecatv import AfreecaTVIE
|
||||
from .aftonbladet import AftonbladetIE
|
||||
from .airmozilla import AirMozillaIE
|
||||
from .aljazeera import AlJazeeraIE
|
||||
from .alphaporno import AlphaPornoIE
|
||||
@@ -117,7 +116,10 @@ from .brightcove import (
|
||||
BrightcoveNewIE,
|
||||
)
|
||||
from .buzzfeed import BuzzFeedIE
|
||||
from .byutv import BYUtvIE
|
||||
from .byutv import (
|
||||
BYUtvIE,
|
||||
BYUtvEventIE,
|
||||
)
|
||||
from .c56 import C56IE
|
||||
from .camdemy import (
|
||||
CamdemyIE,
|
||||
@@ -184,7 +186,10 @@ from .comedycentral import (
|
||||
)
|
||||
from .comcarcoff import ComCarCoffIE
|
||||
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
|
||||
from .commonprotocols import RtmpIE
|
||||
from .commonprotocols import (
|
||||
MmsIE,
|
||||
RtmpIE,
|
||||
)
|
||||
from .condenast import CondeNastIE
|
||||
from .cracked import CrackedIE
|
||||
from .crackle import CrackleIE
|
||||
@@ -435,6 +440,7 @@ from .lcp import (
|
||||
)
|
||||
from .learnr import LearnrIE
|
||||
from .lecture2go import Lecture2GoIE
|
||||
from .lego import LEGOIE
|
||||
from .lemonde import LemondeIE
|
||||
from .leeco import (
|
||||
LeIE,
|
||||
@@ -516,6 +522,7 @@ from .movingimage import MovingImageIE
|
||||
from .msn import MSNIE
|
||||
from .mtv import (
|
||||
MTVIE,
|
||||
MTVVideoIE,
|
||||
MTVServicesEmbeddedIE,
|
||||
MTVDEIE,
|
||||
)
|
||||
@@ -611,13 +618,14 @@ from .nowtv import (
|
||||
)
|
||||
from .noz import NozIE
|
||||
from .npo import (
|
||||
AndereTijdenIE,
|
||||
NPOIE,
|
||||
NPOLiveIE,
|
||||
NPORadioIE,
|
||||
NPORadioFragmentIE,
|
||||
SchoolTVIE,
|
||||
VPROIE,
|
||||
WNLIE
|
||||
WNLIE,
|
||||
)
|
||||
from .npr import NprIE
|
||||
from .nrk import (
|
||||
@@ -633,6 +641,7 @@ from .nytimes import (
|
||||
NYTimesArticleIE,
|
||||
)
|
||||
from .nuvid import NuvidIE
|
||||
from .nzz import NZZIE
|
||||
from .odatv import OdaTVIE
|
||||
from .odnoklassniki import OdnoklassnikiIE
|
||||
from .oktoberfesttv import OktoberfestTVIE
|
||||
@@ -886,8 +895,10 @@ from .theplatform import (
|
||||
from .thescene import TheSceneIE
|
||||
from .thesixtyone import TheSixtyOneIE
|
||||
from .thestar import TheStarIE
|
||||
from .theweatherchannel import TheWeatherChannelIE
|
||||
from .thisamericanlife import ThisAmericanLifeIE
|
||||
from .thisav import ThisAVIE
|
||||
from .thisoldhouse import ThisOldHouseIE
|
||||
from .threeqsdn import ThreeQSDNIE
|
||||
from .tinypic import TinyPicIE
|
||||
from .tlc import TlcDeIE
|
||||
@@ -902,6 +913,7 @@ from .tnaflix import (
|
||||
MovieFapIE,
|
||||
)
|
||||
from .toggle import ToggleIE
|
||||
from .tonline import TOnlineIE
|
||||
from .toutv import TouTvIE
|
||||
from .toypics import ToypicsUserIE, ToypicsIE
|
||||
from .traileraddict import TrailerAddictIE
|
||||
|
@@ -258,7 +258,7 @@ class FacebookIE(InfoExtractor):
|
||||
|
||||
if not video_data:
|
||||
server_js_data = self._parse_json(self._search_regex(
|
||||
r'handleServerJS\(({.+})\);', webpage, 'server js data', default='{}'), video_id)
|
||||
r'handleServerJS\(({.+})(?:\);|,")', webpage, 'server js data', default='{}'), video_id)
|
||||
for item in server_js_data.get('instances', []):
|
||||
if item[1][0] == 'VideoConfig':
|
||||
video_data = video_data_list2dict(item[2][0]['videoData'])
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# encoding: utf-8
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -1,4 +1,4 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
|
@@ -11,9 +11,13 @@ class Formula1IE(InfoExtractor):
|
||||
'md5': '8c79e54be72078b26b89e0e111c0502b',
|
||||
'info_dict': {
|
||||
'id': 'JvYXJpMzE6pArfHWm5ARp5AiUmD-gibV',
|
||||
'ext': 'flv',
|
||||
'ext': 'mp4',
|
||||
'title': 'Race highlights - Spain 2016',
|
||||
},
|
||||
'params': {
|
||||
# m3u8 download
|
||||
'skip_download': True,
|
||||
},
|
||||
'add_ie': ['Ooyala'],
|
||||
}, {
|
||||
'url': 'http://www.formula1.com/en/video/2016/5/Race_highlights_-_Spain_2016.html',
|
||||
|
@@ -1,14 +1,14 @@
|
||||
# coding: utf-8
|
||||
from __future__ import unicode_literals
|
||||
|
||||
from .common import InfoExtractor
|
||||
from .adobepass import AdobePassIE
|
||||
from ..utils import (
|
||||
smuggle_url,
|
||||
update_url_query,
|
||||
)
|
||||
|
||||
|
||||
class FOXIE(InfoExtractor):
|
||||
class FOXIE(AdobePassIE):
|
||||
_VALID_URL = r'https?://(?:www\.)?fox\.com/watch/(?P<id>[0-9]+)'
|
||||
_TEST = {
|
||||
'url': 'http://www.fox.com/watch/255180355939/7684182528',
|
||||
@@ -30,14 +30,26 @@ class FOXIE(InfoExtractor):
|
||||
video_id = self._match_id(url)
|
||||
webpage = self._download_webpage(url, video_id)
|
||||
|
||||
release_url = self._parse_json(self._search_regex(
|
||||
r'"fox_pdk_player"\s*:\s*({[^}]+?})', webpage, 'fox_pdk_player'),
|
||||
video_id)['release_url']
|
||||
settings = self._parse_json(self._search_regex(
|
||||
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
|
||||
webpage, 'drupal settings'), video_id)
|
||||
fox_pdk_player = settings['fox_pdk_player']
|
||||
release_url = fox_pdk_player['release_url']
query = {
'mbr': 'true',
'switch': 'http'
}
if fox_pdk_player.get('access') == 'locked':
ap_p = settings['foxAdobePassProvider']
rating = ap_p.get('videoRating')
if rating == 'n/a':
rating = None
resource = self._get_mvpd_resource('fbc-fox', None, ap_p['videoGUID'], rating)
query['auth'] = self._extract_mvpd_auth(url, video_id, 'fbc-fox', resource)
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': smuggle_url(update_url_query(
release_url, {'switch': 'http'}), {'force_smil_url': True}),
'url': smuggle_url(update_url_query(release_url, query), {'force_smil_url': True}),
'id': video_id,
}
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
@@ -27,7 +27,6 @@ from ..utils import (
unified_strdate,
unsmuggle_url,
UnsupportedError,
url_basename,
xpath_text,
)
from .brightcove import (
@@ -1549,7 +1548,7 @@ class GenericIE(InfoExtractor):
force_videoid = smuggled_data['force_videoid']
video_id = force_videoid
else:
video_id = compat_urllib_parse_unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
video_id = self._generic_id(url)
self.to_screen('%s: Requesting header' % video_id)
@@ -1578,7 +1577,7 @@ class GenericIE(InfoExtractor):
info_dict = {
'id': video_id,
'title': compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0]),
'title': self._generic_title(url),
'upload_date': unified_strdate(head_response.headers.get('Last-Modified'))
}
@@ -1754,9 +1753,9 @@ class GenericIE(InfoExtractor):
if matches:
return _playlist_from_matches(matches, ie='RtlNl')
vimeo_url = VimeoIE._extract_vimeo_url(url, webpage)
if vimeo_url is not None:
return self.url_result(vimeo_url)
vimeo_urls = VimeoIE._extract_urls(url, webpage)
if vimeo_urls:
return _playlist_from_matches(vimeo_urls, ie=VimeoIE.ie_key())
vid_me_embed_url = self._search_regex(
r'src=[\'"](https?://vid\.me/[^\'"]+)[\'"]',
@@ -2332,12 +2331,23 @@ class GenericIE(InfoExtractor):
info_dict.update(json_ld)
return info_dict
# Look for HTML5 media
entries = self._parse_html5_media_entries(url, webpage, video_id, m3u8_id='hls')
if entries:
for entry in entries:
entry.update({
'id': video_id,
'title': video_title,
})
self._sort_formats(entry['formats'])
return self.playlist_result(entries)
def check_video(vurl):
if YoutubeIE.suitable(vurl):
return True
vpath = compat_urlparse.urlparse(vurl).path
vext = determine_ext(vpath)
return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml')
return '.' in vpath and vext not in ('swf', 'png', 'jpg', 'srt', 'sbv', 'sub', 'vtt', 'ttml', 'js')
def filter_video(urls):
return list(filter(check_video, urls))
@@ -2387,9 +2397,6 @@ class GenericIE(InfoExtractor):
# We only look in og:video if the MIME type is a video, don't try if it's a Flash player:
if m_video_type is not None:
found = filter_video(re.findall(r'<meta.*?property="og:video".*?content="(.*?)"', webpage))
if not found:
# HTML5 video
found = re.findall(r'(?s)<(?:video|audio)[^<]*(?:>.*?<source[^>]*)?\s+src=["\'](.*?)["\']', webpage)
if not found:
REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)'
found = re.search(
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -29,6 +29,7 @@ class InstagramIE(InfoExtractor):
'uploader': 'Naomi Leonor Phan-Quang',
'like_count': int,
'comment_count': int,
'comments': list,
},
}, {
# missing description
@@ -44,6 +45,7 @@ class InstagramIE(InfoExtractor):
'uploader': 'Britney Spears',
'like_count': int,
'comment_count': int,
'comments': list,
},
'params': {
'skip_download': True,
@@ -82,7 +84,7 @@ class InstagramIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
(video_url, description, thumbnail, timestamp, uploader,
uploader_id, like_count, comment_count) = [None] * 8
uploader_id, like_count, comment_count, height, width) = [None] * 10
shared_data = self._parse_json(
self._search_regex(
@@ -94,6 +96,8 @@ class InstagramIE(InfoExtractor):
shared_data, lambda x: x['entry_data']['PostPage'][0]['media'], dict)
if media:
video_url = media.get('video_url')
height = int_or_none(media.get('dimensions', {}).get('height'))
width = int_or_none(media.get('dimensions', {}).get('width'))
description = media.get('caption')
thumbnail = media.get('display_src')
timestamp = int_or_none(media.get('date'))
@@ -101,10 +105,24 @@ class InstagramIE(InfoExtractor):
uploader_id = media.get('owner', {}).get('username')
like_count = int_or_none(media.get('likes', {}).get('count'))
comment_count = int_or_none(media.get('comments', {}).get('count'))
comments = [{
'author': comment.get('user', {}).get('username'),
'author_id': comment.get('user', {}).get('id'),
'id': comment.get('id'),
'text': comment.get('text'),
'timestamp': int_or_none(comment.get('created_at')),
} for comment in media.get(
'comments', {}).get('nodes', []) if comment.get('text')]
if not video_url:
video_url = self._og_search_video_url(webpage, secure=False)
formats = [{
'url': video_url,
'width': width,
'height': height,
}]
if not uploader_id:
uploader_id = self._search_regex(
r'"owner"\s*:\s*{\s*"username"\s*:\s*"(.+?)"',
@@ -121,7 +139,7 @@ class InstagramIE(InfoExtractor):
return {
'id': video_id,
'url': video_url,
'formats': formats,
'ext': 'mp4',
'title': 'Video by %s' % uploader_id,
'description': description,
@@ -131,6 +149,7 @@ class InstagramIE(InfoExtractor):
'uploader': uploader,
'like_count': like_count,
'comment_count': comment_count,
'comments': comments,
}
@@ -81,6 +81,9 @@ class IPrimaIE(InfoExtractor):
for _, src in re.findall(r'src["\']\s*:\s*(["\'])(.+?)\1', playerpage):
extract_formats(src)
if not formats and '>GEO_IP_NOT_ALLOWED<' in playerpage:
self.raise_geo_restricted()
self._sort_formats(formats)
return {
@@ -1,4 +1,4 @@
# coding=utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -32,13 +32,20 @@ class JWPlatformBaseIE(InfoExtractor):
return self._parse_jwplayer_data(
jwplayer_data, video_id, *args, **kwargs)
def _parse_jwplayer_data(self, jwplayer_data, video_id=None, require_title=True, m3u8_id=None, rtmp_params=None, base_url=None):
def _parse_jwplayer_data(self, jwplayer_data, video_id=None, require_title=True,
m3u8_id=None, mpd_id=None, rtmp_params=None, base_url=None):
# JWPlayer backward compatibility: flattened playlists
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/api/config.js#L81-L96
if 'playlist' not in jwplayer_data:
jwplayer_data = {'playlist': [jwplayer_data]}
entries = []
# JWPlayer backward compatibility: single playlist item
# https://github.com/jwplayer/jwplayer/blob/v7.7.0/src/js/playlist/playlist.js#L10
if not isinstance(jwplayer_data['playlist'], list):
jwplayer_data['playlist'] = [jwplayer_data['playlist']]
for video_data in jwplayer_data['playlist']:
# JWPlayer backward compatibility: flattened sources
# https://github.com/jwplayer/jwplayer/blob/v7.4.3/src/js/playlist/item.js#L29-L35
@@ -57,6 +64,9 @@ class JWPlatformBaseIE(InfoExtractor):
if source_type == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
source_url, this_video_id, 'mp4', 'm3u8_native', m3u8_id=m3u8_id, fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
source_url, this_video_id, mpd_id=mpd_id, fatal=False))
# https://github.com/jwplayer/jwplayer/blob/master/src/js/providers/default.js#L67
elif source_type.startswith('audio') or ext in ('oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
formats.append({
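The hunk above adds a DASH branch to JWPlayer source handling, keyed on the .mpd extension next to the existing HLS and audio cases. A minimal standalone sketch of that dispatch order, in plain Python (the helper name pick_handler is illustrative, not part of youtube-dl):

import os

def pick_handler(source_url, source_type=None):
    # Mirrors the order used in _parse_jwplayer_data: HLS first,
    # then DASH by the .mpd extension, then audio, then plain HTTP.
    ext = os.path.splitext(source_url.split('?')[0])[1][1:].lower()
    if source_type == 'hls' or ext == 'm3u8':
        return 'hls'
    if ext == 'mpd':
        return 'dash'
    if (source_type or '').startswith('audio') or ext in ('oga', 'aac', 'mp3', 'mpeg', 'vorbis'):
        return 'audio'
    return 'http'

print(pick_handler('https://cdn.example.com/v/master.m3u8'))   # hls
print(pick_handler('https://cdn.example.com/v/manifest.mpd'))  # dash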
@@ -105,20 +105,20 @@ class KalturaIE(InfoExtractor):
kWidget\.(?:thumb)?[Ee]mbed\(
\{.*?
(?P<q1>['\"])wid(?P=q1)\s*:\s*
(?P<q2>['\"])_?(?P<partner_id>[^'\"]+)(?P=q2),.*?
(?P<q2>['\"])_?(?P<partner_id>(?:(?!(?P=q2)).)+)(?P=q2),.*?
(?P<q3>['\"])entry_?[Ii]d(?P=q3)\s*:\s*
(?P<q4>['\"])(?P<id>[^'\"]+)(?P=q4),
(?P<q4>['\"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4),
""", webpage) or
re.search(
r'''(?xs)
(?P<q1>["\'])
(?:https?:)?//cdnapi(?:sec)?\.kaltura\.com/.*?(?:p|partner_id)/(?P<partner_id>\d+).*?
(?:https?:)?//cdnapi(?:sec)?\.kaltura\.com/(?:(?!(?P=q1)).)*(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)*
(?P=q1).*?
(?:
entry_?[Ii]d|
(?P<q2>["\'])entry_?[Ii]d(?P=q2)
)\s*:\s*
(?P<q3>["\'])(?P<id>.+?)(?P=q3)
(?P<q3>["\'])(?P<id>(?:(?!(?P=q3)).)+)(?P=q3)
''', webpage))
if mobj:
embed_info = mobj.groupdict()
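The Kaltura change above swaps character classes like [^'\"]+ for the tempered pattern (?:(?!(?P=q2)).)+, which consumes anything up to the exact quote character captured earlier, so a value containing the other quote type no longer breaks the match. A standalone illustration of the difference (not youtube-dl code):

import re

snippet = '''"wid": "_12'34",'''
naive = re.search(
    r'(?P<q1>["\'])wid(?P=q1)\s*:\s*(?P<q2>["\'])_?(?P<pid>[^\'"]+)(?P=q2)', snippet)
tempered = re.search(
    r'(?P<q1>["\'])wid(?P=q1)\s*:\s*(?P<q2>["\'])_?(?P<pid>(?:(?!(?P=q2)).)+)(?P=q2)', snippet)
print(naive)                  # None - [^'"]+ stops at the stray apostrophe
print(tempered.group('pid'))  # 12'34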
@@ -21,6 +21,10 @@ class KetnetIE(InfoExtractor):
}, {
'url': 'https://www.ketnet.be/achter-de-schermen/sien-repeteert-voor-stars-for-life',
'only_matching': True,
}, {
# mzsource, geo restricted to Belgium
'url': 'https://www.ketnet.be/kijken/nachtwacht/de-bermadoe',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -36,9 +40,25 @@ class KetnetIE(InfoExtractor):
title = config['title']
formats = self._extract_m3u8_formats(
config['source']['hls'], video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')
formats = []
for source_key in ('', 'mz'):
source = config.get('%ssource' % source_key)
if not isinstance(source, dict):
continue
for format_id, format_url in source.items():
if format_id == 'hls':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id,
fatal=False))
elif format_id == 'hds':
formats.extend(self._extract_f4m_formats(
format_url, video_id, f4m_id=format_id, fatal=False))
else:
formats.append({
'url': format_url,
'format_id': format_id,
})
self._sort_formats(formats)
return {
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import json
@@ -29,7 +29,7 @@ from ..utils import (
class LeIE(InfoExtractor):
IE_DESC = '乐视网'
_VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|sports\.le\.com/video)/(?P<id>\d+)\.html'
_VALID_URL = r'https?://(?:www\.le\.com/ptv/vplay|(?:sports\.le|(?:www\.)?lesports)\.com/(?:match|video))/(?P<id>\d+)\.html'
_URL_TEMPLATE = 'http://www.le.com/ptv/vplay/%s.html'
@@ -73,6 +73,12 @@ class LeIE(InfoExtractor):
}, {
'url': 'http://sports.le.com/video/25737697.html',
'only_matching': True,
}, {
'url': 'http://www.lesports.com/match/1023203003.html',
'only_matching': True,
}, {
'url': 'http://sports.le.com/match/1023203003.html',
'only_matching': True,
}]
# ror() and calc_time_key() are reversed from a embedded swf file in KLetvPlayer.swf
86
youtube_dl/extractor/lego.py
Normal file
@@ -0,0 +1,86 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
unescapeHTML,
int_or_none,
)
class LEGOIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?lego\.com/(?:[^/]+/)*videos/(?:[^/]+/)*[^/?#]+-(?P<id>[0-9a-f]+)'
_TEST = {
'url': 'http://www.lego.com/en-us/videos/themes/club/blocumentary-kawaguchi-55492d823b1b4d5e985787fa8c2973b1',
'md5': 'f34468f176cfd76488767fc162c405fa',
'info_dict': {
'id': '55492d823b1b4d5e985787fa8c2973b1',
'ext': 'mp4',
'title': 'Blocumentary Great Creations: Akiyuki Kawaguchi',
}
}
_BITRATES = [256, 512, 1024, 1536, 2560]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'http://www.lego.com/en-US/mediaplayer/video/' + video_id, video_id)
title = self._search_regex(r'<title>(.+?)</title>', webpage, 'title')
video_data = self._parse_json(unescapeHTML(self._search_regex(
r"video='([^']+)'", webpage, 'video data')), video_id)
progressive_base = self._search_regex(
r'data-video-progressive-url="([^"]+)"',
webpage, 'progressive base', default='https://lc-mediaplayerns-live-s.legocdn.com/')
streaming_base = self._search_regex(
r'data-video-streaming-url="([^"]+)"',
webpage, 'streaming base', default='http://legoprod-f.akamaihd.net/')
item_id = video_data['ItemId']
net_storage_path = video_data.get('NetStoragePath') or '/'.join([item_id[:2], item_id[2:4]])
base_path = '_'.join([item_id, video_data['VideoId'], video_data['Locale'], compat_str(video_data['VideoVersion'])])
path = '/'.join([net_storage_path, base_path])
streaming_path = ','.join(map(lambda bitrate: compat_str(bitrate), self._BITRATES))
formats = self._extract_akamai_formats(
'%si/s/public/%s_,%s,.mp4.csmil/master.m3u8' % (streaming_base, path, streaming_path), video_id)
m3u8_formats = list(filter(
lambda f: f.get('protocol') == 'm3u8_native' and f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
formats))
if len(m3u8_formats) == len(self._BITRATES):
self._sort_formats(m3u8_formats)
for bitrate, m3u8_format in zip(self._BITRATES, m3u8_formats):
progressive_base_url = '%spublic/%s_%d.' % (progressive_base, path, bitrate)
mp4_f = m3u8_format.copy()
mp4_f.update({
'url': progressive_base_url + 'mp4',
'format_id': m3u8_format['format_id'].replace('hls', 'mp4'),
'protocol': 'http',
})
web_f = {
'url': progressive_base_url + 'webm',
'format_id': m3u8_format['format_id'].replace('hls', 'webm'),
'width': m3u8_format['width'],
'height': m3u8_format['height'],
'tbr': m3u8_format.get('tbr'),
'ext': 'webm',
}
formats.extend([web_f, mp4_f])
else:
for bitrate in self._BITRATES:
for ext in ('web', 'mp4'):
formats.append({
'format_id': '%s-%s' % (ext, bitrate),
'url': '%spublic/%s_%d.%s' % (progressive_base, path, bitrate, ext),
'tbr': bitrate,
'ext': ext,
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'thumbnail': video_data.get('CoverImageUrl'),
'duration': int_or_none(video_data.get('Length')),
'formats': formats,
}
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -59,7 +59,7 @@ class LimelightBaseIE(InfoExtractor):
format_id = 'rtmp'
if stream.get('videoBitRate'):
format_id += '-%d' % int_or_none(stream['videoBitRate'])
http_url = 'http://%s/%s' % (rtmp.group('host').replace('csl.', 'cpl.'), rtmp.group('playpath')[4:])
http_url = 'http://cpl.delvenetworks.com/' + rtmp.group('playpath')[4:]
urls.append(http_url)
http_fmt = fmt.copy()
http_fmt.update({
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -270,6 +270,29 @@ class MTVServicesEmbeddedIE(MTVServicesInfoExtractor):
class MTVIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv'
_VALID_URL = r'https?://(?:www\.)?mtv\.com/(?:video-clips|full-episodes)/(?P<id>[^/?#.]+)'
_FEED_URL = 'http://www.mtv.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.mtv.com/video-clips/vl8qof/unlocking-the-truth-trailer',
'md5': '1edbcdf1e7628e414a8c5dcebca3d32b',
'info_dict': {
'id': '5e14040d-18a4-47c4-a582-43ff602de88e',
'ext': 'mp4',
'title': 'Unlocking The Truth|July 18, 2016|1|101|Trailer',
'description': '"Unlocking the Truth" premieres August 17th at 11/10c.',
'timestamp': 1468846800,
'upload_date': '20160718',
},
}, {
'url': 'http://www.mtv.com/full-episodes/94tujl/unlocking-the-truth-gates-of-hell-season-1-ep-101',
'only_matching': True,
}]
class MTVVideoIE(MTVServicesInfoExtractor):
IE_NAME = 'mtv:video'
_VALID_URL = r'''(?x)^https?://
(?:(?:www\.)?mtv\.com/videos/.+?/(?P<videoid>[0-9]+)/[^/]+$|
m\.mtv\.com/videos/video\.rbml\?.*?id=(?P<mgid>[^&]+))'''
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -9,9 +9,9 @@ from ..utils import (
class MwaveIE(InfoExtractor):
_VALID_URL = r'https?://mwave\.interest\.me/mnettv/videodetail\.m\?searchVideoDetailVO\.clip_id=(?P<id>[0-9]+)'
_VALID_URL = r'https?://mwave\.interest\.me/(?:[^/]+/)?mnettv/videodetail\.m\?searchVideoDetailVO\.clip_id=(?P<id>[0-9]+)'
_URL_TEMPLATE = 'http://mwave.interest.me/mnettv/videodetail.m?searchVideoDetailVO.clip_id=%s'
_TEST = {
_TESTS = [{
'url': 'http://mwave.interest.me/mnettv/videodetail.m?searchVideoDetailVO.clip_id=168859',
# md5 is unstable
'info_dict': {
@@ -23,7 +23,10 @@ class MwaveIE(InfoExtractor):
'duration': 206,
'view_count': int,
}
}
}, {
'url': 'http://mwave.interest.me/en/mnettv/videodetail.m?searchVideoDetailVO.clip_id=176199',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -60,8 +63,8 @@ class MwaveIE(InfoExtractor):
class MwaveMeetGreetIE(InfoExtractor):
_VALID_URL = r'https?://mwave\.interest\.me/meetgreet/view/(?P<id>\d+)'
_TEST = {
_VALID_URL = r'https?://mwave\.interest\.me/(?:[^/]+/)?meetgreet/view/(?P<id>\d+)'
_TESTS = [{
'url': 'http://mwave.interest.me/meetgreet/view/256',
'info_dict': {
'id': '173294',
@@ -72,7 +75,10 @@ class MwaveMeetGreetIE(InfoExtractor):
'duration': 3634,
'view_count': int,
}
}
}, {
'url': 'http://mwave.interest.me/en/meetgreet/view/256',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -245,7 +245,11 @@ class NHLVideocenterCategoryIE(NHLBaseInfoExtractor):
class NHLIE(InfoExtractor):
IE_NAME = 'nhl.com'
_VALID_URL = r'https?://(?:www\.)?nhl\.com/([^/]+/)*c-(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?(?P<site>nhl|wch2016)\.com/(?:[^/]+/)*c-(?P<id>\d+)'
_SITES_MAP = {
'nhl': 'nhl',
'wch2016': 'wch',
}
_TESTS = [{
# type=video
'url': 'https://www.nhl.com/video/anisimov-cleans-up-mess/t-277752844/c-43663503',
@@ -270,13 +274,20 @@ class NHLIE(InfoExtractor):
'upload_date': '20160204',
'timestamp': 1454544904,
},
}, {
'url': 'https://www.wch2016.com/video/caneur-best-of-game-2-micd-up/t-281230378/c-44983703',
'only_matching': True,
}, {
'url': 'https://www.wch2016.com/news/3-stars-team-europe-vs-team-canada/c-282195068',
'only_matching': True,
}]
def _real_extract(self, url):
tmp_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
tmp_id, site = mobj.group('id'), mobj.group('site')
video_data = self._download_json(
'https://nhl.bamcontent.com/nhl/id/v1/%s/details/web-v1.json' % tmp_id,
tmp_id)
'https://nhl.bamcontent.com/%s/id/v1/%s/details/web-v1.json'
% (self._SITES_MAP[site], tmp_id), tmp_id)
if video_data.get('type') == 'article':
video_data = video_data['media']
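The NHL hunk above routes both nhl.com and wch2016.com through the same bamcontent metadata endpoint via a small site map. A standalone sketch of the resulting URL construction (plain Python, outside the extractor):

SITES_MAP = {'nhl': 'nhl', 'wch2016': 'wch'}

def details_url(site, content_id):
    # Builds the per-site details JSON URL the updated extractor requests.
    return 'https://nhl.bamcontent.com/%s/id/v1/%s/details/web-v1.json' % (
        SITES_MAP[site], content_id)

print(details_url('wch2016', '44983703'))
# https://nhl.bamcontent.com/wch/id/v1/44983703/details/web-v1.json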
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .brightcove import (
@@ -3,12 +3,15 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
fix_xml_ampersands,
orderedSet,
parse_duration,
qualities,
strip_jsonp,
unified_strdate,
ExtractorError,
)
@@ -180,9 +183,16 @@ class NPOIE(NPOBaseIE):
continue
streams = format_info.get('streams')
if streams:
video_info = self._download_json(
streams[0] + '&type=json',
video_id, 'Downloading %s stream JSON' % format_id)
try:
video_info = self._download_json(
streams[0] + '&type=json',
video_id, 'Downloading %s stream JSON' % format_id)
except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404:
error = (self._parse_json(ee.cause.read().decode(), video_id, fatal=False) or {}).get('errorstring')
if error:
raise ExtractorError(error, expected=True)
raise
else:
video_info = format_info
video_url = video_info.get('url')
@@ -438,9 +448,30 @@ class SchoolTVIE(InfoExtractor):
}
class VPROIE(NPOIE):
class NPOPlaylistBaseIE(NPOIE):
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result('npo:%s' % video_id if not video_id.startswith('http') else video_id)
for video_id in orderedSet(re.findall(self._PLAYLIST_ENTRY_RE, webpage))
]
playlist_title = self._html_search_regex(
self._PLAYLIST_TITLE_RE, webpage, 'playlist title',
default=None) or self._og_search_title(webpage)
return self.playlist_result(entries, playlist_id, playlist_title)
class VPROIE(NPOPlaylistBaseIE):
IE_NAME = 'vpro'
_VALID_URL = r'https?://(?:www\.)?(?:tegenlicht\.)?vpro\.nl/(?:[^/]+/){2,}(?P<id>[^/]+)\.html'
_VALID_URL = r'https?://(?:www\.)?(?:(?:tegenlicht\.)?vpro|2doc)\.nl/(?:[^/]+/)*(?P<id>[^/]+)\.html'
_PLAYLIST_TITLE_RE = (r'<h1[^>]+class=["\'].*?\bmedia-platform-title\b.*?["\'][^>]*>([^<]+)',
r'<h5[^>]+class=["\'].*?\bmedia-platform-subtitle\b.*?["\'][^>]*>([^<]+)')
_PLAYLIST_ENTRY_RE = r'data-media-id="([^"]+)"'
_TESTS = [
{
@@ -453,12 +484,13 @@ class VPROIE(NPOIE):
'description': 'md5:52cf4eefbc96fffcbdc06d024147abea',
'upload_date': '20130225',
},
'skip': 'Video gone',
},
{
'url': 'http://www.vpro.nl/programmas/2doc/2015/sergio-herman.html',
'info_dict': {
'id': 'sergio-herman',
'title': 'Sergio Herman: Fucking perfect',
'title': 'sergio herman: fucking perfect',
},
'playlist_count': 2,
},
@@ -467,54 +499,61 @@ class VPROIE(NPOIE):
'url': 'http://www.vpro.nl/programmas/2doc/2015/education-education.html',
'info_dict': {
'id': 'education-education',
'title': '2Doc',
'title': 'education education',
},
'playlist_count': 2,
},
{
'url': 'http://www.2doc.nl/documentaires/series/2doc/2015/oktober/de-tegenprestatie.html',
'info_dict': {
'id': 'de-tegenprestatie',
'title': 'De Tegenprestatie',
},
'playlist_count': 2,
}, {
'url': 'http://www.2doc.nl/speel~VARA_101375237~mh17-het-verdriet-van-nederland~.html',
'info_dict': {
'id': 'VARA_101375237',
'ext': 'm4v',
'title': 'MH17: Het verdriet van Nederland',
'description': 'md5:09e1a37c1fdb144621e22479691a9f18',
'upload_date': '20150716',
},
'params': {
# Skip because of m3u8 download
'skip_download': True
},
}
]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = [
self.url_result('npo:%s' % video_id if not video_id.startswith('http') else video_id)
for video_id in re.findall(r'data-media-id="([^"]+)"', webpage)
]
playlist_title = self._search_regex(
r'<title>\s*([^>]+?)\s*-\s*Teledoc\s*-\s*VPRO\s*</title>',
webpage, 'playlist title', default=None) or self._og_search_title(webpage)
return self.playlist_result(entries, playlist_id, playlist_title)
class WNLIE(InfoExtractor):
class WNLIE(NPOPlaylistBaseIE):
IE_NAME = 'wnl'
_VALID_URL = r'https?://(?:www\.)?omroepwnl\.nl/video/detail/(?P<id>[^/]+)__\d+'
_PLAYLIST_TITLE_RE = r'(?s)<h1[^>]+class="subject"[^>]*>(.+?)</h1>'
_PLAYLIST_ENTRY_RE = r'<a[^>]+href="([^"]+)"[^>]+class="js-mid"[^>]*>Deel \d+'
_TEST = {
_TESTS = [{
'url': 'http://www.omroepwnl.nl/video/detail/vandaag-de-dag-6-mei__060515',
'info_dict': {
'id': 'vandaag-de-dag-6-mei',
'title': 'Vandaag de Dag 6 mei',
},
'playlist_count': 4,
}
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
class AndereTijdenIE(NPOPlaylistBaseIE):
IE_NAME = 'anderetijden'
_VALID_URL = r'https?://(?:www\.)?anderetijden\.nl/programma/(?:[^/]+/)+(?P<id>[^/?#&]+)'
_PLAYLIST_TITLE_RE = r'(?s)<h1[^>]+class=["\'].*?\bpage-title\b.*?["\'][^>]*>(.+?)</h1>'
_PLAYLIST_ENTRY_RE = r'<figure[^>]+class=["\']episode-container episode-page["\'][^>]+data-prid=["\'](.+?)["\']'
entries = [
self.url_result('npo:%s' % video_id, 'NPO')
for video_id, part in re.findall(
r'<a[^>]+href="([^"]+)"[^>]+class="js-mid"[^>]*>(Deel \d+)', webpage)
]
playlist_title = self._html_search_regex(
r'(?s)<h1[^>]+class="subject"[^>]*>(.+?)</h1>',
webpage, 'playlist title')
return self.playlist_result(entries, playlist_id, playlist_title)
_TESTS = [{
'url': 'http://anderetijden.nl/programma/1/Andere-Tijden/aflevering/676/Duitse-soldaten-over-de-Slag-bij-Arnhem',
'info_dict': {
'id': 'Duitse-soldaten-over-de-Slag-bij-Arnhem',
'title': 'Duitse soldaten over de Slag bij Arnhem',
},
'playlist_count': 3,
}]
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
36
youtube_dl/extractor/nzz.py
Normal file
@@ -0,0 +1,36 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
extract_attributes,
)
class NZZIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nzz\.ch/(?:[^/]+/)*[^/?#]+-ld\.(?P<id>\d+)'
_TEST = {
'url': 'http://www.nzz.ch/zuerich/gymizyte/gymizyte-schreiben-schueler-heute-noch-diktate-ld.9153',
'info_dict': {
'id': '9153',
},
'playlist_mincount': 6,
}
def _real_extract(self, url):
page_id = self._match_id(url)
webpage = self._download_webpage(url, page_id)
entries = []
for player_element in re.findall(r'(<[^>]+class="kalturaPlayer"[^>]*>)', webpage):
player_params = extract_attributes(player_element)
if player_params.get('data-type') not in ('kaltura_singleArticle',):
self.report_warning('Unsupported player type')
continue
entry_id = player_params['data-id']
entries.append(self.url_result(
'kaltura:1750922:' + entry_id, 'Kaltura', entry_id))
return self.playlist_result(entries, page_id)
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -47,7 +47,7 @@ class OoyalaBaseIE(InfoExtractor):
delivery_type = stream['delivery_type']
if delivery_type == 'hls' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
s_url, embed_code, 'mp4', 'm3u8_native',
re.sub(r'/ip(?:ad|hone)/', '/all/', s_url), embed_code, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif delivery_type == 'hds' or ext == 'f4m':
formats.extend(self._extract_f4m_formats(
@@ -24,6 +24,22 @@ class OpenloadIE(InfoExtractor):
'title': 'skyrim_no-audio_1080.mp4',
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
'url': 'https://openload.co/embed/rjC09fkPLYs',
'info_dict': {
'id': 'rjC09fkPLYs',
'ext': 'mp4',
'title': 'movie.mp4',
'thumbnail': 're:^https?://.*\.jpg$',
'subtitles': {
'en': [{
'ext': 'vtt',
}],
},
},
'params': {
'skip_download': True, # test subtitles only
},
}, {
'url': 'https://openload.co/embed/kUEfGclsU9o/skyrim_no-audio_1080.mp4',
'only_matching': True,
@@ -51,7 +67,8 @@ class OpenloadIE(InfoExtractor):
# declared to be freely used in youtube-dl
# See https://github.com/rg3/youtube-dl/issues/10408
enc_data = self._html_search_regex(
r'<span[^>]+id="hiddenurl"[^>]*>([^<]+)</span>', webpage, 'encrypted data')
r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
webpage, 'encrypted data')
video_url_chars = []
@@ -60,7 +77,7 @@ class OpenloadIE(InfoExtractor):
if j >= 33 and j <= 126:
j = ((j + 14) % 94) + 33
if idx == len(enc_data) - 1:
j += 3
j += 2
video_url_chars += compat_chr(j)
video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
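The hunk above tweaks Openload's URL de-obfuscation: each character code in the printable range 33-126 is remapped with ((j + 14) % 94) + 33, and the extra offset applied to the final character changes from +3 to +2. A standalone sketch of that loop, assuming Python 3 (chr stands in for compat_chr); the constants simply track what the site serves at this revision and change often:

def decode_openload(enc_data):
    # Same arithmetic as the updated extractor loop.
    chars = []
    for idx, char in enumerate(enc_data):
        j = ord(char)
        if 33 <= j <= 126:
            j = ((j + 14) % 94) + 33
        if idx == len(enc_data) - 1:
            j += 2  # was j += 3 before this change
        chars.append(chr(j))
    return ''.join(chars)

print(decode_openload('abc'))  # 236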
@@ -70,11 +87,17 @@ class OpenloadIE(InfoExtractor):
'title', default=None) or self._html_search_meta(
'description', webpage, 'title', fatal=True)
return {
entries = self._parse_html5_media_entries(url, webpage, video_id)
subtitles = entries[0]['subtitles'] if entries else None
info_dict = {
'id': video_id,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage, default=None),
'url': video_url,
# Seems all videos have extensions in their titles
'ext': determine_ext(title),
'subtitles': subtitles,
}
return info_dict
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_iso8601,
@@ -41,6 +43,13 @@ class PeriscopeIE(PeriscopeBaseIE):
'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
mobj = re.search(
r'<iframe[^>]+src=([\'"])(?P<url>(?:https?:)?//(?:www\.)?periscope\.tv/(?:(?!\1).)+)\1', webpage)
if mobj:
return mobj.group('url')
def _real_extract(self, url):
token = self._match_id(url)
@@ -78,7 +87,7 @@ class PeriscopeIE(PeriscopeBaseIE):
'ext': 'flv' if format_id == 'rtmp' else 'mp4',
}
if format_id != 'rtmp':
f['protocol'] = 'm3u8_native' if state == 'ended' else 'm3u8'
f['protocol'] = 'm3u8_native' if state in ('ended', 'timed_out') else 'm3u8'
formats.append(f)
self._sort_formats(formats)
@@ -123,7 +132,7 @@ class PeriscopeUserIE(PeriscopeBaseIE):
user = list(data_store['UserCache']['users'].values())[0]['user']
user_id = user['id']
session_id = data_store['SessionToken']['broadcastHistory']['token']['session_id']
session_id = data_store['SessionToken']['public']['broadcastHistory']['token']['session_id']
broadcasts = self._call_api(
'getUserBroadcastsPublic',
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from ..compat import (
@@ -2,13 +2,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .jwplatform import JWPlatformBaseIE
from ..utils import (
str_to_int,
)
class PornoXOIE(InfoExtractor):
class PornoXOIE(JWPlatformBaseIE):
_VALID_URL = r'https?://(?:www\.)?pornoxo\.com/videos/(?P<id>\d+)/(?P<display_id>[^/]+)\.html'
_TEST = {
'url': 'http://www.pornoxo.com/videos/7564/striptease-from-sexy-secretary.html',
@@ -17,7 +17,8 @@ class PornoXOIE(InfoExtractor):
'id': '7564',
'ext': 'flv',
'title': 'Striptease From Sexy Secretary!',
'description': 'Striptease From Sexy Secretary!',
'display_id': 'striptease-from-sexy-secretary',
'description': 'md5:0ee35252b685b3883f4a1d38332f9980',
'categories': list, # NSFW
'thumbnail': 're:https?://.*\.jpg$',
'age_limit': 18,
@@ -26,23 +27,14 @@ class PornoXOIE(InfoExtractor):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id, display_id = mobj.groups()
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(
r'\'file\'\s*:\s*"([^"]+)"', webpage, 'video_url')
video_data = self._extract_jwplayer_data(webpage, video_id, require_title=False)
title = self._html_search_regex(
r'<title>([^<]+)\s*-\s*PornoXO', webpage, 'title')
description = self._html_search_regex(
r'<meta name="description" content="([^"]+)\s*featuring',
webpage, 'description', fatal=False)
thumbnail = self._html_search_regex(
r'\'image\'\s*:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
view_count = str_to_int(self._html_search_regex(
r'[vV]iews:\s*([0-9,]+)', webpage, 'view count', fatal=False))
@@ -53,13 +45,14 @@ class PornoXOIE(InfoExtractor):
None if categories_str is None
else categories_str.split(','))
return {
video_data.update({
'id': video_id,
'url': video_url,
'title': title,
'description': description,
'thumbnail': thumbnail,
'display_id': display_id,
'description': self._html_search_meta('description', webpage),
'categories': categories,
'view_count': view_count,
'age_limit': 18,
}
})
return video_data
@@ -7,7 +7,6 @@ from .common import InfoExtractor
from ..utils import (
determine_ext,
ExtractorError,
sanitized_Request,
urlencode_postdata,
)
@@ -15,12 +14,12 @@ from ..utils import (
class PromptFileIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?promptfile\.com/l/(?P<id>[0-9A-Z\-]+)'
_TEST = {
'url': 'http://www.promptfile.com/l/D21B4746E9-F01462F0FF',
'md5': 'd1451b6302da7215485837aaea882c4c',
'url': 'http://www.promptfile.com/l/86D1CE8462-576CAAE416',
'md5': '5a7e285a26e0d66d9a263fae91bc92ce',
'info_dict': {
'id': 'D21B4746E9-F01462F0FF',
'id': '86D1CE8462-576CAAE416',
'ext': 'mp4',
'title': 'Birds.mp4',
'title': 'oceans.mp4',
'thumbnail': 're:^https?://.*\.jpg$',
}
}
@@ -33,14 +32,23 @@ class PromptFileIE(InfoExtractor):
raise ExtractorError('Video %s does not exist' % video_id,
expected=True)
chash = self._search_regex(
r'val\("([^"]*)"\s*\+\s*\$\("#chash"\)', webpage, 'chash')
fields = self._hidden_inputs(webpage)
post = urlencode_postdata(fields)
req = sanitized_Request(url, post)
req.add_header('Content-type', 'application/x-www-form-urlencoded')
webpage = self._download_webpage(
req, video_id, 'Downloading video page')
keys = list(fields.keys())
chash_key = keys[0] if len(keys) == 1 else next(
key for key in keys if key.startswith('cha'))
fields[chash_key] = chash + fields[chash_key]
url = self._html_search_regex(r'url:\s*\'([^\']+)\'', webpage, 'URL')
webpage = self._download_webpage(
url, video_id, 'Downloading video page',
data=urlencode_postdata(fields),
headers={'Content-type': 'application/x-www-form-urlencoded'})
video_url = self._search_regex(
(r'<a[^>]+href=(["\'])(?P<url>(?:(?!\1).)+)\1[^>]*>\s*Download File',
r'<a[^>]+href=(["\'])(?P<url>https?://(?:www\.)?promptfile\.com/file/(?:(?!\1).)+)\1'),
webpage, 'video url', group='url')
title = self._html_search_regex(
r'<span.+title="([^"]+)">', webpage, 'title')
thumbnail = self._html_search_regex(
@@ -49,7 +57,7 @@ class PromptFileIE(InfoExtractor):
formats = [{
'format_id': 'sd',
'url': url,
'url': video_url,
'ext': determine_ext(title),
}]
self._sort_formats(formats)
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -122,7 +122,17 @@ class ProSiebenSat1BaseIE(InfoExtractor):
class ProSiebenSat1IE(ProSiebenSat1BaseIE):
IE_NAME = 'prosiebensat1'
IE_DESC = 'ProSiebenSat.1 Digital'
_VALID_URL = r'https?://(?:www\.)?(?:(?:prosieben|prosiebenmaxx|sixx|sat1|kabeleins|the-voice-of-germany|7tv)\.(?:de|at|ch)|ran\.de|fem\.com)/(?P<id>.+)'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?
(?:
(?:
prosieben(?:maxx)?|sixx|sat1(?:gold)?|kabeleins(?:doku)?|the-voice-of-germany|7tv|advopedia
)\.(?:de|at|ch)|
ran\.de|fem\.com|advopedia\.de
)
/(?P<id>.+)
'''
_TESTS = [
{
@@ -290,6 +300,24 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
'skip_download': True,
},
},
{
# geo restricted to Germany
'url': 'http://www.kabeleinsdoku.de/tv/mayday-alarm-im-cockpit/video/102-notlandung-im-hudson-river-ganze-folge',
'only_matching': True,
},
{
# geo restricted to Germany
'url': 'http://www.sat1gold.de/tv/edel-starck/video/11-staffel-1-episode-1-partner-wider-willen-ganze-folge',
'only_matching': True,
},
{
'url': 'http://www.sat1gold.de/tv/edel-starck/playlist/die-gesamte-1-staffel',
'only_matching': True,
},
{
'url': 'http://www.advopedia.de/videos/lenssen-klaert-auf/lenssen-klaert-auf-folge-8-staffel-3-feiertage-und-freie-tage',
'only_matching': True,
},
]
_TOKEN = 'prosieben'
@@ -361,19 +389,28 @@ class ProSiebenSat1IE(ProSiebenSat1BaseIE):
def _extract_playlist(self, url, webpage):
playlist_id = self._html_search_regex(
self._PLAYLIST_ID_REGEXES, webpage, 'playlist id')
for regex in self._PLAYLIST_CLIP_REGEXES:
playlist_clips = re.findall(regex, webpage)
if playlist_clips:
title = self._html_search_regex(
self._TITLE_REGEXES, webpage, 'title')
description = self._html_search_regex(
self._DESCRIPTION_REGEXES, webpage, 'description', fatal=False)
entries = [
self.url_result(
re.match('(.+?//.+?)/', url).group(1) + clip_path,
'ProSiebenSat1')
for clip_path in playlist_clips]
return self.playlist_result(entries, playlist_id, title, description)
playlist = self._parse_json(
self._search_regex(
'var\s+contentResources\s*=\s*(\[.+?\]);\s*</script',
webpage, 'playlist'),
playlist_id)
entries = []
for item in playlist:
clip_id = item.get('id') or item.get('upc')
if not clip_id:
continue
info = self._extract_video_info(url, clip_id)
info.update({
'id': clip_id,
'title': item.get('title') or item.get('teaser', {}).get('headline'),
'description': item.get('teaser', {}).get('description'),
'thumbnail': item.get('poster'),
'duration': float_or_none(item.get('duration')),
'series': item.get('tvShowTitle'),
'uploader': item.get('broadcastPublisher'),
})
entries.append(info)
return self.playlist_result(entries, playlist_id)
def _real_extract(self, url):
video_id = self._match_id(url)
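The ProSiebenSat.1 playlist path above now reads the page's embedded contentResources JSON array instead of following per-clip links. A standalone sketch of that extraction step (parse_content_resources is an illustrative helper, not youtube-dl API):

import json
import re

def parse_content_resources(webpage):
    # Pull the JSON array assigned to "var contentResources" out of the page.
    mobj = re.search(r'var\s+contentResources\s*=\s*(\[.+?\]);\s*</script', webpage, re.DOTALL)
    return json.loads(mobj.group(1)) if mobj else []

page = '<script>var contentResources = [{"id": "123", "title": "Clip"}];</script>'
print(parse_content_resources(page))  # [{'id': '123', 'title': 'Clip'}]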
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
from .prosiebensat1 import ProSiebenSat1BaseIE
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
# coding: utf-8
from __future__ import unicode_literals
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
@@ -1,4 +1,4 @@
# encoding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re