Compare commits

..

89 Commits

Author SHA1 Message Date
Sergey M․
3c1e9dc4ec release 2016.12.12 2016-12-12 01:44:50 +07:00
Sergey M․
62faf9b55e [ChangeLog] Actualize 2016-12-12 01:41:08 +07:00
Sergey M․
3530e0d3d9 [dplay] Use Safari user-agent for hls (closes #11418) 2016-12-12 00:58:08 +07:00
Sergey M․
fb37eb25d9 [utils] Add common user agents map 2016-12-12 00:49:07 +07:00
Sergey M․
d2d2495e16 [facebook] Detect login required error message 2016-12-11 01:40:30 +07:00
Sergey M․
19b4900b7b [facebook] Improve video selection (closes #11390) 2016-12-11 01:22:01 +07:00
Sergey M․
6ca478d44a [canalplus] Add another video id regex (closes #11399) 2016-12-11 00:45:27 +07:00
Sergey M․
655cb545ab [mixcloud] Relax _VALID_URL (closes #11406) 2016-12-10 23:48:18 +07:00
Remita Amine
f0b69fa91a [ctvnews] relax _VALID_URL regex(closes #11394) 2016-12-10 17:36:32 +01:00
Remita Amine
8821a718cf [common] recognize hls manifests that contain video only formats(#11394) 2016-12-10 17:22:15 +01:00
Remita Amine
0d7d9f9404 [rte] improve extraction(closes #10498)(closes #7746) 2016-12-10 16:34:01 +01:00
Remita Amine
f41db40596 [prosiebensat1] extract dash formats 2016-12-10 13:29:51 +01:00
Remita Amine
68601ef3ac [rts,srgssr] improve extraction for geo restricted videos(fixes #11089)(closes #4989) 2016-12-10 10:47:56 +01:00
Sergey M․
18ece70c4d release 2016.12.09 2016-12-09 02:46:18 +07:00
Sergey M․
9ed3495eae [ChangeLog] Actualize 2016-12-09 02:41:49 +07:00
Yen Chi Hsuan
6c20a0bb99 [openload] Fix extraction (closes #10408) 2016-12-09 02:15:16 +08:00
Sergey M․
f43795e56b [pandoratv] PEP 8 and simplify 2016-12-07 23:50:10 +07:00
Serkora
7441915b1e [pandoratv] Fix extraction (closes #11023) 2016-12-07 23:46:42 +07:00
Remita Amine
283d1c6a8b [telebruxelles] extract all formats and add support for emission urls 2016-12-06 19:01:17 +01:00
Sergey M․
875ddd7409 [bloomberg] Add another video id regex (closes #11371) 2016-12-06 00:41:03 +07:00
Sergey M․
4afa4ff223 [1tv] Fix video id extraction 2016-12-05 23:28:57 +07:00
vordep
3ed81714d8 [fusion] Update ooyala id regex 2016-12-05 22:43:36 +07:00
Yen Chi Hsuan
4bd7d9d4ae [socks] Refine exception model for better error handling
1. ProxyError now inherits from socket.error instead of IOError

The only functions socks.py overrides are connect and connect_ex. In
Python 2.x and Python <= 3.2, socket functions raises socket.error. In
newer Python versions, those functions raises OSError instead. The name
socket.error is preserved as an alias of OSError for backward
compability. To keep socks.py compatible with Python's standard library,
it should raise the same exception as raw sockets.

See PEP 3151 (https://www.python.org/dev/peps/pep-3151/) for more
information about the change in Python 3.3.

2. Raise EOFError instead of IOError when the socket receives less data
than it expects

There's no common convention, but both ftplib and telnetlib raises
EOFError for similar situations. socks.py follows them.

Closes #11355

In #11355, only Python 2 is affected. In Python 3, both socket.error and
IOError are alias of OSError, so AbstractHTTPHandler.do_open correctly
catches the error and thus InfoExtractor._is_valid_url works fine.
2016-12-05 00:43:37 +08:00
Sergey M․
9b5288c92a [1tv] Improve extraction and add support for playlists (closes #11335) 2016-12-04 23:35:21 +07:00
Yen Chi Hsuan
8344296619 [socks] Fix error reporting (#11355) 2016-12-03 21:53:41 +08:00
Remita Amine
a94e7f4a0c [aenetworks] extract more formats(closes #11321) 2016-12-01 12:15:35 +01:00
Yen Chi Hsuan
d17bfe4095 [thisoldhouse] Recognize /tv-episode/ URLs and update _TESTS
Closes #11271
2016-12-01 14:56:52 +08:00
Laneone
98b08f94b1 [README.md] Fix typo
Just a minor spelling mistake in the readme
2016-12-01 01:31:21 +07:00
Sergey M․
73ec479c7d release 2016.12.01 2016-12-01 00:15:12 +07:00
Sergey M․
f150530f4d [ChangeLog] Actualize 2016-12-01 00:13:06 +07:00
Sergey M․
4c4765dba2 [soundcloud] Update client id (closes #11327) 2016-11-30 23:17:30 +07:00
Philipp Hagemeister
f882554815 [comedcycentral] Give /shows/.+/full-episodes URLs to the COmedyCentralFullEpisodesIE 2016-11-30 11:52:19 +01:00
Sergey M․
db75f14d8a [ruutu] Detect DRM videos 2016-11-30 04:19:38 +07:00
Sergey M․
8b0d3ee64e [liveleak] Simplify and PEP 8 2016-11-29 23:42:19 +07:00
Varun
3779d524df [liveleak] Add support for youtube embeds 2016-11-29 23:37:30 +07:00
Mark Lee
6303fc8204 [spike] Fix full episodes extraction 2016-11-29 23:06:01 +07:00
Philipp Hagemeister
cc61fc3934 [comedycentral] Add new extractor for full-episodes
CC seems to have added yet another indirection for full episodes - the mgid is now only in a linked feed.
This may be a little brittle, but it's better than failing outright.
Plus, the current The Daily Show episode now works :)
2016-11-29 10:12:18 +01:00
Sergey M․
c2530d3319 [teamfourstar] Simplify _VALID_URL and relax regexes 2016-11-28 23:22:29 +07:00
felix
8953319916 [screenwavemedia] Remove extractor
Rewrite TeamFourStar and Normalboots extractors in terms of JWPlatform
2016-11-28 23:17:56 +07:00
Yen Chi Hsuan
51b1378eed Ignore and clean .swf files
Some videos on NicoNico are swf
2016-11-27 22:01:07 +08:00
Sergey M․
2b380fc299 release 2016.11.27 2016-11-27 20:05:32 +07:00
Sergey M․
294d4926d7 [ChangeLog] Actualize 2016-11-27 20:04:03 +07:00
Sergey M․
83f1481baa [extractor/generic] Add support for webcaster.pro embeds 2016-11-27 19:56:32 +07:00
Sergey M․
f25e1c8d8c [webcaster] Add support for webcaster.pro 2016-11-27 19:54:59 +07:00
Sergey M․
6901673868 [azubu] Add support for azubu.uol.com.br (closes #11305) 2016-11-27 15:40:28 +07:00
Sergey M․
560c8c6ec0 [viki] Prefer hls 2016-11-26 00:14:09 +07:00
Sergey M․
9338a0eae3 [viki] Fix rtmp formats extraction (closes #11255) 2016-11-26 00:13:46 +07:00
Sergey M․
74394b5e10 [puls4] Relax _VALID_URL (closes #11267) 2016-11-25 23:37:32 +07:00
Sergey M․
1db058466d [vevo] Allow video info to fail in tests 2016-11-24 23:10:58 +07:00
Sergey M․
e94eeb1dd3 [vevo] Simplify artists extraction 2016-11-24 23:09:35 +07:00
Andrew J. Erickson
8b27d83e4e vevo: fixing naming when there are featured artists 2016-11-24 23:07:28 +07:00
Sergey M․
8eb7b5c3f1 [mitele] Modernize and extract more metadata 2016-11-24 22:43:02 +07:00
zurfyx
b68599ed47 [mitele] Relax _VALID_URL 2016-11-24 21:57:53 +07:00
Yen Chi Hsuan
44444f0d3b [cbslocal] Support newyork.cbslocal.com
Closes #11285
2016-11-24 20:32:17 +08:00
Sergey M․
c867adc68c [youtube:playlist] Pass disable_polymer in query (closes #11193, closes #11270) 2016-11-23 23:28:32 +07:00
Sergey M․
3b5daf0736 release 2016.11.22 2016-11-22 22:32:16 +07:00
Sergey M․
c8f56741dd [ChangeLog] Actualize 2016-11-22 22:29:37 +07:00
Andy Savicki
868630fbe5 [hellporno] Add support for hellporno.net and improve ext extraction 2016-11-22 22:16:10 +07:00
Yen Chi Hsuan
1d6ae5628f [amcnetworks] Recognize more BBC America URLs
Closes #11263
2016-11-22 20:40:57 +08:00
Sergey M․
6334794f2a [funnyordie] Copy formats' metadata from hls and sort formats 2016-11-21 23:46:55 +07:00
Andy Savicki
4eece8ba57 [funnyordie] Improve extraction 2016-11-21 22:16:26 +07:00
Yen Chi Hsuan
2574721a81 Clean and ignore more file types
ape is another audio codec seen in kuwo. See
https://en.wikipedia.org/wiki/Monkey's_Audio
2016-11-21 12:50:13 +08:00
Yen Chi Hsuan
dbcc4a6b32 [CONTRIBUTING.md] Fix broken links (#11239) 2016-11-21 12:25:19 +08:00
Yen Chi Hsuan
0bb58a208b Merge pull request #11239 from josephfrazier/patch-1
[CONTRIBUTING.md] Fix broken link
2016-11-21 12:24:11 +08:00
Joseph Frazier
dc6a9e4195 [README.md] Update link from generated CONTRIBUTING.md 2016-11-20 11:32:00 -05:00
Sergey M․
8f8f182d0b [extractor/generic] Improve limelight embeds support 2016-11-20 02:13:21 +07:00
Yen Chi Hsuan
2176e466e0 Merge branch 'DarkstaIkers-master' 2016-11-20 00:07:35 +08:00
Yen Chi Hsuan
303b38fa84 [ChangeLog] Update for #9028 2016-11-20 00:06:44 +08:00
Yen Chi Hsuan
fb27d0ce5e Merge branch 'master' of https://github.com/DarkstaIkers/youtube-dl into DarkstaIkers-master 2016-11-20 00:05:11 +08:00
Sergey M․
0aacd2deb1 [bandcamp] Fix free downloads extraction and extract all formats (closes #11067) 2016-11-19 04:18:21 +07:00
Sergey M․
08ec95a6db [ChangeLog] Actualize 2016-11-19 03:10:20 +07:00
Sergey M․
df46b19cb8 [toutv] Fix login form regex (closes #11223) 2016-11-19 01:56:31 +07:00
Sergey M․
748a462fbe [twitter:card] Relax _VALID_URL (closes #11225) 2016-11-19 01:49:13 +07:00
Sergey M․
c131fc3372 [tvanouvelles] Add extractor (closes #10616) 2016-11-18 01:16:33 +07:00
Sergey M․
b25459b88a release 2016.11.18 2016-11-18 00:25:24 +07:00
Sergey M․
5f75c4a4ad [ChangeLog] Actualize 2016-11-18 00:19:55 +07:00
Sergey M․
689f31fde5 [devscripts/create-github-release] Fill release body from ChangeLog (closes #11094) 2016-11-18 00:17:46 +07:00
Yen Chi Hsuan
582be35847 Update coding style after pycodestyle 2.1.0
In pycodestyle 2.1.0, E305 was introduced, which requires two blank
lines after top level declarations, too.

See https://github.com/PyCQA/pycodestyle/issues/400

See also #10689; thanks @stepshal for first mentioning this issue and
initial patches
2016-11-17 19:45:42 +08:00
Sergey M․
073d5bf583 [youtube:live] Relax _VALID_URL (closes #11164) 2016-11-16 23:15:19 +07:00
Yen Chi Hsuan
315cb86a95 Merge pull request #11210 from FooBarQuaxx/patch-2
Strip only args urls
2016-11-16 23:29:37 +08:00
FooBarQuaxx
b2fc1c4fb9 Add explanatory comment 2016-11-16 18:18:54 +03:00
Yen Chi Hsuan
d76767c90e [ChangeLog] Update after #11122 landed 2016-11-16 20:47:15 +08:00
Yen Chi Hsuan
eceba9f805 Merge pull request #11122 from kasper93/openload
[openload] Fix extraction.
2016-11-16 20:43:19 +08:00
MAA
d755396804 Strip only args urls 2016-11-16 09:00:30 +03:00
Sergey M․
58355a3bf1 [vlive] Add test for #11203 2016-11-15 22:11:47 +07:00
ping
49b69ad91c [vlive] Prefer locale over language for subtitles id 2016-11-15 22:07:17 +07:00
Kacper Michajłow
95ad9ce573 [openload] Fix extraction.
aadecode code was restored from commit c1decda58c
with some optimizations (2x faster).

Fixes #10408
2016-11-11 15:36:57 +01:00
Kacper Michajłow
189935f159 [jsinterp] Fix function calls without arguments. 2016-11-11 15:36:57 +01:00
DarkstaIkers
6cbb20bb09 Update crunchyroll.py 2016-03-29 14:26:24 -03:00
87 changed files with 1162 additions and 593 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.11.14.1*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.11.14.1**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.12.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.12.12**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.11.14.1
[debug] youtube-dl version 2016.12.12
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

3
.gitignore vendored
View File

@@ -31,6 +31,9 @@ updates_key.pem
*.mp3
*.3gp
*.wav
*.ape
*.mkv
*.swf
*.part
*.swp
test/testdata

View File

@@ -92,7 +92,7 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):

View File

@@ -1,3 +1,89 @@
version 2016.12.12
Core
+ [utils] Add common user agents map
+ [common] Recognize HLS manifests that contain video only formats (#11394)
Extractors
+ [dplay] Use Safari user agent for HLS (#11418)
+ [facebook] Detect login required error message
* [facebook] Improve video selection (#11390)
+ [canalplus] Add another video id pattern (#11399)
* [mixcloud] Relax URL regular expression (#11406)
* [ctvnews] Relax URL regular expression (#11394)
+ [rte] Capture and output error message (#7746, #10498)
+ [prosiebensat1] Add support for DASH formats
* [srgssr] Improve extraction for geo restricted videos (#11089)
* [rts] Improve extraction for geo restricted videos (#4989)
version 2016.12.09
Core
* [socks] Fix error reporting (#11355)
Extractors
* [openload] Fix extraction (#10408)
* [pandoratv] Fix extraction (#11023)
+ [telebruxelles] Add support for emission URLs
* [telebruxelles] Extract all formats
+ [bloomberg] Add another video id regular expression (#11371)
* [fusion] Update ooyala id regular expression (#11364)
+ [1tv] Add support for playlists (#11335)
* [1tv] Improve extraction (#11335)
+ [aenetworks] Extract more formats (#11321)
+ [thisoldhouse] Recognize /tv-episode/ URLs (#11271)
version 2016.12.01
Extractors
* [soundcloud] Update client id (#11327)
* [ruutu] Detect DRM protected videos
+ [liveleak] Add support for youtube embeds (#10688)
* [spike] Fix full episodes support (#11312)
* [comedycentral] Fix full episodes support
* [normalboots] Rewrite in terms of JWPlatform (#11184)
* [teamfourstar] Rewrite in terms of JWPlatform (#11184)
- [screenwavemedia] Remove extractor (#11184)
version 2016.11.27
Extractors
+ [webcaster] Add support for webcaster.pro
+ [azubu] Add support for azubu.uol.com.br (#11305)
* [viki] Prefer hls formats
* [viki] Fix rtmp formats extraction (#11255)
* [puls4] Relax URL regular expression (#11267)
* [vevo] Improve artist extraction (#10911)
* [mitele] Relax URL regular expression and extract more metadata (#11244)
+ [cbslocal] Recognize New York site (#11285)
+ [youtube:playlist] Pass disable_polymer in URL query (#11193)
version 2016.11.22
Extractors
* [hellporno] Fix video extension extraction (#11247)
+ [hellporno] Add support for hellporno.net (#11247)
+ [amcnetworks] Recognize more BBC America URLs (#11263)
* [funnyordie] Improve extraction (#11208)
* [extractor/generic] Improve limelight embeds support
- [crunchyroll] Remove ScaledBorderAndShadow from ASS subtitles (#8207, #9028)
* [bandcamp] Fix free downloads extraction and extract all formats (#11067)
* [twitter:card] Relax URL regular expression (#11225)
+ [tvanouvelles] Add support for tvanouvelles.ca (#10616)
version 2016.11.18
Extractors
* [youtube:live] Relax URL regular expression (#11164)
* [openload] Fix extraction (#10408, #11122)
* [vlive] Prefer locale over language for subtitles id (#11203)
version 2016.11.14.1
Core

View File

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete

View File

@@ -664,7 +664,7 @@ $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger that 50 MB
# Download best video only format but no bigger than 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
@@ -930,7 +930,7 @@ If you want to create a build of youtube-dl yourself, you'll need
### Adding support for a new site
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](README.md#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):

View File

@@ -25,5 +25,6 @@ def build_completion(opt_parser):
filled_template = template.replace("{{flags}}", " ".join(opts_flag))
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -2,11 +2,13 @@
from __future__ import unicode_literals
import base64
import io
import json
import mimetypes
import netrc
import optparse
import os
import re
import sys
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
@@ -90,16 +92,23 @@ class GitHubReleaser(object):
def main():
parser = optparse.OptionParser(usage='%prog VERSION BUILDPATH')
parser = optparse.OptionParser(usage='%prog CHANGELOG VERSION BUILDPATH')
options, args = parser.parse_args()
if len(args) != 2:
if len(args) != 3:
parser.error('Expected a version and a build directory')
version, build_path = args
changelog_file, version, build_path = args
with io.open(changelog_file, encoding='utf-8') as inf:
changelog = inf.read()
mobj = re.search(r'(?s)version %s\n{2}(.+?)\n{3}' % version, changelog)
body = mobj.group(1) if mobj else ''
releaser = GitHubReleaser()
new_release = releaser.create_release(version, name='youtube-dl %s' % version)
new_release = releaser.create_release(
version, name='youtube-dl %s' % version, body=body)
release_id = new_release['id']
for asset in os.listdir(build_path):

View File

@@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(FISH_COMPLETION_FILE, 'w') as f:
f.write(filled_template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -23,6 +23,7 @@ def openssl_encode(algo, key, iv):
out, _ = prog.communicate(secret_msg)
return out
iv = key = [0x20, 0x15] + 14 * [0]
r = openssl_encode('aes-128-cbc', key, iv)

View File

@@ -32,5 +32,6 @@ def main():
with open('supportedsites.html', 'w', encoding='utf-8') as sitesf:
sitesf.write(template)
if __name__ == '__main__':
main()

View File

@@ -28,5 +28,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -59,6 +59,7 @@ def build_lazy_ie(ie, name):
s += make_valid_template.format(valid_url=ie._make_valid_url())
return s
# find the correct sorting and add the required base classes so that sublcasses
# can be correctly created
classes = _ALL_CLASSES[:-1]

View File

@@ -41,5 +41,6 @@ def main():
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -74,5 +74,6 @@ def filter_options(readme):
return ret
if __name__ == '__main__':
main()

View File

@@ -110,7 +110,7 @@ RELEASE_FILES="youtube-dl youtube-dl.exe youtube-dl-$version.tar.gz"
for f in $RELEASE_FILES; do gpg --passphrase-repeat 5 --detach-sig "build/$version/$f"; done
ROOT=$(pwd)
python devscripts/create-github-release.py $version "$ROOT/build/$version"
python devscripts/create-github-release.py ChangeLog $version "$ROOT/build/$version"
ssh ytdl@yt-dl.org "sh html/update_latest.sh $version"

View File

@@ -44,5 +44,6 @@ def build_completion(opt_parser):
with open(ZSH_COMPLETION_FILE, "w") as f:
f.write(template)
parser = youtube_dl.parseOpts()[0]
build_completion(parser)

View File

@@ -158,6 +158,7 @@
- **CollegeRama**
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralFullEpisodes**
- **ComedyCentralShortname**
- **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
@@ -643,7 +644,6 @@
- **Screencast**
- **ScreencastOMatic**
- **ScreenJunkies**
- **ScreenwaveMedia**
- **Seeker**
- **SenateISVP**
- **SendtoNews**
@@ -715,7 +715,7 @@
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamFour**
- **TeamFourStar**
- **TechTalks**
- **techtv.mit.edu**
- **ted**
@@ -771,6 +771,8 @@
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se
- **TVANouvelles**
- **TVANouvellesArticle**
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
@@ -880,6 +882,8 @@
- **WatchIndianPorn**: Watch Indian Porn
- **WDR**
- **wdr:mobile**
- **Webcaster**
- **WebcasterFeed**
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **WeiqiTV**: WQTV

View File

@@ -84,5 +84,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -51,5 +51,6 @@ class TestAES(unittest.TestCase):
decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg)
if __name__ == '__main__':
unittest.main()

View File

@@ -60,6 +60,7 @@ def _file_md5(fn):
with open(fn, 'rb') as f:
return hashlib.md5(f.read()).hexdigest()
defs = gettestcases()
@@ -217,6 +218,7 @@ def generator(test_case):
return test_template
# And add them to TestDownload
for n, test_case in enumerate(defs):
test_method = generator(test_case)

View File

@@ -39,5 +39,6 @@ class TestExecution(unittest.TestCase):
_, stderr = p.communicate()
self.assertFalse(stderr)
if __name__ == '__main__':
unittest.main()

View File

@@ -169,5 +169,6 @@ class TestProxy(unittest.TestCase):
# b'xn--fiq228c' is '中文'.encode('idna')
self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
if __name__ == '__main__':
unittest.main()

View File

@@ -43,5 +43,6 @@ class TestIqiyiSDKInterpreter(unittest.TestCase):
ie._login()
self.assertTrue('unable to log in:' in logger.messages[0])
if __name__ == '__main__':
unittest.main()

View File

@@ -104,6 +104,14 @@ class TestJSInterpreter(unittest.TestCase):
}''')
self.assertEqual(jsi.call_function('x'), [20, 20, 30, 40, 50])
def test_call(self):
jsi = JSInterpreter('''
function x() { return 2; }
function y(a) { return x() + a; }
function z() { return y(3); }
''')
self.assertEqual(jsi.call_function('z'), 5)
if __name__ == '__main__':
unittest.main()

View File

@@ -1075,5 +1075,6 @@ The first line
self.assertEqual(get_element_by_class('foo', html), 'nice')
self.assertEqual(get_element_by_class('no-such-class', html), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -66,5 +66,6 @@ class TestVerboseOutput(unittest.TestCase):
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
if __name__ == '__main__':
unittest.main()

View File

@@ -24,6 +24,7 @@ class YoutubeDL(youtube_dl.YoutubeDL):
super(YoutubeDL, self).__init__(*args, **kwargs)
self.to_stderr = self.to_screen
params = get_params({
'writeannotations': True,
'skip_download': True,
@@ -74,5 +75,6 @@ class TestAnnotations(unittest.TestCase):
def tearDown(self):
try_rm(ANNOTATIONS_FILE)
if __name__ == '__main__':
unittest.main()

View File

@@ -66,5 +66,6 @@ class TestYoutubeLists(unittest.TestCase):
for entry in result['entries']:
self.assertTrue(entry.get('title'))
if __name__ == '__main__':
unittest.main()

View File

@@ -114,6 +114,7 @@ def make_tfunc(url, stype, sig_input, expected_sig):
test_func.__name__ = str('test_signature_' + stype + '_' + test_id)
setattr(TestSignature, test_func.__name__, test_func)
for test_spec in _TESTS:
make_tfunc(*test_spec)

View File

@@ -95,8 +95,7 @@ def _real_main(argv=None):
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError:
sys.exit('ERROR: batch file could not be read')
all_urls = batch_urls + args
all_urls = [url.strip() for url in all_urls]
all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls
_enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@@ -450,4 +449,5 @@ def main(argv=None):
except KeyboardInterrupt:
sys.exit('\nERROR: Interrupted by user')
__all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']

View File

@@ -174,6 +174,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
return plaintext
RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36)
SBOX = (0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
@@ -328,4 +329,5 @@ def inc(data):
break
return data
__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']

View File

@@ -2491,6 +2491,7 @@ class _TreeBuilder(etree.TreeBuilder):
def doctype(self, name, pubid, system):
pass
if sys.version_info[0] >= 3:
def compat_etree_fromstring(text):
return etree.XML(text, parser=etree.XMLParser(target=_TreeBuilder()))
@@ -2787,6 +2788,7 @@ def workaround_optparse_bug9161():
return real_add_option(self, *bargs, **bkwargs)
optparse.OptionGroup.add_option = _compat_add_option
if hasattr(shutil, 'get_terminal_size'): # Python >= 3.3
compat_get_terminal_size = shutil.get_terminal_size
else:

View File

@@ -293,6 +293,7 @@ class FFmpegFD(ExternalFD):
class AVconvFD(FFmpegFD):
pass
_BY_NAME = dict(
(klass.get_basename(), klass)
for name, klass in globals().items()

View File

@@ -26,7 +26,7 @@ class AENetworksIE(AENetworksBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?P<domain>(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?:shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|movies/(?P<movie_display_id>[^/]+)/full-movie)'
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
@@ -99,7 +99,7 @@ class AENetworksIE(AENetworksBaseIE):
query = {
'mbr': 'true',
'assetTypes': 'medium_video_s3'
'assetTypes': 'high_video_s3'
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
@@ -155,7 +155,7 @@ class HistoryTopicIE(AENetworksBaseIE):
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 24,
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
@@ -193,7 +193,8 @@ class HistoryTopicIE(AENetworksBaseIE):
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
@@ -203,6 +204,7 @@ class HistoryTopicIE(AENetworksBaseIE):
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls'
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))

View File

@@ -10,7 +10,7 @@ from ..utils import (
class AMCNetworksIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?season-\d+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?(?:amc|bbcamerica|ifc|wetv)\.com/(?:movies/|shows/[^/]+/(?:full-episodes/)?[^/]+/episode-\d+(?:-(?:[^/]+/)?|/))(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://www.ifc.com/shows/maron/season-04/episode-01/step-1',
'md5': '',
@@ -41,6 +41,9 @@ class AMCNetworksIE(ThePlatformIE):
}, {
'url': 'http://www.ifc.com/movies/chaos',
'only_matching': True,
}, {
'url': 'http://www.bbcamerica.com/shows/doctor-who/full-episodes/the-power-of-the-daleks/episode-01-episode-1-color-version',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -11,7 +11,7 @@ from ..utils import (
class AzubuIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/[^/]+#!/play/(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
@@ -103,12 +103,15 @@ class AzubuIE(InfoExtractor):
class AzubuLiveIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/(?P<id>[^/]+)$'
_VALID_URL = r'https?://(?:www\.)?azubu\.(?:tv|uol.com.br)/(?P<id>[^/]+)$'
_TEST = {
_TESTS = [{
'url': 'http://www.azubu.tv/MarsTVMDLen',
'only_matching': True,
}
}, {
'url': 'http://azubu.uol.com.br/adolfz',
'only_matching': True,
}]
def _real_extract(self, url):
user = self._match_id(url)

View File

@@ -1,7 +1,9 @@
from __future__ import unicode_literals
import json
import random
import re
import time
from .common import InfoExtractor
from ..compat import (
@@ -12,6 +14,9 @@ from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
parse_filesize,
unescapeHTML,
update_url_query,
)
@@ -81,35 +86,68 @@ class BandcampIE(InfoExtractor):
r'(?ms)var TralbumData = .*?[{,]\s*id: (?P<id>\d+),?$',
webpage, 'video id')
download_webpage = self._download_webpage(download_link, video_id, 'Downloading free downloads page')
# We get the dictionary of the track from some javascript code
all_info = self._parse_json(self._search_regex(
r'(?sm)items: (.*?),$', download_webpage, 'items'), video_id)
info = all_info[0]
# We pick mp3-320 for now, until format selection can be easily implemented.
mp3_info = info['downloads']['mp3-320']
# If we try to use this url it says the link has expired
initial_url = mp3_info['url']
m_url = re.match(
r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$',
initial_url)
# We build the url we will use to get the final track url
# This url is build in Bandcamp in the script download_bunde_*.js
request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
# If we could correctly generate the .rand field the url would be
# in the "download_url" key
final_url = self._proto_relative_url(self._search_regex(
r'"retry_url":"(.+?)"', final_url_webpage, 'final video URL'), 'http:')
download_webpage = self._download_webpage(
download_link, video_id, 'Downloading free downloads page')
blob = self._parse_json(
self._search_regex(
r'data-blob=(["\'])(?P<blob>{.+?})\1', download_webpage,
'blob', group='blob'),
video_id, transform_source=unescapeHTML)
info = blob['digital_items'][0]
downloads = info['downloads']
track = info['title']
artist = info.get('artist')
title = '%s - %s' % (artist, track) if artist else track
download_formats = {}
for f in blob['download_formats']:
name, ext = f.get('name'), f.get('file_extension')
if all(isinstance(x, compat_str) for x in (name, ext)):
download_formats[name] = ext.strip('.')
formats = []
for format_id, f in downloads.items():
format_url = f.get('url')
if not format_url:
continue
# Stat URL generation algorithm is reverse engineered from
# download_*_bundle_*.js
stat_url = update_url_query(
format_url.replace('/download/', '/statdownload/'), {
'.rand': int(time.time() * 1000 * random.random()),
})
format_id = f.get('encoding_name') or format_id
stat = self._download_json(
stat_url, video_id, 'Downloading %s JSON' % format_id,
transform_source=lambda s: s[s.index('{'):s.rindex('}') + 1],
fatal=False)
if not stat:
continue
retry_url = stat.get('retry_url')
if not isinstance(retry_url, compat_str):
continue
formats.append({
'url': self._proto_relative_url(retry_url, 'http:'),
'ext': download_formats.get(format_id),
'format_id': format_id,
'format_note': f.get('description'),
'filesize': parse_filesize(f.get('size_mb')),
'vcodec': 'none',
})
self._sort_formats(formats)
return {
'id': video_id,
'title': info['title'],
'ext': 'mp3',
'vcodec': 'none',
'url': final_url,
'title': title,
'thumbnail': info.get('thumb_url'),
'uploader': info.get('artist'),
'artist': artist,
'track': track,
'formats': formats,
}

View File

@@ -45,7 +45,8 @@ class BloombergIE(InfoExtractor):
name = self._match_id(url)
webpage = self._download_webpage(url, name)
video_id = self._search_regex(
r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
(r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1',
r'videoId\s*:\s*(["\'])(?P<url>(?:(?!\1).)+)\1'),
webpage, 'id', group='url', default=None)
if not video_id:
bplayer_data = self._parse_json(self._search_regex(

View File

@@ -105,7 +105,8 @@ class CanalplusIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
[r'<canal:player[^>]+?videoId=(["\'])(?P<id>\d+)',
r'id=["\']canal_video_player(?P<id>\d+)'],
r'id=["\']canal_video_player(?P<id>\d+)',
r'data-video=["\'](?P<id>\d+)'],
webpage, 'video id', group='id')
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)

View File

@@ -283,11 +283,6 @@ class CBCWatchVideoIE(CBCWatchBaseIE):
formats = self._extract_m3u8_formats(re.sub(r'/([^/]+)/[^/?]+\.m3u8', r'/\1/\1.m3u8', m3u8_url), video_id, 'mp4', fatal=False)
if len(formats) < 2:
formats = self._extract_m3u8_formats(m3u8_url, video_id, 'mp4')
# Despite metadata in m3u8 all video+audio formats are
# actually video-only (no audio)
for f in formats:
if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
f['acodec'] = 'none'
self._sort_formats(formats)
info = {

View File

@@ -4,11 +4,14 @@ from __future__ import unicode_literals
from .anvato import AnvatoIE
from .sendtonews import SendtoNewsIE
from ..compat import compat_urlparse
from ..utils import unified_timestamp
from ..utils import (
parse_iso8601,
unified_timestamp,
)
class CBSLocalIE(AnvatoIE):
_VALID_URL = r'https?://[a-z]+\.cbslocal\.com/\d+/\d+/\d+/(?P<id>[0-9a-z-]+)'
_VALID_URL = r'https?://[a-z]+\.cbslocal\.com/(?:\d+/\d+/\d+|video)/(?P<id>[0-9a-z-]+)'
_TESTS = [{
# Anvato backend
@@ -49,6 +52,31 @@ class CBSLocalIE(AnvatoIE):
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://newyork.cbslocal.com/video/3580809-a-very-blue-anniversary/',
'info_dict': {
'id': '3580809',
'ext': 'mp4',
'title': 'A Very Blue Anniversary',
'description': 'CBS2s Cindy Hsu has more.',
'thumbnail': 're:^https?://.*',
'timestamp': 1479962220,
'upload_date': '20161124',
'uploader': 'CBS',
'subtitles': {
'en': 'mincount:5',
},
'categories': [
'Stations\\Spoken Word\\WCBSTV',
'Syndication\\AOL',
'Syndication\\MSN',
'Syndication\\NDN',
'Syndication\\Yahoo',
'Content\\News',
'Content\\News\\Local News',
],
'tags': ['CBS 2 News Weekends', 'Cindy Hsu', 'Blue Man Group'],
},
}]
def _real_extract(self, url):
@@ -64,8 +92,11 @@ class CBSLocalIE(AnvatoIE):
info_dict = self._extract_anvato_videos(webpage, display_id)
time_str = self._html_search_regex(
r'class="entry-date">([^<]+)<', webpage, 'released date', fatal=False)
timestamp = unified_timestamp(time_str)
r'class="entry-date">([^<]+)<', webpage, 'released date', default=None)
if time_str:
timestamp = unified_timestamp(time_str)
else:
timestamp = parse_iso8601(self._html_search_meta('uploadDate', webpage))
info_dict.update({
'display_id': display_id,

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
class ComedyCentralIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(video-clips|episodes|cc-studios|video-collections|full-episodes|shows)
(video-clips|episodes|cc-studios|video-collections|shows(?=/[^/]+/(?!full-episodes)))
/(?P<title>.*)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
@@ -27,6 +27,40 @@ class ComedyCentralIE(MTVServicesInfoExtractor):
}]
class ComedyCentralFullEpisodesIE(MTVServicesInfoExtractor):
_VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
(?:full-episodes|shows(?=/[^/]+/full-episodes))
/(?P<id>[^?]+)'''
_FEED_URL = 'http://comedycentral.com/feeds/mrss/'
_TESTS = [{
'url': 'http://www.cc.com/full-episodes/pv391a/the-daily-show-with-trevor-noah-november-28--2016---ryan-speedo-green-season-22-ep-22028',
'info_dict': {
'description': 'Donald Trump is accused of exploiting his president-elect status for personal gain, Cuban leader Fidel Castro dies, and Ryan Speedo Green discusses "Sing for Your Life."',
'title': 'November 28, 2016 - Ryan Speedo Green',
},
'playlist_count': 4,
}, {
'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/full-episodes',
'only_matching': True,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
feed_json = self._search_regex(r'var triforceManifestFeed\s*=\s*(\{.+?\});\n', webpage, 'triforce feeed')
feed = self._parse_json(feed_json, playlist_id)
zones = feed['manifest']['zones']
video_zone = zones['t2_lc_promo1']
feed = self._download_json(video_zone['feed'], playlist_id)
mgid = feed['result']['data']['id']
videos_info = self._get_videos_info(mgid)
return videos_info
class ToshIE(MTVServicesInfoExtractor):
IE_DESC = 'Tosh.0'
_VALID_URL = r'^https?://tosh\.cc\.com/video-(?:clips|collections)/[^/]+/(?P<videotitle>[^/?#]+)'

View File

@@ -1224,6 +1224,7 @@ class InfoExtractor(object):
'protocol': entry_protocol,
'preference': preference,
}]
audio_groups = set()
last_info = {}
last_media = {}
for line in m3u8_doc.splitlines():
@@ -1239,15 +1240,18 @@ class InfoExtractor(object):
for v in (media.get('GROUP-ID'), media.get('NAME')):
if v:
format_id.append(v)
formats.append({
f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'language': media.get('LANGUAGE'),
'vcodec': 'none' if media_type == 'AUDIO' else None,
'ext': ext,
'protocol': entry_protocol,
'preference': preference,
})
}
if media_type == 'AUDIO':
f['vcodec'] = 'none'
audio_groups.add(media['GROUP-ID'])
formats.append(f)
else:
# When there is no URI in EXT-X-MEDIA let this tag's
# data be used by regular URI lines below
@@ -1295,6 +1299,9 @@ class InfoExtractor(object):
'abr': abr,
})
f.update(parse_codecs(last_info.get('CODECS')))
if last_info.get('AUDIO') in audio_groups:
# TODO: update acodec for for audio only formats with the same GROUP-ID
f['acodec'] = 'none'
formats.append(f)
last_info = {}
last_media = {}

View File

@@ -236,7 +236,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
output += """ScaledBorderAndShadow: yes
output += """ScaledBorderAndShadow: no
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding

View File

@@ -8,7 +8,7 @@ from ..utils import orderedSet
class CTVNewsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)'
_VALID_URL = r'https?://(?:.+?\.)?ctvnews\.ca/(?:video\?(?:clip|playlist|bin)Id=|.*?)(?P<id>[0-9.]+)'
_TESTS = [{
'url': 'http://www.ctvnews.ca/video?clipId=901995',
'md5': '10deb320dc0ccb8d01d34d12fc2ea672',
@@ -40,6 +40,9 @@ class CTVNewsIE(InfoExtractor):
}, {
'url': 'http://www.ctvnews.ca/canadiens-send-p-k-subban-to-nashville-in-blockbuster-trade-1.2967231',
'only_matching': True,
}, {
'url': 'http://vancouverisland.ctvnews.ca/video?clipId=761241',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -8,6 +8,7 @@ import time
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
USER_AGENTS,
int_or_none,
update_url_query,
)
@@ -102,10 +103,16 @@ class DPlayIE(InfoExtractor):
manifest_url, video_id, ext='mp4',
entry_protocol='m3u8_native', m3u8_id=protocol, fatal=False)
# Sometimes final URLs inside m3u8 are unsigned, let's fix this
# ourselves
# ourselves. Also fragments' URLs are only served signed for
# Safari user agent.
query = compat_urlparse.parse_qs(compat_urlparse.urlparse(manifest_url).query)
for m3u8_format in m3u8_formats:
m3u8_format['url'] = update_url_query(m3u8_format['url'], query)
m3u8_format.update({
'url': update_url_query(m3u8_format['url'], query),
'http_headers': {
'User-Agent': USER_AGENTS['Safari'],
},
})
formats.extend(m3u8_formats)
elif protocol == 'hds':
formats.extend(self._extract_f4m_formats(

View File

@@ -180,6 +180,7 @@ from .cnn import (
from .coub import CoubIE
from .collegerama import CollegeRamaIE
from .comedycentral import (
ComedyCentralFullEpisodesIE,
ComedyCentralIE,
ComedyCentralShortnameIE,
ComedyCentralTVIE,
@@ -804,7 +805,6 @@ from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .screenjunkies import ScreenJunkiesIE
from .screenwavemedia import ScreenwaveMediaIE, TeamFourIE
from .seeker import SeekerIE
from .senateisvp import SenateISVPIE
from .sendtonews import SendtoNewsIE
@@ -897,6 +897,7 @@ from .teachertube import (
)
from .teachingchannel import TeachingChannelIE
from .teamcoco import TeamcocoIE
from .teamfourstar import TeamFourStarIE
from .techtalks import TechTalksIE
from .ted import TEDIE
from .tele13 import Tele13IE
@@ -965,6 +966,10 @@ from .tv2 import (
)
from .tv3 import TV3IE
from .tv4 import TV4IE
from .tvanouvelles import (
TVANouvellesIE,
TVANouvellesArticleIE,
)
from .tvc import (
TVCIE,
TVCArticleIE,
@@ -1117,6 +1122,10 @@ from .wdr import (
WDRIE,
WDRMobileIE,
)
from .webcaster import (
WebcasterIE,
WebcasterFeedIE,
)
from .webofstories import (
WebOfStoriesIE,
WebOfStoriesPlaylistIE,

View File

@@ -244,8 +244,10 @@ class FacebookIE(InfoExtractor):
r'handleServerJS\(({.+})(?:\);|,")', webpage, 'server js data', default='{}'), video_id)
for item in server_js_data.get('instances', []):
if item[1][0] == 'VideoConfig':
video_data = item[2][0]['videoData']
break
video_item = item[2][0]
if video_item.get('video_id') == video_id:
video_data = video_item['videoData']
break
if not video_data:
if not fatal_if_no_video:
@@ -255,6 +257,8 @@ class FacebookIE(InfoExtractor):
raise ExtractorError(
'The video is not available, Facebook said: "%s"' % m_msg.group(1),
expected=True)
elif '>You must log in to continue' in webpage:
self.raise_login_required()
else:
raise ExtractorError('Cannot parse data')

View File

@@ -2,7 +2,10 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..compat import (
compat_str,
compat_urlparse,
)
from ..utils import (
int_or_none,
qualities,
@@ -22,8 +25,7 @@ class FirstTVIE(InfoExtractor):
'info_dict': {
'id': '40049',
'ext': 'mp4',
'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
'description': 'md5:36a39c1d19618fec57d12efe212a8370',
'title': 'Гость Людмила Сенчина. Наедине со всеми. Выпуск от 12.02.2015',
'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20150212',
'duration': 2694,
@@ -34,8 +36,7 @@ class FirstTVIE(InfoExtractor):
'info_dict': {
'id': '364746',
'ext': 'mp4',
'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
'description': 'md5:a242eea0031fd180a4497d52640a9572',
'title': 'Весенняя аллергия. Доброе утро. Фрагмент выпуска от 07.04.2016',
'thumbnail': 're:^https?://.*\.(?:jpg|JPG)$',
'upload_date': '20160407',
'duration': 179,
@@ -44,6 +45,17 @@ class FirstTVIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.1tv.ru/news/issue/2016-12-01/14:00',
'info_dict': {
'id': '14:00',
'title': 'Выпуск новостей в 14:00 1 декабря 2016 года. Новости. Первый канал',
'description': 'md5:2e921b948f8c1ff93901da78ebdb1dfd',
},
'playlist_count': 13,
}, {
'url': 'http://www.1tv.ru/shows/tochvtoch-supersezon/vystupleniya/evgeniy-dyatlov-vladimir-vysockiy-koni-priveredlivye-toch-v-toch-supersezon-fragment-vypuska-ot-06-11-2016',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -51,43 +63,66 @@ class FirstTVIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
playlist_url = compat_urlparse.urljoin(url, self._search_regex(
r'data-playlist-url="([^"]+)', webpage, 'playlist url'))
r'data-playlist-url=(["\'])(?P<url>(?:(?!\1).)+)\1',
webpage, 'playlist url', group='url'))
item = self._download_json(playlist_url, display_id)[0]
video_id = item['id']
quality = qualities(('ld', 'sd', 'hd', ))
formats = []
for f in item.get('mbr', []):
src = f.get('src')
if not src:
continue
fname = f.get('name')
formats.append({
'url': src,
'format_id': fname,
'quality': quality(fname),
parsed_url = compat_urlparse.urlparse(playlist_url)
qs = compat_urlparse.parse_qs(parsed_url.query)
item_ids = qs.get('videos_ids[]') or qs.get('news_ids[]')
items = self._download_json(playlist_url, display_id)
if item_ids:
items = [
item for item in items
if item.get('uid') and compat_str(item['uid']) in item_ids]
else:
items = [items[0]]
entries = []
QUALITIES = ('ld', 'sd', 'hd', )
for item in items:
title = item['title']
quality = qualities(QUALITIES)
formats = []
for f in item.get('mbr', []):
src = f.get('src')
if not src or not isinstance(src, compat_str):
continue
tbr = int_or_none(self._search_regex(
r'_(\d{3,})\.mp4', src, 'tbr', default=None))
formats.append({
'url': src,
'format_id': f.get('name'),
'tbr': tbr,
'quality': quality(f.get('name')),
})
self._sort_formats(formats)
thumbnail = item.get('poster') or self._og_search_thumbnail(webpage)
duration = int_or_none(item.get('duration') or self._html_search_meta(
'video:duration', webpage, 'video duration', fatal=False))
upload_date = unified_strdate(self._html_search_meta(
'ya:ovs:upload_date', webpage, 'upload date', default=None))
entries.append({
'id': compat_str(item.get('id') or item['uid']),
'thumbnail': thumbnail,
'title': title,
'upload_date': upload_date,
'duration': int_or_none(duration),
'formats': formats
})
self._sort_formats(formats)
title = self._html_search_regex(
(r'<div class="tv_translation">\s*<h1><a href="[^"]+">([^<]*)</a>',
r"'title'\s*:\s*'([^']+)'"),
webpage, 'title', default=None) or item['title']
webpage, 'title', default=None) or self._og_search_title(
webpage, default=None)
description = self._html_search_regex(
r'<div class="descr">\s*<div>&nbsp;</div>\s*<p>([^<]*)</p></div>',
webpage, 'description', default=None) or self._html_search_meta(
'description', webpage, 'description')
duration = int_or_none(self._html_search_meta(
'video:duration', webpage, 'video duration', fatal=False))
upload_date = unified_strdate(self._html_search_meta(
'ya:ovs:upload_date', webpage, 'upload date', fatal=False))
'description', webpage, 'description', default=None)
return {
'id': video_id,
'thumbnail': item.get('poster') or self._og_search_thumbnail(webpage),
'title': title,
'description': description,
'upload_date': upload_date,
'duration': int_or_none(duration),
'formats': formats
}
return self.playlist_result(entries, display_id, title, description)

View File

@@ -28,6 +28,9 @@ class FunnyOrDieIE(InfoExtractor):
'description': 'Please use this to sell something. www.jonlajoie.com',
'thumbnail': 're:^http:.*\.jpg$',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://www.funnyordie.com/articles/ebf5e34fc8/10-hours-of-walking-in-nyc-as-a-man',
'only_matching': True,
@@ -51,19 +54,45 @@ class FunnyOrDieIE(InfoExtractor):
formats = []
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
m3u8_formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False)
source_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
m3u8_formats))
bitrates = [int(bitrate) for bitrate in re.findall(r'[,/]v(\d+)[,/]', m3u8_url)]
bitrates = [int(bitrate) for bitrate in re.findall(r'[,/]v(\d+)(?=[,/])', m3u8_url)]
bitrates.sort()
for bitrate in bitrates:
for link in links:
formats.append({
'url': self._proto_relative_url('%s%d.%s' % (link[0], bitrate, link[1])),
'format_id': '%s-%d' % (link[1], bitrate),
'vbr': bitrate,
})
if source_formats:
self._sort_formats(source_formats)
for bitrate, f in zip(bitrates, source_formats or [{}] * len(bitrates)):
for path, ext in links:
ff = f.copy()
if ff:
if ext != 'mp4':
ff = dict(
[(k, v) for k, v in ff.items()
if k in ('height', 'width', 'format_id')])
ff.update({
'format_id': ff['format_id'].replace('hls', ext),
'ext': ext,
'protocol': 'http',
})
else:
ff.update({
'format_id': '%s-%d' % (ext, bitrate),
'vbr': bitrate,
})
ff['url'] = self._proto_relative_url(
'%s%d.%s' % (path, bitrate, ext))
formats.append(ff)
self._check_formats(formats, video_id)
formats.extend(m3u8_formats)
self._sort_formats(
formats, field_preference=('height', 'width', 'tbr', 'format_id'))
subtitles = {}
for src, src_lang in re.findall(r'<track kind="captions" src="([^"]+)" srclang="([^"]+)"', webpage):

View File

@@ -29,7 +29,7 @@ class FusionIE(InfoExtractor):
webpage = self._download_webpage(url, display_id)
ooyala_code = self._search_regex(
r'data-video-id=(["\'])(?P<code>.+?)\1',
r'data-ooyala-id=(["\'])(?P<code>(?:(?!\1).)+)\1',
webpage, 'ooyala code', group='code')
return OoyalaIE._build_url_result(ooyala_code)

View File

@@ -56,10 +56,10 @@ from .dailymotion import (
)
from .onionstudios import OnionStudiosIE
from .viewlift import ViewLiftEmbedIE
from .screenwavemedia import ScreenwaveMediaIE
from .mtv import MTVServicesEmbeddedIE
from .pladform import PladformIE
from .videomore import VideomoreIE
from .webcaster import WebcasterFeedIE
from .googledrive import GoogleDriveIE
from .jwplatform import JWPlatformIE
from .digiteka import DigitekaIE
@@ -1189,16 +1189,6 @@ class GenericIE(InfoExtractor):
'duration': 248.667,
},
},
# ScreenwaveMedia embed
{
'url': 'http://www.thecinemasnob.com/the-cinema-snob/a-nightmare-on-elm-street-2-freddys-revenge1',
'md5': '24ace5baba0d35d55c6810b51f34e9e0',
'info_dict': {
'id': 'cinemasnob-55d26273809dd',
'ext': 'mp4',
'title': 'cinemasnob',
},
},
# BrightcoveInPageEmbed embed
{
'url': 'http://www.geekandsundry.com/tabletop-bonus-wils-final-thoughts-on-dread/',
@@ -2140,6 +2130,11 @@ class GenericIE(InfoExtractor):
if videomore_url:
return self.url_result(videomore_url)
# Look for Webcaster embeds
webcaster_url = WebcasterFeedIE._extract_url(self, webpage)
if webcaster_url:
return self.url_result(webcaster_url, ie=WebcasterFeedIE.ie_key())
# Look for Playwire embeds
mobj = re.search(
r'<script[^>]+data-config=(["\'])(?P<url>(?:https?:)?//config\.playwire\.com/.+?)\1', webpage)
@@ -2206,11 +2201,6 @@ class GenericIE(InfoExtractor):
if jwplatform_url:
return self.url_result(jwplatform_url, 'JWPlatform')
# Look for ScreenwaveMedia embeds
mobj = re.search(ScreenwaveMediaIE.EMBED_PATTERN, webpage)
if mobj is not None:
return self.url_result(unescapeHTML(mobj.group('url')), 'ScreenwaveMedia')
# Look for Digiteka embeds
digiteka_url = DigitekaIE._extract_url(webpage)
if digiteka_url:
@@ -2232,6 +2222,16 @@ class GenericIE(InfoExtractor):
return self.url_result('limelight:%s:%s' % (
lm[mobj.group(1)], mobj.group(2)), 'Limelight%s' % mobj.group(1), mobj.group(2))
mobj = re.search(
r'''(?sx)
<object[^>]+class=(["\'])LimelightEmbeddedPlayerFlash\1[^>]*>.*?
<param[^>]+
name=(["\'])flashVars\2[^>]+
value=(["\'])(?:(?!\3).)*mediaId=(?P<id>[a-z0-9]{32})
''', webpage)
if mobj:
return self.url_result('limelight:media:%s' % mobj.group('id'))
# Look for AdobeTVVideo embeds
mobj = re.search(
r'<iframe[^>]+src=[\'"]((?:https?:)?//video\.tv\.adobe\.com/v/\d+[^"]+)[\'"]',

View File

@@ -6,12 +6,13 @@ from .common import InfoExtractor
from ..utils import (
js_to_json,
remove_end,
determine_ext,
)
class HellPornoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?hellporno\.com/videos/(?P<id>[^/]+)'
_TEST = {
_VALID_URL = r'https?://(?:www\.)?hellporno\.(?:com/videos|net/v)/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://hellporno.com/videos/dixie-is-posing-with-naked-ass-very-erotic/',
'md5': '1fee339c610d2049699ef2aa699439f1',
'info_dict': {
@@ -22,7 +23,10 @@ class HellPornoIE(InfoExtractor):
'thumbnail': 're:https?://.*\.jpg$',
'age_limit': 18,
}
}
}, {
'url': 'http://hellporno.net/v/186271/',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -38,7 +42,7 @@ class HellPornoIE(InfoExtractor):
video_id = flashvars.get('video_id')
thumbnail = flashvars.get('preview_url')
ext = flashvars.get('postfix', '.mp4')[1:]
ext = determine_ext(flashvars.get('postfix'), 'mp4')
formats = []
for video_url_key in ['video_url', 'video_alt_url']:

View File

@@ -54,6 +54,22 @@ class LiveLeakIE(InfoExtractor):
'title': 'Crazy Hungarian tourist films close call waterspout in Croatia',
'thumbnail': 're:^https?://.*\.jpg$'
}
}, {
# Covers https://github.com/rg3/youtube-dl/pull/10664#issuecomment-247439521
'url': 'http://m.liveleak.com/view?i=763_1473349649',
'add_ie': ['Youtube'],
'info_dict': {
'id': '763_1473349649',
'ext': 'mp4',
'title': 'Reporters and public officials ignore epidemic of black on asian violence in Sacramento | Colin Flaherty',
'description': 'Colin being the warrior he is and showing the injustice Asians in Sacramento are being subjected to.',
'uploader': 'Ziz',
'upload_date': '20160908',
'uploader_id': 'UCEbta5E_jqlZmEJsriTEtnw'
},
'params': {
'skip_download': True,
},
}]
@staticmethod
@@ -87,7 +103,7 @@ class LiveLeakIE(InfoExtractor):
else:
# Maybe an embed?
embed_url = self._search_regex(
r'<iframe[^>]+src="(http://www.prochan.com/embed\?[^"]+)"',
r'<iframe[^>]+src="(https?://(?:www\.)?(?:prochan|youtube)\.com/embed[^"]+)"',
webpage, 'embed URL')
return {
'_type': 'url_transparent',
@@ -107,6 +123,7 @@ class LiveLeakIE(InfoExtractor):
'format_note': s.get('label'),
'url': s['file'],
} for i, s in enumerate(sources)]
for i, s in enumerate(sources):
# Removing '.h264_*.mp4' gives the raw video, which is essentially
# the same video without the LiveLeak logo at the top (see

View File

@@ -75,7 +75,7 @@ class MiTeleBaseIE(InfoExtractor):
class MiTeleIE(InfoExtractor):
IE_DESC = 'mitele.es'
_VALID_URL = r'https?://(?:www\.)?mitele\.es/programas-tv/(?:[^/]+/)(?P<id>[^/]+)/player'
_VALID_URL = r'https?://(?:www\.)?mitele\.es/(?:[^/]+/)+(?P<id>[^/]+)/player'
_TESTS = [{
'url': 'http://www.mitele.es/programas-tv/diario-de/57b0dfb9c715da65618b4afa/player',
@@ -86,7 +86,10 @@ class MiTeleIE(InfoExtractor):
'description': 'md5:3b6fce7eaa41b2d97358726378d9369f',
'series': 'Diario de',
'season': 'La redacción',
'season_number': 14,
'season_id': 'diario_de_t14_11981',
'episode': 'Programa 144',
'episode_number': 3,
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'duration': 2913,
},
@@ -101,7 +104,10 @@ class MiTeleIE(InfoExtractor):
'description': 'md5:5ff132013f0cd968ffbf1f5f3538a65f',
'series': 'Cuarto Milenio',
'season': 'Temporada 6',
'season_number': 6,
'season_id': 'cuarto_milenio_t06_12715',
'episode': 'Programa 226',
'episode_number': 24,
'thumbnail': 're:(?i)^https?://.*\.jpg$',
'duration': 7313,
},
@@ -109,41 +115,77 @@ class MiTeleIE(InfoExtractor):
'skip_download': True,
},
'add_ie': ['Ooyala'],
}, {
'url': 'http://www.mitele.es/series-online/la-que-se-avecina/57aac5c1c915da951a8b45ed/player',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
gigya_url = self._search_regex(r'<gigya-api>[^>]*</gigya-api>[^>]*<script\s*src="([^"]*)">[^>]*</script>', webpage, 'gigya', default=None)
gigya_sc = self._download_webpage(compat_urlparse.urljoin(r'http://www.mitele.es/', gigya_url), video_id, 'Downloading gigya script')
gigya_url = self._search_regex(
r'<gigya-api>[^>]*</gigya-api>[^>]*<script\s+src="([^"]*)">[^>]*</script>',
webpage, 'gigya', default=None)
gigya_sc = self._download_webpage(
compat_urlparse.urljoin('http://www.mitele.es/', gigya_url),
video_id, 'Downloading gigya script')
# Get a appKey/uuid for getting the session key
appKey_var = self._search_regex(r'value\("appGridApplicationKey",([0-9a-f]+)\)', gigya_sc, 'appKey variable')
appKey = self._search_regex(r'var %s="([0-9a-f]+)"' % appKey_var, gigya_sc, 'appKey')
uid = compat_str(uuid.uuid4())
session_url = 'https://appgrid-api.cloud.accedo.tv/session?appKey=%s&uuid=%s' % (appKey, uid)
session_json = self._download_json(session_url, video_id, 'Downloading session keys')
sessionKey = compat_str(session_json['sessionKey'])
appKey_var = self._search_regex(
r'value\s*\(\s*["\']appGridApplicationKey["\']\s*,\s*([0-9a-f]+)',
gigya_sc, 'appKey variable')
appKey = self._search_regex(
r'var\s+%s\s*=\s*["\']([0-9a-f]+)' % appKey_var, gigya_sc, 'appKey')
session_json = self._download_json(
'https://appgrid-api.cloud.accedo.tv/session',
video_id, 'Downloading session keys', query={
'appKey': appKey,
'uuid': compat_str(uuid.uuid4()),
})
paths = self._download_json(
'https://appgrid-api.cloud.accedo.tv/metadata/general_configuration,%20web_configuration',
video_id, 'Downloading paths JSON',
query={'sessionKey': compat_str(session_json['sessionKey'])})
paths_url = 'https://appgrid-api.cloud.accedo.tv/metadata/general_configuration,%20web_configuration?sessionKey=' + sessionKey
paths = self._download_json(paths_url, video_id, 'Downloading paths JSON')
ooyala_s = paths['general_configuration']['api_configuration']['ooyala_search']
data_p = (
'http://' + ooyala_s['base_url'] + ooyala_s['full_path'] + ooyala_s['provider_id'] +
'/docs/' + video_id + '?include_titles=Series,Season&product_name=test&format=full')
data = self._download_json(data_p, video_id, 'Downloading data JSON')
source = data['hits']['hits'][0]['_source']
embedCode = source['offers'][0]['embed_codes'][0]
source = self._download_json(
'http://%s%s%s/docs/%s' % (
ooyala_s['base_url'], ooyala_s['full_path'],
ooyala_s['provider_id'], video_id),
video_id, 'Downloading data JSON', query={
'include_titles': 'Series,Season',
'product_name': 'test',
'format': 'full',
})['hits']['hits'][0]['_source']
embedCode = source['offers'][0]['embed_codes'][0]
titles = source['localizable_titles'][0]
title = titles.get('title_medium') or titles['title_long']
episode = titles['title_sort_name']
description = titles['summary_long']
titles_series = source['localizable_titles_series'][0]
series = titles_series['title_long']
titles_season = source['localizable_titles_season'][0]
season = titles_season['title_medium']
duration = parse_duration(source['videos'][0]['duration'])
description = titles.get('summary_long') or titles.get('summary_medium')
def get(key1, key2):
value1 = source.get(key1)
if not value1 or not isinstance(value1, list):
return
if not isinstance(value1[0], dict):
return
return value1[0].get(key2)
series = get('localizable_titles_series', 'title_medium')
season = get('localizable_titles_season', 'title_medium')
season_number = int_or_none(source.get('season_number'))
season_id = source.get('season_id')
episode = titles.get('title_sort_name')
episode_number = int_or_none(source.get('episode_number'))
duration = parse_duration(get('videos', 'duration'))
return {
'_type': 'url_transparent',
@@ -154,7 +196,10 @@ class MiTeleIE(InfoExtractor):
'description': description,
'series': series,
'season': season,
'season_number': season_number,
'season_id': season_id,
'episode': episode,
'episode_number': episode_number,
'duration': duration,
'thumbnail': source['images'][0]['url'],
'thumbnail': get('images', 'url'),
}

View File

@@ -22,7 +22,7 @@ from ..utils import (
class MixcloudIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:www\.)?mixcloud\.com/([^/]+)/(?!stream|uploads|favorites|listens|playlists)([^/]+)'
_VALID_URL = r'https?://(?:(?:www|beta|m)\.)?mixcloud\.com/([^/]+)/(?!stream|uploads|favorites|listens|playlists)([^/]+)'
IE_NAME = 'mixcloud'
_TESTS = [{
@@ -51,6 +51,9 @@ class MixcloudIE(InfoExtractor):
'view_count': int,
'like_count': int,
},
}, {
'url': 'https://beta.mixcloud.com/RedLightRadio/nosedrip-15-red-light-radio-01-18-2016/',
'only_matching': True,
}]
# See https://www.mixcloud.com/media/js2/www_js_2.9e23256562c080482435196ca3975ab5.js

View File

@@ -78,11 +78,6 @@ class MSNIE(InfoExtractor):
m3u8_formats = self._extract_m3u8_formats(
format_url, display_id, 'mp4',
m3u8_id='hls', fatal=False)
# Despite metadata in m3u8 all video+audio formats are
# actually video-only (no audio)
for f in m3u8_formats:
if f.get('acodec') != 'none' and f.get('vcodec') != 'none':
f['acodec'] = 'none'
formats.extend(m3u8_formats)
else:
formats.append({

View File

@@ -13,6 +13,7 @@ from ..utils import (
fix_xml_ampersands,
float_or_none,
HEADRequest,
NO_DEFAULT,
RegexNotFoundError,
sanitized_Request,
strip_or_none,
@@ -201,7 +202,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
[self._get_video_info(item) for item in idoc.findall('.//item')],
playlist_title=title, playlist_description=description)
def _extract_mgid(self, webpage):
def _extract_mgid(self, webpage, default=NO_DEFAULT):
try:
# the url can be http://media.mtvnservices.com/fb/{mgid}.swf
# or http://media.mtvnservices.com/{mgid}
@@ -221,7 +222,7 @@ class MTVServicesInfoExtractor(InfoExtractor):
sm4_embed = self._html_search_meta(
'sm4:video:embed', webpage, 'sm4 embed', default='')
mgid = self._search_regex(
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid')
r'embed/(mgid:.+?)["\'&?/]', sm4_embed, 'mgid', default=default)
return mgid
def _real_extract(self, url):

View File

@@ -2,7 +2,7 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .screenwavemedia import ScreenwaveMediaIE
from .jwplatform import JWPlatformIE
from ..utils import (
unified_strdate,
@@ -25,7 +25,7 @@ class NormalbootsIE(InfoExtractor):
# m3u8 download
'skip_download': True,
},
'add_ie': ['ScreenwaveMedia'],
'add_ie': ['JWPlatform'],
}
def _real_extract(self, url):
@@ -39,15 +39,13 @@ class NormalbootsIE(InfoExtractor):
r'<span style="text-transform:uppercase; font-size:inherit;">[A-Za-z]+, (?P<date>.*)</span>',
webpage, 'date', fatal=False))
screenwavemedia_url = self._html_search_regex(
ScreenwaveMediaIE.EMBED_PATTERN, webpage, 'screenwave URL',
group='url')
jwplatform_url = JWPlatformIE._extract_url(webpage)
return {
'_type': 'url_transparent',
'id': video_id,
'url': screenwavemedia_url,
'ie_key': ScreenwaveMediaIE.ie_key(),
'url': jwplatform_url,
'ie_key': JWPlatformIE.ie_key(),
'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage),
'thumbnail': self._og_search_thumbnail(webpage),

View File

@@ -1,11 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals, division
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_chr,
compat_ord,
)
from ..compat import compat_chr
from ..utils import (
determine_ext,
ExtractorError,
@@ -63,29 +60,20 @@ class OpenloadIE(InfoExtractor):
if 'File not found' in webpage or 'deleted by the owner' in webpage:
raise ExtractorError('File not found', expected=True)
# The following decryption algorithm is written by @yokrysty and
# declared to be freely used in youtube-dl
# See https://github.com/rg3/youtube-dl/issues/10408
enc_data = self._html_search_regex(
r'<span[^>]*>([^<]+)</span>\s*<span[^>]*>[^<]+</span>\s*<span[^>]+id="streamurl"',
webpage, 'encrypted data')
ol_id = self._search_regex(
'<span[^>]+id="[a-zA-Z0-9]+x"[^>]*>([0-9]+)</span>',
webpage, 'openload ID')
magic = compat_ord(enc_data[-1])
video_url_chars = []
first_two_chars = int(float(ol_id[0:][:2]))
urlcode = ''
num = 2
for idx, c in enumerate(enc_data):
j = compat_ord(c)
if j == magic:
j -= 1
elif j == magic - 1:
j += 1
if j >= 33 and j <= 126:
j = ((j + 14) % 94) + 33
if idx == len(enc_data) - 1:
j += 3
video_url_chars += compat_chr(j)
while num < len(ol_id):
urlcode += compat_chr(int(float(ol_id[num:][:3])) -
first_two_chars * int(float(ol_id[num + 3:][:2])))
num += 5
video_url = 'https://openload.co/stream/%s?mime=true' % ''.join(video_url_chars)
video_url = 'https://openload.co/stream/' + urlcode
title = self._og_search_title(webpage, default=None) or self._search_regex(
r'<span[^>]+class=["\']title["\'][^>]*>([^<]+)', webpage,
@@ -104,5 +92,4 @@ class OpenloadIE(InfoExtractor):
'ext': determine_ext(title),
'subtitles': subtitles,
}
return info_dict

View File

@@ -11,6 +11,7 @@ from ..utils import (
float_or_none,
parse_duration,
str_to_int,
urlencode_postdata,
)
@@ -56,6 +57,22 @@ class PandoraTVIE(InfoExtractor):
r'^v(\d+)[Uu]rl$', format_id, 'height', default=None)
if not height:
continue
play_url = self._download_json(
'http://m.pandora.tv/?c=api&m=play_url', video_id,
data=urlencode_postdata({
'prgid': video_id,
'runtime': info.get('runtime'),
'vod_url': format_url,
}),
headers={
'Origin': url,
'Content-Type': 'application/x-www-form-urlencoded',
})
format_url = play_url.get('url')
if not format_url:
continue
formats.append({
'format_id': '%sp' % height,
'url': format_url,

View File

@@ -85,6 +85,9 @@ class ProSiebenSat1BaseIE(InfoExtractor):
formats.extend(self._extract_m3u8_formats(
source_url, clip_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif mimetype == 'application/dash+xml':
formats.extend(self._extract_mpd_formats(
source_url, clip_id, mpd_id='dash', fatal=False))
else:
tbr = fix_bitrate(source['bitrate'])
if protocol in ('rtmp', 'rtmpe'):

View File

@@ -10,7 +10,7 @@ from ..utils import (
class Puls4IE(ProSiebenSat1BaseIE):
_VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>(?:[^/]+/)*?videos/[^?#]+)'
_VALID_URL = r'https?://(?:www\.)?puls4\.com/(?P<id>[^?#&]+)'
_TESTS = [{
'url': 'http://www.puls4.com/2-minuten-2-millionen/staffel-3/videos/2min2miotalk/Tobias-Homberger-von-myclubs-im-2min2miotalk-118118',
'md5': 'fd3c6b0903ac72c9d004f04bc6bb3e03',
@@ -22,6 +22,12 @@ class Puls4IE(ProSiebenSat1BaseIE):
'upload_date': '20160830',
'uploader': 'PULS_4',
},
}, {
'url': 'http://www.puls4.com/pro-und-contra/wer-wird-prasident/Ganze-Folgen/Wer-wird-Praesident.-Norbert-Hofer',
'only_matching': True,
}, {
'url': 'http://www.puls4.com/pro-und-contra/wer-wird-prasident/Ganze-Folgen/Wer-wird-Praesident-Analyse-des-Interviews-mit-Norbert-Hofer-416598',
'only_matching': True,
}]
_TOKEN = 'puls4'
_SALT = '01!kaNgaiNgah1Ie4AeSha'

View File

@@ -4,118 +4,31 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
float_or_none,
parse_iso8601,
unescapeHTML,
ExtractorError,
)
class RteIE(InfoExtractor):
IE_NAME = 'rte'
IE_DESC = 'Raidió Teilifís Éireann TV'
_VALID_URL = r'https?://(?:www\.)?rte\.ie/player/[^/]{2,3}/show/[^/]+/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.rte.ie/player/ie/show/iwitness-862/10478715/',
'info_dict': {
'id': '10478715',
'ext': 'flv',
'title': 'Watch iWitness online',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'iWitness : The spirit of Ireland, one voice and one minute at a time.',
'duration': 60.046,
},
'params': {
'skip_download': 'f4m fails with --test atm'
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._og_search_title(webpage)
description = self._html_search_meta('description', webpage, 'description')
duration = float_or_none(self._html_search_meta(
'duration', webpage, 'duration', fatal=False), 1000)
thumbnail = None
thumbnail_meta = self._html_search_meta('thumbnail', webpage)
if thumbnail_meta:
thumbnail_id = self._search_regex(
r'uri:irus:(.+)', thumbnail_meta,
'thumbnail id', fatal=False)
if thumbnail_id:
thumbnail = 'http://img.rasset.ie/%s.jpg' % thumbnail_id
feeds_url = self._html_search_meta('feeds-prefix', webpage, 'feeds url') + video_id
json_string = self._download_json(feeds_url, video_id)
# f4m_url = server + relative_url
f4m_url = json_string['shows'][0]['media:group'][0]['rte:server'] + json_string['shows'][0]['media:group'][0]['url']
f4m_formats = self._extract_f4m_formats(f4m_url, video_id)
self._sort_formats(f4m_formats)
return {
'id': video_id,
'title': title,
'formats': f4m_formats,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
}
class RteRadioIE(InfoExtractor):
IE_NAME = 'rte:radio'
IE_DESC = 'Raidió Teilifís Éireann radio'
# Radioplayer URLs have two distinct specifier formats,
# the old format #!rii=<channel_id>:<id>:<playable_item_id>:<date>:
# the new format #!rii=b<channel_id>_<id>_<playable_item_id>_<date>_
# where the IDs are int/empty, the date is DD-MM-YYYY, and the specifier may be truncated.
# An <id> uniquely defines an individual recording, and is the only part we require.
_VALID_URL = r'https?://(?:www\.)?rte\.ie/radio/utils/radioplayer/rteradioweb\.html#!rii=(?:b?[0-9]*)(?:%3A|:|%5F|_)(?P<id>[0-9]+)'
_TESTS = [{
# Old-style player URL; HLS and RTMPE formats
'url': 'http://www.rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=16:10507902:2414:27-12-2015:',
'info_dict': {
'id': '10507902',
'ext': 'mp4',
'title': 'Gloria',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'md5:9ce124a7fb41559ec68f06387cabddf0',
'timestamp': 1451203200,
'upload_date': '20151227',
'duration': 7230.0,
},
'params': {
'skip_download': 'f4m fails with --test atm'
}
}, {
# New-style player URL; RTMPE formats only
'url': 'http://rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=b16_3250678_8861_06-04-2012_',
'info_dict': {
'id': '3250678',
'ext': 'flv',
'title': 'The Lyric Concert with Paul Herriott',
'thumbnail': 're:^https?://.*\.jpg$',
'description': '',
'timestamp': 1333742400,
'upload_date': '20120406',
'duration': 7199.016,
},
'params': {
'skip_download': 'f4m fails with --test atm'
}
}]
class RteBaseIE(InfoExtractor):
def _real_extract(self, url):
item_id = self._match_id(url)
json_string = self._download_json(
'http://www.rte.ie/rteavgen/getplaylist/?type=web&format=json&id=' + item_id,
item_id)
try:
json_string = self._download_json(
'http://www.rte.ie/rteavgen/getplaylist/?type=web&format=json&id=' + item_id,
item_id)
except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404:
error_info = self._parse_json(ee.cause.read().decode(), item_id, fatal=False)
if error_info:
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, error_info['message']),
expected=True)
raise
# NB the string values in the JSON are stored using XML escaping(!)
show = json_string['shows'][0]
@@ -163,3 +76,67 @@ class RteRadioIE(InfoExtractor):
'duration': duration,
'formats': formats,
}
class RteIE(RteBaseIE):
IE_NAME = 'rte'
IE_DESC = 'Raidió Teilifís Éireann TV'
_VALID_URL = r'https?://(?:www\.)?rte\.ie/player/[^/]{2,3}/show/[^/]+/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.rte.ie/player/ie/show/iwitness-862/10478715/',
'md5': '4a76eb3396d98f697e6e8110563d2604',
'info_dict': {
'id': '10478715',
'ext': 'mp4',
'title': 'iWitness',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'The spirit of Ireland, one voice and one minute at a time.',
'duration': 60.046,
'upload_date': '20151012',
'timestamp': 1444694160,
},
}
class RteRadioIE(RteBaseIE):
IE_NAME = 'rte:radio'
IE_DESC = 'Raidió Teilifís Éireann radio'
# Radioplayer URLs have two distinct specifier formats,
# the old format #!rii=<channel_id>:<id>:<playable_item_id>:<date>:
# the new format #!rii=b<channel_id>_<id>_<playable_item_id>_<date>_
# where the IDs are int/empty, the date is DD-MM-YYYY, and the specifier may be truncated.
# An <id> uniquely defines an individual recording, and is the only part we require.
_VALID_URL = r'https?://(?:www\.)?rte\.ie/radio/utils/radioplayer/rteradioweb\.html#!rii=(?:b?[0-9]*)(?:%3A|:|%5F|_)(?P<id>[0-9]+)'
_TESTS = [{
# Old-style player URL; HLS and RTMPE formats
'url': 'http://www.rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=16:10507902:2414:27-12-2015:',
'md5': 'c79ccb2c195998440065456b69760411',
'info_dict': {
'id': '10507902',
'ext': 'mp4',
'title': 'Gloria',
'thumbnail': 're:^https?://.*\.jpg$',
'description': 'md5:9ce124a7fb41559ec68f06387cabddf0',
'timestamp': 1451203200,
'upload_date': '20151227',
'duration': 7230.0,
},
}, {
# New-style player URL; RTMPE formats only
'url': 'http://rte.ie/radio/utils/radioplayer/rteradioweb.html#!rii=b16_3250678_8861_06-04-2012_',
'info_dict': {
'id': '3250678',
'ext': 'flv',
'title': 'The Lyric Concert with Paul Herriott',
'thumbnail': 're:^https?://.*\.jpg$',
'description': '',
'timestamp': 1333742400,
'upload_date': '20120406',
'duration': 7199.016,
},
'params': {
# rtmp download
'skip_download': True,
},
}]

View File

@@ -4,27 +4,24 @@ from __future__ import unicode_literals
import re
from .srgssr import SRGSSRIE
from ..compat import (
compat_str,
compat_urllib_parse_urlparse,
)
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
unescapeHTML,
xpath_text,
determine_ext,
)
class RTSIE(SRGSSRIE):
IE_DESC = 'RTS.ch'
_VALID_URL = r'rts:(?P<rts_id>\d+)|https?://(?:www\.)?rts\.ch/(?:[^/]+/){2,}(?P<id>[0-9]+)-(?P<display_id>.+?)\.html'
_VALID_URL = r'rts:(?P<rts_id>\d+)|https?://(?:.+?\.)?rts\.ch/(?:[^/]+/){2,}(?P<id>[0-9]+)-(?P<display_id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.rts.ch/archives/tv/divers/3449373-les-enfants-terribles.html',
'md5': 'f254c4b26fb1d3c183793d52bc40d3e7',
'md5': 'ff7f8450a90cf58dacb64e29707b4a8e',
'info_dict': {
'id': '3449373',
'display_id': 'les-enfants-terribles',
@@ -38,35 +35,17 @@ class RTSIE(SRGSSRIE):
'thumbnail': 're:^https?://.*\.image',
'view_count': int,
},
'params': {
# m3u8 download
'skip_download': True,
}
},
{
'url': 'http://www.rts.ch/emissions/passe-moi-les-jumelles/5624067-entre-ciel-et-mer.html',
'md5': 'f1077ac5af686c76528dc8d7c5df29ba',
'info_dict': {
'id': '5742494',
'display_id': '5742494',
'ext': 'mp4',
'duration': 3720,
'title': 'Les yeux dans les cieux - Mon homard au Canada',
'description': 'md5:d22ee46f5cc5bac0912e5a0c6d44a9f7',
'uploader': 'Passe-moi les jumelles',
'upload_date': '20140404',
'timestamp': 1396635300,
'thumbnail': 're:^https?://.*\.image',
'view_count': int,
'id': '5624065',
'title': 'Passe-moi les jumelles',
},
'params': {
# m3u8 download
'skip_download': True,
}
'playlist_mincount': 4,
},
{
'url': 'http://www.rts.ch/video/sport/hockey/5745975-1-2-kloten-fribourg-5-2-second-but-pour-gotteron-par-kwiatowski.html',
'md5': 'b4326fecd3eb64a458ba73c73e91299d',
'info_dict': {
'id': '5745975',
'display_id': '1-2-kloten-fribourg-5-2-second-but-pour-gotteron-par-kwiatowski',
@@ -80,11 +59,15 @@ class RTSIE(SRGSSRIE):
'thumbnail': 're:^https?://.*\.image',
'view_count': int,
},
'params': {
# m3u8 download
'skip_download': True,
},
'skip': 'Blocked outside Switzerland',
},
{
'url': 'http://www.rts.ch/video/info/journal-continu/5745356-londres-cachee-par-un-epais-smog.html',
'md5': '9f713382f15322181bb366cc8c3a4ff0',
'md5': '1bae984fe7b1f78e94abc74e802ed99f',
'info_dict': {
'id': '5745356',
'display_id': 'londres-cachee-par-un-epais-smog',
@@ -92,16 +75,12 @@ class RTSIE(SRGSSRIE):
'duration': 33,
'title': 'Londres cachée par un épais smog',
'description': 'Un important voile de smog recouvre Londres depuis mercredi, provoqué par la pollution et du sable du Sahara.',
'uploader': 'Le Journal en continu',
'uploader': 'L\'actu en vidéo',
'upload_date': '20140403',
'timestamp': 1396537322,
'thumbnail': 're:^https?://.*\.image',
'view_count': int,
},
'params': {
# m3u8 download
'skip_download': True,
}
},
{
'url': 'http://www.rts.ch/audio/couleur3/programmes/la-belle-video-de-stephane-laurenceau/5706148-urban-hippie-de-damien-krisl-03-04-2014.html',
@@ -125,6 +104,10 @@ class RTSIE(SRGSSRIE):
'title': 'Hockey: Davos décroche son 31e titre de champion de Suisse',
},
'playlist_mincount': 5,
},
{
'url': 'http://pages.rts.ch/emissions/passe-moi-les-jumelles/5624065-entre-ciel-et-mer.html',
'only_matching': True,
}
]
@@ -142,19 +125,32 @@ class RTSIE(SRGSSRIE):
# media_id extracted out of URL is not always a real id
if 'video' not in all_info and 'audio' not in all_info:
page = self._download_webpage(url, display_id)
entries = []
# article with videos on rhs
videos = re.findall(
r'<article[^>]+class="content-item"[^>]*>\s*<a[^>]+data-video-urn="urn:([^"]+)"',
page)
if not videos:
for item in all_info.get('items', []):
item_url = item.get('url')
if not item_url:
continue
entries.append(self.url_result(item_url, 'RTS'))
if not entries:
page, urlh = self._download_webpage_handle(url, display_id)
if re.match(self._VALID_URL, urlh.geturl()).group('id') != media_id:
return self.url_result(urlh.geturl(), 'RTS')
# article with videos on rhs
videos = re.findall(
r'(?s)<iframe[^>]+class="srg-player"[^>]+src="[^"]+urn:([^"]+)"',
r'<article[^>]+class="content-item"[^>]*>\s*<a[^>]+data-video-urn="urn:([^"]+)"',
page)
if videos:
entries = [self.url_result('srgssr:%s' % video_urn, 'SRGSSR') for video_urn in videos]
return self.playlist_result(entries, media_id, self._og_search_title(page))
if not videos:
videos = re.findall(
r'(?s)<iframe[^>]+class="srg-player"[^>]+src="[^"]+urn:([^"]+)"',
page)
if videos:
entries = [self.url_result('srgssr:%s' % video_urn, 'SRGSSR') for video_urn in videos]
if entries:
return self.playlist_result(entries, media_id, all_info.get('title'))
internal_id = self._html_search_regex(
r'<(?:video|audio) data-id="([0-9]+)"', page,
@@ -168,36 +164,29 @@ class RTSIE(SRGSSRIE):
info = all_info['video']['JSONinfo'] if 'video' in all_info else all_info['audio']
upload_timestamp = parse_iso8601(info.get('broadcast_date'))
duration = info.get('duration') or info.get('cutout') or info.get('cutduration')
if isinstance(duration, compat_str):
duration = parse_duration(duration)
view_count = info.get('plays')
thumbnail = unescapeHTML(info.get('preview_image_url'))
title = info['title']
def extract_bitrate(url):
return int_or_none(self._search_regex(
r'-([0-9]+)k\.', url, 'bitrate', default=None))
formats = []
for format_id, format_url in info['streams'].items():
if format_id == 'hds_sd' and 'hds' in info['streams']:
streams = info.get('streams', {})
for format_id, format_url in streams.items():
if format_id == 'hds_sd' and 'hds' in streams:
continue
if format_id == 'hls_sd' and 'hls' in info['streams']:
if format_id == 'hls_sd' and 'hls' in streams:
continue
if format_url.endswith('.f4m'):
token = self._download_xml(
'http://tp.srgssr.ch/token/akahd.xml?stream=%s/*' % compat_urllib_parse_urlparse(format_url).path,
media_id, 'Downloading %s token' % format_id)
auth_params = xpath_text(token, './/authparams', 'auth params')
if not auth_params:
continue
formats.extend(self._extract_f4m_formats(
'%s?%s&hdcore=3.4.0&plugin=aasp-3.4.0.132.66' % (format_url, auth_params),
media_id, f4m_id=format_id, fatal=False))
elif format_url.endswith('.m3u8'):
formats.extend(self._extract_m3u8_formats(
format_url, media_id, 'mp4', 'm3u8_native', m3u8_id=format_id, fatal=False))
ext = determine_ext(format_url)
if ext in ('m3u8', 'f4m'):
format_url = self._get_tokenized_src(format_url, media_id, format_id)
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
format_url + ('?' if '?' not in format_url else '&') + 'hdcore=3.4.0',
media_id, f4m_id=format_id, fatal=False))
else:
formats.extend(self._extract_m3u8_formats(
format_url, media_id, 'mp4', 'm3u8_native', m3u8_id=format_id, fatal=False))
else:
formats.append({
'format_id': format_id,
@@ -205,25 +194,37 @@ class RTSIE(SRGSSRIE):
'tbr': extract_bitrate(format_url),
})
if 'media' in info:
formats.extend([{
'format_id': '%s-%sk' % (media['ext'], media['rate']),
'url': 'http://download-video.rts.ch/%s' % media['url'],
'tbr': media['rate'] or extract_bitrate(media['url']),
} for media in info['media'] if media.get('rate')])
for media in info.get('media', []):
media_url = media.get('url')
if not media_url or re.match(r'https?://', media_url):
continue
rate = media.get('rate')
ext = media.get('ext') or determine_ext(media_url, 'mp4')
format_id = ext
if rate:
format_id += '-%dk' % rate
formats.append({
'format_id': format_id,
'url': 'http://download-video.rts.ch/' + media_url,
'tbr': rate or extract_bitrate(media_url),
})
self._check_formats(formats, media_id)
self._sort_formats(formats)
duration = info.get('duration') or info.get('cutout') or info.get('cutduration')
if isinstance(duration, compat_str):
duration = parse_duration(duration)
return {
'id': media_id,
'display_id': display_id,
'formats': formats,
'title': info['title'],
'title': title,
'description': info.get('intro'),
'duration': duration,
'view_count': view_count,
'view_count': int_or_none(info.get('plays')),
'uploader': info.get('programName'),
'timestamp': upload_timestamp,
'thumbnail': thumbnail,
'timestamp': parse_iso8601(info.get('broadcast_date')),
'thumbnail': unescapeHTML(info.get('preview_image_url')),
}

View File

@@ -5,6 +5,7 @@ from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
xpath_attr,
xpath_text,
@@ -101,6 +102,11 @@ class RuutuIE(InfoExtractor):
})
extract_formats(video_xml.find('./Clip'))
drm = xpath_text(video_xml, './Clip/DRM', default=None)
if not formats and drm:
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)
return {

View File

@@ -1,146 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_strdate,
js_to_json,
)
class ScreenwaveMediaIE(InfoExtractor):
_VALID_URL = r'(?:https?:)?//player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?.*\bid=(?P<id>[A-Za-z0-9-]+)'
EMBED_PATTERN = r'src=(["\'])(?P<url>(?:https?:)?//player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?.*\bid=.+?)\1'
_TESTS = [{
'url': 'http://player.screenwavemedia.com/play/play.php?playerdiv=videoarea&companiondiv=squareAd&id=Cinemassacre-19911',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
playerdata = self._download_webpage(
'http://player.screenwavemedia.com/player.php?id=%s' % video_id,
video_id, 'Downloading player webpage')
vidtitle = self._search_regex(
r'\'vidtitle\'\s*:\s*"([^"]+)"', playerdata, 'vidtitle').replace('\\/', '/')
playerconfig = self._download_webpage(
'http://player.screenwavemedia.com/player.js',
video_id, 'Downloading playerconfig webpage')
videoserver = self._search_regex(r'SWMServer\s*=\s*"([\d\.]+)"', playerdata, 'videoserver')
sources = self._parse_json(
js_to_json(
re.sub(
r'(?s)/\*.*?\*/', '',
self._search_regex(
r'sources\s*:\s*(\[[^\]]+?\])', playerconfig,
'sources',
).replace(
"' + thisObj.options.videoserver + '",
videoserver
).replace(
"' + playerVidId + '",
video_id
)
)
),
video_id, fatal=False
)
# Fallback to hardcoded sources if JS changes again
if not sources:
self.report_warning('Falling back to a hardcoded list of streams')
sources = [{
'file': 'http://%s/vod/%s_%s.mp4' % (videoserver, video_id, format_id),
'type': 'mp4',
'label': format_label,
} for format_id, format_label in (
('low', '144p Low'), ('med', '160p Med'), ('high', '360p High'), ('hd1', '720p HD1'))]
sources.append({
'file': 'http://%s/vod/smil:%s.smil/playlist.m3u8' % (videoserver, video_id),
'type': 'hls',
})
formats = []
for source in sources:
file_ = source.get('file')
if not file_:
continue
if source.get('type') == 'hls':
formats.extend(self._extract_m3u8_formats(file_, video_id, ext='mp4'))
else:
format_id = self._search_regex(
r'_(.+?)\.[^.]+$', file_, 'format id', default=None)
if not self._is_valid_url(file_, video_id, format_id or 'video'):
continue
format_label = source.get('label')
height = int_or_none(self._search_regex(
r'^(\d+)[pP]', format_label, 'height', default=None))
formats.append({
'url': file_,
'format_id': format_id,
'format': format_label,
'ext': source.get('type'),
'height': height,
})
self._sort_formats(formats, field_preference=('height', 'width', 'tbr', 'format_id'))
return {
'id': video_id,
'title': vidtitle,
'formats': formats,
}
class TeamFourIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?teamfourstar\.com/video/(?P<id>[a-z0-9\-]+)/?'
_TEST = {
'url': 'http://teamfourstar.com/video/a-moment-with-tfs-episode-4/',
'info_dict': {
'id': 'TeamFourStar-5292a02f20bfa',
'ext': 'mp4',
'upload_date': '20130401',
'description': 'Check out this and more on our website: http://teamfourstar.com\nTFS Store: http://sharkrobot.com/team-four-star\nFollow on Twitter: http://twitter.com/teamfourstar\nLike on FB: http://facebook.com/teamfourstar',
'title': 'A Moment With TFS Episode 4',
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
playerdata_url = self._search_regex(
r'src="(http://player\d?\.screenwavemedia\.com/(?:play/)?[a-zA-Z]+\.php\?[^"]*\bid=.+?)"',
webpage, 'player data URL')
video_title = self._html_search_regex(
r'<div class="heroheadingtitle">(?P<title>.+?)</div>',
webpage, 'title')
video_date = unified_strdate(self._html_search_regex(
r'<div class="heroheadingdate">(?P<date>.+?)</div>',
webpage, 'date', fatal=False))
video_description = self._html_search_regex(
r'(?s)<div class="postcontent">(?P<description>.+?)</div>',
webpage, 'description', fatal=False)
video_thumbnail = self._og_search_thumbnail(webpage)
return {
'_type': 'url_transparent',
'display_id': display_id,
'title': video_title,
'description': video_description,
'upload_date': video_date,
'thumbnail': video_thumbnail,
'url': playerdata_url,
}

View File

@@ -121,7 +121,7 @@ class SoundcloudIE(InfoExtractor):
},
]
_CLIENT_ID = '02gUJC0hH2ct1EGOcYXQIzRFU91c72Ea'
_CLIENT_ID = 'fDoItMDbsbZz8dY16ZzARCZmzgHBPotA'
_IPHONE_CLIENT_ID = '376f225bf427445fc4bfb6b99b72e0bf'
@staticmethod

View File

@@ -1,5 +1,7 @@
from __future__ import unicode_literals
import re
from .mtv import MTVServicesInfoExtractor
@@ -16,6 +18,15 @@ class SpikeIE(MTVServicesInfoExtractor):
'timestamp': 1388120400,
'upload_date': '20131227',
},
}, {
'url': 'http://www.spike.com/full-episodes/j830qm/lip-sync-battle-joel-mchale-vs-jim-rash-season-2-ep-209',
'md5': 'b25c6f16418aefb9ad5a6cae2559321f',
'info_dict': {
'id': '37ace3a8-1df6-48be-85b8-38df8229e241',
'ext': 'mp4',
'title': 'Lip Sync Battle|April 28, 2016|2|209|Joel McHale Vs. Jim Rash|Act 1',
'description': 'md5:a739ca8f978a7802f67f8016d27ce114',
},
}, {
'url': 'http://www.spike.com/video-clips/lhtu8m/',
'only_matching': True,
@@ -32,3 +43,12 @@ class SpikeIE(MTVServicesInfoExtractor):
_FEED_URL = 'http://www.spike.com/feeds/mrss/'
_MOBILE_TEMPLATE = 'http://m.spike.com/videos/video.rbml?id=%s'
_CUSTOM_URL_REGEX = re.compile(r'spikenetworkapp://([^/]+/[-a-fA-F0-9]+)')
def _extract_mgid(self, webpage):
mgid = super(SpikeIE, self)._extract_mgid(webpage, default=None)
if mgid is None:
url_parts = self._search_regex(self._CUSTOM_URL_REGEX, webpage, 'episode_id')
video_type, episode_id = url_parts.split('/', 1)
mgid = 'mgid:arc:{0}:spike.com:{1}'.format(video_type, episode_id)
return mgid

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlparse
from ..utils import (
ExtractorError,
parse_iso8601,
@@ -23,6 +24,16 @@ class SRGSSRIE(InfoExtractor):
'STARTDATE': 'This video is not yet available. Please try again later.',
}
def _get_tokenized_src(self, url, video_id, format_id):
sp = compat_urllib_parse_urlparse(url).path.split('/')
token = self._download_json(
'http://tp.srgssr.ch/akahd/token?acl=/%s/%s/*' % (sp[1], sp[2]),
video_id, 'Downloading %s token' % format_id, fatal=False) or {}
auth_params = token.get('token', {}).get('authparams')
if auth_params:
url += '?' + auth_params
return url
def get_media_data(self, bu, media_type, media_id):
media_data = self._download_json(
'http://il.srgssr.ch/integrationlayer/1.0/ue/%s/%s/play/%s.json' % (bu, media_type, media_id),
@@ -61,14 +72,16 @@ class SRGSSRIE(InfoExtractor):
asset_url = asset['text']
quality = asset['@quality']
format_id = '%s-%s' % (protocol, quality)
if protocol == 'HTTP-HDS':
formats.extend(self._extract_f4m_formats(
asset_url + '?hdcore=3.4.0', media_id,
f4m_id=format_id, fatal=False))
elif protocol == 'HTTP-HLS':
formats.extend(self._extract_m3u8_formats(
asset_url, media_id, 'mp4', 'm3u8_native',
m3u8_id=format_id, fatal=False))
if protocol.startswith('HTTP-HDS') or protocol.startswith('HTTP-HLS'):
asset_url = self._get_tokenized_src(asset_url, media_id, format_id)
if protocol.startswith('HTTP-HDS'):
formats.extend(self._extract_f4m_formats(
asset_url + ('?' if '?' not in asset_url else '&') + 'hdcore=3.4.0',
media_id, f4m_id=format_id, fatal=False))
elif protocol.startswith('HTTP-HLS'):
formats.extend(self._extract_m3u8_formats(
asset_url, media_id, 'mp4', 'm3u8_native',
m3u8_id=format_id, fatal=False))
else:
formats.append({
'format_id': format_id,
@@ -94,10 +107,10 @@ class SRGSSRPlayIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.srf.ch/play/tv/10vor10/video/snowden-beantragt-asyl-in-russland?id=28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'md5': '4cd93523723beff51bb4bee974ee238d',
'md5': 'da6b5b3ac9fa4761a942331cef20fcb3',
'info_dict': {
'id': '28e1a57d-5b76-4399-8ab3-9097f071e6c5',
'ext': 'm4v',
'ext': 'mp4',
'upload_date': '20130701',
'title': 'Snowden beantragt Asyl in Russland',
'timestamp': 1372713995,

View File

@@ -0,0 +1,48 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from .jwplatform import JWPlatformIE
from ..utils import unified_strdate
class TeamFourStarIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?teamfourstar\.com/(?P<id>[a-z0-9\-]+)'
_TEST = {
'url': 'http://teamfourstar.com/tfs-abridged-parody-episode-1-2/',
'info_dict': {
'id': '0WdZO31W',
'title': 'TFS Abridged Parody Episode 1',
'description': 'md5:d60bc389588ebab2ee7ad432bda953ae',
'ext': 'mp4',
'timestamp': 1394168400,
'upload_date': '20080508',
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
jwplatform_url = JWPlatformIE._extract_url(webpage)
video_title = self._html_search_regex(
r'<h1[^>]+class="entry-title"[^>]*>(?P<title>.+?)</h1>',
webpage, 'title')
video_date = unified_strdate(self._html_search_regex(
r'<span[^>]+class="meta-date date updated"[^>]*>(?P<date>.+?)</span>',
webpage, 'date', fatal=False))
video_description = self._html_search_regex(
r'(?s)<div[^>]+class="content-inner"[^>]*>.*?(?P<description><p>.+?)</div>',
webpage, 'description', fatal=False)
video_thumbnail = self._og_search_thumbnail(webpage)
return {
'_type': 'url_transparent',
'display_id': display_id,
'title': video_title,
'description': video_description,
'upload_date': video_date,
'thumbnail': video_thumbnail,
'url': jwplatform_url,
}

View File

@@ -7,33 +7,30 @@ from .common import InfoExtractor
class TeleBruxellesIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt)/?(?P<id>[^/#?]+)'
_VALID_URL = r'https?://(?:www\.)?(?:telebruxelles|bx1)\.be/(news|sport|dernier-jt|emission)/?(?P<id>[^/#?]+)'
_TESTS = [{
'url': 'http://www.telebruxelles.be/news/auditions-devant-parlement-francken-galant-tres-attendus/',
'md5': '59439e568c9ee42fb77588b2096b214f',
'url': 'http://bx1.be/news/que-risque-lauteur-dune-fausse-alerte-a-la-bombe/',
'md5': 'a2a67a5b1c3e8c9d33109b902f474fd9',
'info_dict': {
'id': '11942',
'display_id': 'auditions-devant-parlement-francken-galant-tres-attendus',
'ext': 'flv',
'title': 'Parlement : Francken et Galant répondent aux interpellations de lopposition',
'description': 're:Les auditions des ministres se poursuivent*'
},
'params': {
'skip_download': 'requires rtmpdump'
'id': '158856',
'display_id': 'que-risque-lauteur-dune-fausse-alerte-a-la-bombe',
'ext': 'mp4',
'title': 'Que risque lauteur dune fausse alerte à la bombe ?',
'description': 'md5:3cf8df235d44ebc5426373050840e466',
},
}, {
'url': 'http://www.telebruxelles.be/sport/basket-brussels-bat-mons-80-74/',
'md5': '181d3fbdcf20b909309e5aef5c6c6047',
'url': 'http://bx1.be/sport/futsal-schaerbeek-sincline-5-3-a-thulin/',
'md5': 'dfe07ecc9c153ceba8582ac912687675',
'info_dict': {
'id': '10091',
'display_id': 'basket-brussels-bat-mons-80-74',
'ext': 'flv',
'title': 'Basket : le Brussels bat Mons 80-74',
'description': 're:^Ils l\u2019on fait ! En basket, le B*',
},
'params': {
'skip_download': 'requires rtmpdump'
'id': '158433',
'display_id': 'futsal-schaerbeek-sincline-5-3-a-thulin',
'ext': 'mp4',
'title': 'Futsal : Schaerbeek sincline 5-3 à Thulin',
'description': 'md5:fd013f1488d5e2dceb9cebe39e2d569b',
},
}, {
'url': 'http://bx1.be/emission/bxenf1-gastronomie/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -50,13 +47,13 @@ class TeleBruxellesIE(InfoExtractor):
r'file\s*:\s*"(rtmp://[^/]+/vod/mp4:"\s*\+\s*"[^"]+"\s*\+\s*".mp4)"',
webpage, 'RTMP url')
rtmp_url = re.sub(r'"\s*\+\s*"', '', rtmp_url)
formats = self._extract_wowza_formats(rtmp_url, article_id or display_id)
self._sort_formats(formats)
return {
'id': article_id or display_id,
'display_id': display_id,
'title': title,
'description': description,
'url': rtmp_url,
'ext': 'flv',
'rtmp_live': True # if rtmpdump is not called with "--live" argument, the download is blocked and can be completed
'formats': formats,
}

View File

@@ -5,10 +5,10 @@ from .common import InfoExtractor
class ThisOldHouseIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to)/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?thisoldhouse\.com/(?:watch|how-to|tv-episode)/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.thisoldhouse.com/how-to/how-to-build-storage-bench',
'md5': '568acf9ca25a639f0c4ff905826b662f',
'md5': '946f05bbaa12a33f9ae35580d2dfcfe3',
'info_dict': {
'id': '2REGtUDQ',
'ext': 'mp4',
@@ -20,6 +20,9 @@ class ThisOldHouseIE(InfoExtractor):
}, {
'url': 'https://www.thisoldhouse.com/watch/arlington-arts-crafts-arts-and-crafts-class-begins',
'only_matching': True,
}, {
'url': 'https://www.thisoldhouse.com/tv-episode/ask-toh-shelf-rough-electric',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -56,7 +56,7 @@ class TouTvIE(InfoExtractor):
'state': state,
})
login_form = self._search_regex(
r'(?s)(<form[^>]+id="Form-login".+?</form>)', login_webpage, 'login form')
r'(?s)(<form[^>]+(?:id|name)="Form-login".+?</form>)', login_webpage, 'login form')
form_data = self._hidden_inputs(login_form)
form_data.update({
'login-email': email,

View File

@@ -0,0 +1,65 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .brightcove import BrightcoveNewIE
class TVANouvellesIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tvanouvelles\.ca/videos/(?P<id>\d+)'
_TEST = {
'url': 'http://www.tvanouvelles.ca/videos/5117035533001',
'info_dict': {
'id': '5117035533001',
'ext': 'mp4',
'title': 'Lindustrie du taxi dénonce lentente entre Québec et Uber: explications',
'description': 'md5:479653b7c8cf115747bf5118066bd8b3',
'uploader_id': '1741764581',
'timestamp': 1473352030,
'upload_date': '20160908',
},
'add_ie': ['BrightcoveNew'],
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1741764581/default_default/index.html?videoId=%s'
def _real_extract(self, url):
brightcove_id = self._match_id(url)
return self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
BrightcoveNewIE.ie_key(), brightcove_id)
class TVANouvellesArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?tvanouvelles\.ca/(?:[^/]+/)+(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://www.tvanouvelles.ca/2016/11/17/des-policiers-qui-ont-la-meche-un-peu-courte',
'info_dict': {
'id': 'des-policiers-qui-ont-la-meche-un-peu-courte',
'title': 'Des policiers qui ont «la mèche un peu courte»?',
'description': 'md5:92d363c8eb0f0f030de9a4a84a90a3a0',
},
'playlist_mincount': 4,
}
@classmethod
def suitable(cls, url):
return False if TVANouvellesIE.suitable(url) else super(TVANouvellesArticleIE, cls).suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
entries = [
self.url_result(
'http://www.tvanouvelles.ca/videos/%s' % mobj.group('id'),
ie=TVANouvellesIE.ie_key(), video_id=mobj.group('id'))
for mobj in re.finditer(
r'data-video-id=(["\'])?(?P<id>\d+)', webpage)]
title = self._og_search_title(webpage, fatal=False)
description = self._og_search_description(webpage)
return self.playlist_result(entries, display_id, title, description)

View File

@@ -25,7 +25,7 @@ class TwitterBaseIE(InfoExtractor):
class TwitterCardIE(TwitterBaseIE):
IE_NAME = 'twitter:card'
_VALID_URL = r'https?://(?:www\.)?twitter\.com/i/(?:cards/tfw/v1|videos/tweet)/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?twitter\.com/i/(?:cards/tfw/v1|videos(?:/tweet)?)/(?P<id>\d+)'
_TESTS = [
{
'url': 'https://twitter.com/i/cards/tfw/v1/560070183650213889',
@@ -84,6 +84,9 @@ class TwitterCardIE(TwitterBaseIE):
'title': 'Twitter web player',
'thumbnail': 're:^https?://.*\.jpg',
},
}, {
'url': 'https://twitter.com/i/videos/752274308186120192',
'only_matching': True,
},
]

View File

@@ -51,7 +51,7 @@ class VevoIE(VevoBaseIE):
'artist': 'Hurts',
'genre': 'Pop',
},
'expected_warnings': ['Unable to download SMIL file'],
'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
}, {
'note': 'v3 SMIL format',
'url': 'http://www.vevo.com/watch/cassadee-pope/i-wish-i-could-break-your-heart/USUV71302923',
@@ -67,7 +67,7 @@ class VevoIE(VevoBaseIE):
'artist': 'Cassadee Pope',
'genre': 'Country',
},
'expected_warnings': ['Unable to download SMIL file'],
'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
}, {
'note': 'Age-limited video',
'url': 'https://www.vevo.com/watch/justin-timberlake/tunnel-vision-explicit/USRV81300282',
@@ -83,7 +83,7 @@ class VevoIE(VevoBaseIE):
'artist': 'Justin Timberlake',
'genre': 'Pop',
},
'expected_warnings': ['Unable to download SMIL file'],
'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
}, {
'note': 'No video_info',
'url': 'http://www.vevo.com/watch/k-camp-1/Till-I-Die/USUV71503000',
@@ -91,15 +91,33 @@ class VevoIE(VevoBaseIE):
'info_dict': {
'id': 'USUV71503000',
'ext': 'mp4',
'title': 'K Camp - Till I Die',
'title': 'K Camp ft. T.I. - Till I Die',
'age_limit': 18,
'timestamp': 1449468000,
'upload_date': '20151207',
'uploader': 'K Camp',
'track': 'Till I Die',
'artist': 'K Camp',
'genre': 'Rap/Hip-Hop',
'genre': 'Hip-Hop',
},
'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
}, {
'note': 'Featured test',
'url': 'https://www.vevo.com/watch/lemaitre/Wait/USUV71402190',
'md5': 'd28675e5e8805035d949dc5cf161071d',
'info_dict': {
'id': 'USUV71402190',
'ext': 'mp4',
'title': 'Lemaitre ft. LoLo - Wait',
'age_limit': 0,
'timestamp': 1413432000,
'upload_date': '20141016',
'uploader': 'Lemaitre',
'track': 'Wait',
'artist': 'Lemaitre',
'genre': 'Electronic',
},
'expected_warnings': ['Unable to download SMIL file', 'Unable to download info'],
}, {
'note': 'Only available via webpage',
'url': 'http://www.vevo.com/watch/GBUV71600656',
@@ -242,8 +260,11 @@ class VevoIE(VevoBaseIE):
timestamp = parse_iso8601(video_info.get('releaseDate'))
artists = video_info.get('artists')
if artists:
artist = uploader = artists[0]['name']
for curr_artist in artists:
if curr_artist.get('role') == 'Featured':
featured_artist = curr_artist['name']
else:
artist = uploader = curr_artist['name']
view_count = int_or_none(video_info.get('views', {}).get('total'))
for video_version in video_versions:

View File

@@ -1,11 +1,12 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import time
import hmac
import hashlib
import hmac
import itertools
import json
import re
import time
from .common import InfoExtractor
from ..utils import (
@@ -276,10 +277,14 @@ class VikiIE(VikiBaseIE):
height = int_or_none(self._search_regex(
r'^(\d+)[pP]$', format_id, 'height', default=None))
for protocol, format_dict in stream_dict.items():
# rtmps URLs does not seem to work
if protocol == 'rtmps':
continue
format_url = format_dict['url']
if format_id == 'm3u8':
m3u8_formats = self._extract_m3u8_formats(
format_dict['url'], video_id, 'mp4',
entry_protocol='m3u8_native', preference=-1,
format_url, video_id, 'mp4',
entry_protocol='m3u8_native',
m3u8_id='m3u8-%s' % protocol, fatal=False)
# Despite CODECS metadata in m3u8 all video-only formats
# are actually video+audio
@@ -287,9 +292,23 @@ class VikiIE(VikiBaseIE):
if f.get('acodec') == 'none' and f.get('vcodec') != 'none':
f['acodec'] = None
formats.extend(m3u8_formats)
elif format_url.startswith('rtmp'):
mobj = re.search(
r'^(?P<url>rtmp://[^/]+/(?P<app>.+?))/(?P<playpath>mp4:.+)$',
format_url)
if not mobj:
continue
formats.append({
'format_id': 'rtmp-%s' % format_id,
'ext': 'flv',
'url': mobj.group('url'),
'play_path': mobj.group('playpath'),
'app': mobj.group('app'),
'page_url': url,
})
else:
formats.append({
'url': format_dict['url'],
'url': format_url,
'format_id': '%s-%s' % (format_id, protocol),
'height': height,
})

View File

@@ -17,7 +17,7 @@ from ..compat import compat_urllib_parse_urlencode
class VLiveIE(InfoExtractor):
IE_NAME = 'vlive'
_VALID_URL = r'https?://(?:(?:www|m)\.)?vlive\.tv/video/(?P<id>[0-9]+)'
_TEST = {
_TESTS = [{
'url': 'http://www.vlive.tv/video/1326',
'md5': 'cc7314812855ce56de70a06a27314983',
'info_dict': {
@@ -27,7 +27,20 @@ class VLiveIE(InfoExtractor):
'creator': "Girl's Day",
'view_count': int,
},
}
}, {
'url': 'http://www.vlive.tv/video/16937',
'info_dict': {
'id': '16937',
'ext': 'mp4',
'title': '[V LIVE] 첸백시 걍방',
'creator': 'EXO',
'view_count': int,
'subtitles': 'mincount:12',
},
'params': {
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -116,7 +129,7 @@ class VLiveIE(InfoExtractor):
subtitles = {}
for caption in playinfo.get('captions', {}).get('list', []):
lang = dict_get(caption, ('language', 'locale', 'country', 'label'))
lang = dict_get(caption, ('locale', 'language', 'country', 'label'))
if lang and caption.get('source'):
subtitles[lang] = [{
'ext': 'vtt',

View File

@@ -0,0 +1,102 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
xpath_text,
)
class WebcasterIE(InfoExtractor):
_VALID_URL = r'https?://bl\.webcaster\.pro/(?:quote|media)/start/free_(?P<id>[^/]+)'
_TESTS = [{
# http://video.khl.ru/quotes/393859
'url': 'http://bl.webcaster.pro/quote/start/free_c8cefd240aa593681c8d068cff59f407_hd/q393859/eb173f99dd5f558674dae55f4ba6806d/1480289104?sr%3D105%26fa%3D1%26type_id%3D18',
'md5': '0c162f67443f30916ff1c89425dcd4cd',
'info_dict': {
'id': 'c8cefd240aa593681c8d068cff59f407_hd',
'ext': 'mp4',
'title': 'Сибирь - Нефтехимик. Лучшие моменты первого периода',
'thumbnail': 're:^https?://.*\.jpg$',
},
}, {
'url': 'http://bl.webcaster.pro/media/start/free_6246c7a4453ac4c42b4398f840d13100_hd/2_2991109016/e8d0d82587ef435480118f9f9c41db41/4635726126',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_xml(url, video_id)
title = xpath_text(video, './/event_name', 'event name', fatal=True)
def make_id(parts, separator):
return separator.join(filter(None, parts))
formats = []
for format_id in (None, 'noise'):
track_tag = make_id(('track', format_id), '_')
for track in video.findall('.//iphone/%s' % track_tag):
track_url = track.text
if not track_url:
continue
if determine_ext(track_url) == 'm3u8':
m3u8_formats = self._extract_m3u8_formats(
track_url, video_id, 'mp4',
entry_protocol='m3u8_native',
m3u8_id=make_id(('hls', format_id), '-'), fatal=False)
for f in m3u8_formats:
f.update({
'source_preference': 0 if format_id == 'noise' else 1,
'format_note': track.get('title'),
})
formats.extend(m3u8_formats)
self._sort_formats(formats)
thumbnail = xpath_text(video, './/image', 'thumbnail')
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}
class WebcasterFeedIE(InfoExtractor):
_VALID_URL = r'https?://bl\.webcaster\.pro/feed/start/free_(?P<id>[^/]+)'
_TEST = {
'url': 'http://bl.webcaster.pro/feed/start/free_c8cefd240aa593681c8d068cff59f407_hd/q393859/eb173f99dd5f558674dae55f4ba6806d/1480289104',
'only_matching': True,
}
@staticmethod
def _extract_url(ie, webpage):
mobj = re.search(
r'<(?:object|a[^>]+class=["\']webcaster-player["\'])[^>]+data(?:-config)?=(["\']).*?config=(?P<url>https?://bl\.webcaster\.pro/feed/start/free_.*?)(?:[?&]|\1)',
webpage)
if mobj:
return mobj.group('url')
for secure in (True, False):
video_url = ie._og_search_video_url(
webpage, secure=secure, default=None)
if video_url:
mobj = re.search(
r'config=(?P<url>https?://bl\.webcaster\.pro/feed/start/free_[^?&=]+)',
video_url)
if mobj:
return mobj.group('url')
def _real_extract(self, url):
video_id = self._match_id(url)
feed = self._download_xml(url, video_id)
video_url = xpath_text(
feed, ('video_hd', 'video'), 'video url', fatal=True)
return self.url_result(video_url, WebcasterIE.ie_key())

View File

@@ -1796,7 +1796,7 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
|
((?:PL|LL|EC|UU|FL|RD|UL)[0-9A-Za-z-_]{10,})
)"""
_TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s'
_TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s&disable_polymer=true'
_VIDEO_RE = r'href="\s*/watch\?v=(?P<id>[0-9A-Za-z_-]{11})&amp;[^"]*?index=(?P<index>\d+)(?:[^>]+>(?P<title>[^<]+))?'
IE_NAME = 'youtube:playlist'
_TESTS = [{
@@ -2175,7 +2175,7 @@ class YoutubeUserIE(YoutubeChannelIE):
class YoutubeLiveIE(YoutubeBaseInfoExtractor):
IE_DESC = 'YouTube.com live streams'
_VALID_URL = r'(?P<base_url>https?://(?:\w+\.)?youtube\.com/(?:user|channel|c)/(?P<id>[^/]+))/live'
_VALID_URL = r'(?P<base_url>https?://(?:\w+\.)?youtube\.com/(?:(?:user|channel|c)/)?(?P<id>[^/]+))/live'
IE_NAME = 'youtube:live'
_TESTS = [{
@@ -2204,6 +2204,9 @@ class YoutubeLiveIE(YoutubeBaseInfoExtractor):
}, {
'url': 'https://www.youtube.com/c/CommanderVideoHq/live',
'only_matching': True,
}, {
'url': 'https://www.youtube.com/TheYoungTurks/live',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -198,12 +198,12 @@ class JSInterpreter(object):
return opfunc(x, y)
m = re.match(
r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]+)\)$' % _NAME_RE, expr)
r'^(?P<func>%s)\((?P<args>[a-zA-Z0-9_$,]*)\)$' % _NAME_RE, expr)
if m:
fname = m.group('func')
argvals = tuple([
int(v) if v.isdigit() else local_vars[v]
for v in m.group('args').split(',')])
for v in m.group('args').split(',')]) if len(m.group('args')) > 0 else tuple()
if fname not in self._functions:
self._functions[fname] = self.extract_function(fname)
return self._functions[fname](argvals)

View File

@@ -55,12 +55,12 @@ class Socks5AddressType(object):
ATYP_IPV6 = 0x04
class ProxyError(IOError):
class ProxyError(socket.error):
ERR_SUCCESS = 0x00
def __init__(self, code=None, msg=None):
if code is not None and msg is None:
msg = self.CODES.get(code) and 'unknown error'
msg = self.CODES.get(code) or 'unknown error'
super(ProxyError, self).__init__(code, msg)
@@ -103,6 +103,7 @@ class ProxyType(object):
SOCKS4A = 1
SOCKS5 = 2
Proxy = collections.namedtuple('Proxy', (
'type', 'host', 'port', 'username', 'password', 'remote_dns'))
@@ -122,7 +123,7 @@ class sockssocket(socket.socket):
while len(data) < cnt:
cur = self.recv(cnt - len(data))
if not cur:
raise IOError('{0} bytes missing'.format(cnt - len(data)))
raise EOFError('{0} bytes missing'.format(cnt - len(data)))
data += cur
return data

View File

@@ -115,6 +115,8 @@ def _u30(reader):
res = _read_int(reader)
assert res & 0xf0000000 == 0
return res
_u32 = _read_int
@@ -176,6 +178,7 @@ class _Undefined(object):
return 'undefined'
__repr__ = __str__
undefined = _Undefined()

View File

@@ -86,6 +86,11 @@ std_headers = {
}
USER_AGENTS = {
'Safari': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27',
}
NO_DEFAULT = object()
ENGLISH_MONTH_NAMES = [

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2016.11.14.1'
__version__ = '2016.12.12'