Compare commits

..

1885 Commits

Author SHA1 Message Date
Philipp Hagemeister
b0ba11cc64 release 2016.04.13 2016-04-13 08:02:03 +02:00
Yen Chi Hsuan
75af5d59ae [netease] Skip all tests: completely georestricted 2016-04-13 04:52:07 +08:00
Sergey M․
b969d12490 Credit @Phaeilo for presstv (#7113) 2016-04-13 01:52:50 +06:00
Sergey M․
466a614537 [youtube:playlist] Recognize popular uploads playlist as mix (Closes #9170) 2016-04-12 21:38:31 +06:00
Sergey M․
ffa2cecf72 [ard] Change subtitles extension to ttml (Closes #9169)
ttml is now served instead of srt
2016-04-12 21:20:31 +06:00
Yen Chi Hsuan
a837416025 [jadorecettepub] Remove extractor: website gone 2016-04-12 18:30:53 +08:00
Yen Chi Hsuan
c9d448876f [izlesene] Fix extraction
description may be absent
2016-04-12 18:29:28 +08:00
Yen Chi Hsuan
8865b8abfd [howstuffworks] Skip a broken test case 2016-04-12 17:30:14 +08:00
Yen Chi Hsuan
c77a0c01cb [groupon] Fix extraction 2016-04-12 17:26:09 +08:00
Yen Chi Hsuan
12355ac473 [goshgay] Fix extraction
isFamilyFriendly no longer exists in the webpage and I can't find
another indicator.
2016-04-12 17:23:00 +08:00
Sergey M․
49f523ca50 [mixcloud] Capture error message (#9156) 2016-04-11 20:45:58 +06:00
remitamine
4a903b93a9 Revert "[openclassroom] Add new extractor(closes #9147)"
This reverts commit 13267a2be3.
2016-04-11 14:44:35 +01:00
remitamine
13267a2be3 [openclassroom] Add new extractor(closes #9147) 2016-04-11 14:24:08 +01:00
Yen Chi Hsuan
134c207e3f [arte.tv:embed] Extended support (#2620) 2016-04-11 19:32:27 +08:00
Yen Chi Hsuan
0f56bd2178 Merge branch 'Phaeilo-presstv' 2016-04-11 16:17:05 +08:00
Yen Chi Hsuan
dfbc7f7f3f [presstv] Improve and simplify 2016-04-11 16:14:07 +08:00
Yen Chi Hsuan
7d58ea7c5b Merge branch 'presstv' of https://github.com/Phaeilo/youtube-dl into Phaeilo-presstv 2016-04-11 15:48:10 +08:00
Sergey M․
452908b257 [telebruxelles] Fix extraction (Closes #9142) 2016-04-11 00:06:05 +06:00
Sergey M․
5899e988d5 [glide] Improve extraction and extract upload info 2016-04-10 23:56:23 +06:00
Sergey M․
4a121d29bb [glide] Fix extraction (Closes #9141) 2016-04-10 23:45:17 +06:00
Sergey M․
7ebc36900d [jwplatform:base] Improve subtitles extraction 2016-04-10 22:55:07 +06:00
Sergey M․
d7eb052fa2 [screencastomatic] Add duration to test 2016-04-10 22:48:04 +06:00
Sergey M․
a6d6722c8f [jwplatform:base] Extract duration 2016-04-10 22:47:38 +06:00
Sergey M․
66fa495868 [screencastomatic] Fix extraction (Closes #9136) 2016-04-10 22:37:14 +06:00
Sergey M․
443285aabe [ebaumsworlds] Update _VALID_URL (Closes #9135) 2016-04-10 22:15:11 +06:00
Philip Huppert
de728757ad [presstv] Refactored extractor. 2016-04-10 16:36:44 +02:00
Sergey M․
f44c276842 [extractor/extractors] Remove non-existant imports 2016-04-10 19:21:58 +06:00
Sergey M․
a1fa60a934 [cliprs] Add extractor (Closes #9099) 2016-04-10 18:43:40 +06:00
Sergey M․
49caf3307f [extractor/common] Remove irrelevant comment 2016-04-10 17:10:27 +06:00
Jaime Marquínez Ferrándiz
6a801f4470 [test/InfoExtractors] add test for _download_json 2016-04-09 23:18:41 +02:00
Sergey M․
61dd350a04 [1tv] Fix extraction (Closes #9103) 2016-04-10 03:02:35 +06:00
Jaime Marquínez Ferrándiz
eb9c3edd5e [test/utils] Add test for date_from_str 2016-04-09 22:40:05 +02:00
Philip Huppert
95153a960d [presstv] updated extractor and tests to work with current PressTV website 2016-04-09 16:14:05 +02:00
Yen Chi Hsuan
6c4c7539f2 [test/helper] Check got values to be strings for md5: fields
Seen in PBSIE tests
2016-04-09 22:04:48 +08:00
Yen Chi Hsuan
c991106706 [videodetective] Adapt to InternetVideoArchiveIE 2016-04-09 21:47:35 +08:00
Yen Chi Hsuan
dae2a058de [rottentomatoes] Adapt to InternetVideoArchiveIE 2016-04-09 21:47:12 +08:00
Yen Chi Hsuan
c05025fdd7 [internetvideoarchive] Fix extraction and support json URLs 2016-04-09 21:46:51 +08:00
Philip Huppert
bfe96d7bea [presstv] Added extractor PressTV.
Fixes #7060
2016-04-09 14:55:54 +02:00
Yen Chi Hsuan
ab481b48e5 [funnyordie] Relax M3U8 URL matching
Also, m3u8_url extraction should be fatal as all formats depends
directly or indirectly on it.

This change fixes test_Generic_26 and TestFunnyOrDieSubtitles
2016-04-09 20:17:35 +08:00
Sergey M․
92c7f3157a [aol] Add coding cookie 2016-04-09 17:32:23 +06:00
Yen Chi Hsuan
cacd996662 [utils] Don't touch URLs if not necessary
Fix test_Generic_15 (Google redirect)
2016-04-09 19:27:54 +08:00
remitamine
bffb245a48 [aol] add support for videos with vidible IDs(closes #9124) 2016-04-09 10:51:23 +01:00
Yen Chi Hsuan
680efb6723 Merge pull request #8497 from jaimeMF/lazy-load
Add experimenta lazy loading of info extractors
2016-04-09 14:08:13 +08:00
Jaime Marquínez Ferrándiz
5a9858bfa9 setup.py: add command for building the lazy_extractors module 2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
8a5dc1c1e1 lazy extractors: Initialize the real info extractor
According to the docs '__init__' is only called automatically if '__new__' returns an instance of the original class.
2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
e0986e31cf lazy extractors: Output if it's enabled in the verbose log 2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
6b97ca96fc lazy extractors: Style fixes
* Sort extractors alphabetically
* Add newlines when needed (youtube_dl/extractors/lazy_extractors.py pass the flake8 test now)
2016-04-08 21:50:54 +02:00
Jaime Marquínez Ferrándiz
c1ce6acdd7 lazy extractors: Fix building with python2.6 2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
0d778b1db9 lazy extractors: specify the encoding
When building with python3 the unicode characters are not escaped, python2 needs to know the encoding.
2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
779822d945 Add experimental support for lazy loading the info extractors
'make lazy-extractors' creates the youtube_dl/extractor/lazy_extractors.py (imported by youtube_dl/extractor/__init__.py), which contains simplified classes that only have the 'suitable' class method and that load the appropiate class with the '__new__' method when a instance is created.
2016-04-08 21:50:07 +02:00
Jaime Marquínez Ferrándiz
1b3d5e05a8 Move the extreactors import to youtube_dl/extractor/extractors.py 2016-04-08 21:47:51 +02:00
Jaime Marquínez Ferrándiz
e52d7f85f2 Delay initialization of InfoExtractors until they are needed 2016-04-08 21:43:24 +02:00
Sergey M․
568d2f78d6 [tnaflix] Fix metadata extraction 2016-04-09 00:27:24 +06:00
Sergey M․
2f2fcf1a33 [tnaflix] Fix extraction (Closes #9074) 2016-04-08 23:34:59 +06:00
Sergey M․
bacec0397f [extractor/common] Relax _hidden_inputs 2016-04-08 23:33:45 +06:00
Sergey M․
3c6c7e7d7e [gdcvault] Fix extraction (Closes #9107, closes #9114) 2016-04-08 23:16:02 +06:00
Sergey M․
fb38aa8b53 [extractor/common] Support arbitrary format strings for template based identifiers in mpd manifests (Closes #9119, closes #9120) 2016-04-08 22:48:08 +06:00
Sergey M․
18da24634c [democracynow] Improve extraction 2016-04-08 22:27:27 +06:00
Sergey M․
a134426d61 [democracynow] Fix tests 2016-04-08 22:21:14 +06:00
Sergey M․
a64c0c9b06 [democracynow] Make description optional (Closes #9115) 2016-04-08 22:15:36 +06:00
Sergey M․
56019444cb [novamov] Improve _VALID_URL template (Closes #9116) 2016-04-08 21:26:42 +06:00
remitamine
a1ff3cd5f9 [acast] fix channel extraction(closes #9117) 2016-04-08 15:15:34 +01:00
remitamine
9a32e80477 [acast] fix extraction(#9117) 2016-04-08 14:51:00 +01:00
Sergey M․
536a55dabd [YoutubeDL] Sanitize single thumbnail URL 2016-04-08 00:17:47 +06:00
Sergey M․
ed6fb8b804 [vrt] Add support for direct hls playlists and YouTube (Closes #9108) 2016-04-07 23:22:43 +06:00
Sergey M․
3afef2e3fc [beeg] Improve extraction 2016-04-07 22:40:35 +06:00
Sergey M․
e90d175436 [yandexmusic] Extract music album metafields (Closes #7354) 2016-04-07 02:56:13 +06:00
Sergey M․
7a93ab5f3f [extractor/common] Introduce music album metafields 2016-04-07 02:53:53 +06:00
Philipp Hagemeister
c41cf65d4a release 2016.04.06 2016-04-06 15:13:08 +02:00
Jaime Marquínez Ferrándiz
ec4a4c6fcc Makefile: remove ISSUE_TEMPLATE.md from the 'all' target (fixes #9088)
It isn't included in the tar file, causing build failures.
Since it's only used for GitHub, I think we don't need to store it in the tar file.
2016-04-06 14:16:05 +02:00
Jaime Marquínez Ferrándiz
be0c7009fb Makefile: use full path for the ISSUE_TEMPLATE.md file 2016-04-06 14:09:31 +02:00
Yen Chi Hsuan
92d5477d84 [compat] Handle tuples properly in urlencode()
Fixes #9055
2016-04-06 18:29:54 +08:00
Yen Chi Hsuan
8790249c68 [iqiyi] Improve error detection for VIP-only videos
Closes #9071
2016-04-06 16:12:16 +08:00
Philipp Hagemeister
416930d450 release 2016.04.05 2016-04-05 18:36:24 +02:00
Sergey M․
65150b41bb [deezer] Fix extraction (Closes #9086) 2016-04-05 22:27:33 +06:00
Sergey M․
e42f413716 [rte] Improve thumbnail extraction (Closes #9085) 2016-04-05 22:23:20 +06:00
Sergey M․
40a056d85d [extractor/__init__] Remove novamov extractor and sort novamov based extractors alphabetically 2016-04-05 21:54:09 +06:00
Sergey M․
e7d77efb9d [auroravid] Add extractor (Closes #9070) 2016-04-05 21:52:07 +06:00
Sergey M․
995cf05c96 [novamov] Make title fatal 2016-04-05 21:40:43 +06:00
Jaime Marquínez Ferrándiz
5bf28d7864 [utils] dfxp2srt: add additional namespace
Used by the ZDF subtitles (#9081).
2016-04-04 20:46:35 +02:00
Jaime Marquínez Ferrándiz
8c7d6e8e22 [zdf] Extract subtitles (closes #9081) 2016-04-04 20:44:06 +02:00
Sergey M․
6d4fc66bfc [youtube] Add support for zwearz (Closes #9062) 2016-04-04 02:26:20 +06:00
remitamine
23576edbfc [brightcove:legacy] skip None value for uploader_id 2016-04-02 21:31:21 +01:00
remitamine
4d4cd35f48 [brightcove:legacy] extract uploader_id as a string 2016-04-02 20:55:44 +01:00
remitamine
3aac9b2fb1 [nowness] update tests 2016-04-02 18:57:15 +01:00
remitamine
e47d19e991 [brightcove:new] extract subtitles and strip video title 2016-04-02 18:57:15 +01:00
remitamine
41f5492fbc [brightcove:legacy] improve format extraction and extract uploader_id, duration and timestamp 2016-04-02 18:57:15 +01:00
Jaime Marquínez Ferrándiz
2defa7d75a [instagram:user] Fix extraction (fixes #9059)
The URL for the next page was incorrect and we always got the same page, therefore it got trapped in an infinite loop.
2016-04-02 18:03:56 +02:00
Sergey M․
bbc26c8a01 [bbc] Set vcodec to none for audio formats 2016-04-02 19:00:38 +06:00
Sergey M․
b507cc925b [extractor/common] Carry long line 2016-04-02 18:49:58 +06:00
Sergey M․
db8ee7ec05 [extractor/common] Fix numeric identifiers conversion in DASH URL templates 2016-04-02 18:48:05 +06:00
remitamine
08136dc138 [brightcove] fix format sorting 2016-04-02 10:57:57 +01:00
remitamine
fe7ef95e91 [cbsinteractive] Add support for ZDNet videos 2016-04-01 23:53:32 +01:00
remitamine
5f705baf5e [cnet] extract more formats 2016-04-01 20:42:15 +01:00
remitamine
0750b2491f [ffmpeg] try to convert tt subtitles usng dfxp2srt 2016-04-01 19:47:49 +01:00
remitamine
df634be2ed [common] prefer using mime type over ext for smil subtitle extraction
the subtitle ext for http://www.cnet.com/videos/download-amazon-prime-movies-and-tv/
is adb_xml while using the mime type it get tt(application/smptett+xml)
2016-04-01 19:47:49 +01:00
Jaime Marquínez Ferrándiz
6d628fafca [camwithher] Remove extra blank line 2016-04-01 20:45:21 +02:00
Jaime Marquínez Ferrándiz
0f28777f58 [cbsnews] Remove unused import 2016-04-01 20:43:14 +02:00
Jaime Marquínez Ferrándiz
329c1eae54 [aenetworks] Make pep8 happy 2016-04-01 20:42:19 +02:00
Sergey M․
9aaaf8e8e8 [camwithher] Improve extraction (Closes #8989) 2016-04-01 23:47:27 +06:00
theGeekPirate
04819db58e [camwithher] Add extractor
Corrected unnecessary test

Sane variable naming

RTMP all .flv & url_id for _download_webpage()

Corrected all outstanding issues, next up is a squash!
2016-04-01 23:44:25 +06:00
remitamine
79ba9140dc [theplatform] extract timestamp and uploader 2016-04-01 18:07:17 +01:00
Sergey M․
75d572e9fb [screencast] Improve title regexes (Closes #9025) 2016-04-01 23:01:55 +06:00
Martin Trigaux
791d6aaecc screencast.com: fallback on page title
When determining the title of the page, use the <title> tag of the page
2016-04-01 23:00:52 +06:00
Sergey M․
81de73e5b4 [screencast] Add test 2016-04-01 23:00:45 +06:00
Martin Trigaux
83cedc1cf2 screencast.com: support missing www
The "www." part of the URL is not mandatory
2016-04-01 22:58:16 +06:00
Sergey M․
244cd04237 [pluralsight] Remove unnecessary login/password encode 2016-04-01 22:46:46 +06:00
Sergey M․
fbdaced256 [lynda] Remove unnecessary login/password encode 2016-04-01 22:45:20 +06:00
Sergey M․
a3373823e1 [udemy] Remove unnecessary login/password encode
This is now covered by compat_urllib_parse_urlencode
2016-04-01 22:42:09 +06:00
Sergey M․
03caa463e7 [udemy:course] Skip non-video lectures 2016-04-01 22:38:56 +06:00
remitamine
3f64379eda [movieclips] fix extraction 2016-04-01 16:22:06 +01:00
remitamine
3e0c3d14d9 [cbs] add base extractor 2016-04-01 10:12:29 +01:00
remitamine
d8873d4def [aenetworks] improve format extraction 2016-04-01 09:58:02 +01:00
remitamine
db1c969da5 [theplatform] sign https urls 2016-04-01 09:58:02 +01:00
Philipp Hagemeister
1e02bc7ba2 release 2016.04.01 2016-04-01 09:07:40 +02:00
remitamine
63c55e9f22 [cbs] improve extraction(closes #6321) 2016-04-01 07:33:37 +01:00
remitamine
f9b1529af8 [generic] remove sbnation test(handled by VoxMediaIE) 2016-03-31 23:50:45 +01:00
remitamine
961fc024d2 [voxmedia] improve sbnation support 2016-03-31 23:33:36 +01:00
Sergey M․
b53a06e3b9 [udemy:course] Use new URL format 2016-04-01 02:24:22 +06:00
remitamine
4ecc1fc638 [howstuffworks] improve extraction 2016-03-31 21:11:58 +01:00
Yen Chi Hsuan
5b012dfce8 [tudou] Improve error handling (closes #8988) 2016-04-01 01:42:16 +08:00
remitamine
8369942773 [voxmedia] Add new extractor(closes #3182) 2016-03-31 18:36:41 +01:00
Sergey M․
86f3b66cec [udemy] Remove unused import 2016-03-31 23:00:11 +06:00
Sergey M․
6bb4600717 [udemy:course] Simplify course curriculum downloading 2016-03-31 22:59:19 +06:00
Sergey M․
41d06b0424 [extractor/common] Improve _request_webpage
* Do not ignore data, headers and query for Requests
* Default values for headers and query switched to dicts since these are used by urllib itself
2016-03-31 22:58:38 +06:00
Sergey M․
15d260ebaa [utils] Use update_Request in http_request 2016-03-31 22:55:49 +06:00
Sergey M․
ed0291d153 [utils] Add update_Request 2016-03-31 22:55:01 +06:00
Sergey M․
81da8cbc45 [udemy] Switch to api 2.0 (Closes #9035) 2016-03-31 22:05:25 +06:00
Sergey M․
5299bc3f91 [beeg] Switch to api v6 (Closes #9036) 2016-03-31 20:42:41 +06:00
remitamine
c9c39c22c5 [nationalgeographic] add support for channel.nationalgeographic.com urls 2016-03-31 13:47:38 +01:00
remitamine
d84b48e3f1 [nationalgeographic] improve extraction 2016-03-31 13:44:55 +01:00
remitamine
dd17041c82 [tenplay] remove extractor(fixes #6927) 2016-03-31 12:02:04 +01:00
remitamine
fea7295b14 [brightcove] relax embed_in_page regex 2016-03-31 10:48:22 +01:00
remitamine
9cf01f7f30 [nbc] add new extractor for csnne.com(#5432) 2016-03-31 00:26:42 +01:00
remitamine
ce548296fe [cnbc] fix test 2016-03-31 00:25:11 +01:00
remitamine
c02ec7d430 [cnbc] Add new extractor(closes #8012) 2016-03-30 23:18:31 +01:00
remitamine
6b820a2376 [myspace] improve extraction 2016-03-30 21:18:07 +01:00
Yen Chi Hsuan
e621a344e6 [kwuo] Port to new API and enable --cn-verification-proxy 2016-03-31 02:27:52 +08:00
Yen Chi Hsuan
3ae6f8fec1 [kwuo] Remove _sort_formats() from KuwoBaseIE._get_formats()
Following the idea proposed in 19dbaeece3
2016-03-31 02:11:21 +08:00
Yen Chi Hsuan
597d52fadb [kuwo:song] Correct song ID extraction (fixes #9033)
Bug introduced in daef04a4e7.
2016-03-31 02:00:50 +08:00
Sergey M․
afca767d19 [tumblr] Improve _VALID_URL (Closes #9027) 2016-03-30 22:26:43 +06:00
remitamine
6e359a1534 [comcarcoff] don not depend on crackle extractor(closes #8995)
previously extraction has been delegated to crackle to extract more info
and subtitles #6106 but some of the episodes can't be extracted using
crackle #8995.
2016-03-30 12:27:00 +01:00
Sergey M․
607619bc90 Add manually generated ISSUE_TEMPLATE.md
In order not to wait for the next release
2016-03-29 22:04:29 +06:00
Sergey M․
0b7bfc9422 Improve ISSUE_TEMPLATE_tmpl.md 2016-03-29 22:02:42 +06:00
Sergey M․
7168a6c874 [devscripts/make_issue_template] Fix __version__ again 2016-03-29 03:05:15 +06:00
Sergey M․
034947dd1e Rename ISSUE_TEMPLATE.tmpl in order not to be picked up by github 2016-03-29 02:48:04 +06:00
Sergey M․
3c0de33ad7 Remove ISSUE_TEMPLATE.md 2016-03-29 02:43:48 +06:00
Sergey M․
89924f8230 [devscripts/make_issue_template] Fix NameError under python3 2016-03-29 02:41:27 +06:00
Sergey M․
a39c68f7e5 Exclude make_issue_template.py from flake8 2016-03-29 02:19:24 +06:00
Sergey M․
4a5a67ca25 [devscripts/release.sh] Make ISSUE_TEMPLATE.md and commit it 2016-03-29 02:18:52 +06:00
Sergey M․
8751da85a7 [Makefile] Fix ISSUE_TEMPLATE.md target 2016-03-29 02:17:57 +06:00
Sergey M․
3bf1df51fd [devscripts/make_issue_template] Rework to use ISSUE_TEMPLATE.tmpl (Closes #8785) 2016-03-29 02:16:38 +06:00
Sergey M․
3842a3e652 Add ISSUE_TEMPLATE.tmpl as template for ISSUE_TEMPLATE.md 2016-03-29 02:15:26 +06:00
Sander van den Oever
7710bdf4e8 Add initial ISSUE_TEMPLATE
Add auto-updating of youtube-dl version in ISSUE_TEMPLATE

Move parts of template text and adopt makefile to new format

Moved the 'kind-of-issue' section and rephrased a bit

Rephrased and moved Example URL section upwards

Moved ISSUE_TEMPLATE inside .github folder.

Update makefile to match new folderstructure
2016-03-28 22:43:13 +06:00
Sergey M
8d9dd3c34b [README.md] Add format_id to the list of string meta fields available for use in format selection 2016-03-28 03:08:34 +05:00
Sergey M․
33f3040a3e [YoutubeDL] Fix sanitizing subtitles' url 2016-03-28 03:13:39 +06:00
Sergey M․
03442072c0 [pornhub] Fix typo (Closes #9008) 2016-03-28 01:21:44 +06:00
Sergey M․
c8b13fec02 [foxnews] Restore upload time fields in test 2016-03-28 01:14:12 +06:00
Sergey M․
87d105ac6c [amp] Fix upload timestamp extraction (Closes #9007) 2016-03-28 01:13:47 +06:00
Sergey M․
3454139576 [pornhub:uservideos] Add support for multipage videos (Closes #9006) 2016-03-28 00:50:46 +06:00
Sergey M․
3a23bae9cc [pornhub:playlistbase] Do not include videos not from playlist 2016-03-28 00:32:57 +06:00
Sergey M․
8f9a477e7f [pornhub:playlistbase] Use orderedSet 2016-03-28 00:21:08 +06:00
Sergey M․
a1cf3e38a3 [bbc] Extend vpid regex (Closes #9003) 2016-03-27 23:22:51 +06:00
Philipp Hagemeister
a122e7080b release 2016.03.27 2016-03-27 16:56:33 +02:00
Sergey M․
b22ca76204 [extractor/common] Filter out unsupported encrypted media for f4m formats (Closes #8573) 2016-03-27 07:42:38 +06:00
Sergey M․
f7df343b4a [downloader/f4m] Extract routine for removing unsupported encrypted media 2016-03-27 07:41:19 +06:00
Sergey M․
19dbaeece3 Remove _sort_formats from _extract_*_formats methods
Now _sort_formats should be called explicitly.
_sort_formats has been added to all the necessary places in code.

Closes #8051
2016-03-27 07:03:08 +06:00
Yen Chi Hsuan
395fd4b08a [twitter] Handle another form of embedded Vine
Fixes #8996
2016-03-27 04:36:02 +08:00
Sergey M․
8018028d0f [pluralsight] Extract chapter metadata (Closes #8993) 2016-03-27 02:10:52 +06:00
Sergey M․
00322ad4fd [lynda] Extract chapter metadata (#8993) 2016-03-27 02:00:36 +06:00
Sergey M․
4cf3489c6e [vevo] Update videoservice API URL (Closes #8900) 2016-03-27 01:11:11 +06:00
Sergey M․
b24ab3e341 [udemy] Improve paid course detection 2016-03-27 00:09:12 +06:00
Sergey M․
af4116f4f0 [udemy] Improve format_id 2016-03-27 00:02:52 +06:00
Sergey M․
f973e5d54e [udemy] Drop outputs' formats
Always results in 403
2016-03-26 23:55:07 +06:00
Sergey M․
62f55aa68a [udemy] Add outputs metadata to view_html formats 2016-03-26 23:54:12 +06:00
Sergey M․
02d7634d24 [udemy] Fix outputs' formats format_id 2016-03-26 23:43:25 +06:00
Sergey M․
48dce58ca9 [udemy] Use custom sorting 2016-03-26 23:42:46 +06:00
Sergey M․
efcba804f6 [udemy] Extract formats from view_html (Closes #8979) 2016-03-26 23:42:34 +06:00
Sergey M․
6dee688e6d [youtube:playlistsbase] Restrict playlist regex (Closes #8986) 2016-03-26 20:42:18 +06:00
Sergey M․
eedb7ba536 [YoutubeDL] Sort imports 2016-03-26 19:40:33 +06:00
Sergey M․
dcf77cf1a7 [YoutubeDL] Sanitize final URLs (Closes #8991) 2016-03-26 19:37:41 +06:00
Sergey M․
17bcc626bf [utils] Extract sanitize_url routine 2016-03-26 19:33:57 +06:00
Sergey M․
b5a5bbf376 [mailru] Extend _VALID_URL (Closes #8990) 2016-03-26 19:15:32 +06:00
Yen Chi Hsuan
e68d3a010f [twitter] Fix extraction (closes #8966)
HLS and DASH formats are no longer appeared in test cases. I keep them
for fear of triggering new errors.
2016-03-26 18:34:51 +08:00
Yen Chi Hsuan
d10fe8358c [generic] Add a test case for brightcove embed
Closes #8862
2016-03-26 18:30:43 +08:00
Yen Chi Hsuan
d6c340cae5 [brightcove] Extract more formats (#8862) 2016-03-26 18:21:07 +08:00
Yen Chi Hsuan
5964b598ff [brightcove] Support alternative BrightcoveExperience layout
The full URL lays in the `data` attribute of <object> (#8862)
2016-03-26 17:47:32 +08:00
Philipp Hagemeister
62cdb96f51 release 2016.03.26 2016-03-26 08:58:03 +01:00
Sergey M․
e289d6d62c [test_compat] Add tests for compat_urllib_parse_urlencode 2016-03-26 02:38:33 +06:00
Sergey M․
6e6bc8dae5 Use urlencode_postdata across the codebase 2016-03-26 02:19:24 +06:00
Sergey M․
15707c7e02 [compat] Add compat_urllib_parse_urlencode and eliminate encode_dict
encode_dict functionality has been improved and moved directly into compat_urllib_parse_urlencode
All occurrences of compat_urllib_parse.urlencode throughout the codebase have been replaced by compat_urllib_parse_urlencode

Closes #8974
2016-03-26 01:46:57 +06:00
Sergey M․
2156f16ca7 [thescene] Fix extraction and improve style (Closes #8978) 2016-03-25 20:14:34 +06:00
Sergey M․
4db441de72 [once] Relax _VALID_URL (Closes #8976) 2016-03-25 19:51:28 +06:00
Philipp Hagemeister
0be8314dc8 release 2016.03.25 2016-03-25 09:27:18 +01:00
Yen Chi Hsuan
d7f62b049a [iqiyi] Update enc_key 2016-03-25 15:45:40 +08:00
Yen Chi Hsuan
3bb3356812 [douyutv] Extend _VALID_URL 2016-03-25 15:43:29 +08:00
Sergey M․
3f15fec1d1 Credit @Kagami for mnet (#8958) 2016-03-25 03:56:27 +06:00
Sergey M․
98e68806fb [mnet] Improve (Closes #8958) 2016-03-25 03:26:29 +06:00
Kagami Hiiragi
e031768666 [mnet] Add new extractor 2016-03-25 02:32:06 +06:00
Sergey M․
5eb7db4ee9 [udemy] Add support for new URL schema 2016-03-25 02:28:39 +06:00
Sergey M․
f0e83681d9 [udemy] Extract formats from outputs 2016-03-25 02:27:13 +06:00
Sergey M․
ff9d5d0938 [udemy] Improve course enrolling 2016-03-25 02:26:46 +06:00
Sergey M․
d041a73674 [extractor/__init__] Add youtube:live and sort youtube extractors alphabetically 2016-03-25 01:39:25 +06:00
Sergey M․
f07e276a04 [youtube:live] Add extractor (Closes #8959) 2016-03-25 01:18:14 +06:00
Sergey M․
993271da0a [nytimes] Tolerate missing metadata (Closes #8952) 2016-03-24 23:28:24 +06:00
Sergey M․
369e7e3ff0 [iprima] Fix extraction (Closes #8953) 2016-03-24 22:54:26 +06:00
Sergey M․
5767b4eeae [mtv] Fix description extraction (Closes #8962) 2016-03-24 22:23:31 +06:00
Yen Chi Hsuan
622d19160b [utils] Clarify Python versions affected by buggy struct module 2016-03-24 18:06:15 +08:00
Yen Chi Hsuan
32d88410eb [tumblr] Add a test with Instagram embed
Closes #8817
2016-03-24 16:32:53 +08:00
Yen Chi Hsuan
5a51775a58 [generic] Extract Instagram embeds (#8817) 2016-03-24 16:32:27 +08:00
Yen Chi Hsuan
87696e78d7 [instagram] Unescape description (#8817) 2016-03-24 16:30:01 +08:00
Yen Chi Hsuan
c4096e8aea [instagram] Extract embed videos (#8817) 2016-03-24 16:29:33 +08:00
Yen Chi Hsuan
fc27ea9464 [tumblr] Support Vine embeds (#8817) 2016-03-23 23:55:52 +08:00
Yen Chi Hsuan
088e1aac59 [generic] Support Vine embeds (#8817) 2016-03-23 23:55:08 +08:00
Yen Chi Hsuan
81f36eba88 [test/test_utils] Update for escape_url change (again) 2016-03-23 23:23:26 +08:00
Yen Chi Hsuan
2d60465e44 [test/test_utils] Update for escape_url change 2016-03-23 23:20:28 +08:00
Sergey M
4333d56494 Merge pull request #8898 from dstftw/fragment-retries
Add --fragment-retries option (Fixes #8466)
2016-03-23 20:12:32 +05:00
Sergey M․
882c699296 [tunein] Fix stream data extraction (Closes #8899, closes #8924) 2016-03-23 20:45:39 +06:00
Yen Chi Hsuan
efbed08dc2 [utils] Encode hostnames before passing to urllib
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError

Fixes #8890
2016-03-23 22:24:52 +08:00
Jaime Marquínez Ferrándiz
7da2c87119 Add extractor for thescene.com (closes #8929) 2016-03-22 22:17:59 +01:00
Sergey M․
c6ca11f1b3 [once] Prevent ads from embedding into m3u8 playlists (Closes #8893) 2016-03-22 23:48:05 +06:00
Sergey M․
2beeb286e1 [laola1tv] Add support for livestreams (Closes #8934) 2016-03-22 22:32:59 +06:00
Sergey M․
cc7397b04d [ceskatelevize] Make m3u8 formats extraction non fatal (Closes #8933) 2016-03-22 21:12:29 +06:00
Sergey M․
bc5d16b302 [animeondemand] Skip dash for now 2016-03-21 23:37:39 +06:00
Sergey M․
85c637b737 [animeondemand] Extract teaser when no full episode available (#8923) 2016-03-21 23:35:50 +06:00
Sergey M․
5c69f7a479 [animeondemand] Respect startvideo (Closes #8923) 2016-03-21 23:31:40 +06:00
Sergey M․
ff5873b72d [motherless] Detect friends only videos 2016-03-21 22:24:42 +06:00
Sergey M․
065c4b27bf [xhamster:embed] Extract vars (Closes #8912) 2016-03-21 22:07:34 +06:00
Sergey M․
1600ed1ff9 [rutv] Improve flash version pattern (Closes #8911) 2016-03-21 21:46:49 +06:00
Sergey M․
5886b38d73 Add support for https for all extractors as preventive and future-proof measure 2016-03-21 21:36:32 +06:00
Sergey M․
0cef27ad25 Add missing r prefix for _VALID_URLs 2016-03-21 21:22:37 +06:00
Sergey M․
12af4beb3e [mailru] Add support for https (Closes #8920) 2016-03-21 21:17:29 +06:00
Sergey M․
9016d76f71 [YoutubeDL] Improve _format_note 2016-03-20 22:01:45 +06:00
Sergey M․
3c5d183c19 [animeondemand] Extract all formats (Closes #8906) 2016-03-20 21:51:22 +06:00
Sergey M․
3e8bb9a972 [animeondemand] Detect geo restriction 2016-03-20 20:39:00 +06:00
Yen Chi Hsuan
daef04a4e7 [kwuo] Fix KuwoChartIE and KuwoSingerIE and accept new URL forms 2016-03-20 20:17:56 +08:00
Yen Chi Hsuan
7caae128a7 Credit @vitstradal for the key algorithm in OpenloadIE (#8489)
[ci skip]
2016-03-20 19:12:02 +08:00
Yen Chi Hsuan
2648918c81 [vlive] Fix creator extraction (closes #8814) 2016-03-20 18:15:53 +08:00
Jaime Marquínez Ferrándiz
920d318d3c README: document that BSD make is also supported (#8902) 2016-03-20 10:55:14 +01:00
Yen Chi Hsuan
9e3c2f1d74 [openload] Misc improvements
* Add thumbnail
* Detect errors (#6469)
* Match more (#6469, #8489)
2016-03-20 16:49:44 +08:00
Yen Chi Hsuan
2bfeee69b9 [openload] Add new extractor (closes #8489) 2016-03-20 15:54:58 +08:00
Yen Chi Hsuan
664bcd80b9 [tudou] Use InAdvancePagedList (closes #8884) 2016-03-20 15:45:31 +08:00
Sergey M․
3c20208eff [francetv] Improve formats extraction 2016-03-20 13:00:46 +06:00
Sergey M․
db264e3cc3 [francetvinfo] Add support for france3-regions and strip title (Closes #7673) 2016-03-20 12:44:04 +06:00
Sergey M
d396f30467 Merge pull request #8902 from jaimeMF/bmake
Makefile: make it compatible with bmake
2016-03-20 11:08:57 +05:00
Sergey M․
96a9f22d98 [discovery] Relax _VALID_URL (Closes #8903) 2016-03-20 10:26:58 +06:00
Sergey M․
40025ee2a3 [postprocessort/ffmpeg] Allow embedding webvtt into webm (Closes #8874) 2016-03-20 04:12:34 +06:00
Jaime Marquínez Ferrándiz
3ff63fb365 Makefile: make it compatible with bmake
It's the portable version of BSD make: http://crufty.net/help/sjg/bmake.html
The syntax for conditionals is different in GNU make and BSD make, so we use the shell
2016-03-19 21:51:13 +01:00
Jaime Marquínez Ferrándiz
5c7cd37ebd tox.ini: Exclude test_iqiyi_sdk_interpreter.py 2016-03-19 21:50:16 +01:00
Sergey M․
298c04b464 [91porn] Use common messages' wording 2016-03-20 02:35:48 +06:00
Sergey M․
d95114dd83 [91porn] Unquote final URL (Closes #8881) 2016-03-20 02:34:02 +06:00
Sergey M․
94dcade8f8 Credit @jjatria for biobiochiletv (#7314) 2016-03-20 01:36:20 +06:00
Sergey M․
fa023ccb2c [biobiochiletv] Fix extraction, extract m3u8 formats and overall improve (Closes #7314) 2016-03-20 01:31:55 +06:00
jjatria
e36f4aa72b [biobiotv] Add extractor 2016-03-20 01:29:08 +06:00
Sergey M․
9261e347cc Credit @kasper93 for cda (#8805) 2016-03-19 23:18:04 +06:00
Sergey M․
f1ced6df51 [cda] Improve and simplify (Closes #8805) 2016-03-19 23:17:14 +06:00
Kacper Michajłow
8b0d7a66ef [cda] Add new extractor for cda.pl
Fixes #8760
2016-03-19 22:42:40 +06:00
Sergey M․
3aec71766d [safari:api] Separate extractor (Closes #8871) 2016-03-19 22:30:48 +06:00
Sergey M․
16a8b7986b [downloader/fragment] Document fragment_retries 2016-03-19 20:54:21 +06:00
Sergey M․
617e58d850 [downloader/{common,fragment}] Fix total retries reporting on python 2.6 2016-03-19 20:51:30 +06:00
Sergey M․
e33baba0dd [downloader/dash] Add fragment retry capability
YouTube may often return 404 HTTP error for a fragment causing the
whole download to fail. However if the same fragment is immediately
retried with the same request data this usually succeeds (1-2 attemps
is usually enough) thus allowing to download the whole file successfully.
So, we will retry all fragments that fail with 404 HTTP error for now.
2016-03-19 20:42:23 +06:00
Sergey M․
721f26b821 [downloader/fragment] Add report_retry_fragment 2016-03-19 20:41:24 +06:00
Sergey M․
52bb437e41 [options] Add --fragment-retries option 2016-03-19 20:40:36 +06:00
Jaime Marquínez Ferrándiz
782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string 2016-03-19 11:44:49 +01:00
Sergey M․
0d769bcb78 [extractor/generic] Fix missing byte literal prefix 2016-03-19 05:43:43 +06:00
remitamine
4cd70099ea [hbo] Add new extractor 2016-03-18 21:18:18 +01:00
Jaime Marquínez Ferrándiz
09fc33198a utils: lookup_unit_table: Use a stricter regex
In parse_count multiple units start with the same letter, so it would match different units depending on the order they were sorted when iterating over them.
2016-03-18 19:23:06 +01:00
Sergey M․
4c3b16d5d1 [test_YoutubeDL] Add test for format_id format selection 2016-03-19 00:04:26 +06:00
John Peel
d5aacf9a90 Added format_id to the filers on -f. 2016-03-18 23:59:24 +06:00
Sergey M․
19e2617a6f [commonprotocols] Add generic support for rtmp URLs (Closes #8488) 2016-03-18 23:42:15 +06:00
Sergey M․
edd9b71c2c [extractor/generic] Add a test for m3u playlist served without proper Content-Type 2016-03-18 22:49:11 +06:00
Sergey M․
5940862d5a [extractor/generic] Detect m3u playlists served without proper Content-Type 2016-03-18 22:45:28 +06:00
Sergey M․
de6c51e88e [extractor/generic] Fix direct link semantics 2016-03-18 22:43:07 +06:00
Sergey M․
303dcdb995 [extractor/generic] Simplify upload_date extraction 2016-03-18 22:41:16 +06:00
Sergey M․
20938f768b [extractor/generic] Add another test for generic m3u8 2016-03-18 21:54:33 +06:00
Sergey M․
955737b2d4 [extractor/generic] Force Content-Type to lowecase 2016-03-18 21:50:44 +06:00
Sergey M․
263eff9537 [extractor/generic] Properly extract format id from Content-Type
Fixes extraction for cases like: audio/x-mpegURL; charset=utf-8
2016-03-18 21:50:10 +06:00
Sergey M․
cae21032ab [theplatform] Improve geo restriction detection 2016-03-18 21:08:25 +06:00
remitamine
6187091532 [once] check http formats availability 2016-03-18 11:51:34 +01:00
Philipp Hagemeister
0d33166ec5 release 2016.03.18 2016-03-18 11:43:48 +01:00
remitamine
87c03c6bd2 [theplatform] remove unnecessary import 2016-03-18 09:43:28 +01:00
remitamine
4c92fd2e83 [theplatform] always force theplatform to return a smil for _extract_theplatform_smil 2016-03-18 09:22:10 +01:00
Sergey M․
e3d17b3c07 [noz] Fix extraction on python 2.6 by means of using compat_xpath 2016-03-18 02:54:27 +06:00
Sergey M․
810c10baa1 [utils] Use compat_xpath 2016-03-18 02:52:23 +06:00
Sergey M․
57f7e3c62d [compat] Add compat_xpath 2016-03-18 02:51:38 +06:00
Sergey M․
0d0e282912 [animeondemand] Fix typo and improve 2016-03-18 00:13:50 +06:00
Sergey M․
85e8f26b82 [animeondemand] Improve extraction 2016-03-18 00:02:34 +06:00
Sergey M․
b57fecfddd [animeondemand] Add test 2016-03-17 23:50:10 +06:00
Sergey M․
8c97e7efb6 [animeondemand] Expand episode title regex (Closes #8875) 2016-03-17 23:43:14 +06:00
Sergey M․
cc162f6a0a [crunchyroll] Fix custom _download_webpage (Closes #8883) 2016-03-17 22:55:04 +06:00
remitamine
cf45ed786e [wistia] extract more metadata 2016-03-17 17:48:17 +01:00
remitamine
574b2a7393 [nbc:nbcnews] improve extraction(fixes #6922)
- extract more metadata and formats
- relax regex
2016-03-17 16:11:29 +01:00
remitamine
9f02ff537c [theplatform] extract brightcove once formats 2016-03-17 16:11:29 +01:00
remitamine
0436ec0e7a [once] Add new format extractor 2016-03-17 16:11:29 +01:00
Yen Chi Hsuan
11f12195af [youtube] Added itag 91
Seen in https://www.youtube.com/watch?v=jMN4cxyhJjk
2016-03-17 19:25:37 +08:00
remitamine
a646a8cf98 [sbs] improve extraction(fixes #3811)
- extract error messages
- force the platform smil url(previously the manifest param
in the query is not respected which make theplatform return non working
mp4 files for some videos)
2016-03-17 02:07:06 +01:00
remitamine
63f41d3821 [bravotv] Add new extractor(#4657) 2016-03-16 21:26:25 +01:00
Sergey M․
c5229f3926 [utils] PEP 8 2016-03-16 21:50:04 +06:00
Sergey M․
96f4f796fb [brightcover] Remove unused import 2016-03-16 21:47:51 +06:00
Sergey M․
70cab344c4 [udemy] Improve course id v4 regex 2016-03-16 21:46:09 +06:00
Quan Hua
a7ba57dc17 [udemy] Update course id regex to cover v4 layout (Closes #8753, closes #8868, closes #8870) 2016-03-16 21:45:01 +06:00
remitamine
83548824c2 Merge pull request #8092 from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
remitamine
354dbbd880 [brightcove:new] extract protocol-less embed URLs(closes #2914) 2016-03-16 11:46:53 +01:00
remitamine
23edc49509 [tv3] Add new extractor(closes #8059) 2016-03-16 10:47:39 +01:00
remitamine
48254c3f2c [brightcove] some improvements and fixes
- use FFmpeg downloader to download m3u8 formats extracted
from BrightcoveNew(some of the m3u8 media playlists use AES-128)
- update comment and update_url_query to handle url query
2016-03-16 09:21:07 +01:00
remitamine
2cab48704c [thestar] Add new extractor(closes #5955) 2016-03-15 23:10:31 +01:00
remitamine
64d4f31d78 [brightcove:new] update embed_in_page embeds regex to match non numeric ref id 2016-03-15 22:50:43 +01:00
remitamine
0c9ff24041 [noz] fix extraction in python 2.6 2016-03-15 21:00:39 +01:00
Yen Chi Hsuan
3ff8279e80 [kuwo:mv] Fix the test and extraction of georestricted MVs 2016-03-16 02:41:18 +08:00
remitamine
cb6e477dfe [aljazeera] update the extractor to use BrightcoveNewIE 2016-03-15 19:38:10 +01:00
remitamine
edfd93518e [svt] extract dashhbbtv formats(#8867) 2016-03-15 19:33:09 +01:00
remitamine
89807d6a82 [brightcove] extract dash formats and detect audio formats 2016-03-15 18:48:21 +01:00
remitamine
49dea4913b Merge pull request #8513 from remitamine/dash-sort
[extractor/common] fix dash formats sorting
2016-03-15 18:39:50 +01:00
Sergey M․
dec2cae0a7 [twitch:playlistbase] Clarify pagination bug
Pagination bug has been fixed by twitch on 15.03.2016.
2016-03-15 21:45:43 +06:00
remitamine
cf6cd07396 [noz] extract f4m and m3u8 formats 2016-03-15 15:24:12 +01:00
remitamine
975b9c9ab0 [brightcove:new] detect m3u8 manifests by M2TS container 2016-03-15 10:06:53 +01:00
remitamine
8ac73bdbe4 [brightcove:new] Add support for non numeric ref: preffixed video ids 2016-03-15 10:03:08 +01:00
remitamine
877f440f7b [rice] Add new extractor(closes #1736) 2016-03-15 00:49:23 +01:00
remitamine
d13bdc3824 [brightcove] raise ExtractorError on 403 errors and fix regex to work with tenplay 2016-03-14 22:24:52 +01:00
remitamine
744daf9418 [gameinformer] remove unused imports 2016-03-14 21:57:26 +01:00
remitamine
bf475e1990 [tlc] fix extraction and update extractor to use BrightcoveNewIE 2016-03-14 21:53:00 +01:00
remitamine
203f3d779a [gameinformer] update the extractor to use BrightcoveNewIE 2016-03-14 18:32:29 +01:00
remitamine
4230c4894d [external/downloader] fix rtmp downloading using FFmpegFD 2016-03-14 16:51:01 +01:00
Philipp Hagemeister
6bb266693f release 2016.03.14 2016-03-14 10:25:20 +01:00
remitamine
5d53c32701 [usatoday] Add new extractor(closes #8655) 2016-03-13 22:36:15 +01:00
remitamine
2e7e561c1d Merge pull request #8611 from remitamine/ffmpegfd
[downloader/external] Add FFmpegFD
2016-03-13 21:30:27 +01:00
remitamine
d8515fd41c [downloader/external] pass configuration args to ffmpeg 2016-03-13 21:28:26 +01:00
remitamine
694c47b261 [external/downloader] don't pass -t and -ss to ffmpeg 2016-03-13 21:28:16 +01:00
remitamine
77dea16ac8 [downloader/external] check for ffmpeg availablity when it used for m3u8 download 2016-03-13 20:34:51 +01:00
remitamine
6ae27bed01 [download/external] move the check for multiple selected formats to get_suitable_downloader 2016-03-13 20:34:38 +01:00
remitamine
da1973a038 [extractor/__init__] disable time range downloading 2016-03-13 16:16:26 +01:00
remitamine
be24916a7f [downloader/rtsp] Add rtsp and mms downloader 2016-03-13 15:24:02 +01:00
remitamine
2cb99ebbd0 [downloader/external] add can_download mathod for checking downloader availibilty and support 2016-03-13 15:18:51 +01:00
remitamine
91ee320bfa [downloader/external] wrap available_opt in a list 2016-03-13 14:37:45 +01:00
remitamine
8fb754bcd0 Merge pull request #8821 from remitamine/list-thumbnails-order
[YoutubeDL] check for --list-thumbnails immediately after processing them
2016-03-13 12:44:50 +01:00
remitamine
b7b72db9ad [YoutubeDL] check for --list-thumbnails immediately after processing them 2016-03-13 12:41:15 +01:00
remitamine
634415ca17 [downloader/external] skip FFmpegFD when requesting multiple formats 2016-03-13 12:23:10 +01:00
Sergey M․
2f7ae819ac [utils] PEP 8 2016-03-13 17:23:08 +06:00
Sergey M․
0a477f8731 [vice:show] Add extractor (Closes #8847) 2016-03-13 17:22:23 +06:00
remitamine
a755f82549 [ffmpeg] convert format ext to ffmpeg output formats codes 2016-03-13 12:15:29 +01:00
Sergey M․
7f4173ae7c [mixcloud] Fix view count extraction (Closes #8831, closes #8845) 2016-03-13 16:27:58 +06:00
Sergey M․
fb47597b09 [bbc] Generalize unit table lookup and add parse_count 2016-03-13 16:27:20 +06:00
Sergey M․
450b233cc2 [bbc] Update test 2016-03-13 15:59:54 +06:00
Sergey M․
b7d7674f1e [bbc] Update test 2016-03-13 15:56:34 +06:00
Sergey M․
0e832c2c97 [bbc] Improve title and description extraction (Closes #8826, closes #8822) 2016-03-13 15:54:56 +06:00
Benjamin Congdon
8e4aa7bf18 [bbc] Fix BBC Extractor to work with 'School Report' 2016-03-13 15:54:34 +06:00
remitamine
a42dfa629e [makerschannel] Add new extractor(closes #8839) 2016-03-12 22:52:53 +01:00
remitamine
b970dfddaf [minoto] Add new extractor 2016-03-12 22:52:53 +01:00
Sergey M․
46a4ea8276 [safari] Remove unused imports 2016-03-13 03:48:38 +06:00
Sergey M․
3f2f4a94aa [extractor/generic] Extract f4m formats from final URLs 2016-03-13 03:38:20 +06:00
Sergey M․
f930e0c76e [extractor/generic] Extract f4m formats and refactor common info 2016-03-13 03:17:25 +06:00
Sergey M․
0fdbb3322b [extractor/common] Add _parse_f4m_formats routine 2016-03-13 03:16:08 +06:00
Sergey M․
e9c8999ede [safari] Fix authentication 2016-03-13 02:08:36 +06:00
Sergey M․
73cbd709f9 [safari] Respect kaltura session (Closes #7491) 2016-03-13 02:03:07 +06:00
Sergey M․
9dce3c095b [kaltura] Respect kaltura session 2016-03-13 02:01:10 +06:00
remitamine
e5a2e17a9c [kaltura] optimize url info extraction 2016-03-12 18:43:45 +01:00
remitamine
0ec589fac3 Merge pull request #8827 from remitamine/safari
[safari] extract free and preview videos(#7491)
2016-03-12 17:28:54 +01:00
remitamine
36bb63e084 [dw] add support for article pages(closes #8790) 2016-03-12 08:33:22 +01:00
remitamine
91d6aafb48 [dw] add support for audio pages 2016-03-11 23:55:26 +01:00
remitamine
c8868a9d83 [dw] Add new extractor 2016-03-11 22:44:18 +01:00
remitamine
09f572fbc0 [extractor/common] add transform_source to _download_smil and _extract_smil_formats 2016-03-11 22:37:07 +01:00
Sergey M․
58e6d097d8 [googledrive] Relax _VALID_URL (Closes #8829) 2016-03-12 00:36:39 +06:00
remitamine
15bf934de5 Merge pull request #8819 from remitamine/simple-webpage-requests
[extractor/common] simplify using data, headers and query params with _download_* methods
2016-03-11 18:19:43 +01:00
remitamine
cdfee16818 [extractor/common] add data, headers and query params to _request_webpage 2016-03-11 18:12:50 +01:00
remitamine
bcb668de18 [safari] extract free and preview videos(#7491) 2016-03-11 16:57:06 +01:00
remitamine
fac7e79277 [kaltura] add support for videos with reference id 2016-03-11 16:52:07 +01:00
Yen Chi Hsuan
a6c8b75904 [common] Use mimeType to determine file extensions (#8766) 2016-03-11 23:51:42 +08:00
Yen Chi Hsuan
25cb05bda9 [utils] Remove codec2ext
This function is orignally used for determining file extensions of DASH
formats. Now in DASH, ext is determined by mime_type. See #8766 for more
information.
2016-03-11 23:51:42 +08:00
Sergey M․
6fa6d38549 Credit @benjamincongdon for audioboom (#8812) 2016-03-11 19:46:06 +06:00
Sergey M․
883c052378 [audioboom] Improve robustness and extract uploader (Closes #8812) 2016-03-11 19:44:17 +06:00
Benjamin Congdon
61f317c24c Added extractor for AudioBoom.com 2016-03-11 19:43:01 +06:00
Yen Chi Hsuan
64f08d4ff2 Merge pull request #8766 from yan12125/dash-detect-ext
Detect file extensions of DASH formats from their codecs
2016-03-11 21:40:07 +08:00
Yen Chi Hsuan
e738e43358 [facebook] Support videos in groups
Viewing/Downloading videos in groups requires logging in, even for
those in public groups.

Fixes #6951.
2016-03-11 16:20:27 +08:00
Jaime Marquínez Ferrándiz
f6f6217a98 [facebook] Don't override variable in list comprehension 2016-03-10 15:17:04 +01:00
Yen Chi Hsuan
31db8709bf [iqiyi] Update enc_key 2016-03-10 21:37:26 +08:00
Yen Chi Hsuan
5080cbf9fd [facebook] Handle escaped swf params
Fixes #8713
2016-03-10 15:26:32 +08:00
Yen Chi Hsuan
9880124196 [facebook] Fix for m.facebook.com URLs 2016-03-10 14:59:30 +08:00
Yen Chi Hsuan
9c7b509b2a [facebook] Merge FacebookPostIE into FacebookIE
Fixes #8713
2016-03-10 14:59:30 +08:00
Sergey M․
e0dccdd398 [test_YoutubeDL] PEP 8 2016-03-10 09:04:48 +06:00
Sergey M․
5d583bdf6c [YoutubeDL] Improve _format_note 2016-03-10 01:03:18 +06:00
Sergey M․
1e501364d5 [vimeo:ondemand] Clarify IE_NAME 2016-03-10 00:52:52 +06:00
Sergey M․
74278def2e [vimeo:ondemand] Separate ondemand extractor (Closes #8330, closes #8801) 2016-03-10 00:51:07 +06:00
Sergey M․
e375a149e1 [livestream] Properly build smil URLs (#8794) 2016-03-09 23:11:09 +06:00
Sergey M
2bfc0e97f6 Merge pull request #8791 from benjamincongdon/Twitch-AudioOnly-Rebased
[twitch] Support for "Audio_Only" format
2016-03-08 13:02:56 +06:00
Benjamin Congdon
ac45505528 Added flag for 'allow_audio_only' format in Twitch queries 2016-03-07 21:03:24 -06:00
Sergey M․
7404061141 Credit @mutantmonkey for ustudio (#8574) and kusi (#8575) 2016-03-07 02:30:47 +06:00
Sergey M․
46c329d6f6 [arte] Improve extraction (Closes #8768) 2016-03-07 02:19:54 +06:00
Sergey M․
1818e4c2b4 [arte] Fix typo 2016-03-07 02:10:16 +06:00
Sergey M․
e7bd17373d [sexu] Improve extraction (Closes #8782) 2016-03-06 18:08:53 +06:00
aystroganov@gmail.com
c58e74062f [Sexu] fix extractor 2016-03-06 17:53:22 +06:00
Yen Chi Hsuan
6d210f2090 [utils] Add more codecs to codec2ext
BBC uses avc3. Here's an example (thanks to @remitamine for this example)

http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd

See also https://trac.ffmpeg.org/ticket/5217
2016-03-06 17:57:48 +08:00
Yen Chi Hsuan
af7d5a63b2 [common] Document protocol http_dash_segments 2016-03-06 17:47:07 +08:00
Yen Chi Hsuan
e41acb6364 [safari] Don't pollute std_headers (#8778) 2016-03-06 17:38:39 +08:00
Philipp Hagemeister
bdf7f13954 release 2016.03.06 2016-03-06 10:08:02 +01:00
Yen Chi Hsuan
0f56a4b443 [vimeo] Don't pollute std_headers
Fixes #8778
2016-03-06 17:01:05 +08:00
Sergey M․
1b5284b13f [downloader/fragment] Make speed more smooth
At the beginning of every segment there was a drop to Unknown speed due to timeslice being too small to calculate speed.
Now last speed from the previous fragment is used.
2016-03-06 05:36:52 +06:00
Sergey M․
d1e4a464cd [YoutubeDL] Carry long lines and improve readability 2016-03-06 04:32:18 +06:00
Sergey M․
ff059017c0 [YoutubeDL] Fix typo in m3u8_native fixup 2016-03-06 04:30:19 +06:00
remitamine
f22ba4bd60 update tests related to the change in youtube http format sorting
the change was done in 82156fdbf0
2016-03-05 21:52:24 +01:00
remitamine
1db772673e [cinemassacre] update tests 2016-03-05 21:34:34 +01:00
remitamine
75313f2baa [cnet] fix info extraction 2016-03-05 21:10:00 +01:00
remitamine
090eb8e25f Merge pull request #8718 from remitamine/m3u8-fixup
Add fixup for media files produced by HlsNative downloader(fixes #4776)
2016-03-05 18:37:28 +01:00
remitamine
a9793f58a1 Merge pull request #8754 from remitamine/5min
update 5min related web sites info extraction and add support for Aol features.
2016-03-05 18:35:48 +01:00
remitamine
7177fd24f8 [vgtv] support ap.vgtv.no and fix old videos extraction(fixes #8719) 2016-03-05 17:51:46 +01:00
Sergey M․
1e501f6c40 [jeuxvideo] Fix config URL extraction (Closes #8774) 2016-03-05 21:01:43 +06:00
remitamine
2629a3802c [revison3] fix video_id for --download-archive 2016-03-05 15:42:15 +01:00
Sergey M․
51ce91174b [YoutubeDL] Fix resolution with missing height in output template dict 2016-03-05 19:38:58 +06:00
remitamine
107d0c421a [revision3] add support for pages of type tag 2016-03-05 13:43:29 +01:00
remitamine
18b0b23992 [revision3] add support pages of type embed 2016-03-05 12:14:48 +01:00
Sergey M․
d1b29d1342 [elpais] Add support for alternative layout (Closes #8744) 2016-03-05 16:43:29 +06:00
Yen Chi Hsuan
2def60c5f3 [common] Use codec2ext for DASH formats (#8764) 2016-03-05 18:18:39 +08:00
Yen Chi Hsuan
19a17d4623 [utils] Add codec2ext 2016-03-05 18:18:28 +08:00
Yen Chi Hsuan
845817aadf [twitter] Provide more metadata 2016-03-05 18:14:58 +08:00
Jaime Marquínez Ferrándiz
3233a68fbb [utils] update_url_query: Encode the strings in the query dict
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
2016-03-04 22:18:40 +01:00
remitamine
cf074e5ddd [foxnews] update test 2016-03-04 21:42:04 +01:00
Sergey M․
002c755248 [youporn] Fix sources regex 2016-03-05 01:51:27 +06:00
Sergey M․
d627cec608 [youporn] Fix quality extraction (Closes #8758) 2016-03-05 01:50:12 +06:00
remitamine
1315224cbb [bleacherreport] update tests 2016-03-04 20:14:09 +01:00
remitamine
7760b9ff4d [audimedia] update _VALID_URL and video_id regex and improve http format_id 2016-03-04 17:55:50 +01:00
Yen Chi Hsuan
28559564b2 [kusi] Correct test_KUSI 2016-03-05 00:04:29 +08:00
Yen Chi Hsuan
fa880d20ad [kusi] Two fixes
Thanks @dstftw for pointing out those
2016-03-04 23:59:58 +08:00
Sergey M․
ae7d31af1c [yandexmusic] Capture and output API errors 2016-03-04 21:32:54 +06:00
Yen Chi Hsuan
9d303bf29b Merge branch 'mutantmonkey-kusi' 2016-03-04 23:21:24 +08:00
Yen Chi Hsuan
5f1688f271 [kusi] Simplify and improve 2016-03-04 23:08:47 +08:00
remitamine
1d4c9ed90c [aol] imporve extraction
- add support for aol features
- remove support for legacy urls
2016-03-04 10:42:58 +01:00
remitamine
d48352fb5d [engadget] remove support for legacy urls 2016-03-04 10:40:39 +01:00
remitamine
6d6536acb2 [fivemin] improve extraction
- skip m3u8 formats(404 error)
- skip unavailable test
- download embed page only when it's needed
- update _VALID_URL regex(joystiq.com redirect to engadget.com)
2016-03-04 10:25:16 +01:00
Yen Chi Hsuan
b6f94d81ea [kusi] Add a test for the alternative form of URL 2016-03-04 14:32:01 +08:00
Yen Chi Hsuan
8477a69283 Merge branch 'kusi' of https://github.com/mutantmonkey/youtube-dl into mutantmonkey-kusi 2016-03-04 14:21:23 +08:00
Yen Chi Hsuan
d58cb3ec7e [leeco] Skip an invalid test. test_LePlaylist_1 is sufficient 2016-03-04 13:46:38 +08:00
Yen Chi Hsuan
8a370aedac [leeco] format_id should be strings 2016-03-04 13:38:45 +08:00
Yen Chi Hsuan
24ca0e9c0b [douyutv] Fix tests 2016-03-04 13:36:29 +08:00
Sergey M․
e1dd521e49 [livestream] Fix FutureWarning (Closes #8742) 2016-03-04 01:16:58 +06:00
remitamine
1255733945 Merge pull request #8739 from remitamine/update_url_params
[utils] add update_url_query function to create or update query string params
2016-03-03 19:24:04 +01:00
remitamine
3201a67f61 [test/test_utils] add more tests for update_url_query 2016-03-03 19:18:57 +01:00
Sergey M․
d0ff690d68 [indavideo:embed] Fix tags extraction (Closes #8738) 2016-03-04 00:09:40 +06:00
remitamine
fb640d0a3d [test/test_utils] add tests for update_url_query 2016-03-03 18:40:05 +01:00
remitamine
38f9ef31dc [utils] add update_url_query function 2016-03-03 18:34:52 +01:00
Sergey M․
a8276b2680 [twitch:playlistbase] Fix all at once fetch 2016-03-03 22:18:32 +06:00
Sergey M․
ececca6cde [twitch:playlistbase] Restore original _PAGE_LIMIT 2016-03-03 22:12:55 +06:00
Sergey M․
8bbb4b56ee [twitch:playlistsbase] Use orderedSet 2016-03-03 22:11:26 +06:00
Sergey M․
539a1641c6 [twitch] Workaround broken paging (Closes #8740) 2016-03-03 22:10:36 +06:00
Yen Chi Hsuan
1b0635aba3 [Makefile] Allow specifying the Python version in offline tests 2016-03-03 21:57:49 +08:00
Yen Chi Hsuan
429491f531 [test/http] Fix failure in Jython
make offlinetest passed on the latest Jython hg version with patched
lib-python/2.7/urllib2.py pulled from CPython 2.7.11
2016-03-03 21:55:17 +08:00
Yen Chi Hsuan
e9c0cdd389 [jython] Introduce compat_os_name
os.name is always 'java' on Jython
2016-03-03 19:24:24 +08:00
Yen Chi Hsuan
0cae023b24 Merge branch 'jython-support'
Closes #8302
2016-03-03 18:49:32 +08:00
Yen Chi Hsuan
8ee239e921 [utils] Jython support - handle filenames correctly
Now test:youtube downloads
2016-03-03 18:47:54 +08:00
Brian Foley
8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
remitamine
fa9e259fd9 [extractor/common] use compat_parse_qs in update_url_params 2016-03-03 10:54:39 +01:00
remitamine
f3bdae76de [extractor/common] add update_url_params helper method to add or update query string params 2016-03-03 10:27:22 +01:00
Yen Chi Hsuan
03879ff054 [twitter] Media info is not always in the first entity
Fixes #8704
2016-03-03 14:42:49 +08:00
Yen Chi Hsuan
c8398a9b87 [twitter] Now Twitter serves the same file for Firefox and Chrome 2016-03-03 14:27:27 +08:00
Yen Chi Hsuan
b8972bd69d [twitter] Fix extraction of test_Twitter and test_Twitter_1 2016-03-03 14:24:24 +08:00
Yen Chi Hsuan
0ae937a798 [twitter] Support twitter.com/i/videos/tweet/ URLS
Closes #8737
2016-03-03 13:43:45 +08:00
remitamine
4459bef203 [thepltform] detect other types of errors 2016-03-02 21:41:29 +01:00
remitamine
e07237f640 [utils] remove check for val from find_xpath_attr 2016-03-02 21:40:21 +01:00
Yen Chi Hsuan
8c5a994424 [leeco] Letv renamed to LeEco
LeEco is the company name and Le is the domain name.

For more information see the Chinese news post
http://www.techorz.com/company-news/letv-renamed-to-leeco-and-new-logo/
2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
2eb25b256b [letv] Merge LetvTvIE into LetvPlaylistIE
And
1. Add more URL examples
2. Improve the matching pattern
2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
f3bc19a989 [letv] Correct regular expressions and fix a typo 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
7a8fef3173 [letv] Order imports alphabetically 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
7465e7e42d [letv] Keep videos' order in playlists 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
5e73a67d44 [letv] Domain name changed 2016-03-03 03:27:55 +08:00
Sergey M․
2316dc2b9a [twitch:playlistbase] Mark broken
Twitch paging mechanism is completely broken on twitch side serving all videos all the time and making our travis builds stall.
2016-03-03 00:41:36 +06:00
Sergey M․
a2d7797cee [vimeo] Extract uploader_url (Closes #8727) 2016-03-03 00:00:11 +06:00
Sergey M․
fd050249af [youtube] Extract uploader_url (Closes #8724) 2016-03-02 23:49:10 +06:00
Sergey M․
7bcd2830dd [extractor/common] Document uploader_url 2016-03-02 23:31:24 +06:00
Sergey M
47462a125b [README.md] Document license field for output template 2016-03-02 23:10:01 +06:00
Sergey M․
7caf9830b0 [youtube] Extract license (Closes #8725) 2016-03-02 23:07:25 +06:00
Sergey M․
2bc0c46f98 [extractor/common] Document license metafield 2016-03-02 23:06:39 +06:00
remitamine
3318832e9d [youtube] improve width and height extraction from fmt_list 2016-03-02 17:52:13 +01:00
remitamine
e7d2084568 Merge branch 'master' of github.com:rg3/youtube-dl 2016-03-02 17:35:55 +01:00
remitamine
c2d3cb4c63 Revert "[youtube] add tbr to _formats extracted from watch_as3.swf"
This reverts commit 4a5ba28a87.
2016-03-02 17:35:04 +01:00
remitamine
c48dd4400f Revert "[youtube] add basic info for some unknown formats extracted from watch_as3.swf"
This reverts commit 85ca019d96.
2016-03-02 17:34:56 +01:00
Sergey M․
e38cafe986 [YoutubeDL] Skip postprocessing and archive report when outputting to stdout (Closes #8729) 2016-03-02 21:11:18 +06:00
remitamine
85ca019d96 [youtube] add basic info for some unknown formats extracted from watch_as3.swf 2016-03-02 16:05:05 +01:00
remitamine
4a5ba28a87 [youtube] add tbr to _formats extracted from watch_as3.swf 2016-03-02 16:05:05 +01:00
remitamine
82156fdbf0 [youtube] extract width and height from fmt_list 2016-03-02 16:05:05 +01:00
Sergey M․
6114090418 [nrk:skole] Relax _VALID_URL 2016-03-02 20:57:04 +06:00
Sergey M․
3099b31276 [nrk:skole] Add extractor (Closes #8728) 2016-03-02 20:52:06 +06:00
remitamine
f17f86513e Add fixup for media files produced by HlsNative downloader(fixes #4776) 2016-03-01 21:10:41 +01:00
Sergey M․
90f794c6c3 [options] Add --no-mark-watched (#5054) 2016-03-01 23:41:23 +06:00
Sergey M․
66ca2cfddd [wistia] Fix extraction (Closes #8707) 2016-03-01 23:26:53 +06:00
Sergey M
269dd2c6a7 Merge pull request #8703 from dstftw/mark-watched
Add --mark-watched feature (Closes #5054)
2016-03-01 23:00:51 +06:00
Sergey M․
e7998f59aa [lifenews] Fix extraction and improve (Closes #2482, closes #8714) 2016-03-01 22:59:11 +06:00
Yen Chi Hsuan
9fb556eef0 [iqiyi] SWF URLs are not used anymore
Since automatic detection of enc_key failed

Closes #8705
2016-03-01 08:42:33 +08:00
Philipp Hagemeister
e781ab63db release 2016.03.01 2016-03-01 00:05:39 +01:00
Jaime Marquínez Ferrándiz
3e76968220 [rtve.es:live] Fix extraction
* Update _VALID_URL to match the current URLs
* Use the m3u8 manifest since I haven't figured out how to use the rtmp stream
2016-02-29 20:57:26 +01:00
Sergey M․
2812c24c16 [mdr] Fix extraction (Closes #8702) 2016-03-01 01:24:26 +06:00
Sergey M․
d77ab8e255 Add --mark-watched feature (Closes #5054) 2016-03-01 01:01:33 +06:00
Sergey M․
4b3cd7316c [tf1] Improve wat id regex (Closes #8691) 2016-02-29 03:28:21 +06:00
Sergey M․
6dae56384a [screenwavemedia] Check formats' URLs 2016-02-28 21:46:36 +06:00
Sergey M․
2b2dfae83e [screenwavemedia] Improve formats sorting 2016-02-28 20:16:31 +06:00
Sergey M․
6c10dbeae9 [screenwavemedia] Improve formats extraction 2016-02-28 20:05:58 +06:00
Jaime Marquínez Ferrándiz
9173202b84 [zdf] Ignore hls manifests that use https (closes #8665)
The certificates are misconfigured, you get the following error mesage:

    ssl.CertificateError: hostname u'zdf-hdios-none-i.zdf.de' doesn't match either of 'a248.e.akamai.net', '*.akamaihd.net', '*.akamaihd-staging.net', '*.akamaized.net', '*.akamaized-staging.net'
2016-02-28 14:06:26 +01:00
Sergey M․
8870bb4653 [webofstories] Tolerate malforder og:title (Closes #8417) 2016-02-28 03:37:48 +06:00
Sergey M
7a0e7779fe [README.md] Use simple wording instead of env variable for home 2016-02-28 03:12:13 +06:00
Sergey M
a048ffc9b0 [README.md] Clarify configuration file options syntax 2016-02-28 03:04:06 +06:00
Sergey M
4587915b2a [README.md] Make configuration file example more diverse 2016-02-28 02:56:09 +06:00
Philipp Hagemeister
da665ddc25 release 2016.02.27 2016-02-27 21:31:21 +01:00
Sergey M․
5add979d91 [dplay] Add support for dplay.no 2016-02-27 21:42:08 +06:00
Sergey M․
20afe8bd14 Credit @aidan- for more dplay sites support (#8463) 2016-02-27 21:31:43 +06:00
Sergey M․
940b606a07 [dplay] Improve, extract all formats and metadata (Closes #8463) 2016-02-27 21:30:47 +06:00
Aidan Rowe
9505053704 [dplay] add support for it.dplay.com and dplay.dk 2016-02-27 19:40:36 +06:00
Sergey M․
2c9ca78281 [extractor/generic] Add support for tnaflix network embeds (Closes #7505) 2016-02-27 17:15:49 +06:00
Sergey M․
63719a8ac3 [tnaflixnetwork:embed] Add _extract_urls 2016-02-27 17:15:06 +06:00
Sergey M․
8fab62482a [tnaflixnetwork] Fallback age limit to 18 2016-02-27 16:59:10 +06:00
Sergey M․
d6e9c2706f [tnaflixnetwork:embed] Add extractor 2016-02-27 16:58:11 +06:00
Sergey M․
f7f2e53a0a [imdb] Recognize 1080p formats (Closes #8677) 2016-02-27 15:51:25 +06:00
Sergey M․
9cdffeeb3f [extractor/common] Clarify rationale on media playlist detection 2016-02-27 07:01:11 +06:00
Sergey M․
fbb6edd298 [extractor/common] Properly extract audio only formats in master m3u8 playlists 2016-02-27 06:48:13 +06:00
Yen Chi Hsuan
5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
Yen Chi Hsuan
5633b4d39d [infoq] Use BokeCC extractor function 2016-02-27 02:55:11 +08:00
Yen Chi Hsuan
4435c6e98e [bokecc] Add new extractor (#2336) 2016-02-27 02:54:43 +08:00
Yen Chi Hsuan
2ebd2eac88 [letv] Speedup M3U8 decryption 2016-02-27 00:58:03 +08:00
Sergey M․
b78b292f0c [youtube] Add alternative automatic captions extraction approach (Closes #8667) 2016-02-26 22:21:47 +06:00
Yen Chi Hsuan
efbd6fb8bb [vidzi] Use decode_packed_codes
Javascript codes found on Vidzi are slightly different from those found
in VideoMega and iQiyi. Nevertheless, the difference has no effects on
the final result.
2016-02-26 15:14:13 +08:00
Yen Chi Hsuan
680079be39 [utils] Relaxing regex in decode_packed_codes for vidzi 2016-02-26 15:13:03 +08:00
Yen Chi Hsuan
e4fc8d2ebe [videomega] Fix extraction (closes #7606) 2016-02-26 15:00:48 +08:00
Yen Chi Hsuan
f52354a889 [utils] Move codes for handling eval() from iqiyi.py 2016-02-26 14:58:29 +08:00
Yen Chi Hsuan
59f898b7a7 [utils] Merge base_n functions 2016-02-26 14:37:20 +08:00
Yen Chi Hsuan
8f4a2124a9 [vidzi] Fix extraction 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
481888294d [utils] Add base36 for use in Vidzi 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
d1e440a4a1 [jwplatform] Separate codes for for parsing jwplayer data 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
81bdc8fdf6 [utils] Move base62 to utils 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
e048d87fc9 [kuwo] Fix a test 2016-02-26 14:26:26 +08:00
Sergey M․
e26cde0927 [space] Remove extractor (Closes #8662)
Now uses ooyala embed
2016-02-25 21:46:43 +06:00
Sergey M․
20108c6b90 [ustudio] Improve (Closes #8574) 2016-02-25 21:30:19 +06:00
mutantmonkey
9195ef745a [uStudio] Add new extractor 2016-02-25 21:29:49 +06:00
Sergey M․
d0459c530d [motherless] Update tests 2016-02-25 00:54:41 +06:00
Sergey M․
f160785c5c [utils] Remove AM/PM from unified_strdate patterns 2016-02-25 00:52:49 +06:00
Sergey M․
5c0a57185c [motherless] Detect non-existing videos 2016-02-25 00:42:19 +06:00
Sergey M․
43479d9e9d [motherless] Make categories optional (Closes #8654) 2016-02-25 00:36:14 +06:00
Sergey M
c0da50d2b2 [README.md] Turn references to issues to links 2016-02-24 23:05:23 +06:00
Yen Chi Hsuan
c24883a1c0 [facebook] Fix format sorting
'hd' formats should have higher priorities
2016-02-24 03:43:24 +08:00
Yen Chi Hsuan
1b77ee6248 [c56] Support videos hosted on Sohu (closes #8073) 2016-02-24 03:32:29 +08:00
Sergey M․
bf4b3b6bd9 [vk] Extract video URL from extra_data (Closes #8646) 2016-02-23 18:47:13 +06:00
Yen Chi Hsuan
efbeddead3 [facebook] Support mobile URLs (closes #8638) 2016-02-23 13:17:24 +08:00
Yen Chi Hsuan
3cfeb1624a [nba] Support channels (#5362, #4167) 2016-02-23 13:11:20 +08:00
Yen Chi Hsuan
b95dc034ca [utils] Implement cache for OnDemandPagedList 2016-02-23 13:11:20 +08:00
Yen Chi Hsuan
86a7dbe66e [nba] Support non-video/ pages
Fixes #8589
2016-02-23 13:11:20 +08:00
Sergey M
b43a7a92cd [README.md] Fix typo 2016-02-23 05:41:09 +06:00
Sergey M
6563d31710 [README.md] Fix typo 2016-02-23 05:37:49 +06:00
Sergey M
cf89ba9eff [README.md] Clarify robustness and future-proof requirements for new extractors 2016-02-23 05:35:19 +06:00
Sergey M
9b01272832 [README.md] Update link to extractor metafields 2016-02-23 05:03:57 +06:00
Sergey M
58525c94d5 [README.md] Emphasize copyright infringement aspects in add-new-site-support tutorial 2016-02-23 04:58:51 +06:00
Sergey M
621bd0cda9 [README.md] Add tl;dr links to examples 2016-02-23 04:43:45 +06:00
Sergey M
1610f770d7 [README.md] Extract example subsections 2016-02-23 04:29:39 +06:00
Sergey M
0fc871d2f0 [README.md:output_template] Add example for channel/user playlists download 2016-02-23 04:16:47 +06:00
Sergey M․
1ad6143061 [xfileshare] Add support for powerwatch (Closes #8628) 2016-02-22 17:37:00 +06:00
Philipp Hagemeister
92da3cd848 release 2016.02.22 2016-02-22 11:57:31 +01:00
remitamine
6212bcb191 [tf1] fix info extraction(fixes #8599) 2016-02-22 09:57:40 +01:00
Sergey M․
d69abbd3f0 [googledrive] Make thumbnail optional (Closes #8629) 2016-02-22 03:13:18 +06:00
Sergey M․
1d00a8823e [arte] PEP 8 2016-02-22 01:32:23 +06:00
Sergey M․
5d6e1011df [pbs] Extract all formats (Closes #8538) 2016-02-22 01:23:27 +06:00
Sergey M․
f5bdb44443 [extractor/common] Add _remove_duplicate_formats 2016-02-22 01:19:39 +06:00
Yen Chi Hsuan
7efc1c2b49 [twitter] Fix metadata extraction and test_Twitter_1 2016-02-21 17:29:28 +08:00
Yen Chi Hsuan
132e3b74bd [twitter] Fix a typo 2016-02-21 17:21:37 +08:00
Yen Chi Hsuan
bdbf4ba40e [twitter:amplify] Extract more metadata 2016-02-21 17:16:35 +08:00
Yen Chi Hsuan
acb6e97e6a [twitter] Fix several failed tests 2016-02-21 16:57:56 +08:00
Yen Chi Hsuan
445d72b8b5 [twitter:amplify] Add TwitterAmplifyIE for handling Twitter smart URLs
Closes #8075
2016-02-21 16:41:24 +08:00
Sergey M․
92c5e11b40 [arte:future] Fix test 2016-02-21 14:23:58 +06:00
Sergey M․
0dd046c16c [arte:magazine] Fix test 2016-02-21 13:57:30 +06:00
Sergey M․
305168ca3e [arte:+7] Detect more embeds (Closes #8613) 2016-02-21 13:55:25 +06:00
Sergey M․
b72f6163dc [arte:+7] Improve _VALID_URL 2016-02-21 13:37:31 +06:00
Sergey M․
33d4fdabfa [extractor/generic] Add support for ok embeds (#8619) 2016-02-21 09:51:54 +06:00
remitamine
cafcf657a4 add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction 2016-02-20 22:02:03 +01:00
Yen Chi Hsuan
101067de12 Jython support - handle *.class files 2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
7360db05b4 [postprocessor/embedthumbnail] Allow mkv to embed thumbnails
Fixes #6046
2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
c1c05c67ea [utils] Jython support - disable setproctitle() until ctypes is complete 2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
399a76e67b [utils] Jython support: tolerate missing fcntl module 2016-02-21 03:32:03 +08:00
Jaime Marquínez Ferrándiz
765ac263db [utils] mimetype2ext: return 'm4a' for 'audio/mp4' (fixes #8620)
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
2016-02-20 19:55:10 +01:00
Yen Chi Hsuan
a4e4d7dfcd [test_iqiyi_sdk_interpreter] Add test for iQiyi login 2016-02-20 23:10:39 +08:00
Yen Chi Hsuan
73f9c2867d [iqiyi] Support playlists (closes #8019) 2016-02-20 22:44:04 +08:00
Philipp Hagemeister
9c86d50916 [faz] Future-proof XML element check 2016-02-20 14:11:44 +01:00
Yen Chi Hsuan
1d14c75f55 [Makefile] iQiyi login test requires network 2016-02-20 20:49:30 +08:00
Yen Chi Hsuan
99709cc3f1 [iqiyi] Implement _login()
Currently only email login supported
2016-02-20 19:54:58 +08:00
Yen Chi Hsuan
5bc880b988 [utils] Add OHDave's RSA encryption function 2016-02-20 19:54:58 +08:00
Yen Chi Hsuan
958759f44b [appletrailers] Extend _VALID_URL (#8524) 2016-02-20 15:54:00 +08:00
remitamine
f34294fa0c [downloader/external:ffmpegfd] check for None value of start_time 2016-02-20 08:06:12 +01:00
remitamine
99cbe98ce8 [downloader/external] check for external downloaders availability 2016-02-20 07:58:25 +01:00
Sergey M․
86bf29050e [test_YoutubeDL] Make test pass until more intelligent sort formats (Closes #8462) 2016-02-20 03:36:03 +06:00
remitamine
04cbc4980d [mtv] imporove duration extraction 2016-02-19 20:56:45 +01:00
RiCON
8765151c8a [mtv] Extract duration from each playlist item
RSS used instead of manifest files because it's exact to the millisecond
with the video I tested while in manifest it's only exact to the second.
2016-02-19 19:38:28 +00:00
remitamine
12b84ac8c1 [downloader/external] Add FFmpegFD(fixes #622)
- replace HlsFD and RtspFD
- add basic support for downloading part of the video or audio
2016-02-19 19:29:24 +01:00
Sergey M
8ec64ac683 [README.md] Clarify verbose log 2016-02-19 22:18:21 +06:00
Sergey M․
ed8648a322 [pornhub] Fix thumbnail and duration extraction (Closes #8604) 2016-02-19 21:42:46 +06:00
Sergey M․
88641243ab [pornhub:playlistbase] Improve extract entries 2016-02-18 22:30:19 +06:00
Sergey M․
40e146aa1e [pornhub:user:videos] Add extractor (Closes #8548) 2016-02-18 22:29:17 +06:00
Sergey M․
f3f9cd9234 [francetv] Improve video id regex (Closes #8563) 2016-02-18 22:09:21 +06:00
Sergey M․
ebf1b291d0 [youtube:watchlater] Respect --no-playlist 2016-02-18 22:03:46 +06:00
Sergey M․
bc7a9cd8fb [youtube:watchlater] Improve _VALID_URL (Closes #8594) 2016-02-18 21:50:21 +06:00
Sergey M․
d48502b82a [arte] Improve _VALID_URLs 2016-02-18 21:29:52 +06:00
Sergey M․
479ec54a8d [arte:magazine] Improve (Closes #8473) 2016-02-18 21:29:07 +06:00
Thomas Jost
49625662a9 [arte:magazine] Add extractor 2016-02-18 21:28:18 +06:00
remitamine
8b809a079a [cbsnews] use find_xpath_attr 2016-02-18 16:10:09 +01:00
remitamine
778433cb90 [cbsnews] extract subtitle url from theplatform SMIL manifest(fixes #8568) 2016-02-18 15:43:28 +01:00
cazulu
411cb8f476 [dailymotion] Fix view count extraction
Fix view count parsing when the decimal marker is a whitespace, e.g. '101 101'
2016-02-18 20:31:43 +06:00
Sergey M․
63bf4f0dc0 [vrt] Detect geo restriction 2016-02-17 23:28:41 +06:00
Sergey M․
80e59a0d5d [vrt] Make formats extraction non fatal (Closes #8587) 2016-02-17 23:18:23 +06:00
Sergey M․
8bbd3d1476 [arte] Fix upload date extraction (Closes #8581) 2016-02-17 22:51:08 +06:00
Sergey M․
e725e4bced [arte] PEP 8 2016-02-17 22:37:55 +06:00
Sergey M․
08d65046f0 [arte] Make sorting aware of en/es formats 2016-02-17 22:37:05 +06:00
Sergey M․
44b9745000 [arte] Extend more _VALID_URLs for en and es support 2016-02-17 21:53:53 +06:00
Sergey M․
9654fc875b [arte:+7] Fix extraction for react-based layout 2016-02-17 21:49:15 +06:00
Sergey M․
0f425e65ec [arte:+7] Add support for en and es URLs 2016-02-17 21:47:18 +06:00
mutantmonkey
199e724291 [KUSI] Add new extractor 2016-02-16 19:55:46 -08:00
Sergey M․
e277f2a63b [orf:tvthek] Check formats (Closes #8580) 2016-02-16 22:23:38 +06:00
Sergey M․
f4db09178a [xtube:user] Remove duplicated video ids 2016-02-16 22:06:26 +06:00
Sergey M․
86be3cdc2a [xtube] Fix extraction (Closes #8565) 2016-02-16 22:05:23 +06:00
Yen Chi Hsuan
cb64ccc715 [facebook] Improve error handling (#8572) 2016-02-16 09:07:38 +08:00
Sergey M․
f66a3c7bc2 [screenjunkies] Fix spelling 2016-02-16 01:30:00 +06:00
Sergey M․
fe80df3080 Credit @TingPing for screenjunkies (#8505) 2016-02-16 01:24:57 +06:00
Yen Chi Hsuan
1932476c13 [iqiyi] Omit MD5 sums for the VIP-only video 2016-02-16 02:45:21 +08:00
Sergey M․
d2c1f79f20 [youtube:searchurl] Extend _VALID_URL 2016-02-16 00:29:51 +06:00
Sergey M․
8eacae8cf9 Credit @RobinHoutevelts for canvas subtiltes (#8537) 2016-02-15 22:33:32 +06:00
Sergey M․
c8a80fd818 [screenjunkies] Improve, extract more metadata and workaround subscription (Closes #8505) 2016-02-15 22:29:28 +06:00
Patrick Griffis
b9e8d7140a [screenjunkies] Add new extractor
This doesn't handle the plus only videos yet

Closes #8492
2016-02-15 22:28:36 +06:00
Sergey M․
6eff2605d6 [canvas] Add subtitles test (#8537) 2016-02-15 20:59:16 +06:00
Sergey M․
fd7a3ea4a4 [canvas] Improve subtitles (Closes #8537) 2016-02-15 20:54:01 +06:00
Robin Houtevelts
8d3eeb36d7 [Canvas] Add subtitles 2016-02-15 20:50:03 +06:00
Yen Chi Hsuan
8e0548e180 [iqiyi] Partial support for VIP-only videos
See #8569 and #8019. Currently only 6-min preview are supported
2016-02-15 19:58:24 +08:00
Philipp Hagemeister
a517bb4b1e [noz] Add new extractor 2016-02-15 00:07:16 +01:00
Sergey M․
9dcefb23a1 [laola1tv] Improve (Closes #8478) 2016-02-14 23:40:26 +06:00
Sergey M․
d9da74bc06 Credit @blackwinter for laola1tv (#8478) 2016-02-14 23:39:49 +06:00
Jens Wille
5e19323ed9 [laola1tv] Fixes for changed site layout.
* Fixed valid URLs (w/ tests).
* Fixed iframe URL extraction.
* Fixed token URL extraction.
* Fixed variable extraction.
* Fixed uploader spelling.
* Added upload_date to result dictionary.
2016-02-14 23:01:49 +06:00
Sergey M․
611c1dd96e [refactor] Single quotes consistency 2016-02-14 15:37:17 +06:00
Sergey M․
d800609c62 [refactor] Do not specify redundant None as second argument in dict.get() 2016-02-14 14:25:04 +06:00
Sergey M․
c78c9cd10d [downloader/dash] PEP 8 2016-02-14 14:13:09 +06:00
Sergey M․
e76394f36c [globo] Switch to new-style classes 2016-02-14 14:02:12 +06:00
Sergey M․
080e09557d [aes] Switch to new-style classes 2016-02-14 14:01:43 +06:00
Sergey M․
fca2e6d5a6 [dailymotion:cloud] Use idiomatic name for classmethod's first argument 2016-02-14 13:44:23 +06:00
Sergey M․
b45f2b1d6e [myvideo] Mark broken 2016-02-14 11:24:57 +06:00
remitamine
fc2e70ee90 Merge pull request #8479 from remitamine/dash_downloader
[downloader/dash] Implement dashsegments fd in terms of fragment fd
2016-02-13 21:12:33 +01:00
Sergey M․
b4561e857f [animeondemand] Add .netrc 2016-02-13 22:41:58 +06:00
Jaime Marquínez Ferrándiz
7023251239 [comedycentral] Support /shows URLs (fixes #8405) 2016-02-13 12:26:27 +01:00
Sergey M․
e2bd68c901 [animeondemand][wip] Add extractor (#8518) 2016-02-13 13:30:31 +06:00
Philipp Hagemeister
35ced3985a release 2016.02.13 2016-02-13 08:25:05 +01:00
Sergey M․
3e18700d45 [nbc] Correct test 2016-02-13 07:45:32 +06:00
Sergey M․
f9f49d87c2 [youtube] Add test for #8536 2016-02-13 05:18:58 +06:00
Sergey M․
6863631c26 [youtube] Improve multifeed videos extraction (Closes #8536) 2016-02-13 05:01:20 +06:00
Sergey M․
9d939cec48 [extractor/generic] Add direct mpd url test 2016-02-13 00:36:47 +06:00
Sergey M․
4c77d3f52a [YoutubeDL] Allow bestvideo+bestaudio for any extractor 2016-02-13 00:23:14 +06:00
Sergey M․
7be747b921 [extractor/generic] Pass mpd base url to _parse_mpd_formats 2016-02-13 00:15:59 +06:00
Sergey M․
bb20526b64 [extractor/common] Improve base url construction 2016-02-13 00:13:56 +06:00
remitamine
bcbb1b08b2 Revert "[aenetworks] extract http formats"
This reverts commit 3d98f97c64.
2016-02-12 17:56:06 +01:00
remitamine
3d98f97c64 [aenetworks] extract http formats 2016-02-12 17:39:32 +01:00
remitamine
c349456ef6 [extractor/common] strip http urls in smil manifest 2016-02-12 17:38:48 +01:00
Sergey M․
5a4905924d [extractor/generic] Improve dailymotion embed detection (Closes #8521, closes #8325) 2016-02-12 22:03:10 +06:00
Sergey M․
b826035dd5 [vimeo] Fix authentication (Closes #8520) 2016-02-12 03:16:26 +06:00
remitamine
a7cab4d039 [theplatform] remove unused import and change smil url for ThePlatformFeedIE 2016-02-11 18:50:14 +01:00
remitamine
fc3810f6d1 Merge branch 'master' of github.com:rg3/youtube-dl 2016-02-11 18:13:56 +01:00
remitamine
3dc71d82ce [theplatform] fix pid extraction in the platform feed 2016-02-11 18:13:03 +01:00
Sergey M․
9c7b38981c [utils] Bump Firefox version in User-Agent
Old version number causes Youtube not to serve some formats in ytplayer.config
2016-02-11 23:12:30 +06:00
remitamine
8b85ac3fd9 [cbc] Add new extractor(closes #3803)(closes #4731)(closes #5309) 2016-02-11 18:10:32 +01:00
remitamine
81e1c4e2fc [extractor/common] remove duplicate rtmp formats in smil manifest 2016-02-11 17:58:48 +01:00
Sergey M․
388ae76b52 [YoutubeDL] Fix format resolution when height is missing 2016-02-11 22:46:13 +06:00
Sergey M․
b67d63149d [youtube] Fix typos 2016-02-11 22:33:08 +06:00
Sergey M․
28280e8ded [plays] PEP 8 2016-02-11 22:02:57 +06:00
Sergey M․
6b3fbd3425 [pbs] Fix multi part videos extraction 2016-02-11 22:02:37 +06:00
Sergey M․
a7ab46375b [pbs] Update some tests 2016-02-11 21:43:01 +06:00
Sergey M․
b14d5e26f6 [pbs] Improve description extraction 2016-02-11 21:28:09 +06:00
Sergey M․
9a61dfba0c [pbs] Revert prefer portalplayer 2016-02-11 21:22:57 +06:00
remitamine
dd86780596 [extractor/common] fix dash formats sorting 2016-02-11 10:55:50 +01:00
remitamine
154c209e2d [extractor/common] improve dash format ids 2016-02-11 10:33:26 +01:00
remitamine
d1ea5e171f [plays] Add new extractor(#8458) 2016-02-11 10:30:31 +01:00
remitamine
a1188d0ed0 [crackle] add prefix to format ids 2016-02-10 22:39:33 +01:00
remitamine
47d205a646 [crackle] improve format sorting 2016-02-10 22:23:56 +01:00
remitamine
80f772c28a [crackle] Add new extractor 2016-02-10 22:16:21 +01:00
Philipp Hagemeister
f817d9bec1 release 2016.02.10 2016-02-10 16:17:38 +01:00
Sergey M․
e2effb08a4 [YoutubeDL] Sanitize format_id (Closes #8494) 2016-02-10 21:16:58 +06:00
Sergey M․
7fcea295c5 [pbs] Switch to portal player by default (Closes #8491) 2016-02-10 20:46:38 +06:00
Sergey M․
cc799437ea [youku] Report private videos (Closes #8498) 2016-02-10 20:05:17 +06:00
Sergey M․
89d23f37f2 [hotstar] Relax _VALID_URL (Closes #8487) 2016-02-10 04:43:00 +06:00
Philipp Hagemeister
b92071ef00 release 2016.02.09.1 2016-02-09 20:12:36 +01:00
Sergey M․
47246ae26c [viddler] Update tests 2016-02-10 01:12:47 +06:00
Sergey M․
9c15869c28 [viddler] Add support for secret videos (Closes #8481) 2016-02-10 01:09:07 +06:00
remitamine
51e9094f4a [extractor/common] extract youtube dash formats filesize(fixes #8480) 2016-02-09 20:05:39 +01:00
remitamine
5e3a6fec33 [fox] update test 2016-02-09 17:30:42 +01:00
remitamine
c43fe0268c [downloader/dash] Implement dashsegments fd in terms of fragment fd 2016-02-09 17:25:44 +01:00
remitamine
d413095f7e [extractor/common] remove duplicated formats and subtiles in smil manifests 2016-02-09 17:15:41 +01:00
remitamine
1bedf4de06 [fox] extract http formats 2016-02-09 17:12:34 +01:00
Sergey M․
3967a761f4 [mailru] Fix tests 2016-02-09 21:31:51 +06:00
Sergey M․
b081350bd9 [mailru] Improve and modernize 2016-02-09 21:30:48 +06:00
Sergey M․
16f1430ba6 [mailru] Prefer metaUrl API (Closes #8474) 2016-02-09 21:14:02 +06:00
Philipp Hagemeister
085ad71157 release 2016.02.09 2016-02-09 12:57:51 +01:00
Sergey M․
35972ba172 [vk] Improve rutube embeds detection (Closes #8461) 2016-02-08 21:30:23 +06:00
Sergey M․
3834d3e35c [youtube] Clarify itag 36 height and abr (Closes #8457) 2016-02-08 01:30:57 +06:00
Sergey M
8d0a2a2a4e [README.md] Fix typo 2016-02-07 21:23:29 +06:00
Sergey M
11c0339bec [README.md] Clarify quotes in output template 2016-02-07 21:22:33 +06:00
Sergey M
915dd77783 [README.md] Add output template example for streaming to stdout 2016-02-07 21:21:14 +06:00
Sergey M․
b6bfa6fb79 [konserthusetplay] Reorder code pieces 2016-02-07 21:18:32 +06:00
Sergey M․
f070197bd7 [konserthusetplay] Improve _VALID_URL 2016-02-07 21:16:31 +06:00
Sergey M․
5a7699bb2e [konserthusetplay] Improve and extract all formats (Closes #8381) 2016-02-07 21:11:59 +06:00
ovitei
8628d26f38 [KonserthusetPlay] Add new extractor (partial support) 2016-02-07 19:47:29 +06:00
Sergey M․
8411229bd5 [utils] Allow dot in strip_jsonp 2016-02-07 19:47:09 +06:00
Sergey M
72b9ebc65d [README.md] Document extractor sequences in output template 2016-02-07 19:08:54 +06:00
Sergey M
3b799ca14c [README.md] Clarify percent literal and output to stdout 2016-02-07 19:06:42 +06:00
Sergey M
0474512e30 [README.md] Document even more sequences in output template 2016-02-07 19:00:59 +06:00
Sergey M
f0905c6ec3 [README.md] Document more sequences in output template 2016-02-07 18:45:44 +06:00
Sergey M․
86296ad2cd [utils] Add ability to control skipping false values in dict_get 2016-02-07 08:13:04 +06:00
Sergey M․
52f5889f77 [vlive] Improve and extract more metadata (Closes #8446) 2016-02-07 06:17:40 +06:00
Sergey M․
81e0b4f2d1 Credit @EraYaN for vlive update (#8446) 2016-02-07 06:14:26 +06:00
Sergey M․
cbecc9b903 [utils] Add dict_get convenience method 2016-02-07 06:12:53 +06:00
Erwin de Haan
b8b465af3e [vlive] Updated to new V App/VLive api.
More robust with getting keys and ids from website.
2016-02-07 05:27:17 +06:00
pulpe
59b35c6745 [IPrima] Remove test video_id 2016-02-06 21:42:24 +01:00
Jaime Marquínez Ferrándiz
7032833011 [iprima] Follow pep8 2016-02-06 21:37:28 +01:00
pulpe
f406c78785 [IPrima] Fix extractor (fixes #7617) 2016-02-06 21:23:41 +01:00
Sergey M
f326b5837a Merge pull request #8445 from bpfoley/rte-newurl
[rte:radio] Add support for RTMP downloads, alternate URL style
2016-02-07 00:28:39 +05:00
Brian Foley
5dd4b3468f [rte:radio] Add support for RTMP downloads, alternate URL style
This is useful as
a) RTMP downloads are a good deal faster to download
b) Older items are available only as RTMP streams
2016-02-06 18:42:57 +00:00
Jaime Marquínez Ferrándiz
d4f8e83404 [FFmpegSubtitlesConvertorPP] remove unused variable 2016-02-06 19:04:53 +01:00
Jaime Marquínez Ferrándiz
7b8b007cd9 [FFmpegSubtitlesConvertorPP] remove intermediate srt files 2016-02-06 19:04:18 +01:00
Jaime Marquínez Ferrándiz
3547d26587 [FFmpegSubtitlesConvertorPP] correctly update the extension (fixes #8444) 2016-02-06 18:58:18 +01:00
Jaime Marquínez Ferrándiz
7e62c2eb6d [FFmpegSubtitlesConvertorPP] fix not working when srt is used as the intermediate format between ttml/dfxp and other format
It was trying to use the ttml/dfxp file with ffmpeg, which doesn't have support for them.
I broke it in e04398e397.
2016-02-06 18:51:05 +01:00
Sergey M․
56401e1e5f [downloader/hls] Do not send 'q' to ffmpeg on Windows (Closes #8300) 2016-02-06 23:24:22 +06:00
Sergey M
860db2d508 [README.md] Fix typo 2016-02-06 22:40:20 +06:00
Sergey M
4b8874975c [README.md] Remove non-relevant info 2016-02-06 22:39:50 +06:00
Sergey M
bd6b6f6622 [README.md] Fix typo 2016-02-06 22:36:30 +06:00
Sergey M․
4340727e6c [videomore] Fix typo 2016-02-06 22:36:30 +06:00
Sergey M
3ceccade87 [README.md] Improve output template documentation and add more examples 2016-02-06 22:33:49 +06:00
remitamine
28ad7df65d [generic] detect MPD manfiest only from the content 2016-02-06 14:51:45 +01:00
Sergey M․
79a3508579 [extractor/generic] Detect DASH manifests in found URLs and extract mpd formats 2016-02-06 19:42:03 +06:00
Sergey M․
1b840245bd [extractor/generic] Detect DASH manifests and extract mpd formats 2016-02-06 19:35:32 +06:00
remitamine
6a3828fddd [common] use float conversion instead of using division from __future__ 2016-02-06 14:27:04 +01:00
remitamine
91cb6b5065 rename _parse_mpd to _parse_mpd_formats and add default value for mpd namespace 2016-02-06 14:03:48 +01:00
remitamine
0826a0b555 [common] sort dash formats 2016-02-06 06:52:48 +01:00
remitamine
bcbbb98bfe [generic] extract dash formats detected using content type 2016-02-06 06:47:38 +01:00
remitamine
66159b38aa Merge pull request #8408 from remitamine/dash
Add generic support for mpd manifests(dash formats)
2016-02-06 06:26:02 +01:00
Sergey M․
23d17e4beb [youtube] Fix automatic captions 2016-02-06 06:44:38 +06:00
Sergey M․
d97b0e3241 [vidme] Clarify IE_NAMEs 2016-02-05 23:58:26 +06:00
Sergey M․
eb2533ec4c [vidme:user:likes] Add extractor 2016-02-05 23:57:35 +06:00
Sergey M․
b7b365067f [vidme:user] Add extractor (Closes #8416) 2016-02-05 23:32:22 +06:00
remitamine
86e284e028 Merge branch 'master' of github.com:rg3/youtube-dl 2016-02-05 17:20:00 +01:00
Sergey M․
d9e543b680 [spankbang] Add test with single format (#8398) 2016-02-05 22:16:56 +06:00
Sergey M․
c773c232d8 [spankbang] Check formats (#8398) 2016-02-05 22:16:40 +06:00
Sergey M․
58ae24336a [spankbang] Extend format id regex (Closes #8398) 2016-02-05 22:16:12 +06:00
remitamine
7d3a035ee0 [ffmpeg] check for m3u8 protocol in FFmpegMetadataPP 2016-02-05 17:12:49 +01:00
Philipp Hagemeister
e06e75c7e7 release 2016.02.05.1 2016-02-05 15:17:31 +01:00
remitamine
593e0f43b4 [ffmpeg] fix condition(fixes #8440) 2016-02-05 15:12:06 +01:00
Philipp Hagemeister
008ab0f814 release 2016.02.05 2016-02-05 11:04:00 +01:00
Jaime Marquínez Ferrándiz
3f7e8750d4 [arte.tv:+7] Fix extraction (fixes #8427) 2016-02-04 20:16:47 +01:00
Philipp Hagemeister
f1ed3acae5 release 2016.02.04 2016-02-04 13:39:26 +01:00
remitamine
920d21b9d3 [test_subtitles] update youtube subtitles tests 2016-02-04 08:50:55 +01:00
remitamine
2fb35d1c28 [youtube] fix subtitle order 2016-02-04 08:39:01 +01:00
remitamine
09be85b8dd [youtube] fix subtitle extraction(fixes #8415) 2016-02-04 08:28:37 +01:00
remitamine
eadc3ccd50 [generic] extract m3u8 formats when mpegurl content type detected 2016-02-04 01:25:36 +01:00
remitamine
255732f0d3 [common] fix segment duration calculation 2016-02-03 23:57:08 +01:00
remitamine
53c269c6fd [common] fix media_template string formating 2016-02-03 23:54:34 +01:00
remitamine
675d001633 [common] skip drm protected dash formats 2016-02-03 18:44:43 +01:00
Yen Chi Hsuan
58be922079 [kuwo] Check for georestriction 2016-02-04 01:26:25 +08:00
Sergey M
c84d3a557d [README.md] Clarify unavailable sequences in output format 2016-02-03 19:18:25 +05:00
remitamine
d577c79632 [common] ignore ISO 639-2 generic codes 2016-02-03 13:24:07 +01:00
remitamine
6ad2b01e14 [srgssr] use flv as ext for rtmp formats 2016-02-02 23:09:50 +01:00
remitamine
fd3a1f3d60 [cbsnews] add support for live videos(fixes #7010) 2016-02-02 23:02:18 +01:00
Jaime Marquínez Ferrándiz
87de7069b9 [utils] dfxp2srt: make TTMLPElementParser inherit from object
For consistency between python 2 and 3.
2016-02-02 22:30:13 +01:00
remitamine
6fba62c87a [ffmpeg] fix adding metadata when using --hls-prefer-native(#8350) 2016-02-02 22:14:23 +01:00
remitamine
f14be22816 [common] remove duplicate reference to namespace 2016-02-02 22:02:08 +01:00
Yen Chi Hsuan
1df4141196 [test_YoutubeDL] Fix test_youtube_format_selection
Broken since a6c2c24479. Thanks to
@jaimeMF and @anisse for pointing that out
2016-02-03 03:42:37 +08:00
remitamine
fae45ede08 Merge pull request #8354 from remitamine/m3u8_metadata
[ffmpeg] fix adding metadata when using m3u8_native(fixes #8350)
2016-02-02 19:13:58 +01:00
remitamine
4e0cff2a50 Merge pull request #8348 from remitamine/dfxp2srt-text
[utils] fix dfxp2srt text extraction(fixes #8055)
2016-02-02 18:36:26 +01:00
remitamine
9c74423510 [common] fix media template regex 2016-02-02 18:30:31 +01:00
remitamine
5976e7ab57 [vevo] add support for dash formats 2016-02-02 18:13:01 +01:00
remitamine
a1a22572fb [downloader/dash] make initialization_url optional 2016-02-02 18:12:32 +01:00
remitamine
c11875b328 [facebook] use _parse_mpd 2016-02-02 18:11:16 +01:00
remitamine
8ff648e4f9 [youtube] use _extract_mpd_formats 2016-02-02 18:10:23 +01:00
remitamine
1bac34556f [common] add a generic support for mpd manifests 2016-02-02 18:09:25 +01:00
Sergey M․
0436157b95 [vk:uservideos] Improve _VALID_URL (Closes #8389) 2016-02-02 00:52:37 +06:00
Philipp Hagemeister
ae0db349c1 release 2016.02.01 2016-02-01 12:00:32 +01:00
Yen Chi Hsuan
08411970d5 Merge pull request #8374 from yan12125/facebook-dash
Facebook DASH formats
2016-02-01 18:31:49 +08:00
Yen Chi Hsuan
dc724e0c8b [daum.net:user] Match more URLs (#1952) 2016-02-01 18:26:23 +08:00
Yen Chi Hsuan
0a5d1ec706 Merge branch 'ping-daum-playlist-user' 2016-02-01 18:19:32 +08:00
Yen Chi Hsuan
58250eff2b [daum] Update test_daum_1 2016-02-01 18:19:02 +08:00
Yen Chi Hsuan
11a4efc505 [daum] Do not match a single URL with multiple info extractors 2016-02-01 18:15:53 +08:00
Yen Chi Hsuan
7537b35fb8 [daum] PEP8 2016-02-01 17:40:35 +08:00
Yen Chi Hsuan
33cc74eeeb Merge branch 'daum-playlist-user' of https://github.com/ping/youtube-dl into ping-daum-playlist-user 2016-02-01 17:29:12 +08:00
Yen Chi Hsuan
f021acee49 [kickstarter] Fix title and test_kickstarter
It's the description page that contains a video. The original URL is now
the calendar.
2016-02-01 17:21:40 +08:00
Yen Chi Hsuan
abe694ca95 [kickstarter] Eliminate the warning message and add_ie 2016-02-01 17:10:11 +08:00
Yen Chi Hsuan
b286f201a8 [YoutubeDL] Do not override ie_key in url_transparent 2016-02-01 17:05:48 +08:00
Yen Chi Hsuan
bd93a12e85 [vidzi] Fix _TESTS 2016-02-01 17:03:31 +08:00
Yen Chi Hsuan
92769650fa [vidzi] Fix extraction
Closes #8386.

Vidzi.tv now uses jwplayer, which can be handled by GenericIE
2016-02-01 15:40:42 +08:00
Yen Chi Hsuan
dc4fe5c6d7 [allocine] Use xpath_element 2016-02-01 05:32:28 +08:00
Yen Chi Hsuan
566bda51f2 [bpb] Fix extraction and update tests 2016-02-01 05:00:09 +08:00
Yen Chi Hsuan
f63757ec35 [allocine] Fix for Python 2.6
Python 2.6 does not support .// syntax in find(). Fortunately, the
interested node is at the top level
2016-02-01 03:34:02 +08:00
Yen Chi Hsuan
7a0ed06909 [allocine] Fix extraction of test_allocine_1 and update tests 2016-02-01 03:31:58 +08:00
Yen Chi Hsuan
9934fe76be [acast] Remove ACastBaseIE
No longer necessary as _API_BASE_URL is used by ACastChannelIE only
2016-02-01 03:08:46 +08:00
Yen Chi Hsuan
a8aad21001 [acast] Fix extraction 2016-02-01 03:07:45 +08:00
Yen Chi Hsuan
d055bf91cc Merge branch 'rrooij-gamekings_fix' 2016-02-01 02:21:02 +08:00
Yen Chi Hsuan
0e1b1a011d [gamekings] Stricter checks 2016-02-01 02:19:03 +08:00
Yen Chi Hsuan
eab3c2895c [gamekings] add_ie 2016-02-01 02:15:25 +08:00
Yen Chi Hsuan
163da6a484 [gamekings] Add MD5 back
The test is now a YouTube video, whose MD5 should be stable
2016-02-01 02:13:11 +08:00
Yen Chi Hsuan
324916d11a Merge branch 'gamekings_fix' of https://github.com/rrooij/youtube-dl into rrooij-gamekings_fix 2016-02-01 02:11:25 +08:00
Jaime Marquínez Ferrándiz
3ccb0655c1 [youtube] Use 'orderedSet' instead of 'set' to preserve the order 2016-01-31 15:11:00 +01:00
Jaime Marquínez Ferrándiz
e04398e397 [FFmpegSubtitlesConvertorPP] delete old subtitle files (fixes #8382) 2016-01-31 14:22:36 +01:00
Yen Chi Hsuan
231ea2a3bb [xuite] Replace the test case with my uploaded one 2016-01-31 20:21:57 +08:00
Yen Chi Hsuan
b99d88c6a1 [youporn] Fix uploader and description 2016-01-31 20:12:43 +08:00
Yen Chi Hsuan
189d72d5fd [test_subtitles] Fix TestRaiSubtitles
RaiIE is renamed to RaiTVIE in 06d5556dfa
2016-01-31 20:12:43 +08:00
Yen Chi Hsuan
a7aab0c23e [test_youtube_lists] Fix TestYoutubeLists.test_youtube_course
Youtube entries are now generators
2016-01-31 20:12:43 +08:00
Philipp Hagemeister
a69bee4762 release 2016.01.31 2016-01-31 12:57:18 +01:00
Sergey M․
9acd33094d [youtube] Filter duplicates in playlists base extractor 2016-01-31 17:52:02 +06:00
Sergey M․
8e7aad2075 [youtube] Use authentication for entry list base extractor (Closes #8380) 2016-01-31 17:49:59 +06:00
rrooij
ce5879fa14 [Gamekings] Fix viewing of old videos
Some old videos that aren't on Vimeo are being uploaded to YouTube under the
'Gamekings Vault' channel. They use YouTube now for some videos as video
hosting instead of Vimeo or their own hosting. The first test failed to
succeed under the existing code, but works now by using the YouTube
extractor.

The Regex is changed to find the new gogoVideo JavaScript line with the
YouTube embed. Checking if there is a YouTube embed is done by a String
find, which is probably not the best method of checking this.
2016-01-31 00:20:46 +01:00
Yen Chi Hsuan
7b7507d6e1 [letv] Fix LetvCloud extraction 2016-01-31 07:15:43 +08:00
rrooij
14823decf3 [Gamekings] Fix url from .tv to .nl
Gamekings doesn't use the .tv top level domain anymore, but the regular
domain for Dutch sites.
2016-01-31 00:03:23 +01:00
Sergey M․
673fb82e65 [schooltv] Improve video id regex 2016-01-31 04:41:18 +06:00
Sergey M
181cf24bc0 Merge pull request #8376 from rrooij/schooltv
[schooltv] Add extractor for SchoolTV playlists
2016-01-31 04:36:33 +06:00
rrooij
89f2602880 [schooltv] Add extractor for SchoolTV playlists
This closes #8163
2016-01-30 23:21:42 +01:00
Yen Chi Hsuan
db9b1dbcd9 [nba] Add ext for hls formats and fix test_NBA 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
e881c4bcab [nbc] Use NBC's id and fix _TESTS
ThePlatform URL gives the same ID for all _TESTS
2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
670ad51ade [nrktv] Fix _TESTS 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
eb6fc7d32a [senateisvp] Fix test_SenateISVP and test_SenateISVP_1 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
ed1a390583 [tv2] Fix test_TV2 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
809e1857c5 [screenwavemedia] Fix HLS extension and test_TeamFour 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
7c38af48b9 [vgtv] Fix test_VGTV_2 2016-01-31 04:58:10 +08:00
Yen Chi Hsuan
60ad3eb970 [viidea] Skip download for the test case requiring ffmpeg 2016-01-31 04:58:10 +08:00
Sergey M․
a7685b3a6b [npo] Add extension for m3u8 2016-01-31 02:38:28 +06:00
remitamine
8f1fddc816 [limelight] fix format sorting and make m3u8 and f4m extraction non fatal 2016-01-30 20:51:47 +01:00
remitamine
1bf996fa5c [generic] Add support for Limelight API 2016-01-30 20:45:56 +01:00
Yen Chi Hsuan
248ae880b6 [facebook] Add md5 for the test case with DASH 2016-01-30 23:01:19 +08:00
Yen Chi Hsuan
2d2fa82d17 [common] Add _extract_dash_manifest_formats 2016-01-30 22:52:23 +08:00
Yen Chi Hsuan
c94678957f [common] Remove unused arguments 2016-01-30 22:45:16 +08:00
Yen Chi Hsuan
16f38a699f [common] Rename to namespace
For consistency with _parse_smil_*
2016-01-30 22:40:56 +08:00
Yen Chi Hsuan
a6c2c24479 [youtube] Remove '(v|a)codec': 'none' entries
Not used anymore
2016-01-30 22:28:53 +08:00
Sergey M․
b8c9926c0a [downloader/f4m] Do not update fragment list while test 2016-01-30 19:43:25 +06:00
Yen Chi Hsuan
df374b5222 [common] Prefer the manifest than formats_dict in determining codecs 2016-01-30 21:42:27 +08:00
Yen Chi Hsuan
5ea1eb78f5 [common] Fix for youtube 2016-01-30 21:36:01 +08:00
Yen Chi Hsuan
5d2c0fd9ba [youtube] Pass self._formats to _parse_dash_manifest 2016-01-30 21:32:15 +08:00
Yen Chi Hsuan
0803753fea [facebook] Add support for DASH manifests 2016-01-30 21:31:53 +08:00
Sergey M․
2c2f1efdcd [downloader/fragment] Remove superfluous whitespace 2016-01-30 19:30:31 +06:00
Yen Chi Hsuan
b323e1707d [common] Modify _parse_dash_manifest for use in Facebook 2016-01-30 21:27:43 +08:00
Sergey M․
09104e9930 [downloader/f4m] Add live stream flag to context
Now download progress for f4m livestreams is reported correctly
2016-01-30 19:22:15 +06:00
Sergey M․
5fa1702ca6 [downloader/fragment] Do not report total bytes estimation and eta for live streams 2016-01-30 19:20:52 +06:00
Yen Chi Hsuan
17b598d30c [common] _parse_dash_manifest() from youtube.py 2016-01-30 21:05:55 +08:00
Sergey M․
53be8894e4 [options] Add missing closing parenthesis 2016-01-30 18:44:22 +06:00
Sergey M․
c3deacd562 [matchtv] Add extractor (Closes #8313) 2016-01-30 18:30:27 +06:00
Sergey M․
8ab3fe81d8 [downloader/f4m] Prefer bootstrap url attribute over inline bootstrap info 2016-01-30 18:28:38 +06:00
ping
2f0a33d8a3 [daum.net] Support for playlists, user channels 2016-01-30 20:10:36 +08:00
Yen Chi Hsuan
05d0d131a7 [youtube] Move decrypt_sig out of _parse_dash_manifest 2016-01-30 20:05:56 +08:00
Yen Chi Hsuan
c140629995 [facebook] Support alternative webpage form
Fixes #8371
2016-01-30 19:33:22 +08:00
Jaime Marquínez Ferrándiz
7d106a65ca Add --hls-use-mpegts option
When using the mpegts container hls vidoes can be played while being downloaded (useful if you are recording a live stream).
VLC and mpv play them file, but QuickTime doesn't.
2016-01-30 12:26:40 +01:00
Yen Chi Hsuan
0179f6a830 [daum] Add 'thumbnail' to all _TESTS 2016-01-30 16:54:14 +08:00
Yen Chi Hsuan
830afe85dc [daum.net] Support VodPlayer.swf URLs (closes #8173) 2016-01-30 16:50:13 +08:00
Yen Chi Hsuan
8bf39420b4 Merge remote-tracking branch 'upstream/master' 2016-01-30 16:25:55 +08:00
Yen Chi Hsuan
71d08b3e29 Merge branch 'ping-daum-fix-clip' 2016-01-30 16:25:06 +08:00
Yen Chi Hsuan
06ffa33485 [daum.net] Move the request to ClipInfoXml.do
To reduce the number of wasted requests
2016-01-30 16:23:37 +08:00
Yen Chi Hsuan
874e05975b Merge branch 'daum-fix-clip' of https://github.com/ping/youtube-dl into ping-daum-fix-clip 2016-01-30 16:22:37 +08:00
ping
f5d30d521c [daum] Fix add view_count, comment_count to test 2016-01-30 11:09:30 +08:00
ping
e047922be0 [daum] Fix copy-paste mistake 2016-01-30 11:04:11 +08:00
Sergey M․
83ab8a79cc [espn] Improve video id extraction (Closes #8368) 2016-01-30 01:48:54 +06:00
Sergey M․
350cf045d8 [extractor/common] Restrict checks when auto calculating tbr 2016-01-30 01:47:46 +06:00
Sergey M․
68a0ea15b4 [cspan] Unescape path (Closes #8365) 2016-01-30 00:26:33 +06:00
Jaime Marquínez Ferrándiz
2b4f5e68d1 [azubu] Add extractor for live streams (closes #8343) 2016-01-29 15:36:33 +01:00
Philipp Hagemeister
055f417278 release 2016.01.29 2016-01-29 12:20:08 +01:00
Jaime Marquínez Ferrándiz
70029bc348 [youtube:user] Require 'https?://' in the url (fixes #8356)
It was matching www.youtube.com/embed/WpfukLMe1TM.
The generic extractor automatically adds http:// if it's missing.
2016-01-29 11:27:11 +01:00
remitamine
cf57433bbd [ffmpeg] fix adding metadata when using m3u8_native(fixes #8350) 2016-01-28 18:57:32 +01:00
Sergey M․
1ac6e794cb [bbc] Add test for #8147 2016-01-28 23:27:48 +06:00
Sergey M․
a853427427 [bbc] Add another description regex 2016-01-28 23:23:13 +06:00
Sergey M․
50e989e263 [bbc] Add another title regex (Closes #8340) 2016-01-28 23:19:53 +06:00
Sergey M․
10e6ed9341 [ok] Add support for mobile URLs (Closes #8345) 2016-01-28 22:56:49 +06:00
Sergey M․
38c84acae5 [ndr:embed:base] Add missing ext for m3u8 2016-01-28 22:50:18 +06:00
Yen Chi Hsuan
29f46c2bee Credit @dyn888 for improving format selection
[ci skip]
2016-01-28 22:56:59 +08:00
Yen Chi Hsuan
39c10a2b6e Merge pull request #8346 from dyn888/dyn888-regex-1
Regex pattern update to match more codecs (fixes #6858)
2016-01-28 22:22:43 +08:00
dyn888
b913348d5f Test codec with a dot '.' in name selection. 2016-01-28 15:07:33 +01:00
remitamine
2b14cb566f [utils] fix dfxp2srt text extraction(fixes #8055) 2016-01-28 12:38:34 +01:00
dyn888
b0df5223be Update YoutubeDL.py 2016-01-28 12:07:15 +01:00
Sergey M․
ed7cd1e859 [cbsnews] Remove unused import 2016-01-28 00:42:04 +06:00
remitamine
f125d9115b [cbsnews] extract all formats 2016-01-27 19:11:21 +01:00
remitamine
a9d5f12fec Merge pull request #8328 from remitamine/hls-master-detect
[extractor/common] detect media playlist in _extract_m3u8_formats
2016-01-27 18:07:30 +01:00
remitamine
7f32e5dc35 [extractor/common] detect media playlist in _extract_m3u8_formats 2016-01-27 17:53:42 +01:00
Sergey M․
c3111ab34f [spankbang] Fix title extraction (Closes #8329) 2016-01-27 21:49:56 +06:00
Sergey M․
9339774af2 [spankbang] Fix formats extraction 2016-01-27 21:49:39 +06:00
Sergey M․
b0d21deda9 [extractor/common] Auto calculate tbr when missing 2016-01-27 21:11:17 +06:00
Philipp Hagemeister
fab6f0e65b release 2016.01.27 2016-01-27 08:32:03 +01:00
ping
b6c33fd544 [daum.net] Fixes #8331 2016-01-27 12:48:00 +08:00
Sergey M․
fb4b345800 [instagram] Make description optional (Closes #8326) 2016-01-26 21:46:51 +06:00
Sergey M․
af9c2a07ae [cspan] Extract from path when no qualities (Closes #8317) 2016-01-26 21:29:42 +06:00
remitamine
ab180fc648 Merge branch 'master' of github.com:rg3/youtube-dl 2016-01-26 15:55:38 +01:00
remitamine
682f8c43b5 [vevo] fallback to youtube video only if vevo video is geo restricted(fixes 8263)(fixes 2874) 2016-01-26 15:54:32 +01:00
Sergey M․
f693213567 [cspan] Fix clip/prog id extraction (#8317) 2016-01-26 20:42:20 +06:00
remitamine
9165d6bab9 [vevo] extract metadata and formats from api if videoinfo is empty
these was fixed by @yan12125 in ff51983e15
i only added some code to extract video metadata and more formats from
api
2016-01-26 13:46:58 +01:00
remitamine
2975fe1a7b [vevo] extract all formats and bypass geo restriction 2016-01-25 22:35:06 +01:00
Sergey M․
de691a498d [facebook:post] Add extractor (Closes #8321) 2016-01-25 22:18:34 +06:00
Sergey M․
2e6e742c3c [facebook] Add shortcut and reformat _VALID_URL 2016-01-25 22:15:21 +06:00
Yen Chi Hsuan
e9bd0f772b Merge pull request #8130 from dyn888/master
[youtube] added vcodec/acodec/abr for multiple itags
2016-01-25 01:15:11 +08:00
Yen Chi Hsuan
77f785076f [common] Keep full codec name from m3u8 manifests
See #8293. This is for consistency between YouTube and HLS formats.
2016-01-25 01:03:46 +08:00
Yen Chi Hsuan
94278f7202 [youtube] Prefer info from YouTube than _formats (#8293) 2016-01-25 01:02:19 +08:00
Yen Chi Hsuan
a0d8d704df [utils] Reorder items in mimetype2ext alphabetically 2016-01-25 01:01:15 +08:00
Yen Chi Hsuan
f6861ec96f [utils] Add more items to mimetype2ext (#8293)
These are used in Youtube formats
2016-01-25 00:58:53 +08:00
Philipp Hagemeister
f733b05302 release 2016.01.23 2016-01-23 12:03:12 +01:00
Sergey M․
6fa73386cb [drtv] Use IETF language tag 2016-01-23 01:54:00 +06:00
Sergey M․
5ca01bb9e4 [kanalplay] Use IETF language tag 2016-01-23 01:51:18 +06:00
Sergey M․
1ca59daca9 [options] Clarify language tags 2016-01-23 01:50:06 +06:00
Sergey M․
594c4d79a5 [svt] Improve subtitles extraction and add test (Closes #8265) 2016-01-23 01:47:54 +06:00
Marian Sigler
1f16b958b1 [SVTPlay] Add subtitle support 2016-01-23 01:28:31 +06:00
Sergey M․
4c0d13df9b [lovehomeporn] Add extractor 2016-01-23 00:52:23 +06:00
Sergey M․
b2c6528baf [ruleporn] Rework in terms of nuevo (Closes #8206) 2016-01-23 00:40:11 +06:00
Sergey M․
ea17820432 [nuevo] Improve thumbnail extraction 2016-01-23 00:38:58 +06:00
Dankryn
1257b049bc [ruleporn] Add new extractor 2016-01-23 00:13:27 +06:00
Sergey M․
b969813548 Credit @nexAkari for trollvids and nuevo (#7728) 2016-01-23 00:10:49 +06:00
Sergey M․
10677ece81 [nuevo] Simplify nuevo extractors (Closes #7728) 2016-01-23 00:04:33 +06:00
Andrew "Akari" Alexeyew
d570746e45 [nuevo] Generalize nuevo extractor and add support for trollvids
Supports only the nuevo player for now (most common).

[trollvids] convert duration to an int

[trollvids] added a test

[trollvids] made flake8 shut up

Generalized the Nuevo extractor

Affects: anitube, trollvids, trutube

[nuevo] Complied with the code comments.
2016-01-22 23:29:24 +06:00
Sergey M․
4fcd9d147d [arte:cinema] Add extractor 2016-01-22 23:00:50 +06:00
Sergey M․
9c54ae3387 [arte:future] Make duplicated test matching only 2016-01-22 23:00:05 +06:00
François Charlier
24114fee74 [arte:future] Fix extraction
[arte] Add support for more "Arte Future" uri
2016-01-22 22:58:37 +06:00
Sergey M․
220ee33f2b [cbsnews] Simplify subtitles extraction and fix test (Closes #8295) 2016-01-22 22:23:21 +06:00
John Assael
4118cc02c1 [cbsnews] Extract subtitles
added test function for CBS News subtitles
2016-01-22 22:15:51 +06:00
Jaime Marquínez Ferrándiz
32d77eeb04 [downloader/common] report_retry: Don't crash when retries is infinite (fixes #8299) 2016-01-22 14:49:17 +01:00
Filippo Valsorda
032f232626 Merge pull request #8142 from FiloSottile/filippo/updates
[update] fix (unexploitable) BB'06 vulnerability in rsa_verify
2016-01-21 20:17:37 +00:00
Filippo Valsorda
4d318be195 [update] fix (unexploitable) BB'06 vulnerability in rsa_verify
The rsa_verify code was vulnerable to a BB'06 attack, allowing to forge
signatures for arbitrary messages if and only if the public key exponent is
3.  Since the updates key is hardcoded to 65537, there is no risk for
youtube-dl, but I don't want vulnerable code in the wild.

The new function adopts a way safer approach of encoding-and-comparing to
replace the dangerous parsing code.
2016-01-21 20:12:17 +00:00
Yen Chi Hsuan
6b45f9aba2 [iqiyi] Update key (closes #8292) 2016-01-22 02:14:47 +08:00
Sergey M․
1e10d02fec [hitbox] Skip subscribe only formats (Closes #8217) 2016-01-21 23:28:22 +06:00
Sergey M․
51290d8457 [youtube] Simplify automatic captions URL check (Closes #8287) 2016-01-21 22:58:03 +06:00
Dimitre Liotev
582f4f834e Fix issue #8109 (error when downloading automatic captions) 2016-01-21 22:55:36 +06:00
Sergey M․
e87d98b0dd [yahoo] Add improve content id regexes (Closes #8290) 2016-01-21 22:42:50 +06:00
igv
383496e65e Additional regex for yahoo extractor 2016-01-21 22:35:28 +06:00
Jaime Marquínez Ferrándiz
4519c1f43c [vimeo] 'ext' must be a string, not a tuple (fixes #8288)
There was an ',' at the end of the line.
2016-01-21 12:43:45 +01:00
Sergey M․
a616f65471 [tube8] PEP 8 2016-01-20 21:30:29 +06:00
CeruleanSky
1f78ed189a [OraTV] update extractor
"current" is now "video"
"hls_stream" is now hls_stream without quotes
video_id is now id
duration for current video is not present(for other videos it is)

modified regex to find hls_stream variable to work reguardless of whether it is quoted or not.

[ora] Improve (Closes #8273)
2016-01-20 21:28:42 +06:00
Sergey M․
7dde358adc [tube8] Extract duration and modernize 2016-01-20 20:07:32 +06:00
Sergey M․
27b83249c9 [tube8] Fix extraction and extract all formats (Closes #8281) 2016-01-20 20:00:51 +06:00
Yen Chi Hsuan
56aa074538 Credit @FounderSG for WeiqiTV and LetvCloud (#7994)
[ci skip]
2016-01-20 13:20:03 +08:00
Jaime Marquínez Ferrándiz
9d90e7de03 [downloader/hls] Ask ffmpeg to quit when interrupting youtube-dl with 'Ctrl+C' (#8252)
Otherwise the mp4 file can't be played.
2016-01-19 22:07:14 +01:00
Yen Chi Hsuan
7d4d9c526a Merge branch 'ping-patch-8239' 2016-01-20 04:22:25 +08:00
Yen Chi Hsuan
fe6856b059 [neteasemusic] Use float_or_none 2016-01-20 04:21:51 +08:00
Yen Chi Hsuan
a54fbf2ca6 Merge branch 'patch-8239' of https://github.com/ping/youtube-dl into ping-patch-8239 2016-01-20 04:15:46 +08:00
Yen Chi Hsuan
d8024aebe5 Merge branch 'FounderSG-Weiqitv' 2016-01-20 04:06:09 +08:00
Yen Chi Hsuan
8652bd22f1 [weiqitv] Use single quotes 2016-01-20 04:04:39 +08:00
Yen Chi Hsuan
f15a9ca301 [weiqitv] Rename the extractor - capitilize 'TV' 2016-01-20 04:03:57 +08:00
Yen Chi Hsuan
65ced034b8 [weiqitv] Make codes shorter 2016-01-20 04:02:30 +08:00
Yen Chi Hsuan
bec30224ff [letv] LetvCloud: Detect ext instead of the hardcoded one 2016-01-20 04:00:37 +08:00
Yen Chi Hsuan
0428106da3 [letv] LetvCloud: make title looks like a title 2016-01-20 03:53:17 +08:00
Yen Chi Hsuan
73e7442456 [letv] LetvCloud: simplify and improve _VALID_URL 2016-01-20 03:42:01 +08:00
Yen Chi Hsuan
26de1bba83 [letv] LetvCloud: check error messages from server 2016-01-20 03:31:34 +08:00
Yen Chi Hsuan
e0690782b8 [letv] LetvCloud: guard against invalid URLs 2016-01-20 03:25:12 +08:00
Yen Chi Hsuan
8fff4f61e5 [letv] Use single quotes 2016-01-20 03:18:54 +08:00
Yen Chi Hsuan
10defdd06a [letv] Reduce duplicated codes 2016-01-20 03:17:35 +08:00
Sergey M․
485139c15c [viewster] Tolerate missing synopsis (Closes #8274) 2016-01-20 00:02:46 +06:00
Sergey M․
b605ebb609 [lemonde] Add extractor 2016-01-19 22:09:55 +06:00
Sergey M․
aecfcd4e59 [ultimedia] Rename to digiteka 2016-01-19 21:51:46 +06:00
Sergey M․
942d46196f [ultimedia] Extend _VALID_URL to support digiteka 2016-01-19 21:47:06 +06:00
Yen Chi Hsuan
78be2eca7c Merge branch 'Weiqitv' of https://github.com/FounderSG/youtube-dl into FounderSG-Weiqitv 2016-01-19 23:39:32 +08:00
Sergey M․
1fa2b9841d [extractor/generic] Extend dailymotion embed regex 2016-01-19 21:20:45 +06:00
Sergey M․
9fbd0822aa [dailymotion] Extend _VALID_URL 2016-01-19 21:20:14 +06:00
Sergey M․
e323cf3ff3 [youtube] Skip test 2016-01-19 20:56:04 +06:00
Sergey M․
8ceabd4df3 [youtube] Capture and output unavailable message 2016-01-19 20:54:43 +06:00
Sergey M․
a8776b107b [youtube] Clarify test_Youtube_18 2016-01-18 23:19:38 +06:00
Sergey M․
096b533982 [youtube] Fix URL expansion in video description
Fixes test_Youtube_18
2016-01-18 23:17:45 +06:00
Sergey M․
dae503afaa [atresplayer] Skip HLS completely (Closes #8261) 2016-01-17 22:14:07 +06:00
Yen Chi Hsuan
b39eab7f94 Merge pull request #8262 from jwilk/https-everywhere
[ustream] Use HTTPS for GitHub URL
2016-01-17 22:10:03 +08:00
Jakub Wilk
e5a66240c0 [ustream] Use HTTPS for GitHub URL 2016-01-17 15:06:00 +01:00
ping
e0ef13ddeb [neteasemusic] Fallback to alt hosts if m5.music.126.net doesn't work 2016-01-17 07:48:46 +08:00
Sergey M․
855f90fa6f [ae] Rename to aenetworks and clarify extractor name and description 2016-01-17 03:02:45 +06:00
Yen Chi Hsuan
614db89ae3 [compat] Clarify the versions requiring compat_kwargs
It's supported since 2.7.0 alpha 1 and 2.6.5 rc 1. See
https://hg.python.org/cpython/file/v2.7a1/Misc/NEWS#l337
https://hg.python.org/cpython/file/v2.6.5rc1/Misc/NEWS#l28
2016-01-16 22:17:31 +08:00
Yen Chi Hsuan
1358b94163 [ae] Fix _TESTS 2016-01-16 20:56:53 +08:00
Yen Chi Hsuan
350e02d40d [bbc] Use _search_json_ld 2016-01-16 20:46:28 +08:00
Yen Chi Hsuan
0b26ba3fc8 [extractor/common] Allow passing more parameters to _search_json_ld 2016-01-16 20:45:36 +08:00
ping
3a0a78731b Fixes #8239 2016-01-16 12:17:07 +08:00
Sergey M
6be16ed24b [README.md] Add protocol usage example in format selection 2016-01-16 10:15:24 +06:00
Sergey M․
b555942428 [YoutubeDL] Ensure protocol is always present 2016-01-16 10:10:28 +06:00
Sergey M
b2dca40d81 [README.md] Improve format selection documentation 2016-01-16 09:55:52 +06:00
Sergey M
15870bbd01 [README.md] Mention new string operators for format selection 2016-01-16 09:53:31 +06:00
Yen Chi Hsuan
10d33b3473 [YoutubeDL] Introduce CSS3 like string operators 2016-01-16 09:53:12 +06:00
Sergey M
ac25992bc7 Merge pull request #8246 from dstftw/initial-json-ld-metadata-support
Initial JSON-LD metadata extraction support
2016-01-16 07:20:15 +06:00
Sergey M
30783c442d Merge pull request #8245 from dstftw/auto-generate-title-fields
[YoutubeDL] Auto generate title fields corresponding to the *_number fields
2016-01-16 07:20:03 +06:00
Sergey M․
a50a8003a0 [cultureunplugged] Improve (Closes #8060) 2016-01-16 07:10:51 +06:00
Sergey M․
315bdae00a [zippcast] Improve (Closes #8198) 2016-01-16 06:27:34 +06:00
ckuu
2ddfd26f1b '[ZippCast] Add new extractor'
Closes rg3/youtube-dl#6591
2016-01-16 06:25:06 +06:00
Philipp Hagemeister
f3ed5df611 release 2016.01.15 2016-01-15 19:43:04 +01:00
Sergey M․
b4e44234bc [ae] Use JSON-LD for TV series metadata 2016-01-16 00:36:49 +06:00
Sergey M․
4ca2a3cf3c [extractor/common] Add initial support for JSON-LD metadata extraction into info_dict 2016-01-16 00:36:02 +06:00
Sergey M․
33d2fc2f64 [YoutubeDL] Auto generate title fields corresponding to the *_number fields
Auto generate title fields corresponding to the *_number fields when missing in order to always have clean titles. This is very common for TV series.
2016-01-16 00:09:54 +06:00
remitamine
27a95f51aa [cwtv] Add new extractor 2016-01-15 17:45:51 +01:00
Sergey M․
a78d6a9bb1 [ae] Improve _VALID_URL 2016-01-15 22:13:48 +06:00
Sergey M․
567f9a5809 [ae] Add extractor import 2016-01-15 22:12:51 +06:00
Sergey M․
3a421c724f [history] Remove import (Closes #8243) 2016-01-15 22:10:07 +06:00
Sergey M․
34dd81c03a [xtube:user] Fix extraction (Closes #8224) 2016-01-15 21:35:20 +06:00
Sergey M․
b3f502cdb9 [xtube] Add shortcut 2016-01-15 21:28:36 +06:00
remitamine
587dfd44a4 [ae] Add support for fyi.tv, aetv.com and mylifetime.com(closes #3599) 2016-01-15 16:18:07 +01:00
remitamine
52767c1ba0 [history] add support for episode pages(fixes #8240) 2016-01-15 15:16:57 +01:00
remitamine
014b5c59d8 [theplatform] extend _VALID_URL regex 2016-01-15 15:12:35 +01:00
remitamine
fad7a336a1 Revert "[history] fix signature and media url extraction(fixes #8240)"
This reverts commit ffbc0baf72.
2016-01-15 14:54:39 +01:00
remitamine
ffbc0baf72 [history] fix signature and media url extraction(fixes #8240) 2016-01-15 12:35:31 +01:00
Sergey M
345f12196c Merge pull request #8228 from jaimeMF/disable-file-handler
[YoutubeDL] urlopen: disable the 'file:' protocol (#8227)
2016-01-14 22:20:02 +05:00
Sergey M․
5769b68bc0 Credit @TomGijselinck for canvas (#7145) 2016-01-14 23:15:26 +06:00
Sergey M․
4e2743abd9 [canvas] Improve (Closes #7145) 2016-01-14 23:15:12 +06:00
Tom Gijselinck
be2d40a58a [Canvas] Add new extractor 2016-01-14 23:14:41 +06:00
Sergey M․
81549898c0 [prosiebensat1] Fix some extraction and update tests 2016-01-14 22:45:09 +06:00
Lucas
0baedd1851 [prosiebensat1] add support for 7tv.de 2016-01-14 22:14:04 +06:00
Sergey M․
6b559c2fbc [ntvde] Improve regex 2016-01-14 22:12:24 +06:00
Sergey M․
986986064e [orf:fm4] Add test 2016-01-14 22:11:33 +06:00
Sergey M․
4654c1d016 [orf:fm4] Extend _VALID_URL (Closes #8234) 2016-01-14 22:07:42 +06:00
Sergey M․
163e8369b0 [ntvde] Fix extraction 2016-01-14 22:05:04 +06:00
Sergey M․
5cc9c5dfa8 [unistra] Fix extraction 2016-01-14 21:53:24 +06:00
Sergey M․
fbd90643cb [vodlocker] Fix extraction (Closes #8231) 2016-01-14 21:48:08 +06:00
Jaime Marquínez Ferrándiz
30e2f2d76f [YoutubeDL] use a more correct terminology in the error message for file:// URLs 2016-01-14 16:28:46 +01:00
Philipp Hagemeister
11c60089a8 release 2016.01.14 2016-01-14 15:43:21 +01:00
Sergey M․
abb893e6e4 [beeg] Update API URL 2016-01-14 19:57:56 +06:00
Sergey M․
4511c1976d [beeg] Fix extraction (Closes #8225) 2016-01-14 19:57:20 +06:00
Jaime Marquínez Ferrándiz
4240d50496 [YoutubeDL] improve error message for file:/// URLs 2016-01-14 14:07:54 +01:00
Jaime Marquínez Ferrándiz
6240b0a278 [YoutubeDL] urlopen: use build_opener again
Otherwise we would need to manually add handlers like HTTPRedirectHandler, instead we add a customized FileHandler instance that raises an error.
2016-01-14 08:16:39 +01:00
Jaime Marquínez Ferrándiz
e37afbe0b8 [YoutubeDL] urlopen: disable the 'file:' protocol (#8227)
If someone is running youtube-dl on a server to deliver files, the user could input 'file:///some/important/file' and youtube-dl would save that file as a video giving access to sensitive information to the user.
'file:' urls can be filtered, but the user can use an URL to a crafted m3u8 manifest like:

    #EXTM3U
    #EXT-X-MEDIA-SEQUENCE:0
    #EXTINF:10.0
    file:///etc/passwd
    #EXT-X-ENDLIST

With this patch 'file:' URLs raise URLError like for unknown protocols.
2016-01-14 00:24:04 +01:00
remitamine
40cf7fcbd2 [tudou] Add support for Albums and Playlists and extract more metadata 2016-01-13 13:29:00 +01:00
Yen Chi Hsuan
cc28492d31 [youtube] Fix acodec and vcodec order
In RFC6381, there's no rule stating that the first part of codecs should
be video and the second part should be audio, while it seems the case
for data reported by YouTube.
2016-01-13 17:05:38 +08:00
Sergey M․
bc0550c262 [pluralsight] Fix new player (Closes #8215) 2016-01-13 08:18:37 +06:00
Sergey M․
b83b782dc4 [downloader/fragment] Move helper data to context dict 2016-01-13 00:00:31 +06:00
Sergey M․
16a348475c [dailymotion] Prefer direct links (Closes #8156) 2016-01-12 23:23:39 +06:00
Sergey M․
709185a264 [downloader/fragment] More smooth calculations
`downloaded_bytes` is now updated on each fragment progress hook invocation
2016-01-12 23:18:38 +06:00
Sergey M․
9cb1a06b6c [downloader/fragment] Remove unused code and fix zero division error 2016-01-12 22:09:38 +06:00
Sergey M․
be27283ef6 [iprima] Mark broken 2016-01-11 22:00:17 +06:00
Sergey M․
b924bfad68 [videott] Mark broken 2016-01-11 21:58:32 +06:00
Sergey M․
192b9a571c [videomega] Mark broken 2016-01-11 21:56:19 +06:00
remitamine
6ec6cb4e95 Revert "fix typos"
This reverts commit 36a0e46c39.
2016-01-10 19:27:22 +01:00
remitamine
36a0e46c39 fix typos 2016-01-10 17:55:41 +01:00
Jakub Wilk
dfb1b1468c Fix typos
Closes #8200.
2016-01-10 17:24:28 +01:00
Jaime Marquínez Ferrándiz
3c91e41614 [downloader/fragment] Don't fail if the 'Content-Length' header is missing
In some dailymotion videos (like http://www.dailymotion.com/video/x3k0dtv from #8156) the segments URLs don't have the 'Content-Length' header and HttpFD sets the 'totat_bytes' field to None, so we also use '0' in that case (since we do different math operations with it).
2016-01-10 14:41:38 +01:00
Jaime Marquínez Ferrándiz
7e8a800f29 [bigflix] Use correct indentation to make flake8 happy 2016-01-10 14:26:27 +01:00
remitamine
2334762b03 [shahid] raise ExtractorError if the video is DRM protected 2016-01-10 07:55:58 +01:00
remitamine
3fc088f8c7 [dcn] extract video ids in season entries 2016-01-10 07:45:41 +01:00
Sergey M․
a9bbd26f1d [bigflix] Improve formats extraction 2016-01-10 10:49:27 +06:00
Sergey M․
6e99d5762a [bigflix] Extract all formats 2016-01-10 10:31:36 +06:00
Sergey M․
15b1c6656f Credit @vickyg3 for bigflix (#8194) 2016-01-10 10:03:56 +06:00
Sergey M
d412794205 Merge pull request #8194 from vickyg3/bigflix_ie
[Bigflix] Add new extractor for bigflix.com
2016-01-10 09:02:18 +05:00
Vignesh Venkat
0a899a1448 [Bigflix] Add new extractor for bigflix.com
Add an IE to support bigflix.com. It uses some sort of silverlight
plugin whose video url is being populated using base64 encoded
flashvars. So it is quite straightforward to extract.
2016-01-09 19:45:58 -08:00
Sergey M․
7a34302e95 [canalc2] Fix extraction (Closes #8191) 2016-01-10 01:37:10 +06:00
Jaime Marquínez Ferrándiz
27783821af [xhamster] Remove unused import 2016-01-09 11:16:23 +01:00
Philipp Hagemeister
b374af6ebd release 2016.01.09 2016-01-09 01:16:08 +01:00
Sergey M․
16f1131a4d [vimeo] Add test for #8187 2016-01-09 03:07:29 +06:00
Sergey M․
d5f071afb5 [vimeo] Check source file URL (Closes #8187) 2016-01-09 03:06:09 +06:00
Sergey M․
14b4f038c0 [xhamster] Update tests 2016-01-09 00:36:43 +06:00
Sergey M․
bcac2a0710 [xhamster] Fix uploader extraction 2016-01-09 00:36:19 +06:00
Sergey M․
1a6d92847f [xhamster] Change title regex precedence 2016-01-09 00:31:24 +06:00
Sergey M․
6a16fd4a1a [xhamster] Fix view count extraction 2016-01-09 00:29:10 +06:00
Sergey M․
44731e308c [xhamster] Fix duration extraction 2016-01-09 00:26:37 +06:00
Sergey M․
4763b624a6 [xhamster] Fix upload date extraction 2016-01-09 00:21:57 +06:00
Sergey M․
6609b3ce37 [xhamster] Improve title extraction 2016-01-09 00:19:36 +06:00
Sergey M
7e182627d9 Merge pull request #8182 from atomic83/patch-1
Extract xHamster title fix
2016-01-08 23:04:32 +05:00
atomic83
5777f5d386 Extract xHamster title fix 2016-01-08 12:58:05 +01:00
Sergey M․
5dbe81a1d3 [vimeo] Automatically pickup full movie when rented (Closes #8171) 2016-01-08 10:41:24 +06:00
Sergey M․
4cf096a4a9 [ivideon] Add support for map bound URLs 2016-01-08 05:11:23 +06:00
Sergey M․
18e6c97c48 [adultswim] Skip georestricted hls (Closes #8168) 2016-01-08 03:19:47 +06:00
Sergey M․
97afd99a18 [soundcloud:likes] Adapt to API changes (Closes #8166) 2016-01-08 01:54:31 +06:00
Sergey M․
23f13e9754 [youtube] Support expanding alternative format of links in description (Closes #8164) 2016-01-08 00:52:55 +06:00
Sergey M․
2e02ecbccc [ivideon] Add extractor 2016-01-07 12:24:32 +06:00
oittaa
e4f49a8753 check video_play_path and use xpath_text
"This check should take place earlier and should be more general if not video_url:. Same should be done for video_play_path. Also these fields better extracted with xpath_text."

Suggestions by @dstftw
2016-01-07 11:36:17 +06:00
Sergey M․
51d3045de2 [npr] Fix extractor (Closes #7218) 2016-01-07 01:57:36 +06:00
kaspi
76048b23e8 [npr] Add extractor
removed md5 from _TEST

moved from xml data to json

test

changed _TEST url to one that will not expire, so tests would not be failing
2016-01-07 01:55:55 +06:00
Sergey M․
f20756fb10 [udemy] Fix non free course message 2016-01-06 00:03:39 +06:00
Sergey M․
17b2d7ca77 [udemy] Detect non free courses (Closes #8138) 2016-01-06 00:02:21 +06:00
Sergey M
40f796288a [README.md] Clarify cookies usage 2016-01-05 02:17:12 +06:00
Sergey M․
2f546d0a3c [vrt] Prefix format ids 2016-01-05 01:59:45 +06:00
Sergey M․
18c782ab26 [vrt] Extend _VALUD_URL 2016-01-05 01:58:25 +06:00
Sergey M․
33cee6c7f6 [dramafever] Add test for custom episode title 2016-01-05 01:41:18 +06:00
Sergey M․
a2e51e7b49 [dramafever] Fix episode fallback 2016-01-05 01:36:38 +06:00
Sergey M․
bd19aa0ed3 [dramafever] Extract episode 2016-01-05 01:28:48 +06:00
Sergey M․
8f4c56f334 [dramafever] Extract episode number 2016-01-05 01:17:33 +06:00
Sergey M․
1dcc38b233 [dramafever] Improve subtitles extraction (Closes #8136) 2016-01-05 01:11:07 +06:00
Sergey M․
fff79f1867 [amp] Add missing subtitles to info dict 2016-01-05 01:05:37 +06:00
Jaime Marquínez Ferrándiz
3f17c357d9 [downloader/hls] Don't let ffmpeg read from stdin (#8139)
If you run 'while read aurl ; do youtube-dl "${aurl}"; done < path_to_batch_file'  (batch_file contains one url per line that uses the hls downloader) each call to youtube-dl consumed some characters and 'read' would assing to 'aurl' a non valid url

(This is the same problem that was fixed for the ffmpeg postprocessors in cffcbc02de)
2016-01-04 18:35:31 +01:00
Sergey M․
9938a17f92 [rte:radio] Extract timestamp 2016-01-04 05:04:48 +06:00
Sergey M․
9746f4314a [rte:radio] Simplify 2016-01-04 05:01:32 +06:00
Sergey M․
0238451fc0 [rte] PEP 8 2016-01-04 04:49:13 +06:00
Sergey M
2098aee7d6 Merge pull request #8063 from bpfoley/rteradio
[rte:radio] Add support for RTE radio player
2016-01-04 03:33:22 +05:00
Sergey M․
fb588f6a56 Credit @bpfoley for rte:radio (#8063) 2016-01-04 04:32:47 +06:00
bpfoley
896c7a23cd [extractor/rte.py] Add support for RTE radio player
While here, stop RteIE changing filename extensions to .mp4. The files
saved are .flv containers with h264 video.
2016-01-03 22:26:06 +00:00
Sergey M․
1463c5b9ac [ivi] Extract season info 2016-01-04 03:54:52 +06:00
Sergey M․
c6270b2ed5 [ivi:compilation] Fix extraction 2016-01-04 03:49:18 +06:00
Sergey M․
ab3176af34 [ivi] Fix extraction and modernize 2016-01-04 03:34:15 +06:00
Sergey M․
5aa535c329 [bbccouk] Update tests (Closes #8090) 2016-01-04 02:55:25 +06:00
Sergey M․
133b1886fc [20min] Improve (Closes #8110) 2016-01-04 02:33:08 +06:00
pingtux
66295fa4a6 [20min.ch] Added support for videoportal 2016-01-04 02:22:03 +06:00
pingtux
e54c44eeab [20min.ch] Add new extractor (closes #5977) 2016-01-04 02:21:56 +06:00
Sergey M․
a7aaa39863 [utils] Extract known extensions for reuse 2016-01-04 01:08:34 +06:00
Sergey M
ea6abd740f [nowtv] Mark broken 2016-01-03 10:12:13 +05:00
dyn888
e1a0bfdffe [youtube] added vcodec/acodec/abr for multiple itags
Should make downloading with filters more precise and easier, ie. bestvideo[vcodec=h264]. By default a lot of codecs are specified as avc1.xxxxxx and unique for each format, which makes them unusable for bestvideo selection.
2016-01-03 04:11:19 +01:00
Sergey M
3f3343cd3e Merge pull request #8061 from dstftw/introduce-chapter-and-series-fields
Introduce chapter and series fields
2016-01-03 03:54:22 +06:00
remitamine
4059eabd58 [dreisat] use extract_from_xml_url from ZDFIE for info extraction(fixes #7680)(fixes #8104)(closes #8121) 2016-01-02 21:29:10 +01:00
remitamine
6b46102661 [zdf] fix rtmpt format downloading handle errors 2016-01-02 21:24:57 +01:00
Yen Chi Hsuan
141a273a8b [qqmusic] Update tests 2016-01-02 22:39:09 +08:00
Yen Chi Hsuan
2fffb1dcd0 [qqmusic:playlist] Capture errors and update tests 2016-01-02 22:33:33 +08:00
Yen Chi Hsuan
e698e4e533 Merge branch 'remitamine-baidu' 2016-01-02 21:50:17 +08:00
Yen Chi Hsuan
b7546397f0 [baidu] Use list comprehension 2016-01-02 21:46:40 +08:00
Yen Chi Hsuan
0311677258 [baidu] Add notes for API calls 2016-01-02 21:44:49 +08:00
Sergey M․
88fb59d91b [bbccouk] Extend title extraction 2016-01-02 19:42:11 +06:00
Yen Chi Hsuan
a1d9f6c5dc [baidu] Improve playlist description 2016-01-02 21:36:35 +08:00
Yen Chi Hsuan
c579c5e967 [baidu] Cleanups 2016-01-02 21:31:02 +08:00
Yen Chi Hsuan
c9c194053d Merge branch 'baidu' of https://github.com/remitamine/youtube-dl into remitamine-baidu 2016-01-02 21:27:32 +08:00
Sergey M․
f20a11ed25 [bbccouk] Extend _VALID_URL (Closes #8116) 2016-01-02 19:22:39 +06:00
Sergey M․
76a353c9e5 [ruutu] Fix extraction (Closes #8107) 2016-01-02 07:44:30 +06:00
Sergey M
392f04d586 Merge pull request #8112 from pingtux/testtube
Remove testtube import
2016-01-02 07:11:32 +06:00
pingtux
94de6cf59c Remove testtube import
Extractor got deleted in remitamine/youtube-dl@8af2804
2016-01-02 01:35:09 +01:00
remitamine
8af2804a5d [testtube] Remove Extractor 2016-01-01 21:53:19 +01:00
remitamine
054479754c [revision3] Add new extractor(closes #6388)
- revision3.com
- testtube.com
- animalist.com
2016-01-01 21:03:16 +01:00
Sergey M․
5bafcf6525 [udemy] Use chapter_number 2016-01-01 20:34:29 +06:00
Sergey M․
306c51c669 [videomore] Use number fields for series 2016-01-01 20:30:08 +06:00
Sergey M․
27bfd4e526 [extractor/common] Introduce number fields for chapters and series 2016-01-01 20:26:56 +06:00
Jaime Marquínez Ferrándiz
ca227c8698 [yahoo] Support pages that use an alias (fixes #8084) 2016-01-01 14:32:00 +01:00
Philipp Hagemeister
32f9036447 [ccc] Add language information to formats 2016-01-01 13:28:45 +01:00
Philipp Hagemeister
190ef07981 release 2016.01.01 2016-01-01 12:17:10 +01:00
Sergey M․
82597f0ec0 [ccc] Extract duration 2016-01-01 15:41:52 +06:00
Sergey M․
8499d21158 [ccc] Fix description extraction and update test 2016-01-01 15:29:42 +06:00
Sergey M․
c9154514c4 [ccc] Fix upload date extraction 2016-01-01 15:22:22 +06:00
Sergey M․
0d5095fc65 [ccc] Update _VALID_URL (Closes #8097) 2016-01-01 15:14:41 +06:00
Yen Chi Hsuan
034caf70b2 [youku] Fix extraction (#8068) 2016-01-01 13:33:01 +08:00
remitamine
e565cf6048 [nextmovie] Add new extractor 2015-12-31 22:47:18 +01:00
remitamine
59f197aec1 Merge branch 'master' of github.com:rg3/youtube-dl 2015-12-31 22:15:14 +01:00
remitamine
a0e5beb0fb [nick] Add new extractor 2015-12-31 22:12:05 +01:00
remitamine
c1e90619bd [mtv] extract mgid extraction and query building into separate methods 2015-12-31 22:10:00 +01:00
Sergey M․
fec09bf15d [einthusan] Improve extraction (Closes #7877) 2016-01-01 02:39:00 +06:00
j
a0d7ede350 Fix einthusan parser 2016-01-01 02:38:50 +06:00
Sergey M․
b26afec81f [einthusan] Improve extraction (Closes #7877) 2016-01-01 02:23:03 +06:00
Sergey M․
8f7c4f7d2e Merge branch 'master' of github.com:rg3/youtube-dl 2016-01-01 02:22:26 +06:00
j
0416006a30 Fix einthusan parser 2016-01-01 01:58:49 +06:00
remitamine
7f9134fb2d [tvland] inherit from MTVServicesInfoExtractor 2015-12-31 20:52:47 +01:00
remitamine
91e274546c [tvland] Add new extractor 2015-12-31 20:23:48 +01:00
Jaime Marquínez Ferrándiz
69f8595256 [espn] Extract better titles 2015-12-31 20:06:21 +01:00
Jaime Marquínez Ferrándiz
930087f2f6 [espn] Support 'intl' videos (#7858) 2015-12-31 20:04:17 +01:00
Jaime Marquínez Ferrándiz
9f9f7664b5 [espn] Update test 2015-12-31 19:52:48 +01:00
Sergey M․
72528252e3 [pandoratv] Add IE names 2016-01-01 00:42:42 +06:00
Sergey M․
e4bd63f9c0 [pandoratv] Improve extraction (Closes #7921) 2016-01-01 00:40:27 +06:00
j
9accfed4e7 [pandoratv] Add new extractor (closes #6884) 2016-01-01 00:18:13 +06:00
remitamine
f1e21efe63 [tlc] remove TlcIE 2015-12-31 18:33:40 +01:00
remitamine
b05641ce40 [discovery] improve _VALID_URL regex 2015-12-31 18:24:49 +01:00
remitamine
fec040e754 [discovery] add support for discovery related sites
- investigationdiscovery.com
- discoverylife.com
- animalplanet.com
- ahctv.com
- destinationamerica.com
- sciencechannel.com
- tlc.com
- velocity.com
2015-12-31 17:29:37 +01:00
Sergey M․
34a9da136f [regiotv] Improve extraction (Closes #7915) 2015-12-31 22:12:47 +06:00
j
c43fda4c1a [regiotv] Add new extractor (closes #7797) 2015-12-31 22:11:13 +06:00
Philipp Hagemeister
7de81fcc53 release 2015.12.31 2015-12-31 16:50:53 +01:00
remitamine
9d46608efa [ora] Add new extractor(closes #7732) 2015-12-31 16:35:51 +01:00
remitamine
80b8b72cb8 [animalplanet] Add new extractor(closes #5303) 2015-12-31 13:36:07 +01:00
remitamine
9787c5f4c8 [fox] Add new extractor(closes #3063) 2015-12-31 12:02:33 +01:00
remitamine
d5f6429de8 [canalplus] improve extraction(fixes #6301)
- extract data from json instead of xml
- fix http format urls
- extract more metadata
- update tests
- make m3u8 and f4m format extraction non fatal
- use m3u8_native implementation
2015-12-31 04:02:08 +01:00
Sergey M․
df827a983a [discovery] Allow https (Closes #8065) 2015-12-31 06:06:36 +06:00
Sergey M․
29f3683901 [espn] Remove broken flag 2015-12-31 05:31:01 +06:00
Jaime Marquínez Ferrándiz
c7932289e7 [cbsnews] Fix extraction of the URL for the 'RtmpDesktop' format (fixes #8048) 2015-12-30 23:57:19 +01:00
Sergey M․
7a0b07c719 [videomore] Extract series info 2015-12-31 03:13:02 +06:00
Sergey M․
4d402db521 [udemy] Extract chapter info 2015-12-31 03:11:21 +06:00
Sergey M․
7109903e61 [extractor/common] Document chapter and series fields 2015-12-31 03:10:44 +06:00
Sergey M․
3092fc4035 [udemy] Fix typo 2015-12-31 01:09:21 +06:00
Sergey M․
f5bc4b5f95 [options] Prefer --convert-subs spelling 2015-12-30 23:12:35 +06:00
Sergey M․
69759a5990 [videomore] Set IE_NAME 2015-12-30 00:44:07 +06:00
Sergey M․
453fe2a345 [dramafever] Fix subtitles extraction (Closes #8049) 2015-12-30 00:13:00 +06:00
Sergey M․
ff18735cb2 [extractor/generic] Add support for videomore embeds 2015-12-29 23:58:23 +06:00
Sergey M․
030dfb04e0 [videomore] Add extractor (Closes #8040) 2015-12-29 23:57:46 +06:00
remitamine
06e4874c99 Merge branch 'jukebox' of https://github.com/remitamine/youtube-dl into remitamine-jukebox 2015-12-29 17:31:18 +01:00
remitamine
0d8a0fdc30 [srgssr] use SRFIE format ids 2015-12-29 16:38:06 +01:00
Jaime Marquínez Ferrándiz
53365f74a7 Credit @flatgreen for FranceCultureEmission (#8022) 2015-12-29 16:14:05 +01:00
remitamine
0368181998 [wdr] split long lines 2015-12-29 15:07:50 +01:00
remitamine
6101f45ef9 [ooyala] split long lines, fix test duration and add hdcode param to hds url 2015-12-29 15:05:21 +01:00
remitamine
bf96b45238 [rai] split long lines 2015-12-29 15:03:14 +01:00
remitamine
98d7c0f4f7 [tele13] split long lines 2015-12-29 15:02:18 +01:00
remitamine
f2017cb020 [srgssr] split long lines and use m3u8_native 2015-12-29 14:58:59 +01:00
remitamine
f889ac45b8 [ign] split long lines 2015-12-29 14:55:36 +01:00
remitamine
eccde5e9de [audimedia] split long lines 2015-12-29 14:53:43 +01:00
remitamine
ce7d243c7e [srgssr] fix IE_DESC 2015-12-29 12:01:22 +01:00
remitamine
6c4d6609ad [phoenix] fix IE_NAME 2015-12-29 12:00:52 +01:00
remitamine
db710571fd [daum] fix IE_NAME 2015-12-29 11:59:27 +01:00
remitamine
574dd17882 Merge branch 'remitamine-srgssr' 2015-12-29 11:38:19 +01:00
remitamine
422f7c112c [srgssr] update tests 2015-12-29 11:36:04 +01:00
Philipp Hagemeister
e974356f32 release 2015.12.29 2015-12-29 11:02:58 +01:00
remitamine
b19ad2fb53 Merge branch 'srgssr' of https://github.com/remitamine/youtube-dl into remitamine-srgssr 2015-12-29 10:57:45 +01:00
remitamine
126d7701b0 Merge branch 'daum' of https://github.com/remitamine/youtube-dl into remitamine-daum 2015-12-29 10:40:32 +01:00
flatgreen
ecf17d1653 [franceculture] Add extractor for '/emission-*' urls (closes #3777, closes #8022) 2015-12-28 20:07:35 +01:00
Jaime Marquínez Ferrándiz
7447661e9b [franceculture] Fix test 2015-12-28 20:05:47 +01:00
Sergey M․
7e5edcfd33 Simplify formats accumulation for f4m/m3u8/smil formats
Now all _extract_*_formats routines return a list
2015-12-29 00:58:24 +06:00
remitamine
39d60b715a Merge pull request #7769 from remitamine/sort
[common] lower (m3u8,rtmp,rtsp) format preference only if required program is not available
2015-12-28 19:15:14 +01:00
remitamine
d497a201ca [common] use specific variable for protocol preference in _sort_formats 2015-12-28 18:45:01 +01:00
remitamine
54537cdfb3 Merge pull request #8023 from remitamine/extract-formats
[common] simplify the use of _extract_m3u8_formats and _extract_f4m_formats
2015-12-28 18:17:12 +01:00
remitamine
bc737bd61a Merge branch 'zdf'(fixes #8024) 2015-12-28 17:54:04 +01:00
remitamine
a5c1d95500 [zdf] fix formats extraction 2015-12-28 17:52:05 +01:00
Sergey M․
c1f49e1684 [facebook] Fix authentication 2015-12-28 21:37:02 +06:00
Sergey M․
9f66931e16 [facebook] Extract login error 2015-12-28 21:20:09 +06:00
Jaime Marquínez Ferrándiz
6c6b8bd5cc [cspan] Fix extraction (fixes #8032) 2015-12-28 13:50:29 +01:00
Jaime Marquínez Ferrándiz
04e24906be [cspan] Initialize 'video_type' to avoid 'UnboundLocalError' exceptions (#8032) 2015-12-28 13:06:30 +01:00
remitamine
974c1b2d42 Merge branch 'dcn' of github.com:remitamine/youtube-dl into remitamine-dcn 2015-12-28 10:38:31 +01:00
remitamine
bca9bea1c1 [dcn] make m3u8 formats extraction non fatal 2015-12-28 10:27:17 +01:00
remitamine
bd3f9ecabe [tunein] add support for tunein topic,clip and program(fixes #7348) 2015-12-28 00:36:57 +01:00
Yen Chi Hsuan
c047270c02 [utils] Remove Content-encoding from headers after decompression
With cn_verification_proxy, our http_response() is called twice, one from
PerRequestProxyHandler.proxy_open() and another from normal
YoutubeDL.urlopen(). As a result, for proxies honoring Accept-Encoding, the
following bug occurs:

$ youtube-dl -vs --cn-verification-proxy https://secure.uku.im:993 "test:letv"
[debug] System config: []
[debug] User config: []
[debug] Command-line args: ['-vs', '--cn-verification-proxy', 'https://secure.uku.im:993', 'test:letv']
[debug] Encodings: locale UTF-8, fs utf-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2015.12.23
[debug] Git HEAD: 97f18fa
[debug] Python version 3.5.1 - Linux-4.3.3-1-ARCH-x86_64-with-arch-Arch-Linux
[debug] exe versions: ffmpeg 2.8.4, ffprobe 2.8.4, rtmpdump 2.4
[debug] Proxy map: {}
[TestURL] Test URL: http://www.letv.com/ptv/vplay/22005890.html
[Letv] 22005890: Downloading webpage
[Letv] 22005890: Downloading playJson data
ERROR: Unable to download JSON metadata: Not a gzipped file (b'{"') (caused by OSError('Not a gzipped file (b\'{"\')',)); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/extractor/common.py", line 330, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/YoutubeDL.py", line 1886, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 471, in open
    response = meth(req, response)
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 773, in http_response
    raise original_ioerror
  File "/home/yen/Executables/Multimedia/youtube-dl/youtube_dl/utils.py", line 761, in http_response
    uncompressed = io.BytesIO(gz.read())
  File "/usr/lib/python3.5/gzip.py", line 274, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.5/gzip.py", line 461, in read
    if not self._read_gzip_header():
  File "/usr/lib/python3.5/gzip.py", line 409, in _read_gzip_header
    raise OSError('Not a gzipped file (%r)' % magic)
2015-12-28 01:09:18 +08:00
remitamine
97f18fac3a [vgtv] fix f4m downloading(fixes #7843) 2015-12-27 17:44:55 +01:00
remitamine
c71d2e2087 [livestream] change test url 2015-12-27 17:27:20 +01:00
Yen Chi Hsuan
59185202c6 [iqiyi] Add tests for #7894 2015-12-28 00:19:36 +08:00
forDream
bee83e84f6 [iqiyi]fix valid url
eg:
http://yule.iqiyi.com/zbj.html
2015-12-28 00:16:34 +08:00
gam2046
82e02ea5fc Update iqiyi.py
Fix part of the address can not be resolved.
eg:http://www.iqiyi.com/w_19rt6o8t9p.html
2015-12-28 00:16:21 +08:00
Sergey M․
a95c26a06a [jwplatform] Carry long line 2015-12-27 21:43:14 +06:00
Sergey M․
0b0a17ae9d [viki] Fix typo 2015-12-27 21:42:26 +06:00
Sergey M․
30f51acbc8 [rai] Fix typos 2015-12-27 21:42:12 +06:00
Sergey M․
e0898585a1 [jwplatform] Fix typo 2015-12-27 21:41:55 +06:00
Sergey M․
62bdc9fecc [esri] Fix typo 2015-12-27 21:41:36 +06:00
Sergey M․
e73277c7e8 [abc7news] Remove redundant formats sorting 2015-12-27 21:41:21 +06:00
remitamine
8d29e47f54 [common] simplify the use of _extract_m3u8_formats and _extract_f4m_formats 2015-12-27 15:33:39 +01:00
remitamine
2db772b9ea Merge branch 'master' of github.com:rg3/youtube-dl 2015-12-27 15:03:13 +01:00
remitamine
7b81316508 [livestream] skip m3u8 manifest in progressive_urls 2015-12-27 14:58:57 +01:00
Philipp Hagemeister
05358deeca Merge branch 'master' of github.com:rg3/youtube-dl 2015-12-27 14:34:51 +01:00
Philipp Hagemeister
9f610f3a9e [sportdeutschland] Do not abort if meta info is missing
This fixes http://sportdeutschland.tv/badminton/yonex-copenhagen-masters-2015 . No testcase though since the event will be over by 2016.
2015-12-27 13:11:55 +01:00
remitamine
dbfd06730c Merge pull request #7892 from remitamine/livestream
[livestream] improve extraction(fixes #2133)(fixes #2838)(fixes #4152)(fixes #6988)
2015-12-27 13:11:33 +01:00
remitamine
5b025168c7 [livestream] improve extraction
- split long lines
- use m3u8 entry protocol for live streams
- extend _VALID_URL regex for livestream original
- extract livestream original live streams
2015-12-27 12:57:39 +01:00
remitamine
46124a49b2 Merge pull request #7851 from remitamine/kaltura
[kaltura] extract more formats
2015-12-27 10:34:36 +01:00
remitamine
608cc3b85c [kaltura] add referrer to m3u8 url 2015-12-27 10:19:44 +01:00
remitamine
6afe044b51 [dcn] improve extraction 2015-12-27 09:56:15 +01:00
Sergey M․
15aad84dc5 [lrt] Extract counters 2015-12-27 12:26:48 +06:00
Sergey M․
f7e1d82d40 [lrt] Improve 2015-12-27 12:16:55 +06:00
Giedrius Statkevičius
339b1944e7 [lrt] fix the rest of extractor
Closes #7690.
2015-12-26 20:02:54 +01:00
Giedrius Statkevičius
85367c3a47 [lrt] fix duration parsing 2015-12-26 19:29:54 +01:00
Sergey M․
607d65fbce [ign] flake8 2015-12-26 03:17:56 +06:00
remitamine
9f0ee2a388 Merge pull request #7045 from remitamine/ign
[ign] add support for pcmag and extract all formats and more metadata(fixes #5335)(fixes #7006)
2015-12-25 20:06:27 +01:00
remitamine
1fc0b47fdf [srmediathek] improve extraction 2015-12-25 17:37:50 +01:00
Sergey M․
6418b2439b [rutv] Fix extraction (Closes #8004) 2015-12-25 21:14:00 +06:00
remitamine
06d5556dfa [rai] improve extraction 2015-12-25 15:38:12 +01:00
remitamine
fb8e402ad2 [hotstar] Add new extractor 2015-12-25 01:59:56 +01:00
Sergey M․
c24044635b [zdf:channel] Add more tests 2015-12-24 20:44:49 +06:00
Sergey M․
67ba388efb [zdf:channel] Relax _VALID_URL 2015-12-24 20:42:29 +06:00
Boris Wachtmeister
e41604227c [zdf] expand valid-url pattern for channels
The webpage also creates URLs which include additional text that defines
the sorting order on the page like "aktuellste" (most current) and
"meist-gesehen" (most seen), e.g.:

http://www.zdf.de/ZDFmediathek/kanaluebersicht/aktuellste/332
http://www.zdf.de/ZDFmediathek/kanaluebersicht/meist-gesehen/332
2015-12-24 20:39:37 +06:00
Sergey M․
8a609c32fd [chaturbate] Improve error extraction (Closes #7989) 2015-12-24 20:09:48 +06:00
remitamine
96db61ffb8 [theintercept] improve extraction 2015-12-23 22:36:53 +01:00
remitamine
c153bd8b2f Merge branch 'theintercept' of https://github.com/bit/youtube-dl into bit-theintercept 2015-12-23 21:50:27 +01:00
Sergey M
383003653f Merge pull request #7969 from jwilk/spelling
Fix typos
2015-12-24 00:15:08 +06:00
Jakub Wilk
fc383f199e Fix typos 2015-12-23 19:12:16 +01:00
Sergey M․
2c566d02fe [pbs] Extend PBS station regex (Closes #7964) 2015-12-23 23:22:47 +06:00
Jaime Marquínez Ferrándiz
a8f1d167f6 [arte] Prefer json URLs that contain the video id from the 'vid' parameter in the URL (fixes #7920) 2015-12-23 18:00:46 +01:00
remitamine
261b4c23c7 [appletrailers] skip clips with empty url 2015-12-23 17:48:37 +01:00
Sergey M․
dcdc352371 [instagram:user] Improve _VALID_URL (Closes #7955) 2015-12-23 21:13:31 +06:00
Sergey M․
be514c856c [24video] Fix test 2015-12-23 20:49:52 +06:00
Sergey M․
128eb31d90 [24video] Fix extraction on python 2.6 2015-12-23 20:49:41 +06:00
Sergey M․
747b028412 [24video] Fix extraction (Closes #7956) 2015-12-23 20:42:36 +06:00
Jaime Marquínez Ferrándiz
7fe37d8a05 [appletrailers] Improve regex for fixing '<img>' tags (#7953) 2015-12-23 14:48:40 +01:00
Philipp Hagemeister
f10c27b8cb release 2015.12.23 2015-12-23 14:05:06 +01:00
remitamine
60427f63d1 [appletrailers] Add support for AppleTrailers Section 2015-12-23 10:40:45 +01:00
Sergey M․
178b47e6af [daum] Add test for #7949 2015-12-23 02:59:49 +06:00
Sergey M․
3a70ed9ebe [daum] Fix extraction (Closes #7949) 2015-12-23 02:54:32 +06:00
Sergey M․
89abf7bf4d [periscope] Fix token based extraction (Closes #7943) 2015-12-23 02:09:50 +06:00
Sergey M․
cfe9e5aa6c [comcarcoff] Extract duration 2015-12-23 01:18:14 +06:00
Sergey M․
4c24ed9464 [comcarcoff] Improve json data regex and modernize 2015-12-23 01:16:14 +06:00
Sergey M
11208ebbf1 Merge pull request #7942 from ausbin/comcarcoff-json-fix
[comcarcoff] adjust for json updates
2015-12-23 01:05:54 +06:00
Sergey M․
774ce35571 [imgur] Improve (Closes #7928) 2015-12-22 21:48:48 +06:00
Abhishek Kedia
dbee18b552 Improve extraction (Closes #7918)
remove outer parentheses in if

Conflicts:
	youtube_dl/extractor/imgur.py

checked code with flake8

not returning list in case of single images.

using the fact that id with length 5 are albums and more are single videos.
Also for single videos ie ImgurIE both urls - http://imgur.com/gallery/oWeAMW2 and http://imgur.com/oWeAMW2 are equally fine. Change regex to allow thuis.
For albums urls - http://imgur.com/gallery/Q95ko and http://imgur.com/Q95ko are ok. Change regex to allow this also.

update description in ImgurIE Tests.
Also move single video test 'https://imgur.com/gallery/YcAQlkx' from ImgurAlbumIE to ImgurIE.
2015-12-22 21:43:49 +06:00
remitamine
31d9ea4a3e Merge pull request #7322 from remitamine/vgtv
[vgtv] extract videos from FTV, Aftenposten, Aftonbladet using VGTVIE
2015-12-22 16:10:04 +01:00
remitamine
3b68efdc6a [vgtv] update tests and correct format sorting 2015-12-22 15:54:51 +01:00
j
2be689b7e2 [theintercept] Add new extractor 2015-12-22 13:08:16 +01:00
remitamine
2db5806991 [franceinter] use _match_id 2015-12-22 11:30:35 +01:00
remitamine
220bc3f0e3 [franceinter] fix title extraction 2015-12-22 11:27:18 +01:00
remitamine
48a6c984b8 [bleacherreport] update test 2015-12-22 10:14:57 +01:00
remitamine
dc016bf521 [viki] detect errors and fix formats extraction 2015-12-22 09:55:25 +01:00
remitamine
ff43d2365f [soompi] remove extractor
http://tv.soompi.com now redirect to viki.com because Viki has acquired
Soompi
http://www.soompi.com/2015/08/19/we-got-married-soompi-joins-viki/
2015-12-22 07:58:33 +01:00
Austin Adams
ed63cbd6ee [comcarcoff] adjust for json updates 2015-12-21 20:28:38 -05:00
Founder Fang
5f432ac8f5 [Weiqitv] Add new extractor 2015-12-22 06:21:56 +08:00
remitamine
c7224074d6 [audimedia] correct test case id 2015-12-21 23:02:55 +01:00
remitamine
eed30fea75 [flickr] fix format sorting 2015-12-21 22:10:16 +01:00
remitamine
5625bd0617 [br] add support for br-klassik.de and improve extraction
- extend _VALID_URL to match both br.de and br-klassik.de
- extract all formats(hls,hds and rtmp)
- use xpath_element and xpath_text for xml info extraction
2015-12-21 21:06:10 +01:00
Sergey M․
5ef5d25b15 [audiomack] Fix typo (Closes #7936) 2015-12-21 22:51:58 +06:00
remitamine
0f15ad7b9b [adultswim] update test 2015-12-21 17:07:19 +01:00
remitamine
f11d00fa41 [test_subtitles] remove BlipTV test 2015-12-21 16:52:47 +01:00
remitamine
61ebb401b7 [atresplayer] improve extraction
- select hashlib.md5 constructor as digestmod(in python 3.4+ MD5 as
implicit default digest for digestmod is deprecated.)
- extract hls formats
- update tests
- extract errors
2015-12-21 16:26:40 +01:00
remitamine
5c5a3ecf1b [abc] detect expired state and update tests 2015-12-21 13:07:52 +01:00
Philipp Hagemeister
0197004f78 release 2015.12.21 2015-12-21 11:42:25 +01:00
remitamine
2c28da8e05 Merge branch 'bleacherreport' of github.com:remitamine/youtube-dl into remitamine-bleacherreport 2015-12-21 11:18:32 +01:00
remitamine
c7fa5fa42c [bleacherreport] fix style issues and simplify 2015-12-21 11:12:58 +01:00
remitamine
7ba71e30fb Merge branch 'bliptv' of github.com:remitamine/youtube-dl into remitamine-bliptv 2015-12-21 04:31:17 +01:00
remitamine
7cb0952474 [makertv] improve extraction 2015-12-21 04:24:58 +01:00
remitamine
a8ae232fa9 Merge branch 'googledrive' of github.com:remitamine/youtube-dl into remitamine-googledrive 2015-12-21 03:15:19 +01:00
remitamine
5b251628e9 [googledrive] Modernize 2015-12-21 03:05:34 +01:00
remitamine
b9a324c0da Merge branch 'flickr' of github.com:remitamine/youtube-dl into remitamine-flickr 2015-12-21 00:37:51 +01:00
remitamine
5b95419ca5 [flickr] extract views_count and tags 2015-12-21 00:20:22 +01:00
remitamine
ecbccea703 [faz] extract duration and bitrate and use xpath_element and xpath_text for extraction 2015-12-20 21:38:30 +01:00
remitamine
c240ab6ecf Merge pull request #6790 from remitamine/tele13
[canal13cl] fix info extraction
2015-12-20 16:11:07 +01:00
remitamine
6882c0870e [tele13] improve extraction
- improve jwplayer setup regex
- sort formats
- remove duplicate formats
- update youtube test
2015-12-20 15:48:19 +01:00
remitamine
b0eeaf4f40 Merge pull request #6928 from remitamine/cnet
[cnet] fix extraction and extract more formats and metadata(closes #7003)
2015-12-20 12:59:35 +01:00
remitamine
c6ed6fadc2 [cnet] improve extraction
- relex data json regex
- extract the platform metadata once
- extract hds formats
- extract duration
- extract thumbnail
2015-12-20 12:43:00 +01:00
Sergey M․
e462474e1d [youtube] Generalize playlists extractor 2015-12-20 07:48:16 +06:00
Sergey M․
6b77d52b1f [test_utils] Add tests for encode_compat_str 2015-12-20 07:07:14 +06:00
Sergey M․
9b9c5355e4 Rename error_to_str to error_to_compat_str 2015-12-20 07:00:39 +06:00
Sergey M․
d890b4cc0a [nbc:news] Remove unnecessary compat_str 2015-12-20 06:43:42 +06:00
Sergey M․
2c74e6fa77 [YoutubeDL] Revert error_to_str for ExtractorError 2015-12-20 06:35:58 +06:00
Sergey M․
c0384f221e Use proper encoding on compat_str construction when necessary 2015-12-20 06:29:36 +06:00
Sergey M․
8e60dc7526 [utils] Add encode_compat_str 2015-12-20 06:26:26 +06:00
Sergey M․
8900ab4d9b [YoutubeDL] More error_to_str 2015-12-20 06:22:01 +06:00
Sergey M․
fb043a6e4e [YoutubeDL] Use error_to_str 2015-12-20 06:16:19 +06:00
Sergey M․
7f8b271465 Properly convert errors to strings 2015-12-20 05:27:38 +06:00
Sergey M․
fdae235858 [utils] Add error_to_str 2015-12-20 05:26:47 +06:00
remitamine
1deb710f26 [gputechconf] improve extraction 2015-12-19 23:59:00 +01:00
remitamine
ec6504b39c [gputechconf] Add new extractor(closes #5775) 2015-12-19 23:28:54 +01:00
Sergey M․
dd85e4d707 [extractor/common] Properly decode error string on python 2 (Closes #1354, closes #3957, closes #4037, closes #6449) 2015-12-20 02:43:50 +06:00
remitamine
fa64a84311 [faz] fix info extraction 2015-12-19 19:02:04 +01:00
remitamine
e0f06eae43 [fktv] fix info extraction 2015-12-19 18:26:28 +01:00
Sergey M․
0f206ee814 [toggle] Change IE_NAME 2015-12-19 23:11:23 +06:00
Sergey M․
cc0f378d54 [toggle] Rename to toggle 2015-12-19 19:59:00 +06:00
Sergey M․
e33c9cba7c [toggle] Improve _VALID_URL 2015-12-19 19:58:18 +06:00
Sergey M․
989e9f8ead [toggle] Improve formats extraction robustness 2015-12-19 19:52:37 +06:00
Sergey M․
8f097af4ec [toggle] Extract counters 2015-12-19 19:23:28 +06:00
Sergey M․
c40dbb19ab [toggle] Extract thumbnails 2015-12-19 19:19:26 +06:00
Sergey M․
ffaf6e66e3 [toggle] Improve 2015-12-19 19:16:49 +06:00
Sergey M․
74c730174f [toggle] Style 2015-12-19 19:06:05 +06:00
Sergey M․
c82a8dd14c [toggle] Remove unused imports 2015-12-19 19:04:38 +06:00
Sergey M․
f8253af561 [toggle] Use sanitized_Request 2015-12-19 19:03:55 +06:00
ping
ed370ff0e6 [togglesg] Fixes 2015-12-19 18:48:59 +06:00
ping
ee0f0393cf [togglesg] New extractor for toggle.sg 2015-12-19 18:48:46 +06:00
Yen Chi Hsuan
db2fe38b55 [utils] Support alternative timestamp format in TTML
Fixes #7608
2015-12-19 19:29:51 +08:00
Yen Chi Hsuan
d631d5f9f2 [utils] Fix TTML conversion
Tolerate invalid timestamps (closes #7909)
2015-12-19 18:21:42 +08:00
Sergey M․
4f29fa9906 [brightcove:new] Add test for ref: prefixed video id 2015-12-18 22:31:48 +06:00
Sergey M․
5b72fda140 [brightcove:new] Clarify ref: prefix 2015-12-18 22:22:41 +06:00
Sergey M․
f81ccbb3df [brightcove:new] Fix typo 2015-12-18 22:20:44 +06:00
Sergey M․
9fd0f67678 [brightcove:new] Add support for ref: preffixed video ids (Closes #7794) 2015-12-18 22:18:55 +06:00
Sergey M․
15d50aca64 [nowness] Add support for brightcove:new videos (Closes #7884) 2015-12-18 22:05:56 +06:00
Sergey M․
7234d1d9c7 [brightcove:new] Add _extract_url 2015-12-18 22:05:32 +06:00
Sergey M․
9796a9b20c [ndr] Fix description and upload date extraction (Closes #7893) 2015-12-18 21:34:17 +06:00
Philipp Hagemeister
016dd82050 release 2015.12.18 2015-12-18 14:21:30 +01:00
Sergey M․
b95779be21 [jsinterp] Extend function regex (Closes #7900, closes #7901) 2015-12-18 18:57:49 +06:00
Yen Chi Hsuan
10171468d9 [iqiyi] Update key (closes #7896) 2015-12-18 18:20:41 +08:00
Yen Chi Hsuan
bf597a6dd1 Merge pull request #7895 from Blue9/patch-1
Fix hyperlink to youtube-dl options
2015-12-18 18:10:41 +08:00
Gautam M
45dad8bab9 Fix hyperlink to youtube-dl options 2015-12-18 03:16:36 -05:00
remitamine
64ccbf18c0 [livestream] improve extraction, add support for live streams and extract more info and formats 2015-12-17 20:56:54 +01:00
Sergey M․
9dc1d94a0c [noco] Fix bitrates 2015-12-17 22:18:28 +06:00
Sergey M․
7824e1f6a6 [noco] Modernize 2015-12-17 22:16:58 +06:00
Sergey M․
2469a6aecb [noco] Adjust timestamp according to server time (Closes #7864) 2015-12-17 22:16:22 +06:00
Sergey M․
8f0afda028 [pbs] Extend _VALID_URL (Closes #7889) 2015-12-17 20:24:33 +06:00
remitamine
35e22b6b32 [youku] check for the correct variable 2015-12-17 12:51:50 +01:00
remitamine
323f82a7e0 [vimeo] add test for original format 2015-12-16 17:00:17 +01:00
remitamine
8534bf1f00 [vimeo] prefer original format 2015-12-16 16:36:25 +01:00
remitamine
eb4f27405b [vimeo] extract source file(closes #1072) 2015-12-16 09:43:53 +01:00
Sergey M․
2d3b70271c [rutube] Extend _VALID_URL 2015-12-16 04:44:17 +06:00
Sergey M․
ad1b6017cd [tf1] Fix tests 2015-12-15 21:36:59 +06:00
Sergey M․
05467d5a52 [tf1] Relax _VALID_URL 2015-12-15 21:31:58 +06:00
Sergey M․
ae5e94808e [tf1] Fix extraction (2) 2015-12-15 21:11:52 +06:00
Sergey M․
d7ffcfcf97 [tf1] Fix extraction (Closes #7873) 2015-12-15 21:09:14 +06:00
Sergey M․
0cb58b0259 [youtube] Extract alt_title and creator for music videos (Closes #7862) 2015-12-14 21:31:53 +06:00
Sergey M․
31b2051e21 [utils] Add remove_quotes 2015-12-14 21:30:58 +06:00
Yen Chi Hsuan
eb0bdc2c3e [novamov] Fix again 2015-12-14 02:50:59 +08:00
Yen Chi Hsuan
6583c741cd [novamov] Fix filekey extraction and reupload test video 2015-12-14 02:34:20 +08:00
Sergey M․
2d9295643e [footyroom] Skip test 2015-12-13 23:55:10 +06:00
Sergey M․
ee86e2c6d7 [novamov] Add support for mobile URLs 2015-12-13 19:16:01 +06:00
Yen Chi Hsuan
02a63fadc3 [infoq] Refactor and support the Chinese version
Closes #7576
2015-12-13 19:16:58 +08:00
Philipp Hagemeister
f3711edcf1 release 2015.12.13 2015-12-13 10:52:59 +01:00
Yen Chi Hsuan
22d07ba4e4 [infoq] Fix extraction for HTTP URLs (closes #7739) 2015-12-13 17:29:27 +08:00
Yen Chi Hsuan
f6abca506e [nowvideo] Skip deleted test case 2015-12-13 15:43:20 +08:00
Yen Chi Hsuan
b5424acdb9 [novamov] Improve existence checking 2015-12-13 15:43:20 +08:00
Yen Chi Hsuan
47c7f3d995 [novamov] Fix filekey extraction (closes #7764) 2015-12-13 15:43:20 +08:00
Sergey M․
0014ffa829 [funimation] Improve login 2015-12-13 07:17:42 +06:00
Sergey M․
c03943f394 Credit @Slyneth for funimation (#7775) 2015-12-12 15:19:23 +06:00
Yen Chi Hsuan
deb1e8d20e [youku] Put the missing item to get_hd 2015-12-12 15:49:19 +08:00
Yen Chi Hsuan
174964a7bc Credit @Celthi for fixing Youku extractor 2015-12-12 15:34:40 +08:00
Yen Chi Hsuan
9c568178fb Merge branch 'Celthi-youku_bugfix' 2015-12-12 15:26:43 +08:00
Yen Chi Hsuan
dbb7d7e26c [youku] Reorder format items 2015-12-12 15:24:58 +08:00
Yen Chi Hsuan
ade2340971 [youku] Simplify 2015-12-12 15:19:14 +08:00
Yen Chi Hsuan
4d77550cf0 [youku] Fix tests 2015-12-12 14:57:14 +08:00
Yen Chi Hsuan
c683454e7e [youku] MD5 is unstable 2015-12-12 14:48:46 +08:00
Yen Chi Hsuan
f133fd326b [youku] Cleanup and PEP8 2015-12-12 14:41:53 +08:00
Yen Chi Hsuan
1faa66f005 Merge branch 'youku_bugfix' of https://github.com/Celthi/youtube-dl into Celthi-youku_bugfix 2015-12-12 14:36:29 +08:00
Yen Chi Hsuan
8773f3158f [safari] Use postdata_urlencode (#7465) 2015-12-12 14:28:05 +08:00
Celthi
7e37c39485 merge data1 and data2 2015-12-12 11:26:15 +08:00
Celthi
14c17cafa1 add support to video protected by password 2015-12-12 11:21:44 +08:00
Celthi
8696a7fd13 fix the keyerror(mp4hd), todo support download the video protected by password 2015-12-12 10:44:21 +08:00
remitamine
bb4b8c57b9 [kaltura] extract more formats 2015-12-11 23:40:12 +01:00
Sergey M․
d63cfc3f0f [beeg] API v5 (Closes #7846) 2015-12-12 02:52:20 +06:00
Sergey M․
f377f44dae [funimation] Improve extraction 2015-12-12 01:02:54 +06:00
Sergey M․
0b1bb1ac3a [funimation] Add test for promotional video 2015-12-12 00:52:00 +06:00
Sergey M․
f208e52a76 [funimation] Fix promotional videos extraction 2015-12-12 00:48:09 +06:00
Sergey M․
b091529a3c [funimation] Extend _VALID_URL to match promotional videos 2015-12-12 00:43:03 +06:00
Sergey M․
b323a3cbff [funimation] Remove unused import 2015-12-12 00:39:44 +06:00
Sergey M․
b59623ef43 [funimation] Use mobile webpage for workaround hulu error 2015-12-12 00:38:58 +06:00
Sergey M․
9c163950da [funimation] Improve _VALID_URL 2015-12-11 23:20:10 +06:00
Sergey M․
d357bbd375 [funimation] Update test 2015-12-11 23:06:44 +06:00
Sergey M?
f542a3d26b [funimation] Improve extraction (Closes #7775) 2015-12-11 23:00:38 +06:00
Sergey M?
59a4ff482a [funimation] Real UA is required for login 2015-12-11 23:00:37 +06:00
Sergey M?
40ca5b04f4 [funimation] Remove unnecessary login form field 2015-12-11 23:00:37 +06:00
Sergey M?
411e5b88c9 [funimation] Fix login message 2015-12-11 23:00:37 +06:00
Sergey M?
b4c299bad0 [funimation] PEP 8 2015-12-11 23:00:36 +06:00
Muratcan Simsek
ab4bdc913f [funimation] Add new extractor
Update funimation.py

Update funimation.py

Removed unnecessary lines.

Update funimation.py

Added thumbnail and description.

Filename improvement.

fixed TEST.
2015-12-11 23:00:35 +06:00
remitamine
1fe248a51b Merge pull request #7833 from remitamine/ooyala
[ooyala] improve extraction
2015-12-11 17:55:32 +01:00
remitamine
2559b9d017 [wdr] extract all formats(closes #7788) 2015-12-11 17:31:33 +01:00
Sergey M․
4db43567e8 [downloader/f4m] Decode manifest before fixing 2015-12-11 20:28:44 +06:00
Celthi
5333842a1d According the blog and you-get fixed the issues #7627. 2015-12-11 20:08:14 +08:00
Celthi
98c3806b15 fix some not important codesnips 2015-12-11 19:18:14 +08:00
Yen Chi Hsuan
b6afc225c8 [vevo] Use _download_smil to provide informative error messages 2015-12-11 19:16:51 +08:00
Yen Chi Hsuan
ad30dc1e20 [vevo] Allow calling API without https
Not all proxies allow CONNECT
2015-12-11 19:07:13 +08:00
Yen Chi Hsuan
ff51983e15 [vevo] Handle videos without video_info (#7802) 2015-12-11 18:52:03 +08:00
Celthi
fdf01663d1 able to download first part of the video, but fail in the left part 2015-12-11 17:48:40 +08:00
Yen Chi Hsuan
4b94288301 [vevo] Use _match_id 2015-12-11 17:32:29 +08:00
Yen Chi Hsuan
4bf99ade15 [vevo] Catch the georestriction message (#7802) 2015-12-11 14:25:01 +08:00
remitamine
e3e166d8cf [ign] improve extraction and extract uploader_id 2015-12-11 00:08:54 +01:00
remitamine
3da3999612 [ultimedia] keep direct support for ultimedia videos 2015-12-10 23:04:28 +01:00
remitamine
d50116b8ac [vgtv] extract 5 digit length video ids using both xstream and vgtv 2015-12-10 22:18:42 +01:00
remitamine
75ed53320e [ooyala] improve extraction 2015-12-10 19:08:16 +01:00
Sergey M․
17b786ae73 [downloader/f4m] Fix malformed manifests (Closes #7823) 2015-12-10 22:59:50 +06:00
Sergey M
dfd42a43c3 Merge pull request #7821 from joksnet/patch-1
[FFmpegPostProcessor] Default of prefer ffmpeg
2015-12-10 22:10:20 +06:00
Philipp Hagemeister
f7b8dd63f0 release 2015.12.10 2015-12-10 17:05:13 +01:00
Sergey M․
a8abf124c8 [dailymotion] Add subtitles test URL for reference 2015-12-10 21:54:48 +06:00
Sergey M․
176ccefcd8 [pbs] PEP 8 2015-12-10 21:33:40 +06:00
Sergey M․
cbd2ffd031 [dailymotion] Fix subtitles extraction 2015-12-10 21:29:07 +06:00
Sergey M․
0b534d2adc [dailymotion] Restrict player v5 regex (Closes #7826) 2015-12-10 21:27:47 +06:00
Sergey M․
526a20bd16 [pbs] Clarify member stations' URLs 2015-12-10 21:04:26 +06:00
Celthi
51094b1b08 add cookie and referer in headers, change the video url 2015-12-10 21:42:12 +08:00
Philipp Hagemeister
f1ac2033ab Merge pull request #7827 from habi/master
Updating README.md
2015-12-10 13:54:18 +01:00
David Haberthür
a1b8d815f5 Reverting markup changes 2015-12-10 13:45:53 +01:00
David Haberthür
8b756bd98e Merge branch 'update-readme' 2015-12-10 13:20:25 +01:00
David Haberthür
46047c58d0 Updating README.md
- Harmonizing mentions of **youtube-dl** in the text
- Removing unnecessary Markdown markup for headers
- Adding some links
2015-12-10 13:19:26 +01:00
Juan M Martínez
374c761e77 [FFmpegPostProcessor] Default of prefer ffmpeg
When no `downloader` is passed to `FFmpegPostProcessor`
an exception was raised trying to get the prefer ffmpeg param.

    AttributeError: 'NoneType' object has no attribute 'params'

This fixes and defaults to `False`.
2015-12-09 20:56:00 -03:00
Sergey M․
6c7b26e13f [pbs] Make URLs lowercase 2015-12-09 21:28:04 +06:00
Sergey M․
b51b108045 [pbs] Clean up stations list from duplicates 2015-12-09 21:23:19 +06:00
Philipp Hagemeister
86e8c89488 release 2015.12.09 2015-12-09 15:32:26 +01:00
remitamine
41c3b34b1f [vgtv] add sortcut expressions to use the extractor 2015-12-09 10:50:11 +01:00
Jaime Marquínez Ferrándiz
47f48f5d85 [test/test_all_urls] Update pbs extractor name
It's in lowercase now (since e15e2ef7a0).
2015-12-08 21:12:13 +01:00
Sergey M․
e15e2ef7a0 [pbs] Add support for all member stations (#7674) 2015-12-09 01:51:34 +06:00
Sergey M․
d0c8b279da [pbs] Add another coveplayer pattern (Closes #7674) 2015-12-08 23:34:43 +06:00
Sergey M․
612d83b51d [pbs] Extend _VALID_URL 2015-12-08 23:28:36 +06:00
Sergey M
9c30efeb7e Merge pull request #7792 from jindaxia/fix_sohu_403forbidden
[sohu] Fix 403 forbidden
2015-12-08 22:54:14 +06:00
Sergey M․
39fa4cc107 [cliphunter] Fix extraction (Closes #7796) 2015-12-08 21:56:00 +06:00
Sergey M?
b09c122373 [nbc] Add another theplatform pattern 2015-12-08 21:35:42 +06:00
Sergey M
3348243b7b [README.md] Clarify verbose log requirements 2015-12-08 21:34:26 +06:00
Sergey M․
b46b65ed37 [nbc] Smuggle referer (Closes #7791) 2015-12-08 21:16:14 +06:00
Sergey M․
18e4088fad [theplatform] Add support for referer protected videos wuth explicit SMIL 2015-12-08 21:15:45 +06:00
虾哥哥
5fd6cd64f9 [sohu]fix 403 forbidden 2015-12-08 14:14:14 +08:00
Sergey M․
3d24bbfbe4 [YoutubeDL] Check formats for merge to be opposite (#7786) 2015-12-07 23:10:57 +06:00
Sergey M․
1775612512 [wimp] Improve video URL regex 2015-12-07 22:18:00 +06:00
Sergey M․
0d2d967cc7 [wimp] Fix extraction (Closes #7784) 2015-12-07 22:14:45 +06:00
Sergey M․
a5e52a1fd4 [vk] Add test for pladform embed 2015-12-07 22:05:54 +06:00
Sergey M․
291a93bafa [vk] Remove unnecessary message 2015-12-07 22:04:47 +06:00
Sergey M․
c4737bea17 [vk] Add support for pladform embeds (Closes #7780) 2015-12-07 22:03:52 +06:00
Sergey M․
45dad7ba1b [extractor/generic] Use _extract_url for pladform 2015-12-07 22:03:21 +06:00
Sergey M․
db7c9da871 [pladform] Add _extract_url routine 2015-12-07 22:02:45 +06:00
Philipp Hagemeister
bc92621ade release 2015.12.06 2015-12-06 18:51:25 +01:00
Sergey M․
fd8e559c3a [beeg] Switch to api v4 (Closes #7774) 2015-12-06 23:47:10 +06:00
Sergey M․
222e11d4ae [bbc] Add another pattern for playlist.sxml (Closes #7743) 2015-12-06 16:41:12 +06:00
Sergey M․
7d682f0acb [nowtv] Extend _VALID_URL to support jahr URLs (Closes #7755) 2015-12-06 16:18:59 +06:00
Yen Chi Hsuan
8364b6b0b1 [iqiyi] Update key
Closes #7772
2015-12-06 16:41:02 +08:00
remitamine
9f6b517671 [vgtv] extract all formats and improve extraction 2015-12-06 07:50:27 +01:00
Sergey M․
7ac40e5521 [nowvideo] Update test 2015-12-06 09:42:20 +06:00
Sergey M․
36066dd3ee [movshare] Rename to wholecloud 2015-12-06 09:42:00 +06:00
Sergey M․
636aa83ed3 [cloudtime] Add extractor 2015-12-06 09:37:38 +06:00
Sergey M․
33d152b6cc [novamov] Move all novamov based extractors to a single place
For easier navigation
2015-12-06 09:29:41 +06:00
remitamine
51c4fec0d5 [nba] use int_or_none for tbr 2015-12-05 21:04:22 +01:00
remitamine
0017486dca [nba] use int instead of int_or_none 2015-12-05 20:58:44 +01:00
Sergey M․
edc70f4aaf [pluralsight] Fix format code split while guessing quality 2015-12-06 01:40:13 +06:00
Sergey M․
756926ff00 [pluralsight] Add support for widescreen videos (Closes #7766) 2015-12-06 01:39:28 +06:00
remitamine
cb160dd531 [nba] handle format info properly 2015-12-05 18:47:15 +01:00
Jaime Marquínez Ferrándiz
77334ccb44 [metacafe] Fix age limit extraction 2015-12-05 16:12:50 +01:00
Jaime Marquínez Ferrándiz
796db21295 [metacafe] Fix video url extraction (closes #7763) 2015-12-05 16:12:02 +01:00
Philipp Hagemeister
535d7b681b release 2015.12.05 2015-12-05 16:01:37 +01:00
remitamine
7db2897ded [srgssr] handle all play urls only in SRGSSRIE and keep RTSIE for articles 2015-12-05 15:57:10 +01:00
Sergey M․
960e038886 [hypem] Modernize 2015-12-05 20:46:57 +06:00
Sergey M․
ea14422ff1 [hypem] Correctly handle cookies (Closes #7762) 2015-12-05 20:42:21 +06:00
Yen Chi Hsuan
38d05d17e5 [fc2] Fix test_FC2_1 2015-12-05 21:10:26 +08:00
Yen Chi Hsuan
db9bd5267f [keezmovies] Fix extraction
Also fixes #7752
2015-12-05 17:26:13 +08:00
remitamine
ab3b773bbe [acast] change tests into more stable casts and work with channel extractor only if it didn't match cast regex 2015-12-05 10:14:34 +01:00
Yen Chi Hsuan
0bc4ee60e0 [bbc] Fix test_BBC_6 2015-12-05 16:55:53 +08:00
Yen Chi Hsuan
a3ef0e1cdd [bbc.co.uk] Skip removed test video 2015-12-05 16:55:53 +08:00
Yen Chi Hsuan
679bacf0b5 [bbc.co.uk] Fix test_BBCCoUk
This is similar to the one in #7756, So also fixes #7756.
2015-12-05 16:55:53 +08:00
remitamine
02e3952f3b [trilulilu] handle errors 2015-12-05 09:42:00 +01:00
Yen Chi Hsuan
64b7e89c0c [srf] Support audios (closes #7760) 2015-12-05 16:26:30 +08:00
remitamine
bee4c5571a [clipfish] improve extraction 2015-12-04 16:38:05 +01:00
remitamine
96929dd1e8 [skynewsarabia] fix extractor name 2015-12-04 16:23:44 +01:00
remitamine
53e06b2507 [ooyala] fix duration scale 2015-12-04 16:18:02 +01:00
remitamine
b80d4bebf3 [nba] fix extraction errors 2015-12-04 16:04:22 +01:00
Jaime Marquínez Ferrándiz
55bec9b658 [clipfish] Remove unused import and style fix 2015-12-04 14:29:37 +01:00
Jaime Marquínez Ferrándiz
2a63b0f110 [mixcloud] Fix extraction of the audio url (fixes #7751) 2015-12-04 14:26:34 +01:00
remitamine
07b88cffce Merge pull request #7686 from remitamine/acast
[acast] Add new extractor
2015-12-04 09:10:02 +01:00
remitamine
58c8451f36 Merge pull request #7660 from remitamine/gameinformer
[gameinformer] Add new extractor(closes #3376)
2015-12-04 09:03:21 +01:00
remitamine
3047121c63 Merge pull request #7320 from remitamine/adobetv
[adobetv] improve extraction and add support specific language video,show and channel extraction
2015-12-04 08:54:06 +01:00
remitamine
7079f8ff1f [adobetv] use compat_str 2015-12-04 08:44:18 +01:00
remitamine
2c3b9f3570 [adobetv] use a variable for api base url 2015-12-04 08:37:08 +01:00
remitamine
fad2428f47 [gameinformer] split long line 2015-12-04 08:24:04 +01:00
remitamine
c3d3110f6a Merge pull request #7185 from remitamine/ooyala
[ooyala] extract more formats and metadata
2015-12-04 08:23:21 +01:00
remitamine
79ec00276c Merge pull request #7326 from remitamine/clipfish
[clipfish] improve info extraction
2015-12-04 07:57:58 +01:00
remitamine
9c117d345f [nba] improve(fixes #7068)
* extract more formats
* extract videos from team mini sites
* extract more metadata
2015-12-04 07:20:27 +01:00
remitamine
46cc1c65a4 [nba] use xpath utils 2015-12-04 07:09:48 +01:00
remitamine
71d9fe7818 [trilulilu] improve extraction 2015-12-04 06:53:33 +01:00
remitamine
4ccabf93db [trilulilu] fix info extraction 2015-12-04 00:51:02 +01:00
remitamine
6612a34939 [bilibili] flake8 2015-12-03 22:43:19 +01:00
remitamine
e5b4225f7c [audimedia] flake8 2015-12-03 22:25:08 +01:00
remitamine
b2ca35ddbc Merge pull request #7745 from remitamine/bilibili
[bilibili] use xpath_text and catch errors in xml document
2015-12-03 22:11:41 +01:00
remitamine
76ab842d9b [bilibili] use xpath_text and catch errors in xml document 2015-12-03 22:01:32 +01:00
remitamine
78653a33aa Merge remote-tracking branch 'upstream/master' into bliptv 2015-12-03 20:33:22 +01:00
remitamine
24dc1ed715 Merge pull request #7659 from remitamine/audimedia
[audimedia] Add new extractor(closes #7654)
2015-12-03 20:28:52 +01:00
remitamine
682d8dcd21 Merge pull request #7210 from remitamine/bilibili
[bilibili] fix info extraction(fixes #7182)
2015-12-03 20:16:54 +01:00
remitamine
640bb54e73 Merge branch 'master' of https://github.com/rg3/youtube-dl into bilibili 2015-12-03 20:05:11 +01:00
Sergey M․
e0977d7686 [beeg] Decrypt URL (Closes #7736) 2015-12-04 00:59:32 +06:00
remitamine
112ab398db Merge pull request #7681 from remitamine/skynewarabia
[skynewsarabia] Add new extractor
2015-12-03 18:41:38 +01:00
Sergey M․
af93fcfa05 [beeg] Update API URL (Closes #7736) 2015-12-03 23:23:36 +06:00
Sergey M․
62d231c004 [extractor/common] Clarify duration can be float 2015-12-03 20:55:02 +06:00
Sergey M․
49358274d7 [bbc] Fix _VALID_URL 2015-12-03 20:49:14 +06:00
Jaime Marquínez Ferrándiz
7b1e379ca9 [gametrailers] Fix extraction (fixes #7722)
They have stopped using the MTV system.
2015-12-03 13:47:21 +01:00
Sergey M․
22d7368dfb [bbc] Extract _ID_REGEX and ad one more video id pattern (Closes #7724) 2015-12-02 02:34:31 +06:00
Sergey M․
24121bc703 [udemy] Make lecture downloading fatal 2015-12-02 00:53:03 +06:00
Sergey M․
9fc87fa767 [udemy] Remove unused import 2015-12-02 00:51:47 +06:00
Sergey M․
328f82d59a [udemy] Semi-switch to api 2.0 (Closes #7704)
* Use api 2.0 to get lectures since it provides more formats
* Fix authorization for api 2.0
* Autotry enrolling in the course for single lectures
* Extract additional metadata rom asset['data']['outputs']
2015-12-02 00:48:27 +06:00
Sergey M․
78717fc328 [udemy] Allow authentication via cookies 2015-12-01 22:10:10 +06:00
Sergey M․
3b35c3425e [udemy] Extract formats from data.outputs (#7704) 2015-12-01 20:35:46 +06:00
Sergey M․
874ae0354e [nrk] Extract f4m formats and impose geo restriction only when not media URL (Closes #7715) 2015-12-01 18:35:24 +06:00
Sergey M․
4c6b4764f0 [youtube] Clarify itag 272 possible resolutions (#7699) 2015-11-30 20:42:05 +06:00
Sergey M․
59ee8a8647 [facebook] Make alternative title optional (Closes #7700) 2015-11-30 20:10:09 +06:00
Sergey M․
af284305d5 [vodlocker] Capture file not found error (Closes #7696) 2015-11-30 03:58:39 +06:00
Sergey M․
d53a4af1a4 [pornhub:playlist] Allow alphanumeric viewkeys (Closes #7695) 2015-11-30 03:47:01 +06:00
Sergey M․
2e1b928540 [youtube:playlist] Extend _VALID_URL 2015-11-29 21:04:11 +06:00
Sergey M․
040ac68679 [youtube] Extend _VALID_URL (Closes #7694) 2015-11-29 21:01:59 +06:00
Yen Chi Hsuan
049d71d874 [youtube] Simplify and make sure header values are strings 2015-11-29 19:52:48 +08:00
Sergey M․
bf2c8c8f82 [spiegel] Fix extraction (Closes #7693) 2015-11-29 17:03:33 +06:00
Yen Chi Hsuan
ef428960c9 Merge pull request #7691 from ryandesign/use-PYTHON-env-var
Always use PYTHON env var in Makefile
2015-11-29 13:08:46 +08:00
Yen Chi Hsuan
992fc9d6e1 [utils] Refactor handle_youtubedl_headers for future extension 2015-11-29 12:58:29 +08:00
Ryan Schmidt
8639f89f51 Always use PYTHON env var in Makefile 2015-11-28 22:56:24 -06:00
Yen Chi Hsuan
0424ec307b [utils] Correct docstring of YoutubeDLHandler 2015-11-29 12:46:04 +08:00
Yen Chi Hsuan
ac5a69af45 [youtube] Disable compression for live streams 2015-11-29 12:44:24 +08:00
Yen Chi Hsuan
94e8c80473 [downloader/hls] Respect Youtubedl-* headers 2015-11-29 12:43:59 +08:00
Yen Chi Hsuan
87f0e62d94 [utils] Separate codes for handling Youtubedl-* headers 2015-11-29 12:42:50 +08:00
remitamine
46b4070f3f Merge pull request #7057 from remitamine/cspan
[cspan] correct the clip info extraction (fixes #7335)
2015-11-28 21:36:52 +01:00
remitamine
2a776f9788 [cspan] change into a function 2015-11-28 20:22:31 +01:00
remitamine
f4c7ef9862 [skynewsarabia] return empty categories array if there is no topic 2015-11-28 18:20:44 +01:00
remitamine
50e12e9df1 [acast] Add new extractor 2015-11-28 18:10:37 +01:00
Sergey M․
b7faebbac8 [bloomberg] Improve formats extraction 2015-11-28 22:45:19 +06:00
Sergey M․
4191fdf147 [bloomberg] Improve video id regex 2015-11-28 22:41:39 +06:00
Sergey M․
9a4f12be98 [bloomberg] Modernize 2015-11-28 22:40:29 +06:00
Sergey M․
7ad4258add [bloomberg] Relax _VALID_URL even more (Closes #7685) 2015-11-28 22:39:36 +06:00
Sergey M․
9945c4994c Credit @reiv for soundcloud:search 2015-11-28 20:21:03 +06:00
Sergey M․
5faf9fed7e [youtube] Clarify rationale for yt:stretch validation 2015-11-28 18:50:21 +06:00
Sergey M
13a9b69b09 Merge pull request #7677 from lalinsky/yt-stretch-zero-height
[youtube] Ignore yt:stretch with zero width/height
2015-11-28 18:14:06 +06:00
remitamine
4975650e00 [skynewsarabia] fix IE_NAME 2015-11-28 12:20:39 +01:00
remitamine
0cc7178546 [skynewsarabia] Add new extractor 2015-11-28 11:48:18 +01:00
Lukáš Lalinský
41f24c321d [youtube] Use the existing w and h variables 2015-11-28 08:16:46 +01:00
Yen Chi Hsuan
4b3fbafdd2 [options] Changed wording for --list-formats
As proposed by @dstftw at 9bff48a0e7
2015-11-28 14:14:20 +08:00
Sergey M․
7ac40086f5 [dbtv] Expand _VALID_URL (Closes #7645) 2015-11-28 08:44:13 +06:00
Lukáš Lalinský
313dfc45f5 [youtube] Ignore yt:stretch with zero width/height 2015-11-28 01:07:07 +01:00
Philipp Hagemeister
78a55d7a28 release 2015.11.27.1 2015-11-27 16:39:59 +01:00
Philipp Hagemeister
bb6ac83698 release 2015.11.27 2015-11-27 16:32:51 +01:00
Yen Chi Hsuan
9d0e366880 [downloader/hls] Remove Accept-encoding from headers passed to ffmpeg
Fails for Youtube Gaming live streams (#7671)
2015-11-27 21:37:45 +08:00
Yen Chi Hsuan
9bff48a0e7 [options] Clarify --list-formats needs videos (closes #7669) 2015-11-27 21:24:39 +08:00
remitamine
60121eb514 [gameinformer] Add new extractor 2015-11-26 22:43:31 +01:00
remitamine
527ca1da4f [audimedia] Add new extractor(closes #7654) 2015-11-26 21:24:10 +01:00
Sergey M
7689413e42 [README.md] Mention mplayer and mpv in "other programs" question 2015-11-24 23:06:21 +06:00
Philipp Hagemeister
ba7a92b0ce release 2015.11.24 2015-11-24 07:46:38 +01:00
Philipp Hagemeister
4c7d816dd7 [jsinterp] Adapt to updated YouTube code generation (Fixes #7623, fixes #7624, fixes #7625, fixes #7626) 2015-11-24 07:45:38 +01:00
Philipp Hagemeister
032f2f260f README: Document which other programs may be helpful (Fixes #7621) 2015-11-24 03:38:46 +01:00
Philipp Hagemeister
20e98bf6c0 release 2015.11.23 2015-11-23 18:07:58 +01:00
Sergey M?
5c2266df4b Switch codebase to use sanitized_Request instead of
compat_urllib_request.Request

[downloader/dash] Use sanitized_Request

[downloader/http] Use sanitized_Request

[atresplayer] Use sanitized_Request

[bambuser] Use sanitized_Request

[bliptv] Use sanitized_Request

[brightcove] Use sanitized_Request

[cbs] Use sanitized_Request

[ceskatelevize] Use sanitized_Request

[collegerama] Use sanitized_Request

[extractor/common] Use sanitized_Request

[crunchyroll] Use sanitized_Request

[dailymotion] Use sanitized_Request

[dcn] Use sanitized_Request

[dramafever] Use sanitized_Request

[dumpert] Use sanitized_Request

[eitb] Use sanitized_Request

[escapist] Use sanitized_Request

[everyonesmixtape] Use sanitized_Request

[extremetube] Use sanitized_Request

[facebook] Use sanitized_Request

[fc2] Use sanitized_Request

[flickr] Use sanitized_Request

[4tube] Use sanitized_Request

[gdcvault] Use sanitized_Request

[extractor/generic] Use sanitized_Request

[hearthisat] Use sanitized_Request

[hotnewhiphop] Use sanitized_Request

[hypem] Use sanitized_Request

[iprima] Use sanitized_Request

[ivi] Use sanitized_Request

[keezmovies] Use sanitized_Request

[letv] Use sanitized_Request

[lynda] Use sanitized_Request

[metacafe] Use sanitized_Request

[minhateca] Use sanitized_Request

[miomio] Use sanitized_Request

[meovideo] Use sanitized_Request

[mofosex] Use sanitized_Request

[moniker] Use sanitized_Request

[mooshare] Use sanitized_Request

[movieclips] Use sanitized_Request

[mtv] Use sanitized_Request

[myvideo] Use sanitized_Request

[neteasemusic] Use sanitized_Request

[nfb] Use sanitized_Request

[niconico] Use sanitized_Request

[noco] Use sanitized_Request

[nosvideo] Use sanitized_Request

[novamov] Use sanitized_Request

[nowness] Use sanitized_Request

[nuvid] Use sanitized_Request

[played] Use sanitized_Request

[pluralsight] Use sanitized_Request

[pornhub] Use sanitized_Request

[pornotube] Use sanitized_Request

[primesharetv] Use sanitized_Request

[promptfile] Use sanitized_Request

[qqmusic] Use sanitized_Request

[rtve] Use sanitized_Request

[safari] Use sanitized_Request

[sandia] Use sanitized_Request

[shared] Use sanitized_Request

[sharesix] Use sanitized_Request

[sina] Use sanitized_Request

[smotri] Use sanitized_Request

[sohu] Use sanitized_Request

[spankwire] Use sanitized_Request

[sportdeutschland] Use sanitized_Request

[streamcloud] Use sanitized_Request

[streamcz] Use sanitized_Request

[tapely] Use sanitized_Request

[tube8] Use sanitized_Request

[tubitv] Use sanitized_Request

[twitch] Use sanitized_Request

[twitter] Use sanitized_Request

[udemy] Use sanitized_Request

[vbox7] Use sanitized_Request

[veoh] Use sanitized_Request

[vessel] Use sanitized_Request

[vevo] Use sanitized_Request

[viddler] Use sanitized_Request

[videomega] Use sanitized_Request

[viewvster] Use sanitized_Request

[viki] Use sanitized_Request

[vk] Use sanitized_Request

[vodlocker] Use sanitized_Request

[voicerepublic] Use sanitized_Request

[wistia] Use sanitized_Request

[xfileshare] Use sanitized_Request

[xtube] Use sanitized_Request

[xvideos] Use sanitized_Request

[yandexmusic] Use sanitized_Request

[youku] Use sanitized_Request

[youporn] Use sanitized_Request

[youtube] Use sanitized_Request

[patreon] Use sanitized_Request

[extractor/common] Remove unused import

[nfb] PEP 8
2015-11-23 21:56:23 +06:00
Sergey M․
67dda51722 Rename compat_urllib_request_Request to sanitized_Request and move to utils 2015-11-23 21:55:15 +06:00
Sergey M․
e4c4bcf36f [vimeo] Use compat_urllib_request_Request 2015-11-23 21:55:14 +06:00
Sergey M․
82d8a8b6e2 [YoutubeDL] Wrap plain-text URL requests in compat_urllib_request_Request 2015-11-23 21:55:13 +06:00
Sergey M․
13a10d5aa3 [compat] Add compat_urllib_request_Request
This is actually not a compatibility routine but rather a workaround for URLs without protocol specified.
The protocol-less URL is treated as HTTP one since it's most probable scenario and it will most likely to
redirect to HTTPS if HTTPS was actually expected. This routine could also be useful for any Request
preprocessing that may be added in future.
2015-11-23 21:55:12 +06:00
Sergey M․
9022726446 [youtube] Fix test 2015-11-23 21:37:21 +06:00
Sergey M․
94bfcd23b7 [youtube] Fix test 2015-11-23 21:35:23 +06:00
Sergey M․
526b3b0716 [youtube] Clarify ytplayer.config extraction rationale 2015-11-23 21:14:03 +06:00
Sergey M․
61f92af1cf [youtube] Add test with '};' in tags 2015-11-23 21:02:37 +06:00
Sergey M․
a72778d364 [youtube] Improve ytplayer.config extraction 2015-11-23 21:00:06 +06:00
Sergey M
5ae17037a3 Merge pull request #7599 from lalinsky/fix-youtube
[youtube] More explicit player config JSON extraction (fixes #7468)
2015-11-23 20:52:23 +06:00
Sergey M․
02f0da20b0 [pluralsight] Add support for alternative webpage layout (Closes #7607) 2015-11-23 03:08:38 +06:00
Lukáš Lalinský
b41631c4e6 [youtube] Send the list of patterns directly to _search_regex 2015-11-22 13:53:26 +01:00
Lukáš Lalinský
0e49d9a6b0 [youtube] Fall back to the original regex for ytplayer.config 2015-11-22 13:49:33 +01:00
Sergey M․
4a7d108ab3 [rutube] Remove unnecessary print 2015-11-22 18:24:17 +06:00
Lukáš Lalinský
3cfd000849 [youtube] More explicit player config JSON extraction (fixes #7468) 2015-11-22 13:14:35 +01:00
Sergey M․
1b38185361 [pornhd] Fix title extraction (Closes #7596) 2015-11-22 18:08:30 +06:00
Sergey M․
9cb9a5df77 [utils] Check ext with trailing slash against the list of known extensions 2015-11-22 17:27:13 +06:00
Sergey M․
5035536e3f [test_utils] Add tests for determine_ext 2015-11-22 06:33:52 +06:00
Sergey M․
3e12bc583a [utils] Improve determine_ext (Closes #7593) 2015-11-22 06:29:39 +06:00
Sergey M․
e568c2233e [youtube] Add test for multi page list of playlists 2015-11-22 05:03:23 +06:00
Sergey M․
061a75edd6 [youtube] Extract base for entry list extractors and support multi page lists of playlists 2015-11-22 05:01:01 +06:00
Philipp Hagemeister
82c4d7b0ce release 2015.11.21 2015-11-21 23:36:27 +01:00
Sergey M․
136dadde95 [youtube:show] Rework in terms of playlists base extractor 2015-11-22 04:18:20 +06:00
Sergey M․
0c14841585 [youtube:user:playlists] Add extractor (Closes #3817) 2015-11-22 04:17:07 +06:00
Sergey M․
0eebf34d9d [pluralsight] Rephrase 2015-11-22 00:58:25 +06:00
Sergey M․
cf186b77a7 [pluralsight] Clarify allowed qualities guessing rationale 2015-11-22 00:56:40 +06:00
Sergey M․
a3372437bf [soundcloud] Remove unused variable 2015-11-22 00:49:58 +06:00
Sergey M․
4c57b4853d [pluralsight] Until listing formats request only single format 2015-11-22 00:42:58 +06:00
Sergey M․
38eb2968ab [pluralsight] Clarify and randomize ViewClip sleep interval 2015-11-22 00:07:09 +06:00
Andrzej Lichnerowicz
bea56c9569 [pluralsight] prevent error 429 when sensing video formats 2015-11-21 23:49:58 +06:00
Sergey M․
7e508ff2cf [pluralsight] Improve login detection 2015-11-21 21:49:37 +06:00
Sergey M․
563772eda4 [pluralsight] Extract base class 2015-11-21 21:37:29 +06:00
Sergey M․
0533915aad [pluralsight] Update some more URLs 2015-11-21 21:35:08 +06:00
Sergey M․
c3a227d1c4 [pluralsight] Update _LOGIN_URL 2015-11-21 21:25:48 +06:00
Sergey M․
f6c903e708 [soundcloud:search] Simplify (Closes #7213) 2015-11-21 21:21:21 +06:00
Sergey M․
7dc011c063 [soundcloud:search] Remove no track results message 2015-11-21 21:00:42 +06:00
Sergey M․
4e3b303016 [soundcloud:search] Fix non-ASCII searches 2015-11-21 20:55:48 +06:00
Sergey M․
7e1f5447e7 [utils] Improve encode_dict 2015-11-21 20:46:33 +06:00
Sergey M․
7e3472758b [soundcloud:search] PEP 8 2015-11-21 20:04:35 +06:00
reiv
328a22e175 [soundcloud] Remove limit on search results 2015-11-21 19:41:36 +06:00
reiv
417b453699 [soundcloud] Use correct error message conventions 2015-11-21 19:41:31 +06:00
reiv
6ea7190a3e Rewrite as list comprehension. 2015-11-21 19:41:26 +06:00
reiv
b54b08c91b Simplify with itertools.islice(). 2015-11-21 19:41:19 +06:00
reiv
c30943b1c0 Fix some compatibility issues, cleanup. 2015-11-21 19:41:15 +06:00
reiv
2abf7cab80 [soundcloud] Add Soundcloud search extractor 2015-11-21 19:41:08 +06:00
Sergey M․
4137196899 [rutube] Extract all formats 2015-11-21 18:02:52 +06:00
Sergey M․
019839faaa [extractor/common] Use baseURL from f4m manifest for recursive manifest extraction 2015-11-21 18:01:39 +06:00
Sergey M․
f52183a878 [rutube:embed] Extend _VALID_URL (Closes #7588) 2015-11-21 17:39:24 +06:00
Yen Chi Hsuan
750b9ff032 [generic] Extract M3U8 formats (closes #7582) 2015-11-21 16:43:01 +08:00
Yen Chi Hsuan
28602e747c [generic] Refactor 2015-11-21 16:08:54 +08:00
Yen Chi Hsuan
6cc37c69e2 [generic] Unescape URLs from JWPlayer (#7582) 2015-11-21 14:12:34 +08:00
Sergey M․
a5cd0eb8a4 [pluralsight:course] Improve _VALID_URL 2015-11-21 08:32:48 +06:00
Sergey M․
c23e266427 [pluralsight] Do not require pluralsight account
Looks like some courses are available without pluralsight account
2015-11-21 08:25:52 +06:00
Sergey M․
651acffbe5 [pluralsight] Update ViewClip URL 2015-11-21 08:21:33 +06:00
Sergey M․
71bd93b89c [pluralsight] Do not rely on argument order in query (Closes #7583) 2015-11-21 08:08:34 +06:00
Sergey M․
6da620de58 [kaltura] Add test for referrer protected video (#7409) 2015-11-21 01:40:28 +06:00
Sergey M․
bdceea7afd [kaltura] Clean description 2015-11-21 01:39:29 +06:00
Sergey M․
d80a39cec8 [kaltura] Improve 2015-11-21 01:38:08 +06:00
Sergey M․
5b5fae5f20 [generic] Use referrer from source kaltura embed URLs (#7409) 2015-11-21 01:35:58 +06:00
Sergey M․
01b06aedcf [kaltura] Add support for referrer protected videos (#7409) 2015-11-21 01:34:02 +06:00
Sergey M
c711383811 Merge pull request #7579 from ashutosh-mishra/typo_fix
Typo fix, found while going through the code.
2015-11-20 23:24:54 +06:00
ashutosh-mishra
17cc153435 Typo fix, found while going through the code. 2015-11-20 22:51:46 +05:30
Sergey M․
67446fd49b [instagram] Improve _VALID_URL (Closes #7568) 2015-11-20 04:07:39 +06:00
Sergey M․
325bb615a7 [theplatform] Style 2015-11-19 22:58:43 +06:00
Sergey M․
ee5cd8418e [theplatform] Handle protocolless feed URLs (Closes #7532) 2015-11-19 22:58:29 +06:00
Sergey M․
342609a1b4 [bloomberg] Reax _VALID_URL (Closes #7546) 2015-11-19 22:55:06 +06:00
Sergey M
f270cf1a26 Merge pull request #7519 from barlik/master
Clarify that automatic subtitles are generated.
2015-11-19 22:44:08 +06:00
hedii
371c3b796c [YoutubeDL] Add playlist finished downloading message (Closes #7517)
Conflicts:
	youtube_dl/YoutubeDL.py
2015-11-19 22:39:02 +06:00
Sergey M․
6b7ceee1b9 [vimeo] Add test for #7552 2015-11-19 22:31:16 +06:00
Sergey M․
fdb20a27a3 [vimeo:group] Improve _VALID_URL (Closes #7552) 2015-11-19 22:30:58 +06:00
Sergey M․
2c94198eb6 [vimeo] Improve playlists extraction 2015-11-19 21:29:32 +06:00
Philipp Hagemeister
e8110b8125 release 2015.11.19 2015-11-19 15:35:13 +01:00
Yen Chi Hsuan
c39fd7b1ca [UDNEmbed] Fix generic UDN pages
Closes #7547
2015-11-19 22:32:56 +08:00
Sergey M․
a9c09a7c62 [pbs] Update API URL (Closes #7565) 2015-11-19 20:25:28 +06:00
Philipp Hagemeister
82beaabb41 release 2015.11.18 2015-11-18 19:23:04 +01:00
Jaime Marquínez Ferrándiz
63b4295d20 [youtube:playlist] fix title extraction (fixes #7544 and #7545) 2015-11-18 18:28:05 +01:00
Sergey M․
312a3f389b [pbs] Extend _VALID_URL 2015-11-18 00:46:41 +06:00
Jaime Marquínez Ferrándiz
609af1ae1c [dplay] Add 'encoding: utf-8' line 2015-11-17 17:58:16 +01:00
Jaime Marquínez Ferrándiz
4cd759f73d [dplay] Add extractor (closes #7515)
Since I haven't figured out how to download the hds stream, we use the hls one instead.
2015-11-17 17:52:29 +01:00
Jaime Marquínez Ferrándiz
e156e70281 [rtve] Remove unused import 2015-11-17 16:23:29 +01:00
Sergey M․
9b464929fe [rtve.es:alacarta] Fix extraction 2015-11-17 21:11:42 +06:00
Sergey M
0c176d7bde Merge pull request #7514 from ping/patch-7301
[neteasemusic] Fixes #7301
2015-11-16 14:25:29 +00:00
Sergey M․
7a3f0c00ad [utils] Style 2015-11-16 20:24:09 +06:00
Sergey M․
7aefc49c40 [utils] Skip invalid/non HTML entities (Closes #7518) 2015-11-16 20:20:16 +06:00
Rastislav Barlik
741dd8ea65 Clarify that automatic subtitles are generated.
It wasn't clear what automatic word mean.
2015-11-16 14:15:25 +00:00
ping
76adc82068 [neteasemusic] Fixes #7301 2015-11-16 11:39:18 +08:00
Philipp Hagemeister
bd1512d196 release 2015.11.15 2015-11-15 22:16:08 +01:00
Sergey M․
9a4acbfaf5 [theplatform] Add test for #7385 2015-11-16 00:28:04 +06:00
Sergey M․
ad1f4e7902 [theplatform] Handle explicitly specified SMIL (#7385) 2015-11-15 23:43:23 +06:00
Sergey M
b328295910 Merge pull request #7436 from davidbz/add_proxy_to_update_procedure
Add proxy support for update_self
2015-11-15 11:13:22 +00:00
David Ben Zakai
828b2a5cd9 Removing an unnecessary import 2015-11-15 09:40:32 +02:00
Sergey M․
2ff7cbeaaa [nowtv:list] Add extrator (Closes #7147) 2015-11-15 08:30:13 +06:00
Sergey M․
b2f7738830 [dumpert] Use original protocol 2015-11-15 02:25:00 +06:00
Sergey M․
dc0279532a [dumpert] Disable SSL (Closes #7504) 2015-11-15 02:21:24 +06:00
Sergey M․
0c59d02bdc [periscope] Relax _VALID_URL (Closes #7503) 2015-11-15 00:20:17 +06:00
Jaime Marquínez Ferrándiz
0f72beb515 [periscope] Remove unused imports 2015-11-14 18:31:33 +01:00
Sergey M․
d781e29316 [bbc] Allow selectionunavailable errors (Closes #7502) 2015-11-14 23:08:13 +06:00
Sergey M․
3b3e8ed332 [quickscope] Remove extractor (2) 2015-11-14 22:34:30 +06:00
Sergey M․
dcdfeb33d2 [quickscope] Remove extractor 2015-11-14 22:32:54 +06:00
Sergey M․
0d85c3a732 [lynda] Style 2015-11-14 16:44:24 +06:00
Sergey M․
903d136942 [lynda] Logout only when login info present (Closes #7500) 2015-11-14 16:43:58 +06:00
Yen Chi Hsuan
9d584da7d0 [xfileshare] Correct _VALID_URL 2015-11-14 17:27:32 +08:00
Yen Chi Hsuan
31752f76f7 [twitter:card] Add add_ie for the external test 2015-11-14 17:03:26 +08:00
Yen Chi Hsuan
5f1b2aea80 [twitter:card] Support vine.co embeds (closes #7496) 2015-11-14 17:02:07 +08:00
Sergey M․
4479600d57 [instagram] Add test for #7497 2015-11-14 07:21:20 +06:00
Sergey M․
a90189c3ad [instagram] Relax _VALID_URL (Closes #7497) 2015-11-14 07:20:33 +06:00
Sergey M․
d8a1caf04f [brightcove:new] Style 2015-11-14 06:22:12 +06:00
Sergey M․
cb33d389ed [brightcove:new] Add test with rtmp streams 2015-11-14 06:20:09 +06:00
Sergey M․
967e0955f0 Merge branch 'remitamine-brightcove_in_page_embed' 2015-11-14 06:11:49 +06:00
Sergey M․
e01b432ad3 [brightcove:new] Fix test 2015-11-14 06:11:17 +06:00
Sergey M․
fd91257c40 [brightcove] Order imports alphabetically 2015-11-14 06:08:36 +06:00
Sergey M․
c7b959ce38 [utils] Remove unused function 2015-11-14 06:07:44 +06:00
Sergey M․
75eac8961e [brightcove] Remove unused import 2015-11-14 06:07:24 +06:00
Sergey M․
3b7d9aa487 Rename all references to legacy studio Brightcove extractor 2015-11-14 06:05:46 +06:00
Sergey M․
1f4b722b00 [generic] Clarify Brightcove Legacy Studio comment 2015-11-14 06:03:32 +06:00
Sergey M․
f6519f89b0 [generic] Extract Brightcove New Studio embeds 2015-11-14 06:03:07 +06:00
Sergey M․
24af85298e [brightcove] Fix _extract_urls 2015-11-14 06:01:56 +06:00
Sergey M․
e721d857c2 [brightcove] Clarify IE_NAMEs 2015-11-14 05:56:51 +06:00
Sergey M․
5c17f0a67a [brightcove:embedinpage] Rename extractor to brightcove new
It's not actually embed_in_page but "New Studio" and allows both iframe and embed_in_page embeds
2015-11-14 05:55:59 +06:00
Sergey M․
4fcaa4f4a5 [brightcove] Rename extractor to brightcove legacy
Old embedding approaches are now "Legacy Studio"
2015-11-14 05:54:16 +06:00
Sergey M․
536f819eda [brightcove] Imrove extraction of new embeds 2015-11-14 05:51:05 +06:00
Sergey M․
a662489877 [brightcove:embedinpage] Make more robust and extract rtmp streams 2015-11-14 05:09:50 +06:00
Sergey M․
a2973eb597 Merge branch 'brightcove_in_page_embed' of https://github.com/remitamine/youtube-dl into remitamine-brightcove_in_page_embed 2015-11-14 01:23:15 +06:00
Sergey M․
4e21b3a94f [cbs] Use android UA for higher quality streams (Closes #7490) 2015-11-14 00:25:28 +06:00
Jaime Marquínez Ferrándiz
b703ebeeaf [twitter] Don't fail if the description doesn't contain an URL (fixes #7489) 2015-11-13 19:09:42 +01:00
Jaime Marquínez Ferrándiz
b84a5f0337 [twitter] Update tests checksums 2015-11-13 18:55:07 +01:00
Philipp Hagemeister
a1ec9a7553 release 2015.11.13 2015-11-13 11:07:30 +01:00
Sergey M․
91d644b5ba [ruutu] Relax formats extraction 2015-11-13 02:43:27 +06:00
Sergey M․
5d6c3d6a66 [ruutu] Skip NOT-USED URLs(Closes #7478) 2015-11-13 02:41:38 +06:00
Jaime Marquínez Ferrándiz
1ebb4717df [cbsnews] Fix construction of 'play_path' in some videos (fixes #7394) 2015-11-12 21:02:56 +01:00
Yen Chi Hsuan
cf5881fc4d Credit @ferama
For providing idea for vidto.me (#7167) and extending nowvideo support (#6760)
2015-11-12 21:33:46 +08:00
Sergey M․
fcd817a326 [vimeo] Fix extraction (Closes #7460) 2015-11-12 03:56:11 +06:00
Sergey M․
031ec536f0 [gorillavid] Rename to xfileshare 2015-11-11 23:00:53 +06:00
Sergey M․
668db403f9 [gorillavid] Add test for vidto.me and strip title 2015-11-11 22:47:28 +06:00
Sergey M․
b9ad101926 [gorillavid] Add support for vidto.me 2015-11-11 22:44:03 +06:00
Sergey M․
435911029f [vidto] Remove extractor 2015-11-11 22:43:17 +06:00
Sergey M․
699ed30cee [novamov] Modernize 2015-11-11 22:34:49 +06:00
Sergey M․
9eab37dca0 [vimeo] Simplify set cookie 2015-11-11 22:32:13 +06:00
Sergey M․
9a8a12b7d8 [vimeo] Append cookies instead of overriding 2015-11-11 22:23:23 +06:00
Yen Chi Hsuan
a4c2ab35c1 Merge remote-tracking branch 'upstream/master' 2015-11-12 00:08:42 +08:00
Sergey M․
3d9c4bf09a [vimeo] Fix password protected videos (Closes #7451) 2015-11-11 21:21:21 +06:00
Yen Chi Hsuan
8b8a39e279 [vidto] Several simplifications and improvements
1. Use InfoExtractor._hidden_inputs
2. Fetch title from <title> tag
3. Cookies are preserved automatically
4. Use single quotes everywhere
5. Do not declare variables for one-time use only
2015-11-11 23:17:59 +08:00
Sergey M․
82393e2bb2 [novamov] Follow continue-to-the-video button if any (Closes #7330) 2015-11-11 21:02:05 +06:00
Sergey M․
2eb99a4b98 [nowvideo] Replace main host to resolvable one 2015-11-11 21:00:23 +06:00
Yen Chi Hsuan
6abce58a12 Credit @sieben for fixing wsj extractor 2015-11-11 20:16:18 +08:00
Yen Chi Hsuan
990e6e8fa3 [vidto] Minor fixes
1. import order
2. fatal is already True in helper functions
2015-11-11 20:13:03 +08:00
Yen Chi Hsuan
bfd88516eb Merge pull request #7454 from sieben/duplicate_keys
Remove duplicate key
2015-11-11 20:00:13 +08:00
Rémy Léone
d8b7e80d29 Remove duplicate key 2015-11-11 12:00:31 +01:00
Yen Chi Hsuan
37120974dc [vidto] PEP8 2015-11-11 02:02:46 +08:00
Marco Ferragina
42fc93c709 vidto extractor: code cleanup 2015-11-11 01:58:47 +08:00
Marco Ferragina
a625e56543 [vidto] Add extractor 2015-11-11 01:52:43 +08:00
Sergey M․
9b738b2caa [funnyordie] Fix extraction and extract m3u8 formats 2015-11-10 21:32:54 +06:00
David Ben Zakai
90bb5667bf Using internal opener 2015-11-10 17:15:23 +02:00
David Ben Zakai
d3d3e2e3aa Adding proxy to update procedure 2015-11-10 16:31:42 +02:00
Philipp Hagemeister
37ca7b22b5 release 2015.11.10 2015-11-10 11:39:06 +01:00
Yen Chi Hsuan
50f84a9ae1 [youtube] Support new base.js html5 player 2015-11-10 12:55:01 +08:00
Yen Chi Hsuan
ff29bf81f8 [jsinterp] Support alternative function definition form 2015-11-10 12:54:02 +08:00
Sergey M․
b25f753397 [kaltura] Relax _VALID_URL 2015-11-09 20:50:43 +06:00
Sergey M․
6a5d6de1e3 [generic] Improve kaltura embed detection (2) 2015-11-09 20:49:42 +06:00
Sergey M․
1c31a5b0e0 [generic] Improve kaltura embed detection (Closes #7409) 2015-11-09 20:49:06 +06:00
Sergey M․
4f5cdf7c9b [cmt] Extend _VALID_URL to support shows (Closes #7407) 2015-11-09 01:48:46 +06:00
Sergey M․
f09a767d31 [mit] Allow external embeds (Closes #7406) 2015-11-08 19:19:13 +06:00
Sergey M․
cc8034cc4c [extremetube] Modernize 2015-11-08 19:14:39 +06:00
Sergey M․
50506cb607 [extremetube] Fix extraction (Closes #7163) 2015-11-08 19:01:37 +06:00
Sergey M․
aa8d2d5be6 [rtbf] Make www optional in _VALID_URL 2015-11-08 17:03:21 +06:00
Sergey M․
114e6025b0 [rtbf] Expand _VALID_URL (Closes #7402) 2015-11-08 17:01:45 +06:00
Sergey M․
fda2717ef9 [movieclips] Add coding cookie 2015-11-08 16:56:20 +06:00
Frans de Jonge
937511dfc0 Added support for the RTBF OUFtivi subpage 2015-11-08 16:51:13 +06:00
Jaime Marquínez Ferrándiz
d5c181a14e [movieclips] Fix extraction (fixes #7404)
They use theplatform now.
Changed the test, because the old one seems to be georestricted.
2015-11-08 11:49:51 +01:00
Sergey M?
e8ce2375e0 [viidea] Improve and cleanup (Closes #7390)
* Optimize requests for multipart videos
* Fix cfg regex
* Improve titles and identifiers
2015-11-08 06:55:52 +06:00
remitamine
6fdb39ded1 [viidia] Cleaup
[viidea] extract playlist if lecture is an event

[viidia] use compat_str
2015-11-08 06:55:51 +06:00
remitamine
8e3a2bd620 [viidea] fix _VALID_URL regex and tests 2015-11-08 06:55:51 +06:00
remitamine
a06bf87a2c [viidea] add support for sites using viidea service 2015-11-08 06:55:50 +06:00
remitamine
ee4337d100 [videolecture] add support for multi part videos 2015-11-08 06:55:50 +06:00
Jaime Marquínez Ferrándiz
cff551c0b0 [googleplus] Fix extraction of formats 2015-11-07 18:43:22 +01:00
remitamine
63b728f06f [bleacherreport] Add new Extractor 2015-11-07 16:56:21 +01:00
remitamine
3793090b1b [amp] Add generic extractor for Akamai AMP feeds and use it in dramafever and foxnews extractors 2015-11-07 16:54:35 +01:00
Sergey M․
6d02b9a392 [crunchyroll] Fix description extraction 2015-11-07 20:02:39 +06:00
Sergey M․
2c740cf28d [crunchyroll] Simplify description extraction 2015-11-07 19:29:49 +06:00
Sergey M․
5214f1e31d [crunchyroll] Fix title extraction (Closes #7396) 2015-11-07 19:29:42 +06:00
Sergey M․
5d0f84d32c [beeg] Skip empty URLs (Closes #7392) 2015-11-07 06:23:00 +06:00
Mister Hat
ee223abb88 [vidzi] fixed. finds url from hash and host in script
Closes #7386.
2015-11-06 20:24:02 +01:00
Sergey M․
21d0c33ecd [pbs] Make flp embed lookup non fatal 2015-11-07 01:08:40 +06:00
Sergey M․
8b6d9406db [pbs] Add test for flp frontline embeds 2015-11-07 00:42:30 +06:00
Sergey M․
686f98816e [pbs] Add support for flp frontlines (Closes #7369) 2015-11-07 00:39:16 +06:00
Sergey M․
0fa6b17dcc [pbs] Simplify and speed up player URL search 2015-11-06 23:45:26 +06:00
Sergey M․
472404953a [miomio] PEP 8 2015-11-06 23:28:14 +06:00
Sergey M․
ae4ddf9efa [lynda] PEP 8 2015-11-06 23:27:38 +06:00
Sergey M․
ea8ed40b2f [lynda] Modernize and make more robust 2015-11-06 23:24:39 +06:00
Sergey M․
71bb016160 [lynda:course] Modernize and make more robust 2015-11-06 23:10:07 +06:00
Sergey M․
179ffab69c [lynda:course] Force log out (Closes #7361) 2015-11-06 23:06:13 +06:00
Sergey M․
deb85c32bb [postprocessor/ffmpeg] Use ffmpeg as prefix since it's used all over the places (Closes #7371) 2015-11-06 21:56:31 +06:00
Sergey M․
92366d189e [njoy:embed] Relax _VALID_URL 2015-11-06 21:09:17 +06:00
Sergey M․
81413c0165 [ndr:embed] Relax _VALID_URL 2015-11-06 21:08:52 +06:00
Sergey M․
1e2eb4b40d [njoy] Relax _VALID_URL 2015-11-06 21:08:21 +06:00
Sergey M․
01003d072c [ndr] Add test for #7383 2015-11-06 21:07:52 +06:00
Sergey M․
5003e4283b [ndr] Relax _VALID_URL (Closes #7383) 2015-11-06 21:06:44 +06:00
Yen Chi Hsuan
123c781044 Merge pull request #7382 from remitamine/miomio
[miomio] fix info extraction (fixes #7366)
2015-11-06 14:50:39 +08:00
remitamine
a641b24592 [cnet] skip hls_phone if hls_tablet is present 2015-11-06 07:23:03 +01:00
remitamine
e68dd1921a [miomio] use the formats urls headers for downloading xml 2015-11-06 06:33:05 +01:00
remitamine
6953d8e95a [miomio] fix info extraction (fixes #7366) 2015-11-06 02:09:55 +01:00
remitamine
967c9076a3 raise ExtractorError if the page doesn't contain a video 2015-11-05 18:01:13 +01:00
Sergey M․
b3613d36da [YoutubeDL] Sanitize path after output template substitution (Closes #7367) 2015-11-05 04:39:21 +06:00
Sergey M․
53472df857 [periscope] Add note on where to find alive example URLs 2015-11-05 02:56:44 +06:00
Sergey M․
2549e113b8 [periscope] Add test for broadcast_id based URL 2015-11-05 02:55:53 +06:00
Sergey M․
b15c44cd36 [periscope] Add support for videos with broadcast_id (Closes #7359) 2015-11-05 02:51:30 +06:00
Sergey M․
f93ded9852 [prosiebensat1] Add support for .ch domains (Closes #7365) 2015-11-05 01:54:49 +06:00
Sergey M․
89ea063eeb [youtube] Clarify rationale for preferring a video info with token (#7362) 2015-11-04 22:49:23 +06:00
Sergey M․
44b2264fea [youtube] Prefer video_info with token available 2015-11-04 22:12:24 +06:00
Jaime Marquínez Ferrándiz
cb5a470635 [vimeo] Remove unused import 2015-11-04 16:18:51 +01:00
Sergey M․
17d1900581 [vk] Fix view count extraction (Closes #7353) 2015-11-04 17:57:46 +06:00
Sergey M․
5d501a0901 [globo] Add more tests 2015-11-04 17:42:11 +06:00
Sergey M․
c13722480b [globo:article] Fix test 2015-11-04 17:13:35 +06:00
Sergey M․
e7d34c03f2 [globo] Force uploader id to be string 2015-11-04 17:12:42 +06:00
Sergey M․
264cd00fff [globo] Update tests 2015-11-04 17:10:45 +06:00
Sergey M․
a4a6b7b80f [globo] Improve http formats 2015-11-04 17:03:45 +06:00
Sergey M․
aebb42d32b [globo] Remove like count
It's no longer provided
2015-11-04 17:01:55 +06:00
Sergey M․
b4ef6a0038 [globo] Remove non available test 2015-11-04 17:01:27 +06:00
Sergey M․
5d235ca7f6 [globo] Prefer native m3u8 2015-11-04 16:55:39 +06:00
Sergey M․
c3459d24f1 [globo] Skip unsupported smooth streaming 2015-11-04 16:53:21 +06:00
Sergey M․
e3778cce0e [globo] Improve m3u8 extraction 2015-11-04 16:51:19 +06:00
Sergey M․
ad607563a2 [globo] Separate article extractor 2015-11-04 16:46:26 +06:00
Yen Chi Hsuan
236cb2131b Merge remote-tracking branch 'upstream/master' 2015-11-04 00:54:27 +08:00
Yen Chi Hsuan
66d041f250 [test/subtitles] Add test for DemocracynowIE 2015-11-04 00:53:30 +08:00
Yen Chi Hsuan
f3cb54e6d9 Merge branch 'atomicdryad-pr-democracynow' 2015-11-04 00:45:18 +08:00
Yen Chi Hsuan
0aeb9a106e [democracynow] Prevent required fields to be None 2015-11-04 00:13:00 +08:00
Yen Chi Hsuan
fd8102820c [democracynow] Rename js to json_data 2015-11-04 00:09:55 +08:00
Sergey M․
bfdf891fd3 [vimeo] Fix non-ASCII album passwords 2015-11-03 21:09:24 +06:00
Sergey M․
3fa3ff1bc3 [vimeo] Fix non-ASCII login 2015-11-03 21:06:36 +06:00
Sergey M․
0a0110fc6b [vimeo] Fix non-ASCII video passwords (2) 2015-11-03 21:01:09 +06:00
Sergey M․
852fad922f [vimeo] Fix non-ASCII video passwords (Closes #7352) 2015-11-03 20:53:17 +06:00
Yen Chi Hsuan
fc68d52bb9 [democracynow] Add MD5 sums 2015-11-03 21:24:10 +08:00
Yen Chi Hsuan
dde9fe9788 [democracynow] Simplify 2015-11-03 21:16:42 +08:00
Philipp Hagemeister
a230068ff7 release 2015.11.02 2015-11-02 16:18:54 +01:00
Jaime Marquínez Ferrándiz
6a75040278 [utils] unified_strdate: Return None if the date format can't be recognized (fixes #7340)
This issue was introduced with ae12bc3ebb, it returned 'None'.
2015-11-02 14:08:38 +01:00
remitamine
c514b0ec65 [videofy.me] fix info extraction
Closes #7339.
2015-11-02 13:55:21 +01:00
Jaime Marquínez Ferrándiz
eb97f46e8b [mitele] Fix extraction and update test checksum (fixes #7343) 2015-11-02 12:46:10 +01:00
Sergey M․
c90d16cf36 [utils:sanitize_path] Disallow trailing whitespace in path segment (Closes #7332) 2015-11-02 04:26:20 +06:00
Philipp Hagemeister
ab6ca04802 release 2015.11.01 2015-11-01 14:20:10 +01:00
remitamine
f3003531a5 [flickr] handle error message 2015-11-01 13:38:11 +01:00
remitamine
146672254e [flickr] extract fresh api key and remove duplication in test 2015-11-01 13:23:23 +01:00
Sergey M․
999079b454 [eitb] Improve hds extraction 2015-11-01 15:49:11 +06:00
remitamine
02fb980451 [flickr] extract more info and formats 2015-11-01 02:08:19 +01:00
Sergey M․
8a06999ba0 [eitb] Improve, make more robust and extract f4m formats (Closes #7328) 2015-11-01 01:52:33 +06:00
remitamine
80dcee5cd5 [eitb] fix info extraction 2015-11-01 00:56:16 +06:00
remitamine
9550ca506f [utils] change extract_attributes to work in python 2 2015-10-31 19:36:04 +01:00
Sergey M
30eecc6a04 Merge pull request #7296 from jaimeMF/xml_attrib_unicode
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․
dbd82a1d4f [extractor/common] Fix m3u8 extraction on failure 2015-11-01 00:01:34 +06:00
Sergey M․
76f0c50d3d [mdr] Fix failed formats processing 2015-11-01 00:01:08 +06:00
Sergey M․
dc519b5421 [extractor/common] Make ie_key and IE_NAME return unicode string 2015-10-31 23:12:57 +06:00
Sergey M․
ae12bc3ebb [utils] Make unified_strdate always return unicode string 2015-10-31 23:07:37 +06:00
Sergey M․
e327b736ca [generic] Update test 2015-10-31 23:05:30 +06:00
Sergey M․
82b69a5cbb [mdr] PEP 8 2015-10-31 23:00:36 +06:00
Sergey M․
11465da702 [mdr] Simplify xpath 2015-10-31 22:45:45 +06:00
Sergey M․
578c074575 [utils] Support list of xpath in xpath_element 2015-10-31 22:39:44 +06:00
Sergey M․
8cdb5c8453 [mdr] Add audio test 2015-10-31 22:24:21 +06:00
Sergey M․
2b1b2d83ca [mdr] Modernize and include kika.de 2015-10-31 22:17:09 +06:00
Jaime Marquínez Ferrándiz
c3040bd00a [kika] Cleanup
Closes #6957.
2015-10-31 16:32:35 +01:00
Jaime Marquínez Ferrándiz
8c1aa28c27 [kika] Replace non working tests and recognize 'einzelsendung' urls. 2015-10-31 16:14:36 +01:00
remitamine
50b9dd7344 [dcn] improve season info extraction 2015-10-31 15:40:11 +01:00
Yen Chi Hsuan
78d7ee19dc [democracynow] Fix _TESTS 2015-10-31 22:21:52 +08:00
Lucas
892015b088 replaced inefficient code 2015-10-31 15:18:23 +01:00
Lucas
47f2d01a5a Add new extractor 2015-10-31 15:18:23 +01:00
Yen Chi Hsuan
33a513faf7 Merge branch 'pr-democracynow' of https://github.com/atomicdryad/youtube-dl into atomicdryad-pr-democracynow 2015-10-31 22:06:05 +08:00
remitamine
720334659a [daum] improve info extraction 2015-10-31 01:08:37 +01:00
remitamine
240384afe6 [clipfish] improve info extraction 2015-10-30 20:06:38 +01:00
Sergey M․
6722ebd437 [anitube] Relax key regex (Closes #7303)
Another variant seen http://anitubebr.xpg.uol.com.br/embed/
2015-10-30 21:00:36 +06:00
remitamine
957e0db1d2 [baidu] improve info extraction 2015-10-30 13:56:21 +01:00
remitamine
804afc5871 [vgtv] improve _VALID_URL regex 2015-10-30 10:20:38 +01:00
remitamine
00d24327ef [vgtv] extract videos from FTV, Aftenposten, Aftonbladet using VGTVIE 2015-10-30 09:48:56 +01:00
remitamine
9a605c8859 [adobetv] add support for show and channel extraction 2015-10-29 20:00:27 +01:00
remitamine
402ca40c9d [adobetv] extract AdobeTVVideo info from json directly 2015-10-29 19:55:04 +01:00
remitamine
30bd1c16c8 [adobetv] use api for extraction and add support specific language videos 2015-10-29 19:44:26 +01:00
Sergey M․
721f5a277c [moniker] Add tests for #7244 2015-10-29 22:47:18 +06:00
Sergey M․
6fb8ace671 [moniker] Add support for builtin embedded videos (Closes #7244) 2015-10-29 22:44:01 +06:00
Jaime Marquínez Ferrándiz
ae37338e68 [compat] compat_etree_fromstring: clarify comment 2015-10-29 13:58:40 +01:00
Sergey M․
03c2c162f9 [clyp] Improve and cleanup (Closes #7194) 2015-10-28 21:42:01 +06:00
Sergey M․
52c3a6e49d [utils] Improve parse_iso8601 2015-10-28 21:40:22 +06:00
Cian Ruane
4e16c1f80b [clyp] Add extractor
Update __init__.py

[clyp.it] Extract ID idiomatically and make duration and description optional
2015-10-28 20:37:19 +06:00
Jaime Marquínez Ferrándiz
7ccb2b84dd [francetv] fix style issues reported by flake8
* Don't redefine variable in list comprehension
* Line missing indentation
2015-10-28 08:22:04 +01:00
Sergey M․
0a192fbea7 [pluzz] Fix mobile support and modernize (Closes #7305) 2015-10-27 21:43:29 +06:00
Pierre Fenoll
a526167d40 [francetv] Accept mobile URLs 2015-10-27 21:39:29 +06:00
Jaime Marquínez Ferrándiz
f78546272c [compat] compat_etree_fromstring: also decode the text attribute
Deletes parse_xml from utils, because it also does it.
2015-10-26 16:41:24 +01:00
Sergey M․
c137cc0d33 [francetv] Add subtitles test 2015-10-26 20:35:45 +06:00
Sergey M․
6e4b8b2891 [francetv] Make subtitles more robust (Closes #7298) 2015-10-26 20:35:28 +06:00
Frans de Jonge
5dadae079b [francetv] Add subtitles support 2015-10-26 20:20:15 +06:00
Sergey M
cd08d806b1 Merge pull request #7297 from lalinsky/vidme-deleted
[vidme] Check for deleted videos
2015-10-26 13:47:42 +00:00
Lukáš Lalinský
5f9f87c06f [vidme] Check for deleted videos 2015-10-26 14:42:17 +01:00
Jaime Marquínez Ferrándiz
387db16a78 [compat] compat_etree_fromstring: only decode bytes objects 2015-10-25 20:30:54 +01:00
Jaime Marquínez Ferrándiz
36e6f62cd0 Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (#7178)
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
Sergey M․
755ff8d22c [youporn] Extract comment count 2015-10-25 23:41:10 +06:00
Sergey M․
7b3a19e533 [stitcher] Remove origEpisodeURL
It's always 404
2015-10-25 23:17:23 +06:00
Sergey M․
4f13f8f798 [youporn] Improve uploader extraction 2015-10-25 23:12:12 +06:00
Sergey M․
feb7711cf5 [youporn] Make description optional
Some videos does not contain any description
2015-10-25 23:01:12 +06:00
Sergey M․
589c33dade [youporn] Improve and make more robust (Closes #6888, closes #7214) 2015-10-25 22:56:35 +06:00
Erik
e572a1010b [youporn] Fix extraction
[youporn] Added description and thumbnail

[youporn] Added uploader and date

[youporn] Removed Try and Except lines

[youporn] Fixed date, fatal, formats and /s*

[youporn] Undid removing comment about video url components & Undid and fixed removal of encrypted URL detection

[youporn] Fix: Add encrypted link to links array only if not already in it

[youporn] Fix: Add encrypted link to links array only if not already in it

[youporn] Fix: cleanup
2015-10-25 20:57:08 +06:00
Sergey M․
7e0dc61334 [njoy] Add support for URLs without display id 2015-10-25 20:48:29 +06:00
Sergey M․
8e82ecfe8f [dailymotion] Extract f4m formats 2015-10-24 21:04:09 +06:00
Sergey M․
ec29539e06 [senateisvp] Pass extra param as query segment without ? 2015-10-24 21:03:45 +06:00
Sergey M․
8cd9614abf [downloader/f4m] More accurate fragment URL construction 2015-10-24 21:02:31 +06:00
remitamine
324ac0a243 [downloader/f4m] get the redirected f4m_url and handle url query string properly 2015-10-24 20:05:46 +06:00
remitamine
3711304510 [extractor/common] get the redirected m3u8_url in _extract_m3u8_formats 2015-10-24 19:01:54 +06:00
Jaime Marquínez Ferrándiz
50b936936d [tutv] Fix test 2015-10-24 14:22:47 +02:00
Jaime Marquínez Ferrándiz
d97da29da2 [abc] Support more URL formats 2015-10-24 12:43:02 +02:00
remitamine
7687b354c5 [abc] add support for audio extraction 2015-10-24 12:42:56 +02:00
Jaime Marquínez Ferrándiz
36d7281037 [spiegeltv] Fix style issue
Use two spaces before comment.
2015-10-24 12:42:08 +02:00
Jaime Marquínez Ferrándiz
865d1fbafc [extractor/common] Remove unused import 2015-10-24 12:39:23 +02:00
Sergey M․
ac21e71968 [spiegeltv] Check formats 2015-10-24 16:25:44 +06:00
Sergey M․
943a1e24b8 [extractor/common] Use more generic URLError in _is_valid_url 2015-10-24 16:25:04 +06:00
Sergey M․
50f01302d3 [spiegeltv] Do not extract m3u8 formats since it's already a format 2015-10-24 16:24:08 +06:00
Christoph Döpmann
0198807ef9 [spiegeltv] Fix Accept-Encoding issue (server chokes on gzip) 2015-10-24 16:21:14 +06:00
Jaime Marquínez Ferrándiz
6856139705 [mitele] Fix test checksum 2015-10-24 12:13:26 +02:00
Jaime Marquínez Ferrándiz
c93153852f [mitele] Don't encode the URL query (closes #7280)
This seems to produce sporadic errors when trying to access the URL, because on python 3.x when you do '%s' % b'somedata' you get "b'somedata'".
2015-10-24 12:10:53 +02:00
remitamine
497f5fd93f [bilibili] extract multiple backup_urls 2015-10-21 08:24:05 +01:00
remitamine
4bf5614195 [cspan] move get_text_attr to CSpanIE 2015-10-20 07:43:39 +01:00
remitamine
520e753390 [bilibili] add support for specefic page extraction 2015-10-17 23:12:58 +01:00
remitamine
355c7ad361 [cspan] handle error massages and extract qualities 2015-10-17 21:30:38 +01:00
remitamine
55af2b26e0 [bilibili] extract backup url 2015-10-17 18:30:51 +01:00
remitamine
d90e40305b [bilibili] fix info extraction 2015-10-17 17:28:09 +01:00
remitamine
cce9d15d01 [ooyala] extract domain,handle errors and change related tests 2015-10-16 16:02:40 +01:00
remitamine
dd414c970b [ooyala] fix sorting and format id 2015-10-16 10:12:42 +01:00
remitamine
77302fe5c9 [bliptv] remove extractor and add support for site replacement(makertv) 2015-10-15 23:27:46 +01:00
remitamine
497ca088a6 [ooyala] remove print statment 2015-10-15 14:37:05 +01:00
remitamine
90bddb6cdd [ooyala] extract more formats and metadata 2015-10-15 14:28:56 +01:00
remitamine
db8e38b8cf [ign] add tests for me.ign specific language urls 2015-10-14 11:55:03 +01:00
remitamine
e09f58b3bc [srgssr] change the url chortcut, fix image extraction ,add a test and extract format id 2015-10-14 10:40:54 +01:00
remitamine
05ad5409b4 [srgssr] fix regex for swissinfo.ch 2015-10-09 20:34:03 +01:00
remitamine
1ef1563649 [srgssr] Add generic extractor for SRGSSR Group sites 2015-10-09 20:08:37 +01:00
remitamine
6a11bb77ba [nba] add support for team subsites 2015-10-07 12:17:32 +01:00
remitamine
ecf6de5b02 [nba] extract width,height and bitrate from format key 2015-10-07 07:09:45 +01:00
remitamine
139f27827e [nba] skip Legacy Video Files 2015-10-07 06:53:19 +01:00
remitamine
30787f7259 [cspan] correct the clip info extraction 2015-10-03 19:28:48 +01:00
remitamine
c233e6bcc3 [nba] extract video info from xml feed 2015-10-03 12:30:05 +01:00
remitamine
28809ab07a [nba] extract more formats 2015-10-03 09:47:19 +01:00
remitamine
adccf33632 [ign] add support for pcmag and extract all formats and more metadata 2015-10-02 21:58:20 +01:00
remitamine
8fc226ef99 [nba] extract all video formats and extract more info 2015-10-02 17:24:30 +01:00
remitamine
c01e1a96aa [brightcove] fix test and fields extraction 2015-09-30 11:20:43 +01:00
remitamine
6aeba407db [jukebox] remove extractor and handle it using generic extractor 2015-09-25 10:52:48 +01:00
remitamine
53407e3f38 [brightcove] fix streaming_src extraction 2015-09-23 14:02:13 +01:00
remitamine
b306c439d7 [cnet] fix extraction and extract more formats 2015-09-23 13:28:05 +01:00
remitamine
ed1269000f [brightcove] add support for brightcove in page embed(fixes #6824) 2015-09-11 04:46:21 +01:00
remitamine
689fb748ee [utlis] add extract_attributes for extracting html tags attributes 2015-09-11 04:44:17 +01:00
remitamine
436416afe2 [tele13] skip test 2015-09-07 21:13:49 +01:00
remitamine
8b55cadc83 [canal13cl] fix info extraction 2015-09-07 16:39:01 +01:00
remitamine
486375154c correct season info extraction and simplify 2015-09-05 11:30:42 +01:00
remitamine
8e2898edf9 [dcn] add support for live streams and catchup videos 2015-09-04 15:42:09 +01:00
remitamine
b477da2094 correct the extractor name and id and remove unnecessary request 2015-09-03 16:59:10 +01:00
remitamine
9f4921bfa0 [dcn] add show extraction and support for other types of urls 2015-09-03 00:29:53 +01:00
remitamine
8e92d21ebf [googledrive] raise ExtractorError instead of warning 2015-07-23 11:59:13 +01:00
remitamine
36dbca8784 fix recursive error 2015-07-23 11:59:13 +01:00
remitamine
d1cc05e17e remove unnecessary regex group names 2015-07-23 11:59:12 +01:00
remitamine
3b3d531965 fix embed regex 2015-07-23 11:59:12 +01:00
remitamine
653789afc7 add google drive embeds 2015-07-23 11:59:12 +01:00
remitamine
2d651a2d02 import google drive embed class 2015-07-23 11:57:09 +01:00
remitamine
3e5f3df172 move the embed to a separate class 2015-07-23 11:57:09 +01:00
remitamine
f120a7ab5e change the _TEST info 2015-07-23 11:57:09 +01:00
remitamine
984e4d4875 [googledrive] Add new extractor 2015-07-23 11:57:08 +01:00
fnord
eb08081330 democracynow: correct syntax 2015-07-17 02:57:08 -05:00
fnord
f870544302 Add support for democracynow.org
Supports downloading clips or entire shows. Subtitle support
2015-07-13 07:41:38 -05:00
513 changed files with 22307 additions and 8621 deletions

58
.github/ISSUE_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,58 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
- Use *Preview* tab to see how your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.04.13*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.04.13**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.04.13
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* required an account credentials please provide them or explain how one can obtain them.

58
.github/ISSUE_TEMPLATE_tmpl.md vendored Normal file
View File

@@ -0,0 +1,58 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like that [x])
- Use *Preview* tab to see how your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/rg3/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add `-v` flag to **your command line** you run youtube-dl with, copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/rg3/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* required an account credentials please provide them or explain how one can obtain them.

4
.gitignore vendored
View File

@@ -1,5 +1,6 @@
*.pyc
*.pyo
*.class
*~
*.DS_Store
wine-py2exe/
@@ -12,6 +13,7 @@ README.txt
youtube-dl.1
youtube-dl.bash-completion
youtube-dl.fish
youtube_dl/extractor/lazy_extractors.py
youtube-dl
youtube-dl.exe
youtube-dl.tar.gz
@@ -32,4 +34,4 @@ test/testdata
.tox
youtube-dl.zsh
.idea
.idea/*
.idea/*

24
AUTHORS
View File

@@ -144,3 +144,27 @@ Lee Jenkins
Anssi Hannula
Lukáš Lalinský
Qijiang Fan
Rémy Léone
Marco Ferragina
reiv
Muratcan Simsek
Evan Lu
flatgreen
Brian Foley
Vignesh Venkat
Tom Gijselinck
Founder Fang
Andrew Alexeyew
Saso Bezlaj
Erwin de Haan
Jens Wille
Robin Houtevelts
Patrick Griffis
Aidan Rowe
mutantmonkey
Ben Congdon
Kacper Michajłow
José Joaquín Atria
Viťas Strádal
Kagami Hiiragi
Philip Huppert

View File

@@ -1,4 +1,18 @@
**Please include the full output of youtube-dl when run with `-v`**.
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.12.06
[debug] Git HEAD: 135392e
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
```
**Do not post screenshots of verbose log only plain text is acceptable.**
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
@@ -14,13 +28,13 @@ So please elaborate on what feature you are requesting, or what bug you want to
- How it could be fixed
- How your proposed solution would look like
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a commiter myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `http://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `http://www.youtube.com/`) is *not* an example URL.
### Are you using the latest version?
@@ -28,7 +42,7 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough?
@@ -71,14 +85,16 @@ To run the test, simply invoke your favorite test runner, or execute a test file
If you want to create a build of youtube-dl yourself, you'll need
* python
* make
* make (both GNU make and BSD make are supported)
* pandoc
* zip
* nosetests
### Adding support for a new site
If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
@@ -124,18 +140,19 @@ If you want to add support for a new site, you can follow this quick list (assum
# TODO more properties (see youtube_dl/extractor/common.py)
}
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L62-L200). Add tests and code for as many as you want.
8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/__init__.py
$ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions!

View File

@@ -1,8 +1,9 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
PREFIX ?= /usr/local
BINDIR ?= $(PREFIX)/bin
@@ -11,15 +12,7 @@ SHAREDIR ?= $(PREFIX)/share
PYTHON ?= /usr/bin/env python
# set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
ifeq ($(PREFIX),/usr)
SYSCONFDIR=/etc
else
ifeq ($(PREFIX),/usr/local)
SYSCONFDIR=/etc
else
SYSCONFDIR=$(PREFIX)/etc
endif
endif
SYSCONFDIR != if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi
install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
install -d $(DESTDIR)$(BINDIR)
@@ -44,7 +37,7 @@ test:
ot: offlinetest
offlinetest: codetest
nosetests --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py
$(PYTHON) -m nose --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
tar: youtube-dl.tar.gz
@@ -61,37 +54,46 @@ youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
chmod a+x youtube-dl
README.md: youtube_dl/*.py youtube_dl/*/*.py
COLUMNS=80 python youtube_dl/__main__.py --help | python devscripts/make_readme.py
COLUMNS=80 $(PYTHON) youtube_dl/__main__.py --help | $(PYTHON) devscripts/make_readme.py
CONTRIBUTING.md: README.md
python devscripts/make_contributing.py README.md CONTRIBUTING.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
supportedsites:
python devscripts/make_supportedsites.py docs/supportedsites.md
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
README.txt: README.md
pandoc -f markdown -t plain README.md -o README.txt
youtube-dl.1: README.md
python devscripts/prepare_manpage.py >youtube-dl.1.temp.md
$(PYTHON) devscripts/prepare_manpage.py >youtube-dl.1.temp.md
pandoc -s -f markdown -t man youtube-dl.1.temp.md -o youtube-dl.1
rm -f youtube-dl.1.temp.md
youtube-dl.bash-completion: youtube_dl/*.py youtube_dl/*/*.py devscripts/bash-completion.in
python devscripts/bash-completion.py
$(PYTHON) devscripts/bash-completion.py
bash-completion: youtube-dl.bash-completion
youtube-dl.zsh: youtube_dl/*.py youtube_dl/*/*.py devscripts/zsh-completion.in
python devscripts/zsh-completion.py
$(PYTHON) devscripts/zsh-completion.py
zsh-completion: youtube-dl.zsh
youtube-dl.fish: youtube_dl/*.py youtube_dl/*/*.py devscripts/fish-completion.in
python devscripts/fish-completion.py
$(PYTHON) devscripts/fish-completion.py
fish-completion: youtube-dl.fish
lazy-extractors: youtube_dl/extractor/lazy_extractors.py
_EXTRACTOR_FILES != find youtube_dl/extractor -iname '*.py' -and -not -iname 'lazy_extractors.py'
youtube_dl/extractor/lazy_extractors.py: devscripts/make_lazy_extractors.py devscripts/lazy_load_template.py $(_EXTRACTOR_FILES)
$(PYTHON) devscripts/make_lazy_extractors.py $@
youtube-dl.tar.gz: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
@tar -czf youtube-dl.tar.gz --transform "s|^|youtube-dl/|" --owner 0 --group 0 \
--exclude '*.DS_Store' \

290
README.md
View File

@@ -35,7 +35,7 @@ You can also use pip:
sudo pip install youtube-dl
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see https://rg3.github.io/youtube-dl/download.html .
Alternatively, refer to the [developer instructions](#developer-instructions) for how to check out and work with the git repository. For further options, including PGP signatures, see the [youtube-dl Download Page](https://rg3.github.io/youtube-dl/download.html).
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
@@ -80,6 +80,8 @@ which means you can modify it, redistribute it or use it however you like.
on Windows)
--flat-playlist Do not extract the videos of a playlist,
only list them.
--mark-watched Mark videos watched (YouTube only)
--no-mark-watched Do not mark videos watched (YouTube only)
--no-color Do not emit color codes in output
## Network Options:
@@ -162,6 +164,8 @@ which means you can modify it, redistribute it or use it however you like.
(e.g. 50K or 4.2M)
-R, --retries RETRIES Number of retries (default is 10), or
"infinite".
--fragment-retries RETRIES Number of retries for a fragment (default
is 10), or "infinite" (DASH only)
--buffer-size SIZE Size of download buffer (e.g. 1024 or 16K)
(default is 1024)
--no-resize-buffer Do not automatically adjust the buffer
@@ -173,9 +177,13 @@ which means you can modify it, redistribute it or use it however you like.
expected filesize (experimental)
--hls-prefer-native Use the native HLS downloader instead of
ffmpeg (experimental)
--hls-use-mpegts Use the mpegts container for HLS videos,
allowing to play the video while
downloading (some players may not be able
to play it)
--external-downloader COMMAND Use the specified external downloader.
Currently supports
aria2c,axel,curl,httpie,wget
aria2c,avconv,axel,curl,ffmpeg,httpie,wget
--external-downloader-args ARGS Give these arguments to the external
downloader
@@ -319,7 +327,8 @@ which means you can modify it, redistribute it or use it however you like.
--all-formats Download all available video formats
--prefer-free-formats Prefer free video formats unless a specific
one is requested
-F, --list-formats List all available formats
-F, --list-formats List all available formats of requested
videos
--youtube-skip-dash-manifest Do not download the DASH manifests and
related data on YouTube videos
--merge-output-format FORMAT If a merge is required (e.g.
@@ -329,8 +338,8 @@ which means you can modify it, redistribute it or use it however you like.
## Subtitle Options:
--write-sub Write subtitle file
--write-auto-sub Write automatic subtitle file (YouTube
only)
--write-auto-sub Write automatically generated subtitle file
(YouTube only)
--all-subs Download all the available subtitles of the
video
--list-subs List all available subtitles for the video
@@ -338,8 +347,8 @@ which means you can modify it, redistribute it or use it however you like.
preference, for example: "srt" or
"ass/srt/best"
--sub-lang LANGS Languages of the subtitles to download
(optional) separated by commas, use IETF
language tags like 'en,pt'
(optional) separated by commas, use --list-
subs for available language tags
## Authentication Options:
-u, --username USERNAME Login with this account ID
@@ -369,8 +378,8 @@ which means you can modify it, redistribute it or use it however you like.
--no-post-overwrites Do not overwrite post-processed files; the
post-processed files are overwritten by
default
--embed-subs Embed subtitles in the video (only for mkv
and mp4 videos)
--embed-subs Embed subtitles in the video (only for mp4,
webm and mkv videos)
--embed-thumbnail Embed thumbnail in the audio as cover art
--add-metadata Write metadata to the video file
--metadata-from-title FORMAT Parse additional metadata like song title /
@@ -399,21 +408,26 @@ which means you can modify it, redistribute it or use it however you like.
downloading, similar to find's -exec
syntax. Example: --exec 'adb push {}
/sdcard/Music/ && rm {}'
--convert-subtitles FORMAT Convert the subtitles to other format
--convert-subs FORMAT Convert the subtitles to other format
(currently supported: srt|ass|vtt)
# CONFIGURATION
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime and use a proxy:
You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`.
For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under `Movies` directory in your home directory:
```
--extract-audio
-x
--no-mtime
--proxy 127.0.0.1:3128
-o ~/Movies/%(title)s.%(ext)s
```
Note that options in configuration file are just the same options aka switches used in regular command line calls thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`.
You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.
### Authentication with `.netrc` file ###
### Authentication with `.netrc` file
You may also want to configure automatic credentials storage for extractors that support authentication (by providing login and password with `--username` and `--password`) in order not to pass credentials as command line arguments on every youtube-dl execution and prevent tracking plain text passwords in the shell command history. You can achieve this using a [`.netrc` file](http://stackoverflow.com/tags/.netrc/info) on per extractor basis. For that you will need to create a`.netrc` file in your `$HOME` and restrict permissions to read/write by you only:
```
@@ -435,43 +449,190 @@ On Windows you may also need to setup the `%HOME%` environment variable manually
# OUTPUT TEMPLATE
The `-o` option allows users to indicate a template for the output file names. The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:
The `-o` option allows users to indicate a template for the output file names.
- `id`: The sequence will be replaced by the video identifier.
- `url`: The sequence will be replaced by the video URL.
- `uploader`: The sequence will be replaced by the nickname of the person who uploaded the video.
- `upload_date`: The sequence will be replaced by the upload date in YYYYMMDD format.
- `title`: The sequence will be replaced by the video title.
- `ext`: The sequence will be replaced by the appropriate extension (like flv or mp4).
- `epoch`: The sequence will be replaced by the Unix epoch when creating the file.
- `autonumber`: The sequence will be replaced by a five-digit number that will be increased with each download, starting at zero.
- `playlist`: The sequence will be replaced by the name or the id of the playlist that contains the video.
- `playlist_index`: The sequence will be replaced by the index of the video in the playlist padded with leading zeros according to the total length of the playlist.
- `format_id`: The sequence will be replaced by the format code specified by `--format`.
- `duration`: The sequence will be replaced by the length of the video in seconds.
**tl;dr:** [navigate me to examples](#output-template-examples).
The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:
- `id`: Video identifier
- `title`: Video title
- `url`: Video URL
- `ext`: Video filename extension
- `alt_title`: A secondary title of the video
- `display_id`: An alternative identifier for the video
- `uploader`: Full name of the video uploader
- `license`: License name the video is licensed under
- `creator`: The main artist who created the video
- `release_date`: The date (YYYYMMDD) when the video was released
- `timestamp`: UNIX timestamp of the moment the video became available
- `upload_date`: Video upload date (YYYYMMDD)
- `uploader_id`: Nickname or id of the video uploader
- `location`: Physical location where the video was filmed
- `duration`: Length of the video in seconds
- `view_count`: How many users have watched the video on the platform
- `like_count`: Number of positive ratings of the video
- `dislike_count`: Number of negative ratings of the video
- `repost_count`: Number of reposts of the video
- `average_rating`: Average rating give by users, the scale used depends on the webpage
- `comment_count`: Number of comments on the video
- `age_limit`: Age restriction for the video (years)
- `format`: A human-readable description of the format
- `format_id`: Format code specified by `--format`
- `format_note`: Additional info about the format
- `width`: Width of the video
- `height`: Height of the video
- `resolution`: Textual description of width and height
- `tbr`: Average bitrate of audio and video in KBit/s
- `abr`: Average audio bitrate in KBit/s
- `acodec`: Name of the audio codec in use
- `asr`: Audio sampling rate in Hertz
- `vbr`: Average video bitrate in KBit/s
- `fps`: Frame rate
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `filesize`: The number of bytes, if known in advance
- `filesize_approx`: An estimate for the number of bytes
- `protocol`: The protocol that will be used for the actual download
- `extractor`: Name of the extractor
- `extractor_key`: Key name of the extractor
- `epoch`: Unix epoch when creating the file
- `autonumber`: Five-digit number that will be increased with each download, starting at zero
- `playlist`: Name or id of the playlist that contains the video
- `playlist_index`: Index of the video in the playlist padded with leading zeros according to the total length of the playlist
Available for the video that belongs to some logical chapter or section:
- `chapter`: Name or title of the chapter the video belongs to
- `chapter_number`: Number of the chapter the video belongs to
- `chapter_id`: Id of the chapter the video belongs to
Available for the video that is an episode of some series or programme:
- `series`: Title of the series or programme the video episode belongs to
- `season`: Title of the season the video episode belongs to
- `season_number`: Number of the season the video episode belongs to
- `season_id`: Id of the season the video episode belongs to
- `episode`: Title of the video episode
- `episode_number`: Number of the video episode within a season
- `episode_id`: Id of the video episode
Each aforementioned sequence when referenced in output template will be replaced by the actual value corresponding to the sequence name. Note that some of the sequences are not guaranteed to be present since they depend on the metadata obtained by particular extractor, such sequences will be replaced with `NA`.
For example for `-o %(title)s-%(id)s.%(ext)s` and mp4 video with title `youtube-dl test video` and id `BaW_jenozKcj` this will result in a `youtube-dl test video-BaW_jenozKcj.mp4` file created in the current directory.
Output template can also contain arbitrary hierarchical path, e.g. `-o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s'` that will result in downloading each video in a directory corresponding to this path template. Any missing directory will be automatically created for you.
To specify percent literal in output template use `%%`. To output to stdout use `-o -`.
The current default template is `%(title)s-%(id)s.%(ext)s`.
In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:
#### Output template examples
Note on Windows you may need to use double quotes instead of single.
```bash
$ youtube-dl --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc
$ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
youtube-dl test video ''_ä↭𝕐.mp4 # All kinds of weird characters
$ youtube-dl --get-filename -o "%(title)s.%(ext)s" BaW_jenozKc --restrict-filenames
$ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc --restrict-filenames
youtube-dl_test_video_.mp4 # A simple file name
# Download YouTube playlist videos in separate directory indexed by video order in a playlist
$ youtube-dl -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re
# Download all playlists of YouTube channel/user keeping each playlist in separate directory:
$ youtube-dl -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
# Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
$ youtube-dl -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
# Download entire series season keeping each series and each season in separate directory under C:/MyVideos
$ youtube-dl -o "C:/MyVideos/%(series)s/%(season_number)s - %(season)s/%(episode_number)s - %(episode)s.%(ext)s" http://videomore.ru/kino_v_detalayah/5_sezon/367617
# Stream the video being downloaded to stdout
$ youtube-dl -o - BaW_jenozKc
```
# FORMAT SELECTION
By default youtube-dl tries to download the best quality, but sometimes you may want to download in a different format.
The simplest case is requesting a specific format, for example `-f 22`. You can get the list of available formats using `--list-formats`, you can also use a file extension (currently it supports aac, m4a, mp3, mp4, ogg, wav, webm) or the special names `best`, `bestvideo`, `bestaudio` and `worst`.
By default youtube-dl tries to download the best available quality, i.e. if you want the best quality you **don't need** to pass any special options, youtube-dl will guess it for you by **default**.
If you want to download multiple videos and they don't have the same formats available, you can specify the order of preference using slashes, as in `-f 22/17/18`. You can also filter the video results by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`). This works for filesize, height, width, tbr, abr, vbr, asr, and fps and the comparisons <, <=, >, >=, =, != and for ext, acodec, vcodec, container, and protocol and the comparisons =, != . Formats for which the value is not known are excluded unless you put a question mark (?) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. Use commas to download multiple formats, such as `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`. You can merge the video and audio of two formats into a single file using `-f <video-format>+<audio-format>` (requires ffmpeg or avconv), for example `-f bestvideo+bestaudio`. Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.
But sometimes you may want to download in a different format, for example when you are on a slow or intermittent connection. The key mechanism for achieving this is so called *format selection* based on which you can explicitly specify desired format, select formats based on some criterion or criteria, setup precedence and much more.
Since the end of April 2015 and version 2015.04.26 youtube-dl uses `-f bestvideo+bestaudio/best` as default format selection (see #5447, #5456). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some dash formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
The general syntax for format selection is `--format FORMAT` or shorter `-f FORMAT` where `FORMAT` is a *selector expression*, i.e. an expression that describes format or formats you would like to download.
**tl;dr:** [navigate me to examples](#format-selection-examples).
The simplest case is requesting a specific format, for example with `-f 22` you can download the format with format code equal to 22. You can get the list of available format codes for particular video using `--list-formats` or `-F`. Note that these format codes are extractor specific.
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download best quality format of particular file extension served as a single file, e.g. `-f webm` will download best quality format with `webm` extension served as a single file.
You can also use special names to select particular edge case format:
- `best`: Select best quality format represented by single file with video and audio
- `worst`: Select worst quality format represented by single file with video and audio
- `bestvideo`: Select best quality video only format (e.g. DASH video), may not be available
- `worstvideo`: Select worst quality video only format, may not be available
- `bestaudio`: Select best quality audio only format, may not be available
- `worstaudio`: Select worst quality audio only format, may not be available
For example, to download worst quality video only format you can use `-f worstvideo`.
If you want to download multiple videos and they don't have the same formats available, you can specify the order of preference using slashes. Note that slash is left-associative, i.e. formats on the left hand side are preferred, for example `-f 22/17/18` will download format 22 if it's available, otherwise it will download format 17 if it's available, otherwise it will download format 18 if it's available, otherwise it will complain that no suitable formats are available for download.
If you want to download several formats of the same video use comma as a separator, e.g. `-f 22,17,18` will download all these three formats, of course if they are available. Or more sophisticated example combined with precedence feature `-f 136/137/mp4/bestvideo,140/m4a/bestaudio`.
You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance
- `width`: Width of the video, if known
- `height`: Height of the video, if known
- `tbr`: Average bitrate of audio and video in KBit/s
- `abr`: Average audio bitrate in KBit/s
- `vbr`: Average video bitrate in KBit/s
- `asr`: Audio sampling rate in Hertz
- `fps`: Frame rate
Also filtering work for comparisons `=` (equals), `!=` (not equals), `^=` (begins with), `$=` (ends with), `*=` (contains) and following string meta fields:
- `ext`: File extension
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
- `container`: Name of the container format
- `protocol`: The protocol that will be used for the actual download, lower-case. `http`, `https`, `rtsp`, `rtmp`, `rtmpe`, `m3u8`, or `m3u8_native`
- `format_id`: A short description of the format
Note that none of the aforementioned meta fields are guaranteed to be present since this solely depends on the metadata obtained by particular extractor, i.e. the metadata offered by video hoster.
Formats for which the value is not known are excluded unless you put a question mark (`?`) after the operator. You can combine format filters, so `-f "[height <=? 720][tbr>500]"` selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s.
You can merge the video and audio of two formats into a single file using `-f <video-format>+<audio-format>` (requires ffmpeg or avconv installed), for example `-f bestvideo+bestaudio` will download best video only format, best audio only format and mux them together with ffmpeg/avconv.
Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.
Since the end of April 2015 and version 2015.04.26 youtube-dl uses `-f bestvideo+bestaudio/best` as default format selection (see [#5447](https://github.com/rg3/youtube-dl/issues/5447), [#5456](https://github.com/rg3/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.
#### Format selection examples
Note on Windows you may need to use double quotes instead of single.
```bash
# Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger that 50 MB
$ youtube-dl -f 'best[filesize<50M]'
# Download best format available via direct link over HTTP/HTTPS protocol
$ youtube-dl -f '(bestvideo+bestaudio/best)[protocol^=http]'
```
# VIDEO SELECTION
Videos can be filtered by their upload date using the options `--date`, `--datebefore` or `--dateafter`. They accept dates in two formats:
@@ -534,6 +695,12 @@ Most people asking this question are not aware that youtube-dl now defaults to d
Apparently YouTube requires you to pass a CAPTCHA test if you download too much. We're [considering to provide a way to let you solve the CAPTCHA](https://github.com/rg3/youtube-dl/issues/154), but at the moment, your best course of action is pointing a webbrowser to the youtube URL, solving the CAPTCHA, and restart youtube-dl.
### Do I need any other programs?
youtube-dl works fine on its own on most sites. However, if you want to convert video/audio, you'll need [avconv](https://libav.org/) or [ffmpeg](https://www.ffmpeg.org/). On some sites - most notably YouTube - videos can be retrieved in a higher quality format without sound. youtube-dl will detect whether avconv/ffmpeg is present and automatically pick the best option.
Videos or video formats streamed via RTMP protocol can only be downloaded when [rtmpdump](https://rtmpdump.mplayerhq.hu/) is installed. Downloading MMS and RTSP videos requires either [mplayer](http://mplayerhq.hu/) or [mpv](https://mpv.io/) to be installed.
### I have downloaded a video but how can I play it?
Once the video is fully downloaded, use any video player, such as [vlc](http://www.videolan.org) or [mplayer](http://www.mplayerhq.hu/).
@@ -552,11 +719,11 @@ If you want to play the video on a machine that is not running youtube-dl, you c
YouTube has switched to a new video info format in July 2011 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### ERROR: unable to download video ###
### ERROR: unable to download video
YouTube requires an additional signature since September 2012 which is not supported by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.
### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command` ###
### Video URL contains an ampersand and I'm getting some strange output `[1] 2839` or `'v' is not recognized as an internal or external command`
That's actually the output from your shell. Since ampersand is one of the special shell characters it's interpreted by the shell preventing you from passing the whole URL to youtube-dl. To disable your shell from interpreting the ampersands (or any other special characters) you have to either put the whole URL in quotes or escape them with a backslash (which approach will work depends on your shell).
@@ -580,7 +747,7 @@ In February 2015, the new YouTube player contained a character sequence in a str
These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
### SyntaxError: Non-ASCII character ###
### SyntaxError: Non-ASCII character
The error
@@ -591,7 +758,7 @@ means you're using an outdated version of Python. Please update to Python 2.6 or
### What is this binary file? Where has the code gone?
Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
Since June 2012 ([#342](https://github.com/rg3/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
### The exe throws a *Runtime error from Visual C++*
@@ -609,7 +776,7 @@ From then on, after restarting your shell, you will be able to access both youtu
Use the `-o` to specify an [output template](#output-template), for example `-o "/home/user/videos/%(title)s-%(id)s.%(ext)s"`. If you want this for all of your downloads, put the option into your [configuration file](#configuration).
### How do I download a video starting with a `-` ?
### How do I download a video starting with a `-`?
Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the options with `--`:
@@ -620,7 +787,7 @@ Either prepend `http://www.youtube.com/watch?v=` or separate the ID from the opt
Use the `--cookies` option, for example `--cookies /path/to/cookies/file.txt`. Note that the cookies file must be in Mozilla/Netscape format and the first line of the cookies file must be either `# HTTP Cookie File` or `# Netscape HTTP Cookie File`. Make sure you have correct [newline format](https://en.wikipedia.org/wiki/Newline) in the cookies file and convert newlines if necessary to correspond with your OS, namely `CRLF` (`\r\n`) for Windows, `LF` (`\n`) for Linux and `CR` (`\r`) for Mac OS. `HTTP Error 400: Bad Request` when using `--cookies` is a good sign of invalid newline format.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly.
Passing cookies to youtube-dl is a good way to workaround login when a particular extractor does not implement it explicitly. Another use case is working around [CAPTCHA](https://en.wikipedia.org/wiki/CAPTCHA) some websites require you to solve in particular cases in order to get access (e.g. YouTube, CloudFlare).
### Can you add support for this anime video site, or site which shows current movies for free?
@@ -667,14 +834,16 @@ To run the test, simply invoke your favorite test runner, or execute a test file
If you want to create a build of youtube-dl yourself, you'll need
* python
* make
* make (both GNU make and BSD make are supported)
* pandoc
* zip
* nosetests
### Adding support for a new site
If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
@@ -720,18 +889,19 @@ If you want to add support for a new site, you can follow this quick list (assum
# TODO more properties (see youtube_dl/extractor/common.py)
}
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L62-L200). Add tests and code for as many as you want.
8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/__init__.py
$ git add youtube_dl/extractor/extractors.py
$ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions!
@@ -750,7 +920,7 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of what can be done, have a look at [youtube_dl/YoutubeDL.py](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L117-L265). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Most likely, you'll want to use various options. For a list of what can be done, have a look at [`youtube_dl/YoutubeDL.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L121-L269). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
@@ -791,9 +961,23 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
# BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the irc channel #youtube-dl on freenode.
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>. Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](http://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).
**Please include the full output of youtube-dl when run with `-v`**.
**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
```
$ youtube-dl -v <your command line>
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2015.12.06
[debug] Git HEAD: 135392e
[debug] Python version 2.6.6 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
```
**Do not post screenshots of verbose log only plain text is acceptable.**
The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
@@ -809,13 +993,13 @@ So please elaborate on what feature you are requesting, or what bug you want to
- How it could be fixed
- How your proposed solution would look like
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a commiter myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the `-v` flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
If your server has multiple IPs or you suspect censorship, adding `--call-home` may be a good idea to get more diagnostics. If the error is `ERROR: Unable to extract ...` and you cannot reproduce it from multiple countries, add `--dump-pages` (warning: this will yield a rather large output, redirect it to the file `log.txt` by adding `>log.txt 2>&1` to your command-line) or upload the `.dump` files you get when you add `--write-pages` [somewhere](https://gist.github.com/).
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
**Site support requests must contain an example URL**. An example URL is a URL you might want to download, like `http://www.youtube.com/watch?v=BaW_jenozKc`. There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. `http://www.youtube.com/`) is *not* an example URL.
### Are you using the latest version?
@@ -823,7 +1007,7 @@ Before reporting any issue, type `youtube-dl -U`. This should report that you're
### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or browse the [GitHub Issues](https://github.com/rg3/youtube-dl/search?type=Issues) of this repository. If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough?
@@ -853,4 +1037,4 @@ It may sound strange, but some bug reports we receive are completely unrelated t
youtube-dl is released into the public domain by the copyright holders.
This README file was originally written by Daniel Bolton (<https://github.com/dbbolton>) and is likewise released into the public domain.
This README file was originally written by [Daniel Bolton](https://github.com/dbbolton) and is likewise released into the public domain.

View File

@@ -5,7 +5,7 @@ from __future__ import with_statement, unicode_literals
import datetime
import glob
import io # For Python 2 compatibilty
import io # For Python 2 compatibility
import os
import re

View File

@@ -0,0 +1,19 @@
# encoding: utf-8
from __future__ import unicode_literals
import re
class LazyLoadExtractor(object):
_module = None
@classmethod
def ie_key(cls):
return cls.__name__[:-2]
def __new__(cls, *args, **kwargs):
mod = __import__(cls._module, fromlist=(cls.__name__,))
real_cls = getattr(mod, cls.__name__)
instance = real_cls.__new__(real_cls)
instance.__init__(*args, **kwargs)
return instance

View File

@@ -0,0 +1,29 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import io
import optparse
def main():
parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
options, args = parser.parse_args()
if len(args) != 2:
parser.error('Expected an input and an output filename')
infile, outfile = args
with io.open(infile, encoding='utf-8') as inf:
issue_template_tmpl = inf.read()
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
out = issue_template_tmpl % {'version': locals()['__version__']}
with io.open(outfile, 'w', encoding='utf-8') as outf:
outf.write(out)
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,63 @@
from __future__ import unicode_literals, print_function
from inspect import getsource
import os
from os.path import dirname as dirn
import sys
print('WARNING: Lazy loading extractors is an experimental feature that may not always work', file=sys.stderr)
sys.path.insert(0, dirn(dirn((os.path.abspath(__file__)))))
lazy_extractors_filename = sys.argv[1]
if os.path.exists(lazy_extractors_filename):
os.remove(lazy_extractors_filename)
from youtube_dl.extractor import _ALL_CLASSES
from youtube_dl.extractor.common import InfoExtractor
with open('devscripts/lazy_load_template.py', 'rt') as f:
module_template = f.read()
module_contents = [module_template + '\n' + getsource(InfoExtractor.suitable)]
ie_template = '''
class {name}(LazyLoadExtractor):
_VALID_URL = {valid_url!r}
_module = '{module}'
'''
make_valid_template = '''
@classmethod
def _make_valid_url(cls):
return {valid_url!r}
'''
def build_lazy_ie(ie, name):
valid_url = getattr(ie, '_VALID_URL', None)
s = ie_template.format(
name=name,
valid_url=valid_url,
module=ie.__module__)
if ie.suitable.__func__ is not InfoExtractor.suitable.__func__:
s += '\n' + getsource(ie.suitable)
if hasattr(ie, '_make_valid_url'):
# search extractors
s += make_valid_template.format(valid_url=ie._make_valid_url())
return s
names = []
for ie in list(sorted(_ALL_CLASSES[:-1], key=lambda cls: cls.ie_key())) + _ALL_CLASSES[-1:]:
name = ie.ie_key() + 'IE'
src = build_lazy_ie(ie, name)
module_contents.append(src)
names.append(name)
module_contents.append(
'_ALL_CLASSES = [{0}]'.format(', '.join(names)))
module_src = '\n'.join(module_contents) + '\n'
with open(lazy_extractors_filename, 'wt') as f:
f.write(module_src)

View File

@@ -45,9 +45,9 @@ fi
/bin/echo -e "\n### Changing version in version.py..."
sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
/bin/echo -e "\n### Committing documentation and youtube_dl/version.py..."
make README.md CONTRIBUTING.md supportedsites
git add README.md CONTRIBUTING.md docs/supportedsites.md youtube_dl/version.py
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py
git commit -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."

View File

@@ -1,6 +1,7 @@
# Supported sites
- **1tv**: Первый канал
- **1up.com**
- **20min**
- **220.ro**
- **22tracks:genre**
- **22tracks:track**
@@ -15,37 +16,50 @@
- **abc.net.au**
- **Abc7News**
- **AcademicEarth:Course**
- **acast**
- **acast:channel**
- **AddAnime**
- **AdobeTV**
- **AdobeTVChannel**
- **AdobeTVShow**
- **AdobeTVVideo**
- **AdultSwim**
- **Aftenposten**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network
- **Aftonbladet**
- **AirMozilla**
- **AlJazeera**
- **Allocine**
- **AlphaPorno**
- **AnimeOnDemand**
- **anitube.se**
- **AnySex**
- **Aparat**
- **AppleConnect**
- **AppleDaily**: 臺灣蘋果日報
- **AppleTrailers**
- **appletrailers**
- **appletrailers:section**
- **archive.org**: archive.org videos
- **ARD**
- **ARD:mediathek**: Saarländischer Rundfunk
- **ARD:mediathek**
- **arte.tv**
- **arte.tv:+7**
- **arte.tv:cinema**
- **arte.tv:concert**
- **arte.tv:creative**
- **arte.tv:ddc**
- **arte.tv:embed**
- **arte.tv:future**
- **arte.tv:magazine**
- **AtresPlayer**
- **ATTTechChannel**
- **AudiMedia**
- **AudioBoom**
- **audiomack**
- **audiomack:album**
- **auroravid**: AuroraVid
- **Azubu**
- **AzubuLive**
- **BaiduVideo**: 百度视频
- **bambuser**
- **bambuser:channel**
@@ -58,28 +72,39 @@
- **Beeg**
- **BehindKink**
- **Bet**
- **Bigflix**
- **Bild**: Bild.de
- **BiliBili**
- **BioBioChileTV**
- **BleacherReport**
- **BleacherReportCMS**
- **blinkx**
- **blip.tv:user**
- **BlipTV**
- **Bloomberg**
- **BokeCC**
- **Bpb**: Bundeszentrale für politische Bildung
- **BR**: Bayerischer Rundfunk Mediathek
- **BravoTV**
- **Break**
- **Brightcove**
- **brightcove:legacy**
- **brightcove:new**
- **bt:article**: Bergens Tidende Articles
- **bt:vestlendingen**: Bergens Tidende - Vestlendingen
- **BuzzFeed**
- **BYUtv**
- **Camdemy**
- **CamdemyFolder**
- **Canal13cl**
- **CamWithHer**
- **canalc2.tv**
- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
- **Canvas**
- **CBC**
- **CBCPlayer**
- **CBS**
- **CBSInteractive**
- **CBSNews**: CBS News
- **CBSNewsLiveVideo**: CBS News Live Videos
- **CBSSports**
- **CDA**
- **CeskaTelevize**
- **channel9**: Channel 9
- **Chaturbate**
@@ -90,11 +115,14 @@
- **Cinemassacre**
- **Clipfish**
- **cliphunter**
- **ClipRs**
- **Clipsyndicate**
- **cloudtime**: CloudTime
- **Cloudy**
- **Clubic**
- **Clyp**
- **cmt.com**
- **CNET**
- **CNBC**
- **CNN**
- **CNNArticle**
- **CNNBlogs**
@@ -105,27 +133,40 @@
- **ComedyCentralShows**: The Daily Show / The Colbert Report
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **Cracked**
- **Crackle**
- **Criterion**
- **CrooksAndLiars**
- **Crunchyroll**
- **crunchyroll:playlist**
- **CSNNE**
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **culturebox.francetvinfo.fr**
- **CultureUnplugged**
- **CWTV**
- **dailymotion**
- **dailymotion:playlist**
- **dailymotion:user**
- **DailymotionCloud**
- **daum.net**
- **daum.net:clip**
- **daum.net:playlist**
- **daum.net:user**
- **DBTV**
- **DCN**
- **dcn:live**
- **dcn:season**
- **dcn:video**
- **DctpTv**
- **DeezerPlaylist**
- **defense.gouv.fr**
- **democracynow**
- **DHM**: Filmarchiv - Deutsches Historisches Museum
- **Digiteka**
- **Discovery**
- **Dotsub**
- **DouyuTV**: 斗鱼
- **DPlay**
- **dramafever**
- **dramafever:series**
- **DRBonanza**
@@ -135,6 +176,8 @@
- **Dump**
- **Dumpert**
- **dvtv**: http://video.aktualne.cz/
- **dw**
- **dw:article**
- **EaglePlatform**
- **EbaumsWorld**
- **EchoMsk**
@@ -150,7 +193,7 @@
- **Eporner**
- **EroProfile**
- **Escapist**
- **ESPN** (Currently broken)
- **ESPN**
- **EsriVideo**
- **Europa**
- **EveryonesMixtape**
@@ -161,24 +204,29 @@
- **faz.net**
- **fc2**
- **Fczenit**
- **features.aol.com**
- **fernsehkritik.tv**
- **Firstpost**
- **FiveTV**
- **Flickr**
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom**
- **FOX**
- **Foxgay**
- **FoxNews**: Fox News and Fox Business Video
- **FoxSports**
- **france2.fr:generation-quoi**
- **FranceCulture**
- **FranceCultureEmission**
- **FranceInter**
- **francetv**: France 2, 3, 4, 5 and Ô
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **FreeVideo**
- **Funimation**
- **FunnyOrDie**
- **GameInformer**
- **Gamekings**
- **GameOne**
- **gameone:playlist**
@@ -194,24 +242,27 @@
- **Giga**
- **Glide**: Glide mobile video messages (glide.me)
- **Globo**
- **GloboArticle**
- **GodTube**
- **GoldenMoustache**
- **Golem**
- **GorillaVid**: GorillaVid.in, daclips.in, movpod.in, fastvideo.in, realvid.net and filehoot.com
- **GoogleDrive**
- **Goshgay**
- **GPUTechConf**
- **Groupon**
- **Hark**
- **HBO**
- **HearThisAt**
- **Heise**
- **HellPorno**
- **Helsinki**: helsinki.fi
- **HentaiStigma**
- **HistoricFilms**
- **History**
- **hitbox**
- **hitbox:live**
- **HornBunny**
- **HotNewHipHop**
- **HotStar**
- **Howcast**
- **HowStuffWorks**
- **HuffPost**: Huffington Post
@@ -234,12 +285,12 @@
- **Ir90Tv**
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **ivideon**: Ivideon TV
- **Izlesene**
- **JadoreCettePub**
- **JeuxVideo**
- **Jove**
- **jpopsuki.tv**
- **Jukebox**
- **JWPlatform**
- **Kaltura**
- **KanalPlay**: Kanal 5/9/11 Play
- **Kankan**
@@ -249,9 +300,11 @@
- **KeezMovies**
- **KhanAcademy**
- **KickStarter**
- **KonserthusetPlay**
- **kontrtube**: KontrTube.ru - Труба зовёт
- **KrasView**: Красвью
- **Ku6**
- **KUSI**
- **kuwo:album**: 酷我音乐 - 专辑
- **kuwo:category**: 酷我音乐 - 分类
- **kuwo:chart**: 酷我音乐 - 排行榜
@@ -260,10 +313,11 @@
- **kuwo:song**: 酷我音乐
- **la7.tv**
- **Laola1Tv**
- **Le**: 乐视网
- **Lecture2Go**
- **Letv**: 乐视网
- **LetvPlaylist**
- **LetvTv**
- **Lemonde**
- **LePlaylist**
- **LetvCloud**: 乐视云
- **Libsyn**
- **life:embed**
- **lifenews**: LIFE | NEWS
@@ -274,24 +328,30 @@
- **livestream**
- **livestream:original**
- **LnkGo**
- **LoveHomePorn**
- **lrt.lt**
- **lynda**: lynda.com videos
- **lynda:course**: lynda.com online courses
- **m6**
- **macgamestore**: MacGameStore trailers
- **mailru**: Видео@Mail.Ru
- **MakersChannel**
- **MakerTV**
- **Malemotion**
- **MDR**
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
- **metacafe**
- **Metacritic**
- **Mgoon**
- **Minhateca**
- **MinistryGrid**
- **Minoto**
- **miomio.tv**
- **MiTele**: mitele.es
- **mixcloud**
- **MLB**
- **Mnet**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex**
- **Mojvideo**
@@ -303,7 +363,6 @@
- **MovieClips**
- **MovieFap**
- **Moviezine**
- **movshare**: MovShare
- **MPORA**
- **MSNBC**
- **MTV**
@@ -318,10 +377,11 @@
- **MySpace:album**
- **MySpass**
- **Myvi**
- **myvideo**
- **myvideo** (Currently broken)
- **MyVidster**
- **n-tv.de**
- **NationalGeographic**
- **natgeo**
- **natgeo:channel**
- **Naver**
- **NBA**
- **NBC**
@@ -346,11 +406,13 @@
- **Newstube**
- **NextMedia**: 蘋果日報
- **NextMediaActionNews**: 蘋果日報 - 動新聞
- **nextmovie.com**
- **nfb**: National Film Board of Canada
- **nfl.com**
- **nhl.com**
- **nhl.com:news**: NHL news
- **nhl.com:videocenter**: NHL videocenter category
- **nick.com**
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **njoy**: N-JOY
@@ -359,18 +421,21 @@
- **Normalboots**
- **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
- **novamov**: NovaMov
- **nowness**
- **nowness:playlist**
- **nowness:series**
- **NowTV**
- **NowTV** (Currently broken)
- **NowTVList**
- **nowvideo**: NowVideo
- **Noz**
- **npo**: npo.nl and ntr.nl
- **npo.nl:live**
- **npo.nl:radio**
- **npo.nl:radio:fragment**
- **Npr**
- **NRK**
- **NRKPlaylist**
- **NRKSkole**: NRK Skole
- **NRKTV**: NRK TV and NRK Radio
- **ntv.ru**
- **Nuvid**
@@ -383,22 +448,27 @@
- **OnionStudios**
- **Ooyala**
- **OoyalaExternal**
- **Openload**
- **OraTV**
- **orf:fm4**: radio FM4
- **orf:iptv**: iptv.ORF.at
- **orf:oe1**: Radio Österreich 1
- **orf:tvthek**: ORF TVthek
- **pandora.tv**: 판도라TV
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
- **PBS**
- **pbs**: Public Broadcasting Service (PBS) and member stations: PBS: Public Broadcasting Service, APT - Alabama Public Television (WBIQ), GPB/Georgia Public Broadcasting (WGTV), Mississippi Public Broadcasting (WMPN), Nashville Public Television (WNPT), WFSU-TV (WFSU), WSRE (WSRE), WTCI (WTCI), WPBA/Channel 30 (WPBA), Alaska Public Media (KAKM), Arizona PBS (KAET), KNME-TV/Channel 5 (KNME), Vegas PBS (KLVX), AETN/ARKANSAS ETV NETWORK (KETS), KET (WKLE), WKNO/Channel 10 (WKNO), LPB/LOUISIANA PUBLIC BROADCASTING (WLPB), OETA (KETA), Ozarks Public Television (KOZK), WSIU Public Broadcasting (WSIU), KEET TV (KEET), KIXE/Channel 9 (KIXE), KPBS San Diego (KPBS), KQED (KQED), KVIE Public Television (KVIE), PBS SoCal/KOCE (KOCE), ValleyPBS (KVPT), CONNECTICUT PUBLIC TELEVISION (WEDH), KNPB Channel 5 (KNPB), SOPTV (KSYS), Rocky Mountain PBS (KRMA), KENW-TV3 (KENW), KUED Channel 7 (KUED), Wyoming PBS (KCWC), Colorado Public Television / KBDI 12 (KBDI), KBYU-TV (KBYU), Thirteen/WNET New York (WNET), WGBH/Channel 2 (WGBH), WGBY (WGBY), NJTV Public Media NJ (WNJT), WLIW21 (WLIW), mpt/Maryland Public Television (WMPB), WETA Television and Radio (WETA), WHYY (WHYY), PBS 39 (WLVT), WVPT - Your Source for PBS and More! (WVPT), Howard University Television (WHUT), WEDU PBS (WEDU), WGCU Public Media (WGCU), WPBT2 (WPBT), WUCF TV (WUCF), WUFT/Channel 5 (WUFT), WXEL/Channel 42 (WXEL), WLRN/Channel 17 (WLRN), WUSF Public Broadcasting (WUSF), ETV (WRLK), UNC-TV (WUNC), PBS Hawaii - Oceanic Cable Channel 10 (KHET), Idaho Public Television (KAID), KSPS (KSPS), OPB (KOPB), KWSU/Channel 10 & KTNW/Channel 31 (KWSU), WILL-TV (WILL), Network Knowledge - WSEC/Springfield (WSEC), WTTW11 (WTTW), Iowa Public Television/IPTV (KDIN), Nine Network (KETC), PBS39 Fort Wayne (WFWA), WFYI Indianapolis (WFYI), Milwaukee Public Television (WMVS), WNIN (WNIN), WNIT Public Television (WNIT), WPT (WPNE), WVUT/Channel 22 (WVUT), WEIU/Channel 51 (WEIU), WQPT-TV (WQPT), WYCC PBS Chicago (WYCC), WIPB-TV (WIPB), WTIU (WTIU), CET (WCET), ThinkTVNetwork (WPTD), WBGU-TV (WBGU), WGVU TV (WGVU), NET1 (KUON), Pioneer Public Television (KWCM), SDPB Television (KUSD), TPT (KTCA), KSMQ (KSMQ), KPTS/Channel 8 (KPTS), KTWU/Channel 11 (KTWU), East Tennessee PBS (WSJK), WCTE-TV (WCTE), WLJT, Channel 11 (WLJT), WOSU TV (WOSU), WOUB/WOUC (WOUB), WVPB (WVPB), WKYU-PBS (WKYU), KERA 13 (KERA), MPBN (WCBB), Mountain Lake PBS (WCFE), NHPTV (WENH), Vermont PBS (WETK), witf (WITF), WQED Multimedia (WQED), WMHT Educational Telecommunications (WMHT), Q-TV (WDCQ), WTVS Detroit Public TV (WTVS), CMU Public Television (WCMU), WKAR-TV (WKAR), WNMU-TV Public TV 13 (WNMU), WDSE - WRPT (WDSE), WGTE TV (WGTE), Lakeland Public Television (KAWE), KMOS-TV - Channels 6.1, 6.2 and 6.3 (KMOS), MontanaPBS (KUSM), KRWG/Channel 22 (KRWG), KACV (KACV), KCOS/Channel 13 (KCOS), WCNY/Channel 24 (WCNY), WNED (WNED), WPBS (WPBS), WSKG Public TV (WSKG), WXXI (WXXI), WPSU (WPSU), WVIA Public Media Studios (WVIA), WTVI (WTVI), Western Reserve PBS (WNEO), WVIZ/PBS ideastream (WVIZ), KCTS 9 (KCTS), Basin PBS (KPBT), KUHT / Channel 8 (KUHT), KLRN (KLRN), KLRU (KLRU), WTJX Channel 12 (WTJX), WCVE PBS (WCVE), KBTC Public Television (KBTC)
- **pcmag**
- **Periscope**: Periscope
- **PhilharmonieDeParis**: Philharmonie de Paris
- **Phoenix**
- **phoenix.de**
- **Photobucket**
- **Pinkbike**
- **Pladform**
- **PlanetaPlay**
- **play.fm**
- **played.to**
- **PlaysTV**
- **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz
- **Playvid**
- **Playwire**
@@ -410,9 +480,11 @@
- **PornHd**
- **PornHub**
- **PornHubPlaylist**
- **PornHubUserVideos**
- **Pornotube**
- **PornoVoisines**
- **PornoXO**
- **PressTV**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
@@ -423,7 +495,6 @@
- **qqmusic:playlist**: QQ音乐 - 歌单
- **qqmusic:singer**: QQ音乐 - 歌手
- **qqmusic:toplist**: QQ音乐 - 排行榜
- **Quickscope**: Quick Scope
- **QuickVid**
- **R7**
- **radio.de**
@@ -431,16 +502,21 @@
- **radiofrance**
- **RadioJavan**
- **Rai**
- **RaiTV**
- **RBMARadio**
- **RDS**: RDS.ca
- **RedTube**
- **RegioTV**
- **Restudy**
- **ReverbNation**
- **Revision3**
- **RICE**
- **RingTV**
- **RottenTomatoes**
- **Roxwel**
- **RTBF**
- **Rte**
- **rte**: Raidió Teilifís Éireann TV
- **rte:radio**: Raidió Teilifís Éireann radio
- **rtl.nl**: rtl.nl and rtlxl.nl
- **RTL2**
- **RTP**
@@ -450,6 +526,7 @@
- **rtve.es:live**: RTVE.es live streams
- **RTVNH**
- **RUHD**
- **RulePorn**
- **rutube**: Rutube videos
- **rutube:channel**: Rutube channels
- **rutube:embed**: Rutube embedded videos
@@ -458,15 +535,18 @@
- **RUTV**: RUTV.RU
- **Ruutu**
- **safari**: safaribooksonline.com online video
- **safari:api**
- **safari:course**: safaribooksonline.com online courses
- **Sandia**: Sandia National Laboratories
- **Sapo**: SAPO Vídeos
- **savefrom.net**
- **SBS**: sbs.com.au
- **schooltv**
- **SciVee**
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**
- **ScreenJunkies**
- **ScreenwaveMedia**
- **SenateISVP**
- **ServingSys**
@@ -476,6 +556,8 @@
- **Shared**: shared.sx and vivo.sx
- **ShareSix**
- **Sina**
- **skynewsarabia:video**
- **skynewsarabia:video**
- **Slideshare**
- **Slutload**
- **smotri**: Smotri.com
@@ -486,10 +568,9 @@
- **SnagFilmsEmbed**
- **Snotr**
- **Sohu**
- **soompi**
- **soompi:show**
- **soundcloud**
- **soundcloud:playlist**
- **soundcloud:search**: Soundcloud search
- **soundcloud:set**
- **soundcloud:user**
- **soundgasm**
@@ -499,7 +580,6 @@
- **southpark.de**
- **southpark.nl**
- **southparkstudios.dk**
- **Space**
- **SpankBang**
- **Spankwire**
- **Spiegel**
@@ -511,8 +591,8 @@
- **SportBoxEmbed**
- **SportDeutschland**
- **Sportschau**
- **Srf**
- **SRMediathek**: Saarländischer Rundfunk
- **SRGSSR**
- **SRGSSRPlay**: srf.ch, rts.ch, rsi.ch, rtr.ch and swissinfo.ch play sites
- **SSA**
- **stanfordoc**: Stanford Open ClassRoom
- **Steam**
@@ -537,48 +617,59 @@
- **TechTalks**
- **techtv.mit.edu**
- **ted**
- **Tele13**
- **TeleBruxelles**
- **Telecinco**: telecinco.es, cuatro.com and mediaset.es
- **Telegraaf**
- **TeleMB**
- **TeleTask**
- **TenPlay**
- **TestTube**
- **TF1**
- **TheIntercept**
- **TheOnion**
- **ThePlatform**
- **ThePlatformFeed**
- **TheScene**
- **TheSixtyOne**
- **TheStar**
- **ThisAmericanLife**
- **ThisAV**
- **THVideo**
- **THVideoPlaylist**
- **tinypic**: tinypic.com videos
- **tlc.com**
- **tlc.de**
- **TMZ**
- **TMZArticle**
- **TNAFlix**
- **TNAFlixNetworkEmbed**
- **toggle**
- **tou.tv**
- **Toypics**: Toypics user profile
- **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken)
- **Trilulilu**
- **trollvids**
- **TruTube**
- **Tube8**
- **TubiTv**
- **Tudou**
- **tudou**
- **tudou:album**
- **tudou:playlist**
- **Tumblr**
- **TuneIn**
- **tunein:clip**
- **tunein:program**
- **tunein:station**
- **tunein:topic**
- **Turbo**
- **Tutv**
- **tv.dfb.de**
- **TV2**
- **TV2Article**
- **TV3**
- **TV4**: tv4.se and tv4play.se
- **TVC**
- **TVCArticle**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **tvland.com**
- **tvp.pl**
- **tvp.pl:Series**
- **TVPlay**: TV3Play and related services
@@ -591,16 +682,18 @@
- **twitch:video**
- **twitch:vod**
- **twitter**
- **twitter:amplify**
- **twitter:card**
- **Ubu**
- **udemy**
- **udemy:course**
- **UDNEmbed**: 聯合影音
- **Ultimedia**
- **Unistra**
- **Urort**: NRK P3 Urørt
- **USAToday**
- **ustream**
- **ustream:channel**
- **Ustudio**
- **Varzesh3**
- **Vbox7**
- **VeeHD**
@@ -608,24 +701,30 @@
- **Vessel**
- **Vesti**: Вести.Ru
- **Vevo**
- **VGTV**: VGTV and BTTV
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com**
- **Vice**
- **ViceShow**
- **Viddler**
- **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective**
- **videofy.me**
- **videolectures.net**
- **VideoMega**
- **videomore**
- **videomore:season**
- **videomore:video**
- **VideoPremium**
- **VideoTt**: video.tt - Your True Tube
- **VideoTt**: video.tt - Your True Tube (Currently broken)
- **videoweed**: VideoWeed
- **Vidme**
- **vidme**
- **vidme:user**
- **vidme:user:likes**
- **Vidzi**
- **vier**
- **vier:videos**
- **Viewster**
- **Viidea**
- **viki**
- **viki:channel**
- **vimeo**
@@ -633,6 +732,7 @@
- **vimeo:channel**
- **vimeo:group**
- **vimeo:likes**: Vimeo user likes
- **vimeo:ondemand**
- **vimeo:review**: Review pages on vimeo
- **vimeo:user**
- **vimeo:watchlater**: Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)
@@ -644,6 +744,7 @@
- **vlive**
- **Vodlocker**
- **VoiceRepublic**
- **VoxMedia**
- **Vporn**
- **vpro**: npo.nl and ntr.nl
- **VRT**
@@ -660,6 +761,8 @@
- **WebOfStories**
- **WebOfStoriesPlaylist**
- **Weibo**
- **WeiqiTV**: WQTV
- **wholecloud**: WholeCloud
- **Wimp**
- **Wistia**
- **WNL**
@@ -668,6 +771,7 @@
- **WSJ**: Wall Street Journal
- **XBef**
- **XboxClips**
- **XFileShare**: XFileShare based sites: GorillaVid.in, daclips.in, movpod.in, fastvideo.in, realvid.net, filehoot.com and vidto.me
- **XHamster**
- **XHamsterEmbed**
- **XMinus**
@@ -694,7 +798,9 @@
- **youtube:channel**: YouTube.com channels
- **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
- **youtube:live**: YouTube.com live streams
- **youtube:playlist**: YouTube.com playlists
- **youtube:playlists**: YouTube.com user/channel playlists
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
- **youtube:search**: YouTube.com searches
- **youtube:search:date**: YouTube.com searches, newest videos first
@@ -708,3 +814,4 @@
- **ZDFChannel**
- **zingmp3:album**: mp3.zing.vn albums
- **zingmp3:song**: mp3.zing.vn songs
- **ZippCast**

View File

@@ -2,5 +2,5 @@
universal = True
[flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,setup.py,build,.git
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git
ignore = E402,E501,E731

View File

@@ -8,11 +8,12 @@ import warnings
import sys
try:
from setuptools import setup
from setuptools import setup, Command
setuptools_available = True
except ImportError:
from distutils.core import setup
from distutils.core import setup, Command
setuptools_available = False
from distutils.spawn import spawn
try:
# This will create an exe that needs Microsoft Visual C++ 2008
@@ -70,6 +71,22 @@ else:
else:
params['scripts'] = ['bin/youtube-dl']
class build_lazy_extractors(Command):
description = "Build the extractor lazy loading module"
user_options = []
def initialize_options(self):
pass
def finalize_options(self):
pass
def run(self):
spawn(
[sys.executable, 'devscripts/make_lazy_extractors.py', 'youtube_dl/extractor/lazy_extractors.py'],
dry_run=self.dry_run,
)
# Get the version from youtube_dl/version.py without importing the package
exec(compile(open('youtube_dl/version.py').read(),
'youtube_dl/version.py', 'exec'))
@@ -107,5 +124,6 @@ setup(
"Programming Language :: Python :: 3.4",
],
cmdclass={'build_lazy_extractors': build_lazy_extractors},
**params
)

View File

@@ -11,8 +11,11 @@ import sys
import youtube_dl.extractor
from youtube_dl import YoutubeDL
from youtube_dl.utils import (
from youtube_dl.compat import (
compat_os_name,
compat_str,
)
from youtube_dl.utils import (
preferredencoding,
write_string,
)
@@ -42,7 +45,7 @@ def report_warning(message):
Print the message to stderr, it will be prefixed with 'WARNING:'
If stderr is a tty file the 'WARNING:' will be colored
'''
if sys.stderr.isatty() and os.name != 'nt':
if sys.stderr.isatty() and compat_os_name != 'nt':
_msg_header = '\033[0;33mWARNING:\033[0m'
else:
_msg_header = 'WARNING:'
@@ -140,6 +143,9 @@ def expect_value(self, got, expected, field):
expect_value(self, item_got, item_expected, field)
else:
if isinstance(expected, compat_str) and expected.startswith('md5:'):
self.assertTrue(
isinstance(got, compat_str),
'Expected field %s to be a unicode object, but got value %r of type %r' % (field, got, type(got)))
got = 'md5:' + md5(got)
elif isinstance(expected, compat_str) and expected.startswith('mincount:'):
self.assertTrue(

View File

@@ -11,6 +11,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL
from youtube_dl.extractor.common import InfoExtractor
from youtube_dl.extractor import YoutubeIE, get_info_extractor
from youtube_dl.utils import encode_data_uri, strip_jsonp, ExtractorError
class TestIE(InfoExtractor):
@@ -66,5 +67,14 @@ class TestInfoExtractor(unittest.TestCase):
self.assertEqual(ie._html_search_meta('e', html), '5')
self.assertEqual(ie._html_search_meta('f', html), '6')
def test_download_json(self):
uri = encode_data_uri(b'{"foo": "blah"}', 'application/json')
self.assertEqual(self.ie._download_json(uri, None), {'foo': 'blah'})
uri = encode_data_uri(b'callback({"foo": "blah"})', 'application/javascript')
self.assertEqual(self.ie._download_json(uri, None, transform_source=strip_jsonp), {'foo': 'blah'})
uri = encode_data_uri(b'{"foo": invalid}', 'application/json')
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
if __name__ == '__main__':
unittest.main()

View File

@@ -12,8 +12,9 @@ import copy
from test.helper import FakeYDL, assertRegexpMatches
from youtube_dl import YoutubeDL
from youtube_dl.compat import compat_str
from youtube_dl.compat import compat_str, compat_urllib_error
from youtube_dl.extractor import YoutubeIE
from youtube_dl.extractor.common import InfoExtractor
from youtube_dl.postprocessor.common import PostProcessor
from youtube_dl.utils import ExtractorError, match_filter_func
@@ -221,9 +222,24 @@ class TestFormatSelection(unittest.TestCase):
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'dash-video-low')
ydl = YDL({'format': 'bestvideo[format_id^=dash][format_id$=low]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'dash-video-low')
formats = [
{'format_id': 'vid-vcodec-dot', 'ext': 'mp4', 'preference': 1, 'vcodec': 'avc1.123456', 'acodec': 'none', 'url': TEST_URL},
]
info_dict = _make_result(formats)
ydl = YDL({'format': 'bestvideo[vcodec=avc1.123456]'})
ydl.process_ie_result(info_dict.copy())
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['format_id'], 'vid-vcodec-dot')
def test_youtube_format_selection(self):
order = [
'38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '36', '17', '13',
'38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13',
# Apple HTTP Live Streaming
'96', '95', '94', '93', '92', '132', '151',
# 3D
@@ -237,6 +253,17 @@ class TestFormatSelection(unittest.TestCase):
def format_info(f_id):
info = YoutubeIE._formats[f_id].copy()
# XXX: In real cases InfoExtractor._parse_mpd_formats() fills up 'acodec'
# and 'vcodec', while in tests such information is incomplete since
# commit a6c2c24479e5f4827ceb06f64d855329c0a6f593
# test_YoutubeDL.test_youtube_format_selection is broken without
# this fix
if 'acodec' in info and 'vcodec' not in info:
info['vcodec'] = 'none'
elif 'vcodec' in info and 'acodec' not in info:
info['acodec'] = 'none'
info['format_id'] = f_id
info['url'] = 'url:' + f_id
return info
@@ -480,6 +507,9 @@ class TestYoutubeDL(unittest.TestCase):
assertRegexpMatches(self, ydl._format_note({
'vbr': 10,
}), '^\s*10k$')
assertRegexpMatches(self, ydl._format_note({
'fps': 30,
}), '^30fps$')
def test_postprocessors(self):
filename = 'post-processor-testfile.mp4'
@@ -631,6 +661,47 @@ class TestYoutubeDL(unittest.TestCase):
result = get_ids({'playlist_items': '10'})
self.assertEqual(result, [])
def test_urlopen_no_file_protocol(self):
# see https://github.com/rg3/youtube-dl/issues/8227
ydl = YDL()
self.assertRaises(compat_urllib_error.URLError, ydl.urlopen, 'file:///etc/passwd')
def test_do_not_override_ie_key_in_url_transparent(self):
ydl = YDL()
class Foo1IE(InfoExtractor):
_VALID_URL = r'foo1:'
def _real_extract(self, url):
return {
'_type': 'url_transparent',
'url': 'foo2:',
'ie_key': 'Foo2',
}
class Foo2IE(InfoExtractor):
_VALID_URL = r'foo2:'
def _real_extract(self, url):
return {
'_type': 'url',
'url': 'foo3:',
'ie_key': 'Foo3',
}
class Foo3IE(InfoExtractor):
_VALID_URL = r'foo3:'
def _real_extract(self, url):
return _make_result([{'url': TEST_URL}])
ydl.add_info_extractor(Foo1IE(ydl))
ydl.add_info_extractor(Foo2IE(ydl))
ydl.add_info_extractor(Foo3IE(ydl))
ydl.extract_info('foo1:')
downloaded = ydl.downloaded_info_dicts[0]
self.assertEqual(downloaded['url'], TEST_URL)
if __name__ == '__main__':
unittest.main()

View File

@@ -56,7 +56,7 @@ class TestAllURLsMatching(unittest.TestCase):
assertChannel('https://www.youtube.com/channel/HCtnHdj3df7iM/videos')
def test_youtube_user_matching(self):
self.assertMatch('www.youtube.com/NASAgovVideo/videos', ['youtube:user'])
self.assertMatch('http://www.youtube.com/NASAgovVideo/videos', ['youtube:user'])
def test_youtube_feeds(self):
self.assertMatch('https://www.youtube.com/feed/watch_later', ['youtube:watchlater'])
@@ -121,8 +121,8 @@ class TestAllURLsMatching(unittest.TestCase):
def test_pbs(self):
# https://github.com/rg3/youtube-dl/issues/2350
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['PBS'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['PBS'])
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
def test_yahoo_https(self):
# https://github.com/rg3/youtube-dl/issues/2701

View File

@@ -13,10 +13,13 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.utils import get_filesystem_encoding
from youtube_dl.compat import (
compat_getenv,
compat_etree_fromstring,
compat_expanduser,
compat_shlex_split,
compat_str,
compat_urllib_parse_unquote,
compat_urllib_parse_unquote_plus,
compat_urllib_parse_urlencode,
)
@@ -68,8 +71,33 @@ class TestCompat(unittest.TestCase):
self.assertEqual(compat_urllib_parse_unquote_plus('abc%20def'), 'abc def')
self.assertEqual(compat_urllib_parse_unquote_plus('%7e/abc+def'), '~/abc def')
def test_compat_urllib_parse_urlencode(self):
self.assertEqual(compat_urllib_parse_urlencode({'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': 'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode({b'abc': b'def'}), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([('abc', b'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', 'def')]), 'abc=def')
self.assertEqual(compat_urllib_parse_urlencode([(b'abc', b'def')]), 'abc=def')
def test_compat_shlex_split(self):
self.assertEqual(compat_shlex_split('-option "one two"'), ['-option', 'one two'])
def test_compat_etree_fromstring(self):
xml = '''
<root foo="bar" spam="中文">
<normal>foo</normal>
<chinese>中文</chinese>
<foo><bar>spam</bar></foo>
</root>
'''
doc = compat_etree_fromstring(xml.encode('utf-8'))
self.assertTrue(isinstance(doc.attrib['foo'], compat_str))
self.assertTrue(isinstance(doc.attrib['spam'], compat_str))
self.assertTrue(isinstance(doc.find('normal').text, compat_str))
self.assertTrue(isinstance(doc.find('chinese').text, compat_str))
self.assertTrue(isinstance(doc.find('foo/bar').text, compat_str))
if __name__ == '__main__':
unittest.main()

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
@@ -52,7 +53,12 @@ class TestHTTP(unittest.TestCase):
('localhost', 0), HTTPTestRequestHandler)
self.httpd.socket = ssl.wrap_socket(
self.httpd.socket, certfile=certfn, server_side=True)
self.port = self.httpd.socket.getsockname()[1]
if os.name == 'java':
# In Jython SSLSocket is not a subclass of socket.socket
sock = self.httpd.socket.sock
else:
sock = self.httpd.socket
self.port = sock.getsockname()[1]
self.server_thread = threading.Thread(target=self.httpd.serve_forever)
self.server_thread.daemon = True
self.server_thread.start()
@@ -115,5 +121,14 @@ class TestProxy(unittest.TestCase):
response = ydl.urlopen(req).read().decode('utf-8')
self.assertEqual(response, 'cn: {0}'.format(url))
def test_proxy_with_idn(self):
ydl = YoutubeDL({
'proxy': 'localhost:{0}'.format(self.port),
})
url = 'http://中文.tw/'
response = ydl.urlopen(url).read().decode('utf-8')
# b'xn--fiq228c' is '中文'.encode('idna')
self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
if __name__ == '__main__':
unittest.main()

View File

@@ -0,0 +1,47 @@
#!/usr/bin/env python
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import FakeYDL
from youtube_dl.extractor import IqiyiIE
class IqiyiIEWithCredentials(IqiyiIE):
def _get_login_info(self):
return 'foo', 'bar'
class WarningLogger(object):
def __init__(self):
self.messages = []
def warning(self, msg):
self.messages.append(msg)
def debug(self, msg):
pass
def error(self, msg):
pass
class TestIqiyiSDKInterpreter(unittest.TestCase):
def test_iqiyi_sdk_interpreter(self):
'''
Test the functionality of IqiyiSDKInterpreter by trying to log in
If `sign` is incorrect, /validate call throws an HTTP 556 error
'''
logger = WarningLogger()
ie = IqiyiIEWithCredentials(FakeYDL({'logger': logger}))
ie._login()
self.assertTrue('unable to log in:' in logger.messages[0])
if __name__ == '__main__':
unittest.main()

View File

@@ -19,6 +19,9 @@ class TestJSInterpreter(unittest.TestCase):
jsi = JSInterpreter('function x3(){return 42;}')
self.assertEqual(jsi.call_function('x3'), 42)
jsi = JSInterpreter('var x5 = function(){return 42;}')
self.assertEqual(jsi.call_function('x5'), 42)
def test_calc(self):
jsi = JSInterpreter('function x4(a){return 2*a+1;}')
self.assertEqual(jsi.call_function('x4', 3), 7)

View File

@@ -11,7 +11,6 @@ from test.helper import FakeYDL, md5
from youtube_dl.extractor import (
BlipTVIE,
YoutubeIE,
DailymotionIE,
TEDIE,
@@ -22,12 +21,13 @@ from youtube_dl.extractor import (
NPOIE,
ComedyCentralIE,
NRKTVIE,
RaiIE,
RaiTVIE,
VikiIE,
ThePlatformIE,
ThePlatformFeedIE,
RTVEALaCartaIE,
FunnyOrDieIE,
DemocracynowIE,
)
@@ -65,16 +65,16 @@ class TestYoutubeSubtitles(BaseTestSubtitles):
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(len(subtitles.keys()), 13)
self.assertEqual(md5(subtitles['en']), '4cd9278a35ba2305f47354ee13472260')
self.assertEqual(md5(subtitles['it']), '164a51f16f260476a05b50fe4c2f161d')
for lang in ['it', 'fr', 'de']:
self.assertEqual(md5(subtitles['en']), '3cb210999d3e021bd6c7f0ea751eab06')
self.assertEqual(md5(subtitles['it']), '6d752b98c31f1cf8d597050c7a2cb4b5')
for lang in ['fr', 'de']:
self.assertTrue(subtitles.get(lang) is not None, 'Subtitles for \'%s\' not extracted' % lang)
def test_youtube_subtitles_sbv_format(self):
def test_youtube_subtitles_ttml_format(self):
self.DL.params['writesubtitles'] = True
self.DL.params['subtitlesformat'] = 'sbv'
self.DL.params['subtitlesformat'] = 'ttml'
subtitles = self.getSubtitles()
self.assertEqual(md5(subtitles['en']), '13aeaa0c245a8bed9a451cb643e3ad8b')
self.assertEqual(md5(subtitles['en']), 'e306f8c42842f723447d9f63ad65df54')
def test_youtube_subtitles_vtt_format(self):
self.DL.params['writesubtitles'] = True
@@ -144,18 +144,6 @@ class TestTedSubtitles(BaseTestSubtitles):
self.assertTrue(subtitles.get(lang) is not None, 'Subtitles for \'%s\' not extracted' % lang)
class TestBlipTVSubtitles(BaseTestSubtitles):
url = 'http://blip.tv/a/a-6603250'
IE = BlipTVIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), '5b75c300af65fe4476dff79478bb93e4')
class TestVimeoSubtitles(BaseTestSubtitles):
url = 'http://vimeo.com/76979871'
IE = VimeoIE
@@ -272,7 +260,7 @@ class TestNRKSubtitles(BaseTestSubtitles):
class TestRaiSubtitles(BaseTestSubtitles):
url = 'http://www.rai.tv/dl/RaiTV/programmi/media/ContentItem-cb27157f-9dd0-4aee-b788-b1f67643a391.html'
IE = RaiIE
IE = RaiTVIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
@@ -346,5 +334,25 @@ class TestFunnyOrDieSubtitles(BaseTestSubtitles):
self.assertEqual(md5(subtitles['en']), 'c5593c193eacd353596c11c2d4f9ecc4')
class TestDemocracynowSubtitles(BaseTestSubtitles):
url = 'http://www.democracynow.org/shows/2015/7/3'
IE = DemocracynowIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), 'acaca989e24a9e45a6719c9b3d60815c')
def test_subtitles_in_page(self):
self.url = 'http://www.democracynow.org/2015/7/3/this_flag_comes_down_today_bree'
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), 'acaca989e24a9e45a6719c9b3d60815c')
if __name__ == '__main__':
unittest.main()

30
test/test_update.py Normal file
View File

@@ -0,0 +1,30 @@
#!/usr/bin/env python
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import json
from youtube_dl.update import rsa_verify
class TestUpdate(unittest.TestCase):
def test_rsa_verify(self):
UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
with open(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'versions.json'), 'rb') as f:
versions_info = f.read().decode()
versions_info = json.loads(versions_info)
signature = versions_info['signature']
del versions_info['signature']
self.assertTrue(rsa_verify(
json.dumps(versions_info, sort_keys=True).encode('utf-8'),
signature, UPDATES_RSA_KEY))
if __name__ == '__main__':
unittest.main()

View File

@@ -18,12 +18,18 @@ import xml.etree.ElementTree
from youtube_dl.utils import (
age_restricted,
args_to_str,
encode_base_n,
clean_html,
date_from_str,
DateRange,
detect_exe_version,
determine_ext,
dict_get,
encode_compat_str,
encodeFilename,
escape_rfc3986,
escape_url,
extract_attributes,
ExtractorError,
find_xpath_attr,
fix_xml_ampersands,
@@ -32,16 +38,19 @@ from youtube_dl.utils import (
is_html,
js_to_json,
limit_length,
ohdave_rsa_encrypt,
OnDemandPagedList,
orderedSet,
parse_duration,
parse_filesize,
parse_count,
parse_iso8601,
read_batch_urls,
sanitize_filename,
sanitize_path,
prepend_extension,
replace_extension,
remove_quotes,
shell_quote,
smuggle_url,
str_to_int,
@@ -55,6 +64,7 @@ from youtube_dl.utils import (
lowercase_escape,
url_basename,
urlencode_postdata,
update_url_query,
version_tuple,
xpath_with_ns,
xpath_element,
@@ -68,6 +78,12 @@ from youtube_dl.utils import (
cli_valueless_option,
cli_bool_option,
)
from youtube_dl.compat import (
compat_chr,
compat_etree_fromstring,
compat_urlparse,
compat_parse_qs,
)
class TestUtil(unittest.TestCase):
@@ -196,6 +212,15 @@ class TestUtil(unittest.TestCase):
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_remove_quotes(self):
self.assertEqual(remove_quotes(None), None)
self.assertEqual(remove_quotes('"'), '"')
self.assertEqual(remove_quotes("'"), "'")
self.assertEqual(remove_quotes(';'), ';')
self.assertEqual(remove_quotes('";'), '";')
self.assertEqual(remove_quotes('""'), '')
self.assertEqual(remove_quotes('";"'), ';')
def test_ordered_set(self):
self.assertEqual(orderedSet([1, 1, 2, 3, 4, 4, 5, 6, 7, 3, 5]), [1, 2, 3, 4, 5, 6, 7])
self.assertEqual(orderedSet([]), [])
@@ -207,8 +232,15 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unescapeHTML('%20;'), '%20;')
self.assertEqual(unescapeHTML('&#x2F;'), '/')
self.assertEqual(unescapeHTML('&#47;'), '/')
self.assertEqual(
unescapeHTML('&eacute;'), 'é')
self.assertEqual(unescapeHTML('&eacute;'), 'é')
self.assertEqual(unescapeHTML('&#2013266066;'), '&#2013266066;')
def test_date_from_str(self):
self.assertEqual(date_from_str('yesterday'), date_from_str('now-1day'))
self.assertEqual(date_from_str('now+7day'), date_from_str('now+1week'))
self.assertEqual(date_from_str('now+14day'), date_from_str('now+2week'))
self.assertEqual(date_from_str('now+365day'), date_from_str('now+1year'))
self.assertEqual(date_from_str('now+30day'), date_from_str('now+1month'))
def test_daterange(self):
_20century = DateRange("19000101", "20000101")
@@ -232,7 +264,16 @@ class TestUtil(unittest.TestCase):
self.assertEqual(
unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
'20150202')
self.assertEqual(unified_strdate('Feb 14th 2016 5:45PM'), '20160214')
self.assertEqual(unified_strdate('25-09-2014'), '20140925')
self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
self.assertEqual(determine_ext('http://example.com/foo/bar/?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar.nonext/?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar/mp4?download', None), None)
self.assertEqual(determine_ext('http://example.com/foo/bar.m3u8//?download'), 'm3u8')
def test_find_xpath_attr(self):
testxml = '''<root>
@@ -242,7 +283,7 @@ class TestUtil(unittest.TestCase):
<node x="b" y="d" />
<node x="" />
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
doc = compat_etree_fromstring(testxml)
self.assertEqual(find_xpath_attr(doc, './/fourohfour', 'n'), None)
self.assertEqual(find_xpath_attr(doc, './/fourohfour', 'n', 'v'), None)
@@ -263,7 +304,7 @@ class TestUtil(unittest.TestCase):
<url>http://server.com/download.mp3</url>
</media:song>
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
doc = compat_etree_fromstring(testxml)
find = lambda p: doc.find(xpath_with_ns(p, {'media': 'http://example.com/'}))
self.assertTrue(find('media:song') is not None)
self.assertEqual(find('media:song/media:author').text, 'The Author')
@@ -275,9 +316,16 @@ class TestUtil(unittest.TestCase):
p = xml.etree.ElementTree.SubElement(div, 'p')
p.text = 'Foo'
self.assertEqual(xpath_element(doc, 'div/p'), p)
self.assertEqual(xpath_element(doc, ['div/p']), p)
self.assertEqual(xpath_element(doc, ['div/bar', 'div/p']), p)
self.assertEqual(xpath_element(doc, 'div/bar', default='default'), 'default')
self.assertEqual(xpath_element(doc, ['div/bar'], default='default'), 'default')
self.assertTrue(xpath_element(doc, 'div/bar') is None)
self.assertTrue(xpath_element(doc, ['div/bar']) is None)
self.assertTrue(xpath_element(doc, ['div/bar'], 'div/baz') is None)
self.assertRaises(ExtractorError, xpath_element, doc, 'div/bar', fatal=True)
self.assertRaises(ExtractorError, xpath_element, doc, ['div/bar'], fatal=True)
self.assertRaises(ExtractorError, xpath_element, doc, ['div/bar', 'div/baz'], fatal=True)
def test_xpath_text(self):
testxml = '''<root>
@@ -285,7 +333,7 @@ class TestUtil(unittest.TestCase):
<p>Foo</p>
</div>
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
doc = compat_etree_fromstring(testxml)
self.assertEqual(xpath_text(doc, 'div/p'), 'Foo')
self.assertEqual(xpath_text(doc, 'div/bar', default='default'), 'default')
self.assertTrue(xpath_text(doc, 'div/bar') is None)
@@ -297,7 +345,7 @@ class TestUtil(unittest.TestCase):
<p x="a">Foo</p>
</div>
</root>'''
doc = xml.etree.ElementTree.fromstring(testxml)
doc = compat_etree_fromstring(testxml)
self.assertEqual(xpath_attr(doc, 'div/p', 'x'), 'a')
self.assertEqual(xpath_attr(doc, 'div/bar', 'x'), None)
self.assertEqual(xpath_attr(doc, 'div/p', 'y'), None)
@@ -420,11 +468,73 @@ class TestUtil(unittest.TestCase):
data = urlencode_postdata({'username': 'foo@bar.com', 'password': '1234'})
self.assertTrue(isinstance(data, bytes))
def test_update_url_query(self):
def query_dict(url):
return compat_parse_qs(compat_urlparse.urlparse(url).query)
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'quality': ['HD'], 'format': ['mp4']})),
query_dict('http://example.com/path?quality=HD&format=mp4'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'system': ['LINUX', 'WINDOWS']})),
query_dict('http://example.com/path?system=LINUX&system=WINDOWS'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'fields': 'id,formats,subtitles'})),
query_dict('http://example.com/path?fields=id,formats,subtitles'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'fields': ('id,formats,subtitles', 'thumbnails')})),
query_dict('http://example.com/path?fields=id,formats,subtitles&fields=thumbnails'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path?manifest=f4m', {'manifest': []})),
query_dict('http://example.com/path'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path?system=LINUX&system=WINDOWS', {'system': 'LINUX'})),
query_dict('http://example.com/path?system=LINUX'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'fields': b'id,formats,subtitles'})),
query_dict('http://example.com/path?fields=id,formats,subtitles'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'width': 1080, 'height': 720})),
query_dict('http://example.com/path?width=1080&height=720'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'bitrate': 5020.43})),
query_dict('http://example.com/path?bitrate=5020.43'))
self.assertEqual(query_dict(update_url_query(
'http://example.com/path', {'test': '第二行тест'})),
query_dict('http://example.com/path?test=%E7%AC%AC%E4%BA%8C%E8%A1%8C%D1%82%D0%B5%D1%81%D1%82'))
def test_dict_get(self):
FALSE_VALUES = {
'none': None,
'false': False,
'zero': 0,
'empty_string': '',
'empty_list': [],
}
d = FALSE_VALUES.copy()
d['a'] = 42
self.assertEqual(dict_get(d, 'a'), 42)
self.assertEqual(dict_get(d, 'b'), None)
self.assertEqual(dict_get(d, 'b', 42), 42)
self.assertEqual(dict_get(d, ('a', )), 42)
self.assertEqual(dict_get(d, ('b', 'a', )), 42)
self.assertEqual(dict_get(d, ('b', 'c', 'a', 'd', )), 42)
self.assertEqual(dict_get(d, ('b', 'c', )), None)
self.assertEqual(dict_get(d, ('b', 'c', ), 42), 42)
for key, false_value in FALSE_VALUES.items():
self.assertEqual(dict_get(d, ('b', 'c', key, )), None)
self.assertEqual(dict_get(d, ('b', 'c', key, ), skip_false_values=False), false_value)
def test_encode_compat_str(self):
self.assertEqual(encode_compat_str(b'\xd1\x82\xd0\xb5\xd1\x81\xd1\x82', 'utf-8'), 'тест')
self.assertEqual(encode_compat_str('тест', 'utf-8'), 'тест')
def test_parse_iso8601(self):
self.assertEqual(parse_iso8601('2014-03-23T23:04:26+0100'), 1395612266)
self.assertEqual(parse_iso8601('2014-03-23T22:04:26+0000'), 1395612266)
self.assertEqual(parse_iso8601('2014-03-23T22:04:26Z'), 1395612266)
self.assertEqual(parse_iso8601('2014-03-23T22:04:26.1234Z'), 1395612266)
self.assertEqual(parse_iso8601('2015-09-29T08:27:31.727'), 1443515251)
self.assertEqual(parse_iso8601('2015-09-29T08-27-31.727'), None)
def test_strip_jsonp(self):
stripped = strip_jsonp('cb ([ {"id":"532cb",\n\n\n"x":\n3}\n]\n);')
@@ -435,6 +545,10 @@ class TestUtil(unittest.TestCase):
d = json.loads(stripped)
self.assertEqual(d, {'STATUS': 'OK'})
stripped = strip_jsonp('ps.embedHandler({"status": "success"});')
d = json.loads(stripped)
self.assertEqual(d, {'status': 'success'})
def test_uppercase_escape(self):
self.assertEqual(uppercase_escape(''), '')
self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
@@ -471,11 +585,11 @@ class TestUtil(unittest.TestCase):
)
self.assertEqual(
escape_url('http://тест.рф/фрагмент'),
'http://тест.рф/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
'http://xn--e1aybc.xn--p1ai/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
)
self.assertEqual(
escape_url('http://тест.рф/абв?абв=абв#абв'),
'http://тест.рф/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
'http://xn--e1aybc.xn--p1ai/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
)
self.assertEqual(escape_url('http://vimeo.com/56015672#at=0'), 'http://vimeo.com/56015672#at=0')
@@ -525,6 +639,44 @@ class TestUtil(unittest.TestCase):
on = js_to_json('{"abc": "def",}')
self.assertEqual(json.loads(on), {'abc': 'def'})
def test_extract_attributes(self):
self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
self.assertEqual(extract_attributes('<e x=y>'), {'x': 'y'})
self.assertEqual(extract_attributes('<e x="a \'b\' c">'), {'x': "a 'b' c"})
self.assertEqual(extract_attributes('<e x=\'a "b" c\'>'), {'x': 'a "b" c'})
self.assertEqual(extract_attributes('<e x="&#121;">'), {'x': 'y'})
self.assertEqual(extract_attributes('<e x="&#x79;">'), {'x': 'y'})
self.assertEqual(extract_attributes('<e x="&amp;">'), {'x': '&'}) # XML
self.assertEqual(extract_attributes('<e x="&quot;">'), {'x': '"'})
self.assertEqual(extract_attributes('<e x="&pound;">'), {'x': '£'}) # HTML 3.2
self.assertEqual(extract_attributes('<e x="&lambda;">'), {'x': 'λ'}) # HTML 4.0
self.assertEqual(extract_attributes('<e x="&foo">'), {'x': '&foo'})
self.assertEqual(extract_attributes('<e x="\'">'), {'x': "'"})
self.assertEqual(extract_attributes('<e x=\'"\'>'), {'x': '"'})
self.assertEqual(extract_attributes('<e x >'), {'x': None})
self.assertEqual(extract_attributes('<e x=y a>'), {'x': 'y', 'a': None})
self.assertEqual(extract_attributes('<e x= y>'), {'x': 'y'})
self.assertEqual(extract_attributes('<e x=1 y=2 x=3>'), {'y': '2', 'x': '3'})
self.assertEqual(extract_attributes('<e \nx=\ny\n>'), {'x': 'y'})
self.assertEqual(extract_attributes('<e \nx=\n"y"\n>'), {'x': 'y'})
self.assertEqual(extract_attributes("<e \nx=\n'y'\n>"), {'x': 'y'})
self.assertEqual(extract_attributes('<e \nx="\ny\n">'), {'x': '\ny\n'})
self.assertEqual(extract_attributes('<e CAPS=x>'), {'caps': 'x'}) # Names lowercased
self.assertEqual(extract_attributes('<e x=1 X=2>'), {'x': '2'})
self.assertEqual(extract_attributes('<e X=1 x=2>'), {'x': '2'})
self.assertEqual(extract_attributes('<e _:funny-name1=1>'), {'_:funny-name1': '1'})
self.assertEqual(extract_attributes('<e x="Fáilte 世界 \U0001f600">'), {'x': 'Fáilte 世界 \U0001f600'})
self.assertEqual(extract_attributes('<e x="décompose&#769;">'), {'x': 'décompose\u0301'})
# "Narrow" Python builds don't support unicode code points outside BMP.
try:
compat_chr(0x10000)
supports_outside_bmp = True
except ValueError:
supports_outside_bmp = False
if supports_outside_bmp:
self.assertEqual(extract_attributes('<e x="Smile &#128512;!">'), {'x': 'Smile \U0001f600!'})
def test_clean_html(self):
self.assertEqual(clean_html('a:\nb'), 'a: b')
self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
@@ -550,6 +702,17 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
self.assertEqual(parse_filesize('1,24 KB'), 1240)
def test_parse_count(self):
self.assertEqual(parse_count(None), None)
self.assertEqual(parse_count(''), None)
self.assertEqual(parse_count('0'), 0)
self.assertEqual(parse_count('1000'), 1000)
self.assertEqual(parse_count('1.000'), 1000)
self.assertEqual(parse_count('1.1k'), 1100)
self.assertEqual(parse_count('1.1kk'), 1100000)
self.assertEqual(parse_count('1.1kk '), 1100000)
self.assertEqual(parse_count('1.1kk views'), 1100000)
def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
@@ -630,12 +793,13 @@ ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
{'like_count': 190, 'dislike_count': 10}))
def test_parse_dfxp_time_expr(self):
self.assertEqual(parse_dfxp_time_expr(None), 0.0)
self.assertEqual(parse_dfxp_time_expr(''), 0.0)
self.assertEqual(parse_dfxp_time_expr(None), None)
self.assertEqual(parse_dfxp_time_expr(''), None)
self.assertEqual(parse_dfxp_time_expr('0.1'), 0.1)
self.assertEqual(parse_dfxp_time_expr('0.1s'), 0.1)
self.assertEqual(parse_dfxp_time_expr('00:00:01'), 1.0)
self.assertEqual(parse_dfxp_time_expr('00:00:01.100'), 1.1)
self.assertEqual(parse_dfxp_time_expr('00:00:01:100'), 1.1)
def test_dfxp2srt(self):
dfxp_data = '''<?xml version="1.0" encoding="UTF-8"?>
@@ -645,6 +809,9 @@ ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
<p begin="0" end="1">The following line contains Chinese characters and special symbols</p>
<p begin="1" end="2">第二行<br/>♪♪</p>
<p begin="2" dur="1"><span>Third<br/>Line</span></p>
<p begin="3" end="-1">Lines with invalid timestamps are ignored</p>
<p begin="-1" end="-1">Ignore, two</p>
<p begin="3" dur="-1">Ignored, three</p>
</div>
</body>
</tt>'''
@@ -725,6 +892,24 @@ The first line
{'nocheckcertificate': False}, '--check-certificate', 'nocheckcertificate', 'false', 'true', '='),
['--check-certificate=true'])
def test_ohdave_rsa_encrypt(self):
N = 0xab86b6371b5318aaa1d3c9e612a9f1264f372323c8c0f19875b5fc3b3fd3afcc1e5bec527aa94bfa85bffc157e4245aebda05389a5357b75115ac94f074aefcd
e = 65537
self.assertEqual(
ohdave_rsa_encrypt(b'aa111222', e, N),
'726664bd9a23fd0c70f9f1b84aab5e3905ce1e45a584e9cbcf9bcc7510338fc1986d6c599ff990d923aa43c51c0d9013cd572e13bc58f4ae48f2ed8c0b0ba881')
def test_encode_base_n(self):
self.assertEqual(encode_base_n(0, 30), '0')
self.assertEqual(encode_base_n(80, 30), '2k')
custom_table = '9876543210ZYXWVUTSRQPONMLKJIHGFEDCBA'
self.assertEqual(encode_base_n(0, 30, custom_table), '9')
self.assertEqual(encode_base_n(80, 30, custom_table), '7P')
self.assertRaises(ValueError, encode_base_n, 0, 70)
self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
if __name__ == '__main__':
unittest.main()

View File

@@ -66,7 +66,7 @@ class TestAnnotations(unittest.TestCase):
textTag = a.find('TEXT')
text = textTag.text
self.assertTrue(text in expected) # assertIn only added in python 2.7
# remove the first occurance, there could be more than one annotation with the same text
# remove the first occurrence, there could be more than one annotation with the same text
expected.remove(text)
# We should have seen (and removed) all the expected annotation texts.
self.assertEqual(len(expected), 0, 'Not all expected annotations were found.')

View File

@@ -34,7 +34,7 @@ class TestYoutubeLists(unittest.TestCase):
ie = YoutubePlaylistIE(dl)
# TODO find a > 100 (paginating?) videos course
result = ie.extract('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
entries = result['entries']
entries = list(result['entries'])
self.assertEqual(YoutubeIE().extract_id(entries[0]['url']), 'j9WZyLZCBzs')
self.assertEqual(len(entries), 25)
self.assertEqual(YoutubeIE().extract_id(entries[-1]['url']), 'rYefUsYuEp0')

34
test/versions.json Normal file
View File

@@ -0,0 +1,34 @@
{
"latest": "2013.01.06",
"signature": "72158cdba391628569ffdbea259afbcf279bbe3d8aeb7492690735dc1cfa6afa754f55c61196f3871d429599ab22f2667f1fec98865527b32632e7f4b3675a7ef0f0fbe084d359256ae4bba68f0d33854e531a70754712f244be71d4b92e664302aa99653ee4df19800d955b6c4149cd2b3f24288d6e4b40b16126e01f4c8ce6",
"versions": {
"2013.01.02": {
"bin": [
"http://youtube-dl.org/downloads/2013.01.02/youtube-dl",
"f5b502f8aaa77675c4884938b1e4871ebca2611813a0c0e74f60c0fbd6dcca6b"
],
"exe": [
"http://youtube-dl.org/downloads/2013.01.02/youtube-dl.exe",
"75fa89d2ce297d102ff27675aa9d92545bbc91013f52ec52868c069f4f9f0422"
],
"tar": [
"http://youtube-dl.org/downloads/2013.01.02/youtube-dl-2013.01.02.tar.gz",
"6a66d022ac8e1c13da284036288a133ec8dba003b7bd3a5179d0c0daca8c8196"
]
},
"2013.01.06": {
"bin": [
"http://youtube-dl.org/downloads/2013.01.06/youtube-dl",
"64b6ed8865735c6302e836d4d832577321b4519aa02640dc508580c1ee824049"
],
"exe": [
"http://youtube-dl.org/downloads/2013.01.06/youtube-dl.exe",
"58609baf91e4389d36e3ba586e21dab882daaaee537e4448b1265392ae86ff84"
],
"tar": [
"http://youtube-dl.org/downloads/2013.01.06/youtube-dl-2013.01.06.tar.gz",
"fe77ab20a95d980ed17a659aa67e371fdd4d656d19c4c7950e7b720b0c2f1a86"
]
}
}
}

View File

@@ -8,6 +8,6 @@ deps =
passenv = HOME
defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
--exclude test_subtitles.py --exclude test_write_annotations.py
--exclude test_youtube_lists.py
--exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
commands = nosetests --verbose {posargs:{[testenv]defaultargs}} # --with-coverage --cover-package=youtube_dl --cover-html
# test.test_download:TestDownload.test_NowVideo

View File

@@ -24,15 +24,14 @@ import time
import tokenize
import traceback
if os.name == 'nt':
import ctypes
from .compat import (
compat_basestring,
compat_cookiejar,
compat_expanduser,
compat_get_terminal_size,
compat_http_client,
compat_kwargs,
compat_os_name,
compat_str,
compat_tokenize_tokenize,
compat_urllib_error,
@@ -40,13 +39,18 @@ from .compat import (
compat_urllib_request_DataHandler,
)
from .utils import (
age_restricted,
args_to_str,
ContentTooShortError,
date_from_str,
DateRange,
DEFAULT_OUTTMPL,
determine_ext,
determine_protocol,
DownloadError,
encode_compat_str,
encodeFilename,
error_to_compat_str,
ExtractorError,
format_bytes,
formatSeconds,
@@ -56,13 +60,17 @@ from .utils import (
PagedList,
parse_filesize,
PerRequestProxyHandler,
PostProcessingError,
platform_name,
PostProcessingError,
preferredencoding,
prepend_extension,
render_table,
replace_extension,
SameFileError,
sanitize_filename,
sanitize_path,
sanitize_url,
sanitized_Request,
std_headers,
subtitles_filename,
UnavailableVideoError,
@@ -72,16 +80,13 @@ from .utils import (
write_string,
YoutubeDLCookieProcessor,
YoutubeDLHandler,
prepend_extension,
replace_extension,
args_to_str,
age_restricted,
)
from .cache import Cache
from .extractor import get_info_extractor, gen_extractors
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
from .downloader import get_suitable_downloader
from .downloader.rtmp import rtmpdump_version
from .postprocessor import (
FFmpegFixupM3u8PP,
FFmpegFixupM4aPP,
FFmpegFixupStretchedPP,
FFmpegMergerPP,
@@ -90,6 +95,9 @@ from .postprocessor import (
)
from .version import __version__
if compat_os_name == 'nt':
import ctypes
class YoutubeDL(object):
"""YoutubeDL class.
@@ -156,7 +164,7 @@ class YoutubeDL(object):
writethumbnail: Write the thumbnail image to a file
write_all_thumbnails: Write all thumbnail formats to files
writesubtitles: Write the video subtitles to a file
writeautomaticsub: Write the automatic subtitles to a file
writeautomaticsub: Write the automatically generated subtitles to a file
allsubtitles: Downloads all the subtitles of the video
(requires writesubtitles or writeautomaticsub)
listsubtitles: Lists all available subtitles for the video
@@ -258,7 +266,7 @@ class YoutubeDL(object):
the downloader (see youtube_dl/downloader/common.py):
nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
noresizebuffer, retries, continuedl, noprogress, consoletitle,
xattr_set_filesize, external_downloader_args.
xattr_set_filesize, external_downloader_args, hls_use_mpegts.
The following options are used by the post processors:
prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
@@ -370,8 +378,9 @@ class YoutubeDL(object):
def add_info_extractor(self, ie):
"""Add an InfoExtractor object to the end of the list."""
self._ies.append(ie)
self._ies_instances[ie.ie_key()] = ie
ie.set_downloader(self)
if not isinstance(ie, type):
self._ies_instances[ie.ie_key()] = ie
ie.set_downloader(self)
def get_info_extractor(self, ie_key):
"""
@@ -389,7 +398,7 @@ class YoutubeDL(object):
"""
Add the InfoExtractors returned by gen_extractors to the end of the list
"""
for ie in gen_extractors():
for ie in gen_extractor_classes():
self.add_info_extractor(ie)
def add_post_processor(self, pp):
@@ -445,7 +454,7 @@ class YoutubeDL(object):
def to_console_title(self, message):
if not self.params.get('consoletitle', False):
return
if os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
if compat_os_name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is
# already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
@@ -493,7 +502,7 @@ class YoutubeDL(object):
tb = ''
if hasattr(sys.exc_info()[1], 'exc_info') and sys.exc_info()[1].exc_info[0]:
tb += ''.join(traceback.format_exception(*sys.exc_info()[1].exc_info))
tb += compat_str(traceback.format_exc())
tb += encode_compat_str(traceback.format_exc())
else:
tb_data = traceback.format_list(traceback.extract_stack())
tb = ''.join(tb_data)
@@ -516,7 +525,7 @@ class YoutubeDL(object):
else:
if self.params.get('no_warnings'):
return
if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt':
if not self.params.get('no_color') and self._err_file.isatty() and compat_os_name != 'nt':
_msg_header = '\033[0;33mWARNING:\033[0m'
else:
_msg_header = 'WARNING:'
@@ -528,7 +537,7 @@ class YoutubeDL(object):
Do the same as trouble, but prefixes the message with 'ERROR:', colored
in red if stderr is a tty file.
'''
if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt':
if not self.params.get('no_color') and self._err_file.isatty() and compat_os_name != 'nt':
_msg_header = '\033[0;31mERROR:\033[0m'
else:
_msg_header = 'ERROR:'
@@ -561,7 +570,7 @@ class YoutubeDL(object):
elif template_dict.get('height'):
template_dict['resolution'] = '%sp' % template_dict['height']
elif template_dict.get('width'):
template_dict['resolution'] = '?x%d' % template_dict['width']
template_dict['resolution'] = '%dx?' % template_dict['width']
sanitize = lambda k, v: sanitize_filename(
compat_str(v),
@@ -572,7 +581,7 @@ class YoutubeDL(object):
if v is not None)
template_dict = collections.defaultdict(lambda: 'NA', template_dict)
outtmpl = sanitize_path(self.params.get('outtmpl', DEFAULT_OUTTMPL))
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
tmpl = compat_expanduser(outtmpl)
filename = tmpl % template_dict
# Temporary fix for #4787
@@ -580,7 +589,7 @@ class YoutubeDL(object):
# to workaround encoding issues with subprocess on python2 @ Windows
if sys.version_info < (3, 0) and sys.platform == 'win32':
filename = encodeFilename(filename, True).decode(preferredencoding())
return filename
return sanitize_path(filename)
except ValueError as err:
self.report_error('Error in output template: ' + str(err) + ' (encoding: ' + repr(preferredencoding()) + ')')
return None
@@ -600,12 +609,12 @@ class YoutubeDL(object):
if rejecttitle:
if re.search(rejecttitle, title, re.IGNORECASE):
return '"' + title + '" title matched reject pattern "' + rejecttitle + '"'
date = info_dict.get('upload_date', None)
date = info_dict.get('upload_date')
if date is not None:
dateRange = self.params.get('daterange', DateRange())
if date not in dateRange:
return '%s upload date is not in range %s' % (date_from_str(date).isoformat(), dateRange)
view_count = info_dict.get('view_count', None)
view_count = info_dict.get('view_count')
if view_count is not None:
min_views = self.params.get('min_views')
if min_views is not None and view_count < min_views:
@@ -653,6 +662,7 @@ class YoutubeDL(object):
if not ie.suitable(url):
continue
ie = self.get_info_extractor(ie.ie_key())
if not ie.working():
self.report_warning('The program functionality for this site has been marked as broken, '
'and will probably not work.')
@@ -672,14 +682,14 @@ class YoutubeDL(object):
return self.process_ie_result(ie_result, download, extra_info)
else:
return ie_result
except ExtractorError as de: # An error we somewhat expected
self.report_error(compat_str(de), de.format_traceback())
except ExtractorError as e: # An error we somewhat expected
self.report_error(compat_str(e), e.format_traceback())
break
except MaxDownloadsReached:
raise
except Exception as e:
if self.params.get('ignoreerrors', False):
self.report_error(compat_str(e), tb=compat_str(traceback.format_exc()))
self.report_error(error_to_compat_str(e), tb=encode_compat_str(traceback.format_exc()))
break
else:
raise
@@ -702,7 +712,6 @@ class YoutubeDL(object):
It will also download the videos if 'download'.
Returns the resolved ie_result.
"""
result_type = ie_result.get('_type', 'video')
if result_type in ('url', 'url_transparent'):
@@ -731,7 +740,7 @@ class YoutubeDL(object):
force_properties = dict(
(k, v) for k, v in ie_result.items() if v is not None)
for f in ('_type', 'url'):
for f in ('_type', 'url', 'ie_key'):
if f in force_properties:
del force_properties[f]
new_result = info.copy()
@@ -743,18 +752,18 @@ class YoutubeDL(object):
new_result, download=download, extra_info=extra_info)
elif result_type == 'playlist' or result_type == 'multi_video':
# We process each entry in the playlist
playlist = ie_result.get('title', None) or ie_result.get('id', None)
playlist = ie_result.get('title') or ie_result.get('id')
self.to_screen('[download] Downloading playlist: %s' % playlist)
playlist_results = []
playliststart = self.params.get('playliststart', 1) - 1
playlistend = self.params.get('playlistend', None)
playlistend = self.params.get('playlistend')
# For backwards compatibility, interpret -1 as whole list
if playlistend == -1:
playlistend = None
playlistitems_str = self.params.get('playlist_items', None)
playlistitems_str = self.params.get('playlist_items')
playlistitems = None
if playlistitems_str is not None:
def iter_playlistitems(format):
@@ -778,7 +787,7 @@ class YoutubeDL(object):
entries = ie_entries[playliststart:playlistend]
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Collected %d video ids (downloading %d of them)" %
'[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
elif isinstance(ie_entries, PagedList):
if playlistitems:
@@ -792,7 +801,7 @@ class YoutubeDL(object):
playliststart, playlistend)
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Downloading %d videos" %
'[%s] playlist %s: Downloading %d videos' %
(ie_result['extractor'], playlist, n_entries))
else: # iterable
if playlistitems:
@@ -803,7 +812,7 @@ class YoutubeDL(object):
ie_entries, playliststart, playlistend))
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Downloading %d videos" %
'[%s] playlist %s: Downloading %d videos' %
(ie_result['extractor'], playlist, n_entries))
if self.params.get('playlistreverse', False):
@@ -833,6 +842,7 @@ class YoutubeDL(object):
extra_info=extra)
playlist_results.append(entry_result)
ie_result['entries'] = playlist_results
self.to_screen('[download] Finished downloading playlist: %s' % playlist)
return ie_result
elif result_type == 'compat_list':
self.report_warning(
@@ -893,11 +903,14 @@ class YoutubeDL(object):
STR_OPERATORS = {
'=': operator.eq,
'!=': operator.ne,
'^=': lambda attr, value: attr.startswith(value),
'$=': lambda attr, value: attr.endswith(value),
'*=': lambda attr, value: value in attr,
}
str_operator_rex = re.compile(r'''(?x)
\s*(?P<key>ext|acodec|vcodec|container|protocol)
\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
\s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
\s*(?P<value>[a-zA-Z0-9_-]+)
\s*(?P<value>[a-zA-Z0-9._-]+)
\s*$
''' % '|'.join(map(re.escape, STR_OPERATORS.keys())))
m = str_operator_rex.search(filter_spec)
@@ -937,7 +950,7 @@ class YoutubeDL(object):
filter_parts.append(string)
def _remove_unused_ops(tokens):
# Remove operators that we don't use and join them with the sourrounding strings
# Remove operators that we don't use and join them with the surrounding strings
# for example: 'mp4' '-' 'baseline' '-' '16x9' is converted to 'mp4-baseline-16x9'
ALLOWED_OPS = ('/', '+', ',', '(', ')')
last_string, last_start, last_end, last_line = None, None, None, None
@@ -1107,6 +1120,12 @@ class YoutubeDL(object):
'contain the video, try using '
'"-f %s+%s"' % (format_2, format_1))
return
# Formats must be opposite (video+audio)
if formats_info[0].get('acodec') == 'none' and formats_info[1].get('acodec') == 'none':
self.report_error(
'Both formats %s and %s are video-only, you must specify "-f video+audio"'
% (format_1, format_2))
return
output_ext = (
formats_info[0]['ext']
if self.params.get('merge_output_format') is None
@@ -1186,7 +1205,7 @@ class YoutubeDL(object):
return res
def _calc_cookies(self, info_dict):
pr = compat_urllib_request.Request(info_dict['url'])
pr = sanitized_Request(info_dict['url'])
self.cookiejar.add_cookie_header(pr)
return pr.get_header('Cookie')
@@ -1213,12 +1232,20 @@ class YoutubeDL(object):
t.get('preference'), t.get('width'), t.get('height'),
t.get('id'), t.get('url')))
for i, t in enumerate(thumbnails):
t['url'] = sanitize_url(t['url'])
if t.get('width') and t.get('height'):
t['resolution'] = '%dx%d' % (t['width'], t['height'])
if t.get('id') is None:
t['id'] = '%d' % i
if thumbnails and 'thumbnail' not in info_dict:
if self.params.get('list_thumbnails'):
self.list_thumbnails(info_dict)
return
thumbnail = info_dict.get('thumbnail')
if thumbnail:
info_dict['thumbnail'] = sanitize_url(thumbnail)
elif thumbnails:
info_dict['thumbnail'] = thumbnails[-1]['url']
if 'display_id' not in info_dict and 'id' in info_dict:
@@ -1233,10 +1260,18 @@ class YoutubeDL(object):
except (ValueError, OverflowError, OSError):
pass
# Auto generate title fields corresponding to the *_number fields when missing
# in order to always have clean titles. This is very common for TV series.
for field in ('chapter', 'season', 'episode'):
if info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
subtitles = info_dict.get('subtitles')
if subtitles:
for _, subtitle in subtitles.items():
for subtitle_format in subtitle:
if subtitle_format.get('url'):
subtitle_format['url'] = sanitize_url(subtitle_format['url'])
if 'ext' not in subtitle_format:
subtitle_format['ext'] = determine_ext(subtitle_format['url']).lower()
@@ -1266,8 +1301,13 @@ class YoutubeDL(object):
if 'url' not in format:
raise ExtractorError('Missing "url" key in result (index %d)' % i)
format['url'] = sanitize_url(format['url'])
if format.get('format_id') is None:
format['format_id'] = compat_str(i)
else:
# Sanitize format_id from characters used in format selector expression
format['format_id'] = re.sub('[\s,/+\[\]()]', '_', format['format_id'])
format_id = format['format_id']
if format_id not in formats_dict:
formats_dict[format_id] = []
@@ -1289,6 +1329,10 @@ class YoutubeDL(object):
# Automatically determine file extension if missing
if 'ext' not in format:
format['ext'] = determine_ext(format['url']).lower()
# Automatically determine protocol if missing (useful for format
# selection purposes)
if 'protocol' not in format:
format['protocol'] = determine_protocol(format)
# Add HTTP headers, so that external programs can use them from the
# json output
full_format_info = info_dict.copy()
@@ -1301,20 +1345,16 @@ class YoutubeDL(object):
# only set the 'formats' fields if the original info_dict list them
# otherwise we end up with a circular reference, the first (and unique)
# element in the 'formats' field in info_dict is info_dict itself,
# wich can't be exported to json
# which can't be exported to json
info_dict['formats'] = formats
if self.params.get('listformats'):
self.list_formats(info_dict)
return
if self.params.get('list_thumbnails'):
self.list_thumbnails(info_dict)
return
req_format = self.params.get('format')
if req_format is None:
req_format_list = []
if (self.params.get('outtmpl', DEFAULT_OUTTMPL) != '-' and
info_dict['extractor'] in ['youtube', 'ted'] and
not info_dict.get('is_live')):
merger = FFmpegMergerPP(self)
if merger.available and merger.can_merge():
@@ -1450,7 +1490,7 @@ class YoutubeDL(object):
if dn and not os.path.exists(dn):
os.makedirs(dn)
except (OSError, IOError) as err:
self.report_error('unable to create directory ' + compat_str(err))
self.report_error('unable to create directory ' + error_to_compat_str(err))
return
if self.params.get('writedescription', False):
@@ -1501,7 +1541,7 @@ class YoutubeDL(object):
sub_info['url'], info_dict['id'], note=False)
except ExtractorError as err:
self.report_warning('Unable to download subtitle for "%s": %s' %
(sub_lang, compat_str(err.cause)))
(sub_lang, error_to_compat_str(err.cause)))
continue
try:
sub_filename = subtitles_filename(filename, sub_lang, sub_format)
@@ -1605,12 +1645,14 @@ class YoutubeDL(object):
self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
return
if success:
if success and filename != '-':
# Fixup content
fixup_policy = self.params.get('fixup')
if fixup_policy is None:
fixup_policy = 'detect_or_warn'
INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg or avconv to fix this automatically.'
stretched_ratio = info_dict.get('stretched_ratio')
if stretched_ratio is not None and stretched_ratio != 1:
if fixup_policy == 'warn':
@@ -1623,15 +1665,18 @@ class YoutubeDL(object):
info_dict['__postprocessors'].append(stretched_pp)
else:
self.report_warning(
'%s: Non-uniform pixel ratio (%s). Install ffmpeg or avconv to fix this automatically.' % (
info_dict['id'], stretched_ratio))
'%s: Non-uniform pixel ratio (%s). %s'
% (info_dict['id'], stretched_ratio, INSTALL_FFMPEG_MESSAGE))
else:
assert fixup_policy in ('ignore', 'never')
if info_dict.get('requested_formats') is None and info_dict.get('container') == 'm4a_dash':
if (info_dict.get('requested_formats') is None and
info_dict.get('container') == 'm4a_dash'):
if fixup_policy == 'warn':
self.report_warning('%s: writing DASH m4a. Only some players support this container.' % (
info_dict['id']))
self.report_warning(
'%s: writing DASH m4a. '
'Only some players support this container.'
% info_dict['id'])
elif fixup_policy == 'detect_or_warn':
fixup_pp = FFmpegFixupM4aPP(self)
if fixup_pp.available:
@@ -1639,8 +1684,27 @@ class YoutubeDL(object):
info_dict['__postprocessors'].append(fixup_pp)
else:
self.report_warning(
'%s: writing DASH m4a. Only some players support this container. Install ffmpeg or avconv to fix this automatically.' % (
info_dict['id']))
'%s: writing DASH m4a. '
'Only some players support this container. %s'
% (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
else:
assert fixup_policy in ('ignore', 'never')
if (info_dict.get('protocol') == 'm3u8_native' or
info_dict.get('protocol') == 'm3u8' and
self.params.get('hls_prefer_native')):
if fixup_policy == 'warn':
self.report_warning('%s: malformated aac bitstream.' % (
info_dict['id']))
elif fixup_policy == 'detect_or_warn':
fixup_pp = FFmpegFixupM3u8PP(self)
if fixup_pp.available:
info_dict.setdefault('__postprocessors', [])
info_dict['__postprocessors'].append(fixup_pp)
else:
self.report_warning(
'%s: malformated aac bitstream. %s'
% (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
else:
assert fixup_policy in ('ignore', 'never')
@@ -1771,7 +1835,7 @@ class YoutubeDL(object):
else:
res = '%sp' % format['height']
elif format.get('width') is not None:
res = '?x%d' % format['width']
res = '%dx?' % format['width']
else:
res = default
return res
@@ -1780,6 +1844,10 @@ class YoutubeDL(object):
res = ''
if fdict.get('ext') in ['f4f', 'f4m']:
res += '(unsupported) '
if fdict.get('language'):
if res:
res += ' '
res += '[%s] ' % fdict['language']
if fdict.get('format_note') is not None:
res += fdict['format_note'] + ' '
if fdict.get('tbr') is not None:
@@ -1800,7 +1868,9 @@ class YoutubeDL(object):
if fdict.get('vbr') is not None:
res += '%4dk' % fdict['vbr']
if fdict.get('fps') is not None:
res += ', %sfps' % fdict['fps']
if res:
res += ', '
res += '%sfps' % fdict['fps']
if fdict.get('acodec') is not None:
if res:
res += ', '
@@ -1843,13 +1913,8 @@ class YoutubeDL(object):
def list_thumbnails(self, info_dict):
thumbnails = info_dict.get('thumbnails')
if not thumbnails:
tn_url = info_dict.get('thumbnail')
if tn_url:
thumbnails = [{'id': '0', 'url': tn_url}]
else:
self.to_screen(
'[info] No thumbnails present for %s' % info_dict['id'])
return
self.to_screen('[info] No thumbnails present for %s' % info_dict['id'])
return
self.to_screen(
'[info] Thumbnails for %s:' % info_dict['id'])
@@ -1870,6 +1935,8 @@ class YoutubeDL(object):
def urlopen(self, req):
""" Start an HTTP download """
if isinstance(req, compat_basestring):
req = sanitized_Request(req)
return self._opener.open(req, timeout=self._socket_timeout)
def print_debug_header(self):
@@ -1892,6 +1959,8 @@ class YoutubeDL(object):
write_string(encoding_str, encoding=None)
self._write_string('[debug] youtube-dl version ' + __version__ + '\n')
if _LAZY_LOADER:
self._write_string('[debug] Lazy loading extractors enabled' + '\n')
try:
sp = subprocess.Popen(
['git', 'rev-parse', '--short', 'HEAD'],
@@ -1969,8 +2038,19 @@ class YoutubeDL(object):
https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
data_handler = compat_urllib_request_DataHandler()
# When passing our own FileHandler instance, build_opener won't add the
# default FileHandler and allows us to disable the file protocol, which
# can be used for malicious purposes (see
# https://github.com/rg3/youtube-dl/issues/8227)
file_handler = compat_urllib_request.FileHandler()
def file_open(*args, **kwargs):
raise compat_urllib_error.URLError('file:// scheme is explicitly disabled in youtube-dl for security reasons')
file_handler.file_open = file_open
opener = compat_urllib_request.build_opener(
proxy_handler, https_handler, cookie_processor, ydlh, data_handler)
proxy_handler, https_handler, cookie_processor, ydlh, data_handler, file_handler)
# Delete the default user-agent header, which would otherwise apply in
# cases where our custom HTTP handler doesn't come into play
@@ -2028,4 +2108,4 @@ class YoutubeDL(object):
(info_dict['extractor'], info_dict['id'], thumb_display_id, thumb_filename))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download thumbnail "%s": %s' %
(t['url'], compat_str(err)))
(t['url'], error_to_compat_str(err)))

View File

@@ -144,14 +144,20 @@ def _real_main(argv=None):
if numeric_limit is None:
parser.error('invalid max_filesize specified')
opts.max_filesize = numeric_limit
if opts.retries is not None:
if opts.retries in ('inf', 'infinite'):
opts_retries = float('inf')
def parse_retries(retries):
if retries in ('inf', 'infinite'):
parsed_retries = float('inf')
else:
try:
opts_retries = int(opts.retries)
parsed_retries = int(retries)
except (TypeError, ValueError):
parser.error('invalid retry count specified')
return parsed_retries
if opts.retries is not None:
opts.retries = parse_retries(opts.retries)
if opts.fragment_retries is not None:
opts.fragment_retries = parse_retries(opts.fragment_retries)
if opts.buffersize is not None:
numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
if numeric_buffersize is None:
@@ -299,7 +305,8 @@ def _real_main(argv=None):
'force_generic_extractor': opts.force_generic_extractor,
'ratelimit': opts.ratelimit,
'nooverwrites': opts.nooverwrites,
'retries': opts_retries,
'retries': opts.retries,
'fragment_retries': opts.fragment_retries,
'buffersize': opts.buffersize,
'noresizebuffer': opts.noresizebuffer,
'continuedl': opts.continue_dl,
@@ -355,6 +362,7 @@ def _real_main(argv=None):
'youtube_include_dash_manifest': opts.youtube_include_dash_manifest,
'encoding': opts.encoding,
'extract_flat': opts.extract_flat,
'mark_watched': opts.mark_watched,
'merge_output_format': opts.merge_output_format,
'postprocessors': postprocessors,
'fixup': opts.fixup,
@@ -369,6 +377,7 @@ def _real_main(argv=None):
'no_color': opts.no_color,
'ffmpeg_location': opts.ffmpeg_location,
'hls_prefer_native': opts.hls_prefer_native,
'hls_use_mpegts': opts.hls_use_mpegts,
'external_downloader_args': external_downloader_args,
'postprocessor_args': postprocessor_args,
'cn_verification_proxy': opts.cn_verification_proxy,
@@ -377,7 +386,7 @@ def _real_main(argv=None):
with YoutubeDL(ydl_opts) as ydl:
# Update version
if opts.update_self:
update_self(ydl.to_screen, opts.verbose)
update_self(ydl.to_screen, opts.verbose, ydl._opener)
# Remove cache dir
if opts.rm_cachedir:

View File

@@ -7,7 +7,7 @@ from __future__ import unicode_literals
import sys
if __package__ is None and not hasattr(sys, "frozen"):
if __package__ is None and not hasattr(sys, 'frozen'):
# direct call of __main__.py
import os.path
path = os.path.realpath(os.path.abspath(__file__))

View File

@@ -161,7 +161,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
nonce = data[:NONCE_LENGTH_BYTES]
cipher = data[NONCE_LENGTH_BYTES:]
class Counter:
class Counter(object):
__value = nonce + [0] * (BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES)
def next_value(self):

View File

@@ -14,6 +14,7 @@ import socket
import subprocess
import sys
import itertools
import xml.etree.ElementTree
try:
@@ -76,6 +77,11 @@ try:
except ImportError: # Python 2
from urllib import urlretrieve as compat_urlretrieve
try:
from html.parser import HTMLParser as compat_HTMLParser
except ImportError: # Python 2
from HTMLParser import HTMLParser as compat_HTMLParser
try:
from subprocess import DEVNULL
@@ -163,6 +169,32 @@ except ImportError: # Python 2
string = string.replace('+', ' ')
return compat_urllib_parse_unquote(string, encoding, errors)
try:
from urllib.parse import urlencode as compat_urllib_parse_urlencode
except ImportError: # Python 2
# Python 2 will choke in urlencode on mixture of byte and unicode strings.
# Possible solutions are to either port it from python 3 with all
# the friends or manually ensure input query contains only byte strings.
# We will stick with latter thus recursively encoding the whole query.
def compat_urllib_parse_urlencode(query, doseq=0, encoding='utf-8'):
def encode_elem(e):
if isinstance(e, dict):
e = encode_dict(e)
elif isinstance(e, (list, tuple,)):
list_e = encode_list(e)
e = tuple(list_e) if isinstance(e, tuple) else list_e
elif isinstance(e, compat_str):
e = e.encode(encoding)
return e
def encode_dict(d):
return dict((encode_elem(k), encode_elem(v)) for k, v in d.items())
def encode_list(l):
return [encode_elem(e) for e in l]
return compat_urllib_parse.urlencode(encode_elem(query), doseq=doseq)
try:
from urllib.request import DataHandler as compat_urllib_request_DataHandler
except ImportError: # Python < 3.4
@@ -180,20 +212,20 @@ except ImportError: # Python < 3.4
# parameter := attribute "=" value
url = req.get_full_url()
scheme, data = url.split(":", 1)
mediatype, data = data.split(",", 1)
scheme, data = url.split(':', 1)
mediatype, data = data.split(',', 1)
# even base64 encoded data URLs might be quoted so unquote in any case:
data = compat_urllib_parse_unquote_to_bytes(data)
if mediatype.endswith(";base64"):
if mediatype.endswith(';base64'):
data = binascii.a2b_base64(data)
mediatype = mediatype[:-7]
if not mediatype:
mediatype = "text/plain;charset=US-ASCII"
mediatype = 'text/plain;charset=US-ASCII'
headers = email.message_from_string(
"Content-type: %s\nContent-length: %d\n" % (mediatype, len(data)))
'Content-type: %s\nContent-length: %d\n' % (mediatype, len(data)))
return compat_urllib_response.addinfourl(io.BytesIO(data), headers, url)
@@ -212,6 +244,53 @@ try:
except ImportError: # Python 2.6
from xml.parsers.expat import ExpatError as compat_xml_parse_error
if sys.version_info[0] >= 3:
compat_etree_fromstring = xml.etree.ElementTree.fromstring
else:
# python 2.x tries to encode unicode strings with ascii (see the
# XMLParser._fixtext method)
etree = xml.etree.ElementTree
try:
_etree_iter = etree.Element.iter
except AttributeError: # Python <=2.6
def _etree_iter(root):
for el in root.findall('*'):
yield el
for sub in _etree_iter(el):
yield sub
# on 2.6 XML doesn't have a parser argument, function copied from CPython
# 2.7 source
def _XML(text, parser=None):
if not parser:
parser = etree.XMLParser(target=etree.TreeBuilder())
parser.feed(text)
return parser.close()
def _element_factory(*args, **kwargs):
el = etree.Element(*args, **kwargs)
for k, v in el.items():
if isinstance(v, bytes):
el.set(k, v.decode('utf-8'))
return el
def compat_etree_fromstring(text):
doc = _XML(text, parser=etree.XMLParser(target=etree.TreeBuilder(element_factory=_element_factory)))
for el in _etree_iter(doc):
if el.text is not None and isinstance(el.text, bytes):
el.text = el.text.decode('utf-8')
return doc
if sys.version_info < (2, 7):
# Here comes the crazy part: In 2.6, if the xpath is a unicode,
# .//node does not match if a node is a direct child of . !
def compat_xpath(xpath):
if isinstance(xpath, compat_str):
xpath = xpath.encode('ascii')
return xpath
else:
compat_xpath = lambda xpath: xpath
try:
from urllib.parse import parse_qs as compat_parse_qs
@@ -230,7 +309,7 @@ except ImportError: # Python 2
nv = name_value.split('=', 1)
if len(nv) != 2:
if strict_parsing:
raise ValueError("bad query field: %r" % (name_value,))
raise ValueError('bad query field: %r' % (name_value,))
# Handle case of a control-name with no equal sign
if keep_blank_values:
nv.append('')
@@ -288,6 +367,9 @@ def compat_ord(c):
return ord(c)
compat_os_name = os._name if os.name == 'java' else os.name
if sys.version_info >= (3, 0):
compat_getenv = os.getenv
compat_expanduser = os.path.expanduser
@@ -308,7 +390,7 @@ else:
# The following are os.path.expanduser implementations from cpython 2.7.8 stdlib
# for different platforms with correct environment variables decoding.
if os.name == 'posix':
if compat_os_name == 'posix':
def compat_expanduser(path):
"""Expand ~ and ~user constructions. If user or $HOME is unknown,
do nothing."""
@@ -332,7 +414,7 @@ else:
userhome = pwent.pw_dir
userhome = userhome.rstrip('/')
return (userhome + path[i:]) or '/'
elif os.name == 'nt' or os.name == 'ce':
elif compat_os_name == 'nt' or compat_os_name == 'ce':
def compat_expanduser(path):
"""Expand ~ and ~user constructs.
@@ -395,7 +477,7 @@ if sys.version_info < (3, 0) and sys.platform == 'win32':
else:
compat_getpass = getpass.getpass
# Old 2.6 and 2.7 releases require kwargs to be bytes
# Python < 2.6.5 require kwargs to be bytes
try:
def _testfunc(x):
pass
@@ -428,7 +510,7 @@ if sys.version_info < (2, 7):
if err is not None:
raise err
else:
raise socket.error("getaddrinfo returns an empty list")
raise socket.error('getaddrinfo returns an empty list')
else:
compat_socket_create_connection = socket.create_connection
@@ -502,11 +584,13 @@ else:
from tokenize import generate_tokens as compat_tokenize_tokenize
__all__ = [
'compat_HTMLParser',
'compat_HTTPError',
'compat_basestring',
'compat_chr',
'compat_cookiejar',
'compat_cookies',
'compat_etree_fromstring',
'compat_expanduser',
'compat_get_terminal_size',
'compat_getenv',
@@ -517,6 +601,7 @@ __all__ = [
'compat_itertools_count',
'compat_kwargs',
'compat_ord',
'compat_os_name',
'compat_parse_qs',
'compat_print',
'compat_shlex_split',
@@ -529,6 +614,7 @@ __all__ = [
'compat_urllib_parse_unquote',
'compat_urllib_parse_unquote_plus',
'compat_urllib_parse_unquote_to_bytes',
'compat_urllib_parse_urlencode',
'compat_urllib_parse_urlparse',
'compat_urllib_request',
'compat_urllib_request_DataHandler',
@@ -536,6 +622,7 @@ __all__ = [
'compat_urlparse',
'compat_urlretrieve',
'compat_xml_parse_error',
'compat_xpath',
'shlex_quote',
'subprocess_check_output',
'workaround_optparse_bug9161',

View File

@@ -1,14 +1,16 @@
from __future__ import unicode_literals
from .common import FileDownloader
from .external import get_external_downloader
from .f4m import F4mFD
from .hls import HlsFD
from .hls import NativeHlsFD
from .http import HttpFD
from .rtsp import RtspFD
from .rtmp import RtmpFD
from .dash import DashSegmentsFD
from .rtsp import RtspFD
from .external import (
get_external_downloader,
FFmpegFD,
)
from ..utils import (
determine_protocol,
@@ -16,8 +18,8 @@ from ..utils import (
PROTOCOL_MAP = {
'rtmp': RtmpFD,
'm3u8_native': NativeHlsFD,
'm3u8': HlsFD,
'm3u8_native': HlsFD,
'm3u8': FFmpegFD,
'mms': RtspFD,
'rtsp': RtspFD,
'f4m': F4mFD,
@@ -30,14 +32,17 @@ def get_suitable_downloader(info_dict, params={}):
protocol = determine_protocol(info_dict)
info_dict['protocol'] = protocol
# if (info_dict.get('start_time') or info_dict.get('end_time')) and not info_dict.get('requested_formats') and FFmpegFD.can_download(info_dict):
# return FFmpegFD
external_downloader = params.get('external_downloader')
if external_downloader is not None:
ed = get_external_downloader(external_downloader)
if ed.supports(info_dict):
if ed.can_download(info_dict):
return ed
if protocol == 'm3u8' and params.get('hls_prefer_native'):
return NativeHlsFD
return HlsFD
return PROTOCOL_MAP.get(protocol, HttpFD)

View File

@@ -5,9 +5,10 @@ import re
import sys
import time
from ..compat import compat_str
from ..compat import compat_os_name
from ..utils import (
encodeFilename,
error_to_compat_str,
decodeArgument,
format_bytes,
timeconvert,
@@ -42,9 +43,10 @@ class FileDownloader(object):
min_filesize: Skip files smaller than this size
max_filesize: Skip files larger than this size
xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
(experimenatal)
(experimental)
external_downloader_args: A list of additional command-line arguments for the
external downloader.
hls_use_mpegts: Use the mpegts container for HLS videos.
Subclasses of this one must re-define the real_download method.
"""
@@ -113,6 +115,10 @@ class FileDownloader(object):
return '%10s' % '---b/s'
return '%10s' % ('%s/s' % format_bytes(speed))
@staticmethod
def format_retries(retries):
return 'inf' if retries == float('inf') else '%.0f' % retries
@staticmethod
def best_block_size(elapsed_time, bytes):
new_min = max(bytes / 2.0, 1.0)
@@ -156,7 +162,7 @@ class FileDownloader(object):
def slow_down(self, start_time, now, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
rate_limit = self.params.get('ratelimit')
if rate_limit is None or byte_counter == 0:
return
if now is None:
@@ -186,7 +192,7 @@ class FileDownloader(object):
return
os.rename(encodeFilename(old_filename), encodeFilename(new_filename))
except (IOError, OSError) as err:
self.report_error('unable to rename file: %s' % compat_str(err))
self.report_error('unable to rename file: %s' % error_to_compat_str(err))
def try_utime(self, filename, last_modified_hdr):
"""Try to set the last-modified time of the given file."""
@@ -218,7 +224,7 @@ class FileDownloader(object):
if self.params.get('progress_with_newline', False):
self.to_screen(fullmsg)
else:
if os.name == 'nt':
if compat_os_name == 'nt':
prev_len = getattr(self, '_report_progress_prev_line_length',
0)
if prev_len > len(fullmsg):
@@ -295,7 +301,9 @@ class FileDownloader(object):
def report_retry(self, count, retries):
"""Report retry in case of HTTP error 5xx"""
self.to_screen('[download] Got server HTTP error. Retrying (attempt %d of %d)...' % (count, retries))
self.to_screen(
'[download] Got server HTTP error. Retrying (attempt %d of %s)...'
% (count, self.format_retries(retries)))
def report_file_already_downloaded(self, file_name):
"""Report file has already been fully downloaded."""

View File

@@ -1,66 +1,81 @@
from __future__ import unicode_literals
import os
import re
from .common import FileDownloader
from ..compat import compat_urllib_request
from .fragment import FragmentFD
from ..compat import compat_urllib_error
from ..utils import (
sanitize_open,
encodeFilename,
)
class DashSegmentsFD(FileDownloader):
class DashSegmentsFD(FragmentFD):
"""
Download segments in a DASH manifest
"""
FD_NAME = 'dashsegments'
def real_download(self, filename, info_dict):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
base_url = info_dict['url']
segment_urls = info_dict['segment_urls']
segment_urls = [info_dict['segment_urls'][0]] if self.params.get('test', False) else info_dict['segment_urls']
initialization_url = info_dict.get('initialization_url')
is_test = self.params.get('test', False)
remaining_bytes = self._TEST_FILE_SIZE if is_test else None
byte_counter = 0
ctx = {
'filename': filename,
'total_frags': len(segment_urls) + (1 if initialization_url else 0),
}
def append_url_to_file(outf, target_url, target_name, remaining_bytes=None):
self.to_screen('[DashSegments] %s: Downloading %s' % (info_dict['id'], target_name))
req = compat_urllib_request.Request(target_url)
if remaining_bytes is not None:
req.add_header('Range', 'bytes=0-%d' % (remaining_bytes - 1))
data = self.ydl.urlopen(req).read()
if remaining_bytes is not None:
data = data[:remaining_bytes]
outf.write(data)
return len(data)
self._prepare_and_start_frag_download(ctx)
def combine_url(base_url, target_url):
if re.match(r'^https?://', target_url):
return target_url
return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
with open(tmpfilename, 'wb') as outf:
append_url_to_file(
outf, combine_url(base_url, info_dict['initialization_url']),
'initialization segment')
for i, segment_url in enumerate(segment_urls):
segment_len = append_url_to_file(
outf, combine_url(base_url, segment_url),
'segment %d / %d' % (i + 1, len(segment_urls)),
remaining_bytes)
byte_counter += segment_len
if remaining_bytes is not None:
remaining_bytes -= segment_len
if remaining_bytes <= 0:
break
segments_filenames = []
self.try_rename(tmpfilename, filename)
fragment_retries = self.params.get('fragment_retries', 0)
self._hook_progress({
'downloaded_bytes': byte_counter,
'total_bytes': byte_counter,
'filename': filename,
'status': 'finished',
})
def append_url_to_file(target_url, tmp_filename, segment_name):
target_filename = '%s-%s' % (tmp_filename, segment_name)
count = 0
while count <= fragment_retries:
try:
success = ctx['dl'].download(target_filename, {'url': combine_url(base_url, target_url)})
if not success:
return False
down, target_sanitized = sanitize_open(target_filename, 'rb')
ctx['dest_stream'].write(down.read())
down.close()
segments_filenames.append(target_sanitized)
break
except (compat_urllib_error.HTTPError, ) as err:
# YouTube may often return 404 HTTP error for a fragment causing the
# whole download to fail. However if the same fragment is immediately
# retried with the same request data this usually succeeds (1-2 attemps
# is usually enough) thus allowing to download the whole file successfully.
# So, we will retry all fragments that fail with 404 HTTP error for now.
if err.code != 404:
raise
# Retry fragment
count += 1
if count <= fragment_retries:
self.report_retry_fragment(segment_name, count, fragment_retries)
if count > fragment_retries:
self.report_error('giving up after %s fragment retries' % fragment_retries)
return False
if initialization_url:
append_url_to_file(initialization_url, ctx['tmpfilename'], 'Init')
for i, segment_url in enumerate(segment_urls):
append_url_to_file(segment_url, ctx['tmpfilename'], 'Seg%d' % i)
self._finish_frag_download(ctx)
for segment_file in segments_filenames:
os.remove(encodeFilename(segment_file))
return True

View File

@@ -2,8 +2,11 @@ from __future__ import unicode_literals
import os.path
import subprocess
import sys
import re
from .common import FileDownloader
from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
from ..utils import (
cli_option,
cli_valueless_option,
@@ -11,6 +14,8 @@ from ..utils import (
cli_configuration_args,
encodeFilename,
encodeArgument,
handle_youtubedl_headers,
check_executable,
)
@@ -45,10 +50,18 @@ class ExternalFD(FileDownloader):
def exe(self):
return self.params.get('external_downloader')
@classmethod
def available(cls):
return check_executable(cls.get_basename(), [cls.AVAILABLE_OPT])
@classmethod
def supports(cls, info_dict):
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')
@classmethod
def can_download(cls, info_dict):
return cls.available() and cls.supports(info_dict)
def _option(self, command_option, param):
return cli_option(self.params, command_option, param)
@@ -76,6 +89,8 @@ class ExternalFD(FileDownloader):
class CurlFD(ExternalFD):
AVAILABLE_OPT = '-V'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '--location', '-o', tmpfilename]
for key, val in info_dict['http_headers'].items():
@@ -89,6 +104,8 @@ class CurlFD(ExternalFD):
class AxelFD(ExternalFD):
AVAILABLE_OPT = '-V'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-o', tmpfilename]
for key, val in info_dict['http_headers'].items():
@@ -99,6 +116,8 @@ class AxelFD(ExternalFD):
class WgetFD(ExternalFD):
AVAILABLE_OPT = '--version'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
for key, val in info_dict['http_headers'].items():
@@ -112,6 +131,8 @@ class WgetFD(ExternalFD):
class Aria2cFD(ExternalFD):
AVAILABLE_OPT = '-v'
def _make_cmd(self, tmpfilename, info_dict):
cmd = [self.exe, '-c']
cmd += self._configuration_args([
@@ -130,12 +151,112 @@ class Aria2cFD(ExternalFD):
class HttpieFD(ExternalFD):
@classmethod
def available(cls):
return check_executable('http', ['--version'])
def _make_cmd(self, tmpfilename, info_dict):
cmd = ['http', '--download', '--output', tmpfilename, info_dict['url']]
for key, val in info_dict['http_headers'].items():
cmd += ['%s:%s' % (key, val)]
return cmd
class FFmpegFD(ExternalFD):
@classmethod
def supports(cls, info_dict):
return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
@classmethod
def available(cls):
return FFmpegPostProcessor().available
def _call_downloader(self, tmpfilename, info_dict):
url = info_dict['url']
ffpp = FFmpegPostProcessor(downloader=self)
if not ffpp.available:
self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
return False
ffpp.check_version()
args = [ffpp.executable, '-y']
args += self._configuration_args()
# start_time = info_dict.get('start_time') or 0
# if start_time:
# args += ['-ss', compat_str(start_time)]
# end_time = info_dict.get('end_time')
# if end_time:
# args += ['-t', compat_str(end_time - start_time)]
if info_dict['http_headers'] and re.match(r'^https?://', url):
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
headers = handle_youtubedl_headers(info_dict['http_headers'])
args += [
'-headers',
''.join('%s: %s\r\n' % (key, val) for key, val in headers.items())]
protocol = info_dict.get('protocol')
if protocol == 'rtmp':
player_url = info_dict.get('player_url')
page_url = info_dict.get('page_url')
app = info_dict.get('app')
play_path = info_dict.get('play_path')
tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False)
if player_url is not None:
args += ['-rtmp_swfverify', player_url]
if page_url is not None:
args += ['-rtmp_pageurl', page_url]
if app is not None:
args += ['-rtmp_app', app]
if play_path is not None:
args += ['-rtmp_playpath', play_path]
if tc_url is not None:
args += ['-rtmp_tcurl', tc_url]
if flash_version is not None:
args += ['-rtmp_flashver', flash_version]
if live:
args += ['-rtmp_live', 'live']
args += ['-i', url, '-c', 'copy']
if protocol == 'm3u8':
if self.params.get('hls_use_mpegts', False):
args += ['-f', 'mpegts']
else:
args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
elif protocol == 'rtmp':
args += ['-f', 'flv']
else:
args += ['-f', EXT_TO_OUT_FORMATS.get(info_dict['ext'], info_dict['ext'])]
args = [encodeArgument(opt) for opt in args]
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
self._debug_cmd(args)
proc = subprocess.Popen(args, stdin=subprocess.PIPE)
try:
retval = proc.wait()
except KeyboardInterrupt:
# subprocces.run would send the SIGKILL signal to ffmpeg and the
# mp4 file couldn't be played, but if we ask ffmpeg to quit it
# produces a file that is playable (this is mostly useful for live
# streams). Note that Windows is not affected and produces playable
# files (see https://github.com/rg3/youtube-dl/issues/8300).
if sys.platform != 'win32':
proc.communicate(b'q')
raise
return retval
class AVconvFD(FFmpegFD):
pass
_BY_NAME = dict(
(klass.get_basename(), klass)
for name, klass in globals().items()

View File

@@ -5,15 +5,17 @@ import io
import itertools
import os
import time
import xml.etree.ElementTree as etree
from .fragment import FragmentFD
from ..compat import (
compat_etree_fromstring,
compat_urlparse,
compat_urllib_error,
compat_urllib_parse_urlparse,
)
from ..utils import (
encodeFilename,
fix_xml_ampersands,
sanitize_open,
struct_pack,
struct_unpack,
@@ -221,6 +223,12 @@ def write_metadata_tag(stream, metadata):
write_unsigned_int(stream, FLV_TAG_HEADER_LEN + len(metadata))
def remove_encrypted_media(media):
return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
'drmAdditionalHeaderSetId' not in e.attrib,
media))
def _add_ns(prop):
return '{http://ns.adobe.com/f4m/1.0}%s' % prop
@@ -242,9 +250,7 @@ class F4mFD(FragmentFD):
# without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
if 'id' not in e.attrib:
self.report_error('Missing ID in f4m DRM')
media = list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
'drmAdditionalHeaderSetId' not in e.attrib,
media))
media = remove_encrypted_media(media)
if not media:
self.report_error('Unsupported DRM')
return media
@@ -271,23 +277,34 @@ class F4mFD(FragmentFD):
return fragments_list
def _parse_bootstrap_node(self, node, base_url):
if node.text is None:
# Sometimes non empty inline bootstrap info can be specified along
# with bootstrap url attribute (e.g. dummy inline bootstrap info
# contains whitespace characters in [1]). We will prefer bootstrap
# url over inline bootstrap info when present.
# 1. http://live-1-1.rutube.ru/stream/1024/HDS/SD/C2NKsS85HQNckgn5HdEmOQ/1454167650/S-s604419906/move/four/dirs/upper/1024-576p.f4m
bootstrap_url = node.get('url')
if bootstrap_url:
bootstrap_url = compat_urlparse.urljoin(
base_url, node.attrib['url'])
base_url, bootstrap_url)
boot_info = self._get_bootstrap_from_url(bootstrap_url)
else:
bootstrap_url = None
bootstrap = base64.b64decode(node.text.encode('ascii'))
boot_info = read_bootstrap_info(bootstrap)
return (boot_info, bootstrap_url)
return boot_info, bootstrap_url
def real_download(self, filename, info_dict):
man_url = info_dict['url']
requested_bitrate = info_dict.get('tbr')
self.to_screen('[%s] Downloading f4m manifest' % self.FD_NAME)
manifest = self.ydl.urlopen(man_url).read()
urlh = self.ydl.urlopen(man_url)
man_url = urlh.geturl()
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/rg3/youtube-dl/issues/6215#issuecomment-121704244
# and https://github.com/rg3/youtube-dl/issues/7823)
manifest = fix_xml_ampersands(urlh.read().decode('utf-8', 'ignore')).strip()
doc = etree.fromstring(manifest)
doc = compat_etree_fromstring(manifest)
formats = [(int(f.attrib.get('bitrate', -1)), f)
for f in self._get_unencrypted_media(doc)]
if requested_bitrate is None:
@@ -309,7 +326,8 @@ class F4mFD(FragmentFD):
metadata = None
fragments_list = build_fragments_list(boot_info)
if self.params.get('test', False):
test = self.params.get('test', False)
if test:
# We only download the first fragment
fragments_list = fragments_list[:1]
total_frags = len(fragments_list)
@@ -319,6 +337,7 @@ class F4mFD(FragmentFD):
ctx = {
'filename': filename,
'total_frags': total_frags,
'live': live,
}
self._prepare_frag_download(ctx)
@@ -329,20 +348,25 @@ class F4mFD(FragmentFD):
if not live:
write_metadata_tag(dest_stream, metadata)
base_url_parsed = compat_urllib_parse_urlparse(base_url)
self._start_frag_download(ctx)
frags_filenames = []
while fragments_list:
seg_i, frag_i = fragments_list.pop(0)
name = 'Seg%d-Frag%d' % (seg_i, frag_i)
url = base_url + name
query = []
if base_url_parsed.query:
query.append(base_url_parsed.query)
if akamai_pv:
url += '?' + akamai_pv.strip(';')
query.append(akamai_pv.strip(';'))
if info_dict.get('extra_param_to_segment_url'):
url += info_dict.get('extra_param_to_segment_url')
query.append(info_dict['extra_param_to_segment_url'])
url_parsed = base_url_parsed._replace(path=base_url_parsed.path + name, query='&'.join(query))
frag_filename = '%s-%s' % (ctx['tmpfilename'], name)
try:
success = ctx['dl'].download(frag_filename, {'url': url})
success = ctx['dl'].download(frag_filename, {'url': url_parsed.geturl()})
if not success:
return False
(down, frag_sanitized) = sanitize_open(frag_filename, 'rb')
@@ -368,7 +392,7 @@ class F4mFD(FragmentFD):
else:
raise
if not fragments_list and live and bootstrap_url:
if not fragments_list and not test and live and bootstrap_url:
fragments_list = self._update_live_fragments(bootstrap_url, frag_i)
total_frags += len(fragments_list)
if fragments_list and (fragments_list[0][1] > frag_i + 1):

View File

@@ -19,14 +19,27 @@ class HttpQuietDownloader(HttpFD):
class FragmentFD(FileDownloader):
"""
A base file downloader class for fragmented media (e.g. f4m/m3u8 manifests).
Available options:
fragment_retries: Number of times to retry a fragment for HTTP error (DASH only)
"""
def report_retry_fragment(self, fragment_name, count, retries):
self.to_screen(
'[download] Got server HTTP error. Retrying fragment %s (attempt %d of %s)...'
% (fragment_name, count, self.format_retries(retries)))
def _prepare_and_start_frag_download(self, ctx):
self._prepare_frag_download(ctx)
self._start_frag_download(ctx)
def _prepare_frag_download(self, ctx):
self.to_screen('[%s] Total fragments: %d' % (self.FD_NAME, ctx['total_frags']))
if 'live' not in ctx:
ctx['live'] = False
self.to_screen(
'[%s] Total fragments: %s'
% (self.FD_NAME, ctx['total_frags'] if not ctx['live'] else 'unknown (live)'))
self.report_destination(ctx['filename'])
dl = HttpQuietDownloader(
self.ydl,
@@ -34,7 +47,7 @@ class FragmentFD(FileDownloader):
'continuedl': True,
'quiet': True,
'noprogress': True,
'ratelimit': self.params.get('ratelimit', None),
'ratelimit': self.params.get('ratelimit'),
'retries': self.params.get('retries', 0),
'test': self.params.get('test', False),
}
@@ -59,37 +72,45 @@ class FragmentFD(FileDownloader):
'filename': ctx['filename'],
'tmpfilename': ctx['tmpfilename'],
}
start = time.time()
ctx['started'] = start
ctx.update({
'started': start,
# Total complete fragments downloaded so far in bytes
'complete_frags_downloaded_bytes': 0,
# Amount of fragment's bytes downloaded by the time of the previous
# frag progress hook invocation
'prev_frag_downloaded_bytes': 0,
})
def frag_progress_hook(s):
if s['status'] not in ('downloading', 'finished'):
return
frag_total_bytes = s.get('total_bytes', 0)
if s['status'] == 'finished':
state['downloaded_bytes'] += frag_total_bytes
state['frag_index'] += 1
estimated_size = (
(state['downloaded_bytes'] + frag_total_bytes) /
(state['frag_index'] + 1) * total_frags)
time_now = time.time()
state['total_bytes_estimate'] = estimated_size
state['elapsed'] = time_now - start
frag_total_bytes = s.get('total_bytes') or 0
if not ctx['live']:
estimated_size = (
(ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) /
(state['frag_index'] + 1) * total_frags)
state['total_bytes_estimate'] = estimated_size
if s['status'] == 'finished':
progress = self.calc_percent(state['frag_index'], total_frags)
state['frag_index'] += 1
state['downloaded_bytes'] += frag_total_bytes - ctx['prev_frag_downloaded_bytes']
ctx['complete_frags_downloaded_bytes'] = state['downloaded_bytes']
ctx['prev_frag_downloaded_bytes'] = 0
else:
frag_downloaded_bytes = s['downloaded_bytes']
frag_progress = self.calc_percent(frag_downloaded_bytes,
frag_total_bytes)
progress = self.calc_percent(state['frag_index'], total_frags)
progress += frag_progress / float(total_frags)
state['eta'] = self.calc_eta(
start, time_now, estimated_size, state['downloaded_bytes'] + frag_downloaded_bytes)
state['speed'] = s.get('speed')
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
if not ctx['live']:
state['eta'] = self.calc_eta(
start, time_now, estimated_size,
state['downloaded_bytes'])
state['speed'] = s.get('speed') or ctx.get('speed')
ctx['speed'] = state['speed']
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
self._hook_progress(state)
ctx['dl'].add_progress_hook(frag_progress_hook)

View File

@@ -1,69 +1,19 @@
from __future__ import unicode_literals
import os
import os.path
import re
import subprocess
from .common import FileDownloader
from .fragment import FragmentFD
from ..compat import compat_urlparse
from ..postprocessor.ffmpeg import FFmpegPostProcessor
from ..utils import (
encodeArgument,
encodeFilename,
sanitize_open,
)
class HlsFD(FileDownloader):
def real_download(self, filename, info_dict):
url = info_dict['url']
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
ffpp = FFmpegPostProcessor(downloader=self)
if not ffpp.available:
self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
return False
ffpp.check_version()
args = [ffpp.executable, '-y']
if info_dict['http_headers'] and re.match(r'^https?://', url):
# Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
# [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
args += [
'-headers',
''.join('%s: %s\r\n' % (key, val) for key, val in info_dict['http_headers'].items())]
args += ['-i', url, '-f', 'mp4', '-c', 'copy', '-bsf:a', 'aac_adtstoasc']
args = [encodeArgument(opt) for opt in args]
args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
self._debug_cmd(args)
retval = subprocess.call(args)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
'total_bytes': fsize,
'filename': filename,
'status': 'finished',
})
return True
else:
self.to_stderr('\n')
self.report_error('%s exited with code %d' % (ffpp.basename, retval))
return False
class NativeHlsFD(FragmentFD):
""" A more limited implementation that does not require ffmpeg """
class HlsFD(FragmentFD):
""" A limited implementation that does not require ffmpeg """
FD_NAME = 'hlsnative'

View File

@@ -7,14 +7,12 @@ import time
import re
from .common import FileDownloader
from ..compat import (
compat_urllib_request,
compat_urllib_error,
)
from ..compat import compat_urllib_error
from ..utils import (
ContentTooShortError,
encodeFilename,
sanitize_open,
sanitized_Request,
)
@@ -29,8 +27,8 @@ class HttpFD(FileDownloader):
add_headers = info_dict.get('http_headers')
if add_headers:
headers.update(add_headers)
basic_request = compat_urllib_request.Request(url, None, headers)
request = compat_urllib_request.Request(url, None, headers)
basic_request = sanitized_Request(url, None, headers)
request = sanitized_Request(url, None, headers)
is_test = self.params.get('test', False)
@@ -142,8 +140,8 @@ class HttpFD(FileDownloader):
if data_len is not None:
data_len = int(data_len) + resume_len
min_data_len = self.params.get("min_filesize", None)
max_data_len = self.params.get("max_filesize", None)
min_data_len = self.params.get('min_filesize')
max_data_len = self.params.get('max_filesize')
if min_data_len is not None and data_len < min_data_len:
self.to_screen('\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
return False

View File

@@ -94,15 +94,15 @@ class RtmpFD(FileDownloader):
return proc.returncode
url = info_dict['url']
player_url = info_dict.get('player_url', None)
page_url = info_dict.get('page_url', None)
app = info_dict.get('app', None)
play_path = info_dict.get('play_path', None)
tc_url = info_dict.get('tc_url', None)
flash_version = info_dict.get('flash_version', None)
player_url = info_dict.get('player_url')
page_url = info_dict.get('page_url')
app = info_dict.get('app')
play_path = info_dict.get('play_path')
tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False)
conn = info_dict.get('rtmp_conn', None)
protocol = info_dict.get('rtmp_protocol', None)
conn = info_dict.get('rtmp_conn')
protocol = info_dict.get('rtmp_protocol')
real_time = info_dict.get('rtmp_real_time', False)
no_resume = info_dict.get('no_resume', False)
continue_dl = self.params.get('continuedl', True)
@@ -117,7 +117,7 @@ class RtmpFD(FileDownloader):
return False
# Download using rtmpdump. rtmpdump returns exit code 2 when
# the connection was interrumpted and resuming appears to be
# the connection was interrupted and resuming appears to be
# possible. This is part of rtmpdump's normal usage, AFAIK.
basic_args = [
'rtmpdump', '--verbose', '-r', url,

View File

@@ -1,851 +1,33 @@
from __future__ import unicode_literals
from .abc import ABCIE
from .abc7news import Abc7NewsIE
from .academicearth import AcademicEarthCourseIE
from .addanime import AddAnimeIE
from .adobetv import (
AdobeTVIE,
AdobeTVVideoIE,
)
from .adultswim import AdultSwimIE
from .aftenposten import AftenpostenIE
from .aftonbladet import AftonbladetIE
from .airmozilla import AirMozillaIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .anitube import AnitubeIE
from .anysex import AnySexIE
from .aol import AolIE
from .allocine import AllocineIE
from .aparat import AparatIE
from .appleconnect import AppleConnectIE
from .appletrailers import AppleTrailersIE
from .archiveorg import ArchiveOrgIE
from .ard import (
ARDIE,
ARDMediathekIE,
SportschauIE,
)
from .arte import (
ArteTvIE,
ArteTVPlus7IE,
ArteTVCreativeIE,
ArteTVConcertIE,
ArteTVFutureIE,
ArteTVDDCIE,
ArteTVEmbedIE,
)
from .atresplayer import AtresPlayerIE
from .atttechchannel import ATTTechChannelIE
from .audiomack import AudiomackIE, AudiomackAlbumIE
from .azubu import AzubuIE
from .baidu import BaiduVideoIE
from .bambuser import BambuserIE, BambuserChannelIE
from .bandcamp import BandcampIE, BandcampAlbumIE
from .bbc import (
BBCCoUkIE,
BBCCoUkArticleIE,
BBCIE,
)
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .beatportpro import BeatportProIE
from .bet import BetIE
from .bild import BildIE
from .bilibili import BiliBiliIE
from .blinkx import BlinkxIE
from .bliptv import BlipTVIE, BlipTVUserIE
from .bloomberg import BloombergIE
from .bpb import BpbIE
from .br import BRIE
from .breakcom import BreakIE
from .brightcove import BrightcoveIE
from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE
from .c56 import C56IE
from .camdemy import (
CamdemyIE,
CamdemyFolderIE
)
from .canal13cl import Canal13clIE
from .canalplus import CanalplusIE
from .canalc2 import Canalc2IE
from .cbs import CBSIE
from .cbsnews import CBSNewsIE
from .cbssports import CBSSportsIE
from .ccc import CCCIE
from .ceskatelevize import CeskaTelevizeIE
from .channel9 import Channel9IE
from .chaturbate import ChaturbateIE
from .chilloutzone import ChilloutzoneIE
from .chirbit import (
ChirbitIE,
ChirbitProfileIE,
)
from .cinchcast import CinchcastIE
from .cinemassacre import CinemassacreIE
from .clipfish import ClipfishIE
from .cliphunter import CliphunterIE
from .clipsyndicate import ClipsyndicateIE
from .cloudy import CloudyIE
from .clubic import ClubicIE
from .cmt import CMTIE
from .cnet import CNETIE
from .cnn import (
CNNIE,
CNNBlogsIE,
CNNArticleIE,
)
from .collegehumor import CollegeHumorIE
from .collegerama import CollegeRamaIE
from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
from .comcarcoff import ComCarCoffIE
from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
from .condenast import CondeNastIE
from .cracked import CrackedIE
from .criterion import CriterionIE
from .crooksandliars import CrooksAndLiarsIE
from .crunchyroll import (
CrunchyrollIE,
CrunchyrollShowPlaylistIE
)
from .cspan import CSpanIE
from .ctsnews import CtsNewsIE
from .dailymotion import (
DailymotionIE,
DailymotionPlaylistIE,
DailymotionUserIE,
DailymotionCloudIE,
)
from .daum import DaumIE
from .dbtv import DBTVIE
from .dcn import DCNIE
from .dctp import DctpTvIE
from .deezer import DeezerPlaylistIE
from .dfb import DFBIE
from .dhm import DHMIE
from .dotsub import DotsubIE
from .douyutv import DouyuTVIE
from .dramafever import (
DramaFeverIE,
DramaFeverSeriesIE,
)
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
from .drtv import DRTVIE
from .dvtv import DVTVIE
from .dump import DumpIE
from .dumpert import DumpertIE
from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .dropbox import DropboxIE
from .eagleplatform import EaglePlatformIE
from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE
from .ehow import EHowIE
from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE
from .eitb import EitbIE
from .ellentv import (
EllenTVIE,
EllenTVClipsIE,
)
from .elpais import ElPaisIE
from .embedly import EmbedlyIE
from .engadget import EngadgetIE
from .eporner import EpornerIE
from .eroprofile import EroProfileIE
from .escapist import EscapistIE
from .espn import ESPNIE
from .esri import EsriVideoIE
from .europa import EuropaIE
from .everyonesmixtape import EveryonesMixtapeIE
from .exfm import ExfmIE
from .expotv import ExpoTVIE
from .extremetube import ExtremeTubeIE
from .facebook import FacebookIE
from .faz import FazIE
from .fc2 import FC2IE
from .fczenit import FczenitIE
from .firstpost import FirstpostIE
from .firsttv import FirstTVIE
from .fivemin import FiveMinIE
from .fivetv import FiveTVIE
from .fktv import FKTVIE
from .flickr import FlickrIE
from .folketinget import FolketingetIE
from .footyroom import FootyRoomIE
from .fourtube import FourTubeIE
from .foxgay import FoxgayIE
from .foxnews import FoxNewsIE
from .foxsports import FoxSportsIE
from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE
from .francetv import (
PluzzIE,
FranceTvInfoIE,
FranceTVIE,
GenerationQuoiIE,
CultureboxIE,
)
from .freesound import FreesoundIE
from .freespeech import FreespeechIE
from .freevideo import FreeVideoIE
from .funnyordie import FunnyOrDieIE
from .gamekings import GamekingsIE
from .gameone import (
GameOneIE,
GameOnePlaylistIE,
)
from .gamersyde import GamersydeIE
from .gamespot import GameSpotIE
from .gamestar import GameStarIE
from .gametrailers import GametrailersIE
from .gazeta import GazetaIE
from .gdcvault import GDCVaultIE
from .generic import GenericIE
from .gfycat import GfycatIE
from .giantbomb import GiantBombIE
from .giga import GigaIE
from .glide import GlideIE
from .globo import GloboIE
from .godtube import GodTubeIE
from .goldenmoustache import GoldenMoustacheIE
from .golem import GolemIE
from .googleplus import GooglePlusIE
from .googlesearch import GoogleSearchIE
from .gorillavid import GorillaVidIE
from .goshgay import GoshgayIE
from .groupon import GrouponIE
from .hark import HarkIE
from .hearthisat import HearThisAtIE
from .heise import HeiseIE
from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE
from .hentaistigma import HentaiStigmaIE
from .historicfilms import HistoricFilmsIE
from .history import HistoryIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hornbunny import HornBunnyIE
from .hotnewhiphop import HotNewHipHopIE
from .howcast import HowcastIE
from .howstuffworks import HowStuffWorksIE
from .huffpost import HuffPostIE
from .hypem import HypemIE
from .iconosquare import IconosquareIE
from .ign import IGNIE, OneUPIE
from .imdb import (
ImdbIE,
ImdbListIE
)
from .imgur import (
ImgurIE,
ImgurAlbumIE,
)
from .ina import InaIE
from .indavideo import (
IndavideoIE,
IndavideoEmbedIE,
)
from .infoq import InfoQIE
from .instagram import InstagramIE, InstagramUserIE
from .internetvideoarchive import InternetVideoArchiveIE
from .iprima import IPrimaIE
from .iqiyi import IqiyiIE
from .ir90tv import Ir90TvIE
from .ivi import (
IviIE,
IviCompilationIE
)
from .izlesene import IzleseneIE
from .jadorecettepub import JadoreCettePubIE
from .jeuxvideo import JeuxVideoIE
from .jove import JoveIE
from .jukebox import JukeboxIE
from .jpopsukitv import JpopsukiIE
from .kaltura import KalturaIE
from .kanalplay import KanalPlayIE
from .kankan import KankanIE
from .karaoketv import KaraoketvIE
from .karrierevideos import KarriereVideosIE
from .keezmovies import KeezMoviesIE
from .khanacademy import KhanAcademyIE
from .kickstarter import KickStarterIE
from .keek import KeekIE
from .kontrtube import KontrTubeIE
from .krasview import KrasViewIE
from .ku6 import Ku6IE
from .kuwo import (
KuwoIE,
KuwoAlbumIE,
KuwoChartIE,
KuwoSingerIE,
KuwoCategoryIE,
KuwoMvIE,
)
from .la7 import LA7IE
from .laola1tv import Laola1TvIE
from .lecture2go import Lecture2GoIE
from .letv import (
LetvIE,
LetvTvIE,
LetvPlaylistIE
)
from .libsyn import LibsynIE
from .lifenews import (
LifeNewsIE,
LifeEmbedIE,
)
from .limelight import (
LimelightMediaIE,
LimelightChannelIE,
LimelightChannelListIE,
)
from .liveleak import LiveLeakIE
from .livestream import (
LivestreamIE,
LivestreamOriginalIE,
LivestreamShortenerIE,
)
from .lnkgo import LnkGoIE
from .lrt import LRTIE
from .lynda import (
LyndaIE,
LyndaCourseIE
)
from .m6 import M6IE
from .macgamestore import MacGameStoreIE
from .mailru import MailRuIE
from .malemotion import MalemotionIE
from .mdr import MDRIE
from .metacafe import MetacafeIE
from .metacritic import MetacriticIE
from .mgoon import MgoonIE
from .minhateca import MinhatecaIE
from .ministrygrid import MinistryGridIE
from .miomio import MioMioIE
from .mit import TechTVMITIE, MITIE, OCWMITIE
from .mitele import MiTeleIE
from .mixcloud import MixcloudIE
from .mlb import MLBIE
from .mpora import MporaIE
from .moevideo import MoeVideoIE
from .mofosex import MofosexIE
from .mojvideo import MojvideoIE
from .moniker import MonikerIE
from .mooshare import MooshareIE
from .morningstar import MorningstarIE
from .motherless import MotherlessIE
from .motorsport import MotorsportIE
from .movieclips import MovieClipsIE
from .moviezine import MoviezineIE
from .movshare import MovShareIE
from .mtv import (
MTVIE,
MTVServicesEmbeddedIE,
MTVIggyIE,
MTVDEIE,
)
from .muenchentv import MuenchenTVIE
from .musicplayon import MusicPlayOnIE
from .muzu import MuzuTVIE
from .mwave import MwaveIE
from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE
from .myvi import MyviIE
from .myvideo import MyVideoIE
from .myvidster import MyVidsterIE
from .nationalgeographic import NationalGeographicIE
from .naver import NaverIE
from .nba import NBAIE
from .nbc import (
NBCIE,
NBCNewsIE,
NBCSportsIE,
NBCSportsVPlayerIE,
MSNBCIE,
)
from .ndr import (
NDRIE,
NJoyIE,
NDREmbedBaseIE,
NDREmbedIE,
NJoyEmbedIE,
)
from .ndtv import NDTVIE
from .netzkino import NetzkinoIE
from .nerdcubed import NerdCubedFeedIE
from .nerdist import NerdistIE
from .neteasemusic import (
NetEaseMusicIE,
NetEaseMusicAlbumIE,
NetEaseMusicSingerIE,
NetEaseMusicListIE,
NetEaseMusicMvIE,
NetEaseMusicProgramIE,
NetEaseMusicDjRadioIE,
)
from .newgrounds import NewgroundsIE
from .newstube import NewstubeIE
from .nextmedia import (
NextMediaIE,
NextMediaActionNewsIE,
AppleDailyIE,
)
from .nfb import NFBIE
from .nfl import NFLIE
from .nhl import (
NHLIE,
NHLNewsIE,
NHLVideocenterIE,
)
from .niconico import NiconicoIE, NiconicoPlaylistIE
from .ninegag import NineGagIE
from .noco import NocoIE
from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE
from .nova import NovaIE
from .novamov import NovaMovIE
from .nowness import (
NownessIE,
NownessPlaylistIE,
NownessSeriesIE,
)
from .nowtv import NowTVIE
from .nowvideo import NowVideoIE
from .npo import (
NPOIE,
NPOLiveIE,
NPORadioIE,
NPORadioFragmentIE,
VPROIE,
WNLIE
)
from .nrk import (
NRKIE,
NRKPlaylistIE,
NRKTVIE,
)
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .nytimes import (
NYTimesIE,
NYTimesArticleIE,
)
from .nuvid import NuvidIE
from .odnoklassniki import OdnoklassnikiIE
from .oktoberfesttv import OktoberfestTVIE
from .onionstudios import OnionStudiosIE
from .ooyala import (
OoyalaIE,
OoyalaExternalIE,
)
from .orf import (
ORFTVthekIE,
ORFOE1IE,
ORFFM4IE,
ORFIPTVIE,
)
from .parliamentliveuk import ParliamentLiveUKIE
from .patreon import PatreonIE
from .pbs import PBSIE
from .periscope import (
PeriscopeIE,
QuickscopeIE,
)
from .philharmoniedeparis import PhilharmonieDeParisIE
from .phoenix import PhoenixIE
from .photobucket import PhotobucketIE
from .pinkbike import PinkbikeIE
from .planetaplay import PlanetaPlayIE
from .pladform import PladformIE
from .played import PlayedIE
from .playfm import PlayFMIE
from .playtvak import PlaytvakIE
from .playvid import PlayvidIE
from .playwire import PlaywireIE
from .pluralsight import (
PluralsightIE,
PluralsightCourseIE,
)
from .podomatic import PodomaticIE
from .porn91 import Porn91IE
from .pornhd import PornHdIE
from .pornhub import (
PornHubIE,
PornHubPlaylistIE,
)
from .pornotube import PornotubeIE
from .pornovoisines import PornoVoisinesIE
from .pornoxo import PornoXOIE
from .primesharetv import PrimeShareTVIE
from .promptfile import PromptFileIE
from .prosiebensat1 import ProSiebenSat1IE
from .puls4 import Puls4IE
from .pyvideo import PyvideoIE
from .qqmusic import (
QQMusicIE,
QQMusicSingerIE,
QQMusicAlbumIE,
QQMusicToplistIE,
QQMusicPlaylistIE,
)
from .quickvid import QuickVidIE
from .r7 import R7IE
from .radiode import RadioDeIE
from .radiojavan import RadioJavanIE
from .radiobremen import RadioBremenIE
from .radiofrance import RadioFranceIE
from .rai import RaiIE
from .rbmaradio import RBMARadioIE
from .rds import RDSIE
from .redtube import RedTubeIE
from .restudy import RestudyIE
from .reverbnation import ReverbNationIE
from .ringtv import RingTVIE
from .ro220 import Ro220IE
from .rottentomatoes import RottenTomatoesIE
from .roxwel import RoxwelIE
from .rtbf import RTBFIE
from .rte import RteIE
from .rtlnl import RtlNlIE
from .rtl2 import RTL2IE
from .rtp import RTPIE
from .rts import RTSIE
from .rtve import RTVEALaCartaIE, RTVELiveIE, RTVEInfantilIE
from .rtvnh import RTVNHIE
from .ruhd import RUHDIE
from .rutube import (
RutubeIE,
RutubeChannelIE,
RutubeEmbedIE,
RutubeMovieIE,
RutubePersonIE,
)
from .rutv import RUTVIE
from .ruutu import RuutuIE
from .sandia import SandiaIE
from .safari import (
SafariIE,
SafariCourseIE,
)
from .sapo import SapoIE
from .savefrom import SaveFromIE
from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .screenwavemedia import ScreenwaveMediaIE, TeamFourIE
from .senateisvp import SenateISVPIE
from .servingsys import ServingSysIE
from .sexu import SexuIE
from .sexykarma import SexyKarmaIE
from .shahid import ShahidIE
from .shared import SharedIE
from .sharesix import ShareSixIE
from .sina import SinaIE
from .slideshare import SlideshareIE
from .slutload import SlutloadIE
from .smotri import (
SmotriIE,
SmotriCommunityIE,
SmotriUserIE,
SmotriBroadcastIE,
)
from .snagfilms import (
SnagFilmsIE,
SnagFilmsEmbedIE,
)
from .snotr import SnotrIE
from .sohu import SohuIE
from .soompi import (
SoompiIE,
SoompiShowIE,
)
from .soundcloud import (
SoundcloudIE,
SoundcloudSetIE,
SoundcloudUserIE,
SoundcloudPlaylistIE
)
from .soundgasm import (
SoundgasmIE,
SoundgasmProfileIE
)
from .southpark import (
SouthParkIE,
SouthParkDeIE,
SouthParkDkIE,
SouthParkEsIE,
SouthParkNlIE
)
from .space import SpaceIE
from .spankbang import SpankBangIE
from .spankwire import SpankwireIE
from .spiegel import SpiegelIE, SpiegelArticleIE
from .spiegeltv import SpiegeltvIE
from .spike import SpikeIE
from .stitcher import StitcherIE
from .sport5 import Sport5IE
from .sportbox import (
SportBoxIE,
SportBoxEmbedIE,
)
from .sportdeutschland import SportDeutschlandIE
from .srf import SrfIE
from .srmediathek import SRMediathekIE
from .ssa import SSAIE
from .stanfordoc import StanfordOpenClassroomIE
from .steam import SteamIE
from .streamcloud import StreamcloudIE
from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE
from .sunporno import SunPornoIE
from .svt import (
SVTIE,
SVTPlayIE,
)
from .swrmediathek import SWRMediathekIE
from .syfy import SyfyIE
from .sztvhu import SztvHuIE
from .tagesschau import TagesschauIE
from .tapely import TapelyIE
from .tass import TassIE
from .teachertube import (
TeacherTubeIE,
TeacherTubeUserIE,
)
from .teachingchannel import TeachingChannelIE
from .teamcoco import TeamcocoIE
from .techtalks import TechTalksIE
from .ted import TEDIE
from .telebruxelles import TeleBruxellesIE
from .telecinco import TelecincoIE
from .telegraaf import TelegraafIE
from .telemb import TeleMBIE
from .teletask import TeleTaskIE
from .tenplay import TenPlayIE
from .testurl import TestURLIE
from .testtube import TestTubeIE
from .tf1 import TF1IE
from .theonion import TheOnionIE
from .theplatform import (
ThePlatformIE,
ThePlatformFeedIE,
)
from .thesixtyone import TheSixtyOneIE
from .thisamericanlife import ThisAmericanLifeIE
from .thisav import ThisAVIE
from .tinypic import TinyPicIE
from .tlc import TlcIE, TlcDeIE
from .tmz import (
TMZIE,
TMZArticleIE,
)
from .tnaflix import (
TNAFlixIE,
EMPFlixIE,
MovieFapIE,
)
from .thvideo import (
THVideoIE,
THVideoPlaylistIE
)
from .toutv import TouTvIE
from .toypics import ToypicsUserIE, ToypicsIE
from .traileraddict import TrailerAddictIE
from .trilulilu import TriluliluIE
from .trutube import TruTubeIE
from .tube8 import Tube8IE
from .tubitv import TubiTvIE
from .tudou import TudouIE
from .tumblr import TumblrIE
from .tunein import TuneInIE
from .turbo import TurboIE
from .tutv import TutvIE
from .tv2 import (
TV2IE,
TV2ArticleIE,
)
from .tv4 import TV4IE
from .tvc import (
TVCIE,
TVCArticleIE,
)
from .tvigle import TvigleIE
from .tvp import TvpIE, TvpSeriesIE
from .tvplay import TVPlayIE
from .tweakers import TweakersIE
from .twentyfourvideo import TwentyFourVideoIE
from .twentytwotracks import (
TwentyTwoTracksIE,
TwentyTwoTracksGenreIE
)
from .twitch import (
TwitchVideoIE,
TwitchChapterIE,
TwitchVodIE,
TwitchProfileIE,
TwitchPastBroadcastsIE,
TwitchBookmarksIE,
TwitchStreamIE,
)
from .twitter import TwitterCardIE, TwitterIE
from .ubu import UbuIE
from .udemy import (
UdemyIE,
UdemyCourseIE
)
from .udn import UDNEmbedIE
from .ultimedia import UltimediaIE
from .unistra import UnistraIE
from .urort import UrortIE
from .ustream import UstreamIE, UstreamChannelIE
from .varzesh3 import Varzesh3IE
from .vbox7 import Vbox7IE
from .veehd import VeeHDIE
from .veoh import VeohIE
from .vessel import VesselIE
from .vesti import VestiIE
from .vevo import VevoIE
from .vgtv import (
BTArticleIE,
BTVestlendingenIE,
VGTVIE,
)
from .vh1 import VH1IE
from .vice import ViceIE
from .viddler import ViddlerIE
from .videodetective import VideoDetectiveIE
from .videolecturesnet import VideoLecturesNetIE
from .videofyme import VideofyMeIE
from .videomega import VideoMegaIE
from .videopremium import VideoPremiumIE
from .videott import VideoTtIE
from .videoweed import VideoWeedIE
from .vidme import VidmeIE
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE
from .viewster import ViewsterIE
from .vimeo import (
VimeoIE,
VimeoAlbumIE,
VimeoChannelIE,
VimeoGroupsIE,
VimeoLikesIE,
VimeoReviewIE,
VimeoUserIE,
VimeoWatchLaterIE,
)
from .vimple import VimpleIE
from .vine import (
VineIE,
VineUserIE,
)
from .viki import (
VikiIE,
VikiChannelIE,
)
from .vk import (
VKIE,
VKUserVideosIE,
)
from .vlive import VLiveIE
from .vodlocker import VodlockerIE
from .voicerepublic import VoiceRepublicIE
from .vporn import VpornIE
from .vrt import VRTIE
from .vube import VubeIE
from .vuclip import VuClipIE
from .vulture import VultureIE
from .walla import WallaIE
from .washingtonpost import WashingtonPostIE
from .wat import WatIE
from .wayofthemaster import WayOfTheMasterIE
from .wdr import (
WDRIE,
WDRMobileIE,
WDRMausIE,
)
from .webofstories import (
WebOfStoriesIE,
WebOfStoriesPlaylistIE,
)
from .weibo import WeiboIE
from .wimp import WimpIE
from .wistia import WistiaIE
from .worldstarhiphop import WorldStarHipHopIE
from .wrzuta import WrzutaIE
from .wsj import WSJIE
from .xbef import XBefIE
from .xboxclips import XboxClipsIE
from .xhamster import (
XHamsterIE,
XHamsterEmbedIE,
)
from .xminus import XMinusIE
from .xnxx import XNXXIE
from .xstream import XstreamIE
from .xtube import XTubeUserIE, XTubeIE
from .xuite import XuiteIE
from .xvideos import XVideosIE
from .xxxymovies import XXXYMoviesIE
from .yahoo import (
YahooIE,
YahooSearchIE,
)
from .yam import YamIE
from .yandexmusic import (
YandexMusicTrackIE,
YandexMusicAlbumIE,
YandexMusicPlaylistIE,
)
from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE
from .ynet import YnetIE
from .youjizz import YouJizzIE
from .youku import YoukuIE
from .youporn import YouPornIE
from .yourupload import YourUploadIE
from .youtube import (
YoutubeIE,
YoutubeChannelIE,
YoutubeFavouritesIE,
YoutubeHistoryIE,
YoutubePlaylistIE,
YoutubeRecommendedIE,
YoutubeSearchDateIE,
YoutubeSearchIE,
YoutubeSearchURLIE,
YoutubeShowIE,
YoutubeSubscriptionsIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeUserIE,
YoutubeWatchLaterIE,
)
from .zapiks import ZapiksIE
from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import (
ZingMp3SongIE,
ZingMp3AlbumIE,
)
try:
from .lazy_extractors import *
from .lazy_extractors import _ALL_CLASSES
_LAZY_LOADER = True
except ImportError:
_LAZY_LOADER = False
from .extractors import *
_ALL_CLASSES = [
klass
for name, klass in globals().items()
if name.endswith('IE') and name != 'GenericIE'
]
_ALL_CLASSES.append(GenericIE)
_ALL_CLASSES = [
klass
for name, klass in globals().items()
if name.endswith('IE') and name != 'GenericIE'
]
_ALL_CLASSES.append(GenericIE)
def gen_extractor_classes():
""" Return a list of supported extractors.
The order does matter; the first extractor matched is the one handling the URL.
"""
return _ALL_CLASSES
def gen_extractors():
""" Return a list of an instance of every supported extractor.
The order does matter; the first extractor matched is the one handling the URL.
"""
return [klass() for klass in _ALL_CLASSES]
return [klass() for klass in gen_extractor_classes()]
def list_extractors(age_limit):

View File

@@ -12,7 +12,7 @@ from ..utils import (
class ABCIE(InfoExtractor):
IE_NAME = 'abc.net.au'
_VALID_URL = r'http://www\.abc\.net\.au/news/[^/]+/[^/]+/(?P<id>\d+)'
_VALID_URL = r'https?://www\.abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
@@ -23,6 +23,7 @@ class ABCIE(InfoExtractor):
'title': 'Australia to help staff Ebola treatment centre in Sierra Leone',
'description': 'md5:809ad29c67a05f54eb41f2a105693a67',
},
'skip': 'this video has expired',
}, {
'url': 'http://www.abc.net.au/news/2015-08-17/warren-entsch-introduces-same-sex-marriage-bill/6702326',
'md5': 'db2a5369238b51f9811ad815b69dc086',
@@ -36,6 +37,19 @@ class ABCIE(InfoExtractor):
'title': 'Marriage Equality: Warren Entsch introduces same sex marriage bill',
},
'add_ie': ['Youtube'],
'skip': 'Not accessible from Travis CI server',
}, {
'url': 'http://www.abc.net.au/news/2015-10-23/nab-lifts-interest-rates-following-westpac-and-cba/6880080',
'md5': 'b96eee7c9edf4fc5a358a0252881cc1f',
'info_dict': {
'id': '6880080',
'ext': 'mp3',
'title': 'NAB lifts interest rates, following Westpac and CBA',
'description': 'md5:f13d8edc81e462fce4a0437c7dc04728',
},
}, {
'url': 'http://www.abc.net.au/news/2015-10-19/6866214',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -43,9 +57,12 @@ class ABCIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
mobj = re.search(
r'inline(?P<type>Video|YouTube)Data\.push\((?P<json_data>[^)]+)\);',
r'inline(?P<type>Video|Audio|YouTube)Data\.push\((?P<json_data>[^)]+)\);',
webpage)
if mobj is None:
expired = self._html_search_regex(r'(?s)class="expired-(?:video|audio)".+?<span>(.+?)</span>', webpage, 'expired', None)
if expired:
raise ExtractorError('%s said: %s' % (self.IE_NAME, expired), expected=True)
raise ExtractorError('Unable to extract video urls')
urls_info = self._parse_json(
@@ -60,11 +77,13 @@ class ABCIE(InfoExtractor):
formats = [{
'url': url_info['url'],
'vcodec': url_info.get('codec') if mobj.group('type') == 'Video' else 'none',
'width': int_or_none(url_info.get('width')),
'height': int_or_none(url_info.get('height')),
'tbr': int_or_none(url_info.get('bitrate')),
'filesize': int_or_none(url_info.get('filesize')),
} for url_info in urls_info]
self._sort_formats(formats)
return {

View File

@@ -0,0 +1,82 @@
# coding: utf-8
from __future__ import unicode_literals
import re
import functools
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
OnDemandPagedList,
)
class ACastIE(InfoExtractor):
IE_NAME = 'acast'
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<channel>[^/]+)/(?P<id>[^/#?]+)'
_TEST = {
'url': 'https://www.acast.com/condenasttraveler/-where-are-you-taipei-101-taiwan',
'md5': 'ada3de5a1e3a2a381327d749854788bb',
'info_dict': {
'id': '57de3baa-4bb0-487e-9418-2692c1277a34',
'ext': 'mp3',
'title': '"Where Are You?": Taipei 101, Taiwan',
'timestamp': 1196172000000,
'description': 'md5:a0b4ef3634e63866b542e5b1199a1a0e',
'duration': 211,
}
}
def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups()
cast_data = self._download_json(
'https://embed.acast.com/api/acasts/%s/%s' % (channel, display_id), display_id)
return {
'id': compat_str(cast_data['id']),
'display_id': display_id,
'url': cast_data['blings'][0]['audio'],
'title': cast_data['name'],
'description': cast_data.get('description'),
'thumbnail': cast_data.get('image'),
'timestamp': int_or_none(cast_data.get('publishingDate')),
'duration': int_or_none(cast_data.get('duration')),
}
class ACastChannelIE(InfoExtractor):
IE_NAME = 'acast:channel'
_VALID_URL = r'https?://(?:www\.)?acast\.com/(?P<id>[^/#?]+)'
_TEST = {
'url': 'https://www.acast.com/condenasttraveler',
'info_dict': {
'id': '50544219-29bb-499e-a083-6087f4cb7797',
'title': 'Condé Nast Traveler Podcast',
'description': 'md5:98646dee22a5b386626ae31866638fbd',
},
'playlist_mincount': 20,
}
_API_BASE_URL = 'https://www.acast.com/api/'
_PAGE_SIZE = 10
@classmethod
def suitable(cls, url):
return False if ACastIE.suitable(url) else super(ACastChannelIE, cls).suitable(url)
def _fetch_page(self, channel_slug, page):
casts = self._download_json(
self._API_BASE_URL + 'channels/%s/acasts?page=%s' % (channel_slug, page),
channel_slug, note='Download page %d of channel data' % page)
for cast in casts:
yield self.url_result(
'https://www.acast.com/%s/%s' % (channel_slug, cast['url']),
'ACast', cast['id'])
def _real_extract(self, url):
channel_slug = self._match_id(url)
channel_data = self._download_json(
self._API_BASE_URL + 'channels/%s' % channel_slug, channel_slug)
entries = OnDemandPagedList(functools.partial(
self._fetch_page, channel_slug), self._PAGE_SIZE)
return self.playlist_result(entries, compat_str(
channel_data['id']), channel_data['name'], channel_data.get('description'))

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
)
from ..utils import (
@@ -16,7 +16,7 @@ from ..utils import (
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'http://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
@@ -60,7 +60,7 @@ class AddAnimeIE(InfoExtractor):
confirm_url = (
parsed_url.scheme + '://' + parsed_url.netloc +
action + '?' +
compat_urllib_parse.urlencode({
compat_urllib_parse_urlencode({
'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
self._download_webpage(
confirm_url, video_id,

View File

@@ -1,23 +1,32 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
parse_duration,
unified_strdate,
str_to_int,
int_or_none,
float_or_none,
ISO639Utils,
determine_ext,
)
class AdobeTVIE(InfoExtractor):
_VALID_URL = r'https?://tv\.adobe\.com/watch/[^/]+/(?P<id>[^/]+)'
class AdobeTVBaseIE(InfoExtractor):
_API_BASE_URL = 'http://tv.adobe.com/api/v4/'
class AdobeTVIE(AdobeTVBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/watch/the-complete-picture-with-julieanne-kost/quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop/',
'md5': '9bc5727bcdd55251f35ad311ca74fa1e',
'info_dict': {
'id': 'quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop',
'id': '10981',
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
@@ -29,50 +38,106 @@ class AdobeTVIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
language, show_urlname, urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
player = self._parse_json(
self._search_regex(r'html5player:\s*({.+?})\s*\n', webpage, 'player'),
video_id)
title = player.get('title') or self._search_regex(
r'data-title="([^"]+)"', webpage, 'title')
description = self._og_search_description(webpage)
thumbnail = self._og_search_thumbnail(webpage)
upload_date = unified_strdate(
self._html_search_meta('datepublished', webpage, 'upload date'))
duration = parse_duration(
self._html_search_meta('duration', webpage, 'duration') or
self._search_regex(
r'Runtime:\s*(\d{2}:\d{2}:\d{2})',
webpage, 'duration', fatal=False))
view_count = str_to_int(self._search_regex(
r'<div class="views">\s*Views?:\s*([\d,.]+)\s*</div>',
webpage, 'view count'))
video_data = self._download_json(
self._API_BASE_URL + 'episode/get/?language=%s&show_urlname=%s&urlname=%s&disclosure=standard' % (language, show_urlname, urlname),
urlname)['data'][0]
formats = [{
'url': source['src'],
'format_id': source.get('quality') or source['src'].split('-')[-1].split('.')[0] or None,
'tbr': source.get('bitrate'),
} for source in player['sources']]
'url': source['url'],
'format_id': source.get('quality_level') or source['url'].split('-')[-1].split('.')[0] or None,
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
} for source in video_data['videos']]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
'duration': duration,
'view_count': view_count,
'id': compat_str(video_data['id']),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
}
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
def _parse_page_data(self, page_data):
return [self.url_result(self._get_element_url(element_data)) for element_data in page_data]
def _extract_playlist_entries(self, url, display_id):
page = self._download_json(url, display_id)
entries = self._parse_page_data(page['data'])
for page_num in range(2, page['paging']['pages'] + 1):
entries.extend(self._parse_page_data(
self._download_json(url + '&page=%d' % page_num, display_id)['data']))
return entries
class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/show/the-complete-picture-with-julieanne-kost',
'info_dict': {
'id': '36',
'title': 'The Complete Picture with Julieanne Kost',
'description': 'md5:fa50867102dcd1aa0ddf2ab039311b27',
},
'playlist_mincount': 136,
}
def _get_element_url(self, element_data):
return element_data['urls'][0]
def _real_extract(self, url):
language, show_urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
query = 'language=%s&show_urlname=%s' % (language, show_urlname)
show_data = self._download_json(self._API_BASE_URL + 'show/get/?%s' % query, show_urlname)['data'][0]
return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'episode/?%s' % query, show_urlname),
compat_str(show_data['id']),
show_data['show_name'],
show_data['show_description'])
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = {
'url': 'http://tv.adobe.com/channel/development',
'info_dict': {
'id': 'development',
},
'playlist_mincount': 96,
}
def _get_element_url(self, element_data):
return element_data['url']
def _real_extract(self, url):
language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups()
if not language:
language = 'en'
query = 'language=%s&channel_urlname=%s' % (language, channel_urlname)
if category_urlname:
query += '&category_urlname=%s' % category_urlname
return self.playlist_result(
self._extract_playlist_entries(self._API_BASE_URL + 'show/?%s' % query, channel_urlname),
channel_urlname)
class AdobeTVVideoIE(InfoExtractor):
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
@@ -91,28 +156,25 @@ class AdobeTVVideoIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_params = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'player parameters'),
video_id)
video_data = self._download_json(url + '?format=json', video_id)
formats = [{
'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),
'url': source['src'],
'width': source.get('width'),
'height': source.get('height'),
'tbr': source.get('bitrate'),
} for source in player_params['sources']]
'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('bitrate')),
} for source in video_data['sources']]
self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among
# formats. I just pick the max one
duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000)
for source in player_params['sources']]))
for source in video_data['sources']]))
subtitles = {}
for translation in player_params.get('translations', []):
for translation in video_data.get('translations', []):
lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
if lang_id not in subtitles:
subtitles[lang_id] = []
@@ -124,8 +186,9 @@ class AdobeTVVideoIE(InfoExtractor):
return {
'id': video_id,
'formats': formats,
'title': player_params['title'],
'description': self._og_search_description(webpage),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data['video'].get('poster'),
'duration': duration,
'subtitles': subtitles,
}

View File

@@ -68,7 +68,7 @@ class AdultSwimIE(InfoExtractor):
'md5': '3e346a2ab0087d687a05e1e7f3b3e529',
'info_dict': {
'id': 'sY3cMUR_TbuE4YmdjzbIcQ-0',
'ext': 'flv',
'ext': 'mp4',
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
},
@@ -79,6 +79,10 @@ class AdultSwimIE(InfoExtractor):
'title': 'Tim and Eric Awesome Show Great Job! - Dr. Steve Brule, For Your Wine',
'description': 'Dr. Brule reports live from Wine Country with a special report on wines. \r\nWatch Tim and Eric Awesome Show Great Job! episode #20, "Embarrassed" on Adult Swim.\r\n\r\n',
},
'params': {
# m3u8 download
'skip_download': True,
}
}]
@staticmethod
@@ -183,7 +187,8 @@ class AdultSwimIE(InfoExtractor):
media_url = file_el.text
if determine_ext(media_url) == 'm3u8':
formats.extend(self._extract_m3u8_formats(
media_url, segment_title, 'mp4', preference=0, m3u8_id='hls'))
media_url, segment_title, 'mp4', preference=0,
m3u8_id='hls', fatal=False))
else:
formats.append({
'format_id': '%s_%s' % (bitrate, ftype),

View File

@@ -0,0 +1,87 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
smuggle_url,
update_url_query,
unescapeHTML,
)
class AENetworksIE(InfoExtractor):
IE_NAME = 'aenetworks'
IE_DESC = 'A+E Networks: A&E, Lifetime, History.com, FYI Network'
_VALID_URL = r'https?://(?:www\.)?(?:(?:history|aetv|mylifetime)\.com|fyi\.tv)/(?P<type>[^/]+)/(?:[^/]+/)+(?P<id>[^/]+?)(?:$|[?#])'
_TESTS = [{
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'info_dict': {
'id': 'g12m5Gyt3fdR',
'ext': 'mp4',
'title': "Bet You Didn't Know: Valentine's Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729,
'upload_date': '20130806',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
'expected_warnings': ['JSON-LD'],
}, {
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': '8ff93eb073449f151d6b90c0ae1ef0c7',
'info_dict': {
'id': 'eg47EERs_JsZ',
'ext': 'mp4',
'title': 'Winter Is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241,
'upload_date': '20120529',
'uploader': 'AENE-NEW',
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.aetv.com/shows/duck-dynasty/video/inlawful-entry',
'only_matching': True
}, {
'url': 'http://www.fyi.tv/shows/tiny-house-nation/videos/207-sq-ft-minnesota-prairie-cottage',
'only_matching': True
}, {
'url': 'http://www.mylifetime.com/shows/project-runway-junior/video/season-1/episode-6/superstar-clients',
'only_matching': True
}]
def _real_extract(self, url):
page_type, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id)
video_url_re = [
r'data-href="[^"]*/%s"[^>]+data-release-url="([^"]+)"' % video_id,
r"media_url\s*=\s*'([^']+)'"
]
video_url = unescapeHTML(self._search_regex(video_url_re, webpage, 'video url'))
query = {'mbr': 'true'}
if page_type == 'shows':
query['assetTypes'] = 'medium_video_s3'
if 'switch=hds' in video_url:
query['switch'] = 'hls'
info = self._search_json_ld(webpage, video_id, fatal=False)
info.update({
'_type': 'url_transparent',
'url': smuggle_url(
update_url_query(video_url, query),
{
'sig': {
'key': 'crazyjava',
'secret': 's3cr3t'},
'force_smil_url': True
}),
})
return info

View File

@@ -1,23 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class AftenpostenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?aftenposten\.no/webtv/(?:#!/)?video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.aftenposten.no/webtv/#!/video/21039/trailer-sweatshop-i-can-t-take-any-more',
'md5': 'fd828cd29774a729bf4d4425fe192972',
'info_dict': {
'id': '21039',
'ext': 'mov',
'title': 'TRAILER: "Sweatshop" - I can´t take any more',
'description': 'md5:21891f2b0dd7ec2f78d84a50e54f8238',
'timestamp': 1416927969,
'upload_date': '20141125',
}
}
def _real_extract(self, url):
return self.url_result('xstream:ap:%s' % self._match_id(url), 'Xstream')

View File

@@ -6,7 +6,7 @@ from ..utils import int_or_none
class AftonbladetIE(InfoExtractor):
_VALID_URL = r'http://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
_VALID_URL = r'https?://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
'info_dict': {

View File

@@ -4,7 +4,7 @@ from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'http://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
_VALID_URL = r'https?://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
_TEST = {
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
@@ -13,24 +13,18 @@ class AlJazeeraIE(InfoExtractor):
'ext': 'mp4',
'title': 'The Slum - Episode 1: Deliverance',
'description': 'As a birth attendant advocating for family planning, Remy is on the frontline of Tondo\'s battle with overcrowding.',
'uploader': 'Al Jazeera English',
'uploader_id': '665003303001',
'timestamp': 1411116829,
'upload_date': '20140919',
},
'add_ie': ['Brightcove'],
'add_ie': ['BrightcoveNew'],
'skip': 'Not accessible from Travis CI server',
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s'
def _real_extract(self, url):
program_name = self._match_id(url)
webpage = self._download_webpage(url, program_name)
brightcove_id = self._search_regex(
r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')
return {
'_type': 'url',
'url': (
'brightcove:'
'playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc'
'&%40videoPlayer={0}'.format(brightcove_id)
),
'ie_key': 'Brightcove',
}
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)

View File

@@ -8,6 +8,8 @@ from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
qualities,
unescapeHTML,
xpath_element,
)
@@ -31,7 +33,7 @@ class AllocineIE(InfoExtractor):
'id': '19540403',
'ext': 'mp4',
'title': 'Planes 2 Bande-annonce VF',
'description': 'md5:eeaffe7c2d634525e21159b93acf3b1e',
'description': 'Regardez la bande annonce du film Planes 2 (Planes 2 Bande-annonce VF). Planes 2, un film de Roberts Gannaway',
'thumbnail': 're:http://.*\.jpg',
},
}, {
@@ -41,7 +43,7 @@ class AllocineIE(InfoExtractor):
'id': '19544709',
'ext': 'mp4',
'title': 'Dragons 2 - Bande annonce finale VF',
'description': 'md5:71742e3a74b0d692c7fce0dd2017a4ac',
'description': 'md5:601d15393ac40f249648ef000720e7e3',
'thumbnail': 're:http://.*\.jpg',
},
}, {
@@ -59,14 +61,18 @@ class AllocineIE(InfoExtractor):
if typ == 'film':
video_id = self._search_regex(r'href="/video/player_gen_cmedia=([0-9]+).+"', webpage, 'video id')
else:
player = self._search_regex(r'data-player=\'([^\']+)\'>', webpage, 'data player')
player_data = json.loads(player)
video_id = compat_str(player_data['refMedia'])
player = self._search_regex(r'data-player=\'([^\']+)\'>', webpage, 'data player', default=None)
if player:
player_data = json.loads(player)
video_id = compat_str(player_data['refMedia'])
else:
model = self._search_regex(r'data-model="([^"]+)">', webpage, 'data model')
model_data = self._parse_json(unescapeHTML(model), display_id)
video_id = compat_str(model_data['id'])
xml = self._download_xml('http://www.allocine.fr/ws/AcVisiondataV4.ashx?media=%s' % video_id, display_id)
video = xml.find('.//AcVisionVideo').attrib
video = xpath_element(xml, './/AcVisionVideo').attrib
quality = qualities(['ld', 'md', 'hd'])
formats = []

View File

@@ -0,0 +1,83 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
)
class AMPIE(InfoExtractor):
# parse Akamai Adaptive Media Player feed
def _extract_feed_info(self, url):
item = self._download_json(
url, None, 'Downloading Akamai AMP feed',
'Unable to download Akamai AMP feed')['channel']['item']
video_id = item['guid']
def get_media_node(name, default=None):
media_name = 'media-%s' % name
media_group = item.get('media-group') or item
return media_group.get(media_name) or item.get(media_name) or item.get(name, default)
thumbnails = []
media_thumbnail = get_media_node('thumbnail')
if media_thumbnail:
if isinstance(media_thumbnail, dict):
media_thumbnail = [media_thumbnail]
for thumbnail_data in media_thumbnail:
thumbnail = thumbnail_data['@attributes']
thumbnails.append({
'url': self._proto_relative_url(thumbnail['url'], 'http:'),
'width': int_or_none(thumbnail.get('width')),
'height': int_or_none(thumbnail.get('height')),
})
subtitles = {}
media_subtitle = get_media_node('subTitle')
if media_subtitle:
if isinstance(media_subtitle, dict):
media_subtitle = [media_subtitle]
for subtitle_data in media_subtitle:
subtitle = subtitle_data['@attributes']
lang = subtitle.get('lang') or 'en'
subtitles[lang] = [{'url': subtitle['href']}]
formats = []
media_content = get_media_node('content')
if isinstance(media_content, dict):
media_content = [media_content]
for media_data in media_content:
media = media_data['@attributes']
media_type = media['type']
if media_type == 'video/f4m':
formats.extend(self._extract_f4m_formats(
media['url'] + '?hdcore=3.4.0&plugin=aasp-3.4.0.132.124',
video_id, f4m_id='hds', fatal=False))
elif media_type == 'application/x-mpegURL':
formats.extend(self._extract_m3u8_formats(
media['url'], video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
formats.append({
'format_id': media_data['media-category']['@attributes']['label'],
'url': media['url'],
'tbr': int_or_none(media.get('bitrate')),
'filesize': int_or_none(media.get('fileSize')),
})
self._sort_formats(formats)
timestamp = parse_iso8601(item.get('pubDate'), ' ') or parse_iso8601(item.get('dc-date'))
return {
'id': video_id,
'title': get_media_node('title'),
'description': get_media_node('description'),
'thumbnails': thumbnails,
'timestamp': timestamp,
'duration': int_or_none(media_content[0].get('@attributes', {}).get('duration')),
'subtitles': subtitles,
'formats': formats,
}

View File

@@ -0,0 +1,242 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urlparse,
compat_str,
)
from ..utils import (
determine_ext,
extract_attributes,
ExtractorError,
sanitized_Request,
urlencode_postdata,
)
class AnimeOnDemandIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?anime-on-demand\.de/anime/(?P<id>\d+)'
_LOGIN_URL = 'https://www.anime-on-demand.de/users/sign_in'
_APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
_NETRC_MACHINE = 'animeondemand'
_TESTS = [{
'url': 'https://www.anime-on-demand.de/anime/161',
'info_dict': {
'id': '161',
'title': 'Grimgar, Ashes and Illusions (OmU)',
'description': 'md5:6681ce3c07c7189d255ac6ab23812d31',
},
'playlist_mincount': 4,
}, {
# Film wording is used instead of Episode
'url': 'https://www.anime-on-demand.de/anime/39',
'only_matching': True,
}, {
# Episodes without titles
'url': 'https://www.anime-on-demand.de/anime/162',
'only_matching': True,
}, {
# ger/jap, Dub/OmU, account required
'url': 'https://www.anime-on-demand.de/anime/169',
'only_matching': True,
}]
def _login(self):
(username, password) = self._get_login_info()
if username is None:
return
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
if '>Our licensing terms allow the distribution of animes only to German-speaking countries of Europe' in login_page:
self.raise_geo_restricted(
'%s is only available in German-speaking countries of Europe' % self.IE_NAME)
login_form = self._form_hidden_inputs('new_user', login_page)
login_form.update({
'user[login]': username,
'user[password]': password,
})
post_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page,
'post url', default=self._LOGIN_URL, group='url')
if not post_url.startswith('http'):
post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)
request = sanitized_Request(
post_url, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
if all(p not in response for p in ('>Logout<', 'href="/users/sign_out"')):
error = self._search_regex(
r'<p class="alert alert-danger">(.+?)</p>',
response, 'error', default=None)
if error:
raise ExtractorError('Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_initialize(self):
self._login()
def _real_extract(self, url):
anime_id = self._match_id(url)
webpage = self._download_webpage(url, anime_id)
if 'data-playlist=' not in webpage:
self._download_webpage(
self._APPLY_HTML5_URL, anime_id,
'Activating HTML5 beta', 'Unable to apply HTML5 beta')
webpage = self._download_webpage(url, anime_id)
csrf_token = self._html_search_meta(
'csrf-token', webpage, 'csrf token', fatal=True)
anime_title = self._html_search_regex(
r'(?s)<h1[^>]+itemprop="name"[^>]*>(.+?)</h1>',
webpage, 'anime name')
anime_description = self._html_search_regex(
r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
webpage, 'anime description', default=None)
entries = []
for num, episode_html in enumerate(re.findall(
r'(?s)<h3[^>]+class="episodebox-title".+?>Episodeninhalt<', webpage), 1):
episodebox_title = self._search_regex(
(r'class="episodebox-title"[^>]+title=(["\'])(?P<title>.+?)\1',
r'class="episodebox-title"[^>]+>(?P<title>.+?)<'),
episode_html, 'episodebox title', default=None, group='title')
if not episodebox_title:
continue
episode_number = int(self._search_regex(
r'(?:Episode|Film)\s*(\d+)',
episodebox_title, 'episode number', default=num))
episode_title = self._search_regex(
r'(?:Episode|Film)\s*\d+\s*-\s*(.+)',
episodebox_title, 'episode title', default=None)
video_id = 'episode-%d' % episode_number
common_info = {
'id': video_id,
'series': anime_title,
'episode': episode_title,
'episode_number': episode_number,
}
formats = []
for input_ in re.findall(
r'<input[^>]+class=["\'].*?streamstarter_html5[^>]+>', episode_html):
attributes = extract_attributes(input_)
playlist_urls = []
for playlist_key in ('data-playlist', 'data-otherplaylist'):
playlist_url = attributes.get(playlist_key)
if isinstance(playlist_url, compat_str) and re.match(
r'/?[\da-zA-Z]+', playlist_url):
playlist_urls.append(attributes[playlist_key])
if not playlist_urls:
continue
lang = attributes.get('data-lang')
lang_note = attributes.get('value')
for playlist_url in playlist_urls:
kind = self._search_regex(
r'videomaterialurl/\d+/([^/]+)/',
playlist_url, 'media kind', default=None)
format_id_list = []
if lang:
format_id_list.append(lang)
if kind:
format_id_list.append(kind)
if not format_id_list:
format_id_list.append(compat_str(num))
format_id = '-'.join(format_id_list)
format_note = ', '.join(filter(None, (kind, lang_note)))
request = sanitized_Request(
compat_urlparse.urljoin(url, playlist_url),
headers={
'X-Requested-With': 'XMLHttpRequest',
'X-CSRF-Token': csrf_token,
'Referer': url,
'Accept': 'application/json, text/javascript, */*; q=0.01',
})
playlist = self._download_json(
request, video_id, 'Downloading %s playlist JSON' % format_id,
fatal=False)
if not playlist:
continue
start_video = playlist.get('startvideo', 0)
playlist = playlist.get('playlist')
if not playlist or not isinstance(playlist, list):
continue
playlist = playlist[start_video]
title = playlist.get('title')
if not title:
continue
description = playlist.get('description')
for source in playlist.get('sources', []):
file_ = source.get('file')
if not file_:
continue
ext = determine_ext(file_)
format_id_list = [lang, kind]
if ext == 'm3u8':
format_id_list.append('hls')
elif source.get('type') == 'video/dash' or ext == 'mpd':
format_id_list.append('dash')
format_id = '-'.join(filter(None, format_id_list))
if ext == 'm3u8':
file_formats = self._extract_m3u8_formats(
file_, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id, fatal=False)
elif source.get('type') == 'video/dash' or ext == 'mpd':
continue
file_formats = self._extract_mpd_formats(
file_, video_id, mpd_id=format_id, fatal=False)
else:
continue
for f in file_formats:
f.update({
'language': lang,
'format_note': format_note,
})
formats.extend(file_formats)
if formats:
self._sort_formats(formats)
f = common_info.copy()
f.update({
'title': title,
'description': description,
'formats': formats,
})
entries.append(f)
# Extract teaser only when full episode is not available
if not formats:
m = re.search(
r'data-dialog-header=(["\'])(?P<title>.+?)\1[^>]+href=(["\'])(?P<href>.+?)\3[^>]*>Teaser<',
episode_html)
if m:
f = common_info.copy()
f.update({
'id': '%s-teaser' % f['id'],
'title': m.group('title'),
'url': compat_urlparse.urljoin(url, m.group('href')),
})
entries.append(f)
return self.playlist_result(entries, anime_id, anime_title, anime_description)

View File

@@ -1,11 +1,9 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .nuevo import NuevoBaseIE
class AnitubeIE(InfoExtractor):
class AnitubeIE(NuevoBaseIE):
IE_NAME = 'anitube.se'
_VALID_URL = r'https?://(?:www\.)?anitube\.se/video/(?P<id>\d+)'
@@ -22,38 +20,11 @@ class AnitubeIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
key = self._html_search_regex(
r'http://www\.anitube\.se/embed/([A-Za-z0-9_-]*)', webpage, 'key')
key = self._search_regex(
r'src=["\']https?://[^/]+/embed/([A-Za-z0-9_-]+)', webpage, 'key')
config_xml = self._download_xml(
'http://www.anitube.se/nuevo/econfig.php?key=%s' % key, key)
video_title = config_xml.find('title').text
thumbnail = config_xml.find('image').text
duration = float(config_xml.find('duration').text)
formats = []
video_url = config_xml.find('file')
if video_url is not None:
formats.append({
'format_id': 'sd',
'url': video_url.text,
})
video_url = config_xml.find('filehd')
if video_url is not None:
formats.append({
'format_id': 'hd',
'url': video_url.text,
})
return {
'id': video_id,
'title': video_title,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats
}
return self._extract_nuevo(
'http://www.anitube.se/nuevo/econfig.php?key=%s' % key, video_id)

View File

@@ -1,24 +1,18 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
)
class AolIE(InfoExtractor):
IE_NAME = 'on.aol.com'
_VALID_URL = r'''(?x)
(?:
aol-video:|
http://on\.aol\.com/
(?:
video/.*-|
playlist/(?P<playlist_display_id>[^/?#]+?)-(?P<playlist_id>[0-9]+)[?#].*_videoid=
)
)
(?P<id>[0-9]+)
(?:$|\?)
'''
_VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/video/.*-)(?P<id>[^/?-]+)'
_TESTS = [{
'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
@@ -27,44 +21,99 @@ class AolIE(InfoExtractor):
'id': '518167793',
'ext': 'mp4',
'title': 'U.S. Official Warns Of \'Largest Ever\' IRS Phone Scam',
'description': 'A major phone scam has cost thousands of taxpayers more than $1 million, with less than a month until income tax returns are due to the IRS.',
'timestamp': 1395405060,
'upload_date': '20140321',
'uploader': 'Newsy Studio',
},
'add_ie': ['FiveMin'],
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://on.aol.com/playlist/brace-yourself---todays-weirdest-news-152147?icid=OnHomepageC4_Omg_Img#_videoid=518184316',
'url': 'http://on.aol.com/video/netflix-is-raising-rates-5707d6b8e4b090497b04f706?context=PC:homepage:PL1944:1460189336183',
'info_dict': {
'id': '152147',
'title': 'Brace Yourself - Today\'s Weirdest News',
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
'title': 'Netflix is Raising Rates',
'description': 'Netflix is rewarding millions of its long-standing members with an increase in cost. Veuers Carly Figueroa has more.',
'upload_date': '20160408',
'timestamp': 1460123280,
'uploader': 'Veuer',
},
'playlist_mincount': 10,
'params': {
# m3u8 download
'skip_download': True,
}
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
playlist_id = mobj.group('playlist_id')
if not playlist_id or self._downloader.params.get('noplaylist'):
return self.url_result('5min:%s' % video_id)
video_id = self._match_id(url)
self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
response = self._download_json(
'https://feedapi.b2c.on.aol.com/v1.0/app/videos/aolon/%s/details' % video_id,
video_id)['response']
if response['statusText'] != 'Ok':
raise ExtractorError('%s said: %s' % (self.IE_NAME, response['statusText']), expected=True)
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(
r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
playlist_html = self._search_regex(
r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
'playlist HTML')
entries = [{
'_type': 'url',
'url': 'aol-video:%s' % m.group('id'),
'ie_key': 'Aol',
} for m in re.finditer(
r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
playlist_html)]
video_data = response['data']
formats = []
m3u8_url = video_data.get('videoMasterPlaylist')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
for rendition in video_data.get('renditions', []):
video_url = rendition.get('url')
if not video_url:
continue
ext = rendition.get('format')
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
f = {
'url': video_url,
'format_id': rendition.get('quality'),
}
mobj = re.search(r'(\d+)x(\d+)', video_url)
if mobj:
f.update({
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))
return {
'_type': 'playlist',
'id': playlist_id,
'display_id': mobj.group('playlist_display_id'),
'title': title,
'entries': entries,
'id': video_id,
'title': video_data['title'],
'duration': int_or_none(video_data.get('duration')),
'timestamp': int_or_none(video_data.get('publishDate')),
'view_count': int_or_none(video_data.get('views')),
'description': video_data.get('description'),
'uploader': video_data.get('videoOwner'),
'formats': formats,
}
class AolFeaturesIE(InfoExtractor):
IE_NAME = 'features.aol.com'
_VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
'md5': '7db483bb0c09c85e241f84a34238cc75',
'info_dict': {
'id': '519507715',
'ext': 'mp4',
'title': 'What To Watch - February 17, 2016',
},
'add_ie': ['FiveMin'],
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
return self.url_result(self._search_regex(
r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
webpage, '5min embed url'), 'FiveMin')

View File

@@ -11,7 +11,8 @@ from ..utils import (
class AppleTrailersIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
IE_NAME = 'appletrailers'
_VALID_URL = r'https?://(?:www\.|movie)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
_TESTS = [{
'url': 'http://trailers.apple.com/trailers/wb/manofsteel/',
'info_dict': {
@@ -63,9 +64,18 @@ class AppleTrailersIE(InfoExtractor):
},
},
]
}, {
'url': 'http://trailers.apple.com/trailers/magnolia/blackthorn/',
'info_dict': {
'id': 'blackthorn',
},
'playlist_mincount': 2,
}, {
'url': 'http://trailers.apple.com/ca/metropole/autrui/',
'only_matching': True,
}, {
'url': 'http://movietrailers.apple.com/trailers/focus_features/kuboandthetwostrings/',
'only_matching': True,
}]
_JSON_RE = r'iTunes.playURL\((.*?)\);'
@@ -79,7 +89,7 @@ class AppleTrailersIE(InfoExtractor):
def fix_html(s):
s = re.sub(r'(?s)<script[^<]*?>.*?</script>', '', s)
s = re.sub(r'<img ([^<]*?)>', r'<img \1/>', s)
s = re.sub(r'<img ([^<]*?)/?>', r'<img \1/>', s)
# The ' in the onClick attributes are not escaped, it couldn't be parsed
# like: http://trailers.apple.com/trailers/wb/gravity/
@@ -96,6 +106,9 @@ class AppleTrailersIE(InfoExtractor):
trailer_info_json = self._search_regex(self._JSON_RE,
on_click, 'trailer info')
trailer_info = json.loads(trailer_info_json)
first_url = trailer_info.get('url')
if not first_url:
continue
title = trailer_info['title']
video_id = movie + '-' + re.sub(r'[^a-zA-Z0-9]', '', title).lower()
thumbnail = li.find('.//img').attrib['src']
@@ -107,7 +120,6 @@ class AppleTrailersIE(InfoExtractor):
if m:
duration = 60 * int(m.group('minutes')) + int(m.group('seconds'))
first_url = trailer_info['url']
trailer_id = first_url.split('/')[-1].rpartition('_')[0].lower()
settings_json_url = compat_urlparse.urljoin(url, 'includes/settings/%s.json' % trailer_id)
settings = self._download_json(settings_json_url, trailer_id, 'Downloading settings json')
@@ -144,3 +156,76 @@ class AppleTrailersIE(InfoExtractor):
'id': movie,
'entries': playlist,
}
class AppleTrailersSectionIE(InfoExtractor):
IE_NAME = 'appletrailers:section'
_SECTIONS = {
'justadded': {
'feed_path': 'just_added',
'title': 'Just Added',
},
'exclusive': {
'feed_path': 'exclusive',
'title': 'Exclusive',
},
'justhd': {
'feed_path': 'just_hd',
'title': 'Just HD',
},
'mostpopular': {
'feed_path': 'most_pop',
'title': 'Most Popular',
},
'moviestudios': {
'feed_path': 'studios',
'title': 'Movie Studios',
},
}
_VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/#section=(?P<id>%s)' % '|'.join(_SECTIONS)
_TESTS = [{
'url': 'http://trailers.apple.com/#section=justadded',
'info_dict': {
'title': 'Just Added',
'id': 'justadded',
},
'playlist_mincount': 80,
}, {
'url': 'http://trailers.apple.com/#section=exclusive',
'info_dict': {
'title': 'Exclusive',
'id': 'exclusive',
},
'playlist_mincount': 80,
}, {
'url': 'http://trailers.apple.com/#section=justhd',
'info_dict': {
'title': 'Just HD',
'id': 'justhd',
},
'playlist_mincount': 80,
}, {
'url': 'http://trailers.apple.com/#section=mostpopular',
'info_dict': {
'title': 'Most Popular',
'id': 'mostpopular',
},
'playlist_mincount': 80,
}, {
'url': 'http://trailers.apple.com/#section=moviestudios',
'info_dict': {
'title': 'Movie Studios',
'id': 'moviestudios',
},
'playlist_mincount': 80,
}]
def _real_extract(self, url):
section = self._match_id(url)
section_data = self._download_json(
'http://trailers.apple.com/trailers/home/feeds/%s.json' % self._SECTIONS[section]['feed_path'],
section)
entries = [
self.url_result('http://trailers.apple.com' + e['location'])
for e in section_data]
return self.playlist_result(entries, section, self._SECTIONS[section]['title'])

View File

@@ -14,8 +14,8 @@ from ..utils import (
parse_duration,
unified_strdate,
xpath_text,
parse_xml,
)
from ..compat import compat_etree_fromstring
class ARDMediathekIE(InfoExtractor):
@@ -83,7 +83,7 @@ class ARDMediathekIE(InfoExtractor):
subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url:
subtitles['de'] = [{
'ext': 'srt',
'ext': 'ttml',
'url': subtitle_url,
}]
@@ -110,13 +110,15 @@ class ARDMediathekIE(InfoExtractor):
server = stream.get('_server')
for stream_url in stream_urls:
ext = determine_ext(stream_url)
if quality != 'auto' and ext in ('f4m', 'm3u8'):
continue
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
stream_url + '?hdcore=3.1.1&plugin=aasp-3.1.1.69.124',
video_id, preference=-1, f4m_id='hds'))
video_id, preference=-1, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', preference=1, m3u8_id='hls'))
stream_url, video_id, 'mp4', preference=1, m3u8_id='hls', fatal=False))
else:
if server and server.startswith('rtmp'):
f = {
@@ -161,7 +163,7 @@ class ARDMediathekIE(InfoExtractor):
raise ExtractorError('This program is only suitable for those aged 12 and older. Video %s is therefore only available between 20 pm and 6 am.' % video_id, expected=True)
if re.search(r'[\?&]rss($|[=&])', url):
doc = parse_xml(webpage)
doc = compat_etree_fromstring(webpage.encode('utf-8'))
if doc.tag == 'rss':
return GenericIE()._extract_rss(url, video_id, doc)

View File

@@ -13,6 +13,7 @@ from ..utils import (
unified_strdate,
get_element_by_attribute,
int_or_none,
NO_DEFAULT,
qualities,
)
@@ -22,7 +23,7 @@ from ..utils import (
class ArteTvIE(InfoExtractor):
_VALID_URL = r'http://videos\.arte\.tv/(?P<lang>fr|de)/.*-(?P<id>.*?)\.html'
_VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
IE_NAME = 'arte.tv'
def _real_extract(self, url):
@@ -62,15 +63,19 @@ class ArteTvIE(InfoExtractor):
class ArteTVPlus7IE(InfoExtractor):
IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de)/(?:(?:sendungen|emissions)/)?(?P<id>.*?)/(?P<name>.*?)(\?.*)?'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/(?:(?:sendungen|emissions|embed)/)?(?P<id>[^/]+)/(?P<name>[^/?#&+])'
@classmethod
def _extract_url_info(cls, url):
mobj = re.match(cls._VALID_URL, url)
lang = mobj.group('lang')
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
query = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
if 'vid' in query:
video_id = query['vid'][0]
else:
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
return video_id, lang
def _real_extract(self, url):
@@ -79,26 +84,63 @@ class ArteTVPlus7IE(InfoExtractor):
return self._extract_from_webpage(webpage, video_id, lang)
def _extract_from_webpage(self, webpage, video_id, lang):
patterns_templates = (r'arte_vp_url=["\'](.*?%s.*?)["\']', r'data-url=["\']([^"]+%s[^"]+)["\']')
ids = (video_id, '')
# some pages contain multiple videos (like
# http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D),
# so we first try to look for json URLs that contain the video id from
# the 'vid' parameter.
patterns = [t % re.escape(_id) for _id in ids for t in patterns_templates]
json_url = self._html_search_regex(
[r'arte_vp_url=["\'](.*?)["\']', r'data-url=["\']([^"]+)["\']'],
webpage, 'json vp url', default=None)
patterns, webpage, 'json vp url', default=None)
if not json_url:
iframe_url = self._html_search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+\bjson_url=.+?)\1',
webpage, 'iframe url', group='url')
json_url = compat_parse_qs(
compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
return self._extract_from_json_url(json_url, video_id, lang)
def find_iframe_url(webpage, default=NO_DEFAULT):
return self._html_search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+\bjson_url=.+?)\1',
webpage, 'iframe url', group='url', default=default)
def _extract_from_json_url(self, json_url, video_id, lang):
iframe_url = find_iframe_url(webpage, None)
if not iframe_url:
embed_url = self._html_search_regex(
r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url', default=None)
if embed_url:
player = self._download_json(
embed_url, video_id, 'Downloading player page')
iframe_url = find_iframe_url(player['html'])
# en and es URLs produce react-based pages with different layout (e.g.
# http://www.arte.tv/guide/en/053330-002-A/carnival-italy?zone=world)
if not iframe_url:
program = self._search_regex(
r'program\s*:\s*({.+?["\']embed_html["\'].+?}),?\s*\n',
webpage, 'program', default=None)
if program:
embed_html = self._parse_json(program, video_id)
if embed_html:
iframe_url = find_iframe_url(embed_html['embed_html'])
if iframe_url:
json_url = compat_parse_qs(
compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
if json_url:
title = self._search_regex(
r'<h3[^>]+title=(["\'])(?P<title>.+?)\1',
webpage, 'title', default=None, group='title')
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
embed_url = self._search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1',
webpage, 'embed url', group='url')
return self.url_result(embed_url)
def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer']
upload_date_str = player_info.get('shootingDate')
if not upload_date_str:
upload_date_str = player_info.get('VDA', '').split(' ')[0]
upload_date_str = (player_info.get('VRA') or player_info.get('VDA') or '').split(' ')[0]
title = player_info['VTI'].strip()
title = (player_info.get('VTI') or title or player_info['VID']).strip()
subtitle = player_info.get('VSU', '').strip()
if subtitle:
title += ' - %s' % subtitle
@@ -112,27 +154,30 @@ class ArteTVPlus7IE(InfoExtractor):
}
qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])
LANGS = {
'fr': 'F',
'de': 'A',
'en': 'E[ANG]',
'es': 'E[ESP]',
}
formats = []
for format_id, format_dict in player_info['VSR'].items():
f = dict(format_dict)
versionCode = f.get('versionCode')
langcode = {
'fr': 'F',
'de': 'A',
}.get(lang, lang)
lang_rexs = [r'VO?%s' % langcode, r'VO?.-ST%s' % langcode]
lang_pref = (
None if versionCode is None else (
10 if any(re.match(r, versionCode) for r in lang_rexs)
else -10))
langcode = LANGS.get(lang, lang)
lang_rexs = [r'VO?%s-' % re.escape(langcode), r'VO?.-ST%s$' % re.escape(langcode)]
lang_pref = None
if versionCode:
matched_lang_rexs = [r for r in lang_rexs if re.match(r, versionCode)]
lang_pref = -10 if not matched_lang_rexs else 10 * len(matched_lang_rexs)
source_pref = 0
if versionCode is not None:
# The original version with subtitles has lower relevance
if re.match(r'VO-ST(F|A)', versionCode):
if re.match(r'VO-ST(F|A|E)', versionCode):
source_pref -= 10
# The version with sourds/mal subtitles has also lower relevance
elif re.match(r'VO?(F|A)-STM\1', versionCode):
elif re.match(r'VO?(F|A|E)-STM\1', versionCode):
source_pref -= 9
format = {
'format_id': format_id,
@@ -165,7 +210,7 @@ class ArteTVPlus7IE(InfoExtractor):
# It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de)/(?:magazine?/)?(?P<id>[^?#]+)'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:magazine?/)?(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
@@ -189,30 +234,25 @@ class ArteTVCreativeIE(ArteTVPlus7IE):
class ArteTVFutureIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:future'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de)/(thema|sujet)/.*?#article-anchor-(?P<id>\d+)'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://future.arte.tv/fr/sujet/info-sciences#article-anchor-7081',
_TESTS = [{
'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses',
'info_dict': {
'id': '5201',
'id': '050940-028-A',
'ext': 'mp4',
'title': 'Les champignons au secours de la planète',
'upload_date': '20131101',
'title': 'Les écrevisses aussi peuvent être anxieuses',
'upload_date': '20140902',
},
}
def _real_extract(self, url):
anchor_id, lang = self._extract_url_info(url)
webpage = self._download_webpage(url, anchor_id)
row = self._search_regex(
r'(?s)id="%s"[^>]*>.+?(<div[^>]*arte_vp_url[^>]*>)' % anchor_id,
webpage, 'row')
return self._extract_from_webpage(row, anchor_id, lang)
}, {
'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable',
'only_matching': True,
}]
class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>.+)'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
@@ -230,7 +270,7 @@ class ArteTVDDCIE(ArteTVPlus7IE):
class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>de|fr)/(?P<id>.+)'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
@@ -245,11 +285,59 @@ class ArteTVConcertIE(ArteTVPlus7IE):
}
class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TEST = {
'url': 'http://cinema.arte.tv/de/node/38291',
'md5': '6b275511a5107c60bacbeeda368c3aa1',
'info_dict': {
'id': '055876-000_PWA12025-D',
'ext': 'mp4',
'title': 'Tod auf dem Nil',
'upload_date': '20160122',
'description': 'md5:7f749bbb77d800ef2be11d54529b96bc',
},
}
class ArteTVMagazineIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:magazine'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/magazine/[^/]+/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
# Embedded via <iframe src="http://www.arte.tv/arte_vp/index.php?json_url=..."
'url': 'http://www.arte.tv/magazine/trepalium/fr/entretien-avec-le-realisateur-vincent-lannoo-trepalium',
'md5': '2a9369bcccf847d1c741e51416299f25',
'info_dict': {
'id': '065965-000-A',
'ext': 'mp4',
'title': 'Trepalium - Extrait Ep.01',
'upload_date': '20160121',
},
}, {
# Embedded via <iframe src="http://www.arte.tv/guide/fr/embed/054813-004-A/medium"
'url': 'http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium',
'md5': 'fedc64fc7a946110fe311634e79782ca',
'info_dict': {
'id': '054813-004_PLUS7-F',
'ext': 'mp4',
'title': 'Trepalium (4/6)',
'description': 'md5:10057003c34d54e95350be4f9b05cb40',
'upload_date': '20160218',
},
}, {
'url': 'http://www.arte.tv/magazine/metropolis/de/frank-woeste-german-paris-metropolis',
'only_matching': True,
}]
class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x)
http://www\.arte\.tv
/playerv2/embed\.php\?json_url=
/(?:playerv2/embed|arte_vp/index)\.php\?json_url=
(?P<json_url>
http://arte\.tv/papi/tvguide/videos/stream/player/
(?P<lang>[^/]+)/(?P<id>[^/]+)[^&]*

View File

@@ -2,18 +2,18 @@ from __future__ import unicode_literals
import time
import hmac
import hashlib
import re
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urllib_parse,
compat_urllib_request,
)
from ..compat import compat_str
from ..utils import (
int_or_none,
float_or_none,
xpath_text,
ExtractorError,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
xpath_text,
)
@@ -32,6 +32,19 @@ class AtresPlayerIE(InfoExtractor):
'duration': 5527.6,
'thumbnail': 're:^https?://.*\.jpg$',
},
'skip': 'This video is only available for registered users'
},
{
'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
'md5': '0d0e918533bbd4b263f2de4d197d4aac',
'info_dict': {
'id': 'capitulo-112-david-bustamante',
'ext': 'flv',
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': 're:^https?://.*\.jpg$',
},
},
{
'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
@@ -50,6 +63,13 @@ class AtresPlayerIE(InfoExtractor):
_LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'
_ERRORS = {
'UNPUBLISHED': 'We\'re sorry, but this video is not yet available.',
'DELETED': 'This video has expired and is no longer available for online streaming.',
'GEOUNPUBLISHED': 'We\'re sorry, but this video is not available in your region due to right restrictions.',
# 'PREMIUM': 'PREMIUM',
}
def _real_initialize(self):
self._login()
@@ -63,8 +83,8 @@ class AtresPlayerIE(InfoExtractor):
'j_password': password,
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
request = sanitized_Request(
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Content-Type', 'application/x-www-form-urlencoded')
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
@@ -83,58 +103,72 @@ class AtresPlayerIE(InfoExtractor):
episode_id = self._search_regex(
r'episode="([^"]+)"', webpage, 'episode id')
request = sanitized_Request(
self._PLAYER_URL_TEMPLATE % episode_id,
headers={'User-Agent': self._USER_AGENT})
player = self._download_json(request, episode_id, 'Downloading player JSON')
episode_type = player.get('typeOfEpisode')
error_message = self._ERRORS.get(episode_type)
if error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
formats = []
video_url = player.get('urlVideo')
if video_url:
format_info = {
'url': video_url,
'format_id': 'http',
}
mobj = re.search(r'(?P<bitrate>\d+)K_(?P<width>\d+)x(?P<height>\d+)', video_url)
if mobj:
format_info.update({
'width': int_or_none(mobj.group('width')),
'height': int_or_none(mobj.group('height')),
'tbr': int_or_none(mobj.group('bitrate')),
})
formats.append(format_info)
timestamp = int_or_none(self._download_webpage(
self._TIME_API_URL,
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
token = hmac.new(
self._MAGIC.encode('ascii'),
(episode_id + timestamp_shifted).encode('utf-8')
(episode_id + timestamp_shifted).encode('utf-8'), hashlib.md5
).hexdigest()
formats = []
for fmt in ['windows', 'android_tablet']:
request = compat_urllib_request.Request(
self._URL_VIDEO_TEMPLATE.format(fmt, episode_id, timestamp_shifted, token))
request.add_header('User-Agent', self._USER_AGENT)
request = sanitized_Request(
self._URL_VIDEO_TEMPLATE.format('windows', episode_id, timestamp_shifted, token),
headers={'User-Agent': self._USER_AGENT})
fmt_json = self._download_json(
request, video_id, 'Downloading %s video JSON' % fmt)
fmt_json = self._download_json(
request, video_id, 'Downloading windows video JSON')
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
for format_id, video_url in fmt_json['resultObject'].items():
if format_id == 'token' or not video_url.startswith('http'):
continue
if video_url.endswith('/Manifest'):
if 'geodeswowsmpra3player' in video_url:
f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
# this videos are protected by DRM, the f4m downloader doesn't support them
continue
else:
f4m_url = video_url[:-9] + '/manifest.f4m'
formats.extend(self._extract_f4m_formats(f4m_url, video_id))
else:
formats.append({
'url': video_url,
'format_id': 'android-%s' % format_id,
'preference': 1,
})
for format_id, video_url in fmt_json['resultObject'].items():
if format_id == 'token' or not video_url.startswith('http'):
continue
if 'geodeswowsmpra3player' in video_url:
f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
# this videos are protected by DRM, the f4m downloader doesn't support them
continue
else:
f4m_url = video_url[:-9] + '/manifest.f4m'
formats.extend(self._extract_f4m_formats(f4m_url, video_id, f4m_id='hds', fatal=False))
self._sort_formats(formats)
player = self._download_json(
self._PLAYER_URL_TEMPLATE % episode_id,
episode_id)
path_data = player.get('pathData')
episode = self._download_xml(
self._EPISODE_URL_TEMPLATE % path_data,
video_id, 'Downloading episode XML')
self._EPISODE_URL_TEMPLATE % path_data, video_id,
'Downloading episode XML')
duration = float_or_none(xpath_text(
episode, './media/asset/info/technical/contentDuration', 'duration'))

View File

@@ -0,0 +1,89 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
sanitized_Request,
)
class AudiMediaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?P<id>[^/?#]+)'
_TEST = {
'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467',
'md5': '79a8b71c46d49042609795ab59779b66',
'info_dict': {
'id': '1565',
'ext': 'mp4',
'title': '60 Seconds of Audi Sport 104/2015 - WEC Bahrain, Rookie Test',
'description': 'md5:60e5d30a78ced725f7b8d34370762941',
'upload_date': '20151124',
'timestamp': 1448354940,
'duration': 74022,
'view_count': int,
}
}
# extracted from https://audimedia.tv/assets/embed/embedded-player.js (dataSourceAuthToken)
_AUTH_TOKEN = 'e25b42847dba18c6c8816d5d8ce94c326e06823ebf0859ed164b3ba169be97f2'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
raw_payload = self._search_regex([
r'class="amtv-embed"[^>]+id="([^"]+)"',
r'class=\\"amtv-embed\\"[^>]+id=\\"([^"]+)\\"',
], webpage, 'raw payload')
_, stage_mode, video_id, lang = raw_payload.split('-')
# TODO: handle s and e stage_mode (live streams and ended live streams)
if stage_mode not in ('s', 'e'):
request = sanitized_Request(
'https://audimedia.tv/api/video/v1/videos/%s?embed[]=video_versions&embed[]=thumbnail_image&where[content_language_iso]=%s' % (video_id, lang),
headers={'X-Auth-Token': self._AUTH_TOKEN})
json_data = self._download_json(request, video_id)['results']
formats = []
stream_url_hls = json_data.get('stream_url_hls')
if stream_url_hls:
formats.extend(self._extract_m3u8_formats(
stream_url_hls, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False))
stream_url_hds = json_data.get('stream_url_hds')
if stream_url_hds:
formats.extend(self._extract_f4m_formats(
stream_url_hds + '?hdcore=3.4.0',
video_id, f4m_id='hds', fatal=False))
for video_version in json_data.get('video_versions'):
video_version_url = video_version.get('download_url') or video_version.get('stream_url')
if not video_version_url:
continue
f = {
'url': video_version_url,
'width': int_or_none(video_version.get('width')),
'height': int_or_none(video_version.get('height')),
'abr': int_or_none(video_version.get('audio_bitrate')),
'vbr': int_or_none(video_version.get('video_bitrate')),
}
bitrate = self._search_regex(r'(\d+)k', video_version_url, 'bitrate', default=None)
if bitrate:
f.update({
'format_id': 'http-%s' % bitrate,
})
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': json_data['title'],
'description': json_data.get('subtitle'),
'thumbnail': json_data.get('thumbnail_image', {}).get('file'),
'timestamp': parse_iso8601(json_data.get('publication_date')),
'duration': int_or_none(json_data.get('duration')),
'view_count': int_or_none(json_data.get('view_count')),
'formats': formats,
}

View File

@@ -0,0 +1,66 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import float_or_none
class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/boos/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
'info_dict': {
'id': '4279833',
'ext': 'mp3',
'title': '3/09/2016 Czaban Hour 3',
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72,
'uploader': 'Steve Czaban',
'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
clip = None
clip_store = self._parse_json(
self._search_regex(
r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id,
webpage, 'clip store', default='{}', group='json'),
video_id, fatal=False)
if clip_store:
clips = clip_store.get('clips')
if clips and isinstance(clips, list) and isinstance(clips[0], dict):
clip = clips[0]
def from_clip(field):
if clip:
clip.get(field)
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url')
title = from_clip('title') or self._og_search_title(webpage)
description = from_clip('description') or self._og_search_description(webpage)
duration = float_or_none(from_clip('duration') or self._html_search_meta(
'weibo:audio:duration', webpage))
uploader = from_clip('author') or self._og_search_property(
'audio:artist', webpage, 'uploader', fatal=False)
uploader_url = from_clip('author_url') or self._html_search_meta(
'audioboo:channel', webpage, 'uploader url')
return {
'id': video_id,
'url': audio_url,
'title': title,
'description': description,
'duration': duration,
'uploader': uploader,
'uploader_url': uploader_url,
}

View File

@@ -56,7 +56,7 @@ class AudiomackIE(InfoExtractor):
# API is inconsistent with errors
if 'url' not in api_response or not api_response['url'] or 'error' in api_response:
raise ExtractorError('Invalid url %s', url)
raise ExtractorError('Invalid url %s' % url)
# Audiomack wraps a lot of soundcloud tracks in their branded wrapper
# if so, pass the work off to the soundcloud extractor

View File

@@ -3,7 +3,11 @@ from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import float_or_none
from ..utils import (
ExtractorError,
float_or_none,
sanitized_Request,
)
class AzubuIE(InfoExtractor):
@@ -91,3 +95,38 @@ class AzubuIE(InfoExtractor):
'view_count': view_count,
'formats': formats,
}
class AzubuLiveIE(InfoExtractor):
_VALID_URL = r'https?://www.azubu.tv/(?P<id>[^/]+)$'
_TEST = {
'url': 'http://www.azubu.tv/MarsTVMDLen',
'only_matching': True,
}
def _real_extract(self, url):
user = self._match_id(url)
info = self._download_json(
'http://api.azubu.tv/public/modules/last-video/{0}/info'.format(user),
user)['data']
if info['type'] != 'STREAM':
raise ExtractorError('{0} is not streaming live'.format(user), expected=True)
req = sanitized_Request(
'https://edge-elb.api.brightcove.com/playback/v1/accounts/3361910549001/videos/ref:' + info['reference_id'])
req.add_header('Accept', 'application/json;pk=BCpkADawqM1gvI0oGWg8dxQHlgT8HkdE2LnAlWAZkOlznO39bSZX726u4JqnDsK3MDXcO01JxXK2tZtJbgQChxgaFzEVdHRjaDoxaOu8hHOO8NYhwdxw9BzvgkvLUlpbDNUuDoc4E4wxDToV')
bc_info = self._download_json(req, user)
m3u8_url = next(source['src'] for source in bc_info['sources'] if source['container'] == 'M2TS')
formats = self._extract_m3u8_formats(m3u8_url, user, ext='mp4')
self._sort_formats(formats)
return {
'id': info['id'],
'title': self._live_title(info['title']),
'uploader_id': user,
'formats': formats,
'is_live': True,
'thumbnail': bc_info['poster'],
}

View File

@@ -4,18 +4,18 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import unescapeHTML
class BaiduVideoIE(InfoExtractor):
IE_DESC = '百度视频'
_VALID_URL = r'http://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
_VALID_URL = r'https?://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
_TESTS = [{
'url': 'http://v.baidu.com/comic/1069.htm?frp=bdbrand&q=%E4%B8%AD%E5%8D%8E%E5%B0%8F%E5%BD%93%E5%AE%B6',
'info_dict': {
'id': '1069',
'title': '中华小当家 TV版 (全52集)',
'description': 'md5:395a419e41215e531c857bb037bbaf80',
'title': '中华小当家 TV版国语',
'description': 'md5:51be07afe461cf99fa61231421b5397c',
},
'playlist_count': 52,
}, {
@@ -25,45 +25,32 @@ class BaiduVideoIE(InfoExtractor):
'title': 're:^奔跑吧兄弟',
'description': 'md5:1bf88bad6d850930f542d51547c089b8',
},
'playlist_mincount': 3,
'playlist_mincount': 12,
}]
def _call_api(self, path, category, playlist_id, note):
return self._download_json('http://app.video.baidu.com/%s/?worktype=adnative%s&id=%s' % (
path, category, playlist_id), playlist_id, note)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
category = category2 = mobj.group('type')
category, playlist_id = re.match(self._VALID_URL, url).groups()
if category == 'show':
category2 = 'tvshow'
category = 'tvshow'
if category == 'tv':
category = 'tvplay'
webpage = self._download_webpage(url, playlist_id)
playlist_detail = self._call_api(
'xqinfo', category, playlist_id, 'Download playlist JSON metadata')
playlist_title = self._html_search_regex(
r'title\s*:\s*(["\'])(?P<title>[^\']+)\1', webpage,
'playlist title', group='title')
playlist_description = self._html_search_regex(
r'<input[^>]+class="j-data-intro"[^>]+value="([^"]+)"/>', webpage,
playlist_id, 'playlist description')
playlist_title = playlist_detail['title']
playlist_description = unescapeHTML(playlist_detail.get('intro'))
site = self._html_search_regex(
r'filterSite\s*:\s*["\']([^"]*)["\']', webpage,
'primary provider site')
api_result = self._download_json(
'http://v.baidu.com/%s_intro/?dtype=%sPlayUrl&id=%s&site=%s' % (
category, category2, playlist_id, site),
playlist_id, 'Get playlist links')
episodes_detail = self._call_api(
'xqsingle', category, playlist_id, 'Download episodes JSON metadata')
entries = []
for episode in api_result[0]['episodes']:
episode_id = '%s_%s' % (playlist_id, episode['episode'])
redirect_page = self._download_webpage(
compat_urlparse.urljoin(url, episode['url']), episode_id,
note='Download Baidu redirect page')
real_url = self._html_search_regex(
r'location\.replace\("([^"]+)"\)', redirect_page, 'real URL')
entries.append(self.url_result(
real_url, video_title=episode['single_title']))
entries = [self.url_result(
episode['url'], video_title=episode['title']
) for episode in episodes_detail['videos']]
return self.playlist_result(
entries, playlist_id, playlist_title, playlist_description)

View File

@@ -4,15 +4,13 @@ import re
import itertools
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
compat_str,
)
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
)
@@ -57,8 +55,8 @@ class BambuserIE(InfoExtractor):
'pass': password,
}
request = compat_urllib_request.Request(
self._LOGIN_URL, compat_urllib_parse.urlencode(login_form).encode('utf-8'))
request = sanitized_Request(
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in as %s' % username)
@@ -126,7 +124,7 @@ class BambuserChannelIE(InfoExtractor):
'&sort=created&access_mode=0%2C1%2C2&limit={count}'
'&method=broadcast&format=json&vid_older_than={last}'
).format(user=user, count=self._STEP, last=last_id)
req = compat_urllib_request.Request(req_url)
req = sanitized_Request(req_url)
# Without setting this header, we wouldn't get any result
req.add_header('Referer', 'http://bambuser.com/channel/%s' % user)
data = self._download_json(

View File

@@ -2,7 +2,6 @@
from __future__ import unicode_literals
import re
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
@@ -11,21 +10,34 @@ from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
remove_end,
unescapeHTML,
)
from ..compat import compat_HTTPError
from ..compat import (
compat_etree_fromstring,
compat_HTTPError,
)
class BBCCoUkIE(InfoExtractor):
IE_NAME = 'bbc.co.uk'
IE_DESC = 'BBC iPlayer'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:programmes/(?!articles/)|iplayer(?:/[^/]+)?/(?:episode/|playlist/))|music/clips[/#])(?P<id>[\da-z]{8})'
_ID_REGEX = r'[pb][\da-z]{7}'
_VALID_URL = r'''(?x)
https?://
(?:www\.)?bbc\.co\.uk/
(?:
programmes/(?!articles/)|
iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
music/clips[/#]|
radio/player/
)
(?P<id>%s)
''' % _ID_REGEX
_MEDIASELECTOR_URLS = [
# Provides HQ HLS streams with even better quality that pc mediaset but fails
# with geolocation in some cases when it's even not geo restricted at all (e.g.
# http://www.bbc.co.uk/programmes/b06bp7lf)
# http://www.bbc.co.uk/programmes/b06bp7lf). Also may fail with selectionunavailable.
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/iptv-all/vpid/%s',
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s',
]
@@ -44,9 +56,8 @@ class BBCCoUkIE(InfoExtractor):
'info_dict': {
'id': 'b039d07m',
'ext': 'flv',
'title': 'Kaleidoscope, Leonard Cohen',
'title': 'Leonard Cohen, Kaleidoscope - BBC Radio 4',
'description': 'The Canadian poet and songwriter reflects on his musical career.',
'duration': 1740,
},
'params': {
# rtmp download
@@ -74,7 +85,7 @@ class BBCCoUkIE(InfoExtractor):
'id': 'b00yng1d',
'ext': 'flv',
'title': 'The Voice UK: Series 3: Blind Auditions 5',
'description': "Emma Willis and Marvin Humes present the fifth set of blind auditions in the singing competition, as the coaches continue to build their teams based on voice alone.",
'description': 'Emma Willis and Marvin Humes present the fifth set of blind auditions in the singing competition, as the coaches continue to build their teams based on voice alone.',
'duration': 5100,
},
'params': {
@@ -109,16 +120,17 @@ class BBCCoUkIE(InfoExtractor):
'params': {
# rtmp download
'skip_download': True,
}
},
'skip': 'Episode is no longer available on BBC iPlayer Radio',
}, {
'url': 'http://www.bbc.co.uk/music/clips/p02frcc3',
'url': 'http://www.bbc.co.uk/music/clips/p022h44b',
'note': 'Audio',
'info_dict': {
'id': 'p02frcch',
'id': 'p022h44j',
'ext': 'flv',
'title': 'Pete Tong, Past, Present and Future Special, Madeon - After Hours mix',
'description': 'French house superstar Madeon takes us out of the club and onto the after party.',
'duration': 3507,
'title': 'BBC Proms Music Guides, Rachmaninov: Symphonic Dances',
'description': "In this Proms Music Guide, Andrew McGregor looks at Rachmaninov's Symphonic Dances.",
'duration': 227,
},
'params': {
# rtmp download
@@ -169,13 +181,25 @@ class BBCCoUkIE(InfoExtractor):
}, {
# iptv-all mediaset fails with geolocation however there is no geo restriction
# for this programme at all
'url': 'http://www.bbc.co.uk/programmes/b06bp7lf',
'url': 'http://www.bbc.co.uk/programmes/b06rkn85',
'info_dict': {
'id': 'b06bp7kf',
'id': 'b06rkms3',
'ext': 'flv',
'title': "Annie Mac's Friday Night, B.Traits sits in for Annie",
'description': 'B.Traits sits in for Annie Mac with a Mini-Mix from Disclosure.',
'duration': 10800,
'title': "Best of the Mini-Mixes 2015: Part 3, Annie Mac's Friday Night - BBC Radio 1",
'description': "Annie has part three in the Best of the Mini-Mixes 2015, plus the year's Most Played!",
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
# compact player (https://github.com/rg3/youtube-dl/issues/8147)
'url': 'http://www.bbc.co.uk/programmes/p028bfkf/player',
'info_dict': {
'id': 'p028bfkj',
'ext': 'flv',
'title': 'Extract from BBC documentary Look Stranger - Giant Leeks and Magic Brews',
'description': 'Extract from BBC documentary Look Stranger - Giant Leeks and Magic Brews',
},
'params': {
# rtmp download
@@ -190,6 +214,9 @@ class BBCCoUkIE(InfoExtractor):
}, {
'url': 'http://www.bbc.co.uk/iplayer/cbeebies/episode/b0480276/bing-14-atchoo',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/radio/player/p03cchwf',
'only_matching': True,
}
]
@@ -220,11 +247,9 @@ class BBCCoUkIE(InfoExtractor):
elif transfer_format == 'dash':
pass
elif transfer_format == 'hls':
m3u8_formats = self._extract_m3u8_formats(
formats.extend(self._extract_m3u8_formats(
href, programme_id, ext='mp4', entry_protocol='m3u8_native',
m3u8_id=supplier, fatal=False)
if m3u8_formats:
formats.extend(m3u8_formats)
m3u8_id=supplier, fatal=False))
# Direct link
else:
formats.append({
@@ -303,6 +328,7 @@ class BBCCoUkIE(InfoExtractor):
'format_id': '%s_%s' % (service, format['format_id']),
'abr': abr,
'acodec': acodec,
'vcodec': 'none',
})
formats.extend(conn_formats)
return formats
@@ -332,7 +358,7 @@ class BBCCoUkIE(InfoExtractor):
return self._download_media_selector_url(
mediaselector_url % programme_id, programme_id)
except BBCCoUkIE.MediaSelectionError as e:
if e.id in ('notukerror', 'geolocation'):
if e.id in ('notukerror', 'geolocation', 'selectionunavailable'):
last_exception = e
continue
self._raise_extractor_error(e)
@@ -343,8 +369,8 @@ class BBCCoUkIE(InfoExtractor):
media_selection = self._download_xml(
url, programme_id, 'Downloading media selection XML')
except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
media_selection = xml.etree.ElementTree.fromstring(ee.cause.read().decode('utf-8'))
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code in (403, 404):
media_selection = compat_etree_fromstring(ee.cause.read().decode('utf-8'))
else:
raise
return self._process_media_selector(media_selection, programme_id)
@@ -451,6 +477,7 @@ class BBCCoUkIE(InfoExtractor):
webpage = self._download_webpage(url, group_id, 'Downloading video page')
programme_id = None
duration = None
tviplayer = self._search_regex(
r'mediator\.bind\(({.+?})\s*,\s*document\.getElementById',
@@ -463,14 +490,19 @@ class BBCCoUkIE(InfoExtractor):
if not programme_id:
programme_id = self._search_regex(
r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
r'"vpid"\s*:\s*"(%s)"' % self._ID_REGEX, webpage, 'vpid', fatal=False, default=None)
if programme_id:
formats, subtitles = self._download_media_selector(programme_id)
title = self._og_search_title(webpage)
title = self._og_search_title(webpage, default=None) or self._html_search_regex(
(r'<h2[^>]+id="parent-title"[^>]*>(.+?)</h2>',
r'<div[^>]+class="info"[^>]*>\s*<h1>(.+?)</h1>'), webpage, 'title')
description = self._search_regex(
r'<p class="[^"]*medium-description[^"]*">([^<]+)</p>',
webpage, 'description', fatal=False)
(r'<p class="[^"]*medium-description[^"]*">([^<]+)</p>',
r'<div[^>]+class="info_+synopsis"[^>]*>([^<]+)</div>'),
webpage, 'description', default=None)
if not description:
description = self._html_search_meta('description', webpage)
else:
programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
@@ -529,7 +561,7 @@ class BBCIE(BBCCoUkIE):
'url': 'http://www.bbc.co.uk/blogs/adamcurtis/entries/3662a707-0af9-3149-963f-47bea720b460',
'info_dict': {
'id': '3662a707-0af9-3149-963f-47bea720b460',
'title': 'BBC Blogs - Adam Curtis - BUGGER',
'title': 'BUGGER',
},
'playlist_count': 18,
}, {
@@ -584,6 +616,7 @@ class BBCIE(BBCCoUkIE):
'ext': 'mp4',
'title': '''Judge Mindy Glazer: "I'm sorry to see you here... I always wondered what happened to you"''',
'duration': 56,
'description': '''Judge Mindy Glazer: "I'm sorry to see you here... I always wondered what happened to you"''',
},
'params': {
'skip_download': True,
@@ -637,9 +670,17 @@ class BBCIE(BBCCoUkIE):
'url': 'http://www.bbc.com/sport/0/football/34475836',
'info_dict': {
'id': '34475836',
'title': 'What Liverpool can expect from Klopp',
'title': 'Jurgen Klopp: Furious football from a witty and winning coach',
},
'playlist_count': 3,
}, {
# school report article with single video
'url': 'http://www.bbc.co.uk/schoolreport/35744779',
'info_dict': {
'id': '35744779',
'title': 'School which breaks down barriers in Jerusalem',
},
'playlist_count': 1,
}, {
# single video with playlist URL from weather section
'url': 'http://www.bbc.com/weather/features/33601775',
@@ -648,6 +689,10 @@ class BBCIE(BBCCoUkIE):
# custom redirection to www.bbc.com
'url': 'http://www.bbc.co.uk/news/science-environment-33661876',
'only_matching': True,
}, {
# single video article embedded with data-media-vpid
'url': 'http://www.bbc.co.uk/sport/rowing/35908187',
'only_matching': True,
}]
@classmethod
@@ -700,19 +745,19 @@ class BBCIE(BBCCoUkIE):
webpage = self._download_webpage(url, playlist_id)
timestamp = None
playlist_title = None
playlist_description = None
json_ld_info = self._search_json_ld(webpage, playlist_id, default=None)
timestamp = json_ld_info.get('timestamp')
ld = self._parse_json(
self._search_regex(
r'(?s)<script type="application/ld\+json">(.+?)</script>',
webpage, 'ld json', default='{}'),
playlist_id, fatal=False)
if ld:
timestamp = parse_iso8601(ld.get('datePublished'))
playlist_title = ld.get('headline')
playlist_description = ld.get('articleBody')
playlist_title = json_ld_info.get('title')
if not playlist_title:
playlist_title = self._og_search_title(
webpage, default=None) or self._html_search_regex(
r'<title>(.+?)</title>', webpage, 'playlist title', default=None)
if playlist_title:
playlist_title = re.sub(r'(.+)\s*-\s*BBC.*?$', r'\1', playlist_title).strip()
playlist_description = json_ld_info.get(
'description') or self._og_search_description(webpage, default=None)
if not timestamp:
timestamp = parse_iso8601(self._search_regex(
@@ -726,6 +771,7 @@ class BBCIE(BBCCoUkIE):
# article with multiple videos embedded with playlist.sxml (e.g.
# http://www.bbc.com/sport/0/football/34475836)
playlists = re.findall(r'<param[^>]+name="playlist"[^>]+value="([^"]+)"', webpage)
playlists.extend(re.findall(r'data-media-id="([^"]+/playlist\.sxml)"', webpage))
if playlists:
entries = [
self._extract_from_playlist_sxml(playlist_url, playlist_id, timestamp)
@@ -772,14 +818,13 @@ class BBCIE(BBCCoUkIE):
playlist.get('progressiveDownloadUrl'), playlist_id, timestamp))
if entries:
playlist_title = playlist_title or remove_end(self._og_search_title(webpage), ' - BBC News')
playlist_description = playlist_description or self._og_search_description(webpage, default=None)
return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)
# single video story (e.g. http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
programme_id = self._search_regex(
[r'data-video-player-vpid="([\da-z]{8})"',
r'<param[^>]+name="externalIdentifier"[^>]+value="([\da-z]{8})"'],
[r'data-(?:video-player|media)-vpid="(%s)"' % self._ID_REGEX,
r'<param[^>]+name="externalIdentifier"[^>]+value="(%s)"' % self._ID_REGEX,
r'videoId\s*:\s*["\'](%s)["\']' % self._ID_REGEX],
webpage, 'vpid', default=None)
if programme_id:
@@ -803,10 +848,6 @@ class BBCIE(BBCCoUkIE):
'subtitles': subtitles,
}
playlist_title = self._html_search_regex(
r'<title>(.*?)(?:\s*-\s*BBC [^ ]+)?</title>', webpage, 'playlist title')
playlist_description = self._og_search_description(webpage, default=None)
def extract_all(pattern):
return list(filter(None, map(
lambda s: self._parse_json(s, playlist_id, fatal=False),
@@ -814,7 +855,7 @@ class BBCIE(BBCCoUkIE):
# Multiple video article (e.g.
# http://www.bbc.co.uk/blogs/adamcurtis/entries/3662a707-0af9-3149-963f-47bea720b460)
EMBED_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:[^/]+/)+[\da-z]{8}(?:\b[^"]+)?'
EMBED_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:[^/]+/)+%s(?:\b[^"]+)?' % self._ID_REGEX
entries = []
for match in extract_all(r'new\s+SMP\(({.+?})\)'):
embed_url = match.get('playerSettings', {}).get('externalEmbedUrl')
@@ -906,7 +947,7 @@ class BBCIE(BBCCoUkIE):
class BBCCoUkArticleIE(InfoExtractor):
_VALID_URL = 'http://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
_VALID_URL = r'https?://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
IE_NAME = 'bbc.co.uk:article'
IE_DESC = 'BBC articles'

View File

@@ -1,6 +1,11 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_chr,
compat_ord,
compat_urllib_parse_unquote,
)
from ..utils import (
int_or_none,
parse_iso8601,
@@ -28,17 +33,75 @@ class BeegIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
cpl_url = self._search_regex(
r'<script[^>]+src=(["\'])(?P<url>(?:https?:)?//static\.beeg\.com/cpl/\d+\.js.*?)\1',
webpage, 'cpl', default=None, group='url')
beeg_version, beeg_salt = [None] * 2
if cpl_url:
cpl = self._download_webpage(
self._proto_relative_url(cpl_url), video_id,
'Downloading cpl JS', fatal=False)
if cpl:
beeg_version = self._search_regex(
r'beeg_version\s*=\s*(\d+)', cpl,
'beeg version', default=None) or self._search_regex(
r'/(\d+)\.js', cpl_url, 'beeg version', default=None)
beeg_salt = self._search_regex(
r'beeg_salt\s*=\s*(["\'])(?P<beeg_salt>.+?)\1', cpl, 'beeg beeg_salt',
default=None, group='beeg_salt')
beeg_version = beeg_version or '1750'
beeg_salt = beeg_salt or 'MIDtGaw96f0N1kMMAM1DE46EC9pmFr'
video = self._download_json(
'http://beeg.com/api/v1/video/%s' % video_id, video_id)
'http://api.beeg.com/api/v6/%s/video/%s' % (beeg_version, video_id),
video_id)
def split(o, e):
def cut(s, x):
n.append(s[:x])
return s[x:]
n = []
r = len(o) % e
if r > 0:
o = cut(o, r)
while len(o) > e:
o = cut(o, e)
n.append(o)
return n
def decrypt_key(key):
# Reverse engineered from http://static.beeg.com/cpl/1738.js
a = beeg_salt
e = compat_urllib_parse_unquote(key)
o = ''.join([
compat_chr(compat_ord(e[n]) - compat_ord(a[n % len(a)]) % 21)
for n in range(len(e))])
return ''.join(split(o, 3)[::-1])
def decrypt_url(encrypted_url):
encrypted_url = self._proto_relative_url(
encrypted_url.replace('{DATA_MARKERS}', ''), 'https:')
key = self._search_regex(
r'/key=(.*?)%2Cend=', encrypted_url, 'key', default=None)
if not key:
return encrypted_url
return encrypted_url.replace(key, decrypt_key(key))
formats = []
for format_id, video_url in video.items():
if not video_url:
continue
height = self._search_regex(
r'^(\d+)[pP]$', format_id, 'height', default=None)
if not height:
continue
formats.append({
'url': self._proto_relative_url(video_url.replace('{DATA_MARKERS}', ''), 'http:'),
'url': decrypt_url(video_url),
'format_id': format_id,
'height': int(height),
})
@@ -63,5 +126,5 @@ class BeegIE(InfoExtractor):
'duration': duration,
'tags': tags,
'formats': formats,
'age_limit': 18,
'age_limit': self._rta_search(webpage),
}

View File

@@ -8,7 +8,7 @@ from ..utils import url_basename
class BehindKinkIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
_VALID_URL = r'https?://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
_TEST = {
'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
'md5': '507b57d8fdcd75a41a9a7bdb7989c762',

View File

@@ -94,6 +94,7 @@ class BetIE(InfoExtractor):
xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
formats = self._extract_smil_formats(smil_url, display_id)
self._sort_formats(formats)
return {
'id': video_id,

View File

@@ -0,0 +1,85 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
class BigflixIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bigflix\.com/.+/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.bigflix.com/Hindi-movies/Action-movies/Singham-Returns/16537',
'md5': 'ec76aa9b1129e2e5b301a474e54fab74',
'info_dict': {
'id': '16537',
'ext': 'mp4',
'title': 'Singham Returns',
'description': 'md5:3d2ba5815f14911d5cc6a501ae0cf65d',
}
}, {
# 2 formats
'url': 'http://www.bigflix.com/Tamil-movies/Drama-movies/Madarasapatinam/16070',
'info_dict': {
'id': '16070',
'ext': 'mp4',
'title': 'Madarasapatinam',
'description': 'md5:63b9b8ed79189c6f0418c26d9a3452ca',
'formats': 'mincount:2',
},
'params': {
'skip_download': True,
}
}, {
# multiple formats
'url': 'http://www.bigflix.com/Malayalam-movies/Drama-movies/Indian-Rupee/15967',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(
r'<div[^>]+class=["\']pagetitle["\'][^>]*>(.+?)</div>',
webpage, 'title')
def decode_url(quoted_b64_url):
return base64.b64decode(compat_urllib_parse_unquote(
quoted_b64_url).encode('ascii')).decode('utf-8')
formats = []
for height, encoded_url in re.findall(
r'ContentURL_(\d{3,4})[pP][^=]+=([^&]+)', webpage):
video_url = decode_url(encoded_url)
f = {
'url': video_url,
'format_id': '%sp' % height,
'height': int(height),
}
if video_url.startswith('rtmp'):
f['ext'] = 'flv'
formats.append(f)
file_url = self._search_regex(
r'file=([^&]+)', webpage, 'video url', default=None)
if file_url:
video_url = decode_url(file_url)
if all(f['url'] != video_url for f in formats):
formats.append({
'url': decode_url(file_url),
})
self._sort_formats(formats)
description = self._html_search_meta('description', webpage)
return {
'id': video_id,
'title': title,
'description': description,
'formats': formats
}

View File

@@ -2,141 +2,109 @@
from __future__ import unicode_literals
import re
import itertools
import json
import xml.etree.ElementTree as ET
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
unified_strdate,
unescapeHTML,
ExtractorError,
xpath_text,
)
class BiliBiliIE(InfoExtractor):
_VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>[0-9]+)/'
_VALID_URL = r'https?://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)(?:/index_(?P<page_num>\d+).html)?'
_TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/',
'md5': '2c301e4dab317596e837c3e7633e7d86',
'info_dict': {
'id': '1074402_part1',
'id': '1554319',
'ext': 'flv',
'title': '【金坷垃】金泡沫',
'duration': 308,
'duration': 308313,
'upload_date': '20140420',
'thumbnail': 're:^https?://.+\.jpg',
'description': 'md5:ce18c2a2d2193f0df2917d270f2e5923',
'timestamp': 1397983878,
'uploader': '菊子桑',
},
}, {
'url': 'http://www.bilibili.com/video/av1041170/',
'info_dict': {
'id': '1041170',
'title': '【BD1080P】刀语【诸神&异域】',
'description': '这是个神奇的故事~每个人不留弹幕不给走哦~切利哦!~',
'uploader': '枫叶逝去',
'timestamp': 1396501299,
},
'playlist_count': 9,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
page_num = mobj.group('page_num') or '1'
if '(此视频不存在或被删除)' in webpage:
raise ExtractorError(
'The video does not exist or was deleted', expected=True)
view_data = self._download_json(
'http://api.bilibili.com/view?type=json&appkey=8e9fc618fbd41e28&id=%s&page=%s' % (video_id, page_num),
video_id)
if 'error' in view_data:
raise ExtractorError('%s said: %s' % (self.IE_NAME, view_data['error']), expected=True)
if '>你没有权限浏览! 由于版权相关问题 我们不对您所在的地区提供服务<' in webpage:
raise ExtractorError(
'The video is not available in your region due to copyright reasons',
expected=True)
cid = view_data['cid']
title = unescapeHTML(view_data['title'])
video_code = self._search_regex(
r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')
doc = self._download_xml(
'http://interface.bilibili.com/v_cdn_play?appkey=8e9fc618fbd41e28&cid=%s' % cid,
cid,
'Downloading page %s/%s' % (page_num, view_data['pages'])
)
title = self._html_search_meta(
'media:title', video_code, 'title', fatal=True)
duration_str = self._html_search_meta(
'duration', video_code, 'duration')
if duration_str is None:
duration = None
else:
duration_mobj = re.match(
r'^T(?:(?P<hours>[0-9]+)H)?(?P<minutes>[0-9]+)M(?P<seconds>[0-9]+)S$',
duration_str)
duration = (
int_or_none(duration_mobj.group('hours'), default=0) * 3600 +
int(duration_mobj.group('minutes')) * 60 +
int(duration_mobj.group('seconds')))
upload_date = unified_strdate(self._html_search_meta(
'uploadDate', video_code, fatal=False))
thumbnail = self._html_search_meta(
'thumbnailUrl', video_code, 'thumbnail', fatal=False)
cid = self._search_regex(r'cid=(\d+)', webpage, 'cid')
if xpath_text(doc, './result') == 'error':
raise ExtractorError('%s said: %s' % (self.IE_NAME, xpath_text(doc, './message')), expected=True)
entries = []
lq_page = self._download_webpage(
'http://interface.bilibili.com/v_cdn_play?appkey=1&cid=%s' % cid,
video_id,
note='Downloading LQ video info'
)
try:
err_info = json.loads(lq_page)
raise ExtractorError(
'BiliBili said: ' + err_info['error_text'], expected=True)
except ValueError:
pass
lq_doc = ET.fromstring(lq_page)
lq_durls = lq_doc.findall('./durl')
hq_doc = self._download_xml(
'http://interface.bilibili.com/playurl?appkey=1&cid=%s' % cid,
video_id,
note='Downloading HQ video info',
fatal=False,
)
if hq_doc is not False:
hq_durls = hq_doc.findall('./durl')
assert len(lq_durls) == len(hq_durls)
else:
hq_durls = itertools.repeat(None)
i = 1
for lq_durl, hq_durl in zip(lq_durls, hq_durls):
for durl in doc.findall('./durl'):
size = xpath_text(durl, ['./filesize', './size'])
formats = [{
'format_id': 'lq',
'quality': 1,
'url': lq_durl.find('./url').text,
'filesize': int_or_none(
lq_durl.find('./size'), get_attr='text'),
'url': durl.find('./url').text,
'filesize': int_or_none(size),
'ext': 'flv',
}]
if hq_durl is not None:
formats.append({
'format_id': 'hq',
'quality': 2,
'ext': 'flv',
'url': hq_durl.find('./url').text,
'filesize': int_or_none(
hq_durl.find('./size'), get_attr='text'),
})
self._sort_formats(formats)
backup_urls = durl.find('./backup_url')
if backup_urls is not None:
for backup_url in backup_urls.findall('./url'):
formats.append({'url': backup_url.text})
formats.reverse()
entries.append({
'id': '%s_part%d' % (video_id, i),
'id': '%s_part%s' % (cid, xpath_text(durl, './order')),
'title': title,
'duration': int_or_none(xpath_text(durl, './length'), 1000),
'formats': formats,
'duration': duration,
'upload_date': upload_date,
'thumbnail': thumbnail,
})
i += 1
return {
'_type': 'multi_video',
'entries': entries,
'id': video_id,
'title': title
info = {
'id': compat_str(cid),
'title': title,
'description': view_data.get('description'),
'thumbnail': view_data.get('pic'),
'uploader': view_data.get('author'),
'timestamp': int_or_none(view_data.get('created')),
'view_count': int_or_none(view_data.get('play')),
'duration': int_or_none(xpath_text(doc, './timelength')),
}
if len(entries) == 1:
entries[0].update(info)
return entries[0]
else:
info.update({
'_type': 'multi_video',
'id': video_id,
'entries': entries,
})
return info

View File

@@ -0,0 +1,86 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import remove_end
class BioBioChileTVIE(InfoExtractor):
_VALID_URL = r'https?://tv\.biobiochile\.cl/notas/(?:[^/]+/)+(?P<id>[^/]+)\.shtml'
_TESTS = [{
'url': 'http://tv.biobiochile.cl/notas/2015/10/21/sobre-camaras-y-camarillas-parlamentarias.shtml',
'md5': '26f51f03cf580265defefb4518faec09',
'info_dict': {
'id': 'sobre-camaras-y-camarillas-parlamentarias',
'ext': 'mp4',
'title': 'Sobre Cámaras y camarillas parlamentarias',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'Fernando Atria',
},
}, {
# different uploader layout
'url': 'http://tv.biobiochile.cl/notas/2016/03/18/natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades.shtml',
'md5': 'edc2e6b58974c46d5b047dea3c539ff3',
'info_dict': {
'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
'ext': 'mp4',
'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'Piangella Obrador',
},
'params': {
'skip_download': True,
},
}, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml',
'only_matching': True,
}, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/21/exclusivo-hector-pinto-formador-de-chupete-revela-version-del-ex-delantero-albo.shtml',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = remove_end(self._og_search_title(webpage), ' - BioBioChile TV')
file_url = self._search_regex(
r'loadFWPlayerVideo\([^,]+,\s*(["\'])(?P<url>.+?)\1',
webpage, 'file url', group='url')
base_url = self._search_regex(
r'file\s*:\s*(["\'])(?P<url>.+?)\1\s*\+\s*fileURL', webpage,
'base url', default='http://unlimited2-cl.digitalproserver.com/bbtv/',
group='url')
formats = self._extract_m3u8_formats(
'%s%s/playlist.m3u8' % (base_url, file_url), video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
f = {
'url': '%s%s' % (base_url, file_url),
'format_id': 'http',
'protocol': 'http',
'preference': 1,
}
if formats:
f_copy = formats[-1].copy()
f_copy.update(f)
f = f_copy
formats.append(f)
self._sort_formats(formats)
thumbnail = self._og_search_thumbnail(webpage)
uploader = self._html_search_regex(
r'<a[^>]+href=["\']https?://busca\.biobiochile\.cl/author[^>]+>(.+?)</a>',
webpage, 'uploader', fatal=False)
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'uploader': uploader,
'formats': formats,
}

View File

@@ -0,0 +1,110 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from .amp import AMPIE
from ..utils import (
ExtractorError,
int_or_none,
parse_iso8601,
)
class BleacherReportIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/articles/(?P<id>\d+)'
_TESTS = [{
'url': 'http://bleacherreport.com/articles/2496438-fsu-stat-projections-is-jalen-ramsey-best-defensive-player-in-college-football',
'md5': 'a3ffc3dc73afdbc2010f02d98f990f20',
'info_dict': {
'id': '2496438',
'ext': 'mp4',
'title': 'FSU Stat Projections: Is Jalen Ramsey Best Defensive Player in College Football?',
'uploader_id': 3992341,
'description': 'CFB, ACC, Florida State',
'timestamp': 1434380212,
'upload_date': '20150615',
'uploader': 'Team Stream Now ',
},
'add_ie': ['Ooyala'],
}, {
'url': 'http://bleacherreport.com/articles/2586817-aussie-golfers-get-fright-of-their-lives-after-being-chased-by-angry-kangaroo',
'md5': '6a5cd403418c7b01719248ca97fb0692',
'info_dict': {
'id': '2586817',
'ext': 'webm',
'title': 'Aussie Golfers Get Fright of Their Lives After Being Chased by Angry Kangaroo',
'timestamp': 1446839961,
'uploader': 'Sean Fay',
'description': 'md5:825e94e0f3521df52fa83b2ed198fa20',
'uploader_id': 6466954,
'upload_date': '20151011',
},
'add_ie': ['Youtube'],
}]
def _real_extract(self, url):
article_id = self._match_id(url)
article_data = self._download_json('http://api.bleacherreport.com/api/v1/articles/%s' % article_id, article_id)['article']
thumbnails = []
primary_photo = article_data.get('primaryPhoto')
if primary_photo:
thumbnails = [{
'url': primary_photo['url'],
'width': primary_photo.get('width'),
'height': primary_photo.get('height'),
}]
info = {
'_type': 'url_transparent',
'id': article_id,
'title': article_data['title'],
'uploader': article_data.get('author', {}).get('name'),
'uploader_id': article_data.get('authorId'),
'timestamp': parse_iso8601(article_data.get('createdAt')),
'thumbnails': thumbnails,
'comment_count': int_or_none(article_data.get('commentsCount')),
'view_count': int_or_none(article_data.get('hitCount')),
}
video = article_data.get('video')
if video:
video_type = video['type']
if video_type == 'cms.bleacherreport.com':
info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id']
elif video_type == 'ooyala.com':
info['url'] = 'ooyala:%s' % video['id']
elif video_type == 'youtube.com':
info['url'] = video['id']
elif video_type == 'vine.co':
info['url'] = 'https://vine.co/v/%s' % video['id']
else:
info['url'] = video_type + video['id']
return info
else:
raise ExtractorError('no video in the article', expected=True)
class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})'
_TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'md5': '8c2c12e3af7805152675446c905d159b',
'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'mp4',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
},
'params': {
# m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._extract_feed_info('http://cms.bleacherreport.com/media/items/%s/akamai.json' % video_id)
info['id'] = video_id
return info

View File

@@ -1,292 +0,0 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_request,
compat_urlparse,
)
from ..utils import (
clean_html,
int_or_none,
parse_iso8601,
unescapeHTML,
xpath_text,
xpath_with_ns,
)
class BlipTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.)?blip\.tv/(?:(?:.+-|rss/flash/)(?P<id>\d+)|((?:play/|api\.swf#)(?P<lookup_id>[\da-zA-Z+_]+)))'
_TESTS = [
{
'url': 'http://blip.tv/cbr/cbr-exclusive-gotham-city-imposters-bats-vs-jokerz-short-3-5796352',
'md5': '80baf1ec5c3d2019037c1c707d676b9f',
'info_dict': {
'id': '5779306',
'ext': 'm4v',
'title': 'CBR EXCLUSIVE: "Gotham City Imposters" Bats VS Jokerz Short 3',
'description': 'md5:9bc31f227219cde65e47eeec8d2dc596',
'timestamp': 1323138843,
'upload_date': '20111206',
'uploader': 'cbr',
'uploader_id': '679425',
'duration': 81,
}
},
{
# https://github.com/rg3/youtube-dl/pull/2274
'note': 'Video with subtitles',
'url': 'http://blip.tv/play/h6Uag5OEVgI.html',
'md5': '309f9d25b820b086ca163ffac8031806',
'info_dict': {
'id': '6586561',
'ext': 'mp4',
'title': 'Red vs. Blue Season 11 Episode 1',
'description': 'One-Zero-One',
'timestamp': 1371261608,
'upload_date': '20130615',
'uploader': 'redvsblue',
'uploader_id': '792887',
'duration': 279,
}
},
{
# https://bugzilla.redhat.com/show_bug.cgi?id=967465
'url': 'http://a.blip.tv/api.swf#h6Uag5KbVwI',
'md5': '314e87b1ebe7a48fcbfdd51b791ce5a6',
'info_dict': {
'id': '6573122',
'ext': 'mov',
'upload_date': '20130520',
'description': 'Two hapless space marines argue over what to do when they realize they have an astronomically huge problem on their hands.',
'title': 'Red vs. Blue Season 11 Trailer',
'timestamp': 1369029609,
'uploader': 'redvsblue',
'uploader_id': '792887',
}
},
{
'url': 'http://blip.tv/play/gbk766dkj4Yn',
'md5': 'fe0a33f022d49399a241e84a8ea8b8e3',
'info_dict': {
'id': '1749452',
'ext': 'mp4',
'upload_date': '20090208',
'description': 'Witness the first appearance of the Nostalgia Critic character, as Doug reviews the movie Transformers.',
'title': 'Nostalgia Critic: Transformers',
'timestamp': 1234068723,
'uploader': 'NostalgiaCritic',
'uploader_id': '246467',
}
},
{
# https://github.com/rg3/youtube-dl/pull/4404
'note': 'Audio only',
'url': 'http://blip.tv/hilarios-productions/weekly-manga-recap-kingdom-7119982',
'md5': '76c0a56f24e769ceaab21fbb6416a351',
'info_dict': {
'id': '7103299',
'ext': 'flv',
'title': 'Weekly Manga Recap: Kingdom',
'description': 'And then Shin breaks the enemy line, and he&apos;s all like HWAH! And then he slices a guy and it&apos;s all like FWASHING! And... it&apos;s really hard to describe the best parts of this series without breaking down into sound effects, okay?',
'timestamp': 1417660321,
'upload_date': '20141204',
'uploader': 'The Rollo T',
'uploader_id': '407429',
'duration': 7251,
'vcodec': 'none',
}
},
{
# missing duration
'url': 'http://blip.tv/rss/flash/6700880',
'info_dict': {
'id': '6684191',
'ext': 'm4v',
'title': 'Cowboy Bebop: Gateway Shuffle Review',
'description': 'md5:3acc480c0f9ae157f5fe88547ecaf3f8',
'timestamp': 1386639757,
'upload_date': '20131210',
'uploader': 'sfdebris',
'uploader_id': '706520',
}
}
]
@staticmethod
def _extract_url(webpage):
mobj = re.search(r'<meta\s[^>]*https?://api\.blip\.tv/\w+/redirect/\w+/(\d+)', webpage)
if mobj:
return 'http://blip.tv/a/a-' + mobj.group(1)
mobj = re.search(r'<(?:iframe|embed|object)\s[^>]*(https?://(?:\w+\.)?blip\.tv/(?:play/|api\.swf#)[a-zA-Z0-9_]+)', webpage)
if mobj:
return mobj.group(1)
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
lookup_id = mobj.group('lookup_id')
# See https://github.com/rg3/youtube-dl/issues/857 and
# https://github.com/rg3/youtube-dl/issues/4197
if lookup_id:
urlh = self._request_webpage(
'http://blip.tv/play/%s' % lookup_id, lookup_id, 'Resolving lookup id')
url = compat_urlparse.urlparse(urlh.geturl())
qs = compat_urlparse.parse_qs(url.query)
mobj = re.match(self._VALID_URL, qs['file'][0])
video_id = mobj.group('id')
rss = self._download_xml('http://blip.tv/rss/flash/%s' % video_id, video_id, 'Downloading video RSS')
def _x(p):
return xpath_with_ns(p, {
'blip': 'http://blip.tv/dtd/blip/1.0',
'media': 'http://search.yahoo.com/mrss/',
'itunes': 'http://www.itunes.com/dtds/podcast-1.0.dtd',
})
item = rss.find('channel/item')
video_id = xpath_text(item, _x('blip:item_id'), 'video id') or lookup_id
title = xpath_text(item, 'title', 'title', fatal=True)
description = clean_html(xpath_text(item, _x('blip:puredescription'), 'description'))
timestamp = parse_iso8601(xpath_text(item, _x('blip:datestamp'), 'timestamp'))
uploader = xpath_text(item, _x('blip:user'), 'uploader')
uploader_id = xpath_text(item, _x('blip:userid'), 'uploader id')
duration = int_or_none(xpath_text(item, _x('blip:runtime'), 'duration'))
media_thumbnail = item.find(_x('media:thumbnail'))
thumbnail = (media_thumbnail.get('url') if media_thumbnail is not None
else xpath_text(item, 'image', 'thumbnail'))
categories = [category.text for category in item.findall('category') if category is not None]
formats = []
subtitles_urls = {}
media_group = item.find(_x('media:group'))
for media_content in media_group.findall(_x('media:content')):
url = media_content.get('url')
role = media_content.get(_x('blip:role'))
msg = self._download_webpage(
url + '?showplayer=20140425131715&referrer=http://blip.tv&mask=7&skin=flashvars&view=url',
video_id, 'Resolving URL for %s' % role)
real_url = compat_urlparse.parse_qs(msg.strip())['message'][0]
media_type = media_content.get('type')
if media_type == 'text/srt' or url.endswith('.srt'):
LANGS = {
'english': 'en',
}
lang = role.rpartition('-')[-1].strip().lower()
langcode = LANGS.get(lang, lang)
subtitles_urls[langcode] = url
elif media_type.startswith('video/'):
formats.append({
'url': real_url,
'format_id': role,
'format_note': media_type,
'vcodec': media_content.get(_x('blip:vcodec')) or 'none',
'acodec': media_content.get(_x('blip:acodec')),
'filesize': media_content.get('filesize'),
'width': int_or_none(media_content.get('width')),
'height': int_or_none(media_content.get('height')),
})
self._check_formats(formats, video_id)
self._sort_formats(formats)
subtitles = self.extract_subtitles(video_id, subtitles_urls)
return {
'id': video_id,
'title': title,
'description': description,
'timestamp': timestamp,
'uploader': uploader,
'uploader_id': uploader_id,
'duration': duration,
'thumbnail': thumbnail,
'categories': categories,
'formats': formats,
'subtitles': subtitles,
}
def _get_subtitles(self, video_id, subtitles_urls):
subtitles = {}
for lang, url in subtitles_urls.items():
# For some weird reason, blip.tv serves a video instead of subtitles
# when we request with a common UA
req = compat_urllib_request.Request(url)
req.add_header('User-Agent', 'youtube-dl')
subtitles[lang] = [{
# The extension is 'srt' but it's actually an 'ass' file
'ext': 'ass',
'data': self._download_webpage(req, None, note=False),
}]
return subtitles
class BlipTVUserIE(InfoExtractor):
_VALID_URL = r'(?:(?:https?://(?:\w+\.)?blip\.tv/)|bliptvuser:)(?!api\.swf)([^/]+)/*$'
_PAGE_SIZE = 12
IE_NAME = 'blip.tv:user'
_TEST = {
'url': 'http://blip.tv/actone',
'info_dict': {
'id': 'actone',
'title': 'Act One: The Series',
},
'playlist_count': 5,
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
username = mobj.group(1)
page_base = 'http://m.blip.tv/pr/show_get_full_episode_list?users_id=%s&lite=0&esi=1'
page = self._download_webpage(url, username, 'Downloading user page')
mobj = re.search(r'data-users-id="([^"]+)"', page)
page_base = page_base % mobj.group(1)
title = self._og_search_title(page)
# Download video ids using BlipTV Ajax calls. Result size per
# query is limited (currently to 12 videos) so we need to query
# page by page until there are no video ids - it means we got
# all of them.
video_ids = []
pagenum = 1
while True:
url = page_base + "&page=" + str(pagenum)
page = self._download_webpage(
url, username, 'Downloading video ids from page %d' % pagenum)
# Extract video identifiers
ids_in_page = []
for mobj in re.finditer(r'href="/([^"]+)"', page):
if mobj.group(1) not in ids_in_page:
ids_in_page.append(unescapeHTML(mobj.group(1)))
video_ids.extend(ids_in_page)
# A little optimization - if current page is not
# "full", ie. does not contain PAGE_SIZE video ids then
# we can assume that this page is the last one - there
# are no more ids on further pages - no need to query
# again.
if len(ids_in_page) < self._PAGE_SIZE:
break
pagenum += 1
urls = ['http://blip.tv/%s' % video_id for video_id in video_ids]
url_entries = [self.url_result(vurl, 'BlipTV') for vurl in urls]
return self.playlist_result(
url_entries, playlist_title=title, playlist_id=username)

View File

@@ -6,9 +6,9 @@ from .common import InfoExtractor
class BloombergIE(InfoExtractor):
_VALID_URL = r'https?://www\.bloomberg\.com/news/videos/[^/]+/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?bloomberg\.com/(?:[^/]+/)*(?P<id>[^/?#]+)'
_TEST = {
_TESTS = [{
'url': 'http://www.bloomberg.com/news/videos/b/aaeae121-5949-481e-a1ce-4562db6f5df2',
# The md5 checksum changes
'info_dict': {
@@ -17,22 +17,35 @@ class BloombergIE(InfoExtractor):
'title': 'Shah\'s Presentation on Foreign-Exchange Strategies',
'description': 'md5:a8ba0302912d03d246979735c17d2761',
},
}
}, {
'url': 'http://www.bloomberg.com/news/articles/2015-11-12/five-strange-things-that-have-been-happening-in-financial-markets',
'only_matching': True,
}, {
'url': 'http://www.bloomberg.com/politics/videos/2015-11-25/karl-rove-on-jeb-bush-s-struggles-stopping-trump',
'only_matching': True,
}]
def _real_extract(self, url):
name = self._match_id(url)
webpage = self._download_webpage(url, name)
video_id = self._search_regex(r'"bmmrId":"(.+?)"', webpage, 'id')
video_id = self._search_regex(
r'["\']bmmrId["\']\s*:\s*(["\'])(?P<url>.+?)\1',
webpage, 'id', group='url')
title = re.sub(': Video$', '', self._og_search_title(webpage))
embed_info = self._download_json(
'http://www.bloomberg.com/api/embed?id=%s' % video_id, video_id)
formats = []
for stream in embed_info['streams']:
if stream["muxing_format"] == "TS":
formats.extend(self._extract_m3u8_formats(stream['url'], video_id))
stream_url = stream.get('url')
if not stream_url:
continue
if stream['muxing_format'] == 'TS':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
formats.extend(self._extract_f4m_formats(stream['url'], video_id))
formats.extend(self._extract_f4m_formats(
stream_url, video_id, f4m_id='hds', fatal=False))
self._sort_formats(formats)
return {

View File

@@ -0,0 +1,60 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_parse_qs
from ..utils import ExtractorError
class BokeCCBaseIE(InfoExtractor):
def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
player_params_str = self._html_search_regex(
r'<(?:script|embed)[^>]+src="http://p\.bokecc\.com/player\?([^"]+)',
webpage, 'player params')
player_params = compat_parse_qs(player_params_str)
info_xml = self._download_xml(
'http://p.bokecc.com/servlet/playinfo?uid=%s&vid=%s&m=1' % (
player_params['siteid'][0], player_params['vid'][0]), video_id)
formats = [{
'format_id': format_id,
'url': quality.find('./copy').attrib['playurl'],
'preference': int(quality.attrib['value']),
} for quality in info_xml.findall('./video/quality')]
self._sort_formats(formats)
return formats
class BokeCCIE(BokeCCBaseIE):
_IE_DESC = 'CC视频'
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{
'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B',
'info_dict': {
'id': 'CD0C5D3C8614B28B_E44D40C15E65EA30',
'ext': 'flv',
'title': 'BokeCC Video',
},
}]
def _real_extract(self, url):
qs = compat_parse_qs(re.match(self._VALID_URL, url).group('query'))
if not qs.get('vid') or not qs.get('uid'):
raise ExtractorError('Invalid URL', expected=True)
video_id = '%s_%s' % (qs['uid'][0], qs['vid'][0])
webpage = self._download_webpage(url, video_id)
return {
'id': video_id,
'title': 'BokeCC Video', # no title provided in the webpage
'formats': self._extract_bokecc_formats(webpage, video_id),
}

View File

@@ -1,16 +1,23 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
js_to_json,
determine_ext,
)
class BpbIE(InfoExtractor):
IE_DESC = 'Bundeszentrale für politische Bildung'
_VALID_URL = r'http://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
_VALID_URL = r'https?://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
_TEST = {
'url': 'http://www.bpb.de/mediathek/297/joachim-gauck-zu-1989-und-die-erinnerung-an-die-ddr',
'md5': '0792086e8e2bfbac9cdf27835d5f2093',
# md5 fails in Python 2.6 due to buggy server response and wrong handling of urllib2
'md5': 'c4f84c8a8044ca9ff68bb8441d300b3f',
'info_dict': {
'id': '297',
'ext': 'mp4',
@@ -25,13 +32,26 @@ class BpbIE(InfoExtractor):
title = self._html_search_regex(
r'<h2 class="white">(.*?)</h2>', webpage, 'title')
video_url = self._html_search_regex(
r'(http://film\.bpb\.de/player/dokument_[0-9]+\.mp4)',
webpage, 'video URL')
video_info_dicts = re.findall(
r"({\s*src:\s*'http://film\.bpb\.de/[^}]+})", webpage)
formats = []
for video_info in video_info_dicts:
video_info = self._parse_json(video_info, video_id, transform_source=js_to_json)
quality = video_info['quality']
video_url = video_info['src']
formats.append({
'url': video_url,
'preference': 10 if quality == 'high' else 0,
'format_note': quality,
'format_id': '%s-%s' % (quality, determine_ext(video_url)),
})
self._sort_formats(formats)
return {
'id': video_id,
'url': video_url,
'formats': formats,
'title': title,
'description': self._og_search_description(webpage),
}

View File

@@ -1,18 +1,21 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
parse_duration,
xpath_element,
xpath_text,
)
class BRIE(InfoExtractor):
IE_DESC = 'Bayerischer Rundfunk Mediathek'
_VALID_URL = r'https?://(?:www\.)?br\.de/(?:[a-z0-9\-_]+/)+(?P<id>[a-z0-9\-_]+)\.html'
_BASE_URL = 'http://www.br.de'
_VALID_URL = r'(?P<base_url>https?://(?:www\.)?br(?:-klassik)?\.de)/(?:[a-z0-9\-_]+/)+(?P<id>[a-z0-9\-_]+)\.html'
_TESTS = [
{
@@ -22,7 +25,7 @@ class BRIE(InfoExtractor):
'id': '48f656ef-287e-486f-be86-459122db22cc',
'ext': 'mp4',
'title': 'Die böse Überraschung',
'description': 'Betriebliche Altersvorsorge: Die böse Überraschung',
'description': 'md5:ce9ac81b466ce775b8018f6801b48ac9',
'duration': 180,
'uploader': 'Reinhard Weber',
'upload_date': '20150422',
@@ -30,23 +33,23 @@ class BRIE(InfoExtractor):
},
{
'url': 'http://www.br.de/nachrichten/oberbayern/inhalt/muenchner-polizeipraesident-schreiber-gestorben-100.html',
'md5': 'a44396d73ab6a68a69a568fae10705bb',
'md5': 'af3a3a4aa43ff0ce6a89504c67f427ef',
'info_dict': {
'id': 'a4b83e34-123d-4b81-9f4e-c0d3121a4e05',
'ext': 'mp4',
'ext': 'flv',
'title': 'Manfred Schreiber ist tot',
'description': 'Abendschau kompakt: Manfred Schreiber ist tot',
'description': 'md5:b454d867f2a9fc524ebe88c3f5092d97',
'duration': 26,
}
},
{
'url': 'http://www.br.de/radio/br-klassik/sendungen/allegro/premiere-urauffuehrung-the-land-2015-dance-festival-muenchen-100.html',
'url': 'https://www.br-klassik.de/audio/peeping-tom-premierenkritik-dance-festival-muenchen-100.html',
'md5': '8b5b27c0b090f3b35eac4ab3f7a73d3d',
'info_dict': {
'id': '74c603c9-26d3-48bb-b85b-079aeed66e0b',
'ext': 'aac',
'title': 'Kurzweilig und sehr bewegend',
'description': '"The Land" von Peeping Tom: Kurzweilig und sehr bewegend',
'description': 'md5:0351996e3283d64adeb38ede91fac54e',
'duration': 296,
}
},
@@ -57,7 +60,7 @@ class BRIE(InfoExtractor):
'id': '6ba73750-d405-45d3-861d-1ce8c524e059',
'ext': 'mp4',
'title': 'Umweltbewusster Häuslebauer',
'description': 'Uwe Erdelt: Umweltbewusster Häuslebauer',
'description': 'md5:d52dae9792d00226348c1dbb13c9bae2',
'duration': 116,
}
},
@@ -68,7 +71,7 @@ class BRIE(InfoExtractor):
'id': 'd982c9ce-8648-4753-b358-98abb8aec43d',
'ext': 'mp4',
'title': 'Folge 1 - Metaphysik',
'description': 'Kant für Anfänger: Folge 1 - Metaphysik',
'description': 'md5:bb659990e9e59905c3d41e369db1fbe3',
'duration': 893,
'uploader': 'Eva Maria Steimle',
'upload_date': '20140117',
@@ -77,28 +80,31 @@ class BRIE(InfoExtractor):
]
def _real_extract(self, url):
display_id = self._match_id(url)
base_url, display_id = re.search(self._VALID_URL, url).groups()
page = self._download_webpage(url, display_id)
xml_url = self._search_regex(
r"return BRavFramework\.register\(BRavFramework\('avPlayer_(?:[a-f0-9-]{36})'\)\.setup\({dataURL:'(/(?:[a-z0-9\-]+/)+[a-z0-9/~_.-]+)'}\)\);", page, 'XMLURL')
xml = self._download_xml(self._BASE_URL + xml_url, None)
xml = self._download_xml(base_url + xml_url, display_id)
medias = []
for xml_media in xml.findall('video') + xml.findall('audio'):
media_id = xml_media.get('externalId')
media = {
'id': xml_media.get('externalId'),
'title': xml_media.find('title').text,
'duration': parse_duration(xml_media.find('duration').text),
'formats': self._extract_formats(xml_media.find('assets')),
'thumbnails': self._extract_thumbnails(xml_media.find('teaserImage/variants')),
'description': ' '.join(xml_media.find('shareTitle').text.splitlines()),
'webpage_url': xml_media.find('permalink').text
'id': media_id,
'title': xpath_text(xml_media, 'title', 'title', True),
'duration': parse_duration(xpath_text(xml_media, 'duration')),
'formats': self._extract_formats(xpath_element(
xml_media, 'assets'), media_id),
'thumbnails': self._extract_thumbnails(xpath_element(
xml_media, 'teaserImage/variants'), base_url),
'description': xpath_text(xml_media, 'desc'),
'webpage_url': xpath_text(xml_media, 'permalink'),
'uploader': xpath_text(xml_media, 'author'),
}
if xml_media.find('author').text:
media['uploader'] = xml_media.find('author').text
if xml_media.find('broadcastDate').text:
media['upload_date'] = ''.join(reversed(xml_media.find('broadcastDate').text.split('.')))
broadcast_date = xpath_text(xml_media, 'broadcastDate')
if broadcast_date:
media['upload_date'] = ''.join(reversed(broadcast_date.split('.')))
medias.append(media)
if len(medias) > 1:
@@ -109,35 +115,54 @@ class BRIE(InfoExtractor):
raise ExtractorError('No media entries found')
return medias[0]
def _extract_formats(self, assets):
def text_or_none(asset, tag):
elem = asset.find(tag)
return None if elem is None else elem.text
formats = [{
'url': text_or_none(asset, 'downloadUrl'),
'ext': text_or_none(asset, 'mediaType'),
'format_id': asset.get('type'),
'width': int_or_none(text_or_none(asset, 'frameWidth')),
'height': int_or_none(text_or_none(asset, 'frameHeight')),
'tbr': int_or_none(text_or_none(asset, 'bitrateVideo')),
'abr': int_or_none(text_or_none(asset, 'bitrateAudio')),
'vcodec': text_or_none(asset, 'codecVideo'),
'acodec': text_or_none(asset, 'codecAudio'),
'container': text_or_none(asset, 'mediaType'),
'filesize': int_or_none(text_or_none(asset, 'size')),
} for asset in assets.findall('asset')
if asset.find('downloadUrl') is not None]
def _extract_formats(self, assets, media_id):
formats = []
for asset in assets.findall('asset'):
format_url = xpath_text(asset, ['downloadUrl', 'url'])
asset_type = asset.get('type')
if asset_type == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url + '?hdcore=3.2.0', media_id, f4m_id='hds', fatal=False))
elif asset_type == 'HLS':
formats.extend(self._extract_m3u8_formats(
format_url, media_id, 'mp4', 'm3u8_native', m3u8_id='hds', fatal=False))
else:
format_info = {
'ext': xpath_text(asset, 'mediaType'),
'width': int_or_none(xpath_text(asset, 'frameWidth')),
'height': int_or_none(xpath_text(asset, 'frameHeight')),
'tbr': int_or_none(xpath_text(asset, 'bitrateVideo')),
'abr': int_or_none(xpath_text(asset, 'bitrateAudio')),
'vcodec': xpath_text(asset, 'codecVideo'),
'acodec': xpath_text(asset, 'codecAudio'),
'container': xpath_text(asset, 'mediaType'),
'filesize': int_or_none(xpath_text(asset, 'size')),
}
format_url = self._proto_relative_url(format_url)
if format_url:
http_format_info = format_info.copy()
http_format_info.update({
'url': format_url,
'format_id': 'http-%s' % asset_type,
})
formats.append(http_format_info)
server_prefix = xpath_text(asset, 'serverPrefix')
if server_prefix:
rtmp_format_info = format_info.copy()
rtmp_format_info.update({
'url': server_prefix,
'play_path': xpath_text(asset, 'fileName'),
'format_id': 'rtmp-%s' % asset_type,
})
formats.append(rtmp_format_info)
self._sort_formats(formats)
return formats
def _extract_thumbnails(self, variants):
def _extract_thumbnails(self, variants, base_url):
thumbnails = [{
'url': self._BASE_URL + variant.find('url').text,
'width': int_or_none(variant.find('width').text),
'height': int_or_none(variant.find('height').text),
} for variant in variants.findall('variant')]
'url': base_url + xpath_text(variant, 'url'),
'width': int_or_none(xpath_text(variant, 'width')),
'height': int_or_none(xpath_text(variant, 'height')),
} for variant in variants.findall('variant') if xpath_text(variant, 'url')]
thumbnails.sort(key=lambda x: x['width'] * x['height'], reverse=True)
return thumbnails

View File

@@ -0,0 +1,31 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import smuggle_url
class BravoTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+videos/(?P<id>[^/?]+)'
_TEST = {
'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
'md5': 'd60cdf68904e854fac669bd26cccf801',
'info_dict': {
'id': 'LitrBdX64qLn',
'ext': 'mp4',
'title': 'Last Chance Kitchen Returns',
'description': 'S13: Last Chance Kitchen Returns for Top Chef Season 13',
'timestamp': 1448926740,
'upload_date': '20151130',
'uploader': 'NBCU-BRAV',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
account_pid = self._search_regex(r'"account_pid"\s*:\s*"([^"]+)"', webpage, 'account pid')
release_pid = self._search_regex(r'"release_pid"\s*:\s*"([^"]+)"', webpage, 'release pid')
return self.url_result(smuggle_url(
'http://link.theplatform.com/s/%s/%s?mbr=true&switch=progressive' % (account_pid, release_pid),
{'force_smil_url': True}), 'ThePlatform', release_pid)

View File

@@ -11,7 +11,7 @@ from ..utils import (
class BreakIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'info_dict': {

View File

@@ -3,31 +3,36 @@ from __future__ import unicode_literals
import re
import json
import xml.etree.ElementTree
from .common import InfoExtractor
from ..compat import (
compat_etree_fromstring,
compat_parse_qs,
compat_str,
compat_urllib_parse,
compat_urllib_parse_urlparse,
compat_urllib_request,
compat_urlparse,
compat_xml_parse_error,
compat_HTTPError,
)
from ..utils import (
determine_ext,
ExtractorError,
find_xpath_attr,
fix_xml_ampersands,
float_or_none,
js_to_json,
int_or_none,
parse_iso8601,
unescapeHTML,
unsmuggle_url,
update_url_query,
)
class BrightcoveIE(InfoExtractor):
class BrightcoveLegacyIE(InfoExtractor):
IE_NAME = 'brightcove:legacy'
_VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
_FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'
_FEDERATED_URL = 'http://c.brightcove.com/services/viewer/htmlFederated'
_TESTS = [
{
@@ -41,6 +46,9 @@ class BrightcoveIE(InfoExtractor):
'title': 'Xavier Sala i Martín: “Un banc que no presta és un banc zombi que no serveix per a res”',
'uploader': '8TV',
'description': 'md5:a950cc4285c43e44d763d036710cd9cd',
'timestamp': 1368213670,
'upload_date': '20130510',
'uploader_id': '1589608506001',
}
},
{
@@ -52,6 +60,9 @@ class BrightcoveIE(InfoExtractor):
'title': 'JVMLS 2012: Arrays 2.0 - Opportunities and Challenges',
'description': 'John Rose speaks at the JVM Language Summit, August 1, 2012.',
'uploader': 'Oracle',
'timestamp': 1344975024,
'upload_date': '20120814',
'uploader_id': '1460825906',
},
},
{
@@ -63,6 +74,9 @@ class BrightcoveIE(InfoExtractor):
'title': 'This Bracelet Acts as a Personal Thermostat',
'description': 'md5:547b78c64f4112766ccf4e151c20b6a0',
'uploader': 'Mashable',
'timestamp': 1382041798,
'upload_date': '20131017',
'uploader_id': '1130468786001',
},
},
{
@@ -80,14 +94,17 @@ class BrightcoveIE(InfoExtractor):
{
# test flv videos served by akamaihd.net
# From http://www.redbull.com/en/bike/stories/1331655643987/replay-uci-dh-world-cup-2014-from-fort-william
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?%40videoPlayer=ref%3ABC2996102916001&linkBaseURL=http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fvideos%2F1331655630249%2Freplay-uci-fort-william-2014-dh&playerKey=AQ%7E%7E%2CAAAApYJ7UqE%7E%2Cxqr_zXk0I-zzNndy8NlHogrCb5QdyZRf&playerID=1398061561001#__youtubedl_smuggle=%7B%22Referer%22%3A+%22http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fstories%2F1331655643987%2Freplay-uci-dh-world-cup-2014-from-fort-william%22%7D',
'url': 'http://c.brightcove.com/services/viewer/htmlFederated?%40videoPlayer=ref%3Aevent-stream-356&linkBaseURL=http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fvideos%2F1331655630249%2Freplay-uci-fort-william-2014-dh&playerKey=AQ%7E%7E%2CAAAApYJ7UqE%7E%2Cxqr_zXk0I-zzNndy8NlHogrCb5QdyZRf&playerID=1398061561001#__youtubedl_smuggle=%7B%22Referer%22%3A+%22http%3A%2F%2Fwww.redbull.com%2Fen%2Fbike%2Fstories%2F1331655643987%2Freplay-uci-dh-world-cup-2014-from-fort-william%22%7D',
# The md5 checksum changes on each download
'info_dict': {
'id': '2996102916001',
'id': '3750436379001',
'ext': 'flv',
'title': 'UCI MTB World Cup 2014: Fort William, UK - Downhill Finals',
'uploader': 'Red Bull TV',
'uploader': 'RBTV Old (do not use)',
'description': 'UCI MTB World Cup 2014: Fort William, UK - Downhill Finals',
'timestamp': 1409122195,
'upload_date': '20140827',
'uploader_id': '710858724001',
},
},
{
@@ -101,6 +118,12 @@ class BrightcoveIE(InfoExtractor):
'playlist_mincount': 7,
},
]
FLV_VCODECS = {
1: 'SORENSON',
2: 'ON2',
3: 'H264',
4: 'VP8',
}
@classmethod
def _build_brighcove_url(cls, object_str):
@@ -119,7 +142,7 @@ class BrightcoveIE(InfoExtractor):
object_str = fix_xml_ampersands(object_str)
try:
object_doc = xml.etree.ElementTree.fromstring(object_str.encode('utf-8'))
object_doc = compat_etree_fromstring(object_str.encode('utf-8'))
except compat_xml_parse_error:
return
@@ -131,13 +154,16 @@ class BrightcoveIE(InfoExtractor):
else:
flashvars = {}
data_url = object_doc.attrib.get('data', '')
data_url_params = compat_parse_qs(compat_urllib_parse_urlparse(data_url).query)
def find_param(name):
if name in flashvars:
return flashvars[name]
node = find_xpath_attr(object_doc, './param', 'name', name)
if node is not None:
return node.attrib['value']
return None
return data_url_params.get(name)
params = {}
@@ -150,8 +176,8 @@ class BrightcoveIE(InfoExtractor):
# Not all pages define this value
if playerKey is not None:
params['playerKey'] = playerKey
# The three fields hold the id of the video
videoPlayer = find_param('@videoPlayer') or find_param('videoId') or find_param('videoID')
# These fields hold the id of the video
videoPlayer = find_param('@videoPlayer') or find_param('videoId') or find_param('videoID') or find_param('@videoList')
if videoPlayer is not None:
params['@videoPlayer'] = videoPlayer
linkBase = find_param('linkBaseURL')
@@ -179,8 +205,7 @@ class BrightcoveIE(InfoExtractor):
@classmethod
def _make_brightcove_url(cls, params):
data = compat_urllib_parse.urlencode(params)
return cls._FEDERATED_URL_TEMPLATE % data
return update_url_query(cls._FEDERATED_URL, params)
@classmethod
def _extract_brightcove_url(cls, webpage):
@@ -234,7 +259,7 @@ class BrightcoveIE(InfoExtractor):
# We set the original url as the default 'Referer' header
referer = smuggled_data.get('Referer', url)
return self._get_video_info(
videoPlayer[0], query_str, query, referer=referer)
videoPlayer[0], query, referer=referer)
elif 'playerKey' in query:
player_key = query['playerKey']
return self._get_playlist_info(player_key[0])
@@ -243,15 +268,14 @@ class BrightcoveIE(InfoExtractor):
'Cannot find playerKey= variable. Did you forget quotes in a shell invocation?',
expected=True)
def _get_video_info(self, video_id, query_str, query, referer=None):
request_url = self._FEDERATED_URL_TEMPLATE % query_str
req = compat_urllib_request.Request(request_url)
def _get_video_info(self, video_id, query, referer=None):
headers = {}
linkBase = query.get('linkBaseURL')
if linkBase is not None:
referer = linkBase[0]
if referer is not None:
req.add_header('Referer', referer)
webpage = self._download_webpage(req, video_id)
headers['Referer'] = referer
webpage = self._download_webpage(self._FEDERATED_URL, video_id, headers=headers, query=query)
error_msg = self._html_search_regex(
r"<h1>We're sorry.</h1>([\s\n]*<p>.*?</p>)+", webpage,
@@ -283,15 +307,19 @@ class BrightcoveIE(InfoExtractor):
playlist_title=playlist_info['mediaCollectionDTO']['displayName'])
def _extract_video_info(self, video_info):
publisher_id = video_info.get('publisherId')
info = {
'id': compat_str(video_info['id']),
'title': video_info['displayName'].strip(),
'description': video_info.get('shortDescription'),
'thumbnail': video_info.get('videoStillURL') or video_info.get('thumbnailURL'),
'uploader': video_info.get('publisherName'),
'uploader_id': compat_str(publisher_id) if publisher_id else None,
'duration': float_or_none(video_info.get('length'), 1000),
'timestamp': int_or_none(video_info.get('creationDate'), 1000),
}
renditions = video_info.get('renditions')
renditions = video_info.get('renditions', []) + video_info.get('IOSRenditions', [])
if renditions:
formats = []
for rend in renditions:
@@ -312,19 +340,42 @@ class BrightcoveIE(InfoExtractor):
ext = 'flv'
if ext is None:
ext = determine_ext(url)
size = rend.get('size')
formats.append({
tbr = int_or_none(rend.get('encodingRate'), 1000),
a_format = {
'format_id': 'http%s' % ('-%s' % tbr if tbr else ''),
'url': url,
'ext': ext,
'height': rend.get('frameHeight'),
'width': rend.get('frameWidth'),
'filesize': size if size != 0 else None,
})
'filesize': int_or_none(rend.get('size')) or None,
'tbr': tbr,
}
if rend.get('audioOnly'):
a_format.update({
'vcodec': 'none',
})
else:
a_format.update({
'height': int_or_none(rend.get('frameHeight')),
'width': int_or_none(rend.get('frameWidth')),
'vcodec': rend.get('videoCodec'),
})
# m3u8 manifests with remote == false are media playlists
# Not calling _extract_m3u8_formats here to save network traffic
if ext == 'm3u8':
a_format.update({
'format_id': 'hls%s' % ('-%s' % tbr if tbr else ''),
'ext': 'mp4',
'protocol': 'm3u8',
})
formats.append(a_format)
self._sort_formats(formats)
info['formats'] = formats
elif video_info.get('FLVFullLengthURL') is not None:
info.update({
'url': video_info['FLVFullLengthURL'],
'vcodec': self.FLV_VCODECS.get(video_info.get('FLVFullCodec')),
'filesize': int_or_none(video_info.get('FLVFullSize')),
})
if self._downloader.params.get('include_ads', False):
@@ -346,3 +397,205 @@ class BrightcoveIE(InfoExtractor):
if 'url' not in info and not info.get('formats'):
raise ExtractorError('Unable to extract video url for %s' % info['id'])
return info
class BrightcoveNewIE(InfoExtractor):
IE_NAME = 'brightcove:new'
_VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)'
_TESTS = [{
'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001',
'md5': 'c8100925723840d4b0d243f7025703be',
'info_dict': {
'id': '4463358922001',
'ext': 'mp4',
'title': 'Meet the man behind Popcorn Time',
'description': 'md5:eac376a4fe366edc70279bfb681aea16',
'duration': 165.768,
'timestamp': 1441391203,
'upload_date': '20150904',
'uploader_id': '929656772001',
'formats': 'mincount:22',
},
}, {
# with rtmp streams
'url': 'http://players.brightcove.net/4036320279001/5d112ed9-283f-485f-a7f9-33f42e8bc042_default/index.html?videoId=4279049078001',
'info_dict': {
'id': '4279049078001',
'ext': 'mp4',
'title': 'Titansgrave: Chapter 0',
'description': 'Titansgrave: Chapter 0',
'duration': 1242.058,
'timestamp': 1433556729,
'upload_date': '20150606',
'uploader_id': '4036320279001',
'formats': 'mincount:41',
},
'params': {
# m3u8 download
'skip_download': True,
}
}, {
# ref: prefixed video id
'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442',
'only_matching': True,
}, {
# non numeric ref: prefixed video id
'url': 'http://players.brightcove.net/710858724001/default_default/index.html?videoId=ref:event-stream-356',
'only_matching': True,
}]
@staticmethod
def _extract_url(webpage):
urls = BrightcoveNewIE._extract_urls(webpage)
return urls[0] if urls else None
@staticmethod
def _extract_urls(webpage):
# Reference:
# 1. http://docs.brightcove.com/en/video-cloud/brightcove-player/guides/publish-video.html#setvideoiniframe
# 2. http://docs.brightcove.com/en/video-cloud/brightcove-player/guides/publish-video.html#setvideousingjavascript
# 3. http://docs.brightcove.com/en/video-cloud/brightcove-player/guides/embed-in-page.html
# 4. https://support.brightcove.com/en/video-cloud/docs/dynamically-assigning-videos-player
entries = []
# Look for iframe embeds [1]
for _, url in re.findall(
r'<iframe[^>]+src=(["\'])((?:https?:)?//players\.brightcove\.net/\d+/[^/]+/index\.html.+?)\1', webpage):
entries.append(url if url.startswith('http') else 'http:' + url)
# Look for embed_in_page embeds [2]
for video_id, account_id, player_id, embed in re.findall(
# According to examples from [3] it's unclear whether video id
# may be optional and what to do when it is
# According to [4] data-video-id may be prefixed with ref:
r'''(?sx)
<video[^>]+
data-video-id=["\'](\d+|ref:[^"\']+)["\'][^>]*>.*?
</video>.*?
<script[^>]+
src=["\'](?:https?:)?//players\.brightcove\.net/
(\d+)/([^/]+)_([^/]+)/index(?:\.min)?\.js
''', webpage):
entries.append(
'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
% (account_id, player_id, embed, video_id))
return entries
def _real_extract(self, url):
account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(
'http://players.brightcove.net/%s/%s_%s/index.min.js'
% (account_id, player_id, embed), video_id)
policy_key = None
catalog = self._search_regex(
r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
if catalog:
catalog = self._parse_json(
js_to_json(catalog), video_id, fatal=False)
if catalog:
policy_key = catalog.get('policyKey')
if not policy_key:
policy_key = self._search_regex(
r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
webpage, 'policy key', group='pk')
api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
try:
json_data = self._download_json(api_url, video_id, headers={
'Accept': 'application/json;pk=%s' % policy_key
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
json_data = self._parse_json(e.cause.read().decode(), video_id)
raise ExtractorError(json_data[0]['message'], expected=True)
raise
title = json_data['name'].strip()
formats = []
for source in json_data.get('sources', []):
container = source.get('container')
source_type = source.get('type')
src = source.get('src')
if source_type == 'application/x-mpegURL' or container == 'M2TS':
if not src:
continue
formats.extend(self._extract_m3u8_formats(
src, video_id, 'mp4', m3u8_id='hls', fatal=False))
elif source_type == 'application/dash+xml':
if not src:
continue
formats.extend(self._extract_mpd_formats(src, video_id, 'dash', fatal=False))
else:
streaming_src = source.get('streaming_src')
stream_name, app_name = source.get('stream_name'), source.get('app_name')
if not src and not streaming_src and (not stream_name or not app_name):
continue
tbr = float_or_none(source.get('avg_bitrate'), 1000)
height = int_or_none(source.get('height'))
width = int_or_none(source.get('width'))
f = {
'tbr': tbr,
'filesize': int_or_none(source.get('size')),
'container': container,
'ext': container.lower(),
}
if width == 0 and height == 0:
f.update({
'vcodec': 'none',
})
else:
f.update({
'width': width,
'height': height,
'vcodec': source.get('codec'),
})
def build_format_id(kind):
format_id = kind
if tbr:
format_id += '-%dk' % int(tbr)
if height:
format_id += '-%dp' % height
return format_id
if src or streaming_src:
f.update({
'url': src or streaming_src,
'format_id': build_format_id('http' if src else 'http-streaming'),
'source_preference': 0 if src else -1,
})
else:
f.update({
'url': app_name,
'play_path': stream_name,
'format_id': build_format_id('rtmp'),
})
formats.append(f)
self._sort_formats(formats)
subtitles = {}
for text_track in json_data.get('text_tracks', []):
if text_track.get('src'):
subtitles.setdefault(text_track.get('srclang'), []).append({
'url': text_track['src'],
})
return {
'id': video_id,
'title': title,
'description': json_data.get('description'),
'thumbnail': json_data.get('thumbnail') or json_data.get('poster'),
'duration': float_or_none(json_data.get('duration'), 1000),
'timestamp': parse_iso8601(json_data.get('published_at')),
'uploader_id': account_id,
'formats': formats,
'subtitles': subtitles,
'tags': json_data.get('tags', []),
}

View File

@@ -14,9 +14,10 @@ class BYUtvIE(InfoExtractor):
'info_dict': {
'id': 'studio-c-season-5-episode-5',
'ext': 'mp4',
'description': 'md5:5438d33774b6bdc662f9485a340401cc',
'description': 'md5:e07269172baff037f8e8bf9956bc9747',
'title': 'Season 5 Episode 5',
'thumbnail': 're:^https?://.*\.jpg$'
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 1486.486,
},
'params': {
'skip_download': True,

View File

@@ -4,12 +4,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import js_to_json
class C56IE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www|player)\.)?56\.com/(?:.+?/)?(?:v_|(?:play_album.+-))(?P<textid>.+?)\.(?:html|swf)'
IE_NAME = '56.com'
_TEST = {
_TESTS = [{
'url': 'http://www.56.com/u39/v_OTM0NDA3MTY.html',
'md5': 'e59995ac63d0457783ea05f93f12a866',
'info_dict': {
@@ -18,12 +19,29 @@ class C56IE(InfoExtractor):
'title': '网事知多少 第32期车怒',
'duration': 283.813,
},
}
}, {
'url': 'http://www.56.com/u47/v_MTM5NjQ5ODc2.html',
'md5': '',
'info_dict': {
'id': '82247482',
'title': '爱的诅咒之杜鹃花开',
},
'playlist_count': 7,
'add_ie': ['Sohu'],
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url, flags=re.VERBOSE)
text_id = mobj.group('textid')
webpage = self._download_webpage(url, text_id)
sohu_video_info_str = self._search_regex(
r'var\s+sohuVideoInfo\s*=\s*({[^}]+});', webpage, 'Sohu video info', default=None)
if sohu_video_info_str:
sohu_video_info = self._parse_json(
sohu_video_info_str, text_id, transform_source=js_to_json)
return self.url_result(sohu_video_info['url'], 'Sohu')
page = self._download_json(
'http://vxml.56.com/json/%s/' % text_id, text_id, 'Downloading video info')

View File

@@ -6,7 +6,7 @@ import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse,
compat_urllib_parse_urlencode,
compat_urlparse,
)
from ..utils import (
@@ -16,7 +16,7 @@ from ..utils import (
class CamdemyIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
_TESTS = [{
# single file
'url': 'http://www.camdemy.com/media/5181/',
@@ -104,7 +104,7 @@ class CamdemyIE(InfoExtractor):
class CamdemyFolderIE(InfoExtractor):
_VALID_URL = r'http://www.camdemy.com/folder/(?P<id>\d+)'
_VALID_URL = r'https?://www.camdemy.com/folder/(?P<id>\d+)'
_TESTS = [{
# links with trailing slash
'url': 'http://www.camdemy.com/folder/450',
@@ -139,7 +139,7 @@ class CamdemyFolderIE(InfoExtractor):
parsed_url = list(compat_urlparse.urlparse(url))
query = dict(compat_urlparse.parse_qsl(parsed_url[4]))
query.update({'displayMode': 'list'})
parsed_url[4] = compat_urllib_parse.urlencode(query)
parsed_url[4] = compat_urllib_parse_urlencode(query)
final_url = compat_urlparse.urlunparse(parsed_url)
page = self._download_webpage(final_url, folder_id)

View File

@@ -0,0 +1,87 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
unified_strdate,
)
class CamWithHerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?camwithher\.tv/view_video\.php\?.*\bviewkey=(?P<id>\w+)'
_TESTS = [{
'url': 'http://camwithher.tv/view_video.php?viewkey=6e9a24e2c0e842e1f177&page=&viewtype=&category=',
'info_dict': {
'id': '5644',
'ext': 'flv',
'title': 'Periscope Tease',
'description': 'In the clouds teasing on periscope to my favorite song',
'duration': 240,
'view_count': int,
'comment_count': int,
'uploader': 'MileenaK',
'upload_date': '20160322',
},
'params': {
'skip_download': True,
}
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=6dfd8b7c97531a459937',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?page=&viewkey=6e9a24e2c0e842e1f177&viewtype=&category=',
'only_matching': True,
}, {
'url': 'http://camwithher.tv/view_video.php?viewkey=b6c3b5bea9515d1a1fc4&page=&viewtype=&category=mv',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
flv_id = self._html_search_regex(
r'<a[^>]+href=["\']/download/\?v=(\d+)', webpage, 'video id')
# Video URL construction algorithm is reverse-engineered from cwhplayer.swf
rtmp_url = 'rtmp://camwithher.tv/clipshare/%s' % (
('mp4:%s.mp4' % flv_id) if int(flv_id) > 2010 else flv_id)
title = self._html_search_regex(
r'<div[^>]+style="float:left"[^>]*>\s*<h2>(.+?)</h2>', webpage, 'title')
description = self._html_search_regex(
r'>Description:</span>(.+?)</div>', webpage, 'description', default=None)
runtime = self._search_regex(
r'Runtime\s*:\s*(.+?) \|', webpage, 'duration', default=None)
if runtime:
runtime = re.sub(r'[\s-]', '', runtime)
duration = parse_duration(runtime)
view_count = int_or_none(self._search_regex(
r'Views\s*:\s*(\d+)', webpage, 'view count', default=None))
comment_count = int_or_none(self._search_regex(
r'Comments\s*:\s*(\d+)', webpage, 'comment count', default=None))
uploader = self._search_regex(
r'Added by\s*:\s*<a[^>]+>([^<]+)</a>', webpage, 'uploader', default=None)
upload_date = unified_strdate(self._search_regex(
r'Added on\s*:\s*([\d-]+)', webpage, 'upload date', default=None))
return {
'id': flv_id,
'url': rtmp_url,
'ext': 'flv',
'no_resume': True,
'title': title,
'description': description,
'duration': duration,
'view_count': view_count,
'comment_count': comment_count,
'uploader': uploader,
'upload_date': upload_date,
}

View File

@@ -1,48 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
class Canal13clIE(InfoExtractor):
_VALID_URL = r'^http://(?:www\.)?13\.cl/(?:[^/?#]+/)*(?P<id>[^/?#]+)'
_TEST = {
'url': 'http://www.13.cl/t13/nacional/el-circulo-de-hierro-de-michelle-bachelet-en-su-regreso-a-la-moneda',
'md5': '4cb1fa38adcad8fea88487a078831755',
'info_dict': {
'id': '1403022125',
'display_id': 'el-circulo-de-hierro-de-michelle-bachelet-en-su-regreso-a-la-moneda',
'ext': 'mp4',
'title': 'El "círculo de hierro" de Michelle Bachelet en su regreso a La Moneda',
'description': '(Foto: Agencia Uno) En nueve días más, Michelle Bachelet va a asumir por segunda vez como presidenta de la República. Entre aquellos que la acompañarán hay caras que se repiten y otras que se consolidan en su entorno de colaboradores más cercanos.',
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id')
webpage = self._download_webpage(url, display_id)
title = self._html_search_meta(
'twitter:title', webpage, 'title', fatal=True)
description = self._html_search_meta(
'twitter:description', webpage, 'description')
url = self._html_search_regex(
r'articuloVideo = \"(.*?)\"', webpage, 'url')
real_id = self._search_regex(
r'[^0-9]([0-9]{7,})[^0-9]', url, 'id', default=display_id)
thumbnail = self._html_search_regex(
r'articuloImagen = \"(.*?)\"', webpage, 'thumbnail')
return {
'id': real_id,
'display_id': display_id,
'url': url,
'title': title,
'description': description,
'ext': 'mp4',
'thumbnail': thumbnail,
}

View File

@@ -9,9 +9,9 @@ from ..utils import parse_duration
class Canalc2IE(InfoExtractor):
IE_NAME = 'canalc2.tv'
_VALID_URL = r'https?://(?:www\.)?canalc2\.tv/video/(?P<id>\d+)'
_VALID_URL = r'https?://(?:(?:www\.)?canalc2\.tv/video/|archives-canalc2\.u-strasbg\.fr/video\.asp\?.*\bidVideo=)(?P<id>\d+)'
_TEST = {
_TESTS = [{
'url': 'http://www.canalc2.tv/video/12163',
'md5': '060158428b650f896c542dfbb3d6487f',
'info_dict': {
@@ -23,24 +23,36 @@ class Canalc2IE(InfoExtractor):
'params': {
'skip_download': True, # Requires rtmpdump
}
}
}, {
'url': 'http://archives-canalc2.u-strasbg.fr/video.asp?idVideo=11427&voir=oui',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'jwplayer\((["\'])Player\1\)\.setup\({[^}]*file\s*:\s*(["\'])(?P<file>.+?)\2',
webpage, 'video_url', group='file')
formats = [{'url': video_url}]
if video_url.startswith('rtmp://'):
rtmp = re.search(r'^(?P<url>rtmp://[^/]+/(?P<app>.+/))(?P<play_path>mp4:.+)$', video_url)
formats[0].update({
'url': rtmp.group('url'),
'ext': 'flv',
'app': rtmp.group('app'),
'play_path': rtmp.group('play_path'),
'page_url': url,
})
webpage = self._download_webpage(
'http://www.canalc2.tv/video/%s' % video_id, video_id)
formats = []
for _, video_url in re.findall(r'file\s*=\s*(["\'])(.+?)\1', webpage):
if video_url.startswith('rtmp://'):
rtmp = re.search(
r'^(?P<url>rtmp://[^/]+/(?P<app>.+/))(?P<play_path>mp4:.+)$', video_url)
formats.append({
'url': rtmp.group('url'),
'format_id': 'rtmp',
'ext': 'flv',
'app': rtmp.group('app'),
'play_path': rtmp.group('play_path'),
'page_url': url,
})
else:
formats.append({
'url': video_url,
'format_id': 'http',
})
self._sort_formats(formats)
title = self._html_search_regex(
r'(?s)class="[^"]*col_description[^"]*">.*?<h3>(.*?)</h3>', webpage, 'title')

View File

@@ -10,13 +10,14 @@ from ..utils import (
unified_strdate,
url_basename,
qualities,
int_or_none,
)
class CanalplusIE(InfoExtractor):
IE_DESC = 'canalplus.fr, piwiplus.fr and d8.tv'
_VALID_URL = r'https?://(?:www\.(?P<site>canalplus\.fr|piwiplus\.fr|d8\.tv|itele\.fr)/.*?/(?P<path>.*)|player\.canalplus\.fr/#/(?P<id>[0-9]+))'
_VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s'
_VIDEO_INFO_TEMPLATE = 'http://service.canal-plus.com/video/rest/getVideosLiees/%s/%s?format=json'
_SITE_ID_MAP = {
'canalplus.fr': 'cplus',
'piwiplus.fr': 'teletoon',
@@ -26,10 +27,10 @@ class CanalplusIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.canalplus.fr/c-emissions/pid1830-c-zapping.html?vid=1263092',
'md5': 'b3481d7ca972f61e37420798d0a9d934',
'md5': '12164a6f14ff6df8bd628e8ba9b10b78',
'info_dict': {
'id': '1263092',
'ext': 'flv',
'ext': 'mp4',
'title': 'Le Zapping - 13/05/15',
'description': 'md5:09738c0d06be4b5d06a0940edb0da73f',
'upload_date': '20150513',
@@ -56,10 +57,10 @@ class CanalplusIE(InfoExtractor):
'skip': 'videos get deleted after a while',
}, {
'url': 'http://www.itele.fr/france/video/aubervilliers-un-lycee-en-colere-111559',
'md5': 'f3a46edcdf28006598ffaf5b30e6a2d4',
'md5': '38b8f7934def74f0d6f3ba6c036a5f82',
'info_dict': {
'id': '1213714',
'ext': 'flv',
'ext': 'mp4',
'title': 'Aubervilliers : un lycée en colère - Le 11/02/2015 à 06h45',
'description': 'md5:8216206ec53426ea6321321f3b3c16db',
'upload_date': '20150211',
@@ -82,15 +83,16 @@ class CanalplusIE(InfoExtractor):
webpage, 'video id', group='id')
info_url = self._VIDEO_INFO_TEMPLATE % (site_id, video_id)
doc = self._download_xml(info_url, video_id, 'Downloading video XML')
video_data = self._download_json(info_url, video_id, 'Downloading video JSON')
video_info = [video for video in doc if video.find('ID').text == video_id][0]
media = video_info.find('MEDIA')
infos = video_info.find('INFOS')
if isinstance(video_data, list):
video_data = [video for video in video_data if video.get('ID') == video_id][0]
media = video_data['MEDIA']
infos = video_data['INFOS']
preference = qualities(['MOBILE', 'BAS_DEBIT', 'HAUT_DEBIT', 'HD', 'HLS', 'HDS'])
preference = qualities(['MOBILE', 'BAS_DEBIT', 'HAUT_DEBIT', 'HD'])
fmt_url = next(iter(media.find('VIDEOS'))).text
fmt_url = next(iter(media.get('VIDEOS')))
if '/geo' in fmt_url.lower():
response = self._request_webpage(
HEADRequest(fmt_url), video_id,
@@ -101,35 +103,42 @@ class CanalplusIE(InfoExtractor):
expected=True)
formats = []
for fmt in media.find('VIDEOS'):
format_url = fmt.text
for format_id, format_url in media['VIDEOS'].items():
if not format_url:
continue
format_id = fmt.tag
if format_id == 'HLS':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', preference=preference(format_id)))
format_url, video_id, 'mp4', 'm3u8_native', m3u8_id=format_id, fatal=False))
elif format_id == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url + '?hdcore=2.11.3', video_id, preference=preference(format_id)))
format_url + '?hdcore=2.11.3', video_id, f4m_id=format_id, fatal=False))
else:
formats.append({
'url': format_url,
# the secret extracted ya function in http://player.canalplus.fr/common/js/canalPlayer.js
'url': format_url + '?secret=pqzerjlsmdkjfoiuerhsdlfknaes',
'format_id': format_id,
'preference': preference(format_id),
})
self._sort_formats(formats)
thumbnails = [{
'id': image_id,
'url': image_url,
} for image_id, image_url in media.get('images', {}).items()]
titrage = infos['TITRAGE']
return {
'id': video_id,
'display_id': display_id,
'title': '%s - %s' % (infos.find('TITRAGE/TITRE').text,
infos.find('TITRAGE/SOUS_TITRE').text),
'upload_date': unified_strdate(infos.find('PUBLICATION/DATE').text),
'thumbnail': media.find('IMAGES/GRAND').text,
'description': infos.find('DESCRIPTION').text,
'view_count': int(infos.find('NB_VUES').text),
'like_count': int(infos.find('NB_LIKES').text),
'comment_count': int(infos.find('NB_COMMENTS').text),
'title': '%s - %s' % (titrage['TITRE'],
titrage['SOUS_TITRE']),
'upload_date': unified_strdate(infos.get('PUBLICATION', {}).get('DATE')),
'thumbnails': thumbnails,
'description': infos.get('DESCRIPTION'),
'duration': int_or_none(infos.get('DURATION')),
'view_count': int_or_none(infos.get('NB_VUES')),
'like_count': int_or_none(infos.get('NB_LIKES')),
'comment_count': int_or_none(infos.get('NB_COMMENTS')),
'formats': formats,
}

View File

@@ -0,0 +1,94 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import float_or_none
class CanvasIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?canvas\.be/video/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.canvas.be/video/de-afspraak/najaar-2015/de-afspraak-veilt-voor-de-warmste-week',
'md5': 'ea838375a547ac787d4064d8c7860a6c',
'info_dict': {
'id': 'mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
'display_id': 'de-afspraak-veilt-voor-de-warmste-week',
'ext': 'mp4',
'title': 'De afspraak veilt voor de Warmste Week',
'description': 'md5:24cb860c320dc2be7358e0e5aa317ba6',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 49.02,
}
}, {
# with subtitles
'url': 'http://www.canvas.be/video/panorama/2016/pieter-0167',
'info_dict': {
'id': 'mz-ast-5240ff21-2d30-4101-bba6-92b5ec67c625',
'display_id': 'pieter-0167',
'ext': 'mp4',
'title': 'Pieter 0167',
'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 2553.08,
'subtitles': {
'nl': [{
'ext': 'vtt',
}],
},
},
'params': {
'skip_download': True,
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._search_regex(
r'<h1[^>]+class="video__body__header__title"[^>]*>(.+?)</h1>',
webpage, 'title', default=None) or self._og_search_title(webpage)
video_id = self._html_search_regex(
r'data-video=(["\'])(?P<id>.+?)\1', webpage, 'video id', group='id')
data = self._download_json(
'https://mediazone.vrt.be/api/v1/canvas/assets/%s' % video_id, display_id)
formats = []
for target in data['targetUrls']:
format_url, format_type = target.get('url'), target.get('type')
if not format_url or not format_type:
continue
if format_type == 'HLS':
formats.extend(self._extract_m3u8_formats(
format_url, display_id, entry_protocol='m3u8_native',
ext='mp4', preference=0, fatal=False, m3u8_id=format_type))
elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats(
format_url, display_id, f4m_id=format_type, fatal=False))
else:
formats.append({
'format_id': format_type,
'url': format_url,
})
self._sort_formats(formats)
subtitles = {}
subtitle_urls = data.get('subtitleUrls')
if isinstance(subtitle_urls, list):
for subtitle in subtitle_urls:
subtitle_url = subtitle.get('url')
if subtitle_url and subtitle.get('type') == 'CLOSED':
subtitles.setdefault('nl', []).append({'url': subtitle_url})
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': self._og_search_description(webpage),
'formats': formats,
'duration': float_or_none(data.get('duration'), 1000),
'thumbnail': data.get('posterImageUrl'),
'subtitles': subtitles,
}

113
youtube_dl/extractor/cbc.py Normal file
View File

@@ -0,0 +1,113 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import js_to_json
class CBCIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cbc\.ca/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
# with mediaId
'url': 'http://www.cbc.ca/22minutes/videos/clips-season-23/don-cherry-play-offs',
'info_dict': {
'id': '2682904050',
'ext': 'flv',
'title': 'Don Cherry All-Stars',
'description': 'Don Cherry has a bee in his bonnet about AHL player John Scott because that guys got heart.',
'timestamp': 1454475540,
'upload_date': '20160203',
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
# with clipId
'url': 'http://www.cbc.ca/archives/entry/1978-robin-williams-freestyles-on-90-minutes-live',
'info_dict': {
'id': '2487345465',
'ext': 'flv',
'title': 'Robin Williams freestyles on 90 Minutes Live',
'description': 'Wacky American comedian Robin Williams shows off his infamous "freestyle" comedic talents while being interviewed on CBC\'s 90 Minutes Live.',
'upload_date': '19700101',
},
'params': {
# rtmp download
'skip_download': True,
},
}, {
# multiple iframes
'url': 'http://www.cbc.ca/natureofthings/blog/birds-eye-view-from-vancouvers-burrard-street-bridge-how-we-got-the-shot',
'playlist': [{
'info_dict': {
'id': '2680832926',
'ext': 'flv',
'title': 'An Eagle\'s-Eye View Off Burrard Bridge',
'description': 'Hercules the eagle flies from Vancouver\'s Burrard Bridge down to a nearby park with a mini-camera strapped to his back.',
'upload_date': '19700101',
},
}, {
'info_dict': {
'id': '2658915080',
'ext': 'flv',
'title': 'Fly like an eagle!',
'description': 'Eagle equipped with a mini camera flies from the world\'s tallest tower',
'upload_date': '19700101',
},
}],
'params': {
# rtmp download
'skip_download': True,
},
}]
@classmethod
def suitable(cls, url):
return False if CBCPlayerIE.suitable(url) else super(CBCIE, cls).suitable(url)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_init = self._search_regex(
r'CBC\.APP\.Caffeine\.initInstance\(({.+?})\);', webpage, 'player init',
default=None)
if player_init:
player_info = self._parse_json(player_init, display_id, js_to_json)
media_id = player_info.get('mediaId')
if not media_id:
clip_id = player_info['clipId']
media_id = self._download_json(
'http://feed.theplatform.com/f/h9dtGB/punlNGjMlc1F?fields=id&byContent=byReleases%3DbyId%253D' + clip_id,
clip_id)['entries'][0]['id'].split('/')[-1]
return self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id)
else:
entries = [self.url_result('cbcplayer:%s' % media_id, 'CBCPlayer', media_id) for media_id in re.findall(r'<iframe[^>]+src="[^"]+?mediaId=(\d+)"', webpage)]
return self.playlist_result(entries)
class CBCPlayerIE(InfoExtractor):
_VALID_URL = r'(?:cbcplayer:|https?://(?:www\.)?cbc\.ca/(?:player/play/|i/caffeine/syndicate/\?mediaId=))(?P<id>\d+)'
_TEST = {
'url': 'http://www.cbc.ca/player/play/2683190193',
'info_dict': {
'id': '2683190193',
'ext': 'flv',
'title': 'Gerry Runs a Sweat Shop',
'description': 'md5:b457e1c01e8ff408d9d801c1c2cd29b0',
'timestamp': 1455067800,
'upload_date': '20160210',
},
'params': {
# rtmp download
'skip_download': True,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
'http://feed.theplatform.com/f/ExhSPC/vms_5akSXx4Ng_Zn?byGuid=%s' % video_id,
'ThePlatformFeed', video_id)

View File

@@ -1,20 +1,41 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .theplatform import ThePlatformIE
from ..utils import (
xpath_text,
xpath_element,
int_or_none,
ExtractorError,
find_xpath_attr,
)
class CBSIE(InfoExtractor):
class CBSBaseIE(ThePlatformIE):
def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
return {
'en': [{
'ext': 'ttml',
'url': closed_caption_e.attrib['value'],
}]
} if closed_caption_e is not None and closed_caption_e.attrib.get('value') else []
class CBSIE(CBSBaseIE):
_VALID_URL = r'https?://(?:www\.)?(?:cbs\.com/shows/[^/]+/(?:video|artist)|colbertlateshow\.com/(?:video|podcasts))/[^/]+/(?P<id>[^/]+)'
_TESTS = [{
'url': 'http://www.cbs.com/shows/garth-brooks/video/_u7W953k6la293J7EPTd9oHkSPs6Xn6_/connect-chat-feat-garth-brooks/',
'info_dict': {
'id': '4JUVEwq3wUT7',
'id': '_u7W953k6la293J7EPTd9oHkSPs6Xn6_',
'display_id': 'connect-chat-feat-garth-brooks',
'ext': 'flv',
'ext': 'mp4',
'title': 'Connect Chat feat. Garth Brooks',
'description': 'Connect with country music singer Garth Brooks, as he chats with fans on Wednesday November 27, 2013. Be sure to tune in to Garth Brooks: Live from Las Vegas, Friday November 29, at 9/8c on CBS!',
'duration': 1495,
'timestamp': 1385585425,
'upload_date': '20131127',
'uploader': 'CBSI-NEW',
},
'params': {
# rtmp download
@@ -43,16 +64,46 @@ class CBSIE(InfoExtractor):
'url': 'http://www.colbertlateshow.com/podcasts/dYSwjqPs_X1tvbV_P2FcPWRa_qT6akTC/in-the-bad-room-with-stephen/',
'only_matching': True,
}]
TP_RELEASE_URL_TEMPLATE = 'http://link.theplatform.com/s/dJ5BDC/%s?manifest=m3u&mbr=true'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
real_id = self._search_regex(
[r"video\.settings\.pid\s*=\s*'([^']+)';", r"cbsplayer\.pid\s*=\s*'([^']+)';"],
webpage, 'real video ID')
return {
'_type': 'url_transparent',
'ie_key': 'ThePlatform',
'url': 'theplatform:%s' % real_id,
content_id = self._search_regex(
[r"video\.settings\.content_id\s*=\s*'([^']+)';", r"cbsplayer\.contentId\s*=\s*'([^']+)';"],
webpage, 'content id')
items_data = self._download_xml(
'http://can.cbs.com/thunder/player/videoPlayerService.php',
content_id, query={'partner': 'cbs', 'contentId': content_id})
video_data = xpath_element(items_data, './/item')
title = xpath_text(video_data, 'videoTitle', 'title', True)
subtitles = {}
formats = []
for item in items_data.findall('.//item'):
pid = xpath_text(item, 'pid')
if not pid:
continue
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil(
self.TP_RELEASE_URL_TEMPLATE % pid, content_id, 'Downloading %s SMIL data' % pid)
except ExtractorError:
continue
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
self._sort_formats(formats)
info = self.get_metadata('dJ5BDC/media/guid/2198311517/%s' % content_id, content_id)
info.update({
'id': content_id,
'display_id': display_id,
}
'title': title,
'series': xpath_text(video_data, 'seriesTitle'),
'season_number': int_or_none(xpath_text(video_data, 'seasonNumber')),
'episode_number': int_or_none(xpath_text(video_data, 'episodeNumber')),
'duration': int_or_none(xpath_text(video_data, 'videoLength'), 1000),
'thumbnail': xpath_text(video_data, 'previewImageURL'),
'formats': formats,
'subtitles': subtitles,
})
return info

View File

@@ -0,0 +1,108 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .theplatform import ThePlatformIE
from ..utils import int_or_none
class CBSInteractiveIE(ThePlatformIE):
_VALID_URL = r'https?://(?:www\.)?(?P<site>cnet|zdnet)\.com/(?:videos|video/share)/(?P<id>[^/?]+)'
_TESTS = [{
'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
'info_dict': {
'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
'ext': 'flv',
'title': 'Hands-on with Microsoft Windows 8.1 Update',
'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861',
'uploader': 'Sarah Mitroff',
'duration': 70,
'timestamp': 1396479627,
'upload_date': '20140402',
},
}, {
'url': 'http://www.cnet.com/videos/whiny-pothole-tweets-at-local-government-when-hit-by-cars-tomorrow-daily-187/',
'info_dict': {
'id': '56527b93-d25d-44e3-b738-f989ce2e49ba',
'ext': 'flv',
'title': 'Whiny potholes tweet at local government when hit by cars (Tomorrow Daily 187)',
'description': 'Khail and Ashley wonder what other civic woes can be solved by self-tweeting objects, investigate a new kind of VR camera and watch an origami robot self-assemble, walk, climb, dig and dissolve. #TDPothole',
'uploader_id': 'b163284d-6b73-44fc-b3e6-3da66c392d40',
'uploader': 'Ashley Esqueda',
'duration': 1482,
'timestamp': 1433289889,
'upload_date': '20150603',
},
}, {
'url': 'http://www.zdnet.com/video/share/video-keeping-android-smartphones-and-tablets-secure/',
'info_dict': {
'id': 'bc1af9f0-a2b5-4e54-880d-0d95525781c0',
'ext': 'mp4',
'title': 'Video: Keeping Android smartphones and tablets secure',
'description': 'Here\'s the best way to keep Android devices secure, and what you do when they\'ve come to the end of their lives.',
'uploader_id': 'f2d97ea2-8175-11e2-9d12-0018fe8a00b0',
'uploader': 'Adrian Kingsley-Hughes',
'timestamp': 1448961720,
'upload_date': '20151201',
},
'params': {
# m3u8 download
'skip_download': True,
}
}]
TP_RELEASE_URL_TEMPLATE = 'http://link.theplatform.com/s/kYEXFC/%s?mbr=true'
MPX_ACCOUNTS = {
'cnet': 2288573011,
'zdnet': 2387448114,
}
def _real_extract(self, url):
site, display_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, display_id)
data_json = self._html_search_regex(
r"data-(?:cnet|zdnet)-video(?:-uvp)?-options='([^']+)'",
webpage, 'data json')
data = self._parse_json(data_json, display_id)
vdata = data.get('video') or data['videos'][0]
video_id = vdata['id']
title = vdata['title']
author = vdata.get('author')
if author:
uploader = '%s %s' % (author['firstName'], author['lastName'])
uploader_id = author.get('id')
else:
uploader = None
uploader_id = None
media_guid_path = 'media/guid/%d/%s' % (self.MPX_ACCOUNTS[site], vdata['mpxRefId'])
formats, subtitles = [], {}
if site == 'cnet':
formats, subtitles = self._extract_theplatform_smil(
self.TP_RELEASE_URL_TEMPLATE % media_guid_path, video_id)
for (fkey, vid) in vdata['files'].items():
if fkey == 'hls_phone' and 'hls_tablet' in vdata['files']:
continue
release_url = self.TP_RELEASE_URL_TEMPLATE % vid
if fkey == 'hds':
release_url += '&manifest=f4m'
tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % fkey)
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
self._sort_formats(formats)
info = self.get_metadata('kYEXFC/%s' % media_guid_path, video_id)
info.update({
'id': video_id,
'display_id': display_id,
'title': title,
'duration': int_or_none(vdata.get('duration')),
'uploader': uploader,
'uploader_id': uploader_id,
'subtitles': subtitles,
'formats': formats,
})
return info

View File

@@ -1,15 +1,16 @@
# encoding: utf-8
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from .cbs import CBSBaseIE
from ..utils import (
parse_duration,
)
class CBSNewsIE(InfoExtractor):
class CBSNewsIE(CBSBaseIE):
IE_DESC = 'CBS News'
_VALID_URL = r'http://(?:www\.)?cbsnews\.com/(?:[^/]+/)+(?P<id>[\da-z_-]+)'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)'
_TESTS = [
{
@@ -30,53 +31,48 @@ class CBSNewsIE(InfoExtractor):
'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
'info_dict': {
'id': 'fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack',
'ext': 'flv',
'ext': 'mp4',
'title': 'Fort Hood shooting: Army downplays mental illness as cause of attack',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 205,
'subtitles': {
'en': [{
'ext': 'ttml',
}],
},
},
'params': {
# rtmp download
# m3u8 download
'skip_download': True,
},
},
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_info = json.loads(self._html_search_regex(
video_info = self._parse_json(self._html_search_regex(
r'(?:<ul class="media-list items" id="media-related-items"><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
webpage, 'video JSON info'))
webpage, 'video JSON info'), video_id)
item = video_info['item'] if 'item' in video_info else video_info
title = item.get('articleTitle') or item.get('hed')
duration = item.get('duration')
thumbnail = item.get('mediaImage') or item.get('thumbnail')
subtitles = {}
formats = []
for format_id in ['RtmpMobileLow', 'RtmpMobileHigh', 'Hls', 'RtmpDesktop']:
uri = item.get('media' + format_id + 'URI')
if not uri:
pid = item.get('media' + format_id)
if not pid:
continue
fmt = {
'url': uri,
'format_id': format_id,
}
if uri.startswith('rtmp'):
fmt.update({
'app': 'ondemand?auth=cbs',
'play_path': 'mp4:' + uri.split('<break>')[-1],
'player_url': 'http://www.cbsnews.com/[[IMPORT]]/vidtech.cbsinteractive.com/player/3_3_0/CBSI_PLAYER_HD.swf',
'page_url': 'http://www.cbsnews.com',
'ext': 'flv',
})
elif uri.endswith('.m3u8'):
fmt['ext'] = 'mp4'
formats.append(fmt)
release_url = 'http://link.theplatform.com/s/dJ5BDC/%s?mbr=true' % pid
tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % pid)
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
self._sort_formats(formats)
return {
'id': video_id,
@@ -84,4 +80,44 @@ class CBSNewsIE(InfoExtractor):
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
'subtitles': subtitles,
}
class CBSNewsLiveVideoIE(InfoExtractor):
IE_DESC = 'CBS News Live Videos'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/live/video/(?P<id>[\da-z_-]+)'
_TEST = {
'url': 'http://www.cbsnews.com/live/video/clinton-sanders-prepare-to-face-off-in-nh/',
'info_dict': {
'id': 'clinton-sanders-prepare-to-face-off-in-nh',
'ext': 'flv',
'title': 'Clinton, Sanders Prepare To Face Off In NH',
'duration': 334,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_info = self._parse_json(self._html_search_regex(
r'data-story-obj=\'({.+?})\'', webpage, 'video JSON info'), video_id)['story']
hdcore_sign = 'hdcore=3.3.1'
f4m_formats = self._extract_f4m_formats(video_info['url'] + '&' + hdcore_sign, video_id)
if f4m_formats:
for entry in f4m_formats:
# URLs without the extra param induce an 404 error
entry.update({'extra_param_to_segment_url': hdcore_sign})
self._sort_formats(f4m_formats)
return {
'id': video_id,
'title': video_info['headline'],
'thumbnail': video_info.get('thumbnail_url_hd') or video_info.get('thumbnail_url_sd'),
'duration': parse_duration(video_info.get('segmentDur')),
'formats': f4m_formats,
}

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor
class CBSSportsIE(InfoExtractor):
_VALID_URL = r'http://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
_VALID_URL = r'https?://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.cbssports.com/video/player/tennis/318462531970/0/us-open-flashbacks-1990s',

View File

@@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
qualities,
unified_strdate,
)
@@ -12,21 +13,25 @@ from ..utils import (
class CCCIE(InfoExtractor):
IE_NAME = 'media.ccc.de'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/[^?#]+/[^?#/]*?_(?P<id>[0-9]{8,})._[^?#/]*\.html'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/v/(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://media.ccc.de/browse/congress/2013/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor.html#video',
_TESTS = [{
'url': 'https://media.ccc.de/v/30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor#video',
'md5': '3a1eda8f3a29515d27f5adb967d7e740',
'info_dict': {
'id': '20131228183',
'id': '30C3_-_5443_-_en_-_saal_g_-_201312281830_-_introduction_to_processor_design_-_byterazor',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'description': 'md5:5ddbf8c734800267f2cee4eab187bc1b',
'description': 'md5:80be298773966f66d56cb11260b879af',
'thumbnail': 're:^https?://.*\.jpg$',
'view_count': int,
'upload_date': '20131229',
'upload_date': '20131228',
'duration': 3660,
}
}
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -40,21 +45,25 @@ class CCCIE(InfoExtractor):
title = self._html_search_regex(
r'(?s)<h1>(.*?)</h1>', webpage, 'title')
description = self._html_search_regex(
r"(?s)<p class='description'>(.*?)</p>",
r'(?s)<h3>About</h3>(.+?)<h3>',
webpage, 'description', fatal=False)
upload_date = unified_strdate(self._html_search_regex(
r"(?s)<span class='[^']*fa-calendar-o'></span>(.*?)</li>",
r"(?s)<span[^>]+class='[^']*fa-calendar-o'[^>]*>(.+?)</span>",
webpage, 'upload date', fatal=False))
view_count = int_or_none(self._html_search_regex(
r"(?s)<span class='[^']*fa-eye'></span>(.*?)</li>",
webpage, 'view count', fatal=False))
duration = parse_duration(self._html_search_regex(
r'(?s)<span[^>]+class=(["\']).*?fa-clock-o.*?\1[^>]*></span>(?P<duration>.+?)</li',
webpage, 'duration', fatal=False, group='duration'))
matches = re.finditer(r'''(?xs)
<(?:span|div)\s+class='label\s+filetype'>(?P<format>.*?)</(?:span|div)>\s*
<(?:span|div)\s+class='label\s+filetype'>(?P<format>[^<]*)</(?:span|div)>\s*
<(?:span|div)\s+class='label\s+filetype'>(?P<lang>[^<]*)</(?:span|div)>\s*
<a\s+download\s+href='(?P<http_url>[^']+)'>\s*
(?:
.*?
<a\s+href='(?P<torrent_url>[^']+\.torrent)'
<a\s+(?:download\s+)?href='(?P<torrent_url>[^']+\.torrent)'
)?''', webpage)
formats = []
for m in matches:
@@ -62,12 +71,15 @@ class CCCIE(InfoExtractor):
format_id = self._search_regex(
r'.*/([a-z0-9_-]+)/[^/]*$',
m.group('http_url'), 'format id', default=None)
if format_id:
format_id = m.group('lang') + '-' + format_id
vcodec = 'h264' if 'h264' in format_id else (
'none' if format_id in ('mp3', 'opus') else None
)
formats.append({
'format_id': format_id,
'format': format,
'language': m.group('lang'),
'url': m.group('http_url'),
'vcodec': vcodec,
'preference': preference(format_id),
@@ -95,5 +107,6 @@ class CCCIE(InfoExtractor):
'thumbnail': thumbnail,
'view_count': view_count,
'upload_date': upload_date,
'duration': duration,
'formats': formats,
}

96
youtube_dl/extractor/cda.py Executable file
View File

@@ -0,0 +1,96 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
decode_packed_codes,
ExtractorError,
parse_duration
)
class CDAIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
_TESTS = [{
'url': 'http://www.cda.pl/video/5749950c',
'md5': '6f844bf51b15f31fae165365707ae970',
'info_dict': {
'id': '5749950c',
'ext': 'mp4',
'height': 720,
'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
'duration': 39
}
}, {
'url': 'http://www.cda.pl/video/57413289',
'md5': 'a88828770a8310fc00be6c95faf7f4d5',
'info_dict': {
'id': '57413289',
'ext': 'mp4',
'title': 'Lądowanie na lotnisku na Maderze',
'duration': 137
}
}, {
'url': 'http://ebd.cda.pl/0x0/5749950c',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage('http://ebd.cda.pl/0x0/' + video_id, video_id)
if 'Ten film jest dostępny dla użytkowników premium' in webpage:
raise ExtractorError('This video is only available for premium users.', expected=True)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
formats = []
info_dict = {
'id': video_id,
'title': title,
'formats': formats,
'duration': None,
}
def extract_format(page, version):
unpacked = decode_packed_codes(page)
format_url = self._search_regex(
r"url:\\'(.+?)\\'", unpacked, '%s url' % version, fatal=False)
if not format_url:
return
f = {
'url': format_url,
}
m = re.search(
r'<a[^>]+data-quality="(?P<format_id>[^"]+)"[^>]+href="[^"]+"[^>]+class="[^"]*quality-btn-active[^"]*">(?P<height>[0-9]+)p',
page)
if m:
f.update({
'format_id': m.group('format_id'),
'height': int(m.group('height')),
})
info_dict['formats'].append(f)
if not info_dict['duration']:
info_dict['duration'] = parse_duration(self._search_regex(
r"duration:\\'(.+?)\\'", unpacked, 'duration', fatal=False))
extract_format(webpage, 'default')
for href, resolution in re.findall(
r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
webpage):
webpage = self._download_webpage(
href, video_id, 'Downloading %s version information' % resolution, fatal=False)
if not webpage:
# Manually report warning because empty page is returned when
# invalid version is requested.
self.report_warning('Unable to download %s version information' % resolution)
continue
extract_format(webpage, resolution)
self._sort_formats(formats)
return info_dict

Some files were not shown because too many files have changed in this diff Show More