Compare commits


445 Commits

Author SHA1 Message Date
Philipp Hagemeister
0be8314dc8 release 2016.03.25 2016-03-25 09:27:18 +01:00
Yen Chi Hsuan
d7f62b049a [iqiyi] Update enc_key 2016-03-25 15:45:40 +08:00
Yen Chi Hsuan
3bb3356812 [douyutv] Extend _VALID_URL 2016-03-25 15:43:29 +08:00
Sergey M․
3f15fec1d1 Credit @Kagami for mnet (#8958) 2016-03-25 03:56:27 +06:00
Sergey M․
98e68806fb [mnet] Improve (Closes #8958) 2016-03-25 03:26:29 +06:00
Kagami Hiiragi
e031768666 [mnet] Add new extractor 2016-03-25 02:32:06 +06:00
Sergey M․
5eb7db4ee9 [udemy] Add support for new URL schema 2016-03-25 02:28:39 +06:00
Sergey M․
f0e83681d9 [udemy] Extract formats from outputs 2016-03-25 02:27:13 +06:00
Sergey M․
ff9d5d0938 [udemy] Improve course enrolling 2016-03-25 02:26:46 +06:00
Sergey M․
d041a73674 [extractor/__init__] Add youtube:live and sort youtube extractors alphabetically 2016-03-25 01:39:25 +06:00
Sergey M․
f07e276a04 [youtube:live] Add extractor (Closes #8959) 2016-03-25 01:18:14 +06:00
Sergey M․
993271da0a [nytimes] Tolerate missing metadata (Closes #8952) 2016-03-24 23:28:24 +06:00
Sergey M․
369e7e3ff0 [iprima] Fix extraction (Closes #8953) 2016-03-24 22:54:26 +06:00
Sergey M․
5767b4eeae [mtv] Fix description extraction (Closes #8962) 2016-03-24 22:23:31 +06:00
Yen Chi Hsuan
622d19160b [utils] Clarify Python versions affected by buggy struct module 2016-03-24 18:06:15 +08:00
Yen Chi Hsuan
32d88410eb [tumblr] Add a test with Instagram embed
Closes #8817
2016-03-24 16:32:53 +08:00
Yen Chi Hsuan
5a51775a58 [generic] Extract Instagram embeds (#8817) 2016-03-24 16:32:27 +08:00
Yen Chi Hsuan
87696e78d7 [instagram] Unescape description (#8817) 2016-03-24 16:30:01 +08:00
Yen Chi Hsuan
c4096e8aea [instagram] Extract embed videos (#8817) 2016-03-24 16:29:33 +08:00
Yen Chi Hsuan
fc27ea9464 [tumblr] Support Vine embeds (#8817) 2016-03-23 23:55:52 +08:00
Yen Chi Hsuan
088e1aac59 [generic] Support Vine embeds (#8817) 2016-03-23 23:55:08 +08:00
Yen Chi Hsuan
81f36eba88 [test/test_utils] Update for escape_url change (again) 2016-03-23 23:23:26 +08:00
Yen Chi Hsuan
2d60465e44 [test/test_utils] Update for escape_url change 2016-03-23 23:20:28 +08:00
Sergey M
4333d56494 Merge pull request #8898 from dstftw/fragment-retries
Add --fragment-retries option (Fixes #8466)
2016-03-23 20:12:32 +05:00
Sergey M․
882c699296 [tunein] Fix stream data extraction (Closes #8899, closes #8924) 2016-03-23 20:45:39 +06:00
Yen Chi Hsuan
efbed08dc2 [utils] Encode hostnames before passing to urllib
With IDN (Internationalized Domain Name) and a proxy, non-ascii URLs
are passed down to urllib/urllib2, causing UnicodeEncodeError

Fixes #8890
2016-03-23 22:24:52 +08:00
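For illustration, a minimal sketch of the idea (the helper name is hypothetical; this is not the actual utils code): encode only the hostname with IDNA so that urllib/urllib2 never sees a non-ASCII netloc.

```
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2

def encode_idn_url(url):
    # Encode just the hostname with IDNA; the rest of the URL is untouched.
    parts = urlsplit(url)
    host = parts.hostname.encode('idna').decode('ascii')
    netloc = host + (':%d' % parts.port if parts.port else '')
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

print(encode_idn_url('http://тест.рф/video'))  # http://xn--e1aybc.xn--p1ai/video
```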
Jaime Marquínez Ferrándiz
7da2c87119 Add extractor for thescene.com (closes #8929) 2016-03-22 22:17:59 +01:00
Sergey M․
c6ca11f1b3 [once] Prevent ads from embedding into m3u8 playlists (Closes #8893) 2016-03-22 23:48:05 +06:00
Sergey M․
2beeb286e1 [laola1tv] Add support for livestreams (Closes #8934) 2016-03-22 22:32:59 +06:00
Sergey M․
cc7397b04d [ceskatelevize] Make m3u8 formats extraction non fatal (Closes #8933) 2016-03-22 21:12:29 +06:00
Sergey M․
bc5d16b302 [animeondemand] Skip dash for now 2016-03-21 23:37:39 +06:00
Sergey M․
85c637b737 [animeondemand] Extract teaser when no full episode available (#8923) 2016-03-21 23:35:50 +06:00
Sergey M․
5c69f7a479 [animeondemand] Respect startvideo (Closes #8923) 2016-03-21 23:31:40 +06:00
Sergey M․
ff5873b72d [motherless] Detect friends only videos 2016-03-21 22:24:42 +06:00
Sergey M․
065c4b27bf [xhamster:embed] Extract vars (Closes #8912) 2016-03-21 22:07:34 +06:00
Sergey M․
1600ed1ff9 [rutv] Improve flash version pattern (Closes #8911) 2016-03-21 21:46:49 +06:00
Sergey M․
5886b38d73 Add support for https for all extractors as a preventive and future-proof measure 2016-03-21 21:36:32 +06:00
Sergey M․
0cef27ad25 Add missing r prefix for _VALID_URLs 2016-03-21 21:22:37 +06:00
Sergey M․
12af4beb3e [mailru] Add support for https (Closes #8920) 2016-03-21 21:17:29 +06:00
Sergey M․
9016d76f71 [YoutubeDL] Improve _format_note 2016-03-20 22:01:45 +06:00
Sergey M․
3c5d183c19 [animeondemand] Extract all formats (Closes #8906) 2016-03-20 21:51:22 +06:00
Sergey M․
3e8bb9a972 [animeondemand] Detect geo restriction 2016-03-20 20:39:00 +06:00
Yen Chi Hsuan
daef04a4e7 [kuwo] Fix KuwoChartIE and KuwoSingerIE and accept new URL forms 2016-03-20 20:17:56 +08:00
Yen Chi Hsuan
7caae128a7 Credit @vitstradal for the key algorithm in OpenloadIE (#8489)
[ci skip]
2016-03-20 19:12:02 +08:00
Yen Chi Hsuan
2648918c81 [vlive] Fix creator extraction (closes #8814) 2016-03-20 18:15:53 +08:00
Jaime Marquínez Ferrándiz
920d318d3c README: document that BSD make is also supported (#8902) 2016-03-20 10:55:14 +01:00
Yen Chi Hsuan
9e3c2f1d74 [openload] Misc improvements
* Add thumbnail
* Detect errors (#6469)
* Match more (#6469, #8489)
2016-03-20 16:49:44 +08:00
Yen Chi Hsuan
2bfeee69b9 [openload] Add new extractor (closes #8489) 2016-03-20 15:54:58 +08:00
Yen Chi Hsuan
664bcd80b9 [tudou] Use InAdvancePagedList (closes #8884) 2016-03-20 15:45:31 +08:00
Sergey M․
3c20208eff [francetv] Improve formats extraction 2016-03-20 13:00:46 +06:00
Sergey M․
db264e3cc3 [francetvinfo] Add support for france3-regions and strip title (Closes #7673) 2016-03-20 12:44:04 +06:00
Sergey M
d396f30467 Merge pull request #8902 from jaimeMF/bmake
Makefile: make it compatible with bmake
2016-03-20 11:08:57 +05:00
Sergey M․
96a9f22d98 [discovery] Relax _VALID_URL (Closes #8903) 2016-03-20 10:26:58 +06:00
Sergey M․
40025ee2a3 [postprocessor/ffmpeg] Allow embedding webvtt into webm (Closes #8874) 2016-03-20 04:12:34 +06:00
Jaime Marquínez Ferrándiz
3ff63fb365 Makefile: make it compatible with bmake
It's the portable version of BSD make: http://crufty.net/help/sjg/bmake.html
The syntax for conditionals is different in GNU make and BSD make, so we use the shell
2016-03-19 21:51:13 +01:00
Jaime Marquínez Ferrándiz
5c7cd37ebd tox.ini: Exclude test_iqiyi_sdk_interpreter.py 2016-03-19 21:50:16 +01:00
Sergey M․
298c04b464 [91porn] Use common messages' wording 2016-03-20 02:35:48 +06:00
Sergey M․
d95114dd83 [91porn] Unquote final URL (Closes #8881) 2016-03-20 02:34:02 +06:00
Sergey M․
94dcade8f8 Credit @jjatria for biobiochiletv (#7314) 2016-03-20 01:36:20 +06:00
Sergey M․
fa023ccb2c [biobiochiletv] Fix extraction, extract m3u8 formats and overall improve (Closes #7314) 2016-03-20 01:31:55 +06:00
jjatria
e36f4aa72b [biobiotv] Add extractor 2016-03-20 01:29:08 +06:00
Sergey M․
9261e347cc Credit @kasper93 for cda (#8805) 2016-03-19 23:18:04 +06:00
Sergey M․
f1ced6df51 [cda] Improve and simplify (Closes #8805) 2016-03-19 23:17:14 +06:00
Kacper Michajłow
8b0d7a66ef [cda] Add new extractor for cda.pl
Fixes #8760
2016-03-19 22:42:40 +06:00
Sergey M․
3aec71766d [safari:api] Separate extractor (Closes #8871) 2016-03-19 22:30:48 +06:00
Sergey M․
16a8b7986b [downloader/fragment] Document fragment_retries 2016-03-19 20:54:21 +06:00
Sergey M․
617e58d850 [downloader/{common,fragment}] Fix total retries reporting on python 2.6 2016-03-19 20:51:30 +06:00
Sergey M․
e33baba0dd [downloader/dash] Add fragment retry capability
YouTube often returns a 404 HTTP error for a fragment, causing the
whole download to fail. However, if the same fragment is immediately
retried with the same request data, it usually succeeds (1-2 attempts
are usually enough), allowing the whole file to be downloaded. So, for
now, we retry all fragments that fail with a 404 HTTP error.
2016-03-19 20:42:23 +06:00
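A minimal sketch of the retry loop described above, assuming a hypothetical http_get callable (the real logic lives in downloader/dash.py and downloader/fragment.py):

```
try:
    from urllib.error import HTTPError  # Python 3
except ImportError:
    from urllib2 import HTTPError  # Python 2

def download_fragment(http_get, fragment_url, fragment_retries=10):
    # Retry fragments failing with HTTP 404 up to --fragment-retries times.
    count = 0
    while count <= fragment_retries:
        try:
            return http_get(fragment_url)
        except HTTPError as err:
            if err.code != 404:
                raise
            count += 1  # report_retry_fragment() is called here in the real code
    raise IOError('fragment still returns 404 after %d retries' % fragment_retries)
```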
Sergey M․
721f26b821 [downloader/fragment] Add report_retry_fragment 2016-03-19 20:41:24 +06:00
Sergey M․
52bb437e41 [options] Add --fragment-retries option 2016-03-19 20:40:36 +06:00
Jaime Marquínez Ferrándiz
782b1b5bd1 [utils] lookup_unit_table: Match word boundary instead of end of string 2016-03-19 11:44:49 +01:00
Sergey M․
0d769bcb78 [extractor/generic] Fix missing byte literal prefix 2016-03-19 05:43:43 +06:00
remitamine
4cd70099ea [hbo] Add new extractor 2016-03-18 21:18:18 +01:00
Jaime Marquínez Ferrándiz
09fc33198a utils: lookup_unit_table: Use a stricter regex
In parse_count multiple units start with the same letter, so the lookup could match different units depending on the order in which they were iterated over.
2016-03-18 19:23:06 +01:00
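A simplified sketch of the resulting matching, combining this with the word-boundary change from 782b1b5bd1 above (not the actual utils.py code):

```
import re

def lookup_unit_table(unit_table, s):
    # The unit must end at a word boundary (\b), so 'M' can no longer
    # shadow 'MiB' regardless of iteration order, while trailing text
    # after the unit is still tolerated.
    units_re = '|'.join(re.escape(u) for u in unit_table)
    m = re.match(r'(?P<num>[\d,.]+)\s*(?P<unit>%s)\b' % units_re, s)
    if not m:
        return None
    return int(float(m.group('num').replace(',', '')) * unit_table[m.group('unit')])

print(lookup_unit_table({'M': 10 ** 6, 'MiB': 1024 ** 2}, '1.5MiB of data'))  # 1572864
```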
Sergey M․
4c3b16d5d1 [test_YoutubeDL] Add test for format_id format selection 2016-03-19 00:04:26 +06:00
John Peel
d5aacf9a90 Added format_id to the filters on -f. 2016-03-18 23:59:24 +06:00
Sergey M․
19e2617a6f [commonprotocols] Add generic support for rtmp URLs (Closes #8488) 2016-03-18 23:42:15 +06:00
Sergey M․
edd9b71c2c [extractor/generic] Add a test for m3u playlist served without proper Content-Type 2016-03-18 22:49:11 +06:00
Sergey M․
5940862d5a [extractor/generic] Detect m3u playlists served without proper Content-Type 2016-03-18 22:45:28 +06:00
Sergey M․
de6c51e88e [extractor/generic] Fix direct link semantics 2016-03-18 22:43:07 +06:00
Sergey M․
303dcdb995 [extractor/generic] Simplify upload_date extraction 2016-03-18 22:41:16 +06:00
Sergey M․
20938f768b [extractor/generic] Add another test for generic m3u8 2016-03-18 21:54:33 +06:00
Sergey M․
955737b2d4 [extractor/generic] Force Content-Type to lowercase 2016-03-18 21:50:44 +06:00
Sergey M․
263eff9537 [extractor/generic] Properly extract format id from Content-Type
Fixes extraction for cases like: audio/x-mpegURL; charset=utf-8
2016-03-18 21:50:10 +06:00
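A sketch of the parsing being fixed (hypothetical helper; the real code is in extractor/generic.py): drop any header parameters before deriving a format id from the media type.

```
def format_id_from_content_type(content_type):
    media_type = content_type.split(';')[0].strip().lower()
    return media_type.split('/')[-1]

print(format_id_from_content_type('audio/x-mpegURL; charset=utf-8'))  # x-mpegurl
```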
Sergey M․
cae21032ab [theplatform] Improve geo restriction detection 2016-03-18 21:08:25 +06:00
remitamine
6187091532 [once] check http formats availability 2016-03-18 11:51:34 +01:00
Philipp Hagemeister
0d33166ec5 release 2016.03.18 2016-03-18 11:43:48 +01:00
remitamine
87c03c6bd2 [theplatform] remove unnecessary import 2016-03-18 09:43:28 +01:00
remitamine
4c92fd2e83 [theplatform] always force theplatform to return a smil for _extract_theplatform_smil 2016-03-18 09:22:10 +01:00
Sergey M․
e3d17b3c07 [noz] Fix extraction on python 2.6 by means of using compat_xpath 2016-03-18 02:54:27 +06:00
Sergey M․
810c10baa1 [utils] Use compat_xpath 2016-03-18 02:52:23 +06:00
Sergey M․
57f7e3c62d [compat] Add compat_xpath 2016-03-18 02:51:38 +06:00
Sergey M․
0d0e282912 [animeondemand] Fix typo and improve 2016-03-18 00:13:50 +06:00
Sergey M․
85e8f26b82 [animeondemand] Improve extraction 2016-03-18 00:02:34 +06:00
Sergey M․
b57fecfddd [animeondemand] Add test 2016-03-17 23:50:10 +06:00
Sergey M․
8c97e7efb6 [animeondemand] Expand episode title regex (Closes #8875) 2016-03-17 23:43:14 +06:00
Sergey M․
cc162f6a0a [crunchyroll] Fix custom _download_webpage (Closes #8883) 2016-03-17 22:55:04 +06:00
remitamine
cf45ed786e [wistia] extract more metadata 2016-03-17 17:48:17 +01:00
remitamine
574b2a7393 [nbc:nbcnews] improve extraction(fixes #6922)
- extract more metadata and formats
- relax regex
2016-03-17 16:11:29 +01:00
remitamine
9f02ff537c [theplatform] extract brightcove once formats 2016-03-17 16:11:29 +01:00
remitamine
0436ec0e7a [once] Add new format extractor 2016-03-17 16:11:29 +01:00
Yen Chi Hsuan
11f12195af [youtube] Added itag 91
Seen in https://www.youtube.com/watch?v=jMN4cxyhJjk
2016-03-17 19:25:37 +08:00
remitamine
a646a8cf98 [sbs] improve extraction (fixes #3811)
- extract error messages
- force the platform SMIL URL (previously the manifest param in the
query was not respected, which made theplatform return non-working
mp4 files for some videos)
2016-03-17 02:07:06 +01:00
remitamine
63f41d3821 [bravotv] Add new extractor(#4657) 2016-03-16 21:26:25 +01:00
Sergey M․
c5229f3926 [utils] PEP 8 2016-03-16 21:50:04 +06:00
Sergey M․
96f4f796fb [brightcove] Remove unused import 2016-03-16 21:47:51 +06:00
Sergey M․
70cab344c4 [udemy] Improve course id v4 regex 2016-03-16 21:46:09 +06:00
Quan Hua
a7ba57dc17 [udemy] Update course id regex to cover v4 layout (Closes #8753, closes #8868, closes #8870) 2016-03-16 21:45:01 +06:00
remitamine
83548824c2 Merge pull request #8092 from bpfoley/twitter-thumbnail
[utils] Add extract_attributes for extracting html tag attributes
2016-03-16 13:16:27 +01:00
remitamine
354dbbd880 [brightcove:new] extract protocol-less embed URLs(closes #2914) 2016-03-16 11:46:53 +01:00
remitamine
23edc49509 [tv3] Add new extractor(closes #8059) 2016-03-16 10:47:39 +01:00
remitamine
48254c3f2c [brightcove] some improvements and fixes
- use FFmpeg downloader to download m3u8 formats extracted
from BrightcoveNew(some of the m3u8 media playlists use AES-128)
- update comment and update_url_query to handle url query
2016-03-16 09:21:07 +01:00
remitamine
2cab48704c [thestar] Add new extractor(closes #5955) 2016-03-15 23:10:31 +01:00
remitamine
64d4f31d78 [brightcove:new] update embed_in_page embeds regex to match non numeric ref id 2016-03-15 22:50:43 +01:00
remitamine
0c9ff24041 [noz] fix extraction in python 2.6 2016-03-15 21:00:39 +01:00
Yen Chi Hsuan
3ff8279e80 [kuwo:mv] Fix the test and extraction of georestricted MVs 2016-03-16 02:41:18 +08:00
remitamine
cb6e477dfe [aljazeera] update the extractor to use BrightcoveNewIE 2016-03-15 19:38:10 +01:00
remitamine
edfd93518e [svt] extract dashhbbtv formats(#8867) 2016-03-15 19:33:09 +01:00
remitamine
89807d6a82 [brightcove] extract dash formats and detect audio formats 2016-03-15 18:48:21 +01:00
remitamine
49dea4913b Merge pull request #8513 from remitamine/dash-sort
[extractor/common] fix dash formats sorting
2016-03-15 18:39:50 +01:00
Sergey M․
dec2cae0a7 [twitch:playlistbase] Clarify pagination bug
Pagination bug has been fixed by twitch on 15.03.2016.
2016-03-15 21:45:43 +06:00
remitamine
cf6cd07396 [noz] extract f4m and m3u8 formats 2016-03-15 15:24:12 +01:00
remitamine
975b9c9ab0 [brightcove:new] detect m3u8 manifests by M2TS container 2016-03-15 10:06:53 +01:00
remitamine
8ac73bdbe4 [brightcove:new] Add support for non-numeric ref: prefixed video ids 2016-03-15 10:03:08 +01:00
remitamine
877f440f7b [rice] Add new extractor(closes #1736) 2016-03-15 00:49:23 +01:00
remitamine
d13bdc3824 [brightcove] raise ExtractorError on 403 errors and fix regex to work with tenplay 2016-03-14 22:24:52 +01:00
remitamine
744daf9418 [gameinformer] remove unused imports 2016-03-14 21:57:26 +01:00
remitamine
bf475e1990 [tlc] fix extraction and update extractor to use BrightcoveNewIE 2016-03-14 21:53:00 +01:00
remitamine
203f3d779a [gameinformer] update the extractor to use BrightcoveNewIE 2016-03-14 18:32:29 +01:00
remitamine
4230c4894d [external/downloader] fix rtmp downloading using FFmpegFD 2016-03-14 16:51:01 +01:00
Philipp Hagemeister
6bb266693f release 2016.03.14 2016-03-14 10:25:20 +01:00
remitamine
5d53c32701 [usatoday] Add new extractor(closes #8655) 2016-03-13 22:36:15 +01:00
remitamine
2e7e561c1d Merge pull request #8611 from remitamine/ffmpegfd
[downloader/external] Add FFmpegFD
2016-03-13 21:30:27 +01:00
remitamine
d8515fd41c [downloader/external] pass configuration args to ffmpeg 2016-03-13 21:28:26 +01:00
remitamine
694c47b261 [external/downloader] don't pass -t and -ss to ffmpeg 2016-03-13 21:28:16 +01:00
remitamine
77dea16ac8 [downloader/external] check for ffmpeg availability when it is used for m3u8 download 2016-03-13 20:34:51 +01:00
remitamine
6ae27bed01 [download/external] move the check for multiple selected formats to get_suitable_downloader 2016-03-13 20:34:38 +01:00
remitamine
da1973a038 [extractor/__init__] disable time range downloading 2016-03-13 16:16:26 +01:00
remitamine
be24916a7f [downloader/rtsp] Add rtsp and mms downloader 2016-03-13 15:24:02 +01:00
remitamine
2cb99ebbd0 [downloader/external] add can_download method for checking downloader availability and support 2016-03-13 15:18:51 +01:00
remitamine
91ee320bfa [downloader/external] wrap available_opt in a list 2016-03-13 14:37:45 +01:00
remitamine
8fb754bcd0 Merge pull request #8821 from remitamine/list-thumbnails-order
[YoutubeDL] check for --list-thumbnails immediately after processing them
2016-03-13 12:44:50 +01:00
remitamine
b7b72db9ad [YoutubeDL] check for --list-thumbnails immediately after processing them 2016-03-13 12:41:15 +01:00
remitamine
634415ca17 [downloader/external] skip FFmpegFD when requesting multiple formats 2016-03-13 12:23:10 +01:00
Sergey M․
2f7ae819ac [utils] PEP 8 2016-03-13 17:23:08 +06:00
Sergey M․
0a477f8731 [vice:show] Add extractor (Closes #8847) 2016-03-13 17:22:23 +06:00
remitamine
a755f82549 [ffmpeg] convert format ext to ffmpeg output format codes 2016-03-13 12:15:29 +01:00
Sergey M․
7f4173ae7c [mixcloud] Fix view count extraction (Closes #8831, closes #8845) 2016-03-13 16:27:58 +06:00
Sergey M․
fb47597b09 [bbc] Generalize unit table lookup and add parse_count 2016-03-13 16:27:20 +06:00
Sergey M․
450b233cc2 [bbc] Update test 2016-03-13 15:59:54 +06:00
Sergey M․
b7d7674f1e [bbc] Update test 2016-03-13 15:56:34 +06:00
Sergey M․
0e832c2c97 [bbc] Improve title and description extraction (Closes #8826, closes #8822) 2016-03-13 15:54:56 +06:00
Benjamin Congdon
8e4aa7bf18 [bbc] Fix BBC Extractor to work with 'School Report' 2016-03-13 15:54:34 +06:00
remitamine
a42dfa629e [makerschannel] Add new extractor(closes #8839) 2016-03-12 22:52:53 +01:00
remitamine
b970dfddaf [minoto] Add new extractor 2016-03-12 22:52:53 +01:00
Sergey M․
46a4ea8276 [safari] Remove unused imports 2016-03-13 03:48:38 +06:00
Sergey M․
3f2f4a94aa [extractor/generic] Extract f4m formats from final URLs 2016-03-13 03:38:20 +06:00
Sergey M․
f930e0c76e [extractor/generic] Extract f4m formats and refactor common info 2016-03-13 03:17:25 +06:00
Sergey M․
0fdbb3322b [extractor/common] Add _parse_f4m_formats routine 2016-03-13 03:16:08 +06:00
Sergey M․
e9c8999ede [safari] Fix authentication 2016-03-13 02:08:36 +06:00
Sergey M․
73cbd709f9 [safari] Respect kaltura session (Closes #7491) 2016-03-13 02:03:07 +06:00
Sergey M․
9dce3c095b [kaltura] Respect kaltura session 2016-03-13 02:01:10 +06:00
remitamine
e5a2e17a9c [kaltura] optimize url info extraction 2016-03-12 18:43:45 +01:00
remitamine
0ec589fac3 Merge pull request #8827 from remitamine/safari
[safari] extract free and preview videos(#7491)
2016-03-12 17:28:54 +01:00
remitamine
36bb63e084 [dw] add support for article pages(closes #8790) 2016-03-12 08:33:22 +01:00
remitamine
91d6aafb48 [dw] add support for audio pages 2016-03-11 23:55:26 +01:00
remitamine
c8868a9d83 [dw] Add new extractor 2016-03-11 22:44:18 +01:00
remitamine
09f572fbc0 [extractor/common] add transform_source to _download_smil and _extract_smil_formats 2016-03-11 22:37:07 +01:00
Sergey M․
58e6d097d8 [googledrive] Relax _VALID_URL (Closes #8829) 2016-03-12 00:36:39 +06:00
remitamine
15bf934de5 Merge pull request #8819 from remitamine/simple-webpage-requests
[extractor/common] simplify using data, headers and query params with _download_* methods
2016-03-11 18:19:43 +01:00
remitamine
cdfee16818 [extractor/common] add data, headers and query params to _request_webpage 2016-03-11 18:12:50 +01:00
remitamine
bcb668de18 [safari] extract free and preview videos(#7491) 2016-03-11 16:57:06 +01:00
remitamine
fac7e79277 [kaltura] add support for videos with reference id 2016-03-11 16:52:07 +01:00
Yen Chi Hsuan
a6c8b75904 [common] Use mimeType to determine file extensions (#8766) 2016-03-11 23:51:42 +08:00
Yen Chi Hsuan
25cb05bda9 [utils] Remove codec2ext
This function was originally used for determining file extensions of DASH
formats. Now, in DASH, ext is determined by mime_type. See #8766 for more
information.
2016-03-11 23:51:42 +08:00
Sergey M․
6fa6d38549 Credit @benjamincongdon for audioboom (#8812) 2016-03-11 19:46:06 +06:00
Sergey M․
883c052378 [audioboom] Improve robustness and extract uploader (Closes #8812) 2016-03-11 19:44:17 +06:00
Benjamin Congdon
61f317c24c Added extractor for AudioBoom.com 2016-03-11 19:43:01 +06:00
Yen Chi Hsuan
64f08d4ff2 Merge pull request #8766 from yan12125/dash-detect-ext
Detect file extensions of DASH formats from their codecs
2016-03-11 21:40:07 +08:00
Yen Chi Hsuan
e738e43358 [facebook] Support videos in groups
Viewing/Downloading videos in groups requires logging in, even for
those in public groups.

Fixes #6951.
2016-03-11 16:20:27 +08:00
Jaime Marquínez Ferrándiz
f6f6217a98 [facebook] Don't override variable in list comprehension 2016-03-10 15:17:04 +01:00
Yen Chi Hsuan
31db8709bf [iqiyi] Update enc_key 2016-03-10 21:37:26 +08:00
Yen Chi Hsuan
5080cbf9fd [facebook] Handle escaped swf params
Fixes #8713
2016-03-10 15:26:32 +08:00
Yen Chi Hsuan
9880124196 [facebook] Fix for m.facebook.com URLs 2016-03-10 14:59:30 +08:00
Yen Chi Hsuan
9c7b509b2a [facebook] Merge FacebookPostIE into FacebookIE
Fixes #8713
2016-03-10 14:59:30 +08:00
Sergey M․
e0dccdd398 [test_YoutubeDL] PEP 8 2016-03-10 09:04:48 +06:00
Sergey M․
5d583bdf6c [YoutubeDL] Improve _format_note 2016-03-10 01:03:18 +06:00
Sergey M․
1e501364d5 [vimeo:ondemand] Clarify IE_NAME 2016-03-10 00:52:52 +06:00
Sergey M․
74278def2e [vimeo:ondemand] Separate ondemand extractor (Closes #8330, closes #8801) 2016-03-10 00:51:07 +06:00
Sergey M․
e375a149e1 [livestream] Properly build smil URLs (#8794) 2016-03-09 23:11:09 +06:00
Sergey M
2bfc0e97f6 Merge pull request #8791 from benjamincongdon/Twitch-AudioOnly-Rebased
[twitch] Support for "Audio_Only" format
2016-03-08 13:02:56 +06:00
Benjamin Congdon
ac45505528 Added flag for 'allow_audio_only' format in Twitch queries 2016-03-07 21:03:24 -06:00
Sergey M․
7404061141 Credit @mutantmonkey for ustudio (#8574) and kusi (#8575) 2016-03-07 02:30:47 +06:00
Sergey M․
46c329d6f6 [arte] Improve extraction (Closes #8768) 2016-03-07 02:19:54 +06:00
Sergey M․
1818e4c2b4 [arte] Fix typo 2016-03-07 02:10:16 +06:00
Sergey M․
e7bd17373d [sexu] Improve extraction (Closes #8782) 2016-03-06 18:08:53 +06:00
aystroganov@gmail.com
c58e74062f [Sexu] fix extractor 2016-03-06 17:53:22 +06:00
Yen Chi Hsuan
6d210f2090 [utils] Add more codecs to codec2ext
BBC uses avc3. Here's an example (thanks to @remitamine for this example)

http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd

See also https://trac.ffmpeg.org/ticket/5217
2016-03-06 17:57:48 +08:00
Yen Chi Hsuan
af7d5a63b2 [common] Document protocol http_dash_segments 2016-03-06 17:47:07 +08:00
Yen Chi Hsuan
e41acb6364 [safari] Don't pollute std_headers (#8778) 2016-03-06 17:38:39 +08:00
Philipp Hagemeister
bdf7f13954 release 2016.03.06 2016-03-06 10:08:02 +01:00
Yen Chi Hsuan
0f56a4b443 [vimeo] Don't pollute std_headers
Fixes #8778
2016-03-06 17:01:05 +08:00
Sergey M․
1b5284b13f [downloader/fragment] Make speed more smooth
At the beginning of every segment there was a drop to unknown speed because the timeslice was too small to calculate speed.
Now the last speed from the previous fragment is used.
2016-03-06 05:36:52 +06:00
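A sketch of the smoothing described above (names hypothetical):

```
def calc_speed(bytes_done, elapsed, last_fragment_speed, min_timeslice=0.001):
    # When the timeslice is too small to measure a meaningful rate, reuse
    # the last speed from the previous fragment instead of reporting
    # 'Unknown speed'.
    if elapsed < min_timeslice:
        return last_fragment_speed
    return float(bytes_done) / elapsed
```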
Sergey M․
d1e4a464cd [YoutubeDL] Carry long lines and improve readability 2016-03-06 04:32:18 +06:00
Sergey M․
ff059017c0 [YoutubeDL] Fix typo in m3u8_native fixup 2016-03-06 04:30:19 +06:00
remitamine
f22ba4bd60 update tests related to the change in youtube http format sorting
the change was done in 82156fdbf0
2016-03-05 21:52:24 +01:00
remitamine
1db772673e [cinemassacre] update tests 2016-03-05 21:34:34 +01:00
remitamine
75313f2baa [cnet] fix info extraction 2016-03-05 21:10:00 +01:00
remitamine
090eb8e25f Merge pull request #8718 from remitamine/m3u8-fixup
Add fixup for media files produced by HlsNative downloader(fixes #4776)
2016-03-05 18:37:28 +01:00
remitamine
a9793f58a1 Merge pull request #8754 from remitamine/5min
update 5min-related web sites info extraction and add support for Aol features.
2016-03-05 18:35:48 +01:00
remitamine
7177fd24f8 [vgtv] support ap.vgtv.no and fix old videos extraction(fixes #8719) 2016-03-05 17:51:46 +01:00
Sergey M․
1e501f6c40 [jeuxvideo] Fix config URL extraction (Closes #8774) 2016-03-05 21:01:43 +06:00
remitamine
2629a3802c [revision3] fix video_id for --download-archive 2016-03-05 15:42:15 +01:00
Sergey M․
51ce91174b [YoutubeDL] Fix resolution with missing height in output template dict 2016-03-05 19:38:58 +06:00
remitamine
107d0c421a [revision3] add support for pages of type tag 2016-03-05 13:43:29 +01:00
remitamine
18b0b23992 [revision3] add support for pages of type embed 2016-03-05 12:14:48 +01:00
Sergey M․
d1b29d1342 [elpais] Add support for alternative layout (Closes #8744) 2016-03-05 16:43:29 +06:00
Yen Chi Hsuan
2def60c5f3 [common] Use codec2ext for DASH formats (#8764) 2016-03-05 18:18:39 +08:00
Yen Chi Hsuan
19a17d4623 [utils] Add codec2ext 2016-03-05 18:18:28 +08:00
Yen Chi Hsuan
845817aadf [twitter] Provide more metadata 2016-03-05 18:14:58 +08:00
Jaime Marquínez Ferrándiz
3233a68fbb [utils] update_url_query: Encode the strings in the query dict
The test case with {'test': '第二行тест'} was failing on python 2 (the non-ascii characters were replaced with '?').
2016-03-04 22:18:40 +01:00
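A simplified sketch of update_url_query after the fix (assuming utf-8 for non-ASCII values; the real implementation is in utils.py):

```
try:
    from urllib.parse import urlsplit, urlunsplit, parse_qs, urlencode  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit, parse_qs  # Python 2
    from urllib import urlencode

def update_url_query(url, query):
    parts = urlsplit(url)
    qs = parse_qs(parts.query)
    qs.update((k, [v]) for k, v in query.items())
    # Encoding the values as utf-8 bytes first is the crux of the fix;
    # without it, Python 2 mangled non-ASCII characters into '?'.
    qs = dict((k, [x.encode('utf-8') if isinstance(x, type(u'')) else x
                   for x in v]) for k, v in qs.items())
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(qs, True), parts.fragment))

print(update_url_query('http://example.com/path', {'test': u'第二行тест'}))
```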
remitamine
cf074e5ddd [foxnews] update test 2016-03-04 21:42:04 +01:00
Sergey M․
002c755248 [youporn] Fix sources regex 2016-03-05 01:51:27 +06:00
Sergey M․
d627cec608 [youporn] Fix quality extraction (Closes #8758) 2016-03-05 01:50:12 +06:00
remitamine
1315224cbb [bleacherreport] update tests 2016-03-04 20:14:09 +01:00
remitamine
7760b9ff4d [audimedia] update _VALID_URL and video_id regex and improve http format_id 2016-03-04 17:55:50 +01:00
Yen Chi Hsuan
28559564b2 [kusi] Correct test_KUSI 2016-03-05 00:04:29 +08:00
Yen Chi Hsuan
fa880d20ad [kusi] Two fixes
Thanks @dstftw for pointing those out
2016-03-04 23:59:58 +08:00
Sergey M․
ae7d31af1c [yandexmusic] Capture and output API errors 2016-03-04 21:32:54 +06:00
Yen Chi Hsuan
9d303bf29b Merge branch 'mutantmonkey-kusi' 2016-03-04 23:21:24 +08:00
Yen Chi Hsuan
5f1688f271 [kusi] Simplify and improve 2016-03-04 23:08:47 +08:00
remitamine
1d4c9ed90c [aol] improve extraction
- add support for aol features
- remove support for legacy urls
2016-03-04 10:42:58 +01:00
remitamine
d48352fb5d [engadget] remove support for legacy urls 2016-03-04 10:40:39 +01:00
remitamine
6d6536acb2 [fivemin] improve extraction
- skip m3u8 formats (404 error)
- skip unavailable test
- download embed page only when it's needed
- update _VALID_URL regex (joystiq.com redirects to engadget.com)
2016-03-04 10:25:16 +01:00
Yen Chi Hsuan
b6f94d81ea [kusi] Add a test for the alternative form of URL 2016-03-04 14:32:01 +08:00
Yen Chi Hsuan
8477a69283 Merge branch 'kusi' of https://github.com/mutantmonkey/youtube-dl into mutantmonkey-kusi 2016-03-04 14:21:23 +08:00
Yen Chi Hsuan
d58cb3ec7e [leeco] Skip an invalid test. test_LePlaylist_1 is sufficient 2016-03-04 13:46:38 +08:00
Yen Chi Hsuan
8a370aedac [leeco] format_id should be strings 2016-03-04 13:38:45 +08:00
Yen Chi Hsuan
24ca0e9c0b [douyutv] Fix tests 2016-03-04 13:36:29 +08:00
Sergey M․
e1dd521e49 [livestream] Fix FutureWarning (Closes #8742) 2016-03-04 01:16:58 +06:00
remitamine
1255733945 Merge pull request #8739 from remitamine/update_url_params
[utils] add update_url_query function to create or update query string params
2016-03-03 19:24:04 +01:00
remitamine
3201a67f61 [test/test_utils] add more tests for update_url_query 2016-03-03 19:18:57 +01:00
Sergey M․
d0ff690d68 [indavideo:embed] Fix tags extraction (Closes #8738) 2016-03-04 00:09:40 +06:00
remitamine
fb640d0a3d [test/test_utils] add tests for update_url_query 2016-03-03 18:40:05 +01:00
remitamine
38f9ef31dc [utils] add update_url_query function 2016-03-03 18:34:52 +01:00
Sergey M․
a8276b2680 [twitch:playlistbase] Fix all at once fetch 2016-03-03 22:18:32 +06:00
Sergey M․
ececca6cde [twitch:playlistbase] Restore original _PAGE_LIMIT 2016-03-03 22:12:55 +06:00
Sergey M․
8bbb4b56ee [twitch:playlistsbase] Use orderedSet 2016-03-03 22:11:26 +06:00
Sergey M․
539a1641c6 [twitch] Workaround broken paging (Closes #8740) 2016-03-03 22:10:36 +06:00
Yen Chi Hsuan
1b0635aba3 [Makefile] Allow specifying the Python version in offline tests 2016-03-03 21:57:49 +08:00
Yen Chi Hsuan
429491f531 [test/http] Fix failure in Jython
make offlinetest passed on the latest Jython hg version with patched
lib-python/2.7/urllib2.py pulled from CPython 2.7.11
2016-03-03 21:55:17 +08:00
Yen Chi Hsuan
e9c0cdd389 [jython] Introduce compat_os_name
os.name is always 'java' on Jython
2016-03-03 19:24:24 +08:00
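The shim itself is a one-liner; a sketch matching the description above (on Jython the real platform name is kept in os._name):

```
import os

# os.name is always 'java' on Jython; the underlying platform's name
# ('nt', 'posix', ...) is exposed as os._name there instead.
compat_os_name = os._name if os.name == 'java' else os.name
```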
Yen Chi Hsuan
0cae023b24 Merge branch 'jython-support'
Closes #8302
2016-03-03 18:49:32 +08:00
Yen Chi Hsuan
8ee239e921 [utils] Jython support - handle filenames correctly
Now test:youtube downloads
2016-03-03 18:47:54 +08:00
Brian Foley
8bb56eeeea [utils] Add extract_attributes for extracting html tag attributes
This is much more robust than just using regexps, and handles all
the common scenarios, such as empty/no values, repeated attributes,
entity decoding, mixed case names, and the different possible value
quoting schemes.
2016-03-03 10:11:37 +00:00
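A minimal sketch of the HTMLParser-based approach (simplified; the real utils.extract_attributes handles more corner cases): the parser already copes with the quoting schemes, entity decoding and mixed-case names that regexps kept getting wrong.

```
try:
    from HTMLParser import HTMLParser  # Python 2
except ImportError:
    from html.parser import HTMLParser  # Python 3

class _AttrParser(HTMLParser):
    # Collects the attributes of the first start tag it is fed;
    # names arrive lower-cased and values entity-decoded.
    def handle_starttag(self, tag, attrs):
        self.attrs = dict(attrs)

def extract_attributes(html_element):
    parser = _AttrParser()
    parser.feed(html_element)
    return getattr(parser, 'attrs', {})

print(extract_attributes('<a HREF="http://example.com/?a=1&amp;b=2" checked>'))
# {'href': 'http://example.com/?a=1&b=2', 'checked': None}
```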
remitamine
fa9e259fd9 [extractor/common] use compat_parse_qs in update_url_params 2016-03-03 10:54:39 +01:00
remitamine
f3bdae76de [extractor/common] add update_url_params helper method to add or update query string params 2016-03-03 10:27:22 +01:00
Yen Chi Hsuan
03879ff054 [twitter] Media info is not always in the first entity
Fixes #8704
2016-03-03 14:42:49 +08:00
Yen Chi Hsuan
c8398a9b87 [twitter] Now Twitter serves the same file for Firefox and Chrome 2016-03-03 14:27:27 +08:00
Yen Chi Hsuan
b8972bd69d [twitter] Fix extraction of test_Twitter and test_Twitter_1 2016-03-03 14:24:24 +08:00
Yen Chi Hsuan
0ae937a798 [twitter] Support twitter.com/i/videos/tweet/ URLS
Closes #8737
2016-03-03 13:43:45 +08:00
remitamine
4459bef203 [theplatform] detect other types of errors 2016-03-02 21:41:29 +01:00
remitamine
e07237f640 [utils] remove check for val from find_xpath_attr 2016-03-02 21:40:21 +01:00
Yen Chi Hsuan
8c5a994424 [leeco] Letv renamed to LeEco
LeEco is the company name and Le is the domain name.

For more information see the Chinese news post
http://www.techorz.com/company-news/letv-renamed-to-leeco-and-new-logo/
2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
2eb25b256b [letv] Merge LetvTvIE into LetvPlaylistIE
And
1. Add more URL examples
2. Improve the matching pattern
2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
f3bc19a989 [letv] Correct regular expressions and fix a typo 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
7a8fef3173 [letv] Order imports alphabetically 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
7465e7e42d [letv] Keep videos' order in playlists 2016-03-03 03:27:55 +08:00
Yen Chi Hsuan
5e73a67d44 [letv] Domain name changed 2016-03-03 03:27:55 +08:00
Sergey M․
2316dc2b9a [twitch:playlistbase] Mark broken
The Twitch paging mechanism is completely broken on the Twitch side, serving all videos all the time and making our Travis builds stall.
2016-03-03 00:41:36 +06:00
Sergey M․
a2d7797cee [vimeo] Extract uploader_url (Closes #8727) 2016-03-03 00:00:11 +06:00
Sergey M․
fd050249af [youtube] Extract uploader_url (Closes #8724) 2016-03-02 23:49:10 +06:00
Sergey M․
7bcd2830dd [extractor/common] Document uploader_url 2016-03-02 23:31:24 +06:00
Sergey M
47462a125b [README.md] Document license field for output template 2016-03-02 23:10:01 +06:00
Sergey M․
7caf9830b0 [youtube] Extract license (Closes #8725) 2016-03-02 23:07:25 +06:00
Sergey M․
2bc0c46f98 [extractor/common] Document license metafield 2016-03-02 23:06:39 +06:00
remitamine
3318832e9d [youtube] improve width and height extraction from fmt_list 2016-03-02 17:52:13 +01:00
remitamine
e7d2084568 Merge branch 'master' of github.com:rg3/youtube-dl 2016-03-02 17:35:55 +01:00
remitamine
c2d3cb4c63 Revert "[youtube] add tbr to _formats extracted from watch_as3.swf"
This reverts commit 4a5ba28a87.
2016-03-02 17:35:04 +01:00
remitamine
c48dd4400f Revert "[youtube] add basic info for some unknown formats extracted from watch_as3.swf"
This reverts commit 85ca019d96.
2016-03-02 17:34:56 +01:00
Sergey M․
e38cafe986 [YoutubeDL] Skip postprocessing and archive report when outputting to stdout (Closes #8729) 2016-03-02 21:11:18 +06:00
remitamine
85ca019d96 [youtube] add basic info for some unknown formats extracted from watch_as3.swf 2016-03-02 16:05:05 +01:00
remitamine
4a5ba28a87 [youtube] add tbr to _formats extracted from watch_as3.swf 2016-03-02 16:05:05 +01:00
remitamine
82156fdbf0 [youtube] extract width and height from fmt_list 2016-03-02 16:05:05 +01:00
Sergey M․
6114090418 [nrk:skole] Relax _VALID_URL 2016-03-02 20:57:04 +06:00
Sergey M․
3099b31276 [nrk:skole] Add extractor (Closes #8728) 2016-03-02 20:52:06 +06:00
remitamine
f17f86513e Add fixup for media files produced by HlsNative downloader(fixes #4776) 2016-03-01 21:10:41 +01:00
Sergey M․
90f794c6c3 [options] Add --no-mark-watched (#5054) 2016-03-01 23:41:23 +06:00
Sergey M․
66ca2cfddd [wistia] Fix extraction (Closes #8707) 2016-03-01 23:26:53 +06:00
Sergey M
269dd2c6a7 Merge pull request #8703 from dstftw/mark-watched
Add --mark-watched feature (Closes #5054)
2016-03-01 23:00:51 +06:00
Sergey M․
e7998f59aa [lifenews] Fix extraction and improve (Closes #2482, closes #8714) 2016-03-01 22:59:11 +06:00
Yen Chi Hsuan
9fb556eef0 [iqiyi] SWF URLs are not used anymore
Since automatic detection of enc_key failed

Closes #8705
2016-03-01 08:42:33 +08:00
Philipp Hagemeister
e781ab63db release 2016.03.01 2016-03-01 00:05:39 +01:00
Jaime Marquínez Ferrándiz
3e76968220 [rtve.es:live] Fix extraction
* Update _VALID_URL to match the current URLs
* Use the m3u8 manifest since I haven't figured out how to use the rtmp stream
2016-02-29 20:57:26 +01:00
Sergey M․
2812c24c16 [mdr] Fix extraction (Closes #8702) 2016-03-01 01:24:26 +06:00
Sergey M․
d77ab8e255 Add --mark-watched feature (Closes #5054) 2016-03-01 01:01:33 +06:00
Sergey M․
4b3cd7316c [tf1] Improve wat id regex (Closes #8691) 2016-02-29 03:28:21 +06:00
Sergey M․
6dae56384a [screenwavemedia] Check formats' URLs 2016-02-28 21:46:36 +06:00
Sergey M․
2b2dfae83e [screenwavemedia] Improve formats sorting 2016-02-28 20:16:31 +06:00
Sergey M․
6c10dbeae9 [screenwavemedia] Improve formats extraction 2016-02-28 20:05:58 +06:00
Jaime Marquínez Ferrándiz
9173202b84 [zdf] Ignore hls manifests that use https (closes #8665)
The certificates are misconfigured, you get the following error mesage:

    ssl.CertificateError: hostname u'zdf-hdios-none-i.zdf.de' doesn't match either of 'a248.e.akamai.net', '*.akamaihd.net', '*.akamaihd-staging.net', '*.akamaized.net', '*.akamaized-staging.net'
2016-02-28 14:06:26 +01:00
Sergey M․
8870bb4653 [webofstories] Tolerate malformed og:title (Closes #8417) 2016-02-28 03:37:48 +06:00
Sergey M
7a0e7779fe [README.md] Use simple wording instead of env variable for home 2016-02-28 03:12:13 +06:00
Sergey M
a048ffc9b0 [README.md] Clarify configuration file options syntax 2016-02-28 03:04:06 +06:00
Sergey M
4587915b2a [README.md] Make configuration file example more diverse 2016-02-28 02:56:09 +06:00
Philipp Hagemeister
da665ddc25 release 2016.02.27 2016-02-27 21:31:21 +01:00
Sergey M․
5add979d91 [dplay] Add support for dplay.no 2016-02-27 21:42:08 +06:00
Sergey M․
20afe8bd14 Credit @aidan- for more dplay sites support (#8463) 2016-02-27 21:31:43 +06:00
Sergey M․
940b606a07 [dplay] Improve, extract all formats and metadata (Closes #8463) 2016-02-27 21:30:47 +06:00
Aidan Rowe
9505053704 [dplay] add support for it.dplay.com and dplay.dk 2016-02-27 19:40:36 +06:00
Sergey M․
2c9ca78281 [extractor/generic] Add support for tnaflix network embeds (Closes #7505) 2016-02-27 17:15:49 +06:00
Sergey M․
63719a8ac3 [tnaflixnetwork:embed] Add _extract_urls 2016-02-27 17:15:06 +06:00
Sergey M․
8fab62482a [tnaflixnetwork] Fallback age limit to 18 2016-02-27 16:59:10 +06:00
Sergey M․
d6e9c2706f [tnaflixnetwork:embed] Add extractor 2016-02-27 16:58:11 +06:00
Sergey M․
f7f2e53a0a [imdb] Recognize 1080p formats (Closes #8677) 2016-02-27 15:51:25 +06:00
Sergey M․
9cdffeeb3f [extractor/common] Clarify rationale on media playlist detection 2016-02-27 07:01:11 +06:00
Sergey M․
fbb6edd298 [extractor/common] Properly extract audio only formats in master m3u8 playlists 2016-02-27 06:48:13 +06:00
Yen Chi Hsuan
5eb6bdced4 [utils] Multiple changes to base_n()
1. Renamed to encode_base_n()
2. Allow tables longer than 62 characters
3. Raise ValueError instead of AssertionError for invalid input data
4. Return the first character in the table instead of '0' for number 0
5. Add tests
2016-02-27 03:22:52 +08:00
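A sketch of the function after these changes (simplified from the utils.py version):

```
def encode_base_n(num, n, table=None):
    FULL_TABLE = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    if table is None:
        table = FULL_TABLE[:n]  # custom tables may be longer than 62 characters
    if n > len(table):
        raise ValueError('base %d exceeds table length %d' % (n, len(table)))
    if num == 0:
        return table[0]  # first table character, not a hard-coded '0'
    ret = ''
    while num:
        ret = table[num % n] + ret
        num = num // n
    return ret

print(encode_base_n(255, 16))  # ff
```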
Yen Chi Hsuan
5633b4d39d [infoq] Use BokeCC extractor function 2016-02-27 02:55:11 +08:00
Yen Chi Hsuan
4435c6e98e [bokecc] Add new extractor (#2336) 2016-02-27 02:54:43 +08:00
Yen Chi Hsuan
2ebd2eac88 [letv] Speedup M3U8 decryption 2016-02-27 00:58:03 +08:00
Sergey M․
b78b292f0c [youtube] Add alternative automatic captions extraction approach (Closes #8667) 2016-02-26 22:21:47 +06:00
Yen Chi Hsuan
efbd6fb8bb [vidzi] Use decode_packed_codes
The JavaScript code found on Vidzi is slightly different from that found
on VideoMega and iQiyi. Nevertheless, the difference has no effect on
the final result.
2016-02-26 15:14:13 +08:00
Yen Chi Hsuan
680079be39 [utils] Relaxing regex in decode_packed_codes for vidzi 2016-02-26 15:13:03 +08:00
Yen Chi Hsuan
e4fc8d2ebe [videomega] Fix extraction (closes #7606) 2016-02-26 15:00:48 +08:00
Yen Chi Hsuan
f52354a889 [utils] Move code for handling eval() from iqiyi.py 2016-02-26 14:58:29 +08:00
Yen Chi Hsuan
59f898b7a7 [utils] Merge base_n functions 2016-02-26 14:37:20 +08:00
Yen Chi Hsuan
8f4a2124a9 [vidzi] Fix extraction 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
481888294d [utils] Add base36 for use in Vidzi 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
d1e440a4a1 [jwplatform] Separate code for parsing jwplayer data 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
81bdc8fdf6 [utils] Move base62 to utils 2016-02-26 14:26:26 +08:00
Yen Chi Hsuan
e048d87fc9 [kuwo] Fix a test 2016-02-26 14:26:26 +08:00
Sergey M․
e26cde0927 [space] Remove extractor (Closes #8662)
Now uses ooyala embed
2016-02-25 21:46:43 +06:00
Sergey M․
20108c6b90 [ustudio] Improve (Closes #8574) 2016-02-25 21:30:19 +06:00
mutantmonkey
9195ef745a [uStudio] Add new extractor 2016-02-25 21:29:49 +06:00
Sergey M․
d0459c530d [motherless] Update tests 2016-02-25 00:54:41 +06:00
Sergey M․
f160785c5c [utils] Remove AM/PM from unified_strdate patterns 2016-02-25 00:52:49 +06:00
Sergey M․
5c0a57185c [motherless] Detect non-existing videos 2016-02-25 00:42:19 +06:00
Sergey M․
43479d9e9d [motherless] Make categories optional (Closes #8654) 2016-02-25 00:36:14 +06:00
Sergey M
c0da50d2b2 [README.md] Turn references to issues to links 2016-02-24 23:05:23 +06:00
Yen Chi Hsuan
c24883a1c0 [facebook] Fix format sorting
'hd' formats should have higher priorities
2016-02-24 03:43:24 +08:00
Yen Chi Hsuan
1b77ee6248 [c56] Support videos hosted on Sohu (closes #8073) 2016-02-24 03:32:29 +08:00
Sergey M․
bf4b3b6bd9 [vk] Extract video URL from extra_data (Closes #8646) 2016-02-23 18:47:13 +06:00
Yen Chi Hsuan
efbeddead3 [facebook] Support mobile URLs (closes #8638) 2016-02-23 13:17:24 +08:00
Yen Chi Hsuan
3cfeb1624a [nba] Support channels (#5362, #4167) 2016-02-23 13:11:20 +08:00
Yen Chi Hsuan
b95dc034ca [utils] Implement cache for OnDemandPagedList 2016-02-23 13:11:20 +08:00
Yen Chi Hsuan
86a7dbe66e [nba] Support non-video/ pages
Fixes #8589
2016-02-23 13:11:20 +08:00
Sergey M
b43a7a92cd [README.md] Fix typo 2016-02-23 05:41:09 +06:00
Sergey M
6563d31710 [README.md] Fix typo 2016-02-23 05:37:49 +06:00
Sergey M
cf89ba9eff [README.md] Clarify robustness and future-proof requirements for new extractors 2016-02-23 05:35:19 +06:00
Sergey M
9b01272832 [README.md] Update link to extractor metafields 2016-02-23 05:03:57 +06:00
Sergey M
58525c94d5 [README.md] Emphasize copyright infringement aspects in add-new-site-support tutorial 2016-02-23 04:58:51 +06:00
Sergey M
621bd0cda9 [README.md] Add tl;dr links to examples 2016-02-23 04:43:45 +06:00
Sergey M
1610f770d7 [README.md] Extract example subsections 2016-02-23 04:29:39 +06:00
Sergey M
0fc871d2f0 [README.md:output_template] Add example for channel/user playlists download 2016-02-23 04:16:47 +06:00
Sergey M․
1ad6143061 [xfileshare] Add support for powerwatch (Closes #8628) 2016-02-22 17:37:00 +06:00
Philipp Hagemeister
92da3cd848 release 2016.02.22 2016-02-22 11:57:31 +01:00
remitamine
6212bcb191 [tf1] fix info extraction(fixes #8599) 2016-02-22 09:57:40 +01:00
Sergey M․
d69abbd3f0 [googledrive] Make thumbnail optional (Closes #8629) 2016-02-22 03:13:18 +06:00
Sergey M․
1d00a8823e [arte] PEP 8 2016-02-22 01:32:23 +06:00
Sergey M․
5d6e1011df [pbs] Extract all formats (Closes #8538) 2016-02-22 01:23:27 +06:00
Sergey M․
f5bdb44443 [extractor/common] Add _remove_duplicate_formats 2016-02-22 01:19:39 +06:00
Yen Chi Hsuan
7efc1c2b49 [twitter] Fix metadata extraction and test_Twitter_1 2016-02-21 17:29:28 +08:00
Yen Chi Hsuan
132e3b74bd [twitter] Fix a typo 2016-02-21 17:21:37 +08:00
Yen Chi Hsuan
bdbf4ba40e [twitter:amplify] Extract more metadata 2016-02-21 17:16:35 +08:00
Yen Chi Hsuan
acb6e97e6a [twitter] Fix several failed tests 2016-02-21 16:57:56 +08:00
Yen Chi Hsuan
445d72b8b5 [twitter:amplify] Add TwitterAmplifyIE for handling Twitter smart URLs
Closes #8075
2016-02-21 16:41:24 +08:00
Sergey M․
92c5e11b40 [arte:future] Fix test 2016-02-21 14:23:58 +06:00
Sergey M․
0dd046c16c [arte:magazine] Fix test 2016-02-21 13:57:30 +06:00
Sergey M․
305168ca3e [arte:+7] Detect more embeds (Closes #8613) 2016-02-21 13:55:25 +06:00
Sergey M․
b72f6163dc [arte:+7] Improve _VALID_URL 2016-02-21 13:37:31 +06:00
Sergey M․
33d4fdabfa [extractor/generic] Add support for ok embeds (#8619) 2016-02-21 09:51:54 +06:00
remitamine
cafcf657a4 add more subtitles mime types to mimetype2ext and fix the platform subtitle extraction 2016-02-20 22:02:03 +01:00
Yen Chi Hsuan
101067de12 Jython support - handle *.class files 2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
7360db05b4 [postprocessor/embedthumbnail] Allow mkv to embed thumbnails
Fixes #6046
2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
c1c05c67ea [utils] Jython support - disable setproctitle() until ctypes is complete 2016-02-21 03:32:03 +08:00
Yen Chi Hsuan
399a76e67b [utils] Jython support: tolerate missing fcntl module 2016-02-21 03:32:03 +08:00
Jaime Marquínez Ferrándiz
765ac263db [utils] mimetype2ext: return 'm4a' for 'audio/mp4' (fixes #8620)
The youtube extractor was using 'mp4' for them, therefore filters like 'bestaudio[ext=m4a]' stopped working (94278f7202 broke it).
2016-02-20 19:55:10 +01:00
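A sketch of the mapping change (heavily simplified; the real function covers many more types):

```
def mimetype2ext(mt):
    return {
        'audio/mp4': 'm4a',  # was 'mp4', which broke bestaudio[ext=m4a]
        'video/mp4': 'mp4',
        'audio/webm': 'webm',
        'video/webm': 'webm',
    }.get(mt)

print(mimetype2ext('audio/mp4'))  # m4a
```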
Yen Chi Hsuan
a4e4d7dfcd [test_iqiyi_sdk_interpreter] Add test for iQiyi login 2016-02-20 23:10:39 +08:00
Yen Chi Hsuan
73f9c2867d [iqiyi] Support playlists (closes #8019) 2016-02-20 22:44:04 +08:00
Philipp Hagemeister
9c86d50916 [faz] Future-proof XML element check 2016-02-20 14:11:44 +01:00
Yen Chi Hsuan
1d14c75f55 [Makefile] iQiyi login test requires network 2016-02-20 20:49:30 +08:00
Yen Chi Hsuan
99709cc3f1 [iqiyi] Implement _login()
Currently only email login supported
2016-02-20 19:54:58 +08:00
Yen Chi Hsuan
5bc880b988 [utils] Add OHDave's RSA encryption function 2016-02-20 19:54:58 +08:00
Yen Chi Hsuan
958759f44b [appletrailers] Extend _VALID_URL (#8524) 2016-02-20 15:54:00 +08:00
remitamine
f34294fa0c [downloader/external:ffmpegfd] check for None value of start_time 2016-02-20 08:06:12 +01:00
remitamine
99cbe98ce8 [downloader/external] check for external downloaders availability 2016-02-20 07:58:25 +01:00
Sergey M․
86bf29050e [test_YoutubeDL] Make test pass until more intelligent sort formats (Closes #8462) 2016-02-20 03:36:03 +06:00
remitamine
04cbc4980d [mtv] improve duration extraction 2016-02-19 20:56:45 +01:00
RiCON
8765151c8a [mtv] Extract duration from each playlist item
RSS is used instead of manifest files because it is exact to the millisecond
for the video I tested, while the manifest is only exact to the second.
2016-02-19 19:38:28 +00:00
remitamine
12b84ac8c1 [downloader/external] Add FFmpegFD(fixes #622)
- replace HlsFD and RtspFD
- add basic support for downloading part of the video or audio
2016-02-19 19:29:24 +01:00
Sergey M
8ec64ac683 [README.md] Clarify verbose log 2016-02-19 22:18:21 +06:00
Sergey M․
ed8648a322 [pornhub] Fix thumbnail and duration extraction (Closes #8604) 2016-02-19 21:42:46 +06:00
Sergey M․
88641243ab [pornhub:playlistbase] Improve extract entries 2016-02-18 22:30:19 +06:00
Sergey M․
40e146aa1e [pornhub:user:videos] Add extractor (Closes #8548) 2016-02-18 22:29:17 +06:00
Sergey M․
f3f9cd9234 [francetv] Improve video id regex (Closes #8563) 2016-02-18 22:09:21 +06:00
Sergey M․
ebf1b291d0 [youtube:watchlater] Respect --no-playlist 2016-02-18 22:03:46 +06:00
Sergey M․
bc7a9cd8fb [youtube:watchlater] Improve _VALID_URL (Closes #8594) 2016-02-18 21:50:21 +06:00
Sergey M․
d48502b82a [arte] Improve _VALID_URLs 2016-02-18 21:29:52 +06:00
Sergey M․
479ec54a8d [arte:magazine] Improve (Closes #8473) 2016-02-18 21:29:07 +06:00
Thomas Jost
49625662a9 [arte:magazine] Add extractor 2016-02-18 21:28:18 +06:00
remitamine
8b809a079a [cbsnews] use find_xpath_attr 2016-02-18 16:10:09 +01:00
remitamine
778433cb90 [cbsnews] extract subtitle url from theplatform SMIL manifest(fixes #8568) 2016-02-18 15:43:28 +01:00
cazulu
411cb8f476 [dailymotion] Fix view count extraction
Fix view count parsing when the digit group separator is a whitespace, e.g. '101 101'
2016-02-18 20:31:43 +06:00
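A sketch of parsing such counts (hypothetical simplified helper): strip whitespace and separator characters before converting.

```
import re

def parse_view_count(count_str):
    # Tolerate whitespace, commas and dots used as digit group separators,
    # e.g. '101 101' -> 101101.
    return int(re.sub(r'[\s,.]+', '', count_str))

print(parse_view_count('101 101'))  # 101101
```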
Sergey M․
63bf4f0dc0 [vrt] Detect geo restriction 2016-02-17 23:28:41 +06:00
Sergey M․
80e59a0d5d [vrt] Make formats extraction non fatal (Closes #8587) 2016-02-17 23:18:23 +06:00
Sergey M․
8bbd3d1476 [arte] Fix upload date extraction (Closes #8581) 2016-02-17 22:51:08 +06:00
Sergey M․
e725e4bced [arte] PEP 8 2016-02-17 22:37:55 +06:00
Sergey M․
08d65046f0 [arte] Make sorting aware of en/es formats 2016-02-17 22:37:05 +06:00
Sergey M․
44b9745000 [arte] Extend more _VALID_URLs for en and es support 2016-02-17 21:53:53 +06:00
Sergey M․
9654fc875b [arte:+7] Fix extraction for react-based layout 2016-02-17 21:49:15 +06:00
Sergey M․
0f425e65ec [arte:+7] Add support for en and es URLs 2016-02-17 21:47:18 +06:00
mutantmonkey
199e724291 [KUSI] Add new extractor 2016-02-16 19:55:46 -08:00
Sergey M․
e277f2a63b [orf:tvthek] Check formats (Closes #8580) 2016-02-16 22:23:38 +06:00
Sergey M․
f4db09178a [xtube:user] Remove duplicated video ids 2016-02-16 22:06:26 +06:00
Sergey M․
86be3cdc2a [xtube] Fix extraction (Closes #8565) 2016-02-16 22:05:23 +06:00
Yen Chi Hsuan
cb64ccc715 [facebook] Improve error handling (#8572) 2016-02-16 09:07:38 +08:00
Sergey M․
f66a3c7bc2 [screenjunkies] Fix spelling 2016-02-16 01:30:00 +06:00
Sergey M․
fe80df3080 Credit @TingPing for screenjunkies (#8505) 2016-02-16 01:24:57 +06:00
Yen Chi Hsuan
1932476c13 [iqiyi] Omit MD5 sums for the VIP-only video 2016-02-16 02:45:21 +08:00
Sergey M․
d2c1f79f20 [youtube:searchurl] Extend _VALID_URL 2016-02-16 00:29:51 +06:00
Sergey M․
8eacae8cf9 Credit @RobinHoutevelts for canvas subtiltes (#8537) 2016-02-15 22:33:32 +06:00
Sergey M․
c8a80fd818 [screenjunkies] Improve, extract more metadata and workaround subscription (Closes #8505) 2016-02-15 22:29:28 +06:00
Patrick Griffis
b9e8d7140a [screenjunkies] Add new extractor
This doesn't handle the plus-only videos yet

Closes #8492
2016-02-15 22:28:36 +06:00
Sergey M․
6eff2605d6 [canvas] Add subtitles test (#8537) 2016-02-15 20:59:16 +06:00
Sergey M․
fd7a3ea4a4 [canvas] Improve subtitles (Closes #8537) 2016-02-15 20:54:01 +06:00
Robin Houtevelts
8d3eeb36d7 [Canvas] Add subtitles 2016-02-15 20:50:03 +06:00
Yen Chi Hsuan
8e0548e180 [iqiyi] Partial support for VIP-only videos
See #8569 and #8019. Currently only 6-min previews are supported
2016-02-15 19:58:24 +08:00
Philipp Hagemeister
a517bb4b1e [noz] Add new extractor 2016-02-15 00:07:16 +01:00
Sergey M․
9dcefb23a1 [laola1tv] Improve (Closes #8478) 2016-02-14 23:40:26 +06:00
Sergey M․
d9da74bc06 Credit @blackwinter for laola1tv (#8478) 2016-02-14 23:39:49 +06:00
Jens Wille
5e19323ed9 [laola1tv] Fixes for changed site layout.
* Fixed valid URLs (w/ tests).
* Fixed iframe URL extraction.
* Fixed token URL extraction.
* Fixed variable extraction.
* Fixed uploader spelling.
* Added upload_date to result dictionary.
2016-02-14 23:01:49 +06:00
Sergey M․
611c1dd96e [refactor] Single quotes consistency 2016-02-14 15:37:17 +06:00
Sergey M․
d800609c62 [refactor] Do not specify redundant None as second argument in dict.get() 2016-02-14 14:25:04 +06:00
Sergey M․
c78c9cd10d [downloader/dash] PEP 8 2016-02-14 14:13:09 +06:00
Sergey M․
e76394f36c [globo] Switch to new-style classes 2016-02-14 14:02:12 +06:00
Sergey M․
080e09557d [aes] Switch to new-style classes 2016-02-14 14:01:43 +06:00
Sergey M․
fca2e6d5a6 [dailymotion:cloud] Use idiomatic name for classmethod's first argument 2016-02-14 13:44:23 +06:00
Sergey M․
b45f2b1d6e [myvideo] Mark broken 2016-02-14 11:24:57 +06:00
remitamine
fc2e70ee90 Merge pull request #8479 from remitamine/dash_downloader
[downloader/dash] Implement dashsegments fd in terms of fragment fd
2016-02-13 21:12:33 +01:00
Sergey M․
b4561e857f [animeondemand] Add .netrc 2016-02-13 22:41:58 +06:00
Jaime Marquínez Ferrándiz
7023251239 [comedycentral] Support /shows URLs (fixes #8405) 2016-02-13 12:26:27 +01:00
Sergey M․
e2bd68c901 [animeondemand][wip] Add extractor (#8518) 2016-02-13 13:30:31 +06:00
remitamine
dd86780596 [extractor/common] fix dash formats sorting 2016-02-11 10:55:50 +01:00
remitamine
c43fe0268c [downloader/dash] Implement dashsegments fd in terms of fragment fd 2016-02-09 17:25:44 +01:00
292 changed files with 6471 additions and 2101 deletions

.gitignore

@@ -1,5 +1,6 @@
 *.pyc
 *.pyo
+*.class
 *~
 *.DS_Store
 wine-py2exe/

AUTHORS

@@ -157,3 +157,13 @@ Founder Fang
 Andrew Alexeyew
 Saso Bezlaj
 Erwin de Haan
+Jens Wille
+Robin Houtevelts
+Patrick Griffis
+Aidan Rowe
+mutantmonkey
+Ben Congdon
+Kacper Michajłow
+José Joaquín Atria
+Viťas Strádal
+Kagami Hiiragi

CONTRIBUTING.md

@@ -1,6 +1,6 @@
-**Please include the full output of youtube-dl when run with `-v`**, i.e. add `-v` flag to your command line, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
+**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
 ```
-$ youtube-dl -v http://www.youtube.com/watch?v=BaW_jenozKcj
+$ youtube-dl -v <your command line>
 [debug] System config: []
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']

@@ -85,14 +85,16 @@ To run the test, simply invoke your favorite test runner, or execute a test file
 If you want to create a build of youtube-dl yourself, you'll need
 * python
-* make
+* make (both GNU make and BSD make are supported)
 * pandoc
 * zip
 * nosetests
 ### Adding support for a new site
-If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
+If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites thus pull requests adding support for them **will be rejected**.
+After you have ensured this site is distributing it's content legally, you can follow this quick list (assuming your service is called `yourextractor`):
 1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
 2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`

@@ -140,16 +142,17 @@ If you want to add support for a new site, you can follow this quick list (assum
 ```
 5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L62-L200). Add tests and code for as many as you want.
+7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
-8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
+8. Keep in mind that the only mandatory fields in info dict for successful extraction process are `id`, `title` and either `url` or `formats`, i.e. these are the critical data the extraction does not make any sense without. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from aforementioned mandatory ones should be treated **as optional** and extraction should be **tolerate** to situations when sources for these fields can potentially be unavailable (even if they always available at the moment) and **future-proof** in order not to break the extraction of general purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into resulting info dict as `description`, you should be ready that this key may be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex/_html_search_regex`.
-9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
+9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
+10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
     $ git add youtube_dl/extractor/__init__.py
     $ git add youtube_dl/extractor/yourextractor.py
     $ git commit -m '[yourextractor] Add new extractor'
     $ git push origin yourextractor
-10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
+11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
 In any case, thank you very much for your contributions!

Makefile

@@ -3,6 +3,7 @@ all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bas
 clean:
     rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
     find . -name "*.pyc" -delete
+    find . -name "*.class" -delete
 PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin

@@ -11,15 +12,7 @@ SHAREDIR ?= $(PREFIX)/share
 PYTHON ?= /usr/bin/env python
 # set SYSCONFDIR to /etc if PREFIX=/usr or PREFIX=/usr/local
-ifeq ($(PREFIX),/usr)
-    SYSCONFDIR=/etc
-else
-    ifeq ($(PREFIX),/usr/local)
-        SYSCONFDIR=/etc
-    else
-        SYSCONFDIR=$(PREFIX)/etc
-    endif
-endif
+SYSCONFDIR != if [ $(PREFIX) = /usr -o $(PREFIX) = /usr/local ]; then echo /etc; else echo $(PREFIX)/etc; fi
 install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
     install -d $(DESTDIR)$(BINDIR)

@@ -44,7 +37,7 @@ test:
 ot: offlinetest
 offlinetest: codetest
-    nosetests --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py
+    $(PYTHON) -m nose --verbose test --exclude test_download.py --exclude test_age_restriction.py --exclude test_subtitles.py --exclude test_write_annotations.py --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
 tar: youtube-dl.tar.gz

View File

@@ -80,6 +80,8 @@ which means you can modify it, redistribute it or use it however you like.
                                      on Windows)
     --flat-playlist                  Do not extract the videos of a playlist,
                                      only list them.
+    --mark-watched                   Mark videos watched (YouTube only)
+    --no-mark-watched                Do not mark videos watched (YouTube only)
     --no-color                       Do not emit color codes in output

 ## Network Options:
@@ -162,6 +164,8 @@ which means you can modify it, redistribute it or use it however you like.
                                      (e.g. 50K or 4.2M)
     -R, --retries RETRIES            Number of retries (default is 10), or
                                      "infinite".
+    --fragment-retries RETRIES       Number of retries for a fragment (default
+                                     is 10), or "infinite" (DASH only)
     --buffer-size SIZE               Size of download buffer (e.g. 1024 or 16K)
                                      (default is 1024)
     --no-resize-buffer               Do not automatically adjust the buffer
@@ -179,7 +183,7 @@ which means you can modify it, redistribute it or use it however you like.
                                      to play it)
     --external-downloader COMMAND    Use the specified external downloader.
                                      Currently supports
-                                     aria2c,axel,curl,httpie,wget
+                                     aria2c,avconv,axel,curl,ffmpeg,httpie,wget
     --external-downloader-args ARGS  Give these arguments to the external
                                      downloader
@@ -374,8 +378,8 @@ which means you can modify it, redistribute it or use it however you like.
     --no-post-overwrites             Do not overwrite post-processed files; the
                                      post-processed files are overwritten by
                                      default
-    --embed-subs                     Embed subtitles in the video (only for mkv
-                                     and mp4 videos)
+    --embed-subs                     Embed subtitles in the video (only for mp4,
+                                     webm and mkv videos)
     --embed-thumbnail                Embed thumbnail in the audio as cover art
     --add-metadata                   Write metadata to the video file
     --metadata-from-title FORMAT     Parse additional metadata like song title /
@@ -409,13 +413,18 @@ which means you can modify it, redistribute it or use it however you like.
 # CONFIGURATION

-You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`. For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime and use a proxy:
+You can configure youtube-dl by placing any supported command line option to a configuration file. On Linux, the system wide configuration file is located at `/etc/youtube-dl.conf` and the user wide configuration file at `~/.config/youtube-dl/config`. On Windows, the user wide configuration file locations are `%APPDATA%\youtube-dl\config.txt` or `C:\Users\<user name>\youtube-dl.conf`.
+
+For example, with the following configuration file youtube-dl will always extract the audio, not copy the mtime, use a proxy and save all videos under the `Movies` directory in your home directory:
 ```
--extract-audio
+-x
 --no-mtime
 --proxy 127.0.0.1:3128
+-o ~/Movies/%(title)s.%(ext)s
 ```
+
+Note that options in the configuration file are just the same options (switches) used in regular command line calls; thus there **must be no whitespace** after `-` or `--`, e.g. `-o` or `--proxy` but not `- o` or `-- proxy`.

 You can use `--ignore-config` if you want to disable the configuration file for a particular youtube-dl run.

 ### Authentication with `.netrc` file
@@ -440,7 +449,11 @@ On Windows you may also need to setup the `%HOME%` environment variable manually
 # OUTPUT TEMPLATE

-The `-o` option allows users to indicate a template for the output file names. The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:
+The `-o` option allows users to indicate a template for the output file names.
+
+**tl;dr:** [navigate me to examples](#output-template-examples).
+
+The basic usage is not to set any template arguments when downloading a single file, like in `youtube-dl -o funny_video.flv "http://some/video"`. However, it may contain special sequences that will be replaced when downloading each video. The special sequences have the format `%(NAME)s`. To clarify, that is a percent symbol followed by a name in parentheses, followed by a lowercase S. Allowed names are:

 - `id`: Video identifier
 - `title`: Video title
@@ -449,6 +462,7 @@ The `-o` option allows users to indicate a template for the output file names. T
 - `alt_title`: A secondary title of the video
 - `display_id`: An alternative identifier for the video
 - `uploader`: Full name of the video uploader
+- `license`: License name the video is licensed under
 - `creator`: The main artist who created the video
 - `release_date`: The date (YYYYMMDD) when the video was released
 - `timestamp`: UNIX timestamp of the moment the video became available
@@ -513,7 +527,9 @@ The current default template is `%(title)s-%(id)s.%(ext)s`.
 In some cases, you don't want special characters such as 中, spaces, or &, such as when transferring the downloaded filename to a Windows system or the filename through an 8bit-unsafe channel. In these cases, add the `--restrict-filenames` flag to get a shorter title:

-Examples (note on Windows you may need to use double quotes instead of single):
+#### Output template examples
+
+Note on Windows you may need to use double quotes instead of single.

 ```bash
 $ youtube-dl --get-filename -o '%(title)s.%(ext)s' BaW_jenozKc
@@ -525,6 +541,9 @@ youtube-dl_test_video_.mp4 # A simple file name
 # Download YouTube playlist videos in separate directory indexed by video order in a playlist
 $ youtube-dl -o '%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re

+# Download all playlists of YouTube channel/user keeping each playlist in separate directory:
+$ youtube-dl -o '%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s' https://www.youtube.com/user/TheLinuxFoundation/playlists
+
 # Download Udemy course keeping each chapter in separate directory under MyVideos directory in your home
 $ youtube-dl -u user -p password -o '~/MyVideos/%(playlist)s/%(chapter_number)s - %(chapter)s/%(title)s.%(ext)s' https://www.udemy.com/java-tutorial/
@@ -543,6 +562,8 @@ But sometimes you may want to download in a different format, for example when y
 The general syntax for format selection is `--format FORMAT` or shorter `-f FORMAT` where `FORMAT` is a *selector expression*, i.e. an expression that describes format or formats you would like to download.

+**tl;dr:** [navigate me to examples](#format-selection-examples).
+
 The simplest case is requesting a specific format, for example with `-f 22` you can download the format with format code equal to 22. You can get the list of available format codes for particular video using `--list-formats` or `-F`. Note that these format codes are extractor specific.

 You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download best quality format of particular file extension served as a single file, e.g. `-f webm` will download best quality format with `webm` extension served as a single file.
@@ -588,11 +609,14 @@ You can merge the video and audio of two formats into a single file using `-f <v
 Format selectors can also be grouped using parentheses, for example if you want to download the best mp4 and webm formats with a height lower than 480 you can use `-f '(mp4,webm)[height<480]'`.

-Since the end of April 2015 and version 2015.04.26 youtube-dl uses `-f bestvideo+bestaudio/best` as default format selection (see #5447, #5456). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.
+Since the end of April 2015 and version 2015.04.26 youtube-dl uses `-f bestvideo+bestaudio/best` as default format selection (see [#5447](https://github.com/rg3/youtube-dl/issues/5447), [#5456](https://github.com/rg3/youtube-dl/issues/5456)). If ffmpeg or avconv are installed this results in downloading `bestvideo` and `bestaudio` separately and muxing them together into a single file giving the best overall quality available. Otherwise it falls back to `best` and results in downloading the best available quality served as a single file. `best` is also needed for videos that don't come from YouTube because they don't provide the audio and video in two different files. If you want to only download some DASH formats (for example if you are not interested in getting videos with a resolution higher than 1080p), you can add `-f bestvideo[height<=?1080]+bestaudio/best` to your configuration file. Note that if you use youtube-dl to stream to `stdout` (and most likely to pipe it to your media player then), i.e. you explicitly specify output template as `-o -`, youtube-dl still uses `-f best` format selection in order to start content delivery immediately to your player and not to wait until `bestvideo` and `bestaudio` are downloaded and muxed.

 If you want to preserve the old format selection behavior (prior to youtube-dl 2015.04.26), i.e. you want to download the best available quality media served as a single file, you should explicitly specify your choice with `-f best`. You may want to add it to the [configuration file](#configuration) in order not to type it every time you run youtube-dl.

-Examples (note on Windows you may need to use double quotes instead of single):
+#### Format selection examples
+
+Note on Windows you may need to use double quotes instead of single.

 ```bash
 # Download best mp4 format available or any other best if no mp4 available
 $ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
@@ -733,7 +757,7 @@ means you're using an outdated version of Python. Please update to Python 2.6 or
 ### What is this binary file? Where has the code gone?

-Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.
+Since June 2012 ([#342](https://github.com/rg3/youtube-dl/issues/342)) youtube-dl is packed as an executable zipfile, simply unzip it (might need renaming to `youtube-dl.zip` first on some systems) or clone the git repository, as laid out above. If you modify the code, you can run it by executing the `__main__.py` file. To recompile the executable, run `make youtube-dl`.

 ### The exe throws a *Runtime error from Visual C++*
@@ -809,14 +833,16 @@ To run the test, simply invoke your favorite test runner, or execute a test file
 If you want to create a build of youtube-dl yourself, you'll need

 * python
-* make
+* make (both GNU make and BSD make are supported)
 * pandoc
 * zip
 * nosetests

 ### Adding support for a new site

-If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
+If you want to add support for a new site, first of all **make sure** this site is **not dedicated to [copyright infringement](#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. youtube-dl does **not support** such sites, thus pull requests adding support for them **will be rejected**.
+
+After you have ensured this site is distributing its content legally, you can follow this quick list (assuming your service is called `yourextractor`):

 1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
 2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
@@ -864,16 +890,17 @@ If you want to add support for a new site, you can follow this quick list (assum
 ```
 5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries (a sketch of such a list follows this section). The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
-7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L62-L200). Add tests and code for as many as you want.
+7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L68-L226). Add tests and code for as many as you want.
-8. If you can, check the code with [flake8](https://pypi.python.org/pypi/flake8).
-9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
+8. Keep in mind that the only mandatory fields in the info dict for a successful extraction are `id`, `title` and either `url` or `formats`, i.e. these are the critical data without which the extraction makes no sense. This means that [any field](https://github.com/rg3/youtube-dl/blob/58525c94d547be1c8167d16c298bdd75506db328/youtube_dl/extractor/common.py#L138-L226) apart from the aforementioned mandatory ones should be treated **as optional**, and extraction should be **tolerant** of situations where the sources of these fields can potentially be unavailable (even if they are always available at the moment) and **future-proof** so as not to break the extraction of the general-purpose mandatory fields. For example, if you have some intermediate dict `meta` that is a source of metadata and it has a key `summary` that you want to extract and put into the resulting info dict as `description`, you should be prepared for this key to be missing from the `meta` dict, i.e. you should extract it as `meta.get('summary')` and not `meta['summary']`. Similarly, you should pass `fatal=False` when extracting data from a webpage with `_search_regex`/`_html_search_regex`.
+9. Check the code with [flake8](https://pypi.python.org/pypi/flake8).
+10. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:

        $ git add youtube_dl/extractor/__init__.py
        $ git add youtube_dl/extractor/yourextractor.py
        $ git commit -m '[yourextractor] Add new extractor'
        $ git push origin yourextractor

-10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
+11. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.

In any case, thank you very much for your contributions!
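
As a companion to step 6 above, this is a hedged sketch of what a `_TESTS` list might look like; every URL and field value is a placeholder to be replaced with real data from the site:

```python
# Inside the hypothetical YourExtractorIE class.
_TESTS = [{
    'url': 'http://yourextractor.com/watch/42',
    # md5 checksum of the first chunk of the downloaded file, computed
    # when the test framework runs the download in test mode.
    'md5': 'insert-md5-checksum-here',
    'info_dict': {
        'id': '42',
        'ext': 'mp4',
        'title': 'Video title goes here',
        # For volatile fields, an 'md5:...' expected value matches the
        # md5 of the actual text instead of the literal string.
        'description': 'md5:098f6bcd4621d373cade4e832627b4f6',
    },
}, {
    # A URL that should only be matched against _VALID_URL,
    # without actually downloading anything.
    'url': 'http://yourextractor.com/embed/42',
    'only_matching': True,
}]
```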
@@ -935,9 +962,9 @@ with youtube_dl.YoutubeDL(ydl_opts) as ydl:
 Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues>. Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the IRC channel [#youtube-dl](irc://chat.freenode.net/#youtube-dl) on freenode ([webchat](http://webchat.freenode.net/?randomnick=1&channels=youtube-dl)).

-**Please include the full output of youtube-dl when run with `-v`**, i.e. add `-v` flag to your command line, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
+**Please include the full output of youtube-dl when run with `-v`**, i.e. **add** `-v` flag to **your command line**, copy the **whole** output and post it in the issue body wrapped in \`\`\` for better formatting. It should look similar to this:
 ```
-$ youtube-dl -v http://www.youtube.com/watch?v=BaW_jenozKcj
+$ youtube-dl -v <your command line>
 [debug] System config: []
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']

View File

@@ -30,6 +30,7 @@
 - **AlJazeera**
 - **Allocine**
 - **AlphaPorno**
+- **AnimeOnDemand**
 - **anitube.se**
 - **AnySex**
 - **Aparat**
@@ -49,9 +50,11 @@
 - **arte.tv:ddc**
 - **arte.tv:embed**
 - **arte.tv:future**
+- **arte.tv:magazine**
 - **AtresPlayer**
 - **ATTTechChannel**
 - **AudiMedia**
+- **AudioBoom**
 - **audiomack**
 - **audiomack:album**
 - **Azubu**
@@ -71,12 +74,15 @@
 - **Bigflix**
 - **Bild**: Bild.de
 - **BiliBili**
+- **BioBioChileTV**
 - **BleacherReport**
 - **BleacherReportCMS**
 - **blinkx**
 - **Bloomberg**
+- **BokeCC**
 - **Bpb**: Bundeszentrale für politische Bildung
 - **BR**: Bayerischer Rundfunk Mediathek
+- **BravoTV**
 - **Break**
 - **brightcove:legacy**
 - **brightcove:new**
@@ -95,6 +101,7 @@
 - **CBSNews**: CBS News
 - **CBSNewsLiveVideo**: CBS News Live Videos
 - **CBSSports**
+- **CDA**
 - **CeskaTelevize**
 - **channel9**: Channel 9
 - **Chaturbate**
@@ -164,6 +171,8 @@
 - **Dump**
 - **Dumpert**
 - **dvtv**: http://video.aktualne.cz/
+- **dw**
+- **dw:article**
 - **EaglePlatform**
 - **EbaumsWorld**
 - **EchoMsk**
@@ -187,10 +196,10 @@
 - **ExpoTV**
 - **ExtremeTube**
 - **facebook**
+- **facebook:post**
 - **faz.net**
 - **fc2**
 - **Fczenit**
-- **features.aol.com**
 - **fernsehkritik.tv**
 - **Firstpost**
 - **FiveTV**
@@ -237,6 +246,7 @@
 - **GPUTechConf**
 - **Groupon**
 - **Hark**
+- **HBO**
 - **HearThisAt**
 - **Heise**
 - **HellPorno**
@@ -290,6 +300,7 @@
 - **kontrtube**: KontrTube.ru - Труба зовёт
 - **KrasView**: Красвью
 - **Ku6**
+- **KUSI**
 - **kuwo:album**: 酷我音乐 - 专辑
 - **kuwo:category**: 酷我音乐 - 分类
 - **kuwo:chart**: 酷我音乐 - 排行榜
@@ -298,12 +309,11 @@
 - **kuwo:song**: 酷我音乐
 - **la7.tv**
 - **Laola1Tv**
+- **Le**: 乐视网
 - **Lecture2Go**
 - **Lemonde**
-- **Letv**: 乐视网
+- **LePlaylist**
 - **LetvCloud**: 乐视云
-- **LetvPlaylist**
-- **LetvTv**
 - **Libsyn**
 - **life:embed**
 - **lifenews**: LIFE | NEWS
@@ -321,6 +331,7 @@
 - **m6**
 - **macgamestore**: MacGameStore trailers
 - **mailru**: Видео@Mail.Ru
+- **MakersChannel**
 - **MakerTV**
 - **Malemotion**
 - **MatchTV**
@@ -331,10 +342,12 @@
 - **Mgoon**
 - **Minhateca**
 - **MinistryGrid**
+- **Minoto**
 - **miomio.tv**
 - **MiTele**: mitele.es
 - **mixcloud**
 - **MLB**
+- **Mnet**
 - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
 - **Mofosex**
 - **Mojvideo**
@@ -360,7 +373,7 @@
 - **MySpace:album**
 - **MySpass**
 - **Myvi**
-- **myvideo**
+- **myvideo** (Currently broken)
 - **MyVidster**
 - **n-tv.de**
 - **NationalGeographic**
@@ -410,6 +423,7 @@
 - **NowTV** (Currently broken)
 - **NowTVList**
 - **nowvideo**: NowVideo
+- **Noz**
 - **npo**: npo.nl and ntr.nl
 - **npo.nl:live**
 - **npo.nl:radio**
@@ -417,6 +431,7 @@
 - **Npr**
 - **NRK**
 - **NRKPlaylist**
+- **NRKSkole**: NRK Skole
 - **NRKTV**: NRK TV and NRK Radio
 - **ntv.ru**
 - **Nuvid**
@@ -429,6 +444,7 @@
 - **OnionStudios**
 - **Ooyala**
 - **OoyalaExternal**
+- **Openload**
 - **OraTV**
 - **orf:fm4**: radio FM4
 - **orf:iptv**: iptv.ORF.at
@@ -460,6 +476,7 @@
 - **PornHd**
 - **PornHub**
 - **PornHubPlaylist**
+- **PornHubUserVideos**
 - **Pornotube**
 - **PornoVoisines**
 - **PornoXO**
@@ -488,6 +505,7 @@
 - **Restudy**
 - **ReverbNation**
 - **Revision3**
+- **RICE**
 - **RingTV**
 - **RottenTomatoes**
 - **Roxwel**
@@ -512,6 +530,7 @@
 - **RUTV**: RUTV.RU
 - **Ruutu**
 - **safari**: safaribooksonline.com online video
+- **safari:api**
 - **safari:course**: safaribooksonline.com online courses
 - **Sandia**: Sandia National Laboratories
 - **Sapo**: SAPO Vídeos
@@ -522,6 +541,7 @@
 - **screen.yahoo:search**: Yahoo screen search
 - **Screencast**
 - **ScreencastOMatic**
+- **ScreenJunkies**
 - **ScreenwaveMedia**
 - **SenateISVP**
 - **ServingSys**
@@ -555,7 +575,6 @@
 - **southpark.de**
 - **southpark.nl**
 - **southparkstudios.dk**
-- **Space**
 - **SpankBang**
 - **Spankwire**
 - **Spiegel**
@@ -605,7 +624,9 @@
 - **TheOnion**
 - **ThePlatform**
 - **ThePlatformFeed**
+- **TheScene**
 - **TheSixtyOne**
+- **TheStar**
 - **ThisAmericanLife**
 - **ThisAV**
 - **THVideo**
@@ -615,6 +636,7 @@
 - **TMZ**
 - **TMZArticle**
 - **TNAFlix**
+- **TNAFlixNetworkEmbed**
 - **toggle**
 - **tou.tv**
 - **Toypics**: Toypics user profile
@@ -638,6 +660,7 @@
 - **tv.dfb.de**
 - **TV2**
 - **TV2Article**
+- **TV3**
 - **TV4**: tv4.se and tv4play.se
 - **TVC**
 - **TVCArticle**
@@ -655,6 +678,7 @@
 - **twitch:video**
 - **twitch:vod**
 - **twitter**
+- **twitter:amplify**
 - **twitter:card**
 - **Ubu**
 - **udemy**
@@ -662,8 +686,10 @@
 - **UDNEmbed**: 聯合影音
 - **Unistra**
 - **Urort**: NRK P3 Urørt
+- **USAToday**
 - **ustream**
 - **ustream:channel**
+- **Ustudio**
 - **Varzesh3**
 - **Vbox7**
 - **VeeHD**
@@ -674,12 +700,13 @@
 - **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
 - **vh1.com**
 - **Vice**
+- **ViceShow**
 - **Viddler**
 - **video.google:search**: Google Video search
 - **video.mit.edu**
 - **VideoDetective**
 - **videofy.me**
-- **VideoMega** (Currently broken)
+- **VideoMega**
 - **videomore**
 - **videomore:season**
 - **videomore:video**
@@ -701,6 +728,7 @@
 - **vimeo:channel**
 - **vimeo:group**
 - **vimeo:likes**: Vimeo user likes
+- **vimeo:ondemand**
 - **vimeo:review**: Review pages on vimeo
 - **vimeo:user**
 - **vimeo:watchlater**: Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)
@@ -765,6 +793,7 @@
 - **youtube:channel**: YouTube.com channels
 - **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
 - **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
+- **youtube:live**: YouTube.com live streams
 - **youtube:playlist**: YouTube.com playlists
 - **youtube:playlists**: YouTube.com user/channel playlists
 - **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)

View File

@@ -11,8 +11,11 @@ import sys
 import youtube_dl.extractor
 from youtube_dl import YoutubeDL
-from youtube_dl.utils import (
+from youtube_dl.compat import (
+    compat_os_name,
     compat_str,
+)
+from youtube_dl.utils import (
     preferredencoding,
     write_string,
 )
@@ -42,7 +45,7 @@ def report_warning(message):
     Print the message to stderr, it will be prefixed with 'WARNING:'
     If stderr is a tty file the 'WARNING:' will be colored
     '''
-    if sys.stderr.isatty() and os.name != 'nt':
+    if sys.stderr.isatty() and compat_os_name != 'nt':
         _msg_header = '\033[0;33mWARNING:\033[0m'
     else:
         _msg_header = 'WARNING:'

View File

@@ -222,6 +222,11 @@ class TestFormatSelection(unittest.TestCase):
         downloaded = ydl.downloaded_info_dicts[0]
         self.assertEqual(downloaded['format_id'], 'dash-video-low')

+        ydl = YDL({'format': 'bestvideo[format_id^=dash][format_id$=low]'})
+        ydl.process_ie_result(info_dict.copy())
+        downloaded = ydl.downloaded_info_dicts[0]
+        self.assertEqual(downloaded['format_id'], 'dash-video-low')
+
         formats = [
             {'format_id': 'vid-vcodec-dot', 'ext': 'mp4', 'preference': 1, 'vcodec': 'avc1.123456', 'acodec': 'none', 'url': TEST_URL},
         ]
@@ -234,7 +239,7 @@ class TestFormatSelection(unittest.TestCase):
     def test_youtube_format_selection(self):
         order = [
-            '38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '36', '17', '13',
+            '38', '37', '46', '22', '45', '35', '44', '18', '34', '43', '6', '5', '17', '36', '13',
             # Apple HTTP Live Streaming
             '96', '95', '94', '93', '92', '132', '151',
             # 3D
@@ -502,6 +507,9 @@ class TestYoutubeDL(unittest.TestCase):
         assertRegexpMatches(self, ydl._format_note({
             'vbr': 10,
         }), '^\s*10k$')
+        assertRegexpMatches(self, ydl._format_note({
+            'fps': 30,
+        }), '^30fps$')

     def test_postprocessors(self):
         filename = 'post-processor-testfile.mp4'

View File

@@ -1,4 +1,5 @@
 #!/usr/bin/env python
+# coding: utf-8
 from __future__ import unicode_literals

 # Allow direct execution
@@ -52,7 +53,12 @@ class TestHTTP(unittest.TestCase):
             ('localhost', 0), HTTPTestRequestHandler)
         self.httpd.socket = ssl.wrap_socket(
             self.httpd.socket, certfile=certfn, server_side=True)
-        self.port = self.httpd.socket.getsockname()[1]
+        if os.name == 'java':
+            # In Jython SSLSocket is not a subclass of socket.socket
+            sock = self.httpd.socket.sock
+        else:
+            sock = self.httpd.socket
+        self.port = sock.getsockname()[1]
         self.server_thread = threading.Thread(target=self.httpd.serve_forever)
         self.server_thread.daemon = True
         self.server_thread.start()
@@ -115,5 +121,14 @@ class TestProxy(unittest.TestCase):
         response = ydl.urlopen(req).read().decode('utf-8')
         self.assertEqual(response, 'cn: {0}'.format(url))

+    def test_proxy_with_idn(self):
+        ydl = YoutubeDL({
+            'proxy': 'localhost:{0}'.format(self.port),
+        })
+        url = 'http://中文.tw/'
+        response = ydl.urlopen(url).read().decode('utf-8')
+        # b'xn--fiq228c' is '中文'.encode('idna')
+        self.assertEqual(response, 'normal: http://xn--fiq228c.tw/')
+
 if __name__ == '__main__':
     unittest.main()

View File

@@ -0,0 +1,47 @@
#!/usr/bin/env python
from __future__ import unicode_literals

# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from test.helper import FakeYDL
from youtube_dl.extractor import IqiyiIE


class IqiyiIEWithCredentials(IqiyiIE):
    def _get_login_info(self):
        return 'foo', 'bar'


class WarningLogger(object):
    def __init__(self):
        self.messages = []

    def warning(self, msg):
        self.messages.append(msg)

    def debug(self, msg):
        pass

    def error(self, msg):
        pass


class TestIqiyiSDKInterpreter(unittest.TestCase):
    def test_iqiyi_sdk_interpreter(self):
        '''
        Test the functionality of IqiyiSDKInterpreter by trying to log in

        If `sign` is incorrect, /validate call throws an HTTP 556 error
        '''
        logger = WarningLogger()
        ie = IqiyiIEWithCredentials(FakeYDL({'logger': logger}))
        ie._login()
        self.assertTrue('unable to log in:' in logger.messages[0])


if __name__ == '__main__':
    unittest.main()

View File

@@ -18,6 +18,7 @@ import xml.etree.ElementTree
 from youtube_dl.utils import (
     age_restricted,
     args_to_str,
+    encode_base_n,
     clean_html,
     DateRange,
     detect_exe_version,
@@ -27,6 +28,7 @@ from youtube_dl.utils import (
     encodeFilename,
     escape_rfc3986,
     escape_url,
+    extract_attributes,
     ExtractorError,
     find_xpath_attr,
     fix_xml_ampersands,
@@ -35,10 +37,12 @@ from youtube_dl.utils import (
     is_html,
     js_to_json,
     limit_length,
+    ohdave_rsa_encrypt,
     OnDemandPagedList,
     orderedSet,
     parse_duration,
     parse_filesize,
+    parse_count,
     parse_iso8601,
     read_batch_urls,
     sanitize_filename,
@@ -59,6 +63,7 @@ from youtube_dl.utils import (
     lowercase_escape,
     url_basename,
     urlencode_postdata,
+    update_url_query,
     version_tuple,
     xpath_with_ns,
     xpath_element,
@@ -73,7 +78,10 @@ from youtube_dl.utils import (
     cli_bool_option,
 )
 from youtube_dl.compat import (
+    compat_chr,
     compat_etree_fromstring,
+    compat_urlparse,
+    compat_parse_qs,
 )
@@ -248,6 +256,7 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(
             unified_strdate('2/2/2015 6:47:40 PM', day_first=False),
             '20150202')
+        self.assertEqual(unified_strdate('Feb 14th 2016 5:45PM'), '20160214')
         self.assertEqual(unified_strdate('25-09-2014'), '20140925')
         self.assertEqual(unified_strdate('UNKNOWN DATE FORMAT'), None)
@@ -451,6 +460,40 @@ class TestUtil(unittest.TestCase):
         data = urlencode_postdata({'username': 'foo@bar.com', 'password': '1234'})
         self.assertTrue(isinstance(data, bytes))

+    def test_update_url_query(self):
+        def query_dict(url):
+            return compat_parse_qs(compat_urlparse.urlparse(url).query)
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'quality': ['HD'], 'format': ['mp4']})),
+            query_dict('http://example.com/path?quality=HD&format=mp4'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'system': ['LINUX', 'WINDOWS']})),
+            query_dict('http://example.com/path?system=LINUX&system=WINDOWS'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'fields': 'id,formats,subtitles'})),
+            query_dict('http://example.com/path?fields=id,formats,subtitles'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'fields': ('id,formats,subtitles', 'thumbnails')})),
+            query_dict('http://example.com/path?fields=id,formats,subtitles&fields=thumbnails'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path?manifest=f4m', {'manifest': []})),
+            query_dict('http://example.com/path'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path?system=LINUX&system=WINDOWS', {'system': 'LINUX'})),
+            query_dict('http://example.com/path?system=LINUX'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'fields': b'id,formats,subtitles'})),
+            query_dict('http://example.com/path?fields=id,formats,subtitles'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'width': 1080, 'height': 720})),
+            query_dict('http://example.com/path?width=1080&height=720'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'bitrate': 5020.43})),
+            query_dict('http://example.com/path?bitrate=5020.43'))
+        self.assertEqual(query_dict(update_url_query(
+            'http://example.com/path', {'test': '第二行тест'})),
+            query_dict('http://example.com/path?test=%E7%AC%AC%E4%BA%8C%E8%A1%8C%D1%82%D0%B5%D1%81%D1%82'))
+
     def test_dict_get(self):
         FALSE_VALUES = {
             'none': None,
@@ -534,11 +577,11 @@ class TestUtil(unittest.TestCase):
         )
         self.assertEqual(
             escape_url('http://тест.рф/фрагмент'),
-            'http://тест.рф/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
+            'http://xn--e1aybc.xn--p1ai/%D1%84%D1%80%D0%B0%D0%B3%D0%BC%D0%B5%D0%BD%D1%82'
         )
         self.assertEqual(
             escape_url('http://тест.рф/абв?абв=абв#абв'),
-            'http://тест.рф/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
+            'http://xn--e1aybc.xn--p1ai/%D0%B0%D0%B1%D0%B2?%D0%B0%D0%B1%D0%B2=%D0%B0%D0%B1%D0%B2#%D0%B0%D0%B1%D0%B2'
         )
         self.assertEqual(escape_url('http://vimeo.com/56015672#at=0'), 'http://vimeo.com/56015672#at=0')
@@ -588,6 +631,44 @@ class TestUtil(unittest.TestCase):
         on = js_to_json('{"abc": "def",}')
         self.assertEqual(json.loads(on), {'abc': 'def'})

+    def test_extract_attributes(self):
+        self.assertEqual(extract_attributes('<e x="y">'), {'x': 'y'})
+        self.assertEqual(extract_attributes("<e x='y'>"), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e x=y>'), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e x="a \'b\' c">'), {'x': "a 'b' c"})
+        self.assertEqual(extract_attributes('<e x=\'a "b" c\'>'), {'x': 'a "b" c'})
+        self.assertEqual(extract_attributes('<e x="&#121;">'), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e x="&#x79;">'), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e x="&amp;">'), {'x': '&'})  # XML
+        self.assertEqual(extract_attributes('<e x="&quot;">'), {'x': '"'})
+        self.assertEqual(extract_attributes('<e x="&pound;">'), {'x': '£'})  # HTML 3.2
+        self.assertEqual(extract_attributes('<e x="&lambda;">'), {'x': 'λ'})  # HTML 4.0
+        self.assertEqual(extract_attributes('<e x="&foo">'), {'x': '&foo'})
+        self.assertEqual(extract_attributes('<e x="\'">'), {'x': "'"})
+        self.assertEqual(extract_attributes('<e x=\'"\'>'), {'x': '"'})
+        self.assertEqual(extract_attributes('<e x >'), {'x': None})
+        self.assertEqual(extract_attributes('<e x=y a>'), {'x': 'y', 'a': None})
+        self.assertEqual(extract_attributes('<e x= y>'), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e x=1 y=2 x=3>'), {'y': '2', 'x': '3'})
+        self.assertEqual(extract_attributes('<e \nx=\ny\n>'), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e \nx=\n"y"\n>'), {'x': 'y'})
+        self.assertEqual(extract_attributes("<e \nx=\n'y'\n>"), {'x': 'y'})
+        self.assertEqual(extract_attributes('<e \nx="\ny\n">'), {'x': '\ny\n'})
+        self.assertEqual(extract_attributes('<e CAPS=x>'), {'caps': 'x'})  # Names lowercased
+        self.assertEqual(extract_attributes('<e x=1 X=2>'), {'x': '2'})
+        self.assertEqual(extract_attributes('<e X=1 x=2>'), {'x': '2'})
+        self.assertEqual(extract_attributes('<e _:funny-name1=1>'), {'_:funny-name1': '1'})
+        self.assertEqual(extract_attributes('<e x="Fáilte 世界 \U0001f600">'), {'x': 'Fáilte 世界 \U0001f600'})
+        self.assertEqual(extract_attributes('<e x="décompose&#769;">'), {'x': 'décompose\u0301'})
+        # "Narrow" Python builds don't support unicode code points outside BMP.
+        try:
+            compat_chr(0x10000)
+            supports_outside_bmp = True
+        except ValueError:
+            supports_outside_bmp = False
+        if supports_outside_bmp:
+            self.assertEqual(extract_attributes('<e x="Smile &#128512;!">'), {'x': 'Smile \U0001f600!'})
+
     def test_clean_html(self):
         self.assertEqual(clean_html('a:\nb'), 'a: b')
         self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
@@ -613,6 +694,17 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
         self.assertEqual(parse_filesize('1,24 KB'), 1240)

+    def test_parse_count(self):
+        self.assertEqual(parse_count(None), None)
+        self.assertEqual(parse_count(''), None)
+        self.assertEqual(parse_count('0'), 0)
+        self.assertEqual(parse_count('1000'), 1000)
+        self.assertEqual(parse_count('1.000'), 1000)
+        self.assertEqual(parse_count('1.1k'), 1100)
+        self.assertEqual(parse_count('1.1kk'), 1100000)
+        self.assertEqual(parse_count('1.1kk '), 1100000)
+        self.assertEqual(parse_count('1.1kk views'), 1100000)
+
     def test_version_tuple(self):
         self.assertEqual(version_tuple('1'), (1,))
         self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
@@ -792,6 +884,24 @@ The first line
             {'nocheckcertificate': False}, '--check-certificate', 'nocheckcertificate', 'false', 'true', '='),
             ['--check-certificate=true'])

+    def test_ohdave_rsa_encrypt(self):
+        N = 0xab86b6371b5318aaa1d3c9e612a9f1264f372323c8c0f19875b5fc3b3fd3afcc1e5bec527aa94bfa85bffc157e4245aebda05389a5357b75115ac94f074aefcd
+        e = 65537
+
+        self.assertEqual(
+            ohdave_rsa_encrypt(b'aa111222', e, N),
+            '726664bd9a23fd0c70f9f1b84aab5e3905ce1e45a584e9cbcf9bcc7510338fc1986d6c599ff990d923aa43c51c0d9013cd572e13bc58f4ae48f2ed8c0b0ba881')
+
+    def test_encode_base_n(self):
+        self.assertEqual(encode_base_n(0, 30), '0')
+        self.assertEqual(encode_base_n(80, 30), '2k')
+
+        custom_table = '9876543210ZYXWVUTSRQPONMLKJIHGFEDCBA'
+        self.assertEqual(encode_base_n(0, 30, custom_table), '9')
+        self.assertEqual(encode_base_n(80, 30, custom_table), '7P')
+
+        self.assertRaises(ValueError, encode_base_n, 0, 70)
+        self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
+
 if __name__ == '__main__':
     unittest.main()

View File

@@ -8,6 +8,6 @@ deps =
 passenv = HOME
 defaultargs = test --exclude test_download.py --exclude test_age_restriction.py
     --exclude test_subtitles.py --exclude test_write_annotations.py
-    --exclude test_youtube_lists.py
+    --exclude test_youtube_lists.py --exclude test_iqiyi_sdk_interpreter.py
 commands = nosetests --verbose {posargs:{[testenv]defaultargs}}  # --with-coverage --cover-package=youtube_dl --cover-html
                                                                  # test.test_download:TestDownload.test_NowVideo

View File

@@ -24,9 +24,6 @@ import time
import tokenize import tokenize
import traceback import traceback
if os.name == 'nt':
import ctypes
from .compat import ( from .compat import (
compat_basestring, compat_basestring,
compat_cookiejar, compat_cookiejar,
@@ -34,6 +31,7 @@ from .compat import (
compat_get_terminal_size, compat_get_terminal_size,
compat_http_client, compat_http_client,
compat_kwargs, compat_kwargs,
compat_os_name,
compat_str, compat_str,
compat_tokenize_tokenize, compat_tokenize_tokenize,
compat_urllib_error, compat_urllib_error,
@@ -87,6 +85,7 @@ from .extractor import get_info_extractor, gen_extractors
from .downloader import get_suitable_downloader from .downloader import get_suitable_downloader
from .downloader.rtmp import rtmpdump_version from .downloader.rtmp import rtmpdump_version
from .postprocessor import ( from .postprocessor import (
FFmpegFixupM3u8PP,
FFmpegFixupM4aPP, FFmpegFixupM4aPP,
FFmpegFixupStretchedPP, FFmpegFixupStretchedPP,
FFmpegMergerPP, FFmpegMergerPP,
@@ -95,6 +94,9 @@ from .postprocessor import (
) )
from .version import __version__ from .version import __version__
if compat_os_name == 'nt':
import ctypes
class YoutubeDL(object): class YoutubeDL(object):
"""YoutubeDL class. """YoutubeDL class.
@@ -450,7 +452,7 @@ class YoutubeDL(object):
def to_console_title(self, message): def to_console_title(self, message):
if not self.params.get('consoletitle', False): if not self.params.get('consoletitle', False):
return return
if os.name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow(): if compat_os_name == 'nt' and ctypes.windll.kernel32.GetConsoleWindow():
# c_wchar_p() might not be necessary if `message` is # c_wchar_p() might not be necessary if `message` is
# already of type unicode() # already of type unicode()
ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message)) ctypes.windll.kernel32.SetConsoleTitleW(ctypes.c_wchar_p(message))
@@ -521,7 +523,7 @@ class YoutubeDL(object):
else: else:
if self.params.get('no_warnings'): if self.params.get('no_warnings'):
return return
if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt': if not self.params.get('no_color') and self._err_file.isatty() and compat_os_name != 'nt':
_msg_header = '\033[0;33mWARNING:\033[0m' _msg_header = '\033[0;33mWARNING:\033[0m'
else: else:
_msg_header = 'WARNING:' _msg_header = 'WARNING:'
@@ -533,7 +535,7 @@ class YoutubeDL(object):
Do the same as trouble, but prefixes the message with 'ERROR:', colored Do the same as trouble, but prefixes the message with 'ERROR:', colored
in red if stderr is a tty file. in red if stderr is a tty file.
''' '''
if not self.params.get('no_color') and self._err_file.isatty() and os.name != 'nt': if not self.params.get('no_color') and self._err_file.isatty() and compat_os_name != 'nt':
_msg_header = '\033[0;31mERROR:\033[0m' _msg_header = '\033[0;31mERROR:\033[0m'
else: else:
_msg_header = 'ERROR:' _msg_header = 'ERROR:'
@@ -566,7 +568,7 @@ class YoutubeDL(object):
elif template_dict.get('height'): elif template_dict.get('height'):
template_dict['resolution'] = '%sp' % template_dict['height'] template_dict['resolution'] = '%sp' % template_dict['height']
elif template_dict.get('width'): elif template_dict.get('width'):
template_dict['resolution'] = '?x%d' % template_dict['width'] template_dict['resolution'] = '%dx?' % template_dict['width']
sanitize = lambda k, v: sanitize_filename( sanitize = lambda k, v: sanitize_filename(
compat_str(v), compat_str(v),
@@ -605,12 +607,12 @@ class YoutubeDL(object):
         if rejecttitle:
             if re.search(rejecttitle, title, re.IGNORECASE):
                 return '"' + title + '" title matched reject pattern "' + rejecttitle + '"'
-        date = info_dict.get('upload_date', None)
+        date = info_dict.get('upload_date')
         if date is not None:
             dateRange = self.params.get('daterange', DateRange())
             if date not in dateRange:
                 return '%s upload date is not in range %s' % (date_from_str(date).isoformat(), dateRange)
-        view_count = info_dict.get('view_count', None)
+        view_count = info_dict.get('view_count')
         if view_count is not None:
             min_views = self.params.get('min_views')
             if min_views is not None and view_count < min_views:
@@ -747,18 +749,18 @@ class YoutubeDL(object):
                 new_result, download=download, extra_info=extra_info)
         elif result_type == 'playlist' or result_type == 'multi_video':
             # We process each entry in the playlist
-            playlist = ie_result.get('title', None) or ie_result.get('id', None)
+            playlist = ie_result.get('title') or ie_result.get('id')
             self.to_screen('[download] Downloading playlist: %s' % playlist)
 
             playlist_results = []
 
             playliststart = self.params.get('playliststart', 1) - 1
-            playlistend = self.params.get('playlistend', None)
+            playlistend = self.params.get('playlistend')
             # For backwards compatibility, interpret -1 as whole list
             if playlistend == -1:
                 playlistend = None
 
-            playlistitems_str = self.params.get('playlist_items', None)
+            playlistitems_str = self.params.get('playlist_items')
             playlistitems = None
             if playlistitems_str is not None:
                 def iter_playlistitems(format):
@@ -782,7 +784,7 @@ class YoutubeDL(object):
                 entries = ie_entries[playliststart:playlistend]
                 n_entries = len(entries)
                 self.to_screen(
-                    "[%s] playlist %s: Collected %d video ids (downloading %d of them)" %
+                    '[%s] playlist %s: Collected %d video ids (downloading %d of them)' %
                     (ie_result['extractor'], playlist, n_all_entries, n_entries))
             elif isinstance(ie_entries, PagedList):
                 if playlistitems:
@@ -796,7 +798,7 @@ class YoutubeDL(object):
                     playliststart, playlistend)
                 n_entries = len(entries)
                 self.to_screen(
-                    "[%s] playlist %s: Downloading %d videos" %
+                    '[%s] playlist %s: Downloading %d videos' %
                     (ie_result['extractor'], playlist, n_entries))
             else:  # iterable
                 if playlistitems:
@@ -807,7 +809,7 @@ class YoutubeDL(object):
                     ie_entries, playliststart, playlistend))
                 n_entries = len(entries)
                 self.to_screen(
-                    "[%s] playlist %s: Downloading %d videos" %
+                    '[%s] playlist %s: Downloading %d videos' %
                     (ie_result['extractor'], playlist, n_entries))
 
             if self.params.get('playlistreverse', False):
@@ -903,7 +905,7 @@ class YoutubeDL(object):
             '*=': lambda attr, value: value in attr,
         }
         str_operator_rex = re.compile(r'''(?x)
-            \s*(?P<key>ext|acodec|vcodec|container|protocol)
+            \s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)
            \s*(?P<op>%s)(?P<none_inclusive>\s*\?)?
            \s*(?P<value>[a-zA-Z0-9._-]+)
            \s*$
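The extended str_operator_rex above means format_id can now be used in string filters, e.g. a format selector such as best[format_id*=hls]. A rough sketch of how one of those filters evaluates, simplified to four operators (the real code also supports negation and the trailing '?' none-inclusive marker):

    import operator
    import re

    STR_OPERATORS = {
        '=': operator.eq,
        '^=': lambda attr, value: attr.startswith(value),
        '$=': lambda attr, value: attr.endswith(value),
        '*=': lambda attr, value: value in attr,
    }

    def matches_str_filter(fmt, filter_spec):
        # e.g. filter_spec = 'format_id*=hls' against {'format_id': 'hls-1080p'}
        m = re.match(
            r'\s*(?P<key>ext|acodec|vcodec|container|protocol|format_id)'
            r'\s*(?P<op>\^=|\$=|\*=|=)\s*(?P<value>[a-zA-Z0-9._-]+)\s*$',
            filter_spec)
        if not m:
            raise ValueError('invalid filter %r' % filter_spec)
        attr = fmt.get(m.group('key'))
        return attr is not None and STR_OPERATORS[m.group('op')](attr, m.group('value'))

    assert matches_str_filter({'format_id': 'hls-1080p'}, 'format_id*=hls')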
@@ -1232,6 +1234,10 @@ class YoutubeDL(object):
                 if t.get('id') is None:
                     t['id'] = '%d' % i
 
+        if self.params.get('list_thumbnails'):
+            self.list_thumbnails(info_dict)
+            return
+
         if thumbnails and 'thumbnail' not in info_dict:
             info_dict['thumbnail'] = thumbnails[-1]['url']
@@ -1333,9 +1339,6 @@ class YoutubeDL(object):
         if self.params.get('listformats'):
             self.list_formats(info_dict)
             return
-        if self.params.get('list_thumbnails'):
-            self.list_thumbnails(info_dict)
-            return
 
         req_format = self.params.get('format')
         if req_format is None:
@@ -1631,12 +1634,14 @@ class YoutubeDL(object):
                     self.report_error('content too short (expected %s bytes and served %s)' % (err.expected, err.downloaded))
                     return
 
-            if success:
+            if success and filename != '-':
                 # Fixup content
                 fixup_policy = self.params.get('fixup')
                 if fixup_policy is None:
                     fixup_policy = 'detect_or_warn'
 
+                INSTALL_FFMPEG_MESSAGE = 'Install ffmpeg or avconv to fix this automatically.'
+
                 stretched_ratio = info_dict.get('stretched_ratio')
                 if stretched_ratio is not None and stretched_ratio != 1:
                     if fixup_policy == 'warn':
@@ -1649,15 +1654,18 @@ class YoutubeDL(object):
                             info_dict['__postprocessors'].append(stretched_pp)
                         else:
                             self.report_warning(
-                                '%s: Non-uniform pixel ratio (%s). Install ffmpeg or avconv to fix this automatically.' % (
-                                    info_dict['id'], stretched_ratio))
+                                '%s: Non-uniform pixel ratio (%s). %s'
+                                % (info_dict['id'], stretched_ratio, INSTALL_FFMPEG_MESSAGE))
                     else:
                         assert fixup_policy in ('ignore', 'never')
 
-                if info_dict.get('requested_formats') is None and info_dict.get('container') == 'm4a_dash':
+                if (info_dict.get('requested_formats') is None and
+                        info_dict.get('container') == 'm4a_dash'):
                     if fixup_policy == 'warn':
-                        self.report_warning('%s: writing DASH m4a. Only some players support this container.' % (
-                            info_dict['id']))
+                        self.report_warning(
+                            '%s: writing DASH m4a. '
+                            'Only some players support this container.'
+                            % info_dict['id'])
                     elif fixup_policy == 'detect_or_warn':
                         fixup_pp = FFmpegFixupM4aPP(self)
                         if fixup_pp.available:
@@ -1665,8 +1673,27 @@ class YoutubeDL(object):
                             info_dict['__postprocessors'].append(fixup_pp)
                         else:
                             self.report_warning(
-                                '%s: writing DASH m4a. Only some players support this container. Install ffmpeg or avconv to fix this automatically.' % (
-                                    info_dict['id']))
+                                '%s: writing DASH m4a. '
+                                'Only some players support this container. %s'
+                                % (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
+                    else:
+                        assert fixup_policy in ('ignore', 'never')
+
+                if (info_dict.get('protocol') == 'm3u8_native' or
+                        info_dict.get('protocol') == 'm3u8' and
+                        self.params.get('hls_prefer_native')):
+                    if fixup_policy == 'warn':
+                        self.report_warning('%s: malformated aac bitstream.' % (
+                            info_dict['id']))
+                    elif fixup_policy == 'detect_or_warn':
+                        fixup_pp = FFmpegFixupM3u8PP(self)
+                        if fixup_pp.available:
+                            info_dict.setdefault('__postprocessors', [])
+                            info_dict['__postprocessors'].append(fixup_pp)
+                        else:
+                            self.report_warning(
+                                '%s: malformated aac bitstream. %s'
+                                % (info_dict['id'], INSTALL_FFMPEG_MESSAGE))
                     else:
                         assert fixup_policy in ('ignore', 'never')
@@ -1830,7 +1857,9 @@ class YoutubeDL(object):
         if fdict.get('vbr') is not None:
             res += '%4dk' % fdict['vbr']
         if fdict.get('fps') is not None:
-            res += ', %sfps' % fdict['fps']
+            if res:
+                res += ', '
+            res += '%sfps' % fdict['fps']
         if fdict.get('acodec') is not None:
             if res:
                 res += ', '
@@ -1873,12 +1902,7 @@ class YoutubeDL(object):
     def list_thumbnails(self, info_dict):
         thumbnails = info_dict.get('thumbnails')
         if not thumbnails:
-            tn_url = info_dict.get('thumbnail')
-            if tn_url:
-                thumbnails = [{'id': '0', 'url': tn_url}]
-            else:
-                self.to_screen(
-                    '[info] No thumbnails present for %s' % info_dict['id'])
+            self.to_screen('[info] No thumbnails present for %s' % info_dict['id'])
             return
 
         self.to_screen(
View File: youtube_dl/__init__.py
@@ -144,14 +144,20 @@ def _real_main(argv=None):
         if numeric_limit is None:
             parser.error('invalid max_filesize specified')
         opts.max_filesize = numeric_limit
-    if opts.retries is not None:
-        if opts.retries in ('inf', 'infinite'):
-            opts_retries = float('inf')
+
+    def parse_retries(retries):
+        if retries in ('inf', 'infinite'):
+            parsed_retries = float('inf')
         else:
             try:
-                opts_retries = int(opts.retries)
+                parsed_retries = int(retries)
             except (TypeError, ValueError):
                 parser.error('invalid retry count specified')
+        return parsed_retries
+    if opts.retries is not None:
+        opts.retries = parse_retries(opts.retries)
+    if opts.fragment_retries is not None:
+        opts.fragment_retries = parse_retries(opts.fragment_retries)
     if opts.buffersize is not None:
         numeric_buffersize = FileDownloader.parse_bytes(opts.buffersize)
         if numeric_buffersize is None:
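The new parse_retries helper above is shared by --retries and the new --fragment-retries: 'inf' and 'infinite' become float('inf'), anything else must parse as an integer. The same semantics in isolation (raising instead of calling parser.error, which is an assumption of this sketch):

    def parse_retries(retries):
        # 'inf'/'infinite' mean retry forever; otherwise an integer count.
        if retries in ('inf', 'infinite'):
            return float('inf')
        try:
            return int(retries)
        except (TypeError, ValueError):
            raise ValueError('invalid retry count specified')

    assert parse_retries('inf') == float('inf')
    assert parse_retries('3') == 3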
@@ -299,7 +305,8 @@ def _real_main(argv=None):
         'force_generic_extractor': opts.force_generic_extractor,
         'ratelimit': opts.ratelimit,
         'nooverwrites': opts.nooverwrites,
-        'retries': opts_retries,
+        'retries': opts.retries,
+        'fragment_retries': opts.fragment_retries,
         'buffersize': opts.buffersize,
         'noresizebuffer': opts.noresizebuffer,
         'continuedl': opts.continue_dl,
@@ -355,6 +362,7 @@ def _real_main(argv=None):
         'youtube_include_dash_manifest': opts.youtube_include_dash_manifest,
         'encoding': opts.encoding,
         'extract_flat': opts.extract_flat,
+        'mark_watched': opts.mark_watched,
         'merge_output_format': opts.merge_output_format,
         'postprocessors': postprocessors,
         'fixup': opts.fixup,
View File: youtube_dl/__main__.py
@@ -7,7 +7,7 @@ from __future__ import unicode_literals
 import sys
 
-if __package__ is None and not hasattr(sys, "frozen"):
+if __package__ is None and not hasattr(sys, 'frozen'):
     # direct call of __main__.py
     import os.path
     path = os.path.realpath(os.path.abspath(__file__))
View File: youtube_dl/aes.py
@@ -161,7 +161,7 @@ def aes_decrypt_text(data, password, key_size_bytes):
     nonce = data[:NONCE_LENGTH_BYTES]
     cipher = data[NONCE_LENGTH_BYTES:]
 
-    class Counter:
+    class Counter(object):
         __value = nonce + [0] * (BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES)
 
         def next_value(self):
View File: youtube_dl/compat.py
@@ -77,6 +77,11 @@ try:
 except ImportError:  # Python 2
     from urllib import urlretrieve as compat_urlretrieve
 
+try:
+    from html.parser import HTMLParser as compat_HTMLParser
+except ImportError:  # Python 2
+    from HTMLParser import HTMLParser as compat_HTMLParser
+
 try:
     from subprocess import DEVNULL
@@ -181,20 +186,20 @@
         # parameter := attribute "=" value
         url = req.get_full_url()
 
-        scheme, data = url.split(":", 1)
-        mediatype, data = data.split(",", 1)
+        scheme, data = url.split(':', 1)
+        mediatype, data = data.split(',', 1)
 
         # even base64 encoded data URLs might be quoted so unquote in any case:
         data = compat_urllib_parse_unquote_to_bytes(data)
-        if mediatype.endswith(";base64"):
+        if mediatype.endswith(';base64'):
             data = binascii.a2b_base64(data)
             mediatype = mediatype[:-7]
 
         if not mediatype:
-            mediatype = "text/plain;charset=US-ASCII"
+            mediatype = 'text/plain;charset=US-ASCII'
 
         headers = email.message_from_string(
-            "Content-type: %s\nContent-length: %d\n" % (mediatype, len(data)))
+            'Content-type: %s\nContent-length: %d\n' % (mediatype, len(data)))
 
         return compat_urllib_response.addinfourl(io.BytesIO(data), headers, url)
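The backport above implements RFC 2397 data: URL handling by hand: split off the scheme, split media type from payload, percent-decode, then base64-decode when the media type carries ';base64'. A standalone sketch of just the parsing steps, using the Python 3 names rather than the compat wrappers:

    import binascii
    from urllib.parse import unquote_to_bytes

    def parse_data_url(url):
        scheme, data = url.split(':', 1)
        assert scheme == 'data'
        mediatype, data = data.split(',', 1)
        # Even base64-encoded payloads may be percent-quoted.
        data = unquote_to_bytes(data)
        if mediatype.endswith(';base64'):
            data = binascii.a2b_base64(data)
            mediatype = mediatype[:-7]
        return mediatype or 'text/plain;charset=US-ASCII', data

    print(parse_data_url('data:text/plain;base64,SGVsbG8='))  # ('text/plain', b'Hello')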
@@ -251,6 +256,16 @@
             el.text = el.text.decode('utf-8')
         return doc
 
+if sys.version_info < (2, 7):
+    # Here comes the crazy part: In 2.6, if the xpath is a unicode,
+    # .//node does not match if a node is a direct child of . !
+    def compat_xpath(xpath):
+        if isinstance(xpath, compat_str):
+            xpath = xpath.encode('ascii')
+        return xpath
+else:
+    compat_xpath = lambda xpath: xpath
+
 try:
     from urllib.parse import parse_qs as compat_parse_qs
 except ImportError:  # Python 2
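compat_xpath exists because ElementTree in Python 2.6 mishandles unicode xpath expressions ('.//node' fails to match direct children), so the expression is forced down to a byte string there and passed through unchanged everywhere else. Call sites simply wrap every xpath; a minimal Python 3 demonstration of the pass-through case:

    import xml.etree.ElementTree as etree

    compat_xpath = lambda xpath: xpath  # identity on 2.7+/3.x; .encode('ascii') on 2.6

    doc = etree.fromstring('<root><node>x</node></root>')
    print(doc.find(compat_xpath('.//node')).text)  # -> x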
@@ -268,7 +283,7 @@ except ImportError:  # Python 2
             nv = name_value.split('=', 1)
             if len(nv) != 2:
                 if strict_parsing:
-                    raise ValueError("bad query field: %r" % (name_value,))
+                    raise ValueError('bad query field: %r' % (name_value,))
                 # Handle case of a control-name with no equal sign
                 if keep_blank_values:
                     nv.append('')
@@ -326,6 +341,9 @@ def compat_ord(c):
         return ord(c)
 
+
+compat_os_name = os._name if os.name == 'java' else os.name
+
 
 if sys.version_info >= (3, 0):
     compat_getenv = os.getenv
     compat_expanduser = os.path.expanduser
@@ -346,7 +364,7 @@ else:
     # The following are os.path.expanduser implementations from cpython 2.7.8 stdlib
     # for different platforms with correct environment variables decoding.
 
-    if os.name == 'posix':
+    if compat_os_name == 'posix':
         def compat_expanduser(path):
             """Expand ~ and ~user constructions.  If user or $HOME is unknown,
             do nothing."""
@@ -370,7 +388,7 @@ else:
                 userhome = pwent.pw_dir
             userhome = userhome.rstrip('/')
             return (userhome + path[i:]) or '/'
-    elif os.name == 'nt' or os.name == 'ce':
+    elif compat_os_name == 'nt' or compat_os_name == 'ce':
         def compat_expanduser(path):
             """Expand ~ and ~user constructs.
 
@@ -466,7 +484,7 @@ if sys.version_info < (2, 7):
             if err is not None:
                 raise err
             else:
-                raise socket.error("getaddrinfo returns an empty list")
+                raise socket.error('getaddrinfo returns an empty list')
 else:
     compat_socket_create_connection = socket.create_connection
@@ -540,6 +558,7 @@ else:
         from tokenize import generate_tokens as compat_tokenize_tokenize
 
 __all__ = [
+    'compat_HTMLParser',
     'compat_HTTPError',
     'compat_basestring',
     'compat_chr',
@@ -556,6 +575,7 @@ __all__ = [
     'compat_itertools_count',
     'compat_kwargs',
     'compat_ord',
+    'compat_os_name',
     'compat_parse_qs',
     'compat_print',
     'compat_shlex_split',
@@ -575,6 +595,7 @@ __all__ = [
     'compat_urlparse',
     'compat_urlretrieve',
     'compat_xml_parse_error',
+    'compat_xpath',
     'shlex_quote',
     'subprocess_check_output',
     'workaround_optparse_bug9161',
View File: youtube_dl/downloader/__init__.py
@@ -1,14 +1,16 @@
 from __future__ import unicode_literals
 
 from .common import FileDownloader
-from .external import get_external_downloader
 from .f4m import F4mFD
 from .hls import HlsFD
-from .hls import NativeHlsFD
 from .http import HttpFD
-from .rtsp import RtspFD
 from .rtmp import RtmpFD
 from .dash import DashSegmentsFD
+from .rtsp import RtspFD
+from .external import (
+    get_external_downloader,
+    FFmpegFD,
+)
 
 from ..utils import (
     determine_protocol,
@@ -16,8 +18,8 @@ from ..utils import (
 
 PROTOCOL_MAP = {
     'rtmp': RtmpFD,
-    'm3u8_native': NativeHlsFD,
-    'm3u8': HlsFD,
+    'm3u8_native': HlsFD,
+    'm3u8': FFmpegFD,
     'mms': RtspFD,
     'rtsp': RtspFD,
     'f4m': F4mFD,
@@ -30,14 +32,17 @@ def get_suitable_downloader(info_dict, params={}):
     protocol = determine_protocol(info_dict)
     info_dict['protocol'] = protocol
 
+    # if (info_dict.get('start_time') or info_dict.get('end_time')) and not info_dict.get('requested_formats') and FFmpegFD.can_download(info_dict):
+    #     return FFmpegFD
+
     external_downloader = params.get('external_downloader')
     if external_downloader is not None:
         ed = get_external_downloader(external_downloader)
-        if ed.supports(info_dict):
+        if ed.can_download(info_dict):
             return ed
 
     if protocol == 'm3u8' and params.get('hls_prefer_native'):
-        return NativeHlsFD
+        return HlsFD
 
     return PROTOCOL_MAP.get(protocol, HttpFD)
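After this change both HLS paths share one namespace: 'm3u8_native' always uses the in-process HlsFD, 'm3u8' defers to ffmpeg via FFmpegFD, and hls_prefer_native flips an 'm3u8' download back to the native downloader. A condensed sketch of the selection order, with string stand-ins for the classes and external downloaders omitted:

    PROTOCOL_MAP = {
        'm3u8_native': 'HlsFD',  # in-process HLS downloader
        'm3u8': 'FFmpegFD',      # hand the playlist to ffmpeg/avconv
        'rtmp': 'RtmpFD',
        'mms': 'RtspFD',
        'rtsp': 'RtspFD',
        'f4m': 'F4mFD',
    }

    def choose_downloader(protocol, params):
        if protocol == 'm3u8' and params.get('hls_prefer_native'):
            return 'HlsFD'
        return PROTOCOL_MAP.get(protocol, 'HttpFD')

    assert choose_downloader('m3u8', {'hls_prefer_native': True}) == 'HlsFD'
    assert choose_downloader('m3u8', {}) == 'FFmpegFD'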
View File: youtube_dl/downloader/common.py
@@ -5,6 +5,7 @@ import re
 import sys
 import time
 
+from ..compat import compat_os_name
 from ..utils import (
     encodeFilename,
     error_to_compat_str,
@@ -114,6 +115,10 @@ class FileDownloader(object):
             return '%10s' % '---b/s'
         return '%10s' % ('%s/s' % format_bytes(speed))
 
+    @staticmethod
+    def format_retries(retries):
+        return 'inf' if retries == float('inf') else '%.0f' % retries
+
     @staticmethod
     def best_block_size(elapsed_time, bytes):
         new_min = max(bytes / 2.0, 1.0)
@@ -157,7 +162,7 @@ class FileDownloader(object):
 
     def slow_down(self, start_time, now, byte_counter):
         """Sleep if the download speed is over the rate limit."""
-        rate_limit = self.params.get('ratelimit', None)
+        rate_limit = self.params.get('ratelimit')
         if rate_limit is None or byte_counter == 0:
             return
         if now is None:
@@ -219,7 +224,7 @@ class FileDownloader(object):
         if self.params.get('progress_with_newline', False):
             self.to_screen(fullmsg)
         else:
-            if os.name == 'nt':
+            if compat_os_name == 'nt':
                 prev_len = getattr(self, '_report_progress_prev_line_length',
                                    0)
                 if prev_len > len(fullmsg):
@@ -296,7 +301,9 @@ class FileDownloader(object):
 
     def report_retry(self, count, retries):
         """Report retry in case of HTTP error 5xx"""
-        self.to_screen('[download] Got server HTTP error. Retrying (attempt %d of %.0f)...' % (count, retries))
+        self.to_screen(
+            '[download] Got server HTTP error. Retrying (attempt %d of %s)...'
+            % (count, self.format_retries(retries)))
 
     def report_file_already_downloaded(self, file_name):
         """Report file has already been fully downloaded."""
View File: youtube_dl/downloader/dash.py
@@ -1,67 +1,81 @@
 from __future__ import unicode_literals
 
+import os
 import re
 
-from .common import FileDownloader
-from ..utils import sanitized_Request
+from .fragment import FragmentFD
+from ..compat import compat_urllib_error
+from ..utils import (
+    sanitize_open,
+    encodeFilename,
+)
 
 
-class DashSegmentsFD(FileDownloader):
+class DashSegmentsFD(FragmentFD):
     """
     Download segments in a DASH manifest
     """
+
+    FD_NAME = 'dashsegments'
+
     def real_download(self, filename, info_dict):
-        self.report_destination(filename)
-        tmpfilename = self.temp_name(filename)
         base_url = info_dict['url']
-        segment_urls = info_dict['segment_urls']
+        segment_urls = [info_dict['segment_urls'][0]] if self.params.get('test', False) else info_dict['segment_urls']
+        initialization_url = info_dict.get('initialization_url')
 
-        is_test = self.params.get('test', False)
-        remaining_bytes = self._TEST_FILE_SIZE if is_test else None
-        byte_counter = 0
+        ctx = {
+            'filename': filename,
+            'total_frags': len(segment_urls) + (1 if initialization_url else 0),
+        }
 
-        def append_url_to_file(outf, target_url, target_name, remaining_bytes=None):
-            self.to_screen('[DashSegments] %s: Downloading %s' % (info_dict['id'], target_name))
-            req = sanitized_Request(target_url)
-            if remaining_bytes is not None:
-                req.add_header('Range', 'bytes=0-%d' % (remaining_bytes - 1))
-
-            data = self.ydl.urlopen(req).read()
-
-            if remaining_bytes is not None:
-                data = data[:remaining_bytes]
-
-            outf.write(data)
-            return len(data)
+        self._prepare_and_start_frag_download(ctx)
 
         def combine_url(base_url, target_url):
             if re.match(r'^https?://', target_url):
                 return target_url
             return '%s%s%s' % (base_url, '' if base_url.endswith('/') else '/', target_url)
 
-        with open(tmpfilename, 'wb') as outf:
-            if info_dict.get('initialization_url'):
-                append_url_to_file(
-                    outf, combine_url(base_url, info_dict['initialization_url']),
-                    'initialization segment')
-            for i, segment_url in enumerate(segment_urls):
-                segment_len = append_url_to_file(
-                    outf, combine_url(base_url, segment_url),
-                    'segment %d / %d' % (i + 1, len(segment_urls)),
-                    remaining_bytes)
-                byte_counter += segment_len
-                if remaining_bytes is not None:
-                    remaining_bytes -= segment_len
-                    if remaining_bytes <= 0:
-                        break
+        segments_filenames = []
+
+        fragment_retries = self.params.get('fragment_retries', 0)
+
+        def append_url_to_file(target_url, tmp_filename, segment_name):
+            target_filename = '%s-%s' % (tmp_filename, segment_name)
+            count = 0
+            while count <= fragment_retries:
+                try:
+                    success = ctx['dl'].download(target_filename, {'url': combine_url(base_url, target_url)})
+                    if not success:
+                        return False
+                    down, target_sanitized = sanitize_open(target_filename, 'rb')
+                    ctx['dest_stream'].write(down.read())
+                    down.close()
+                    segments_filenames.append(target_sanitized)
+                    break
+                except (compat_urllib_error.HTTPError, ) as err:
+                    # YouTube may often return 404 HTTP error for a fragment causing the
+                    # whole download to fail. However if the same fragment is immediately
+                    # retried with the same request data this usually succeeds (1-2 attemps
+                    # is usually enough) thus allowing to download the whole file successfully.
+                    # So, we will retry all fragments that fail with 404 HTTP error for now.
+                    if err.code != 404:
+                        raise
+                    # Retry fragment
+                    count += 1
+                    if count <= fragment_retries:
+                        self.report_retry_fragment(segment_name, count, fragment_retries)
+            if count > fragment_retries:
+                self.report_error('giving up after %s fragment retries' % fragment_retries)
+                return False
 
-        self.try_rename(tmpfilename, filename)
+        if initialization_url:
+            append_url_to_file(initialization_url, ctx['tmpfilename'], 'Init')
+        for i, segment_url in enumerate(segment_urls):
+            append_url_to_file(segment_url, ctx['tmpfilename'], 'Seg%d' % i)
 
-        self._hook_progress({
-            'downloaded_bytes': byte_counter,
-            'total_bytes': byte_counter,
-            'filename': filename,
-            'status': 'finished',
-        })
+        self._finish_frag_download(ctx)
+
+        for segment_file in segments_filenames:
+            os.remove(encodeFilename(segment_file))
 
         return True
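The comment in the except clause above carries the key insight: YouTube intermittently returns 404 for individual DASH fragments, and immediately re-requesting the same fragment usually succeeds within an attempt or two, so only 404s are retried while every other HTTP error still aborts. The retry shape in isolation (download_fragment is a hypothetical callable; FragmentHTTPError stands in for compat_urllib_error.HTTPError):

    class FragmentHTTPError(Exception):
        def __init__(self, code):
            self.code = code

    def fetch_fragment(download_fragment, fragment_name, fragment_retries=0):
        count = 0
        while count <= fragment_retries:
            try:
                return download_fragment()
            except FragmentHTTPError as err:
                if err.code != 404:
                    raise  # only intermittent 404s are worth retrying
                count += 1
                if count <= fragment_retries:
                    print('Retrying fragment %s (%d of %d)...'
                          % (fragment_name, count, fragment_retries))
        raise RuntimeError('giving up after %d fragment retries' % fragment_retries)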
View File: youtube_dl/downloader/external.py
@@ -2,8 +2,11 @@ from __future__ import unicode_literals
 
 import os.path
 import subprocess
+import sys
+import re
 
 from .common import FileDownloader
+from ..postprocessor.ffmpeg import FFmpegPostProcessor, EXT_TO_OUT_FORMATS
 from ..utils import (
     cli_option,
     cli_valueless_option,
@@ -11,6 +14,8 @@ from ..utils import (
     cli_configuration_args,
     encodeFilename,
     encodeArgument,
+    handle_youtubedl_headers,
+    check_executable,
 )
@@ -45,10 +50,18 @@ class ExternalFD(FileDownloader):
     def exe(self):
         return self.params.get('external_downloader')
 
+    @classmethod
+    def available(cls):
+        return check_executable(cls.get_basename(), [cls.AVAILABLE_OPT])
+
     @classmethod
     def supports(cls, info_dict):
         return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps')
 
+    @classmethod
+    def can_download(cls, info_dict):
+        return cls.available() and cls.supports(info_dict)
+
     def _option(self, command_option, param):
         return cli_option(self.params, command_option, param)
@@ -76,6 +89,8 @@ class ExternalFD(FileDownloader):
 
 class CurlFD(ExternalFD):
+    AVAILABLE_OPT = '-V'
+
     def _make_cmd(self, tmpfilename, info_dict):
         cmd = [self.exe, '--location', '-o', tmpfilename]
         for key, val in info_dict['http_headers'].items():
@@ -89,6 +104,8 @@ class CurlFD(ExternalFD):
 
 class AxelFD(ExternalFD):
+    AVAILABLE_OPT = '-V'
+
     def _make_cmd(self, tmpfilename, info_dict):
         cmd = [self.exe, '-o', tmpfilename]
         for key, val in info_dict['http_headers'].items():
@@ -99,6 +116,8 @@ class AxelFD(ExternalFD):
 
 class WgetFD(ExternalFD):
+    AVAILABLE_OPT = '--version'
+
     def _make_cmd(self, tmpfilename, info_dict):
         cmd = [self.exe, '-O', tmpfilename, '-nv', '--no-cookies']
         for key, val in info_dict['http_headers'].items():
@@ -112,6 +131,8 @@ class WgetFD(ExternalFD):
 
 class Aria2cFD(ExternalFD):
+    AVAILABLE_OPT = '-v'
+
     def _make_cmd(self, tmpfilename, info_dict):
         cmd = [self.exe, '-c']
         cmd += self._configuration_args([
@@ -130,12 +151,112 @@ class Aria2cFD(ExternalFD):
 
 class HttpieFD(ExternalFD):
+    @classmethod
+    def available(cls):
+        return check_executable('http', ['--version'])
+
     def _make_cmd(self, tmpfilename, info_dict):
         cmd = ['http', '--download', '--output', tmpfilename, info_dict['url']]
         for key, val in info_dict['http_headers'].items():
             cmd += ['%s:%s' % (key, val)]
         return cmd
 
+
+class FFmpegFD(ExternalFD):
+    @classmethod
+    def supports(cls, info_dict):
+        return info_dict['protocol'] in ('http', 'https', 'ftp', 'ftps', 'm3u8', 'rtsp', 'rtmp', 'mms')
+
+    @classmethod
+    def available(cls):
+        return FFmpegPostProcessor().available
+
+    def _call_downloader(self, tmpfilename, info_dict):
+        url = info_dict['url']
+        ffpp = FFmpegPostProcessor(downloader=self)
+        if not ffpp.available:
+            self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
+            return False
+        ffpp.check_version()
+
+        args = [ffpp.executable, '-y']
+
+        args += self._configuration_args()
+
+        # start_time = info_dict.get('start_time') or 0
+        # if start_time:
+        #     args += ['-ss', compat_str(start_time)]
+        # end_time = info_dict.get('end_time')
+        # if end_time:
+        #     args += ['-t', compat_str(end_time - start_time)]
+
+        if info_dict['http_headers'] and re.match(r'^https?://', url):
+            # Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
+            # [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
+            headers = handle_youtubedl_headers(info_dict['http_headers'])
+            args += [
+                '-headers',
+                ''.join('%s: %s\r\n' % (key, val) for key, val in headers.items())]
+
+        protocol = info_dict.get('protocol')
+
+        if protocol == 'rtmp':
+            player_url = info_dict.get('player_url')
+            page_url = info_dict.get('page_url')
+            app = info_dict.get('app')
+            play_path = info_dict.get('play_path')
+            tc_url = info_dict.get('tc_url')
+            flash_version = info_dict.get('flash_version')
+            live = info_dict.get('rtmp_live', False)
+            if player_url is not None:
+                args += ['-rtmp_swfverify', player_url]
+            if page_url is not None:
+                args += ['-rtmp_pageurl', page_url]
+            if app is not None:
+                args += ['-rtmp_app', app]
+            if play_path is not None:
+                args += ['-rtmp_playpath', play_path]
+            if tc_url is not None:
+                args += ['-rtmp_tcurl', tc_url]
+            if flash_version is not None:
+                args += ['-rtmp_flashver', flash_version]
+            if live:
+                args += ['-rtmp_live', 'live']
+
+        args += ['-i', url, '-c', 'copy']
+        if protocol == 'm3u8':
+            if self.params.get('hls_use_mpegts', False):
+                args += ['-f', 'mpegts']
+            else:
+                args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
+        elif protocol == 'rtmp':
+            args += ['-f', 'flv']
+        else:
+            args += ['-f', EXT_TO_OUT_FORMATS.get(info_dict['ext'], info_dict['ext'])]
+
+        args = [encodeArgument(opt) for opt in args]
+        args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
+
+        self._debug_cmd(args)
+
+        proc = subprocess.Popen(args, stdin=subprocess.PIPE)
+        try:
+            retval = proc.wait()
+        except KeyboardInterrupt:
+            # subprocces.run would send the SIGKILL signal to ffmpeg and the
+            # mp4 file couldn't be played, but if we ask ffmpeg to quit it
+            # produces a file that is playable (this is mostly useful for live
+            # streams). Note that Windows is not affected and produces playable
+            # files (see https://github.com/rg3/youtube-dl/issues/8300).
+            if sys.platform != 'win32':
+                proc.communicate(b'q')
+            raise
+        return retval
+
+
+class AVconvFD(FFmpegFD):
+    pass
+
 _BY_NAME = dict(
     (klass.get_basename(), klass)
     for name, klass in globals().items()
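The KeyboardInterrupt handler in FFmpegFD above encodes a subtle point: killing ffmpeg outright leaves an MP4 that players reject, whereas writing 'q' to its stdin lets it finalize a playable file (Windows excepted, per issue 8300 referenced in the comment). The pattern on its own, for any long-running ffmpeg command line:

    import subprocess
    import sys

    def run_ffmpeg(args):
        # e.g. args = ['ffmpeg', '-i', 'https://example.com/live.m3u8', '-c', 'copy', 'out.mp4']
        proc = subprocess.Popen(args, stdin=subprocess.PIPE)
        try:
            return proc.wait()
        except KeyboardInterrupt:
            if sys.platform != 'win32':
                # Ask ffmpeg to quit gracefully so the output stays playable.
                proc.communicate(b'q')
            raise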
View File: youtube_dl/downloader/fragment.py
@@ -19,8 +19,17 @@ class HttpQuietDownloader(HttpFD):
 class FragmentFD(FileDownloader):
     """
     A base file downloader class for fragmented media (e.g. f4m/m3u8 manifests).
+
+    Available options:
+
+    fragment_retries:   Number of times to retry a fragment for HTTP error (DASH only)
     """
 
+    def report_retry_fragment(self, fragment_name, count, retries):
+        self.to_screen(
+            '[download] Got server HTTP error. Retrying fragment %s (attempt %d of %s)...'
+            % (fragment_name, count, self.format_retries(retries)))
+
     def _prepare_and_start_frag_download(self, ctx):
         self._prepare_frag_download(ctx)
         self._start_frag_download(ctx)
@@ -38,7 +47,7 @@ class FragmentFD(FileDownloader):
             'continuedl': True,
             'quiet': True,
             'noprogress': True,
-            'ratelimit': self.params.get('ratelimit', None),
+            'ratelimit': self.params.get('ratelimit'),
             'retries': self.params.get('retries', 0),
             'test': self.params.get('test', False),
         }
@@ -99,7 +108,8 @@ class FragmentFD(FileDownloader):
                 state['eta'] = self.calc_eta(
                     start, time_now, estimated_size,
                     state['downloaded_bytes'])
-                state['speed'] = s.get('speed')
+                state['speed'] = s.get('speed') or ctx.get('speed')
+                ctx['speed'] = state['speed']
                 ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
             self._hook_progress(state)
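The last hunk above keeps the progress line from blanking out between fragments: when the inner HTTP downloader has no fresh speed estimate (typically right at a fragment boundary), the previously observed value is carried over through the shared ctx dict. The fallback in isolation, as a minimal sketch:

    def update_speed(state, ctx, reported_speed):
        # Reuse the last known speed when no fresh estimate is available.
        state['speed'] = reported_speed or ctx.get('speed')
        ctx['speed'] = state['speed']
        return state

    ctx = {}
    update_speed({}, ctx, 1024.0)
    print(update_speed({}, ctx, None)['speed'])  # -> 1024.0 (carried over)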
View File: youtube_dl/downloader/hls.py
@@ -1,87 +1,19 @@
 from __future__ import unicode_literals
 
-import os
+import os.path
 import re
-import subprocess
-import sys
 
-from .common import FileDownloader
 from .fragment import FragmentFD
 from ..compat import compat_urlparse
-from ..postprocessor.ffmpeg import FFmpegPostProcessor
 from ..utils import (
-    encodeArgument,
     encodeFilename,
     sanitize_open,
-    handle_youtubedl_headers,
 )
 
 
-class HlsFD(FileDownloader):
-    def real_download(self, filename, info_dict):
-        url = info_dict['url']
-        self.report_destination(filename)
-        tmpfilename = self.temp_name(filename)
-
-        ffpp = FFmpegPostProcessor(downloader=self)
-        if not ffpp.available:
-            self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
-            return False
-        ffpp.check_version()
-
-        args = [ffpp.executable, '-y']
-
-        if info_dict['http_headers'] and re.match(r'^https?://', url):
-            # Trailing \r\n after each HTTP header is important to prevent warning from ffmpeg/avconv:
-            # [http @ 00000000003d2fa0] No trailing CRLF found in HTTP header.
-            headers = handle_youtubedl_headers(info_dict['http_headers'])
-            args += [
-                '-headers',
-                ''.join('%s: %s\r\n' % (key, val) for key, val in headers.items())]
-
-        args += ['-i', url, '-c', 'copy']
-        if self.params.get('hls_use_mpegts', False):
-            args += ['-f', 'mpegts']
-        else:
-            args += ['-f', 'mp4', '-bsf:a', 'aac_adtstoasc']
-
-        args = [encodeArgument(opt) for opt in args]
-        args.append(encodeFilename(ffpp._ffmpeg_filename_argument(tmpfilename), True))
-
-        self._debug_cmd(args)
-
-        proc = subprocess.Popen(args, stdin=subprocess.PIPE)
-        try:
-            retval = proc.wait()
-        except KeyboardInterrupt:
-            # subprocces.run would send the SIGKILL signal to ffmpeg and the
-            # mp4 file couldn't be played, but if we ask ffmpeg to quit it
-            # produces a file that is playable (this is mostly useful for live
-            # streams). Note that Windows is not affected and produces playable
-            # files (see https://github.com/rg3/youtube-dl/issues/8300).
-            if sys.platform != 'win32':
-                proc.communicate(b'q')
-            raise
-        if retval == 0:
-            fsize = os.path.getsize(encodeFilename(tmpfilename))
-            self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
-            self.try_rename(tmpfilename, filename)
-            self._hook_progress({
-                'downloaded_bytes': fsize,
-                'total_bytes': fsize,
-                'filename': filename,
-                'status': 'finished',
-            })
-            return True
-        else:
-            self.to_stderr('\n')
-            self.report_error('%s exited with code %d' % (ffpp.basename, retval))
-            return False
-
-
-class NativeHlsFD(FragmentFD):
-    """ A more limited implementation that does not require ffmpeg """
+class HlsFD(FragmentFD):
+    """ A limited implementation that does not require ffmpeg """
 
     FD_NAME = 'hlsnative'
View File: youtube_dl/downloader/http.py
@@ -140,8 +140,8 @@ class HttpFD(FileDownloader):
             if data_len is not None:
                 data_len = int(data_len) + resume_len
-                min_data_len = self.params.get("min_filesize", None)
-                max_data_len = self.params.get("max_filesize", None)
+                min_data_len = self.params.get('min_filesize')
+                max_data_len = self.params.get('max_filesize')
                 if min_data_len is not None and data_len < min_data_len:
                     self.to_screen('\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
                     return False
View File: youtube_dl/downloader/rtmp.py
@@ -94,15 +94,15 @@ class RtmpFD(FileDownloader):
                 return proc.returncode
 
         url = info_dict['url']
-        player_url = info_dict.get('player_url', None)
-        page_url = info_dict.get('page_url', None)
-        app = info_dict.get('app', None)
-        play_path = info_dict.get('play_path', None)
-        tc_url = info_dict.get('tc_url', None)
-        flash_version = info_dict.get('flash_version', None)
+        player_url = info_dict.get('player_url')
+        page_url = info_dict.get('page_url')
+        app = info_dict.get('app')
+        play_path = info_dict.get('play_path')
+        tc_url = info_dict.get('tc_url')
+        flash_version = info_dict.get('flash_version')
         live = info_dict.get('rtmp_live', False)
-        conn = info_dict.get('rtmp_conn', None)
-        protocol = info_dict.get('rtmp_protocol', None)
+        conn = info_dict.get('rtmp_conn')
+        protocol = info_dict.get('rtmp_protocol')
         real_time = info_dict.get('rtmp_real_time', False)
         no_resume = info_dict.get('no_resume', False)
         continue_dl = self.params.get('continuedl', True)
View File: youtube_dl/extractor/__init__.py
@@ -20,9 +20,13 @@ from .aftonbladet import AftonbladetIE
 from .airmozilla import AirMozillaIE
 from .aljazeera import AlJazeeraIE
 from .alphaporno import AlphaPornoIE
+from .animeondemand import AnimeOnDemandIE
 from .anitube import AnitubeIE
 from .anysex import AnySexIE
-from .aol import AolIE
+from .aol import (
+    AolIE,
+    AolFeaturesIE,
+)
 from .allocine import AllocineIE
 from .aparat import AparatIE
 from .appleconnect import AppleConnectIE
@@ -44,11 +48,13 @@ from .arte import (
     ArteTVFutureIE,
     ArteTVCinemaIE,
     ArteTVDDCIE,
+    ArteTVMagazineIE,
     ArteTVEmbedIE,
 )
 from .atresplayer import AtresPlayerIE
 from .atttechchannel import ATTTechChannelIE
 from .audimedia import AudiMediaIE
+from .audioboom import AudioBoomIE
 from .audiomack import AudiomackIE, AudiomackAlbumIE
 from .azubu import AzubuIE, AzubuLiveIE
 from .baidu import BaiduVideoIE
@@ -66,14 +72,17 @@ from .bet import BetIE
 from .bigflix import BigflixIE
 from .bild import BildIE
 from .bilibili import BiliBiliIE
+from .biobiochiletv import BioBioChileTVIE
 from .bleacherreport import (
     BleacherReportIE,
     BleacherReportCMSIE,
 )
 from .blinkx import BlinkxIE
 from .bloomberg import BloombergIE
+from .bokecc import BokeCCIE
 from .bpb import BpbIE
 from .br import BRIE
+from .bravotv import BravoTVIE
 from .breakcom import BreakIE
 from .brightcove import (
     BrightcoveLegacyIE,
@@ -100,6 +109,7 @@ from .cbsnews import (
 )
 from .cbssports import CBSSportsIE
 from .ccc import CCCIE
+from .cda import CDAIE
 from .ceskatelevize import CeskaTelevizeIE
 from .channel9 import Channel9IE
 from .chaturbate import ChaturbateIE
@@ -128,6 +138,7 @@ from .collegerama import CollegeRamaIE
 from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
 from .comcarcoff import ComCarCoffIE
 from .commonmistakes import CommonMistakesIE, UnicodeBOMIE
+from .commonprotocols import RtmpIE
 from .condenast import CondeNastIE
 from .cracked import CrackedIE
 from .crackle import CrackleIE
@@ -182,6 +193,10 @@ from .dumpert import DumpertIE
 from .defense import DefenseGouvFrIE
 from .discovery import DiscoveryIE
 from .dropbox import DropboxIE
+from .dw import (
+    DWIE,
+    DWArticleIE,
+)
 from .eagleplatform import EaglePlatformIE
 from .ebaumsworld import EbaumsWorldIE
 from .echomsk import EchoMskIE
@@ -206,10 +221,7 @@ from .everyonesmixtape import EveryonesMixtapeIE
 from .exfm import ExfmIE
 from .expotv import ExpoTVIE
 from .extremetube import ExtremeTubeIE
-from .facebook import (
-    FacebookIE,
-    FacebookPostIE,
-)
+from .facebook import FacebookIE
 from .faz import FazIE
 from .fc2 import FC2IE
 from .fczenit import FczenitIE
@@ -274,6 +286,7 @@ from .goshgay import GoshgayIE
 from .gputechconf import GPUTechConfIE
 from .groupon import GrouponIE
 from .hark import HarkIE
+from .hbo import HBOIE
 from .hearthisat import HearThisAtIE
 from .heise import HeiseIE
 from .hellporno import HellPornoIE
@@ -337,6 +350,7 @@ from .konserthusetplay import KonserthusetPlayIE
 from .kontrtube import KontrTubeIE
 from .krasview import KrasViewIE
 from .ku6 import Ku6IE
+from .kusi import KUSIIE
 from .kuwo import (
     KuwoIE,
     KuwoAlbumIE,
@@ -349,10 +363,9 @@ from .la7 import LA7IE
 from .laola1tv import Laola1TvIE
 from .lecture2go import Lecture2GoIE
 from .lemonde import LemondeIE
-from .letv import (
-    LetvIE,
-    LetvTvIE,
-    LetvPlaylistIE,
+from .leeco import (
+    LeIE,
+    LePlaylistIE,
     LetvCloudIE,
 )
 from .libsyn import LibsynIE
@@ -381,6 +394,7 @@ from .lynda import (
 from .m6 import M6IE
 from .macgamestore import MacGameStoreIE
 from .mailru import MailRuIE
+from .makerschannel import MakersChannelIE
 from .makertv import MakerTVIE
 from .malemotion import MalemotionIE
 from .matchtv import MatchTVIE
@@ -390,11 +404,13 @@ from .metacritic import MetacriticIE
 from .mgoon import MgoonIE
 from .minhateca import MinhatecaIE
 from .ministrygrid import MinistryGridIE
+from .minoto import MinotoIE
 from .miomio import MioMioIE
 from .mit import TechTVMITIE, MITIE, OCWMITIE
 from .mitele import MiTeleIE
 from .mixcloud import MixcloudIE
 from .mlb import MLBIE
+from .mnet import MnetIE
 from .mpora import MporaIE
 from .moevideo import MoeVideoIE
 from .mofosex import MofosexIE
@@ -489,6 +505,7 @@ from .nowtv import (
     NowTVIE,
     NowTVListIE,
 )
+from .noz import NozIE
 from .npo import (
     NPOIE,
     NPOLiveIE,
@@ -502,6 +519,7 @@ from .npr import NprIE
 from .nrk import (
     NRKIE,
     NRKPlaylistIE,
+    NRKSkoleIE,
     NRKTVIE,
 )
 from .ntvde import NTVDeIE
@@ -518,6 +536,7 @@ from .ooyala import (
     OoyalaIE,
     OoyalaExternalIE,
 )
+from .openload import OpenloadIE
 from .ora import OraTVIE
 from .orf import (
     ORFTVthekIE,
@@ -552,6 +571,7 @@ from .pornhd import PornHdIE
 from .pornhub import (
     PornHubIE,
     PornHubPlaylistIE,
+    PornHubUserVideosIE,
 )
 from .pornotube import PornotubeIE
 from .pornovoisines import PornoVoisinesIE
@@ -585,6 +605,7 @@ from .regiotv import RegioTVIE
 from .restudy import RestudyIE
 from .reverbnation import ReverbNationIE
 from .revision3 import Revision3IE
+from .rice import RICEIE
 from .ringtv import RingTVIE
 from .ro220 import Ro220IE
 from .rottentomatoes import RottenTomatoesIE
@@ -611,6 +632,7 @@ from .ruutu import RuutuIE
 from .sandia import SandiaIE
 from .safari import (
     SafariIE,
+    SafariApiIE,
     SafariCourseIE,
 )
 from .sapo import SapoIE
@@ -619,6 +641,7 @@ from .sbs import SBSIE
 from .scivee import SciVeeIE
 from .screencast import ScreencastIE
 from .screencastomatic import ScreencastOMaticIE
+from .screenjunkies import ScreenJunkiesIE
 from .screenwavemedia import ScreenwaveMediaIE, TeamFourIE
 from .senateisvp import SenateISVPIE
 from .servingsys import ServingSysIE
@@ -664,7 +687,6 @@ from .southpark import (
     SouthParkEsIE,
     SouthParkNlIE
 )
-from .space import SpaceIE
 from .spankbang import SpankBangIE
 from .spankwire import SpankwireIE
 from .spiegel import SpiegelIE, SpiegelArticleIE
@@ -722,7 +744,9 @@ from .theplatform import (
     ThePlatformIE,
     ThePlatformFeedIE,
 )
+from .thescene import TheSceneIE
 from .thesixtyone import TheSixtyOneIE
+from .thestar import TheStarIE
 from .thisamericanlife import ThisAmericanLifeIE
 from .thisav import ThisAVIE
 from .tinypic import TinyPicIE
@@ -732,6 +756,7 @@ from .tmz import (
     TMZArticleIE,
 )
 from .tnaflix import (
+    TNAFlixNetworkEmbedIE,
     TNAFlixIE,
     EMPFlixIE,
     MovieFapIE,
@@ -768,6 +793,7 @@ from .tv2 import (
     TV2IE,
     TV2ArticleIE,
 )
+from .tv3 import TV3IE
 from .tv4 import TV4IE
 from .tvc import (
     TVCIE,
@@ -793,7 +819,11 @@ from .twitch import (
     TwitchBookmarksIE,
     TwitchStreamIE,
 )
-from .twitter import TwitterCardIE, TwitterIE
+from .twitter import (
+    TwitterCardIE,
+    TwitterIE,
+    TwitterAmplifyIE,
+)
 from .ubu import UbuIE
 from .udemy import (
     UdemyIE,
@@ -803,7 +833,9 @@ from .udn import UDNEmbedIE
 from .digiteka import DigitekaIE
 from .unistra import UnistraIE
 from .urort import UrortIE
+from .usatoday import USATodayIE
 from .ustream import UstreamIE, UstreamChannelIE
+from .ustudio import UstudioIE
 from .varzesh3 import Varzesh3IE
 from .vbox7 import Vbox7IE
 from .veehd import VeeHDIE
@@ -817,7 +849,10 @@ from .vgtv import (
     VGTVIE,
 )
 from .vh1 import VH1IE
-from .vice import ViceIE
+from .vice import (
+    ViceIE,
+    ViceShowIE,
+)
 from .viddler import ViddlerIE
 from .videodetective import VideoDetectiveIE
 from .videofyme import VideofyMeIE
@@ -844,6 +879,7 @@ from .vimeo import (
     VimeoChannelIE,
     VimeoGroupsIE,
     VimeoLikesIE,
+    VimeoOndemandIE,
     VimeoReviewIE,
     VimeoUserIE,
     VimeoWatchLaterIE,
@@ -925,7 +961,9 @@ from .youtube import (
     YoutubeChannelIE,
     YoutubeFavouritesIE,
     YoutubeHistoryIE,
+    YoutubeLiveIE,
     YoutubePlaylistIE,
+    YoutubePlaylistsIE,
     YoutubeRecommendedIE,
     YoutubeSearchDateIE,
     YoutubeSearchIE,
@@ -935,7 +973,6 @@ from .youtube import (
     YoutubeTruncatedIDIE,
     YoutubeTruncatedURLIE,
     YoutubeUserIE,
-    YoutubePlaylistsIE,
     YoutubeWatchLaterIE,
 )
 from .zapiks import ZapiksIE
View File: youtube_dl/extractor/abc.py
@@ -12,7 +12,7 @@ from ..utils import (
 
 class ABCIE(InfoExtractor):
     IE_NAME = 'abc.net.au'
-    _VALID_URL = r'http://www\.abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
+    _VALID_URL = r'https?://www\.abc\.net\.au/news/(?:[^/]+/){1,2}(?P<id>\d+)'
 
     _TESTS = [{
         'url': 'http://www.abc.net.au/news/2014-11-05/australia-to-staff-ebola-treatment-centre-in-sierra-leone/5868334',
View File: youtube_dl/extractor/addanime.py
@@ -16,7 +16,7 @@ from ..utils import (
 
 class AddAnimeIE(InfoExtractor):
-    _VALID_URL = r'http://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
+    _VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
     _TESTS = [{
         'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
         'md5': '72954ea10bc979ab5e2eb288b21425a0',
View File: youtube_dl/extractor/aenetworks.py
@@ -28,7 +28,7 @@ class AENetworksIE(InfoExtractor):
         'info_dict': {
             'id': 'eg47EERs_JsZ',
             'ext': 'mp4',
-            'title': "Winter Is Coming",
+            'title': 'Winter Is Coming',
             'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
         },
         'params': {
View File: youtube_dl/extractor/aftonbladet.py
@@ -6,7 +6,7 @@ from ..utils import int_or_none
 
 class AftonbladetIE(InfoExtractor):
-    _VALID_URL = r'http://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://tv\.aftonbladet\.se/abtv/articles/(?P<id>[0-9]+)'
     _TEST = {
         'url': 'http://tv.aftonbladet.se/abtv/articles/36015',
         'info_dict': {
View File: youtube_dl/extractor/aljazeera.py
@@ -4,7 +4,7 @@ from .common import InfoExtractor
 
 class AlJazeeraIE(InfoExtractor):
-    _VALID_URL = r'http://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
+    _VALID_URL = r'https?://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
 
     _TEST = {
         'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
@@ -13,24 +13,18 @@ class AlJazeeraIE(InfoExtractor):
             'ext': 'mp4',
             'title': 'The Slum - Episode 1: Deliverance',
             'description': 'As a birth attendant advocating for family planning, Remy is on the frontline of Tondo\'s battle with overcrowding.',
-            'uploader': 'Al Jazeera English',
+            'uploader_id': '665003303001',
+            'timestamp': 1411116829,
+            'upload_date': '20140919',
         },
-        'add_ie': ['BrightcoveLegacy'],
+        'add_ie': ['BrightcoveNew'],
         'skip': 'Not accessible from Travis CI server',
     }
+    BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/665003303001/default_default/index.html?videoId=%s'
 
     def _real_extract(self, url):
         program_name = self._match_id(url)
         webpage = self._download_webpage(url, program_name)
         brightcove_id = self._search_regex(
             r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')
-
-        return {
-            '_type': 'url',
-            'url': (
-                'brightcove:'
-                'playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc'
-                '&%40videoPlayer={0}'.format(brightcove_id)
-            ),
-            'ie_key': 'BrightcoveLegacy',
-        }
+        return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
View File: youtube_dl/extractor/animeondemand.py
@@ -0,0 +1,243 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urlparse,
compat_str,
)
from ..utils import (
determine_ext,
    encode_dict,
    extract_attributes,
    ExtractorError,
    sanitized_Request,
    urlencode_postdata,
)


class AnimeOnDemandIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?anime-on-demand\.de/anime/(?P<id>\d+)'
    _LOGIN_URL = 'https://www.anime-on-demand.de/users/sign_in'
    _APPLY_HTML5_URL = 'https://www.anime-on-demand.de/html5apply'
    _NETRC_MACHINE = 'animeondemand'
    _TESTS = [{
        'url': 'https://www.anime-on-demand.de/anime/161',
        'info_dict': {
            'id': '161',
            'title': 'Grimgar, Ashes and Illusions (OmU)',
            'description': 'md5:6681ce3c07c7189d255ac6ab23812d31',
        },
        'playlist_mincount': 4,
    }, {
        # Film wording is used instead of Episode
        'url': 'https://www.anime-on-demand.de/anime/39',
        'only_matching': True,
    }, {
        # Episodes without titles
        'url': 'https://www.anime-on-demand.de/anime/162',
        'only_matching': True,
    }, {
        # ger/jap, Dub/OmU, account required
        'url': 'https://www.anime-on-demand.de/anime/169',
        'only_matching': True,
    }]

    def _login(self):
        (username, password) = self._get_login_info()
        if username is None:
            return

        login_page = self._download_webpage(
            self._LOGIN_URL, None, 'Downloading login page')

        if '>Our licensing terms allow the distribution of animes only to German-speaking countries of Europe' in login_page:
            self.raise_geo_restricted(
                '%s is only available in German-speaking countries of Europe' % self.IE_NAME)

        login_form = self._form_hidden_inputs('new_user', login_page)

        login_form.update({
            'user[login]': username,
            'user[password]': password,
        })

        post_url = self._search_regex(
            r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page,
            'post url', default=self._LOGIN_URL, group='url')

        if not post_url.startswith('http'):
            post_url = compat_urlparse.urljoin(self._LOGIN_URL, post_url)

        request = sanitized_Request(
            post_url, urlencode_postdata(encode_dict(login_form)))
        request.add_header('Referer', self._LOGIN_URL)

        response = self._download_webpage(
            request, None, 'Logging in as %s' % username)

        if all(p not in response for p in ('>Logout<', 'href="/users/sign_out"')):
            error = self._search_regex(
                r'<p class="alert alert-danger">(.+?)</p>',
                response, 'error', default=None)
            if error:
                raise ExtractorError('Unable to login: %s' % error, expected=True)
            raise ExtractorError('Unable to log in')

    def _real_initialize(self):
        self._login()

    def _real_extract(self, url):
        anime_id = self._match_id(url)

        webpage = self._download_webpage(url, anime_id)

        if 'data-playlist=' not in webpage:
            self._download_webpage(
                self._APPLY_HTML5_URL, anime_id,
                'Activating HTML5 beta', 'Unable to apply HTML5 beta')
            webpage = self._download_webpage(url, anime_id)

        csrf_token = self._html_search_meta(
            'csrf-token', webpage, 'csrf token', fatal=True)

        anime_title = self._html_search_regex(
            r'(?s)<h1[^>]+itemprop="name"[^>]*>(.+?)</h1>',
            webpage, 'anime name')
        anime_description = self._html_search_regex(
            r'(?s)<div[^>]+itemprop="description"[^>]*>(.+?)</div>',
            webpage, 'anime description', default=None)

        entries = []

        for num, episode_html in enumerate(re.findall(
                r'(?s)<h3[^>]+class="episodebox-title".+?>Episodeninhalt<', webpage), 1):
            episodebox_title = self._search_regex(
                (r'class="episodebox-title"[^>]+title=(["\'])(?P<title>.+?)\1',
                 r'class="episodebox-title"[^>]+>(?P<title>.+?)<'),
                episode_html, 'episodebox title', default=None, group='title')
            if not episodebox_title:
                continue

            episode_number = int(self._search_regex(
                r'(?:Episode|Film)\s*(\d+)',
                episodebox_title, 'episode number', default=num))
            episode_title = self._search_regex(
                r'(?:Episode|Film)\s*\d+\s*-\s*(.+)',
                episodebox_title, 'episode title', default=None)

            video_id = 'episode-%d' % episode_number

            common_info = {
                'id': video_id,
                'series': anime_title,
                'episode': episode_title,
                'episode_number': episode_number,
            }

            formats = []

            for input_ in re.findall(
                    r'<input[^>]+class=["\'].*?streamstarter_html5[^>]+>', episode_html):
                attributes = extract_attributes(input_)

                playlist_urls = []
                for playlist_key in ('data-playlist', 'data-otherplaylist'):
                    playlist_url = attributes.get(playlist_key)
                    if isinstance(playlist_url, compat_str) and re.match(
                            r'/?[\da-zA-Z]+', playlist_url):
                        playlist_urls.append(attributes[playlist_key])
                if not playlist_urls:
                    continue

                lang = attributes.get('data-lang')
                lang_note = attributes.get('value')

                for playlist_url in playlist_urls:
                    kind = self._search_regex(
                        r'videomaterialurl/\d+/([^/]+)/',
                        playlist_url, 'media kind', default=None)
                    format_id_list = []
                    if lang:
                        format_id_list.append(lang)
                    if kind:
                        format_id_list.append(kind)
                    if not format_id_list:
                        format_id_list.append(compat_str(num))
                    format_id = '-'.join(format_id_list)
                    format_note = ', '.join(filter(None, (kind, lang_note)))

                    request = sanitized_Request(
                        compat_urlparse.urljoin(url, playlist_url),
                        headers={
                            'X-Requested-With': 'XMLHttpRequest',
                            'X-CSRF-Token': csrf_token,
                            'Referer': url,
                            'Accept': 'application/json, text/javascript, */*; q=0.01',
                        })
                    playlist = self._download_json(
                        request, video_id, 'Downloading %s playlist JSON' % format_id,
                        fatal=False)
                    if not playlist:
                        continue

                    start_video = playlist.get('startvideo', 0)
                    playlist = playlist.get('playlist')
                    if not playlist or not isinstance(playlist, list):
                        continue
                    playlist = playlist[start_video]

                    title = playlist.get('title')
                    if not title:
                        continue
                    description = playlist.get('description')

                    for source in playlist.get('sources', []):
                        file_ = source.get('file')
                        if not file_:
                            continue
                        ext = determine_ext(file_)
                        format_id_list = [lang, kind]
                        if ext == 'm3u8':
                            format_id_list.append('hls')
                        elif source.get('type') == 'video/dash' or ext == 'mpd':
                            format_id_list.append('dash')
                        format_id = '-'.join(filter(None, format_id_list))
                        if ext == 'm3u8':
                            file_formats = self._extract_m3u8_formats(
                                file_, video_id, 'mp4',
                                entry_protocol='m3u8_native', m3u8_id=format_id, fatal=False)
                        elif source.get('type') == 'video/dash' or ext == 'mpd':
                            continue
                            file_formats = self._extract_mpd_formats(
                                file_, video_id, mpd_id=format_id, fatal=False)
                        else:
                            continue
                        for f in file_formats:
                            f.update({
                                'language': lang,
                                'format_note': format_note,
                            })
                        formats.extend(file_formats)

            if formats:
                self._sort_formats(formats)
                f = common_info.copy()
                f.update({
                    'title': title,
                    'description': description,
                    'formats': formats,
                })
                entries.append(f)

            # Extract teaser only when full episode is not available
            if not formats:
                m = re.search(
                    r'data-dialog-header=(["\'])(?P<title>.+?)\1[^>]+href=(["\'])(?P<href>.+?)\3[^>]*>Teaser<',
                    episode_html)
                if m:
                    f = common_info.copy()
                    f.update({
                        'id': '%s-teaser' % f['id'],
                        'title': m.group('title'),
                        'url': compat_urlparse.urljoin(url, m.group('href')),
                    })
                    entries.append(f)

        return self.playlist_result(entries, anime_id, anime_title, anime_description)
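
Note: the playlist JSON consumed in _real_extract above has roughly the following shape. This is inferred purely from the accessors in the code; the URL and titles are made up.

# Sketch of the playlist response, inferred from the accessors above;
# values are illustrative, not real API output.
playlist_response = {
    'startvideo': 0,  # index into 'playlist'
    'playlist': [{
        'title': 'Episode 1 - Example title',
        'description': 'Example description',
        'sources': [{
            'file': 'https://example.com/videomaterialurl/161/episode1/playlist.m3u8',
            'type': 'application/x-mpegURL',
        }],
    }],
}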

View File

@@ -1,24 +1,11 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor


 class AolIE(InfoExtractor):
     IE_NAME = 'on.aol.com'
-    _VALID_URL = r'''(?x)
-        (?:
-            aol-video:|
-            http://on\.aol\.com/
-            (?:
-                video/.*-|
-                playlist/(?P<playlist_display_id>[^/?#]+?)-(?P<playlist_id>[0-9]+)[?#].*_videoid=
-            )
-        )
-        (?P<id>[0-9]+)
-        (?:$|\?)
-    '''
+    _VALID_URL = r'(?:aol-video:|https?://on\.aol\.com/video/.*-)(?P<id>[0-9]+)(?:$|\?)'

     _TESTS = [{
         'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
@@ -29,42 +16,31 @@ class AolIE(InfoExtractor):
             'title': 'U.S. Official Warns Of \'Largest Ever\' IRS Phone Scam',
         },
         'add_ie': ['FiveMin'],
-    }, {
-        'url': 'http://on.aol.com/playlist/brace-yourself---todays-weirdest-news-152147?icid=OnHomepageC4_Omg_Img#_videoid=518184316',
-        'info_dict': {
-            'id': '152147',
-            'title': 'Brace Yourself - Today\'s Weirdest News',
-        },
-        'playlist_mincount': 10,
     }]

     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group('id')
-        playlist_id = mobj.group('playlist_id')
-        if not playlist_id or self._downloader.params.get('noplaylist'):
-            return self.url_result('5min:%s' % video_id)
-
-        self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
-
-        webpage = self._download_webpage(url, playlist_id)
-        title = self._html_search_regex(
-            r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
-        playlist_html = self._search_regex(
-            r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
-            'playlist HTML')
-        entries = [{
-            '_type': 'url',
-            'url': 'aol-video:%s' % m.group('id'),
-            'ie_key': 'Aol',
-        } for m in re.finditer(
-            r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
-            playlist_html)]
-
-        return {
-            '_type': 'playlist',
-            'id': playlist_id,
-            'display_id': mobj.group('playlist_display_id'),
-            'title': title,
-            'entries': entries,
-        }
+        video_id = self._match_id(url)
+        return self.url_result('5min:%s' % video_id)
+
+
+class AolFeaturesIE(InfoExtractor):
+    IE_NAME = 'features.aol.com'
+    _VALID_URL = r'https?://features\.aol\.com/video/(?P<id>[^/?#]+)'
+
+    _TESTS = [{
+        'url': 'http://features.aol.com/video/behind-secret-second-careers-late-night-talk-show-hosts',
+        'md5': '7db483bb0c09c85e241f84a34238cc75',
+        'info_dict': {
+            'id': '519507715',
+            'ext': 'mp4',
+            'title': 'What To Watch - February 17, 2016',
+        },
+        'add_ie': ['FiveMin'],
+    }]
+
+    def _real_extract(self, url):
+        display_id = self._match_id(url)
+        webpage = self._download_webpage(url, display_id)
+        return self.url_result(self._search_regex(
+            r'<script type="text/javascript" src="(https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js[^"]+)"',
+            webpage, '5min embed url'), 'FiveMin')

View File

@@ -12,7 +12,7 @@ from ..utils import (

 class AppleTrailersIE(InfoExtractor):
     IE_NAME = 'appletrailers'
-    _VALID_URL = r'https?://(?:www\.)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.|movie)?trailers\.apple\.com/(?:trailers|ca)/(?P<company>[^/]+)/(?P<movie>[^/]+)'
     _TESTS = [{
         'url': 'http://trailers.apple.com/trailers/wb/manofsteel/',
         'info_dict': {
@@ -73,6 +73,9 @@ class AppleTrailersIE(InfoExtractor):
     }, {
         'url': 'http://trailers.apple.com/ca/metropole/autrui/',
         'only_matching': True,
+    }, {
+        'url': 'http://movietrailers.apple.com/trailers/focus_features/kuboandthetwostrings/',
+        'only_matching': True,
     }]

     _JSON_RE = r'iTunes.playURL\((.*?)\);'

View File

@@ -23,7 +23,7 @@ from ..utils import (

 class ArteTvIE(InfoExtractor):
-    _VALID_URL = r'http://videos\.arte\.tv/(?P<lang>fr|de)/.*-(?P<id>.*?)\.html'
+    _VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
     IE_NAME = 'arte.tv'

     def _real_extract(self, url):
@@ -63,7 +63,7 @@ class ArteTvIE(InfoExtractor):

 class ArteTVPlus7IE(InfoExtractor):
     IE_NAME = 'arte.tv:+7'
-    _VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de)/(?:(?:sendungen|emissions)/)?(?P<id>.*?)/(?P<name>.*?)(\?.*)?'
+    _VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/(?:(?:sendungen|emissions|embed)/)?(?P<id>[^/]+)/(?P<name>[^/?#&+])'

     @classmethod
     def _extract_url_info(cls, url):
@@ -102,23 +102,45 @@ class ArteTVPlus7IE(InfoExtractor):
         iframe_url = find_iframe_url(webpage, None)
         if not iframe_url:
             embed_url = self._html_search_regex(
-                r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url')
-            player = self._download_json(
-                embed_url, video_id, 'Downloading player page')
-            iframe_url = find_iframe_url(player['html'])
-        json_url = compat_parse_qs(
-            compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
-        return self._extract_from_json_url(json_url, video_id, lang)
+                r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url', default=None)
+            if embed_url:
+                player = self._download_json(
+                    embed_url, video_id, 'Downloading player page')
+                iframe_url = find_iframe_url(player['html'])
+        # en and es URLs produce react-based pages with different layout (e.g.
+        # http://www.arte.tv/guide/en/053330-002-A/carnival-italy?zone=world)
+        if not iframe_url:
+            program = self._search_regex(
+                r'program\s*:\s*({.+?["\']embed_html["\'].+?}),?\s*\n',
+                webpage, 'program', default=None)
+            if program:
+                embed_html = self._parse_json(program, video_id)
+                if embed_html:
+                    iframe_url = find_iframe_url(embed_html['embed_html'])
+        if iframe_url:
+            json_url = compat_parse_qs(
+                compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
+            if json_url:
+                title = self._search_regex(
+                    r'<h3[^>]+title=(["\'])(?P<title>.+?)\1',
+                    webpage, 'title', default=None, group='title')
+                return self._extract_from_json_url(json_url, video_id, lang, title=title)
+        # Different kind of embed URL (e.g.
+        # http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
+        embed_url = self._search_regex(
+            r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1',
+            webpage, 'embed url', group='url')
+        return self.url_result(embed_url)

-    def _extract_from_json_url(self, json_url, video_id, lang):
+    def _extract_from_json_url(self, json_url, video_id, lang, title=None):
         info = self._download_json(json_url, video_id)
         player_info = info['videoJsonPlayer']

         upload_date_str = player_info.get('shootingDate')
         if not upload_date_str:
-            upload_date_str = player_info.get('VDA', '').split(' ')[0]
+            upload_date_str = (player_info.get('VRA') or player_info.get('VDA') or '').split(' ')[0]

-        title = player_info['VTI'].strip()
+        title = (player_info.get('VTI') or title or player_info['VID']).strip()
         subtitle = player_info.get('VSU', '').strip()
         if subtitle:
             title += ' - %s' % subtitle
@@ -132,27 +154,30 @@ class ArteTVPlus7IE(InfoExtractor):
         }

         qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])

+        LANGS = {
+            'fr': 'F',
+            'de': 'A',
+            'en': 'E[ANG]',
+            'es': 'E[ESP]',
+        }
+
         formats = []
         for format_id, format_dict in player_info['VSR'].items():
             f = dict(format_dict)
             versionCode = f.get('versionCode')
-
-            langcode = {
-                'fr': 'F',
-                'de': 'A',
-            }.get(lang, lang)
-            lang_rexs = [r'VO?%s' % langcode, r'VO?.-ST%s' % langcode]
-            lang_pref = (
-                None if versionCode is None else (
-                    10 if any(re.match(r, versionCode) for r in lang_rexs)
-                    else -10))
+            langcode = LANGS.get(lang, lang)
+            lang_rexs = [r'VO?%s-' % re.escape(langcode), r'VO?.-ST%s$' % re.escape(langcode)]
+            lang_pref = None
+            if versionCode:
+                matched_lang_rexs = [r for r in lang_rexs if re.match(r, versionCode)]
+                lang_pref = -10 if not matched_lang_rexs else 10 * len(matched_lang_rexs)
             source_pref = 0
             if versionCode is not None:
                 # The original version with subtitles has lower relevance
-                if re.match(r'VO-ST(F|A)', versionCode):
+                if re.match(r'VO-ST(F|A|E)', versionCode):
                     source_pref -= 10
                 # The version with sourds/mal subtitles has also lower relevance
-                elif re.match(r'VO?(F|A)-STM\1', versionCode):
+                elif re.match(r'VO?(F|A|E)-STM\1', versionCode):
                     source_pref -= 9

             format = {
                 'format_id': format_id,
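
Note: a worked sketch of the new language-preference logic above, for lang = 'fr'. The versionCode samples are illustrative, not an exhaustive list of Arte's codes.

import re

langcode = 'F'
lang_rexs = [r'VO?%s-' % re.escape(langcode), r'VO?.-ST%s$' % re.escape(langcode)]

def lang_pref(version_code):
    matched = [r for r in lang_rexs if re.match(r, version_code)]
    return -10 if not matched else 10 * len(matched)

assert lang_pref('VOF-STF') == 20  # French audio and French subtitles: both patterns match
assert lang_pref('VA-STF') == 10   # foreign audio, French subtitles: only the subtitle pattern matches
assert lang_pref('VA') == -10      # no French version at all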
@@ -185,7 +210,7 @@ class ArteTVPlus7IE(InfoExtractor):
# It also uses the arte_vp_url url from the webpage to extract the information # It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE): class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative' IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de)/(?:magazine?/)?(?P<id>[^?#]+)' _VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:magazine?/)?(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design', 'url': 'http://creative.arte.tv/de/magazin/agentur-amateur-corporate-design',
@@ -209,7 +234,7 @@ class ArteTVCreativeIE(ArteTVPlus7IE):
class ArteTVFutureIE(ArteTVPlus7IE): class ArteTVFutureIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:future' IE_NAME = 'arte.tv:future'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de)/(?P<id>.+)' _VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses', 'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses',
@@ -217,6 +242,7 @@ class ArteTVFutureIE(ArteTVPlus7IE):
'id': '050940-028-A', 'id': '050940-028-A',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Les écrevisses aussi peuvent être anxieuses', 'title': 'Les écrevisses aussi peuvent être anxieuses',
'upload_date': '20140902',
}, },
}, { }, {
'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable', 'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable',
@@ -226,7 +252,7 @@ class ArteTVFutureIE(ArteTVPlus7IE):
class ArteTVDDCIE(ArteTVPlus7IE): class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc' IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>.+)' _VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
def _real_extract(self, url): def _real_extract(self, url):
video_id, lang = self._extract_url_info(url) video_id, lang = self._extract_url_info(url)
@@ -244,7 +270,7 @@ class ArteTVDDCIE(ArteTVPlus7IE):
class ArteTVConcertIE(ArteTVPlus7IE): class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert' IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>de|fr)/(?P<id>.+)' _VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TEST = { _TEST = {
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde', 'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
@@ -261,7 +287,7 @@ class ArteTVConcertIE(ArteTVPlus7IE):
class ArteTVCinemaIE(ArteTVPlus7IE): class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema' IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>de|fr)/(?P<id>.+)' _VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TEST = { _TEST = {
'url': 'http://cinema.arte.tv/de/node/38291', 'url': 'http://cinema.arte.tv/de/node/38291',
@@ -276,6 +302,37 @@ class ArteTVCinemaIE(ArteTVPlus7IE):
} }
class ArteTVMagazineIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:magazine'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/magazine/[^/]+/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
# Embedded via <iframe src="http://www.arte.tv/arte_vp/index.php?json_url=..."
'url': 'http://www.arte.tv/magazine/trepalium/fr/entretien-avec-le-realisateur-vincent-lannoo-trepalium',
'md5': '2a9369bcccf847d1c741e51416299f25',
'info_dict': {
'id': '065965-000-A',
'ext': 'mp4',
'title': 'Trepalium - Extrait Ep.01',
'upload_date': '20160121',
},
}, {
# Embedded via <iframe src="http://www.arte.tv/guide/fr/embed/054813-004-A/medium"
'url': 'http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium',
'md5': 'fedc64fc7a946110fe311634e79782ca',
'info_dict': {
'id': '054813-004_PLUS7-F',
'ext': 'mp4',
'title': 'Trepalium (4/6)',
'description': 'md5:10057003c34d54e95350be4f9b05cb40',
'upload_date': '20160218',
},
}, {
'url': 'http://www.arte.tv/magazine/metropolis/de/frank-woeste-german-paris-metropolis',
'only_matching': True,
}]
class ArteTVEmbedIE(ArteTVPlus7IE): class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed' IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)

View File

@@ -10,9 +10,9 @@ from ..utils import (

 class AudiMediaIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?audimedia\.tv/(?:en|de)/vid/(?P<id>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?audi-mediacenter\.com/(?:en|de)/audimediatv/(?P<id>[^/?#]+)'
     _TEST = {
-        'url': 'https://audimedia.tv/en/vid/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test',
+        'url': 'https://www.audi-mediacenter.com/en/audimediatv/60-seconds-of-audi-sport-104-2015-wec-bahrain-rookie-test-1467',
         'md5': '79a8b71c46d49042609795ab59779b66',
         'info_dict': {
             'id': '1565',
@@ -32,7 +32,10 @@ class AudiMediaIE(InfoExtractor):
         display_id = self._match_id(url)
         webpage = self._download_webpage(url, display_id)

-        raw_payload = self._search_regex(r'<script[^>]+class="amtv-embed"[^>]+id="([^"]+)"', webpage, 'raw payload')
+        raw_payload = self._search_regex([
+            r'class="amtv-embed"[^>]+id="([^"]+)"',
+            r'class=\\"amtv-embed\\"[^>]+id=\\"([^"]+)\\"',
+        ], webpage, 'raw payload')
         _, stage_mode, video_id, lang = raw_payload.split('-')

         # TODO: handle s and e stage_mode (live streams and ended live streams)
@@ -59,13 +62,19 @@ class AudiMediaIE(InfoExtractor):
             video_version_url = video_version.get('download_url') or video_version.get('stream_url')
             if not video_version_url:
                 continue
-            formats.append({
+            f = {
                 'url': video_version_url,
                 'width': int_or_none(video_version.get('width')),
                 'height': int_or_none(video_version.get('height')),
                 'abr': int_or_none(video_version.get('audio_bitrate')),
                 'vbr': int_or_none(video_version.get('video_bitrate')),
-            })
+            }
+            bitrate = self._search_regex(r'(\d+)k', video_version_url, 'bitrate', default=None)
+            if bitrate:
+                f.update({
+                    'format_id': 'http-%s' % bitrate,
+                })
+            formats.append(f)
         self._sort_formats(formats)

         return {
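
Note: the new format_id is derived straight from the download URL; a sketch with a hypothetical variant URL:

import re

# Hypothetical variant URL; the bitrate marker ('720k') is what the
# regex above picks up.
video_version_url = 'https://example.com/audimediatv/clip_720k.mp4'
m = re.search(r'(\d+)k', video_version_url)
if m:
    format_id = 'http-%s' % m.group(1)  # 'http-720'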

View File

@@ -0,0 +1,66 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import float_or_none


class AudioBoomIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?audioboom\.com/boos/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
        'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
        'info_dict': {
            'id': '4279833',
            'ext': 'mp3',
            'title': '3/09/2016 Czaban Hour 3',
            'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
            'duration': 2245.72,
            'uploader': 'Steve Czaban',
            'uploader_url': 're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        clip = None

        clip_store = self._parse_json(
            self._search_regex(
                r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id,
                webpage, 'clip store', default='{}', group='json'),
            video_id, fatal=False)
        if clip_store:
            clips = clip_store.get('clips')
            if clips and isinstance(clips, list) and isinstance(clips[0], dict):
                clip = clips[0]

        def from_clip(field):
            if clip:
                return clip.get(field)

        audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
            'audio', webpage, 'audio url')
        title = from_clip('title') or self._og_search_title(webpage)
        description = from_clip('description') or self._og_search_description(webpage)

        duration = float_or_none(from_clip('duration') or self._html_search_meta(
            'weibo:audio:duration', webpage))

        uploader = from_clip('author') or self._og_search_property(
            'audio:artist', webpage, 'uploader', fatal=False)
        uploader_url = from_clip('author_url') or self._html_search_meta(
            'audioboo:channel', webpage, 'uploader url')

        return {
            'id': video_id,
            'url': audio_url,
            'title': title,
            'description': description,
            'duration': duration,
            'uploader': uploader,
            'uploader_url': uploader_url,
        }
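
Note: the data-new-clip-store attribute holds a JSON blob of which only the fields read by from_clip matter here. Its shape, inferred from the accessors above (values mirror the test metadata; the audio URL is hypothetical):

clip_store = {
    'clips': [{
        'clipId': 4279833,
        'title': '3/09/2016 Czaban Hour 3',
        'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
        'clipURLPriorToLoading': 'https://audioboom.com/example/audio.mp3',  # hypothetical
        'duration': 2245.72,
        'author': 'Steve Czaban',
        'author_url': 'https://audioboom.com/channel/steveczabanyahoosportsradio',
    }],
}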

View File

@@ -98,7 +98,7 @@ class AzubuIE(InfoExtractor):

 class AzubuLiveIE(InfoExtractor):
-    _VALID_URL = r'http://www.azubu.tv/(?P<id>[^/]+)$'
+    _VALID_URL = r'https?://www.azubu.tv/(?P<id>[^/]+)$'

     _TEST = {
         'url': 'http://www.azubu.tv/MarsTVMDLen',

View File

@@ -9,7 +9,7 @@ from ..utils import unescapeHTML

 class BaiduVideoIE(InfoExtractor):
     IE_DESC = '百度视频'
-    _VALID_URL = r'http://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'
+    _VALID_URL = r'https?://v\.baidu\.com/(?P<type>[a-z]+)/(?P<id>\d+)\.htm'

     _TESTS = [{
         'url': 'http://v.baidu.com/comic/1069.htm?frp=bdbrand&q=%E4%B8%AD%E5%8D%8E%E5%B0%8F%E5%BD%93%E5%AE%B6',
         'info_dict': {

View File

@@ -10,7 +10,6 @@ from ..utils import (
     int_or_none,
     parse_duration,
     parse_iso8601,
-    remove_end,
     unescapeHTML,
 )
 from ..compat import (
@@ -86,7 +85,7 @@ class BBCCoUkIE(InfoExtractor):
                 'id': 'b00yng1d',
                 'ext': 'flv',
                 'title': 'The Voice UK: Series 3: Blind Auditions 5',
-                'description': "Emma Willis and Marvin Humes present the fifth set of blind auditions in the singing competition, as the coaches continue to build their teams based on voice alone.",
+                'description': 'Emma Willis and Marvin Humes present the fifth set of blind auditions in the singing competition, as the coaches continue to build their teams based on voice alone.',
                 'duration': 5100,
             },
             'params': {
@@ -561,7 +560,7 @@ class BBCIE(BBCCoUkIE):
         'url': 'http://www.bbc.co.uk/blogs/adamcurtis/entries/3662a707-0af9-3149-963f-47bea720b460',
         'info_dict': {
             'id': '3662a707-0af9-3149-963f-47bea720b460',
-            'title': 'BBC Blogs - Adam Curtis - BUGGER',
+            'title': 'BUGGER',
         },
         'playlist_count': 18,
     }, {
@@ -670,9 +669,17 @@ class BBCIE(BBCCoUkIE):
         'url': 'http://www.bbc.com/sport/0/football/34475836',
         'info_dict': {
             'id': '34475836',
-            'title': 'What Liverpool can expect from Klopp',
+            'title': 'Jurgen Klopp: Furious football from a witty and winning coach',
         },
         'playlist_count': 3,
+    }, {
+        # school report article with single video
+        'url': 'http://www.bbc.co.uk/schoolreport/35744779',
+        'info_dict': {
+            'id': '35744779',
+            'title': 'School which breaks down barriers in Jerusalem',
+        },
+        'playlist_count': 1,
     }, {
         # single video with playlist URL from weather section
         'url': 'http://www.bbc.com/weather/features/33601775',
@@ -735,8 +742,17 @@ class BBCIE(BBCCoUkIE):
         json_ld_info = self._search_json_ld(webpage, playlist_id, default=None)
         timestamp = json_ld_info.get('timestamp')

         playlist_title = json_ld_info.get('title')
-        playlist_description = json_ld_info.get('description')
+        if not playlist_title:
+            playlist_title = self._og_search_title(
+                webpage, default=None) or self._html_search_regex(
+                r'<title>(.+?)</title>', webpage, 'playlist title', default=None)
+            if playlist_title:
+                playlist_title = re.sub(r'(.+)\s*-\s*BBC.*?$', r'\1', playlist_title).strip()
+
+        playlist_description = json_ld_info.get(
+            'description') or self._og_search_description(webpage, default=None)

         if not timestamp:
             timestamp = parse_iso8601(self._search_regex(
@@ -797,8 +813,6 @@ class BBCIE(BBCCoUkIE):
                     playlist.get('progressiveDownloadUrl'), playlist_id, timestamp))

         if entries:
-            playlist_title = playlist_title or remove_end(self._og_search_title(webpage), ' - BBC News')
-            playlist_description = playlist_description or self._og_search_description(webpage, default=None)
             return self.playlist_result(entries, playlist_id, playlist_title, playlist_description)

         # single video story (e.g. http://www.bbc.com/travel/story/20150625-sri-lankas-spicy-secret)
@@ -829,10 +843,6 @@ class BBCIE(BBCCoUkIE):
                 'subtitles': subtitles,
             }

-        playlist_title = self._html_search_regex(
-            r'<title>(.*?)(?:\s*-\s*BBC [^ ]+)?</title>', webpage, 'playlist title')
-        playlist_description = self._og_search_description(webpage, default=None)
-
         def extract_all(pattern):
             return list(filter(None, map(
                 lambda s: self._parse_json(s, playlist_id, fatal=False),
@@ -932,7 +942,7 @@ class BBCIE(BBCCoUkIE):

 class BBCCoUkArticleIE(InfoExtractor):
-    _VALID_URL = 'http://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
+    _VALID_URL = r'https?://www.bbc.co.uk/programmes/articles/(?P<id>[a-zA-Z0-9]+)'
     IE_NAME = 'bbc.co.uk:article'
     IE_DESC = 'BBC articles'
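
Note: a worked example of the new generic title cleanup above (the sample title is illustrative of BBC page titles):

import re

title = 'Jurgen Klopp: Furious football from a witty and winning coach - BBC Sport'
clean = re.sub(r'(.+)\s*-\s*BBC.*?$', r'\1', title).strip()
# clean == 'Jurgen Klopp: Furious football from a witty and winning coach'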

View File

@@ -8,7 +8,7 @@ from ..utils import url_basename

 class BehindKinkIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
+    _VALID_URL = r'https?://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
     _TEST = {
         'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
         'md5': '507b57d8fdcd75a41a9a7bdb7989c762',

View File

@@ -14,7 +14,7 @@ from ..utils import (

 class BiliBiliIE(InfoExtractor):
-    _VALID_URL = r'http://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)(?:/index_(?P<page_num>\d+).html)?'
+    _VALID_URL = r'https?://www\.bilibili\.(?:tv|com)/video/av(?P<id>\d+)(?:/index_(?P<page_num>\d+).html)?'

     _TESTS = [{
         'url': 'http://www.bilibili.tv/video/av1074402/',

View File

@@ -0,0 +1,86 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import remove_end


class BioBioChileTVIE(InfoExtractor):
    _VALID_URL = r'https?://tv\.biobiochile\.cl/notas/(?:[^/]+/)+(?P<id>[^/]+)\.shtml'

    _TESTS = [{
        'url': 'http://tv.biobiochile.cl/notas/2015/10/21/sobre-camaras-y-camarillas-parlamentarias.shtml',
        'md5': '26f51f03cf580265defefb4518faec09',
        'info_dict': {
            'id': 'sobre-camaras-y-camarillas-parlamentarias',
            'ext': 'mp4',
            'title': 'Sobre Cámaras y camarillas parlamentarias',
            'thumbnail': 're:^https?://.*\.jpg$',
            'uploader': 'Fernando Atria',
        },
    }, {
        # different uploader layout
        'url': 'http://tv.biobiochile.cl/notas/2016/03/18/natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades.shtml',
        'md5': 'edc2e6b58974c46d5b047dea3c539ff3',
        'info_dict': {
            'id': 'natalia-valdebenito-repasa-a-diputado-hasbun-paso-a-la-categoria-de-hablar-brutalidades',
            'ext': 'mp4',
            'title': 'Natalia Valdebenito repasa a diputado Hasbún: Pasó a la categoría de hablar brutalidades',
            'thumbnail': 're:^https?://.*\.jpg$',
            'uploader': 'Piangella Obrador',
        },
        'params': {
            'skip_download': True,
        },
    }, {
        'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml',
        'only_matching': True,
    }, {
        'url': 'http://tv.biobiochile.cl/notas/2015/10/21/exclusivo-hector-pinto-formador-de-chupete-revela-version-del-ex-delantero-albo.shtml',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        title = remove_end(self._og_search_title(webpage), ' - BioBioChile TV')

        file_url = self._search_regex(
            r'loadFWPlayerVideo\([^,]+,\s*(["\'])(?P<url>.+?)\1',
            webpage, 'file url', group='url')

        base_url = self._search_regex(
            r'file\s*:\s*(["\'])(?P<url>.+?)\1\s*\+\s*fileURL', webpage,
            'base url', default='http://unlimited2-cl.digitalproserver.com/bbtv/',
            group='url')

        formats = self._extract_m3u8_formats(
            '%s%s/playlist.m3u8' % (base_url, file_url), video_id, 'mp4',
            entry_protocol='m3u8_native', m3u8_id='hls', fatal=False)
        f = {
            'url': '%s%s' % (base_url, file_url),
            'format_id': 'http',
            'protocol': 'http',
            'preference': 1,
        }
        if formats:
            f_copy = formats[-1].copy()
            f_copy.update(f)
            f = f_copy
        formats.append(f)
        self._sort_formats(formats)

        thumbnail = self._og_search_thumbnail(webpage)
        uploader = self._html_search_regex(
            r'<a[^>]+href=["\']https?://busca\.biobiochile\.cl/author[^>]+>(.+?)</a>',
            webpage, 'uploader', fatal=False)

        return {
            'id': video_id,
            'title': title,
            'thumbnail': thumbnail,
            'uploader': uploader,
            'formats': formats,
        }
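
Note: one detail in the extractor above worth spelling out: the direct-HTTP format borrows the metadata of the last HLS variant and only overrides the URL, format_id, protocol and preference. In isolation, with hypothetical format dicts:

# Hypothetical HLS formats as _extract_m3u8_formats might return them:
formats = [
    {'format_id': 'hls-564', 'height': 360, 'protocol': 'm3u8_native', 'url': 'https://example.com/low.m3u8'},
    {'format_id': 'hls-1469', 'height': 720, 'protocol': 'm3u8_native', 'url': 'https://example.com/high.m3u8'},
]
f = {
    'url': 'http://unlimited2-cl.digitalproserver.com/bbtv/example.mp4',  # base_url + file_url; file name hypothetical
    'format_id': 'http',
    'protocol': 'http',
    'preference': 1,
}
f_copy = formats[-1].copy()  # borrow height etc. from the last HLS variant
f_copy.update(f)             # ...but keep the direct-HTTP url and ids
formats.append(f_copy)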

View File

@@ -28,10 +28,10 @@ class BleacherReportIE(InfoExtractor):
         'add_ie': ['Ooyala'],
     }, {
         'url': 'http://bleacherreport.com/articles/2586817-aussie-golfers-get-fright-of-their-lives-after-being-chased-by-angry-kangaroo',
-        'md5': 'af5f90dc9c7ba1c19d0a3eac806bbf50',
+        'md5': '6a5cd403418c7b01719248ca97fb0692',
         'info_dict': {
             'id': '2586817',
-            'ext': 'mp4',
+            'ext': 'webm',
             'title': 'Aussie Golfers Get Fright of Their Lives After Being Chased by Angry Kangaroo',
             'timestamp': 1446839961,
             'uploader': 'Sean Fay',
@@ -93,10 +93,14 @@ class BleacherReportCMSIE(AMPIE):
         'md5': '8c2c12e3af7805152675446c905d159b',
         'info_dict': {
             'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
-            'ext': 'flv',
+            'ext': 'mp4',
             'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
             'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
         },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
     }]

     def _real_extract(self, url):

View File

@@ -0,0 +1,60 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..compat import compat_parse_qs
from ..utils import ExtractorError


class BokeCCBaseIE(InfoExtractor):
    def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
        player_params_str = self._html_search_regex(
            r'<(?:script|embed)[^>]+src="http://p\.bokecc\.com/player\?([^"]+)',
            webpage, 'player params')

        player_params = compat_parse_qs(player_params_str)

        info_xml = self._download_xml(
            'http://p.bokecc.com/servlet/playinfo?uid=%s&vid=%s&m=1' % (
                player_params['siteid'][0], player_params['vid'][0]), video_id)

        formats = [{
            'format_id': format_id,
            'url': quality.find('./copy').attrib['playurl'],
            'preference': int(quality.attrib['value']),
        } for quality in info_xml.findall('./video/quality')]

        self._sort_formats(formats)

        return formats


class BokeCCIE(BokeCCBaseIE):
    _IE_DESC = 'CC视频'
    _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'

    _TESTS = [{
        'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B',
        'info_dict': {
            'id': 'CD0C5D3C8614B28B_E44D40C15E65EA30',
            'ext': 'flv',
            'title': 'BokeCC Video',
        },
    }]

    def _real_extract(self, url):
        qs = compat_parse_qs(re.match(self._VALID_URL, url).group('query'))
        if not qs.get('vid') or not qs.get('uid'):
            raise ExtractorError('Invalid URL', expected=True)

        video_id = '%s_%s' % (qs['uid'][0], qs['vid'][0])

        webpage = self._download_webpage(url, video_id)

        return {
            'id': video_id,
            'title': 'BokeCC Video',  # no title provided in the webpage
            'formats': self._extract_bokecc_formats(webpage, video_id),
        }
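
Note: the playinfo endpoint queried above replies with XML; the element lookups in _extract_bokecc_formats imply a structure like the following sketch (tag names match the lookups; everything else, including all values, is made up):

import xml.etree.ElementTree as ET

PLAYINFO_SAMPLE = '''<response>
  <video>
    <quality value="1"><copy playurl="http://example.com/low.flv"/></quality>
    <quality value="2"><copy playurl="http://example.com/high.flv"/></quality>
  </video>
</response>'''

info_xml = ET.fromstring(PLAYINFO_SAMPLE)
formats = [{
    'url': quality.find('./copy').attrib['playurl'],
    'preference': int(quality.attrib['value']),
} for quality in info_xml.findall('./video/quality')]
# two formats; the higher 'value' wins after _sort_formats()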

View File

@@ -12,7 +12,7 @@ from ..utils import (

 class BpbIE(InfoExtractor):
     IE_DESC = 'Bundeszentrale für politische Bildung'
-    _VALID_URL = r'http://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
+    _VALID_URL = r'https?://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'

     _TEST = {
         'url': 'http://www.bpb.de/mediathek/297/joachim-gauck-zu-1989-und-die-erinnerung-an-die-ddr',

View File

@@ -0,0 +1,28 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import smuggle_url


class BravoTVIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+videos/(?P<id>[^/?]+)'
    _TEST = {
        'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
        'md5': 'd60cdf68904e854fac669bd26cccf801',
        'info_dict': {
            'id': 'LitrBdX64qLn',
            'ext': 'mp4',
            'title': 'Last Chance Kitchen Returns',
            'description': 'S13: Last Chance Kitchen Returns for Top Chef Season 13',
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)
        account_pid = self._search_regex(r'"account_pid"\s*:\s*"([^"]+)"', webpage, 'account pid')
        release_pid = self._search_regex(r'"release_pid"\s*:\s*"([^"]+)"', webpage, 'release pid')
        return self.url_result(smuggle_url(
            'http://link.theplatform.com/s/%s/%s?mbr=true&switch=progressive' % (account_pid, release_pid),
            {'force_smil_url': True}), 'ThePlatform', release_pid)

View File

@@ -11,7 +11,7 @@ from ..utils import (

 class BreakIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
     _TESTS = [{
         'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
         'info_dict': {

View File

@@ -9,10 +9,10 @@ from ..compat import (
     compat_etree_fromstring,
     compat_parse_qs,
     compat_str,
-    compat_urllib_parse,
     compat_urllib_parse_urlparse,
     compat_urlparse,
     compat_xml_parse_error,
+    compat_HTTPError,
 )
 from ..utils import (
     determine_ext,
@@ -23,16 +23,16 @@ from ..utils import (
     js_to_json,
     int_or_none,
     parse_iso8601,
-    sanitized_Request,
     unescapeHTML,
     unsmuggle_url,
+    update_url_query,
 )


 class BrightcoveLegacyIE(InfoExtractor):
     IE_NAME = 'brightcove:legacy'
     _VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
-    _FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'
+    _FEDERATED_URL = 'http://c.brightcove.com/services/viewer/htmlFederated'

     _TESTS = [
         {
@@ -155,8 +155,8 @@ class BrightcoveLegacyIE(InfoExtractor):
         # Not all pages define this value
         if playerKey is not None:
             params['playerKey'] = playerKey
-        # The three fields hold the id of the video
-        videoPlayer = find_param('@videoPlayer') or find_param('videoId') or find_param('videoID')
+        # These fields hold the id of the video
+        videoPlayer = find_param('@videoPlayer') or find_param('videoId') or find_param('videoID') or find_param('@videoList')
         if videoPlayer is not None:
             params['@videoPlayer'] = videoPlayer
         linkBase = find_param('linkBaseURL')
@@ -184,8 +184,7 @@ class BrightcoveLegacyIE(InfoExtractor):
     @classmethod
     def _make_brightcove_url(cls, params):
-        data = compat_urllib_parse.urlencode(params)
-        return cls._FEDERATED_URL_TEMPLATE % data
+        return update_url_query(cls._FEDERATED_URL, params)

     @classmethod
     def _extract_brightcove_url(cls, webpage):
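
Note: update_url_query (from youtube_dl.utils) merges a params dict into a URL's query string with proper percent-encoding, which is what replaces the manual urlencode template above. A sketch with hypothetical player params:

from youtube_dl.utils import update_url_query

url = update_url_query(
    'http://c.brightcove.com/services/viewer/htmlFederated',
    {'playerID': '999999999', '@videoPlayer': '888888888'})  # hypothetical ids
# The query string now carries playerID=999999999 and
# %40videoPlayer=888888888 (parameter order may vary).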
@@ -239,7 +238,7 @@ class BrightcoveLegacyIE(InfoExtractor):
             # We set the original url as the default 'Referer' header
             referer = smuggled_data.get('Referer', url)
             return self._get_video_info(
-                videoPlayer[0], query_str, query, referer=referer)
+                videoPlayer[0], query, referer=referer)
         elif 'playerKey' in query:
             player_key = query['playerKey']
             return self._get_playlist_info(player_key[0])
@@ -248,15 +247,14 @@ class BrightcoveLegacyIE(InfoExtractor):
                 'Cannot find playerKey= variable. Did you forget quotes in a shell invocation?',
                 expected=True)

-    def _get_video_info(self, video_id, query_str, query, referer=None):
-        request_url = self._FEDERATED_URL_TEMPLATE % query_str
-        req = sanitized_Request(request_url)
+    def _get_video_info(self, video_id, query, referer=None):
+        headers = {}
         linkBase = query.get('linkBaseURL')
         if linkBase is not None:
             referer = linkBase[0]
         if referer is not None:
-            req.add_header('Referer', referer)
-        webpage = self._download_webpage(req, video_id)
+            headers['Referer'] = referer
+        webpage = self._download_webpage(self._FEDERATED_URL, video_id, headers=headers, query=query)

         error_msg = self._html_search_regex(
             r"<h1>We're sorry.</h1>([\s\n]*<p>.*?</p>)+", webpage,
@@ -355,7 +353,7 @@ class BrightcoveLegacyIE(InfoExtractor):

 class BrightcoveNewIE(InfoExtractor):
     IE_NAME = 'brightcove:new'
-    _VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>(?:ref:)?\d+)'
+    _VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)'
     _TESTS = [{
         'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001',
         'md5': 'c8100925723840d4b0d243f7025703be',
@@ -391,6 +389,10 @@ class BrightcoveNewIE(InfoExtractor):
         # ref: prefixed video id
         'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442',
         'only_matching': True,
+    }, {
+        # non numeric ref: prefixed video id
+        'url': 'http://players.brightcove.net/710858724001/default_default/index.html?videoId=ref:event-stream-356',
+        'only_matching': True,
     }]

     @staticmethod
@@ -410,8 +412,8 @@ class BrightcoveNewIE(InfoExtractor):
         # Look for iframe embeds [1]
         for _, url in re.findall(
-                r'<iframe[^>]+src=(["\'])((?:https?:)//players\.brightcove\.net/\d+/[^/]+/index\.html.+?)\1', webpage):
-            entries.append(url)
+                r'<iframe[^>]+src=(["\'])((?:https?:)?//players\.brightcove\.net/\d+/[^/]+/index\.html.+?)\1', webpage):
+            entries.append(url if url.startswith('http') else 'http:' + url)

         # Look for embed_in_page embeds [2]
         for video_id, account_id, player_id, embed in re.findall(
@@ -420,11 +422,11 @@ class BrightcoveNewIE(InfoExtractor):
             # According to [4] data-video-id may be prefixed with ref:
             r'''(?sx)
                 <video[^>]+
-                    data-video-id=["\']((?:ref:)?\d+)["\'][^>]*>.*?
+                    data-video-id=["\'](\d+|ref:[^"\']+)["\'][^>]*>.*?
                 </video>.*?
                 <script[^>]+
                     src=["\'](?:https?:)?//players\.brightcove\.net/
-                    (\d+)/([\da-f-]+)_([^/]+)/index\.min\.js
+                    (\d+)/([\da-f-]+)_([^/]+)/index(?:\.min)?\.js
             ''', webpage):
             entries.append(
                 'http://players.brightcove.net/%s/%s_%s/index.html?videoId=%s'
@@ -454,24 +456,33 @@ class BrightcoveNewIE(InfoExtractor):
             r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
             webpage, 'policy key', group='pk')

-        req = sanitized_Request(
-            'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s'
-            % (account_id, video_id),
-            headers={'Accept': 'application/json;pk=%s' % policy_key})
-        json_data = self._download_json(req, video_id)
+        api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
+        try:
+            json_data = self._download_json(api_url, video_id, headers={
+                'Accept': 'application/json;pk=%s' % policy_key
+            })
+        except ExtractorError as e:
+            if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
+                json_data = self._parse_json(e.cause.read().decode(), video_id)
+                raise ExtractorError(json_data[0]['message'], expected=True)
+            raise

         title = json_data['name']

         formats = []
         for source in json_data.get('sources', []):
+            container = source.get('container')
             source_type = source.get('type')
             src = source.get('src')
-            if source_type == 'application/x-mpegURL':
+            if source_type == 'application/x-mpegURL' or container == 'M2TS':
                 if not src:
                     continue
                 formats.extend(self._extract_m3u8_formats(
-                    src, video_id, 'mp4', entry_protocol='m3u8_native',
-                    m3u8_id='hls', fatal=False))
+                    src, video_id, 'mp4', m3u8_id='hls', fatal=False))
+            elif source_type == 'application/dash+xml':
+                if not src:
+                    continue
+                formats.extend(self._extract_mpd_formats(src, video_id, 'dash', fatal=False))
             else:
                 streaming_src = source.get('streaming_src')
                 stream_name, app_name = source.get('stream_name'), source.get('app_name')
@@ -479,15 +490,23 @@ class BrightcoveNewIE(InfoExtractor):
                     continue
                 tbr = float_or_none(source.get('avg_bitrate'), 1000)
                 height = int_or_none(source.get('height'))
+                width = int_or_none(source.get('width'))
                 f = {
                     'tbr': tbr,
-                    'width': int_or_none(source.get('width')),
-                    'height': height,
                     'filesize': int_or_none(source.get('size')),
-                    'container': source.get('container'),
-                    'vcodec': source.get('codec'),
-                    'ext': source.get('container').lower(),
+                    'container': container,
+                    'ext': container.lower(),
                 }
+                if width == 0 and height == 0:
+                    f.update({
+                        'vcodec': 'none',
+                    })
+                else:
+                    f.update({
+                        'width': width,
+                        'height': height,
+                        'vcodec': source.get('codec'),
+                    })

                 def build_format_id(kind):
                     format_id = kind
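
Note: on the new error path above, the Playback API answers a rejected request (e.g. a bad policy key) with HTTP 403 whose body is a JSON array of error objects, and the code surfaces the first one's message. A standalone sketch of the same pattern, using plain urllib; the ids and key are placeholders, and the error-body shape is inferred from the json_data[0]['message'] access above:

import json
try:
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError
except ImportError:  # Python 2
    from urllib2 import Request, urlopen, HTTPError

api_url = ('https://edge.api.brightcove.com/playback/v1'
           '/accounts/123/videos/456')  # placeholder ids
req = Request(api_url, headers={'Accept': 'application/json;pk=POLICY_KEY'})
try:
    json_data = json.loads(urlopen(req).read().decode())
except HTTPError as e:
    if e.code == 403:
        errors = json.loads(e.read().decode())  # e.g. [{'message': ...}]
        raise Exception(errors[0]['message'])
    raise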

View File

@@ -4,12 +4,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..utils import js_to_json


 class C56IE(InfoExtractor):
     _VALID_URL = r'https?://(?:(?:www|player)\.)?56\.com/(?:.+?/)?(?:v_|(?:play_album.+-))(?P<textid>.+?)\.(?:html|swf)'
     IE_NAME = '56.com'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.56.com/u39/v_OTM0NDA3MTY.html',
         'md5': 'e59995ac63d0457783ea05f93f12a866',
         'info_dict': {
@@ -18,12 +19,29 @@ class C56IE(InfoExtractor):
             'title': '网事知多少 第32期车怒',
             'duration': 283.813,
         },
-    }
+    }, {
+        'url': 'http://www.56.com/u47/v_MTM5NjQ5ODc2.html',
+        'md5': '',
+        'info_dict': {
+            'id': '82247482',
+            'title': '爱的诅咒之杜鹃花开',
+        },
+        'playlist_count': 7,
+        'add_ie': ['Sohu'],
+    }]

     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url, flags=re.VERBOSE)
         text_id = mobj.group('textid')

+        webpage = self._download_webpage(url, text_id)
+        sohu_video_info_str = self._search_regex(
+            r'var\s+sohuVideoInfo\s*=\s*({[^}]+});', webpage, 'Sohu video info', default=None)
+        if sohu_video_info_str:
+            sohu_video_info = self._parse_json(
+                sohu_video_info_str, text_id, transform_source=js_to_json)
+            return self.url_result(sohu_video_info['url'], 'Sohu')
+
         page = self._download_json(
             'http://vxml.56.com/json/%s/' % text_id, text_id, 'Downloading video info')
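
Note: js_to_json (from youtube_dl.utils) is what makes the scraped sohuVideoInfo object parseable: it quotes bare keys and converts single-quoted strings so json.loads() accepts them. A sketch with an illustrative literal:

import json
from youtube_dl.utils import js_to_json

# Illustrative sohuVideoInfo literal as it might appear in the page:
js = "{url: 'http://tv.sohu.com/example.shtml', vid: '82247482'}"
info = json.loads(js_to_json(js))
# info['url'] == 'http://tv.sohu.com/example.shtml'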

View File

@@ -16,7 +16,7 @@ from ..utils import (

 class CamdemyIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?camdemy\.com/media/(?P<id>\d+)'
     _TESTS = [{
         # single file
         'url': 'http://www.camdemy.com/media/5181/',
@@ -104,7 +104,7 @@ class CamdemyIE(InfoExtractor):

 class CamdemyFolderIE(InfoExtractor):
-    _VALID_URL = r'http://www.camdemy.com/folder/(?P<id>\d+)'
+    _VALID_URL = r'https?://www.camdemy.com/folder/(?P<id>\d+)'
     _TESTS = [{
         # links with trailing slash
         'url': 'http://www.camdemy.com/folder/450',

View File

@@ -6,7 +6,7 @@ from ..utils import float_or_none

 class CanvasIE(InfoExtractor):
     _VALID_URL = r'https?://(?:www\.)?canvas\.be/video/(?:[^/]+/)*(?P<id>[^/?#&]+)'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.canvas.be/video/de-afspraak/najaar-2015/de-afspraak-veilt-voor-de-warmste-week',
         'md5': 'ea838375a547ac787d4064d8c7860a6c',
         'info_dict': {
@@ -18,7 +18,27 @@ class CanvasIE(InfoExtractor):
             'thumbnail': 're:^https?://.*\.jpg$',
             'duration': 49.02,
         }
-    }
+    }, {
+        # with subtitles
+        'url': 'http://www.canvas.be/video/panorama/2016/pieter-0167',
+        'info_dict': {
+            'id': 'mz-ast-5240ff21-2d30-4101-bba6-92b5ec67c625',
+            'display_id': 'pieter-0167',
+            'ext': 'mp4',
+            'title': 'Pieter 0167',
+            'description': 'md5:943cd30f48a5d29ba02c3a104dc4ec4e',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 2553.08,
+            'subtitles': {
+                'nl': [{
+                    'ext': 'vtt',
+                }],
+            },
+        },
+        'params': {
+            'skip_download': True,
+        }
+    }]

     def _real_extract(self, url):
         display_id = self._match_id(url)
@@ -54,6 +74,14 @@ class CanvasIE(InfoExtractor):
             })
         self._sort_formats(formats)

+        subtitles = {}
+        subtitle_urls = data.get('subtitleUrls')
+        if isinstance(subtitle_urls, list):
+            for subtitle in subtitle_urls:
+                subtitle_url = subtitle.get('url')
+                if subtitle_url and subtitle.get('type') == 'CLOSED':
+                    subtitles.setdefault('nl', []).append({'url': subtitle_url})
+
         return {
             'id': video_id,
             'display_id': display_id,
@@ -62,4 +90,5 @@ class CanvasIE(InfoExtractor):
             'formats': formats,
             'duration': float_or_none(data.get('duration'), 1000),
             'thumbnail': data.get('posterImageUrl'),
+            'subtitles': subtitles,
         }

View File

@@ -3,12 +3,15 @@ from __future__ import unicode_literals

 from .common import InfoExtractor
 from .theplatform import ThePlatformIE
-from ..utils import parse_duration
+from ..utils import (
+    parse_duration,
+    find_xpath_attr,
+)


 class CBSNewsIE(ThePlatformIE):
     IE_DESC = 'CBS News'
-    _VALID_URL = r'http://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)'
+    _VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)'

     _TESTS = [
         {
@@ -46,6 +49,15 @@ class CBSNewsIE(ThePlatformIE):
         },
     ]

+    def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
+        closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
+        return {
+            'en': [{
+                'ext': 'ttml',
+                'url': closed_caption_e.attrib['value'],
+            }]
+        } if closed_caption_e is not None and closed_caption_e.attrib.get('value') else []
+
     def _real_extract(self, url):
         video_id = self._match_id(url)

@@ -61,18 +73,12 @@ class CBSNewsIE(ThePlatformIE):
         thumbnail = item.get('mediaImage') or item.get('thumbnail')

         subtitles = {}
-        if 'mpxRefId' in video_info:
-            subtitles['en'] = [{
-                'ext': 'ttml',
-                'url': 'http://www.cbsnews.com/videos/captions/%s.adb_xml' % video_info['mpxRefId'],
-            }]

         formats = []
         for format_id in ['RtmpMobileLow', 'RtmpMobileHigh', 'Hls', 'RtmpDesktop']:
             pid = item.get('media' + format_id)
             if not pid:
                 continue
-            release_url = 'http://link.theplatform.com/s/dJ5BDC/%s?format=SMIL&mbr=true' % pid
+            release_url = 'http://link.theplatform.com/s/dJ5BDC/%s?mbr=true' % pid
             tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % pid)
             formats.extend(tp_formats)
             subtitles = self._merge_subtitles(subtitles, tp_subtitles)
@@ -90,7 +96,7 @@ class CBSNewsIE(ThePlatformIE):

 class CBSNewsLiveVideoIE(InfoExtractor):
     IE_DESC = 'CBS News Live Videos'
-    _VALID_URL = r'http://(?:www\.)?cbsnews\.com/live/video/(?P<id>[\da-z_-]+)'
+    _VALID_URL = r'https?://(?:www\.)?cbsnews\.com/live/video/(?P<id>[\da-z_-]+)'

     _TEST = {
         'url': 'http://www.cbsnews.com/live/video/clinton-sanders-prepare-to-face-off-in-nh/',

View File

@@ -6,7 +6,7 @@ from .common import InfoExtractor

 class CBSSportsIE(InfoExtractor):
-    _VALID_URL = r'http://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://www\.cbssports\.com/video/player/(?P<section>[^/]+)/(?P<id>[^/]+)'

     _TEST = {
         'url': 'http://www.cbssports.com/video/player/tennis/318462531970/0/us-open-flashbacks-1990s',

View File

@@ -45,7 +45,7 @@ class CCCIE(InfoExtractor):
         title = self._html_search_regex(
             r'(?s)<h1>(.*?)</h1>', webpage, 'title')
         description = self._html_search_regex(
-            r"(?s)<h3>About</h3>(.+?)<h3>",
+            r'(?s)<h3>About</h3>(.+?)<h3>',
             webpage, 'description', fatal=False)
         upload_date = unified_strdate(self._html_search_regex(
             r"(?s)<span[^>]+class='[^']*fa-calendar-o'[^>]*>(.+?)</span>",

youtube_dl/extractor/cda.py Executable file
View File

@@ -0,0 +1,96 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import (
    decode_packed_codes,
    ExtractorError,
    parse_duration
)


class CDAIE(InfoExtractor):
    _VALID_URL = r'https?://(?:(?:www\.)?cda\.pl/video|ebd\.cda\.pl/[0-9]+x[0-9]+)/(?P<id>[0-9a-z]+)'
    _TESTS = [{
        'url': 'http://www.cda.pl/video/5749950c',
        'md5': '6f844bf51b15f31fae165365707ae970',
        'info_dict': {
            'id': '5749950c',
            'ext': 'mp4',
            'height': 720,
            'title': 'Oto dlaczego przed zakrętem należy zwolnić.',
            'duration': 39
        }
    }, {
        'url': 'http://www.cda.pl/video/57413289',
        'md5': 'a88828770a8310fc00be6c95faf7f4d5',
        'info_dict': {
            'id': '57413289',
            'ext': 'mp4',
            'title': 'Lądowanie na lotnisku na Maderze',
            'duration': 137
        }
    }, {
        'url': 'http://ebd.cda.pl/0x0/5749950c',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage('http://ebd.cda.pl/0x0/' + video_id, video_id)

        if 'Ten film jest dostępny dla użytkowników premium' in webpage:
            raise ExtractorError('This video is only available for premium users.', expected=True)

        title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')

        formats = []

        info_dict = {
            'id': video_id,
            'title': title,
            'formats': formats,
            'duration': None,
        }

        def extract_format(page, version):
            unpacked = decode_packed_codes(page)
            format_url = self._search_regex(
                r"url:\\'(.+?)\\'", unpacked, '%s url' % version, fatal=False)
            if not format_url:
                return
            f = {
                'url': format_url,
            }
            m = re.search(
                r'<a[^>]+data-quality="(?P<format_id>[^"]+)"[^>]+href="[^"]+"[^>]+class="[^"]*quality-btn-active[^"]*">(?P<height>[0-9]+)p',
                page)
            if m:
                f.update({
                    'format_id': m.group('format_id'),
                    'height': int(m.group('height')),
                })
            info_dict['formats'].append(f)
            if not info_dict['duration']:
                info_dict['duration'] = parse_duration(self._search_regex(
                    r"duration:\\'(.+?)\\'", unpacked, 'duration', fatal=False))

        extract_format(webpage, 'default')

        for href, resolution in re.findall(
                r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
                webpage):
            webpage = self._download_webpage(
                href, video_id, 'Downloading %s version information' % resolution, fatal=False)
            if not webpage:
                # Manually report warning because empty page is returned when
                # invalid version is requested.
                self.report_warning('Unable to download %s version information' % resolution)
                continue

            extract_format(webpage, resolution)

        self._sort_formats(formats)

        return info_dict
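
Note: on the double-escaped regexes above: decode_packed_codes (youtube_dl.utils) unpacks the obfuscated player script, inside which quotes around the stream URL remain backslash-escaped, hence the patterns match url:\'...\' literally. A sketch against an illustrative unpacked fragment:

import re

# Illustrative fragment of the unpacked player setup:
unpacked = r"config:{url:\'http://example.com/video.mp4\',duration:\'0:39\'}"
format_url = re.search(r"url:\\'(.+?)\\'", unpacked).group(1)
duration = re.search(r"duration:\\'(.+?)\\'", unpacked).group(1)
# format_url == 'http://example.com/video.mp4'
# parse_duration('0:39') == 39, matching the first test above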

View File

@@ -129,7 +129,8 @@ class CeskaTelevizeIE(InfoExtractor):
         formats = []
         for format_id, stream_url in item['streamUrls'].items():
             formats.extend(self._extract_m3u8_formats(
-                stream_url, playlist_id, 'mp4', entry_protocol='m3u8_native'))
+                stream_url, playlist_id, 'mp4',
+                entry_protocol='m3u8_native', fatal=False))
         self._sort_formats(formats)

         item_id = item.get('id') or item['assetId']
@@ -177,16 +178,16 @@ class CeskaTelevizeIE(InfoExtractor):
             for divider in [1000, 60, 60, 100]:
                 components.append(msec % divider)
                 msec //= divider
-            return "{3:02}:{2:02}:{1:02},{0:03}".format(*components)
+            return '{3:02}:{2:02}:{1:02},{0:03}'.format(*components)

         def _fix_subtitle(subtitle):
             for line in subtitle.splitlines():
-                m = re.match(r"^\s*([0-9]+);\s*([0-9]+)\s+([0-9]+)\s*$", line)
+                m = re.match(r'^\s*([0-9]+);\s*([0-9]+)\s+([0-9]+)\s*$', line)
                 if m:
                     yield m.group(1)
                     start, stop = (_msectotimecode(int(t)) for t in m.groups()[1:])
-                    yield "{0} --> {1}".format(start, stop)
+                    yield '{0} --> {1}'.format(start, stop)
                 else:
                     yield line

-        return "\r\n".join(_fix_subtitle(subtitles))
+        return '\r\n'.join(_fix_subtitle(subtitles))

View File

@@ -21,6 +21,10 @@ class CinemassacreIE(InfoExtractor):
             'title': '“Angry Video Game Nerd: The Movie” Trailer',
             'description': 'md5:fb87405fcb42a331742a0dce2708560b',
         },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
     },
     {
         'url': 'http://cinemassacre.com/2013/10/02/the-mummys-hand-1940',
@@ -31,14 +35,18 @@ class CinemassacreIE(InfoExtractor):
             'upload_date': '20131002',
             'title': 'The Mummys Hand (1940)',
         },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
     },
     {
         # Youtube embedded video
         'url': 'http://cinemassacre.com/2006/12/07/chronologically-confused-about-bad-movie-and-video-game-sequel-titles/',
-        'md5': 'df4cf8a1dcedaec79a73d96d83b99023',
+        'md5': 'ec9838a5520ef5409b3e4e42fcb0a3b9',
         'info_dict': {
             'id': 'OEVzPCY2T-g',
-            'ext': 'mp4',
+            'ext': 'webm',
             'title': 'AVGN: Chronologically Confused about Bad Movie and Video Game Sequel Titles',
             'upload_date': '20061207',
             'uploader': 'Cinemassacre',
@@ -49,12 +57,12 @@ class CinemassacreIE(InfoExtractor):
     {
         # Youtube embedded video
         'url': 'http://cinemassacre.com/2006/09/01/mckids/',
-        'md5': '6eb30961fa795fedc750eac4881ad2e1',
+        'md5': '7393c4e0f54602ad110c793eb7a6513a',
         'info_dict': {
             'id': 'FnxsNhuikpo',
-            'ext': 'mp4',
+            'ext': 'webm',
             'upload_date': '20060901',
-            'uploader': 'Cinemassacre Extras',
+            'uploader': 'Cinemassacre Extra',
             'description': 'md5:de9b751efa9e45fbaafd9c8a1123ed53',
             'uploader_id': 'Cinemassacre',
             'title': 'AVGN: McKids',
@@ -69,7 +77,11 @@ class CinemassacreIE(InfoExtractor):
             'description': 'Lets Play Mario Kart 64 !! Mario Kart 64 is a classic go-kart racing game released for the Nintendo 64 (N64). Today James & Mike do 4 player Battle Mode with Kyle and Bootsy!',
             'title': 'Mario Kart 64 (Nintendo 64) James & Mike Mondays',
             'upload_date': '20150525',
-        }
+        },
+        'params': {
+            # m3u8 download
+            'skip_download': True,
+        },
     }
 ]

View File

@@ -19,7 +19,7 @@ def _decode(s):
 class CliphunterIE(InfoExtractor):
     IE_NAME = 'cliphunter'

-    _VALID_URL = r'''(?x)http://(?:www\.)?cliphunter\.com/w/
+    _VALID_URL = r'''(?x)https?://(?:www\.)?cliphunter\.com/w/
         (?P<id>[0-9]+)/
         (?P<seo>.+?)(?:$|[#\?])
     '''

View File

@@ -8,7 +8,7 @@ from ..utils import (
 class ClipsyndicateIE(InfoExtractor):
-    _VALID_URL = r'http://(?:chic|www)\.clipsyndicate\.com/video/play(list/\d+)?/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:chic|www)\.clipsyndicate\.com/video/play(list/\d+)?/(?P<id>\d+)'
     _TESTS = [{
         'url': 'http://www.clipsyndicate.com/video/play/4629301/brick_briscoe',

View File

@@ -12,7 +12,7 @@ from ..utils import (
 class ClubicIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?clubic\.com/video/(?:[^/]+/)*video.*-(?P<id>[0-9]+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?clubic\.com/video/(?:[^/]+/)*video.*-(?P<id>[0-9]+)\.html'
     _TESTS = [{
         'url': 'http://www.clubic.com/video/clubic-week/video-clubic-week-2-0-le-fbi-se-lance-dans-la-photo-d-identite-448474.html',

View File

@@ -51,9 +51,7 @@ class CNETIE(ThePlatformIE):
         uploader = None
         uploader_id = None

-        mpx_account = data['config']['uvpConfig']['default']['mpx_account']
-        metadata = self.get_metadata('%s/%s' % (mpx_account, list(vdata['files'].values())[0]), video_id)
+        metadata = self.get_metadata('kYEXFC/%s' % list(vdata['files'].values())[0], video_id)
         description = vdata.get('description') or metadata.get('description')
         duration = int_or_none(vdata.get('duration')) or metadata.get('duration')
@@ -62,7 +60,7 @@ class CNETIE(ThePlatformIE):
         for (fkey, vid) in vdata['files'].items():
             if fkey == 'hls_phone' and 'hls_tablet' in vdata['files']:
                 continue
-            release_url = 'http://link.theplatform.com/s/%s/%s?format=SMIL&mbr=true' % (mpx_account, vid)
+            release_url = 'http://link.theplatform.com/s/kYEXFC/%s?mbr=true' % vid
             if fkey == 'hds':
                 release_url += '&manifest=f4m'
             tp_formats, tp_subtitles = self._extract_theplatform_smil(release_url, video_id, 'Downloading %s SMIL data' % fkey)

View File

@@ -26,14 +26,14 @@ class CNNIE(InfoExtractor):
             'upload_date': '20130609',
         },
     }, {
-        "url": "http://edition.cnn.com/video/?/video/us/2013/08/21/sot-student-gives-epic-speech.georgia-institute-of-technology&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+rss%2Fcnn_topstories+%28RSS%3A+Top+Stories%29",
-        "md5": "b5cc60c60a3477d185af8f19a2a26f4e",
-        "info_dict": {
+        'url': 'http://edition.cnn.com/video/?/video/us/2013/08/21/sot-student-gives-epic-speech.georgia-institute-of-technology&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+rss%2Fcnn_topstories+%28RSS%3A+Top+Stories%29',
+        'md5': 'b5cc60c60a3477d185af8f19a2a26f4e',
+        'info_dict': {
             'id': 'us/2013/08/21/sot-student-gives-epic-speech.georgia-institute-of-technology',
             'ext': 'mp4',
-            "title": "Student's epic speech stuns new freshmen",
-            "description": "A Georgia Tech student welcomes the incoming freshmen with an epic speech backed by music from \"2001: A Space Odyssey.\"",
-            "upload_date": "20130821",
+            'title': "Student's epic speech stuns new freshmen",
+            'description': "A Georgia Tech student welcomes the incoming freshmen with an epic speech backed by music from \"2001: A Space Odyssey.\"",
+            'upload_date': '20130821',
         }
     }, {
         'url': 'http://www.cnn.com/video/data/2.0/video/living/2014/12/22/growing-america-nashville-salemtown-board-episode-1.hln.html',

View File

@@ -46,9 +46,9 @@ class CollegeRamaIE(InfoExtractor):
         video_id = self._match_id(url)

         player_options_request = {
-            "getPlayerOptionsRequest": {
-                "ResourceId": video_id,
-                "QueryString": "",
+            'getPlayerOptionsRequest': {
+                'ResourceId': video_id,
+                'QueryString': '',
             }
         }

View File

@@ -11,7 +11,7 @@ from ..utils import (
 class ComCarCoffIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
+    _VALID_URL = r'https?://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
     _TESTS = [{
         'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
         'info_dict': {

View File

@@ -16,11 +16,11 @@ from ..utils import (
 class ComedyCentralIE(MTVServicesInfoExtractor):
     _VALID_URL = r'''(?x)https?://(?:www\.)?cc\.com/
-        (video-clips|episodes|cc-studios|video-collections|full-episodes)
+        (video-clips|episodes|cc-studios|video-collections|full-episodes|shows)
         /(?P<title>.*)'''
     _FEED_URL = 'http://comedycentral.com/feeds/mrss/'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.cc.com/video-clips/kllhuv/stand-up-greg-fitzsimmons--uncensored---too-good-of-a-mother',
         'md5': 'c4f48e9eda1b16dd10add0744344b6d8',
         'info_dict': {
@@ -29,7 +29,10 @@ class ComedyCentralIE(MTVServicesInfoExtractor):
             'title': 'CC:Stand-Up|Greg Fitzsimmons: Life on Stage|Uncensored - Too Good of a Mother',
             'description': 'After a certain point, breastfeeding becomes c**kblocking.',
         },
-    }
+    }, {
+        'url': 'http://www.cc.com/shows/the-daily-show-with-trevor-noah/interviews/6yx39d/exclusive-rand-paul-extended-interview',
+        'only_matching': True,
+    }]


 class ComedyCentralShowsIE(MTVServicesInfoExtractor):
@@ -192,7 +195,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
             if len(altMovieParams) == 0:
                 raise ExtractorError('unable to find Flash URL in webpage ' + url)
             else:
-                mMovieParams = [("http://media.mtvnservices.com/" + altMovieParams[0], altMovieParams[0])]
+                mMovieParams = [('http://media.mtvnservices.com/' + altMovieParams[0], altMovieParams[0])]

             uri = mMovieParams[0][1]
             # Correct cc.com in uri

View File

@@ -15,13 +15,14 @@ import math
 from ..compat import (
     compat_cookiejar,
     compat_cookies,
+    compat_etree_fromstring,
     compat_getpass,
     compat_http_client,
+    compat_os_name,
+    compat_str,
     compat_urllib_error,
     compat_urllib_parse,
     compat_urlparse,
-    compat_str,
-    compat_etree_fromstring,
 )
 from ..utils import (
     NO_DEFAULT,
@@ -46,6 +47,8 @@ from ..utils import (
     xpath_with_ns,
     determine_protocol,
     parse_duration,
+    mimetype2ext,
+    update_url_query,
 )
@@ -103,7 +106,7 @@ class InfoExtractor(object):
     * protocol   The protocol that will be used for the actual
                  download, lower-case.
                  "http", "https", "rtsp", "rtmp", "rtmpe",
-                 "m3u8", or "m3u8_native".
+                 "m3u8", "m3u8_native" or "http_dash_segments".
     * preference Order number of this format. If this field is
                  present and not None, the formats get sorted
                  by this field, regardless of all other values.
@@ -156,12 +159,14 @@ class InfoExtractor(object):
     thumbnail:      Full URL to a video thumbnail image.
     description:    Full video description.
     uploader:       Full name of the video uploader.
+    license:        License name the video is licensed under.
     creator:        The main artist who created the video.
     release_date:   The date (YYYYMMDD) when the video was released.
     timestamp:      UNIX timestamp of the moment the video became available.
     upload_date:    Video upload date (YYYYMMDD).
                     If not explicitly set, calculated from timestamp.
     uploader_id:    Nickname or id of the video uploader.
+    uploader_url:   Full URL to a personal webpage of the video uploader.
     location:       Physical location where the video was filmed.
     subtitles:      The available subtitles as a dictionary in the format
                     {language: subformats}. "subformats" is a list sorted from
@@ -341,7 +346,7 @@ class InfoExtractor(object):
def IE_NAME(self): def IE_NAME(self):
return compat_str(type(self).__name__[:-2]) return compat_str(type(self).__name__[:-2])
def _request_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True): def _request_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, data=None, headers=None, query=None):
""" Returns the response handle """ """ Returns the response handle """
if note is None: if note is None:
self.report_download_webpage(video_id) self.report_download_webpage(video_id)
@@ -350,6 +355,12 @@ class InfoExtractor(object):
self.to_screen('%s' % (note,)) self.to_screen('%s' % (note,))
else: else:
self.to_screen('%s: %s' % (video_id, note)) self.to_screen('%s: %s' % (video_id, note))
# data, headers and query params will be ignored for `Request` objects
if isinstance(url_or_request, compat_str):
if query:
url_or_request = update_url_query(url_or_request, query)
if data or headers:
url_or_request = sanitized_Request(url_or_request, data, headers or {})
try: try:
return self._downloader.urlopen(url_or_request) return self._downloader.urlopen(url_or_request)
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err: except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
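With these hooks in place, extractors can pass query parameters and headers directly instead of pre-building URLs and Request objects. A hedged sketch of the calling pattern (the endpoint and parameter names are invented for illustration):

# Inside a hypothetical extractor's _real_extract(); endpoint and params invented.
info = self._download_json(
    'http://api.example.com/v2/videos', video_id,
    note='Downloading video JSON',
    query={'video_id': video_id, 'format': 'json'},   # appended via update_url_query()
    headers={'X-Requested-With': 'XMLHttpRequest'})   # wrapped into sanitized_Request()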
@@ -365,13 +376,13 @@ class InfoExtractor(object):
             self._downloader.report_warning(errmsg)
             return False

-    def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None):
+    def _download_webpage_handle(self, url_or_request, video_id, note=None, errnote=None, fatal=True, encoding=None, data=None, headers=None, query=None):
         """ Returns a tuple (page content as string, URL handle) """
         # Strip hashes from the URL (#1038)
         if isinstance(url_or_request, (compat_str, str)):
             url_or_request = url_or_request.partition('#')[0]

-        urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal)
+        urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query)
         if urlh is False:
             assert not fatal
             return False
@@ -424,7 +435,7 @@ class InfoExtractor(object):
             self.to_screen('Saving request to ' + filename)
             # Working around MAX_PATH limitation on Windows (see
             # http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx)
-            if os.name == 'nt':
+            if compat_os_name == 'nt':
                 absfilepath = os.path.abspath(filename)
                 if len(absfilepath) > 259:
                     filename = '\\\\?\\' + absfilepath
@@ -458,13 +469,13 @@ class InfoExtractor(object):
         return content

-    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None):
+    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None, data=None, headers=None, query=None):
         """ Returns the data of the page as a string """
         success = False
         try_count = 0
         while success is False:
             try:
-                res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal, encoding=encoding)
+                res = self._download_webpage_handle(url_or_request, video_id, note, errnote, fatal, encoding=encoding, data=data, headers=headers, query=query)
                 success = True
             except compat_http_client.IncompleteRead as e:
                 try_count += 1
@@ -479,10 +490,10 @@ class InfoExtractor(object):
     def _download_xml(self, url_or_request, video_id,
                       note='Downloading XML', errnote='Unable to download XML',
-                      transform_source=None, fatal=True, encoding=None):
+                      transform_source=None, fatal=True, encoding=None, data=None, headers=None, query=None):
         """Return the xml as an xml.etree.ElementTree.Element"""
         xml_string = self._download_webpage(
-            url_or_request, video_id, note, errnote, fatal=fatal, encoding=encoding)
+            url_or_request, video_id, note, errnote, fatal=fatal, encoding=encoding, data=data, headers=headers, query=query)
         if xml_string is False:
             return xml_string
         if transform_source:
@@ -493,10 +504,10 @@ class InfoExtractor(object):
                        note='Downloading JSON metadata',
                        errnote='Unable to download JSON metadata',
                        transform_source=None,
-                       fatal=True, encoding=None):
+                       fatal=True, encoding=None, data=None, headers=None, query=None):
         json_string = self._download_webpage(
             url_or_request, video_id, note, errnote, fatal=fatal,
-            encoding=encoding)
+            encoding=encoding, data=data, headers=headers, query=query)
         if (not fatal) and json_string is False:
             return None
         return self._parse_json(
@@ -593,7 +604,7 @@ class InfoExtractor(object):
             if mobj:
                 break

-        if not self._downloader.params.get('no_color') and os.name != 'nt' and sys.stderr.isatty():
+        if not self._downloader.params.get('no_color') and compat_os_name != 'nt' and sys.stderr.isatty():
             _name = '\033[0;34m%s\033[0m' % name
         else:
             _name = name
@@ -636,7 +647,7 @@ class InfoExtractor(object):
         downloader_params = self._downloader.params

         # Attempt to use provided username and password or .netrc data
-        if downloader_params.get('username', None) is not None:
+        if downloader_params.get('username') is not None:
             username = downloader_params['username']
             password = downloader_params['password']
         elif downloader_params.get('usenetrc', False):
@@ -663,7 +674,7 @@ class InfoExtractor(object):
             return None
         downloader_params = self._downloader.params

-        if downloader_params.get('twofactor', None) is not None:
+        if downloader_params.get('twofactor') is not None:
             return downloader_params['twofactor']

         return compat_getpass('Type %s and press [Return]: ' % note)
@@ -744,7 +755,7 @@ class InfoExtractor(object):
             'mature': 17,
             'restricted': 19,
         }
-        return RATING_TABLE.get(rating.lower(), None)
+        return RATING_TABLE.get(rating.lower())

     def _family_friendly_search(self, html):
         # See http://schema.org/VideoObject
@@ -759,7 +770,7 @@ class InfoExtractor(object):
             '0': 18,
             'false': 18,
         }
-        return RATING_TABLE.get(family_friendly.lower(), None)
+        return RATING_TABLE.get(family_friendly.lower())

     def _twitter_search_player(self, html):
         return self._html_search_meta('twitter:player', html,
@@ -851,6 +862,7 @@ class InfoExtractor(object):
             proto_preference = 0 if determine_protocol(f) in ['http', 'https'] else -0.1

             if f.get('vcodec') == 'none':  # audio only
+                preference -= 50
                 if self._downloader.params.get('prefer_free_formats'):
                     ORDER = ['aac', 'mp3', 'm4a', 'webm', 'ogg', 'opus']
                 else:
@@ -861,6 +873,8 @@ class InfoExtractor(object):
                 except ValueError:
                     audio_ext_preference = -1
             else:
+                if f.get('acodec') == 'none':  # video only
+                    preference -= 40
                 if self._downloader.params.get('prefer_free_formats'):
                     ORDER = ['flv', 'mp4', 'webm']
                 else:
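The new penalties push audio-only (-50) and video-only (-40) formats below full audio+video formats when sorting. A toy illustration of the effect (simplified standalone code, not the real _sort_formats, which weighs many more fields):

formats = [
    {'format_id': 'audio', 'vcodec': 'none', 'acodec': 'mp4a', 'preference': 0},
    {'format_id': 'video', 'vcodec': 'avc1', 'acodec': 'none', 'preference': 0},
    {'format_id': 'muxed', 'vcodec': 'avc1', 'acodec': 'mp4a', 'preference': 0},
]
for f in formats:
    if f['vcodec'] == 'none':      # audio only
        f['preference'] -= 50
    elif f['acodec'] == 'none':    # video only
        f['preference'] -= 40
# Best-first order: muxed (0) > video-only (-40) > audio-only (-50)
print(sorted(formats, key=lambda f: f['preference'], reverse=True))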
@@ -899,6 +913,16 @@ class InfoExtractor(object):
                     item='%s video format' % f.get('format_id') if f.get('format_id') else 'video'),
                 formats)

+    @staticmethod
+    def _remove_duplicate_formats(formats):
+        format_urls = set()
+        unique_formats = []
+        for f in formats:
+            if f['url'] not in format_urls:
+                format_urls.add(f['url'])
+                unique_formats.append(f)
+        formats[:] = unique_formats
+
     def _is_valid_url(self, url, video_id, item='video'):
         url = self._proto_relative_url(url, scheme='http:')
         # For now assume non HTTP(S) URLs always valid
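A quick sketch of how an extractor might use the new helper after merging format lists from several manifests (the URLs are invented):

from youtube_dl.extractor.common import InfoExtractor

formats = [
    {'url': 'http://cdn.example.com/v/1080.mp4', 'format_id': 'http-1080'},
    {'url': 'http://cdn.example.com/v/720.mp4', 'format_id': 'http-720'},
    {'url': 'http://cdn.example.com/v/1080.mp4', 'format_id': 'smil-1080'},  # duplicate URL
]
InfoExtractor._remove_duplicate_formats(formats)
# Deduplicated in place by URL; the first occurrence wins, so 'smil-1080' is dropped.
assert [f['format_id'] for f in formats] == ['http-1080', 'http-720']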
@@ -952,6 +976,13 @@ class InfoExtractor(object):
         if manifest is False:
             return []

+        return self._parse_f4m_formats(
+            manifest, manifest_url, video_id, preference=preference, f4m_id=f4m_id,
+            transform_source=transform_source, fatal=fatal)
+
+    def _parse_f4m_formats(self, manifest, manifest_url, video_id, preference=None, f4m_id=None,
+                           transform_source=lambda s: fix_xml_ampersands(s).strip(),
+                           fatal=True):
         formats = []
         manifest_version = '1.0'
         media_nodes = manifest.findall('{http://ns.adobe.com/f4m/1.0}media')
@@ -977,7 +1008,8 @@ class InfoExtractor(object):
                 # bitrate in f4m downloader
                 if determine_ext(manifest_url) == 'f4m':
                     formats.extend(self._extract_f4m_formats(
-                        manifest_url, video_id, preference, f4m_id, fatal=fatal))
+                        manifest_url, video_id, preference=preference, f4m_id=f4m_id,
+                        transform_source=transform_source, fatal=fatal))
                     continue
                 tbr = int_or_none(media_el.attrib.get('bitrate'))
                 formats.append({
@@ -1022,11 +1054,21 @@ class InfoExtractor(object):
             return []
         m3u8_doc, urlh = res
         m3u8_url = urlh.geturl()
-        # A Media Playlist Tag MUST NOT appear in a Master Playlist
-        # https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3
-        # The EXT-X-TARGETDURATION tag is REQUIRED for every M3U8 Media Playlists
-        # https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.1
-        if '#EXT-X-TARGETDURATION' in m3u8_doc:
+
+        # We should try extracting formats only from master playlists [1], i.e.
+        # playlists that describe available qualities. On the other hand media
+        # playlists [2] should be returned as is since they contain just the media
+        # without qualities renditions.
+        # Fortunately, master playlist can be easily distinguished from media
+        # playlist based on particular tags availability. As of [1, 2] master
+        # playlist tags MUST NOT appear in a media playlist and vice versa.
+        # As of [3] #EXT-X-TARGETDURATION tag is REQUIRED for every media playlist
+        # and MUST NOT appear in master playlist thus we can clearly detect media
+        # playlist with this criterion.
+        # 1. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.4
+        # 2. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3
+        # 3. https://tools.ietf.org/html/draft-pantos-http-live-streaming-17#section-4.3.3.1
+        if '#EXT-X-TARGETDURATION' in m3u8_doc:  # media playlist, return as is
             return [{
                 'url': m3u8_url,
                 'format_id': m3u8_id,
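A minimal sketch of the detection rule described in the comment above (simplified; real playlists carry many more tags):

def is_media_playlist(m3u8_doc):
    # Per the HLS draft, EXT-X-TARGETDURATION is required in every media
    # playlist and must not appear in a master playlist, so its presence
    # alone distinguishes the two.
    return '#EXT-X-TARGETDURATION' in m3u8_doc

master = '#EXTM3U\n#EXT-X-STREAM-INF:BANDWIDTH=1280000,RESOLUTION=640x360\nlow/index.m3u8\n'
media = '#EXTM3U\n#EXT-X-TARGETDURATION:10\n#EXTINF:9.009,\nsegment0.ts\n'
assert not is_media_playlist(master) and is_media_playlist(media)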
@@ -1073,19 +1115,29 @@ class InfoExtractor(object):
                     'protocol': entry_protocol,
                     'preference': preference,
                 }
-                codecs = last_info.get('CODECS')
-                if codecs:
-                    # TODO: looks like video codec is not always necessarily goes first
-                    va_codecs = codecs.split(',')
-                    if va_codecs[0]:
-                        f['vcodec'] = va_codecs[0]
-                    if len(va_codecs) > 1 and va_codecs[1]:
-                        f['acodec'] = va_codecs[1]
                 resolution = last_info.get('RESOLUTION')
                 if resolution:
                     width_str, height_str = resolution.split('x')
                     f['width'] = int(width_str)
                     f['height'] = int(height_str)
+                codecs = last_info.get('CODECS')
+                if codecs:
+                    vcodec, acodec = [None] * 2
+                    va_codecs = codecs.split(',')
+                    if len(va_codecs) == 1:
+                        # Audio only entries usually come with single codec and
+                        # no resolution. For more robustness we also check it to
+                        # be mp4 audio.
+                        if not resolution and va_codecs[0].startswith('mp4a'):
+                            vcodec, acodec = 'none', va_codecs[0]
+                        else:
+                            vcodec = va_codecs[0]
+                    else:
+                        vcodec, acodec = va_codecs[:2]
+                    f.update({
+                        'acodec': acodec,
+                        'vcodec': vcodec,
+                    })
                 if last_media is not None:
                     f['m3u8_media'] = last_media
                     last_media = None
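The same classification in isolation, for a few representative CODECS values (a sketch mirroring the logic above):

def split_codecs(codecs, resolution=None):
    vcodec, acodec = None, None
    va_codecs = codecs.split(',')
    if len(va_codecs) == 1:
        # A single mp4a.* codec with no RESOLUTION is treated as audio-only.
        if not resolution and va_codecs[0].startswith('mp4a'):
            vcodec, acodec = 'none', va_codecs[0]
        else:
            vcodec = va_codecs[0]
    else:
        vcodec, acodec = va_codecs[:2]
    return vcodec, acodec

assert split_codecs('avc1.77.30,mp4a.40.2') == ('avc1.77.30', 'mp4a.40.2')
assert split_codecs('mp4a.40.2') == ('none', 'mp4a.40.2')            # audio-only rendition
assert split_codecs('mp4a.40.2', '640x360') == ('mp4a.40.2', None)   # single codec with resolution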
@@ -1106,8 +1158,8 @@ class InfoExtractor(object):
                 out.append('{%s}%s' % (namespace, c))
         return '/'.join(out)

-    def _extract_smil_formats(self, smil_url, video_id, fatal=True, f4m_params=None):
-        smil = self._download_smil(smil_url, video_id, fatal=fatal)
+    def _extract_smil_formats(self, smil_url, video_id, fatal=True, f4m_params=None, transform_source=None):
+        smil = self._download_smil(smil_url, video_id, fatal=fatal, transform_source=transform_source)

         if smil is False:
             assert not fatal
@@ -1124,10 +1176,10 @@ class InfoExtractor(object):
             return {}
         return self._parse_smil(smil, smil_url, video_id, f4m_params=f4m_params)

-    def _download_smil(self, smil_url, video_id, fatal=True):
+    def _download_smil(self, smil_url, video_id, fatal=True, transform_source=None):
         return self._download_xml(
             smil_url, video_id, 'Downloading SMIL file',
-            'Unable to download SMIL file', fatal=fatal)
+            'Unable to download SMIL file', fatal=fatal, transform_source=transform_source)

     def _parse_smil(self, smil, smil_url, video_id, f4m_params=None):
         namespace = self._parse_smil_namespace(smil)
@@ -1277,16 +1329,7 @@ class InfoExtractor(object):
             if not src or src in urls:
                 continue
             urls.append(src)
-            ext = textstream.get('ext') or determine_ext(src)
-            if not ext:
-                type_ = textstream.get('type')
-                SUBTITLES_TYPES = {
-                    'text/vtt': 'vtt',
-                    'text/srt': 'srt',
-                    'application/smptett+xml': 'tt',
-                }
-                if type_ in SUBTITLES_TYPES:
-                    ext = SUBTITLES_TYPES[type_]
+            ext = textstream.get('ext') or determine_ext(src) or mimetype2ext(textstream.get('type'))
             lang = textstream.get('systemLanguage') or textstream.get('systemLanguageName') or textstream.get('lang') or subtitles_lang
             subtitles.setdefault(lang, []).append({
                 'url': src,
@@ -1422,8 +1465,9 @@ class InfoExtractor(object):
                 continue
             representation_attrib = adaptation_set.attrib.copy()
             representation_attrib.update(representation.attrib)
-            mime_type = representation_attrib.get('mimeType')
-            content_type = mime_type.split('/')[0] if mime_type else representation_attrib.get('contentType')
+            # According to page 41 of ISO/IEC 23009-1:2014, @mimeType is mandatory
+            mime_type = representation_attrib['mimeType']
+            content_type = mime_type.split('/')[0]
             if content_type == 'text':
                 # TODO implement WebVTT downloading
                 pass
@@ -1446,6 +1490,7 @@ class InfoExtractor(object):
                     f = {
                         'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
                         'url': base_url,
+                        'ext': mimetype2ext(mime_type),
                         'width': int_or_none(representation_attrib.get('width')),
                         'height': int_or_none(representation_attrib.get('height')),
                         'tbr': int_or_none(representation_attrib.get('bandwidth'), 1000),
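mimetype2ext lives in youtube_dl.utils and maps a MIME type to a file extension. A simplified stand-in for the idea (an assumption about its general shape, not the actual implementation):

def mimetype_to_ext(mime_type):
    # Hypothetical simplified table; special-case common MIME types,
    # otherwise fall back to the subtype.
    if mime_type is None:
        return None
    return {
        'audio/mp4': 'm4a',
        'video/mp4': 'mp4',
        'video/webm': 'webm',
    }.get(mime_type, mime_type.split('/')[-1])

assert mimetype_to_ext('video/mp4') == 'mp4'
assert mimetype_to_ext('audio/mp4') == 'm4a'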
@@ -1497,7 +1542,7 @@ class InfoExtractor(object):
     def _live_title(self, name):
         """ Generate the title for a live video """
         now = datetime.datetime.now()
-        now_str = now.strftime("%Y-%m-%d %H:%M")
+        now_str = now.strftime('%Y-%m-%d %H:%M')
         return name + ' ' + now_str

     def _int(self, v, name, fatal=False, **kwargs):
@@ -1570,7 +1615,7 @@ class InfoExtractor(object):
         return {}

     def _get_subtitles(self, *args, **kwargs):
-        raise NotImplementedError("This method must be implemented by subclasses")
+        raise NotImplementedError('This method must be implemented by subclasses')

     @staticmethod
     def _merge_subtitle_items(subtitle_list1, subtitle_list2):
@@ -1596,7 +1641,16 @@ class InfoExtractor(object):
         return {}

     def _get_automatic_captions(self, *args, **kwargs):
-        raise NotImplementedError("This method must be implemented by subclasses")
+        raise NotImplementedError('This method must be implemented by subclasses')
+
+    def mark_watched(self, *args, **kwargs):
+        if (self._downloader.params.get('mark_watched', False) and
+                (self._get_login_info()[0] is not None or
+                    self._downloader.params.get('cookiefile') is not None)):
+            self._mark_watched(*args, **kwargs)
+
+    def _mark_watched(self, *args, **kwargs):
+        raise NotImplementedError('This method must be implemented by subclasses')


 class SearchInfoExtractor(InfoExtractor):
@@ -1636,7 +1690,7 @@ class SearchInfoExtractor(InfoExtractor):
     def _get_n_results(self, query, n):
         """Get a specified number of results for a query"""
-        raise NotImplementedError("This method must be implemented by subclasses")
+        raise NotImplementedError('This method must be implemented by subclasses')

     @property
     def SEARCH_KEY(self):

View File

@@ -0,0 +1,36 @@
from __future__ import unicode_literals

import os

from .common import InfoExtractor
from ..compat import (
    compat_urllib_parse_unquote,
    compat_urlparse,
)
from ..utils import url_basename


class RtmpIE(InfoExtractor):
    IE_DESC = False  # Do not list
    _VALID_URL = r'(?i)rtmp[est]?://.+'

    _TESTS = [{
        'url': 'rtmp://cp44293.edgefcs.net/ondemand?auth=daEcTdydfdqcsb8cZcDbAaCbhamacbbawaS-bw7dBb-bWG-GqpGFqCpNCnGoyL&aifp=v001&slist=public/unsecure/audio/2c97899446428e4301471a8cb72b4b97--audio--pmg-20110908-0900a_flv_aac_med_int.mp4',
        'only_matching': True,
    }, {
        'url': 'rtmp://edge.live.hitbox.tv/live/dimak',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = compat_urllib_parse_unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
        title = compat_urllib_parse_unquote(os.path.splitext(url_basename(url))[0])
        return {
            'id': video_id,
            'title': title,
            'formats': [{
                'url': url,
                'ext': 'flv',
                'format_id': compat_urlparse.urlparse(url).scheme,
            }],
        }
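For a raw URL like rtmp://edge.live.hitbox.tv/live/dimak, the id and title both come from the last path component and the format_id from the URL scheme. A quick check of that derivation with the standard library only (a sketch of what the extractor computes):

import os
try:
    from urllib.parse import unquote, urlparse  # Python 3
except ImportError:
    from urllib import unquote                  # Python 2
    from urlparse import urlparse

url = 'rtmp://edge.live.hitbox.tv/live/dimak'
video_id = unquote(os.path.splitext(url.rstrip('/').split('/')[-1])[0])
scheme = urlparse(url).scheme
assert (video_id, scheme) == ('dimak', 'rtmp')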

View File

@@ -45,7 +45,7 @@ class CondeNastIE(InfoExtractor):
         'wmagazine': 'W Magazine',
     }

-    _VALID_URL = r'http://(?:video|www|player)\.(?P<site>%s)\.com/(?P<type>watch|series|video|embed(?:js)?)/(?P<id>[^/?#]+)' % '|'.join(_SITES.keys())
+    _VALID_URL = r'https?://(?:video|www|player)\.(?P<site>%s)\.com/(?P<type>watch|series|video|embed(?:js)?)/(?P<id>[^/?#]+)' % '|'.join(_SITES.keys())
     IE_DESC = 'Condé Nast media group: %s' % ', '.join(sorted(_SITES.values()))
     EMBED_URL = r'(?:https?:)?//player\.(?P<site>%s)\.com/(?P<type>embed(?:js)?)/.+?' % '|'.join(_SITES.keys())

View File

@@ -54,7 +54,7 @@ class CrunchyrollBaseIE(InfoExtractor):
     def _real_initialize(self):
         self._login()

-    def _download_webpage(self, url_or_request, video_id, note=None, errnote=None, fatal=True, tries=1, timeout=5, encoding=None):
+    def _download_webpage(self, url_or_request, *args, **kwargs):
         request = (url_or_request if isinstance(url_or_request, compat_urllib_request.Request)
                    else sanitized_Request(url_or_request))
         # Accept-Language must be set explicitly to accept any language to avoid issues
@@ -65,8 +65,7 @@ class CrunchyrollBaseIE(InfoExtractor):
         # Crunchyroll to not work in georestriction cases in some browsers that don't place
         # the locale lang first in header. However allowing any language seems to workaround the issue.
         request.add_header('Accept-Language', '*')
-        return super(CrunchyrollBaseIE, self)._download_webpage(
-            request, video_id, note, errnote, fatal, tries, timeout, encoding)
+        return super(CrunchyrollBaseIE, self)._download_webpage(request, *args, **kwargs)

     @staticmethod
     def _add_skip_wall(url):
@@ -180,40 +179,40 @@ class CrunchyrollIE(CrunchyrollBaseIE):
             return assvalue

         output = '[Script Info]\n'
-        output += 'Title: %s\n' % sub_root.attrib["title"]
+        output += 'Title: %s\n' % sub_root.attrib['title']
         output += 'ScriptType: v4.00+\n'
-        output += 'WrapStyle: %s\n' % sub_root.attrib["wrap_style"]
-        output += 'PlayResX: %s\n' % sub_root.attrib["play_res_x"]
-        output += 'PlayResY: %s\n' % sub_root.attrib["play_res_y"]
+        output += 'WrapStyle: %s\n' % sub_root.attrib['wrap_style']
+        output += 'PlayResX: %s\n' % sub_root.attrib['play_res_x']
+        output += 'PlayResY: %s\n' % sub_root.attrib['play_res_y']
         output += """ScaledBorderAndShadow: yes

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
"""
         for style in sub_root.findall('./styles/style'):
-            output += 'Style: ' + style.attrib["name"]
-            output += ',' + style.attrib["font_name"]
-            output += ',' + style.attrib["font_size"]
-            output += ',' + style.attrib["primary_colour"]
-            output += ',' + style.attrib["secondary_colour"]
-            output += ',' + style.attrib["outline_colour"]
-            output += ',' + style.attrib["back_colour"]
-            output += ',' + ass_bool(style.attrib["bold"])
-            output += ',' + ass_bool(style.attrib["italic"])
-            output += ',' + ass_bool(style.attrib["underline"])
-            output += ',' + ass_bool(style.attrib["strikeout"])
-            output += ',' + style.attrib["scale_x"]
-            output += ',' + style.attrib["scale_y"]
-            output += ',' + style.attrib["spacing"]
-            output += ',' + style.attrib["angle"]
-            output += ',' + style.attrib["border_style"]
-            output += ',' + style.attrib["outline"]
-            output += ',' + style.attrib["shadow"]
-            output += ',' + style.attrib["alignment"]
-            output += ',' + style.attrib["margin_l"]
-            output += ',' + style.attrib["margin_r"]
-            output += ',' + style.attrib["margin_v"]
-            output += ',' + style.attrib["encoding"]
+            output += 'Style: ' + style.attrib['name']
+            output += ',' + style.attrib['font_name']
+            output += ',' + style.attrib['font_size']
+            output += ',' + style.attrib['primary_colour']
+            output += ',' + style.attrib['secondary_colour']
+            output += ',' + style.attrib['outline_colour']
+            output += ',' + style.attrib['back_colour']
+            output += ',' + ass_bool(style.attrib['bold'])
+            output += ',' + ass_bool(style.attrib['italic'])
+            output += ',' + ass_bool(style.attrib['underline'])
+            output += ',' + ass_bool(style.attrib['strikeout'])
+            output += ',' + style.attrib['scale_x']
+            output += ',' + style.attrib['scale_y']
+            output += ',' + style.attrib['spacing']
+            output += ',' + style.attrib['angle']
+            output += ',' + style.attrib['border_style']
+            output += ',' + style.attrib['outline']
+            output += ',' + style.attrib['shadow']
+            output += ',' + style.attrib['alignment']
+            output += ',' + style.attrib['margin_l']
+            output += ',' + style.attrib['margin_r']
+            output += ',' + style.attrib['margin_v']
+            output += ',' + style.attrib['encoding']
         output += '\n'
         output += """
@@ -222,15 +221,15 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
 """
         for event in sub_root.findall('./events/event'):
             output += 'Dialogue: 0'
-            output += ',' + event.attrib["start"]
-            output += ',' + event.attrib["end"]
-            output += ',' + event.attrib["style"]
-            output += ',' + event.attrib["name"]
-            output += ',' + event.attrib["margin_l"]
-            output += ',' + event.attrib["margin_r"]
-            output += ',' + event.attrib["margin_v"]
-            output += ',' + event.attrib["effect"]
-            output += ',' + event.attrib["text"]
+            output += ',' + event.attrib['start']
+            output += ',' + event.attrib['end']
+            output += ',' + event.attrib['style']
+            output += ',' + event.attrib['name']
+            output += ',' + event.attrib['margin_l']
+            output += ',' + event.attrib['margin_r']
+            output += ',' + event.attrib['margin_v']
+            output += ',' + event.attrib['effect']
+            output += ',' + event.attrib['text']
             output += '\n'

         return output
@@ -376,7 +375,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
 class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
-    IE_NAME = "crunchyroll:playlist"
+    IE_NAME = 'crunchyroll:playlist'
     _VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.com/(?!(?:news|anime-news|library|forum|launchcalendar|lineup|store|comics|freetrial|login))(?P<id>[\w\-]+))/?(?:\?|$)'
     _TESTS = [{

View File

@@ -15,7 +15,7 @@ from .senateisvp import SenateISVPIE
 class CSpanIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?c-span\.org/video/\?(?P<id>[0-9a-f]+)'
+    _VALID_URL = r'https?://(?:www\.)?c-span\.org/video/\?(?P<id>[0-9a-f]+)'
     IE_DESC = 'C-SPAN'
     _TESTS = [{
         'url': 'http://www.c-span.org/video/?313572-1/HolderonV',

View File

@@ -8,7 +8,7 @@ from ..utils import parse_iso8601, ExtractorError
 class CtsNewsIE(InfoExtractor):
     IE_DESC = '華視新聞'
     # https connection failed (Connection reset)
-    _VALID_URL = r'http://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html'
+    _VALID_URL = r'https?://news\.cts\.com\.tw/[a-z]+/[a-z]+/\d+/(?P<id>\d+)\.html'
     _TESTS = [{
         'url': 'http://news.cts.com.tw/cts/international/201501/201501291578109.html',
         'md5': 'a9875cb790252b08431186d741beaabe',

View File

@@ -122,10 +122,13 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
         description = self._og_search_description(webpage) or self._html_search_meta(
             'description', webpage, 'description')

-        view_count = str_to_int(self._search_regex(
-            [r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:(\d+)"',
-             r'video_views_count[^>]+>\s+([\d\.,]+)'],
-            webpage, 'view count', fatal=False))
+        view_count_str = self._search_regex(
+            (r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([\s\d,.]+)"',
+             r'video_views_count[^>]+>\s+([\s\d\,.]+)'),
+            webpage, 'view count', fatal=False)
+        if view_count_str:
+            view_count_str = re.sub(r'\s', '', view_count_str)
+        view_count = str_to_int(view_count_str)
         comment_count = int_or_none(self._search_regex(
             r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserComments:(\d+)"',
             webpage, 'comment count', fatal=False))
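The widened regex now tolerates locale-formatted counts with embedded spaces; a small sketch of the normalization step that follows (sample strings invented, and the final int() conversion stands in for youtube-dl's str_to_int):

import re

def parse_view_count(view_count_str):
    # Strip grouping whitespace first (e.g. '1 234 567'), then drop the
    # remaining ',' and '.' separators before converting.
    if view_count_str:
        view_count_str = re.sub(r'\s', '', view_count_str)
    if not view_count_str:
        return None
    return int(re.sub(r'[,.]', '', view_count_str))

assert parse_view_count('1 234 567') == 1234567
assert parse_view_count('1,234,567') == 1234567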
@@ -396,13 +399,13 @@ class DailymotionCloudIE(DailymotionBaseInfoExtractor):
     }]

     @classmethod
-    def _extract_dmcloud_url(self, webpage):
-        mobj = re.search(r'<iframe[^>]+src=[\'"](%s)[\'"]' % self._VALID_EMBED_URL, webpage)
+    def _extract_dmcloud_url(cls, webpage):
+        mobj = re.search(r'<iframe[^>]+src=[\'"](%s)[\'"]' % cls._VALID_EMBED_URL, webpage)
         if mobj:
             return mobj.group(1)

         mobj = re.search(
-            r'<input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=[\'"](%s)[\'"]' % self._VALID_EMBED_URL,
+            r'<input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=[\'"](%s)[\'"]' % cls._VALID_EMBED_URL,
             webpage)
         if mobj:
             return mobj.group(1)

View File

@@ -6,7 +6,7 @@ from ..compat import compat_str
 class DctpTvIE(InfoExtractor):
-    _VALID_URL = r'http://www.dctp.tv/(#/)?filme/(?P<id>.+?)/$'
+    _VALID_URL = r'https?://www.dctp.tv/(#/)?filme/(?P<id>.+?)/$'
     _TEST = {
         'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
         'info_dict': {

View File

@@ -5,7 +5,7 @@ from .common import InfoExtractor
 class DefenseGouvFrIE(InfoExtractor):
     IE_NAME = 'defense.gouv.fr'
-    _VALID_URL = r'http://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)'
+    _VALID_URL = r'https?://.*?\.defense\.gouv\.fr/layout/set/ligthboxvideo/base-de-medias/webtv/(?P<id>[^/?#]*)'
     _TEST = {
         'url': 'http://www.defense.gouv.fr/layout/set/ligthboxvideo/base-de-medias/webtv/attaque-chimique-syrienne-du-21-aout-2013-1',

View File

@@ -9,7 +9,7 @@ from ..compat import compat_str
 class DiscoveryIE(InfoExtractor):
-    _VALID_URL = r'''(?x)http://(?:www\.)?(?:
+    _VALID_URL = r'''(?x)https?://(?:www\.)?(?:
             discovery|
             investigationdiscovery|
             discoverylife|

View File

@@ -10,7 +10,7 @@ from ..compat import (compat_str, compat_basestring)
 class DouyuTVIE(InfoExtractor):
     IE_DESC = '斗鱼'
-    _VALID_URL = r'http://(?:www\.)?douyutv\.com/(?P<id>[A-Za-z0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?douyu(?:tv)?\.com/(?P<id>[A-Za-z0-9]+)'
     _TESTS = [{
         'url': 'http://www.douyutv.com/iseven',
         'info_dict': {
@@ -18,7 +18,7 @@ class DouyuTVIE(InfoExtractor):
             'display_id': 'iseven',
             'ext': 'flv',
             'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
-            'description': 'md5:c93d6692dde6fe33809a46edcbecca44',
+            'description': 'md5:f34981259a03e980a3c6404190a3ed61',
             'thumbnail': 're:^https?://.*\.jpg$',
             'uploader': '7师傅',
             'uploader_id': '431925',
@@ -26,7 +26,7 @@ class DouyuTVIE(InfoExtractor):
         },
         'params': {
             'skip_download': True,
-        }
+        },
     }, {
         'url': 'http://www.douyutv.com/85982',
         'info_dict': {
@@ -42,7 +42,27 @@ class DouyuTVIE(InfoExtractor):
         },
         'params': {
             'skip_download': True,
-        }
+        },
+        'skip': 'Romm not found',
+    }, {
+        'url': 'http://www.douyutv.com/17732',
+        'info_dict': {
+            'id': '17732',
+            'display_id': '17732',
+            'ext': 'flv',
+            'title': 're:^清晨醒脑T-ara根本停不下来 [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
+            'description': 'md5:f34981259a03e980a3c6404190a3ed61',
+            'thumbnail': 're:^https?://.*\.jpg$',
+            'uploader': '7师傅',
+            'uploader_id': '431925',
+            'is_live': True,
+        },
+        'params': {
+            'skip_download': True,
+        },
+    }, {
+        'url': 'http://www.douyu.com/xiaocang',
+        'only_matching': True,
     }]

     def _real_extract(self, url):
def _real_extract(self, url): def _real_extract(self, url):

View File

@@ -1,6 +1,8 @@
-# encoding: utf-8
+# coding: utf-8
 from __future__ import unicode_literals

+import json
+import re
 import time

 from .common import InfoExtractor
@@ -8,44 +10,125 @@ from ..utils import int_or_none
 class DPlayIE(InfoExtractor):
-    _VALID_URL = r'http://www\.dplay\.se/[^/]+/(?P<id>[^/?#]+)'
-    _TEST = {
+    _VALID_URL = r'https?://(?P<domain>it\.dplay\.com|www\.dplay\.(?:dk|se|no))/[^/]+/(?P<id>[^/?#]+)'
+    _TESTS = [{
+        'url': 'http://it.dplay.com/take-me-out/stagione-1-episodio-25/',
+        'info_dict': {
+            'id': '1255600',
+            'display_id': 'stagione-1-episodio-25',
+            'ext': 'mp4',
+            'title': 'Episodio 25',
+            'description': 'md5:cae5f40ad988811b197d2d27a53227eb',
+            'duration': 2761,
+            'timestamp': 1454701800,
+            'upload_date': '20160205',
+            'creator': 'RTIT',
+            'series': 'Take me out',
+            'season_number': 1,
+            'episode_number': 25,
+            'age_limit': 0,
+        },
+        'expected_warnings': ['Unable to download f4m manifest'],
+    }, {
         'url': 'http://www.dplay.se/nugammalt-77-handelser-som-format-sverige/season-1-svensken-lar-sig-njuta-av-livet/',
         'info_dict': {
             'id': '3172',
-            'ext': 'mp4',
             'display_id': 'season-1-svensken-lar-sig-njuta-av-livet',
+            'ext': 'flv',
             'title': 'Svensken lär sig njuta av livet',
+            'description': 'md5:d3819c9bccffd0fe458ca42451dd50d8',
             'duration': 2650,
+            'timestamp': 1365454320,
+            'upload_date': '20130408',
+            'creator': 'Kanal 5 (Home)',
+            'series': 'Nugammalt - 77 händelser som format Sverige',
+            'season_number': 1,
+            'episode_number': 1,
+            'age_limit': 0,
         },
-    }
+    }, {
+        'url': 'http://www.dplay.dk/mig-og-min-mor/season-6-episode-12/',
+        'info_dict': {
+            'id': '70816',
+            'display_id': 'season-6-episode-12',
+            'ext': 'flv',
+            'title': 'Episode 12',
+            'description': 'md5:9c86e51a93f8a4401fc9641ef9894c90',
+            'duration': 2563,
+            'timestamp': 1429696800,
+            'upload_date': '20150422',
+            'creator': 'Kanal 4',
+            'series': 'Mig og min mor',
+            'season_number': 6,
+            'episode_number': 12,
+            'age_limit': 0,
+        },
+    }, {
+        'url': 'http://www.dplay.no/pga-tour/season-1-hoydepunkter-18-21-februar/',
+        'only_matching': True,
+    }]

     def _real_extract(self, url):
-        display_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        display_id = mobj.group('id')
+        domain = mobj.group('domain')
+
         webpage = self._download_webpage(url, display_id)
+
         video_id = self._search_regex(
-            r'data-video-id="(\d+)"', webpage, 'video id')
+            r'data-video-id=["\'](\d+)', webpage, 'video id')
+
         info = self._download_json(
-            'http://www.dplay.se/api/v2/ajax/videos?video_id=' + video_id,
+            'http://%s/api/v2/ajax/videos?video_id=%s' % (domain, video_id),
             video_id)['data'][0]
-        self._set_cookie(
-            'secure.dplay.se', 'dsc-geo',
-            '{"countryCode":"NL","expiry":%d}' % ((time.time() + 20 * 60) * 1000))
-        # TODO: consider adding support for 'stream_type=hds', it seems to
-        # require setting some cookies
-        manifest_url = self._download_json(
-            'https://secure.dplay.se/secure/api/v2/user/authorization/stream/%s?stream_type=hls' % video_id,
-            video_id, 'Getting manifest url for hls stream')['hls']
-        formats = self._extract_m3u8_formats(
-            manifest_url, video_id, ext='mp4', entry_protocol='m3u8_native')
+
+        title = info['title']
+
+        PROTOCOLS = ('hls', 'hds')
+        formats = []
+
+        def extract_formats(protocol, manifest_url):
+            if protocol == 'hls':
+                formats.extend(self._extract_m3u8_formats(
+                    manifest_url, video_id, ext='mp4',
+                    entry_protocol='m3u8_native', m3u8_id=protocol, fatal=False))
+            elif protocol == 'hds':
+                formats.extend(self._extract_f4m_formats(
+                    manifest_url + '&hdcore=3.8.0&plugin=flowplayer-3.8.0.0',
+                    video_id, f4m_id=protocol, fatal=False))
+
+        domain_tld = domain.split('.')[-1]
+        if domain_tld in ('se', 'dk'):
+            for protocol in PROTOCOLS:
+                self._set_cookie(
+                    'secure.dplay.%s' % domain_tld, 'dsc-geo',
+                    json.dumps({
+                        'countryCode': domain_tld.upper(),
+                        'expiry': (time.time() + 20 * 60) * 1000,
+                    }))
+                stream = self._download_json(
+                    'https://secure.dplay.%s/secure/api/v2/user/authorization/stream/%s?stream_type=%s'
+                    % (domain_tld, video_id, protocol), video_id,
+                    'Downloading %s stream JSON' % protocol, fatal=False)
+                if stream and stream.get(protocol):
+                    extract_formats(protocol, stream[protocol])
+        else:
+            for protocol in PROTOCOLS:
+                if info.get(protocol):
+                    extract_formats(protocol, info[protocol])

         return {
             'id': video_id,
             'display_id': display_id,
-            'title': info['title'],
-            'formats': formats,
+            'title': title,
+            'description': info.get('video_metadata_longDescription'),
             'duration': int_or_none(info.get('video_metadata_length'), scale=1000),
+            'timestamp': int_or_none(info.get('video_publish_date')),
+            'creator': info.get('video_metadata_homeChannel'),
+            'series': info.get('video_metadata_show'),
+            'season_number': int_or_none(info.get('season')),
+            'episode_number': int_or_none(info.get('episode')),
+            'age_limit': int_or_none(info.get('minimum_age')),
+            'formats': formats,
         }
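The dsc-geo cookie switched from a hand-built string to json.dumps. A small sketch of the value that gets set (the fields mirror the code above; the timing is illustrative):

import json
import time

domain_tld = 'se'  # derived from the matched URL, e.g. www.dplay.se
cookie_value = json.dumps({
    'countryCode': domain_tld.upper(),          # 'SE'
    'expiry': (time.time() + 20 * 60) * 1000,   # 20 minutes ahead, in milliseconds
})
# The extractor then attaches it via self._set_cookie('secure.dplay.se', 'dsc-geo', cookie_value)
print(cookie_value)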

View File

@@ -87,7 +87,7 @@ class DRBonanzaIE(InfoExtractor):
         formats = []
         for file in info['Files']:
-            if info['Type'] == "Video":
+            if info['Type'] == 'Video':
                 if file['Type'] in video_types:
                     format = parse_filename_info(file['Location'])
                     format.update({
@@ -101,10 +101,10 @@ class DRBonanzaIE(InfoExtractor):
                     if '/bonanza/' in rtmp_url:
                         format['play_path'] = rtmp_url.split('/bonanza/')[1]
                     formats.append(format)
-                elif file['Type'] == "Thumb":
+                elif file['Type'] == 'Thumb':
                     thumbnail = file['Location']
-            elif info['Type'] == "Audio":
-                if file['Type'] == "Audio":
+            elif info['Type'] == 'Audio':
+                if file['Type'] == 'Audio':
                     format = parse_filename_info(file['Location'])
                     format.update({
                         'url': file['Location'],
@@ -112,7 +112,7 @@ class DRBonanzaIE(InfoExtractor):
                         'vcodec': 'none',
                     })
                     formats.append(format)
-                elif file['Type'] == "Thumb":
+                elif file['Type'] == 'Thumb':
                     thumbnail = file['Location']

         description = '%s\n%s\n%s\n' % (

View File

@@ -7,7 +7,7 @@ from .zdf import ZDFIE
 class DreiSatIE(ZDFIE):
     IE_NAME = '3sat'
-    _VALID_URL = r'(?:http://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
+    _VALID_URL = r'(?:https?://)?(?:www\.)?3sat\.de/mediathek/(?:index\.php|mediathek\.php)?\?(?:(?:mode|display)=[^&]+&)*obj=(?P<id>[0-9]+)$'
     _TESTS = [
         {
             'url': 'http://www.3sat.de/mediathek/index.php?mode=play&obj=45918',

View File

@@ -15,7 +15,7 @@ class DVTVIE(InfoExtractor):
     IE_NAME = 'dvtv'
     IE_DESC = 'http://video.aktualne.cz/'
-    _VALID_URL = r'http://video\.aktualne\.cz/(?:[^/]+/)+r~(?P<id>[0-9a-f]{32})'
+    _VALID_URL = r'https?://video\.aktualne\.cz/(?:[^/]+/)+r~(?P<id>[0-9a-f]{32})'
     _TESTS = [{
         'url': 'http://video.aktualne.cz/dvtv/vondra-o-ceskem-stoleti-pri-pohledu-na-havla-mi-bylo-trapne/r~e5efe9ca855511e4833a0025900fea04/',


@@ -0,0 +1,85 @@
+# coding: utf-8
+from __future__ import unicode_literals
+
+from .common import InfoExtractor
+from ..utils import int_or_none
+from ..compat import compat_urlparse
+
+
+class DWIE(InfoExtractor):
+    IE_NAME = 'dw'
+    _VALID_URL = r'https?://(?:www\.)?dw\.com/(?:[^/]+/)+av-(?P<id>\d+)'
+    _TESTS = [{
+        # video
+        'url': 'http://www.dw.com/en/intelligent-light/av-19112290',
+        'md5': '7372046e1815c5a534b43f3c3c36e6e9',
+        'info_dict': {
+            'id': '19112290',
+            'ext': 'mp4',
+            'title': 'Intelligent light',
+            'description': 'md5:90e00d5881719f2a6a5827cb74985af1',
+            'upload_date': '20160311',
+        }
+    }, {
+        # audio
+        'url': 'http://www.dw.com/en/worldlink-my-business/av-19111941',
+        'md5': '2814c9a1321c3a51f8a7aeb067a360dd',
+        'info_dict': {
+            'id': '19111941',
+            'ext': 'mp3',
+            'title': 'WorldLink: My business',
+            'description': 'md5:bc9ca6e4e063361e21c920c53af12405',
+            'upload_date': '20160311',
+        }
+    }]
+
+    def _real_extract(self, url):
+        media_id = self._match_id(url)
+        webpage = self._download_webpage(url, media_id)
+        hidden_inputs = self._hidden_inputs(webpage)
+        title = hidden_inputs['media_title']
+
+        formats = []
+        if hidden_inputs.get('player_type') == 'video' and hidden_inputs.get('stream_file') == '1':
+            formats = self._extract_smil_formats(
+                'http://www.dw.com/smil/v-%s' % media_id, media_id,
+                transform_source=lambda s: s.replace(
+                    'rtmp://tv-od.dw.de/flash/',
+                    'http://tv-download.dw.de/dwtv_video/flv/'))
+        else:
+            formats = [{'url': hidden_inputs['file_name']}]
+
+        return {
+            'id': media_id,
+            'title': title,
+            'description': self._og_search_description(webpage),
+            'thumbnail': hidden_inputs.get('preview_image'),
+            'duration': int_or_none(hidden_inputs.get('file_duration')),
+            'upload_date': hidden_inputs.get('display_date'),
+            'formats': formats,
+        }
+
+
+class DWArticleIE(InfoExtractor):
+    IE_NAME = 'dw:article'
+    _VALID_URL = r'https?://(?:www\.)?dw\.com/(?:[^/]+/)+a-(?P<id>\d+)'
+    _TEST = {
+        'url': 'http://www.dw.com/en/no-hope-limited-options-for-refugees-in-idomeni/a-19111009',
+        'md5': '8ca657f9d068bbef74d6fc38b97fc869',
+        'info_dict': {
+            'id': '19105868',
+            'ext': 'mp4',
+            'title': 'The harsh life of refugees in Idomeni',
+            'description': 'md5:196015cc7e48ebf474db9399420043c7',
+            'upload_date': '20160310',
+        }
+    }
+
+    def _real_extract(self, url):
+        article_id = self._match_id(url)
+        webpage = self._download_webpage(url, article_id)
+        hidden_inputs = self._hidden_inputs(webpage)
+        media_id = hidden_inputs['media_id']
+        media_path = self._search_regex(r'href="([^"]+av-%s)"\s+class="overlayLink"' % media_id, webpage, 'media url')
+        media_url = compat_urlparse.urljoin(url, media_path)
+        return self.url_result(media_url, 'DW', media_id)

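Note on the new DW extractor: the SMIL branch passes `transform_source` so that RTMP paths in the manifest are rewritten to direct HTTP download URLs before the manifest is parsed. A toy illustration of that string rewrite (the SMIL fragment is invented for the example):

    smil = '<video src="rtmp://tv-od.dw.de/flash/vod/example.mp4"/>'  # hypothetical manifest line
    fixed = smil.replace(
        'rtmp://tv-od.dw.de/flash/',
        'http://tv-download.dw.de/dwtv_video/flv/')
    assert fixed == '<video src="http://tv-download.dw.de/dwtv_video/flv/vod/example.mp4"/>'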

@@ -7,7 +7,7 @@ from .common import InfoExtractor
 class EchoMskIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?echo\.msk\.ru/sounds/(?P<id>\d+)'
+    _VALID_URL = r'https?://(?:www\.)?echo\.msk\.ru/sounds/(?P<id>\d+)'
     _TEST = {
         'url': 'http://www.echo.msk.ru/sounds/1464134.html',
         'md5': '2e44b3b78daff5b458e4dbc37f191f7c',


@@ -17,85 +17,85 @@ class EightTracksIE(InfoExtractor):
     IE_NAME = '8tracks'
     _VALID_URL = r'https?://8tracks\.com/(?P<user>[^/]+)/(?P<id>[^/#]+)(?:#.*)?$'
     _TEST = {
-        "name": "EightTracks",
-        "url": "http://8tracks.com/ytdl/youtube-dl-test-tracks-a",
-        "info_dict": {
+        'name': 'EightTracks',
+        'url': 'http://8tracks.com/ytdl/youtube-dl-test-tracks-a',
+        'info_dict': {
             'id': '1336550',
             'display_id': 'youtube-dl-test-tracks-a',
-            "description": "test chars: \"'/\\ä↭",
-            "title": "youtube-dl test tracks \"'/\\ä↭<>",
+            'description': "test chars: \"'/\\ä↭",
+            'title': "youtube-dl test tracks \"'/\\ä↭<>",
         },
-        "playlist": [
+        'playlist': [
             {
-                "md5": "96ce57f24389fc8734ce47f4c1abcc55",
-                "info_dict": {
-                    "id": "11885610",
-                    "ext": "m4a",
-                    "title": "youtue-dl project<>\"' - youtube-dl test track 1 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': '96ce57f24389fc8734ce47f4c1abcc55',
+                'info_dict': {
+                    'id': '11885610',
+                    'ext': 'm4a',
+                    'title': "youtue-dl project<>\"' - youtube-dl test track 1 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "4ab26f05c1f7291ea460a3920be8021f",
-                "info_dict": {
-                    "id": "11885608",
-                    "ext": "m4a",
-                    "title": "youtube-dl project - youtube-dl test track 2 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': '4ab26f05c1f7291ea460a3920be8021f',
+                'info_dict': {
+                    'id': '11885608',
+                    'ext': 'm4a',
+                    'title': "youtube-dl project - youtube-dl test track 2 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "d30b5b5f74217410f4689605c35d1fd7",
-                "info_dict": {
-                    "id": "11885679",
-                    "ext": "m4a",
-                    "title": "youtube-dl project as well - youtube-dl test track 3 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': 'd30b5b5f74217410f4689605c35d1fd7',
+                'info_dict': {
+                    'id': '11885679',
+                    'ext': 'm4a',
+                    'title': "youtube-dl project as well - youtube-dl test track 3 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "4eb0a669317cd725f6bbd336a29f923a",
-                "info_dict": {
-                    "id": "11885680",
-                    "ext": "m4a",
-                    "title": "youtube-dl project as well - youtube-dl test track 4 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': '4eb0a669317cd725f6bbd336a29f923a',
+                'info_dict': {
+                    'id': '11885680',
+                    'ext': 'm4a',
+                    'title': "youtube-dl project as well - youtube-dl test track 4 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "1893e872e263a2705558d1d319ad19e8",
-                "info_dict": {
-                    "id": "11885682",
-                    "ext": "m4a",
-                    "title": "PH - youtube-dl test track 5 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': '1893e872e263a2705558d1d319ad19e8',
+                'info_dict': {
+                    'id': '11885682',
+                    'ext': 'm4a',
+                    'title': "PH - youtube-dl test track 5 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "b673c46f47a216ab1741ae8836af5899",
-                "info_dict": {
-                    "id": "11885683",
-                    "ext": "m4a",
-                    "title": "PH - youtube-dl test track 6 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': 'b673c46f47a216ab1741ae8836af5899',
+                'info_dict': {
+                    'id': '11885683',
+                    'ext': 'm4a',
+                    'title': "PH - youtube-dl test track 6 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "1d74534e95df54986da7f5abf7d842b7",
-                "info_dict": {
-                    "id": "11885684",
-                    "ext": "m4a",
-                    "title": "phihag - youtube-dl test track 7 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': '1d74534e95df54986da7f5abf7d842b7',
+                'info_dict': {
+                    'id': '11885684',
+                    'ext': 'm4a',
+                    'title': "phihag - youtube-dl test track 7 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             },
             {
-                "md5": "f081f47af8f6ae782ed131d38b9cd1c0",
-                "info_dict": {
-                    "id": "11885685",
-                    "ext": "m4a",
-                    "title": "phihag - youtube-dl test track 8 \"'/\\\u00e4\u21ad",
-                    "uploader_id": "ytdl"
+                'md5': 'f081f47af8f6ae782ed131d38b9cd1c0',
+                'info_dict': {
+                    'id': '11885685',
+                    'ext': 'm4a',
+                    'title': "phihag - youtube-dl test track 8 \"'/\\\u00e4\u21ad",
+                    'uploader_id': 'ytdl'
                 }
             }
         ]


@@ -72,7 +72,7 @@ class EllenTVClipsIE(InfoExtractor):
     def _extract_playlist(self, webpage):
         json_string = self._search_regex(r'playerView.addClips\(\[\{(.*?)\}\]\);', webpage, 'json')
         try:
-            return json.loads("[{" + json_string + "}]")
+            return json.loads('[{' + json_string + '}]')
         except ValueError as ve:
             raise ExtractorError('Failed to download JSON', cause=ve)


@@ -9,7 +9,7 @@ class ElPaisIE(InfoExtractor):
     _VALID_URL = r'https?://(?:[^.]+\.)?elpais\.com/.*/(?P<id>[^/#?]+)\.html(?:$|[?#])'
     IE_DESC = 'El País'
-    _TEST = {
+    _TESTS = [{
         'url': 'http://blogs.elpais.com/la-voz-de-inaki/2014/02/tiempo-nuevo-recetas-viejas.html',
         'md5': '98406f301f19562170ec071b83433d55',
         'info_dict': {
@@ -19,30 +19,41 @@ class ElPaisIE(InfoExtractor):
             'description': 'De lunes a viernes, a partir de las ocho de la mañana, Iñaki Gabilondo nos cuenta su visión de la actualidad nacional e internacional.',
             'upload_date': '20140206',
         }
-    }
+    }, {
+        'url': 'http://elcomidista.elpais.com/elcomidista/2016/02/24/articulo/1456340311_668921.html#?id_externo_nwl=newsletter_diaria20160303t',
+        'md5': '3bd5b09509f3519d7d9e763179b013de',
+        'info_dict': {
+            'id': '1456340311_668921',
+            'ext': 'mp4',
+            'title': 'Cómo hacer el mejor café con cafetera italiana',
+            'description': 'Que sí, que las cápsulas son cómodas. Pero si le pides algo más a la vida, quizá deberías aprender a usar bien la cafetera italiana. No tienes más que ver este vídeo y seguir sus siete normas básicas.',
+            'upload_date': '20160303',
+        }
+    }]

     def _real_extract(self, url):
         video_id = self._match_id(url)
         webpage = self._download_webpage(url, video_id)

         prefix = self._html_search_regex(
-            r'var url_cache = "([^"]+)";', webpage, 'URL prefix')
+            r'var\s+url_cache\s*=\s*"([^"]+)";', webpage, 'URL prefix')
         video_suffix = self._search_regex(
-            r"URLMediaFile = url_cache \+ '([^']+)'", webpage, 'video URL')
+            r"(?:URLMediaFile|urlVideo_\d+)\s*=\s*url_cache\s*\+\s*'([^']+)'", webpage, 'video URL')
         video_url = prefix + video_suffix
         thumbnail_suffix = self._search_regex(
-            r"URLMediaStill = url_cache \+ '([^']+)'", webpage, 'thumbnail URL',
-            fatal=False)
+            r"(?:URLMediaStill|urlFotogramaFijo_\d+)\s*=\s*url_cache\s*\+\s*'([^']+)'",
+            webpage, 'thumbnail URL', fatal=False)
         thumbnail = (
             None if thumbnail_suffix is None
             else prefix + thumbnail_suffix)
         title = self._html_search_regex(
-            '<h2 class="entry-header entry-title.*?>(.*?)</h2>',
+            (r"tituloVideo\s*=\s*'([^']+)'", webpage, 'title',
+             r'<h2 class="entry-header entry-title.*?>(.*?)</h2>'),
             webpage, 'title')
-        date_str = self._search_regex(
+        upload_date = unified_strdate(self._search_regex(
             r'<p class="date-header date-int updated"\s+title="([^"]+)">',
-            webpage, 'upload date', fatal=False)
-        upload_date = (None if date_str is None else unified_strdate(date_str))
+            webpage, 'upload date', default=None) or self._html_search_meta(
+            'datePublished', webpage, 'timestamp'))

         return {
             'id': video_id,

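Note on the rewritten El País upload-date lookup: passing `default=None` makes the first regex non-fatal, so the `or` chain can fall through to the `datePublished` meta tag. The same pattern with plain `re`, against an invented page fragment:

    import re

    def search(pattern, page):
        m = re.search(pattern, page)
        return m.group(1) if m else None

    page = '<meta itemprop="datePublished" content="3 de marzo de 2016">'  # hypothetical
    date_str = (
        search(r'<p class="date-header date-int updated"\s+title="([^"]+)">', page)
        or search(r'datePublished"\s+content="([^"]+)"', page))
    assert date_str == '3 de marzo de 2016'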

@@ -1,21 +1,13 @@
 from __future__ import unicode_literals

-import re
-
 from .common import InfoExtractor
-from ..utils import (
-    url_basename,
-)


 class EngadgetIE(InfoExtractor):
-    _VALID_URL = r'''(?x)https?://www.engadget.com/
-        (?:video(?:/5min)?/(?P<id>\d+)|
-            [\d/]+/.*?)
-        '''
+    _VALID_URL = r'https?://www.engadget.com/video/(?P<id>\d+)'

     _TEST = {
-        'url': 'http://www.engadget.com/video/5min/518153925/',
+        'url': 'http://www.engadget.com/video/518153925/',
         'md5': 'c6820d4828a5064447a4d9fc73f312c9',
         'info_dict': {
             'id': '518153925',
@@ -27,15 +19,4 @@ class EngadgetIE(InfoExtractor):
     def _real_extract(self, url):
         video_id = self._match_id(url)
-        if video_id is not None:
-            return self.url_result('5min:%s' % video_id)
-        else:
-            title = url_basename(url)
-            webpage = self._download_webpage(url, title)
-            ids = re.findall(r'<iframe[^>]+?playList=(\d+)', webpage)
-            return {
-                '_type': 'playlist',
-                'title': title,
-                'entries': [self.url_result('5min:%s' % vid) for vid in ids]
-            }
+        return self.url_result('5min:%s' % video_id)

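Note on the Engadget cleanup: the extractor is now a thin alias that re-dispatches any `/video/<id>` URL as `5min:<id>`. A quick check that the tightened `_VALID_URL` still matches the updated test URL:

    import re

    _VALID_URL = r'https?://www.engadget.com/video/(?P<id>\d+)'
    m = re.match(_VALID_URL, 'http://www.engadget.com/video/518153925/')
    assert m and m.group('id') == '518153925'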

@@ -14,14 +14,14 @@ class EveryonesMixtapeIE(InfoExtractor):
     _TESTS = [{
         'url': 'http://everyonesmixtape.com/#/mix/m7m0jJAbMQi/5',
-        "info_dict": {
+        'info_dict': {
             'id': '5bfseWNmlds',
             'ext': 'mp4',
-            "title": "Passion Pit - \"Sleepyhead\" (Official Music Video)",
-            "uploader": "FKR.TV",
-            "uploader_id": "frenchkissrecords",
-            "description": "Music video for \"Sleepyhead\" from Passion Pit's debut EP Chunk Of Change.\nBuy on iTunes: https://itunes.apple.com/us/album/chunk-of-change-ep/id300087641\n\nDirected by The Wilderness.\n\nhttp://www.passionpitmusic.com\nhttp://www.frenchkissrecords.com",
-            "upload_date": "20081015"
+            'title': "Passion Pit - \"Sleepyhead\" (Official Music Video)",
+            'uploader': 'FKR.TV',
+            'uploader_id': 'frenchkissrecords',
+            'description': "Music video for \"Sleepyhead\" from Passion Pit's debut EP Chunk Of Change.\nBuy on iTunes: https://itunes.apple.com/us/album/chunk-of-change-ep/id300087641\n\nDirected by The Wilderness.\n\nhttp://www.passionpitmusic.com\nhttp://www.frenchkissrecords.com",
+            'upload_date': '20081015'
         },
         'params': {
             'skip_download': True,  # This is simply YouTube


@@ -8,7 +8,7 @@ from .common import InfoExtractor
 class ExfmIE(InfoExtractor):
     IE_NAME = 'exfm'
     IE_DESC = 'ex.fm'
-    _VALID_URL = r'http://(?:www\.)?ex\.fm/song/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://(?:www\.)?ex\.fm/song/(?P<id>[^/]+)'
     _SOUNDCLOUD_URL = r'http://(?:www\.)?api\.soundcloud\.com/tracks/([^/]+)/stream'
     _TESTS = [
         {
@@ -41,7 +41,7 @@ class ExfmIE(InfoExtractor):
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         song_id = mobj.group('id')
-        info_url = "http://ex.fm/api/v3/song/%s" % song_id
+        info_url = 'http://ex.fm/api/v3/song/%s' % song_id
         info = self._download_json(info_url, song_id)['song']
         song_url = info['url']
         if re.match(self._SOUNDCLOUD_URL, song_url) is not None:


@@ -34,9 +34,12 @@ class FacebookIE(InfoExtractor):
                             video/video\.php|
                             photo\.php|
                             video\.php|
-                            video/embed
-                        )\?(?:.*?)(?:v|video_id)=|
-                        [^/]+/videos/(?:[^/]+/)?
+                            video/embed|
+                            story\.php
+                        )\?(?:.*?)(?:v|video_id|story_fbid)=|
+                        [^/]+/videos/(?:[^/]+/)?|
+                        [^/]+/posts/|
+                        groups/[^/]+/permalink/
                     )|
                 facebook:
             )
@@ -49,6 +52,8 @@ class FacebookIE(InfoExtractor):
     _CHROME_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.97 Safari/537.36'

+    _VIDEO_PAGE_TEMPLATE = 'https://www.facebook.com/video/video.php?v=%s'
+
     _TESTS = [{
         'url': 'https://www.facebook.com/video.php?v=637842556329505&fref=nf',
         'md5': '6a40d33c0eccbb1af76cf0485a052659',
@@ -80,6 +85,33 @@ class FacebookIE(InfoExtractor):
             'title': 'When you post epic content on instagram.com/433 8 million followers, this is ...',
             'uploader': 'Demy de Zeeuw',
         },
+    }, {
+        'url': 'https://www.facebook.com/maxlayn/posts/10153807558977570',
+        'md5': '037b1fa7f3c2d02b7a0d7bc16031ecc6',
+        'info_dict': {
+            'id': '544765982287235',
+            'ext': 'mp4',
+            'title': '"What are you doing running in the snow?"',
+            'uploader': 'FailArmy',
+        }
+    }, {
+        'url': 'https://m.facebook.com/story.php?story_fbid=1035862816472149&id=116132035111903',
+        'md5': '1deb90b6ac27f7efcf6d747c8a27f5e3',
+        'info_dict': {
+            'id': '1035862816472149',
+            'ext': 'mp4',
+            'title': 'What the Flock Is Going On In New Zealand Credit: ViralHog',
+            'uploader': 'S. Saint',
+        },
+    }, {
+        'note': 'swf params escaped',
+        'url': 'https://www.facebook.com/barackobama/posts/10153664894881749',
+        'md5': '97ba073838964d12c70566e0085c2b91',
+        'info_dict': {
+            'id': '10153664894881749',
+            'ext': 'mp4',
+            'title': 'Facebook video #10153664894881749',
+        },
     }, {
         'url': 'https://www.facebook.com/video.php?v=10204634152394104',
         'only_matching': True,
@@ -92,6 +124,9 @@ class FacebookIE(InfoExtractor):
     }, {
         'url': 'facebook:544765982287235',
         'only_matching': True,
+    }, {
+        'url': 'https://www.facebook.com/groups/164828000315060/permalink/764967300301124/',
+        'only_matching': True,
     }]

     def _login(self):
@@ -160,19 +195,19 @@ class FacebookIE(InfoExtractor):
     def _real_initialize(self):
         self._login()

-    def _real_extract(self, url):
-        video_id = self._match_id(url)
-        req = sanitized_Request('https://www.facebook.com/video/video.php?v=%s' % video_id)
+    def _extract_from_url(self, url, video_id, fatal_if_no_video=True):
+        req = sanitized_Request(url)
         req.add_header('User-Agent', self._CHROME_USER_AGENT)
         webpage = self._download_webpage(req, video_id)

         video_data = None

-        BEFORE = '{swf.addParam(param[0], param[1]);});\n'
+        BEFORE = '{swf.addParam(param[0], param[1]);});'
         AFTER = '.forEach(function(variable) {swf.addVariable(variable[0], variable[1]);});'
-        m = re.search(re.escape(BEFORE) + '(.*?)' + re.escape(AFTER), webpage)
+        m = re.search(re.escape(BEFORE) + '(?:\n|\\\\n)(.*?)' + re.escape(AFTER), webpage)
         if m:
-            data = dict(json.loads(m.group(1)))
+            swf_params = m.group(1).replace('\\\\', '\\').replace('\\"', '"')
+            data = dict(json.loads(swf_params))
             params_raw = compat_urllib_parse_unquote(data['params'])
             video_data = json.loads(params_raw)['video_data']
@@ -185,13 +220,15 @@ class FacebookIE(InfoExtractor):
         if not video_data:
             server_js_data = self._parse_json(self._search_regex(
-                r'handleServerJS\(({.+})\);', webpage, 'server js data'), video_id)
-            for item in server_js_data['instances']:
+                r'handleServerJS\(({.+})\);', webpage, 'server js data', default='{}'), video_id)
+            for item in server_js_data.get('instances', []):
                 if item[1][0] == 'VideoConfig':
                     video_data = video_data_list2dict(item[2][0]['videoData'])
                     break

         if not video_data:
+            if not fatal_if_no_video:
+                return webpage, False
             m_msg = re.search(r'class="[^"]*uiInterstitialContent[^"]*"><div>(.*?)</div>', webpage)
             if m_msg is not None:
                 raise ExtractorError(
@@ -208,10 +245,13 @@ class FacebookIE(InfoExtractor):
                 for src_type in ('src', 'src_no_ratelimit'):
                     src = f[0].get('%s_%s' % (quality, src_type))
                     if src:
+                        preference = -10 if format_id == 'progressive' else 0
+                        if quality == 'hd':
+                            preference += 5
                         formats.append({
                             'format_id': '%s_%s_%s' % (format_id, quality, src_type),
                             'url': src,
-                            'preference': -10 if format_id == 'progressive' else 0,
+                            'preference': preference,
                         })
             dash_manifest = f[0].get('dash_manifest')
             if dash_manifest:
@@ -234,39 +274,36 @@ class FacebookIE(InfoExtractor):
         video_title = 'Facebook video #%s' % video_id
         uploader = clean_html(get_element_by_id('fbPhotoPageAuthorName', webpage))

-        return {
+        info_dict = {
             'id': video_id,
             'title': video_title,
             'formats': formats,
             'uploader': uploader,
         }
+        return webpage, info_dict

-
-class FacebookPostIE(InfoExtractor):
-    IE_NAME = 'facebook:post'
-    _VALID_URL = r'https?://(?:\w+\.)?facebook\.com/[^/]+/posts/(?P<id>\d+)'
-    _TEST = {
-        'url': 'https://www.facebook.com/maxlayn/posts/10153807558977570',
-        'md5': '037b1fa7f3c2d02b7a0d7bc16031ecc6',
-        'info_dict': {
-            'id': '544765982287235',
-            'ext': 'mp4',
-            'title': '"What are you doing running in the snow?"',
-            'uploader': 'FailArmy',
-        }
-    }
-
     def _real_extract(self, url):
-        post_id = self._match_id(url)
-        webpage = self._download_webpage(url, post_id)
+        video_id = self._match_id(url)
+        real_url = self._VIDEO_PAGE_TEMPLATE % video_id if url.startswith('facebook:') else url
+        webpage, info_dict = self._extract_from_url(real_url, video_id, fatal_if_no_video=False)
+        if info_dict:
+            return info_dict
+
+        if '/posts/' in url:
             entries = [
-                self.url_result('facebook:%s' % video_id, FacebookIE.ie_key())
-                for video_id in self._parse_json(
+                self.url_result('facebook:%s' % vid, FacebookIE.ie_key())
+                for vid in self._parse_json(
                     self._search_regex(
                         r'(["\'])video_ids\1\s*:\s*(?P<ids>\[.+?\])',
                         webpage, 'video ids', group='ids'),
-                post_id)]
-        return self.playlist_result(entries, post_id)
+                    video_id)]
+
+            return self.playlist_result(entries, video_id)
+        else:
+            _, info_dict = self._extract_from_url(
+                self._VIDEO_PAGE_TEMPLATE % video_id,
+                video_id, fatal_if_no_video=True)
+            return info_dict

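Note on the Facebook format loop: the preference is now computed in two steps so HD variants get a bump while progressive downloads stay demoted. The same arithmetic restated as a standalone function:

    def preference(format_id, quality):
        # progressive downloads are demoted; 'hd' quality gets +5 on top
        pref = -10 if format_id == 'progressive' else 0
        if quality == 'hd':
            pref += 5
        return pref

    assert preference('progressive', 'sd') == -10
    assert preference('progressive', 'hd') == -5
    assert preference('dash', 'hd') == 5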

@@ -52,7 +52,7 @@ class FazIE(InfoExtractor):
         formats = []
         for pref, code in enumerate(['LOW', 'HIGH', 'HQ']):
             encoding = xpath_element(encodings, code)
-            if encoding:
+            if encoding is not None:
                 encoding_url = xpath_text(encoding, 'FILENAME')
                 if encoding_url:
                     formats.append({

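Note on the Faz change: ElementTree elements define truthiness by child count, so an existing leaf element is falsy and `if encoding:` would wrongly skip it; `is not None` is the correct existence test. A quick demonstration (the XML fragment is invented for the example):

    import xml.etree.ElementTree as ET

    root = ET.fromstring('<ENCODINGS><HQ><FILENAME>video.mp4</FILENAME></HQ></ENCODINGS>')
    leaf = root.find('HQ').find('FILENAME')
    assert leaf is not None  # the element exists
    assert len(leaf) == 0    # but it has no children, so it evaluates as falsy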

@@ -17,7 +17,7 @@ from ..utils import (
 class FC2IE(InfoExtractor):
-    _VALID_URL = r'^http://video\.fc2\.com/(?:[^/]+/)*content/(?P<id>[^/]+)'
+    _VALID_URL = r'^https?://video\.fc2\.com/(?:[^/]+/)*content/(?P<id>[^/]+)'
     IE_NAME = 'fc2'
     _NETRC_MACHINE = 'fc2'
     _TESTS = [{
@@ -87,7 +87,7 @@ class FC2IE(InfoExtractor):
         mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()
         info_url = (
-            "http://video.fc2.com/ginfo.php?mimi={1:s}&href={2:s}&v={0:s}&fversion=WIN%2011%2C6%2C602%2C180&from=2&otag=0&upid={0:s}&tk=null&".
+            'http://video.fc2.com/ginfo.php?mimi={1:s}&href={2:s}&v={0:s}&fversion=WIN%2011%2C6%2C602%2C180&from=2&otag=0&upid={0:s}&tk=null&'.
             format(video_id, mimi, compat_urllib_request.quote(refer, safe=b'').replace('.', '%2E')))
         info_webpage = self._download_webpage(

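Note on the FC2 hunk: the `mimi` parameter is an MD5 hex digest of the video id concatenated with the fixed salt visible above. For example (the id is hypothetical):

    import hashlib

    video_id = '20121103kUan1KHs'  # hypothetical FC2 id
    mimi = hashlib.md5((video_id + '_gGddgPfeaf_gzyr').encode('utf-8')).hexdigest()
    assert len(mimi) == 32  # sent as the 'mimi' query parameter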

@@ -4,7 +4,7 @@ from .common import InfoExtractor
 class FirstpostIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?firstpost\.com/[^/]+/.*-(?P<id>[0-9]+)\.html'
+    _VALID_URL = r'https?://(?:www\.)?firstpost\.com/[^/]+/.*-(?P<id>[0-9]+)\.html'
     _TEST = {
         'url': 'http://www.firstpost.com/india/india-to-launch-indigenous-aircraft-carrier-monday-1025403.html',


@@ -8,7 +8,7 @@ from ..utils import int_or_none
 class FirstTVIE(InfoExtractor):
     IE_NAME = '1tv'
     IE_DESC = 'Первый канал'
-    _VALID_URL = r'http://(?:www\.)?1tv\.ru/(?:[^/]+/)+(?P<id>.+)'
+    _VALID_URL = r'https?://(?:www\.)?1tv\.ru/(?:[^/]+/)+(?P<id>.+)'
     _TESTS = [{
         'url': 'http://www.1tv.ru/videoarchive/73390',


@@ -1,5 +1,7 @@
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..compat import (
     compat_urllib_parse,
@@ -16,12 +18,7 @@ from ..utils import (
 class FiveMinIE(InfoExtractor):
     IE_NAME = '5min'
-    _VALID_URL = r'''(?x)
-        (?:https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js\?(?:.*?&)?playList=|
-            https?://(?:(?:massively|www)\.)?joystiq\.com/video/|
-            5min:)
-        (?P<id>\d+)
-        '''
+    _VALID_URL = r'(?:5min:(?P<id>\d+)(?::(?P<sid>\d+))?|https?://[^/]*?5min\.com/Scripts/PlayerSeed\.js\?(?P<query>.*))'

     _TESTS = [
         {
@@ -45,6 +42,7 @@ class FiveMinIE(InfoExtractor):
                 'title': 'How to Make a Next-Level Fruit Salad',
                 'duration': 184,
             },
+            'skip': 'no longer available',
         },
     ]
     _ERRORS = {
@@ -91,20 +89,33 @@ class FiveMinIE(InfoExtractor):
     }

     def _real_extract(self, url):
-        video_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        sid = mobj.group('sid')
+
+        if mobj.group('query'):
+            qs = compat_parse_qs(mobj.group('query'))
+            if not qs.get('playList'):
+                raise ExtractorError('Invalid URL', expected=True)
+            video_id = qs['playList'][0]
+            if qs.get('sid'):
+                sid = qs['sid'][0]
+
         embed_url = 'https://embed.5min.com/playerseed/?playList=%s' % video_id
+        if not sid:
             embed_page = self._download_webpage(embed_url, video_id,
                 'Downloading embed page')
             sid = self._search_regex(r'sid=(\d+)', embed_page, 'sid')
-        query = compat_urllib_parse.urlencode({
+
+        response = self._download_json(
+            'https://syn.5min.com/handlers/SenseHandler.ashx?' +
+            compat_urllib_parse.urlencode({
                 'func': 'GetResults',
                 'playlist': video_id,
                 'sid': sid,
                 'isPlayerSeed': 'true',
                 'url': embed_url,
-        })
-        response = self._download_json(
-            'https://syn.5min.com/handlers/SenseHandler.ashx?' + query,
+            }),
             video_id)
@@ -118,9 +129,7 @@ class FiveMinIE(InfoExtractor):
         parsed_video_url = compat_urllib_parse_urlparse(compat_parse_qs(
             compat_urllib_parse_urlparse(info['EmbededURL']).query)['videoUrl'][0])
         for rendition in info['Renditions']:
-            if rendition['RenditionType'] == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(rendition['Url'], video_id, m3u8_id='hls'))
-            elif rendition['RenditionType'] == 'aac':
+            if rendition['RenditionType'] == 'aac' or rendition['RenditionType'] == 'm3u8':
                 continue
             else:
                 rendition_url = compat_urlparse.urlunparse(parsed_video_url._replace(path=replace_extension(parsed_video_url.path.replace('//', '/%s/' % rendition['ID']), rendition['RenditionType'])))

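Note on the reworked 5min extractor: `playList` and `sid` are now pulled from the PlayerSeed.js query string via `compat_parse_qs`. The same parsing with the standard library (the query string is invented for the example):

    try:
        from urllib.parse import parse_qs  # Python 3; compat_parse_qs wraps this
    except ImportError:
        from urlparse import parse_qs  # Python 2

    qs = parse_qs('playList=518153925&sid=577')  # hypothetical PlayerSeed.js query
    video_id = qs['playList'][0]
    sid = qs.get('sid', [None])[0]
    assert (video_id, sid) == ('518153925', '577')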

@@ -10,7 +10,7 @@ from ..utils import (
 class FKTVIE(InfoExtractor):
     IE_NAME = 'fernsehkritik.tv'
-    _VALID_URL = r'http://(?:www\.)?fernsehkritik\.tv/folge-(?P<id>[0-9]+)(?:/.*)?'
+    _VALID_URL = r'https?://(?:www\.)?fernsehkritik\.tv/folge-(?P<id>[0-9]+)(?:/.*)?'
     _TEST = {
         'url': 'http://fernsehkritik.tv/folge-1',


@@ -5,7 +5,7 @@ from .common import InfoExtractor
 class FootyRoomIE(InfoExtractor):
-    _VALID_URL = r'http://footyroom\.com/(?P<id>[^/]+)'
+    _VALID_URL = r'https?://footyroom\.com/(?P<id>[^/]+)'
     _TESTS = [{
         'url': 'http://footyroom.com/schalke-04-0-2-real-madrid-2015-02/',
         'info_dict': {


@@ -4,7 +4,7 @@ from .common import InfoExtractor
 class FoxgayIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
+    _VALID_URL = r'https?://(?:www\.)?foxgay\.com/videos/(?:\S+-)?(?P<id>\d+)\.shtml'
     _TEST = {
         'url': 'http://foxgay.com/videos/fuck-turkish-style-2582.shtml',
         'md5': '80d72beab5d04e1655a56ad37afe6841',


@@ -36,6 +36,10 @@ class FoxNewsIE(AMPIE):
                 # 'upload_date': '20141204',
                 'thumbnail': 're:^https?://.*\.jpg$',
             },
+            'params': {
+                # m3u8 download
+                'skip_download': True,
+            },
         },
         {
             'url': 'http://video.foxnews.com/v/video-embed.html?video_id=3937480&d=video.foxnews.com',


@@ -6,11 +6,11 @@ from ..utils import int_or_none
 class FranceInterIE(InfoExtractor):
-    _VALID_URL = r'http://(?:www\.)?franceinter\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
+    _VALID_URL = r'https?://(?:www\.)?franceinter\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
     _TEST = {
         'url': 'http://www.franceinter.fr/player/reecouter?play=793962',
         'md5': '4764932e466e6f6c79c317d2e74f6884',
-        "info_dict": {
+        'info_dict': {
             'id': '793962',
             'ext': 'mp3',
             'title': 'L’Histoire dans les jeux vidéo',

Some files were not shown because too many files have changed in this diff.