Compare commits

581 Commits

Author SHA1 Message Date
Philipp Hagemeister
900813a328 release 2015.01.07.2 2015-01-07 07:41:48 +01:00
Philipp Hagemeister
2bad0e5d20 [/__init__] Define public API 2015-01-07 07:41:05 +01:00
Philipp Hagemeister
ccc5842bc9 [gameone] Modernize 2015-01-07 07:37:21 +01:00
Philipp Hagemeister
fd86c2026d release 2015.01.07.1 2015-01-07 07:31:38 +01:00
Philipp Hagemeister
e4a8eae701 Merge commit '8ee3415' 2015-01-07 07:30:57 +01:00
Philipp Hagemeister
75e51819d0 release 2015.01.07 2015-01-07 07:22:28 +01:00
Philipp Hagemeister
8ee341500d [viki] Modernize 2015-01-07 07:21:24 +01:00
Philipp Hagemeister
0590062925 Respect age_limit when listing extractors (Fixes #4653) 2015-01-07 07:20:20 +01:00
Sergey M․
799d88d3d8 [nrktv] Add support for playlists (Closes #4656) 2015-01-07 06:46:56 +06:00
Sergey M․
760aea9a96 Merge branch 'oskar456-ceskatelevizesrt' 2015-01-07 05:05:30 +06:00
Sergey M․
d6a31b1766 Credit @oskar456 for ceskatelevize subtitles support (#4622) 2015-01-07 05:05:18 +06:00
Sergey M․
0b54a5b10a [ceskatelevize] Add subtitles tests 2015-01-07 05:04:15 +06:00
Sergey M․
6309cb9b41 [ceskatelevize] Fix python 2.6 format issue 2015-01-07 05:03:34 +06:00
Sergey M․
27a82a1b93 [ceskatelevize] Simplify 2015-01-07 05:03:14 +06:00
Sergey M․
ecd1936695 Merge branch 'ceskatelevizesrt' of https://github.com/oskar456/youtube-dl into oskar456-ceskatelevizesrt 2015-01-07 05:02:27 +06:00
Jaime Marquínez Ferrándiz
76b3c61012 [youtube] Add formats 308 and 315 (closes #4650) 2015-01-06 11:59:41 +01:00
Sergey M․
0df2dea73b [giga] Add extractor (Closes #4090) 2015-01-06 06:54:31 +06:00
Philipp Hagemeister
f8bb576c4f release 2015.01.05.1 2015-01-05 22:42:38 +01:00
Philipp Hagemeister
ee61f6f3e2 [youtube] Handle cases where format comes without a preference (Fixes #4648) 2015-01-05 22:42:17 +01:00
Jaime Marquínez Ferrándiz
8f9529cd05 [motorsport] Fix extraction and make trailing '/' optional
They directly embed a youtube video now.
2015-01-05 19:19:01 +01:00
Philipp Hagemeister
f4bca0b348 release 2015.01.05 2015-01-05 18:44:29 +01:00
Philipp Hagemeister
6291438073 [auengine] Simplify (#4643) 2015-01-05 18:21:32 +01:00
Philipp Hagemeister
18c3c15391 Merge remote-tracking branch 'Oteng/master' 2015-01-05 18:18:15 +01:00
Philipp Hagemeister
dda620e88c [radiobremen] Make code more readable and more resilient to failures 2015-01-05 18:17:03 +01:00
Philipp Hagemeister
d7cc31b63e [generic] PEP8 2015-01-05 18:16:47 +01:00
Philipp Hagemeister
5e3e1c82d8 Credit @ckrooss for radiobremen (#4632) 2015-01-05 18:14:39 +01:00
Philipp Hagemeister
aa80652f47 [radiobremen] Add test for thumbnail 2015-01-05 18:14:09 +01:00
Philipp Hagemeister
9d247bbd2d [radiobremen] Fix under Python 2.6 and fix duration 2015-01-05 18:13:19 +01:00
Philipp Hagemeister
93e40a7b2f Merge remote-tracking branch 'ckrooss/master' 2015-01-05 18:07:16 +01:00
oteng
03ff2cc1c4 [Auengine] corrected extraction logic
The way the video download URL was being extracted was
not working well, so I changed it to extract the
correct URL
2015-01-05 16:28:24 +00:00
Jaime Marquínez Ferrándiz
a285b6377b [normalboots] Skip download in test, it uses rtmp 2015-01-05 13:59:49 +01:00
Jaime Marquínez Ferrándiz
cd791a5ea0 [ted] Add support for embed-ssl.ted.com embedded videos 2015-01-05 13:11:13 +01:00
Jaime Marquínez Ferrándiz
87830900a9 [generic] Update some tests 2015-01-05 13:07:24 +01:00
Jaime Marquínez Ferrándiz
dfc9d9f50a Merge pull request #4639 from bartkappenburg/patch-1
Update rtlnl.py
2015-01-05 12:31:07 +01:00
Jaime Marquínez Ferrándiz
75311a7e16 .travis.yml: Remove my email from the list 2015-01-05 12:29:32 +01:00
Jaime Marquínez Ferrándiz
628bc4d1e7 [khanacademy] Update test 2015-01-05 12:28:35 +01:00
Jaime Marquínez Ferrándiz
a4c3f48639 [vimple] Replace tests
The first one seems to be no longer available and the second was an episode from a tv show.
2015-01-05 11:54:14 +01:00
Bart Kappenburg
bdf80aa542 Update rtlnl.py
Added support for the non-www version of rtlxl.nl by making "www." optional.
2015-01-05 11:51:24 +01:00
Naglis Jonaitis
adf3c58ad3 [lrt] Fix missing provider key
Also, modernize a bit.
2015-01-05 02:55:12 +02:00
Naglis Jonaitis
caf90bfaa5 [webofstories] Add new extractor (Closes #4585) 2015-01-05 02:22:01 +02:00
Jaime Marquínez Ferrándiz
2f985f4bb4 [youtube:toplist] Remove extractor
They now use normal playlists (their id is PL*).
2015-01-05 00:18:43 +01:00
Philipp Hagemeister
67c2bcdf4c Remove extractors which infringe copyright (#4554) 2015-01-04 19:19:18 +01:00
Jaime Marquínez Ferrándiz
1d2d0e3ff2 utils: Remove blank line at the end of file 2015-01-04 14:07:06 +01:00
Jaime Marquínez Ferrándiz
9fda6ee39f [tf1] Remove unused import 2015-01-04 14:06:23 +01:00
Jaime Marquínez Ferrándiz
bc3e582fe4 Don't use '-shortest' option for merging formats (closes #4220, closes #4580)
With avconv and older versions of ffmpeg the video is partially copied.
The duration difference between the audio and the video seems to be really small, so it's probably not noticeable.
2015-01-04 14:02:17 +01:00
Christopher Krooss
bc1fc5ddbc Don't check for height as it's not provided 2015-01-04 14:02:07 +01:00
Jaime Marquínez Ferrándiz
63948fc62c [downloader/hls] Respect the 'prefer_ffmpeg' option 2015-01-04 13:41:49 +01:00
Christopher Krooss
f4858a7103 Add support for Radio Bremen 2015-01-04 13:33:26 +01:00
Philipp Hagemeister
26886e6140 release 2015.01.04 2015-01-04 03:15:48 +01:00
Philipp Hagemeister
7a1818c99b [vk] Add support for rutube embeds (Fixes #4514) 2015-01-04 03:15:27 +01:00
Philipp Hagemeister
2ccd1b10e5 [soulanime] Fix under Python 3 2015-01-04 02:20:45 +01:00
Philipp Hagemeister
788fa208c8 Merge branch 'master' of github.com:rg3/youtube-dl 2015-01-04 02:08:38 +01:00
Philipp Hagemeister
8848314c08 [Makefile] Make offline tests actually work offline 2015-01-04 02:08:18 +01:00
Philipp Hagemeister
c11125f9ed [tests] Remove format 138 from tests (#4559) 2015-01-04 02:06:53 +01:00
Philipp Hagemeister
95ceeec722 Remove unused import 2015-01-04 02:05:35 +01:00
Philipp Hagemeister
b68ff25917 Add various anime sites (Closes #4554) 2015-01-04 02:05:26 +01:00
Sergey M.
3e3327ea17 Merge pull request #4629 from t0mm0/tf1-tfou
[tf1] add support for TFOU
2015-01-04 06:51:28 +06:00
t0mm0
b158bb8693 [tf1] simplify regex 2015-01-04 00:45:23 +00:00
t0mm0
2bf098eda4 [tf1] fix test 2015-01-04 00:43:55 +00:00
t0mm0
382e05fa56 [tf1] add support for TFOU 2015-01-04 00:05:31 +00:00
Philipp Hagemeister
19b05d886e release 2015.01.03 2015-01-03 18:35:30 +01:00
Philipp Hagemeister
e65566a9cc [youtube] Correct handling when DASH manifest is not necessary to find all formats 2015-01-03 18:33:38 +01:00
Sergey M․
baa3c3f0f6 [ellentv] Improve extraction 2015-01-03 21:54:18 +06:00
Sergey M․
f4f339529c [ellentv] Clean up and simplify 2015-01-03 21:44:47 +06:00
Sergey M.
7d02fae85b Merge pull request #4626 from gauravb7090/ellentube
Added support for EllenTube along with EllenTV
2015-01-03 21:40:39 +06:00
Gaurav
6e46c3f1fd Added support for EllenTube along with EllenTV 2015-01-03 20:30:28 +05:30
Sergey M․
c7e675940c [bbccouk] Add support for music clips (Closes #4143) 2015-01-03 20:43:40 +06:00
Jaime Marquínez Ferrándiz
d26b1317ed [downloader/mplayer] Use check_executable 2015-01-03 00:33:36 +01:00
Jaime Marquínez Ferrándiz
a221f22969 [crunchyroll] Fix format extraction
Reported in https://github.com/rg3/youtube-dl/issues/2782#issuecomment-68556780
2015-01-02 21:17:10 +01:00
Jaime Marquínez Ferrándiz
817f786fbb [canalplus] Raise an error if the video is georestricted (closes #4472) 2015-01-02 21:02:34 +01:00
Sergey M․
62420c73cb [played] Skip test 2015-01-02 22:31:55 +06:00
Sergey M․
2522a0b7da [kontrtube] Extract display_id
Trailing slash in URL is mandatory now
2015-01-02 22:28:48 +06:00
Sergey M․
46d32a12c9 [bet] Update test 2015-01-02 22:23:00 +06:00
Sergey M․
c491418526 [bbccouk] Update test 2015-01-02 22:13:26 +06:00
Ondřej Caletka
c067545c17 ceskatelevize: Closed captions support 2015-01-02 17:12:20 +01:00
Sergey M․
823a155293 [vier:videos] Tune _VALID_URL not to match single videos 2015-01-02 22:09:00 +06:00
Sergey M․
324b2c78fa [xtube] Fix uploader regex 2015-01-02 21:46:57 +06:00
Sergey M․
d34f98289b [xhamster] Remove identical tests 2015-01-02 21:12:25 +06:00
Sergey M.
644096b15c Merge pull request #4615 from dwemthy/https_xhamster
[xhamster] Add HTTPS support
2015-01-02 21:09:28 +06:00
Sergey M․
15cebcc363 Merge branch 'master' of github.com:rg3/youtube-dl 2015-01-02 20:57:12 +06:00
Sergey M․
faa4ea68c0 [generic] Add BBC iPlayer playlist test 2015-01-02 20:56:42 +06:00
Philipp Hagemeister
29a9385ff0 release 2015.01.02 2015-01-02 15:56:26 +01:00
Sergey M․
476eae0c2a [generic] Generalize BBC iPlayer playlist extraction 2015-01-02 20:55:09 +06:00
Sergey M․
8399267671 [generic] Make getter None by default 2015-01-02 20:54:30 +06:00
Sergey M․
db546cf87f [generic] Add support for BBC iPlayer embeds (Closes #4619) 2015-01-02 20:46:17 +06:00
Sergey M․
317639758a [bbccouk] Improve _VALID_URL 2015-01-02 20:37:54 +06:00
Sergey M․
fdbabca85f [vier:videos] Tune _VALID_URL 2015-01-02 20:21:41 +06:00
Sergey M․
6f790e5821 Credit @lovebug356 for vier (#4617) 2015-01-02 20:16:43 +06:00
Sergey M․
6f5cdeb611 Merge branch 'lovebug356-vier' 2015-01-02 20:15:59 +06:00
Sergey M․
9eb4f404cb [vier] Simplify, add support for more URL formats, extract all playlist pages when page is not specified 2015-01-02 20:15:40 +06:00
Thijs Vermeir
f58487b392 [vier] Add new extractor 2015-01-02 13:35:47 +01:00
dwemthy
5b9aefef77 [xhamster] Add HTTPS support 2015-01-02 11:54:38 +00:00
Philipp Hagemeister
772fd5cc44 [youtube] Add a pseudo-extractor for truncated YouTube video IDs (#4610) 2015-01-01 23:44:39 +01:00
Philipp Hagemeister
50a0f6df7e [/__init__] Add another cute search example 2015-01-01 22:47:21 +01:00
Philipp Hagemeister
9f435c5f1c Add an extractor for common mistakes (#4610) 2015-01-01 22:34:58 +01:00
Philipp Hagemeister
931e2d1d26 [bbccouk] PEP8 2015-01-01 22:15:46 +01:00
Philipp Hagemeister
a42419da42 [options] Upper-case options and URL in --help output
Hopefully, this reduces confusion as in #4610.
2015-01-01 22:01:47 +01:00
Philipp Hagemeister
9a237b776c release 2015.01.01 2015-01-01 21:41:42 +01:00
Sergey M․
02ec32a1ef [ceskatelevize] Adapt to new API (Closes #4531) 2015-01-01 20:01:55 +06:00
Sergey M․
a1e9e6440f [moevideo] Skip removed video test 2015-01-01 00:46:03 +06:00
Sergey M․
5878e6398c [nrktv] Update tests' checksums 2015-01-01 00:37:57 +06:00
Sergey M․
6c6f1408f2 [extractor/common] Allow multiline content tags 2015-01-01 00:37:14 +06:00
Sergey M․
b7a7319c38 [slideshare] Fix extraction 2015-01-01 00:26:19 +06:00
Sergey M․
68f705cac5 [tnaflix] Make sure config URL has correct scheme 2015-01-01 00:12:41 +06:00
Sergey M․
079d1dcd80 [tnaflix] Fix title extraction 2015-01-01 00:11:56 +06:00
Sergey M․
7b24bbdf49 [xboxclips] Fix extraction 2014-12-31 23:59:16 +06:00
Jaime Marquínez Ferrándiz
f86d543ebb [pbs] Catch geoblocking errors (closes #4516) 2014-12-31 17:43:49 +01:00
Jaime Marquínez Ferrándiz
60e47a2699 [youtube] Use '_download_xml' for getting the available subtitles 2014-12-31 15:44:15 +01:00
Sergey M․
b8bc7a696b [openfilm] Add extractor (Closes #4538) 2014-12-31 19:40:35 +06:00
Jaime Marquínez Ferrándiz
7d900ef1bf [youtube] Add support for automatically translated subtitles (fixes #4555)
They have a manually uploaded subtitles track and YouTube can translate it.
2014-12-31 14:15:16 +01:00
Sergey M․
1931a73f39 [echomsk] Add extractor (Closes #4600) 2014-12-31 18:03:51 +06:00
Philipp Hagemeister
966ea3aebd [README] Typo / clarify FAQ 2014-12-30 23:41:29 +01:00
Philipp Hagemeister
b3013681ff Merge remote-tracking branch 'origin/master' 2014-12-30 19:41:04 +01:00
Philipp Hagemeister
416c7fcbce Add documentation about supported sites (Fixes #4503) 2014-12-30 19:35:35 +01:00
Sergey M․
e83eebb12f [atresplayer] Fix python3 bug 2014-12-30 22:46:23 +06:00
Sergey M․
a349873226 [atresplayer] Add extractor (Closes #2341) 2014-12-30 22:28:07 +06:00
Sergey M․
fccae2b911 [youtube] Add test for age-gate video with encrypted signature 2014-12-30 17:26:21 +06:00
Sergey M․
3ee08848db Credit @0xced for #4598 2014-12-30 17:12:12 +06:00
Sergey M.
0129b4dd45 Merge pull request #4598 from 0xced/encrypted-age-gate
[youtube] Fix videos with age gate and encrypted signatures
2014-12-30 17:09:02 +06:00
Sergey M․
1c57e7f1f4 [daum] Improve full_id regex 2014-12-30 16:55:53 +06:00
Sergey M.
d0caf3a11e Merge pull request #4599 from t0mm0/daum_fix
[daum] update 'full id' regex
2014-12-30 16:52:02 +06:00
t0mm0
a87bb090d9 [daum] update 'full id' regex
fixes #4566
2014-12-29 23:06:56 +00:00
Cédric Luthi
beb95e7781 [youtube] Fix videos with age gate and encrypted signatures
The `sts` value is available on the embed webpage, get it from there.

Fixes #4108.
2014-12-29 22:58:14 +01:00
Sergey M․
5435d7af91 Merge branch 't0mm0-hitbox' 2014-12-30 03:22:25 +06:00
Sergey M․
0c0a70f4c6 [hitbox] Minor changes 2014-12-30 03:22:07 +06:00
t0mm0
e3947e2b7f [hitbox] add support for live streams 2014-12-29 20:12:23 +00:00
t0mm0
da3f7fb7f8 [hitbox] add extractor for hitbox vods 2014-12-29 20:12:23 +00:00
Sergey M․
429ddfd38d [cnn] Add support for hln URL format (Closes #4595) 2014-12-30 01:50:28 +06:00
Sergey M․
479514d015 Merge branch 'peugeot-hellporno' 2014-12-29 21:33:57 +06:00
Sergey M․
355e41466d [hellporno] Extract all formats and improve 2014-12-29 21:33:41 +06:00
Sergey M․
03d9aad87c Merge branch 'hellporno' of https://github.com/peugeot/youtube-dl into peugeot-hellporno 2014-12-29 21:13:09 +06:00
Sergey M․
3e2bcf530b Merge branch 'peugeot-xxxymovies' 2014-12-29 21:05:41 +06:00
Sergey M․
6343a5f68e [xxxymovies] Improve 2014-12-29 21:05:21 +06:00
Sergey M․
00de9a9828 Merge branch 'xxxymovies' of https://github.com/peugeot/youtube-dl into peugeot-xxxymovies 2014-12-29 20:38:28 +06:00
Sergey M․
7fc2cd819e [cnn] Improve regexes and fix test 2014-12-29 20:27:09 +06:00
Sergey M.
974739aab5 Merge pull request #4543 from akretz/cnn_fix
[cnn] Add support for articles with videos (fixes #4541)
2014-12-29 20:21:39 +06:00
peugeot
0cc4f8e385 [xxxymovies] new extractor 2014-12-29 11:31:22 +01:00
peugeot
513fd2a872 [hellporno] simplify 2014-12-29 10:38:07 +01:00
Sergey M․
ae6986fb89 [bbccouk] Switch to new JSON playlist format (Closes #4588) 2014-12-29 03:00:24 +06:00
Sergey M․
e8e28989eb [archiveorg] Add test, simplify and modernize 2014-12-29 02:08:46 +06:00
Sergey M.
0fa629d05b Merge pull request #4590 from derrotebaron/master
[archiveorg] most metadata fields are optional
2014-12-29 01:53:59 +06:00
Johannes Knoedtel
ff7a07d5c4 [archiveorg] most metadata fields are optional
Example: https://archive.org/details/Cops1922
2014-12-28 20:31:25 +01:00
Sergey M․
5a18403057 [arte.tv] Fix typo 2014-12-28 15:42:29 +06:00
Sergey M․
1b7b1d6eac [arte.tv:+7] Make quality optional (Closes #4583) 2014-12-28 15:41:52 +06:00
Sergey M․
23cfa4ae45 Merge branch 'peugeot-alphaporno' 2014-12-27 00:08:25 +06:00
Sergey M․
e82def52a9 [alphaporno] Improve 2014-12-27 00:08:04 +06:00
Sergey M․
bcfe9db299 Merge branch 'alphaporno' of https://github.com/peugeot/youtube-dl into peugeot-alphaporno 2014-12-26 23:34:12 +06:00
Sergey M․
cf00ae7640 Merge branch 'peugeot-eroprofile' 2014-12-26 23:33:01 +06:00
Sergey M․
f9b9e88646 [eroprofile] Simplify 2014-12-26 23:32:41 +06:00
Sergey M․
c2500434c3 Merge branch 'eroprofile' of https://github.com/peugeot/youtube-dl into peugeot-eroprofile 2014-12-26 23:16:25 +06:00
Sergey M․
f74b341dde expect_info_dict actual-expected argument consistency 2014-12-26 23:07:24 +06:00
peugeot
461b00f34a [eroprofile] new extractor 2014-12-26 17:15:34 +01:00
peugeot
4cda41ac7b [alphaporno] new extractor 2014-12-26 16:17:35 +01:00
peugeot
6a1c4fbfcb [hellporno] new extractor 2014-12-26 15:49:12 +01:00
Sergey M․
31424c126f [sunporno] Modernize 2014-12-26 19:28:51 +06:00
Sergey M.
53096539dc Merge pull request #4568 from peugeot/sunporno
[sunporno] fix duration
2014-12-26 19:25:05 +06:00
peugeot
2c0b475235 [sunporno] fix duration 2014-12-26 12:49:13 +01:00
Sergey M․
a542405200 Credit @MaxReimann for teletask (#4533) 2014-12-25 23:29:10 +06:00
Sergey M․
3e2b085ef9 Merge branch 'MaxReimann-teletask' 2014-12-25 23:27:23 +06:00
Sergey M․
885e4384a1 [teletask] Simplify 2014-12-25 23:26:57 +06:00
Sergey M․
2b8f151094 Merge branch 'teletask' of https://github.com/MaxReimann/youtube-dl into MaxReimann-teletask 2014-12-25 23:06:26 +06:00
Sergey M․
5ac71f0b27 [sohu] Modernize and extract all formats and more metadata (Closes #4409, closes #2056, closes #3009) 2014-12-25 22:25:05 +06:00
Sergey M․
39ac7c9435 [gameone] Extract duration as float 2014-12-24 19:18:59 +06:00
Sergey M.
ed7bdc8a90 Merge pull request #4553 from tobidope/gameone
[gameone] This fix resolves issue #4552
2014-12-24 19:05:05 +06:00
Tobias Bell
55f0cab3a3 [gameone] This fix resolves issue #4552
The duration metadata for certain episodes contained floating point
numbers instead of integers. Now only the integer part will be
interpreted. Also added a test for this
2014-12-23 22:09:21 +01:00
Sergey M․
544dec6298 [smotri] Skip broken tests 2014-12-23 20:33:56 +06:00
Jaime Marquínez Ferrándiz
e0ae1814b1 [sportdeutschland] Fix extraction (fixes #4544) 2014-12-22 22:24:19 +01:00
Adrian Kretz
9532d72371 [cnn] Add support for articles with videos (fixes #4541) 2014-12-22 18:40:36 +01:00
Sergey M․
1362bbbb4b [adobetv] Add extractor (Closes #4536) 2014-12-22 22:05:47 +06:00
Jaime Marquínez Ferrándiz
f00fd51dae Don't write the description file if info_dict['description'] is None (#3166) 2014-12-21 20:49:14 +01:00
Sergey M․
a8896c5ac2 [crunchyroll] Add .fr domain (#4537) 2014-12-22 00:58:15 +06:00
Jaime Marquínez Ferrándiz
5d3808524d [extractor/common] Update docstring: replace FileDownloader with YoutubeDL 2014-12-21 16:58:29 +01:00
Jaime Marquínez Ferrándiz
c8f167823f [dbtv] Make sure the 'id' field is a string 2014-12-21 16:57:07 +01:00
Jaime Marquínez Ferrándiz
70f6796e7d [telecinco] Rename 'episode' group to 'id' in the _VALID_URL regex
MiTeleIE now uses '_match_id'
2014-12-21 16:54:53 +01:00
Jaime Marquínez Ferrándiz
85d253af6b [internetvideoarchive] Update test's duration field 2014-12-21 15:37:21 +01:00
Jaime Marquínez Ferrándiz
a86cbf5876 [rtp] Fix test's id field 2014-12-21 15:28:40 +01:00
Jaime Marquínez Ferrándiz
3f1399de8a [tmz] Fix test's thumbnail field 2014-12-21 15:26:00 +01:00
Jaime Marquínez Ferrándiz
1f809a8560 [nerdcubed] Style fixes 2014-12-21 15:22:30 +01:00
Jaime Marquínez Ferrándiz
653d14e2f9 [yahoo] Update extraction process
Their webpage now uses https://video.media.yql.yahoo.com/v1/video/sapi/streams/ for getting the video info.
2014-12-21 14:47:44 +01:00
Jaime Marquínez Ferrándiz
85fab7e47b [yahoo] Replace two tests
The first one returned an internal server error.
The other doesn't seem to contain a video anymore.
2014-12-21 14:47:12 +01:00
Jaime Marquínez Ferrándiz
3aa9176f08 [yahoo] Improve video id detection (fixes #4521) 2014-12-21 14:09:00 +01:00
MaxReimann
33b53b6021 [teletask] Add new extractor 2014-12-21 12:26:47 +01:00
MaxReimann
3f7421b71b fix test and remove lengthy description 2014-12-21 11:13:59 +01:00
MaxReimann
ee45625290 Add extractor for teletask 2014-12-21 11:01:28 +01:00
Sergey M․
2c2a42587b [dvtv] Fix thumbnail scheme 2014-12-21 07:38:55 +06:00
Sergey M․
e2f65efcf9 Merge branch 'petrkutalek-dvtv' 2014-12-21 07:34:27 +06:00
Sergey M․
081d6e4784 [dvtv] Simplify 2014-12-21 07:33:58 +06:00
Petr Kutalek
1d4247f64e [dvtv] Add support for playlists 2014-12-21 01:24:05 +01:00
Sergey M․
1ff30d7b79 [npo] Add support for streams (Closes #4276) 2014-12-20 18:30:56 +06:00
Sergey M․
16ea817968 [xtube] Fix and modernize (Closes #4489) 2014-12-19 21:56:44 +06:00
Philipp Hagemeister
a2a4bae929 Credit @willglynn for nerdcubed (#4515) 2014-12-19 10:32:20 +01:00
Will Glynn
c58843b3a1 [nerdcubed] Add new extractor
nerdcubed.co.uk describes videos in a single feed.json file, providing
references to and metadata on >1300 YouTube videos spread across 3 main
channels as well as guest appearances on other channels via a single HTTP
request.

NerdCubedFeedIE transforms this feed into a youtube-dl playlist, preserving
information present in the upstream JSON (allowing zero-cost title/date
matches) and ultimately referencing the embedded YouTube videos.
2014-12-18 22:32:24 -06:00
Sergey M․
a22524b004 [twitch] Add support for vods (Closes #4512) 2014-12-18 21:25:42 +06:00
Philipp Hagemeister
87c4c21e75 Credit @petrkutalek for dvtv (#4502) 2014-12-17 23:38:11 +01:00
Philipp Hagemeister
b9465395cb [dvtv] PEP8 and correct format sorting (#4502) 2014-12-17 23:18:06 +01:00
Philipp Hagemeister
edf41477f0 Merge remote-tracking branch 'petrkutalek/dvtv' 2014-12-17 23:12:38 +01:00
Petr Kutalek
5f627b4448 [dvtv] Add new extractor 2014-12-17 15:52:54 +01:00
Philipp Hagemeister
60e5428925 [flake8] Ignore build/ directory
That directory is temporarily generated when building for PyPI and may be present if something goes wrong with the upload.
2014-12-17 15:36:18 +01:00
Sergey M․
748ec66725 [theplatform] Extract captions (Closes #4495) 2014-12-17 20:20:40 +06:00
Jaime Marquínez Ferrándiz
e54a3a2f01 [screencastomatic] Remove unused variable 2014-12-17 14:56:30 +01:00
Jaime Marquínez Ferrándiz
0e4cb4f406 YoutubeDL: style fix 2014-12-17 14:55:27 +01:00
Philipp Hagemeister
f7ffe72ac7 Merge pull request #4501 from AndroKev/master
only add video-id to archive, when successful
2014-12-17 13:31:33 +01:00
AndroKev
cd58dc3e56 Update YoutubeDL.py 2014-12-17 13:21:22 +01:00
AndroKev
c33bcf2051 only add video-id to archive, when successful
Example:
no space left --> youtube-dl adds the ID to the archive, but the video isn't fully downloaded
2014-12-17 13:05:19 +01:00
Philipp Hagemeister
7642c08763 release 2014.12.17.2 2014-12-17 11:39:25 +01:00
Philipp Hagemeister
fdc8000810 [downloader] Handle a file ./- (Fixes #4498) 2014-12-17 11:39:06 +01:00
Philipp Hagemeister
a91c9b15e3 release 2014.12.17.1 2014-12-17 11:29:52 +01:00
Philipp Hagemeister
27d67ea2ba [comedycentral] Match URLs with a second ID (fixes #4499) 2014-12-17 11:29:35 +01:00
Philipp Hagemeister
d6a8160902 release 2014.12.17 2014-12-17 10:53:17 +01:00
Philipp Hagemeister
6e1b9395c6 [screencastomatic] Add new extractor (Fixes #4497) 2014-12-17 10:53:12 +01:00
Philipp Hagemeister
b1ccbed3d4 [nhl] Allow upper-case video IDs (Fixes #4494) 2014-12-17 00:26:04 +01:00
Philipp Hagemeister
37381350f8 [aljazeera] Add unicode_literals marker 2014-12-17 00:08:04 +01:00
Philipp Hagemeister
7af808a5ef Improve code style 2014-12-17 00:06:41 +01:00
Philipp Hagemeister
876bef5937 [mit] Modernize 2014-12-17 00:04:24 +01:00
Jaime Marquínez Ferrándiz
a16af51873 flake8: Add more ignored files
* setup.py: the '__version__' variable is not defined in the script, it is loaded from youtube_dl/version.py
* devscripts/buildserver.py: Produces a lot of messages
2014-12-16 20:38:59 +01:00
Jaime Marquínez Ferrándiz
dc9a441bfa Move flake8 configuration to setup.cfg
It will be used when calling flake8 from any directory in the project
2014-12-16 20:34:07 +01:00
Jaime Marquínez Ferrándiz
ee6dfe8308 Use flake8 instead of pyflakes and pep8
It combines both tools
2014-12-16 20:34:07 +01:00
Jaime Marquínez Ferrándiz
2cb5b03e53 [test/test_unicode_literals] Remove duplicated imports 2014-12-16 20:33:23 +01:00
Philipp Hagemeister
964b190350 release 2014.12.16.2 2014-12-16 16:45:35 +01:00
Philipp Hagemeister
13d27a42cc [orf:tvthek] Add support for topic URLs (Fixes #4474) 2014-12-16 16:45:28 +01:00
Philipp Hagemeister
ec05fee43a [brightcove] Add shorter URL scheme for other extractors 2014-12-16 16:38:26 +01:00
Philipp Hagemeister
b50e3bc67f [README] Add table of contents (Closes #4458) 2014-12-16 16:33:23 +01:00
Philipp Hagemeister
ac78b5e97b release 2014.12.16.1 2014-12-16 16:03:57 +01:00
Philipp Hagemeister
17e0d63957 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-16 16:03:46 +01:00
Sergey M․
9209fe3878 [allocine] Add test for new URL format 2014-12-16 21:03:10 +06:00
Philipp Hagemeister
84d84211ac [youtube:feeds] (Fixes #4486) 2014-12-16 15:59:31 +01:00
Sergey M.
b4116dcdd5 Merge pull request #4490 from Tailszefox/master
[Allocine] Support for more URLs
2014-12-16 20:59:07 +06:00
Jaime Marquínez Ferrándiz
bb18d787b5 [aljazeera] Add extractor (closes #4487) 2014-12-16 15:48:01 +01:00
Tailszefox
0647084f39 [Allocine] Support for more URLs 2014-12-16 15:46:04 +01:00
Philipp Hagemeister
734ea11e3c Drop hash character in downloader output (#4484) 2014-12-16 00:37:42 +01:00
Philipp Hagemeister
3940450878 release 2014.12.16 2014-12-16 00:24:30 +01:00
Philipp Hagemeister
ccbfaa83b0 [devscripts/make_contributing] Switch to optparse (Fixes #4483) 2014-12-16 00:24:11 +01:00
Philipp Hagemeister
d86007873e [YoutubeDL] Document where details for format can be found 2014-12-16 00:22:23 +01:00
Jaime Marquínez Ferrándiz
4b7df0d30c [youtube:playlist] Work around buggy playlists (fixes #4449)
They show a "Load more" button, but they don't have more videos.
The continuation url in the json file was a link to itself, so we ended up in an infinite loop.
2014-12-15 19:19:15 +01:00
Philipp Hagemeister
caff59499c [README] Fix code rendering 2014-12-15 11:14:06 +01:00
Philipp Hagemeister
99a0f9824a [README] Highlight code examples 2014-12-15 11:11:52 +01:00
Jaime Marquínez Ferrándiz
3013bbb27d Remove unused imports 2014-12-15 08:24:50 +01:00
Naglis Jonaitis
6f9b54933f [streamcloud] Modernize 2014-12-15 03:32:17 +02:00
Naglis Jonaitis
1bbe317508 [mooshare] Modernize 2014-12-15 03:31:54 +02:00
Philipp Hagemeister
e97a534f13 release 2014.12.15 2014-12-15 01:36:46 +01:00
Philipp Hagemeister
8acb83d993 [README] Make example audio sound not that horrible ;) 2014-12-15 01:34:39 +01:00
Philipp Hagemeister
71b640cc5b [YoutubeDL] Add declarative version of progress hooks 2014-12-15 01:26:20 +01:00
Philipp Hagemeister
4f026fafbc [YoutubeDL] Make postprocessors declarative
Instead of having to configure PPs in code, this allows us and embedding programs not to worry about imports or finer details, similarly to how we handle IEs.
2014-12-15 01:06:25 +01:00
Philipp Hagemeister
39f594d660 [Makefile] Ensure that offline test really is offline 2014-12-15 00:59:23 +01:00
Philipp Hagemeister
cae97f6521 Improve and test ffmpeg version detection 2014-12-14 21:59:59 +01:00
Philipp Hagemeister
6cbf345f28 Remove test/write_info_json
This is now covered by every single test_download testcase anyways :)
2014-12-14 21:56:12 +01:00
Philipp Hagemeister
a0ab29f8a1 Add offlinetest make target 2014-12-14 21:55:57 +01:00
Naglis Jonaitis
4a4fbfc967 [yesjapan] Look for datetime inside submit_info
Oops..
2014-12-14 18:03:05 +02:00
Naglis Jonaitis
408b5839b1 [yesjapan] Add new extractor (Closes #4466) 2014-12-14 17:59:25 +02:00
Philipp Hagemeister
60620368d7 [youtube] Fix player ID detection 2014-12-14 00:43:34 +01:00
Philipp Hagemeister
4927de4f86 release 2014.12.14 2014-12-14 00:13:17 +01:00
Philipp Hagemeister
bad5c1a303 [rtp] Also match e-id-less URLs (#4382) 2014-12-14 00:13:07 +01:00
Philipp Hagemeister
6f18cc9abc release 2014.12.13.1 2014-12-13 23:51:57 +01:00
Philipp Hagemeister
4d144be8b0 [bandcamp:album] Do not match plain Bandcamp URLs (#4461)
The _VALID_URL from 1fa174692a is too broad, since it matches everything beginning with bandcamp.com.
2014-12-13 23:50:06 +01:00
Philipp Hagemeister
2128b696b8 [utils] Do not make an exception for SSLv3
SSLv3 is terminally vulnerable to POODLE; web browsers are currently deprecating/removing it.
Closes #4459, fixes #4294
2014-12-13 23:45:34 +01:00
Philipp Hagemeister
a23669220a [utils] Make ssl work on Python 2.7.8 2014-12-13 23:27:21 +01:00
Philipp Hagemeister
051c46256b release 2014.12.13 2014-12-13 23:13:48 +01:00
Philipp Hagemeister
d5524947b5 Merge remote-tracking branch 'fstirlitz/master' 2014-12-13 23:05:41 +01:00
Philipp Hagemeister
74f91c4af7 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-13 23:05:28 +01:00
Philipp Hagemeister
da4d4191a9 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-13 23:05:22 +01:00
Sergey M․
2564300e55 Credit @Mortal for restudy (#4463) 2014-12-14 03:42:42 +06:00
Sergey M․
cb0713d2c9 Merge branch 'Mortal-restudy' 2014-12-14 03:41:17 +06:00
Sergey M․
ac265bef1e [restudy] Simplify and extract all formats 2014-12-14 03:41:00 +06:00
Mathias Rav
4a0132c570 [Restudy] Add new extractor for restudy.dk 2014-12-13 22:25:32 +01:00
Sergey M․
1fa174692a [bandcamp:album] Make path optional (Closes #4461) 2014-12-14 02:00:54 +06:00
Sergey M․
04c9544187 [bbccouk] Fix vpid warning 2014-12-13 18:47:34 +06:00
Sergey M․
8085fc15cc [adultswim] Improve segment duration extraction 2014-12-13 18:42:29 +06:00
Philipp Hagemeister
2f15832f56 Merge pull request #3927 from qrtt1/master
apply ratelimit to f4m
2014-12-13 12:59:12 +01:00
Jaime Marquínez Ferrándiz
1557ed153c [test_unicode_literals] Import from test.helper 2014-12-13 12:45:09 +01:00
Philipp Hagemeister
a6620ac28d [orf] Modernize 2014-12-13 12:41:38 +01:00
Philipp Hagemeister
89e36657cc [keek] remove unused import 2014-12-13 12:36:46 +01:00
Philipp Hagemeister
7129bed51b [keek] Modernize and extract uploader 2014-12-13 12:35:45 +01:00
Philipp Hagemeister
1cc79574fc Fix imports and general cleanup
· Import from compat what comes from compat. Yes, some names are available in utils too, but that's an implementation detail.
· Use _match_id consistently whenever possible
· Fix some outdated tests
· Use consistent valid URL (always match the whole protocol, no ^ at start required)
· Use modern test definitions
2014-12-13 12:35:45 +01:00
Philipp Hagemeister
20e35880bf [streamcz] Update extractor 2014-12-13 12:35:45 +01:00
Philipp Hagemeister
5e1912cfc1 [5min] Remove helper method and modernize
Previously, other extractors would call a private(!) helper method. Instead, just hardcode the 5min:video_id format; it's not as if that would ever change.
2014-12-13 12:35:45 +01:00
Jaime Marquínez Ferrándiz
293f0f39ce [utils] make_HTTPS_handler: Remove try/except block that would always raise an exception
This code is only run for Python < 3.4, where context.load_default_certs doesn't exist
2014-12-12 23:43:25 +01:00
Jaime Marquínez Ferrándiz
0db261ba56 [utils] make_HTTPS_handler: Use ssl.create_default_context in Python 2.7.9
The new features in the ssl module have been backported from 3.4, see https://docs.python.org/dev/whatsnew/2.7.html#pep-466-network-security-enhancements-for-python-2-7
2014-12-12 23:35:17 +01:00
felix
7668a2c5cb [comcarcoff] add webpage_url datum 2014-12-12 23:20:34 +01:00
Jaime Marquínez Ferrándiz
26c06f0c51 [youtube:playlist] Remove unused property 2014-12-12 22:26:50 +01:00
Jaime Marquínez Ferrándiz
23d3608c6b [youtube:channel] Fix extraction (fixes #4435)
It now uses the same pagination system as playlists
2014-12-12 22:23:54 +01:00
Philipp Hagemeister
baa7081d68 [urort] Update to new multi-format protocol 2014-12-12 20:55:18 +01:00
Philipp Hagemeister
19bf2b4e88 [comcarcoff] Add unicode_literals declaration 2014-12-12 20:37:58 +01:00
Philipp Hagemeister
6a1b20de2a [urort] Modernize 2014-12-12 20:37:28 +01:00
Philipp Hagemeister
3c864e930d [comcarcoff] Adapt c62159ea91a04ef82560472b254aef1cc9f70a11 2014-12-12 20:35:17 +01:00
Philipp Hagemeister
dc5596ff54 [comcarcoff] (#4454) 2014-12-12 20:32:02 +01:00
Philipp Hagemeister
46d9760f5e Merge remote-tracking branch 'fstirlitz/master' 2014-12-12 20:17:26 +01:00
Philipp Hagemeister
90d71d3f08 [ooyala] Remove test md5sums 2014-12-12 20:12:51 +01:00
Philipp Hagemeister
e9404524cc [ninegag] Test for additional properties 2014-12-12 20:10:15 +01:00
felix
dc65a213fd comediansincarsgettingcoffee.com support 2014-12-12 19:58:44 +01:00
Philipp Hagemeister
4237ba10dc [pornotube] Adapt to new interface 2014-12-12 19:44:25 +01:00
Naglis Jonaitis
c3f3b29b92 [rtp] Add new extractor (Closes #4382) 2014-12-12 20:22:24 +02:00
Philipp Hagemeister
1c985da0ca release 2014.12.12.7 2014-12-12 18:25:58 +01:00
Philipp Hagemeister
7a60322abf release 2014.12.12.6 2014-12-12 17:52:50 +01:00
Sergey M․
07bc9a3530 [nowvideo] Add .li domain (Closes #4453) 2014-12-12 22:44:16 +06:00
Philipp Hagemeister
a099965bad release 2014.12.12.5 2014-12-12 17:40:27 +01:00
Philipp Hagemeister
146323a7f8 [groupon] Add extractor (Fixes #4386) 2014-12-12 17:39:33 +01:00
Philipp Hagemeister
57e086dcea [ebaumsworld] Modernize 2014-12-12 17:24:05 +01:00
Philipp Hagemeister
2101f5d4cc release 2014.12.12.4 2014-12-12 17:18:22 +01:00
Philipp Hagemeister
cc8c9281e6 [downloader/common] Do not use classic int division 2014-12-12 17:17:09 +01:00
Philipp Hagemeister
cf372f0778 Merge remote-tracking branch 'SyxbEaEQ2/rate-limit' 2014-12-12 17:16:13 +01:00
Philipp Hagemeister
34bc0ae667 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-12 17:12:25 +01:00
Philipp Hagemeister
2865cf0419 Deprecate --auto-number (Closes #2704) 2014-12-12 17:11:53 +01:00
Sergey M․
58c1f6f0a7 [nbc] Fix extraction (Closes #4441) 2014-12-12 22:10:32 +06:00
Philipp Hagemeister
7c7a0d395c Remove unused imports 2014-12-12 17:07:39 +01:00
Philipp Hagemeister
8bdcb436f9 [test_unicode_literals] Fix test 2014-12-12 17:06:52 +01:00
Mark Schreiber
ff815fe65a Download playlist items in reverse order
Series of videos are typically uploaded to YouTube playlists in
chronological order.  By default, these videos are downloaded
latest-to-earliest; this is great for seeing the latest videos in a
series, but prevents streaming video in the order that the videos were
produced.  Add an option to download videos in reverse order,
earliest-to-latest.

Conflicts:
	youtube_dl/YoutubeDL.py
	youtube_dl/__init__.py
2014-12-12 16:56:29 +01:00
Philipp Hagemeister
da3a2d8137 release 2014.12.12.3 2014-12-12 16:47:38 +01:00
Philipp Hagemeister
13dcfd41bd [CONTRIBUTING.md] Remove the section about embedding; that is not applicable for youtube-dl contributors 2014-12-12 16:47:22 +01:00
Philipp Hagemeister
e56190b378 [Makefile] Add CONTRIBUTING.md (Fixes #2984) 2014-12-12 16:42:40 +01:00
Philipp Hagemeister
a79553f39f Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-12 16:41:12 +01:00
Philipp Hagemeister
b3efb3ebae [README] More concise and nicer bug reporting instructions 2014-12-12 16:40:37 +01:00
Sergey M․
68d301ffd4 [giantbomb] Add extractor (Closes #4432) 2014-12-12 21:23:42 +06:00
Philipp Hagemeister
3b0bec8d11 release 2014.12.12.2 2014-12-12 15:56:45 +01:00
Philipp Hagemeister
412c617d0f [cnet] Update to new theplatform infrastructure (Fixes #2736) 2014-12-12 15:55:55 +01:00
Philipp Hagemeister
751536f5c8 [goldenmoustache] Remove view count
view count is not present anymore, so we can't extract it.
2014-12-12 13:09:55 +01:00
Philipp Hagemeister
025f30ba38 [channel9] Do not return compat_list results anymore 2014-12-12 13:07:43 +01:00
Philipp Hagemeister
0d2fb1d193 [helsinki] Fix extraction 2014-12-12 13:03:16 +01:00
Philipp Hagemeister
82b34105d3 [goshgay] Fix extraction 2014-12-12 12:55:13 +01:00
Philipp Hagemeister
73aeb2dc56 [goshgay] Modernize 2014-12-12 12:44:50 +01:00
Philipp Hagemeister
c6973bd412 [compat] Simplify kwarg detection code
This enables nuitka to compile youtube-dl.
2014-12-12 12:42:35 +01:00
Philipp Hagemeister
f8780e6d11 Merge remote-tracking branch 'grompe/patch-1' 2014-12-12 11:35:04 +01:00
Philipp Hagemeister
e2f89ec7aa Revert "[utils] Work around PyPy stupidity with Windows DLLs (Fixes #4392)"
This reverts commit 16040f46d6.
2014-12-12 11:33:55 +01:00
Philipp Hagemeister
62651c556a [howstuffworks] Parse only once, but right (#4383) 2014-12-12 04:23:34 +01:00
Philipp Hagemeister
bf94e38d3d Merge remote-tracking branch 'Tithen-Firion/hsw-update' 2014-12-12 04:10:55 +01:00
Philipp Hagemeister
4f97852316 Remove unused imports 2014-12-12 04:09:32 +01:00
Philipp Hagemeister
16040f46d6 [utils] Work around PyPy stupidity with Windows DLLs (Fixes #4392) 2014-12-12 04:01:08 +01:00
Philipp Hagemeister
d068ba24f3 release 2014.12.12.1 2014-12-12 03:34:33 +01:00
Philipp Hagemeister
f5e43bc695 [vine] Provide alt_title (Fixes #4448) 2014-12-12 03:34:28 +01:00
Philipp Hagemeister
6a5308ab49 release 2014.12.12 2014-12-12 03:02:56 +01:00
Philipp Hagemeister
63e0f29564 [vine] Modernize 2014-12-12 02:59:52 +01:00
Philipp Hagemeister
42bdd9d051 [cinchcast] Add new extractor (Fixes #4428) 2014-12-12 02:57:36 +01:00
Philipp Hagemeister
4e40de6e2a Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-12 02:14:31 +01:00
Philipp Hagemeister
0fa2b899d1 [Makefile] remove *.info.json in clean target 2014-12-12 02:14:04 +01:00
Philipp Hagemeister
f17e4c9c28 [screenwavemedia] Simplify (#3766) 2014-12-12 02:11:58 +01:00
Philipp Hagemeister
807962f4a1 [pornhd] Adapt to new sources scheme (Fixes #4446) 2014-12-11 23:50:25 +01:00
Jaime Marquínez Ferrándiz
9c1aa1d668 [mixcloud] Fix metadata extraction (fixes #4443) 2014-12-11 23:16:40 +01:00
Philipp Hagemeister
69f491f14e Merge remote-tracking branch 'fstirlitz/master' 2014-12-11 17:11:25 +01:00
Philipp Hagemeister
cb007f47c1 release 2014.12.11 2014-12-11 17:08:31 +01:00
Philipp Hagemeister
9abd500a74 [zdf:channel] Simplify (#4427) 2014-12-11 17:07:59 +01:00
Philipp Hagemeister
cf68bcaeff Merge remote-tracking branch 'akretz/master' 2014-12-11 16:35:45 +01:00
Philipp Hagemeister
cbe2bd914d [youtube] Amend test 2014-12-11 16:34:37 +01:00
Philipp Hagemeister
75111274ed [youtube] Do not warn if DASH manifest is missing (#4442) 2014-12-11 16:33:28 +01:00
Philipp Hagemeister
624dcebff6 [youtube] Make category optional (#4442) 2014-12-11 16:32:48 +01:00
Philipp Hagemeister
9684f17cde Merge remote-tracking branch 'akretz/youtube_fix' 2014-12-11 16:28:10 +01:00
Philipp Hagemeister
e52a40abf7 [youtube] Add test case for #4431 2014-12-11 16:28:07 +01:00
Philipp Hagemeister
0daa05961b Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-11 16:23:01 +01:00
Naglis Jonaitis
158731f83e [tvplay] Don't raise an exception if is_geo_blocked is True
Videos for which `is_geo_blocked` is True can actually be downloaded from
the country to which the video is restricted
2014-12-11 17:07:50 +02:00
Adrian Kretz
24270b0301 [youtube] Handle the case where 'url_encoded_fmt_stream_map' or 'adaptive_fmts' is the empty string (fixes #4431) 2014-12-11 16:00:46 +01:00
Naglis Jonaitis
3c1b81b957 [ntv] Rename flash_ver to flash_version in the format dict
RTMP downloader uses `flash_version`
2014-12-11 16:58:45 +02:00
Philipp Hagemeister
45c24df512 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-11 15:27:54 +01:00
Sergey M․
bf671b605e [behindkink] Remove superfluous whitespace 2014-12-11 20:09:52 +06:00
Sergey M․
09c82fbc9a [behindkink] Simplify 2014-12-11 20:06:19 +06:00
Sergey M.
3bca0409fe Merge pull request #4440 from 5moufl/behindkink-fix
[BehindKink] update
2014-12-11 19:58:31 +06:00
5moufl
d6f78a354d [BehindKink] Replace test
Old one is not accessible anymore
2014-12-11 14:26:59 +01:00
5moufl
e0b9d47387 [BehindKink] Update URL extraction 2014-12-11 14:25:26 +01:00
Philipp Hagemeister
f8795e102b [utils] Add "yesterday" as a date keyword 2014-12-11 10:29:30 +01:00
Philipp Hagemeister
4bb4a18876 [youtube] Fix imports 2014-12-11 10:08:17 +01:00
Adrian Kretz
8560c61842 [zdf] Add support for channels 2014-12-10 17:29:03 +01:00
Sergey M․
a81bbebf44 [smotri:broadcast] Fix extraction 2014-12-10 20:22:49 +06:00
Philipp Hagemeister
72e3ffeb74 release 2014.12.10.3 2014-12-10 15:19:08 +01:00
Philipp Hagemeister
2fc9f2b41d [facebook] Make thumbnail and duration optional
Fixes #4425.
Looks like both properties aren't given to us anymore. For now, just fall back to not returning them.
2014-12-10 15:18:36 +01:00
Philipp Hagemeister
5f3544baa3 release 2014.12.10.2 2014-12-10 14:39:06 +01:00
Philipp Hagemeister
da27660014 [youtube] Pass in all variables to DASH manifest (Fixes #4424) 2014-12-10 14:39:00 +01:00
Philipp Hagemeister
b8a6114309 release 2014.12.10.1 2014-12-10 13:21:49 +01:00
Philipp Hagemeister
774e208f94 [youtube] Handle missing DASH manifest (Fixes #4421, fixes #4420) 2014-12-10 13:21:24 +01:00
Philipp Hagemeister
f20b52778b release 2014.12.10 2014-12-10 12:21:40 +01:00
Jaime Marquínez Ferrándiz
83e865a370 Fix PEP8 issue E713 2014-12-09 23:11:26 +01:00
Sergey M․
b89a938687 [bet] Add extractor (Closes #4416) 2014-12-09 22:29:01 +06:00
Sergey M․
e89a2aabed [extractor/common] Add generic SMIL formats extraction routine 2014-12-09 22:28:28 +06:00
Philipp Hagemeister
f58766ce5c [extractor/common] Document ie_key in url results 2014-12-09 10:58:06 +01:00
Philipp Hagemeister
15644a40df Merge pull request #4395 from cryptonaut/issue2883
Handle --get-url with merged formats (fixes #2883)
2014-12-08 17:21:56 +01:00
Philipp Hagemeister
d4800f3c3f Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-08 17:17:31 +01:00
Philipp Hagemeister
09a5dd2d3b [bliptv] Add support for audio-only files (Fixes #4404) 2014-12-08 17:17:22 +01:00
Sergey M․
819039ee63 [tvigle] Update test and modernize 2014-12-08 22:03:02 +06:00
felix
ce36339575 add teamfourstar.com support 2014-12-08 17:01:22 +01:00
felix
684712076f add direct screenwavemedia.com URL support 2014-12-08 17:01:22 +01:00
Jaime Marquínez Ferrándiz
603c92080f [nhl] Make sure we add '_sd' before the extension (fixes #4397)
'.replace' would find the first dot in the path.
2014-12-07 11:26:07 +01:00
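The pitfall named in the commit above (replacing the first dot instead of the one before the extension) can be illustrated with a short sketch. The function name `add_sd_suffix` and the sample path are hypothetical, and this is not necessarily how the extractor itself was fixed.

```python
import posixpath


def add_sd_suffix(path):
    """Hypothetical helper: insert '_sd' right before the file extension."""
    # splitext only considers a dot in the last path component,
    # so dots in directory names are left alone.
    base, ext = posixpath.splitext(path)
    return base + "_sd" + ext


sample = "/clips/2014.12/game.mp4"
# Naive approach: str.replace with count=1 hits the first dot in the path.
naive = sample.replace(".", "_sd.", 1)
fixed = add_sd_suffix(sample)
```

Here `naive` ends up as `/clips/2014_sd.12/game.mp4`, while `fixed` is `/clips/2014.12/game_sd.mp4`.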
cryptonaut
16ae61f655 Handle --get-url with merged formats (fixes #2883)
Outputs one URL per line
2014-12-06 12:55:07 -08:00
Sergey M․
0ef4d4ab7e Credit @akretz for prosiebensat1 playlist support (#4394) 2014-12-07 01:48:45 +06:00
Sergey M․
4542535f94 Merge branch 'akretz-master' 2014-12-07 01:47:09 +06:00
Sergey M․
6a52eed80e [prosiebensat1] Improve and simplify 2014-12-07 01:46:44 +06:00
Sergey M․
acf5cbfe93 [extractor/common] Add description to playlist_result 2014-12-07 01:46:30 +06:00
Adrian Kretz
8d1c8cae9c [prosiebensat1] Fix broken tests 2014-12-06 19:21:05 +01:00
Adrian Kretz
c84890f708 [prosiebensat1] Add support for playlists (fixes #4357) 2014-12-06 19:05:22 +01:00
Sergey M․
6d0886204a [radio.de] Add support for radio.de websites (Closes #4393) 2014-12-06 23:01:52 +06:00
Sergey M․
04d02a9d57 [twitch] Add login support (#3986) 2014-12-06 21:24:20 +06:00
Grom PE
6ac4e8065a Fix utils.py for PyPy on Windows
The line
```python
from __future__ import unicode_literals
```
introduced in commit [ecc0c5ee01](ecc0c5ee01) broke youtube-dl for PyPy on Windows, making it unable to locate WinAPI functions.
Error: "TypeError: function name must be a string or integer"

Adding "b" prefix to strings with WinAPI function names fixes it.
2014-12-06 20:15:41 +07:00
Philipp Hagemeister
b82f815f37 Allow iterators for playlist result entries 2014-12-06 14:02:19 +01:00
Philipp Hagemeister
158f8cadc0 [adultswim] PEP8 2014-12-06 14:01:59 +01:00
Philipp Hagemeister
7d70cf4157 [nba] Remove unused import 2014-12-06 13:59:37 +01:00
Philipp Hagemeister
6591fdf51f [tagesschau] Look at the right place for download links 2014-12-06 13:59:10 +01:00
Philipp Hagemeister
47d7c64274 [test_utils] Make test more realistic (#4377) 2014-12-06 12:36:23 +01:00
Philipp Hagemeister
db175341c7 Credit @cryptonaut for adultswim (#4388) 2014-12-06 12:32:01 +01:00
Philipp Hagemeister
9ff6772790 [youtube] Modernize 2014-12-06 12:20:54 +01:00
Philipp Hagemeister
5f9b83944d [ffmpeg] Improve version check and call it from hls (Fixes #4377) 2014-12-06 12:14:26 +01:00
Philipp Hagemeister
f6735be4da Merge remote-tracking branch 'cryptonaut/adultswim' 2014-12-06 11:55:24 +01:00
Philipp Hagemeister
6a3e0103bb [nba] Add test for #4387 2014-12-06 11:26:17 +01:00
Philipp Hagemeister
0b5cc1983e [nba] Modernize 2014-12-06 11:15:25 +01:00
cryptonaut
1a9f8b1ad4 [nba] Improve _VALID_URL regex (fixes #4387)
Allows for optional trailing / or /index.html
2014-12-06 01:49:22 -08:00
cryptonaut
7115599121 [adultswim] Updated to work with new site format (fixes #4317) 2014-12-05 21:55:47 -08:00
Philipp Hagemeister
0df23ba9f9 release 2014.12.06.1 2014-12-06 00:48:34 +01:00
Philipp Hagemeister
58daf5ebed [youporn] Fix JSON parameter regexp (Fixes #4384) 2014-12-06 00:48:29 +01:00
Philipp Hagemeister
1a7c6c69d3 release 2014.12.06 2014-12-06 00:43:04 +01:00
Philipp Hagemeister
045c48847a [tagesschau] Add support for sendung (Fixes #4378) 2014-12-06 00:42:43 +01:00
Tithen-Firion
e638e83662 [howstuffworks] Update extractor 2014-12-05 19:46:49 +01:00
Sergey M․
90644a6843 [azubu] Add extractor (Closes #4379) 2014-12-05 22:08:30 +06:00
Tithen-Firion
d958fa9ff9 [howstuffworks] Rewrite extractor 2014-12-05 12:21:21 +01:00
Tithen-Firion
ebb6419960 [common] Split _download_json
Add ability for extractor to use _parse_json
2014-12-05 12:21:21 +01:00
Philipp Hagemeister
122c2f87c1 [tagesschau] Modernize 2014-12-05 10:59:55 +01:00
Philipp Hagemeister
a154eb3d15 release 2014.12.04.2 2014-12-04 17:43:39 +01:00
Philipp Hagemeister
81028ff9eb [xminus] Capture description (#4300) 2014-12-04 17:43:34 +01:00
Philipp Hagemeister
e8df5cee12 [minhateca] Fix duration parsing 2014-12-04 17:35:40 +01:00
Philipp Hagemeister
ab07963b5c release 2014.12.04.1 2014-12-04 17:02:23 +01:00
Philipp Hagemeister
7e26084d09 Merge branch 'master' of github.com:rg3/youtube-dl 2014-12-04 17:02:14 +01:00
Philipp Hagemeister
4349c07dd7 [minhateca] Add extractor (Fixes #4094) 2014-12-04 17:02:05 +01:00
Sergey M․
1139a54d9b [foxnews] Add extractor (Closes #4352) 2014-12-04 21:19:08 +06:00
Sergey M․
b128c9ed68 [vine:user] Add support for another URL format (Closes #4365) 2014-12-04 20:12:06 +06:00
Philipp Hagemeister
9776bc7f57 release 2014.12.04 2014-12-04 08:34:12 +01:00
Philipp Hagemeister
e703fc66c2 Merge remote-tracking branch 'origin/master'
Conflicts:
	youtube_dl/extractor/audiomack.py
2014-12-04 08:33:37 +01:00
Philipp Hagemeister
39c52bbd32 [myvidster] Enforce age limit in test 2014-12-04 08:31:55 +01:00
Philipp Hagemeister
6219802165 Merge remote-tracking branch 'zackfern/myvidster' 2014-12-04 08:30:22 +01:00
Philipp Hagemeister
8b97115358 Credit @zackfern for foxgay (#4371) 2014-12-04 08:28:41 +01:00
Philipp Hagemeister
810fb84d5e pep8 and minor beautification all around 2014-12-04 08:27:40 +01:00
Philipp Hagemeister
5f5e993dc6 [bbccouk] Remove unused import 2014-12-04 08:22:53 +01:00
Philipp Hagemeister
191cc41ba4 [foxgay] Add thumbnail to test definition 2014-12-04 08:22:20 +01:00
Jaime Marquínez Ferrándiz
abe70fa044 [audiomack] Modernize test definition 2014-12-04 08:21:29 +01:00
Philipp Hagemeister
7f142293df Merge remote-tracking branch 'zackfern/foxgay' 2014-12-04 08:20:01 +01:00
Philipp Hagemeister
d4e06d4a83 [options] Standardize mentioned configuration file location (Fixes #4367) 2014-12-04 07:57:18 +01:00
Zack Fernandes
ecd7ea1e6b [myvidster] Added support for Myvidster 2014-12-03 22:22:36 -08:00
Zack Fernandes
b92c548693 [foxgay] Initial support 2014-12-03 20:22:48 -08:00
Tithen-Firion
eecd6a467d [vgtv] Update tests 2014-12-04 01:34:24 +01:00
Philipp Hagemeister
dce2a3cf9e [break] Remove md5sum from test 2014-12-04 01:33:30 +01:00
Tithen-Firion
9095aa38ac [audiomack] Update test 2014-12-04 00:42:01 +01:00
Tithen-Firion
0403b06985 [soundcloud] Improve _VALID_URL
Add support for links from Audiomack
2014-12-04 00:42:01 +01:00
Sergey M․
de9bd74bc2 [ted] Fix type_watch links extraction 2014-12-03 21:17:11 +06:00
Jaime Marquínez Ferrándiz
233d37fb6b [brightcove] Make sure that the 'ext' variable is set (fixes #4360) 2014-12-03 13:25:49 +01:00
Philipp Hagemeister
c627f7d48c release 2014.12.03 2014-12-03 12:15:34 +01:00
Jaime Marquínez Ferrándiz
163c8babaa [nhl] Simplify 2014-12-03 00:08:26 +01:00
Jaime Marquínez Ferrándiz
6708542099 Merge branch 'master' of https://github.com/akretz/youtube-dl 2014-12-03 00:00:05 +01:00
Jaime Marquínez Ferrándiz
ea2ee40357 [nhl.com:videocenter] Don't match url with 'id=*' before 'catid' in the query
Since the order extractors are added is not defined, it would match instead of NHLIE.
2014-12-02 23:56:30 +01:00
Adrian Kretz
62d8b56655 [nhl] Support videos which don't have mp4-extension (fixes #4348) 2014-12-02 23:26:37 +01:00
Sergey M․
c492970b4b [rts] Improve _VALID_URL 2014-12-02 22:24:47 +06:00
Sergey M․
ac5633592a [24video] Add extractor (Closes #4350) 2014-12-02 22:23:23 +06:00
Sergey M․
706d7d4ee7 [YoutubeDL] Avoid negative timestamps on Windows 2014-12-02 21:18:07 +06:00
Sergey M․
752c8c9b76 [rts] Improve _VALID_URL 2014-12-02 20:53:19 +06:00
Sergey M․
b1399a144d [rts] Add support for the new URL format and extract display id (Closes #4349) 2014-12-02 20:45:43 +06:00
Jaime Marquínez Ferrándiz
05177b34a6 [rutube] Extract m3u8 formats (fixes #3984) 2014-12-01 18:20:36 +01:00
Jaime Marquínez Ferrándiz
c41a9650c3 [youtube] Extract framerate from the dash manifest
Not all videos have 60 fps, for example they can have 48 fps.
2014-12-01 17:36:12 +01:00
Philipp Hagemeister
df015c69ea release 2014.12.01 2014-12-01 17:28:34 +01:00
Naglis Jonaitis
1434bffa1f [tunein] Use station API 2014-12-01 18:10:15 +02:00
Jaime Marquínez Ferrándiz
94aa25b995 Credit @Tithen-Firion for the myspace changes (#4341) 2014-12-01 16:15:09 +01:00
Sergey M․
d128cfe393 [slideshare] Fix description extraction 2014-12-01 20:18:42 +06:00
Jaime Marquínez Ferrándiz
954f36f890 [myspace] Cleanup 2014-12-01 00:10:12 +01:00
Jaime Marquínez Ferrándiz
19e92770c9 [myspace] Replace removed test video and fix the others 2014-12-01 00:10:12 +01:00
Tithen-Firion
95c673a148 [myspace] Add extractor for albums 2014-12-01 00:10:12 +01:00
Tithen-Firion
a196a53265 [myspace] Update tests 2014-12-01 00:10:12 +01:00
Tithen-Firion
3266f0c68e [myspace] Redirect to other extractors
There are many songs just linked from Vevo/YouTube to MySpace.
Vevo example: https://myspace.com/threedaysgrace/music/song/animal-i-have-become-28400208-28218041
YouTube example: https://myspace.com/starset2/music/song/first-light-95799905-106964426
2014-12-01 00:10:12 +01:00
Tithen-Firion
1940fadd53 [myspace] Handle non-playable songs
I'm adding this because sometimes there is a song page, but you cannot play it.
Example: https://myspace.com/starset2/music/song/let-it-die-maniac-agenda-remix-bonus-track-95799916-106964439
It will be useful for downloading whole album with songs like this.
2014-12-01 00:10:11 +01:00
Tithen-Firion
03fd72d996 [myspace] Add more data to info dict
`uploader` is an artist
`playlist` is an album
2014-12-01 00:10:11 +01:00
Tithen-Firion
f2b44a2513 [myspace] Use player_url for faster download
It keeps reconnecting without it. Download time decreased from 7+ minutes to 25 seconds for me.
2014-12-01 00:10:11 +01:00
Jaime Marquínez Ferrándiz
c522adb1f0 [youtube] Add a normal age-gate test video 2014-11-30 21:45:49 +01:00
Jaime Marquínez Ferrándiz
7160532d41 [youtube] Simplify code for getting the dash manifest url
video_info now contains the 'ytplayer.config.args' dictionary
2014-11-30 21:07:50 +01:00
Jaime Marquínez Ferrándiz
4e62ebe250 [youtube] Try to extract the video_info from the webpage before requesting the 'get_video_info' pages
The YouTube player doesn't seem to use them except for embedded videos, so we can skip a network request.
But they still provide better error messages (for removed videos, for example).
2014-11-30 20:56:32 +01:00
Jaime Marquínez Ferrándiz
4472f84f0c [test/test_subtitles] Update checksum for vimeo subtitle file 2014-11-30 19:42:54 +01:00
Jaime Marquínez Ferrándiz
b766eb2707 [youtube] Update test 2014-11-30 19:18:39 +01:00
Jaime Marquínez Ferrándiz
10a404c335 [youtube] Add format 313 (fixes #4339) 2014-11-30 18:56:14 +01:00
Sergey M․
c056efa2e3 [bbccouk] Fix extraction (#4104, #4214) 2014-11-30 22:37:56 +06:00
Philipp Hagemeister
283ac8d592 Merge pull request #4338 from t0mm0/x-minus-fix
[xminus] update tkn extraction regex
2014-11-30 17:11:05 +01:00
t0mm0
313d4572ce [xminus] update tkn extraction regex 2014-11-30 16:04:04 +00:00
Jaime Marquínez Ferrándiz
42939b6129 [youtube] Use a cookie for setting the language
This way, we don't have to do an additional request
2014-11-30 00:03:59 +01:00
Jaime Marquínez Ferrándiz
37ea8164d3 [youtube] Don't confirm age when initializing
It seems that all the videos with age restriction now use the age gate method, which doesn't require any confirmation.
2014-11-29 23:46:39 +01:00
Jaime Marquínez Ferrándiz
8c810a7db3 Merge pull request #4333 from ymln/bliptv-fixes
[bliptv] Fix some videos not downloading
2014-11-29 20:20:45 +01:00
Yuriy Melnyk
248a0b890f [bliptv] Fix \n\n at the end of real_url
See https://github.com/rg3/youtube-dl/issues/3544#issuecomment-53166516
2014-11-29 19:17:56 +02:00
Yuriy Melnyk
96b7c7fe3f [bliptv] Fix resolution of lookup id in some videos
In some videos (for example, http://blip.tv/play/gbk766dkj4Yn) resolving
lookup id would fail, because page at
http://blip.tv/play/gbk766dkj4Yn.x?p=1 would have no "config.id" in
it. Fixed by requesting different URL and inspecting the URL which the
client is redirected to.
2014-11-29 19:17:56 +02:00
Sergey M․
e987e91fcc [playvid] Capture and output error message 2014-11-29 22:16:35 +06:00
Sergey M․
cb6444e197 [noco] Add support for multi language videos (Closes #4326) 2014-11-28 20:38:47 +06:00
Philipp Hagemeister
93b8a10e3b release 2014.11.27 2014-11-27 15:44:49 +01:00
Philipp Hagemeister
4207558e8b [buzzfeed] Add support for more video types (#4259) 2014-11-27 15:44:35 +01:00
Philipp Hagemeister
ad0d800fc3 release 2014.11.26.4 2014-11-26 22:53:02 +01:00
Philipp Hagemeister
e232f787f6 [buzzfeed] Add new extractor (Fixes #4259) 2014-11-26 22:52:52 +01:00
Philipp Hagemeister
155f9550c0 [test/helper] Fix newlines in output of missing test fields 2014-11-26 22:52:28 +01:00
Philipp Hagemeister
72476fcc42 release 2014.11.26.3 2014-11-26 22:08:30 +01:00
Philipp Hagemeister
29e950f7c8 release 2014.11.26.2 2014-11-26 22:06:27 +01:00
Philipp Hagemeister
7c8ea53b96 release 2014.11.26.1 2014-11-26 22:01:06 +01:00
Philipp Hagemeister
dcddc10a50 [test_unicode_literals] Arm unicode_literals check
From now on, the line

from __future__ import unicode_literals

should be contained in every single Python file lest we run into any more 2.x/3.x issues.
Going forward, we're likely to develop on 3.x only and would likely miss subtle bugs otherwise.
2014-11-26 20:01:22 +01:00
Sergey M․
a1008af412 [gorillavid] Update IE_DESC 2014-11-27 00:24:19 +06:00
Sergey M․
61c0663c1e [udemy] Generalize download json and fix login 2014-11-26 21:25:43 +06:00
Sergey M․
81a7a521c5 [gorillavid] Remove unused import 2014-11-26 21:02:46 +06:00
Sergey M․
e293711802 [udemy] Set session cookies to API requests (Closes #4124, closes #4219, closes #4308) 2014-11-26 21:00:18 +06:00
Sergey M․
ceb3367320 [gorillavid] Generalize extraction with countdown timeout and support faststream.in (Closes #4297) 2014-11-26 20:02:40 +06:00
Philipp Hagemeister
a03aaaed2e Declare Python 3.2 compatibility 2014-11-26 13:08:42 +01:00
Philipp Hagemeister
e075a44afb [tests] Remove useless u prefixes 2014-11-26 13:07:32 +01:00
Philipp Hagemeister
8865bdeb37 Remove useless u prefixes 2014-11-26 13:06:02 +01:00
Philipp Hagemeister
3aa578cad2 [ffmpeg] Modernize 2014-11-26 13:05:49 +01:00
Philipp Hagemeister
d3b5101a91 [videopremium] Modernize 2014-11-26 13:03:22 +01:00
Philipp Hagemeister
5c32110114 [videofyme] Modernize 2014-11-26 13:01:39 +01:00
Philipp Hagemeister
24144e3b8d [tvp] Modernize 2014-11-26 12:58:53 +01:00
Philipp Hagemeister
b3034f9df7 [trilulilu] Modernize 2014-11-26 12:56:43 +01:00
Philipp Hagemeister
4c6d2ff8dc [sohu] Modernize 2014-11-26 12:53:55 +01:00
Philipp Hagemeister
faf3494894 [redtube] Modernize 2014-11-26 12:52:45 +01:00
Philipp Hagemeister
535a66ef66 [muzu] Modernize 2014-11-26 12:50:37 +01:00
Philipp Hagemeister
5c40bba82f [hotnewhiphop] Modernize 2014-11-26 12:45:40 +01:00
Philipp Hagemeister
855dc479c2 [subtitles] Modernize 2014-11-26 12:43:06 +01:00
Philipp Hagemeister
0792d5634e [youtube] Remove useless u prefixes 2014-11-26 12:41:53 +01:00
Philipp Hagemeister
e91cdcae1a [appletrailers] Modernize 2014-11-26 12:41:24 +01:00
Philipp Hagemeister
27e1400f55 [aparat] Modernize 2014-11-26 12:40:51 +01:00
Philipp Hagemeister
e0938e7731 [addanime] Modernize 2014-11-26 12:40:05 +01:00
Philipp Hagemeister
b72823a0a4 [francetv] PEP8 2014-11-26 12:38:20 +01:00
Philipp Hagemeister
673cf0e773 [update] Remove useless import 2014-11-26 12:37:45 +01:00
Philipp Hagemeister
f8aace93cd [academicearth] Modernize 2014-11-26 12:35:57 +01:00
Philipp Hagemeister
80310134e0 [mplayer] Modernize 2014-11-26 12:34:52 +01:00
Philipp Hagemeister
4d2d638df4 [http] Modernize 2014-11-26 12:27:36 +01:00
Philipp Hagemeister
0e44f90e18 [hls] Remove useless u prefixes 2014-11-26 12:26:21 +01:00
Philipp Hagemeister
15938ab67a [update] Modernize 2014-11-26 12:24:57 +01:00
Philipp Hagemeister
ab4ee31eb1 [utils] remove useless u prefix 2014-11-26 11:50:22 +01:00
Philipp Hagemeister
b061ea6e9f [compat] Beautify assertion 2014-11-26 11:48:09 +01:00
Philipp Hagemeister
4aae94f9d0 [YoutubeDL] Remove incorrect documentation 2014-11-26 11:25:43 +01:00
Philipp Hagemeister
acda92f6bc Clarify --no-playlist documentation (Closes #4309) 2014-11-26 10:51:03 +01:00
Philipp Hagemeister
ddfd0f2727 release 2014.11.26 2014-11-26 10:46:12 +01:00
Philipp Hagemeister
d0720e7118 Merge branch 'master' of github.com:rg3/youtube-dl 2014-11-26 10:45:57 +01:00
Philipp Hagemeister
4e262a8838 [generic] Detect direct video links (Fixes #4149, #4313) 2014-11-26 10:44:39 +01:00
Sergey M․
b9ed3af343 [tass] Add extractor (Closes #4296) 2014-11-25 22:24:33 +06:00
Philipp Hagemeister
63c9b2c1d9 release 2014.11.25.1 2014-11-25 14:34:29 +01:00
Philipp Hagemeister
65f3a228b1 [generic] Add support for LazyYT embeds (Fixes #4306) 2014-11-25 14:34:19 +01:00
Philipp Hagemeister
3004ae2c3a Credit @t0mm0 for xminus (#4302) 2014-11-25 12:16:48 +01:00
Philipp Hagemeister
d9836a5917 release 2014.11.25 2014-11-25 09:56:52 +01:00
Philipp Hagemeister
be64b5b098 [xminus] Simplify and extend (#4302) 2014-11-25 09:54:54 +01:00
Philipp Hagemeister
c3e74731c2 [README] Mention _og_search_description (#4304)
Lots of sites do have this meta tag, so just add it to the example.
2014-11-25 09:36:27 +01:00
Philipp Hagemeister
c920d7f00d [README] Adapt code to new style
Nearly every IE will download the webpage first anyway.
2014-11-25 09:23:46 +01:00
Philipp Hagemeister
0bbf12239c Merge remote-tracking branch 't0mm0/x-minus' 2014-11-25 09:22:33 +01:00
Philipp Hagemeister
70d68eb46f Credit @MatthewRayfield for tmz (#4304) 2014-11-25 09:17:59 +01:00
Philipp Hagemeister
c553fe5d29 [tmz] Simplify (#4304) 2014-11-25 09:16:40 +01:00
Matthew Rayfield
f0c3d729d7 [tmz] Add new extractor 2014-11-25 02:54:13 -05:00
t0mm0
1cdedfee10 [XMinus] Added new extractor. 2014-11-25 03:25:28 +00:00
Philipp Hagemeister
93129d9442 release 2014.11.24 2014-11-24 22:56:43 +01:00
Philipp Hagemeister
e8c8653e9d Merge remote-tracking branch 'origin/master' 2014-11-24 22:52:04 +01:00
Philipp Hagemeister
fab89c67c5 Credit @ossi96 for bpb (#4298) 2014-11-24 22:47:49 +01:00
Philipp Hagemeister
3d960a22fa [bpb] Simplify (#4298) 2014-11-24 22:47:23 +01:00
Philipp Hagemeister
51bbb084d3 Merge remote-tracking branch 'ossi96/bpb' 2014-11-24 22:42:56 +01:00
Naglis Jonaitis
2c25a2bd29 [tunein] Add new extractor (Closes #4097) 2014-11-24 23:15:33 +02:00
Oskar Jauch
355682be01 bpb Add new extractor 2014-11-24 20:02:00 +01:00
Jaime Marquínez Ferrándiz
00e9d396ab [francetv] Use the m3u8 manifest for georestricted videos (closes #3963)
Generating the correct urls for the f4m segments seems to require a lot of work.
Also raise an error if the video is not available from your location.
2014-11-24 19:49:43 +01:00
Philipp Hagemeister
14d4e90eb1 [downloader/__init__] Define proper __all__ 2014-11-23 22:25:12 +01:00
Philipp Hagemeister
b74e86f48a Fix all PEP8 issues except E501 2014-11-23 22:21:46 +01:00
Philipp Hagemeister
3d36cea4ac [vk] PEP8 2014-11-23 22:14:27 +01:00
Philipp Hagemeister
380b822003 Remove outdated transition helper scripts 2014-11-23 22:13:03 +01:00
Philipp Hagemeister
b66e699877 [myspace] pep8 and modernization 2014-11-23 22:12:18 +01:00
Philipp Hagemeister
27f8b0994e Merge remote-tracking branch 'jtwaleson/master' 2014-11-23 22:10:26 +01:00
Philipp Hagemeister
e311b6389a Credit @daohoangson for zingmp3 (#4288) 2014-11-23 22:01:15 +01:00
Jouke Waleson
fab6d4c048 remove useless line, the result is never used 2014-11-23 22:00:35 +01:00
Philipp Hagemeister
4ffc31033e [zingmp3] Simplify and PEP8 (#4288) 2014-11-23 22:00:25 +01:00
Philipp Hagemeister
c1777d5cb3 Merge remote-tracking branch 'daohoangson/zing-mp3' 2014-11-23 21:55:51 +01:00
Jouke Waleson
9e1a5b8455 PEP8: applied even more rules 2014-11-23 21:39:15 +01:00
Philipp Hagemeister
784b6d3a9b Merge remote-tracking branch 'jtwaleson/master' 2014-11-23 21:33:31 +01:00
Dao Hoang Son
c66bdc4869 [zingmp3] Added support for songs and albums 2014-11-24 03:25:47 +07:00
Jouke Waleson
2514d2635e PEP8: E225,E227 2014-11-23 21:23:05 +01:00
Jouke Waleson
8bcc875676 PEP8: more applied 2014-11-23 21:20:46 +01:00
Jouke Waleson
5f6a1245ff PEP8 applied 2014-11-23 20:41:03 +01:00
Philipp Hagemeister
f3a3407226 [youtube] Clarify keywords 2014-11-23 20:09:10 +01:00
Sergey M․
598c218f7b [smotri] Adapt to new API and modernize 2014-11-23 23:53:41 +06:00
Naglis Jonaitis
4698b14b76 [rtlxl] Strip additional dot from video URL (#4115) 2014-11-23 13:28:09 +02:00
Philipp Hagemeister
835a22ef3f release 2014.11.23.1 2014-11-23 10:51:16 +01:00
Philipp Hagemeister
7d4111ed14 Provide guidance when called with a YouTube ID starting with a dash.
Reported at https://news.ycombinator.com/item?id=8648121
2014-11-23 10:51:09 +01:00
Philipp Hagemeister
d37cab2a9d Credit @WillSewell for vk:user (#4233) 2014-11-23 10:12:35 +01:00
Philipp Hagemeister
d16abf434a [vk] Some PEP8 love 2014-11-23 10:11:52 +01:00
Philipp Hagemeister
a8363f3ab7 [vk] Clarify test 2014-11-23 10:11:04 +01:00
Philipp Hagemeister
010cd3a3ee Merge remote-tracking branch 'WillSewell/vk-playlists' 2014-11-23 10:09:45 +01:00
Will Sewell
9262867e86 [vk.com] Added newline at the end of the file. 2014-11-21 23:25:05 +00:00
Will Sewell
b9272e8f8f [vk.com] Removed redundant log message -- this information is already being logged. 2014-11-21 23:22:52 +00:00
Will Sewell
021a0db8f7 [vk.com] Simplified the page_id acquisition by using the id matched in the URL earlier on. 2014-11-21 23:22:44 +00:00
Will Sewell
e1e8b6897b [vk.com] Updated the extract_videos_from_page function with a much simpler 1-liner. 2014-11-21 23:16:12 +00:00
Will Sewell
53d1cd1f77 [vk.com] Updated the _VALID_URL regex for the playlist IE. Removed optional m, and named the id group. 2014-11-21 23:03:31 +00:00
Will Sewell
cad985ab4d [vk.com] Updated the description to include vk.com. 2014-11-21 23:00:43 +00:00
Will Sewell
c52331f30c [vk.com] Updated a test video that has been removed, and added a comment for others to update two other test videos that are also now removed. 2014-11-21 23:00:33 +00:00
Will Sewell
42e1ff8665 [vk.com] Added upload_date variable to the test cases that still work. 2014-11-21 23:00:17 +00:00
Will Sewell
02a12f9fe6 [vk] date_added is now extracted from the video page. 2014-11-18 20:19:56 +00:00
Will Sewell
6fcd6e0e21 [vk] Updated the regex for matching user video pages. It now matches optional URL parameters too. 2014-11-18 19:34:12 +00:00
Will Sewell
469d4c8968 [vk] Added a new information extractor for pages that are a list of a user's videos on vk.com. It works in the same way as the playlist-style pages for the YT information extractors. 2014-11-17 17:53:34 -05:00
Ching Yi, Chan
b1c3a49fff apply ratelimit to f4m 2014-10-12 08:32:26 +08:00
SyxbEaEQ2
00cf122d7a [downloader/common] Fix possible negative sleep time in slow_down() 2014-08-06 20:53:04 +02:00
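A negative sleep time of the kind this commit mentions can arise when a download is already running slower than the limit. The sketch below shows one way to clamp it; the function name, parameters, and formula are assumptions for illustration and are not taken from the `slow_down()` implementation.

```python
def sleep_time_for_rate_limit(bytes_done, elapsed, rate_limit):
    """Hypothetical helper: seconds to pause so the average rate stays under the limit."""
    if not rate_limit:
        # No limit configured: never sleep.
        return 0.0
    # Time the transfer should have taken at exactly the limit rate.
    expected = bytes_done / float(rate_limit)
    # If we are already slower than the limit the difference is negative;
    # clamp it, since time.sleep() rejects negative values.
    return max(0.0, expected - elapsed)
```

For example, 2000 bytes in 1 second against a 1000 bytes/s limit yields a 1 second pause, while 500 bytes in the same time yields no pause at all.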
SyxbEaEQ2
c7667c2d7f [downloader/(common/http)] Changes calculation of the rate-limit. (Fix #2297, fix #2140, fix #595, fix #2370) 2014-07-31 03:08:24 +02:00
368 changed files with 8666 additions and 3339 deletions

2
.gitignore vendored

@@ -31,3 +31,5 @@ updates_key.pem
test/testdata
.tox
youtube-dl.zsh
.idea
.idea/*


@@ -9,7 +9,6 @@ notifications:
email:
- filippo.valsorda@gmail.com
- phihag@phihag.de
- jaime.marquinez.ferrandiz+travis@gmail.com
- yasoob.khld@gmail.com
# irc:
# channels:

18
AUTHORS

@@ -83,3 +83,21 @@ Gabriel Schubiner
xantares
Jan Matějka
Mauroy Sébastien
William Sewell
Dao Hoang Son
Oskar Jauch
Matthew Rayfield
t0mm0
Tithen-Firion
Zack Fernandes
cryptonaut
Adrian Kretz
Mathias Rav
Petr Kutalek
Will Glynn
Max Reimann
Cédric Luthi
Thijs Vermeir
Joel Leclerc
Christopher Krooss
Ondřej Caletka

136
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,136 @@
Please include the full output of the command when run with `--verbose`. The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
### Is the description of the issue itself sufficient?
We often get issue reports that we cannot really decipher. While in most cases we eventually get the required information after asking back multiple times, this poses an unnecessary drain on our resources. Many contributors, including myself, are also not native speakers, so we may misread some parts.
So please elaborate on what feature you are requesting, or what bug you want to be fixed. Make sure that it's obvious
- What the problem is
- How it could be fixed
- What your proposed solution would look like
If your report is shorter than two lines, it is almost certainly missing some of these, which makes it hard for us to respond to it. We're often too polite to close the issue outright, but the missing info makes misinterpretation likely. As a committer myself, I often get frustrated by these issues, since the only possible way for me to move forward on them is to ask for clarification over and over.
For bug reports, this means that your report should contain the *complete* output of youtube-dl when called with the -v flag. The error message you get for (most) bugs even says so, but you would not believe how many of our bug reports do not contain this information.
Site support requests **must contain an example URL**. An example URL is a URL you might want to download, like http://www.youtube.com/watch?v=BaW_jenozKc . There should be an obvious video present. Except under very special circumstances, the main page of a video service (e.g. http://www.youtube.com/ ) is *not* an example URL.
### Are you using the latest version?
Before reporting any issue, type youtube-dl -U. This should report that you're up-to-date. About 20% of the reports we receive are already fixed, but people are using outdated versions. This goes for feature requests as well.
### Is the issue already documented?
Make sure that someone has not already opened the issue you're trying to open. Search at the top of the window or at https://github.com/rg3/youtube-dl/search?type=Issues . If there is an issue, feel free to write something along the lines of "This affects me as well, with version 2015.01.01. Here is some more information on the issue: ...". While some issues may be old, a new post into them often spurs rapid activity.
### Why are existing options not enough?
Before requesting a new feature, please have a quick peek at [the list of supported options](https://github.com/rg3/youtube-dl/blob/master/README.md#synopsis). Many feature requests are for features that actually exist already! Please, absolutely do show off your work in the issue report and detail how the existing similar options do *not* solve your problem.
### Is there enough context in your bug report?
People want to solve problems, and often think they do us a favor by breaking down their larger problems (e.g. wanting to skip already downloaded files) to a specific request (e.g. requesting us to look whether the file exists before downloading the info page). However, what often happens is that they break down the problem into two steps: One simple, and one impossible (or extremely complicated one).
We are then presented with a very complicated request when the original problem could be solved far easier, e.g. by recording the downloaded video IDs in a separate file. To avoid this, you must include the greater context where it is non-obvious. In particular, every feature request that does not consist of adding support for a new site should contain a use case scenario that explains in what situation the missing feature would be useful.
### Does the issue involve one problem, and one problem only?
Some of our users seem to think there is a limit of issues they can or should open. There is no limit of issues they can or should open. While it may seem appealing to be able to dump all your issues into one ticket, that means that someone who solves one of your issues cannot mark the issue as closed. Typically, reporting a bunch of issues leads to the ticket lingering since nobody wants to attack that behemoth, until someone mercifully splits the issue into multiple ones.
In particular, every site support request issue should only pertain to services at one site (generally under a common domain, but always using the same backend technology). Do not request support for vimeo user videos, Whitehouse podcasts, and Google Plus pages in the same issue. Also, make sure that you don't post bug reports alongside feature requests. As a rule of thumb, a feature request does not include outputs of youtube-dl that are not immediately related to the feature at hand. Do not post reports of a network error alongside the request for a new video service.
### Is anyone going to need the feature?
Only post features that you (or an incapacitated friend you can personally talk to) require. Do not post features because they seem like a good idea. If they are really useful, they will be requested by someone who requires them.
### Is your question about youtube-dl?
It may sound strange, but some bug reports we receive are completely unrelated to youtube-dl and relate to a different or even the reporter's own application. Please make sure that you are actually using youtube-dl. If you are using a UI for youtube-dl, report the bug to the maintainer of the actual application providing the UI. On the other hand, if your UI for youtube-dl fails in some way you believe is related to youtube-dl, by all means, go ahead and report the bug.
# DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
To run youtube-dl as a developer, you don't need to build anything either. Simply execute
python -m youtube_dl
To run the test, simply invoke your favorite test runner, or execute a test file directly; any of the following work:
python -m unittest discover
python test/test_download.py
nosetests
If you want to create a build of youtube-dl yourself, you'll need
* python
* make
* pandoc
* zip
* nosetests
### Adding support for a new site
If you want to add support for a new site, you can follow this quick list (assuming your service is called `yourextractor`):
1. [Fork this repository](https://github.com/rg3/youtube-dl/fork)
2. Check out the source code with `git clone git@github.com:YOUR_GITHUB_USERNAME/youtube-dl.git`
3. Start a new git branch with `cd youtube-dl; git checkout -b yourextractor`
4. Start with this simple template and save it to `youtube_dl/extractor/yourextractor.py`:
```python
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor


class YourExtractorIE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?yourextractor\.com/watch/(?P<id>[0-9]+)'
    _TEST = {
        'url': 'http://yourextractor.com/watch/42',
        'md5': 'TODO: md5 sum of the first 10241 bytes of the video file (use --test)',
        'info_dict': {
            'id': '42',
            'ext': 'mp4',
            'title': 'Video title goes here',
            'thumbnail': 're:^https?://.*\.jpg$',
            # TODO more properties, either as:
            # * A value
            # * MD5 checksum; start the string with md5:
            # * A regular expression; start the string with re:
            # * Any Python type (for example int or float)
        }
    }

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        # TODO more code goes here, for example ...
        title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')

        return {
            'id': video_id,
            'title': title,
            'description': self._og_search_description(webpage),
            # TODO more properties (see youtube_dl/extractor/common.py)
        }
```
5. Add an import in [`youtube_dl/extractor/__init__.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/__init__.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should return](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/extractor/common.py#L38). Add tests and code for as many as you want.
8. If you can, check the code with [pyflakes](https://pypi.python.org/pypi/pyflakes) (a good idea) and [pep8](https://pypi.python.org/pypi/pep8) (optional, ignore E501).
9. When the tests pass, [add](http://git-scm.com/docs/git-add) the new files and [commit](http://git-scm.com/docs/git-commit) them and [push](http://git-scm.com/docs/git-push) the result, like this:
$ git add youtube_dl/extractor/__init__.py
$ git add youtube_dl/extractor/yourextractor.py
$ git commit -m '[yourextractor] Add new extractor'
$ git push origin yourextractor
10. Finally, [create a pull request](https://help.github.com/articles/creating-a-pull-request). We'll then review and merge it.
In any case, thank you very much for your contributions!

View File

@@ -1,7 +1,7 @@
all: youtube-dl README.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json CONTRIBUTING.md.tmp
cleanall: clean
rm -f youtube-dl youtube-dl.exe
@@ -35,13 +35,22 @@ install: youtube-dl youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtu
install -d $(DESTDIR)$(SYSCONFDIR)/fish/completions
install -m 644 youtube-dl.fish $(DESTDIR)$(SYSCONFDIR)/fish/completions/youtube-dl.fish
codetest:
flake8 .
test:
#nosetests --with-coverage --cover-package=youtube_dl --cover-html --verbose --processes 4 test
nosetests --verbose test
$(MAKE) codetest
ot: offlinetest
offlinetest: codetest
nosetests --verbose test --exclude test_download --exclude test_age_restriction --exclude test_subtitles --exclude test_write_annotations --exclude test_youtube_lists
tar: youtube-dl.tar.gz
.PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion
.PHONY: all clean install test tar bash-completion pypi-files zsh-completion fish-completion ot offlinetest codetest supportedsites
pypi-files: youtube-dl.bash-completion README.txt youtube-dl.1 youtube-dl.fish
@@ -56,6 +65,12 @@ youtube-dl: youtube_dl/*.py youtube_dl/*/*.py
README.md: youtube_dl/*.py youtube_dl/*/*.py
COLUMNS=80 python -m youtube_dl --help | python devscripts/make_readme.py
CONTRIBUTING.md: README.md
python devscripts/make_contributing.py README.md CONTRIBUTING.md
supportedsites:
python devscripts/make_supportedsites.py docs/supportedsites.md
README.txt: README.md
pandoc -f markdown -t plain README.md -o README.txt

View File

@@ -1,7 +1,15 @@
youtube-dl - download videos from youtube.com or other video platforms
# SYNOPSIS
**youtube-dl** [OPTIONS] URL [URL...]
- [INSTALLATION](#installation)
- [DESCRIPTION](#description)
- [OPTIONS](#options)
- [CONFIGURATION](#configuration)
- [OUTPUT TEMPLATE](#output-template)
- [VIDEO SELECTION](#video-selection)
- [FAQ](#faq)
- [DEVELOPER INSTRUCTIONS](#developer-instructions)
- [BUGS](#bugs)
- [COPYRIGHT](#copyright)
# INSTALLATION
@@ -30,10 +38,12 @@ Alternatively, refer to the developer instructions below for how to check out an
# DESCRIPTION
**youtube-dl** is a small command-line program to download videos from
YouTube.com and a few more sites. It requires the Python interpreter, version
2.6, 2.7, or 3.3+, and it is not platform specific. It should work on
2.6, 2.7, or 3.2+, and it is not platform specific. It should work on
your Unix box, on Windows or on Mac OS X. It is released to the public domain,
which means you can modify it, redistribute it or use it however you like.
youtube-dl [OPTIONS] URL [URL...]
# OPTIONS
-h, --help print this help text and exit
--version print program version and exit
@@ -65,10 +75,10 @@ which means you can modify it, redistribute it or use it however you like.
this is not possible instead of searching.
--ignore-config Do not read configuration files. When given
in the global configuration file /etc
/youtube-dl.conf: do not read the user
configuration in ~/.config/youtube-dl.conf
(%APPDATA%/youtube-dl/config.txt on
Windows)
/youtube-dl.conf: Do not read the user
configuration in ~/.config/youtube-
dl/config (%APPDATA%/youtube-dl/config.txt
on Windows)
--flat-playlist Do not extract the videos of a playlist,
only list them.
@@ -93,7 +103,8 @@ which means you can modify it, redistribute it or use it however you like.
COUNT views
--max-views COUNT Do not download any videos with more than
COUNT views
--no-playlist download only the currently playing video
--no-playlist If the URL refers to a video and a
playlist, download only the video.
--age-limit YEARS download only videos suitable for the given
age
--download-archive FILE Download only videos not listed in the
@@ -112,12 +123,12 @@ which means you can modify it, redistribute it or use it however you like.
size. By default, the buffer size is
automatically resized from an initial value
of SIZE.
--playlist-reverse Download playlist videos in reverse order
## Filesystem Options:
-a, --batch-file FILE file containing URLs to download ('-' for
stdin)
--id use only video ID in file name
-A, --auto-number number downloaded files starting from 00000
-o, --output TEMPLATE output filename template. Use %(title)s to
get the title, %(uploader)s for the
uploader name, %(uploader_id)s for the
@@ -151,6 +162,9 @@ which means you can modify it, redistribute it or use it however you like.
--restrict-filenames Restrict filenames to only ASCII
characters, and avoid "&" and spaces in
filenames
-A, --auto-number [deprecated; use -o
"%(autonumber)s-%(title)s.%(ext)s" ] number
downloaded files starting from 00000
-t, --title [deprecated] use title in file name
(default)
-l, --literal [deprecated] alias of --title
@@ -435,6 +449,14 @@ Since June 2012 (#342) youtube-dl is packed as an executable zipfile, simply unz
To run the exe you need to install first the [Microsoft Visual C++ 2008 Redistributable Package](http://www.microsoft.com/en-us/download/details.aspx?id=29).
### How can I detect whether a given URL is supported by youtube-dl?
For one, have a look at the [list of supported sites](docs/supportedsites). Note that it can sometimes happen that the site changes its URL scheme (say, from http://example.com/video/1234567 to http://example.com/v/1234567 ) and youtube-dl reports a URL of a service in that list as unsupported. In that case, simply report a bug.
It is *not* possible to detect whether a URL is supported or not. That's because youtube-dl contains a generic extractor which matches **all** URLs. You may be tempted to disable, exclude, or remove the generic extractor, but the generic extractor not only allows users to extract videos from lots of websites that embed a video from another service, but may also be used to extract video from a service that it's hosting itself. Therefore, we neither recommend nor support disabling, excluding, or removing the generic extractor.
If you want to find out whether a given URL is supported, simply call youtube-dl with it. If you get no videos back, chances are the URL is either not referring to a video or unsupported. You can find out which by examining the output (if you run youtube-dl on the console) or catching an `UnsupportedError` exception if you run it from a Python program.
# DEVELOPER INSTRUCTIONS
Most users do not need to build youtube-dl and can [download the builds](http://rg3.github.io/youtube-dl/download.html) or get them from their distribution.
@@ -492,14 +514,15 @@ If you want to add support for a new site, you can follow this quick list (assum
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
# TODO more code goes here, for example ...
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<h1>(.*?)</h1>', webpage, 'title')
return {
'id': video_id,
'title': title,
'description': self._og_search_description(webpage),
# TODO more properties (see youtube_dl/extractor/common.py)
}
```
@@ -524,23 +547,59 @@ youtube-dl makes the best effort to be a good command-line program, and thus sho
From a Python program, you can embed youtube-dl in a more powerful fashion, like this:
import youtube_dl
```python
import youtube_dl
ydl_opts = {}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
ydl_opts = {}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
Most likely, you'll want to use various options. For a list of what can be done, have a look at [youtube_dl/YoutubeDL.py](https://github.com/rg3/youtube-dl/blob/master/youtube_dl/YoutubeDL.py#L69). For a start, if you want to intercept youtube-dl's output, set a `logger` object.
Here's a more complete example of a program that outputs only errors (and a short message after the download is finished), and downloads/converts the video to an mp3 file:
```python
import youtube_dl


class MyLogger(object):
    def debug(self, msg):
        pass

    def warning(self, msg):
        pass

    def error(self, msg):
        print(msg)


def my_hook(d):
    if d['status'] == 'finished':
        print('Done downloading, now converting ...')


ydl_opts = {
    'format': 'bestaudio/best',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
    'logger': MyLogger(),
    'progress_hooks': [my_hook],
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['http://www.youtube.com/watch?v=BaW_jenozKc'])
```
# BUGS
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email.
Bugs and suggestions should be reported at: <https://github.com/rg3/youtube-dl/issues> . Unless you were prompted so or there is another pertinent reason (e.g. GitHub fails to accept the bug report), please do not send bug reports via personal email. For discussions, join us in the irc channel #youtube-dl on freenode.
Please include the full output of the command when run with `--verbose`. The output (including the first lines) contains important debugging information. Issues without the full output are often not reproducible and therefore do not get solved in short order, if ever.
For discussions, join us in the irc channel #youtube-dl on freenode.
When you submit a request, please re-read it once to avoid a couple of mistakes (you can and should use this as a checklist):
Please re-read your issue once again to avoid a couple of common mistakes (you can and should use this as a checklist):
### Is the description of the issue itself sufficient?

View File

@@ -1,4 +1,6 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import os
from os.path import dirname as dirn
import sys
@@ -9,16 +11,17 @@ import youtube_dl
BASH_COMPLETION_FILE = "youtube-dl.bash-completion"
BASH_COMPLETION_TEMPLATE = "devscripts/bash-completion.in"
def build_completion(opt_parser):
opts_flag = []
for group in opt_parser.option_groups:
for option in group.option_list:
#for every long flag
# for every long flag
opts_flag.append(option.get_opt_string())
with open(BASH_COMPLETION_TEMPLATE) as f:
template = f.read()
with open(BASH_COMPLETION_FILE, "w") as f:
#just using the special char
# just using the special char
filled_template = template.replace("{{flags}}", " ".join(opts_flag))
f.write(filled_template)

View File

@@ -142,7 +142,7 @@ def win_service_set_status(handle, status_code):
def win_service_main(service_name, real_main, argc, argv_raw):
try:
#args = [argv_raw[i].value for i in range(argc)]
# args = [argv_raw[i].value for i in range(argc)]
stop_event = threading.Event()
handler = HandlerEx(functools.partial(stop_event, win_service_handler))
h = advapi32.RegisterServiceCtrlHandlerExW(service_name, handler, None)
@@ -233,6 +233,7 @@ def rmtree(path):
#==============================================================================
class BuildError(Exception):
def __init__(self, output, code=500):
self.output = output
@@ -369,7 +370,7 @@ class Builder(PythonBuilder, GITBuilder, YoutubeDLBuilder, DownloadBuilder, Clea
class BuildHTTPRequestHandler(BaseHTTPRequestHandler):
actionDict = { 'build': Builder, 'download': Builder } # They're the same, no more caching.
actionDict = {'build': Builder, 'download': Builder} # They're the same, no more caching.
def do_GET(self):
path = urlparse.urlparse(self.path)

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python
from __future__ import unicode_literals
"""
This script employs a VERY basic heuristic ('porn' in webpage.lower()) to check

View File

@@ -23,13 +23,13 @@ EXTRA_ARGS = {
'batch-file': ['--require-parameter'],
}
def build_completion(opt_parser):
commands = []
for group in opt_parser.option_groups:
for option in group.option_list:
long_option = option.get_opt_string().strip('-')
help_msg = shell_quote([option.help])
complete_cmd = ['complete', '--command', 'youtube-dl', '--long-option', long_option]
if option._short_opts:
complete_cmd += ['--short-option', option._short_opts[0].strip('-')]

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python3
from __future__ import unicode_literals
import json
import sys

View File

@@ -1,8 +1,7 @@
#!/usr/bin/env python3
from __future__ import unicode_literals
import hashlib
import shutil
import subprocess
import tempfile
import urllib.request
import json

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python3
from __future__ import unicode_literals, with_statement
import rsa
import json
@@ -11,22 +12,23 @@ except NameError:
versions_info = json.load(open('update/versions.json'))
if 'signature' in versions_info:
del versions_info['signature']
del versions_info['signature']
print('Enter the PKCS1 private key, followed by a blank line:')
privkey = b''
while True:
try:
line = input()
except EOFError:
break
if line == '':
break
privkey += line.encode('ascii') + b'\n'
try:
line = input()
except EOFError:
break
if line == '':
break
privkey += line.encode('ascii') + b'\n'
privkey = rsa.PrivateKey.load_pkcs1(privkey)
signature = hexlify(rsa.pkcs1.sign(json.dumps(versions_info, sort_keys=True).encode('utf-8'), privkey, 'SHA-256')).decode()
print('signature: ' + signature)
versions_info['signature'] = signature
json.dump(versions_info, open('update/versions.json', 'w'), indent=4, sort_keys=True)
with open('update/versions.json', 'w') as versionsf:
json.dump(versions_info, versionsf, indent=4, sort_keys=True)

View File

@@ -1,11 +1,11 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import with_statement
from __future__ import with_statement, unicode_literals
import datetime
import glob
import io # For Python 2 compatibilty
import io # For Python 2 compatibilty
import os
import re
@@ -13,7 +13,7 @@ year = str(datetime.datetime.now().year)
for fn in glob.glob('*.html*'):
with io.open(fn, encoding='utf-8') as f:
content = f.read()
newc = re.sub(u'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', u'Copyright © 2006-' + year, content)
newc = re.sub(r'(?P<copyright>Copyright © 2006-)(?P<year>[0-9]{4})', 'Copyright © 2006-' + year, content)
if content != newc:
tmpFn = fn + '.part'
with io.open(tmpFn, 'wt', encoding='utf-8') as outf:

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python3
from __future__ import unicode_literals
import datetime
import io
@@ -73,4 +74,3 @@ atom_template = atom_template.replace('@ENTRIES@', entries_str)
with io.open('update/releases.atom', 'w', encoding='utf-8') as atom_file:
atom_file.write(atom_template)

View File

@@ -1,4 +1,5 @@
#!/usr/bin/env python3
from __future__ import unicode_literals
import sys
import os
@@ -9,19 +10,20 @@ sys.path.append(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(
import youtube_dl
def main():
with open('supportedsites.html.in', 'r', encoding='utf-8') as tmplf:
template = tmplf.read()
ie_htmls = []
for ie in sorted(youtube_dl.gen_extractors(), key=lambda i: i.IE_NAME.lower()):
for ie in youtube_dl.list_extractors(age_limit=None):
ie_html = '<b>{}</b>'.format(ie.IE_NAME)
ie_desc = getattr(ie, 'IE_DESC', None)
if ie_desc is False:
continue
elif ie_desc is not None:
ie_html += ': {}'.format(ie.IE_DESC)
if ie.working() == False:
if not ie.working():
ie_html += ' (Currently broken)'
ie_htmls.append('<li>{}</li>'.format(ie_html))

32
devscripts/make_contributing.py Executable file
View File

@@ -0,0 +1,32 @@
#!/usr/bin/env python
from __future__ import unicode_literals

import io
import optparse
import re


def main():
    parser = optparse.OptionParser(usage='%prog INFILE OUTFILE')
    options, args = parser.parse_args()
    if len(args) != 2:
        parser.error('Expected an input and an output filename')

    infile, outfile = args

    with io.open(infile, encoding='utf-8') as inf:
        readme = inf.read()

    bug_text = re.search(
        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
    dev_text = re.search(
        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
        readme).group(1)

    out = bug_text + dev_text

    with io.open(outfile, 'w', encoding='utf-8') as outf:
        outf.write(out)

if __name__ == '__main__':
    main()
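
To see what the two regular expressions in `make_contributing.py` pick out, here is a minimal sketch that applies them to a tiny, made-up README. The sample text and the `extract_contributing` helper are assumptions for illustration only. Note that the `[^\n]*` part of the BUGS pattern swallows the first paragraph after the heading, so the generated file starts with the second one.

```python
import re

# Made-up README excerpt with the same heading layout as the real one.
SAMPLE_README = (
    '# DEVELOPER INSTRUCTIONS\n\n'
    'Run python -m youtube_dl.\n\n'
    '# EMBEDDING YOUTUBE-DL\n\n'
    'Use the YoutubeDL class.\n\n'
    '# BUGS\n\n'
    'Report bugs on the issue tracker.\n\n'
    'Please include verbose output.\n\n'
    '# COPYRIGHT\n\n'
    'Public domain.\n'
)


def extract_contributing(readme):
    """Hypothetical helper: same two searches as the script, without file I/O."""
    # Everything between the BUGS heading (minus its first paragraph)
    # and the COPYRIGHT heading.
    bug_text = re.search(
        r'(?s)#\s*BUGS\s*[^\n]*\s*(.*?)#\s*COPYRIGHT', readme).group(1)
    # The DEVELOPER INSTRUCTIONS section, heading included, up to
    # the EMBEDDING YOUTUBE-DL heading.
    dev_text = re.search(
        r'(?s)(#\s*DEVELOPER INSTRUCTIONS.*?)#\s*EMBEDDING YOUTUBE-DL',
        readme).group(1)
    return bug_text + dev_text


print(extract_contributing(SAMPLE_README))
```

Because both searches call `.group(1)` directly, a README that lacks one of these headings would make the script fail with an `AttributeError` rather than write a partial file.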

View File

@@ -1,3 +1,5 @@
from __future__ import unicode_literals
import io
import sys
import re

View File

@@ -0,0 +1,45 @@
#!/usr/bin/env python
from __future__ import unicode_literals

import io
import optparse
import os
import sys


# Import youtube_dl
ROOT_DIR = os.path.join(os.path.dirname(__file__), '..')
sys.path.append(ROOT_DIR)
import youtube_dl


def main():
    parser = optparse.OptionParser(usage='%prog OUTFILE.md')
    options, args = parser.parse_args()
    if len(args) != 1:
        parser.error('Expected an output filename')

    outfile, = args

    def gen_ies_md(ies):
        for ie in ies:
            ie_md = '**{}**'.format(ie.IE_NAME)
            ie_desc = getattr(ie, 'IE_DESC', None)
            if ie_desc is False:
                continue
            if ie_desc is not None:
                ie_md += ': {}'.format(ie.IE_DESC)
            if not ie.working():
                ie_md += ' (Currently broken)'
            yield ie_md

    ies = sorted(youtube_dl.gen_extractors(), key=lambda i: i.IE_NAME.lower())
    out = '# Supported sites\n' + ''.join(
        ' - ' + md + '\n'
        for md in gen_ies_md(ies))

    with io.open(outfile, 'w', encoding='utf-8') as outf:
        outf.write(out)

if __name__ == '__main__':
    main()
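
The list-building logic of this script can be tried without importing youtube_dl. The sketch below reuses the same rules (skip extractors whose `IE_DESC` is `False`, append the description if there is one, flag extractors that are not working, sort case-insensitively by `IE_NAME`). `FakeIE` and the sample entries are hypothetical stand-ins, not real extractor classes.

```python
class FakeIE(object):
    """Hypothetical stand-in for an extractor; not part of youtube-dl."""

    def __init__(self, name, desc=None, works=True):
        self.IE_NAME = name
        if desc is not None:
            # Real extractors only define IE_DESC when they need it.
            self.IE_DESC = desc
        self._works = works

    def working(self):
        return self._works


def gen_ies_md(ies):
    for ie in ies:
        ie_md = '**{0}**'.format(ie.IE_NAME)
        ie_desc = getattr(ie, 'IE_DESC', None)
        if ie_desc is False:
            continue  # IE_DESC = False hides the extractor from the list
        if ie_desc is not None:
            ie_md += ': {0}'.format(ie_desc)
        if not ie.working():
            ie_md += ' (Currently broken)'
        yield ie_md


sample_ies = [
    FakeIE('archive.org', 'archive.org videos'),
    FakeIE('3sat'),
    FakeIE('Hidden', False),
    FakeIE('AddAnime', works=False),
]
ies = sorted(sample_ies, key=lambda i: i.IE_NAME.lower())
out = '# Supported sites\n' + ''.join(
    ' - ' + md + '\n' for md in gen_ies_md(ies))
print(out)
```

The sketch uses `'{0}'` rather than `'{}'` in the format strings so that it also runs on Python 2.6, which does not accept auto-numbered fields.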

View File

@@ -1,3 +1,4 @@
from __future__ import unicode_literals
import io
import os.path
@@ -10,8 +11,19 @@ README_FILE = os.path.join(ROOT_DIR, 'README.md')
with io.open(README_FILE, encoding='utf-8') as f:
readme = f.read()
PREFIX = '%YOUTUBE-DL(1)\n\n# NAME\n'
readme = re.sub(r'(?s)# INSTALLATION.*?(?=# DESCRIPTION)', '', readme)
PREFIX = '''%YOUTUBE-DL(1)
# NAME
youtube\-dl \- download videos from youtube.com or other video platforms
# SYNOPSIS
**youtube-dl** \[OPTIONS\] URL [URL...]
'''
readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
readme = PREFIX + readme
if sys.version_info < (3, 0):
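
The effect of the two new `re.sub` calls in this hunk can be sketched on a small, made-up README: the first drops everything before the DESCRIPTION heading (title line, table of contents, installation section), the second removes the usage line that now lives in the README body, since the man page prefix already carries a SYNOPSIS. The sample text, the shortened `PREFIX` and the `readme_to_manpage_source` helper are assumptions for illustration.

```python
import re

# Shortened, hypothetical stand-in for the PREFIX defined in the script.
PREFIX = '%YOUTUBE-DL(1)\n\n# NAME\n\n'

# Made-up README excerpt following the new layout.
SAMPLE_README = (
    'youtube-dl - download videos from youtube.com or other video platforms\n\n'
    '- [INSTALLATION](#installation)\n'
    '- [DESCRIPTION](#description)\n\n'
    '# INSTALLATION\n\n'
    'Installation notes go here.\n\n'
    '# DESCRIPTION\n'
    '**youtube-dl** is a small command-line program.\n\n'
    '    youtube-dl [OPTIONS] URL [URL...]\n\n'
    '# OPTIONS\n'
    '-h, --help\n'
)


def readme_to_manpage_source(readme):
    # Drop everything in front of the DESCRIPTION heading.
    readme = re.sub(r'(?s)^.*?(?=# DESCRIPTION)', '', readme)
    # Drop the usage line together with the whitespace in front of it.
    readme = re.sub(r'\s+youtube-dl \[OPTIONS\] URL \[URL\.\.\.\]', '', readme)
    return PREFIX + readme


print(readme_to_manpage_source(SAMPLE_README))
```

The table of contents entry `[DESCRIPTION](#description)` does not stop the first substitution early, because the lookahead requires the literal text `# DESCRIPTION`.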

View File

@@ -1,40 +0,0 @@
#!/usr/bin/env python
import sys, os
try:
import urllib.request as compat_urllib_request
except ImportError: # Python 2
import urllib2 as compat_urllib_request
sys.stderr.write(u'Hi! We changed distribution method and now youtube-dl needs to update itself one more time.\n')
sys.stderr.write(u'This will only happen once. Simply press enter to go on. Sorry for the trouble!\n')
sys.stderr.write(u'The new location of the binaries is https://github.com/rg3/youtube-dl/downloads, not the git repository.\n\n')
try:
raw_input()
except NameError: # Python 3
input()
filename = sys.argv[0]
API_URL = "https://api.github.com/repos/rg3/youtube-dl/downloads"
BIN_URL = "https://github.com/downloads/rg3/youtube-dl/youtube-dl"
if not os.access(filename, os.W_OK):
sys.exit('ERROR: no write permissions on %s' % filename)
try:
urlh = compat_urllib_request.urlopen(BIN_URL)
newcontent = urlh.read()
urlh.close()
except (IOError, OSError) as err:
sys.exit('ERROR: unable to download latest version')
try:
with open(filename, 'wb') as outf:
outf.write(newcontent)
except (IOError, OSError) as err:
sys.exit('ERROR: unable to overwrite current version')
sys.stderr.write(u'Done! Now you can run youtube-dl.\n')

View File

@@ -1,12 +0,0 @@
from distutils.core import setup
import py2exe
py2exe_options = {
"bundle_files": 1,
"compressed": 1,
"optimize": 2,
"dist_dir": '.',
"dll_excludes": ['w9xpopen.exe']
}
setup(console=['youtube-dl.py'], options={ "py2exe": py2exe_options }, zipfile=None)

View File

@@ -1,102 +0,0 @@
#!/usr/bin/env python
import sys, os
import urllib2
import json, hashlib
def rsa_verify(message, signature, key):
from struct import pack
from hashlib import sha256
from sys import version_info
def b(x):
if version_info[0] == 2: return x
else: return x.encode('latin1')
assert(type(message) == type(b('')))
block_size = 0
n = key[0]
while n:
block_size += 1
n >>= 8
signature = pow(int(signature, 16), key[1], key[0])
raw_bytes = []
while signature:
raw_bytes.insert(0, pack("B", signature & 0xFF))
signature >>= 8
signature = (block_size - len(raw_bytes)) * b('\x00') + b('').join(raw_bytes)
if signature[0:2] != b('\x00\x01'): return False
signature = signature[2:]
if not b('\x00') in signature: return False
signature = signature[signature.index(b('\x00'))+1:]
if not signature.startswith(b('\x30\x31\x30\x0D\x06\x09\x60\x86\x48\x01\x65\x03\x04\x02\x01\x05\x00\x04\x20')): return False
signature = signature[19:]
if signature != sha256(message).digest(): return False
return True
sys.stderr.write(u'Hi! We changed distribution method and now youtube-dl needs to update itself one more time.\n')
sys.stderr.write(u'This will only happen once. Simply press enter to go on. Sorry for the trouble!\n')
sys.stderr.write(u'From now on, get the binaries from http://rg3.github.com/youtube-dl/download.html, not from the git repository.\n\n')
raw_input()
filename = sys.argv[0]
UPDATE_URL = "http://rg3.github.io/youtube-dl/update/"
VERSION_URL = UPDATE_URL + 'LATEST_VERSION'
JSON_URL = UPDATE_URL + 'versions.json'
UPDATES_RSA_KEY = (0x9d60ee4d8f805312fdb15a62f87b95bd66177b91df176765d13514a0f1754bcd2057295c5b6f1d35daa6742c3ffc9a82d3e118861c207995a8031e151d863c9927e304576bc80692bc8e094896fcf11b66f3e29e04e3a71e9a11558558acea1840aec37fc396fb6b65dc81a1c4144e03bd1c011de62e3f1357b327d08426fe93, 65537)
if not os.access(filename, os.W_OK):
    sys.exit('ERROR: no write permissions on %s' % filename)
exe = os.path.abspath(filename)
directory = os.path.dirname(exe)
if not os.access(directory, os.W_OK):
    sys.exit('ERROR: no write permissions on %s' % directory)
try:
    versions_info = urllib2.urlopen(JSON_URL).read().decode('utf-8')
    versions_info = json.loads(versions_info)
except:
    sys.exit(u'ERROR: can\'t obtain versions info. Please try again later.')
if not 'signature' in versions_info:
    sys.exit(u'ERROR: the versions file is not signed or corrupted. Aborting.')
signature = versions_info['signature']
del versions_info['signature']
if not rsa_verify(json.dumps(versions_info, sort_keys=True), signature, UPDATES_RSA_KEY):
    sys.exit(u'ERROR: the versions file signature is invalid. Aborting.')
version = versions_info['versions'][versions_info['latest']]
try:
    urlh = urllib2.urlopen(version['exe'][0])
    newcontent = urlh.read()
    urlh.close()
except (IOError, OSError) as err:
    sys.exit('ERROR: unable to download latest version')
newcontent_hash = hashlib.sha256(newcontent).hexdigest()
if newcontent_hash != version['exe'][1]:
    sys.exit(u'ERROR: the downloaded file hash does not match. Aborting.')
try:
    with open(exe + '.new', 'wb') as outf:
        outf.write(newcontent)
except (IOError, OSError) as err:
    sys.exit(u'ERROR: unable to write the new version')
try:
    bat = os.path.join(directory, 'youtube-dl-updater.bat')
    b = open(bat, 'w')
    b.write("""
echo Updating youtube-dl...
ping 127.0.0.1 -n 5 -w 1000 > NUL
move /Y "%s.new" "%s"
del "%s"
\n""" %(exe, exe, bat))
    b.close()
    os.startfile(bat)
except (IOError, OSError) as err:
    sys.exit('ERROR: unable to overwrite current version')
sys.stderr.write(u'Done! Now you can run youtube-dl.\n')


@@ -1,4 +1,6 @@
#!/usr/bin/env python
from __future__ import unicode_literals
import os
from os.path import dirname as dirn
import sys

docs/supportedsites.md Normal file

@@ -0,0 +1,500 @@
# Supported sites
- **1up.com**
- **220.ro**
- **24video**
- **3sat**
- **4tube**
- **56.com**
- **5min**
- **8tracks**
- **9gag**
- **abc.net.au**
- **AcademicEarth:Course**
- **AddAnime**
- **AdobeTV**
- **AdultSwim**
- **Aftonbladet**
- **AlJazeera**
- **Allocine**
- **anitube.se**
- **AnySex**
- **Aparat**
- **AppleTrailers**
- **archive.org**: archive.org videos
- **ARD**
- **ARD:mediathek**
- **arte.tv**
- **arte.tv:+7**
- **arte.tv:concert**
- **arte.tv:creative**
- **arte.tv:ddc**
- **arte.tv:embed**
- **arte.tv:future**
- **audiomack**
- **AUEngine**
- **Azubu**
- **bambuser**
- **bambuser:channel**
- **Bandcamp**
- **Bandcamp:album**
- **bbc.co.uk**: BBC iPlayer
- **Beeg**
- **BehindKink**
- **Bet**
- **Bild**: Bild.de
- **BiliBili**
- **blinkx**
- **blip.tv:user**
- **BlipTV**
- **Bloomberg**
- **Bpb**: Bundeszentrale für politische Bildung
- **BR**: Bayerischer Rundfunk Mediathek
- **Break**
- **Brightcove**
- **BuzzFeed**
- **BYUtv**
- **Canal13cl**
- **canalc2.tv**
- **Canalplus**: canalplus.fr, piwiplus.fr and d8.tv
- **CBS**
- **CBSNews**: CBS News
- **CeskaTelevize**
- **channel9**: Channel 9
- **Chilloutzone**
- **Cinchcast**
- **Cinemassacre**
- **clipfish**
- **cliphunter**
- **Clipsyndicate**
- **Cloudy**
- **Clubic**
- **cmt.com**
- **CNET**
- **CNN**
- **CNNBlogs**
- **CollegeHumor**
- **ComCarCoff**
- **ComedyCentral**
- **ComedyCentralShows**: The Daily Show / The Colbert Report
- **CondeNast**: Condé Nast media group: Condé Nast, GQ, Glamour, Vanity Fair, Vogue, W Magazine, WIRED
- **Cracked**
- **Criterion**
- **Crunchyroll**
- **crunchyroll:playlist**
- **CSpan**: C-SPAN
- **culturebox.francetvinfo.fr**
- **dailymotion**
- **dailymotion:playlist**
- **dailymotion:user**
- **daum.net**
- **DBTV**
- **DeezerPlaylist**
- **defense.gouv.fr**
- **Discovery**
- **divxstage**: DivxStage
- **Dotsub**
- **Dropbox**
- **DrTuber**
- **DRTV**
- **Dump**
- **dvtv**: http://video.aktualne.cz/
- **EbaumsWorld**
- **eHow**
- **Einthusan**
- **eitb.tv**
- **EllenTV**
- **EllenTV:clips**
- **ElPais**: El País
- **EMPFlix**
- **Engadget**
- **Eporner**
- **Escapist**
- **EveryonesMixtape**
- **exfm**: ex.fm
- **ExpoTV**
- **ExtremeTube**
- **facebook**
- **faz.net**
- **fc2**
- **fernsehkritik.tv**
- **fernsehkritik.tv:postecke**
- **Firedrive**
- **Firstpost**
- **firsttv**: Видеоархив - Первый канал
- **Flickr**
- **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **Foxgay**
- **FoxNews**
- **france2.fr:generation-quoi**
- **FranceCulture**
- **FranceInter**
- **francetv**: France 2, 3, 4, 5 and Ô
- **francetvinfo.fr**
- **Freesound**
- **freespeech.org**
- **FreeVideo**
- **FunnyOrDie**
- **Gamekings**
- **GameOne**
- **gameone:playlist**
- **GameSpot**
- **GameStar**
- **Gametrailers**
- **GDCVault**
- **generic**: Generic downloader that works on some sites
- **GiantBomb**
- **Glide**: Glide mobile video messages (glide.me)
- **Globo**
- **GodTube**
- **GoldenMoustache**
- **Golem**
- **GorillaVid**: GorillaVid.in, daclips.in, movpod.in and fastvideo.in
- **Goshgay**
- **Grooveshark**
- **Groupon**
- **Hark**
- **Heise**
- **Helsinki**: helsinki.fi
- **HentaiStigma**
- **HornBunny**
- **HostingBulk**
- **HotNewHipHop**
- **Howcast**
- **HowStuffWorks**
- **HuffPost**: Huffington Post
- **Hypem**
- **Iconosquare**
- **ign.com**
- **imdb**: Internet Movie Database trailers
- **imdb:list**: Internet Movie Database lists
- **Ina**
- **InfoQ**
- **Instagram**
- **instagram:user**: Instagram user profile
- **InternetVideoArchive**
- **IPrima**
- **ivi**: ivi.ru
- **ivi:compilation**: ivi.ru compilations
- **Izlesene**
- **JadoreCettePub**
- **JeuxVideo**
- **Jove**
- **jpopsuki.tv**
- **Jukebox**
- **Kankan**
- **keek**
- **KeezMovies**
- **KhanAcademy**
- **KickStarter**
- **kontrtube**: KontrTube.ru - Труба зовёт
- **KrasView**: Красвью
- **Ku6**
- **la7.tv**
- **Laola1Tv**
- **lifenews**: LIFE | NEWS
- **LiveLeak**
- **livestream**
- **livestream:original**
- **lrt.lt**
- **lynda**: lynda.com videos
- **lynda:course**: lynda.com online courses
- **m6**
- **macgamestore**: MacGameStore trailers
- **mailru**: Видео@Mail.Ru
- **Malemotion**
- **MDR**
- **metacafe**
- **Metacritic**
- **Mgoon**
- **Minhateca**
- **MinistryGrid**
- **mitele.es**
- **mixcloud**
- **MLB**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex**
- **Mojvideo**
- **Moniker**: allmyvideos.net and vidspot.net
- **mooshare**: Mooshare.biz
- **Morningstar**: morningstar.com
- **Motherless**
- **Motorsport**: motorsport.com
- **MovieClips**
- **Moviezine**
- **movshare**: MovShare
- **MPORA**
- **MTV**
- **mtviggy.com**
- **mtvservices:embedded**
- **MuenchenTV**: münchen.tv
- **MusicPlayOn**
- **MusicVault**
- **muzu.tv**
- **MySpace**
- **MySpace:album**
- **MySpass**
- **myvideo**
- **MyVidster**
- **Naver**
- **NBA**
- **NBC**
- **NBCNews**
- **ndr**: NDR.de - Mediathek
- **NDTV**
- **NerdCubedFeed**
- **Newgrounds**
- **Newstube**
- **nfb**: National Film Board of Canada
- **nfl.com**
- **nhl.com**
- **nhl.com:videocenter**: NHL videocenter category
- **niconico**: ニコニコ動画
- **NiconicoPlaylist**
- **Noco**
- **Normalboots**
- **NosVideo**
- **novamov**: NovaMov
- **Nowness**
- **nowvideo**: NowVideo
- **npo.nl**
- **NRK**
- **NRKTV**
- **NTV**
- **Nuvid**
- **NYTimes**
- **ocw.mit.edu**
- **OktoberfestTV**
- **on.aol.com**
- **Ooyala**
- **orf:oe1**: Radio Österreich 1
- **orf:tvthek**: ORF TVthek
- **ORFFM4**: radio FM4
- **parliamentlive.tv**: UK parliament videos
- **Patreon**
- **PBS**
- **Phoenix**
- **Photobucket**
- **PlanetaPlay**
- **play.fm**
- **played.to**
- **Playvid**
- **plus.google**: Google Plus
- **pluzz.francetv.fr**
- **podomatic**
- **PornHd**
- **PornHub**
- **Pornotube**
- **PornoXO**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
- **Pyvideo**
- **QuickVid**
- **radio.de**
- **radiofrance**
- **Rai**
- **RBMARadio**
- **RedTube**
- **Restudy**
- **ReverbNation**
- **RingTV**
- **RottenTomatoes**
- **Roxwel**
- **RTBF**
- **RTLnow**
- **rtlxl.nl**
- **RTP**
- **RTS**: RTS.ch
- **rtve.es:alacarta**: RTVE a la carta
- **rtve.es:live**: RTVE.es live streams
- **RUHD**
- **rutube**: Rutube videos
- **rutube:channel**: Rutube channels
- **rutube:movie**: Rutube movies
- **rutube:person**: Rutube person videos
- **RUTV**: RUTV.RU
- **Sapo**: SAPO Vídeos
- **savefrom.net**
- **SBS**: sbs.com.au
- **SciVee**
- **screen.yahoo:search**: Yahoo screen search
- **Screencast**
- **ScreencastOMatic**
- **ScreenwaveMedia**
- **ServingSys**
- **Sexu**
- **SexyKarma**: Sexy Karma and Watch Indian Porn
- **Shared**
- **ShareSix**
- **Sina**
- **Slideshare**
- **Slutload**
- **smotri**: Smotri.com
- **smotri:broadcast**: Smotri.com broadcasts
- **smotri:community**: Smotri.com community videos
- **smotri:user**: Smotri.com user videos
- **Snotr**
- **Sockshare**
- **Sohu**
- **soundcloud**
- **soundcloud:playlist**
- **soundcloud:set**
- **soundcloud:user**
- **Soundgasm**
- **southpark.cc.com**
- **southpark.de**
- **Space**
- **Spankwire**
- **Spiegel**
- **Spiegel:Article**: Articles on spiegel.de
- **Spiegeltv**
- **Spike**
- **Sport5**
- **SportBox**
- **SportDeutschland**
- **SRMediathek**: Süddeutscher Rundfunk
- **stanfordoc**: Stanford Open ClassRoom
- **Steam**
- **streamcloud.eu**
- **StreamCZ**
- **SunPorno**
- **SWRMediathek**
- **Syfy**
- **SztvHu**
- **Tagesschau**
- **Tapely**
- **Tass**
- **teachertube**: teachertube.com videos
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamFour**
- **TechTalks**
- **techtv.mit.edu**
- **TED**
- **tegenlicht.vpro.nl**
- **TeleBruxelles**
- **telecinco.es**
- **TeleMB**
- **TenPlay**
- **TF1**
- **TheOnion**
- **ThePlatform**
- **TheSixtyOne**
- **ThisAV**
- **THVideo**
- **THVideoPlaylist**
- **tinypic**: tinypic.com videos
- **tlc.com**
- **tlc.de**
- **TMZ**
- **TNAFlix**
- **tou.tv**
- **Toypics**: Toypics user profile
- **ToypicsUser**: Toypics user profile
- **TrailerAddict** (Currently broken)
- **Trilulilu**
- **TruTube**
- **Tube8**
- **Tudou**
- **Tumblr**
- **TuneIn**
- **Turbo**
- **Tutv**
- **tv.dfb.de**
- **tvigle**: Интернет-телевидение Tvigle.ru
- **tvp.pl**
- **TVPlay**: TV3Play and related services
- **Twitch**
- **Ubu**
- **udemy**
- **udemy:course**
- **Unistra**
- **Urort**: NRK P3 Urørt
- **ustream**
- **ustream:channel**
- **Vbox7**
- **VeeHD**
- **Veoh**
- **Vesti**: Вести.Ru
- **Vevo**
- **VGTV**
- **vh1.com**
- **Vice**
- **Viddler**
- **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoBam**
- **VideoDetective**
- **videofy.me**
- **videolectures.net**
- **VideoMega**
- **VideoPremium**
- **VideoTt**: video.tt - Your True Tube
- **videoweed**: VideoWeed
- **Vidme**
- **Vidzi**
- **viki**
- **vimeo**
- **vimeo:album**
- **vimeo:channel**
- **vimeo:group**
- **vimeo:likes**: Vimeo user likes
- **vimeo:review**: Review pages on vimeo
- **vimeo:user**
- **vimeo:watchlater**: Vimeo watch later list, "vimeowatchlater" keyword (requires authentication)
- **Vimple**: Vimple.ru
- **Vine**
- **vine:user**
- **vk.com**
- **vk.com:user-videos**: vk.com:All of a user's videos
- **Vodlocker**
- **Vporn**
- **VRT**
- **vube**: Vube.com
- **VuClip**
- **vulture.com**
- **Walla**
- **WashingtonPost**
- **wat.tv**
- **WayOfTheMaster**
- **WDR**
- **wdr:mobile**
- **WDRMaus**: Sendung mit der Maus
- **Weibo**
- **Wimp**
- **Wistia**
- **WorldStarHipHop**
- **wrzuta.pl**
- **XBef**
- **XboxClips**
- **XHamster**
- **XMinus**
- **XNXX**
- **XTube**
- **XTubeUser**: XTube user profile
- **XVideos**
- **Yahoo**: Yahoo screen and movies
- **YesJapan**
- **Ynet**
- **YouJizz**
- **Youku**
- **YouPorn**
- **YourUpload**
- **youtube**: YouTube.com
- **youtube:channel**: YouTube.com channels
- **youtube:favorites**: YouTube.com favourite videos, ":ytfav" for short (requires authentication)
- **youtube:history**: Youtube watch history, ":ythistory" for short (requires authentication)
- **youtube:playlist**: YouTube.com playlists
- **youtube:recommended**: YouTube.com recommended videos, ":ytrec" for short (requires authentication)
- **youtube:search**: YouTube.com searches
- **youtube:search:date**: YouTube.com searches, newest videos first
- **youtube:search_url**: YouTube.com search URLs
- **youtube:show**: YouTube.com (multi-season) shows
- **youtube:subscriptions**: YouTube.com subscriptions feed, "ytsubs" keyword (requires authentication)
- **youtube:toplist**: YouTube.com top lists, "yttoplist:{channel}:{list title}" (Example: "yttoplist:music:Top Tracks")
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
- **youtube:watch_later**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **ZDF**
- **ZDFChannel**
- **zingmp3:album**: mp3.zing.vn albums
- **zingmp3:song**: mp3.zing.vn songs


@@ -1,2 +1,6 @@
[wheel]
universal = True
[flake8]
exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,setup.py,build
ignore = E501


@@ -4,7 +4,6 @@
from __future__ import print_function
import os.path
import pkg_resources
import warnings
import sys
@@ -103,7 +102,9 @@ setup(
"Programming Language :: Python :: 2.6",
"Programming Language :: Python :: 2.7",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.3"
"Programming Language :: Python :: 3.2",
"Programming Language :: Python :: 3.3",
"Programming Language :: Python :: 3.4",
],
**params


@@ -59,7 +59,7 @@ class FakeYDL(YoutubeDL):
params = get_params(override=override)
super(FakeYDL, self).__init__(params, auto_init=False)
self.result = []
def to_screen(self, s, skip_eol=None):
print(s)
@@ -72,32 +72,24 @@ class FakeYDL(YoutubeDL):
def expect_warning(self, regex):
# Silence an expected warning matching a regex
old_report_warning = self.report_warning
def report_warning(self, message):
if re.match(regex, message): return
if re.match(regex, message):
return
old_report_warning(message)
self.report_warning = types.MethodType(report_warning, self)
def gettestcases(include_onlymatching=False):
for ie in youtube_dl.extractor.gen_extractors():
t = getattr(ie, '_TEST', None)
if t:
assert not hasattr(ie, '_TESTS'), \
'%s has _TEST and _TESTS' % type(ie).__name__
tests = [t]
else:
tests = getattr(ie, '_TESTS', [])
for t in tests:
if not include_onlymatching and t.get('only_matching', False):
continue
t['name'] = type(ie).__name__[:-len('IE')]
yield t
for tc in ie.get_testcases(include_onlymatching):
yield tc
md5 = lambda s: hashlib.md5(s.encode('utf-8')).hexdigest()
def expect_info_dict(self, expected_dict, got_dict):
def expect_info_dict(self, got_dict, expected_dict):
for info_field, expected in expected_dict.items():
if isinstance(expected, compat_str) and expected.startswith('re:'):
got = got_dict.get(info_field)
@@ -114,14 +106,14 @@ def expect_info_dict(self, expected_dict, got_dict):
elif isinstance(expected, type):
got = got_dict.get(info_field)
self.assertTrue(isinstance(got, expected),
'Expected type %r for field %s, but got value %r of type %r' % (expected, info_field, got, type(got)))
'Expected type %r for field %s, but got value %r of type %r' % (expected, info_field, got, type(got)))
else:
if isinstance(expected, compat_str) and expected.startswith('md5:'):
got = 'md5:' + md5(got_dict.get(info_field))
else:
got = got_dict.get(info_field)
self.assertEqual(expected, got,
'invalid value for field %s, expected %r, got %r' % (info_field, expected, got))
'invalid value for field %s, expected %r, got %r' % (info_field, expected, got))
# Check for the presence of mandatory fields
if got_dict.get('_type') != 'playlist':
@@ -133,13 +125,13 @@ def expect_info_dict(self, expected_dict, got_dict):
# Are checkable fields missing from the test case definition?
test_info_dict = dict((key, value if not isinstance(value, compat_str) or len(value) < 250 else 'md5:' + md5(value))
for key, value in got_dict.items()
if value and key in ('title', 'description', 'uploader', 'upload_date', 'timestamp', 'uploader_id', 'location'))
for key, value in got_dict.items()
if value and key in ('title', 'description', 'uploader', 'upload_date', 'timestamp', 'uploader_id', 'location'))
missing_keys = set(test_info_dict.keys()) - set(expected_dict.keys())
if missing_keys:
def _repr(v):
if isinstance(v, compat_str):
return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'")
return "'%s'" % v.replace('\\', '\\\\').replace("'", "\\'").replace('\n', '\\n')
else:
return repr(v)
info_dict_str = ''.join(
@@ -159,7 +151,9 @@ def assertRegexpMatches(self, text, regexp, msg=None):
else:
m = re.match(regexp, text)
if not m:
note = 'Regexp didn\'t match: %r not found in %r' % (regexp, text)
note = 'Regexp didn\'t match: %r not found' % (regexp)
if len(text) < 1000:
note += ' in %r' % text
if msg is None:
msg = note
else:


@@ -218,7 +218,7 @@ class TestFormatSelection(unittest.TestCase):
# 3D
'85', '84', '102', '83', '101', '82', '100',
# Dash video
'138', '137', '248', '136', '247', '135', '246',
'137', '248', '136', '247', '135', '246',
'245', '244', '134', '243', '133', '242', '160',
# Dash audio
'141', '172', '140', '171', '139',
@@ -266,6 +266,7 @@ class TestFormatSelection(unittest.TestCase):
'ext': 'mp4',
'width': None,
}
def fname(templ):
ydl = YoutubeDL({'outtmpl': templ})
return ydl.prepare_filename(info)


@@ -32,19 +32,19 @@ class TestAllURLsMatching(unittest.TestCase):
def test_youtube_playlist_matching(self):
assertPlaylist = lambda url: self.assertMatch(url, ['youtube:playlist'])
assertPlaylist('ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q') #585
assertPlaylist('UUBABnxM4Ar9ten8Mdjj1j0Q') # 585
assertPlaylist('PL63F0C78739B09958')
assertPlaylist('https://www.youtube.com/playlist?list=UUBABnxM4Ar9ten8Mdjj1j0Q')
assertPlaylist('https://www.youtube.com/course?list=ECUl4u3cNGP61MdtwGTqZA0MreSaDybji8')
assertPlaylist('https://www.youtube.com/playlist?list=PLwP_SiAcdui0KVebT0mU9Apz359a4ubsC')
assertPlaylist('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012') #668
assertPlaylist('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012') # 668
self.assertFalse('youtube:playlist' in self.matching_ies('PLtS2H6bU1M'))
# Top tracks
assertPlaylist('https://www.youtube.com/playlist?list=MCUS.20142101')
def test_youtube_matching(self):
self.assertTrue(YoutubeIE.suitable('PLtS2H6bU1M'))
self.assertFalse(YoutubeIE.suitable('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012')) #668
self.assertFalse(YoutubeIE.suitable('https://www.youtube.com/watch?v=AV6J6_AeFEQ&playnext=1&list=PL4023E734DA416012')) # 668
self.assertMatch('http://youtu.be/BaW_jenozKc', ['youtube'])
self.assertMatch('http://www.youtube.com/v/BaW_jenozKc', ['youtube'])
self.assertMatch('https://youtube.googleapis.com/v/BaW_jenozKc', ['youtube'])


@@ -40,18 +40,22 @@ from youtube_dl.extractor import get_info_extractor
RETRIES = 3
class YoutubeDL(youtube_dl.YoutubeDL):
def __init__(self, *args, **kwargs):
self.to_stderr = self.to_screen
self.processed_info_dicts = []
super(YoutubeDL, self).__init__(*args, **kwargs)
def report_warning(self, message):
# Don't accept warnings during tests
raise ExtractorError(message)
def process_info(self, info_dict):
self.processed_info_dicts.append(info_dict)
return super(YoutubeDL, self).process_info(info_dict)
def _file_md5(fn):
with open(fn, 'rb') as f:
return hashlib.md5(f.read()).hexdigest()
@@ -61,10 +65,13 @@ defs = gettestcases()
class TestDownload(unittest.TestCase):
maxDiff = None
def setUp(self):
self.defs = defs
### Dynamically generate tests
# Dynamically generate tests
def generator(test_case):
def test_template(self):
@@ -90,7 +97,7 @@ def generator(test_case):
return
for other_ie in other_ies:
if not other_ie.working():
print_skipping(u'test depends on %sIE, marked as not WORKING' % other_ie.ie_key())
print_skipping('test depends on %sIE, marked as not WORKING' % other_ie.ie_key())
return
params = get_params(test_case.get('params', {}))
@@ -101,6 +108,7 @@ def generator(test_case):
ydl = YoutubeDL(params, auto_init=False)
ydl.add_default_info_extractors()
finished_hook_called = set()
def _hook(status):
if status['status'] == 'finished':
finished_hook_called.add(status['filename'])
@@ -111,6 +119,7 @@ def generator(test_case):
return tc.get('file') or ydl.prepare_filename(tc.get('info_dict', {}))
res_dict = None
def try_rm_tcs_files(tcs=None):
if tcs is None:
tcs = test_cases
@@ -134,7 +143,7 @@ def generator(test_case):
raise
if try_num == RETRIES:
report_warning(u'Failed due to network errors, skipping...')
report_warning('Failed due to network errors, skipping...')
return
print('Retrying: {0} failed tries\n\n##########\n\n'.format(try_num))
@@ -146,7 +155,7 @@ def generator(test_case):
if is_playlist:
self.assertEqual(res_dict['_type'], 'playlist')
self.assertTrue('entries' in res_dict)
expect_info_dict(self, test_case.get('info_dict', {}), res_dict)
expect_info_dict(self, res_dict, test_case.get('info_dict', {}))
if 'playlist_mincount' in test_case:
assertGreaterEqual(
@@ -195,7 +204,7 @@ def generator(test_case):
with io.open(info_json_fn, encoding='utf-8') as infof:
info_dict = json.load(infof)
expect_info_dict(self, tc.get('info_dict', {}), info_dict)
expect_info_dict(self, info_dict, tc.get('info_dict', {}))
finally:
try_rm_tcs_files()
if is_playlist and res_dict is not None and res_dict.get('entries'):
@@ -206,7 +215,7 @@ def generator(test_case):
return test_template
### And add them to TestDownload
# And add them to TestDownload
for n, test_case in enumerate(defs):
test_method = generator(test_case)
tname = 'test_' + str(test_case['name'])


@@ -17,12 +17,14 @@ from youtube_dl.extractor import (
TEDIE,
VimeoIE,
WallaIE,
CeskaTelevizeIE,
)
class BaseTestSubtitles(unittest.TestCase):
url = None
IE = None
def setUp(self):
self.DL = FakeYDL()
self.ie = self.IE(self.DL)
@@ -87,6 +89,14 @@ class TestYoutubeSubtitles(BaseTestSubtitles):
subtitles = self.getSubtitles()
self.assertTrue(subtitles['it'] is not None)
def test_youtube_translated_subtitles(self):
# This video has a subtitles track, which can be translated
self.url = 'Ky9eprVWzlI'
self.DL.params['writeautomaticsub'] = True
self.DL.params['subtitleslangs'] = ['it']
subtitles = self.getSubtitles()
self.assertTrue(subtitles['it'] is not None)
def test_youtube_nosubtitles(self):
self.DL.expect_warning('video doesn\'t have subtitles')
self.url = 'n5BB19UTcdA'
@@ -237,7 +247,7 @@ class TestVimeoSubtitles(BaseTestSubtitles):
def test_subtitles(self):
self.DL.params['writesubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(md5(subtitles['en']), '8062383cf4dec168fc40a088aa6d5888')
self.assertEqual(md5(subtitles['en']), '26399116d23ae3cf2c087cea94bc43b4')
def test_subtitles_lang(self):
self.DL.params['writesubtitles'] = True
@@ -308,5 +318,32 @@ class TestWallaSubtitles(BaseTestSubtitles):
self.assertEqual(len(subtitles), 0)
class TestCeskaTelevizeSubtitles(BaseTestSubtitles):
url = 'http://www.ceskatelevize.cz/ivysilani/10600540290-u6-uzasny-svet-techniky'
IE = CeskaTelevizeIE
def test_list_subtitles(self):
self.DL.expect_warning('Automatic Captions not supported by this server')
self.DL.params['listsubtitles'] = True
info_dict = self.getInfoDict()
self.assertEqual(info_dict, None)
def test_allsubtitles(self):
self.DL.expect_warning('Automatic Captions not supported by this server')
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['cs']))
self.assertEqual(md5(subtitles['cs']), '9bf52d9549533c32c427e264bf0847d4')
def test_nosubtitles(self):
self.DL.expect_warning('video doesn\'t have subtitles')
self.url = 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220'
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(len(subtitles), 0)
if __name__ == '__main__':
unittest.main()


@@ -1,22 +1,28 @@
from __future__ import unicode_literals
import io
# Allow direct execution
import os
import re
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
import io
import re
rootDir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
IGNORED_FILES = [
'setup.py', # http://bugs.python.org/issue13943
'conf.py',
'buildserver.py',
]
from test.helper import assertRegexpMatches
class TestUnicodeLiterals(unittest.TestCase):
def test_all_files(self):
print('Skipping this test (not yet fully implemented)')
return
for dirpath, _, filenames in os.walk(rootDir):
for basename in filenames:
if not basename.endswith('.py'):
@@ -30,10 +36,11 @@ class TestUnicodeLiterals(unittest.TestCase):
if "'" not in code and '"' not in code:
continue
imps = 'from __future__ import unicode_literals'
self.assertTrue(
imps in code,
' %s missing in %s' % (imps, fn))
assertRegexpMatches(
self,
code,
r'(?:(?:#.*?|\s*)\n)*from __future__ import (?:[a-z_]+,\s*)*unicode_literals',
'unicode_literals import missing in %s' % fn)
m = re.search(r'(?<=\s)u[\'"](?!\)|,|$)', code)
if m is not None:


@@ -16,37 +16,41 @@ import json
import xml.etree.ElementTree
from youtube_dl.utils import (
age_restricted,
args_to_str,
clean_html,
DateRange,
detect_exe_version,
encodeFilename,
escape_rfc3986,
escape_url,
find_xpath_attr,
fix_xml_ampersands,
orderedSet,
OnDemandPagedList,
InAdvancePagedList,
intlist_to_bytes,
js_to_json,
limit_length,
OnDemandPagedList,
orderedSet,
parse_duration,
parse_filesize,
parse_iso8601,
read_batch_urls,
sanitize_filename,
shell_quote,
smuggle_url,
str_to_int,
strip_jsonp,
struct_unpack,
timeconvert,
unescapeHTML,
unified_strdate,
unsmuggle_url,
uppercase_escape,
url_basename,
urlencode_postdata,
version_tuple,
xpath_with_ns,
parse_iso8601,
strip_jsonp,
uppercase_escape,
limit_length,
escape_rfc3986,
escape_url,
js_to_json,
get_filesystem_encoding,
intlist_to_bytes,
)
@@ -119,16 +123,16 @@ class TestUtil(unittest.TestCase):
self.assertEqual(orderedSet([1, 1, 2, 3, 4, 4, 5, 6, 7, 3, 5]), [1, 2, 3, 4, 5, 6, 7])
self.assertEqual(orderedSet([]), [])
self.assertEqual(orderedSet([1]), [1])
#keep the list ordered
# keep the list ordered
self.assertEqual(orderedSet([135, 1, 1, 1]), [135, 1])
def test_unescape_html(self):
self.assertEqual(unescapeHTML('%20;'), '%20;')
self.assertEqual(
unescapeHTML('&eacute;'), 'é')
def test_daterange(self):
_20century = DateRange("19000101","20000101")
_20century = DateRange("19000101", "20000101")
self.assertFalse("17890714" in _20century)
_ac = DateRange("00010101")
self.assertTrue("19690721" in _ac)
@@ -142,6 +146,9 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('2012/10/11 01:56:38 +0000'), '20121011')
self.assertEqual(unified_strdate('1968-12-10'), '19681210')
self.assertEqual(unified_strdate('28/01/2014 21:00:00 +0100'), '20140128')
self.assertEqual(
unified_strdate('11/26/2014 11:30:00 AM PST', day_first=False),
'20141126')
def test_find_xpath_attr(self):
testxml = '''<root>
@@ -170,7 +177,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(find('media:song/url').text, 'http://server.com/download.mp3')
def test_smuggle_url(self):
data = {u"ö": u"ö", u"abc": [3]}
data = {"ö": "ö", "abc": [3]}
url = 'https://foo.bar/baz?x=y#a'
smug_url = smuggle_url(url, data)
unsmug_url, unsmug_data = unsmuggle_url(smug_url)
@@ -219,6 +226,9 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_duration('0s'), 0)
self.assertEqual(parse_duration('01:02:03.05'), 3723.05)
self.assertEqual(parse_duration('T30M38S'), 1838)
self.assertEqual(parse_duration('5 s'), 5)
self.assertEqual(parse_duration('3 min'), 180)
self.assertEqual(parse_duration('2.5 hours'), 9000)
def test_fix_xml_ampersands(self):
self.assertEqual(
@@ -361,5 +371,44 @@ class TestUtil(unittest.TestCase):
intlist_to_bytes([0, 1, 127, 128, 255]),
b'\x00\x01\x7f\x80\xff')
def test_args_to_str(self):
self.assertEqual(
args_to_str(['foo', 'ba/r', '-baz', '2 be', '']),
'foo ba/r -baz \'2 be\' \'\''
)
def test_parse_filesize(self):
self.assertEqual(parse_filesize(None), None)
self.assertEqual(parse_filesize(''), None)
self.assertEqual(parse_filesize('91 B'), 91)
self.assertEqual(parse_filesize('foobar'), None)
self.assertEqual(parse_filesize('2 MiB'), 2097152)
self.assertEqual(parse_filesize('5 GB'), 5000000000)
self.assertEqual(parse_filesize('1.2Tb'), 1200000000000)
self.assertEqual(parse_filesize('1,24 KB'), 1240)
def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))
self.assertEqual(version_tuple('10.1-6'), (10, 1, 6)) # avconv style
def test_detect_exe_version(self):
self.assertEqual(detect_exe_version('''ffmpeg version 1.2.1
built on May 27 2013 08:37:26 with gcc 4.7 (Debian 4.7.3-4)
configuration: --prefix=/usr --extra-'''), '1.2.1')
self.assertEqual(detect_exe_version('''ffmpeg version N-63176-g1fb4685
built on May 15 2014 22:09:06 with gcc 4.8.2 (GCC)'''), 'N-63176-g1fb4685')
self.assertEqual(detect_exe_version('''X server found. dri2 connection failed!
Trying to open render node...
Success at /dev/dri/renderD128.
ffmpeg version 2.4.4 Copyright (c) 2000-2014 the FFmpeg ...'''), '2.4.4')
def test_age_restricted(self):
self.assertFalse(age_restricted(None, 10)) # unrestricted content
self.assertFalse(age_restricted(1, None)) # unrestricted policy
self.assertFalse(age_restricted(8, 10))
self.assertTrue(age_restricted(18, 14))
self.assertFalse(age_restricted(18, 18))
if __name__ == '__main__':
unittest.main()


@@ -1,5 +1,6 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
@@ -31,19 +32,18 @@ params = get_params({
})
TEST_ID = 'gr51aVj-mLg'
ANNOTATIONS_FILE = TEST_ID + '.flv.annotations.xml'
EXPECTED_ANNOTATIONS = ['Speech bubble', 'Note', 'Title', 'Spotlight', 'Label']
class TestAnnotations(unittest.TestCase):
def setUp(self):
# Clear old files
self.tearDown()
def test_info_json(self):
expected = list(EXPECTED_ANNOTATIONS) #Two annotations could have the same text.
expected = list(EXPECTED_ANNOTATIONS) # Two annotations could have the same text.
ie = youtube_dl.extractor.YoutubeIE()
ydl = YoutubeDL(params)
ydl.add_info_extractor(ie)
@@ -51,7 +51,7 @@ class TestAnnotations(unittest.TestCase):
self.assertTrue(os.path.exists(ANNOTATIONS_FILE))
annoxml = None
with io.open(ANNOTATIONS_FILE, 'r', encoding='utf-8') as annof:
annoxml = xml.etree.ElementTree.parse(annof)
annoxml = xml.etree.ElementTree.parse(annof)
self.assertTrue(annoxml is not None, 'Failed to parse annotations XML')
root = annoxml.getroot()
self.assertEqual(root.tag, 'document')
@@ -59,18 +59,17 @@ class TestAnnotations(unittest.TestCase):
self.assertEqual(annotationsTag.tag, 'annotations')
annotations = annotationsTag.findall('annotation')
#Not all the annotations have TEXT children and the annotations are returned unsorted.
# Not all the annotations have TEXT children and the annotations are returned unsorted.
for a in annotations:
self.assertEqual(a.tag, 'annotation')
if a.get('type') == 'text':
textTag = a.find('TEXT')
text = textTag.text
self.assertTrue(text in expected) #assertIn only added in python 2.7
#remove the first occurrence, there could be more than one annotation with the same text
expected.remove(text)
#We should have seen (and removed) all the expected annotation texts.
self.assertEqual(a.tag, 'annotation')
if a.get('type') == 'text':
textTag = a.find('TEXT')
text = textTag.text
self.assertTrue(text in expected) # assertIn only added in python 2.7
# remove the first occurrence, there could be more than one annotation with the same text
expected.remove(text)
# We should have seen (and removed) all the expected annotation texts.
self.assertEqual(len(expected), 0, 'Not all expected annotations were found.')
def tearDown(self):
try_rm(ANNOTATIONS_FILE)


@@ -1,75 +0,0 @@
#!/usr/bin/env python
# coding: utf-8
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import get_params
import io
import json
import youtube_dl.YoutubeDL
import youtube_dl.extractor
class YoutubeDL(youtube_dl.YoutubeDL):
def __init__(self, *args, **kwargs):
super(YoutubeDL, self).__init__(*args, **kwargs)
self.to_stderr = self.to_screen
params = get_params({
'writeinfojson': True,
'skip_download': True,
'writedescription': True,
})
TEST_ID = 'BaW_jenozKc'
INFO_JSON_FILE = TEST_ID + '.info.json'
DESCRIPTION_FILE = TEST_ID + '.mp4.description'
EXPECTED_DESCRIPTION = u'''test chars: "'/\ä↭𝕐
test URL: https://github.com/rg3/youtube-dl/issues/1892
This is a test video for youtube-dl.
For more information, contact phihag@phihag.de .'''
class TestInfoJSON(unittest.TestCase):
def setUp(self):
# Clear old files
self.tearDown()
def test_info_json(self):
ie = youtube_dl.extractor.YoutubeIE()
ydl = YoutubeDL(params)
ydl.add_info_extractor(ie)
ydl.download([TEST_ID])
self.assertTrue(os.path.exists(INFO_JSON_FILE))
with io.open(INFO_JSON_FILE, 'r', encoding='utf-8') as jsonf:
jd = json.load(jsonf)
self.assertEqual(jd['upload_date'], u'20121002')
self.assertEqual(jd['description'], EXPECTED_DESCRIPTION)
self.assertEqual(jd['id'], TEST_ID)
self.assertEqual(jd['extractor'], 'youtube')
self.assertEqual(jd['title'], u'''youtube-dl test video "'/\ä↭𝕐''')
self.assertEqual(jd['uploader'], 'Philipp Hagemeister')
self.assertTrue(os.path.exists(DESCRIPTION_FILE))
with io.open(DESCRIPTION_FILE, 'r', encoding='utf-8') as descf:
descr = descf.read()
self.assertEqual(descr, EXPECTED_DESCRIPTION)
def tearDown(self):
if os.path.exists(INFO_JSON_FILE):
os.remove(INFO_JSON_FILE)
if os.path.exists(DESCRIPTION_FILE):
os.remove(DESCRIPTION_FILE)
if __name__ == '__main__':
unittest.main()


@@ -1,4 +1,5 @@
#!/usr/bin/env python
from __future__ import unicode_literals
# Allow direct execution
import os
@@ -12,10 +13,6 @@ from test.helper import FakeYDL
from youtube_dl.extractor import (
YoutubePlaylistIE,
YoutubeIE,
YoutubeChannelIE,
YoutubeShowIE,
YoutubeTopListIE,
YoutubeSearchURLIE,
)
@@ -31,7 +28,7 @@ class TestYoutubeLists(unittest.TestCase):
result = ie.extract('https://www.youtube.com/watch?v=FXxLjLQi3Fg&list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re')
self.assertEqual(result['_type'], 'url')
self.assertEqual(YoutubeIE().extract_id(result['url']), 'FXxLjLQi3Fg')
def test_youtube_course(self):
dl = FakeYDL()
ie = YoutubePlaylistIE(dl)


@@ -7,6 +7,7 @@ import collections
import datetime
import errno
import io
import itertools
import json
import locale
import os
@@ -26,6 +27,7 @@ from .compat import (
compat_cookiejar,
compat_expanduser,
compat_http_client,
compat_kwargs,
compat_str,
compat_urllib_error,
compat_urllib_request,
@@ -60,12 +62,18 @@ from .utils import (
write_string,
YoutubeDLHandler,
prepend_extension,
args_to_str,
age_restricted,
)
from .cache import Cache
from .extractor import get_info_extractor, gen_extractors
from .downloader import get_suitable_downloader
from .downloader.rtmp import rtmpdump_version
from .postprocessor import FFmpegMergerPP, FFmpegPostProcessor
from .postprocessor import (
FFmpegMergerPP,
FFmpegPostProcessor,
get_postprocessor,
)
from .version import __version__
@@ -114,7 +122,7 @@ class YoutubeDL(object):
dump_single_json: Force printing the info_dict of the whole playlist
(or video) as a single JSON line.
simulate: Do not download the video files.
format: Video format code.
format: Video format code. See options.py for more information.
format_limit: Highest quality format to try.
outtmpl: Template for output names.
restrictfilenames: Do not allow "&" and spaces in file names
@@ -122,6 +130,7 @@ class YoutubeDL(object):
nooverwrites: Prevent overwriting files.
playliststart: Playlist item to start at.
playlistend: Playlist item to end at.
playlistreverse: Download playlist items in reverse order.
matchtitle: Download only matching titles.
rejecttitle: Reject downloads for matching titles.
logger: Log messages to a logging.Logger instance.
@@ -173,6 +182,28 @@ class YoutubeDL(object):
extract_flat: Do not resolve URLs, return the immediate result.
Pass in 'in_playlist' to only show this behavior for
playlist items.
postprocessors: A list of dictionaries, each with an entry
* key: The name of the postprocessor. See
youtube_dl/postprocessor/__init__.py for a list.
as well as any further keyword arguments for the
postprocessor.
progress_hooks: A list of functions that get called on download
progress, with a dictionary with the entries
* filename: The final filename
* status: One of "downloading" and "finished"
The dict may also have some of the following entries:
* downloaded_bytes: Bytes on disk
* total_bytes: Size of the whole file, None if unknown
* tmpfilename: The filename we're currently writing to
* eta: The estimated time in seconds, None if unknown
* speed: The download speed in bytes/second, None if
unknown
Progress hooks are guaranteed to be called at least once
(with status "finished") if the download is successful.
The following parameters are not used by YoutubeDL itself, they are used by
the FileDownloader:
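The `progress_hooks` and `postprocessors` entries documented in this hunk can be illustrated with a small options dictionary. This is a minimal sketch, not code from the changeset: `make_progress_logger` and `log` are hypothetical names, and `'mp3'` is only an illustrative value for the `preferredcodec` argument.

```python
def make_progress_logger(log):
    """Return a hook that appends one readable line per progress update to `log`."""
    def hook(d):
        # 'filename' and 'status' are always present, per the docstring.
        if d['status'] == 'finished':
            log.append('done: %s' % d['filename'])
            return
        downloaded = d.get('downloaded_bytes')
        total = d.get('total_bytes')  # may be None if the size is unknown
        if downloaded is not None and total:
            log.append('%s: %.1f%%' % (d['filename'], 100.0 * downloaded / total))
        else:
            log.append('%s: %s bytes' % (d['filename'], downloaded))
    return hook


log = []
ydl_opts = {
    'progress_hooks': [make_progress_logger(log)],
    # 'key' names the postprocessor; the other entries are its keyword arguments.
    'postprocessors': [{'key': 'FFmpegExtractAudio', 'preferredcodec': 'mp3'}],
}
```

A dictionary like this would be passed as `YoutubeDL(ydl_opts)`; the hook itself can be exercised without any download by calling it with hand-built dictionaries.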
@@ -253,6 +284,32 @@ class YoutubeDL(object):
self.print_debug_header()
self.add_default_info_extractors()
for pp_def_raw in self.params.get('postprocessors', []):
pp_class = get_postprocessor(pp_def_raw['key'])
pp_def = dict(pp_def_raw)
del pp_def['key']
pp = pp_class(self, **compat_kwargs(pp_def))
self.add_post_processor(pp)
for ph in self.params.get('progress_hooks', []):
self.add_progress_hook(ph)
def warn_if_short_id(self, argv):
# short YouTube ID starting with dash?
idxs = [
i for i, a in enumerate(argv)
if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
if idxs:
correct_argv = (
['youtube-dl'] +
[a for i, a in enumerate(argv) if i not in idxs] +
['--'] + [argv[i] for i in idxs]
)
self.report_warning(
'Long argument string detected. '
'Use -- to separate parameters and URLs, like this:\n%s\n' %
args_to_str(correct_argv))
def add_info_extractor(self, ie):
"""Add an InfoExtractor object to the end of the list."""
self._ies.append(ie)
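The argument reordering done by `warn_if_short_id` in this hunk can be tried in isolation. The sketch below reuses the regular expression and list construction from the hunk; the function name `reorder_short_ids` and the choice to return `None` when nothing matches are assumptions made for illustration.

```python
import re


def reorder_short_ids(argv):
    """Move arguments that look like dash-prefixed 11-character IDs behind '--'.

    Returns the suggested command line as a list, or None if nothing matched.
    """
    idxs = [i for i, a in enumerate(argv)
            if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
    if not idxs:
        return None
    return (['youtube-dl'] +
            [a for i, a in enumerate(argv) if i not in idxs] +
            ['--'] + [argv[i] for i in idxs])
```

For example, `reorder_short_ids(['-f', '22', '-wNyEUrxzFU'])` moves the last argument behind `--`, which is the command line the warning message suggests.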
@@ -297,7 +354,7 @@ class YoutubeDL(object):
self._output_process.stdin.write((message + '\n').encode('utf-8'))
self._output_process.stdin.flush()
res = ''.join(self._output_channel.readline().decode('utf-8')
for _ in range(line_count))
for _ in range(line_count))
return res[:-len('\n')]
def to_screen(self, message, skip_eol=False):
@@ -494,13 +551,8 @@ class YoutubeDL(object):
max_views = self.params.get('max_views')
if max_views is not None and view_count > max_views:
return 'Skipping %s, because it has exceeded the maximum view count (%d/%d)' % (video_title, view_count, max_views)
age_limit = self.params.get('age_limit')
if age_limit is not None:
actual_age_limit = info_dict.get('age_limit')
if actual_age_limit is None:
actual_age_limit = 0
if age_limit < actual_age_limit:
return 'Skipping "' + title + '" because it is age restricted'
if age_restricted(info_dict.get('age_limit'), self.params.get('age_limit')):
return 'Skipping "%s" because it is age restricted' % title
if self.in_download_archive(info_dict):
return '%s has already been recorded in archive' % video_title
return None
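The `age_restricted` helper itself is not part of this hunk. Judging only from the inline lines it replaces, its behaviour is presumably close to the sketch below; the name `age_restricted_sketch` marks it as a reconstruction, not the real implementation in `utils`.

```python
def age_restricted_sketch(content_limit, age_limit):
    """Return True if content with `content_limit` should be skipped.

    content_limit: the age limit reported for the video (None means unrestricted)
    age_limit: the user's configured age (None means no filtering at all)
    """
    if age_limit is None:
        return False
    if content_limit is None:
        content_limit = 0
    return age_limit < content_limit
```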
@@ -534,7 +586,7 @@ class YoutubeDL(object):
try:
ie_result = ie.extract(url)
if ie_result is None: # Finished already (backwards compatibility; listformats and friends should be moved here)
if ie_result is None: # Finished already (backwards compatibility; listformats and friends should be moved here)
break
if isinstance(ie_result, list):
# Backwards compatibility: old IE result format
@@ -547,7 +599,7 @@ class YoutubeDL(object):
return self.process_ie_result(ie_result, download, extra_info)
else:
return ie_result
except ExtractorError as de: # An error we somewhat expected
except ExtractorError as de: # An error we somewhat expected
self.report_error(compat_str(de), de.format_traceback())
break
except MaxDownloadsReached:
@@ -604,23 +656,15 @@ class YoutubeDL(object):
ie_result['url'], ie_key=ie_result.get('ie_key'),
extra_info=extra_info, download=False, process=False)
def make_result(embedded_info):
new_result = ie_result.copy()
for f in ('_type', 'url', 'ext', 'player_url', 'formats',
'entries', 'ie_key', 'duration',
'subtitles', 'annotations', 'format',
'thumbnail', 'thumbnails'):
if f in new_result:
del new_result[f]
if f in embedded_info:
new_result[f] = embedded_info[f]
return new_result
new_result = make_result(info)
force_properties = dict(
(k, v) for k, v in ie_result.items() if v is not None)
for f in ('_type', 'url'):
if f in force_properties:
del force_properties[f]
new_result = info.copy()
new_result.update(force_properties)
assert new_result.get('_type') != 'url_transparent'
if new_result.get('_type') == 'compat_list':
new_result['entries'] = [
make_result(e) for e in new_result['entries']]
return self.process_ie_result(
new_result, download=download, extra_info=extra_info)
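The new `url_transparent` merge in this hunk starts from the embedded result and lets every non-`None` field of the outer result override it, except `_type` and `url`. A standalone sketch of that rule follows; the function name `merge_url_transparent` and the sample dictionaries are made up for illustration.

```python
def merge_url_transparent(ie_result, info):
    """Merge an outer url_transparent result with the embedded result `info`."""
    # Fields the outer result actually sets take precedence...
    force_properties = dict(
        (k, v) for k, v in ie_result.items() if v is not None)
    # ...except the ones that describe how to reach the embedded result.
    for f in ('_type', 'url'):
        force_properties.pop(f, None)
    new_result = info.copy()
    new_result.update(force_properties)
    return new_result


outer = {'_type': 'url_transparent', 'url': 'http://example.com/embed',
         'title': 'Outer title', 'description': None}
inner = {'_type': 'video', 'url': 'http://example.com/v.mp4',
         'title': 'Inner title', 'description': 'Inner description', 'id': 'x'}
merged = merge_url_transparent(outer, inner)
```

Here `merged` keeps the inner `_type`, `url`, `id` and `description`, while the title comes from the outer result.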
@@ -637,24 +681,34 @@ class YoutubeDL(object):
if playlistend == -1:
playlistend = None
if isinstance(ie_result['entries'], list):
n_all_entries = len(ie_result['entries'])
entries = ie_result['entries'][playliststart:playlistend]
ie_entries = ie_result['entries']
if isinstance(ie_entries, list):
n_all_entries = len(ie_entries)
entries = ie_entries[playliststart:playlistend]
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Collected %d video ids (downloading %d of them)" %
(ie_result['extractor'], playlist, n_all_entries, n_entries))
else:
assert isinstance(ie_result['entries'], PagedList)
entries = ie_result['entries'].getslice(
elif isinstance(ie_entries, PagedList):
entries = ie_entries.getslice(
playliststart, playlistend)
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Downloading %d videos" %
(ie_result['extractor'], playlist, n_entries))
else: # iterable
entries = list(itertools.islice(
ie_entries, playliststart, playlistend))
n_entries = len(entries)
self.to_screen(
"[%s] playlist %s: Downloading %d videos" %
(ie_result['extractor'], playlist, n_entries))
if self.params.get('playlistreverse', False):
entries = entries[::-1]
for i, entry in enumerate(entries, 1):
self.to_screen('[download] Downloading video #%s of %s' % (i, n_entries))
self.to_screen('[download] Downloading video %s of %s' % (i, n_entries))
extra = {
'n_entries': n_entries,
'playlist': playlist,
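The entry selection in this hunk distinguishes lists, `PagedList` objects and arbitrary iterables. The sketch below covers only the list and generic-iterable branches (the `PagedList` branch is left out because that class is not shown here); `select_entries` is a hypothetical name.

```python
import itertools


def select_entries(ie_entries, playliststart=0, playlistend=None, reverse=False):
    """Pick the requested slice of playlist entries, optionally reversed."""
    if isinstance(ie_entries, list):
        entries = ie_entries[playliststart:playlistend]
    else:
        # Generic iterable: consume only as many items as the slice needs,
        # so lazy or unbounded generators are not exhausted.
        entries = list(itertools.islice(ie_entries, playliststart, playlistend))
    if reverse:
        entries = entries[::-1]
    return entries
```

Because `itertools.islice` stops at `playlistend`, the iterable branch also works on an unbounded generator as long as an end index is given.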
@@ -682,14 +736,17 @@ class YoutubeDL(object):
self.report_warning(
'Extractor %s returned a compat_list result. '
'It needs to be updated.' % ie_result.get('extractor'))
def _fixup(r):
self.add_extra_info(r,
self.add_extra_info(
r,
{
'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']),
'extractor_key': ie_result['extractor_key'],
})
}
)
return r
ie_result['entries'] = [
self.process_ie_result(_fixup(r), download, extra_info)
@@ -767,6 +824,10 @@ class YoutubeDL(object):
info_dict['display_id'] = info_dict['id']
if info_dict.get('upload_date') is None and info_dict.get('timestamp') is not None:
# Working around negative timestamps in Windows
# (see http://bugs.python.org/issue1646728)
if info_dict['timestamp'] < 0 and os.name == 'nt':
info_dict['timestamp'] = 0
upload_date = datetime.datetime.utcfromtimestamp(
info_dict['timestamp'])
info_dict['upload_date'] = upload_date.strftime('%Y%m%d')
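The negative-timestamp workaround in this hunk can be isolated as follows. This is a sketch under two assumptions: the platform name is passed in as a parameter (`os_name`, hypothetical) so the clamp can be exercised anywhere, and a timezone-aware `fromtimestamp` call (Python 3) stands in for the `utcfromtimestamp` call used in the source.

```python
import datetime
import os


def upload_date_from_timestamp(timestamp, os_name=os.name):
    """Format a Unix timestamp as YYYYMMDD, clamping negatives on Windows."""
    if timestamp < 0 and os_name == 'nt':
        # Same clamp as in the hunk: fall back to the epoch.
        timestamp = 0
    dt = datetime.datetime.fromtimestamp(timestamp, datetime.timezone.utc)
    return dt.strftime('%Y%m%d')
```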
@@ -839,14 +900,14 @@ class YoutubeDL(object):
# Two formats have been requested like '137+139'
format_1, format_2 = rf.split('+')
formats_info = (self.select_format(format_1, formats),
self.select_format(format_2, formats))
self.select_format(format_2, formats))
if all(formats_info):
# The first format must contain the video and the
# second the audio
if formats_info[0].get('vcodec') == 'none':
self.report_error('The first format must '
'contain the video, try using '
'"-f %s+%s"' % (format_2, format_1))
'contain the video, try using '
'"-f %s+%s"' % (format_2, format_1))
return
selected_format = {
'requested_formats': formats_info,
@@ -910,8 +971,12 @@ class YoutubeDL(object):
if self.params.get('forceid', False):
self.to_stdout(info_dict['id'])
if self.params.get('forceurl', False):
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
self.to_stdout(info_dict['thumbnail'])
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
@@ -947,13 +1012,13 @@ class YoutubeDL(object):
descfn = filename + '.description'
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(descfn)):
self.to_screen('[info] Video description is already present')
elif info_dict.get('description') is None:
self.report_warning('There\'s no description to write.')
else:
try:
self.to_screen('[info] Writing video description to: ' + descfn)
with io.open(encodeFilename(descfn), 'w', encoding='utf-8') as descfile:
descfile.write(info_dict['description'])
except (KeyError, TypeError):
self.report_warning('There\'s no description to write.')
except (OSError, IOError):
self.report_error('Cannot write description file ' + descfn)
return
@@ -992,7 +1057,7 @@ class YoutubeDL(object):
else:
self.to_screen('[info] Writing video subtitles to: ' + sub_filename)
with io.open(encodeFilename(sub_filename), 'w', encoding='utf-8') as subfile:
subfile.write(sub)
subfile.write(sub)
except (OSError, IOError):
self.report_error('Cannot write subtitles file ' + sub_filename)
return
@@ -1024,10 +1089,10 @@ class YoutubeDL(object):
with open(thumb_filename, 'wb') as thumbf:
shutil.copyfileobj(uf, thumbf)
self.to_screen('[%s] %s: Writing thumbnail to: %s' %
(info_dict['extractor'], info_dict['id'], thumb_filename))
(info_dict['extractor'], info_dict['id'], thumb_filename))
except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
self.report_warning('Unable to download thumbnail "%s": %s' %
(info_dict['thumbnail'], compat_str(err)))
(info_dict['thumbnail'], compat_str(err)))
if not self.params.get('skip_download', False):
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(filename)):
@@ -1048,8 +1113,8 @@ class YoutubeDL(object):
if not merger._executable:
postprocessors = []
self.report_warning('You have requested multiple '
'formats but ffmpeg or avconv are not installed.'
' The formats won\'t be merged')
'formats but ffmpeg or avconv are not installed.'
' The formats won\'t be merged')
else:
postprocessors = [merger]
for f in info_dict['requested_formats']:
@@ -1080,8 +1145,7 @@ class YoutubeDL(object):
except (PostProcessingError) as err:
self.report_error('postprocessing: %s' % str(err))
return
self.record_download_archive(info_dict)
self.record_download_archive(info_dict)
def download(self, url_list):
"""Download a given list of URLs."""
@@ -1093,7 +1157,7 @@ class YoutubeDL(object):
for url in url_list:
try:
#It also downloads the videos
# It also downloads the videos
res = self.extract_info(url)
except UnavailableVideoError:
self.report_error('unable to download video')
@@ -1265,7 +1329,9 @@ class YoutubeDL(object):
formats = info_dict.get('formats', [info_dict])
idlen = max(len('format code'),
max(len(f['format_id']) for f in formats))
formats_s = [line(f, idlen) for f in formats]
formats_s = [
line(f, idlen) for f in formats
if f.get('preference') is None or f['preference'] >= -1000]
if len(formats) > 1:
formats_s[0] += (' ' if self._format_note(formats[0]) else '') + '(worst)'
formats_s[-1] += (' ' if self._format_note(formats[-1]) else '') + '(best)'
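The filter added to the format listing keeps a format when it has no `preference` or when its `preference` is at least -1000. A minimal sketch with made-up format dictionaries (`listable_formats` is a hypothetical name):

```python
def listable_formats(formats):
    """Keep formats with no preference, or a preference of at least -1000."""
    return [
        f for f in formats
        if f.get('preference') is None or f['preference'] >= -1000]


sample = [
    {'format_id': 'a'},                       # no preference key: kept
    {'format_id': 'b', 'preference': -2000},  # below the threshold: dropped
    {'format_id': 'c', 'preference': -50},    # kept
    {'format_id': 'd', 'preference': None},   # explicit None: kept
]
visible_ids = [f['format_id'] for f in listable_formats(sample)]
```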


@@ -38,18 +38,8 @@ from .update import update_self
from .downloader import (
FileDownloader,
)
from .extractor import gen_extractors
from .extractor import gen_extractors, list_extractors
from .YoutubeDL import YoutubeDL
from .postprocessor import (
AtomicParsleyPP,
FFmpegAudioFixPP,
FFmpegMetadataPP,
FFmpegVideoConvertor,
FFmpegExtractAudioPP,
FFmpegEmbedSubtitlePP,
XAttrMetadataPP,
ExecAfterDownloadPP,
)
def _real_main(argv=None):
@@ -76,10 +66,10 @@ def _real_main(argv=None):
if opts.headers is not None:
for h in opts.headers:
if h.find(':', 1) < 0:
parser.error('wrong header formatting, it should be key:value, not "%s"'%h)
parser.error('wrong header formatting, it should be key:value, not "%s"' % h)
key, value = h.split(':', 2)
if opts.verbose:
write_string('[debug] Adding header from command line option %s:%s\n'%(key, value))
write_string('[debug] Adding header from command line option %s:%s\n' % (key, value))
std_headers[key] = value
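The header check in this hunk requires a colon somewhere after the first character. Note that `'a:b:c'.split(':', 2)` yields three parts, so unpacking into `key, value` would fail for values that themselves contain a colon (a URL, for instance). The sketch below splits on the first colon only; that is an assumption that differs from the `split(':', 2)` call shown in the diff, and `parse_header` is a hypothetical name.

```python
def parse_header(h):
    """Split a 'key:value' string on its first colon."""
    if h.find(':', 1) < 0:
        raise ValueError(
            'wrong header formatting, it should be key:value, not "%s"' % h)
    # maxsplit=1 keeps any further colons inside the value.
    key, value = h.split(':', 1)
    return key, value
```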
# Dump user agent
@@ -105,30 +95,27 @@ def _real_main(argv=None):
_enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
extractors = gen_extractors()
if opts.list_extractors:
for ie in sorted(extractors, key=lambda ie: ie.IE_NAME.lower()):
for ie in list_extractors(opts.age_limit):
compat_print(ie.IE_NAME + (' (CURRENTLY BROKEN)' if not ie._WORKING else ''))
matchedUrls = [url for url in all_urls if ie.suitable(url)]
for mu in matchedUrls:
compat_print(' ' + mu)
sys.exit(0)
if opts.list_extractor_descriptions:
for ie in sorted(extractors, key=lambda ie: ie.IE_NAME.lower()):
for ie in list_extractors(opts.age_limit):
if not ie._WORKING:
continue
desc = getattr(ie, 'IE_DESC', ie.IE_NAME)
if desc is False:
continue
if hasattr(ie, 'SEARCH_KEY'):
_SEARCHES = ('cute kittens', 'slithering pythons', 'falling cat', 'angry poodle', 'purple fish', 'running tortoise', 'sleeping bunny')
_SEARCHES = ('cute kittens', 'slithering pythons', 'falling cat', 'angry poodle', 'purple fish', 'running tortoise', 'sleeping bunny', 'burping cow')
_COUNTS = ('', '5', '10', 'all')
desc += ' (Example: "%s%s:%s" )' % (ie.SEARCH_KEY, random.choice(_COUNTS), random.choice(_SEARCHES))
compat_print(desc)
sys.exit(0)
# Conflicting, missing and erroneous options
if opts.usenetrc and (opts.username is not None or opts.password is not None):
parser.error('using .netrc conflicts with giving username/password')
@@ -190,21 +177,21 @@ def _real_main(argv=None):
# --all-sub automatically sets --write-sub if --write-auto-sub is not given
# this was the old behaviour if only --all-sub was given.
if opts.allsubtitles and (opts.writeautomaticsub == False):
if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True
if sys.version_info < (3,):
# In Python 2, sys.argv is a bytestring (also note http://bugs.python.org/issue2128 for Windows systems)
if opts.outtmpl is not None:
opts.outtmpl = opts.outtmpl.decode(preferredencoding())
outtmpl =((opts.outtmpl is not None and opts.outtmpl)
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
or (opts.useid and '%(id)s.%(ext)s')
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
or DEFAULT_OUTTMPL)
outtmpl = ((opts.outtmpl is not None and opts.outtmpl)
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s')
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s')
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s')
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s')
or (opts.useid and '%(id)s.%(ext)s')
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s')
or DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output'
@@ -213,6 +200,43 @@ def _real_main(argv=None):
any_printing = opts.geturl or opts.gettitle or opts.getid or opts.getthumbnail or opts.getdescription or opts.getfilename or opts.getformat or opts.getduration or opts.dumpjson or opts.dump_single_json
download_archive_fn = compat_expanduser(opts.download_archive) if opts.download_archive is not None else opts.download_archive
# PostProcessors
postprocessors = []
# Add the metadata pp first, the other pps will copy it
if opts.addmetadata:
postprocessors.append({'key': 'FFmpegMetadata'})
if opts.extractaudio:
postprocessors.append({
'key': 'FFmpegExtractAudio',
'preferredcodec': opts.audioformat,
'preferredquality': opts.audioquality,
'nopostoverwrites': opts.nopostoverwrites,
})
if opts.recodevideo:
postprocessors.append({
'key': 'FFmpegVideoConvertor',
'preferedformat': opts.recodevideo,
})
if opts.embedsubtitles:
postprocessors.append({
'key': 'FFmpegEmbedSubtitle',
'subtitlesformat': opts.subtitlesformat,
})
if opts.xattrs:
postprocessors.append({'key': 'XAttrMetadata'})
if opts.embedthumbnail:
if not opts.addmetadata:
postprocessors.append({'key': 'FFmpegAudioFix'})
postprocessors.append({'key': 'AtomicParsley'})
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
if opts.exec_cmd:
postprocessors.append({
'key': 'ExecAfterDownload',
'verboseOutput': opts.verbose,
'exec_cmd': opts.exec_cmd,
})
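The list of dictionaries built here is turned into postprocessor objects by the `YoutubeDL` constructor elsewhere in this changeset, which looks up the class by `key` and passes the remaining entries as keyword arguments. The sketch below mimics that step with a hypothetical registry and a stand-in class; `REGISTRY`, `FakeExtractAudioPP` and `build_postprocessors` are not part of youtube-dl.

```python
class FakeExtractAudioPP(object):
    """Hypothetical stand-in for a real postprocessor class."""

    def __init__(self, downloader, preferredcodec=None):
        self.downloader = downloader
        self.preferredcodec = preferredcodec


# Hypothetical registry; the real lookup is done by get_postprocessor().
REGISTRY = {'FakeExtractAudio': FakeExtractAudioPP}


def build_postprocessors(downloader, pp_defs):
    """Instantiate one postprocessor per definition dictionary."""
    pps = []
    for pp_def_raw in pp_defs:
        pp_def = dict(pp_def_raw)  # copy so the caller's dict is not mutated
        pp_class = REGISTRY[pp_def.pop('key')]
        pps.append(pp_class(downloader, **pp_def))
    return pps
```

Keeping the definitions as plain dictionaries means the option set can be serialized or built by API users without importing the postprocessor classes directly.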
ydl_opts = {
'usenetrc': opts.usenetrc,
'username': opts.username,
@@ -250,6 +274,7 @@ def _real_main(argv=None):
'progress_with_newline': opts.progress_with_newline,
'playliststart': opts.playliststart,
'playlistend': opts.playlistend,
'playlistreverse': opts.playlist_reverse,
'noplaylist': opts.noplaylist,
'logtostderr': opts.outtmpl == '-',
'consoletitle': opts.consoletitle,
@@ -297,33 +322,10 @@ def _real_main(argv=None):
'encoding': opts.encoding,
'exec_cmd': opts.exec_cmd,
'extract_flat': opts.extract_flat,
'postprocessors': postprocessors,
}
with YoutubeDL(ydl_opts) as ydl:
# PostProcessors
# Add the metadata pp first, the other pps will copy it
if opts.addmetadata:
ydl.add_post_processor(FFmpegMetadataPP())
if opts.extractaudio:
ydl.add_post_processor(FFmpegExtractAudioPP(preferredcodec=opts.audioformat, preferredquality=opts.audioquality, nopostoverwrites=opts.nopostoverwrites))
if opts.recodevideo:
ydl.add_post_processor(FFmpegVideoConvertor(preferedformat=opts.recodevideo))
if opts.embedsubtitles:
ydl.add_post_processor(FFmpegEmbedSubtitlePP(subtitlesformat=opts.subtitlesformat))
if opts.xattrs:
ydl.add_post_processor(XAttrMetadataPP())
if opts.embedthumbnail:
if not opts.addmetadata:
ydl.add_post_processor(FFmpegAudioFixPP())
ydl.add_post_processor(AtomicParsleyPP())
# Please keep ExecAfterDownload towards the bottom as it allows the user to modify the final file in any way.
# So if the user is able to remove the file before your postprocessor runs it might cause a few problems.
if opts.exec_cmd:
ydl.add_post_processor(ExecAfterDownloadPP(
verboseOutput=opts.verbose, exec_cmd=opts.exec_cmd))
# Update version
if opts.update_self:
update_self(ydl.to_screen, opts.verbose)
@@ -334,11 +336,12 @@ def _real_main(argv=None):
# Maybe do nothing
if (len(all_urls) < 1) and (opts.load_info_filename is None):
if not (opts.update_self or opts.rm_cachedir):
parser.error('you must provide at least one URL')
else:
if opts.update_self or opts.rm_cachedir:
sys.exit()
ydl.warn_if_short_id(sys.argv[1:] if argv is None else argv)
parser.error('you must provide at least one URL')
try:
if opts.load_info_filename is not None:
retcode = ydl.download_with_info_file(opts.load_info_filename)
@@ -360,3 +363,5 @@ def main(argv=None):
sys.exit('ERROR: fixed output name but more than one file to download')
except KeyboardInterrupt:
sys.exit('\nERROR: Interrupted by user')
__all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']


@@ -1,4 +1,5 @@
#!/usr/bin/env python
from __future__ import unicode_literals
# Execute with
# $ python youtube_dl/__main__.py (2.6+)


@@ -1,3 +1,5 @@
from __future__ import unicode_literals
__all__ = ['aes_encrypt', 'key_expansion', 'aes_ctr_decrypt', 'aes_cbc_decrypt', 'aes_decrypt_text']
import base64
@@ -7,10 +9,11 @@ from .utils import bytes_to_intlist, intlist_to_bytes
BLOCK_SIZE_BYTES = 16
def aes_ctr_decrypt(data, key, counter):
"""
Decrypt with aes in counter mode
@param {int[]} data cipher
@param {int[]} key 16/24/32-Byte cipher key
@param {instance} counter Instance whose next_value function (@returns {int[]} 16-Byte block)
@@ -19,23 +22,24 @@ def aes_ctr_decrypt(data, key, counter):
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
decrypted_data=[]
decrypted_data = []
for i in range(block_count):
counter_block = counter.next_value()
block = data[i*BLOCK_SIZE_BYTES : (i+1)*BLOCK_SIZE_BYTES]
block += [0]*(BLOCK_SIZE_BYTES - len(block))
block = data[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES]
block += [0] * (BLOCK_SIZE_BYTES - len(block))
cipher_counter_block = aes_encrypt(counter_block, expanded_key)
decrypted_data += xor(block, cipher_counter_block)
decrypted_data = decrypted_data[:len(data)]
return decrypted_data
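Counter mode, as used by `aes_ctr_decrypt`, XORs each data block with the encryption of a counter block, so the same routine both encrypts and decrypts. The Python 3 sketch below shows that structure with a toy keyed function in place of AES; `toy_encrypt_block` is a hypothetical stand-in and is not a real cipher, and the big-endian integer counter is a simplification of the `Counter` object used in the source.

```python
import hashlib

BLOCK_SIZE_BYTES = 16


def toy_encrypt_block(block, key):
    """Hypothetical stand-in for AES: a keyed 16-byte function, NOT a cipher."""
    digest = hashlib.sha256(bytes(key) + bytes(block)).digest()
    return list(digest[:BLOCK_SIZE_BYTES])


def ctr_crypt(data, key, nonce):
    """Encrypt or decrypt a list of byte values in counter mode."""
    out = []
    counter = 0
    for start in range(0, len(data), BLOCK_SIZE_BYTES):
        block = data[start:start + BLOCK_SIZE_BYTES]
        # Counter block = nonce followed by a big-endian block counter.
        counter_block = nonce + list(
            counter.to_bytes(BLOCK_SIZE_BYTES - len(nonce), 'big'))
        keystream = toy_encrypt_block(counter_block, key)
        # zip() stops at the shorter input, which handles a partial last block.
        out += [x ^ y for x, y in zip(block, keystream)]
        counter += 1
    return out
```

Applying `ctr_crypt` twice with the same key and nonce returns the original data, which is why the source needs only `aes_encrypt` (never `aes_decrypt`) inside `aes_ctr_decrypt`.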
def aes_cbc_decrypt(data, key, iv):
"""
Decrypt with aes in CBC mode
@param {int[]} data cipher
@param {int[]} key 16/24/32-Byte cipher key
@param {int[]} iv 16-Byte IV
@@ -43,94 +47,98 @@ def aes_cbc_decrypt(data, key, iv):
"""
expanded_key = key_expansion(key)
block_count = int(ceil(float(len(data)) / BLOCK_SIZE_BYTES))
decrypted_data=[]
decrypted_data = []
previous_cipher_block = iv
for i in range(block_count):
block = data[i*BLOCK_SIZE_BYTES : (i+1)*BLOCK_SIZE_BYTES]
block += [0]*(BLOCK_SIZE_BYTES - len(block))
block = data[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES]
block += [0] * (BLOCK_SIZE_BYTES - len(block))
decrypted_block = aes_decrypt(block, expanded_key)
decrypted_data += xor(decrypted_block, previous_cipher_block)
previous_cipher_block = block
decrypted_data = decrypted_data[:len(data)]
return decrypted_data
def key_expansion(data):
"""
Generate key schedule
@param {int[]} data 16/24/32-Byte cipher key
@returns {int[]} 176/208/240-Byte expanded key
@returns {int[]} 176/208/240-Byte expanded key
"""
data = data[:] # copy
data = data[:] # copy
rcon_iteration = 1
key_size_bytes = len(data)
expanded_key_size_bytes = (key_size_bytes // 4 + 7) * BLOCK_SIZE_BYTES
while len(data) < expanded_key_size_bytes:
temp = data[-4:]
temp = key_schedule_core(temp, rcon_iteration)
rcon_iteration += 1
data += xor(temp, data[-key_size_bytes : 4-key_size_bytes])
data += xor(temp, data[-key_size_bytes: 4 - key_size_bytes])
for _ in range(3):
temp = data[-4:]
data += xor(temp, data[-key_size_bytes : 4-key_size_bytes])
data += xor(temp, data[-key_size_bytes: 4 - key_size_bytes])
if key_size_bytes == 32:
temp = data[-4:]
temp = sub_bytes(temp)
data += xor(temp, data[-key_size_bytes : 4-key_size_bytes])
for _ in range(3 if key_size_bytes == 32 else 2 if key_size_bytes == 24 else 0):
data += xor(temp, data[-key_size_bytes: 4 - key_size_bytes])
for _ in range(3 if key_size_bytes == 32 else 2 if key_size_bytes == 24 else 0):
temp = data[-4:]
data += xor(temp, data[-key_size_bytes : 4-key_size_bytes])
data += xor(temp, data[-key_size_bytes: 4 - key_size_bytes])
data = data[:expanded_key_size_bytes]
return data
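The size formula in `key_expansion` and the round count derived in `aes_encrypt` can be checked with a few lines of arithmetic. The helper names below are hypothetical; the formulas are copied from the code in this diff.

```python
BLOCK_SIZE_BYTES = 16


def expanded_key_size(key_size_bytes):
    """Size of the key schedule in bytes, as computed in key_expansion."""
    return (key_size_bytes // 4 + 7) * BLOCK_SIZE_BYTES


def round_count(key_size_bytes):
    """Number of rounds, as derived from the expanded key in aes_encrypt."""
    return expanded_key_size(key_size_bytes) // BLOCK_SIZE_BYTES - 1


sizes = [(k, expanded_key_size(k), round_count(k)) for k in (16, 24, 32)]
```

This gives 176, 208 and 240 bytes for 16-, 24- and 32-byte keys, matching the docstring, with 10, 12 and 14 rounds respectively.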
def aes_encrypt(data, expanded_key):
"""
Encrypt one block with aes
@param {int[]} data 16-Byte state
@param {int[]} expanded_key 176/208/240-Byte expanded key
@param {int[]} expanded_key 176/208/240-Byte expanded key
@returns {int[]} 16-Byte cipher
"""
rounds = len(expanded_key) // BLOCK_SIZE_BYTES - 1
data = xor(data, expanded_key[:BLOCK_SIZE_BYTES])
for i in range(1, rounds+1):
for i in range(1, rounds + 1):
data = sub_bytes(data)
data = shift_rows(data)
if i != rounds:
data = mix_columns(data)
data = xor(data, expanded_key[i*BLOCK_SIZE_BYTES : (i+1)*BLOCK_SIZE_BYTES])
data = xor(data, expanded_key[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES])
return data
def aes_decrypt(data, expanded_key):
"""
Decrypt one block with aes
@param {int[]} data 16-Byte cipher
@param {int[]} expanded_key 176/208/240-Byte expanded key
@returns {int[]} 16-Byte state
"""
rounds = len(expanded_key) // BLOCK_SIZE_BYTES - 1
for i in range(rounds, 0, -1):
data = xor(data, expanded_key[i*BLOCK_SIZE_BYTES : (i+1)*BLOCK_SIZE_BYTES])
data = xor(data, expanded_key[i * BLOCK_SIZE_BYTES: (i + 1) * BLOCK_SIZE_BYTES])
if i != rounds:
data = mix_columns_inv(data)
data = shift_rows_inv(data)
data = sub_bytes_inv(data)
data = xor(data, expanded_key[:BLOCK_SIZE_BYTES])
return data
def aes_decrypt_text(data, password, key_size_bytes):
"""
Decrypt text
@@ -138,33 +146,34 @@ def aes_decrypt_text(data, password, key_size_bytes):
- The cipher key is retrieved by encrypting the first 16 Byte of 'password'
with the first 'key_size_bytes' Bytes from 'password' (if necessary filled with 0's)
- Mode of operation is 'counter'
@param {str} data Base64 encoded string
@param {str,unicode} password Password (will be encoded with utf-8)
@param {int} key_size_bytes Possible values: 16 for 128-Bit, 24 for 192-Bit or 32 for 256-Bit
@returns {str} Decrypted data
"""
NONCE_LENGTH_BYTES = 8
data = bytes_to_intlist(base64.b64decode(data))
password = bytes_to_intlist(password.encode('utf-8'))
key = password[:key_size_bytes] + [0]*(key_size_bytes - len(password))
key = password[:key_size_bytes] + [0] * (key_size_bytes - len(password))
key = aes_encrypt(key[:BLOCK_SIZE_BYTES], key_expansion(key)) * (key_size_bytes // BLOCK_SIZE_BYTES)
nonce = data[:NONCE_LENGTH_BYTES]
cipher = data[NONCE_LENGTH_BYTES:]
class Counter:
__value = nonce + [0]*(BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES)
__value = nonce + [0] * (BLOCK_SIZE_BYTES - NONCE_LENGTH_BYTES)
def next_value(self):
temp = self.__value
self.__value = inc(self.__value)
return temp
decrypted_data = aes_ctr_decrypt(cipher, key, Counter())
plaintext = intlist_to_bytes(decrypted_data)
return plaintext
RCON = (0x8d, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36)
@@ -200,14 +209,14 @@ SBOX_INV = (0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, 0xbf, 0x40, 0xa3, 0x
0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d, 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef,
0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0, 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61,
0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d)
MIX_COLUMN_MATRIX = ((0x2,0x3,0x1,0x1),
(0x1,0x2,0x3,0x1),
(0x1,0x1,0x2,0x3),
(0x3,0x1,0x1,0x2))
MIX_COLUMN_MATRIX_INV = ((0xE,0xB,0xD,0x9),
(0x9,0xE,0xB,0xD),
(0xD,0x9,0xE,0xB),
(0xB,0xD,0x9,0xE))
MIX_COLUMN_MATRIX = ((0x2, 0x3, 0x1, 0x1),
(0x1, 0x2, 0x3, 0x1),
(0x1, 0x1, 0x2, 0x3),
(0x3, 0x1, 0x1, 0x2))
MIX_COLUMN_MATRIX_INV = ((0xE, 0xB, 0xD, 0x9),
(0x9, 0xE, 0xB, 0xD),
(0xD, 0x9, 0xE, 0xB),
(0xB, 0xD, 0x9, 0xE))
RIJNDAEL_EXP_TABLE = (0x01, 0x03, 0x05, 0x0F, 0x11, 0x33, 0x55, 0xFF, 0x1A, 0x2E, 0x72, 0x96, 0xA1, 0xF8, 0x13, 0x35,
0x5F, 0xE1, 0x38, 0x48, 0xD8, 0x73, 0x95, 0xA4, 0xF7, 0x02, 0x06, 0x0A, 0x1E, 0x22, 0x66, 0xAA,
0xE5, 0x34, 0x5C, 0xE4, 0x37, 0x59, 0xEB, 0x26, 0x6A, 0xBE, 0xD9, 0x70, 0x90, 0xAB, 0xE6, 0x31,
@@ -241,30 +250,37 @@ RIJNDAEL_LOG_TABLE = (0x00, 0x00, 0x19, 0x01, 0x32, 0x02, 0x1a, 0xc6, 0x4b, 0xc7
0x44, 0x11, 0x92, 0xd9, 0x23, 0x20, 0x2e, 0x89, 0xb4, 0x7c, 0xb8, 0x26, 0x77, 0x99, 0xe3, 0xa5,
0x67, 0x4a, 0xed, 0xde, 0xc5, 0x31, 0xfe, 0x18, 0x0d, 0x63, 0x8c, 0x80, 0xc0, 0xf7, 0x70, 0x07)
def sub_bytes(data):
return [SBOX[x] for x in data]
def sub_bytes_inv(data):
return [SBOX_INV[x] for x in data]
def rotate(data):
return data[1:] + [data[0]]
def key_schedule_core(data, rcon_iteration):
data = rotate(data)
data = sub_bytes(data)
data[0] = data[0] ^ RCON[rcon_iteration]
return data
def xor(data1, data2):
return [x^y for x, y in zip(data1, data2)]
return [x ^ y for x, y in zip(data1, data2)]
def rijndael_mul(a, b):
if(a==0 or b==0):
if(a == 0 or b == 0):
return 0
return RIJNDAEL_EXP_TABLE[(RIJNDAEL_LOG_TABLE[a] + RIJNDAEL_LOG_TABLE[b]) % 0xFF]
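`rijndael_mul` multiplies in GF(2^8) through log and exponent tables. As a cross-check, the same product can be computed without tables by shift-and-add with reduction by the AES polynomial 0x11b. This is a different technique from the one in the source, shown only for comparison; `gf_mul` is a hypothetical name.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11b)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a      # add (XOR) the current multiple of a
        b >>= 1
        a <<= 1              # multiply a by x
        if a & 0x100:
            a ^= 0x11b       # reduce back into 8 bits
    return result
```

The worked example from the AES specification, {57} x {83} = {c1}, is a convenient sanity check for either implementation.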
def mix_column(data, matrix):
data_mixed = []
for row in range(4):
@@ -275,33 +291,38 @@ def mix_column(data, matrix):
data_mixed.append(mixed)
return data_mixed
def mix_columns(data, matrix=MIX_COLUMN_MATRIX):
data_mixed = []
for i in range(4):
column = data[i*4 : (i+1)*4]
column = data[i * 4: (i + 1) * 4]
data_mixed += mix_column(column, matrix)
return data_mixed
def mix_columns_inv(data):
return mix_columns(data, MIX_COLUMN_MATRIX_INV)
def shift_rows(data):
data_shifted = []
for column in range(4):
for row in range(4):
data_shifted.append( data[((column + row) & 0b11) * 4 + row] )
data_shifted.append(data[((column + row) & 0b11) * 4 + row])
return data_shifted
def shift_rows_inv(data):
data_shifted = []
for column in range(4):
for row in range(4):
data_shifted.append( data[((column - row) & 0b11) * 4 + row] )
data_shifted.append(data[((column - row) & 0b11) * 4 + row])
return data_shifted
def inc(data):
data = data[:] # copy
for i in range(len(data)-1,-1,-1):
data = data[:] # copy
for i in range(len(data) - 1, -1, -1):
if data[i] == 255:
data[i] = 0
else:

@@ -3,53 +3,54 @@ from __future__ import unicode_literals
import getpass
import optparse
import os
import re
import subprocess
import sys
try:
import urllib.request as compat_urllib_request
except ImportError: # Python 2
except ImportError: # Python 2
import urllib2 as compat_urllib_request
try:
import urllib.error as compat_urllib_error
except ImportError: # Python 2
except ImportError: # Python 2
import urllib2 as compat_urllib_error
try:
import urllib.parse as compat_urllib_parse
except ImportError: # Python 2
except ImportError: # Python 2
import urllib as compat_urllib_parse
try:
from urllib.parse import urlparse as compat_urllib_parse_urlparse
except ImportError: # Python 2
except ImportError: # Python 2
from urlparse import urlparse as compat_urllib_parse_urlparse
try:
import urllib.parse as compat_urlparse
except ImportError: # Python 2
except ImportError: # Python 2
import urlparse as compat_urlparse
try:
import http.cookiejar as compat_cookiejar
except ImportError: # Python 2
except ImportError: # Python 2
import cookielib as compat_cookiejar
try:
import html.entities as compat_html_entities
except ImportError: # Python 2
except ImportError: # Python 2
import htmlentitydefs as compat_html_entities
try:
import html.parser as compat_html_parser
except ImportError: # Python 2
except ImportError: # Python 2
import HTMLParser as compat_html_parser
try:
import http.client as compat_http_client
except ImportError: # Python 2
except ImportError: # Python 2
import httplib as compat_http_client
try:
@@ -110,12 +111,12 @@ except ImportError:
try:
from urllib.parse import parse_qs as compat_parse_qs
except ImportError: # Python 2
except ImportError: # Python 2
# HACK: The following is the correct parse_qs implementation from cpython 3's stdlib.
# Python 2's version is apparently totally broken
def _parse_qsl(qs, keep_blank_values=False, strict_parsing=False,
encoding='utf-8', errors='replace'):
encoding='utf-8', errors='replace'):
qs, _coerce_result = qs, unicode
pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
r = []
@@ -144,10 +145,10 @@ except ImportError: # Python 2
return r
def compat_parse_qs(qs, keep_blank_values=False, strict_parsing=False,
encoding='utf-8', errors='replace'):
encoding='utf-8', errors='replace'):
parsed_result = {}
pairs = _parse_qsl(qs, keep_blank_values, strict_parsing,
encoding=encoding, errors=errors)
encoding=encoding, errors=errors)
for name, value in pairs:
if name in parsed_result:
parsed_result[name].append(value)
@@ -156,12 +157,12 @@ except ImportError: # Python 2
return parsed_result
try:
compat_str = unicode # Python 2
compat_str = unicode # Python 2
except NameError:
compat_str = str
try:
compat_chr = unichr # Python 2
compat_chr = unichr # Python 2
except NameError:
compat_chr = chr
@@ -174,12 +175,17 @@ try:
from shlex import quote as shlex_quote
except ImportError: # Python < 3.3
def shlex_quote(s):
return "'" + s.replace("'", "'\"'\"'") + "'"
if re.match(r'^[-_\w./]+$', s):
return s
else:
return "'" + s.replace("'", "'\"'\"'") + "'"
def compat_ord(c):
if type(c) is int: return c
else: return ord(c)
if type(c) is int:
return c
else:
return ord(c)
if sys.version_info >= (3, 0):
@@ -241,7 +247,7 @@ else:
userhome = compat_getenv('HOME')
elif 'USERPROFILE' in os.environ:
userhome = compat_getenv('USERPROFILE')
elif not 'HOMEPATH' in os.environ:
elif 'HOMEPATH' not in os.environ:
return path
else:
try:
@@ -250,7 +256,7 @@ else:
drive = ''
userhome = os.path.join(drive, compat_getenv('HOMEPATH'))
if i != 1: #~user
if i != 1: # ~user
userhome = os.path.join(os.path.dirname(userhome), path[1:i])
return userhome + path[i:]
@@ -264,7 +270,7 @@ if sys.version_info < (3, 0):
print(s.encode(preferredencoding(), 'xmlcharrefreplace'))
else:
def compat_print(s):
assert type(s) == type(u'')
assert isinstance(s, compat_str)
print(s)
@@ -291,7 +297,9 @@ else:
# Old 2.6 and 2.7 releases require kwargs to be bytes
try:
(lambda x: x)(**{'x': 0})
def _testfunc(x):
pass
_testfunc(**{'x': 0})
except TypeError:
def compat_kwargs(kwargs):
return dict((bytes(k), v) for k, v in kwargs.items())
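
The `shlex_quote` fallback changed in the hunk above can be tried in isolation. The following is a minimal sketch that copies the new branch logic; the name `shlex_quote_fallback` is hypothetical and is used only so the sketch does not shadow the standard-library function.

```python
import re


def shlex_quote_fallback(s):
    # Hypothetical name; the body mirrors the fallback shown in the diff.
    # Strings made only of word characters, '-', '_', '.' and '/' are
    # returned unquoted.
    if re.match(r'^[-_\w./]+$', s):
        return s
    # Everything else is wrapped in single quotes; an embedded single quote
    # is closed, emitted inside double quotes, and reopened.
    return "'" + s.replace("'", "'\"'\"'") + "'"


if __name__ == '__main__':
    for sample in ('video-01.mp4', 'two words', "it's"):
        print(shlex_quote_fallback(sample))
```

With this change, plain file names pass through unchanged, which keeps printed command lines readable; the empty string still comes back as `''` because the regular expression requires at least one character.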

@@ -30,3 +30,8 @@ def get_suitable_downloader(info_dict):
return F4mFD
else:
return HttpFD
__all__ = [
'get_suitable_downloader',
'FileDownloader',
]

@@ -5,8 +5,8 @@ import re
import sys
import time
from ..compat import compat_str
from ..utils import (
compat_str,
encodeFilename,
format_bytes,
timeconvert,
@@ -80,8 +80,10 @@ class FileDownloader(object):
def calc_eta(start, now, total, current):
if total is None:
return None
if now is None:
now = time.time()
dif = now - start
if current == 0 or dif < 0.001: # One millisecond
if current == 0 or dif < 0.001: # One millisecond
return None
rate = float(current) / dif
return int((float(total) - float(current)) / rate)
@@ -95,7 +97,7 @@ class FileDownloader(object):
@staticmethod
def calc_speed(start, now, bytes):
dif = now - start
if bytes == 0 or dif < 0.001: # One millisecond
if bytes == 0 or dif < 0.001: # One millisecond
return None
return float(bytes) / dif
@@ -108,7 +110,7 @@ class FileDownloader(object):
@staticmethod
def best_block_size(elapsed_time, bytes):
new_min = max(bytes / 2.0, 1.0)
new_max = min(max(bytes * 2.0, 1.0), 4194304) # Do not surpass 4 MB
new_max = min(max(bytes * 2.0, 1.0), 4194304) # Do not surpass 4 MB
if elapsed_time < 0.001:
return int(new_max)
rate = bytes / elapsed_time
@@ -146,18 +148,19 @@ class FileDownloader(object):
def report_error(self, *args, **kargs):
self.ydl.report_error(*args, **kargs)
def slow_down(self, start_time, byte_counter):
def slow_down(self, start_time, now, byte_counter):
"""Sleep if the download speed is over the rate limit."""
rate_limit = self.params.get('ratelimit', None)
if rate_limit is None or byte_counter == 0:
return
now = time.time()
if now is None:
now = time.time()
elapsed = now - start_time
if elapsed <= 0.0:
return
speed = float(byte_counter) / elapsed
if speed > rate_limit:
time.sleep((byte_counter - rate_limit * (now - start_time)) / rate_limit)
time.sleep(max((byte_counter // rate_limit) - elapsed, 0))
def temp_name(self, filename):
"""Returns a temporary filename for the given filename."""
@@ -282,7 +285,7 @@ class FileDownloader(object):
Return True on success and False otherwise
"""
# Check file already present
if self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)) and not self.params.get('nopart', False):
if filename != '-' and self.params.get('continuedl', False) and os.path.isfile(encodeFilename(filename)) and not self.params.get('nopart', False):
self.report_file_already_downloaded(filename)
self._hook_progress({
'filename': filename,
@@ -302,19 +305,6 @@ class FileDownloader(object):
ph(status)
def add_progress_hook(self, ph):
""" ph gets called on download progress, with a dictionary with the entries
* filename: The final filename
* status: One of "downloading" and "finished"
It can also have some of the following entries:
* downloaded_bytes: Bytes on disks
* total_bytes: Total bytes, None if unknown
* tmpfilename: The filename we're currently writing to
* eta: The estimated time in seconds, None if unknown
* speed: The download speed in bytes/second, None if unknown
Hooks are guaranteed to be called at least once (with status "finished")
if the download is successful.
"""
# See YoutubeDl.py (search for progress_hooks) for a description of
# this interface
self._progress_hooks.append(ph)
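
The sleep computation in the revised `slow_down()` above can be checked with a small standalone sketch. `sleep_needed` is a hypothetical helper, not part of the code base; it returns the value the diff passes to `time.sleep()` instead of sleeping, so the arithmetic can be inspected.

```python
def sleep_needed(byte_counter, rate_limit, elapsed):
    """Seconds to sleep so the average speed falls back to rate_limit.

    Hypothetical helper mirroring the arithmetic of the revised slow_down().
    """
    if rate_limit is None or byte_counter == 0 or elapsed <= 0.0:
        return 0
    speed = float(byte_counter) / elapsed
    if speed <= rate_limit:
        return 0
    # Time the transfer would have taken at the limit, minus the time
    # actually spent; max() clamps the result so it is never negative.
    return max((byte_counter // rate_limit) - elapsed, 0)


if __name__ == '__main__':
    # 1000 bytes in 4 s against a 100 B/s limit.
    print(sleep_needed(1000, 100, 4.0))
```

Because `//` floors the quotient, an overshoot worth less than one second at the limit yields no sleep in this sketch (for example 150 bytes in 1 s at 100 B/s).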

@@ -9,10 +9,12 @@ import xml.etree.ElementTree as etree
from .common import FileDownloader
from .http import HttpFD
from ..compat import (
compat_urlparse,
)
from ..utils import (
struct_pack,
struct_unpack,
compat_urlparse,
format_bytes,
encodeFilename,
sanitize_open,
@@ -55,7 +57,7 @@ class FlvReader(io.BytesIO):
if size == 1:
real_size = self.read_unsigned_long_long()
header_end = 16
return real_size, box_type, self.read(real_size-header_end)
return real_size, box_type, self.read(real_size - header_end)
def read_asrt(self):
# version
@@ -180,7 +182,7 @@ def build_fragments_list(boot_info):
n_frags = segment_run_entry[1]
fragment_run_entry_table = boot_info['fragments'][0]['fragments']
first_frag_number = fragment_run_entry_table[0]['first']
for (i, frag_number) in zip(range(1, n_frags+1), itertools.count(first_frag_number)):
for (i, frag_number) in zip(range(1, n_frags + 1), itertools.count(first_frag_number)):
res.append((1, frag_number))
return res
@@ -201,7 +203,7 @@ def write_flv_header(stream, metadata):
stream.write(b'\x00\x00\x00\x00\x00\x00\x00')
stream.write(metadata)
# Magic numbers extracted from the output files produced by AdobeHDS.php
#(https://github.com/K-S-V/Scripts)
# (https://github.com/K-S-V/Scripts)
stream.write(b'\x00\x00\x01\x73')
@@ -225,13 +227,16 @@ class F4mFD(FileDownloader):
self.to_screen('[download] Downloading f4m manifest')
manifest = self.ydl.urlopen(man_url).read()
self.report_destination(filename)
http_dl = HttpQuietDownloader(self.ydl,
http_dl = HttpQuietDownloader(
self.ydl,
{
'continuedl': True,
'quiet': True,
'noprogress': True,
'ratelimit': self.params.get('ratelimit', None),
'test': self.params.get('test', False),
})
}
)
doc = etree.fromstring(manifest)
formats = [(int(f.attrib.get('bitrate', -1)), f) for f in doc.findall(_add_ns('media'))]
@@ -277,7 +282,7 @@ class F4mFD(FileDownloader):
def frag_progress_hook(status):
frag_total_bytes = status.get('total_bytes', 0)
estimated_size = (state['downloaded_bytes'] +
(total_frags - state['frag_counter']) * frag_total_bytes)
(total_frags - state['frag_counter']) * frag_total_bytes)
if status['status'] == 'finished':
state['downloaded_bytes'] += frag_total_bytes
state['frag_counter'] += 1
@@ -287,13 +292,13 @@ class F4mFD(FileDownloader):
frag_downloaded_bytes = status['downloaded_bytes']
byte_counter = state['downloaded_bytes'] + frag_downloaded_bytes
frag_progress = self.calc_percent(frag_downloaded_bytes,
frag_total_bytes)
frag_total_bytes)
progress = self.calc_percent(state['frag_counter'], total_frags)
progress += frag_progress / float(total_frags)
eta = self.calc_eta(start, time.time(), estimated_size, byte_counter)
self.report_progress(progress, format_bytes(estimated_size),
status.get('speed'), eta)
status.get('speed'), eta)
http_dl.add_progress_hook(frag_progress_hook)
frags_filenames = []

@@ -4,11 +4,13 @@ import os
import re
import subprocess
from ..postprocessor.ffmpeg import FFmpegPostProcessor
from .common import FileDownloader
from ..utils import (
from ..compat import (
compat_urlparse,
compat_urllib_request,
check_executable,
)
from ..utils import (
encodeFilename,
)
@@ -24,18 +26,18 @@ class HlsFD(FileDownloader):
'-bsf:a', 'aac_adtstoasc',
encodeFilename(tmpfilename, for_subprocess=True)]
for program in ['avconv', 'ffmpeg']:
if check_executable(program, ['-version']):
break
else:
self.report_error(u'm3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
ffpp = FFmpegPostProcessor(downloader=self)
program = ffpp._executable
if program is None:
self.report_error('m3u8 download detected but ffmpeg or avconv could not be found. Please install one.')
return False
ffpp.check_version()
cmd = [program] + args
retval = subprocess.call(cmd)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[%s] %s bytes' % (cmd[0], fsize))
self.to_screen('\r[%s] %s bytes' % (cmd[0], fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
@@ -45,8 +47,8 @@ class HlsFD(FileDownloader):
})
return True
else:
self.to_stderr(u"\n")
self.report_error(u'%s exited with code %d' % (program, retval))
self.to_stderr('\n')
self.report_error('%s exited with code %d' % (program, retval))
return False
@@ -101,4 +103,3 @@ class NativeHlsFD(FileDownloader):
})
self.try_rename(tmpfilename, filename)
return True

@@ -1,12 +1,15 @@
from __future__ import unicode_literals
import os
import time
from .common import FileDownloader
from ..utils import (
from ..compat import (
compat_urllib_request,
compat_urllib_error,
)
from ..utils import (
ContentTooShortError,
encodeFilename,
sanitize_open,
format_bytes,
@@ -106,7 +109,7 @@ class HttpFD(FileDownloader):
self.report_retry(count, retries)
if count > retries:
self.report_error(u'giving up after %s retries' % retries)
self.report_error('giving up after %s retries' % retries)
return False
data_len = data.info().get('Content-length', None)
@@ -124,26 +127,31 @@ class HttpFD(FileDownloader):
min_data_len = self.params.get("min_filesize", None)
max_data_len = self.params.get("max_filesize", None)
if min_data_len is not None and data_len < min_data_len:
self.to_screen(u'\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
self.to_screen('\r[download] File is smaller than min-filesize (%s bytes < %s bytes). Aborting.' % (data_len, min_data_len))
return False
if max_data_len is not None and data_len > max_data_len:
self.to_screen(u'\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
self.to_screen('\r[download] File is larger than max-filesize (%s bytes > %s bytes). Aborting.' % (data_len, max_data_len))
return False
data_len_str = format_bytes(data_len)
byte_counter = 0 + resume_len
block_size = self.params.get('buffersize', 1024)
start = time.time()
# measure time over whole while-loop, so slow_down() and best_block_size() work together properly
now = None # needed for slow_down() in the first loop run
before = start # start measuring
while True:
# Download and write
before = time.time()
data_block = data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
after = time.time()
if len(data_block) == 0:
break
byte_counter += len(data_block)
# Open file just in time
# exit loop when download is finished
if len(data_block) == 0:
break
# Open destination file just in time
if stream is None:
try:
(stream, tmpfilename) = sanitize_open(tmpfilename, open_mode)
@@ -151,19 +159,30 @@ class HttpFD(FileDownloader):
filename = self.undo_temp_name(tmpfilename)
self.report_destination(filename)
except (OSError, IOError) as err:
self.report_error(u'unable to open for writing: %s' % str(err))
self.report_error('unable to open for writing: %s' % str(err))
return False
try:
stream.write(data_block)
except (IOError, OSError) as err:
self.to_stderr(u"\n")
self.report_error(u'unable to write data: %s' % str(err))
self.to_stderr('\n')
self.report_error('unable to write data: %s' % str(err))
return False
# Apply rate limit
self.slow_down(start, now, byte_counter - resume_len)
# end measuring of one loop run
now = time.time()
after = now
# Adjust block size
if not self.params.get('noresizebuffer', False):
block_size = self.best_block_size(after - before, len(data_block))
before = after
# Progress message
speed = self.calc_speed(start, time.time(), byte_counter - resume_len)
speed = self.calc_speed(start, now, byte_counter - resume_len)
if data_len is None:
eta = percent = None
else:
@@ -184,14 +203,11 @@ class HttpFD(FileDownloader):
if is_test and byte_counter == data_len:
break
# Apply rate limit
self.slow_down(start, byte_counter - resume_len)
if stream is None:
self.to_stderr(u"\n")
self.report_error(u'Did not get any data blocks')
self.to_stderr('\n')
self.report_error('Did not get any data blocks')
return False
if tmpfilename != u'-':
if tmpfilename != '-':
stream.close()
self.report_finish(data_len_str, (time.time() - start))
if data_len is not None and byte_counter != data_len:

@@ -1,8 +1,11 @@
from __future__ import unicode_literals
import os
import subprocess
from .common import FileDownloader
from ..utils import (
check_executable,
encodeFilename,
)
@@ -13,19 +16,19 @@ class MplayerFD(FileDownloader):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
args = ['mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy', '-dumpstream', '-dumpfile', tmpfilename, url]
args = [
'mplayer', '-really-quiet', '-vo', 'null', '-vc', 'dummy',
'-dumpstream', '-dumpfile', tmpfilename, url]
# Check for mplayer first
try:
subprocess.call(['mplayer', '-h'], stdout=(open(os.path.devnull, 'w')), stderr=subprocess.STDOUT)
except (OSError, IOError):
self.report_error(u'MMS or RTSP download detected but "%s" could not be run' % args[0])
if not check_executable('mplayer', ['-h']):
self.report_error('MMS or RTSP download detected but "%s" could not be run' % args[0])
return False
# Download using mplayer.
retval = subprocess.call(args)
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen(u'\r[%s] %s bytes' % (args[0], fsize))
self.to_screen('\r[%s] %s bytes' % (args[0], fsize))
self.try_rename(tmpfilename, filename)
self._hook_progress({
'downloaded_bytes': fsize,
@@ -35,6 +38,6 @@ class MplayerFD(FileDownloader):
})
return True
else:
self.to_stderr(u"\n")
self.report_error(u'mplayer exited with code %d' % retval)
self.to_stderr('\n')
self.report_error('mplayer exited with code %d' % retval)
return False

@@ -7,9 +7,9 @@ import sys
import time
from .common import FileDownloader
from ..compat import compat_str
from ..utils import (
check_executable,
compat_str,
encodeFilename,
format_bytes,
get_exe_version,
@@ -46,13 +46,13 @@ class RtmpFD(FileDownloader):
continue
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec \(([0-9]{1,2}\.[0-9])%\)', line)
if mobj:
downloaded_data_len = int(float(mobj.group(1))*1024)
downloaded_data_len = int(float(mobj.group(1)) * 1024)
percent = float(mobj.group(2))
if not resume_percent:
resume_percent = percent
resume_downloaded_data_len = downloaded_data_len
eta = self.calc_eta(start, time.time(), 100-resume_percent, percent-resume_percent)
speed = self.calc_speed(start, time.time(), downloaded_data_len-resume_downloaded_data_len)
eta = self.calc_eta(start, time.time(), 100 - resume_percent, percent - resume_percent)
speed = self.calc_speed(start, time.time(), downloaded_data_len - resume_downloaded_data_len)
data_len = None
if percent > 0:
data_len = int(downloaded_data_len * 100 / percent)
@@ -72,7 +72,7 @@ class RtmpFD(FileDownloader):
# no percent for live streams
mobj = re.search(r'([0-9]+\.[0-9]{3}) kB / [0-9]+\.[0-9]{2} sec', line)
if mobj:
downloaded_data_len = int(float(mobj.group(1))*1024)
downloaded_data_len = int(float(mobj.group(1)) * 1024)
time_now = time.time()
speed = self.calc_speed(start, time_now, downloaded_data_len)
self.report_progress_live_stream(downloaded_data_len, speed, time_now - start)
@@ -88,7 +88,7 @@ class RtmpFD(FileDownloader):
if not cursor_in_new_line:
self.to_screen('')
cursor_in_new_line = True
self.to_screen('[rtmpdump] '+line)
self.to_screen('[rtmpdump] ' + line)
proc.wait()
if not cursor_in_new_line:
self.to_screen('')
@@ -180,12 +180,12 @@ class RtmpFD(FileDownloader):
while (retval == RD_INCOMPLETE or retval == RD_FAILED) and not test and not live:
prevsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('[rtmpdump] %s bytes' % prevsize)
time.sleep(5.0) # This seems to be needed
time.sleep(5.0) # This seems to be needed
retval = run_rtmpdump(basic_args + ['-e'] + [[], ['-k', '1']][retval == RD_FAILED])
cursize = os.path.getsize(encodeFilename(tmpfilename))
if prevsize == cursize and retval == RD_FAILED:
break
# Some rtmp streams seem abort after ~ 99.8%. Don't complain for those
# Some rtmp streams seem abort after ~ 99.8%. Don't complain for those
if prevsize == cursize and retval == RD_INCOMPLETE and cursize > 1024:
self.to_screen('[rtmpdump] Could not download the whole video. This can happen for some advertisements.')
retval = RD_SUCCESS

@@ -1,8 +1,13 @@
from __future__ import unicode_literals
from .abc import ABCIE
from .academicearth import AcademicEarthCourseIE
from .addanime import AddAnimeIE
from .adobetv import AdobeTVIE
from .adultswim import AdultSwimIE
from .aftonbladet import AftonbladetIE
from .aljazeera import AlJazeeraIE
from .alphaporno import AlphaPornoIE
from .anitube import AnitubeIE
from .anysex import AnySexIE
from .aol import AolIE
@@ -20,21 +25,26 @@ from .arte import (
ArteTVDDCIE,
ArteTVEmbedIE,
)
from .atresplayer import AtresPlayerIE
from .audiomack import AudiomackIE
from .auengine import AUEngineIE
from .azubu import AzubuIE
from .bambuser import BambuserIE, BambuserChannelIE
from .bandcamp import BandcampIE, BandcampAlbumIE
from .bbccouk import BBCCoUkIE
from .beeg import BeegIE
from .behindkink import BehindKinkIE
from .bet import BetIE
from .bild import BildIE
from .bilibili import BiliBiliIE
from .blinkx import BlinkxIE
from .bliptv import BlipTVIE, BlipTVUserIE
from .bloomberg import BloombergIE
from .bpb import BpbIE
from .br import BRIE
from .breakcom import BreakIE
from .brightcove import BrightcoveIE
from .buzzfeed import BuzzFeedIE
from .byutv import BYUtvIE
from .c56 import C56IE
from .canal13cl import Canal13clIE
@@ -45,7 +55,7 @@ from .cbsnews import CBSNewsIE
from .ceskatelevize import CeskaTelevizeIE
from .channel9 import Channel9IE
from .chilloutzone import ChilloutzoneIE
from .cinemassacre import CinemassacreIE
from .cinchcast import CinchcastIE
from .clipfish import ClipfishIE
from .cliphunter import CliphunterIE
from .clipsyndicate import ClipsyndicateIE
@@ -56,9 +66,12 @@ from .cnet import CNETIE
from .cnn import (
CNNIE,
CNNBlogsIE,
CNNArticleIE,
)
from .collegehumor import CollegeHumorIE
from .comedycentral import ComedyCentralIE, ComedyCentralShowsIE
from .comcarcoff import ComCarCoffIE
from .commonmistakes import CommonMistakesIE
from .condenast import CondeNastIE
from .cracked import CrackedIE
from .criterion import CriterionIE
@@ -80,12 +93,14 @@ from .dotsub import DotsubIE
from .dreisat import DreiSatIE
from .drtuber import DrTuberIE
from .drtv import DRTVIE
from .dvtv import DVTVIE
from .dump import DumpIE
from .defense import DefenseGouvFrIE
from .discovery import DiscoveryIE
from .divxstage import DivxStageIE
from .dropbox import DropboxIE
from .ebaumsworld import EbaumsWorldIE
from .echomsk import EchoMskIE
from .ehow import EHowIE
from .eighttracks import EightTracksIE
from .einthusan import EinthusanIE
@@ -98,6 +113,7 @@ from .elpais import ElPaisIE
from .empflix import EMPFlixIE
from .engadget import EngadgetIE
from .eporner import EpornerIE
from .eroprofile import EroProfileIE
from .escapist import EscapistIE
from .everyonesmixtape import EveryonesMixtapeIE
from .exfm import ExfmIE
@@ -117,6 +133,8 @@ from .fktv import (
from .flickr import FlickrIE
from .folketinget import FolketingetIE
from .fourtube import FourTubeIE
from .foxgay import FoxgayIE
from .foxnews import FoxNewsIE
from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE
from .francetv import (
@@ -140,6 +158,8 @@ from .gamestar import GameStarIE
from .gametrailers import GametrailersIE
from .gdcvault import GDCVaultIE
from .generic import GenericIE
from .giantbomb import GiantBombIE
from .giga import GigaIE
from .glide import GlideIE
from .globo import GloboIE
from .godtube import GodTubeIE
@@ -150,10 +170,13 @@ from .googlesearch import GoogleSearchIE
from .gorillavid import GorillaVidIE
from .goshgay import GoshgayIE
from .grooveshark import GroovesharkIE
from .groupon import GrouponIE
from .hark import HarkIE
from .heise import HeiseIE
from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE
from .hentaistigma import HentaiStigmaIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hornbunny import HornBunnyIE
from .hostingbulk import HostingBulkIE
from .hotnewhiphop import HotNewHipHopIE
@@ -212,6 +235,7 @@ from .mdr import MDRIE
from .metacafe import MetacafeIE
from .metacritic import MetacriticIE
from .mgoon import MgoonIE
from .minhateca import MinhatecaIE
from .ministrygrid import MinistryGridIE
from .mit import TechTVMITIE, MITIE, OCWMITIE
from .mitele import MiTeleIE
@@ -238,9 +262,10 @@ from .muenchentv import MuenchenTVIE
from .musicplayon import MusicPlayOnIE
from .musicvault import MusicVaultIE
from .muzu import MuzuTVIE
from .myspace import MySpaceIE
from .myspace import MySpaceIE, MySpaceAlbumIE
from .myspass import MySpassIE
from .myvideo import MyVideoIE
from .myvidster import MyVidsterIE
from .naver import NaverIE
from .nba import NBAIE
from .nbc import (
@@ -249,6 +274,7 @@ from .nbc import (
)
from .ndr import NDRIE
from .ndtv import NDTVIE
from .nerdcubed import NerdCubedFeedIE
from .newgrounds import NewgroundsIE
from .newstube import NewstubeIE
from .nfb import NFBIE
@@ -275,6 +301,7 @@ from .nytimes import NYTimesIE
from .nuvid import NuvidIE
from .oktoberfesttv import OktoberfestTVIE
from .ooyala import OoyalaIE
from .openfilm import OpenFilmIE
from .orf import (
ORFTVthekIE,
ORFOE1IE,
@@ -298,10 +325,13 @@ from .promptfile import PromptFileIE
from .prosiebensat1 import ProSiebenSat1IE
from .pyvideo import PyvideoIE
from .quickvid import QuickVidIE
from .radiode import RadioDeIE
from .radiobremen import RadioBremenIE
from .radiofrance import RadioFranceIE
from .rai import RaiIE
from .rbmaradio import RBMARadioIE
from .redtube import RedTubeIE
from .restudy import RestudyIE
from .reverbnation import ReverbNationIE
from .ringtv import RingTVIE
from .ro220 import Ro220IE
@@ -310,12 +340,14 @@ from .roxwel import RoxwelIE
from .rtbf import RTBFIE
from .rtlnl import RtlXlIE
from .rtlnow import RTLnowIE
from .rtp import RTPIE
from .rts import RTSIE
from .rtve import RTVEALaCartaIE, RTVELiveIE
from .ruhd import RUHDIE
from .rutube import (
RutubeIE,
RutubeChannelIE,
RutubeEmbedIE,
RutubeMovieIE,
RutubePersonIE,
)
@@ -325,6 +357,8 @@ from .savefrom import SaveFromIE
from .sbs import SBSIE
from .scivee import SciVeeIE
from .screencast import ScreencastIE
from .screencastomatic import ScreencastOMaticIE
from .screenwavemedia import CinemassacreIE, ScreenwaveMediaIE, TeamFourIE
from .servingsys import ServingSysIE
from .sexu import SexuIE
from .sexykarma import SexyKarmaIE
@@ -372,6 +406,7 @@ from .syfy import SyfyIE
from .sztvhu import SztvHuIE
from .tagesschau import TagesschauIE
from .tapely import TapelyIE
from .tass import TassIE
from .teachertube import (
TeacherTubeIE,
TeacherTubeUserIE,
@@ -383,6 +418,7 @@ from .ted import TEDIE
from .telebruxelles import TeleBruxellesIE
from .telecinco import TelecincoIE
from .telemb import TeleMBIE
from .teletask import TeleTaskIE
from .tenplay import TenPlayIE
from .testurl import TestURLIE
from .tf1 import TF1IE
@@ -392,6 +428,7 @@ from .thesixtyone import TheSixtyOneIE
from .thisav import ThisAVIE
from .tinypic import TinyPicIE
from .tlc import TlcIE, TlcDeIE
from .tmz import TMZIE
from .tnaflix import TNAFlixIE
from .thvideo import (
THVideoIE,
@@ -405,11 +442,13 @@ from .trutube import TruTubeIE
from .tube8 import Tube8IE
from .tudou import TudouIE
from .tumblr import TumblrIE
from .tunein import TuneInIE
from .turbo import TurboIE
from .tutv import TutvIE
from .tvigle import TvigleIE
from .tvp import TvpIE
from .tvplay import TVPlayIE
from .twentyfourvideo import TwentyFourVideoIE
from .twitch import TwitchIE
from .ubu import UbuIE
from .udemy import (
@@ -438,6 +477,7 @@ from .videott import VideoTtIE
from .videoweed import VideoWeedIE
from .vidme import VidmeIE
from .vidzi import VidziIE
from .vier import VierIE, VierVideosIE
from .vimeo import (
VimeoIE,
VimeoAlbumIE,
@@ -454,7 +494,10 @@ from .vine import (
VineUserIE,
)
from .viki import VikiIE
from .vk import VKIE
from .vk import (
VKIE,
VKUserVideosIE,
)
from .vodlocker import VodlockerIE
from .vporn import VpornIE
from .vrt import VRTIE
@@ -470,6 +513,7 @@ from .wdr import (
WDRMobileIE,
WDRMausIE,
)
from .webofstories import WebOfStoriesIE
from .weibo import WeiboIE
from .wimp import WimpIE
from .wistia import WistiaIE
@@ -478,13 +522,16 @@ from .wrzuta import WrzutaIE
from .xbef import XBefIE
from .xboxclips import XboxClipsIE
from .xhamster import XHamsterIE
from .xminus import XMinusIE
from .xnxx import XNXXIE
from .xvideos import XVideosIE
from .xtube import XTubeUserIE, XTubeIE
from .xxxymovies import XXXYMoviesIE
from .yahoo import (
YahooIE,
YahooSearchIE,
)
from .yesjapan import YesJapanIE
from .ynet import YnetIE
from .youjizz import YouJizzIE
from .youku import YoukuIE
@@ -502,12 +549,18 @@ from .youtube import (
YoutubeSearchURLIE,
YoutubeShowIE,
YoutubeSubscriptionsIE,
YoutubeTopListIE,
YoutubeTruncatedIDIE,
YoutubeTruncatedURLIE,
YoutubeUserIE,
YoutubeWatchLaterIE,
)
from .zdf import ZDFIE
from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import (
ZingMp3SongIE,
ZingMp3AlbumIE,
)
from ..utils import age_restricted
_ALL_CLASSES = [
klass
@@ -524,6 +577,17 @@ def gen_extractors():
return [klass() for klass in _ALL_CLASSES]
def list_extractors(age_limit):
"""
Return a list of extractors that are suitable for the given age,
sorted by extractor ID.
"""
return sorted(
filter(lambda ie: ie.is_suitable(age_limit), gen_extractors()),
key=lambda ie: ie.IE_NAME.lower())
def get_info_extractor(ie_name):
"""Returns the info extractor class with the given ie_name"""
return globals()[ie_name+'IE']
return globals()[ie_name + 'IE']
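
The new `list_extractors(age_limit)` above filters by suitability and sorts case-insensitively by `IE_NAME`. The sketch below reproduces that pattern with stand-in objects; `FakeIE`, its `is_suitable` rule, and `list_suitable` are assumptions made for illustration and are not the real extractor classes.

```python
class FakeIE(object):
    """Stand-in for an extractor; only the attributes the sketch needs."""

    def __init__(self, name, age_limit=0):
        self.IE_NAME = name
        self._age_limit = age_limit

    def is_suitable(self, age_limit):
        # Assumed rule: no limit given means everything is suitable.
        return age_limit is None or self._age_limit <= age_limit


def list_suitable(extractors, age_limit):
    # Same shape as list_extractors(): filter, then sort by lowercased name.
    return sorted(
        (ie for ie in extractors if ie.is_suitable(age_limit)),
        key=lambda ie: ie.IE_NAME.lower())


if __name__ == '__main__':
    pool = [FakeIE('Zdf'), FakeIE('adultOnly', 18), FakeIE('abc')]
    print([ie.IE_NAME for ie in list_suitable(pool, 10)])
```

Sorting on the lowercased name keeps mixed-case identifiers in the order a reader would expect in a listing.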

@@ -1,4 +1,5 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
@@ -18,15 +19,14 @@ class AcademicEarthCourseIE(InfoExtractor):
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
playlist_id = m.group('id')
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(
r'<h1 class="playlist-name"[^>]*?>(.*?)</h1>', webpage, u'title')
r'<h1 class="playlist-name"[^>]*?>(.*?)</h1>', webpage, 'title')
description = self._html_search_regex(
r'<p class="excerpt"[^>]*?>(.*?)</p>',
webpage, u'description', fatal=False)
webpage, 'description', fatal=False)
urls = re.findall(
r'<li class="lecture-preview">\s*?<a target="_blank" href="([^"]+)">',
webpage)

@@ -15,8 +15,7 @@ from ..utils import (
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php\?(?:.*?)v=(?P<video_id>[\w_]+)(?:.*)'
_VALID_URL = r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php\?(?:.*?)v=(?P<id>[\w_]+)(?:.*)'
_TEST = {
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
@@ -29,9 +28,9 @@ class AddAnimeIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
try:
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('video_id')
webpage = self._download_webpage(url, video_id)
except ExtractorError as ee:
if not isinstance(ee.cause, compat_HTTPError) or \
@@ -49,7 +48,7 @@ class AddAnimeIE(InfoExtractor):
r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
redir_webpage)
if av is None:
raise ExtractorError(u'Cannot find redirect math task')
raise ExtractorError('Cannot find redirect math task')
av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))
parsed_url = compat_urllib_parse_urlparse(url)
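
The rename of the regex group from `video_id` to `id` goes together with the switch to `self._match_id(url)`. The sketch below assumes that `_match_id` matches the URL against `_VALID_URL` and returns the group named `id`; `match_id` is a hypothetical free function written under that assumption, not the library's own code.

```python
import re

# Pattern and test URL copied from the AddAnimeIE hunk above.
VALID_URL = (r'^http://(?:\w+\.)?add-anime\.net/watch_video\.php'
             r'\?(?:.*?)v=(?P<id>[\w_]+)(?:.*)')


def match_id(valid_url, url):
    # Assumed behaviour of _match_id: match, then read the 'id' group.
    m = re.match(valid_url, url)
    if m is None:
        raise ValueError('URL does not match pattern: %r' % url)
    return m.group('id')


if __name__ == '__main__':
    print(match_id(VALID_URL,
                   'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9'))
```

Under this assumption, a pattern that still used `(?P<video_id>...)` would fail at `m.group('id')`, which is why the group name and the call change in the same commit.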

@@ -0,0 +1,70 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
parse_duration,
unified_strdate,
str_to_int,
)
class AdobeTVIE(InfoExtractor):
_VALID_URL = r'https?://tv\.adobe\.com/watch/[^/]+/(?P<id>[^/]+)'
_TEST = {
'url': 'http://tv.adobe.com/watch/the-complete-picture-with-julieanne-kost/quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop/',
'md5': '9bc5727bcdd55251f35ad311ca74fa1e',
'info_dict': {
'id': 'quick-tip-how-to-draw-a-circle-around-an-object-in-photoshop',
'ext': 'mp4',
'title': 'Quick Tip - How to Draw a Circle Around an Object in Photoshop',
'description': 'md5:99ec318dc909d7ba2a1f2b038f7d2311',
'thumbnail': 're:https?://.*\.jpg$',
'upload_date': '20110914',
'duration': 60,
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player = self._parse_json(
self._search_regex(r'html5player:\s*({.+?})\s*\n', webpage, 'player'),
video_id)
title = player.get('title') or self._search_regex(
r'data-title="([^"]+)"', webpage, 'title')
description = self._og_search_description(webpage)
thumbnail = self._og_search_thumbnail(webpage)
upload_date = unified_strdate(
self._html_search_meta('datepublished', webpage, 'upload date'))
duration = parse_duration(
self._html_search_meta('duration', webpage, 'duration')
or self._search_regex(r'Runtime:\s*(\d{2}:\d{2}:\d{2})', webpage, 'duration'))
view_count = str_to_int(self._search_regex(
r'<div class="views">\s*Views?:\s*([\d,.]+)\s*</div>',
webpage, 'view count'))
formats = [{
'url': source['src'],
'format_id': source.get('quality') or source['src'].split('-')[-1].split('.')[0] or None,
'tbr': source.get('bitrate'),
} for source in player['sources']]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'upload_date': upload_date,
'duration': duration,
'view_count': view_count,
'formats': formats,
}
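
Note on the `format_id` expression in the new AdobeTV extractor: it prefers the `quality` field and otherwise falls back to the last `-`-separated token of the file name, without its extension. A minimal sketch of that fallback, assuming source entries shaped like the ones the extractor reads; the helper name `guess_format_id` and the example URLs are invented for illustration.

```python
def guess_format_id(source):
    """Mirror the expression from the hunk: quality first, then a URL-derived token."""
    return source.get('quality') or source['src'].split('-')[-1].split('.')[0] or None


# Hypothetical player sources; only the key names come from the diff.
sources = [
    {'src': 'http://example.com/videos/clip-1080.mp4', 'quality': 'hd'},
    {'src': 'http://example.com/videos/clip-480.mp4'},
]
format_ids = [guess_format_id(s) for s in sources]
```

If a URL contains no `-` at all, the fallback yields everything before the first `.` of the whole URL, so the derived identifier is only meaningful for file names that follow the `name-quality.ext` pattern.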

View File

@@ -2,122 +2,150 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
xpath_text,
float_or_none,
)
class AdultSwimIE(InfoExtractor):
_VALID_URL = r'https?://video\.adultswim\.com/(?P<path>.+?)(?:\.html)?(?:\?.*)?(?:#.*)?$'
_TEST = {
'url': 'http://video.adultswim.com/rick-and-morty/close-rick-counters-of-the-rick-kind.html?x=y#title',
_VALID_URL = r'https?://(?:www\.)?adultswim\.com/videos/(?P<is_playlist>playlists/)?(?P<show_path>[^/]+)/(?P<episode_path>[^/?#]+)/?'
_TESTS = [{
'url': 'http://adultswim.com/videos/rick-and-morty/pilot',
'playlist': [
{
'md5': '4da359ec73b58df4575cd01a610ba5dc',
'md5': '247572debc75c7652f253c8daa51a14d',
'info_dict': {
'id': '8a250ba1450996e901453d7f02ca02f5',
'id': 'rQxZvXQ4ROaSOqq-or2Mow-0',
'ext': 'flv',
'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 1',
'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
'uploader': 'Rick and Morty',
'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
}
'title': 'Rick and Morty - Pilot Part 1',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
},
},
{
'md5': 'ffbdf55af9331c509d95350bd0cc1819',
'md5': '77b0e037a4b20ec6b98671c4c379f48d',
'info_dict': {
'id': '8a250ba1450996e901453d7f4bd102f6',
'id': 'rQxZvXQ4ROaSOqq-or2Mow-3',
'ext': 'flv',
'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 2',
'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
'uploader': 'Rick and Morty',
'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
}
'title': 'Rick and Morty - Pilot Part 4',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
},
},
],
'info_dict': {
'title': 'Rick and Morty - Pilot',
'description': "Rick moves in with his daughter's family and establishes himself as a bad influence on his grandson, Morty. "
}
}, {
'url': 'http://www.adultswim.com/videos/playlists/american-parenting/putting-francine-out-of-business/',
'playlist': [
{
'md5': 'b92409635540304280b4b6c36bd14a0a',
'md5': '2eb5c06d0f9a1539da3718d897f13ec5',
'info_dict': {
'id': '8a250ba1450996e901453d7fa73c02f7',
'id': '-t8CamQlQ2aYZ49ItZCFog-0',
'ext': 'flv',
'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 3',
'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
'uploader': 'Rick and Morty',
'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
}
},
{
'md5': 'e8818891d60e47b29cd89d7b0278156d',
'info_dict': {
'id': '8a250ba1450996e901453d7fc8ba02f8',
'ext': 'flv',
'title': 'Rick and Morty Close Rick-Counters of the Rick Kind part 4',
'description': 'Rick has a run in with some old associates, resulting in a fallout with Morty. You got any chips, broh?',
'uploader': 'Rick and Morty',
'thumbnail': 'http://i.cdn.turner.com/asfix/repository/8a250ba13f865824013fc9db8b6b0400/thumbnail_267549017116827057.jpg'
}
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},
}
]
}
],
'info_dict': {
'title': 'American Dad - Putting Francine Out of Business',
'description': 'Stan hatches a plan to get Francine out of the real estate business.Watch more American Dad on [adult swim].'
},
}]
_video_extensions = {
'3500': 'flv',
'640': 'mp4',
'150': 'mp4',
'ipad': 'm3u8',
'iphone': 'm3u8'
}
_video_dimensions = {
'3500': (1280, 720),
'640': (480, 270),
'150': (320, 180)
}
@staticmethod
def find_video_info(collection, slug):
for video in collection.get('videos'):
if video.get('slug') == slug:
return video
@staticmethod
def find_collection_by_linkURL(collections, linkURL):
for collection in collections:
if collection.get('linkURL') == linkURL:
return collection
@staticmethod
def find_collection_containing_video(collections, slug):
for collection in collections:
for video in collection.get('videos'):
if video.get('slug') == slug:
return collection, video
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_path = mobj.group('path')
show_path = mobj.group('show_path')
episode_path = mobj.group('episode_path')
is_playlist = True if mobj.group('is_playlist') else False
webpage = self._download_webpage(url, video_path)
episode_id = self._html_search_regex(
r'<link rel="video_src" href="http://i\.adultswim\.com/adultswim/adultswimtv/tools/swf/viralplayer.swf\?id=([0-9a-f]+?)"\s*/?\s*>',
webpage, 'episode_id')
title = self._og_search_title(webpage)
webpage = self._download_webpage(url, episode_path)
index_url = 'http://asfix.adultswim.com/asfix-svc/episodeSearch/getEpisodesByIDs?networkName=AS&ids=%s' % episode_id
idoc = self._download_xml(index_url, title, 'Downloading episode index', 'Unable to download episode index')
# Extract the value of `bootstrappedData` from the Javascript in the page.
bootstrappedDataJS = self._search_regex(r'var bootstrappedData = ({.*});', webpage, episode_path)
episode_el = idoc.find('.//episode')
show_title = episode_el.attrib.get('collectionTitle')
episode_title = episode_el.attrib.get('title')
thumbnail = episode_el.attrib.get('thumbnailUrl')
description = episode_el.find('./description').text.strip()
try:
bootstrappedData = json.loads(bootstrappedDataJS)
except ValueError as ve:
errmsg = '%s: Failed to parse JSON ' % episode_path
raise ExtractorError(errmsg, cause=ve)
# Downloading videos from a /videos/playlist/ URL needs to be handled differently.
# NOTE: We are only downloading one video (the current one) not the playlist
if is_playlist:
collections = bootstrappedData['playlists']['collections']
collection = self.find_collection_by_linkURL(collections, show_path)
video_info = self.find_video_info(collection, episode_path)
show_title = video_info['showTitle']
segment_ids = [video_info['videoPlaybackID']]
else:
collections = bootstrappedData['show']['collections']
collection, video_info = self.find_collection_containing_video(collections, episode_path)
show = bootstrappedData['show']
show_title = show['title']
segment_ids = [clip['videoPlaybackID'] for clip in video_info['clips']]
episode_id = video_info['id']
episode_title = video_info['title']
episode_description = video_info['description']
episode_duration = video_info.get('duration')
entries = []
segment_els = episode_el.findall('./segments/segment')
for part_num, segment_id in enumerate(segment_ids):
segment_url = 'http://www.adultswim.com/videos/api/v0/assets?id=%s&platform=mobile' % segment_id
for part_num, segment_el in enumerate(segment_els):
segment_id = segment_el.attrib.get('id')
segment_title = '%s %s part %d' % (show_title, episode_title, part_num + 1)
thumbnail = segment_el.attrib.get('thumbnailUrl')
duration = segment_el.attrib.get('duration')
segment_title = '%s - %s' % (show_title, episode_title)
if len(segment_ids) > 1:
segment_title += ' Part %d' % (part_num + 1)
segment_url = 'http://asfix.adultswim.com/asfix-svc/episodeservices/getCvpPlaylist?networkName=AS&id=%s' % segment_id
idoc = self._download_xml(
segment_url, segment_title,
'Downloading segment information', 'Unable to download segment information')
segment_duration = float_or_none(
xpath_text(idoc, './/trt', 'segment duration').strip())
formats = []
file_els = idoc.findall('.//files/file')
for file_el in file_els:
bitrate = file_el.attrib.get('bitrate')
type = file_el.attrib.get('type')
width, height = self._video_dimensions.get(bitrate, (None, None))
ftype = file_el.attrib.get('type')
formats.append({
'format_id': '%s-%s' % (bitrate, type),
'url': file_el.text,
'ext': self._video_extensions.get(bitrate, 'mp4'),
'format_id': '%s_%s' % (bitrate, ftype),
'url': file_el.text.strip(),
# The bitrate may not be a number (for example: 'iphone')
'tbr': int(bitrate) if bitrate.isdigit() else None,
'height': height,
'width': width
'quality': 1 if ftype == 'hd' else -1
})
self._sort_formats(formats)
@@ -126,18 +154,16 @@ class AdultSwimIE(InfoExtractor):
'id': segment_id,
'title': segment_title,
'formats': formats,
'uploader': show_title,
'thumbnail': thumbnail,
'duration': duration,
'description': description
'duration': segment_duration,
'description': episode_description
})
return {
'_type': 'playlist',
'id': episode_id,
'display_id': video_path,
'display_id': episode_path,
'entries': entries,
'title': '%s %s' % (show_title, episode_title),
'description': description,
'thumbnail': thumbnail
'title': '%s - %s' % (show_title, episode_title),
'description': episode_description,
'duration': episode_duration
}
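
Note on the rewritten AdultSwim extractor: the new code reads a `bootstrappedData` JavaScript object out of the page, finds the video by slug, and titles each segment, adding a "Part N" suffix only when there is more than one segment. The sketch below walks through those steps on a made-up page. The sample data is invented, and unlike the helper in the diff (which implicitly returns `None` when nothing matches), this version returns `(None, None)` so the caller's tuple unpacking does not fail.

```python
import json
import re

# Invented sample page; only the key names follow the diff.
SAMPLE_PAGE = (
    '<script>\n'
    'var bootstrappedData = {"show": {"title": "Example Show", "collections": '
    '[{"videos": [{"slug": "pilot", "title": "Pilot", "clips": '
    '[{"videoPlaybackID": "seg-0"}, {"videoPlaybackID": "seg-1"}]}]}]}};\n'
    '</script>'
)


def extract_bootstrapped_data(webpage):
    """Pull the JSON object assigned to bootstrappedData and decode it."""
    m = re.search(r'var bootstrappedData = ({.*});', webpage)
    if m is None:
        raise ValueError('bootstrappedData not found')
    return json.loads(m.group(1))


def find_collection_containing_video(collections, slug):
    """Return (collection, video) for the given slug, or (None, None)."""
    for collection in collections:
        for video in collection.get('videos') or []:
            if video.get('slug') == slug:
                return collection, video
    return None, None


data = extract_bootstrapped_data(SAMPLE_PAGE)
show = data['show']
collection, video = find_collection_containing_video(show['collections'], 'pilot')
segment_ids = [clip['videoPlaybackID'] for clip in video['clips']]

titles = []
for part_num, _segment_id in enumerate(segment_ids):
    title = '%s - %s' % (show['title'], video['title'])
    if len(segment_ids) > 1:
        title += ' Part %d' % (part_num + 1)
    titles.append(title)
```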

View File

@@ -0,0 +1,35 @@
from __future__ import unicode_literals
from .common import InfoExtractor
class AlJazeeraIE(InfoExtractor):
_VALID_URL = r'http://www\.aljazeera\.com/programmes/.*?/(?P<id>[^/]+)\.html'
_TEST = {
'url': 'http://www.aljazeera.com/programmes/the-slum/2014/08/deliverance-201482883754237240.html',
'info_dict': {
'id': '3792260579001',
'ext': 'mp4',
'title': 'The Slum - Episode 1: Deliverance',
'description': 'As a birth attendant advocating for family planning, Remy is on the frontline of Tondo\'s battle with overcrowding.',
'uploader': 'Al Jazeera English',
},
'add_ie': ['Brightcove'],
}
def _real_extract(self, url):
program_name = self._match_id(url)
webpage = self._download_webpage(url, program_name)
brightcove_id = self._search_regex(
r'RenderPagesVideo\(\'(.+?)\'', webpage, 'brightcove id')
return {
'_type': 'url',
'url': (
'brightcove:'
'playerKey=AQ~~%2CAAAAmtVJIFk~%2CTVGOQ5ZTwJbeMWnq5d_H4MOM57xfzApc'
'&%40videoPlayer={0}'.format(brightcove_id)
),
'ie_key': 'Brightcove',
}

View File

@@ -5,15 +5,14 @@ import re
import json
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
compat_str,
qualities,
determine_ext,
)
class AllocineIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?allocine\.fr/(?P<typ>article|video|film)/(fichearticle_gen_carticle=|player_gen_cmedia=|fichefilm_gen_cfilm=)(?P<id>[0-9]+)(?:\.html)?'
_VALID_URL = r'https?://(?:www\.)?allocine\.fr/(?P<typ>article|video|film)/(fichearticle_gen_carticle=|player_gen_cmedia=|fichefilm_gen_cfilm=|video-)(?P<id>[0-9]+)(?:\.html)?'
_TESTS = [{
'url': 'http://www.allocine.fr/article/fichearticle_gen_carticle=18635087.html',
@@ -45,6 +44,9 @@ class AllocineIE(InfoExtractor):
'description': 'md5:71742e3a74b0d692c7fce0dd2017a4ac',
'thumbnail': 're:http://.*\.jpg',
},
}, {
'url': 'http://www.allocine.fr/video/video-19550147/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -75,9 +77,7 @@ class AllocineIE(InfoExtractor):
'format_id': format_id,
'quality': quality(format_id),
'url': v,
'ext': determine_ext(v),
})
self._sort_formats(formats)
return {

View File

@@ -0,0 +1,77 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
parse_iso8601,
parse_duration,
parse_filesize,
int_or_none,
)
class AlphaPornoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?alphaporno\.com/videos/(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.alphaporno.com/videos/sensual-striptease-porn-with-samantha-alexandra/',
'md5': 'feb6d3bba8848cd54467a87ad34bd38e',
'info_dict': {
'id': '258807',
'display_id': 'sensual-striptease-porn-with-samantha-alexandra',
'ext': 'mp4',
'title': 'Sensual striptease porn with Samantha Alexandra',
'thumbnail': 're:https?://.*\.jpg$',
'timestamp': 1418694611,
'upload_date': '20141216',
'duration': 387,
'filesize_approx': 54120000,
'tbr': 1145,
'categories': list,
'age_limit': 18,
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r"video_id\s*:\s*'([^']+)'", webpage, 'video id', default=None)
video_url = self._search_regex(
r"video_url\s*:\s*'([^']+)'", webpage, 'video url')
ext = self._html_search_meta(
'encodingFormat', webpage, 'ext', default='.mp4')[1:]
title = self._search_regex(
[r'<meta content="([^"]+)" itemprop="description">',
r'class="title" itemprop="name">([^<]+)<'],
webpage, 'title')
thumbnail = self._html_search_meta('thumbnail', webpage, 'thumbnail')
timestamp = parse_iso8601(self._html_search_meta(
'uploadDate', webpage, 'upload date'))
duration = parse_duration(self._html_search_meta(
'duration', webpage, 'duration'))
filesize_approx = parse_filesize(self._html_search_meta(
'contentSize', webpage, 'file size'))
bitrate = int_or_none(self._html_search_meta(
'bitrate', webpage, 'bitrate'))
categories = self._html_search_meta(
'keywords', webpage, 'categories', default='').split(',')
age_limit = self._rta_search(webpage)
return {
'id': video_id,
'display_id': display_id,
'url': video_url,
'ext': ext,
'title': title,
'thumbnail': thumbnail,
'timestamp': timestamp,
'duration': duration,
'filesize_approx': filesize_approx,
'tbr': bitrate,
'categories': categories,
'age_limit': age_limit,
}

View File

@@ -3,7 +3,6 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .fivemin import FiveMinIE
class AolIE(InfoExtractor):
@@ -42,31 +41,30 @@ class AolIE(InfoExtractor):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
playlist_id = mobj.group('playlist_id')
if playlist_id and not self._downloader.params.get('noplaylist'):
self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
if not playlist_id or self._downloader.params.get('noplaylist'):
return self.url_result('5min:%s' % video_id)
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(
r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
playlist_html = self._search_regex(
r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
'playlist HTML')
entries = [{
'_type': 'url',
'url': 'aol-video:%s' % m.group('id'),
'ie_key': 'Aol',
} for m in re.finditer(
r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
playlist_html)]
self.to_screen('Downloading playlist %s - add --no-playlist to just download video %s' % (playlist_id, video_id))
return {
'_type': 'playlist',
'id': playlist_id,
'display_id': mobj.group('playlist_display_id'),
'title': title,
'entries': entries,
}
webpage = self._download_webpage(url, playlist_id)
title = self._html_search_regex(
r'<h1 class="video-title[^"]*">(.+?)</h1>', webpage, 'title')
playlist_html = self._search_regex(
r"(?s)<ul\s+class='video-related[^']*'>(.*?)</ul>", webpage,
'playlist HTML')
entries = [{
'_type': 'url',
'url': 'aol-video:%s' % m.group('id'),
'ie_key': 'Aol',
} for m in re.finditer(
r"<a\s+href='.*videoid=(?P<id>[0-9]+)'\s+class='video-thumb'>",
playlist_html)]
return FiveMinIE._build_result(video_id)
return {
'_type': 'playlist',
'id': playlist_id,
'display_id': mobj.group('playlist_display_id'),
'title': title,
'entries': entries,
}

View File

@@ -1,5 +1,4 @@
#coding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -26,8 +25,7 @@ class AparatIE(InfoExtractor):
}
def _real_extract(self, url):
m = re.match(self._VALID_URL, url)
video_id = m.group('id')
video_id = self._match_id(url)
# Note: There is an easier-to-parse configuration at
# http://www.aparat.com/video/video/config/videohash/%video_id
@@ -40,15 +38,15 @@ class AparatIE(InfoExtractor):
for i, video_url in enumerate(video_urls):
req = HEADRequest(video_url)
res = self._request_webpage(
req, video_id, note=u'Testing video URL %d' % i, errnote=False)
req, video_id, note='Testing video URL %d' % i, errnote=False)
if res:
break
else:
raise ExtractorError(u'No working video URLs found')
raise ExtractorError('No working video URLs found')
title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, u'title')
title = self._search_regex(r'\s+title:\s*"([^"]+)"', webpage, 'title')
thumbnail = self._search_regex(
r'\s+image:\s*"([^"]+)"', webpage, u'thumbnail', fatal=False)
r'\s+image:\s*"([^"]+)"', webpage, 'thumbnail', fatal=False)
return {
'id': video_id,
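
Note on the Aparat hunk: the loop that probes candidate URLs relies on Python's `for ... else`, where the `else` branch runs only if the loop finished without `break`. A small sketch of the same control flow, with the network check replaced by a caller-supplied predicate; `first_working` and `is_reachable` are hypothetical names, not part of the extractor.

```python
def first_working(urls, is_reachable):
    """Return (index, url) of the first URL accepted by is_reachable."""
    for i, url in enumerate(urls):
        if is_reachable(url):
            break
    else:
        # Reached only when the loop never hit `break`, including for an empty list.
        raise LookupError('No working video URLs found')
    return i, url


index, chosen = first_working(['a', 'b', 'c'], lambda u: u == 'b')
```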

View File

@@ -4,8 +4,8 @@ import re
import json
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
compat_urlparse,
int_or_none,
)
@@ -70,15 +70,17 @@ class AppleTrailersIE(InfoExtractor):
uploader_id = mobj.group('company')
playlist_url = compat_urlparse.urljoin(url, 'includes/playlists/itunes.inc')
def fix_html(s):
s = re.sub(r'(?s)<script[^<]*?>.*?</script>', '', s)
s = re.sub(r'<img ([^<]*?)>', r'<img \1/>', s)
# The ' in the onClick attributes are not escaped, it couldn't be parsed
# like: http://trailers.apple.com/trailers/wb/gravity/
def _clean_json(m):
return 'iTunes.playURL(%s);' % m.group(1).replace('\'', '&#39;')
s = re.sub(self._JSON_RE, _clean_json, s)
s = '<html>' + s + u'</html>'
s = '<html>%s</html>' % s
return s
doc = self._download_xml(playlist_url, movie, transform_source=fix_html)
@@ -86,7 +88,7 @@ class AppleTrailersIE(InfoExtractor):
for li in doc.findall('./div/ul/li'):
on_click = li.find('.//a').attrib['onClick']
trailer_info_json = self._search_regex(self._JSON_RE,
on_click, 'trailer info')
on_click, 'trailer info')
trailer_info = json.loads(trailer_info_json)
title = trailer_info['title']
video_id = movie + '-' + re.sub(r'[^a-zA-Z0-9]', '', title).lower()

View File

@@ -1,42 +1,48 @@
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
unified_strdate,
)
from ..utils import unified_strdate
class ArchiveOrgIE(InfoExtractor):
IE_NAME = 'archive.org'
IE_DESC = 'archive.org videos'
_VALID_URL = r'(?:https?://)?(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
_TEST = {
"url": "http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect",
'file': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.ogv',
_VALID_URL = r'https?://(?:www\.)?archive\.org/details/(?P<id>[^?/]+)(?:[?].*)?$'
_TESTS = [{
'url': 'http://archive.org/details/XD300-23_68HighlightsAResearchCntAugHumanIntellect',
'md5': '8af1d4cf447933ed3c7f4871162602db',
'info_dict': {
"title": "1968 Demo - FJCC Conference Presentation Reel #1",
"description": "Reel 1 of 3: Also known as the \"Mother of All Demos\", Doug Engelbart's presentation at the Fall Joint Computer Conference in San Francisco, December 9, 1968 titled \"A Research Center for Augmenting Human Intellect.\" For this presentation, Doug and his team astonished the audience by not only relating their research, but demonstrating it live. This was the debut of the mouse, interactive computing, hypermedia, computer supported software engineering, video teleconferencing, etc. See also <a href=\"http://dougengelbart.org/firsts/dougs-1968-demo.html\" rel=\"nofollow\">Doug's 1968 Demo page</a> for more background, highlights, links, and the detailed paper published in this conference proceedings. Filmed on 3 reels: Reel 1 | <a href=\"http://www.archive.org/details/XD300-24_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 2</a> | <a href=\"http://www.archive.org/details/XD300-25_68HighlightsAResearchCntAugHumanIntellect\" rel=\"nofollow\">Reel 3</a>",
"upload_date": "19681210",
"uploader": "SRI International"
'id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect',
'ext': 'ogv',
'title': '1968 Demo - FJCC Conference Presentation Reel #1',
'description': 'md5:1780b464abaca9991d8968c877bb53ed',
'upload_date': '19681210',
'uploader': 'SRI International'
}
}
}, {
'url': 'https://archive.org/details/Cops1922',
'md5': '18f2a19e6d89af8425671da1cf3d4e04',
'info_dict': {
'id': 'Cops1922',
'ext': 'ogv',
'title': 'Buster Keaton\'s "Cops" (1922)',
'description': 'md5:70f72ee70882f713d4578725461ffcc3',
}
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
json_url = url + ('?' if '?' in url else '&') + 'output=json'
json_data = self._download_webpage(json_url, video_id)
data = json.loads(json_data)
data = self._download_json(json_url, video_id)
title = data['metadata']['title'][0]
description = data['metadata']['description'][0]
uploader = data['metadata']['creator'][0]
upload_date = unified_strdate(data['metadata']['date'][0])
def get_optional(data_dict, field):
return data_dict['metadata'].get(field, [None])[0]
title = get_optional(data, 'title')
description = get_optional(data, 'description')
uploader = get_optional(data, 'creator')
upload_date = unified_strdate(get_optional(data, 'date'))
formats = [
{
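
Note on `get_optional` in the archive.org hunk: the metadata fields are lists, and the helper substitutes `[None]` for a missing field so that indexing with `[0]` still works. This is what allows the second test case (`Cops1922`), which lists no uploader or upload date in its `info_dict`, to pass. A standalone sketch with invented sample metadata:

```python
def get_optional(data_dict, field):
    """Return the first value of a metadata field, or None if the field is absent."""
    return data_dict['metadata'].get(field, [None])[0]


# Invented sample response; only the 'metadata' layout follows the diff.
data = {'metadata': {'title': ['Cops (1922)'], 'description': ['A short film.']}}

title = get_optional(data, 'title')
uploader = get_optional(data, 'creator')
```

A field that is present but holds an empty list would still raise `IndexError`; the default only covers a missing key.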

View File

@@ -192,4 +192,3 @@ class ARDIE(InfoExtractor):
'upload_date': upload_date,
'thumbnail': thumbnail,
}

View File

@@ -13,7 +13,7 @@ from ..utils import (
qualities,
)
# There are different sources of video in arte.tv, the extraction process
# There are different sources of video in arte.tv, the extraction process
# is different for each one. The videos usually expire in 7 days, so we can't
# add tests.
@@ -37,7 +37,7 @@ class ArteTvIE(InfoExtractor):
config_xml_url, video_id, note='Downloading configuration')
formats = [{
'forma_id': q.attrib['quality'],
'format_id': q.attrib['quality'],
# The playpath starts at 'mp4:', if we don't manually
# split the url, rtmpdump will incorrectly parse them
'url': q.text.split('mp4:', 1)[0],
@@ -133,7 +133,7 @@ class ArteTVPlus7IE(InfoExtractor):
'width': int_or_none(f.get('width')),
'height': int_or_none(f.get('height')),
'tbr': int_or_none(f.get('bitrate')),
'quality': qfunc(f['quality']),
'quality': qfunc(f.get('quality')),
'source_preference': source_pref,
}

View File

@@ -0,0 +1,114 @@
from __future__ import unicode_literals
import time
import hmac
from .common import InfoExtractor
from ..utils import (
compat_str,
compat_urllib_request,
int_or_none,
float_or_none,
xpath_text,
ExtractorError,
)
class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
_TESTS = [
{
'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
'md5': 'efd56753cda1bb64df52a3074f62e38a',
'info_dict': {
'id': 'capitulo-10-especial-solidario-nochebuena',
'ext': 'mp4',
'title': 'Especial Solidario de Nochebuena',
'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 5527.6,
'thumbnail': 're:^https?://.*\.jpg$',
},
},
{
'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
'only_matching': True,
},
]
_USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
_TIMESTAMP_SHIFT = 30000
_TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
_URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
_PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
_EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
episode_id = self._search_regex(
r'episode="([^"]+)"', webpage, 'episode id')
timestamp = int_or_none(self._download_webpage(
self._TIME_API_URL,
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
token = hmac.new(
self._MAGIC.encode('ascii'),
(episode_id + timestamp_shifted).encode('utf-8')
).hexdigest()
formats = []
for fmt in ['windows', 'android_tablet']:
request = compat_urllib_request.Request(
self._URL_VIDEO_TEMPLATE.format(fmt, episode_id, timestamp_shifted, token))
request.add_header('Youtubedl-user-agent', self._USER_AGENT)
fmt_json = self._download_json(
request, video_id, 'Downloading %s video JSON' % fmt)
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
for _, video_url in fmt_json['resultObject'].items():
if video_url.endswith('/Manifest'):
formats.extend(self._extract_f4m_formats(video_url[:-9] + '/manifest.f4m', video_id))
else:
formats.append({
'url': video_url,
'format_id': 'android',
'preference': 1,
})
self._sort_formats(formats)
player = self._download_json(
self._PLAYER_URL_TEMPLATE % episode_id,
episode_id)
path_data = player.get('pathData')
episode = self._download_xml(
self._EPISODE_URL_TEMPLATE % path_data,
video_id, 'Downloading episode XML')
duration = float_or_none(xpath_text(
episode, './media/asset/info/technical/contentDuration', 'duration'))
art = episode.find('./media/asset/info/art')
title = xpath_text(art, './name', 'title')
description = xpath_text(art, './description', 'description')
thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
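
Note on the token in the new AtresPlayer extractor: the request is signed with an HMAC over the episode id concatenated with a shifted timestamp. The diff calls `hmac.new` without a digest argument, which relied on the MD5 default of the Python versions of that time; since Python 3.8 the digest must be passed explicitly. The sketch below assumes MD5 for that reason. The key, episode id and timestamp are placeholders, not the real values.

```python
import hashlib
import hmac


def build_token(secret, episode_id, timestamp, shift=30000):
    """Return (shifted_timestamp_string, hex_token) following the hunk's recipe."""
    shifted = str(timestamp + shift)
    token = hmac.new(
        secret.encode('ascii'),
        (episode_id + shifted).encode('utf-8'),
        hashlib.md5,  # assumption: the old implicit default digest
    ).hexdigest()
    return shifted, token


# Placeholder inputs for illustration only.
shifted, token = build_token('not-the-real-key', 'ep1', 1000000)
```

Both values are then substituted into the video URL template, which is why the function returns the shifted timestamp alongside the token.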

View File

@@ -12,29 +12,29 @@ class AudiomackIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audiomack\.com/song/(?P<id>[\w/-]+)'
IE_NAME = 'audiomack'
_TESTS = [
#hosted on audiomack
# hosted on audiomack
{
'url': 'http://www.audiomack.com/song/roosh-williams/extraordinary',
'info_dict':
{
'id' : 'roosh-williams/extraordinary',
'id': 'roosh-williams/extraordinary',
'ext': 'mp3',
'title': 'Roosh Williams - Extraordinary'
}
},
#hosted on soundcloud via audiomack
# hosted on soundcloud via audiomack
{
'add_ie': ['Soundcloud'],
'url': 'http://www.audiomack.com/song/xclusiveszone/take-kare',
'file': '172419696.mp3',
'info_dict':
{
'info_dict': {
'id': '172419696',
'ext': 'mp3',
'description': 'md5:1fc3272ed7a635cce5be1568c2822997',
'title': 'Young Thug ft Lil Wayne - Take Kare',
"upload_date": "20141016",
"description": "New track produced by London On Da Track called “Take Kare\"\n\nhttp://instagram.com/theyoungthugworld\nhttps://www.facebook.com/ThuggerThuggerCashMoney\n",
"uploader": "Young Thug World"
'uploader': 'Young Thug World',
'upload_date': '20141016',
}
}
},
]
def _real_extract(self, url):
@@ -49,7 +49,7 @@ class AudiomackIE(InfoExtractor):
raise ExtractorError("Unable to deduce api url of song")
realurl = api_response["url"]
#Audiomack wraps a lot of soundcloud tracks in their branded wrapper
# Audiomack wraps a lot of soundcloud tracks in their branded wrapper
# - if so, pass the work off to the soundcloud extractor
if SoundcloudIE.suitable(realurl):
return {'_type': 'url', 'url': realurl, 'ie_key': 'Soundcloud'}

View File

@@ -3,10 +3,11 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
compat_urllib_parse,
determine_ext,
ExtractorError,
remove_end,
)
@@ -27,23 +28,18 @@ class AUEngineIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(?P<title>.+?)</title>', webpage, 'title')
title = title.strip()
links = re.findall(r'\s(?:file|url):\s*["\']([^\'"]+)["\']', webpage)
links = map(compat_urllib_parse.unquote, links)
title = self._html_search_regex(
r'<title>\s*(?P<title>.+?)\s*</title>', webpage, 'title')
video_urls = re.findall(r'http://\w+.auengine.com/vod/.*[^\W]', webpage)
video_url = compat_urllib_parse.unquote(video_urls[0])
thumbnails = re.findall(r'http://\w+.auengine.com/thumb/.*[^\W]', webpage)
thumbnail = compat_urllib_parse.unquote(thumbnails[0])
thumbnail = None
video_url = None
for link in links:
if link.endswith('.png'):
thumbnail = link
elif '/videos/' in link:
video_url = link
if not video_url:
raise ExtractorError('Could not find video URL')
ext = '.' + determine_ext(video_url)
if ext == title[-len(ext):]:
title = title[:-len(ext)]
title = remove_end(title, ext)
return {
'id': video_id,
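
Note on the AUEngine hunk: the manual slice `title[:-len(ext)]` guarded by a suffix comparison is replaced with a call to `remove_end`. The sketch below shows the behaviour that replacement is expected to have; `strip_suffix` is a hypothetical stand-in written here, not the utility itself.

```python
def strip_suffix(s, end):
    """Remove `end` from the end of `s` if present; otherwise return `s` unchanged."""
    if end and s.endswith(end):
        return s[:-len(end)]
    return s


with_ext = strip_suffix('Episode 1.mp4', '.mp4')
without_ext = strip_suffix('Episode 1', '.mp4')
```

The `end and` guard matters because `s[:-0]` would return an empty string rather than the original.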

View File

@@ -0,0 +1,93 @@
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import float_or_none
class AzubuIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?azubu\.tv/[^/]+#!/play/(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.azubu.tv/GSL#!/play/15575/2014-hot6-cup-last-big-match-ro8-day-1',
'md5': 'a88b42fcf844f29ad6035054bd9ecaf4',
'info_dict': {
'id': '15575',
'ext': 'mp4',
'title': '2014 HOT6 CUP LAST BIG MATCH Ro8 Day 1',
'description': 'md5:d06bdea27b8cc4388a90ad35b5c66c01',
'thumbnail': 're:^https?://.*\.jpe?g',
'timestamp': 1417523507.334,
'upload_date': '20141202',
'duration': 9988.7,
'uploader': 'GSL',
'uploader_id': 414310,
'view_count': int,
},
},
{
'url': 'http://www.azubu.tv/FnaticTV#!/play/9344/-fnatic-at-worlds-2014:-toyz---%22i-love-rekkles,-he-has-amazing-mechanics%22-',
'md5': 'b72a871fe1d9f70bd7673769cdb3b925',
'info_dict': {
'id': '9344',
'ext': 'mp4',
'title': 'Fnatic at Worlds 2014: Toyz - "I love Rekkles, he has amazing mechanics"',
'description': 'md5:4a649737b5f6c8b5c5be543e88dc62af',
'thumbnail': 're:^https?://.*\.jpe?g',
'timestamp': 1410530893.320,
'upload_date': '20140912',
'duration': 172.385,
'uploader': 'FnaticTV',
'uploader_id': 272749,
'view_count': int,
},
},
]
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'http://www.azubu.tv/api/video/%s' % video_id, video_id)['data']
title = data['title'].strip()
description = data['description']
thumbnail = data['thumbnail']
view_count = data['view_count']
uploader = data['user']['username']
uploader_id = data['user']['id']
stream_params = json.loads(data['stream_params'])
timestamp = float_or_none(stream_params['creationDate'], 1000)
duration = float_or_none(stream_params['length'], 1000)
renditions = stream_params.get('renditions') or []
video = stream_params.get('FLVFullLength') or stream_params.get('videoFullLength')
if video:
renditions.append(video)
formats = [{
'url': fmt['url'],
'width': fmt['frameWidth'],
'height': fmt['frameHeight'],
'vbr': float_or_none(fmt['encodingRate'], 1000),
'filesize': fmt['size'],
'vcodec': fmt['videoCodec'],
'container': fmt['videoContainer'],
} for fmt in renditions if fmt['url']]
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'duration': duration,
'uploader': uploader,
'uploader_id': uploader_id,
'view_count': view_count,
'formats': formats,
}
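
The Azubu extractor above passes millisecond values (`creationDate`, `length`, `encodingRate`) through `float_or_none` with a scale of 1000. A minimal sketch of that scaling, assuming a simplified stand-in for the real helper in `..utils`; the function body and the sample `stream_params` dict below are illustrative, not the library code:

```python
def float_or_none(value, scale=1):
    """Simplified stand-in: value / scale as a float, or None if value is None."""
    if value is None:
        return None
    return float(value) / scale


# Hypothetical payload shaped like the fields the extractor reads.
stream_params = {
    'creationDate': 1417523507334,  # milliseconds since the epoch
    'length': 9988700,              # milliseconds
    'renditions': [
        {'url': 'http://example.com/a.mp4', 'encodingRate': 1200000},
        {'url': '', 'encodingRate': 600000},  # dropped below: empty URL
    ],
}

timestamp = float_or_none(stream_params['creationDate'], 1000)
duration = float_or_none(stream_params['length'], 1000)

# Same shape as the extractor's comprehension: skip renditions without a URL.
formats = [{
    'url': fmt['url'],
    'vbr': float_or_none(fmt['encodingRate'], 1000),
} for fmt in stream_params.get('renditions') or [] if fmt['url']]
```

Returning `None` for a missing value lets the optional fields stay optional instead of raising on `float(None)`.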


@@ -5,7 +5,7 @@ import json
import itertools
from .common import InfoExtractor
from ..utils import (
from ..compat import (
compat_urllib_request,
)
@@ -18,7 +18,7 @@ class BambuserIE(InfoExtractor):
_TEST = {
'url': 'http://bambuser.com/v/4050584',
# MD5 seems to be flaky, see https://travis-ci.org/rg3/youtube-dl/jobs/14051016#L388
#u'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
# 'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
'info_dict': {
'id': '4050584',
'ext': 'flv',
@@ -38,7 +38,7 @@ class BambuserIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
info_url = ('http://player-c.api.bambuser.com/getVideo.json?'
'&api_key=%s&vid=%s' % (self._API_KEY, video_id))
'&api_key=%s&vid=%s' % (self._API_KEY, video_id))
info_json = self._download_webpage(info_url, video_id)
info = json.loads(info_json)['result']
@@ -73,10 +73,11 @@ class BambuserChannelIE(InfoExtractor):
urls = []
last_id = ''
for i in itertools.count(1):
req_url = ('http://bambuser.com/xhr-api/index.php?username={user}'
req_url = (
'http://bambuser.com/xhr-api/index.php?username={user}'
'&sort=created&access_mode=0%2C1%2C2&limit={count}'
'&method=broadcast&format=json&vid_older_than={last}'
).format(user=user, count=self._STEP, last=last_id)
).format(user=user, count=self._STEP, last=last_id)
req = compat_urllib_request.Request(req_url)
# Without setting this header, we wouldn't get any result
req.add_header('Referer', 'http://bambuser.com/channel/%s' % user)


@@ -4,9 +4,11 @@ import json
import re
from .common import InfoExtractor
from ..utils import (
from ..compat import (
compat_str,
compat_urlparse,
)
from ..utils import (
ExtractorError,
)
@@ -83,12 +85,12 @@ class BandcampIE(InfoExtractor):
initial_url = mp3_info['url']
re_url = r'(?P<server>http://(.*?)\.bandcamp\.com)/download/track\?enc=mp3-320&fsig=(?P<fsig>.*?)&id=(?P<id>.*?)&ts=(?P<ts>.*)$'
m_url = re.match(re_url, initial_url)
#We build the url we will use to get the final track url
# We build the url we will use to get the final track url
# This url is built in Bandcamp in the script download_bunde_*.js
request_url = '%s/statdownload/track?enc=mp3-320&fsig=%s&id=%s&ts=%s&.rand=665028774616&.vrs=1' % (m_url.group('server'), m_url.group('fsig'), video_id, m_url.group('ts'))
final_url_webpage = self._download_webpage(request_url, video_id, 'Requesting download url')
# If we could correctly generate the .rand field the url would be
#in the "download_url" key
# in the "download_url" key
final_url = re.search(r'"retry_url":"(.*?)"', final_url_webpage).group(1)
return {
@@ -104,7 +106,7 @@ class BandcampIE(InfoExtractor):
class BandcampAlbumIE(InfoExtractor):
IE_NAME = 'Bandcamp:album'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+))'
_VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+)|/?(?:$|[?#]))'
_TESTS = [{
'url': 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1',
@@ -139,6 +141,12 @@ class BandcampAlbumIE(InfoExtractor):
'title': 'Hierophany of the Open Grave',
},
'playlist_mincount': 9,
}, {
'url': 'http://dotscale.bandcamp.com',
'info_dict': {
'title': 'Loom',
},
'playlist_mincount': 7,
}]
def _real_extract(self, url):
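
The widened `BandcampAlbumIE._VALID_URL` in this hunk adds a second alternative so that a bare artist domain is accepted as well as `/album/...` paths. A small sketch that exercises the new pattern on its own; the `/track/` URL is a made-up counter-example, the other two come from the tests in the diff:

```python
import re

# New pattern copied from the diff.
VALID_URL = r'https?://(?:(?P<subdomain>[^.]+)\.)?bandcamp\.com(?:/album/(?P<title>[^?#]+)|/?(?:$|[?#]))'

album = re.match(VALID_URL, 'http://blazo.bandcamp.com/album/jazz-format-mixtape-vol-1')
root = re.match(VALID_URL, 'http://dotscale.bandcamp.com')
# Hypothetical track URL: neither alternative applies, so there is no match.
track = re.match(VALID_URL, 'http://dotscale.bandcamp.com/track/some-track')

album_title = album.group('title')
root_title = root.group('title')          # None: the title group did not participate
root_subdomain = root.group('subdomain')
```

Because `title` is `None` for the bare-domain case, code that consumes the match has to tolerate a missing title.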


@@ -1,15 +1,16 @@
from __future__ import unicode_literals
import re
import xml.etree.ElementTree
from .subtitles import SubtitlesInfoExtractor
from ..utils import ExtractorError
from ..compat import compat_HTTPError
class BBCCoUkIE(SubtitlesInfoExtractor):
IE_NAME = 'bbc.co.uk'
IE_DESC = 'BBC iPlayer'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:programmes|iplayer/episode)/(?P<id>[\da-z]{8})'
_VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'
_TESTS = [
{
@@ -17,8 +18,8 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
'info_dict': {
'id': 'b039d07m',
'ext': 'flv',
'title': 'Kaleidoscope: Leonard Cohen',
'description': 'md5:db4755d7a665ae72343779f7dacb402c',
'title': 'Kaleidoscope, Leonard Cohen',
'description': 'The Canadian poet and songwriter reflects on his musical career.',
'duration': 1740,
},
'params': {
@@ -55,6 +56,68 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
'skip_download': True,
},
'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
},
{
'url': 'http://www.bbc.co.uk/iplayer/episode/p026c7jt/tomorrows-worlds-the-unearthly-history-of-science-fiction-2-invasion',
'info_dict': {
'id': 'b03k3pb7',
'ext': 'flv',
'title': "Tomorrow's Worlds: The Unearthly History of Science Fiction",
'description': '2. Invasion',
'duration': 3600,
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Currently BBC iPlayer TV programmes are available to play in the UK only',
}, {
'url': 'http://www.bbc.co.uk/programmes/b04v20dw',
'info_dict': {
'id': 'b04v209v',
'ext': 'flv',
'title': 'Pete Tong, The Essential New Tune Special',
'description': "Pete has a very special mix - all of 2014's Essential New Tunes!",
'duration': 10800,
},
'params': {
# rtmp download
'skip_download': True,
}
}, {
'url': 'http://www.bbc.co.uk/music/clips/p02frcc3',
'note': 'Audio',
'info_dict': {
'id': 'p02frcch',
'ext': 'flv',
'title': 'Pete Tong, Past, Present and Future Special, Madeon - After Hours mix',
'description': 'French house superstar Madeon takes us out of the club and onto the after party.',
'duration': 3507,
},
'params': {
# rtmp download
'skip_download': True,
}
}, {
'url': 'http://www.bbc.co.uk/music/clips/p025c0zz',
'note': 'Video',
'info_dict': {
'id': 'p025c103',
'ext': 'flv',
'title': 'Reading and Leeds Festival, 2014, Rae Morris - Closer (Live on BBC Three)',
'description': 'Rae Morris performs Closer for BBC Three at Reading 2014',
'duration': 226,
},
'params': {
# rtmp download
'skip_download': True,
}
}, {
'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
'only_matching': True,
}, {
'url': 'http://www.bbc.co.uk/music/clips#p02frcc3',
'only_matching': True,
}
]
@@ -102,6 +165,10 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
return playlist.findall('./{http://bbc.co.uk/2008/emp/playlist}item')
def _extract_medias(self, media_selection):
error = media_selection.find('./{http://bbc.co.uk/2008/mp/mediaselection}error')
if error is not None:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error.get('id')), expected=True)
return media_selection.findall('./{http://bbc.co.uk/2008/mp/mediaselection}media')
def _extract_connections(self, media):
@@ -158,54 +225,101 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
subtitles[lang] = srt
return subtitles
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
group_id = mobj.group('id')
def _download_media_selector(self, programme_id):
try:
media_selection = self._download_xml(
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
programme_id, 'Downloading media selection XML')
except ExtractorError as ee:
if isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 403:
media_selection = xml.etree.ElementTree.fromstring(ee.cause.read().encode('utf-8'))
else:
raise
webpage = self._download_webpage(url, group_id, 'Downloading video page')
if re.search(r'id="emp-error" class="notinuk">', webpage):
raise ExtractorError('Currently BBC iPlayer TV programmes are available to play in the UK only',
expected=True)
formats = []
subtitles = None
playlist = self._download_xml('http://www.bbc.co.uk/iplayer/playlist/%s' % group_id, group_id,
'Downloading playlist XML')
for media in self._extract_medias(media_selection):
kind = media.get('kind')
if kind == 'audio':
formats.extend(self._extract_audio(media, programme_id))
elif kind == 'video':
formats.extend(self._extract_video(media, programme_id))
elif kind == 'captions':
subtitles = self._extract_captions(media, programme_id)
return formats, subtitles
def _download_playlist(self, playlist_id):
try:
playlist = self._download_json(
'http://www.bbc.co.uk/programmes/%s/playlist.json' % playlist_id,
playlist_id, 'Downloading playlist JSON')
version = playlist.get('defaultAvailableVersion')
if version:
smp_config = version['smpConfig']
title = smp_config['title']
description = smp_config['summary']
for item in smp_config['items']:
kind = item['kind']
if kind != 'programme' and kind != 'radioProgramme':
continue
programme_id = item.get('vpid')
duration = int(item.get('duration'))
formats, subtitles = self._download_media_selector(programme_id)
return programme_id, title, description, duration, formats, subtitles
except ExtractorError as ee:
if not (isinstance(ee.cause, compat_HTTPError) and ee.cause.code == 404):
raise
# fallback to legacy playlist
playlist = self._download_xml(
'http://www.bbc.co.uk/iplayer/playlist/%s' % playlist_id,
playlist_id, 'Downloading legacy playlist XML')
no_items = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}noItems')
if no_items is not None:
reason = no_items.get('reason')
if reason == 'preAvailability':
msg = 'Episode %s is not yet available' % group_id
msg = 'Episode %s is not yet available' % playlist_id
elif reason == 'postAvailability':
msg = 'Episode %s is no longer available' % group_id
msg = 'Episode %s is no longer available' % playlist_id
elif reason == 'noMedia':
msg = 'Episode %s is not currently available' % playlist_id
else:
msg = 'Episode %s is not available: %s' % (group_id, reason)
msg = 'Episode %s is not available: %s' % (playlist_id, reason)
raise ExtractorError(msg, expected=True)
formats = []
subtitles = None
for item in self._extract_items(playlist):
kind = item.get('kind')
if kind != 'programme' and kind != 'radioProgramme':
continue
title = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}title').text
description = playlist.find('./{http://bbc.co.uk/2008/emp/playlist}summary').text
programme_id = item.get('identifier')
duration = int(item.get('duration'))
formats, subtitles = self._download_media_selector(programme_id)
media_selection = self._download_xml(
'http://open.live.bbc.co.uk/mediaselector/5/select/version/2.0/mediaset/pc/vpid/%s' % programme_id,
programme_id, 'Downloading media selection XML')
return programme_id, title, description, duration, formats, subtitles
for media in self._extract_medias(media_selection):
kind = media.get('kind')
if kind == 'audio':
formats.extend(self._extract_audio(media, programme_id))
elif kind == 'video':
formats.extend(self._extract_video(media, programme_id))
elif kind == 'captions':
subtitles = self._extract_captions(media, programme_id)
def _real_extract(self, url):
group_id = self._match_id(url)
webpage = self._download_webpage(url, group_id, 'Downloading video page')
programme_id = self._search_regex(
r'"vpid"\s*:\s*"([\da-z]{8})"', webpage, 'vpid', fatal=False, default=None)
if programme_id:
player = self._download_json(
'http://www.bbc.co.uk/iplayer/episode/%s.json' % group_id,
group_id)['jsConf']['player']
title = player['title']
description = player['subtitle']
duration = player['duration']
formats, subtitles = self._download_media_selector(programme_id)
else:
programme_id, title, description, duration, formats, subtitles = self._download_playlist(group_id)
if self._downloader.params.get('listsubtitles', False):
self._list_available_subtitles(programme_id, subtitles)
@@ -220,4 +334,4 @@ class BBCCoUkIE(SubtitlesInfoExtractor):
'duration': duration,
'formats': formats,
'subtitles': subtitles,
}
}
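
The new `BBCCoUkIE._VALID_URL` accepts `programmes`, `iplayer/episode`, `iplayer/playlist` and `music/clips` URLs, with either `/` or `#` before the clip id. A quick sketch that runs the pattern against URLs taken from the tests in this diff; the `match_id` helper is a local illustration, not the extractor's `_match_id`:

```python
import re

# New pattern copied from the diff.
VALID_URL = r'https?://(?:www\.)?bbc\.co\.uk/(?:(?:(?:programmes|iplayer/(?:episode|playlist))/)|music/clips[/#])(?P<id>[\da-z]{8})'


def match_id(url):
    """Return the 8-character id captured by VALID_URL, or None if the URL does not match."""
    m = re.match(VALID_URL, url)
    return m.group('id') if m else None


ids = [match_id(u) for u in (
    'http://www.bbc.co.uk/programmes/b04v20dw',
    'http://www.bbc.co.uk/iplayer/episode/p026c7jt/tomorrows-worlds-the-unearthly-history-of-science-fiction-2-invasion',
    'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
    'http://www.bbc.co.uk/music/clips/p02frcc3',
    'http://www.bbc.co.uk/music/clips#p02frcc3',
)]
```

Note that the id captured from the URL is the group id; the tests above show the final `id` in `info_dict` is the programme (vpid) id resolved later.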


@@ -40,7 +40,7 @@ class BeegIE(InfoExtractor):
title = self._html_search_regex(
r'<title>([^<]+)\s*-\s*beeg\.?</title>', webpage, 'title')
description = self._html_search_regex(
r'<meta name="description" content="([^"]*)"',
webpage, 'description', fatal=False)


@@ -10,15 +10,15 @@ from ..utils import url_basename
class BehindKinkIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?behindkink\.com/(?P<year>[0-9]{4})/(?P<month>[0-9]{2})/(?P<day>[0-9]{2})/(?P<id>[^/#?_]+)'
_TEST = {
'url': 'http://www.behindkink.com/2014/08/14/ab1576-performers-voice-finally-heard-the-bill-is-killed/',
'md5': '41ad01222b8442089a55528fec43ec01',
'url': 'http://www.behindkink.com/2014/12/05/what-are-you-passionate-about-marley-blaze/',
'md5': '507b57d8fdcd75a41a9a7bdb7989c762',
'info_dict': {
'id': '36370',
'id': '37127',
'ext': 'mp4',
'title': 'AB1576 - PERFORMERS VOICE FINALLY HEARD - THE BILL IS KILLED!',
'description': 'The adult industry voice was finally heard as Assembly Bill 1576 remained\xa0 in suspense today at the Senate Appropriations Hearing. AB1576 was, among other industry damaging issues, a condom mandate...',
'upload_date': '20140814',
'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/08/36370_AB1576_Win.jpg',
'title': 'What are you passionate about Marley Blaze',
'description': 'md5:aee8e9611b4ff70186f752975d9b94b4',
'upload_date': '20141205',
'thumbnail': 'http://www.behindkink.com/wp-content/uploads/2014/12/blaze-1.jpg',
'age_limit': 18,
}
}
@@ -26,26 +26,19 @@ class BehindKinkIE(InfoExtractor):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id')
year = mobj.group('year')
month = mobj.group('month')
day = mobj.group('day')
upload_date = year + month + day
webpage = self._download_webpage(url, display_id)
video_url = self._search_regex(
r"'file':\s*'([^']+)'",
webpage, 'URL base')
video_id = url_basename(video_url)
video_id = video_id.split('_')[0]
r'<source src="([^"]+)"', webpage, 'video URL')
video_id = url_basename(video_url).split('_')[0]
upload_date = mobj.group('year') + mobj.group('month') + mobj.group('day')
return {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': self._og_search_title(webpage),
'display_id': display_id,
'url': video_url,
'title': self._og_search_title(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'description': self._og_search_description(webpage),
'upload_date': upload_date,

108
youtube_dl/extractor/bet.py Normal file

@@ -0,0 +1,108 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
xpath_text,
xpath_with_ns,
int_or_none,
parse_iso8601,
)
class BetIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bet\.com/(?:[^/]+/)+(?P<id>.+?)\.html'
_TESTS = [
{
'url': 'http://www.bet.com/news/politics/2014/12/08/in-bet-exclusive-obama-talks-race-and-racism.html',
'info_dict': {
'id': '406429c6-1b8a-463e-83fc-814adb81a9db',
'display_id': 'in-bet-exclusive-obama-talks-race-and-racism',
'ext': 'flv',
'title': 'BET News Presents: A Conversation With President Obama',
'description': 'md5:5a88d8ae912c1b33e090290af7ec33c6',
'duration': 1534,
'timestamp': 1418075340,
'upload_date': '20141208',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
},
'params': {
# rtmp download
'skip_download': True,
},
},
{
'url': 'http://www.bet.com/video/news/national/2014/justice-for-ferguson-a-community-reacts.html',
'info_dict': {
'id': '4160e53b-ad41-43b1-980f-8d85f63121f4',
'display_id': 'justice-for-ferguson-a-community-reacts',
'ext': 'flv',
'title': 'Justice for Ferguson: A Community Reacts',
'description': 'A BET News special.',
'duration': 1696,
'timestamp': 1416942360,
'upload_date': '20141125',
'uploader': 'admin',
'thumbnail': 're:(?i)^https?://.*\.jpg$',
},
'params': {
# rtmp download
'skip_download': True,
},
}
]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
media_url = compat_urllib_parse.unquote(self._search_regex(
[r'mediaURL\s*:\s*"([^"]+)"', r"var\s+mrssMediaUrl\s*=\s*'([^']+)'"],
webpage, 'media URL'))
mrss = self._download_xml(media_url, display_id)
item = mrss.find('./channel/item')
NS_MAP = {
'dc': 'http://purl.org/dc/elements/1.1/',
'media': 'http://search.yahoo.com/mrss/',
'ka': 'http://kickapps.com/karss',
}
title = xpath_text(item, './title', 'title')
description = xpath_text(
item, './description', 'description', fatal=False)
video_id = xpath_text(item, './guid', 'video id', fatal=False)
timestamp = parse_iso8601(xpath_text(
item, xpath_with_ns('./dc:date', NS_MAP),
'upload date', fatal=False))
uploader = xpath_text(
item, xpath_with_ns('./dc:creator', NS_MAP),
'uploader', fatal=False)
media_content = item.find(
xpath_with_ns('./media:content', NS_MAP))
duration = int_or_none(media_content.get('duration'))
smil_url = media_content.get('url')
thumbnail = media_content.find(
xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
formats = self._extract_smil_formats(smil_url, display_id)
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'timestamp': timestamp,
'uploader': uploader,
'duration': duration,
'formats': formats,
}
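
`BetIE` reads namespaced MRSS elements by expanding prefixes such as `dc:` and `media:` through `xpath_with_ns` and `NS_MAP`. The sketch below shows the idea with the standard library only. The `xpath_with_ns` here is a simplified stand-in written for this example, and the MRSS document is invented sample data:

```python
import xml.etree.ElementTree as ET

NS_MAP = {
    'dc': 'http://purl.org/dc/elements/1.1/',
    'media': 'http://search.yahoo.com/mrss/',
}


def xpath_with_ns(path, ns_map):
    """Stand-in helper: rewrite 'prefix:tag' steps into ElementTree's '{uri}tag' form."""
    steps = []
    for step in path.split('/'):
        if ':' in step:
            prefix, tag = step.split(':')
            steps.append('{%s}%s' % (ns_map[prefix], tag))
        else:
            steps.append(step)
    return '/'.join(steps)


# Invented MRSS sample with the elements the extractor looks up.
MRSS = '''<rss xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:media="http://search.yahoo.com/mrss/">
  <channel><item>
    <title>Sample</title>
    <dc:date>2014-12-08T21:49:00Z</dc:date>
    <media:content url="http://example.com/video.smil" duration="1534">
      <media:thumbnail url="http://example.com/thumb.jpg"/>
    </media:content>
  </item></channel>
</rss>'''

item = ET.fromstring(MRSS).find('./channel/item')
date = item.find(xpath_with_ns('./dc:date', NS_MAP)).text
media_content = item.find(xpath_with_ns('./media:content', NS_MAP))
duration = int(media_content.get('duration'))
thumbnail = media_content.find(
    xpath_with_ns('./media:thumbnail', NS_MAP)).get('url')
```

As in the extractor, `find` returns `None` when an element is absent, so chaining `.get('url')` onto an optional element such as the thumbnail can raise `AttributeError`.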


@@ -1,4 +1,4 @@
#coding: utf-8
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor


@@ -4,8 +4,8 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_parse_qs
from ..utils import (
compat_parse_qs,
ExtractorError,
int_or_none,
unified_strdate,
@@ -29,10 +29,9 @@ class BiliBiliIE(InfoExtractor):
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_code = self._search_regex(
r'(?s)<div itemprop="video".*?>(.*?)</div>', webpage, 'video code')


@@ -4,13 +4,17 @@ import re
from .common import InfoExtractor
from .subtitles import SubtitlesInfoExtractor
from ..utils import (
compat_urllib_request,
unescapeHTML,
parse_iso8601,
compat_urlparse,
clean_html,
from ..compat import (
compat_str,
compat_urllib_request,
compat_urlparse,
)
from ..utils import (
clean_html,
int_or_none,
parse_iso8601,
unescapeHTML,
)
@@ -64,7 +68,39 @@ class BlipTVIE(SubtitlesInfoExtractor):
'uploader': 'redvsblue',
'uploader_id': '792887',
}
}
},
{
'url': 'http://blip.tv/play/gbk766dkj4Yn',
'md5': 'fe0a33f022d49399a241e84a8ea8b8e3',
'info_dict': {
'id': '1749452',
'ext': 'mp4',
'upload_date': '20090208',
'description': 'Witness the first appearance of the Nostalgia Critic character, as Doug reviews the movie Transformers.',
'title': 'Nostalgia Critic: Transformers',
'timestamp': 1234068723,
'uploader': 'NostalgiaCritic',
'uploader_id': '246467',
}
},
{
# https://github.com/rg3/youtube-dl/pull/4404
'note': 'Audio only',
'url': 'http://blip.tv/hilarios-productions/weekly-manga-recap-kingdom-7119982',
'md5': '76c0a56f24e769ceaab21fbb6416a351',
'info_dict': {
'id': '7103299',
'ext': 'flv',
'title': 'Weekly Manga Recap: Kingdom',
'description': 'And then Shin breaks the enemy line, and he&apos;s all like HWAH! And then he slices a guy and it&apos;s all like FWASHING! And... it&apos;s really hard to describe the best parts of this series without breaking down into sound effects, okay?',
'timestamp': 1417660321,
'upload_date': '20141204',
'uploader': 'The Rollo T',
'uploader_id': '407429',
'duration': 7251,
'vcodec': 'none',
}
},
]
def _real_extract(self, url):
@@ -74,11 +110,13 @@ class BlipTVIE(SubtitlesInfoExtractor):
# See https://github.com/rg3/youtube-dl/issues/857 and
# https://github.com/rg3/youtube-dl/issues/4197
if lookup_id:
info_page = self._download_webpage(
'http://blip.tv/play/%s.x?p=1' % lookup_id, lookup_id, 'Resolving lookup id')
video_id = self._search_regex(r'config\.id\s*=\s*"([0-9]+)', info_page, 'video_id')
else:
video_id = mobj.group('id')
urlh = self._request_webpage(
'http://blip.tv/play/%s' % lookup_id, lookup_id, 'Resolving lookup id')
url = compat_urlparse.urlparse(urlh.geturl())
qs = compat_urlparse.parse_qs(url.query)
mobj = re.match(self._VALID_URL, qs['file'][0])
video_id = mobj.group('id')
rss = self._download_xml('http://blip.tv/rss/flash/%s' % video_id, video_id, 'Downloading video RSS')
@@ -114,7 +152,7 @@ class BlipTVIE(SubtitlesInfoExtractor):
msg = self._download_webpage(
url + '?showplayer=20140425131715&referrer=http://blip.tv&mask=7&skin=flashvars&view=url',
video_id, 'Resolving URL for %s' % role)
real_url = compat_urlparse.parse_qs(msg)['message'][0]
real_url = compat_urlparse.parse_qs(msg.strip())['message'][0]
media_type = media_content.get('type')
if media_type == 'text/srt' or url.endswith('.srt'):
@@ -129,11 +167,11 @@ class BlipTVIE(SubtitlesInfoExtractor):
'url': real_url,
'format_id': role,
'format_note': media_type,
'vcodec': media_content.get(blip('vcodec')),
'vcodec': media_content.get(blip('vcodec')) or 'none',
'acodec': media_content.get(blip('acodec')),
'filesize': media_content.get('filesize'),
'width': int(media_content.get('width')),
'height': int(media_content.get('height')),
'width': int_or_none(media_content.get('width')),
'height': int_or_none(media_content.get('height')),
})
self._sort_formats(formats)


@@ -0,0 +1,37 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class BpbIE(InfoExtractor):
IE_DESC = 'Bundeszentrale für politische Bildung'
_VALID_URL = r'http://www\.bpb\.de/mediathek/(?P<id>[0-9]+)/'
_TEST = {
'url': 'http://www.bpb.de/mediathek/297/joachim-gauck-zu-1989-und-die-erinnerung-an-die-ddr',
'md5': '0792086e8e2bfbac9cdf27835d5f2093',
'info_dict': {
'id': '297',
'ext': 'mp4',
'title': 'Joachim Gauck zu 1989 und die Erinnerung an die DDR',
'description': 'Joachim Gauck, erster Beauftragter für die Stasi-Unterlagen, spricht auf dem Geschichtsforum über die friedliche Revolution 1989 und eine "gewisse Traurigkeit" im Umgang mit der DDR-Vergangenheit.'
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(
r'<h2 class="white">(.*?)</h2>', webpage, 'title')
video_url = self._html_search_regex(
r'(http://film\.bpb\.de/player/dokument_[0-9]+\.mp4)',
webpage, 'video URL')
return {
'id': video_id,
'url': video_url,
'title': title,
'description': self._og_search_description(webpage),
}


@@ -14,7 +14,6 @@ class BreakIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?break\.com/video/(?:[^/]+/)*.+-(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.break.com/video/when-girls-act-like-guys-2468056',
'md5': '33aa4ff477ecd124d18d7b5d23b87ce5',
'info_dict': {
'id': '2468056',
'ext': 'mp4',


@@ -6,25 +6,26 @@ import json
import xml.etree.ElementTree
from .common import InfoExtractor
from ..utils import (
compat_urllib_parse,
find_xpath_attr,
fix_xml_ampersands,
compat_urlparse,
compat_str,
compat_urllib_request,
from ..compat import (
compat_parse_qs,
compat_str,
compat_urllib_parse,
compat_urllib_parse_urlparse,
compat_urllib_request,
compat_urlparse,
)
from ..utils import (
determine_ext,
ExtractorError,
unsmuggle_url,
find_xpath_attr,
fix_xml_ampersands,
unescapeHTML,
unsmuggle_url,
)
class BrightcoveIE(InfoExtractor):
_VALID_URL = r'https?://.*brightcove\.com/(services|viewer).*?\?(?P<query>.*)'
_VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
_FEDERATED_URL_TEMPLATE = 'http://c.brightcove.com/services/viewer/htmlFederated?%s'
_TESTS = [
@@ -265,6 +266,7 @@ class BrightcoveIE(InfoExtractor):
url = rend['defaultURL']
if not url:
continue
ext = None
if rend['remote']:
url_comp = compat_urllib_parse_urlparse(url)
if url_comp.path.endswith('.m3u8'):
@@ -276,7 +278,7 @@ class BrightcoveIE(InfoExtractor):
# akamaihd.net, but they don't use f4m manifests
url = url.replace('control/', '') + '?&v=3.3.0&fp=13&r=FEEFJ&g=RTSJIMBMPFPB'
ext = 'flv'
else:
if ext is None:
ext = determine_ext(url)
size = rend.get('size')
formats.append({


@@ -0,0 +1,74 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
class BuzzFeedIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?buzzfeed\.com/[^?#]*?/(?P<id>[^?#]+)'
_TESTS = [{
'url': 'http://www.buzzfeed.com/abagg/this-angry-ram-destroys-a-punching-bag-like-a-boss?utm_term=4ldqpia',
'info_dict': {
'id': 'this-angry-ram-destroys-a-punching-bag-like-a-boss',
'title': 'This Angry Ram Destroys A Punching Bag Like A Boss',
'description': 'Rambro!',
},
'playlist': [{
'info_dict': {
'id': 'aVCR29aE_OQ',
'ext': 'mp4',
'upload_date': '20141024',
'uploader_id': 'Buddhanz1',
'description': 'He likes to stay in shape with his heavy bag, he wont stop until its on the ground\n\nFollow Angry Ram on Facebook for regular updates -\nhttps://www.facebook.com/pages/Angry-Ram/1436897249899558?ref=hl',
'uploader': 'Buddhanz',
'title': 'Angry Ram destroys a punching bag',
}
}]
}, {
'url': 'http://www.buzzfeed.com/sheridanwatson/look-at-this-cute-dog-omg?utm_term=4ldqpia',
'params': {
'skip_download': True, # Got enough YouTube download tests
},
'info_dict': {
'description': 'Munchkin the Teddy Bear is back !',
'title': 'You Need To Stop What You\'re Doing And Watching This Dog Walk On A Treadmill',
},
'playlist': [{
'info_dict': {
'id': 'mVmBL8B-In0',
'ext': 'mp4',
'upload_date': '20141124',
'uploader_id': 'CindysMunchkin',
'description': '© 2014 Munchkin the Shih Tzu\nAll rights reserved\nFacebook: http://facebook.com/MunchkintheShihTzu',
'uploader': 'Munchkin the Shih Tzu',
'title': 'Munchkin the Teddy Bear gets her exercise',
},
}]
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
all_buckets = re.findall(
r'(?s)<div class="video-embed[^"]*"..*?rel:bf_bucket_data=\'([^\']+)\'',
webpage)
entries = []
for bd_json in all_buckets:
bd = json.loads(bd_json)
video = bd.get('video') or bd.get('progload_video')
if not video:
continue
entries.append(self.url_result(video['url']))
return {
'_type': 'playlist',
'id': playlist_id,
'title': self._og_search_title(webpage),
'description': self._og_search_description(webpage),
'entries': entries,
}
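
`BuzzFeedIE` collects embedded videos by pulling the JSON stored in each `rel:bf_bucket_data` attribute and keeping the entries that carry a `video` or `progload_video` URL. The following sketch runs the same regular expression over an invented page fragment; the HTML is sample data, and the resulting list of URLs stands in for the `url_result` entries:

```python
import json
import re

# Invented page fragment with three embeds; the last one has no video.
SAMPLE_PAGE = '''
<div class="video-embed-big" id="v1" rel:bf_bucket_data='{"video": {"url": "http://www.youtube.com/watch?v=aVCR29aE_OQ"}}'></div>
<div class="video-embed" id="v2" rel:bf_bucket_data='{"progload_video": {"url": "http://www.youtube.com/watch?v=mVmBL8B-In0"}}'></div>
<div class="video-embed" id="v3" rel:bf_bucket_data='{"image": {}}'></div>
'''

# Pattern copied from the diff.
all_buckets = re.findall(
    r'(?s)<div class="video-embed[^"]*"..*?rel:bf_bucket_data=\'([^\']+)\'',
    SAMPLE_PAGE)

urls = []
for bd_json in all_buckets:
    bd = json.loads(bd_json)
    video = bd.get('video') or bd.get('progload_video')
    if not video:
        continue  # bucket without a playable video
    urls.append(video['url'])
```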


@@ -5,6 +5,8 @@ import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
HEADRequest,
unified_strdate,
url_basename,
qualities,
@@ -76,6 +78,16 @@ class CanalplusIE(InfoExtractor):
preference = qualities(['MOBILE', 'BAS_DEBIT', 'HAUT_DEBIT', 'HD', 'HLS', 'HDS'])
fmt_url = next(iter(media.find('VIDEOS'))).text
if '/geo' in fmt_url.lower():
response = self._request_webpage(
HEADRequest(fmt_url), video_id,
'Checking if the video is georestricted')
if '/blocage' in response.geturl():
raise ExtractorError(
'The video is not available in your country',
expected=True)
formats = []
for fmt in media.find('VIDEOS'):
format_url = fmt.text
@@ -112,4 +124,4 @@ class CanalplusIE(InfoExtractor):
'like_count': int(infos.find('NB_LIKES').text),
'comment_count': int(infos.find('NB_COMMENTS').text),
'formats': formats,
}
}


@@ -45,4 +45,4 @@ class CBSIE(InfoExtractor):
real_id = self._search_regex(
r"video\.settings\.pid\s*=\s*'([^']+)';",
webpage, 'real video ID')
return self.url_result(u'theplatform:%s' % real_id)
return self.url_result('theplatform:%s' % real_id)


@@ -84,4 +84,4 @@ class CBSNewsIE(InfoExtractor):
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}
}


@@ -3,55 +3,50 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
from .subtitles import SubtitlesInfoExtractor
from ..compat import (
compat_urllib_request,
compat_urllib_parse,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
float_or_none,
)
class CeskaTelevizeIE(InfoExtractor):
class CeskaTelevizeIE(SubtitlesInfoExtractor):
_VALID_URL = r'https?://www\.ceskatelevize\.cz/(porady|ivysilani)/(.+/)?(?P<id>[^?#]+)'
_TESTS = [
{
'url': 'http://www.ceskatelevize.cz/ivysilani/10532695142-prvni-republika/213512120230004-spanelska-chripka',
'url': 'http://www.ceskatelevize.cz/ivysilani/ivysilani/10441294653-hyde-park-civilizace/214411058091220',
'info_dict': {
'id': '213512120230004',
'ext': 'flv',
'title': 'První republika: Španělská chřipka',
'duration': 3107.4,
'id': '214411058091220',
'ext': 'mp4',
'title': 'Hyde Park Civilizace',
'description': 'Věda a současná civilizace. Interaktivní pořad - prostor pro vaše otázky a komentáře',
'thumbnail': 're:^https?://.*\.jpg',
'duration': 3350,
},
'params': {
'skip_download': True, # requires rtmpdump
# m3u8 download
'skip_download': True,
},
'skip': 'Works only from Czech Republic.',
},
{
'url': 'http://www.ceskatelevize.cz/ivysilani/1030584952-tsatsiki-maminka-a-policajt',
'info_dict': {
'id': '20138143440',
'ext': 'flv',
'title': 'Tsatsiki, maminka a policajt',
'duration': 6754.1,
},
'params': {
'skip_download': True, # requires rtmpdump
},
'skip': 'Works only from Czech Republic.',
},
{
'url': 'http://www.ceskatelevize.cz/ivysilani/10532695142-prvni-republika/bonus/14716-zpevacka-z-duparny-bobina',
'info_dict': {
'id': '14716',
'ext': 'flv',
'ext': 'mp4',
'title': 'První republika: Zpěvačka z Dupárny Bobina',
'duration': 90,
'description': 'Sága mapující atmosféru první republiky od r. 1918 do r. 1945.',
'thumbnail': 're:^https?://.*\.jpg',
'duration': 88.4,
},
'params': {
'skip_download': True, # requires rtmpdump
# m3u8 download
'skip_download': True,
},
},
]
@@ -78,8 +73,9 @@ class CeskaTelevizeIE(InfoExtractor):
'requestSource': 'iVysilani',
}
req = compat_urllib_request.Request('http://www.ceskatelevize.cz/ivysilani/ajax/get-playlist-url',
data=compat_urllib_parse.urlencode(data))
req = compat_urllib_request.Request(
'http://www.ceskatelevize.cz/ivysilani/ajax/get-client-playlist',
data=compat_urllib_parse.urlencode(data))
req.add_header('Content-type', 'application/x-www-form-urlencoded')
req.add_header('x-addr', '127.0.0.1')
@@ -88,39 +84,72 @@ class CeskaTelevizeIE(InfoExtractor):
playlistpage = self._download_json(req, video_id)
req = compat_urllib_request.Request(compat_urllib_parse.unquote(playlistpage['url']))
playlist_url = playlistpage['url']
if playlist_url == 'error_region':
raise ExtractorError(NOT_AVAILABLE_STRING, expected=True)
req = compat_urllib_request.Request(compat_urllib_parse.unquote(playlist_url))
req.add_header('Referer', url)
playlist = self._download_xml(req, video_id)
playlist = self._download_json(req, video_id)
item = playlist['playlist'][0]
formats = []
for i in playlist.find('smilRoot/body'):
if 'AD' not in i.attrib['id']:
base_url = i.attrib['base']
parsedurl = compat_urllib_parse_urlparse(base_url)
duration = i.attrib['duration']
for video in i.findall('video'):
if video.attrib['label'] != 'AD':
format_id = video.attrib['label']
play_path = video.attrib['src']
vbr = int(video.attrib['system-bitrate'])
formats.append({
'format_id': format_id,
'url': base_url,
'vbr': vbr,
'play_path': play_path,
'app': parsedurl.path[1:] + '?' + parsedurl.query,
'rtmp_live': True,
'ext': 'flv',
})
for format_id, stream_url in item['streamUrls'].items():
formats.extend(self._extract_m3u8_formats(stream_url, video_id, 'mp4'))
self._sort_formats(formats)
title = self._og_search_title(webpage)
description = self._og_search_description(webpage)
duration = float_or_none(item.get('duration'))
thumbnail = item.get('previewImageUrl')
subtitles = {}
subs = item.get('subtitles')
if subs:
subtitles['cs'] = subs[0]['url']
if self._downloader.params.get('listsubtitles', False):
self._list_available_subtitles(video_id, subtitles)
return
subtitles = self._fix_subtitles(self.extract_subtitles(video_id, subtitles))
return {
'id': episode_id,
'title': self._html_search_regex(r'<title>(.+?) — iVysílání — Česká televize</title>', webpage, 'title'),
'duration': float(duration),
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
'subtitles': subtitles,
}
@staticmethod
def _fix_subtitles(subtitles):
""" Convert millisecond-based subtitles to SRT """
if subtitles is None:
return subtitles # subtitles not requested
def _msectotimecode(msec):
""" Helper utility to convert milliseconds to timecode """
components = []
for divider in [1000, 60, 60, 100]:
components.append(msec % divider)
msec //= divider
return "{3:02}:{2:02}:{1:02},{0:03}".format(*components)
def _fix_subtitle(subtitle):
for line in subtitle.splitlines():
m = re.match(r"^\s*([0-9]+);\s*([0-9]+)\s+([0-9]+)\s*$", line)
if m:
yield m.group(1)
start, stop = (_msectotimecode(int(t)) for t in m.groups()[1:])
yield "{0} --> {1}".format(start, stop)
else:
yield line
fixed_subtitles = {}
for k, v in subtitles.items():
fixed_subtitles[k] = "\r\n".join(_fix_subtitle(v))
return fixed_subtitles
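
The millisecond-to-SRT conversion performed by `_fix_subtitles` above can be sketched as a standalone script. This is a minimal illustration rather than the extractor's code: the names `msec_to_timecode`, `fix_subtitle` and `CUE_RE` are chosen for this sketch, and it assumes each cue header has the form `index; start_ms stop_ms`, as the regular expression in the diff suggests. Unlike the diff, the sketch does not wrap the hours field.

```python
import re

# Cue header as matched in the diff: "<index>; <start_ms> <stop_ms>"
CUE_RE = re.compile(r'^\s*([0-9]+);\s*([0-9]+)\s+([0-9]+)\s*$')


def msec_to_timecode(msec):
    """Convert a millisecond offset to an SRT timecode (HH:MM:SS,mmm)."""
    seconds, millis = divmod(msec, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, seconds, millis)


def fix_subtitle(text):
    """Rewrite millisecond-based cue headers; pass all other lines through."""
    out = []
    for line in text.splitlines():
        m = CUE_RE.match(line)
        if m:
            out.append(m.group(1))  # cue index on its own line
            start = msec_to_timecode(int(m.group(2)))
            stop = msec_to_timecode(int(m.group(3)))
            out.append('%s --> %s' % (start, stop))
        else:
            out.append(line)
    return '\r\n'.join(out)


if __name__ == '__main__':
    print(fix_subtitle('1; 1000 2500\nHello'))
```

Using `%`-formatting avoids the `str.format` field-numbering differences between Python versions that the "Fix python 2.6 format issue" commit refers to.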

View File

@@ -5,6 +5,7 @@ import re
from .common import InfoExtractor
from ..utils import ExtractorError
class Channel9IE(InfoExtractor):
'''
Common extractor for channel9.msdn.com.
@@ -31,7 +32,7 @@ class Channel9IE(InfoExtractor):
'session_code': 'KOS002',
'session_day': 'Day 1',
'session_room': 'Arena 1A',
'session_speakers': [ 'Ed Blankenship', 'Andrew Coates', 'Brady Gaster', 'Patrick Klug', 'Mads Kristensen' ],
'session_speakers': ['Ed Blankenship', 'Andrew Coates', 'Brady Gaster', 'Patrick Klug', 'Mads Kristensen'],
},
},
{
@@ -44,7 +45,7 @@ class Channel9IE(InfoExtractor):
'description': 'md5:d1e6ecaafa7fb52a2cacdf9599829f5b',
'duration': 1540,
'thumbnail': 'http://video.ch9.ms/ch9/87e1/0300391f-a455-4c72-bec3-4422f19287e1/selfservicenuk_512.jpg',
'authors': [ 'Mike Wilmot' ],
'authors': ['Mike Wilmot'],
},
}
]
@@ -83,7 +84,7 @@ class Channel9IE(InfoExtractor):
'format_id': x.group('quality'),
'format_note': x.group('note'),
'format': '%s (%s)' % (x.group('quality'), x.group('note')),
'filesize': self._restore_bytes(x.group('filesize')), # File size is approximate
'filesize': self._restore_bytes(x.group('filesize')), # File size is approximate
'preference': self._known_formats.index(x.group('quality')),
'vcodec': 'none' if x.group('note') == 'Audio only' else None,
} for x in list(re.finditer(FORMAT_REGEX, html)) if x.group('quality') in self._known_formats]
@@ -187,32 +188,33 @@ class Channel9IE(InfoExtractor):
view_count = self._extract_view_count(html)
comment_count = self._extract_comment_count(html)
common = {'_type': 'video',
'id': content_path,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'avg_rating': avg_rating,
'rating_count': rating_count,
'view_count': view_count,
'comment_count': comment_count,
}
common = {
'_type': 'video',
'id': content_path,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'avg_rating': avg_rating,
'rating_count': rating_count,
'view_count': view_count,
'comment_count': comment_count,
}
result = []
if slides is not None:
d = common.copy()
d.update({ 'title': title + '-Slides', 'url': slides })
d.update({'title': title + '-Slides', 'url': slides})
result.append(d)
if zip_ is not None:
d = common.copy()
d.update({ 'title': title + '-Zip', 'url': zip_ })
d.update({'title': title + '-Zip', 'url': zip_})
result.append(d)
if len(formats) > 0:
d = common.copy()
d.update({ 'title': title, 'formats': formats })
d.update({'title': title, 'formats': formats})
result.append(d)
return result
@@ -234,16 +236,17 @@ class Channel9IE(InfoExtractor):
if contents is None:
return contents
session_meta = {'session_code': self._extract_session_code(html),
'session_day': self._extract_session_day(html),
'session_room': self._extract_session_room(html),
'session_speakers': self._extract_session_speakers(html),
}
session_meta = {
'session_code': self._extract_session_code(html),
'session_day': self._extract_session_day(html),
'session_room': self._extract_session_room(html),
'session_speakers': self._extract_session_speakers(html),
}
for content in contents:
content.update(session_meta)
return contents
return self.playlist_result(contents)
def _extract_list(self, content_path):
rss = self._download_xml(self._RSS_URL % content_path, content_path, 'Downloading RSS')
@@ -270,5 +273,5 @@ class Channel9IE(InfoExtractor):
else:
raise ExtractorError('Unexpected WT.entryid %s' % page_type, expected=True)
else: # Assuming list
else: # Assuming list
return self._extract_list(content_path)

View File

@@ -0,0 +1,52 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
unified_strdate,
xpath_text,
)
class CinchcastIE(InfoExtractor):
_VALID_URL = r'https?://player\.cinchcast\.com/.*?assetId=(?P<id>[0-9]+)'
_TEST = {
# Actual test is run in generic, look for undergroundwellness
'url': 'http://player.cinchcast.com/?platformId=1&#038;assetType=single&#038;assetId=7141703',
'only_matching': True,
}
def _real_extract(self, url):
video_id = self._match_id(url)
doc = self._download_xml(
'http://www.blogtalkradio.com/playerasset/mrss?assetType=single&assetId=%s' % video_id,
video_id)
item = doc.find('.//item')
title = xpath_text(item, './title', fatal=True)
date_str = xpath_text(
item, './{http://developer.longtailvideo.com/trac/}date')
upload_date = unified_strdate(date_str, day_first=False)
# duration is present but wrong
formats = []
formats.append({
'format_id': 'main',
'url': item.find(
'./{http://search.yahoo.com/mrss/}content').attrib['url'],
})
backup_url = xpath_text(
item, './{http://developer.longtailvideo.com/trac/}backupContent')
if backup_url:
formats.append({
'preference': 2, # seems to be more reliable
'format_id': 'backup',
'url': backup_url,
})
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'upload_date': upload_date,
'formats': formats,
}

View File

@@ -24,7 +24,7 @@ class ClipfishIE(InfoExtractor):
'title': 'FIFA 14 - E3 2013 Trailer',
'duration': 82,
},
u'skip': 'Blocked in the US'
'skip': 'Blocked in the US'
}
def _real_extract(self, url):
@@ -34,7 +34,7 @@ class ClipfishIE(InfoExtractor):
info_url = ('http://www.clipfish.de/devxml/videoinfo/%s?ts=%d' %
(video_id, int(time.time())))
doc = self._download_xml(
info_url, video_id, note=u'Downloading info page')
info_url, video_id, note='Downloading info page')
title = doc.find('title').text
video_url = doc.find('filename').text
if video_url is None:

View File

@@ -39,6 +39,7 @@ class ClipsyndicateIE(InfoExtractor):
transform_source=fix_xml_ampersands)
track_doc = pdoc.find('trackList/track')
def find_param(name):
node = find_xpath_attr(track_doc, './/param', 'name', name)
if node is not None:

View File

@@ -2,12 +2,10 @@
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
)
@@ -15,23 +13,24 @@ class CNETIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?cnet\.com/videos/(?P<id>[^/]+)/'
_TEST = {
'url': 'http://www.cnet.com/videos/hands-on-with-microsofts-windows-8-1-update/',
'md5': '041233212a0d06b179c87cbcca1577b8',
'info_dict': {
'id': '56f4ea68-bd21-4852-b08c-4de5b8354c60',
'ext': 'mp4',
'ext': 'flv',
'title': 'Hands-on with Microsoft Windows 8.1 Update',
'description': 'The new update to the Windows 8 OS brings improved performance for mouse and keyboard users.',
'thumbnail': 're:^http://.*/flmswindows8.jpg$',
'uploader_id': 'sarah.mitroff@cbsinteractive.com',
'uploader_id': '6085384d-619e-11e3-b231-14feb5ca9861',
'uploader': 'Sarah Mitroff',
},
'params': {
'skip_download': 'requires rtmpdump',
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id')
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
data_json = self._html_search_regex(
r"<div class=\"cnetVideoPlayer\"\s+.*?data-cnet-video-options='([^']+)'",
webpage, 'data json')
@@ -42,37 +41,31 @@ class CNETIE(InfoExtractor):
if not vdata:
raise ExtractorError('Cannot find video data')
mpx_account = data['config']['players']['default']['mpx_account']
vid = vdata['files']['rtmp']
tp_link = 'http://link.theplatform.com/s/%s/%s' % (mpx_account, vid)
video_id = vdata['id']
title = vdata.get('headline')
if title is None:
title = vdata.get('title')
if title is None:
raise ExtractorError('Cannot find title!')
description = vdata.get('dek')
thumbnail = vdata.get('image', {}).get('path')
author = vdata.get('author')
if author:
uploader = '%s %s' % (author['firstName'], author['lastName'])
uploader_id = author.get('email')
uploader_id = author.get('id')
else:
uploader = None
uploader_id = None
formats = [{
'format_id': '%s-%s-%s' % (
f['type'], f['format'],
int_or_none(f.get('bitrate'), 1000, default='')),
'url': f['uri'],
'tbr': int_or_none(f.get('bitrate'), 1000),
} for f in vdata['files']['data']]
self._sort_formats(formats)
return {
'_type': 'url_transparent',
'url': tp_link,
'id': video_id,
'display_id': display_id,
'title': title,
'formats': formats,
'description': description,
'uploader': uploader,
'uploader_id': uploader_id,
'thumbnail': thumbnail,

View File

@@ -11,22 +11,21 @@ from ..utils import (
class CNNIE(InfoExtractor):
_VALID_URL = r'''(?x)https?://((edition|www)\.)?cnn\.com/video/(data/.+?|\?)/
(?P<path>.+?/(?P<title>[^/]+?)(?:\.cnn(-ap)?|(?=&)))'''
_VALID_URL = r'''(?x)https?://(?:(?:edition|www)\.)?cnn\.com/video/(?:data/.+?|\?)/
(?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:cnn|hln)(?:-ap)?|(?=&)))'''
_TESTS = [{
'url': 'http://edition.cnn.com/video/?/video/sports/2013/06/09/nadal-1-on-1.cnn',
'md5': '3e6121ea48df7e2259fe73a0628605c4',
'info_dict': {
'id': 'sports_2013_06_09_nadal-1-on-1.cnn',
'id': 'sports/2013/06/09/nadal-1-on-1.cnn',
'ext': 'mp4',
'title': 'Nadal wins 8th French Open title',
'description': 'World Sport\'s Amanda Davies chats with 2013 French Open champion Rafael Nadal.',
'duration': 135,
'upload_date': '20130609',
},
},
{
}, {
"url": "http://edition.cnn.com/video/?/video/us/2013/08/21/sot-student-gives-epic-speech.georgia-institute-of-technology&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+rss%2Fcnn_topstories+%28RSS%3A+Top+Stories%29",
"md5": "b5cc60c60a3477d185af8f19a2a26f4e",
"info_dict": {
@@ -36,6 +35,16 @@ class CNNIE(InfoExtractor):
"description": "A Georgia Tech student welcomes the incoming freshmen with an epic speech backed by music from \"2001: A Space Odyssey.\"",
"upload_date": "20130821",
}
}, {
'url': 'http://www.cnn.com/video/data/2.0/video/living/2014/12/22/growing-america-nashville-salemtown-board-episode-1.hln.html',
'md5': 'f14d02ebd264df951feb2400e2c25a1b',
'info_dict': {
'id': 'living/2014/12/22/growing-america-nashville-salemtown-board-episode-1.hln',
'ext': 'mp4',
'title': 'Nashville Ep. 1: Hand crafted skateboards',
'description': 'md5:e7223a503315c9f150acac52e76de086',
'upload_date': '20141222',
}
}]
def _real_extract(self, url):
@@ -128,3 +137,28 @@ class CNNBlogsIE(InfoExtractor):
'url': cnn_url,
'ie_key': CNNIE.ie_key(),
}
class CNNArticleIE(InfoExtractor):
_VALID_URL = r'https?://(?:(?:edition|www)\.)?cnn\.com/(?!video/)'
_TEST = {
'url': 'http://www.cnn.com/2014/12/21/politics/obama-north-koreas-hack-not-war-but-cyber-vandalism/',
'md5': '275b326f85d80dff7592a9820f5dc887',
'info_dict': {
'id': 'bestoftv/2014/12/21/sotu-crowley-president-obama-north-korea-not-going-to-be-intimidated.cnn',
'ext': 'mp4',
'title': 'Obama: We\'re not going to be intimidated',
'description': 'md5:e735586f3dc936075fa654a4d91b21f9',
'upload_date': '20141220',
},
'add_ie': ['CNN'],
}
def _real_extract(self, url):
webpage = self._download_webpage(url, url_basename(url))
cnn_url = self._html_search_regex(r"video:\s*'([^']+)'", webpage, 'cnn url')
return {
'_type': 'url',
'url': 'http://cnn.com/video/?/video/' + cnn_url,
'ie_key': CNNIE.ie_key(),
}
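
The effect of the `_VALID_URL` change in `CNNIE` above can be checked in isolation. The sketch below copies the old and new patterns from the diff and matches them against the newly added `.hln` test URL; the constant names (`OLD_VALID_URL`, `NEW_VALID_URL`, `HLN_URL`) are chosen for this illustration and do not exist in the extractor.

```python
import re

# Pattern before the change: only the ".cnn" suffix is recognised.
OLD_VALID_URL = r'''(?x)https?://((edition|www)\.)?cnn\.com/video/(data/.+?|\?)/
    (?P<path>.+?/(?P<title>[^/]+?)(?:\.cnn(-ap)?|(?=&)))'''

# Pattern after the change: non-capturing groups, ".hln" also accepted.
NEW_VALID_URL = r'''(?x)https?://(?:(?:edition|www)\.)?cnn\.com/video/(?:data/.+?|\?)/
    (?P<path>.+?/(?P<title>[^/]+?)(?:\.(?:cnn|hln)(?:-ap)?|(?=&)))'''

HLN_URL = ('http://www.cnn.com/video/data/2.0/video/living/2014/12/22/'
           'growing-america-nashville-salemtown-board-episode-1.hln.html')

old_match = re.match(OLD_VALID_URL, HLN_URL)
new_match = re.match(NEW_VALID_URL, HLN_URL)

if __name__ == '__main__':
    print('old pattern matches:', old_match is not None)
    print('new pattern title:', new_match.group('title'))
```

Because the patterns use the `(?x)` verbose flag, the line break and indentation inside the raw strings are ignored by the regex engine.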

View File

@@ -10,47 +10,46 @@ from ..utils import int_or_none
class CollegeHumorIE(InfoExtractor):
_VALID_URL = r'^(?:https?://)?(?:www\.)?collegehumor\.com/(video|embed|e)/(?P<videoid>[0-9]+)/?(?P<shorttitle>.*)$'
_TESTS = [{
'url': 'http://www.collegehumor.com/video/6902724/comic-con-cosplay-catastrophe',
'md5': 'dcc0f5c1c8be98dc33889a191f4c26bd',
'info_dict': {
'id': '6902724',
'ext': 'mp4',
'title': 'Comic-Con Cosplay Catastrophe',
'description': "Fans get creative this year at San Diego. Too creative. And yes, that's really Joss Whedon.",
'age_limit': 13,
'duration': 187,
_TESTS = [
{
'url': 'http://www.collegehumor.com/video/6902724/comic-con-cosplay-catastrophe',
'md5': 'dcc0f5c1c8be98dc33889a191f4c26bd',
'info_dict': {
'id': '6902724',
'ext': 'mp4',
'title': 'Comic-Con Cosplay Catastrophe',
'description': "Fans get creative this year at San Diego. Too creative. And yes, that's really Joss Whedon.",
'age_limit': 13,
'duration': 187,
},
}, {
'url': 'http://www.collegehumor.com/video/3505939/font-conference',
'md5': '72fa701d8ef38664a4dbb9e2ab721816',
'info_dict': {
'id': '3505939',
'ext': 'mp4',
'title': 'Font Conference',
'description': "This video wasn't long enough, so we made it double-spaced.",
'age_limit': 10,
'duration': 179,
},
}, {
# embedded youtube video
'url': 'http://www.collegehumor.com/embed/6950306',
'info_dict': {
'id': 'Z-bao9fg6Yc',
'ext': 'mp4',
'title': 'Young Americans Think President John F. Kennedy Died THIS MORNING IN A CAR ACCIDENT!!!',
'uploader': 'Mark Dice',
'uploader_id': 'MarkDice',
'description': 'md5:62c3dab9351fac7bb44b53b69511d87f',
'upload_date': '20140127',
},
'params': {
'skip_download': True,
},
'add_ie': ['Youtube'],
},
},
{
'url': 'http://www.collegehumor.com/video/3505939/font-conference',
'md5': '72fa701d8ef38664a4dbb9e2ab721816',
'info_dict': {
'id': '3505939',
'ext': 'mp4',
'title': 'Font Conference',
'description': "This video wasn't long enough, so we made it double-spaced.",
'age_limit': 10,
'duration': 179,
},
},
# embedded youtube video
{
'url': 'http://www.collegehumor.com/embed/6950306',
'info_dict': {
'id': 'Z-bao9fg6Yc',
'ext': 'mp4',
'title': 'Young Americans Think President John F. Kennedy Died THIS MORNING IN A CAR ACCIDENT!!!',
'uploader': 'Mark Dice',
'uploader_id': 'MarkDice',
'description': 'md5:62c3dab9351fac7bb44b53b69511d87f',
'upload_date': '20140127',
},
'params': {
'skip_download': True,
},
'add_ie': ['Youtube'],
},
]
def _real_extract(self, url):

View File

@@ -0,0 +1,57 @@
# encoding: utf-8
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import parse_iso8601
class ComCarCoffIE(InfoExtractor):
_VALID_URL = r'http://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
_TESTS = [{
'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
'info_dict': {
'id': 'miranda-sings-happy-thanksgiving-miranda',
'ext': 'mp4',
'upload_date': '20141127',
'timestamp': 1417107600,
'title': 'Happy Thanksgiving Miranda',
'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
'thumbnail': 'http://ccc.crackle.com/images/s5e4_thumb.jpg',
},
'params': {
'skip_download': 'requires ffmpeg',
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
if not display_id:
display_id = 'comediansincarsgettingcoffee.com'
webpage = self._download_webpage(url, display_id)
full_data = json.loads(self._search_regex(
r'<script type="application/json" id="videoData">(?P<json>.+?)</script>',
webpage, 'full data json'))
video_id = full_data['activeVideo']['video']
video_data = full_data['videos'][video_id]
thumbnails = [{
'url': video_data['images']['thumb'],
}, {
'url': video_data['images']['poster'],
}]
formats = self._extract_m3u8_formats(
video_data['mediaUrl'], video_id, ext='mp4')
return {
'id': video_id,
'display_id': display_id,
'title': video_data['title'],
'description': video_data.get('description'),
'timestamp': parse_iso8601(video_data.get('pubDate')),
'thumbnails': thumbnails,
'formats': formats,
'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
}
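
The lookup that `ComCarCoffIE._real_extract` performs on the embedded JSON can be sketched without the extractor framework. This is an illustration under assumptions: `extract_video_data` is a hypothetical helper, and the sample page contains only the keys that the diff reads (`activeVideo`, `videos`, `title`, `mediaUrl`); the real page layout may differ.

```python
import json
import re

# Same pattern as in the diff: the JSON blob lives in a <script> tag.
VIDEO_DATA_RE = re.compile(
    r'<script type="application/json" id="videoData">(?P<json>.+?)</script>')


def extract_video_data(webpage):
    """Return (video_id, video_data) for the active video on the page."""
    m = VIDEO_DATA_RE.search(webpage)
    if m is None:
        raise ValueError('videoData script tag not found')
    full_data = json.loads(m.group('json'))
    video_id = full_data['activeVideo']['video']
    return video_id, full_data['videos'][video_id]


if __name__ == '__main__':
    # Hypothetical page content, made up for this sketch.
    sample = {
        'activeVideo': {'video': '42'},
        'videos': {'42': {'title': 'Example', 'mediaUrl': 'http://example.com/a.m3u8'}},
    }
    page = ('<html><script type="application/json" id="videoData">'
            + json.dumps(sample) + '</script></html>')
    print(extract_video_data(page))
```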

View File

@@ -3,9 +3,11 @@ from __future__ import unicode_literals
import re
from .mtv import MTVServicesInfoExtractor
from ..utils import (
from ..compat import (
compat_str,
compat_urllib_parse,
)
from ..utils import (
ExtractorError,
float_or_none,
unified_strdate,
@@ -48,7 +50,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
)|
(?P<interview>
extended-interviews/(?P<interID>[0-9a-z]+)/(?:playlist_tds_extended_)?(?P<interview_title>.*?)(/.*?)?)))
(?:[?#].*|$)'''
'''
_TESTS = [{
'url': 'http://thedailyshow.cc.com/watch/thu-december-13-2012/kristen-stewart',
'md5': '4e2f5cb088a83cd8cdb7756132f9739d',
@@ -81,6 +83,9 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
}, {
'url': 'http://thedailyshow.cc.com/video-playlists/npde3s/the-daily-show-19088-highlights',
'only_matching': True,
}, {
'url': 'http://thedailyshow.cc.com/video-playlists/t6d9sg/the-daily-show-20038-highlights/be3cwo',
'only_matching': True,
}, {
'url': 'http://thedailyshow.cc.com/special-editions/2l8fdb/special-edition---a-look-back-at-food',
'only_matching': True,

View File

@@ -13,6 +13,7 @@ import time
import xml.etree.ElementTree
from ..compat import (
compat_cookiejar,
compat_http_client,
compat_urllib_error,
compat_urllib_parse_urlparse,
@@ -20,6 +21,7 @@ from ..compat import (
compat_str,
)
from ..utils import (
age_restricted,
clean_html,
compiled_regex_type,
ExtractorError,
@@ -39,7 +41,7 @@ class InfoExtractor(object):
information about the video (or videos) the URL refers to. This
information includes the real video URL, the video title, author and
others. The information is stored in a dictionary which is then
passed to the FileDownloader. The FileDownloader processes this
passed to the YoutubeDL. The YoutubeDL processes this
information possibly downloading the video to the file system, among
other possible outcomes.
@@ -91,6 +93,8 @@ class InfoExtractor(object):
by this field, regardless of all other values.
-1 for default (order by other properties),
-2 or smaller for less than default.
< -1000 to hide the format (if there is
another one which is strictly better)
* language_preference Is this in the correct requested
language?
10 if it's what the URL is about,
@@ -117,6 +121,7 @@ class InfoExtractor(object):
The following fields are optional:
alt_title: A secondary title of the video.
display_id: An alternative identifier for the video, not necessarily
unique, but available before title. Typically, id is
something like "4234987", title "Dancing naked mole rats",
@@ -128,7 +133,7 @@ class InfoExtractor(object):
* "resolution" (optional, string "{width}x{height}",
deprecated)
thumbnail: Full URL to a video thumbnail image.
description: One-line video description.
description: Full video description.
uploader: Full name of the video uploader.
timestamp: UNIX timestamp of the moment the video became available.
upload_date: Video upload date (YYYYMMDD).
@@ -157,8 +162,8 @@ class InfoExtractor(object):
_type "playlist" indicates multiple videos.
There must be a key "entries", which is a list or a PagedList object, each
element of which is a valid dictionary under this specfication.
There must be a key "entries", which is a list, an iterable, or a PagedList
object, each element of which is a valid dictionary by this specification.
Additionally, playlists can have "title" and "id" attributes with the same
semantics as videos (see above).
@@ -173,9 +178,10 @@ class InfoExtractor(object):
_type "url" indicates that the video must be extracted from another
location, possibly by a different extractor. Its only required key is:
"url" - the next URL to extract.
Additionally, it may have properties believed to be identical to the
resolved entity, for example "title" if the title of the referred video is
The key "ie_key" can be set to the class name (minus the trailing "IE",
e.g. "Youtube") if the extractor class is known in advance.
Additionally, the dictionary may have any properties of the resolved entity
known in advance, for example "title" if the title of the referred video is
known ahead of time.
@@ -296,9 +302,11 @@ class InfoExtractor(object):
content = self._webpage_read_content(urlh, url_or_request, video_id, note, errnote, fatal)
return (content, urlh)
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True):
def _webpage_read_content(self, urlh, url_or_request, video_id, note=None, errnote=None, fatal=True, prefix=None):
content_type = urlh.headers.get('Content-Type', '')
webpage_bytes = urlh.read()
if prefix is not None:
webpage_bytes = prefix + webpage_bytes
m = re.match(r'[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+\s*;\s*charset=(.+)', content_type)
if m:
encoding = m.group(1)
@@ -387,6 +395,10 @@ class InfoExtractor(object):
url_or_request, video_id, note, errnote, fatal=fatal)
if (not fatal) and json_string is False:
return None
return self._parse_json(
json_string, video_id, transform_source=transform_source, fatal=fatal)
def _parse_json(self, json_string, video_id, transform_source=None, fatal=True):
if transform_source:
json_string = transform_source(json_string)
try:
@@ -423,19 +435,20 @@ class InfoExtractor(object):
"""Report attempt to log in."""
self.to_screen('Logging in')
#Methods for following #608
# Methods for following #608
@staticmethod
def url_result(url, ie=None, video_id=None):
"""Returns a url that points to a page that should be processed"""
#TODO: ie should be the class used for getting the info
# TODO: ie should be the class used for getting the info
video_info = {'_type': 'url',
'url': url,
'ie_key': ie}
if video_id is not None:
video_info['id'] = video_id
return video_info
@staticmethod
def playlist_result(entries, playlist_id=None, playlist_title=None):
def playlist_result(entries, playlist_id=None, playlist_title=None, playlist_description=None):
"""Returns a playlist"""
video_info = {'_type': 'playlist',
'entries': entries}
@@ -443,6 +456,8 @@ class InfoExtractor(object):
video_info['id'] = playlist_id
if playlist_title:
video_info['title'] = playlist_title
if playlist_description:
video_info['description'] = playlist_description
return video_info
def _search_regex(self, pattern, string, name, default=_NO_DEFAULT, fatal=True, flags=0, group=None):
@@ -477,7 +492,7 @@ class InfoExtractor(object):
raise RegexNotFoundError('Unable to extract %s' % _name)
else:
self._downloader.report_warning('unable to extract %s; '
'please report this issue on http://yt-dl.org/bug' % _name)
'please report this issue on http://yt-dl.org/bug' % _name)
return None
def _html_search_regex(self, pattern, string, name, default=_NO_DEFAULT, fatal=True, flags=0, group=None):
@@ -517,7 +532,7 @@ class InfoExtractor(object):
raise netrc.NetrcParseError('No authenticators for %s' % self._NETRC_MACHINE)
except (IOError, netrc.NetrcParseError) as err:
self._downloader.report_warning('parsing .netrc: %s' % compat_str(err))
return (username, password)
def _get_tfa_info(self):
@@ -577,7 +592,7 @@ class InfoExtractor(object):
if display_name is None:
display_name = name
return self._html_search_regex(
r'''(?ix)<meta
r'''(?isx)<meta
(?=[^>]+(?:itemprop|name|property)=(["\']?)%s\1)
[^>]+content=(["\'])(?P<content>.*?)\1''' % re.escape(name),
html, display_name, fatal=fatal, group='content', **kwargs)
@@ -611,7 +626,7 @@ class InfoExtractor(object):
def _twitter_search_player(self, html):
return self._html_search_meta('twitter:player', html,
'twitter card player')
'twitter card player')
def _sort_formats(self, formats):
if not formats:
@@ -786,6 +801,49 @@ class InfoExtractor(object):
self._sort_formats(formats)
return formats
# TODO: improve extraction
def _extract_smil_formats(self, smil_url, video_id):
smil = self._download_xml(
smil_url, video_id, 'Downloading SMIL file',
'Unable to download SMIL file')
base = smil.find('./head/meta').get('base')
formats = []
rtmp_count = 0
for video in smil.findall('./body/switch/video'):
src = video.get('src')
if not src:
continue
bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
width = int_or_none(video.get('width'))
height = int_or_none(video.get('height'))
proto = video.get('proto')
if not proto:
if base:
if base.startswith('rtmp'):
proto = 'rtmp'
elif base.startswith('http'):
proto = 'http'
ext = video.get('ext')
if proto == 'm3u8':
formats.extend(self._extract_m3u8_formats(src, video_id, ext))
elif proto == 'rtmp':
rtmp_count += 1
streamer = video.get('streamer') or base
formats.append({
'url': streamer,
'play_path': src,
'ext': 'flv',
'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
'tbr': bitrate,
'width': width,
'height': height,
})
self._sort_formats(formats)
return formats
def _live_title(self, name):
""" Generate the title for a live video """
now = datetime.datetime.now()
@@ -814,6 +872,41 @@ class InfoExtractor(object):
self._downloader.report_warning(msg)
return res
def _set_cookie(self, domain, name, value, expire_time=None):
cookie = compat_cookiejar.Cookie(
0, name, value, None, None, domain, None,
None, '/', True, False, expire_time, '', None, None, None)
self._downloader.cookiejar.set_cookie(cookie)
def get_testcases(self, include_onlymatching=False):
t = getattr(self, '_TEST', None)
if t:
assert not hasattr(self, '_TESTS'), \
'%s has _TEST and _TESTS' % type(self).__name__
tests = [t]
else:
tests = getattr(self, '_TESTS', [])
for t in tests:
if not include_onlymatching and t.get('only_matching', False):
continue
t['name'] = type(self).__name__[:-len('IE')]
yield t
def is_suitable(self, age_limit):
""" Test whether the extractor is generally suitable for the given
age limit (i.e. pornographic sites are not, all others usually are) """
any_restricted = False
for tc in self.get_testcases(include_onlymatching=False):
if 'playlist' in tc:
tc = tc['playlist'][0]
is_restricted = age_restricted(
tc.get('info_dict', {}).get('age_limit'), age_limit)
if not is_restricted:
return True
any_restricted = any_restricted or is_restricted
return not any_restricted
class SearchInfoExtractor(InfoExtractor):
"""

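The decision logic of `is_suitable` above can be exercised outside the class. The sketch below is a simplified stand-in, not the library code: `age_restricted` is re-implemented here under the assumption that content is restricted only when both limits are known and the user's limit is lower than the content's, and the test cases are passed in as a plain list instead of coming from `get_testcases`.

```python
def age_restricted(content_limit, age_limit):
    """Assumed semantics: restricted only if both limits are set and the
    user's age limit is below the content's age limit."""
    if age_limit is None or content_limit is None:
        return False
    return age_limit < content_limit


def is_suitable(testcases, age_limit):
    """Mirror of the loop in the diff, operating on a list of test dicts."""
    any_restricted = False
    for tc in testcases:
        if 'playlist' in tc:
            # For playlist tests, only the first entry is inspected.
            tc = tc['playlist'][0]
        is_restricted = age_restricted(
            tc.get('info_dict', {}).get('age_limit'), age_limit)
        if not is_restricted:
            return True  # one unrestricted test case is enough
        any_restricted = any_restricted or is_restricted
    return not any_restricted


if __name__ == '__main__':
    adult_only = [{'info_dict': {'age_limit': 18}}]
    mixed = adult_only + [{'info_dict': {'title': 'no limit given'}}]
    print(is_suitable(adult_only, 10))
    print(is_suitable(mixed, 10))
```

An extractor with no test cases at all is treated as suitable, since nothing marks it as restricted.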
View File

@@ -0,0 +1,29 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import ExtractorError
class CommonMistakesIE(InfoExtractor):
IE_DESC = False # Do not list
_VALID_URL = r'''(?x)
(?:url|URL)
'''
_TESTS = [{
'url': 'url',
'only_matching': True,
}, {
'url': 'URL',
'only_matching': True,
}]
def _real_extract(self, url):
msg = (
'You\'ve asked youtube-dl to download the URL "%s". '
'That doesn\'t make any sense. '
'Simply remove the parameter in your command or configuration.'
) % url
if not self._downloader.params.get('verbose'):
msg += ' Add -v to the command line to see what arguments and configuration youtube-dl got.'
raise ExtractorError(msg, expected=True)

View File

@@ -5,12 +5,14 @@ import re
import json
from .common import InfoExtractor
from ..utils import (
from ..compat import (
compat_urllib_parse,
orderedSet,
compat_urllib_parse_urlparse,
compat_urlparse,
)
from ..utils import (
orderedSet,
)
class CondeNastIE(InfoExtractor):

View File

@@ -54,7 +54,7 @@ class CrackedIE(InfoExtractor):
return {
'id': video_id,
'url':video_url,
'url': video_url,
'title': title,
'description': description,
'timestamp': timestamp,
@@ -62,4 +62,4 @@ class CrackedIE(InfoExtractor):
'comment_count': comment_count,
'height': height,
'width': width,
}
}

View File

@@ -10,10 +10,12 @@ import xml.etree.ElementTree
from hashlib import sha1
from math import pow, sqrt, floor
from .subtitles import SubtitlesInfoExtractor
from ..utils import (
ExtractorError,
from ..compat import (
compat_urllib_parse,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
bytes_to_intlist,
intlist_to_bytes,
unified_strdate,
@@ -27,10 +29,9 @@ from .common import InfoExtractor
class CrunchyrollIE(SubtitlesInfoExtractor):
_VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.com/(?:[^/]*/[^/?&]*?|media/\?id=)(?P<video_id>[0-9]+))(?:[/?&]|$)'
_TEST = {
_VALID_URL = r'https?://(?:(?P<prefix>www|m)\.)?(?P<url>crunchyroll\.(?:com|fr)/(?:[^/]*/[^/?&]*?|media/\?id=)(?P<video_id>[0-9]+))(?:[/?&]|$)'
_TESTS = [{
'url': 'http://www.crunchyroll.com/wanna-be-the-strongest-in-the-world/episode-1-an-idol-wrestler-is-born-645513',
#'md5': 'b1639fd6ddfaa43788c85f6d1dddd412',
'info_dict': {
'id': '645513',
'ext': 'flv',
@@ -45,7 +46,10 @@ class CrunchyrollIE(SubtitlesInfoExtractor):
# rtmp
'skip_download': True,
},
}
}, {
'url': 'http://www.crunchyroll.fr/girl-friend-beta/episode-11-goodbye-la-mode-661697',
'only_matching': True,
}]
_FORMAT_IDS = {
'360': ('60', '106'),
@@ -69,11 +73,9 @@ class CrunchyrollIE(SubtitlesInfoExtractor):
login_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
self._download_webpage(login_request, None, False, 'Wrong login info')
def _real_initialize(self):
self._login()
def _decrypt_subtitles(self, data, iv, id):
data = bytes_to_intlist(data)
iv = bytes_to_intlist(iv)
@@ -99,8 +101,10 @@ class CrunchyrollIE(SubtitlesInfoExtractor):
return shaHash + [0] * 12
key = obfuscate_key(id)
class Counter:
__value = iv
def next_value(self):
temp = self.__value
self.__value = inc(self.__value)
@@ -183,7 +187,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
return output
def _real_extract(self,url):
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('video_id')
@@ -224,12 +228,12 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
video_thumbnail = self._search_regex(r'<episode_image_url>([^<]+)', playerdata, 'thumbnail', fatal=False)
formats = []
for fmt in re.findall(r'\?p([0-9]{3,4})=1', webpage):
for fmt in re.findall(r'showmedia\.([0-9]{3,4})p', webpage):
stream_quality, stream_format = self._FORMAT_IDS[fmt]
video_format = fmt+'p'
video_format = fmt + 'p'
streamdata_req = compat_urllib_request.Request('http://www.crunchyroll.com/xml/')
# urlencode doesn't work!
streamdata_req.data = 'req=RpcApiVideoEncode%5FGetStreamInfo&video%5Fencode%5Fquality='+stream_quality+'&media%5Fid='+stream_id+'&video%5Fformat='+stream_format
streamdata_req.data = 'req=RpcApiVideoEncode%5FGetStreamInfo&video%5Fencode%5Fquality=' + stream_quality + '&media%5Fid=' + stream_id + '&video%5Fformat=' + stream_format
streamdata_req.add_header('Content-Type', 'application/x-www-form-urlencoded')
streamdata_req.add_header('Content-Length', str(len(streamdata_req.data)))
streamdata = self._download_xml(
@@ -248,8 +252,9 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
subtitles = {}
sub_format = self._downloader.params.get('subtitlesformat', 'srt')
for sub_id, sub_name in re.findall(r'\?ssid=([0-9]+)" title="([^"]+)', webpage):
sub_page = self._download_webpage('http://www.crunchyroll.com/xml/?req=RpcApiSubtitle_GetXml&subtitle_script_id='+sub_id,\
video_id, note='Downloading subtitles for '+sub_name)
sub_page = self._download_webpage(
'http://www.crunchyroll.com/xml/?req=RpcApiSubtitle_GetXml&subtitle_script_id=' + sub_id,
video_id, note='Downloading subtitles for ' + sub_name)
id = self._search_regex(r'id=\'([0-9]+)', sub_page, 'subtitle_id', fatal=False)
iv = self._search_regex(r'<iv>([^<]+)', sub_page, 'subtitle_iv', fatal=False)
data = self._search_regex(r'<data>([^<]+)', sub_page, 'subtitle_data', fatal=False)
@@ -274,14 +279,14 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
return
return {
'id': video_id,
'title': video_title,
'id': video_id,
'title': video_title,
'description': video_description,
'thumbnail': video_thumbnail,
'uploader': video_uploader,
'thumbnail': video_thumbnail,
'uploader': video_uploader,
'upload_date': video_upload_date,
'subtitles': subtitles,
'formats': formats,
'subtitles': subtitles,
'formats': formats,
}

View File

@@ -27,7 +27,6 @@ class CSpanIE(InfoExtractor):
'url': 'http://www.c-span.org/video/?c4486943/cspan-international-health-care-models',
# For whatever reason, the served video alternates between
# two different ones
#'md5': 'dbb0f047376d457f2ab8b3929cbb2d0c',
'info_dict': {
'id': '340723',
'ext': 'mp4',

View File

@@ -1,4 +1,4 @@
#coding: utf-8
# coding: utf-8
from __future__ import unicode_literals
import re
@@ -8,16 +8,19 @@ import itertools
from .common import InfoExtractor
from .subtitles import SubtitlesInfoExtractor
from ..utils import (
compat_urllib_request,
from ..compat import (
compat_str,
compat_urllib_request,
)
from ..utils import (
ExtractorError,
int_or_none,
orderedSet,
str_to_int,
int_or_none,
ExtractorError,
unescapeHTML,
)
class DailymotionBaseInfoExtractor(InfoExtractor):
@staticmethod
def _build_request(url):
@@ -27,6 +30,7 @@ class DailymotionBaseInfoExtractor(InfoExtractor):
request.add_header('Cookie', 'ff=off')
return request
class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
"""Information Extractor for Dailymotion"""
@@ -112,7 +116,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor, SubtitlesInfoExtractor):
embed_page = self._download_webpage(embed_url, video_id,
'Downloading embed page')
info = self._search_regex(r'var info = ({.*?}),$', embed_page,
'video info', flags=re.MULTILINE)
'video info', flags=re.MULTILINE)
info = json.loads(info)
if info.get('error') is not None:
msg = 'Couldn\'t get video, Dailymotion says: %s' % info['error']['title']
@@ -206,7 +210,7 @@ class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
if re.search(self._MORE_PAGES_INDICATOR, webpage) is None:
break
return [self.url_result('http://www.dailymotion.com/video/%s' % video_id, 'Dailymotion')
for video_id in orderedSet(video_ids)]
for video_id in orderedSet(video_ids)]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)

Some files were not shown because too many files have changed in this diff.