Compare commits

...

87 Commits

Author SHA1 Message Date
Philipp Hagemeister
07490f8017 release 2015.03.03 2015-03-03 00:05:05 +01:00
Philipp Hagemeister
a7440261c5 [utils] Strip leading dots
Fixes #2865, closes #5087
2015-03-02 19:07:19 +01:00
Philipp Hagemeister
76c73715fb [generic] Parse RSS enclosure URLs (Fixes #5091) 2015-03-02 18:21:31 +01:00
Philipp Hagemeister
c75f0b361a [downloader/external] Add support for custom options (Fixes #4885, closes #5098) 2015-03-02 18:21:31 +01:00
Sergey M․
295df4edb9 [soundcloud] Fix glitches (#5101) 2015-03-02 22:47:07 +06:00
Sergey M․
562ceab13d [soundcloud] Check direct links validity (Closes #5101) 2015-03-02 22:39:32 +06:00
Sergey M․
2f0f6578c3 [extractor/common] Assume non HTTP(S) URLs valid 2015-03-02 22:38:44 +06:00
Sergey M․
30cbd4e0d6 [lynda] Completely skip videos we don't have access to, extract base class and modernize (Closes #5093) 2015-03-02 22:12:10 +06:00
Sergey M.
549e58069c Merge pull request #5105 from Ftornik/Lynda-subtitle-hotfix-2
[lynda] Check for the empty subtitles
2015-03-02 21:15:26 +06:00
Sergey
7594be85ff [lynda] Check for the empty subtitle 2015-03-02 11:49:39 +02:00
Sergey M․
3630034609 [vk] Fix test (Closes #5100) 2015-03-02 03:30:18 +06:00
Sergey M․
4e01501bbf [vk] Fix extraction (Closes #4967, closes #4686) 2015-03-01 21:56:30 +06:00
Sergey M․
1aa5172f56 [vk] Catch temporarily unavailable video error message 2015-03-01 21:55:43 +06:00
Philipp Hagemeister
f7e2ee8fa6 Merge branch 'master' of github.com:rg3/youtube-dl 2015-03-01 12:05:13 +01:00
Philipp Hagemeister
66dc9a3701 [README] Document HTTP 429 (Closes #5092) 2015-03-01 12:04:39 +01:00
Jaime Marquínez Ferrándiz
31bd39256b --load-info: Use the fileinput module
It automatically handles the '-' filename as stdin
2015-03-01 11:54:48 +01:00
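
A minimal standalone sketch of the pattern this commit adopts (not youtube-dl's exact code): fileinput maps the filename '-' to stdin, so --load-info - needs no special-casing.

```python
import contextlib
import fileinput
import json

def load_info(info_filename):
    # fileinput transparently treats the filename '-' as stdin,
    # so piped JSON works the same as a regular file.
    with contextlib.closing(fileinput.FileInput(
            [info_filename], mode='r',
            openhook=fileinput.hook_encoded('utf-8'))) as f:
        # FileInput has no read() method, so join the lines for json.loads
        return json.loads('\n'.join(f))
```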
Jaime Marquínez Ferrándiz
003c69a84b Use shutil.get_terminal_size for getting the terminal width if it's available (python >= 3.3) 2015-02-28 21:44:57 +01:00
Philipp Hagemeister
0134901108 release 2015.02.28 2015-02-28 21:24:25 +01:00
Philipp Hagemeister
eee6293d57 [thechive] remove in favor of Kaltura (#5072) 2015-02-28 20:55:49 +01:00
Philipp Hagemeister
8237bec4f0 [escapist] Extract duration 2015-02-28 20:52:52 +01:00
Philipp Hagemeister
29cad7ad13 Merge remote-tracking branch 'origin/master' 2015-02-28 20:51:54 +01:00
Sergey M․
0d103de3b0 [twitch] Pass api_token along with every request (Closes #3986) 2015-02-28 22:59:55 +06:00
Sergey M․
a0090691d0 Merge branch 'HanYOLO-puls4' 2015-02-28 22:26:35 +06:00
Sergey M․
6c87c2eea8 [puls4] Improve and extract more metadata 2015-02-28 22:25:57 +06:00
Sergey M․
58c2ec6ab3 Merge branch 'puls4' of https://github.com/HanYOLO/youtube-dl 2015-02-28 21:39:10 +06:00
Sergey M․
df5ae3eb16 [oppetarkiv] Merge with svtplay 2015-02-28 21:25:04 +06:00
Sergey M․
efda2d7854 Merge branch 'thc202-oppetarkiv' 2015-02-28 21:12:23 +06:00
Sergey M․
e143f5dae9 [oppetarkiv] Extract f4m formats and age limit 2015-02-28 21:12:06 +06:00
Sergey M․
48218cdb97 Merge branch 'oppetarkiv' of https://github.com/thc202/youtube-dl into thc202-oppetarkiv 2015-02-28 20:41:56 +06:00
Jaime Marquínez Ferrándiz
e9fade72f3 Add postprocessor for converting subtitles (closes #4954) 2015-02-28 14:43:24 +01:00
Jaime Marquínez Ferrándiz
0f2c0d335b [YoutubeDL] Use the InfoExtractor._download_webpage method for getting the subtitles
It handles encodings better, for example for 'http://www.npo.nl/nos-journaal/14-02-2015/POW_00942207'
2015-02-28 14:03:27 +01:00
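
A simplified illustration of the underlying issue (this is not youtube-dl's implementation): a page may declare a non-UTF-8 charset, so a blind .decode('utf-8') can mangle or reject the subtitles, whereas a charset-aware fetch honours the Content-Type header.

```python
import urllib.request  # Python 3 stdlib; youtube-dl itself goes through compat wrappers

def fetch_text(url):
    with urllib.request.urlopen(url) as resp:
        # Honour the charset declared in the Content-Type header instead
        # of assuming UTF-8; fall back to UTF-8 when none is declared.
        charset = resp.headers.get_content_charset() or 'utf-8'
        return resp.read().decode(charset, 'replace')
```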
thc202
40b077bc7e [oppetarkiv] Add new extractor
Some, if not all, of the videos appear to be geo-blocked (Sweden).
Test might fail (403 Forbidden) if not run through a Swedish connection.
2015-02-27 22:27:30 +00:00
Sergey M․
a931092cb3 Merge branch 'puls4' of https://github.com/HanYOLO/youtube-dl into HanYOLO-puls4 2015-02-28 00:22:48 +06:00
Sergey M․
bd3749ed69 [kaltura] Extend _VALID_URL (Closes #5081) 2015-02-28 00:19:31 +06:00
Sergey M․
4ffbf77886 [odnoklassniki] Add extractor (Closes #5075) 2015-02-28 00:15:03 +06:00
Jaime Marquínez Ferrándiz
781a7ef60a [lynda] Use 'lstrip' for the subtitles
The newlines at the end are important, they separate each piece of text.
2015-02-27 16:18:18 +01:00
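
A tiny demonstration of why lstrip rather than strip (the subtitle fragment here is made up): the trailing newlines are the cue separator, so only leading whitespace may be removed.

```python
# Hypothetical subtitle fragment; the trailing blank line terminates the cue.
chunk = '\n1\n00:00:01,000 --> 00:00:02,500\nHello\n\n'

print(repr(chunk.strip()))   # separator lost: '1\n00:00:01,000 --> 00:00:02,500\nHello'
print(repr(chunk.lstrip()))  # separator kept: '1\n00:00:01,000 --> 00:00:02,500\nHello\n\n'
```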
Sergey M.
5b2949ee0b Merge pull request #5076 from Ftornik/Lynda-subtitles-hotfix
[lynda] Fixed subtitles broken file
2015-02-27 20:56:54 +06:00
Sergey M․
a0d646135a [lynda] Extend _VALID_URL 2015-02-27 20:56:06 +06:00
HanYOLO
7862ad88b7 puls4 Add new extractor 2015-02-27 15:41:58 +01:00
Jaime Marquínez Ferrándiz
f3bff94cf9 [rtve] Extract duration 2015-02-27 12:24:51 +01:00
Sergey
0eba1e1782 [lynda] Fixed subtitles broken file 2015-02-27 00:51:22 +02:00
Naglis Jonaitis
e3216b82bf [generic] Support dynamic Kaltura embeds (#5016) (#5073) 2015-02-27 00:34:19 +02:00
Naglis Jonaitis
da419e2332 [musicvault] Use the Kaltura extractor 2015-02-26 23:47:45 +02:00
Naglis Jonaitis
0d97ef43be [kaltura] Add new extractor 2015-02-26 23:45:54 +02:00
anovicecodemonkey
1a2313a6f2 [TheChiveIE] added support for TheChive.com (Closes #5016) 2015-02-27 02:36:45 +10:30
Sergey M․
250a9bdfe2 [mpora] Improve _VALID_URL 2015-02-26 21:16:35 +06:00
Sergey M․
6317a3e9da [mpora] Fix extraction 2015-02-26 21:10:49 +06:00
Naglis Jonaitis
7ab7c9e932 [gamestar] Fix title extraction 2015-02-26 16:22:05 +02:00
Naglis Jonaitis
e129c5bc0d [laola1tv] Allow live stream downloads 2015-02-26 14:35:48 +02:00
PishPosh.McGee
2e241242a3 Adding subtitles 2015-02-26 03:59:35 -06:00
Philipp Hagemeister
9724e5d336 release 2015.02.26.2 2015-02-26 09:45:11 +01:00
Philipp Hagemeister
63a562f95e [escapist] Detect IP blocking and use another UA (Fixes #5069) 2015-02-26 09:19:26 +01:00
Philipp Hagemeister
5c340b0387 release 2015.02.26.1 2015-02-26 01:47:16 +01:00
Philipp Hagemeister
1c6510f57a [Makefile] clean pyc files in clean target 2015-02-26 01:47:12 +01:00
Philipp Hagemeister
2a15a98a6a [rtmp] Encode filename before invoking subprocess
This fixes #5066.
Reproducible with
LC_ALL=C youtube-dl "http://www.prosieben.de/tv/germanys-next-topmodel/video/playlist/ganze-folge-episode-2-das-casting-in-muenchen"
2015-02-26 01:44:20 +01:00
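
A rough sketch of the idea behind this fix (youtube-dl's actual helper is encodeFilename in utils.py; the details here are simplified): on Python 2, a unicode argument passed to subprocess is implicitly encoded with the ASCII codec, which fails under LC_ALL=C for names like u'M\xfcnchen', so the filename must be encoded explicitly first.

```python
import sys

def encode_filename(s):
    # Simplified: on Python 2, subprocess arguments must be byte strings;
    # encode with the filesystem encoding instead of the implicit ASCII codec.
    if sys.version_info >= (3, 0):
        return s  # Python 3 passes str arguments to subprocess natively
    encoding = sys.getfilesystemencoding() or 'utf-8'
    return s.encode(encoding, 'ignore')
```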
Philipp Hagemeister
72a406e7aa [extractor/common] Pass in video_id (#5057) 2015-02-26 01:35:43 +01:00
Philipp Hagemeister
feccc3ff37 Merge remote-tracking branch 'aajanki/wdr_live' 2015-02-26 01:34:01 +01:00
Philipp Hagemeister
265bfa2c79 [letv] Simplify 2015-02-26 01:30:18 +01:00
Philipp Hagemeister
8faf9b9b41 Merge remote-tracking branch 'yan12125/IE_Letv' 2015-02-26 01:26:55 +01:00
Philipp Hagemeister
84be7c230c Cred @duncankl for airmozilla 2015-02-26 01:25:54 +01:00
Philipp Hagemeister
3e675fabe0 [airmozilla] Be more tolerant when nonessential items are missing (#5030) 2015-02-26 01:25:00 +01:00
Philipp Hagemeister
cd5b4b0bc2 Merge remote-tracking branch 'duncankl/airmozilla' 2015-02-26 01:15:08 +01:00
Philipp Hagemeister
7ef822021b Merge remote-tracking branch 'mmue/fix-rtlnow' 2015-02-26 01:13:03 +01:00
Philipp Hagemeister
9a48926a57 [escapist] Add support for advertisements 2015-02-26 00:59:53 +01:00
Philipp Hagemeister
13cd97f3df release 2015.02.26 2015-02-26 00:42:02 +01:00
Philipp Hagemeister
183139340b [utils] Bump our user agent 2015-02-26 00:40:12 +01:00
Philipp Hagemeister
1c69bca258 [escapist] Fix config URL matching 2015-02-26 00:24:54 +01:00
Jaime Marquínez Ferrándiz
c10ea454dc [telecinco] Recognize more urls (closes #5065) 2015-02-25 23:52:54 +01:00
Markus Müller
9504fc21b5 Fix the RTL extractor for new episodes by using a different hostname 2015-02-25 23:27:19 +01:00
Jaime Marquínez Ferrándiz
13d8fbef30 [generic] Don't set the 'title' if it's not defined in the entry (closes #5061)
Some of them may be an 'url' result, which in general don't have the 'title' field.
2015-02-25 17:56:51 +01:00
Antti Ajanki
b8988b63a6 [wdr] Download a live stream 2015-02-24 21:23:59 +02:00
Antti Ajanki
5eaaeb7c31 [f4m] Tolerate missed fragments on live streams 2015-02-24 21:22:59 +02:00
Antti Ajanki
c4f8c453ae [f4m] Refresh fragment list periodically on live streams 2015-02-24 21:22:59 +02:00
Antti Ajanki
6f4ba54079 [extractor/common] Extract HTTP (possibly f4m) URLs from a .smil file 2015-02-24 21:22:59 +02:00
Antti Ajanki
637570326b [extractor/common] Extract the first of a seq of videos in a .smil file 2015-02-24 21:22:59 +02:00
Sergey M․
37f885650c [eporner] Simplify and hardcode age limit 2015-02-25 01:08:54 +06:00
Sergey M.
c8c34ccb20 Merge pull request #5056 from logon84/master
Eporner Fix (Closes #5050)
2015-02-25 01:05:35 +06:00
logon84
e765ed3a9c [eporner] Fix redirect_code error 2015-02-24 19:41:46 +01:00
Yen Chi Hsuan
677063594e [Letv] Update testcases 2015-02-25 02:10:55 +08:00
logon84
59c7cbd482 Update eporner.py
Updated to work. Old version shows an error about being unable to extract "redirect_code"
2015-02-24 18:58:32 +01:00
Yen Chi Hsuan
570311610e [Letv] Add playlist support 2015-02-25 01:26:44 +08:00
Sergey M․
41b264e77c [nrktv] Workaround subtitles conversion issues on python 2.6 (Closes #5036) 2015-02-24 23:06:44 +06:00
Philipp Hagemeister
df4bd0d53f [options] Add --yes-playlist as inverse of --no-playlist (Fixes #5051) 2015-02-24 17:25:02 +01:00
Yen Chi Hsuan
7f09a662a0 [Letv] Add new extractor. Single video only 2015-02-24 23:58:21 +08:00
Philipp Hagemeister
4f3b21e1c7 release 2015.02.24.2 2015-02-24 16:34:42 +01:00
Philipp Hagemeister
54233c9080 [escapist] Support JavaScript player (Fixes #5034) 2015-02-24 16:33:07 +01:00
Duncan Keall
1b40dc92eb [airmozilla] Add new extractor 2015-02-23 16:10:08 +13:00
43 changed files with 1307 additions and 528 deletions

AUTHORS

@@ -112,3 +112,4 @@ Frans de Jonge
 Robin de Rooij
 Ryan Schmidt
 Leslie P. Polzer
+Duncan Keall

Makefile

@@ -2,6 +2,7 @@ all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bas
 clean:
 	rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish *.dump *.part *.info.json *.mp4 *.flv *.mp3 *.avi CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
+	find -name "*.pyc" -delete
 PREFIX ?= /usr/local
 BINDIR ?= $(PREFIX)/bin

README.md

@@ -47,209 +47,107 @@ which means you can modify it, redistribute it or use it however you like.
# OPTIONS
    -h, --help                       print this help text and exit
    --version                        print program version and exit
    -U, --update                     update this program to latest version. Make sure that you have sufficient permissions (run with sudo if needed)
    -i, --ignore-errors              continue on download errors, for example to skip unavailable videos in a playlist
    --abort-on-error                 Abort downloading of further videos (in the playlist or the command line) if an error occurs
    --dump-user-agent                display the current browser identification
    --list-extractors                List all supported extractors and the URLs they would handle
    --extractor-descriptions         Output descriptions of all supported extractors
    --default-search PREFIX          Use this prefix for unqualified URLs. For example "gvsearch2:" downloads two videos from google videos for youtube-dl "large apple". Use the value "auto" to let youtube-dl guess ("auto_warning" to emit a warning when guessing). "error" just throws an error. The default value "fixup_error" repairs broken URLs, but emits an error if this is not possible instead of searching.
    --ignore-config                  Do not read configuration files. When given in the global configuration file /etc/youtube-dl.conf: Do not read the user configuration in ~/.config/youtube-dl/config (%APPDATA%/youtube-dl/config.txt on Windows)
    --flat-playlist                  Do not extract the videos of a playlist, only list them.
    --no-color                       Do not emit color codes in output.

## Network Options:
    --proxy URL                      Use the specified HTTP/HTTPS proxy. Pass in an empty string (--proxy "") for direct connection
    --socket-timeout SECONDS         Time to wait before giving up, in seconds
    --source-address IP              Client-side IP address to bind to (experimental)
    -4, --force-ipv4                 Make all connections via IPv4 (experimental)
    -6, --force-ipv6                 Make all connections via IPv6 (experimental)

## Video Selection:
    --playlist-start NUMBER          playlist video to start at (default is 1)
    --playlist-end NUMBER            playlist video to end at (default is last)
    --playlist-items ITEM_SPEC       playlist video items to download. Specify indices of the videos in the playlist separated by commas like: "--playlist-items 1,2,5,8" if you want to download videos indexed 1, 2, 5, 8 in the playlist. You can specify range: "--playlist-items 1-3,7,10-13", it will download the videos at index 1, 2, 3, 7, 10, 11, 12 and 13.
    --match-title REGEX              download only matching titles (regex or caseless sub-string)
    --reject-title REGEX             skip download for matching titles (regex or caseless sub-string)
    --max-downloads NUMBER           Abort after downloading NUMBER files
    --min-filesize SIZE              Do not download any videos smaller than SIZE (e.g. 50k or 44.6m)
    --max-filesize SIZE              Do not download any videos larger than SIZE (e.g. 50k or 44.6m)
    --date DATE                      download only videos uploaded in this date
    --datebefore DATE                download only videos uploaded on or before this date (i.e. inclusive)
    --dateafter DATE                 download only videos uploaded on or after this date (i.e. inclusive)
    --min-views COUNT                Do not download any videos with less than COUNT views
    --max-views COUNT                Do not download any videos with more than COUNT views
    --match-filter FILTER            (Experimental) Generic video filter. Specify any key (see help for -o for a list of available keys) to match if the key is present, !key to check if the key is not present, key > NUMBER (like "comment_count > 12", also works with >=, <, <=, !=, =) to compare against a number, and & to require multiple matches. Values which are not known are excluded unless you put a question mark (?) after the operator. For example, to only match videos that have been liked more than 100 times and disliked less than 50 times (or the dislike functionality is not available at the given service), but which also have a description, use --match-filter "like_count > 100 & dislike_count <? 50 & description" .
    --no-playlist                    If the URL refers to a video and a playlist, download only the video.
    --yes-playlist                   If the URL refers to a video and a playlist, download the playlist.
    --age-limit YEARS                download only videos suitable for the given age
    --download-archive FILE          Download only videos not listed in the archive file. Record the IDs of all downloaded videos in it.
    --include-ads                    Download advertisements as well (experimental)

## Download Options:
    -r, --rate-limit LIMIT           maximum download rate in bytes per second (e.g. 50K or 4.2M)
    -R, --retries RETRIES            number of retries (default is 10), or "infinite".
    --buffer-size SIZE               size of download buffer (e.g. 1024 or 16K) (default is 1024)
    --no-resize-buffer               do not automatically adjust the buffer size. By default, the buffer size is automatically resized from an initial value of SIZE.
    --playlist-reverse               Download playlist videos in reverse order
    --xattr-set-filesize             (experimental) set file xattribute ytdl.filesize with expected filesize
    --hls-prefer-native              (experimental) Use the native HLS downloader instead of ffmpeg.
    --external-downloader COMMAND    Use the specified external downloader. Currently supports aria2c,curl,wget
    --external-downloader-args ARGS  Give these arguments to the external downloader.

## Filesystem Options:
    -a, --batch-file FILE            file containing URLs to download ('-' for stdin)
    --id                             use only video ID in file name
    -o, --output TEMPLATE            output filename template. Use %(title)s to get the title, %(uploader)s for the uploader name, %(uploader_id)s for the uploader nickname if different, %(autonumber)s to get an automatically incremented number, %(ext)s for the filename extension, %(format)s for the format description (like "22 - 1280x720" or "HD"), %(format_id)s for the unique id of the format (like Youtube's itags: "137"), %(upload_date)s for the upload date (YYYYMMDD), %(extractor)s for the provider (youtube, metacafe, etc), %(id)s for the video id, %(playlist_title)s, %(playlist_id)s, or %(playlist)s (=title if present, ID otherwise) for the playlist the video is in, %(playlist_index)s for the position in the playlist. %(height)s and %(width)s for the width and height of the video format. %(resolution)s for a textual description of the resolution of the video format. %% for a literal percent. Use - to output to stdout. Can also be used to download to a different directory, for example with -o '/my/downloads/%(uploader)s/%(title)s-%(id)s.%(ext)s' .
    --autonumber-size NUMBER         Specifies the number of digits in %(autonumber)s when it is present in output filename template or --auto-number option is given
    --restrict-filenames             Restrict filenames to only ASCII characters, and avoid "&" and spaces in filenames
    -A, --auto-number                [deprecated; use -o "%(autonumber)s-%(title)s.%(ext)s" ] number downloaded files starting from 00000
    -t, --title                      [deprecated] use title in file name (default)
    -l, --literal                    [deprecated] alias of --title
    -w, --no-overwrites              do not overwrite files
    -c, --continue                   force resume of partially downloaded files. By default, youtube-dl will resume downloads if possible.
    --no-continue                    do not resume partially downloaded files (restart from beginning)
    --no-part                        do not use .part files - write directly into output file
    --no-mtime                       do not use the Last-modified header to set the file modification time
    --write-description              write video description to a .description file
    --write-info-json                write video metadata to a .info.json file
    --write-annotations              write video annotations to a .annotation file
    --load-info FILE                 json file containing the video information (created with the "--write-json" option)
    --cookies FILE                   file to read cookies from and dump cookie jar in
    --cache-dir DIR                  Location in the filesystem where youtube-dl can store some downloaded information permanently. By default $XDG_CACHE_HOME/youtube-dl or ~/.cache/youtube-dl . At the moment, only YouTube player files (for videos with obfuscated signatures) are cached, but that may change.
    --no-cache-dir                   Disable filesystem caching
    --rm-cache-dir                   Delete all filesystem cache files

## Thumbnail images:
    --write-thumbnail                write thumbnail image to disk
    --write-all-thumbnails           write all thumbnail image formats to disk
    --list-thumbnails                Simulate and list all available thumbnail formats

## Verbosity / Simulation Options:
    -q, --quiet                      activates quiet mode
    --no-warnings                    Ignore warnings
    -s, --simulate                   do not download the video and do not write anything to disk
    --skip-download                  do not download the video
    -g, --get-url                    simulate, quiet but print URL
    -e, --get-title                  simulate, quiet but print title
@@ -259,153 +157,84 @@ which means you can modify it, redistribute it or use it however you like.
    --get-duration                   simulate, quiet but print video length
    --get-filename                   simulate, quiet but print output filename
    --get-format                     simulate, quiet but print output format
    -j, --dump-json                  simulate, quiet but print JSON information. See --output for a description of available keys.
    -J, --dump-single-json           simulate, quiet but print JSON information for each command-line argument. If the URL refers to a playlist, dump the whole playlist information in a single line.
    --print-json                     Be quiet and print the video information as JSON (video is still being downloaded).
    --newline                        output progress bar as new lines
    --no-progress                    do not print progress bar
    --console-title                  display progress in console titlebar
    -v, --verbose                    print various debugging information
    --dump-intermediate-pages        print downloaded pages to debug problems (very verbose)
    --write-pages                    Write downloaded intermediary pages to files in the current directory to debug problems
    --print-traffic                  Display sent and read HTTP traffic
    -C, --call-home                  Contact the youtube-dl server for debugging.
    --no-call-home                   Do NOT contact the youtube-dl server for debugging.

## Workarounds:
    --encoding ENCODING              Force the specified encoding (experimental)
    --no-check-certificate           Suppress HTTPS certificate validation.
    --prefer-insecure                Use an unencrypted connection to retrieve information about the video. (Currently supported only for YouTube)
    --user-agent UA                  specify a custom user agent
    --referer URL                    specify a custom referer, use if the video access is restricted to one domain
    --add-header FIELD:VALUE         specify a custom HTTP header and its value, separated by a colon ':'. You can use this option multiple times
    --bidi-workaround                Work around terminals that lack bidirectional text support. Requires bidiv or fribidi executable in PATH
    --sleep-interval SECONDS         Number of seconds to sleep before each download.

## Video Format Options:
    -f, --format FORMAT              video format code, specify the order of preference using slashes, as in -f 22/17/18 . Instead of format codes, you can select by extension for the extensions aac, m4a, mp3, mp4, ogg, wav, webm. You can also use the special names "best", "bestvideo", "bestaudio", "worst". You can filter the video results by putting a condition in brackets, as in -f "best[height=720]" (or -f "[filesize>10M]"). This works for filesize, height, width, tbr, abr, vbr, asr, and fps and the comparisons <, <=, >, >=, =, != and for ext, acodec, vcodec, container, and protocol and the comparisons =, != . Formats for which the value is not known are excluded unless you put a question mark (?) after the operator. You can combine format filters, so -f "[height <=? 720][tbr>500]" selects up to 720p videos (or videos where the height is not known) with a bitrate of at least 500 KBit/s. By default, youtube-dl will pick the best quality. Use commas to download multiple audio formats, such as -f 136/137/mp4/bestvideo,140/m4a/bestaudio. You can merge the video and audio of two formats into a single file using -f <video-format>+<audio-format> (requires ffmpeg or avconv), for example -f bestvideo+bestaudio.
    --all-formats                    download all available video formats
    --prefer-free-formats            prefer free video formats unless a specific one is requested
    --max-quality FORMAT             highest quality format to download
    -F, --list-formats               list all available formats
    --youtube-skip-dash-manifest     Do not download the DASH manifest on YouTube videos
    --merge-output-format FORMAT     If a merge is required (e.g. bestvideo+bestaudio), output to given container format. One of mkv, mp4, ogg, webm, flv. Ignored if no merge is required

## Subtitle Options:
    --write-sub                      write subtitle file
    --write-auto-sub                 write automatic subtitle file (youtube only)
    --all-subs                       downloads all the available subtitles of the video
    --list-subs                      lists all available subtitles for the video
    --sub-format FORMAT              subtitle format, accepts formats preference, for example: "ass/srt/best"
    --sub-lang LANGS                 languages of the subtitles to download (optional) separated by commas, use IETF language tags like 'en,pt'

## Authentication Options:
    -u, --username USERNAME          login with this account ID
    -p, --password PASSWORD          account password. If this option is left out, youtube-dl will ask interactively.
    -2, --twofactor TWOFACTOR        two-factor auth code
    -n, --netrc                      use .netrc authentication data
    --video-password PASSWORD        video password (vimeo, smotri)

## Post-processing Options:
    -x, --extract-audio              convert video files to audio-only files (requires ffmpeg or avconv and ffprobe or avprobe)
    --audio-format FORMAT            "best", "aac", "vorbis", "mp3", "m4a", "opus", or "wav"; "best" by default
    --audio-quality QUALITY          ffmpeg/avconv audio quality specification, insert a value between 0 (better) and 9 (worse) for VBR or a specific bitrate like 128K (default 5)
    --recode-video FORMAT            Encode the video to another format if necessary (currently supported: mp4|flv|ogg|webm|mkv)
    -k, --keep-video                 keeps the video file on disk after the post-processing; the video is erased by default
    --no-post-overwrites             do not overwrite post-processed files; the post-processed files are overwritten by default
    --embed-subs                     embed subtitles in the video (only for mp4 videos)
    --embed-thumbnail                embed thumbnail in the audio as cover art
    --add-metadata                   write metadata to the video file
    --xattrs                         write metadata to the video file's xattrs (using dublin core and xdg standards)
    --fixup POLICY                   Automatically correct known faults of the file. One of never (do nothing), warn (only emit a warning), detect_or_warn (the default; fix file if we can, warn otherwise)
    --prefer-avconv                  Prefer avconv over ffmpeg for running the postprocessors (default)
    --prefer-ffmpeg                  Prefer ffmpeg over avconv for running the postprocessors
    --ffmpeg-location PATH           Location of the ffmpeg/avconv binary; either the path to the binary or its containing directory.
    --exec CMD                       Execute a command on the file after downloading, similar to find's -exec syntax. Example: --exec 'adb push {} /sdcard/Music/ && rm {}'
    --convert-subtitles FORMAT       Convert the subtitles to other format (currently supported: srt|ass|vtt)

# CONFIGURATION
@@ -525,6 +354,10 @@ YouTube requires an additional signature since September 2012 which is not suppo

In February 2015, the new YouTube player contained a character sequence in a string that was misinterpreted by old versions of youtube-dl. See [above](#how-do-i-update-youtube-dl) for how to update youtube-dl.

### HTTP Error 429: Too Many Requests or 402: Payment Required

These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
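The same remedy is available when embedding youtube-dl as a library; a sketch using the documented `proxy` and `source_address` parameters (the addresses and URL are placeholders):

```python
from youtube_dl import YoutubeDL

ydl_opts = {
    'proxy': 'http://198.51.100.7:3128',  # whitelisted proxy (placeholder)
    'source_address': '203.0.113.12',     # alternate local IP (placeholder)
}
with YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=EXAMPLE'])
```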
### SyntaxError: Non-ASCII character ###

The error

docs/supportedsites.md

@@ -17,6 +17,7 @@
 - **AdultSwim**
 - **Aftenposten**
 - **Aftonbladet**
+- **AirMozilla**
 - **AlJazeera**
 - **Allocine**
 - **AlphaPorno**
@@ -209,6 +210,7 @@
 - **Jove**
 - **jpopsuki.tv**
 - **Jukebox**
+- **Kaltura**
 - **Kankan**
 - **Karaoketv**
 - **keek**
@@ -220,6 +222,9 @@
 - **Ku6**
 - **la7.tv**
 - **Laola1Tv**
+- **Letv**
+- **LetvPlaylist**
+- **LetvTv**
 - **lifenews**: LIFE | NEWS
 - **LiveLeak**
 - **livestream**
@@ -304,6 +309,7 @@
 - **Nuvid**
 - **NYTimes**
 - **ocw.mit.edu**
+- **Odnoklassniki**
 - **OktoberfestTV**
 - **on.aol.com**
 - **Ooyala**
@@ -330,6 +336,7 @@
 - **PornoXO**
 - **PromptFile**
 - **prosiebensat1**: ProSiebenSat.1 Digital
+- **Puls4**
 - **Pyvideo**
 - **QuickVid**
 - **R7**
@@ -408,7 +415,7 @@
 - **StreamCZ**
 - **StreetVoice**
 - **SunPorno**
-- **SVTPlay**
+- **SVTPlay**: SVT Play and Öppet arkiv
 - **SWRMediathek**
 - **Syfy**
 - **SztvHu**

test/test_utils.py

@@ -85,8 +85,11 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(
             sanitize_filename('New World record at 0:12:34'),
             'New World record at 0_12_34')

         self.assertEqual(sanitize_filename('--gasdgf'), '_-gasdgf')
         self.assertEqual(sanitize_filename('--gasdgf', is_id=True), '--gasdgf')
+        self.assertEqual(sanitize_filename('.gasdgf'), 'gasdgf')
+        self.assertEqual(sanitize_filename('.gasdgf', is_id=True), '.gasdgf')

         forbidden = '"\0\\/'
         for fc in forbidden:
@@ -246,6 +249,7 @@ class TestUtil(unittest.TestCase):
         self.assertEqual(parse_duration('2.5 hours'), 9000)
         self.assertEqual(parse_duration('02:03:04'), 7384)
         self.assertEqual(parse_duration('01:02:03:04'), 93784)
+        self.assertEqual(parse_duration('1 hour 3 minutes'), 3780)

     def test_fix_xml_ampersands(self):
         self.assertEqual(

youtube_dl/YoutubeDL.py

@@ -4,8 +4,10 @@
 from __future__ import absolute_import, unicode_literals

 import collections
+import contextlib
 import datetime
 import errno
+import fileinput
 import io
 import itertools
 import json
@@ -28,6 +30,7 @@ from .compat import (
     compat_basestring,
     compat_cookiejar,
     compat_expanduser,
+    compat_get_terminal_size,
     compat_http_client,
     compat_kwargs,
     compat_str,
@@ -46,7 +49,6 @@ from .utils import (
     ExtractorError,
     format_bytes,
     formatSeconds,
-    get_term_width,
     locked_file,
     make_HTTPS_handler,
     MaxDownloadsReached,
@@ -247,10 +249,10 @@ class YoutubeDL(object):
     hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv.

     The following parameters are not used by YoutubeDL itself, they are used by
-    the FileDownloader:
+    the downloader (see youtube_dl/downloader/common.py):
     nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
     noresizebuffer, retries, continuedl, noprogress, consoletitle,
-    xattr_set_filesize.
+    xattr_set_filesize, external_downloader_args.

     The following options are used by the post processors:
     prefer_ffmpeg: If True, use ffmpeg instead of avconv if both are available,
@@ -284,7 +286,7 @@ class YoutubeDL(object):
         try:
             import pty
             master, slave = pty.openpty()
-            width = get_term_width()
+            width = compat_get_terminal_size().columns
             if width is None:
                 width_args = []
             else:
@@ -1300,17 +1302,18 @@ class YoutubeDL(object):
             # subtitles download errors are already managed as troubles in relevant IE
             # that way it will silently go on when used with unsupporting IE
             subtitles = info_dict['requested_subtitles']
+            ie = self.get_info_extractor(info_dict['extractor_key'])
             for sub_lang, sub_info in subtitles.items():
                 sub_format = sub_info['ext']
                 if sub_info.get('data') is not None:
                     sub_data = sub_info['data']
                 else:
                     try:
-                        uf = self.urlopen(sub_info['url'])
-                        sub_data = uf.read().decode('utf-8')
-                    except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
+                        sub_data = ie._download_webpage(
+                            sub_info['url'], info_dict['id'], note=False)
+                    except ExtractorError as err:
                         self.report_warning('Unable to download subtitle for "%s": %s' %
-                                            (sub_lang, compat_str(err)))
+                                            (sub_lang, compat_str(err.cause)))
                         continue
                 try:
                     sub_filename = subtitles_filename(filename, sub_lang, sub_format)
@@ -1451,8 +1454,11 @@ class YoutubeDL(object):
         return self._download_retcode

     def download_with_info_file(self, info_filename):
-        with io.open(info_filename, 'r', encoding='utf-8') as f:
-            info = json.load(f)
+        with contextlib.closing(fileinput.FileInput(
+                [info_filename], mode='r',
+                openhook=fileinput.hook_encoded('utf-8'))) as f:
+            # FileInput doesn't have a read method, we can't call json.load
+            info = json.loads('\n'.join(f))
         try:
             self.process_ie_result(info, download=True)
         except DownloadError:

youtube_dl/__init__.py

@@ -9,6 +9,7 @@ import codecs
 import io
 import os
 import random
+import shlex
 import sys
@@ -170,6 +171,9 @@ def _real_main(argv=None):
     if opts.recodevideo is not None:
         if opts.recodevideo not in ['mp4', 'flv', 'webm', 'ogg', 'mkv']:
             parser.error('invalid video recode format specified')
+    if opts.convertsubtitles is not None:
+        if opts.convertsubtitles not in ['srt', 'vtt', 'ass']:
+            parser.error('invalid subtitle format specified')

     if opts.date is not None:
         date = DateRange.day(opts.date)
@@ -223,6 +227,11 @@ def _real_main(argv=None):
             'key': 'FFmpegVideoConvertor',
             'preferedformat': opts.recodevideo,
         })
+    if opts.convertsubtitles:
+        postprocessors.append({
+            'key': 'FFmpegSubtitlesConvertor',
+            'format': opts.convertsubtitles,
+        })
     if opts.embedsubtitles:
         postprocessors.append({
             'key': 'FFmpegEmbedSubtitle',
@@ -247,6 +256,9 @@ def _real_main(argv=None):
             xattr  # Confuse flake8
         except ImportError:
             parser.error('setting filesize xattr requested but python-xattr is not available')
+    external_downloader_args = None
+    if opts.external_downloader_args:
+        external_downloader_args = shlex.split(opts.external_downloader_args)

     match_filter = (
         None if opts.match_filter is None
         else match_filter_func(opts.match_filter))
@@ -351,6 +363,7 @@ def _real_main(argv=None):
         'no_color': opts.no_color,
         'ffmpeg_location': opts.ffmpeg_location,
         'hls_prefer_native': opts.hls_prefer_native,
+        'external_downloader_args': external_downloader_args,
     }

     with YoutubeDL(ydl_opts) as ydl:

youtube_dl/compat.py

@@ -1,9 +1,11 @@
 from __future__ import unicode_literals

+import collections
 import getpass
 import optparse
 import os
 import re
+import shutil
 import socket
 import subprocess
 import sys
@@ -364,6 +366,33 @@ def workaround_optparse_bug9161():
         return real_add_option(self, *bargs, **bkwargs)
     optparse.OptionGroup.add_option = _compat_add_option

+if hasattr(shutil, 'get_terminal_size'):  # Python >= 3.3
+    compat_get_terminal_size = shutil.get_terminal_size
+else:
+    _terminal_size = collections.namedtuple('terminal_size', ['columns', 'lines'])
+
+    def compat_get_terminal_size():
+        columns = compat_getenv('COLUMNS', None)
+        if columns:
+            columns = int(columns)
+        else:
+            columns = None
+        lines = compat_getenv('LINES', None)
+        if lines:
+            lines = int(lines)
+        else:
+            lines = None
+
+        try:
+            sp = subprocess.Popen(
+                ['stty', 'size'],
+                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+            out, err = sp.communicate()
+            lines, columns = map(int, out.split())
+        except:
+            pass
+        return _terminal_size(columns, lines)
+
 __all__ = [
     'compat_HTTPError',
@@ -371,6 +400,7 @@ __all__ = [
     'compat_chr',
     'compat_cookiejar',
     'compat_expanduser',
+    'compat_get_terminal_size',
     'compat_getenv',
     'compat_getpass',
     'compat_html_entities',
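
A short usage sketch of the new helper (the caller shown is hypothetical): both branches return an object with .columns and .lines, though the fallback above may leave a field None when neither the environment nor stty can supply it.

```python
from youtube_dl.compat import compat_get_terminal_size

size = compat_get_terminal_size()
columns = size.columns or 80  # guard against None from the fallback path
print('wrapping output to %d columns' % columns)
```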

youtube_dl/downloader/common.py

@@ -42,6 +42,8 @@ class FileDownloader(object):
     max_filesize:       Skip files larger than this size
     xattr_set_filesize: Set ytdl.filesize user xattribute with expected size.
                         (experimental)
+    external_downloader_args:  A list of additional command-line arguments for the
+        external downloader.

     Subclasses of this one must re-define the real_download method.
     """

youtube_dl/downloader/external.py

@@ -51,6 +51,13 @@ class ExternalFD(FileDownloader):
             return []
         return [command_option, source_address]

+    def _configuration_args(self, default=[]):
+        ex_args = self.params.get('external_downloader_args')
+        if ex_args is None:
+            return default
+        assert isinstance(ex_args, list)
+        return ex_args
+
     def _call_downloader(self, tmpfilename, info_dict):
         """ Either overwrite this or implement _make_cmd """
         cmd = self._make_cmd(tmpfilename, info_dict)
@@ -79,6 +86,7 @@ class CurlFD(ExternalFD):
         for key, val in info_dict['http_headers'].items():
             cmd += ['--header', '%s: %s' % (key, val)]
         cmd += self._source_address('--interface')
+        cmd += self._configuration_args()
         cmd += ['--', info_dict['url']]
         return cmd
@@ -89,15 +97,16 @@ class WgetFD(ExternalFD):
         for key, val in info_dict['http_headers'].items():
             cmd += ['--header', '%s: %s' % (key, val)]
         cmd += self._source_address('--bind-address')
+        cmd += self._configuration_args()
         cmd += ['--', info_dict['url']]
         return cmd


 class Aria2cFD(ExternalFD):
     def _make_cmd(self, tmpfilename, info_dict):
-        cmd = [
-            self.exe, '-c',
-            '--min-split-size', '1M', '--max-connection-per-server', '4']
+        cmd = [self.exe, '-c']
+        cmd += self._configuration_args([
+            '--min-split-size', '1M', '--max-connection-per-server', '4'])
         dn = os.path.dirname(tmpfilename)
         if dn:
             cmd += ['--dir', dn]
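
How the pieces above fit together, as a standalone sketch (the aria2c invocation is illustrative): __init__.py splits the raw --external-downloader-args string with shlex, and each downloader splices the resulting list into its command line via _configuration_args().

```python
import shlex

# Option layer: the raw CLI string becomes a list (see youtube_dl/__init__.py above)
raw = '--max-connection-per-server 4 --timeout 10'
params = {'external_downloader_args': shlex.split(raw)}

# Downloader layer: the list is spliced into the external command
ex_args = params.get('external_downloader_args')
cmd = ['aria2c', '-c'] + (ex_args if ex_args is not None else [])
print(cmd)  # ['aria2c', '-c', '--max-connection-per-server', '4', '--timeout', '10']
```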

youtube_dl/downloader/f4m.py

@@ -11,6 +11,7 @@ from .common import FileDownloader
 from .http import HttpFD
 from ..compat import (
     compat_urlparse,
+    compat_urllib_error,
 )
 from ..utils import (
     struct_pack,

@@ -121,7 +122,8 @@ class FlvReader(io.BytesIO):
         self.read_unsigned_int()  # BootstrapinfoVersion
         # Profile,Live,Update,Reserved
-        self.read(1)
+        flags = self.read_unsigned_char()
+        live = flags & 0x20 != 0
         # time scale
         self.read_unsigned_int()
         # CurrentMediaTime

@@ -160,6 +162,7 @@ class FlvReader(io.BytesIO):
         return {
             'segments': segments,
             'fragments': fragments,
+            'live': live,
         }

     def read_bootstrap_info(self):

@@ -182,6 +185,10 @@ def build_fragments_list(boot_info):
     for segment, fragments_count in segment_run_table['segment_run']:
         for _ in range(fragments_count):
             res.append((segment, next(fragments_counter)))
+
+    if boot_info['live']:
+        res = res[-2:]
+
     return res

@@ -246,6 +253,38 @@ class F4mFD(FileDownloader):
             self.report_error('Unsupported DRM')
         return media

+    def _get_bootstrap_from_url(self, bootstrap_url):
+        bootstrap = self.ydl.urlopen(bootstrap_url).read()
+        return read_bootstrap_info(bootstrap)
+
+    def _update_live_fragments(self, bootstrap_url, latest_fragment):
+        fragments_list = []
+        retries = 30
+        while (not fragments_list) and (retries > 0):
+            boot_info = self._get_bootstrap_from_url(bootstrap_url)
+            fragments_list = build_fragments_list(boot_info)
+            fragments_list = [f for f in fragments_list if f[1] > latest_fragment]
+            if not fragments_list:
+                # Retry after a while
+                time.sleep(5.0)
+                retries -= 1
+
+        if not fragments_list:
+            self.report_error('Failed to update fragments')
+
+        return fragments_list
+
+    def _parse_bootstrap_node(self, node, base_url):
+        if node.text is None:
+            bootstrap_url = compat_urlparse.urljoin(
+                base_url, node.attrib['url'])
+            boot_info = self._get_bootstrap_from_url(bootstrap_url)
+        else:
+            bootstrap_url = None
+            bootstrap = base64.b64decode(node.text)
+            boot_info = read_bootstrap_info(bootstrap)
+        return (boot_info, bootstrap_url)
+
     def real_download(self, filename, info_dict):
         man_url = info_dict['url']
         requested_bitrate = info_dict.get('tbr')

@@ -265,18 +304,13 @@ class F4mFD(FileDownloader):
         base_url = compat_urlparse.urljoin(man_url, media.attrib['url'])
         bootstrap_node = doc.find(_add_ns('bootstrapInfo'))
-        if bootstrap_node.text is None:
-            bootstrap_url = compat_urlparse.urljoin(
-                base_url, bootstrap_node.attrib['url'])
-            bootstrap = self.ydl.urlopen(bootstrap_url).read()
-        else:
-            bootstrap = base64.b64decode(bootstrap_node.text)
+        boot_info, bootstrap_url = self._parse_bootstrap_node(bootstrap_node, base_url)
+        live = boot_info['live']
         metadata_node = media.find(_add_ns('metadata'))
         if metadata_node is not None:
             metadata = base64.b64decode(metadata_node.text)
         else:
             metadata = None
-        boot_info = read_bootstrap_info(bootstrap)

         fragments_list = build_fragments_list(boot_info)
         if self.params.get('test', False):

@@ -301,7 +335,8 @@ class F4mFD(FileDownloader):
         (dest_stream, tmpfilename) = sanitize_open(tmpfilename, 'wb')

         write_flv_header(dest_stream)
-        write_metadata_tag(dest_stream, metadata)
+        if not live:
+            write_metadata_tag(dest_stream, metadata)

         # This dict stores the download progress, it's updated by the progress
         # hook

@@ -348,24 +383,45 @@ class F4mFD(FileDownloader):
         http_dl.add_progress_hook(frag_progress_hook)

         frags_filenames = []
-        for (seg_i, frag_i) in fragments_list:
+        while fragments_list:
+            seg_i, frag_i = fragments_list.pop(0)
             name = 'Seg%d-Frag%d' % (seg_i, frag_i)
             url = base_url + name
             if akamai_pv:
                 url += '?' + akamai_pv.strip(';')
             frag_filename = '%s-%s' % (tmpfilename, name)
-            success = http_dl.download(frag_filename, {'url': url})
-            if not success:
-                return False
-            with open(frag_filename, 'rb') as down:
-                down_data = down.read()
-                reader = FlvReader(down_data)
-                while True:
-                    _, box_type, box_data = reader.read_box_info()
-                    if box_type == b'mdat':
-                        dest_stream.write(box_data)
-                        break
-            frags_filenames.append(frag_filename)
+            try:
+                success = http_dl.download(frag_filename, {'url': url})
+                if not success:
+                    return False
+                with open(frag_filename, 'rb') as down:
+                    down_data = down.read()
+                    reader = FlvReader(down_data)
+                    while True:
+                        _, box_type, box_data = reader.read_box_info()
+                        if box_type == b'mdat':
+                            dest_stream.write(box_data)
+                            break
+                if live:
+                    os.remove(frag_filename)
+                else:
+                    frags_filenames.append(frag_filename)
+            except (compat_urllib_error.HTTPError, ) as err:
+                if live and (err.code == 404 or err.code == 410):
+                    # We didn't keep up with the live window. Continue
+                    # with the next available fragment.
+                    msg = 'Fragment %d unavailable' % frag_i
+                    self.report_warning(msg)
+                    fragments_list = []
+                else:
+                    raise
+
+            if not fragments_list and live and bootstrap_url:
+                fragments_list = self._update_live_fragments(bootstrap_url, frag_i)
+                total_frags += len(fragments_list)
+                if fragments_list and (fragments_list[0][1] > frag_i + 1):
+                    msg = 'Missed %d fragments' % (fragments_list[0][1] - (frag_i + 1))
+                    self.report_warning(msg)

         dest_stream.close()
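
The live path above boils down to: treat the fragment list as a queue, and when it runs dry on a live stream, poll the bootstrap again and keep only fragments newer than the last one written. A toy, self-contained version of that loop, with a hypothetical stub standing in for the bootstrap poll:

    import time

    # Pretend each bootstrap poll exposes two new fragment numbers (stub data).
    _available = iter(range(1, 9))

    def fetch_fragment_numbers():
        return [n for _, n in zip(range(2), _available)]

    def download_live():
        fragments = fetch_fragment_numbers()
        last = 0
        while fragments:
            frag = fragments.pop(0)
            print('downloading fragment %d' % frag)
            last = frag
            if not fragments:
                time.sleep(0.1)  # the real downloader waits 5s between polls
                fragments = [f for f in fetch_fragment_numbers() if f > last]

    download_live()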

View File

@@ -119,7 +119,9 @@ class RtmpFD(FileDownloader):
         # Download using rtmpdump. rtmpdump returns exit code 2 when
         # the connection was interrumpted and resuming appears to be
         # possible. This is part of rtmpdump's normal usage, AFAIK.
-        basic_args = ['rtmpdump', '--verbose', '-r', url, '-o', tmpfilename]
+        basic_args = [
+            'rtmpdump', '--verbose', '-r', url,
+            '-o', encodeFilename(tmpfilename, True)]
         if player_url is not None:
             basic_args += ['--swfVfy', player_url]
         if page_url is not None:

View File

@@ -8,6 +8,7 @@ from .adobetv import AdobeTVIE
 from .adultswim import AdultSwimIE
 from .aftenposten import AftenpostenIE
 from .aftonbladet import AftonbladetIE
+from .airmozilla import AirMozillaIE
 from .aljazeera import AlJazeeraIE
 from .alphaporno import AlphaPornoIE
 from .anitube import AnitubeIE

@@ -226,6 +227,7 @@ from .jeuxvideo import JeuxVideoIE
 from .jove import JoveIE
 from .jukebox import JukeboxIE
 from .jpopsukitv import JpopsukiIE
+from .kaltura import KalturaIE
 from .kankan import KankanIE
 from .karaoketv import KaraoketvIE
 from .keezmovies import KeezMoviesIE

@@ -237,6 +239,11 @@ from .krasview import KrasViewIE
 from .ku6 import Ku6IE
 from .la7 import LA7IE
 from .laola1tv import Laola1TvIE
+from .letv import (
+    LetvIE,
+    LetvTvIE,
+    LetvPlaylistIE
+)
 from .lifenews import LifeNewsIE
 from .liveleak import LiveLeakIE
 from .livestream import (

@@ -339,6 +346,7 @@ from .ntvde import NTVDeIE
 from .ntvru import NTVRuIE
 from .nytimes import NYTimesIE
 from .nuvid import NuvidIE
+from .odnoklassniki import OdnoklassnikiIE
 from .oktoberfesttv import OktoberfestTVIE
 from .ooyala import OoyalaIE
 from .openfilm import OpenFilmIE

@@ -366,6 +374,7 @@ from .pornotube import PornotubeIE
 from .pornoxo import PornoXOIE
 from .promptfile import PromptFileIE
 from .prosiebensat1 import ProSiebenSat1IE
+from .puls4 import Puls4IE
 from .pyvideo import PyvideoIE
 from .quickvid import QuickVidIE
 from .r7 import R7IE

View File

@@ -0,0 +1,74 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..utils import (
    int_or_none,
    parse_duration,
    parse_iso8601,
)


class AirMozillaIE(InfoExtractor):
    _VALID_URL = r'https?://air\.mozilla\.org/(?P<id>[0-9a-z-]+)/?'
    _TEST = {
        'url': 'https://air.mozilla.org/privacy-lab-a-meetup-for-privacy-minded-people-in-san-francisco/',
        'md5': '2e3e7486ba5d180e829d453875b9b8bf',
        'info_dict': {
            'id': '6x4q2w',
            'ext': 'mp4',
            'title': 'Privacy Lab - a meetup for privacy minded people in San Francisco',
            'thumbnail': 're:https://\w+\.cloudfront\.net/6x4q2w/poster\.jpg\?t=\d+',
            'description': 'Brings together privacy professionals and others interested in privacy at for-profits, non-profits, and NGOs in an effort to contribute to the state of the ecosystem...',
            'timestamp': 1422487800,
            'upload_date': '20150128',
            'location': 'SFO Commons',
            'duration': 3780,
            'view_count': int,
            'categories': ['Main'],
        }
    }

    def _real_extract(self, url):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)
        video_id = self._html_search_regex(r'//vid.ly/(.*?)/embed', webpage, 'id')

        embed_script = self._download_webpage('https://vid.ly/{0}/embed'.format(video_id), video_id)
        jwconfig = self._search_regex(r'\svar jwconfig = (\{.*?\});\s', embed_script, 'metadata')
        metadata = self._parse_json(jwconfig, video_id)

        formats = [{
            'url': source['file'],
            'ext': source['type'],
            'format_id': self._search_regex(r'&format=(.*)$', source['file'], 'video format'),
            'format': source['label'],
            'height': int(source['label'].rstrip('p')),
        } for source in metadata['playlist'][0]['sources']]
        self._sort_formats(formats)

        view_count = int_or_none(self._html_search_regex(
            r'Views since archived: ([0-9]+)',
            webpage, 'view count', fatal=False))
        timestamp = parse_iso8601(self._html_search_regex(
            r'<time datetime="(.*?)"', webpage, 'timestamp', fatal=False))
        duration = parse_duration(self._search_regex(
            r'Duration:\s*(\d+\s*hours?\s*\d+\s*minutes?)',
            webpage, 'duration', fatal=False))

        return {
            'id': video_id,
            'title': self._og_search_title(webpage),
            'formats': formats,
            'url': self._og_search_url(webpage),
            'display_id': display_id,
            'thumbnail': metadata['playlist'][0].get('image'),
            'description': self._og_search_description(webpage),
            'timestamp': timestamp,
            'location': self._html_search_regex(r'Location: (.*)', webpage, 'location', default=None),
            'duration': duration,
            'view_count': view_count,
            'categories': re.findall(r'<a href=".*?" class="channel">(.*?)</a>', webpage),
        }
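
The core trick in this extractor is pulling the `var jwconfig = {...};` JavaScript object out of the embed script and treating it as JSON. A standalone illustration using the same regex, against a made-up embed script:

    import json
    import re

    embed_script = '<script> var jwconfig = {"playlist": [{"sources": [{"file": "http://example.com/v.mp4?&format=hd", "type": "mp4", "label": "720p"}]}]}; </script>'

    jwconfig = re.search(r'\svar jwconfig = (\{.*?\});\s', embed_script).group(1)
    metadata = json.loads(jwconfig)
    source = metadata['playlist'][0]['sources'][0]
    print(source['label'], int(source['label'].rstrip('p')))  # 720p 720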

View File

@@ -250,6 +250,8 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
             })
             self._sort_formats(formats)

+            subtitles = self._extract_subtitles(cdoc, guid)
+
             virtual_id = show_name + ' ' + epTitle + ' part ' + compat_str(part_num + 1)
             entries.append({
                 'id': guid,

@@ -260,6 +262,7 @@ class ComedyCentralShowsIE(MTVServicesInfoExtractor):
                 'duration': duration,
                 'thumbnail': thumbnail,
                 'description': description,
+                'subtitles': subtitles,
             })

         return {

View File

@@ -767,6 +767,10 @@ class InfoExtractor(object):
             formats)

     def _is_valid_url(self, url, video_id, item='video'):
+        url = self._proto_relative_url(url, scheme='http:')
+        # For now assume non HTTP(S) URLs always valid
+        if not (url.startswith('http://') or url.startswith('https://')):
+            return True
         try:
             self._request_webpage(url, video_id, 'Checking %s URL' % item)
             return True

@@ -921,39 +925,57 @@ class InfoExtractor(object):
         formats = []
         rtmp_count = 0
-        for video in smil.findall('./body/switch/video'):
-            src = video.get('src')
-            if not src:
-                continue
-            bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
-            width = int_or_none(video.get('width'))
-            height = int_or_none(video.get('height'))
-            proto = video.get('proto')
-            if not proto:
-                if base:
-                    if base.startswith('rtmp'):
-                        proto = 'rtmp'
-                    elif base.startswith('http'):
-                        proto = 'http'
-            ext = video.get('ext')
-            if proto == 'm3u8':
-                formats.extend(self._extract_m3u8_formats(src, video_id, ext))
-            elif proto == 'rtmp':
-                rtmp_count += 1
-                streamer = video.get('streamer') or base
-                formats.append({
-                    'url': streamer,
-                    'play_path': src,
-                    'ext': 'flv',
-                    'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
-                    'tbr': bitrate,
-                    'width': width,
-                    'height': height,
-                })
+        if smil.findall('./body/seq/video'):
+            video = smil.findall('./body/seq/video')[0]
+            fmts, rtmp_count = self._parse_smil_video(video, video_id, base, rtmp_count)
+            formats.extend(fmts)
+        else:
+            for video in smil.findall('./body/switch/video'):
+                fmts, rtmp_count = self._parse_smil_video(video, video_id, base, rtmp_count)
+                formats.extend(fmts)
         self._sort_formats(formats)

         return formats

+    def _parse_smil_video(self, video, video_id, base, rtmp_count):
+        src = video.get('src')
+        if not src:
+            return ([], rtmp_count)
+        bitrate = int_or_none(video.get('system-bitrate') or video.get('systemBitrate'), 1000)
+        width = int_or_none(video.get('width'))
+        height = int_or_none(video.get('height'))
+        proto = video.get('proto')
+        if not proto:
+            if base:
+                if base.startswith('rtmp'):
+                    proto = 'rtmp'
+                elif base.startswith('http'):
+                    proto = 'http'
+        ext = video.get('ext')
+        if proto == 'm3u8':
+            return (self._extract_m3u8_formats(src, video_id, ext), rtmp_count)
+        elif proto == 'rtmp':
+            rtmp_count += 1
+            streamer = video.get('streamer') or base
+            return ([{
+                'url': streamer,
+                'play_path': src,
+                'ext': 'flv',
+                'format_id': 'rtmp-%d' % (rtmp_count if bitrate is None else bitrate),
+                'tbr': bitrate,
+                'width': width,
+                'height': height,
+            }], rtmp_count)
+        elif proto.startswith('http'):
+            return ([{
+                'url': base + src,
+                'ext': ext or 'flv',
+                'tbr': bitrate,
+                'width': width,
+                'height': height,
+            }], rtmp_count)
+
     def _live_title(self, name):
         """ Generate the title for a live video """
         now = datetime.datetime.now()
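
The refactor above also adds a `<seq>` vs `<switch>` dispatch: a SMIL `<seq>` is one continuous clip (only the first entry matters), while a `<switch>` lists alternative renditions. A toy illustration with ElementTree and a made-up SMIL document:

    import xml.etree.ElementTree as ET

    smil = ET.fromstring(
        '<smil><body><seq>'
        '<video src="mp4:clip" system-bitrate="800000"/>'
        '<video src="mp4:clip2" system-bitrate="400000"/>'
        '</seq></body></smil>')

    if smil.findall('./body/seq/video'):
        videos = smil.findall('./body/seq/video')[:1]  # seq: one continuous clip
    else:
        videos = smil.findall('./body/switch/video')   # switch: alternatives

    for video in videos:
        print(video.get('src'), int(video.get('system-bitrate')) // 1000, 'kbps')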

View File

@@ -35,10 +35,7 @@ class EpornerIE(InfoExtractor):
         title = self._html_search_regex(
             r'<title>(.*?) - EPORNER', webpage, 'title')

-        redirect_code = self._html_search_regex(
-            r'<script type="text/javascript" src="/config5/%s/([a-f\d]+)/">' % video_id,
-            webpage, 'redirect_code')
-        redirect_url = 'http://www.eporner.com/config5/%s/%s' % (video_id, redirect_code)
+        redirect_url = 'http://www.eporner.com/config5/%s' % video_id
         player_code = self._download_webpage(
             redirect_url, display_id, note='Downloading player config')

@@ -69,5 +66,5 @@ class EpornerIE(InfoExtractor):
             'duration': duration,
             'view_count': view_count,
             'formats': formats,
-            'age_limit': self._rta_search(webpage),
+            'age_limit': 18,
         }

View File

@@ -3,15 +3,18 @@ from __future__ import unicode_literals
 from .common import InfoExtractor
 from ..compat import (
     compat_urllib_parse,
+    compat_urllib_request,
 )
 from ..utils import (
     ExtractorError,
     js_to_json,
+    parse_duration,
 )


 class EscapistIE(InfoExtractor):
     _VALID_URL = r'https?://?(www\.)?escapistmagazine\.com/videos/view/[^/?#]+/(?P<id>[0-9]+)-[^/?#]*(?:$|[?#])'
+    _USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko'

     _TEST = {
         'url': 'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
         'md5': 'ab3a706c681efca53f0a35f1415cf0d1',

@@ -23,12 +26,15 @@ class EscapistIE(InfoExtractor):
             'uploader': 'The Escapist Presents',
             'title': "Breaking Down Baldur's Gate",
             'thumbnail': 're:^https?://.*\.jpg$',
+            'duration': 264,
         }
     }

     def _real_extract(self, url):
         video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        webpage_req = compat_urllib_request.Request(url)
+        webpage_req.add_header('User-Agent', self._USER_AGENT)
+        webpage = self._download_webpage(webpage_req, video_id)

         uploader_id = self._html_search_regex(
             r"<h1\s+class='headline'>\s*<a\s+href='/videos/view/(.*?)'",

@@ -37,31 +43,50 @@ class EscapistIE(InfoExtractor):
             r"<h1\s+class='headline'>(.*?)</a>",
             webpage, 'uploader', fatal=False)
         description = self._html_search_meta('description', webpage)
+        duration = parse_duration(self._html_search_meta('duration', webpage))

         raw_title = self._html_search_meta('title', webpage, fatal=True)
         title = raw_title.partition(' : ')[2]

         config_url = compat_urllib_parse.unquote(self._html_search_regex(
-            r'<param\s+name="flashvars"\s+value="config=([^"&]+)', webpage, 'config URL'))
+            r'''(?x)
+            (?:
+                <param\s+name="flashvars".*?\s+value="config=|
+                flashvars=&quot;config=
+            )
+            (https?://[^"&]+)
+            ''',
+            webpage, 'config URL'))

         formats = []
+        ad_formats = []

-        def _add_format(name, cfgurl, quality):
+        def _add_format(name, cfg_url, quality):
+            cfg_req = compat_urllib_request.Request(cfg_url)
+            cfg_req.add_header('User-Agent', self._USER_AGENT)
             config = self._download_json(
-                cfgurl, video_id,
+                cfg_req, video_id,
                 'Downloading ' + name + ' configuration',
                 'Unable to download ' + name + ' configuration',
                 transform_source=js_to_json)

             playlist = config['playlist']
-            video_url = next(
-                p['url'] for p in playlist
-                if p.get('eventCategory') == 'Video')
-            formats.append({
-                'url': video_url,
-                'format_id': name,
-                'quality': quality,
-            })
+            for p in playlist:
+                if p.get('eventCategory') == 'Video':
+                    ar = formats
+                elif p.get('eventCategory') == 'Video Postroll':
+                    ar = ad_formats
+                else:
+                    continue
+
+                ar.append({
+                    'url': p['url'],
+                    'format_id': name,
+                    'quality': quality,
+                    'http_headers': {
+                        'User-Agent': self._USER_AGENT,
+                    },
+                })

         _add_format('normal', config_url, quality=0)
         hq_url = (config_url +

@@ -70,10 +95,12 @@ class EscapistIE(InfoExtractor):
             _add_format('hq', hq_url, quality=1)
         except ExtractorError:
             pass  # That's fine, we'll just use normal quality

         self._sort_formats(formats)

-        return {
+        if '/escapist/sales-marketing/' in formats[-1]['url']:
+            raise ExtractorError('This IP address has been blocked by The Escapist', expected=True)
+
+        res = {
             'id': video_id,
             'formats': formats,
             'uploader': uploader,

@@ -81,4 +108,21 @@ class EscapistIE(InfoExtractor):
             'title': title,
             'thumbnail': self._og_search_thumbnail(webpage),
             'description': description,
+            'duration': duration,
         }
+
+        if self._downloader.params.get('include_ads') and ad_formats:
+            self._sort_formats(ad_formats)
+            ad_res = {
+                'id': '%s-ad' % video_id,
+                'title': '%s (Postroll)' % title,
+                'formats': ad_formats,
+            }
+            return {
+                '_type': 'playlist',
+                'entries': [res, ad_res],
+                'title': title,
+                'id': video_id,
+            }
+
+        return res

View File

@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..utils import (
     int_or_none,

@@ -31,7 +33,7 @@ class GameStarIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)

         og_title = self._og_search_title(webpage)
-        title = og_title.replace(' - Video bei GameStar.de', '').strip()
+        title = re.sub(r'\s*- Video (bei|-) GameStar\.de$', '', og_title)

         url = 'http://gamestar.de/_misc/videos/portal/getVideoUrl.cfm?premium=0&videoId=' + video_id

View File

@@ -26,6 +26,7 @@ from ..utils import (
     unsmuggle_url,
     UnsupportedError,
     url_basename,
+    xpath_text,
 )
 from .brightcove import BrightcoveIE
 from .ooyala import OoyalaIE

@@ -557,6 +558,28 @@ class GenericIE(InfoExtractor):
                 'title': 'EP3S5 - Bon Appétit - Baqueira Mi Corazon !',
             }
         },
+        # Kaltura embed
+        {
+            'url': 'http://www.monumentalnetwork.com/videos/john-carlson-postgame-2-25-15',
+            'info_dict': {
+                'id': '1_eergr3h1',
+                'ext': 'mp4',
+                'upload_date': '20150226',
+                'uploader_id': 'MonumentalSports-Kaltura@perfectsensedigital.com',
+                'timestamp': int,
+                'title': 'John Carlson Postgame 2/25/15',
+            },
+        },
+        # RSS feed with enclosure
+        {
+            'url': 'http://podcastfeeds.nbcnews.com/audio/podcast/MSNBC-MADDOW-NETCAST-M4V.xml',
+            'info_dict': {
+                'id': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
+                'ext': 'm4v',
+                'upload_date': '20150228',
+                'title': 'pdv_maddow_netcast_m4v-02-27-2015-201624',
+            }
+        }
     ]

     def report_following_redirect(self, new_url):

@@ -568,11 +591,24 @@ class GenericIE(InfoExtractor):
         playlist_desc_el = doc.find('./channel/description')
         playlist_desc = None if playlist_desc_el is None else playlist_desc_el.text

-        entries = [{
-            '_type': 'url',
-            'url': e.find('link').text,
-            'title': e.find('title').text,
-        } for e in doc.findall('./channel/item')]
+        entries = []
+        for it in doc.findall('./channel/item'):
+            next_url = xpath_text(it, 'link', fatal=False)
+            if not next_url:
+                enclosure_nodes = it.findall('./enclosure')
+                for e in enclosure_nodes:
+                    next_url = e.attrib.get('url')
+                    if next_url:
+                        break
+
+            if not next_url:
+                continue
+
+            entries.append({
+                '_type': 'url',
+                'url': next_url,
+                'title': it.find('title').text,
+            })

         return {
             '_type': 'playlist',

@@ -1113,6 +1149,12 @@ class GenericIE(InfoExtractor):
         if mobj is not None:
             return self.url_result(mobj.group('url'), 'Zapiks')

+        # Look for Kaltura embeds
+        mobj = re.search(
+            r"(?s)kWidget\.(?:thumb)?[Ee]mbed\(\{.*?'wid'\s*:\s*'_?(?P<partner_id>[^']+)',.*?'entry_id'\s*:\s*'(?P<id>[^']+)',", webpage)
+        if mobj is not None:
+            return self.url_result('kaltura:%(partner_id)s:%(id)s' % mobj.groupdict(), 'Kaltura')
+
         def check_video(vurl):
             if YoutubeIE.suitable(vurl):
                 return True

@@ -1208,7 +1250,9 @@ class GenericIE(InfoExtractor):
             return entries[0]
         else:
             for num, e in enumerate(entries, start=1):
-                e['title'] = '%s (%d)' % (e['title'], num)
+                # 'url' results don't have a title
+                if e.get('title') is not None:
+                    e['title'] = '%s (%d)' % (e['title'], num)
             return {
                 '_type': 'playlist',
                 'entries': entries,
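
The RSS change above prefers each item's `<link>` and falls back to the `url` attribute of an `<enclosure>`, which is how podcast feeds usually carry their media. A self-contained sketch of that fallback against a made-up feed:

    import xml.etree.ElementTree as ET

    feed = ET.fromstring(
        '<rss><channel><item>'
        '<title>Episode 1</title>'
        '<enclosure url="http://example.com/ep1.m4v" type="video/x-m4v"/>'
        '</item></channel></rss>')

    for item in feed.findall('./channel/item'):
        next_url = item.findtext('link')
        if not next_url:
            for enclosure in item.findall('./enclosure'):
                next_url = enclosure.attrib.get('url')
                if next_url:
                    break
        if next_url:
            print(item.findtext('title'), '->', next_url)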

View File

@@ -0,0 +1,138 @@
# coding: utf-8
from __future__ import unicode_literals

import re

from .common import InfoExtractor
from ..compat import compat_urllib_parse
from ..utils import (
    ExtractorError,
    int_or_none,
)


class KalturaIE(InfoExtractor):
    _VALID_URL = r'''(?x)
        (?:kaltura:|
           https?://(:?(?:www|cdnapisec)\.)?kaltura\.com/index\.php/kwidget/(?:[^/]+/)*?wid/_
        )(?P<partner_id>\d+)
        (?::|
           /(?:[^/]+/)*?entry_id/
        )(?P<id>[0-9a-z_]+)'''
    _API_BASE = 'http://cdnapi.kaltura.com/api_v3/index.php?'
    _TESTS = [
        {
            'url': 'kaltura:269692:1_1jc2y3e4',
            'md5': '3adcbdb3dcc02d647539e53f284ba171',
            'info_dict': {
                'id': '1_1jc2y3e4',
                'ext': 'mp4',
                'title': 'Track 4',
                'upload_date': '20131219',
                'uploader_id': 'mlundberg@wolfgangsvault.com',
                'description': 'The Allman Brothers Band, 12/16/1981',
                'thumbnail': 're:^https?://.*/thumbnail/.*',
                'timestamp': int,
            },
        },
        {
            'url': 'http://www.kaltura.com/index.php/kwidget/cache_st/1300318621/wid/_269692/uiconf_id/3873291/entry_id/1_1jc2y3e4',
            'only_matching': True,
        },
        {
            'url': 'https://cdnapisec.kaltura.com/index.php/kwidget/wid/_557781/uiconf_id/22845202/entry_id/1_plr1syf3',
            'only_matching': True,
        },
    ]

    def _kaltura_api_call(self, video_id, actions, *args, **kwargs):
        params = actions[0]
        if len(actions) > 1:
            for i, a in enumerate(actions[1:], start=1):
                for k, v in a.items():
                    params['%d:%s' % (i, k)] = v

        query = compat_urllib_parse.urlencode(params)
        url = self._API_BASE + query
        data = self._download_json(url, video_id, *args, **kwargs)

        status = data if len(actions) == 1 else data[0]
        if status.get('objectType') == 'KalturaAPIException':
            raise ExtractorError(
                '%s said: %s' % (self.IE_NAME, status['message']))

        return data

    def _get_kaltura_signature(self, video_id, partner_id):
        actions = [{
            'apiVersion': '3.1',
            'expiry': 86400,
            'format': 1,
            'service': 'session',
            'action': 'startWidgetSession',
            'widgetId': '_%s' % partner_id,
        }]
        return self._kaltura_api_call(
            video_id, actions, note='Downloading Kaltura signature')['ks']

    def _get_video_info(self, video_id, partner_id):
        signature = self._get_kaltura_signature(video_id, partner_id)
        actions = [
            {
                'action': 'null',
                'apiVersion': '3.1.5',
                'clientTag': 'kdp:v3.8.5',
                'format': 1,  # JSON, 2 = XML, 3 = PHP
                'service': 'multirequest',
                'ks': signature,
            },
            {
                'action': 'get',
                'entryId': video_id,
                'service': 'baseentry',
                'version': '-1',
            },
            {
                'action': 'getContextData',
                'contextDataParams:objectType': 'KalturaEntryContextDataParams',
                'contextDataParams:referrer': 'http://www.kaltura.com/',
                'contextDataParams:streamerType': 'http',
                'entryId': video_id,
                'service': 'baseentry',
            },
        ]
        return self._kaltura_api_call(
            video_id, actions, note='Downloading video info JSON')

    def _real_extract(self, url):
        video_id = self._match_id(url)
        mobj = re.match(self._VALID_URL, url)
        partner_id, entry_id = mobj.group('partner_id'), mobj.group('id')

        info, source_data = self._get_video_info(entry_id, partner_id)

        formats = [{
            'format_id': '%(fileExt)s-%(bitrate)s' % f,
            'ext': f['fileExt'],
            'tbr': f['bitrate'],
            'fps': f.get('frameRate'),
            'filesize_approx': int_or_none(f.get('size'), invscale=1024),
            'container': f.get('containerFormat'),
            'vcodec': f.get('videoCodecId'),
            'height': f.get('height'),
            'width': f.get('width'),
            'url': '%s/flavorId/%s' % (info['dataUrl'], f['id']),
        } for f in source_data['flavorAssets']]
        self._sort_formats(formats)

        return {
            'id': video_id,
            'title': info['name'],
            'formats': formats,
            'description': info.get('description'),
            'thumbnail': info.get('thumbnailUrl'),
            'duration': info.get('duration'),
            'timestamp': info.get('createdAt'),
            'uploader_id': info.get('userId'),
            'view_count': info.get('plays'),
        }
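
The multirequest flattening in `_kaltura_api_call` is worth seeing in isolation: every action after the first gets its keys prefixed with "1:", "2:", and so on, so several API calls ride in one query string. A runnable sketch (the parameter values are illustrative, not a working session):

    try:
        from urllib.parse import urlencode  # Python 3
    except ImportError:
        from urllib import urlencode        # Python 2

    actions = [
        {'service': 'multirequest', 'format': 1, 'ks': 'SESSION'},
        {'service': 'baseentry', 'action': 'get', 'entryId': '1_abcdef'},
    ]

    params = dict(actions[0])
    for i, a in enumerate(actions[1:], start=1):
        for k, v in a.items():
            params['%d:%s' % (i, k)] = v

    print('http://cdnapi.kaltura.com/api_v3/index.php?' + urlencode(params))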

View File

@@ -27,8 +27,6 @@ class Laola1TvIE(InfoExtractor):
         }
     }

-    _BROKEN = True  # Not really - extractor works fine, but f4m downloader does not support live streams yet.
-
     def _real_extract(self, url):
         mobj = re.match(self._VALID_URL, url)
         video_id = mobj.group('id')

@@ -57,11 +55,7 @@ class Laola1TvIE(InfoExtractor):
         title = xpath_text(hd_doc, './/video/title', fatal=True)
         flash_url = xpath_text(hd_doc, './/video/url', fatal=True)
         uploader = xpath_text(hd_doc, './/video/meta_organistation')
         is_live = xpath_text(hd_doc, './/video/islive') == 'true'
-        if is_live:
-            raise ExtractorError(
-                'Live streams are not supported by the f4m downloader.')

         categories = xpath_text(hd_doc, './/video/meta_sports')
         if categories:

View File

@@ -0,0 +1,190 @@
# coding: utf-8
from __future__ import unicode_literals

import datetime
import re
import time

from .common import InfoExtractor
from ..compat import (
    compat_urlparse,
    compat_urllib_parse,
)
from ..utils import (
    determine_ext,
    ExtractorError,
    parse_iso8601,
)


class LetvIE(InfoExtractor):
    _VALID_URL = r'http://www\.letv\.com/ptv/vplay/(?P<id>\d+).html'

    _TESTS = [{
        'url': 'http://www.letv.com/ptv/vplay/22005890.html',
        'md5': 'cab23bd68d5a8db9be31c9a222c1e8df',
        'info_dict': {
            'id': '22005890',
            'ext': 'mp4',
            'title': '第87届奥斯卡颁奖礼完美落幕 《鸟人》成最大赢家',
            'timestamp': 1424747397,
            'upload_date': '20150224',
            'description': 'md5:a9cb175fd753e2962176b7beca21a47c',
        }
    }, {
        'url': 'http://www.letv.com/ptv/vplay/1415246.html',
        'info_dict': {
            'id': '1415246',
            'ext': 'mp4',
            'title': '美人天下01',
            'description': 'md5:f88573d9d7225ada1359eaf0dbf8bcda',
        },
        'expected_warnings': [
            'publish time'
        ]
    }]
    # http://www.letv.com/ptv/vplay/1118082.html
    # This video is available only in Mainland China

    @staticmethod
    def urshift(val, n):
        return val >> n if val >= 0 else (val + 0x100000000) >> n

    # ror() and calc_time_key() are reversed from a embedded swf file in KLetvPlayer.swf
    def ror(self, param1, param2):
        _loc3_ = 0
        while _loc3_ < param2:
            param1 = self.urshift(param1, 1) + ((param1 & 1) << 31)
            _loc3_ += 1
        return param1

    def calc_time_key(self, param1):
        _loc2_ = 773625421
        _loc3_ = self.ror(param1, _loc2_ % 13)
        _loc3_ = _loc3_ ^ _loc2_
        _loc3_ = self.ror(_loc3_, _loc2_ % 17)
        return _loc3_

    def _real_extract(self, url):
        media_id = self._match_id(url)
        page = self._download_webpage(url, media_id)
        params = {
            'id': media_id,
            'platid': 1,
            'splatid': 101,
            'format': 1,
            'tkey': self.calc_time_key(int(time.time())),
            'domain': 'www.letv.com'
        }
        play_json = self._download_json(
            'http://api.letv.com/mms/out/video/playJson?' + compat_urllib_parse.urlencode(params),
            media_id, 'playJson data')

        # Check for errors
        playstatus = play_json['playstatus']
        if playstatus['status'] == 0:
            flag = playstatus['flag']
            if flag == 1:
                msg = 'Country %s auth error' % playstatus['country']
            else:
                msg = 'Generic error. flag = %d' % flag
            raise ExtractorError(msg, expected=True)

        playurl = play_json['playurl']

        formats = ['350', '1000', '1300', '720p', '1080p']
        dispatch = playurl['dispatch']

        urls = []
        for format_id in formats:
            if format_id in dispatch:
                media_url = playurl['domain'][0] + dispatch[format_id][0]

                # Mimic what flvxz.com do
                url_parts = list(compat_urlparse.urlparse(media_url))
                qs = dict(compat_urlparse.parse_qs(url_parts[4]))
                qs.update({
                    'platid': '14',
                    'splatid': '1401',
                    'tss': 'no',
                    'retry': 1
                })
                url_parts[4] = compat_urllib_parse.urlencode(qs)
                media_url = compat_urlparse.urlunparse(url_parts)

                url_info_dict = {
                    'url': media_url,
                    'ext': determine_ext(dispatch[format_id][1])
                }

                if format_id[-1:] == 'p':
                    url_info_dict['height'] = format_id[:-1]

                urls.append(url_info_dict)

        publish_time = parse_iso8601(self._html_search_regex(
            r'发布时间&nbsp;([^<>]+) ', page, 'publish time', fatal=False),
            delimiter=' ', timezone=datetime.timedelta(hours=8))
        description = self._html_search_meta('description', page, fatal=False)

        return {
            'id': media_id,
            'formats': urls,
            'title': playurl['title'],
            'thumbnail': playurl['pic'],
            'description': description,
            'timestamp': publish_time,
        }


class LetvTvIE(InfoExtractor):
    _VALID_URL = r'http://www.letv.com/tv/(?P<id>\d+).html'
    _TESTS = [{
        'url': 'http://www.letv.com/tv/46177.html',
        'info_dict': {
            'id': '46177',
            'title': '美人天下',
            'description': 'md5:395666ff41b44080396e59570dbac01c'
        },
        'playlist_count': 35
    }]

    def _real_extract(self, url):
        playlist_id = self._match_id(url)
        page = self._download_webpage(url, playlist_id)

        media_urls = list(set(re.findall(
            r'http://www.letv.com/ptv/vplay/\d+.html', page)))
        entries = [self.url_result(media_url, ie='Letv')
                   for media_url in media_urls]

        title = self._html_search_meta('keywords', page,
                                       fatal=False).split('，')[0]
        description = self._html_search_meta('description', page, fatal=False)

        return self.playlist_result(entries, playlist_id, playlist_title=title,
                                    playlist_description=description)


class LetvPlaylistIE(LetvTvIE):
    _VALID_URL = r'http://tv.letv.com/[a-z]+/(?P<id>[a-z]+)/index.s?html'
    _TESTS = [{
        'url': 'http://tv.letv.com/izt/wuzetian/index.html',
        'info_dict': {
            'id': 'wuzetian',
            'title': '武媚娘传奇',
            'description': 'md5:e12499475ab3d50219e5bba00b3cb248'
        },
        # This playlist contains some extra videos other than the drama itself
        'playlist_mincount': 96
    }, {
        'url': 'http://tv.letv.com/pzt/lswjzzjc/index.shtml',
        'info_dict': {
            'id': 'lswjzzjc',
            # The title should be "劲舞青春", but I can't find a simple way to
            # determine the playlist title
            'title': '乐视午间自制剧场',
            'description': 'md5:b1eef244f45589a7b5b1af9ff25a4489'
        },
        'playlist_mincount': 7
    }]
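
The `tkey` scrambling above is just a 32-bit rotate-right built from an unsigned shift, applied twice around an XOR. It runs fine on its own, stripped of the extractor class:

    def urshift(val, n):
        # Emulate a 32-bit unsigned right shift for possibly-negative ints.
        return val >> n if val >= 0 else (val + 0x100000000) >> n

    def ror(val, n):
        # Rotate right by moving the low bit to bit 31, n times.
        for _ in range(n):
            val = urshift(val, 1) + ((val & 1) << 31)
        return val

    def calc_time_key(timestamp):
        key = 773625421
        scrambled = ror(timestamp, key % 13)
        scrambled ^= key
        return ror(scrambled, key % 17)

    print(calc_time_key(1425340800))  # any Unix timestamp works as input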

View File

@@ -15,19 +15,73 @@ from ..utils import (
 )


-class LyndaIE(InfoExtractor):
+class LyndaBaseIE(InfoExtractor):
+    _LOGIN_URL = 'https://www.lynda.com/login/login.aspx'
+    _SUCCESSFUL_LOGIN_REGEX = r'isLoggedIn: true'
+    _ACCOUNT_CREDENTIALS_HINT = 'Use --username and --password options to provide lynda.com account credentials.'
+
+    def _real_initialize(self):
+        self._login()
+
+    def _login(self):
+        (username, password) = self._get_login_info()
+        if username is None:
+            return
+
+        login_form = {
+            'username': username,
+            'password': password,
+            'remember': 'false',
+            'stayPut': 'false'
+        }
+        request = compat_urllib_request.Request(
+            self._LOGIN_URL, compat_urllib_parse.urlencode(login_form))
+        login_page = self._download_webpage(
+            request, None, 'Logging in as %s' % username)
+
+        # Not (yet) logged in
+        m = re.search(r'loginResultJson = \'(?P<json>[^\']+)\';', login_page)
+        if m is not None:
+            response = m.group('json')
+            response_json = json.loads(response)
+            state = response_json['state']
+
+            if state == 'notlogged':
+                raise ExtractorError(
+                    'Unable to login, incorrect username and/or password',
+                    expected=True)
+
+            # This is when we get popup:
+            # > You're already logged in to lynda.com on two devices.
+            # > If you log in here, we'll log you out of another device.
+            # So, we need to confirm this.
+            if state == 'conflicted':
+                confirm_form = {
+                    'username': '',
+                    'password': '',
+                    'resolve': 'true',
+                    'remember': 'false',
+                    'stayPut': 'false',
+                }
+                request = compat_urllib_request.Request(
+                    self._LOGIN_URL, compat_urllib_parse.urlencode(confirm_form))
+                login_page = self._download_webpage(
+                    request, None,
+                    'Confirming log in and log out from another device')
+
+        if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
+            raise ExtractorError('Unable to log in')
+
+
+class LyndaIE(LyndaBaseIE):
     IE_NAME = 'lynda'
     IE_DESC = 'lynda.com videos'
-    _VALID_URL = r'https?://www\.lynda\.com/[^/]+/[^/]+/\d+/(\d+)-\d\.html'
-    _LOGIN_URL = 'https://www.lynda.com/login/login.aspx'
+    _VALID_URL = r'https?://www\.lynda\.com/(?:[^/]+/[^/]+/\d+|player/embed)/(?P<id>\d+)'
     _NETRC_MACHINE = 'lynda'
-    _SUCCESSFUL_LOGIN_REGEX = r'isLoggedIn: true'
     _TIMECODE_REGEX = r'\[(?P<timecode>\d+:\d+:\d+[\.,]\d+)\]'
-    ACCOUNT_CREDENTIALS_HINT = 'Use --username and --password options to provide lynda.com account credentials.'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.lynda.com/Bootstrap-tutorials/Using-exercise-files/110885/114408-4.html',
         'md5': 'ecfc6862da89489161fb9cd5f5a6fac1',
         'info_dict': {

@@ -36,25 +90,27 @@ class LyndaIE(InfoExtractor):
             'title': 'Using the exercise files',
             'duration': 68
         }
-    }
-
-    def _real_initialize(self):
-        self._login()
+    }, {
+        'url': 'https://www.lynda.com/player/embed/133770?tr=foo=1;bar=g;fizz=rt&fs=0',
+        'only_matching': True,
+    }]

     def _real_extract(self, url):
-        mobj = re.match(self._VALID_URL, url)
-        video_id = mobj.group(1)
+        video_id = self._match_id(url)

-        page = self._download_webpage('http://www.lynda.com/ajax/player?videoId=%s&type=video' % video_id, video_id,
-                                      'Downloading video JSON')
+        page = self._download_webpage(
+            'http://www.lynda.com/ajax/player?videoId=%s&type=video' % video_id,
+            video_id, 'Downloading video JSON')
         video_json = json.loads(page)

         if 'Status' in video_json:
-            raise ExtractorError('lynda returned error: %s' % video_json['Message'], expected=True)
+            raise ExtractorError(
+                'lynda returned error: %s' % video_json['Message'], expected=True)

         if video_json['HasAccess'] is False:
             raise ExtractorError(
-                'Video %s is only available for members. ' % video_id + self.ACCOUNT_CREDENTIALS_HINT, expected=True)
+                'Video %s is only available for members. '
+                % video_id + self._ACCOUNT_CREDENTIALS_HINT, expected=True)

         video_id = compat_str(video_json['ID'])
         duration = video_json['DurationInSeconds']

@@ -97,50 +153,9 @@ class LyndaIE(InfoExtractor):
             'formats': formats
         }

-    def _login(self):
-        (username, password) = self._get_login_info()
-        if username is None:
-            return
-
-        login_form = {
-            'username': username,
-            'password': password,
-            'remember': 'false',
-            'stayPut': 'false'
-        }
-        request = compat_urllib_request.Request(self._LOGIN_URL, compat_urllib_parse.urlencode(login_form))
-        login_page = self._download_webpage(request, None, 'Logging in as %s' % username)
-
-        # Not (yet) logged in
-        m = re.search(r'loginResultJson = \'(?P<json>[^\']+)\';', login_page)
-        if m is not None:
-            response = m.group('json')
-            response_json = json.loads(response)
-            state = response_json['state']
-
-            if state == 'notlogged':
-                raise ExtractorError('Unable to login, incorrect username and/or password', expected=True)
-
-            # This is when we get popup:
-            # > You're already logged in to lynda.com on two devices.
-            # > If you log in here, we'll log you out of another device.
-            # So, we need to confirm this.
-            if state == 'conflicted':
-                confirm_form = {
-                    'username': '',
-                    'password': '',
-                    'resolve': 'true',
-                    'remember': 'false',
-                    'stayPut': 'false',
-                }
-                request = compat_urllib_request.Request(self._LOGIN_URL, compat_urllib_parse.urlencode(confirm_form))
-                login_page = self._download_webpage(request, None, 'Confirming log in and log out from another device')
-
-        if re.search(self._SUCCESSFUL_LOGIN_REGEX, login_page) is None:
-            raise ExtractorError('Unable to log in')
-
     def _fix_subtitles(self, subs):
         srt = ''
+        seq_counter = 0
         for pos in range(0, len(subs) - 1):
             seq_current = subs[pos]
             m_current = re.match(self._TIMECODE_REGEX, seq_current['Timecode'])

@@ -152,8 +167,10 @@ class LyndaIE(InfoExtractor):
                 continue
             appear_time = m_current.group('timecode')
             disappear_time = m_next.group('timecode')
-            text = seq_current['Caption']
-            srt += '%s\r\n%s --> %s\r\n%s' % (str(pos), appear_time, disappear_time, text)
+            text = seq_current['Caption'].strip()
+            if text:
+                seq_counter += 1
+                srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (seq_counter, appear_time, disappear_time, text)
         if srt:
             return srt

@@ -166,7 +183,7 @@ class LyndaIE(InfoExtractor):
         return {}


-class LyndaCourseIE(InfoExtractor):
+class LyndaCourseIE(LyndaBaseIE):
     IE_NAME = 'lynda:course'
     IE_DESC = 'lynda.com online courses'

@@ -179,35 +196,37 @@ class LyndaCourseIE(InfoExtractor):
         course_path = mobj.group('coursepath')
         course_id = mobj.group('courseid')

-        page = self._download_webpage('http://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id,
-                                      course_id, 'Downloading course JSON')
+        page = self._download_webpage(
+            'http://www.lynda.com/ajax/player?courseId=%s&type=course' % course_id,
+            course_id, 'Downloading course JSON')
         course_json = json.loads(page)

         if 'Status' in course_json and course_json['Status'] == 'NotFound':
-            raise ExtractorError('Course %s does not exist' % course_id, expected=True)
+            raise ExtractorError(
+                'Course %s does not exist' % course_id, expected=True)

         unaccessible_videos = 0
         videos = []
-        (username, _) = self._get_login_info()

         # Might want to extract videos right here from video['Formats'] as it seems 'Formats' is not provided
         # by single video API anymore

         for chapter in course_json['Chapters']:
             for video in chapter['Videos']:
-                if username is None and video['HasAccess'] is False:
+                if video['HasAccess'] is False:
                     unaccessible_videos += 1
                     continue
                 videos.append(video['ID'])

         if unaccessible_videos > 0:
-            self._downloader.report_warning('%s videos are only available for members and will not be downloaded. '
-                                            % unaccessible_videos + LyndaIE.ACCOUNT_CREDENTIALS_HINT)
+            self._downloader.report_warning(
+                '%s videos are only available for members (or paid members) and will not be downloaded. '
+                % unaccessible_videos + self._ACCOUNT_CREDENTIALS_HINT)

         entries = [
-            self.url_result('http://www.lynda.com/%s/%s-4.html' %
-                            (course_path, video_id),
-                            'Lynda')
+            self.url_result(
+                'http://www.lynda.com/%s/%s-4.html' % (course_path, video_id),
+                'Lynda')
             for video_id in videos]

         course_title = course_json['Title']
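
The `_fix_subtitles` change above is subtle: SRT cue numbers must stay sequential even when blank captions are skipped, so the loop index can no longer be used as the cue number. A minimal illustration with made-up cues:

    cues = [
        ('00:00:01,000', '00:00:02,500', 'Hello'),
        ('00:00:02,500', '00:00:04,000', '   '),   # blank caption: dropped
        ('00:00:04,000', '00:00:06,000', 'World'),
    ]

    srt = ''
    seq_counter = 0
    for appear, disappear, caption in cues:
        text = caption.strip()
        if text:
            seq_counter += 1
            srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (seq_counter, appear, disappear, text)

    print(srt)  # two cues, numbered 1 and 2 with no gap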

View File

@@ -18,7 +18,7 @@ class MiTeleIE(InfoExtractor):
     IE_NAME = 'mitele.es'
     _VALID_URL = r'http://www\.mitele\.es/[^/]+/[^/]+/[^/]+/(?P<id>[^/]+)/'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.mitele.es/programas-tv/diario-de/la-redaccion/programa-144/',
         'md5': '6a75fe9d0d3275bead0cb683c616fddb',
         'info_dict': {

@@ -29,7 +29,7 @@ class MiTeleIE(InfoExtractor):
             'display_id': 'programa-144',
             'duration': 2913,
         },
-    }
+    }]

     def _real_extract(self, url):
         episode = self._match_id(url)

View File

@@ -5,7 +5,7 @@ from ..utils import int_or_none


 class MporaIE(InfoExtractor):
-    _VALID_URL = r'https?://(www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
+    _VALID_URL = r'https?://(?:www\.)?mpora\.(?:com|de)/videos/(?P<id>[^?#/]+)'
     IE_NAME = 'MPORA'

     _TEST = {

@@ -25,7 +25,9 @@ class MporaIE(InfoExtractor):
         webpage = self._download_webpage(url, video_id)

         data_json = self._search_regex(
-            r"new FM\.Player\('[^']+',\s*(\{.*?)\).player;", webpage, 'json')
+            [r"new FM\.Player\('[^']+',\s*(\{.*?)\).player;",
+             r"new\s+FM\.Kaltura\.Player\('[^']+'\s*,\s*({.+?})\);"],
+            webpage, 'json')
         data = self._parse_json(data_json, video_id)

         uploader = data['info_overlay'].get('username')

View File

@@ -3,17 +3,13 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
-from ..utils import (
-    parse_duration,
-    unified_strdate,
-)


 class MusicVaultIE(InfoExtractor):
     _VALID_URL = r'https?://www\.musicvault\.com/(?P<uploader_id>[^/?#]*)/video/(?P<display_id>[^/?#]*)_(?P<id>[0-9]+)\.html'
     _TEST = {
         'url': 'http://www.musicvault.com/the-allman-brothers-band/video/straight-from-the-heart_1010863.html',
-        'md5': '2cdbb3ae75f7fb3519821507d2fb3c15',
+        'md5': '3adcbdb3dcc02d647539e53f284ba171',
         'info_dict': {
             'id': '1010863',
             'ext': 'mp4',

@@ -22,9 +18,10 @@ class MusicVaultIE(InfoExtractor):
             'duration': 244,
             'uploader': 'The Allman Brothers Band',
             'thumbnail': 're:^https?://.*/thumbnail/.*',
-            'upload_date': '19811216',
+            'upload_date': '20131219',
             'location': 'Capitol Theatre (Passaic, NJ)',
             'description': 'Listen to The Allman Brothers Band perform Straight from the Heart at Capitol Theatre (Passaic, NJ) on Dec 16, 1981',
+            'timestamp': int,
         }
     }

@@ -43,34 +40,24 @@ class MusicVaultIE(InfoExtractor):
             r'<h1.*?>(.*?)</h1>', data_div, 'uploader', fatal=False)
         title = self._html_search_regex(
             r'<h2.*?>(.*?)</h2>', data_div, 'title')
-        upload_date = unified_strdate(self._html_search_regex(
-            r'<h3.*?>(.*?)</h3>', data_div, 'uploader', fatal=False))
         location = self._html_search_regex(
             r'<h4.*?>(.*?)</h4>', data_div, 'location', fatal=False)

-        duration = parse_duration(self._html_search_meta('duration', webpage))
-
-        VIDEO_URL_TEMPLATE = 'http://cdnapi.kaltura.com/p/%(uid)s/sp/%(wid)s/playManifest/entryId/%(entry_id)s/format/url/protocol/http'
         kaltura_id = self._search_regex(
             r'<div id="video-detail-player" data-kaltura-id="([^"]+)"',
             webpage, 'kaltura ID')
-        video_url = VIDEO_URL_TEMPLATE % {
-            'entry_id': kaltura_id,
-            'wid': self._search_regex(r'/wid/_([0-9]+)/', webpage, 'wid'),
-            'uid': self._search_regex(r'uiconf_id/([0-9]+)/', webpage, 'uid'),
-        }
+        wid = self._search_regex(r'/wid/_([0-9]+)/', webpage, 'wid')

         return {
             'id': mobj.group('id'),
-            'url': video_url,
-            'ext': 'mp4',
+            '_type': 'url_transparent',
+            'url': 'kaltura:%s:%s' % (wid, kaltura_id),
+            'ie_key': 'Kaltura',
             'display_id': display_id,
             'uploader_id': mobj.group('uploader_id'),
             'thumbnail': thumbnail,
             'description': self._html_search_meta('description', webpage),
-            'upload_date': upload_date,
             'location': location,
            'title': title,
             'uploader': uploader,
-            'duration': duration,
         }
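
The `'_type': 'url_transparent'` result above means: hand the URL off to another extractor (here Kaltura, via the `kaltura:wid:entry_id` scheme) but keep the metadata extracted on this page. A minimal sketch of such a result dict, with illustrative IDs:

    def build_result(wid, kaltura_id):
        return {
            '_type': 'url_transparent',           # delegate, but merge our fields in
            'url': 'kaltura:%s:%s' % (wid, kaltura_id),
            'ie_key': 'Kaltura',                  # skip extractor auto-detection
            'title': 'Straight from the Heart',
            'location': 'Capitol Theatre (Passaic, NJ)',
        }

    print(build_result('330537', '1_abcdef')['url'])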

View File

@@ -4,6 +4,7 @@ from __future__ import unicode_literals
 import re

 from .common import InfoExtractor
+from ..compat import compat_str
 from ..utils import (
     ExtractorError,
     float_or_none,

@@ -158,7 +159,9 @@ class NRKTVIE(InfoExtractor):
     def _get_subtitles(self, subtitlesurl, video_id, baseurl):
         url = "%s%s" % (baseurl, subtitlesurl)
         self._debug_print('%s: Subtitle url: %s' % (video_id, url))
-        captions = self._download_xml(url, video_id, 'Downloading subtitles')
+        captions = self._download_xml(
+            url, video_id, 'Downloading subtitles',
+            transform_source=lambda s: s.replace(r'<br />', '\r\n'))
         lang = captions.get('lang', 'no')
         ps = captions.findall('./{0}body/{0}div/{0}p'.format('{http://www.w3.org/ns/ttml}'))
         srt = ''

@@ -167,8 +170,7 @@ class NRKTVIE(InfoExtractor):
             duration = parse_duration(p.get('dur'))
             starttime = self._seconds2str(begin)
             endtime = self._seconds2str(begin + duration)
-            text = '\n'.join(p.itertext())
-            srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (str(pos), starttime, endtime, text)
+            srt += '%s\r\n%s --> %s\r\n%s\r\n\r\n' % (compat_str(pos), starttime, endtime, p.text)
         return {lang: [
             {'ext': 'ttml', 'url': url},
             {'ext': 'srt', 'data': srt},
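
The `transform_source` hook used above rewrites the raw document before XML parsing; here it turns `<br />` line breaks into CRLFs so each caption parses as plain text. A self-contained illustration against a made-up TTML snippet:

    import xml.etree.ElementTree as ET

    def transform_source(s):
        return s.replace(r'<br />', '\r\n')

    raw = '<tt><body><div><p begin="0s" dur="2s">line one<br />line two</p></div></body></tt>'
    captions = ET.fromstring(transform_source(raw))
    print(captions.find('./body/div/p').text)  # both lines, CRLF-separated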

View File

@@ -0,0 +1,85 @@
# coding: utf-8
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    unified_strdate,
    int_or_none,
    qualities,
)


class OdnoklassnikiIE(InfoExtractor):
    _VALID_URL = r'https?://(?:odnoklassniki|ok)\.ru/(?:video|web-api/video/moviePlayer)/(?P<id>\d+)'
    _TESTS = [{
        'url': 'http://ok.ru/video/20079905452',
        'md5': '8e24ad2da6f387948e7a7d44eb8668fe',
        'info_dict': {
            'id': '20079905452',
            'ext': 'mp4',
            'title': 'Культура меняет нас (прекрасный ролик!))',
            'duration': 100,
            'upload_date': '20141207',
            'uploader_id': '330537914540',
            'uploader': 'Виталий Добровольский',
            'like_count': int,
            'age_limit': 0,
        },
    }, {
        'url': 'http://ok.ru/web-api/video/moviePlayer/20079905452',
        'only_matching': True,
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)

        webpage = self._download_webpage(url, video_id)

        player = self._parse_json(
            self._search_regex(
                r"OKVideo\.start\(({.+?})\s*,\s*'VideoAutoplay_player'", webpage, 'player'),
            video_id)

        metadata = self._parse_json(player['flashvars']['metadata'], video_id)

        movie = metadata['movie']
        title = movie['title']
        thumbnail = movie.get('poster')
        duration = int_or_none(movie.get('duration'))

        author = metadata.get('author', {})
        uploader_id = author.get('id')
        uploader = author.get('name')

        upload_date = unified_strdate(self._html_search_meta(
            'ya:ovs:upload_date', webpage, 'upload date'))

        age_limit = None
        adult = self._html_search_meta(
            'ya:ovs:adult', webpage, 'age limit')
        if adult:
            age_limit = 18 if adult == 'true' else 0

        like_count = int_or_none(metadata.get('likeCount'))

        quality = qualities(('mobile', 'lowest', 'low', 'sd', 'hd'))

        formats = [{
            'url': f['url'],
            'ext': 'mp4',
            'format_id': f['name'],
            'quality': quality(f['name']),
        } for f in metadata['videos']]

        return {
            'id': video_id,
            'title': title,
            'thumbnail': thumbnail,
            'duration': duration,
            'upload_date': upload_date,
            'uploader': uploader,
            'uploader_id': uploader_id,
            'like_count': like_count,
            'age_limit': age_limit,
            'formats': formats,
        }
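
The `qualities()` helper used above orders formats by mapping a known name list to ascending preference values; unknown names sort below everything. A small re-implementation of that idea, runnable on its own:

    def qualities(quality_ids):
        def q(qid):
            try:
                return quality_ids.index(qid)
            except ValueError:
                return -1
        return q

    quality = qualities(('mobile', 'lowest', 'low', 'sd', 'hd'))
    for name in ('hd', 'mobile', 'unknown'):
        print(name, quality(name))  # hd 4, mobile 0, unknown -1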

View File

@@ -0,0 +1,88 @@
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from .common import InfoExtractor
from ..utils import (
    ExtractorError,
    unified_strdate,
    int_or_none,
)


class Puls4IE(InfoExtractor):
    _VALID_URL = r'https?://(?:www\.)?puls4\.com/video/[^/]+/play/(?P<id>[0-9]+)'
    _TESTS = [{
        'url': 'http://www.puls4.com/video/pro-und-contra/play/2716816',
        'md5': '49f6a6629747eeec43cef6a46b5df81d',
        'info_dict': {
            'id': '2716816',
            'ext': 'mp4',
            'title': 'Pro und Contra vom 23.02.2015',
            'description': 'md5:293e44634d9477a67122489994675db6',
            'duration': 2989,
            'upload_date': '20150224',
            'uploader': 'PULS_4',
        },
        'skip': 'Only works from Germany',
    }, {
        'url': 'http://www.puls4.com/video/kult-spielfilme/play/1298106',
        'md5': '6a48316c8903ece8dab9b9a7bf7a59ec',
        'info_dict': {
            'id': '1298106',
            'ext': 'mp4',
            'title': 'Lucky Fritz',
        },
        'skip': 'Only works from Germany',
    }]

    def _real_extract(self, url):
        video_id = self._match_id(url)
        webpage = self._download_webpage(url, video_id)

        error_message = self._html_search_regex(
            r'<div class="message-error">(.+?)</div>',
            webpage, 'error message', default=None)
        if error_message:
            raise ExtractorError(
                '%s returned error: %s' % (self.IE_NAME, error_message), expected=True)

        real_url = self._html_search_regex(
            r'\"fsk-button\".+?href=\"([^"]+)',
            webpage, 'fsk_button', default=None)
        if real_url:
            webpage = self._download_webpage(real_url, video_id)

        player = self._search_regex(
            r'p4_video_player(?:_iframe)?\("video_\d+_container"\s*,(.+?)\);\s*\}',
            webpage, 'player')

        player_json = self._parse_json(
            '[%s]' % player, video_id,
            transform_source=lambda s: s.replace('undefined,', ''))

        formats = None
        result = None

        for v in player_json:
            if isinstance(v, list) and not formats:
                formats = [{
                    'url': f['url'],
                    'format': 'hd' if f.get('hd') else 'sd',
                    'width': int_or_none(f.get('size_x')),
                    'height': int_or_none(f.get('size_y')),
                    'tbr': int_or_none(f.get('bitrate')),
                } for f in v]
                self._sort_formats(formats)
            elif isinstance(v, dict) and not result:
                result = {
                    'id': video_id,
                    'title': v['videopartname'].strip(),
                    'description': v.get('videotitle'),
                    'duration': int_or_none(v.get('videoduration') or v.get('episodeduration')),
                    'upload_date': unified_strdate(v.get('clipreleasetime')),
                    'uploader': v.get('channel'),
                }

        result['formats'] = formats

        return result

View File

@@ -146,7 +146,7 @@ class RTLnowIE(InfoExtractor):
             mobj = re.search(r'.*/(?P<hoster>[^/]+)/videos/(?P<play_path>.+)\.f4m', filename.text)
             if mobj:
                 fmt = {
-                    'url': 'rtmpe://fmspay-fra2.rtl.de/' + mobj.group('hoster'),
+                    'url': 'rtmpe://fms.rtl.de/' + mobj.group('hoster'),
                     'play_path': 'mp4:' + mobj.group('play_path'),
                     'page_url': url,
                     'player_url': video_page_url + 'includes/vodplayer.swf',

View File

@@ -8,8 +8,9 @@ import time
 from .common import InfoExtractor
 from ..compat import compat_urlparse
 from ..utils import (
-    struct_unpack,
+    float_or_none,
     remove_end,
+    struct_unpack,
 )

@@ -67,6 +68,7 @@ class RTVEALaCartaIE(InfoExtractor):
             'id': '2491869',
             'ext': 'mp4',
             'title': 'Balonmano - Swiss Cup masculina. Final: España-Suecia',
+            'duration': 5024.566,
         },
     }, {
         'note': 'Live stream',

@@ -113,6 +115,7 @@ class RTVEALaCartaIE(InfoExtractor):
             'thumbnail': info.get('image'),
             'page_url': url,
             'subtitles': subtitles,
+            'duration': float_or_none(info.get('duration'), scale=1000),
         }

     def _get_subtitles(self, video_id, sub_file):

View File

@@ -180,7 +180,7 @@ class SoundcloudIE(InfoExtractor):
                     'format_id': key,
                     'url': url,
                     'play_path': 'mp3:' + path,
-                    'ext': ext,
+                    'ext': 'flv',
                     'vcodec': 'none',
                 })

@@ -200,8 +200,9 @@ class SoundcloudIE(InfoExtractor):
                 if f['format_id'].startswith('rtmp'):
                     f['protocol'] = 'rtmp'

-            self._sort_formats(formats)
-            result['formats'] = formats
+            self._check_formats(formats, track_id)
+            self._sort_formats(formats)
+            result['formats'] = formats

         return result

View File

@@ -1,6 +1,8 @@
 # coding: utf-8
 from __future__ import unicode_literals

+import re
+
 from .common import InfoExtractor
 from ..utils import (
     determine_ext,

@@ -8,23 +10,40 @@ from ..utils import (


 class SVTPlayIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?svtplay\.se/video/(?P<id>[0-9]+)'
-    _TEST = {
+    IE_DESC = 'SVT Play and Öppet arkiv'
+    _VALID_URL = r'https?://(?:www\.)?(?P<host>svtplay|oppetarkiv)\.se/video/(?P<id>[0-9]+)'
+    _TESTS = [{
         'url': 'http://www.svtplay.se/video/2609989/sm-veckan/sm-veckan-rally-final-sasong-1-sm-veckan-rally-final',
-        'md5': 'f4a184968bc9c802a9b41316657aaa80',
+        'md5': 'ade3def0643fa1c40587a422f98edfd9',
         'info_dict': {
             'id': '2609989',
-            'ext': 'mp4',
+            'ext': 'flv',
             'title': 'SM veckan vinter, Örebro - Rally, final',
             'duration': 4500,
             'thumbnail': 're:^https?://.*[\.-]jpg$',
+            'age_limit': 0,
         },
-    }
+    }, {
+        'url': 'http://www.oppetarkiv.se/video/1058509/rederiet-sasong-1-avsnitt-1-av-318',
+        'md5': 'c3101a17ce9634f4c1f9800f0746c187',
+        'info_dict': {
+            'id': '1058509',
+            'ext': 'flv',
+            'title': 'Farlig kryssning',
+            'duration': 2566,
+            'thumbnail': 're:^https?://.*[\.-]jpg$',
+            'age_limit': 0,
+        },
+        'skip': 'Only works from Sweden',
+    }]

     def _real_extract(self, url):
-        video_id = self._match_id(url)
+        mobj = re.match(self._VALID_URL, url)
+        video_id = mobj.group('id')
+        host = mobj.group('host')

         info = self._download_json(
-            'http://www.svtplay.se/video/%s?output=json' % video_id, video_id)
+            'http://www.%s.se/video/%s?output=json' % (host, video_id), video_id)

         title = info['context']['title']
         thumbnail = info['context'].get('thumbnailImage')

@@ -33,11 +52,16 @@ class SVTPlayIE(InfoExtractor):
         formats = []
         for vr in video_info['videoReferences']:
             vurl = vr['url']
-            if determine_ext(vurl) == 'm3u8':
+            ext = determine_ext(vurl)
+            if ext == 'm3u8':
                 formats.extend(self._extract_m3u8_formats(
                     vurl, video_id,
                     ext='mp4', entry_protocol='m3u8_native',
                     m3u8_id=vr.get('playerType')))
+            elif ext == 'f4m':
+                formats.extend(self._extract_f4m_formats(
+                    vurl + '?hdcore=3.3.0', video_id,
+                    f4m_id=vr.get('playerType')))
             else:
                 formats.append({
                     'format_id': vr.get('playerType'),

@@ -46,6 +70,7 @@ class SVTPlayIE(InfoExtractor):
         self._sort_formats(formats)

         duration = video_info.get('materialLength')
+        age_limit = 18 if video_info.get('inappropriateForChildren') else 0

         return {
             'id': video_id,

@@ -53,4 +78,5 @@ class SVTPlayIE(InfoExtractor):
             'formats': formats,
             'thumbnail': thumbnail,
             'duration': duration,
+            'age_limit': age_limit,
         }

youtube_dl/extractor/telecinco.py

@@ -6,9 +6,9 @@ from .mitele import MiTeleIE

 class TelecincoIE(MiTeleIE):
     IE_NAME = 'telecinco.es'
-    _VALID_URL = r'https?://www\.telecinco\.es/[^/]+/[^/]+/[^/]+/(?P<id>.*?)\.html'
+    _VALID_URL = r'https?://www\.telecinco\.es/[^/]+/[^/]+/(?:[^/]+/)?(?P<id>.*?)\.html'

-    _TEST = {
+    _TESTS = [{
         'url': 'http://www.telecinco.es/robinfood/temporada-01/t01xp14/Bacalao-cocochas-pil-pil_0_1876350223.html',
         'info_dict': {
             'id': 'MDSVID20141015_0058',
@@ -16,4 +16,7 @@ class TelecincoIE(MiTeleIE):
             'title': 'Con Martín Berasategui, hacer un bacalao al ...',
             'duration': 662,
         },
-    }
+    }, {
+        'url': 'http://www.telecinco.es/informativos/nacional/Pablo_Iglesias-Informativos_Telecinco-entrevista-Pedro_Piqueras_2_1945155182.html',
+        'only_matching': True,
+    }]

youtube_dl/extractor/twitch.py

@@ -34,7 +34,15 @@ class TwitchBaseIE(InfoExtractor):
                 expected=True)

     def _download_json(self, url, video_id, note='Downloading JSON metadata'):
-        response = super(TwitchBaseIE, self)._download_json(url, video_id, note)
+        headers = {
+            'Referer': 'http://api.twitch.tv/crossdomain/receiver.html?v=2',
+            'X-Requested-With': 'XMLHttpRequest',
+        }
+        for cookie in self._downloader.cookiejar:
+            if cookie.name == 'api_token':
+                headers['Twitch-Api-Token'] = cookie.value
+        request = compat_urllib_request.Request(url, headers=headers)
+        response = super(TwitchBaseIE, self)._download_json(request, video_id, note)
         self._handle_error(response)
         return response

youtube_dl/extractor/vk.py

@@ -31,7 +31,7 @@ class VKIE(InfoExtractor):
             'id': '162222515',
             'ext': 'flv',
             'title': 'ProtivoGunz - Хуёвая песня',
-            'uploader': 're:Noize MC.*',
+            'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
             'duration': 195,
             'upload_date': '20120212',
         },
@@ -140,7 +140,7 @@ class VKIE(InfoExtractor):
         if not video_id:
             video_id = '%s_%s' % (mobj.group('oid'), mobj.group('id'))

-        info_url = 'http://vk.com/al_video.php?act=show&al=1&video=%s' % video_id
+        info_url = 'http://vk.com/al_video.php?act=show&al=1&module=video&video=%s' % video_id
         info_page = self._download_webpage(info_url, video_id)

         ERRORS = {
@@ -152,7 +152,10 @@ class VKIE(InfoExtractor):
             'use --username and --password options to provide account credentials.',

             r'<!>Unknown error':
-            'Video %s does not exist.'
+            'Video %s does not exist.',
+
+            r'<!>Видео временно недоступно':
+            'Video %s is temporarily unavailable.',
         }

         for error_re, error_msg in ERRORS.items():

youtube_dl/extractor/wdr.py

@@ -28,6 +28,7 @@ class WDRIE(InfoExtractor):
             'title': 'Servicezeit',
             'description': 'md5:c8f43e5e815eeb54d0b96df2fba906cb',
             'upload_date': '20140310',
+            'is_live': False
         },
         'params': {
             'skip_download': True,
@@ -41,6 +42,7 @@ class WDRIE(InfoExtractor):
             'title': 'Marga Spiegel ist tot',
             'description': 'md5:2309992a6716c347891c045be50992e4',
             'upload_date': '20140311',
+            'is_live': False
         },
         'params': {
             'skip_download': True,
@@ -55,6 +57,7 @@ class WDRIE(InfoExtractor):
             'title': 'Erlebte Geschichten: Marga Spiegel (29.11.2009)',
             'description': 'md5:2309992a6716c347891c045be50992e4',
             'upload_date': '20091129',
+            'is_live': False
         },
     },
     {
@@ -66,6 +69,7 @@ class WDRIE(InfoExtractor):
             'title': 'Flavia Coelho: Amar é Amar',
             'description': 'md5:7b29e97e10dfb6e265238b32fa35b23a',
             'upload_date': '20140717',
+            'is_live': False
         },
     },
     {
@@ -74,6 +78,20 @@ class WDRIE(InfoExtractor):
         'info_dict': {
             'id': 'mediathek/video/sendungen/quarks_und_co/filterseite-quarks-und-co100',
         }
+    },
+    {
+        'url': 'http://www1.wdr.de/mediathek/video/livestream/index.html',
+        'info_dict': {
+            'id': 'mdb-103364',
+            'title': 're:^WDR Fernsehen [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
+            'description': 'md5:ae2ff888510623bf8d4b115f95a9b7c9',
+            'ext': 'flv',
+            'upload_date': '20150212',
+            'is_live': True
+        },
+        'params': {
+            'skip_download': True,
+        },
     }
 ]
@@ -119,6 +137,10 @@ class WDRIE(InfoExtractor):
         video_url = flashvars['dslSrc'][0]
         title = flashvars['trackerClipTitle'][0]
         thumbnail = flashvars['startPicture'][0] if 'startPicture' in flashvars else None
+        is_live = flashvars.get('isLive', ['0'])[0] == '1'
+
+        if is_live:
+            title = self._live_title(title)

         if 'trackerClipAirTime' in flashvars:
             upload_date = flashvars['trackerClipAirTime'][0]
@@ -131,6 +153,13 @@ class WDRIE(InfoExtractor):
         if video_url.endswith('.f4m'):
             video_url += '?hdcore=3.2.0&plugin=aasp-3.2.0.77.18'
             ext = 'flv'
+        elif video_url.endswith('.smil'):
+            fmt = self._extract_smil_formats(video_url, page_id)[0]
+            video_url = fmt['url']
+            sep = '&' if '?' in video_url else '?'
+            video_url += sep
+            video_url += 'hdcore=3.3.0&plugin=aasp-3.3.0.99.43'
+            ext = fmt['ext']
         else:
             ext = determine_ext(video_url)
@@ -144,6 +173,7 @@ class WDRIE(InfoExtractor):
             'description': description,
             'thumbnail': thumbnail,
             'upload_date': upload_date,
+            'is_live': is_live
         }

youtube_dl/options.py

@@ -8,11 +8,11 @@ import sys
 from .downloader.external import list_external_downloaders
 from .compat import (
     compat_expanduser,
+    compat_get_terminal_size,
     compat_getenv,
     compat_kwargs,
 )
 from .utils import (
-    get_term_width,
     write_string,
 )
 from .version import __version__
@@ -100,7 +100,7 @@ def parseOpts(overrideArguments=None):
         return opts

     # No need to wrap help messages if we're on a wide console
-    columns = get_term_width()
+    columns = compat_get_terminal_size().columns
     max_width = columns if columns else 80
     max_help_position = 80
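
compat_get_terminal_size replaces utils.get_term_width; on Python >= 3.3 it can simply delegate to shutil.get_terminal_size, with a hand-rolled fallback elsewhere. A minimal sketch of such a shim (the fallback shown here is illustrative, not the exact compat.py code):

    import os
    import shutil
    from collections import namedtuple

    _terminal_size = namedtuple('terminal_size', ['columns', 'lines'])

    try:
        compat_get_terminal_size = shutil.get_terminal_size  # Python 3.3+
    except AttributeError:
        def compat_get_terminal_size():
            # Fall back to the COLUMNS/LINES environment variables; leave the
            # fields as None when the size cannot be determined.
            columns = int(os.environ['COLUMNS']) if 'COLUMNS' in os.environ else None
            lines = int(os.environ['LINES']) if 'LINES' in os.environ else None
            return _terminal_size(columns, lines)

Note that parseOpts above still guards with `columns if columns else 80`, so a None column count degrades gracefully.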
@@ -272,6 +272,10 @@ def parseOpts(overrideArguments=None):
         '--no-playlist',
         action='store_true', dest='noplaylist', default=False,
         help='If the URL refers to a video and a playlist, download only the video.')
+    selection.add_option(
+        '--yes-playlist',
+        action='store_false', dest='noplaylist', default=False,
+        help='If the URL refers to a video and a playlist, download the playlist.')
     selection.add_option(
         '--age-limit',
         metavar='YEARS', dest='age_limit', default=None, type=int,
@@ -431,8 +435,12 @@ def parseOpts(overrideArguments=None):
     downloader.add_option(
         '--external-downloader',
         dest='external_downloader', metavar='COMMAND',
-        help='(experimental) Use the specified external downloader. '
+        help='Use the specified external downloader. '
             'Currently supports %s' % ','.join(list_external_downloaders()))
+    downloader.add_option(
+        '--external-downloader-args',
+        dest='external_downloader_args', metavar='ARGS',
+        help='Give these arguments to the external downloader.')

     workarounds = optparse.OptionGroup(parser, 'Workarounds')
     workarounds.add_option(
@@ -747,6 +755,10 @@ def parseOpts(overrideArguments=None):
         '--exec',
         metavar='CMD', dest='exec_cmd',
         help='Execute a command on the file after downloading, similar to find\'s -exec syntax. Example: --exec \'adb push {} /sdcard/Music/ && rm {}\'')
+    postproc.add_option(
+        '--convert-subtitles', '--convert-subs',
+        metavar='FORMAT', dest='convertsubtitles', default=None,
+        help='Convert the subtitles to other format (currently supported: srt|ass|vtt)')

     parser.add_option_group(general)
     parser.add_option_group(network)
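
Together, the new switches can be exercised like this (the URL and the aria2c arguments are placeholders; aria2c is assumed to appear in list_external_downloaders()):

    # Download the playlist form of an ambiguous URL, delegate the transfer
    # to aria2c with custom flags, and convert fetched subtitles to SRT:
    youtube-dl --yes-playlist \
        --external-downloader aria2c --external-downloader-args '-x 4' \
        --convert-subs srt 'https://example.com/some-playlist'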

youtube_dl/postprocessor/__init__.py

@@ -11,6 +11,7 @@ from .ffmpeg import (
     FFmpegMergerPP,
     FFmpegMetadataPP,
     FFmpegVideoConvertorPP,
+    FFmpegSubtitlesConvertorPP,
 )
 from .xattrpp import XAttrMetadataPP
 from .execafterdownload import ExecAfterDownloadPP
@@ -31,6 +32,7 @@ __all__ = [
     'FFmpegMergerPP',
     'FFmpegMetadataPP',
     'FFmpegPostProcessor',
+    'FFmpegSubtitlesConvertorPP',
     'FFmpegVideoConvertorPP',
     'XAttrMetadataPP',
 ]

youtube_dl/postprocessor/ffmpeg.py

@@ -1,5 +1,6 @@
 from __future__ import unicode_literals

+import io
 import os
 import subprocess
 import sys
@@ -635,3 +636,40 @@ class FFmpegFixupM4aPP(FFmpegPostProcessor):
         os.rename(encodeFilename(temp_filename), encodeFilename(filename))

         return True, info
+
+
+class FFmpegSubtitlesConvertorPP(FFmpegPostProcessor):
+    def __init__(self, downloader=None, format=None):
+        super(FFmpegSubtitlesConvertorPP, self).__init__(downloader)
+        self.format = format
+
+    def run(self, info):
+        subs = info.get('requested_subtitles')
+        filename = info['filepath']
+        new_ext = self.format
+        new_format = new_ext
+        if new_format == 'vtt':
+            new_format = 'webvtt'
+        if subs is None:
+            self._downloader.to_screen('[ffmpeg] There aren\'t any subtitles to convert')
+            return True, info
+        self._downloader.to_screen('[ffmpeg] Converting subtitles')
+        for lang, sub in subs.items():
+            ext = sub['ext']
+            if ext == new_ext:
+                self._downloader.to_screen(
+                    '[ffmpeg] Subtitle file for %s is already in the requested '
+                    'format' % new_ext)
+                continue
+            new_file = subtitles_filename(filename, lang, new_ext)
+            self.run_ffmpeg(
+                subtitles_filename(filename, lang, ext),
+                new_file, ['-f', new_format])
+
+            with io.open(new_file, 'rt', encoding='utf-8') as f:
+                subs[lang] = {
+                    'ext': ext,
+                    'data': f.read(),
+                }
+
+        return True, info
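
For embedding use, the convertor is assumed to be attachable in the same way the --convert-subs option wires it up internally; a hedged sketch (the URL is a placeholder, and the option names follow the usual YoutubeDL params convention):

    from youtube_dl import YoutubeDL
    from youtube_dl.postprocessor import FFmpegSubtitlesConvertorPP

    ydl = YoutubeDL({
        'writesubtitles': True,    # request subtitle files
        'subtitleslangs': ['en'],  # assumed to exist for the target video
        'skip_download': True,     # subtitles only, for this illustration
    })
    # Roughly what passing --convert-subs srt arranges:
    ydl.add_post_processor(FFmpegSubtitlesConvertorPP(ydl, format='srt'))
    ydl.download(['https://www.youtube.com/watch?v=XXXXXXXXXXX'])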

youtube_dl/utils.py

@@ -35,7 +35,6 @@ import zlib
 from .compat import (
     compat_basestring,
     compat_chr,
-    compat_getenv,
     compat_html_entities,
     compat_http_client,
     compat_parse_qs,
@@ -54,7 +53,7 @@ from .compat import (
 compiled_regex_type = type(re.compile(''))

 std_headers = {
-    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0 (Chrome)',
+    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20150101 Firefox/20.0 (Chrome)',
     'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
     'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
     'Accept-Encoding': 'gzip, deflate',
@@ -306,6 +305,7 @@ def sanitize_filename(s, restricted=False, is_id=False):
         result = result[2:]
     if result.startswith('-'):
         result = '_' + result[len('-'):]
+    result = result.lstrip('.')
     if not result:
         result = '_'
     return result
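
The added lstrip('.') keeps sanitized names from beginning with a dot, which would create hidden files on Unix-like systems. Expected outcomes, given the surrounding fallback logic:

    from youtube_dl.utils import sanitize_filename

    sanitize_filename('.hidden-clip.mp4')  # -> 'hidden-clip.mp4'
    sanitize_filename('...')               # -> '_' (empty after stripping, so the fallback applies)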
@@ -1173,22 +1173,6 @@ def parse_filesize(s):
return int(float(num_str) * mult) return int(float(num_str) * mult)
def get_term_width():
columns = compat_getenv('COLUMNS', None)
if columns:
return int(columns)
try:
sp = subprocess.Popen(
['stty', 'size'],
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = sp.communicate()
return int(out.split()[1])
except:
pass
return None
def month_by_name(name): def month_by_name(name):
""" Return the number of a month by (locale-independently) English name """ """ Return the number of a month by (locale-independently) English name """
@@ -1290,6 +1274,7 @@ def parse_duration(s):
(?P<only_mins>[0-9.]+)\s*(?:mins?|minutes?)\s*| (?P<only_mins>[0-9.]+)\s*(?:mins?|minutes?)\s*|
(?P<only_hours>[0-9.]+)\s*(?:hours?)| (?P<only_hours>[0-9.]+)\s*(?:hours?)|
\s*(?P<hours_reversed>[0-9]+)\s*(?:[:h]|hours?)\s*(?P<mins_reversed>[0-9]+)\s*(?:[:m]|mins?|minutes?)\s*|
(?: (?:
(?: (?:
(?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)? (?:(?P<days>[0-9]+)\s*(?:[:d]|days?)\s*)?
@@ -1308,10 +1293,14 @@ def parse_duration(s):
return float_or_none(m.group('only_hours'), invscale=60 * 60) return float_or_none(m.group('only_hours'), invscale=60 * 60)
if m.group('secs'): if m.group('secs'):
res += int(m.group('secs')) res += int(m.group('secs'))
if m.group('mins_reversed'):
res += int(m.group('mins_reversed')) * 60
if m.group('mins'): if m.group('mins'):
res += int(m.group('mins')) * 60 res += int(m.group('mins')) * 60
if m.group('hours'): if m.group('hours'):
res += int(m.group('hours')) * 60 * 60 res += int(m.group('hours')) * 60 * 60
if m.group('hours_reversed'):
res += int(m.group('hours_reversed')) * 60 * 60
if m.group('days'): if m.group('days'):
res += int(m.group('days')) * 24 * 60 * 60 res += int(m.group('days')) * 24 * 60 * 60
if m.group('ms'): if m.group('ms'):
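
With the reversed groups in place, hour-first spellings should now resolve as well; for example:

    from youtube_dl.utils import parse_duration

    parse_duration('2h 30m')             # -> 9000  (2 * 3600 + 30 * 60)
    parse_duration('1 hour 20 minutes')  # -> 4800  (hours_reversed / mins_reversed)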

youtube_dl/version.py

@@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2015.02.24.1'
+__version__ = '2015.03.03'