Compare commits

...

97 Commits

Author SHA1 Message Date
Sergey M․
4ac0f573ef release 2017.05.07 2017-05-07 04:51:34 +07:00
Sergey M․
3892a9f4ab [ChangeLog] Actualize 2017-05-07 04:44:54 +07:00
Sergey M․
3995d37da5 [youtube] Fix TFA (#12927) 2017-05-07 04:19:11 +07:00
Sergey M․
e4a75d7932 [test_youtube_chapters] PEP 8 2017-05-07 00:00:11 +07:00
Sergey M․
e00eb564e9 [youtube] Fix authentication (closes #12927) 2017-05-06 23:58:47 +07:00
Yen Chi Hsuan
10c87c151b [utils] Rename try_multipart_encode to _multipart_encode_impl
To state that this is an internal function and people should be careful
when using it outside youtube-dl.
2017-05-06 19:06:18 +08:00
Yen Chi Hsuan
228cd9bb90 [bilibili] Fix video downloading (closes #13001) 2017-05-06 18:58:38 +08:00
Sergey M․
566fbbaefd [rmcdecouverte] Improve (closes #12937) 2017-05-06 17:56:10 +07:00
midas02
74c09c852a [rmcdecouverte] Fix extraction 2017-05-06 17:56:10 +07:00
Remita Amine
fd178b8748 [theplatform] extract chapters 2017-05-06 07:19:07 +01:00
Sergey M․
a57a8e9918 [test_youtube_chapters] Add coding cookie 2017-05-06 05:30:56 +07:00
Tithen-Firion
1f9fefe7f5 [crackle] Update test 2017-05-06 03:39:14 +07:00
Luca Steeb
8b4774dcac [bandcamp] Fix thumbnail extraction 2017-05-06 03:35:42 +07:00
Sergey M․
a99cc4ca16 [pornhub] Extend _VALID_URL (closes #12996) 2017-05-06 02:46:37 +07:00
Sergey M․
9cafc3fd8b [youtube] Extract chapters 2017-05-06 02:27:06 +07:00
Sergey M․
329e3dd5ad [nrk] Extract chapters 2017-05-05 22:59:15 +07:00
Remita Amine
1d9e0a4f40 [vice] update tests and add support for ooyala embeds in article pages 2017-05-05 16:13:12 +01:00
Sergey M․
7ad53cb7ff [laola1tv] PEP 8 2017-05-05 21:59:23 +07:00
Yen Chi Hsuan
b2ad479d17 [utils] Fix multipart_encode for Python < 3.5 2017-05-05 20:51:59 +08:00
Yen Chi Hsuan
4ac6dc3732 [vice] Support Vice articles (closes #12968) 2017-05-05 20:26:51 +08:00
Yen Chi Hsuan
cc7bda4fff [vice] Fix extraction for non en_us videos (closes #12967) 2017-05-05 20:01:02 +08:00
Yen Chi Hsuan
50ad078b7b [gdcvault] Fix extraction for videos with gdc-player.html
Closes #12733
2017-05-05 15:13:40 +08:00
Sergey M․
4947f13cd0 [pbs] Improve multipart video support (closes #12981) 2017-05-04 22:42:49 +07:00
Sergey M․
7f09e523e8 [laola1tv:embed] Fix tests 2017-05-04 22:41:47 +07:00
Remita Amine
4fe14732a2 [laola1tv] fix extraction(closes #12880) 2017-05-04 16:07:08 +01:00
Remita Amine
ff6f9a6704 [extractor/common] fix typo in _extract_akamai_formats 2017-05-04 16:07:08 +01:00
Yen Chi Hsuan
0c26548601 [cda] Implement birthday verification (closes #12789) 2017-05-04 16:26:17 +08:00
Yen Chi Hsuan
5401bea27f [leeco] Fix extraction (closes #12974)
Seems on mobile devices a similar API is used, but I always get an AD
with mimicking that API.
2017-05-04 03:18:56 +08:00
remitamine
7a6d33a9a5 [pbs] extract chapters information 2017-05-02 20:41:48 +01:00
remitamine
fa2a36d9bc [ffmpeg] add support for chapters field postprocessing 2017-05-02 20:41:48 +01:00
remitamine
55949fede6 [common] introduce chapters field 2017-05-02 20:41:48 +01:00
Remita Amine
7fc875195f [amp] imporove thumbnail and subtitle extraction 2017-05-02 00:06:58 +01:00
Tithen-Firion
c6fe5a7e12 [douyutv] Update test 2017-05-02 03:45:27 +08:00
Tithen-Firion
ae21d2fd94 [dotsub] Update test 2017-05-02 02:56:44 +08:00
Tithen-Firion
77481f1386 [democracynow] Update test 2017-05-02 01:38:31 +07:00
Tithen-Firion
d86d169dd5 [dailymotion] Add working test 2017-05-02 01:37:23 +07:00
Tithen-Firion
b9f9f361fa [crunchyroll] Update test 2017-05-02 00:56:51 +07:00
Remita Amine
ab39a25c75 [foxsports] fix extraction(closes #12945) 2017-05-01 09:02:41 +01:00
Tithen-Firion
a146fa1c68 [coub] Update test and remove comment count extraction 2017-05-01 05:54:44 +07:00
Sergey M․
e0c1e9a98c release 2017.05.01 2017-05-01 01:39:52 +07:00
Sergey M․
086041e2f8 [ChangeLog] Actualize 2017-05-01 01:34:51 +07:00
Sergey M․
74da856544 [infoq] Make audio format extraction non fatal (closes #12938) 2017-05-01 01:23:05 +07:00
Sergey M․
9edf47df7b [brightcove] Allow whitespace around attribute names in embedded code 2017-05-01 01:03:47 +07:00
Sergey M․
238cec17ae [extractor/anvato] PEP 8 2017-04-30 22:04:21 +07:00
Sergey M․
50534b7158 [downloader/fragment] PEP 8 2017-04-30 22:04:01 +07:00
Sergey M․
9cd4209724 [zaq1] Improve extraction (closes #12693) 2017-04-30 21:46:05 +07:00
Sergey M․
33a81c2c6f [extractor/common] Extract view count from JSON-LD 2017-04-30 21:45:59 +07:00
Sergey M․
deef31955b [utils] Improve unified_timestamp
Seen at http://zaq1.pl/video/xev0e
2017-04-30 21:45:53 +07:00
slocum
9dac2cec2d [zaq1] Add new extractor 2017-04-30 21:45:47 +07:00
Sergey M․
6ec371cd9e [xvideos] Extract og:duration (closes #12828) 2017-04-30 18:14:01 +07:00
Sander
13081db1f5 [xvideos] Add video duration 2017-04-30 18:10:49 +07:00
Sergey M․
b07ea5eaec [vevo] Modernize 2017-04-30 17:58:22 +07:00
gritstub
5599253009 [vevo] Fix extraction (config.token.key) 2017-04-30 17:56:10 +07:00
Remita Amine
98ce1a3fd3 [utils] add video/mp2t to mimetype2ext 2017-04-30 09:03:10 +01:00
Yen Chi Hsuan
ba5c3caf88 [washingtonpost] Fix invalid escape sequence on Python 3.6 2017-04-30 02:15:28 +08:00
Sergey M․
b5c39537be [noovo] Improve extraction (closes #12792) 2017-04-30 00:24:25 +07:00
Frederic Bournival
1c7c76e4fb [noovo] Add extractor 2017-04-30 00:24:19 +07:00
John Hawkinson
557194591a [washingtonpost] Add support for embeds (closes #12699) 2017-04-29 23:07:26 +07:00
Yen Chi Hsuan
27e70a8f6c Merge pull request #12869 from Tithen-Firion/cbc-update-tests
[cbc] update test cases
2017-04-29 21:34:18 +08:00
Sergey M․
a4c81e4968 [yandexmusic:playlist] Fix extraction for python 3 (closes #12888) 2017-04-29 20:23:26 +07:00
Sergey M․
7986c3abcd [anvato] Improve extraction (closes #12913)
* Promote to regular shortcut based extractor
* Add mcp to access key mapping table
* Add support for embeds extraction
* Add support for anvato embeds in generic extractor
2017-04-29 19:49:04 +07:00
Yen Chi Hsuan
a1ebfd4494 Merge pull request #12854 from Tithen-Firion/appletrailer-test-fix
[appletrailers] update test cases
2017-04-29 19:24:38 +08:00
Yen Chi Hsuan
d19093bd50 Merge pull request #12906 from Tithen-Firion/clean-html-fix
[utils] Fix inconsistent output of clean_html
2017-04-29 15:58:45 +08:00
Yen Chi Hsuan
24eb7c2578 [xtube] Fix extraction with non-standard JSON 'sources'
Closes #12734

Thanks @paulguy for the fix!
2017-04-29 15:55:08 +08:00
Sergey M․
e7db6759e4 [downloader/external] Properly handle live stream downloading cancellation (closes #8932) 2017-04-29 04:33:35 +07:00
Sergey M․
b364c87c42 [tvplayer] Fix extraction (closes #12908) 2017-04-29 03:46:08 +07:00
Tithen-Firion
9222d94510 [test_utils] Add one more clean_html test 2017-04-28 18:05:14 +02:00
Tithen-Firion
edd9221cd2 [utils] Fix inconsistent output of clean_html
`\s` in Python 2.x doesn't match unicode whitespace characters by
default
2017-04-28 17:34:27 +02:00
Sergey M․
bc8a2ea071 release 2017.04.28 2017-04-28 18:30:03 +07:00
Sergey M․
7527923371 [ChangeLog] Actualize 2017-04-28 18:27:29 +07:00
Remita Amine
20783b8b50 [aenetworks] fix extraction for shows with single season 2017-04-28 12:04:56 +01:00
Remita Amine
bf2a5555c0 [go] add support for Disney, DisneyJunior and DisneyXD show pages 2017-04-28 09:48:52 +01:00
Remita Amine
fb8e8b2d16 [adobepass] use geo verification headers for all requests 2017-04-28 09:48:52 +01:00
Yen Chi Hsuan
b62985a9a5 [youtube] Recognize another HTML5 player URL (#12885) 2017-04-28 16:25:04 +08:00
Yen Chi Hsuan
e31fed95b4 [youtube] Recognize new locale-based player URLs (fixes #12885) 2017-04-28 15:48:30 +08:00
Tithen-Firion
3fd0f70f6a [cbslocal] Update test 2017-04-28 04:26:59 +07:00
Tithen-Firion
33c62efc32 [collegerama] Update tests 2017-04-28 04:00:49 +07:00
Tithen-Firion
6b4ddd336c [afreecatv] Fix title extraction 2017-04-28 04:00:15 +07:00
Tithen-Firion
c12b4b80f8 [archiveorg] Update test 2017-04-28 03:48:32 +07:00
Tithen-Firion
064fafe932 [appleconnect] Update test 2017-04-28 03:47:25 +07:00
Tithen-Firion
ac1a5b9a12 [audioboom] Update test 2017-04-28 03:36:28 +07:00
Tithen-Firion
a15777491a [atresplayer] Update test 2017-04-28 03:32:25 +07:00
Tithen-Firion
d8571dd6bf [bleacherreport] Update tests 2017-04-28 03:28:26 +07:00
Sergey M․
c0fa4245ce [downloader/fragment] Remove assert for resume_len when no fragments downloaded
This may be incorrect due some header (e.g. flv header in f4m downloader)
2017-04-28 03:26:19 +07:00
Tithen-Firion
8814ae42bc [beeg] Update test 2017-04-28 03:14:11 +07:00
Tithen-Firion
0f63dc2402 [bandcamp] Update test 2017-04-28 03:13:12 +07:00
Tithen-Firion
dde97ea8da [canalc2] Update test 2017-04-28 03:07:42 +07:00
Sergey M․
30bb6ce1a4 [test_InfoExtractor] Fix test_parse_m3u8_formats 2017-04-28 03:01:43 +07:00
Sergey M․
c89b49f743 [extractor/common] Add manifest_url for explicit group rendition formats 2017-04-28 03:00:14 +07:00
Tithen-Firion
6f4a888416 [br] Update test 2017-04-28 02:53:11 +07:00
Tithen-Firion
f5edd7ae51 [clipfish] Update test 2017-04-28 02:51:30 +07:00
Tithen-Firion
c95e2b5911 [cbc] update test cases 2017-04-27 18:07:07 +02:00
Tithen-Firion
374560f018 [test_download] Fix order when testing file's md5 2017-04-27 22:27:34 +07:00
Sergey M․
ff99fe529e Don't list master m3u8 playlists in format list (closes #12832) 2017-04-27 21:53:17 +07:00
Tithen-Firion
76c1951036 [appletrailers] update test cases 2017-04-27 10:04:21 +02:00
Lucas M
e8bfe2a946 [streamable] Add support for new embedded URL schema 2017-04-26 23:39:53 +07:00
Sergey M․
3dc8b61b7f [arte:+7] Relax _VALID_URL (closes #12837) 2017-04-26 01:55:29 +07:00
76 changed files with 1630 additions and 447 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.04.26*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.04.26**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2017.05.07*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2017.05.07**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2017.04.26
[debug] youtube-dl version 2017.05.07
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -1,3 +1,80 @@
version 2017.05.07
Common
* [extractor/common] Fix typo in _extract_akamai_formats
+ [postprocessor/ffmpeg] Embed chapters into media file with --add-metadata
+ [extractor/common] Introduce chapters meta field
Extractors
* [youtube] Fix authentication (#12820, #12927, #12973, #12992, #12993, #12995,
#13003)
* [bilibili] Fix video downloading (#13001)
* [rmcdecouverte] Fix extraction (#12937)
* [theplatform] Extract chapters
* [bandcamp] Fix thumbnail extraction (#12980)
* [pornhub] Extend URL regular expression (#12996)
+ [youtube] Extract chapters
+ [nrk] Extract chapters
+ [vice] Add support for ooyala embeds in article pages
+ [vice] Support vice articles (#12968)
* [vice] Fix extraction for non en_us videos (#12967)
* [gdcvault] Fix extraction for some videos (#12733)
* [pbs] Improve multipart video support (#12981)
* [laola1tv] Fix extraction (#12880)
+ [cda] Support birthday verification (#12789)
* [leeco] Fix extraction (#12974)
+ [pbs] Extract chapters
* [amp] Imporove thumbnail and subtitles extraction
* [foxsports] Fix extraction (#12945)
- [coub] Remove comment count extraction (#12941)
version 2017.05.01
Core
+ [extractor/common] Extract view count from JSON-LD
* [utils] Improve unified_timestamp
+ [utils] Add video/mp2t to mimetype2ext
* [downloader/external] Properly handle live stream downloading cancellation
(#8932)
+ [utils] Add support for unicode whitespace in clean_html on python 2 (#12906)
Extractors
* [infoq] Make audio format extraction non fatal (#12938)
* [brightcove] Allow whitespace around attribute names in embedded code
+ [zaq1] Add support for zaq1.pl (#12693)
+ [xvideos] Extract duration (#12828)
* [vevo] Fix extraction (#12879)
+ [noovo] Add support for noovo.ca (#12792)
+ [washingtonpost] Add support for embeds (#12699)
* [yandexmusic:playlist] Fix extraction for python 3 (#12888)
* [anvato] Improve extraction (#12913)
* Promote to regular shortcut based extractor
* Add mcp to access key mapping table
* Add support for embeds extraction
* Add support for anvato embeds in generic extractor
* [xtube] Fix extraction for older FLV videos (#12734)
* [tvplayer] Fix extraction (#12908)
version 2017.04.28
Core
+ [adobepass] Use geo verification headers for all requests
- [downloader/fragment] Remove assert for resume_len when no fragments
downloaded
+ [extractor/common] Add manifest_url for explicit group rendition formats
* [extractor/common] Fix manifest_url for m3u8 formats
- [extractor/common] Don't list master m3u8 playlists in format list (#12832)
Extractor
* [aenetworks] Fix extraction for shows with single season
+ [go] Add support for Disney, DisneyJunior and DisneyXD show pages
* [youtube] Recognize new locale-based player URLs (#12885)
+ [streamable] Add support for new embedded URL schema (#12844)
* [arte:+7] Relax URL regular expression (#12837)
version 2017.04.26
Core
@@ -6,19 +83,19 @@ Core
* [YoutubeDL] Fix output template for missing timestamp (#12796)
* [socks] Handle cases where credentials are required but missing
* [extractor/common] Improve HLS extraction (#12211)
- Extract m3u8 parsing to separate method
- Improve rendition groups extraction
- Build stream name according stream GROUP-ID
- Ignore reference to AUDIO group without URI when stream has no CODECS
- Use float for scaled tbr in _parse_m3u8_formats
* Extract m3u8 parsing to separate method
* Improve rendition groups extraction
* Build stream name according stream GROUP-ID
* Ignore reference to AUDIO group without URI when stream has no CODECS
* Use float for scaled tbr in _parse_m3u8_formats
* [utils] Add support for TTML styles in dfxp2srt
* [downloader/hls] No need to download keys for fragments that have been
already downloaded
* [downloader/fragment] Improve fragment downloading
- Resume immediately
- Don't concatenate fragments and decrypt them on every resume
- Optimize disk storage usage, don't store intermediate fragments on disk
- Store bookkeeping download state file
* Resume immediately
* Don't concatenate fragments and decrypt them on every resume
* Optimize disk storage usage, don't store intermediate fragments on disk
* Store bookkeeping download state file
+ [extractor/common] Add support for multiple getters in try_get
+ [extractor/common] Add support for video of WebPage context in _json_ld
(#12778)

View File

@@ -45,6 +45,7 @@
- **anderetijden**: npo.nl and ntr.nl
- **AnimeOnDemand**
- **anitube.se**
- **Anvato**
- **AnySex**
- **Aparat**
- **AppleConnect**
@@ -529,6 +530,7 @@
- **NJPWWorld**: 新日本プロレスワールド
- **NobelPrize**
- **Noco**
- **Noovo**
- **Normalboots**
- **NosVideo**
- **Nova**: TN.cz, Prásk.tv, Nova.cz, Novaplus.cz, FANDA.tv, Krásná.cz and Doma.cz
@@ -877,9 +879,10 @@
- **VGTV**: VGTV, BTTV, FTV, Aftenposten and Aftonbladet
- **vh1.com**
- **Viafree**
- **Vice**
- **vice**
- **vice:article**
- **vice:show**
- **Viceland**
- **ViceShow**
- **Vidbit**
- **Viddler**
- **Videa**
@@ -1013,6 +1016,7 @@
- **youtube:user**: YouTube.com user videos (URL or "ytuser" keyword)
- **youtube:watchlater**: Youtube watch later list, ":ytwatchlater" for short (requires authentication)
- **Zapiks**
- **Zaq1**
- **ZDF**
- **ZDFChannel**
- **zingmp3**: mp3.zing.vn

View File

@@ -184,16 +184,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'pluzz_francetv_11507',
'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
[{
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': 'meta',
'format_note': 'Quality selection URL',
'protocol': 'm3u8',
'preference': -100,
'resolution': 'multiple'
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_0_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_0_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '180',
'protocol': 'm3u8',
@@ -204,7 +196,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 144,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_1_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_1_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '303',
'protocol': 'm3u8',
@@ -215,7 +207,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 180,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_2_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_2_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '575',
'protocol': 'm3u8',
@@ -226,7 +218,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 288,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_3_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_3_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'format_id': '831',
'protocol': 'm3u8',
@@ -237,7 +229,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 396,
}, {
'url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_4_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/index_4_av.m3u8?null=0',
'manifest_url': 'http://replayftv-vh.akamaihd.net/i/streaming-adaptatif_france-dom-tom/2017/S16/J2/156589847-58f59130c1f52-,standard1,standard2,standard3,standard4,standard5,.mp4.csmil/master.m3u8?caption=2017%2F16%2F156589847-1492488987.m3u8%3Afra%3AFrancais&audiotrack=0%3Afra%3AFrancais',
'ext': 'mp4',
'protocol': 'm3u8',
'format_id': '1467',
@@ -254,28 +246,22 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'teamcoco_11995',
'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
[{
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': 'meta',
'format_note': 'Quality selection URL',
'protocol': 'm3u8',
'preference': -100,
'resolution': 'multiple',
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-160k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': 'audio-0-Default',
'protocol': 'm3u8',
'vcodec': 'none',
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': 'audio-1-Default',
'protocol': 'm3u8',
'vcodec': 'none',
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-audio-64k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '71',
'protocol': 'm3u8',
@@ -284,7 +270,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'tbr': 71,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '413',
'protocol': 'm3u8',
@@ -295,7 +281,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 224,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-400k_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '522',
'protocol': 'm3u8',
@@ -306,7 +292,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 224,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-1m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-1m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '1205',
'protocol': 'm3u8',
@@ -317,7 +303,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 360,
}, {
'url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-2m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/hls/CONAN_020217_Highlight_show-2m_v4.m3u8',
'manifest_url': 'http://ak.storage-w.teamcococdn.com/cdn/2017-02/98599/ed8f/main.m3u8',
'ext': 'mp4',
'format_id': '2374',
'protocol': 'm3u8',
@@ -334,15 +320,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'toggle_mobile_12211',
'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
[{
'url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': 'meta',
'format_note': 'Quality selection URL',
'protocol': 'm3u8',
'preference': -100,
'resolution': 'multiple'
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_sa2ntrdg/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': 'audio-English',
'protocol': 'm3u8',
@@ -350,6 +329,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'vcodec': 'none',
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_r7y0nitg/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': 'audio-Undefined',
'protocol': 'm3u8',
@@ -357,7 +337,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'vcodec': 'none',
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_qlk9hlzr/name/a.mp4/index.m3u8',
'manifest_url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_qlk9hlzr/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '155',
'protocol': 'm3u8',
@@ -366,7 +346,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 180,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_oefackmi/name/a.mp4/index.m3u8',
'manifest_url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/2/pv/1/flavorId/0_oefackmi/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '502',
'protocol': 'm3u8',
@@ -375,7 +355,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 270,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_vyg9pj7k/name/a.mp4/index.m3u8',
'manifest_url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_vyg9pj7k/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '827',
'protocol': 'm3u8',
@@ -384,7 +364,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 360,
}, {
'url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_50n4psvx/name/a.mp4/index.m3u8',
'manifest_url': 'http://k.toggle.sg/fhls/p/2082311/sp/208231100/serveFlavor/entryId/0_89q6e8ku/v/12/pv/1/flavorId/0_50n4psvx/name/a.mp4/index.m3u8',
'manifest_url': 'http://cdnapi.kaltura.com/p/2082311/sp/208231100/playManifest/protocol/http/entryId/0_89q6e8ku/format/applehttp/tags/mobile_sd/f/a.m3u8',
'ext': 'mp4',
'format_id': '1396',
'protocol': 'm3u8',
@@ -398,16 +378,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'twitch_vod',
'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
[{
'url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'meta',
'format_note': 'Quality selection URL',
'protocol': 'm3u8',
'preference': -100,
'resolution': 'multiple'
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/audio_only/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/audio_only/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Audio Only',
'protocol': 'm3u8',
@@ -416,7 +388,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'tbr': 182.725,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/mobile/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/mobile/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Mobile',
'protocol': 'm3u8',
@@ -427,7 +399,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 226,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/low/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/low/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Low',
'protocol': 'm3u8',
@@ -438,7 +410,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 360,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/medium/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/medium/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Medium',
'protocol': 'm3u8',
@@ -449,7 +421,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 480,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/high/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/high/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'High',
'protocol': 'm3u8',
@@ -460,7 +432,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 720,
}, {
'url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/chunked/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://vod.edgecast.hls.ttvnw.net/e5da31ab49_riotgames_15001215120_261543898/chunked/index-muted-HM49I092CC.m3u8',
'manifest_url': 'https://usher.ttvnw.net/vod/6528877?allow_source=true&allow_audio_only=true&allow_spectre=true&player=twitchweb&nauth=%7B%22user_id%22%3Anull%2C%22vod_id%22%3A6528877%2C%22expires%22%3A1492887874%2C%22chansub%22%3A%7B%22restricted_bitrates%22%3A%5B%5D%7D%2C%22privileged%22%3Afalse%2C%22https_required%22%3Afalse%7D&nauthsig=3e29296a6824a0f48f9e731383f77a614fc79bee',
'ext': 'mp4',
'format_id': 'Source',
'protocol': 'm3u8',
@@ -478,16 +450,8 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'vidio',
'https://www.vidio.com/videos/165683/playlist.m3u8',
[{
'url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': 'meta',
'format_note': 'Quality selection URL',
'protocol': 'm3u8',
'preference': -100,
'resolution': 'multiple'
}, {
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b300.mp4.m3u8',
'manifest_url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b300.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '270p 3G',
'protocol': 'm3u8',
@@ -496,7 +460,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 270,
}, {
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b600.mp4.m3u8',
'manifest_url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b600.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '360p SD',
'protocol': 'm3u8',
@@ -505,7 +469,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'height': 360,
}, {
'url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b1200.mp4.m3u8',
'manifest_url': 'https://cdn1-a.production.vidio.static6.com/uploads/165683/dj_ambred-4383-b1200.mp4.m3u8',
'manifest_url': 'https://www.vidio.com/videos/165683/playlist.m3u8',
'ext': 'mp4',
'format_id': '720p HD',
'protocol': 'm3u8',

View File

@@ -225,7 +225,7 @@ def generator(test_case, tname):
format_bytes(got_fsize)))
if 'md5' in tc:
md5_for_file = _file_md5(tc_filename)
self.assertEqual(md5_for_file, tc['md5'])
self.assertEqual(tc['md5'], md5_for_file)
# Finally, check test cases' data again but this time against
# extracted data from info JSON file written during processing
info_json_fn = os.path.splitext(tc_filename)[0] + '.info.json'

View File

@@ -44,6 +44,7 @@ from youtube_dl.utils import (
limit_length,
mimetype2ext,
month_by_name,
multipart_encode,
ohdave_rsa_encrypt,
OnDemandPagedList,
orderedSet,
@@ -338,6 +339,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_timestamp('UNKNOWN DATE FORMAT'), None)
self.assertEqual(unified_timestamp('May 16, 2016 11:15 PM'), 1463440500)
self.assertEqual(unified_timestamp('Feb 7, 2016 at 6:35 pm'), 1454870100)
self.assertEqual(unified_timestamp('2017-03-30T17:52:41Q'), 1490896361)
def test_determine_ext(self):
self.assertEqual(determine_ext('http://example.com/foo/bar.mp4/?download'), 'mp4')
@@ -619,6 +621,16 @@ class TestUtil(unittest.TestCase):
'http://example.com/path', {'test': '第二行тест'})),
query_dict('http://example.com/path?test=%E7%AC%AC%E4%BA%8C%E8%A1%8C%D1%82%D0%B5%D1%81%D1%82'))
def test_multipart_encode(self):
self.assertEqual(
multipart_encode({b'field': b'value'}, boundary='AAAAAA')[0],
b'--AAAAAA\r\nContent-Disposition: form-data; name="field"\r\n\r\nvalue\r\n--AAAAAA--\r\n')
self.assertEqual(
multipart_encode({'欄位'.encode('utf-8'): ''.encode('utf-8')}, boundary='AAAAAA')[0],
b'--AAAAAA\r\nContent-Disposition: form-data; name="\xe6\xac\x84\xe4\xbd\x8d"\r\n\r\n\xe5\x80\xbc\r\n--AAAAAA--\r\n')
self.assertRaises(
ValueError, multipart_encode, {b'field': b'value'}, boundary='value')
def test_dict_get(self):
FALSE_VALUES = {
'none': None,
@@ -899,6 +911,7 @@ class TestUtil(unittest.TestCase):
def test_clean_html(self):
self.assertEqual(clean_html('a:\nb'), 'a: b')
self.assertEqual(clean_html('a:\n "b"'), 'a: "b"')
self.assertEqual(clean_html('a<br>\xa0b'), 'a\nb')
def test_intlist_to_bytes(self):
self.assertEqual(

View File

@@ -0,0 +1,268 @@
#!/usr/bin/env python
# coding: utf-8
from __future__ import unicode_literals
# Allow direct execution
import os
import sys
import unittest
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from test.helper import expect_value
from youtube_dl.extractor import YoutubeIE
class TestYoutubeChapters(unittest.TestCase):
_TEST_CASES = [
(
# https://www.youtube.com/watch?v=A22oy8dFjqc
# pattern: 00:00 - <title>
'''This is the absolute ULTIMATE experience of Queen's set at LIVE AID, this is the best video mixed to the absolutely superior stereo radio broadcast. This vastly superior audio mix takes a huge dump on all of the official mixes. Best viewed in 1080p. ENJOY! ***MAKE SURE TO READ THE DESCRIPTION***<br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+36);return false;">00:36</a> - Bohemian Rhapsody<br /><a href="#" onclick="yt.www.watch.player.seekTo(02*60+42);return false;">02:42</a> - Radio Ga Ga<br /><a href="#" onclick="yt.www.watch.player.seekTo(06*60+53);return false;">06:53</a> - Ay Oh!<br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+34);return false;">07:34</a> - Hammer To Fall<br /><a href="#" onclick="yt.www.watch.player.seekTo(12*60+08);return false;">12:08</a> - Crazy Little Thing Called Love<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+03);return false;">16:03</a> - We Will Rock You<br /><a href="#" onclick="yt.www.watch.player.seekTo(17*60+18);return false;">17:18</a> - We Are The Champions<br /><a href="#" onclick="yt.www.watch.player.seekTo(21*60+12);return false;">21:12</a> - Is This The World We Created...?<br /><br />Short song analysis:<br /><br />- "Bohemian Rhapsody": Although it's a short medley version, it's one of the best performances of the ballad section, with Freddie nailing the Bb4s with the correct studio phrasing (for the first time ever!).<br /><br />- "Radio Ga Ga": Although it's missing one chorus, this is one of - if not the best - the best versions ever, Freddie nails all the Bb4s and sounds very clean! Spike Edney's Roland Jupiter 8 also really shines through on this mix, compared to the DVD releases!<br /><br />- "Audience Improv": A great improv, Freddie sounds strong and confident. You gotta love when he sustains that A4 for 4 seconds!<br /><br />- "Hammer To Fall": Despite missing a verse and a chorus, it's a strong version (possibly the best ever). Freddie sings the song amazingly, and even ad-libs a C#5 and a C5! Also notice how heavy Brian's guitar sounds compared to the thin DVD mixes - it roars!<br /><br />- "Crazy Little Thing Called Love": A great version, the crowd loves the song, the jam is great as well! Only downside to this is the slight feedback issues.<br /><br />- "We Will Rock You": Although cut down to the 1st verse and chorus, Freddie sounds strong. He nails the A4, and the solo from Dr. May is brilliant!<br /><br />- "We Are the Champions": Perhaps the high-light of the performance - Freddie is very daring on this version, he sustains the pre-chorus Bb4s, nails the 1st C5, belts great A4s, but most importantly: He nails the chorus Bb4s, in all 3 choruses! This is the only time he has ever done so! It has to be said though, the last one sounds a bit rough, but that's a side effect of belting high notes for the past 18 minutes, with nodules AND laryngitis!<br /><br />- "Is This The World We Created... ?": Freddie and Brian perform a beautiful version of this, and it is one of the best versions ever. It's both sad and hilarious that a couple of BBC engineers are talking over the song, one of them being completely oblivious of the fact that he is interrupting the performance, on live television... Which was being televised to almost 2 billion homes.<br /><br /><br />All rights go to their respective owners!<br />-----Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, teaching, scholarship, and research. Fair use is a use permitted by copyright statute that might otherwise be infringing. Non-profit, educational or personal use tips the balance in favor of fair use''',
1477,
[{
'start_time': 36,
'end_time': 162,
'title': 'Bohemian Rhapsody',
}, {
'start_time': 162,
'end_time': 413,
'title': 'Radio Ga Ga',
}, {
'start_time': 413,
'end_time': 454,
'title': 'Ay Oh!',
}, {
'start_time': 454,
'end_time': 728,
'title': 'Hammer To Fall',
}, {
'start_time': 728,
'end_time': 963,
'title': 'Crazy Little Thing Called Love',
}, {
'start_time': 963,
'end_time': 1038,
'title': 'We Will Rock You',
}, {
'start_time': 1038,
'end_time': 1272,
'title': 'We Are The Champions',
}, {
'start_time': 1272,
'end_time': 1477,
'title': 'Is This The World We Created...?',
}]
),
(
# https://www.youtube.com/watch?v=ekYlRhALiRQ
# pattern: <num>. <title> 0:00
'1. Those Beaten Paths of Confusion <a href="#" onclick="yt.www.watch.player.seekTo(0*60+00);return false;">0:00</a><br />2. Beyond the Shadows of Emptiness & Nothingness <a href="#" onclick="yt.www.watch.player.seekTo(11*60+47);return false;">11:47</a><br />3. Poison Yourself...With Thought <a href="#" onclick="yt.www.watch.player.seekTo(26*60+30);return false;">26:30</a><br />4. The Agents of Transformation <a href="#" onclick="yt.www.watch.player.seekTo(35*60+57);return false;">35:57</a><br />5. Drowning in the Pain of Consciousness <a href="#" onclick="yt.www.watch.player.seekTo(44*60+32);return false;">44:32</a><br />6. Deny the Disease of Life <a href="#" onclick="yt.www.watch.player.seekTo(53*60+07);return false;">53:07</a><br /><br />More info/Buy: http://crepusculonegro.storenvy.com/products/257645-cn-03-arizmenda-within-the-vacuum-of-infinity<br /><br />No copyright is intended. The rights to this video are assumed by the owner and its affiliates.',
4009,
[{
'start_time': 0,
'end_time': 707,
'title': '1. Those Beaten Paths of Confusion',
}, {
'start_time': 707,
'end_time': 1590,
'title': '2. Beyond the Shadows of Emptiness & Nothingness',
}, {
'start_time': 1590,
'end_time': 2157,
'title': '3. Poison Yourself...With Thought',
}, {
'start_time': 2157,
'end_time': 2672,
'title': '4. The Agents of Transformation',
}, {
'start_time': 2672,
'end_time': 3187,
'title': '5. Drowning in the Pain of Consciousness',
}, {
'start_time': 3187,
'end_time': 4009,
'title': '6. Deny the Disease of Life',
}]
),
(
# https://www.youtube.com/watch?v=WjL4pSzog9w
# pattern: 00:00 <title>
'<a href="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" class="yt-uix-servicelink " data-target-new-window="True" data-servicelink="CDAQ6TgYACITCNf1raqT2dMCFdRjGAod_o0CBSj4HQ" data-url="https://arizmenda.bandcamp.com/merch/despairs-depths-descended-cd" rel="nofollow noopener" target="_blank">https://arizmenda.bandcamp.com/merch/...</a><br /><br /><a href="#" onclick="yt.www.watch.player.seekTo(00*60+00);return false;">00:00</a> Christening Unborn Deformities <br /><a href="#" onclick="yt.www.watch.player.seekTo(07*60+08);return false;">07:08</a> Taste of Purity<br /><a href="#" onclick="yt.www.watch.player.seekTo(16*60+16);return false;">16:16</a> Sculpting Sins of a Universal Tongue<br /><a href="#" onclick="yt.www.watch.player.seekTo(24*60+45);return false;">24:45</a> Birth<br /><a href="#" onclick="yt.www.watch.player.seekTo(31*60+24);return false;">31:24</a> Neves<br /><a href="#" onclick="yt.www.watch.player.seekTo(37*60+55);return false;">37:55</a> Libations in Limbo',
2705,
[{
'start_time': 0,
'end_time': 428,
'title': 'Christening Unborn Deformities',
}, {
'start_time': 428,
'end_time': 976,
'title': 'Taste of Purity',
}, {
'start_time': 976,
'end_time': 1485,
'title': 'Sculpting Sins of a Universal Tongue',
}, {
'start_time': 1485,
'end_time': 1884,
'title': 'Birth',
}, {
'start_time': 1884,
'end_time': 2275,
'title': 'Neves',
}, {
'start_time': 2275,
'end_time': 2705,
'title': 'Libations in Limbo',
}]
),
(
# https://www.youtube.com/watch?v=o3r1sn-t3is
# pattern: <title> 00:00 <note>
'Download this show in MP3: <a href="http://sh.st/njZKK" class="yt-uix-servicelink " data-url="http://sh.st/njZKK" data-target-new-window="True" data-servicelink="CDAQ6TgYACITCK3j8_6o2dMCFVDCGAoduVAKKij4HQ" rel="nofollow noopener" target="_blank">http://sh.st/njZKK</a><br /><br />Setlist:<br />I-E-A-I-A-I-O <a href="#" onclick="yt.www.watch.player.seekTo(00*60+45);return false;">00:45</a><br />Suite-Pee <a href="#" onclick="yt.www.watch.player.seekTo(4*60+26);return false;">4:26</a> (Incomplete)<br />Attack <a href="#" onclick="yt.www.watch.player.seekTo(5*60+31);return false;">5:31</a> (First live performance since 2011)<br />Prison Song <a href="#" onclick="yt.www.watch.player.seekTo(8*60+42);return false;">8:42</a><br />Know <a href="#" onclick="yt.www.watch.player.seekTo(12*60+32);return false;">12:32</a> (First live performance since 2011)<br />Aerials <a href="#" onclick="yt.www.watch.player.seekTo(15*60+32);return false;">15:32</a><br />Soldier Side - Intro <a href="#" onclick="yt.www.watch.player.seekTo(19*60+13);return false;">19:13</a><br />B.Y.O.B. <a href="#" onclick="yt.www.watch.player.seekTo(20*60+09);return false;">20:09</a><br />Soil <a href="#" onclick="yt.www.watch.player.seekTo(24*60+32);return false;">24:32</a><br />Darts <a href="#" onclick="yt.www.watch.player.seekTo(27*60+48);return false;">27:48</a><br />Radio/Video <a href="#" onclick="yt.www.watch.player.seekTo(30*60+38);return false;">30:38</a><br />Hypnotize <a href="#" onclick="yt.www.watch.player.seekTo(35*60+05);return false;">35:05</a><br />Temper <a href="#" onclick="yt.www.watch.player.seekTo(38*60+08);return false;">38:08</a> (First live performance since 1999)<br />CUBErt <a href="#" onclick="yt.www.watch.player.seekTo(41*60+00);return false;">41:00</a><br />Needles <a href="#" onclick="yt.www.watch.player.seekTo(42*60+57);return false;">42:57</a><br />Deer Dance <a href="#" onclick="yt.www.watch.player.seekTo(46*60+27);return false;">46:27</a><br />Bounce <a href="#" onclick="yt.www.watch.player.seekTo(49*60+38);return false;">49:38</a><br />Suggestions <a href="#" onclick="yt.www.watch.player.seekTo(51*60+25);return false;">51:25</a><br />Psycho <a href="#" onclick="yt.www.watch.player.seekTo(53*60+52);return false;">53:52</a><br />Chop Suey! <a href="#" onclick="yt.www.watch.player.seekTo(58*60+13);return false;">58:13</a><br />Lonely Day <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+01*60+15);return false;">1:01:15</a><br />Question! <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+04*60+14);return false;">1:04:14</a><br />Lost in Hollywood <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+08*60+10);return false;">1:08:10</a><br />Vicinity of Obscenity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+13*60+40);return false;">1:13:40</a>(First live performance since 2012)<br />Forest <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+16*60+17);return false;">1:16:17</a><br />Cigaro <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+20*60+02);return false;">1:20:02</a><br />Toxicity <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+23*60+57);return false;">1:23:57</a>(with Chino Moreno)<br />Sugar <a href="#" onclick="yt.www.watch.player.seekTo(1*3600+27*60+53);return false;">1:27:53</a>',
5640,
[{
'start_time': 45,
'end_time': 266,
'title': 'I-E-A-I-A-I-O',
}, {
'start_time': 266,
'end_time': 331,
'title': 'Suite-Pee (Incomplete)',
}, {
'start_time': 331,
'end_time': 522,
'title': 'Attack (First live performance since 2011)',
}, {
'start_time': 522,
'end_time': 752,
'title': 'Prison Song',
}, {
'start_time': 752,
'end_time': 932,
'title': 'Know (First live performance since 2011)',
}, {
'start_time': 932,
'end_time': 1153,
'title': 'Aerials',
}, {
'start_time': 1153,
'end_time': 1209,
'title': 'Soldier Side - Intro',
}, {
'start_time': 1209,
'end_time': 1472,
'title': 'B.Y.O.B.',
}, {
'start_time': 1472,
'end_time': 1668,
'title': 'Soil',
}, {
'start_time': 1668,
'end_time': 1838,
'title': 'Darts',
}, {
'start_time': 1838,
'end_time': 2105,
'title': 'Radio/Video',
}, {
'start_time': 2105,
'end_time': 2288,
'title': 'Hypnotize',
}, {
'start_time': 2288,
'end_time': 2460,
'title': 'Temper (First live performance since 1999)',
}, {
'start_time': 2460,
'end_time': 2577,
'title': 'CUBErt',
}, {
'start_time': 2577,
'end_time': 2787,
'title': 'Needles',
}, {
'start_time': 2787,
'end_time': 2978,
'title': 'Deer Dance',
}, {
'start_time': 2978,
'end_time': 3085,
'title': 'Bounce',
}, {
'start_time': 3085,
'end_time': 3232,
'title': 'Suggestions',
}, {
'start_time': 3232,
'end_time': 3493,
'title': 'Psycho',
}, {
'start_time': 3493,
'end_time': 3675,
'title': 'Chop Suey!',
}, {
'start_time': 3675,
'end_time': 3854,
'title': 'Lonely Day',
}, {
'start_time': 3854,
'end_time': 4090,
'title': 'Question!',
}, {
'start_time': 4090,
'end_time': 4420,
'title': 'Lost in Hollywood',
}, {
'start_time': 4420,
'end_time': 4577,
'title': 'Vicinity of Obscenity (First live performance since 2012)',
}, {
'start_time': 4577,
'end_time': 4802,
'title': 'Forest',
}, {
'start_time': 4802,
'end_time': 5037,
'title': 'Cigaro',
}, {
'start_time': 5037,
'end_time': 5273,
'title': 'Toxicity (with Chino Moreno)',
}, {
'start_time': 5273,
'end_time': 5640,
'title': 'Sugar',
}]
),
(
# https://www.youtube.com/watch?v=PkYLQbsqCE8
# pattern: <num> - <title> [<latinized title>] 0:00:00
'''Затемно (Zatemno) is an Obscure Black Metal Band from Russia.<br /><br />"Во прах (Vo prakh)'' Into The Ashes", Debut mini-album released may 6, 2016, by Death Knell Productions<br />Released on 6 panel digipak CD, limited to 100 copies only<br />And digital format on Bandcamp<br /><br />Tracklist<br /><br />1 - Во прах [Vo prakh] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+00*60+00);return false;">0:00:00</a><br />2 - Искупление [Iskupleniye] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+08*60+10);return false;">0:08:10</a><br />3 - Из серпов луны...[Iz serpov luny] <a href="#" onclick="yt.www.watch.player.seekTo(0*3600+14*60+30);return false;">0:14:30</a><br /><br />Links:<br /><a href="https://deathknellprod.bandcamp.com/album/--2" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://deathknellprod.bandcamp.com/album/--2" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://deathknellprod.bandcamp.com/a...</a><br /><a href="https://www.facebook.com/DeathKnellProd/" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://www.facebook.com/DeathKnellProd/" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://www.facebook.com/DeathKnellProd/</a><br /><br /><br />I don't have any right about this artifact, my only intention is to spread the music of the band, all rights are reserved to the Затемно (Zatemno) and his producers, Death Knell Productions.<br /><br />------------------------------------------------------------------<br /><br />Subscribe for more videos like this.<br />My link: <a href="https://web.facebook.com/AttackOfTheDragons" class="yt-uix-servicelink " data-target-new-window="True" data-url="https://web.facebook.com/AttackOfTheDragons" data-servicelink="CC8Q6TgYACITCNP234Kr2dMCFcNxGAodQqsIwSj4HQ" target="_blank" rel="nofollow noopener">https://web.facebook.com/AttackOfTheD...</a>''',
1138,
[{
'start_time': 0,
'end_time': 490,
'title': '1 - Во прах [Vo prakh]',
}, {
'start_time': 490,
'end_time': 870,
'title': '2 - Искупление [Iskupleniye]',
}, {
'start_time': 870,
'end_time': 1138,
'title': '3 - Из серпов луны...[Iz serpov luny]',
}]
),
]
def test_youtube_chapters(self):
for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE()
expect_value(
self, ie._extract_chapters(description, duration),
expected_chapters, None)
if __name__ == '__main__':
unittest.main()

View File

@@ -29,7 +29,17 @@ class ExternalFD(FileDownloader):
self.report_destination(filename)
tmpfilename = self.temp_name(filename)
retval = self._call_downloader(tmpfilename, info_dict)
try:
retval = self._call_downloader(tmpfilename, info_dict)
except KeyboardInterrupt:
if not info_dict.get('is_live'):
raise
# Live stream downloading cancellation should be considered as
# correct and expected termination thus all postprocessing
# should take place
retval = 0
self.to_screen('[%s] Interrupted by user' % self.get_basename())
if retval == 0:
fsize = os.path.getsize(encodeFilename(tmpfilename))
self.to_screen('\r[%s] Downloaded %s bytes' % (self.get_basename(), fsize))

View File

@@ -49,7 +49,7 @@ class FragmentFD(FileDownloader):
index: 0-based index of current fragment among all fragments
fragment_count:
Total count of fragments
This feature is experimental and file format may change in future.
"""
@@ -155,8 +155,6 @@ class FragmentFD(FileDownloader):
self._write_ytdl_file(ctx)
if ctx['fragment_index'] > 0:
assert resume_len > 0
else:
assert resume_len == 0
dest_stream, tmpfilename = sanitize_open(tmpfilename, open_mode)

View File

@@ -1308,6 +1308,12 @@ class AdobePassIE(InfoExtractor):
_USER_AGENT = 'Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0'
_MVPD_CACHE = 'ap-mvpd'
def _download_webpage_handle(self, *args, **kwargs):
headers = kwargs.get('headers', {})
headers.update(self.geo_verification_headers())
kwargs['headers'] = headers
return super(AdobePassIE, self)._download_webpage_handle(*args, **kwargs)
@staticmethod
def _get_mvpd_resource(provider_id, title, guid, rating):
channel = etree.Element('channel')

View File

@@ -101,10 +101,14 @@ class AENetworksIE(AENetworksBaseIE):
for season_url_path in re.findall(r'(?s)<li[^>]+data-href="(/shows/%s/season-\d+)"' % url_parts[0], webpage):
entries.append(self.url_result(
compat_urlparse.urljoin(url, season_url_path), 'AENetworks'))
return self.playlist_result(
entries, self._html_search_meta('aetn:SeriesId', webpage),
self._html_search_meta('aetn:SeriesTitle', webpage))
elif url_parts_len == 2:
if entries:
return self.playlist_result(
entries, self._html_search_meta('aetn:SeriesId', webpage),
self._html_search_meta('aetn:SeriesTitle', webpage))
else:
# single season
url_parts_len = 2
if url_parts_len == 2:
entries = []
for episode_item in re.findall(r'(?s)<[^>]+class="[^"]*(?:episode|program)-item[^"]*"[^>]*>', webpage):
episode_attributes = extract_attributes(episode_item)
@@ -112,7 +116,7 @@ class AENetworksIE(AENetworksBaseIE):
url, episode_attributes['data-canonical'])
entries.append(self.url_result(
episode_url, 'AENetworks',
episode_attributes['data-videoid']))
episode_attributes.get('data-videoid') or episode_attributes.get('data-video-id')))
return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage))

View File

@@ -207,11 +207,10 @@ class AfreecaTVIE(InfoExtractor):
file_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls',
note='Downloading part %d m3u8 information' % file_num)
title = title if one else '%s (part %d)' % (title, file_num)
file_info = common_entry.copy()
file_info.update({
'id': format_id,
'title': title,
'title': title if one else '%s (part %d)' % (title, file_num),
'upload_date': upload_date,
'duration': file_duration,
'formats': formats,

View File

@@ -34,9 +34,12 @@ class AMPIE(InfoExtractor):
if isinstance(media_thumbnail, dict):
media_thumbnail = [media_thumbnail]
for thumbnail_data in media_thumbnail:
thumbnail = thumbnail_data['@attributes']
thumbnail = thumbnail_data.get('@attributes', {})
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'url': self._proto_relative_url(thumbnail['url'], 'http:'),
'url': self._proto_relative_url(thumbnail_url, 'http:'),
'width': int_or_none(thumbnail.get('width')),
'height': int_or_none(thumbnail.get('height')),
})
@@ -47,9 +50,14 @@ class AMPIE(InfoExtractor):
if isinstance(media_subtitle, dict):
media_subtitle = [media_subtitle]
for subtitle_data in media_subtitle:
subtitle = subtitle_data['@attributes']
lang = subtitle.get('lang') or 'en'
subtitles[lang] = [{'url': subtitle['href']}]
subtitle = subtitle_data.get('@attributes', {})
subtitle_href = subtitle.get('href')
if not subtitle_href:
continue
subtitles.setdefault(subtitle.get('lang') or 'en', []).append({
'url': subtitle_href,
'ext': mimetype2ext(subtitle.get('type')) or determine_ext(subtitle_href),
})
formats = []
media_content = get_media_node('content')

View File

@@ -5,6 +5,7 @@ import base64
import hashlib
import json
import random
import re
import time
from .common import InfoExtractor
@@ -16,6 +17,7 @@ from ..utils import (
intlist_to_bytes,
int_or_none,
strip_jsonp,
unescapeHTML,
)
@@ -26,6 +28,8 @@ def md5_text(s):
class AnvatoIE(InfoExtractor):
_VALID_URL = r'anvato:(?P<access_key_or_mcp>[^:]+):(?P<id>\d+)'
# Copied from anvplayer.min.js
_ANVACK_TABLE = {
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ',
@@ -114,6 +118,22 @@ class AnvatoIE(InfoExtractor):
'nbcu_nbcd_desktop_web_prod_93d8ead38ce2024f8f544b78306fbd15895ae5e6_secure': 'NNemUkySjxLyPTKvZRiGntBIjEyK8uqicjMakIaQ'
}
_MCP_TO_ACCESS_KEY_TABLE = {
'qa': 'anvato_mcpqa_demo_web_stage_18b55e00db5a13faa8d03ae6e41f6f5bcb15b922',
'lin': 'anvato_mcp_lin_web_prod_4c36fbfd4d8d8ecae6488656e21ac6d1ac972749',
'univison': 'anvato_mcp_univision_web_prod_37fe34850c99a3b5cdb71dab10a417dd5cdecafa',
'uni': 'anvato_mcp_univision_web_prod_37fe34850c99a3b5cdb71dab10a417dd5cdecafa',
'dev': 'anvato_mcp_fs2go_web_prod_c7b90a93e171469cdca00a931211a2f556370d0a',
'sps': 'anvato_mcp_sps_web_prod_54bdc90dd6ba21710e9f7074338365bba28da336',
'spsstg': 'anvato_mcp_sps_web_prod_54bdc90dd6ba21710e9f7074338365bba28da336',
'anv': 'anvato_mcp_anv_web_prod_791407490f4c1ef2a4bcb21103e0cb1bcb3352b3',
'gray': 'anvato_mcp_gray_web_prod_4c10f067c393ed8fc453d3930f8ab2b159973900',
'hearst': 'anvato_mcp_hearst_web_prod_5356c3de0fc7c90a3727b4863ca7fec3a4524a99',
'cbs': 'anvato_mcp_cbs_web_prod_02f26581ff80e5bda7aad28226a8d369037f2cbe',
'telemundo': 'anvato_mcp_telemundo_web_prod_c5278d51ad46fda4b6ca3d0ea44a7846a054f582'
}
_ANVP_RE = r'<script[^>]+\bdata-anvp\s*=\s*(["\'])(?P<anvp>(?:(?!\1).)+)\1'
_AUTH_KEY = b'\x31\xc2\x42\x84\x9e\x73\xa0\xce'
def __init__(self, *args, **kwargs):
@@ -178,12 +198,7 @@ class AnvatoIE(InfoExtractor):
}
if ext == 'm3u8' or media_format in ('m3u8', 'm3u8-variant'):
# Not using _extract_m3u8_formats here as individual media
# playlists are also included in published_urls.
if tbr is None:
formats.append(self._m3u8_meta_format(video_url, ext='mp4', m3u8_id='hls'))
continue
else:
if tbr is not None:
a_format.update({
'format_id': '-'.join(filter(None, ['hls', compat_str(tbr)])),
'ext': 'mp4',
@@ -222,9 +237,42 @@ class AnvatoIE(InfoExtractor):
'subtitles': subtitles,
}
@staticmethod
def _extract_urls(ie, webpage, video_id):
entries = []
for mobj in re.finditer(AnvatoIE._ANVP_RE, webpage):
anvplayer_data = ie._parse_json(
mobj.group('anvp'), video_id, transform_source=unescapeHTML,
fatal=False)
if not anvplayer_data:
continue
video = anvplayer_data.get('video')
if not isinstance(video, compat_str) or not video.isdigit():
continue
access_key = anvplayer_data.get('accessKey')
if not access_key:
mcp = anvplayer_data.get('mcp')
if mcp:
access_key = AnvatoIE._MCP_TO_ACCESS_KEY_TABLE.get(
mcp.lower())
if not access_key:
continue
entries.append(ie.url_result(
'anvato:%s:%s' % (access_key, video), ie=AnvatoIE.ie_key(),
video_id=video))
return entries
def _extract_anvato_videos(self, webpage, video_id):
anvplayer_data = self._parse_json(self._html_search_regex(
r'<script[^>]+data-anvp=\'([^\']+)\'', webpage,
'Anvato player data'), video_id)
anvplayer_data = self._parse_json(
self._html_search_regex(
self._ANVP_RE, webpage, 'Anvato player data', group='anvp'),
video_id)
return self._get_anvato_videos(
anvplayer_data['accessKey'], anvplayer_data['video'])
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
access_key, video_id = mobj.group('access_key_or_mcp', 'id')
if access_key not in self._ANVACK_TABLE:
access_key = self._MCP_TO_ACCESS_KEY_TABLE[access_key]
return self._get_anvato_videos(access_key, video_id)

View File

@@ -12,13 +12,13 @@ class AppleConnectIE(InfoExtractor):
_VALID_URL = r'https?://itunes\.apple\.com/\w{0,2}/?post/idsa\.(?P<id>[\w-]+)'
_TEST = {
'url': 'https://itunes.apple.com/us/post/idsa.4ab17a39-2720-11e5-96c5-a5b38f6c42d3',
'md5': '10d0f2799111df4cb1c924520ca78f98',
'md5': 'e7c38568a01ea45402570e6029206723',
'info_dict': {
'id': '4ab17a39-2720-11e5-96c5-a5b38f6c42d3',
'ext': 'm4v',
'title': 'Energy',
'uploader': 'Drake',
'thumbnail': 'http://is5.mzstatic.com/image/thumb/Video5/v4/78/61/c5/7861c5fa-ad6d-294b-1464-cf7605b911d6/source/1920x1080sr.jpg',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20150710',
'timestamp': 1436545535,
},

View File

@@ -70,7 +70,8 @@ class AppleTrailersIE(InfoExtractor):
}, {
'url': 'http://trailers.apple.com/trailers/magnolia/blackthorn/',
'info_dict': {
'id': 'blackthorn',
'id': '4489',
'title': 'Blackthorn',
},
'playlist_mincount': 2,
'expected_warnings': ['Unable to download JSON metadata'],
@@ -261,7 +262,7 @@ class AppleTrailersSectionIE(InfoExtractor):
'title': 'Most Popular',
'id': 'mostpopular',
},
'playlist_mincount': 80,
'playlist_mincount': 30,
}, {
'url': 'http://trailers.apple.com/#section=moviestudios',
'info_dict': {

View File

@@ -24,12 +24,12 @@ class ArchiveOrgIE(InfoExtractor):
}
}, {
'url': 'https://archive.org/details/Cops1922',
'md5': 'bc73c8ab3838b5a8fc6c6651fa7b58ba',
'md5': '0869000b4ce265e8ca62738b336b268a',
'info_dict': {
'id': 'Cops1922',
'ext': 'mp4',
'title': 'Buster Keaton\'s "Cops" (1922)',
'description': 'md5:b4544662605877edd99df22f9620d858',
'description': 'md5:89e7c77bf5d965dd5c0372cfb49470f6',
}
}, {
'url': 'http://archive.org/embed/XD300-23_68HighlightsAResearchCntAugHumanIntellect',

View File

@@ -180,7 +180,7 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/[^/]+/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
@@ -188,6 +188,9 @@ class ArteTVPlus7IE(ArteTVBaseIE):
}, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True,
}, {
'url': 'http://www.arte.tv/de/videos/048696-000-A/der-kluge-bauch-unser-zweites-gehirn',
'only_matching': True,
}]
@classmethod

View File

@@ -36,7 +36,7 @@ class AtresPlayerIE(InfoExtractor):
},
{
'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
'md5': '0d0e918533bbd4b263f2de4d197d4aac',
'md5': '6e52cbb513c405e403dbacb7aacf8747',
'info_dict': {
'id': 'capitulo-112-david-bustamante',
'ext': 'flv',

View File

@@ -16,7 +16,7 @@ class AudioBoomIE(InfoExtractor):
'title': '3/09/2016 Czaban Hour 3',
'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 2245.72,
'uploader': 'Steve Czaban',
'uploader': 'SB Nation A.M.',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
}
}, {

View File

@@ -34,12 +34,12 @@ class BandcampIE(InfoExtractor):
'_skip': 'There is a limit of 200 free downloads / month for the test song'
}, {
'url': 'http://benprunty.bandcamp.com/track/lanius-battle',
'md5': '73d0b3171568232574e45652f8720b5c',
'md5': '0369ace6b939f0927e62c67a1a8d9fa7',
'info_dict': {
'id': '2650410135',
'ext': 'mp3',
'title': 'Lanius (Battle)',
'uploader': 'Ben Prunty Music',
'ext': 'aiff',
'title': 'Ben Prunty - Lanius (Battle)',
'uploader': 'Ben Prunty',
},
}]
@@ -47,6 +47,7 @@ class BandcampIE(InfoExtractor):
mobj = re.match(self._VALID_URL, url)
title = mobj.group('title')
webpage = self._download_webpage(url, title)
thumbnail = self._html_search_meta('og:image', webpage, default=None)
m_download = re.search(r'freeDownloadPage: "(.*?)"', webpage)
if not m_download:
m_trackinfo = re.search(r'trackinfo: (.+),\s*?\n', webpage)
@@ -75,6 +76,7 @@ class BandcampIE(InfoExtractor):
return {
'id': track_id,
'title': data['title'],
'thumbnail': thumbnail,
'formats': formats,
'duration': float_or_none(data.get('duration')),
}
@@ -143,7 +145,7 @@ class BandcampIE(InfoExtractor):
return {
'id': video_id,
'title': title,
'thumbnail': info.get('thumb_url'),
'thumbnail': info.get('thumb_url') or thumbnail,
'uploader': info.get('artist'),
'artist': artist,
'track': track,

View File

@@ -16,7 +16,7 @@ class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)'
_TEST = {
'url': 'http://beeg.com/5416503',
'md5': '46c384def73b33dbc581262e5ee67cef',
'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'info_dict': {
'id': '5416503',
'ext': 'mp4',

View File

@@ -122,6 +122,11 @@ class BiliBiliIE(InfoExtractor):
'preference': -2 if 'hd.mp4' in backup_url else -3,
})
for a_format in formats:
a_format.setdefault('http_headers', {}).update({
'Referer': url,
})
self._sort_formats(formats)
entries.append({

View File

@@ -35,7 +35,7 @@ class BleacherReportIE(InfoExtractor):
'title': 'Aussie Golfers Get Fright of Their Lives After Being Chased by Angry Kangaroo',
'timestamp': 1446839961,
'uploader': 'Sean Fay',
'description': 'md5:825e94e0f3521df52fa83b2ed198fa20',
'description': 'md5:b1601e2314c4d8eec23b6eafe086a757',
'uploader_id': 6466954,
'upload_date': '20151011',
},
@@ -90,17 +90,13 @@ class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})'
_TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'md5': '8c2c12e3af7805152675446c905d159b',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'ext': 'mp4',
'ext': 'flv',
'title': 'Cena vs. Rollins Would Expose the Heavyweight Division',
'description': 'md5:984afb4ade2f9c0db35f3267ed88b36e',
},
'params': {
# m3u8 download
'skip_download': True,
},
}]
def _real_extract(self, url):

View File

@@ -77,7 +77,7 @@ class BRIE(InfoExtractor):
'description': 'md5:bb659990e9e59905c3d41e369db1fbe3',
'duration': 893,
'uploader': 'Eva Maria Steimle',
'upload_date': '20140117',
'upload_date': '20170208',
}
},
]

View File

@@ -522,7 +522,7 @@ class BrightcoveNewIE(InfoExtractor):
# [2] looks like:
for video, script_tag, account_id, player_id, embed in re.findall(
r'''(?isx)
(<video\s+[^>]*data-video-id=['"]?[^>]+>)
(<video\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>)
(?:.*?
(<script[^>]+
src=["\'](?:https?:)?//players\.brightcove\.net/

View File

@@ -16,13 +16,10 @@ class Canalc2IE(InfoExtractor):
'md5': '060158428b650f896c542dfbb3d6487f',
'info_dict': {
'id': '12163',
'ext': 'flv',
'ext': 'mp4',
'title': 'Terrasses du Numérique',
'duration': 122,
},
'params': {
'skip_download': True, # Requires rtmpdump
}
}, {
'url': 'http://archives-canalc2.u-strasbg.fr/video.asp?idVideo=11427&voir=oui',
'only_matching': True,

View File

@@ -96,6 +96,7 @@ class CBCIE(InfoExtractor):
'info_dict': {
'title': 'Keep Rover active during the deep freeze with doggie pushups and other fun indoor tasks',
'id': 'dog-indoor-exercise-winter-1.3928238',
'description': 'md5:c18552e41726ee95bd75210d1ca9194c',
},
'playlist_mincount': 6,
}]
@@ -165,12 +166,11 @@ class CBCPlayerIE(InfoExtractor):
'uploader': 'CBCC-NEW',
},
}, {
# available only when we add `formats=MPEG4,FLV,MP3` to theplatform url
'url': 'http://www.cbc.ca/player/play/2164402062',
'md5': '17a61eb813539abea40618d6323a7f82',
'md5': '33fcd8f6719b9dd60a5e73adcb83b9f6',
'info_dict': {
'id': '2164402062',
'ext': 'flv',
'ext': 'mp4',
'title': 'Cancer survivor four times over',
'description': 'Tim Mayer has beaten three different forms of cancer four times in five years.',
'timestamp': 1320410746,

View File

@@ -60,8 +60,8 @@ class CBSLocalIE(AnvatoIE):
'title': 'A Very Blue Anniversary',
'description': 'CBS2s Cindy Hsu has more.',
'thumbnail': 're:^https?://.*',
'timestamp': 1479962220,
'upload_date': '20161124',
'timestamp': int,
'upload_date': r're:^\d{8}$',
'uploader': 'CBS',
'subtitles': {
'en': 'mincount:5',

View File

@@ -9,7 +9,10 @@ from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
multipart_encode,
parse_duration,
random_birthday,
urljoin,
)
@@ -27,7 +30,8 @@ class CDAIE(InfoExtractor):
'description': 'md5:269ccd135d550da90d1662651fcb9772',
'thumbnail': r're:^https?://.*\.jpg$',
'average_rating': float,
'duration': 39
'duration': 39,
'age_limit': 0,
}
}, {
'url': 'http://www.cda.pl/video/57413289',
@@ -41,13 +45,41 @@ class CDAIE(InfoExtractor):
'uploader': 'crash404',
'view_count': int,
'average_rating': float,
'duration': 137
'duration': 137,
'age_limit': 0,
}
}, {
# Age-restricted
'url': 'http://www.cda.pl/video/1273454c4',
'info_dict': {
'id': '1273454c4',
'ext': 'mp4',
'title': 'Bronson (2008) napisy HD 1080p',
'description': 'md5:1b6cb18508daf2dc4e0fa4db77fec24c',
'height': 1080,
'uploader': 'boniek61',
'thumbnail': r're:^https?://.*\.jpg$',
'duration': 5554,
'age_limit': 18,
'view_count': int,
'average_rating': float,
},
}, {
'url': 'http://ebd.cda.pl/0x0/5749950c',
'only_matching': True,
}]
def _download_age_confirm_page(self, url, video_id, *args, **kwargs):
form_data = random_birthday('rok', 'miesiac', 'dzien')
form_data.update({'return': url, 'module': 'video', 'module_id': video_id})
data, content_type = multipart_encode(form_data)
return self._download_webpage(
urljoin(url, '/a/validatebirth'), video_id, *args,
data=data, headers={
'Referer': url,
'Content-Type': content_type,
}, **kwargs)
def _real_extract(self, url):
video_id = self._match_id(url)
self._set_cookie('cda.pl', 'cda.player', 'html5')
@@ -57,6 +89,13 @@ class CDAIE(InfoExtractor):
if 'Ten film jest dostępny dla użytkowników premium' in webpage:
raise ExtractorError('This video is only available for premium users.', expected=True)
need_confirm_age = False
if self._html_search_regex(r'(<form[^>]+action="/a/validatebirth")',
webpage, 'birthday validate form', default=None):
webpage = self._download_age_confirm_page(
url, video_id, note='Confirming age')
need_confirm_age = True
formats = []
uploader = self._search_regex(r'''(?x)
@@ -81,6 +120,7 @@ class CDAIE(InfoExtractor):
'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats,
'duration': None,
'age_limit': 18 if need_confirm_age else 0,
}
def extract_format(page, version):
@@ -121,7 +161,12 @@ class CDAIE(InfoExtractor):
for href, resolution in re.findall(
r'<a[^>]+data-quality="[^"]+"[^>]+href="([^"]+)"[^>]+class="quality-btn"[^>]*>([0-9]+p)',
webpage):
webpage = self._download_webpage(
if need_confirm_age:
handler = self._download_age_confirm_page
else:
handler = self._download_webpage
webpage = handler(
self._BASE_URL + href, video_id,
'Downloading %s version information' % resolution, fatal=False)
if not webpage:
@@ -129,6 +174,7 @@ class CDAIE(InfoExtractor):
# invalid version is requested.
self.report_warning('Unable to download %s version information' % resolution)
continue
extract_format(webpage, resolution)
self._sort_formats(formats)

View File

@@ -12,7 +12,7 @@ class ClipfishIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?clipfish\.de/(?:[^/]+/)+video/(?P<id>[0-9]+)'
_TEST = {
'url': 'http://www.clipfish.de/special/ugly-americans/video/4343170/s01-e01-ugly-americans-date-in-der-hoelle/',
'md5': '720563e467b86374c194bdead08d207d',
'md5': 'b9a5dc46294154c1193e2d10e0c95693',
'info_dict': {
'id': '4343170',
'ext': 'mp4',

View File

@@ -21,7 +21,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'mp4',
'title': 'Een nieuwe wereld: waarden, bewustzijn en techniek van de mensheid 2.0.',
'description': '',
'thumbnail': r're:^https?://.*\.jpg$',
'thumbnail': r're:^https?://.*\.jpg(?:\?.*?)?$',
'duration': 7713.088,
'timestamp': 1413309600,
'upload_date': '20141014',
@@ -35,6 +35,7 @@ class CollegeRamaIE(InfoExtractor):
'ext': 'wmv',
'title': '64ste Vakantiecursus: Afvalwater',
'description': 'md5:7fd774865cc69d972f542b157c328305',
'thumbnail': r're:^https?://.*\.jpg(?:\?.*?)?$',
'duration': 10853,
'timestamp': 1326446400,
'upload_date': '20120113',

View File

@@ -245,6 +245,10 @@ class InfoExtractor(object):
specified in the URL.
end_time: Time in seconds where the reproduction should end, as
specified in the URL.
chapters: A list of dictionaries, with the following entries:
* "start_time" - The start time of the chapter in seconds
* "end_time" - The end time of the chapter in seconds
* "title" (optional, string)
The following fields should only be used when the video belongs to some logical
chapter or section:
@@ -990,6 +994,7 @@ class InfoExtractor(object):
'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')),
'height': int_or_none(e.get('height')),
'view_count': int_or_none(e.get('interactionCount')),
})
for e in json_ld:
@@ -1334,7 +1339,7 @@ class InfoExtractor(object):
if '#EXT-X-FAXS-CM:' in m3u8_doc: # Adobe Flash Access
return []
formats = [self._m3u8_meta_format(m3u8_url, ext, preference, m3u8_id)]
formats = []
format_url = lambda u: (
u
@@ -1386,6 +1391,7 @@ class InfoExtractor(object):
f = {
'format_id': '-'.join(format_id),
'url': format_url(media_url),
'manifest_url': m3u8_url,
'language': media.get('LANGUAGE'),
'ext': ext,
'protocol': entry_protocol,
@@ -1438,7 +1444,7 @@ class InfoExtractor(object):
f = {
'format_id': '-'.join(format_id),
'url': manifest_url,
'manifest_url': manifest_url,
'manifest_url': m3u8_url,
'tbr': tbr,
'ext': ext,
'fps': float_or_none(last_stream_inf.get('FRAME-RATE')),
@@ -2168,7 +2174,7 @@ class InfoExtractor(object):
def _extract_akamai_formats(self, manifest_url, video_id, hosts={}):
formats = []
hdcore_sign = 'hdcore=3.7.0'
f4m_url = re.sub(r'(https?://[^/+])/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
f4m_url = re.sub(r'(https?://[^/]+)/i/', r'\1/z/', manifest_url).replace('/master.m3u8', '/manifest.f4m')
hds_host = hosts.get('hds')
if hds_host:
f4m_url = re.sub(r'(https?://)[^/]+', r'\1' + hds_host, f4m_url)

View File

@@ -24,12 +24,11 @@ class CoubIE(InfoExtractor):
'duration': 4.6,
'timestamp': 1428527772,
'upload_date': '20150408',
'uploader': 'Артём Лоскутников',
'uploader': 'Artyom Loskutnikov',
'uploader_id': 'artyom.loskutnikov',
'view_count': int,
'like_count': int,
'repost_count': int,
'comment_count': int,
'age_limit': 0,
},
}, {
@@ -118,7 +117,6 @@ class CoubIE(InfoExtractor):
view_count = int_or_none(coub.get('views_count') or coub.get('views_increase_count'))
like_count = int_or_none(coub.get('likes_count'))
repost_count = int_or_none(coub.get('recoubs_count'))
comment_count = int_or_none(coub.get('comments_count'))
age_restricted = coub.get('age_restricted', coub.get('age_restricted_by_admin'))
if age_restricted is not None:
@@ -137,7 +135,6 @@ class CoubIE(InfoExtractor):
'view_count': view_count,
'like_count': like_count,
'repost_count': repost_count,
'comment_count': comment_count,
'age_limit': age_limit,
'formats': formats,
}

View File

@@ -21,9 +21,10 @@ class CrackleIE(InfoExtractor):
'season_number': 8,
'episode_number': 4,
'subtitles': {
'en-US': [{
'ext': 'ttml',
}]
'en-US': [
{'ext': 'vtt'},
{'ext': 'tt'},
]
},
},
'params': {

View File

@@ -171,7 +171,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'info_dict': {
'id': '727589',
'ext': 'mp4',
'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 Give Me Deliverance from this Judicial Injustice!",
'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 Give Me Deliverance From This Judicial Injustice!",
'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Kadokawa Pictures Inc.',
@@ -179,7 +179,7 @@ class CrunchyrollIE(CrunchyrollBaseIE):
'series': "KONOSUBA -God's blessing on this wonderful world!",
'season': "KONOSUBA -God's blessing on this wonderful world! 2",
'season_number': 2,
'episode': 'Give Me Deliverance from this Judicial Injustice!',
'episode': 'Give Me Deliverance From This Judicial Injustice!',
'episode_number': 1,
},
'params': {

View File

@@ -50,6 +50,24 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
]
_TESTS = [
{
'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
'md5': '074b95bdee76b9e3654137aee9c79dfe',
'info_dict': {
'id': 'x5kesuj',
'ext': 'mp4',
'title': 'Office Christmas Party Review Jason Bateman, Olivia Munn, T.J. Miller',
'description': 'Office Christmas Party Review - Jason Bateman, Olivia Munn, T.J. Miller',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 187,
'timestamp': 1493651285,
'upload_date': '20170501',
'uploader': 'Deadline',
'uploader_id': 'x1xm8ri',
'age_limit': 0,
'view_count': int,
},
},
{
'url': 'https://www.dailymotion.com/video/x2iuewm_steam-machine-models-pricing-listed-on-steam-store-ign-news_videogames',
'md5': '2137c41a8e78554bb09225b8eb322406',
@@ -66,7 +84,8 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'uploader_id': 'xijv66',
'age_limit': 0,
'view_count': int,
}
},
'skip': 'video gone',
},
# Vevo video
{

View File

@@ -21,7 +21,8 @@ class DemocracynowIE(InfoExtractor):
'info_dict': {
'id': '2015-0703-001',
'ext': 'mp4',
'title': 'Daily Show',
'title': 'Daily Show for July 03, 2015',
'description': 'md5:80eb927244d6749900de6072c7cc2c86',
},
}, {
'url': 'http://www.democracynow.org/2015/7/3/this_flag_comes_down_today_bree',

View File

@@ -35,7 +35,7 @@ class DotsubIE(InfoExtractor):
'thumbnail': 're:^https?://dotsub.com/media/747bcf58-bd59-45b7-8c8c-ac312d084ee6/p',
'duration': 290,
'timestamp': 1476767794.2809999,
'upload_date': '20160525',
'upload_date': '20161018',
'uploader': 'parthivi001',
'uploader_id': 'user52596202',
'view_count': int,

View File

@@ -20,7 +20,7 @@ class DouyuTVIE(InfoExtractor):
'id': '17732',
'display_id': 'iseven',
'ext': 'flv',
'title': 're:^清晨醒脑!T-ARA根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',
@@ -51,7 +51,7 @@ class DouyuTVIE(InfoExtractor):
'id': '17732',
'display_id': '17732',
'ext': 'flv',
'title': 're:^清晨醒脑!T-ARA根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'title': 're:^清晨醒脑!根本停不下来! [0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}$',
'description': r're:.*m7show@163\.com.*',
'thumbnail': r're:^https?://.*\.jpg$',
'uploader': '7师傅',

View File

@@ -41,6 +41,7 @@ from .alphaporno import AlphaPornoIE
from .amcnetworks import AMCNetworksIE
from .animeondemand import AnimeOnDemandIE
from .anitube import AnitubeIE
from .anvato import AnvatoIE
from .anysex import AnySexIE
from .aol import AolIE
from .allocine import AllocineIE
@@ -662,6 +663,7 @@ from .nintendo import NintendoIE
from .njpwworld import NJPWWorldIE
from .nobelprize import NobelPrizeIE
from .noco import NocoIE
from .noovo import NoovoIE
from .normalboots import NormalbootsIE
from .nosvideo import NosVideoIE
from .nova import NovaIE
@@ -1123,6 +1125,7 @@ from .vgtv import (
from .vh1 import VH1IE
from .vice import (
ViceIE,
ViceArticleIE,
ViceShowIE,
)
from .viceland import VicelandIE
@@ -1298,5 +1301,6 @@ from .youtube import (
YoutubeWatchLaterIE,
)
from .zapiks import ZapiksIE
from .zaq1 import Zaq1IE
from .zdf import ZDFIE, ZDFChannelIE
from .zingmp3 import ZingMp3IE

View File

@@ -11,10 +11,10 @@ class FoxSportsIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?foxsports\.com/(?:[^/]+/)*(?P<id>[^/]+)'
_TEST = {
'url': 'http://www.foxsports.com/video?vid=432609859715',
'url': 'http://www.foxsports.com/tennessee/video/432609859715',
'md5': 'b49050e955bebe32c301972e4012ac17',
'info_dict': {
'id': 'i0qKWsk3qJaM',
'id': 'bwduI3X_TgUB',
'ext': 'mp4',
'title': 'Courtney Lee on going up 2-0 in series vs. Blazers',
'description': 'Courtney Lee talks about Memphis being focused.',
@@ -31,8 +31,9 @@ class FoxSportsIE(InfoExtractor):
webpage = self._download_webpage(url, video_id)
config = self._parse_json(
self._search_regex(
r"data-player-config='([^']+)'", webpage, 'data player config'),
self._html_search_regex(
r"""class="[^"]*(?:fs-player|platformPlayer-wrapper)[^"]*".+?data-player-config='([^']+)'""",
webpage, 'data player config'),
video_id)
return self.url_result(smuggle_url(update_url_query(

View File

@@ -58,8 +58,7 @@ class FunnyOrDieIE(InfoExtractor):
m3u8_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False)
source_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
m3u8_formats))
lambda f: f.get('vcodec') != 'none', m3u8_formats))
bitrates = [int(bitrate) for bitrate in re.findall(r'[,/]v(\d+)(?=[,/])', m3u8_url)]
bitrates.sort()

View File

@@ -78,8 +78,7 @@ class GameSpotIE(OnceIE):
if m3u8_formats:
self._sort_formats(m3u8_formats)
m3u8_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
m3u8_formats))
lambda f: f.get('vcodec') != 'none', m3u8_formats))
if len(qualities) == len(m3u8_formats):
for q, m3u8_format in zip(qualities, m3u8_formats):
f = m3u8_format.copy()

View File

@@ -75,6 +75,19 @@ class GDCVaultIE(InfoExtractor):
'format': 'jp', # The japanese audio
}
},
{
# gdc-player.html
'url': 'http://www.gdcvault.com/play/1435/An-American-engine-in-Tokyo',
'info_dict': {
'id': '1435',
'display_id': 'An-American-engine-in-Tokyo',
'ext': 'flv',
'title': 'An American Engine in Tokyo:/nThe collaboration of Epic Games and Square Enix/nFor THE LAST REMINANT',
},
'params': {
'skip_download': True, # Requires rtmpdump
},
},
]
def _login(self, webpage_url, display_id):
@@ -128,7 +141,7 @@ class GDCVaultIE(InfoExtractor):
'title': title,
}
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/player.*?\.html.*?".*?</iframe>'
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)

View File

@@ -86,6 +86,8 @@ from .openload import OpenloadIE
from .videopress import VideoPressIE
from .rutube import RutubeIE
from .limelight import LimelightBaseIE
from .anvato import AnvatoIE
from .washingtonpost import WashingtonPostIE
class GenericIE(InfoExtractor):
@@ -1427,6 +1429,22 @@ class GenericIE(InfoExtractor):
'skip_download': True,
},
},
{
# Brightcove embed with whitespace around attribute names
'url': 'http://www.stack.com/video/3167554373001/learn-to-hit-open-three-pointers-with-damian-lillard-s-baseline-drift-drill',
'info_dict': {
'id': '3167554373001',
'ext': 'mp4',
'title': "Learn to Hit Open Three-Pointers With Damian Lillard's Baseline Drift Drill",
'description': 'md5:57bacb0e0f29349de4972bfda3191713',
'uploader_id': '1079349493',
'upload_date': '20140207',
'timestamp': 1391810548,
},
'params': {
'skip_download': True,
},
},
# Another form of arte.tv embed
{
'url': 'http://www.tv-replay.fr/redirection/09-04-16/arte-reportage-arte-11508975.html',
@@ -1677,6 +1695,29 @@ class GenericIE(InfoExtractor):
},
'playlist_mincount': 5,
},
{
'url': 'http://kron4.com/2017/04/28/standoff-with-walnut-creek-murder-suspect-ends-with-arrest/',
'info_dict': {
'id': 'standoff-with-walnut-creek-murder-suspect-ends-with-arrest',
'title': 'Standoff with Walnut Creek murder suspect ends',
'description': 'md5:3ccc48a60fc9441eeccfc9c469ebf788',
},
'playlist_mincount': 4,
},
{
# WashingtonPost embed
'url': 'http://www.vanityfair.com/hollywood/2017/04/donald-trump-tv-pitches',
'info_dict': {
'id': '8caf6e88-d0ec-11e5-90d3-34c2c42653ac',
'ext': 'mp4',
'title': "No one has seen the drama series based on Trump's life \u2014 until now",
'description': 'Donald Trump wanted a weekly TV drama based on his life. It never aired. But The Washington Post recently obtained a scene from the pilot script — and enlisted actors.',
'timestamp': 1455216756,
'uploader': 'The Washington Post',
'upload_date': '20160211',
},
'add_ie': [WashingtonPostIE.ie_key()],
},
# {
# # TODO: find another test
# # http://schema.org/VideoObject
@@ -2537,6 +2578,12 @@ class GenericIE(InfoExtractor):
'limelight:media:%s' % mobj.group('id'),
{'source_url': url}), 'LimelightMedia', mobj.group('id'))
# Look for Anvato embeds
anvato_urls = AnvatoIE._extract_urls(self, webpage, video_id)
if anvato_urls:
return self.playlist_result(
anvato_urls, video_id, video_title, video_description)
# Look for AdobeTVVideo embeds
mobj = re.search(
r'<iframe[^>]+src=[\'"]((?:https?:)?//video\.tv\.adobe\.com/v/\d+[^"]+)[\'"]',
@@ -2654,6 +2701,12 @@ class GenericIE(InfoExtractor):
return self.playlist_from_matches(
rutube_urls, ie=RutubeIE.ie_key())
# Look for WashingtonPost embeds
wapo_urls = WashingtonPostIE._extract_urls(webpage)
if wapo_urls:
return self.playlist_from_matches(
wapo_urls, video_id, video_title, ie=WashingtonPostIE.ie_key())
# Looking for http://schema.org/VideoObject
json_ld = self._search_json_ld(
webpage, video_id, default={}, expected_type='VideoObject')

View File

@@ -36,22 +36,26 @@ class GoIE(AdobePassIE):
'requestor_id': 'DisneyXD',
}
}
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:[^/]+/)*(?:vdka(?P<id>\w+)|(?:[^/]+/)*(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
_VALID_URL = r'https?://(?:(?P<sub_domain>%s)\.)?go\.com/(?:(?:[^/]+/)*(?P<id>vdka\w+)|(?:[^/]+/)*(?P<display_id>[^/?#]+))' % '|'.join(_SITE_INFO.keys())
_TESTS = [{
'url': 'http://abc.go.com/shows/castle/video/most-recent/vdka0_g86w5onx',
'url': 'http://abc.go.com/shows/designated-survivor/video/most-recent/VDKA3807643',
'info_dict': {
'id': '0_g86w5onx',
'id': 'VDKA3807643',
'ext': 'mp4',
'title': 'Sneak Peek: Language Arts',
'description': 'md5:7dcdab3b2d17e5217c953256af964e9c',
'title': 'The Traitor in the White House',
'description': 'md5:05b009d2d145a1e85d25111bd37222e8',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://abc.go.com/shows/after-paradise/video/most-recent/vdka3335601',
'only_matching': True,
'url': 'http://watchdisneyxd.go.com/doraemon',
'info_dict': {
'title': 'Doraemon',
'id': 'SH55574025',
},
'playlist_mincount': 51,
}, {
'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding',
'only_matching': True,
@@ -60,19 +64,36 @@ class GoIE(AdobePassIE):
'only_matching': True,
}]
def _extract_videos(self, brand, video_id='-1', show_id='-1'):
display_id = video_id if video_id != '-1' else show_id
return self._download_json(
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/%s/-1/%s/-1/-1.json' % (brand, show_id, video_id),
display_id)['video']
def _real_extract(self, url):
sub_domain, video_id, display_id = re.match(self._VALID_URL, url).groups()
site_info = self._SITE_INFO[sub_domain]
brand = site_info['brand']
if not video_id:
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
# There may be inner quotes, e.g. data-video-id="'VDKA3609139'"
# from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood
r'data-video-id=["\']*VDKA(\w+)', webpage, 'video id')
site_info = self._SITE_INFO[sub_domain]
brand = site_info['brand']
video_data = self._download_json(
'http://api.contents.watchabc.go.com/vp2/ws/contents/3000/videos/%s/001/-1/-1/-1/%s/-1/-1.json' % (brand, video_id),
video_id)['video'][0]
r'data-video-id=["\']*(VDKA\w+)', webpage, 'video id', default=None)
if not video_id:
# show extraction works for Disney, DisneyJunior and DisneyXD
# ABC and Freeform has different layout
show_id = self._search_regex(r'data-show-id=["\']*(SH\d+)', webpage, 'show id')
videos = self._extract_videos(brand, show_id=show_id)
show_title = self._search_regex(r'data-show-title="([^"]+)"', webpage, 'show title', fatal=False)
entries = []
for video in videos:
entries.append(self.url_result(
video['url'], 'Go', video.get('id'), video.get('title')))
entries.reverse()
return self.playlist_result(entries, show_id, show_title)
video_data = self._extract_videos(brand, video_id)[0]
video_id = video_data['id']
title = video_data['title']
formats = []
@@ -105,7 +126,7 @@ class GoIE(AdobePassIE):
self._initialize_geo_bypass(['US'])
entitlement = self._download_json(
'https://api.entitlement.watchabc.go.com/vp2/ws-secure/entitlement/2020/authorize.json',
video_id, data=urlencode_postdata(data), headers=self.geo_verification_headers())
video_id, data=urlencode_postdata(data))
errors = entitlement.get('errors', {}).get('errors', [])
if errors:
for error in errors:

View File

@@ -87,8 +87,8 @@ class InfoQIE(BokeCCBaseIE):
def _extract_http_audio(self, webpage, video_id):
fields = self._hidden_inputs(webpage)
http_audio_url = fields['filename']
if http_audio_url is None:
http_audio_url = fields.get('filename')
if not http_audio_url:
return []
cookies_header = {'Cookie': self._extract_cookies(webpage)}

View File

@@ -1,6 +1,8 @@
# coding: utf-8
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
ExtractorError,
@@ -8,15 +10,15 @@ from ..utils import (
urlencode_postdata,
xpath_element,
xpath_text,
urljoin,
update_url_query,
js_to_json,
)
class Laola1TvEmbedIE(InfoExtractor):
IE_NAME = 'laola1tv:embed'
_VALID_URL = r'https?://(?:www\.)?laola1\.tv/titanplayer\.php\?.*?\bvideoid=(?P<id>\d+)'
_TEST = {
_TESTS = [{
# flashvars.premium = "false";
'url': 'https://www.laola1.tv/titanplayer.php?videoid=708065&type=V&lang=en&portal=int&customer=1024',
'info_dict': {
@@ -26,7 +28,30 @@ class Laola1TvEmbedIE(InfoExtractor):
'uploader': 'ITTF - International Table Tennis Federation',
'upload_date': '20161211',
},
}
}]
def _extract_token_url(self, stream_access_url, video_id, data):
return self._download_json(
stream_access_url, video_id, headers={
'Content-Type': 'application/json',
}, data=json.dumps(data).encode())['data']['stream-access'][0]
def _extract_formats(self, token_url, video_id):
token_doc = self._download_xml(
token_url, video_id, 'Downloading token',
headers=self.geo_verification_headers())
token_attrib = xpath_element(token_doc, './/token').attrib
if token_attrib['status'] != '0':
raise ExtractorError(
'Token error: %s' % token_attrib['comment'], expected=True)
formats = self._extract_akamai_formats(
'%s?hdnea=%s' % (token_attrib['url'], token_attrib['auth']),
video_id)
self._sort_formats(formats)
return formats
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -68,29 +93,16 @@ class Laola1TvEmbedIE(InfoExtractor):
else:
data_abo = urlencode_postdata(
dict((i, v) for i, v in enumerate(_v('req_liga_abos').split(','))))
token_url = self._download_json(
'https://club.laola1.tv/sp/laola1/api/v3/user/session/premium/player/stream-access',
video_id, query={
stream_access_url = update_url_query(
'https://club.laola1.tv/sp/laola1/api/v3/user/session/premium/player/stream-access', {
'videoId': _v('id'),
'target': self._search_regex(r'vs_target = (\d+);', webpage, 'vs target'),
'label': _v('label'),
'area': _v('area'),
}, data=data_abo)['data']['stream-access'][0]
})
token_url = self._extract_token_url(stream_access_url, video_id, data_abo)
token_doc = self._download_xml(
token_url, video_id, 'Downloading token',
headers=self.geo_verification_headers())
token_attrib = xpath_element(token_doc, './/token').attrib
if token_attrib['status'] != '0':
raise ExtractorError(
'Token error: %s' % token_attrib['comment'], expected=True)
formats = self._extract_akamai_formats(
'%s?hdnea=%s' % (token_attrib['url'], token_attrib['auth']),
video_id)
self._sort_formats(formats)
formats = self._extract_formats(token_url, video_id)
categories_str = _v('meta_sports')
categories = categories_str.split(',') if categories_str else []
@@ -107,7 +119,7 @@ class Laola1TvEmbedIE(InfoExtractor):
}
class Laola1TvIE(InfoExtractor):
class Laola1TvIE(Laola1TvEmbedIE):
IE_NAME = 'laola1tv'
_VALID_URL = r'https?://(?:www\.)?laola1\.tv/[a-z]+-[a-z]+/[^/]+/(?P<id>[^/?#&]+)'
_TESTS = [{
@@ -164,13 +176,42 @@ class Laola1TvIE(InfoExtractor):
if 'Dieser Livestream ist bereits beendet.' in webpage:
raise ExtractorError('This live stream has already finished.', expected=True)
iframe_url = urljoin(url, self._search_regex(
r'<iframe[^>]*?id="videoplayer"[^>]*?src="([^"]+)"',
webpage, 'iframe url'))
conf = self._parse_json(self._search_regex(
r'(?s)conf\s*=\s*({.+?});', webpage, 'conf'),
display_id, js_to_json)
video_id = conf['videoid']
config = self._download_json(conf['configUrl'], video_id, query={
'videoid': video_id,
'partnerid': conf['partnerid'],
'language': conf.get('language', ''),
'portal': conf.get('portalid', ''),
})
error = config.get('error')
if error:
raise ExtractorError('%s said: %s' % (self.IE_NAME, error), expected=True)
video_data = config['video']
title = video_data['title']
is_live = video_data.get('isLivestream') and video_data.get('isLive')
meta = video_data.get('metaInformation')
sports = meta.get('sports')
categories = sports.split(',') if sports else []
token_url = self._extract_token_url(
video_data['streamAccess'], video_id,
video_data['abo']['required'])
formats = self._extract_formats(token_url, video_id)
return {
'_type': 'url',
'id': video_id,
'display_id': display_id,
'url': iframe_url,
'ie_key': 'Laola1TvEmbed',
'title': self._live_title(title) if is_live else title,
'description': video_data.get('description'),
'thumbnail': video_data.get('image'),
'categories': categories,
'formats': formats,
'is_live': is_live,
}

View File

@@ -23,7 +23,6 @@ from ..utils import (
str_or_none,
url_basename,
urshift,
update_url_query,
)
@@ -51,7 +50,7 @@ class LeIE(InfoExtractor):
'id': '1415246',
'ext': 'mp4',
'title': '美人天下01',
'description': 'md5:f88573d9d7225ada1359eaf0dbf8bcda',
'description': 'md5:28942e650e82ed4fcc8e4de919ee854d',
},
'params': {
'hls_prefer_native': True,
@@ -69,7 +68,6 @@ class LeIE(InfoExtractor):
'params': {
'hls_prefer_native': True,
},
'skip': 'Only available in China',
}, {
'url': 'http://sports.le.com/video/25737697.html',
'only_matching': True,
@@ -81,7 +79,7 @@ class LeIE(InfoExtractor):
'only_matching': True,
}]
# ror() and calc_time_key() are reversed from a embedded swf file in KLetvPlayer.swf
# ror() and calc_time_key() are reversed from a embedded swf file in LetvPlayer.swf
def ror(self, param1, param2):
_loc3_ = 0
while _loc3_ < param2:
@@ -90,15 +88,8 @@ class LeIE(InfoExtractor):
return param1
def calc_time_key(self, param1):
_loc2_ = 773625421
_loc3_ = self.ror(param1, _loc2_ % 13)
_loc3_ = _loc3_ ^ _loc2_
_loc3_ = self.ror(_loc3_, _loc2_ % 17)
return _loc3_
# reversed from http://jstatic.letvcdn.com/sdk/player.js
def get_mms_key(self, time):
return self.ror(time, 8) ^ 185025305
_loc2_ = 185025305
return self.ror(param1, _loc2_ % 17) ^ _loc2_
# see M3U8Encryption class in KLetvPlayer.swf
@staticmethod
@@ -122,7 +113,7 @@ class LeIE(InfoExtractor):
def _check_errors(self, play_json):
# Check for errors
playstatus = play_json['playstatus']
playstatus = play_json['msgs']['playstatus']
if playstatus['status'] == 0:
flag = playstatus['flag']
if flag == 1:
@@ -134,58 +125,31 @@ class LeIE(InfoExtractor):
media_id = self._match_id(url)
page = self._download_webpage(url, media_id)
play_json_h5 = self._download_json(
'http://api.le.com/mms/out/video/playJsonH5',
media_id, 'Downloading html5 playJson data', query={
'id': media_id,
'platid': 3,
'splatid': 304,
'format': 1,
'tkey': self.get_mms_key(int(time.time())),
'domain': 'www.le.com',
'tss': 'no',
},
headers=self.geo_verification_headers())
self._check_errors(play_json_h5)
play_json_flash = self._download_json(
'http://api.le.com/mms/out/video/playJson',
'http://player-pc.le.com/mms/out/video/playJson',
media_id, 'Downloading flash playJson data', query={
'id': media_id,
'platid': 1,
'splatid': 101,
'format': 1,
'source': 1000,
'tkey': self.calc_time_key(int(time.time())),
'domain': 'www.le.com',
'region': 'cn',
},
headers=self.geo_verification_headers())
self._check_errors(play_json_flash)
def get_h5_urls(media_url, format_id):
location = self._download_json(
media_url, media_id,
'Download JSON metadata for format %s' % format_id, query={
'format': 1,
'expect': 3,
'tss': 'no',
})['location']
return {
'http': update_url_query(location, {'tss': 'no'}),
'hls': update_url_query(location, {'tss': 'ios'}),
}
def get_flash_urls(media_url, format_id):
media_url += '&' + compat_urllib_parse_urlencode({
'm3v': 1,
'format': 1,
'expect': 3,
'rateid': format_id,
})
nodes_data = self._download_json(
media_url, media_id,
'Download JSON metadata for format %s' % format_id)
'Download JSON metadata for format %s' % format_id,
query={
'm3v': 1,
'format': 1,
'expect': 3,
'tss': 'ios',
})
req = self._request_webpage(
nodes_data['nodelist'][0]['location'], media_id,
@@ -199,29 +163,28 @@ class LeIE(InfoExtractor):
extracted_formats = []
formats = []
for play_json, get_urls in ((play_json_h5, get_h5_urls), (play_json_flash, get_flash_urls)):
playurl = play_json['playurl']
play_domain = playurl['domain'][0]
playurl = play_json_flash['msgs']['playurl']
play_domain = playurl['domain'][0]
for format_id, format_data in playurl.get('dispatch', []).items():
if format_id in extracted_formats:
continue
extracted_formats.append(format_id)
for format_id, format_data in playurl.get('dispatch', []).items():
if format_id in extracted_formats:
continue
extracted_formats.append(format_id)
media_url = play_domain + format_data[0]
for protocol, format_url in get_urls(media_url, format_id).items():
f = {
'url': format_url,
'ext': determine_ext(format_data[1]),
'format_id': '%s-%s' % (protocol, format_id),
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
'quality': int_or_none(format_id),
}
media_url = play_domain + format_data[0]
for protocol, format_url in get_flash_urls(media_url, format_id).items():
f = {
'url': format_url,
'ext': determine_ext(format_data[1]),
'format_id': '%s-%s' % (protocol, format_id),
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
'quality': int_or_none(format_id),
}
if format_id[-1:] == 'p':
f['height'] = int_or_none(format_id[:-1])
if format_id[-1:] == 'p':
f['height'] = int_or_none(format_id[:-1])
formats.append(f)
formats.append(f)
self._sort_formats(formats, ('height', 'quality', 'format_id'))
publish_time = parse_iso8601(self._html_search_regex(

View File

@@ -86,7 +86,7 @@ class LEGOIE(InfoExtractor):
formats = self._extract_akamai_formats(
'%si/s/public/%s_,%s,.mp4.csmil/master.m3u8' % (streaming_base, path, streaming_path), video_id)
m3u8_formats = list(filter(
lambda f: f.get('protocol') == 'm3u8_native' and f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
lambda f: f.get('protocol') == 'm3u8_native' and f.get('vcodec') != 'none',
formats))
if len(m3u8_formats) == len(self._BITRATES):
self._sort_formats(m3u8_formats)

View File

@@ -0,0 +1,97 @@
# coding: utf-8
from __future__ import unicode_literals
from .brightcove import BrightcoveNewIE
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
smuggle_url,
try_get,
)
class NoovoIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?noovo\.ca/videos/(?P<id>[^/]+/[^/?#&]+)'
_TESTS = [{
# clip
'url': 'http://noovo.ca/videos/rpm-plus/chrysler-imperial',
'info_dict': {
'id': '5386045029001',
'ext': 'mp4',
'title': 'Chrysler Imperial',
'description': 'md5:de3c898d1eb810f3e6243e08c8b4a056',
'timestamp': 1491399228,
'upload_date': '20170405',
'uploader_id': '618566855001',
'creator': 'vtele',
'view_count': int,
'series': 'RPM+',
},
'params': {
'skip_download': True,
},
}, {
# episode
'url': 'http://noovo.ca/videos/l-amour-est-dans-le-pre/episode-13-8',
'info_dict': {
'id': '5395865725001',
'title': 'Épisode 13 : Les retrouvailles',
'description': 'md5:336d5ebc5436534e61d16e63ddfca327',
'ext': 'mp4',
'timestamp': 1492019320,
'upload_date': '20170412',
'uploader_id': '618566855001',
'creator': 'vtele',
'view_count': int,
'series': "L'amour est dans le pré",
'season_number': 5,
'episode': 'Épisode 13',
'episode_number': 13,
},
'params': {
'skip_download': True,
},
}]
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/618566855001/default_default/index.html?videoId=%s'
def _real_extract(self, url):
video_id = self._match_id(url)
data = self._download_json(
'http://api.noovo.ca/api/v1/pages/single-episode/%s' % video_id,
video_id)['data']
content = try_get(data, lambda x: x['contents'][0])
brightcove_id = data.get('brightcoveId') or content['brightcoveId']
series = try_get(
data, (
lambda x: x['show']['title'],
lambda x: x['season']['show']['title']),
compat_str)
episode = None
og = data.get('og')
if isinstance(og, dict) and og.get('type') == 'video.episode':
episode = og.get('title')
video = content or data
return {
'_type': 'url_transparent',
'ie_key': BrightcoveNewIE.ie_key(),
'url': smuggle_url(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
{'geo_countries': ['CA']}),
'id': brightcove_id,
'title': video.get('title'),
'creator': video.get('source'),
'view_count': int_or_none(video.get('viewsCount')),
'series': series,
'season_number': int_or_none(try_get(
data, lambda x: x['season']['seasonNumber'])),
'episode': episode,
'episode_number': int_or_none(data.get('episodeNumber')),
}

View File

@@ -148,13 +148,34 @@ class NRKBaseIE(InfoExtractor):
vcodec = 'none' if data.get('mediaType') == 'Audio' else None
# TODO: extract chapters when https://github.com/rg3/youtube-dl/pull/9409 is merged
for entry in entries:
entry.update(common_info)
for f in entry['formats']:
f['vcodec'] = vcodec
points = data.get('shortIndexPoints')
if isinstance(points, list):
chapters = []
for next_num, point in enumerate(points, start=1):
if not isinstance(point, dict):
continue
start_time = parse_duration(point.get('startPoint'))
if start_time is None:
continue
end_time = parse_duration(
data.get('duration')
if next_num == len(points)
else points[next_num].get('startPoint'))
if end_time is None:
continue
chapters.append({
'start_time': start_time,
'end_time': end_time,
'title': point.get('title'),
})
if chapters and len(entries) == 1:
entries[0]['chapters'] = chapters
return self.playlist_result(entries, video_id, title, description)

View File

@@ -8,7 +8,9 @@ from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
float_or_none,
js_to_json,
orderedSet,
strip_jsonp,
strip_or_none,
unified_strdate,
@@ -263,6 +265,13 @@ class PBSIE(InfoExtractor):
},
'playlist_count': 2,
},
{
'url': 'http://www.pbs.org/wgbh/americanexperience/films/great-war/',
'info_dict': {
'id': 'great-war',
},
'playlist_count': 3,
},
{
'url': 'http://www.pbs.org/wgbh/americanexperience/films/death/player/',
'info_dict': {
@@ -381,10 +390,10 @@ class PBSIE(InfoExtractor):
# tabbed frontline videos
MULTI_PART_REGEXES = (
r'<div[^>]+class="videotab[^"]*"[^>]+vid="(\d+)"',
r'<a[^>]+href=["\']#video-\d+["\'][^>]+data-coveid=["\'](\d+)',
r'<a[^>]+href=["\']#(?:video-|part)\d+["\'][^>]+data-cove[Ii]d=["\'](\d+)',
)
for p in MULTI_PART_REGEXES:
tabbed_videos = re.findall(p, webpage)
tabbed_videos = orderedSet(re.findall(p, webpage))
if tabbed_videos:
return tabbed_videos, presumptive_id, upload_date, description
@@ -464,6 +473,7 @@ class PBSIE(InfoExtractor):
redirects.append(redirect)
redirect_urls.add(redirect_url)
chapters = []
# Player pages may also serve different qualities
for page in ('widget/partnerplayer', 'portalplayer'):
player = self._download_webpage(
@@ -479,6 +489,20 @@ class PBSIE(InfoExtractor):
extract_redirect_urls(video_info)
if not info:
info = video_info
if not chapters:
for chapter_data in re.findall(r'(?s)chapters\.push\(({.*?})\)', player):
chapter = self._parse_json(chapter_data, video_id, js_to_json, fatal=False)
if not chapter:
continue
start_time = float_or_none(chapter.get('start_time'), 1000)
duration = float_or_none(chapter.get('duration'), 1000)
if start_time is None or duration is None:
continue
chapters.append({
'start_time': start_time,
'end_time': start_time + duration,
'title': chapter.get('title'),
})
formats = []
http_url = None
@@ -515,7 +539,7 @@ class PBSIE(InfoExtractor):
http_url = format_url
self._remove_duplicate_formats(formats)
m3u8_formats = list(filter(
lambda f: f.get('protocol') == 'm3u8' and f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
lambda f: f.get('protocol') == 'm3u8' and f.get('vcodec') != 'none',
formats))
if http_url:
for m3u8_format in m3u8_formats:
@@ -588,4 +612,5 @@ class PBSIE(InfoExtractor):
'upload_date': upload_date,
'formats': formats,
'subtitles': subtitles,
'chapters': chapters,
}

View File

@@ -33,7 +33,7 @@ class PornHubIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
(?:[a-z]+\.)?pornhub\.com/(?:view_video\.php\?viewkey=|embed/)|
(?:[a-z]+\.)?pornhub\.com/(?:(?:view_video\.php|video/show)\?viewkey=|embed/)|
(?:www\.)?thumbzilla\.com/video/
)
(?P<id>[\da-z]+)
@@ -97,6 +97,9 @@ class PornHubIE(InfoExtractor):
}, {
'url': 'https://www.thumbzilla.com/video/ph56c6114abd99a/horny-girlfriend-sex',
'only_matching': True,
}, {
'url': 'http://www.pornhub.com/video/show?viewkey=648719015',
'only_matching': True,
}]
@staticmethod

View File

@@ -62,8 +62,7 @@ class R7IE(InfoExtractor):
# m3u8 format always matches the http format, let's copy metadata from
# one to another
m3u8_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
formats))
lambda f: f.get('vcodec') != 'none', formats))
if len(m3u8_formats) == 1:
f_copy = m3u8_formats[0].copy()
f_copy.update(f)

View File

@@ -13,21 +13,20 @@ class RMCDecouverteIE(InfoExtractor):
_VALID_URL = r'https?://rmcdecouverte\.bfmtv\.com/mediaplayer-replay.*?\bid=(?P<id>\d+)'
_TEST = {
'url': 'http://rmcdecouverte.bfmtv.com/mediaplayer-replay/?id=1430&title=LES%20HEROS%20DU%2088e%20ETAGE',
'url': 'http://rmcdecouverte.bfmtv.com/mediaplayer-replay/?id=13502&title=AQUAMEN:LES%20ROIS%20DES%20AQUARIUMS%20:UN%20DELICIEUX%20PROJET',
'info_dict': {
'id': '5111223049001',
'id': '5419055995001',
'ext': 'mp4',
'title': ': LES HEROS DU 88e ETAGE',
'description': 'Découvrez comment la bravoure de deux hommes dans la Tour Nord du World Trade Center a sauvé la vie d\'innombrables personnes le 11 septembre 2001.',
'title': 'UN DELICIEUX PROJET',
'description': 'md5:63610df7c8b1fc1698acd4d0d90ba8b5',
'uploader_id': '1969646226001',
'upload_date': '20160904',
'timestamp': 1472951103,
'upload_date': '20170502',
'timestamp': 1493745308,
},
'params': {
# rtmp download
'skip_download': True,
},
'skip': 'Only works from France',
'skip': 'only available for a week',
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/1969646226001/default_default/index.html?videoId=%s'
@@ -35,5 +34,12 @@ class RMCDecouverteIE(InfoExtractor):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
brightcove_legacy_url = BrightcoveLegacyIE._extract_brightcove_url(webpage)
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(brightcove_legacy_url).query)['@videoPlayer'][0]
return self.url_result(self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew', brightcove_id)
if brightcove_legacy_url:
brightcove_id = compat_parse_qs(compat_urlparse.urlparse(
brightcove_legacy_url).query)['@videoPlayer'][0]
else:
brightcove_id = self._search_regex(
r'data-video-id=["\'](\d+)', webpage, 'brightcove id')
return self.url_result(
self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id, 'BrightcoveNew',
brightcove_id)

View File

@@ -12,7 +12,7 @@ from ..utils import (
class StreamableIE(InfoExtractor):
_VALID_URL = r'https?://streamable\.com/(?:e/)?(?P<id>\w+)'
_VALID_URL = r'https?://streamable\.com/(?:[es]/)?(?P<id>\w+)'
_TESTS = [
{
'url': 'https://streamable.com/dnd1',
@@ -47,6 +47,10 @@ class StreamableIE(InfoExtractor):
{
'url': 'https://streamable.com/e/dnd1',
'only_matching': True,
},
{
'url': 'https://streamable.com/s/okkqk/drxjds',
'only_matching': True,
}
]

View File

@@ -210,7 +210,7 @@ class TEDIE(InfoExtractor):
resources.get('stream'), video_name, 'mp4', m3u8_id=format_id, fatal=False))
m3u8_formats = list(filter(
lambda f: f.get('protocol') == 'm3u8' and f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
lambda f: f.get('protocol') == 'm3u8' and f.get('vcodec') != 'none',
formats))
if http_url:
for m3u8_format in m3u8_formats:

View File

@@ -80,14 +80,33 @@ class ThePlatformBaseIE(OnceIE):
'url': src,
})
duration = info.get('duration')
tp_chapters = info.get('chapters', [])
chapters = []
if tp_chapters:
def _add_chapter(start_time, end_time):
start_time = float_or_none(start_time, 1000)
end_time = float_or_none(end_time, 1000)
if start_time is None or end_time is None:
return
chapters.append({
'start_time': start_time,
'end_time': end_time,
})
for chapter in tp_chapters[:-1]:
_add_chapter(chapter.get('startTime'), chapter.get('endTime'))
_add_chapter(tp_chapters[-1].get('startTime'), tp_chapters[-1].get('endTime') or duration)
return {
'title': info['title'],
'subtitles': subtitles,
'description': info['description'],
'thumbnail': info['defaultThumbnailUrl'],
'duration': int_or_none(info.get('duration'), 1000),
'duration': float_or_none(duration, 1000),
'timestamp': int_or_none(info.get('pubDate'), 1000) or None,
'uploader': info.get('billingCode'),
'chapters': chapters,
}
def _extract_theplatform_metadata(self, path, video_id):

View File

@@ -150,8 +150,7 @@ class TVPEmbedIE(InfoExtractor):
'mp4', 'm3u8_native', m3u8_id='hls', fatal=False)
self._sort_formats(m3u8_formats)
m3u8_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
m3u8_formats))
lambda f: f.get('vcodec') != 'none', m3u8_formats))
formats.extend(m3u8_formats)
for i, m3u8_format in enumerate(m3u8_formats, 2):
http_url = '%s-%d.mp4' % (video_url_base, i)

View File

@@ -2,9 +2,13 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
extract_attributes,
try_get,
urlencode_postdata,
ExtractorError,
)
@@ -34,25 +38,32 @@ class TVPlayerIE(InfoExtractor):
webpage, 'channel element'))
title = current_channel['data-name']
resource_id = self._search_regex(
r'resourceId\s*=\s*"(\d+)"', webpage, 'resource id')
platform = self._search_regex(
r'platform\s*=\s*"([^"]+)"', webpage, 'platform')
resource_id = current_channel['data-id']
token = self._search_regex(
r'token\s*=\s*"([^"]+)"', webpage, 'token', default='null')
validate = self._search_regex(
r'validate\s*=\s*"([^"]+)"', webpage, 'validate', default='null')
r'data-token=(["\'])(?P<token>(?!\1).+)\1', webpage,
'token', group='token')
context = self._download_json(
'https://tvplayer.com/watch/context', display_id,
'Downloading JSON context', query={
'resource': resource_id,
'nonce': token,
})
validate = context['validate']
platform = try_get(
context, lambda x: x['platform']['key'], compat_str) or 'firefox'
try:
response = self._download_json(
'http://api.tvplayer.com/api/v2/stream/live',
resource_id, headers={
display_id, 'Downloading JSON stream', headers={
'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
}, data=urlencode_postdata({
'id': resource_id,
'service': 1,
'platform': platform,
'id': resource_id,
'token': token,
'validate': validate,
}))['tvplayer']['response']
except ExtractorError as e:
@@ -63,7 +74,7 @@ class TVPlayerIE(InfoExtractor):
'%s said: %s' % (self.IE_NAME, response['error']), expected=True)
raise
formats = self._extract_m3u8_formats(response['stream'], resource_id, 'mp4')
formats = self._extract_m3u8_formats(response['stream'], display_id, 'mp4')
self._sort_formats(formats)
return {

View File

@@ -1,6 +1,7 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..compat import (
@@ -11,7 +12,6 @@ from ..compat import (
from ..utils import (
ExtractorError,
int_or_none,
sanitized_Request,
parse_iso8601,
)
@@ -154,19 +154,24 @@ class VevoIE(VevoBaseIE):
}
def _initialize_api(self, video_id):
req = sanitized_Request(
'http://www.vevo.com/auth', data=b'')
webpage = self._download_webpage(
req, None,
'https://accounts.vevo.com/token', None,
note='Retrieving oauth token',
errnote='Unable to retrieve oauth token')
errnote='Unable to retrieve oauth token',
data=json.dumps({
'client_id': 'SPupX1tvqFEopQ1YS6SS',
'grant_type': 'urn:vevo:params:oauth:grant-type:anonymous',
}).encode('utf-8'),
headers={
'Content-Type': 'application/json',
})
if re.search(r'(?i)THIS PAGE IS CURRENTLY UNAVAILABLE IN YOUR REGION', webpage):
self.raise_geo_restricted(
'%s said: This page is currently unavailable in your region' % self.IE_NAME)
auth_info = self._parse_json(webpage, video_id)
self._api_url_template = self.http_scheme() + '//apiv2.vevo.com/%s?token=' + auth_info['access_token']
self._api_url_template = self.http_scheme() + '//apiv2.vevo.com/%s?token=' + auth_info['legacy_token']
def _call_api(self, path, *args, **kwargs):
try:

View File

@@ -20,7 +20,7 @@ from ..utils import (
class ViceBaseIE(AdobePassIE):
def _extract_preplay_video(self, url, webpage):
def _extract_preplay_video(self, url, locale, webpage):
watch_hub_data = extract_attributes(self._search_regex(
r'(?s)(<watch-hub\s*.+?</watch-hub>)', webpage, 'watch hub'))
video_id = watch_hub_data['vms-id']
@@ -32,7 +32,8 @@ class ViceBaseIE(AdobePassIE):
resource = self._get_mvpd_resource(
'VICELAND', title, video_id,
watch_hub_data.get('video-rating'))
query['tvetoken'] = self._extract_mvpd_auth(url, video_id, 'VICELAND', resource)
query['tvetoken'] = self._extract_mvpd_auth(
url, video_id, 'VICELAND', resource)
# signature generation algorithm is reverse engineered from signatureGenerator in
# webpack:///../shared/~/vice-player/dist/js/vice-player.js in
@@ -45,11 +46,14 @@ class ViceBaseIE(AdobePassIE):
try:
host = 'www.viceland' if is_locked else self._PREPLAY_HOST
preplay = self._download_json('https://%s.com/en_us/preplay/%s' % (host, video_id), video_id, query=query)
preplay = self._download_json(
'https://%s.com/%s/preplay/%s' % (host, locale, video_id),
video_id, query=query)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
error = json.loads(e.cause.read().decode())
raise ExtractorError('%s said: %s' % (self.IE_NAME, error['details']), expected=True)
raise ExtractorError('%s said: %s' % (
self.IE_NAME, error['details']), expected=True)
raise
video_data = preplay['video']
@@ -88,41 +92,30 @@ class ViceBaseIE(AdobePassIE):
class ViceIE(ViceBaseIE):
_VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:[^/]+/)?videos?/(?P<id>[^/?#&]+)'
IE_NAME = 'vice'
_VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:(?P<locale>[^/]+)/)?videos?/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.vice.com/video/cowboy-capitalists-part-1',
'md5': 'e9d77741f9e42ba583e683cd170660f7',
'url': 'https://news.vice.com/video/experimenting-on-animals-inside-the-monkey-lab',
'md5': '7d3ae2f9ba5f196cdd9f9efd43657ac2',
'info_dict': {
'id': '43cW1mYzpia9IlestBjVpd23Yu3afAfp',
'id': 'N2bzkydjraWDGwnt8jAttCF6Y0PDv4Zj',
'ext': 'flv',
'title': 'VICE_COWBOYCAPITALISTS_PART01_v1_VICE_WM_1080p.mov',
'duration': 725.983,
'title': 'Monkey Labs of Holland',
'description': 'md5:92b3c7dcbfe477f772dd4afa496c9149',
},
'add_ie': ['Ooyala'],
}, {
'url': 'http://www.vice.com/video/how-to-hack-a-car',
'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
'info_dict': {
'id': '3jstaBeXgAs',
'ext': 'mp4',
'title': 'How to Hack a Car: Phreaked Out (Episode 2)',
'description': 'md5:ee95453f7ff495db8efe14ae8bf56f30',
'uploader_id': 'MotherboardTV',
'uploader': 'Motherboard',
'upload_date': '20140529',
},
'add_ie': ['Youtube'],
}, {
'url': 'https://video.vice.com/en_us/video/the-signal-from-tolva/5816510690b70e6c5fd39a56',
'md5': '',
'info_dict': {
'id': '5816510690b70e6c5fd39a56',
'ext': 'mp4',
'uploader': 'Waypoint',
'title': 'The Signal From Tölva',
'description': 'md5:3927e3c79f9e8094606a2b3c5b5e55d5',
'uploader_id': '57f7d621e05ca860fa9ccaf9',
'timestamp': 1477941983938,
'timestamp': 1477941983,
'upload_date': '20161031',
},
'params': {
# m3u8 download
@@ -130,19 +123,31 @@ class ViceIE(ViceBaseIE):
},
'add_ie': ['UplynkPreplay'],
}, {
'url': 'https://news.vice.com/video/experimenting-on-animals-inside-the-monkey-lab',
'only_matching': True,
'url': 'https://video.vice.com/alps/video/ulfs-wien-beruchtigste-grafitti-crew-part-1/581b12b60a0e1f4c0fb6ea2f',
'info_dict': {
'id': '581b12b60a0e1f4c0fb6ea2f',
'ext': 'mp4',
'title': 'ULFs - Wien berüchtigste Grafitti Crew - Part 1',
'description': '<p>Zwischen Hinterzimmer-Tattoos und U-Bahnschächten erzählen uns die Ulfs, wie es ist, "süchtig nach Sachbeschädigung" zu sein.</p>',
'uploader': 'VICE',
'uploader_id': '57a204088cb727dec794c67b',
'timestamp': 1485368119,
'upload_date': '20170125',
'age_limit': 14,
},
'params': {
# AES-encrypted m3u8
'skip_download': True,
},
'add_ie': ['UplynkPreplay'],
}, {
'url': 'http://www.vice.com/ru/video/big-night-out-ibiza-clive-martin-229',
'only_matching': True,
}, {
'url': 'https://munchies.vice.com/en/videos/watch-the-trailer-for-our-new-series-the-pizza-show',
'url': 'https://video.vice.com/en_us/video/pizza-show-trailer/56d8c9a54d286ed92f7f30e4',
'only_matching': True,
}]
_PREPLAY_HOST = 'video.vice'
def _real_extract(self, url):
video_id = self._match_id(url)
locale, video_id = re.match(self._VALID_URL, url).groups()
webpage, urlh = self._download_webpage_handle(url, video_id)
embed_code = self._search_regex(
r'embedCode=([^&\'"]+)', webpage,
@@ -153,10 +158,11 @@ class ViceIE(ViceBaseIE):
r'data-youtube-id="([^"]+)"', webpage, 'youtube id', default=None)
if youtube_id:
return self.url_result(youtube_id, 'Youtube')
return self._extract_preplay_video(urlh.geturl(), webpage)
return self._extract_preplay_video(urlh.geturl(), locale, webpage)
class ViceShowIE(InfoExtractor):
IE_NAME = 'vice:show'
_VALID_URL = r'https?://(?:.+?\.)?vice\.com/(?:[^/]+/)?show/(?P<id>[^/?#&]+)'
_TEST = {
@@ -183,6 +189,86 @@ class ViceShowIE(InfoExtractor):
r'<title>(.+?)</title>', webpage, 'title', default=None)
if title:
title = re.sub(r'(.+)\s*\|\s*.+$', r'\1', title).strip()
description = self._html_search_meta('description', webpage, 'description')
description = self._html_search_meta(
'description', webpage, 'description')
return self.playlist_result(entries, show_id, title, description)
class ViceArticleIE(InfoExtractor):
IE_NAME = 'vice:article'
_VALID_URL = r'https://www.vice.com/[^/]+/article/(?P<id>[^?#]+)'
_TESTS = [{
'url': 'https://www.vice.com/en_us/article/on-set-with-the-woman-making-mormon-porn-in-utah',
'info_dict': {
'id': '58dc0a3dee202d2a0ccfcbd8',
'ext': 'mp4',
'title': 'Mormon War on Porn ',
'description': 'md5:ad396a2481e7f8afb5ed486878421090',
'uploader': 'VICE',
'uploader_id': '57a204088cb727dec794c693',
'timestamp': 1489160690,
'upload_date': '20170310',
},
'params': {
# AES-encrypted m3u8
'skip_download': True,
},
'add_ie': ['UplynkPreplay'],
}, {
'url': 'https://www.vice.com/en_us/article/how-to-hack-a-car',
'md5': 'a7ecf64ee4fa19b916c16f4b56184ae2',
'info_dict': {
'id': '3jstaBeXgAs',
'ext': 'mp4',
'title': 'How to Hack a Car: Phreaked Out (Episode 2)',
'description': 'md5:ee95453f7ff495db8efe14ae8bf56f30',
'uploader_id': 'MotherboardTV',
'uploader': 'Motherboard',
'upload_date': '20140529',
},
'add_ie': ['Youtube'],
}, {
'url': 'https://www.vice.com/en_us/article/cowboy-capitalists-part-1',
'only_matching': True,
}, {
'url': 'https://www.vice.com/ru/article/big-night-out-ibiza-clive-martin-229',
'only_matching': True,
}]
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
prefetch_data = self._parse_json(self._search_regex(
r'window\.__PREFETCH_DATA\s*=\s*({.*});',
webpage, 'prefetch data'), display_id)
body = prefetch_data['body']
def _url_res(video_url, ie_key):
return {
'_type': 'url_transparent',
'url': video_url,
'display_id': display_id,
'ie_key': ie_key,
}
embed_code = self._search_regex(
r'embedCode=([^&\'"]+)', body,
'ooyala embed code', default=None)
if embed_code:
return _url_res('ooyala:%s' % embed_code, 'Ooyala')
youtube_url = self._html_search_regex(
r'<iframe[^>]+src="(.*youtube\.com/.*)"',
body, 'YouTube URL', default=None)
if youtube_url:
return _url_res(youtube_url, 'Youtube')
video_url = self._html_search_regex(
r'data-video-url="([^"]+)"',
prefetch_data['embed_code'], 'video URL')
return _url_res(video_url, ViceIE.ie_key())

View File

@@ -1,11 +1,13 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .vice import ViceBaseIE
class VicelandIE(ViceBaseIE):
_VALID_URL = r'https?://(?:www\.)?viceland\.com/[^/]+/video/[^/]+/(?P<id>[a-f0-9]+)'
_VALID_URL = r'https?://(?:www\.)?viceland\.com/(?P<locale>[^/]+)/video/[^/]+/(?P<id>[a-f0-9]+)'
_TEST = {
'url': 'https://www.viceland.com/en_us/video/trapped/588a70d0dba8a16007de7316',
'info_dict': {
@@ -24,10 +26,13 @@ class VicelandIE(ViceBaseIE):
'skip_download': True,
},
'add_ie': ['UplynkPreplay'],
'skip': '404',
}
_PREPLAY_HOST = 'www.viceland'
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
locale = mobj.group('locale')
webpage = self._download_webpage(url, video_id)
return self._extract_preplay_video(url, webpage)
return self._extract_preplay_video(url, locale, webpage)

View File

@@ -1,7 +1,6 @@
# coding: utf-8
from __future__ import unicode_literals
import random
import re
from .common import InfoExtractor
@@ -11,6 +10,7 @@ from ..utils import (
float_or_none,
parse_age_limit,
qualities,
random_birthday,
try_get,
unified_timestamp,
urljoin,
@@ -47,13 +47,10 @@ class VideoPressIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
query = random_birthday('birth_year', 'birth_month', 'birth_day')
video = self._download_json(
'https://public-api.wordpress.com/rest/v1.1/videos/%s' % video_id,
video_id, query={
'birth_month': random.randint(1, 12),
'birth_day': random.randint(1, 31),
'birth_year': random.randint(1950, 1995),
})
video_id, query=query)
title = video['title']

View File

@@ -176,8 +176,7 @@ class ViewsterIE(InfoExtractor):
if m3u8_formats:
self._sort_formats(m3u8_formats)
m3u8_formats = list(filter(
lambda f: f.get('vcodec') != 'none' and f.get('resolution') != 'multiple',
m3u8_formats))
lambda f: f.get('vcodec') != 'none', m3u8_formats))
if len(qualities) == len(m3u8_formats):
for q, m3u8_format in zip(qualities, m3u8_formats):
f = m3u8_format.copy()

View File

@@ -13,6 +13,7 @@ from ..utils import (
class WashingtonPostIE(InfoExtractor):
IE_NAME = 'washingtonpost'
_VALID_URL = r'(?:washingtonpost:|https?://(?:www\.)?washingtonpost\.com/video/(?:[^/]+/)*)(?P<id>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'
_EMBED_URL = r'https?://(?:www\.)?washingtonpost\.com/video/c/embed/[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}'
_TEST = {
'url': 'https://www.washingtonpost.com/video/c/video/480ba4ee-1ec7-11e6-82c2-a7dcb313287d',
'md5': '6f537e1334b714eb15f9563bd4b9cdfa',
@@ -27,6 +28,11 @@ class WashingtonPostIE(InfoExtractor):
},
}
@classmethod
def _extract_urls(cls, webpage):
return re.findall(
r'<iframe[^>]+\bsrc=["\'](%s)' % cls._EMBED_URL, webpage)
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._download_json(

View File

@@ -6,6 +6,7 @@ import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
js_to_json,
orderedSet,
parse_duration,
sanitized_Request,
@@ -37,6 +38,22 @@ class XTubeIE(InfoExtractor):
'comment_count': int,
'age_limit': 18,
}
}, {
# FLV videos with duplicated formats
'url': 'http://www.xtube.com/video-watch/A-Super-Run-Part-1-YT-9299752',
'md5': 'a406963eb349dd43692ec54631efd88b',
'info_dict': {
'id': '9299752',
'display_id': 'A-Super-Run-Part-1-YT',
'ext': 'flv',
'title': 'A Super Run - Part 1 (YT)',
'description': 'md5:ca0d47afff4a9b2942e4b41aa970fd93',
'uploader': 'tshirtguy59',
'duration': 579,
'view_count': int,
'comment_count': int,
'age_limit': 18,
},
}, {
# new URL schema
'url': 'http://www.xtube.com/video-watch/strange-erotica-625837',
@@ -68,8 +85,9 @@ class XTubeIE(InfoExtractor):
})
sources = self._parse_json(self._search_regex(
r'(["\'])sources\1\s*:\s*(?P<sources>{.+?}),',
webpage, 'sources', group='sources'), video_id)
r'(["\'])?sources\1?\s*:\s*(?P<sources>{.+?}),',
webpage, 'sources', group='sources'), video_id,
transform_source=js_to_json)
formats = []
for format_id, format_url in sources.items():
@@ -78,6 +96,7 @@ class XTubeIE(InfoExtractor):
'format_id': format_id,
'height': int_or_none(format_id),
})
self._remove_duplicate_formats(formats)
self._sort_formats(formats)
title = self._search_regex(

View File

@@ -6,8 +6,10 @@ from .common import InfoExtractor
from ..compat import compat_urllib_parse_unquote
from ..utils import (
clean_html,
ExtractorError,
determine_ext,
ExtractorError,
int_or_none,
parse_duration,
)
@@ -20,6 +22,7 @@ class XVideosIE(InfoExtractor):
'id': '4588838',
'ext': 'mp4',
'title': 'Biker Takes his Girl',
'duration': 108,
'age_limit': 18,
}
}
@@ -36,6 +39,11 @@ class XVideosIE(InfoExtractor):
r'<title>(.*?)\s+-\s+XVID', webpage, 'title')
video_thumbnail = self._search_regex(
r'url_bigthumb=(.+?)&amp', webpage, 'thumbnail', fatal=False)
video_duration = int_or_none(self._og_search_property(
'duration', webpage, default=None)) or parse_duration(
self._search_regex(
r'<span[^>]+class=["\']duration["\'][^>]*>.*?(\d[^<]+)',
webpage, 'duration', fatal=False))
formats = []
@@ -67,6 +75,7 @@ class XVideosIE(InfoExtractor):
'id': video_id,
'formats': formats,
'title': video_title,
'duration': video_duration,
'thumbnail': video_thumbnail,
'age_limit': 18,
}

View File

@@ -234,7 +234,8 @@ class YandexMusicPlaylistIE(YandexMusicPlaylistBaseIE):
'overembed': 'false',
})['playlist']
tracks, track_ids = playlist['tracks'], map(compat_str, playlist['trackIds'])
tracks = playlist['tracks']
track_ids = [compat_str(track_id) for track_id in playlist['trackIds']]
# tracks dictionary shipped with playlist.jsx API is limited to 150 tracks,
# missing tracks should be retrieved manually.

View File

@@ -38,7 +38,6 @@ from ..utils import (
parse_duration,
remove_quotes,
remove_start,
sanitized_Request,
smuggle_url,
str_to_int,
try_get,
@@ -54,7 +53,11 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
"""Provide base functions for Youtube extractors"""
_LOGIN_URL = 'https://accounts.google.com/ServiceLogin'
_TWOFACTOR_URL = 'https://accounts.google.com/signin/challenge'
_PASSWORD_CHALLENGE_URL = 'https://accounts.google.com/signin/challenge/sl/password'
_LOOKUP_URL = 'https://accounts.google.com/_/signin/sl/lookup'
_CHALLENGE_URL = 'https://accounts.google.com/_/signin/sl/challenge'
_TFA_URL = 'https://accounts.google.com/_/signin/challenge?hl=en&TL={0}'
_NETRC_MACHINE = 'youtube'
# If True it will raise an error if no login info is provided
_LOGIN_REQUIRED = False
@@ -96,72 +99,150 @@ class YoutubeBaseInfoExtractor(InfoExtractor):
login_form = self._hidden_inputs(login_page)
login_form.update({
'checkConnection': 'youtube',
'Email': username,
'Passwd': password,
})
login_results = self._download_webpage(
self._PASSWORD_CHALLENGE_URL, None,
note='Logging in', errnote='unable to log in', fatal=False,
data=urlencode_postdata(login_form))
if login_results is False:
return False
error_msg = self._html_search_regex(
r'<[^>]+id="errormsg_0_Passwd"[^>]*>([^<]+)<',
login_results, 'error message', default=None)
if error_msg:
raise ExtractorError('Unable to login: %s' % error_msg, expected=True)
if re.search(r'id="errormsg_0_Passwd"', login_results) is not None:
raise ExtractorError('Please use your account password and a two-factor code instead of an application-specific password.', expected=True)
# Two-Factor
# TODO add SMS and phone call support - these require making a request and then prompting the user
if re.search(r'(?i)<form[^>]+id="challenge"', login_results) is not None:
tfa_code = self._get_tfa_info('2-step verification code')
if not tfa_code:
self._downloader.report_warning(
'Two-factor authentication required. Provide it either interactively or with --twofactor <code>'
'(Note that only TOTP (Google Authenticator App) codes work at this time.)')
return False
tfa_code = remove_start(tfa_code, 'G-')
tfa_form_strs = self._form_hidden_inputs('challenge', login_results)
tfa_form_strs.update({
'Pin': tfa_code,
'TrustDevice': 'on',
def req(url, f_req, note, errnote):
data = login_form.copy()
data.update({
'pstMsg': 1,
'checkConnection': 'youtube',
'checkedDomains': 'youtube',
'hl': 'en',
'deviceinfo': '[null,null,null,[],null,"US",null,null,[],"GlifWebSignIn",null,[null,null,[]]]',
'f.req': json.dumps(f_req),
'flowName': 'GlifWebSignIn',
'flowEntry': 'ServiceLogin',
})
return self._download_json(
url, None, note=note, errnote=errnote,
transform_source=lambda s: re.sub(r'^[^[]*', '', s),
fatal=False,
data=urlencode_postdata(data), headers={
'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8',
'Google-Accounts-XSRF': 1,
})
tfa_data = urlencode_postdata(tfa_form_strs)
def warn(message):
self._downloader.report_warning(message)
tfa_req = sanitized_Request(self._TWOFACTOR_URL, tfa_data)
tfa_results = self._download_webpage(
tfa_req, None,
note='Submitting TFA code', errnote='unable to submit tfa', fatal=False)
lookup_req = [
username,
None, [], None, 'US', None, None, 2, False, True,
[
None, None,
[2, 1, None, 1,
'https://accounts.google.com/ServiceLogin?passive=true&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fnext%3D%252F%26action_handle_signin%3Dtrue%26hl%3Den%26app%3Ddesktop%26feature%3Dsign_in_button&hl=en&service=youtube&uilel=3&requestPath=%2FServiceLogin&Page=PasswordSeparationSignIn',
None, [], 4],
1, [None, None, []], None, None, None, True
],
username,
]
if tfa_results is False:
return False
lookup_results = req(
self._LOOKUP_URL, lookup_req,
'Looking up account info', 'Unable to look up account info')
if re.search(r'(?i)<form[^>]+id="challenge"', tfa_results) is not None:
self._downloader.report_warning('Two-factor code expired or invalid. Please try again, or use a one-use backup code instead.')
return False
if re.search(r'(?i)<form[^>]+id="gaia_loginform"', tfa_results) is not None:
self._downloader.report_warning('unable to log in - did the page structure change?')
return False
if re.search(r'smsauth-interstitial-reviewsettings', tfa_results) is not None:
self._downloader.report_warning('Your Google account has a security notice. Please log in on your web browser, resolve the notice, and try again.')
return False
if re.search(r'(?i)<form[^>]+id="gaia_loginform"', login_results) is not None:
self._downloader.report_warning('unable to log in: bad username or password')
if lookup_results is False:
return False
user_hash = try_get(lookup_results, lambda x: x[0][2], compat_str)
if not user_hash:
warn('Unable to extract user hash')
return False
challenge_req = [
user_hash,
None, 1, None, [1, None, None, None, [password, None, True]],
[
None, None, [2, 1, None, 1, 'https://accounts.google.com/ServiceLogin?passive=true&continue=https%3A%2F%2Fwww.youtube.com%2Fsignin%3Fnext%3D%252F%26action_handle_signin%3Dtrue%26hl%3Den%26app%3Ddesktop%26feature%3Dsign_in_button&hl=en&service=youtube&uilel=3&requestPath=%2FServiceLogin&Page=PasswordSeparationSignIn', None, [], 4],
1, [None, None, []], None, None, None, True
]]
challenge_results = req(
self._CHALLENGE_URL, challenge_req,
'Logging in', 'Unable to log in')
if challenge_results is False:
return
login_res = try_get(challenge_results, lambda x: x[0][5], list)
if login_res:
login_msg = try_get(login_res, lambda x: x[5], compat_str)
warn(
'Unable to login: %s' % 'Invalid password'
if login_msg == 'INCORRECT_ANSWER_ENTERED' else login_msg)
return False
res = try_get(challenge_results, lambda x: x[0][-1], list)
if not res:
warn('Unable to extract result entry')
return False
tfa = try_get(res, lambda x: x[0][0], list)
if tfa:
tfa_str = try_get(tfa, lambda x: x[2], compat_str)
if tfa_str == 'TWO_STEP_VERIFICATION':
# SEND_SUCCESS - TFA code has been successfully sent to phone
# QUOTA_EXCEEDED - reached the limit of TFA codes
status = try_get(tfa, lambda x: x[5], compat_str)
if status == 'QUOTA_EXCEEDED':
warn('Exceeded the limit of TFA codes, try later')
return False
tl = try_get(challenge_results, lambda x: x[1][2], compat_str)
if not tl:
warn('Unable to extract TL')
return False
tfa_code = self._get_tfa_info('2-step verification code')
if not tfa_code:
warn(
'Two-factor authentication required. Provide it either interactively or with --twofactor <code>'
'(Note that only TOTP (Google Authenticator App) codes work at this time.)')
return False
tfa_code = remove_start(tfa_code, 'G-')
tfa_req = [
user_hash, None, 2, None,
[
9, None, None, None, None, None, None, None,
[None, tfa_code, True, 2]
]]
tfa_results = req(
self._TFA_URL.format(tl), tfa_req,
'Submitting TFA code', 'Unable to submit TFA code')
if tfa_results is False:
return False
tfa_res = try_get(tfa_results, lambda x: x[0][5], list)
if tfa_res:
tfa_msg = try_get(tfa_res, lambda x: x[5], compat_str)
warn(
'Unable to finish TFA: %s' % 'Invalid TFA code'
if tfa_msg == 'INCORRECT_ANSWER_ENTERED' else tfa_msg)
return False
check_cookie_url = try_get(
tfa_results, lambda x: x[0][-1][2], compat_str)
else:
check_cookie_url = try_get(res, lambda x: x[2], compat_str)
if not check_cookie_url:
warn('Unable to extract CheckCookie URL')
return False
check_cookie_results = self._download_webpage(
check_cookie_url, None, 'Checking cookie', fatal=False)
if check_cookie_results is False:
return False
if 'https://myaccount.google.com/' not in check_cookie_results:
warn('Unable to log in')
return False
return True
def _real_initialize(self):
@@ -963,7 +1044,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
def _extract_signature_function(self, video_id, player_url, example_sig):
id_m = re.match(
r'.*?-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player(?:-new)?|/base)?\.(?P<ext>[a-z]+)$',
r'.*?-(?P<id>[a-zA-Z0-9_-]+)(?:/watch_as3|/html5player(?:-new)?|(?:/[a-z]{2}_[A-Z]{2})?/base)?\.(?P<ext>[a-z]+)$',
player_url)
if not id_m:
raise ExtractorError('Cannot identify player %r' % player_url)
@@ -1257,6 +1338,35 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
url = 'https://www.youtube.com/annotations_invideo?features=1&legacy=1&video_id=%s' % video_id
return self._download_webpage(url, video_id, note='Searching for annotations.', errnote='Unable to download video annotations.')
@staticmethod
def _extract_chapters(description, duration):
if not description:
return None
chapter_lines = re.findall(
r'(?:^|<br\s*/>)([^<]*<a[^>]+onclick=["\']yt\.www\.watch\.player\.seekTo[^>]+>(\d{1,2}:\d{1,2}(?::\d{1,2})?)</a>[^>]*)(?=$|<br\s*/>)',
description)
if not chapter_lines:
return None
chapters = []
for next_num, (chapter_line, time_point) in enumerate(
chapter_lines, start=1):
start_time = parse_duration(time_point)
if start_time is None:
continue
end_time = (duration if next_num == len(chapter_lines)
else parse_duration(chapter_lines[next_num][1]))
if end_time is None:
continue
chapter_title = re.sub(
r'<a[^>]+>[^<]+</a>', '', chapter_line).strip(' \t-')
chapter_title = re.sub(r'\s+', ' ', chapter_title)
chapters.append({
'start_time': start_time,
'end_time': end_time,
'title': chapter_title,
})
return chapters
def _real_extract(self, url):
url, smuggled_data = unsmuggle_url(url, {})
@@ -1399,9 +1509,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
video_title = '_'
# description
video_description = get_element_by_id("eow-description", video_webpage)
description_original = video_description = get_element_by_id("eow-description", video_webpage)
if video_description:
video_description = re.sub(r'''(?x)
description_original = video_description = re.sub(r'''(?x)
<a\s+
(?:[a-zA-Z-]+="[^"]*"\s+)*?
(?:title|href)="([^"]+)"\s+
@@ -1558,6 +1668,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if self._downloader.params.get('writeannotations', False):
video_annotations = self._extract_annotations(video_id)
chapters = self._extract_chapters(description_original, video_duration)
if 'conn' in video_info and video_info['conn'][0].startswith('rtmp'):
self.report_rtmp_download()
formats = [{
@@ -1629,7 +1741,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
player_desc = 'flash player %s' % player_version
else:
player_version = self._search_regex(
[r'html5player-([^/]+?)(?:/html5player(?:-new)?)?\.js', r'(?:www|player)-([^/]+)/base\.js'],
[r'html5player-([^/]+?)(?:/html5player(?:-new)?)?\.js',
r'(?:www|player)-([^/]+)(?:/[a-z]{2}_[A-Z]{2})?/base\.js'],
player_url,
'html5 player', fatal=False)
player_desc = 'html5 player %s' % player_version
@@ -1789,6 +1902,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'duration': video_duration,
'age_limit': 18 if age_gate else 0,
'annotations': video_annotations,
'chapters': chapters,
'webpage_url': proto + '://www.youtube.com/watch?v=%s' % video_id,
'view_count': view_count,
'like_count': like_count,

View File

@@ -0,0 +1,101 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
unified_timestamp,
)
class Zaq1IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?zaq1\.pl/video/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://zaq1.pl/video/xev0e',
'md5': '24a5eb3f052e604ae597c4d0d19b351e',
'info_dict': {
'id': 'xev0e',
'title': 'DJ NA WESELE. TANIEC Z FIGURAMI.węgrów/sokołów podlaski/siedlce/mińsk mazowiecki/warszawa',
'description': 'www.facebook.com/weseledjKontakt: 728 448 199 / 505 419 147',
'ext': 'mp4',
'duration': 511,
'timestamp': 1490896361,
'uploader': 'Anonim',
'upload_date': '20170330',
'view_count': int,
}
}, {
# malformed JSON-LD
'url': 'http://zaq1.pl/video/x81vn',
'info_dict': {
'id': 'x81vn',
'title': 'SEKRETNE ŻYCIE WALTERA MITTY',
'ext': 'mp4',
'duration': 6234,
'timestamp': 1493494860,
'uploader': 'Anonim',
'upload_date': '20170429',
'view_count': int,
},
'params': {
'skip_download': True,
},
'expected_warnings': ['Failed to parse JSON'],
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_url = self._search_regex(
r'data-video-url=(["\'])(?P<url>(?:(?!\1).)+)\1', webpage,
'video url', group='url')
info = self._search_json_ld(webpage, video_id, fatal=False)
def extract_data(field, name, fatal=False):
return self._search_regex(
r'data-%s=(["\'])(?P<field>(?:(?!\1).)+)\1' % field,
webpage, field, fatal=fatal, group='field')
if not info.get('title'):
info['title'] = extract_data('file-name', 'title', fatal=True)
if not info.get('duration'):
info['duration'] = int_or_none(extract_data('duration', 'duration'))
if not info.get('thumbnail'):
info['thumbnail'] = extract_data('photo-url', 'thumbnail')
if not info.get('timestamp'):
info['timestamp'] = unified_timestamp(self._html_search_meta(
'uploadDate', webpage, 'timestamp'))
if not info.get('interactionCount'):
info['view_count'] = int_or_none(self._html_search_meta(
'interactionCount', webpage, 'view count'))
uploader = self._html_search_regex(
r'Wideo dodał:\s*<a[^>]*>([^<]+)</a>', webpage, 'uploader',
fatal=False)
width = int_or_none(self._html_search_meta(
'width', webpage, fatal=False))
height = int_or_none(self._html_search_meta(
'height', webpage, fatal=False))
info.update({
'id': video_id,
'formats': [{
'url': video_url,
'width': width,
'height': height,
'http_headers': {
'Referer': url,
},
}],
'uploader': uploader,
})
return info

View File

@@ -4,6 +4,7 @@ import io
import os
import subprocess
import time
import re
from .common import AudioConversionError, PostProcessor
@@ -22,6 +23,7 @@ from ..utils import (
subtitles_filename,
dfxp2srt,
ISO639Utils,
replace_extension,
)
@@ -429,17 +431,40 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
filename = info['filepath']
temp_filename = prepend_extension(filename, 'temp')
in_filenames = [filename]
options = []
if info['ext'] == 'm4a':
options = ['-vn', '-acodec', 'copy']
options.extend(['-vn', '-acodec', 'copy'])
else:
options = ['-c', 'copy']
options.extend(['-c', 'copy'])
for (name, value) in metadata.items():
options.extend(['-metadata', '%s=%s' % (name, value)])
chapters = info.get('chapters', [])
if chapters:
metadata_filename = encodeFilename(replace_extension(filename, 'meta'))
with io.open(metadata_filename, 'wt', encoding='utf-8') as f:
def ffmpeg_escape(text):
return re.sub(r'(=|;|#|\\|\n)', r'\\\1', text)
metadata_file_content = ';FFMETADATA1\n'
for chapter in chapters:
metadata_file_content += '[CHAPTER]\nTIMEBASE=1/1000\n'
metadata_file_content += 'START=%d\n' % (chapter['start_time'] * 1000)
metadata_file_content += 'END=%d\n' % (chapter['end_time'] * 1000)
chapter_title = chapter.get('title')
if chapter_title:
metadata_file_content += 'title=%s\n' % ffmpeg_escape(chapter_title)
f.write(metadata_file_content)
in_filenames.append(metadata_filename)
options.extend(['-map_metadata', '1'])
self._downloader.to_screen('[ffmpeg] Adding metadata to \'%s\'' % filename)
self.run_ffmpeg(filename, temp_filename, options)
self.run_ffmpeg_multiple_files(in_filenames, temp_filename, options)
if chapters:
os.remove(metadata_filename)
os.remove(encodeFilename(filename))
os.rename(encodeFilename(temp_filename), encodeFilename(filename))
return [], info

View File

@@ -11,6 +11,7 @@ import contextlib
import ctypes
import datetime
import email.utils
import email.header
import errno
import functools
import gzip
@@ -421,8 +422,8 @@ def clean_html(html):
# Newline vs <br />
html = html.replace('\n', ' ')
html = re.sub(r'\s*<\s*br\s*/?\s*>\s*', '\n', html)
html = re.sub(r'<\s*/\s*p\s*>\s*<\s*p[^>]*>', '\n', html)
html = re.sub(r'(?u)\s*<\s*br\s*/?\s*>\s*', '\n', html)
html = re.sub(r'(?u)<\s*/\s*p\s*>\s*<\s*p[^>]*>', '\n', html)
# Strip html tags
html = re.sub('<.*?>', '', html)
# Replace html entities
@@ -1194,6 +1195,11 @@ def unified_timestamp(date_str, day_first=True):
# Remove AM/PM + timezone
date_str = re.sub(r'(?i)\s*(?:AM|PM)(?:\s+[A-Z]+)?', '', date_str)
# Remove unrecognized timezones from ISO 8601 alike timestamps
m = re.search(r'\d{1,2}:\d{1,2}(?:\.\d+)?(?P<tz>\s*[A-Z]+)$', date_str)
if m:
date_str = date_str[:-len(m.group('tz'))]
for expression in date_formats(day_first):
try:
dt = datetime.datetime.strptime(date_str, expression) - timezone + datetime.timedelta(hours=pm_delta)
@@ -2092,6 +2098,58 @@ def update_Request(req, url=None, data=None, headers={}, query={}):
return new_req
def _multipart_encode_impl(data, boundary):
content_type = 'multipart/form-data; boundary=%s' % boundary
out = b''
for k, v in data.items():
out += b'--' + boundary.encode('ascii') + b'\r\n'
if isinstance(k, compat_str):
k = k.encode('utf-8')
if isinstance(v, compat_str):
v = v.encode('utf-8')
# RFC 2047 requires non-ASCII field names to be encoded, while RFC 7578
# suggests sending UTF-8 directly. Firefox sends UTF-8, too
content = b'Content-Disposition: form-data; name="' + k + b'"\r\n\r\n' + v + b'\r\n'
if boundary.encode('ascii') in content:
raise ValueError('Boundary overlaps with data')
out += content
out += b'--' + boundary.encode('ascii') + b'--\r\n'
return out, content_type
def multipart_encode(data, boundary=None):
'''
Encode a dict to RFC 7578-compliant form-data
data:
A dict where keys and values can be either Unicode or bytes-like
objects.
boundary:
If specified a Unicode object, it's used as the boundary. Otherwise
a random boundary is generated.
Reference: https://tools.ietf.org/html/rfc7578
'''
has_specified_boundary = boundary is not None
while True:
if boundary is None:
boundary = '---------------' + str(random.randrange(0x0fffffff, 0xffffffff))
try:
out, content_type = _multipart_encode_impl(data, boundary)
break
except ValueError:
if has_specified_boundary:
raise
boundary = None
return out, content_type
def dict_get(d, key_or_keys, default=None, skip_false_values=True):
if isinstance(key_or_keys, (list, tuple)):
for key in key_or_keys:
@@ -2273,10 +2331,8 @@ def mimetype2ext(mt):
return {
'3gpp': '3gp',
'smptett+xml': 'tt',
'srt': 'srt',
'ttaf+xml': 'dfxp',
'ttml+xml': 'ttml',
'vtt': 'vtt',
'x-flv': 'flv',
'x-mp4-fragmented': 'mp4',
'x-ms-wmv': 'wmv',
@@ -2284,11 +2340,11 @@ def mimetype2ext(mt):
'x-mpegurl': 'm3u8',
'vnd.apple.mpegurl': 'm3u8',
'dash+xml': 'mpd',
'f4m': 'f4m',
'f4m+xml': 'f4m',
'hds+xml': 'f4m',
'vnd.ms-sstr+xml': 'ism',
'quicktime': 'mov',
'mp2t': 'ts',
}.get(res, res)
@@ -3757,3 +3813,11 @@ def write_xattr(path, key, value):
"Couldn't find a tool to set the xattrs. "
"Install either the python 'xattr' module, "
"or the 'xattr' binary.")
def random_birthday(year_field, month_field, day_field):
return {
year_field: str(random.randint(1950, 1995)),
month_field: str(random.randint(1, 12)),
day_field: str(random.randint(1, 31)),
}

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2017.04.26'
__version__ = '2017.05.07'