Compare commits

..

1 Commits

Author SHA1 Message Date
Sergey M․
c03f08929a Start moving to ytdl-org 2019-03-09 19:15:16 +07:00
109 changed files with 1847 additions and 5457 deletions

61
.github/ISSUE_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,61 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.03.09*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.03.09**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.03.09
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

View File

@@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.04.30
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.04.30
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.04.30. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2019.04.30**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,38 +0,0 @@
---
name: Ask question
about: Ask youtube-dl related question
title: ''
labels: 'question'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- Look through the README (http://yt-dl.org/readme) and FAQ (http://yt-dl.org/faq) for similar questions
- Search the bugtracker for similar questions: http://yt-dl.org/search-issues
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm asking a question
- [ ] I've looked through the README and FAQ for similar questions
- [ ] I've searched the bugtracker for similar questions including closed ones
## Question
<!--
Ask your question in an arbitrary form. Please make sure it's worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient.
-->
WRITE QUESTION HERE

61
.github/ISSUE_TEMPLATE_tmpl.md vendored Normal file
View File

@@ -0,0 +1,61 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

View File

@@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

View File

@@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

148
ChangeLog
View File

@@ -1,151 +1,3 @@
version 2019.04.30
Extractors
* [openload] Use real Chrome versions (#20902)
- [youtube] Remove info el for get_video_info request
* [youtube] Improve extraction robustness
- [dramafever] Remove extractor (#20868)
* [adn] Fix subtitle extraction (#12724)
+ [ccc] Extract creator (#20355)
+ [ccc:playlist] Add support for media.ccc.de playlists (#14601, #20355)
+ [sverigesradio] Add support for sverigesradio.se (#18635)
+ [cinemax] Add support for cinemax.com
* [sixplay] Try extracting non-DRM protected manifests (#20849)
+ [youtube] Extract Youtube Music Auto-generated metadata (#20599, #20742)
- [wrzuta] Remove extractor (#20684, #20801)
* [twitch] Prefer source format (#20850)
+ [twitcasting] Add support for private videos (#20843)
* [reddit] Validate thumbnail URL (#20030)
* [yandexmusic] Fix track URL extraction (#20820)
version 2019.04.24
Extractors
* [youtube] Fix extraction (#20758, #20759, #20761, #20762, #20764, #20766,
#20767, #20769, #20771, #20768, #20770)
* [toutv] Fix extraction and extract series info (#20757)
+ [vrv] Add support for movie listings (#19229)
+ [youtube] Print error when no data is available (#20737)
+ [soundcloud] Add support for new rendition and improve extraction (#20699)
+ [ooyala] Add support for geo verification proxy
+ [nrl] Add support for nrl.com (#15991)
+ [vimeo] Extract live archive source format (#19144)
+ [vimeo] Add support for live streams and improve info extraction (#19144)
+ [ntvcojp] Add support for cu.ntv.co.jp
+ [nhk] Extract RTMPT format
+ [nhk] Add support for audio URLs
+ [udemy] Add another course id extraction pattern (#20491)
+ [openload] Add support for oload.services (#20691)
+ [openload] Add support for openloed.co (#20691, #20693)
* [bravotv] Fix extraction (#19213)
version 2019.04.17
Extractors
* [openload] Randomize User-Agent (closes #20688)
+ [openload] Add support for oladblock domains (#20471)
* [adn] Fix subtitle extraction (#12724)
+ [aol] Add support for localized websites
+ [yahoo] Add support GYAO episode URLs
+ [yahoo] Add support for streaming.yahoo.co.jp (#5811, #7098)
+ [yahoo] Add support for gyao.yahoo.co.jp
* [aenetworks] Fix history topic extraction and extract more formats
+ [cbs] Extract smpte and vtt subtitles
+ [streamango] Add support for streamcherry.com (#20592)
+ [yourporn] Add support for sxyprn.com (#20646)
* [mgtv] Fix extraction (#20650)
* [linkedin:learning] Use urljoin for form action URL (#20431)
+ [gdc] Add support for kaltura embeds (#20575)
* [dispeak] Improve mp4 bitrate extraction
* [kaltura] Sanitize embed URLs
* [jwplatfom] Do not match manifest URLs (#20596)
* [aol] Restrict URL regular expression and improve format extraction
+ [tiktok] Add support for new URL schema (#20573)
+ [stv:player] Add support for player.stv.tv (#20586)
version 2019.04.07
Core
+ [downloader/external] Pass rtmp_conn to ffmpeg
Extractors
+ [ruutu] Add support for audio podcasts (#20473, #20545)
+ [xvideos] Extract all thumbnails (#20432)
+ [platzi] Add support for platzi.com (#20562)
* [dvtv] Fix extraction (#18514, #19174)
+ [vrv] Add basic support for individual movie links (#19229)
+ [bfi:player] Add support for player.bfi.org.uk (#19235)
* [hbo] Fix extraction and extract subtitles (#14629, #13709)
* [youtube] Extract srv[1-3] subtitle formats (#20566)
* [adultswim] Fix extraction (#18025)
* [teamcoco] Fix extraction and add suport for subdomains (#17099, #20339)
* [adn] Fix subtitle compatibility with ffmpeg
* [adn] Fix extraction and add support for positioning styles (#20549)
* [vk] Use unique video id (#17848)
* [newstube] Fix extraction
* [rtl2] Actualize extraction
+ [adobeconnect] Add support for adobeconnect.com (#20283)
+ [gaia] Add support for authentication (#14605)
+ [mediasite] Add support for dashed ids and named catalogs (#20531)
version 2019.04.01
Core
* [utils] Improve int_or_none and float_or_none (#20403)
* Check for valid --min-sleep-interval when --max-sleep-interval is specified
(#20435)
Extractors
+ [weibo] Extend URL regular expression (#20496)
+ [xhamster] Add support for xhamster.one (#20508)
+ [mediasite] Add support for catalogs (#20507)
+ [teamtreehouse] Add support for teamtreehouse.com (#9836)
+ [ina] Add support for audio URLs
* [ina] Improve extraction
* [cwtv] Fix episode number extraction (#20461)
* [npo] Improve DRM detection
+ [pornhub] Add support for DASH formats (#20403)
* [svtplay] Update API endpoint (#20430)
version 2019.03.18
Core
* [extractor/common] Improve HTML5 entries extraction
+ [utils] Introduce parse_bitrate
* [update] Hide update URLs behind redirect
* [extractor/common] Fix url meta field for unfragmented DASH formats (#20346)
Extractors
+ [yandexvideo] Add extractor
* [openload] Improve embed detection
+ [corus] Add support for bigbrothercanada.ca (#20357)
+ [orf:radio] Extract series (#20012)
+ [cbc:watch] Add support for gem.cbc.ca (#20251, #20359)
- [anysex] Remove extractor (#19279)
+ [ciscolive] Add support for new URL schema (#20320, #20351)
+ [youtube] Add support for invidiou.sh (#20309)
- [anitube] Remove extractor (#20334)
- [ruleporn] Remove extractor (#15344, #20324)
* [npr] Fix extraction (#10793, #13440)
* [biqle] Fix extraction (#11471, #15313)
* [viddler] Modernize
* [moevideo] Fix extraction
* [primesharetv] Remove extractor
* [hypem] Modernize and extract more metadata (#15320)
* [veoh] Fix extraction
* [escapist] Modernize
- [videomega] Remove extractor (#10108)
+ [beeg] Add support for beeg.porn (#20306)
* [vimeo:review] Improve config url extraction and extract original format
(#20305)
* [fox] Detect geo restriction and authentication errors (#20208)
version 2019.03.09
Core

View File

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete
find . -name "*.class" -delete
@@ -78,12 +78,8 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md
.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md

View File

@@ -642,7 +642,6 @@ The simplest case is requesting a specific format, for example with `-f 22` you
You can also use a file extension (currently `3gp`, `aac`, `flv`, `m4a`, `mp3`, `mp4`, `ogg`, `wav`, `webm` are supported) to download the best quality format of a particular file extension served as a single file, e.g. `-f webm` will download the best quality format with the `webm` extension served as a single file.
You can also use special names to select particular edge case formats:
- `best`: Select the best quality format represented by a single file with video and audio.
- `worst`: Select the worst quality format represented by a single file with video and audio.
- `bestvideo`: Select the best quality video-only format (e.g. DASH video). May not be available.
@@ -659,7 +658,6 @@ If you want to download several formats of the same video use a comma as a separ
You can also filter the video formats by putting a condition in brackets, as in `-f "best[height=720]"` (or `-f "[filesize>10M]"`).
The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `>=`, `=` (equals), `!=` (not equals):
- `filesize`: The number of bytes, if known in advance
- `width`: Width of the video, if known
- `height`: Height of the video, if known
@@ -670,7 +668,6 @@ The following numeric meta fields can be used with comparisons `<`, `<=`, `>`, `
- `fps`: Frame rate
Also filtering work for comparisons `=` (equals), `^=` (starts with), `$=` (ends with), `*=` (contains) and following string meta fields:
- `ext`: File extension
- `acodec`: Name of the audio codec in use
- `vcodec`: Name of the video codec in use
@@ -700,7 +697,7 @@ Note that on Windows you may need to use double quotes instead of single.
# Download best mp4 format available or any other best if no mp4 available
$ youtube-dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best'
# Download best format available but no better than 480p
# Download best format available but not better that 480p
$ youtube-dl -f 'bestvideo[height<=480]+bestaudio/best[height<=480]'
# Download best video only format but no bigger than 50 MB

View File

@@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md issuetemplates supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog
make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..."

View File

@@ -28,7 +28,6 @@
- **acast:channel**
- **AddAnime**
- **ADN**: Anime Digital Network
- **AdobeConnect**
- **AdobeTV**
- **AdobeTVChannel**
- **AdobeTVShow**
@@ -45,8 +44,9 @@
- **AmericasTestKitchen**
- **anderetijden**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **AnimeOnDemand**
- **anitube.se**
- **Anvato**
- **aol.com**
- **AnySex**
- **APA**
- **Aparat**
- **AppleConnect**
@@ -103,7 +103,6 @@
- **Bellator**
- **BellMedia**
- **Bet**
- **bfi:player**
- **Bigflix**
- **Bild**: Bild.de
- **BiliBili**
@@ -164,7 +163,6 @@
- **chirbit**
- **chirbit:profile**
- **Cinchcast**
- **Cinemax**
- **CiscoLiveSearch**
- **CiscoLiveSession**
- **CJSW**
@@ -202,7 +200,6 @@
- **CSpan**: C-SPAN
- **CtsNews**: 華視新聞
- **CTVNews**
- **cu.ntv.co.jp**: Nippon Television Network
- **Culturebox**
- **CultureUnplugged**
- **curiositystream**
@@ -238,6 +235,8 @@
- **DouyuTV**: 斗鱼
- **DPlay**
- **DPlayIt**
- **dramafever**
- **dramafever:series**
- **DRBonanza**
- **Dropbox**
- **DrTuber**
@@ -348,6 +347,7 @@
- **Groupon**
- **Hark**
- **hbo**
- **hbo:episode**
- **HearThisAt**
- **Heise**
- **HellPorno**
@@ -487,12 +487,9 @@
- **MatchTV**
- **MDR**: MDR.DE and KiKA
- **media.ccc.de**
- **media.ccc.de:lists**
- **Medialaan**
- **Mediaset**
- **Mediasite**
- **MediasiteCatalog**
- **MediasiteNamedCatalog**
- **Medici**
- **megaphone.fm**: megaphone.fm embedded players
- **Meipai**: 美拍
@@ -625,7 +622,6 @@
- **NRKTVEpisodes**
- **NRKTVSeason**
- **NRKTVSeries**
- **NRLTV**
- **ntv.ru**
- **Nuvid**
- **NYTimes**
@@ -635,6 +631,7 @@
- **OdaTV**
- **Odnoklassniki**
- **OktoberfestTV**
- **on.aol.com**
- **OnDemandKorea**
- **onet.pl**
- **onet.tv**
@@ -675,8 +672,6 @@
- **Piksel**
- **Pinkbike**
- **Pladform**
- **Platzi**
- **PlatziCourse**
- **play.fm**
- **PlayPlusTV**
- **PlaysTV**
@@ -703,6 +698,7 @@
- **PornoXO**
- **PornTube**
- **PressTV**
- **PrimeShareTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital
- **puhutv**
@@ -722,7 +718,7 @@
- **radio.de**
- **radiobremen**
- **radiocanada**
- **radiocanada:audiovideo**
- **RadioCanadaAudioVideo**
- **radiofrance**
- **RadioJavan**
- **Rai**
@@ -769,6 +765,7 @@
- **RTVS**
- **Rudo**
- **RUHD**
- **RulePorn**
- **rutube**: Rutube videos
- **rutube:channel**: Rutube channels
- **rutube:embed**: Rutube embedded videos
@@ -855,10 +852,7 @@
- **StreamCZ**
- **StreetVoice**
- **StretchInternet**
- **stv:player**
- **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
- **SVT**
- **SVTPage**
- **SVTPlay**: SVT Play and Öppet arkiv
@@ -879,7 +873,6 @@
- **teachertube:user:collection**: teachertube.com user and collection videos
- **TeachingChannel**
- **Teamcoco**
- **TeamTreeHouse**
- **TechTalks**
- **techtv.mit.edu**
- **ted**
@@ -1017,6 +1010,7 @@
- **video.mit.edu**
- **VideoDetective**
- **videofy.me**
- **VideoMega**
- **videomore**
- **videomore:season**
- **videomore:video**
@@ -1104,6 +1098,8 @@
- **Wistia**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop**
- **wrzuta.pl**
- **wrzuta.pl:playlist**
- **WSJ**: Wall Street Journal
- **WSJArticle**
- **WWE**
@@ -1127,13 +1123,10 @@
- **XVideos**
- **XXXYMovies**
- **Yahoo**: Yahoo screen and movies
- **yahoo:gyao**
- **yahoo:gyao:player**
- **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист
- **yandexmusic:track**: Яндекс.Музыка - Трек
- **YandexVideo**
- **YapFiles**
- **YesJapan**
- **yinyuetai:video**: 音悦Tai

View File

@@ -107,184 +107,6 @@ class TestInfoExtractor(unittest.TestCase):
self.assertRaises(ExtractorError, self.ie._download_json, uri, None)
self.assertEqual(self.ie._download_json(uri, None, fatal=False), None)
def test_parse_html5_media_entries(self):
# from https://www.r18.com/
# with kpbs in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.r18.com/',
r'''
<video id="samplevideo_amateur" class="js-samplevideo video-js vjs-default-skin vjs-big-play-centered" controls preload="auto" width="400" height="225" poster="//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4" type="video/mp4" res="240" label="300kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4" type="video/mp4" res="480" label="1000kbps">
<source id="video_source" src="https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4" type="video/mp4" res="740" label="1500kbps">
<p>Your browser does not support the video tag.</p>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_sm_w.mp4',
'ext': 'mp4',
'format_id': '300kbps',
'height': 240,
'tbr': 300,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dm_w.mp4',
'ext': 'mp4',
'format_id': '1000kbps',
'height': 480,
'tbr': 1000,
}, {
'url': 'https://awscc3001.r18.com/litevideo/freepv/m/mgm/mgmr105/mgmr105_dmb_w.mp4',
'ext': 'mp4',
'format_id': '1500kbps',
'height': 740,
'tbr': 1500,
}],
'thumbnail': '//pics.r18.com/digital/amateur/mgmr105/mgmr105jp.jpg'
})
# from https://www.csfd.cz/
# with width and height
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.csfd.cz/',
r'''
<video width="770" height="328" preload="none" controls poster="https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360" >
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4" type="video/mp4" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4" type="video/mp4" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4" type="video/mp4" width="1920" height="1080">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm" type="video/webm" width="640" height="360">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm" type="video/webm" width="1280" height="720">
<source src="https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm" type="video/webm" width="1920" height="1080">
<track src="https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt" type="text/x-srt" kind="subtitles" srclang="cs" label="cs">
</video>
''', None)[0],
{
'formats': [{
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327358_eac647.mp4',
'ext': 'mp4',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327360_3d2646.mp4',
'ext': 'mp4',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327356_91f258.mp4',
'ext': 'mp4',
'width': 1920,
'height': 1080,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327359_962b4a.webm',
'ext': 'webm',
'width': 640,
'height': 360,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327361_6feee0.webm',
'ext': 'webm',
'width': 1280,
'height': 720,
}, {
'url': 'https://video.csfd.cz/files/videos/157/750/157750813/163327357_8ab472.webm',
'ext': 'webm',
'width': 1920,
'height': 1080,
}],
'subtitles': {
'cs': [{'url': 'https://video.csfd.cz/files/subtitles/163/344/163344115_4c388b.srt'}]
},
'thumbnail': 'https://img.csfd.cz/files/images/film/video/preview/163/344/163344118_748d20.png?h360'
})
# from https://tamasha.com/v/Kkdjw
# with height in label
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://tamasha.com/v/Kkdjw',
r'''
<video crossorigin="anonymous">
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4" label="AUTO" res="0"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4" type="video/mp4"
label="240p" res="240"/>
<source src="https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4" type="video/mp4"
label="144p" res="144"/>
</video>
''', None)[0],
{
'formats': [{
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/19/8f/Kkdjw_198feff8577d0057536e905cce1fb61438dd64e0_n_240.mp4',
'ext': 'mp4',
'format_id': '240p',
'height': 240,
}, {
'url': 'https://s-v2.tamasha.com/statics/videos_file/20/00/Kkdjw_200041c66f657fc967db464d156eafbc1ed9fe6f_n_144.mp4',
'ext': 'mp4',
'format_id': '144p',
'height': 144,
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'ext': 'mp4',
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
}]
})
# from https://www.directvnow.com
# with data-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video id="vid1" class="header--video-masked active" muted playsinline>
<source data-src="https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://cdn.directv.com/content/dam/dtv/prod/website_directvnow-international/videos/DTVN_hdr_HBO_v3.mp4',
'ext': 'mp4',
}]
})
# from https://www.klarna.com/uk/
# with data-video-src
expect_dict(
self,
self.ie._parse_html5_media_entries(
'https://www.directvnow.com',
r'''
<video loop autoplay muted class="responsive-video block-kl__video video-on-medium">
<source src="" data-video-desktop data-video-src="https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4" type="video/mp4" />
</video>
''', None)[0],
{
'formats': [{
'url': 'https://www.klarna.com/uk/wp-content/uploads/sites/11/2019/01/KL062_Smooth3_0_DogWalking_5s_920x080_.mp4',
'ext': 'mp4',
}],
})
def test_extract_jwplayer_data_realworld(self):
# from http://www.suffolk.edu/sjc/
expect_dict(
@@ -752,8 +574,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
# Also tests duplicate representation ids, see
# https://github.com/ytdl-org/youtube-dl/issues/15111
'float_duration',
'http://unknown/manifest.mpd', # mpd_url
None, # mpd_base_url
'http://unknown/manifest.mpd',
[{
'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'm4a',
@@ -833,8 +654,7 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
), (
# https://github.com/ytdl-org/youtube-dl/pull/14844
'urls_only',
'http://unknown/manifest.mpd', # mpd_url
None, # mpd_base_url
'http://unknown/manifest.mpd',
[{
'manifest_url': 'http://unknown/manifest.mpd',
'ext': 'mp4',
@@ -913,61 +733,15 @@ jwplayer("mediaplayer").setup({"abouttext":"Visit Indie DB","aboutlink":"http:\/
'width': 1920,
'height': 1080,
}]
), (
# https://github.com/ytdl-org/youtube-dl/issues/20346
# Media considered unfragmented even though it contains
# Initialization tag
'unfragmented',
'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd', # mpd_url
'https://v.redd.it/hw1x7rcg7zl21', # mpd_base_url
[{
'url': 'https://v.redd.it/hw1x7rcg7zl21/audio',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'm4a',
'format_id': 'AUDIO-1',
'format_note': 'DASH audio',
'container': 'm4a_dash',
'acodec': 'mp4a.40.2',
'vcodec': 'none',
'tbr': 129.87,
'asr': 48000,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_240',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-2',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 608.0,
'width': 240,
'height': 240,
'fps': 30,
}, {
'url': 'https://v.redd.it/hw1x7rcg7zl21/DASH_360',
'manifest_url': 'https://v.redd.it/hw1x7rcg7zl21/DASHPlaylist.mpd',
'ext': 'mp4',
'format_id': 'VIDEO-1',
'format_note': 'DASH video',
'container': 'mp4_dash',
'acodec': 'none',
'vcodec': 'avc1.4d401e',
'tbr': 804.261,
'width': 360,
'height': 360,
'fps': 30,
}]
)
]
for mpd_file, mpd_url, mpd_base_url, expected_formats in _TEST_CASES:
for mpd_file, mpd_url, expected_formats in _TEST_CASES:
with io.open('./test/testdata/mpd/%s.mpd' % mpd_file,
mode='r', encoding='utf-8') as f:
formats = self.ie._parse_mpd_formats(
compat_etree_fromstring(f.read().encode('utf-8')),
mpd_base_url=mpd_base_url, mpd_url=mpd_url)
mpd_url=mpd_url)
self.ie._sort_formats(formats)
expect_value(self, formats, expected_formats, None)

View File

@@ -33,13 +33,11 @@ from youtube_dl.utils import (
ExtractorError,
find_xpath_attr,
fix_xml_ampersands,
float_or_none,
get_element_by_class,
get_element_by_attribute,
get_elements_by_class,
get_elements_by_attribute,
InAdvancePagedList,
int_or_none,
intlist_to_bytes,
is_html,
js_to_json,
@@ -57,7 +55,6 @@ from youtube_dl.utils import (
parse_count,
parse_iso8601,
parse_resolution,
parse_bitrate,
pkcs1pad,
read_batch_urls,
sanitize_filename,
@@ -470,21 +467,6 @@ class TestUtil(unittest.TestCase):
shell_quote(args),
"""ffmpeg -i 'ñ€ß'"'"'.mp4'""" if compat_os_name != 'nt' else '''ffmpeg -i "ñ€ß'.mp4"''')
def test_float_or_none(self):
self.assertEqual(float_or_none('42.42'), 42.42)
self.assertEqual(float_or_none('42'), 42.0)
self.assertEqual(float_or_none(''), None)
self.assertEqual(float_or_none(None), None)
self.assertEqual(float_or_none([]), None)
self.assertEqual(float_or_none(set()), None)
def test_int_or_none(self):
self.assertEqual(int_or_none('42'), 42)
self.assertEqual(int_or_none(''), None)
self.assertEqual(int_or_none(None), None)
self.assertEqual(int_or_none([]), None)
self.assertEqual(int_or_none(set()), None)
def test_str_to_int(self):
self.assertEqual(str_to_int('123,456'), 123456)
self.assertEqual(str_to_int('123.456'), 123456)
@@ -1048,13 +1030,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(parse_resolution('4k'), {'height': 2160})
self.assertEqual(parse_resolution('8K'), {'height': 4320})
def test_parse_bitrate(self):
self.assertEqual(parse_bitrate(None), None)
self.assertEqual(parse_bitrate(''), None)
self.assertEqual(parse_bitrate('300kbps'), 300)
self.assertEqual(parse_bitrate('1500kbps'), 1500)
self.assertEqual(parse_bitrate('300 kbps'), 300)
def test_version_tuple(self):
self.assertEqual(version_tuple('1'), (1,))
self.assertEqual(version_tuple('10.23.344'), (10, 23, 344))

View File

@@ -1,28 +0,0 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MPD mediaPresentationDuration="PT54.915S" minBufferTime="PT1.500S" profiles="urn:mpeg:dash:profile:isoff-on-demand:2011" type="static" xmlns="urn:mpeg:dash:schema:mpd:2011">
<Period duration="PT54.915S">
<AdaptationSet segmentAlignment="true" subsegmentAlignment="true" subsegmentStartsWithSAP="1">
<Representation bandwidth="804261" codecs="avc1.4d401e" frameRate="30" height="360" id="VIDEO-1" mimeType="video/mp4" startWithSAP="1" width="360">
<BaseURL>DASH_360</BaseURL>
<SegmentBase indexRange="915-1114" indexRangeExact="true">
<Initialization range="0-914"/>
</SegmentBase>
</Representation>
<Representation bandwidth="608000" codecs="avc1.4d401e" frameRate="30" height="240" id="VIDEO-2" mimeType="video/mp4" startWithSAP="1" width="240">
<BaseURL>DASH_240</BaseURL>
<SegmentBase indexRange="913-1112" indexRangeExact="true">
<Initialization range="0-912"/>
</SegmentBase>
</Representation>
</AdaptationSet>
<AdaptationSet>
<Representation audioSamplingRate="48000" bandwidth="129870" codecs="mp4a.40.2" id="AUDIO-1" mimeType="audio/mp4" startWithSAP="1">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2"/>
<BaseURL>audio</BaseURL>
<SegmentBase indexRange="832-1007" indexRangeExact="true">
<Initialization range="0-831"/>
</SegmentBase>
</Representation>
</AdaptationSet>
</Period>
</MPD>

View File

@@ -309,8 +309,6 @@ class YoutubeDL(object):
The following options are used by the post processors:
prefer_ffmpeg: If False, use avconv instead of ffmpeg if both are available,
otherwise prefer ffmpeg.
ffmpeg_location: Location of the ffmpeg/avconv binary; either the path
to the binary or its containing directory.
postprocessor_args: A list of additional command-line arguments for the
postprocessor.

View File

@@ -166,8 +166,6 @@ def _real_main(argv=None):
if opts.max_sleep_interval is not None:
if opts.max_sleep_interval < 0:
parser.error('max sleep interval must be positive or 0')
if opts.sleep_interval is None:
parser.error('min sleep interval must be specified, use --min-sleep-interval')
if opts.max_sleep_interval < opts.sleep_interval:
parser.error('max sleep interval must be greater than or equal to min sleep interval')
else:

View File

@@ -289,7 +289,6 @@ class FFmpegFD(ExternalFD):
tc_url = info_dict.get('tc_url')
flash_version = info_dict.get('flash_version')
live = info_dict.get('rtmp_live', False)
conn = info_dict.get('rtmp_conn')
if player_url is not None:
args += ['-rtmp_swfverify', player_url]
if page_url is not None:
@@ -304,11 +303,6 @@ class FFmpegFD(ExternalFD):
args += ['-rtmp_flashver', flash_version]
if live:
args += ['-rtmp_live', 'live']
if isinstance(conn, list):
for entry in conn:
args += ['-rtmp_conn', entry]
elif isinstance(conn, compat_str):
args += ['-rtmp_conn', conn]
args += ['-i', url, '-c', 'copy']

View File

@@ -21,6 +21,7 @@ from ..utils import (
intlist_to_bytes,
long_to_bytes,
pkcs1pad,
srt_subtitles_timecode,
strip_or_none,
urljoin,
)
@@ -41,18 +42,6 @@ class ADNIE(InfoExtractor):
}
_BASE_URL = 'http://animedigitalnetwork.fr'
_RSA_KEY = (0xc35ae1e4356b65a73b551493da94b8cb443491c0aa092a357a5aee57ffc14dda85326f42d716e539a34542a0d3f363adf16c5ec222d713d5997194030ee2e4f0d1fb328c01a81cf6868c090d50de8e169c6b13d1675b9eeed1cbc51e1fffca9b38af07f37abd790924cd3bee59d0257cfda4fe5f3f0534877e21ce5821447d1b, 65537)
_POS_ALIGN_MAP = {
'start': 1,
'end': 3,
}
_LINE_ALIGN_MAP = {
'middle': 8,
'end': 4,
}
@staticmethod
def _ass_subtitles_timecode(seconds):
return '%01d:%02d:%02d.%02d' % (seconds / 3600, (seconds % 3600) / 60, seconds % 60, (seconds % 1) * 100)
def _get_subtitles(self, sub_path, video_id):
if not sub_path:
@@ -60,20 +49,14 @@ class ADNIE(InfoExtractor):
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, sub_path),
video_id, 'Downloading subtitles location', fatal=False) or '{}'
subtitle_location = (self._parse_json(enc_subtitles, video_id, fatal=False) or {}).get('location')
if subtitle_location:
enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, subtitle_location),
video_id, 'Downloading subtitles data', fatal=False,
headers={'Origin': 'https://animedigitalnetwork.fr'})
video_id, fatal=False)
if not enc_subtitles:
return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')),
bytes_to_intlist(binascii.unhexlify(self._K + '9032ad7083106400')),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
))
subtitles_json = self._parse_json(
@@ -84,27 +67,23 @@ class ADNIE(InfoExtractor):
subtitles = {}
for sub_lang, sub in subtitles_json.items():
ssa = '''[Script Info]
ScriptType:V4.00
[V4 Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,TertiaryColour,BackColour,Bold,Italic,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,AlphaLevel,Encoding
Style: Default,Arial,18,16777215,16777215,16777215,0,-1,0,1,1,0,2,20,20,20,0,0
[Events]
Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for current in sub:
start, end, text, line_align, position_align = (
srt = ''
for num, current in enumerate(sub):
start, end, text = (
float_or_none(current.get('startTime')),
float_or_none(current.get('endTime')),
current.get('text'), current.get('lineAlign'),
current.get('positionAlign'))
current.get('text'))
if start is None or end is None or text is None:
continue
alignment = self._POS_ALIGN_MAP.get(position_align, 2) + self._LINE_ALIGN_MAP.get(line_align, 0)
ssa += os.linesep + 'Dialogue: Marked=0,%s,%s,Default,,0,0,0,,%s%s' % (
self._ass_subtitles_timecode(start),
self._ass_subtitles_timecode(end),
'{\\a%d}' % alignment if alignment != 2 else '',
text.replace('\n', '\\N').replace('<i>', '{\\i1}').replace('</i>', '{\\i0}'))
srt += os.linesep.join(
(
'%d' % num,
'%s --> %s' % (
srt_subtitles_timecode(start),
srt_subtitles_timecode(end)),
text,
os.linesep,
))
if sub_lang == 'vostf':
sub_lang = 'fr'
@@ -112,8 +91,8 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
'ext': 'json',
'data': json.dumps(sub),
}, {
'ext': 'ssa',
'data': ssa,
'ext': 'srt',
'data': srt,
}])
return subtitles
@@ -121,15 +100,7 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_config = self._parse_json(self._search_regex(
r'playerConfig\s*=\s*({.+});', webpage,
'player config', default='{}'), video_id, fatal=False)
if not player_config:
config_url = urljoin(self._BASE_URL, self._search_regex(
r'(?:id="player"|class="[^"]*adn-player-container[^"]*")[^>]+data-url="([^"]+)"',
webpage, 'config url'))
player_config = self._download_json(
config_url, video_id,
'Downloading player config JSON metadata')['player']
r'playerConfig\s*=\s*({.+});', webpage, 'player config'), video_id)
video_info = {}
video_info_str = self._search_regex(
@@ -158,15 +129,12 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
encrypted_message = long_to_bytes(pow(bytes_to_long(padded_message), e, n))
authorization = base64.b64encode(encrypted_message).decode()
links_data = self._download_json(
urljoin(self._BASE_URL, links_url), video_id,
'Downloading links JSON metadata', headers={
urljoin(self._BASE_URL, links_url), video_id, headers={
'Authorization': 'Bearer ' + authorization,
})
links = links_data.get('links') or {}
metas = metas or links_data.get('meta') or {}
sub_path = sub_path or links_data.get('subtitles') or \
'index.php?option=com_vodapi&task=subtitles.getJSON&format=json&id=' + video_id
sub_path += '&token=' + token
sub_path = (sub_path or links_data.get('subtitles')) + '&token=' + token
error = links_data.get('error')
title = metas.get('title') or video_info['title']
@@ -174,11 +142,9 @@ Format: Marked,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text'''
for format_id, qualities in links.items():
if not isinstance(qualities, dict):
continue
for quality, load_balancer_url in qualities.items():
for load_balancer_url in qualities.values():
load_balancer_data = self._download_json(
load_balancer_url, video_id,
'Downloading %s %s JSON metadata' % (format_id, quality),
fatal=False) or {}
load_balancer_url, video_id, fatal=False) or {}
m3u8_url = load_balancer_data.get('location')
if not m3u8_url:
continue

View File

@@ -1,37 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urlparse,
)
class AdobeConnectIE(InfoExtractor):
_VALID_URL = r'https?://\w+\.adobeconnect\.com/(?P<id>[\w-]+)'
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(r'<title>(.+?)</title>', webpage, 'title')
qs = compat_parse_qs(self._search_regex(r"swfUrl\s*=\s*'([^']+)'", webpage, 'swf url').split('?')[1])
is_live = qs.get('isLive', ['false'])[0] == 'true'
formats = []
for con_string in qs['conStrings'][0].split(','):
formats.append({
'format_id': con_string.split('://')[0],
'app': compat_urlparse.quote('?' + con_string.split('?')[1] + 'flvplayerapp/' + qs['appInstance'][0]),
'ext': 'flv',
'play_path': 'mp4:' + qs['streamName'][0],
'rtmp_conn': 'S:' + qs['ticket'][0],
'rtmp_live': is_live,
'url': con_string,
})
return {
'id': video_id,
'title': self._live_title(title) if is_live else title,
'formats': formats,
'is_live': is_live,
}

View File

@@ -1,19 +1,13 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .turner import TurnerBaseIE
from ..utils import (
determine_ext,
float_or_none,
int_or_none,
mimetype2ext,
parse_age_limit,
parse_iso8601,
strip_or_none,
try_get,
url_or_none,
)
@@ -27,8 +21,8 @@ class AdultSwimIE(TurnerBaseIE):
'ext': 'mp4',
'title': 'Rick and Morty - Pilot',
'description': 'Rick moves in with his daughter\'s family and establishes himself as a bad influence on his grandson, Morty.',
'timestamp': 1543294800,
'upload_date': '20181127',
'timestamp': 1493267400,
'upload_date': '20170427',
},
'params': {
# m3u8 download
@@ -49,7 +43,6 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download
'skip_download': True,
},
'skip': '404 Not Found',
}, {
'url': 'http://www.adultswim.com/videos/decker/inside-decker-a-new-hero/',
'info_dict': {
@@ -68,9 +61,9 @@ class AdultSwimIE(TurnerBaseIE):
}, {
'url': 'http://www.adultswim.com/videos/attack-on-titan',
'info_dict': {
'id': 'attack-on-titan',
'id': 'b7A69dzfRzuaXIECdxW8XQ',
'title': 'Attack on Titan',
'description': 'md5:41caa9416906d90711e31dc00cb7db7e',
'description': 'md5:6c8e003ea0777b47013e894767f5e114',
},
'playlist_mincount': 12,
}, {
@@ -85,118 +78,83 @@ class AdultSwimIE(TurnerBaseIE):
# m3u8 download
'skip_download': True,
},
'skip': '404 Not Found',
}]
def _real_extract(self, url):
show_path, episode_path = re.match(self._VALID_URL, url).groups()
display_id = episode_path or show_path
query = '''query {
getShowBySlug(slug:"%s") {
%%s
}
}''' % show_path
if episode_path:
query = query % '''title
getVideoBySlug(slug:"%s") {
_id
auth
description
duration
episodeNumber
launchDate
mediaID
seasonNumber
poster
title
tvRating
}''' % episode_path
['getVideoBySlug']
else:
query = query % '''metaDescription
title
videos(first:1000,sort:["episode_number"]) {
edges {
node {
_id
slug
}
}
}'''
show_data = self._download_json(
'https://www.adultswim.com/api/search', display_id,
data=json.dumps({'query': query}).encode(),
headers={'Content-Type': 'application/json'})['data']['getShowBySlug']
if episode_path:
video_data = show_data['getVideoBySlug']
video_id = video_data['_id']
episode_title = title = video_data['title']
series = show_data.get('title')
if series:
title = '%s - %s' % (series, title)
info = {
'id': video_id,
'title': title,
'description': strip_or_none(video_data.get('description')),
'duration': float_or_none(video_data.get('duration')),
'formats': [],
'subtitles': {},
'age_limit': parse_age_limit(video_data.get('tvRating')),
'thumbnail': video_data.get('poster'),
'timestamp': parse_iso8601(video_data.get('launchDate')),
'series': series,
'season_number': int_or_none(video_data.get('seasonNumber')),
'episode': episode_title,
'episode_number': int_or_none(video_data.get('episodeNumber')),
}
webpage = self._download_webpage(url, display_id)
initial_data = self._parse_json(self._search_regex(
r'AS_INITIAL_DATA(?:__)?\s*=\s*({.+?});',
webpage, 'initial data'), display_id)
auth = video_data.get('auth')
media_id = video_data.get('mediaID')
if media_id:
info.update(self._extract_ngtv_info(media_id, {
# CDN_TOKEN_APP_ID from:
# https://d2gg02c3xr550i.cloudfront.net/assets/asvp.e9c8bef24322d060ef87.bundle.js
'appId': 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHBJZCI6ImFzLXR2ZS1kZXNrdG9wLXB0enQ2bSIsInByb2R1Y3QiOiJ0dmUiLCJuZXR3b3JrIjoiYXMiLCJwbGF0Zm9ybSI6ImRlc2t0b3AiLCJpYXQiOjE1MzI3MDIyNzl9.BzSCk-WYOZ2GMCIaeVb8zWnzhlgnXuJTCu0jGp_VaZE',
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': auth,
}))
is_stream = show_path == 'streams'
if is_stream:
if not episode_path:
episode_path = 'live-stream'
if not auth:
extract_data = self._download_json(
'https://www.adultswim.com/api/shows/v1/videos/' + video_id,
video_id, query={'fields': 'stream'}, fatal=False) or {}
assets = try_get(extract_data, lambda x: x['data']['video']['stream']['assets'], list) or []
for asset in assets:
asset_url = asset.get('url')
if not asset_url:
video_data = next(stream for stream_path, stream in initial_data['streams'].items() if stream_path == episode_path)
video_id = video_data.get('stream')
if not video_id:
entries = []
for episode in video_data.get('archiveEpisodes', []):
episode_url = url_or_none(episode.get('url'))
if not episode_url:
continue
ext = determine_ext(asset_url, mimetype2ext(asset.get('mime_type')))
if ext == 'm3u8':
info['formats'].extend(self._extract_m3u8_formats(
asset_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
elif ext == 'f4m':
continue
# info['formats'].extend(self._extract_f4m_formats(
# asset_url, video_id, f4m_id='hds', fatal=False))
elif ext in ('scc', 'ttml', 'vtt'):
info['subtitles'].setdefault('en', []).append({
'url': asset_url,
})
self._sort_formats(info['formats'])
return info
entries.append(self.url_result(
episode_url, 'AdultSwim', episode.get('id')))
return self.playlist_result(
entries, video_data.get('id'), video_data.get('title'),
strip_or_none(video_data.get('description')))
else:
entries = []
for edge in show_data.get('videos', {}).get('edges', []):
video = edge.get('node') or {}
slug = video.get('slug')
if not slug:
continue
entries.append(self.url_result(
'http://adultswim.com/videos/%s/%s' % (show_path, slug),
'AdultSwim', video.get('_id')))
return self.playlist_result(
entries, show_path, show_data.get('title'),
strip_or_none(show_data.get('metaDescription')))
show_data = initial_data['show']
if not episode_path:
entries = []
for video in show_data.get('videos', []):
slug = video.get('slug')
if not slug:
continue
entries.append(self.url_result(
'http://adultswim.com/videos/%s/%s' % (show_path, slug),
'AdultSwim', video.get('id')))
return self.playlist_result(
entries, show_data.get('id'), show_data.get('title'),
strip_or_none(show_data.get('metadata', {}).get('description')))
video_data = show_data['sluggedVideo']
video_id = video_data['id']
info = self._extract_cvp_info(
'http://www.adultswim.com/videos/api/v0/assets?platform=desktop&id=' + video_id,
video_id, {
'secure': {
'media_src': 'http://androidhls-secure.cdn.turner.com/adultswim/big',
'tokenizer_src': 'http://www.adultswim.com/astv/mvpd/processors/services/token_ipadAdobe.do',
},
}, {
'url': url,
'site_name': 'AdultSwim',
'auth_required': video_data.get('auth'),
})
info.update({
'id': video_id,
'display_id': display_id,
'description': info.get('description') or strip_or_none(video_data.get('description')),
})
if not is_stream:
info.update({
'duration': info.get('duration') or int_or_none(video_data.get('duration')),
'timestamp': info.get('timestamp') or int_or_none(video_data.get('launch_date')),
'season_number': info.get('season_number') or int_or_none(video_data.get('season_number')),
'episode': info['title'],
'episode_number': info.get('episode_number') or int_or_none(video_data.get('episode_number')),
})
info['series'] = video_data.get('collection_title') or info.get('series')
if info['series'] and info['series'] != info['title']:
info['title'] = '%s - %s' % (info['series'], info['title'])
return info

View File

@@ -1,15 +1,14 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .theplatform import ThePlatformIE
from ..utils import (
extract_attributes,
ExtractorError,
int_or_none,
smuggle_url,
update_url_query,
unescapeHTML,
extract_attributes,
get_element_by_attribute,
)
from ..compat import (
compat_urlparse,
@@ -20,43 +19,6 @@ class AENetworksBaseIE(ThePlatformIE):
_THEPLATFORM_KEY = 'crazyjava'
_THEPLATFORM_SECRET = 's3cr3t'
def _extract_aen_smil(self, smil_url, video_id, auth=None):
query = {'mbr': 'true'}
if auth:
query['auth'] = auth
TP_SMIL_QUERY = [{
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak'
}, {
'assetTypes': 'high_video_s3'
}, {
'assetTypes': 'high_video_s3',
'switch': 'hls_ingest_fastly'
}]
formats = []
subtitles = {}
last_e = None
for q in TP_SMIL_QUERY:
q.update(query)
m_url = update_url_query(smil_url, q)
m_url = self._sign_url(m_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
try:
tp_formats, tp_subtitles = self._extract_theplatform_smil(
m_url, video_id, 'Downloading %s SMIL data' % (q.get('switch') or q['assetTypes']))
except ExtractorError as e:
last_e = e
continue
formats.extend(tp_formats)
subtitles = self._merge_subtitles(subtitles, tp_subtitles)
if last_e and not formats:
raise last_e
self._sort_formats(formats)
return {
'id': video_id,
'formats': formats,
'subtitles': subtitles,
}
class AENetworksIE(AENetworksBaseIE):
IE_NAME = 'aenetworks'
@@ -71,25 +33,22 @@ class AENetworksIE(AENetworksBaseIE):
(?:
shows/(?P<show_path>[^/]+(?:/[^/]+){0,2})|
movies/(?P<movie_display_id>[^/]+)(?:/full-movie)?|
specials/(?P<special_display_id>[^/]+)/(?:full-special|preview-)|
specials/(?P<special_display_id>[^/]+)/full-special|
collections/[^/]+/(?P<collection_display_id>[^/]+)
)
'''
_TESTS = [{
'url': 'http://www.history.com/shows/mountain-men/season-1/episode-1',
'md5': 'a97a65f7e823ae10e9244bc5433d5fe6',
'info_dict': {
'id': '22253814',
'ext': 'mp4',
'title': 'Winter is Coming',
'title': 'Winter Is Coming',
'description': 'md5:641f424b7a19d8e24f26dea22cf59d74',
'timestamp': 1338306241,
'upload_date': '20120529',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/shows/ancient-aliens/season-1',
@@ -125,9 +84,6 @@ class AENetworksIE(AENetworksBaseIE):
}, {
'url': 'https://www.historyvault.com/collections/america-the-story-of-us/westward',
'only_matching': True
}, {
'url': 'https://www.aetv.com/specials/hunting-jonbenets-killer-the-untold-story/preview-hunting-jonbenets-killer-the-untold-story',
'only_matching': True
}]
_DOMAIN_TO_REQUESTOR_ID = {
'history.com': 'HISTORY',
@@ -168,6 +124,11 @@ class AENetworksIE(AENetworksBaseIE):
return self.playlist_result(
entries, self._html_search_meta('aetn:SeasonId', webpage))
query = {
'mbr': 'true',
'assetTypes': 'high_video_ak',
'switch': 'hls_high_ak',
}
video_id = self._html_search_meta('aetn:VideoID', webpage)
media_url = self._search_regex(
[r"media_url\s*=\s*'(?P<url>[^']+)'",
@@ -177,39 +138,64 @@ class AENetworksIE(AENetworksBaseIE):
theplatform_metadata = self._download_theplatform_metadata(self._search_regex(
r'https?://link\.theplatform\.com/s/([^?]+)', media_url, 'theplatform_path'), video_id)
info = self._parse_theplatform_metadata(theplatform_metadata)
auth = None
if theplatform_metadata.get('AETN$isBehindWall'):
requestor_id = self._DOMAIN_TO_REQUESTOR_ID[domain]
resource = self._get_mvpd_resource(
requestor_id, theplatform_metadata['title'],
theplatform_metadata.get('AETN$PPL_pplProgramId') or theplatform_metadata.get('AETN$PPL_pplProgramId_OLD'),
theplatform_metadata['ratings'][0]['rating'])
auth = self._extract_mvpd_auth(
query['auth'] = self._extract_mvpd_auth(
url, video_id, requestor_id, resource)
info.update(self._search_json_ld(webpage, video_id, fatal=False))
info.update(self._extract_aen_smil(media_url, video_id, auth))
media_url = update_url_query(media_url, query)
media_url = self._sign_url(media_url, self._THEPLATFORM_KEY, self._THEPLATFORM_SECRET)
formats, subtitles = self._extract_theplatform_smil(media_url, video_id)
self._sort_formats(formats)
info.update({
'id': video_id,
'formats': formats,
'subtitles': subtitles,
})
return info
class HistoryTopicIE(AENetworksBaseIE):
IE_NAME = 'history:topic'
IE_DESC = 'History.com Topic'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/[^/]+/(?P<id>[\w+-]+?)-video'
_VALID_URL = r'https?://(?:www\.)?history\.com/topics/(?:[^/]+/)?(?P<topic_id>[^/]+)(?:/[^/]+(?:/(?P<video_display_id>[^/?#]+))?)?'
_TESTS = [{
'url': 'https://www.history.com/topics/valentines-day/history-of-valentines-day-video',
'url': 'http://www.history.com/topics/valentines-day/history-of-valentines-day/videos/bet-you-didnt-know-valentines-day?m=528e394da93ae&s=undefined&f=1&free=false',
'info_dict': {
'id': '40700995724',
'ext': 'mp4',
'title': "History of Valentines Day",
'title': "Bet You Didn't Know: Valentine's Day",
'description': 'md5:7b57ea4829b391995b405fa60bd7b5f7',
'timestamp': 1375819729,
'upload_date': '20130806',
'uploader': 'AENE-NEW',
},
'params': {
# m3u8 download
'skip_download': True,
},
'add_ie': ['ThePlatform'],
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/videos',
'info_dict':
{
'id': 'world-war-i-history',
'title': 'World War I History',
},
'playlist_mincount': 23,
}, {
'url': 'http://www.history.com/topics/world-war-i-history/videos',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history',
'only_matching': True,
}, {
'url': 'http://www.history.com/topics/world-war-i/world-war-i-history/speeches',
'only_matching': True,
}]
def theplatform_url_result(self, theplatform_url, video_id, query):
@@ -229,19 +215,27 @@ class HistoryTopicIE(AENetworksBaseIE):
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'<phoenix-iframe[^>]+src="[^"]+\btpid=(\d+)', webpage, 'tpid')
result = self._download_json(
'https://feeds.video.aetnd.com/api/v2/history/videos',
video_id, query={'filter[id]': video_id})['results'][0]
title = result['title']
info = self._extract_aen_smil(result['publicUrl'], video_id)
info.update({
'title': title,
'description': result.get('description'),
'duration': int_or_none(result.get('duration')),
'timestamp': int_or_none(result.get('added'), 1000),
})
return info
topic_id, video_display_id = re.match(self._VALID_URL, url).groups()
if video_display_id:
webpage = self._download_webpage(url, video_display_id)
release_url, video_id = re.search(r"_videoPlayer.play\('([^']+)'\s*,\s*'[^']+'\s*,\s*'(\d+)'\)", webpage).groups()
release_url = unescapeHTML(release_url)
return self.theplatform_url_result(
release_url, video_id, {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
})
else:
webpage = self._download_webpage(url, topic_id)
entries = []
for episode_item in re.findall(r'<a.+?data-release-url="[^"]+"[^>]*>', webpage):
video_attributes = extract_attributes(episode_item)
entries.append(self.theplatform_url_result(
video_attributes['data-release-url'], video_attributes['data-id'], {
'mbr': 'true',
'switch': 'hls',
'assetTypes': 'high_video_ak',
}))
return self.playlist_result(entries, topic_id, get_element_by_attribute('class', 'show-title', webpage))

View File

@@ -0,0 +1,30 @@
from __future__ import unicode_literals
from .nuevo import NuevoBaseIE
class AnitubeIE(NuevoBaseIE):
IE_NAME = 'anitube.se'
_VALID_URL = r'https?://(?:www\.)?anitube\.se/video/(?P<id>\d+)'
_TEST = {
'url': 'http://www.anitube.se/video/36621',
'md5': '59d0eeae28ea0bc8c05e7af429998d43',
'info_dict': {
'id': '36621',
'ext': 'mp4',
'title': 'Recorder to Randoseru 01',
'duration': 180.19,
},
'skip': 'Blocked in the US',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
key = self._search_regex(
r'src=["\']https?://[^/]+/embed/([A-Za-z0-9_-]+)', webpage, 'key')
return self._extract_nuevo(
'http://www.anitube.se/nuevo/econfig.php?key=%s' % key, video_id)

View File

@@ -0,0 +1,61 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
parse_duration,
int_or_none,
)
class AnySexIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?anysex\.com/(?P<id>\d+)'
_TEST = {
'url': 'http://anysex.com/156592/',
'md5': '023e9fbb7f7987f5529a394c34ad3d3d',
'info_dict': {
'id': '156592',
'ext': 'mp4',
'title': 'Busty and sexy blondie in her bikini strips for you',
'description': 'md5:de9e418178e2931c10b62966474e1383',
'categories': ['Erotic'],
'duration': 270,
'age_limit': 18,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
video_url = self._html_search_regex(r"video_url\s*:\s*'([^']+)'", webpage, 'video URL')
title = self._html_search_regex(r'<title>(.*?)</title>', webpage, 'title')
description = self._html_search_regex(
r'<div class="description"[^>]*>([^<]+)</div>', webpage, 'description', fatal=False)
thumbnail = self._html_search_regex(
r'preview_url\s*:\s*\'(.*?)\'', webpage, 'thumbnail', fatal=False)
categories = re.findall(
r'<a href="http://anysex\.com/categories/[^"]+" title="[^"]*">([^<]+)</a>', webpage)
duration = parse_duration(self._search_regex(
r'<b>Duration:</b> (?:<q itemprop="duration">)?(\d+:\d+)', webpage, 'duration', fatal=False))
view_count = int_or_none(self._html_search_regex(
r'<b>Views:</b> (\d+)', webpage, 'view count', fatal=False))
return {
'id': video_id,
'url': video_url,
'ext': 'mp4',
'title': title,
'description': description,
'thumbnail': thumbnail,
'categories': categories,
'duration': duration,
'view_count': view_count,
'age_limit': 18,
}

View File

@@ -4,10 +4,6 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
int_or_none,
@@ -16,12 +12,12 @@ from ..utils import (
class AolIE(InfoExtractor):
IE_NAME = 'aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:www\.)?aol\.(?:com|ca|co\.uk|de|jp)/video/(?:[^/]+/)*)(?P<id>[0-9a-f]+)'
IE_NAME = 'on.aol.com'
_VALID_URL = r'(?:aol-video:|https?://(?:(?:www|on)\.)?aol\.com/(?:[^/]+/)*(?:[^/?#&]+-)?)(?P<id>[^/?#&]+)'
_TESTS = [{
# video with 5min ID
'url': 'https://www.aol.com/video/view/u-s--official-warns-of-largest-ever-irs-phone-scam/518167793/',
'url': 'http://on.aol.com/video/u-s--official-warns-of-largest-ever-irs-phone-scam-518167793?icid=OnHomepageC2Wide_MustSee_Img',
'md5': '18ef68f48740e86ae94b98da815eec42',
'info_dict': {
'id': '518167793',
@@ -38,7 +34,7 @@ class AolIE(InfoExtractor):
}
}, {
# video with vidible ID
'url': 'https://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'url': 'http://www.aol.com/video/view/netflix-is-raising-rates/5707d6b8e4b090497b04f706/',
'info_dict': {
'id': '5707d6b8e4b090497b04f706',
'ext': 'mp4',
@@ -53,29 +49,17 @@ class AolIE(InfoExtractor):
'skip_download': True,
}
}, {
'url': 'https://www.aol.com/video/view/park-bench-season-2-trailer/559a1b9be4b0c3bfad3357a7/',
'url': 'http://on.aol.com/partners/abc-551438d309eab105804dbfe8/sneak-peek-was-haley-really-framed-570eaebee4b0448640a5c944',
'only_matching': True,
}, {
'url': 'https://www.aol.com/video/view/donald-trump-spokeswoman-tones-down-megyn-kelly-attacks/519442220/',
'url': 'http://on.aol.com/shows/park-bench-shw518173474-559a1b9be4b0c3bfad3357a7?context=SH:SHW518173474:PL4327:1460619712763',
'only_matching': True,
}, {
'url': 'http://on.aol.com/video/519442220',
'only_matching': True,
}, {
'url': 'aol-video:5707d6b8e4b090497b04f706',
'only_matching': True,
}, {
'url': 'https://www.aol.com/video/playlist/PL8245/5ca79d19d21f1a04035db606/',
'only_matching': True,
}, {
'url': 'https://www.aol.ca/video/view/u-s-woman-s-family-arrested-for-murder-first-pinned-on-panhandler-police/5c7ccf45bc03931fa04b2fe1/',
'only_matching': True,
}, {
'url': 'https://www.aol.co.uk/video/view/-one-dead-and-22-hurt-in-bus-crash-/5cb3a6f3d21f1a072b457347/',
'only_matching': True,
}, {
'url': 'https://www.aol.de/video/view/eva-braun-privataufnahmen-von-hitlers-geliebter-werden-digitalisiert/5cb2d49de98ab54c113d3d5d/',
'only_matching': True,
}, {
'url': 'https://www.aol.jp/video/playlist/5a28e936a1334d000137da0c/5a28f3151e642219fde19831/',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -89,7 +73,7 @@ class AolIE(InfoExtractor):
video_data = response['data']
formats = []
m3u8_url = url_or_none(video_data.get('videoMasterPlaylist'))
m3u8_url = video_data.get('videoMasterPlaylist')
if m3u8_url:
formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
@@ -112,12 +96,6 @@ class AolIE(InfoExtractor):
'width': int(mobj.group(1)),
'height': int(mobj.group(2)),
})
else:
qs = compat_parse_qs(compat_urllib_parse_urlparse(video_url).query)
f.update({
'width': int_or_none(qs.get('w', [None])[0]),
'height': int_or_none(qs.get('h', [None])[0]),
})
formats.append(f)
self._sort_formats(formats, ('width', 'height', 'tbr', 'format_id'))

View File

@@ -9,8 +9,8 @@ from ..utils import (
class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)'
_TESTS = [{
_VALID_URL = r'https?://(?:www\.)?beeg\.com/(?P<id>\d+)'
_TEST = {
'url': 'http://beeg.com/5416503',
'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'info_dict': {
@@ -24,13 +24,7 @@ class BeegIE(InfoExtractor):
'tags': list,
'age_limit': 18,
}
}, {
'url': 'https://beeg.porn/video/5416503',
'only_matching': True,
}, {
'url': 'https://beeg.porn/5416503',
'only_matching': True,
}]
}
def _real_extract(self, url):
video_id = self._match_id(url)

View File

@@ -1,37 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import extract_attributes
class BFIPlayerIE(InfoExtractor):
IE_NAME = 'bfi:player'
_VALID_URL = r'https?://player\.bfi\.org\.uk/[^/]+/film/watch-(?P<id>[\w-]+)-online'
_TEST = {
'url': 'https://player.bfi.org.uk/free/film/watch-computer-doctor-1974-online',
'md5': 'e8783ebd8e061ec4bc6e9501ed547de8',
'info_dict': {
'id': 'htNnhlZjE60C9VySkQEIBtU-cNV1Xx63',
'ext': 'mp4',
'title': 'Computer Doctor',
'description': 'md5:fb6c240d40c4dbe40428bdd62f78203b',
},
'skip': 'BFI Player films cannot be played outside of the UK',
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
entries = []
for player_el in re.findall(r'(?s)<[^>]+class="player"[^>]*>', webpage):
player_attr = extract_attributes(player_el)
ooyala_id = player_attr.get('data-video-id')
if not ooyala_id:
continue
entries.append(self.url_result(
'ooyala:' + ooyala_id, 'Ooyala',
ooyala_id, player_attr.get('data-label')))
return self.playlist_result(entries)

View File

@@ -2,96 +2,39 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from .vk import VKIE
from ..utils import (
HEADRequest,
int_or_none,
)
class BIQLEIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?biqle\.(?:com|org|ru)/watch/(?P<id>-?\d+_\d+)'
_TESTS = [{
# Youtube embed
'url': 'https://biqle.ru/watch/-115995369_456239081',
'md5': '97af5a06ee4c29bbf9c001bdb1cf5c06',
'url': 'http://www.biqle.ru/watch/847655_160197695',
'md5': 'ad5f746a874ccded7b8f211aeea96637',
'info_dict': {
'id': '8v4f-avW-VI',
'id': '160197695',
'ext': 'mp4',
'title': "PASSE-PARTOUT - L'ete c'est fait pour jouer",
'description': 'Passe-Partout',
'uploader_id': 'mrsimpsonstef3',
'uploader': 'Phanolito',
'upload_date': '20120822',
},
'title': 'Foo Fighters - The Pretender (Live at Wembley Stadium)',
'uploader': 'Andrey Rogozin',
'upload_date': '20110605',
}
}, {
'url': 'http://biqle.org/watch/-44781847_168547604',
'url': 'https://biqle.org/watch/-44781847_168547604',
'md5': '7f24e72af1db0edf7c1aaba513174f97',
'info_dict': {
'id': '-44781847_168547604',
'id': '168547604',
'ext': 'mp4',
'title': 'Ребенок в шоке от автоматической мойки',
'timestamp': 1396633454,
'uploader': 'Dmitry Kotov',
'upload_date': '20140404',
'uploader_id': '47850140',
},
'skip': ' This video was marked as adult. Embedding adult videos on external sites is prohibited.',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
embed_url = self._proto_relative_url(self._search_regex(
r'<iframe.+?src="((?:https?:)?//daxab\.com/[^"]+)".*?></iframe>',
webpage, 'embed url'))
if VKIE.suitable(embed_url):
return self.url_result(embed_url, VKIE.ie_key(), video_id)
self._request_webpage(
HEADRequest(embed_url), video_id, headers={'Referer': url})
video_id, sig, _, access_token = self._get_cookies(embed_url)['video_ext'].value.split('%3A')
item = self._download_json(
'https://api.vk.com/method/video.get', video_id,
headers={'User-Agent': 'okhttp/3.4.1'}, query={
'access_token': access_token,
'sig': sig,
'v': 5.44,
'videos': video_id,
})['response']['items'][0]
title = item['title']
formats = []
for f_id, f_url in item.get('files', {}).items():
if f_id == 'external':
return self.url_result(f_url)
ext, height = f_id.split('_')
formats.append({
'format_id': height + 'p',
'url': f_url,
'height': int_or_none(height),
'ext': ext,
})
self._sort_formats(formats)
thumbnails = []
for k, v in item.items():
if k.startswith('photo_') and v:
width = k.replace('photo_', '')
thumbnails.append({
'id': width,
'url': v,
'width': int_or_none(width),
})
r'<iframe.+?src="((?:http:)?//daxab\.com/[^"]+)".*?></iframe>', webpage, 'embed url'))
return {
'id': video_id,
'title': title,
'formats': formats,
'comment_count': int_or_none(item.get('comments')),
'description': item.get('description'),
'duration': int_or_none(item.get('duration')),
'thumbnails': thumbnails,
'timestamp': int_or_none(item.get('date')),
'uploader': item.get('owner_id'),
'view_count': int_or_none(item.get('views')),
'_type': 'url_transparent',
'url': embed_url,
}

View File

@@ -1,8 +1,6 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .adobepass import AdobePassIE
from ..utils import (
smuggle_url,
@@ -14,16 +12,16 @@ from ..utils import (
class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9',
'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
'md5': '9086d0b7ef0ea2aabc4781d75f4e5863',
'info_dict': {
'id': 'epL0pmK1kQlT',
'id': 'zHyk1_HU_mPy',
'ext': 'mp4',
'title': 'The Top Chef Season 16 Winner Is...',
'description': 'Find out who takes the title of Top Chef!',
'title': 'LCK Ep 12: Fishy Finale',
'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.',
'uploader': 'NBCU-BRAV',
'upload_date': '20190314',
'timestamp': 1552591860,
'upload_date': '20160302',
'timestamp': 1456945320,
}
}, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
@@ -34,38 +32,30 @@ class BravoTVIE(AdobePassIE):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'),
r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'),
display_id)
info = {}
query = {
'mbr': 'true',
}
account_pid, release_pid = [None] * 2
tve = settings.get('ls_tve')
tve = settings.get('sharedTVE')
if tve:
query['manifest'] = 'm3u'
mobj = re.search(r'<[^>]+id="pdk-player"[^>]+data-url=["\']?(?:https?:)?//player\.theplatform\.com/p/([^/]+)/(?:[^/]+/)*select/([^?#&"\']+)', webpage)
if mobj:
account_pid, tp_path = mobj.groups()
release_pid = tp_path.strip('/').split('/')[-1]
else:
account_pid = 'HNK2IC'
tp_path = release_pid = tve['release_pid']
account_pid = 'HNK2IC'
release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('tve_adobe_auth', {})
adobe_pass = settings.get('adobePass', {})
resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'),
tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
else:
shared_playlist = settings['ls_playlist']
shared_playlist = settings['shared_playlist']
account_pid = shared_playlist['account_pid']
metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']]
tp_path = release_pid = metadata.get('release_pid')
if not release_pid:
release_pid = metadata['guid']
tp_path = 'media/guid/2140479951/' + release_pid
release_pid = metadata['release_pid']
info.update({
'title': metadata['title'],
'description': metadata.get('description'),
@@ -77,7 +67,7 @@ class BravoTVIE(AdobePassIE):
'_type': 'url_transparent',
'id': release_pid,
'url': smuggle_url(update_url_query(
'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path),
'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid),
query), {'force_smil_url': True}),
'ie_key': 'ThePlatform',
})

View File

@@ -360,7 +360,7 @@ class CBCWatchVideoIE(CBCWatchBaseIE):
class CBCWatchIE(CBCWatchBaseIE):
IE_NAME = 'cbc.ca:watch'
_VALID_URL = r'https?://(?:gem|watch)\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)'
_VALID_URL = r'https?://watch\.cbc\.ca/(?:[^/]+/)+(?P<id>[0-9a-f-]+)'
_TESTS = [{
# geo-restricted to Canada, bypassable
'url': 'http://watch.cbc.ca/doc-zone/season-6/customer-disservice/38e815a-009e3ab12e4',
@@ -386,9 +386,6 @@ class CBCWatchIE(CBCWatchBaseIE):
'description': 'Arthur, the sweetest 8-year-old aardvark, and his pals solve all kinds of problems with humour, kindness and teamwork.',
},
'playlist_mincount': 30,
}, {
'url': 'https://gem.cbc.ca/media/this-hour-has-22-minutes/season-26/episode-20/38e815a-0108c6c6a42',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -13,17 +13,13 @@ from ..utils import (
class CBSBaseIE(ThePlatformFeedIE):
def _parse_smil_subtitles(self, smil, namespace=None, subtitles_lang='en'):
subtitles = {}
for k, ext in [('sMPTE-TTCCURL', 'tt'), ('ClosedCaptionURL', 'ttml'), ('webVTTCaptionURL', 'vtt')]:
cc_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', k)
if cc_e is not None:
cc_url = cc_e.get('value')
if cc_url:
subtitles.setdefault(subtitles_lang, []).append({
'ext': ext,
'url': cc_url,
})
return subtitles
closed_caption_e = find_xpath_attr(smil, self._xpath_ns('.//param', namespace), 'name', 'ClosedCaptionURL')
return {
'en': [{
'ext': 'ttml',
'url': closed_caption_e.attrib['value'],
}]
} if closed_caption_e is not None and closed_caption_e.attrib.get('value') else []
class CBSIE(CBSBaseIE):

View File

@@ -1,12 +1,9 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_iso8601,
try_get,
url_or_none,
)
@@ -21,13 +18,11 @@ class CCCIE(InfoExtractor):
'id': '1839',
'ext': 'mp4',
'title': 'Introduction to Processor Design',
'creator': 'byterazor',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228',
'timestamp': 1388188800,
'duration': 3710,
'tags': list,
}
}, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
@@ -73,7 +68,6 @@ class CCCIE(InfoExtractor):
'id': event_id,
'display_id': display_id,
'title': event_data['title'],
'creator': try_get(event_data, lambda x: ', '.join(x['persons'])),
'description': event_data.get('description'),
'thumbnail': event_data.get('thumb_url'),
'timestamp': parse_iso8601(event_data.get('date')),
@@ -81,31 +75,3 @@ class CCCIE(InfoExtractor):
'tags': event_data.get('tags'),
'formats': formats,
}
class CCCPlaylistIE(InfoExtractor):
IE_NAME = 'media.ccc.de:lists'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/c/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://media.ccc.de/c/30c3',
'info_dict': {
'title': '30C3',
'id': '30c3',
},
'playlist_count': 135,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url).lower()
conf = self._download_json(
'https://media.ccc.de/public/conferences/' + playlist_id,
playlist_id)
entries = []
for e in conf['events']:
event_url = url_or_none(e.get('frontend_link'))
if event_url:
entries.append(self.url_result(event_url, ie=CCCIE.ie_key()))
return self.playlist_result(entries, playlist_id, conf.get('title'))

View File

@@ -1,29 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .hbo import HBOBaseIE
class CinemaxIE(HBOBaseIE):
_VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))'
_TESTS = [{
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903',
'md5': '82e0734bba8aa7ef526c9dd00cf35a05',
'info_dict': {
'id': '20126903',
'ext': 'mp4',
'title': 'S1 Ep 1: Recap',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
}, {
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903.embed',
'only_matching': True,
}]
def _real_extract(self, url):
path, video_id = re.match(self._VALID_URL, url).groups()
info = self._extract_info('https://www.cinemax.com/%s.xml' % path, video_id)
info['id'] = video_id
return info

View File

@@ -65,8 +65,8 @@ class CiscoLiveBaseIE(InfoExtractor):
class CiscoLiveSessionIE(CiscoLiveBaseIE):
_VALID_URL = r'https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/[^#]*#/session/(?P<id>[^/?&]+)'
_TESTS = [{
_VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/\??[^#]*#/session/(?P<id>[^/?&]+)'
_TEST = {
'url': 'https://ciscolive.cisco.com/on-demand-library/?#/session/1423353499155001FoSs',
'md5': 'c98acf395ed9c9f766941c70f5352e22',
'info_dict': {
@@ -79,13 +79,7 @@ class CiscoLiveSessionIE(CiscoLiveBaseIE):
'uploader_id': '5647924234001',
'location': '16B Mezz.',
},
}, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?search.event=ciscoliveemea2019#/session/15361595531500013WOU',
'only_matching': True,
}, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?#/session/1490051371645001kNaS',
'only_matching': True,
}]
}
def _real_extract(self, url):
rf_id = self._match_id(url)
@@ -94,7 +88,7 @@ class CiscoLiveSessionIE(CiscoLiveBaseIE):
class CiscoLiveSearchIE(CiscoLiveBaseIE):
_VALID_URL = r'https?://(?:www\.)?ciscolive(?:\.cisco)?\.com/(?:global/)?on-demand-library(?:\.html|/)'
_VALID_URL = r'https?://ciscolive\.cisco\.com/on-demand-library/'
_TESTS = [{
'url': 'https://ciscolive.cisco.com/on-demand-library/?search.event=ciscoliveus2018&search.technicallevel=scpsSkillLevel_aintroductory&search.focus=scpsSessionFocus_designAndDeployment#/',
'info_dict': {
@@ -104,9 +98,6 @@ class CiscoLiveSearchIE(CiscoLiveBaseIE):
}, {
'url': 'https://ciscolive.cisco.com/on-demand-library/?search.technology=scpsTechnology_applicationDevelopment&search.technology=scpsTechnology_ipv6&search.focus=scpsSessionFocus_troubleshootingTroubleshooting#/',
'only_matching': True,
}, {
'url': 'https://www.ciscolive.com/global/on-demand-library.html?search.technicallevel=scpsSkillLevel_aintroductory&search.event=ciscoliveemea2019&search.technology=scpsTechnology_dataCenter&search.focus=scpsSessionFocus_bestPractices#/',
'only_matching': True,
}]
@classmethod

View File

@@ -44,7 +44,6 @@ from ..utils import (
compiled_regex_type,
determine_ext,
determine_protocol,
dict_get,
error_to_compat_str,
ExtractorError,
extract_attributes,
@@ -57,16 +56,13 @@ from ..utils import (
JSON_LD_RE,
mimetype2ext,
orderedSet,
parse_bitrate,
parse_codecs,
parse_duration,
parse_iso8601,
parse_m3u8_attributes,
parse_resolution,
RegexNotFoundError,
sanitized_Request,
sanitize_filename,
str_or_none,
unescapeHTML,
unified_strdate,
unified_timestamp,
@@ -112,13 +108,10 @@ class InfoExtractor(object):
for RTMP - RTMP URL,
for HLS - URL of the M3U8 media playlist,
for HDS - URL of the F4M manifest,
for DASH
- HTTP URL to plain file media (in case of
unfragmented media)
- URL of the MPD manifest or base URL
representing the media if MPD manifest
is parsed froma string (in case of
fragmented media)
for DASH - URL of the MPD manifest or
base URL representing the media
if MPD manifest is parsed from
a string,
for MSS - URL of the ISM manifest.
* manifest_url
The URL of the manifest file in case of
@@ -2019,8 +2012,6 @@ class InfoExtractor(object):
if res is False:
return []
mpd_doc, urlh = res
if mpd_doc is None:
return []
mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats(
@@ -2146,6 +2137,8 @@ class InfoExtractor(object):
bandwidth = int_or_none(representation_attrib.get('bandwidth'))
f = {
'format_id': '%s-%s' % (mpd_id, representation_id) if mpd_id else representation_id,
# NB: mpd_url may be empty when MPD manifest is parsed from a string
'url': mpd_url or base_url,
'manifest_url': mpd_url,
'ext': mimetype2ext(mime_type),
'width': int_or_none(representation_attrib.get('width')),
@@ -2284,14 +2277,10 @@ class InfoExtractor(object):
fragment['duration'] = segment_duration
fragments.append(fragment)
representation_ms_info['fragments'] = fragments
# If there is a fragments key available then we correctly recognized fragmented media.
# Otherwise we will assume unfragmented media with direct access. Technically, such
# assumption is not necessarily correct since we may simply have no support for
# some forms of fragmented media renditions yet, but for now we'll use this fallback.
# NB: MPD manifest may contain direct URLs to unfragmented media.
# No fragments key is present in this case.
if 'fragments' in representation_ms_info:
f.update({
# NB: mpd_url may be empty when MPD manifest is parsed from a string
'url': mpd_url or base_url,
'fragment_base_url': base_url,
'fragments': [],
'protocol': 'http_dash_segments',
@@ -2302,10 +2291,6 @@ class InfoExtractor(object):
f['url'] = initialization_url
f['fragments'].append({location_key(initialization_url): initialization_url})
f['fragments'].extend(representation_ms_info['fragments'])
else:
# Assuming direct URL to unfragmented media.
f['url'] = base_url
# According to [1, 5.3.5.2, Table 7, page 35] @id of Representation
# is not necessarily unique within a Period thus formats with
# the same `format_id` are quite possible. There are numerous examples
@@ -2487,43 +2472,18 @@ class InfoExtractor(object):
media_info['thumbnail'] = absolute_url(media_attributes.get('poster'))
if media_content:
for source_tag in re.findall(r'<source[^>]+>', media_content):
s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen
# several times in the wild
src = dict_get(s_attr, ('src', 'data-video-src', 'data-src'))
source_attributes = extract_attributes(source_tag)
src = source_attributes.get('src')
if not src:
continue
f = parse_content_type(s_attr.get('type'))
f = parse_content_type(source_attributes.get('type'))
is_plain_url, formats = _media_formats(src, media_type, f)
if is_plain_url:
# width, height, res, label and title attributes are
# all not standard but seen several times in the wild
labels = [
s_attr.get(lbl)
for lbl in ('label', 'title')
if str_or_none(s_attr.get(lbl))
]
width = int_or_none(s_attr.get('width'))
height = (int_or_none(s_attr.get('height')) or
int_or_none(s_attr.get('res')))
if not width or not height:
for lbl in labels:
resolution = parse_resolution(lbl)
if not resolution:
continue
width = width or resolution.get('width')
height = height or resolution.get('height')
for lbl in labels:
tbr = parse_bitrate(lbl)
if tbr:
break
else:
tbr = None
# res attribute is not standard but seen several times
# in the wild
f.update({
'width': width,
'height': height,
'tbr': tbr,
'format_id': s_attr.get('label') or s_attr.get('title'),
'height': int_or_none(source_attributes.get('res')),
'format_id': source_attributes.get('label'),
})
f.update(formats[0])
media_info['formats'].append(f)

View File

@@ -13,9 +13,9 @@ class CorusIE(ThePlatformFeedIE):
(?:www\.)?
(?P<domain>
(?:globaltv|etcanada)\.com|
(?:hgtv|foodnetwork|slice|history|showcase|bigbrothercanada)\.ca
(?:hgtv|foodnetwork|slice|history|showcase)\.ca
)
/(?:video/(?:[^/]+/)?|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
/(?:video/|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
(?P<id>\d+)
'''
_TESTS = [{
@@ -42,12 +42,6 @@ class CorusIE(ThePlatformFeedIE):
}, {
'url': 'http://www.showcase.ca/eyewitness/video/eyewitness++106/video.html?v=955070531919&p=1&s=da#video',
'only_matching': True,
}, {
'url': 'http://www.bigbrothercanada.ca/video/1457812035894/',
'only_matching': True
}, {
'url': 'https://www.bigbrothercanada.ca/video/big-brother-canada-704/1457812035894/',
'only_matching': True
}]
_TP_FEEDS = {
@@ -79,10 +73,6 @@ class CorusIE(ThePlatformFeedIE):
'feed_id': '9H6qyshBZU3E',
'account_id': 2414426607,
},
'bigbrothercanada': {
'feed_id': 'ChQqrem0lNUp',
'account_id': 2269680845,
},
}
def _real_extract(self, url):

View File

@@ -79,7 +79,7 @@ class CWTVIE(InfoExtractor):
season = str_or_none(video_data.get('season'))
episode = str_or_none(video_data.get('episode'))
if episode and season:
episode = episode[len(season):]
episode = episode.lstrip(season)
return {
'_type': 'url_transparent',

View File

@@ -58,17 +58,10 @@ class DigitallySpeakingIE(InfoExtractor):
stream_name = xpath_text(a_format, 'streamName', fatal=True)
video_path = re.match(r'mp4\:(?P<path>.*)', stream_name).group('path')
url = video_root + video_path
bitrate = xpath_text(a_format, 'bitrate')
tbr = int_or_none(bitrate)
vbr = int_or_none(self._search_regex(
r'-(\d+)\.mp4', video_path, 'vbr', default=None))
abr = tbr - vbr if tbr and vbr else None
vbr = xpath_text(a_format, 'bitrate')
video_formats.append({
'format_id': bitrate,
'url': url,
'tbr': tbr,
'vbr': vbr,
'abr': abr,
'vbr': int_or_none(vbr),
})
return video_formats

View File

@@ -0,0 +1,266 @@
# coding: utf-8
from __future__ import unicode_literals
import itertools
import json
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_urlparse,
)
from ..utils import (
clean_html,
ExtractorError,
int_or_none,
parse_age_limit,
parse_duration,
unified_timestamp,
url_or_none,
)
class DramaFeverBaseIE(InfoExtractor):
_NETRC_MACHINE = 'dramafever'
_CONSUMER_SECRET = 'DA59dtVXYLxajktV'
_consumer_secret = None
def _get_consumer_secret(self):
mainjs = self._download_webpage(
'http://www.dramafever.com/static/51afe95/df2014/scripts/main.js',
None, 'Downloading main.js', fatal=False)
if not mainjs:
return self._CONSUMER_SECRET
return self._search_regex(
r"var\s+cs\s*=\s*'([^']+)'", mainjs,
'consumer secret', default=self._CONSUMER_SECRET)
def _real_initialize(self):
self._consumer_secret = self._get_consumer_secret()
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'username': username,
'password': password,
}
try:
response = self._download_json(
'https://www.dramafever.com/api/users/login', None, 'Logging in',
data=json.dumps(login_form).encode('utf-8'), headers={
'x-consumer-key': self._consumer_secret,
})
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (403, 404):
response = self._parse_json(
e.cause.read().decode('utf-8'), None)
else:
raise
# Successful login
if response.get('result') or response.get('guid') or response.get('user_guid'):
return
errors = response.get('errors')
if errors and isinstance(errors, list):
error = errors[0]
message = error.get('message') or error['reason']
raise ExtractorError('Unable to login: %s' % message, expected=True)
raise ExtractorError('Unable to log in')
class DramaFeverIE(DramaFeverBaseIE):
IE_NAME = 'dramafever'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+/[0-9]+)(?:/|$)'
_TESTS = [{
'url': 'https://www.dramafever.com/drama/4274/1/Heirs/',
'info_dict': {
'id': '4274.1',
'ext': 'wvm',
'title': 'Heirs - Episode 1',
'description': 'md5:362a24ba18209f6276e032a651c50bc2',
'thumbnail': r're:^https?://.*\.jpg',
'duration': 3783,
'timestamp': 1381354993,
'upload_date': '20131009',
'series': 'Heirs',
'season_number': 1,
'episode': 'Episode 1',
'episode_number': 1,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.dramafever.com/drama/4826/4/Mnet_Asian_Music_Awards_2015/?ap=1',
'info_dict': {
'id': '4826.4',
'ext': 'flv',
'title': 'Mnet Asian Music Awards 2015',
'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91',
'episode': 'Mnet Asian Music Awards 2015 - Part 3',
'episode_number': 4,
'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1450213200,
'upload_date': '20151215',
'duration': 5359,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/',
'only_matching': True,
}]
def _call_api(self, path, video_id, note, fatal=False):
return self._download_json(
'https://www.dramafever.com/api/5/' + path,
video_id, note=note, headers={
'x-consumer-key': self._consumer_secret,
}, fatal=fatal)
def _get_subtitles(self, video_id):
subtitles = {}
subs = self._call_api(
'video/%s/subtitles/webvtt/' % video_id, video_id,
'Downloading subtitles JSON', fatal=False)
if not subs or not isinstance(subs, list):
return subtitles
for sub in subs:
if not isinstance(sub, dict):
continue
sub_url = url_or_none(sub.get('url'))
if not sub_url:
continue
subtitles.setdefault(
sub.get('code') or sub.get('language') or 'en', []).append({
'url': sub_url
})
return subtitles
def _real_extract(self, url):
video_id = self._match_id(url).replace('/', '.')
series_id, episode_number = video_id.split('.')
video = self._call_api(
'series/%s/episodes/%s/' % (series_id, episode_number), video_id,
'Downloading video JSON')
formats = []
download_assets = video.get('download_assets')
if download_assets and isinstance(download_assets, dict):
for format_id, format_dict in download_assets.items():
if not isinstance(format_dict, dict):
continue
format_url = url_or_none(format_dict.get('url'))
if not format_url:
continue
formats.append({
'url': format_url,
'format_id': format_id,
'filesize': int_or_none(video.get('filesize')),
})
stream = self._call_api(
'video/%s/stream/' % video_id, video_id, 'Downloading stream JSON',
fatal=False)
if stream:
stream_url = stream.get('stream_url')
if stream_url:
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
self._sort_formats(formats)
title = video.get('title') or 'Episode %s' % episode_number
description = video.get('description')
thumbnail = video.get('thumbnail')
timestamp = unified_timestamp(video.get('release_date'))
duration = parse_duration(video.get('duration'))
age_limit = parse_age_limit(video.get('tv_rating'))
series = video.get('series_title')
season_number = int_or_none(video.get('season'))
if series:
title = '%s - %s' % (series, title)
subtitles = self.extract_subtitles(video_id)
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'age_limit': age_limit,
'series': series,
'season_number': season_number,
'episode_number': int_or_none(episode_number),
'formats': formats,
'subtitles': subtitles,
}
class DramaFeverSeriesIE(DramaFeverBaseIE):
IE_NAME = 'dramafever:series'
_VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P<id>[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$'
_TESTS = [{
'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/',
'info_dict': {
'id': '4512',
'title': 'Cooking with Shin',
'description': 'md5:84a3f26e3cdc3fb7f500211b3593b5c1',
},
'playlist_count': 4,
}, {
'url': 'http://www.dramafever.com/drama/124/IRIS/',
'info_dict': {
'id': '124',
'title': 'IRIS',
'description': 'md5:b3a30e587cf20c59bd1c01ec0ee1b862',
},
'playlist_count': 20,
}]
_PAGE_SIZE = 60 # max is 60 (see http://api.drama9.com/#get--api-4-episode-series-)
def _real_extract(self, url):
series_id = self._match_id(url)
series = self._download_json(
'http://www.dramafever.com/api/4/series/query/?cs=%s&series_id=%s'
% (self._consumer_secret, series_id),
series_id, 'Downloading series JSON')['series'][series_id]
title = clean_html(series['name'])
description = clean_html(series.get('description') or series.get('description_short'))
entries = []
for page_num in itertools.count(1):
episodes = self._download_json(
'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_size=%d&page_number=%d'
% (self._consumer_secret, series_id, self._PAGE_SIZE, page_num),
series_id, 'Downloading episodes JSON page #%d' % page_num)
for episode in episodes.get('value', []):
episode_url = episode.get('episode_url')
if not episode_url:
continue
entries.append(self.url_result(
compat_urlparse.urljoin(url, episode_url),
'DramaFever', episode.get('guid')))
if page_num == episodes['num_pages']:
break
return self.playlist_result(entries, series_id, title, description)

View File

@@ -10,16 +10,16 @@ from ..utils import (
int_or_none,
js_to_json,
mimetype2ext,
try_get,
unescapeHTML,
parse_iso8601,
)
class DVTVIE(InfoExtractor):
IE_NAME = 'dvtv'
IE_DESC = 'http://video.aktualne.cz/'
_VALID_URL = r'https?://video\.aktualne\.cz/(?:[^/]+/)+r~(?P<id>[0-9a-f]{32})'
_TESTS = [{
'url': 'http://video.aktualne.cz/dvtv/vondra-o-ceskem-stoleti-pri-pohledu-na-havla-mi-bylo-trapne/r~e5efe9ca855511e4833a0025900fea04/',
'md5': '67cb83e4a955d36e1b5d31993134a0c2',
@@ -28,13 +28,11 @@ class DVTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Vondra o Českém století: Při pohledu na Havla mi bylo trapně',
'duration': 1484,
'upload_date': '20141217',
'timestamp': 1418792400,
}
}, {
'url': 'http://video.aktualne.cz/dvtv/dvtv-16-12-2014-utok-talibanu-boj-o-kliniku-uprchlici/r~973eb3bc854e11e498be002590604f2e/',
'info_dict': {
'title': r'DVTV 16. 12. 2014: útok Talibanu, boj o kliniku, uprchlíci',
'title': r're:^DVTV 16\. 12\. 2014: útok Talibanu, boj o kliniku, uprchlíci',
'id': '973eb3bc854e11e498be002590604f2e',
},
'playlist': [{
@@ -86,8 +84,6 @@ class DVTVIE(InfoExtractor):
'ext': 'mp4',
'title': 'Zeman si jen léčí mindráky, Sobotku nenávidí a Babiš se mu teď hodí, tvrdí Kmenta',
'duration': 1103,
'upload_date': '20170511',
'timestamp': 1494514200,
},
'params': {
'skip_download': True,
@@ -95,59 +91,43 @@ class DVTVIE(InfoExtractor):
}, {
'url': 'http://video.aktualne.cz/v-cechach-poprve-zazni-zelenkova-zrestaurovana-mse/r~45b4b00483ec11e4883b002590604f2e/',
'only_matching': True,
}, {
# Test live stream video (liveStarter) parsing
'url': 'https://video.aktualne.cz/dvtv/zive-mistryne-sveta-eva-samkova-po-navratu-ze-sampionatu/r~182654c2288811e990fd0cc47ab5f122/',
'md5': '2e552e483f2414851ca50467054f9d5d',
'info_dict': {
'id': '8d116360288011e98c840cc47ab5f122',
'ext': 'mp4',
'title': 'Živě: Mistryně světa Eva Samková po návratu ze šampionátu',
'upload_date': '20190204',
'timestamp': 1549289591,
},
'params': {
# Video content is no longer available
'skip_download': True,
},
}]
def _parse_video_metadata(self, js, video_id, timestamp):
def _parse_video_metadata(self, js, video_id, live_js=None):
data = self._parse_json(js, video_id, transform_source=js_to_json)
if live_js:
data.update(self._parse_json(
live_js, video_id, transform_source=js_to_json))
title = unescapeHTML(data['title'])
live_starter = try_get(data, lambda x: x['plugins']['liveStarter'], dict)
if live_starter:
data.update(live_starter)
formats = []
for tracks in data.get('tracks', {}).values():
for video in tracks:
video_url = video.get('src')
if not video_url:
continue
video_type = video.get('type')
ext = determine_ext(video_url, mimetype2ext(video_type))
if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif video_type == 'application/dash+xml' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
label = video.get('label')
height = self._search_regex(
r'^(\d+)[pP]', label or '', 'height', default=None)
format_id = ['http']
for f in (ext, label):
if f:
format_id.append(f)
formats.append({
'url': video_url,
'format_id': '-'.join(format_id),
'height': int_or_none(height),
})
for video in data['sources']:
video_url = video.get('file')
if not video_url:
continue
video_type = video.get('type')
ext = determine_ext(video_url, mimetype2ext(video_type))
if video_type == 'application/vnd.apple.mpegurl' or ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif video_type == 'application/dash+xml' or ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
label = video.get('label')
height = self._search_regex(
r'^(\d+)[pP]', label or '', 'height', default=None)
format_id = ['http']
for f in (ext, label):
if f:
format_id.append(f)
formats.append({
'url': video_url,
'format_id': '-'.join(format_id),
'height': int_or_none(height),
})
self._sort_formats(formats)
return {
@@ -156,29 +136,41 @@ class DVTVIE(InfoExtractor):
'description': data.get('description'),
'thumbnail': data.get('image'),
'duration': int_or_none(data.get('duration')),
'timestamp': int_or_none(timestamp),
'timestamp': int_or_none(data.get('pubtime')),
'formats': formats
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
timestamp = parse_iso8601(self._html_search_meta(
'article:published_time', webpage, 'published time', default=None))
items = re.findall(r'(?s)playlist\.push\(({.+?})\);', webpage)
if items:
return self.playlist_result(
[self._parse_video_metadata(i, video_id, timestamp) for i in items],
video_id, self._html_search_meta('twitter:title', webpage))
item = self._search_regex(
r'(?s)BBXPlayer\.setup\((.+?)\);',
# live content
live_item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\.asset\.liveStarter\s*=\s*(\{.+?\});',
webpage, 'video', default=None)
# single video
item = self._search_regex(
r'(?s)embedData[0-9a-f]{32}\[["\']asset["\']\]\s*=\s*(\{.+?\});',
webpage, 'video', default=None)
if item:
# remove function calls (ex. htmldeentitize)
# TODO this should be fixed in a general way in the js_to_json
item = re.sub(r'\w+?\((.+)\)', r'\1', item)
return self._parse_video_metadata(item, video_id, timestamp)
return self._parse_video_metadata(item, video_id, live_item)
# playlist
items = re.findall(
r"(?s)BBX\.context\.assets\['[0-9a-f]{32}'\]\.push\(({.+?})\);",
webpage)
if not items:
items = re.findall(r'(?s)var\s+asset\s*=\s*({.+?});\n', webpage)
if items:
return {
'_type': 'playlist',
'id': video_id,
'title': self._og_search_title(webpage),
'entries': [self._parse_video_metadata(i, video_id) for i in items]
}
raise ExtractorError('Could not find neither video nor playlist')

View File

@@ -1,11 +1,14 @@
from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
determine_ext,
clean_html,
int_or_none,
float_or_none,
sanitized_Request,
)
@@ -33,7 +36,7 @@ def _decrypt_config(key, string):
class EscapistIE(InfoExtractor):
_VALID_URL = r'https?://?(?:(?:www|v1)\.)?escapistmagazine\.com/videos/view/[^/]+/(?P<id>[0-9]+)'
_VALID_URL = r'https?://?(?:www\.)?escapistmagazine\.com/videos/view/[^/?#]+/(?P<id>[0-9]+)-[^/?#]*(?:$|[?#])'
_TESTS = [{
'url': 'http://www.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'md5': 'ab3a706c681efca53f0a35f1415cf0d1',
@@ -58,12 +61,6 @@ class EscapistIE(InfoExtractor):
'duration': 304,
'uploader': 'The Escapist',
}
}, {
'url': 'http://escapistmagazine.com/videos/view/the-escapist-presents/6618',
'only_matching': True,
}, {
'url': 'https://v1.escapistmagazine.com/videos/view/the-escapist-presents/6618-Breaking-Down-Baldurs-Gate',
'only_matching': True,
}]
def _real_extract(self, url):
@@ -77,20 +74,19 @@ class EscapistIE(InfoExtractor):
video_id = ims_video['videoID']
key = ims_video['hash']
config = self._download_webpage(
'http://www.escapistmagazine.com/videos/vidconfig.php',
video_id, 'Downloading video config', headers={
'Referer': url,
}, query={
'videoID': video_id,
'hash': key,
})
config_req = sanitized_Request(
'http://www.escapistmagazine.com/videos/'
'vidconfig.php?videoID=%s&hash=%s' % (video_id, key))
config_req.add_header('Referer', url)
config = self._download_webpage(config_req, video_id, 'Downloading video config')
data = self._parse_json(_decrypt_config(key, config), video_id)
data = json.loads(_decrypt_config(key, config))
video_data = data['videoData']
title = clean_html(video_data['title'])
duration = float_or_none(video_data.get('duration'), 1000)
uploader = video_data.get('publisher')
formats = [{
'url': video['src'],
@@ -103,9 +99,8 @@ class EscapistIE(InfoExtractor):
'id': video_id,
'formats': formats,
'title': title,
'thumbnail': self._og_search_thumbnail(webpage) or data.get('poster'),
'thumbnail': self._og_search_thumbnail(webpage),
'description': self._og_search_description(webpage),
'duration': float_or_none(video_data.get('duration'), 1000),
'uploader': video_data.get('publisher'),
'series': video_data.get('show'),
'duration': duration,
'uploader': uploader,
}

View File

@@ -20,7 +20,6 @@ from .acast import (
)
from .addanime import AddAnimeIE
from .adn import ADNIE
from .adobeconnect import AdobeConnectIE
from .adobetv import (
AdobeTVIE,
AdobeTVShowIE,
@@ -39,7 +38,9 @@ from .alphaporno import AlphaPornoIE
from .amcnetworks import AMCNetworksIE
from .americastestkitchen import AmericasTestKitchenIE
from .animeondemand import AnimeOnDemandIE
from .anitube import AnitubeIE
from .anvato import AnvatoIE
from .anysex import AnySexIE
from .aol import AolIE
from .allocine import AllocineIE
from .aliexpress import AliExpressLiveIE
@@ -107,7 +108,6 @@ from .behindkink import BehindKinkIE
from .bellmedia import BellMediaIE
from .beatport import BeatportIE
from .bet import BetIE
from .bfi import BFIPlayerIE
from .bigflix import BigflixIE
from .bild import BildIE
from .bilibili import (
@@ -177,10 +177,7 @@ from .cbsnews import (
CBSNewsLiveVideoIE,
)
from .cbssports import CBSSportsIE
from .ccc import (
CCCIE,
CCCPlaylistIE,
)
from .ccc import CCCIE
from .ccma import CCMAIE
from .cctv import CCTVIE
from .cda import CDAIE
@@ -197,7 +194,6 @@ from .chirbit import (
ChirbitProfileIE,
)
from .cinchcast import CinchcastIE
from .cinemax import CinemaxIE
from .ciscolive import (
CiscoLiveSessionIE,
CiscoLiveSearchIE,
@@ -287,6 +283,10 @@ from .dplay import (
DPlayIE,
DPlayItIE,
)
from .dramafever import (
DramaFeverIE,
DramaFeverSeriesIE,
)
from .dreisat import DreiSatIE
from .drbonanza import DRBonanzaIE
from .drtuber import DrTuberIE
@@ -442,7 +442,10 @@ from .goshgay import GoshgayIE
from .gputechconf import GPUTechConfIE
from .groupon import GrouponIE
from .hark import HarkIE
from .hbo import HBOIE
from .hbo import (
HBOIE,
HBOEpisodeIE,
)
from .hearthisat import HearThisAtIE
from .heise import HeiseIE
from .hellporno import HellPornoIE
@@ -631,11 +634,7 @@ from .massengeschmacktv import MassengeschmackTVIE
from .matchtv import MatchTVIE
from .mdr import MDRIE
from .mediaset import MediasetIE
from .mediasite import (
MediasiteIE,
MediasiteCatalogIE,
MediasiteNamedCatalogIE,
)
from .mediasite import MediasiteIE
from .medici import MediciIE
from .megaphone import MegaphoneIE
from .meipai import MeipaiIE
@@ -808,8 +807,6 @@ from .nrk import (
NRKTVSeasonIE,
NRKTVSeriesIE,
)
from .nrl import NRLTVIE
from .ntvcojp import NTVCoJpCUIE
from .ntvde import NTVDeIE
from .ntvru import NTVRuIE
from .nytimes import (
@@ -870,10 +867,6 @@ from .picarto import (
from .piksel import PikselIE
from .pinkbike import PinkbikeIE
from .pladform import PladformIE
from .platzi import (
PlatziIE,
PlatziCourseIE,
)
from .playfm import PlayFMIE
from .playplustv import PlayPlusTVIE
from .plays import PlaysTVIE
@@ -908,6 +901,7 @@ from .puhutv import (
PuhuTVSerieIE,
)
from .presstv import PressTVIE
from .primesharetv import PrimeShareTVIE
from .promptfile import PromptFileIE
from .prosiebensat1 import ProSiebenSat1IE
from .puls4 import Puls4IE
@@ -984,6 +978,7 @@ from .rtvnh import RTVNHIE
from .rtvs import RTVSIE
from .rudo import RudoIE
from .ruhd import RUHDIE
from .ruleporn import RulePornIE
from .rutube import (
RutubeIE,
RutubeChannelIE,
@@ -1095,12 +1090,7 @@ from .streamcloud import StreamcloudIE
from .streamcz import StreamCZIE
from .streetvoice import StreetVoiceIE
from .stretchinternet import StretchInternetIE
from .stv import STVPlayerIE
from .sunporno import SunPornoIE
from .sverigesradio import (
SverigesRadioEpisodeIE,
SverigesRadioPublicationIE,
)
from .svt import (
SVTIE,
SVTPageIE,
@@ -1128,7 +1118,6 @@ from .teachertube import (
)
from .teachingchannel import TeachingChannelIE
from .teamcoco import TeamcocoIE
from .teamtreehouse import TeamTreeHouseIE
from .techtalks import TechTalksIE
from .ted import TEDIE
from .tele5 import Tele5IE
@@ -1310,6 +1299,7 @@ from .viddler import ViddlerIE
from .videa import VideaIE
from .videodetective import VideoDetectiveIE
from .videofyme import VideofyMeIE
from .videomega import VideoMegaIE
from .videomore import (
VideomoreIE,
VideomoreVideoIE,
@@ -1415,13 +1405,17 @@ from .webofstories import (
WebOfStoriesPlaylistIE,
)
from .weibo import (
WeiboIE,
WeiboIE,
WeiboMobileIE
)
from .weiqitv import WeiqiTVIE
from .wimp import WimpIE
from .wistia import WistiaIE
from .worldstarhiphop import WorldStarHipHopIE
from .wrzuta import (
WrzutaIE,
WrzutaPlaylistIE,
)
from .wsj import (
WSJIE,
WSJArticleIE,
@@ -1454,16 +1448,13 @@ from .xxxymovies import XXXYMoviesIE
from .yahoo import (
YahooIE,
YahooSearchIE,
YahooGyaOPlayerIE,
YahooGyaOIE,
)
from .yandexdisk import YandexDiskIE
from .yandexmusic import (
YandexMusicTrackIE,
YandexMusicAlbumIE,
YandexMusicPlaylistIE,
)
from .yandexvideo import YandexVideoIE
from .yandexdisk import YandexDiskIE
from .yapfiles import YapFilesIE
from .yesjapan import YesJapanIE
from .yinyuetai import YinYueTaiIE

View File

@@ -6,12 +6,10 @@ import uuid
from .adobepass import AdobePassIE
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse_unquote,
)
from ..utils import (
ExtractorError,
int_or_none,
parse_age_limit,
parse_duration,
@@ -50,7 +48,6 @@ class FOXIE(AdobePassIE):
'url': 'https://www.fox.com/watch/30056b295fb57f7452aeeb4920bc3024/',
'only_matching': True,
}]
_GEO_BYPASS = False
_HOME_PAGE_URL = 'https://www.fox.com/'
_API_KEY = 'abdcbed02c124d393b39e818a4312055'
_access_token = None
@@ -61,22 +58,9 @@ class FOXIE(AdobePassIE):
}
if self._access_token:
headers['Authorization'] = 'Bearer ' + self._access_token
try:
return self._download_json(
'https://api2.fox.com/v2.0/' + path,
video_id, data=data, headers=headers)
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.status == 403:
entitlement_issues = self._parse_json(
e.cause.read().decode(), video_id)['entitlementIssues']
for e in entitlement_issues:
if e.get('errorCode') == 1005:
raise ExtractorError(
'This video is only available via cable service provider '
'subscription. You may want to use --cookies.', expected=True)
messages = ', '.join([e['message'] for e in entitlement_issues])
raise ExtractorError(messages, expected=True)
raise
return self._download_json(
'https://api2.fox.com/v2.0/' + path,
video_id, data=data, headers=headers)
def _real_initialize(self):
if not self._access_token:
@@ -97,15 +81,7 @@ class FOXIE(AdobePassIE):
title = video['name']
release_url = video['url']
try:
m3u8_url = self._download_json(release_url, video_id)['playURL']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.status == 403:
error = self._parse_json(e.cause.read().decode(), video_id)
if error.get('exception') == 'GeoLocationBlocked':
self.raise_geo_restricted(countries=['US'])
raise ExtractorError(error['description'], expected=True)
raise
m3u8_url = self._download_json(release_url, video_id)['playURL']
formats = self._extract_m3u8_formats(
m3u8_url, video_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id='hls')

View File

@@ -4,17 +4,12 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_str,
compat_urllib_parse_unquote,
)
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
str_or_none,
strip_or_none,
try_get,
urlencode_postdata,
)
@@ -51,29 +46,6 @@ class GaiaIE(InfoExtractor):
'skip_download': True,
},
}]
_NETRC_MACHINE = 'gaia'
_jwt = None
def _real_initialize(self):
auth = self._get_cookies('https://www.gaia.com/').get('auth')
if auth:
auth = self._parse_json(
compat_urllib_parse_unquote(auth.value),
None, fatal=False)
if not auth:
username, password = self._get_login_info()
if username is None:
return
auth = self._download_json(
'https://auth.gaia.com/v1/login',
None, data=urlencode_postdata({
'username': username,
'password': password
}))
if auth.get('success') is False:
raise ExtractorError(', '.join(auth['messages']), expected=True)
if auth:
self._jwt = auth.get('jwt')
def _real_extract(self, url):
display_id, vtype = re.search(self._VALID_URL, url).groups()
@@ -87,12 +59,8 @@ class GaiaIE(InfoExtractor):
media_id = compat_str(vdata['nid'])
title = node['title']
headers = None
if self._jwt:
headers = {'Authorization': 'Bearer ' + self._jwt}
media = self._download_json(
'https://brooklyn.gaia.com/media/' + media_id,
media_id, headers=headers)
'https://brooklyn.gaia.com/media/' + media_id, media_id)
formats = self._extract_m3u8_formats(
media['mediaUrls']['bcHLS'], media_id, 'mp4')
self._sort_formats(formats)

View File

@@ -3,24 +3,22 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from .kaltura import KalturaIE
from ..utils import (
HEADRequest,
sanitized_Request,
smuggle_url,
urlencode_postdata,
)
class GDCVaultIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)(?:/(?P<name>[\w-]+))?'
_VALID_URL = r'https?://(?:www\.)?gdcvault\.com/play/(?P<id>\d+)/(?P<name>(\w|-)+)?'
_NETRC_MACHINE = 'gdcvault'
_TESTS = [
{
'url': 'http://www.gdcvault.com/play/1019721/Doki-Doki-Universe-Sweet-Simple',
'md5': '7ce8388f544c88b7ac11c7ab1b593704',
'info_dict': {
'id': '201311826596_AWNY',
'id': '1019721',
'display_id': 'Doki-Doki-Universe-Sweet-Simple',
'ext': 'mp4',
'title': 'Doki-Doki Universe: Sweet, Simple and Genuine (GDC Next 10)'
@@ -29,7 +27,7 @@ class GDCVaultIE(InfoExtractor):
{
'url': 'http://www.gdcvault.com/play/1015683/Embracing-the-Dark-Art-of',
'info_dict': {
'id': '201203272_1330951438328RSXR',
'id': '1015683',
'display_id': 'Embracing-the-Dark-Art-of',
'ext': 'flv',
'title': 'Embracing the Dark Art of Mathematical Modeling in AI'
@@ -58,7 +56,7 @@ class GDCVaultIE(InfoExtractor):
'url': 'http://gdcvault.com/play/1023460/Tenacious-Design-and-The-Interface',
'md5': 'a8efb6c31ed06ca8739294960b2dbabd',
'info_dict': {
'id': '840376_BQRC',
'id': '1023460',
'ext': 'mp4',
'display_id': 'Tenacious-Design-and-The-Interface',
'title': 'Tenacious Design and The Interface of \'Destiny\'',
@@ -68,38 +66,26 @@ class GDCVaultIE(InfoExtractor):
# Multiple audios
'url': 'http://www.gdcvault.com/play/1014631/Classic-Game-Postmortem-PAC',
'info_dict': {
'id': '12396_1299111843500GMPX',
'ext': 'mp4',
'id': '1014631',
'ext': 'flv',
'title': 'How to Create a Good Game - From My Experience of Designing Pac-Man',
},
# 'params': {
# 'skip_download': True, # Requires rtmpdump
# 'format': 'jp', # The japanese audio
# }
'params': {
'skip_download': True, # Requires rtmpdump
'format': 'jp', # The japanese audio
}
},
{
# gdc-player.html
'url': 'http://www.gdcvault.com/play/1435/An-American-engine-in-Tokyo',
'info_dict': {
'id': '9350_1238021887562UHXB',
'id': '1435',
'display_id': 'An-American-engine-in-Tokyo',
'ext': 'mp4',
'ext': 'flv',
'title': 'An American Engine in Tokyo:/nThe collaboration of Epic Games and Square Enix/nFor THE LAST REMINANT',
},
},
{
# Kaltura Embed
'url': 'https://www.gdcvault.com/play/1026180/Mastering-the-Apex-of-Scaling',
'info_dict': {
'id': '0_h1fg8j3p',
'ext': 'mp4',
'title': 'Mastering the Apex of Scaling Game Servers (Presented by Multiplay)',
'timestamp': 1554401811,
'upload_date': '20190404',
'uploader_id': 'joe@blazestreaming.com',
},
'params': {
'format': 'mp4-408',
'skip_download': True, # Requires rtmpdump
},
},
]
@@ -128,8 +114,10 @@ class GDCVaultIE(InfoExtractor):
return start_page
def _real_extract(self, url):
video_id, name = re.match(self._VALID_URL, url).groups()
display_id = name or video_id
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
display_id = mobj.group('name') or video_id
webpage_url = 'http://www.gdcvault.com/play/' + video_id
start_page = self._download_webpage(webpage_url, display_id)
@@ -139,12 +127,12 @@ class GDCVaultIE(InfoExtractor):
start_page, 'url', default=None)
if direct_url:
title = self._html_search_regex(
r'<td><strong>Session Name:?</strong></td>\s*<td>(.*?)</td>',
r'<td><strong>Session Name</strong></td>\s*<td>(.*?)</td>',
start_page, 'title')
video_url = 'http://www.gdcvault.com' + direct_url
# resolve the url so that we can detect the correct extension
video_url = self._request_webpage(
HEADRequest(video_url), video_id).geturl()
head = self._request_webpage(HEADRequest(video_url), video_id)
video_url = head.geturl()
return {
'id': video_id,
@@ -153,36 +141,34 @@ class GDCVaultIE(InfoExtractor):
'title': title,
}
embed_url = KalturaIE._extract_url(start_page)
if embed_url:
embed_url = smuggle_url(embed_url, {'source_url': url})
ie_key = 'Kaltura'
else:
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
PLAYER_REGEX = r'<iframe src="(?P<xml_root>.+?)/(?:gdc-)?player.*?\.html.*?".*?</iframe>'
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not login.')
else:
start_page = login_res
# Grab the url from the authenticated page
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root')
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root', default=None)
if xml_root is None:
# Probably need to authenticate
login_res = self._login(webpage_url, display_id)
if login_res is None:
self.report_warning('Could not login.')
else:
start_page = login_res
# Grab the url from the authenticated page
xml_root = self._html_search_regex(
PLAYER_REGEX, start_page, 'xml root')
xml_name = self._html_search_regex(
r'<iframe src=".*?\?xml=(.+?\.xml).*?".*?</iframe>',
start_page, 'xml filename', default=None)
if xml_name is None:
# Fallback to the older format
xml_name = self._html_search_regex(
r'<iframe src=".*?\?xml(?:=|URL=xml/)(.+?\.xml).*?".*?</iframe>',
r'<iframe src=".*?\?xmlURL=xml/(?P<xml_file>.+?\.xml).*?".*?</iframe>',
start_page, 'xml filename')
embed_url = '%s/xml/%s' % (xml_root, xml_name)
ie_key = 'DigitallySpeaking'
return {
'_type': 'url_transparent',
'id': video_id,
'display_id': display_id,
'url': embed_url,
'ie_key': ie_key,
'url': '%s/xml/%s' % (xml_root, xml_name),
'ie_key': 'DigitallySpeaking',
}

View File

@@ -4,12 +4,12 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
xpath_text,
xpath_element,
int_or_none,
parse_duration,
urljoin,
)
@@ -53,13 +53,10 @@ class HBOBaseIE(InfoExtractor):
},
}
def _extract_info(self, url, display_id):
video_data = self._download_xml(url, display_id)
video_id = xpath_text(video_data, 'id', fatal=True)
episode_title = title = xpath_text(video_data, 'title', fatal=True)
series = xpath_text(video_data, 'program')
if series:
title = '%s - %s' % (series, title)
def _extract_from_id(self, video_id):
video_data = self._download_xml(
'http://render.lv3.hbo.com/data/content/global/videos/data/%s.xml' % video_id, video_id)
title = xpath_text(video_data, 'title', 'title', True)
formats = []
for source in xpath_element(video_data, 'videos', 'sources', True):
@@ -131,45 +128,68 @@ class HBOBaseIE(InfoExtractor):
'width': width,
})
subtitles = None
caption_url = xpath_text(video_data, 'captionUrl')
if caption_url:
subtitles = {
'en': [{
'url': caption_url,
'ext': 'ttml'
}],
}
return {
'id': video_id,
'title': title,
'duration': parse_duration(xpath_text(video_data, 'duration/tv14')),
'series': series,
'episode': episode_title,
'formats': formats,
'thumbnails': thumbnails,
'subtitles': subtitles,
}
class HBOIE(HBOBaseIE):
IE_NAME = 'hbo'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?:video|embed)(?:/[^/]+)*/(?P<id>[^/?#]+)'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/video/video\.html\?.*vid=(?P<id>[0-9]+)'
_TEST = {
'url': 'https://www.hbo.com/video/game-of-thrones/seasons/season-8/videos/trailer',
'md5': '8126210656f433c452a21367f9ad85b3',
'url': 'http://www.hbo.com/video/video.html?autoplay=true&g=u&vid=1437839',
'md5': '2c6a6bc1222c7e91cb3334dad1746e5a',
'info_dict': {
'id': '22113301',
'id': '1437839',
'ext': 'mp4',
'title': 'Game of Thrones - Trailer',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
'title': 'Ep. 64 Clip: Encryption',
'thumbnail': r're:https?://.*\.jpg$',
'duration': 1072,
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
location_path = self._parse_json(self._html_search_regex(
r'data-state="({.+?})"', webpage, 'state'), display_id)['video']['locationUrl']
return self._extract_info(urljoin(url, location_path), display_id)
video_id = self._match_id(url)
return self._extract_from_id(video_id)
class HBOEpisodeIE(HBOBaseIE):
IE_NAME = 'hbo:episode'
_VALID_URL = r'https?://(?:www\.)?hbo\.com/(?P<path>(?!video)(?:(?:[^/]+/)+video|watch-free-episodes)/(?P<id>[0-9a-z-]+))(?:\.html)?'
_TESTS = [{
'url': 'http://www.hbo.com/girls/episodes/5/52-i-love-you-baby/video/ep-52-inside-the-episode.html?autoplay=true',
'md5': '61ead79b9c0dfa8d3d4b07ef4ac556fb',
'info_dict': {
'id': '1439518',
'display_id': 'ep-52-inside-the-episode',
'ext': 'mp4',
'title': 'Ep. 52: Inside the Episode',
'thumbnail': r're:https?://.*\.jpg$',
'duration': 240,
},
}, {
'url': 'http://www.hbo.com/game-of-thrones/about/video/season-5-invitation-to-the-set.html?autoplay=true',
'only_matching': True,
}, {
'url': 'http://www.hbo.com/watch-free-episodes/last-week-tonight-with-john-oliver',
'only_matching': True,
}]
def _real_extract(self, url):
path, display_id = re.match(self._VALID_URL, url).groups()
content = self._download_json(
'http://www.hbo.com/api/content/' + path, display_id)['content']
video_id = compat_str((content.get('parsed', {}).get(
'common:FullBleedVideo', {}) or content['selectedEpisode'])['videoId'])
info_dict = self._extract_from_id(video_id)
info_dict['display_id'] = display_id
return info_dict

View File

@@ -1,11 +1,18 @@
from __future__ import unicode_literals
import json
import time
from .common import InfoExtractor
from ..utils import int_or_none
from ..compat import compat_urllib_parse_urlencode
from ..utils import (
ExtractorError,
sanitized_Request,
)
class HypemIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?hypem\.com/track/(?P<id>[0-9a-z]{5})'
_VALID_URL = r'https?://(?:www\.)?hypem\.com/track/(?P<id>[^/]+)/'
_TEST = {
'url': 'http://hypem.com/track/1v6ga/BODYWORK+-+TAME',
'md5': 'b9cc91b5af8995e9f0c1cee04c575828',
@@ -14,36 +21,41 @@ class HypemIE(InfoExtractor):
'ext': 'mp3',
'title': 'Tame',
'uploader': 'BODYWORK',
'timestamp': 1371810457,
'upload_date': '20130621',
}
}
def _real_extract(self, url):
track_id = self._match_id(url)
response = self._download_webpage(url, track_id)
data = {'ax': 1, 'ts': time.time()}
request = sanitized_Request(url + '?' + compat_urllib_parse_urlencode(data))
response, urlh = self._download_webpage_handle(
request, track_id, 'Downloading webpage with the url')
track = self._parse_json(self._html_search_regex(
r'(?s)<script\s+type="application/json"\s+id="displayList-data">(.+?)</script>',
response, 'tracks'), track_id)['tracks'][0]
html_tracks = self._html_search_regex(
r'(?ms)<script type="application/json" id="displayList-data">(.+?)</script>',
response, 'tracks')
try:
track_list = json.loads(html_tracks)
track = track_list['tracks'][0]
except ValueError:
raise ExtractorError('Hypemachine contained invalid JSON.')
key = track['key']
track_id = track['id']
title = track['song']
final_url = self._download_json(
'http://hypem.com/serve/source/%s/%s' % (track_id, track['key']),
track_id, 'Downloading metadata', headers={
'Content-Type': 'application/json'
})['url']
request = sanitized_Request(
'http://hypem.com/serve/source/%s/%s' % (track_id, key),
'', {'Content-Type': 'application/json'})
song_data = self._download_json(request, track_id, 'Downloading metadata')
final_url = song_data['url']
artist = track.get('artist')
return {
'id': track_id,
'url': final_url,
'ext': 'mp3',
'title': title,
'uploader': track.get('artist'),
'duration': int_or_none(track.get('time')),
'timestamp': int_or_none(track.get('ts')),
'track': title,
'uploader': artist,
}

View File

@@ -1,83 +1,36 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
strip_or_none,
xpath_attr,
xpath_text,
)
class InaIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?ina\.fr/(?:video|audio)/(?P<id>[A-Z0-9_]+)'
_TESTS = [{
_VALID_URL = r'https?://(?:www\.)?ina\.fr/video/(?P<id>I?[A-Z0-9]+)'
_TEST = {
'url': 'http://www.ina.fr/video/I12055569/francois-hollande-je-crois-que-c-est-clair-video.html',
'md5': 'a667021bf2b41f8dc6049479d9bb38a3',
'info_dict': {
'id': 'I12055569',
'ext': 'mp4',
'title': 'François Hollande "Je crois que c\'est clair"',
'description': 'md5:3f09eb072a06cb286b8f7e4f77109663',
}
}, {
'url': 'https://www.ina.fr/video/S806544_001/don-d-organes-des-avancees-mais-d-importants-besoins-video.html',
'only_matching': True,
}, {
'url': 'https://www.ina.fr/audio/P16173408',
'only_matching': True,
}, {
'url': 'https://www.ina.fr/video/P16173408-video.html',
'only_matching': True,
}]
}
def _real_extract(self, url):
video_id = self._match_id(url)
info_doc = self._download_xml(
'http://player.ina.fr/notices/%s.mrss' % video_id, video_id)
item = info_doc.find('channel/item')
title = xpath_text(item, 'title', fatal=True)
media_ns_xpath = lambda x: self._xpath_ns(x, 'http://search.yahoo.com/mrss/')
content = item.find(media_ns_xpath('content'))
mobj = re.match(self._VALID_URL, url)
get_furl = lambda x: xpath_attr(content, media_ns_xpath(x), 'url')
formats = []
for q, w, h in (('bq', 400, 300), ('mq', 512, 384), ('hq', 768, 576)):
q_url = get_furl(q)
if not q_url:
continue
formats.append({
'format_id': q,
'url': q_url,
'width': w,
'height': h,
})
if not formats:
furl = get_furl('player') or content.attrib['url']
ext = determine_ext(furl)
formats = [{
'url': furl,
'vcodec': 'none' if ext == 'mp3' else None,
'ext': ext,
}]
video_id = mobj.group('id')
mrss_url = 'http://player.ina.fr/notices/%s.mrss' % video_id
info_doc = self._download_xml(mrss_url, video_id)
thumbnails = []
for thumbnail in content.findall(media_ns_xpath('thumbnail')):
thumbnail_url = thumbnail.get('url')
if not thumbnail_url:
continue
thumbnails.append({
'url': thumbnail_url,
'height': int_or_none(thumbnail.get('height')),
'width': int_or_none(thumbnail.get('width')),
})
self.report_extraction(video_id)
video_url = info_doc.find('.//{http://search.yahoo.com/mrss/}player').attrib['url']
return {
'id': video_id,
'formats': formats,
'title': title,
'description': strip_or_none(xpath_text(item, 'description')),
'thumbnails': thumbnails,
'url': video_url,
'title': info_doc.find('.//title').text,
}

View File

@@ -7,7 +7,7 @@ from .common import InfoExtractor
class JWPlatformIE(InfoExtractor):
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|video)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_VALID_URL = r'(?:https?://(?:content\.jwplatform|cdn\.jwplayer)\.com/(?:(?:feed|player|thumb|preview|video|manifest)s|jw6|v2/media)/|jwplatform:)(?P<id>[a-zA-Z0-9]{8})'
_TESTS = [{
'url': 'http://content.jwplatform.com/players/nPripu9l-ALJ3XQCI.js',
'md5': 'fa8899fa601eb7c83a64e9d568bdf325',

View File

@@ -145,8 +145,6 @@ class KalturaIE(InfoExtractor):
)
if mobj:
embed_info = mobj.groupdict()
for k, v in embed_info.items():
embed_info[k] = v.strip()
url = 'kaltura:%(partner_id)s:%(id)s' % embed_info
escaped_pid = re.escape(embed_info['partner_id'])
service_url = re.search(

View File

@@ -9,13 +9,11 @@ from ..utils import (
float_or_none,
int_or_none,
urlencode_postdata,
urljoin,
)
class LinkedInLearningBaseIE(InfoExtractor):
_NETRC_MACHINE = 'linkedin'
_LOGIN_URL = 'https://www.linkedin.com/uas/login?trk=learning'
def _call_api(self, course_slug, fields, video_slug=None, resolution=None):
query = {
@@ -52,10 +50,11 @@ class LinkedInLearningBaseIE(InfoExtractor):
return
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
action_url = urljoin(self._LOGIN_URL, self._search_regex(
'https://www.linkedin.com/uas/login?trk=learning',
None, 'Downloading login page')
action_url = self._search_regex(
r'<form[^>]+action=(["\'])(?P<url>.+?)\1', login_page, 'post url',
default='https://www.linkedin.com/uas/login-submit', group='url'))
default='https://www.linkedin.com/uas/login-submit', group='url')
data = self._hidden_inputs(login_page)
data.update({
'session_key': email,

View File

@@ -13,8 +13,6 @@ from ..utils import (
ExtractorError,
float_or_none,
mimetype2ext,
str_or_none,
try_get,
unescapeHTML,
unsmuggle_url,
url_or_none,
@@ -22,11 +20,8 @@ from ..utils import (
)
_ID_RE = r'(?:[0-9a-f]{32,34}|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12,14})'
class MediasiteIE(InfoExtractor):
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/(?:default|livebroadcast)/Presentation)/(?P<id>%s)(?P<query>\?[^#]+|)' % _ID_RE
_VALID_URL = r'(?xi)https?://[^/]+/Mediasite/(?:Play|Showcase/(?:default|livebroadcast)/Presentation)/(?P<id>[0-9a-f]{32,34})(?P<query>\?[^#]+|)'
_TESTS = [
{
'url': 'https://hitsmediaweb.h-its.org/mediasite/Play/2db6c271681e4f199af3c60d1f82869b1d',
@@ -98,11 +93,6 @@ class MediasiteIE(InfoExtractor):
'url': 'https://mediasite.ntnu.no/Mediasite/Showcase/default/Presentation/7d8b913259334b688986e970fae6fcb31d',
'only_matching': True,
},
{
# dashed id
'url': 'https://hitsmediaweb.h-its.org/mediasite/Play/2db6c271-681e-4f19-9af3-c60d1f82869b1d',
'only_matching': True,
}
]
# look in Mediasite.Core.js (Mediasite.ContentStreamType[*])
@@ -119,7 +109,7 @@ class MediasiteIE(InfoExtractor):
return [
unescapeHTML(mobj.group('url'))
for mobj in re.finditer(
r'(?xi)<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:(?:https?:)?//[^/]+)?/Mediasite/Play/%s(?:\?.*?)?)\1' % _ID_RE,
r'(?xi)<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:(?:https?:)?//[^/]+)?/Mediasite/Play/[0-9a-f]{32,34}(?:\?.*?)?)\1',
webpage)]
def _real_extract(self, url):
@@ -231,136 +221,3 @@ class MediasiteIE(InfoExtractor):
'formats': formats,
'thumbnails': thumbnails,
}
class MediasiteCatalogIE(InfoExtractor):
_VALID_URL = r'''(?xi)
(?P<url>https?://[^/]+/Mediasite)
/Catalog/Full/
(?P<catalog_id>{0})
(?:
/(?P<current_folder_id>{0})
/(?P<root_dynamic_folder_id>{0})
)?
'''.format(_ID_RE)
_TESTS = [{
'url': 'http://events7.mediasite.com/Mediasite/Catalog/Full/631f9e48530d454381549f955d08c75e21',
'info_dict': {
'id': '631f9e48530d454381549f955d08c75e21',
'title': 'WCET Summit: Adaptive Learning in Higher Ed: Improving Outcomes Dynamically',
},
'playlist_count': 6,
'expected_warnings': ['is not a supported codec'],
}, {
# with CurrentFolderId and RootDynamicFolderId
'url': 'https://medaudio.medicine.iu.edu/Mediasite/Catalog/Full/9518c4a6c5cf4993b21cbd53e828a92521/97a9db45f7ab47428c77cd2ed74bb98f14/9518c4a6c5cf4993b21cbd53e828a92521',
'info_dict': {
'id': '9518c4a6c5cf4993b21cbd53e828a92521',
'title': 'IUSM Family and Friends Sessions',
},
'playlist_count': 2,
}, {
'url': 'http://uipsyc.mediasite.com/mediasite/Catalog/Full/d5d79287c75243c58c50fef50174ec1b21',
'only_matching': True,
}, {
# no AntiForgeryToken
'url': 'https://live.libraries.psu.edu/Mediasite/Catalog/Full/8376d4b24dd1457ea3bfe4cf9163feda21',
'only_matching': True,
}, {
'url': 'https://medaudio.medicine.iu.edu/Mediasite/Catalog/Full/9518c4a6c5cf4993b21cbd53e828a92521/97a9db45f7ab47428c77cd2ed74bb98f14/9518c4a6c5cf4993b21cbd53e828a92521',
'only_matching': True,
}, {
# dashed id
'url': 'http://events7.mediasite.com/Mediasite/Catalog/Full/631f9e48-530d-4543-8154-9f955d08c75e',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mediasite_url = mobj.group('url')
catalog_id = mobj.group('catalog_id')
current_folder_id = mobj.group('current_folder_id') or catalog_id
root_dynamic_folder_id = mobj.group('root_dynamic_folder_id')
webpage = self._download_webpage(url, catalog_id)
# AntiForgeryToken is optional (e.g. [1])
# 1. https://live.libraries.psu.edu/Mediasite/Catalog/Full/8376d4b24dd1457ea3bfe4cf9163feda21
anti_forgery_token = self._search_regex(
r'AntiForgeryToken\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'anti forgery token', default=None, group='value')
if anti_forgery_token:
anti_forgery_header = self._search_regex(
r'AntiForgeryHeaderName\s*:\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'anti forgery header name',
default='X-SOFO-AntiForgeryHeader', group='value')
data = {
'IsViewPage': True,
'IsNewFolder': True,
'AuthTicket': None,
'CatalogId': catalog_id,
'CurrentFolderId': current_folder_id,
'RootDynamicFolderId': root_dynamic_folder_id,
'ItemsPerPage': 1000,
'PageIndex': 0,
'PermissionMask': 'Execute',
'CatalogSearchType': 'SearchInFolder',
'SortBy': 'Date',
'SortDirection': 'Descending',
'StartDate': None,
'EndDate': None,
'StatusFilterList': None,
'PreviewKey': None,
'Tags': [],
}
headers = {
'Content-Type': 'application/json; charset=UTF-8',
'Referer': url,
'X-Requested-With': 'XMLHttpRequest',
}
if anti_forgery_token:
headers[anti_forgery_header] = anti_forgery_token
catalog = self._download_json(
'%s/Catalog/Data/GetPresentationsForFolder' % mediasite_url,
catalog_id, data=json.dumps(data).encode(), headers=headers)
entries = []
for video in catalog['PresentationDetailsList']:
if not isinstance(video, dict):
continue
video_id = str_or_none(video.get('Id'))
if not video_id:
continue
entries.append(self.url_result(
'%s/Play/%s' % (mediasite_url, video_id),
ie=MediasiteIE.ie_key(), video_id=video_id))
title = try_get(
catalog, lambda x: x['CurrentFolder']['Name'], compat_str)
return self.playlist_result(entries, catalog_id, title,)
class MediasiteNamedCatalogIE(InfoExtractor):
_VALID_URL = r'(?xi)(?P<url>https?://[^/]+/Mediasite)/Catalog/catalogs/(?P<catalog_name>[^/?#&]+)'
_TESTS = [{
'url': 'https://msite.misis.ru/Mediasite/Catalog/catalogs/2016-industrial-management-skriabin-o-o',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
mediasite_url = mobj.group('url')
catalog_name = mobj.group('catalog_name')
webpage = self._download_webpage(url, catalog_name)
catalog_id = self._search_regex(
r'CatalogId\s*:\s*["\'](%s)' % _ID_RE, webpage, 'catalog id')
return self.url_result(
'%s/Catalog/Full/%s' % (mediasite_url, catalog_id),
ie=MediasiteCatalogIE.ie_key(), video_id=catalog_id)

View File

@@ -1,32 +1,22 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
import time
import uuid
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
)
from ..utils import (
ExtractorError,
int_or_none,
)
from ..compat import compat_str
from ..utils import int_or_none
class MGTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?mgtv\.com/(v|b)/(?:[^/]+/)*(?P<id>\d+)\.html'
IE_DESC = '芒果TV'
_GEO_COUNTRIES = ['CN']
_TESTS = [{
'url': 'http://www.mgtv.com/v/1/290525/f/3116640.html',
'md5': 'b1ffc0fc163152acf6beaa81832c9ee7',
'info_dict': {
'id': '3116640',
'ext': 'mp4',
'title': '我是歌手 第四季',
'title': '我是歌手第四季双年巅峰会:韩红李玟“双王”领军对抗',
'description': '我是歌手第四季双年巅峰会',
'duration': 7461,
'thumbnail': r're:^https?://.*\.jpg$',
@@ -38,30 +28,16 @@ class MGTVIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
try:
api_data = self._download_json(
'https://pcweb.api.mgtv.com/player/video', video_id, query={
'tk2': base64.urlsafe_b64encode(b'did=%s|pno=1030|ver=0.3.0301|clit=%d' % (compat_str(uuid.uuid4()).encode(), time.time()))[::-1],
'video_id': video_id,
}, headers=self.geo_verification_headers())['data']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 401:
error = self._parse_json(e.cause.read().decode(), None)
if error.get('code') == 40005:
self.raise_geo_restricted(countries=self._GEO_COUNTRIES)
raise ExtractorError(error['msg'], expected=True)
raise
api_data = self._download_json(
'http://pcweb.api.mgtv.com/player/video', video_id,
query={'video_id': video_id},
headers=self.geo_verification_headers())['data']
info = api_data['info']
title = info['title'].strip()
stream_data = self._download_json(
'https://pcweb.api.mgtv.com/player/getSource', video_id, query={
'pm2': api_data['atc']['pm2'],
'video_id': video_id,
}, headers=self.geo_verification_headers())['data']
stream_domain = stream_data['stream_domain'][0]
stream_domain = api_data['stream_domain'][0]
formats = []
for idx, stream in enumerate(stream_data['stream']):
for idx, stream in enumerate(api_data['stream']):
stream_path = stream.get('url')
if not stream_path:
continue
@@ -71,7 +47,7 @@ class MGTVIE(InfoExtractor):
format_url = format_data.get('info')
if not format_url:
continue
tbr = int_or_none(stream.get('filebitrate') or self._search_regex(
tbr = int_or_none(self._search_regex(
r'_(\d+)_mp4/', format_url, 'tbr', default=None))
formats.append({
'format_id': compat_str(tbr or idx),

View File

@@ -1,12 +1,15 @@
# coding: utf-8
from __future__ import unicode_literals
import json
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
ExtractorError,
int_or_none,
sanitized_Request,
urlencode_postdata,
)
@@ -14,8 +17,8 @@ class MoeVideoIE(InfoExtractor):
IE_DESC = 'LetitBit video services: moevideo.net, playreplay.net and videochart.net'
_VALID_URL = r'''(?x)
https?://(?P<host>(?:www\.)?
(?:(?:moevideo|playreplay|videochart)\.net|thesame\.tv))/
(?:video|framevideo|embed)/(?P<id>[0-9a-z]+\.[0-9A-Za-z]+)'''
(?:(?:moevideo|playreplay|videochart)\.net))/
(?:video|framevideo)/(?P<id>[0-9]+\.[0-9A-Za-z]+)'''
_API_URL = 'http://api.letitbit.net/'
_API_KEY = 'tVL0gjqo5'
_TESTS = [
@@ -54,26 +57,58 @@ class MoeVideoIE(InfoExtractor):
]
def _real_extract(self, url):
host, video_id = re.match(self._VALID_URL, url).groups()
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
webpage = self._download_webpage(
'http://%s/video/%s' % (host, video_id),
'http://%s/video/%s' % (mobj.group('host'), video_id),
video_id, 'Downloading webpage')
title = self._og_search_title(webpage)
thumbnail = self._og_search_thumbnail(webpage)
description = self._og_search_description(webpage)
embed_webpage = self._download_webpage(
'http://%s/embed/%s' % (host, video_id),
video_id, 'Downloading embed webpage')
video = self._parse_json(self._search_regex(
r'mvplayer\("#player"\s*,\s*({.+})',
embed_webpage, 'mvplayer'), video_id)['video']
r = [
self._API_KEY,
[
'preview/flv_link',
{
'uid': video_id,
},
],
]
r_json = json.dumps(r)
post = urlencode_postdata({'r': r_json})
req = sanitized_Request(self._API_URL, post)
req.add_header('Content-type', 'application/x-www-form-urlencoded')
response = self._download_json(req, video_id)
if response['status'] != 'OK':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, response['data']),
expected=True
)
item = response['data'][0]
video_url = item['link']
duration = int_or_none(item['length'])
width = int_or_none(item['width'])
height = int_or_none(item['height'])
filesize = int_or_none(item['convert_size'])
formats = [{
'format_id': 'sd',
'http_headers': {'Range': 'bytes=0-'}, # Required to download
'url': video_url,
'width': width,
'height': height,
'filesize': filesize,
}]
return {
'id': video_id,
'title': title,
'thumbnail': video.get('poster') or self._og_search_thumbnail(webpage),
'description': clean_html(self._og_search_description(webpage)),
'duration': int_or_none(self._og_search_property('video:duration', webpage)),
'url': video['ourUrl'],
'thumbnail': thumbnail,
'description': description,
'duration': duration,
'formats': formats,
}

View File

@@ -1,17 +1,12 @@
# coding: utf-8
from __future__ import unicode_literals
import base64
import hashlib
import re
from .common import InfoExtractor
from ..aes import aes_cbc_decrypt
from ..utils import (
bytes_to_intlist,
ExtractorError,
int_or_none,
intlist_to_bytes,
parse_codecs,
parse_duration,
)
@@ -19,7 +14,7 @@ class NewstubeIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?newstube\.ru/media/(?P<id>.+)'
_TEST = {
'url': 'http://www.newstube.ru/media/telekanal-cnn-peremestil-gorod-slavyansk-v-krym',
'md5': '9d10320ad473444352f72f746ccb8b8c',
'md5': '801eef0c2a9f4089fa04e4fe3533abdc',
'info_dict': {
'id': '728e0ef2-e187-4012-bac0-5a081fdcb1f6',
'ext': 'mp4',
@@ -30,45 +25,84 @@ class NewstubeIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
page = self._download_webpage(url, video_id)
title = self._html_search_meta(['og:title', 'twitter:title'], page, fatal=True)
page = self._download_webpage(url, video_id, 'Downloading page')
video_guid = self._html_search_regex(
r'<meta\s+property="og:video(?::(?:(?:secure_)?url|iframe))?"\s+content="https?://(?:www\.)?newstube\.ru/embed/(?P<guid>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
r'<meta property="og:video:url" content="https?://(?:www\.)?newstube\.ru/freshplayer\.swf\?guid=(?P<guid>[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})',
page, 'video GUID')
enc_data = base64.b64decode(self._download_webpage(
'https://www.newstube.ru/embed/api/player/getsources2',
video_guid, query={
'guid': video_guid,
'ff': 3,
}))
key = hashlib.pbkdf2_hmac(
'sha1', video_guid.replace('-', '').encode(), enc_data[:16], 1)[:16]
dec_data = aes_cbc_decrypt(
bytes_to_intlist(enc_data[32:]), bytes_to_intlist(key),
bytes_to_intlist(enc_data[16:32]))
sources = self._parse_json(intlist_to_bytes(dec_data[:-dec_data[-1]]), video_guid)
player = self._download_xml(
'http://p.newstube.ru/v2/player.asmx/GetAutoPlayInfo6?state=&url=%s&sessionId=&id=%s&placement=profile&location=n2' % (url, video_guid),
video_guid, 'Downloading player XML')
def ns(s):
return s.replace('/', '/%(ns)s') % {'ns': '{http://app1.newstube.ru/N2SiteWS/player.asmx}'}
error_message = player.find(ns('./ErrorMessage'))
if error_message is not None:
raise ExtractorError('%s returned error: %s' % (self.IE_NAME, error_message.text), expected=True)
session_id = player.find(ns('./SessionId')).text
media_info = player.find(ns('./Medias/MediaInfo'))
title = media_info.find(ns('./Name')).text
description = self._og_search_description(page)
thumbnail = media_info.find(ns('./KeyFrame')).text
duration = int(media_info.find(ns('./Duration')).text) / 1000.0
formats = []
for source in sources:
source_url = source.get('Src')
if not source_url:
for stream_info in media_info.findall(ns('./Streams/StreamInfo')):
media_location = stream_info.find(ns('./MediaLocation'))
if media_location is None:
continue
height = int_or_none(source.get('Height'))
f = {
'format_id': 'http' + ('-%dp' % height if height else ''),
'url': source_url,
'width': int_or_none(source.get('Width')),
server = media_location.find(ns('./Server')).text
app = media_location.find(ns('./App')).text
media_id = stream_info.find(ns('./Id')).text
name = stream_info.find(ns('./Name')).text
width = int(stream_info.find(ns('./Width')).text)
height = int(stream_info.find(ns('./Height')).text)
formats.append({
'url': 'rtmp://%s/%s' % (server, app),
'app': app,
'play_path': '01/%s' % video_guid.upper(),
'rtmp_conn': ['S:%s' % session_id, 'S:%s' % media_id, 'S:n2'],
'page_url': url,
'ext': 'flv',
'format_id': 'rtmp' + ('-%s' % name if name else ''),
'width': width,
'height': height,
}
source_type = source.get('Type')
if source_type:
f.update(parse_codecs(self._search_regex(
r'codecs="([^"]+)"', source_type, 'codecs', fatal=False)))
formats.append(f)
})
sources_data = self._download_json(
'http://www.newstube.ru/player2/getsources?guid=%s' % video_guid,
video_guid, fatal=False)
if sources_data:
for source in sources_data.get('Sources', []):
source_url = source.get('Src')
if not source_url:
continue
height = int_or_none(source.get('Height'))
f = {
'format_id': 'http' + ('-%dp' % height if height else ''),
'url': source_url,
'width': int_or_none(source.get('Width')),
'height': height,
}
source_type = source.get('Type')
if source_type:
mobj = re.search(r'codecs="([^,]+),\s*([^"]+)"', source_type)
if mobj:
vcodec, acodec = mobj.groups()
f.update({
'vcodec': vcodec,
'acodec': acodec,
})
formats.append(f)
self._check_formats(formats, video_guid)
self._sort_formats(formats)
@@ -76,8 +110,8 @@ class NewstubeIE(InfoExtractor):
return {
'id': video_guid,
'title': title,
'description': self._html_search_meta(['description', 'og:description'], page),
'thumbnail': self._html_search_meta(['og:image:secure_url', 'og:image', 'twitter:image'], page),
'duration': parse_duration(self._html_search_meta('duration', page)),
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'formats': formats,
}

View File

@@ -1,81 +1,54 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import ExtractorError
class NhkVodIE(InfoExtractor):
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/(?P<lang>[a-z]{2})/ondemand/(?P<type>video|audio)/(?P<id>\d{7}|[a-z]+-\d{8}-\d+)'
# Content available only for a limited period of time. Visit
# https://www3.nhk.or.jp/nhkworld/en/ondemand/ for working samples.
_VALID_URL = r'https?://www3\.nhk\.or\.jp/nhkworld/en/(?:vod|ondemand)/(?P<id>[^/]+/[^/?#&]+)'
_TESTS = [{
# Videos available only for a limited period of time. Visit
# http://www3.nhk.or.jp/nhkworld/en/vod/ for working samples.
'url': 'http://www3.nhk.or.jp/nhkworld/en/vod/tokyofashion/20160815',
'info_dict': {
'id': 'A1bnNiNTE6nY3jLllS-BIISfcC_PpvF5',
'ext': 'flv',
'title': 'TOKYO FASHION EXPRESS - The Kimono as Global Fashion',
'description': 'md5:db338ee6ce8204f415b754782f819824',
'series': 'TOKYO FASHION EXPRESS',
'episode': 'The Kimono as Global Fashion',
},
'skip': 'Videos available only for a limited period of time',
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/video/2015173/',
'only_matching': True,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/en/ondemand/audio/plugin-20190404-1/',
'only_matching': True,
}, {
'url': 'https://www3.nhk.or.jp/nhkworld/fr/ondemand/audio/plugin-20190404-1/',
'only_matching': True,
}]
_API_URL_TEMPLATE = 'https://api.nhk.or.jp/nhkworld/%sodesdlist/v7/episode/%s/%s/all%s.json'
_API_URL = 'http://api.nhk.or.jp/nhkworld/vodesdlist/v1/all/all/all.json?apikey=EJfK8jdS57GqlupFgAfAAwr573q01y6k'
def _real_extract(self, url):
lang, m_type, episode_id = re.match(self._VALID_URL, url).groups()
if episode_id.isdigit():
episode_id = episode_id[:4] + '-' + episode_id[4:]
video_id = self._match_id(url)
data = self._download_json(self._API_URL, video_id)
try:
episode = next(
e for e in data['data']['episodes']
if e.get('url') and video_id in e['url'])
except StopIteration:
raise ExtractorError('Unable to find episode')
embed_code = episode['vod_id']
is_video = m_type == 'video'
episode = self._download_json(
self._API_URL_TEMPLATE % ('v' if is_video else 'r', episode_id, lang, '/all' if is_video else ''),
episode_id, query={'apikey': 'EJfK8jdS57GqlupFgAfAAwr573q01y6k'})['data']['episodes'][0]
title = episode.get('sub_title_clean') or episode['sub_title']
description = episode.get('description_clean') or episode.get('description')
series = episode.get('title_clean') or episode.get('title')
def get_clean_field(key):
return episode.get(key + '_clean') or episode.get(key)
series = get_clean_field('title')
thumbnails = []
for s, w, h in [('', 640, 360), ('_l', 1280, 720)]:
img_path = episode.get('image' + s)
if not img_path:
continue
thumbnails.append({
'id': '%dp' % h,
'height': h,
'width': w,
'url': 'https://www3.nhk.or.jp' + img_path,
})
info = {
'id': episode_id + '-' + lang,
return {
'_type': 'url_transparent',
'ie_key': 'Ooyala',
'url': 'ooyala:%s' % embed_code,
'title': '%s - %s' % (series, title) if series and title else title,
'description': get_clean_field('description'),
'thumbnails': thumbnails,
'description': description,
'series': series,
'episode': title,
}
if is_video:
info.update({
'_type': 'url_transparent',
'ie_key': 'Ooyala',
'url': 'ooyala:' + episode['vod_id'],
})
else:
audio = episode['audio']
audio_path = audio['audio']
info['formats'] = self._extract_m3u8_formats(
'https://nhks-vh.akamaihd.net/i%s/master.m3u8' % audio_path,
episode_id, 'm4a', m3u8_id='hls', fatal=False)
for proto in ('rtmpt', 'rtmp'):
info['formats'].append({
'ext': 'flv',
'format_id': proto,
'url': '%s://flv.nhk.or.jp/ondemand/mp4:flv%s' % (proto, audio_path),
'vcodec': 'none',
})
for f in info['formats']:
f['language'] = lang
return info

View File

@@ -181,7 +181,10 @@ class NPOIE(NPOBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
return self._get_info(url, video_id) or self._get_old_info(video_id)
try:
return self._get_info(url, video_id)
except ExtractorError:
return self._get_old_info(video_id)
def _get_info(self, url, video_id):
token = self._download_json(
@@ -203,7 +206,6 @@ class NPOIE(NPOBaseIE):
player_token = player['token']
drm = False
format_urls = set()
formats = []
for profile in ('hls', 'dash-widevine', 'dash-playready', 'smooth'):
@@ -225,8 +227,7 @@ class NPOIE(NPOBaseIE):
if not stream_url or stream_url in format_urls:
continue
format_urls.add(stream_url)
if stream.get('protection') is not None or stream.get('keySystemOptions') is not None:
drm = True
if stream.get('protection') is not None:
continue
stream_type = stream.get('type')
stream_ext = determine_ext(stream_url)
@@ -245,11 +246,6 @@ class NPOIE(NPOBaseIE):
'url': stream_url,
})
if not formats:
if drm:
raise ExtractorError('This video is DRM protected.', expected=True)
return
self._sort_formats(formats)
info = {

View File

@@ -1,6 +1,7 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urllib_parse_urlencode
from ..utils import (
int_or_none,
qualities,
@@ -8,16 +9,16 @@ from ..utils import (
class NprIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?npr\.org/(?:sections/[^/]+/)?\d{4}/\d{2}/\d{2}/(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?npr\.org/player/v2/mediaPlayer\.html\?.*\bid=(?P<id>\d+)'
_TESTS = [{
'url': 'https://www.npr.org/sections/allsongs/2015/10/21/449974205/new-music-from-beach-house-chairlift-cmj-discoveries-and-more',
'url': 'http://www.npr.org/player/v2/mediaPlayer.html?id=449974205',
'info_dict': {
'id': '449974205',
'title': 'New Music From Beach House, Chairlift, CMJ Discoveries And More'
},
'playlist_count': 7,
}, {
'url': 'https://www.npr.org/sections/deceptivecadence/2015/10/09/446928052/music-from-the-shadows-ancient-armenian-hymns-and-piano-jazz',
'url': 'http://www.npr.org/player/v2/mediaPlayer.html?action=1&t=1&islist=false&id=446928052&m=446929930&live=1',
'info_dict': {
'id': '446928052',
'title': "Songs We Love: Tigran Hamasyan, 'Your Mercy is Boundless'"
@@ -31,46 +32,30 @@ class NprIE(InfoExtractor):
'duration': 402,
},
}],
}, {
# mutlimedia, not media title
'url': 'https://www.npr.org/2017/06/19/533198237/tigers-jaw-tiny-desk-concert',
'info_dict': {
'id': '533198237',
'title': 'Tigers Jaw: Tiny Desk Concert',
},
'playlist': [{
'md5': '12fa60cb2d3ed932f53609d4aeceabf1',
'info_dict': {
'id': '533201718',
'ext': 'mp4',
'title': 'Tigers Jaw: Tiny Desk Concert',
'duration': 402,
},
}],
'expected_warnings': ['Failed to download m3u8 information'],
}]
def _real_extract(self, url):
playlist_id = self._match_id(url)
story = self._download_json(
'http://api.npr.org/query', playlist_id, query={
config = self._download_json(
'http://api.npr.org/query?%s' % compat_urllib_parse_urlencode({
'id': playlist_id,
'fields': 'audio,multimedia,title',
'fields': 'titles,audio,show',
'format': 'json',
'apiKey': 'MDAzMzQ2MjAyMDEyMzk4MTU1MDg3ZmM3MQ010',
})['list']['story'][0]
playlist_title = story.get('title', {}).get('$text')
}), playlist_id)
KNOWN_FORMATS = ('threegp', 'm3u8', 'smil', 'mp4', 'mp3')
story = config['list']['story'][0]
KNOWN_FORMATS = ('threegp', 'mp4', 'mp3')
quality = qualities(KNOWN_FORMATS)
entries = []
for media in story.get('audio', []) + story.get('multimedia', []):
media_id = media['id']
for audio in story.get('audio', []):
title = audio.get('title', {}).get('$text')
duration = int_or_none(audio.get('duration', {}).get('$text'))
formats = []
for format_id, formats_entry in media.get('format', {}).items():
for format_id, formats_entry in audio.get('format', {}).items():
if not formats_entry:
continue
if isinstance(formats_entry, list):
@@ -79,30 +64,19 @@ class NprIE(InfoExtractor):
if not format_url:
continue
if format_id in KNOWN_FORMATS:
if format_id == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, media_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
elif format_id == 'smil':
smil_formats = self._extract_smil_formats(
format_url, media_id, transform_source=lambda s: s.replace(
'rtmp://flash.npr.org/ondemand/', 'https://ondemand.npr.org/'))
self._check_formats(smil_formats, media_id)
formats.extend(smil_formats)
else:
formats.append({
'url': format_url,
'format_id': format_id,
'quality': quality(format_id),
})
formats.append({
'url': format_url,
'format_id': format_id,
'ext': formats_entry.get('type'),
'quality': quality(format_id),
})
self._sort_formats(formats)
entries.append({
'id': media_id,
'title': media.get('title', {}).get('$text') or playlist_title,
'thumbnail': media.get('altImageUrl', {}).get('$text'),
'duration': int_or_none(media.get('duration', {}).get('$text')),
'id': audio['id'],
'title': title,
'duration': duration,
'formats': formats,
})
playlist_title = story.get('title', {}).get('$text')
return self.playlist_result(entries, playlist_id, playlist_title)

View File

@@ -1,30 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class NRLTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?nrl\.com/tv(/[^/]+)*/(?P<id>[^/?&#]+)'
_TEST = {
'url': 'https://www.nrl.com/tv/news/match-highlights-titans-v-knights-862805/',
'info_dict': {
'id': 'YyNnFuaDE6kPJqlDhG4CGQ_w89mKTau4',
'ext': 'mp4',
'title': 'Match Highlights: Titans v Knights',
},
'params': {
# m3u8 download
'skip_download': True,
'format': 'bestvideo',
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
q_data = self._parse_json(self._search_regex(
r"(?s)q-data='({.+?})'", webpage, 'player data'), display_id)
ooyala_id = q_data['videoId']
return self.url_result(
'ooyala:' + ooyala_id, 'Ooyala', ooyala_id, q_data.get('title'))

View File

@@ -1,49 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
js_to_json,
smuggle_url,
)
class NTVCoJpCUIE(InfoExtractor):
IE_NAME = 'cu.ntv.co.jp'
IE_DESC = 'Nippon Television Network'
_VALID_URL = r'https?://cu\.ntv\.co\.jp/(?!program)(?P<id>[^/?&#]+)'
_TEST = {
'url': 'https://cu.ntv.co.jp/televiva-chill-gohan_181031/',
'info_dict': {
'id': '5978891207001',
'ext': 'mp4',
'title': '桜エビと炒り卵がポイント! 「中華風 エビチリおにぎり」──『美虎』五十嵐美幸',
'upload_date': '20181213',
'description': 'md5:211b52f4fd60f3e0e72b68b0c6ba52a9',
'uploader_id': '3855502814001',
'timestamp': 1544669941,
},
'params': {
# m3u8 download
'skip_download': True,
},
}
BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s'
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
player_config = self._parse_json(self._search_regex(
r'(?s)PLAYER_CONFIG\s*=\s*({.+?})',
webpage, 'player config'), display_id, js_to_json)
video_id = player_config['videoId']
account_id = player_config.get('account') or '3855502814001'
return {
'_type': 'url_transparent',
'id': video_id,
'display_id': display_id,
'title': self._search_regex(r'<h1[^>]+class="title"[^>]*>([^<]+)', webpage, 'title').strip(),
'description': self._html_search_meta(['description', 'og:description'], webpage),
'url': smuggle_url(self.BRIGHTCOVE_URL_TEMPLATE % (account_id, video_id), {'geo_countries': ['JP']}),
'ie_key': 'BrightcoveNew',
}

View File

@@ -36,7 +36,7 @@ class OoyalaBaseIE(InfoExtractor):
'domain': domain,
'supportedFormats': supportedformats or 'mp4,rtmp,m3u8,hds,dash,smooth',
'embedToken': embed_token,
}), video_id, headers=self.geo_verification_headers())
}), video_id)
cur_auth_data = auth_data['authorization_data'][embed_code]

File diff suppressed because it is too large Load Diff

View File

@@ -176,8 +176,7 @@ class ORFRadioIE(InfoExtractor):
'description': subtitle,
'duration': (info['end'] - info['start']) / 1000,
'timestamp': info['start'] / 1000,
'ext': 'mp3',
'series': data.get('programTitle')
'ext': 'mp3'
}
entries = [extract_entry_dict(t, data['title'], data['subtitle']) for t in data['streams']]

View File

@@ -36,7 +36,7 @@ class PandaTVIE(InfoExtractor):
'https://www.panda.tv/api_room_v2?roomid=%s' % video_id, video_id)
error_code = config.get('errno', 0)
if error_code != 0:
if error_code is not 0:
raise ExtractorError(
'%s returned error %s: %s'
% (self.IE_NAME, error_code, config['errmsg']),

View File

@@ -1,217 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_b64decode,
compat_str,
)
from ..utils import (
clean_html,
ExtractorError,
int_or_none,
str_or_none,
try_get,
url_or_none,
urlencode_postdata,
urljoin,
)
class PlatziIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
platzi\.com/clases| # es version
courses\.platzi\.com/classes # en version
)/[^/]+/(?P<id>\d+)-[^/?\#&]+
'''
_LOGIN_URL = 'https://platzi.com/login/'
_NETRC_MACHINE = 'platzi'
_TESTS = [{
'url': 'https://platzi.com/clases/1311-next-js/12074-creando-nuestra-primera-pagina/',
'md5': '8f56448241005b561c10f11a595b37e3',
'info_dict': {
'id': '12074',
'ext': 'mp4',
'title': 'Creando nuestra primera página',
'description': 'md5:4c866e45034fc76412fbf6e60ae008bc',
'duration': 420,
},
'skip': 'Requires platzi account credentials',
}, {
'url': 'https://courses.platzi.com/classes/1367-communication-codestream/13430-background/',
'info_dict': {
'id': '13430',
'ext': 'mp4',
'title': 'Background',
'description': 'md5:49c83c09404b15e6e71defaf87f6b305',
'duration': 360,
},
'skip': 'Requires platzi account credentials',
'params': {
'skip_download': True,
},
}]
def _real_initialize(self):
self._login()
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_page = self._download_webpage(
self._LOGIN_URL, None, 'Downloading login page')
login_form = self._hidden_inputs(login_page)
login_form.update({
'email': username,
'password': password,
})
urlh = self._request_webpage(
self._LOGIN_URL, None, 'Logging in',
data=urlencode_postdata(login_form),
headers={'Referer': self._LOGIN_URL})
# login succeeded
if 'platzi.com/login' not in compat_str(urlh.geturl()):
return
login_error = self._webpage_read_content(
urlh, self._LOGIN_URL, None, 'Downloading login error page')
login = self._parse_json(
self._search_regex(
r'login\s*=\s*({.+?})(?:\s*;|\s*</script)', login_error, 'login'),
None)
for kind in ('error', 'password', 'nonFields'):
error = str_or_none(login.get('%sError' % kind))
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
raise ExtractorError('Unable to log in')
def _real_extract(self, url):
lecture_id = self._match_id(url)
webpage = self._download_webpage(url, lecture_id)
data = self._parse_json(
self._search_regex(
r'client_data\s*=\s*({.+?})\s*;', webpage, 'client data'),
lecture_id)
material = data['initialState']['material']
desc = material['description']
title = desc['title']
formats = []
for server_id, server in material['videos'].items():
if not isinstance(server, dict):
continue
for format_id in ('hls', 'dash'):
format_url = url_or_none(server.get(format_id))
if not format_url:
continue
if format_id == 'hls':
formats.extend(self._extract_m3u8_formats(
format_url, lecture_id, 'mp4',
entry_protocol='m3u8_native', m3u8_id=format_id,
note='Downloading %s m3u8 information' % server_id,
fatal=False))
elif format_id == 'dash':
formats.extend(self._extract_mpd_formats(
format_url, lecture_id, mpd_id=format_id,
note='Downloading %s MPD manifest' % server_id,
fatal=False))
self._sort_formats(formats)
content = str_or_none(desc.get('content'))
description = (clean_html(compat_b64decode(content).decode('utf-8'))
if content else None)
duration = int_or_none(material.get('duration'), invscale=60)
return {
'id': lecture_id,
'title': title,
'description': description,
'duration': duration,
'formats': formats,
}
class PlatziCourseIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:
platzi\.com/clases| # es version
courses\.platzi\.com/classes # en version
)/(?P<id>[^/?\#&]+)
'''
_TESTS = [{
'url': 'https://platzi.com/clases/next-js/',
'info_dict': {
'id': '1311',
'title': 'Curso de Next.js',
},
'playlist_count': 22,
}, {
'url': 'https://courses.platzi.com/classes/communication-codestream/',
'info_dict': {
'id': '1367',
'title': 'Codestream Course',
},
'playlist_count': 14,
}]
@classmethod
def suitable(cls, url):
return False if PlatziIE.suitable(url) else super(PlatziCourseIE, cls).suitable(url)
def _real_extract(self, url):
course_name = self._match_id(url)
webpage = self._download_webpage(url, course_name)
props = self._parse_json(
self._search_regex(r'data\s*=\s*({.+?})\s*;', webpage, 'data'),
course_name)['initialProps']
entries = []
for chapter_num, chapter in enumerate(props['concepts'], 1):
if not isinstance(chapter, dict):
continue
materials = chapter.get('materials')
if not materials or not isinstance(materials, list):
continue
chapter_title = chapter.get('title')
chapter_id = str_or_none(chapter.get('id'))
for material in materials:
if not isinstance(material, dict):
continue
if material.get('material_type') != 'video':
continue
video_url = urljoin(url, material.get('url'))
if not video_url:
continue
entries.append({
'_type': 'url_transparent',
'url': video_url,
'title': str_or_none(material.get('name')),
'id': str_or_none(material.get('id')),
'ie_key': PlatziIE.ie_key(),
'chapter': chapter_title,
'chapter_number': chapter_num,
'chapter_id': chapter_id,
})
course_id = compat_str(try_get(props, lambda x: x['course']['id']))
course_title = try_get(props, lambda x: x['course']['name'], compat_str)
return self.playlist_result(entries, course_id, course_title)

View File

@@ -14,7 +14,6 @@ from ..compat import (
)
from .openload import PhantomJSwrapper
from ..utils import (
determine_ext,
ExtractorError,
int_or_none,
orderedSet,
@@ -276,10 +275,6 @@ class PornHubIE(PornHubBaseIE):
r'/(\d{6}/\d{2})/', video_url, 'upload data', default=None)
if upload_date:
upload_date = upload_date.replace('/', '')
if determine_ext(video_url) == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
continue
tbr = None
mobj = re.search(r'(?P<height>\d+)[pP]?_(?P<tbr>\d+)[kK]', video_url)
if mobj:

View File

@@ -0,0 +1,62 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
ExtractorError,
sanitized_Request,
urlencode_postdata,
)
class PrimeShareTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?primeshare\.tv/download/(?P<id>[\da-zA-Z]+)'
_TEST = {
'url': 'http://primeshare.tv/download/238790B611',
'md5': 'b92d9bf5461137c36228009f31533fbc',
'info_dict': {
'id': '238790B611',
'ext': 'mp4',
'title': 'Public Domain - 1960s Commercial - Crest Toothpaste-YKsuFona',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
if '>File not exist<' in webpage:
raise ExtractorError('Video %s does not exist' % video_id, expected=True)
fields = self._hidden_inputs(webpage)
headers = {
'Referer': url,
'Content-Type': 'application/x-www-form-urlencoded',
}
wait_time = int(self._search_regex(
r'var\s+cWaitTime\s*=\s*(\d+)',
webpage, 'wait time', default=7)) + 1
self._sleep(wait_time, video_id)
req = sanitized_Request(
url, urlencode_postdata(fields), headers)
video_page = self._download_webpage(
req, video_id, 'Downloading video page')
video_url = self._search_regex(
r"url\s*:\s*'([^']+\.primeshare\.tv(?::443)?/file/[^']+)'",
video_page, 'video url')
title = self._html_search_regex(
r'<h1>Watch\s*(?:&nbsp;)?\s*\((.+?)(?:\s*\[\.\.\.\])?\)\s*(?:&nbsp;)?\s*<strong>',
video_page, 'title')
return {
'id': video_id,
'url': video_url,
'title': title,
'ext': 'mp4',
}

View File

@@ -147,7 +147,7 @@ class RadioCanadaIE(InfoExtractor):
class RadioCanadaAudioVideoIE(InfoExtractor):
IE_NAME = 'radiocanada:audiovideo'
'radiocanada:audiovideo'
_VALID_URL = r'https?://ici\.radio-canada\.ca/([^/]+/)*media-(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://ici.radio-canada.ca/audio-video/media-7527184/barack-obama-au-vietnam',

View File

@@ -7,7 +7,6 @@ from ..utils import (
ExtractorError,
int_or_none,
float_or_none,
url_or_none,
)
@@ -120,7 +119,7 @@ class RedditRIE(InfoExtractor):
'_type': 'url_transparent',
'url': video_url,
'title': data.get('title'),
'thumbnail': url_or_none(data.get('thumbnail')),
'thumbnail': data.get('thumbnail'),
'timestamp': float_or_none(data.get('created_utc')),
'uploader': data.get('author'),
'like_count': int_or_none(data.get('ups')),

View File

@@ -21,7 +21,7 @@ from ..utils import (
class RTL2IE(InfoExtractor):
IE_NAME = 'rtl2'
_VALID_URL = r'https?://(?:www\.)?rtl2\.de/sendung/[^/]+/(?:video/(?P<vico_id>\d+)[^/]+/(?P<vivi_id>\d+)-|folge/)(?P<id>[^/?#]+)'
_VALID_URL = r'http?://(?:www\.)?rtl2\.de/[^?#]*?/(?P<id>[^?#/]*?)(?:$|/(?:$|[?#]))'
_TESTS = [{
'url': 'http://www.rtl2.de/sendung/grip-das-motormagazin/folge/folge-203-0',
'info_dict': {
@@ -34,11 +34,10 @@ class RTL2IE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
}, {
'url': 'http://www.rtl2.de/sendung/koeln-50667/video/5512-anna/21040-anna-erwischt-alex/',
'info_dict': {
'id': 'anna-erwischt-alex',
'id': '21040-anna-erwischt-alex',
'ext': 'mp4',
'title': 'Anna erwischt Alex!',
'description': 'Anna nimmt ihrem Vater nicht ab, dass er nicht spielt. Und tatsächlich erwischt sie ihn auf frischer Tat.'
@@ -47,29 +46,31 @@ class RTL2IE(InfoExtractor):
# rtmp download
'skip_download': True,
},
'expected_warnings': ['Unable to download f4m manifest', 'Failed to download m3u8 information'],
}]
def _real_extract(self, url):
vico_id, vivi_id, display_id = re.match(self._VALID_URL, url).groups()
if not vico_id:
webpage = self._download_webpage(url, display_id)
# Some rtl2 urls have no slash at the end, so append it.
if not url.endswith('/'):
url += '/'
mobj = re.search(
r'data-collection="(?P<vico_id>\d+)"[^>]+data-video="(?P<vivi_id>\d+)"',
webpage)
if mobj:
vico_id = mobj.group('vico_id')
vivi_id = mobj.group('vivi_id')
else:
vico_id = self._html_search_regex(
r'vico_id\s*:\s*([0-9]+)', webpage, 'vico_id')
vivi_id = self._html_search_regex(
r'vivi_id\s*:\s*([0-9]+)', webpage, 'vivi_id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
mobj = re.search(
r'<div[^>]+data-collection="(?P<vico_id>\d+)"[^>]+data-video="(?P<vivi_id>\d+)"',
webpage)
if mobj:
vico_id = mobj.group('vico_id')
vivi_id = mobj.group('vivi_id')
else:
vico_id = self._html_search_regex(
r'vico_id\s*:\s*([0-9]+)', webpage, 'vico_id')
vivi_id = self._html_search_regex(
r'vivi_id\s*:\s*([0-9]+)', webpage, 'vivi_id')
info = self._download_json(
'https://service.rtl2.de/api-player-vipo/video.php',
display_id, query={
'http://www.rtl2.de/sites/default/modules/rtl2/mediathek/php/get_video_jw.php',
video_id, query={
'vico_id': vico_id,
'vivi_id': vivi_id,
})
@@ -88,7 +89,7 @@ class RTL2IE(InfoExtractor):
'format_id': 'rtmp',
'url': rtmp_url,
'play_path': stream_url,
'player_url': 'https://www.rtl2.de/sites/default/modules/rtl2/jwplayer/jwplayer-7.6.0/jwplayer.flash.swf',
'player_url': 'http://www.rtl2.de/flashplayer/vipo_player.swf',
'page_url': url,
'flash_version': 'LNX 11,2,202,429',
'rtmp_conn': rtmp_conn,
@@ -98,12 +99,12 @@ class RTL2IE(InfoExtractor):
m3u8_url = video_info.get('streamurl_hls')
if m3u8_url:
formats.extend(self._extract_akamai_formats(m3u8_url, display_id))
formats.extend(self._extract_akamai_formats(m3u8_url, video_id))
self._sort_formats(formats)
return {
'id': display_id,
'id': video_id,
'title': title,
'thumbnail': video_info.get('image'),
'description': video_info.get('beschreibung'),

View File

@@ -0,0 +1,44 @@
from __future__ import unicode_literals
from .nuevo import NuevoBaseIE
class RulePornIE(NuevoBaseIE):
_VALID_URL = r'https?://(?:www\.)?ruleporn\.com/(?:[^/?#&]+/)*(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://ruleporn.com/brunette-nympho-chick-takes-her-boyfriend-in-every-angle/',
'md5': '86861ebc624a1097c7c10eaf06d7d505',
'info_dict': {
'id': '48212',
'display_id': 'brunette-nympho-chick-takes-her-boyfriend-in-every-angle',
'ext': 'mp4',
'title': 'Brunette Nympho Chick Takes Her Boyfriend In Every Angle',
'description': 'md5:6d28be231b981fff1981deaaa03a04d5',
'age_limit': 18,
'duration': 635.1,
}
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
video_id = self._search_regex(
r'lovehomeporn\.com/embed/(\d+)', webpage, 'video id')
title = self._search_regex(
r'<h2[^>]+title=(["\'])(?P<url>.+?)\1',
webpage, 'title', group='url')
description = self._html_search_meta('description', webpage)
info = self._extract_nuevo(
'http://lovehomeporn.com/media/nuevo/econfig.php?key=%s&rp=true' % video_id,
video_id)
info.update({
'display_id': display_id,
'title': title,
'description': description,
'age_limit': 18
})
return info

View File

@@ -59,20 +59,6 @@ class RuutuIE(InfoExtractor):
'url': 'http://www.ruutu.fi/video/3193728',
'only_matching': True,
},
{
# audio podcast
'url': 'https://www.supla.fi/supla/3382410',
'md5': 'b9d7155fed37b2ebf6021d74c4b8e908',
'info_dict': {
'id': '3382410',
'ext': 'mp3',
'title': 'Mikä ihmeen poltergeist?',
'description': 'md5:bbb6963df17dfd0ecd9eb9a61bf14b52',
'thumbnail': r're:^https?://.*\.jpg$',
'age_limit': 0,
},
'expected_warnings': ['HTTP Error 502: Bad Gateway'],
}
]
def _real_extract(self, url):
@@ -108,12 +94,6 @@ class RuutuIE(InfoExtractor):
continue
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
elif ext == 'mp3' or child.tag == 'AudioMediaFile':
formats.append({
'format_id': 'audio',
'url': video_url,
'vcodec': 'none',
})
else:
proto = compat_urllib_parse_urlparse(video_url).scheme
if not child.tag.startswith('HTTP') and proto != 'rtmp':

View File

@@ -65,7 +65,7 @@ class SixPlayIE(InfoExtractor):
for asset in assets:
asset_url = asset.get('full_physical_path')
protocol = asset.get('protocol')
if not asset_url or ((protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264') and not ('_drmnp.ism/' in asset_url or '_unpnp.ism/' in asset_url)) or asset_url in urls:
if not asset_url or protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264' or asset_url in urls:
continue
urls.append(asset_url)
container = asset.get('video_container')
@@ -82,7 +82,6 @@ class SixPlayIE(InfoExtractor):
if not urlh:
continue
asset_url = urlh.geturl()
asset_url = asset_url.replace('_drmnp.ism/', '_unpnp.ism/')
for i in range(3, 0, -1):
asset_url = asset_url = asset_url.replace('_sd1/', '_sd%d/' % i)
m3u8_formats = self._extract_m3u8_formats(

View File

@@ -15,12 +15,7 @@ from ..compat import (
)
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
KNOWN_EXTENSIONS,
merge_dicts,
mimetype2ext,
str_or_none,
try_get,
unified_timestamp,
update_url_query,
@@ -62,7 +57,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'E.T. ExTerrestrial Music',
'timestamp': 1349920598,
'upload_date': '20121011',
'duration': 143.216,
'duration': 143,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -105,7 +100,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'jaimeMF',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'duration': 9,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -125,7 +120,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'jaimeMF',
'timestamp': 1386604920,
'upload_date': '20131209',
'duration': 9.927,
'duration': 9,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -145,7 +140,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'oddsamples',
'timestamp': 1389232924,
'upload_date': '20140109',
'duration': 17.346,
'duration': 17,
'license': 'cc-by-sa',
'view_count': int,
'like_count': int,
@@ -165,7 +160,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'Ori Uplift Music',
'timestamp': 1504206263,
'upload_date': '20170831',
'duration': 7449.096,
'duration': 7449,
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
@@ -185,7 +180,7 @@ class SoundcloudIE(InfoExtractor):
'uploader': 'garyvee',
'timestamp': 1488152409,
'upload_date': '20170226',
'duration': 207.012,
'duration': 207,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
@@ -197,31 +192,9 @@ class SoundcloudIE(InfoExtractor):
'skip_download': True,
},
},
# not avaialble via api.soundcloud.com/i1/tracks/id/streams
{
'url': 'https://soundcloud.com/giovannisarani/mezzo-valzer',
'md5': 'e22aecd2bc88e0e4e432d7dcc0a1abf7',
'info_dict': {
'id': '583011102',
'ext': 'mp3',
'title': 'Mezzo Valzer',
'description': 'md5:4138d582f81866a530317bae316e8b61',
'uploader': 'Giovanni Sarani',
'timestamp': 1551394171,
'upload_date': '20190228',
'duration': 180.157,
'thumbnail': r're:https?://.*\.jpg',
'license': 'all-rights-reserved',
'view_count': int,
'like_count': int,
'comment_count': int,
'repost_count': int,
},
'expected_warnings': ['Unable to download JSON metadata'],
}
]
_CLIENT_ID = 'FweeGBOOEOYJWLJN3oEyToGLKhmSz0I7'
_CLIENT_ID = 'NmW1FlPaiL94ueEu7oziOWjYEzZzQDcK'
@staticmethod
def _extract_urls(webpage):
@@ -229,6 +202,10 @@ class SoundcloudIE(InfoExtractor):
r'<iframe[^>]+src=(["\'])(?P<url>(?:https?://)?(?:w\.)?soundcloud\.com/player.+?)\1',
webpage)]
def report_resolve(self, video_id):
"""Report information extraction."""
self.to_screen('%s: Resolving id' % video_id)
@classmethod
def _resolv_url(cls, url):
return 'https://api.soundcloud.com/resolve.json?url=' + url + '&client_id=' + cls._CLIENT_ID
@@ -247,10 +224,6 @@ class SoundcloudIE(InfoExtractor):
def extract_count(key):
return int_or_none(info.get('%s_count' % key))
like_count = extract_count('favoritings')
if like_count is None:
like_count = extract_count('likes')
result = {
'id': track_id,
'uploader': username,
@@ -258,17 +231,15 @@ class SoundcloudIE(InfoExtractor):
'title': title,
'description': info.get('description'),
'thumbnail': thumbnail,
'duration': float_or_none(info.get('duration'), 1000),
'duration': int_or_none(info.get('duration'), 1000),
'webpage_url': info.get('permalink_url'),
'license': info.get('license'),
'view_count': extract_count('playback'),
'like_count': like_count,
'like_count': extract_count('favoritings'),
'comment_count': extract_count('comment'),
'repost_count': extract_count('reposts'),
'genre': info.get('genre'),
}
format_urls = set()
formats = []
query = {'client_id': self._CLIENT_ID}
if secret_token is not None:
@@ -277,7 +248,6 @@ class SoundcloudIE(InfoExtractor):
# We can build a direct link to the song
format_url = update_url_query(
'https://api.soundcloud.com/tracks/%s/download' % track_id, query)
format_urls.add(format_url)
formats.append({
'format_id': 'download',
'ext': info.get('original_format', 'mp3'),
@@ -286,91 +256,44 @@ class SoundcloudIE(InfoExtractor):
'preference': 10,
})
# Old API, does not work for some tracks (e.g.
# https://soundcloud.com/giovannisarani/mezzo-valzer)
# We have to retrieve the url
format_dict = self._download_json(
'https://api.soundcloud.com/i1/tracks/%s/streams' % track_id,
track_id, 'Downloading track url', query=query, fatal=False)
track_id, 'Downloading track url', query=query)
if format_dict:
for key, stream_url in format_dict.items():
if stream_url in format_urls:
continue
format_urls.add(stream_url)
ext, abr = 'mp3', None
mobj = re.search(r'_([^_]+)_(\d+)_url', key)
if mobj:
ext, abr = mobj.groups()
abr = int(abr)
if key.startswith('http'):
stream_formats = [{
'format_id': key,
'ext': ext,
'url': stream_url,
}]
elif key.startswith('rtmp'):
# The url doesn't have an rtmp app, we have to extract the playpath
url, path = stream_url.split('mp3:', 1)
stream_formats = [{
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': 'flv',
}]
elif key.startswith('hls'):
stream_formats = self._extract_m3u8_formats(
stream_url, track_id, ext, entry_protocol='m3u8_native',
m3u8_id=key, fatal=False)
else:
continue
for key, stream_url in format_dict.items():
ext, abr = 'mp3', None
mobj = re.search(r'_([^_]+)_(\d+)_url', key)
if mobj:
ext, abr = mobj.groups()
abr = int(abr)
if key.startswith('http'):
stream_formats = [{
'format_id': key,
'ext': ext,
'url': stream_url,
}]
elif key.startswith('rtmp'):
# The url doesn't have an rtmp app, we have to extract the playpath
url, path = stream_url.split('mp3:', 1)
stream_formats = [{
'format_id': key,
'url': url,
'play_path': 'mp3:' + path,
'ext': 'flv',
}]
elif key.startswith('hls'):
stream_formats = self._extract_m3u8_formats(
stream_url, track_id, ext, entry_protocol='m3u8_native',
m3u8_id=key, fatal=False)
else:
continue
if abr:
for f in stream_formats:
f['abr'] = abr
if abr:
for f in stream_formats:
f['abr'] = abr
formats.extend(stream_formats)
# New API
transcodings = try_get(
info, lambda x: x['media']['transcodings'], list) or []
for t in transcodings:
if not isinstance(t, dict):
continue
format_url = url_or_none(t.get('url'))
if not format_url:
continue
stream = self._download_json(
update_url_query(format_url, query), track_id, fatal=False)
if not isinstance(stream, dict):
continue
stream_url = url_or_none(stream.get('url'))
if not stream_url:
continue
if stream_url in format_urls:
continue
format_urls.add(stream_url)
protocol = try_get(t, lambda x: x['format']['protocol'], compat_str)
if protocol != 'hls' and '/hls' in format_url:
protocol = 'hls'
ext = None
preset = str_or_none(t.get('preset'))
if preset:
ext = preset.split('_')[0]
if ext not in KNOWN_EXTENSIONS:
mimetype = try_get(
t, lambda x: x['format']['mime_type'], compat_str)
ext = mimetype2ext(mimetype) or 'mp3'
format_id_list = []
if protocol:
format_id_list.append(protocol)
format_id_list.append(ext)
format_id = '_'.join(format_id_list)
formats.append({
'url': stream_url,
'format_id': format_id,
'ext': ext,
'protocol': 'm3u8_native' if protocol == 'hls' else 'http',
})
formats.extend(stream_formats)
if not formats:
# We fallback to the stream_url in the original info, this
@@ -380,11 +303,11 @@ class SoundcloudIE(InfoExtractor):
'url': update_url_query(info['stream_url'], query),
'ext': 'mp3',
})
self._check_formats(formats, track_id)
for f in formats:
f['vcodec'] = 'none'
self._check_formats(formats, track_id)
self._sort_formats(formats)
result['formats'] = formats
@@ -396,7 +319,6 @@ class SoundcloudIE(InfoExtractor):
raise ExtractorError('Invalid URL: %s' % url)
track_id = mobj.group('track_id')
new_info = {}
if track_id is not None:
info_json_url = 'https://api.soundcloud.com/tracks/' + track_id + '.json?client_id=' + self._CLIENT_ID
@@ -422,31 +344,13 @@ class SoundcloudIE(InfoExtractor):
if token:
resolve_title += '/%s' % token
webpage = self._download_webpage(url, full_title, fatal=False)
if webpage:
entries = self._parse_json(
self._search_regex(
r'var\s+c\s*=\s*(\[.+?\])\s*,\s*o\s*=Date\b', webpage,
'data', default='[]'), full_title, fatal=False)
if entries:
for e in entries:
if not isinstance(e, dict):
continue
if e.get('id') != 67:
continue
data = try_get(e, lambda x: x['data'][0], dict)
if data:
new_info = data
break
info_json_url = self._resolv_url(
'https://soundcloud.com/%s' % resolve_title)
self.report_resolve(full_title)
# Contains some additional info missing from new_info
info = self._download_json(
info_json_url, full_title, 'Downloading info JSON')
url = 'https://soundcloud.com/%s' % resolve_title
info_json_url = self._resolv_url(url)
info = self._download_json(info_json_url, full_title, 'Downloading info JSON')
return self._extract_info_dict(
merge_dicts(info, new_info), full_title, secret_token=token)
return self._extract_info_dict(info, full_title, secret_token=token)
class SoundcloudPlaylistBaseIE(SoundcloudIE):
@@ -492,6 +396,8 @@ class SoundcloudSetIE(SoundcloudPlaylistBaseIE):
full_title += '/' + token
url += '/' + token
self.report_resolve(full_title)
resolv_url = self._resolv_url(url)
info = self._download_json(resolv_url, full_title)

View File

@@ -14,7 +14,7 @@ from ..utils import (
class StreamangoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:streamango\.com|fruithosts\.net|streamcherry\.com)/(?:f|embed)/(?P<id>[^/?#&]+)'
_VALID_URL = r'https?://(?:www\.)?(?:streamango\.com|fruithosts\.net)/(?:f|embed)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://streamango.com/f/clapasobsptpkdfe/20170315_150006_mp4',
'md5': 'e992787515a182f55e38fc97588d802a',
@@ -41,9 +41,6 @@ class StreamangoIE(InfoExtractor):
}, {
'url': 'https://fruithosts.net/f/mreodparcdcmspsm/w1f1_r4lph_2018_brrs_720p_latino_mp4',
'only_matching': True,
}, {
'url': 'https://streamcherry.com/f/clapasobsptpkdfe/',
'only_matching': True,
}]
def _real_extract(self, url):

View File

@@ -1,94 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_parse_qs,
compat_urllib_parse_urlparse
)
from ..utils import (
extract_attributes,
float_or_none,
int_or_none,
str_or_none,
)
class STVPlayerIE(InfoExtractor):
IE_NAME = 'stv:player'
_VALID_URL = r'https?://player\.stv\.tv/(?P<type>episode|video)/(?P<id>[a-z0-9]{4})'
_TEST = {
'url': 'https://player.stv.tv/video/7srz/victoria/interview-with-the-cast-ahead-of-new-victoria/',
'md5': '2ad867d4afd641fa14187596e0fbc91b',
'info_dict': {
'id': '6016487034001',
'ext': 'mp4',
'upload_date': '20190321',
'title': 'Interview with the cast ahead of new Victoria',
'description': 'Nell Hudson and Lily Travers tell us what to expect in the new season of Victoria.',
'timestamp': 1553179628,
'uploader_id': '1486976045',
},
'skip': 'this resource is unavailable outside of the UK',
}
_PUBLISHER_ID = '1486976045'
_PTYPE_MAP = {
'episode': 'episodes',
'video': 'shortform',
}
def _real_extract(self, url):
ptype, video_id = re.match(self._VALID_URL, url).groups()
webpage = self._download_webpage(url, video_id)
qs = compat_parse_qs(compat_urllib_parse_urlparse(self._search_regex(
r'itemprop="embedURL"[^>]+href="([^"]+)',
webpage, 'embed URL', default=None)).query)
publisher_id = qs.get('publisherID', [None])[0] or self._PUBLISHER_ID
player_attr = extract_attributes(self._search_regex(
r'(<[^>]+class="bcplayer"[^>]+>)', webpage, 'player', default=None)) or {}
info = {}
duration = ref_id = series = video_id = None
api_ref_id = player_attr.get('data-player-api-refid')
if api_ref_id:
resp = self._download_json(
'https://player.api.stv.tv/v1/%s/%s' % (self._PTYPE_MAP[ptype], api_ref_id),
api_ref_id, fatal=False)
if resp:
result = resp.get('results') or {}
video = result.get('video') or {}
video_id = str_or_none(video.get('id'))
ref_id = video.get('guid')
duration = video.get('length')
programme = result.get('programme') or {}
series = programme.get('name') or programme.get('shortName')
subtitles = {}
_subtitles = result.get('_subtitles') or {}
for ext, sub_url in _subtitles.items():
subtitles.setdefault('en', []).append({
'ext': 'vtt' if ext == 'webvtt' else ext,
'url': sub_url,
})
info.update({
'description': result.get('summary'),
'subtitles': subtitles,
'view_count': int_or_none(result.get('views')),
})
if not video_id:
video_id = qs.get('videoId', [None])[0] or self._search_regex(
r'<link\s+itemprop="url"\s+href="(\d+)"',
webpage, 'video id', default=None) or 'ref:' + (ref_id or player_attr['data-refid'])
info.update({
'_type': 'url_transparent',
'duration': float_or_none(duration or player_attr.get('data-duration'), 1000),
'id': video_id,
'ie_key': 'BrightcoveNew',
'series': series or player_attr.get('data-programme-name'),
'url': 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id),
})
return info

View File

@@ -1,115 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
determine_ext,
int_or_none,
str_or_none,
)
class SverigesRadioBaseIE(InfoExtractor):
_BASE_URL = 'https://sverigesradio.se/sida/playerajax/'
_QUALITIES = ['low', 'medium', 'high']
_EXT_TO_CODEC_MAP = {
'mp3': 'mp3',
'm4a': 'aac',
}
_CODING_FORMAT_TO_ABR_MAP = {
5: 128,
11: 192,
12: 32,
13: 96,
}
def _real_extract(self, url):
audio_id = self._match_id(url)
query = {
'id': audio_id,
'type': self._AUDIO_TYPE,
}
item = self._download_json(
self._BASE_URL + 'audiometadata', audio_id,
'Downloading audio JSON metadata', query=query)['items'][0]
title = item['subtitle']
query['format'] = 'iis'
urls = []
formats = []
for quality in self._QUALITIES:
query['quality'] = quality
audio_url_data = self._download_json(
self._BASE_URL + 'getaudiourl', audio_id,
'Downloading %s format JSON metadata' % quality,
fatal=False, query=query) or {}
audio_url = audio_url_data.get('audioUrl')
if not audio_url or audio_url in urls:
continue
urls.append(audio_url)
ext = determine_ext(audio_url)
coding_format = audio_url_data.get('codingFormat')
abr = int_or_none(self._search_regex(
r'_a(\d+)\.m4a', audio_url, 'audio bitrate',
default=None)) or self._CODING_FORMAT_TO_ABR_MAP.get(coding_format)
formats.append({
'abr': abr,
'acodec': self._EXT_TO_CODEC_MAP.get(ext),
'ext': ext,
'format_id': str_or_none(coding_format),
'vcodec': 'none',
'url': audio_url,
})
self._sort_formats(formats)
return {
'id': audio_id,
'title': title,
'formats': formats,
'series': item.get('title'),
'duration': int_or_none(item.get('duration')),
'thumbnail': item.get('displayimageurl'),
'description': item.get('description'),
}
class SverigesRadioPublicationIE(SverigesRadioBaseIE):
IE_NAME = 'sverigesradio:publication'
_VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/sida/(?:artikel|gruppsida)\.aspx\?.*?\bartikel=(?P<id>[0-9]+)'
_TESTS = [{
'url': 'https://sverigesradio.se/sida/artikel.aspx?programid=83&artikel=7038546',
'md5': '6a4917e1923fccb080e5a206a5afa542',
'info_dict': {
'id': '7038546',
'ext': 'm4a',
'duration': 132,
'series': 'Nyheter (Ekot)',
'title': 'Esa Teittinen: Sanningen har inte kommit fram',
'description': 'md5:daf7ce66a8f0a53d5465a5984d3839df',
'thumbnail': r're:^https?://.*\.jpg',
},
}, {
'url': 'https://sverigesradio.se/sida/gruppsida.aspx?programid=3304&grupp=6247&artikel=7146887',
'only_matching': True,
}]
_AUDIO_TYPE = 'publication'
class SverigesRadioEpisodeIE(SverigesRadioBaseIE):
IE_NAME = 'sverigesradio:episode'
_VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/(?:sida/)?avsnitt/(?P<id>[0-9]+)'
_TEST = {
'url': 'https://sverigesradio.se/avsnitt/1140922?programid=1300',
'md5': '20dc4d8db24228f846be390b0c59a07c',
'info_dict': {
'id': '1140922',
'ext': 'mp3',
'duration': 3307,
'series': 'Konflikt',
'title': 'Metoo och valen',
'description': 'md5:fcb5c1f667f00badcc702b196f10a27e',
'thumbnail': r're:^https?://.*\.jpg',
}
}
_AUDIO_TYPE = 'episode'

View File

@@ -185,7 +185,7 @@ class SVTPlayIE(SVTPlayBaseIE):
def _extract_by_video_id(self, video_id, webpage=None):
data = self._download_json(
'https://api.svt.se/video/%s' % video_id,
'https://api.svt.se/videoplayer-api/video/%s' % video_id,
video_id, headers=self.geo_verification_headers())
info_dict = self._extract_video(data, video_id)
if not info_dict.get('title'):

View File

@@ -16,7 +16,7 @@ from ..utils import (
class TeamcocoIE(TurnerBaseIE):
_VALID_URL = r'https?://(?:\w+\.)?teamcoco\.com/(?P<id>([^/]+/)*[^/?#]+)'
_VALID_URL = r'https?://teamcoco\.com/(?P<id>([^/]+/)*[^/?#]+)'
_TESTS = [
{
'url': 'http://teamcoco.com/video/mary-kay-remote',
@@ -79,20 +79,15 @@ class TeamcocoIE(TurnerBaseIE):
}, {
'url': 'http://teamcoco.com/israel/conan-hits-the-streets-beaches-of-tel-aviv',
'only_matching': True,
}, {
'url': 'https://conan25.teamcoco.com/video/ice-cube-kevin-hart-conan-share-lyft',
'only_matching': True,
}
]
def _graphql_call(self, query_template, object_type, object_id):
find_object = 'find' + object_type
return self._download_json(
'https://teamcoco.com/graphql', object_id, data=json.dumps({
'http://teamcoco.com/graphql/', object_id, data=json.dumps({
'query': query_template % (find_object, object_id)
}).encode(), headers={
'Content-Type': 'application/json',
})['data'][find_object]
}))['data'][find_object]
def _real_extract(self, url):
display_id = self._match_id(url)
@@ -150,12 +145,7 @@ class TeamcocoIE(TurnerBaseIE):
'accessTokenType': 'jws',
}))
else:
d = self._download_json(
'https://teamcoco.com/_truman/d/' + video_id,
video_id, fatal=False) or {}
video_sources = d.get('meta') or {}
if not video_sources:
video_sources = self._graphql_call('''{
video_sources = self._graphql_call('''{
%s(id: "%s") {
src
}

View File

@@ -1,140 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
determine_ext,
ExtractorError,
float_or_none,
get_element_by_class,
get_element_by_id,
parse_duration,
remove_end,
urlencode_postdata,
urljoin,
)
class TeamTreeHouseIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?teamtreehouse\.com/library/(?P<id>[^/]+)'
_TESTS = [{
# Course
'url': 'https://teamtreehouse.com/library/introduction-to-user-authentication-in-php',
'info_dict': {
'id': 'introduction-to-user-authentication-in-php',
'title': 'Introduction to User Authentication in PHP',
'description': 'md5:405d7b4287a159b27ddf30ca72b5b053',
},
'playlist_mincount': 24,
}, {
# WorkShop
'url': 'https://teamtreehouse.com/library/deploying-a-react-app',
'info_dict': {
'id': 'deploying-a-react-app',
'title': 'Deploying a React App',
'description': 'md5:10a82e3ddff18c14ac13581c9b8e5921',
},
'playlist_mincount': 4,
}, {
# Video
'url': 'https://teamtreehouse.com/library/application-overview-2',
'info_dict': {
'id': 'application-overview-2',
'ext': 'mp4',
'title': 'Application Overview',
'description': 'md5:4b0a234385c27140a4378de5f1e15127',
},
'expected_warnings': ['This is just a preview'],
}]
_NETRC_MACHINE = 'teamtreehouse'
def _real_initialize(self):
email, password = self._get_login_info()
if email is None:
return
signin_page = self._download_webpage(
'https://teamtreehouse.com/signin',
None, 'Downloading signin page')
data = self._form_hidden_inputs('new_user_session', signin_page)
data.update({
'user_session[email]': email,
'user_session[password]': password,
})
error_message = get_element_by_class('error-message', self._download_webpage(
'https://teamtreehouse.com/person_session',
None, 'Logging in', data=urlencode_postdata(data)))
if error_message:
raise ExtractorError(clean_html(error_message), expected=True)
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
title = self._html_search_meta(['og:title', 'twitter:title'], webpage)
description = self._html_search_meta(
['description', 'og:description', 'twitter:description'], webpage)
entries = self._parse_html5_media_entries(url, webpage, display_id)
if entries:
info = entries[0]
for subtitles in info.get('subtitles', {}).values():
for subtitle in subtitles:
subtitle['ext'] = determine_ext(subtitle['url'], 'srt')
is_preview = 'data-preview="true"' in webpage
if is_preview:
self.report_warning(
'This is just a preview. You need to be signed in with a Basic account to download the entire video.', display_id)
duration = 30
else:
duration = float_or_none(self._search_regex(
r'data-duration="(\d+)"', webpage, 'duration'), 1000)
if not duration:
duration = parse_duration(get_element_by_id(
'video-duration', webpage))
info.update({
'id': display_id,
'title': title,
'description': description,
'duration': duration,
})
return info
else:
def extract_urls(html, extract_info=None):
for path in re.findall(r'<a[^>]+href="([^"]+)"', html):
page_url = urljoin(url, path)
entry = {
'_type': 'url_transparent',
'id': self._match_id(page_url),
'url': page_url,
'id_key': self.ie_key(),
}
if extract_info:
entry.update(extract_info)
entries.append(entry)
workshop_videos = self._search_regex(
r'(?s)<ul[^>]+id="workshop-videos"[^>]*>(.+?)</ul>',
webpage, 'workshop videos', default=None)
if workshop_videos:
extract_urls(workshop_videos)
else:
stages_path = self._search_regex(
r'(?s)<div[^>]+id="syllabus-stages"[^>]+data-url="([^"]+)"',
webpage, 'stages path')
if stages_path:
stages_page = self._download_webpage(
urljoin(url, stages_path), display_id, 'Downloading stages page')
for chapter_number, (chapter, steps_list) in enumerate(re.findall(r'(?s)<h2[^>]*>\s*(.+?)\s*</h2>.+?<ul[^>]*>(.+?)</ul>', stages_page), 1):
extract_urls(steps_list, {
'chapter': chapter,
'chapter_number': chapter_number,
})
title = remove_end(title, ' Course')
return self.playlist_result(
entries, display_id, title, description)

View File

@@ -65,15 +65,8 @@ class TikTokBaseIE(InfoExtractor):
class TikTokIE(TikTokBaseIE):
_VALID_URL = r'''(?x)
https?://
(?:
(?:m\.)?tiktok\.com/v|
(?:www\.)?tiktok\.com/share/video
)
/(?P<id>\d+)
'''
_TESTS = [{
_VALID_URL = r'https?://(?:m\.)?tiktok\.com/v/(?P<id>\d+)'
_TEST = {
'url': 'https://m.tiktok.com/v/6606727368545406213.html',
'md5': 'd584b572e92fcd48888051f238022420',
'info_dict': {
@@ -88,39 +81,25 @@ class TikTokIE(TikTokBaseIE):
'comment_count': int,
'repost_count': int,
}
}, {
'url': 'https://www.tiktok.com/share/video/6606727368545406213',
'only_matching': True,
}]
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(
'https://m.tiktok.com/v/%s.html' % video_id, video_id)
webpage = self._download_webpage(url, video_id)
data = self._parse_json(self._search_regex(
r'\bdata\s*=\s*({.+?})\s*;', webpage, 'data'), video_id)
return self._extract_aweme(data)
class TikTokUserIE(TikTokBaseIE):
_VALID_URL = r'''(?x)
https?://
(?:
(?:m\.)?tiktok\.com/h5/share/usr|
(?:www\.)?tiktok\.com/share/user
)
/(?P<id>\d+)
'''
_TESTS = [{
_VALID_URL = r'https?://(?:m\.)?tiktok\.com/h5/share/usr/(?P<id>\d+)'
_TEST = {
'url': 'https://m.tiktok.com/h5/share/usr/188294915489964032.html',
'info_dict': {
'id': '188294915489964032',
},
'playlist_mincount': 24,
}, {
'url': 'https://www.tiktok.com/share/user/188294915489964032',
'only_matching': True,
}]
}
def _real_extract(self, url):
user_id = self._match_id(url)

View File

@@ -66,12 +66,7 @@ class TouTvIE(RadioCanadaIE):
def _real_extract(self, url):
path = self._match_id(url)
metadata = self._download_json(
'https://services.radio-canada.ca/toutv/presentation/%s' % path, path, query={
'client_key': self._CLIENT_KEY,
'device': 'web',
'version': 4,
})
metadata = self._download_json('http://ici.tou.tv/presentation/%s' % path, path)
# IsDrm does not necessarily mean the video is DRM protected (see
# https://github.com/ytdl-org/youtube-dl/issues/13994).
if metadata.get('IsDrm'):
@@ -82,12 +77,6 @@ class TouTvIE(RadioCanadaIE):
return merge_dicts({
'id': video_id,
'title': details.get('OriginalTitle'),
'description': details.get('Description'),
'thumbnail': details.get('ImageUrl'),
'duration': int_or_none(details.get('LengthInSeconds')),
'series': metadata.get('ProgramTitle'),
'season_number': int_or_none(metadata.get('SeasonNumber')),
'season': metadata.get('SeasonTitle'),
'episode_number': int_or_none(metadata.get('EpisodeNumber')),
'episode': metadata.get('EpisodeTitle'),
}, self._extract_info(metadata.get('AppCode', 'toutv'), video_id))

View File

@@ -2,20 +2,19 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import urlencode_postdata
import re
class TwitCastingIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?twitcasting\.tv/(?P<uploader_id>[^/]+)/movie/(?P<id>\d+)'
_TESTS = [{
_TEST = {
'url': 'https://twitcasting.tv/ivetesangalo/movie/2357609',
'md5': '745243cad58c4681dc752490f7540d7f',
'info_dict': {
'id': '2357609',
'ext': 'mp4',
'title': 'Live #2357609',
'title': 'Recorded Live #2357609',
'uploader_id': 'ivetesangalo',
'description': "Moi! I'm live on TwitCasting from my iPhone.",
'thumbnail': r're:^https?://.*\.jpg$',
@@ -23,34 +22,14 @@ class TwitCastingIE(InfoExtractor):
'params': {
'skip_download': True,
},
}, {
'url': 'https://twitcasting.tv/mttbernardini/movie/3689740',
'info_dict': {
'id': '3689740',
'ext': 'mp4',
'title': 'Live playing something #3689740',
'uploader_id': 'mttbernardini',
'description': "I'm live on TwitCasting from my iPad. password: abc (Santa Marinella/Lazio, Italia)",
'thumbnail': r're:^https?://.*\.jpg$',
},
'params': {
'skip_download': True,
'videopassword': 'abc',
},
}]
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
uploader_id = mobj.group('uploader_id')
video_password = self._downloader.params.get('videopassword')
request_data = None
if video_password:
request_data = urlencode_postdata({
'password': video_password,
})
webpage = self._download_webpage(url, video_id, data=request_data)
webpage = self._download_webpage(url, video_id)
title = self._html_search_regex(
r'(?s)<[^>]+id=["\']movietitle[^>]+>(.+?)</',

View File

@@ -134,12 +134,12 @@ class TwitchBaseIE(InfoExtractor):
def _prefer_source(self, formats):
try:
source = next(f for f in formats if f['format_id'] == 'Source')
source['quality'] = 10
source['preference'] = 10
except StopIteration:
for f in formats:
if '/chunked/' in f['url']:
f.update({
'quality': 10,
'source_preference': 10,
'format_note': 'Source',
})
self._sort_formats(formats)

View File

@@ -76,10 +76,7 @@ class UdemyIE(InfoExtractor):
webpage, 'course', default='{}')),
video_id, fatal=False) or {}
course_id = course.get('id') or self._search_regex(
[
r'data-course-id=["\'](\d+)',
r'&quot;courseId&quot;\s*:\s*(\d+)'
], webpage, 'course id')
r'data-course-id=["\'](\d+)', webpage, 'course id')
return course_id, course.get('title')
def _enroll_course(self, base_url, webpage, course_id):

View File

@@ -1,10 +1,13 @@
from __future__ import unicode_literals
import re
import json
from .common import InfoExtractor
from ..utils import (
int_or_none,
parse_duration,
qualities,
ExtractorError,
sanitized_Request,
)
@@ -13,9 +16,9 @@ class VeohIE(InfoExtractor):
_TESTS = [{
'url': 'http://www.veoh.com/watch/v56314296nk7Zdmz3',
'md5': '9e7ecc0fd8bbee7a69fe38953aeebd30',
'md5': '620e68e6a3cff80086df3348426c9ca3',
'info_dict': {
'id': 'v56314296nk7Zdmz3',
'id': '56314296',
'ext': 'mp4',
'title': 'Straight Backs Are Stronger',
'uploader': 'LUMOback',
@@ -53,6 +56,29 @@ class VeohIE(InfoExtractor):
'only_matching': True,
}]
def _extract_formats(self, source):
formats = []
link = source.get('aowPermalink')
if link:
formats.append({
'url': link,
'ext': 'mp4',
'format_id': 'aow',
})
link = source.get('fullPreviewHashLowPath')
if link:
formats.append({
'url': link,
'format_id': 'low',
})
link = source.get('fullPreviewHashHighPath')
if link:
formats.append({
'url': link,
'format_id': 'high',
})
return formats
def _extract_video(self, source):
return {
'id': source.get('videoId'),
@@ -67,37 +93,38 @@ class VeohIE(InfoExtractor):
}
def _real_extract(self, url):
video_id = self._match_id(url)
video = self._download_json(
'https://www.veoh.com/watch/getVideo/' + video_id,
video_id)['video']
title = video['title']
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
thumbnail_url = None
q = qualities(['HQ', 'Regular'])
formats = []
for f_id, f_url in video.get('src', {}).items():
if not f_url:
continue
if f_id == 'poster':
thumbnail_url = f_url
else:
formats.append({
'format_id': f_id,
'quality': q(f_id),
'url': f_url,
})
self._sort_formats(formats)
if video_id.startswith('v'):
rsp = self._download_xml(
r'http://www.veoh.com/api/findByPermalink?permalink=%s' % video_id, video_id, 'Downloading video XML')
stat = rsp.get('stat')
if stat == 'ok':
return self._extract_video(rsp.find('./videoList/video'))
elif stat == 'fail':
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, rsp.find('./errorList/error').get('errorMessage')), expected=True)
return {
'id': video_id,
'title': title,
'description': video.get('description'),
'thumbnail': thumbnail_url,
'uploader': video.get('author', {}).get('nickname'),
'duration': int_or_none(video.get('lengthBySec')) or parse_duration(video.get('length')),
'view_count': int_or_none(video.get('views')),
'formats': formats,
'average_rating': int_or_none(video.get('rating')),
'comment_count': int_or_none(video.get('numOfComments')),
}
webpage = self._download_webpage(url, video_id)
age_limit = 0
if 'class="adultwarning-container"' in webpage:
self.report_age_confirmation()
age_limit = 18
request = sanitized_Request(url)
request.add_header('Cookie', 'confirmedAdult=true')
webpage = self._download_webpage(request, video_id)
m_youtube = re.search(r'http://www\.youtube\.com/v/(.*?)(\&|"|\?)', webpage)
if m_youtube is not None:
youtube_id = m_youtube.group(1)
self.to_screen('%s: detected Youtube video.' % video_id)
return self.url_result(youtube_id, 'Youtube')
info = json.loads(
self._search_regex(r'videoDetailsJSON = \'({.*?})\';', webpage, 'info').replace('\\\'', '\''))
video = self._extract_video(info)
video['age_limit'] = age_limit
return video

View File

@@ -1,16 +1,19 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_urllib_parse_urlencode,
compat_urlparse,
)
from ..utils import (
float_or_none,
int_or_none,
sanitized_Request,
)
class ViddlerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?viddler\.com/(?:v|embed|player)/(?P<id>[a-z0-9]+)(?:.+?\bsecret=(\d+))?'
_VALID_URL = r'https?://(?:www\.)?viddler\.com/(?:v|embed|player)/(?P<id>[a-z0-9]+)'
_TESTS = [{
'url': 'http://www.viddler.com/v/43903784',
'md5': '9eee21161d2c7f5b39690c3e325fab2f',
@@ -75,18 +78,23 @@ class ViddlerIE(InfoExtractor):
}]
def _real_extract(self, url):
video_id, secret = re.match(self._VALID_URL, url).groups()
video_id = self._match_id(url)
query = {
'video_id': video_id,
'key': 'v0vhrt7bg2xq1vyxhkct',
}
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
secret = qs.get('secret', [None])[0]
if secret:
query['secret'] = secret
data = self._download_json(
'http://api.viddler.com/api/v2/viddler.videos.getPlaybackDetails.json',
video_id, headers={'Referer': url}, query=query)['video']
headers = {'Referer': 'http://static.cdn-ec.viddler.com/js/arpeggio/v2/embed.html'}
request = sanitized_Request(
'http://api.viddler.com/api/v2/viddler.videos.getPlaybackDetails.json?%s'
% compat_urllib_parse_urlencode(query), None, headers)
data = self._download_json(request, video_id)['video']
formats = []
for filed in data['files']:

View File

@@ -0,0 +1,60 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
decode_packed_codes,
sanitized_Request,
)
class VideoMegaIE(InfoExtractor):
_VALID_URL = r'(?:videomega:|https?://(?:www\.)?videomega\.tv/(?:(?:view|iframe|cdn)\.php)?\?ref=)(?P<id>[A-Za-z0-9]+)'
_TESTS = [{
'url': 'http://videomega.tv/cdn.php?ref=AOSQBJYKIDDIKYJBQSOA',
'md5': 'cc1920a58add3f05c6a93285b84fb3aa',
'info_dict': {
'id': 'AOSQBJYKIDDIKYJBQSOA',
'ext': 'mp4',
'title': '1254207',
'thumbnail': r're:^https?://.*\.jpg$',
}
}, {
'url': 'http://videomega.tv/cdn.php?ref=AOSQBJYKIDDIKYJBQSOA&width=1070&height=600',
'only_matching': True,
}, {
'url': 'http://videomega.tv/view.php?ref=090051111052065112106089103052052103089106112065052111051090',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
iframe_url = 'http://videomega.tv/cdn.php?ref=%s' % video_id
req = sanitized_Request(iframe_url)
req.add_header('Referer', url)
req.add_header('Cookie', 'noadvtday=0')
webpage = self._download_webpage(req, video_id)
title = self._html_search_regex(
r'<title>(.+?)</title>', webpage, 'title')
title = re.sub(
r'(?:^[Vv]ideo[Mm]ega\.tv\s-\s*|\s*-\svideomega\.tv$)', '', title)
thumbnail = self._search_regex(
r'<video[^>]+?poster="([^"]+)"', webpage, 'thumbnail', fatal=False)
real_codes = decode_packed_codes(webpage)
video_url = self._search_regex(
r'"src"\s*,\s*"([^"]+)"', real_codes, 'video URL')
return {
'id': video_id,
'title': title,
'url': video_url,
'thumbnail': thumbnail,
'http_headers': {
'Referer': iframe_url,
},
}

View File

@@ -109,9 +109,23 @@ class VimeoBaseInfoExtractor(InfoExtractor):
def _parse_config(self, config, video_id):
video_data = config['video']
# Extract title
video_title = video_data['title']
live_event = video_data.get('live_event') or {}
is_live = live_event.get('status') == 'started'
# Extract uploader, uploader_url and uploader_id
video_uploader = video_data.get('owner', {}).get('name')
video_uploader_url = video_data.get('owner', {}).get('url')
video_uploader_id = video_uploader_url.split('/')[-1] if video_uploader_url else None
# Extract video thumbnail
video_thumbnail = video_data.get('thumbnail')
if video_thumbnail is None:
video_thumbs = video_data.get('thumbs')
if video_thumbs and isinstance(video_thumbs, dict):
_, video_thumbnail = sorted((int(width if width.isdigit() else 0), t_url) for (width, t_url) in video_thumbs.items())[-1]
# Extract video duration
video_duration = int_or_none(video_data.get('duration'))
formats = []
config_files = video_data.get('files') or config['request'].get('files', {})
@@ -128,7 +142,6 @@ class VimeoBaseInfoExtractor(InfoExtractor):
'tbr': int_or_none(f.get('bitrate')),
})
# TODO: fix handling of 308 status code returned for live archive manifest requests
for files_type in ('hls', 'dash'):
for cdn_name, cdn_data in config_files.get(files_type, {}).get('cdns', {}).items():
manifest_url = cdn_data.get('url')
@@ -138,7 +151,7 @@ class VimeoBaseInfoExtractor(InfoExtractor):
if files_type == 'hls':
formats.extend(self._extract_m3u8_formats(
manifest_url, video_id, 'mp4',
'm3u8' if is_live else 'm3u8_native', m3u8_id=format_id,
'm3u8_native', m3u8_id=format_id,
note='Downloading %s m3u8 information' % cdn_name,
fatal=False))
elif files_type == 'dash':
@@ -151,10 +164,6 @@ class VimeoBaseInfoExtractor(InfoExtractor):
else:
mpd_manifest_urls = [(format_id, manifest_url)]
for f_id, m_url in mpd_manifest_urls:
if 'json=1' in m_url:
real_m_url = (self._download_json(m_url, video_id, fatal=False) or {}).get('url')
if real_m_url:
m_url = real_m_url
mpd_formats = self._extract_mpd_formats(
m_url.replace('/master.json', '/master.mpd'), video_id, f_id,
'Downloading %s MPD information' % cdn_name,
@@ -166,15 +175,6 @@ class VimeoBaseInfoExtractor(InfoExtractor):
f['preference'] = -40
formats.extend(mpd_formats)
live_archive = live_event.get('archive') or {}
live_archive_source_url = live_archive.get('source_url')
if live_archive_source_url and live_archive.get('status') == 'done':
formats.append({
'format_id': 'live-archive-source',
'url': live_archive_source_url,
'preference': 1,
})
subtitles = {}
text_tracks = config['request'].get('text_tracks')
if text_tracks:
@@ -184,61 +184,17 @@ class VimeoBaseInfoExtractor(InfoExtractor):
'url': 'https://vimeo.com' + tt['url'],
}]
thumbnails = []
if not is_live:
for key, thumb in video_data.get('thumbs', {}).items():
thumbnails.append({
'id': key,
'width': int_or_none(key),
'url': thumb,
})
thumbnail = video_data.get('thumbnail')
if thumbnail:
thumbnails.append({
'url': thumbnail,
})
owner = video_data.get('owner') or {}
video_uploader_url = owner.get('url')
return {
'title': self._live_title(video_title) if is_live else video_title,
'uploader': owner.get('name'),
'uploader_id': video_uploader_url.split('/')[-1] if video_uploader_url else None,
'title': video_title,
'uploader': video_uploader,
'uploader_id': video_uploader_id,
'uploader_url': video_uploader_url,
'thumbnails': thumbnails,
'duration': int_or_none(video_data.get('duration')),
'thumbnail': video_thumbnail,
'duration': video_duration,
'formats': formats,
'subtitles': subtitles,
'is_live': is_live,
}
def _extract_original_format(self, url, video_id):
download_data = self._download_json(
url, video_id, fatal=False,
query={'action': 'load_download_config'},
headers={'X-Requested-With': 'XMLHttpRequest'})
if download_data:
source_file = download_data.get('source_file')
if isinstance(source_file, dict):
download_url = source_file.get('download_url')
if download_url and not source_file.get('is_cold') and not source_file.get('is_defrosting'):
source_name = source_file.get('public_name', 'Original')
if self._is_valid_url(download_url, video_id, '%s video' % source_name):
ext = (try_get(
source_file, lambda x: x['extension'],
compat_str) or determine_ext(
download_url, None) or 'mp4').lower()
return {
'url': download_url,
'ext': ext,
'width': int_or_none(source_file.get('width')),
'height': int_or_none(source_file.get('height')),
'filesize': parse_filesize(source_file.get('size')),
'format_id': source_name,
'preference': 1,
}
class VimeoIE(VimeoBaseInfoExtractor):
"""Information extractor for vimeo.com."""
@@ -703,11 +659,29 @@ class VimeoIE(VimeoBaseInfoExtractor):
comment_count = None
formats = []
source_format = self._extract_original_format(
'https://vimeo.com/' + video_id, video_id)
if source_format:
formats.append(source_format)
download_request = sanitized_Request('https://vimeo.com/%s?action=load_download_config' % video_id, headers={
'X-Requested-With': 'XMLHttpRequest'})
download_data = self._download_json(download_request, video_id, fatal=False)
if download_data:
source_file = download_data.get('source_file')
if isinstance(source_file, dict):
download_url = source_file.get('download_url')
if download_url and not source_file.get('is_cold') and not source_file.get('is_defrosting'):
source_name = source_file.get('public_name', 'Original')
if self._is_valid_url(download_url, video_id, '%s video' % source_name):
ext = (try_get(
source_file, lambda x: x['extension'],
compat_str) or determine_ext(
download_url, None) or 'mp4').lower()
formats.append({
'url': download_url,
'ext': ext,
'width': int_or_none(source_file.get('width')),
'height': int_or_none(source_file.get('height')),
'filesize': parse_filesize(source_file.get('size')),
'format_id': source_name,
'preference': 1,
})
info_dict_config = self._parse_config(config, video_id)
formats.extend(info_dict_config['formats'])
@@ -966,7 +940,7 @@ class VimeoGroupsIE(VimeoAlbumIE):
class VimeoReviewIE(VimeoBaseInfoExtractor):
IE_NAME = 'vimeo:review'
IE_DESC = 'Review pages on vimeo'
_VALID_URL = r'(?P<url>https://vimeo\.com/[^/]+/review/(?P<id>[^/]+)/[0-9a-f]{10})'
_VALID_URL = r'https://vimeo\.com/[^/]+/review/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://vimeo.com/user21297594/review/75524534/3c257a1b5d',
'md5': 'c507a72f780cacc12b2248bb4006d253',
@@ -1018,8 +992,7 @@ class VimeoReviewIE(VimeoBaseInfoExtractor):
data = self._parse_json(self._search_regex(
r'window\s*=\s*_extend\(window,\s*({.+?})\);', webpage, 'data',
default=NO_DEFAULT if video_password_verified else '{}'), video_id)
config = data.get('vimeo_esi', {}).get('config', {})
config_url = config.get('configUrl') or try_get(config, lambda x: x['clipData']['configUrl'])
config_url = data.get('vimeo_esi', {}).get('config', {}).get('configUrl')
if config_url is None:
self._verify_video_password(webpage_url, video_id, webpage)
config_url = self._get_config_url(
@@ -1027,13 +1000,10 @@ class VimeoReviewIE(VimeoBaseInfoExtractor):
return config_url
def _real_extract(self, url):
page_url, video_id = re.match(self._VALID_URL, url).groups()
video_id = self._match_id(url)
config_url = self._get_config_url(url, video_id)
config = self._download_json(config_url, video_id)
info_dict = self._parse_config(config, video_id)
source_format = self._extract_original_format(page_url, video_id)
if source_format:
info_dict['formats'].append(source_format)
self._vimeo_sort_formats(info_dict['formats'])
info_dict['id'] = video_id
return info_dict

View File

@@ -6,7 +6,10 @@ import re
import sys
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..compat import (
compat_str,
compat_urlparse,
)
from ..utils import (
clean_html,
ExtractorError,
@@ -100,7 +103,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/videos-77521?z=video-77521_162222515%2Fclub77521',
'md5': '7babad3b85ea2e91948005b1b8b0cb84',
'info_dict': {
'id': '-77521_162222515',
'id': '162222515',
'ext': 'mp4',
'title': 'ProtivoGunz - Хуёвая песня',
'uploader': 're:(?:Noize MC|Alexander Ilyashenko).*',
@@ -114,7 +117,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video205387401_165548505',
'md5': '6c0aeb2e90396ba97035b9cbde548700',
'info_dict': {
'id': '205387401_165548505',
'id': '165548505',
'ext': 'mp4',
'title': 'No name',
'uploader': 'Tom Cruise',
@@ -129,7 +132,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video_ext.php?oid=32194266&id=162925554&hash=7d8c2e0d5e05aeaa&hd=1',
'md5': 'c7ce8f1f87bec05b3de07fdeafe21a0a',
'info_dict': {
'id': '32194266_162925554',
'id': '162925554',
'ext': 'mp4',
'uploader': 'Vladimir Gavrin',
'title': 'Lin Dan',
@@ -146,7 +149,7 @@ class VKIE(VKBaseIE):
'md5': 'a590bcaf3d543576c9bd162812387666',
'note': 'Only available for registered users',
'info_dict': {
'id': '-8871596_164049491',
'id': '164049491',
'ext': 'mp4',
'uploader': 'Триллеры',
'title': '► Бойцовский клуб / Fight Club 1999 [HD 720]',
@@ -160,7 +163,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/hd_kino_mania?z=video-43215063_168067957%2F15c66b9b533119788d',
'md5': '4d7a5ef8cf114dfa09577e57b2993202',
'info_dict': {
'id': '-43215063_168067957',
'id': '168067957',
'ext': 'mp4',
'uploader': 'Киномания - лучшее из мира кино',
'title': ' ',
@@ -174,7 +177,7 @@ class VKIE(VKBaseIE):
'md5': '0c45586baa71b7cb1d0784ee3f4e00a6',
'note': 'ivi.ru embed',
'info_dict': {
'id': '-43215063_169084319',
'id': '60690',
'ext': 'mp4',
'title': 'Книга Илая',
'duration': 6771,
@@ -188,7 +191,7 @@ class VKIE(VKBaseIE):
'url': 'https://vk.com/video30481095_171201961?list=8764ae2d21f14088d4',
'md5': '091287af5402239a1051c37ec7b92913',
'info_dict': {
'id': '30481095_171201961',
'id': '171201961',
'ext': 'mp4',
'title': 'ТюменцевВВ_09.07.2015',
'uploader': 'Anton Ivanov',
@@ -203,10 +206,10 @@ class VKIE(VKBaseIE):
'url': 'https://vk.com/video276849682_170681728',
'info_dict': {
'id': 'V3K4mi0SYkc',
'ext': 'mp4',
'ext': 'webm',
'title': "DSWD Awards 'Children's Joy Foundation, Inc.' Certificate of Registration and License to Operate",
'description': 'md5:bf9c26cfa4acdfb146362682edd3827a',
'duration': 178,
'duration': 179,
'upload_date': '20130116',
'uploader': "Children's Joy Foundation Inc.",
'uploader_id': 'thecjf',
@@ -236,7 +239,7 @@ class VKIE(VKBaseIE):
'url': 'http://vk.com/video-110305615_171782105',
'md5': 'e13fcda136f99764872e739d13fac1d1',
'info_dict': {
'id': '-110305615_171782105',
'id': '171782105',
'ext': 'mp4',
'title': 'S-Dance, репетиции к The way show',
'uploader': 'THE WAY SHOW | 17 апреля',
@@ -251,17 +254,14 @@ class VKIE(VKBaseIE):
{
# finished live stream, postlive_mp4
'url': 'https://vk.com/videos-387766?z=video-387766_456242764%2Fpl_-387766_-2',
'md5': '90d22d051fccbbe9becfccc615be6791',
'info_dict': {
'id': '-387766_456242764',
'id': '456242764',
'ext': 'mp4',
'title': 'ИгроМир 2016 День 1 — Игромания Утром',
'title': 'ИгроМир 2016 — день 1',
'uploader': 'Игромания',
'duration': 5239,
# TODO: use act=show to extract view_count
# 'view_count': int,
'upload_date': '20160929',
'uploader_id': '-387766',
'timestamp': 1475137527,
'view_count': int,
},
},
{
@@ -465,7 +465,7 @@ class VKIE(VKBaseIE):
self._sort_formats(formats)
return {
'id': video_id,
'id': compat_str(data.get('vid') or video_id),
'formats': formats,
'title': title,
'thumbnail': data.get('jpg'),

View File

@@ -102,15 +102,6 @@ class VRVIE(VRVBaseIE):
# m3u8 download
'skip_download': True,
},
}, {
# movie listing
'url': 'https://vrv.co/watch/G6NQXZ1J6/Lily-CAT',
'info_dict': {
'id': 'G6NQXZ1J6',
'title': 'Lily C.A.T',
'description': 'md5:988b031e7809a6aeb60968be4af7db07',
},
'playlist_count': 2,
}]
_NETRC_MACHINE = 'vrv'
@@ -132,23 +123,23 @@ class VRVIE(VRVBaseIE):
def _extract_vrv_formats(self, url, video_id, stream_format, audio_lang, hardsub_lang):
if not url or stream_format not in ('hls', 'dash'):
return []
assert audio_lang or hardsub_lang
stream_id_list = []
if audio_lang:
stream_id_list.append('audio-%s' % audio_lang)
if hardsub_lang:
stream_id_list.append('hardsub-%s' % hardsub_lang)
format_id = stream_format
if stream_id_list:
format_id += '-' + '-'.join(stream_id_list)
stream_id = '-'.join(stream_id_list)
format_id = '%s-%s' % (stream_format, stream_id)
if stream_format == 'hls':
adaptive_formats = self._extract_m3u8_formats(
url, video_id, 'mp4', m3u8_id=format_id,
note='Downloading %s information' % format_id,
note='Downloading %s m3u8 information' % stream_id,
fatal=False)
elif stream_format == 'dash':
adaptive_formats = self._extract_mpd_formats(
url, video_id, mpd_id=format_id,
note='Downloading %s information' % format_id,
note='Downloading %s MPD information' % stream_id,
fatal=False)
if audio_lang:
for f in adaptive_formats:
@@ -159,28 +150,10 @@ class VRVIE(VRVBaseIE):
def _real_extract(self, url):
video_id = self._match_id(url)
object_data = self._call_cms(self._get_cms_resource(
'cms:/objects/' + video_id, video_id), video_id, 'object')['items'][0]
resource_path = object_data['__links__']['resource']['href']
video_data = self._call_cms(resource_path, video_id, 'video')
episode_path = self._get_cms_resource(
'cms:/episodes/' + video_id, video_id)
video_data = self._call_cms(episode_path, video_id, 'video')
title = video_data['title']
description = video_data.get('description')
if video_data.get('__class__') == 'movie_listing':
items = self._call_cms(
video_data['__links__']['movie_listing/movies']['href'],
video_id, 'movie listing').get('items') or []
if len(items) != 1:
entries = []
for item in items:
item_id = item.get('id')
if not item_id:
continue
entries.append(self.url_result(
'https://vrv.co/watch/' + item_id,
self.ie_key(), item_id, item.get('title')))
return self.playlist_result(entries, video_id, title, description)
video_data = items[0]
streams_path = video_data['__links__'].get('streams', {}).get('href')
if not streams_path:
@@ -224,7 +197,7 @@ class VRVIE(VRVBaseIE):
'formats': formats,
'subtitles': subtitles,
'thumbnails': thumbnails,
'description': description,
'description': video_data.get('description'),
'duration': float_or_none(video_data.get('duration_ms'), 1000),
'uploader_id': video_data.get('channel_id'),
'series': video_data.get('series_title'),

View File

@@ -19,7 +19,7 @@ from ..utils import (
class WeiboIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?weibo\.com/[0-9]+/(?P<id>[a-zA-Z0-9]+)'
_VALID_URL = r'https?://weibo\.com/[0-9]+/(?P<id>[a-zA-Z0-9]+)'
_TEST = {
'url': 'https://weibo.com/6275294458/Fp6RGfbff?type=comment',
'info_dict': {

View File

@@ -0,0 +1,158 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
ExtractorError,
int_or_none,
qualities,
remove_start,
)
class WrzutaIE(InfoExtractor):
IE_NAME = 'wrzuta.pl'
_VALID_URL = r'https?://(?P<uploader>[0-9a-zA-Z]+)\.wrzuta\.pl/(?P<typ>film|audio)/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://laboratoriumdextera.wrzuta.pl/film/aq4hIZWrkBu/nike_football_the_last_game',
'md5': '9e67e05bed7c03b82488d87233a9efe7',
'info_dict': {
'id': 'aq4hIZWrkBu',
'ext': 'mp4',
'title': 'Nike Football: The Last Game',
'duration': 307,
'uploader_id': 'laboratoriumdextera',
'description': 'md5:7fb5ef3c21c5893375fda51d9b15d9cd',
},
'skip': 'Redirected to wrzuta.pl',
}, {
'url': 'http://vexling.wrzuta.pl/audio/01xBFabGXu6/james_horner_-_into_the_na_39_vi_world_bonus',
'md5': 'f80564fb5a2ec6ec59705ae2bf2ba56d',
'info_dict': {
'id': '01xBFabGXu6',
'ext': 'mp3',
'title': 'James Horner - Into The Na\'vi World [Bonus]',
'description': 'md5:30a70718b2cd9df3120fce4445b0263b',
'duration': 95,
'uploader_id': 'vexling',
},
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
typ = mobj.group('typ')
uploader = mobj.group('uploader')
webpage, urlh = self._download_webpage_handle(url, video_id)
if urlh.geturl() == 'http://www.wrzuta.pl/':
raise ExtractorError('Video removed', expected=True)
quality = qualities(['SD', 'MQ', 'HQ', 'HD'])
audio_table = {'flv': 'mp3', 'webm': 'ogg', '???': 'mp3'}
embedpage = self._download_json('http://www.wrzuta.pl/npp/embed/%s/%s' % (uploader, video_id), video_id)
formats = []
for media in embedpage['url']:
fmt = media['type'].split('@')[0]
if typ == 'audio':
ext = audio_table.get(fmt, fmt)
else:
ext = fmt
formats.append({
'format_id': '%s_%s' % (ext, media['quality'].lower()),
'url': media['url'],
'ext': ext,
'quality': quality(media['quality']),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': self._og_search_title(webpage),
'thumbnail': self._og_search_thumbnail(webpage),
'formats': formats,
'duration': int_or_none(embedpage['duration']),
'uploader_id': uploader,
'description': self._og_search_description(webpage),
'age_limit': embedpage.get('minimalAge', 0),
}
class WrzutaPlaylistIE(InfoExtractor):
"""
this class covers extraction of wrzuta playlist entries
the extraction process bases on following steps:
* collect information of playlist size
* download all entries provided on
the playlist webpage (the playlist is split
on two pages: first directly reached from webpage
second: downloaded on demand by ajax call and rendered
using the ajax call response)
* in case size of extracted entries not reached total number of entries
use the ajax call to collect the remaining entries
"""
IE_NAME = 'wrzuta.pl:playlist'
_VALID_URL = r'https?://(?P<uploader>[0-9a-zA-Z]+)\.wrzuta\.pl/playlista/(?P<id>[0-9a-zA-Z]+)'
_TESTS = [{
'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR/moja_muza',
'playlist_mincount': 14,
'info_dict': {
'id': '7XfO4vE84iR',
'title': 'Moja muza',
},
}, {
'url': 'http://heroesf70.wrzuta.pl/playlista/6Nj3wQHx756/lipiec_-_lato_2015_muzyka_swiata',
'playlist_mincount': 144,
'info_dict': {
'id': '6Nj3wQHx756',
'title': 'Lipiec - Lato 2015 Muzyka Świata',
},
}, {
'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR',
'only_matching': True,
}]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
playlist_id = mobj.group('id')
uploader = mobj.group('uploader')
webpage = self._download_webpage(url, playlist_id)
playlist_size = int_or_none(self._html_search_regex(
(r'<div[^>]+class=["\']playlist-counter["\'][^>]*>\d+/(\d+)',
r'<div[^>]+class=["\']all-counter["\'][^>]*>(.+?)</div>'),
webpage, 'playlist size', default=None))
playlist_title = remove_start(
self._og_search_title(webpage), 'Playlista: ')
entries = []
if playlist_size:
entries = [
self.url_result(entry_url)
for _, entry_url in re.findall(
r'<a[^>]+href=(["\'])(http.+?)\1[^>]+class=["\']playlist-file-page',
webpage)]
if playlist_size > len(entries):
playlist_content = self._download_json(
'http://%s.wrzuta.pl/xhr/get_playlist_offset/%s' % (uploader, playlist_id),
playlist_id,
'Downloading playlist JSON',
'Unable to download playlist JSON')
entries.extend([
self.url_result(entry['filelink'])
for entry in playlist_content.get('files', []) if entry.get('filelink')])
return self.playlist_result(entries, playlist_id, playlist_title)

View File

@@ -20,7 +20,7 @@ from ..utils import (
class XHamsterIE(InfoExtractor):
_VALID_URL = r'''(?x)
https?://
(?:.+?\.)?xhamster\.(?:com|one)/
(?:.+?\.)?xhamster\.com/
(?:
movies/(?P<id>\d+)/(?P<display_id>[^/]*)\.html|
videos/(?P<display_id_2>[^/]*)-(?P<id_2>\d+)
@@ -91,9 +91,6 @@ class XHamsterIE(InfoExtractor):
# new URL schema
'url': 'https://pt.xhamster.com/videos/euro-pedal-pumping-7937821',
'only_matching': True,
}, {
'url': 'https://xhamster.one/videos/femaleagent-shy-beauty-takes-the-bait-1509445',
'only_matching': True,
}]
def _real_extract(self, url):

Some files were not shown because too many files have changed in this diff Show More