Compare commits

...

26 Commits

Author SHA1 Message Date
Sergey M․
73a85620ee release 2016.08.13 2016-08-13 23:17:11 +07:00
Sergey M․
a560f28c98 [ChangeLog] Actualize 2016-08-13 23:01:35 +07:00
Sergey M․
5ec5461e1a [pbs] Clarify comment on http formats 2016-08-13 22:50:18 +07:00
Sergey M․
542130a5d9 [pbs] Fix description extraction and update tests 2016-08-13 21:59:29 +07:00
Sergey M․
82997dad57 [franceculture] Fix extraction (Closes #10324) 2016-08-13 21:00:34 +07:00
Sergey M․
647a7bf5e8 [pornotube] Fix extraction (Closes #10322) 2016-08-13 20:49:16 +07:00
Sergey M․
77afa008dd [4tube] Fix metadata extraction (Closes #10321) 2016-08-13 19:55:09 +07:00
Yen Chi Hsuan
db535435b3 [bigflix] Remove an invalid test
There's no video anymore
2016-08-13 18:02:11 +08:00
Sergey M․
c2a453b461 [imgur] Fix width and height extraction (Closes #10325) 2016-08-13 16:46:07 +07:00
Sergey M․
cd29eaab95 [vbox7] Remove unused imports 2016-08-13 16:45:34 +07:00
Yen Chi Hsuan
52aa7e7476 [test_verbose_output] Fix tests under Python 3 2016-08-13 17:36:14 +08:00
Sergey M․
e97c55ee6a [expotv] Improve extraction and update test 2016-08-13 16:29:05 +07:00
Remita Amine
acfccacad5 [downloader/external:curl] Clarify why CurlFD should not capture stderr 2016-08-13 10:26:02 +01:00
Remita Amine
5f2c2b7936 [test_utils] add test for option with not str value 2016-08-13 09:54:12 +01:00
Sergey M․
cb55908e51 [vbox7] Fix extraction (Closes #10309) 2016-08-13 15:47:20 +07:00
Yen Chi Hsuan
e581224843 [tapely] Remove extractor. It's shut down
Closes #10323
2016-08-13 16:32:07 +08:00
Remita Amine
f50365e91c [pbs] add test for videos with undocumented http formats and remove unused import 2016-08-13 09:10:09 +01:00
Sergey M․
c366f8d30a [24video] Add support for me and xxx TLDs 2016-08-13 14:47:51 +07:00
Sergey M․
6a26c5f9d5 [muenchentv] Fix extraction (Closes #10313) 2016-08-13 14:28:44 +07:00
Sergey M․
bd6fb007de [24video] Fix comment count extraction 2016-08-13 14:22:47 +07:00
Sergey M․
b69b2ff736 [sunporno] Add support for embed URLs 2016-08-13 14:13:49 +07:00
Sergey M․
794e5dcd7e [sunporno] Fix metadata extraction (Closes #10316) 2016-08-13 14:09:35 +07:00
Remita Amine
f0d3669437 [hgtv] Add new extractor(closes #3999) 2016-08-12 18:05:49 +01:00
Remita Amine
98e698f1ff [external/curl] respect more downloader options and display progress 2016-08-12 12:30:02 +01:00
Remita Amine
3cddb8d6a7 [pbs] check all http formats and remove unnecessary request
- some of the quality that not reported in the documentation
are available(4500k, 6500k)
- the videoInfo request doesn't work for a long time
2016-08-12 08:38:06 +01:00
Sergey M․
990d533ee4 [crunchyroll] Add support for HLS (Closes #10301) 2016-08-12 00:56:16 +07:00
23 changed files with 328 additions and 390 deletions

View File

@@ -6,8 +6,8 @@
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.08.12*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.08.12**
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2016.08.13*. If it's not read [this FAQ entry](https://github.com/rg3/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2016.08.13**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through [README](https://github.com/rg3/youtube-dl/blob/master/README.md) and **most notably** [FAQ](https://github.com/rg3/youtube-dl#faq) and [BUGS](https://github.com/rg3/youtube-dl#bugs) sections
@@ -35,7 +35,7 @@ $ youtube-dl -v <your command line>
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2016.08.12
[debug] youtube-dl version 2016.08.13
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}

View File

@@ -1,3 +1,28 @@
version 2016.08.13
Core
* Show progress for curl external downloader
* Forward more options to curl external downloader
Extractors
* [pbs] Fix description extraction
* [franceculture] Fix extraction (#10324)
* [pornotube] Fix extraction (#10322)
* [4tube] Fix metadata extraction (#10321)
* [imgur] Fix width and height extraction (#10325)
* [expotv] Improve extraction
+ [vbox7] Fix extraction (#10309)
- [tapely] Remove extractor (#10323)
* [muenchentv] Fix extraction (#10313)
+ [24video] Add support for .me and .xxx TLDs
* [24video] Fix comment count extraction
* [sunporno] Add support for embed URLs
* [sunporno] Fix metadata extraction (#10316)
+ [hgtv] Add extractor for hgtv.ca (#3999)
- [pbs] Remove request to unavailable API
+ [pbs] Add support for high quality HTTP formats
+ [crunchyroll] Add support for HLS formats (#10301)
version 2016.08.12
Core

View File

@@ -238,7 +238,6 @@
- **FoxSports**
- **france2.fr:generation-quoi**
- **FranceCulture**
- **FranceCultureEmission**
- **FranceInter**
- **francetv**: France 2, 3, 4, 5 and Ô
- **francetvinfo.fr**
@@ -277,6 +276,7 @@
- **HellPorno**
- **Helsinki**: helsinki.fi
- **HentaiStigma**
- **HGTV**
- **HistoricFilms**
- **history:topic**: History.com Topic
- **hitbox**
@@ -664,7 +664,6 @@
- **SztvHu**
- **Tagesschau**
- **tagesschau:player**
- **Tapely**
- **Tass**
- **TDSLifeway**
- **teachertube**: teachertube.com videos

View File

@@ -968,6 +968,7 @@ The first line
self.assertEqual(cli_option({'proxy': '127.0.0.1:3128'}, '--proxy', 'proxy'), ['--proxy', '127.0.0.1:3128'])
self.assertEqual(cli_option({'proxy': None}, '--proxy', 'proxy'), [])
self.assertEqual(cli_option({}, '--proxy', 'proxy'), [])
self.assertEqual(cli_option({'retries': 10}, '--retries', 'retries'), ['--retries', '10'])
def test_cli_valueless_option(self):
self.assertEqual(cli_valueless_option(

View File

@@ -22,10 +22,10 @@ class TestVerboseOutput(unittest.TestCase):
'--password', 'secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue('--username' in serr)
self.assertTrue('johnsmith' not in serr)
self.assertTrue('--password' in serr)
self.assertTrue('secret' not in serr)
self.assertTrue(b'--username' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'--password' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_shortarg(self):
outp = subprocess.Popen(
@@ -35,10 +35,10 @@ class TestVerboseOutput(unittest.TestCase):
'-p', 'secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue('-u' in serr)
self.assertTrue('johnsmith' not in serr)
self.assertTrue('-p' in serr)
self.assertTrue('secret' not in serr)
self.assertTrue(b'-u' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_eq(self):
outp = subprocess.Popen(
@@ -48,10 +48,10 @@ class TestVerboseOutput(unittest.TestCase):
'--password=secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue('--username' in serr)
self.assertTrue('johnsmith' not in serr)
self.assertTrue('--password' in serr)
self.assertTrue('secret' not in serr)
self.assertTrue(b'--username' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'--password' in serr)
self.assertTrue(b'secret' not in serr)
def test_private_info_shortarg_eq(self):
outp = subprocess.Popen(
@@ -61,10 +61,10 @@ class TestVerboseOutput(unittest.TestCase):
'-p=secret',
], cwd=rootDir, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sout, serr = outp.communicate()
self.assertTrue('-u' in serr)
self.assertTrue('johnsmith' not in serr)
self.assertTrue('-p' in serr)
self.assertTrue('secret' not in serr)
self.assertTrue(b'-u' in serr)
self.assertTrue(b'johnsmith' not in serr)
self.assertTrue(b'-p' in serr)
self.assertTrue(b'secret' not in serr)
if __name__ == '__main__':
unittest.main()

View File

@@ -96,6 +96,12 @@ class CurlFD(ExternalFD):
cmd = [self.exe, '--location', '-o', tmpfilename]
for key, val in info_dict['http_headers'].items():
cmd += ['--header', '%s: %s' % (key, val)]
cmd += self._bool_option('--continue-at', 'continuedl', '-', '0')
cmd += self._valueless_option('--silent', 'noprogress')
cmd += self._valueless_option('--verbose', 'verbose')
cmd += self._option('--limit-rate', 'ratelimit')
cmd += self._option('--retry', 'retries')
cmd += self._option('--max-filesize', 'max_filesize')
cmd += self._option('--interface', 'source_address')
cmd += self._option('--proxy', 'proxy')
cmd += self._valueless_option('--insecure', 'nocheckcertificate')
@@ -103,6 +109,16 @@ class CurlFD(ExternalFD):
cmd += ['--', info_dict['url']]
return cmd
def _call_downloader(self, tmpfilename, info_dict):
cmd = [encodeArgument(a) for a in self._make_cmd(tmpfilename, info_dict)]
self._debug_cmd(cmd)
# curl writes the progress to stderr so don't capture it.
p = subprocess.Popen(cmd)
p.communicate()
return p.returncode
class AxelFD(ExternalFD):
AVAILABLE_OPT = '-V'

View File

@@ -11,15 +11,6 @@ from ..compat import compat_urllib_parse_unquote
class BigflixIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?bigflix\.com/.+/(?P<id>[0-9]+)'
_TESTS = [{
'url': 'http://www.bigflix.com/Hindi-movies/Action-movies/Singham-Returns/16537',
'md5': 'dc1b4aebb46e3a7077ecc0d9f43f61e3',
'info_dict': {
'id': '16537',
'ext': 'mp4',
'title': 'Singham Returns',
'description': 'md5:3d2ba5815f14911d5cc6a501ae0cf65d',
}
}, {
# 2 formats
'url': 'http://www.bigflix.com/Tamil-movies/Drama-movies/Madarasapatinam/16070',
'info_dict': {

View File

@@ -114,6 +114,21 @@ class CrunchyrollIE(CrunchyrollBaseIE):
# rtmp
'skip_download': True,
},
}, {
'url': 'http://www.crunchyroll.com/rezero-starting-life-in-another-world-/episode-5-the-morning-of-our-promise-is-still-distant-702409',
'info_dict': {
'id': '702409',
'ext': 'mp4',
'title': 'Re:ZERO -Starting Life in Another World- Episode 5 The Morning of Our Promise Is Still Distant',
'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'TV TOKYO',
'upload_date': '20160508',
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'http://www.crunchyroll.fr/girl-friend-beta/episode-11-goodbye-la-mode-661697',
'only_matching': True,
@@ -336,9 +351,18 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
if video_encode_id in video_encode_ids:
continue
video_encode_ids.append(video_encode_id)
video_file = xpath_text(stream_info, './file')
if not video_file:
continue
if video_file.startswith('http'):
formats.extend(self._extract_m3u8_formats(
video_file, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
continue
video_url = xpath_text(stream_info, './host')
video_play_path = xpath_text(stream_info, './file')
if not video_url or not video_play_path:
if not video_url:
continue
metadata = stream_info.find('./metadata')
format_info = {
@@ -353,7 +377,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
parsed_video_url = compat_urlparse.urlparse(video_url)
direct_video_url = compat_urlparse.urlunparse(parsed_video_url._replace(
netloc='v.lvlt.crcdn.net',
path='%s/%s' % (remove_end(parsed_video_url.path, '/'), video_play_path.split(':')[-1])))
path='%s/%s' % (remove_end(parsed_video_url.path, '/'), video_file.split(':')[-1])))
if self._is_valid_url(direct_video_url, video_id, video_format):
format_info.update({
'url': direct_video_url,
@@ -363,7 +387,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
format_info.update({
'url': video_url,
'play_path': video_play_path,
'play_path': video_file,
'ext': 'flv',
})
formats.append(format_info)

View File

@@ -1,7 +1,5 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
int_or_none,
@@ -12,23 +10,22 @@ from ..utils import (
class ExpoTVIE(InfoExtractor):
_VALID_URL = r'https?://www\.expotv\.com/videos/[^?#]*/(?P<id>[0-9]+)($|[?#])'
_TEST = {
'url': 'http://www.expotv.com/videos/reviews/1/24/LinneCardscom/17561',
'md5': '2985e6d7a392b2f7a05e0ca350fe41d0',
'url': 'http://www.expotv.com/videos/reviews/3/40/NYX-Butter-lipstick/667916',
'md5': 'fe1d728c3a813ff78f595bc8b7a707a8',
'info_dict': {
'id': '17561',
'id': '667916',
'ext': 'mp4',
'upload_date': '20060212',
'title': 'My Favorite Online Scrapbook Store',
'view_count': int,
'description': 'You\'ll find most everything you need at this virtual store front.',
'uploader': 'Anna T.',
'title': 'NYX Butter Lipstick Little Susie',
'description': 'Goes on like butter, but looks better!',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'Stephanie S.',
'upload_date': '20150520',
'view_count': int,
}
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
player_key = self._search_regex(
@@ -66,7 +63,7 @@ class ExpoTVIE(InfoExtractor):
fatal=False)
upload_date = unified_strdate(self._search_regex(
r'<h5>Reviewed on ([0-9/.]+)</h5>', webpage, 'upload date',
fatal=False))
fatal=False), day_first=False)
return {
'id': video_id,

View File

@@ -272,10 +272,7 @@ from .fox import FOXIE
from .foxgay import FoxgayIE
from .foxnews import FoxNewsIE
from .foxsports import FoxSportsIE
from .franceculture import (
FranceCultureIE,
FranceCultureEmissionIE,
)
from .franceculture import FranceCultureIE
from .franceinter import FranceInterIE
from .francetv import (
PluzzIE,
@@ -325,6 +322,7 @@ from .heise import HeiseIE
from .hellporno import HellPornoIE
from .helsinki import HelsinkiIE
from .hentaistigma import HentaiStigmaIE
from .hgtv import HGTVIE
from .historicfilms import HistoricFilmsIE
from .hitbox import HitboxIE, HitboxLiveIE
from .hornbunny import HornBunnyIE
@@ -811,7 +809,6 @@ from .tagesschau import (
TagesschauPlayerIE,
TagesschauIE,
)
from .tapely import TapelyIE
from .tass import TassIE
from .tdslifeway import TDSLifewayIE
from .teachertube import (

View File

@@ -43,14 +43,14 @@ class FourTubeIE(InfoExtractor):
'uploadDate', webpage))
thumbnail = self._html_search_meta('thumbnailUrl', webpage)
uploader_id = self._html_search_regex(
r'<a class="img-avatar" href="[^"]+/channels/([^/"]+)" title="Go to [^"]+ page">',
r'<a class="item-to-subscribe" href="[^"]+/channels/([^/"]+)" title="Go to [^"]+ page">',
webpage, 'uploader id', fatal=False)
uploader = self._html_search_regex(
r'<a class="img-avatar" href="[^"]+/channels/[^/"]+" title="Go to ([^"]+) page">',
r'<a class="item-to-subscribe" href="[^"]+/channels/[^/"]+" title="Go to ([^"]+) page">',
webpage, 'uploader', fatal=False)
categories_html = self._search_regex(
r'(?s)><i class="icon icon-tag"></i>\s*Categories / Tags\s*.*?<ul class="list">(.*?)</ul>',
r'(?s)><i class="icon icon-tag"></i>\s*Categories / Tags\s*.*?<ul class="[^"]*?list[^"]*?">(.*?)</ul>',
webpage, 'categories', fatal=False)
categories = None
if categories_html:
@@ -59,10 +59,10 @@ class FourTubeIE(InfoExtractor):
r'(?s)<li><a.*?>(.*?)</a>', categories_html)]
view_count = str_to_int(self._search_regex(
r'<meta itemprop="interactionCount" content="UserPlays:([0-9,]+)">',
r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([0-9,]+)">',
webpage, 'view count', fatal=False))
like_count = str_to_int(self._search_regex(
r'<meta itemprop="interactionCount" content="UserLikes:([0-9,]+)">',
r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserLikes:([0-9,]+)">',
webpage, 'like count', fatal=False))
duration = parse_duration(self._html_search_meta('duration', webpage))

View File

@@ -2,104 +2,56 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import (
compat_urlparse,
)
from ..utils import (
determine_ext,
int_or_none,
ExtractorError,
unified_strdate,
)
class FranceCultureIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/player/reecouter\?play=(?P<id>[0-9]+)'
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emissions/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TEST = {
'url': 'http://www.franceculture.fr/player/reecouter?play=4795174',
'url': 'http://www.franceculture.fr/emissions/carnet-nomade/rendez-vous-au-pays-des-geeks',
'info_dict': {
'id': '4795174',
'id': 'rendez-vous-au-pays-des-geeks',
'display_id': 'rendez-vous-au-pays-des-geeks',
'ext': 'mp3',
'title': 'Rendez-vous au pays des geeks',
'alt_title': 'Carnet nomade | 13-14',
'vcodec': 'none',
'thumbnail': 're:^https?://.*\\.jpg$',
'upload_date': '20140301',
'thumbnail': r're:^http://static\.franceculture\.fr/.*/images/player/Carnet-nomade\.jpg$',
'description': 'startswith:Avec :Jean-Baptiste Péretié pour son documentaire sur Arte "La revanche',
'timestamp': 1393700400,
'vcodec': 'none',
}
}
def _extract_from_player(self, url, video_id):
webpage = self._download_webpage(url, video_id)
def _real_extract(self, url):
display_id = self._match_id(url)
video_path = self._search_regex(
r'<a id="player".*?href="([^"]+)"', webpage, 'video path')
video_url = compat_urlparse.urljoin(url, video_path)
timestamp = int_or_none(self._search_regex(
r'<a id="player".*?data-date="([0-9]+)"',
webpage = self._download_webpage(url, display_id)
video_url = self._search_regex(
r'(?s)<div[^>]+class="[^"]*?title-zone-diffusion[^"]*?"[^>]*>.*?<a[^>]+href="([^"]+)"',
webpage, 'video path')
title = self._og_search_title(webpage)
upload_date = unified_strdate(self._search_regex(
'(?s)<div[^>]+class="date"[^>]*>.*?<span[^>]+class="inner"[^>]*>([^<]+)<',
webpage, 'upload date', fatal=False))
thumbnail = self._search_regex(
r'<a id="player".*?>\s+<img src="([^"]+)"',
r'(?s)<figure[^>]+itemtype="https://schema.org/ImageObject"[^>]*>.*?<img[^>]+data-pagespeed-(?:lazy|high-res)-src="([^"]+)"',
webpage, 'thumbnail', fatal=False)
display_id = self._search_regex(
r'<span class="path-diffusion">emission-(.*?)</span>', webpage, 'display_id')
title = self._html_search_regex(
r'<span class="title-diffusion">(.*?)</span>', webpage, 'title')
alt_title = self._html_search_regex(
r'<span class="title">(.*?)</span>',
webpage, 'alt_title', fatal=False)
description = self._html_search_regex(
r'<span class="description">(.*?)</span>',
webpage, 'description', fatal=False)
uploader = self._html_search_regex(
r'(?s)<div id="emission".*?<span class="author">(.*?)</span>',
webpage, 'uploader', default=None)
vcodec = 'none' if determine_ext(video_url.lower()) == 'mp3' else None
return {
'id': video_id,
'id': display_id,
'display_id': display_id,
'url': video_url,
'title': title,
'thumbnail': thumbnail,
'vcodec': vcodec,
'uploader': uploader,
'timestamp': timestamp,
'title': title,
'alt_title': alt_title,
'thumbnail': thumbnail,
'description': description,
'display_id': display_id,
'upload_date': upload_date,
}
def _real_extract(self, url):
video_id = self._match_id(url)
return self._extract_from_player(url, video_id)
class FranceCultureEmissionIE(FranceCultureIE):
_VALID_URL = r'https?://(?:www\.)?franceculture\.fr/emission-(?P<id>[^?#]+)'
_TEST = {
'url': 'http://www.franceculture.fr/emission-les-carnets-de-la-creation-jean-gabriel-periot-cineaste-2015-10-13',
'info_dict': {
'title': 'Jean-Gabriel Périot, cinéaste',
'alt_title': 'Les Carnets de la création',
'id': '5093239',
'display_id': 'les-carnets-de-la-creation-jean-gabriel-periot-cineaste-2015-10-13',
'ext': 'mp3',
'timestamp': 1444762500,
'upload_date': '20151013',
'description': 'startswith:Aujourd\'hui dans "Les carnets de la création", le cinéaste',
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
video_path = self._html_search_regex(
r'<a class="rf-player-open".*?href="([^"]+)"', webpage, 'video path', 'no_path_player')
if video_path == 'no_path_player':
raise ExtractorError('no player : no sound in this page.', expected=True)
new_id = self._search_regex('play=(?P<id>[0-9]+)', video_path, 'new_id', group='id')
video_url = compat_urlparse.urljoin(url, video_path)
return self._extract_from_player(video_url, new_id)

View File

@@ -0,0 +1,48 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
int_or_none,
js_to_json,
smuggle_url,
)
class HGTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?hgtv\.ca/[^/]+/video/(?P<id>[^/]+)/video.html'
_TEST = {
'url': 'http://www.hgtv.ca/homefree/video/overnight-success/video.html?v=738081859718&p=1&s=da#video',
'md5': '',
'info_dict': {
'id': 'aFH__I_5FBOX',
'ext': 'mp4',
'title': 'Overnight Success',
'description': 'After weeks of hard work, high stakes, breakdowns and pep talks, the final 2 contestants compete to win the ultimate dream.',
'uploader': 'SHWM-NEW',
'timestamp': 1470320034,
'upload_date': '20160804',
},
'params': {
# m3u8 download
'skip_download': True,
},
}
def _real_extract(self, url):
display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id)
embed_vars = self._parse_json(self._search_regex(
r'(?s)embed_vars\s*=\s*({.*?});',
webpage, 'embed vars'), display_id, js_to_json)
return {
'_type': 'url_transparent',
'url': smuggle_url(
'http://link.theplatform.com/s/dtjsEC/%s?mbr=true&manifest=m3u' % embed_vars['pid'], {
'force_smil_url': True
}),
'series': embed_vars.get('show'),
'season_number': int_or_none(embed_vars.get('season')),
'episode_number': int_or_none(embed_vars.get('episode')),
'ie_key': 'ThePlatform',
}

View File

@@ -50,12 +50,10 @@ class ImgurIE(InfoExtractor):
webpage = self._download_webpage(
compat_urlparse.urljoin(url, video_id), video_id)
width = int_or_none(self._search_regex(
r'<param name="width" value="([0-9]+)"',
webpage, 'width', fatal=False))
height = int_or_none(self._search_regex(
r'<param name="height" value="([0-9]+)"',
webpage, 'height', fatal=False))
width = int_or_none(self._og_search_property(
'video:width', webpage, default=None))
height = int_or_none(self._og_search_property(
'video:height', webpage, default=None))
video_elements = self._search_regex(
r'(?s)<div class="video-elements">(.*?)</div>',

View File

@@ -36,7 +36,7 @@ class MuenchenTVIE(InfoExtractor):
title = self._live_title(self._og_search_title(webpage))
data_js = self._search_regex(
r'(?s)\nplaylist:\s*(\[.*?}\]),related:',
r'(?s)\nplaylist:\s*(\[.*?}\]),',
webpage, 'playlist configuration')
data_json = js_to_json(data_js)
data = json.loads(data_json)[0]

View File

@@ -4,13 +4,13 @@ from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import compat_HTTPError
from ..utils import (
ExtractorError,
determine_ext,
int_or_none,
js_to_json,
strip_jsonp,
strip_or_none,
unified_strdate,
US_RATINGS,
)
@@ -201,7 +201,7 @@ class PBSIE(InfoExtractor):
'id': '2365006249',
'ext': 'mp4',
'title': 'Constitution USA with Peter Sagal - A More Perfect Union',
'description': 'md5:36f341ae62e251b8f5bd2b754b95a071',
'description': 'md5:31b664af3c65fd07fa460d306b837d00',
'duration': 3190,
},
},
@@ -212,7 +212,7 @@ class PBSIE(InfoExtractor):
'id': '2365297690',
'ext': 'mp4',
'title': 'FRONTLINE - Losing Iraq',
'description': 'md5:4d3eaa01f94e61b3e73704735f1196d9',
'description': 'md5:5979a4d069b157f622d02bff62fbe654',
'duration': 5050,
},
},
@@ -223,7 +223,7 @@ class PBSIE(InfoExtractor):
'id': '2201174722',
'ext': 'mp4',
'title': 'PBS NewsHour - Cyber Schools Gain Popularity, but Quality Questions Persist',
'description': 'md5:95a19f568689d09a166dff9edada3301',
'description': 'md5:86ab9a3d04458b876147b355788b8781',
'duration': 801,
},
},
@@ -268,7 +268,7 @@ class PBSIE(InfoExtractor):
'display_id': 'player',
'ext': 'mp4',
'title': 'American Experience - Death and the Civil War, Chapter 1',
'description': 'md5:1b80a74e0380ed2a4fb335026de1600d',
'description': 'md5:67fa89a9402e2ee7d08f53b920674c18',
'duration': 682,
'thumbnail': 're:^https?://.*\.jpg$',
},
@@ -294,13 +294,13 @@ class PBSIE(InfoExtractor):
# "<iframe style='position: absolute;<br />\ntop: 0; left: 0;' ...", see
# https://github.com/rg3/youtube-dl/issues/7059)
'url': 'http://www.pbs.org/food/features/a-chefs-life-season-3-episode-5-prickly-business/',
'md5': '84ced42850d78f1d4650297356e95e6f',
'md5': '59b0ef5009f9ac8a319cc5efebcd865e',
'info_dict': {
'id': '2365546844',
'display_id': 'a-chefs-life-season-3-episode-5-prickly-business',
'ext': 'mp4',
'title': "A Chef's Life - Season 3, Ep. 5: Prickly Business",
'description': 'md5:54033c6baa1f9623607c6e2ed245888b',
'description': 'md5:c0ff7475a4b70261c7e58f493c2792a5',
'duration': 1480,
'thumbnail': 're:^https?://.*\.jpg$',
},
@@ -313,7 +313,7 @@ class PBSIE(InfoExtractor):
'display_id': 'the-atomic-artists',
'ext': 'mp4',
'title': 'FRONTLINE - The Atomic Artists',
'description': 'md5:1a2481e86b32b2e12ec1905dd473e2c1',
'description': 'md5:f677e4520cfacb4a5ce1471e31b57800',
'duration': 723,
'thumbnail': 're:^https?://.*\.jpg$',
},
@@ -324,7 +324,7 @@ class PBSIE(InfoExtractor):
{
# Serves hd only via wigget/partnerplayer page
'url': 'http://www.pbs.org/video/2365641075/',
'md5': 'acfd4c400b48149a44861cb16dd305cf',
'md5': 'fdf907851eab57211dd589cf12006666',
'info_dict': {
'id': '2365641075',
'ext': 'mp4',
@@ -353,11 +353,16 @@ class PBSIE(InfoExtractor):
def _extract_webpage(self, url):
mobj = re.match(self._VALID_URL, url)
description = None
presumptive_id = mobj.group('presumptive_id')
display_id = presumptive_id
if presumptive_id:
webpage = self._download_webpage(url, display_id)
description = strip_or_none(self._og_search_description(
webpage, default=None) or self._html_search_meta(
'description', webpage, default=None))
upload_date = unified_strdate(self._search_regex(
r'<input type="hidden" id="air_date_[0-9]+" value="([^"]+)"',
webpage, 'upload date', default=None))
@@ -370,7 +375,7 @@ class PBSIE(InfoExtractor):
for p in MULTI_PART_REGEXES:
tabbed_videos = re.findall(p, webpage)
if tabbed_videos:
return tabbed_videos, presumptive_id, upload_date
return tabbed_videos, presumptive_id, upload_date, description
MEDIA_ID_REGEXES = [
r"div\s*:\s*'videoembed'\s*,\s*mediaid\s*:\s*'(\d+)'", # frontline video embed
@@ -382,7 +387,7 @@ class PBSIE(InfoExtractor):
media_id = self._search_regex(
MEDIA_ID_REGEXES, webpage, 'media ID', fatal=False, default=None)
if media_id:
return media_id, presumptive_id, upload_date
return media_id, presumptive_id, upload_date, description
# Fronline video embedded via flp
video_id = self._search_regex(
@@ -399,7 +404,7 @@ class PBSIE(InfoExtractor):
'http://www.pbs.org/wgbh/pages/frontline/.json/getdir/getdir%d.json' % prg_id,
presumptive_id, 'Downloading getdir JSON',
transform_source=strip_jsonp)
return getdir['mid'], presumptive_id, upload_date
return getdir['mid'], presumptive_id, upload_date, description
for iframe in re.findall(r'(?s)<iframe(.+?)></iframe>', webpage):
url = self._search_regex(
@@ -423,10 +428,10 @@ class PBSIE(InfoExtractor):
video_id = mobj.group('id')
display_id = video_id
return video_id, display_id, None
return video_id, display_id, None, description
def _real_extract(self, url):
video_id, display_id, upload_date = self._extract_webpage(url)
video_id, display_id, upload_date, description = self._extract_webpage(url)
if isinstance(video_id, list):
entries = [self.url_result(
@@ -448,17 +453,6 @@ class PBSIE(InfoExtractor):
redirects.append(redirect)
redirect_urls.add(redirect_url)
try:
video_info = self._download_json(
'http://player.pbs.org/videoInfo/%s?format=json&type=partner' % video_id,
display_id, 'Downloading video info JSON')
extract_redirect_urls(video_info)
info = video_info
except ExtractorError as e:
# videoInfo API may not work for some videos
if not isinstance(e.cause, compat_HTTPError) or e.cause.code != 404:
raise
# Player pages may also serve different qualities
for page in ('widget/partnerplayer', 'portalplayer'):
player = self._download_webpage(
@@ -511,15 +505,19 @@ class PBSIE(InfoExtractor):
formats))
if http_url:
for m3u8_format in m3u8_formats:
bitrate = self._search_regex(r'(\d+k)', m3u8_format['url'], 'bitrate', default=None)
# extract only the formats that we know that they will be available as http format.
# https://projects.pbs.org/confluence/display/coveapi/COVE+Video+Specifications
if not bitrate or bitrate not in ('400k', '800k', '1200k', '2500k'):
bitrate = self._search_regex(r'(\d+)k', m3u8_format['url'], 'bitrate', default=None)
# Lower qualities (150k and 192k) are not available as HTTP formats (see [1]),
# we won't try extracting them.
# Since summer 2016 higher quality formats (4500k and 6500k) are also available
# albeit they are not documented in [2].
# 1. https://github.com/rg3/youtube-dl/commit/cbc032c8b70a038a69259378c92b4ba97b42d491#commitcomment-17313656
# 2. https://projects.pbs.org/confluence/display/coveapi/COVE+Video+Specifications
if not bitrate or int(bitrate) < 400:
continue
f_url = re.sub(r'\d+k|baseline', bitrate, http_url)
f_url = re.sub(r'\d+k|baseline', bitrate + 'k', http_url)
# This may produce invalid links sometimes (e.g.
# http://www.pbs.org/wgbh/frontline/film/suicide-plan)
if not self._is_valid_url(f_url, display_id, 'http-%s video' % bitrate):
if not self._is_valid_url(f_url, display_id, 'http-%sk video' % bitrate):
continue
f = m3u8_format.copy()
f.update({
@@ -562,11 +560,14 @@ class PBSIE(InfoExtractor):
if alt_title:
info['title'] = alt_title + ' - ' + re.sub(r'^' + alt_title + '[\s\-:]+', '', info['title'])
description = info.get('description') or info.get(
'program', {}).get('description') or description
return {
'id': video_id,
'display_id': display_id,
'title': info['title'],
'description': info.get('description') or info.get('program', {}).get('description'),
'description': description,
'thumbnail': info.get('image_url'),
'duration': int_or_none(info.get('duration')),
'age_limit': age_limit,

View File

@@ -3,10 +3,7 @@ from __future__ import unicode_literals
import json
from .common import InfoExtractor
from ..utils import (
int_or_none,
sanitized_Request,
)
from ..utils import int_or_none
class PornotubeIE(InfoExtractor):
@@ -31,59 +28,55 @@ class PornotubeIE(InfoExtractor):
def _real_extract(self, url):
video_id = self._match_id(url)
# Fetch origin token
js_config = self._download_webpage(
'http://www.pornotube.com/assets/src/app/config.js', video_id,
note='Download JS config')
originAuthenticationSpaceKey = self._search_regex(
r"constant\('originAuthenticationSpaceKey',\s*'([^']+)'",
js_config, 'originAuthenticationSpaceKey')
token = self._download_json(
'https://api.aebn.net/auth/v2/origins/authenticate',
video_id, note='Downloading token',
data=json.dumps({'credentials': 'Clip Application'}).encode('utf-8'),
headers={
'Content-Type': 'application/json',
'Origin': 'http://www.pornotube.com',
})['tokenKey']
# Fetch actual token
token_req_data = {
'authenticationSpaceKey': originAuthenticationSpaceKey,
'credentials': 'Clip Application',
}
token_req = sanitized_Request(
'https://api.aebn.net/auth/v1/token/primal',
data=json.dumps(token_req_data).encode('utf-8'))
token_req.add_header('Content-Type', 'application/json')
token_req.add_header('Origin', 'http://www.pornotube.com')
token_answer = self._download_json(
token_req, video_id, note='Requesting primal token')
token = token_answer['tokenKey']
video_url = self._download_json(
'https://api.aebn.net/delivery/v1/clips/%s/MP4' % video_id,
video_id, note='Downloading delivery information',
headers={'Authorization': token})['mediaUrl']
# Get video URL
delivery_req = sanitized_Request(
'https://api.aebn.net/delivery/v1/clips/%s/MP4' % video_id)
delivery_req.add_header('Authorization', token)
delivery_info = self._download_json(
delivery_req, video_id, note='Downloading delivery information')
video_url = delivery_info['mediaUrl']
FIELDS = (
'title', 'description', 'startSecond', 'endSecond', 'publishDate',
'studios{name}', 'categories{name}', 'movieId', 'primaryImageNumber'
)
# Get additional info (title etc.)
info_req = sanitized_Request(
'https://api.aebn.net/content/v1/clips/%s?expand='
'title,description,primaryImageNumber,startSecond,endSecond,'
'movie.title,movie.MovieId,movie.boxCoverFront,movie.stars,'
'movie.studios,stars.name,studios.name,categories.name,'
'clipActive,movieActive,publishDate,orientations' % video_id)
info_req.add_header('Authorization', token)
info = self._download_json(
info_req, video_id, note='Downloading metadata')
'https://api.aebn.net/content/v2/clips/%s?fields=%s'
% (video_id, ','.join(FIELDS)), video_id,
note='Downloading metadata',
headers={'Authorization': token})
if isinstance(info, list):
info = info[0]
title = info['title']
timestamp = int_or_none(info.get('publishDate'), scale=1000)
uploader = info.get('studios', [{}])[0].get('name')
movie_id = info['movie']['movieId']
thumbnail = 'http://pic.aebn.net/dis/t/%s/%s_%08d.jpg' % (
movie_id, movie_id, info['primaryImageNumber'])
categories = [c['name'] for c in info.get('categories')]
movie_id = info.get('movieId')
primary_image_number = info.get('primaryImageNumber')
thumbnail = None
if movie_id and primary_image_number:
thumbnail = 'http://pic.aebn.net/dis/t/%s/%s_%08d.jpg' % (
movie_id, movie_id, primary_image_number)
start = int_or_none(info.get('startSecond'))
end = int_or_none(info.get('endSecond'))
duration = end - start if start and end else None
categories = [c['name'] for c in info.get('categories', []) if c.get('name')]
return {
'id': video_id,
'url': video_url,
'title': info['title'],
'title': title,
'description': info.get('description'),
'duration': duration,
'timestamp': timestamp,
'uploader': uploader,
'thumbnail': thumbnail,

View File

@@ -12,25 +12,29 @@ from ..utils import (
class SunPornoIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?sunporno\.com/videos/(?P<id>\d+)'
_TEST = {
_VALID_URL = r'https?://(?:(?:www\.)?sunporno\.com/videos|embeds\.sunporno\.com/embed)/(?P<id>\d+)'
_TESTS = [{
'url': 'http://www.sunporno.com/videos/807778/',
'md5': '6457d3c165fd6de062b99ef6c2ff4c86',
'md5': '507887e29033502f29dba69affeebfc9',
'info_dict': {
'id': '807778',
'ext': 'flv',
'ext': 'mp4',
'title': 'md5:0a400058e8105d39e35c35e7c5184164',
'description': 'md5:a31241990e1bd3a64e72ae99afb325fb',
'thumbnail': 're:^https?://.*\.jpg$',
'duration': 302,
'age_limit': 18,
}
}
}, {
'url': 'http://embeds.sunporno.com/embed/807778',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
webpage = self._download_webpage(
'http://www.sunporno.com/videos/%s' % video_id, video_id)
title = self._html_search_regex(
r'<title>([^<]+)</title>', webpage, 'title')
@@ -40,7 +44,8 @@ class SunPornoIE(InfoExtractor):
r'poster="([^"]+)"', webpage, 'thumbnail', fatal=False)
duration = parse_duration(self._search_regex(
r'itemprop="duration">\s*(\d+:\d+)\s*<',
(r'itemprop="duration"[^>]*>\s*(\d+:\d+)\s*<',
r'>Duration:\s*<span[^>]+>\s*(\d+:\d+)\s*<'),
webpage, 'duration', fatal=False))
view_count = int_or_none(self._html_search_regex(
@@ -48,7 +53,7 @@ class SunPornoIE(InfoExtractor):
webpage, 'view count', fatal=False))
comment_count = int_or_none(self._html_search_regex(
r'(\d+)</b> Comments?',
webpage, 'comment count', fatal=False))
webpage, 'comment count', fatal=False, default=None))
formats = []
quality = qualities(['mp4', 'flv'])

View File

@@ -1,109 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..utils import (
clean_html,
ExtractorError,
float_or_none,
parse_iso8601,
sanitized_Request,
)
class TapelyIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?(?:tape\.ly|tapely\.com)/(?P<id>[A-Za-z0-9\-_]+)(?:/(?P<songnr>\d+))?'
_API_URL = 'http://tape.ly/showtape?id={0:}'
_S3_SONG_URL = 'http://mytape.s3.amazonaws.com/{0:}'
_SOUNDCLOUD_SONG_URL = 'http://api.soundcloud.com{0:}'
_TESTS = [
{
'url': 'http://tape.ly/my-grief-as-told-by-water',
'info_dict': {
'id': 23952,
'title': 'my grief as told by water',
'thumbnail': 're:^https?://.*\.png$',
'uploader_id': 16484,
'timestamp': 1411848286,
'description': 'For Robin and Ponkers, whom the tides of life have taken out to sea.',
},
'playlist_count': 13,
},
{
'url': 'http://tape.ly/my-grief-as-told-by-water/1',
'md5': '79031f459fdec6530663b854cbc5715c',
'info_dict': {
'id': 258464,
'title': 'Dreaming Awake (My Brightest Diamond)',
'ext': 'm4a',
},
},
{
'url': 'https://tapely.com/my-grief-as-told-by-water',
'only_matching': True,
},
]
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
display_id = mobj.group('id')
playlist_url = self._API_URL.format(display_id)
request = sanitized_Request(playlist_url)
request.add_header('X-Requested-With', 'XMLHttpRequest')
request.add_header('Accept', 'application/json')
request.add_header('Referer', url)
playlist = self._download_json(request, display_id)
tape = playlist['tape']
entries = []
for s in tape['songs']:
song = s['song']
entry = {
'id': song['id'],
'duration': float_or_none(song.get('songduration'), 1000),
'title': song['title'],
}
if song['source'] == 'S3':
entry.update({
'url': self._S3_SONG_URL.format(song['filename']),
})
entries.append(entry)
elif song['source'] == 'YT':
self.to_screen('YouTube video detected')
yt_id = song['filename'].replace('/youtube/', '')
entry.update(self.url_result(yt_id, 'Youtube', video_id=yt_id))
entries.append(entry)
elif song['source'] == 'SC':
self.to_screen('SoundCloud song detected')
sc_url = self._SOUNDCLOUD_SONG_URL.format(song['filename'])
entry.update(self.url_result(sc_url, 'Soundcloud'))
entries.append(entry)
else:
self.report_warning('Unknown song source: %s' % song['source'])
if mobj.group('songnr'):
songnr = int(mobj.group('songnr')) - 1
try:
return entries[songnr]
except IndexError:
raise ExtractorError(
'No song with index: %s' % mobj.group('songnr'),
expected=True)
return {
'_type': 'playlist',
'id': tape['id'],
'display_id': display_id,
'title': tape['name'],
'entries': entries,
'thumbnail': tape.get('image_url'),
'description': clean_html(tape.get('subtext')),
'like_count': tape.get('likescount'),
'uploader_id': tape.get('user_id'),
'timestamp': parse_iso8601(tape.get('published_at')),
}

View File

@@ -12,32 +12,32 @@ from ..utils import (
class TwentyFourVideoIE(InfoExtractor):
IE_NAME = '24video'
_VALID_URL = r'https?://(?:www\.)?24video\.net/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_VALID_URL = r'https?://(?:www\.)?24video\.(?:net|me|xxx)/(?:video/(?:view|xml)/|player/new24_play\.swf\?id=)(?P<id>\d+)'
_TESTS = [
{
'url': 'http://www.24video.net/video/view/1044982',
'md5': 'e09fc0901d9eaeedac872f154931deeb',
'info_dict': {
'id': '1044982',
'ext': 'mp4',
'title': 'Эротика каменного века',
'description': 'Как смотрели порно в каменном веке.',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'SUPERTELO',
'duration': 31,
'timestamp': 1275937857,
'upload_date': '20100607',
'age_limit': 18,
'like_count': int,
'dislike_count': int,
},
_TESTS = [{
'url': 'http://www.24video.net/video/view/1044982',
'md5': 'e09fc0901d9eaeedac872f154931deeb',
'info_dict': {
'id': '1044982',
'ext': 'mp4',
'title': 'Эротика каменного века',
'description': 'Как смотрели порно в каменном веке.',
'thumbnail': 're:^https?://.*\.jpg$',
'uploader': 'SUPERTELO',
'duration': 31,
'timestamp': 1275937857,
'upload_date': '20100607',
'age_limit': 18,
'like_count': int,
'dislike_count': int,
},
{
'url': 'http://www.24video.net/player/new24_play.swf?id=1044982',
'only_matching': True,
}
]
}, {
'url': 'http://www.24video.net/player/new24_play.swf?id=1044982',
'only_matching': True,
}, {
'url': 'http://www.24video.me/video/view/1044982',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
@@ -64,7 +64,7 @@ class TwentyFourVideoIE(InfoExtractor):
r'<span class="video-views">(\d+) просмотр',
webpage, 'view count', fatal=False))
comment_count = int_or_none(self._html_search_regex(
r'<div class="comments-title" id="comments-count">(\d+) комментари',
r'<a[^>]+href="#tab-comments"[^>]*>(\d+) комментари',
webpage, 'comment count', fatal=False))
# Sets some cookies

View File

@@ -2,17 +2,20 @@
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_urlparse
from ..utils import (
ExtractorError,
sanitized_Request,
urlencode_postdata,
)
from ..utils import urlencode_postdata
class Vbox7IE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?vbox7\.com/play:(?P<id>[^/]+)'
_TEST = {
_TESTS = [{
'url': 'http://vbox7.com/play:0946fff23c',
'md5': 'a60f9ab3a3a2f013ef9a967d5f7be5bf',
'info_dict': {
'id': '0946fff23c',
'ext': 'mp4',
'title': 'Борисов: Притеснен съм за бъдещето на България',
},
}, {
'url': 'http://vbox7.com/play:249bb972c2',
'md5': '99f65c0c9ef9b682b97313e052734c3f',
'info_dict': {
@@ -20,43 +23,38 @@ class Vbox7IE(InfoExtractor):
'ext': 'mp4',
'title': 'Смях! Чудо - чист за секунди - Скрита камера',
},
}
'skip': 'georestricted',
}]
def _real_extract(self, url):
video_id = self._match_id(url)
# need to get the page 3 times for the correct jsSecretToken cookie
# which is necessary for the correct title
def get_session_id():
redirect_page = self._download_webpage(url, video_id)
session_id_url = self._search_regex(
r'var\s*url\s*=\s*\'([^\']+)\';', redirect_page,
'session id url')
self._download_webpage(
compat_urlparse.urljoin(url, session_id_url), video_id,
'Getting session id')
webpage = self._download_webpage(url, video_id)
get_session_id()
get_session_id()
title = self._html_search_regex(
r'<title>(.*)</title>', webpage, 'title').split('/')[0].strip()
webpage = self._download_webpage(url, video_id,
'Downloading redirect page')
video_url = self._search_regex(
r'src\s*:\s*(["\'])(?P<url>.+?.mp4.*?)\1',
webpage, 'video url', default=None, group='url')
title = self._html_search_regex(r'<title>(.*)</title>',
webpage, 'title').split('/')[0].strip()
thumbnail_url = self._og_search_thumbnail(webpage)
info_url = 'http://vbox7.com/play/magare.do'
data = urlencode_postdata({'as3': '1', 'vid': video_id})
info_request = sanitized_Request(info_url, data)
info_request.add_header('Content-Type', 'application/x-www-form-urlencoded')
info_response = self._download_webpage(info_request, video_id, 'Downloading info webpage')
if info_response is None:
raise ExtractorError('Unable to extract the media url')
(final_url, thumbnail_url) = map(lambda x: x.split('=')[1], info_response.split('&'))
if not video_url:
info_response = self._download_webpage(
'http://vbox7.com/play/magare.do', video_id,
'Downloading info webpage',
data=urlencode_postdata({'as3': '1', 'vid': video_id}),
headers={'Content-Type': 'application/x-www-form-urlencoded'})
final_url, thumbnail_url = map(
lambda x: x.split('=')[1], info_response.split('&'))
if '/na.mp4' in video_url:
self.raise_geo_restricted()
return {
'id': video_id,
'url': final_url,
'url': self._proto_relative_url(video_url, 'http:'),
'title': title,
'thumbnail': thumbnail_url,
}

View File

@@ -2410,6 +2410,8 @@ def dfxp2srt(dfxp_data):
def cli_option(params, command_option, param):
param = params.get(param)
if param:
param = compat_str(param)
return [command_option, param] if param is not None else []

View File

@@ -1,3 +1,3 @@
from __future__ import unicode_literals
__version__ = '2016.08.12'
__version__ = '2016.08.13'