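One-Command Web Video Downloads: You-Get
Downloading videos from web pages (Bilibili, YouTube, and so on) has long been a hassle, until I came across the you-get project on GitHub.
GitHub repository: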
https://github.com/soimort/you-get
Installation:
[root@oracle1 python_area]# pip3 install you-get
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting you-get
  Downloading https://files.pythonhosted.org/packages/a9/58/382ba8588318b1be53b2de34c136d81c04d2c997bd24c5e5c000d72f74f4/you_get-0.4.1545-py3-none-any.whl (228kB)
    100% |████████████████████████████████| 235kB 19kB/s
Installing collected packages: you-get
Successfully installed you-get-0.4.1545
[root@oracle1 python_area]#
Help output for the command:
[root@oracle1 python_area]# you-get -h
you-get: version 0.4.1545, a tiny downloader that scrapes the web.
usage: you-get [OPTION]... URL...

A tiny downloader that scrapes the web

optional arguments:
  -V, --version         Print version and exit
  -h, --help            Print this help message and exit

Dry-run options:
  (no actual downloading)

  -i, --info            Print extracted information
  -u, --url             Print extracted information with URLs
  --json                Print extracted URLs in JSON format

Download options:
  -n, --no-merge        Do not merge video parts
  --no-caption          Do not download captions (subtitles, lyrics, danmaku, ...)
  -f, --force           Force overwriting existing files
  --skip-existing-file-size-check
                        Skip existing file without checking file size
  -F STREAM_ID, --format STREAM_ID
                        Set video format to STREAM_ID
  -O FILE, --output-filename FILE
                        Set output filename
  -o DIR, --output-dir DIR
                        Set output directory
  -p PLAYER, --player PLAYER
                        Stream extracted URL to a PLAYER
  -c COOKIES_FILE, --cookies COOKIES_FILE
                        Load cookies.txt or cookies.sqlite
  -t SECONDS, --timeout SECONDS
                        Set socket timeout
  -d, --debug           Show traceback and other debug info
  -I FILE, --input-file FILE
                        Read non-playlist URLs from FILE
  -P PASSWORD, --password PASSWORD
                        Set video visit password to PASSWORD
  -l, --playlist        Prefer to download a playlist
  -a, --auto-rename     Auto rename same name different files
  -k, --insecure        ignore ssl errors

Playlist optional options:
  --first FIRST         the first number
  --last LAST           the last number
  --size PAGE_SIZE, --page-size PAGE_SIZE
                        the page size number

Proxy options:
  -x HOST:PORT, --http-proxy HOST:PORT
                        Use an HTTP proxy for downloading
  -y HOST:PORT, --extractor-proxy HOST:PORT
                        Use an HTTP proxy for extracting only
  --no-proxy            Never use a proxy
  -s HOST:PORT or USERNAME:PASSWORD@HOST:PORT, --socks-proxy HOST:PORT or USERNAME:PASSWORD@HOST:PORT
                        Use an SOCKS5 proxy for downloading

[root@oracle1 python_area]#
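If you want to drive you-get from a script instead of typing the options by hand, something like the following might work. It is only a rough sketch: the helper names `build_you_get_command` and `run_you_get` and the directory `/tmp/videos` are made up for illustration. The flags themselves (`--info`, `--no-caption`, `--format`, `--output-dir`, `--output-filename`) are taken from the help text above, and the script assumes `you-get` is on the `PATH`.

```python
import shlex
import subprocess
from typing import List, Optional


def build_you_get_command(
    url: str,
    output_dir: Optional[str] = None,
    output_filename: Optional[str] = None,
    stream_id: Optional[str] = None,
    info_only: bool = False,
    no_caption: bool = False,
) -> List[str]:
    """Assemble a you-get argument list using flags listed in `you-get -h`."""
    cmd = ["you-get"]
    if info_only:
        cmd.append("--info")  # dry run: print stream info, download nothing
    if no_caption:
        cmd.append("--no-caption")  # skip subtitles / danmaku files
    if stream_id is not None:
        cmd += ["--format", stream_id]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if output_filename is not None:
        cmd += ["--output-filename", output_filename]
    cmd.append(url)  # the URL goes last, after all options
    return cmd


def run_you_get(cmd: List[str]) -> int:
    """Run the command and return its exit code (needs you-get installed)."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical output directory; only prints the command, does not run it.
    example = build_you_get_command(
        "https://www.bilibili.com/video/BV1XP4y1a7EP",
        output_dir="/tmp/videos",
        info_only=True,
    )
    print(shlex.join(example))
```

Passing the arguments as a list avoids shell quoting problems with characters such as `?` and `&` in video URLs. Call `run_you_get(example)` to actually execute the command.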
Usage:
[root@oracle1 python_area]# ls -ltr
total 93768
-rw-r--r-- 1 root root 25864932 Sep 8 05:46 【原神】这仨神终于疯了.flv
-rw-r--r-- 1 root root 57916 Sep 8 05:46 【原神】这仨神终于疯了.cmt.xml
-rw-r--r-- 1 root root 19323798 Sep 8 05:47 【原神】钟离:以普遍理性而论,你好像有个大病.flv
-rw-r--r-- 1 root root 61195 Sep 8 05:47 【原神】钟离:以普遍理性而论,你好像有个大病.cmt.xml
-rw-r--r-- 1 root root 42738112 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析[00].mp4
-rw-r--r-- 1 root root 108486 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析.cmt.xml
-rw-r--r-- 1 root root 7851847 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析[01].mp4
[root@oracle1 python_area]#
[root@oracle1 python_area]#
[root@oracle1 python_area]#
[root@oracle1 python_area]# you-get https://www.bilibili.com/video/BV1XP4y1a7EP?spm_id_from=333.851.b_7265636f6d6d656e64.6
site:                Bilibili
title:               堪比特效大片!跨越3000公里,带你看放慢1300倍下的闪电⚡️⚡️
stream:
    - format:        flv
      container:     flv
      quality:       高清 1080P
      size:          95.4 MiB (100043721 bytes)
    # download-with: you-get --format=flv [URL]

Downloading 堪比特效大片!跨越3000公里,带你看放慢1300倍下的闪电⚡️⚡️.flv ...
 100% ( 95.4/ 95.4MB) ├███████████████████████████████████████████████████████████████████████████████████████████████████┤[1/1] 321 kB/s

Downloading 堪比特效大片!跨越3000公里,带你看放慢1300倍下的闪电⚡️⚡️.cmt.xml ...

[root@oracle1 python_area]#
[root@oracle1 python_area]# ls -ltr
total 191640
-rw-r--r-- 1 root root 25864932 Sep 8 05:46 【原神】这仨神终于疯了.flv
-rw-r--r-- 1 root root 57916 Sep 8 05:46 【原神】这仨神终于疯了.cmt.xml
-rw-r--r-- 1 root root 19323798 Sep 8 05:47 【原神】钟离:以普遍理性而论,你好像有个大病.flv
-rw-r--r-- 1 root root 61195 Sep 8 05:47 【原神】钟离:以普遍理性而论,你好像有个大病.cmt.xml
-rw-r--r-- 1 root root 42738112 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析[00].mp4
-rw-r--r-- 1 root root 108486 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析.cmt.xml
-rw-r--r-- 1 root root 7851847 Sep 8 05:59 一切众生,本来而有!正式揭晓玛娜的根源!【灵笼前传·诡秘的苏醒】深度解析[01].mp4
-rw-r--r-- 1 root root 100043721 Sep 8 08:14 堪比特效大片!跨越3000公里,带你看放慢1300倍下的闪电⚡️⚡️.flv
-rw-r--r-- 1 root root 172938 Sep 8 08:14 堪比特效大片!跨越3000公里,带你看放慢1300倍下的闪电⚡️⚡️.cmt.xml
[root@oracle1 python_area]#
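In the session above, the size that you-get reports (100043721 bytes) matches the size of the downloaded .flv file in the `ls -ltr` listing. That comparison could be automated roughly as follows. This is a sketch, not part of you-get: the function names are invented here, and the regular expression assumes the `size: ... (N bytes)` line format seen in this particular output, which may differ on other sites or versions.

```python
import os
import re
from typing import Optional

# Matches lines such as "size:          95.4 MiB (100043721 bytes)".
# The pattern is an assumption based on the output shown in this post.
_SIZE_RE = re.compile(r"size:\s+.*\((\d+) bytes\)")


def parse_reported_size(output: str) -> Optional[int]:
    """Return the byte count from the first 'size: ... (N bytes)' line."""
    match = _SIZE_RE.search(output)
    return int(match.group(1)) if match else None


def download_is_complete(path: str, output: str) -> bool:
    """Compare the file size on disk with the size you-get printed."""
    expected = parse_reported_size(output)
    return expected is not None and os.path.getsize(path) == expected


if __name__ == "__main__":
    sample = "      size:          95.4 MiB (100043721 bytes)"
    print(parse_reported_size(sample))
```

The check only makes sense when the download ends up as a single file. It would not apply directly to videos saved in several parts, such as the `[00].mp4` and `[01].mp4` files in the listing.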
Then I copied the file over to an MS Windows machine to check the result:

