Code Samples - HTTP Proxy

This document contains code samples for sending requests through an HTTP proxy server programmatically, for developers' reference.

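How to use the code samples

  1. The code samples cannot be run as-is: the order SecretId o1fjh1re9o28876h7c08, the signature xxxxx, the proxy IP and port 59.38.241.25:23916, the username "username" and the password "password" are all fictitious. Replace them with your own values.
  2. The runtime environment and caveats for each sample are described alongside it; read them carefully before use.
  3. If you run into problems while using the code samples, contact after-sales support and we will provide technical assistance.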
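
When you swap in real credentials, a username or password containing characters such as "@" or ":" can break the http://user:password@ip:port/ proxy URL that most samples below build. The following is a minimal sketch of one way to handle this, not part of the original samples: build_proxy_url is a hypothetical helper name, and it assumes your HTTP client decodes percent-encoded credentials in proxy URLs (requests does).

```python
from urllib.parse import quote


def build_proxy_url(username, password, proxy, scheme="http"):
    """Return scheme://user:pass@ip:port/ with the credentials percent-encoded."""
    # safe="" also encodes "/", ":" and "@", which would otherwise break the URL
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    # strip() removes the trailing newline an API text response may carry
    return "%s://%s:%s@%s/" % (scheme, user, pwd, proxy.strip())


# Fictitious values, as everywhere else in this document
proxy_url = build_proxy_url("username", "p@ss:word", "59.38.241.25:23916\n")
proxies = {"http": proxy_url, "https": proxy_url}
print(proxy_url)
```

The resulting proxies dict has the same shape as the one used in the requests samples.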

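Important notes

The code samples below are all basic samples, and running them does not guarantee that the target site can be crawled successfully. Target sites usually have anti-crawling mechanisms, such as redirecting to a page that asks for a CAPTCHA.

We recommend building on the basic samples with the following improvements during development:

  1. Add IP pool management;
  2. Keep the request rate to the target site reasonable; we suggest no more than 1 request per second per proxy IP to the same site;
  3. Send HTTP requests with header information that is as complete as possible.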
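
Recommendations 2 and 3 above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the vendor's implementation: ProxyRateLimiter and DEFAULT_HEADERS are hypothetical names, and the header values are examples only. The limiter keeps the time of the last request per (proxy, host) pair and sleeps just long enough to stay at or below one request per second.

```python
import time
from urllib.parse import urlsplit


class ProxyRateLimiter:
    """Allow at most one request per `min_interval` seconds for each (proxy, host) pair."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._last_request = {}  # (proxy, host) -> time of the last request

    def wait(self, proxy, url):
        """Block until `proxy` may contact the host of `url` again; return the seconds slept."""
        key = (proxy, urlsplit(url).hostname)
        now = self._clock()
        last = self._last_request.get(key)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_request[key] = now + delay
        return delay


# Example browser-like headers (illustrative values; adjust them to your own client)
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}
```

A typical use would be to call limiter.wait(proxy_ip, target_url) immediately before each request and pass headers=DEFAULT_HEADERS to the HTTP client.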

Python3

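requests (recommended)

Usage tips

  1. The requests-based sample supports both http and https pages and is the recommended option
  2. requests is not part of the Python standard library and must be installed first: pip install requests

Code download: GitHub, Gitee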

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with requests.
Works for both http and https pages.
"""

import requests

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"

# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = requests.get(api_url).text.strip()

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"
proxies = {
    "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
    "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
}

# Whitelist authentication (the whitelist must be configured in advance)
# proxies = {
#     "http": "http://%(proxy)s/" % {"proxy": proxy_ip},
#     "https": "http://%(proxy)s/" % {"proxy": proxy_ip}
# }

# Target page to visit
target_url = "https://dev.kdlapi.com/testproxy"

# Send the request through the proxy IP
response = requests.get(target_url, proxies=proxies)

# Print the page content
if response.status_code == 200:
    print(response.text)

urllib

Usage tips

  1. The urllib-based sample supports both http and https pages
  2. Requires Python 3.x

Code download: GitHub, Gitee

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with urllib.
Works for both http and https pages.
"""

import urllib.request
import ssl

# Disable certificate verification globally to avoid errors on https pages
ssl._create_default_https_context = ssl._create_unverified_context

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"

# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = urllib.request.urlopen(api_url).read().decode('utf-8').strip()

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"
proxies = {
    "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
    "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
}

# Whitelist authentication (the whitelist must be configured in advance)
# proxies = {
#     "http": "http://%(proxy)s/" % {"proxy": proxy_ip},
#     "https": "http://%(proxy)s/" % {"proxy": proxy_ip}
# }

# Target page to visit
target_url = "https://dev.kdlapi.com/testproxy"

# Send the request through the proxy IP
proxy_support = urllib.request.ProxyHandler(proxies)
opener = urllib.request.build_opener(proxy_support)
# urllib.request.install_opener(opener)  # note: this sets the proxy globally, so every later urllib request in the process uses it
# response = urllib.request.urlopen(target_url)
response = opener.open(target_url)

# Print the page content
if response.code == 200:
    print(response.read().decode('utf-8'))

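aiohttp

Usage tips

  1. The aiohttp-based sample supports both http and https pages
  2. aiohttp is not part of the Python standard library and must be installed first: pip install aiohttp
  3. aiohttp only supports Python 3.5 and later
  4. If aiohttp raises an exception on Windows when visiting https sites, call asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy()) after import asyncio to resolve it.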
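
The Windows workaround in tip 4 can be guarded so that the same script still runs unchanged on Linux and macOS. A minimal sketch, assuming the helper name use_selector_loop_on_windows (hypothetical, not part of aiohttp or asyncio); note that recent Python releases are phasing out event loop policies, so treat this as a workaround for the interpreter versions where the error actually occurs.

```python
import asyncio
import sys


def use_selector_loop_on_windows():
    """Switch to the selector event loop on Windows only; return True if it was applied."""
    # WindowsSelectorEventLoopPolicy only exists on Windows builds of Python
    if sys.platform.startswith("win") and hasattr(asyncio, "WindowsSelectorEventLoopPolicy"):
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


applied = use_selector_loop_on_windows()
print("selector loop policy applied:", applied)
```

Call the helper once, before any event loop is created.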

Code download: GitHub, Gitee

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with aiohttp.
Works for both http and https pages.
"""
import random
import asyncio
# asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())  # call this on Windows if https requests raise errors

import aiohttp
import requests

page_url = "http://icanhazip.com/"  # target page to visit

# API endpoint; the response format is JSON
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=5&pt=1&format=json&sep=1"

# proxy_list returned by the API
proxy_list = requests.get(api_url).json().get('data').get('proxy_list')

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"

proxy_auth = aiohttp.BasicAuth(username, password)


async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, proxy="http://" + random.choice(proxy_list), proxy_auth=proxy_auth) as resp:
            content = await resp.read()
            print(f"status_code: {resp.status}, content: {content}")


async def main():
    # Send 5 requests concurrently
    await asyncio.gather(*(fetch(page_url) for _ in range(5)))


def run():
    # asyncio.run() needs Python 3.7+; gather() is used because asyncio.wait()
    # rejects bare coroutines on Python 3.11+
    asyncio.run(main())


if __name__ == '__main__':
    run()

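httpx

Usage tips

  1. The httpx-based sample supports both http and https pages
  2. httpx is not part of the Python standard library and must be installed first: pip install httpx
  3. httpx requires Python 3.7+
  4. SOCKS proxies are only supported with the optional extra: pip install "httpx[socks]"

Code download: GitHub, Gitee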

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with httpx.
Works for both http and https pages.
"""

import random
import asyncio

import httpx
import requests

page_url = "http://icanhazip.com/"  # target page to visit

# API endpoint; the response format is JSON
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=10&pt=1&format=json&sep=1"

# proxy_list returned by the API
proxy_list = requests.get(api_url).json().get('data').get('proxy_list')

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"


async def fetch(url):
    proxy = httpx.Proxy(
        url=f"http://{username}:{password}@{random.choice(proxy_list)}",
    )
    # httpx 0.26+ takes proxy=...; on older releases pass proxies=proxy instead
    async with httpx.AsyncClient(proxy=proxy, timeout=10) as client:
        resp = await client.get(url)
        print(f"status_code: {resp.status_code}, content: {resp.content}")


async def main():
    # Send 5 requests concurrently
    await asyncio.gather(*(fetch(page_url) for _ in range(5)))


def run():
    # gather() is used because asyncio.wait() rejects bare coroutines on Python 3.11+
    asyncio.run(main())


if __name__ == '__main__':
    run()

http.client (IP whitelist)

Usage tips

  1. The http.client-based sample supports both http and https pages

#!/usr/bin/env python
# -*- coding: utf-8 -*-

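"""
Send requests through a proxy server with http.client.
Works for both http and https pages.
"""
import http.client

# Address and port of the proxy server
proxy_host = "59.38.241.25"
proxy_port = 15818

# Address of the target server
target_host = "dev.kdlapi.com"

# Create the connection object (it connects to the proxy)
conn = http.client.HTTPSConnection(proxy_host, proxy_port)

# Tunnel through the proxy to the target host (HTTP CONNECT)
conn.set_tunnel(target_host)

# Send the request
conn.request("GET", "/testproxy")

# Read the response
response = conn.getresponse()

# Print the response status and body
print(response.status, response.reason)
print(response.read().decode('utf-8'))

# Close the connection
conn.close()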

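websocket (long-lived connection)

Usage tips

  1. Install the client library the sample needs: pip install websocket-client
  2. Sends websocket requests through an HTTP proxy
  3. Even while the IP is still usable, the server closes the connection if the client sends no messages for a long time
  4. Requires Python 3.x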
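
One way to work around the idle disconnect described in tip 3 is to send a periodic ping from a background thread. The sketch below is an assumption-laden illustration, not part of the original sample: start_heartbeat is a hypothetical helper, the 30-second default is a guess you should tune, and whether the target server counts a ping as activity depends on that server. It only relies on the connection object exposing a ping() method, which websocket-client's WebSocket does.

```python
import threading


def start_heartbeat(ws, interval=30.0):
    """Call ws.ping() every `interval` seconds on a daemon thread.

    Returns a threading.Event; call .set() on it to stop the heartbeat.
    """
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout (time to ping) and True once stop is set
        while not stop.wait(interval):
            try:
                ws.ping()
            except Exception:
                break  # the connection is gone; end the heartbeat quietly

    threading.Thread(target=loop, daemon=True).start()
    return stop


# Usage with the sample below (not executed here):
# ws = websocket.create_connection(url, **proxies)
# stop = start_heartbeat(ws)
# ...
# stop.set()
```

For the WebSocketApp style used in the short-lived sample, run_forever() also accepts a ping_interval argument that serves the same purpose.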

Code download: GitHub, Gitee

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
Send websocket requests through an HTTP proxy
"""
import gzip
import zlib

import websocket

OPCODE_DATA = (websocket.ABNF.OPCODE_TEXT, websocket.ABNF.OPCODE_BINARY)

url = "ws://echo.websocket.events/"

proxies = {
    "http_proxy_host": "59.38.241.25",
    "http_proxy_port": 23916,
    "http_proxy_auth": ("username", "password"),
}

ws = websocket.create_connection(url, **proxies)


def recv():
    try:
        frame = ws.recv_frame()
    except websocket.WebSocketException:
        return websocket.ABNF.OPCODE_CLOSE, None
    if not frame:
        raise websocket.WebSocketException("Not a valid frame %s" % frame)
    elif frame.opcode in OPCODE_DATA:
        return frame.opcode, frame.data
    elif frame.opcode == websocket.ABNF.OPCODE_CLOSE:
        ws.send_close()
        return frame.opcode, None
    elif frame.opcode == websocket.ABNF.OPCODE_PING:
        ws.pong(frame.data)
        return frame.opcode, frame.data

    return frame.opcode, frame.data


def recv_ws():
    opcode, data = recv()
    if opcode == websocket.ABNF.OPCODE_CLOSE:
        return
    if opcode == websocket.ABNF.OPCODE_TEXT and isinstance(data, bytes):
        data = str(data, "utf-8")
    if isinstance(data, bytes) and len(data) > 2 and data[:2] == b'\037\213':  # gzip magic number
        try:
            data = "[gzip] " + str(gzip.decompress(data), "utf-8")
        except Exception:
            pass
    elif isinstance(data, bytes):
        try:
            data = "[zlib] " + str(zlib.decompress(data, -zlib.MAX_WBITS), "utf-8")
        except Exception:
            pass
    if isinstance(data, bytes):
        data = repr(data)

    print("< " + data)


def main():
    print("Press Ctrl+C to quit")
    while True:
        message = input("> ")
        ws.send(message)
        recv_ws()


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print('\nbye')
    except Exception as e:
        print(e)

websocket (short-lived connection)

Usage tips

  1. Install the client library the sample needs: pip install websocket-client
  2. Sends websocket requests through an HTTP proxy
  3. Requires Python 3.x

Code download: GitHub, Gitee

#!/usr/bin/env python
# -*- encoding: utf-8 -*-

import ssl
import websocket


def on_message(ws, message):
    print(message)


def on_error(ws, error):
    print(error)


def on_open(ws):
    data = '{}'  # JSON parameters to send to the target site, e.g. {"type":"web","data":{"_id":"xxxx"}}
    ws.send(data)


def on_close(*args):
    print("### closed ###")


proxies = {
    "http_proxy_host": "59.38.241.25",
    "http_proxy_port": 23916,
    "http_proxy_auth": ("username", "password"),
    "proxy_type":"http",
}


def start():
    websocket.enableTrace(True)
    target_url = 'ws://echo.websocket.events/'  # replace with your target site
    ws = websocket.WebSocketApp(
        url = target_url,
        header = [
            "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
        ],
        on_message=on_message,
        on_error=on_error,
        on_close=on_close,
    )
    ws.on_open = on_open
    ws.run_forever(sslopt={"cert_reqs": ssl.CERT_NONE}, **proxies)


if __name__ == "__main__":
    start()

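pyppeteer

Usage tips

  1. The pyppeteer-based sample supports both http and https pages
  2. pyppeteer is not part of the Python standard library and must be installed first: pip install pyppeteer
  3. pyppeteer only supports Python 3.5 and later
  4. pyppeteer renders pages asynchronously, so it must be used with asyncio or a similar library

Code download: GitHub, Gitee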

#!/usr/bin/env python
# -*- encoding: utf-8 -*-

"""
Works for both http and https pages.
"""

import asyncio

import requests
from pyppeteer import launch

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"
# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = requests.get(api_url).text.strip()
proxy = "http://" + proxy_ip


def accounts():
    # Username/password authentication (private proxy / dedicated proxy)
    username = "username"
    password = "password"
    account = {"username": username, "password": password}
    return account


async def main():
    # Target page to visit
    target_url = "https://dev.kdlapi.com/testproxy"

    browser = await launch({'headless': False, 'args': ['--disable-infobars', '--proxy-server=' + proxy]})
    page = await browser.newPage()
    await page.authenticate(accounts())  # for whitelist authentication, comment out this line (the whitelist must be configured in advance)
    await page.setViewport({'width': 1920, 'height': 1080})
    # Send the request through the proxy IP
    await page.goto(target_url)
    await asyncio.sleep(209)
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())

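playwright

Usage tips

  1. The playwright-based sample supports both http and https pages
  2. Playwright is not part of the Python standard library and must be installed first: pip install playwright
  3. If no supported browser is installed on your machine, run playwright install to install the required files
  4. playwright only supports Python 3.7 and later
  5. playwright can run synchronously or asynchronously; the sample below is synchronous
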
#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Works for both http and https pages.
"""
import requests
from playwright.sync_api import sync_playwright

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&num=1&signature=xxxxx&pt=1&sep=1"

# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = requests.get(api_url).text.strip()

# Username/password authentication
username = "username"
password = "password"

# Target page to visit
url = "https://dev.kdlapi.com/testproxy"

proxies = {
    "server": proxy_ip,
    "username": username,
    "password": password,
}

# Whitelist authentication (the whitelist must be configured in advance)
# proxies = {
#     "server": proxy_ip,
# }

with sync_playwright() as playwright:
    # headless=True runs without showing a browser window
    # browser = playwright.chromium.launch(channel="msedge", headless=True, proxy=proxies)  # Microsoft Edge
    # browser = playwright.firefox.launch(headless=True, proxy=proxies)                     # Mozilla Firefox
    # browser = playwright.webkit.launch(headless=True, proxy=proxies)                      # WebKit, e.g. Apple Safari
    browser = playwright.chromium.launch(channel="chrome", headless=True, proxy=proxies)    # Google Chrome
    context = browser.new_context()
    page = context.new_page()
    page.goto(url)
    content = page.content()
    print(content)
    # other actions...
    browser.close()

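ProxyPool

Usage tips

  1. This sample is a simple implementation of IP pool management for private proxies
  2. requests is not part of the Python standard library and must be installed first: pip install requests
  3. Supports Python 2.7 and Python 3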
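
The pool sample below splits each API entry on a comma and treats the second field as the IP's remaining lifetime in seconds. Assuming that "ip:port,seconds" shape (inferred from the sample, not from the API reference), expiry can also be tracked as an absolute deadline instead of being decremented once per second. A minimal sketch with hypothetical helper names:

```python
import time


def parse_proxy_entry(entry, now=None):
    """Split 'ip:port,seconds_left' into (proxy, absolute expiry time)."""
    now = time.monotonic() if now is None else now
    proxy, seconds_left = entry.split(",")
    return proxy, now + float(seconds_left)


def alive_proxies(entries, now=None, margin=3.0):
    """Return the proxies that still have more than `margin` seconds left."""
    now = time.monotonic() if now is None else now
    return [proxy for proxy, expires_at in entries if expires_at - now > margin]


# Fictitious entries in the assumed "ip:port,seconds" format
entries = [
    parse_proxy_entry("59.38.241.25:23916,10", now=0.0),
    parse_proxy_entry("59.38.241.26:23917,2", now=0.0),
]
print(alive_proxies(entries, now=0.0))
```

The 3-second margin mirrors the threshold at which the sample below discards an IP.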

Code download: GitHub, Gitee

#!/usr/bin/env python
# -*- encoding: utf-8 -*-


import time
import random
import threading

import requests


class ProxyPool():

    def __init__(self, secretid, signature, proxy_count):
        self.secretid = secretid
        self.signature = signature
        self.proxy_count = proxy_count if proxy_count < 50 else 50  # total IPs kept in the pool; usually best not to exceed 50
        self.alive_proxy_list = []  # list of live IPs

    def _fetch_proxy_list(self, count):
        """Call the Kuaidaili API to fetch a list of proxy IPs"""
        try:
            res = requests.get("https://dps.kdlapi.com/api/getdps/?secret_id=%s&signature=%s&num=%s&pt=1&sep=1&f_et=1&format=json" % (self.secretid, self.signature, count))
            return [proxy.split(',') for proxy in res.json().get('data').get('proxy_list')]
        except Exception:
            print("Failed to fetch IPs from the API; please check your order")
        return []

    def _init_proxy(self):
        """Initialise the IP pool"""
        self.alive_proxy_list = self._fetch_proxy_list(self.proxy_count)

    def add_alive_proxy(self, add_count):
        """Import new IPs; the argument is the number of IPs to add"""
        self.alive_proxy_list.extend(self._fetch_proxy_list(add_count))

    def get_proxy(self):
        """Pick an IP from the pool"""
        proxy_list = self.alive_proxy_list  # snapshot, since the updater thread may swap the list
        return random.choice(proxy_list)[0] if proxy_list else ""

    def run(self):
        sleep_seconds = 1
        self._init_proxy()
        while True:
            for proxy in self.alive_proxy_list:
                proxy[1] = float(proxy[1]) - sleep_seconds  # proxy[1] is the remaining lifetime of this IP
            # Discard IPs with 3s or less left; build a new list instead of removing while iterating
            self.alive_proxy_list = [proxy for proxy in self.alive_proxy_list if proxy[1] > 3]
            if len(self.alive_proxy_list) < self.proxy_count:
                self.add_alive_proxy(self.proxy_count - len(self.alive_proxy_list))
            time.sleep(sleep_seconds)

    def start(self):
        """Start a background thread that keeps the IP pool up to date"""
        t = threading.Thread(target=self.run)
        t.daemon = True  # daemon thread: the main thread does not wait for it, and it ends as soon as the main thread ends
        t.start()


def parse_url(proxy):
    # Username/password authentication (private proxy / dedicated proxy)
    username = "username"
    password = "password"
    proxies = {
        "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password,"proxy": proxy},
        "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password,"proxy": proxy}
    }

    # Whitelist authentication (the whitelist must be configured in advance)
    # proxies = {
    #     "http": "http://%(proxy)s/" % {"proxy": proxy},
    #     "https": "http://%(proxy)s/" % {"proxy": proxy}
    # }

    # Target page to visit
    target_url = "https://dev.kdlapi.com/testproxy"
    # Send the request through the proxy IP
    response = requests.get(target_url, proxies=proxies)
    # Print the page content
    if response.status_code == 200:
        print(response.text)


if __name__ == '__main__':
    proxy_pool = ProxyPool('o1fjh1re9o28876h7c08', 'xxxxxx', 30)  # order SecretId, signature (secret_token), number of IPs kept in the pool
    proxy_pool.start()
    time.sleep(1)  # wait for the pool to initialise

    proxy = proxy_pool.get_proxy()  # take an IP from the pool
    if proxy:
        parse_url(proxy)

Python2

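requests (recommended)

Usage tips

  1. The requests-based sample supports both http and https pages and is the recommended option
  2. requests is not part of the Python standard library and must be installed first: pip install requests

Code download: GitHub, Gitee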

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with requests.
Works for both http and https pages.
"""

import requests

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"

# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = requests.get(api_url).text.strip()

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"
proxies = {
    "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
    "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
}

# Whitelist authentication (the whitelist must be configured in advance)
# proxies = {
#     "http": "http://%(proxy)s/" % {"proxy": proxy_ip},
#     "https": "http://%(proxy)s/" % {"proxy": proxy_ip}
# }

# Target page to visit
target_url = "https://dev.kdlapi.com/testproxy"

# Send the request through the proxy IP
response = requests.get(target_url, proxies=proxies)

# Print the page content
if response.status_code == 200:
    print response.text

urllib2

Usage tips

  1. The urllib2-based sample supports both http and https pages
  2. Requires Python 2.6 / 2.7

Code download: GitHub, Gitee

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
Send requests through a proxy server with urllib2.
Works for both http and https pages.
"""

import urllib2
import ssl

# Disable certificate verification globally to avoid errors on https pages
ssl._create_default_https_context = ssl._create_unverified_context

# Proxy-extraction API endpoint; fetches 1 proxy IP
api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"

# Proxy IP returned by the API (strip any surrounding whitespace or newline)
proxy_ip = urllib2.urlopen(api_url).read().strip()

# Username/password authentication (private proxy / dedicated proxy)
username = "username"
password = "password"
proxies = {
    "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
    "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
}

# Whitelist authentication (the whitelist must be configured in advance)
# proxies = {
#     "http": "http://%(proxy)s/" % {"proxy": proxy_ip},
#     "https": "http://%(proxy)s/" % {"proxy": proxy_ip}
# }

# Target page to visit
target_url = "https://dev.kdlapi.com/testproxy"

# Send the request through the proxy IP
proxy_support = urllib2.ProxyHandler(proxies)
opener = urllib2.build_opener(proxy_support)
# urllib2.install_opener(opener)  # note: this sets the proxy globally, so every later urllib2 request in the process uses it
# response = urllib2.urlopen(target_url)
response = opener.open(target_url)

# Print the page content
if response.code == 200:
    print response.read()

Python-Selenium

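Chrome (IP whitelist, recommended)

Usage tips

  1. Uses Selenium + Chrome with whitelist-based proxy authentication
  2. Requires python2/3 + selenium + Chrome + Chromedriver + Windows/Linux/macOS
  3. Download chromedriver (the chromedriver version must match your Chrome version)
  4. selenium is not part of the Python standard library and must be installed first: pip install selenium (note: from selenium 4.6 onwards the driver no longer has to be downloaded manually)
  5. Replace the following placeholders in the code:
    ${ip:port}: proxy IP:port, e.g. "59.38.241.25:23916"
    ${chromedriver_path}: path to the chromedriver binary on your machine, e.g. "C:\chromedriver.exe"

Code download: GitHub, Gitee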

#!/usr/bin/env python
# encoding: utf-8

from selenium import webdriver
import time

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=http://${ip:port}')  # proxy IP:port
# selenium 4.6 and later
driver = webdriver.Chrome(options=chrome_options)
# ${chromedriver_path}: path to the chromedriver binary
# driver = webdriver.Chrome(executable_path="${chromedriver_path}", options=chrome_options)
driver.get("https://dev.kdlapi.com/testproxy")

# Print the page content
print(driver.page_source)

# Wait 3 seconds, then close the current window (quits the browser if it is the last window)
time.sleep(3)
driver.close()

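Chrome (username/password authentication)

Usage tips

  1. Uses Selenium + Chrome with username/password proxy authentication (tested with Chrome 86)
  2. Requires python2/3 + selenium + Chrome + Chromedriver + Windows/Linux/macOS
  3. Download chromedriver (the chromedriver version must match your Chrome version)
  4. selenium is not part of the Python standard library and must be installed first: pip install selenium (note: from selenium 4.6 onwards the driver no longer has to be downloaded manually)
  5. Replace the following placeholders in the code:
    ${proxy_ip}: proxy IP
    ${proxy_port}: port
    ${username}: username
    ${password}: password
    ${chromedriver_path}: path to the chromedriver binary on your machine, e.g. "C:\chromedriver.exe"

Code download: GitHub, Gitee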

#!/usr/bin/env python
# encoding: utf-8

from selenium import webdriver
import string
import zipfile
import time


def create_proxyauth_extension(proxy_host, proxy_port, proxy_username, proxy_password, scheme='http', plugin_path=None):
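    """Build a Chrome extension that handles proxy authentication

    args:
        proxy_host (str): proxy address or domain name
        proxy_port (int): proxy port
        # Username/password authentication (private proxy / dedicated proxy)
        proxy_username (str): username
        proxy_password (str): password
    kwargs:
        scheme (str): proxy scheme, defaults to http
        plugin_path (str): absolute path of the extension

    return str -> plugin_path
    """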

    if plugin_path is None:
        plugin_path = 'vimm_chrome_proxyauth_plugin.zip'

    manifest_json = """
    {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": [
            "proxy",
            "tabs",
            "unlimitedStorage",
            "storage",
            "<all_urls>",
            "webRequest",
            "webRequestBlocking"
        ],
        "background": {
            "scripts": ["background.js"]
        },
        "minimum_chrome_version":"22.0.0"
    }
    """

    background_js = string.Template(
        """
        var config = {
                mode: "fixed_servers",
                rules: {
                singleProxy: {
                    scheme: "${scheme}",
                    host: "${host}",
                    port: parseInt(${port})
                },
                bypassList: ["foobar.com"]
                }
            };

        chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

        function callbackFn(details) {
            return {
                authCredentials: {
                    username: "${username}",
                    password: "${password}"
                }
            };
        }

        chrome.webRequest.onAuthRequired.addListener(
                    callbackFn,
                    {urls: ["<all_urls>"]},
                    ['blocking']
        );
        """
    ).substitute(
        host=proxy_host,
        port=proxy_port,
        username=proxy_username,
        password=proxy_password,
        scheme=scheme,
    )
    with zipfile.ZipFile(plugin_path, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)
    return plugin_path


proxyauth_plugin_path = create_proxyauth_extension(
    proxy_host="${proxy_ip}",  # proxy IP
    proxy_port="${proxy_port}",  # port
    # Username and password (private proxy / dedicated proxy)
    proxy_username="${username}",
    proxy_password="${password}"
)


chrome_options = webdriver.ChromeOptions()
chrome_options.add_extension(proxyauth_plugin_path)
# selenium 4.6 and later
driver = webdriver.Chrome(options=chrome_options)
# ${chromedriver_path}: path to the chromedriver binary
# driver = webdriver.Chrome(executable_path="${chromedriver_path}", options=chrome_options)
driver.get("https://dev.kdlapi.com/testproxy")

# Print the page content
print(driver.page_source)

# Wait 3 seconds, then close the current window (quits the browser if it is the last window)
time.sleep(3)
driver.close()

Usage tips

To authenticate to the proxy with a username and password in headless mode, use Selenium + PhantomJS.

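PhantomJS (username/password authentication + headless mode)

Usage tips

  1. Uses Selenium + PhantomJS with username/password proxy authentication in headless mode
  2. Requires python2/3 + selenium + PhantomJS + Windows/Linux/macOS (webdriver.PhantomJS was removed in Selenium 4, so a Selenium 3.x release is needed)
  3. Download PhantomJS (version 2.1.1 is recommended)
  4. ${executable_path}: path to the PhantomJS executable on your machine, e.g. "C:\phantomjs-2.1.1-windows\bin\phantomjs.exe"

Code download: GitHub, Gitee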

#!/usr/bin/env python
# encoding: utf-8

from selenium import webdriver
import time

# Download the PhantomJS package first, then fill in the path to phantomjs.exe (the path must not contain Chinese characters)
executable_path = '${executable_path}'
service_args = [
    '--proxy=host:port',  # replace with your proxy IP, e.g. 59.38.241.25:23918
    '--proxy-type=http',
    '--proxy-auth=username:password'  # username and password
]
driver = webdriver.PhantomJS(service_args=service_args, executable_path=executable_path)
driver.get('https://dev.kdlapi.com/testproxy')

print(driver.page_source)
time.sleep(3)
driver.close()

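Firefox (IP whitelist, recommended)

Usage tips

  1. Uses Selenium + Firefox with whitelist-based proxy authentication
  2. Requires python2/3 + selenium + Firefox + geckodriver + Windows/Linux/macOS
  3. Download geckodriver (the geckodriver version must match your Firefox version)
  4. selenium is not part of the Python standard library and must be installed first: pip install selenium (note: from selenium 4.6 onwards the driver no longer has to be downloaded manually)
  5. Replace the following placeholders in the code:
    ${ip:port}: proxy IP:port, e.g. "59.38.241.25:23916"
    ${geckodriver_path}: path to the geckodriver binary on your machine, e.g. "C:\geckodriver.exe"

Code download: GitHub, Gitee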

#!/usr/bin/env python
# encoding: utf-8

from selenium import webdriver
import time

fp = webdriver.FirefoxProfile()
proxy = '${ip:port}'
ip, port = proxy.split(":")
port = int(port)

# Configure the proxy settings
fp.set_preference('network.proxy.type', 1)
fp.set_preference('network.proxy.http', ip)
fp.set_preference('network.proxy.http_port', port)
fp.set_preference('network.proxy.ssl', ip)
fp.set_preference('network.proxy.ssl_port', port)

driver = webdriver.Firefox(executable_path="${geckodriver_path}", firefox_profile=fp)
driver.get('https://dev.kdlapi.com/testproxy')

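# Print the page content
print(driver.page_source)

# Wait 3 seconds, then close the current window (quits the browser if it is the last window)
time.sleep(3)
driver.close()

Firefox (username/password authentication)

Usage tips

  1. Uses Selenium-wire + Firefox with username/password proxy authentication
  2. Requires Python 3.4 or later + selenium-wire + Firefox + geckodriver + Windows/Linux/macOS
  3. Download geckodriver (the geckodriver version must match your Firefox version)
  4. selenium-wire is not part of the Python standard library and must be installed first: pip install selenium-wire
  5. Replace the following placeholders in the code:
    ${ip:port}: proxy IP:port, e.g. "59.38.241.25:23916"
    ${geckodriver_path}: path to the geckodriver binary on your machine, e.g. "C:\geckodriver.exe"

Code download: GitHub, Gitee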

#!/usr/bin/env python
# encoding: utf-8

import time

from seleniumwire import webdriver  # pip install selenium-wire

username = 'username'  # replace with your username and password
password = 'password'
proxy_ip = '59.38.241.25:23916'  # replace with the proxy IP you extracted
options = {
        'proxy': {
            'http': "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
            'https': "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
        }
    }

driver = webdriver.Firefox(seleniumwire_options=options,executable_path="${geckodriver_path}")

driver.get('https://dev.kdlapi.com/testproxy')

# Print the page content
print(driver.page_source)

# Wait 3 seconds, then close the current window (quits the browser if it is the last window)
time.sleep(3)
driver.close()

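Python-DrissionPage

IP whitelist (recommended)

Usage tips

  1. Uses whitelist-based authentication
  2. Requires python3 + Windows/Linux
  3. Supports Chromium-based browsers (such as Chrome and Edge)
  4. DrissionPage is not part of the Python standard library and must be installed first: pip install DrissionPage
  5. Replace the following placeholder in the code:
    ${ip:port}: proxy IP:port, e.g. "59.38.241.25:23916"
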
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from DrissionPage import WebPage, ChromiumOptions
import time

co = ChromiumOptions()
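co.set_proxy("http://${ip:port}")  # proxy IP:port

page = WebPage(chromium_options=co)
page.get("https://dev.kdlapi.com/testproxy")  # target page to visit

# Print the page content
print(page.html)

# Wait 3 seconds, then close the page
time.sleep(3)
page.quit()

Username/password authentication

Usage tips

  1. Uses username/password authentication
  2. Requires python3 + Windows/Linux
  3. Supports Chromium-based browsers (such as Chrome and Edge)
  4. DrissionPage is not part of the Python standard library and must be installed first: pip install DrissionPage
  5. Replace the following values in the code:
    proxy_ip: proxy IP
    proxy_port: port
    username: proxy username
    password: proxy password
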
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from DrissionPage import WebPage, ChromiumOptions
import string
import os
import time

proxy_ip = 'proxy_ip'  # proxy IP
proxy_port = 'proxy_port'  # port

username = 'username'  # proxy username
password = 'password'  # proxy password

# Target page to visit
url = 'https://dev.kdlapi.com/testproxy'

def create_proxyauth_extension(proxy_host, proxy_port, proxy_username, proxy_password, scheme='http', plugin_folder=None):
    if plugin_folder is None:
        plugin_folder = 'kdl_Chromium_Proxy'  # name of the extension folder
    if not os.path.exists(plugin_folder):
        os.makedirs(plugin_folder)
    manifest_json = """
        {
            "version": "1.0.0",
            "manifest_version": 2,
            "name": "kdl_Chromium_Proxy",
            "permissions": [
                "proxy",
                "tabs",
                "unlimitedStorage",
                "storage",
                "<all_urls>",
                "webRequest",
                "webRequestBlocking",
                "browsingData"
            ],
            "background": {
                "scripts": ["background.js"]
            },
            "minimum_chrome_version":"22.0.0"
        }
    """
    background_js = string.Template("""
        var config = {
            mode: "fixed_servers",
            rules: {
            singleProxy: {
                scheme: "${scheme}",
                host: "${host}",
                port: parseInt(${port})
            },
            bypassList: []
            }
        };

        chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

        function callbackFn(details) {
            return {
                authCredentials: {
                    username: "${username}",
                    password: "${password}"
                }
            };
        }

        chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
        );
    """).substitute(
        host=proxy_host,
        port=proxy_port,
        username=proxy_username,
        password=proxy_password,
        scheme=scheme,
    )
    with open(os.path.join(plugin_folder, "manifest.json"), "w") as manifest_file:
        manifest_file.write(manifest_json)
    with open(os.path.join(plugin_folder, "background.js"), "w") as background_file:
        background_file.write(background_js)
    return plugin_folder

proxyauth_plugin_folder = create_proxyauth_extension(
    proxy_host=proxy_ip,
    proxy_port=proxy_port,
    proxy_username=username,
    proxy_password=password
)

co = ChromiumOptions()
# Load the extension from the folder create_proxyauth_extension() actually wrote to,
# so the sample also works when it is started from a different working directory
co.add_extension(os.path.abspath(proxyauth_plugin_folder))

page = WebPage(chromium_options=co)
page.get(url)

# Print the page content
print(page.html)

# Wait 3 seconds, then close the page
time.sleep(3)
page.quit()

Python-Scrapy

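Usage tips

  1. Works for both http and https pages
  2. scrapy is not part of the Python standard library and must be installed first: pip install scrapy
  3. Run the following command in the top-level tutorial directory to see the result: scrapy crawl kdl

Code download: GitHub, Gitee

Scrapy project layout

Run scrapy startproject tutorial to create a new Scrapy project; this creates a tutorial directory with the following contents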

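tutorial/
    scrapy.cfg        # project configuration file
    tutorial/         # the project's Python module; you will add your code here
        __init__.py
        items.py      # the project's item file
        pipelines.py  # the project's pipelines file
        settings.py   # the project's settings file
        spiders/      # directory for spider code
            __init__.py
            ...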

kdl_spider.py

Write the spider: create kdl_spider.py in the tutorial/spiders/ directory

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import scrapy

class KdlSpider(scrapy.spiders.Spider):
    name = "kdl"

    def start_requests(self):
        url = "https://dev.kdlapi.com/testproxy"
        yield scrapy.Request(url, callback=self.parse)

    def parse(self, response):
        print(response.text)


# If scrapy raises the SSL error "('SSL routines', 'ssl3_get_record', 'wrong version number')", try enabling the code below
# from OpenSSL import SSL
# from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory
# 
# init = ScrapyClientContextFactory.__init__
# def init2(self, *args, **kwargs):
#     init(self, *args, **kwargs)
#     self.method = SSL.SSLv23_METHOD
# ScrapyClientContextFactory.__init__ = init2
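
myextend.py

Add a custom extension: create myextend.py in the tutorial/ package directory (next to settings.py). To use it, you only need to change api_url and set the IP-extraction interval at the time.sleep call

#!/usr/bin/env python
# -*- coding: utf-8 -*-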
import time
import threading

import requests
from scrapy import signals

# API for extracting proxy IPs
api_url = 'https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=10&pt=1&format=json&sep=1'
foo = True  # set to False when the spider closes, which stops the refresh loop

class Proxy:

    def __init__(self):
        self._proxy_list = requests.get(api_url).json().get('data').get('proxy_list')

    @property
    def proxy_list(self):
        return self._proxy_list

    @proxy_list.setter
    def proxy_list(self, new_list):
        self._proxy_list = new_list


pro = Proxy()
print(pro.proxy_list)


class MyExtend:

    def __init__(self, crawler):
        self.crawler = crawler
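        # Bind the custom methods to scrapy signals so that they start and stop together with the spider engine
        # scrapy signals documentation: https://www.osgeo.cn/scrapy/topics/signals.html
        # scrapy custom extensions documentation: https://www.osgeo.cn/scrapy/topics/extensions.html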
        crawler.signals.connect(self.start, signals.engine_started)
        crawler.signals.connect(self.close, signals.spider_closed)

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def start(self):
        t = threading.Thread(target=self.extract_proxy)
        t.start()

    def extract_proxy(self):
        while foo:
            pro.proxy_list = requests.get(api_url).json().get('data').get('proxy_list')
            # Extract a fresh batch of IPs every 15 seconds
            time.sleep(15)

    def close(self):
        global foo
        foo = False

middlewares.py

  1. Add ProxyDownloaderMiddleware, the proxy middleware, to middlewares.py
  2. Replace the placeholders in the code: username is your user name, password is your password

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from scrapy import signals
from .myextend import pro
import random


class ProxyDownloaderMiddleware:

    def process_request(self, request, spider):
        proxy = random.choice(pro.proxy_list)

        # Username/password authentication (private proxy / dedicated proxy)
        username = "username"
        password = "password"
        request.meta['proxy'] = "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy}

        # Whitelist authentication (private proxy / dedicated proxy)
        # request.meta['proxy'] = "http://%(proxy)s/" % {"proxy": proxy}
        return None

settings.py

Enable the ProxyDownloaderMiddleware proxy middleware and the custom extension in settings.py

# -*- coding: utf-8 -*-

# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
    'tutorial.middlewares.ProxyDownloaderMiddleware': 100,
}
# Make sure the module path matches your project
EXTENSIONS = {
    'tutorial.myextend.MyExtend': 300,
}
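
The extension and middleware above share one proxy list between a background refresh thread and the request path. The same pattern can be sketched in isolation as below. This is only an illustration under assumptions: ProxyPool and the injected fetch callable are hypothetical names, not part of Scrapy or of the proxy API. A lock guards the list, and a threading.Event replaces the global flag so the loop can stop without waiting out the sleep.

```python
import random
import threading


class ProxyPool:
    """Hypothetical helper: holds the current proxy list and refreshes it periodically."""

    def __init__(self, fetch, interval=15):
        self._fetch = fetch            # callable returning a list of "ip:port" strings
        self._interval = interval      # seconds between refreshes
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._proxies = list(fetch())  # initial load

    def pick(self):
        """Return one random proxy from the current list."""
        with self._lock:
            return random.choice(self._proxies)

    def refresh(self):
        """Fetch a new list; keep the old one if the fetch returns nothing."""
        new_list = list(self._fetch())
        if new_list:
            with self._lock:
                self._proxies = new_list

    def run(self):
        """Refresh loop; Event.wait returns True as soon as stop() is called."""
        while not self._stop.wait(self._interval):
            self.refresh()

    def stop(self):
        self._stop.set()


# Usage with a stand-in fetch function (a real one would call the proxy API)
batches = iter([["192.0.2.1:8000"], ["192.0.2.2:8000", "192.0.2.3:8000"]])
pool = ProxyPool(fetch=lambda: next(batches), interval=15)
first = pool.pick()
pool.refresh()
second = pool.pick()
print(first, second)
```

In a Scrapy extension, run would be the thread target started on engine_started and stop would be called on spider_closed.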

Python-feapder

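Usage notes

  1. Works for both http and https pages
  2. Requires python 3.6 or later
  3. feapder is not part of the python standard library and must be installed first: pip install feapder
  4. Run feapder create -s py3_feapder to create a lightweight spider

Code download: github, gitee

py3_feapder.py

  1. Add a download_midware method, the download middleware, to py3_feapder.py
  2. Replace the placeholders in the code: username is your user name, password is your password
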
import feapder


class Py3Feapder(feapder.AirSpider):
    def start_requests(self):
        yield feapder.Request("https://dev.kdlapi.com/testproxy")

    def download_midware(self, request):
        # Proxy API endpoint; fetches 1 proxy IP
        api_url = "https://dps.kdlapi.com/api/getdps/?secret_id=o1fjh1re9o28876h7c08&signature=xxxxx&num=1&pt=1&sep=1"

        # Read the proxy IP returned by the API
        proxy_ip = feapder.Request(api_url).get_response().text

        # Username/password authentication (private proxy / dedicated proxy)
        username = "username"
        password = "password"
        proxies = {
            "http": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip},
            "https": "http://%(user)s:%(pwd)s@%(proxy)s/" % {"user": username, "pwd": password, "proxy": proxy_ip}
        }

        # Whitelist authentication (the whitelist must be configured in advance)
        # proxies = {
        #     "http": "http://%(proxy)s/" % {"proxy": proxy_ip},
        #     "https": "http://%(proxy)s/" % {"proxy": proxy_ip}
        # }

        request.proxies = proxies
        return request

    def parse(self, request, response):
        print(response.text)


if __name__ == "__main__":
    Py3Feapder().start()
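
The Python samples above all build the same proxies dictionary by string formatting. A minimal sketch of a helper for that step follows; build_proxies is a hypothetical name, not something the samples or the libraries define. It strips whitespace from the "ip:port" text, on the assumption that an API response may carry some, and percent-encodes the credentials so that characters such as "@" or ":" in a password cannot break the proxy URL.

```python
from urllib.parse import quote


def build_proxies(proxy, username=None, password=None):
    """Return a requests-style proxies dict for an "ip:port" string."""
    proxy = proxy.strip()  # assumption: the raw text may carry surrounding whitespace
    if username and password:
        # Percent-encode so that "@" or ":" inside the credentials stay unambiguous
        auth = "%s:%s@" % (quote(username, safe=""), quote(password, safe=""))
    else:
        auth = ""  # whitelist mode: no credentials in the URL
    url = "http://%s%s/" % (auth, proxy)
    return {"http": url, "https": url}


proxies = build_proxies("59.38.241.25:23916\n", "username", "p@ss:word")
whitelisted = build_proxies("59.38.241.25:23916")
print(proxies["https"])      # http://username:p%40ss%3Aword@59.38.241.25:23916/
print(whitelisted["https"])  # http://59.38.241.25:23916/
```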

Java

okhttp3

okhttp-3.8.1

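Usage notes

  1. This sample works for both http and https pages
  2. With username/password authentication, the http client does not send each request twice to authenticate, so it performs the same as whitelist access
  3. Username/password authentication requires overriding the authenticate method of Authenticator
  4. Add the dependency
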
import okhttp3.*;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;

public class TestProxyOKHttpClient {
    public static void main(String args[]) throws IOException {
        // 目标网站
        String targetUrl = "https://dev.kdlapi.com/testproxy";

        // 用户名密码认证(私密代理/独享代理)
        final String username = "username";
        final String password = "password";

        String ip = "59.38.241.25";   // 代理服务器IP
        int port = 23916;

        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(ip, port));

        Authenticator authenticator = new Authenticator() {
            @Override
            public Request authenticate(Route route, Response response) throws IOException {
                String credential = Credentials.basic(username, password);
                return response.request().newBuilder()
                        .header("Proxy-Authorization", credential)
                        .build();
            }
        };
        OkHttpClient client = new OkHttpClient.Builder()
                .proxy(proxy)
                .proxyAuthenticator(authenticator)
                .build();

        Request request = new Request.Builder()
                .url(targetUrl)
                .addHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3100.0 Safari/537.36")
                .build();

        Response response = client.newCall(request).execute();
        System.out.println(response.body().string());

    }
}

httpclient

HttpClient-4.5.6

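Usage notes

  1. This sample works for both http and https pages
  2. With username/password authentication, httpclient sends each request twice in order to authenticate, which increases request latency; whitelist access is recommended
  3. If you authenticate with more than one username/password pair, add AuthCacheValue.setAuthCache(new AuthCacheImpl()); to your code
  4. Dependencies to download:
    httpclient-4.5.6.jar
    httpcore-4.4.10.jar
    commons-codec-1.10.jar
    commons-logging-1.2.jar
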
import java.net.URL;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

/**
* 使用httpclient请求代理服务器 请求http和https网页均适用
*/
public class TestProxyHttpClient {

    private static String pageUrl = "https://dev.kdlapi.com/testproxy"; // 要访问的目标网页
    private static String proxyIp = "59.38.241.25"; // 代理服务器IP
    private static int proxyPort = 23916; // 端口号
    // 用户名密码认证(私密代理/独享代理)
    private static String username = "username";
    private static String password = "password";

    public static void main(String[] args) throws Exception {
        // JDK 8u111版本后,目标页面为HTTPS协议,启用proxy用户密码鉴权
        System.setProperty("jdk.http.auth.tunneling.disabledSchemes", "");

        CredentialsProvider credsProvider = new BasicCredentialsProvider();
        credsProvider.setCredentials(new AuthScope(proxyIp, proxyPort),
                new UsernamePasswordCredentials(username, password));
        CloseableHttpClient httpclient = HttpClients.custom().setDefaultCredentialsProvider(credsProvider).build();
        try {
            URL url = new URL(pageUrl);
            HttpHost target = new HttpHost(url.getHost(), url.getDefaultPort(), url.getProtocol());
            HttpHost proxy = new HttpHost(proxyIp, proxyPort);

            /*
            httpclient各个版本设置超时都略有不同, 此处对应版本4.5.6
            setConnectTimeout:设置连接超时时间
            setConnectionRequestTimeout:设置从connect Manager获取Connection 超时时间
            setSocketTimeout:请求获取数据的超时时间
            */
            RequestConfig config = RequestConfig.custom().setProxy(proxy).setConnectTimeout(6000)
                    .setConnectionRequestTimeout(2000).setSocketTimeout(6000).build();
            HttpGet httpget = new HttpGet(url.getPath());
            httpget.setConfig(config);
            httpget.addHeader("Accept-Encoding", "gzip"); // 使用gzip压缩传输数据让访问更快
            httpget.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36");
            CloseableHttpResponse response = httpclient.execute(target, httpget);
            try {
                System.out.println(response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }
    }
}

jsoup

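Sending requests with jsoup

Usage notes

  1. This sample works for both http and https pages
  2. With username/password authentication, each request is sent twice in order to authenticate, which increases request latency; whitelist access is recommended
  3. If you authenticate with more than one username/password pair, add AuthCacheValue.setAuthCache(new AuthCacheImpl()); to your code
  4. Dependency to download:
    jsoup-1.13.1.jar
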
import java.io.IOException;
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class TestProxyJsoup {
    // 用户名密码认证(私密代理/独享代理)
    final static String ProxyUser = "username";
    final static String ProxyPass = "password";

    // 代理IP、端口号
    final static String ProxyHost = "59.38.241.25";
    final static Integer ProxyPort = 23916;

    public static String getUrlProxyContent(String url) {
        Authenticator.setDefault(new Authenticator() {
            public PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication(ProxyUser, ProxyPass.toCharArray());
            }
        });

        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(ProxyHost, ProxyPort));

        try {
            // 此处自己处理异常、其他参数等
            Document doc = Jsoup.connect(url).followRedirects(false).timeout(3000).proxy(proxy).get();
            if (doc != null) {
                System.out.println(doc.body().html());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        return null;
    }

    public static void main(String[] args) throws Exception {
        // 目标网站
        String targetUrl = "https://dev.kdlapi.com/testproxy";

        // JDK 8u111版本后,目标页面为HTTPS协议,启用proxy用户密码鉴权
        System.setProperty("jdk.http.auth.tunneling.disabledSchemes", "");

        getUrlProxyContent(targetUrl);
    }
}    

hutool

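Sending requests with hutool

Usage notes

  1. This sample works for both http and https pages
  2. With username/password authentication, each request is sent twice in order to authenticate, which increases request latency; whitelist access is recommended
  3. Dependency to download:
    hutool-all-5.5.4.jar
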
import java.net.Authenticator;
import java.net.PasswordAuthentication;
import cn.hutool.http.HttpResponse;
import cn.hutool.http.HttpRequest;

// 代理验证信息
class ProxyAuthenticator extends Authenticator {
    private String user, password;

    public ProxyAuthenticator(String user, String password) {
        this.user     = user;
        this.password = password;
    }

    protected PasswordAuthentication getPasswordAuthentication() {
        return new PasswordAuthentication(user, password.toCharArray());
    }
}


public class TestProxyHutool {
    // 用户名密码认证(私密代理/独享代理)
    final static String ProxyUser = "username";
    final static String ProxyPass = "password";

    // 代理IP、端口号
    final static String ProxyHost = "59.38.241.25";
    final static Integer ProxyPort = 23916;

    public static void main(String[] args) {
        // 目标网站
        String url = "https://dev.kdlapi.com/testproxy";
        // JDK 8u111版本后,目标页面为HTTPS协议,启用proxy用户密码鉴权
        System.setProperty("jdk.http.auth.tunneling.disabledSchemes", "");
        // 设置请求验证信息
        Authenticator.setDefault(new ProxyAuthenticator(ProxyUser, ProxyPass));

        // 发送请求
        HttpResponse result = HttpRequest.get(url)
                .setHttpProxy(ProxyHost, ProxyPort)
                .timeout(20000)//设置超时,毫秒
                .execute();
        System.out.println(result.body());
    }
}

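selenium-java

Usage notes

  1. This sample authenticates selenium-java to the proxy through the IP whitelist
  2. Download chromedriver (its version must match your Chrome version)
  3. Dependency to download: selenium-java-4.1.2.jar
  4. Replace the placeholders in the code:
    ${ip:port}: proxy IP and port, e.g. "59.38.241.25:23916"
    ${chromedriver_path}: path to chromedriver on your machine, e.g. "C:\chromedriver.exe"

Code download: github, gitee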

import org.openqa.selenium.By;
import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;


public class TestProxySelenium {
    public static void main(String[] args) throws InterruptedException {
        // 目标网站
        String targetUrl = "https://dev.kdlapi.com/testproxy";

        // 代理ip: port
        String proxyServer = "59.38.241.25:23916";

        // 创建webdriver驱动,设置代理
        System.setProperty("webdriver.chrome.driver", "${chromedriver_path}"); // webdriver驱动路径
        Proxy proxy = new Proxy().setHttpProxy(proxyServer).setSslProxy(proxyServer);
        ChromeOptions options = new ChromeOptions();
        options.setProxy(proxy);
        WebDriver driver = new ChromeDriver(options);

        // 发起请求
        driver.get(targetUrl);
        WebElement element = driver.findElement(By.xpath("/html"));
        String resText = element.getText().toString();
        System.out.println(resText);
        Thread.sleep(3000);

        // 关闭webdriver
        driver.quit();
    }
}

RestTemplate

Usage notes

  1. This sample works for both http and https pages
  2. With username/password authentication, httpclient sends each request twice in order to authenticate, which increases request latency; whitelist access is recommended
  3. Dependencies to download:
    httpclient-4.5.6.jar
    httpcore-4.4.10.jar
    commons-codec-1.10.jar
    commons-logging-1.2.jar
    spring-web-5.2.24.jar
    spring-beans-5.2.24.jar
    spring-core-5.2.24.jar
    spring-jcl-5.2.24.jar

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.client.ProxyAuthenticationStrategy;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

public class TestProxyRestTemplate {
    // 目标网站
    private static String pageUrl = "https://dev.kdlapi.com/testproxy";
    // 代理服务器IP、端口号
    private static String proxyHost = "59.38.241.25";
    private static Integer proxyPort = 23916;
    // 用户名密码认证(私密代理/独享代理)
    private static String ProxyUser = "username";
    private static String Proxypass = "password";

    public static void main(String[] args) {
        CredentialsProvider credsProvider = new BasicCredentialsProvider();
        credsProvider.setCredentials( 
            new AuthScope(proxyHost, proxyPort), 
            new UsernamePasswordCredentials(ProxyUser, Proxypass)
        );

        HttpHost proxy = new HttpHost(proxyHost, proxyPort);
        HttpClientBuilder clientBuilder = HttpClientBuilder.create();
        clientBuilder.useSystemProperties();
        clientBuilder.setProxy(proxy);
        clientBuilder.setDefaultCredentialsProvider(credsProvider);
        clientBuilder.setProxyAuthenticationStrategy(new ProxyAuthenticationStrategy());

        CloseableHttpClient client = clientBuilder.build();
        HttpComponentsClientHttpRequestFactory factory = new HttpComponentsClientHttpRequestFactory();
        factory.setHttpClient(client);

        RestTemplate restTemplate = new RestTemplate();
        restTemplate.setRequestFactory(factory);
        String result = restTemplate.getForObject(pageUrl, String.class);
        System.out.println(result);
    }
}

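playwright

Usage notes

  1. This sample authenticates playwright to the proxy through the IP whitelist
  2. Add the dependency to pom.xml
  3. Replace the placeholders in the code:
    ${ip:port}: proxy IP and port, e.g. "59.38.241.25:23916"

<!-- Add the playwright dependency to pom.xml -->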
<dependencies>
    <dependency>
        <groupId>com.microsoft.playwright</groupId>
        <artifactId>playwright</artifactId>
        <version>1.35.0</version>
    </dependency>
</dependencies>
package org.example;
import com.microsoft.playwright.*;
import com.microsoft.playwright.options.Proxy;

public class App {
    // Target page
    private static String pageUrl = "https://dev.kdlapi.com/testproxy";
    // Username/password authentication (private proxy / dedicated proxy); not needed when your IP is whitelisted
    private static String ProxyUser = "username";
    private static String Proxypass = "password";

    public static void main(String[] args) {
        try (Playwright playwright = Playwright.create()) {
            Browser browser = playwright.chromium().launch();
            // Proxy credentials travel with the proxy settings; setHttpCredentials is meant for the target site's own HTTP authentication
            BrowserContext context = browser.newContext(new Browser.NewContextOptions()
                                     .setProxy(new Proxy("http://ip:port")
                                             .setUsername(ProxyUser)
                                             .setPassword(Proxypass)));
            Page page = context.newPage();
            Response response = page.navigate(pageUrl);
            System.out.println("Response: " + response.text());
        }
    }
}

GoLang

Standard library

Usage notes

  • Works for both http and https pages

Code download: github, gitee

// Request through the proxy server
// Works for both http and https pages

package main

import (
    "compress/gzip"
    "fmt"
    "io"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

func main() {
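    // Username/password authentication (private proxy / dedicated proxy)
    username := "username"
    password := "password"

    // Proxy server
    proxy_raw := "59.38.241.25:23916"
    proxy_str := fmt.Sprintf("http://%s:%s@%s", username, password, proxy_raw)
    proxy, err := url.Parse(proxy_str)
    if err != nil {
        // The proxy URL could not be parsed
        fmt.Println(err.Error())
        return
    }

    // Target page
    page_url := "http://dev.kdlapi.com/testproxy"

    // Request the target page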
    client := &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(proxy)}}
    req, _ := http.NewRequest("GET", page_url, nil)
    req.Header.Add("Accept-Encoding", "gzip") //使用gzip压缩传输数据让访问更快
    res, err := client.Do(req)

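    if err != nil {
        // The request failed
        fmt.Println(err.Error())
    } else {
        defer res.Body.Close() // make sure the Body is closed at the end

        fmt.Println("status code:", res.StatusCode) // print the status code

        // If the response is gzip-compressed, decompress it while reading
        if res.Header.Get("Content-Encoding") == "gzip" {
            reader, err := gzip.NewReader(res.Body) // gzip decompression
            if err != nil {
                fmt.Println(err.Error())
                return
            }
            defer reader.Close()
            io.Copy(os.Stdout, reader)
            return // return rather than os.Exit so that the deferred Close calls run
        }

        // Not gzip-compressed: read the body directly
        body, _ := ioutil.ReadAll(res.Body)
        fmt.Println(string(body))
    }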
}
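
Several samples in this document request gzip with Accept-Encoding and then have to decide whether the body actually needs decompressing. A minimal Python sketch of that check follows; decode_body is a hypothetical helper name, and the body is assumed to be UTF-8 text. It decompresses only when the response's Content-Encoding header says gzip.

```python
import gzip


def decode_body(headers, body):
    """Return the response body as text, gunzipping only when the server says so."""
    encoding = headers.get("Content-Encoding", "").lower()
    if "gzip" in encoding:
        body = gzip.decompress(body)
    return body.decode("utf-8")  # assumption: the page is UTF-8 encoded


plain = decode_body({}, b"hello")
packed = decode_body({"Content-Encoding": "gzip"}, gzip.compress(b"hello"))
print(plain, packed)
```

Checking the header first avoids a decompression error when the server ignores Accept-Encoding and replies uncompressed.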

CSharp

Standard library

Usage notes

  • Works for both http and https pages

Code download: github, gitee

using System;
using System.Text;
using System.Net;
using System.IO;
using System.IO.Compression;

namespace csharp_http
{
    class Program
    {
        static void Main(string[] args)
        {
            // 要访问的目标网页
            string page_url = "http://dev.kdlapi.com/testproxy";

            // 构造请求
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(page_url);
            request.Method = "GET";
            request.Headers.Add("Accept-Encoding", "Gzip");  // 使用gzip压缩传输数据让访问更快

            // 代理服务器
            string proxy_ip = "59.38.241.25";
            int proxy_port = 23916;

            // 用户名密码认证(私密代理/独享代理)
            string username = "username";
            string password = "password";

            // 设置代理 (开放代理或私密/独享代理&已添加白名单)
            // request.Proxy = new WebProxy(proxy_ip, proxy_port);

            // 设置代理 (私密/独享代理&未添加白名单)
            WebProxy proxy = new WebProxy();
            proxy.Address = new Uri(String.Format("http://{0}:{1}", proxy_ip, proxy_port));
            proxy.Credentials = new NetworkCredential(username, password);
            request.Proxy = proxy;

            // 请求目标网页
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();

            Console.WriteLine((int)response.StatusCode);  // 获取状态码
            // 解压缩读取返回内容
            using (StreamReader reader =  new StreamReader(new GZipStream(response.GetResponseStream(), CompressionMode.Decompress))) {
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}

Node.js

Standard library (http+url)

Usage notes

  • Works for both http and https

Code download: github, gitee

const http = require("http");  // 引入内置http模块
const url  = require("url");



// 要访问的目标页面
const targetUrl = "http://dev.kdlapi.com/testproxy";
const urlParsed   = url.parse(targetUrl);

// Proxy server
const proxyIp = "proxyIp";  // proxy server IP, e.g. "59.38.241.25"
const proxyPort = "proxyPort"; // proxy server port, e.g. 23916

// Username/password authentication (private proxy / dedicated proxy)
const username = "username";
const password = "password";
const base64    = Buffer.from(username + ":" + password).toString("base64");
const options = {
    host    : proxyIp,
    port    : proxyPort,
    path    : targetUrl,
    method  : "GET",
    headers : {
        "Host"                : urlParsed.hostname,
        "Proxy-Authorization" : "Basic " + base64
    }
};

http.request(options,  (res) => {
        console.log("got response: " + res.statusCode);
        // 输出返回内容(使用了gzip压缩)
        if (res.headers['content-encoding'] && res.headers['content-encoding'].indexOf('gzip') != -1) {
            let zlib = require('zlib');
            let unzip = zlib.createGunzip();
            res.pipe(unzip).pipe(process.stdout);
        } else {
            // 输出返回内容(未使用gzip压缩)
            res.pipe(process.stdout);
        }
    })
    .on("error", (err) => {
        console.log(err);
    })
    .end()
;
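
The Proxy-Authorization header used above is plain HTTP Basic authentication: the text "username:password" encoded as base64 and prefixed with "Basic ". A small Python sketch of the same computation, for illustration only (proxy_auth_header is a hypothetical helper name):

```python
import base64


def proxy_auth_header(username, password):
    """Build the value of a Proxy-Authorization header for Basic authentication."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = proxy_auth_header("username", "password")
print(header)  # Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

Base64 is an encoding, not encryption, so anyone who can read the request to the proxy can read these credentials.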

Standard library (http+tls+util)

Usage notes

  • Works for both http and https pages

Code download: github, gitee

let http = require('http'); // 引入内置http模块
let tls = require('tls'); // 引入内置tls模块
let util = require('util');

// Username/password authentication (private proxy / dedicated proxy)
const username = 'username';
const password = 'password';
const auth = 'Basic ' + Buffer.from(username + ':' + password).toString('base64');

// Proxy server IP and port
let proxy_ip = '59.38.241.25';
let proxy_port = 23916;

// Target host (host name only, without scheme or path) and the path to request
let remote_host = 'dev.kdlapi.com';
let remote_path = '/testproxy';

// Send the CONNECT request
let req = http.request({
    host: proxy_ip,
    port: proxy_port,
    method: 'CONNECT',
    path: util.format('%s:443', remote_host),
    headers: {
        "Host": remote_host,
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3100.0 Safari/537.36",
        "Proxy-Authorization": auth,
        "Accept-Encoding": "gzip"   // use gzip compression to speed up the transfer
    }
});



req.on('connect', function (res, socket, head) {
    // TLS handshake over the tunnel; servername sets the SNI host name explicitly
    let tlsConnection = tls.connect({
        host: remote_host,
        servername: remote_host,
        socket: socket
    }, function () {
        // Send the GET request
        tlsConnection.write(util.format('GET %s HTTP/1.1\r\nHost: %s\r\n\r\n', remote_path, remote_host));
    });

    tlsConnection.on('data', function (data) {
        // Print the response (the complete raw response message)
        console.log(data.toString());
    });
});

req.end();
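
The sample above reaches an https page in two stages: a CONNECT request asks the proxy to open a tunnel to host:443, and the TLS handshake starts only after the proxy answers with a 2xx status. The Python sketch below shows just the message handling of the first stage. Both function names are hypothetical, and no network connection is made.

```python
def build_connect_request(host, port, proxy_auth=None):
    """Return the raw bytes of a CONNECT request for host:port."""
    lines = [
        "CONNECT %s:%d HTTP/1.1" % (host, port),
        "Host: %s:%d" % (host, port),
    ]
    if proxy_auth:
        lines.append("Proxy-Authorization: %s" % proxy_auth)
    # A blank line terminates the header section
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def tunnel_established(response_head):
    """True if the proxy's reply to CONNECT carries a 2xx status code."""
    status_line = response_head.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    return len(parts) >= 2 and parts[1].startswith("2")


request = build_connect_request("dev.kdlapi.com", 443, "Basic dXNlcm5hbWU6cGFzc3dvcmQ=")
print(request.decode("ascii"))
ok = tunnel_established(b"HTTP/1.1 200 Connection established\r\n\r\n")
denied = tunnel_established(b"HTTP/1.1 407 Proxy Authentication Required\r\n\r\n")
print(ok, denied)
```

A 407 reply means the proxy rejected or did not receive the credentials, so the TLS handshake should not be attempted.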

request

Usage notes

  • Install the request library first: npm install request
  • Works for both http and https pages

Code download: github, gitee

let request = require('request'); // 引入第三方request库
let util = require('util');
let zlib = require('zlib');

// 用户名密码认证(私密代理/独享代理)
const username = 'username';
const password = 'password';

// 要访问的目标地址
let page_url = 'https://dev.kdlapi.com/testproxy'

// 代理服务器ip和端口
let proxy_ip = '59.38.241.25';
let proxy_port = 23916;

// 完整代理服务器url
let proxy = util.format('http://%s:%s@%s:%d', username, password, proxy_ip, proxy_port);  

// 发起请求
request({
    url: page_url,
    method: 'GET',
    proxy: proxy,
    headers: {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3100.0 Safari/537.36",
        "Accept-Encoding": "gzip"   // 使用gzip压缩让数据传输更快
    },
    encoding: null,  // 方便解压缩返回的数据
}, function(error, res, body) {
    if (!error && res.statusCode == 200) {
        // 输出返回内容(使用了gzip压缩)
        if (res.headers['content-encoding'] && res.headers['content-encoding'].indexOf('gzip') != -1) {
            zlib.gunzip(body, function(err, dezipped) {
                console.log(dezipped.toString()); 
            });
        } else {
            // Print the response body (not gzip-compressed); body is a Buffer because encoding is null
            console.log(body.toString());
        }
    } else {
        console.log(error);
    }
});

puppeteer

puppeteer (IP whitelist)

Usage notes

  • Puppeteer through an http/https proxy with IP whitelist authentication
  • Requires node 7.6.0 or later plus puppeteer
  • Install puppeteer first: npm i puppeteer

Code download: github, gitee

// 引入puppeteer模块
const puppeteer = require('puppeteer');

// 要访问的目标网页
const url = 'http://dev.kuaidaili.com/testproxy';

// 添加headers
const headers = {
    'Accept-Encoding': 'gzip' // 使用gzip压缩让数据传输更快
};

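// Proxy server IP and port
// The trailing semicolons matter: without them the "(async ..." expression below is parsed as a call on 23916
let proxy_ip = '59.38.241.25';
let proxy_port = 23916;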

(async ()=> {
    // 新建一个浏览器实例
    const browser = await puppeteer.launch({
        headless: false,  // 是否不显示窗口, 默认为true, 设为false便于调试
        args: [
            `--proxy-server=${proxy_ip}:${proxy_port}`,
            '--no-sandbox',
            '--disable-setuid-sandbox'
        ]
    });

    // 打开一个新页面
    const page = await browser.newPage();

    // 设置headers
    await page.setExtraHTTPHeaders(headers);

    // 访问目标网页
    await page.goto(url);

})();
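
puppeteer (username/password authentication)

Usage notes

  • Puppeteer through an http/https proxy with username/password authentication
  • Requires node 7.6.0 or later plus puppeteer
  • Install puppeteer first: npm i puppeteer

Code download: github, gitee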

// 引入puppeteer模块
const puppeteer = require('puppeteer');

// 要访问的目标网页
const url = 'http://dev.kuaidaili.com/testproxy';

// 添加headers
const headers = {
    'Accept-Encoding': 'gzip' // 使用gzip压缩让数据传输更快
};

// 代理服务器ip和端口
let proxy_ip = '223.198.230.41'
let proxy_port = 19732

// 用户名密码认证(私密代理/独享代理)
const username = 'username';
const password = 'password';

(async ()=> {
    // 新建一个浏览器实例
    const browser = await puppeteer.launch({
        headless: false,  // 是否不显示窗口, 默认为true, 设为false便于调试
        args: [
            `--proxy-server=${proxy_ip}:${proxy_port}`,
            '--no-sandbox',
            '--disable-setuid-sandbox'
        ]
    });

    // 打开一个新页面
    const page = await browser.newPage();

    // 设置headers
    await page.setExtraHTTPHeaders(headers);

    // Username/password authentication
    await page.authenticate({username: username, password: password});

    // 访问目标网页
    await page.goto(url);
})();

axios

Usage notes

  • Install axios and https-proxy-agent first: npm install axios https-proxy-agent

const axios = require('axios');
// https-proxy-agent 6.0.0 及以上版本
const { HttpsProxyAgent } = require("https-proxy-agent");
// https-proxy-agent 6.0.0 以下版本
// const HttpsProxyAgent = require("https-proxy-agent");

// 代理ip和代理端口
let proxyIp = '59.38.241.25'
let proxyPort = 23916

// 配置用户名和密码
let username = 'username'
let password = 'password'

axios({
    url: 'https://dev.kdlapi.com/testproxy',
    method: "get",
    httpAgent: new HttpsProxyAgent(`http://${username}:${password}@${proxyIp}:${proxyPort}`),
    httpsAgent: new HttpsProxyAgent(`http://${username}:${password}@${proxyIp}:${proxyPort}`),
}).then(
    res => {
        console.log(res.data);
    }
).catch(err => {
    console.log(err);
})

websocket

Usage notes

  • Install ws and https-proxy-agent first: npm install ws https-proxy-agent

const WebSocket = require('ws');
// https-proxy-agent 6.0.0 及以上版本
const { HttpsProxyAgent } = require("https-proxy-agent");
// https-proxy-agent 6.0.0 以下版本
// const HttpsProxyAgent = require("https-proxy-agent");

// 代理ip和代理端口
let proxyIp = '59.38.241.25'
let proxyPort = 23916

// 配置用户名和密码
let username = 'username'
let password = 'password'

const target = 'ws://echo.websocket.events/';
const agent = new HttpsProxyAgent(`http://${username}:${password}@${proxyIp}:${proxyPort}`);
const socket = new WebSocket(target, {agent});

socket.on('open', function () {
    console.log('"open" event!');
    socket.send('hello world');
});

socket.on('message', function (data, flags) {
    console.log('"message" event!', data, flags);
    socket.close();
});

playwright

Usage notes

  • Install the playwright library first: npm install playwright

const https = require('https');
const { URL } = require('url');
const { chromium } = require('playwright');


// 发送https请求
function sendRequest(url, options = {}) {
    return new Promise((resolve, reject) => {
        const req = https.request(url, options, (res) => {
          let data = '';
          res.on('data', (chunk) => {
            data += chunk;
          });
          res.on('end', () => {
            resolve(data);
          });
        });

        req.on('error', (error) => {
          reject(error);
        });

        req.end();
    });
}


// 获取代理
function get_proxy() {
    return new Promise((resolve, reject) => {
        const api = 'https://dps.kdlapi.com/api/getdps/';
        const params = {
          'num': 1,
          'pt': 1,
          'sep': 1,
          'secret_id': 'your secret_id',
          'signature': 'your signature',
        };
        let url = new URL(api);
        url.search = new URLSearchParams(params).toString();
        sendRequest(url)
          .then((data) => {
            resolve(data);
          })
          .catch((error) => {
            reject(error);
          });
        });
}


// Launch playwright through the proxy
async function main() {
    const proxyServer = await get_proxy();
    if(!proxyServer){
        console.log('Failed to fetch a proxy');
        return;
    }
    console.log('Fetched proxy:', proxyServer);
    // Other browsers can be launched the same way (firefox and webkit must be added to the require above):
    // const browser = await chromium.launch({ channel: 'msedge', headless: true });  // Microsoft Edge
    // const browser = await firefox.launch({ headless: true });                      // Mozilla Firefox
    // const browser = await webkit.launch({ headless: true });                       // WebKit (Apple Safari)
    const browser = await chromium.launch({
        proxy: {
            server: `http://${proxyServer}`,
        }
    });
    const page = await browser.newPage();
    await page.goto('https://dev.kdlapi.com/testproxy');
    const content = await page.content();
    console.log(content);
    await browser.close();
}

main().catch((error) => {
    console.error(error);
});
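
The sample above passes whatever text the API returns straight to the browser as the proxy address. A more defensive variant would first check that the text really looks like "ip:port". The Python sketch below illustrates one way to do that check. parse_proxy is a hypothetical helper, and the assumption that a failed API call returns something other than "ip:port" text is not confirmed by this document.

```python
import re

# Four dot-separated numeric groups, a colon, then a numeric port
_PROXY_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$")


def parse_proxy(text):
    """Return (ip, port) if text looks like "ip:port", otherwise None."""
    match = _PROXY_RE.match(text.strip())
    if not match:
        return None
    ip, port = match.group(1), int(match.group(2))
    if any(int(octet) > 255 for octet in ip.split(".")) or not 0 < port < 65536:
        return None
    return ip, port


good = parse_proxy("59.38.241.25:23916\n")
bad = parse_proxy('{"code": -1, "msg": "error"}')  # made-up error text for illustration
print(good, bad)
```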

Ruby

net/http

net/http (IP whitelist)

Usage notes

  • net/http through an http/https proxy with IP whitelist authentication

Code download: github, gitee

# -*- coding: utf-8 -*-

require 'net/http'  # 引入内置net/http模块
require 'zlib'
require 'stringio'

# 代理服务器ip 和 端口
proxy_ip = '59.38.241.25'
proxy_port = 23916



# 要访问的目标网页, 以快代理testproxy页面为例
page_url = "https://dev.kuaidaili.com/testproxy"
uri = URI(page_url)

# 新建代理实例
proxy = Net::HTTP::Proxy(proxy_ip, proxy_port)

# 创建新的请求对象 
req = Net::HTTP::Get.new(uri)
# 设置User-Agent
req['User-Agent'] = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50'
req['Accept-Encoding'] = 'gzip'  # 使用gzip压缩传输数据让访问更快



# 使用代理发起请求, 若访问的是http网页, 请将use_ssl设为false
res = proxy.start(uri.hostname, uri.port, :use_ssl => true) do |http|
    http.request(req)
end

# 输出状态码
puts "status code: #{res.code}"

# 输出响应体
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end

net/http (username/password authentication)

Usage notes

  • net/http through an http/https proxy with username/password authentication

Code download: github, gitee

# -*- coding: utf-8 -*-

require 'net/http'  # 引入内置net/http模块
require 'zlib'
require 'stringio'

# Proxy server IP and port
proxy_ip = '59.38.241.25'
proxy_port = 23916

# 用户名密码认证(私密代理/独享代理)
username = 'username'
password = 'password'

# 要访问的目标网页, 以快代理testproxy页面为例
page_url = "https://dev.kuaidaili.com/testproxy"
uri = URI(page_url)

# 新建代理实例
proxy = Net::HTTP::Proxy(proxy_ip, proxy_port, username, password)

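# Create the request object
req = Net::HTTP::Get.new(uri)
# The proxy credentials are already passed to Net::HTTP::Proxy above; calling req.basic_auth here
# would instead send them to the target site in an Authorization header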
# 设置User-Agent
req['User-Agent'] = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50'
req['Accept-Encoding'] = 'gzip'  # 使用gzip压缩传输数据让访问更快



# 使用代理发起请求, 若访问的是http网页, 请将use_ssl设为false
res = proxy.start(uri.hostname, uri.port, :use_ssl => true) do |http|
    http.request(req)
end

# 输出状态码
puts "status code: #{res.code}"

# 输出响应体
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end

httparty

httparty (IP whitelist)

Usage notes

  • httparty through an http/https proxy with IP whitelist authentication

Code download: github, gitee

require "httparty"  # 引入httparty模块
require 'zlib'
require 'stringio'

# 代理服务器ip和端口
proxy_ip = '59.38.241.25'
proxy_port = 23916

# 要访问的目标网页, 以快代理testproxy页面为例
page_url = 'https://dev.kuaidaili.com/testproxy'

# 设置headers
headers = {
    "User-Agent" => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Accept-Encoding" => "gzip",
}

# 设置代理
options = {
    :headers => headers, 
    :http_proxyaddr => proxy_ip, 
    :http_proxyport => proxy_port,
}

# 发起请求
res = HTTParty.get(page_url, options)

# 输出状态码
puts "status code: #{res.code}"

# 输出响应体
if  res.code.to_i != 200 then
    puts "page content: #{res.body}"
else
    gz = Zlib::GzipReader.new(StringIO.new(res.body.to_s))
    puts "page content: #{gz.read}" 
end

httparty (username/password authentication)

Usage notes

  • httparty through an http/https proxy with username/password authentication

Code download: github, gitee

require "httparty"  # 引入httparty模块
require 'zlib'
require 'stringio'

# 代理服务器ip和端口
proxy_ip = '59.38.241.25'
proxy_port = 23916

# 用户名密码认证(私密代理/独享代理)
username = 'username'
password = 'password'

# 要访问的目标网页,以快代理testproxy页面为例
page_url = 'https://dev.kuaidaili.com/testproxy'

# 设置headers
headers = {
    "User-Agent" => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; en-us) AppleWebKit/534.50 (KHTML, like Gecko) Version/5.1 Safari/534.50",
    "Accept-Encoding" => "gzip",
}

# Request options: headers, proxy, and proxy credentials
options = {
    :headers => headers,
    :http_proxyaddr => proxy_ip,
    :http_proxyport => proxy_port,
    :http_proxyuser => username,
    :http_proxypass => password,
}

# Send the request
res = HTTParty.get(page_url, options)

# Print the status code
puts "status code: #{res.code}"

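# Print the response body. Depending on the HTTParty version the body may
# already be decompressed, so gunzip only if it starts with the gzip magic bytes.
body = res.body.to_s
if body.getbyte(0) == 0x1f && body.getbyte(1) == 0x8b
    gz = Zlib::GzipReader.new(StringIO.new(body))
    puts "page content: #{gz.read}"
else
    puts "page content: #{body}"
end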
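
The two httparty samples differ only in whether the `:http_proxyuser` and `:http_proxypass` keys are present. A minimal sketch of a helper that builds either variant from a single "ip:port" string is shown here; the helper name `proxy_options` and the "ip:port" input format are assumptions made for illustration, not part of the samples. It only builds the options hash, so it runs without the httparty gem installed.

```ruby
# Hypothetical helper: turn an "ip:port" string into HTTParty proxy options.
# Pass username and password for username/password authentication,
# or omit them for IP whitelist authentication.
def proxy_options(proxy, username = nil, password = nil)
  ip, port = proxy.strip.split(':', 2)
  if ip.to_s.empty? || port.to_s.empty?
    raise ArgumentError, "expected ip:port, got #{proxy.inspect}"
  end

  options = { :http_proxyaddr => ip, :http_proxyport => Integer(port) }
  if username && password
    options[:http_proxyuser] = username
    options[:http_proxypass] = password
  end
  options
end

# IP whitelist variant
p proxy_options("59.38.241.25:23916")
# Username/password variant
p proxy_options("59.38.241.25:23916", "username", "password")
```

With httparty loaded, the result could be merged with the headers before the request, for example `HTTParty.get(page_url, proxy_options(...).merge(:headers => headers))`.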

php

curl

curl

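Usage notes

  1. This sample supports both http and https pages
  2. The curl extension is not always bundled with PHP and may need to be installed first:
    Ubuntu/Debian: apt-get install php5-curl (on PHP 7 and later the package is named php-curl)
    CentOS: yum install php-curl

Code download: github, gitee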

<?php
// Target page to request
$page_url = "http://dev.kdlapi.com/testproxy";

// Proxy server IP and port
$proxy_ip = "59.38.241.25";
$proxy_port = "23916";
$proxy = $proxy_ip.":".$proxy_port;

// Username/password authentication (private proxy / dedicated proxy)
$username   = "username";
$password   = "password";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $page_url);

// Send a POST request (setting CURLOPT_POSTFIELDS switches the request method to POST)
$requestData = array("post" => "send post request");
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($requestData));

// Skip TLS certificate and host name verification
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);

// Set the proxy
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
// Set the proxy username and password
curl_setopt($ch, CURLOPT_PROXYAUTH, CURLAUTH_BASIC);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "{$username}:{$password}");

// Custom headers (CURLOPT_HTTPHEADER expects a list of "Name: value" strings)
$headers = array();
$headers[] = 'User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)';
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

// Custom cookie
curl_setopt($ch, CURLOPT_COOKIE, '');

curl_setopt($ch, CURLOPT_ENCODING, 'gzip'); // request gzip-compressed responses to speed up the transfer

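// Connect timeout and total timeout, in seconds
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);

// Return the response, including its headers, as a string instead of printing it
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);

echo "$result"; // when the script is run by requesting it as a web page, the variable must be quoted to be printed
echo "\n\nfetch ".$info['url']."\ntimeuse: ".$info['total_time']."s\n\n";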
?>

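易语言 (Easy Programming Language)

Using a proxy with 易语言

Usage notes

  1. Two modules are required: the 精易 module and the 鱼刺 class library

Code download: github, gitee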
