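Browser Requests
The requests module provides HTTP request functionality within the browser context, enabling seamless API calls that inherit the browser's session state, cookies, and authentication.
Overview
The browser requests module provides a requests-like interface for making HTTP calls directly in the browser's JavaScript context. This approach offers several advantages over traditional HTTP libraries:
- Session inheritance: cookies, authentication, and CORS are handled automatically
- Browser context: requests execute in the same security context as the page
- No session management: removes the need to transfer cookies and tokens between automation and API calls
- SPA compatibility: well suited to single-page applications with complex authentication flows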
Request Class
The main interface for making HTTP requests in the browser context.
pydoll.browser.requests.request.Request
High-level interface for making HTTP requests using the browser's fetch API.
This class provides a requests-like interface that executes HTTP requests in the browser's JavaScript context. All requests inherit the browser's current session state including cookies, authentication headers, and other automatic browser behaviors. This allows for seamless interaction with websites that require authentication or have complex cookie management.
Key Features:
- Executes requests in the browser's JavaScript context using fetch API
- Automatically includes browser cookies and session state
- Preserves browser's security context and CORS policies
- Captures both request and response headers for analysis
- Supports all standard HTTP methods (GET, POST, PUT, DELETE, etc.)
Note:
- Headers passed to methods are additional headers, not replacements
- Browser's automatic headers (User-Agent, Accept, etc.) are preserved
- Cookies are managed automatically by the browser
Initialize a new Request instance bound to a browser tab.
PARAMETER | DESCRIPTION |
---|---|
tab | The browser tab instance where requests will be executed. This tab provides the JavaScript execution context and maintains the browser's session state (cookies, authentication, etc.). |
request
async
Execute an HTTP request in the browser's JavaScript context.
This method uses the browser's fetch API to make requests, inheriting all browser session state including cookies, authentication, and security context. The request is executed as if made by the browser itself.
PARAMETER | DESCRIPTION |
---|---|
method | HTTP method (GET, POST, PUT, DELETE, etc.). Case insensitive. |
url | Target URL for the request. Can be relative or absolute. |
params | Query parameters to append to the URL. These are URL-encoded and merged with any existing query string in the URL. |
data | Request body data. Behavior depends on type: dict/list/tuple is URL-encoded as form data (application/x-www-form-urlencoded); str/bytes is sent as-is with no Content-Type modification. Mutually exclusive with the 'json' parameter. |
json | Data to be JSON-serialized as request body. Automatically sets Content-Type to application/json. Mutually exclusive with 'data'. |
headers | Additional headers to include. These are ADDED to browser's automatic headers, not replacements. Format: [{'name': 'X-Custom', 'value': 'value'}] |
**kwargs | Additional fetch API options (e.g., credentials, mode, cache). |
RETURNS | DESCRIPTION |
---|---|
Response | Response object containing status, headers, content, and cookies from both the request and response phases. |
RAISES | DESCRIPTION |
---|---|
HTTPError | If the request execution fails or network error occurs. |
Note
- Browser cookies are automatically included
- CORS policies are enforced by the browser
- Authentication headers are preserved from browser session
get
async
Execute a GET request for retrieving data.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL to retrieve data from. |
params | Query parameters to append to URL. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object with retrieved data. |
post
async
Execute a POST request for creating or submitting data.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL for data submission. |
data | Form data to submit (URL-encoded). |
json | JSON data to submit. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object with server's response to the submission. |
put
async
Execute a PUT request for updating/replacing resources.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL of resource to update. |
data | Form data for the update. |
json | JSON data for the update. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object confirming the update operation. |
patch
async
Execute a PATCH request for partial resource updates.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL of resource to partially update. |
data | Form data with changes to apply. |
json | JSON data with changes to apply. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object confirming the partial update. |
delete
async
Execute a DELETE request for removing resources.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL of resource to delete. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object confirming the deletion. |
head
async
Execute a HEAD request to retrieve only response headers.
Useful for checking resource existence, size, or modification date without downloading the full content.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL to check headers for. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object with headers but no body content. |
options
async
Execute an OPTIONS request to check allowed methods and capabilities.
Used for CORS preflight checks and discovering server capabilities.
PARAMETER | DESCRIPTION |
---|---|
url | Target URL to check options for. |
**kwargs | Additional fetch options. |
RETURNS | DESCRIPTION |
---|---|
Response | Response object with allowed methods and CORS headers. |
_build_url_with_params
staticmethod
Build final URL with query parameters.
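The merging behavior described for `params` (URL-encode the new values and keep any query string already in the URL) can be sketched with the standard library. This is an illustrative stand-in, not the library's actual `_build_url_with_params`; the name `build_url_with_params` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_url_with_params(url, params=None):
    """Append URL-encoded params to url, keeping any existing query string."""
    if not params:
        return url
    parts = urlsplit(url)
    # Keep the pairs already present in the URL, then add the new ones.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend((str(key), str(value)) for key, value in params.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


print(build_url_with_params("/search?lang=en", {"q": "python docs", "limit": 10}))
# prints: /search?lang=en&q=python+docs&limit=10
```

Relative URLs pass through unchanged apart from the query string, which matches the note that `url` can be relative or absolute.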
_build_request_options
Build request options dictionary.
_add_request_body
Add request body and appropriate Content-Type header.
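The body rules documented for `data` and `json` can be summarized in a few lines. The sketch below is an assumption-laden illustration rather than the library's code: `build_body` is a hypothetical name, and raising `ValueError` when both arguments are given is one possible way to enforce the documented mutual exclusivity.

```python
import json as json_module
from urllib.parse import urlencode


def build_body(data=None, json=None):
    """Return (body, content_type) following the documented data/json rules."""
    if data is not None and json is not None:
        # Assumed behavior: the documentation only says the two are exclusive.
        raise ValueError("'data' and 'json' are mutually exclusive")
    if json is not None:
        return json_module.dumps(json), "application/json"
    if isinstance(data, (dict, list, tuple)):
        # Mappings and sequences of pairs become URL-encoded form data.
        return urlencode(data), "application/x-www-form-urlencoded"
    # str/bytes (or None) pass through with no Content-Type chosen here.
    return data, None


print(build_body(data={"username": "user", "password": "pass"}))
print(build_body(json={"name": "Jane"}))
```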
_execute_fetch_request
async
Execute the fetch request using browser's runtime.
_build_response
staticmethod
Build Response object from fetch result.
_register_callbacks
async
Register network event listeners to capture request/response metadata.
Sets up CDP event listeners to capture all network activity during the request execution. This includes both outgoing request data and incoming response data, which are used for header and cookie extraction.
Note
Network events are only enabled if not already active on the tab.
_clear_callbacks
async
Clean up network event listeners and disable network monitoring.
Removes all registered event callbacks and disables network events if they were enabled by this request instance.
_extract_received_headers
Extract headers from response network events.
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of headers received from the server during response. |
_extract_sent_headers
Extract headers from request network events.
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of headers that were actually sent in the request. |
_extract_headers_from_events
staticmethod
Extract headers from network events using appropriate extractors.
PARAMETER | DESCRIPTION |
---|---|
events | List of network events to process. |
event_extractors | Mapping of event keys to header extraction functions. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | Deduplicated list of headers from all matching events. |
Note
Headers are deduplicated based on name-value pairs to avoid duplicate entries from multiple event types.
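A rough sketch of extractor-based header collection with name-value deduplication follows. The event names are real Chrome DevTools Protocol Network events, but the assumption that the library keys its extractors on the event `method` field, and all function names here, are illustrative only.

```python
def dict_to_entries(headers):
    """Convert {name: value} into [{'name': ..., 'value': ...}] entries."""
    return [{"name": name, "value": value} for name, value in headers.items()]


# CDP event name -> function that pulls the header dict out of the event params.
REQUEST_EXTRACTORS = {
    "Network.requestWillBeSent": lambda params: params["request"]["headers"],
    "Network.requestWillBeSentExtraInfo": lambda params: params["headers"],
}


def extract_headers(events, extractors):
    """Collect headers from matching events, skipping repeated name-value pairs."""
    seen = set()
    result = []
    for event in events:
        extractor = extractors.get(event.get("method"))
        if extractor is None:
            continue  # not an event type we know how to read
        for entry in dict_to_entries(extractor(event["params"])):
            key = (entry["name"], entry["value"])
            if key not in seen:
                seen.add(key)
                result.append(entry)
    return result


events = [
    {"method": "Network.requestWillBeSent",
     "params": {"request": {"headers": {"Accept": "*/*"}}}},
    {"method": "Network.requestWillBeSentExtraInfo",
     "params": {"headers": {"Accept": "*/*", "Cookie": "sid=abc"}}},
    {"method": "Network.loadingFinished", "params": {}},
]
print(extract_headers(events, REQUEST_EXTRACTORS))
```

The `Accept` header appears in both events but is reported once, while `Cookie`, present only in the extra-info event, is still captured.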
_extract_request_sent_headers
Extract headers from main request event.
PARAMETER | DESCRIPTION |
---|---|
params | Event parameters containing request details. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of headers that were sent with the request. |
_extract_request_sent_extra_info_headers
Extract headers from extra request info event.
This event contains additional header information that may not be present in the main request event, such as security-related headers.
PARAMETER | DESCRIPTION |
---|---|
params | Extra info event parameters containing additional headers. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of additional headers sent with the request. |
_extract_response_received_headers
Extract headers from main response event.
PARAMETER | DESCRIPTION |
---|---|
params | Event parameters containing response details. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of headers received from the server. |
_extract_response_received_extra_info_headers
Extract headers from extra response info event.
This event contains additional response header information, including Set-Cookie headers and security-related headers that may be filtered from the main response event.
PARAMETER | DESCRIPTION |
---|---|
params | Extra info event parameters containing additional headers. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of additional headers received from the server. |
_convert_dict_to_header_entries
staticmethod
Convert header dictionary to standardized HeaderEntry format.
PARAMETER | DESCRIPTION |
---|---|
headers_dict | Dictionary mapping header names to values. |
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of HeaderEntry objects with 'name' and 'value' keys. |
_extract_set_cookies
Extract and parse all Set-Cookie headers from response events.
Processes response events to find Set-Cookie headers and converts them into structured cookie objects. Handles multiple Set-Cookie headers and multi-line cookie declarations.
RETURNS | DESCRIPTION |
---|---|
list[CookieParam] | List of unique cookies extracted from Set-Cookie headers. |
_filter_response_extra_info_events
Filter network events to find those containing Set-Cookie information.
RETURNS | DESCRIPTION |
---|---|
list[RequestReceivedEvent] | List of events that contain extra response information including cookies. |
_parse_set_cookie_header
Parse a Set-Cookie header value into individual cookie objects.
Handles both single and multi-line Set-Cookie headers, extracting cookie name-value pairs while ignoring attributes like Path, Domain, etc.
PARAMETER | DESCRIPTION |
---|---|
set_cookie_header | Raw Set-Cookie header value from HTTP response. |
RETURNS | DESCRIPTION |
---|---|
list[CookieParam] | List of parsed cookie objects with name and value. |
_parse_cookie_line
staticmethod
Parse a single cookie line to extract name and value.
Extracts only the cookie name and value, ignoring all cookie attributes like Path, Domain, Secure, HttpOnly, etc. Rejects cookies with empty names.
PARAMETER | DESCRIPTION |
---|---|
line | Single line from Set-Cookie header. |
RETURNS | DESCRIPTION |
---|---|
Optional[CookieParam] | CookieParam object with name and value, or None if parsing fails or name is empty. |
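The parsing rules above (keep only the name and value, ignore attributes, reject empty names, accept multi-line header values) can be illustrated as follows. This is a minimal sketch under the assumption that cookies are plain dicts with 'name' and 'value' keys and that multiple cookies arrive separated by newlines; the function names are hypothetical and the library's own parsing may differ in detail.

```python
def parse_cookie_line(line):
    """Return {'name', 'value'} for one Set-Cookie line, or None if unusable."""
    # The name=value pair is everything before the first ';'.
    pair = line.split(";", 1)[0].strip()
    if "=" not in pair:
        return None
    name, value = pair.split("=", 1)
    name = name.strip()
    if not name:
        return None  # cookies with empty names are rejected
    return {"name": name, "value": value.strip()}


def parse_set_cookie_header(header):
    """Parse a possibly multi-line Set-Cookie value into cookie dicts."""
    cookies = []
    for line in header.split("\n"):
        cookie = parse_cookie_line(line)
        if cookie is not None:
            cookies.append(cookie)
    return cookies


print(parse_set_cookie_header("sid=abc; Path=/; HttpOnly\ntheme=dark; Secure"))
```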
_add_unique_cookies
staticmethod
Add cookies to list while avoiding duplicates.
PARAMETER | DESCRIPTION |
---|---|
cookies | Existing list of cookies to add to. |
new_cookies | New cookies to add if not already present. |
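One plausible reading of "adding while avoiding duplicates" is an in-place append that skips cookies already in the list. The sketch below assumes cookies are dicts compared by full equality; the actual uniqueness criterion used by the library is not stated here, and `add_unique_cookies` is a hypothetical name.

```python
def add_unique_cookies(cookies, new_cookies):
    """Append to cookies (in place) each new cookie that is not already present."""
    for cookie in new_cookies:
        if cookie not in cookies:
            cookies.append(cookie)


jar = [{"name": "sid", "value": "abc"}]
add_unique_cookies(jar, [
    {"name": "sid", "value": "abc"},     # duplicate, skipped
    {"name": "theme", "value": "dark"},  # new, appended
])
print(jar)
```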
_convert_header_entries_to_dict
staticmethod
Convert HeaderEntry objects to a plain dictionary format.
Used for preparing headers for the JavaScript fetch API which expects a simple object mapping header names to values.
PARAMETER | DESCRIPTION |
---|---|
headers | List of HeaderEntry objects with 'name' and 'value' keys. |
RETURNS | DESCRIPTION |
---|---|
dict[str, str] | Dictionary mapping header names to values. |
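Since headers are passed as a list of name/value entries but fetch expects a plain object, the conversion amounts to a dictionary comprehension. The following is a sketch with a hypothetical function name, not the library's implementation.

```python
def header_entries_to_dict(headers):
    """Collapse [{'name': ..., 'value': ...}] into {name: value}."""
    # If a name repeats, the later entry wins in this simple version.
    return {entry["name"]: entry["value"] for entry in headers}


entries = [
    {"name": "X-Custom", "value": "value"},
    {"name": "Accept", "value": "application/json"},
]
print(header_entries_to_dict(entries))
```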
Response Class
Represents the response to an HTTP request, providing a familiar interface similar to the requests library.
pydoll.browser.requests.response.Response
Response(status_code, content=b'', text='', json=None, response_headers=None, request_headers=None, cookies=None, url='')
HTTP response object for browser-based fetch requests.
This class provides a standardized interface for handling HTTP responses obtained through the browser's fetch API. It mimics the requests.Response interface while preserving all browser-specific metadata including cookies, headers, and network timing information.
Key Features:
- Compatible with requests.Response API for easy migration
- Preserves both request and response headers for analysis
- Automatic cookie extraction from Set-Cookie headers
- Lazy JSON parsing with caching
- Browser-context aware (respects CORS, security policies)
- Content available in multiple formats (text, bytes, JSON)
The response contains all data captured during the browser's fetch execution, including redirects, authentication flows, and any browser-applied transformations.
Initialize a new Response instance with browser fetch results.
PARAMETER | DESCRIPTION |
---|---|
status_code | HTTP status code returned by the server (e.g., 200, 404, 500). |
content | Raw response body as bytes. Used for binary data or when text encoding is uncertain. |
text | Response body as decoded string. Pre-decoded by browser's fetch API. |
json | Pre-parsed JSON data if response Content-Type was application/json. If None, json() method will attempt to parse from text on demand. |
response_headers | Headers received from the server, including Set-Cookie, Content-Type, and any custom headers sent by the server. |
request_headers | Headers that were actually sent in the request, including browser-generated headers (User-Agent, Accept, etc.) and custom headers. |
cookies | Cookies extracted from Set-Cookie headers during the response. These represent new/updated cookies from this specific request. |
url | Final URL after any redirects. May differ from original request URL if the server performed redirects during the request. |
ok
property
Check if the request was successful (2xx or 3xx status codes).
RETURNS | DESCRIPTION |
---|---|
bool | True if status code is in the 200-399 range, False otherwise. |
Note
This follows HTTP conventions where 2xx codes indicate success and 3xx codes indicate redirection (still considered "ok").
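The documented range check can be expressed directly. This is a minimal sketch; `OK_STATUS_CODES` and `is_ok` are illustrative names, not confirmed library identifiers.

```python
OK_STATUS_CODES = range(200, 400)  # 2xx success and 3xx redirection


def is_ok(status_code):
    """Mirror the documented rule: True for 200-399, False otherwise."""
    return status_code in OK_STATUS_CODES


for status in (200, 302, 404, 500):
    print(status, is_ok(status))
```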
cookies
property
Get cookies that were set by the server during this response.
RETURNS | DESCRIPTION |
---|---|
list[CookieParam] | List of cookies extracted from Set-Cookie headers. Each cookie contains name and value, with cookie attributes (Path, Domain, etc.) automatically handled by the browser. |
Note
These are only NEW/UPDATED cookies from this response. Existing browser cookies are managed automatically by the browser context.
request_headers
property
Get headers that were actually sent in the HTTP request.
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of headers sent to the server, including both custom headers provided by the user and automatic headers added by the browser (User-Agent, Accept, Authorization, etc.). |
Note
This shows the ACTUAL headers sent, which may differ from what was originally specified due to browser modifications.
headers
property
Get headers received from the server in the HTTP response.
RETURNS | DESCRIPTION |
---|---|
list[HeaderEntry] | List of response headers sent by the server, including standard headers (Content-Type, Content-Length, etc.) and any custom headers. |
Note
Some security-sensitive headers may be filtered by the browser and not appear in this list due to CORS policies.
status_code
property
Get the HTTP status code returned by the server.
RETURNS | DESCRIPTION |
---|---|
int | Integer status code (e.g., 200 for OK, 404 for Not Found, 500 for Server Error). |
text
property
Get the response content as a decoded string.
RETURNS | DESCRIPTION |
---|---|
str | Response body decoded as UTF-8 string. If no text was provided during initialization, it will be decoded from the raw content. |
Note
Decoding uses 'replace' error handling to avoid crashes on invalid UTF-8 sequences.
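With 'replace' error handling, bytes that are not valid UTF-8 become the Unicode replacement character (U+FFFD) instead of raising `UnicodeDecodeError`. The snippet below demonstrates the standard-library behavior the note refers to; the sample bytes are made up for illustration.

```python
raw = b"caf\xc3\xa9 \xff"  # valid UTF-8 for "café", then one invalid byte
text = raw.decode("utf-8", errors="replace")
print(text)
print(text.endswith("\ufffd"))  # the invalid byte became U+FFFD
```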
content
property
Get the raw response content as bytes.
RETURNS | DESCRIPTION |
---|---|
bytes | Unmodified response body as bytes. Useful for binary data (images, files, etc.) or when you need to handle encoding manually. |
url
property
Get the final URL of the response after any redirects.
RETURNS | DESCRIPTION |
---|---|
str | The final URL that was accessed, which may differ from the original request URL if redirects occurred. |
json
Parse and return the response content as JSON data.
Attempts to parse the response text as JSON. Uses caching to avoid re-parsing the same content multiple times.
RETURNS | DESCRIPTION |
---|---|
Union[dict[str, Any], list] | Parsed JSON data as dictionary, list, or other JSON-compatible type. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the response content is not valid JSON or if parsing fails. |
Note
- Uses lazy parsing: JSON is only parsed when first accessed
- Subsequent calls return cached result for better performance
- If JSON was pre-parsed during initialization, that result is returned
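The lazy-parse-then-cache pattern in the notes above might look like the following. `CachedJsonBody` is a hypothetical stand-in written for this illustration, not the library's Response class.

```python
import json


class CachedJsonBody:
    """Hypothetical stand-in showing parse-once JSON access."""

    def __init__(self, text, parsed=None):
        self._text = text
        self._json = parsed  # pre-parsed value, if one was supplied

    def json(self):
        # Parse only on first access; later calls reuse the stored result.
        if self._json is None:
            try:
                self._json = json.loads(self._text)
            except json.JSONDecodeError as exc:
                raise ValueError(f"Response is not valid JSON: {exc}") from exc
        return self._json


body = CachedJsonBody('{"id": 123}')
first = body.json()
print(first, body.json() is first)
```

One limitation of using `None` as the "not yet parsed" marker is that a body whose JSON value is literally `null` gets re-parsed on every call; a dedicated sentinel would avoid that.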
raise_for_status
Raise an HTTPError if the response indicates an HTTP error status.
Checks the status code and raises an exception for client errors (4xx) and server errors (5xx). Successful responses (2xx) and redirects (3xx) do not raise an exception.
RAISES | DESCRIPTION |
---|---|
HTTPError | If status code is 400 or higher, indicating an error. |
Note
This method is compatible with requests.Response.raise_for_status() for easy migration from the requests library.
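The threshold rule (raise at 400 or higher, stay silent for 2xx and 3xx) is sketched below. The `HTTPError` class here is a local stand-in so the example runs on its own; it is not the library's exception, and the error message format is an assumption.

```python
class HTTPError(Exception):
    """Local stand-in for the library's HTTPError, so this sketch runs alone."""


def raise_for_status(status_code):
    """Raise HTTPError for 4xx/5xx; return None for anything below 400."""
    if status_code >= 400:
        raise HTTPError(f"HTTP error status: {status_code}")


for status in (200, 302, 404):
    try:
        raise_for_status(status)
        print(status, "no exception")
    except HTTPError as exc:
        print(status, "raised:", exc)
```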
Usage Examples
Basic HTTP Methods
import asyncio

from pydoll.browser.chromium import Chrome


async def main():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to("https://api.example.com")

        # GET request
        response = await tab.request.get("/users/123")
        user_data = response.json()  # json() is a regular method, not a coroutine

        # POST request
        response = await tab.request.post("/users", json={
            "name": "John Doe",
            "email": "john@example.com"
        })

        # PUT request with headers (a list of name/value entries)
        response = await tab.request.put("/users/123",
            json={"name": "Jane Doe"},
            headers=[{"name": "Authorization", "value": "Bearer token123"}]
        )


asyncio.run(main())
Response Handling
# Check the response status
if response.ok:
    print(f"Success: {response.status_code}")
else:
    print(f"Error: {response.status_code}")
    response.raise_for_status()  # raises HTTPError for 4xx/5xx

# Access the response data
text_data = response.text
json_data = response.json()
raw_bytes = response.content

# Inspect headers and cookies
print("Response headers:", response.headers)
print("Request headers:", response.request_headers)
for cookie in response.cookies:
    # Cookies are entries with 'name' and 'value' keys
    print(f"Cookie: {cookie['name']}={cookie['value']}")
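Advanced Features
# Request with custom headers and query parameters
response = await tab.request.get("/search",
    params={"q": "python", "limit": 10},
    headers=[
        {"name": "User-Agent", "value": "Custom Bot 1.0"},
        {"name": "Accept", "value": "application/json"}
    ]
)

# Simulated file upload
# Note: `files` is not among the documented post() parameters;
# it would be forwarded through **kwargs as an extra fetch option.
response = await tab.request.post("/upload",
    data={"description": "Test file"},
    files={"file": ("test.txt", "file content", "text/plain")}
)

# Form data submission
response = await tab.request.post("/login",
    data={"username": "user", "password": "pass"}
)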
Integration with Tab
Request functionality is accessed through the tab.request property, which provides a singleton Request instance for each tab:
# Each tab has its own request instance
tab1 = await browser.get_tab(0)
tab2 = await browser.new_tab()

# These are independent Request instances
request1 = tab1.request  # Request bound to tab1
request2 = tab2.request  # Request bound to tab2

# Requests inherit the tab's context
await tab1.go_to("https://site1.com")
await tab2.go_to("https://site2.com")

# These requests have different cookie/session contexts
response1 = await tab1.request.get("/api/data")  # uses site1.com's cookies
response2 = await tab2.request.get("/api/data")  # uses site2.com's cookies
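Hybrid Automation
This module is particularly useful for hybrid automation scenarios that combine UI interaction with API calls. For example, you can log in through the UI and then make API calls with the authenticated session, without handling cookies or tokens manually.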
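A hybrid flow of this kind could be sketched as a coroutine that takes an already opened tab. Everything specific here is an assumption for illustration: the login URL, the element ids, the /api/profile endpoint, and the element-level calls (`tab.find(id=...)`, `type_text`, `click`), which are not described on this page and should be checked against the element-finding documentation.

```python
async def login_and_fetch_profile(tab, username, password):
    """Log in through the UI, then call an API using the same browser session."""
    await tab.go_to("https://example.com/login")  # hypothetical login page

    # UI phase: fill in and submit the login form (element ids are made up).
    username_field = await tab.find(id="username")
    await username_field.type_text(username)
    password_field = await tab.find(id="password")
    await password_field.type_text(password)
    submit_button = await tab.find(id="login-button")
    await submit_button.click()
    # A real script would usually wait here for the post-login page to load.

    # API phase: the browser attaches the cookies set during login by itself.
    response = await tab.request.get("/api/profile")  # hypothetical endpoint
    response.raise_for_status()
    return response.json()
```

The function would be awaited inside an `async with Chrome() as browser:` block with `tab = await browser.start()`, as in the basic example. No cookie or token is copied by hand at any point, which is the main benefit of running both phases in one tab.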