HAR Network Recording
Capture all network activity during a browser session and export it as a standard HAR (HTTP Archive) 1.2 file. Perfect for debugging, performance analysis, and test fixtures.
Debug Like a Pro
HAR files are the industry standard for recording network traffic. You can import them directly into Chrome DevTools, Charles Proxy, or any HAR viewer for detailed analysis.
Why Use HAR Recording?
| Use Case | Benefit |
|---|---|
| Debugging failed requests | See exact headers, timing, and response bodies |
| Performance analysis | Identify slow requests and bottlenecks |
| API documentation | Capture real request/response pairs |
| Test fixtures | Record real traffic for test mocking |
Quick Start
Record all network traffic during a page navigation:
import asyncio
from pydoll.browser.chromium import Chrome
async def record_traffic():
async with Chrome() as browser:
tab = await browser.start()
async with tab.request.record() as capture:
await tab.go_to('https://example.com')
# Save the capture as a HAR file
capture.save('flow.har')
print(f'Captured {len(capture.entries)} requests')
asyncio.run(record_traffic())
Recording API
tab.request.record(resource_types=None)
Context manager that captures network traffic on the tab.
| Parameter | Type | Description |
|---|---|---|
resource_types |
list[ResourceType] \| None |
Optional list of resource types to capture. When None (default), all types are captured. |
async with tab.request.record() as capture:
# All network activity inside this block is captured
await tab.go_to('https://example.com')
await (await tab.find(id='search')).type_text('pydoll')
await (await tab.find(type='submit')).click()
The capture object (HarCapture) provides:
| Property/Method | Description |
|---|---|
capture.entries |
List of captured HAR entries |
capture.to_dict() |
Full HAR 1.2 dict (for custom processing) |
capture.save(path) |
Save as HAR JSON file |
Filtering by Resource Type
Record only specific resource types instead of all traffic:
from pydoll.protocol.network.types import ResourceType
# Record only fetch/XHR requests (skip documents, images, etc.)
async with tab.request.record(
resource_types=[ResourceType.FETCH, ResourceType.XHR]
) as capture:
await tab.go_to('https://example.com')
# Record only document and stylesheet requests
async with tab.request.record(
resource_types=[ResourceType.DOCUMENT, ResourceType.STYLESHEET]
) as capture:
await tab.go_to('https://example.com')
Available ResourceType values:
| Value | Description |
|---|---|
ResourceType.DOCUMENT |
HTML documents |
ResourceType.STYLESHEET |
CSS stylesheets |
ResourceType.SCRIPT |
JavaScript files |
ResourceType.IMAGE |
Images |
ResourceType.FONT |
Web fonts |
ResourceType.MEDIA |
Audio/video |
ResourceType.FETCH |
Fetch API requests |
ResourceType.XHR |
XMLHttpRequest calls |
ResourceType.WEB_SOCKET |
WebSocket connections |
ResourceType.OTHER |
Other resource types |
Saving Captures
# Save as HAR file (can be opened in Chrome DevTools)
capture.save('flow.har')
# Save to a nested directory (created automatically)
capture.save('recordings/session1/flow.har')
# Access the raw HAR dict for custom processing
har_dict = capture.to_dict()
print(har_dict['log']['version']) # "1.2"
Inspecting Entries
async with tab.request.record() as capture:
await tab.go_to('https://example.com')
for entry in capture.entries:
req = entry['request']
resp = entry['response']
print(f"{req['method']} {req['url']} -> {resp['status']}")
Advanced Usage
Filtering Captured Entries
async with tab.request.record() as capture:
await tab.go_to('https://example.com')
# Filter only API calls
api_entries = [
e for e in capture.entries
if '/api/' in e['request']['url']
]
# Filter only failed requests
failed = [
e for e in capture.entries
if e['response']['status'] >= 400
]
Custom HAR Processing
har = capture.to_dict()
# Count requests by type
from collections import Counter
types = Counter(
e.get('_resourceType', 'Other')
for e in har['log']['entries']
)
print(types) # Counter({'Document': 1, 'Script': 5, 'Stylesheet': 3, ...})
HAR File Format
The exported HAR follows the HAR 1.2 specification. Each entry contains:
- Request: method, URL, headers, query parameters, POST data
- Response: status, headers, body content (text or base64-encoded)
- Timings: DNS, connect, SSL, send, wait (TTFB), receive
- Metadata: server IP, connection ID, resource type
Response Bodies
Response bodies are captured automatically after each request completes. Binary content (images, fonts, etc.) is stored as base64-encoded strings.