Screenshots and PDFs
Pydoll captures screenshots and generates PDFs by issuing Chrome DevTools Protocol (CDP) commands directly. You can capture full pages or specific elements, and export PDFs with control over orientation, scale, and backgrounds.
Screenshots
Basic Page Screenshot
import asyncio
from pydoll.browser.chromium import Chrome

async def take_page_screenshot():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Save screenshot to file
        await tab.take_screenshot('page.png', quality=100)

asyncio.run(take_page_screenshot())
Supported Formats
Pydoll supports three image formats based on file extension:
# PNG format (lossless, larger file size)
await tab.take_screenshot('screenshot.png', quality=100)
# JPEG format (lossy, smaller file size)
await tab.take_screenshot('screenshot.jpeg', quality=85)
# WebP format (modern, efficient)
await tab.take_screenshot('screenshot.webp', quality=90)
Format Detection
The image format is automatically determined by the file extension. Using an unsupported extension raises InvalidFileExtension.
Both .jpg and .jpeg are supported for JPEG format (.jpg is automatically normalized to .jpeg internally to match CDP requirements).
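The extension rule can be illustrated with a small standalone sketch. This is not Pydoll's implementation: detect_image_format is a hypothetical helper, a plain ValueError stands in for InvalidFileExtension, and the case-insensitive matching is an assumption of the sketch. It only mirrors the behavior described above, including the .jpg to .jpeg normalization.

```python
from pathlib import Path

# Formats listed in this guide; 'jpg' is an alias normalized to 'jpeg'.
SUPPORTED_FORMATS = {'png': 'png', 'jpeg': 'jpeg', 'jpg': 'jpeg', 'webp': 'webp'}


def detect_image_format(path: str) -> str:
    """Return the image format implied by the file extension (hypothetical helper)."""
    # Assumption: extensions are matched case-insensitively.
    extension = Path(path).suffix.lower().lstrip('.')
    try:
        return SUPPORTED_FORMATS[extension]
    except KeyError:
        # Stand-in for Pydoll's InvalidFileExtension.
        raise ValueError(f'Unsupported image extension: {extension!r}') from None


print(detect_image_format('shot.jpg'))   # jpeg
print(detect_image_format('shot.webp'))  # webp
```

Running a check like this before a long batch job surfaces a bad file name without having to start a browser first.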
Screenshot Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | Optional[str] | None | File path to save screenshot. Required if as_base64=False. |
| quality | int | 100 | Image quality (0-100). Higher values mean better quality and larger files. |
| beyond_viewport | bool | False | Capture entire scrollable page, not just visible area. |
| as_base64 | bool | False | Return base64-encoded string instead of saving to file. |
Full Page Screenshot
Capture content beyond the visible viewport:
async def full_page_screenshot():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/long-page')

        # Capture entire page including content below the fold
        await tab.take_screenshot(
            'full-page.png',
            beyond_viewport=True,
            quality=90
        )
Performance Note
Using beyond_viewport=True on very long pages can consume significant memory and take longer to process.
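As a rough illustration of why long pages are expensive: an uncompressed RGBA bitmap needs about 4 bytes per pixel before it is encoded. The sketch below does that arithmetic. estimate_raw_bitmap_bytes is a hypothetical helper, and the result is a back-of-the-envelope estimate, not a measurement of what Chrome actually allocates.

```python
BYTES_PER_PIXEL = 4  # assumption: 8-bit RGBA, uncompressed


def estimate_raw_bitmap_bytes(width_px: int, height_px: int,
                              device_scale_factor: float = 1.0) -> int:
    """Estimate the raw bitmap size for a capture of the given CSS-pixel dimensions."""
    physical_width = round(width_px * device_scale_factor)
    physical_height = round(height_px * device_scale_factor)
    return physical_width * physical_height * BYTES_PER_PIXEL


viewport = estimate_raw_bitmap_bytes(1920, 1080)
long_page = estimate_raw_bitmap_bytes(1920, 20_000)
print(f'viewport:  {viewport / 1_000_000:.1f} MB')   # 8.3 MB
print(f'long page: {long_page / 1_000_000:.1f} MB')  # 153.6 MB
```

Under these assumptions, a 20,000-pixel-tall page needs roughly 18 times the raw buffer of a single 1080-pixel viewport.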
Base64 Screenshot
Get the screenshot as a base64 string for embedding in HTML or sending through an API:
import aiohttp

async def base64_screenshot():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Get screenshot as base64 string
        screenshot_base64 = await tab.take_screenshot(
            as_base64=True
        )

        # Use in HTML img tag
        html = f'<img src="data:image/png;base64,{screenshot_base64}" />'

        # Or send via API
        async with aiohttp.ClientSession() as session:
            await session.post(
                'https://api.example.com/upload',
                json={'image': screenshot_base64}
            )
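If you later need the raw bytes, for example to write the image to disk or compute a checksum, the base64 string can be decoded with the standard library. This is a minimal sketch, assuming the value returned with as_base64=True is standard base64. to_data_uri and save_base64_image are hypothetical helpers, and the sample payload is only the 8-byte PNG signature rather than a real screenshot.

```python
import base64
import tempfile
from pathlib import Path

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def to_data_uri(image_base64: str, image_format: str = 'png') -> str:
    """Build a data URI whose MIME type matches the image format."""
    return f'data:image/{image_format};base64,{image_base64}'


def save_base64_image(image_base64: str, path: Path) -> int:
    """Decode a base64 string, write the bytes to path, and return the byte count."""
    raw = base64.b64decode(image_base64, validate=True)
    return path.write_bytes(raw)


# Stand-in for the value returned by take_screenshot(as_base64=True).
sample_base64 = base64.b64encode(PNG_SIGNATURE).decode('ascii')

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'page.png'
    written = save_base64_image(sample_base64, target)
    round_trip_ok = target.read_bytes().startswith(PNG_SIGNATURE)

print(written)        # 8
print(round_trip_ok)  # True
print(to_data_uri(sample_base64))
```

Passing the format to to_data_uri keeps the MIME type in step with the image data if you switch from PNG to JPEG or WebP.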
Element Screenshot
Capture specific elements instead of the entire page:
async def element_screenshot():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Screenshot a specific element (PNG)
        header = await tab.find(tag_name='header')
        await header.take_screenshot('header.png', quality=100)

        # Screenshot a form (JPEG)
        form = await tab.find(id='login-form')
        await form.take_screenshot('login-form.jpeg', quality=85)

        # Screenshot a chart or graph (WebP)
        chart = await tab.find(class_name='data-visualization')
        await chart.take_screenshot('chart.webp', quality=90)
Format Detection
The image format is automatically detected from the file extension (.png, .jpeg/.jpg, or .webp). Using an unsupported extension raises InvalidFileExtension.
Automatic Scrolling
When capturing element screenshots, Pydoll automatically scrolls the element into view before taking the screenshot.
Element vs Page Screenshots
| Feature | tab.take_screenshot() | element.take_screenshot() |
|---|---|---|
| Scope | Entire viewport or page | Specific element only |
| Format Support | PNG, JPEG, WebP | PNG, JPEG, WebP |
| Beyond Viewport | ✅ Supported | ❌ Not applicable |
| Base64 Output | ✅ Supported | ✅ Supported |
| Auto-Scroll | ❌ Not applicable | ✅ Yes |
| Use Case | Full page captures | Component isolation, testing |
PDF Generation
Basic PDF Export
Convert pages to PDF with print-quality output:
import asyncio
from pathlib import Path
from pydoll.browser.chromium import Chrome

async def generate_pdf():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/document')

        # Generate PDF with Path
        await tab.print_to_pdf(Path('document.pdf'))

        # Or with string
        await tab.print_to_pdf('document.pdf')

asyncio.run(generate_pdf())
PDF Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| path | Optional[str \| Path] | None | File path to save PDF. Required if as_base64=False. |
| landscape | bool | False | Use landscape orientation (vs portrait). |
| display_header_footer | bool | False | Include browser-generated header/footer with title, URL, page numbers. |
| print_background | bool | True | Include background graphics and colors. |
| scale | float | 1.0 | Page scale factor (0.1-2.0). Useful for zoom/shrink effects. |
| as_base64 | bool | False | Return base64-encoded string instead of saving to file. |
Path vs String
Path objects from pathlib are recommended because they handle path joining and cross-platform separators for you, but plain strings work as well.
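One practical benefit of Path is building output locations portably and making sure the target directory exists before the PDF is written. Below is a small sketch using only pathlib. prepare_output_path and the directory names are hypothetical, and the sketch creates the directories explicitly rather than assuming print_to_pdf does so.

```python
import tempfile
from pathlib import Path


def prepare_output_path(base_dir: Path, *parts: str) -> Path:
    """Join path components and create the parent directory if it is missing."""
    target = base_dir.joinpath(*parts)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as tmp:
    pdf_path = prepare_output_path(Path(tmp), 'reports', '2024', 'document.pdf')
    parent_exists = pdf_path.parent.is_dir()
    print(pdf_path.name)    # document.pdf
    print(pdf_path.suffix)  # .pdf
    print(parent_exists)    # True
    # pdf_path could then be passed to tab.print_to_pdf(pdf_path)
```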
Advanced PDF Options
import asyncio
from pathlib import Path
from pydoll.browser.chromium import Chrome

async def advanced_pdf():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/report')

        # Landscape PDF with headers/footers
        await tab.print_to_pdf(
            Path('report-landscape.pdf'),
            landscape=True,
            display_header_footer=True,
            print_background=True,
            scale=0.9
        )

        # Portrait PDF without backgrounds (ink-friendly)
        await tab.print_to_pdf(
            Path('report-ink-friendly.pdf'),
            landscape=False,
            print_background=False,
            scale=1.0
        )

asyncio.run(advanced_pdf())
PDF Scale Factor
Control the zoom level of PDF output:
import asyncio
from pathlib import Path
from pydoll.browser.chromium import Chrome

async def scaled_pdfs():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/content')

        # Shrink content to fit more on each page (fewer pages)
        await tab.print_to_pdf(Path('compact.pdf'), scale=0.7)

        # Normal scale
        await tab.print_to_pdf(Path('normal.pdf'), scale=1.0)

        # Enlarge content (less fits on each page, so more pages)
        await tab.print_to_pdf(Path('large.pdf'), scale=1.5)

asyncio.run(scaled_pdfs())
Scale Limits
The scale parameter accepts values between 0.1 and 2.0. Values outside this range may produce unexpected results.
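Because out-of-range values are not guaranteed to fail loudly, it can help to check the scale before calling print_to_pdf. The sketch below shows two possible policies, rejecting or clamping. Both helpers are hypothetical and not part of Pydoll; the limits are the 0.1 and 2.0 stated above.

```python
MIN_SCALE = 0.1
MAX_SCALE = 2.0


def validate_scale(scale: float) -> float:
    """Return scale unchanged if it lies in the documented range, else raise ValueError."""
    if not MIN_SCALE <= scale <= MAX_SCALE:
        raise ValueError(f'scale must be between {MIN_SCALE} and {MAX_SCALE}, got {scale}')
    return scale


def clamp_scale(scale: float) -> float:
    """Alternative policy: pull out-of-range values to the nearest limit."""
    return max(MIN_SCALE, min(MAX_SCALE, scale))


print(validate_scale(0.7))  # 0.7
print(clamp_scale(3.0))     # 2.0
print(clamp_scale(0.01))    # 0.1
```

Rejecting is usually the safer choice when the scale comes from user input, since clamping silently changes the layout the caller asked for.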
Base64 PDF
Generate the PDF as a base64 string for transmission through an API:
import asyncio

import aiohttp
from pydoll.browser.chromium import Chrome

async def base64_pdf():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/invoice')

        # Get PDF as base64 (no path needed)
        pdf_base64 = await tab.print_to_pdf(as_base64=True)

        # Send via API
        async with aiohttp.ClientSession() as session:
            await session.post(
                'https://api.example.com/invoices',
                json={'pdf': pdf_base64}
            )

asyncio.run(base64_pdf())
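On the receiving side, the base64 string can be decoded and sanity-checked before it is stored. This is a minimal sketch, assuming standard base64 and relying on the fact that PDF files begin with the bytes %PDF-. decode_pdf is a hypothetical helper, and the payload below is a stand-in, not a valid document.

```python
import base64

PDF_MAGIC = b'%PDF-'


def decode_pdf(pdf_base64: str) -> bytes:
    """Decode a base64 PDF and check that it starts with the PDF header."""
    raw = base64.b64decode(pdf_base64, validate=True)
    if not raw.startswith(PDF_MAGIC):
        raise ValueError('decoded data does not look like a PDF')
    return raw


# Stand-in payload; a real value would come from print_to_pdf(as_base64=True).
fake_pdf_base64 = base64.b64encode(b'%PDF-1.4\n%%EOF\n').decode('ascii')
pdf_bytes = decode_pdf(fake_pdf_base64)
print(len(pdf_bytes))  # 15
```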
CDP Reference
For complete CDP documentation on these commands, see the Page.captureScreenshot and Page.printToPDF entries in the Chrome DevTools Protocol reference.
Error Handling
from pydoll.browser.chromium import Chrome
from pydoll.exceptions import (
    InvalidFileExtension,
    MissingScreenshotPath,
    TopLevelTargetRequired
)

async def safe_screenshot():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        try:
            # Missing path and as_base64=False
            await tab.take_screenshot()
        except MissingScreenshotPath:
            print("Error: Must provide path or set as_base64=True")

        try:
            # Invalid extension
            await tab.take_screenshot('image.bmp')
        except InvalidFileExtension as e:
            print(f"Error: {e}")

        # IFrame screenshot limitation
        iframe_element = await tab.find(tag_name='iframe')
        frame = await tab.get_frame(iframe_element)
        try:
            # Won't work for iframes
            await frame.take_screenshot('iframe.png')
        except TopLevelTargetRequired:
            print("Use element.take_screenshot() for iframe content")

        # Correct approach
        content = await frame.find(id='content')
        await content.take_screenshot('iframe-content.jpeg')
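The first two errors follow from rules stated in this guide: a path is required unless as_base64=True, and the extension must be a supported one. The sketch below restates those rules as a standalone pre-flight check, so a batch job can reject bad arguments before launching a browser. It is an illustration, not Pydoll's code: check_screenshot_args is hypothetical, and ValueError stands in for MissingScreenshotPath and InvalidFileExtension.

```python
from pathlib import Path
from typing import Optional

SUPPORTED_EXTENSIONS = {'.png', '.jpeg', '.jpg', '.webp'}


def check_screenshot_args(path: Optional[str] = None, as_base64: bool = False) -> None:
    """Raise ValueError if the arguments break the rules documented in this guide."""
    if path is None:
        if not as_base64:
            # Pydoll reports this case as MissingScreenshotPath.
            raise ValueError('provide a path or set as_base64=True')
        return
    if Path(path).suffix.lower() not in SUPPORTED_EXTENSIONS:
        # Pydoll reports this case as InvalidFileExtension.
        raise ValueError(f'unsupported extension: {Path(path).suffix!r}')


jobs = [('page.png', False), (None, True), (None, False), ('image.bmp', False)]
results = []
for path, as_base64 in jobs:
    try:
        check_screenshot_args(path, as_base64)
        results.append('ok')
    except ValueError as error:
        results.append(f'rejected: {error}')

print(results[0])  # ok
print(results[1])  # ok
print(results[2])  # rejected: provide a path or set as_base64=True
print(results[3])  # rejected: unsupported extension: '.bmp'
```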
Learn More
For additional context on how screenshots and PDFs integrate with Pydoll's architecture:
- Deep Dive: CDP: Understanding Chrome DevTools Protocol commands
- API Reference: Tab: Complete method signatures and parameters
- API Reference: WebElement: Element-specific screenshot capabilities
Screenshots and PDFs are essential tools for automation, testing, and documentation. Pydoll's direct CDP integration provides professional-grade output with fine-grained control.