API Reference
This page documents the core API of pypecdp.
Browser
- class pypecdp.browser.Browser(config=None, *, chrome_path='chromium', user_data_dir=None, clean_data_dir=True, headless=True, extra_args=None, ignore_default_args=None, env=None, **kwargs)[source]
Bases:
object
High-level browser automation via Chrome DevTools Protocol.
Manages the Chrome/Chromium browser process lifecycle and CDP message routing between tabs and the browser.
- Parameters:
- config
Configuration object for browser launch.
- proc
The browser subprocess.
- reader
Stream reader for CDP pipe communication.
- writer
Stream writer for CDP pipe communication.
- targets
Mapping of target IDs to Tab instances.
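As a rough illustration of what the reader and writer carry: when Chrome is driven over a debugging pipe, each CDP message is a JSON object terminated by a null byte. The sketch below shows that framing only. The helper names `encode_message` and `split_messages` are hypothetical and are not part of pypecdp.

```python
import json


def encode_message(msg_id, method, params=None, session_id=None):
    """Serialize one CDP command as null-terminated JSON (hypothetical helper)."""
    payload = {"id": msg_id, "method": method, "params": params or {}}
    if session_id is not None:
        # Commands addressed to a specific target carry its session ID.
        payload["sessionId"] = session_id
    return json.dumps(payload).encode("utf-8") + b"\0"


def split_messages(buffer):
    """Split raw bytes into decoded complete messages plus the unread tail."""
    *frames, tail = buffer.split(b"\0")
    return [json.loads(frame) for frame in frames if frame], tail


# One complete frame followed by the start of a second, incomplete one.
raw = encode_message(1, "Target.getTargets") + b'{"id": 1, "resu'
messages, tail = split_messages(raw)
print(messages)
print(tail)
```

The incomplete tail would be kept and prepended to the next chunk read from the pipe.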
- Class Attributes:
- tab_class: Class to use for creating Tab instances. Override this in subclasses to use custom Tab implementations.
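The override mechanism described above is the usual class-attribute factory pattern. The sketch below demonstrates it with stand-in classes so that it runs without a browser. `Browser`, `Tab`, and the `_register_target` helper here are simplified assumptions, not the real pypecdp implementations.

```python
class Tab:
    """Stand-in for a tab wrapper; not the real pypecdp class."""

    def __init__(self, browser, target_id):
        self.browser = browser
        self.target_id = target_id


class Browser:
    """Stand-in browser that builds tabs through a class attribute."""

    tab_class = Tab

    def __init__(self):
        self.targets = {}

    def _register_target(self, target_id):  # hypothetical helper
        # Looking up tab_class on self lets subclasses swap the Tab type.
        tab = self.tab_class(self, target_id)
        self.targets[target_id] = tab
        return tab


class LoggingTab(Tab):
    def describe(self):
        return f"tab {self.target_id}"


class LoggingBrowser(Browser):
    tab_class = LoggingTab  # the only change needed in the subclass


tab = LoggingBrowser()._register_target("T1")
print(type(tab).__name__, tab.describe())
```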
- __init__(config=None, *, chrome_path='chromium', user_data_dir=None, clean_data_dir=True, headless=True, extra_args=None, ignore_default_args=None, env=None, **kwargs)[source]
Initialize Browser instance.
- Parameters:
config (Config | None) – Pre-configured Config instance. If None, a new Config will be created from the keyword arguments.
chrome_path (str) – Path to Chrome/Chromium executable.
user_data_dir (str | None) – Path to user data directory. If None, a temporary directory will be created.
headless (bool) – Whether to run in headless mode.
extra_args (list[str] | None) – Additional command-line arguments.
ignore_default_args (list[str] | None) – List of default args to ignore.
env (dict[str, str] | None) – Environment variables to set for browser process.
**kwargs (Any) – Additional keyword arguments. Currently supports ‘auto_attach’, which controls automatic target attachment, and ‘default_domains’, which auto-enables CDP domains on a target.
clean_data_dir (bool) – Whether to remove an existing user data directory before starting.
- Return type:
None
- async close()[source]
Close the browser and clean up resources.
Closes all tabs, terminates the browser process, and cancels background tasks. This method handles cleanup gracefully:
- Attempts graceful shutdown via CDP browser.close()
- Falls back to SIGTERM if needed
- Falls back to SIGKILL if the process doesn’t exit in 3 seconds
Note
This method suppresses most errors to ensure cleanup completes even if the browser has already exited or crashed.
- Return type:
None
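The signal fallback described for close() can be sketched with asyncio subprocesses. This is a minimal illustration of the SIGTERM-then-SIGKILL escalation only; it omits the CDP-level shutdown, and `stop_process` and `demo` are hypothetical names rather than pypecdp functions.

```python
import asyncio
import sys


async def stop_process(proc, grace=3.0):
    """Terminate proc, escalating to kill if it outlives the grace period."""
    if proc.returncode is not None:
        return proc.returncode  # already exited, nothing to do
    proc.terminate()  # SIGTERM on POSIX
    try:
        return await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        proc.kill()  # SIGKILL on POSIX
        return await proc.wait()


async def demo():
    # A long-sleeping child process stands in for the browser.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)"
    )
    return await stop_process(proc)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```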
- async send(cmd, *, session_id=None, **kwargs)[source]
Send a CDP command and await its response.
- Parameters:
- Returns:
The parsed response from the CDP command, or None if ignore_errors=True and an error occurred.
- Raises:
RuntimeError – If the CDP command returns an error and ignore_errors is False (default).
ConnectionError – If the CDP pipe is closed.
- Return type:
Navigate to a URL in a tab.
- async cookies()[source]
Get all cookies for the browser.
Retrieves cookies from Chrome via CDP and converts them to a standard Python CookieJar. The returned CookieJar contains http.cookiejar.Cookie objects that are compatible with urllib, requests, and other HTTP libraries.
Note
The original CDP cookies (list[cdp.network.Cookie]) are preserved in the returned CookieJar’s cdp_cookies attribute. This allows access to CDP-specific cookie properties (priority, source_scheme, source_port, same_site, partition_key, etc.) that aren’t available in the standard cookiejar.Cookie objects.
- Returns:
A CookieJar (subclass of http.cookiejar.CookieJar) containing all browser cookies. The jar.cdp_cookies attribute contains the original CDP cookie objects.
- Return type:
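To make the conversion described for cookies() concrete, the sketch below builds a standard-library cookie jar from CDP-style cookie fields and keeps the originals on a cdp_cookies attribute. The class name `CDPCookieJar`, the helper `to_std_cookie`, and the use of plain dicts in place of cdp.network.Cookie objects are all assumptions made for illustration.

```python
import http.cookiejar


class CDPCookieJar(http.cookiejar.CookieJar):  # hypothetical class name
    """CookieJar that also remembers the raw CDP cookie records."""

    def __init__(self, cdp_cookies=()):
        super().__init__()
        self.cdp_cookies = list(cdp_cookies)


def to_std_cookie(raw):
    """Convert a dict of CDP-style cookie fields to http.cookiejar.Cookie."""
    domain = raw["domain"]
    expires = raw.get("expires")
    return http.cookiejar.Cookie(
        version=0,
        name=raw["name"],
        value=raw["value"],
        port=None,
        port_specified=False,
        domain=domain,
        domain_specified=bool(domain),
        domain_initial_dot=domain.startswith("."),
        path=raw.get("path", "/"),
        path_specified=True,
        secure=raw.get("secure", False),
        expires=expires,
        discard=expires is None,  # no expiry means a session cookie
        comment=None,
        comment_url=None,
        rest={"HttpOnly": ""} if raw.get("httpOnly") else {},
    )


raw_cookies = [
    {"name": "sid", "value": "abc", "domain": ".example.com",
     "path": "/", "secure": True, "httpOnly": True},
]
jar = CDPCookieJar(raw_cookies)
for raw in raw_cookies:
    jar.set_cookie(to_std_cookie(raw))
print([cookie.name for cookie in jar])
```

Because the result is an ordinary http.cookiejar.CookieJar, it can be handed to any HTTP client that accepts one.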
Tab
- class pypecdp.tab.Tab(browser, target_id, target_info=None)[source]
Bases:
object
Represents a browser tab/target with CDP session.
Manages a CDP session for a specific target, handles event dispatching, and provides methods for navigation and DOM queries.
- Parameters:
browser (Browser)
target_id (cdp.target.TargetID)
target_info (cdp.target.TargetInfo | None)
- browser
The parent Browser instance.
- target_id
CDP target identifier.
- target_info
Optional target metadata.
- session_id
CDP session ID for this tab.
- Class Attributes:
- elem_class: Class to use for creating Elem instances. Override this in subclasses to use custom Elem implementations.
- __init__(browser, target_id, target_info=None)[source]
Initialize a Tab instance.
- Parameters:
browser (Browser) – The Browser instance managing this tab.
target_id (cdp.target.TargetID) – CDP target identifier.
target_info (cdp.target.TargetInfo | None) – Optional target metadata.
- Return type:
None
- async send(cmd, **kwargs)[source]
Send a CDP command within this tab’s session.
- Parameters:
- Returns:
The parsed response from the CDP command.
- Raises:
RuntimeError – If the tab is not attached or command fails.
- Return type:
- async handle_event(event)[source]
Dispatch a CDP event to registered handlers.
- Parameters:
event (Any) – The CDP event object to dispatch.
- Return type:
None
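Dispatching an event to registered handlers can be pictured as a registry keyed by event type. The sketch below is a stand-in, not the pypecdp implementation: the `EventRouter` class, its `on` registration method, and the local `LoadEventFired` class are hypothetical and exist only so the example runs on its own.

```python
import asyncio
import inspect
from collections import defaultdict


class EventRouter:
    """Stand-in for per-tab handler registration and dispatch."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):  # hypothetical registration method
        self._handlers[event_type].append(handler)

    async def handle_event(self, event):
        # Copy the list so handlers may register or remove others safely.
        for handler in list(self._handlers.get(type(event), [])):
            result = handler(event)
            if inspect.isawaitable(result):
                await result  # support both sync and async handlers


class LoadEventFired:  # stand-in for a CDP event class
    def __init__(self, timestamp):
        self.timestamp = timestamp


async def demo():
    seen = []
    router = EventRouter()
    router.on(LoadEventFired, lambda e: seen.append(("sync", e.timestamp)))

    async def on_load(e):
        seen.append(("async", e.timestamp))

    router.on(LoadEventFired, on_load)
    await router.handle_event(LoadEventFired(1.5))
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```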
- async attach()[source]
Attach a CDP session to this tab.
This method is used for manual tab attachment when auto_attach is disabled in the Browser configuration. If auto_attach is enabled (default), tabs are attached automatically by the Browser.
- Returns:
The session ID for this tab after attachment. If already attached, returns the existing session ID.
- Return type:
SessionID
- Raises:
RuntimeError – If the CDP attach_to_target command fails.
Navigate to a URL and wait for page load.
- async wait_for_event(event=<class 'pypecdp.cdp.page.LoadEventFired'>, timeout=10.0)[source]
Wait for a specific CDP event to occur.
- async eval(expression, await_promise=True)[source]
Evaluate JavaScript expression in the page context.
- async find_elems(query, depth=100, pierce=True)[source]
Find all elements matching the specified query.
Searches from the document root and includes iframes. To search within a specific element, use Elem.query_selector().
- async wait_for_elems(query, timeout=10.0, **kwargs)[source]
Wait for elements matching the specified query to appear.
- async find_elem(query, depth=100, pierce=True)[source]
Find the first element matching the specified query.
Searches from the document root and includes iframes. To search within a specific element, use Elem.query_selector().
- async wait_for_elem(query, timeout=10.0, **kwargs)[source]
Wait for an element matching the specified query to appear.
- async close()[source]
Close this tab.
Sends a close target command. Errors are suppressed if the tab is already closed or connection is lost.
- Return type:
None
- property parent: Tab | None
Get the parent tab if this tab is a child frame.
This property is useful for navigating iframe hierarchies. Top-level tabs (pages) will have no parent, while iframes and nested frames will return their parent tab.
- Returns:
The parent Tab instance if this is a frame/iframe, or None if this is a top-level page or parent not found.
- Return type:
Tab | None
Example
>>> if tab.parent:
...     print(f"Frame in: {tab.parent.url}")
... else:
...     print("Top-level tab")
- elem(node_id)[source]
Create an Elem instance from a CDP NodeId.
Searches the document tree for the node with the specified ID and wraps it in an Elem instance for interaction.
- Parameters:
node_id (NodeId) – The NodeId of the DOM element to find.
- Returns:
The created Elem instance wrapping the found node.
- Return type:
Elem
- Raises:
ValueError – If the tab document is not loaded or if the node with the specified ID is not found.
Elem
- class pypecdp.elem.Elem(tab, node)[source]
Bases:
object
Wrapper for DOM elements with interaction methods.
Provides high-level methods for interacting with elements in the browser, including clicking, typing, and retrieving attributes.
- Parameters:
tab (Tab)
node (cdp.dom.Node)
- node
The CDP Node object representing the DOM element.
- Type:
cdp.dom.Node
Note
Additional node properties like node_id and backend_node_id are accessible via __getattr__ delegation to the node object.
- node: cdp.dom.Node
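The __getattr__ delegation mentioned in the note can be sketched as follows. The `Node` dataclass is a stand-in for cdp.dom.Node with only three fields, and this `Elem` is a simplified assumption about how the forwarding might look, not the library's code.

```python
from dataclasses import dataclass


@dataclass
class Node:  # stand-in for cdp.dom.Node
    node_id: int
    backend_node_id: int
    node_name: str


class Elem:
    """Simplified wrapper that forwards unknown attributes to its node."""

    def __init__(self, tab, node):
        self.tab = tab
        self.node = node

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails.
        if name == "node":
            # Guard against infinite recursion if node was never set.
            raise AttributeError(name)
        return getattr(self.node, name)


elem = Elem(tab=None, node=Node(node_id=12, backend_node_id=34, node_name="BUTTON"))
print(elem.node_id, elem.backend_node_id, elem.node_name)
```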
- async scroll_into_view()[source]
Scroll element into viewport and attempt to focus it.
Errors are suppressed if the element is detached or hidden.
- Raises:
ReferenceError – If the tab session is no longer active.
- Return type:
None
- async focus()[source]
Set focus to the element.
Suppresses errors if the element is not focusable.
- Raises:
ReferenceError – If the tab session is no longer active.
- Return type:
None
- async position()[source]
Get the position and coordinates of the element.
- Returns:
Container with element coordinates, or None if unavailable.
- Return type:
Position | None
- Raises:
ReferenceError – If the tab session is no longer active.
- async click(button=MouseButton.LEFT, click_count=1, delay=0.02)[source]
Click the element at its center point.
Scrolls the element into view, calculates the center, and dispatches mouse press and release events. Returns the top-level tab, which is useful when the click triggers navigation.
- Parameters:
- Returns:
The current top-level Tab containing this element, or None if the element position cannot be determined.
- Return type:
Tab | None
- Raises:
ReferenceError – If the tab session is no longer active.
Example
>>> link = await tab.wait_for_elem('a[href="/next"]')
>>> current_tab = await link.click()
>>> if current_tab:
...     await current_tab.wait_for_event(cdp.page.LoadEventFired)
...     print(f"Navigated to: {current_tab.url}")
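For the "calculates the center" step, one plausible approach is to average the corners of the element's quad. CDP reports a quad as eight numbers (x1, y1, ..., x4, y4). Whether click() works from quads or from some other box representation is an assumption here, and `quad_center` is a hypothetical helper.

```python
def quad_center(quad):
    """Return the (x, y) center of a quad given as [x1, y1, ..., x4, y4]."""
    if len(quad) != 8:
        raise ValueError("a quad must contain exactly 8 numbers")
    xs = quad[0::2]  # every even index is an x coordinate
    ys = quad[1::2]  # every odd index is a y coordinate
    return sum(xs) / 4, sum(ys) / 4


# A 100x40 rectangle whose top-left corner is at (10, 20).
print(quad_center([10, 20, 110, 20, 110, 60, 10, 60]))  # (60.0, 40.0)
```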
- async type(text)[source]
Type text into the element.
Focuses the element and inserts the text via CDP input command.
- Parameters:
text (str) – The text string to type.
- Raises:
ReferenceError – If the tab session is no longer active.
- Return type:
None
- async set_value(value)[source]
Set the value property of the element directly.
Attempts to resolve the element to a RemoteObject and set its value property via JavaScript. This method also dispatches an ‘input’ event to trigger any listeners. Falls back to typing character-by-character if resolution fails.
This is faster than type() for setting form field values but may not trigger all the same events as real user typing.
- Parameters:
value (str) – The value to set.
- Raises:
ReferenceError – If the tab session is no longer active.
- Return type:
None
- async text()[source]
Get the text content of the element.
- Returns:
The text content, or None if unavailable.
- Return type:
str | None
- Raises:
ReferenceError – If the tab session is no longer active.
- async html(include_shadow_dom=True)[source]
Get the outer HTML of the element.
- Parameters:
include_shadow_dom (bool) – Whether to include shadow DOM content.
- Returns:
The outer HTML string.
- Return type:
str
- Raises:
ReferenceError – If the tab session is no longer active.
- async attribute(name)[source]
Get the value of an attribute.
- Parameters:
name (str) – The attribute name to retrieve.
- Returns:
The attribute value, or None if not found.
- Return type:
str | None
- Raises:
ReferenceError – If the tab session is no longer active.
- async query_selector(selector)[source]
Find a child element matching the selector.
- Parameters:
selector (str) – The CSS selector string.
- Returns:
The found Elem or None if not found.
- Return type:
Elem | None
- Raises:
ReferenceError – If the tab session is no longer active.
- async wait_for_selector(selector, timeout=10.0, poll=0.05)[source]
Wait for a child element matching the selector to appear.
- Parameters:
- Returns:
The matching element, or None if timeout.
- Return type:
Elem | None
- Raises:
ReferenceError – If the tab session is no longer active.
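The timeout and poll parameters suggest a polling loop that returns None once the deadline passes. The sketch below shows that shape under that assumption. `poll_until` and the probe functions are hypothetical and do not touch a real page.

```python
import asyncio
import time


async def poll_until(probe, timeout=10.0, poll=0.05):
    """Await probe() repeatedly until it returns non-None or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = await probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None  # timed out without a match
        await asyncio.sleep(poll)


async def demo():
    attempts = 0

    async def probe():
        # Pretend the element appears on the third query.
        nonlocal attempts
        attempts += 1
        return "elem" if attempts >= 3 else None

    async def never():
        return None

    found = await poll_until(probe, timeout=1.0, poll=0.01)
    missing = await poll_until(never, timeout=0.05, poll=0.01)
    return found, attempts, missing


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Using a monotonic clock for the deadline keeps the loop unaffected by system clock adjustments.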
- property parent: Elem | None
Get the parent element of this Elem.
Useful for traversing up the DOM tree. Can be chained to access ancestors: elem.parent.parent
Example:
# Navigate up to find a containing form
button = await tab.find_elem("button[type=submit]")
form = button.parent  # Get parent element
while form and form.node_name != "FORM":
    form = form.parent
- Returns:
The parent Elem, or None if this is a root element (no parent_id) or if the parent is the document root.
- Return type:
Elem | None
Config
- class pypecdp.config.Config(chrome_path='chromium', user_data_dir=None, clean_data_dir=True, headless=True, extra_args=<factory>, ignore_default_args=None, env=<factory>)[source]
Bases:
object
Configuration for launching Chrome/Chromium with CDP pipe.
- Parameters:
- user_data_dir
Path to user data directory. If None, a temporary directory will be created.
- Type:
str | None
- clean_data_dir
Whether to remove existing user data directory before starting. Defaults to True. Set to False to preserve cookies, cache, and other browser state between runs.
- Type:
bool
Example
>>> config = Config(
...     chrome_path="chromium",
...     clean_data_dir=False,  # Preserve profile
...     headless=True
... )
- ensure_user_data_dir()[source]
Ensure user data directory exists and return its path.
If user_data_dir is not set, creates a temporary directory.
- Returns:
Path to the user data directory.
- Return type:
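The `<factory>` defaults in the signature point to a dataclass with default_factory fields. The sketch below mirrors that shape and the temporary-directory behavior described for ensure_user_data_dir(). `LaunchConfig` is a hypothetical stand-in and omits fields and logic the real Config may have.

```python
import os
import tempfile
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LaunchConfig:  # hypothetical stand-in, not pypecdp.config.Config
    chrome_path: str = "chromium"
    user_data_dir: Optional[str] = None
    clean_data_dir: bool = True
    headless: bool = True
    # default_factory gives every instance its own list and dict.
    extra_args: List[str] = field(default_factory=list)
    env: Dict[str, str] = field(default_factory=dict)

    def ensure_user_data_dir(self):
        """Create the directory if needed and return its path."""
        if self.user_data_dir is None:
            # Assumed behavior: fall back to a fresh temporary directory.
            self.user_data_dir = tempfile.mkdtemp(prefix="profile-")
        os.makedirs(self.user_data_dir, exist_ok=True)
        return self.user_data_dir


cfg = LaunchConfig(headless=False)
path = cfg.ensure_user_data_dir()
print(os.path.isdir(path))
os.rmdir(path)  # clean up the demo directory
```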