Crawlee Python SDK

The Crawlee Python SDK is a Python library for building reliable web scrapers and crawlers. It offers BasicCrawler, HttpCrawler, BeautifulSoupCrawler, ParselCrawler, PlaywrightCrawler, and Adaptive crawlers built on top of asyncio, along with shared infrastructure for proxy rotation, session pooling, RequestQueue, Dataset, and KeyValueStore. The Python SDK targets data engineers and Python developers who want the same crawler ergonomics as the JavaScript version but inside the Python ecosystem.
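As a rough illustration of the crawler classes and shared storage named above, here is a minimal sketch using BeautifulSoupCrawler. It assumes a recent Crawlee release in which the crawler classes are importable from `crawlee.crawlers` (older releases used different module paths) and that the package was installed with its BeautifulSoup extra. The start URL, the request limit, and the `build_record` helper are illustrative choices, not part of the catalog entry.

```python
import asyncio  # used by the run command in the usage note below
from typing import Optional


def build_record(url: str, title: Optional[str]) -> dict:
    """Hypothetical helper: shape one scraped page into a dataset row."""
    return {'url': url, 'title': title.strip() if title else None}


async def main() -> None:
    # Imported here so build_record() can be reused or tested without
    # the crawlee package installed. The module path is an assumption
    # that holds for recent Crawlee releases.
    from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext

    # Cap the crawl so the sketch stays small (illustrative limit).
    crawler = BeautifulSoupCrawler(max_requests_per_crawl=10)

    @crawler.router.default_handler
    async def handle_page(context: BeautifulSoupCrawlingContext) -> None:
        context.log.info(f'Processing {context.request.url}')
        title_tag = context.soup.title
        title = title_tag.string if title_tag else None
        # push_data() stores the row in the default Dataset.
        await context.push_data(build_record(context.request.url, title))
        # enqueue_links() adds links found on the page to the RequestQueue.
        await context.enqueue_links()

    await crawler.run(['https://crawlee.dev'])
```

To try it, call `asyncio.run(main())` from a script. Swapping in ParselCrawler or PlaywrightCrawler should mostly mean changing the crawler class and the context attribute used for parsing, though the exact attributes differ per crawler and are worth checking against the API reference.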

API entry from apis.yml

aid: crawlee:crawlee-python-sdk
name: Crawlee Python SDK
description: The Crawlee Python SDK is a Python library for building reliable web scrapers and crawlers.
  It offers BasicCrawler, HttpCrawler, BeautifulSoupCrawler, ParselCrawler, PlaywrightCrawler, and Adaptive
  crawlers built on top of asyncio, along with shared infrastructure for proxy rotation, session pooling,
  RequestQueue, Dataset, and KeyValueStore. The Python SDK targets data engineers and Python developers
  who want the same crawler ergonomics as the JavaScript version but inside the Python ecosystem.
humanURL: https://crawlee.dev/python
properties:
- type: Documentation
  url: https://crawlee.dev/python
- type: Reference
  url: https://crawlee.dev/python/api
- type: GettingStarted
  url: https://crawlee.dev/python/docs/quick-start
- type: GitHubRepository
  url: https://github.com/apify/crawlee-python
- type: PyPiPackage
  url: https://pypi.org/project/crawlee/
tags:
- Asyncio
- BeautifulSoup
- Browser Automation
- Parsel
- Playwright
- Python
- Scraping