Apify is a cloud platform for web scraping and data extraction, which provides an ecosystem of more than a thousand ready-made apps called Actors for various scraping, crawling, and extraction use cases.
Overview
Apify provides access to thousands of pre-built tools (Actors) for web scraping, data extraction, and automation. The platform handles infrastructure management, allowing you to focus on data extraction logic.
When to use Apify
- Access to thousands of pre-built Actors for various platforms (social media, e-commerce, search engines, etc.)
- Custom web scraping and automation workflows beyond simple search
- Flexible Actor ecosystem – run any Actor from the Apify Store
This integration lets you run Actors on the Apify platform and load their results into LangChain to feed your vector indexes with documents and data from the web, for example to generate answers from websites with documentation, blogs, or knowledge bases.
Installation and setup
- Install the LangChain Apify package for Python with: pip install langchain-apify
- Get your Apify API token and either set it as an environment variable (APIFY_TOKEN) or pass it as apify_api_token in the constructor.
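The two ways of supplying the token can be sketched with a small helper. This is only an illustration: resolve_apify_token is a hypothetical name that is not part of the package, and the sketch assumes nothing beyond the APIFY_TOKEN variable named above.

```python
import os


def resolve_apify_token(explicit_token=None, env_var="APIFY_TOKEN"):
    """Return the token to use: an explicit value wins, else the environment.

    Hypothetical helper for illustration; it is not part of langchain-apify.
    """
    token = explicit_token or os.environ.get(env_var)
    if not token:
        raise RuntimeError(
            f"No Apify API token found: set {env_var} or pass a token explicitly."
        )
    return token
```

The returned value could then be passed as apify_api_token to a constructor. When the environment variable is already set, the explicit argument can simply be left out.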
Tool
You can use the ApifyActorsTool to call Apify Actors from agents.
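A minimal sketch of calling an Actor through the tool might look like the following. It assumes the apify/rag-web-browser Actor and its query and maxResults input fields, as used in the package's published examples. The helper names build_rag_browser_input and search_web are hypothetical. The Actor is only run when search_web is called, which requires langchain-apify to be installed and a valid API token.

```python
def build_rag_browser_input(query, max_results=2):
    """Build the tool-call payload: the Actor input nested under "run_input"."""
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {"run_input": {"query": query, "maxResults": max_results}}


def search_web(query, max_results=2):
    """Run the Actor through the tool (needs langchain-apify and an API token)."""
    # Imported lazily so the payload helper above works without the package.
    from langchain_apify import ApifyActorsTool

    # Assumption: the RAG Web Browser Actor from the Apify Store.
    tool = ApifyActorsTool("apify/rag-web-browser")
    return tool.invoke(build_rag_browser_input(query, max_results))
```

For agent use, the same tool instance would typically be added to the list of tools handed to the agent, rather than invoked directly as here.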
Wrapper
You can use the ApifyWrapper to run Actors on the Apify platform.
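As a rough sketch, running an Actor through the wrapper and loading its results could look like this. It assumes the apify/website-content-crawler Actor, its startUrls input field, and text and url fields in the resulting dataset items. The function names crawler_run_input and crawl_to_documents are hypothetical. Calling crawl_to_documents starts a real Actor run, so it needs the package installed and a token configured.

```python
def crawler_run_input(urls):
    """Build the Actor input: one {"url": ...} entry per start URL."""
    return {"startUrls": [{"url": url} for url in urls]}


def crawl_to_documents(urls):
    """Run the crawler Actor and return LangChain documents.

    Needs langchain-apify installed and an Apify API token configured.
    """
    # Imported lazily so crawler_run_input stays usable without the packages.
    from langchain_apify import ApifyWrapper
    from langchain_core.documents import Document

    apify = ApifyWrapper()
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",  # assumed Actor
        run_input=crawler_run_input(urls),
        # Map each dataset item produced by the run to a Document.
        dataset_mapping_function=lambda item: Document(
            page_content=item.get("text") or "",
            metadata={"source": item.get("url")},
        ),
    )
    return loader.load()
```

The returned documents can then be split, embedded, and added to a vector index in the usual LangChain way.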
Use cases
- Web scraping: Extract data from websites, social media, e-commerce sites
- Search engine results: Scrape Google, Bing, and other search engines
- Data collection: Gather structured data for analysis and ML pipelines
- Content aggregation: Collect content from multiple sources for RAG applications
Document loader
You can also use our ApifyDatasetLoader to get data from an Apify dataset.
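The loader needs a function that maps each dataset item to a document. The sketch below shows one way to write that mapping. It assumes items carry text and url fields, which depends on the Actor that produced the dataset. The names item_to_fields and load_dataset_documents are hypothetical, and the dataset ID is whatever ID your own Actor run produced.

```python
def item_to_fields(item):
    """Split one dataset item into (page_content, metadata).

    Assumes "text" and "url" keys; missing or empty values fall back to "".
    """
    text = item.get("text") or ""
    metadata = {"source": item.get("url", "")}
    return text, metadata


def load_dataset_documents(dataset_id):
    """Load an existing Apify dataset as LangChain documents.

    Needs langchain-apify installed and an Apify API token configured.
    """
    # Imported lazily so item_to_fields can be used without the packages.
    from langchain_apify import ApifyDatasetLoader
    from langchain_core.documents import Document

    def to_document(item):
        text, metadata = item_to_fields(item)
        return Document(page_content=text, metadata=metadata)

    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=to_document,
    )
    return loader.load()
```

Keeping the field extraction separate from the Document construction makes the mapping easy to check against sample items before running it on a full dataset.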
Pricing
Apify uses pay-per-use or pay-per-event pricing with a free tier available. Pricing varies by Actor:
- Some Actors are free (you only pay for platform compute units)
- Others charge for results or events
- Pay-Per-Event (PPE) pricing: Many Actors support PPE pricing, which is useful when you want predictable, usage-based costs in agent deployments
- See Apify pricing for details