Apify is a cloud platform for web scraping and data extraction, which provides an ecosystem of more than a thousand ready-made apps called Actors for various scraping, crawling, and extraction use cases.

Overview

Actors are pre-built tools for web scraping, data extraction, and automation. The platform manages the infrastructure, so you can focus on the data extraction logic.

When to use Apify

  • Access to thousands of pre-built Actors for various platforms (social media, e-commerce, search engines, etc.)
  • Custom web scraping and automation workflows beyond simple search
  • Flexible Actor ecosystem – run any Actor from the Apify Store
This integration lets you run Actors on the Apify platform and load their results into LangChain, feeding your vector indexes with documents and data from the web. For example, you can generate answers from websites with documentation, blogs, or knowledge bases.

Installation and setup

  • Install the LangChain Apify package for Python with:
pip install langchain-apify
  • Get your Apify API token and either set it as an environment variable (APIFY_TOKEN) or pass it as apify_api_token in the constructor.
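The two ways of supplying the token can be combined in a small helper. This is a minimal sketch using only the standard library; `resolve_apify_token` is a hypothetical name for illustration and is not part of `langchain-apify`. The environment variable name is taken from the step above.

```python
import os


def resolve_apify_token(explicit_token=None):
    """Return the Apify API token, preferring an explicitly passed value.

    Hypothetical helper for illustration; not part of langchain-apify.
    """
    # Fall back to the APIFY_TOKEN environment variable named in the setup step.
    token = explicit_token or os.environ.get("APIFY_TOKEN")
    if not token:
        raise RuntimeError(
            "No Apify token found: set APIFY_TOKEN or pass a token explicitly."
        )
    return token
```

The returned value can then be passed as `apify_api_token` in the constructor, which makes a missing token fail early with a clear message.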

Tool

Use ApifyActorsTool to let agents run Apify Actors.
from langchain_apify import ApifyActorsTool
See this notebook for example usage. For a full example of a tool-calling agent built with LangGraph, see the Apify LangGraph agent Actor template. For more information on how to use this tool, visit the Apify integration documentation.
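As a rough sketch of direct use outside an agent, the tool can be created for a single Actor and invoked with that Actor's input. The Actor `apify/rag-web-browser` and its `query` and `maxResults` input fields are assumptions chosen for illustration, since each Actor defines its own input schema. The helper names are hypothetical.

```python
def build_tool_input(query, max_results=3):
    """Build the payload passed to the tool's invoke() call.

    The "query" and "maxResults" keys are assumed to match the input
    schema of the apify/rag-web-browser Actor; other Actors differ.
    """
    return {"run_input": {"query": query, "maxResults": max_results}}


def search_web(query, max_results=3):
    """Run the Actor through ApifyActorsTool and return its output.

    Needs langchain-apify installed and a valid Apify token configured.
    The import is local so build_tool_input() works without the package.
    """
    from langchain_apify import ApifyActorsTool

    tool = ApifyActorsTool("apify/rag-web-browser")
    return tool.invoke(build_tool_input(query, max_results))
```

Calling `search_web("what is apify?", max_results=2)` starts a real Actor run, so it consumes platform usage.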

Wrapper

You can use the ApifyWrapper to run Actors on the Apify platform.
from langchain_apify import ApifyWrapper
For more information on how to use this wrapper, see the Apify integration documentation.
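A minimal sketch of running an Actor through the wrapper and collecting its results as documents follows. It assumes the `apify/website-content-crawler` Actor, whose input takes a `startUrls` list and whose dataset items carry `text` and `url` fields. Those field names are assumptions to verify against the Actor's own documentation, and the helper names are hypothetical.

```python
def build_crawler_input(urls):
    """Build run_input for the crawler.

    The "startUrls" shape is assumed from the apify/website-content-crawler
    input schema; other Actors expect different fields.
    """
    return {"startUrls": [{"url": url} for url in urls]}


def crawl_to_documents(urls):
    """Run the Actor, wait for it to finish, and return LangChain documents.

    Needs langchain-apify installed and a valid Apify token configured.
    """
    from langchain_apify import ApifyWrapper
    from langchain_core.documents import Document

    apify = ApifyWrapper()  # token is read from the environment
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",
        run_input=build_crawler_input(urls),
        # Map each dataset item to a Document; field names are assumptions.
        dataset_mapping_function=lambda item: Document(
            page_content=item.get("text") or "",
            metadata={"source": item.get("url")},
        ),
    )
    return loader.load()
```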

Use cases

  • Web scraping: Extract data from websites, social media, e-commerce sites
  • Search engine results: Scrape Google, Bing, and other search engines
  • Data collection: Gather structured data for analysis and ML pipelines
  • Content aggregation: Collect content from multiple sources for RAG applications

Document loader

You can also use the ApifyDatasetLoader to load data from an existing Apify dataset.
from langchain_apify import ApifyDatasetLoader
For a more detailed walkthrough of this loader, see this notebook.
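The loader needs a mapping function that turns each dataset item into a document. The sketch below keeps that mapping in a small pure function so it can be checked without calling Apify. The `text` and `url` item fields are assumptions about the dataset's shape, and both helper names are hypothetical.

```python
def item_to_fields(item):
    """Split a dataset item into (page_content, metadata).

    Assumes items carry "text" and "url" keys; adjust for your dataset.
    """
    return item.get("text") or "", {"source": item.get("url", "")}


def load_dataset(dataset_id):
    """Load all items of an Apify dataset as LangChain documents.

    Needs langchain-apify installed and a valid Apify token configured.
    """
    from langchain_apify import ApifyDatasetLoader
    from langchain_core.documents import Document

    def to_document(item):
        page_content, metadata = item_to_fields(item)
        return Document(page_content=page_content, metadata=metadata)

    loader = ApifyDatasetLoader(
        dataset_id=dataset_id,
        dataset_mapping_function=to_document,
    )
    return loader.load()
```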

Pricing

Apify uses pay-per-use or pay-per-event pricing with a free tier available. Pricing varies by Actor:
  • Some Actors are free (you only pay for platform compute units)
  • Others charge for results or events
  • Pay-Per-Event (PPE) pricing: Many Actors support PPE pricing, which is useful when you want predictable, usage-based costs in agent deployments
  • See Apify pricing for details
Source code for this integration can be found in the LangChain Apify repository.

MCP Server

Unsure which Actor to use or what parameters it requires? Apify provides an MCP (Model Context Protocol) server that helps you discover available Actors, explore their input schemas, and understand parameter requirements. When connecting to the Apify MCP server over HTTP, include your Apify token in the request headers:
Authorization: Bearer <APIFY_TOKEN>
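As an illustration of the header format above, the following sketch prepares an HTTP request carrying the token without sending it. The server URL is an assumption for illustration only, and `build_mcp_request` is a hypothetical helper. A real MCP client would also handle the protocol messages themselves.

```python
import urllib.request

# Assumed endpoint for illustration; check Apify's MCP documentation for the
# actual server URL.
MCP_SERVER_URL = "https://mcp.apify.com"


def build_mcp_request(token, url=MCP_SERVER_URL):
    """Prepare (but do not send) an HTTP request carrying the bearer token."""
    if not token:
        raise ValueError("An Apify token is required.")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
```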
