LangChain
Using Hyperbrowser's Document Loader Integration
Hyperbrowser provides a Document Loader integration with LangChain via the langchain-hyperbrowser package. It loads the metadata and contents (as formatted markdown or HTML) of any site as a LangChain Document.
Installation and Setup
To get started with langchain-hyperbrowser, install the package using pip:
pip install langchain-hyperbrowser
Then configure credentials by setting the following environment variable:
HYPERBROWSER_API_KEY=<your-api-key>
You can get an API key from the dashboard. Once you have it, add it to your .env file as HYPERBROWSER_API_KEY, or pass it via the api_key argument in the constructor.
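The two ways of supplying the key can be sketched as follows. This is a minimal sketch rather than the package's own logic: `resolve_api_key` and `make_loader` are hypothetical helper names, and the `urls` keyword is an assumption about the constructor (only `api_key` is named above).

```python
import os


def resolve_api_key(api_key=None):
    """Return an explicit key if given, else fall back to HYPERBROWSER_API_KEY."""
    key = api_key or os.environ.get("HYPERBROWSER_API_KEY")
    if not key:
        raise ValueError("Set HYPERBROWSER_API_KEY or pass api_key explicitly")
    return key


def make_loader(url, api_key=None):
    """Build a loader for one URL using whichever key source is available."""
    # Imported lazily so resolve_api_key works without the package installed.
    from langchain_hyperbrowser import HyperbrowserLoader

    # `urls` is an assumed keyword name; `api_key` is the argument named in the text.
    return HyperbrowserLoader(urls=url, api_key=resolve_api_key(api_key))
```

Python does not read a .env file on its own, so the variable has to reach the process environment first, for example through your shell or a tool such as python-dotenv.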
Document Loader
The HyperbrowserLoader class in langchain-hyperbrowser can load content from a single page or from multiple pages, or crawl an entire site. The content can be loaded as markdown or HTML.
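A basic load might look like the sketch below. It assumes the constructor takes the page address through a `urls` keyword and that the key is already set in HYPERBROWSER_API_KEY; `load_documents` and `preview` are hypothetical helper names introduced for illustration. `load()`, `page_content`, and `metadata` are the standard LangChain loader and Document interfaces.

```python
def load_documents(urls):
    """Scrape one URL (str) or several (list of str) and return LangChain Documents."""
    # Lazy import: requires the langchain-hyperbrowser package and an API key
    # in the HYPERBROWSER_API_KEY environment variable.
    from langchain_hyperbrowser import HyperbrowserLoader

    loader = HyperbrowserLoader(urls=urls)  # `urls` keyword is an assumption
    return loader.load()


def preview(docs, width=80):
    """Return (metadata, first `width` characters of content) for each Document."""
    return [(doc.metadata, doc.page_content[:width]) for doc in docs]
```

Calling `preview(load_documents("https://example.com"))` would then give a quick look at what came back for each page without printing the full content.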
Advanced Usage
You can specify the operation the loader performs. The default operation is scrape. For scrape, you can provide a single URL or a list of URLs. For crawl, you can provide only a single URL. The crawl operation crawls the provided page and its subpages and returns a document for each page.
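The URL rules for the two operations can be sketched as a small check that runs before the loader is built. This is an illustrative sketch: `check_urls` and `crawl_site` are hypothetical helpers, the `urls` and `operation` keyword names are assumptions about the constructor, and the check treats scrape and crawl as the only operations because those are the two described here.

```python
def check_urls(operation, urls):
    """Validate `urls` against the rules for `operation` and return it unchanged."""
    if operation not in ("scrape", "crawl"):
        raise ValueError(f"unknown operation: {operation!r}")
    # scrape accepts a str or a list of str; crawl accepts a single str only.
    if operation == "crawl" and not isinstance(urls, str):
        raise ValueError("crawl accepts a single URL, not a list")
    return urls


def crawl_site(url):
    """Crawl `url` and its subpages, returning one Document per page."""
    # Lazy import: requires the langchain-hyperbrowser package and an API key.
    from langchain_hyperbrowser import HyperbrowserLoader

    # `urls` and `operation` are assumed keyword names.
    loader = HyperbrowserLoader(urls=check_urls("crawl", url), operation="crawl")
    return loader.load()
```

Failing early on a list passed to crawl keeps the mistake local instead of surfacing it as an error from the remote service.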
Optional params for the loader can be provided in the params argument. For more information on the supported params, see the params for scraping or crawling.
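Passing params alongside an operation might look like the sketch below. The keys in `EXAMPLE_CRAWL_PARAMS` (`max_pages`, `scrape_options`, `formats`) are assumptions used for illustration and should be checked against the scraping and crawling params reference; `build_loader_kwargs` and `load_with_params` are hypothetical helpers, and the `urls` and `operation` keyword names are likewise assumed.

```python
# Assumed example keys; confirm them against the scrape/crawl params reference.
EXAMPLE_CRAWL_PARAMS = {
    "max_pages": 2,
    "scrape_options": {"formats": ["markdown"]},
}


def build_loader_kwargs(urls, operation="scrape", params=None):
    """Assemble constructor keyword arguments, omitting `params` when unset."""
    kwargs = {"urls": urls, "operation": operation}
    if params:
        # Shallow copy so later changes to the caller's dict do not leak in.
        kwargs["params"] = dict(params)
    return kwargs


def load_with_params(urls, operation="scrape", params=None):
    """Build the loader from the assembled arguments and load the documents."""
    # Lazy import: requires the langchain-hyperbrowser package and an API key.
    from langchain_hyperbrowser import HyperbrowserLoader

    return HyperbrowserLoader(**build_loader_kwargs(urls, operation, params)).load()
```

Keeping the argument assembly separate from the network call makes it easy to inspect exactly what would be sent before running a crawl.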