Crawl
Starts a crawl job for a given URL.
Method: client.crawl.start(params: StartCrawlJobParams): Promise<StartCrawlJobResponse>
Endpoint: POST /api/crawl
Parameters:
StartCrawlJobParams:
url: string - URL to crawl
maxPages?: number - Maximum number of pages to crawl
followLinks?: boolean - Whether to follow links found on each page
ignoreSitemap?: boolean - Ignore the sitemap when finding links to crawl
excludePatterns?: string[] - Path patterns to exclude from the crawl
includePatterns?: string[] - Path patterns to include in the crawl
sessionOptions?:
scrapeOptions?:
Response: StartCrawlJobResponse
Example:
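The following is a minimal sketch of calling `client.crawl.start`, not the SDK's own example. The parameter names come from the list above; everything else is an assumption made for illustration: the `jobId` field on the response, the `/blog/*` pattern syntax, and the in-memory `createFakeClient` stand-in that lets the snippet run without network access. `sessionOptions` and `scrapeOptions` are left out because their types are not given in this section.

```typescript
// Parameter shape as listed in this section (sessionOptions and scrapeOptions omitted).
interface StartCrawlJobParams {
  url: string;
  maxPages?: number;
  followLinks?: boolean;
  ignoreSitemap?: boolean;
  excludePatterns?: string[];
  includePatterns?: string[];
}

// ASSUMPTION: the response exposes the new job's ID as `jobId`.
interface StartCrawlJobResponse {
  jobId: string;
}

// Only the part of the client this sketch needs.
interface CrawlStarter {
  crawl: {
    start(params: StartCrawlJobParams): Promise<StartCrawlJobResponse>;
  };
}

// Hypothetical in-memory stand-in; it records the params it receives
// and hands back a made-up job ID.
function createFakeClient(): CrawlStarter & { received: StartCrawlJobParams[] } {
  const received: StartCrawlJobParams[] = [];
  return {
    received,
    crawl: {
      async start(params: StartCrawlJobParams): Promise<StartCrawlJobResponse> {
        received.push(params);
        return { jobId: `job-${received.length}` };
      },
    },
  };
}

// Starts a crawl and returns the job ID for later lookups.
async function startDocsCrawl(client: CrawlStarter): Promise<string> {
  const response = await client.crawl.start({
    url: "https://example.com",
    maxPages: 10,
    followLinks: true,
    excludePatterns: ["/blog/*"], // pattern syntax is an assumption
  });
  return response.jobId;
}

startDocsCrawl(createFakeClient()).then((jobId) => console.log("started", jobId));
```

With a real client, the returned ID would be the value passed to `client.crawl.get`.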
Retrieves details of a specific crawl job.
Method: client.crawl.get(id: string): Promise<CrawlJobResponse>
Endpoint: GET /api/crawl/{id}
Parameters:
id: string - Crawl job ID
Example:
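As a rough sketch, `client.crawl.get` can be polled until a job finishes. The shape of `CrawlJobResponse` is not spelled out in this section, so the `status` values and the `data` array below are assumptions, as are the helper names `pollCrawl` and `createFakeClient`. The fake client simply reports `running` twice and then `completed`, so the snippet runs on its own.

```typescript
// ASSUMPTION: status values and the `data` field are illustrative only.
type CrawlStatus = "pending" | "running" | "completed" | "failed";

interface CrawlJobResponse {
  status: CrawlStatus;
  data?: { url: string }[];
}

// Only the part of the client this sketch needs.
interface CrawlGetter {
  crawl: { get(id: string): Promise<CrawlJobResponse> };
}

// Hypothetical stand-in: "running" for the first two calls, then "completed".
function createFakeClient(): CrawlGetter {
  let calls = 0;
  return {
    crawl: {
      async get(_id: string): Promise<CrawlJobResponse> {
        calls += 1;
        if (calls < 3) return { status: "running" };
        return { status: "completed", data: [{ url: "https://example.com" }] };
      },
    },
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls get(id) until the job reaches a terminal status or attempts run out.
async function pollCrawl(
  client: CrawlGetter,
  id: string,
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<CrawlJobResponse> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await client.crawl.get(id);
    if (job.status === "completed" || job.status === "failed") return job;
    await sleep(intervalMs);
  }
  throw new Error(`Crawl job ${id} did not finish after ${maxAttempts} attempts`);
}

pollCrawl(createFakeClient(), "job-1", 10).then((job) =>
  console.log(job.status, job.data?.length),
);
```

Capping the number of attempts keeps a stuck job from blocking the caller indefinitely; the interval and cap shown here are arbitrary.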
Starts a crawl job and waits for it to complete.
Method: client.crawl.startAndWait(params: StartCrawlJobParams, returnAllPages: boolean = true): Promise<CrawlJobResponse>
Parameters:
StartCrawlJobParams:
url: string - URL to crawl
maxPages?: number - Maximum number of pages to crawl
followLinks?: boolean - Whether to follow links found on each page
ignoreSitemap?: boolean - Ignore the sitemap when finding links to crawl
excludePatterns?: string[] - Path patterns to exclude from the crawl
includePatterns?: string[] - Path patterns to include in the crawl
sessionOptions?:
scrapeOptions?:
returnAllPages: boolean - Return all pages in the crawl job response (default: true)
Response: CrawlJobResponse
Example:
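A minimal sketch of calling `client.crawl.startAndWait` follows. The signature matches the one given above; the `status` and `data` fields of `CrawlJobResponse` are assumptions, and `createFakeClient` is a hypothetical stand-in whose behavior (returning only the first page when `returnAllPages` is `false`) is invented for the demo and may not match the real service.

```typescript
// Parameter shape as listed in this section (sessionOptions and scrapeOptions omitted).
interface StartCrawlJobParams {
  url: string;
  maxPages?: number;
  followLinks?: boolean;
  ignoreSitemap?: boolean;
  excludePatterns?: string[];
  includePatterns?: string[];
}

// ASSUMPTION: response fields are illustrative only.
interface CrawledPage {
  url: string;
}

interface CrawlJobResponse {
  status: "completed" | "failed";
  data: CrawledPage[];
}

// Only the part of the client this sketch needs.
interface CrawlWaiter {
  crawl: {
    startAndWait(params: StartCrawlJobParams, returnAllPages?: boolean): Promise<CrawlJobResponse>;
  };
}

// Hypothetical stand-in: fabricates `maxPages` page entries (3 if unset) and,
// as a demo-only behavior, returns just the first one when returnAllPages is false.
function createFakeClient(): CrawlWaiter {
  return {
    crawl: {
      async startAndWait(
        params: StartCrawlJobParams,
        returnAllPages: boolean = true,
      ): Promise<CrawlJobResponse> {
        const limit = params.maxPages ?? 3;
        const pages = Array.from({ length: limit }, (_, i) => ({
          url: `${params.url}/page-${i + 1}`,
        }));
        return { status: "completed", data: returnAllPages ? pages : pages.slice(0, 1) };
      },
    },
  };
}

// Crawls a site in one call and returns the URLs of the pages that came back.
async function crawlSite(client: CrawlWaiter, url: string): Promise<string[]> {
  const job = await client.crawl.startAndWait({ url, maxPages: 5, followLinks: true });
  if (job.status !== "completed") {
    throw new Error(`Crawl of ${url} ended with status ${job.status}`);
  }
  return job.data.map((page) => page.url);
}

crawlSite(createFakeClient(), "https://example.com").then((urls) =>
  console.log(urls.length, "pages"),
);
```

Compared with calling `start` and then polling `get`, this form keeps the waiting logic inside the client, at the cost of holding the promise open until the job completes.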