Crawl
Start Crawl Job
Starts a crawl job for a given URL.
Method: client.crawl.start(params: StartCrawlJobParams): StartCrawlJobResponse
Endpoint: POST /api/crawl
Parameters:
StartCrawlJobParams:
url: string
- URL to scrape
max_pages?: number
- Max number of pages to crawl
follow_links?: boolean
- Follow links on the page
ignore_sitemap?: boolean
- Ignore sitemap when finding links to crawl
exclude_patterns?: string[]
- Patterns for paths to exclude from crawl
include_patterns?: string[]
- Patterns for paths to include in the crawl
session_options?: CreateSessionParams
scrape_options?: ScrapeOptions
Response: StartCrawlJobResponse
Example:
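The documented parameters map onto a JSON body for `POST /api/crawl`. The following is a minimal sketch of that mapping, not the SDK's own implementation. `StartCrawlParamsSketch`, `build_crawl_request`, and the `https://api.example.com` host are hypothetical placeholders, the nested `session_options` and `scrape_options` are assumed to be dict-shaped, and the `/blog/*` pattern is purely illustrative since the pattern syntax is not specified here. The request is only built, never sent.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder host, not the real API base URL


@dataclass
class StartCrawlParamsSketch:
    """Hypothetical stand-in mirroring the documented StartCrawlJobParams fields."""

    url: str
    max_pages: Optional[int] = None
    follow_links: Optional[bool] = None
    ignore_sitemap: Optional[bool] = None
    exclude_patterns: Optional[List[str]] = None
    include_patterns: Optional[List[str]] = None
    session_options: Optional[Dict[str, Any]] = None  # assumed dict-shaped
    scrape_options: Optional[Dict[str, Any]] = None  # assumed dict-shaped


def build_crawl_request(params: StartCrawlParamsSketch) -> Request:
    """Build (but do not send) a POST /api/crawl request, omitting unset optional fields."""
    body = {key: value for key, value in asdict(params).items() if value is not None}
    return Request(
        url=f"{BASE_URL}/api/crawl",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_crawl_request(
    StartCrawlParamsSketch(
        url="https://example.com",
        max_pages=5,
        exclude_patterns=["/blog/*"],  # illustrative pattern only
    )
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the SDK itself, the equivalent call would go through `client.crawl.start(params)` and return a `StartCrawlJobResponse`.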
Get Crawl Job
Retrieves details of a specific crawl job.
Method: client.crawl.get(id: str): CrawlJobResponse
Endpoint: GET /api/crawl/{id}
Parameters:
id: string
- Crawl job ID
Response: CrawlJobResponse
Example:
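As a rough illustration of how the job ID fits into `GET /api/crawl/{id}`, the sketch below builds the request without sending it. The helper name `build_get_crawl_request`, the `https://api.example.com` host, and the `job-123` ID are hypothetical placeholders, not values taken from the SDK.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.example.com"  # placeholder host, not the real API base URL


def build_get_crawl_request(job_id: str) -> Request:
    """Build (but do not send) a GET /api/crawl/{id} request."""
    # Percent-encode the ID so it always occupies exactly one path segment.
    encoded_id = quote(job_id, safe="")
    return Request(url=f"{BASE_URL}/api/crawl/{encoded_id}", method="GET")


request = build_get_crawl_request("job-123")  # hypothetical job ID
print(request.get_method(), request.full_url)
```

With the SDK, the same lookup is expressed as `client.crawl.get(id)`, which returns a `CrawlJobResponse`.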
Start Crawl Job and Wait
Starts a crawl job and waits for it to complete.
Method: client.crawl.start_and_wait(params: StartCrawlJobParams): CrawlJobResponse
Parameters:
StartCrawlJobParams:
url: string
- URL to scrape
max_pages?: number
- Max number of pages to crawl
follow_links?: boolean
- Follow links on the page
ignore_sitemap?: boolean
- Ignore sitemap when finding links to crawl
exclude_patterns?: string[]
- Patterns for paths to exclude from crawl
include_patterns?: string[]
- Patterns for paths to include in the crawl
session_options?: CreateSessionParams
scrape_options?: ScrapeOptions
Response: CrawlJobResponse
Example:
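Conceptually, a start-and-wait helper combines the start and get operations with a polling loop. The sketch below shows one plausible shape for that loop and is not the SDK's actual implementation. The function name `start_and_wait_sketch`, the `poll_interval` and `timeout` parameters, the dict-shaped job with a `status` key, and the status strings `pending`, `running`, `completed`, and `failed` are all assumptions, since `CrawlJobStatus` is not spelled out in this section. In-memory stand-ins replace the real client calls so the example runs without network access.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal statuses; the real CrawlJobStatus values may differ.
TERMINAL_STATUSES = {"completed", "failed"}


def start_and_wait_sketch(
    start: Callable[[], str],
    get: Callable[[str], Dict[str, Any]],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> Dict[str, Any]:
    """Start a job, then poll until it reaches a terminal status or the timeout elapses."""
    job_id = start()
    deadline = time.monotonic() + timeout
    while True:
        job = get(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl job {job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)


# In-memory stand-ins for client.crawl.start and client.crawl.get.
_responses = iter([{"status": "pending"}, {"status": "running"}, {"status": "completed"}])
result = start_and_wait_sketch(
    start=lambda: "job-123",  # hypothetical job ID
    get=lambda job_id: {"job_id": job_id, **next(_responses)},
    poll_interval=0.01,
)
print(result)
```

In real use, `client.crawl.start_and_wait(params)` hides this loop and returns the final `CrawlJobResponse` directly.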
Types
CrawlPageStatus
CrawlJobStatus
StartCrawlJobResponse
CrawledPage
CrawlJobResponse