API documentation

Extractly — structured data from any URL

Overview

Extractly is a single-endpoint API that scrapes any URL and returns structured JSON matching a schema you define. It is built for AI agents and developers.

Authentication

All requests require an x-api-key header. Get your API key by signing up at getextractly.com.

MCP Server (for AI agents)

Extractly is available as an MCP server, making it directly usable by AI agents in Cursor, Claude Desktop, and other MCP-compatible environments.

npx extractly-mcp

Claude Desktop

{
  "mcpServers": {
    "extractly": {
      "command": "npx",
      "args": ["-y", "extractly-mcp"]
    }
  }
}

Cursor

{
  "mcpServers": {
    "extractly": {
      "command": "npx",
      "args": ["-y", "extractly-mcp"]
    }
  }
}

Your API key is passed as the api_key parameter when the model calls the tool.

Endpoint

POST https://getextractly.com/api/v1/extract

Request format

Send JSON with two top-level fields:

  • url (string): The page to scrape and extract from. Must be a valid URL.
  • schema (object): Describes the shape of the JSON you want back, as field names and types (see Schema format below).

{
  "url": "https://example.com/article",
  "schema": {
    "title": "string",
    "description": "string"
  }
}
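
Because the url field must be a valid URL and schema must be an object, it can be useful to check both before spending a request. The following is a minimal sketch, not part of the Extractly API: build_payload is a hypothetical helper, and the rule that only absolute http(s) URLs count as valid is an assumption.

```python
from urllib.parse import urlparse


def build_payload(url: str, schema: dict) -> dict:
    """Return the JSON body for an extract request, or raise ValueError."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs with a host count as "valid".
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not a valid absolute URL: {url!r}")
    # The docs require schema to be an object; an empty one extracts nothing.
    if not isinstance(schema, dict) or not schema:
        raise ValueError("schema must be a non-empty object")
    return {"url": url, "schema": schema}


payload = build_payload(
    "https://example.com/article",
    {"title": "string", "description": "string"},
)
print(payload)
```

A value such as "example.com" (no scheme) raises ValueError locally here; the API itself reports a malformed URL as HTTP 400.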

Schema format

The schema is a plain JSON object whose keys are the field names you want extracted and whose values describe each field's type (for example string, number, nested objects, or arrays). The API returns JSON with the same keys; a field whose value could not be extracted may be null.

Simple example

Extract title and description from a page.

{
  "title": "string",
  "description": "string"
}

Complex example

Job listings with nested company and location fields.

{
  "jobs": [
    {
      "title": "string",
      "company": {
        "name": "string",
        "url": "string"
      },
      "location": "string",
      "salary_range": "string"
    }
  ]
}
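
Since any field may come back as null, callers usually want to know which ones did. The sketch below walks a schema and a response together and lists the fields that are null or absent. The helper name null_paths, the dotted path notation, and the sample response data are illustrative assumptions, not part of the API.

```python
def null_paths(schema, data, prefix=""):
    """List paths of schema fields that are null or absent in data."""
    here = prefix or "<root>"
    if data is None:
        return [here]
    if isinstance(schema, dict):
        if not isinstance(data, dict):
            return [here]
        paths = []
        for key, sub_schema in schema.items():
            child = f"{prefix}.{key}" if prefix else key
            paths.extend(null_paths(sub_schema, data.get(key), child))
        return paths
    if isinstance(schema, list) and schema:
        if not isinstance(data, list):
            return [here]
        paths = []
        # A one-element list in the schema is the template for every item.
        for index, item in enumerate(data):
            paths.extend(null_paths(schema[0], item, f"{prefix}[{index}]"))
        return paths
    return []  # leaf type such as "string" with a non-null value


schema = {
    "jobs": [
        {
            "title": "string",
            "company": {"name": "string", "url": "string"},
            "location": "string",
            "salary_range": "string",
        }
    ]
}
# Hypothetical response, made up for illustration.
response = {
    "jobs": [
        {
            "title": "Engineer",
            "company": {"name": "Acme", "url": None},
            "location": "Remote",
            "salary_range": None,
        }
    ]
}
print(null_paths(schema, response))
# ['jobs[0].company.url', 'jobs[0].salary_range']
```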

Response format

Successful response

HTTP 200 — JSON object matching your schema keys.

{
  "title": "Example Article",
  "description": "A short summary of the page."
}

Error response

HTTP 4xx/5xx — JSON with an error message.

{
  "error": "Insufficient credits."
}
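
The two response shapes above can be handled in one place on the client. This is a minimal sketch under stated assumptions: ExtractlyError and parse_response are hypothetical names, not part of an official client, and the fallback message for a body without an error field is invented for illustration.

```python
import json


class ExtractlyError(Exception):
    """Hypothetical exception type; not part of any official client."""

    def __init__(self, status: int, message: str):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_response(status: int, body: str) -> dict:
    """Return the extracted data on HTTP 200, otherwise raise ExtractlyError."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None
    if status == 200 and isinstance(payload, dict):
        return payload
    # Error bodies are documented as {"error": "..."}; fall back if absent.
    message = "unknown error"
    if isinstance(payload, dict) and isinstance(payload.get("error"), str):
        message = payload["error"]
    raise ExtractlyError(status, message)


print(parse_response(200, '{"title": "Example Article"}'))
```

Passing the status code and raw body as plain values keeps the function independent of the HTTP library in use.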

Code examples

JavaScript (fetch)

const res = await fetch('https://getextractly.com/api/v1/extract', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-api-key': 'YOUR_API_KEY'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    schema: { title: 'string', description: 'string' }
  })
});
const data = await res.json();
console.log(data);

Python (requests)

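import requests

# Send the target URL and the desired schema as a JSON body.
r = requests.post(
    'https://getextractly.com/api/v1/extract',
    headers={
        'Content-Type': 'application/json',
        'x-api-key': 'YOUR_API_KEY',
    },
    json={
        'url': 'https://example.com',
        'schema': {'title': 'string', 'description': 'string'},
    },
    timeout=35,  # server-side limit is 30 s; the 5 s margin is an assumption
)
# Both success and error responses carry a JSON body.
print(r.json())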

curl

curl -X POST https://getextractly.com/api/v1/extract \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{"url":"https://example.com","schema":{"title":"string","description":"string"}}'

Error codes

Code  Meaning
400   Bad request: invalid JSON body, missing fields, or malformed URL.
401   Unauthorized: missing or invalid API key.
402   Payment required: insufficient credits to run the extraction.
422   Unprocessable entity: model output did not match your schema (validation failed).
500   Server error: upstream failure, configuration issue, or unexpected error.
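
One way to act on these codes is to decide up front which are worth retrying. The policy below is an assumption, not documented behavior: it retries only 500, since 400, 401, and 402 need a change to the request, the key, or the balance first. Whether to retry 422 is a judgment call, because a second attempt may consume credits again. The names RETRYABLE and should_retry are hypothetical.

```python
# Meanings copied from the error code table.
ERROR_MEANINGS = {
    400: "Bad request",
    401: "Unauthorized",
    402: "Payment required",
    422: "Unprocessable entity",
    500: "Server error",
}

# Assumption: only a server-side failure is retried unchanged.
RETRYABLE = {500}


def should_retry(status: int, attempt: int, max_attempts: int = 3) -> bool:
    """Return True if a request that failed on this attempt should be resent."""
    return status in RETRYABLE and attempt < max_attempts


for status in (401, 500):
    label = ERROR_MEANINGS.get(status, "Undocumented status")
    print(status, label, "retry" if should_retry(status, attempt=1) else "give up")
```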

Credits

Each extraction consumes credits from your API key balance. A cache hit (the same URL and schema served from cache while the cached entry is valid) costs 1 credit. A fresh extraction costs 5–10 credits depending on page complexity, which covers scraping and model processing. Purchase more credits from the homepage after signup.
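
The pricing above gives a simple lower and upper bound for a batch of requests. The following sketch uses only the stated costs (1 credit per cache hit, 5 to 10 per fresh extraction); the function name estimate_credits and the example counts are illustrative.

```python
CACHE_HIT_COST = 1   # credits per cache hit
FRESH_COST_MIN = 5   # credits per fresh extraction, simplest pages
FRESH_COST_MAX = 10  # credits per fresh extraction, most complex pages


def estimate_credits(fresh: int, cached: int) -> tuple[int, int]:
    """Return the (minimum, maximum) credits a batch could consume."""
    if fresh < 0 or cached < 0:
        raise ValueError("counts must be non-negative")
    low = fresh * FRESH_COST_MIN + cached * CACHE_HIT_COST
    high = fresh * FRESH_COST_MAX + cached * CACHE_HIT_COST
    return low, high


low, high = estimate_credits(fresh=100, cached=40)
print(f"{low}-{high} credits")  # 540-1040 credits
```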

Limits

  • 30-second timeout per request.
  • 250,000 character limit on scraped content passed into extraction.
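
Given the 30-second limit, a client timeout slightly above it lets the server's own error response arrive before the client gives up. The sketch below uses only the Python standard library so it runs without extra packages. The 5-second margin and the helper names build_request and extract are assumptions, not documented values.

```python
import json
import urllib.request

API_URL = "https://getextractly.com/api/v1/extract"
SERVER_TIMEOUT_S = 30                    # documented per-request limit
CLIENT_TIMEOUT_S = SERVER_TIMEOUT_S + 5  # assumption: small safety margin


def build_request(url: str, schema: dict, api_key: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    body = json.dumps({"url": url, "schema": schema}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def extract(url: str, schema: dict, api_key: str) -> dict:
    """Send the request; urlopen raises HTTPError on 4xx/5xx responses."""
    request = build_request(url, schema, api_key)
    with urllib.request.urlopen(request, timeout=CLIENT_TIMEOUT_S) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("https://example.com", {"title": "string"}, "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Calling extract performs a real network request and consumes credits, so only build_request is exercised above.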