Introduction

Welcome to the DiveAccess API documentation. The API retrieves Industry Dive articles based on search parameters. It is designed for use with large language models (LLMs) through retrieval-augmented generation (RAG), giving AI models access to up-to-date, high-quality, domain-specific content.

Quick start

Follow these steps to get up and running with the DiveAccess API.

  1. Create an account or log in to retrieve your access_key, which acts as your API secret key.
  2. Authenticate your requests by adding your key as a URL query parameter.
  3. Optionally add a URL-encoded query string, for example: query=artificial%20intelligence
  4. Make a GET request to the base URL with the query parameters added.

A list of Industry Dive article objects will be returned in JSON format.

curl -X GET "https://api.diveaccess.com/articles?access_key=YOUR_ACCESS_KEY&query=artificial%20intelligence"
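
The URL in the curl command above can also be assembled in code rather than by hand. The sketch below uses Python's standard urllib.parse module to percent-encode the query; the helper name build_articles_url is illustrative and not part of the DiveAccess API.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.diveaccess.com/articles"


def build_articles_url(access_key, query=None):
    """Return the request URL for a key and an optional search term.

    build_articles_url is a hypothetical helper for illustration.
    """
    params = {"access_key": access_key}
    if query:
        params["query"] = query
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses "+").
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


print(build_articles_url("YOUR_ACCESS_KEY", "artificial intelligence"))
```

HTTP clients such as requests perform this encoding themselves when given a params dictionary, so building the URL manually is mainly useful for logging or for tools that take a finished URL.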

How it works

Industry Dive is a business journalism company that provides in-depth, industry-specific news and analysis. Through the DiveAccess API, you can fetch articles from Industry Dive publications including Retail Dive, Food Dive, Marketing Dive, CIO Dive and many more. A full list of Industry Dive's 37 publications can be found here.

Each published article is indexed in Elasticsearch and ordered by date. A call to the API base URL without any parameters will return the 10 most recent articles published across all publications. Adding the query parameter will perform a keyword index search on Industry Dive's content and return the most relevant articles for that query, ordered by date.

Base URL

The base URL is the same for all requests. Only GET requests are supported.

https://api.diveaccess.com/articles

Authentication

To authenticate a request, add your access_key as a query parameter to the base URL. We recommend storing the key in a secrets manager rather than in source code.

https://api.diveaccess.com/articles?access_key=YOUR_ACCESS_KEY
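
One common way to keep the key out of source code is to read it from an environment variable that a secrets manager populates at deploy time. The sketch below assumes a variable named DIVEACCESS_ACCESS_KEY; that name and the load_access_key helper are illustrative choices, not something the DiveAccess API defines.

```python
import os


def load_access_key(env=None, var_name="DIVEACCESS_ACCESS_KEY"):
    """Read the access key from the environment instead of hard-coding it.

    The variable name DIVEACCESS_ACCESS_KEY is an assumption for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

A script would then call ACCESS_KEY = load_access_key() once at startup and fail early if the key is missing.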

Making Requests

The DiveAccess API supports several query parameters to refine your search. None of these search parameters is required; only access_key must accompany every request.

  • query: A URL-encoded search term (e.g., artificial%20intelligence)
  • from_date: Start date for the article search (format: YYYY-MM-DD). The default is 30 days ago.
  • to_date: End date for the article search (format: YYYY-MM-DD). The default is today.

Responses are returned in JSON format and contain an array of article objects. Each object includes properties such as title, source, published_at, and description.

Additional query parameters are supported and will be made available on paid plans in the near future.

import requests

BASE_URL = "https://api.diveaccess.com/articles"
ACCESS_KEY = "YOUR_ACCESS_KEY_HERE"

params = {
    "access_key": ACCESS_KEY,
    "query": "artificial intelligence",
    "from_date": "2023-01-01",
    "to_date": "2023-12-31"
}

response = requests.get(BASE_URL, params=params)

if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print(f"Error: {response.status_code}, {response.text}")
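
The from_date and to_date parameters take YYYY-MM-DD strings, so a rolling window such as "the last 7 days" has to be computed before the request is made. The helper below is a minimal sketch; date_window_params is a hypothetical name, and because the documentation does not say which time zone the API uses, the sketch simply uses the local date.

```python
from datetime import date, timedelta


def date_window_params(days, today=None):
    """Build from_date/to_date values covering the last `days` days."""
    end = today or date.today()
    start = end - timedelta(days=days)
    # date.isoformat() produces the YYYY-MM-DD format the API expects.
    return {"from_date": start.isoformat(), "to_date": end.isoformat()}


# Merge the window into the other request parameters.
params = {"query": "artificial intelligence", **date_window_params(7)}
print(params)
```

The resulting dictionary can be combined with access_key and passed as params to requests.get in the same way as the example above.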

Example Response

Below is an example of the JSON response you will receive from the DiveAccess API. The response contains two main parts: num_found and article_set.

  • num_found: The total number of articles that match your query criteria. We plan to enable access for iterating through the entirety of this list for paid subscribers in the future.
  • article_set: An array of article objects, each containing detailed information about a single article.

Each article object in the article_set includes the following key information:

  • category: The top level category of the article as indicated by Google NLP categorization.
  • topics: The mid-level category of the article as indicated by Google NLP categorization.
  • byline: The author's name as it appears in the article.
  • description: The HTML content of the article in its entirety.
  • title: The title of the article.
  • created_at and published_at: Timestamps for when the article was ingested by our system and published, respectively.
  • author_set: Detailed information about the article's author(s).
  • licensed: Indicates if the article is licensed for use in RAG.
  • link: The original URL of the article.
  • publisher and source: Information about the publisher and the specific publication.
  • guid: A unique identifier for the article.
  • image_set: An array of images associated with the article (empty in this example).
  • metadata: Additional metadata, including tags and extracted features such as article_description_wordcount, which can be used to estimate token counts.
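
The fields listed above can be pulled out of a parsed response with ordinary dictionary access. The sketch below reduces each article to a few fields and a rough token estimate derived from the word count. The summarize_articles helper and the ratio of 1.3 tokens per word are assumptions for illustration; the actual ratio depends on the tokenizer.

```python
def summarize_articles(payload, tokens_per_word=1.3):
    """Reduce a parsed response to title, source, link, and a token estimate.

    tokens_per_word is an assumed heuristic, not a value from the API docs.
    """
    summaries = []
    for article in payload.get("article_set", []):
        features = article.get("metadata", {}).get("extracted_features", {})
        words = features.get("article_description_wordcount", 0)
        summaries.append({
            "title": article.get("title"),
            "source": article.get("source", {}).get("name"),
            "link": article.get("link"),
            "estimated_tokens": round(words * tokens_per_word),
        })
    return summaries


# Hand-written sample shaped like a DiveAccess response.
sample = {
    "num_found": 1,
    "article_set": [{
        "title": "Example title",
        "link": "https://www.retaildive.com/",
        "source": {"name": "Retail Dive"},
        "metadata": {"extracted_features": {"article_description_wordcount": 1000}},
    }],
}
print(summarize_articles(sample))
```

Using .get with defaults keeps the helper from failing if an article lacks one of the optional fields.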

 

{
  "num_found": 772,
  "article_set": [
    {
      "category": {
        "name": "Business",
        "dashed_name": "business"
      },
      "topics": [
        {
          "guid": "abcdefg123456",
          "name": "corporate events",
          "dashed_name": "corporate-events"
        }
      ],
      "byline": "Luke Skywalker",
      "description": "<h1>Senate approves PTO</h1><p>The Galactic Senate has approved a new bill requiring all employers in the Core Worlds to provide paid time off for Jedi training.</p>",
      "title": "Coruscant Employers Must Now Provide Paid Leave for Jedi Training",
      "created_at": "2023-11-27 21:17:01",
      "author_set": [
        {
          "guid": "b92eba1674b0d77b86f46f37604f0123",
          "first_name": "Luke",
          "last_name": "Skywalker",
          "name": "Luke Skywalker"
        }
      ],
      "licensed": "True",
      "link": "https://www.retaildive.com/news/sephora-exits-korea/710823/",
      "published_at": "2023-11-27 20:59:05",
      "score": "1.0",
      "publisher": {
        "name": "Industry Dive"
      },
      "source": {
        "name": "Retail Dive",
        "website": "https://www.retaildive.com/",
        "state": "IL",
        "country": "US",
        "guid": "38aa2cd3e7a7cd6bc96d88cc8bbdl123",
        "thumbnail": "http://www.google.com/s2/favicons?domain=www.industrydive.com"
      },
      "guid": "4e2427ec8d6a11ee9fb802420a0e1234",
      "image_set": [],
      "metadata": {
        "tags": ["retail", "business", "employment"],
        "extracted_features": {
          "article_description_wordcount": 1230,
          "article_title_wordcount": 11,
          "article_inline_image_count": 0
        }
      }
    }
  ]
}

Response Codes

  • 200 OK: The request was successful, and the response contains the requested data.
  • 401 Unauthorized: The request lacks valid authentication credentials. Check your access_key.
  • 429 Too Many Requests: You have exceeded the number of API calls in your plan. Upgrade to a paid plan for higher limits.
  • 500 Internal Server Error: The server encountered an unexpected condition that prevented it from fulfilling the request. If this persists, please contact support.
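
The response codes listed in this section can be turned into explicit errors so that calling code does not try to parse a failed response. The sketch below is one possible arrangement: DiveAccessError, check_status, and is_retryable are hypothetical names, and treating only 500 as retryable is an assumption made here, since a 429 reflects a plan limit rather than a transient fault.

```python
class DiveAccessError(Exception):
    """Hypothetical exception type used only in this sketch."""


# Hints paraphrased from the documented response codes.
STATUS_HINTS = {
    401: "Invalid or missing credentials; check your access_key.",
    429: "Plan limit for API calls exceeded.",
    500: "Server error; contact support if it persists.",
}


def check_status(status_code, body=""):
    """Return None on 200, otherwise raise DiveAccessError with a hint."""
    if status_code == 200:
        return None
    hint = STATUS_HINTS.get(status_code, "Unexpected status code.")
    raise DiveAccessError(f"{status_code}: {hint} {body}".strip())


def is_retryable(status_code):
    """Treat only 500 as worth retrying; this policy is an assumption."""
    return status_code == 500
```

With the requests library, check_status(response.status_code, response.text) would be called before response.json().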

OpenAI Tool Use

The DiveAccess API can be used with OpenAI's function calling feature so that a model can fetch up-to-date, industry-specific articles and incorporate them into its responses. The main benefit is access to the latest industry news and trends from Industry Dive's publications at the time the model answers.

This example uses the DiveAccess API as a tool for retrieving relevant articles based on user queries. It walks through the process of:

  1. Setting up the necessary functions and API calls
  2. Defining the tool (function) for OpenAI to use
  3. Making the initial API call to OpenAI
  4. Handling the tool call and fetching articles from the DiveAccess API
  5. Making a second API call to OpenAI with the retrieved information

By following this example, you'll be able to create a powerful AI assistant that can provide informed responses on various industry topics, backed by the latest articles from Industry Dive publications.

from openai import OpenAI
import requests
import json

client = OpenAI()  # reads OPENAI_API_KEY from the environment
BASE_URL = "https://api.diveaccess.com/articles"
ACCESS_KEY = "YOUR_ACCESS_KEY_HERE"

def get_articles(query, from_date=None, to_date=None):
    """
    Fetch articles from the DiveAccess API based on the given query and optional date range.
    Returns a JSON string so it can be passed back to the model as tool output.
    """
    params = {
        "access_key": ACCESS_KEY,
        "query": query,
    }
    if from_date:
        params["from_date"] = from_date
    if to_date:
        params["to_date"] = to_date

    response = requests.get(BASE_URL, params=params)
    if response.status_code == 200:
        return json.dumps(response.json())
    else:
        return json.dumps({
            "error": f"Error: {response.status_code}, {response.text}"
        })

# Tool definition the model uses to decide when and how to call get_articles.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_articles",
            "description": "Get articles from Industry Dive publications based on keywords extracted from user input. " +
                           "Tool will return a list of articles. " +
                           "Should be used when the message indicates the need for information about a specific topic.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query for articles extracted from the user message."
                    },
                    "from_date": {
                        "type": "string",
                        "description": "Start date for article search (YYYY-MM-DD). " +
                                       "The default is 30 days ago. Only include when a specific time frame is necessary.",
                    },
                    "to_date": {
                        "type": "string",
                        "description": "End date for article search (YYYY-MM-DD). " +
                                       "The default is today. Only include when a specific time frame is necessary.",
                    },
                },
                "required": ["query"],
            },
        },
    }
]

def run_conversation():
    prompt = (
        "What are the latest developments in artificial intelligence? "
        "Include sources and links where applicable."
    )
    messages = [{
        "role": "user",
        "content": prompt
    }]

    # First API call: the model decides whether to call the tool.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )
    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls

    if tool_calls:
        # Keep the assistant message that requested the tool calls in the history.
        messages.append(response_message)
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_args = json.loads(tool_call.function.arguments)
            function_response = get_articles(
                query=function_args.get("query"),
                from_date=function_args.get("from_date"),
                to_date=function_args.get("to_date"),
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )

        # Second API call: the model answers using the retrieved articles.
        second_response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
        )

        return second_response
    else:
        return response

# Print only the assistant's text rather than the whole response object.
print(run_conversation().choices[0].message.content)
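
Because each article's description holds the full HTML body, passing the raw API response to the model as tool output can consume a large share of the context window. The sketch below shows one way to trim the payload first, using the article_description_wordcount value as a rough size measure. The compact_tool_output helper, the fields it keeps, and the default limits are all assumptions for illustration.

```python
import json


def compact_tool_output(payload, max_articles=3, max_words=2000):
    """Keep a few fields per article and stop once a word budget is reached.

    The limits and the choice of fields are assumptions for this sketch.
    """
    kept, used = [], 0
    for article in payload.get("article_set", [])[:max_articles]:
        features = article.get("metadata", {}).get("extracted_features", {})
        words = features.get("article_description_wordcount", 0)
        # Always keep the first article; stop before exceeding the budget.
        if kept and used + words > max_words:
            break
        used += words
        kept.append({
            key: article.get(key)
            for key in ("title", "link", "published_at", "description")
        })
    return json.dumps({"num_found": payload.get("num_found"), "article_set": kept})
```

In the example above, get_articles could return compact_tool_output(response.json()) in place of json.dumps(response.json()).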

Claude/Anthropic Tool Use

The DiveAccess API can be used with Anthropic's Claude models through tool use, Anthropic's term for function calling, so that Claude can fetch up-to-date, industry-specific articles and incorporate them into its responses. The main benefit is access to the latest industry news and trends from Industry Dive's publications at the time the model answers.

This example uses the DiveAccess API as a tool for retrieving relevant articles based on user queries with Claude. It walks through the process of:

  1. Setting up the necessary functions and API calls
  2. Defining the tool (function) for Claude to use
  3. Making the initial API call to Claude
  4. Handling the tool call and fetching articles from the DiveAccess API
  5. Making a second API call to Claude with the retrieved information

By following this example, you'll be able to create a powerful AI assistant that can provide informed responses on various industry topics, backed by the latest articles from Industry Dive publications.

import anthropic
import requests
import json

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
BASE_URL = "https://api.diveaccess.com/articles"
ACCESS_KEY = "YOUR_ACCESS_KEY_HERE"

def get_articles(query, from_date=None, to_date=None):
    """
    Fetch articles from the DiveAccess API based on the given query and optional date range.
    """
    params = {
        "access_key": ACCESS_KEY,
        "query": query,
    }
    if from_date:
        params["from_date"] = from_date
    if to_date:
        params["to_date"] = to_date

    response = requests.get(BASE_URL, params=params)
    if response.status_code == 200:
        return response.json()
    else:
        return {"error": f"Error: {response.status_code}, {response.text}"}

# Tool definition Claude uses to decide when and how to call get_articles.
tools = [
    {
        "name": "get_articles",
        "description": "Get articles from Industry Dive publications based on keywords extracted from user input. " +
                       "Tool will return a list of articles. " +
                       "Should be used when the message indicates the need for information about a specific topic.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query for articles extracted from the user message.",
                },
                "from_date": {
                    "type": "string",
                    "description": "Start date for article search (YYYY-MM-DD). " +
                                   "The default is 30 days ago. Only include when a specific time frame is necessary.",
                },
                "to_date": {
                    "type": "string",
                    "description": "End date for article search (YYYY-MM-DD). " +
                                   "The default is today. Only include when a specific time frame is necessary.",
                },
            },
            "required": ["query"]
        }
    }
]

def run_conversation():
    prompt = (
        "What are the latest developments in artificial intelligence? "
        "Include sources and links where applicable."
    )

    # First API call to get tool use
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        tools=tools,
        messages=[{
            "role": "user",
            "content": prompt
        }]
    )

    # Check if the model used the tool
    tool_use = next((content for content in response.content if content.type == "tool_use"), None)

    if tool_use:
        # If tool was used, fetch articles and make a second API call
        articles = get_articles(**tool_use.input)

        second_response = client.messages.create(
            model="claude-3-5-sonnet-20240620",
            max_tokens=1024,
            tools=tools,
            messages=[
                {
                    "role": "user",
                    "content": prompt
                },
                {
                    "role": "assistant",
                    "content": response.content
                },
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "tool_result",
                            "tool_use_id": tool_use.id,
                            "content": json.dumps(articles)
                        }
                    ]
                }
            ]
        )

        return second_response

    else:
        # If no tool was used, return the first response
        return response

print(run_conversation())
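
The example above prints the whole response object. To show only the assistant's answer, the text blocks in the response content can be joined into one string. The sketch below uses stand-in objects with the same type and text attributes as the SDK's content blocks so that it runs without an API call; response_text is a hypothetical helper name.

```python
from types import SimpleNamespace


def response_text(message):
    """Join the text blocks of a Messages API response into one string."""
    return "".join(
        block.text for block in message.content if block.type == "text"
    )


# Stand-in object shaped like an SDK response, for illustration only.
fake_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Here is what I found. "),
    SimpleNamespace(type="tool_use", id="toolu_example", name="get_articles", input={}),
    SimpleNamespace(type="text", text="See the linked articles."),
])
print(response_text(fake_response))
```

In the example above, the final line would become print(response_text(run_conversation())).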