Query Data by DOI¶

This notebook shows how to fetch information about a specific publication identified by its DOI, and handle API errors.

Related Notebooks:

ORCID Notebook
Query for researchers' data by passing an ORCID to the Augment API. Visualise co-author relationships in a graph.
Publications Notebook
Extract a publications list for a researcher in BibTeX format. Visualise publication counts with a bar plot and generate a keyword word cloud.
Affiliations Notebook
Query researchers and affiliations by passing an ORCID to the API. Extract the geolocation data and map affiliations data on a world map. Plot researcher-organisation relationships in a graph.

In [ ]:

import sys
sys.path.append('../')

# Packages to use API
import requests
import json

# packages to read API_KEY
import os
from os.path import join, dirname
from dotenv import load_dotenv
load_dotenv();

API Errors¶

To use the API, store your API_KEY and the DOI you want to look up in variables and insert them into the URL string. The Python requests package then sends that URL to the API and returns the data. This section shows two common errors you may encounter when using the Augment API: the DOI passed is invalid, or the API_KEY was not loaded successfully from your environment file.
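The URL construction described above can be wrapped in a small function that also catches a missing key before any request is sent. This is a minimal sketch: `build_doi_url` and `BASE_URL` are hypothetical names introduced here for illustration and are not part of the Augment API, and `"demo-key"` is a placeholder rather than a real subscription key.

```python
# Endpoint used throughout this notebook
BASE_URL = "https://augmentapi.researchgraph.com/v1/doi"


def build_doi_url(doi, api_key):
    """Return the DOI query URL (hypothetical helper, not part of the API)."""
    # os.environ.get("API_KEY") returns None when the key was never loaded,
    # so reject None and empty strings before any request is sent.
    if not api_key:
        raise ValueError("API_KEY is missing; check your environment file")
    return f"{BASE_URL}/{doi}?subscription-key={api_key}"


# "demo-key" is a placeholder, not a real subscription key
print(build_doi_url("10.1038/sdata.2018.99", "demo-key"))
```

Failing early like this separates a local configuration problem from an error reported by the API itself.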

DOI Not Found¶

Here we assign an invalid value to the DOI variable. When an error occurs, requests.get() still returns a response object; its status code indicates the type of error, and its body carries an error message.

In [ ]:

# DOI does not exist
API_KEY = os.environ.get("API_KEY")
DOI = "10.1038/XXXX"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)

if r.status_code == 400:
    print(r.json()[0]["error"])

Missing API_KEY¶

You will receive an authentication error (status code 401) if the API_KEY is missing or invalid.

In [ ]:

# Missing API_KEY
API_KEY = ''
DOI = "10.1038/sdata.2018.99"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)

if r.status_code == 401:
    print('Authentication error.', r.json()['message'])
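
The two error cases can also be handled in one place. The sketch below assumes the response shapes match what the cells in this section read: a list whose first item has an "error" key for status 400, and a dictionary with a "message" key for status 401. `explain_error` is a hypothetical helper, and the payloads are made-up stand-ins for r.json(), not real API output.

```python
def explain_error(status_code, payload):
    """Turn a failed response into a readable message (hypothetical helper)."""
    if status_code == 400:
        # Shape read in the "DOI Not Found" cell: a list whose first item has "error"
        return f"Bad request: {payload[0]['error']}"
    if status_code == 401:
        # Shape read in the "Missing API_KEY" cell: a dictionary with "message"
        return f"Authentication error: {payload['message']}"
    return f"Unexpected status code {status_code}"


# Made-up payloads standing in for r.json(), not real API output
print(explain_error(400, [{"error": "example: DOI not found"}]))
print(explain_error(401, {"message": "example: invalid subscription key"}))
```

With a real response, this could be called as `explain_error(r.status_code, r.json())` whenever `r.status_code` is not 200. Other status codes may return a body in a different shape, which is why the function only reads the payload for the two documented cases.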

Extract Publications by DOI¶

For a valid DOI, the API returns a list whose first element is a nested dictionary holding all the data connected to the requested DOI. The first level has 3 keys, as printed by the block below.

In [ ]:

# DOI does exist
API_KEY = os.environ.get("API_KEY")
DOI = "10.1038/sdata.2018.99"

url = f'https://augmentapi.researchgraph.com/v1/doi/{DOI}?subscription-key={API_KEY}'
r = requests.get(url)

# print a short confirmation on completion
print('Augment API query complete ', r.status_code)
# Show the top-level keys of the returned record
print('The data returned has the following fields: ', r.json()[0].keys())

Under 'nodes', the data is grouped into 5 labels from the ResearchGraph schema:

In [ ]:

r.json()[0]["nodes"].keys()
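
To see how much data sits under each label, the records per label can be counted. This is a sketch on a made-up payload that mimics the shape of r.json(): only the "publications" label is confirmed by this notebook, the "researchers" label and all values are illustrative, and `count_records` is a hypothetical helper.

```python
# Made-up payload mimicking the shape of r.json(); only "publications" is a
# label confirmed by this notebook, "researchers" is illustrative.
sample = [{
    "nodes": {
        "publications": [{"doi": "10.1000/a"}, {"doi": "10.1000/b"}],
        "researchers": [{"key": "example-researcher"}],
    }
}]


def count_records(payload):
    """Map each label under "nodes" to its number of records (hypothetical helper)."""
    return {label: len(records) for label, records in payload[0]["nodes"].items()}


print(count_records(sample))
```

With a real response, `count_records(r.json())` would report the same counts for whatever labels the API returned.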

Each label above is stored as a list of dictionaries. To extract the publication we need, iterate through the list and check for the DOI.

In [ ]:

# Start with no match so the check below is safe even if the request failed
publication = None
if r.status_code == 200 and r.json()[0]["nodes"]["publications"]:
    publications = r.json()[0]["nodes"]["publications"]

    # Keep the record whose DOI matches the one we asked for
    for candidate in publications:
        if candidate["doi"] == DOI:
            publication = candidate
            break

if publication is None:
    print(f'No publication with DOI {DOI} was found.')
else:
    print()
    print(f'DOI: {publication["doi"]}')
    print(f'Authors: {publication["authors_list"]}')
    print(f'Title: {publication["title"]}')
    print(f'Publication year: {publication["publication_year"]}')
    print()
    print(f'The publication "{publication["title"]}" is connected to {r.json()[0]["stats"]}.')
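
The same lookup can be written as a reusable function that returns None when nothing matches, so the caller can decide how to report a miss. This is a sketch under assumptions: `find_publication` is a hypothetical helper, and the payload is made up to mimic the shape of r.json().

```python
def find_publication(payload, doi):
    """Return the publication record matching doi, or None (hypothetical helper)."""
    publications = payload[0]["nodes"].get("publications", [])
    # next() stops at the first match and falls back to None when nothing matches
    return next((p for p in publications if p.get("doi") == doi), None)


# Made-up payload standing in for r.json()
sample = [{"nodes": {"publications": [
    {"doi": "10.1000/a", "title": "Example A"},
    {"doi": "10.1000/b", "title": "Example B"},
]}}]

match = find_publication(sample, "10.1000/b")
print(match["title"] if match else "not found")
print(find_publication(sample, "10.1000/missing"))
```

Using `.get()` for the "publications" and "doi" keys keeps the function from raising a KeyError when a response lacks either of them.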