OpenArticleGauge API Documentation

This document contains the specification discussions for the OAG API.

Lookup Endpoint

Use this endpoint to find the licensing conditions of articles identified by either DOI or PMID.

  • POST /lookup
  • GET /lookup
  • GET /lookup/ID1
  • GET /lookup/ID1,...,IDn, where each IDx is an identifier string

The maximum number of IDs allowed in a single request is 1000. The frontend will abort with HTTP 400 if more than 1000 IDs are given.
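
A client holding more identifiers than the limit can split them into batches before calling the endpoint. The following is a minimal sketch of that idea; the helper name `batch_identifiers` and the constant `MAX_IDS_PER_REQUEST` are illustrative and not part of the OAG API.

```python
from typing import Iterator, List, Sequence

MAX_IDS_PER_REQUEST = 1000  # limit stated in this document


def batch_identifiers(
    ids: Sequence[str], size: int = MAX_IDS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

Each batch can then be sent as its own lookup request, which keeps every request under the HTTP 400 threshold.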

For GET, the endpoint uses content negotiation to return either HTML or JSON. The HTML response is a simple web page that accepts query parameters to build a query.

    GET /lookup/12345,67890 HTTP/1.1
    Accept: application/json
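
The same request could be prepared from Python roughly as follows. This is a sketch only: the host in `BASE_URL` is a placeholder, the helper name is hypothetical, and leaving the "/" inside a DOI unencoded in the path is an assumption about what the service accepts.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://example.org"  # placeholder host, NOT the real OAG address


def build_lookup_request(ids, base_url=BASE_URL):
    """Build (but do not send) a GET /lookup/ID1,...,IDn request for JSON."""
    if not ids:
        raise ValueError("at least one identifier is required")
    # quote() keeps "/" by default; whether OAG expects this is an assumption.
    path = ",".join(quote(identifier) for identifier in ids)
    return Request(
        f"{base_url}/lookup/{path}",
        headers={"Accept": "application/json"},  # ask for JSON, not HTML
    )
```

The returned object can be passed to `urllib.request.urlopen` once `BASE_URL` points at a real deployment.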

For POST, expected data is either a JSON list of BibJSON identifier objects, or a simple list of identifiers if no more is known about them:

    [
        "10.1234/1389849...",
        "14754839", "10....."
    ]

or

    [
        {
            "id":".......",
            "type":"......if known"
        }
    ]

The type parameter MUST be either "doi" or "pmid" or the identifier will not be recognised.
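
A POST body in either of the two shapes can be produced with a small helper such as the one sketched here. The function name is hypothetical, and refusing to mix bare identifiers with typed ones is a conservative assumption, since this document only describes the two shapes separately.

```python
import json

ALLOWED_TYPES = {"doi", "pmid"}  # the only types this document says are recognised


def build_post_body(identifiers):
    """Serialise identifiers for POST /lookup.

    `identifiers` is either a list of strings (types unknown) or a list of
    (id, type) pairs.
    """
    if all(isinstance(item, str) for item in identifiers):
        return json.dumps(list(identifiers))
    payload = []
    for item in identifiers:
        if isinstance(item, str):
            # Assumption: the two body shapes are not combined in one request.
            raise ValueError("do not mix bare identifiers with typed ones")
        identifier, id_type = item
        if id_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported identifier type: {id_type!r}")
        payload.append({"id": identifier, "type": id_type})
    return json.dumps(payload)
```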

Both GET and POST respond properly to JSONP requests.

If a simple list of identifiers is provided (without a type parameter), OAG will attempt to determine whether they are DOIs or PMIDs and interpret them accordingly. If it cannot identify the type of an identifier, it will not be able to determine that item's status.
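
This document does not describe how OAG performs that detection. A client that wants to pre-check its identifiers before sending them could use a heuristic like the one below, which is purely an illustration: it treats strings beginning with "10." followed by a registrant code and a "/" as DOIs, and all-digit strings as PMIDs. OAG's own rules may differ.

```python
import re
from typing import Optional

# Heuristic patterns chosen for this sketch, not taken from OAG.
_DOI_RE = re.compile(r"^10\.\d+(\.\d+)*/\S+$")
_PMID_RE = re.compile(r"^\d+$")


def guess_identifier_type(raw: str) -> Optional[str]:
    """Return "doi", "pmid", or None when the string matches neither pattern."""
    candidate = raw.strip()
    if candidate.lower().startswith("doi:"):
        candidate = candidate[4:]  # accept the "doi:" prefixed form too
    if _DOI_RE.match(candidate):
        return "doi"
    if _PMID_RE.match(candidate):
        return "pmid"
    return None
```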

Both POST and GET, when asked for JSON, respond with an object as follows:

    {
        "requested": NUMBER_REQUESTED,
        "results": [
            {BIBJSON_RECORD}, ...
        ],
        "errors":[
            {
                "identifier" : BIBJSON_IDENTIFIER_OBJECT, 
                "error" : "...description..."
            },
            ...
        ],
        "processing":[
            {
                "identifier" : BIBJSON_IDENTIFIER_OBJECT
            },
            ...
        ]
    }

Some useful definitions of elements used here are:

  • requested - The number of ids requested by the client
  • results - a list of BibJSON records for all items that OAG knows the licensing conditions for already
  • errors - a list of the identifier records that represent the original identifiers provided by the client which OAG was unable to process for any one of a variety of reasons. The "error" key in the resulting object contains a text description of the reason for failure.
  • processing - a list of the identifier records that represent the original identifiers provided by the client which OAG is currently processing.
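
Once the JSON response has been parsed into a dictionary, the four elements can be condensed into a quick status summary. The sketch below assumes only the keys defined above; the function name and the keys of the returned dictionary are illustrative.

```python
def summarise_response(response):
    """Count the entries in each section of a parsed lookup response."""
    errors = response.get("errors", [])
    return {
        "requested": response.get("requested"),
        "resolved": len(response.get("results", [])),
        "failed": len(errors),
        "pending": len(response.get("processing", [])),
        # Text descriptions of why each failed identifier could not be processed.
        "error_messages": [entry.get("error", "") for entry in errors],
    }
```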

When the client receives its first response from OAG, it is likely that many items will be in the "processing" queue. OAG will look up the licence conditions for these items as quickly as possible and have them ready to make available, but it will not notify the client. The recommended mode of usage is for the client to poll either its original request URL, or the URLs of all of the items in the "processing" queue, in order to retrieve updates. OAG does not guarantee a particular timescale within which licence information will be available.
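
A polling loop along these lines might look like the sketch below. It is not an official client: `fetch` is a caller-supplied function that performs the actual lookup and returns the parsed JSON response, and the interval and attempt limit are arbitrary values chosen for illustration, since OAG gives no timing guarantees. After each response, only the identifiers still listed under "processing" are requested again.

```python
import time


def poll_until_done(fetch, ids, interval=5.0, max_attempts=10, sleep=time.sleep):
    """Call fetch(ids) repeatedly until nothing is left in "processing".

    Returns (results, errors, still_pending_ids).
    """
    results, errors = [], []
    pending = list(ids)
    for attempt in range(max_attempts):
        response = fetch(pending)
        results.extend(response.get("results", []))
        errors.extend(response.get("errors", []))
        # Each processing entry wraps a BibJSON identifier object with an "id".
        pending = [
            entry["identifier"]["id"] for entry in response.get("processing", [])
        ]
        if not pending:
            break
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    return results, errors, pending
```

Passing `fetch` and `sleep` as parameters keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.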

OAG BibJSON record

An OAG formatted BibJSON record will have some or all of the fields laid out in the example below.

    {
        "title": "Open Bibliography for Science, Technology and Medicine",
        "author":[
            {"name": "Richard Jones"},
            {"name": "Mark MacGillivray"},
            ...
        ],
        "year": "2011",
        "journal": {"name": "Journal of Cheminformatics"},
        "link": [{"url":"http://www.jcheminf.com/content/3/1/47"}],
        "identifier": [
            {
                "type":"doi",
                "id":"10.1186/1758-2946-3-47",
                "canonical":"doi:10.1186/1758-2946-3-47"
            }
        ],
        "license": [
            {
                "status": "active",
                "maintainer": "",
                "description": "",
                "family": "",
                "title": "Creative Commons Attribution",
                "domain_data": true/false,
                "url": "http://www.opendefinition.org/licenses/cc-by",                
                "version": "", 
                "domain_content": true/false,
                "is_okd_compliant": true/false,
                "is_osi_compliant": true/false,
                "domain_software": true/false,
                "type": "cc-by",
                "jurisdiction": "",
                "open_access": true/false,
                "BY": true/false,
                "NC": true/false,
                "ND": true/false,
                "SA": true/false,

                "provenance": {
                    "category": "page_scrape",
                    "description": "how the content was acquired ...",
                    "agent": "OpenArticleGauge Service/0.1 alpha",
                    "source": "http://www.plosbiology.org/article/info%3Adoi%2F10...",
                    "source_size" : "3456",
                    "date": "2013-02-16T21:51:54.669040",
                    "handler" : "myplugin",
                    "handler_version" : "1.0"
                }

            }
        ]
    }

Note that OAG BibJSON records will often have sparse to non-existent bibliographic metadata. If you want bibliographic metadata, you should look to a service such as CrossRef.

Some useful definitions of elements used here are:

  • identifier - the list of identifiers in the BibJSON record will contain at least one which has your originally requested ID.
  • license - (note the US spelling) the list of licenses seen for this item. A record may contain an arbitrary number of licence records, but the client should present only the most recent licence statement to an end user as the definitive one. Licence statements are expressions of the Open Definition of the known licence.
  • license/open_access - boolean indicating whether the OAG service considers this licence to be "Open Access".
  • license/BY - boolean indicating whether the licensing conditions for this item require Attribution (e.g. CC-BY)
  • license/NC - boolean indicating whether the licensing conditions for this item stipulate a Non Commercial clause (e.g. CC-NC)
  • license/ND - boolean indicating whether the licensing conditions for this item indicate that No Derivatives are allowed (e.g. CC-ND)
  • license/SA - boolean indicating whether the licensing conditions for this item require that any derivative works are Share Alike (e.g. CC-SA)
  • license/provenance - Contains information about how this particular license statement was obtained.
  • license/provenance/category - the type of acquisition process that was used to acquire the license information. Should be one of:
    • page_scrape - the content was scraped from an HTML page
    • xml_api - the content was acquired by interrogating an XML-based API provided by the provider
  • license/provenance/source and license/provenance/source_size - The URL of the actual resource we looked at, and the size of it in bytes
  • license/provenance/date - the date that this licence was acquired. This is the field that developers should use to determine the most recent licence record, which is considered by OAG to be the current licence conditions of the item.
  • license/provenance/handler - the name of the plugin which handled this license acquisition
  • license/provenance/handler_version - the version of the plugin which handled this license acquisition
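
Selecting the current licence from a record by its provenance date can be sketched as follows. The helper name is hypothetical, and the code assumes every date uses the ISO 8601 form shown in the example record (no timezone suffix). Licence entries without a provenance date are skipped.

```python
from datetime import datetime


def latest_license(record):
    """Return the licence with the newest provenance date, or None."""
    dated = []
    for licence in record.get("license", []):
        stamp = licence.get("provenance", {}).get("date")
        if stamp:
            # Assumes a format such as "2013-02-16T21:51:54.669040".
            dated.append((datetime.fromisoformat(stamp), licence))
    if not dated:
        return None
    return max(dated, key=lambda pair: pair[0])[1]
```

The returned dictionary carries the open_access, BY, NC, ND and SA flags described above, so a client can inspect them directly on the most recent statement.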