Using {openFDA} can be a breeze, if you know how to construct good queries. This short guide will get you started with {openFDA}, and show you how to put together more complicated queries.
The openFDA API makes public FDA data available from a simple, public API. Users of this API have access to FDA data on food, human and veterinary drugs, devices, and more. You can read all about it at their website.
The simplest way to query the openFDA API is to identify the endpoint you want to use and provide other search terms. For example, this snippet retrieves 1 record about adverse events in the drugs endpoint. The empty search string ("") means the results will be non-specific.
search <- openFDA(search = "", endpoint = "drug-event", limit = 1)
search
#> <httr2_response>
#> GET https://api.fda.gov/drug/event.json?api_key=[API_KEY]&search=&limit=1
#> Status: 200 OK
#> Content-Type: application/json
#> Body: In memory (1819 bytes)
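To make the request shown in the output above less opaque, here is a minimal base-R sketch of how such a URL could be assembled by hand. The helper build_openfda_url() is hypothetical and is not part of {openFDA}, which builds its requests with {httr2}; the base URL and the drug/event path are taken from the printed response, and the API key is left out.

```r
# Hypothetical helper (not part of {openFDA}): assemble a request URL by hand.
build_openfda_url <- function(path, search = "", limit = 1) {
  base <- "https://api.fda.gov"
  # Percent-encode the search string so spaces and quotes survive in a URL
  query <- paste0(
    "search=", utils::URLencode(search, reserved = TRUE),
    "&limit=", limit
  )
  paste0(base, "/", path, ".json?", query)
}

# Same endpoint and parameters as the snippet above, minus the API key
url <- build_openfda_url("drug/event", search = "", limit = 1)
print(url)
```

In practice openFDA() does this for you; the sketch only shows which parts of the URL correspond to the endpoint, search and limit arguments.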
openFDA() returns an {httr2} response object, with attached JSON data. We use httr2::resp_body_json() to extract the underlying data.
json <- httr2::resp_body_json(search)
If you don’t specify a field to count on, the JSON data has two sections - meta and results.
The meta section has important metadata on your results, which includes:
disclaimer - An important disclaimer regarding the data provided by openFDA.
license - A webpage with license terms that govern the openFDA API.
last_updated - The last date when this openFDA endpoint was updated.
results.skip - How many results were skipped? Set by the skip parameter in openFDA().
results.limit - How many results were retrieved? Set by the limit parameter in openFDA().
results.total - How many results were there in total matching your search criteria?
json$meta
#> $disclaimer
#> [1] "Do not rely on openFDA to make decisions regarding medical care. While we make every effort to ensure that data is accurate, you should assume all results are unvalidated. We may limit or otherwise restrict your access to the API in line with our Terms of Service."
#>
#> $terms
#> [1] "https://open.fda.gov/terms/"
#>
#> $license
#> [1] "https://open.fda.gov/license/"
#>
#> $last_updated
#> [1] "2024-07-30"
#>
#> $results
#> $results$skip
#> [1] 0
#>
#> $results$limit
#> [1] 1
#>
#> $results$total
#> [1] 18029782
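The skip, limit and total values in the meta block can be combined to work out how much data is left to retrieve. The sketch below uses a mock list that mirrors the output above, so it runs without calling the API. The page size of 1000 is an assumption for illustration, and whether the API allows paging that far is not covered here.

```r
# Mock of the meta$results block shown above (values copied from the output)
meta <- list(results = list(skip = 0, limit = 1, total = 18029782))

# Records not yet seen after this request
remaining <- meta$results$total - (meta$results$skip + meta$results$limit)

# Requests needed to cover every match at an assumed page size
page_size <- 1000  # assumption, chosen only for illustration
n_requests <- ceiling(meta$results$total / page_size)

print(remaining)
print(n_requests)
```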
For non-count queries, the results section will be a set of records which were found in the endpoint and match your search term.
count-ing
If you set the count query, then the openFDA API will not return full records. Instead, it will count the number of records for each member in the openFDA field you specified for count. For example, let’s look at drug manufacturers in the Drugs@FDA endpoint. The search string is left empty, so the counts cover every record in the endpoint. We’ll use the limit parameter to limit our results to the first 3 drug manufacturers found.
count <- openFDA(search = "",
                 endpoint = "drug-drugsfda",
                 limit = 3,
                 count = "openfda.manufacturer_name.exact") |>
  httr2::resp_body_json()
count$results
#> [[1]]
#> [[1]]$term
#> [1] "Aurobindo Pharma Limited"
#>
#> [[1]]$count
#> [1] 393
#>
#>
#> [[2]]
#> [[2]]$term
#> [1] "Zydus Lifesciences Limited"
#>
#> [[2]]$count
#> [1] 326
#>
#>
#> [[3]]
#> [[3]]$term
#> [1] "Hikma Pharmaceuticals USA Inc."
#>
#> [[3]]$count
#> [1] 318
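Nested term/count lists like the one above are often easier to work with as a data frame. This is a small base-R sketch using a mock results list copied from the output, so it does not need the API; with a real response you would pass count$results instead.

```r
# Mock of count$results, copied from the output above
results <- list(
  list(term = "Aurobindo Pharma Limited", count = 393),
  list(term = "Zydus Lifesciences Limited", count = 326),
  list(term = "Hikma Pharmaceuticals USA Inc.", count = 318)
)

# Pull each field out with a type-checked vapply() and bind into columns
count_df <- data.frame(
  term = vapply(results, function(x) x$term, character(1)),
  count = vapply(results, function(x) x$count, numeric(1))
)

print(count_df)
```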
You can count on fields with a date to create a time series, as demonstrated on the openFDA website.
We can increase the complexity of our query using the search parameter, which lets us search against specific openFDA API fields. These fields are harmonised to different degrees in each API, which you will need to check online.
You can provide search strategies to openFDA() as single strings. They are constructed as [FIELD_NAME]:[STRING], where FIELD_NAME is the openFDA field you want to search on. If your STRING contains spaces, you must surround it with double quotes, or openFDA will search against each word in the string. So, for example, a search for drugs with the class "thiazide diuretic" should be formatted as "openfda.pharm_class_epc:\"thiazide diuretic\"", or the API will collect all drugs which have the words "thiazide" or "diuretic" in their established pharmacological class (EPC). Let’s do an unrefined search first:
search_unrefined <- openFDA(
  search = "openfda.pharm_class_epc:thiazide diuretic",
  endpoint = "drug-drugsfda",
  limit = 1
)
httr2::resp_body_json(search_unrefined)$meta$results$total
#> [1] 211
Let’s compare this to our refined search, where we add double-quotes around the search term:
search_refined <- openFDA(
  search = "openfda.pharm_class_epc:\"thiazide diuretic\"",
  endpoint = "drug-drugsfda",
  limit = 1
)
httr2::resp_body_json(search_refined)$meta$results$total
#> [1] 118
As you can see, the unrefined search picked up 93 more results, most of which are probably non-thiazide diuretics.
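Escaping the inner double quotes by hand is easy to get wrong, so it can help to build the string programmatically. The sketch below is a minimal base-R illustration; quote_term() is a hypothetical helper and not a function from {openFDA}.

```r
# Hypothetical helper: build FIELD:"STRING" with the quotes escaped for us
quote_term <- function(field, value) {
  paste0(field, ":\"", value, "\"")
}

quoted <- quote_term("openfda.pharm_class_epc", "thiazide diuretic")
unquoted <- paste0("openfda.pharm_class_epc", ":", "thiazide diuretic")

# cat() shows the strings as the API would receive them
cat(quoted, "\n")
cat(unquoted, "\n")

# Extra records matched by the unquoted search, from the totals above
extra <- 211 - 118
print(extra)
```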
The openFDA API lets you search on various fields at once. Simple methods for doing this are implemented in {openFDA}.
Using the guides on the openFDA website, you can put together your own query. For example, the following query looks for up to 5 records which were submitted by Walmart and are taken orally. We can use {purrr} functions to extract a brand name for each record. Note that though a single record can have multiple brand names, we are choosing to only extract the first one.
search_term <- "openfda.manufacturer_name:Walmart+AND+openfda.route:oral"
search <- openFDA(search = search_term,
                  limit = 5,
                  endpoint = "drug-drugsfda")
json <- httr2::resp_body_json(search)
purrr::map(json$results, .f = \(x) {
  purrr::pluck(x, "openfda", "brand_name", 1)
})
#> [[1]]
#> [1] "FOSTER AND THRIVE NICOTINE"
#>
#> [[2]]
#> [1] "COUNTERACT ALLERGY"
#>
#> [[3]]
#> [1] "IBUPROFEN 200 MG AND DIPHENHYDRAMINE CITRATE 38 MG"
#>
#> [[4]]
#> [1] "PURELAX"
#>
#> [[5]]
#> [1] "OMEPRAZOLE"
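If you would rather get a character vector than a list, the same "first brand name per record" extraction can be sketched in base R. The results list below is mock data shaped like json$results: two names are borrowed from the output above, while the second OMEPRAZOLE brand name and the empty record are made up to show the edge cases.

```r
# Mock records shaped like json$results (partly invented for illustration)
results <- list(
  list(openfda = list(brand_name = list("PURELAX"))),
  list(openfda = list(brand_name = list("OMEPRAZOLE",
                                        "OMEPRAZOLE DELAYED RELEASE"))),
  list(openfda = list())  # a record with no brand names at all
)

# Take the first brand name, or NA when a record has none
first_brand <- vapply(results, function(x) {
  brands <- x$openfda$brand_name
  if (length(brands) == 0) NA_character_ else brands[[1]]
}, character(1))

print(first_brand)
```

Unlike purrr::pluck(), which returns NULL for a missing element, this version fills the gap with NA so the result stays the same length as the input.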
Let openFDA() construct the search term
You can let the package do the heavy lifting for you with openFDA(), by providing a named character vector with many field/search term pairs to the search parameter. The function will automatically add double quotes ("") around your search terms, if you’re providing field/value pairs like this.
search <- openFDA(search = c("openfda.generic_name" = "amoxicillin"),
                  endpoint = "drug-drugsfda")
httr2::resp_body_json(search)$meta$results$total
#> [1] 58
You can include as many fields as you like, as long as you only provide each field once. By default, the terms are combined with an OR operator in openFDA(). The below search strategy will therefore pick up all entries in Drugs@FDA which are taken by mouth.
To apply multiple search terms with AND operators, use format_search_term() with mode = "and".
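To make the difference between the two modes concrete, here is a minimal base-R sketch of what combined search strings could look like. combine_terms() is a hypothetical helper, not the package's format_search_term(), and the "+OR+" and "+AND+" separators are assumptions modelled on the hand-written "+AND+" query earlier in this guide. The field/value pairs are chosen only for illustration.

```r
# Hypothetical helper: join quoted FIELD:"VALUE" pairs with OR or AND
combine_terms <- function(terms, mode = c("or", "and")) {
  mode <- match.arg(mode)
  # Require a named character vector in which each field appears only once
  stopifnot(is.character(terms), !is.null(names(terms)),
            !anyDuplicated(names(terms)))
  pairs <- paste0(names(terms), ":\"", terms, "\"")
  sep <- if (mode == "and") "+AND+" else "+OR+"  # assumed separators
  paste(pairs, collapse = sep)
}

terms <- c("openfda.generic_name" = "amoxicillin", "openfda.route" = "oral")

or_query <- combine_terms(terms)                 # matches either pair
and_query <- combine_terms(terms, mode = "and")  # must match both pairs

cat(or_query, "\n")
cat(and_query, "\n")
```

With OR, any record matching either pair is returned, which is why a broad term such as the route can dominate the results; AND narrows the search to records that satisfy every pair.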
You can use the wildcard character "*" to match zero or more characters. For example, we could take the prototypical ending to a common drug class - e.g. the sartans, which are angiotensin-II receptor blockers - and see which manufacturers are most represented in Drugs@FDA for this class. When using wildcards, either pre-format the string yourself without double-quotes or use format_search_term() with exact = FALSE. If you try to search with both double-quotes and the wildcard character, you will get a 404 error from openFDA.
search_term <- format_search_term(c("openfda.generic_name" = "*sartan"),
                                  exact = FALSE)
search <- openFDA(search = search_term,
                  count = "openfda.manufacturer_name.exact",
                  endpoint = "drug-drugsfda",
                  limit = 5)
# Parse the response once, then pull each field out by name
results <- httr2::resp_body_json(search)$results
terms <- purrr::map(results, "term")
counts <- purrr::map(results, "count")
setNames(counts, terms)
#> $`Alembic Pharmaceuticals Limited`
#> [1] 14
#>
#> $`Alembic Pharmaceuticals Inc.`
#> [1] 13
#>
#> $`Aurobindo Pharma Limited`
#> [1] 13
#>
#> $`Macleods Pharmaceuticals Limited`
#> [1] 11
#>
#> $`Zydus Lifesciences Limited`
#> [1] 11
It looks like "Alembic Pharmaceuticals" is very active in this space - interesting!
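To get a feel for what a pattern like "*sartan" can match, you can try the same glob locally before sending it to the API. This is only a local analogy using base R's utils::glob2rx(); openFDA's server-side matching may differ, for example in how it handles case or multi-word values. The drug names are a small hand-picked list.

```r
# A few generic names to test the pattern against (hand-picked examples)
generic_names <- c("losartan", "valsartan", "candesartan", "amoxicillin")

# Translate the glob into a regular expression: "*" becomes "any characters"
pattern <- utils::glob2rx("*sartan")

# Keep only the names that end in "sartan"
matches <- generic_names[grepl(pattern, generic_names)]
print(matches)
```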
This short guide does not cover all aspects of openFDA. It is recommended that you go to the openFDA API website and check out the resources there to see information on: