Olink® Analyze is an R package that provides a versatile toolbox to enable fast and easy handling of Olink® NPX data for your proteomics research. Olink® Analyze provides functions for working with Olink data, including functions for importing Olink® NPX datasets, quality control (QC) plot functions, and functions for various statistical tests. This package is meant to provide a convenient pipeline for your Olink NPX data analysis.
Note: Starting with OlinkAnalyze v5.0, detailed analysis workflow vignettes have been moved to the new OlinkAnalyzeVignettes package, which is available on CRAN. This vignette provides an overview of the main functions in OlinkAnalyze and introduces the new v5.0 preprocessing functions check_npx() and clean_npx().
For a quick overview of the main functions and typical workflows, see the Olink® Analyze R Package Cheat sheet.
Preprocessing
read_npx() or read_NPX(): Function to read NPX data into long format
check_npx(): Function to check the quality and format of NPX data
clean_npx(): Function to clean NPX data based on the output of check_npx()
olink_plate_randomizer(): Randomize samples on plates
olink_bridgeselector(): Select bridge samples
olink_normalization(): Normalization of all proteins
olink_lod(): Calculation of Limit of Detection (LOD) for Olink NGS data
Statistical analysis
olink_ttest(): Function which performs a t-test per protein
olink_wilcox(): Function which performs a Mann-Whitney U Test per protein
olink_anova(): Function which performs an ANOVA per protein
olink_anova_posthoc(): Function which performs an ANOVA post-hoc test per protein
olink_one_non_parametric(): Function which performs a Kruskal-Wallis Test or Friedman Test per protein
olink_one_non_parametric_posthoc(): Function which performs a post-hoc test for the one-way non-parametric test
olink_ordinalRegression(): Function which performs an ordinal regression per protein
olink_ordinalRegression_posthoc(): Function which performs an ordinal regression post-hoc test per protein
olink_lmer(): Function which performs a linear mixed model per protein
olink_lmer_posthoc(): Function which performs a linear mixed model post-hoc test per protein
olink_pathway_enrichment(): Function which performs GSEA or ORA pathway enrichment using the outcome from other statistical tests
Visualization
olink_boxplot(): Function which plots boxplots of a selected variable
olink_dist_plot(): Function to plot the NPX distribution by panel
olink_lmer_plot(): Function which generates a point-range plot per protein based on a linear mixed model
olink_pathway_visualization(): Function which plots a bar graph for pathways of interest
olink_pathway_heatmap(): Function which plots estimates of proteins associated with pathways of interest
olink_pca_plot(): Function to plot a PCA of the data
olink_qc_plot(): Function to plot an overview of a sample cohort per Panel
olink_umap_plot(): Function to plot a UMAP of the data
olink_volcano_plot(): Easy volcano plot with Olink theme
olink_heatmap_plot(): Function which generates a heatmap over all proteins
set_plot_theme(): Function to set the ggplot2 plot theme
olink_bridgeability_plot(): Function which generates plots that illustrate the three criteria for determining whether an assay is bridgeable in cross-product bridge normalization
Sample datasets
npx_data1: NPX Data in Long format
npx_data2: NPX Data in Long format, follow-up
manifest: A sample manifest including Sample ID, Subject ID and clinical variables
Schematic overview illustrating how the newly introduced functions
check_npx() and clean_npx() in Olink Analyze
v5.0 can be used together in a typical Olink data analysis workflow.
The package contains two test data files named npx_data1 and npx_data2. These are synthetic datasets that resemble Olink® data accompanied by clinical variables. Olink® data that is delivered in long format or imported with the function read_npx() contains the following columns:
SampleID <chr>: Sample names or IDs.
OlinkID <chr>: Unique ID for each assay assigned by Olink. If an assay is included in more than one panel, it has a different OlinkID in each one.
UniProt <chr>: UniProt ID.
Assay <chr>: Common gene name for the assay.
MissingFreq <dbl>: Missing frequency for the OlinkID, i.e. frequency of samples with NPX value below limit of detection (LOD).
Panel <chr>: Olink Panel that samples ran on. Read more about Olink Panels here: https://olink.com/products/compare
Panel_Version <chr>: Version of the panel. A new panel version might include some different or improved assays.
PlateID <chr>: Name of the plate.
QC_Warning or SampleQC <chr>: Indication whether the sample passed Olink QC. More information about Olink quality control metrics can be found in our FAQ by searching for the term “Quality control”.
LOD <dbl>: Limit of detection (LOD) is the minimum level of an individual protein that can be measured. LOD is defined as 3 times the standard deviation over background.
NPX <dbl>: Normalized Protein eXpression (NPX) is Olink®’s unit of protein expression level in a log2 scale. The majority of the functions of this package use NPX values for calculations. Read more about NPX in the Olink FAQ (search term “What is NPX?”) or in Olink’s Data normalization and standardization white paper.
Note: There are 5 additional variables in the sample datasets npx_data1 and npx_data2 that include clinical or other information, namely: Subject <chr>, Treatment <chr>, Site <chr>, Time <chr>, Project <chr>.
The columns found in an Olink data set may vary based on the version and product.
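To make the long format and the MissingFreq definition above concrete, here is a minimal base R sketch on a made-up data frame. The values and the object name npx_toy are purely illustrative, and the calculation only mirrors the definition given in the column list (share of samples with NPX below LOD); it is not necessarily how the Olink software derives the delivered MissingFreq column.

```r
# Toy long-format data: one row per sample-assay pair (all values are made up).
npx_toy <- data.frame(
  SampleID = rep(c("S1", "S2", "S3", "S4"), times = 2),
  OlinkID  = rep(c("OID00001", "OID00002"), each = 4),
  NPX      = c(1.2, 0.4, 2.5, 3.1, 0.1, 0.2, 0.3, 1.9),
  LOD      = rep(c(0.5, 0.8), each = 4)
)

# Share of samples per assay whose NPX falls below that assay's LOD.
missing_freq <- tapply(
  npx_toy$NPX < npx_toy$LOD,
  npx_toy$OlinkID,
  mean
)
missing_freq
```

Each assay contributes one row per sample, which is why per-assay summaries in long format reduce to a grouped calculation over OlinkID.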
read_NPX() or read_npx()
The read_npx() (or read_NPX()) function imports an NPX file into a tidy format to work with in R. This function supports Olink® NPX files generated by Olink® data software in CSV, Excel, and Parquet formats. No prior alterations to the NPX output file should be made for this function to work as expected.
filename: Path to the NPX output file.
out_df: Output data frame format. Either “tibble” or “ArrowObject”. Default: “tibble”.
long_format: Logical. Whether the input file is in long or wide format. Default: TRUE.
olink_platform: Olink product used to generate the input file. One of “Target 48”, “Flex”, “Target 96”, “Explore 3072”, “Explore HT”, “Focus”, or “Reveal”. Defaults to NULL for auto-detection.
data_type: Quantification method of the input data. One of “Ct”, “NPX”, or “Quantified”. Defaults to NULL for auto-detection.
.ignore_files: Character vector of files included in the zip-compressed Olink software output files that should be ignored. Used only for zip-compressed input files.
quiet: Logical. Whether to suppress messages about the import process. Default: TRUE.
legacy: Logical. Whether to use the legacy import method. Default: FALSE. The legacy method is less efficient and will be deprecated in future versions.
A tibble or ArrowObject in long format containing:
SampleID: Sample names or IDs.
OlinkID: Unique ID for each assay assigned by Olink. If an assay is included in more than one panel, it has a different OlinkID in each one.
UniProt: UniProt ID.
Assay: Common gene name for the assay.
MissingFreq: Missing frequency for the OlinkID, i.e. frequency of samples with NPX value below limit of detection (LOD).
Panel: Olink Panel that samples ran on. Read more about Olink Panels here: https://olink.com/products/compare
Panel_Version: Version of the panel. A new panel version might include some different or improved assays.
PlateID: Name of the plate.
QC_Warning or SampleQC: Indication whether the sample passed Olink QC. More information about Olink quality control metrics can be found in our FAQ (search term “Quality control”).
LOD: Limit of detection (LOD) is the minimum level of an individual protein that can be measured. LOD is defined as 3 times the standard deviation over background.
NPX: Normalized Protein eXpression (NPX) is Olink’s unit of protein expression level in a log2 scale. The majority of the functions of this package use NPX values for calculations. Read more about NPX in the Olink FAQ (search term “What is NPX?”) or in Olink’s Data normalization and standardization white paper.
read_NPX() or read_npx()
In order to import multiple NPX data files at once, the
read_npx() function can be used in combination with the
functions list.files(), lapply() and
dplyr::bind_rows(), as seen below. The pattern
argument of the list.files() function specifies the NPX
file format (.csv, .xlsx, .parquet, or any
combination of these). This method requires that all NPX files are
stored in the same folder and have identical column names. No prior
alterations to the NPX output file should be made for this method to
work as expected.
# Read in multiple NPX files in .csv format
data <- list.files(
  path = "path/to/dir/with/NPX/files",
  pattern = "csv$",
  full.names = TRUE
) |>
  lapply(FUN = function(x) {
    df_tmp <- OlinkAnalyze::read_npx(x) |>
      # Optionally add a column that identifies the source file
      dplyr::mutate(
        File = .env[["x"]]
      )
    return(df_tmp)
  }) |>
  # Optional: return a single data frame of all files instead of a list of data frames
  dplyr::bind_rows()

# Read in multiple NPX files in .parquet format
data <- list.files(
  path = "path/to/dir/with/NPX/files",
  pattern = "parquet$",
  full.names = TRUE
) |>
  lapply(
    OlinkAnalyze::read_npx
  ) |>
  dplyr::bind_rows()

# Read in multiple NPX files in either format
data <- list.files(
  path = "path/to/dir/with/NPX/files",
  pattern = "parquet$|csv$",
  full.names = TRUE
) |>
  lapply(
    OlinkAnalyze::read_npx
  ) |>
  dplyr::bind_rows()
check_npx()
The check_npx() function performs various quality and
format checks on NPX data imported with read_npx(). It is
recommended to run this function after reading in NPX data and before
downstream analysis. The result can be passed as the
check_log argument to clean_npx() and all
downstream Olink Analyze functions, allowing each function to skip its
own internal check and improve performance.
check_npx() also allows for alternative column names to
be used in the downstream analysis, both standard and non-standard. If
the user wishes to analyze data with non-standard column names, or to
resolve column name ambiguities, they can pass a named character vector
to the preferred_names argument. For example, one might
wish to use the column “PCNormalizedNPX” instead of “NPX” for downstream
analysis, or they might have log2 transformed the values
reported in “Quantified_value” and wish to use that column to analyze
their data. Note: for the time being, not all functions
utilize the full capacity of the preferred_names argument,
but we aim to expand this in the upcoming releases.
df: NPX data frame in long format, as returned by read_npx().
preferred_names: Optional named character vector to resolve column name ambiguities or to map custom column names to internally expected ones.
A named list with the following elements:
col_names <list>: Column names from the input data frame to be used in downstream analyses.
oid_invalid <chr>: OlinkID values that do not follow the expected formats (OID##### or OID#####_OID#####).
assay_na <chr>: OlinkIDs of assays where all samples have NA quantification values.
sample_id_dups <chr>: Duplicate SampleID values detected in the data.
sample_id_na <chr>: SampleIDs of samples with NA quantification values for all assays.
col_class <data.frame>: Columns with incorrect data types, including the column key, column name, detected type, and expected type.
assay_qc <chr>: OlinkIDs of assays with at least one assay QC warning.
non_unique_uniprot <chr>: OlinkIDs mapped to more than one UniProt ID.
darid_invalid <data.frame>: Invalid combinations of “DataAnalysisRefID” and “PanelDataArchiveVersion”.
clean_npx()
The clean_npx() function cleans an NPX data frame by
applying a series of filtering and conversion steps. It removes invalid
or problematic assays and samples identified by
check_npx(), and optionally converts column data types.
Passing the output of check_npx() via the
check_log argument avoids re-running the internal checks
and improves performance.
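As an illustration of what an invalid OlinkID looks like, the expected formats mentioned for oid_invalid (OID##### or OID#####_OID#####) can be approximated with a regular expression. This is a sketch under the assumption that each # stands for exactly one digit; the identifiers below are hypothetical, and the package's own check may be stricter or more lenient.

```r
# Hypothetical OlinkID values; the last three deliberately break the pattern.
olink_ids <- c(
  "OID01216", "OID01216_OID40001", "OID1216", "oid01216", "OID01216_"
)

# "OID" followed by exactly five digits, optionally followed by "_" and a
# second identifier of the same shape.
oid_pattern <- "^OID[0-9]{5}(_OID[0-9]{5})?$"
is_valid <- grepl(oid_pattern, olink_ids)

# Identifiers that a format check of this kind would report.
olink_ids[!is_valid]
```

Assays with identifiers like these are the ones that remove_invalid_oid = TRUE is described as removing.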
df: NPX data frame in long format as returned by read_npx().
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
remove_assay_na: Logical. Remove assays where all samples have NA values. Default: TRUE.
remove_invalid_oid: Logical. Remove assays with invalid OlinkIDs. Default: TRUE.
remove_dup_sample_id: Logical. Remove samples with duplicate IDs. Default: TRUE.
remove_control_assay: Logical. Remove internal control assays. Default: TRUE.
remove_control_sample: Logical. Remove external control samples based on “SampleType”. Default: TRUE.
remove_qc_warning: Logical. Remove samples with QC status “FAIL”. Default: TRUE.
remove_assay_warning: Logical. Remove assays flagged with assay warnings. Default: TRUE.
control_sample_ids: Character vector of identifiers of samples that should be removed. Default: NULL.
convert_df_cols: Logical. Convert columns to their expected data types. Default: TRUE.
convert_nonunique_uniprot: Logical. Resolve non-unique “OlinkID”-“UniProt” mappings. Default: TRUE.
verbose: Logical. Print progress messages. Default: FALSE.
A “tibble” (or “ArrowObject”) in long format containing the cleaned NPX data, with invalid assays, control samples, QC-failing samples, and problematic entries removed according to the chosen arguments. The output has the same format as that of read_npx().
Note: We recommend running check_npx()
once again after cleaning the data to confirm that all issues have been
resolved and that the data is ready for downstream analysis. A schematic
illustration of the described workflow is provided (Olink Analyze v5.0 example workflow).
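The check, clean, re-check sequence described above could look roughly as follows. This is a sketch rather than the vignette's own code: it assumes OlinkAnalyze v5.0 or later is installed, uses the bundled npx_data1 dataset as input, and only uses the df and check_log arguments documented above. The object names data_clean and check_log_clean are chosen to match the names used in the statistical examples further down; how those objects were originally created is an assumption. The calls are guarded so that the snippet does nothing when a suitable package version is not available.

```r
# Only run the workflow if OlinkAnalyze >= 5.0.0 is available (assumption:
# check_npx() and clean_npx() are exported from that version onwards).
has_v5 <- requireNamespace("OlinkAnalyze", quietly = TRUE) &&
  utils::packageVersion("OlinkAnalyze") >= "5.0.0"

if (has_v5) {
  # Example input: synthetic dataset shipped with the package.
  data <- OlinkAnalyze::npx_data1

  # 1. Check the imported data once.
  check_log <- OlinkAnalyze::check_npx(df = data)

  # 2. Clean the data, reusing the check result to skip the internal check.
  data_clean <- OlinkAnalyze::clean_npx(df = data, check_log = check_log)

  # 3. Re-check the cleaned data; this log is what gets passed downstream.
  check_log_clean <- OlinkAnalyze::check_npx(df = data_clean)
}
```

Keeping the second check result in its own object makes it easy to pass check_log = check_log_clean to every downstream function that accepts it.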
olink_plate_randomizer()
The olink_plate_randomizer() function randomly assigns samples to plate wells, with the option to keep samples from the same individual on the same plate. Olink® does not recommend forcing balance based on other clinical variables.
For more information on plate randomization, consult the Plate
Randomization Vignette in the R package
OlinkAnalyzeVignettes.
olink_bridgeselector()
The bridge selection function selects a number of bridge samples based on the input data. Bridge samples are used to normalize two data sets or projects that have been run at different time points and are therefore expected to show a batch effect. The function selects samples that have good detectability (if applicable), pass quality control, and cover a wide range of data points.
For more information on bridge sample selection, consult the Introduction
to bridging Olink® NPX datasets tutorial in the R package
OlinkAnalyzeVignettes.
olink_normalization()
The Olink® normalization function normalizes NPX values between two different datasets, or normalizes one Olink® dataset to a set of reference medians.
The function handles four different types of normalization:
[...] OlinkAnalyzeVignettes.
[...] reference_project. Adjustment is made using the differences of medians between the sample subsets from the two data sets. Subset normalization is useful if no bridge samples were included and one can assume that the distribution of the two datasets is very similar.
[...] df1, overlapping_samples_df1 and reference_medians need to be specified.
[...] OlinkAnalyzeVignettes.
olink_lod()
The olink_lod() function adds LOD information to an
Olink next generation sequencing (NGS) data set. This function can
incorporate LOD based on either an Olink NGS data set’s negative
controls or using predetermined fixed LOD values, which can be
downloaded from the Document Download Center at olink.com, or using
both methods. The default LOD calculation method is based on the
negative controls. If an NPX file is intensity normalized, both
intensity normalized and PC normalized LODs are provided.
For more information on calculating LOD, consult the Calculating
LOD from Olink® Explore data tutorial in the R package
OlinkAnalyzeVignettes.
olink_ttest()
The olink_ttest() function performs a Welch 2-sample t-test or paired t-test at confidence level 0.95 for every protein (by OlinkID) for a given grouping variable. It corrects for multiple testing using the Benjamini-Hochberg method (“fdr”). An assay is flagged as significant when its adjusted p-value is below 0.05. The resulting t-test table is arranged by ascending p-values.
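The procedure described above (one Welch t-test per assay, Benjamini-Hochberg adjustment across assays, a 0.05 threshold, sorting by p-value) can be sketched in base R on simulated data. This is an illustration of the statistical steps only, not the package's implementation; the data are simulated, and the label “Non-significant” is an assumed counterpart to the “Significant” label used elsewhere in this vignette.

```r
set.seed(42)

# Simulated long-format data: 3 assays x 20 samples, two treatment groups.
toy <- expand.grid(
  SampleID = paste0("S", 1:20),
  OlinkID  = c("OID00001", "OID00002", "OID00003"),
  stringsAsFactors = FALSE
)
sample_no <- as.integer(sub("S", "", toy$SampleID))
toy$Treatment <- ifelse(sample_no <= 10, "Treated", "Untreated")
toy$NPX <- rnorm(nrow(toy), mean = 5)

# Shift one assay in one group so that there is a real difference to detect.
shift <- toy$OlinkID == "OID00001" & toy$Treatment == "Treated"
toy$NPX[shift] <- toy$NPX[shift] + 3

# One Welch two-sample t-test per assay (t.test() defaults to var.equal = FALSE).
per_assay <- lapply(split(toy, toy$OlinkID), function(d) {
  fit <- t.test(NPX ~ Treatment, data = d, conf.level = 0.95)
  data.frame(
    OlinkID   = d$OlinkID[1],
    estimate  = unname(fit$estimate[1] - fit$estimate[2]),
    statistic = unname(fit$statistic),
    p.value   = fit$p.value
  )
})
ttest_toy <- do.call(rbind, per_assay)

# Benjamini-Hochberg correction across assays, then flag and sort.
ttest_toy$Adjusted_pval <- p.adjust(ttest_toy$p.value, method = "fdr")
ttest_toy$Threshold <- ifelse(
  ttest_toy$Adjusted_pval < 0.05, "Significant", "Non-significant"
)
ttest_toy <- ttest_toy[order(ttest_toy$p.value), ]
ttest_toy
```

The adjustment is applied once over all assays rather than inside the per-assay loop, because the correction depends on the full set of p-values.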
df: NPX data frame in long format. Should minimally contain protein name (Assay), OlinkID, UniProt, Panel and an outcome factor with 2 levels.
variable: Character value that should represent a column in the df to be used as a grouping variable. Needs to have exactly 2 levels.
pair_id: Character value indicating which column contains the paired sample identifier. Only used for paired t-tests.
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink® ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink® Panel.
estimate <dbl>: Difference in mean NPX between groups.
statistic <dbl>: Value of the t-statistic.
p.value <dbl>: P-value for the test.
parameter <dbl>: Degrees of freedom for the t-statistic.
conf.low <dbl>: Lower bound of the confidence interval for the difference in means.
conf.high <dbl>: Upper bound of the confidence interval for the difference in means.
method <chr>: Method that was used.
alternative <chr>: Description of the alternative hypothesis.
Adjusted_pval <dbl>: Adjusted p-value for the test (Benjamini & Hochberg).
Threshold <chr>: Text indication if assay is significant (adjusted p-value < 0.05).
olink_wilcox()
The olink_wilcox() function performs a 2-sample Mann-Whitney U test or paired Mann-Whitney U test (i.e. a Wilcoxon signed-rank test) at confidence level 0.95 for every protein (by OlinkID) for a given grouping variable. It corrects for multiple testing using the Benjamini-Hochberg method (“fdr”). An assay is flagged as significant when its adjusted p-value is below 0.05. The resulting Mann-Whitney U table is arranged by ascending p-values.
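The difference between the unpaired and the paired variant can be seen with base R's wilcox.test() on a single made-up assay. The values and object names below are invented for illustration; the sketch shows the two tests themselves, not how the package calls them internally.

```r
# Made-up NPX values for one assay measured in six subjects at two time points.
baseline  <- c(4.1, 5.3, 3.8, 6.0, 4.7, 5.1)
follow_up <- c(4.9, 5.8, 4.5, 6.9, 5.0, 6.2)

# Unpaired: Mann-Whitney U test (Wilcoxon rank-sum), groups treated as independent.
unpaired <- wilcox.test(baseline, follow_up, paired = FALSE)

# Paired: Wilcoxon signed-rank test on the within-subject differences.
paired <- wilcox.test(baseline, follow_up, paired = TRUE)

c(unpaired = unpaired$p.value, paired = paired$p.value)
```

Because every subject increases between the two time points, the paired test detects the shift even though the two groups overlap considerably, which is why supplying pair_id matters for repeated measurements.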
df: NPX data frame in long format. Should minimally contain protein name (Assay), OlinkID, UniProt, Panel and an outcome factor with 2 levels.
variable: Character value that should represent a column in the df to be used as a grouping variable. Needs to have exactly 2 levels.
pair_id: Character value indicating which column contains the paired sample identifier. Only used for paired Mann-Whitney U tests.
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink® ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink® Panel.
statistic <dbl>: Value of the Mann-Whitney U statistic.
p.value <dbl>: P-value for the test.
method <chr>: Method that was used.
alternative <chr>: Description of the alternative hypothesis.
Adjusted_pval <dbl>: Adjusted p-value for the test (Benjamini & Hochberg).
Threshold <chr>: Text indication if assay is significant (adjusted p-value < 0.05).
olink_anova()
The olink_anova() function performs an ANOVA F-test for each assay (by OlinkID) using Type III sum of squares. The function handles both factor and numerical variables, and/or confounding factors.
Samples with missing variable information or factor levels are excluded from the analysis. Character columns in the input data frame are converted to factors.
Control samples and control assays should be removed before using this function.
Crossed/interaction analysis, i.e. A*B formula notation, is inferred from the variable argument in the following cases:
For covariates, crossed analyses need to be specified explicitly,
i.e. two main effects will not be expanded with a
c('A', 'B') notation. Main effects present in the variable
take precedence.
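For readers less familiar with R's formula notation, the following base R snippet shows what a crossed A*B specification expands to, and how an additional main effect sits alongside it without being crossed. It illustrates formula semantics only; the way OlinkAnalyze assembles its model formula internally is not shown in this vignette, so the formula strings here are assumptions chosen for illustration, using column names from the sample datasets.

```r
# Crossed analysis: "Site * Time" is shorthand for both main effects plus
# their interaction.
crossed <- stats::as.formula("NPX ~ Site * Time")
crossed_terms <- attr(stats::terms(crossed), "term.labels")
crossed_terms
# → "Site" "Time" "Site:Time"

# Adding a covariate as a plain main effect: it is not crossed with the
# variables unless an interaction is written out explicitly.
with_cov <- stats::as.formula("NPX ~ Site * Time + Treatment")
cov_terms <- attr(stats::terms(with_cov), "term.labels")
cov_terms
# → "Site" "Time" "Treatment" "Site:Time"
```

The “:” entries in these term labels correspond to the interaction rows that appear in the term column of the results.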
Adjusted p-values are calculated using the Benjamini & Hochberg
(1995) method (“fdr”). The threshold is determined by logic evaluation
of Adjusted_pval < 0.05. Covariates are not included in
the p-value adjustment.
df: NPX data frame in long format. Should minimally contain protein name (Assay), OlinkID, UniProt, Panel and an outcome factor with at least 3 levels.
variable: Single character value or character array. A single character value should represent a column in the df. If length > 1, the included variable names are used in crossed analyses. The notations ‘:’ and ‘*’ are also accepted.
outcome: Name of the column from df that contains the dependent variable. Default: “NPX”.
covariates: Single character value or character array. Default: NULL. Confounding factors to include in the analysis. A single character value should represent a column in the df. The notations ‘:’ and ‘*’ are also accepted, but crossed analysis will not be inferred from main effects.
return.covariates: Logical. Returns F-test results for the covariates. Default: FALSE. Note: Adjusted p-values will be NA for covariates.
verbose: Logical. Whether information about removed samples, factor conversion and the final model formula is printed to the console. Default: TRUE.
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
# One-way ANOVA, no covariates
anova_results_oneway <- OlinkAnalyze::olink_anova(
  df = data_clean,
  variable = "Site",
  check_log = check_log_clean
)

# Two-way ANOVA, no covariates
anova_results_twoway <- OlinkAnalyze::olink_anova(
  df = data_clean,
  variable = c("Site", "Time"),
  check_log = check_log_clean
)

# One-way ANOVA, Treatment as covariate
# (stored under a separate name so the first result is not overwritten)
anova_results_oneway_cov <- OlinkAnalyze::olink_anova(
  df = data_clean,
  variable = "Site",
  covariates = "Treatment",
  check_log = check_log_clean
)
A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink Panel.
term <chr>: Name of the variable that was used for the p-value calculation. The “:” between variables indicates interaction between variables.
df <dbl>: Numerator degrees of freedom.
sumsq <dbl>: Sum of squares.
meansq <dbl>: Mean of squares.
statistic <dbl>: Value of F-statistic.
p.value <dbl>: P-value for the test.
Adjusted_pval <dbl>: Adjusted p-value for the test (Benjamini & Hochberg).
Threshold <chr>: Text indication if assay is significant (adjusted p-value < 0.05).
olink_anova_posthoc()
olink_anova_posthoc() performs a post-hoc ANOVA test
with Tukey p-value adjustment per assay (by OlinkID) at
confidence level 0.95.
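To give a feel for what a Tukey-adjusted pairwise comparison returns, here is a small base R sketch for a single simulated assay using aov() and TukeyHSD(). This is a stand-in technique for illustration: the source does not state which routine the package uses to obtain its Tukey-adjusted contrasts, and the site labels and NPX values are simulated.

```r
set.seed(1)

# Simulated NPX values for one assay at three sites; Site_C is shifted upwards.
toy <- data.frame(
  Site = factor(rep(c("Site_A", "Site_B", "Site_C"), each = 10)),
  NPX  = c(rnorm(10, mean = 5), rnorm(10, mean = 5), rnorm(10, mean = 8))
)

# One-way ANOVA followed by Tukey's honest significant differences.
fit <- aov(NPX ~ Site, data = toy)
tukey <- TukeyHSD(fit, which = "Site", conf.level = 0.95)

# One row per pairwise contrast: difference, confidence bounds, adjusted p-value.
tukey$Site
```

Each row plays the role of one contrast in the post-hoc output: an estimated difference, its confidence bounds at level 0.95, and a p-value adjusted for the number of pairwise comparisons within that assay.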
The function handles both factor and numerical variables and/or
covariates. The post-hoc test for a numerical variable compares the
difference in means of the outcome variable (default: NPX)
for 1 standard deviation (SD) difference in the numerical variable,
e.g. mean NPX at mean (numerical variable) versus mean NPX at mean
(numerical variable) + 1*SD (numerical variable).
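The numerical-variable comparison described above amounts to scaling the fitted slope by one standard deviation of the variable. A minimal sketch with a simple linear model on simulated data shows the equivalence; the variable name age and all values are hypothetical, and the sign of the contrast depends on the direction of subtraction, which the text above does not specify.

```r
set.seed(7)

# Simulated data: a numerical variable and an outcome that depends on it.
age <- rnorm(50, mean = 60, sd = 10)
npx <- 2 + 0.05 * age + rnorm(50, sd = 0.5)
fit <- lm(npx ~ age)

# Predicted outcome at the mean of the variable and at mean + 1 SD.
at_mean    <- predict(fit, newdata = data.frame(age = mean(age)))
at_mean_sd <- predict(fit, newdata = data.frame(age = mean(age) + sd(age)))

# The contrast equals the fitted slope multiplied by one SD of the variable.
contrast <- unname(at_mean_sd - at_mean)
all.equal(contrast, unname(coef(fit)[["age"]] * sd(age)))
```

Expressing the effect per SD rather than per unit makes estimates comparable across numerical variables measured on different scales.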
Control samples and control assays (AssayType is not
“assay”, or Assay contains “control” or “ctrl”) should be
removed before using this function.
df: NPX data frame in long format. Should minimally contain protein name (Assay), OlinkID, UniProt, Panel and an outcome factor with at least 3 levels.
olinkid_list: Character vector of OlinkIDs on which to perform the post-hoc analysis. If not specified, all assays in df are used.
variable: Single character value or character array. A single character value should represent a column in the df. If length > 1, the included variable names are used in crossed analyses. The notations ‘:’ and ‘*’ are also accepted.
covariates: Single character value or character array. Default: NULL. Confounding factors to include in the analysis. A single character value should represent a column in the df. The notations ‘:’ and ‘*’ are also accepted, but crossed analysis will not be inferred from main effects.
outcome: Name of the column from df that contains the dependent variable. Default: “NPX”.
effect: Character vector. Term on which to perform the post-hoc analysis. Must be a subset of or identical to the variable.
mean_return: Logical. If TRUE, returns the mean of each factor level rather than the difference in means (default). Note that no p-value is returned for mean_return = TRUE and no adjustment is performed.
verbose: Logical. Whether information about removed samples, factor conversion and the final model formula is printed to the console. Default: TRUE.
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
# Calculate the p-values for the ANOVA
anova_results_oneway <- OlinkAnalyze::olink_anova(
  df = data_clean,
  variable = "Site",
  check_log = check_log_clean
)

# Extract the OlinkIDs of the significant proteins
anova_results_oneway_sign <- anova_results_oneway |>
  dplyr::filter(
    .data[["Threshold"]] == "Significant"
  ) |>
  dplyr::pull(
    .data[["OlinkID"]]
  )

# Perform the post-hoc analysis on the significant proteins only
anova_posthoc_oneway_results <- OlinkAnalyze::olink_anova_posthoc(
  df = data_clean,
  olinkid_list = anova_results_oneway_sign,
  variable = "Site",
  effect = "Site",
  check_log = check_log_clean
)
A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink Panel.
term <chr>: Name of the variable that was used for the p-value calculation. The “:” between variables indicates interaction between variables.
contrast <chr>: Variables (in term) that are compared.
estimate <dbl>: Difference in mean NPX between variables (from contrast).
conf.low <dbl>: Low bound of the confidence interval for the mean.
conf.high <dbl>: High bound of the confidence interval for the mean.
Adjusted_pval <dbl>: Adjusted p-value for the test (Tukey).
Threshold <chr>: Text indication if assay is significant (adjusted p-value < 0.05).
olink_lmer()
The olink_lmer() function fits a linear mixed effects
model for every protein (by OlinkID) in every panel. The
function handles both factor and numerical variables and/or
covariates.
Samples with missing variable information or factor levels are excluded from the analysis. Character columns in the input data frame are converted to factors.
Crossed/interaction analysis, i.e. A*B formula notation, is inferred from the variable argument in the following cases:
For covariates, crossed analyses need to be specified explicitly, i.e. two main effects will not be expanded with a c('A', 'B') notation. Main effects present in the variable take precedence.
Adjusted p-values are calculated using the Benjamini & Hochberg (1995) method (“fdr”). The threshold is determined by logic evaluation of Adjusted_pval < 0.05. Covariates are not included in the p-value adjustment.
df: NPX data frame in long format. Should minimally contain protein name (Assay), OlinkID, UniProt, Panel and 1-2 variables with at least 2 levels and subject identifiers (SubjectID).
variable: Single character value or character array. A single character value should represent a column in the df. If length > 1, the included variable names are used in crossed analyses. The notations ‘:’ and ‘*’ are also accepted.
outcome: Name of the column from df that contains the dependent variable. Default: “NPX”.
random: Single character value or character array with random effects.
covariates: Single character value or character array. Default: NULL. Confounding factors to include in the analysis. A single character value should represent a column in the df. The notations ‘:’ and ‘*’ are also accepted, but crossed analysis will not be inferred from main effects.
return.covariates: Logical. Returns F-test results for the covariates. Note: Adjusted p-values will be NA for covariates. Default: FALSE.
verbose: Logical. Whether information about removed samples, factor conversion and the final model formula is printed to the console. Default: TRUE.
check_log: Named list returned by check_npx(). If NULL, check_npx() is run internally.
# Linear mixed model with one variable.
lmer_results_oneway <- OlinkAnalyze::olink_lmer(
  df = data_clean,
  variable = "Site",
  random = "Subject",
  check_log = check_log_clean
)

# Linear mixed model with two variables.
lmer_results_twoway <- OlinkAnalyze::olink_lmer(
  df = data_clean,
  variable = c("Site", "Treatment"),
  random = "Subject",
  check_log = check_log_clean
)
A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink Panel.
term <chr>: Name of the variable that was used for the p-value calculation. The “:” between variables indicates interaction between variables.
sumsq <dbl>: Sum of squares.
meansq <dbl>: Mean of squares.
NumDF <dbl>: Numerator degrees of freedom.
DenDF <dbl>: Denominator degrees of freedom.
statistic <dbl>: Value of F-statistic.
p.value <dbl>: P-value for the test.
Adjusted_pval <dbl>: Adjusted p-value for the test (Benjamini & Hochberg).
Threshold <chr>: Text indication if assay is significant (adjusted p-value < 0.05).
olink_lmer_posthoc()
The olink_lmer_posthoc() function is similar to olink_lmer() but performs a post-hoc analysis based on a linear mixed effects model. The function handles both factor and
numerical variables and/or covariates. Differences in estimated marginal
means are calculated for all pairwise levels of a given output variable.
Degrees of freedom are estimated using Satterthwaite’s approximation.
The post-hoc test for a numerical variable compares the difference in
means of the outcome variable (default: NPX) for 1 standard
deviation difference in the numerical variable, e.g. mean NPX at
mean(numerical variable) versus mean NPX at mean(numerical variable) +
1*SD(numerical variable). The output tibble is arranged by ascending
adjusted p-values.
df: NPX data frame in long format should minimally
contain protein name (Assay), OlinkID,
UniProt, Panel and 1-2 variables with at least
2 levels and subject identifiers (SubjectID).variable: Single character value or character array. In
case of single character then that should represent a column in the
df. Otherwise, if length > 1, the included variable
names will be used in crossed analyses. It can also accept the notations
‘:’ or ‘*’.olinkid_list: Character vector of
OlinkID’s on which to perform the post-hoc analysis. If not
specified, all assays in df are used.effect: Character vector. Term on which to perform the
post-hoc analysis. Must be subset of or identical to the variable.outcome: Name of the column from df that
contains the dependent variable. Default: “NPX”.random: Single character value or character array with
random effects.covariates: Single character value or character array.
Default: NULL. Confounding factors to include in the
analysis. In case of single character then that should represent a
column in the df. It can also accept the notations ‘:’ or
‘*’, while crossed analysis will not be inferred from main effects.mean_return: Logical. If true, returns the mean of each
factor level rather than the difference in means (default). Note that no
p-value is returned for mean_return = TRUE and no
adjustment is performed.verbose: Logical. If information about removed samples,
factor conversion and final model formula is to be printed to the
console. Default: TRUE.check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.# Linear mixed model with two variables.
lmer_results_twoway <- OlinkAnalyze::olink_lmer(
  df = data_clean,
  variable = c("Site", "Treatment"),
  random = "Subject",
  check_log = check_log_clean
)
# extracting the significant proteins
lmer_results_twoway_sign <- lmer_results_twoway |>
  dplyr::filter(
    .data[["Threshold"]] == "Significant" &
      .data[["term"]] == "Treatment"
  ) |>
  dplyr::pull(
    .data[["OlinkID"]]
  )
# performing post-hoc analysis
lmer_posthoc_twoway_results <- OlinkAnalyze::olink_lmer_posthoc(
  df = data_clean,
  olinkid_list = lmer_results_twoway_sign,
  variable = c("Site", "Treatment"),
  random = "Subject",
  effect = "Treatment",
  check_log = check_log_clean
)

A tibble with the following columns:
Assay <chr>: Assay name.
OlinkID <chr>: Unique Olink ID.
UniProt <chr>: UniProt ID.
Panel <chr>: Olink Panel.
term <chr>: Name of the variable that
was used for the p-value calculation. A “:” between variables
indicates an interaction between them.
contrast <chr>: Variable levels (in term) that
are compared.
estimate <dbl>: Difference in mean NPX
between the levels (from contrast).
conf.low <dbl>: Lower bound of the
confidence interval for the mean.
conf.high <dbl>: Upper bound of the
confidence interval for the mean.
Adjusted_pval <dbl>: Adjusted p-value
for the test (Benjamini & Hochberg).
Threshold <chr>: Text indication of whether the
assay is significant (adjusted p-value < 0.05).

Many other statistical functions can be found within Olink Analyze, including:
olink_one_non_parametric() Function which performs a
Kruskal-Wallis Test or Friedman Test per protein.
olink_one_non_parametric_posthoc() Function which
performs a post-hoc test for a one-way non-parametric test.
olink_ordinalRegression() Function which performs an
ordinal regression per protein.
olink_ordinalRegression_posthoc() Function which
performs an ordinal regression post-hoc test per protein.
To learn more about these functions, consult their help documentation
using the help() function.
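
The Adjusted_pval and Threshold columns described above can be illustrated with base R alone. The sketch below is not the package's internal code: the p-values are made up, and the "Non-significant" label is an assumption chosen for the illustration. It only shows how a Benjamini-Hochberg adjustment followed by a 0.05 cutoff behaves.

```r
# Hypothetical raw p-values for five assays (illustrative values only).
pvals <- c(0.001, 0.012, 0.020, 0.045, 0.600)

# Benjamini-Hochberg adjustment using base R.
adjusted_pval <- stats::p.adjust(pvals, method = "BH")

# Label each assay with a 0.05 cutoff on the adjusted p-value.
# The "Non-significant" label is an assumption made for this sketch.
threshold <- ifelse(adjusted_pval < 0.05, "Significant", "Non-significant")

data.frame(pvals, adjusted_pval, threshold)
```

In this toy example the fourth assay has a raw p-value below 0.05 but is no longer significant after adjustment, which is why filtering on Threshold rather than on the raw p-value changes the list passed to a post-hoc function.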

olink_pathway_enrichment()
The olink_pathway_enrichment() function can be used to
perform Gene Set Enrichment Analysis (GSEA) or Over-Representation
Analysis (ORA) using MSigDB, Reactome, KEGG, or GO annotations. MSigDB
includes curated gene sets (C2) and ontology gene sets (C5), which
encompass Reactome, KEGG, and GO annotations. This function performs
enrichment using the GSEA() or enricher()
functions from the clusterProfiler R package from
Bioconductor. The function uses the estimate from a previous statistical
analysis for one contrast for all proteins. MSigDB is subset if ontology
is KEGG, GO, or Reactome. test_results must contain
estimates for all assays. Post-hoc results can be used but should be
filtered for one contrast to improve interpretability.
Alternative statistical results can be used as input as long as they
include the columns OlinkID, Assay, and
estimate. A column named Adjusted_pval is also
needed for ORA. Any statistical results that contain one estimate per
protein will work as long as the estimates are comparable to each
other.
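
To make the ORA idea concrete, the following base R sketch computes an over-representation p-value for a single term with the hypergeometric distribution. It is a simplified illustration, not the code path used by olink_pathway_enrichment() or clusterProfiler: all counts are hypothetical, and the way the ratios are formed here (overlap over significant assays, term size over background) is an assumption made to keep the example small.

```r
# Hypothetical counts (illustrative only).
n_background <- 92L   # all assays measured, used as the background
n_in_term <- 15L      # background assays annotated to the term
n_significant <- 10L  # assays significant in the upstream test
n_overlap <- 6L       # significant assays annotated to the term

# P(X >= n_overlap) under the hypergeometric distribution.
ora_pvalue <- stats::phyper(
  q = n_overlap - 1L,
  m = n_in_term,
  n = n_background - n_in_term,
  k = n_significant,
  lower.tail = FALSE
)

# Ratios in the "k/n" text form, analogous to GeneRatio and BgRatio.
gene_ratio <- paste0(n_overlap, "/", n_significant)
bg_ratio <- paste0(n_in_term, "/", n_background)
```

ORA needs a list of significant assays, which is why an Adjusted_pval column is required for that method, whereas GSEA ranks all assays by their estimate.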
df: NPX data frame in long format with columns
Assay, OlinkID, UniProt,
SampleID, QC_Warning or SampleQC,
and NPX.
test_results: a data frame of statistical test results
including Adjusted_pval and estimate columns.
method: String of method name. Must be either “GSEA”
(default) or “ORA”.
ontology: String of database to query. Must be one of
“MSigDb”, “KEGG”, “GO”, or “Reactome”.
organism: String of name of organism. Must be either
“human” (default) or “mouse”.

ttest_results <- OlinkAnalyze::olink_ttest(
  df = data_clean,
  variable = "Treatment",
  alternative = "two.sided",
  check_log = check_log_clean
)
# GSEA enrichment analysis
gsea_results <- OlinkAnalyze::olink_pathway_enrichment(
  df = data_clean,
  test_results = ttest_results,
  check_log = check_log_clean
)
# ORA enrichment analysis
ora_results <- OlinkAnalyze::olink_pathway_enrichment(
  df = data_clean,
  test_results = ttest_results,
  method = "ORA",
  check_log = check_log_clean
)

A data frame of enrichment results.
Columns for ORA include:
ID <chr>: Pathway ID from MSigDB.
Description <chr>: Description of
Pathway from MSigDB.
GeneRatio <chr>: ratio of input proteins
that are annotated in a term.
BgRatio <chr>: ratio of all genes that
are annotated in this term.
pvalue <dbl>: p-value of
enrichment.
p.adjust <dbl>: Adjusted p-value
(Benjamini-Hochberg).
qvalue <dbl>: false discovery rate, the
estimated probability that the enrichment represents a
false positive finding.
geneID <chr>: list of input proteins
(Gene Symbols) annotated in a term delimited by “/”.
Count <dbl>: Number of input proteins
that are annotated in a term.

Columns for GSEA:
ID <chr>: Pathway ID from MSigDB.
Description <chr>: Description of
Pathway from MSigDB.
setSize <dbl>: number of input proteins
that are annotated in a term.
enrichmentScore <dbl>: Enrichment score,
degree to which a gene set is over-represented at the top or bottom of
the ranked list of genes.
NES <dbl>: Normalized Enrichment Score,
normalized to account for differences in gene set size and in
correlations between gene sets and expression data sets. NES can be used
to compare analysis results across gene sets.
pvalue <dbl>: p-value of
enrichment.
p.adjust <dbl>: Adjusted p-value
(Benjamini-Hochberg).
qvalue <dbl>: false discovery rate, the
estimated probability that the normalized enrichment score represents a
false positive finding.
rank <dbl>: the position in the ranked
list where the maximum enrichment score occurred.
leading_edge <chr>: contains tags, list,
and signal. Tags gives an indication of the percentage of genes
contributing to the enrichment score. List gives an indication of where
in the list the enrichment score is obtained. Signal represents the
enrichment signal strength and combines the tag and list.
core_enrichment <chr>: list of input
proteins (Gene Symbols) annotated in a term delimited by “/”.

olink_pca_plot()
Generates a PCA projection of all samples from NPX data along two
principal components (default ‘PC1’ vs ‘PC2’), colored by the variable
specified by color_g (default ‘QC_Warning’) and including
the percentage of explained variance. By default, the values are scaled
and centered in the PCA, and proteins with missing NPX values are removed
from the corresponding assays. Unique sample names are required.
Imputation by median value is performed for assays with missingness
<10% for multi-plate projects, and for missingness <5% for single
plate projects.
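
The preprocessing described above (median imputation, centering, scaling, percentage of explained variance) can be sketched with base R on a small simulated matrix. This is an illustration under stated assumptions, not the implementation of olink_pca_plot(): the data are random, the object names are hypothetical, and the missingness thresholds that decide whether an assay is imputed or dropped are not reproduced.

```r
set.seed(1)
# Hypothetical wide NPX matrix: 20 samples (rows) x 5 assays (columns).
npx_wide <- matrix(
  rnorm(100, mean = 5),
  nrow = 20,
  dimnames = list(paste0("S", 1:20), paste0("Assay", 1:5))
)
npx_wide[3, 2] <- NA  # introduce one missing value

# Impute missing values with the per-assay median.
npx_imputed <- apply(npx_wide, 2, function(x) {
  x[is.na(x)] <- stats::median(x, na.rm = TRUE)
  x
})

# PCA on centered and scaled values.
pca_fit <- stats::prcomp(npx_imputed, center = TRUE, scale. = TRUE)

# Percentage of variance explained by each principal component.
explained <- 100 * pca_fit$sdev^2 / sum(pca_fit$sdev^2)

# Sample coordinates along the first two components.
scores <- pca_fit$x[, c("PC1", "PC2")]
```

The values in explained are what would typically be shown on the axis labels of a PC1 vs PC2 plot.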
More information about olink_pca_plot() can be found in the
Outlier
Exclusion Vignette in the R package
OlinkAnalyzeVignettes.

olink_umap_plot()
Computes a uniform manifold approximation and projection (UMAP) and plots the two specified components. Unique sample names are required, and imputation by the median is done for assays with missingness <10% for multi-plate projects and <5% for single plate projects.
The arguments outlierDefX and outlierDefY
can be used to identify outliers in the UMAP results. Sample outliers
will be labelled.
Note: UMAP is a non-linear data transformation that might not accurately preserve the properties of the data. Distances in the UMAP plane should therefore be interpreted with caution.
df: NPX data frame in long format. Should minimally
contain SampleID, NPX and the column that will be
used for grouping/coloring.
color_g: Character value indicating the column name
that should be used as fill color. Default QC_Warning.
x_val: Integer indicating which UMAP component to
plot along the x-axis. Default 1.
y_val: Integer indicating which UMAP component to
plot along the y-axis. Default 2.
config: Object of class umap.config,
specifying the parameters for the UMAP algorithm.
label_samples: Logical. If TRUE, points
are replaced with SampleID. Default
FALSE.
drop_assays: Logical. All assays with any missing
values will be dropped. Takes precedence over sample drop.
drop_samples: Logical. All samples with any missing
values will be dropped.
byPanel: Logical. Perform the UMAP per panel. Default
FALSE.
outlierDefX: (Optional) The number of standard deviations
along the UMAP dimension plotted on the x-axis that defines an
outlier.
outlierDefY: (Optional) The number of standard deviations
along the UMAP dimension plotted on the y-axis that defines an
outlier.
outlierLines: Logical. Draw dashed lines at
+/-outlierDef[X,Y] standard deviations from the mean of the
plotted UMAP dimensions. Default FALSE.
verbose: Logical. Whether information about removed samples,
factor conversion and the final model formula should be printed to the
console. Default: TRUE.
quiet: Logical. If TRUE, the resulting
plot is not printed. Default: FALSE.
check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.

OlinkAnalyze::olink_umap_plot(
  df = data_clean,
  color_g = "QC_Warning",
  byPanel = TRUE,
  check_log = check_log_clean
)

A list of objects of class ggplot is silently returned.
Plots are also printed unless option quiet = TRUE is
set.

olink_boxplot()
The olink_boxplot() function is used to generate
boxplots of NPX values stratified on a variable for a given list of
proteins. In order to annotate the plot with ANOVA post-hoc analysis
results (i.e. include significance asterisks in the plot), control
samples and control assays should be removed from the data.
df: NPX data frame in long format. Should minimally
contain protein name (Assay), OlinkID,
UniProt and a grouping variable.
variable: Single character value indicating the column
name to use as a grouping variable on the x-axis.
olinkid_list: Character vector of OlinkIDs that should
be used for the boxplot. If not specified, all assays in df
are used.
posthoc_results: Data frame from ANOVA post-hoc
analysis. This data frame needs to be generated using the
olink_anova_posthoc() function.
ttest_results: Data frame from t-test analysis. This
data frame needs to be generated using the olink_ttest()
function.
verbose: Logical. Flag indicating whether plots should be
printed in addition to being assigned to a list variable. Default:
FALSE.
number_of_proteins_per_plot: Number of boxplots to
include in the facets plot. Default 6.
check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.

plot <- data_clean |>
  # removing missing values that exist for Site
  dplyr::filter(
    !is.na(.data[["Site"]])
  ) |>
  OlinkAnalyze::olink_boxplot(
    variable = "Site",
    olinkid_list = c("OID00488", "OID01276"),
    number_of_proteins_per_plot = 2L,
    check_log = check_log_clean
  )
plot[[1L]]

anova_posthoc_results <- OlinkAnalyze::olink_anova_posthoc(
  df = data_clean,
  olinkid_list = c("OID00488", "OID01276"),
  variable = "Site",
  effect = "Site",
  check_log = check_log_clean
)
plot2 <- data_clean |>
  tidyr::drop_na() |> # removing missing values that exist for Site
  OlinkAnalyze::olink_boxplot(
    variable = "Site",
    olinkid_list = c("OID00488", "OID01276"),
    number_of_proteins_per_plot = 2L,
    posthoc_results = anova_posthoc_results,
    check_log = check_log_clean
  )
plot2[[1L]]

A list of objects of class ggplot.
Note: Plots will not appear in the Viewer panel of RStudio unless the result is assigned to a variable and then printed (see sample code above).

olink_dist_plot()
The olink_dist_plot() function generates boxplots of NPX
values for each sample, faceted by Olink panel. This is used as an
initial QC step to identify potential outliers.
More information about olink_dist_plot() can be found in
the Outlier
Exclusion Vignette in the R package
OlinkAnalyzeVignettes.

olink_lmer_plot()
The function olink_lmer_plot() generates a point-range
plot for a given list of proteins based on a linear mixed effect model.
The points illustrate the mean NPX level for each group and the error
bars illustrate 95% confidence intervals. Facets are labeled by the
protein name and corresponding OlinkID for the protein.
df: NPX data frame in long format. Should minimally
contain protein name (Assay), OlinkID,
UniProt, Panel and 1-2 variables with at least
2 levels and subject ID (SubjectID).
variable: Single character value or character array. A
single character value should name a column in
df. If length > 1, the included variable
names will be used in crossed analyses. The notations
‘:’ and ‘*’ are also accepted.
outcome: Name of the column from df that
contains the dependent variable. Default: “NPX”.
random: Single character value or character array with
random effects.
covariates: Single character value or character array.
Default: NULL. Confounding factors to include in the
analysis. A single character value should name a
column in df. The notations ‘:’ and
‘*’ are also accepted, but crossed analysis will not be inferred from main effects.
x_axis_variable: Character. Which main effect to use as
x-axis in the plot.
col_variable: Character. If provided, the interaction
effect col_variable:x_axis_variable will be plotted with
x_axis_variable on the x-axis and col_variable
as color.
number_of_proteins_per_plot: Number of plots to include in
the list of point-range plots. Defaults to 6 plots per figure.
verbose: Logical. Whether information about removed samples,
factor conversion and the final model formula should be printed to the
console. Default: TRUE.
check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.

plot <- OlinkAnalyze::olink_lmer_plot(
  df = data_clean,
  olinkid_list = c("OID01216", "OID01217"),
  variable = c("Site", "Treatment"),
  x_axis_variable = "Site",
  col_variable = "Treatment",
  random = "Subject",
  check_log = check_log_clean
)
plot[[1L]]

A list of objects of class ggplot.
Note: Plots will not appear in the Viewer panel of RStudio unless the result is assigned to a variable and then printed (see sample code above).

olink_pathway_heatmap()
The olink_pathway_heatmap() function generates a heatmap
of proteins related to pathways using the enrichment results from the
olink_pathway_enrichment() function. Either the top terms
or the terms containing a certain keyword can be visualized. For each term,
the proteins in the test_results data frame that are related
to that term will be visualized by their estimate. This visualization
can be used to determine how many proteins of interest are involved in
a particular pathway and the direction of their estimates.
enrich_results: data frame of enrichment results from
olink_pathway_enrichment().
test_results: filtered results from statistical test
with Assay, OlinkID, and estimate
columns.
method: method used in
olink_pathway_enrichment(), “GSEA” or “ORA”. Default is
“GSEA”.
keyword: (optional) keyword to filter enrichment
results on. If not specified, the top terms are displayed.
number_of_terms: number of terms to display, default is
20.

OlinkAnalyze::olink_pathway_heatmap(
  enrich_results = ora_results,
  test_results = ttest_results,
  method = "ORA",
  keyword = "immune"
)

A heatmap as a ggplot object.

olink_pathway_visualization()
The olink_pathway_visualization() function generates a
bar graph of the top terms or terms related to a certain keyword for
results from the olink_pathway_enrichment() function. The
bars represent either the normalized enrichment score (NES) for GSEA
results or counts (number of proteins) for ORA results, colored by
adjusted p-value. Pathways are ordered by unadjusted p-value. The ORA
visualization also shows, after each bar, the number of proteins out of the total
proteins in that pathway as a ratio.
enrich_results: data frame of enrichment results from
olink_pathway_enrichment().
method: method used in
olink_pathway_enrichment(), “GSEA” or “ORA”. Default is
“GSEA”.
keyword: (optional) keyword to filter enrichment
results on. If not specified, the top terms are displayed.
number_of_terms: number of terms to display, default is
20.

A bar graph as a ggplot object.

olink_qc_plot()
The olink_qc_plot() function generates a plot faceted by
Panel, plotting IQR vs. median NPX for all samples. This is
a good first check to find out if any samples have a tendency to be
classified as outliers. Horizontal dashed lines indicate +/-3 standard
deviations from the mean IQR. Vertical dashed lines indicate +/-3
standard deviations from the mean sample median.
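
The quantities behind this plot can be reproduced with base R. The sketch below is an illustration on simulated data for a single panel, not the internal code of olink_qc_plot(): the data frame, the shifted sample and the helper outside_3sd() are all hypothetical. It computes the per-sample median and IQR of NPX and flags samples lying more than 3 standard deviations from the mean of either quantity.

```r
set.seed(42)
# Hypothetical long-format data: 30 samples x 40 assays in a single panel.
npx_long <- expand.grid(
  SampleID = paste0("S", 1:30),
  OlinkID = paste0("OID", 1:40),
  stringsAsFactors = FALSE
)
npx_long$NPX <- rnorm(nrow(npx_long), mean = 5, sd = 1)

# Shift one sample upwards so that it stands out.
is_s1 <- npx_long$SampleID == "S1"
npx_long$NPX[is_s1] <- npx_long$NPX[is_s1] + 4

# Per-sample median and interquartile range of NPX.
sample_median <- tapply(npx_long$NPX, npx_long$SampleID, stats::median)
sample_iqr <- tapply(npx_long$NPX, npx_long$SampleID, stats::IQR)

# Hypothetical helper: TRUE where a value is more than n_sd standard
# deviations away from the mean of all values.
outside_3sd <- function(x, n_sd = 3) {
  abs(x - mean(x)) > n_sd * stats::sd(x)
}

# Samples outside the +/-3 SD limits on either axis.
flagged <- names(sample_median)[
  outside_3sd(sample_median) | outside_3sd(sample_iqr)
]
```

With several panels, the same calculation would be repeated within each panel, since the plot is faceted by Panel.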
More information about olink_qc_plot() can be found in
the Outlier
Exclusion Vignette in the R package
OlinkAnalyzeVignettes.

olink_heatmap_plot()
The olink_heatmap_plot() function generates a heatmap
for a specified set of samples and proteins. By default, the heatmap
centers and scales NPX across proteins and clusters samples and proteins
using a dendrogram. Unique sample names are required.
The grouping variable(s) are annotated and colored on the left side of the heatmap.
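
As a rough illustration of centering, scaling and clustering, the base R sketch below standardizes a small simulated matrix and derives the row and column order from hierarchical clustering. It assumes that scaling is applied per assay and that default Euclidean distance with complete linkage is acceptable for the illustration; neither is confirmed here as the exact behavior of olink_heatmap_plot(), and the matrix and object names are hypothetical.

```r
set.seed(7)
# Hypothetical wide NPX matrix: 8 samples (rows) x 4 assays (columns).
npx_wide <- matrix(
  rnorm(32, mean = 6, sd = 2),
  nrow = 8,
  dimnames = list(paste0("S", 1:8), paste0("Assay", 1:4))
)

# Center and scale each assay (column) to mean 0 and standard deviation 1.
npx_scaled <- scale(npx_wide, center = TRUE, scale = TRUE)

# Hierarchical clustering of samples (rows) and assays (columns).
sample_tree <- stats::hclust(stats::dist(npx_scaled))
assay_tree <- stats::hclust(stats::dist(t(npx_scaled)))

# Row and column order that a clustered heatmap would use.
npx_ordered <- npx_scaled[sample_tree$order, assay_tree$order]
```

Scaling puts assays with very different NPX ranges on a comparable color scale, which is why it is usually left on when assays from different abundance ranges are shown together.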
df: NPX data frame in long format which should
minimally contain SampleID, NPX,
OlinkID, Assay. Optionally, columns of choice
for annotations.
variable_row_list: Columns in df to be
annotated for rows in the heatmap.
variable_col_list: Columns in df to be
annotated for columns in the heatmap.
center_scale: Logical. Whether data should be centered and
scaled across assays. Default: TRUE.
cluster_rows: Logical. Whether rows should be
clustered. Default: TRUE.
cluster_cols: Logical. Whether columns should be
clustered. Default: TRUE.
show_rownames: Logical. Whether row names are
shown. Default: TRUE.
show_colnames: Logical. Whether column names are
shown. Default: TRUE.
annotation_legend: Logical. Whether the legend for
annotations should be shown. Default: TRUE.
fontsize: Font size for all text. Default: 10.
na_col: Color of the cells with NA values. Default:
“Black”.
check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.

first10 <- data_clean |>
  dplyr::pull(
    .data[["OlinkID"]]
  ) |>
  unique() |>
  utils::head(n = 10L)
first15samples <- data_clean |>
  dplyr::pull(
    .data[["SampleID"]]
  ) |>
  unique() |>
  utils::head(n = 15L)
data_clean_small <- data_clean |>
  dplyr::filter(
    .data[["OlinkID"]] %in% .env[["first10"]]
  ) |>
  dplyr::filter(
    .data[["SampleID"]] %in% .env[["first15samples"]]
  )
OlinkAnalyze::olink_heatmap_plot(
  df = data_clean_small,
  variable_row_list = "Treatment",
  check_log = check_log_clean
)

An object of class ggplot.

olink_volcano_plot()
The olink_volcano_plot() function generates a volcano
plot using results from the olink_ttest() function. The
estimated difference is shown on the x-axis and
-log10(p-value) on the y-axis. The horizontal dotted line
indicates p-value = 0.05. Dots are colored based on
significance following Benjamini-Hochberg adjustment with an adjusted p-value
cutoff of 0.05. Significant assays after adjustment can optionally be
annotated by OlinkID.
p.val_tbl: a data frame of results generated by
olink_ttest().
x_lab: Optional. Character value to use as the x-axis
label.
olinkid_list: Optional. Character vector of proteins
(OlinkID) to label in the plot. If not provided, by default
the function will label all significant proteins.

# perform t-test
ttest_results <- OlinkAnalyze::olink_ttest(
  df = data_clean,
  variable = "Treatment",
  check_log = check_log_clean
)
# select names of proteins to show
top_10_name <- ttest_results |>
  dplyr::slice_head(
    n = 10L
  ) |>
  dplyr::pull(
    .data[["OlinkID"]]
  )
# volcano plot
OlinkAnalyze::olink_volcano_plot(
  p.val_tbl = ttest_results,
  x_lab = "Treatment",
  olinkid_list = top_10_name
)

An object of class ggplot.

set_plot_theme()
This function sets a consistent plot theme for plots by adding it to
a ggplot object. It is mainly used for aesthetic
reasons.
OlinkAnalyze::npx_data1 |>
  dplyr::filter(
    !is.na(.data[["Treatment"]])
  ) |>
  dplyr::filter(
    .data[["OlinkID"]] == "OID01216"
  ) |>
  ggplot2::ggplot(
    ggplot2::aes(
      x = .data[["Treatment"]],
      y = .data[["NPX"]],
      fill = .data[["Treatment"]]
    )
  ) +
  ggplot2::geom_boxplot() +
  OlinkAnalyze::set_plot_theme()

olink_color_discrete(),
olink_color_gradient(), olink_fill_discrete(),
olink_fill_gradient()
These functions set a consistent coloring theme for the plots by
adding them to a ggplot object. They are mainly used for
aesthetic reasons.
OlinkAnalyze::npx_data1 |>
  dplyr::filter(
    !is.na(.data[["Treatment"]])
  ) |>
  dplyr::filter(
    .data[["OlinkID"]] == "OID01216"
  ) |>
  ggplot2::ggplot(
    mapping = ggplot2::aes(
      x = .data[["Treatment"]],
      y = .data[["NPX"]],
      fill = .data[["Treatment"]]
    )
  ) +
  ggplot2::geom_boxplot() +
  OlinkAnalyze::set_plot_theme() +
  OlinkAnalyze::olink_fill_discrete()

olink_bridgeability_plot()
The olink_bridgeability_plot() function generates a
series of plots on a per-assay basis for a data frame generated from
between-product bridging. The coloration of the figure headers indicates
whether that assay has been defined as bridgeable or not bridgeable. The
correlation plot, violin plot, and bar chart figures illustrate the
three criteria for determining whether an assay is bridgeable. For
assays determined to be bridgeable, the ECDF curve and corresponding KS
statistic are used to determine which normalization approach (median
centering or quantile smoothing) is most suitable for between-product
normalization. For more information on the between-product bridging
methodology and bridgeability criteria, consult the Bridging
across NGS-based Olink® products Tutorial in the R
package OlinkAnalyzeVignettes.
df: NPX data frame generated from between-product
bridging in long format. Should minimally contain Assay,
OlinkID, OlinkID_E3072, and all data points
corresponding to bridging samples from the reference project and the new
project.
median_counts_threshold: Integer indicating minimum
median counts allowed for each platform. If either platform has median
counts below 150 for an assay, the assay fails the counts criteria when
evaluating bridgeability. Default: 150.
min_counts: Integer indicating minimum counts allowed
for a data point. If any data point in the bridge normalized dataframe
contains fewer than the defined minimum count cutoff, it is excluded
from the bridgeability assessment and corresponding figures. Default:
10.
bridge_sampleid: Character vector containing
overlapping SampleIDs between the two bridging projects. If this
argument is not provided, the function will look for overlapping
SampleID values between the two projects in the bridged
dataframe. Default: NULL.
check_log: Named list returned by
check_npx(). If NULL, check_npx()
is run internally.

npx_ht <- data_exploreht |>
  dplyr::filter(
    .data[["SampleType"]] == "SAMPLE"
  ) |>
  dplyr::mutate(
    Project = "data1"
  )
check_npx_ht <- OlinkAnalyze::check_npx(
  df = npx_ht
)
npx_3072 <- data_explore3072 |>
  dplyr::filter(
    .data[["SampleType"]] == "SAMPLE"
  ) |>
  dplyr::mutate(
    Project = "data2"
  )
check_npx_3072 <- OlinkAnalyze::check_npx(
  df = npx_3072
)
overlapping_samples <- unique(
  intersect(
    x = npx_ht |> dplyr::distinct(.data[["SampleID"]]) |> dplyr::pull(),
    y = npx_3072 |> dplyr::distinct(.data[["SampleID"]]) |> dplyr::pull()
  )
)
npx_br_data <- OlinkAnalyze::olink_normalization(
  df1 = npx_ht,
  df2 = npx_3072,
  overlapping_samples_df1 = overlapping_samples,
  df1_project_nr = "Explore HT",
  df2_project_nr = "Explore 3072",
  reference_project = "Explore HT",
  format = FALSE,
  df1_check_log = check_npx_ht,
  df2_check_log = check_npx_3072
)
check_npx_br_data <- OlinkAnalyze::check_npx(
  df = npx_br_data
)
npx_br_data_bridgeable_plt <- OlinkAnalyze::olink_bridgeability_plot(
  df = npx_br_data,
  median_counts_threshold = 150L,
  min_count = 10L,
  check_log = check_npx_br_data
)
npx_br_data_bridgeable_plt[[1L]]

A list of objects of class ggplot.
We are always happy to help. Email us with any questions:
biostat@olink.com for statistical services and general stats questions
support@olink.com for Olink lab product and technical support
info@olink.com for more information
© 2026 Olink Proteomics AB, part of Thermo Fisher Scientific.
Olink products and services are For Research Use Only. Not for use in diagnostic procedures.
All information in this document is subject to change without notice. This document is not intended to convey any warranties, representations and/or recommendations of any kind, unless such warranties, representations and/or recommendations are explicitly stated.
Olink assumes no liability arising from a prospective reader’s actions based on this document.
OLINK, NPX, PEA, PROXIMITY EXTENSION, INSIGHT and the Olink logotype are trademarks registered, or pending registration, by Olink Proteomics AB. All third-party trademarks are the property of their respective owners.
Olink products and assay methods are covered by several patents and patent applications https://www.olink.com/patents/.