Normalize Protein Data and Perform Differential Expression Analysis

This function processes protein data by performing normalization, imputation, and differential expression analysis using methods from the PhosR and limma packages. It can also generate a volcano plot to visualize the results.

Usage

get_norm_prot(
  data,
  alpha = 0.5,
  beta = 0.7,
  plot = c("no", "volcano_plot"),
  value = c("p_value", "adj_p_value")
)

Arguments

data: A data frame containing protein data. The function assumes that columns holding sample intensities have names made of a sample identifier followed by the suffix "_control" or "_treat" (e.g., "p1_control", "p7_control", "p5_treat", "p6_treat").
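alpha: A numeric value between 0 and 1. This parameter is used by `PhosR::selectGrps` to filter out proteins with too many missing values. The 0-to-1 argument of `selectGrps` is `percent`, the minimum fraction of quantified (non-missing) values required within a condition group, so a protein is presumably kept only if it reaches a quantified fraction of at least `alpha` in enough groups. Default is `0.5`.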
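beta: A numeric value between 0 and 1. This parameter is used by `PhosR::scImpute`, which performs site- and condition-specific imputation: missing values of a protein within a condition are imputed only if the fraction of quantified values in that condition is at least this threshold. Default is `0.7`.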
plot: A character string. Specifies whether to generate a volcano plot. Must be one of "no" (default) or "volcano_plot".
value: A character string. Specifies which p-value to use for the volcano plot's y-axis. Must be one of "p_value" (default) or "adj_p_value".
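
The naming convention and the idea of a per-group quantified fraction can be illustrated with a short base R sketch. This is not the function's internal code; the objects `cols`, `sample_cols`, `groups`, `mat`, and `quantified` are hypothetical names, and the intensity values are toy data.

```r
# Hypothetical column names following the documented convention
cols <- c("protein_group", "p1_control", "p7_control", "p5_treat", "p6_treat")

# Keep only columns whose names end in "_control" or "_treat"
sample_cols <- grep("_(control|treat)$", cols, value = TRUE)

# Extract the condition label (the part after the last underscore)
groups <- sub("^.*_", "", sample_cols)

# Toy intensity matrix: 3 proteins x 4 samples, NA marks a missing value
mat <- matrix(
  c(10, 11, NA, 12,
    NA, NA,  9, 10,
     8, NA, NA, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A", "B", "C"), sample_cols)
)

# Fraction of quantified (non-missing) values per protein within each group
quantified <- sapply(unique(groups), function(g) {
  rowMeans(!is.na(mat[, groups == g, drop = FALSE]))
})

print(groups)
print(quantified)
```

A threshold such as `alpha` or `beta` would then be compared against fractions like those in `quantified`; how exactly `get_norm_prot` applies them is determined by the PhosR functions it calls.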

Value

If `plot` is "no", the function returns a tibble with the results of the differential expression analysis. This table includes columns for log fold change (`logFC`), average expression, t-statistic, p-value, and adjusted p-value. If `plot` is "volcano_plot", the function returns a list containing two elements: `table` (the results tibble) and `plot` (a `ggplot` object of the volcano plot).
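
As a rough sketch of how the returned table might be used, the snippet below filters a toy stand-in for the results by effect size and adjusted p-value. Only `logFC` is a column name given in this documentation; `protein_group` and `adj_p_value` are assumed names, the cutoffs are arbitrary, and the values are invented for illustration.

```r
# Toy stand-in for the returned results table.
# `logFC` is documented; `protein_group` and `adj_p_value` are assumed names.
results <- data.frame(
  protein_group = c("A", "B", "C", "D"),
  logFC = c(1.8, -0.2, -1.5, 0.9),
  adj_p_value = c(0.001, 0.60, 0.03, 0.20),
  stringsAsFactors = FALSE
)

# Keep proteins with |logFC| >= 1 and adjusted p-value below 0.05
# (both cutoffs are illustrative choices, not defaults of get_norm_prot)
significant <- results[abs(results$logFC) >= 1 & results$adj_p_value < 0.05, ]

print(significant)
```

Check the actual column names with `names(results_table)` before adapting this filter.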

Examples

if (FALSE) { # \dontrun{
# tibble() comes from the tibble package
library(tibble)

# Create a data frame that mimics omics data
# Sample column names must end in "_control" or "_treat"
data <- tibble(
        protein_group = LETTERS[1:10],
        p1_control = rnorm(10, mean = 100, sd = 10),
        p2_control = rnorm(10, mean = 95, sd = 15),
        p3_control = rnorm(10, mean = 105, sd = 12),
        p4_treat = c(rnorm(5, mean = 150, sd = 20), rnorm(5, mean = 50, sd = 10)),
        p5_treat = c(rnorm(5, mean = 145, sd = 18), rnorm(5, mean = 55, sd = 11)),
        p6_treat = c(rnorm(5, mean = 160, sd = 22), rnorm(5, mean = 60, sd = 13))
)

# Run the function to get the differential expression results table
# We set plot = "no" to suppress the volcano plot
results_table <- get_norm_prot(
        data = data,
        plot = "no"
)

# Print the head of the results table
print(head(results_table))

# Run the function to get the results and the volcano plot
# We set plot = "volcano_plot" and value = "adj_p_value"
results_with_plot <- get_norm_prot(
        data = data,
        plot = "volcano_plot",
        value = "adj_p_value"
)

# The function returns a list, so we can access the plot and table separately
results_table <- results_with_plot$table
volcano_plot <- results_with_plot$plot

# Print the plot
print(volcano_plot)

# To save the plot to a file
# ggplot2::ggsave("volcano_plot.png", plot = volcano_plot, width = 8, height = 6, dpi = 300)
} # }
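
The simulated data above contains no missing values, so the filtering and imputation steps controlled by `alpha` and `beta` have nothing to act on. The following base R sketch shows one way to blank out a random subset of intensities before calling the function. The 15% missingness rate, the seed, and the object names `mask` and `values` are arbitrary assumptions for illustration, and a plain data frame is used instead of a tibble to keep the sketch dependency-free.

```r
set.seed(42)

# Simulated intensities following the documented naming convention
data <- data.frame(
  protein_group = LETTERS[1:10],
  p1_control = rnorm(10, mean = 100, sd = 10),
  p2_control = rnorm(10, mean = 95, sd = 15),
  p3_control = rnorm(10, mean = 105, sd = 12),
  p4_treat = rnorm(10, mean = 150, sd = 20),
  p5_treat = rnorm(10, mean = 145, sd = 18),
  p6_treat = rnorm(10, mean = 160, sd = 22),
  stringsAsFactors = FALSE
)

sample_cols <- setdiff(names(data), "protein_group")

# Mark roughly 15% of the intensity cells as missing (arbitrary rate)
mask <- matrix(
  runif(nrow(data) * length(sample_cols)) < 0.15,
  nrow = nrow(data)
)

# Apply the mask to the intensity columns only
values <- as.matrix(data[, sample_cols])
values[mask] <- NA
data[, sample_cols] <- values

# Number of missing values per sample column
print(colSums(is.na(data[, sample_cols])))
```

The resulting `data` can be passed to `get_norm_prot()` in the same way as in the examples above.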