
Normalize Phosphoproteomics Data and Perform Differential Analysis with `limma`
Source: R/get_norm_phos.R
get_norm_phos.Rd
This function processes raw phosphoproteomics intensity data, performs filtering, imputation, and median scaling using `PhosR` functions, and then conducts differential phosphorylation analysis using the `limma` package. Optionally, it can generate a volcano plot to visualize the results.
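The filtering and median-scaling steps can be sketched in base R on a toy matrix. This is an illustration only, not the package's implementation (which relies on `PhosR`): the matrix, the rule "keep a site if at least a fraction `alpha` of samples is quantified in at least one group", and all variable names are assumptions made for the example, and the imputation step is skipped.

```r
# Toy log2 intensity matrix: 3 sites x 6 samples, NA marks a missing value.
mat <- matrix(
  c(10, 11, 12, 13, 14, 15,
    NA, NA,  9, NA, 10, NA,
    NA, NA, NA, 12, 13, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    paste0("site_", 1:3),
    c("p1_control", "p2_control", "p3_control",
      "p4_treat", "p5_treat", "p6_treat")
  )
)

# Group label = text after the last underscore in each column name.
grps <- sub("^.*_", "", colnames(mat))
alpha <- 0.5  # assumed meaning: minimum quantified fraction within a group

# Fraction of quantified (non-NA) values per site and group.
quantified <- sapply(unique(grps), function(g) {
  rowMeans(!is.na(mat[, grps == g, drop = FALSE]))
})

# Keep sites that reach the threshold in at least one group.
keep <- rowSums(quantified >= alpha) >= 1
filtered <- mat[keep, , drop = FALSE]

# Median scaling: subtract each sample's median so all samples are centred at 0.
col_medians <- apply(filtered, 2, median, na.rm = TRUE)
scaled <- sweep(filtered, 2, col_medians, "-")
```

Here `site_2` is dropped because only one of three samples is quantified in each group, while `site_3` is kept because two of its three treated samples are quantified.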
Arguments
- data
A data frame containing phosphoproteomics quantification data. It is expected to have columns:
`protein_group`: Character, representing protein identifiers.
`amino_acid`: Character, representing the phosphorylated amino acid (e.g., "S", "T", "Y").
`site`: Numeric, representing the position of the phosphorylation site.
`modification_sites`: Character, a unique identifier for each phosphosite (e.g., "protein_S123"). This will be used as the row names for the output table.
Quantitative columns, one per sample, named with a sample identifier followed by the suffix "_control" or "_treat" (e.g., "p1_control", "p7_control", "p5_treat", "p6_treat").
- alpha
Numeric, passed to the `PhosR::selectGrps` function to control the proportion of missing values allowed when filtering phosphosites. Default is `0.5`.
- beta
Numeric, a threshold passed to the `PhosR::scImpute` function, controlling the fraction of quantified values a phosphosite must have within a condition for its missing values to be imputed. Default is `0.7`.
- plot
A character string specifying whether to generate a plot.
"no": No plot is generated; only the differential phosphorylation table is returned.
"volcano_plot": Generates a volcano plot to visualize differential phosphorylation.
- value
A character string indicating which p-value to use for the y-axis of the volcano plot when `plot = "volcano_plot"`.
"p_value": Uses nominal p-values (`P.Value`) for the y-axis.
"adj_p_value": Uses adjusted p-values (`adj.P.Val`) for the y-axis.
This parameter is ignored if `plot = "no"`.
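As a rough illustration of the column naming convention expected in `data`, the sample columns and their group labels can be recovered from the column names with a regular expression. This is a sketch under the assumption that the group is encoded as the text after the last underscore; the variable names are hypothetical and the function's internal matching may differ.

```r
# Hypothetical column names following the documented layout.
cols <- c("protein_group", "amino_acid", "site", "modification_sites",
          "p1_control", "p7_control", "p5_treat", "p6_treat")

# Sample columns are those ending in "_control" or "_treat".
quant_cols <- grep("_(control|treat)$", cols, value = TRUE)

# Group label: everything after the last underscore.
groups <- factor(sub("^.*_", "", quant_cols), levels = c("control", "treat"))
```

Annotation columns such as `site` are left out because they do not end in one of the two suffixes.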
Value
If `plot = "no"`, returns a `data.frame` (tibble) containing the results of the `limma` differential analysis (e.g., `logFC`, `P.Value`, `adj.P.Val`). If `plot = "volcano_plot"`, returns a `list` containing:
`table`: The `data.frame` of differential phosphorylation results.
`plot`: A `ggplot` object of the generated volcano plot.
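The relationship between the results table and the volcano plot axes can be sketched as follows. The table below is made up for illustration, and the mapping from `value` to a column is an assumption based on the argument description; a volcano plot conventionally shows `logFC` on the x-axis and the negative log10 of the chosen p-value on the y-axis.

```r
# Hypothetical results table using the column names mentioned above.
res <- data.frame(
  logFC     = c(1.8, -0.2, -2.1),
  P.Value   = c(0.001, 0.60, 0.0005),
  adj.P.Val = c(0.003, 0.60, 0.0015),
  row.names = paste0("site_", 1:3)
)

# Pick the p-value column according to the `value` argument.
value <- "adj_p_value"
p_col <- if (value == "adj_p_value") "adj.P.Val" else "P.Value"

# Volcano coordinates: effect size on x, -log10(p) on y.
volcano <- data.frame(
  x = res$logFC,
  y = -log10(res[[p_col]]),
  row.names = rownames(res)
)
```

Sites with large absolute `x` and large `y` sit in the upper corners of the plot and are the strongest candidates for differential phosphorylation.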
Examples
if (FALSE) { # \dontrun{
# Load tibble for the tibble() constructor and fix the seed for reproducibility
library(tibble)
set.seed(1)
# Create a data frame that mimics omics data
# Quantitative column names must end in "_control" or "_treat"
data <- tibble(
  modification_sites = paste0("site_", 1:10),
  protein_group = rep(LETTERS[1:5], 2),
  amino_acid = c(rep("S", 5), rep("T", 5)),
  site = 1:10,
  PTM.Group = paste0("group_", 1:10),
  p1_control = rnorm(10, mean = 100, sd = 10),
  p2_control = rnorm(10, mean = 95, sd = 15),
  p3_control = rnorm(10, mean = 105, sd = 12),
  p4_treat = c(rnorm(5, mean = 150, sd = 20), rnorm(5, mean = 50, sd = 10)),
  p5_treat = c(rnorm(5, mean = 145, sd = 18), rnorm(5, mean = 55, sd = 11)),
  p6_treat = c(rnorm(5, mean = 160, sd = 22), rnorm(5, mean = 60, sd = 13))
)
# Run the function to get the differential phosphorylation table
# We set plot = "no" to suppress the volcano plot
results_table <- get_norm_phos(
  data = data,
  plot = "no"
)
# Print the head of the results table
print(head(results_table))
# Run the function to get the results and the volcano plot
# We set plot = "volcano_plot" and value = "adj_p_value"
results_with_plot <- get_norm_phos(
  data = data,
  plot = "volcano_plot",
  value = "adj_p_value"
)
# The function returns a list, so we can access the plot and table separately
results_table <- results_with_plot$table
volcano_plot <- results_with_plot$plot
# Print the plot
print(volcano_plot)
# To save the plot to a file
# ggplot2::ggsave("volcano_plot.png", plot = volcano_plot, width = 8, height = 6, dpi = 300)
} # }