Map Gene Identifiers

Usage

get_annotation(
  data,
  input_id_column,
  input_id_type,
  output_id_type = "ENTREZID",
  organism = c("Homo sapiens", "Mus musculus")
)

Arguments

data

A data frame containing the gene identifiers to be mapped.

input_id_column

A character string specifying the name of the column in `data` that contains the gene identifiers to be mapped.

input_id_type

A character string specifying the type of the input gene identifiers (e.g., "ENSEMBL", "SYMBOL", "REFSEQ"). This corresponds to the `keytype` argument of `AnnotationDbi::mapIds()`. Supported `keytype` values can be listed by running `keytypes(org.Hs.eg.db)` for Homo sapiens or `keytypes(org.Mm.eg.db)` for Mus musculus.

output_id_type

A character string specifying the type of the output gene identifiers (e.g., "ENTREZID", "ENSEMBL", "SYMBOL"). This corresponds to the `column` argument of `AnnotationDbi::mapIds()`. Supported values can be listed by running `columns(org.Hs.eg.db)` for Homo sapiens or `columns(org.Mm.eg.db)` for Mus musculus. Defaults to "ENTREZID".
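To make the relationship to `AnnotationDbi::mapIds()` concrete, the following is a minimal sketch of a direct lookup, with the input type passed as `keytype` and the output type as `column`. It assumes the Bioconductor packages AnnotationDbi and org.Hs.eg.db are installed, and the choice of `multiVals = "first"` is an assumption for illustration; how `get_annotation()` resolves one-to-many mappings internally is not documented here.

```r
# Sketch only: runs the lookup when the Bioconductor packages are available.
ids <- NULL
if (requireNamespace("AnnotationDbi", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  db <- org.Hs.eg.db::org.Hs.eg.db

  # Identifier types the human annotation database accepts as input
  print(AnnotationDbi::keytypes(db))

  # Map gene symbols (keytype) to Entrez IDs (column).
  # multiVals = "first" is an assumed choice, not confirmed package behaviour.
  ids <- AnnotationDbi::mapIds(
    db,
    keys = c("TP53", "BRCA1"),
    keytype = "SYMBOL",
    column = "ENTREZID",
    multiVals = "first"
  )
  print(ids)
}
```

The result of `mapIds()` is a vector named by the input keys, which is what makes it straightforward to attach as a new column.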

organism

A character string, either "Homo sapiens" or "Mus musculus", specifying the organism for which the mapping should be performed. Defaults to "Homo sapiens".
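The default shown in Usage, `c("Homo sapiens", "Mus musculus")`, resolving to "Homo sapiens" is the usual `match.arg()` idiom in R. The sketch below illustrates that idiom with a hypothetical helper, `resolve_organism()`, which is not part of the package; whether `get_annotation()` uses `match.arg()` internally is an assumption.

```r
# Hypothetical helper for illustration only; not part of the package.
resolve_organism <- function(organism = c("Homo sapiens", "Mus musculus")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and raises an error for values outside the allowed set.
  match.arg(organism)
}

default_org <- resolve_organism()
mouse_org <- resolve_organism("Mus musculus")

# An unsupported organism is rejected rather than silently accepted
bad_org <- tryCatch(
  resolve_organism("Danio rerio"),
  error = function(e) "error"
)

print(default_org)
print(mouse_org)
print(bad_org)
```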

Value

A data frame identical to the input `data`, with an additional column named `entrez_id` containing the mapped Entrez Gene IDs. If no mapping is found for an ID, the `entrez_id` column contains `NA` for that row.
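Because unmapped identifiers come back as `NA`, it is often useful to separate them before downstream analysis. The sketch below uses a hand-built stand-in for a result, not real function output, and assumes `entrez_id` is stored as character.

```r
# Hand-built stand-in for a get_annotation() result (not real output).
# Assumption: entrez_id is a character column.
mapped <- data.frame(
  GeneSymbol = c("TP53", "UNKNOWN_GENE"),
  Expression = c(10, 5),
  entrez_id = c("7157", NA),
  stringsAsFactors = FALSE
)

# Rows whose identifier could not be mapped
unmapped <- mapped[is.na(mapped$entrez_id), , drop = FALSE]

# Rows with a usable Entrez ID
mapped_only <- mapped[!is.na(mapped$entrez_id), , drop = FALSE]

print(unmapped$GeneSymbol)
print(nrow(mapped_only))
```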

Examples

if (FALSE) { # \dontrun{
# --- Example for Homo sapiens ---
# Create a sample data frame
my_data_human <- data.frame(
  GeneSymbol = c("TP53", "BRCA1", "MYC", "GAPDH", "UNKNOWN_GENE"),
  Expression = c(10, 15, 20, 50, 5),
  stringsAsFactors = FALSE
)

# Map Gene Symbols to Entrez IDs for human
mapped_data_human <- get_annotation(
  data = my_data_human,
  input_id_column = "GeneSymbol",
  input_id_type = "SYMBOL",
  output_id_type = "ENTREZID",
  organism = "Homo sapiens"
)
print(mapped_data_human)

# --- Example for Mus musculus ---
# Create another sample data frame
my_data_mouse <- data.frame(
  EnsemblID = c("ENSMUSG00000020717", "ENSMUSG00000026774", "ENSMUSG00000000001"),
  FoldChange = c(1.2, -0.8, 2.5),
  stringsAsFactors = FALSE
)

# Map Ensembl IDs to Entrez IDs for mouse
mapped_data_mouse <- get_annotation(
  data = my_data_mouse,
  input_id_column = "EnsemblID",
  input_id_type = "ENSEMBL",
  output_id_type = "ENTREZID",
  organism = "Mus musculus"
)
print(mapped_data_mouse)

# Example with a non-existent input column
tryCatch({
  get_annotation(my_data_human, "NonExistentColumn", "SYMBOL", "ENTREZID")
}, error = function(e) {
  message("Caught expected error: ", e$message)
})
} # }