Installation Guide
This guide provides detailed instructions for installing and
configuring mLLMCelltype for cell type annotation in single-cell RNA
sequencing data.
System Requirements
Before installing mLLMCelltype, ensure your system meets the
following requirements:
- R version: 4.0.0 or higher
- Memory: At least 8GB RAM recommended (more for
large datasets)
- Operating System: Windows, macOS, or Linux
- Internet Connection: Required for API calls to LLM
providers
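The R version requirement can be checked from the console before you install anything. The following is a minimal sketch using base R only; the variable names `min_r_version`, `r_current`, and `r_ok` are illustrative and are not part of mLLMCelltype.

```r
# Minimum R version stated in the requirements above
min_r_version <- "4.0.0"

# getRversion() returns the running R version as a comparable object
r_current <- as.character(getRversion())
r_ok <- getRversion() >= min_r_version

if (r_ok) {
  message("R ", r_current, " meets the minimum requirement (", min_r_version, ")")
} else {
  message("R ", r_current, " is older than ", min_r_version, "; please upgrade R")
}
```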
Installing the R Package
Installation from CRAN (Recommended)
mLLMCelltype is now available on CRAN. You can install it directly
using:
# Install from CRAN
install.packages("mLLMCelltype")
This will install the stable version of mLLMCelltype with all
required dependencies.
Installation from GitHub (Development Version)
To install the latest development version from GitHub:
# Install devtools if not already installed
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
# Install mLLMCelltype development version
devtools::install_github("cafferychen777/mLLMCelltype", subdir = "R")
Installation from a Local Source
If you have downloaded the source code or need to install from a
local copy:
# Replace the path below with the location of the package's R subdirectory
devtools::install_local("path/to/mLLMCelltype/R")
Dependencies
mLLMCelltype depends on several R packages that will be automatically
installed during the installation process. The main dependencies
include:
- dplyr: For data manipulation
- httr: For API requests
- jsonlite: For JSON parsing
- R6: For object-oriented programming
- digest: For caching mechanisms
- magrittr: For pipe operations
For visualization and integration with single-cell analysis
workflows, the following packages are recommended but not required:
- Seurat: For integration with Seurat objects
- ggplot2: For visualization
- SCpubr: For publication-ready visualizations
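If you want to see which of these packages are already present in your library, a short check along the following lines may help. This is a sketch in base R, not a feature of mLLMCelltype; the helper name `missing_packages` is hypothetical, and the package lists are copied from the lists above.

```r
# Packages listed above as main dependencies and as optional extras
required_pkgs <- c("dplyr", "httr", "jsonlite", "R6", "digest", "magrittr")
optional_pkgs <- c("Seurat", "ggplot2", "SCpubr")

# Return the subset of `pkgs` that is not installed.
# system.file() returns "" for a package that cannot be found,
# and it does not load the package while checking.
missing_packages <- function(pkgs) {
  is_installed <- vapply(
    pkgs,
    function(pkg) nzchar(system.file(package = pkg)),
    logical(1)
  )
  pkgs[!is_installed]
}

missing_required <- missing_packages(required_pkgs)
missing_optional <- missing_packages(optional_pkgs)

if (length(missing_required) > 0) {
  message("Missing required packages: ", paste(missing_required, collapse = ", "))
}
if (length(missing_optional) > 0) {
  message("Missing optional packages: ", paste(missing_optional, collapse = ", "))
}
```

Anything reported as missing can then be installed with install.packages().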
API Keys Setup
mLLMCelltype requires API keys to access different LLM providers. You
will need an API key for at least one of the supported providers.
Obtaining API Keys
- OpenAI (GPT-4o/4.1)
  - Visit OpenAI Platform
  - Create an account or log in
  - Navigate to the API keys section
  - Create a new API key
- Anthropic (Claude-3.7/3.5)
  - Visit Anthropic Console
  - Create an account or log in
  - Generate an API key
- Google (Gemini-2.0/2.5)
- Other Providers
  - Similar processes apply for DeepSeek, Qwen, Zhipu, MiniMax, Stepfun,
    and Grok
  - Visit their respective websites to obtain API keys
Setting Up API Keys
There are three ways to set up your API keys:
1. Environment Variables
Create a .env file in your project directory with your
API keys:
# API Keys for different LLM models
OPENAI_API_KEY=your-openai-key
ANTHROPIC_API_KEY=your-anthropic-key
GEMINI_API_KEY=your-gemini-key
DEEPSEEK_API_KEY=your-deepseek-key
QWEN_API_KEY=your-qwen-key
ZHIPU_API_KEY=your-zhipu-key
STEPFUN_API_KEY=your-stepfun-key
MINIMAX_API_KEY=your-minimax-key
GROK_API_KEY=your-grok-key
OPENROUTER_API_KEY=your-openrouter-key
Then load the environment variables in your R script:
library(dotenv)
dotenv::load_dot_env()
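If you would rather not add dotenv as a dependency, base R's readRenviron() reads files in the same NAME=value form and skips comment lines starting with #. The sketch below assumes the .env file sits in the current working directory; the wrapper name `load_env_file` is hypothetical.

```r
# Load NAME=value pairs from a file into the current R session.
# Returns TRUE (invisibly) on success and FALSE if the file is absent.
load_env_file <- function(path = ".env") {
  if (!file.exists(path)) {
    message("No file found at ", path)
    return(invisible(FALSE))
  }
  invisible(readRenviron(path))
}

load_env_file()

# Confirm a key is visible to R without printing its value
nzchar(Sys.getenv("ANTHROPIC_API_KEY"))
```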
2. Direct Specification in Function Calls
You can directly provide API keys in function calls:
library(mLLMCelltype)
results <- annotate_cell_types(
  input = your_marker_data,
  tissue_name = "human PBMC",
  model = "claude-sonnet-4-5-20250929",
  api_key = "your-anthropic-key",
  top_gene_count = 10
)
3. R Environment Variables
Set API keys as environment variables for the current R session:
Sys.setenv(OPENAI_API_KEY = "your-openai-key")
Sys.setenv(ANTHROPIC_API_KEY = "your-anthropic-key")
# Set other API keys as needed
Verifying Installation
To verify that mLLMCelltype is installed correctly and API keys are
set up properly:
library(mLLMCelltype)
# Check if the package is loaded correctly
packageVersion("mLLMCelltype")
# Verify API key setup for a specific provider
api_key <- get_api_key("anthropic")
if (!is.null(api_key) && api_key != "") {
  cat("Anthropic API key is set up correctly\n")
} else {
  cat("Anthropic API key is not set up\n")
}
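To check several providers at once without printing any key values, you can inspect the environment variables directly. This is a sketch in base R; the helper name `key_status` is hypothetical, and the variable names are taken from the .env example earlier in this guide.

```r
# Environment variable names from the .env example in this guide
key_vars <- c(
  "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY",
  "DEEPSEEK_API_KEY", "QWEN_API_KEY", "ZHIPU_API_KEY",
  "STEPFUN_API_KEY", "MINIMAX_API_KEY", "GROK_API_KEY",
  "OPENROUTER_API_KEY"
)

# Return a named logical vector: TRUE where the variable is set and non-empty
key_status <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  stats::setNames(nzchar(values), vars)
}

print(key_status(key_vars))
```

A value of TRUE only means the variable is non-empty; it does not confirm that the provider accepts the key.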
Common Installation Issues
Package Installation Failures
If you encounter issues during installation:
- Check R version: Ensure you’re using R 4.0.0 or
higher
- Update devtools: Run
install.packages("devtools") to ensure you have the latest
version
- Check dependencies: Some dependencies might require
system libraries on Linux
API Connection Issues
If you encounter issues connecting to LLM APIs:
- Verify API keys: Ensure your API keys are correct
and have not expired
- Check internet connection: Ensure you have a stable
internet connection
- Proxy settings: If you’re behind a proxy, configure
R to use your proxy settings
# Example of setting a proxy for httr
# "proxy_url" and 8080 are placeholders; replace them with your proxy host and port
httr::set_config(httr::use_proxy(url = "proxy_url", port = 8080))
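As an alternative, libcurl, which httr uses underneath, generally honors the http_proxy and https_proxy environment variables, so the proxy can also be set for the whole session. The address below is a made-up placeholder, not a real proxy.

```r
# Placeholder proxy address; replace with your own host and port
proxy_address <- "http://proxy.example.com:8080"

# Set the proxy for HTTP and HTTPS requests made in this R session
Sys.setenv(http_proxy = proxy_address, https_proxy = proxy_address)

# Inspect the values that are now in effect
Sys.getenv(c("http_proxy", "https_proxy"))
```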
Memory Limitations
For large datasets, you might encounter memory issues:
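- Increase R memory limit: On Windows with R versions before 4.2.0,
use memory.limit(size = 16000) to raise the limit. From R 4.2.0
onward, memory.limit() is a stub that no longer changes anything, and
R uses the memory the operating system makes available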
- Process data in batches: Consider processing large
datasets in smaller batches
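One way to process data in batches is to split the marker table by cluster and handle a few clusters at a time. The sketch below assumes the marker data is a data frame with a `cluster` column (as in typical differential-expression output); the function name `split_markers_into_batches` and the toy data are hypothetical.

```r
# Split a marker data frame into a list of smaller data frames,
# each holding the rows for at most `batch_size` clusters.
split_markers_into_batches <- function(markers, batch_size = 5,
                                       cluster_col = "cluster") {
  stopifnot(
    is.data.frame(markers),
    cluster_col %in% names(markers),
    batch_size >= 1
  )
  cluster_ids <- as.character(markers[[cluster_col]])
  clusters <- unique(cluster_ids)
  # Assign clusters to batch indices: 1, 1, ..., 2, 2, ...
  batch_index <- ceiling(seq_along(clusters) / batch_size)
  cluster_batches <- split(clusters, batch_index)
  lapply(cluster_batches, function(ids) {
    markers[cluster_ids %in% ids, , drop = FALSE]
  })
}

# Toy data: 7 clusters with 2 marker genes each
toy_markers <- data.frame(
  cluster = rep(0:6, each = 2),
  gene = paste0("Gene", 1:14),
  stringsAsFactors = FALSE
)

batches <- split_markers_into_batches(toy_markers, batch_size = 3)
length(batches)                    # number of batches
vapply(batches, nrow, integer(1))  # rows in each batch
```

Each element of `batches` could then be passed as `input` in its own annotate_cell_types() call, with the results combined afterwards.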
Next Steps
Now that you have installed mLLMCelltype, you can proceed to the rest
of the documentation.
If you encounter any issues not covered in this guide, please open an
issue on our GitHub repository.