🧬 Basic Concepts
What is cell type annotation and why is it important?
Cell type annotation is the process of identifying and labeling different cell types in single-cell RNA sequencing (scRNA-seq) data based on their gene expression patterns.
Why it matters:
- Disease Research: Understanding which cell types are affected in diseases
- Drug Development: Identifying target cell populations for treatments
- Developmental Biology: Tracking cell fate decisions during development
- Tissue Function: Understanding cellular composition and interactions
How does AI-powered cell type annotation work?
AI-powered annotation uses large language models (LLMs) trained on large volumes of biomedical literature. The workflow runs in five steps:
- Data Input: Upload your marker gene expression data
- AI Analysis: Models analyze gene signatures against biomedical knowledge
- Pattern Recognition: AI identifies characteristic expression patterns
- Consensus Building: Multiple models vote on cell type predictions
- Confidence Scoring: Each prediction receives a confidence estimate
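The consensus and confidence steps above can be pictured as a simple majority vote. The following is a minimal sketch, not the platform's actual algorithm: the function name `consensus_vote`, the model names, and the use of the agreement fraction as a confidence score are all assumptions made for illustration.

```python
from collections import Counter


def consensus_vote(predictions):
    """Return (label, agreement) for one cluster.

    predictions: mapping of model name -> predicted cell type label.
    agreement: fraction of models that voted for the winning label.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    # Normalize case and whitespace so "T cell" and "t cell " count as one label.
    votes = Counter(label.strip().lower() for label in predictions.values())
    label, count = votes.most_common(1)[0]
    return label, count / len(predictions)


# Hypothetical model names and outputs, for illustration only.
cluster_votes = {
    "model_a": "T cell",
    "model_b": "T cell",
    "model_c": "NK cell",
}
label, agreement = consensus_vote(cluster_votes)
print(label, round(agreement, 2))
```

A cluster whose agreement falls below a chosen threshold is a natural candidate for manual review or a further round of model discussion.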
🔬 Technical Usage
What file formats are supported for scRNA-seq data upload?
Supported formats:
- CSV files (.csv) - Comma-separated values
- TSV files (.tsv) - Tab-separated values
- Excel files (.xlsx) - Microsoft Excel format
Data structure requirements:
- Rows: Marker genes (gene symbols)
- Columns: Cell clusters or cell types
- Values: Expression levels, fold changes, or binary presence/absence
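As an illustration of the layout described above, here is a small sketch that reads a gene-by-cluster CSV table and lists the marker genes present in each cluster. The table contents, the column names `cluster_0` and `cluster_1`, and the helper `markers_by_cluster` are hypothetical; the platform may parse uploads differently.

```python
import csv
import io

# Hypothetical example table: rows are marker genes, columns are clusters.
# Values here are binary presence/absence; fold changes would parse the same way.
raw = """gene,cluster_0,cluster_1
CD3D,1,0
CD3E,1,0
CD79A,0,1
MS4A1,0,1
"""


def markers_by_cluster(handle, delimiter=","):
    """Read a gene-by-cluster table and return {cluster: [genes with value > 0]}."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    clusters = header[1:]  # first column holds the gene symbols
    result = {cluster: [] for cluster in clusters}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], row[1:]
        for cluster, value in zip(clusters, values):
            if float(value) > 0:
                result[cluster].append(gene)
    return result


markers = markers_by_cluster(io.StringIO(raw))
print(markers)
```

For a TSV file, the same function can be called with `delimiter="\t"`.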
Which AI models can I use for cell type annotation?
Available AI Models:
📊 Accuracy & Performance
How accurate is multi-model consensus annotation?
Accuracy Benchmarks:
- Single Model: 75-85% accuracy
- Multi-Model Consensus: 85-95% accuracy
- High-Confidence Predictions: >95% accuracy
Factors affecting accuracy:
- Tissue type complexity
- Marker gene quality and specificity
- Number of models in consensus
- Data preprocessing quality
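One way to check accuracy on your own data is to compare predicted labels with expert-curated labels for the same clusters. The sketch below assumes exact (case-insensitive) label matching, which is a simplification; the labels and the `accuracy` helper are hypothetical and are not how the benchmark figures above were produced.

```python
def accuracy(predicted, reference):
    """Fraction of shared clusters whose predicted label matches the reference label."""
    shared = predicted.keys() & reference.keys()
    if not shared:
        raise ValueError("no clusters in common")
    hits = sum(predicted[c].lower() == reference[c].lower() for c in shared)
    return hits / len(shared)


# Hypothetical labels for four clusters.
predicted = {"0": "T cell", "1": "B cell", "2": "Monocyte", "3": "NK cell"}
reference = {"0": "T cell", "1": "B cell", "2": "Monocyte", "3": "T cell"}
print(accuracy(predicted, reference))
```

Exact matching penalizes near misses such as "CD4+ T cell" versus "T cell", so a real evaluation would likely need a label hierarchy or manual review.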
🔐 Security & Privacy
Is my data secure when using the web platform?
Security Measures:
- Encryption: All data transmission uses HTTPS/TLS encryption
- Temporary Storage: Files are processed temporarily and deleted automatically
- No Permanent Storage: We don't keep your research data
- Secure APIs: All AI model APIs use secure connections
- User Control: You decide when to download and delete results
🔧 Troubleshooting
What should I do if annotation results seem incorrect?
Troubleshooting Steps:
- Check Data Quality: Ensure marker genes are specific and well-defined
- Adjust Parameters: Lower the consensus threshold or increase the number of discussion rounds
- Add More Models: Use additional AI models for better consensus
- Enable Discussion Mode: Let models discuss and refine predictions
- Validate Markers: Cross-check with literature or databases like CellMarker
- Consider Tissue Context: Some cell types are tissue-specific
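The marker validation step can be approximated with a quick overlap check against known markers. In the sketch below, `REFERENCE_MARKERS` is a small hand-written dictionary of commonly cited markers, used only for illustration; it is not an export from CellMarker, and `marker_support` is a hypothetical helper.

```python
# Small illustrative reference; a real check would use a curated database.
REFERENCE_MARKERS = {
    "T cell": {"CD3D", "CD3E", "CD2"},
    "B cell": {"CD79A", "MS4A1", "CD19"},
    "Monocyte": {"CD14", "LYZ", "S100A8"},
}


def marker_support(cluster_genes, cell_type, reference=REFERENCE_MARKERS):
    """Return the fraction of reference markers for cell_type found in cluster_genes."""
    expected = reference[cell_type]
    found = expected & {gene.upper() for gene in cluster_genes}
    return len(found) / len(expected)


# Hypothetical marker genes for one cluster.
cluster_genes = ["CD3D", "CD3E", "IL7R", "CCR7"]
for cell_type in REFERENCE_MARKERS:
    print(cell_type, round(marker_support(cluster_genes, cell_type), 2))
```

If the predicted cell type has low support while another type scores higher, the annotation for that cluster deserves a closer look.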
🔗 Integration & API
Can I integrate mLLMCelltype with my existing analysis pipeline?
Integration Options:
- Python Package: Install via pip for direct integration
- Web API: RESTful endpoints for programmatic access
- Scanpy Integration: Native support for Scanpy workflows
- Seurat Compatibility: Export results for R/Seurat analysis
- Jupyter Notebooks: Interactive analysis examples
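# Python package installation (run in a shell)
pip install mllmcelltype
# Basic usage (Python)
from mllmcelltype import annotate_cells
# marker_data: your marker gene table, structured as described under Technical Usage
results = annotate_cells(marker_data, models=['gpt-4', 'claude-3.5'])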
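For the Seurat export path, one simple option is to write annotations to a CSV file that R can read. This sketch assumes the results can be reduced to a mapping from cluster ID to cell type label; the source does not describe the actual return type, so the `annotations` dictionary and the `write_annotations` helper are hypothetical.

```python
import csv
import io


def write_annotations(annotations, handle):
    """Write {cluster: cell_type} as a two-column CSV with a header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["cluster", "cell_type"])
    for cluster in sorted(annotations):
        writer.writerow([cluster, annotations[cluster]])


# Hypothetical annotations; replace with labels derived from your own results.
annotations = {"0": "T cell", "1": "B cell"}

# An in-memory buffer keeps the example self-contained; to write a real file,
# use open("annotations.csv", "w", newline="") instead.
buffer = io.StringIO()
write_annotations(annotations, buffer)
print(buffer.getvalue())
```

The resulting file can be loaded in R with `read.csv` and matched to cluster identities in a Seurat object.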