edge-gwas
Contents:
Installation Guide
System Requirements
Dependencies
Quick Installation
Installation from GitHub
Installation from Source
Virtual Environment Setup
External Tools Installation (NEW in v0.1.1)
Verify Installation
Supported File Formats
Upgrading from v0.1.0
Troubleshooting
Common Issues
Platform-Specific Notes
Development Installation
Docker Installation (Optional)
Singularity/Apptainer (for HPC)
Testing Your Installation
Installation Checklist
Minimum Installation (Core Only)
Recommended Installation (Full Features)
Configuration
Uninstallation
Getting Help
Next Steps
See Also
Quick Start Guide
Minimal Example
Complete Example with v0.1.1 Features
Step-by-Step Workflow
1. Install and Setup
2. Import Packages
3. Load Data
4. Quality Control
5. Population Structure Control (NEW in v0.1.1)
6. Initialize EDGE Analysis
7. Split Data
8. Run EDGE Analysis
9. Examine Results
10. Visualize Results
Understanding the Output
Alpha Values DataFrame
GWAS Results DataFrame
Interpreting Alpha Values
Quick Recipes
Recipe 1: Continuous Trait with Transformation
Recipe 2: Analysis with Related Samples
Recipe 3: Compare EDGE with Additive Model
Recipe 4: Cross-Validation
Recipe 5: Multi-Format Data Loading
Common Workflow Patterns
Pattern 1: Basic Case-Control Study
Pattern 2: Quantitative Trait Study
Pattern 3: Biobank-Scale Analysis
Troubleshooting Quick Fixes
What’s New in v0.1.1?
See Also
User Guide
Overview
EDGE Methodology
Complete Workflow
1. Load Data
2. Quality Control
3. Population Structure Control (NEW in v0.1.1)
4. Split Data
5. Run EDGE Analysis
6. Interpret Results
7. Visualize
File Formats
Best Practices
Quality Control
Covariates
Data Splitting
Performance
Advanced Features (v0.1.1)
Cross-Validation
Compare with Additive GWAS
Population Structure Options
Outcome Transformations
Troubleshooting
Common Issues
Quick Reference
See Also
API Reference
Core Module
EDGEAnalysis Class
Methods
calculate_alpha()
apply_alpha()
run_full_analysis()
get_skipped_snps()
Utilities Module
load_plink_data()
load_pgen_data()
load_vcf_data()
load_bgen_data()
prepare_phenotype_data()
filter_genotype_data()
validate_and_align_data()
validate_genotype_df()
validate_and_fix_encoding()
Visualization Module
I/O Handlers Module
Data Structures
Constants and Defaults
Exceptions
Version Information
Command-Line Tools (NEW in v0.1.1)
Function Reference Summary
Complete Workflow Example
Migration Guide (v0.1.0 → v0.1.1)
Function Index
Quick Function Finder
See Also
Statistical Model
Overview
Regression Model
Stage 1: Codominant Model (Training Data)
Encoding Parameter
Stage 2: EDGE-Encoded Model (Test Data)
Interpretation of Alpha
Two-Stage Analysis
Why Two Stages?
Stage 1: Calculate Alpha (Training Data)
Stage 2: Apply Alpha (Test Data)
Outcome Types
Binary Outcomes (Logistic Regression)
Quantitative Outcomes (Linear Regression)
Optimization Methods
Available Algorithms
Algorithm Details
Usage Example
Choosing an Optimization Method
Convergence Diagnostics
Outcome Transformations
Available Transformations
Log Transformation
Inverse Normal Transformation (INT)
Rank-Based Inverse Normal Transformation (RINT)
Choosing a Transformation
Effect Size Interpretation After Transformation
Transformation Diagnostics
Statistical Testing
Hypothesis Testing
P-value Calculation
Multiple Testing Correction
Genomic Inflation Factor
Quality Control Considerations
Minor Allele Frequency (MAF)
Hardy-Weinberg Equilibrium (HWE)
Missingness
Population Structure
Sample Size Considerations
Minimum Sample Size
Power Calculations
Comparison with Other Methods
EDGE vs. Standard Additive GWAS
EDGE vs. Genotypic Model
EDGE vs. Cochran-Armitage Trend Test
Practical Considerations
When to Use EDGE
Recommended Workflow
Computational Complexity
Implementation Details
Software Implementation
Convergence Issues
Extensions and Future Directions
Planned Features (v0.2.0+)
Research Directions
Limitations
Current Limitations
Statistical Assumptions
Validation Studies
Simulation Studies
Real Data Applications
Method Comparison Studies
Mathematical Proofs
Unbiased p-values (Sketch)
Consistency of α (Sketch)
Convergence of Optimization Methods
Glossary of Terms
Best Practices Summary
Data Preparation
Parameter Selection
Result Interpretation
Appendix A: Optimization Method Details
See Also
Example Workflows
Example 1: Basic Binary Outcome Analysis with PCA
Example 2: Quantitative Trait with Outcome Transformation
Example 3: Analysis with GRM for Related Samples
Example 4: Multi-Format Data Loading
Example 5: Comprehensive Quality Control Pipeline
Example 6: Comparing EDGE with Additive GWAS
Example 7: Cross-Validation for Model Stability
Example 8: Complete Analysis with All v0.1.1 Features
Example 9: Multi-Chromosome Genome-Wide Analysis
Example 10: Batch Processing for Very Large Datasets
Example 11: Using Pre-Calculated Alpha Values
Example 12: Detailed Alpha Interpretation and Reporting
Tips and Best Practices
Data Preparation
Quality Control Best Practices
Analysis Strategy
Performance Optimization
Interpretation Guidelines
Common Pitfalls to Avoid
Troubleshooting Common Issues
See Also
Visualization Guide
Built-in Visualization Functions
Manhattan Plot
Basic Usage
Advanced Options
Multiple Chromosomes
QQ Plot
Basic Usage
Advanced Options
Alpha Distribution Plot
Basic Usage
Advanced Options
Custom Visualizations
Regional Association Plot
Alpha vs Effect Size Plot
Forest Plot for Top Hits
Comparing Multiple Analyses
Interactive Plots with Plotly
Volcano Plot
Multi-Panel Summary Figure
Plotting Tips
Color Schemes
Figure Resolution
Font Sizes
Exporting Plots
See Also
Changelog
Version 0.1.1 (2025-12-25) - Current Release
Added
Changed
Migration from v0.1.0
Bug Fixes
Version 0.1.0 (2025-12-24)
Added
Dependencies
Version 0.0.0 (2024-04-02) - Legacy
Version History Summary
See Also
Citation
Primary Citation
BibTeX Entry
Related Publications
Acknowledgments
Contact for Citation Questions
See Also
Troubleshooting Guide
Common Issues and Solutions
Common Errors and Solutions
Error 1: “ValueError: No common samples found between analysis data and GRM”
Error 2: “LinAlgError: Matrix is singular”
Error 3: “ConvergenceWarning: Maximum iterations reached”
Error 4: “ValueError: Log transformation requires all positive values”
Error 5: “KeyError: ‘variant_id’ not found”
Error 6: “MemoryError: Unable to allocate array”
Error 7: “RuntimeWarning: invalid value encountered in divide”
Performance Optimization Tips
Speed Optimization
Memory Optimization
Accuracy Optimization
Reproducibility Best Practices
Setting Random Seeds
Version Control
Parameter Documentation
Appendix A: Troubleshooting Decision Tree
See Also
Frequently Asked Questions (FAQ)
General Questions
Technical Questions
Interpretation Questions
Practical Implementation Questions
See Also
Advanced Topics for Further Updates
Extensions and Future Directions
Planned Features
Research Directions
Advanced Topics
Cross-Validation for Alpha Estimation
Meta-Analysis of EDGE Results
Conditional Analysis
Gene-Based Testing (Future Feature Preview)
Interaction Effects
Stratified Analysis
Polygenic Risk Scores with EDGE
Sensitivity Analyses
Integration with Other Tools
LD Score Regression
Fine-Mapping
Colocalization Analysis
Pathway Enrichment
Visualization Examples
Alpha Distribution Plot
Manhattan Plot with Alpha Coloring
QQ Plot Comparison
Regional Association Plot
Effect Size Comparison Plot
Convergence Diagnostic Plot
Power Analysis Visualization
Simulation Code
Sample Size Calculations
Useful Helper Functions
See Also