Soybean Phenotype Prediction

A Novel Framework for Soybean Phenotype Prediction and Salient Loci Mining via Machine Learning and Interpretability Analysis

This platform uses our custom-designed Base-Type SNP encoding and a Support Vector Regression (SVR) model to predict three essential phenotypes: oil content, protein content, and water-soluble protein content.

Upload a VCF file for a single soybean variety. The platform extracts the essential SNP loci, fills in missing data using a reference genome, and returns phenotype predictions backed by explainable machine learning.
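The missing-data step can be illustrated with a minimal sketch. It assumes that a locus absent from the uploaded VCF is treated as homozygous for the reference base; the platform's actual rule is not documented here. The function name fill_missing_loci, the two-letter genotype strings, and the example loci are all hypothetical, and the Base-Type encoding itself is not reproduced.

```python
def fill_missing_loci(required_loci, called, reference):
    """Return a genotype for every required locus.

    required_loci: list of (chrom, pos) tuples the model expects (hypothetical).
    called:        dict mapping (chrom, pos) -> genotype string from the VCF.
    reference:     dict mapping (chrom, pos) -> reference base.
    """
    filled = {}
    for locus in required_loci:
        if locus in called:
            # Locus was genotyped in the uploaded sample: keep it.
            filled[locus] = called[locus]
        else:
            # Assumption: a missing locus is homozygous reference.
            base = reference[locus]
            filled[locus] = base + base
    return filled


required = [("Chr01", 100), ("Chr01", 250), ("Chr02", 75)]
called = {("Chr01", 100): "GG", ("Chr02", 75): "AT"}
reference = {("Chr01", 100): "A", ("Chr01", 250): "C", ("Chr02", 75): "A"}

result = fill_missing_loci(required, called, reference)
print(result[("Chr01", 250)])  # CC
```

Keeping the required loci in a fixed order means the filled genotypes can later be encoded into a feature vector of constant length, which a regression model such as SVR requires.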

Note:

1. Currently, only single-sample phenotype prediction is supported. The input file must be a VCF file for a single soybean variety.

2. The VCF file can be converted to the required format with tools such as bcftools. Please refer to the provided example file for the correct format. (Reference bcftools command, where your_vcf_file.vcf is the input and single_sample_ID.vcf is the output: bcftools view -c1 -s single_sample_ID -O v -o single_sample_ID.vcf your_vcf_file.vcf)
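Before uploading, it may help to confirm that a file really contains one sample. The sketch below relies only on the standard VCF layout, in which the #CHROM header line has nine fixed tab-separated columns (CHROM through FORMAT) followed by one column per sample. The function names and the example record are illustrative and are not part of the platform.

```python
import io

FIXED_COLUMNS = 9  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT


def sample_names(vcf_lines):
    """Return the sample IDs listed on the #CHROM header line."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            fields = line.rstrip("\n").split("\t")
            return fields[FIXED_COLUMNS:]
        if not line.startswith("#"):
            break  # reached data records without seeing the header line
    raise ValueError("no #CHROM header line found")


def is_single_sample(vcf_lines):
    """True if the VCF declares exactly one sample column."""
    return len(sample_names(vcf_lines)) == 1


example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsingle_sample_ID\n"
    "Chr01\t1234\t.\tA\tG\t.\tPASS\t.\tGT\t1/1\n"
)
print(is_single_sample(example))  # True
```

For a file on disk, the same check can be run by passing an open text file handle, for example is_single_sample(open("single_sample_ID.vcf")).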

Prediction Result:

Oil Content Prediction: 91.88%
Protein Content Prediction: Unrecognized
Water-soluble Protein Content: 91.88%