TITANiAN

T-cell mediated immunogenicity and molecular binding prediction using multi-domain adaptation


Method for TITANiAN

TITANiAN is a technique designed to predict the immunogenicity of peptides or peptide-MHC pairs. The method first learns from immunogenicity-related data gathered from various sources using adversarial domain adaptation, and then fine-tunes the model on the immunogenicity ELISPOT dataset.

Inputs for TITANiAN

To begin, you must create a CSV file containing your input data. The required fields vary depending on the specific action category. All column names are case-sensitive.
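The column requirements listed for each category in this section can be checked before submission. The following is a minimal sketch, not part of TITANiAN itself: the dictionary keys and the function name missing_columns are hypothetical labels chosen for illustration, while the column names come from this document. The comparison is deliberately case-sensitive.

```python
import csv
import io

# Required columns per action category, as listed in this document.
# The dictionary keys are hypothetical labels chosen for this sketch.
REQUIRED_COLUMNS = {
    "pmhc_immunogenicity": ["Allele", "peptide"],
    "ada_level": ["peptide"],
    "pmhc_binding_class1": ["Allele", "peptide"],
    "pmhc_binding_class2": ["Allele", "peptide"],
    "tcr_pmhc_binding": ["CDR3b", "peptide"],
}


def missing_columns(csv_file, category):
    """Return required column names absent from the CSV header (case-sensitive)."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [col for col in REQUIRED_COLUMNS[category] if col not in header]


# A lowercase "allele" header does not satisfy the case-sensitive "Allele" requirement.
# The allele value and peptide below are placeholders for illustration only.
example = io.StringIO("allele,peptide\nALLELE_1,GILGFVFTL\n")
print(missing_columns(example, "pmhc_binding_class1"))  # ['Allele']
```

An empty list means every required column for the chosen category is present in the header.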

p-MHC immunogenicity prediction for immunotherapy development

For this category, the CSV file must include Allele and peptide columns.

Allele is the allele type of each MHC molecule. Allele names should be chosen from the allele column of MHC_classI_pseudo.csv. Other allele names can be used, but the accuracy of the results cannot be guaranteed.
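Because unlisted allele names are accepted but not guaranteed to give accurate results, it can help to flag them before running a prediction. The sketch below assumes only what is stated above, namely that MHC_classI_pseudo.csv has a column named allele. The second column of the stand-in file, the allele names, and both function names are hypothetical placeholders.

```python
import csv
import io


def load_supported_alleles(pseudo_csv_file):
    """Collect the values of the 'allele' column from an open pseudo-sequence CSV."""
    return {row["allele"] for row in csv.DictReader(pseudo_csv_file)}


def unsupported_alleles(input_csv_file, supported):
    """Return alleles from the input's 'Allele' column that are not in `supported`."""
    unknown = []
    for row in csv.DictReader(input_csv_file):
        allele = row["Allele"]
        if allele not in supported and allele not in unknown:
            unknown.append(allele)
    return unknown


# In-memory stand-ins; in practice, open MHC_classI_pseudo.csv and your input CSV.
# The allele names and the 'sequence' column are placeholders, not real file contents.
pseudo = io.StringIO("allele,sequence\nALLELE_1,AAAA\nALLELE_2,CCCC\n")
user_input = io.StringIO("Allele,peptide\nALLELE_1,GILGFVFTL\nALLELE_9,GILGFVFTL\n")
print(unsupported_alleles(user_input, load_supported_alleles(pseudo)))  # ['ALLELE_9']
```

The same check applies to class II inputs by reading MHC_classII_pseudo.csv instead.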

peptide is the peptide whose immunogenicity you want to check. Each character in the peptide column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports peptides of up to 20 amino acids, with optimal performance observed for 9-mer peptides.
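The two peptide constraints above (standard single-letter codes only, at most 20 residues) are easy to verify locally before upload. This is an illustrative sketch with a hypothetical function name. It assumes uppercase letters are required, which this document does not state explicitly.

```python
# Single-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MAX_PEPTIDE_LENGTH = 20  # maximum length stated in this document


def peptide_problems(peptide):
    """Return a list of problems found; an empty list means the peptide passes."""
    problems = []
    if not peptide:
        problems.append("empty sequence")
    if len(peptide) > MAX_PEPTIDE_LENGTH:
        problems.append(f"length {len(peptide)} exceeds {MAX_PEPTIDE_LENGTH}")
    # Assumption: lowercase letters are treated as invalid in this sketch.
    invalid = sorted(set(peptide) - STANDARD_AMINO_ACIDS)
    if invalid:
        problems.append("non-standard characters: " + "".join(invalid))
    return problems


print(peptide_problems("GILGFVFTL"))  # []
print(peptide_problems("GILGFVXTL"))  # ['non-standard characters: X']
```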

ADA level prediction of protein drug sequences

For this category, the CSV file must include a peptide column.

peptide is the peptide whose immunogenicity you want to check. Each character in the peptide column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports peptides of up to 20 amino acids, with optimal performance observed for 9-mer peptides. If your protein sequence is longer than 9 amino acids, we recommend splitting it into overlapping 9-mer peptides with a sliding window.
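The sliding-window recommendation can be sketched as follows. The step size of one residue is an assumption, since this document does not specify it, and the function names and the example sequence are illustrative only.

```python
import csv
import io


def sliding_windows(sequence, size=9, step=1):
    """Return all windows of `size` residues, moving `step` residues at a time.

    Sequences shorter than `size` yield an empty list.
    """
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, step)]


def windows_to_csv(sequence, size=9, step=1):
    """Build CSV text with a single 'peptide' column, one window per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["peptide"])
    for window in sliding_windows(sequence, size, step):
        writer.writerow([window])
    return buffer.getvalue()


# An 11-residue toy sequence produces three overlapping 9-mers.
print(sliding_windows("ACDEFGHIKLM"))  # ['ACDEFGHIK', 'CDEFGHIKL', 'DEFGHIKLM']
print(windows_to_csv("ACDEFGHIKLM"))
```

A sequence of length n gives n - 8 windows at a step of one, so the number of rows grows linearly with protein length.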

p-MHC binding classification (class I)

For this category, the CSV file must include Allele and peptide columns.

Allele is the allele type of each MHC molecule. Allele names should be chosen from the allele column of MHC_classI_pseudo.csv. Other allele names can be used, but the accuracy of the results cannot be guaranteed.

peptide is the peptide you want to check. Each character in the peptide column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports peptides of up to 20 amino acids, with optimal performance observed for 9-mer peptides.

p-MHC binding classification (class II)

For this category, the CSV file must include Allele and peptide columns.

Allele is the allele type of each MHC molecule. Allele names should be chosen from the allele column of MHC_classII_pseudo.csv. Other allele names can be used, but the accuracy of the results cannot be guaranteed.

peptide is the peptide you want to check. Each character in the peptide column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports peptides of up to 20 amino acids, with optimal performance observed for 9-mer peptides.

TCR-p-MHC binding classification

For this category, the CSV file must include CDR3b and peptide columns.

CDR3b is the CDR3β sequence of the TCR. Each character in the CDR3b column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports sequences of up to 25 amino acids.

peptide is the peptide you want to check. Each character in the peptide column must be the single-letter code of one of the 20 standard amino acids; any other character will result in an error. The model supports peptides of up to 20 amino acids, with optimal performance observed for 9-mer peptides.
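For TCR-p-MHC inputs the two columns have different length limits (25 for CDR3b, 20 for peptide), so a per-row check that reports the offending cell can save a failed submission. This is a sketch under the limits stated above. The function name is hypothetical, uppercase letters are assumed to be required, and the example sequences are illustrative only.

```python
import csv
import io

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
# Maximum lengths stated in this document for each column.
MAX_LENGTH = {"CDR3b": 25, "peptide": 20}


def invalid_cells(csv_file):
    """Return (line_number, column, value) for each cell that breaks the stated limits."""
    bad = []
    # Line 1 is the header, so data rows start at line 2
    # (assuming one record per line).
    for line_number, row in enumerate(csv.DictReader(csv_file), start=2):
        for column, limit in MAX_LENGTH.items():
            value = row[column]
            if not value or len(value) > limit or set(value) - STANDARD_AMINO_ACIDS:
                bad.append((line_number, column, value))
    return bad


example = io.StringIO(
    "CDR3b,peptide\n"
    "CASSIRSSYEQYF,GILGFVFTL\n"
    "CASSB,GILGFVFTL\n"  # 'B' is not one of the 20 standard codes
)
print(invalid_cells(example))  # [(3, 'CDR3b', 'CASSB')]
```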

Outputs for TITANiAN

For each action, the output is a CSV file with an additional Score column, which holds the predicted probability for that action. The result CSV file can be downloaded as an archive.

p-MHC Immunogenicity Prediction for Immunotherapy Development

The score represents the immunogenicity probability of the input peptide-MHC pair. This score can be used to rank the immunogenicity of peptide-MHC pairs or to determine if a pair is immunogenic by checking if the score exceeds 0.5. Both applications show excellent performance according to our manuscript benchmarks.
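Both uses of the score, ranking and thresholding at 0.5, can be applied to the result file in a few lines. The sketch below assumes the result CSV keeps the input columns and adds Score, as described above. The function name, the added immunogenic key, the allele placeholder, and the score values are invented for illustration and are not model output.

```python
import csv
import io


def rank_and_label(result_csv_file, threshold=0.5):
    """Sort result rows by Score (highest first) and flag scores above the threshold."""
    rows = list(csv.DictReader(result_csv_file))
    for row in rows:
        row["Score"] = float(row["Score"])
        # "Exceeds 0.5" is read here as a strict comparison.
        row["immunogenic"] = row["Score"] > threshold
    return sorted(rows, key=lambda r: r["Score"], reverse=True)


# Made-up scores, for illustration only.
results = io.StringIO(
    "Allele,peptide,Score\n"
    "ALLELE_1,AAAAAAAAA,0.20\n"
    "ALLELE_1,CCCCCCCCC,0.90\n"
    "ALLELE_1,DDDDDDDDD,0.55\n"
)
for row in rank_and_label(results):
    print(row["peptide"], row["Score"], row["immunogenic"])
```

The same pattern applies to the binding classification outputs, where the flag would indicate predicted binding rather than immunogenicity.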

ADA Level Prediction of Protein Drug Sequences

The score indicates the immunogenicity probability of the input peptide sequence. We recommend using this score to identify immunogenic parts of the sequence. If the score exceeds 0.5, the peptide is considered immunogenic. If over 20% of the peptides derived from a sequence are immunogenic, the entire sequence can be classified as immunogenic. These thresholds are recommended based on our benchmarks.
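The two-level rule above (per-peptide threshold of 0.5, then a 20% fraction over the whole sequence) can be written out as a short calculation. This is a sketch of the stated rule only. The function name is hypothetical, both comparisons are read as strict ("exceeds", "over"), and the scores are made up.

```python
def classify_sequence(scores, peptide_threshold=0.5, fraction_threshold=0.20):
    """Return (fraction of immunogenic peptides, whether the sequence is called immunogenic)."""
    if not scores:
        raise ValueError("at least one peptide score is required")
    n_immunogenic = sum(score > peptide_threshold for score in scores)
    fraction = n_immunogenic / len(scores)
    return fraction, fraction > fraction_threshold


# Made-up scores for five peptides derived from one protein sequence.
# Two of five exceed 0.5, so the fraction is 0.4 and the sequence is flagged.
print(classify_sequence([0.9, 0.1, 0.2, 0.7, 0.3]))  # (0.4, True)
```

With strict comparisons, a sequence in which exactly 20% of the peptides score above 0.5 is not classified as immunogenic.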

p-MHC Binding Classification (Class I)

The score represents the binding probability of the input peptide-MHC class I pair. This score can be used to rank binding probabilities of peptide-MHC pairs or to determine if a pair is binding by checking if the score exceeds 0.5. Both applications show excellent performance according to our manuscript benchmarks.

p-MHC Binding Classification (Class II)

The score represents the binding probability of the input peptide-MHC class II pair. This score can be used to rank binding probabilities of peptide-MHC pairs or to determine if a pair is binding by checking if the score exceeds 0.5. Both applications show excellent performance according to our manuscript benchmarks.

TCR-p-MHC Binding Classification

The score represents the binding probability of the input TCR CDR3β-peptide pair. This score can be used to rank the binding probabilities of TCR CDR3β-peptide pairs or to determine whether a pair binds by checking if the score exceeds 0.5.