mutation_bench

This folder contains the scripts required to infer the probabilities of interaction between mutant proteins and their wild-type counterparts.

Embedding Mutated Proteins

Pre-computed embeddings for the mutated proteins are provided in the data folder. To recompute them, run the embed command of bench.py:

python bench.py embed MODEL_NAME <flags>
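If you would rather launch the command from a Python script, a minimal sketch could look like the one below. The helper name build_embed_command and the output filename are hypothetical; only the embed subcommand, the positional MODEL_NAME, and the flag names come from this README. The sketch also assumes the flags accept the "--flag value" form.

```python
import sys


def build_embed_command(model_name, batch_size=None, muts_path=None, output_path=None):
    """Assemble the argument list for `python bench.py embed` (hypothetical helper)."""
    cmd = [sys.executable, "bench.py", "embed", model_name]
    # Only pass flags that were set explicitly, so bench.py's own defaults apply otherwise.
    for flag, value in (
        ("--batch_size", batch_size),
        ("--muts_path", muts_path),
        ("--output_path", output_path),
    ):
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # "MODEL_NAME" is a placeholder; "embeddings.out" is a made-up output path.
    cmd = build_embed_command("MODEL_NAME", batch_size=5, output_path="embeddings.out")
    print(" ".join(cmd))
    # To actually run it from the folder that contains bench.py:
    # import subprocess; subprocess.run(cmd, check=True)
```

Leaving a flag unset keeps it off the command line, so the defaults listed under Flags stay in effect.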

Arguments

Positional Arguments

| Argument | Description |
| --- | --- |
| MODEL_NAME | Name of the LLM to use to embed the proteins. |

Flags

| Long Flag | Default | Description |
| --- | --- | --- |
| --batch_size | 5 | The size of the batches used to embed the proteins. |
| --muts_path | "../../data/mutation/elaspic-trainin-set-interface-ids.csv" | Data on the binding affinity of mutated and wild-type protein pairs from ELASPIC2. |
| --output_path | None | Where to output the embedding database. |
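To make the effect of --batch_size concrete, here is a minimal sketch of splitting a list of sequences into batches. This README does not show how bench.py batches its inputs, so the function name iter_batches and the slicing approach are assumptions for illustration, and the sequences are toy values.

```python
def iter_batches(items, batch_size=5):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy sequences for illustration only; not taken from the ELASPIC2 data.
sequences = ["MKT", "MKV", "MRT", "MKA", "MKL", "MKS", "MKG"]
for batch in iter_batches(sequences, batch_size=5):
    print(len(batch), batch)
```

With seven sequences and the default batch size of 5, the final batch holds only the two leftover sequences. Larger batches typically mean fewer model calls at the cost of more memory per call.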

Requirements

The requirements for this module can be installed from the requirements.txt file in the experiments/mutation_bench folder, for example with pip install -r requirements.txt.