Kevlishvili Group

Projects

Our group develops computational tools, datasets, and models for inorganic and organometallic chemistry. Many of these projects are in active development; check back for updates, code releases, and new datasets.

T-REX: Canonical Representations for Transition-Metal Complexes
String-based encoding of metal topology, geometry, oxidation, and spin

T-REX is a canonical, invertible string representation for transition-metal complexes that explicitly encodes metal identity, coordination topology, geometry, oxidation state, and spin. Its extension, MUL-T-REX, generalizes to multinuclear clusters by capturing metal-metal connectivity and bridging topology at each site. These representations make inorganic chemical space searchable, reproducible, and model-ready.
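To illustrate what "canonical" and "invertible" mean for a string representation, here is a minimal toy sketch. It is not the T-REX syntax: the `ToyComplex` fields, the `|` and `.` separators, and the `encode`/`decode` helpers are all hypothetical, and the toy ignores coordination topology entirely. It only shows the two properties: equivalent inputs map to one string, and the string can be parsed back without loss.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyComplex:
    """Hypothetical stand-in for a mononuclear complex (not the T-REX data model)."""
    metal: str
    oxidation_state: int
    spin_multiplicity: int
    geometry: str
    ligands: tuple  # ligand labels, in any order


def encode(c: ToyComplex) -> str:
    # Sorting the ligand labels makes the string canonical: the same
    # complex yields the same string regardless of input ordering.
    ligands = ".".join(sorted(c.ligands))
    return f"{c.metal}|{c.oxidation_state:+d}|{c.spin_multiplicity}|{c.geometry}|{ligands}"


def decode(s: str) -> ToyComplex:
    # Inverse of encode for canonical strings. Assumes ligand labels
    # contain neither "|" nor ".".
    metal, ox, spin, geometry, ligands = s.split("|")
    return ToyComplex(metal, int(ox), int(spin), geometry, tuple(ligands.split(".")))


# Two orderings of the same hypothetical complex.
a = ToyComplex("Fe", 2, 5, "octahedral", ("NH3", "Cl", "NH3", "Cl", "NH3", "NH3"))
b = ToyComplex("Fe", 2, 5, "octahedral", ("Cl", "Cl", "NH3", "NH3", "NH3", "NH3"))

canonical = encode(a)
round_trip = encode(decode(canonical))
```

In this toy, `encode(a)` and `encode(b)` agree because ligand order is normalized before serialization, and `encode(decode(s))` returns `s` for any string produced by `encode`. A real representation must also canonicalize connectivity, which is the hard part this sketch leaves out.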

CAT-FM: A Foundation Model for Transition-Metal Catalysis
LLM-type transformers that learn catalytic cycle logic from T-REX sequences

CAT-FM is a catalysis-focused foundation model that treats catalytic cycles as grammatical sequences of T-REX tokens, learning the causal logic of chemical transformations. It is trained on two data streams: high-fidelity, manually curated datasets and automated literature extraction. CAT-FM supports downstream tasks including yield and selectivity prediction, mechanistic reasoning, and catalyst candidate generation.

Mechanophore Discovery via Transfer Learning
From deformation physics to reactivity prediction to scaffold discovery

We use a multi-stage transfer learning pipeline for mechanophore discovery. First, models learn per-bond and per-atom deformation physics from inexpensive DFT calculations. These pretrained models are then transferred to predict mechanophore kinetics and reactivity under mechanical force. Finally, the trained models are deployed for high-throughput discovery of new mechanophore scaffolds with targeted properties.
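The three-stage pattern (pretrain on cheap labels, transfer to a scarce target, then screen) can be sketched generically as follows. This is a toy illustration under strong assumptions, not the group's pipeline: the data are synthetic random numbers, the models are plain linear fits, and every name (`fit_weights`, `fit_head`, `TRUE_W`, `sample`) is hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_weights(X, y, lr=0.05, epochs=300):
    """Full-batch gradient descent for a linear model y ~ w.x (no bias)."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):
            err = dot(w, x) - t
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def fit_head(z, y):
    """Closed-form one-dimensional least squares: y ~ a*z + b."""
    n = len(z)
    mz, my = sum(z) / n, sum(y) / n
    a = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / sum((zi - mz) ** 2 for zi in z)
    return a, my - a * mz


rng = random.Random(0)


def sample(n):
    # Synthetic 3-dimensional descriptors; stand-ins for real features.
    return [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]


TRUE_W = [0.8, -0.5, 0.3]  # synthetic ground truth, assumed for the demo

# Stage 1: pretrain on plentiful, cheap labels (stand-in for deformation data).
X_cheap = sample(200)
y_cheap = [dot(TRUE_W, x) for x in X_cheap]
w_pre = fit_weights(X_cheap, y_cheap)

# Stage 2: freeze w_pre and fit only a small head on scarce target labels
# (stand-in for kinetics data that depend on the pretrained feature).
X_small = sample(10)
y_small = [2.0 * dot(TRUE_W, x) + 1.0 for x in X_small]
a, b = fit_head([dot(w_pre, x) for x in X_small], y_small)

# Stage 3: score unlabeled candidates and keep the top-ranked ones.
candidates = sample(50)
scores = [a * dot(w_pre, x) + b for x in candidates]
top5 = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)[:5]
```

The point of the design is that stage 2 fits only two parameters, so ten labeled examples suffice, because the expensive-to-learn feature was already learned from the cheap data in stage 1.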

More

T-REX Cheminformatics and Property Prediction
ML models for photochemistry, electrochemistry, catalysis, and pharma

Building on the T-REX representation, we train property prediction and virtual screening models for transition-metal complexes across diverse application domains. The training data come from the tmQMg dataset and from our curated subsets (tmCAT, tmPHOTO, tmBIO, and tmSCO), which are built using natural language processing of the primary literature.

Synthesizability Scoring for Inorganic Complexes
Assembly scores conditioned on metal identity and oxidation state

We are developing synthesizability models for inorganic complexes built on T-REX that predict synthetic accessibility as a function of metal identity, oxidation state, and ligand architecture. These assembly-score-style models help prioritize computationally designed candidates that are likely to be experimentally realizable.