John Mitchell Group: Rachael Skyner

Hydrate Crystal Structures, Pair Distribution Functions & Computing Solubility



Rachael Skyner

Research

Originally from a ‘crystallographic background’, I decided to pursue a PhD in computational chemistry after a number of experiences at undergraduate level developed my interest in applying computation to my skills in crystallography. The main focus of my work is data mining of the Cambridge Structural Database (CSD) in order to investigate structure-property relationships, primarily relating to solubility and its prediction. My work can be roughly divided into three projects:

1. Probing the average distribution of water in organic hydrate crystal structures with cumulative radial distribution functions (c-RDFs)

The abundance of hydrates within the CSD reflects the role of the solvent in the recrystallization process. An understanding of this role is therefore of paramount importance for crystal engineering, as solvent choice often affects polymorphism and the solvent itself can be incorporated into the crystal structure. The CSD is an invaluable primary tool for analysing the relationship between solvent choice and recrystallization results, as it contains virtually all published small-molecule crystal structure data. A number of approaches have previously been considered in the analysis of large datasets of organic hydrates, often facilitated by programs developed by CCDC to complement the CSD.

Accurate radial distribution functions (RDFs) of water are highly desirable in order to understand water and aqueous systems at an atomic level. The experimental acquisition of such functions is of great importance, as they can be used as a point of comparison for computational water models.

In this work we attempt to develop a method for applying empirical data, namely organic hydrate crystal structures, to the understanding of the distribution and energetics of systems in water. We initially present a model that combines the distribution functions of multiple atom pairs from a number of crystal structures. From this, we can comment qualitatively on the average distribution of water in organic hydrates.
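The idea of combining distance data from several crystal structures into one cumulative distribution can be illustrated with a small sketch. It assumes the atom-pair distances have already been extracted from each structure; the function name `cumulative_rdf`, the bin width, and the simple averaging over structures are illustrative assumptions rather than the model used in this work.

```python
from bisect import bisect_right


def cumulative_rdf(distance_sets, r_max=6.0, dr=0.5):
    """Average running count of atom-pair distances within r.

    distance_sets: one list of distances (in angstroms) per crystal
    structure. This is hypothetical, pre-extracted input.
    Returns (radii, averages), where averages[i] is the mean number of
    distances <= radii[i] per structure.
    """
    if not distance_sets:
        raise ValueError("need at least one structure")
    n_bins = int(round(r_max / dr))
    radii = [dr * (i + 1) for i in range(n_bins)]
    totals = [0] * n_bins
    for distances in distance_sets:
        ordered = sorted(distances)
        for i, r in enumerate(radii):
            # bisect_right gives the number of distances <= r
            totals[i] += bisect_right(ordered, r)
    n_structures = len(distance_sets)
    return radii, [t / n_structures for t in totals]
```

Averaging per structure is only one possible normalisation; a per-atom or density-weighted normalisation would change the absolute values but not the cumulative character of the curve.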

2. Molecular / structural effects on solubility of drug-like crystalline compounds

Our current project focuses on establishing whether structural changes to molecules (either atomic or multi-atomic/functional group) can induce changes in aqueous solubility that can be measured as a function of those changes. To investigate this, we have searched the most recent version of the CSD (version 5.36, 2015) for single-component drug-like structures (no Lipinski violations) with available melting point and aqueous solubility data. This has been facilitated through CCDC’s new Python API, which allows the searching and manipulation tasks previously available in the CCDC GUI software suite to be performed with Python scripts.
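The selection criteria described above can be sketched in plain Python. This is a minimal illustration that does not use the CCDC API: it assumes each candidate has already been reduced to a dictionary of precomputed descriptors, and the field names (`identifier`, `mol_weight`, `logp`, `h_donors`, `h_acceptors`, `melting_point`, `solubility`) are hypothetical. The thresholds are the standard rule-of-five limits.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count rule-of-five violations from precomputed descriptors."""
    return sum([
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water log P above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])


def select_candidates(records):
    """Return identifiers of records with no Lipinski violations and
    with both melting point and solubility values present."""
    kept = []
    for rec in records:
        if rec.get("melting_point") is None or rec.get("solubility") is None:
            continue
        n_violations = lipinski_violations(
            rec["mol_weight"], rec["logp"],
            rec["h_donors"], rec["h_acceptors"],
        )
        if n_violations == 0:
            kept.append(rec["identifier"])
    return kept
```

In practice the descriptors and property values would come from the database search itself; the sketch only shows the filtering logic.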

In order to identify the effects of structural changes on solubility, the dataset we have produced has been searched for Matched Molecular Pairs (MMPs) with the Python scripts included in RDKit. An MMP is a pair of compounds that differ only by a localised structural change. The MMP algorithm employed for our work generates SMIRKS strings describing the structural change involved in transforming one compound into the other.
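To make the MMP idea concrete, the following toy sketch pairs compounds that share a common core but carry different substituents, and writes each change as a SMIRKS-like `left>>right` string. It is not the RDKit script itself: it assumes each compound has already been fragmented into (core, substituent) SMILES pairs, and the input format and the function name `find_matched_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_matched_pairs(fragmentations):
    """Pair compounds that share a core but differ in one substituent.

    fragmentations: dict mapping a compound identifier to a list of
    (core, substituent) strings from single-cut fragmentation
    (hypothetical, precomputed input).
    Returns a list of (id_a, id_b, transformation) tuples.
    """
    # Index every fragmentation by its core so shared cores group together
    by_core = defaultdict(list)
    for compound_id, frags in fragmentations.items():
        for core, substituent in frags:
            by_core[core].append((compound_id, substituent))

    pairs = []
    for members in by_core.values():
        for (id_a, sub_a), (id_b, sub_b) in combinations(members, 2):
            if id_a != id_b and sub_a != sub_b:
                pairs.append((id_a, id_b, f"{sub_a}>>{sub_b}"))
    return pairs
```

For example, two compounds fragmented to the same core `[*:1]c1ccccc1` with substituents `[*:1]O` and `[*:1]N` would be reported as a pair with the transformation `[*:1]O>>[*:1]N`.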

3. The effects of polymorphism of drug-like compounds on solubility

As solubility is a thermodynamic quantity, it is inherently affected by factors such as temperature and pressure, as well as ionisation, solid-state effects, and, for dissolved gases, partial pressure. Intermolecular interaction strengths play an important role in the dissolution of substances from the solid state. Solutes which exhibit weak intermolecular forces tend to have a higher solubility, as the energy cost of breaking up the lattice is lower. Polymorphic effects can also lead to complications in solubility prediction.

Complete polymorph screening and prediction is still beyond our capabilities, which hampers our ability to predict solubility purely from first principles. However, as solubility and dissolution rates are related to drug absorption, there is significant interest within the pharmaceutical industry in how polymorphism can affect solubility.

In order to investigate the effects of polymorphism and/or isomerism on solubility systematically, we suggest that a complementary study involving both computation and experiment be conducted. First, drug-like compounds for which crystal structures are known will be identified. Next, a full investigation of the crystal energy landscape of existing stable and metastable forms of selected structures will be performed in order to identify the most stable polymorphs, and to probe the likely existence of other polymorphs easily obtainable within the phase space. Alongside this, an experimental investigation will attempt to reproduce any known structures in order to obtain solubility data which does not currently exist, together with both spectral and thermal analysis. Any additional polymorphs identified will also be fully characterised where possible. Thermal analysis will primarily serve as a tool to determine whether the relationship between identified polymorphs is enantiotropic or monotropic, and to establish the transition points for enantiotropic systems, allowing a consideration of reversible polymorphism.

From the results of both computation and experiment, an analysis of how the thermodynamic behaviour of the polymorphs (from computation) is related to their experimentally observed physicochemical properties will be attempted. This may be considered in terms of first-principles solubility theory, complemented by a cheminformatics approach.
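One simple link between computed polymorph thermodynamics and measured solubility is the textbook relation between the free energy difference of two forms and the ratio of their solubilities. The sketch below assumes ideal dilute solution behaviour, so that the ratio depends only on the free energy difference and the temperature; the function name `solubility_ratio` is illustrative and this is not presented as the analysis used in the project.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def solubility_ratio(delta_g_kj_mol, temperature_k=298.15):
    """Estimate S_metastable / S_stable for two polymorphs.

    delta_g_kj_mol: G(metastable) - G(stable) in kJ/mol.
    Assumes an ideal dilute solution, so that
    delta_G = R * T * ln(S_metastable / S_stable).
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return math.exp(delta_g_kj_mol * 1000.0 / (R * temperature_k))
```

Under these assumptions, a free energy gap of 2 kJ/mol at 298.15 K corresponds to the metastable form being a little more than twice as soluble as the stable form, which illustrates why even small lattice energy differences matter for solubility prediction.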

Publications

Sponsorship

My projects are sponsored by CCDC, who provide access to the Cambridge Structural Database.