Abstract
In this study, we explore the molecular properties that govern solution-mediated crystallization in supersaturated aqueous drug solutions and contrast them with the properties that govern crystallization in the solid state. We used a literature dataset of 54 structurally diverse compounds with reported crystallization kinetics both from supersaturated aqueous solutions and in melt-quenched solids to identify the molecular drivers of crystallization kinetics in solution and to compare them with those observed for solids. The compounds were divided into fast, moderate and slow crystallizers, and an in silico classification model was developed using a K-nearest neighbor (KNN) algorithm based on molecular descriptors. The topological equivalent of Grav3 (T_Grav3; related to molecular size and shape) was identified as the most important molecular descriptor for solution crystallization kinetics; the larger this descriptor, the slower the crystallization. Two electrotopological descriptors (the Atom-type E-state index for -Caa groups and the sum of absolute values of pi Fukui(+) indices on C) separated the moderate from the slow crystallizers in solution; the larger these descriptors, the slower the crystallization. With these three descriptors, the computational model classified the crystallization tendencies from solution with an overall accuracy of 77% (test set).
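As an illustration of the classification approach described above, the following is a minimal sketch of a three-descriptor KNN classifier. It is not the model used in the study: the descriptor values are invented for illustration, and the choice of k = 3, z-score scaling, Euclidean distance and majority voting are all assumptions, since the abstract does not specify them. The helper names (`column_stats`, `scale`, `knn_predict`, `classify`, `accuracy`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical descriptor values, invented for illustration only.
# Column order: T_Grav3, Atom-type E-state index for -Caa groups,
# sum of absolute values of pi Fukui(+) indices on C.
TRAIN = [
    ([10.0, 0.5, 0.10], "fast"),
    ([11.0, 0.6, 0.12], "fast"),
    ([12.0, 0.4, 0.11], "fast"),
    ([20.0, 1.0, 0.30], "moderate"),
    ([21.0, 1.2, 0.32], "moderate"),
    ([22.0, 1.1, 0.28], "moderate"),
    ([30.0, 2.5, 0.60], "slow"),
    ([31.0, 2.7, 0.65], "slow"),
    ([32.0, 2.6, 0.62], "slow"),
]


def column_stats(rows):
    """Return (mean, standard deviation) for each descriptor column."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        stats.append((mean, std if std > 0 else 1.0))
    return stats


def scale(row, stats):
    """Z-score one descriptor vector (assumed preprocessing step)."""
    return [(v - mean) / std for v, (mean, std) in zip(row, stats)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Scaling statistics come from the training set only.
STATS = column_stats([x for x, _ in TRAIN])
TRAIN_X = [scale(x, STATS) for x, _ in TRAIN]
TRAIN_Y = [y for _, y in TRAIN]


def classify(descriptors, k=3):
    """Classify one compound as a fast, moderate or slow crystallizer."""
    return knn_predict(TRAIN_X, TRAIN_Y, scale(descriptors, STATS), k)


def accuracy(predicted, actual):
    """Fraction of compounds assigned to the correct class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical held-out compounds, again invented for illustration.
test_x = [[10.5, 0.5, 0.11], [21.0, 1.1, 0.30], [31.0, 2.6, 0.60]]
test_y = ["fast", "moderate", "slow"]
preds = [classify(x) for x in test_x]
print(preds, accuracy(preds, test_y))
```

The descriptors are scaled before the distance calculation because they sit on very different numeric ranges; without scaling, the descriptor with the largest range would dominate the neighbor search. The toy classes here are cleanly separated, so the sketch says nothing about the 77% test-set accuracy reported for the real data.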