Data sets in chemistry, physics and materials science often contain a lot of “second-hand” information in the form of lists of features compiled to characterise the raw data in various ways. Post-processing raw data in this way introduces uncertainty, systematic evaluation bias (since not every aspect of the data can be captured), and information bias (since different researchers may use different methods to characterise the same set). An alternative approach is to convert each original raw data point into a unique image, or “fingerprint”, that captures all of the information simultaneously and opens up the possibility of using image processing methods and deep learning on otherwise unsuitable sets. One way of doing this is with a type of neural network known as a Kohonen network, which transforms raw numerical data into a self-organising map. In this project you will investigate how to use Kohonen networks consistently to produce reliable and reproducible results, using a molecular or materials data set. You will then use image processing methods to confirm that each image is unique. A machine learning model will then be trained on the images to demonstrate their utility. Data sets will be provided.
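
To make the idea concrete, the following is a minimal sketch of a Kohonen network written with NumPy. It is illustrative only and not the method prescribed for the project: the function names `train_som` and `fingerprint`, the grid size, the linear decay schedules, and the choice of a node-distance map as the “image” of a data point are all assumptions, and the random array stands in for a real data set. The fixed random seed shows one simple way of making the resulting map reproducible.

```python
import numpy as np


def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small Kohonen map (hypothetical helper).

    Returns node weights with shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)  # fixed seed: same data gives the same map
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(len(data)):
            x = data[idx]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = sigma0 * (1.0 - frac) + 1e-3     # neighbourhood radius shrinks
            # Best matching unit: the node whose weights are closest to x.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood around the best matching unit on the grid.
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            # Pull each node towards x, weighted by its neighbourhood value.
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def fingerprint(weights, x):
    """Distance from x to every node, as a 2D array (one possible image of x)."""
    return np.linalg.norm(weights - x, axis=-1)


# Placeholder data: 50 samples with 5 numerical features, scaled to [0, 1).
data = np.random.default_rng(1).random((50, 5))
weights = train_som(data)
image_a = fingerprint(weights, data[0])
image_b = fingerprint(weights, data[1])
```

Here `image_a` and `image_b` are 8 × 8 arrays that could be saved or plotted as images. Whether such distance maps are unique across a full data set, and how sensitive they are to the seed, grid size and decay schedule, are exactly the kinds of questions the project sets out to investigate.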