Fig. 3 | Evolution: Education and Outreach

Fig. 3

From: Virtual reality in biology: could we become virtual naturalists?

Supervised and unsupervised machine learning. a Schematic representation of an unsupervised learning model. Unlabelled data are used in unsupervised learning algorithms for clustering. b Schematic representation of a supervised learning model. Labelled data are used in supervised learning algorithms for classification. Machine learning models can be broadly classified as supervised or unsupervised learning algorithms, depending on the structure of the data (Mitchell et al. 2013) [Note: there are intermediate cases, called semi-supervised learning, which we will not consider here; see, e.g., Zhu and Goldberg 2009 for details]. Unsupervised learning algorithms use data in which the outcome is not yet labelled or identified, so the algorithm cannot ‘know’ the outcomes in advance. The algorithm then learns how to classify new observations and predict their outcomes based on the inherent structure of the data at hand. An example of unsupervised learning is the clustering of groups within a dataset. Conversely, supervised learning algorithms use data in which the outcome is known, and the algorithm learns how to predict the outcome of future observations from the information and outcomes obtained from previous data. An example of supervised learning is the classification (or prediction, in the case of regression models) of a new observation into one of two categories based on n characteristics or variables.
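The contrast described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the article: k-means stands in for the unsupervised clustering of panel a, and a nearest-centroid classifier stands in for the supervised classification of panel b. Both algorithm choices, the function names, and the six two-variable observations with their labels "A" and "B" are assumptions made purely for the example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Unsupervised: group unlabelled points into k clusters (plain k-means)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen observations
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    # Return one cluster id per observation; the ids themselves carry no meaning.
    return [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]


def fit_nearest_centroid(points, labels):
    """Supervised: learn one mean point per known label."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    return {y: tuple(sum(c) / len(g) for c in zip(*g)) for y, g in groups.items()}


def predict(model, point):
    """Assign a new observation to the label with the closest learned mean."""
    return min(model, key=lambda y: math.dist(point, model[y]))


# Hypothetical observations, each described by n = 2 variables.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]

# Unsupervised: no labels are given, only the structure of the data.
cluster_ids = kmeans(points, k=2)

# Supervised: the outcome of each previous observation is known.
labels = ["A", "A", "A", "B", "B", "B"]
model = fit_nearest_centroid(points, labels)
prediction_low = predict(model, (0.9, 1.1))
prediction_high = predict(model, (5.1, 5.0))
```

In this sketch the clustering step recovers two groups without ever seeing a label, whereas the classifier needs the labelled observations before it can assign a named category to a new one.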
