At a high level the way the application works is that, given an object, it will classify the object into one of two categories. For example, the object could be a photograph, and the classifier could automatically label the photograph with a tag saying “there is a face in this picture” or “there is no face in this picture”. In general the approach can be used on any object and any classification scheme.
There are two separate phases:
The first is the training phase, which is typically very computationally expensive (months, lots of humans in the loop, thousands of processors). In this phase the system “learns by example”. We show it many instances of objects of the first category, and then many instances of the second category, and the system learns which is which. The second phase is the use of the classifier, where we show it a new object it has not seen yet and the classifier guesses which category it is in. This second phase consumes very little computing power or memory, and typically can be run on very limited devices such as cell phones. Where the quantum computer will be used here is in the first phase. Once the classifier is trained, it is just a small piece of software that can live anywhere.
Dwave objective will be to beat adaboost [adaptive boosting] in three different dimensions: (1) classification accuracy (less errors in classifying new objects); (2) compactness (less memory required to store the classifier); and (3) reduced time to complete training phase (we want to be faster than boosting).
Next Article: Generating synthetic training and test data
Usually when we want to build a classifier the objects to be classified are somehow distilled into some compact and meaningful representation. For example, let’s say the objects in question are elephants, and we want our classifier to act on these objects to assign a +1 to the elephant if it will fit in your car and -1 if it won’t. Elephants are highly complex entities. The vast list of other features of the elephant probably don’t matter much, at least for this particular task.
We will treat the objects to be classified as vectors of real numbers. Sometimes this can get a bit abstract. But remember that in real life those numbers are (hopefully) some smart “boiled down” version of the actual object.
Prelude to machine learning series mentions published aspects of this work. The work was done by Dwave and Google.
FURTHER READING WIKIPEDIA
Boosting is a machine learning meta-algorithm for performing supervised learning. Boosting is based on the question posed by Kearns: can a set of weak learners create a single strong learner? A weak learner is defined to be a classifier which is only slightly correlated with the true classification. In contrast, a strong learner is a classifier that is arbitrarily well correlated with the true classification.
Examples of boosting algorithms
The main variation between many boosting algorithms is their method of weighting training data points and hypotheses. AdaBoost is very popular and perhaps the most significant historically as it was the first algorithm that could adapt to the weak learners. However, there are many more recent algorithms such as LPBoost, TotalBoost, BrownBoost, MadaBoost, LogitBoost, and others. Many boosting algorithms fit into the AnyBoost framework, which shows that boosting performs gradient descent in function space using a convex cost function.
AdaBoost, short for Adaptive Boosting, is a machine learning algorithm, formulated by Yoav Freund and Robert Schapire. It is a meta-algorithm, and can be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive in the sense that subsequent classifiers built are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers. Otherwise, it is less susceptible to the overfitting problem than most learning algorithms.
Linear Programming Boosting (LPBoost) is a supervised classifier from the Boosting family of classifiers. LPBoost maximizes a margin between training samples of different classes and hence also belongs to the class of margin-maximizing supervised classification algorithms.