Data is often described as the new electricity, and machine learning is central to how AI applications put it to use. Batch learning, or offline learning, is a machine learning paradigm in which a learning algorithm trains a model on the entire training dataset at once; the model is then deployed for inference without any further updates. When such a model has to be retrained to accommodate new data, the traditional approach is difficult and expensive.
In the era of big data, traditional offline learning methods are becoming more and more restrictive, especially when live data grows and evolves rapidly. Making machine learning scalable and practical, especially when learning from continuous data streams, has become an open grand challenge in machine learning and AI. In this article, we discuss online learning and its applications in detail. We cover what online learning is and the tasks and applications it can be used for.
What is Online Learning?
Traditional machine learning techniques run in batch mode. In a supervised learning task, for example, the complete training data is supplied in advance and a model is trained on it with a chosen algorithm. This approach requires the entire training set to be available before learning starts, and training is done offline because it is expensive. Conventional techniques therefore suffer from some critical drawbacks: low efficiency in both time and space, and poor scalability for large-scale applications, because the model often has to be retrained from scratch when new data arrives.
Online learning, on the other hand, is a family of ML techniques in which data arrives in sequential order and the learner (algorithm/model) aims to learn and update the best predictor for future data at every step. Online learning overcomes the drawbacks of offline learning because the model can be updated immediately whenever the data changes. It is therefore far more efficient and scalable for large-scale, real-world learning tasks in data analytics and other applications where data is not only large in size but also arrives at high velocity.
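To make the "predict, then update at every step" idea concrete, here is a minimal sketch that uses the classic perceptron as the online learner. The perceptron is one illustrative choice among many, and the toy data stream (`make_stream`) and all names are assumptions made up for this example rather than anything prescribed by a particular library.

```python
import random


def online_perceptron(stream, n_features, lr=1.0):
    """Process (x, y) pairs one at a time; y is +1 or -1.

    Returns the learned weights, the bias, and the number of mistakes
    made while learning.
    """
    w = [0.0] * n_features
    b = 0.0
    mistakes = 0
    for x, y in stream:
        # 1. Predict with the current model before the label is used.
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        y_hat = 1 if score >= 0 else -1
        # 2. The true label arrives; update the model only on a mistake.
        if y_hat != y:
            mistakes += 1
            w = [wi + lr * y * xi for wi, xi in zip(w, x)]
            b += lr * y
    return w, b, mistakes


def make_stream(n, seed=0):
    """Hypothetical toy stream: the label is +1 when x0 + x1 > 0, else -1."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else -1)


w, b, mistakes = online_perceptron(make_stream(2000), n_features=2)
print("weights:", w, "bias:", b, "mistakes:", mistakes)
```

Each example is seen once and then discarded, so memory use does not grow with the length of the stream, which is the property that makes this style of learning attractive for high-velocity data.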
Tasks and Applications
Like offline learning methods, online learning techniques can be applied to a variety of real-world tasks and problems. Some of them are described below.
Supervised Learning Task
One of the most common tasks is classification, which aims to predict the category of a data point based on previously observed data whose category labels are given. For example, a commonly studied task is binary classification, such as email filtering, which involves two categories: spam and not spam. Other supervised classification tasks include multi-class and multi-label classification.
In addition to classification, there is regression. Online learning techniques apply naturally to regression analysis tasks, e.g., time series analysis in financial markets, where data instances arrive in sequential order. Another example is online portfolio selection, where a learner aims to find a good strategy for making a sequence of portfolio decisions.
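As a rough illustration of online regression, the sketch below fits a linear model with one stochastic gradient step on the squared error per incoming example (often called the LMS or Widrow-Hoff rule). The synthetic stream, its coefficients, and the learning rate are assumptions chosen for the demo; real financial time series would need far more care.

```python
import random


def online_linear_regression(stream, n_features, lr=0.05):
    """Update a linear model with one gradient step per (x, y) pair."""
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        y_hat = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = y_hat - y  # gradient of 0.5 * (y_hat - y)^2 w.r.t. y_hat
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err
    return w, b


def make_regression_stream(n, seed=0):
    """Hypothetical stream: y = 2*x0 - 3*x1 + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 2.0 * x[0] - 3.0 * x[1] + 1.0 + rng.gauss(0, 0.01)
        yield x, y


w, b = online_linear_regression(make_regression_stream(5000), n_features=2)
print("weights:", w, "bias:", b)
```

With this setup the learned weights and bias should end up close to the coefficients used to generate the stream, even though no example is ever stored.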
Bandit Learning Task
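Bandit online learning, also known as the multi-armed bandit problem, has been used extensively in online recommendation systems, such as online advertising for internet monetization, product recommendation in e-commerce, movie recommendation on OTT platforms, and other personalized recommendation such as on YouTube. In this setting the learner receives feedback only for the option it actually chose, so it has to balance exploring new options against exploiting the ones that have worked well so far.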
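One of the simplest bandit strategies is epsilon-greedy: most of the time pick the option with the best estimated reward, and occasionally pick one at random. The sketch below assumes three hypothetical ads with made-up click-through rates; the class name, parameters, and rates are illustrative and not taken from any production system.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit that keeps a running mean reward per arm."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean: no need to store past rewards.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Hypothetical click-through rates for three ads (unknown to the learner).
true_ctr = [0.05, 0.10, 0.30]
env_rng = random.Random(1)
bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.1)

for _ in range(5000):
    arm = bandit.select_arm()
    reward = 1.0 if env_rng.random() < true_ctr[arm] else 0.0
    bandit.update(arm, reward)

best_arm = max(range(3), key=bandit.values.__getitem__)
print("pull counts:", bandit.counts, "estimated best arm:", best_arm)
```

More refined strategies (for example UCB or Thompson sampling) replace the fixed exploration rate with one that adapts to uncertainty, but the predict, observe, update loop stays the same.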
Unsupervised Learning Task
Online learning can also be applied to unsupervised tasks such as clustering (cluster analysis), which is the process of grouping objects so that objects in the same group, i.e., cluster, are more similar to each other than to objects in other groups. Online clustering performs incremental cluster analysis on sequential data, which is common when mining data streams.
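A simple way to cluster incrementally is sequential k-means: assign each arriving point to its nearest center and nudge that center toward the point. The sketch below is one minimal variant under assumed conditions (two well-separated synthetic blobs, the first k points used as initial centers); the function and stream names are made up for illustration.

```python
import random


def online_kmeans(stream, k):
    """Sequential k-means: update the nearest center for each new point."""
    centers, counts = [], []
    for x in stream:
        # Use the first k points as the initial centers.
        if len(centers) < k:
            centers.append(list(x))
            counts.append(1)
            continue
        # Find the nearest center by squared Euclidean distance.
        j = min(
            range(k),
            key=lambda i: sum((c - xi) ** 2 for c, xi in zip(centers[i], x)),
        )
        # Move that center toward the point; 1/count keeps a running mean.
        counts[j] += 1
        eta = 1.0 / counts[j]
        centers[j] = [c + eta * (xi - c) for c, xi in zip(centers[j], x)]
    return centers, counts


def make_blob_stream(n, seed=0):
    """Hypothetical stream drawn from two blobs centered at (0, 0) and (5, 5)."""
    rng = random.Random(seed)
    for _ in range(n):
        cx = rng.choice([0.0, 5.0])
        yield [rng.gauss(cx, 0.5), rng.gauss(cx, 0.5)]


centers, counts = online_kmeans(make_blob_stream(1000), k=2)
print("centers:", sorted(centers), "counts:", counts)
```

Only the k centers and their counts are kept in memory, so the cost per point is independent of how many points have already been seen.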
Other Learning Task
Online learning can also be used for other machine learning tasks, such as learning for recommendation systems, learning to rank, and reinforcement learning. For example, collaborative filtering with online learning can improve the performance of recommenders by learning sequentially from a continuous stream of customer ratings and feedback.
Last but not least, Support Vector Machines (SVMs) are a well-known supervised learning method for offline classification, but classical SVM algorithms scale poorly to very large applications. The literature describes various online learning algorithms for training SVMs in an online manner, which makes them more efficient and scalable than conventional offline SVMs.
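As one illustration of online SVM training, the sketch below takes a single stochastic sub-gradient step on the regularized hinge loss for each incoming example, in the spirit of the Pegasos algorithm. It is a simplified linear variant without a bias term, and the toy stream, the regularization value `lam`, and the helper names are assumptions for this example only.

```python
import random


def online_linear_svm(stream, n_features, lam=0.01):
    """One hinge-loss sub-gradient step per (x, y) pair; y is +1 or -1."""
    w = [0.0] * n_features
    for t, (x, y) in enumerate(stream, start=1):
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        # Shrink the weights (effect of the L2 regularizer).
        w = [(1.0 - eta * lam) * wi for wi in w]
        # Add the example only if it violates the margin.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def make_svm_stream(n, seed=0):
    """Hypothetical stream: class +1 near (1, 1), class -1 near (-1, -1)."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.choice([1, -1])
        x = [y + rng.uniform(-0.5, 0.5), y + rng.uniform(-0.5, 0.5)]
        yield x, y


def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


w = online_linear_svm(make_svm_stream(2000), n_features=2)
test_set = list(make_svm_stream(500, seed=42))
accuracy = sum(predict(w, x) == y for x, y in test_set) / len(test_set)
print("weights:", w, "held-out accuracy:", accuracy)
```

Because each update touches a single example, the training cost grows linearly with the number of examples seen, in contrast to classical batch SVM solvers that work on the full dataset at once.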
(This is a slightly modified version of an article originally published in Analytics India Magazine. The original article can be found at https://analyticsindiamag.com/beginners-guide-to-online-machine-learning/.)