“Batch” and “Online” learning

AI in banking management

Edited by Pier Giuseppe Giribone

Previous articles classified machine learning algorithms according to how they are trained, that is, whether or not they require human supervision.

This criterion leads to the following distinction: supervised, unsupervised, semi-supervised and reinforcement learning systems.

Another very common categorization method depends on whether the system is capable of incremental learning on the fly or not.

In batch learning, the system is incapable of incremental learning: it must be trained using all the data available at that moment. This training process generally requires a lot of computing time and resources, so it is typically carried out offline.

The system is first trained and then put into production, where it carries out its task without being able to learn anything new: it simply applies what it learned during the training phase. An algorithm that works in this mode is said to use offline learning.

Consequently, if you want a batch learning system to learn from new data (for example, to recognize a new type of spam, in the case of an email classifier), you must train a new version of the system from scratch on the entire dataset (not just the new data, but all the previous data as well). Once the training phase is complete, the old system in production must be shut down and replaced with the new one.

Fortunately, the entire training process, including performance testing and deployment, can be automated fairly easily, so even a machine learning system designed this way can adapt to a changing environment. It simply requires updating the data and training a new version of the system from scratch whenever necessary.
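The retraining cycle described above can be sketched in a few lines. This is only an illustrative sketch, assuming a toy one-variable linear model fitted by closed-form least squares; the names `fit_line`, `model_v1`, `model_v2` and `production_model` are hypothetical and the data are made up for the example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b on the FULL dataset."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b

# Historical data: kept in full, because every retraining starts from scratch.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # toy data lying on y = 2x + 1
model_v1 = fit_line(xs, ys)        # version currently in production

# New observations arrive. A batch system cannot update model_v1 in place:
# the new data are appended and a new version is fitted on old + new data.
new_xs = [4.0, 5.0]
new_ys = [9.0, 11.0]
xs, ys = xs + new_xs, ys + new_ys
model_v2 = fit_line(xs, ys)

# Deployment step: the old version is replaced by the new one.
production_model = model_v2
```

The cost of each cycle grows with the size of the whole history, not with the size of the new data, which is the limitation discussed in the following paragraphs.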

This solution is simple and often works well, but training on the entire dataset can take several hours of processing. Retraining is therefore typically scheduled at fixed intervals compatible with the processing time, such as every night or every weekend.

If the system needs to adapt to new data more quickly (for example, for stock price forecasting), then a more responsive solution is needed.

Keep in mind that training on the entire dataset requires a significant amount of resources (CPU, memory, disk space, etc.), and if a very large training dataset had to be processed from scratch every day, the procedure would cost a significant amount of money. Furthermore, beyond a certain amount of data, and even ignoring the financial aspect, the procedure could easily become unfeasible because of the technological and infrastructural limits of the machine hosting the algorithm.

Finally, if the system is designed to learn autonomously and has limited resources to do so (for example, a smartphone app), handling a large amount of training data and occupying the device's resources for hours every day may be impractical.

Fortunately, a more reasonable option in all these cases is to adopt algorithms capable of incremental learning.

In online learning, the system is trained incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches.

Each learning step is fast and cost-effective, so the system can learn from new data on the fly, as it becomes available.
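As a minimal sketch of this idea, the toy model below is updated one mini-batch at a time with a gradient step on the squared error. Everything here is an assumption made for illustration: the class `OnlineLinearModel`, its `partial_fit` method, the simulated `stream` generator and the chosen learning rate are hypothetical, not taken from any specific library.

```python
class OnlineLinearModel:
    """Toy model y ~ w*x + b, updated one mini-batch at a time."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def partial_fit(self, batch):
        """One gradient step on the mean squared error of a single mini-batch."""
        n = len(batch)
        grad_w = sum((self.predict(x) - y) * x for x, y in batch) / n
        grad_b = sum((self.predict(x) - y) for x, y in batch) / n
        self.w -= self.learning_rate * grad_w
        self.b -= self.learning_rate * grad_b


def stream(n_batches, batch_size=2):
    """Simulated data feed: points on y = 2x + 1, with x cycling over 0..3."""
    for i in range(n_batches):
        xs = [float((i * batch_size + j) % 4) for j in range(batch_size)]
        yield [(x, 2.0 * x + 1.0) for x in xs]


model = OnlineLinearModel()
for batch in stream(4000):
    model.partial_fit(batch)   # after this call the batch is no longer needed

print("w = %.3f, b = %.3f" % (model.w, model.b))
```

Each call to `partial_fit` touches only the current mini-batch, so its cost does not depend on how much data the model has already seen.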

Online learning is particularly suitable for all systems that receive data in a continuous flow (for example, real-time stock price quotes) and therefore require rapid and autonomous adaptation to change.

This is also a great option if you have limited resources for numerical processing or data storage: once the online learning system has improved its current knowledge by learning from new instances, these are no longer operationally useful and can therefore be discarded (unless you want to be able to revert to a previous state of the system, in order to reproduce the results obtained with the information known at the time of processing). Traceability and the ability to reproduce a result can be considered fundamental requirements in a banking context, for the purposes of second-level (risk management) and third-level (audit) internal controls.

Online learning algorithms can also be used to train systems on large datasets that cannot be processed all at once because of memory constraints. This training method is called out-of-core learning. The algorithm loads a subset of the available training data into memory, trains only on this data, and repeats the process until all the subsets have been processed.
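The loop just described (load a subset, train on it, repeat) might look like the following sketch. It assumes a CSV file with two numeric columns and, for simplicity, a one-variable least-squares fit obtained by accumulating running sums chunk by chunk; the helper `read_chunks` and the class `RunningLeastSquares` are hypothetical names, and the small temporary file only stands in for a dataset too large for memory.

```python
import csv
import os
import tempfile
from itertools import islice


def read_chunks(path, chunk_size):
    """Yield lists of (x, y) rows, holding at most chunk_size rows in memory."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        while True:
            chunk = [(float(x), float(y)) for x, y in islice(reader, chunk_size)]
            if not chunk:
                return
            yield chunk


class RunningLeastSquares:
    """Fits y = a*x + b from sums that are updated one chunk at a time."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def partial_fit(self, chunk):
        for x, y in chunk:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        a = (self.n * self.sxy - self.sx * self.sy) / (self.n * self.sxx - self.sx ** 2)
        b = (self.sy - a * self.sx) / self.n
        return a, b


# Demo file standing in for a large dataset (toy data on y = 3x + 2).
with tempfile.NamedTemporaryFile("w", suffix=".csv", newline="", delete=False) as tmp:
    writer = csv.writer(tmp)
    for i in range(1000):
        writer.writerow([i, 3 * i + 2])
    path = tmp.name

model = RunningLeastSquares()
n_chunks = 0
for chunk in read_chunks(path, chunk_size=100):
    model.partial_fit(chunk)   # only this chunk is in memory
    n_chunks += 1
os.remove(path)

slope, intercept = model.coefficients()
```

Memory use is bounded by `chunk_size`, regardless of how many rows the file contains.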

Out-of-core learning is often done offline, that is, not live on a production system. The term online learning can therefore cause confusion. For this reason, and to avoid misunderstandings, online learning is often referred to as incremental learning.

A parameter of fundamental importance in the design of an online learning system is the speed at which it adapts to changing data: this is called the learning rate.

If you set this parameter to a very high value, the system will adapt very quickly to new data, but it will also tend to forget the knowledge acquired from previous data more quickly (a spam filter, for example, should not recognize only the spam tricks it was shown in the most recent training phases).

Conversely, if you set a low learning rate, the system will have greater inertia, or rather, it will learn more slowly, but will be characterized by a lower sensitivity to noise in new data or to non-representative sequences (outliers).
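The trade-off between a high and a low learning rate can be illustrated with a very simple online estimator, in which each new value pulls the current estimate towards itself by a fraction equal to the learning rate. This is a sketch under stated assumptions: the function `online_mean` and the synthetic series (a stable level, one outlier, then a genuine change of level) are invented for the example.

```python
def online_mean(values, learning_rate, start=0.0):
    """Running estimate: each new value moves it by learning_rate * error."""
    estimate = start
    history = []
    for v in values:
        estimate += learning_rate * (v - estimate)
        history.append(estimate)
    return history


# Stable level at 10, a single outlier (100), then a real shift to 20.
data = [10.0] * 20 + [100.0] + [10.0] * 20 + [20.0] * 5

fast = online_mean(data, learning_rate=0.9, start=10.0)
slow = online_mean(data, learning_rate=0.05, start=10.0)

# Index 20 is the step at which the outlier is seen.
print("right after the outlier: fast=%.2f  slow=%.2f" % (fast[20], slow[20]))
# The last element is the estimate five steps into the new level.
print("five steps into shift:   fast=%.2f  slow=%.2f" % (fast[-1], slow[-1]))
```

The high learning rate follows the real shift almost immediately but is also dragged far off by the single outlier; the low learning rate barely reacts to the outlier but is still far from the new level after five steps.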

A major challenge for systems based on online learning is that, if they are fed a large amount of bad data, their overall performance gradually declines. If the system is live in production, customers will very likely notice the malfunction, with significant operational and reputational risks.

For example, bad data could come from a malfunctioning sensor, or from someone spamming a search engine in an attempt to rank higher in the search results.

To reduce this risk, it's necessary to continuously monitor system performance. If any problems are detected, be ready to interrupt the data flow feeding the algorithm and, if backup copies are available, restore it to a previous state not yet compromised by such flawed data.
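One possible way to organize this monitoring is sketched below: the model is checked periodically against a trusted reference, a snapshot is saved after every successful check, and if a check fails the feed is interrupted and the last good snapshot is returned. The toy model `OnlineMean`, the function `run_with_rollback`, the reference value and the error threshold are all assumptions made for illustration.

```python
import copy


class OnlineMean:
    """Toy online model: keeps a running estimate of the incoming values."""

    def __init__(self, learning_rate=0.5):
        self.estimate = 0.0
        self.learning_rate = learning_rate

    def predict(self):
        return self.estimate

    def learn(self, value):
        self.estimate += self.learning_rate * (value - self.estimate)


def run_with_rollback(model, stream, reference, max_error, check_every=5):
    """Feed the stream to the model, checking it every check_every items.

    Returns (model, stopped_at). If a check fails, the feed is interrupted,
    the last snapshot that passed a check is returned, and stopped_at is the
    position of the item at which the problem was detected.
    """
    snapshot = copy.deepcopy(model)          # backup of the initial state
    for i, value in enumerate(stream, start=1):
        model.learn(value)
        if i % check_every == 0:
            error = abs(model.predict() - reference)
            if error > max_error:
                return snapshot, i           # roll back and stop the feed
            snapshot = copy.deepcopy(model)  # state is healthy: new backup
    return model, None


good = [10.0] * 20
corrupted = [500.0] * 10                     # e.g. a malfunctioning sensor
restored, stopped_at = run_with_rollback(
    OnlineMean(), good + corrupted, reference=10.0, max_error=2.0
)
print("feed stopped at item", stopped_at, "- restored estimate:", restored.predict())
```

In this sketch the corrupted values seen between two checks still reach the live model, so the check interval bounds how long a degraded model can stay in service before the rollback.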

An upstream control can also be put in place by continuously monitoring the incoming data and intervening if anomalies are detected, either in the volume of data received or in its content.
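A minimal sketch of such an upstream check is given below. It assumes that data arrive in batches of numeric values, that an unusual batch size signals a volume anomaly, and that a value lying many standard deviations away from the running mean signals a content anomaly. The class `InputGate` and all its thresholds are hypothetical choices for the example; the running mean and variance are updated with Welford's algorithm.

```python
import math


class InputGate:
    """Screens incoming batches before they reach the learning algorithm."""

    def __init__(self, z_threshold=4.0, warmup=10, min_batch=1, max_batch=100):
        self.z_threshold = z_threshold
        self.warmup = warmup          # values accepted before checks start
        self.min_batch = min_batch
        self.max_batch = max_batch
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0                 # sum of squared deviations (Welford)

    def _accept_value(self, v):
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(v - self.mean) / std > self.z_threshold:
                return False          # rejected values do not update the stats
        self.n += 1
        delta = v - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (v - self.mean)
        return True

    def check_batch(self, batch):
        """Return (accepted_values, alerts) for one incoming batch."""
        alerts = []
        if not (self.min_batch <= len(batch) <= self.max_batch):
            alerts.append("unexpected volume: %d items" % len(batch))
        accepted = []
        for v in batch:
            if self._accept_value(v):
                accepted.append(v)
            else:
                alerts.append("anomalous value: %r" % v)
        return accepted, alerts


gate = InputGate()
history, history_alerts = gate.check_batch([10.0, 10.5, 9.5, 10.2, 9.8] * 4)
accepted, alerts = gate.check_batch([10.1, 250.0, 9.9])
empty_accepted, empty_alerts = gate.check_batch([])
print(accepted, alerts, empty_alerts)
```

Only the values in `accepted` would be passed on to the learning step; the alerts can be used to decide whether to interrupt the feed altogether.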
