Why Machine Learning in the Corrosion Field?
The Importance of Corrosion
Corrosion is a deterioration in the properties of a material due to a chemical reaction with its physical environment. It must be considered in the design of many different objects, from bridges to containers for transporting chemicals, from the pipes that carry water to the utensils used for cooking. Understanding corrosion and corrosion rates is therefore necessary to ensure that a system is safe and performs according to specification.
There are three main reasons why corrosion is important: safety, conservation, and economy. Corrosion can compromise the safety of equipment in operation, such as boilers, pressure tanks, and metal containers holding toxic chemical products, causing failures that can have catastrophic consequences. For example, safety is a critical consideration in the design of nuclear plant equipment and of the tanks containing waste from these plants.
Conservation matters because the loss of material due to corrosion is not only a loss of the metal itself, but also of the energy, water, and human effort used to fabricate the structure. In addition, replacing corroded material requires a further investment of all these resources.
The economic factor is a very important motivation for most current corrosion research. Losses suffered by industry and governments run into billions of dollars a year: approximately $275 billion in the United States alone, or 3.1% of the Gross Domestic Product (GDP). Studies from other countries such as Australia, England, or Japan have shown that in each country the cost of corrosion is between roughly 3% and 4% of GDP. It is estimated that if the existing technology in this field were applied, between 25% and 30% of the total cost of corrosion could be avoided.
Big Data and Machine Learning in Corrosion Management
Corrosion management has been a very traditional and slowly evolving area. Consultants have tended to regard software as something that would reduce the demand for their services. With the advent of machine learning, it is gradually being understood that using this type of tool is a skill that can complement and add value to the work of a corrosion engineer.
The use of big data and machine learning to keep pipelines safe is a field that will only grow. Three key points require focus:
Managing the volume of data - Given the huge number of pipeline datasets around the world, how much data is generated and how to store it are fundamental problems.
Managing the variety of data - Data comes from various sources, such as inspections or sensors installed in pipelines, and it can take different forms, such as categorical or numerical values. Each piece of information is relevant to the corrosion problem and may require different processing, which companies must handle with the right tools.
The value of data - If not caught early, corrosion can lead to leaks, the results of which can be devastating. Timely detection is therefore the best means of prevention for pipeline operators around the world. Machine learning allows companies to build predictive models that support better decision-making and help avoid disasters and economic losses, and therein lies the great value of this data. Corrosion management can benefit significantly from the application of machine learning or of subsets of it such as deep learning.
Corrosion is a difficult physical problem
The corrosion field still relies largely on inaccurate predictive models of in-service corrosion rate, such as power functions, time-varying functions, and log-linear functions. The basic problem is that corrosion is a diverse, highly stochastic, and performance-related process, so its behaviour can change with small fluctuations in the environment or the material. It is also highly non-linear, and the scenarios in which it occurs are complex and changing. This makes the problem very difficult for traditional statistical methods to address. Machine learning, by contrast, learns from data and can provide cheap and accurate simulations. The predictive power of different machine-learning approaches should encourage the corrosion community to use machine-learning-related tools in the study of this field.
On the other hand, corrosion data is usually incomplete, noisy, and heterogeneous. Corrosion monitoring techniques that continuously track asset condition provide data that machine learning approaches can use to move towards predictive corrosion management.
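To make the traditional approach concrete, the power function mentioned above is commonly written as loss = A * t^n, where t is the exposure time. The following is a minimal sketch of how such a model can be fitted, assuming a least-squares fit in log-log space. The function name fit_power_law and the data are hypothetical: the data is generated from known parameters purely for illustration and is not taken from any real measurement.

```python
import math

def fit_power_law(times, losses):
    """Fit loss = A * time**n by ordinary least squares on log-log data."""
    xs = [math.log(t) for t in times]
    ys = [math.log(d) for d in losses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx                          # slope in log-log space = exponent
    A = math.exp(y_mean - n * x_mean)      # intercept in log-log space = log(A)
    return A, n

# Made-up data: exposure time (years) and thickness loss (mm),
# generated without noise from loss = 0.05 * t**0.6.
times = [1, 2, 4, 8, 16]
losses = [0.05 * t ** 0.6 for t in times]

A, n = fit_power_law(times, losses)
predicted_20y = A * 20 ** n
print(f"A = {A:.3f}, n = {n:.3f}, predicted loss at 20 years = {predicted_20y:.3f} mm")
```

A model of this kind has only two parameters and depends on time alone, so it cannot react to changes in the environment or the material, which is the limitation described above.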
Supervised and Unsupervised Learning
There are three main types or paradigms of machine learning: unsupervised learning, supervised learning, and reinforcement learning. In addition, newer paradigms such as active learning and transfer learning have emerged from the classical ones.
The unsupervised learning paradigm covers methods that work with data that has not been labelled beforehand, that is, data that does not contain the target of the study itself. Using unsupervised learning techniques, it is possible to explore the structure of a dataset and extract meaningful knowledge without the need for an established objective. These techniques are useful not only for discovering structure in unlabelled data but also for selecting relevant features and compressing data, which are important steps before other machine learning techniques are applied.
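As an illustration of exploring structure in unlabelled data, here is a minimal sketch of k-means clustering written in plain Python. k-means is one common unsupervised technique, chosen here only as an example. The readings (pH and temperature) are made up, and the function name kmeans and the choice of initial centroids are assumptions for this sketch.

```python
import math

def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, labels

# Made-up, unlabelled sensor readings: (pH, temperature in degrees Celsius).
readings = [(4.1, 60.0), (4.3, 62.0), (3.9, 58.0),
            (7.8, 20.0), (8.0, 22.0), (8.2, 18.0)]

centroids, labels = kmeans(readings, initial_centroids=[readings[0], readings[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No corrosion rate appears anywhere in the data; the algorithm only groups readings that look alike. In practice the features would normally be scaled first, since here temperature dominates the distance.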
The main objective of supervised learning, by contrast, is to build a prediction model from a training dataset containing instances for which the answer to the problem is already known. These known answers are usually called labels. A supervised learning algorithm creates a function that relates the inputs of the system to the required outputs, using previous knowledge to obtain future results. In this way, a model can be built that makes predictions for new data that has not yet been observed.
Supervised and unsupervised learning are two different ways of approaching a machine learning model, and they can be combined to address a specific problem such as corrosion.
Corrosion is an important problem that needs to be addressed with the best techniques. Machine learning and big data have many applications for decision-making in corrosion management, such as corrosion rate estimation, risk-based corrosion assessment, material selection, prediction of equipment end of life, and data-driven inspection prioritisation. Corrosion management is a traditional area that evolves slowly, and it could benefit from using machine learning techniques in combination with existing corrosion monitoring technologies.
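The idea of learning from labelled instances and then predicting for unseen inputs can be sketched with k-nearest-neighbours regression, one simple supervised method among many. The training data below (pH and temperature as inputs, corrosion rate as the label) is made up for illustration, and the function name knn_predict is hypothetical.

```python
import math

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` as the mean label of its k nearest training instances."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in ranked[:k]]
    return sum(nearest_labels) / len(nearest_labels)

# Made-up labelled training set:
# inputs are (pH, temperature in degrees Celsius), labels are corrosion rates in mm/year.
train_X = [(4.0, 60.0), (4.5, 55.0), (5.0, 50.0),
           (7.0, 25.0), (7.5, 20.0), (8.0, 15.0)]
train_y = [0.90, 0.80, 0.70, 0.20, 0.15, 0.10]

# A new reading that was not part of the training data.
new_reading = (7.2, 22.0)
predicted_rate = knn_predict(train_X, train_y, new_reading, k=3)
print(f"Predicted corrosion rate: {predicted_rate:.2f} mm/year")
```

The labels in train_y are what distinguish this from the unsupervised case: the model is told the answer for each training instance and uses it to estimate the answer for a new one.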
Revie, R. W., & Uhlig, H. H. (2008). Corrosion and Corrosion Control: An Introduction to Corrosion Science and Engineering (4th ed.). Hoboken, New Jersey: John Wiley & Sons.
Koch, G., Brongers, M., Thompson, N., Virmani, Y., & Payer, J. (2001). Corrosion Costs and Preventive Strategies in the United States. Dublin: NACE International.