The Art of Machine Learning: Balancing Innovation and Practicality

Innovation is a defining trait of humankind and can be traced back to prehistoric times. It helped us gain dominance on the planet. Innovation is still with us; it has simply become more technology-driven, organized, and regulated.

In a business environment, innovation can be defined as developing novel solutions, typically new products and services. A key criterion for innovation is usefulness: it should provide a tangible outcome, for example improved effectiveness or efficiency. More disruptive innovations address previously unmet needs, while less disruptive innovations can be a simple upgrade to an existing product or service. The most important requirement is that the innovation process leads to a positive impact.

Creativity is a must for innovation; it is its core. It fosters unique ideas, which matters because novelty is a prerequisite of innovation. To be effective, creativity should be coupled with knowledge and strong fundamentals in the specific field where the innovation happens. Deep knowledge makes creativity more realistic and more likely to lead to tangible outcomes. I would call innovation controlled creativity: the ideas must be viable and must hold business value.

What innovation and creativity have in common is that they are abilities that constantly move humankind forward. These capabilities helped us become the most powerful species on the planet; it must be in our veins somehow. We started with chipped stone flakes, hand axes, and controlled fire, at least according to what historians highlight. The sociological structure of those early groups is harder to reconstruct and can hardly be inferred or deduced. The sociological context is important because innovation is rarely conducted alone these days; new technologies are typically not developed solo. Machine-learning-heavy innovation involves proper IT infrastructure, huge computing capacity, MLOps staff, software developers, and ML engineers. This kind of innovation requires several assets and is investment intensive. Unlike in the old times, such projects are these days mostly found at tech giants.

Less capital-intensive projects focus more on the application idea than on technical innovation. Such projects typically build on already published results and rely on existing machine learning techniques, for example by downloading and fine-tuning pre-trained language models. There are several initiatives and approaches of this kind. The success of these innovations depends on several factors, and only a small percentage succeed. Sometimes these projects fail on the problem fit, sometimes on the market fit, and sometimes on the lack of funding.

The success rate of innovative projects strongly depends on the cultural environment in which the project resides. During my professional career I have had the opportunity to work in a wide range of environments: privately funded startups, publicly funded startups, NGO projects, grant-based academic research projects, R&D projects in an international consortium, and R&D projects in a multinational company. Each of these environments has its advantages and disadvantages. Small startups are more dynamic but more financially vulnerable. Big companies are slow but provide a stable financial ground. The academic environment is typically more theoretical and less business-oriented.

Looking at the history of machine learning, it can be stated that this field of artificial intelligence came into existence to solve practical problems, and it is still practical problems and business needs that push the technology forward. This advancement can be perceived as a constant fight between more complex problems and more powerful models. The question is: how far can we go with the problem fit, and which application provides market value and achieves the market fit?

The perceptron, the essential building block of feed-forward neural networks, was introduced in 1957. It was an early technique for solving binary classification problems in the field of computer vision. Binary classification is the general problem of telling two classes apart, like cat or dog, left or right, and so on. Computer vision here means that this first artificial neural network processed the signals coming from photocells. At first, the perceptron existed only in simulation, but later a physical machine was built to process 20×20 pixel images.
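
As a minimal illustration, the sketch below implements the classic perceptron learning rule for binary classification on synthetic data; the 20×20 input size mirrors the photocell grid of the original hardware, but everything else (data, learning rate, epoch count) is an arbitrary example rather than a reconstruction of the 1957 machine.

```python
import numpy as np

# Minimal perceptron for binary classification (illustrative sketch).
# The input is a flattened 20x20 "image", echoing the 400 photocells of
# the original hardware; the data here is synthetic and linearly separable.
rng = np.random.default_rng(0)
n_samples, n_pixels = 200, 20 * 20
X = rng.normal(size=(n_samples, n_pixels))
true_w = rng.normal(size=n_pixels)
y = np.where(X @ true_w > 0, 1, -1)          # two classes: +1 and -1

w = np.zeros(n_pixels)
b = 0.0
lr = 0.1
for epoch in range(20):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) <= 0:           # misclassified sample
            w += lr * yi * xi                # perceptron update rule
            b += lr * yi
            errors += 1
    if errors == 0:                          # converged on separable data
        break

accuracy = np.mean(np.sign(X @ w + b) == y)
print(f"training accuracy: {accuracy:.2f}")
```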

The perceptron project also laid down an important principle of machine learning: the concept of being data-driven. The essential nature of machine learning models is that they identify the numerical interdependencies in the data. Mathematically speaking, machine learning models learn transformations: transforming an image into a list of bounding boxes of pedestrians or other vehicles, transforming gas sensor signals into concentration levels of CO2, NO, or NO2, or transforming a sequence of words (lyrics) into a sequence of sound wave amplitudes (music). Learning these transformations means iteratively fitting the model to the training data. It also means that the quality of the training data has a strong influence on the quality of the model. Regarding practicality, being data-driven also means that these models are designed to solve real-world problems on real-world data. The question is the quality, the problem fit.
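
To make "learning a transformation" concrete, here is a minimal sketch in which a linear model is fitted iteratively, by gradient descent, to map sensor-like inputs to a target value; the data and the mapping are invented purely for illustration.

```python
import numpy as np

# Sketch: "learning a transformation" as iterative fitting on training data.
# A linear model maps synthetic sensor readings to a target value; both the
# data and the underlying mapping are made up for this example.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))                     # e.g. four raw sensor channels
true_w = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=500)  # target, e.g. a concentration

w = np.zeros(4)
lr = 0.05
for step in range(1000):                          # iterative fit = gradient descent
    grad = 2 * X.T @ (X @ w - y) / len(y)         # gradient of mean squared error
    w -= lr * grad

print("learned transformation:", np.round(w, 2))
```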

An important machine learning paradigm is formulated by the no-free-lunch theorem. Its main message is that there is no omnipotent algorithm in machine learning that works perfectly on every problem. The job of a machine learning researcher is to find the model that provides the best fit to the data and the problem at hand. This is done by trial and error: different models are tested against the data while their performance is measured, and the problem fit is quantified.
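
The trial-and-error loop can be sketched in a few lines: several candidate models are scored on the same data, and the problem fit is quantified by cross-validation. The dataset and the model choices below are arbitrary placeholders, not a recommendation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Sketch of the trial-and-error loop implied by the no-free-lunch theorem:
# the same data, several candidate models, and a measured problem fit.
X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM (RBF kernel)": SVC(),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)   # quantify the problem fit
    print(f"{name:22s} mean accuracy = {scores.mean():.3f}")
```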

Since the introduction of the perceptron, the first representative of artificial neural networks (ANNs), machine learning has continued its evolution. Several specialized techniques have been introduced: convolutional neural networks were developed to deal with visual information, and recurrent neural networks were defined to operate on sequential data. Deep learning further catalyzed this process. More data and more complex problems led to a significant increase in the size of neural networks. The increased complexity led to numerical problems in the training process, and the answer was the introduction of architectural improvements and more complex building blocks. These days, a neural network architect does not operate with perceptrons but with more complex building blocks such as C2f, Upsample, C3k2, SPPF, C2PSA, Add & Norm, and Multi-Head Attention.
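
As a rough sketch of what such composite building blocks look like in code, the snippet below wraps multi-head attention in an "Add & Norm" residual connection, assuming a PyTorch setting; all dimensions and sizes are arbitrary examples, not a reference architecture.

```python
import torch
import torch.nn as nn

# Sketch: modern architectures are assembled from composite blocks rather
# than individual perceptrons. Here an "Add & Norm" residual wrapper is
# combined with multi-head attention; sizes are arbitrary examples.
class AddNorm(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, sublayer_out: torch.Tensor) -> torch.Tensor:
        return self.norm(x + sublayer_out)      # residual connection + layer norm

class AttentionBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.add_norm = AddNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)        # self-attention over the sequence
        return self.add_norm(x, attn_out)

tokens = torch.randn(2, 10, 64)                 # (batch, sequence, embedding)
print(AttentionBlock()(tokens).shape)           # torch.Size([2, 10, 64])
```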

Having identified the problem, the job of a machine learning expert is the following: formulate the problem, find or collect data, annotate the dataset if necessary, identify candidate models, and evaluate them. All of these steps require knowledge, intuition, and experience. The availability of a dataset strongly depends on how the problem is formulated, which is where experience comes in. For example, the technique chosen to represent the information has a deep impact on the performance of the overall system: there are models designed for time series, models that perform well on images, and architectures dedicated to natural language text. These models can be applied in creative ways and can be interconnected, but one should have an intuition about the expected performance. Annotation is a labor-intensive job, which can sometimes be avoided with a creative idea or a proper formulation of the problem, leading to a cost reduction. Finding an adequate model requires deep knowledge of the techniques involved. The main goal of this task is to identify the model with the best problem fit and the lowest error rate or best accuracy.
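
As a small illustration of how much the chosen representation alone can matter, the sketch below evaluates the same model on raw and on standardized features; the dataset is just a stand-in, and the point is the workflow, not the numbers.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Sketch: the data representation can dominate the outcome. The same model
# is evaluated on raw and on standardized features of a placeholder dataset.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

raw_model = SVC().fit(X_train, y_train)
scaled_model = make_pipeline(StandardScaler(), SVC()).fit(X_train, y_train)

print("raw features:   ", round(raw_model.score(X_test, y_test), 3))
print("scaled features:", round(scaled_model.score(X_test, y_test), 3))
```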

Thanks to recent advancements in machine learning, models now outperform humans in several problem domains. In those domains the problem fit can be considered achieved. The next question on the path to practicality is the market fit.

Machine learning can provide the problem fit, but in order to deliver a successful and innovative project, the market fit should also be considered. To be more exact, the two should go hand in hand: identify the limits of the technology, analyze the feedback from the market to understand its needs, and adapt as early as possible.

A machine learning model can be imagined as an assistant that is able to learn a particular task. This assistant can work 24 hours a day and mainly requires electricity to operate. Early applications defined a very narrow scope, for example identifying the digits on the number plates of passing cars. That task is very simple for humans, which also means it is a very boring task to do in a 9-to-5 job. The advancements in machine learning have led to the fortunate situation that the constraints and restrictions on the technical solution could be relaxed: these days it is possible not only to detect the number plate but also to estimate the speed of the vehicle with reasonably good accuracy from camera images. However, the restrictions of machine learning are still inherently present in these projects.
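
The speed-estimation step can be sketched, under strong assumptions, as nothing more than displacement over time: given the tracked plate position in two frames, an assumed pixel-to-metre calibration, and the camera frame rate, the speed follows directly. All values below are hypothetical placeholders; a real system needs proper camera calibration and tracking.

```python
import math

# Sketch: estimating vehicle speed from detections in consecutive frames.
# The detections, the pixel-to-metre scale, and the frame rate are
# hypothetical placeholders for illustration only.
FPS = 30.0                 # assumed camera frame rate
METERS_PER_PIXEL = 0.02    # assumed ground-plane calibration

# (frame_index, x_pixel, y_pixel) of the tracked number plate centre
detections = [(0, 100.0, 600.0), (15, 120.0, 260.0)]

(f0, x0, y0), (f1, x1, y1) = detections
pixel_distance = math.hypot(x1 - x0, y1 - y0)
meters = pixel_distance * METERS_PER_PIXEL
seconds = (f1 - f0) / FPS

speed_kmh = meters / seconds * 3.6
print(f"estimated speed: {speed_kmh:.1f} km/h")
```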

One of the main restrictions of machine learning projects lies in their data-driven behavior. These models are trained on a particular, restricted set of data. This data should be of high quality and should be relevant to the problem that is intended to be solved. The models are typically optimized to work with high accuracy on the dataset that has been chosen as a reference for the actual problem. By analogy, interpolation is more accurate than extrapolation: machine learning models perform well on data that is similar to the data they were trained on. When a model is applied to data that is not similar to the training data, its performance degrades. This is why it is important to perform out-of-distribution detection on the input data that is passed to the model.
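
A simple form of out-of-distribution detection can be sketched with a Mahalanobis distance computed against the training data statistics: inputs that lie too far from the training distribution are flagged before they ever reach the model. The data and the threshold below are illustrative only; in practice the threshold would be tuned on validation data.

```python
import numpy as np

# Sketch of a simple out-of-distribution check: inputs whose Mahalanobis
# distance from the training distribution exceeds a threshold are rejected
# before inference. Data and threshold are illustrative only.
rng = np.random.default_rng(2)
X_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 8))

mean = X_train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X_train, rowvar=False))

def mahalanobis(x: np.ndarray) -> float:
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

threshold = 5.0                                # tuned on validation data in practice
in_dist_sample = rng.normal(size=8)            # looks like the training data
out_dist_sample = rng.normal(loc=6.0, size=8)  # clearly shifted input

for name, x in [("in-distribution", in_dist_sample), ("shifted", out_dist_sample)]:
    flag = "reject" if mahalanobis(x) > threshold else "accept"
    print(f"{name:16s} distance = {mahalanobis(x):6.2f} -> {flag}")
```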

The paradigm above explains the common experience that the accuracy of trained models typically declines the moment they are taken out of the lab and into the real world. This is why machine learning projects are typically iterative: a new or extended application area leads to additional data collection and to refinement of the models.

Market fit is a less technical requirement and needs more business-oriented thinking. When assembling a team, it is important to keep in mind that technical knowledge by itself is not sufficient for a successful business. A role should be dedicated to serving the business needs of the project, for example management and business development. It is also important to have a domain expert on board. The domain expert plays a crucial role in identifying the market value and can also provide useful hints and insights to the technical staff.

Machine learning lies at the intersection of mathematics and computer science. Early approaches were more influenced by mathematics; for example, the theory behind SVM kernels is thoroughly elaborated. Over the course of its evolution, the focus shifted from the mathematical aspects to numerical solutions, from theory to practice. Machine learning provides high-quality, high-accuracy solutions and can deliver the problem fit in many cases. The business aspect, providing the market fit, is what innovation prescribes. This means that innovation and machine learning should go hand in hand and find their balance with a properly aligned problem fit and an adequately chosen market fit.
