Approximating data with an exponential function in Python
by Svitla Team
Data science research grows more important every day. It touches many aspects of the life of an individual and of society as a whole. Accurate modeling of social, economic, and natural processes is vital.
One of the important processes in data analysis is approximation. If you approximate the available data correctly, it becomes possible to estimate and predict future values. This is how a weather forecast, a preliminary estimate of oil prices, or a projection of economic development and social processes can be made. Many processes in nature are described well by exponential functions. Let's consider what exactly a function is and what it means to approximate it.
What is a function?
A function (also called a mapping, operator, or transformation) in mathematics defines a correspondence between the elements of two sets, established by a rule such that each element of the first set corresponds to one and only one element of the second set.
The mathematical concept of a function expresses the intuitive idea that one quantity completely determines the value of another quantity.
Often, the term “function” refers to a numerical function, that is, a function that puts one number in correspondence with another.
More formally, a function f maps the set X to the set Y. For an element x of X, the corresponding element of Y is denoted by y = f(x).
In mathematics and data science, this is one of the fundamental concepts for computing and data analysis. The function can be represented in graphical form; for instance, in two dimensions.
What is an exponential function and why is it important in data science?
As stated earlier, many processes can be described using an exponential function. The function y = exp(x) is the exponential function with base e ≈ 2.718281828, known as Euler's number. Exponential growth is an increase in a quantity whose growth rate is proportional to the current value of the quantity itself. The following table and graph illustrate the nature of exponential growth.
| x | y = exp(x) |
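As a minimal sketch, values of y = exp(x) can be tabulated with the standard math module. The range x = 0..5 is an arbitrary choice made for this illustration, not the range used in the original table. The last lines check that each unit step in x multiplies y by the same factor e, which is what "growth rate proportional to the value" means in practice.

```python
import math

# Tabulate y = exp(x) for a few integer x values (the range 0..5 is an arbitrary choice).
rows = [(x, math.exp(x)) for x in range(6)]

print(f"{'x':>3} | {'y = exp(x)':>12}")
for x, y in rows:
    print(f"{x:>3} | {y:>12.3f}")

# Each unit step in x multiplies y by the same factor, e.
ratios = [rows[i + 1][1] / rows[i][1] for i in range(len(rows) - 1)]
print(ratios)
```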
How to approximate a set of data with an exponential function
Approximation (from Latin proximus, "nearest") is a scientific method that consists of replacing some objects with others that are close to the original in some sense, but simpler.
Approximation allows one to study the numerical characteristics and qualitative properties of an object, reducing the problem to the study of simpler or more convenient objects (for example, those whose characteristics are easily calculated or whose properties are already known).
You can approximate the input values using approximating functions. The most commonly used ones are linear, polynomial, and exponential functions.
Non-linear least-squares problem
The least-squares method finds the parameters of a model for which the sum of the squared errors (the regression residuals) is minimal. Equivalently, it minimizes the Euclidean distance between two vectors: the vector of values predicted by the model and the vector of actual values of the dependent variable. When the model depends on its parameters non-linearly, as an exponential function does, the problem is called a non-linear least-squares problem and is usually solved iteratively.
This method is used very often for optimization and regression, and the Python library scipy implements it in the function scipy.optimize.curve_fit(). If we pass an exponential function and the data sets x and y to this function, it finds the parameters of the exponential function that best approximates the data.
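For an exponential model y = a * exp(b * x), the quantity being minimized is S(a, b) = sum over i of (y_i - a * exp(b * x_i))^2. The sketch below computes this sum directly. The function name sum_squared_residuals and the sample data are hypothetical and chosen only for illustration; the data is generated without noise from a = 2, b = 0.5, so the generating parameters give a sum of zero and any other pair gives a larger value.

```python
import math

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared differences between observed ys and the model a * exp(b * x)."""
    return sum((y - a * math.exp(b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical noise-free data generated from a = 2, b = 0.5.
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]

exact = sum_squared_residuals(2.0, 0.5, xs, ys)  # the generating parameters
worse = sum_squared_residuals(2.0, 0.4, xs, ys)  # a slightly wrong growth rate
print(exact, worse)
```

A least-squares solver searches for the pair (a, b) that makes this sum as small as possible.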
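A minimal sketch of such a call, assuming synthetic noise-free data generated from known parameters (a = 2.0, b = 0.3) so that the result can be checked. The helper name exponential, the sample range, and the starting guess p0 are assumptions made for this illustration.

```python
import numpy
from scipy.optimize import curve_fit

def exponential(x, a, b):
    """Model y = a * exp(b * x)."""
    return a * numpy.exp(b * x)

# Hypothetical noise-free sample generated from a = 2.0, b = 0.3.
x = numpy.linspace(0.0, 5.0, 20)
y = exponential(x, 2.0, 0.3)

# p0 is the starting guess for (a, b); (1.0, 0.1) is an arbitrary choice.
# curve_fit returns the fitted parameters and an estimate of their covariance matrix.
(a_fit, b_fit), covariance = curve_fit(exponential, x, y, p0=(1.0, 0.1))
print(a_fit, b_fit)
```

With data this clean, the fitted values should land very close to the generating parameters; real data with noise will give only approximate values.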
Python code for approximation example
Let's solve the problem of approximating a data set using an exponential function. Of course, not all data can be approximated this way, but in many cases, when the underlying law of change is exponential, it works quite well.
For example, take data that describes the exponential increase in the spread of a virus. This data can be approximated fairly accurately by an exponential function, at least piecewise along the x-axis.
To do this, we will use standard Python together with the numpy library, the optimization routines from the scipy library, and the matplotlib charting library.
To find the parameters of an exponential function of the form y = a * exp(b * x), we use an optimization method. The scipy.optimize.curve_fit() function is suitable for this. It uses a non-linear least-squares algorithm to fit the function that we pass in to the data.
Non-linear least squares is one of the standard optimization methods. If we find values of a and b for which the function closely describes the relationship between x and y in the data, we can evaluate the function for new values of the argument. This makes it possible, for example, to predict how the function grows for subsequent values along the x-axis.
See the scipy.optimize.curve_fit() function manual in the SciPy documentation for more details.
Now let's look at a small piece of Python code that:
- Specifies input values for x and y
- Defines the exponential function as a lambda function: lambda x1, a, b: a * numpy.exp(b * x1)
- Calculates the values of a and b in the exponential function using curve_fit()
- Draws graphs of the original data (blue) and the approximated data (red)
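import numpy
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Input data: argument values x and observed values y
x = numpy.arange(1, 31, 1)
y = numpy.array([3, 7, 14, 16, 26, 47, 73, 84, 113, 196, 218, 310, 356, 475, 548, 645, 794,
                 942, 1096, 1251, 1319, 1462, 1668, 1892, 2203, 2511, 2777, 3102, 3372, 3764])

# Fit y = a * exp(b * x); curve_fit returns the parameters and their covariance matrix
[a, b], res1 = curve_fit(lambda x1, a, b: a * numpy.exp(b * x1), x, y)

# Values of the fitted exponential function at the same x
y1 = a * numpy.exp(b * x)

plt.plot(x, y, 'b')   # original data (blue)
plt.plot(x, y1, 'r')  # approximated data (red)
plt.show()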
The graph shows that the red curve (the exponential approximation) follows the blue curve (the real data) closely and captures the nature of the change in the data.
It is worth noting that the approximation error can be quite large if your input data follows some dependence other than an exponential one. In that case, you can divide the graph into separate sections and try to approximate each section with its own exponential function, or choose another approximating function, for example a polynomial.
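Once a and b are known, the same formula can be evaluated beyond the observed range. The sketch below refits the article's data and computes values for x = 31..35. The helper name exponential, the starting guess p0, and the choice of five future points are assumptions made for this illustration, not part of the original example.

```python
import numpy
from scipy.optimize import curve_fit

def exponential(x, a, b):
    """Model y = a * exp(b * x)."""
    return a * numpy.exp(b * x)

# Same data as in the example above.
x = numpy.arange(1, 31, 1)
y = numpy.array([3, 7, 14, 16, 26, 47, 73, 84, 113, 196, 218, 310, 356, 475, 548, 645, 794,
                 942, 1096, 1251, 1319, 1462, 1668, 1892, 2203, 2511, 2777, 3102, 3372, 3764])

# p0 is an assumed starting guess for (a, b), not part of the original example.
(a, b), _ = curve_fit(exponential, x, y, p0=(1.0, 0.1))

# Evaluate the fitted curve at five x values beyond the observed range.
x_future = numpy.arange(31, 36, 1)
y_future = exponential(x_future, a, b)

for xi, yi in zip(x_future, y_future):
    print(f"x = {xi}: predicted y = {yi:.0f}")
```

Such extrapolated values are only as reliable as the assumption that the data keeps following the same exponential law.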
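One way to put a number on the approximation error is the root-mean-square error (RMSE) between the data and the fitted curve. The sketch below computes it for the exponential fit and, for comparison, for a polynomial fitted with numpy.polyfit. The rmse helper, the starting guess p0, and the polynomial degree 3 are assumptions chosen for this illustration.

```python
import numpy
from scipy.optimize import curve_fit

def exponential(x, a, b):
    """Model y = a * exp(b * x)."""
    return a * numpy.exp(b * x)

def rmse(observed, predicted):
    """Root-mean-square error between two arrays of equal length."""
    return float(numpy.sqrt(numpy.mean((observed - predicted) ** 2)))

# Same data as in the example above.
x = numpy.arange(1, 31, 1)
y = numpy.array([3, 7, 14, 16, 26, 47, 73, 84, 113, 196, 218, 310, 356, 475, 548, 645, 794,
                 942, 1096, 1251, 1319, 1462, 1668, 1892, 2203, 2511, 2777, 3102, 3372, 3764])

# Exponential fit; p0 is an assumed starting guess for (a, b).
(a, b), _ = curve_fit(exponential, x, y, p0=(1.0, 0.1))
rmse_exp = rmse(y, exponential(x, a, b))

# Alternative model: a cubic polynomial (degree 3 is an arbitrary choice).
coeffs = numpy.polyfit(x, y, 3)
rmse_poly = rmse(y, numpy.polyval(coeffs, x))

print(f"exponential RMSE:      {rmse_exp:.1f}")
print(f"cubic polynomial RMSE: {rmse_poly:.1f}")
```

A lower RMSE on the observed points does not by itself mean better predictions outside them, so the choice of model should also rest on what is known about the process behind the data.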
To conclude this article on data approximation with an exponential function, let's note that good, effective tools for solving this important problem are now readily available. With the Python language and libraries such as numpy and scipy, a task like this takes only a few lines of code, as the example shows. As a first approximation, an exponential fit makes it possible to make predictions for certain kinds of problems in economics, natural phenomena, and the social sphere.
Our data science specialists are very well trained in solving non-standard problems. Svitla Systems works with complex projects and has vast experience. We know how to satisfy customer requests, coordinate project requirements in agile mode, and maintain efficient communication.