Wednesday, March 28, 2018

I have more than 100,000 pageviews today

I just found out that my blog today has more than 100k pageviews in 2-years. To me, it is a thing worth celebrating ^)^

I started to write more blogs two years ago, and use it as a way to keep my fun things happen each week. I found out that most of the time, I will go back to check my blog to find out some information I need. Then I realize that it becomes a habit for me, I will feel something is not done if I miss one blog in this week. This feeling keeps me try to write it regularly. But sometimes when I am traveling or I have some other important things to do, I will try to make it up later ^)^

Anyway, this is the number until today:


Also, the top 5 posts are:


Tuesday, March 27, 2018

Python fitting curves

Recently I have a friend asking me how to fit a function to some observational data using python. Well, it depends on whether you have a function form in mind. If you have one, then it is easy to do that. But even you don’t know the form of the function you want to fit, you can still do it fairly easy. Here are some examples. You can find all the code on Qingkai’s Github.
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

plt.style.use('seaborn-poster')

If you can tell the function form from the data

For example, if I have the following data that actually generated from the function 3*exp(-0.05x) + 12, but with some noise. When we see this dataset, we can tell it might be generated from an exponential function.
np.random.seed(42)

x = np.arange(-10, 10, 0.1)

# true data generated by this function
y = 3 * np.exp(-0.05*x) + 12

# adding noise to the true data
y_noise = y + np.random.normal(0, 0.2, size = len(y))
plt.figure(figsize = (10, 8))
plt.plot(x, y_noise, 'o', label = 'Raw data generated')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
png
Since we have the function form in mind already, let’s fit the data using scipy function - curve_fit
from scipy.optimize import curve_fit
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
popt, pcov = curve_fit(func, x, y_noise)
plt.figure(figsize = (10, 8))

plt.plot(x, y_noise, 'o',
    label='Raw data')

plt.plot(x, func(x, *popt), 'r',
    label='Fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))

plt.legend()

plt.xlabel('X')
plt.ylabel('y')

plt.show()
png

More complicated case, you don’t know the funciton form

For a more complicated case that we can not easily guess the form of the function, we could use a Spline to fit the data. For example, we could use UnivariateSpline.
from scipy.interpolate import UnivariateSpline
np.random.seed(42)

x = np.arange(-10, 10, 0.1)

# true data generated by this function
y = 3 * np.exp(-0.05*x) + 12 + 1.4 * np.sin(1.2*x) + 2.1 * np.sin(-2.2*x + 3)

# adding noise to the true data
y_noise = y + np.random.normal(0, 0.5, size = len(y))
plt.figure(figsize = (10, 8))
plt.plot(x, y_noise, 'o', label = 'Raw data generated')
plt.plot(x, y, 'k', label = 'True model')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
png
# Note, you need play with the s - smoothing factor
s = UnivariateSpline(x, y_noise, s=15)
xs = np.linspace(-10, 10, 100)
ys = s(xs)
plt.figure(figsize = (10, 8))
plt.plot(x, y_noise, 'o', label = 'Raw data')
plt.plot(x, y, 'k', label = 'True model')
plt.plot(xs, ys, 'r', label = 'Fitting')
plt.xlabel('X')
plt.ylabel('y')
plt.legend(loc = 1)
plt.show()
png

Of course, you could also use Machine Learning algorithms

Many machine learning algorithms could do the job as well, you could treat this as a regression problem in machine learning, and train some model to fit the data well. I will show you two methods here - Random forest and ANN. I didn’t do any test here, it is only fitting the data and use my sense to choose the parameters to avoid the model too flexible.

Use Random Forest

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
# fit the model and get the estimation for each data points
yfit = RandomForestRegressor(50, random_state=42).fit(x[:, None], y_noise).predict(x[:, None])

plt.figure(figsize = (10,8))
plt.plot(x, y_noise, 'o', label = 'Raw data')
plt.plot(x, y, 'k', label = 'True model')
plt.plot(x, yfit, '-r', label = 'Fitting', zorder = 10)
plt.legend()
plt.xlabel('X')
plt.ylabel('y')

plt.show()
png

Use ANN

from sklearn.neural_network import MLPRegressor
mlp = MLPRegressor(hidden_layer_sizes=(100,100,100), max_iter = 5000, solver='lbfgs', alpha=0.01, activation = 'tanh', random_state = 8)

yfit = mlp.fit(x[:, None], y_noise).predict(x[:, None])

plt.figure(figsize = (10,8))
plt.plot(x, y_noise, 'o', label = 'Raw data')
plt.plot(x, y, 'k', label = 'True model')
plt.plot(x, yfit, '-r', label = 'Fitting', zorder = 10)
plt.legend()

plt.xlabel('X')
plt.ylabel('y')

plt.show()
png

Sunday, March 18, 2018

Travel: Visiting University of Oregon

I was visiting University of Oregon last week, it is really beautiful! It gives me a really nice feeling about quiet, unique, friendly, green, and go duck! 
jpg  jpg
  jpg
jpg
jpg
jpg
  
jpg
jpg
jpg
jpg  jpg
jpg
jpg  jpg
jpg
jpg  jpg
  jpg

Tuesday, March 13, 2018

Sad news: Professor Stephen Hawking dies aged 76

Sad news - Professor Stephen Hawking dies aged 76. How sad it is! This great man is the role model of many people, including me. I know I can not get his level, but I will try to make more contributions to the human knowledge and society as him!!!