Qingkai's Blog: February 2017

Sunday, February 26, 2017

Pager duty at Berkeley Seismological Lab

As a graduate student at the Berkeley Seismological Lab, pager duty is one of the things that accompanies you throughout your PhD study. Why is it called pager duty? You carry a pager with you for a week, 24 hours a day, and wait for it to beep (the duty rotates every semester). Whenever it beeps, you need to sit down in front of a computer and work out the mechanism of the earthquake. This is what a pager looks like:

Some young people don't even know what a pager is, and yes, we are still using them. It is amazing that there are still companies doing business with pagers. They have some advantages over a smartphone: the battery can last for months, they work in some places where no cell signal is available, and they are cool to show off to friends :-). Overall, the pager does the job nicely. Like I mentioned, whenever this thing beeps, I need to work on the earthquake it reports within the next half hour to one hour, even in the middle of the night. Luckily, during my PhD life, it only happened about 3 or 4 times that I needed to do this at 1 or 3 am.

Ok, now you must be curious what we do after the pager beeps. When it beeps, it means an earthquake larger than M3.5 has just happened in Northern California. Based on the waveforms recorded at seismic stations, a computer algorithm generates a Moment Tensor Solution, which is a mathematical representation of the movement on a fault during an earthquake. This solution can tell us (mostly seismologists) how the fault actually moved. The alarm responders need to log in to the system, review the result, and produce a better Moment Tensor Solution if needed. Once the result is reviewed, or a better solution is made, we publish it to USGS. It will appear on the USGS website as in the following figure:

If you play around a little bit on the USGS website, you can find more details about the solution the responder made. Here's an example of one I did in 2013:

Don't worry if you cannot understand what is shown here; it is designed for seismologists. If you really want to know how it works, you can read this famous paper - A student's guide to and review of moment tensors. You can also check out the following video to get an idea of the focal mechanism, which is a closely related concept, and to learn how to read a beach ball diagram.

I think it is really cool that we students can contribute to the USGS earthquake website and leave our names there forever (I assume the USGS earthquake page will last forever, as long as we still have earthquakes).

Anyway, pager duty is part of our graduate student life, and I believe that we will miss the feeling of doing duty in the middle of the night after we graduate.

Fun facts

The pager will beep two times a day for testing purposes, at noon and 3:30 pm. That's usually the time when we show off to friends or see the curious looks on other people's faces.
During one tour of duty (7 days), responders will experience 0 or 1 earthquakes most of the time. Some lucky ones will have an earthquake swarm occur during their duty :-)
The average time to finish one earthquake moment tensor is about 1 hour.
We switch duty on Wednesdays at 5 pm.

I downloaded the earthquakes larger than M3.4 in Northern California from 2011 to 2017 and made some quick figures to show some interesting facts. (I chose M3.4 because, after we finish the moment tensor solution, the magnitude may change from M3.5 to a smaller magnitude, e.g. M3.4, M3.3, and so on.)

Where do the earthquakes happen?

The red dots are the locations of earthquakes that our responders worked on in the last 6 years.

When do the earthquakes happen?

It seems earthquakes were randomly distributed.

Which hour during the day do the earthquakes happen?

We had relatively more earthquakes in the middle of the night in the last 6 years. How many of us remember the earthquakes that woke us up? Hands up! I have at least 3 or 4 cases in my memory. Also, in the last 6 years there was a relatively low number of earthquakes around noon, which is good, since everyone is at lunch :-)

On which weekday do the most earthquakes happen?

Glad that in the last 6 years Sat and Sun were relatively low on earthquakes.

What is the number of earthquakes each week?

This figure shows the number of earthquakes in each week (Wednesday to next Wednesday). I am wondering who was the lucky one that had 14 earthquakes during the duty :-) Definitely not me.

How many earthquakes do we usually see during duty?

This histogram shows that most responders did 1 or no earthquakes during the duty week (the first two bars show this, and more than half of the weeks in the last 6 years ended up with no earthquakes). The following table is a quick statistical summary of the earthquake counts. The mean is 0.75 earthquakes per duty week, the 50th percentile is 0 earthquakes, and the 75th percentile is 1 earthquake. So far today, I have already done two earthquakes (my lucky day!), so maybe I will do more earthquakes this tour of duty and fall on the tail of the histogram.


count	311
mean	0.75
std	1.23
50%	0
75%	1
max	14
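As a rough illustration of how counts like the ones in this table could be produced, here is a minimal sketch that bins event times into duty weeks (switching on Wednesdays at 5 pm, as mentioned above) and keeps the empty weeks. The function names and the timestamps are made up for the example, time zones are ignored, and this is not the script used for the figures.

```python
from collections import Counter
from datetime import datetime, timedelta
import statistics


def duty_week_start(t):
    """Return the Wednesday 5 pm that begins the duty week containing t."""
    days_back = (t.weekday() - 2) % 7  # weekday(): Monday=0, Wednesday=2
    start = (t - timedelta(days=days_back)).replace(
        hour=17, minute=0, second=0, microsecond=0)
    if start > t:  # a Wednesday before 5 pm still belongs to the previous week
        start -= timedelta(days=7)
    return start


def counts_per_duty_week(event_times, first, last):
    """Count events in every duty week from first to last, keeping empty weeks."""
    counts = Counter(duty_week_start(t) for t in event_times)
    week = duty_week_start(first)
    result = []
    while week <= last:
        result.append(counts.get(week, 0))
        week += timedelta(days=7)
    return result


# Made-up event times, only to show the call pattern.
events = [
    datetime(2017, 2, 22, 16, 0),  # Wednesday 4 pm: previous duty week
    datetime(2017, 2, 22, 18, 0),  # Wednesday 6 pm: new duty week
    datetime(2017, 2, 26, 3, 0),   # Sunday 3 am: same duty week
]
weekly = counts_per_duty_week(
    events, datetime(2017, 2, 8, 18, 0), datetime(2017, 3, 1, 12, 0))
summary = {
    "count": len(weekly),
    "mean": statistics.mean(weekly),
    "50%": statistics.median(weekly),
    "max": max(weekly),
}
```

Keeping the weeks with zero events matters here: dropping them would inflate the mean and hide the fact that more than half of the weeks are quiet.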

Why do we have an alarm response team? Why do we have to get up in the middle of the night to look at earthquakes?

This is copied from our Lab website!

As a seismological observatory, the Berkeley Seismological Laboratory has been involved in earthquake information for over a century. Part of our mission is to monitor earthquake activity and to provide timely and accurate information to state and federal agencies, to the media, and to the public.

Our primary responsibility is for local earthquakes. This has led to the development of the REDI project and the joint notification system with the USGS Menlo Park. Since much of this processing is automated, the Alarm Response workload has been significantly lightened. However, it is extremely important that the UCB alarm response people carefully review - and update, if necessary - information for larger events. This reviewing process must be done in collaboration with the USGS person on duty.

We also have a responsibility to respond to regional and teleseismic events. In this situation, our duties are to provide supporting information for the authoritative agency - for example, Caltech, the University of Washington, or NEIC. In general, we do not formally release our locations and magnitudes to the press if we have a solution from the authoritative group.

Conclusion

I wrote this blog because this might be my last duty, since I will graduate soon and will be off duty. I even feel a little sad. The pager duty experience is fun and frustrating at the same time, and it has been a big part of my life. I hope future students will enjoy the experience, and when I talk with them in the future, we will have so many things to cover.

Friday, February 24, 2017

Entrepreneur training 1: Anatomy of a Business

Anatomy of a Business

Starting this week, I will write a series of posts on the training I am currently getting at Berkeley. The posts will contain the training I had and the key points I learned from it (you may read them and feel they are too scattered, because they mainly serve to remind me in the future of what I learned). The teacher is Naeem Zafar, a very successful entrepreneur.

I feel I need to understand how to become an entrepreneur in case I have the chance in the future, so it is better to learn now. Even though I have decided to stay in academia as a faculty member, doing good research and being an entrepreneur do not conflict with each other. Just imagine: if we can turn some research into a self-sustaining entity that provides support for further research, why not! Besides, I realize that a lot of the things covered in the training are actually applicable to other aspects of my life. Even if I do not take the entrepreneur path in the future, I think it will still teach me a lot of things that benefit my life.

Ok, let's start with this first week - Anatomy of a Business.

We start by asking the question 'What is Business?'.

Designing a box that can increase the money you put into it
Creating a cash flow engine

Types of business

For profit (C-corp. & others), e.g. Google, IBM
Non-profit (501(c)(3) ...), e.g. Red Cross
Benefit corporation (B-corp.), e.g. social good businesses

The idea of separating a company from human identity is brilliant. The Dutch East India Company is often cited as the first company to issue shares to the public.

How a company is structured:

Shareholders own the company
Shareholders elect a board of directors
Board of directors will select the CEO
CEO represents the shareholders' interests and will hire people to run the company.
The real power lies with the board of directors

Types of company

C-Corporation
S-Corporation
LLC (Limited Liability Company)
Sole proprietorship
DBA (Doing Business As)

Private or Public company

Who can buy shares in a company? Anyone, or only accredited investors?
The first sale of shares to the public is called an IPO (Initial Public Offering)

The biggest difference between a C-Corporation and an LLC is how profit is taxed. A C-Corporation pays tax on its profit itself and does not automatically pass the profit to the shareholders. Most C-Corporations do not pay dividends, but if one does, the shareholders have to pay tax on the dividends they receive. An LLC, in contrast, typically passes its profit through to its owners, who pay the tax on it, while the LLC itself does not pay tax.

S-1 - The disclosure (registration statement) a company has to file with the SEC in order to go public

Acknowledgements:

All the materials are from the entrepreneurship class at UC Berkeley taught by Naeem Zafar.

Friday, February 17, 2017

Wife's painting: Mermaid transition

Today, my wife made another painting which I think is really cool - The mermaid transition. It originated from the live model drawing she made yesterday. From this model drawing, she developed the whole idea. You can check out all my wife's paintings.

Mermaid transition

The artist - Fan's explanation.
Mermaid – the princess in the sea. At this moment, her beautiful fish tail is transforming into human feet, representing that she is getting out from her kingdom and going into the human world. She is excited about her new life but also fears her unknown future. Just like a mermaid, every girl is her parents’ princess. When she grows up, she leaves home to have her new life; she leaves school to go into society. From a little girl, she grows up to be a real woman. Not just girls, everyone is the same. We say goodbye to our past and welcome the new day. We get rid of our childishness and become a real human, facing reality. At that moment, we are not just excited but also afraid of our unknown future. Sometimes, the real world isn’t as good as we expected, but we always have hope for the future. That’s what the sunshine in this painting implies – our hope. But the hope is not something we can really see; it’s always in our hearts. So, we cannot see the sun in this painting, but we know it’s there, just like our hope.

The original idea is from the following drawing, which my wife drew yesterday in a drawing class. In her eyes, the model's pose somehow resonated with a fairytale she learned before - The Little Mermaid. The first time she left home for university, she had a similar feeling, excited but afraid of the unknown future. I can see the process was hard, but she handled it very well, since she always has hope in her heart. She knows what she wants, just like the mermaid, and stands on her own feet. She seldom complains about the hardness of life, and always has a smile on her face. This has also influenced me a lot in the past 11 years: no matter how hard life is or how uncertain you are, keep the hope in your heart, and you will soon feel that life is great! I am so proud of my wife!

Live Model drawing

Sunday, February 12, 2017

Machine learning 9 - More on Artificial Neural Network

I gave a series of Artificial Neural Network (ANN) tutorials last year for a workshop (here), where I showed the very basics. Since then, more people have been asking me about the details, and I will cover some of them this week. I hope this will be useful to you.

How to select the parameters

We talked about the hidden neurons in the hidden layer, and you may ask 'How do we select the number of neurons?' The way I select the number of neurons in the hidden layer is 10-fold cross-validation. This is a very common way in the machine learning community to find good parameters, and it works as shown in the following figure.

In k-fold cross-validation, the original sample is randomly partitioned into k subsamples. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples used exactly once as the validation data. The k results from the folds then can be averaged (or otherwise combined) to produce a single estimation.

Therefore, 10-fold cross-validation means I split the data into 10 subgroups, use 9 of them for training, and use the remaining 1 to test the result, repeating this so that each subgroup is used once for testing. You can also use this method for other parameters.
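The splitting described above can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names are hypothetical, `evaluate` stands for whatever code trains a model on the training indices and returns its error on the validation indices, and the toy "model" at the end simply predicts the mean of the training targets.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(evaluate, n_samples, k=10, seed=0):
    """Average a validation score over k folds.

    evaluate(train_idx, val_idx) is a user-supplied callable that trains on
    train_idx and returns the error measured on val_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(evaluate(train_idx, val_idx))
    return sum(scores) / k


# Toy usage: the "model" just predicts the mean of the training targets.
y = [float(i % 5) for i in range(50)]


def mean_model_error(train_idx, val_idx):
    prediction = sum(y[i] for i in train_idx) / len(train_idx)
    return sum((y[i] - prediction) ** 2 for i in val_idx) / len(val_idx)


cv_error = cross_validate(mean_model_error, n_samples=len(y), k=10)
```

To pick the number of hidden neurons, one would call `cross_validate` once per candidate value, with an `evaluate` that trains a network of that size, and keep the candidate with the lowest average validation error.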

There are also other ways to select the parameters, like grid search and so on. I will not discuss them here, since I like to use 10-fold cross-validation most of the time.

When to stop training

When training a neural network, we will do many iterations to update the weights. But when do we decide to stop? Let me show you the following figure, and then you will know when to stop.

The green curve is the training error, which is the error that we get when we run the trained model back on the training data. The red curve is the validation error, which is the error that we get when we run the trained model on a set of data that it has never been exposed to before (this is also why this data is called validation data: it is not used in training, and we keep it for validation purposes). We can see that the green training error is constantly decreasing, but at a certain point the validation error stops decreasing and starts to increase. This usually happens when the model starts overfitting the data, which means that the model is excessively complex: it is so flexible that it starts to model the noise instead of the hidden patterns. The following is an example (figure from Wikipedia).

We can build two models to separate the green and blue dots: one model is the black line, and the other is the green curve. We can see the green curve fits the data really well; it separates the green and blue dots without any mistake! The error associated with it is zero! But which model do you think is the better model? Of course, most of us will choose the black model (if you choose the green model, I don't know what to say ...). Even though the black model makes wrong decisions for some training data points, it will perform better than the green model when applied to new data. The green model fits too much noise, and it becomes very wiggly. If we keep a validation dataset that is never used in training the model, we will find that the green model makes more wrong decisions on it, and this will show up in the validation error. Therefore, we should stop at the point where the red validation error starts to take off, shown as the black dotted line in the previous figure.
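The stopping rule can be sketched as follows. This is a minimal illustration and not the code behind the figure: it assumes we already recorded the validation error after each training iteration, and it adds a `patience` parameter (my own assumption, not discussed above) so that a single noisy bump in the validation error does not stop the training too early.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return the index of the iteration with the lowest validation error seen
    before the error failed to improve for `patience` iterations in a row."""
    best_epoch, best_error = 0, float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            # New best validation error: remember where it happened.
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` iterations: stop scanning.
            break
    return best_epoch


# Made-up validation errors: they decrease, then start to take off.
validation_errors = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.4]
stop_at = early_stopping_epoch(validation_errors, patience=3)
```

In a real training loop the same check would run after every iteration, and the weights saved at the best iteration would be the ones to keep.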

More on learning rate

We didn't talk too much about the gradient descent method before, but you can check out this awesome blog to get a better sense of it - Single-Layer Neural Networks and Gradient Descent. We did talk about the learning rate before. If you still remember, it controls how fast we learn by controlling how much we update the weights. I grabbed the following figure from that blog to show you the effect of a large and a small learning rate.

The above figure shows a simple example of the effect of using a large and a small learning rate. The horizontal axis is our weight, and the vertical axis is the cost function. We can think of this as a topographic surface in our parameter space (in this case, the weight). The gradient descent method finds the steepest downhill direction for the next step by taking the gradient of this surface, and moves in that direction. We want to search for the lowest point on this surface (finding the minimum). If we use a large learning rate, the search will bounce back and forth around the minimum. If we use a small learning rate, each step only moves the search a little, so it takes a very long time to find the lowest point, and the search can sometimes get trapped in a local minimum instead of the global minimum (as shown in the figure; we will talk more about this in the next section). The smaller learning rate is more stable. Neither a small nor a large learning rate alone gives a good training scheme, so a better way is to use both: an adaptive learning rate. This means that we start with a large learning rate and shrink it as the iterations go on. At the beginning, we use the large learning rate to do a coarse search with large steps, and when we approach the minimum, we use a smaller learning rate to do a fine search in this area.
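To make the three cases concrete, here is a small sketch on a one-weight cost function J(w) = w^2, whose gradient is 2w. Everything in it is an assumption chosen for illustration: the cost function, the learning rate values, and the decay schedule eta_t = eta_0 / (1 + decay * t), which is only one of many ways to shrink the learning rate.

```python
def gradient_descent(grad, w0, learning_rate, n_steps):
    """Run gradient descent; learning_rate is a function of the step number t."""
    w = w0
    path = [w]
    for t in range(n_steps):
        w = w - learning_rate(t) * grad(w)  # step against the gradient
        path.append(w)
    return path


def grad(w):
    """Gradient of the toy cost J(w) = w**2, whose minimum is at w = 0."""
    return 2.0 * w


# Large constant learning rate: the search jumps from one side of the
# minimum to the other on every step.
large = gradient_descent(grad, 1.0, lambda t: 1.0, 20)

# Small constant learning rate: stable, but still far from w = 0 after 20 steps.
small = gradient_descent(grad, 1.0, lambda t: 0.01, 20)

# Decaying learning rate: starts large, then shrinks with the step number.
decayed = gradient_descent(grad, 1.0, lambda t: 1.0 / (1.0 + 0.3 * t), 20)
```

Comparing the last entries of `large`, `small`, and `decayed` shows the trade-off: the large rate never settles on this cost function, the small rate creeps, and the decaying rate ends up much closer to the minimum in the same number of steps.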

Momentum

The following figure (from here) shows the complexity of the search for the global minimum. Most of the time, the problem is not as simple as the previous figures with only one minimum. Instead, we see a very hilly area that is full of different local minima. It is very easy for our search to find a local minimum and stop searching for a better one. For example, the blue ball stopped in a minimum that is not the global minimum. We can train the algorithm multiple times, starting from a different initial location each time, in the hope that one of the starts is at, or at least close to, the global minimum.

We can also try to make it less likely that the algorithm will get stuck in local minima. Looking at the above figure, the ball stops in the local minimum because it runs out of energy while rolling down. If we give the ball some weight, then when it rolls down from a higher place it will likely have enough momentum to get over a small hill on the other side of the local minimum. This idea can be implemented by using a momentum term in the update of the weights. You can check out more explanations on Quora. Now, let's take a rest and look at the following movie to get a sense of why the ball stops at some local traps, while having some fun!
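A minimal sketch of the momentum term, assuming the classical form where a velocity accumulates a fraction of the previous update: v = momentum * v - learning_rate * gradient, then w = w + v. The toy cost J(w) = w^2 and the parameter values are my own choices for illustration. The sketch only shows how the update works, not a full example of escaping a local minimum.

```python
def gradient_descent_momentum(grad, w0, learning_rate=0.1, momentum=0.9, n_steps=100):
    """Gradient descent where each update keeps part of the previous update."""
    w, velocity = w0, 0.0
    path = [w]
    for _ in range(n_steps):
        # The velocity remembers earlier steps, like a heavy ball that keeps rolling.
        velocity = momentum * velocity - learning_rate * grad(w)
        w = w + velocity
        path.append(w)
    return path


def grad(w):
    """Gradient of the toy cost J(w) = w**2."""
    return 2.0 * w


# With momentum=0 this reduces to plain gradient descent.
plain = gradient_descent_momentum(grad, 1.0, learning_rate=0.1, momentum=0.0, n_steps=2)
heavy = gradient_descent_momentum(grad, 1.0, learning_rate=0.1, momentum=0.9, n_steps=2)
```

After the second step the `heavy` path has travelled further than the `plain` one, because the second update adds 0.9 times the first update on top of the new gradient step. That carried-over motion is what can help the search get over a small hill.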

I will stop here this week. There are more details to training a good ANN, but the most important ones are here. You will meet them when you read books or tutorials, and I hope the high-level concepts I wrote about here will give you a good start.

Sunday, February 5, 2017

Summary of Entrepreneurship bootcamp

This week, I am attending a 2-day Entrepreneurship bootcamp at Berkeley. In the future I want to stay in academia, but turning some research projects into a company is an appealing idea to me. Just think: you can get funding from NSF, private foundations, industry companies, and your own company. How cool is that? Besides, I think research driven by the needs of customers will be a nice category to work on. Therefore, it is better to learn the basics now as a graduate student.

The following are the key takeaways I got from the bootcamp:

First day

The most common way a company fails is 'nobody cares'
Listen to the customers, they know better what they need, and we know better how to make it
The best team is the team in which we all trust each other.
If you want your projects to be more applied, ask clients what they need before doing the research, and create something for them.
Two things in the world: what people want, and what you can make
Ideas come from two sources: First, skills and background of the team. Second, observing change in the environment. Right skill at the right time
Adjacent possibility is some other idea from the world that you combine with yours to get a better one
If you don't tell the story, you cannot get any feedback. If you are effective in telling the story, other people will say 'that's interesting, how can I help?'. This is the way to build the team.
To make a business succeed, we need both the social and the technology parts. The social part is just as important as the technology part.
Entrepreneurs are people comfortable living in the fog of uncertainty, and brave enough to leave their comfort zones.
When you try to start a business, you need to pass 5 filters: unmet need, market size, differentiated positioning, scalable business model, why us & why now.
Value those who know different things than you, you have to learn to appreciate the new things.
Build positive momentum for your ecosystem.

Second day

When selecting team members, passion is very important.
A players attract A players, but B players attract C players.
Team with diversity is more important than uniformity.
People with major strengths tend to have major weaknesses as well, know which one you value most.
The role of a CEO is to provide a strong magnetic force to attract people, and to drive in a focused direction.
4 points of CEO: (1) lead people, (2) shape product, (3) sell the vision, (4) feed the enterprise.
Stages of raising money: seed -> angel investors -> Early stage VC -> Late stage VC -> PE or IPO.
The importance of building an effective P&L (profit and loss statement). It should be in the zone of reason, since investors will spot any issue with their pattern recognition skills.
Steps of product development: start with a product vision -> Define Minimum viable product (MVP) -> Create an MVP demo/prototype -> Alpha Test (deploy MVP at few friendly customers) -> Get customer feedback -> Productize & Test the MVP (SQA) -> Beta Test -> Get customer feedback -> Fix bugs & Final Product Test -> Production release
How to build a Minimally Viable Product (MVP)
How to eat an elephant? (One bite at a time)
Agile vs waterfall
Luck is when a prepared mind meets opportunity

Over the two days, I learned a lot from the bootcamp, especially from the examples we discussed. I think it is very informative and can be applied to many aspects of my life. I am planning to learn more from the books recommended by the instructor (the instructor is the author).