Travelling Pokemon Trainer


If you've walked outside in the past week and bumped into a person nose deep in their phone, they were probably playing Pokemon Go. As a child of the 90s I have fond memories of Red and Blue versions and the endless hours spent playing them. Naturally, I picked up Pokemon Go immediately.

For those of you living under various rocks, Pokemon Go is a mobile game which makes you explore the area around you to catch virtual Pokemon. In order to catch Pokemon you need Pokeballs. These can either be bought with real money, or you can receive them at various landmarks called "pokestops," which are littered around cities and towns. The premise is simple: you walk around your town, stopping at these landmarks to gain supplies.

This is ripe for optimization.

 

Luckily for us trainers, this problem has already been explored, only in another form. The travelling salesman problem is a key problem in optimization: a salesman must travel through a set number of cities, visiting every city while minimizing the total distance she travels. This problem is computationally very tough to solve exactly, and many algorithms have been written and tested to tackle it.

The one I focused on is a genetic algorithm. Genetic algorithms mimic natural selection in order to find a near-optimal route: candidate routes are scored, the best ones are kept, and new candidates are generated by mutating them. In my search I found a great blog post on optimal Waldo search methods that uses a genetic algorithm. There you can see Randal S. Olson using it to solve a version of the travelling salesman problem involving Where's Waldo. Naturally, I wanted to apply it to my pokestops.
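To make that concrete, here is a minimal keep-the-best-and-mutate sketch of the idea in MATLAB (the language used for the other code on this blog). To be clear, this is not Mr. Olson's code: the stop coordinates, population size, elite count, and the simple two-stop swap mutation are all made-up placeholders, and route length is just the straight-line distance between consecutive stops.

% Minimal genetic-algorithm sketch for the shortest-route problem.
% Everything below is illustrative; coordinates and parameters are made up.
stops = rand(20,2)*100;              % placeholder pokestop coordinates (x, y)
nPop  = 100;                         % candidate routes per generation
nGen  = 500;                         % generations to run

% start with random orderings of the stops
pop = zeros(nPop, size(stops,1));
for p = 1:nPop
    pop(p,:) = randperm(size(stops,1));
end

for gen = 1:nGen
    % score every route by its total straight-line length
    len = zeros(nPop,1);
    for p = 1:nPop
        pts    = stops(pop(p,:),:);
        len(p) = sum(sqrt(sum(diff(pts).^2, 2)));
    end
    % keep the best routes, then refill the population with mutated copies
    [~, order] = sort(len);
    elite = pop(order(1:10),:);
    for p = 1:nPop
        child = elite(randi(10),:);
        swap  = randperm(numel(child), 2);   % simple two-stop swap mutation
        child(swap) = child(fliplr(swap));
        pop(p,:) = child;
    end
    pop(1:10,:) = elite;                     % always carry the elite forward
end

bestRoute = pop(1,:);                % best stop ordering found

Mr. Olson's actual implementation is more refined, but the skeleton is the same: score the routes, keep the good ones, and mutate to explore.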

 

Using Mr. Olson's code and extracting all the pokestop location data from Ingress (Pokemon Go's precursor) provided me with the following path animation.

[Animation: near-optimal route through the pokestops using straight-line distances]

Quick note here: this is not the most optimal path, but it is close. Due to the nature of genetic algorithms we get a good approximation rather than a guaranteed optimum. There are a few problems with the above path, though. The most glaring is that it assumes we can walk through buildings. I am not a Haunter, so that is outta the question.

Luckily, I live in a city which uses a grid system of road planning, so I can use taxicab distance to approximate walking along the street grid. However, in the upper left hand corner of the map above is a park with a good deal of pokestops. In a park I shouldn't be constrained by buildings, so I can walk on diagonals there. I modified the code to calculate Euclidean distance for park stops and taxicab distance for everything else.
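In code the two metrics are simple: for stops at (x1, y1) and (x2, y2), taxicab distance is |x1 - x2| + |y1 - y2|, while Euclidean distance is sqrt((x1 - x2)^2 + (y1 - y2)^2). Here is a small MATLAB sketch of the mixed rule; the coordinates and the isPark flags are made-up placeholders, and treating a leg as a park leg when either endpoint is in the park is a judgment call on my part.

% Pairwise walking-distance sketch: Euclidean if a park stop is involved,
% taxicab otherwise. Coordinates and park flags are placeholders.
stops  = [0 0; 3 4; 1 7; 6 2];        % made-up stop coordinates
isPark = [false; false; true; true];  % made-up flags marking park stops

n = size(stops,1);
D = zeros(n);
for i = 1:n
    for j = 1:n
        dx = abs(stops(i,1) - stops(j,1));
        dy = abs(stops(i,2) - stops(j,2));
        if isPark(i) || isPark(j)
            D(i,j) = sqrt(dx^2 + dy^2);   % free to cut diagonals in the park
        else
            D(i,j) = dx + dy;             % stuck walking the street grid
        end
    end
end

The genetic algorithm then sums its route lengths from a matrix like D instead of from raw straight-line distances.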

 

[Animation: route using taxicab distances, with Euclidean distances inside the park]

There is still something wrong. As you can see above, we are still moving through buildings, just at right angles now. Figuring out when and how to route around corners and up and down streets is a little more of a project than I am ready to tackle, so let's let Google do that for us. Tapping into the Google Maps API, walking distances are calculated for us using the correct routing. Our final map looks like this (the plot still draws straight lines through buildings, but the distances are correct).
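For the curious, here is roughly what asking Google for a single walking distance could look like from MATLAB. This is a sketch, not the code I actually ran (that was built on Mr. Olson's Python helper): the Distance Matrix API parameters shown reflect my understanding of the public API, the coordinates are made up, and YOUR_KEY stands in for a real API key.

% Sketch: ask Google's Distance Matrix API for the walking distance between
% two stops. Coordinates are made up; replace YOUR_KEY with a real API key.
origin      = '40.7128,-74.0060';    % made-up origin (lat,long)
destination = '40.7138,-74.0050';    % made-up destination (lat,long)

resp = webread('https://maps.googleapis.com/maps/api/distancematrix/json', ...
    'origins', origin, ...
    'destinations', destination, ...
    'mode', 'walking', ...
    'key', 'YOUR_KEY');

% webread decodes the JSON response into a struct; the walking distance in
% meters should sit under rows -> elements -> distance.value, though the
% exact struct layout depends on how MATLAB decodes the response.
meters = resp.rows(1).elements(1).distance.value;

Loop that over every pair of stops and you get a distance matrix the genetic algorithm can use in place of the taxicab one.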

 

[Map: final route using Google Maps walking distances]

I guess I know which route I am going to be taking from now on.

 

Big shout out to Mr. Olson, who wrote the underlying genetic algorithm code and part of the Google Maps API code I used. His write-up on the Waldo search is excellent and clear; I highly suggest reading it.

As for next steps, there is something bothering me when I look at the above pathing. Well, a few things.

  1. It is not a closed loop: I will have to walk allllllll the way back home from wherever this route lets me off. We need to add a constraint that the last stop is the same as the first (a quick sketch of that tweak is below this list).
  2. Will my bag be full at the end of this?: I don't wanna waste time walking to more pokestops if I can't receive any more supplies.
  3. I don't have to visit all the pokestops / pokestops recharge: This is the most important one. Unlike the travelling salesman, I do not have to visit every one. I could stay in a small subsection and just repeatedly hit the same 3 stops if I wanted to.
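For that first point, the fix to the scoring is small: rate each candidate route on the full loop, including the leg from the last stop back to the first. A minimal MATLAB sketch, assuming a precomputed pairwise distance matrix D like the one sketched earlier (the function name is mine, not from the original code):

% Closed-loop route length: the usual sum of legs, plus the walk back home.
% route is an ordering of stop indices; D is a pairwise distance matrix.
function len = closedLoopLength(route, D)
    len = 0;
    for k = 1:numel(route)-1
        len = len + D(route(k), route(k+1));
    end
    len = len + D(route(end), route(1));   % the leg back to the start
end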

 

These are some important caveats to the map. We'll see if we can solve them next time on Pokemon.

-Marcello

Advanced Optimization Methods: Artificial Neural Networks Part 3

In our last part we went over the mathematical design of the neurons and the network itself. Now we are going to build our network in MATLAB and test it out on a real-world problem.

Let's say that we work in a chemical plant. We are creating some compound and we want to predict and optimize our production. The compound is synthesized in a fluidized bed reactor; for those of you without a chemical engineering background, think of a tube packed with tons of pellets, with fluid running over the pellets and reacting into a new compound. Your boss comes to you and tells you that there is too much impurity in the output stream. There are two things you can change to reduce the impurity: the catalyst amount (the pellets in our tube) and the stabilizer amount.

In the pilot scale facility, you run a few tests varying the amount of catalyst and stabilizer. You come up with the following table of your results.

Catalyst   Stabilizer   Impurity (%)
0.57       3.41         3.7
3.41       3.41         4.8
0          2            3.7
4          2            8.9
2          0            6.6
2          4            3.6
2          2            4.2

After looking at the results you decide to create a neural network to predict and optimize these values. We have two inputs, catalyst and stabilizer, and one output, impurity percent. From our last part on the structure of neural networks, that means two neurons in our input layer (one for catalyst and one for stabilizer) and one neuron in our output layer (impurity percent). That only leaves the hidden layer: since we do not expect a problem so complex it requires deep learning, we choose a single hidden layer, and we will give it 3 neurons to make the problem a little more interesting. The structure is shown below.

[Figure: network structure, 2 input neurons, 3 hidden neurons, 1 output neuron]

Now that we have the structure, let us build our network in MATLAB. The code is actually quite simple for this part. First we put our two input variables into a 7-by-2 matrix (one row per experiment). We then multiply this by the weights into the hidden layer and pass the results through our sigmoid function. Those values are multiplied by the weights into the output layer and passed through the sigmoid function again, and what comes out is our output, the impurity %. So let's see how our network performs: the vector on the left is our actual values (scaled to the max) and on the right is what our network determined.

[Figures: actual impurity values (scaled to the max) alongside the untrained network's outputs]

As you can see, the network did not guess even remotely correctly. That's because we are missing the most important part of a neural network: the training. We must train our network, by tuning its weights, to get the right predictions. In order to do this we need to do our favorite thing, optimize.

-Marcello

Here's the code:


% ANN code
% structure:
% 2 input
% 3 hidden nodes
% 1 output

%initial input data [ catalyst, stabilizer]
input_t0 = [0.57	3.41; 3.41	3.41;0	2;4	2;2	0;2	4;2	2];
%normalize input data (each column scaled by its own max)
input_t0(:,1) = input_t0(:,1)/max(input_t0(:,1));
input_t0(:,2) = input_t0(:,2)/max(input_t0(:,2));

%normalize output data
output_t0 = [3.7 4.8 3.7 8.9 6.6 3.6 4.2];
output_t0 = output_t0/max(output_t0);

%randomly assigned weights
weight_in = [.3 .6 .7;.2 .8 .5];
weight_out = [ .4 .6 .7]';

%initialize matrices
actHidSig = zeros(7,3);
actOutSig=zeros(7,1);

%find activation for hidden layer
act_hid = input_t0*weight_in ;

%apply sigmoid to hidden activation
for i = 1:7
    for j = 1:3
        actHidSig(i,j) = 1/(1+exp(-act_hid(i,j)));
    end
end

%find activation for output layer
act_out = actHidSig*weight_out;

%apply sigmoid to output activation
for i = 1:7
    
        actOutSig(i) = 1/(1+exp(-act_out(i)));
    
end

%show results: actual (scaled) impurities next to the untrained network's outputs
output_t0'
actOutSig

Advanced Optimization Methods: Artificial Neural Network Part 2

The previous part was a brief overview of how to construct a neural network. In this part we will go in depth and actually do a little math to create these networks. Just a quick reminder from part 1: our general network contains three layers, one input layer, one hidden layer, and one output layer. For this example I will try to maintain generality. We start with an input layer neuron.

Input layer neurons (ILN) are the simplest (and the most boring) neurons of the system. ILN do not accept any inputs, as they are the inputs. Their job is simply to pass the inputs to the hidden layer. There is exactly one input layer for any given neural network (as far as I know). Each variable we are looking at gets its own input layer neuron: if we are looking at 3 variables we would have 3 ILN, if we had 5 variables we would have 5 ILN, 100 variables, 100 ILN, and so on. They do not modify or weigh the inputs, nor do they apply the activation function before passing the inputs on. The next layer is the hidden layer, which is much more interesting.

[Figure: artificial neuron model (from Wikipedia)]

Above is a schematic of our hidden layer neuron (HLN). The x values here come from our input layer (or a previous hidden layer). These values are weighed and summed, giving us a new value:

z_j = Σ_i w_ij x_i

This new z value is passed to our activation function. Last post we skipped over the activation function, but it deserves some attention now. The point of the activation function is to introduce non-linearity into our method. For this neural network I have chosen the sigmoid function. Why this function? For one, it is real-valued and differentiable. Second, it has an interesting property: the derivative of the sigmoid function is equal to the sigmoid function times one minus the sigmoid function, which makes the derivative computationally easy. Mostly, though, the sigmoid function is used for the non-linearity I mentioned before. Below are the curve (from Wikipedia) as well as the equation and the derivative property.

[Figure: the logistic (sigmoid) curve (from Wikipedia)]

σ(z) = 1 / (1 + e^(-z)),    σ'(z) = σ(z) (1 - σ(z))
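Because that derivative property is what makes the sigmoid so cheap to work with later, here is a quick MATLAB sanity check of it; the test point z = 0.3 is arbitrary.

% Numerical check that d/dz sigmoid(z) equals sigmoid(z)*(1 - sigmoid(z)).
sig = @(z) 1./(1 + exp(-z));

z = 0.3;                                 % arbitrary test point
h = 1e-6;
numericDeriv  = (sig(z + h) - sig(z - h)) / (2*h);   % central-difference estimate
propertyDeriv = sig(z) * (1 - sig(z));               % the claimed closed form

disp([numericDeriv, propertyDeriv])      % the two numbers should agree closely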

Using this activation function, we take our z value, the summation of our newly weighed inputs, and put it through the sigmoid function. The result of the activation function becomes the input for the next layer, which here is our output layer. However, before we move on to an output neuron we have to go over how to determine how many hidden layers we need and how many neurons go in each of those layers. For most applications one hidden layer is appropriate; adding more layers tends to give only marginally better results. There are a good number of differing opinions on how many hidden layers are needed, and even more confusion about how many hidden neurons those layers should contain. A quick heuristic is that the number of hidden neurons should be somewhere between the number of input neurons and the number of output neurons, but this is just a rule of thumb and should not be taken as hard and fast. Really, the best way is to test out your network and "prune" out the neurons that aren't contributing. Before we get to that, we have to look at our last layer, the output layer.

An output layer neuron (OLN) is just like a hidden layer neuron, except that its output is the answer or prediction we have been looking for. The output layer takes all the results of the activation functions from the previous layer and weighs them, sums the weighted values, and puts the sum through our sigmoid function again. Once it passes through, our method is complete. Like the input layer, there is only one output layer. The number of neurons in this layer depends on the problem: if we are trying to predict a single value only one neuron is needed, but if we are trying to classify a data point we may need more than one neuron.

One big thing we have overlooked so far is why and how ANNs actually work: we have to fine-tune the weights for each neuron (the w_ij values). By changing and optimizing these values we can make the network give us the information we want and need. This is called training our network, and we will go over it in the next part.

-Marcello

Advanced Optimization Methods: Artificial Neural Networks Part 1

If you've been following along, we've finally got to the fun stuff. The first "advanced" topic we will explore is Artificial Neural Networks (ANN). An ANN is a simplified model of a network of "neurons" like the ones that make up our brains. Each neuron takes in information and transmits it to another neuron, which in turn analyzes (weighs) and modifies the input and sends it on to the next. Well, how does this help us optimize? Neural networks let us predict and optimize problems at greater speed and with greater accuracy than many other methods, at least after they are trained, but more on that later. First let's look at an individual neuron, then see how it interacts in a network.

[Figure: diagram of a biological neuron (from Wikipedia)]

Above is a diagram of a neuron (from Wikipedia), the kind actually in your brain. Our artificial neurons look very similar to the one above. Not all the structures labeled in the picture are important here; the two we are concerned with are the dendrites and the axon, and take quick note that both of these structures have branches. A brief tangent from when I used to work as a neuroscientist at the Motor Neuron Center: neurons work by transmitting signals through their axons and receiving signals through their dendrites. Neurons can receive tons of connections from other neurons through their dendrites, but they have only one outgoing connection, through their axon. This is a very simplified explanation, but nevertheless important. Our artificial neurons will work in pretty much the same way.

[Figure: artificial neuron model (from Wikipedia)]

Above is a diagram of an artificial neuron (from Wikipedia), the kind we will use in our method. Unlike the real neuron diagram, all the labels are important here. We start with the axons from other neurons (the inputs), move to the dendrites (the weights), into the cell body (the transfer function and activation function), and then leave the neuron toward the next one through the axon (the activation). So we have an idea of how data moves through a neuron, but what does any of this actually mean?

So imagine we have the neuron pictured above, with n axons attached to it. Each axon transmits a number (x_n) to this neuron. The neuron weighs each of these numbers by multiplying it by its respective weight (w_nj). The newly weighted numbers are then summed at the transfer function step, and this new number is passed to our activation function. This is where the power of neural networks comes from: different activation functions can be used to achieve more accurate methods than their linearly bound counterparts. After the weighted sum is passed through our chosen activation function, we pass the resulting value on to the next neurons and the process repeats. This leads us to the network part of neural networks.
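As a tiny MATLAB sketch of that single-neuron step (the input values and weights are made up, and I am using the sigmoid, the activation function this series settles on in the later parts, purely as an example):

% One artificial neuron: weigh the inputs, sum them, apply the activation.
x = [0.5; 1.2; -0.3];          % x_n: the numbers arriving on the incoming axons
w = [0.4; 0.1;  0.7];          % w_nj: the weight attached to each input

z = sum(w .* x);               % transfer function step: the weighted sum
a = 1/(1 + exp(-z));           % activation (sigmoid); a is passed on to the next neurons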

[Figure: a simple feed-forward neural network with input, hidden, and output layers (from Wikipedia)]

Above is an example of a very simple neural network. The network consists of three parts or layers: an input layer, a hidden layer, and an output layer. Each layer can be made up of as few or as many neurons as the user designs. It must also be noted that the hidden portion is not restricted to a single column of neurons; there can be 3, 4, or 100 hidden layers depending on the design and the neurons required, but these must be fixed up front, before any optimization takes place. Another thing to note is that all connections point forward. This is called a feed-forward design, and this limitation becomes important when we look at the optimization portion, but that comes later. Each neuron outputs to every neuron in the next layer and receives inputs from every neuron in the previous layer (see the sketch below).
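Here is a minimal MATLAB sketch of what that fully connected, feed-forward wiring means for a single pass through the network. The layer sizes and random weights are placeholders rather than a network from these posts, and I use the same inputs-as-rows convention as the part 3 code.

% Feed-forward pass through a tiny fully connected network.
% Layer sizes and weights are placeholders for illustration.
nIn = 3; nHid = 4; nOut = 2;

x     = rand(1, nIn);          % the input layer just holds the input values
W_hid = rand(nIn, nHid);       % every input neuron feeds every hidden neuron
W_out = rand(nHid, nOut);      % every hidden neuron feeds every output neuron

sig = @(z) 1./(1 + exp(-z));   % sigmoid activation

hidden = sig(x * W_hid);       % hidden layer: weighted sums of the inputs, then activation
output = sig(hidden * W_out);  % output layer: weighted sums of the hidden activations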

That’s a general overview of how feed-forward artificial neural networks work. Next part we will derive some equations and make a general outline of how to create an ANN.

-Marcello