FOLLOW THE MONEY: FEDERAL LEGISLATURE PART 4

A quick refresher for those just joining us. I took campaign donation data from followthemoney.org. This website makes campaign donations very easy to parse and work with. I gathered the data for all campaign donations to either Senators or Congressmen, regardless of whether they were elected or not. With this data I was able to see patterns with regard to political parties, candidates’ offices, and more. In this part we will take a look at how each state compares to the others. First let’s take a look at overall donations for 2014.

2014 Campaign Donations to Legislators: Grand Total

Don’t try to pull too many grand conclusions from the above graph. Like I mentioned when talking about winners and losers in elections, donations per candidate (or here, per capita) give more insight. The above graph is basically a population map: the more populated states show up in a darker green than the less populated states. This gives an unfair advantage to states like California and New York. People in less populated states have to donate more per person to match the totals of more populated states. So in order to get a fairer comparison we need to normalize our donations. I have calculated donations per capita for each state.
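The normalization itself is simple once state populations are joined in; here is a quick sketch of the calculation with placeholder numbers (the donation totals below are made up, and the populations are only rough 2014 figures):

import pandas as pd

# Placeholder numbers, just to show the shape of the calculation.
df = pd.DataFrame({
    "state": ["CA", "NY", "AK", "NH"],
    "total_donations": [1.0e8, 8.0e7, 9.0e6, 1.2e7],   # 2014 totals (made up)
    "population": [38.8e6, 19.7e6, 0.74e6, 1.33e6],    # rough 2014 populations
})
df["donations_per_capita"] = df["total_donations"] / df["population"]
print(df.sort_values("donations_per_capita", ascending=False))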

2014 Campaign Donations to Legislators: Per Capita

That’s much better. As you can see, the map is wildly different and does not resemble a population map in any way. States like NY, NJ, MA, and CA are no longer top tier, but rather toward the bottom. Interestingly enough, states with fewer people seem to have much greater donations per person; Alaska is a notable example. Why do these states get far more contributions per person than others? One possible explanation is that some of these states are swing states. Swing states (like New Hampshire above) are very closely divided between the Republicans and the Democrats. These states should naturally garner more donations, as the races should be more exciting and volatile. Speaking of parties, which states gave more to the Democrats and which gave more to the Republicans?

The Elephants and the Donkeys

Nothing too surprising here. Most Republican states have more donations toward Republican candidates, and the same goes for Democratic states. However, there are a few confused states. Arizona, Colorado, and New Mexico are generally considered Republican states, but the Democrats raised a lot more money there. The opposite goes for Wisconsin, Michigan, and Pennsylvania, typically Democratic states. This map reinforces some geographical trends: the Northeast and the West Coast are the usual Democratic strongholds.

A quick word on the interactive graphs above. These graphs were made using Plotly and Python. Plotly makes it very easy to make d3.js-style graphs and interactive web apps. Recently Plotly went open source, which is great news for all of us. If you are looking to quickly make interactive graphs, Plotly should be your first stop (unless you are really good with d3). This ends the exploratory portion of Follow The Money; next up is the final report. Enjoy the interactive maps!
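If you want to roll your own, here is a minimal sketch of this kind of state choropleth using Plotly's graph objects interface (the numbers below are placeholders, not the real data set):

import plotly.graph_objects as go

# Placeholder data: two-letter state codes and donations per capita.
states = ["CA", "NY", "AK", "NH"]
per_capita = [2.5, 4.1, 12.3, 9.8]

fig = go.Figure(go.Choropleth(
    locations=states,           # state abbreviations
    locationmode="USA-states",  # interpret locations as US state codes
    z=per_capita,               # value that drives the color scale
    colorscale="Greens",
    colorbar_title="$ per capita",
))
fig.update_layout(title_text="2014 Donations to Legislators per Capita",
                  geo_scope="usa")  # limit the map to the United States
fig.show()

Swap in the real per-capita series and you essentially have the maps above.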

-Marcello


Journal Club: Week of 11/20/2015

A Few Useful Things to Know about Machine Learning
Pedro Domingos
Department of Computer Science and Engineering, University of Washington

This was an excellent read. I highly suggest this paper to novices and experts alike. Pedro goes through all the mysticism and what he calls “folk knowledge” in this paper, knowledge that would take years of machine learning practice to uncover. Pedro breaks machine learning down into simple concepts and shows the reader how to deal with them. Do not be mistaken, this is not a tutorial. You will not learn any new algorithms or applications; you will only learn how to better use the ones you know. That being said, I believe it is best to go into this paper with a little background so you are not lost by what Pedro is explaining.

Pedro explores major pitfalls of people who are first learning machine learning as well as seasoned pros. I particularly liked his section on overfitting and the section on how to approach problems. “Start simple first” is a common piece of advice, but Pedro backs it up with examples and graphs showing how different methods perform. His advice on more data versus a cleverer model is invaluable. I highly suggest reading this paper; it is a quick and powerful read.

PLS-regression: a Basic Tool of Chemometrics
Svante Wold, Michael Sjostrom, Lennart Eriksson
Institute of Chemistry Umea University

Another paper on PLS, this one a little more current and a little more practical. Like Geladi’s paper on PLS, this paper goes in depth with PLS within the scope of chemistry and engineering, so it’s right up my alley. After reading it, not all of my questions were answered, but I felt like I had a better grasp on the algorithm. One thing I really liked about this paper was its treatment of diagnostics and interpretation.

The paper is structured around an amino acid example. This serves as a good basis and testing ground, as the authors provide the raw data for anyone to test on. The power of this paper is in the last couple of sections, where the authors guide the reader through each step of interpreting the results, from initial results to the essential plots. Each plot gets its own subsection; however, they are not all given the same importance. The explanations for some of them are very brief, restricted to only one or two paragraphs.

If you are only going to read one section of this paper, flip to the second-to-last page and read “Summary; How to develop and interpret a PLSR model.” Here the authors give a very quick overview which will get you on your feet and give you a basic understanding of what is going on. It makes a good reference as well.

-Marcello

FOLLOW THE MONEY: FEDERAL LEGISLATURE PART 3

Last time I took a quick look at candidate donations limited to New Jersey; now I’ve moved nationwide. Let’s see if the trends we saw in New Jersey were typical of the whole nation or just Jersey. I restricted the data to just 2014 to make it a little more manageable. As always, let’s look at Dems versus Repubs.

Donations to federal candidates by party, all states (2014)

Here we see the party breakdown, along with the elusive third parties. If it wasn’t obvious already, the de facto two-party system completely eclipses all third-party hopes: Dems and Repubs trump the cumulative third-party total by an order of magnitude. Moreover, Republican candidates across the nation raised more money than their Democratic counterparts. This caught me by surprise, as I thought the totals would lean a little Democratic, or at least be more or less even. Let’s take a peek at the office breakdown.

2014 was a big election year for the House, and a lesser year for the Senate. My prediction would put House campaign donations way ahead of the Senate.

Donations to federal candidates by office, all states (2014)

Yup, that looks about right. Not as big a spread as I would have guessed, but this follows from the year’s context. One thing to note: with this dataset I kept all candidates, even if they lost. This should give a more complete look at ALL donations to candidates, not just the ones who were elected. So I wonder who raised more, the winners or the losers?

Total donations: winning vs. losing candidates (2014)

The above graph is misleading. You may want to say that people who won their elections raised more money, and you would be right if you looked at it cumulatively. However, to get anything meaningful out of this graph we need to look at donations per candidate. It could be that there are simply more candidates who won than lost, leading to the spread.
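That per-candidate comparison is just a groupby over win/loss status; here is a rough sketch of the idea (hypothetical column names and made-up numbers, not the exact followthemoney.org fields):

import pandas as pd

# Hypothetical candidate-level table: one row per candidate.
candidates = pd.DataFrame({
    "candidate": ["A", "B", "C", "D"],
    "status": ["Won", "Lost", "Lost", "Won"],
    "total_raised": [2500000, 400000, 650000, 1900000],
})

per_candidate = candidates.groupby("status")["total_raised"].agg(["count", "sum", "mean"])
print(per_candidate)  # the "mean" column is the per-candidate figure plotted below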

Donations per candidate: winners vs. losers (2014)

Now this is surprising: even per candidate, the politicians who were elected raised almost five times as much as those who lost. Out of the 1415 candidates, 936 of them lost and 474 of them won; only 3 withdrew and 2 were “unknown”. Finally, let’s look at the industries again.

Donations to federal candidates by industry, all states (2014)

 

Here we see uncoded donations eclipsing the rest of the industries, per usual. As a reminder, Uncoded actually includes PAC donations as well as individual donations. This is why uncoded always comes in as the largest category.

On a federal level, it looks like New Jersey is pretty much in line with the rest of the states. However, the whole point of getting data for every state is to be able to compare them. Stay tuned for part 4.

 

-Marcello

P.S. Here’s a preview.

Preview: state-by-state donation map

 

Journal Club: Week of 11/13/2015

Got two more for you this week: one on machine learning and the other on multivariate analysis. Check them out.

Supervised Machine Learning: A Review of Classification Techniques
By S.B. Kotsiantis
University of Peloponnese (2007)

This paper serves as a review of a subset of supervised machine learning algorithms, with a focus on classification. Because of the vast number of algorithms out there, the author breaks the paper down by key features of the algorithms. First the author gives a brief overview of machine learning in general: why and how it is used. What I liked most about this paper is that even before any algorithms are mentioned, the author talks about general issues with classifiers and algorithm selection. This prepares the reader and removes the notion of a “silver bullet” algorithm.
The article is well organized. Kotsiantis starts with the most intuitive of machine learning algorithms, decision trees, and works his way up to newer and more recent (well, for 2007 at least) techniques. Each section covers a multitude of techniques under its subheading; for example, Statistical Learning contains Naïve Bayes and Bayesian Networks. I liked this organization, as it guides the reader into more complex techniques. One thing the paper lacks is depth. Most techniques are rushed over and not fully explained, but this paper’s purpose is not to outline precise steps for implementing each technique; rather, it familiarizes the reader with the existence of certain techniques.
Another criticism I have of the paper is that it feels a little dated. This is no fault of the author, of course, but nevertheless a more recent review may be worthwhile to follow up with. There is a table in the paper comparing the different techniques in terms of speed, tolerance, and other parameters, which is very useful. However, it might need to be checked for accuracy, as it might be outdated.

Partial Least Squares Regression: A Tutorial
By Paul Geladi and Bruce R Kowalski
Analytica Chimica Acta, 185 (1986) 1-17

Here is an oldie but a goodie. When I was first learning about Partial Least Squares (PLS, sometimes called projection onto latent structures), there was a vast number of papers, but none really drove the point home for me. I went back to one paper that was constantly being cited, this paper from 1986. It provides a very clear tutorial on how to get PLS up and running, and it assumes you have an understanding of linear algebra. Starting with data preprocessing, the paper states what form your data needs to be in and how to get it into that form.
The paper takes a detour, however. It first goes over existing methods like multiple linear regression and principal component regression before it begins to explain PLS. This was good and bad for me, as I was solely interested in PLS; nevertheless, the other tutorials gave insight and quick, rudimentary ways of using other regression methods. However, I was here for the PLS. After the detour, the paper dives into building the PLS model. Take care reading this section, as the explanation is sparse. Overall, it’s not the best tutorial, but it has two invaluable takeaways. Figure 9 in the paper shows a geometrical representation of all the inputs and outputs the PLS model uses; it shows exactly the dimensions of each and how they relate to each other.
The other is the sample PLS algorithm. In the appendix of the paper there is an almost pseudocode-like description of the PLS algorithm. Using this, I was able to get a PLS program up and running in less than an hour. The algorithm clearly shows every step that must be taken and exactly how to do it. This is the main reason why I would recommend this paper. There are others out there that explain PLSR better, but this paper allows for a rapid implementation of PLS.
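To give a flavor of what that kind of implementation looks like, here is a rough NumPy sketch of a NIPALS-style PLS2 loop. This is my own translation of the general approach, not the authors' appendix code, and it assumes X and Y are already mean-centered (and scaled) as the paper stresses up front:

import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Rough NIPALS-style PLS2 sketch. X (n x p) and Y (n x q), mean-centered."""
    X, Y = X.astype(float).copy(), Y.astype(float).copy()
    T, P, W, Q, b = [], [], [], [], []
    for _ in range(n_components):
        u = Y[:, [0]]                            # start u from a column of Y
        for _ in range(max_iter):
            w = X.T @ u / (u.T @ u)              # X weights
            w /= np.linalg.norm(w)
            t = X @ w                            # X scores
            q = Y.T @ t / (t.T @ t)              # Y weights/loadings
            q /= np.linalg.norm(q)
            u_new = Y @ q                        # Y scores
            if np.linalg.norm(u_new - u) < tol:
                u = u_new
                break
            u = u_new
        p = X.T @ t / (t.T @ t)                  # X loadings
        b.append((u.T @ t / (t.T @ t)).item())   # inner-relation coefficient
        X -= t @ p.T                             # deflate X
        Y -= b[-1] * (t @ q.T)                   # deflate Y
        T.append(t); P.append(p); W.append(w); Q.append(q)
    return np.hstack(T), np.hstack(P), np.hstack(W), np.hstack(Q), np.array(b)

The point is less the exact code and more how short the whole thing is once a recipe like the paper's is in front of you.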

-Marcello

Follow The Money: Federal Legislature Part 2

Last part we took a look at campaign donations to New Jersey state legislators. Now we are moving on up to the US House and Senate. The stakes are a little higher, the politicians have more power, and hopefully the campaign donations are bigger. Luckily for me, we have Followthemoney.org on our side.

All the data for the following graphs was collected using followthemoney.org’s API. This made it easy to tabulate and graph all the recorded donations. First up is Democrats vs. Republicans.

Donations to New Jersey’s federal legislators by party and year

This follows the state legislature pretty closely. Democrats stomp Republicans in terms of donations; however, this may be due to our data source rather than reality. 2014 and 2010 show close donation totals, while 2012 shows a blowout. 2013 seems to be completely missing Republican data. That, or only Democrats won.

One important qualification to make about this data set is that it only represents donations to candidates who won their elections. We need context for 2013: since it is an off-year election, there must be some special circumstance. Luckily, Wikipedia is here to help out. Apparently during this time a senator sadly passed away and a special election was held. As we suspected, a Democratic candidate won. This may have contributed to the lopsided data. Now let’s see if office matters at all.

Donations to New Jersey’s federal legislators by office and year

Depending on the year, it looks like office matters quite a bit. The special Senate election in 2013 influenced all campaign spending that year. 2010 was similar to 2013, but completely dominated by House campaign donations. As you probably know, House seats are up every 2 years. In the data above, House donations are all in the same range except in 2013, where there was no regular election. Senate elections, on the other hand, also happen every 2 years, but only 1/3 of the seats are up each cycle. New Jersey Senators were up for reelection in both 2012 and 2014 but not in 2010, explaining the lack of donations. Finally, let’s look at industry donations in 2012.

Donations to New Jersey’s federal legislators by industry (2012)

Here we see uncoded donations eclipsing the rest of the industries. After seeing uncoded in part 1, I investigated. Uncoded actually includes PAC donations as well as individual donations, which is why uncoded always comes in as the largest category. I did some quick calculations to see what percentage was from individuals like you and me and what percentage came from corporations and other PACs.

Individual: $14,760,750.00
Non-Individual: $1,412,439.00
Grand Total: $16,173,189.00
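For what it’s worth, that split works out to roughly 91% individual and 9% non-individual; the arithmetic is a one-liner:

individual = 14760750.00
non_individual = 1412439.00
grand_total = individual + non_individual  # 16,173,189.00

print("Individual:     {:.1%}".format(individual / grand_total))      # ~91.3%
print("Non-individual: {:.1%}".format(non_individual / grand_total))  # ~8.7%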

Overwhelmingly, the donations stemmed from individuals. That is super surprising to me. There are a lot more visualizations I can do with this data, but before that, we have to go nationwide.

-Marcello

Find the data here: NJfedDon

Follow The Money: State Legislature Part 1

The 2016 election is rapidly approaching, and one of the major issues of this year’s race is campaign finance reform. I am not big into politics, but I am well aware of the Citizens United vs. FEC ruling. One thing I do not know, however, is on what scale politicians actually receive donations. I set out to see how much an average senator or congressman actually receives in a given year.

My intuition led me to believe that these men and women were pulling in millions of dollars each year in donations, but that may be based on watching a little too much House of Cards. The first thing I needed was the data. Luckily for me, all politicians at the state level are required to file info on their finances. Even luckier for me, there is an amazing website that databases it all and has an easy-to-use API:

Followthemoney.org

First I wanted to start at the state level, looking at state senators and assemblymen. My guess was that these people were not pulling in the big bucks when it came to campaign donations. I downloaded a data set from Follow the Money which contained records of donations to lawmakers in the state of New Jersey. From there I cleaned it up and visualized it. Heads up: lots of bar charts coming!

With New Jersey leaning Democratic, I expect the Democrats to pull in a little more money than the Republicans.

Donations to New Jersey state legislators by party and year

WOW, that’s a big difference. However, there seems to be an issue: our data doesn’t look complete. Look at 2012 and 2014; there is missing data for both parties, and the total amount is lower than it was back in 1997. Knowing that this data might be incomplete, all analysis must be taken with a grain of salt. Let’s move on to Senate vs. House.

Senators in New Jersey serve one two-year term and two four-year terms every ten years, which is considered a 2-4-4 term system. This means that this year all the State Senate seats are up. This makes me question the data even more, as 2015 is relatively low compared to, say, 2011, another year where all seats were up. State Assembly members serve 2 years. I have two conflicting trains of thought. One is that Senators will receive more donations, as the contributor gets more bang for their buck, to put it bluntly. The other is that Assemblymen get more donations, as they are up for election more frequently and constantly need to replenish the war chest. Let’s see.

Donations to New Jersey state legislators by office and year

Looks like Senators outdo Assemblymen. Look at 2011: that year all State Senate seats were up for election, and a grand total of around 31 million dollars was raised. That’s pretty impressive, but where is all this money coming from? Let’s take a look. I’m going to stick to 2011, as it seems to be the most complete of all the years.

Donations to New Jersey state legislators by industry (2011)

And our winner is Uncoded, with a distant second, Unitemized contributions. What does this mean? According to followthemoney.org, unitemized contributions are donations that are under the reportable limit; they are aggregated and listed under this heading. For New Jersey, the limit is $300 from an individual. As for uncoded, this money can come from various industries or, most prominently, previous years. Uncoded gives an idea of how much these politicians have stocked up in the war chest.

As for the others, General Trade Unions comes in third and Lawyers & Lobbyists fourth, at around half of General Trade Unions. This is interesting, as my previous beliefs about donations were based on big conglomerates or super PACs donating massive amounts of money, not general trade unions. Nevertheless, this is the state level; maybe when we look at the federal level there will be much more, for lack of a better word, interesting donors.

Pretty interesting. If you want to take a look at the data set yourself, I’ve included it here: NJlegDon. My code is copied below if you want to check it out (very unoptimized and also in Python!).

-Marcello

 

"""

@author: Marcello

Campaign Donation NJ totals
data sourced from: followthemoney.org

goal of program is to break down campaign donations to NJ Senators and
Congressmen who are currently in office.
"""
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

def summarytablegen(variable_list, varname):
    # Build a summary table: one row per value of `varname`,
    # with total donations for each election year plus a grand total.
    index = np.arange(len(variable_list))
    donation_summary2 = pd.DataFrame(columns=columns_4_summary, index=index)
    donation_total = {}
    total_year = 0
    i = 0
    for variable in variable_list:
        for year in election_year:
            df1 = df.loc[(df['Election_Year'] == year) & (df[varname] == variable)]
            total_donation = df1['Total_$'].sum()
            donation_total[str(year)] = total_donation
            total_year = total_year + total_donation
        donation_total['Variable'] = variable
        donation_total['Grand Total'] = total_year
        donation_summary2.loc[i] = pd.Series(donation_total)
        i = i + 1
        donation_total = {}
        total_year = 0
    return donation_summary2

# data preprocessing, removing unnecessary columns

df = pd.DataFrame.from_csv('NJlegDon.csv')

df=df.reset_index()

column_names = df.columns.values.tolist()

columns_to_drop = ['request','Election_Year:token','Election_Year:id','Lawmaker:token',
'Office:token','Office:id','General_Office:token','General_Office:id',
'General_Party:token','General_Party:id','Contributor:token','Type_of_Contributor:token',
'Type_of_Contributor:id','General_Industry:token','Broad_Sector:token','In-Jurisdiction:token',
'In-Jurisdiction:id','#_of_Records']

df = df.drop(columns_to_drop, 1)

# drop all negative donations

df = df[df['Total_$'] >= 0]

# %% find total donations for each candidate by year
Lawmaker_Id = list(set(df['Lawmaker'].tolist()))
election_year = list(set(df['Election_Year'].tolist()))
Industry = list(set(df['General_Industry'].tolist()))
party = list(set(df['General_Party'].tolist()))
office= list(set(df['General_Office'].tolist()))

str1 =','.join(str(e) for e in election_year)
str1=str1.split(',')

columns_4_summary = ['Variable','Grand Total']
columns_4_summary.extend(str1)


dflawmaker=summarytablegen(Lawmaker_Id,'Lawmaker')
dfIndustry=summarytablegen(Industry,'General_Industry')
dfparty = summarytablegen(party,'General_Party')
dfoffice = summarytablegen(office,'General_Office')


 
#%% 
# breakdown by party 

party = pd.melt(dfparty, id_vars=['Variable'], value_vars=str1,var_name='year', value_name='Donations')

colors = ["windows blue", "red"]
ax = sns.barplot(x="year", y="Donations",hue="Variable", data=party,palette=sns.xkcd_palette(colors))
ax.set( ylabel='Donation Total')

Journal Club: Week of 11/06/2015

Welcome to the first week of my journal club! I’ve gotten into the habit of looking for new and exciting papers to read. I made it a goal to read at least one new paper each week, and I thought I’d share. The subject matter is not only optimization but also various data analysis techniques, machine learning, multivariate controls, and other topics. Most of these papers are available online for free.


Multivariate Analysis
by Herve Abdi
University of Texas at Dallas


This paper serves as what is best described as a catalog of multivariate techniques. It is by no means exhaustive, but it contains a decent number of techniques and brief blurbs on usage and statistical methodology. The paper is organized by the amount and format of one’s data. First the author looks at techniques focused around one data set, then he expands to two data sets. The two-data-set section is split into two categories: the first assumes that one data set is trying to predict the other, the second assumes that they are just different sets of variables. I like this organization, as it achieves the author’s goal of a catalog. When I am looking for possible techniques, I can first look toward my data to rapidly see which techniques are not suitable.
When talking about the statistical techniques, the author goes into just enough detail; the reviews rarely go above 2-3 paragraphs. One major criticism I have of this review is that it may be too brief. The author goes over how the techniques work, but barely touches upon possible issues or, more importantly, the most appropriate usage. The author limits his discussion of usage to at most 1 or 2 sentences. However, this may be by design, as the author mentions up front in the abstract that “choice of the proper technique for a given problem is often difficult.” Nevertheless, this article works as a jumping-off point. If one desires to know more about the technique in question, this paper at least gives a basis to expand upon.
Overall the paper is short, but it provides enough insight for a reader to begin exploring possible options for multivariate analysis.


The paper can be found here.


Principal Component Analysis: Concept, Geometrical Interpretation, Mathematical Background, Algorithms, History, Practice
by K.H. Esbensen and  P. Geladi


This chapter by Esbensen and Geladi fully guides the reader through the ins and outs of Principal Component Analysis (PCA). Because PCA is a basis and starting point for many multivariate methods, one needs a strong fundamental understanding of it, and this chapter provides that and more. The chapter uses a geometrical interpretation of PCA, which helps the reader better understand what the algorithm does to decompose a series of variables and observations. Out of all the papers I’ve read on PCA, this chapter helped me the most.
This chapter includes an abundance of diagrams which step the reader through all the projections PCA makes to our data set. Esbensen and Geladi take a matrix of variables and observations, X, represented as a rectangle, and run it through the PCA algorithm. This algorithm decomposes the matrix X into the loading matrix, P, and the score matrix, T. This is represented as the rectangle X decomposing into two smaller rectangles, T and P. They then go on to represent the “master equation” of PCA in the same way, which allows the reader to quickly grasp how PCA works visually. This is reinforced in the next section, where PCA is represented as a change of coordinate axes. Finally, if geometric interpretations are not your thing, the authors include a simple algebraic approach, which leads into an algorithm for PCA. The algorithm is laid out briefly by the authors. I would have liked a more thorough step-by-step guide, but it is satisfactory and enough to get a basic PCA program up and running.
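For reference, that master equation is, in the usual notation and with X mean-centered, just the decomposition of the data into scores, loadings, and a residual:

X = T P^{\top} + E

Here T holds the scores, P the loadings, and E whatever variation the retained components do not capture.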
Finally, the chapter ends with an example and limitations. The example shows sample outputs and interpretation of the data, which I found very beneficial. However, the example section is the weakest part of the chapter. I would have liked the authors to go into more detail on the analysis and what conclusions you can make; these are only briefly addressed (they may be contained in another section or chapter). Another thing: I would have liked the actual data set to play around with, but I realize this was probably an excerpted chapter and shortened for space. Nevertheless, this paper is my go-to for any questions or issues with PCA, just not for analysis of the results.


Check out these two articles for an intro to multivariate analysis!
Two new articles next week!
-Marcello

Advanced Optimization Methods: Artificial Neural Networks Part 3

In our last part we went over the mathematical design of the neurons and the network itself. Now we are going to build our network in MATLAB and test it out on a real-world problem.

Let’s say that we work in a chemical plant. We are creating some compound and we want to anticipate and optimize our production. The compound is synthesized in a fluidized bed reactor. For those of you without a chemical engineering background, think of a tube that contains tons of pellets; fluid runs over these pellets and turns into a new compound. Your boss comes to you and tells you that there is too much impurity in the output stream. There are two things you can change to reduce the impurity: the catalyst amount (the pellets in our tube) and the stabilizer amount.

In the pilot scale facility, you run a few tests varying the amount of catalyst and stabilizer. You come up with the following table of your results.

Catalyst Stabilizer Impurities %
0.57 3.41 3.7
3.41 3.41 4.8
0 2 3.7
4 2 8.9
2 0 6.6
2 4 3.6
2 2 4.2

 

 

After looking at the results you decide to create a neural network to predict and optimize these values. As we know, we have two inputs, catalyst and stabilizer, and one output, impurity percent. From our last part on structures of neural networks, we decided that we need two neurons in our input layer (one for catalyst and one for stabilizer) and one neuron in our output layer (impurity percent). That only leaves our hidden layer; since we do not expect a complex, difficult problem that requires deep learning, we choose only one hidden layer. As for neurons, we will choose 3 to make the problem a little more interesting. The structure is shown below.

Network structure: 2 input neurons, 3 hidden neurons, 1 output neuron

Now that we have the structure, let us build our network in MATLAB. The code is actually quite simple for this part. First we put our two input variables in a 7-by-2 matrix (one row per experiment). We then multiply these by the weights of our hidden layer and pass them through our sigmoid function. These values are then multiplied by the weights of the output layer and passed through the sigmoid function again. What comes out is our output, impurity %. So let’s see how our network performs: the vector on the left is our actual values (scaled to the max) and on the right is what our network predicted.

Actual impurity values (scaled, left) vs. untrained network output (right)

As you can see, the network did not guess even remotely correctly. Well, we are missing the most important part of the neural network: the training. We must train our network to get the right predictions, and in order to do that we need to do our favorite thing, optimize.

-Marcello

Here’s the code:


% ANN code
% structure:
% 2 input
% 3 hidden nodes
% 1 output

%initial input data [ catalyst, stabilizer]
input_t0 = [0.57	3.41; 3.41	3.41;0	2;4	2;2	0;2	4;2	2];
%normalize input data
input_t0(:,1) = input_t0(:,1)/max(input_t0(:,1));
input_t0(:,2) = input_t0(:,2)/max(input_t0(:,2));

%normalize output data
output_t0 = [3.7 4.8 3.7 8.9 6.6 3.6 4.2];
output_t0 = output_t0/max(output_t0);

%randomly assigned weights
weight_in = [.3 .6 .7;.2 .8 .5];
weight_out = [ .4 .6 .7]';

%initialize matrices
actHidSig = zeros(7,3);
actOutSig=zeros(7,1);

%find activation for hidden layer
act_hid = input_t0*weight_in ;

%apply sigmoid to hidden activation
for i = 1:7
    for j = 1:3
        actHidSig(i,j) = 1/(1+exp(-act_hid(i,j)));
    end
end

%find activation for output layer
act_out = actHidSig*weight_out;

%apply sigmoid to output activation
for i = 1:7
    
        actOutSig(i) = 1/(1+exp(-act_out(i)));
    
end

%show results
output_t0'
actOutSig

Optimization Problem Overview: Setting Up 

The hands-down most important part of our adventures in optimization is the correct and proper setup of the situation we hope to optimize. In previous posts I gave a glimpse of how to formally define optimization problems. Now you will see the proper way to set up our problems.

All optimization problems start the same way, with a cost or objective function. This function is what we are trying to minimize. Our function can be our cost of ingredients, our time traveled, or the seating space in a restaurant. All of these are possible objective functions. We will call the function we are trying to minimize (our objective function) f(x), where x is a vector of variables. So the first part of our problem setup looks like this.
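In symbols, that bare-bones starting point is nothing more than:

\min_{x} \; f(x)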

So far it’s a pretty boring optimization problem. We need to add rules, or constraints, to make our problem more interesting and more meaningful. There are two general types of constraints: equality and inequality. Obviously one type sets our variables equal to something, while the other tells us the relationship of a variable to constants or other variables. However, we like all of our optimization problems to look pretty much the same; this enables us to draw parallels between different problems and hopefully use the same methods of solving them. For this reason we put all our inequality and equality constraints in the form below.
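Written out, the general form looks something like this (sign conventions vary from textbook to textbook, so treat this as one common choice):

\begin{aligned}
\min_{x} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0, \qquad i = 1, \dots, m \\
& h_j(x) = 0, \qquad j = 1, \dots, p
\end{aligned}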

Now our problem is starting to get a little more interesting, and it also conveys more information to anyone else who is looking at it. However, we are not done yet. We have to determine what type of optimization problem we have. By identifying our problem type, we know how to approach solving it; certain methods and solvers work better with certain problem types (remember our no-free-lunch talk). But this is saved for the next post: identifying and categorizing our problem.

Before we go, let’s take our diet problem from yesterday for a spin. Let’s say we live in a small town with one grocery. This grocery is poorly stocked and only has 8 items on its shelves at any given time. Each of these items has a cost associated with it and a certain nutritional value. Since we are watching what we eat, we have decided to count our macros. Our macros are fats, carbohydrates, and protein. Also, I am going to tack on another “macro”: vitamins. So let’s see what this supermarket has to offer.

Walking down the aisle we see the 8 items. They have apples, steak, gummy vitamins (Vitafusion only), potatoes, orange juice, ice cream, broccoli, and chicken breasts. Before heading home you take note of all the prices and put them in a list so they are all nice and organized.

Price list for the 8 items at the grocery

Once you get home you open up Chrome and check out some of the nutritional facts for the items from the store. You pop open Excel and make a spreadsheet that lists all the nutritional facts broken down into our four “macros”. The spreadsheet is shown below.

food fat carbs protein vitamins
apples 0 5 1 3
steak 5 2 10 0
gummy vitamins 0 2 0 10
potatoes 0 8 0 1
orange juice 0 4 0 4
ice cream 10 4 0 0
broccoli 0 5 0 5
chicken breasts 1 2 7 2

As you can see, some foods provide a lot more macros than others. However, upon first inspection I cannot tell which foods are going to be the best options for our diet. But before we determine the optimal diet we need to know how much of each macro we need. Conservatively, we guess that we need 40 grams of fat, 60 grams of carbs, 50 grams of protein, and 45 grams of vitamins. With this information, we can formulate our problem. First we need to create a few vectors and matrices. The first vector is going to represent the amount of each foodstuff we buy. The next vector comes from our cost list above: the cost vector.

The amounts vector x and the cost vector

One thing we have to realize is that all the above x’s are non-negative, as we cannot vomit up food and sell it back to the store. Anyway, it’s starting to look like an optimization problem. We need two more elements: our constraint matrix and constraint vector. These are going to stem from the spreadsheet we made above and our target macros. The constraint matrix (the spreadsheet values) is denoted by “A” and the constraint vector (our target macros) by “b”; they are shown below.

The constraint matrix A (macros per food) and the constraint vector b (macro targets)

We now have all the necessary elements for our optimization problem. Going back, we remember the goal of our optimization: to minimize the amount of money we spend on food. However, this is subject to the constraint that we have to hit our macros. Formally declaring the problem gives us the following.

The diet problem stated formally
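In symbols, the formulation above boils down to the linear program (x = amounts purchased, c = prices, A = macros per unit of food, b = the macro targets):

\begin{aligned}
\min_{x} \quad & c^{\top} x \\
\text{subject to} \quad & A x \ge b \\
& x \ge 0
\end{aligned}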

There we have it, our first optimization problem. This isn’t exactly standard form, but it is close. In the next couple of posts we will go over various methods to get our problems into standard form. But before that, we need to classify our optimization problem. When a problem falls into the form above, we classify it as a linear programming problem, because both the objective function (our cost to minimize) and our constraints (macro targets) are linear. Linear equations are a nice basis for optimization. Next part we will dive deeper into linear programs and the best ways to solve them.
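As a teaser for the solving step, here is a rough SciPy sketch of this exact diet problem. The macro matrix comes from the spreadsheet above, but the prices are made-up placeholders since the real price list lived in the figure:

import numpy as np
from scipy.optimize import linprog

# Macros per unit of each food (rows: fat, carbs, protein, vitamins), from the spreadsheet.
# Columns: apples, steak, gummy vitamins, potatoes, OJ, ice cream, broccoli, chicken.
A = np.array([
    [0, 5, 0, 0, 0, 10, 0, 1],   # fat
    [5, 2, 2, 8, 4, 4, 5, 2],    # carbs
    [1, 10, 0, 0, 0, 0, 0, 7],   # protein
    [3, 0, 10, 1, 4, 0, 5, 2],   # vitamins
])
b = np.array([40, 60, 50, 45])   # macro targets

# Placeholder prices per unit of each food (the real ones were in the price list above).
c = np.array([0.5, 4.0, 0.2, 0.3, 0.6, 1.5, 0.8, 2.0])

# linprog minimizes c @ x subject to A_ub @ x <= b_ub, so flip signs to express A @ x >= b.
result = linprog(c, A_ub=-A, b_ub=-b, bounds=(0, None))
print(result.x)    # amount of each food to buy
print(result.fun)  # total cost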

-Marcello
