I tried the primal with slack variables first, plugging in the 4 points
to get 4 constraints on w1, w2, b and the slack variables.
I did some algebra to reduce the constraints, but I couldn't get anything
really helpful. However, I did notice something interesting.
If I just add the 4 constraints, I get that the sum of the 4 slack variables
is >= 4. So I plugged in w1=0, w2=0 and looked for some b that satisfies the
slack constraints. Such a b exists, giving w1=0, w2=0 and a slack sum of
exactly 4. I believe this minimizes the objective, but there
are two things I am concerned about:
1. This is not a sound mathematical argument. I can't just pick some numbers
at random and claim they are optimal.
2. What line does w=[0 0]T give? I know that this dataset is not linearly
separable unless I allow a slack variable > 1 for one of the o or x points,
but I can at least draw such a line. The idea of w=[0 0]T doesn't feel
right to me.
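
One way to turn the guess into an argument, sketched under assumptions: the
post does not list the 4 points, so the code uses a hypothetical XOR-style
dataset (X, y) and an assumed penalty C=1. Both terms of the objective have a
lower bound (0.5*||w||^2 >= 0, and the slack sum is >= 4 from adding the
constraints), so any (w, b) that reaches both bounds at once must be optimal.
The sketch checks that w=0, b=0 reaches the bound and that randomly drawn
(w, b) never do better.

```python
import random

# Hypothetical XOR-style data (an assumption; the post does not list the points).
X = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
y = [1.0, 1.0, -1.0, -1.0]
C = 1.0  # assumed slack penalty


def slacks(w, b):
    """Smallest feasible slack per point: xi_i = max(0, 1 - y_i * (w.x_i + b))."""
    return [max(0.0, 1.0 - yi * (w[0] * xi[0] + w[1] * xi[1] + b))
            for xi, yi in zip(X, y)]


def objective(w, b):
    """Soft-margin primal objective: 0.5 * ||w||^2 + C * sum(xi)."""
    return 0.5 * (w[0] ** 2 + w[1] ** 2) + C * sum(slacks(w, b))


# Candidate from the post: w = 0 with a suitable b (b = 0 is one choice).
candidate = objective((0.0, 0.0), 0.0)

# Random (w, b) draws should never fall below the lower bound 4 * C.
rng = random.Random(0)
best_random = min(
    objective((rng.uniform(-3, 3), rng.uniform(-3, 3)), rng.uniform(-3, 3))
    for _ in range(10000)
)

print(candidate, best_random)
```

With w=0 the decision function is just the constant b, so there is no line to
draw: every point gets the same label. Under these assumptions that is what
the optimum looks like, and the 0.5*||w||^2 term rules out any other w with
the same objective value.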
I am also stuck on the dual.
I get the objective in terms of the alphas, but I don't know how to solve for
their values. I have the objective and the constraint yTa = 0, but there are 4
unknown alphas.
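
For reference, the standard soft-margin dual with a linear kernel, written as
a maximization (the lecture notes may use the equivalent minimization with the
sign flipped). C is the slack penalty from the primal, assuming the primal
penalizes the plain sum of slacks; the symbols are the textbook ones and may
not match the notes' notation.

```latex
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{4} \alpha_i
  - \frac{1}{2}\sum_{i=1}^{4}\sum_{j=1}^{4} \alpha_i \alpha_j\, y_i y_j\, x_i^{T} x_j \\
\text{s.t.}\quad & \sum_{i=1}^{4} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C,\quad i = 1,\dots,4 \\[4pt]
\text{then}\quad & w = \sum_{i=1}^{4} \alpha_i y_i x_i
\end{aligned}
```

So besides yTa = 0, each alpha is also boxed between 0 and C, and w is
recovered from the alphas afterwards.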
Again, I tried the simple thing and set everything to 0, but I know that
isn't right (like the example on page 23 of the lecture notes: setting
alpha1=0 meets the constraints, but it's not the right solution).
I tried to find another set of alphas that aren't all 0 and that minimize the
objective, but that set gives me a really big w.
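
A small numerical check of the dual, again only a sketch: the 4 points are not
given in the post, so the data below is a hypothetical XOR-style set with an
assumed C=1, and the dual is written as a maximization (a minimization form
just flips the sign). It scans a coarse grid of alphas inside the box
0 <= alpha_i <= C, keeps those with sum(alpha_i * y_i) = 0, and reports the
best one together with w = sum(alpha_i * y_i * x_i).

```python
from itertools import product

# Hypothetical XOR-style data and penalty (assumptions, not from the post).
X = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
y = [1.0, 1.0, -1.0, -1.0]
C = 1.0


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def dual_objective(alpha):
    """sum(alpha) - 0.5 * sum_ij alpha_i alpha_j y_i y_j <x_i, x_j>."""
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(4) for j in range(4))
    return sum(alpha) - 0.5 * quad


def weights(alpha):
    """Recover w = sum_i alpha_i y_i x_i from the dual variables."""
    return tuple(sum(a * yi * xi[k] for a, yi, xi in zip(alpha, y, X))
                 for k in range(2))


# Coarse grid over the box; keep only points satisfying the equality constraint.
grid = [C * k / 4 for k in range(5)]
feasible = [a for a in product(grid, repeat=4)
            if abs(sum(ai * yi for ai, yi in zip(a, y))) < 1e-12]

best_alpha = max(feasible, key=dual_objective)
best_value = dual_objective(best_alpha)
best_w = weights(best_alpha)

print(best_alpha, best_value, best_w)
```

On this assumed data the all-zero alpha is feasible but only scores 0, while
the best grid point puts every alpha at the upper bound C and still yields
w = [0 0]T, with a dual value that matches the primal value 4*C. A grid search
is only a sanity check; a QP solver would be the usual tool for real data.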