While reading the Support Vector Machine chapter of *Statistical Learning Methods*, I came across "apply Lagrangian duality (see Appendix C), and obtain the optimal solution of the original problem by solving the dual problem," so I went back to Appendix C on Lagrangian duality. These are nominally study notes, but in fact they are mostly an excerpt, with a small amount of personal understanding and background added. After all, the theorems and corollaries can be followed but there is nothing fancy to write about them, and even the original book omits the proofs of some corollaries; for now it is enough to know these results exist and to have somewhere to come back to next time. The excerpt follows:

In constrained optimization, Lagrangian duality is often used to transform the original problem into a dual problem, and the solution of the original problem is obtained by solving the dual problem. This technique is used in many statistical learning methods, such as the maximum entropy model and the support vector machine. Below is a brief description of the main concepts and results of Lagrangian duality.

## 1. Original problem

Suppose that $f(x)$, $c_i(x)$, $h_j(x)$ are continuously differentiable functions defined on $\mathbf{R}^n$. Consider the constrained optimization problem

$$\min_{x \in \mathbf{R}^n} \; f(x) \tag{C.1}$$

$$\text{s.t.} \quad c_i(x) \le 0, \quad i = 1, 2, \cdots, k \tag{C.2}$$

$$h_j(x) = 0, \quad j = 1, 2, \cdots, l \tag{C.3}$$

This constrained optimization problem is called the original optimization problem (the primal problem), or simply the original problem.

First, introduce the generalized Lagrange function

$$L(x, \alpha, \beta) = f(x) + \sum_{i=1}^{k} \alpha_i c_i(x) + \sum_{j=1}^{l} \beta_j h_j(x) \tag{C.4}$$

Here $x = (x^{(1)}, x^{(2)}, \cdots, x^{(n)})^{\mathrm{T}} \in \mathbf{R}^n$, and $\alpha_i$, $\beta_j$ are Lagrange multipliers, with $\alpha_i \ge 0$. Consider the following function of $x$:

$$\theta_P(x) = \max_{\alpha, \beta:\, \alpha_i \ge 0} L(x, \alpha, \beta) \tag{C.5}$$

Here, the subscript $P$ indicates the original problem.

Given some $x$, if $x$ violates the constraints of the original problem, that is, there is some $i$ such that $c_i(x) > 0$, or some $j$ such that $h_j(x) \ne 0$, then

$$\theta_P(x) = \max_{\alpha, \beta:\, \alpha_i \ge 0} \Big[ f(x) + \sum_{i=1}^{k} \alpha_i c_i(x) + \sum_{j=1}^{l} \beta_j h_j(x) \Big] = +\infty$$

This is because if some $i$ makes the constraint $c_i(x) > 0$, we can let $\alpha_i \to +\infty$; and if some $j$ makes $h_j(x) \ne 0$, we can choose $\beta_j$ so that $\beta_j h_j(x) \to +\infty$; in either case, all the remaining $\alpha_i$ and $\beta_j$ are taken to be $0$.

Conversely, if $x$ satisfies the constraints (C.2) and (C.3), then it follows from (C.5) and (C.4) that $\theta_P(x) = f(x)$. (Of the two sums in (C.4), one is non-positive and the other is $0$; to attain the maximum, both must be $0$.) Therefore,

$$\theta_P(x) = \begin{cases} f(x), & x \text{ satisfies the original constraints} \\ +\infty, & \text{otherwise} \end{cases} \tag{C.6}$$
Now consider the minimization problem

$$\min_x \theta_P(x) = \min_x \max_{\alpha, \beta:\, \alpha_i \ge 0} L(x, \alpha, \beta) \tag{C.7}$$

It is equivalent to the original optimization problem (C.1)~(C.3), that is, they have the same solutions. The problem $\min_x \max_{\alpha, \beta:\, \alpha_i \ge 0} L(x, \alpha, \beta)$ is called the minimax problem of the generalized Lagrangian function. In this way, the original optimization problem is expressed as a minimax problem of the generalized Lagrangian function. For convenience, define the optimal value of the original problem

$$p^* = \min_x \theta_P(x) \tag{C.8}$$

called the value of the original problem.
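The case analysis behind formula (C.6) can be checked numerically. Below is a minimal sketch on a toy problem of my own (not from the book): minimize $f(x) = x^2$ subject to $c(x) = 1 - x \le 0$, approximating the maximization over $\alpha \ge 0$ on a small grid.

```python
# Toy primal problem (my own example, not from the book):
# minimize f(x) = x^2  subject to  c(x) = 1 - x <= 0.
def f(x):
    return x * x

def c(x):
    return 1.0 - x

def L(x, alpha):
    # Generalized Lagrangian for a single inequality constraint
    # (no equality constraints, so no beta terms).
    return f(x) + alpha * c(x)

def theta_P(x, alphas=(0.0, 1.0, 10.0, 1e6)):
    # theta_P(x) = max over alpha >= 0 of L(x, alpha),
    # approximated by scanning a grid of alpha values.
    return max(L(x, a) for a in alphas)

print(theta_P(2.0))  # feasible: c(2) = -1 <= 0, maximum at alpha = 0 gives f(2) = 4
print(theta_P(0.0))  # infeasible: c(0) = 1 > 0, L grows with alpha (capped here by the grid)
```

For a feasible point the grid maximum is exactly $f(x)$; for an infeasible point it is limited only by the largest $\alpha$ in the grid, illustrating why the true supremum is $+\infty$.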

## 2. Dual problem

Define

$$\theta_D(\alpha, \beta) = \min_x L(x, \alpha, \beta) \tag{C.9}$$

Now consider maximizing $\theta_D(\alpha, \beta)$, that is,

$$\max_{\alpha, \beta:\, \alpha_i \ge 0} \theta_D(\alpha, \beta) = \max_{\alpha, \beta:\, \alpha_i \ge 0} \min_x L(x, \alpha, \beta) \tag{C.10}$$

The problem $\max_{\alpha, \beta:\, \alpha_i \ge 0} \min_x L(x, \alpha, \beta)$ is called the maximin problem of the generalized Lagrangian function.

The maximin problem of the generalized Lagrangian function can be expressed as a constrained optimization problem:

$$\max_{\alpha, \beta} \theta_D(\alpha, \beta) = \max_{\alpha, \beta} \min_x L(x, \alpha, \beta) \tag{C.12}$$

$$\text{s.t.} \quad \alpha_i \ge 0, \quad i = 1, 2, \cdots, k \tag{C.13}$$

which is called the dual problem of the original problem. Define the optimal value of the dual problem

$$d^* = \max_{\alpha, \beta:\, \alpha_i \ge 0} \theta_D(\alpha, \beta) \tag{C.14}$$

called the value of the dual problem.
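For intuition, $\theta_D$ and $d^*$ can be computed for a toy problem of my own (not from the book): minimize $x^2$ subject to $1 - x \le 0$. For this problem the inner minimum $\min_x L(x, \alpha)$ has the closed form $\alpha - \alpha^2/4$, attained at $x = \alpha/2$.

```python
def theta_D(alpha):
    # For L(x, alpha) = x^2 + alpha * (1 - x), the inner minimum over x
    # is attained at x = alpha / 2 (set dL/dx = 2x - alpha = 0).
    x = alpha / 2.0
    return x * x + alpha * (1.0 - x)

# Maximize theta_D over alpha >= 0 on a coarse grid.
alphas = [i / 100.0 for i in range(501)]   # 0.00, 0.01, ..., 5.00
alpha_star = max(alphas, key=theta_D)
d_star = theta_D(alpha_star)
print(alpha_star, d_star)  # 2.0 1.0
```

The dual optimum $\alpha^* = 2$ gives $d^* = 1$, which (as the later theorems predict for this convex problem) equals the primal value $p^* = f(1) = 1$.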

## 3. The relationship between the original problem and the dual problem

The relationship between the original problem and the dual problem is discussed below.

### Theorem C.1

If both the original problem and the dual problem have optimal values, then

$$d^* = \max_{\alpha, \beta:\, \alpha_i \ge 0} \min_x L(x, \alpha, \beta) \le \min_x \max_{\alpha, \beta:\, \alpha_i \ge 0} L(x, \alpha, \beta) = p^*$$

**Proof** From formula (C.12) and formula (C.5), for any $\alpha$, $\beta$ and $x$,

$$\theta_D(\alpha, \beta) = \min_x L(x, \alpha, \beta) \le L(x, \alpha, \beta) \le \max_{\alpha, \beta:\, \alpha_i \ge 0} L(x, \alpha, \beta) = \theta_P(x)$$

that is,

$$\theta_D(\alpha, \beta) \le \theta_P(x)$$

Since both the original problem and the dual problem have optimal values,

$$\max_{\alpha, \beta:\, \alpha_i \ge 0} \theta_D(\alpha, \beta) \le \min_x \theta_P(x)$$

that is,

$$d^* \le p^*$$

This is easy to understand intuitively: for the same function, the largest of its minima (over $x$) can never exceed the smallest of its maxima (over $\alpha$, $\beta$).
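Theorem C.1 can also be verified numerically on a toy problem of my own (not from the book): minimize $x^2$ subject to $1 - x \le 0$. Every value of $\theta_D$ is a lower bound on every value of $\theta_P$ (which equals $f$ on the feasible set), so $d^* \le p^*$.

```python
def f(x):
    return x * x

def c(x):
    return 1.0 - x

def L(x, alpha):
    return f(x) + alpha * c(x)

def theta_D(alpha):
    x = alpha / 2.0          # minimizer of L(., alpha), from dL/dx = 0
    return L(x, alpha)

# theta_P(x) = f(x) on the feasible set (x >= 1) and +infinity elsewhere,
# so p* is the minimum of f over feasible x, and d* the maximum of theta_D.
xs = [1.0 + i / 100.0 for i in range(301)]      # feasible grid: 1.00 .. 4.00
alphas = [i / 100.0 for i in range(501)]        # alpha grid:    0.00 .. 5.00
p_star = min(f(x) for x in xs)
d_star = max(theta_D(a) for a in alphas)

# every dual value is below every primal value
assert all(theta_D(a) <= f(x) + 1e-12 for a in alphas for x in xs)
print(d_star, p_star)  # weak duality: d* <= p* (here both are 1.0)
```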

### Corollary C.1

Suppose $x^*$ and $\alpha^*$, $\beta^*$ are feasible solutions of the original problem (C.1)~(C.3) and the dual problem (C.12)~(C.13), respectively, and $d^* = p^*$. Then $x^*$ and $\alpha^*$, $\beta^*$ are the optimal solutions of the original problem and the dual problem, respectively.

Under certain conditions, the optimal values of the original problem and the dual problem are equal, $d^* = p^*$. In that case, the original problem can be solved by solving the dual problem instead. The relevant important conclusions are stated below in the form of theorems, without proof.
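Here is a minimal sketch of that replacement on a toy problem of my own (not from the book): minimize $x^2$ subject to $1 - x \le 0$. Once the dual maximizer $\alpha^*$ is known, the primal solution can be recovered as the minimizer of $L(x, \alpha^*)$.

```python
# Toy problem (my own example): minimize x^2 subject to 1 - x <= 0.
def L(x, alpha):
    return x * x + alpha * (1.0 - x)

alpha_star = 2.0                 # dual maximizer: d(theta_D)/d(alpha) = 1 - alpha/2 = 0
x_star = alpha_star / 2.0        # argmin_x L(x, alpha*): dL/dx = 2x - alpha* = 0
p_star = x_star ** 2             # primal value f(x*)
d_star = L(x_star, alpha_star)   # dual value theta_D(alpha*)
print(x_star, p_star, d_star)    # 1.0 1.0 1.0 -- no duality gap
```

This only works because the problem is convex and strictly feasible, exactly the situation covered by the theorems below.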

### Theorem C.2

Consider the original problem (C.1)~(C.3) and the dual problem (C.12)~(C.13). Suppose the functions $f(x)$ and $c_i(x)$ are convex and the $h_j(x)$ are affine functions;

### Tidbit: affine functions

An affine function is a special convex function: a function that is both convex and concave is called affine. It must be the sum of a linear function and a constant. In a finite-dimensional space, an affine function is a linear function. The importance of affine functions lies in the fact that every lower semi-continuous convex function on a locally convex space (including normed linear spaces and finite-dimensional spaces) is the upper envelope of a family of continuous affine functions.

– *Mathematics Ci Hai, Volume III* (Unfortunately, I generally cannot follow Chinese academic reference books; I could only understand the sentence marked in red.)

An affine function is a vector-valued function of the form

$$f(x) = Ax + b$$

The coefficient $A$ can be a scalar or a dense or sparse matrix, and the constant term $b$ a scalar or a column vector.

In geometry, an affine transformation or affine map (from the Latin, affinis, “connected with”) between two vector spaces consists of a linear transformation followed by a translation. In a geometric setting, these are precisely the functions that map straight lines to straight lines.

——https://mathworld.wolfram.com/AffineFunction.html (The English is much easier to understand: an affine function is a linear function plus a constant. The input is an $n$-dimensional vector; $A$ can be a constant or an $m \times n$ matrix, and $b$ a constant or an $m$-dimensional column vector; the output is an $m$-dimensional column vector. Geometrically, an affine function is a transformation from one linear space to another.)
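As a minimal sketch of that definition (the matrix, vector, and function names are my own), here is an affine map $f(x) = Ax + b$ with $A$ a $2 \times 2$ matrix and $b$ a 2-vector, written in plain Python:

```python
# Affine function f(x) = A x + b: a linear map followed by a translation.
A = [[1.0, 2.0],
     [0.0, 1.0]]   # 2x2 matrix (the linear part)
b = [5.0, -1.0]    # 2-vector (the translation)

def affine(x):
    # matrix-vector product A x, then add b componentwise
    return [sum(A[i][j] * x[j] for j in range(2)) + b[i] for i in range(2)]

print(affine([1.0, 1.0]))  # [1 + 2 + 5, 0 + 1 - 1] = [8.0, 0.0]
```

Note that `affine([0.0, 0.0])` returns `b`, not the zero vector, which is exactly why an affine function is in general not linear in the strict sense.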

and suppose further that the inequality constraints $c_i(x)$ are strictly feasible, that is, there exists $x$ such that $c_i(x) < 0$ for all $i$. Then there exist $x^*$ and $\alpha^*$, $\beta^*$ such that $x^*$ is the solution of the original problem, $\alpha^*$, $\beta^*$ is the solution of the dual problem, and

$$p^* = d^* = L(x^*, \alpha^*, \beta^*)$$

### Theorem C.3

For the original problem (C.1)~(C.3) and the dual problem (C.12)~(C.13), suppose the functions $f(x)$ and $c_i(x)$ are convex, the $h_j(x)$ are affine, and the inequality constraints $c_i(x)$ are strictly feasible. Then $x^*$ and $\alpha^*$, $\beta^*$ are the solutions of the original problem and the dual problem, respectively, if and only if $x^*$, $\alpha^*$, $\beta^*$ satisfy the following Karush-Kuhn-Tucker (KKT) conditions:

$$\nabla_x L(x^*, \alpha^*, \beta^*) = 0 \tag{C.20}$$

$$c_i(x^*) \le 0, \quad i = 1, 2, \cdots, k \tag{C.21}$$

$$\alpha_i^* \ge 0, \quad i = 1, 2, \cdots, k \tag{C.22}$$

$$h_j(x^*) = 0, \quad j = 1, 2, \cdots, l \tag{C.23}$$

$$\alpha_i^* c_i(x^*) = 0, \quad i = 1, 2, \cdots, k \tag{C.24}$$

In particular, formula (C.24) is called the KKT dual complementarity condition. By this condition, if $\alpha_i^* > 0$, then $c_i(x^*) = 0$.
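The KKT conditions can be checked on the toy problem used earlier (my own example, not from the book): minimize $x^2$ subject to $c(x) = 1 - x \le 0$, with candidate solution $x^* = 1$, $\alpha^* = 2$ and no equality constraints.

```python
# Toy problem (my own example): f(x) = x^2, c(x) = 1 - x <= 0, no equality constraints.
def c(x):
    return 1.0 - x

def grad_L(x, alpha):
    # d/dx [x^2 + alpha * (1 - x)] = 2x - alpha
    return 2.0 * x - alpha

x_star, alpha_star = 1.0, 2.0

checks = {
    "stationarity":      grad_L(x_star, alpha_star) == 0.0,
    "primal feasibility": c(x_star) <= 0.0,
    "dual feasibility":   alpha_star >= 0.0,
    "complementarity":    alpha_star * c(x_star) == 0.0,  # (C.24)
}
print(all(checks.values()))  # True
```

Here $\alpha^* = 2 > 0$ forces $c(x^*) = 0$, i.e. the inequality constraint is active at the optimum, exactly as the dual complementarity condition predicts.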

## Reference

- Li Hang. *Statistical Learning Methods*.