
Towards a Better Language Model

We already know several ways to improve a language model, such as:

  • Better input: word → root → character
  • Better regularization/preprocessing
  • Combining these methods to obtain a better language model

Better input

Multiple granularities of text:

[Figure: the same text at word, root, and character granularity]

Finer granularity effectively shrinks the vocabulary, making the model's choices easier. Experiments show that the error is indeed reduced:

[Figure: error rates at different input granularities]
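As a toy illustration of why finer granularity shrinks the vocabulary (the corpus below is made up; on real text the character vocabulary stays under roughly a hundred symbols while the word vocabulary runs to hundreds of thousands):

```python
corpus = "the runner reran the rerun and the run".split()

word_vocab = set(corpus)            # one entry per distinct word form
char_vocab = set("".join(corpus))   # one entry per distinct character

print(len(word_vocab), len(char_vocab))
# Words like "runner", "reran", and "rerun" each add a vocabulary
# entry, while at character level they reuse the same few symbols.
```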

Better regularization and preprocessing

Regularization needs no further explanation here.

Preprocessing refers to randomly replacing some words in a sentence with other words (for example, replacing one place name with another), or using bigram statistics to generate the replacement.

This yields a smoother distribution: high-frequency words cede some of their probability mass to low-frequency words.
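A minimal sketch of this kind of augmentation in Python, assuming uniform random replacement (the bigram variant described above would instead sample the replacement conditioned on the previous word); the function name and vocabulary here are illustrative, not from the post:

```python
import random

def augment(tokens, vocab, p=0.1, rng=random):
    """With probability p, replace each token by a random word from the
    vocabulary -- a crude stand-in for bigram-based replacement."""
    return [rng.choice(vocab) if rng.random() < p else tok for tok in tokens]

sentence = "I flew from Paris to Berlin".split()
vocab = ["London", "Paris", "Berlin", "Tokyo", "Madrid"]
print(" ".join(augment(sentence, vocab, p=0.3)))
# e.g. "I flew from Tokyo to Berlin"
```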

[Figure: preprocessing by random word replacement]

The resulting reduction in error rate is shown below (regularization on the left, preprocessing on the right):

[Figure: error-rate reduction from regularization (left) and preprocessing (right)]

A better model?


Noise Contrastive Estimation (NCE)

Instead of the expensive cross-entropy loss over a softmax spanning the full vocabulary, one can use an approximation called the NCE loss, which turns the prediction into a binary classification between the true word and k sampled noise words. In theory, when k is large enough, the NCE gradient approaches the cross-entropy gradient.
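A minimal PyTorch sketch of the NCE objective, assuming the model produces a score s(w) for the true next word and for each of k noise words, along with the noise distribution's log-probabilities; the function and argument names are illustrative, not from the post:

```python
import math
import torch
import torch.nn.functional as F

def nce_loss(true_score, noise_scores, log_p_noise_true, log_p_noise_noise, k):
    """NCE replaces the full-vocabulary softmax with a binary classifier
    that separates the true word from k noise samples. The 'data' logit
    for a word w with model score s(w) is s(w) - log(k * P_noise(w)).

    Assumed shapes: true_score (B,), noise_scores (B, k), and the
    matching log-probabilities under the noise distribution."""
    true_logit = true_score - (math.log(k) + log_p_noise_true)      # (B,)
    noise_logit = noise_scores - (math.log(k) + log_p_noise_noise)  # (B, k)
    pos = F.binary_cross_entropy_with_logits(
        true_logit, torch.ones_like(true_logit), reduction="none")  # (B,)
    neg = F.binary_cross_entropy_with_logits(
        noise_logit, torch.zeros_like(noise_logit),
        reduction="none").sum(dim=1)                                # (B,)
    return (pos + neg).mean()

# Toy usage with random scores (batch of 4, k = 16 noise samples):
B, k = 4, 16
loss = nce_loss(torch.randn(B), torch.randn(B, k),
                torch.full((B,), -9.0), torch.full((B, k), -9.0), k)
```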

Larger number of LSTM units

The number of LSTM units is increased to 1024; similarly, the larger the NCE sample count k, the better, until GPU memory runs out.
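For concreteness, widening the recurrent layer in PyTorch is a one-line change; the input size of 512 below is an assumption, since the post only specifies the hidden size:

```python
import torch.nn as nn

# Hidden size raised to 1024 units; wider helps until memory is exhausted.
lstm = nn.LSTM(input_size=512, hidden_size=1024, batch_first=True)
```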

After applying all of these improvements, I finally obtained the following results:

[Figure: final results after all improvements]
