In this post, we deduce Fatou's lemma and the monotone convergence theorem (MCT) from each other.

## 2013/11/18

## 2013/11/14

### Young's, Hölder's and Minkowski's Inequalities

In this post, we prove Young's, Hölder's and Minkowski's inequalities in full detail. We prove Hölder's inequality using Young's inequality, and then Minkowski's inequality using Hölder's.
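
For reference, the three inequalities can be stated as follows. This is only a sketch of the standard statements; the notation ($a, b$, conjugate exponents $p, q$, measurable functions $f, g$) is assumed here and may differ from the notation used in the post itself.

$$
\begin{aligned}
&\text{Young:} && ab \le \frac{a^p}{p} + \frac{b^q}{q}, && a, b \ge 0,\ p, q > 1,\ \tfrac{1}{p} + \tfrac{1}{q} = 1, \\
&\text{Hölder:} && \|fg\|_1 \le \|f\|_p \, \|g\|_q, && p, q > 1,\ \tfrac{1}{p} + \tfrac{1}{q} = 1, \\
&\text{Minkowski:} && \|f + g\|_p \le \|f\|_p + \|g\|_p, && p \ge 1.
\end{aligned}
$$

The chain of proofs mirrors this order: Young's inequality applied pointwise to the normalized functions $|f|/\|f\|_p$ and $|g|/\|g\|_q$ gives Hölder's, and Hölder's applied to $|f|\,|f+g|^{p-1}$ and $|g|\,|f+g|^{p-1}$ gives Minkowski's.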

categories:
analysis,
inequalities,
measure theory

## 2013/08/20

### Sequential importance sampling-resampling

## Introduction

In this post, I review sequential importance sampling-resampling for state space models. These algorithms are also known as particle filters. I derive the filters and show how they apply to general state space models.

categories:
monte carlo methods,
sequential monte carlo

## 2013/07/30

## 2013/07/22

### Static Parameter Estimation for the GARCH model

## Introduction

In this post, we review online maximum-likelihood parameter estimation for the GARCH model, which is a dynamic variance model. GARCH can be seen as a toy volatility model and serves as a textbook example in financial time series modelling.

categories:
machine learning,
maximum-likelihood,
optimization,
time series analysis

## 2013/06/22

### Nonnegative Matrix Factorization

## Introduction

In this post, I derive the nonnegative matrix factorization (NMF) algorithm as proposed by Lee and Seung (1999). I derive the multiplicative updates from a gradient descent point of view, following the treatment of Lee and Seung in their later NIPS paper Algorithms for Nonnegative Matrix Factorization. The code for this blog post can be accessed from here.

categories:
machine learning,
matrix factorizations,
optimization
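
As a brief sketch of the gradient descent view mentioned above, consider the squared Euclidean cost, one of the objectives treated in the NIPS paper. The symbols $V$ (data), $W$, $H$ (nonnegative factors) and $\eta$ (elementwise step size) are assumed notation, with $\odot$ and $\oslash$ denoting elementwise product and division.

$$
\begin{aligned}
J(W, H) &= \tfrac{1}{2}\,\|V - WH\|_F^2, \qquad \nabla_H J = W^\top W H - W^\top V, \\
H &\leftarrow H - \eta \odot \left(W^\top W H - W^\top V\right), \qquad \eta = H \oslash \left(W^\top W H\right), \\
\Rightarrow\quad H &\leftarrow H \odot \left(W^\top V\right) \oslash \left(W^\top W H\right), \\
W &\leftarrow W \odot \left(V H^\top\right) \oslash \left(W H H^\top\right).
\end{aligned}
$$

With this particular step size the subtraction cancels the $H$ term, so the additive gradient step becomes a multiplicative one and nonnegative entries stay nonnegative.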

## 2013/05/25

### The EM Algorithm

## Introduction

In this post, we review the Expectation-Maximization (EM) algorithm and its use for maximum-likelihood problems.

categories:
bayesian statistics,
maximum-likelihood

## 2013/05/23

### Stochastic gradient descent

In this post, I introduce stochastic gradient descent, a widely used stochastic optimization technique. I also implement the algorithm for the linear-regression problem and provide the Matlab code.
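
To give a flavour of the idea, here is a minimal Python sketch of stochastic gradient descent for one-dimensional linear regression. It is not the Matlab code from the post: the function name `sgd_linear_regression`, the step size, the number of epochs and the toy data are all assumptions chosen for illustration.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=1000, seed=0):
    """Fit y ~ w * x + b by SGD on the squared error (illustrative sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        rng.shuffle(indices)
        for i in indices:
            # Gradient of 0.5 * (w * x_i + b - y_i)^2 with respect to (w, b).
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Hypothetical noiseless toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys)
print(w, b)  # should be close to the generating slope 2 and intercept 1
```

Each update uses a single sample rather than the full data set, which is what distinguishes the method from batch gradient descent. With noisy data a decaying step size would usually be needed for convergence.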

categories:
machine learning,
optimization,
probability

## 2013/05/20

### Gaussianity, Least squares, Pseudoinverse

## Introduction

In this post, we show the relationship between the Gaussian observation model, least squares and the pseudoinverse. We start with a Gaussian observation model and then move to least-squares estimation. Then we show that the least-squares solution corresponds to the pseudoinverse operation.

## 2013/05/03

### The use of Ito-Doeblin formula to solve SDEs

## Introduction

These notes are mostly based on the book Stochastic Calculus for Finance vol. II, Chapter 4. I give a few propositions and focus on Shreve's exercises, making use of the Ito-Doeblin formula. Here the Ito-Doeblin formula is used almost purely as a practical tool for solving continuous-time stochastic models. My treatment is slightly different from Shreve's since I emphasize the differential forms of the formulas.
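
As a short illustration of the differential form, the Ito-Doeblin formula for a smooth function $f(t, x)$ of an Ito process $X_t$ can be sketched as below. The notation ($f_t$, $f_x$, $f_{xx}$ for partial derivatives, $W_t$ for Brownian motion, and the constants $\alpha$, $\sigma$ in the example) is assumed here and is not necessarily the notation of the notes.

$$
\begin{aligned}
df(t, X_t) &= f_t(t, X_t)\,dt + f_x(t, X_t)\,dX_t + \tfrac{1}{2} f_{xx}(t, X_t)\,dX_t\,dX_t, \\
\text{with}\quad dW_t\,dW_t &= dt, \qquad dt\,dW_t = 0, \qquad dt\,dt = 0.
\end{aligned}
$$

For example, if $dS_t = \alpha S_t\,dt + \sigma S_t\,dW_t$, applying the formula to $f(x) = \log x$ gives

$$
d\log S_t = \left(\alpha - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW_t,
\qquad\text{so}\qquad
S_t = S_0 \exp\!\left(\left(\alpha - \tfrac{1}{2}\sigma^2\right)t + \sigma W_t\right).
$$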

categories:
probability,
stochastic differential equations