Friday, 12 September 2014

R: k-Means Clustering on an Image

Enough of the theory we have been publishing lately. Let's take a break and have some fun with an application of Statistics used in Data Mining and Machine Learning: k-means clustering.
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. (Wikipedia, Ref 1.)
We will apply this method to an image, grouping its pixels into k different clusters. The image we are going to use is shown below.
Image: Colorful Bird, from Wall321
We will utilize the following packages for input and output:
  1. jpeg - Read and write JPEG images; and,
  2. ggplot2 - An implementation of the Grammar of Graphics.
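The idea of clustering pixels can be sketched as follows. This is a minimal illustration rather than the post's own code: it assumes a small random array as a stand-in for the bird image (with the same height x width x 3 layout that `jpeg::readJPEG()` returns for a colour JPEG), an arbitrary choice of `k <- 3`, and base R's `kmeans()` for the clustering. The names `img`, `pixels` and `fit` are hypothetical.

```r
# Synthetic stand-in for an image (assumption): a height x width x 3 array
# of RGB values in [0, 1], laid out like the output of jpeg::readJPEG().
set.seed(1)
h <- 20
w <- 30
img <- array(runif(h * w * 3), dim = c(h, w, 3))

# One row per pixel: its position plus its three colour channels.
# as.vector() walks the array column by column, so x repeats each column
# index h times and y counts rows from the top of the image down.
pixels <- data.frame(
  x = rep(seq_len(w), each = h),
  y = rep(rev(seq_len(h)), times = w),
  R = as.vector(img[, , 1]),
  G = as.vector(img[, , 2]),
  B = as.vector(img[, , 3])
)

# Group the pixels into k clusters by colour (k = 3 is an arbitrary choice).
k <- 3
fit <- kmeans(pixels[, c("R", "G", "B")], centers = k)

# Recolour every pixel with the mean colour of its cluster.
pixels$colour <- rgb(fit$centers[fit$cluster, ])
```

With a real image, `img` would come from reading the JPEG file instead, and the `pixels` data frame could then be drawn with ggplot2, for example with `geom_point()` and `scale_colour_identity()`.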

Monday, 8 September 2014

Lebesgue Measure and Outer Measure Problems

More proofs, still in Real Analysis. These are my solutions; if you find any errors, do let me know.


Lebesgue Measure: Let $\mu$ be a set function defined for all sets in a $\sigma$-algebra $\mathscr{F}$, with values in $[0,\infty]$. Assume $\mu$ is countably additive over countable disjoint collections of sets in $\mathscr{F}$.
  1. Prove that if $A$ and $B$ are two sets in $\mathscr{F}$, with $A\subseteq B$, then $\mu(A)\leq \mu(B)$. This property is called monotonicity.
  2. Prove that if there is a set $A$ in the collection $\mathscr{F}$ for which $\mu(A)<\infty$, then $\mu(\emptyset)=0$.
  3. Let $\{E_{k}\}_{k=1}^{\infty}$ be a countable collection of sets in $\mathscr{F}$. Prove that $\mu\left(\displaystyle\bigcup_{k=1}^{\infty}E_{k}\right)\leq \displaystyle\sum_{k=1}^{\infty}\mu(E_k)$.
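As an illustration of how item 1 might go (a sketch only, not necessarily the solution intended here): since $A\subseteq B$, the set $B$ is the disjoint union of $A$, $B\setminus A$ and countably many copies of $\emptyset$, all of which lie in $\mathscr{F}$. Countable additivity then gives \begin{equation}\nonumber \mu(B)=\mu(A)+\mu(B\setminus A)+\displaystyle\sum_{k=3}^{\infty}\mu(\emptyset)\geq \mu(A), \end{equation} because every term on the right is non-negative.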
Lebesgue Outer Measure:
  1. Using the properties of outer measure, prove that the interval $[0,1]$ is not countable.
  2. Let $A$ be the set of irrational numbers in the interval $[0,1]$. Prove that $\mu^{*}(A)=1$.
  3. Let $B$ be the set of rational numbers in the interval $[0,1]$, and let $\{I_k\}_{k=1}^{n}$ be a finite collection of open intervals that covers $B$. Prove that $\displaystyle\sum_{k=1}^{n}\mu^{*}(I_k)\geq 1$.
  4. Prove that if $\mu^{*}(A)=0$, then $\mu^{*}(A\cup B)=\mu^{*}(B).$
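For item 4, one possible sketch, assuming the standard monotonicity and countable subadditivity of the outer measure: since $B\subseteq A\cup B$ and $\mu^{*}(A)=0$, \begin{equation}\nonumber \mu^{*}(B)\leq \mu^{*}(A\cup B)\leq \mu^{*}(A)+\mu^{*}(B)=\mu^{*}(B), \end{equation} so the two outer measures must be equal.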

Sunday, 7 September 2014

Translation Invariance of the Lebesgue Outer Measure

Another proof problem, this time in Real Analysis.


  1. Prove that the Lebesgue outer measure is translation invariant. (Use the property that the length $l$ of an interval is translation invariant.)


  1. Proof. The outer measure is translation invariant if, for every $A\subseteq \mathbb{R}$ and every $y\in \mathbb{R}$, \begin{equation}\nonumber \mu^{*}(A)=\mu^{*}(A+y). \end{equation} Hence, we need to show two inequalities. Case 1: $\mu^{*}(A)\leq \mu^{*}(A+y)$; and Case 2: $\mu^{*}(A+y)\leq \mu^{*}(A)$.

    Case 1: Consider the countable collections $\{I_n\}_{n=1}^{\infty}$ of open intervals that cover $A$, and let \begin{equation}\nonumber W = \left\{\displaystyle\sum_{n=1}^{\infty}l(I_n)\mid A\subseteq\displaystyle\bigcup_{n=1}^{\infty}I_n\right\}. \end{equation} Then the outer measure of $A$ is \begin{equation}\nonumber \mu^{*}(A)=\inf W. \end{equation}
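    One way the argument might continue from here, offered as a sketch rather than as the author's own proof: if the intervals $\{I_n\}_{n=1}^{\infty}$ cover $A$, then their translates $\{I_n+y\}_{n=1}^{\infty}$ cover $A+y$, and $l(I_n+y)=l(I_n)$ for every $n$. Therefore \begin{equation}\nonumber \mu^{*}(A+y)\leq \displaystyle\sum_{n=1}^{\infty}l(I_n+y)=\displaystyle\sum_{n=1}^{\infty}l(I_n). \end{equation} Taking the infimum over all such covers of $A$ gives $\mu^{*}(A+y)\leq \mu^{*}(A)$. Applying the same reasoning to the set $A+y$ and the translation $-y$ gives the reverse inequality $\mu^{*}(A)\leq \mu^{*}(A+y)$.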