- Creating and Shaping Matrices;
- Matrix Query;
- Subscripts;
- Descriptive Statistics;
- Set Operations;
- Probability Functions and Subroutine;
- Linear Algebra;
- Reading and Creating Data;

scalar |
---|

5 |

row_vec | |||||
---|---|---|---|---|---|

1 | 2 | 3 | 4 | 5 | 6 |

col_vec |
---|

1 |

2 |

3 |

4 |

5 |

6 |

num_mat | ||
---|---|---|

1 | 2 | 3 |

4 | 5 | 6 |

chr_mat |
---|

Hello, |

world! :D |

i_mat | |||||
---|---|---|---|---|---|

1 | 0 | 0 | 0 | 0 | 0 |

0 | 1 | 0 | 0 | 0 | 0 |

0 | 0 | 1 | 0 | 0 | 0 |

0 | 0 | 0 | 1 | 0 | 0 |

0 | 0 | 0 | 0 | 1 | 0 |

0 | 0 | 0 | 0 | 0 | 1 |

mat_2 |
---|

2 |

2 |

2 |

trow_vec |
---|

1 |

2 |

3 |

4 |

5 |

6 |

mat1 | |
---|---|

1 | 2 |

3 | 4 |

5 | 6 |

SYMBOL | ROWS | COLS | TYPE | SIZE |
---|---|---|---|---|

CHR_MAT | 2 | 1 | char | 9 |

COL_VEC | 6 | 1 | num | 8 |

I_MAT | 6 | 6 | num | 8 |

MAT1 | 3 | 2 | num | 8 |

MAT_2 | 3 | 1 | num | 8 |

NUM_MAT | 2 | 3 | num | 8 |

ROW_VEC | 1 | 6 | num | 8 |

SCALAR | 1 | 1 | num | 8 |

TROW_VEC | 6 | 1 | num | 8 |

Number of symbols = 10 (includes those without values)

nmat_row |
---|

2 |

nmat_col |
---|

3 |

nmat_dim | |
---|---|

2 | 3 |

cmat_len |
---|

6 |

9 |

cmat_nlen |
---|

9 |

nmat_typ |
---|

N |

cmat_typ |
---|

C |

NUM_MAT | ||
---|---|---|

1 | 2 | 3 |

4 | 5 | 6 |

n22_mat |
---|

5 |

nr1_mat | ||
---|---|---|

1 | 2 | 3 |

ir12_mat | |||||
---|---|---|---|---|---|

1 | 0 | 0 | 0 | 0 | 0 |

0 | 1 | 0 | 0 | 0 | 0 |

ic12_mat | |
---|---|

1 | 0 |

0 | 1 |

0 | 0 |

0 | 0 |

0 | 0 |

0 | 0 |

ngm_mat |
---|

3.5 |

ncm_mat | ||
---|---|---|

2.5 | 3.5 | 4.5 |

nrm_mat |
---|

2 |

5 |

ngs_mat |
---|

21 |

nrs_mat | ||
---|---|---|

17 | 29 | 45 |

ncs_mat |
---|

14 |

77 |

nss_mat |
---|

91 |

The means above are obtained by placing the `:` symbol inside the placeholder of the subscript. So if we have `num_mat[:, 1]`, the mean is computed over the row entries, giving us the column mean, here for the first column. The same goes for `num_mat[1, :]`, where the mean is computed over the column entries, giving us the row mean. If we replace the symbol in the placeholder with `+`, we get the sum of the entries instead. Further, if we use the `##` symbol, the returned value is the sum of squares of the elements, and reducing this to `#` returns the product of the elements.

Now let's proceed to the next bullet, which is about Descriptive Statistics.
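The subscript reduction operators described above can be tried on the same 2-by-3 matrix. This is a minimal sketch, not the original listing; the result names (`col1_mean`, `col_sums`, and so on) are made up for illustration.

```sas
proc iml;
  num_mat = {1 2 3, 4 5 6};      /* the 2-by-3 matrix printed earlier */

  /* ":" reduces to a mean */
  col1_mean = num_mat[:, 1];     /* mean over the rows of column 1: 2.5 */
  row1_mean = num_mat[1, :];     /* mean over the columns of row 1: 2   */
  col_means = num_mat[:, ];      /* one mean per column: 2.5 3.5 4.5    */
  row_means = num_mat[, :];      /* one mean per row: 2 and 5           */

  /* "+" sums, "##" sums squares, "#" multiplies */
  col_sums  = num_mat[+, ];      /* 5 7 9    */
  col_ssq   = num_mat[##, ];     /* 17 29 45 */
  col_prod  = num_mat[#, ];      /* 4 10 18  */

  /* a single reduction operator collapses the whole matrix */
  grand_mean = num_mat[:];       /* 3.5 */
  grand_ssq  = num_mat[##];      /* 91  */

  print col1_mean row1_mean, col_means, row_means,
        col_sums, col_ssq, col_prod, grand_mean grand_ssq;
quit;
```

Putting the operator in the row position collapses the rows (one value per column), and putting it in the column position collapses the columns (one value per row).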

csr_vec | |||||
---|---|---|---|---|---|

1 | 3 | 6 | 10 | 15 | 21 |

csn_mat | ||
---|---|---|

1 | 3 | 6 |

10 | 15 | 21 |

mnr_vec |
---|

1 |

mnn_mat |
---|

1 |

mxr_vec |
---|

6 |

mxn_mat |
---|

6 |

smr_vec |
---|

21 |

smn_mat |
---|

21 |

ssr_vec |
---|

91 |

ssn_mat |
---|

91 |

x1 |
---|

0.2642335 |

1.0747269 |

0.8179241 |

-0.552775 |

1.5401449 |

-1.233822 |

-0.141535 |

1.0420036 |

0.0657322 |

1.225259 |

-0.148304 |

0.2901233 |

-1.149394 |

-0.482548 |

-0.452974 |

0.2738675 |

-0.224133 |

0.218553 |

-0.420015 |

0.246356 |

x2 |
---|

54.993687 |

58.167325 |

59.147705 |

40.74794 |

45.813645 |

53.460273 |

57.877839 |

51.98273 |

49.875743 |

52.570553 |

54.097005 |

46.936325 |

57.509082 |

50.463228 |

42.775346 |

39.376643 |

53.303455 |

54.494482 |

55.747821 |

44.512206 |

x12 | |
---|---|

0.2642335 | 54.993687 |

1.0747269 | 58.167325 |

0.8179241 | 59.147705 |

-0.552775 | 40.74794 |

1.5401449 | 45.813645 |

-1.233822 | 53.460273 |

-0.141535 | 57.877839 |

1.0420036 | 51.98273 |

0.0657322 | 49.875743 |

1.225259 | 52.570553 |

-0.148304 | 54.097005 |

0.2901233 | 46.936325 |

-1.149394 | 57.509082 |

-0.482548 | 50.463228 |

-0.452974 | 42.775346 |

0.2738675 | 39.376643 |

-0.224133 | 53.303455 |

0.218553 | 54.494482 |

-0.420015 | 55.747821 |

0.246356 | 44.512206 |

x12_cor | |
---|---|

1 | -0.001531 |

-0.001531 | 1 |

x12_cov | |
---|---|

0.5645625 | -0.006864 |

-0.006864 | 35.614684 |

x1_mu |
---|

0.1126712 |

x2_std |
---|

5.967804 |
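Output of the shape shown above can be produced along the following lines. This is a minimal sketch rather than the original listing: the seed and the mean and standard deviation passed for `x2` are assumptions, so the numbers it prints will differ from the tables.

```sas
proc iml;
  call randseed(123);                  /* arbitrary seed (assumption)        */

  x1 = j(20, 1);                       /* 20-by-1 placeholder: 20 draws      */
  x2 = j(20, 1);

  call randgen(x1, "Normal");          /* standard normal sample             */
  call randgen(x2, "Normal", 50, 6);   /* mean 50, sd 6 (assumed parameters) */

  x12     = x1 || x2;                  /* column-wise concatenation: 20-by-2 */
  x12_cor = corr(x12);                 /* 2-by-2 correlation matrix          */
  x12_cov = cov(x12);                  /* 2-by-2 covariance matrix           */
  x1_mu   = mean(x1);                  /* sample mean of x1                  */
  x2_std  = std(x2);                   /* sample standard deviation of x2    */

  print x12_cor, x12_cov, x1_mu, x2_std;
quit;
```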

The random numbers are stored in the `x1` variable, which is first allocated using the `j` function. The number of rows of `x1` is the sample size of the random numbers needed. One can also set `x1` to a row vector, in which case the number of columns is the sample size. The two random samples, `x1` and `x2`, both generated from the same family of distributions (Gaussian/Normal), are then concatenated column-wise (`||`) to form a 20-by-2 matrix, `x12`. Using this new matrix, we can compute the correlation and covariance of the two columns with the `corr` and `cov` functions, respectively. The output above shows a correlation of about -0.0015, so there is almost no linear relation between the two.

SAS can also perform set operations, and it's easy. Consider the following:
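A minimal sketch of the set functions follows. The two input vectors `A` and `B` are assumptions, chosen only because they are consistent with the output printed below; the original inputs are not shown in this post.

```sas
proc iml;
  A = {"x" "i" "a" "o" "m" "i"};   /* assumed input */
  B = {"t" "h" "e" "o" "r" "y"};   /* assumed input */

  B_comp = setdif(A, B);           /* in A but not in B: a i m x    */
  A_comp = setdif(B, A);           /* in B but not in A: e h r t y  */
  AuB    = union(A, B);            /* sorted union, no duplicates   */
  AnB    = xsect(A, B);            /* intersection: o               */
  AB_unq = unique(A || B);         /* sorted distinct elements      */

  print B_comp, A_comp, AuB, AnB, AB_unq;
quit;
```

All of these functions return their result as a sorted row vector with duplicates removed, which is why `AuB` and `AB_unq` print the same values.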

B_comp | |||
---|---|---|---|

a | i | m | x |

A_comp | ||||
---|---|---|---|---|

e | h | r | t | y |

AuB | |||||||||
---|---|---|---|---|---|---|---|---|---|

a | e | h | i | m | o | r | t | x | y |

AnB |
---|

o |

AB_unq | |||||||||
---|---|---|---|---|---|---|---|---|---|

a | e | h | i | m | o | r | t | x | y |

Cumulative probabilities are computed with the `CDF` function, but note that the exponential density in SAS is parameterized by the scale $\beta$:
$$f(x|\beta)=\frac{1}{\beta}\exp\left[-\frac{x}{\beta}\right].$$
So to compute the probability $\mathrm{P}(X\leq 2)$ with $\beta=0.5$, we evaluate the following integral,
$$
\mathrm{P}(X\leq 2)=\int_{0}^{2}\frac{1}{0.5}\exp\left[-\frac{x}{0.5}\right]\operatorname{d}x = 1-\exp(-4) \approx 0.9816844.
$$
To confirm this in SAS, run the following:
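
A minimal sketch of such a call, with the scale $\beta = 0.5$ passed as the third argument of `CDF`:

```sas
proc iml;
  /* P(X <= 2) for an exponential distribution with scale 0.5 */
  px = cdf("Exponential", 2, 0.5);
  print px;
quit;
```
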
px |
---|

0.9816844 |

The density itself is evaluated with the `PDF` function. For example, we can confirm the above probability by numerically integrating the PDF. To do so, run the following:

px |
---|

0.9816844 |
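The integration behind the value above can be sketched with the `QUAD` subroutine, which numerically integrates a user-defined module over an interval. The module name `expo_pdf` is made up for illustration, and this is an assumption about the approach rather than the original listing.

```sas
proc iml;
  /* integrand: exponential density with scale 0.5 */
  start expo_pdf(x);
    return( pdf("Exponential", x, 0.5) );
  finish;

  /* integrate the density from 0 to 2 */
  call quad(px, "expo_pdf", {0 2});
  print px;
quit;
```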

z_a |
---|

-1.644854 |
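The `z_a` value above matches the 5% quantile of the standard normal distribution. A minimal sketch, assuming it was obtained with the `QUANTILE` function:

```sas
proc iml;
  /* the value z with P(Z <= z) = 0.05 for a standard normal Z */
  z_a = quantile("Normal", 0.05);
  print z_a;
quit;
```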

xm_det |
---|

-1 |

xm_inv | ||
---|---|---|

1 | -3 | 2 |

-3 | 3 | -1 |

2 | -1 | 4.441E-16 |

x_evl |
---|

11.344814 |

0.1709152 |

-0.515729 |

x_evc | ||
---|---|---|

0.3279853 | 0.591009 | 0.7369762 |

0.591009 | -0.736976 | 0.3279853 |

0.7369762 | 0.3279853 | -0.591009 |

x_coef |
---|

3 |

-4 |

2 |
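The input matrix for these linear algebra results is not shown. Inverting `xm_inv` by hand gives the symmetric matrix used below, and multiplying it by `x_coef` gives the right-hand side `b`, so both are inferred from the output rather than taken from the original listing. Under that assumption, a minimal sketch is:

```sas
proc iml;
  xm = {1 2 3,
        2 4 5,
        3 5 6};                 /* inferred from xm_inv (assumption)   */
  b  = {1, 0, 1};               /* inferred from xm * x_coef           */

  xm_det = det(xm);             /* determinant: -1                     */
  xm_inv = inv(xm);             /* inverse of xm                       */
  x_evl  = eigval(xm);          /* eigenvalues, largest first          */
  x_evc  = eigvec(xm);          /* eigenvectors, one per column        */
  x_coef = solve(xm, b);        /* solves xm * x_coef = b: 3, -4, 2    */

  print xm_det, xm_inv, x_evl, x_evc, x_coef;
quit;
```

The `4.441E-16` entry in the printed inverse is floating-point round-off; the exact value in that position is 0.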

x_dat |
---|

Acura |

Acura |

Acura |

Acura |

Acura |

Acura |

Acura |

Audi |

Audi |

Audi |

hp_mean |
---|

215.88551 |

Obs | COL1 | COL2 | COL3 |
---|---|---|---|

1 | 1 | 2 | 3 |

2 | 4 | 5 | 6 |
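
Reading a data set into matrices and writing a matrix back out can be sketched as follows. The source data set is an assumption: the makes and the horsepower mean shown above look like `sashelp.cars`, but the original listing is not available, and the output data set name `out_data` is made up for illustration.

```sas
proc iml;
  /* read two variables from a data set into IML matrices */
  use sashelp.cars;                       /* assumed source data set */
    read all var {Make} into make;
    read all var {Horsepower} into hp;
  close sashelp.cars;

  x_dat   = make[1:10];                   /* first ten makes */
  hp_mean = mean(hp);                     /* mean horsepower */
  print x_dat, hp_mean;

  /* write a matrix to a data set; columns default to COL1, COL2, ... */
  num_mat = {1 2 3, 4 5 6};
  create out_data from num_mat;
    append from num_mat;
  close out_data;
quit;

proc print data=out_data;
run;
```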

*I am loving SAS because of IML*. There are still hidden capabilities of this procedure that I would love to explore and share with my readers, so stay tuned. Another great blog about SAS/IML is The DO Loop, whose author, Dr. Rick Wicklin, is also the principal developer of the procedure and of SAS/IML Studio. Do check it out.
