Kim 2 ML

Applied AI notes for data scientists, analysts, and builders

Day 9 (2)

Data Structure

A data structure is the way data is stored in memory. For example, data can be stored in matrix form, such as 5 rows x 3 columns.
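As a minimal sketch of the 5 row x 3 column example, assuming the integers 1 to 15 as placeholder data (the name m is made up for illustration):

```r
# Placeholder data: the integers 1..15, filled column by column (R's default)
m = matrix(1:15, nrow=5, ncol=3)

dim(m)    # 5 3
m[5,3]    # 15, the last element
```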

factor(x,levels,labels) single dimension, categorical values. Kind of like the dimension/fact concept in a relational database: each value is stored as a code that points to one of the levels.

factor(c("a","a","b",1), 
       levels=c("a","b",1), 
       labels=c("c","d","e"))

Result: c c d e, with levels c d e

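What happens here: the data is "a", "a", "b", 1, and c() coerces the 1 to the string "1". The levels parameter fixes the level order as "a", "b", "1", so "1" is placed last instead of first. Each value is then replaced by the label at the same position: "a" becomes "c", "b" becomes "d" and "1" becomes "e".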
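To see the mapping between values, levels and labels, the factor can be inspected piece by piece. This is only a sketch; the variable name f is made up for illustration.

```r
f = factor(c("a","a","b",1),
           levels=c("a","b",1),
           labels=c("c","d","e"))

levels(f)        # "c" "d" "e"
as.integer(f)    # 1 1 2 3, the position of each value in levels
as.character(f)  # "c" "c" "d" "e"
```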

factor(c("a","a","b",1), 
       labels=c("c","d","e"))

Result: d d e c, with levels c d e

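Without the levels parameter, the levels default to the sorted unique values "1", "a", "b" (digits typically sort before letters), so the labels map "1" to "c", "a" to "d" and "b" to "e".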
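A small sketch of that default ordering, assuming a typical locale where digits sort before letters (x and f are made-up names):

```r
x = c("a","a","b",1)

sort(unique(x))    # the default level order, typically "1" "a" "b"

f = factor(x, labels=c("c","d","e"))
levels(f)          # "c" "d" "e"
as.character(f)    # typically "d" "d" "e" "c"
```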


vector(mode, length) with mode such as "logical", "integer", "numeric" or "character": single dimension, single datatype, similar to c(1,2,3)

a = vector(mode="numeric", length=10)
b = c(0,0,0,0,0,0,0,0,0,0)

Result: a and b are identical, both numeric vectors of ten zeros.
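Each mode has its own default fill value. The following sketch shows a few of them and checks that the two ways of building ten zeros agree:

```r
vector(mode="logical", length=3)     # FALSE FALSE FALSE
vector(mode="integer", length=3)     # 0 0 0
vector(mode="character", length=3)   # "" "" ""

a = vector(mode="numeric", length=10)
b = c(0,0,0,0,0,0,0,0,0,0)
identical(a, b)                      # TRUE
```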



matrix(data,nrow,ncol,byrow,dimnames) two dimensions (rows and columns), single datatype

a = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=TRUE,
           dimnames=list(c("rowA","rowB"), 
                         c("colA","colB")))

b = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=FALSE)

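Result: both are 2x2 character matrices, because mixing numbers and strings in c() coerces everything to character. a is filled row by row, so rowA holds "1", "2" and rowB holds "one", "two". b is filled column by column, so its first column holds "1", "2" and its second column holds "one", "two".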
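The single datatype rule and the effect of byrow can be checked directly. A minimal sketch, with v as a made-up name for the combined data and dimnames left out for brevity:

```r
v = c( c(1,2), c("one","two") )
class(v)        # "character": the numbers became "1" and "2"

a = matrix(v, nrow=2, ncol=2, byrow=TRUE)
b = matrix(v, nrow=2, ncol=2, byrow=FALSE)

a[1,]           # "1" "2"    (first row when filled row by row)
b[1,]           # "1" "one"  (first row when filled column by column)
identical(a, t(b))   # TRUE: here byrow=TRUE gives the transpose of byrow=FALSE
```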



array(data,dim,dimnames) any number of dimensions (one, two, three or more), single datatype

array(1:8,
      dim=c(2,2,2),
      dimnames=list(c("rowA","rowB"), c("colA","colB")))

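Result: two 2x2 slices with rows rowA, rowB and columns colA, colB. Slice 1 holds 1, 3 in rowA and 2, 4 in rowB; slice 2 holds 5, 7 in rowA and 6, 8 in rowB. The third dimension has no names because dimnames lists only two sets.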
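The third dimension can be named as well by adding a third entry to dimnames. In this sketch, sliceA and sliceB are hypothetical names that do not appear in the notes above:

```r
# sliceA and sliceB are made-up names for the third dimension
arr = array(1:8,
            dim=c(2,2,2),
            dimnames=list(c("rowA","rowB"),
                          c("colA","colB"),
                          c("sliceA","sliceB")))

dim(arr)                      # 2 2 2
arr["rowA","colB","sliceB"]   # 7
```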


Access data by index

variable_name[row,column,…]

array(1:8,
      dim=c(2,2,2),
      dimnames=list(c("rowA","rowB"), 
                    c("colA","colB"))) [1,2,2]

Result: 7
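Leaving an index empty selects everything along that dimension. A short sketch of the common forms, using an unnamed copy of the same array (arr is a made-up name):

```r
arr = array(1:8, dim=c(2,2,2))

arr[1,2,2]    # 7: row 1, column 2, slice 2
arr[,2,2]     # 7 8: column 2 of slice 2
arr[,,2]      # the whole second slice, a 2x2 matrix
arr[1,,]      # row 1 of every slice, also a 2x2 matrix
```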


Combine data

rbind(variable_name, variable_name)

a = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=TRUE,
           dimnames=list(c("rowA","rowB"), c("colA","colB")))

b = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=FALSE)

r = rbind(a, b)

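Result: a 4x2 character matrix. The rows of b are stacked under the rows of a; the column names colA, colB come from a, and the two rows from b have empty row names.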
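The shape and names of the combined matrix can be inspected as follows. This sketch writes the data as strings directly, which is what c() produces from the mixed values anyway:

```r
a = matrix(c("1","2","one","two"), nrow=2, ncol=2, byrow=TRUE,
           dimnames=list(c("rowA","rowB"), c("colA","colB")))
b = matrix(c("1","2","one","two"), nrow=2, ncol=2, byrow=FALSE)

r = rbind(a, b)
dim(r)         # 4 2
colnames(r)    # "colA" "colB"
r[3,]          # the first row of b: "1" "one"
```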


cbind(variable_name, variable_name)

a = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=TRUE,
           dimnames=list(c("rowA","rowB"), c("colA","colB")))

b = matrix(c( c(1,2),c("one","two") ),
           nrow=2,
           ncol=2,
           byrow=FALSE)

cb = cbind(a, b)  # named cb rather than c so the built-in c() is not masked

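Result: a 2x4 character matrix. The columns of b are placed to the right of the columns of a; the row names rowA, rowB come from a, and the two columns from b have empty column names.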
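The same kind of check works for the column-wise combination. In this sketch the result is stored under the made-up name k:

```r
a = matrix(c("1","2","one","two"), nrow=2, ncol=2, byrow=TRUE,
           dimnames=list(c("rowA","rowB"), c("colA","colB")))
b = matrix(c("1","2","one","two"), nrow=2, ncol=2, byrow=FALSE)

k = cbind(a, b)
dim(k)         # 2 4
rownames(k)    # "rowA" "rowB"
k[,3]          # the first column of b: "1" "2"
```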
