Yasin Senbabaoglu, Ph.D.

What I want for my kid

7/3/2017

Fatherly
What I want for my kids is to experience love in its waxing and waning phases.
I want them to overcome challenges with intelligence and grace.
I want them to adventure through their hearts and minds, uncovering more exquisite treasure within them at every turn.
I want them to live in lifelong curiosity, to hunger for knowledge, to thirst for experience.
I want them to be rooted in empathy and compassion for the world around them.
I want them to be kind and generous, full of humanity.
I want them to want to make a palpable difference to the lives they cross.
I want them to believe in who they are and the power of their tiniest contributions.
I want them to be confident.
I want them to meet fear often, and to find courage on occasion, because that means they’re working outside constant comfort and security.
I want them to explode with ideas and creativity, to explore, to experiment with whatever they can get their hands and minds on.
I want them to take risks, to fall, to fail and to learn what it is to stand up again with scrapes on their knees and scars on their hearts. I want them to believe in the healing power of ‘again’ or ‘next time’.
I want my kids to be secure enough in themselves to go hunting when they’re hungry and to be big enough to share the catch when they make it.

StatQuest: RPKM, FPKM and TPM

1/20/2016

How to use the PAC measure in consensus clustering?

12/6/2015

Every now and then, I get emails asking for an implementation of the PAC (proportion of ambiguous clustering) measure in consensus clustering.

The truth is, this code snippet should have been included in the publication in the first place, so apologies for not having done that. Below is sample code that implements PAC once you have obtained the consensus matrices (M). It uses the ConsensusClusterPlus* package in R to obtain the consensus matrices.
*Wilkerson M and Waltman P (2013). ConsensusClusterPlus: ConsensusClusterPlus. R package version 1.24.0.
########################################################
library(ConsensusClusterPlus)
seed = 11111
set.seed(seed) # make the simulated data reproducible
d = matrix(rnorm(200000,0,1),ncol=200) # 200 samples in columns, 1000 genes in rows
colnames(d) = paste("Samp",1:200,sep="")
rownames(d) = paste("Gene",1:1000,sep="")
d = sweep(d,1, apply(d,1,median,na.rm=T)) # median-center each gene (row)
maxK = 6 # maximum number of clusters to try
results = ConsensusClusterPlus(d,maxK=maxK,reps=50,pItem=0.8,pFeature=1,title="test_run",
                               innerLinkage="complete",seed=seed,plot="pdf")

# Note that we implement consensus clustering with innerLinkage="complete". We advise against using innerLinkage="average", which is the default value in this package, as average linkage is not robust to outliers.

############## PAC implementation ##############
Kvec = 2:maxK
x1 = 0.1; x2 = 0.9 # threshold defining the intermediate sub-interval
PAC = rep(NA,length(Kvec))
names(PAC) = paste("K=",Kvec,sep="") # from 2 to maxK

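for(i in Kvec){
  M = results[[i]]$consensusMatrix # consensus matrix for K = i
  Fn = ecdf(M[lower.tri(M)]) # empirical CDF of the consensus values, each sample pair counted once
  PAC[i-1] = Fn(x2) - Fn(x1) # proportion of pairs with consensus value in (x1, x2]
}#end for i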

# The optimal K
optK = Kvec[which.min(PAC)]
########################################################
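
To see what the PAC number means without running ConsensusClusterPlus, here is a minimal sketch in base R. The helper name pac_score and the two toy 4x4 matrices are made up for illustration and are not part of the package or the publication. The helper applies the same calculation as the loop above to a single consensus matrix.

```r
# Hypothetical helper (not part of ConsensusClusterPlus): PAC for one consensus matrix.
pac_score <- function(M, x1 = 0.1, x2 = 0.9) {
  stopifnot(is.matrix(M), nrow(M) == ncol(M), x1 < x2)
  vals <- M[lower.tri(M)]  # each sample pair once, diagonal excluded
  Fn <- ecdf(vals)         # empirical CDF of the consensus values
  Fn(x2) - Fn(x1)          # fraction of pairs with x1 < value <= x2
}

# Toy consensus matrices with made-up values, for illustration only.
# "clean": two well-separated clusters, every pair is either 0 or 1.
clean <- matrix(c(1, 1, 0, 0,
                  1, 1, 0, 0,
                  0, 0, 1, 1,
                  0, 0, 1, 1), nrow = 4, byrow = TRUE)

# "ambiguous": every pair co-clusters half of the time.
ambiguous <- matrix(0.5, nrow = 4, ncol = 4)
diag(ambiguous) <- 1

pac_clean <- pac_score(clean)          # no pair falls in the intermediate interval
pac_ambiguous <- pac_score(ambiguous)  # every pair falls in the intermediate interval
```

A consensus matrix whose entries are all near 0 or 1 gives a PAC close to 0, while one whose entries sit in the middle gives a PAC close to 1, which is why the K with the lowest PAC is taken as optimal.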

Data scientist DNA

8/11/2012

A collection of very good advice for people thinking about a career as a data scientist.

Kaggle: Making data science a sport

8/11/2012

Greenplum data science blog

8/11/2012

Greenplum is a Big Data analytics company in San Mateo, California.
http://www.greenplum.com/blog/

Data Science Course on Coursera

8/11/2012

https://www.coursera.org/course/datasci

Introduction to Data Science

Bill Howe - University of Washington

Join the data revolution. Companies are searching for data scientists. This specialized field demands multiple skills not easy to obtain through conventional curricula. Introduce yourself to the basics of data science and leave armed with practical experience programming massive databases.
