Fatherly
What I want for my kids is to experience love in its waxing and waning phases. I want them to overcome challenges with intelligence and grace. I want them to adventure through their hearts and minds, uncovering more exquisite treasure within them at every turn. I want them to live in lifelong curiosity, to hunger for knowledge, to thirst for experience. I want them to be rooted in empathy and compassion for the world around them. I want them to be kind and generous, full of humanity. I want them to want to make a palpable difference to the lives they cross. I want them to believe in who they are and the power of their tiniest contributions. I want them to be confident. I want them to meet fear often, and to find courage on occasion, because that means they’re working outside constant comfort and security. I want them to explode with ideas and creativity, to explore, to experiment with whatever they can get their hands and minds on. I want them to take risks, to fall, to fail and to learn what it is to stand up again with scrapes on their knees and scars on their hearts. I want them to believe in the healing power of ‘again’ or ‘next time’. I want my kids to be secure enough in themselves to go hunting when they’re hungry and to be big enough to share the catch when they make it.
Every now and then, I get emails asking for the implementation of the PAC (proportion of ambiguous clustering) in consensus clustering.
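For readers new to the measure, here is a rough sketch of what PAC computes, written to match the sample code in this post. The notation is chosen here for illustration and is an assumption: M_K is the consensus matrix for K clusters, N is the number of samples, and (x_1, x_2) is the intermediate sub-interval of consensus values treated as ambiguous.

```latex
% Empirical CDF of the consensus values over all sample pairs i < j
\mathrm{CDF}_K(x) = \frac{\#\{(i,j) : i < j,\ M_K(i,j) \le x\}}{N(N-1)/2}

% PAC: fraction of sample pairs whose consensus value lies in (x_1, x_2]
\mathrm{PAC}_K = \mathrm{CDF}_K(x_2) - \mathrm{CDF}_K(x_1)

% The optimal number of clusters is the K with the lowest PAC
\hat{K} = \arg\min_{K} \mathrm{PAC}_K
```

A low PAC means that most pairs of samples are either almost always or almost never clustered together across the resampling runs, which is what a stable partition should look like.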
The truth is, a code snippet for this should really have been included in the publication in the first place. So, apologies for not having done that. Below is sample code to implement PAC after you have obtained the consensus matrices (M). This sample code uses the ConsensusClusterPlus* package in R to obtain the consensus matrices.

*Wilkerson M and Waltman P (2013). ConsensusClusterPlus: ConsensusClusterPlus. R package version 1.24.0.

########################################################
library(ConsensusClusterPlus)

seed = 11111
d = matrix(rnorm(200000, 0, 1), ncol = 200) # 200 samples in columns, 1000 genes in rows
colnames(d) = paste("Samp", 1:200, sep = "")
rownames(d) = paste("Gene", 1:1000, sep = "")
d = sweep(d, 1, apply(d, 1, median, na.rm = TRUE)) # median-center each gene

maxK = 6 # maximum number of clusters to try
results = ConsensusClusterPlus(d, maxK = maxK, reps = 50, pItem = 0.8, pFeature = 1,
                               title = "test_run", innerLinkage = "complete",
                               seed = seed, plot = "pdf")
# Note that we implement consensus clustering with innerLinkage="complete".
# We advise against using innerLinkage="average", which is the default value
# in this package, as average linkage is not robust to outliers.

############## PAC implementation ##############
Kvec = 2:maxK
x1 = 0.1; x2 = 0.9 # thresholds defining the intermediate sub-interval
PAC = rep(NA, length(Kvec))
names(PAC) = paste("K=", Kvec, sep = "") # from 2 to maxK
for (i in Kvec) {
  M = results[[i]]$consensusMatrix
  Fn = ecdf(M[lower.tri(M)]) # empirical CDF of the pairwise consensus values
  PAC[i - 1] = Fn(x2) - Fn(x1)
} # end for i

# The optimal K
optK = Kvec[which.min(PAC)]
########################################################

A collection of very good advice for people thinking about going into a career in data science.

https://www.coursera.org/course/datasci
Introduction to Data Science
Bill Howe - University of Washington
Join the data revolution. Companies are searching for data scientists. This specialized field demands multiple skills not easy to obtain through conventional curricula. Introduce yourself to the basics of data science and leave armed with practical experience programming massive databases.