Think Hard, Build Harder

behind those words on the screen you'd better have something that really matters

Data

Notes of [Six types of analyses ... ] by Smith

Johns Hopkins Professor Jeffrey Leek summarized six types of data analyses: Descriptive – descriptive summary of the data, e.g., the mean, standard deviation; Exploratory – “an approach to analyzing data sets to find previously unknown relationships”; Inferential – testing theories…

xiangchen November 26, 2013 Tech No Comments

Response to [DataPlay: Interactive tweaking and example-driven ... ] by Abouzied et al.

One Sentence: This paper presents DataPlay, a system that allows users to directly manipulate a query tree or to specify a subset of data (answers and non-answers) as a way to iteratively formulate a quantified query. More Sentences: Quantified queries…

xiangchen November 24, 2013 HCI No Comments

Response to [Medical case retrieval ...] by Quellec et al.

One Sentence: This paper presents a method of retrieving medical cases with missing attributes and heterogeneous features (semantic + images) using decision trees. Useful Information: Understanding decision trees: each non-terminal node represents a test on a single attribute; each edge represents a…

xiangchen September 5, 2012 Tech No Comments

[HCI Stats] Types of data

Based on Yatani’s wiki, this post introduces the four types of data one will encounter in a statistical analysis. Start with a story: Your team has invented a new kind of interface that gives thermal feedback (i.e., cold, cool, warm,…

xiangchen April 11, 2012 HCI No Comments

Notes of [MapReduce: simplified data ...] by Dean & Ghemawat

1. So! What is MapReduce? MapReduce is a two-step mechanism for manipulating large-scale distributed data. In particular, the ‘map’ step visits the data according to programmer-defined rules, then the ‘reduce’ step collects the intermediate results from ‘map’ and…
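A rough way to picture the two steps is a word count run in a single process. This is only a sketch under stated assumptions: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_mapper`, `count_reducer`, `word_count`) are made up for illustration, nothing here is distributed, and it is not the implementation from the paper.

```python
from collections import defaultdict


def word_mapper(document):
    """Programmer-defined 'map' rule (hypothetical): emit (word, 1) per word."""
    for word in document.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Programmer-defined 'reduce' rule (hypothetical): sum the counts for a key."""
    return sum(values)


def map_phase(documents, mapper):
    """Visit every input record and collect the intermediate (key, value) pairs."""
    for document in documents:
        yield from mapper(document)


def shuffle(pairs):
    """Group intermediate values by key so each key is reduced exactly once."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reduce rule to each key and its grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count(documents):
    """Chain map, grouping, and reduce in a single process (no distribution)."""
    intermediate = map_phase(documents, word_mapper)
    return reduce_phase(shuffle(intermediate), count_reducer)


if __name__ == "__main__":
    print(word_count(["the map step", "the reduce step"]))
```

In a real deployment the map and reduce calls would run on many machines and the grouping step would move data between them; here all three stages are plain function calls so the data flow is easy to follow.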

xiangchen January 28, 2012 Tech No Comments

Notes of [Bigtable: a distributed... ] by Chang et al.

1. So! What is Bigtable? Bigtable is similar to the table concept in a database, but it is deliberately designed for managing large-scale, structured data across distributed storage systems. 2. So! How is it ‘deliberate’? The big table is a multi-dimensional…

xiangchen January 27, 2012 Tech No Comments

Notes of [The Google file system] by Ghemawat et al.

1. What is Google File System (GFS)? Google File System is a scalable distributed file system for large distributed data-intensive applications. (The Google File System demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware) 2. What…

xiangchen January 25, 2012 Tech No Comments

Response to [Studying Software ...] by Lethbridge et al.

GENERAL CITE: This paper offers a comprehensive literature review as well as a valuable taxonomy of data collection techniques for studying software engineering. The taxonomy is primarily based on the degree of human intervention involved in the data collection process.…

xiangchen January 17, 2011 HCI, Tech No Comments