Wednesday, 3. August 2005

Article "A Tribute to J. Bertin's Graphical Data Analysis"

Brief summary of the article by Antoine de Falguerolles:
* The article discusses the original idea of the re-orderable matrix, introduced by Jacques Bertin in 1977 in "La graphique et le traitement graphique de l'information".
* J. Bertin introduced a display and an analysis strategy for multivariate data with low or medium sample size. Note: the techniques require manual interaction by the user and are therefore not applicable to high-dimensional data!
* The tools operate simultaneously on cases and variables, combining aspects otherwise separately encountered in cluster analysis (on cases) and principal component analysis or factor analysis (on variables).
* The data set discussed consists of percentage voting results on 9 distinct referenda (= variables) in 42 counties (= cases).
* A threshold was introduced, and all data points exceeding that threshold were highlighted in the matrix.
* Bertin discussed several actions for re-ordering the original matrix to find an order with homogeneous parts (so-called "patches"). The actions were basically shifting and splitting. To cope with medium-sized data sets, strategies are offered that take the correlation among variables into account as a first step. Nevertheless, the manual interaction remains.
* A quality measure for the re-ordered matrix is a "purity function", which can be defined in several ways that differ greatly in complexity. A simple purity function could be one where each row or column gets a score by summing up all neighboring pairs of that row/column that are in the correct order.
* Falguerolles mentions the possible problem of differing scales among the observed variables (which can be addressed by using ranks or by normalising the data).
* Bertin matrices can be viewed as special parallel coordinate plots [Inselberg 96]. While usual parallel coordinate plots use variables for ordinates, Bertin matrices conventionally operate on the transposed matrix, using cases for ordinates. They are also related to techniques like the biplot [Gabriel 71].
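
To make the threshold, the re-ordering and the purity score a bit more concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the toy data, the threshold of 50 percent, and especially the purity function, which here simply counts neighbouring cells with the same highlight state. It is one possible reading of the "simple purity function" above, not the definition used in the article.

```r
# Hypothetical toy data: 6 cases (rows) x 4 variables (columns), in percent.
votes <- matrix(
  c(70, 20, 65, 30,
    25, 80, 30, 75,
    68, 22, 71, 28,
    30, 77, 20, 81,
    72, 18, 66, 35,
    21, 79, 33, 70),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("case", 1:6), paste0("var", 1:4))
)

# Highlight every cell that exceeds the (assumed) threshold of 50 percent.
highlight <- votes > 50

# Assumed purity function: number of horizontally and vertically neighbouring
# cell pairs that share the same highlight state. Larger homogeneous patches
# give a higher score.
purity <- function(m) {
  row_pairs <- sum(m[, -1, drop = FALSE] == m[, -ncol(m), drop = FALSE])
  col_pairs <- sum(m[-1, , drop = FALSE] == m[-nrow(m), , drop = FALSE])
  row_pairs + col_pairs
}

purity_before <- purity(highlight)

# One manual re-ordering step: move similar cases and variables together.
row_order <- c(1, 3, 5, 2, 4, 6)
col_order <- c(1, 3, 2, 4)
reordered <- highlight[row_order, col_order]
purity_after <- purity(reordered)

# Differing scales could be handled by replacing values with per-variable ranks.
ranked <- apply(votes, 2, rank)
```

In this toy example the thresholded matrix starts as a checkerboard, and the chosen permutation collects the highlighted cells into two blocks, so `purity_after` is larger than `purity_before`.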

All in all, the question remains how to automatically perform a "correct" re-ordering of high-dimensional data.
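
One generic heuristic for an automatic ordering (not taken from the article, and certainly not a guaranteed "correct" answer) is to cluster rows and columns separately and use the dendrogram leaf order, which is also what base R's `heatmap()` does by default. A rough sketch on a hypothetical thresholded matrix:

```r
# Hypothetical thresholded matrix (1 = value exceeds the threshold).
highlight <- matrix(
  c(1, 0, 1, 0,
    0, 1, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 1),
  nrow = 6, byrow = TRUE
)

# Cluster cases (rows) and variables (columns) separately, using the binary
# (Jaccard-type) distance, and take the leaf order of each dendrogram.
row_order <- hclust(dist(highlight, method = "binary"))$order
col_order <- hclust(dist(t(highlight), method = "binary"))$order

# Apply both permutations to obtain the re-ordered matrix.
reordered <- highlight[row_order, col_order]
print(reordered)
```

Whether such an ordering scales to really high-dimensional data, and whether it matches what a human would produce by hand, is exactly the open question.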

iPlots & JGR

iPlots is a package for the R statistical environment (see www.r-project.org) that provides highly interactive statistical graphics, written in Java.
-> http://www.rosuda.org/iPlots/index.shtml

iPlots is packaged together with JGR, a Java-based GUI for R.
-> http://stats.math.uni-augsburg.de/JGR/

iPlots provides interactive scatterplots, histograms and barplots. Interactivity basically means reformatting the graph (rescaling, rotating, ...) and selecting certain data points.

iWidget is also part of JGR and allows adding interactive sliders, buttons, etc. to a plot:
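# create a window and add a graphics area to it
w <- iwindow()
add(w, igraphics())
# draw the initial density estimate of 100 standard-normal samples
a <- rnorm(100)
plot(density(a))
# the slider's handler redraws the density with bandwidth = slider value / 300
islider(1, window=w, handler=function(h,...)
    plot(density(a, bw=get.value(h$obj)/300)))
# make the window visible
visible(w,TRUE)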
More interesting software packages can be found here:
-> http://www.rosuda.org/software/

Mondrian

Exploratory data analysis with focus on large data and databases.

Mondrian is a statistical data-visualization system written in Java. The main emphasis of Mondrian is on visualization techniques for categorical data, geographical data and large data.

-> http://www.rosuda.org/Mondrian/

Available plots: mosaicplots, barcharts, maps, parallel coordinates, boxplots, scatterplots, histograms.

Interesting technique: using semi-transparency to deal with large data sets
-> http://www.rosuda.org/Mondrian/Mondrian.html#alpha
Interactive highlighting of several data points (i.e. lines) is possible.
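
The semi-transparency idea can be tried out in plain R as well. The following is only a sketch with base graphics standing in for Mondrian (it does not use Mondrian itself); the random data, the number of cases and the alpha value of 0.05 are assumptions chosen for illustration.

```r
# Hypothetical data: 5000 cases, 3 variables.
set.seed(42)
n <- 5000
dat <- cbind(a = rnorm(n), b = rnorm(n), c = rnorm(n))

# Nearly transparent black: where many lines overlap, the plot gets darker.
line_col <- rgb(0, 0, 0, alpha = 0.05)

# Rescale every variable to [0, 1] so that all axes share one vertical scale.
scaled <- apply(dat, 2, function(x) (x - min(x)) / (max(x) - min(x)))

# Parallel coordinates: one polyline per case across the variable axes.
matplot(t(scaled), type = "l", lty = 1, col = line_col,
        xaxt = "n", xlab = "", ylab = "scaled value")
axis(1, at = seq_len(ncol(scaled)), labels = colnames(scaled))
```

With fully opaque lines the same plot would be a solid black band; with a low alpha the dense regions stay visible. Note that the graphics device has to support transparency.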
