What is Gauguin? 

GAUGUIN (Grouping And Using Glyphs Uncovering Individual Nuances) is a project for the interactive visual exploration of multivariate data sets, developed for use on all major platforms (Windows, Linux, Mac). It supports a variety of methods for displaying flat-form data and hierarchically clustered data.
Glyphs are geometric shapes scaled by the values of dataset variables. They may be drawn for individual cases or for averages of groups or clusters of cases. GAUGUIN offers four different glyph shapes (but more could be added). The number of data elements which can be displayed simultaneously is limited, because each glyph requires a minimum amount of screen space to be viewed, but hierarchical glyphs can be drawn for groups of cases. Hierarchical glyphs are composed of a highlighted case representing the group and a band around it showing the variability of all the members of the cluster.
GAUGUIN also provides scatterplots and tableplots, and via Rserve is able to use R to calculate MDS views and clusters for the data. All GAUGUIN displays are linked interactively and can be directly queried.
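The hierarchical glyph described above can be sketched in a few lines of Python. This is only an illustration, not GAUGUIN's actual code: the function name summarize_group is hypothetical, and choosing the member closest to the group mean as the highlighted case, and the per-variable minimum and maximum as the band, are assumptions.

```python
def summarize_group(cases):
    """Summarise a group of cases for a hierarchical glyph.

    cases: list of equal-length numeric tuples, one tuple per case.
    Returns (representative_case, band), where band holds one
    (minimum, maximum) pair per variable.
    """
    n_vars = len(cases[0])
    means = [sum(c[j] for c in cases) / len(cases) for j in range(n_vars)]

    # Assumption: the highlighted case is the member closest to the mean.
    def sq_dist_to_mean(c):
        return sum((c[j] - means[j]) ** 2 for j in range(n_vars))

    representative = min(cases, key=sq_dist_to_mean)

    # Assumption: the band spans the full range of each variable.
    band = [(min(c[j] for c in cases), max(c[j] for c in cases))
            for j in range(n_vars)]
    return representative, band
```

The representative case would be drawn as the highlighted glyph, and the band values would set the inner and outer edges of the surrounding band.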

Features 

The variable window is the most important control tool. Here the user can enable and disable variables, delete them completely, or select and replace them. The user can also decide which variables are included in (or excluded from) the multivariate distance calculations that are used for highlighting neighbours in the grid plot.
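To make the role of the included variables concrete, here is a minimal sketch of a neighbour search. It assumes a Euclidean distance on range-scaled variables; the distance GAUGUIN really uses is not stated here, and the name nearest_neighbours and its parameters are hypothetical.

```python
import math


def nearest_neighbours(cases, selected, included, k):
    """Return the indices of the k cases closest to cases[selected].

    cases:    list of numeric tuples, one per case
    selected: index of the case whose neighbours are wanted
    included: indices of the variables that enter the distance
    """
    # Assumption: each variable is scaled to [0, 1] so that no single
    # variable dominates the distance because of its units.
    lo = {j: min(c[j] for c in cases) for j in included}
    hi = {j: max(c[j] for c in cases) for j in included}

    def scaled(c, j):
        span = hi[j] - lo[j]
        return (c[j] - lo[j]) / span if span else 0.0

    def dist(a, b):
        return math.sqrt(sum((scaled(a, j) - scaled(b, j)) ** 2
                             for j in included))

    others = [i for i in range(len(cases)) if i != selected]
    return sorted(others, key=lambda i: dist(cases[selected], cases[i]))[:k]
```

Excluding a variable simply removes its index from included, which can change which cases count as neighbours.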


GRID

The glyphs are arranged on a grid and can be ordered by any variable selected by the user.

 



SCATTERPLOTS

The scatterplot matrix displays all N*N pairwise scatterplots of variables selected from the list in the main window. Clicking on one plot zooms in on it to give a detailed view with points displayed by their glyphs. Clicking the back button then returns the user to the scatterplot matrix display.




MDS

Multidimensional scaling (MDS) offers an approximate spatial representation of the data that can facilitate interpretation and reveal relationships. It displays differences between items as if they were points on a map: the greater the distance, the more different the items are.



CENTER

This is a special MDS representation, where the currently selected glyphs are displayed in the center of the map.

 



GROUPING

The data can be grouped by a given variable. There are two options:
  • Grouping by radius: cases are grouped by their value on the selected variable,
    so that the variable in each group is within the specified radius.
    The number of groups depends on the radius.
  • Grouping by count: cases are put into groups of equal size based on
    their ranking for the selected variable.
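The two grouping options can be sketched as follows. This is one possible reading, not GAUGUIN's implementation: for grouping by radius it is assumed that every group must fit into an interval of width 2 * radius, and for grouping by count it is assumed that the count is the number of groups. Both function names are hypothetical.

```python
def group_by_radius(values, radius):
    """Greedy grouping on one variable.

    Assumption: walking through the cases in ascending order, a new
    group is started as soon as a value lies more than 2 * radius
    above the first value of the current group.
    Returns a list of groups, each a list of case indices.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = []
    start = None
    for i in order:
        if start is None or values[i] - start > 2 * radius:
            groups.append([])
            start = values[i]
        groups[-1].append(i)
    return groups


def group_by_count(values, n_groups):
    """Rank the cases and cut the ranking into n_groups parts.

    Group sizes differ by at most one when the number of cases is
    not a multiple of n_groups.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    return [order[k * n // n_groups:(k + 1) * n // n_groups]
            for k in range(n_groups)]
```

With grouping by radius the number of groups falls out of the data, whereas grouping by count fixes it in advance.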



CLUSTERING

The data can be hierarchically clustered to the specified number of clusters. The method is based on the functions hclust and kmeans in R's stats package and supports all of these functions' options.





TABLEPLOT

A special view of the data, where the columns represent variables and rows represent one or several adjacent cases (the group size can be set by the user). By clicking on a column header, the data are sorted by the variable in that column. This view supports selection, randomisation of data, different views of columns, zooming and manipulation of data.
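How the rows of such a view could be derived is sketched below. The sketch assumes that each row shows the mean of its adjacent cases; the source does not say how cases are aggregated, and the name tableplot_rows is hypothetical.

```python
def tableplot_rows(cases, sort_var, group_size):
    """Sort the cases by one variable and aggregate adjacent cases.

    cases:      list of numeric tuples, one per case
    sort_var:   index of the variable (column) to sort by
    group_size: number of adjacent cases per row
    Returns one list of per-variable means for each row.
    """
    ordered = sorted(cases, key=lambda c: c[sort_var])
    rows = []
    for start in range(0, len(ordered), group_size):
        chunk = ordered[start:start + group_size]
        # Assumption: a row is summarised by the mean of each variable.
        rows.append([sum(c[j] for c in chunk) / len(chunk)
                     for j in range(len(chunk[0]))])
    return rows
```

With group_size set to 1 every case gets its own row; the last row may cover fewer cases when the number of cases is not a multiple of group_size.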





GROUPSPLOT

In this display there is one row for each group (chosen by Grouping) or for each cluster (determined by Clustering) and one column for every selected variable. Continuous variables are represented by histograms and categorical variables by barcharts. All plots in the same column are drawn on a common scale.
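The common scaling of one column can be illustrated as follows. The sketch assumes equal-width bins over the pooled range of the variable and a shared maximum count for the vertical axis; the name common_scaled_histograms and the parameter n_bins are hypothetical.

```python
def common_scaled_histograms(groups, n_bins):
    """Histogram one continuous variable per group on a common scale.

    groups: list of lists, each holding one group's values
    Returns (counts, top): counts has one list of bin counts per group
    and top is the largest bin count, shared as the vertical limit.
    """
    all_values = [v for g in groups for v in g]
    lo, hi = min(all_values), max(all_values)
    # Guard against a zero bin width when all values are equal.
    width = (hi - lo) / n_bins or 1.0

    counts = []
    for g in groups:
        hist = [0] * n_bins
        for v in g:
            # The maximum value falls into the last bin.
            b = min(int((v - lo) / width), n_bins - 1)
            hist[b] += 1
        counts.append(hist)

    top = max(max(h) for h in counts)
    return counts, top
```

Because the bins and the vertical limit are shared, bar heights can be compared directly between the rows of a column.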



Downloads

Windows (exe-file)
UNIX (JAR-file)
Mac OS X (zipfile of application)

Data 

Gauguin supports the standard ASCII data format: a header line of variable names followed by one line per case, with tab-delimited columns.

EXAMPLE:

Vehicle Name            Type   Drive  Dealer Cost (USD)  Engine Size (liters)
Chevrolet Aveo 4dr      Sedan  front  10965              1.6
Jaguar X-Type 2.5 4dr   Sedan  AWD    27355              2.5
Mercedes-Benz C240 4dr  Sedan  AWD    31187              2.6
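A file in this format can be checked outside GAUGUIN with a few lines of Python. The sketch below uses only the standard csv module and the example rows shown above; the helper names read_table and to_number are hypothetical.

```python
import csv
import io

# The example above as tab-delimited text.
SAMPLE = (
    "Vehicle Name\tType\tDrive\tDealer Cost (USD)\tEngine Size (liters)\n"
    "Chevrolet Aveo 4dr\tSedan\tfront\t10965\t1.6\n"
    "Jaguar X-Type 2.5 4dr\tSedan\tAWD\t27355\t2.5\n"
    "Mercedes-Benz C240 4dr\tSedan\tAWD\t31187\t2.6\n"
)


def read_table(text):
    """Split tab-delimited text into a header and a list of rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip empty lines
    return header, rows


def to_number(cell):
    """Convert a cell to float if possible, otherwise keep the text."""
    try:
        return float(cell)
    except ValueError:
        return cell


header, rows = read_table(SAMPLE)
```

Cells that convert to a number can be treated as continuous values, while the remaining cells (such as Type and Drive) stay as text.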

Conventions 

    Main Window:

  • Click and drag -> create a selection
  • Del -> delete selected variables
  • Select and drag -> swap selected variables

    All Plots:

  • Click and drag -> create a selection
  • Ctrl and mouse-over -> query object

    Grouping and Clustering Plots:

  • Click and ctrl -> create a window containing approximated glyphs
  • Cursor-up and Cursor-down ->
    specify the transparency of cluster colors.

    Tableplot:

  • Click and ctrl, or right mouse button click (if available) -> PopupMenu

Contact 

Gauguin is a project of the Department of Computer Oriented Statistics and Data Analysis (COSADA) at the University of Augsburg, Germany.


Implementation by Alexander Gribov