How would Pinpoint work?

The envisioned architecture of Pinpoint consists of two layers: a data-mining engine and the visualization, i.e., the user interface illustrated in the demonstrator. The two layers exchange only two matrices, which hold pairwise numeric data on subject distance and communicative distance between all people in the organization.

Subject distance refers to how closely related people are in terms of work content and professional interests. The data-mining engine computes a subject distance matrix by semantic clustering and analysis of documents, database entries and other manifest traces of what people actually do.

Examples of data sources include project documents; meeting notes; assignment of people to projects; role and organizational structure; intranet search and browsing patterns; assignment of activity-specific system privileges; data on competence development and other formal HR information.

The data-mining engine would use techniques such as latent semantic indexing, multidimensional scaling or self-organizing maps combined with genre-specific heuristics to construct clusters and tags, and to compute subject distances between people.
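As a rough illustration of how text could be turned into pairwise subject distances, the following sketch uses plain TF-IDF weighting and cosine distance. This is a deliberately simpler stand-in for latent semantic indexing, not the technique Pinpoint would necessarily use. The function names and the idea of concatenating each person's documents into one string are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs_by_person):
    """Map each person to a TF-IDF weighted term vector (dict: term -> weight)."""
    term_counts = {
        person: Counter(text.lower().split())
        for person, text in docs_by_person.items()
    }
    n = len(term_counts)
    # Number of people whose text mentions each term.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    vectors = {}
    for person, counts in term_counts.items():
        vectors[person] = {
            # Smoothed inverse document frequency, so rare terms count more.
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        }
    return vectors


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors, clamped at 0."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (norm_a * norm_b))


def subject_distance_matrix(docs_by_person):
    """Return (people, matrix) where matrix[i][j] is the subject distance."""
    people = sorted(docs_by_person)
    vectors = tfidf_vectors(docs_by_person)
    matrix = [
        [cosine_distance(vectors[a], vectors[b]) for b in people]
        for a in people
    ]
    return people, matrix
```

A full engine would add dimensionality reduction and genre-specific heuristics on top of this; the sketch only shows the step from text to a symmetric distance matrix.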

The tags chosen by the user to present herself in the system are weighted more heavily than the system-generated tags, and used to refine the subject distance matrix for the visualization.
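One way to picture this weighting is a weighted Jaccard distance over tags, blended with a distance already derived from documents. The weight of 3.0 for user-chosen tags, the blend factor, and all function names are hypothetical values chosen for the sketch, not figures from the Pinpoint concept.

```python
def weighted_tags(system_tags, user_tags, user_weight=3.0):
    """Combine tags into a dict (tag -> weight); user-chosen tags weigh more."""
    weights = {tag: 1.0 for tag in system_tags}
    for tag in user_tags:
        weights[tag] = user_weight  # assumed weight, overrides the system weight
    return weights


def tag_distance(tags_a, tags_b):
    """Weighted Jaccard distance between two tag -> weight dicts."""
    all_tags = set(tags_a) | set(tags_b)
    if not all_tags:
        return 1.0
    overlap = sum(min(tags_a.get(t, 0.0), tags_b.get(t, 0.0)) for t in all_tags)
    total = sum(max(tags_a.get(t, 0.0), tags_b.get(t, 0.0)) for t in all_tags)
    return 1.0 - overlap / total


def refine_distance(base_distance, tags_a, tags_b, blend=0.5):
    """Blend a document-based distance with the tag distance (assumed 50/50)."""
    return (1.0 - blend) * base_distance + blend * tag_distance(tags_a, tags_b)
```

Because a shared user-chosen tag contributes three times as much overlap as a shared system tag, two people who both describe themselves with the same tag move closer together than two people who merely share a generated one.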

Communicative distance is computed by the data-mining engine in a similar way, based on data sources such as email and IM archives, address books, buddy lists and internal phone logs.
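A minimal sketch of such a computation might count exchanges between each pair of people and map frequent contact to a small distance. The input format (sender and recipient pairs only, no message contents) and the formula 1 / (1 + count) are assumptions made for illustration.

```python
from collections import Counter


def communicative_distance_matrix(people, messages):
    """Build a symmetric distance matrix from (sender, recipient) pairs.

    Distance is 1 / (1 + number of exchanges), so pairs with no contact
    get 1.0 and frequent contact approaches 0. The diagonal is 0.
    """
    counts = Counter()
    for sender, recipient in messages:
        if sender != recipient:
            # frozenset makes the pair unordered: direction is ignored.
            counts[frozenset((sender, recipient))] += 1
    size = len(people)
    matrix = [[0.0] * size for _ in range(size)]
    for i, a in enumerate(people):
        for j, b in enumerate(people):
            if i != j:
                matrix[i][j] = 1.0 / (1.0 + counts[frozenset((a, b))])
    return matrix
```

A real engine would probably also weight channels differently and discount old exchanges; both are left out here.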

The interface between visualization and data-mining engine is strictly limited to the two numeric matrices; there is no way to access documents, communication contents and other raw data through the visualization. The reason for this policy is to uphold a reasonable level of personal integrity.
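To make the boundary concrete, the hand-over from engine to visualization could be modelled as an immutable record that carries nothing but the two matrices. The class name, the function name, and the accompanying list of person identifiers (needed to label rows and columns) are assumptions of this sketch, not part of the concept as described.

```python
from dataclasses import dataclass
from typing import Tuple

Matrix = Tuple[Tuple[float, ...], ...]


@dataclass(frozen=True)
class PinpointMatrices:
    """Everything the visualization is allowed to see (hypothetical type)."""
    people: Tuple[str, ...]   # row/column labels; an assumption of this sketch
    subject: Matrix           # pairwise subject distances
    communicative: Matrix     # pairwise communicative distances

    def __post_init__(self):
        n = len(self.people)
        for matrix in (self.subject, self.communicative):
            if len(matrix) != n or any(len(row) != n for row in matrix):
                raise ValueError("each matrix must be n x n for n people")


def export_for_visualization(people, subject, communicative):
    """Copy the engine's results into plain immutable tuples of floats."""
    def freeze(matrix):
        return tuple(tuple(float(x) for x in row) for row in matrix)
    return PinpointMatrices(tuple(people), freeze(subject), freeze(communicative))
```

Since the record holds only copied numbers and no references to documents or messages, code on the visualization side has no path back to the raw data.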