Finding patterns in data


PATN grew out of research at CSIRO into finding patterns in ecological data. Traditionally, PATN has comprised four basic types of analytical tool:

  • Clustering or classification
  • Ordination
  • Networks
  • Evaluation methods

While all statistical packages offer some techniques in most of these areas, some of the early innovative developments took place in Australia, in botany. Dr Godfrey Lance was the Director of the CSIRO Division of Computing Research and Professor Bill Williams was a research scientist with the CSIRO Division of Plant Industry. Between them, they developed a range of new classification techniques that pioneered what was then known as pattern analysis. Years later, Dan Faith, Peter Minchin and I advanced the understanding of association measures and ordination. The result of this work was originally called NTP (Numerical Taxonomy Package) and subsequently PATN. It was originally a computer program designed for research: to seek new, more effective algorithms and strategies for finding patterns in complex data.

The DOS versions of PATN have a comprehensive range of options because each needed to be tested for effectiveness. Most of the options would rarely be used. Subsequent investigations suggested that many should never be used! The philosophy behind the DOS version was modularity: it had to be easy to add modifications and new options in a research environment. Another key aim was to get PATN users to think about why they were selecting certain options, moving away from the 'black box' approach to pattern analysis that prevailed at the time. PATN also had a comprehensive range of data manipulation options. With all the options available, there were thousands of potential pathways through a typical analysis.

With the Windows version of PATN, the philosophy has changed radically. I have used my 25 years of experience in pattern analysis to establish a far smaller suite of robust pattern analysis options. The new version examines your data on import and designs an appropriate suite of default analysis options and an analysis scenario. A typical analysis that might once have taken hours of working through many steps is now done in seconds. Clustering, ordination and network components are now all done together (on rows or columns of the Data Table) and can be presented to you in ways that make the results easier to understand. Most of your time should now be spent working with the Ordination Plot. It is here that most of the results can be interactively displayed and interpreted.

Additions to PATN (for Windows™) will be made only when extensive research and user feedback demonstrate significant improvements.

Lee Belbin