Phillip Pearson - web + electronics notes

tech notes and web hackery from a new zealander who was vaguely useful on the web back in 2002 (see: python community server, the blogging ecosystem, the new zealand coffee review, the internet topic exchange).

2006-2-16

Notes from DAS2006

I just got back today from the 7th IAPR International Workshop on Document Analysis Systems (proceedings), held in Nelson from 13-15 Feb.

The presentations were all about document or image analysis, but the heavy use of AI techniques could make some of it relevant to what I work on these days.

Some of the interesting people I met or caught up with:

Projects I should take a look at:

Techniques I should learn (or re-learn):

  • Gabor filters
  • Hidden Markov models
  • Standard classifiers: NNC (nearest neighbour), LDC (linear discriminant)
  • Analytical segmentation
  • Dynamic programming
  • Viterbi algorithm
  • Dynamic time warping
  • RAST algorithm for alignment
  • X-tree spatial indexing algorithm
  • Affine invariants
  • Gaussian mixture models
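
As a quick refresher on two of the items above, here is a minimal sketch of dynamic time warping, which is itself a small exercise in dynamic programming. The function name dtw_distance and the absolute-difference cost are my own assumptions for illustration, not anything presented at the workshop.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Hypothetical helper for illustration: the local cost is the absolute
    difference, and each step may advance in a, in b, or in both.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a[i-1] also matches the previous b element
                cost[i][j - 1],      # b[j-1] also matches the previous a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) comes out as zero, because the repeated 2 can be absorbed by stretching the shorter sequence.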

Things that should exist:

  • A better browser for mailing lists: one that looks at the message content, tries to figure out what's going on, and presents more statistics in the list view to help you find interesting messages.
  • A browser for academic papers with tagging so you can collect together papers on a very specific subject without prejudicing the normal categorisation.
  • Realtime image stitching - build a panorama out of a video. (Existing: traffic monitoring.)
  • Connected component analysis on colour images.

Free OCR

Just so I don't forget - found this free OCR engine on SourceForge, a "commercial quality OCR engine originally developed at HP between 1985 and 1995".

Rumour has it that Google will be developing open source OCR soon...