myelin: urlstemmer

A library for turning weblog archive URLs into root URLs

Consider this alpha for now. It is not guaranteed to turn every archive URL into a root URL, and it may get some completely wrong. I'm releasing it into the wild so other people can try it and tell me where it works and where it doesn't. Comments here please.

v0.01

Save this link as urlstemmer.py.

To use it:

import urlstemmer
# Stem an archive URL down to the weblog's root URL.
print(urlstemmer.stem("http://www.pycs.net/0001234/2004/02/02.html#a320"))
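To give a feel for what stemming an archive URL involves, here is a minimal sketch. It is not urlstemmer's actual code: the function name stem_sketch and the three heuristics (stop at a year-like path segment, at the word "archive" or "archives", or at a page filename) are assumptions made up for illustration, and the real library may use different rules and return different results.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical heuristics for illustration only; not taken from urlstemmer.
YEAR = re.compile(r"^(19|20)\d{2}$")
ARCHIVE_WORDS = {"archive", "archives"}
PAGE = re.compile(r"\.(html?|php|asp|xml)$", re.IGNORECASE)


def stem_sketch(url):
    """Guess a weblog's root URL from one of its archive URLs."""
    parts = urlsplit(url)
    kept = []
    for segment in parts.path.split("/"):
        if not segment:
            continue  # skip empty pieces from leading/trailing slashes
        if (YEAR.match(segment)
                or segment.lower() in ARCHIVE_WORDS
                or PAGE.search(segment)):
            break  # treat this segment and everything after it as archive structure
        kept.append(segment)
    root_path = "/" + "/".join(kept)
    if not root_path.endswith("/"):
        root_path += "/"
    # Drop the query string and fragment; they only point inside a single page.
    return urlunsplit((parts.scheme, parts.netloc, root_path, "", ""))


# Prints http://www.pycs.net/0001234/ (the sketch's guess, not necessarily urlstemmer's)
print(stem_sketch("http://www.pycs.net/0001234/2004/02/02.html#a320"))
```

Rules this simple will misfire on plenty of real weblogs, for example one whose root path itself contains a year, which is exactly why feedback on real URLs is useful.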