Archive search

19 May 2006

Zane H. Healy wrote:
...
> He would like some
> volunteers to hack the python code to be more efficient or something,
> but I don't know python.
Sounds more like it needs rewriting in a more efficient language to me (isn't
Python interpreted? Probably not a good choice for search indexing!)
I can't imagine the indexing code's *that* complicated - I expect it's the
search side and how to quickly find results in the masses of data that tends
to be the tricky part.
...
> "Hello, World" to run on my VMS server).  It sounds to me like this
> is a good candidate for running either once a week, or once a month.
Hmm, isn't the dictionary (mapping words to codes) static and built up from
existing archives? 'new' words found subsequently will get longer codes
assigned to them and be less efficient, but if the initial sample data is
large it won't necessarily matter. Beats rebuilding the dictionary and
re-assigning codes all the time, assuming that's what's done at the moment.
If the dictionary is static then files can be indexed as they arrive rather
than the whole archive needing scanning every x hours to keep indexes in sync.
(This was the assumption I was basing some desktop search code on which I was
writing - but that's another one of those half-done projects that's sitting a
way down on the priority list to complete right now. I found myself not being
able to find anything in my local archives and couldn't find anything
available on the 'net to do the job.)
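
A rough sketch of the static-dictionary idea, in Python since that's what the existing code uses. This is only an illustration under my own assumptions, not the archive's actual indexer: the names `build_dictionary` and `IncrementalIndex` are made up, words are split on whitespace, and plain integers stand in for variable-length codes (a smaller number meaning a shorter code).

```python
from collections import Counter, defaultdict


def build_dictionary(sample_docs):
    """Build the word -> code mapping once, from an initial sample of documents.

    The most frequent words get the smallest codes.
    """
    counts = Counter(word for doc in sample_docs for word in doc.lower().split())
    # Descending frequency, then alphabetical so the result is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: code for code, word in enumerate(ordered)}


class IncrementalIndex:
    """Index documents one at a time against a dictionary that is never rebuilt."""

    def __init__(self, dictionary):
        self.codes = dict(dictionary)     # word -> code; existing codes never change
        self.postings = defaultdict(set)  # code -> ids of documents containing the word

    def add_document(self, doc_id, text):
        for word in text.lower().split():
            # A word not seen in the sample gets the next free (larger) code.
            code = self.codes.setdefault(word, len(self.codes))
            self.postings[code].add(doc_id)

    def search(self, word):
        code = self.codes.get(word.lower())
        if code is None:
            return []
        return sorted(self.postings.get(code, ()))


# Hypothetical usage: two sample messages seed the dictionary,
# then new messages are indexed as they arrive.
index = IncrementalIndex(build_dictionary(["the pdp 11 boots", "the vax boots vms"]))
index.add_document("msg-1", "The VAX runs VMS")
index.add_document("msg-2", "Python on VMS")
print(index.search("vms"))
```

Adding a message only touches the postings for the words in that message, so nothing already indexed needs rescanning. The cost is that late-arriving words keep their larger codes for good.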
cheers
Jules
