The Great ListServ Archives Project

Scot Kamins kamins at dogeared.com
Tue Jan 10 19:33:50 EST 2006


ML Folks,

The ModLib archives are filled with great discussions on ML topics,  
and contain a wealth of information on specific titles. But finding  
and reading information in any of the archives can be challenging (as  
anyone who has looked can tell you). This is true for a variety of  
reasons:

* Each archived message comes complete with long and tedious header  
information having almost nothing to do with the subject matter.
* The Subject field can be hard to find, and when you do find it,  
what it says might have little or nothing to do with the  
contents.
* Many messages have lots of odd HTML code following them.
* Many messages have redundant contextual quotes in them referring to  
past posts.
* Posts on the same topic aren't threaded together.

Dogeared (http://www.dogeared.com) & Amenities  
(http://www.mlcollect.com) are teaming up to organize and present the  
archives in a much more accessible form. Here's what we're working on:

* Stripping the files of meaningless header data and HTML code
* Renaming posts as necessary to reflect their contents
* Organizing posts into meaningful threads in a meaningful order
* Moving the threaded posts into a searchable database
* Creating a table of contents of threads
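
For the curious, the first few steps above can be sketched in a few lines of Python. This is only an illustration, not the tooling we are actually using: it assumes each archived message is plain RFC 822-style text (headers, a blank line, then the body), and it groups posts by a normalized Subject line, which is a stand-in for the hand-threading described above. The names clean_message, thread_key, and build_threads are made up for this example.

```python
import re
from collections import defaultdict
from email import message_from_string
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text content of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text):
    """Return text with HTML tags removed (entities are decoded)."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def clean_message(raw):
    """Return (subject, body) without headers, HTML tags, or quoted lines."""
    msg = message_from_string(raw)
    subject = msg.get("Subject", "").strip()
    if msg.is_multipart():
        # Assumption: the first non-container part holds the readable text.
        part = next(p for p in msg.walk() if not p.is_multipart())
        body = part.get_payload()
    else:
        body = msg.get_payload()
    body = strip_html(body)
    # Drop redundant contextual quotes, i.e. lines starting with ">".
    lines = [ln for ln in body.splitlines() if not ln.lstrip().startswith(">")]
    return subject, "\n".join(lines).strip()


def thread_key(subject):
    """Normalize a subject so 'Re:' and 'Fwd:' replies group with the original."""
    stripped = re.sub(r"^(\s*(re|fwd?)\s*:\s*)+", "", subject, flags=re.I)
    return stripped.strip().lower()


def build_threads(raw_messages):
    """Group cleaned message bodies by normalized subject, in input order."""
    threads = defaultdict(list)
    for raw in raw_messages:
        subject, body = clean_message(raw)
        threads[thread_key(subject)].append(body)
    return dict(threads)
```

Feeding build_threads a list of raw messages returns a dictionary that maps each normalized subject to the cleaned bodies posted under it. Subject matching alone will miss posts whose subject lines don't reflect their contents, which is why renaming posts is on the list above.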

The database and the interface to it (which will evolve as we learn  
more about what works best) will live on Amenities. Of course, the  
original archives in their raw form will still be available at the  
ListServ host site. (We have nothing to do with that.)

As far as possible, threads will have links to related material on  
Dogeared.

As you can imagine, this work will take a while. We plan to publish  
the work one year of archives at a time, beginning with the first year  
(March-December 2001). We have no projected date for the completion  
of this first stage.


Your comments and suggestions on this project are welcome.

- Scot Kamins & Ron Holl



