user:raphael_wimmer:projects

random thoughts that are not yet ready for prime time…

Idea: We need raw website data so that we can adjust the indexing algorithms to the task at hand. For example, one might be interested in certain punctuation characters, text encodings, RSS feeds, etc. Therefore, we (i.e., science) need to run our own search engine that saves the raw pages (also important for reproducibility!) and runs different indexers on this data.
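A rough sketch of how this could look, just to make the idea concrete. Everything here is an assumption for illustration: the names `RawPageStore`, `word_indexer`, `punctuation_indexer` and `build_index` are hypothetical, the store is a plain in-memory dict rather than a real archive, and fetching pages is left out entirely. The point is only that the raw bytes are kept untouched and each indexer is a separate function run over the same data.

```python
import hashlib
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, Set, Tuple


class RawPageStore:
    """Hypothetical in-memory archive: URL -> raw bytes exactly as fetched."""

    def __init__(self) -> None:
        self._pages: Dict[str, bytes] = {}

    def save(self, url: str, raw: bytes) -> str:
        # Keep the unmodified bytes; the digest lets others verify
        # that they are working on the same snapshot.
        self._pages[url] = raw
        return hashlib.sha256(raw).hexdigest()

    def items(self) -> Iterable[Tuple[str, bytes]]:
        return self._pages.items()


# An indexer maps the raw bytes of one page to a set of index keys.
Indexer = Callable[[bytes], Set[str]]


def word_indexer(raw: bytes) -> Set[str]:
    # Assumes UTF-8; undecodable bytes are replaced, not dropped.
    text = raw.decode("utf-8", errors="replace").lower()
    return set(re.findall(r"[a-z0-9]+", text))


def punctuation_indexer(raw: bytes) -> Set[str]:
    # Task-specific indexer: only a few punctuation characters.
    text = raw.decode("utf-8", errors="replace")
    return set(re.findall(r"[!?;:]", text))


def build_index(store: RawPageStore, indexer: Indexer) -> Dict[str, Set[str]]:
    # Inverted index: key -> set of URLs whose raw page contains it.
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, raw in store.items():
        for key in indexer(raw):
            index[key].add(url)
    return dict(index)
```

Since the raw pages stay in the store, a new indexer (say, one for text encodings or RSS feed links) can be added later and run over old snapshots without re-crawling.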

Resources:

Fog + multiple lights
