site_index - Multi-page HTML site index generator for multiple domains

	SiteIndex

What is it?

A multi-page HTML site index generator that easily handles virtual domains.

Features:

Automatically generates site index based on the file/folder hierarchy
Handles multiple virtual domains
Splits site index into multiple pages to limit links-per-page
Indented text for different levels of hierarchy
Simple, small, and easy to modify or improve.
Valid HTML (4.01 Transitional)

License:

This software is essentially free, but please read my payment spiel
Please read the full license

Examples:

I use it to generate the site_indexes for all my domains, such as the top site index of this domain.

Download:

It's a single perl script.

Documentation?

This reads a list of domains from files (or STDIN with "-").

Each line describes one domain, in the format:

domain <tab> path

For example:

GetDave.com	/user/home/httpd/html/GetDave/

The path is optional. If it isn't specified, then the site_index will just provide a link to the domain.
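
As a rough illustration of the line format above, here is a minimal Python sketch (not the script's own Perl code) that splits a line on tabs. The function name parse_domain_line is hypothetical, and returning None for a missing path is an assumption made for the example.

```python
def parse_domain_line(line):
    """Split one 'domain <tab> path' line; the path may be missing."""
    fields = line.rstrip("\n").split("\t")
    domain = fields[0].strip()
    # An absent or empty second field means "just link to the domain".
    has_path = len(fields) > 1 and fields[1].strip()
    path = fields[1].strip() if has_path else None
    return domain, path


print(parse_domain_line("GetDave.com\t/user/home/httpd/html/GetDave/"))
print(parse_domain_line("GetDave.com"))
```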

Warning: It will overwrite files in "Site_Index/" in each root directory!

By default, it lists everything under the root directory recursively and shows every HTML file it finds. It won't show directories that don't contain HTML. You can prune any directory in your tree by creating one of the following files in that directory:

.no_index          Won't include directory in site index
.no_contents       Will include directory without any contents.
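
The marker files above could be honored during a directory walk roughly like this. This is a Python sketch under stated assumptions, not the script's Perl code: the name list_indexable is hypothetical, the default HTML pattern is guessed as \.html?$, only directories that directly contain HTML are listed, and .no_contents is assumed to hide subdirectories too.

```python
import os
import re

HTML_RE = re.compile(r"\.html?$", re.IGNORECASE)  # assumed default pattern


def list_indexable(root):
    """Return {directory: [html files]}, honoring the marker files."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if ".no_index" in filenames:
            dirnames[:] = []      # prune: do not descend any further
            continue              # and leave this directory out entirely
        if ".no_contents" in filenames:
            dirnames[:] = []      # assumption: "contents" covers subdirectories
            result[dirpath] = []  # listed, but with nothing under it
            continue
        pages = sorted(f for f in filenames if HTML_RE.search(f))
        if pages:                 # skip directories with no HTML of their own
            result[dirpath] = pages
    return result
```

Calling list_indexable on a root directory returns a mapping that a page generator could then turn into links.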

You can also ignore parts of the tree using the klunky "-ignore <regex>" option. Some examples:

 -ignore '/images$'                   Ignore any "images" directories
 -ignore '/(images|thumbnails)$'      Multiple ignores
 -ignore '/\.'                        Ignore dot directories
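
To see what these patterns match, here is a small Python sketch. It assumes each regex is tested against the directory path with no trailing slash (which is what the '$' anchors above suggest); the helper name is_ignored and the sample paths are made up for the example.

```python
import re


def is_ignored(dir_path, ignore_patterns):
    """True if any -ignore style regex matches the directory path."""
    return any(re.search(p, dir_path) for p in ignore_patterns)


patterns = [r"/(images|thumbnails)$", r"/\."]
for path in ["/site/images", "/site/images/icons", "/site/.git", "/site/docs"]:
    print(path, is_ignored(path, patterns))
```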

You can also control which files get indexed. By default it indexes only HTML files; use -index to supply your own regex. Example:

 -index  '\.(s?html?|txt)$'          Index .shtm, .shtml, .htm, .html, .txt
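
A quick way to check which file names the example pattern accepts, sketched in Python. The helper name is_indexed and the sample file names are hypothetical; matching against the bare file name is an assumption.

```python
import re


def is_indexed(filename, index_pattern):
    """True if the -index style regex matches the file name."""
    return re.search(index_pattern, filename) is not None


pattern = r"\.(s?html?|txt)$"
for name in ["index.html", "page.shtml", "old.htm", "inc.shtm", "notes.txt", "logo.png"]:
    print(name, is_indexed(name, pattern))
```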

You can also specify an optional importance for each domain, using the format:

domain <tab> path <tab> importance

Importance is a value from 1 to 5:

 1   List root link at the top of all site indexes (and treat as 2)
 2   List in every site index first.
 3   List in every site index.
 4   Only a link to the top page appears in other indexes.
 5   Doesn't appear in other indexes at all.
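
Reading the five descriptions above as levels 1 through 5 in order, the rules can be sketched as a small lookup. This is a Python illustration only; the function name and the mode labels are invented for the example and do not come from the script.

```python
def appearance_in_other_indexes(importance):
    """Map an importance value (1-5) to how a domain shows up elsewhere.

    Returns (root_link_on_top, mode).
    """
    if importance not in (1, 2, 3, 4, 5):
        raise ValueError("importance must be 1-5, got %r" % (importance,))
    root_link_on_top = importance == 1
    # Level 1 gets the extra root link and is otherwise treated as 2.
    effective = 2 if importance == 1 else importance
    modes = {
        2: "listed first",
        3: "listed",
        4: "top-page link only",
        5: "hidden",
    }
    return root_link_on_top, modes[effective]


for level in range(1, 6):
    print(level, appearance_in_other_indexes(level))
```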

Requires:

Perl, which kicks ass

Install:

It's just a perl script. No install required.

Revision History:

See the CHANGELOG

Bugs:

It doesn't check for the old infinite-recursion symbolic link trick. Either don't do that, or use .no_index.
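
For reference, one common way to guard against such loops is to remember the resolved path of every directory already visited. The Python sketch below shows the idea only; it is not something the script does, and the name walk_without_loops is hypothetical.

```python
import os


def walk_without_loops(root):
    """Yield directories under root, following symlinks but never revisiting one."""
    seen = set()
    for dirpath, dirnames, _files in os.walk(root, followlinks=True):
        real = os.path.realpath(dirpath)
        if real in seen:
            dirnames[:] = []   # already visited via another route: stop here
            continue
        seen.add(real)
        yield dirpath
```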

Freshmeat?

You bet.