The storage mechanisms used to house such massive databases are somewhat distributed. Some of the databases
are only a few gigabytes and are stored on the webserver without any problem. Creating a 1.5 terabyte
database, by contrast, takes a relatively large amount of time, and storing 1.5 terabytes of data is a more
intimidating and much harder task.
We've decided to take a semi-primitive client/server approach to the entire process. The data is broken up
into chunks (large files, if you will). The chunks can then be spread across a few or many servers.
Each server must run a small server-side program that allows the client to connect and scan the chunks located
on it. The client then returns the result to the user. This platform grants reliability, speed, and
scalability. Although the methodology is somewhat aged, the concept and practice of client/server based distributed
applications is tried and true. Naturally, some variant of a peer-to-peer storage mechanism that allows the
remote volumes to appear as one would be easier to manage and manipulate. Currently, however, there is no
feasible, easily implemented storage mechanism of that type.
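The chunk layout and client lookup described above can be sketched roughly as follows. This is only an
illustration, not the code TMTO[dot]ORG runs: the names ChunkServer, distribute and lookup are hypothetical,
each chunk is modeled as a small in-memory key/value table rather than a large file on disk, chunks are
assigned round-robin (an assumption), and the network connection between client and server is replaced by
a direct method call.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ChunkServer:
    """Hypothetical stand-in for one storage server and its small server program."""
    name: str
    # chunk name -> records in that chunk (assumed here to be a key/value table)
    chunks: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def scan(self, key: str) -> Optional[str]:
        """Scan every chunk stored on this server for the requested key."""
        for records in self.chunks.values():
            if key in records:
                return records[key]
        return None


def distribute(chunks: Dict[str, Dict[str, str]], servers: List[ChunkServer]) -> None:
    """Spread the chunks across the servers round-robin (an assumed policy)."""
    for i, (chunk_name, records) in enumerate(chunks.items()):
        servers[i % len(servers)].chunks[chunk_name] = records


def lookup(key: str, servers: List[ChunkServer]) -> Optional[Tuple[str, str]]:
    """Client side: ask each server in turn and return the first hit, if any."""
    for server in servers:
        value = server.scan(key)  # in a real deployment this would be a network call
        if value is not None:
            return server.name, value
    return None


if __name__ == "__main__":
    servers = [ChunkServer("node-a"), ChunkServer("node-b")]
    chunks = {
        "chunk-0": {"k1": "v1"},
        "chunk-1": {"k2": "v2"},
        "chunk-2": {"k3": "v3"},
    }
    distribute(chunks, servers)
    print(lookup("k2", servers))       # hit on the server holding chunk-1
    print(lookup("missing", servers))  # no server has this key
```

Adding capacity in this sketch means appending another ChunkServer and redistributing the chunks. The client
queries the servers one after another here; querying them in parallel would be a natural refinement.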
With ATA over Ethernet, we are able to build a large distributed volume with some fault tolerance. As the
databases grow in size, the data storage portion of TMTO[dot]ORG will have to be revisited.