Log in

View Full Version : Store large amounts of text (logs)



keyboard
09-24-2015, 12:24 PM
Hey all,

I'm looking to store large amounts of data, what is the recommended way of doing so efficiently?

Each entry has three pieces of data: ID, Timestamp, String
Each String has a maximum size of around 300 chars.
There are massive amounts of entries (1000/user/day estimate).

The emphasis is on storing the data efficiently (space-wise).

Searching isn't really an issue, I just need to be able to recover chunks of the data per user.

What's the best way to structure this? Database or flatfile?
Table per user, etc..?

james438
09-24-2015, 01:58 PM
For efficiency in retrieving data there is no real difference. Where MySQL excels is in sorting and organizing data. With storing the amount of data you are talking about I would use a single MySQL table. The number of rows per table in a MySQL database is based on what you set it as and can be expanded as necessary.

See:

https://dev.mysql.com/doc/refman/5.0/en/integer-types.html
https://dev.mysql.com/doc/refman/5.0/en/table-size-limit.html

The biggest problem in table size limits comes from the allowed size of the database. For me I can have nearly as many databases as I want provided they are no more than 1GB each. I have not come close to a drop in the bucket with filling that size. I tend to store audio files and pictures on my website as flat files and store the location to those larger files in my database. This allows me to keep my database small. MySQL can and is designed to store most any data type, but overall size of the database is often where the limits become important. I would find out what the limits are for your database size.

keyboard
09-24-2015, 05:25 PM
I'm running off a vps, so database size isn't a limit (unless there's some built in max database size in sql).

Will the queries grow slower the larger the database is?

james438
09-25-2015, 02:00 AM
It should not be a problem. What sort of data are you working with? For example, if your string data repeats you may want to consider normalizing your data. Also consider how fast forums, even large ones, retrieve information from the database. When you get into sites like facebook and google the database structure is actually quite different. We have a thread discussing how sites like google perform searches, but I am having trouble locating it at the moment.

I am not sure what types of queries you may be referring to. One way to test out specific queries would be run the query in your mysqladmin assuming that is what you are using. Another option is to create a bit of a php timer that will print out the microtime (http://php.net/manual/en/function.microtime.php) elapsed after a query is completed.

http://stackoverflow.com/questions/4406417/how-big-is-too-big-for-a-mysql-table

keyboard
09-25-2015, 05:54 PM
Thanks james.
I forgot to mention above that my current setup is running off mongodb.
If you manage to find that thread please let me know!

I did a bit of research and large mongodb databases can be sharded pretty easily to allow distributed storage.
I'll keep looking into this and run some bench tests.
Thanks

joilben
07-27-2020, 08:08 AM
String information: This variable is utilized to store information in type of characters and words, its non numerical.