Log in

View Full Version : Resolved Structure for storing posts



keyboard
05-09-2012, 05:17 AM
Note: This isn't so much a sql coding question as a structuring question.

My question is this - how do the big guns (example: f a c e b o o k) store the peoples posts.

Would they have one big table and every post that is made, gets stored in their?
Or would they have a database for each person and have a table in it for all the posts they've made.

Sorry if I haven't been clear enough,
Keyboard1333

djr33
05-09-2012, 06:41 AM
I don't know, and that's an interesting question. But I'd imagine that they have special software for at least some of it. It probably works like a cloud server but for a database. I don't know if it copies every bit of info or just in chunks, but it probably can search around various chunks at the same time. It seems like mixing both methods would be most efficient, if you went through the trouble of programming all of that yourself-- and they certainly have the resources.

It could always be a very very well maintained flat file system. Part of that might be used for things like posts that can't be searched (right?), just something displayed on screen. For usernames and thinks like that (that you can search) it would be in the main database.

Has anyone else seen inside any of the sites like that?


A similar question would be how google stores its information, but that's just a completely different situation without the social piece (so different databases can be queried in smarter ways, probably). And lots of caching of queries.

keyboard
05-10-2012, 06:30 AM
I was thinking maybe you could store them all in one huge table then wondered if it would be to slow????

djr33
05-10-2012, 06:07 PM
It would be. Databases can get pretty big, but to store all of facebook in one table would be ridiculous.

keyboard
05-10-2012, 09:57 PM
Could you store all the posts in one table on one day? Then create a new one the next day and so on?

james438
05-10-2012, 10:27 PM
Makes sense. Good thought keyboard1333.

keyboard
05-10-2012, 11:29 PM
Makes sense. Good thought keyboard1333.


Really??? oh, okay :D . Would there still be to many posts per day? (Of course depending on the popularity of the site).

Does anyone know about how many values a mysql table could hold???

Keyboard1333

traq
05-11-2012, 12:18 AM
The size limit is usually imposed by the operating system, not MySQL itself.

For example, if you are using the MyISAM driver, you can have table (table, not DB) sizes "up to 4GB by default (256TB as of MySQL 5.0.6), but this limit can be changed up to the maximum permissible size of 65,536TB"
[http://dev.mysql.com/doc/refman/5.0/en/table-size-limit.html]

Yeah, you read that right. Sixty-five Thousand Terabytes.

As you imagine, though, it probably wouldn't run very efficiently, and I don't know what computer/OS would support a file that large.

About the FB question: interesting read (http://royal.pingdom.com/2010/06/18/the-software-behind-facebook/).

About your question:
I would suspect that "posts" are broken up across several different tables.

The actual text of the post might be in one table, which user it belongs to in another, who likes it in another, and so on. Most of it would be "relational" information: just index keys that point to where to look. Then, there would need to be a lot of indexing, and caching becomes essential long before you get to "FB" size.

keyboard
05-11-2012, 01:17 AM
and caching becomes essential long before you get to "FB" size.

Sorry if this is a dumb question... but what do you mean by that?

traq
05-11-2012, 01:42 AM
instead of querying the DB every single time, query once and save (cache) the result. (Memcache (http://memcached.org/) is a common caching tool.)

It saves a lot of overhead (DB connections, queries, etc.), especially for common/frequent queries.

The cache can be cleared regularly, or left for quite some time in the case of static data.

keyboard
05-11-2012, 02:20 AM
query once and save (cache) the result.

Save it where. I'm sorry but I don't understand the concept of "caching". Could you try and explain it a bit...

traq
05-11-2012, 05:16 AM
"Where" depends on how you handle it. MySQL does some caching internally. Memcache stores data on the server. You might cache dynamically generated html markup in .txt files, or session-specific data in your $_SESSION.

I was going to link to the Wikipedia page, but it seems uncharacteristically akward and helpless.

Basically, a "cache" is just an organized way to save data you plan on using again later, thereby saving time by only retrieving/ preparing it once and using it several (or many) times.

keyboard
05-11-2012, 05:42 AM
Hmmmmm, thanks for that traq. :)