Thread: Data structure and Design help for huge project

  1. #1
    Join Date
    Apr 2010
    Thanked 0 Times in 0 Posts

    OK, what I am creating is an object tracking model, but I am only at the proof-of-concept stage. The hope is to take a number of objects that move through time and space at different rates and determine what the end result is going to be: how did the object move through time and space, did it travel up or down, did it collide with another object, and what caused that?

    Each object is considered an individual in a swarm. The swarm can affect a single individual very easily, but it is hard for one individual to affect a whole swarm. Now let's track the movement on a swarm level: how will Swarm A affect Swarm B? Logically, an individual from either swarm will have a very hard time affecting either Swarm A or Swarm B, whether through loss or gain, death or life. But what I have noticed is that when Swarm A does the exact same thing, Swarm B will have a specific reaction, be it standing there and undergoing a collision, or moving according to the individuals of either swarm (home swarm or foreign swarm).

    Here is where things get tricky. Swarms A and B are each affected by the individuals inside them and by the other swarm, but what finally controls their overall conduct, movement, or action is the environment each swarm is in. Say the area or world has a drastic change: this is going to affect every swarm drastically, all the way down to the individual. If enough individuals are affected by the environment, swarms will become affected, and so on. But there is not just one environment; there are a number of them, which together make up a world. So if one environment changes, it will not affect the entire world drastically, but it could affect other environments and so change the world.

    The point is essentially to track how individuals change when worlds are changed through their environments. Will individual change dictate the course of the world? Or can one individual, whose pattern of behavior changes because of things it cannot control, make a change that affects the entire world? These are just a few of the questions I hope to answer with my models, and they are more expansive than what I am going to go into here.

    These are all questions that I hope to model out, and I am only at one small part of this project. The question I am asking today is simply: how do I store one piece of the data that the scope of this project is going to require? I have already created 3 other smaller test models that take days to compute outcomes. My goal with this rewrite is simply to remove the bottlenecks I was facing with my old storage concept, which did not let me store and retrieve data on the mass scale that MySQL or any other database engine can.

    So it is essentially this: I want to predict the actions and movements of individuals in a specific swarm, but each swarm is affected by the other swarms, by the environment that holds it, and by the world the environments reside in. All of this helps determine the individual's success or peril.

    Now, this is a very complex task, and how I am going about it is a little different, but I am not going to get into that. Here is some starting data. These numbers are essentially the coordinates of one individual's movement: the raw data from the predefined tracking system I created as the first part of this project. Storing the raw data is very easy, the queries are super fast, and the organization is perfect (not really, but it's fast enough). Each individual is essentially its own table, named after the individual, with time used as the key. Each individual is also given a key, which is stored in a separate table along with its swarm, environment, and world. This lets me run very quick queries and access large amounts of data really fast. For storing the raw data, this is the best approach I have come up with and tested, because I can rapidly make a list of all members of a swarm, of an environment, or of the world, and get all that data as fast as my computer can dole it out, which is amazingly fast. So I am satisfied with the storage and access capabilities for all the raw coordinate data.

    I could not have asked for a better way to get at the data used to make the keys. A small sample of it looks essentially like this:

    Table ID = Individual#_Swarm#_Environment#_World#
    Line A (time1) 583.44 586.82 572.25 582.98 560.75 582.98
    Line B (time2) 585.98 585.98 575.29 580.41 558.22 580.41


    OK, each of those numbers equates to an observation.

    So this is how the key is created:

    Each number in those 2 lines is compared to every other number.

    The first number in line A is compared to the second number in line A, then to each remaining number in line A and to every number in line B. Lines A and B are just snapshots in time: line A is what happened when things started and line B is where they are now.

    Depending on the comparison, a digit is added to the key: 1 if the number increased, 2 if it decreased, and 0 if the two numbers are the same.

    12222112222 (from above data example)

    That would be just the first part of the key: the first number of line A compared to every other number present, going in order from left to right, line A then line B. This operation is repeated for every number, but for simplicity I am not going to create an entire key by eye alone (that is why I am programming it), LOL. You can imagine 0, 1, and 2 repeated over 100+ chars.
    Now, if I did my math right, with the 2 lines of data listed above I am pretty sure the key would be 132 chars long. How I got to this: there are 2 lines of data and each line has 6 observations, so there are 12 observations in total. Each observation has 11 others to be compared against, and each comparison gives you one char of the key, so 12 x 11 = 132 chars for 1 key if this was the data I was using.
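
    A minimal Python sketch of the key construction described above. The function names are made up for illustration, and two details are assumptions rather than something the post states: that every later number is compared against the others in the same left-to-right order (skipping itself), and that "the same" means exact equality rather than equality within a tolerance.

```python
from typing import Sequence


def compare(reference: float, other: float) -> str:
    """Return '1' if other is higher than reference, '2' if lower, '0' if equal."""
    if other > reference:
        return "1"
    if other < reference:
        return "2"
    return "0"


def build_key(line_a: Sequence[float], line_b: Sequence[float]) -> str:
    """Compare every observation with every other one, line A first, then line B."""
    observations = list(line_a) + list(line_b)
    chars = []
    for i, reference in enumerate(observations):
        for j, other in enumerate(observations):
            if i != j:  # an observation is never compared with itself
                chars.append(compare(reference, other))
    return "".join(chars)


line_a = [583.44, 586.82, 572.25, 582.98, 560.75, 582.98]
line_b = [585.98, 585.98, 575.29, 580.41, 558.22, 580.41]

key = build_key(line_a, line_b)
print(key[:11])  # 12222112222
print(len(key))  # 132
```

    The first 11 chars match the hand-built fragment above, and the full key comes out at 12 x 11 = 132 chars.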

    After that, each key is tagged with an end result: where did the object go? If the object went up it gets a score of 1, if it went down it gets a score of -1, and if it ends in the same place it started it gets a 0.
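
    The scoring rule can be sketched as a small Python function. The name `score_outcome` and the idea of comparing a single start and end position are assumptions for illustration; the post does not say exactly which values decide "up" or "down".

```python
def score_outcome(start: float, end: float) -> int:
    """Return 1 if the object ended higher, -1 if lower, 0 if unchanged."""
    if end > start:
        return 1
    if end < start:
        return -1
    return 0


print(score_outcome(583.44, 585.98))  # 1
print(score_outcome(583.44, 583.44))  # 0
```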

    Now, each key is very important because it tracks exactly how a specific individual moved through time and space. It's easy to hash the key, store the hash, and simply keep a tally of what happened. But that is not "tracking enough": hashing destroys the individual segments of the key, and those are what actually matter. Each segment can only have 3 possible values (0, 1, and 2), but each segment of a specific key holds a great deal of information, and that is why I want to store it. So I might have to store the key in raw form. The entire key gets a score of -1, 1, or 0, but each segment of the key also needs to be readily accessible, so I can essentially ask the database a question something like this:

  2. #2
    Join Date
    Apr 2010
    Thanked 0 Times in 0 Posts



    Where segment_key_number = 1 and value = 2 return what_happened from (world)
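
    One way that question could be made runnable, as a rough sketch only. It uses Python's built-in `sqlite3` with an in-memory database instead of MySQL, and it assumes one long, narrow table of segments (one row per key segment) joined to a table of outcomes. All table and column names here (`key_outcome`, `key_segment`, `segment_no`, `segment_value`) are hypothetical, and the short 4-char keys are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE key_outcome (
    key_id         INTEGER PRIMARY KEY,
    world_id       INTEGER NOT NULL,
    environment_id INTEGER NOT NULL,
    swarm_id       INTEGER NOT NULL,
    individual_id  INTEGER NOT NULL,
    what_happened  INTEGER NOT NULL CHECK (what_happened IN (-1, 0, 1))
);
CREATE TABLE key_segment (
    key_id        INTEGER NOT NULL REFERENCES key_outcome(key_id),
    segment_no    INTEGER NOT NULL,
    segment_value INTEGER NOT NULL CHECK (segment_value IN (0, 1, 2)),
    PRIMARY KEY (key_id, segment_no)
);
-- Lets "segment N has value V" lookups avoid a full table scan.
CREATE INDEX idx_segment_value ON key_segment (segment_no, segment_value);
""")


def store_key(conn, key_id, world, env, swarm, individual, key, what_happened):
    """Store one key as an outcome row plus one row per segment."""
    conn.execute(
        "INSERT INTO key_outcome VALUES (?, ?, ?, ?, ?, ?)",
        (key_id, world, env, swarm, individual, what_happened),
    )
    conn.executemany(
        "INSERT INTO key_segment VALUES (?, ?, ?)",
        [(key_id, n, int(ch)) for n, ch in enumerate(key, start=1)],
    )


def outcomes_for_segment(conn, world_id, segment_no, segment_value):
    """Tally what_happened over all keys in a world with this segment value."""
    rows = conn.execute(
        """
        SELECT o.what_happened, COUNT(*)
        FROM key_segment AS s
        JOIN key_outcome AS o ON o.key_id = s.key_id
        WHERE o.world_id = ? AND s.segment_no = ? AND s.segment_value = ?
        GROUP BY o.what_happened
        ORDER BY o.what_happened
        """,
        (world_id, segment_no, segment_value),
    ).fetchall()
    return dict(rows)


# Made-up sample keys, far shorter than the real 100+ char keys.
store_key(conn, 1, 1, 1, 1, 1, "2210", 1)
store_key(conn, 2, 1, 1, 1, 2, "2012", -1)
store_key(conn, 3, 1, 1, 2, 3, "1210", 1)

print(outcomes_for_segment(conn, 1, 1, 2))  # {-1: 1, 1: 1}
```

    Swapping `world_id` for `environment_id`, `swarm_id`, or `individual_id` in the WHERE clause would ask the same question at another level.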


    Now, as a benchmark, what I did was store a database table of all keys for rapid access and lookup. Then, for each key, I created a table named after the hashed version of that key, and in that table I stored the entire key segmented out, along with what happened (-1, 0, or 1).

    This quickly created too much information and bogged down the server: the database ended up with an excessive number of tables, each with an excessive number of columns, even though no table had many rows.

    I have had to learn how to program to complete this project. I have been at it for about 4 years and have completed over 100 different projects, but I have always been working on this one in my free time. I have read books on SQL and things like that, so I understand how the data should be laid out, but for some reason I cannot come up with a good way to organize this portion of the data. It has to be perfect because there is a great deal of data. Each key is well over 100 chars long, and each char in the key needs to be tracked, organized down to the individual, which is a part of a swarm, which is contained within an environment, which makes up the world. But there are also other swarms. I have written this thing to be expandable to many different sizes, to handle many different swarms, each containing anywhere from a few individuals to many. As a benchmark, this is about what it works out to be:

    I am looking at creating
    A world > with 10-20 different environments > 100-300 different swarms > each swarm containing 10-100 individuals

    So if we take all that into account, the overall amount of data generated is daunting at best.

    Because, let's say:

    1 world
    10 environments
    each environment has
    100 swarms
    each swarm has
    10 individuals per swarm
    each individual has
    100 snapshots in time
    each snapshot in time has
    1 key
    a key has
    100 segments
    each segment can be only 3 different numbers
    each segment has 3 different scores that need to be tallied

    So, using the above data as an example, the database size works out to about this:
    - 1,000 swarms
    - 10,000 individuals within all swarms
    - 1,000,000 total snapshots in time
    - each snapshot in time creates one key
    - each key has over 100 segments
    - 100,000,000 different data points are created
    - each data point has 3 different scores that are tracked and tallied: -1, 1, and 0
    - each data point needs to be organized and tallied by its corresponding World / Environment / Swarm / Individual
    - each data point can and will only have 3 values: 0, 1, or 2
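
    The arithmetic behind those totals, as a short Python check using the example figures from the list above. The variable names are made up, and the one-byte-per-segment size estimate is only an assumption to give a sense of scale, not a measured figure.

```python
worlds = 1
environments_per_world = 10
swarms_per_environment = 100
individuals_per_swarm = 10
snapshots_per_individual = 100
segments_per_key = 100  # the post says "over 100"; 100 is the round example figure

swarms = worlds * environments_per_world * swarms_per_environment
individuals = swarms * individuals_per_swarm
keys = individuals * snapshots_per_individual  # one key per snapshot
data_points = keys * segments_per_key

# Assumption: one byte per segment value, ignoring indexes and row overhead.
raw_bytes = data_points

print(f"{swarms:,} swarms")            # 1,000 swarms
print(f"{individuals:,} individuals")  # 10,000 individuals
print(f"{keys:,} keys")                # 1,000,000 keys
print(f"{data_points:,} data points")  # 100,000,000 data points
print(f"{raw_bytes / 1e6:.0f} MB raw")  # 100 MB raw
```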

    So each key exists on 4 levels (World, Environment, Swarm, Individual) and affects each level in a different way, which is why it needs to be tracked at every level.
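
    A small in-memory sketch of what tallying one key on all 4 levels could look like, using Python's `collections.Counter`. The counter layout (level, id, segment number, segment value, outcome) and the function name are assumptions for illustration, not the existing design, and the ids are assumed to be unique within each level.

```python
from collections import Counter

LEVELS = ("world", "environment", "swarm", "individual")


def tally_key(tallies: Counter, ids: dict, key: str, outcome: int) -> None:
    """Add one key's segments to the tallies of every level it belongs to."""
    for level in LEVELS:
        for segment_no, ch in enumerate(key, start=1):
            tallies[(level, ids[level], segment_no, int(ch), outcome)] += 1


tallies = Counter()
# Made-up sample keys, far shorter than the real 100+ char keys.
tally_key(tallies, {"world": 1, "environment": 1, "swarm": 1, "individual": 1}, "2210", 1)
tally_key(tallies, {"world": 1, "environment": 1, "swarm": 1, "individual": 2}, "2012", -1)

# Segment 1 had value 2 once with outcome 1 and once with outcome -1.
print(tallies[("world", 1, 1, 2, 1)])       # 1
print(tallies[("swarm", 1, 1, 2, -1)])      # 1
print(tallies[("individual", 2, 1, 2, 1)])  # 0
```

    Under this layout each world, environment, swarm, or individual needs at most segments x 3 values x 3 outcomes counters, which is 900 for a 100-segment key.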

    So, with all the information I have presented, here is my question: what is the best way to store this god-awful amount of data so that I can access it effectively?

  3. #3
    Join Date
    Mar 2006
    Illinois, USA
    Thanked 690 Times in 678 Posts


    That's a huge amount to read-- can you give a summary so that someone can actually help you? Also, if the project is as big as the description, it's probably better to ask for paid help for this.
    Daniel - Freelance Web Design | <?php?> | <html> | español | Deutsch | italiano | português | català | un peu de français | some knowledge of several other languages: I can sometimes help translate here on DD | Linguistics Forum

