Portal Home > Knowledgebase > Articles Database > MySQL Datatype Question
MySQL Datatype Question
Posted by Ricjustsaid, 12-03-2007, 11:52 PM |
Just a quick question for anyone that uses MySQL (or does any database work)... is there any consequence to not using the correct datatype and size for the type of data you're storing? For example, storing ints in a varchar field?
The reason I ask is because I installed a script on my IPB forum yesterday which basically pulls numerical data from a public webpage, stores it in a table, and later outputs the data for a user. The are about 70 fields of data in total, and the length of data would normally range between 3-12 bytes. But shortly after putting this script online, my server had load spikes around ~40 (this is a Dual Opteron 246, 2GB RAM; regular load is around ~.5). Once I disabled the script, the load spikes went away... and after looking at the script and db structure I found that the database was made up entirely of varchar(128) fields.
Could that have any effect on performance at all?
|
Posted by Steve_Arm, 12-04-2007, 12:26 AM |
The main reason is consuming unnecessary space. Example, INT occupies 4 bytes and can take a number up to 4294967295. Store this number as chars and you have 10 bytes.
Whilst strings are generally slower, I don't believe that the performance problem
has to do with the field type.
|
Posted by foobic, 12-04-2007, 02:06 AM |
Is that 70 fields all in the same table? Does the developer come from the Excel school of database design, perhaps?
Not sure if the varchar fields alone would kill your server but I bet if you look you'll find other problems as well, like different types of data in the same table, no indexes etc.
|
Posted by Ricjustsaid, 12-04-2007, 03:01 AM |
Could there maybe be issues with sorting and comparing though? Looking at the code, I'm seeing a lot of "SORT BY", and I thought sorting by strings was more intensive than sorting integers.
That made me laugh. Yep, all 70 fields are in one table, which surprised me too... but is there a better way to store it? It's like a hiscore system for forum members - it fetches the member's rank, level and experience for a game and store it in the database. There are 22 unique areas to be ranked in, and 3 fields per area.
Could the number of fields have an impact on performance? I could maybe drop the "level" columns and just have an array of values instead -- but would that have any effect?
I noticed while looking at the code that there are uninitialized variables being used in a couple of SQL queries, which cause the query to fail in particular circumstances. It doesn't even seem like the system uses the IPB cache system.
|
Posted by foobic, 12-04-2007, 03:54 AM |
If I'm understanding it correctly, another way would be:
members table: ID, name, whatever else
areas table: ID, name, whatever else
scores table: member_ID, area_ID, rank, level, experience
|
Posted by mitchlrm, 12-04-2007, 06:46 AM |
It is unlikely the problem is caused by the data type or table design. But lack of indexes could do it. The uninitialized fields being used in a query could be a real problem.
|
Posted by Ricjustsaid, 12-05-2007, 02:31 AM |
The PRIMARY KEY is the "id" field. Maybe there's a problem with selecting so much data? Although, that's unlikely... It just seems like there's something that brought the entire server down to a crawl but I can't figure it out.
Yeah, the uninitialized variables made me nervous, so I completely removed the script from the public forum and only have it installed on a test board right now. TBH, there's no telling what other problems are in that script, and I don't need someone stumbling on a vulnerability or something.
That sounds like a good idea, and it seems like it would be easier to manage the data and to add/update new data afterwards. I think I'll try that instead of using one big table... dealing with 5 or 6 fields instead of 70 seems a lot more practical.
One question... there's one field that requires a decimal value to be stored, which is derived from several other fields in the database using a formula. There wouldn't be a consequence to making the "level" field a decimal with something like 3 decimal places, would there? Or would it be better to calculate that in the script and make the "level" field an int?
|
Add to Favourites Print this Article
Also Read