5 Replies - 1900 Views - Last Post: 14 July 2008 - 07:39 AM

#1 Sentry68  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 9
  • Joined: 25-June 08

Arrays in sql?

Post icon  Posted 12 July 2008 - 03:53 PM

Noob here again.. I'm learning about databases and could use a little help. I found a couple posts relating to this but I'm not sure which way to go...

C#
Visual Studio
MSSQL 2005

I'm creating an application to handle a customer database. I'll be using checkboxes for a few pieces of information, like 'active', 'contacted', 'repeat customer', or similar stuff- probably around 10 pieces of info. I could create different columns for each checkbox, but I was wondering if there would be any advantage to saving these as comma separated values into a single varchar column and using it as a string array. I haven't tried to write the code and I'm not asking anyone to write it for me, but before I dug into it I'd like to know if it's even a good idea. I want to learn best practices as much as possible.

Among other things, I'm wondering about;
-Is it very beneficial to limit the number of columns in a table whenever possible?
-Would it limit the queries I could do?
-Couldn't a single varchar column like this actually be larger than 10 bit columns?
-should I probably save 0/1 values instead of true/false?

I'm just guessing here, I'm like Tommy Boy learning to code... Anyone have a thought on this?

Is This A Good Question/Topic? 0
  • +

Replies To: Arrays in sql?

#2 Martyr2  Icon User is online

  • Programming Theoretician
  • member icon

Reputation: 4437
  • View blog
  • Posts: 12,308
  • Joined: 18-April 07

Re: Arrays in sql?

Posted 12 July 2008 - 04:06 PM

First of all, you don't want to have a comma separated list of values in a single column. This violates a database design rule called "atomicity". This states that each column should have the most basic value. This rule is in place because if you did have all your values strung together it would take a lot of processing power to constantly update and manipulate a string. What if you took a checkbox out? or added another one? You would have to update all your fields which takes a lot of time and effort. It also makes it a bit more difficult to relate that value to other tables. Maybe one day that field will be the key that links to another table and a string of values makes that hard to do. So you want to avoid that one.

As for multiple columns, each representing a different value, I say make it a column. Just don't go overboard about the number of columns you do. From a database table stand point, it could make the table lengthy. To help reduce this you could actually put all your settings for the checkboxes (in a given section of the program and make sense together) into a separate table like "settings" and link that table back to the other record. But I wouldn't do that until you have well over 10 or so checkboxes. Example...

table1
-------
id
myname
configsettingsid <--- Links to id of table2 (config options for myname)

table2 (config checkboxes)
-------
id
checkbox1val
checkbox2val
checkbox3val




For limiting queries, linking the values into one string will certainly make queries harder and more lengthy to process. The individual columns for values will keep the queries cleaner since you can pull in only the columns you needed. (You should hardly ever need to use an asterisk in a select unless you truly need all fields... pull in each field on its own if you only need select columns... this will make queries more streamlined as well.)

Lastly, your fields should really be bit fields for something like a two state checkbox. Some databases can use the true or false, but I have always found 1/0 the best method.

Hope I have answered your questions to your satisfaction. :)

This post has been edited by Martyr2: 12 July 2008 - 04:11 PM

Was This Post Helpful? 0
  • +
  • -

#3 baavgai  Icon User is offline

  • Dreaming Coder
  • member icon

Reputation: 5937
  • View blog
  • Posts: 12,862
  • Joined: 16-October 07

Re: Arrays in sql?

Posted 12 July 2008 - 05:32 PM

Is it very beneficial to limit the number of columns in a table whenever possible?
Not exactly, but yes. A table should only store unique elements that relate to it. If you find yourself doing something like Contact1, Contact2, Contact3, ... it's time to reconsider. You need a Contact child table.

Would it limit the queries I could do?
Tables are to organize, not limit. If you think about it, you describe almost anything in just a single table with three columns, EntityId, AttrbuteName, AttrbuteValue. This is relates to some information theory and is also the basis of RDF.

Couldn't a single varchar column like this actually be larger than 10 bit columns?
I honestly don't know what you mean by this.

should I probably save 0/1 values instead of true/false?
Yes. MSSQL 2005 has a "bit" datatype that's preferred for boolean columns.


Now for the array question. A single column table is an array. For things like drop downs, you usually have key-value pairs. You will often see list of value type tables in a database, like so:
create table CustomerType ( CustomerType int, Name varchar(20) );
1	Commercial
2	Industrial
3	Residential



As you can imagine, you'll get a lot of little tables real quick with the approach. Sometimes that's still desireable so you can enforce foreign key constraints with other tables. However, sometimes the application enforces these rules or they aren't applicable.

Another way is to do a dictionary of sorts, often called a code tables, for list of values or list of codes. This can be done with just two tables.


create table CodeCat ( CodeCatId int, Name varchar(255) );
create table Codes ( CodeId int, CodeCatId int, Name varchar(255) );

CodeCat
1	Customer Status
2	Cusomter Type

Codes
1	1	Active
2	1	Inactive
3	1	Suspended
4	2	Commercial
5	2	Industrial
6	2	Residential



Now you can easily generate those key value pairs based on CodeCatId. What's more, because CodeId is unique, you can store the values, like combo box results, as simple single values and still decode the category of the chosen values.

This post has been edited by baavgai: 12 July 2008 - 05:33 PM

Was This Post Helpful? 0
  • +
  • -

#4 Sentry68  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 9
  • Joined: 25-June 08

Re: Arrays in sql?

Posted 13 July 2008 - 12:30 AM

Thank you guys very much, that's exactly what I needed to know.

Martyr2 said:

First of all, you don't want to have a comma separated list of values in a single column. This violates a database design rule called "atomicity". This states that each column should have the most basic value. This rule is in place because if you did have all your values strung together it would take a lot of processing power to constantly update and manipulate a string.

The very principle of atomicity answers several questions I had, and I see why it would matter with future changes to the database. I like this rule already- it makes for a great standard to my untrained mind.

Martyr2 said:

Lastly, your fields should really be bit fields for something like a two state checkbox. Some databases can use the true or false, but I have always found 1/0 the best method.

Yes, that is what I've been using for checkboxes, although I'm still a little unclear about this. I guess there's some confusion about the idea of 3 states available for this bit (or bool)-- false/true/null (or 0/1/null)? I've run into this, but I'll save that for another day.

I wondered if separate tables for drop-down lists and other containers was common practice. So far I've been handling it by populating the combo box inside the application, which saves the result to a varchar column. I figured in a small program it would be just as easy, and even allow the user to type in an alternate option without extra trouble. But I imagine in a big database, a separate table makes a lot of sense- it could easily be used in multiple places, and it's probably much faster to query/update with limited choices in a table.

baavgai said:

Couldn't a single varchar column like this actually be larger than 10 bit columns?
I honestly don't know what you mean by this.

I was obliquely comparing a varchar(30) column (for example, which might be necessary to hold 10 comma separated values) to 10 separate bit columns and asking if the varchar column, essentially holding the same data, would actually be less efficient/larger to save than the 10 separate bit columns.

Obviously I have a lot to learn and I'll go find out more about the specifics on my own, but if I could ask a couple more general questions...

-In reality, you guys on the ground and actually creating this stuff... how many tables and how many columns per table are common? I understand that this could vary widely, but maybe if you could give an upper limit... when do you think you've got too many tables or too many columns?

-it seems like you could save a lot of limited-choice type data (drop-down lists, etc) as tinyint datatypes, saving results as 0,1,2, etc. and mapping the values on the application level- maybe significantly reduce the size of a database and increase it's speed... is this done much?

-baavgai, In your dictionary or code table, are you basically using 1 table to contain unrelated columns, and relating to them from the other table? It seems pretty brilliant, and at the same time creating the potential for a single huge table... also, in your example, could more than one column in the CodeCat table relate to the same column in the Codes table?

Thanks again for the information, and your patience with a beginner.
Was This Post Helpful? 0
  • +
  • -

#5 baavgai  Icon User is offline

  • Dreaming Coder
  • member icon

Reputation: 5937
  • View blog
  • Posts: 12,862
  • Joined: 16-October 07

Re: Arrays in sql?

Posted 13 July 2008 - 03:05 AM

View PostSentry68, on 13 Jul, 2008 - 03:30 AM, said:

-it seems like you could save a lot of limited-choice type data (drop-down lists, etc) as tinyint datatypes, saving results as 0,1,2, etc. and mapping the values on the application level- maybe significantly reduce the size of a database and increase it's speed... is this done much?


Two are questions here. First, data is usually given some kind of single integer reference as unique id for a row. This is why MS uses identity.

Second, datasize. If you're thinking that a one byte field might be a quarter the size of a four byte field, you may be disappointed. Small sizes are often more intrinsic constraints rather than real size savings. There's always processing overhead and there is a good chance things like tinyint find their why to just int at some point in the process. Don't go nuts.

View PostSentry68, on 13 Jul, 2008 - 03:30 AM, said:

In your dictionary or code table, are you basically using 1 table to contain unrelated columns, and relating to them from the other table? It seems pretty brilliant, and at the same time creating the potential for a single huge table...


Huge tables are what databases do. Tables with millions of rows are common place in the wild. There is a hidden tool here; indexes. A binary search is O(log2(N)). This means if it takes me one millisecond to search a one million item index, in only takes 1.301 to search 64 million. The big databases have a number of methods to shortcut that as well. Size of data shouldn't be a concern if it is well organized.

View PostSentry68, on 13 Jul, 2008 - 03:30 AM, said:

also, in your example, could more than one column in the CodeCat table relate to the same column in the Codes table?


Sure. One can imagine common terms showing up in a variety of contexts. However, they only have meaning in context. If every drop down has a "None" thats fine, as long as you know which none it was.
Was This Post Helpful? 0
  • +
  • -

#6 Sentry68  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 9
  • Joined: 25-June 08

Re: Arrays in sql?

Posted 14 July 2008 - 07:39 AM

Cool, thanks again for the info. You guys have been a big help.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1