Thursday, December 14, 2006
Data Structures: Part 1
Disclaimer: I could be way off base with some of this, but for the most part (if not all) this should be accurate.
Here is an article on some data structures that programmers use. I'm writing this mostly for Jon's homework assignments that I'm giving him. To most of you you'll probably find this to be boring and really not useful, but that is what most programming articles are, boring and useless. So here we go.
In this first article on data structures I'm going to cover three of the more basic structures that programmers use. They are the ArrayList, Linked List and Stack. A few that I hope to get to in a later article are the Heap and Binary Tree.
Background
We'll have to cover some back ground first. I'm talking about Arrays, and I'll give a description of them and their uses. Most people know that as a programmer you have a few different types of information that you can store (i.e. integers, floating point numbers, characters, etc.). Well what if you want to store 20 integers instead of just one? You could declare 20 different int variables and store them one by one. But this leds to a couple of problems.
- The first being that really its just bad programming style. If you have 20, 30 or even a 100 variables in your program (one part of it) you've done something wrong. This will lead to messy and inefficent code. You'll have to spend extra time trying to figure out where to use what variable and just have that many more lines of code to go through when you debug your program.
- The second and more important of the two problems is that it isn't as effiecent. When you declare 20 variables the program will allocate the memory for those where ever it can find a spot. So you can have 20 different numbers stored anywhere in the footprint of your programs memory. This can lead to increased time in execution or even a bigger memory footprint. Lets say that an int takes up 1 byte of space(in reality most ints take up 4 bytes). When a program allocates memory for a variable it grabs the first spot that can fit it. If this spot is 1.5 bytes big the int will fit just fine, but you now have 0.5 bytes wasted in the memory footprint. Not much I know, but imagine it with thousands of variables doing this for a bunch of different programs. It adds up.
Arrays are nice to store large amounts of similar data. The only problem is that they have to be declared statically, meaning that their size can't change. To get around this we have a few different datastructures that we can use. I'm going to talk about three of them here (and maybe more later on).
ArrayList
An ArrayList is just an Array that can grow(hold more data). Most languages come with them in some form. Most main stream languages have them in their standard library. Java and C# have ArrayLists and C++ has Vectors. Using Java as an example, it implements an ArrayList as an object. And doing so it has allowed for them to create several extra methods to use on it. You can insert in the middle instead of just at the end, as well as remove from anywhere in the list. It also has a few other nifty methods associated with it, look them up in the standard library.
Linked List
A Linked List is similar to an ArrayList. The only big differece is that an ArrayList is made up of an array and when that array gets full it creates a new, bigger array and copies the information to the new array. A Linked List is a line of Nodes. A Node holds some information and a pointer to the next Node in the list. So doing things like removing a Node from the list is easier since all you do is take the pointer that the previous Node has and set it to the next Node in the list thereby cutting out the middle Node. The methods that are usually associated with a Linked List are the same as an ArrayList.
Stack
A stack is a bit different that the other two I talked about. It can be implemented as either an array based or as a linked Node based. Stacks are much simpliar that the other two. They have to main functions, Push and Pop. To add something to the stack you Push it on. It goes to the top of the stack, and it can't go anywhere else. To remove you'd call a Pop, which will remove the top element of the stack. Now this seems rather limited in its uses, but they are widely used in programming. As you read this your computer is using multiple stacks just to keep itself running.
And so this article comes to a close. I hope I have't bored you to death and that you learned a little. Later I'll discuss some more data structures. Most likely they will be the Queue, Heap, and Binary Tree(probably BST).