Efficiency of Searching
I am taking a data structures course, and could use a little help with the concept of searching efficiency. So far, we have studied how having an indexed array can be used when searching a file, but one of the questions is confusing me. It asks:
Compare the efficiency of searching an ordered sequential table of size b and searching an un-ordered table of the same size for the key key:
- If no record with 'key key' is present
I have not been able to get in contact with the instructor, but im guessing 'key key' is not a typo. Can someone help clarify what this is asking? I would guess that its saying if we have 2 tables, one ordered, and one unordered, is the searching efficiency effected by having a 'key key'. I'm just a little confused, any help would be greatly appreciated.
How would you search something that's unordered manually? Would you check each element, or check random elements? How would you find page x in some book? Would you use any sort of method to get there as quickly as possible? Or would you check each page to see if it is page x? Looking up linear search, and binary search will be helpful as well.
If its unordered and no 'key key', I would guess that you would try to do it almost like a dictionary search, where you grab a handful of 'pages' and if you go too far, back up, if you pass it, go forward, etc, but I'm not sure how to articulate that. Im guessing were comparing the big oh notation for the efficiency, so with no key key for an unordered table it would be O(n^2).
If you are given an unordered dictionary, how would you know if you've gone past the key? Unordered implies not in alphabetic order. So you could have the word Apple, followed by Zebra, followed by Orange. So what would you do to find some randomly given word? Would you be able to devise any 'smart' approach? Or would you have to use a brute force approach?
Now an ordered dictionary may have a different approach. Since you know for a fact that Apple comes before orange, and orange comes before zebra, you can think up a 'smart' approach.
So for an unordered table, since there is no key for the file, we would have to perform n searches to find the record we want, where n is the number of records, so its a linear search.
If it is ordered, I think the average was n/2 comparisons.
I guess i'm more confused because when we talk about searching a 'table' I don't know what data structure in java that would be. A 2d array?
You are right on the unordered list, it will be O(n), but I don't believe you are correct on the next part. When you search a dictionary, and think like a computer, you don't know that if you are looking for a z word to start near the back, a computer would start at exactly halfway. You would test the page exactly halfway through the dictionary and either three things can happen. The word can be on the halfway page, the word can be less than the halfway page, or it will be past the halfway page. You can repeat this successive halving and checking approach and guarantee that you will eventually find the key(if it exists).
For ordered lists, once we go beyond the point where the item should be, we can stop, because (assuming the data is ordered), the key is not there. So after we perform k searches, and k is greater or lower (depending on the order of data, ascending or descending), we know the key is not present in the list.
Not so in an ordered list. If you have a dictionary, let's say the halfway word happens to be Mango, and the key is zebra(and let's say this dictionary doesn't know what a zebra is), would you ever be checking any page before Mango? Is it possible that a word that starts with a Z can ever be found in the A section, or b, c, d,... ?
Try this, take an ordered list of 1-100, and don't specify a key, but write on paper the successive halfs. I'll even start you out:
Finish this, at each halfway point you can choose whether you want to go up or down, but this will eventually terminate. Once it does terminate, count how many steps you took and see how quickly(or slowly) it terminated and use this to try to hypothesize what the big o notation will be.
I have one more question:
Suppose you have a file of data of approximately 10,000 personnel records using Social Security numbers as primary key. You have decided to store the file as an indexed sequential structure.
How should the index be accessed?
What is meant by accessing the index?
What kind of data structure do you think should be used? Do you know of any data structures that can be indexed into? Which ones are sequential?(meaning to find the nth item you must first check the 1st, 2nd,...,n-1,nth)
a LinkedList is sequential but an array can be indexed.
Check the linked list API to see if it has any methods to get index i.
Does the term 'sequential' imply that you can only traverse the table one record by one record? Or can you hop through the table at will? If it is the first an ordered table takes b/2 steps on average, an unordered table takes b steps. If you can hop through the table(s) randomly an ordererd table takes 2log(b) steps to conclude that a record is not in the table.
Originally Posted by fam2315