Wednesday, July 13, 2005

My 7th sem Mini-Project in college

My initial proposal for the mini-project at moi college (where I'm in my final year - Dept of IT, Sri Venkateswara College of Engineering) was flatly refused. Come to think of it, I was sort of trying to complicate a seemingly simple process. It goes something like this.

Centralising content from various sources (say, teachers in an e-learning environment), where they would simply add links to a central structure (ideally an XML file, if not a database), so that regardless of where the content originates, it can all be aggregated and then delivered to whatever front end is used. That's how it's done conventionally. So what I did, apart from the XML structure mentioned above, was build a desktop application that sits in the system tray and dynamically fetches the content, say... subject-wise.

e.g., the user chooses maths... the app (I called it TE_Tray) first connects to a universal location that hosts the XML file, finds out that it has to read maths today... then parses the links, which in turn point to the content at the given locations.
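Just to make that concrete, here's a minimal sketch in Python of what that fetch-and-parse step could look like. The URL, the XML layout, and the tag names (subject, link) are purely illustrative assumptions, not the actual TE_Tray format.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical layout of the central XML file:
# <subjects>
#   <subject name="maths">
#     <link>http://teacher1.example.com/maths/notes1.html</link>
#     <link>http://teacher2.example.com/maths/problems.pdf</link>
#   </subject>
# </subjects>

CENTRAL_XML_URL = "http://example.com/content/index.xml"  # assumed location

def fetch_links(subject_name):
    """Download the central XML file and return the content links
    registered under the chosen subject."""
    with urllib.request.urlopen(CENTRAL_XML_URL) as resp:
        root = ET.fromstring(resp.read())
    return [link.text
            for subject in root.iter("subject")
            if subject.get("name") == subject_name
            for link in subject.iter("link")]

def fetch_content(subject_name):
    """Follow each registered link and read the actual content behind it."""
    pages = []
    for url in fetch_links(subject_name):
        with urllib.request.urlopen(url) as resp:
            pages.append(resp.read())
    return pages
```

The point being that the tray app never needs to know where the teachers host anything; it only ever reads the one central file.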

Unfortunately, our college specified that we could not do plain-vanilla applications, and moreover had to work at a more prehistoric (read: low-tier) level. I tried talking my mentor into adding complexity, like encryption for the content, or a telnet-enabled session. But no, it wasn't complex enough, and she felt I could finish it in a fortnight, while the time allotted was at least ten times that. I don't blame her. Which meant we would probably have to think along the lines of some scheduling-optimisation algorithm, or what I had lately thought to be worth a try... a search engine, at least for documents. Going through the Stanford paper that went on to become Google, I realised that the motivation was quite simple - to search fast! - which means optimising the parsing and traversal across GBs and GBs of nodes, trees, stacks... or whatever is being used.

Continuing along the lines of a search tool led me to investigate the possibilities of venturing into the often-overlooked world of bots - not just search spiders, but, as described in my article on Bots on TechEnclave, the various categories of bots and their applications.

Reading more about bots led me to the concepts used in artificial intelligence across projects such as ALICE, and other VRL and bot-based work, which reintroduced me to the fundamental concepts used in developing AI engines: evolutionary and recursive programming, self-learning systems and expert systems, neural networks and genetic algorithms, pattern matching, and so on. All really interesting areas that still have plenty of research potential. Which brings me to my latest idea. :)

Another endeavour

While reading up on decision trees and their formation from data matrices, it was evident that the path taken and the depth of the tree thereafter depend directly on how effectively the right questions are chosen and asked early on, so that the depth is kept relatively low, without having to pass all the way through to the last leaf - the worst-case scenario, and the lowest search effectiveness.

So that meant using recursive programming to prioritise the questions asked by means of a metric called a "purity measure", so that the decisions could take the route of the most effective decision-making questions, or rather those with the maximum purity measure. But the disadvantage of using a recursive algorithm for the decision-making is that it is quite probable to misjudge the priorities, by simply assigning the wrong purity measures, which again brings up the worst-case scenario of reaching the last leaf to find the right path. Evolutionary programming offers a solution to this problem: learning and manipulating the root nodes by calculating and assigning metrics to the purity measures, and then being able to change the purity based on the end results of using the various paths.
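To pin the recursive part down, here's a minimal sketch of that construction, using Gini impurity as the purity measure. The post doesn't commit to a specific measure, so Gini (the usual CART-style choice) is my assumption; the attribute names and data layout are likewise illustrative.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 means a perfectly pure node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def purity_gain(rows, labels, attr):
    """How much asking about `attr` reduces impurity (higher = purer split)."""
    n = len(labels)
    parts = {}
    for row, label in zip(rows, labels):
        parts.setdefault(row[attr], []).append(label)
    weighted = sum(len(p) / n * gini(p) for p in parts.values())
    return gini(labels) - weighted

def build_tree(rows, labels, attrs):
    """Recursively ask the highest-purity question first."""
    if len(set(labels)) == 1 or not attrs:   # pure node, or no questions left
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: purity_gain(rows, labels, a))
    tree = {"ask": best, "branches": {}}
    rest = [a for a in attrs if a != best]
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        tree["branches"][value] = build_tree(list(sub_rows), list(sub_labels), rest)
    return tree
```

Since `best` is picked greedily at every level, a misleading purity score at the root poisons the whole tree - which is exactly the weakness the evolutionary feedback is meant to correct.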

e.g., if path 1 starts with the root node of maximum purity measure, but another node, over a period of time, yields shorter paths to the solution, then its purity measure dynamically and correspondingly increases until it becomes higher than the one above it, so that it can substitute that node with itself.
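A minimal sketch of that feedback loop might look like the following: each question carries a learned multiplier on top of its static purity measure, and the multiplier gets nudged whenever a path turns out shorter or longer than average. The update rule and learning rate here are entirely my own illustrative assumptions - evolutionary programming doesn't fix them to one form.

```python
# Learned multipliers on top of each question's static purity measure.
weights = {}          # attr -> multiplier, starting at 1.0
LEARNING_RATE = 0.1   # illustrative value

def effective_purity(attr, static_purity):
    """Purity as seen by the tree builder: static measure times learned weight."""
    return weights.get(attr, 1.0) * static_purity

def record_outcome(root_attr, path_length, avg_path_length):
    """After answering a query, reward the root question if the path was
    shorter than average, penalise it if longer. Once another question's
    effective purity overtakes the current root's, the next rebuild of
    the tree promotes it to the root - the substitution described above."""
    delta = (avg_path_length - path_length) / avg_path_length
    weights[root_attr] = weights.get(root_attr, 1.0) * (1.0 + LEARNING_RATE * delta)
```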

So there you have it... Optimising decision trees from data matrices using evolutionary algorithms.

My first sanctioning review for this tentative project is tomorrow, and hopefully this one will at least see the light of the initial stages of the SDLC!