Tuesday, November 21, 2006

KISS, my dear friends...

Today I was asked to look at a new piece of software that we would be using on a project. No biggie. It is a piece of server software, so I figured it would be "query the server and you will get data."

How wrong I was.

While the demos mostly worked, I could not figure out how to access the server myself. I went through all the documentation and tutorials I could find on the web site and was still quite lost. It was especially difficult because it used some "magic" to handle the requests: when I looked at the examples and at the URL that was called to make the request, I found nothing in the directory that it should have been pointing to.

Needless to say, this got me thinking about some of the things I've done in the past year. Most recently, I've been reworking an API to make it easier to reuse. Essentially, there are several libraries with some, but not many, dependencies on one another. One of the problems was that each one was a singleton, so there was a good deal of code duplication. I pulled the singleton functionality out into its own library and set it up so you can "register" either an object or just the functions within the object with the singleton. This way, if a developer wants a singleton, all they have to do is create a library with the functionality they want and register it with the singleton library. The registration functions have also been designed so that a single statement makes all the public methods and, if you register the entire object, its members accessible through the singleton. Not hard at all.
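To make the registration idea a bit more concrete, here is a minimal sketch of how such a registry could look. It is written in Python purely for illustration and is not the actual library; the names `SingletonRegistry`, `register_object`, and `register_function` are made up for this example.

```python
class SingletonRegistry:
    """Hypothetical stand-in for a shared singleton library."""

    def __init__(self):
        self._exports = {}

    def register_function(self, name, func):
        # Expose one function under the given name.
        self._exports[name] = func

    def register_object(self, obj):
        # One call exposes every public method and member of the object.
        # Methods stay bound to obj; plain members are copied as they are
        # at registration time (a simplification made for this sketch).
        for name in dir(obj):
            if not name.startswith("_"):
                self._exports[name] = getattr(obj, name)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.__dict__["_exports"][name]
        except KeyError:
            raise AttributeError(name) from None


class Counter:
    """Example library that knows nothing about singletons."""

    def __init__(self):
        self.label = "hits"
        self._count = 0

    def increment(self):
        self._count += 1
        return self._count


# The one shared instance that the rest of the program would import.
registry = SingletonRegistry()
registry.register_object(Counter())
registry.register_function("greet", lambda name: f"hello {name}")
```

With that in place, `registry.increment()` and `registry.label` go through the shared instance, and `Counter` itself needs no singleton boilerplate.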

This experience also got me thinking of several things I came up with some time ago. While I would love to say that I was inspired to create these things, they were really born out of potential necessity. A client I'm dealing with has been known to change their mind almost on a whim, so I needed a way to build functionality rapidly. I ended up building three different frameworks, for lack of a better term. All three are very different in what they are meant to do, but they have one thing in common: it is very easy to add functionality. They are all set up with a plugin-like architecture, so you can add functionality without having to modify any existing files. Piece of cake.
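For anyone wondering what "plugin-like" means in practice, the rough idea can be sketched as below. This is Python for illustration only, not the frameworks themselves; the one-file-per-plugin layout and the `handle` function name are assumptions made for the example.

```python
import importlib.util
from pathlib import Path


def load_plugins(plugin_dir):
    """Import every *.py file in plugin_dir and collect its `handle` function.

    Adding a feature means dropping a new file into the directory;
    no existing file has to change. (`handle` is a hypothetical convention.)
    """
    plugins = {}
    for path in sorted(Path(plugin_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        if hasattr(module, "handle"):
            plugins[path.stem] = module.handle
    return plugins


def dispatch(plugins, name, *args):
    """Call the plugin registered under `name`."""
    if name not in plugins:
        raise KeyError(f"no plugin named {name!r}")
    return plugins[name](*args)
```

A file `shout.py` containing `def handle(text): return text.upper()` would then be reachable as `dispatch(plugins, "shout", "hi")` as soon as it exists in the plugin directory.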

The final thought I had was "why isn't this software easy to use?" "Why can't I simply write a little bit of code, call it remotely, and see my data?" "Why isn't this EASY?!?!?!" I understand that many APIs, frameworks, architectures, etc. do things for reasons deemed valid by their creators, and there are those who love them, but I find many to be convoluted. Why aren't they easier to work with?

Well, as a final thought, I plan on posting some of the code I mentioned earlier. One is an RPC library that I started and another is an architecture for building a web site. I never got to use the RPC library; however, the web site architecture is working out very well. Compared to Fusebox, which I have used and hated, I feel this has improved my ability to make web sites quickly. If nothing else, at least I worry less about how my code fits into the architecture and focus on building the site. Anyway, I'm going to try to get this stuff onto Sourceforge or something else. Suggestions? The problem is that I am pretty busy with a Master's program, and I want to finish the RPC library and create examples of how to use both of them, so it may be a while. Sorry.

*Apologies for the vagueness at times. I didn't want to accidentally bad-mouth a product in the event it may be my own dumb fault that I couldn't get it to work.

Monday, November 13, 2006

API Done?

Well, I can get, add, and delete entities (subjects and objects), predicates, and relationships. I don't think it's quite perfect, and yes, I do know that this is very basic, but it's a start. It's something to build on, which is what I wanted. I didn't want to force anyone into a specific query language. I wanted to make it capable enough to be the foundation of a useful product.

Now, what's next?
  • Disk-based storage.
  • Sample interface. Perhaps make a server that accepts requests and sends back data in a JSON format? Maybe even something using mod_perl...
  • Clustering? Just toying with the idea now, but it may be an interesting thing to do. It'll require a persistent server, but still...
Well, that's a wrap. Most likely, I'll touch up the API a bit, but I'm happy with how much got done so far.

Wednesday, November 08, 2006

Associative Database Thoughts

Well, this may be madness but I've decided to write my own associative database in Perl. Conceptually, this seems very easy and I did get some code started that seems to work pretty well.

Guess the first thing I need to do is explain why. I know of some people who are using such a product and have run into a significant problem: it isn't cross-platform. If I recall correctly, it's Windows only and they need it to run on a Unix-like OS. Doing this in Perl sounded like a good solution, since most Perl scripts can be run on different platforms. Of course, I also wanted to do something different, and this seemed like an interesting programming exercise.

My understanding of an associative database is that it's one giant graph where everything is of the form Subject, Predicate, Object. In other words, node->link->node. No biggie. The way I figure it, I'm going to keep things simple at first with some basic insert/get functions. Initially, I'll keep things in memory, but as soon as I find a method I like, I'll make the data persistent. Oh, and delete functions, of course. Right now things are pretty simple, but as soon as we move to persistent data, I want to ensure that I don't have to load all of the data in memory at the same time.
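As a rough picture of what the in-memory stage could look like, here is a minimal sketch. It is in Python rather than Perl, purely for illustration, and it is not the code I actually started; the class and method names are invented for the example.

```python
class TripleStore:
    """In-memory set of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def insert(self, subject, predicate, obj):
        # node -> link -> node
        self._triples.add((subject, predicate, obj))

    def delete(self, subject, predicate, obj):
        # Removing a triple that is not stored is silently ignored.
        self._triples.discard((subject, predicate, obj))

    def get(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == have
                   for want, have in zip(pattern, triple))
        )


store = TripleStore()
store.insert("alice", "knows", "bob")
store.insert("alice", "likes", "perl")
store.insert("bob", "likes", "perl")
```

Here `store.get(predicate="likes")` returns both "likes" triples, and `store.get(subject="alice")` returns everything known about alice. Note that every `get` scans the whole set, which is fine for a first pass but is exactly the part that has to change once the data no longer fits in memory.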

Speaking of persistent data, I'm currently thinking of some sort of B-Tree, probably a B+Tree, but I'm not sure that's the best way to go. While it does give me a nice way to pull data off in chunks, it seems so "been there, done that." I'm not saying it's bad; I just feel the need to use something more interesting.

Granted, I know this will not be the fastest solution; however, as I stated before, my two main goals were portability and a challenge. None of the products I've heard about that do this are open-source, so I don't really know how they work, but I have theorized about it. The funny thing is, no matter how I look at it, it seems you have to use some of the same tricks relational databases use in order to get good performance, like indices. This is kind of ironic considering that associative database proponents claim that this will replace relational databases.
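To show what I mean by indices, here is a small sketch of my own guess at the idea, not how any of those products actually work. It is in Python for illustration, with made-up names: each triple is filed under both its subject and its object, so a lookup touches only the relevant entries instead of scanning every triple.

```python
from collections import defaultdict


class IndexedTripleStore:
    """Triples indexed by subject and by object (hypothetical layout)."""

    def __init__(self):
        self._by_subject = defaultdict(set)
        self._by_object = defaultdict(set)

    def insert(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        # The same triple is filed under two keys, trading memory for speed.
        self._by_subject[subject].add(triple)
        self._by_object[obj].add(triple)

    def outgoing(self, subject):
        """Triples that start at `subject`, found with one dictionary lookup."""
        return sorted(self._by_subject.get(subject, ()))

    def incoming(self, obj):
        """Triples that end at `obj`, found with one dictionary lookup."""
        return sorted(self._by_object.get(obj, ()))
```

This is essentially the same trade a relational database makes when it adds an index on a column: extra storage and slower writes in exchange for lookups that do not scan everything.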