Tuesday, November 30, 2010

Rust Language

In case you've been living under a rock, Mozilla has been working on a new language called Rust. I've known about it for a while, though I just realized that much of what I was thinking would be good in a language is included in Rust. Here's a link to more info: https://github.com/graydon/rust/wiki


Friday, November 19, 2010

Language Ideas Part 3

This part of the series is about one of my favorite concepts in the language: iterators. These are a bit different from iterators in C++ or Java. In this case, an iterator is really just a special type of function that automatically iterates over an array or list.

Syntactically, it looks something like this:
iterator squareAll(int[]) -> int[]
    given currVal:
        currVal * currVal

It's not a finished concept. In this case, because the output is an array/list, the iterator automatically takes the return value and adds it to the output array. Whether that's appropriate is debatable; however, it is clean. Also, it looks suspiciously like a map function. Here's a version that's like a reduce:
iterator summation(int[], init) -> int
    given currVal, acc:
        acc + currVal
Here, we have an accumulator that we add each value to. However, in both cases, there is a significant flaw: we only look at one value at a time. If we need to look at previous or future values, we're out of luck. So, a recent thought I had was this:

iterator summation(int[], init) -> int
    given values, index, acc:
        acc += values[index]
In this case, we now retrieve the value using the index, much like a C array. It's not the most pleasant solution; however, it's still nicer than recursion. It's still not quite right, as it removes the convenience of not having to use an index to get a value. The trade-off is similar to the one between the foreach and for loops that exist in several languages.
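
For comparison, here is a rough Python approximation of the three variants above. It is only a sketch and not SPL itself: the names square_all, summation, summation_indexed, and pairwise_diffs are made up for illustration, and pairwise_diffs is a hypothetical case that needs the previous value, which is what the index-based form makes possible.

```python
from functools import reduce
from typing import List


def square_all(values: List[int]) -> List[int]:
    # Map-like iterator: each result is collected into the output list.
    return [curr_val * curr_val for curr_val in values]


def summation(values: List[int], init: int) -> int:
    # Reduce-like iterator: the accumulator is threaded through each step.
    return reduce(lambda acc, curr_val: acc + curr_val, values, init)


def summation_indexed(values: List[int], init: int) -> int:
    # Index-based variant: the body sees the whole list plus an index.
    acc = init
    for index in range(len(values)):
        acc += values[index]
    return acc


def pairwise_diffs(values: List[int]) -> List[int]:
    # Hypothetical example: needs the previous element, so it cannot be
    # written with a body that only ever sees one value at a time.
    return [values[index] - values[index - 1] for index in range(1, len(values))]
```

The first two never touch an index, which is the convenience the index-based form gives up.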

To go back in time a bit, the inspiration for this construct came from the D Programming Language. Specifically, it came from the concept of ranges, which have the ability to add/remove values from either end of a range as well as access values via an index. Ranges are really nothing more than some sort of data structure contained within a struct or class that has a specific API. In SPL I wanted the same thing, mainly because I wanted to allow users of the language to create lists using the best data structure for their application. Each of these data structures could then be accessed using an iterator without any additional work.
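
To make the "specific API" idea concrete, here is a small Python sketch. It is an assumption about how such a structure might look, not D's range API or anything SPL defined: DequeRange and its method names are hypothetical, and the only contract the iterator relies on is a length plus index access.

```python
from collections import deque


class DequeRange:
    """Hypothetical range-like wrapper around a deque."""

    def __init__(self, items=()):
        self._items = deque(items)

    # Add/remove at either end, as described in the text above.
    def push_front(self, value):
        self._items.appendleft(value)

    def push_back(self, value):
        self._items.append(value)

    def pop_front(self):
        return self._items.popleft()

    def pop_back(self):
        return self._items.pop()

    # The small API the iterator depends on: length and index access.
    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


def summation(values, init):
    # Works with any structure that offers len() and values[index].
    acc = init
    for index in range(len(values)):
        acc += values[index]
    return acc
```

The same summation runs unchanged over a plain list or a DequeRange, which is the "no additional work" property the paragraph is after.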

All of these ideas are good, however it's making them work together that's the hard part. I'm not really sure what the best solution is as I don't want to have two different constructs if one can do the job, but I also don't want to make things too complicated.

Perhaps it'll be figured out in the future.


Thursday, November 18, 2010

Language Ideas Part 2

State. It's not something that is particularly liked in functional programming languages as it can lead to bugs. But what are the real issues with state? To me, the real issue is state across different scopes, such as global variables. So, I did some thinking and I felt that having no mutable variables at all was too restrictive. The question was, what is the right amount of mutability?

The D Programming Language has the concept of pure functions, which cannot read or write mutable state outside of the function, such as global variables. However, you can still modify variables inside of the function, such as a loop counter. This is a concept that I liked a lot, as it allows for mutable state in a way that's much safer than normal. So, I tried to incorporate something like this in SPL.
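
As a rough illustration of that idea, here is a Python sketch. Python does not enforce purity, so this only follows the discipline by convention, and the function name sum_of_squares is made up for the example.

```python
def sum_of_squares(values):
    # All mutation is confined to locals: a running total and the loop variable.
    total = 0
    for value in values:
        total += value * value
    # Nothing outside the function is read or modified, so the same input
    # always produces the same result.
    return total
```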

This brings us to the concept of factories in SPL. Factories are similar to classes in that they contain methods that can be executed. However, they are vastly different. First, there are two types: pure and stateful. A pure factory will always return the same result for the exact same input, as there is no state stored within the factory. Essentially, it's a class with only methods and no member variables. A stateful factory, on the other hand, does not have that guarantee. However, the state associated with the factory is contained solely in the factory, thus preventing direct outside modification. Think of it as a class where all member variables are private.

Why is state allowed here? Simply put, there are many different algorithms that are made much simpler if state is allowed. For example, if I have a factory that implements a queue, instead of passing the entire queue from function call to function call, I can keep it within the factory and modify as needed. The key is that the mutable data is only accessible to functions within the factory and from nowhere else. This prevents any accidental modification of the data as you would have to explicitly call a method within the factory to modify the data.
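
Here is a minimal Python sketch of the two kinds of factory, using the queue example. It is an approximation under assumptions: MathFactory and QueueFactory are hypothetical names, and Python's leading underscore only marks state as private by convention, whereas SPL was meant to enforce it.

```python
from collections import deque


class MathFactory:
    # Pure factory: only methods, no member variables.
    @staticmethod
    def square(x):
        return x * x


class QueueFactory:
    # Stateful factory: the queue lives inside and is never handed out.
    def __init__(self):
        self._items = deque()

    def enqueue(self, value):
        self._items.append(value)

    def dequeue(self):
        return self._items.popleft()

    def size(self):
        return len(self._items)
```

Callers never pass the queue around; they can only change it by explicitly calling enqueue or dequeue.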

This brings up the next key difference between factories and classes: message passing. The design of the language was to allow a simple interface for communicating with a factory that not only isolated the design of the factory from the calling code, but could also allow for performance optimizations through threading. To hit on the latter point quickly, it was envisioned that an implementation of SPL could implicitly run factories as different threads, much like Erlang processes. This would allow asynchronous messages to be executed by the factory without the main program stopping. Think of a good logger.
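
The logger case might look something like the following Python sketch. This is only one way to picture it, assuming a worker thread and a message queue per factory; LoggerFactory, send, and stop are hypothetical names, and a real implementation would write to a file rather than a list.

```python
import queue
import threading


class LoggerFactory:
    def __init__(self):
        self._inbox = queue.Queue()  # messages waiting to be handled
        self._lines = []             # state owned by the factory
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, message):
        # Returns immediately; the caller does not wait for the write.
        self._inbox.put(message)

    def _run(self):
        while True:
            message = self._inbox.get()
            if message is None:  # sentinel used by stop()
                break
            self._lines.append(message)

    def stop(self):
        # Ask the worker to finish, wait for it, and return what was logged.
        self._inbox.put(None)
        self._worker.join()
        return list(self._lines)
```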

Why does this make communication simpler? The main reason is loose coupling between the code sending the message and the code executing the message. My vision was that each factory would have a dispatcher that would examine a message and call the correct method. The beauty of this is that if I decide to replace one method with another, the calling code does not have to change. I can create the new method, perhaps put a new entry in the dispatcher for testing, and when it's ready, simply change the entry in the dispatcher to point to the new method. Granted, not every case is safe from changes to the caller, however the more we can make it simpler to update a factory safely, the better.
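
A dispatcher along those lines could be sketched in Python as below. The class GreeterFactory, the message names, and the promote helper are all hypothetical; the point is only that callers name a message and the table decides which method runs.

```python
class GreeterFactory:
    def __init__(self):
        # The dispatcher: message name -> method that handles it.
        self._dispatch = {
            "greet": self._greet_v1,
            "greet_test": self._greet_v2,  # new method, exposed for testing
        }

    def send(self, message, *args):
        return self._dispatch[message](*args)

    def promote(self, test_name, live_name):
        # Point the live entry at the new method and drop the test entry.
        self._dispatch[live_name] = self._dispatch.pop(test_name)

    def _greet_v1(self, name):
        return "Hello, " + name

    def _greet_v2(self, name):
        return "Hello, " + name + "!"
```

Code that calls send("greet", ...) stays the same before and after promote; only the table entry changes.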

Another reason for message passing is the concept of fail-fast error handling. Erlang is designed like this, and I wanted to have that concept in SPL as well, since it's a simple and proven method for making reliable software. In short, if there is a failure in a factory, stop execution of the factory, generate error information, and send that back to the caller. The caller is then responsible for deciding what to do when an error occurs.
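
One way to picture fail-fast handling is the Python sketch below. It is an assumption about the shape of the idea rather than how SPL or Erlang actually does it: DividerFactory is a hypothetical name, and replies are plain ("ok", value) or ("error", info) tuples.

```python
class DividerFactory:
    def __init__(self):
        self._stopped = False
        self._dispatch = {"divide": self._divide}

    def send(self, message, *args):
        if self._stopped:
            return ("error", "factory stopped")
        try:
            return ("ok", self._dispatch[message](*args))
        except Exception as exc:
            # Fail fast: stop the factory and hand error information
            # back to the caller instead of trying to recover here.
            self._stopped = True
            return ("error", type(exc).__name__ + ": " + str(exc))

    def _divide(self, a, b):
        return a / b
```

The caller inspects the first element of the reply and decides whether to retry, restart the factory, or give up.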

In the end, I felt that this was a very reasonable language construct. It allowed for good encapsulation without being overly complex. Aspects of the system can be redesigned with minimal impact on the code using it.



Wednesday, November 17, 2010

Language Ideas Part 1

First, apologies for the long time between posts. It's been a stressful few months, but things will be getting better.

Now, before all the stress I began work designing a language to facilitate the creation of software in a very safe manner. I did this by trying to make the code as easy to read as possible, but to also prevent things that we know cause problems, such as shared global variables and lack of bounds checking. Unfortunately, two things happened: first, I lost all the work I did and second, I realized I don't have the time to really do it justice. So, I'm going to blog about it as I recall what I did and perhaps come up with more ideas that I believe are worthwhile.

I called the language SPL, or Safe Programming Language, and the first aspect of it I want to discuss is the syntax of functions. What I did was write out the same function in several different styles, look at the pros and cons of each, and try to make an informed decision about which one I thought was best. The same was done for the Zimbu language and I liked that approach a lot, which is why I followed suit. It worked out well, as I could see what the code would look like and try to catch poor decisions earlier in the process.

I ended up deciding that a functional programming style was best, as it really promoted the creation of very small functions, which is a very good thing: the smaller a function is, the more understandable it is. However, I didn't like any of the current functional programming styles enough because they weren't quite as readable as I wanted them to be, so I came up with my own. Below is a sample as best as I can remember it:
function square(int, int) -> int
    given x, y :
        x * y
This is designed to be read as follows: the function square takes two arguments, an int and an int, and produces an int. For pattern matching, it reads as follows: given two arguments x and y, do x * y. You can have multiple "given" clauses to handle different cases. Here's an expanded example using guards:
function abs(float) -> float
    given x when x < 0:
        x * -1
    given x:
        x
In this case, the first pattern is read: given a value x, when it is less than 0, then return x * -1. Again, easy to read. Throughout everything I did, I tried to not be overly dependent on symbols, but also not pollute the language with unnecessary keywords. I did tend to use keywords over symbols as they are much easier to understand in many cases.
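
For readers who want to see how ordered "given" clauses with guards behave, here is a rough Python approximation. It is a sketch only: the name abs_value and the list-of-pairs representation are assumptions made for illustration, with each (guard, body) pair standing in for one clause and clauses tried from top to bottom.

```python
def abs_value(x: float) -> float:
    # Each pair mirrors one "given ... when ..." clause: (guard, body).
    clauses = [
        (lambda v: v < 0, lambda v: v * -1),  # given x when x < 0
        (lambda v: True, lambda v: v),        # given x (no guard)
    ]
    # The first clause whose guard holds supplies the result.
    for guard, body in clauses:
        if guard(x):
            return body(x)
    raise ValueError("no clause matched")
```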

I hadn't covered a lot of possibilities yet, as I was in the early stages of the design and was somewhat ADD, jumping between different aspects of the language. However, I believe this to be pretty solid, and it definitely fits in with the easy-to-read aspect of the language.