tag:blogger.com,1999:blog-88755589207548972322024-03-08T08:33:17.140-08:00Programmatically SpeakingMy projects, ideas, and mistakes: considered useless by many, interesting by some, and humorous by few, respectively.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.comBlogger120125tag:blogger.com,1999:blog-8875558920754897232.post-70131826456690173682014-12-15T09:51:00.000-08:002014-12-15T23:28:49.487-08:00Humble designAbout four and a half years ago I wrote a post titled <a href="http://programmaticallyspeaking.blogspot.se/2010/06/only-design-pattern-is-small-solutions.html"><i>The only design pattern is small solutions</i></a>. I've recently run into the problem of how small and good intended solutions grows to massive complex systems which devour everything around them. It now reached a point where the code we understand and can work with is in minority.<br />
<br />
You know how good ideas need to reach a critical mass before they spread and become popular? Well, these massive complex systems are the black hole equivalent of that -- they absorb everything around them, making themselves larger and more complex with every idea that come in contact with it. They are good ideas that grew out of proportions into black holes of never-ending complexity.<br />
<br />
Generally, we prefer to <i>use</i> small to-the-point solutions. We like things that are easy to understand and get our job done. But we also like to show that we're good at what we're doing. Of course, the size of a system reflects the size of the solved problem, but lately I'm thinking more and more that the size of the system also reflects the size of the designer's ego. Great egos write complex software.<br />
<br />
I've had the <a href="http://programmaticallyspeaking.blogspot.se/2010/05/story-of-being-humble-with-confidence.html">luck of working</a> with really great people who made me realize what it means to write great software with a great team. Maybe at some point I'll get to experience both at the same time, or even one of them over a longer period of time. Anyway, this experience shaped me in really positive ways, and it's only lately that I realized how these two things go hand in hand. Being humble is the opposite of having a big ego -- simple is the opposite of complex. Humble people write simple software.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-90676962659794344962014-12-01T03:02:00.001-08:002014-12-09T14:17:01.260-08:00Constraint Satisfaction ProblemsThis post is brain dump of my recent dives into <a href="http://en.wikipedia.org/wiki/Constraint_satisfaction_problem">CSP</a> -- don't consider any of this to be anything but ramblings by someone who knows approximately nothing about what he's talking about.<br />
<br />
To begin with, CSP is likely a programming paradigm very different from the ones you already heard of. If functional and object-oriented programming is what you think of when you hear <i>programming paradigm</i> then CSP is like a magic bottle of golden fairy dust in that it makes everything you heard and learnt about programming seem like monkey patching a steam engine. CSP is that different. On the other hand, if you have done some <a href="http://en.wikipedia.org/wiki/Linear_programming">linear programming</a>, <a href="http://en.wikipedia.org/wiki/Logic_programming">logic programming</a>, or implemented <a href="http://en.wikipedia.org/wiki/Logic_programming">type inferencing</a>, then you'll find yourself right at home.<br />
<br />
A CSP program is an set (unordered collection) of constraints (assertions), example:<br />
<span style="font-family: Courier New, Courier, monospace;"> X > 1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> Y > X</span><br />
<span style="font-family: Courier New, Courier, monospace;"> X * Y = 21</span><br />
That it, that's the entire program. To figure out what it does, all you need to do is recall some basic math, because CSP (when operating on integers like this) is essentially the same thing as solving mathematical equations numerically. This means that executing a CSP means to solve the asserted equations. This in turn means that implementing a CSP program means to figure out what equations that should hold on the solution (the answer of the program) and write down those equations. As in math, the ordered of the assertions is irrelevant, so you can organize your program in the way that seems local to you.<br />
<br />
This is all nice and all, but how is a CSP executed? Well, this when it's get's complicated. Solving the equations/constraints that constitute the CSP is a hard problem, really hard problem. The reason for this might not be clear from the above example, but replace 21 with a much larger number (e.g., the product of two large prime numbers) and it should be clear why this is a <a href="http://en.wikipedia.org/wiki/Integer_factorization">hard problem</a>.<br />
<br />
So if it's so hard to execute a CSP why bother writing CSP programs? Well, not all programs are hard to execute and this is important to know. Some programs will take several life-times of the universe to execute while other programs will take a split second to execute. What make the difference is the size of the domain of the variables and the number of variables.<br />
<br />
The domain of a variable is the set of it's possible values. In the program above, X has the domain {2, 3, 4, 5, 6, 7, 8, 9, 10, 11} and Y has the domain {3, 4, 5, 6, 7, 8, 9, 10, 11}. Now you might say "well, clearly X nor Y can be 11!", and you would of course be right! What you have done is something called <a href="http://en.wikipedia.org/wiki/Local_consistency" style="font-style: italic;">constraint propagation</a>, which is what makes CSP programs with many variables with large domains possible to execute before the universe is rebooted.<br />
<br />
What do I mean with this? Let's take a step back and look at what's <i>really</i> given in the above program. The only thing we actually know by looking at the assertions one at the time is the following: the domain of X is {2, 3, 4, ...} and the domain of Y is {...}. That is the domains of X and Y are both infinitely large. How long time would it take to execute a program where the domains of any variable is infinitely large? Infinitely long, of course. This is bad.<br />
<br />
What to do? Constraint propagation to the rescue! What this means is that we use our knowledge of how the constrains work (less than, multiplication, and equals, in this case) to infer more knowledge. For example, if Y > X and X > 1, then Y > 2. Pretty simple. Continuing further we have that X * Y = 21, what does this tell us? Well if we know that X and Y are greater than zero (which they are) then we can infer that X <= 21 and Y <= 21. Let's again take a step back and consider what we just have done -- we started with two infinite domains (that is, a program that would take infinitely long to execute) and through a few simple step we're down at two domains of size 19 and 18 respectively, which can be executed fast on any machine you can find (including that old Atari you keep in your storage room). Constraint propagation is that big of a deal.<br />
<br />
Considering what we just did, you might ask <i>what if there is no (good) propagator for a certain constraint?</i> This is a very good question, and there is a lot of research going on (and have been going on for a long time) in finding good propagators for all kinds of constraints. Unfortunately, it's <a href="http://en.wikipedia.org/wiki/P_versus_NP_problem">likely</a> that there will always be CSP programs that execute slow (the life-time-of-the-universe kind of slow).<br />
<br />
Luckily (in some sense) this is not something that is specific for CSP programs, but programs in general -- it just becomes obvious when doing CSP programming as the paradigm open doors that previously closed to you if you came from imperative/object-oriented or functional programming.<br />
<br />
CSP is rich in research and diving into any part of it will lead you to diverse research areas. If that doesn't make you interested then what about the fact that <a href="https://code.google.com/p/or-tools/">Google</a>, <a href="http://z3.codeplex.com/">Microsoft</a>, and many more is doing it and has identified it as one of the most important areas for the the future.<br />
<br />
It's hard to make predictions especially about the future, but considering the amount of research that's is going on in this area and the fact that industry giants work on it as well, it's not too far fetched to imagine CSP to be a big part of programming in a decade or two. Or are everyone just doing it because it "fun and challenging"? No, I don't think so either.<br />
<br />
(Don't worry, there will still be room for C/JavaScript programmers in 2034. It'll be hard to implement an operating system using CSP or the CSP solver itself...)Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-81812453036826464792014-09-14T14:18:00.000-07:002014-09-14T14:18:08.814-07:00Getting more testing done with less effortIt's been a while since I posted here -- but don't worry I haven't been lazy, I just haven't posted what I've done. :) Anyway, the last few days I've been thinking about a new way (to the best of my knowledge and research) to do unit testing that is inspired by Design by Contract.<br />
<br />
<a href="http://upload.wikimedia.org/wikipedia/commons/thumb/5/5f/Whimsy.JPG/193px-Whimsy.JPG" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/5/5f/Whimsy.JPG/193px-Whimsy.JPG" /></a>What is a unit test? For me, a unit test is a small example that illustrates how to use a software unit (function, class, etc). This is good, because examples is a good way to learn how to use something. In fact, looking at examples is a natural way to learn how to do something.<br />
<br />
There is a bad side to this though, and that is that examples can't show the general idea that a test is trying to verify. For instance, imagine that you have a class that implements the data structure <i>stack</i>. The class has a <i>push</i>, a <i>pop</i>, and a <i>size</i> method, which does the expected thing.<br />
<br />
A test for the <i>pop</i> method, would probably initialize the stack with some predefine content, pop a few values from it, and verify that they are the expected values. (Alternatively, the test could setup the state of stack though a sequence of calls to <i>push</i> -- my point here is still valid).<br />
<br />
What such test fails to capture is what <i>pop</i> expects the stack to look like when it's invoked. It's just shows one instance of all possible stacks that <i>pop</i> can operate on. In fact, <i>pop</i> can operate on every possible stack except the empty stack, but examples may or may not communicate this clearly.<br />
<br />
And this is precisely what I've been toying with lately: writing tests where the setup code is replaced with a condition that must be true for the test to be valid to execute (a.k.a precondition). Let's take an example of this to illustrate the point. Below are two simple tests (written in a hypothetical version of <a href="http://programmaticallyspeaking.blogspot.se/search/label/Protest"><i>Protest</i></a> which supports the idea I'm describing) on a stack object.<br />
<br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> suite("stack test") {</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> Stack s;</span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> test("push test") {</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> int before = s.size(); </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> s.push(0);</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> postcond(s.size() == before + 1); </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> } </span></span><br />
<br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> test("pop test") {</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> precond(s.size() > 0); // pop can't operate on the empty stack</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> int before = s.size(); </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> s.pop();</span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> postcond(s.size() == before + 1); </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> } </span></span><br />
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> }</span></span><br />
<br />
These two tests are executed by a framework which invokes one test case after another <i>on the same instance of Stack</i>. In this example, that means that <i>push test</i> will be the first test executes because <i>pop test</i>'s precondition isn't fulfilled (the stack is empty). When this test is finished, the framework picks another test that can be executed, which means either <i>push test</i> again or <i>pop test</i> (since the stack is not empty anymore). If the framework decides to execute <i>pop test</i>, the stack will be empty again, thus the only test case that can be executed next is <i>push test</i>, and so on.<br />
<br />
So, why do you get more testing done with less effort by writing test-cases in this way? Well, for several reasons:<br />
<ul>
<li>Less development time is spent on setting up the <i>cut</i> (it takes less time to write the preconditions than writing the code that setups the <i>cut </i>such that it fulfills them). As a side effect, tests become shorter, more declarative, and easier to read.</li>
<li>A single test verifies that several executions paths that sets the <i>cut</i> up in a state the fulfills certain conditions results in a <i>cut</i> that passes the test. This makes it easier to verify code with messy internal state.</li>
<li>In addition to verifying behavior, a test describes the relationship between the method of the <i>cut</i>'s interface. For example, how <i>Stack::pop</i> relates to <i>Stack::size</i>. </li>
</ul>
Of course, there are some downsides as well:<br />
<ul>
<li>Tests depend on other tests. This is a big no-no and doing this intentionally might fly in the face of intuition.</li>
<li>The initial state of the <i>cut </i>is less concrete as there is no example code to follow.</li>
<li>Some classes (e.g., <a href="http://en.wikipedia.org/wiki/Builder_pattern">builders</a>) can't naturally be used repetitively in a meaningful way (when an object is built, the builder is done).</li>
</ul>
I've working on a prototype implementation of this testing framework which I call <a href="https://gitorious.org/flytest"><i>flytest</i></a> (for several different reasons (e.g., "will it fly?", "flies in the face of intuition"). My pros and cons above is the result of the limited experience I've gained from implementing it and experimenting with it.<br />
<br />
Right now, the biggest hurdle I've run into is how to express the natural recursive-ness of certain <i>cut</i>s. For example, if you push a value X to a stack you'll get X back when you pop the stack <i>as long as</i> you push and pop the same number of times between the push and the pop of X.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com2tag:blogger.com,1999:blog-8875558920754897232.post-21761925426602222182014-03-25T11:30:00.000-07:002014-03-25T11:30:50.029-07:00What are phi nodes?I've been working on a project that I call <a href="https://gitorious.org/veria/pages/Home"><i>veria</i></a> for around two months now, and yesterday it compiled the first piece of Java code to native x64 code through the <i>veria</i> jit compiler. The largest and most complex part of <i>veria </i>is implemented; what remains is mostly the backend of the jit compiler. Most of the backend is trivial stuff -- two hash lookups, a call to a <i>libjit</i> function, and a hash insertion -- however, the <i><a href="http://en.wikipedia.org/wiki/Static_single_assignment_form">phi</a></i> instruction (the <i>veria</i> instruction set is register-based single static assignment) is somewhat complicated as <i>libjit </i>uses a somewhat different model. Let's take an example; first an instruction set with phi nodes (similar to <i>veria</i>):<br />
<span style="font-family: Courier New, Courier, monospace;"> JEQ R0, R1, L1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R2 = 10</span><br />
<span style="font-family: Courier New, Courier, monospace;"> JMP L2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L1:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R3 = 20</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L2:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R4 = PHI(R2, R3)</span><br />
<br />
Second, the same function in an instruction set similar to <i>libjit'</i>s:<br />
<span style="font-family: Courier New, Courier, monospace;"> JEQ R0, R1, L1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R2 = 10</span><br />
<span style="font-family: Courier New, Courier, monospace;"> MEM[0] = R2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> JMP L2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L1:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R3 = 20</span><br />
<span style="font-family: Courier New, Courier, monospace;"> MEM[0] = R2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L2:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R4 = MEM[0]</span><br />
<br />
Looking carefully at these two instruction sequences we see that the <span style="font-family: Courier New, Courier, monospace;">PHI</span> instruction is replaced with <span style="font-family: Courier New, Courier, monospace;">R4 = MEM[0]</span> (reading from the memory) and the assignments to <span style="font-family: Courier New, Courier, monospace;">R2</span> and <span style="font-family: Courier New, Courier, monospace;">R3</span> are followed by a write to <span style="font-family: Courier New, Courier, monospace;">MEM[0]</span>. The instruction sequence without the <span style="font-family: Courier New, Courier, monospace;">PHI</span> instruction is very close to how code running on an actual computer, while the first has this magic <span style="font-family: Courier New, Courier, monospace;">PHI</span> instruction that can't really be executed... but, assuming <span style="font-family: Courier New, Courier, monospace;">R4 = PHI(R2, R3)</span> is just another spelling of <span style="font-family: Courier New, Courier, monospace;">R4 = MEM[0]</span>, all that is missing from the first sequence to be executed are the writes to memory.<br />
<br />
Let's spell those write instructions as <span style="font-family: Courier New, Courier, monospace;">PREPHI</span> and give it and it's matching <span style="font-family: Courier New, Courier, monospace;">PHI</span> instruction a unique number. Here's the result with comments saying the alternative spelling:<br />
<span style="font-family: Courier New, Courier, monospace;"> JEQ R0, R1, L1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R2 = 10</span><br />
<span style="font-family: Courier New, Courier, monospace;"> PREPHI R2, 0 // MEM[0] = R2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> JMP L2</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L1:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R3 = 20</span><br />
<span style="font-family: Courier New, Courier, monospace;"> PREPHI R3, 0 // MEM[0] = R3</span><br />
<span style="font-family: Courier New, Courier, monospace;"> L2:</span><br />
<span style="font-family: Courier New, Courier, monospace;"> R4 = PHI(R2, R3, 0) // R4 = MEM[0]</span><br />
<br />
So what we got here is an instruction set that is single static assignments but that can be mapped to instructions set such as <i>libjit'</i>s through a single pass. And it also de-mystifies the <i>phi</i> instruction<i>. </i>The cost is that any phi instruction inserted by the optimizer needs a matching prephi instructions.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-22488388534112895052014-01-12T13:17:00.000-08:002014-01-12T13:17:03.662-08:00How to design interfacesMy definition of the word <i>interface</i> is <i>point of contact for a piece of functionality</i>. Designing good interfaces to any piece of functionality is hard -- that's at least my experience. I've designed several interfaces, but very few of those make me particularly proud of my programming skills. The problems I frequently see are:<ul>
<li>too specific (hard-wired to a particular client),</li>
<li>too generic (hard to use for simple problems),</li>
<li>not extensible (adding functionality requires changing the interface), and</li>
<li>simply to darn hard to understand. </li>
</ul>
I've taken a few month off from hobby hacking, which has let my mind wondering in various direction without being held up by any particular project that I happen to work with. One theme that keeps popping up in my readings and thinkings is <i>modeling.</i> That is, given a solution to a problem (e.g, a python function someone just wrote), how should I think about it such that it makes sense to me (and others) -- in other words, how do I <i>model</i> the solution. Let's take an example to make this a bit clearer.<br />
<br />
Consider a piece of code that parses a file where each line holds either a number or a string of text. How should the interface to this code look? Some alternatives are:<br />
<ul>
<li><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">bool str_to_int_or_str(const char* content, int* int_value, const char** text);</span></span></li>
<li><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">OneOf<const char*, int> </span></span></span></span><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">str_to_int_or_str</span></span>(const char* content);</span></span> </span></span></li>
<li><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">void read_line(const char* content, void (int_func*)(int), void (txt_func*)(const char*));</span></span></li>
<li><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">void parse(const char* content, Visitor* visitor);</span></span></li>
<li><span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">template <class C, class V> parse(C content, V visitor);</span></span></li>
</ul>
I won't go into details, but it's safe to say that each of these alternatives have its good and its bad aspects. The question here is though, which alternative models the solution most accurately. I would personally go for the second to last or last alternative, as parsers and visitors is familiar to most people (by which I mean <i>programmers</i>). That is, I model the solution/code as if it's a parser -- not like some conversion function (<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">str_to_int_or_str</span></span>) or a line-based reader (<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">read_line</span></span>).<br />
<br />
With small pieces of code like the example above, it might not be clear why thinking about modeling is important. But consider designing an API for a large and complicated library. What are the concepts that the client need to understand? What are the actions it can perform? What side-effects are expected and when should they be visible?<br />
<br />
At work I'm developing an application that just-in-time compiles a proprietary language into native x64/x86 code. We've had some bugs related to when variables are updated. This probably sounds like trivial things to get right, but when you get deep down in the details these things easily slip through because your brain are occupied thinking about other aspects of the problem. These bugs could have been avoided if there was a layer that took care of these details. That is, a layer that models these inherent properties of the language and provides an interface that helps the developers to think in terms of the model.<br />
<br />
This brings me to the other aspect of modeling -- as tool for communication.<br />
<br />
Given any piece of code -- two programmers will think about it slightly different. Furthermore, the larger the code, the more likely that two programmers will understand it more different. In other words, it will be harder for them to understand each other when talking about the code. However, if the code is partitioned into smaller parts and each part <i>models </i>(not only implements) some logical sub functionality, it will be easier to understand it.<br />
<br />
Note that the important word in the above paragraph is not <i>implements</i>, it's <i>models</i>. The implementation of an interface is irrelevant for understanding what a client does if the interface models the solution properly.<br />
<br />
Think about opening and reading files. In most languages it's as straight-forward to implement it as it is to say "think about opening and reading files". The interfaces models the way we think about files. Great! That means we don't need to understand what <span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">open</span></span> does in order to understand what the code using <span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">open</span></span> does.<br />
<br />
This means that one easy way to get an interface to be easy to use is design it by mimicking something other people already understand (e.g., a parser, a visitor, a file). This is the alternative I've taken the last few times I've designed an interface -- trying to find a suitable analogy and use that as model.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-67796614213451990392013-10-24T22:48:00.001-07:002013-10-24T22:48:59.524-07:00asmf -- a portable, low-level, jit assemblerSo I've been busy lately with all kinds of big and small things, related and not related to programming. Recently I've been working on a jit assembler that I call <i>asmf</i>. You can find it <a href="https://gitorious.org/asmf/pages/Home">here</a>.<br />
<br />
Why do I call it <i>asmf, </i>that not such a nice sounding name now is it? Well, it's an assembler (therefore the <i>asm</i>-part) and it's take inspiration from printf (therefore that <i>f</i>-part). Now you must be thinking "really? printf? are you seriously parsing strings to emit binary code?". Well yes, and no.<br />
<br />
As you surely like code as much as I do, let's take an example before continuing:<br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> // Use the 'r' wildcard (meaning rax, rbx, etc) in an asmf emit statement. </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> void clear_reg(unsigned char*& dstbuf, unsigned dst) { </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> asmf("mov %r, 0", dst, dstbuf); </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> }</span><br />
<div>
I'm sure you can see the relation to printf? This code is passed to the <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">asmf</span> preprocessor, which outputs the following code:</div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> // mov %r, 0 </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> void __asmf_mov__r__0(unsigned long op0, unsigned char*& bufp) { </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> unsigned char* buf = bufp; </span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> buf[0] = 0x48 | ((op0 & 0x08) >> 3); </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> buf[1] = 0xc7 | (op0 & 0x07); </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> buf[2] = 0xc0; </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> buf[3] = 0x00; </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">buf[4] = 0x00; </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> buf[5] = 0x00; </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">buf[6] = 0x00; </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">bufp += 7; </span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">} </span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">void clear_reg(char* codebuf, unsigned dst) { </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> __asmf_mov__r__0(dst, dstbuf); </span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> </span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">}</span></div>
<div>
That is, the call to the <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">asmf</span> function is replaced to a call to a generated function, and the string literal is dropped. It's a quite simple preprocessor in this regard.</div>
<div>
<br /></div>
<div>
But how does it come up with the newly generated function? This is the core of <i>asmf </i>and this is why I even bother publishing yet another jit assembler -- there are at least 4-5 tools/libraries out there already that does the above. But <i>asmf</i> is different to all those tools, and if you run <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">sloccount</span> on it you'll understand that it is different. In less than 500 lines of code you have a jit assembler for x64 -- and ARM, and Sparc, and your OpenRISC, etc. (Well, in theory at least, as I haven't tried this yet).</div>
</div>
<div>
<br /></div>
<div>
How? By being lazy. I knew that I would never be capable of writing a full jit assembler, and I knew that I would never have the patience to reverse-engineer the output of the assembler for all instructions. That why I wrote <i>asmf</i> to do this for me. Yes, <i>asmf</i> is not only a jit assembler -- it's a jit assembler generator and an instruction encoding reverse-engineer:er. This is why I say that <i>asmf</i> should work on every/most platform.</div>
<div>
<br /></div>
<div>
The major part of <i>asmf</i> is implemented, there are some thing related to usability (error messages, more command line switches, documentation) etc. The testing is unfortunately very x64 centered right now, and that has to fixed when ported to new platforms.</div>
Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-12144251334820081062013-09-14T15:03:00.002-07:002013-09-14T15:03:31.853-07:00Pythlog, assignments, and equations<a href="http://gitorious.org/pythlog/pages/Home">Pythlog</a> is a language I'm working on from time to time. It's a dialect of Python that incorporates constraint logic programming. So far it supports integers, list, tuples, strings, and user defined classes. There are a lot things still missing, but it already capable of some pretty cool stuff:<br />
<span style="font-family: Courier New, Courier, monospace;"> def fac(n): </span><br />
<span style="font-family: Courier New, Courier, monospace;"> assert n >= 0 </span><br />
<span style="font-family: Courier New, Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;">if n == 0: </span><br />
<span style="font-family: Courier New, Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;">return 1 </span><br />
<span style="font-family: Courier New, Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;">else: </span><br />
<span style="font-family: Courier New, Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;">return fac(n - 1) * n</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">print(fac(7)) # Prints 5040</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">fac(w) = 5040</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">print(w) # Prints 7</span><br />
In this example we define the factorial function the a pretty straight-forward recursive manner. Then we call it to calculate the factorial of 7. The second to last line might appear a bit unorthodox though. What's going on here?<br />
<br />
As said, Pythlog is a logic programming language. Such languages have a few features that set the aside from traditional imperative languages. On such feature is so call <i>free variable. </i>The code above is equivalent to:<br />
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">w = free</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">assert fac(w) == 5040</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: 'Courier New', Courier, monospace;"> </span><span style="font-family: Courier New, Courier, monospace;">print(w)</span><br />
where <span style="font-family: Courier New, Courier, monospace;">w</span> is a free variable which is passed to <span style="font-family: Courier New, Courier, monospace;">fac</span>. By asserting that the return value must be equal to 5040, the constraint satisfaction framework kicks in and solves <span style="font-family: Courier New, Courier, monospace;">w</span> to 7.<br />
<br />
I recently introduced the shorthand syntax <span style="font-family: Courier New, Courier, monospace;">fac(w) = 5040</span> for this. I'm not fully happy with it yet, because there are some non-obvious behaviors. I'm pretty sure there will be some changes in this area. For now though, it make the language at least <i>look</i> nicer, soon I hope it will also feel nicer.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-26944730644842809902013-05-28T12:41:00.000-07:002013-05-31T11:13:35.048-07:00Pretty ugly hacks Recently I've (re-)discovered the more hacky kind of programming. It started after that I implemented my first <a href="http://en.wikipedia.org/wiki/Bloom_filter">bloom-filter</a>. Somehow I ended up reading about all kinds of bit twiddling hacks, and finally I found myself in the realm of <a href="http://en.wikipedia.org/wiki/XOR_linked_list">xor double linked lists</a>.<br />
<br />
Doubled linked list is a data structure where each node in the list has two pointers; one to the next node, and one to the previous node. The size of each node is thus 3 pointers. An xor linked list has the same logical function, but a compress representation -- it only requires 2/3 of the size of a normal linked list. This is achieved by observing that the only way to reach a node in a double linked list is either from the back or from the front of the list. Thus, it's enough to store the xor-sum of the pointer to the next node and the previous node, since either of those pointers is known. Anyway, the linked Wikipedia article explains it much better.<br />
<br />
Well, the neighboring country to xor-lists is <a href="http://en.wikipedia.org/wiki/Tagged_pointer">Tagged Pointer</a>-land. In Tagger Pointer it's ok to do all kinds of crazy things to pointers as long as you know what you're doing. For instance you can exploit that dynamically allocated memory on most systems are aligned to 16-bits, 32-bits, 64-bit or even 128-bits borders. What can this be used for? Distinguishing between native integer values and arbitrary-precision integers for instance, or for putting small data structures (e.g., short strings) in the "pointer" rather than having to dynamically allocating a few bytes of memory.<br />
<br />
It's actually pretty useful stuff, although -- as the title says -- pretty ugly. That is, the pretty kind of ugly.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-87837529514002724882013-05-26T11:12:00.000-07:002013-05-26T11:12:58.770-07:00Features are social processes<a href="http://upload.wikimedia.org/wikipedia/commons/thumb/e/e3/Sign_language_C.svg/200px-Sign_language_C.svg.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/e/e3/Sign_language_C.svg/200px-Sign_language_C.svg.png" style="cursor: pointer; float: left; height: 217px; margin: 0pt 10px 10px 0pt; width: 200px;" /></a><br />
For a while now I've been thinking about how human languages <span style="font-style: italic;">evolve</span> compared to how computer languages are <span style="font-style: italic;">designed</span> and how that relates to the features of the respective languages. In this post I will ramble at bit about how the meaning of software related terms is defined. I'll also discuss <span style="font-style: italic;">hard</span> and <span style="font-style: italic;">soft</span> features in programming languages and how the community surrounding the language affects, and is affected by, those features.<br />
<br />
This is mostly a bunch of ideas and observations that I'm trying to put into words to 1) make me understand them better, and 2) make sure I don't forget them. If you expect a scientific survey, then I'm sorry to disappoint you. Maybe, though, you'll find food for your own thought and ideas.<br />
<h3>
What's a language?</h3>
As I see it, languages (be it the human or programming kind) are mutual agreement between the communicating parts of what a statement means. If everyone have a different opinion of what the following statement means, then it effectively <i><span style="font-style: italic;">doesn't have any</span> meaning at all</i> since we can't use it to communicate with anyone:<br />
<pre><code>You need wood from a forest to create a fire.</code></pre>
or in computer-speak:<br />
<pre><code>Fire fire = new Fire(forest.getWood());</code></pre>
On the other hand, when we agree upon what this phrase mean, then we can do all kinds of things: discuss its correctness, use it in another context, abstract it to cover more and general cases, write a compiler for it, etc.<br />
<br />
For example, the common breed of C compiler <a href="http://www.ioccc.org/main.html">accepts</a> a lot of code most C programmers won't. It's the users of the C language that defines the subset of allowed-by-the-compiler-C that is acceptable for us to use. In other words, the C standard can say all it wants; its the C users who in the end defines the (practical) C language. It a bit like if physics would say "Our universe have 11 dimensions. Go use all of them!", but all known users of the universe are three-dimensional beings, thus, the accepted subset of the universe is three-dimensional. Sorry to break it to you, physics, but that's how it is.<br />
<br />
Language features are all about what the community around the language make of the language. In a way, language features are just as much a technical aspect of the language as a social aspect of it. Example: is a language feature that's not accepted by the community really a <span style="font-style: italic;">feature</span>, or is it just useless language complexity? More extreme example: a language without users but with extremely powerful features; effectively, does that language have any features at all?<br />
<br />
I would also say that anyone aiming to develop a successful programming language (without the backing of a <span style="font-size: 78%;"><span style="font-style: italic;">*caugh* Sun</span></span> huge <span style="font-size: 78%;"><span style="font-style: italic;">*caugh* Microsoft</span></span> corporation) needs to have equally good eye for the technical aspect as well as the social aspect. (S)he needs to understand the social processes involved for getting a community of users who agree (hopefully with the language designer) on how the language should be used. (I think python is good example of such community, by the way).<br />
<h3>
What about software?</h3>
Developing software is also a social process. For example, you get requirements from your customer, you discuss the requirements in order to understand them, and you implement them. Implementing requirement are also a social process: you design the code by discussing it with your colleagues. And what words do you use for doing that?<br />
<br />
You use words like object, generic, sort, inheritance, stack, tree, operation, method, message, reuse, client, algorithm, allocation, port, framework, mapping, service, channel, process, decoupled, assumption, resource, provider, input, interface... I could go on forever, but the point is that none of these words really mean anything if we humans don't agree on what they mean. The computer, framework, or programming language has no opinion on what "decouple the client's mapping algorithm from the port allocation" means, but programmers do. It's important it means that same to all programmers involved.<br />
<h3>
Soft and hard</h3>
How does this relate to programming language features? I think there two different kinds of features: <span style="font-style: italic;">hard</span> features that was (deliberately) designed into the language, and <span style="font-style: italic;">soft</span> features that are concepts and idioms that have evolved from using the language.<br />
<br />
Hard features are concrete. You can make a list of hard features by reading the language specification. Soft features, on the other hand, are not. They are embedded in the community and to enumerate them you need to vibe with it for a while. Hard features are taught in classes and in books; soft features are learned by hacking, hacking, living and breathing, and some more hacking.<br />
<br />
Example: C++ templates. Originally intended to provide better type-safety when writing generic code, like <span style="font-family: courier new;">std::vector</span>. The C++ community has then discovered that templates can be used for much, much more (like Boost.Spirit). There are a lot of template code written to implement various kinds of abstract features, e.g., compile-time if, compile-time strings, domain specific languages, etc. The hard feature is "write type-safe generic code". The soft features are "compile-time if", "embedded DSL", and even "it's hard, but you can evaluate <a href="http://ompf.org/forum/viewtopic.php?f=8&t=1556">everything</a> at compile-time".<br />
<br />
The D language took these soft features of C++ templates (e.g., compile-time if, embedded DSL) and integrated them into the core language. Thus, enabled more programmers to use them, because of easier syntax, error messages, compiler support, documentation, etc.<br />
<br />
So when a C++ programmer talks about enabling or disabling some piece of code (s)he needs to think about some abstract concept like the enable-if template, while a D programming just thinks "static if". In fact, I don't think the D programmer even thinks "static if" because it's so natural to them, just as the more common "dynamic if" is so natural to all of us. The D programmer probably thinks in entirely other abstract concepts because his/her mind is free from the horrible details of C++ templates meta-programming.<br />
<br />
You may argue that our mind is very good at abstracting and that this isn't a problem in practice, but I don't think that's true at all. Example: very few C++ programmer have every done something like an template computing the sinus of an angle, so when they're told to optimize a piece of code doing trigonometry what they'll do is to use some kind of table look-up. A D programmer, on the other hand, will simply slap together a sinus function that can be evaluated statically by the compiler because compile-time evaluation is nothing magic to her/him. (In fact, in <a href="http://www.digitalmars.com/d/2.0/features2.html">D 2.0</a> match function will automatically be evaluated at compile-time if their arguments are constants).<br />
<br />
What I'm saying here is that compile-time evaluation is a (hard) language feature of D but not of C++ (where it is an soft feature). Sure, you <span style="font-style: italic;">can in theory</span> do compile-time evaluation of <span style="font-style: italic;">everything</span> you need in C++ (templates are <a href="http://en.wikipedia.org/wiki/Template_metaprogramming">Turing complete</a>), but not in practice because it's so hard that you actively avoid it. Thus, in most programmers conscious mind, C++ does not have compile-time evaluation. Similarly, you can do object-oriented programming in C, but you probably don't because it's hard and littered with horrible details. Thus, C is in most programmers mind not a object-oriented language. It's possible to do structured programing in Commodore BASIC, but most of us don't think of it like that. Got my point yet? Good.<br />
<h3>
Ok, so what?</h3>
By now you are probably thinking "that's all very interesting, but how is this useful?". Well, I did say that this post was a rambling, didn't I? :)<br />
<br />
Seriously though, I don't really think any of this will make you a better software developer, but I think it could be useful if you are developing an tool and there's a community of users around it. Be sure to note what kinds of words the users uses, what features they ignore, what idioms they invent, etc. Integrate the words and idioms into the tool and it's documentation to make the experience for a new user of the application more consistent.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com2tag:blogger.com,1999:blog-8875558920754897232.post-71459941003632722342013-05-24T07:46:00.000-07:002013-05-24T15:43:28.101-07:00Small features distinguishes Protest<a href="http://programmaticallyspeaking.blogspot.se/search/label/Protest"><i>Protest</i></a> is a unit testing framework for C++ that make testing fun, fast, and simple (see <a href="https://gitorious.org/protest/pages/Home">introduction</a>). Recently, after using it on one of my own project, I've added a few more features to it.<br />
<div>
<br /></div>
<div>
First, I improved the compilation time for compiling a .cc file containing a test cases. This was embarrassingly bad earlier, but now I think its pretty good. The issue was that there were a lot functions <i>defined</i> in the header. I did a bit of preprocessing magic to avoid unnecessary function definition (in favor of function declarations) and that lower the compilation times dramatically.</div>
<div>
<br /></div>
<div>
Second, I made the <span style="font-family: Courier New, Courier, monospace;">check</span> macro a bit more intelligent. <i>Protest</i>'s check macro was already quite smart: it's capable of dealing with <a href="http://code.google.com/p/hamcrest/">matchers</a>, and splitting the left hand side and the right hand side in a comparison (e.g., <span style="font-family: Courier New, Courier, monospace;">foo != bar</span>), which means that there is no need for several check macros as is common for C++ unit test frameworks.</div>
<div>
Anyway, while working on some bit twiddling heavy code my tests tended to look something like this:</div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> test("twiddling #2") {</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> check(f(10) == 0x1f);</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> check(f(11) == 0x2f);</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> }</span></div>
<div>
and when a test failed the following was printed:</div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> test.cc:9: f(10) == 0x1f (30 == 31) failed. [suite/</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">twiddling</span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> #2][].</span></div>
</div>
<div>
which is all pretty nice, isn't it? Well, not really.</div>
<div>
<br /></div>
<div>
I found myself having to convert the decimal print-outs to hexadecimal before the error made sense to me -- bit twiddling is not easy in base-10... It didn't take long until I wished that my favorite unit test framework did this automatically for me. How nice then that I had the author of it so conveniently close. Sitting on the very same chair even!</div>
<div>
<br /></div>
<div>
Fast forward one hour and <i>Protest</i> now gives the following print-out for the example above</div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> test.cc:9: f(10) == 0x1f (0x1e == 0x1f) failed. [suite/</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">twiddling</span><span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> #2][].</span></div>
</div>
<div>
that is, <i>Protest</i> recognizes that I used hexadecimal literals in the <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">check</span> and changes the base of the integers in the print-out appropriately.</div>
<div>
<br /></div>
<div>
Pretty convenient.</div>
<div>
<br /></div>
Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-1158007552618670242013-05-05T02:59:00.001-07:002013-05-05T02:59:26.643-07:00pythlog -- python on constraint-logic-programming steroidsI've programmed in Python for about 5 years and I've programmed in Prolog for about half a year. I know my way around python-land reasonably well, but I'm only beginning to learn Prolog. What I have learnt in the last half year, though, is how hard it is to be taken serious when you tell another developer <i>we should really use Prolog to solve this problem</i>. Prolog is simply a too far-off world if you're used to python-land.<br />
<br />
A lot about learning a new tool or language is how easy it is to
leverage existing knowledge. And Prolog does not fair well in this
regard. <i>No side effect? No classes? No for statement? What's this funky if? Where's my list comprehension? Recursion -- for real?? You have to be kidding me! </i>These are all responses of a hypothetical, but likely, programmer that learns Prolog.<br />
<br />
Even though I'm advocating learning Prolog, there are some thing that just seem to fundamentally be designed to work against me, and I keep thinking <i>if Prolog only was a tiny bit forgiving and accepted me for the programmer I am... and my tiny brain</i>. In addition, Prolog lacks a lot of syntactic sugar for common operation, such as string concatenation.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: right; text-align: right;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.thehindu.com/sci-tech/energy-and-environment/new-snake-species-found-in-ap/article3719048.ece" imageanchor="1" style="clear: right; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="212" src="http://www.thehindu.com/multimedia/dynamic/01164/HY03_ISBS_SNAKE_1164257f.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A snake around a piece of wood. Get it?</td></tr>
</tbody></table>
As an experiment of how many of Prolog good properties can be moved into python-land, I've started working on a python-to-prolog compiler. The intention is primarily very egoistic: I want a language that makes my life as a programmer better. The way I see it marrying Python with Prolog (or the other way around?) can potentially bring a very powerful language. I'm calling this marriage <i>pythlog</i> and here's the techy elevator pitch:<br />
<br />
<i>pythlog </i>is a language that melds the power of constraint logic programming with the ease of development of Python.<br />
<br />
What does this mean? It means that not only can a lot of traditional Python code be compiled and run, such as<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> def mersenne_prime(n):</span><br />
<span style="font-family: Courier New, Courier, monospace;"> return 2 ** n - 1</span><br />
<span style="font-family: Courier New, Courier, monospace;"> def format_string(a, b):</span><br />
<span style="font-family: Courier New, Courier, monospace;"> return "[" + a + b + 2 * a + "]"</span><br />
<span style="font-family: Courier New, Courier, monospace;"> </span>but also search/equation solving such as:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;"> n = free # solve this variables</span> <br />
<span style="font-family: Courier New, Courier, monospace;"> assert mersenne_prime</span><span style="font-family: 'Courier New', Courier, monospace;">(n) == </span><span style="font-family: Courier New, Courier, monospace;">6189700196426</span><span style="font-family: Courier New, Courier, monospace;">90137449562111</span><br />
<span style="font-family: Courier New, Courier, monospace;"> a = free</span><br />
<span style="font-family: Courier New, Courier, monospace;"> b = free </span><br />
<span style="font-family: Courier New, Courier, monospace;"> assert format_string(a, b) == "[abc, abcabc]"</span><br />
where <span style="font-family: Courier New, Courier, monospace;">n</span>, <span style="font-family: Courier New, Courier, monospace;">a</span>, and <span style="font-family: Courier New, Courier, monospace;">b</span> are free variables that will be bound to values by solving the above equations. In this example <span style="font-family: Courier New, Courier, monospace;">n</span> is solved to <span style="font-family: Courier New, Courier, monospace;">89</span> and <span style="font-family: Courier New, Courier, monospace;">a</span> to <span style="font-family: Courier New, Courier, monospace;">"abc"</span>, <span style="font-family: Courier New, Courier, monospace;">b</span> to <span style="font-family: Courier New, Courier, monospace;">", "</span>.<br />
Before you get too excited, please note that state of <i>pythlog</i> is proof-of-concept: there is basic integer arithmetic, string, and list support. There is also some support for user defined classes. <br />
<br />What can be expected to be supported of the Python language? Well, so far my prototyping indicates that at least integers, strings, lists, sets, dicts, and user-defined classes can be supported. I've done a lot of prototyping to find out what can be done and what can't be done, and I think there is a lot of potential here. In fact, by using existing tools <i>pythlog</i> will support compilation to native code (using the gprolog compiler), which is cool in itself.<br />
Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-72518556154239373632013-04-01T00:56:00.001-07:002013-04-01T00:56:41.316-07:00Five years of hacking and searchingFive years ago I write my <a href="http://programmaticallyspeaking.blogspot.se/2008/04/programatically-speaking.html" target="_blank">first</a> post on this blog. Looking back I realize that writing here has been like a hacking diary of some sort. I can see how my interests has shifted by looking on the topics of my posts.<br />
<br />
The topic for this 110th post is <i>Five years of hacking and searching</i>, because it's not only been five years of hacking, fiddling, and playing around with ideas -- it has also been five years for searching.<br />
<br />
Searching for what? Many things: <a href="http://programmaticallyspeaking.blogspot.se/search/label/patters" target="_blank">patterns</a>, <a href="http://programmaticallyspeaking.blogspot.se/search/label/patters" target="_blank">tools</a>, <a href="http://programmaticallyspeaking.blogspot.se/search/label/design" target="_blank">design</a>, <a href="http://programmaticallyspeaking.blogspot.se/search/label/languages" target="_blank">languages</a>, etc. Ever since I took my first stumbling steps in the programming world by copying code from my Commodore Basic User's Manual, I've felt that there must be better ways of doing this -- this activity that we call hacking, programming, or developing, depending on if we're on couch or at the office.<br />
<br />
At the time of User's Manual-copying I was a eight or nine year old kid with an exceptionally average brain capacity, and neither could I understand what the code I was typing did, nor could I imagine how the world of computing would develop -- or that I would make my living out of it for that matter. But that feeling of that things could be done better was there. Why did I have to manually save line numbers in my Basic program (you know, 10, 20, 30, etc)? Why didn't the computer do that?<br />
<br />
After some time I found the <span style="font-family: "Courier New",Courier,monospace;">renumber</span> command which actually did this automatically. Cool, a command that help me writing my Basic program! This made me very excited: <span style="font-family: "Courier New",Courier,monospace;">renumber</span> was a program that operated on another program (source)! Imagine how awesome that though is for a little kid who just are beginning to program and learning what can be done with a computer.<br />
<br />
A few years later my father brought a PC home from work. I was very interested in this machine, but what could I do with it? Well, not much it seemed -- there was no Basic! I don't recall how I found it out -- I guess hours and hours looking around on the hard drive -- but I finally found <span style="font-family: "Courier New",Courier,monospace;">qbasic.exe</span>. Oh, how I loved playing around in this environment. It was like nothing I had ever seen before. It was an editor. You ran the program in the editor. You could step programs line-by-line in the editor. The help of the editor and the language as integrated into the this environment. The QBasic language even had graphical modes! This was amazing for a computer enthusiast who's feet didn't reach the floor when he was sitting in his father's chair programming his computer through this blue-looking editor. Exploring what the computer memory contained through PEEK was fun, as was seeing it randomly crashing when I POKEd it.<br />
<br />
Move forward a couple of year to the mid 1990's. Somehow my family got internet over 28.8 modem and somehow I realize that I can use internet to find solutions to my programming questions. I find support for mouse in QBasic, and crazy cool graphical tricks that I never understood. But I got some pretty cool stuff working with it. I remember writing a GUI application in QBasic with mouse and windowing support. I remember particularly fondly how I by accident stumble upon recursion while doing this. I remember very clearly how I looked at the screen popping one window, and then another when I pressed a GUI button, and when I closed the top window, the one below was still working. A totally <i></i>awesome feeling of achievement that I never have been able to reach again.<br />
<br />
Late 1990 and I'm introduced to Pascal in school. Pascal was cool because it actually felt like a real programming language. Why did it feel like a real programming language? Because there was a compiler, and you had to import modules, etc, to get things working. Pascal has some good things, but it never reached up to QBasic as a play-around-and-learn-as-you-go language. I did manage to write a platforming game that was inspired by a <a href="http://www.youtube.com/watch?v=7tT0efGw9oc" target="_blank">swedish comedy show</a>, but for one reason or another writing in Pascal was more like fencing with the compiler than writing programs.<br />
<br />
The next languages I learnt was Scheme at the university, which was the most eye-opening programming experience of my life. Somehow Scheme just <a href="http://imgs.xkcd.com/comics/lisp.jpg" target="_blank">made sense</a> to me -- power wrapped in simplicity. Later I learnt C++, which was the complete opposite of learning Scheme -- C++ is a language that "just made sense" to me. C++ has a simple and beautiful core in it's C legacy (C is tiny beautiful language if you approach it from the assembly/machine level), but wrapped around this core are classes, templates, exceptions, etc. The result? Complexity wrapped around simplicity.<br />
<br />
Later I learnt Java and Python at work. Java is C++ made simple, but it also C++ without the power. When I first learnt Java I was stunned by how fast I got from idea to working code. But Java isn't elegant. You can't design elegant application in Java, because the powers for doing so has been lost in the process of dumbing down C++. What about Python? It's a very nice language with tonnes and tonnes of nice libraries, but it's not a language that makes me a better programmer. It doesn't force me to come up with elegant solutions, on the contrary, Python encourages hacky solutions. Do I dislike Python? No. Java? No. C++? No.<br />
<br />
Then half a year ago I learned Prolog. I don't think Prolog is a practical language, nor is it a language where the simple obvious solution to a problem is the solution that fits the language. However, Prolog makes me a better programmer and it's makes me (want to be) smarter. No other language I've played around with has done this -- except (Q)Basic when I was 11.<br />
<br />
Why am I telling you all this? Well, as this five-years-old-celebration post it's partly about my life, but there is also a deeper point. A good language isn't a language with the most flexible type-system, the deepest meta-programming possibilities, the most coherent syntax, or the most well-defined semantics. A good language is a language the makes you a better programmer and in turn make you write better programs. Obviously, this implies that a good language isn't necessarily a practical language, but a good language is a good companion on your search for better solutions.<br />
<br />
As I told you, my biggest personal achievement as a hobbyist programmer was when I was kid writing a GUI program from scratch. Would that be something I would bother doing today? Probably not -- there are several GUI toolkits available and writing one from scratch is a massive undertaking. Why did I do it? First, there was no GUI toolkit available for QBasic (the best of my knowledge); and second, I was already playing around with lines, boxes, and 16 colors on a 640x480 pixel screen. It was only natural to make something GUI-like out my 16-color boxes. The programming environment let me transition from text-based programs to simple graphical programs, and then move on to writing GUI toolkits. The environment helped me becoming a better programmer.<br />
<br />
I'm wrapping up this post by mentioning something about my latest project. I wish to make a good (according to the definition above) <i>and </i>practical language. I'm doing this through the following equation Python + Prolog = Pythlog. Python is the language that I write most of my hobby hacks in, and Prolog is the language I write most of my <i>interesting</i> hobby hacks in. Python is a friendly language for solving problems, and Prolog is a friendly language for solving <i>complicated symbolic</i> problems. Pythlog is the sweet spot between these two languages. It allows Python-like programs to perform complicated symbolic operations in very concise ways. It allows equational reasoning about the code, and even running functions "in reverse" (what argument should I provide <span style="font-family: Courier New, Courier, monospace;">f</span> such that it returns <span style="font-family: Courier New, Courier, monospace;">17</span>?).<br />
<br />
<a href="https://gitorious.org/pythlog" target="_blank">Pythlog</a> isn't particularly far developed, but there is a prototype compiler that pass some <a href="https://gitorious.org/pythlog/pythlog/trees/master/test/systest" target="_blank">tests</a> that show of Pythlog's strengths.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-38176987550628584422013-02-27T12:47:00.003-08:002013-02-27T12:47:41.991-08:00The cost of #include in Protest<a href="https://gitorious.org/protest/pages/Home" target="_blank"><i>Protest</i></a> is a unit test framework for C++ unit test framework that distinguishes itself by having a single all-covering check/assert and deals with fixtures in an extremely slick way. I recommend that you check it out.<br />
<br />
A few month ago I implemented <i>stringification</i> in <i>Protest</i> and I also wrote functions that stringifies all the classes in STL, <span style="font-family: "Courier New",Courier,monospace;">std::vector</span>, etc. This feature added some value to <i>Protest</i> but it also caused the compilation times to increase dramatically. The test suite of <i>Protest </i>ran in 2.5 minutes before this feature, and almost 4 minutes afterwards. What happened?<br />
<br />
Well, what happened was <span style="font-family: "Courier New",Courier,monospace;">#include-</span>ing every possible header of STL. So how to fix it? Don't include every header of STL? Exactly.<br />
<br />
But then the code won't compile as the used classes, e.g., <span style="font-family: "Courier New",Courier,monospace;">std::vector,</span> isn't declared. To get around this issue I did something really ugly... By surrounding the relevant code with <span style="font-family: "Courier New",Courier,monospace;">#if</span>/<span style="font-family: "Courier New",Courier,monospace;">#endif</span>, the stringification function for <span style="font-family: "Courier New",Courier,monospace;">std::vector</span> is only compiled if the <span style="font-family: "Courier New",Courier,monospace;">vector</span> header has been included by the client code. This is done by checking if the include guard used by <span style="font-family: "Courier New",Courier,monospace;">vector</span> is defined.<br />
<br />
When this was implemented the test suite of<i> Protest </i>again finished in around 2.5 minutes.<br />
<br />
Of course, this has to be done for each supported compiler, but that's a small price to pay considering ~40% less time spent on waiting for your test suite to compile. Actually, I haven't ported this to MSVC yet.<br />
<br />
On a more philosophical note, how this feature was added without cost says a lot about <i>Protest</i> -- it questions the status quo, complexity, and lack of features in other C++ testing frameworks. <i>Protest</i> tries to improve.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-61419052678857548112013-01-17T12:15:00.001-08:002013-01-17T12:21:43.838-08:00Python type checking -- or why you should learn PrologA while <a href="http://programmaticallyspeaking.blogspot.se/2012/10/why-you-must-know-about-prolog.html" target="_blank">ago</a> I wrote about the programming language Prolog. Even longer <a href="http://programmaticallyspeaking.blogspot.se/2010/01/statically-typed-and-compiled-python.html" target="_blank">ago</a> I wrote about a python type-checker (and compiler) I was then working on. That type-checker and compiler never reached a state where it could compile any python code, however, it did type check quite complicated code.<br />
<br />
Today I started working on a new type-checker for python that is based on Prolog. Prolog is a logic programming language with several features that makes it an ideal language to implement type-checking in. Here's an example of a piece of code that it successfully type-checks:<br />
<span style="font-family: "Courier New",Courier,monospace;"> A = 1<br /> B = 1.0<br /> C = "c"<br /><br /> def plus_C(value):<br /> return C + value<br /><br /> def func0(a):<br /> global A, B, C<br /> A = "hello"<br /> result = plus_C(A)<br /> C = 10<br /> result = plus_C(B) + a<br /> return result</span><br />
<br />
The way it works is pretty straight-forward. The type-checker simply disassembles the relevant functions (using the <span style="font-family: "Courier New",Courier,monospace;">dis</span> module), and outputs Prolog code where each instruction is replaced with a Prolog predicate, as follows:<br />
<span style="font-family: "Courier New",Courier,monospace;"> py_plus_C(Module, V_value, ReturnValue) :-<br /> load_attr(Module, 'C', T0),<br /> binary_add(T0, V_value, T1),<br /> return_value(T1, ReturnValue).<br /></span><br />
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> </span>py_func0(Module, V_a, ReturnValue) :-</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> </span><span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> </span>store_attr(Module, 'A', py_str),</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> </span><span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> </span> load_attr(Module, 'plus_C', T0),</span><br />
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> </span> load_attr(Module, 'A', T1),<br /> </span><span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"></span> call_function(Module, T0, [T1], T2),</span><span style="font-family: "Courier New",Courier,monospace;"><br /> store_attr(Module, 'C', py_int),<span style="font-family: "Courier New",Courier,monospace;"><br /> </span>load_attr(Module, 'plus_C', T3),<br /> </span><span style="font-family: "Courier New",Courier,monospace;">load_attr(Module, 'B', T4),</span><span style="font-family: "Courier New",Courier,monospace;"><br /> call_function(Module, T3, [T4], T5),<br /> </span><span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> </span>binary_add(T5, V_a, T6),<br /> </span><span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"></span>return_value(T6, ReturnValue).</span><br />
<br />
Here, the type-checker infers the return type of <span style="font-family: "Courier New",Courier,monospace;">func</span> to be <span style="font-family: "Courier New",Courier,monospace;">float</span>, and the type of the argument <span style="font-family: "Courier New",Courier,monospace;">a</span> to be either <span style="font-family: "Courier New",Courier,monospace;">int</span> or <span style="font-family: "Courier New",Courier,monospace;">float</span>. Note that the definition of <span style="font-family: "Courier New",Courier,monospace;">Module</span> is omitted because it's mostly just an internal representation of the python module and not terribly interesting.<br />
<br />
In addition to dealing with mutable state (mutation really complicates type-checking), it handles parametrized types (e.g., <span style="font-family: "Courier New",Courier,monospace;">list</span>) as the following example illustrates:<br />
<span style="font-family: "Courier New",Courier,monospace;"> def func1(lla, lb):<br /> return lla[0] + lb</span> <br />
<br />
where it infers that <span style="font-family: "Courier New",Courier,monospace;">lla</span>, <span style="font-family: "Courier New",Courier,monospace;">lb</span>, and the return type is:<br />
<ul>
<li>list(float), float, float</li>
<li>list(int), float, float</li>
<li>list(int), int, int</li>
<li>list(str), str, str</li>
<li>list(list(X)), list(X), list(X)</li>
<li>dict(int, list(X)), list(X), list(X)</li>
</ul>
and several more alternatives. That is, the type-checker not only makes sure that a piece of code is correct with regards to types, it also infers <i>which types are possible</i>.<br />
<a href="http://upload.wikimedia.org/wikipedia/commons/3/36/Cool_Cat_Barnstar_01c.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/3/36/Cool_Cat_Barnstar_01c.png" /></a><br />
Now to the cool part.<br />
<br />
<br />
The entire thing is 236 lines of code.<br />
<br />
<br />
Ok, granted it doesn't handle <span style="font-family: "Courier New",Courier,monospace;">if</span>s, <span style="font-family: "Courier New",Courier,monospace;">for</span>s, etc, but still. The type-checker can be found <a href="https://gitorious.org/patsy/patsy" target="_blank">here</a>. Unfortunately, I currently don't have time to continue working on it, but the code is there and proofs the idea.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-66248769152893974762013-01-16T14:37:00.000-08:002013-01-16T15:03:39.464-08:00My open source and my closed workI usually don't write about things that actually happen to me. Instead I focus on describing tools, cool ideas, or just telling a joke. In this post, though, I'll tell an ongoing story about bringing <a href="http://programmaticallyspeaking.blogspot.se/search/label/Protest" target="_blank"><i>protest</i></a> into the company where I work.<br />
<br />
<i>protest</i> is a neat little unit testing framework for C++ that I started working on around September last year. It's been <a href="https://gitorious.org/protest" target="_blank">open-source</a> under the Boost Software Licence since its very beginning. The reason I started working on it was for one simple reason that has caused many hackers to start... well, hack... <i>scratching an itch.</i><br />
<br />
The particular itch in this case is the terrible state of C++ unit testing. I've tried several testing frameworks over the year, but all make me say <i>yuck, do they really want me to write that? </i>or <i>should I really do that manually?</i> or even worse <i>what... I can't do that?</i><br />
<br />
I've already written about <i>protest</i> here several times, so I won't do that again. What I will do however, is describing the process of using <i>protest</i> at my work. It started in November when I presented <i>protest</i> to my colleagues. They were positive and saw it as a good candidate for replacing <a href="http://unittest-cpp.sourceforge.net/" target="_blank">UnitTest++</a> that we're currently using.<br />
<br />
I'm working at a company that is very protective of it's source code and information -- for good reasons. What I am worried about is that if we started using <i>protest</i> without explicit acceptance and knowledge from some manager(s), I might run into problems if the source is found on the internet by the "security police" since it has my name on it (my user name on Gitorious is my real name, just as here on my blog). If they found it under my name on the internet, they can (falsely) draw the conclusion that I brought the code outside of the company.<br />
<br />
So, to make sure this wouldn't happen I contacted a manager and explained the situation. Unfortunately, he contacted a person who specializes in law that looked into the matter in more detail. The response I got was <i>we can't accept this, CompanyName might lose the the right to use protest if this-and-that</i>, which wasn't true at all of course.<i> </i><br />
<br />
I got a bit put off by this, but I finally got back to the issue this week. My response went along the following lines:<br />
<i><br /></i>
<i>Regardless if you acknowledge and accept the license under which protest is published, you should understand that any open-source software can be used by any employee at CompanyName at any time. I know for a fact that we/CompanyName is using open-source licenced software, indeed, we <u>rely</u> on it daily.</i><br />
<br />
I'm not sure if this was I good idea or not.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-11578368754634522452013-01-04T15:56:00.000-08:002013-01-04T16:03:35.946-08:00Documentation generation in canji<i>canji</i> is a tool-chain generator that I've been writing about <a href="http://programmaticallyspeaking.blogspot.se/search/label/canji" target="_blank">before</a> on this blog and on <a href="https://gitorious.org/canji/pages/Home" target="_blank">gitorious</a>. The current focus lies on inferring various attributes of instructions. Recently I've spent some time on generating a descriptive string for each instruction, which will be used for the generated instruction set manual. This blog describes the approach for doing so.<br />
<br />
Instructions are described in a C-like language, so to generate a descriptive string for an instruction the code implementing it has to be understood on a high level. This may sound like a dauntingly complex task, but since the purpose of <i>canji</i> is to build virtual machines, we have a good idea of what kind of code that will be processed.<br />
<br />
For instance, the following is needed to understood:<br />
<ul>
<li>reading and writing registers,</li>
<li>reading and writing memory,</li>
<li>addition, multiplication, shifting, etc,</li>
<li>incrementing and decrementing </li>
<li>pushing and popping,</li>
<li>branching (assigning the program counter),</li>
<li>conditional jumps, moves, etc,</li>
<li>call function,</li>
<li>return from function,</li>
<li>and more.</li>
</ul>
In theory, there are arbitrary many ways of expressing any of the above. In practice, though, there aren't particularly many ways of implementing the <span style="font-family: "Courier New",Courier,monospace;">pop</span> instruction, or the <span style="font-family: "Courier New",Courier,monospace;">add</span> instruction, or any/most of the other.<br />
<br />
So the obvious approach to generate a descriptive string is to simply implement a bunch of real-world instructions and make the description generator generate an appropriate description.<br />
<br />
This kind of work is time consuming but simple,<i></i> and in fact not so much different to what an optimization pass of a compiler does. When a piece of code is optimized it is matched against a bunch of patterns and if any of the patterns matches, the code is rewritten according to the rules of the matched pattern.<br />
<br />
When the description is generated for an instruction, its code is matched against a set of patterns, and when a matching pattern is matches a description is generated.<br />
<br />
That's the simplest approach for doing this and the approach taken so far. It's likely that this will be extended in the future and have several layers of patterns that deals with different attributes of the code, e.g., conditional, etc.<br />
<br />
Here's some example of descriptions generated by <i>canji</i> and the code its generated from:<br />
<ul>
<li>Loads the value at memory address <span style="font-family: "Courier New",Courier,monospace;">r<i>src</i></span><span style="font-family: "Courier New",Courier,monospace;"></span> into <span style="font-family: "Courier New",Courier,monospace;">r<i>dst</i></span>. <span style="font-family: "Courier New",Courier,monospace;"><br />load src, dst<br /> r[dst] = mem[r[src]];</span> </li>
<li>Stores <span style="font-family: "Courier New",Courier,monospace;">r<i>src</i></span> at memory address sp and increments <span style="font-family: "Courier New",Courier,monospace;">sp</span>.<br /><span style="font-family: "Courier New",Courier,monospace;">push src<br /> mem[sp] = r[src];<br /> sp = sp + 1;</span></li>
<li>Branches to <span style="font-family: "Courier New",Courier,monospace;">addr</span> if <span style="font-family: "Courier New",Courier,monospace;">status.z</span> == 1.<br /><span style="font-family: "Courier New",Courier,monospace;">jeq addr<br /> if (status.z == 1)<br /> pc = pc + 1;<br /> else<br /> pc = addr;</span></li>
</ul>
Obviously, this is not perfect, but provided the low-level code it's generated from it pretty ok. Also, it's worth noticing that names of instructions, variables, operands, etc, aren't used used to infer any information.<br />
<br />
The next step is to analyse the registers of the virtual memory infer what purpose they fill and generate a description.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-7323596714090079262012-12-28T14:32:00.000-08:002012-12-28T14:32:38.250-08:00Syntax inference in canji<i>canji</i> is a tool that aims to generate a tool-chain for virtual machines or hardware emulators given a compact description of said machine. The tool-chain contains a interpreter, jit, assembler, programmers' reference manual, etc. In this post we explore the syntax inference engine and how it is used to to generate the assembler of the tool-chain.<br />
<br />
First some background information about the way the virtual machine is described. A language developed especially for the purpose is used in conjunction with a very relaxed grammar for the instruction syntax. This is an example of a <span style="font-family: "Courier New",Courier,monospace;">move</span> instruction similar to the <span style="font-family: "Courier New",Courier,monospace;">move</span> instruction of the x64 architecture:<br />
<span style="font-family: "Courier New",Courier,monospace;"> move base + scale * idx + offset, src</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> mem[r[base] + scale * r[idx] + offset] = r[src];</span><br />
where the first line describes the instruction's syntax in an abstract way and the second line is the concrete implementation of the instruction (<span style="font-family: "Courier New",Courier,monospace;">mem</span> and <span style="font-family: "Courier New",Courier,monospace;">r</span> are machine-global variables).<br />
<br />
First, the syntax inference engine infers that <span style="font-family: "Courier New",Courier,monospace;">scale</span> and <span style="font-family: "Courier New",Courier,monospace;">offset</span> are use directly in integer arithmetic, thus these two operands are <i>immediates</i> and should be written as such. For instance <span style="font-family: "Courier New",Courier,monospace;">#17</span> or <span style="font-family: "Courier New",Courier,monospace;">#42</span>.<br />
<br />
Second, the inference engine derives that <span style="font-family: "Courier New",Courier,monospace;">base</span>, <span style="font-family: "Courier New",Courier,monospace;">idx</span>, and <span style="font-family: "Courier New",Courier,monospace;">src</span>, are used to index the register file <span style="font-family: "Courier New",Courier,monospace;">r</span>, thus should be written as for instance <span style="font-family: "Courier New",Courier,monospace;">r0</span>, <span style="font-family: "Courier New",Courier,monospace;">r3</span>, or <span style="font-family: "Courier New",Courier,monospace;">r15</span>. That is, the name of the register file (<span style="font-family: "Courier New",Courier,monospace;">r</span>) followed by the index of the register.<br />
<br />
Third, the inference engine derives that the expression <span style="font-family: "Courier New",Courier,monospace;">r[base] + scale * r[idx] + offset</span> is used to index memory and thus should be written according to the <i><span style="font-family: inherit;">memory access</span></i> syntax, which is for instance <span style="font-family: "Courier New",Courier,monospace;">[r0 + 2 * r1 + 2]</span> (the Intel assembly flavor).<br />
<br />
To sum up, these three items are used to derive a syntax for the <span style="font-family: "Courier New",Courier,monospace;">move</span> instruction that such that the following example is valid:<br />
<span style="font-family: "Courier New",Courier,monospace;">move [r0 + 8 * r1 + 16], r8</span><br />
<br />
Currently the syntax inference engine distinguish register files from memories by simply looking at their sizes. Even though this is in the general case not 100% accurate, it is accurate enough in the majority of (reasonable) cases. So it is likely that this will remain.<br />
<br />
The syntax inference engine uses another inference engine to complete its work -- the optional operands inference engine. This engine find which operands that trivially can be made optional because they have a natural no-op value. An example of such operand is <span style="font-family: "Courier New",Courier,monospace;">offset</span> for the <span style="font-family: "Courier New",Courier,monospace;">move</span> instruction above, which can be made optional by encoding the absence of <span style="font-family: "Courier New",Courier,monospace;">offset</span> as if <span style="font-family: "Courier New",Courier,monospace;">offset</span> is 0. Thus, the following are valid move instructions:<br />
<span style="font-family: "Courier New",Courier,monospace;"> move [r0], r8</span><br />
<span style="font-family: "Courier New",Courier,monospace;"><span style="font-family: "Courier New",Courier,monospace;"> move [r0 + r1], r8</span> </span><br />
<span style="font-family: "Courier New",Courier,monospace;"> move [r0 + 16], r8</span><br />
<br />
As you might realize, the inference engine make an assumption to be able to accept these three different version of the instruction. The assumption is that the <span style="font-family: "Courier New",Courier,monospace;">+</span> and <span style="font-family: "Courier New",Courier,monospace;">*</span> tokens in the move syntax description:<br />
<span style="font-family: "Courier New",Courier,monospace;"> move base + scale * idx + offset, src</span><br />
"belongs to" the operand that directly follows them. This means that the operand is optional the <span style="font-family: "Courier New",Courier,monospace;">+</span> and <span style="font-family: "Courier New",Courier,monospace;">*</span> tokens that precedes it are also optional. For instance, the version of move without the offset operand have the following syntax:<br />
<span style="font-family: "Courier New",Courier,monospace;"> move base + scale * idx, src</span><br />
It's a bit hard to explain in word, but pretty straight forward when you see examples like this.<br />
<br />
So far, the <i>canji</i> syntax inference engine don't handle instruction flags. It is likely that this will simply be dealt with by looking at the type of the operand -- if an operand is of type <span style="font-family: "Courier New",Courier,monospace;">u1</span> or <span style="font-family: "Courier New",Courier,monospace;">s1</span> (meaning unsigned one bit integer and signed one bit integer, respectively).<br />
<br />
A different approach to deal with the syntax inference is to remove the per instruction description of syntax and have a global syntax description instead. For instance, such global description could say which order source and destination operands should be, etc.<br />
<br />
However, many assemblers handle several syntaxes, e.g., the AT&T and Intel assembler syntaxes. How should this be dealt with by <i>canji</i>? Well, currently there is no work on dealing with this, but it is technically not hard to do. Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-64405449945962706652012-12-10T11:01:00.000-08:002012-12-10T22:51:25.797-08:00Generating tool-chainsIn my <a href="http://programmaticallyspeaking.blogspot.se/2012/11/generating-interpreters-and-jit.html" target="_blank">previous post</a> I outlined a way of generating interpreters and JIT compilers from a simple description of the state of the target (virtual) machine and the semantics of the instructions. I also linked to a prototype called <i>jiggawatt</i> where I implemented these ideas.<br />
<br />
Since then I continued to explore what can be generated from such small description of the target machine:<br />
<ul>
<li>Interpreter that doubles as a simple inference engine,</li>
<li>Optimizing JIT compiler,</li>
<li>Assembler and disassembler,titool</li>
<li>Instruction binary encoding, and</li>
<li>Documentation including textual description of each instruction.</li>
</ul>
Yes, generating textual description of each instruction. Not only that, the instructions are ordered such that similar instructions are grouped together in the documentation. In addition I've experimented with generating examples for each instruction showing how the instruction can be used, however, I didn't managed to get this entirely correct. Despite that, the end result is quite promising and I think there's potential for more (you find an example of the generated documentation <a href="https://gitorious.org/jitcc/jiggawatt/blobs/raw/0851d601170644e59f8b65d843315f358b742019/isa.pdf" target="_blank">here</a>).<br />
<br />
Unfortunately, since <i>jiggawatt </i>main purpose was to explore what could be done -- not to be a long-living project -- it has grown into a huge ball of untested code. But out of the ideas and experience a new project has been born: <a href="https://gitorious.org/canji" target="_blank"><i>canji</i></a> (yet, it is intentionally misspelled). <br />
<br />
<br />
<i>canji</i>'s ambitions is much greater than those of <i>jiggawatt</i>. With <i>canji</i> I aim to generate a whole tool-chain for a (virtual) machine: interpreter/simulator (with JIT), assembler/disassembler, a static optimizing compiler, ELF, debugger, and documentation with textual description and example code. I also aim to have a test-case generator that generates tests that verifies the interpreter and static compiler. Simpler things such as Emacs modes for the assembler is also possible and hopefully much more. <br />
<br />
The concrete goal of <i>canji </i>is to generate an entire tool-chain, but the less obvious goal is to explore what implicit information that lies hidden in a machine description. For instance, to generate an assembler with operand type-checking (e.g., to check that immediate and register isn't mixed up) in <i>jiggawatt</i> I let the assembler generator assume that if an operand is used to index an array, that operand must be prefixed by the name of that array. For example, an instruction <span style="font-family: "Courier New",Courier,monospace;">load</span> that take two operands, an immediate and a destination register, must be written as follows:<br />
<span style="font-family: "Courier New",Courier,monospace;"> load #10, r0</span><br />
assuming the array describing the register file is called <span style="font-family: "Courier New",Courier,monospace;">r</span>.<br />
<br />
Assumptions like this may or may not be a good idea -- as I said <i>canji</i> is a project that explores what can be generated from a very tiny description of a (virtual) machine.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-53292567643163037572012-11-25T15:11:00.000-08:002012-12-10T10:59:37.070-08:00Generating interpreters and JIT compilersBased on a description of each register in a (virtual) machine and each instruction, shouldn't it be possible to generate an interpreter for such machine? And if it's possible to generate a interpreter, shouldn't it be possible to generate a JIT compiler as well? Additionally, if a interpreter can be generated, shouldn't it be possible to generate a "inference engine" that infers various static properties of a program, e.g., that c = a + b is 1 if a = 1 and b = 0?<br />
<br />
Of course it is possible -- the question, however, is how such description should look. How convenient is it to write such description? Let's start with the description of the machines registers.<br />
<br />
Assuming a simple machine with non-overlapping registers (i.e., no al/ah/ax/eax/rax mess) the register description is fairly straight forward, for instance:<br />
<span style="font-family: "Courier New",Courier,monospace;"> rf: int64[64]</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> zflag: bit</span><br />
which say that there is a register file called <span style="font-family: "Courier New",Courier,monospace;">rf</span> consisting of 64 64bit registers and a flag, <span style="font-family: "Courier New",Courier,monospace;">zflag</span>, indicating whether or not the result of the last arithmetic operation was zero.<br />
<br />
Although the above is a description of the machine's register it's a state-less description, in the sense that it purely declarative description of <i>structure</i> -- there is no mutation described here. Thus, this description is denoted the <i>structural semantic</i>.<br />
<br />
On top of the structure of the machine are the instructions, which mutate the registers. These are described by the <i>operational semantics</i>. The following describes the <span style="font-family: "Courier New",Courier,monospace;">load</span> instruction, which writes an immediate value to the register file:<br />
<span style="font-family: "Courier New",Courier,monospace;"> def load(imm: int64, dst:int6):</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> rf[dst] = imm </span><br />
where <span style="font-family: "Courier New",Courier,monospace;">imm</span> and <span style="font-family: "Courier New",Courier,monospace;">dst</span> are the operands to the instruction -- an immediate and the (index of the) destination register, respectively. Furthermore, <span style="font-family: "Courier New",Courier,monospace;">rf</span> is the register file as described by the structural semantics above.<br />
<br />
To give some credibility to my first claim, that it is possible to generate an interpreter from this description, note that the semantic description formulated as above can trivially be used as a interpreter (simple text transform is enough) as the following python program indicates:<br />
<div style="font-family: "Courier New",Courier,monospace;">
rf = [0] * 64</div>
<div style="font-family: "Courier New",Courier,monospace;">
zflag = 0</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
def load(imm, dst): </div>
<div style="font-family: "Courier New",Courier,monospace;">
def execute():</div>
<div style="font-family: "Courier New",Courier,monospace;">
rf[dst] = imm</div>
<div style="font-family: "Courier New",Courier,monospace;">
return execute</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
program = [load(1000, 0), load(200, 1)]</div>
<div style="font-family: "Courier New",Courier,monospace;">
for instr in program:</div>
<div style="font-family: "Courier New",Courier,monospace;">
instr()</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
print(rf[0], rf[1])</div>
Furthermore, translating the description into a more efficient C implementation is simply a matter of more trivialities and a tea spoon of relatively straight-forward type inference. But such details are not important for the argument I'm making.<br />
<br />
Now, what about generating a JIT compiler for the <span style="font-family: "Courier New",Courier,monospace;">load</span> instruction? Let's start by looking at the end result, that is, how can such a JIT compiler be implemented. Here's a sample of an implementation of the <span style="font-family: "Courier New",Courier,monospace;">load</span> instruction implemented in C++:<br />
<div style="font-family: "Courier New",Courier,monospace;">
class load {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int64_t imm; </div>
<div style="font-family: "Courier New",Courier,monospace;">
unsigned dst;</div>
<div style="font-family: "Courier New",Courier,monospace;">
public:</div>
<div style="font-family: "Courier New",Courier,monospace;">
load(int64_t imm, unsigned dst) : imm(imm), dst(dst) { }</div>
<div style="font-family: "Courier New",Courier,monospace;">
void emit(Xbyak::CodeGenerator& cg) {</div>
<div style="font-family: "Courier New",Courier,monospace;">
gc.mov(gc.r8, imm);</div>
<div style="font-family: "Courier New",Courier,monospace;">
gc.mov(gc.r9, dst);</div>
<div style="font-family: "Courier New",Courier,monospace;">
gc.mov([rdi + offsetof(State, rf) + gc.r9 * 8],</div>
<div style="font-family: "Courier New",Courier,monospace;">
gc.r9); </div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
}; </div>
where we used the great <a href="https://github.com/herumi/xbyak" target="_blank">Xbyak</a> framework for translating x64 assembler into binary encoded instructions. This class can be used to emit the <span style="font-family: "Courier New",Courier,monospace;">load</span> instruction at runtime, that is, we have a working JIT compiler -- all generated from the <i>four line long</i> description above!<br />
<br />
I'm guessing that you by now sayt <i>show me the code</i>. Sure, <a href="https://gitorious.org/jitcc" target="_blank">here</a> is a proof-of-concept implementation. It is called <i>jiggawatt</i> and it can, as of the writing of this post, generate interpreters and jit compilers that support load instructions, add and sub instructions, non-conditional jumps, and conditional jumps. So far it does not generate a optimizing jit compiler, however that is (one of) the end goal(s).<br />
<br />
But so what? Why would you ever need a interpreter and jit compiler generator? Well, as a back-end for DSLs as well as full-fledge programming languages is one application that comes to mind. Emulators of physical processors or other kind of hardware is another. A third application is for prototyping instruction sets and virtual machines.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-7319888209013457222012-11-04T04:50:00.002-08:002012-11-04T05:16:23.261-08:00Stringify and sequence checks in Protest<a href="https://gitorious.org/protest/pages/Home" target="_blank"><i>Protest</i></a> is the C++ unit test framework that I've been working on for a month or two that I've written about here <a href="http://programmaticallyspeaking.blogspot.se/search/label/Protest" target="_blank">before</a>. <i>Protest</i> improves on other frameworks by having a single über-powerful check/assert and handles fixtures really well. But since a yesterday it has become even better, as this post will describe. Let's start with the a simple yet important feature of any testing framework -- printing objects.<br />
<br />
<h3>
Stringify </h3>
All C++ unit test frameworks I've used suffer from the same illness -- the <i>sorry-I-can't-stringify-anything</i> flu. The prime example of this is <span style="font-family: "Courier New",Courier,monospace;">std::vector</span>, which has <span style="font-family: "Courier New",Courier,monospace;">operator==</span> overloaded, but no <span style="font-family: "Courier New",Courier,monospace;">operator<<</span>. This implies that <span style="font-family: "Courier New",Courier,monospace;">std::vector</span> can't be used in as arguments to, for instance <span style="font-family: "Courier New",Courier,monospace;">CHECK_EQUAL</span> in <a href="http://unittest-cpp.sourceforge.net/" target="_blank">UnitTest++</a>, because that macro <i>requires</i> the arguments to have <span style="font-family: "Courier New",Courier,monospace;">operator<<</span> implemented.<br />
<br />
<i>Protest</i> solves this with two important features: 1) it can stringify <span style="font-family: "Courier New",Courier,monospace;">std::*</span>, and 2) a object without <span style="font-family: "Courier New",Courier,monospace;">operator<<</span> is simply not stringified. One issue remains though: what if <span style="font-family: "Courier New",Courier,monospace;">operator<<</span> is implemented but it needs to be printed differently when printed from a test? Well, of course, <i>Protest</i> to the rescue!<br />
<br />
<i>Protest</i> doesn't hijack <span style="font-family: "Courier New",Courier,monospace;">operator<<</span>, it does however use it by default to print objects. This means that a object can be printed differently from tests and in production code. This is not yet documented on the <a href="https://gitorious.org/protest/pages/Home" target="_blank">wiki</a>, but soon it will be. For the time being this example has to suffice (note however, that this code has to come before <span style="font-family: "Courier New",Courier,monospace;">#include <protest.hh></span>):<br />
<span style="font-family: "Courier New",Courier,monospace;">struct Foo { };</span><br />
<span style="font-family: "Courier New",Courier,monospace;">void stringify(std::ostream& os, Foo const&) {</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> os << "Foo";</span><br />
<span style="font-family: "Courier New",Courier,monospace;">}</span><br />
<span style="font-family: "Courier New",Courier,monospace;">#include <protest.hh></span><br />
<br />
<h3>
Sequence checks</h3>
A key feature of <i>Protest</i> is that is only has one check/assert macro, while other frameworks either have five, ten, or even twenty; or they follow the <a href="http://code.google.com/p/hamcrest/" target="_blank">Hamcrest</a> route and forces you to write assert such as<br />
<span style="font-family: "Courier New",Courier,monospace;">ASSERT_THAT(v, Each(AllOf(Gt(10), Lt(20))));</span><br />
which honestly isn't particularly easy to read, nor write. Furthermore the many-checks approach and the hamcrest-approach both fail in more complicated situations. Of course, <i>Protest</i> tries to remedy this, and the solution is <i>sequence checks</i>.<br />
<br />
Sequence checks are checks that uses one or more <i>sequence variable</i>, which is essentially equivalent to a <span style="font-family: "Courier New",Courier,monospace;">for</span> loop. The following <i>Protest</i> check is equivalent to the assert in the example above and <span style="font-family: "Courier New",Courier,monospace;">I</span> is a <i>sequence variable</i>:<br />
<span style="font-family: "Courier New",Courier,monospace;">I(0, 3); // check elements 0, 1, and 2.</span><br />
<span style="font-family: "Courier New",Courier,monospace;">check(v[I] > 5); </span><br />
<span style="font-family: "Courier New",Courier,monospace;">check(v[I] < 20);</span><span style="font-family: "Courier New",Courier,monospace;"> </span><br />
which obviously is more lines of code, but roughly the same number of characters. The neat thing with sequence checks is that it handles everything from simple array comparison to checking relationships between functions, e.g.,<br />
<div style="font-family: "Courier New",Courier,monospace;">
I(0, 20, 2);</div>
<div style="font-family: "Courier New",Courier,monospace;">
check(0 == least_significant_bit(I)); // Even numbers</div>
<div style="font-family: "Courier New",Courier,monospace;">
check(1 == least_significant_bit(I + 1)); // Odd numbers</div>
<br />
Sequence checks improve the quality of the your tests by enabling you to express invariants of the code you're testing without increasing the amount of test-code needed.<br />
<br />
<h3>
Conclusion</h3>
<i>Protest</i> ability to stringify all objects (with or without <span style="font-family: "Courier New",Courier,monospace;">operator<<</span>) avoids a annoying mis-feature of many other test frameworks. This feature does not, however, change how you write tests. As as step towards writing tests that express the logic of the production code <i>Protest</i> provides sequence checks. Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com2tag:blogger.com,1999:blog-8875558920754897232.post-68639859084278407552012-10-27T09:27:00.000-07:002012-10-27T09:32:18.721-07:00Why you must know about PrologI've head about Prolog many times and I'm sure you have too. It's a general purpose logic programming language with some very attractive features. The last few days I have experimenting with it and I've gotten a feel for the language but I'm by no means remotely close to fully understand it. I have, however, seen how useful it can be and I well here explain why Prolog is awesome. I will start by explaining Prolog as if it would be a imperative language with some weird syntax and limitations. The example I will use is a simple simulator of a simple instruction set:<br />
<ol>
<li><span style="font-family: "Courier New",Courier,monospace;">load Imm, Dst</span>-- loads immediate <span style="font-family: "Courier New",Courier,monospace;">Imm</span> to register <span style="font-family: "Courier New",Courier,monospace;">Dst</span>.</li>
<li><span style="font-family: "Courier New",Courier,monospace;">add Src, Dst</span>-- store the sum of registers <span style="font-family: "Courier New",Courier,monospace;">Src</span> and <span style="font-family: "Courier New",Courier,monospace;">Dst</span> in register <span style="font-family: "Courier New",Courier,monospace;">Dst</span>.</li>
<li><span style="font-family: "Courier New",Courier,monospace;">mul Src, Dst</span> -- store the product of register <span style="font-family: "Courier New",Courier,monospace;">Src</span> and <span style="font-family: "Courier New",Courier,monospace;">Dst</span> in register <span style="font-family: "Courier New",Courier,monospace;">Dst</span>.</li>
</ol>
As can be seen this is a very, very small instruction set. Suitably, it will execute on a very very small processor -- it's only storage is it's for general purpose registers <span style="font-family: "Courier New",Courier,monospace;">r0</span>, <span style="font-family: "Courier New",Courier,monospace;">r1</span>, <span style="font-family: "Courier New",Courier,monospace;">r2</span> and <span style="font-family: "Courier New",Courier,monospace;">r3</span>. For simplicity of implementing the simulator, each register can hold arbitrarily many bits (as many needed by the value held by the register).<br />
<br />
First thing to implement is the register file. The simplest way to implement reading from the register file is like this:<br />
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(r0, rf(R0, _, _, _), R0).</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(r1, rf( _, R1, _, _), R1).</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(r2, rf( _, _, R2, _), R2).</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(r3, rf( _, _, _, R3), R3). </div>
<br />
First off, tokens starting with upper case are variables and an underscore represents a variable that is not of interest. Lower case tokens, on the other hand, represent things that must be exactly matched.<br />
<br />
So, for instance, the first line tells the Prolog compiler how to read register r0, and that a register file consists of four values bundled together into something called an <span style="font-family: "Courier New",Courier,monospace;">rf</span>. It also tells that when reading <span style="font-family: "Courier New",Courier,monospace;">r0</span>, all values but the first one in <span style="font-family: "Courier New",Courier,monospace;">rf</span> is not of interest. Finally, the result of reading the register file is put into the last parameter, in this case <span style="font-family: "Courier New",Courier,monospace;">R0</span>. The three remaining lines can be understood in similar fashion. This looks a bit like a function declaration, but it is actually also the implementation. Now, let's continue with writing to the register file. This is done as follows:<br />
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(r0, rf( _, R1, R2, R3), V, rf( V, R1, R2, R3)).</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(r1, rf(R0, _, R2, R3), V, rf(R0, V, R2, R3)).</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(r2, rf(R0, R1, _, R3), V, rf(R0, R1, V, R3)).</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(r3, rf(R0, R1, R2, _), V, rf(R0, R1, R2, V)).</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
The first line here tells the Prolog compiler what it means to write to register <span style="font-family: "Courier New",Courier,monospace;">r0</span> of the register file <span style="font-family: "Courier New",Courier,monospace;">rf</span> which consists of four values (of which the first is not of interest). The variable <span style="font-family: "Courier New",Courier,monospace;">V</span> represents the value to be written, and it is put into the first position of the last parameter (the <span style="font-family: "Courier New",Courier,monospace;">rf( V, R1, R2, R3)</span>-part). Ok, now we continue with defining the instructions:<br />
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
instruction(load(Imm, Dst), InRf, OutRf) :-</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(Dst, InRf, Imm, OutRf).</div>
<div style="font-family: "Courier New",Courier,monospace;">
instruction(add(Src, Dst), InRf, OutRf) :-</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(Src, InRf, Value0),</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(Dst, InRf, Value1),</div>
<div style="font-family: "Courier New",Courier,monospace;">
Sum is Value0 + Value1,</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(Dst, InRf, Sum, OutRf).</div>
<div style="font-family: "Courier New",Courier,monospace;">
instruction(mul(Src, Dst), InRf, OutRf) :-</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(Src, InRf, Value0),</div>
<div style="font-family: "Courier New",Courier,monospace;">
read_rf(Dst, InRf, Value1),</div>
<div style="font-family: "Courier New",Courier,monospace;">
Prod is Value0 * Value1,</div>
<div style="font-family: "Courier New",Courier,monospace;">
write_rf(Dst, InRf, Prod, OutRf).</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
This tells the compiler that the definition of <span style="font-family: "Courier New",Courier,monospace;">load(Imm, Dst)</span> is to write <span style="font-family: "Courier New",Courier,monospace;">Imm</span> to the register <span style="font-family: "Courier New",Courier,monospace;">Dst</span> in the register file. Furthermore, the definition of <span style="font-family: "Courier New",Courier,monospace;">add(Src, Dst)</span> is to read registers <span style="font-family: "Courier New",Courier,monospace;">Src</span> and <span style="font-family: "Courier New",Courier,monospace;">Dst</span> and write the sum to register <span style="font-family: "Courier New",Courier,monospace;">Dst</span>. The definition of <span style="font-family: "Courier New",Courier,monospace;">mul</span> is analog to <span style="font-family: "Courier New",Courier,monospace;">add</span>.<br />
<br />
Ok, now let's try to run this to get feeling of what Prolog can do. The following is the output from the interactive prompt provided by SWI Prolog.<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
?- instruction(load(10, r0), rf(1, 1, 1, 1), OutRf).<br />
OutRf = rf(10, 1, 1, 1).<br />
<br />
?- instruction(add(r1, r0), rf(2, 2, 2, 2), OutRf).<br />
OutRf = rf(4, 2, 2, 2).</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br />
?- instruction(mul(r1, r0), rf(3, 3, 3, 3), OutRf).<br />
OutRf = rf(9, 3, 3, 3).</div>
<br />
Ok, that's seems reasonable, right? Prolog tells us that loading <span style="font-family: "Courier New",Courier,monospace;">10</span> to register <span style="font-family: "Courier New",Courier,monospace;">r0</span> of a register file consiting of <span style="font-family: "Courier New",Courier,monospace;">1, 1, 1, 1</span> results in a register file consisting of <span style="font-family: "Courier New",Courier,monospace;">10, 1, 1, 1</span>. It tells us similar thing about the <span style="font-family: "Courier New",Courier,monospace;">add</span> and <span style="font-family: "Courier New",Courier,monospace;">mul</span> instruction. But nothing of this is particularly unique to Prolog, is it? We could have done this in any other language. But let's continue. Now we'll do a bit more symbolic things with Prolog:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
?- instruction(load(Imm, r0), rf(0, 0, 0, 0), OutRf).<br />
OutRf = rf(Imm, 0, 0, 0).</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
?- instruction(load(10, Dst), rf(0, 0, 0, 0), OutRf).<br />
Dst = r0,<br />
OutRf = rf(10, 0, 0, 0) ;<br />
Dst = r1,<br />
OutRf = rf(0, 10, 0, 0) ;<br />
Dst = r2,<br />
OutRf = rf(0, 0, 10, 0) ;<br />
Dst = r3,<br />
OutRf = rf(0, 0, 0, 10).</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
Now it starts to get fancy, isn't it? The first example shows that Prolog can load a symbol to register <span style="font-family: "Courier New",Courier,monospace;">r0</span>. The second example show that Prolog also understand what it means to load 10 to the symbolic register <span style="font-family: "Courier New",Courier,monospace;">Dst</span>; it either means loading to <span style="font-family: "Courier New",Courier,monospace;">r0</span>, <span style="font-family: "Courier New",Courier,monospace;">r1</span>, <span style="font-family: "Courier New",Courier,monospace;">r2</span>, or <span style="font-family: "Courier New",Courier,monospace;">r3</span> and it also tells us what the resulting register file is in each case. We now continue to show Prolog's symbolic powers even more:<br />
<br />
<div style="font-family: "Courier New",Courier,monospace;">
?- instruction(load(Imm, r0), rf(0, 1, 2, 3), rf(10, 1, 2, 3)).<br />
Imm = 10. </div>
<br />
Now this is cool. Given a input <i>and</i> and output register file (here, <span style="font-family: "Courier New",Courier,monospace;">rf(0, 1, 2, 3)</span> and <span style="font-family: "Courier New",Courier,monospace;">rf(10, 1, 2, 3)</span>) Prolog can figures out the value of <span style="font-family: "Courier New",Courier,monospace;">Imm</span> required in the <span style="font-family: "Courier New",Courier,monospace;">load</span> instruction. Let's see what more it can do:<br />
<br />
<span style="font-family: "Courier New",Courier,monospace;">?- instruction(Instr, rf(0, 1, 2, 3), rf(3, 1, 2, 3)).</span><br />
<span style="font-family: "Courier New",Courier,monospace;">Instr = load(3, r0) ;</span><br />
<span style="font-family: "Courier New",Courier,monospace;">Instr = add(r3, r0) ;</span><br />
<span style="font-family: "Courier New",Courier,monospace;">false.</span><br />
<br />
Awesome right? Prolog actually figures out what instructions that given <span style="font-family: "Courier New",Courier,monospace;">rf(0, 1, 2, 3)</span> results in <span style="font-family: "Courier New",Courier,monospace;">rf(3, 1, 2, 3)</span>. Try to do that in a normal imperative language... oh, right, we can't do that. And this brings me to (one of) the point(s) of Prolog: given a solution from going from A to B it also (in some cases, like here) gives you a solution for going from B to A. For example, if we wrote an assembler for the above instruction set in Prolog we would automatically get the disassembler.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-58838213140598380382012-10-08T13:04:00.000-07:002012-10-08T13:04:20.800-07:00An update on Protest (the testing framework that doesn't suck)<i>Protest</i> (see <a href="https://gitorious.org/protest/pages/Home" target="_blank">wiki</a> for more information) is a unit test framework for C++ that is like most other test frameworks, except that it does checks in a innovative way and handles test fixtures really well. I first wrote about <i>Protest</i> <a href="http://programmaticallyspeaking.blogspot.se/2012/09/protest-unit-testing-in-c-made-slick.html" target="_blank">here</a>, and since then I've done some more work. Well, actually I rewrote the whole thing.<br />
<br />
Why rewriting? Well, the initial solution was a proof-of-concept and not worth spending any effort making production worthy. The rewrite is cleaner, but not as clean as I want it.<br />
<br />
Anyway, the version that's in the repository has a proper main function, a memory leak detector, and handles of segmentation faults and similar error. It also has preliminary support for JUnit XML output. None of these features distinguish <i>Protest</i> from the rest of the testing framework pack. The distinguishing feature of <i>Protest</i> is how fixtures and assertions are handled. Here's an example:<br />
<div style="font-family: "Courier New",Courier,monospace;">
suite("my suite") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int one = 1; // This is the fixture!</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("1 should equal 1") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
check(one == 1);</div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
test("1 should be less than 2") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
check(one < 2) << "What? your compiler is broken"; </div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
Note that there is only one assert macro, <span style="font-family: "Courier New",Courier,monospace;">check</span>, which handles all kinds of comparisons. If the check fails, the expression is split into left-hand side and right-hand side before printed.<br />
Also, note how the fixture is setup just as local variables -- this is because fixtures in <i>Protest</i> is local variables. This is much more convenient than class-based fixtures that all (to my knowledge) other test-framework uses<br />
<br />
Actually, <i>Protest</i> support class-based fixtures as well. This is done as follows:<br />
<div style="font-family: "Courier New",Courier,monospace;">
struct Fixture { int one() { return 1; } };</div>
<div style="font-family: "Courier New",Courier,monospace;">
suite("my suite", Fixture) {</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("1 == 1") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
check(one() == 1); </div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<span style="font-family: "Courier New",Courier,monospace;">}</span><br />
This is where I'm not yet fully happy with <i>Protest</i> -- I'd like to make it possible for the test-case to provide the fixture with arguments. Something along the lines:<br />
<div style="font-family: "Courier New",Courier,monospace;">
struct Fixture {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int m_value; </div>
<div style="font-family: "Courier New",Courier,monospace;">
Fixture(int value) : m_value(value) { }</div>
<div style="font-family: "Courier New",Courier,monospace;">
int value() { return m_value; }</div>
<div style="font-family: "Courier New",Courier,monospace;">
};</div>
<div style="font-family: "Courier New",Courier,monospace;">
suite("my suite", Fixture) {</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("1 == 1", (1)) { check(value() == 1); }</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("2 <= 4", (4)) { check(2 <= value()); }</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
That is, the <span style="font-family: "Courier New",Courier,monospace;">test</span> macro takes optional arguments that is used for initializing the fixture. I'm not fully sure this is possible right now, but I'll give it a go soon.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0tag:blogger.com,1999:blog-8875558920754897232.post-51639995646647840692012-09-23T14:18:00.001-07:002012-09-23T14:22:20.018-07:00Protest -- unit testing in C++ made slickI've tried so many unit testing framework in C++, yet nothing really impressed me. Most are bad clones of JUnit, others are just <a href="http://tut-framework.sourceforge.net/" target="_blank">silly</a> (macros are not always bad). A few get close to what I'd like to have, but all frameworks really fall short on how fixtures are handled. <br />
<br />
So, this weekend I decided to see if I could come up with something better. <a href="https://gitorious.org/protest" target="_blank">Here's</a> what I got so far. I call it <i>Protest</i>, and it's a unit testing framework that a simple, powerful and slick. Here's an example:<br />
<div style="font-family: "Courier New",Courier,monospace;">
#include <protest.hh> </div>
<div style="font-family: "Courier New",Courier,monospace;">
suite("my first suite of tests.") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("my first test") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int i = 2; </div>
<div style="font-family: "Courier New",Courier,monospace;">
expect(1 != i - 1) << "Intentionally wrong";</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
which will print the following when run:<br />
<div style="font-family: "Courier New",Courier,monospace;">
example.cc:4: expectation '1 != i - 1' (1 != 1) failed [my first suite][my first test][Intentionally wrong].</div>
<br />
Fixtures are handled differently in Protest than in most other framework. There is no need to create a separate class or declare any extra stuff:<br />
<div style="font-family: "Courier New",Courier,monospace;">
#include <protest.hh> </div>
<div style="font-family: "Courier New",Courier,monospace;">
suite("tests with fixture 1") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int i = 2; // Can be used in all tests. </div>
<div style="font-family: "Courier New",Courier,monospace;">
test("test 1") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
expect(i != 1);</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("test 2") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
expect(i != 3);</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
// If needed, any tear-down code goes here.</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<br />
However, sometimes there is a need for a more traditional approach to fixture (that is, inheriting from a base class). This is also supported in Protest: <br />
<div style="font-family: "Courier New",Courier,monospace;">
#include <protest.hh> </div>
<div style="font-family: "Courier New",Courier,monospace;">
struct Fixture {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int two() { return 2; } </div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
suite("tests with fixture 2") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int i = two();</div>
<div style="font-family: "Courier New",Courier,monospace;">
test("test 1") {</div>
<div style="font-family: "Courier New",Courier,monospace;">
expect(two() == i);</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<br />
In addition, Protest supports ignoring test-cases, expected failures, and logging parts of expressions (not yet implemented: will only be logged if test-case fails). It also handles when test-cases crashed (<span style="font-family: "Courier New",Courier,monospace;">SIGSEGV</span>) and reports the last known line that was executed (usually the last <span style="font-family: "Courier New",Courier,monospace;">expect</span> that was executed).<br />
<br />
Note that I've used lower-case for <span style="font-family: "Courier New",Courier,monospace;">suite</span>, <span style="font-family: "Courier New",Courier,monospace;">test</span>, etc, which are macros. These are just development names, and I intend that to be configurable.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com3tag:blogger.com,1999:blog-8875558920754897232.post-10307482560884612922012-09-19T12:03:00.000-07:002012-09-19T12:03:16.773-07:00Infer return type for templated function in C++When I recently played around with ways of implementing generator in a test framework in C++, I had the need to have the compiler infer the return type of a template function. Something long the lines:<br />
<span style="font-family: "Courier New",Courier,monospace;">
template<typename T></span><br />
<span style="font-family: "Courier New",Courier,monospace;">T generate() {</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> // generate a value of type T.</span><br />
<span style="font-family: "Courier New",Courier,monospace;">}</span><br />
<span style="font-family: "Courier New",Courier,monospace;">void test() {</span><br />
<span style="font-family: "Courier New",Courier,monospace;"> int integer = generate(); </span><br />
<span style="font-family: "Courier New",Courier,monospace;"> std:::string str = generate();</span><br />
<span style="font-family: "Courier New",Courier,monospace;">} </span><br />
Unfortunately, this code does not compile as the compiler cannot infer the return type required for the call to <span style="font-family: "Courier New",Courier,monospace;">generate</span>. There is, however, a way around this -- by using a casting operator.<br />
<br />
Casting operator are functions that are used when the compiler tries to convert one type to another. An example of this is follows here:<br />
<div style="font-family: "Courier New",Courier,monospace;">
struct Object {</div>
<div style="font-family: "Courier New",Courier,monospace;">
operator int() { return 0; } </div>
<div style="font-family: "Courier New",Courier,monospace;">
};</div>
<div style="font-family: "Courier New",Courier,monospace;">
int foo() {</div>
<div style="font-family: "Courier New",Courier,monospace;">
Object obj;</div>
<div style="font-family: "Courier New",Courier,monospace;">
return obj; // 'obj' is implicitly converted to an 'int'.</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
Here, <span style="font-family: "Courier New",Courier,monospace;">foo</span> will return <span style="font-family: "Courier New",Courier,monospace;">0</span> because that's what <span style="font-family: "Courier New",Courier,monospace;">Object</span>'s casting operator returns.<br />
<br />
So, without further ado, here's an example how to emulate return type inference on a template function: <br />
<div style="font-family: "Courier New",Courier,monospace;">
template<typename T></div>
<div style="font-family: "Courier New",Courier,monospace;">
struct ReturnValue { };</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
// Specialization for type int. Implements 'T generate()'</div>
<div style="font-family: "Courier New",Courier,monospace;">
// for T == int.</div>
<div style="font-family: "Courier New",Courier,monospace;">
template<></div>
<div style="font-family: "Courier New",Courier,monospace;">
struct ReturnValue<int> {</div>
<div style="font-family: "Courier New",Courier,monospace;">
static int value() { return 17; } </div>
<div style="font-family: "Courier New",Courier,monospace;">
};</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
// Specialization for type std::string. Implements</div>
<div style="font-family: "Courier New",Courier,monospace;">
// 'T generate()' for T == std::string.<br />
</div>
<div style="font-family: "Courier New",Courier,monospace;">
template<></div>
<div style="font-family: "Courier New",Courier,monospace;">
struct ReturnValue<std::string> {</div>
<div style="font-family: "Courier New",Courier,monospace;">
static std::string value() { return "foo"; } <br />
</div>
<div style="font-family: "Courier New",Courier,monospace;">
};<br />
</div>
<div style="font-family: "Courier New",Courier,monospace;">
</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
struct ReturnType {</div>
<div style="font-family: "Courier New",Courier,monospace;">
template<typename T></div>
<div style="font-family: "Courier New",Courier,monospace;">
operator T() {</div>
<div style="font-family: "Courier New",Courier,monospace;">
return ReturnValue<T>::value(); </div>
<div style="font-family: "Courier New",Courier,monospace;">
} </div>
<div style="font-family: "Courier New",Courier,monospace;">
};</div>
<div style="font-family: "Courier New",Courier,monospace;">
ReturnType generate() {</div>
<div style="font-family: "Courier New",Courier,monospace;">
return ReturnType(); </div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
<div style="font-family: "Courier New",Courier,monospace;">
<br /></div>
<div style="font-family: "Courier New",Courier,monospace;">
void test() {</div>
<div style="font-family: "Courier New",Courier,monospace;">
int integer = generate(); // = 17</div>
<div style="font-family: "Courier New",Courier,monospace;">
std::string str = generate(); // = "foo"</div>
<div style="font-family: "Courier New",Courier,monospace;">
}</div>
It's a bit of extra complexity but it works nicely and it makes it possible to separate the implementations of <span style="font-family: "Courier New",Courier,monospace;">T generate()</span> for different values of <span style="font-family: "Courier New",Courier,monospace;">T</span>, which is pretty neat.<br />
<br />
I wouldn't call this pattern useful in normal application code, but might be useful for APIs or parts of DSLs. Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com3tag:blogger.com,1999:blog-8875558920754897232.post-68406757175536499872012-06-27T23:40:00.000-07:002012-06-28T06:02:32.725-07:00Computer science is not physicsI've read Existential Type for a while and gotten to the post <a href="http://existentialtype.wordpress.com/2011/03/16/languages-and-machines/" target="_blank">Languages and Machines</a>, which discusses models of computation. It all makes a lot of sense: there are different ways of modelling computation and they are good for different things, the only catch is that all but one models a machine. A machine with computational units, storage, and what not.<br />
<br />
A programming language is not used for manipulating registers and mutating memory cells, it's used for expressing thoughts and ideas. Thoughts and ideas does not live in the physical world (a least that's what I've <a href="http://en.wikipedia.org/wiki/Theory_of_Forms" target="_blank">heard</a>) so why should we rely on a language (read: C/C++/Java/etc) that inherently is bound to a (more or less) physical machine?<br />
<br />
No one questions that the field of maths is unrelated to the physical world we live in, right? Maths would exists with or without humans discovering mathematical truths and proofs. That's because maths uses some axiomatic system (that just happens to be very useful in the real world). However, I'd hope that no one in their right mind would argue that maths is about the physical realisations of the axioms, e.g., that 1 apple + 2 apples is 3 apples. Maths is not about apples -- apples just happen to fit in the axiomatic system. <br />
<br />
Dijkstra famously said:<br />
<blockquote class="tr_bq">
Computer science is no more about computers than astronomy is about telescopes.</blockquote>
I guess an equivalent statement about maths would be<i> maths is no more about numbers than astronomy is about telescopes</i><i>... </i>Let me now rephrase the previous paragraph to emphasis my point.<br />
<br />
No one questions that the field of computer science is unrelated to the physical world we live in, right? Computer science would exists with or without humans discovering computational truths and proofs. That's because computer science uses some computational model (that just happens
to be very useful in the real world). However, I'd hope that no one in
their right mind would argue that computer science is about the physical realisations of the models, e.g., the NAND gate. Computer science is not about NAND gates -- NAND gates just happen to fit in the computational model.<br />
<br />
So why not call it <i>computation science</i>, or <i>automated maths</i>? No one would question the above paragraph if I'd written <i>automated maths</i> instead of <i>computer science</i>.Torgnyhttp://www.blogger.com/profile/12146418455302215748noreply@blogger.com0