# Two Guys Arguing

## subtraction

Posted in bf by youngnh on 04.18.09

in bf, subtracting one element’s value from another is dead simple. to subtract the value of y in cell 1 from the value of x in cell 0, starting from cell 0:

```>[<->-]
```

this type of subtraction is useful for dealing with numerical values, but doesn’t help much when we need to compare numbers. I’d like to be able to tell if x is greater than y. Subtraction seems to be the natural operation for this. Just subtract y from x and loop if x is positive. This, however, only works when we can be assured that y is less than or equal to x, in which case, our comparison is probably moot. Negative numbers, of course are considered boolean true, so what we really want is a separate subtraction operation that won’t make x any less than 0.

this isn’t too hard to write in C:

```while(x && y) {
x--; y--;
}
```

there is no && statement in bf, but a pair of nested whiles would do as well:

```while(x) {
while(y) {
x--; y--;
}
}
```

which very easily translates into bf:

```[>[-<-]]
```

There is a problem with this loop, though. It doesn’t have a consistent exit point. If x is greater than y, the loop will exit with the pointer at y’s location. If y is greater than x, then the loop will exit with the pointer at x’s location. To remedy this, we need to establish a fixed point to “rewind” to.

If we establish a 0 in cell 0 as our “anchor”, then put x in cell 1 and y in cell 2, after we finish subtracting y from x, we can “rewind” back to our anchor cell for a consistent starting and end point.

```>[>[-<-]]<[<]>
```

## maven – give me my files!

Posted in Uncategorized by benjaminplee on 04.16.09

Every time I work with Maven I like a ton of things about it and get really annoyed by a few.  Today I was faced with an issue for the second time: I have a Maven project that produces a .jar as its artifact and I want to have a build goal produce a directory with that .jar AND copies of the .jar’s that it depends on.  Without this, my only means of distributing my code is if the recipient uses Maven also.  I partially solved this a few months ago and today found a more complete solution.  Its not complete (the .jars end up in a horribly ugly directory under the target folder) but it works … and that’s what matters.

The solutions uses the Maven Assembly plugin.  First, I needed a custom assembly configuration which I put in a file called assembly.xml in the root of my project:

```<?xml version="1.0" encoding="UTF-8"?>
<assembly>
<id>bin</id>
<formats>
<format>dir</format>
</formats>
<dependencySets>
<dependencySet>
<unpack>false</unpack>
</dependencySet>
</dependencySets>
</assembly>
```

This file describes how the plugin should assembly my project; moving all of my dependencies, unpacked, into a single folder.  There is a huge assortment of options available ranging from include/exclude parameters to how/where to move files.  I am still slowly working my way through the reference doc.

Second, I modified my pom.xml to configure the Assembly plugin:

```<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<descriptors>
<descriptor>assembly.xml</descriptor>
</descriptors>
</configuration>
</plugin>
```

Finally, a call to mvn assembly:directory will compile your project, run the tests, package your project and produce the desired folder. Done.

## bookkeeping

Posted in bf by youngnh on 04.16.09

bf has a tendency to push the maximum limit of Things I Can Hold In My Head At One Time.

Keeping track of where a variable\’s value is located takes up the bulk of a bf programmer\’s mental bookkeeping. Heck, a quarter of the language\’s syntax ( < and >) is devoted to updating what the current pointer is looking at. bf doesn\’t have any syntactic notion of variables, but it is impossible for me not to assign a cell\’s location some \”reason\” for existing in my program. Variables, in my head, are usually named along the lines of that reason.

Another note, cells in bf hold only numbers, much like every computer; it is simply how you choose to interpret those numbers that makes reasoning about computation simpler or harder. For the most part, I\’ve interpreted bf numbers as one of only three distinct types:

integers, which need no conversion and are usually 8 bits in size and signed
characters, in which the number actually corresponds to an ASCII value
booleans, where the number 0 represents false and any other integer, true

Lets look at some code. A succinct bit of bf is doubling a number. In bf, it might look like this:

```++++++  initialize cell 0 to 6
[       while cell 0 is not 0
>++     move to cell 1 and add 2
<-]     move back to cell 0 and decrement
```

Notice that since bf code only consists of 8 distinct characters, any others including whitespace are considered comments and thrown away, so you can mix plain text comments anywhere in a bf program that you\'d like.

Here cell 0 acts both as the value we are doubling and the index counter for the loop. We decrement it once each time through the loop until it is zero. The doubled result now resides in cell 1, NOT cell 0 where it began life. This is a universally common pattern in bf. Data is its own counter. Unlike the way that most programmers think about memory references — esp. array elements — data is always moving around in bf.

There are two ways to deal with this, accept that data moves around and update where your program will thenceforth look for that variable\'s value, or copy the doubled value back to its original cell. The second approach takes another seven characters of bf, minimum, to achieve, while the first approach takes no extra characters, but diligent and correct bookkeeping on the part of the programmer.

Or, if you are a programmer, then, inevitably, the program you\'ve created to do this for you.

## my word shift solution

Posted in challenge by youngnh on 04.14.09

I’m a bit obsessed with the unix programs `tr`, `uniq`, `sort` and `paste` after reading Unix for Poets in college, and Ben’s problem falls very nicely into the category of problems that these programs solve effortlessly.

The basic algorithm I decided to implement is plain-english simple: take a dictionary of words, transform all of them by shifting their characters and then removing any words that are no longer in that same dictionary.

Step 1: Move characters to the right. The unix program `tr` does nothing but this:

`tr 'qwertyuiop' 'wertyuiop['`

will translate just the first row of characters

`tr 'qwertyuiopasdfghjklzxcvbnm1234567890' 'wertyuiop[sdfghjkl;xcvbnm,234567890-'`

Each letter in `tr`‘s first argument gets replaced with the corresponding letter in the second argument.  The above command will shift all three rows of letters and the top row of number keys.  I’m going to write a truncated version for the rest of the post to save me some typing.

Step 2: Remove any words that aren’t in the dictionary.  For this step, I used cat to append the list of actual words to our list of translated words, like so:

`cat dict | tr 'qwerty' 'wertyu' | cat - dict`

Once you have a big long list of words and non-words, you want to sort them all which has the effect of putting any actual words created by shifting right next to a real word from the dictionary file.  The program `uniq` has a flag to print only duplicates that appear in its input.

`cat dict | tr 'qwerty' 'wertyu' | cat - dict | sort | uniq -d`

That’s it, go make yourself a drink.

Well….its kinda hard to look at that list and be confident that our little unix one-liner did its job.  (bb? is that even a word?) It would help a great deal if we could see what original hand positions were translated to form these new words. In other words, what words did we think we were typing?  To do this, I’d like to take our shifted words file, shift them back and then display both lists side by side.

To redirect output to a file, we use >.  To shift back we’ll use `tr` again and finally for the side by side display we’ll use paste.  You can swap the arguments to `tr` and that will take care of the leftward shift.
Our full program then becomes:

```cat dict | tr 'qwerty' 'wertyu' | cat - dict | sort | uniq -d > shifted.txt cat shifted.txt | tr 'wertyu' 'qwerty' | paste - shifted.txt```

Simple. On my modest machine, this runs over a 120K dictionary just under 3 seconds. Some of my favorites:

sweet -> derry
waxier -> escort
awed -> serf
dye -> fur
gust -> hidy
rust -> tidy
tyer -> yurt

Tagged with:

## pathological programming

Posted in bf by youngnh on 04.10.09

My post count is slipping.  I’m too worried about writing long, polished entries with a point to them, when what I’m generally working on is short, scatterbrained and only interesting to me for short periods of time.  So, who wants to ride bikes?

The shiny object of my current attention is the brainfuck language.  That’s actually its name, but I’ll be referring to it in polite company as bf.  A few weeks ago, while lamenting to myself the fact that I haven’t entered a horse in the Lambda Lounge Shootout race, I decided that its far too easy to write readable code in a well-established, if not widely used language.  Nevermind that this was the entire spirit of the shootout — to expose developers to how real software is written in other languages — I decided that I was going to attempt to write it in a pathological language.

How pathological?  bf is based around a 30,000 element array and 8 operators.

+ – > < . , [ ]

+ increments the value of the current array element
– decrements the value of the current arrayelement
> increments the current array index pointer
< decrements the current array index pointer
. prints out the value of the current array element
, stores one character of input in the current array element
[ marks the start of a loop
] marks the end of a loop

Loops can be arbitrarily nested and start execution of their bodies if the current array element is zero. When the ending ] is reached, the execution will loop only if the current array element is non-zero.

That’s it.  No strings, no if statements, no methods, no threads, no concurrency, no objects.  It has zero of the features that language presentations at Lambda Lounge have focused on. The language is Turing complete though, so theoretically, any computable function can be expressed in bf.  I intend to put that assumption to the test this coming month.

## a thankful Level-X programmer

Posted in Uncategorized by benjaminplee on 04.08.09

After tonight’s Google Reader catch-up session, I happened across Jeff Atwood’s post: The Eight Levels of Programmers.  The article reminded me of a joke a friend of mine and I came up with a while back where we debated if all software developers could be given a numbered rank … where the rank corresponded to the number of future software versions required to replace them.  While we were really joking about some especially horrible code we had just found … the sad truth is lots of developers would have some low numbers (assuming companies were willing to pay for the upgraded software).

The article, my current project, the bad economy, and a few recent talks with my wife have got me thinking about how lucky I am.  I get paid to do something I would gladly do for free while working with some amazing people.  My wife, bless her heart, has a hard time understanding how I can work in front of a computer all day and still want to hack away at some personal project that night.  I am just lucky I guess.

Despite the luck, it’s a lot of hard work.  It’s amazing how fast things change and how no matter how many blogs/books/feeds you read, personal projects you work on, or how much experience you have … a year from now it might all be irrelevant. Add on top of that the difficulty of trying to make a name for yourself and honing your personal/communication/team/etc skills and your left with a daunting task.

I am not sure what the future holds, or what level of programmer (on either scale) I am today … but I DO know I want it higher tomorrow and am wiling to put in the time to make it happen.

… at least until the software becomes self aware and take over the world.  Then it’s time to see if hard coding all of those DO_NOT_KILL_LIST.add(“Benjamin P Lee”); really work.

** This is my first post from Windows Live Writer which I am trying upon @brianbuttonxp ‘s suggestion.  So far, so good.

## Wrong Word, Right Key Strokes Challenge

Posted in challenge by benjaminplee on 04.07.09

Back in the metaphorical “day” I remember an incident where I tried to quickly type a message to someone and only after hitting Enter, noticed that in my haste my fingers were not lined up in their normal position.  The strange thing was that despite my fingers being off my a key (I can’t remember which direction), I had succeeded in typing a real word.  Unfortunately I can’t remember what the word was (intended or accidental).

Therefore, I put forth this challenge write a script which based on a English word list and the standard QWERTY key layout, find all of the real words that can be accidentally typed if another real word is typed with each keystroke offset by one key.  (e.g. if the word SAD was typed with your hands shifter right one key, it would come out DSF … which isn’t a word but you get the idea).  Key offset should be only one key distance, but I will leave it up to the implementer to decide what to do about moving up and down.

I fully intend to write this up at some point, but just haven’t had the time yet.  If anyone wants to take a stab at it, please let me know.

FYI … the National Puzzlers’ League has some nice word lists under “Solving Tools of the NPL”.

## HTML Entities

Posted in javascript by youngnh on 04.01.09

I found John Resig’s excellent env.js a few weeks ago, and immediately decided that I could do a better job, and so tried writing my own version.  The promise of the tool for me, is that with it I can use very natural javascript dom methods to scrape web pages instead of complicated and fragile regular expressions.

Part of my decision to write my own version stemmed from the fact that when I first tried out env.js, now being maintained by thatcher on github, it choked all over the very first web page I gave it.  Its a wonder I noticed the errors it spit out at all, though, as env.js logs an awful lot of useless information to the console.  Maybe its a unix aesthetic, but I feel solid working programs are quiet programs.

Those two minor quibbles aside, though, there are a lot of neat ideas going on in env.js.  Besides, with git, I could very easily fork thatcher’s work and go whichever way I wanted to.  Long story short, I sat back down with env.js last night and fed it some web pages.  It did choke all over them, but this time, I stuck around and figured out what the problem was.

env.js uses an internal SAX parser written byDavid Joham and Scott Severtson, that looks for any ampersands in its input and tries to interpret them as HTML escape sequences.  That in itself isn’t all that illogical.  If you type &amp; into a web page, env.js should be able to handle it properly.  However, it was doing so inside of <script> tags and dying when it encountered the javascript && operator.  Furthermore, the unescaping logic searched from the first ampersand to the first semicolon after that ampersand to decide what escape sequence it was looking at.  Bad news when it was looking at a language descended from C, which uses operators that start with & and statements the end with ;

I put in a ticket at env.js’s lighthouse site, and am currently working on a patch.

Tagged with: , ,