Two Guys Arguing

mario math – part 3

Posted in Uncategorized by youngnh on 08.17.09

There is a counter that only allows Mario to accelerate vertically for 8 consecutive frames. Release KEY_JUMP at any time before frame 8 and gravity takes over.

When you are on the ground and press KEY_JUMP, the code calculates an initial velocity for Mario, changes his position accordingly and then calculates a decreased velocity owing to friction and gravity before that frame’s tick() ends. If you are pressing KEY_JUMP the next frame, this slower velocity is thrown away and another new initial velocity, smaller than the one before it, but still upwards, is added to Mario’s position and again a final, slower velocity is calculated before the frame’s end. This repeats until frame 8 when your jump runs out and gravity is the only y acceleration acting on Mario.

Here’s a quick table that displays velocities and heights, as it’s not really worth the rigmarole involved in deriving complex equations for so discrete a frame set.

Frame Initial Velocity Height Final Velocity
1 7 * 1.9 13.3 8.305
2 7 * 1.9 26.6 8.305
3 6* 1.9 38 6.69
4 5 * 1.9 47.5 5.075
5 4 * 1.9 55.1 3.46
6 3 * 1.9 60.8 1.845
7 2 * 1.9 64.6 0.23
8 1 * 1.9 66.5 -1.385

Here I’ve written the velocities with a positive sign, since upwards is considered positive on most x/y axes. The code that handles this actually uses negative numbers as it’s axis is upside down, but this shouldn’t affect relative distances and positive numbers look better on blogs so I’m going to use them.

Also, the velocity for the first two frames is not a typo, it really is set to the same value both frames, and as we’ll see after studying gravity, hitting jump for a single frame gives Mario just enough juice to clear hills that are a single block high.

mario math – part 2

Posted in java, mario, math by youngnh on 08.14.09

To calculate Mario’s position with KEY_RIGHT pressed down continuously for n frames:

let a = 0.6

at frame 1:
\begin{align} & v_1 = a \\ & p_1 = a \end{align}

at frame 2:
\begin{align} & v_2 = a(.89) + a \\ & p_2 = a + [a(.89) + a] \end{align}

at frame 3:
\begin{align} & v_3 = (a(.89) + a).89 + a \\ & p_3 = a + [a(.89) + a] + [(a(.89) + a).89 + a] \end{align}

at frame 4:
\begin{align} & v_4 = ((a(.89) + a).89 + a).89 + a \\ & p_4 = a + [a(.89) + a] + [(a(.89) + a).89 + a] + [((a(.89) + a).89 + a).89 + a] \end{align}

If you multiply everything through, a pattern starts to emerge:
\begin{align} p_4 & = a + a(.89) + a + a(.89)^2 + a(.89) + a + a(.89)^3 + a(.89)^2 + a(.89) + a \\ & = 4a + 3a(.89) + 2a(.89)^2 + a(.89)^3 \end{align}

So we can say that in general, Mario’s position after n frames will be:
p_n = \sum_{i=0}^n(n-i)a(.89)^i

But we’d like a formula we can evaluate in O(1), so massaging the equation a bit we get:
\begin{align} p_n & = a * \sum_{i=0}^n(n-i)(.89)^i \\ & = a * \sum_{i=0}^n n(.89)^i - i(.89)^i \\  & = a \left ( \sum_{i=0}^n n(.89)^i - \sum_{i=0}^n i(.89)^i \right ) \\ & = a \left ( n * \sum_{i=0}^n n(.89)^i - \sum_{i=0}^n i(.89)^i \right ) \end{align}

Both of sums in the above equation, we have identities for:
\sum_{i=0}^n x^i = \frac{1-x^{n+1}}{1-x}       and       \sum_{i=0}^n i x^i= \frac{x}{(1-x^2)}(x^n(n(x-1)-1)+1)

Of which the second one is kind of convoluted, but allows us to come up with:
\begin{align} p_n & = a \left ( n * \frac{1-(.89)^{n+1}}{1-(.89)} - \frac{.89}{(1-(.89)^2)}((.89)^n(n(.89-1)-1)+1) \right ) \\ & = a \left ( n * \frac{1-(.89)^{n+1}}{.11} - \frac{(.89)^{n+1}(-.11n-1)+.89}{.2079} \right ) \end{align}

Which can then write in Java as:

public float positionAfter(float accel, int nFrames) {
    float term1 = n * (1 - Math.pow(.89, nFrames + 1)) / .11;
    float term2 = (Math.pow(.89, nFrames + 1) * (-.11 - 1) + .89) / .2079;
    return a * (term1 - term2);
}

The (.89) term is friction (I think?) and on the ground the code uses that value. In the air, the dynamics and equations are the same, but use a different coefficient of friction: (.85). The initial acceleration of a jump is vastly different, though, so I’ll tackle that in the next Mario Math installment.

Tagged with: , , ,

mario math

Posted in java, mario, math by youngnh on 08.13.09

To calculate Mario’s velocity, with KEY_RIGHT pressed down continuously for n frames:

let a = 0.6 (1.2 if KEY_SPEED is held down as well)

at frame one, then, your velocity v_1 = a

frame 2: v_2 = a(.89) + a
frame 3: v_3 = (a(.89) + a).89 + a
frame 4: v_4 = ((a(.89) + a).89 + a).89 + a

which, if you multiply through, looks like this:

v_4 = a + a(.89) + a(.89)^2 + a(.89)^3

Which makes it a bit easier to generalize Mario’s velocity at any frame n:

v_n = \sum_{i=0}^{n-1}a(.89)^i

v_n = a * \sum_{i=0}^{n-1}(.89)^i

we know from paying attention in high school wikipedia that:

\sum_{i=0}^n x^i = \frac{1-x^{n+1}}{1-x}

however, our sequence only goes to n-1 frames. It’s missing the last term:

\left ( \sum_{i=0}^{n-1}x^i \right ) + x^n = \sum_{i=0}^n x^i = \frac{1-x^{n+1}}{1-x}

subtracting xn from both sides

\frac{1 - x^{n+1}}{1-x} - x^n = \frac{1-x^{n+1}}{1-x} - \frac{(1-x)}{(1-x)}x^n

= \frac{1-x^{n+1}-x^n+x^{n+1}}{1-x} = \frac{1-x^n}{1-x}

which we could have maybe figured out faster if we had remembered this identity:

\sum_{i=m}^n x^i = \frac{x^{n+1} - x^m}{x-1}

replacing the lower m with 0 and upper n with n-1 we get:

\sum_{i=0}^{n-1}x^i = \frac{x^{(n-1)+1}-x^0}{x-1}

=\frac{x^n-1}{x-1}

which multiplied by \left ( \frac{-1}{-1} \right ) is what we found earlier.

and so mario’s velocity after n frames of acceleration is:

v_n = a* \frac{1 - (.89)^n}{.11}

which can be calculated in constant time by this Java function:

public float velocityAfter(float accel, int nFrames) {
    return accel * ((1 - Math.pow(0.89, nFrames)) / 0.11);
}

Whew.

Thanks to Heath Borders and my co-blogger Ben for working this out on a whiteboard with me.

Tagged with: , , ,

tour de mario – jumping

Posted in ai, challenge, java, mario by youngnh on 08.10.09

My co-blogger Ben and I are working on an entry for the Mario AI Competition 2009. The idea is to develop a Java program that simulates button presses to guide Mario through randomly generated levels of increasing difficulty. The competition is aimed at exploring Artificial Intelligence methods in gaming, but before Ben and I can even get to that, I think a short tour of the code we will be working with is in order.

Jumping is key in Mario. Maybe more precisely, landing is key. Controlling where you land will stomp an enemy or clear a gap. Jumping is easy, landing is hard.

Since this is so integral, let’s take a look at the code that makes Mario tick.

The class ch.idsia.mario.engine.sprites.Mario controls Mario’s behavior.

The tick() method of Mario’s superclass, Sprite is called once for every frame of the game, 24/sec. tick() is pretty simple, it just calls move(), which Mario happens to implement.

The move() method has a lot of code in it — 224 lines — but at it’s core, it just sets the xa and ya variables — x-acceleration and y-acceleration — and then uses them to increment Mario’s current x and y position variables.

move() starts on line 156, but doesn’t start doing anything we’re interested in until line 211:

        float sideWaysSpeed = keys[KEY_SPEED] ? 1.2f : 0.6f;

Mario moves horizontally at a rate 0.6 per frame at normal speed and 1.2 when the SPEED key is pressed.

The jumping code starts on line 234. I’ve left out the conditions we don’t care about:

if(keys[KEY_JUMP] || ...) {
    ...
    if(onGround && mayJump) {
        xJumpSpeed = 0;
        yJumpSpeed = -1.9f;
        jumpTime = 7;
        ya = jumpTime * yJumpSpeed;
        onGround = false;
        sliding = false;
    }
    ...
    else if(jumpTime > 0) {
        xa += xJumpSpeed;
        ya = jumpTime * yJumpSpeed;
        jumpTime--;
    }
}

This is the heart of the jump. jumpTime is a frame counter variable, and a jump lasts 7 frames. I’m not positive, but I think that the y-origin is in the upper right of the screen, hence the negative yJumpSpeed. After 7 frames, jumpTime is reset to 0.

There are extra bits of code that limit Mario’s maximum acceleration to 8, cut Mario’s xa and ya to 85% of their value and every frame that Mario isn’t on the ground his y-acceleration is is decreased by 3.

        ya *= 0.85f;        
        if (!onGround) {
            ya += 3;
        }

But it’s enough to figure out where you’re going to land.

Update
I made a couple of mistakes, which are crossed out above, and besides the mention of sideWaysSpeed I didn’t mention anything about horizontal speeds. As you can see above, when you are in the air xJumpSpeed starts out as 0 and you don’t gain any acceleration from it for the duration of your jump. The code that sets your horizontal acceleration regardless of whether you are on the ground or not is at line 288:

        if (keys[KEY_RIGHT] && !ducking) {
            if (facing == -1) sliding = false;
            xa += sideWaysSpeed;
            if (jumpTime >= 0) facing = 1;
        }

which merely sets your acceleration depending on what direction you are pressing.

Tagged with: , , ,

Custom Continuum Notifier – Step 2: Build a dummy notifier

Posted in java by benjaminplee on 08.07.09

Now that we have Continuum building without any problems, the next step is getting a basic notifier coded up and Continuum aware of it.

The Continuum user wiki makes this process seem simple enough:  create a new notifier extending AbstractContinuumNotifier and register the new notifier in application.xml.  Essentially, this is true … with a few minor modifications.

  1. Create your new notifier project
    1. Under the continuum-notifiers Maven module, create a new module called continuum-notifier-web as a standard Maven project (or dupe one of the existing ones)
    2. Modify the new module’s pom.xml to have continuum-notifiers as its parent and add the continuum-notifier-api artifact as a dependency
    <project ...>
    <parent>
     <artifactId>continuum-notifiers</artifactId>
     <groupId>org.apache.continuum</groupId>
     <version>1.2.3</version>
     </parent>
     <modelVersion>4.0.0</modelVersion>
     <artifactId>continuum-notifier-web</artifactId>
     <name>Continuum :: web-Dashboard</name>
     <dependencies>
     <dependency>
     <groupId>org.apache.continuum</groupId>
     <artifactId>continuum-notifier-api</artifactId>
     <version>1.2.3</version>
     </dependency>
     </dependencies>
    </project>
    
    1. Add the new sub-module to the continuum-notifiers parent pom.xml.
    <modules>
     <module>continuum-notifier-api</module>
     <module>continuum-notifier-irc</module>
     <module>continuum-notifier-jabber</module>
     <module>continuum-notifier-msn</module>
     <module>continuum-notifier-wagon</module>
    
     <module>continuum-notifier-web</module>
     </modules>
    
  2. In your new project, create a basic notifier that simply logs that it was called.  This class should extend AbstractContinuumNotifier and implement getType() and sendMessage():
package org.apache.maven.continuum.notification.atkpwadashboard;

import org.apache.maven.continuum.notification.AbstractContinuumNotifier;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WebNotifier extends AbstractContinuumNotifier {
 private Logger log = LoggerFactory.getLogger(getClass());

 public String getType() {
 return "web";
 }

 public void sendMessage(String messageId, MessageContext context)
 throws NotificationException {
 log.info("Web Notifier has been called!");
 }
}
  1. Build your new module either by a full build of Continuum or by just building your sub module (mvn package).
  2. Deploy your new notifier
    1. Copy the newly created notifier .jar into your Continuum intallation at <CONTINUUM_BASE_DIR>/apps/continuum/WEB-INF/lib/continuum-notifier-web.jar
    2. Modify <CONTINUUM_BASE_DIR>/apps/continuum/WEB-INF/classes/META-INF/plexus/application.xml to register the new notifier by adding the following code:
    <component>
     <role>org.apache.maven.continuum.notification.Notifier</role>
     <role-hint>web</role-hint>
     <implementation>org.apache.maven.continuum.notification.atkpwadashboard.WebNotifier</implementation>
     </component>
    
  3. Restart Continuum
  4. On one of your managed applications, modify the pom.xml to use the new notifier:
<ciManagement>
 <system>continuum</system>
 <url>http://localhost:8080/</url>
 <notifiers>
 <notifier>
 <type>web</type>
 </notifier>
 </notifiers>
 </ciManagement>
  1. Profit.

Things to note:

  • Continuum uses Plexus for its dependency injection.  Since your notifier is extending AbstractContinuumNotifier which has dependencies, you MUST register your notifier as needing these inject or you will see some lovely and confusing NullPointerExceptions when you try to use the notifier.
  • Technically you don’t need to get Continuum to compile to build a new notifier, but having the source handy and being able to see how the rest of the project is structured helps.
  • I haven’t figured out how to get Continuum builds to auto install my notifier yet …. it is on my to-do list.
  • To simplify creating the new project, I just copied continuum-notifier-jar and made modifications as necessary.
  • The email notifier is called “mail” and is in the continuum-core module.

Big Frustrating Gotcha:  Continuum’s notifier plugin architecture only extends to the internals of Continuum but does NOT include any of the web-app’s GUI.  This means that your new notifier will not show up in any of the admin screens as an option, or have any edit/configuration screens a web user can see.  These GUI elements are managed in the continuum-core and continuum-web-app modules, separate from the notifier code themselves.  For what I was doing I didn’t need this, but it would have been nice.  It looked to be fairly simple to add the new notifier to the approrpiate .jsp’s and have full graphical interface power.

Helpful Links:

  • Continuum user wiki page on building a notifier
  • Email chains one and two talking about doing the same thing
  • Configure your notifier in the application.xml wrong and you might get an error like this

Last step is to make the notifier actually do something related to build ….

Tagged with: , ,

asdf-install innards

Posted in common lisp by youngnh on 08.07.09

asdf-install’s README is good stuff.

For example, I did not know that sbcl has a standalone program for downloading lisp libraries:

sbcl-asdf-install drakma

will do the same thing as the more traditional method of starting sbcl, requiring asdf-install and then installing drakma.

The REAMDE also notes that the arguments to asdf-install may be:

– The name of a cliki page. asdf-install visits that page and finds
the download location from the `:(package)’ tag – usually rendered
as “Download ASDF package from …”

– A URL, which is downloaded directly

– A local tar.gz file, which is installed

If you look at the wiki source of http://www.cliki.net/Drakma, you’ll see a :(package) tag just as described:

:(package "http://weitz.de/files/drakma.tar.gz")

Poking around in the source of asdf-install, you’ll see this:

  
(let ((url
         (if (= (mismatch package-name-or-url "http://") 7)
             package-name-or-url
             (format nil "http://www.cliki.net/~A?download"
                     package-name-or-url))))

Which means that cliki generates a url for a page that ends in “?download”. Here’s the headers returned when requesting that url:

GET /Drakma?download HTTP/1.1
Host: www.cliki.net
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.1) Gecko/20090718 Shiretoko/3.5.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Fri, 07 Aug 2009 13:50:21 GMT

HTTP/1.x 302 Redirected
Date: Fri, 07 Aug 2009 13:58:26 GMT
Server: Araneida/0.84
Connection: close
Content-Type: text/html
Last-Modified: Fri, 07 Aug 2009 13:58:26 GMT
Location: http://weitz.de/files/drakma.tar.gz
Pragma: no-cache
Expires: Fri, 30 Oct 1998 14:19:41 GMT

Redirected to the file named in the :(package) tag.

The pages on cliki are freely editable. This means that it is trivially easy to make a piece of software, add a cliki page for it and anyone with asdf-install could install it, which is a fantastic and democratic way to distribute software.

However, by the same token, this is exactly why you should pay attention to the signatures of packages and download any that don’t match or aren’t trusted to see for yourself what you’re getting before loading and executing it.

See the “Making your package downloadable…” section of this cliki page for more details on how to publish your lisp code for all the world to use:
http://www.cliki.net/asdf-install

Tagged with: , ,

Custom Continuum Notifier – Step 1: Get Continuum to Compile

Posted in java by benjaminplee on 08.06.09

report card

This past week I was charged with creating a custom Continuum notifier which would push a build’s Selenium TestNG test results to a second system for reporting.  Changing project priorities, distractions, poor documentation and some weird issues made this simple task take much longer than it should have.  Hopefully, this series of posts will help make the process just a little bit smoother for the next person.

Background

Visibility into the quality of a software project is crucial to delivering a top notch product.  One direct way to achieve this is monitoring the team’s UAT results over time.  I have been slowly pushing the client’s development teams to fully automate their user acceptance testing along side the rest of their build process.  Enter Continuum and Maven, the client’s choices for continuous integration and “one button” builds.  Out of the box, Continuum had pretty much everything we needed from a testing/reporting standpoint, except that the development teams were spread out across several Continuum installations and management wanted a single place to see ALL teams’ test results.  Luckily, I found that Continuum has a plugin architecture for its notifiers – the plugins which decide what to do with the builds’ results.  Out of the box, Continuum comes with the ability to email and/or IM your build results in several ways.  What I needed was a notifier that would send the testng-results.xml file generated by Maven’s surefire testing plugin via a multi-part HTTP POST request to our reporting app.  Sounded simple enough.

Use the Source Luke

A few quick Google searches taught me a couple things: 1) a number of people have wanted to to do exactly what I needed, and 2) there didn’t seem to be any good end-to-end documentation on how to do it.  First step, download the source and get it to compile.  Next, according to the directions on the main site and the user wiki, was a simple mvn install.

First Build – Fail

Unfortunately, the continuum-release module’s tests fail.  I am still looking into why the release module tries to do a SVN commit during a install of the main project and not a deploy, but for the time being, it fails.  In order to get past the release module’s issues (the other modules install just fine and pass their tests) I sinned and added a -Dmaven.test.skip=true to the build and continuum-release installed like a champ.

Second Build – I forgot

Boom!  Not so fast.  OutOfMemoryError from maven while building the web module.  Setting MAVEN_OPTS to increase the max heap memory Maven has fixed the problem.

Third Build – Fell off the log

Boom! Again.   java.lang.NoClassDefFoundError on Commons Logging coming from the JSPC plugin (precompiling JSPS).  It turns out that plugins live in the same classloader as Maven itself and the JSPC plugin requires commons logging.  It also turns out that in the latest version of Maven (2.2.0) the internal dependencies changed and the version JSPC is no longer included in Maven’s “uber”  .jar.  Fix?  Dropped 1.1.1 commons logging three .jars into Maven’s lib folder.  Fixed.

Fourth Build – Houston, We Have Compilation

Finally, everything compiled and installed without problems.  Next up, writing a dummy notifier and getting Continuum to use it.

Tagged with: , ,

asdf-install and GPG signatures

Posted in common lisp by youngnh on 08.05.09

asdf-install is a very useful package manager for lisp code. like rubygems.

The idea is that you load up your lisp REPL, and then punch in (asdf-install:install 'some-package) to install whatever lisp library you might want. It’s then fetched from the internet, compiled and loaded into your running lisp. peachy.

I’m not sure about the specifics, but I think asdf-install looks for the software you’re trying to find on cliki, which is a public lisp wiki to which anyone can edit, or upload files.

asdf-install is smart about this sort of thing though, it expects any software you download from the internet to have an accompanying signature file. This means the author of the software package needs to sign it with his or her private key and put the signature in the same place as the package. asdf-install will then check that:

a) you have that author’s public key
b) you trust the key comes from who it says it does
c) you have given asdf-install permission to install software from that author

This is all well and good. Except it’s a bit of a headache in practice, and if the user so chooses, asdf-install will allow you to bypass these security checks.

My goal for this post was to run through an install without ever using the dubious 0: [SKIP-GPG-CHECK] Don't check GPG signature for this package restart.

Before I get in too deep, these following 3 links were a tremendous help in figuring what the hell was going on. I love Common Lisp, but the learning curve is very, very steep. I’ve been struggling off and on with these issues for 3 years and these pages go into a bit more depth than I do here:
http://common-lisp.net/project/asdf-install/tutorial/install.html
http://www.cliki.net/ASDF-Install%20and%20GPG
http://www.pps.jussieu.fr/~jch/software/pgp-validating.html


First thing I do is create my own public/private keypair:

gpg --gen-key

and download and import the cliki common-lisp.net developer keyring, which has the public keys of all developers publishing software on cliki:

wget http://common-lisp.net/keyring.asc
gpg --import keyring.asc

Next, start up a fresh sbcl 1.0.30 image:

$ sbcl
* (require 'asdf-install)
* (asdf-install:install 'drakma)

and…we run into our first problem, GPG warns that the key id 0x595FF045057958C6 (Dr. Edmund Weitz ) is not fully trusted. Simple enough to fix:

gpg --edit edi@weitz.de
> lsign
> save

and…it blows up again, saying Dr. Edmund Weitz (key id 595FF045057958C6) is not on your package supplier list. This time, the restart is fine, as asdf-install doesn’t recommend editing the package suppliers list by hand, so choose the 0: [ADD-KEY ] Add to package supplier list restart.

Next up, cl+ssl, a dependency of drakma, doesn’t have a signature file to be found. You can sneak a peak at the project’s releases here: http://common-lisp.net/project/cl-plus-ssl/download/ and see that although some of the older ones have signature files, there is, indeed, no signature for the latest release that asdf-install is trying to install. So download to your local machine, crack it open and take a quick look at the source, to be sure that it’s not malicious software. Then:

(asdf-install:install "cl+ssl.tar.gz")

Except even though you are now installing from a local file and asdf-install won’t check for a signature, the installation blows up again, because it depends on the trivial-gray-streams package, which it tries to fetch from the internet and is also missing a signature file. In fact, trivial-gray streams can be downloaded from the same directory as cl+ssl, and you can see for yourself that trivial-gray-streams doesn’t have a signature file either. So we download, peek inside and repeat the local asdf-install for trivial-gray-streams, then cl+ssl:

(asdf-install:install "trivial-gray-streams.tar.gz")
(asdf-install:install "cl+ssl.tar.gz")

This should result in stopping to alert you that the cffi and rt library authors are not trusted. Sign them with gpg, then hit the restarts to add them to the package suppliers list.

And things finally finish.


So that’s one kind of problem. Going through the process with cl-ppcre presents a somewhat different problem.

(asdf-install:install 'cl-ppcre)

When checking the signature of cl-unicode, you get the error No key found for key id 0x595FF045057958C6.

This is a misleading error message. We know it’s not true, since that’s Edi Weitz’s key and we’ve successfully installed software signed by him before. So, quit out of sbcl and let’s check that signature by hand:

wget http://weitz.de/files/cl-unicode.tar.gz
wget http://weitz.de/files/cl-unicode.tar.gz.asc
gpg cl-unicode.tar.gz.asc

and we see:

gpg: Signature made Wed 23 Jul 2008 05:28:42 PM CDT using DSA key ID 057958C6
gpg: BAD signature from "Dr. Edmund Weitz <edi@weitz.de>"

Dun dun duunnn. Here’s an example of exactly why asdf-install does these security checks. The signature provided couldn’t have come from the file we were about to download, nevermind the fact that asdf-install does a terrible job of letting us know this.

However, I have trouble believing this is malicious code since it did come from Edi Weitz’s server (which could have been hacked). Since we’ve downloaded the tarball to check it’s signature anyway, we can take a look inside and see that it’s not terribly obviously malicious. So the source was probably modified and just wasn’t re-signed, so we go through the procedure to install from a local file.

So there you have a rather long-winded tour of just about everything that could go wrong when installing lisp code from wild west internet wikis. asdf-install needs some polish for sure and the packaging practices of the software on common-lisp.net is lacking. I hear clbuild is a good alternative, but I don’t know much about it. I’d like to look into it and try this same excercise with it and post back.

Tagged with: , , , ,

One Infinite Loop

Posted in common lisp by youngnh on 08.03.09

I wanted to write a small bot to scrape my netflix queue from their website, but making a simple request to their homepage redirected in a loop forever.

I’ve used Edi Weitz’s Drakma library for this sort of thing before and I really like it. Here was the code that I wrote:

(let ((cookies (make-instance 'cookie-jar)))
  (http-request "https://www.netflix.com/"
		     :method :get
		     :cookie-jar cookies))

Requesting http://www.netflix.com would result in being redirected to http://www.netflix.com/Default?tcw=1&cqs=, which would result in being redirected back to netflix.com and so on forever. Drakma actually handled the situation very gracefully. After 6 redirect attempts, it threw an exception saying it had exceeded its redirection limit. I just couldn’t figure out why.

I tried messing with my user agent. Writing scrapers has always made me a little paranoid that the developers on the other end know what I’m up to, are locked in a struggle-to-the-death to prevent my unauthorized uses and of course use the most sophisticated and obvious option available to them: my User-Agent header. Drakma has a remarkably convenient built-in facility for spoofing major browser versions. It’s literally a 20 character change to the function call. I tried that. No dice. Apparently the Netflix developers are as overworked as everyone else and bots just aren’t a big enough problem to justify snuffing them out by filtering User-Agent headers.

The redirection eventually terminated in Firefox. Otherwise I and millions of other users wouldn’t be able to see the page at all. Maybe Firefox has a built in limit to redirection, at which point it just says “fuck it, I’m just gonna go with this page”. But that doesn’t make any sense either, becuase 302’s don’t come with any HTML to display just in case the browser decides it’s tired of bouncing from page to page.

Firebug’s net panel showed that with cookies on, I didn’t redirect at all, netflix.com returned a 200 OK. With cookies off, I redirected once to /Default, then back to the homepage and that time, a 200 came back with gloriously renderable HTML.

I sent Drakma’s HTTP headers to *standard-output* with a handy little one-liner:

(setq *header-stream* *standard-output*)

Here’s what I saw:

GET / HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:31 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/Default?tcw=1&amp;cqs=
Content-Length: 0
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=X; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:32 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixSession=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:32 GMT; Path=/
Set-Cookie: asearch=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:32 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=93
Connection: Keep-Alive
Content-Type: text/plain
Set-Cookie: NSC_xxx.ofugmjy.dpn=XXX;path=/

GET /Default?tcw=1&amp;cqs= HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Cookie: NSC_xxx.ofugmjy.dpn=XXX; asearch=XXX; NetflixCookies=XXX; NetflixSession=XXX; nflxsid=XXX; country=XXX; VisitorId=XXX
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:32 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/
Content-Length: 0
Set-Cookie: lastHitTime=XXX; Domain=.netflix.com; Path=/
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:32 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:32 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=84
Connection: Keep-Alive
Content-Type: text/plain

GET / HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:32 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/Default?tcw=1&amp;cqs=
Content-Length: 0
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:32 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixSession=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:32 GMT; Path=/
Set-Cookie: asearch=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:32 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=69
Connection: Keep-Alive
Content-Type: text/plain
Set-Cookie: NSC_xxx.ofugmjy.dpn=XXX;path=/

GET /Default?tcw=1&amp;cqs= HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Cookie: lastHitTime=XXX; NSC_xxx.ofugmjy.dpn=XXX; asearch=XXX; NetflixCookies=XXX; NetflixSession=XXX; nflxsid=XXX; country=XXX; VisitorId=XXX
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:32 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/
Content-Length: 0
Set-Cookie: lastHitTime=XXX; Domain=.netflix.com; Path=/
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:33 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:33 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=92
Connection: Keep-Alive
Content-Type: text/plain

GET / HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:32 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/Default?tcw=1&amp;cqs=
Content-Length: 0
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:33 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixSession=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:33 GMT; Path=/
Set-Cookie: asearch=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:33 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=93
Connection: Keep-Alive
Content-Type: text/plain
Set-Cookie: NSC_xxx.ofugmjy.dpn=XXX;path=/

GET /Default?tcw=1&amp;cqs= HTTP/1.1
Host: www.netflix.com
User-Agent: Drakma/1.0.0 (SBCL 1.0.30; Linux; 2.6.30-ARCH; http://weitz.de/drakma/)
Accept: */*
Cookie: lastHitTime=XXX; NSC_xxx.ofugmjy.dpn=XXX; asearch=XXX; NetflixCookies=XXX; NetflixSession=XXX; nflxsid=XXX; country=XXX; VisitorId=XXX
Connection: close

HTTP/1.1 302 Moved Temporarily
Date: Mon, 03 Aug 2009 12:47:32 GMT
Server: Apache-Coyote/1.1
P3P: CP="CAO DSP COR DEVa TAIa OUR BUS UNI STA"
Location: http://www.netflix.com/
Content-Length: 0
Set-Cookie: lastHitTime=XXX; Domain=.netflix.com; Path=/
Set-Cookie: VisitorId=XXX; Domain=.netflix.com; Path=/
Set-Cookie: country=XXX; Domain=.netflix.com; Expires=Tue, 03-Aug-2010 12:47:33 GMT; Path=/
Set-Cookie: nflxsid=XXX; Domain=.netflix.com; Path=/
Set-Cookie: NetflixCookies=XXX; Domain=.netflix.com; Expires=Wed, 02-Sep-2009 12:47:33 GMT; Path=/
Vary: Accept-Encoding
Cache-Control: private
Keep-Alive: timeout=15, max=15
Connection: Keep-Alive
Content-Type: text/plain

Huh, the issue was in Drakma. I would request their homepage with no cookies, and Netflix would respond with a redirect to a different URL and a bunch of Set-Cookie headers. Drakma would follow, dutifully passing a Cookie header, and be redirected back to the original URL, also with Set-Cookie headers. Drakma would again follow, but this time not include cookies in the request.

Netflix is doing the redirects — I think — to ‘prime the pump’. If you don’t send cookies to the second URL, you get redirected to a you need to turn cookies on page. If you do, Netflix figures you’ll end up back at their homepage with a bunch of initialized cookies. It’s weird behavior on the part of Drakma that’s breaking the system.

I wonder who to talk to about this? Edi Weitz? #lisp?

Tagged with: , , , , ,
Follow

Get every new post delivered to your Inbox.