Forget to reset your Garmin between workouts?

May 23rd, 2011

I love my Garmin Forerunner 305 – I never run / bike / hike without it. I usually download my workout while I stretch, but sometimes that isn’t convenient (especially when traveling) so I just leave the workout on the watch to download later. Whenever I do this I inevitably forget to reset the watch before my next workout … and end up with two (or more) workouts grouped together as one when I finally get around to downloading the data from my watch.

Unfortunately, none of the tools I’ve used to import my data provide a way for me to split this combined “workout” into separate workouts. To overcome this, I wrote a little script which reads a tcx file and splits it up into separate workouts based on the amount of time between one lap stopping and the next starting. You can use it online at my Garmin Workout Splitter page.

The source code for the site is included below. It is quite reliant on the structure of Garmin’s tcx files, mostly because Garmin seems to be overly sensitive about capitalization (and maybe spacing too). That meant I couldn’t use BeautifulSoup to do the parsing (I tried, and then realized Garmin wouldn’t have any of that, doh).
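As a rough illustration of the splitting idea only (this is not the site's source), the sketch below reads each lap's start time and duration from a tcx document and starts a new workout whenever the gap between one lap ending and the next starting exceeds a threshold. The function names and the one-hour default are assumptions made for the example, and it uses the standard library's ElementTree for reading only; it does not write tcx files back out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Namespace used by Garmin's Training Center (tcx) files.
TCX_NS = '{http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2}'


def read_laps(tcx_text):
    """Return a list of (start_time, duration_seconds) tuples, one per lap."""
    root = ET.fromstring(tcx_text)
    laps = []
    for lap in root.iter(TCX_NS + 'Lap'):
        # StartTime looks like 2011-05-20T14:00:00Z (possibly with fractions);
        # keep only the first 19 characters so strptime can parse it.
        start = datetime.strptime(lap.get('StartTime')[:19], '%Y-%m-%dT%H:%M:%S')
        duration = float(lap.find(TCX_NS + 'TotalTimeSeconds').text)
        laps.append((start, duration))
    return laps


def group_laps(laps, max_gap_seconds=3600):
    """Group consecutive laps into workouts, splitting on long gaps.

    max_gap_seconds is a hypothetical threshold chosen for this example.
    The end of a lap is approximated as start + TotalTimeSeconds.
    """
    workouts = []
    prev_end = None
    for start, duration in laps:
        gap_too_long = (prev_end is not None and
                        (start - prev_end).total_seconds() > max_gap_seconds)
        if prev_end is None or gap_too_long:
            workouts.append([])  # begin a new workout
        workouts[-1].append((start, duration))
        prev_end = start + timedelta(seconds=duration)
    return workouts
```

Calling group_laps(read_laps(text)) on a file holding a two-lap run followed by a ride the next morning would yield two workouts, the first with two laps and the second with one.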

David Underhill Coding 1 comment

PostgreSQL UPSERT (in Python)

January 9th, 2011

PostgreSQL does not yet support the UPSERT command (though it is on their Todo list). If you have a row you want to update (if it already exists in the database) or insert (if it doesn’t exist yet), then PostgreSQL unfortunately makes you implement the logic yourself. Other popular databases like SQLite (INSERT OR REPLACE) and MySQL (ON DUPLICATE KEY UPDATE) both support upserting. I haven’t run across a generic PL/pgSQL function which can do this, but you could write a trigger (like this one) for each table where this functionality is needed.

Unfortunately, this is a bit of a pain if you want to use UPSERT on many tables, so I wrote a Python method which takes care of the UPSERT logic generically. To use it, you call it with a cursor connected to your database, the schema and table name, a list of primary key field names, and the key-value pairs for each field.

For example, let’s say you have a table which tracks scores (and only the last score counts):

CREATE TABLE MySchema.Scores (
    user_id integer PRIMARY KEY,
    score   integer NOT NULL
);

To UPSERT a row into this table you would:

import psycopg2

db_conn = psycopg2.connect("...")
db_cur = db_conn.cursor()
# update the row for this user_id if it exists, otherwise insert it
upsert(db_cur, 'Scores', ('user_id',), schema='MySchema', user_id=..., score=...)
db_conn.commit()

Here’s the code for the Python-based upsert method:
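What follows is only a minimal sketch of the approach described above, written to match the call shown in the example; it is an illustration under assumptions, not the original listing. It assumes a DB-API cursor with psycopg2-style %s placeholders, tries an UPDATE first, and falls back to an INSERT when no row was updated. The return values ('updated' / 'inserted') are invented for the example.

```python
def upsert(cur, table, pk_fields, schema=None, **fields):
    """Update the row identified by pk_fields, or insert it if it is absent.

    Assumptions for this sketch: cur is a DB-API cursor using %s
    placeholders, and table, schema and field names come from trusted code
    (they are interpolated into the SQL text; only values are parameters).
    """
    missing = [k for k in pk_fields if k not in fields]
    if missing:
        raise ValueError('missing primary key value(s): %s' % ', '.join(missing))
    value_fields = sorted(f for f in fields if f not in pk_fields)
    if not value_fields:
        raise ValueError('need at least one non-key field to update')

    qualified = '%s.%s' % (schema, table) if schema else table
    set_clause = ', '.join('%s = %%s' % f for f in value_fields)
    where_clause = ' AND '.join('%s = %%s' % k for k in pk_fields)

    # First try to update an existing row.
    cur.execute('UPDATE %s SET %s WHERE %s' % (qualified, set_clause, where_clause),
                [fields[f] for f in value_fields] + [fields[k] for k in pk_fields])
    if cur.rowcount > 0:
        return 'updated'

    # Nothing matched, so insert a new row. Another transaction could insert
    # the same key between the two statements; callers that care should be
    # ready to retry when the INSERT fails with a unique violation.
    names = sorted(fields)
    placeholders = ', '.join(['%s'] * len(names))
    cur.execute('INSERT INTO %s (%s) VALUES (%s)' % (qualified, ', '.join(names), placeholders),
                [fields[n] for n in names])
    return 'inserted'
```

Trying the UPDATE first keeps the common "row already exists" case to a single statement; the cost is the small race window noted in the comment.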

David Underhill Python, 3 comments

Asynchronous URL Fetch manager for App Engine

October 30th, 2010

App Engine’s URL Fetch API supports fetching URLs asynchronously.  However, a request handler may only simultaneously fetch up to 10 URLs.  To fetch more than 10, it must wait for one to finish before starting another. This is a little tricky to do efficiently*, so I put together a Python module which takes care of the details.  The module provides an AsyncURLFetchManager class with a simple interface – just tell it what URLs you want and it fetches them as quickly as possible.  This interface also simplifies the starting of an asynchronous request into a single method call:

fetch_asynchronously(url)

You can also pass fetch_asynchronously() any arguments which urlfetch.make_fetch_call() accepts (e.g., method, payload). You can also request a callback, which will conveniently be passed the RPC object (which contains the results) along with any other positional or keyword arguments you would like.

At the end of your request, just call wait() to ensure that any pending fetches and their callbacks are completed prior to the request handler terminating.

* Unfortunately, App Engine does not currently provide select() or any other non-blocking mechanism which can check if an RPC has completed.  Once it does, this implementation could be improved to ensure that it only waits on an RPC which has already completed (currently we just have to wait on the oldest one – this is sub-optimal since later RPCs may actually finish first).
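The bookkeeping described above can be sketched without App Engine at all. The class below is a hypothetical stand-in, not the module's actual code: it keeps at most max_in_flight requests pending and, when the limit is reached, waits on the oldest one before starting another. The start_call argument is an assumption made for the example; on App Engine it would be a function that creates an RPC with urlfetch.create_rpc(), starts it with urlfetch.make_fetch_call(), and returns the RPC.

```python
from collections import deque


class BoundedAsyncManager(object):
    """Keeps at most max_in_flight asynchronous requests pending at once."""

    def __init__(self, start_call, max_in_flight=10):
        # start_call(url) must start a request and return an object with a
        # wait() method (hypothetical hook for this sketch).
        self.start_call = start_call
        self.max_in_flight = max_in_flight
        self.pending = deque()  # (rpc, callback, args, kwargs), oldest first

    def fetch_asynchronously(self, url, callback=None, *cb_args, **cb_kwargs):
        """Start fetching url, first waiting on the oldest request if full."""
        if len(self.pending) >= self.max_in_flight:
            self._finish_oldest()
        rpc = self.start_call(url)
        self.pending.append((rpc, callback, cb_args, cb_kwargs))
        return rpc

    def _finish_oldest(self):
        # Without a select()-like call we cannot tell which request finished
        # first, so block on the one that has been running the longest.
        rpc, callback, args, kwargs = self.pending.popleft()
        rpc.wait()
        if callback is not None:
            callback(rpc, *args, **kwargs)

    def wait(self):
        """Block until every pending request and its callback has completed."""
        while self.pending:
            self._finish_oldest()
```

Callbacks run in the order the requests were started, which is simple to reason about even though a later request may have finished earlier.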

David Underhill Google App Engine 3 comments

CraigNotes: An enhanced Craigslist interface – rate ads, take notes, and more

September 11th, 2010

CraigNotes is a free service which helps you track your favorite ads on Craigslist.

Motivation. I love Craigslist – I’ve used it to find great housing options when moving to a new area, as well as deals for various appliances and furniture. Unfortunately, I always find myself awkwardly scratching notes in a spreadsheet as I try to track the most promising ads. This cumbersome process inspired CraigNotes.

Features. CraigNotes is a simple website which presents Craigslist ads inside an enhanced interface. CraigNotes lets you rate ads, take notes, and hide irrelevant spam. It automatically fetches the latest ads which match your search, and lets you continue to view old ads – even after they disappear from Craigslist (which only keeps them around for a week).

This simple functionality makes it a lot easier to keep your favorites at your fingertips, along with any additional notes you want to remember (I jot down details I receive from the ad’s poster when I contact them, as well as my opinion about the ad).

Give CraigNotes a try and let me know what you think!

CraigNotes Screenshot:

David Underhill Software No comments

Rate limiting user requests on app engine (optionally with Captchas)

June 13th, 2010

You may have some functionality on your app engine site that you want to protect from robots and prevent users from executing too frequently. For example, perhaps users can leave comments but you only want them to be able to leave a comment every N seconds – faster than that and the “user” is either a bot or is not using the system as intended.

One way to discourage this behavior is to limit how often a user can take a certain action to a fixed rate. I’ve created a RateLimiter class which handles the logic of tracking how quickly a user is making requests, and determines when your code (optionally) should challenge the user with a captcha before allowing them to continue. If you simply want to rate limit the user’s requests, you can ignore the captcha business and just return an error to the user whenever they exceed the allowed rate.

The source is available at http://gist.github.com/437051 (including the optional captcha handling code).

Example Usage:
The example code below shows a rate limiter which allows a user to interact with a particular page once every 2 seconds. It also gives the user 3 “tokens” which allow the user to violate this limit by up to 3 requests. Tokens are consumed if a user makes a request within 2 seconds of the previous request. Tokens are returned if the user slows down, or if the user solves a captcha.

This example is written as if the request is expected to be made via JavaScript on your page. The client-side JavaScript would check the response for the 'captcha-show' text and prompt the user with a captcha if that text was present. When the captcha is answered, another AJAX call would be made to send the user’s response to the CaptchaHandler class in rate_limit.py. You are free to integrate the captcha challenge however you like. Just call RateLimiter.captcha_solved() or RateLimiter.rate_limit(uid, captcha_solved=True) when the user meets your challenge (it doesn’t even have to be a captcha).
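To make the token rules concrete, here is a small in-memory sketch of the same policy. It is not the RateLimiter class from the gist: the class name, constructor arguments, and the dictionary used for storage are assumptions made for illustration (a real App Engine version would keep this state somewhere shared, such as memcache). Only the rate_limit(uid, captcha_solved=True) call shape is borrowed from the description above.

```python
import time


class SimpleRateLimiter(object):
    """Allows one request per min_interval seconds, plus a few spare tokens."""

    def __init__(self, min_interval=2.0, max_tokens=3, clock=time.time):
        self.min_interval = min_interval
        self.max_tokens = max_tokens
        self.clock = clock      # injectable so the policy can be tested
        self._state = {}        # uid -> (time of last request, tokens left)

    def rate_limit(self, uid, captcha_solved=False):
        """Record a request; return True if it should be blocked."""
        now = self.clock()
        last, tokens = self._state.get(uid, (None, self.max_tokens))
        if captcha_solved:
            tokens = self.max_tokens            # solving a captcha refills tokens
        elif last is not None and now - last < self.min_interval:
            if tokens == 0:
                self._state[uid] = (now, 0)     # too fast and out of tokens
                return True
            tokens -= 1                         # too fast: spend a token
        elif last is not None:
            tokens = min(self.max_tokens, tokens + 1)  # slowed down: earn one back
        self._state[uid] = (now, tokens)
        return False
```

A handler would respond with the 'captcha-show' text (or simply an error) whenever rate_limit(uid) returns True.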

David Underhill Google App Engine No comments

Country and State Chooser JavaScript Widget

May 8th, 2010

I’ve put together a simple JavaScript widget which dynamically populates a state dropdown box when a country is selected. The widget includes states for the United States, Canada, and a few other countries. It should be trivial to modify the script to use a list of countries and states of your choosing. The full source is below. A minified version is also available. Enjoy!

David Underhill Web Development No comments

Announcing JCustomUploader: a simple Java-based file uploader

April 20th, 2010

Today I released JCustomUploader, a simple, Java-based file uploader. It can run as an applet on your website (Java 1.4 or higher) or as part of a desktop application. This software is open-source and free to use or include in your own project. Please check out the JCustomUploader homepage for more information.

Here is a screenshot of the UI (with fake “random” failures inserted to show how failures are handled):

JCustomUploader UI

David Underhill Software No comments

FAST Google App Engine Sessions (and RPX integration)

April 12th, 2010

The Google App Engine infrastructure provides many services, but sessions are not one of them. There are several Python-based session middlewares which already do this so I considered them first (spoiler: I ended up writing my own and it is orders of magnitude faster than the alternatives: gae-sessions).

Beaker is a solid implementation, but it lacks support for memcache on app engine. This means every request must go to the datastore to fetch session data – yuck.

gaeutilities is designed for app engine and takes advantage of both memcache and the datastore. Unfortunately, the code is a bit heavyweight – it is coupled to unrelated functionality (e.g., “flash” messaging) and it is complicated by support for options I do not need (e.g., cookie-only sessions and automatic token rotation). Most significantly, its performance suffers from excess API calls and inefficient model storage.

Since I was unsatisfied with these options, I wrote my own sessions middleware, gae-sessions. It strives to be lightweight, fast (but reliable), secure, and easy to use. I ended up with a pretty small library (200 lines of code) which met these goals. It uses memcache (for speed) and the datastore (for reliability) but only reads and writes when it must. db.Model objects are efficiently stored by converting them to protobufs instead of using the automatic pickling functionality (which is slow since app engine lacks cPickle).
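The "fast cache plus reliable store, touched only when necessary" idea can be sketched in a few lines. This is not gae-sessions code: the class and method names are hypothetical, and two plain dictionaries stand in for memcache and the datastore so the example runs anywhere.

```python
class TwoTierSessionStore(object):
    """Reads from a fast cache first and falls back to a durable store."""

    def __init__(self, cache, durable):
        self.cache = cache      # fast but may evict entries (memcache stand-in)
        self.durable = durable  # slower but reliable (datastore stand-in)

    def load(self, sid):
        """Return the session data for sid, or None if it does not exist."""
        data = self.cache.get(sid)
        if data is None:
            data = self.durable.get(sid)
            if data is not None:
                self.cache[sid] = data  # repopulate the cache for next time
        return data

    def save(self, sid, data, dirty):
        """Write the session to both tiers, but only if it actually changed."""
        if not dirty:
            return False
        self.cache[sid] = data
        self.durable[sid] = data
        return True
```

Skipping the write when nothing changed is what keeps read-only requests from paying for a datastore put.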

Consider gae-sessions if you need sessions support for a Python web application hosted on Google’s app engine. The project includes demo code which you can run without modification on the app engine development server. The demo shows gae-sessions working with an OpenID-based authentication system powered by RPX (check it out to see how easy it is to integrate with RPX).

Update: I’ve created an in-depth comparison page which compares both the features and performance of alternative sessions libraries (beaker, gaeutilities, gmemsess, and suas) with gae-sessions.

David Underhill Google App Engine 10 comments

Python logging and performance: how to have your cake and eat it too

February 6th, 2010

I love Python’s logging module. I use it all the time to log a wide variety of information: messages to help me debug as well as informative messages for the user. Though you can toggle which messages you want to be printed, if the Python interpreter encounters a logging method call it still creates the string for the log message (the argument to the method), since Python (sadly) doesn’t have lazy evaluation like Haskell. If creating this string is expensive, then your application’s performance may suffer. Unfortunately, there is no Python preprocessor (like C’s cpp … though preprocess might be able to do it) so it is difficult to automatically remove a large number of logging statements prior to running an application in a production environment.

The best solution I’ve seen is to prefix logging statements with if __debug__: so that they are optimized away by python -O (see this post on StackOverflow). I like it, but it unfortunately requires this statement to be prefixed to every logging statement I don’t want in a production environment. That’s a lot of ugly extra code and it isn’t easy to change which statements it applies to either.
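A small demonstration of the cost being discussed, using only the standard logging module. The expensive_summary() helper and the logger name are made up for the example; the counter simply records how often the expensive argument gets built while DEBUG messages are disabled.

```python
import logging

logger = logging.getLogger("demo")
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.WARNING)  # DEBUG messages will be discarded

calls = []


def expensive_summary():
    """Stand-in for a costly computation used only in a log message."""
    calls.append(1)
    return "summary"


# 1. Plain call: the argument is built even though the message is dropped.
logger.debug("state: " + expensive_summary())
eager_calls = len(calls)

# 2. Explicit level check: the argument is never built when DEBUG is off.
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: " + expensive_summary())
guarded_calls = len(calls) - eager_calls

# 3. __debug__ guard: this whole statement is removed when the file is run
#    with "python -O"; without -O it behaves exactly like case 1.
if __debug__:
    logger.debug("state: " + expensive_summary())
```

After the first two cases, eager_calls is 1 and guarded_calls is 0. Both guards work, but each one has to be written by hand around every logging statement, which is the drawback noted above.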

I decided to write a script which automatically parses a Python file and replaces logging statements of a particular level with a pass statement and a commented out copy of the logging code. It can also do the reverse operation. It has some limitations (see the code, or run the script with the --help option), but it should work for most Python files. I used it for the VNS project and it successfully operated on every file in the project. It also improved performance dramatically – the maximum throughput of the VNS simulator increased by 25%! In comparison, running the code with Psyco only garnered a 6% improvement (though pretty substantial for the minimal 13 lines I had to add to take advantage of it).
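To show the shape of the transformation (and only that), here is a deliberately tiny sketch. It is not the script described above: it handles only single-line calls on objects named logging, log, or logger, and the function names are invented for the example.

```python
import re

# Matches a whole line that is a single-line call such as logging.debug(...).
_LOG_CALL = re.compile(r'^(\s*)((?:logging|log|logger)\.(\w+)\(.*\))\s*$')
# Matches a line previously rewritten by strip_logging().
_STRIPPED = re.compile(r'^(\s*)pass  #((?:logging|log|logger)\.\w+\(.*\))$')


def strip_logging(source, levels=('debug',)):
    """Replace matching logging calls with pass plus a commented-out copy."""
    out = []
    for line in source.splitlines():
        m = _LOG_CALL.match(line)
        if m and m.group(3) in levels:
            # pass keeps the block valid if the call was its only statement
            out.append('%spass  #%s' % (m.group(1), m.group(2)))
        else:
            out.append(line)
    return '\n'.join(out)


def restore_logging(source):
    """Undo strip_logging() by uncommenting the saved calls."""
    out = []
    for line in source.splitlines():
        m = _STRIPPED.match(line)
        out.append(m.group(1) + m.group(2) if m else line)
    return '\n'.join(out)
```

Leaving the original call in a trailing comment is what makes the reverse operation possible without keeping a second copy of the file.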

I think this script is worth using before running your code in a production environment if you are a heavy user of the logging module like I am. You can find the code here (it is hosted on Siafoo, a neat site for sharing code). Here’s the latest version of the code:

David Underhill Coding, Python, 3 comments

Overcoming Linux Screen Resolution Limitations (EDID)

January 16th, 2010

A little while ago I picked up a 26″ monitor (NEC MultiSync LCD2690WUXi). Unfortunately, I found that when I connected the monitor to my Ubuntu Linux box I could only use up to 1280×1024, even though the monitor’s native resolution is 1920×1200! I also had this problem on my Windows and SuSE machines, so I suspect the monitor is not properly reporting its maximum resolution via EDID.

I used the command-line utility xrandr to fix the problem. Running the tool with no arguments prints a list of displays and available display modes for each. This is handy since you need the name assigned to your display by your OS for the next step. Next, use the “--newmode” option with xrandr and specify the modeline which describes the display configuration you wish to use. This modeline generator might help you create the modeline you need. Once you create the new mode, use the “--addmode” option to add it to the list of modes supported by your monitor.

Finally, add this command to your ~/.xprofile file (or something similar) so that when you start your machine the new mode is automatically added and available (this way Ubuntu automatically reselects it too). This is what I ended up adding to my ~/.xprofile file:

xrandr --newmode "1920x1200_50Hz" 128.300 1920 1968 2000 2079 1200 1203 1209 1234 +hsync -vsync
xrandr --addmode HDMI-0 "1920x1200_50Hz"

Note: If you set your refresh rate too high, your monitor will probably flicker occasionally. If this happens, try lowering the refresh rate by lowering the pixel clock value (the first number in the modeline).
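The relationship between the pixel clock and the refresh rate is simple arithmetic: the refresh rate is the pixel clock divided by the total number of pixels per frame (horizontal total times vertical total, the 5th and 9th numbers in the modeline). The helper names below are made up for this sketch, and it assumes the pixel clock is given in MHz (128.3 for the mode above).

```python
def refresh_rate_hz(modeline):
    """Compute the refresh rate from the numbers passed to xrandr --newmode.

    modeline is everything after the mode name, e.g.
    "128.300 1920 1968 2000 2079 1200 1203 1209 1234 +hsync -vsync".
    """
    fields = modeline.split()
    pixel_clock_mhz = float(fields[0])
    htotal = int(fields[4])   # total pixels per line, including blanking
    vtotal = int(fields[8])   # total lines per frame, including blanking
    return pixel_clock_mhz * 1e6 / (htotal * vtotal)


def pixel_clock_mhz_for(htotal, vtotal, target_hz):
    """Pixel clock (in MHz) needed to hit target_hz with the given totals."""
    return htotal * vtotal * target_hz / 1e6


rate = refresh_rate_hz("128.300 1920 1968 2000 2079 1200 1203 1209 1234 +hsync -vsync")
lower_clock = pixel_clock_mhz_for(2079, 1234, 45)
```

Here rate comes out at roughly 50 Hz, and lower_clock is the pixel clock to try if you wanted to drop the same mode to about 45 Hz.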

David Underhill Linux No comments