<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David Underhill &#187; Python</title>
	<atom:link href="http://dound.com/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://dound.com</link>
	<description>dound&#039;s space on the web</description>
	<lastBuildDate>Tue, 31 Jan 2012 10:57:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>PostgreSQL UPSERT (in Python)</title>
		<link>http://dound.com/2011/01/postgresql-upsert-in-python/</link>
		<comments>http://dound.com/2011/01/postgresql-upsert-in-python/#comments</comments>
		<pubDate>Mon, 10 Jan 2011 00:53:25 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[PostgreSQL]]></category>

		<guid isPermaLink="false">http://dound.com/?p=492</guid>
		<description><![CDATA[PostgreSQL does not yet support the UPSERT command (though it is on their Todo list). If you have a row you want to update (if it already exists in the database) or insert (if it doesn&#8217;t exist yet), then PostgreSQL unfortunately makes you implement the logic yourself. Other popular databases such as SQLite (INSERT OR REPLACE) [...]]]></description>
			<content:encoded><![CDATA[<p>PostgreSQL does not yet support the <code>UPSERT</code> command (though it is on their <a href="http://wiki.postgresql.org/wiki/Todo">Todo list</a>).  If you have a row you want to update (if it already exists in the database) or insert (if it doesn&#8217;t exist yet), then PostgreSQL unfortunately makes you implement the logic yourself.  Other popular databases such as SQLite (<code>INSERT OR REPLACE</code>) and MySQL (<code>ON DUPLICATE KEY UPDATE</code>) support upserting.  I haven&#8217;t run across a <em>generic</em> PL/pgSQL function which can do this, but you could write a trigger (like <a href="http://database-programmer.blogspot.com/2009/06/approaches-to-upsert.html">this one</a>) for each table where this functionality is needed.</p>
<p>Unfortunately, this is a bit of a pain if you want to use <code>UPSERT</code> on many tables, so I wrote a <a href="https://gist.github.com/772171">Python method</a> which takes care of the <code>UPSERT</code> logic generically.  To use it, you call it with a cursor connected to your database, the schema and table name, a list of primary key field names, and the key-value pairs for each field.</p>
<p>For example, let&#8217;s say you have a table which tracks scores (and only the last score counts):</p>

<div class="wp_syntax"><div class="code"><pre class="sql" style="font-family:monospace;"><span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> MySchema<span style="color: #66cc66;">.</span>Scores <span style="color: #66cc66;">&#40;</span>
    user_id integer <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span><span style="color: #66cc66;">,</span>
    score   integer <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>
<span style="color: #66cc66;">&#41;</span>;</pre></div></div>

<p>To <code>UPSERT</code> a row into this table you would:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #ff7700;font-weight:bold;">import</span> psycopg2
db_conn = psycopg2.<span style="color: black;">connect</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;...&quot;</span><span style="color: black;">&#41;</span>
db_cur = db_conn.<span style="color: black;">cursor</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
upsert<span style="color: black;">&#40;</span>db_cur, <span style="color: #483d8b;">'Scores'</span>, <span style="color: black;">&#40;</span><span style="color: #483d8b;">'user_id'</span>,<span style="color: black;">&#41;</span>, schema=<span style="color: #483d8b;">'MySchema'</span>, user_id=..., score=...<span style="color: black;">&#41;</span>
db_conn.<span style="color: black;">commit</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

<p>Here&#8217;s the code for the Python-based <code>upsert</code> method:</p>
<p><script src="https://gist.github.com/772171.js"> </script></p>
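<p>For readers who cannot load the embedded gist, here is a minimal sketch of the update-then-insert idea.  It is not the code from the gist: the <code>placeholder</code> parameter and the returned status strings are assumptions added for illustration, and the demo runs against an in-memory SQLite table so it needs no server.  With psycopg2 the default <code>%s</code> placeholder would apply.  Note that two concurrent transactions can both find nothing to update and then both try to insert, so a real implementation likely needs locking or a retry on a unique-key violation.</p>

```python
import sqlite3


def upsert(cur, table, pk_fields, schema=None, placeholder='%s', **fields):
    """Update the row matching pk_fields, or insert it if no row matched.

    Table and column names are interpolated directly, so they must come from
    trusted code.  Values are always passed as query parameters.
    """
    name = '%s.%s' % (schema, table) if schema else table
    others = [f for f in fields if f not in pk_fields]
    if not others:
        raise ValueError('need at least one non-key field to update')

    # First try to update the existing row.
    set_sql = ', '.join('%s = %s' % (f, placeholder) for f in others)
    where_sql = ' AND '.join('%s = %s' % (k, placeholder) for k in pk_fields)
    params = [fields[f] for f in others] + [fields[k] for k in pk_fields]
    cur.execute('UPDATE %s SET %s WHERE %s' % (name, set_sql, where_sql), params)
    if cur.rowcount > 0:
        return 'updated'

    # No row matched the key, so insert a new one.
    cols = list(fields)
    marks = ', '.join([placeholder] * len(cols))
    cur.execute('INSERT INTO %s (%s) VALUES (%s)' % (name, ', '.join(cols), marks),
                [fields[c] for c in cols])
    return 'inserted'


# Demo against an in-memory SQLite table (SQLite uses '?' placeholders).
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Scores (user_id integer PRIMARY KEY, score integer NOT NULL)')
first = upsert(cur, 'Scores', ('user_id',), placeholder='?', user_id=1, score=10)
second = upsert(cur, 'Scores', ('user_id',), placeholder='?', user_id=1, score=25)
conn.commit()
rows = cur.execute('SELECT user_id, score FROM Scores').fetchall()
```

<p>The call shape mirrors the usage example above: a cursor, the table name, the primary key field names, an optional schema, and the field values as keyword arguments.</p>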
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2011/01/postgresql-upsert-in-python/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Asynchronous URL Fetch manager for App Engine</title>
		<link>http://dound.com/2010/10/asynchronous-url-fetch-manager-for-app-engine/</link>
		<comments>http://dound.com/2010/10/asynchronous-url-fetch-manager-for-app-engine/#comments</comments>
		<pubDate>Sat, 30 Oct 2010 23:42:21 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Google App Engine]]></category>
		<category><![CDATA[asynchronous]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[urlfetch]]></category>

		<guid isPermaLink="false">http://dound.com/?p=482</guid>
		<description><![CDATA[App Engine&#8217;s URL Fetch API supports fetching URLs asynchronously.  However, a request handler may only simultaneously fetch up to 10 URLs.  To fetch more than 10, it must wait for one to finish before starting another. This is a little tricky to do efficiently*, so I put together a Python module which takes care of the [...]]]></description>
			<content:encoded><![CDATA[<p>App Engine&#8217;s <a href="http://code.google.com/appengine/docs/python/urlfetch/overview.html">URL Fetch API</a> supports fetching URLs asynchronously.  However, a request handler may only <em>simultaneously</em> fetch up to 10 URLs.  To fetch more than 10, it must wait for one to finish before starting another.  This is a little tricky to do efficiently*, so I put together a Python module which takes care of the details.  The module provides an <code>AsyncURLFetchManager</code> class with a simple interface &#8211; just tell it what URLs you want and it fetches them as quickly as possible.  This interface also simplifies the starting of an asynchronous request into a single method call:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;">fetch_asynchronously<span style="color: black;">&#40;</span>url<span style="color: black;">&#41;</span></pre></div></div>

<p>You can also pass <code lang="python">fetch_asynchronously()</code> any arguments which <a href="http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html#The_make_fetch_call_Function"><code>urlfetch.make_fetch_call()</code></a> accepts (e.g., <code>method</code>, <code>payload</code>).  You can also ask it for a callback which will conveniently include the <a href="http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html#The_RPC_Object">RPC object</a> (which contains the results) as well as any other positional or keyword arguments you would like.</p>
<p>At the end of your request, just call <code>wait()</code> to ensure that any pending fetches and their callbacks complete before the request handler terminates.</p>
<p><script src="http://gist.github.com/655879.js?file=async_urlfetch_manager.py"></script></p>
<p>* Unfortunately, App Engine does not currently provide <code>select()</code> or any other non-blocking mechanism which can check if an RPC has completed.  Once it does, this implementation could be improved to ensure that it only waits on an RPC which has already completed (currently we just have to wait on the oldest one &#8211; this is sub-optimal since later RPCs may actually finish first).</p>
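<p>The core of the approach (cap the number of fetches in flight, and block on the oldest one when the cap is reached) can be sketched without App Engine at all.  This is an illustration under assumptions, not the module from the gist: <code>BoundedFetchManager</code>, <code>start_fetch</code> and <code>FakeRPC</code> are hypothetical names, and callbacks are left out.  On App Engine, <code>start_fetch</code> would presumably create an RPC with <code>urlfetch.create_rpc()</code> and start it with <code>urlfetch.make_fetch_call()</code>.</p>

```python
from collections import deque


class BoundedFetchManager(object):
    """Keeps at most max_in_flight fetches outstanding at once."""

    def __init__(self, start_fetch, max_in_flight=10):
        self.start_fetch = start_fetch   # callable(url) -> object with wait()
        self.max_in_flight = max_in_flight
        self.pending = deque()           # the oldest fetch is on the left

    def fetch_asynchronously(self, url):
        # At the limit: block on the oldest fetch to free up a slot.
        if len(self.pending) >= self.max_in_flight:
            self.pending.popleft().wait()
        rpc = self.start_fetch(url)
        self.pending.append(rpc)
        return rpc

    def wait(self):
        # Drain everything still outstanding, oldest first.
        while self.pending:
            self.pending.popleft().wait()


class FakeRPC(object):
    """Stand-in for a real RPC object; records the order of wait() calls."""

    def __init__(self, url, log):
        self.url = url
        self.log = log

    def wait(self):
        self.log.append(self.url)


finished = []
manager = BoundedFetchManager(lambda url: FakeRPC(url, finished), max_in_flight=2)
for u in ['a', 'b', 'c']:
    manager.fetch_asynchronously(u)
in_flight_after_three = len(manager.pending)
finished_before_wait = list(finished)
manager.wait()
```

<p>With a limit of 2, starting the third fetch forces a wait on the first one, which is exactly the sub-optimal "wait on the oldest" behavior described in the footnote.</p>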
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2010/10/asynchronous-url-fetch-manager-for-app-engine/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Rate limiting users requests on app engine (optionally with Captchas)</title>
		<link>http://dound.com/2010/06/rate-limiting-gae-with-captchas/</link>
		<comments>http://dound.com/2010/06/rate-limiting-gae-with-captchas/#comments</comments>
		<pubDate>Sun, 13 Jun 2010 22:59:55 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Google App Engine]]></category>
		<category><![CDATA[bot]]></category>
		<category><![CDATA[captcha]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[rate limit]]></category>

		<guid isPermaLink="false">http://dound.com/?p=447</guid>
		<description><![CDATA[You may have some functionality on your app engine site that you want to protect from robots and prevent users from executing too frequently. For example, perhaps users can leave comments but you only want them to be able to leave a comment every N seconds &#8211; faster than that and the &#8220;user&#8221; is either [...]]]></description>
			<content:encoded><![CDATA[<p>You may have some functionality on your app engine site that you want to protect from robots and prevent users from executing too frequently.  For example, perhaps users can leave comments but you only want them to be able to leave a comment every <code>N</code> seconds &#8211; faster than that and the &#8220;user&#8221; is either a bot or is not using the system as intended.</p>
<p>One way to discourage this behavior is to limit how often a user can take a certain action to a fixed rate.  I&#8217;ve created a <code>RateLimiter</code> class which handles the logic of tracking how quickly a user is making requests, and determines when your code (optionally) should challenge the user with a captcha before allowing them to continue.  If you simply want to rate limit the user&#8217;s requests, you can ignore the captcha business and just return an error to the user whenever they exceed the allowed rate.</p>
<p>The source is available at <a href="http://gist.github.com/437051#file_rate_limit.py">http://gist.github.com/437051</a> (including the optional captcha handling code).</p>
<p><strong>Example Usage:</strong><br />
The example code below shows a rate limiter which allows a user to interact with a particular page once every 2 seconds.  It also gives the user 3 &#8220;tokens&#8221; which allow the user to violate this limit by up to 3 requests.  A token is consumed each time a user makes a request within 2 seconds of the previous request.  Tokens are returned if the user slows down or solves a captcha.</p>
<p>This example is written as if the request is expected to be made via JavaScript on your page.  The client-side JavaScript would check the response for the <code>'captcha-show'</code> text and prompt the user with a captcha if that text is present.  When the captcha is answered, another AJAX call would be made to send the user&#8217;s response to the <code>CaptchaHandler</code> class in <a href="http://gist.github.com/437051#file_rate_limit.py">rate_limit.py</a>.  You are free to integrate the captcha challenge however you like.  Just call <code>RateLimiter.captcha_solved()</code> or <code>RateLimiter.rate_limit(uid, captcha_solved=True)</code> when the user meets your challenge (it doesn&#8217;t even have to be a captcha).</p>
<p><script src="http://gist.github.com/437051.js?file=example.py"></script></p>
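<p>The token accounting described above can be sketched in a few lines of plain Python.  This is a simplified in-memory illustration, not the <code>RateLimiter</code> class from the gist: the class name <code>TokenRateLimiter</code>, the explicit <code>now</code> argument, and the refund policy (one token back per well-spaced request) are all assumptions made for the example.</p>

```python
class TokenRateLimiter(object):
    """Allows one request per min_interval seconds, plus a few 'free' violations."""

    def __init__(self, min_interval=2.0, max_tokens=3):
        self.min_interval = min_interval
        self.max_tokens = max_tokens
        self.state = {}   # uid -> (time of last allowed request, tokens left)

    def rate_limit(self, uid, now, captcha_solved=False):
        """Return True if the request made at time `now` should be blocked."""
        last, tokens = self.state.get(uid, (None, self.max_tokens))
        if captcha_solved:
            tokens = self.max_tokens       # solving the challenge refills tokens
        elif last is not None and now - last < self.min_interval:
            if tokens == 0:
                return True                # too fast and out of tokens
            tokens -= 1                    # too fast: spend a token
        elif last is not None:
            # The user slowed down: give one token back (assumed policy).
            tokens = min(self.max_tokens, tokens + 1)
        self.state[uid] = (now, tokens)
        return False


limiter = TokenRateLimiter(min_interval=2.0, max_tokens=3)
# Five requests half a second apart: the first is fine, the next three each
# spend a token, and the fifth is blocked.
results = [limiter.rate_limit('user-1', t) for t in [0.0, 0.5, 1.0, 1.5, 2.0]]
after_captcha = limiter.rate_limit('user-1', 2.1, captcha_solved=True)
```

<p>A request handler would return the <code>'captcha-show'</code> text (or a plain error) whenever the limiter reports that the request should be blocked.</p>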
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2010/06/rate-limiting-gae-with-captchas/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>FAST Google App Engine Sessions (and RPX integration)</title>
		<link>http://dound.com/2010/04/google-app-engine-sessions-and-rpx-integration/</link>
		<comments>http://dound.com/2010/04/google-app-engine-sessions-and-rpx-integration/#comments</comments>
		<pubDate>Mon, 12 Apr 2010 04:16:30 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Google App Engine]]></category>
		<category><![CDATA[app-engine]]></category>
		<category><![CDATA[gae-sessions]]></category>
		<category><![CDATA[OpenID]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[sessions]]></category>

		<guid isPermaLink="false">http://dound.com/?p=307</guid>
		<description><![CDATA[The Google App Engine infrastructure provides many services, but session support is not one of them. There are several Python-based session middlewares which already provide it so I considered them first (spoiler: I ended up writing my own and it is orders of magnitude faster than the alternatives: gae-sessions). Beaker is a solid implementation, but it [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://code.google.com/appengine/">Google App Engine</a> infrastructure provides many services, but session support is not one of them.  There are several Python-based session middlewares which already provide it so I considered them first (spoiler: I ended up writing my own and it is <a href="http://wiki.github.com/dound/gae-sessions/comparison-with-alternative-libraries">orders of magnitude faster than the alternatives</a>: <strong><a href="http://github.com/dound/gae-sessions">gae-sessions</a></strong>).</p>
<p><a href="http://beaker.groovie.org/">Beaker</a> is a solid implementation, but it lacks support for memcache on app engine.  This means every request must go to the datastore to fetch session data &#8211; yuck.</p>
<p><a href="http://gaeutilities.appspot.com/">gaeutilities</a> is designed for app engine and takes advantage of both memcache and the datastore.  Unfortunately, the code is a bit heavyweight &#8211; it is coupled to unrelated functionality (e.g., &#8220;flash&#8221; messaging) and it is complicated by support for options I do not need (e.g., cookie-only sessions and automatic token rotation).  Most significantly, its performance suffers from excess API calls and inefficient model storage.</p>
<p>Since I was unsatisfied with these options, I wrote my own sessions middleware, <a href="http://github.com/dound/gae-sessions">gae-sessions</a>.  It strives to be lightweight, fast (but reliable), secure, and easy to use.  I ended up with a pretty small library (200 lines of code) which met these goals.  It uses memcache (for speed) and the datastore (for reliability) but only reads and writes  when it must.  db.Model objects are efficiently stored by converting them to protobufs instead of using the automatic pickling functionality (which is slow since app engine lacks cPickle).</p>
<p>Consider <a href="http://github.com/dound/gae-sessions">gae-sessions</a> if you need sessions support for a Python web application hosted on Google&#8217;s app engine.  The project includes demo code which you can run without modification on the app engine development server.  The demo shows gae-sessions working with an <a href="http://openid.net/">OpenID</a>-based authentication system powered by <a href="http://www.rpxnow.com">RPX</a> (check it out to see how easy it is to integrate with RPX).</p>
<p><strong>Update</strong>: I&#8217;ve created an <a href="http://bit.ly/gae-sessions"><strong>in-depth comparison page</strong></a> which compares both the features and performance of alternative sessions libraries (beaker, gaeutilities, gmemsess, and suas) with gae-sessions.</p>
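<p>To make the &#8220;memcache for speed, datastore for reliability, and only read and write when it must&#8221; idea concrete, here is a small sketch of a two-tier store.  It is not taken from gae-sessions: <code>TwoTierStore</code> and the <code>dirty</code> flag are hypothetical, and plain dicts stand in for memcache and the datastore.</p>

```python
class TwoTierStore(object):
    """Reads from a fast cache first and falls back to a durable store."""

    def __init__(self, cache, durable):
        self.cache = cache       # fast, but entries may be evicted at any time
        self.durable = durable   # slower, but entries are not lost

    def load(self, sid):
        data = self.cache.get(sid)
        if data is None:
            # Cache miss: pay for the slow read, then repopulate the cache.
            data = self.durable.get(sid)
            if data is not None:
                self.cache[sid] = data
        return data

    def save(self, sid, data, dirty):
        # Skip both writes when the session was not modified.
        if not dirty:
            return False
        self.cache[sid] = data
        self.durable[sid] = data
        return True


cache, durable = {}, {}
store = TwoTierStore(cache, durable)
wrote = store.save('s1', {'user': 'dound'}, dirty=True)
cache.clear()                      # simulate the cache evicting the entry
loaded = store.load('s1')          # falls back to the durable copy
recached = 's1' in cache
skipped = store.save('s1', loaded, dirty=False)
```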
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2010/04/google-app-engine-sessions-and-rpx-integration/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Python logging and performance: how to have your cake and eat it too</title>
		<link>http://dound.com/2010/02/python-logging-performance/</link>
		<comments>http://dound.com/2010/02/python-logging-performance/#comments</comments>
		<pubDate>Sun, 07 Feb 2010 05:08:48 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[comment]]></category>
		<category><![CDATA[logging]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[preprocess]]></category>
		<category><![CDATA[script]]></category>

		<guid isPermaLink="false">http://dound.com/?p=268</guid>
		<description><![CDATA[I love Python&#8217;s logging module. I use it all the time to log a wide variety of information: messages to help me debug as well as informative messages for the user. Though you can toggle which messages you want to be printed, if the Python interpreter encounters a logging method call it still creates [...]]]></description>
			<content:encoded><![CDATA[<p>I love <a href="http://www.python.org">Python</a>&#8217;s <a href="http://docs.python.org/library/logging.html">logging module</a>.  I use it all the time to log a wide variety of information: messages to help me debug as well as informative messages for the user.  Though you can toggle which messages are printed, when the Python interpreter encounters a logging method call it still builds the string for the log message (the argument to the method), even if the message is then discarded (sadly, Python doesn&#8217;t have <a href="http://www.haskell.org/haskellwiki/Performance/Laziness">lazy evaluation</a> like <a href="http://www.haskell.org/">Haskell</a>).  If creating this string is expensive, then your application&#8217;s performance may suffer.  Unfortunately, there is no Python preprocessor (like C&#8217;s <a href="http://gcc.gnu.org/onlinedocs/cpp/">cpp</a> &#8230; though <a href="http://code.google.com/p/preprocess/">preprocess</a> might be able to do it) so it is difficult to automatically remove a large number of logging statements prior to running an application in a production environment.</p>
<p>The best solution I&#8217;ve seen is to prefix logging statements with <code>if __debug__:</code> so that they are optimized away by <code>python -O</code> (see <a href="http://stackoverflow.com/questions/2006190/python-equivalent-of-define-func-or-how-to-comment-out-a-function-call-in-p">this post on StackOverflow</a>).  I like it, but it unfortunately requires this statement to be prefixed to every logging statement I don&#8217;t want in a production environment.  That&#8217;s a lot of ugly extra code and it isn&#8217;t easy to change which statements it applies to either.</p>
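<p>A small experiment shows the cost being discussed and what a guard buys you.  The helper <code>expensive_summary()</code> and the logger name are made up for the demo.  Besides <code>if __debug__:</code>, the sketch also shows the standard library&#8217;s <code>Logger.isEnabledFor()</code> check, which is a different guard from the one in the StackOverflow post but suffers from the same per-statement clutter.</p>

```python
import logging

log = logging.getLogger('demo')
log.setLevel(logging.INFO)   # DEBUG messages are disabled for this logger

calls = []


def expensive_summary():
    """Pretend this is costly; it records every time it is invoked."""
    calls.append(1)
    return 'summary'


# Unguarded: the argument string is built before debug() gets a chance to
# discard the message, so the expensive call still happens.
log.debug('state: %s' % expensive_summary())
unguarded_calls = len(calls)

# Guard 1: the whole block is removed when the script runs under `python -O`.
if __debug__:
    log.debug('state: %s' % expensive_summary())

# Guard 2: skip the work whenever DEBUG is not enabled for this logger.
if log.isEnabledFor(logging.DEBUG):
    log.debug('state: %s' % expensive_summary())
```

<p>Run normally, the expensive function is called twice (unguarded, plus guard 1); run with <code>-O</code>, only the unguarded call remains.</p>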
<p>I decided to write a script which automatically parses a Python file and replaces logging statements of a particular level with a <code>pass</code> statement and a commented out copy of the logging code.  It can also do the reverse operation.  It has some limitations (see the code, or run the script with the <code>--help</code> option), but it should work for most Python files.  I used it for the <a href="http://yuba.stanford.edu/vns">VNS</a> project and it successfully operated on every file in the project.  It also improved performance dramatically &#8211; the maximum throughput of the VNS simulator <a href="http://yuba.stanford.edu/vns/2010/02/fairness/">increased by 25%</a>!  In comparison, running the code with <a href="http://psyco.sourceforge.net/">Psyco</a> only garnered a 6% improvement (though pretty substantial for the minimal <a href="http://github.com/dound/vns/commit/54d5415a43043062ae6195b828707692b9231aab">13 lines</a> I had to add to take advantage of it).</p>
<p>I think this script is worth using before running your code in a production environment if you are a heavy user of the logging module like I am.  You can find the code <a href="http://www.siafoo.net/snippet/348">here</a> (it is hosted on <a href="http://www.siafoo.com">Siafoo</a>, a neat site for sharing code).  Here&#8217;s the latest version of the code:<br />
<script type='text/javascript' src='http://www.siafoo.net/snippet/348/embed.js?nolinenos'></script></p>
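<p>In case the embedded snippet does not load, the basic transformation can be illustrated with a much-reduced sketch.  This is not the script itself: it only handles single-line calls written as <code>logging.debug(...)</code>, and the function names <code>disable_logging</code> and <code>enable_logging</code> are invented here.  Replacing the call with <code>pass</code> keeps a block valid even when the logging call was its only statement.</p>

```python
import re


def disable_logging(source, level='debug'):
    """Replace single-line logging calls of the given level with `pass`,
    keeping the original statement in a trailing comment."""
    pattern = re.compile(r'^(\s*)(logging\.%s\(.*\))\s*$' % re.escape(level))
    out = []
    for line in source.splitlines():
        m = pattern.match(line)
        out.append('%spass  # %s' % m.groups() if m else line)
    return '\n'.join(out)


def enable_logging(source, level='debug'):
    """Reverse operation: restore statements commented out by disable_logging."""
    pattern = re.compile(r'^(\s*)pass  # (logging\.%s\(.*\))$' % re.escape(level))
    out = []
    for line in source.splitlines():
        m = pattern.match(line)
        out.append('%s%s' % m.groups() if m else line)
    return '\n'.join(out)


sample = "def f(x):\n    logging.debug('x=%s' % x)\n    return x"
disabled = disable_logging(sample)
restored = enable_logging(disabled)
```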
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2010/02/python-logging-performance/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Integrating Twisted with a pcap-based Python packet sniffer</title>
		<link>http://dound.com/2009/09/integrating-twisted-with-a-pcap-based-python-packet-sniffer/</link>
		<comments>http://dound.com/2009/09/integrating-twisted-with-a-pcap-based-python-packet-sniffer/#comments</comments>
		<pubDate>Wed, 09 Sep 2009 00:53:34 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[pcap]]></category>
		<category><![CDATA[pcapy]]></category>
		<category><![CDATA[raw socket]]></category>
		<category><![CDATA[sniffer]]></category>
		<category><![CDATA[twisted]]></category>

		<guid isPermaLink="false">http://dound.com/?p=223</guid>
		<description><![CDATA[Twisted is an awesome event-driven networking engine. Unfortunately, it does not have good support for interfacing with raw sockets (unlike its support for many network protocols, which is amazing). Anyway, I recently needed to work with raw sockets so I had to find a way to make it work with Twisted. Though Twisted does have [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://twistedmatrix.com/">Twisted</a> is an awesome event-driven networking engine.  Unfortunately, it does not have good support for interfacing with <a href="http://en.wikipedia.org/wiki/Raw_socket">raw sockets</a> (unlike its <a href="http://twistedmatrix.com/documents/8.1.0/api/twisted.protocols.html">support</a> for many network protocols, which is amazing).  Anyway, I recently needed to work with raw sockets so I had to find a way to make it work with Twisted.  Though Twisted does have a module (<a href="http://twistedmatrix.com/trac/wiki/TwistedPair">twisted.pair</a>) which tries to provide some support for raw sockets, the module is poorly documented and requires a library which is not readily available.</p>
<p>Luckily, I stumbled on a module which works on top of the <a href="http://www.tcpdump.org/">libpcap</a> packet capture library called <a href="http://oss.coresecurity.com/projects/pcapy.html">pcapy</a>.  It is simple to use, and thread-safe &#8212; and easy to integrate into a Twisted-based project.</p>
<p>I put together a short sample (see below) which shows how to capture raw packets alongside the main Twisted event loop.  It would be trivial to extend this example to also write to a raw socket (using an ordinary <a href="http://docs.python.org/library/socket.html">Python socket</a>).  This example can also be downloaded <a href="http://dound.com/wp/files/twisted_and_pcap_together.py">here</a>.</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #808080; font-style: italic;"># This sample shows how to run a libpcap-based packet sniffer concurrently with</span>
<span style="color: #808080; font-style: italic;"># the Twisted framework.  The Twisted component is an &quot;Echo&quot; TCP server</span>
<span style="color: #808080; font-style: italic;"># (listening on port 9999) which prints everything it receives.  When a client</span>
<span style="color: #808080; font-style: italic;"># connects, it starts the pcap thread.  When the pcap thread receives a packet,</span>
<span style="color: #808080; font-style: italic;"># it sends a message to the client telling it the size of the received packet.</span>
<span style="color: #808080; font-style: italic;"># Finally, when the client disconnects the program is terminated.</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># To try this contrived example out, run this script as root (so that it can use</span>
<span style="color: #808080; font-style: italic;"># pcap) and then connect to the echo server (e.g., telnet localhost 9999).  Note</span>
<span style="color: #808080; font-style: italic;"># that the pcap parameters are hard-coded.  This code uses twisted 8.0.2 and</span>
<span style="color: #808080; font-style: italic;"># pcapy-0.10.4.</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">from</span> pcapy <span style="color: #ff7700;font-weight:bold;">import</span> open_live
<span style="color: #ff7700;font-weight:bold;">from</span> twisted.<span style="color: black;">internet</span>.<span style="color: black;">protocol</span> <span style="color: #ff7700;font-weight:bold;">import</span> Protocol, Factory
<span style="color: #ff7700;font-weight:bold;">from</span> twisted.<span style="color: black;">internet</span> <span style="color: #ff7700;font-weight:bold;">import</span> reactor
&nbsp;
<span style="color: #808080; font-style: italic;"># pcap settings</span>
DEV          = <span style="color: #483d8b;">'eth0'</span>  <span style="color: #808080; font-style: italic;"># interface to listen on</span>
MAX_LEN      = <span style="color: #ff4500;">1514</span>    <span style="color: #808080; font-style: italic;"># max size of packet to capture</span>
PROMISCUOUS  = <span style="color: #ff4500;">1</span>       <span style="color: #808080; font-style: italic;"># promiscuous mode?</span>
READ_TIMEOUT = <span style="color: #ff4500;">100</span>     <span style="color: #808080; font-style: italic;"># in milliseconds</span>
PCAP_FILTER  = <span style="color: #483d8b;">''</span>      <span style="color: #808080; font-style: italic;"># empty =&gt; get everything (or we could use a BPF filter)</span>
MAX_PKTS     = -<span style="color: #ff4500;">1</span>      <span style="color: #808080; font-style: italic;"># number of packets to capture; -1 =&gt; no limit</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">def</span> run_pcap<span style="color: black;">&#40;</span>f<span style="color: black;">&#41;</span>:
    <span style="color: #808080; font-style: italic;"># the method which will be called when a packet is captured</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> ph<span style="color: black;">&#40;</span>hdr, data<span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'pcap heard: when=%s sz=%dB'</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span>hdr.<span style="color: black;">getts</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>, <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
        <span style="color: #808080; font-style: italic;"># thread safety: call from the main twisted event loop</span>
        reactor.<span style="color: black;">callFromThread</span><span style="color: black;">&#40;</span>f, <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #808080; font-style: italic;"># start the packet capture</span>
    p = open_live<span style="color: black;">&#40;</span>DEV, MAX_LEN, PROMISCUOUS, READ_TIMEOUT<span style="color: black;">&#41;</span>
    p.<span style="color: black;">setfilter</span><span style="color: black;">&#40;</span>PCAP_FILTER<span style="color: black;">&#41;</span>
    <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;Listening on %s: net=%s, mask=%s&quot;</span> <span style="color: #66cc66;">%</span> <span style="color: black;">&#40;</span>DEV, p.<span style="color: black;">getnet</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>, p.<span style="color: black;">getmask</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
    p.<span style="color: black;">loop</span><span style="color: black;">&#40;</span>MAX_PKTS, ph<span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># a silly echo server which prints what it receives and sends info about the</span>
<span style="color: #808080; font-style: italic;"># size of each packet captured on DEV</span>
<span style="color: #ff7700;font-weight:bold;">class</span> Echo<span style="color: black;">&#40;</span>Protocol<span style="color: black;">&#41;</span>:
    <span style="color: #ff7700;font-weight:bold;">def</span> connectionLost<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, reason<span style="color: black;">&#41;</span>:
        <span style="color: #dc143c;">os</span>._exit<span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># kill the whole process</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> connectionMade<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #808080; font-style: italic;"># run pcap in another thread (it will run forever)</span>
        reactor.<span style="color: black;">callInThread</span><span style="color: black;">&#40;</span>run_pcap, <span style="color: #008000;">self</span>.<span style="color: black;">pcapDataReceived</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> dataReceived<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, data<span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'echo got: %s'</span> <span style="color: #66cc66;">%</span> data
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> pcapDataReceived<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, sz<span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">transport</span>.<span style="color: black;">write</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'pcap got: %uB<span style="color: #000099; font-weight: bold;">\n</span>'</span> <span style="color: #66cc66;">%</span> sz<span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># starts the silly echo server on port 9999</span>
<span style="color: #ff7700;font-weight:bold;">def</span> main<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
    factory = Factory<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
    factory.<span style="color: black;">protocol</span> = Echo
    reactor.<span style="color: black;">listenTCP</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">9999</span>, factory<span style="color: black;">&#41;</span>
    reactor.<span style="color: black;">run</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">if</span> __name__ == <span style="color: #483d8b;">&quot;__main__&quot;</span>:
    main<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://dound.com/2009/09/integrating-twisted-with-a-pcap-based-python-packet-sniffer/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Python + Twisted Length-Type-Based Protocol Client / Server</title>
		<link>http://dound.com/2009/03/python-and-twisted-length-type-based-protocol-client-server/</link>
		<comments>http://dound.com/2009/03/python-and-twisted-length-type-based-protocol-client-server/#comments</comments>
		<pubDate>Sun, 08 Mar 2009 09:25:12 +0000</pubDate>
		<dc:creator>David Underhill</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[WordPress]]></category>
		<category><![CDATA[client]]></category>
		<category><![CDATA[protocol]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[tcp]]></category>
		<category><![CDATA[twisted]]></category>

		<guid isPermaLink="false">http://dound.com/?p=108</guid>
		<description><![CDATA[I often have a need to work with a simple TCP protocol whose messages have a header which starts with the length of the message and an integer representing the message type.  To save myself the trouble of creating and debugging a very similar custom implementation each time I have this need, I decided to package it as a simple <a href="http://www.python.org">Python</a> framework which does this for me.  It is based on the event-driven <a href="http://www.twistedmatrix.com">Twisted</a> networking engine.]]></description>
			<content:encoded><![CDATA[<p>It seems like I often have a need to work with a simple TCP protocol whose messages have a header which starts with the length of the message and an integer representing the message type (<a href="http://openflowswitch.org/">OpenFlow</a> is one of many such protocols).  To save myself the trouble of creating and debugging a very similar custom implementation each time I have this need, I decided to package it as a simple <a href="http://www.python.org">Python</a> framework which does this for me.  It is based on the event-driven <a href="http://www.twistedmatrix.com">Twisted</a> networking engine.  Using this simple extension on top of Twisted has a number of benefits:</p>
<ol>
<li>Automatic handling of the length and type fields when sending and receiving messages.</li>
<li>Automatic unpacking of messages based on type.</li>
<li>Client automatically tries to reconnect if the connection is lost.</li>
<li>Server can handle any number of clients simultaneously.</li>
</ol>
<p>You can view the official package on the <a href="http://pypi.python.org">PyPI</a> website <a href="http://pypi.python.org/pypi/ltprotocol">here</a>.  My local page for the package is <a href="http://dound.com/projects/python/ltprotocol/">here</a>; please <a href="http://dound.com/projects/python/ltprotocol/">view it</a> for an example of how to use this package.</p>
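<p>For a feel of what &#8220;automatic handling of the length and type fields&#8221; involves, here is a standalone sketch of length-type framing using only the <code>struct</code> module.  It does not use ltprotocol or Twisted, and the header layout (a 2-byte length that includes the header, followed by a 1-byte type) is an assumption chosen for the example.  <code>pack_message</code> and <code>MessageBuffer</code> are hypothetical names.</p>

```python
import struct

# Assumed header: 2-byte big-endian total length, then a 1-byte message type.
HEADER = struct.Struct('!HB')


def pack_message(msg_type, body):
    """Prefix body with a header whose length field covers header plus body."""
    return HEADER.pack(HEADER.size + len(body), msg_type) + body


class MessageBuffer(object):
    """Accumulates bytes from a stream and splits them into whole messages."""

    def __init__(self):
        self.buf = b''

    def feed(self, data):
        """Add received bytes; return the complete (type, body) messages."""
        self.buf += data
        messages = []
        while len(self.buf) >= HEADER.size:
            length, msg_type = HEADER.unpack(self.buf[:HEADER.size])
            if length < HEADER.size:
                raise ValueError('bad length field: %d' % length)
            if len(self.buf) < length:
                break   # the rest of this message has not arrived yet
            messages.append((msg_type, self.buf[HEADER.size:length]))
            self.buf = self.buf[length:]
        return messages


stream = pack_message(1, b'hello') + pack_message(2, b'')
receiver = MessageBuffer()
part1 = receiver.feed(stream[:4])   # header plus one byte of the first body
part2 = receiver.feed(stream[4:])   # the remainder arrives in a second chunk
```

<p>In a Twisted protocol, <code>feed()</code> would be called from <code>dataReceived()</code>, since TCP delivers a byte stream and a single call may carry a partial message or several messages at once.</p>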
]]></content:encoded>
			<wfw:commentRss>http://dound.com/2009/03/python-and-twisted-length-type-based-protocol-client-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

