<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Bugfree.dk - Ronnie Holm&#039;s blog &#187; Ruby</title>
	<atom:link href="http://www.bugfree.dk/blog/tag/ruby/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.bugfree.dk/blog</link>
	<description>Not anti-anything, just pro-quality</description>
	<lastBuildDate>Mon, 28 Nov 2011 07:32:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Essential requirements for a developer automation tool</title>
		<link>http://www.bugfree.dk/blog/2010/09/26/essential-requirements-for-a-developer-automation-tool/</link>
		<comments>http://www.bugfree.dk/blog/2010/09/26/essential-requirements-for-a-developer-automation-tool/#comments</comments>
		<pubDate>Sun, 26 Sep 2010 11:19:26 +0000</pubDate>
		<dc:creator>Ronnie Holm</dc:creator>
				<category><![CDATA[.Net]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[SharePoint]]></category>
		<category><![CDATA[Windows]]></category>
		<category><![CDATA[Powershell]]></category>
		<category><![CDATA[Process]]></category>
		<category><![CDATA[Productivity]]></category>
		<category><![CDATA[Tool]]></category>

		<guid isPermaLink="false">http://www.bugfree.dk/blog/?p=1058</guid>
		<description><![CDATA[Developer automation is a subject that I&#8217;ve always felt passionate about. Unit testing may be the most common example, but other tasks may include check-out of source code, build, deploy, setup, and warm-up of an application. I may even want one system to rule them all and have the same automation drive continuous integration. Whatever the [...]]]></description>
			<content:encoded><![CDATA[<p>Developer automation is a subject that I&#8217;ve always felt passionate about. Unit testing may be the most common example, but other tasks may include check-out of source code, build, deploy, setup, and warm-up of an application. I may even want <a href="http://simpleprogrammer.com/2010/09/03/one-build-to-rule-them-all">one system to rule them all</a> and have the same automation drive <a href="http://martinfowler.com/articles/continuousIntegration.html">continuous integration</a>. Whatever the use, to fully reap the benefits of automation, a developer should master at least one automation tool the same way he masters other developer tools. This and later posts capture my experiences with a few such automation tools centered around Windows and .NET, starting with why the ubiquitous <a href="http://en.wikipedia.org/wiki/Batch_files">batch file</a> is best avoided and how to characterize better solutions in terms of it.</p>
<p>Although Visual Studio, in tandem with the <a href="http://en.wikipedia.org/wiki/Msbuild">MSBuild engine</a>, generally takes good care of compilation, I rarely want to rely on it solely. Instead, I’d prefer that any developer task be easily run from the command-line. This ensures that no magic is going on, that the task is kept simple and flexible, and that it doesn’t rely on Visual Studio to work. The challenge, however, is which of the many tools available to pick. It must be one that’s flexible enough to meet most requirements with relative ease or the automation will likely not be a valid documentation medium in itself.</p>
<h4>Why not to use Windows batch files</h4>
<p>As a general example, consider the deployment of a <a href="http://www.bugfree.dk/blog/2010/03/31/getting-started-with-sharepoint-presentation">SharePoint 2007 solution</a>. With SharePoint, a good deal more than compiling is needed to bring new functionality into an installation. Whereas Visual Studio and MSBuild may still compile the code and <a href="http://wspbuilder.codeplex.com">WSPBuilder</a> create the WSP installation package, both from within Visual Studio and from the command-line, getting everything set up cries out for automation, even locally. A common approach (possibly inspired by <a href="http://www.amazon.com/Microsoft-Windows-SharePoint-Services-Developer/dp/0735623201/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1285960936&amp;sr=8-1">popular</a> SharePoint 2007 literature) is that of the batch file applying various operations to a WSP. For the sake of brevity, I&#8217;ve kept the example short, but imagine extending it with a check-out source code task, a compilation task, a WSPBuilder task, a feature deactivation and activation task, and possibly a warm-up task. Add to this a master script that orchestrates the whole thing. Eventually, I’d end up with scripts that become increasingly hard to maintain as they grow in number and size.</p>
<pre class="prettyprint lang-sh">    @set STSADM=&quot;c:\program files\common files\microsoft shared\web server extensions\12\bin\stsadm&quot;

    if &quot;%1&quot; == &quot;uninstall&quot; goto uninstall
    if &quot;%1&quot; == &quot;install&quot; goto install
    if &quot;%1&quot; == &quot;&quot; goto reinstall
    rem Unknown argument: skip everything instead of falling through to :uninstall
    goto end

    :uninstall
        %STSADM% -o retractsolution -name Foo.wsp -immediate
        %STSADM% -o execadmsvcjobs
        %STSADM% -o deletesolution -name Foo.wsp -override
        goto end

    :install
        %STSADM% -o addsolution -filename Foo.wsp
        %STSADM% -o deploysolution -name Foo.wsp -immediate -allowGacDeployment -force
        %STSADM% -o execadmsvcjobs
        goto end

    :reinstall
        call Foo uninstall
        call Foo install
        goto end

    :end</pre>
<p>Still, because of the ubiquitous nature of command.com and cmd.exe, the batch file interpreters, batch files are everywhere, even though the technology is a leftover from the days of MS-DOS and hasn&#8217;t evolved much since. Not only are the branching and looping constructs limited, so are the <a href="http://technet.microsoft.com/en-us/library/dd560674(WS.10).aspx">available commands</a>. Suppose I want to find out if a command was indeed successful. This turns out to be really hard when all I have to work with is the <a href="http://en.wikipedia.org/wiki/Errorlevel">errorlevel</a> of the most recently executed command. Assuming the command adheres to the errorlevel convention, for the script to fail as early and as close to the real error as possible, I’d have to inspect the errorlevel after each command, causing the batch file to grow quite verbose. Sadly, batch files lack the equivalent of <a href="http://www.davidpashley.com/articles/writing-robust-shell-scripts.html#id2399158">set -o errexit</a> of <a href="http://en.wikipedia.org/wiki/Bash_(Unix_shell)">Bash</a>, where the interpreter checks for success after each command and aborts immediately on error. Relying solely on the errorlevel is oftentimes insufficient anyway. To determine success, I’d typically have to parse actual command output or inspect some system property by downloading additional commands or building my own.</p>
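<p>As a rough illustration of the fail-fast behaviour that batch files lack, here is a minimal Ruby sketch. The names run!, run_all!, and CommandFailed are made up for this example; only Kernel#system and $? come from Ruby itself.</p>

```ruby
# Raised when a command cannot be started or exits with a non-zero status.
class CommandFailed < StandardError; end

# Runs one command and raises unless it exits with status 0.
# Passing the program and its arguments separately bypasses the shell.
def run!(*command)
  ok = system(*command)
  return true if ok

  reason = ok.nil? ? "could not be started" : "exit status #{$?.exitstatus}"
  raise CommandFailed, "#{command.join(' ')} failed (#{reason})"
end

# Runs the commands in order and stops at the first failure.
def run_all!(commands)
  commands.each { |command| run!(*command) }
end
```

<p>With helpers like these, the install branch of the batch file could become a list of stsadm invocations handed to run_all!, and the script would stop at the first one reporting a non-zero exit status. This still assumes the command follows the errorlevel convention, which is not always enough to determine success.</p>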
<h4>Essential vs. incidental requirements</h4>
<p>Unless I plan to sit idle and stare at the screen and be a human error detector while the batch file runs, I think it’s safe to say that it’s mostly unsuitable for developer automation. Hence in late 2005 I <a href="http://www.bugfree.dk/blog/2006/01/04/being-a-functional-pythonian">rolled my own automation tool in Python</a>. With Python or Ruby or a similar dynamic language acting as the glue that ties everything together, almost any task can be automated in a robust way. Of course, I could also automate with a static language like C# (it&#8217;s surprisingly common). But for script-like tasks, I don’t particularly fancy the long cycle of editing source code in Visual Studio, compiling it, deploying it, and having a hard time debugging it in an environment without Visual Studio. A dynamic language, on the other hand, short-circuits the development cycle and allows for interactive programming through a <a href="http://en.wikipedia.org/wiki/REPL">REPL</a>.</p>
<p>With a dynamic language that interacts with .NET, such as <a href="http://en.wikipedia.org/wiki/Ironpython">IronPython</a>, <a href="http://en.wikipedia.org/wiki/IronRuby">IronRuby</a>, or <a href="http://en.wikipedia.org/wiki/Powershell">Powershell</a>, possibly with supporting DSLs like <a href="http://www.blueskyonmars.com/projects/paver">Paver</a>, <a href="http://en.wikipedia.org/wiki/Rake_(software)">Rake</a>, or <a href="http://en.wikipedia.org/wiki/Psake">psake</a> on top, the need for writing custom commands to interact with the operating system or the application almost vanishes. The question, then, is which of the dynamic languages to go with when at their technical core they’re so much alike. Besides sharing the concept of a REPL, the notion of a tuple, a list, a map, and operations on each are baked into their syntax, making code quite terse. It even makes it convenient to express any configuration in the language itself and not in XML where I’d first have to come up with a schema, and then create an instance of it before parsing it. On Windows, however, Powershell is gradually becoming the next ubiquitous interpreter, with applications shipping with cmdlets, whereas IronPython or IronRuby is a separate download.</p>
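<p>To illustrate the point about keeping configuration in the language itself, here is a small Ruby sketch. The keys, the site URLs, and the backup name are invented for the example and don&#8217;t come from a real deployment.</p>

```ruby
# Hypothetical deployment settings written as a plain Ruby hash.
SOLUTION = "Foo.wsp"

CONFIG = {
  solution: SOLUTION,
  stsadm:   'c:\program files\common files\microsoft shared\web server extensions\12\bin\stsadm',
  sites:    ["http://localhost", "http://localhost:8080"],
  # Values can be computed instead of repeated.
  backup:   SOLUTION.sub(/\.wsp\z/, ".bak")
}

# The configuration is ordinary data, so no schema or parser is needed.
CONFIG[:sites].each do |site|
  puts "Would deploy #{CONFIG[:solution]} to #{site}"
end
```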
<h4>Conclusion</h4>
<p>No matter the tool, what I end up doing is defining tasks, forming dependencies between them, and having the tool execute the tasks in an order that satisfies their dependencies. As the tool traverses the dependency graph and executes tasks, it’s up to each task to detect success, and up to the tool to report on progress. A good tool is therefore characterized by the ease with which I can express these things, the available language constructs, the ease of debugging, and the tool’s ability to converse in foreign languages.</p>
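<p>The idea of tasks, dependencies, and ordered execution can be sketched in a few lines of Ruby. This is only a toy with made-up names (TaskRunner, task, run), not how Rake, Paver, or psake is implemented; the task bodies are placeholders that simply report success.</p>

```ruby
# A minimal task runner: tasks are named blocks with prerequisites.
class TaskRunner
  Task = Struct.new(:name, :deps, :action)

  attr_reader :log

  def initialize
    @tasks = {}
    @log = []   # names of tasks in the order they ran
  end

  # Registers a task; deps lists the tasks that must run first.
  def task(name, deps = [], &action)
    @tasks[name] = Task.new(name, deps, action)
  end

  # Runs a task after its prerequisites, each task at most once.
  def run(name, done = {}, visiting = {})
    return if done[name]
    raise ArgumentError, "unknown task: #{name}" unless @tasks.key?(name)
    raise ArgumentError, "circular dependency at #{name}" if visiting[name]

    visiting[name] = true
    task = @tasks[name]
    task.deps.each { |dep| run(dep, done, visiting) }
    visiting.delete(name)

    # A task signals failure by returning false or by raising.
    ok = task.action ? task.action.call : true
    raise "task #{name} failed" if ok == false
    done[name] = true
    @log << name
    puts "done: #{name}"
  end
end

runner = TaskRunner.new
runner.task(:checkout)                      { true }
runner.task(:compile, [:checkout])          { true }
runner.task(:package, [:compile])           { true }
runner.task(:deploy,  [:package, :compile]) { true }
runner.run(:deploy)
```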
]]></content:encoded>
			<wfw:commentRss>http://www.bugfree.dk/blog/2010/09/26/essential-requirements-for-a-developer-automation-tool/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>User space traffic shaping with Ruby</title>
		<link>http://www.bugfree.dk/blog/2008/04/12/user-space-traffic-shaping-with-ruby/</link>
		<comments>http://www.bugfree.dk/blog/2008/04/12/user-space-traffic-shaping-with-ruby/#comments</comments>
		<pubDate>Sat, 12 Apr 2008 16:13:39 +0000</pubDate>
		<dc:creator>Ronnie Holm</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Networking]]></category>

		<guid isPermaLink="false">http://www.bugfree.dk/blog/2008/04/12/user-space-traffic-shaping-with-ruby/</guid>
		<description><![CDATA[Download Netwatch-1.0.zip. In my Kernel space traffic shaping with Linux post, I came to the conclusion that none of the traffic shaping algorithms within the Linux kernel was suitable for my needs. The effects of running the traffic shaping algorithms were too hard to quantify and coming up with the right set of parameters to [...]]]></description>
			<content:encoded><![CDATA[<p>
Download <a href="/software/Netwatch-1.0.zip">Netwatch-1.0.zip</a>.
</p>
<p>
In my <a href="2007/11/18/kernel-space-traffic-shaping-with-linux">Kernel space traffic shaping with Linux</a> post, I came to the conclusion that none of the traffic shaping algorithms within the Linux kernel was suitable for my needs. The effects of running the traffic shaping algorithms were too hard to quantify and coming up with the right set of parameters to go with each algorithm was challenging.
</p>
<p>
So I decided to come up with my own shaping algorithm, running in user space because it&#8217;s simpler working from there. I also wanted to use Ruby to learn the language and because of Ruby&#8217;s good properties as an integration platform. Lastly, I wanted the shaper to perform deferred shaping based on the traffic patterns observed over a period of hours or days rather than the more or less immediate shaping carried out by the kernel-based algorithms.
</p>
<p>
Before getting into the details, I should mention that these concepts have been successfully applied to shaping traffic on our 100/100 Mbps Internet connection, shared by some 330 apartments, for well over a year. We&#8217;re no longer experiencing issues with network congestion and now that we use a payload-agnostic shaper, we no longer have to combat the ever more sophisticated attempts of P2P software to camouflage its traffic.</p>
<h4>Architectural overview</h4>
<p>
At a high level, the shaper is composed of a number of subsystems. At the top are the WRR scheduler, the ARP cache, and the RRD database system, which retrieve and store metadata about computers and their traffic.
</p>
<p>
<img src="http://www.bugfree.dk/blog/wp-content/uploads/2008/04/shaperoverview.png" alt="shaperoverview.png" />
</p>
<p>
Based on input from these subsystems, the shaper&#8217;s decision engine evaluates each computer&#8217;s bandwidth usage against a set of rules. Should at least one rule be violated, e.g., too much traffic over some defined period of time, the shaper calls out to another subsystem that determines how to handle the violation. In this case Netfilter is called upon to take action, blocking the computer from accessing the Internet and redirecting it to an information page.
</p>
<h4>Weighted Round Robin scheduler</h4>
<p>
Within the shaper, kernel and user space form a symbiosis through the WRR scheduler. As part of WRR&#8217;s inner workings, the scheduler counts the bytes transmitted on a per-IP basis. So although we don&#8217;t use WRR for shaping, per se, we do use it to track the byte counters of each computer.
</p>
<p>
Getting WRR to reveal this information is done through the tc (for traffic control) command. As outlined in the <a href="2007/11/18/kernel-space-traffic-shaping-with-linux">Kernel space traffic shaping with Linux</a> post, only outgoing traffic can be shaped by WRR (any algorithm really). Hence, tc is called once for eth0 and once for eth1, and by parsing the results, we know how much traffic has entered and exited each computer between this and the previous call to tc.
</p>
<p>
For each computer the output has the form below. Of particular interest are the address and the bytes fields:
</p>
<pre>
> tc class show dev eth0
class wrr 8001:1fb parent 8001:
  (address: 192.168.1.231)
  (total weight: 0.872749) (current position: 4) (counters: 1 2 : 3 4)
  Pars 1: (weight 0.872749) (decr: 1e-10) (incr: 7.5e-11) (min: 0.1) (max: 1)
  Pars 2: (weight 1) (decr: 0) (incr: 0) (min: 1) (max: 1)
  (bytes: 4546184) (packets: 55373)
...
</pre>
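<p>
Extracting the address and bytes fields from output like the above could look roughly as follows. This is a sketch and not the code shipped in Netwatch: parse_tc is a made-up name, and the second class in the sample (8001:1fc) is invented to show more than one computer.
</p>

```ruby
# Sample modelled on the tc output above; the second class is made up.
TC_OUTPUT = [
  "class wrr 8001:1fb parent 8001:",
  "  (address: 192.168.1.231)",
  "  (total weight: 0.872749) (current position: 4) (counters: 1 2 : 3 4)",
  "  (bytes: 4546184) (packets: 55373)",
  "class wrr 8001:1fc parent 8001:",
  "  (address: 192.168.3.117)",
  "  (bytes: 1024) (packets: 8)"
].join("\n")

# Returns a hash mapping each IP address to the byte counter reported for it.
def parse_tc(output)
  counters = {}
  address = nil
  output.each_line do |line|
    if line =~ /\(address: (\d+\.\d+\.\d+\.\d+)\)/
      address = $1
    elsif address && line =~ /\(bytes: (\d+)\)/
      counters[address] = $1.to_i
      address = nil   # wait for the next class before reading more bytes
    end
  end
  counters
end

p parse_tc(TC_OUTPUT)
```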
<p>
The address is dynamically assigned through DHCP and is therefore subject to change. Also, the byte counters aren&#8217;t retained across restarts, so we need to draw on additional subsystems to align the WRR output with a computer&#8217;s unique identity across time.
</p>
<h4>Address Resolution Protocol</h4>
<p>
In a DHCP based environment the IP address of a machine may change over time. So to ensure that traffic is always attributed to the correct physical machine, the WRR byte counters aren&#8217;t tied directly to the IP address when stored. Instead, we use the ARP cache to look up the corresponding MAC address, which is assumed to be static.
</p>
<p>
This lookup is done by maintaining an in-memory set of IP/MAC mappings, populated by parsing the output of the ip command:
</p>
<pre>
> ip neigh
192.168.1.231 dev eth1 lladdr 00:50:8d:68:50:75 REACHABLE
192.168.3.117 dev eth1 lladdr 00:11:d8:8f:0e:3b REACHABLE
...
</pre>
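<p>
Building the in-memory IP/MAC mapping from this output, and using it to re-key the byte counters, might be sketched like this. The names parse_neigh and counters_by_mac are assumptions made for the example, and the FAILED entry in the sample is added to show a neighbour without a known MAC address.
</p>

```ruby
# Sample modelled on the ip neigh output above; the last line is made up.
NEIGH_OUTPUT = [
  "192.168.1.231 dev eth1 lladdr 00:50:8d:68:50:75 REACHABLE",
  "192.168.3.117 dev eth1 lladdr 00:11:d8:8f:0e:3b REACHABLE",
  "192.168.3.200 dev eth1  FAILED"
].join("\n")

# Returns a hash mapping IP address to MAC address.
# Entries without a link-layer address are skipped.
def parse_neigh(output)
  output.each_line.each_with_object({}) do |line, macs|
    match = line.match(/\A(\S+) dev \S+ lladdr ([0-9a-f]{2}(?::[0-9a-f]{2}){5})/i)
    next unless match
    ip, mac = match.captures
    macs[ip] = mac.downcase
  end
end

# Re-keys per-IP byte counters by MAC address, dropping unknown IPs.
def counters_by_mac(counters_by_ip, macs)
  counters_by_ip.each_with_object({}) do |(ip, bytes), result|
    mac = macs[ip]
    result[mac] = bytes if mac
  end
end

p parse_neigh(NEIGH_OUTPUT)
```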
<p>
Thus, combining WRR with ARP, the shaper is able to associate byte counters with the MAC addresses of LAN-connected computers. Restarting the machine, the shaper, or Netfilter, however, will still cause traffic shaping data accumulated over time to be lost.
</p>
<h4>Round Robin Databases</h4>
<p>
We could&#8217;ve opted for persistence to a text file, a hierarchical XML structure, or a relational database, but <a href="http://en.wikipedia.org/wiki/Time_series">time series</a> data doesn&#8217;t lend itself well to these traditional approaches, at least not without preprocessing. With time series data, such as the 32-bit integer byte counters for each computer, counter wrap is a frequent event (it occurs every 4 GB of transferred data). Restarting the machine, the shaper, or Netfilter is also a common event, and it will most likely cause an outlier to be recorded because all counters are reset.
</p>
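<p>
To make the counter-wrap problem concrete, here is a small Ruby sketch of the preprocessing a hand-rolled store would need. The name counter_delta is made up, and the sketch assumes at most one wrap between two readings.
</p>

```ruby
COUNTER_MODULUS = 2**32   # a 32-bit counter wraps back to 0 after 4 GB

# Returns the bytes transferred between two readings of a 32-bit counter.
# A current value smaller than the previous one is treated as one wrap.
def counter_delta(previous, current)
  (current - previous) % COUNTER_MODULUS
end

puts counter_delta(1000, 1500)
puts counter_delta(4_294_967_000, 100)
```

<p>
Note that a counter reset after a restart looks exactly like a wrap to this function, which is how a bogus, very large measurement ends up being recorded.
</p>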
<p>
Logic for making sure these events don&#8217;t pollute our database with erroneous measurements is part of the defining characteristics of a time series database system. In addition, querying data, such as summing within a period of time and making sure the sum isn&#8217;t affected by the above events, is what a time series database is good at.
</p>
<p>
On Linux-based systems, <a href="http://tobi.oetiker.ch/">Tobias Oetiker&#8217;s</a> <a href="http://oss.oetiker.ch/rrdtool/">Round Robin Database tools</a> are the de facto tools for storing, querying, and graphing time series data, and therefore the ones used by the shaper.
</p>
<p>
The idea is that, using the RRD tools, each computer gets its own database, named after the corresponding MAC address, describing its traffic over time. So querying the database of each computer can tell us how much data was sent and received over some period of time. Using the RRD tools for this task eliminates the need on our behalf to deal with outliers, missing values, counter-wraps, and so forth. All the shaper has to do is record the value of the counters at regular intervals and RRD makes sure data is consistent within the database.
</p>
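<p>
As a sketch of what one database per MAC address could look like, the Ruby helpers below only build rrdtool argument lists and don&#8217;t run anything. The helper names, the data source names (in, out), the 300-second step, the 600-second heartbeat, the single seven-day archive, and the replacement of colons in the file name are all assumptions for the example, not the settings used by the shaper.
</p>

```ruby
# Builds a database file name from a MAC address. Replacing the colons
# is an assumption made here to keep the name easy to handle.
def rrd_file(mac)
  "#{mac.downcase.tr(':', '-')}.rrd"
end

# Arguments for creating a database with two COUNTER data sources
# sampled every 300 seconds and kept for 2016 samples (seven days).
def rrd_create_args(mac)
  ["rrdtool", "create", rrd_file(mac), "--step", "300",
   "DS:in:COUNTER:600:0:U", "DS:out:COUNTER:600:0:U",
   "RRA:AVERAGE:0.5:1:2016"]
end

# Arguments for recording the current counter values ("N" means now).
def rrd_update_args(mac, bytes_in, bytes_out)
  ["rrdtool", "update", rrd_file(mac), "N:#{bytes_in}:#{bytes_out}"]
end

puts rrd_create_args("00:50:8d:68:50:75").join(" ")
puts rrd_update_args("00:50:8d:68:50:75", 4546184, 1024).join(" ")
```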
<h4>Decision engine</h4>
<p>
To put the shaper online, it&#8217;s run from a Bash script containing an infinite loop that (1) reads and parses the WRR output, (2) reads and parses the ARP cache entries, (3) writes the byte counters to the RRD databases, (4) uses the RRD querying tool to sum the data based on the rules specified, possibly causing a violation event to fire, and finally (5) goes to sleep for some period of time before starting all over.
</p>
<p>
Whenever a rule is violated, the action taken may be whatever can be expressed through Ruby code or through a callout. This may involve sending an email to the network administrator or disabling Internet access for the computer in question.
</p>
<p>
Within our configuration, we defined a set of rules that state that within a four-hour sliding window a computer is allowed to upload no more than 5GB and download no more than 10GB of data. Similarly, during a seven-day sliding window, a computer is allowed to upload no more than 30GB and download no more than 60GB of data (the exact quotas and periods obviously depend on the network capacity, users online, their usage patterns, and so forth).
</p>
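<p>
The rules above could be expressed and evaluated along these lines. This is a sketch under assumptions: in the shaper the sums come from querying RRD, whereas here an in-memory list of samples stands in for the database, and Rule, RULES, and violations are made-up names.
</p>

```ruby
GB = 1024**3   # assumes binary gigabytes; the exact unit is a guess
HOUR = 3600    # seconds

# A rule caps the bytes moved in one direction within a sliding window.
Rule = Struct.new(:direction, :window, :limit)

RULES = [
  Rule.new(:upload,   4 * HOUR,      5 * GB),
  Rule.new(:download, 4 * HOUR,      10 * GB),
  Rule.new(:upload,   7 * 24 * HOUR, 30 * GB),
  Rule.new(:download, 7 * 24 * HOUR, 60 * GB)
]

# samples is an array of [timestamp, upload_bytes, download_bytes], one
# entry per measurement interval. Returns the rules violated at time now.
def violations(samples, now, rules = RULES)
  rules.select do |rule|
    column = rule.direction == :upload ? 1 : 2
    recent = samples.select { |sample| sample[0] > now - rule.window }
    total = recent.inject(0) { |sum, sample| sum + sample[column] }
    total > rule.limit
  end
end

now = 1_000_000
samples = [
  [now - 8 * HOUR, 4 * GB, 0],
  [now - 3 * HOUR, 3 * GB, GB],
  [now - HOUR,     3 * GB, GB]
]
violations(samples, now).each do |rule|
  puts "violated: #{rule.direction} limit over #{rule.window / HOUR} hours"
end
```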
<p>
Upon violation of a rule, we employ Netfilter to redirect a computer to an information page stating that the computer is blocked from accessing the Internet, along with its four-hour and seven-day totals.
</p>
<h4>Conclusion</h4>
<p>
Observing the proof-of-concept shaper in action, the biggest problem seems to be that a few users modify their MAC address to get a bigger piece of the bandwidth pie. If we wanted, though, changing the MAC address could be counteracted by introducing another layer and instead tying traffic to the port on the switch to which the computer is connected.
</p>
<p>
As far as the available source code goes, it should probably only be used as a starting point for building your own system.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.bugfree.dk/blog/2008/04/12/user-space-traffic-shaping-with-ruby/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Creating a DSL for personal accounting in Ruby</title>
		<link>http://www.bugfree.dk/blog/2006/12/30/creating-a-dsl-for-personal-accounting-in-ruby/</link>
		<comments>http://www.bugfree.dk/blog/2006/12/30/creating-a-dsl-for-personal-accounting-in-ruby/#comments</comments>
		<pubDate>Sat, 30 Dec 2006 14:13:58 +0000</pubDate>
		<dc:creator>Ronnie Holm</dc:creator>
				<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://www.bugfree.dk/blog/2006/12/30/creating-a-dsl-in-ruby-for-personal-accounting/</guid>
		<description><![CDATA[I’ve been reading up on Ruby (and Rails) lately, and I’m in need of a project for getting hands on with the nuts and bolts of the language. So as an exercise I’ve been thinking about constructing a DSL for describing accounting transactions (debits and credits) and describing reports (income vs. expense and account summaries). [...]]]></description>
			<content:encoded><![CDATA[<p>I’ve been reading up on <a href="http://www.ruby-lang.org/en/">Ruby</a> (and <a href="http://www.rubyonrails.org/">Rails</a>) lately, and I’m in need of a project for getting hands on with the nuts and bolts of the language. </p>
<p>So as an exercise I’ve been thinking about constructing a <a href="http://en.wikipedia.org/wiki/Domain_Specific_Language">DSL</a> for describing accounting transactions (debits and credits) and describing reports (income vs. expense and account summaries).</p>
<p>For a couple of years, I’ve been using <a href="http://gnucash.org/">Gnucash</a>, a double entry accounting system, for simple personal accounting. It works great except it&#8217;s Linux only, which is not ideal since I’ve been running Windows on my desktop lately. So every time I need to do accounting stuff, I’ve got to fire up VMware.</p>
<p>Instead it would be fun to literally do double entry accounting in Ruby, by expressing each transaction and report in terms of the DSL.</p>
<p>This way my accounting information can all be recorded using the DSL, and when entering the information, I can make use of the arithmetic features and functions of Ruby itself. For example, if an entry must be the sum of several values, I can enter each value and have Ruby do the summing.</p>
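<p>Since the DSL is only an idea at this point, the following is merely a guess at what it could look like. Every name in it (Ledger, transaction, debit, credit, balance) and the accounts and amounts are invented for illustration.</p>

```ruby
# A hypothetical double entry DSL; all names here are invented.
class Ledger
  Entry = Struct.new(:date, :text, :debits, :credits)

  attr_reader :entries

  def initialize
    @entries = []
  end

  # Records the transaction described by the block if it balances.
  def transaction(date, text, &block)
    @current = Entry.new(date, text, Hash.new(0), Hash.new(0))
    instance_eval(&block)
    entry = @current
    @current = nil
    unless entry.debits.values.inject(0, :+) == entry.credits.values.inject(0, :+)
      raise ArgumentError, "unbalanced transaction: #{text}"
    end
    @entries << entry
  end

  def debit(account, amount)
    @current.debits[account] += amount
  end

  def credit(account, amount)
    @current.credits[account] += amount
  end

  # Balance of an account: total debits minus total credits.
  def balance(account)
    @entries.inject(0) do |sum, entry|
      sum + entry.debits.fetch(account, 0) - entry.credits.fetch(account, 0)
    end
  end
end

ledger = Ledger.new
ledger.instance_eval do
  transaction "2006-12-30", "Groceries" do
    debit  :food, 120 + 35 + 45   # Ruby does the summing
    credit :bank, 200
  end
end
puts ledger.balance(:food)
```

<p>Amounts are kept as integers here to sidestep floating point comparisons when checking that debits equal credits.</p>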
]]></content:encoded>
			<wfw:commentRss>http://www.bugfree.dk/blog/2006/12/30/creating-a-dsl-for-personal-accounting-in-ruby/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

