<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Curtis Tasker &#187; Programming</title>
	<atom:link href="http://curtistasker.com/blog/programming/feed" rel="self" type="application/rss+xml" />
	<link>http://curtistasker.com</link>
	<description>&#099;&#117;&#114;&#116;&#105;&#115; (at) &#099;&#117;&#114;&#116;&#105;&#115;&#116;&#097;&#115;&#107;&#101;&#114; (dot) &#099;&#111;&#109;</description>
	<lastBuildDate>Wed, 05 Oct 2011 01:35:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Strings in Java</title>
		<link>http://curtistasker.com/blog/programming/494/strings-in-java</link>
		<comments>http://curtistasker.com/blog/programming/494/strings-in-java#comments</comments>
		<pubDate>Fri, 30 Sep 2011 04:41:23 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[refactoring]]></category>
		<category><![CDATA[string]]></category>
		<category><![CDATA[stringbuffer]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=494</guid>
		<description><![CDATA[At my first big programming job, we were building web applications using Servlets and EJBs. This was before Java web application frameworks had been invented; Servlets and EJBs were still in their infancy, and JSPs were not even on the map. Everybody was feeling out how to use the technology properly, and a lot of [...]]]></description>
			<content:encoded><![CDATA[<p>At my first big programming job, we were building web applications using <a href="http://en.wikipedia.org/wiki/Java_Servlet">Servlets</a> and <a href="http://en.wikipedia.org/wiki/Enterprise_JavaBean">EJBs</a>. This was before Java <a href="http://en.wikipedia.org/wiki/Web_application_framework">web application frameworks</a> had been invented; Servlets and EJBs were still in their infancy, and <a href="http://en.wikipedia.org/wiki/JavaServer_Pages">JSPs</a> were not even on the map. Everybody was feeling out how to use the technology properly, and a lot of mistakes were made along the way.<br />
<span id="more-494"></span><br />
The first thing I was tasked with was helping the existing team to improve performance in an application that had just been launched. Looking back, I cringe at how badly this application was written, but at the time it was the best the team could come up with. While there were many problems with this app, performance was the showstopper: loading a page took an average of 60 seconds.</p>
<p>The app was pulling large amounts of data from an <a href="http://en.wikipedia.org/wiki/Oracle_Database">Oracle database</a> using <a href="http://en.wikipedia.org/wiki/Stored_procedure">stored procedures</a>, and building presentation html right there in the servlet layer.  We couldn&#8217;t afford software to properly analyze the app, so we wrote a quick timer and started doing simple start/stop events throughout the loading of particularly slow pages, tracking the time it took to perform specific actions.</p>
<p>We slowly figured out that no one part of the app was being egregiously slow; the whole thing was just uniformly slow.  Obviously loops through <a href="http://en.wikipedia.org/wiki/Resultset">resultsets</a> were taking up the bulk of the time, but inside the loops each statement performed at the same snail&#8217;s pace.</p>
<p>Searching for help on the web didn&#8217;t help.  I&#8217;m not sure if my <a href="http://www.urbandictionary.com/define.php?term=google-fu">google-fu</a> was just in its infancy back then, or if there really was no information out about this problem.  Eventually I started writing simple test servlets and tried to do the same task in different ways, hoping to flush out the piece of code that was causing the slowdown.  As it turned out, I stumbled upon the solution in Sun&#8217;s <a href="http://download.oracle.com/javase/6/docs/api/">Javadocs</a>.</p>
<h2>The Problem</h2>
<p>Apparently in Java, <a href="http://download.oracle.com/javase/6/docs/api/java/lang/String.html">Strings</a> are <a href="http://en.wikipedia.org/wiki/Immutable_object">immutable</a> (they cannot be altered after they are created).  The idea is like so: Let&#8217;s say you&#8217;re going to display an invoice to a customer.  You have their name in a string, pulled from the customer table.  You read in the invoice, and you have their name again, pulled twice from the billing and shipping information.  Now you have three string <a href="http://en.wikipedia.org/wiki/Variable_(computer_science)">variables</a> all storing the same static bit of information in memory.  Rather than waste 3x the space, Java can point all three variables at the same chunk of memory (it does this automatically for identical string literals, and for other strings when they are interned).</p>
<p>Now you might think this would be a bad thing; what if the customer changes their billing address name?  Because strings cannot be changed after they are created, every time you think you&#8217;re changing a string, you&#8217;re actually creating a brand new string and pointing your variable to the new string.  In the event that the old string has no more variables pointing to it, it will be <a href="http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)">garbage collected</a>.</p>
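<p>A minimal sketch of that behavior, with made-up variable names: two variables start out referring to the same string, and &#8220;changing&#8221; one of them simply points it at a new object while the other keeps the original value.</p>

```java
// Minimal demo of String immutability; class and variable names are illustrative only.
final class ImmutableStringDemo {

    // Returns true when "changing" one variable leaves the other untouched.
    static boolean reassignmentLeavesOriginalAlone() {
        String billingName = "Jane Doe";
        String shippingName = billingName;   // both variables refer to one String object

        billingName += " Jr.";               // builds a NEW String; the old one is not modified

        return shippingName.equals("Jane Doe") && billingName.equals("Jane Doe Jr.");
    }

    // String methods never modify the receiver; they return a new String.
    static String upperCased(String original) {
        String shouted = original.toUpperCase();
        return original + " became " + shouted;
    }

    public static void main(String[] args) {
        System.out.println(reassignmentLeavesOriginalAlone());
        System.out.println(upperCased("jane"));
    }
}
```

<p>Running <code>main</code> prints <code>true</code> followed by <code>jane became JANE</code>: the original string is still intact after both operations.</p>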
<p>Understanding how this whole mess affected this application has to do with how we were building our html.  You see, building a page went something like this:</p>
<pre>
...
String body = new String();
body += &quot;&lt;table border='1'&gt;&quot;;
body += &quot;&lt;tr&gt;&quot;;
body += &quot;    &lt;th&gt;Column 1&lt;/th&gt;&quot;;
body += &quot;    &lt;th&gt;Column 2&lt;/th&gt;&quot;;
body += &quot;&lt;/tr&gt;&quot;;

while( rs.next() )
{
    body += &quot;&lt;tr&gt;&quot;;
    body += &quot;    &lt;td&gt;&quot;+rs.getString(1)+&quot;&lt;/td&gt;&quot;;
    body += &quot;    &lt;td&gt;&quot;+rs.getFloat(2)+&quot;&lt;/td&gt;&quot;;
    body += &quot;&lt;/tr&gt;&quot;;
}
body += &quot;&lt;/table&gt;&quot;;
...
return header + body + footer;
</pre>
<p>That&#8217;s right, every line of html that was being created for every page was done by string <a href="http://en.wikipedia.org/wiki/Concatenation">concatenation</a>.  With strings, when <code>body += "html here"</code> is used, it creates a new string containing the concatenated contents of both strings, then points <code>body</code> to the new string.  This means that the old string <code>body</code> pointed to, along with the html that was concatenated with it, are now floating around waiting for garbage collection.  Each and every line of html created using this method has the side effect of creating one or more new String objects, each of which copies everything built so far.  Object creation is expensive in Java, and the garbage collection for these excess strings was nearly as expensive.</p>
<h2>The Solution</h2>
<p>The solution was actually pretty trivial, but time consuming to implement.  Java has a <a href="http://download.oracle.com/javase/6/docs/api/java/lang/StringBuffer.html">StringBuffer</a> class which is specifically designed for this type of problem.  It stores the same information as a String, but in a mutable fashion.  This allows us to alter its contents without the hassle of object creation and garbage collection.  The fixed code looked like so:</p>
<pre>
...
StringBuffer body = new StringBuffer();
body.append( &quot;&lt;table border='1'&gt; &quot; );
body.append( &quot;&lt;tr&gt;&quot; );
body.append( &quot;    &lt;th&gt;Column 1&lt;/th&gt;&quot; );
body.append( &quot;    &lt;th&gt;Column 2&lt;/th&gt;&quot; );
body.append( &quot;&lt;/tr&gt;&quot; );

while( rs.next() )
{
    body.append( &quot;&lt;tr&gt;&quot; );
    body.append( &quot;    &lt;td&gt;&quot;).append( rs.getString(1) ).append( &quot;&lt;/td&gt;&quot; );
    body.append( &quot;    &lt;td&gt;&quot;).append( rs.getFloat(2)  ).append( &quot;&lt;/td&gt;&quot; );
    body.append( &quot;&lt;/tr&gt;&quot; );
}
body.append( &quot;&lt;/table&gt;&quot; );
...
return (new StringBuffer( header.toString() )
                 .append( body )
                 .append( footer )
       ).toString();
</pre>
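<p>On Java 5 and later, <code>StringBuilder</code> offers the same <code>append</code> calls without StringBuffer&#8217;s synchronization, which is usually all you need for a buffer used by a single thread.  The sketch below is not the original application code: the row data and method names are hypothetical stand-ins for the ResultSet loop, used to show that both approaches produce the same html.</p>

```java
// Sketch only: the row data and method names are hypothetical stand-ins for the ResultSet loop.
final class TableBuilderDemo {

    // The original approach: every += allocates a fresh String.
    static String withConcatenation(String[][] rows) {
        String body = "<table border='1'>";
        for (String[] row : rows) {
            body += "<tr><td>" + row[0] + "</td><td>" + row[1] + "</td></tr>";
        }
        body += "</table>";
        return body;
    }

    // Same output, but appended into one growable buffer.
    static String withBuilder(String[][] rows) {
        StringBuilder body = new StringBuilder("<table border='1'>");
        for (String[] row : rows) {
            body.append("<tr><td>").append(row[0]).append("</td><td>")
                .append(row[1]).append("</td></tr>");
        }
        body.append("</table>");
        return body.toString();
    }

    public static void main(String[] args) {
        String[][] rows = { {"Widget", "9.99"}, {"Gadget", "19.5"} };
        System.out.println(withBuilder(rows));
        System.out.println(withConcatenation(rows).equals(withBuilder(rows)));
    }
}
```

<p>The two methods differ only in how many intermediate objects they create along the way, which is where the time went in our case.</p>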
<p>For anybody doing Java web apps now, you just won&#8217;t run into this problem as often. JSPs allow for much more readable presentation code, and you rarely run around appending to your html like this. That said, the misuse of strings in Java is in no way limited to this example here. The problem crops up more than you&#8217;d imagine, with experienced and inexperienced developers alike.</p>
<h2>New Job, Same Problem</h2>
<p>At my second big computer job, I was surrounded by some very smart people.  They had been using Java for years, and were currently maintaining a huge, high performance web app using JSPs, Servlets, and EJBs. They were handling exclusively backend work, and passing off finished code to webmonkeys to design <a href="http://en.wikipedia.org/wiki/User_interface">UIs</a> around.  I remember sitting in my little cubicle, poring over code rapidly, trying to get a grasp of how the application worked. I overheard three of the senior developers discussing a problem a few cubicles over. Much to my surprise, they were having performance problems with a new piece of code that sounded suspiciously like it was an immutable string issue.</p>
<p>I listened for a minute or two, then swaggered over to solve their problem.  What took days to figure out at my first job was related and then fixed in mere minutes here, and the performance issue disappeared.  I&#8217;m always reminded of this moment, and how even experienced developers don&#8217;t know everything.  There&#8217;s always something more to learn.</p>
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/programming/494/strings-in-java/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HTML Meta Tags</title>
		<link>http://curtistasker.com/blog/programming/681/html-meta-tags</link>
		<comments>http://curtistasker.com/blog/programming/681/html-meta-tags#comments</comments>
		<pubDate>Wed, 14 Jul 2010 19:05:17 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[description]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[keywords]]></category>
		<category><![CDATA[meta]]></category>
		<category><![CDATA[metadata]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[tags]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=681</guid>
		<description><![CDATA[When I created this website, I decided that I would use custom and unique description and keywords meta for each blog post. The idea here was to make each page a bit more individual in the eyes of search engines. I realize that search engines no longer place much (if any) weight on these factors, [...]]]></description>
			<content:encoded><![CDATA[<p>When I created this website, I decided that I would use custom and unique <a href="http://en.wikipedia.org/wiki/HTML_META#The_description_attribute">description</a> and <a href="http://en.wikipedia.org/wiki/HTML_META#The_keywords_attribute">keywords</a> meta for each blog post.  The idea here was to make each page a bit more individual in the eyes of search engines.  I realize that search engines no longer place much (<a href="http://googlewebmastercentral.blogspot.com/2009/09/google-does-not-use-keywords-meta-tag.html">if any</a>) weight on these factors, but it doesn&#8217;t hurt to be thorough.<br />
<span id="more-681"></span><br />
The original implementation of the <a href="http://en.wikipedia.org/wiki/HTML_element#Document_head_elements">html head</a> was like so:</p>
<pre>
&lt;?php
if(is_single() || is_page()) {
    $description_meta = get_post_meta($post-&gt;ID, &quot;description&quot;, true);
    $keywords_meta    = get_post_meta($post-&gt;ID, &quot;keywords&quot;, true);
}
else if( is_category() ) $description_meta = category_description();

if(&quot;&quot; == $description_meta) $description_meta = get_bloginfo(&#039;description&#039;);
if(&quot;&quot; == $keywords_meta)    $keywords_meta    = &quot;curtis,tasker,curtis tasker&quot;;
?&gt;

&lt;meta name=&quot;description&quot; content=&quot;&lt;?php echo($description_meta); ?&gt;&quot;/&gt;
&lt;meta name=&quot;keywords&quot; content=&quot;&lt;?php echo($keywords_meta); ?&gt;&quot;/&gt;
</pre>
<p>Every post or page in WordPress has a nice little GUI for entering <a href="http://codex.wordpress.org/Custom_Fields">Custom Fields</a> of data.  By using these, I would be able to customize each blog post&#8217;s metadata, while maintaining the default for the rest of the pages.</p>
<p>While this performed admirably, I recently did a little improvement.  While working on something totally unrelated, I realized that the <a href="http://codex.wordpress.org/Posts_Tags_Screen">post tags</a> and keywords meta for each blog post were almost exactly the same.  Apparently the overlap never occurred to me when I first built the site.</p>
<p>The solution was actually pretty simple:</p>
<pre>
$keywords_meta = implode( &quot;,&quot;, wp_get_post_tags( $post-&gt;ID,
                                                 array('fields'=&gt;'names') ));
</pre>
<p>Now it pulls the list of tags for the blog post, then converts it into a <a href="http://en.wikipedia.org/wiki/Comma-separated_values">comma-delimited string</a>, and uses that for the keywords meta.  All in all, a much cleaner solution that speeds up the process of publishing a blog post.  Every little bit helps.</p>
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/programming/681/html-meta-tags/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Permalinks</title>
		<link>http://curtistasker.com/blog/technology/562/permalinks</link>
		<comments>http://curtistasker.com/blog/technology/562/permalinks#comments</comments>
		<pubDate>Mon, 14 Dec 2009 01:18:44 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[seo]]></category>
		<category><![CDATA[url rewriting]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=562</guid>
		<description><![CDATA[Linking from one document to another is basically the foundation of the internet. The URL of a document is the address you point your browser at to retrieve some content. Content on the internet is meant to be read. You drive readers to your content by publishing the URL. Ideally you want your content [...]]]></description>
			<content:encoded><![CDATA[<p>Linking from one document to another is basically the foundation of the internet.  The <a href="http://en.wikipedia.org/wiki/Url">URL</a> of a document is the address you point your browser at to retrieve some content.  Content on the internet is meant to be read.  You drive readers to your content by publishing the URL.  Ideally you want your content to be available to readers not only now, but for years hence.<br />
<span id="more-562"></span><br />
The problem lies with the issue of permanency.  Let&#8217;s say you have a small company that sells products over the internet.  You put your return and exchange policies at <code>example.com/help/returns.htm</code>.  You buy a large batch of paper invoices, and have your return policy URL proudly displayed at the bottom of each invoice.  A few months down the road, the website is re-designed, and the new return policy is now at <code>example.com/returns</code>.  Apparently nobody remembered that your invoices are pointing to a resource that no longer exists.  This happens more often than you might think, and often involves both digital and physical links which are broken.</p>
<h2>Common Reasons For Removal</h2>
<ul>
<li>The content of the page becomes outdated, and is removed entirely.</li>
<li>The website is redesigned, and old information is moved to an entirely new location.</li>
<li>The type of information changes (a pdf becomes a web page).</li>
<li>Static content becomes dynamic.</li>
</ul>
<h2>Coping With The Problem</h2>
<p>There are a few basic ways to cope with out of date URLs.  You can display a custom <a href="http://en.wikipedia.org/wiki/HTTP_404">404 error</a>, perhaps with a search box to allow the user to more easily find the page they were looking for.  You can redirect the user back to the root of your website, and hope they can find their way from there.  Ideally, you can configure your web-server to redirect all requests from the old URL to the new URL.  Unfortunately, maintaining those redirects can become quite unwieldy, and is often just ignored, especially on larger websites and over long time periods.</p>
<p>The best solution is to plan ahead when first launching your website.  With proper planning, you can configure your website&#8217;s URL structure to cope with changes in underlying design and technology without changing the basic URL structure.</p>
<p>If a piece of content is going to be advertised or linked to, its URL should be a folder.  <code>example.com/resume</code> is far preferable to <code>example.com/resume.htm</code> or <code>example.com/resume/resume.pdf</code>.  </p>
<p>If your website is highly dynamic, generating content on the fly, then your URL structure is even more critical.  <code>example.com/home.php?page=intro&#038;lang=en&#038;splash=yes</code> is confusing, and painful to type by hand.  <code>example.com/intro</code> is simple and easy to type.</p>
<h2>How I Handle It</h2>
<p>Everybody seems to have their own opinions on how to set up your blog with permanent links. The main concerns seem to be <a href="http://en.wikipedia.org/wiki/SEO">SEO</a>, which isn&#8217;t terribly important to me, but I took it into account.  I&#8217;ve never had a lot of publicly accessible links on my website, so there weren&#8217;t many links to preserve.  I did choose to maintain most of the hosted files and private content, as I&#8217;d planned for that ages ago.</p>
<p>I settled on this structure: <code>http://curtistasker.com/blog/%category%/%post_id%/%postname%</code>. I wanted the word &#8216;blog&#8217; in the url somewhere, to indicate at a glance (and to search engines) that this was a blog post.  I plan on using a small number of categories with little overlap, so having the category in the URL is reasonable.  It provides an additional keyword for search engines, and offers users a bit more information to go along with the post title.  Post ID is there mostly to make search engines happy, as quite a few people seemed to think a number of three or more digits in your URL was beneficial.  And finally it closes with the post title.</p>
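<p>As a rough illustration of how a URL in that shape is put together from a category, a post ID, and a title, here is a small sketch.  The <code>slugify</code> helper is hypothetical and much simpler than whatever WordPress actually does to sanitize titles.</p>

```java
import java.util.Locale;

// Sketch of the /blog/category/post_id/title structure; slugify() is a hypothetical helper,
// not WordPress's own sanitizing routine.
final class PermalinkDemo {

    // Lowercases the text, collapses every run of non-alphanumerics into one hyphen,
    // then trims a leading or trailing hyphen.
    static String slugify(String text) {
        return text.toLowerCase(Locale.ROOT)
                   .replaceAll("[^a-z0-9]+", "-")
                   .replaceAll("^-|-$", "");
    }

    static String permalink(String category, int postId, String title) {
        return "/blog/" + slugify(category) + "/" + postId + "/" + slugify(title);
    }

    public static void main(String[] args) {
        System.out.println(permalink("Programming", 494, "Strings in Java"));
    }
}
```

<p>For the inputs shown, this prints <code>/blog/programming/494/strings-in-java</code>.</p>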
<p>Tweaking WordPress to handle this structure wasn&#8217;t terribly bad, though it&#8217;s since gotten easier.  I&#8217;ve been able to get rid of a custom plugin I wrote to handle category links, as newer versions of WordPress can handle what I want by default.  I also use a lot of <a href="http://curtistasker.com/blog/programming/109/url-rewriting-using-apache">URL Rewriting</a> to help prevent duplicate links, by forcing <code>www.example.com/file1</code> and <code>example.com/file1</code> and <code>example.com/files/file1</code> to all redirect to the same URL.  I also discourage duplicate indexing of content (category and tag pages with summaries).</p>
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/technology/562/permalinks/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Input Driven</title>
		<link>http://curtistasker.com/blog/technology/256/input-driven</link>
		<comments>http://curtistasker.com/blog/technology/256/input-driven#comments</comments>
		<pubDate>Thu, 18 Dec 2008 21:04:23 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[apple]]></category>
		<category><![CDATA[keyboard]]></category>
		<category><![CDATA[mouse]]></category>
		<category><![CDATA[scancode]]></category>
		<category><![CDATA[script]]></category>
		<category><![CDATA[wireless]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=256</guid>
		<description><![CDATA[I seem to chew through and spit out peripherals yearly. I suppose when you spend as much time as I do in front of a computer, you feel compelled to have the perfect input devices. Finding Perfection I started out with a standard IBM PC keyboard, one of those giant clackers, and a dreary old grey [...]]]></description>
			<content:encoded><![CDATA[<p>I seem to chew through and spit out peripherals yearly. I suppose when you spend as much time as I do in front of a computer, you feel compelled to have the perfect input devices.<br />
<span id="more-256"></span></p>
<h2>Finding Perfection</h2>
<p>I started out with a standard IBM PC keyboard, one of those giant clackers, and a dreary old grey mouse with my first computer.  Since then I&#8217;ve gone through literally dozens of upgrades. Ball mice gave way to optical mice, which in turn gave way to laser mice. In the past few years I&#8217;ve gone through half a dozen gaming mice, before finally settling on a <a href="http://en.wikipedia.org/wiki/Logitech_G9">Logitech G9</a>. Most of my relatives have a castoff mouse that once graced my desk, but my right hand is finally happy.</p>
<p>My keyboard churn has followed the same breakneck pace. I&#8217;ve gone through keyboards with built in <a href="http://en.wikipedia.org/wiki/Trackpoint">trackpoints</a>, purportedly ergonomic split keyboards, a dozen near-disposable $10 keyboards, a <a href="http://www.daskeyboard.com/">Das Keyboard</a> (100% blank, of course), and two backlit keyboards. I started with loud clicky keys, moved to silent soft keys, then back to clicky, then back to soft. I never did find one I was happy with.</p>
<p>That changed when the aluminum <a href="http://www.apple.com/keyboard/">Apple Keyboard</a> caught my eye. After a trek to an Apple Store and half an hour coding away on it (to the mild amusement of the staff), I brought one home. I&#8217;ve since moved to the wireless version, and aside from my occasional lapses when I try to use the nonexistent Numpad or Home/End PgUp/PgDn keys, I&#8217;m happy. In fact, the quality of the typing experience was so good that it factored in heavily to my MacBook Air purchase.</p>
<h2>Windows Support</h2>
<p>Despite both the wired and wireless keyboards being fairly fantastic as far as the hardware goes, the software support for Windows is severely lacking. The default <a href="http://support.apple.com/kb/HT1167">mapping</a> isn&#8217;t ideal.  The software offered by Apple for Boot Camp is difficult to find for a Windows user, and once found is just not up to the task. When I first bought the wired keyboard, I spent hours putting together a melange of hacks to turn the keyboard into a usable device on Windows.</p>
<h3>A few of the problems:</h3>
<ul>
<li>Media &amp; Volume keys are nonfunctional</li>
<li>Command key where Alt key should be, but acts like Windows key</li>
<li>Option key where Windows key should be, but acts like Alt key</li>
<li>Right Control key acts like Right Alt key</li>
<li>Fn key where Insert key should be (wired keyboard only)</li>
<li>Clear key where NumLock key should be (wired keyboard only)</li>
</ul>
<p>That covers most of the big issues, but the list of minor issues can go on for some time.  I used <a href="http://sharpkeys.codeplex.com/">SharpKeys</a> to remap the F7-F12 <a href="http://en.wikipedia.org/wiki/Scan_codes">scancodes</a> to the windows media keyboard equivalents, turn F13-F15 into the PrintScreen, Scroll Lock, and Pause keys, and make the Ctrl and Alt keys work properly.  This proved to be an adequate solution at the time.</p>
<p>After I got my MacBook Air, I wanted more parity between how the keyboard on my laptop and desktop worked. Since I had upgraded to the wireless Apple Keyboard around the same time, I particularly needed shortcuts to replicate the lost buttons.  Again using scancodes, I turned the Command key into Ctrl, to make cut/copy/paste use the same gesture as the Mac. I used a scripting language called <a href="http://www.autohotkey.com/">AutoHotkey</a> to add some additional functionality, namely Ctrl+ArrowKeys shortcuts for Home/End/PgUp/PgDn, and Fn+Backspace for Delete. </p>
<h3>More Problems Arise</h3>
<p>All was well until I started beta testing a new program, and my mapping fell apart. You see, this program bypassed the standard Windows keyboard device driver, and used its own instead. This made things problematic as my scancode mappings were thus ignored. The program used Ctrl + # and Alt + # heavily, and the Ctrl and Alt keys were just not where my fingers expected them to be. In addition, Command started acting like the Windows key again, launching the start menu when pressed. Words cannot adequately express how highly annoying this all was.</p>
<p>I tried to code my way out of this problem with more scripting, and ended up realizing that scancodes just weren&#8217;t doing the job anymore. I decided the only viable solution was to replicate all the key mapping functionality in my script instead.</p>
<h2>An Easy Solution</h2>
<p>Before I got too involved in this project, I did a little digging on the net, and a program called <a href="http://code.google.com/p/uawks/">UAWKS</a> caught my eye. It had been released just weeks prior, and it basically does everything my scancode remapping and scripting did, with a slick bit of user interface thrown on top. I swapped over entirely to this program, with minimal tweaks to the underlying code to suit my needs.</p>
<p>There are a few lingering issues, but I wholeheartedly recommend <a href="http://code.google.com/p/uawks/">UAWKS</a> to anybody running an Apple keyboard under Windows.  It is a requirement to get the proper Windows behavior out of the keyboard.</p>
<p>Of course, all this is going to be a moot point once I replace my desktop with a Mac.  At least I&#8217;ll be able to use the same mouse and keyboard, without all the hassle.  At this point, my lust for new input devices is sated.  Of course, who can say what the next year&#8217;s product line will bring?</p>
<p><strong>Update (03/25/2009):</strong>  Just spent an hour on the Wacom Intuos4. <em>*sigh*</em> With a little willpower, perhaps I can keep the tablet consumption rate down to something reasonable.</p>
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/technology/256/input-driven/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Basic access control in Apache</title>
		<link>http://curtistasker.com/blog/programming/153/basic-access-control-in-apahce</link>
		<comments>http://curtistasker.com/blog/programming/153/basic-access-control-in-apahce#comments</comments>
		<pubDate>Wed, 23 Jul 2008 16:23:41 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[htaccess]]></category>
		<category><![CDATA[password]]></category>
		<category><![CDATA[protect]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=153</guid>
		<description><![CDATA[Apache .htaccess files can be used to block access to specific resources, or to provide minimal security through user name and password authentication. You can use a .htaccess file in any folder of your website, and it will apply to any subfolders. A single .htaccess file placed in the root of your domain can apply [...]]]></description>
			<content:encoded><![CDATA[<p>Apache <a href="http://httpd.apache.org/docs/2.2/howto/htaccess.html">.htaccess files</a> can be used to block access to specific resources, or to provide minimal security through user name and password authentication.</p>
<p>You can use a .htaccess file in any folder of your website, and it will apply to any subfolders.  A single .htaccess file placed in the root of your domain can apply to the entire website.  While this is advantageous for blocking access to files, you&#8217;re going to need a separate .htaccess in each subfolder that you want to password protect.</p>
<p><span id="more-153"></span></p>
<h2>Blocking access to resources</h2>
<pre># block access to all .ht* files
&lt;files ~ "^\.ht"&gt;
    Order allow,deny
    Deny from all
&lt;/files&gt;

# block access to wp-config
&lt;files wp-config.php&gt;
	Order allow,deny
	Deny from all
&lt;/files&gt;</pre>
<p>This blocks access to all files that begin with .ht, as well as the wp-config.php file.  You can use this type of definition to block access to any file or folder of your choosing.</p>
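<p>If you want to sanity-check a pattern before putting it in a <code>files</code> block, you can try it against a few file names.  The sketch below uses Java&#8217;s regex engine rather than Apache&#8217;s, on the assumption that a pattern this simple behaves the same in both; the class and method names are made up for the example.</p>

```java
import java.util.regex.Pattern;

// Illustrative check of the ^\.ht pattern; Apache uses its own regex engine,
// but this simple expression is assumed to behave the same way in java.util.regex.
final class HtFilePatternDemo {

    private static final Pattern HT_FILES = Pattern.compile("^\\.ht");

    // True when the file name starts with ".ht" and would therefore be blocked.
    static boolean isBlocked(String fileName) {
        return HT_FILES.matcher(fileName).find();
    }

    public static void main(String[] args) {
        for (String name : new String[] { ".htaccess", ".htpasswd", "index.html", "notes.htm" }) {
            System.out.println(name + " blocked: " + isBlocked(name));
        }
    }
}
```

<p>Only the first two names match, since the <code>^</code> anchors the pattern to the start of the file name.</p>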
<p>If you&#8217;re running WordPress, ideally you should move your wp-config.php file to a location on your server that is above the root of your domain; it contains security information that you really don&#8217;t want somebody getting access to.</p>
<h2>Password protecting files and directories</h2>
<p>This is not a terribly good method of protecting your content, as none of the data is encrypted as you access it.  In addition, unless you use digest mode, your user name and password are sent in the clear with every request to the protected area.</p>
<h3>Creating users and passwords</h3>
<p>First, you need to create some user names and passwords and store them in a .htpasswd file.</p>
<pre>% cd /home/username/webapps/www.example.com/
% htpasswd -c .htpasswd user1
  New password:  pass
  Re-type new password:  pass
  Adding password for user user1

% chmod a+r .htpasswd</pre>
<p>Switch to the root folder of your domain, and run the <a href="http://httpd.apache.org/docs/2.2/programs/htpasswd.html"><code>htpasswd</code></a> command with <code>-c</code> to create a new password file, with <code>user1</code> as a user.  Then type in a password for that user.  Finally you chmod the file to ensure it has the proper permissions.</p>
<p>You can add additional users to the file by simply using the above command without the <code>-c</code> option.  You can also create groups of users, by creating a file called .htgroup that looks like so:</p>
<pre>my-users: user1 user2 user3 user4</pre>
<p>You should ideally put your .htpasswd file in a location that isn&#8217;t accessible from your website, such as your home directory.</p>
<h3>Setting up the authentication (Basic)</h3>
<pre>AuthType Basic
AuthName "My Protected Folder"
AuthUserFile /home/username/webapps/www.example.com/.htpasswd
AuthGroupFile /dev/null
Require valid-user</pre>
<p>AuthType we&#8217;re using is Basic (password sent in the clear).</p>
<p>AuthName is an arbitrary name you assign to your protected content.  If you protect multiple directories, and give each directory the same AuthName, then the user will only be required to enter their information once; they will then be granted access to all the directories.</p>
<p>AuthUserFile is the location of the .htpasswd file, which grants users listed in this file access.</p>
<p>AuthGroupFile is the location of the .htgroup file, if you wish to grant a group of users access.  In this case I&#8217;m defining a null group.</p>
<p>Require can list either specific user names, group names, or any valid user or group from the Auth files.</p>
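<p>To see why Basic mode counts as sending the password in the clear, it helps to look at what actually goes over the wire.  The sketch below, with hypothetical class and method names, builds the <code>Authorization</code> header the way a browser would and then reverses it; the credentials are only Base64-encoded, and no key is needed to decode them.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of what a browser sends for Basic authentication:
// "user:password" is Base64-encoded, which is an encoding, not encryption.
final class BasicAuthDemo {

    static String authorizationHeader(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Anyone who observes the header can reverse it without any key.
    static String decodeCredentials(String header) {
        String encoded = header.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = authorizationHeader("user1", "pass");
        System.out.println("Authorization: " + header);
        System.out.println(decodeCredentials(header));
    }
}
```

<p>This is why Basic authentication is best paired with an encrypted connection if the content matters at all.</p>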
<h3>Setting up authentication (Digest)</h3>
<p>This is slightly more secure than Basic mode, as your password is sent as an MD5 hash rather than in the clear.  To use Digest mode, use the following code:</p>
<pre>AuthType Digest
AuthDigestDomain /
AuthDigestProvider file
AuthUserFile /home/username/webapps/www.example.com/.htdigest
AuthName "My Protected Folder"
Require valid-user</pre>
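<p>You will need a new password file, which can be created using <a href="http://httpd.apache.org/docs/2.2/programs/htdigest.html"><code>htdigest</code></a>.  The steps are much the same as when creating the .htpasswd file, except that <code>htdigest</code> also takes the realm as an argument before the user name (<code>htdigest -c .htdigest "My Protected Folder" user1</code>).  The realm must match the AuthName exactly.</p>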
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/programming/153/basic-access-control-in-apahce/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>URL rewriting using Apache</title>
		<link>http://curtistasker.com/blog/programming/109/url-rewriting-using-apache</link>
		<comments>http://curtistasker.com/blog/programming/109/url-rewriting-using-apache#comments</comments>
		<pubDate>Sun, 20 Jul 2008 21:26:00 +0000</pubDate>
		<dc:creator>Curtis</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[apache]]></category>
		<category><![CDATA[htaccess]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[url rewriting]]></category>
		<category><![CDATA[wordpress]]></category>

		<guid isPermaLink="false">http://curtistasker.com/?p=109</guid>
		<description><![CDATA[Several years ago, you made the grave error of putting an overly specific URL in some advertising (example.com/fall_2005_news.html). You&#8217;ve come to your senses and are re-organizing your website&#8217;s structure, and you really want to get rid of that html file. You&#8217;d like any user attempting to visit that ancient URL to instead be shunted to [...]]]></description>
			<content:encoded><![CDATA[<p>Several years ago, you made the grave error of putting an overly specific URL in some advertising (<code>example.com/fall_2005_news.html</code>).  You&#8217;ve come to your senses and are re-organizing your website&#8217;s structure, and you really want to get rid of that HTML file.  You&#8217;d like any user attempting to visit that ancient URL to instead be shunted to <code>example.com/news</code>.</p>
<p>Well, such a procedure is relatively easy.  Apache <a href="http://httpd.apache.org/docs/2.2/howto/htaccess.html">.htaccess files</a> can be used in any folder of your website, and will apply to any subfolders.  Thus, a single .htaccess file in the root of your domain can apply to the entire website.</p>
<p>The aforementioned example is fairly simple, but URL rewriting of extreme complexity is possible once you know the rules.  Generally, instead of simply changing <code>file1.html</code> to <code>folder/file1.html</code> using a single rule, you will instead change every file fitting the form <code>fileX.html</code> to <code>folder/fileX.html</code>.  That is to say, any request for a file that matches a specific pattern will be rewritten to a new URL.</p>
<p><span id="more-109"></span></p>
<h2>Regular Expressions</h2>
<div class="infobox alignright">
<h3><abbr title="Regular Expressions">Regex</abbr> Pointers</h3>
<table border="0">
<tbody>
<tr>
<td>^</td>
<td>start of line anchor</td>
</tr>
<tr>
<td>$</td>
<td>end of line anchor</td>
</tr>
<tr>
<td>.</td>
<td>match any character</td>
</tr>
<tr>
<td>?</td>
<td>match 0 to 1 of the preceding elements</td>
</tr>
<tr>
<td>*</td>
<td>match 0 to N of the preceding elements</td>
</tr>
<tr>
<td>+</td>
<td>match 1 to N of the preceding elements</td>
</tr>
<tr>
<td>[abc]</td>
<td>matches any 1 character from the list abc</td>
</tr>
<tr>
<td>[^ab]</td>
<td>matches any 1 character except a and b</td>
</tr>
<tr>
<td>\.</td>
<td>match the character period</td>
</tr>
<tr>
<td>(.*)</td>
<td>matches any run of characters and stores it as a backreference</td>
</tr>
<tr>
<td>!</td>
<td>negate the match</td>
</tr>
</tbody>
</table>
</div>
<p>It&#8217;s a bit beyond the scope of this post to teach <abbr title="Regular Expressions">regex</abbr> in depth, but here are a few pointers to help decipher the code in the next section.</p>
<p>As a mildly complex example, consider <code>^www\.([^\.]+\.[^\.]+)$</code>.  This starts at the beginning of the line, matches the character string <code>www</code>, followed by a period, followed by 1 to N characters that are not a period, followed by a period, followed by 1 to N characters that are not a period, followed by the end of the line.</p>
<p>It matches any host name that fits the form: <code>www.example.com</code>.  It then stores a backreference for the <code>example.com</code> portion of the match.  Think of a backreference as a saved variable containing the text that was matched inside a pair of parentheses.  Multiple backreferences can be made per block of <abbr title="Regular Expressions">regex</abbr>, and they can be used later on in your code.</p>
<h2>URL Rewriting</h2>
<p>Apache defines a few simple commands that allow you to use <abbr title="Regular Expressions">regex</abbr> to dynamically alter a URL.  The <a href="http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html">official documentation</a> is a great help, but the <a href="http://www.google.com/search?q=RewriteRule">examples on the web</a> are of much greater instructional value.  Generally you begin URL rewriting like so:</p>
<pre>&lt;IfModule mod_rewrite.c&gt;
RewriteEngine on
RewriteBase /</pre>
<p>and end like so:</p>
<pre>&lt;/IfModule&gt;</pre>
<p>It uses an if..then block to enclose all URL rewriting commands, and will only evaluate the block if the correct Apache module is installed.  It begins by turning on the rewrite engine, and setting the base of all rewriting to the root directory.</p>
<p>RewriteCond and RewriteRule control how a URL is rewritten.  Generally you write 0 to N RewriteCond statements followed by a single RewriteRule statement.  The RewriteRule will only be executed if every RewriteCond statement preceding it matches something.</p>
<p>RewriteCond takes two arguments:  the string to match against, followed by the pattern to match.  RewriteRule takes two arguments: the pattern to match, followed by the rewritten URL.  RewriteRule always matches against the path of the requested URL (the REQUEST_URI variable, minus the leading slash when the rule lives in a root .htaccess file); if the URL was <code>http://www.example.com/folder1/file1.html</code>, it would match against the string <code>folder1/file1.html</code>.</p>
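<p>To make the argument order concrete, here is a rough sketch of the redirect from the introduction, along with a pattern-based version.  It assumes the rules live in the .htaccess file at the root of the domain, inside the IfModule block shown above, and the <code>[0-9]+</code> pattern is just one guess at what <code>fileX.html</code> might look like.</p>
<pre># One old URL, one new URL
RewriteRule ^fall_2005_news\.html$ /news [R=301,L]

# Every fileX.html (X being a number) moves into folder/
RewriteRule ^file([0-9]+)\.html$ /folder/file$1.html [R=301,L]</pre>
<p>The <code>R=301</code> flag sends a permanent redirect to the browser, and <code>L</code> stops processing further rules for that request.</p>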
<h3>Strip off the www subdomain</h3>
<pre>RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]</pre>
<div class="infobox alignright">
<h3>Backreferences</h3>
<table border="0">
<tbody>
<tr>
<td>%1</td>
<td>the first backreference from RewriteCond</td>
</tr>
<tr>
<td>$1</td>
<td>the first backreference from RewriteRule</td>
</tr>
<tr>
<td>$2</td>
<td>the second backreference from RewriteRule</td>
</tr>
</tbody>
</table>
</div>
<p>This starts with the HTTP_HOST variable, which contains just the www.example.com portion of the incoming URL.  It then matches <code>www.</code>, and stores a backreference to all characters that come after that.</p>
<p>Assuming that the URL did indeed contain a <code>www.</code>, then the RewriteRule comes into play.  The pattern <code>^(.*)$</code> will match everything in the REQUEST_URI, and store a backreference to the string.  The URL is then rewritten using the two backreferences.</p>
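<p>Tracing a single request through the two lines may help.  The URL below is the same example used earlier, and the comments show what each variable and backreference should hold along the way.</p>
<pre># Request:       http://www.example.com/folder1/file1.html
# %{HTTP_HOST}:  www.example.com      so %1 = example.com
# Rule input:    folder1/file1.html   so $1 = folder1/file1.html
# Redirect to:   http://example.com/folder1/file1.html
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]</pre>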
<h3>Map subdomains to subfolders</h3>
<pre>RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+\.[^.]+)$
RewriteRule ^(.*)$ http://%2/%1/$1 [R=301,L]</pre>
<p>This starts with the HTTP_HOST variable, which contains <code>subdomain.example.com</code>.  It finds the subdomain using <code>([^.]+)\.</code> to match 1..N characters that are not periods until it reaches a period.  A backreference is stored as %1.</p>
<p>Next, it finds the domain using <code>([^.]+\.[^.]+)$</code> to match 1..N characters that are not periods (<code>example</code>) followed by a period followed by 1..N characters that are not periods (<code>com</code>).  It stores the domain in the backreference %2.</p>
<p>Finally, the RewriteRule uses the pattern <code>^(.*)$</code> to match everything in the REQUEST_URI, and stores a backreference as $1.  The URL is then rewritten using the three backreferences.  So, <code>subdomain.example.com</code> now becomes <code>example.com/subdomain</code>.</p>
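<p>One caveat: a host such as <code>www.example.com</code> has the same subdomain.domain.tld shape, so <code>www</code> would be treated as just another subdomain.  A possible variation, sketched here on the assumption that you want to leave www hosts to the previous rule, adds a negated condition in front:</p>
<pre># Skip hosts that start with www.
RewriteCond %{HTTP_HOST} !^www\. [NC]
# %1 = subdomain, %2 = domain (e.g. example.com)
RewriteCond %{HTTP_HOST} ^([^.]+)\.([^.]+\.[^.]+)$
RewriteRule ^(.*)$ http://%2/%1/$1 [R=301,L]</pre>
<p>The % backreferences come from the last RewriteCond that matched, so the negated condition does not disturb %1 and %2.</p>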
<h3>Prevent image hotlinking</h3>
<p>If you host images on your website, you may want to prevent other websites from stealing your bandwidth by hotlinking to your images.</p>
<pre>RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com/.*$ [NC]
RewriteRule \.(gif|jpg|png)$ - [F]</pre>
<p>The first RewriteCond will match any request where the referring site is not empty.  The second will match any request where the referring site is any site except your own site (<code>example.com</code> or <code>www.example.com</code>).</p>
<p>If both of those conditions are met (consecutive RewriteCond statements must all match), the RewriteRule kicks in, and matches any file that ends in gif, jpg, or png.  So, if any outside website links to a file on your website with one of those 3 file extensions, the server will return a forbidden response.</p>
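<p>If more than one site is allowed to embed your images, each one gets its own negated condition.  A minimal sketch, where <code>partner.example.org</code> is a made-up second site:</p>
<pre>RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Hypothetical partner site that may link to the images
RewriteCond %{HTTP_REFERER} !^https?://partner\.example\.org/ [NC]
# NC also catches .JPG, .PNG and so on
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]</pre>
<p>Because consecutive RewriteCond lines are ANDed together, the request is only blocked when the referrer is non-empty and matches none of the allowed sites.</p>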
<h2>WordPress URL Rewriting</h2>
<p>If you use WordPress, when you customize your permalinks through the admin interface, WordPress will attempt to alter your .htaccess file to add the following lines:</p>
<pre>RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]</pre>
<p>What this does is check to see if a requested file or directory exists (<code>example.com/directory</code>).  If the file or directory exists, nothing happens, and you are able to access the resource as usual.  If it does not exist (<code>example.com/postname</code>), the RewriteRule activates, sending the request to the WordPress index.php.  From here, the WordPress permalink PHP code takes over, translating your request into the WordPress resource you requested.</p>
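<p>Put together with the wrapper from the URL Rewriting section, the whole thing would look roughly like this.  This is assembled from the pieces shown in this post, not copied from a WordPress install, so the exact block WordPress writes may differ.</p>
<pre>&lt;IfModule mod_rewrite.c&gt;
RewriteEngine on
RewriteBase /
# Only rewrite when the request is not a real file...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and not a real directory
RewriteCond %{REQUEST_FILENAME} !-d
# Hand everything else to WordPress
RewriteRule . /index.php [L]
&lt;/IfModule&gt;</pre>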
]]></content:encoded>
			<wfw:commentRss>http://curtistasker.com/blog/programming/109/url-rewriting-using-apache/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

