Amazon Web Services: AWS Week in Review - July 28, 2014

Let's take a quick look at what happened in AWS-land last week:

Monday, July 28
Tuesday, July 29
Wednesday, July 30
Thursday, July 31
Friday, August 1

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

-- Jeff;

Amazon Web Services: Route 53 Update - Domain Name Registration, Geo Routing, and a Price Reduction

Amazon Route 53 is a highly available and scalable Domain Name Service (DNS), including a powerful Health Checking Service. Today we are extending Route 53 with support for domain name registration and management and Geo DNS. We are also reducing the price for Route 53 queries! Let's take a closer look at each one of these items.

Domain Name Registration and Management
I registered my first domain name in 1995! Back then, just about every aspect of domain management and registration was difficult, expensive, and manual. After you found a good name, you had to convince one or two of your tech-savvy friends to host your DNS records, register the name using an email-based form, and then bring your site online. With the advent of web-based registration and multiple registrars the process became a lot smoother and more economical.

Up until now, you had to register your domain at an external registrar, create the Hosted Zone in Route 53, and then configure your domain's entry at the registrar to point to the Route 53 name servers. With today's launch of Route 53 Domain Name Registration, you can now take care of the entire process from within the AWS Management Console (API access is also available, of course). You can buy, manage, and transfer (both in and out) domains from a wide selection of generic and country-specific top-level domains (TLDs). As part of the registration process, we'll automatically create and configure a Route 53 Hosted Zone for you. You can think up a good name, register it, and be online with static (Amazon Simple Storage Service (S3)) or dynamic content (Amazon Elastic Compute Cloud (EC2), AWS Elastic Beanstalk, or AWS OpsWorks) in minutes.

If you, like many other AWS customers, own hundreds or thousands of domain names, you know first-hand how much effort goes into watching for pending expirations and renewing your domain names. By transferring your domain to Route 53, you can take advantage of our configurable expiration notification and our optional auto-renewal. You can avoid embarrassing (and potentially expensive) mistakes and you can focus on your application instead of on your domain names. You can even reclaim the brain cells that once stored all of those user names and passwords.

Let's walk through the process of finding and registering a domain name using the AWS Management Console and the Route 53 API.

The Route 53 Dashboard gives me a big-picture view of my Hosted Zones, Health Checks, and Domains:

I begin the registration process by entering the desired name and selecting a TLD from the menu:

The console checks on availability within the selected domain and in some other popular domains. I can add the names I want to the cart (.com and .info in this case):

Then I enter my contact details:

I can choose to enable privacy protection for my domain. This option will hide most of my personal information from the public Whois database in order to thwart scraping and spamming.

When everything is ready to go, I simply agree to the terms and my domain(s) will be registered:

I can see all of my domains in the console:

I can also see detailed information on a single domain:

I can also transfer domains into or out of Route 53:

As I mentioned earlier, I can also investigate, purchase, and manage domains through the Route 53 API. Let's say that you are picking a name for a new arrival to your family and you want to make sure that you can acquire a suitable domain name (in most cases, consultation with your significant other is also advisable). Here's some code to automate the entire process! I used the AWS SDK for PHP.

The first step is to set the desired last name and gender, and the list of acceptable TLDs:

$LastName = 'Barr';
$Gender   = 'F';
$TLDs     = array('.com', '.org');

Then I include the AWS SDK and the PHP Simple HTML DOM and create the Route 53 client object:

require 'aws.phar';
require 'simple_html_dom.php';

// Connect to Route 53
$Client = \Aws\Route53Domains\Route53DomainsClient::factory(array('region' => 'us-east-1'));

Now I need an array of the most popular baby names. I took this list and parsed the HTML to create a PHP array:

$HTML       = file_get_html("");
$FirstNames = array();

$Lists = $HTML->find('table tr ol');
$Items = $Lists[($Gender == 'F') ? 0 : 1];

foreach ($Items->find('li') as $Item)
  $FirstNames[] = $Item->find('a', 0)->innertext;

With the desired last name and the list of popular first names in hand (or in memory to be precise), I can generate interesting combinations and call the Route 53 checkDomainAvailability function to see if they are available:

foreach ($FirstNames as $FirstName) {
  foreach ($TLDs as $TLD) {
    $DomainName = $FirstName . '-' . $LastName . $TLD;

    // Ask Route 53 whether this candidate name can be registered
    $Result = $Client->checkDomainAvailability(array(
      'DomainName'  => $DomainName,
      'IdnLangCode' => 'eng'));

    echo "{$DomainName}: {$Result['Availability']}\n";
  }
}

I could also choose to register the first available name (again, consultation with your significant other is recommended here). I'll package up the contact information since I'll need it a couple of times:

$ContactInfo = array(
  'ContactType'      => 'PERSON',
  'FirstName'        => 'Jeff',
  'LastName'         => 'Barr',
  'OrganizationName' => 'Amazon Web Services',
  'AddressLine1'     => 'XXXX  Xth Avenue',
  'City'             => 'Seattle',
  'State'            => 'WA',
  'CountryCode'      => 'US',
  'ZipCode'          => '98101',
  'PhoneNumber'      => '+1.206XXXXXXX',
  'Email'            => '');

And then I use the registerDomain function to register the domain:

if ($Result['Availability'] === 'AVAILABLE') {
  echo "Registering {$DomainName}\n";

  // Register for one year with auto-renewal and privacy protection enabled
  $Result = $Client->registerDomain(array(
    'DomainName'              => $DomainName,
    'IdnLangCode'             => 'eng',
    'AutoRenew'               => true,
    'DurationInYears'         => 1,
    'BillingContact'          => $ContactInfo,
    'RegistrantContact'       => $ContactInfo,
    'TechContact'             => $ContactInfo,
    'AdminContact'            => $ContactInfo,
    'OwnerPrivacyProtected'   => true,
    'AdminPrivacyProtected'   => true,
    'TechPrivacyProtected'    => true,
    'BillingPrivacyProtected' => true));
}

Geo Routing
Route 53's new Geo Routing feature lets you choose the most appropriate AWS resource for content delivery based on the location where the DNS queries originate. You can now build applications that respond more efficiently to user requests, with responses that are wholly appropriate for the location. Each location (a continent, a country, or a US state) can be independently mapped to static or dynamic AWS resources. Some locations can receive static resources served from S3 while others receive dynamic resources from an application running on EC2 or Elastic Beanstalk.

You can use this feature in many different ways. Here are a few ideas to get you started:

  • Global Applications - Route requests to Amazon Elastic Compute Cloud (EC2) instances hosted in an AWS Region that is in the same continent as the request. You could do this to maximize performance or to meet legal or regulatory requirements.
  • Content Management - Provide users with access to content that has been optimized, customized, licensed, or approved for their geographic location. For example, you could choose to use distinct content and resources for red and blue portions of the United States. Or, you could run a contest or promotion that is only valid in certain parts of the world and use this feature to provide an initial level of filtering.
  • Consistent Endpoints - Set up a mapping of locations to endpoints to ensure that a particular location always maps to the same endpoint. If you are running an MMOG, routing based on location can increase performance, reduce latency, give you better control over time-based scaling, and increase the likelihood that users with similar backgrounds and cultures will participate in the same shard of the game.

To make use of this feature, you simply create some Route 53 Record Sets that have the Routing Policy set to Geolocation. Think of each Record Set as a mapping from a DNS entry to a particular AWS resource (an S3 bucket, an EC2 instance, or an Elastic Load Balancer). With today's launch, each Record Set with a Geolocation policy becomes effective only when the incoming request for the DNS entry originates within the bounds (as determined by an IP to geo lookup) of a particular continent, country, or US state. The Record Sets form a hierarchy from continent to country to US state, and the most specific one is always used. You can also choose to create a default entry that will be used if no other entries match.
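The "most specific match wins" rule can be sketched in a few lines of Python. This is only an illustrative model, not Route 53's implementation: the `resolve` function, the record layout, and every endpoint name are hypothetical, and the location of the query is assumed to have already been derived from the source IP.

```python
# Illustrative model of geolocation record selection (not the Route 53 implementation).
# Keys are (continent, country, us_state); None means "not specified".
# All endpoint names below are made up for the example.
RECORDS = {
    ("NA", "US", "WA"): "elb-us-west.example.com",     # US state
    ("NA", "US", None): "elb-us.example.com",          # country
    ("AS", None, None): "elb-asia.example.com",        # continent
    (None, None, None): "static-site.s3.example.com",  # default entry
}

def resolve(continent, country=None, us_state=None, records=RECORDS):
    """Return the endpoint of the most specific matching record, or None."""
    candidates = [
        (continent, country, us_state),  # state-level match
        (continent, country, None),      # country-level match
        (continent, None, None),         # continent-level match
        (None, None, None),              # default entry
    ]
    for key in candidates:
        if key in records:
            return records[key]
    return None

print(resolve("NA", "US", "WA"))  # state record
print(resolve("NA", "US", "OR"))  # falls back to the country record
print(resolve("AS", "JP"))        # falls back to the continent record
print(resolve("EU", "FR"))        # falls back to the default entry
```

Checking candidates from most to least specific keeps the lookup order explicit; without a default entry, a query from an unmapped location simply gets no answer in this model.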

You can set up this feature from the AWS Management Console, the Route 53 API, or the AWS Command Line Interface (CLI). Depending on your application, you might want to think about an implementation that generates Record Sets based on information coming from a database of some sort.

Let's say that I want to provide static content to most visitors to my site, and dynamic content to visitors from Asia. Here's what I need to do. First I create a default Record Set for "www" that points to my S3 bucket:

Then I create another Record Set for "www", this one geolocated for Asia. This one points to an Elastic Load Balancer:

Price Reduction
Last, but certainly not least, I am happy to tell you that we have reduced the prices for Standard and LBR (Latency-Based Routing) queries by 20%. The following prices go into effect as of August 1, 2014:

  1. Standard Queries - $0.40 per million queries for the first billion queries per month; $0.20 per million queries after that.
  2. LBR Queries - $0.60 per million queries for the first billion queries per month; $0.30 per million queries after that.
  3. Geo DNS Queries - $0.70 per million queries for the first billion queries per month; $0.35 per million queries after that.
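As a rough worked example of the tiered pricing above, the following Python sketch computes a monthly query charge. The rates are the ones listed above; the function name and the sample query volume are assumptions made for illustration, and the sketch ignores hosted zone and any other charges.

```python
# Per-million-query rates from the list above: (first billion, beyond the first billion)
RATES = {
    "standard": (0.40, 0.20),
    "lbr":      (0.60, 0.30),
    "geo":      (0.70, 0.35),
}
TIER_LIMIT = 1_000_000_000  # first billion queries per month

def monthly_query_cost(query_type, queries):
    """Return the monthly query charge in dollars for one query type."""
    first_rate, next_rate = RATES[query_type]
    first_tier = min(queries, TIER_LIMIT)
    overflow = max(queries - TIER_LIMIT, 0)
    return (first_tier * first_rate + overflow * next_rate) / 1_000_000

# 1.5 billion standard queries: 1,000M at $0.40 plus 500M at $0.20
print(f"${monthly_query_cost('standard', 1_500_000_000):,.2f}")  # prints $500.00
```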

Available Now
These new features are available now and the price reduction goes into effect tomorrow.

-- Jeff;

PS - Thanks to Steve Nelson of AP42 for digging up the Internic Domain Registration Template!

ProgrammableWeb: APIs: Google Play Developer

The Google Play Developer API lets developers automate tasks related to subscriptions, in-app purchases, and general app management. Developers can gain access to the API by signing in to their Google Developer Console.
Date Updated: 2014-07-31

ProgrammableWeb: APIs: IDOL OnDemand Text Tokenization

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Text Tokenization API accepts a list of terms or a body of text, and returns data about each individual term. The API first stems and transliterates the word, then scans a large body of common English works to return information on the weight of the word (how often it is used in the English language), the total number of documents the word is present in, and the total number of occurrences.
Date Updated: 2014-07-31

ProgrammableWeb: APIs: IDOL OnDemand Sentiment Analysis

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Sentiment Analysis API accepts and scans a text file to return either positive, neutral, or negative sentiment. It does this by comparing content with a library of specific words that evoke negative emotion, and by recognizing phrases that are indicative of an emotional bias.
Date Updated: 2014-07-31

ProgrammableWeb: APIs: IDOL OnDemand Language Identification

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Language Identification API scans an inputted document and returns the language present. The API requires a minimum of 3 words to determine the language; however, accuracy increases with larger amounts of content.
Date Updated: 2014-07-31

ProgrammableWeb: APIs: IDOL OnDemand Highlight Text

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Highlight Text API accepts a text file, and scans for specified words to insert an HTML tag that highlights them.
Date Updated: 2014-07-31

Amazon Web Services: Auto Scaling Update - Lifecycle Management, Standby State, and DetachInstances

Auto Scaling is a key AWS service. You can use it to build resilient, highly scalable applications that react to changes in load by launching or terminating Amazon EC2 instances as needed, all driven by system or user-defined metrics collected and tracked by Amazon CloudWatch.

Today we are enhancing Auto Scaling with the addition of three features that give you additional control over the EC2 instances managed by each of your Auto Scaling Groups. You can now exercise additional control of the instance launch and termination process using Lifecycle Hooks. You can remove instances from an Auto Scaling Group and you can now put instances into the new Standby state for troubleshooting or maintenance.

Lifecycle Actions & Hooks
Each EC2 instance in an Auto Scaling Group goes through a defined set of states and state transitions during its lifetime. In response to a Scale Out Event, instances are launched, attached to the group, and become operational. Later, in response to a Scale In Event, instances are removed from the group and then terminated. With today's launch we are giving you additional control of the instance lifecycle at the following times:

  • After it has been launched but before it is attached to the group (Auto Scaling calls this state Pending). This is your opportunity to perform any initialization operations that are needed to fully prepare the instance. You can install and configure software, create, format, and attach EBS volumes, connect the instance to message queues, and so forth.
  • After it has been detached from the group but before it has been terminated (Auto Scaling calls this state Terminating). You can do any additional work that is needed to fully decommission the instance. You can capture a final snapshot of any work in progress, move log files to long-term storage, or hold malfunctioning instances off to the side for debugging.

You can configure a set of Lifecycle actions for each of your Auto Scaling Groups. Messages will be sent to a notification target for the group (an SQS queue or an SNS topic) each time an instance enters the Pending or Terminating state. Your application is responsible for handling the messages and implementing the appropriate initialization or decommissioning operations.

After the message is sent, the instance will be in the Pending:Wait or Terminating:Wait state, as appropriate. Once the instance enters this state, your application is given 60 minutes to do the work. If the work is going to take more than 60 minutes, your application can extend the time by issuing a "heartbeat" to Auto Scaling. If the time (original or extended) expires, the instance will come out of the wait state.

After the instance has been prepared or decommissioned, your application must tell Auto Scaling that the lifecycle action is complete, and that it can move forward. This will set the state of the instance to Pending:Proceed or Terminating:Proceed.
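The wait, heartbeat, and proceed sequence described above can be modeled with a small Python sketch. This is a toy simulation rather than an AWS API: the `LifecycleAction` class and its methods are hypothetical, time is counted in minutes on an arbitrary clock, and a heartbeat is assumed to restart the full 60-minute window from the moment it is recorded.

```python
class LifecycleAction:
    """Toy model of one lifecycle action (hypothetical class, not an AWS API)."""

    def __init__(self, phase, now, timeout_minutes=60):
        assert phase in ("Pending", "Terminating")
        self.phase = phase
        self.state = f"{phase}:Wait"            # state entered after the message is sent
        self.timeout_minutes = timeout_minutes
        self.deadline = now + timeout_minutes   # minutes on an arbitrary clock

    def heartbeat(self, now):
        """Extend the wait by another timeout period, if still waiting."""
        if self.state.endswith(":Wait") and now <= self.deadline:
            self.deadline = now + self.timeout_minutes
            return True
        return False

    def complete(self, now):
        """Mark the work as done so the instance can move forward."""
        if self.state.endswith(":Wait") and now <= self.deadline:
            self.state = f"{self.phase}:Proceed"
            return True
        return False

    def timed_out(self, now):
        """True if the action is still waiting after its deadline has passed."""
        return self.state.endswith(":Wait") and now > self.deadline


action = LifecycleAction("Pending", now=0)
action.heartbeat(now=50)          # slow initialization: deadline moves from 60 to 110
print(action.timed_out(now=90))   # False, thanks to the heartbeat
action.complete(now=100)
print(action.state)               # Pending:Proceed
```

In a real deployment the same three steps would map onto reading the notification, periodically recording a heartbeat while the work runs, and signaling completion at the end.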

You can create and manage your lifecycle hooks from the AWS Command Line Interface (CLI) or from the Auto Scaling API. Here are the most important functions:

  1. PutLifecycleHook - Create or update a lifecycle hook for an Auto Scaling Group. Call this function to create a hook that acts when instances launch or terminate.
  2. CompleteLifecycleAction - Signify completion of a lifecycle action for a lifecycle hook. Call this function when your hook has successfully set up or decommissioned an instance.
  3. RecordLifecycleActionHeartbeat - Record a heartbeat for a lifecycle action. Call this function to extend the timeout for a lifecycle action.

Standby State
You can now move an instance from the InService state to the Standby state, and back again. When an instance is standing by, it is still managed by the Auto Scaling Group but it is removed from service until you set it back to the InService state. You can use this state to update, modify, or troubleshoot instances. You can check on the state of the instance after specific events, and you can set it aside in order to retrieve important logs or other data.

If there is an Elastic Load Balancer associated with the Auto Scaling Group, the transition to the standby state will deregister the instance from the Load Balancer. The transition will not take effect until traffic ceases; this may take some time if you enabled connection draining for the Load Balancer.

DetachInstances
You can now remove an instance from an Auto Scaling Group and manage it independently. The instance can remain unattached, or you can attach it to another Auto Scaling Group if you'd like. When you call the DetachInstances function, you can also request a change in the desired capacity for the group.

You can use this new functionality in a couple of different ways. You can move instances from one Auto Scaling Group to another to effect an architectural change or update. You can experiment with a mix of different EC2 instance types, adding and removing instances in order to find the best fit for your application.

If you are new to the entire Auto Scaling concept, you can use this function to do some experimentation and to gain some operational experience in short order. Create a new Launch Configuration using CreateLaunchConfiguration and a new Auto Scaling Group using CreateAutoScalingGroup, supplying the Instance Id of an existing EC2 instance in both cases. Do your testing and then call DetachInstances to take the instance out of the Auto Scaling Group.

You can also use the new detach functionality to create an "instance factory" of sorts. Suppose your application assigns a fresh, fully-initialized EC2 instance to each user when they log in. Perhaps the application takes some time to initialize, but you don't want your users to wait for this work to complete. You could create an Auto Scaling Group and set it up so that it always maintains several instances in reserve, based on the expected login rate. When a user logs in, you can allocate an instance, detach it from the Auto Scaling Group, and dedicate it to the user in short order. Auto Scaling will add fresh instances to the group in order to maintain the desired amount of reserve capacity.
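The "instance factory" idea can be illustrated with a short Python simulation. Nothing here calls AWS: the `InstanceFactory` class, the instance ID format, and the user names are all hypothetical, and the replenish step merely stands in for Auto Scaling launching instances to restore the desired capacity after a detach.

```python
from collections import deque
from itertools import count

class InstanceFactory:
    """Toy model of an Auto Scaling Group used as a pool of warm instances."""

    def __init__(self, reserve):
        self.reserve = reserve      # desired number of instances kept in reserve
        self._ids = count(1)
        self.group = deque()        # instances currently in the group
        self.assigned = {}          # user -> detached instance
        self._replenish()

    def _replenish(self):
        # Stand-in for Auto Scaling launching instances to reach desired capacity.
        while len(self.group) < self.reserve:
            self.group.append(f"i-{next(self._ids):04d}")

    def allocate(self, user):
        # Stand-in for detaching an instance without lowering desired capacity.
        instance = self.group.popleft()
        self.assigned[user] = instance
        self._replenish()
        return instance

factory = InstanceFactory(reserve=3)
print(factory.allocate("alice"))   # i-0001
print(factory.allocate("bob"))     # i-0002
print(list(factory.group))         # ['i-0003', 'i-0004', 'i-0005']
```

The key design point is that the detach does not reduce the desired capacity, so the group refills itself and the next user still finds a warm instance waiting.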

Available Now
All three of these new features are available now and you can start using them today. They are accessible from the AWS Command Line Interface (CLI) and the Auto Scaling API.

-- Jeff;

Amazon Web Services: Amazon ElastiCache Flexible Node Placement

Amazon ElastiCache makes it easy for you to deploy an in-memory cache in the cloud using the Memcached or Redis engines.

Today we are launching a new flexible node placement model for ElastiCache. Your Cache Clusters can now span multiple Availability Zones within a Region. This will help to improve the reliability of the Cluster.

You can now choose the Availability Zone for new nodes when you create a new Cache Cluster or add more nodes to an existing Cluster. You can specify the new desired number of nodes in each Availability Zone or you can simply choose the Spread Nodes Across Zones option. If the cluster is within a Virtual Private Cloud (VPC) you can place nodes in Availability Zones that are part of the selected cache subnet group (read Creating a Cache Cluster in a VPC to learn more).
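To make the spreading option concrete, here is a minimal Python sketch of one plausible interpretation: an even, round-robin distribution of nodes across the chosen zones. The `spread_nodes` function is hypothetical and the zone names are examples; the placement logic that ElastiCache actually uses is not described here and may differ.

```python
def spread_nodes(node_count, zones):
    """Assign node_count nodes to zones as evenly as possible (round-robin)."""
    if not zones:
        raise ValueError("at least one Availability Zone is required")
    placement = {zone: 0 for zone in zones}
    for i in range(node_count):
        placement[zones[i % len(zones)]] += 1
    return placement

print(spread_nodes(5, ["us-east-1a", "us-east-1b", "us-east-1c"]))
# {'us-east-1a': 2, 'us-east-1b': 2, 'us-east-1c': 1}
```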

Here is how you control node placement when you create a new Cache Cluster:

Here is how you control zone placement when you add nodes to an existing cluster:

This new feature is available now and you can start using it today!

-- Jeff;

Daniel Glazman (Disruptive Innovations): Where is Daniel

Last week I suffered a major bout of lumbago that left me totally immobilized, on morphine, and hospitalized. I am recovering slowly, stuffed with strong medicines, my back still too weak to stand a visit to my osteopath. Since I am unable to sit for long in front of a computer (I spend most of my days in bed), don't count on me these days for anything work-related, including unfortunately W3C stuff. Don't expect fast answers to emails either. I need at least one extra week of complete rest to recover a bit. Thanks.

Shelley Powers (Burningbird): Who Keeps E-Mails?

If you're following the BPI vs. ABC "pinkslime" lawsuit, then you might be aware that the company is attempting to subpoena emails from several journalists and food safety experts.

The subpoenas to Food Safety News reporters are a bit tricky, because the publisher for the online site is Bill Marler, who is providing pro bono legal defense for the two former USDA workers who are also being sued in this lawsuit. I'm not a lawyer, but this means attorney-client privilege to me. I'm surprised that the Judge would allow such a fishing expedition so close to this privilege, but maybe this is the way they do things in South Dakota's courts.

Michele Simon responded to the subpoena, but as she noted, she doesn't keep emails. Come to think of it, I don't keep emails, either. Nowadays, when you have corporations shotgunning subpoenas under indifferent judicial eyes, perhaps none of us should keep emails. Not unless we primarily write about cats or JavaScript. Or the latest squabble between the WhatWG and the W3C HTML working groups (because no one would ever want any of these emails).

If BPI, Inc doesn't have what it needs to win its case, or can't get it from those directly involved in the lawsuit, maybe it should focus on how to explain away both the pinkness and the slimy feel of its product when the defendant lawyers bring a mess in for the jury to fondle. And spend some time contemplating the fact that, yes, people in this country really do want to know what's in the food we're eating.

Update: ABC has also covered the subpoena story. Must have been a bit cathartic for them.


ProgrammableWeb: APIs: Visa Checkout

The Visa Checkout API helps developers integrate payments. Its core value is e-commerce transaction functionality, which is useful for developers who work with online businesses that are open 24/7. The API features a REST protocol, HTTPS interfaces, and an API key that can be used publicly or privately. Developers can receive Twitter tips and e-mail support.
Date Updated: 2014-07-30

ProgrammableWeb: APIs: Wink App

Wink is an app that syncs with home automation devices to adjust lighting, window shades, climate, key locks, and more. Wink sells a Wink HUB hardware component that accepts communications from devices in the following protocols: Bluetooth LE, Wi-Fi, ZigBee, Z-Wave, Lutron ClearConnect, and Kidde. The RESTful Wink API is hosted through Apiary and allows Wink devices to communicate with users, other apps, and the web in general.
Date Updated: 2014-07-30

ProgrammableWeb: APIs: IDOL OnDemand Expand Terms

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Expand Terms API accepts a term and returns related words based on the selected expansion criteria. Developers can set to 'fuzzy' to return a list of similarly worded terms, can set to 'stem' to return a list of words that share the same root, or set to 'wild' to utilize even broader expansion results.
Date Updated: 2014-07-30

ProgrammableWeb: APIs: IDOL OnDemand Entity Extraction

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Entity Extraction API scans an inputted text file and extracts words that match the entity type criteria. These could be people, names, places, company names, phone numbers, URLs, credit card numbers, and more. The API returns a list of the content matching the entity types, and also specifies where each match was found within the document.
Date Updated: 2014-07-30

ProgrammableWeb: APIs: IDOL OnDemand Query Text Index

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Query Text Index API accepts plain text, references, or Boolean expressions and queries the OnDemand database to return documents that match the specified search criteria. Queries are also made to a number of public datasets that include Wikipedia news feeds in multiple languages.
Date Updated: 2014-07-30

ProgrammableWeb: APIs: IDOL OnDemand Get Parametric Values

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Get Parametric Values API allows one to retrieve parametric values for specified parametric fields. This is helpful for filtering with end users, or returning faceted data.
Date Updated: 2014-07-30

Shelley Powers (Burningbird): Mother Jones Fascinating Murder Mystery with an NRA Twist—and Documents

Photo of gun associated with early murder trial, included in Mother Jones story.

Mother Jones has a fascinating, longer look at an early murder mystery associated with none other than the NRA's general counsel, Robert Dowlut. It would seem that Dowlut was originally convicted of second degree murder, a conviction that was later overturned.

In an act I've come to expect from Mother Jones, the publication has also provided easy access to all of the documentation that provided the basis for the story.

Journalists can't always provide all of their background material, but when they can, they should. This allows others to review the material, enabling them to either agree or disagree with the writer based on the same material, if the writer forms a conclusion. At a minimum, this sharing ensures open access to documents that may be difficult for non-journalists to obtain—documents that may form the basis for other, future works.

There is nothing to agree with or disagree with in the Mother Jones article, since it's very careful to remain neutral and factual in its retelling of the older story (and the more recent activities Dowlut has undertaken for the NRA). But the author, Dave Gilson, provides much to think about.


Amazon Web Services: New Amazon Climate Research Grants

Many of my colleagues are focused on projects that lead to a reduction in the environmental impact of our work. Online shopping itself is inherently more environmentally friendly than traditional retailing. Other important initiatives include Frustration-Free Packaging, Environmentally Friendly Packaging, our global Kaizen program, Sustainable Building Design, and a selection of AmazonGreen products. On the AWS side, the US West (Oregon) and AWS GovCloud (US) Regions make use of 100% carbon-free power.

In conjunction with our friends at NASA, we announced the OpenNEX (NASA Earth Exchange) program and the OpenNEX Challenge late last year. OpenNEX is a collection of data sets produced by Earth science satellites (over 32 TB at last count) and a set of virtual labs, lectures, and Amazon Machine Images (AMIs) for those interested in learning about the data and how to process it on AWS. For example, you can learn how to use Python, R or shell scripts to interact with the OpenNEX data, generate a true-color Landsat image, enhance Landsat images with atmospheric corrections, or work with the NEX Downscaled Climate projections (NEXDCP-30).

Amazon Climate Research Grants
We are interested in exploring ways to use computational analysis to drive innovative research into climate change. To help drive this work forward, we are now calling for proposals for Amazon Climate Research Grants. In early September, we will award grants of free access to supercomputing resources running on the Amazon Elastic Compute Cloud (EC2).

The grants will provide access to more than fifty million core hours via the EC2 Spot Market. Our goal is to encourage and accelerate research that will result in an improved understanding of the scope and effects of climate change, along with analyses that could suggest potential mitigating actions. Recipients have the opportunity to deliver an update on their progress and to reveal early findings at the AWS re:Invent conference in mid-November.

If you are interested in applying for an Amazon Climate Research Grant, here are some dates to keep in mind:

  • July 29, 2014 - Call for proposals opens.
  • August 29, 2014 - Submissions are due.
  • Early September 2014 - Recipients notified; AWS grants issued.
  • November 2014 - Recipients present initial research and findings at AWS re:Invent.

To learn more or to submit a proposal, please visit the Amazon Climate Change Grants page.

Let's wrap up this post with a quick look at an AWS HPC success story!

The Globus team at the University of Chicago/Argonne National Lab used an AWS Research grant to create the Galaxy instance and use EC2 Spot instances to run various climate impact models and applications that project irrigation water availability and agricultural production under climate change. You can learn more about this "Science as a Service on AWS" project by taking a peek at the following presentation:


Your Turn
I am looking forward to taking a look at the proposals and to seeing the first results at re:Invent. If you have an interesting and relevant project in mind, I invite you to apply now!

-- Jeff;

ProgrammableWeb: APIs: IDOL OnDemand Find Similar

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Find Similar API returns documents that are similar to the query you provide. Acceptable query formats include plain text, URL, index type, file, or reference. One can use the 'sort' or 'print' parameters to denote what fields to represent as an output.
Date Updated: 2014-07-29

ProgrammableWeb: APIsIDOL OnDemand Find Related Concepts

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Find Related Concepts API takes a query text and returns a list of the most similar concepts ranked by the presence of the prescribed attribute. Queries are made through public datasets that include Wikipedia news feeds in multiple languages. Boolean and FieldText query methods are supported.
Date Updated: 2014-07-29

ProgrammableWeb: APIsIDOL OnDemand List Indexes

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. Their set of Indexing APIs allows for general databasing functionality. Indexes come in different formats called "flavors", which vary according to maximum number of documents, scaling performance, and number of field types. The List Index API returns a list of the names of all dynamic data sets that have been created with the Create Text Index API.
Date Updated: 2014-07-29

ProgrammableWeb: APIsIDOL OnDemand Index Status

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. Their set of Indexing APIs allows for content management and general classification. Indexes come in different formats called "flavors", which vary according to maximum number of documents, scaling performance, and number of field types. To check the status of an index or flavor, developers can send a request to the Index Status API, which returns data on the number of content pieces, storage availability, and the most recent updates and additions.
Date Updated: 2014-07-29

Amazon Web ServicesAWS Week in Review - July 21, 2014

Let's take a quick look at what happened in AWS-land last week:

Monday, July 21
Tuesday, July 22
Wednesday, July 23
Thursday, July 24
Friday, July 25

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

-- Jeff;

Norman Walsh (Sun)The short-form week of 21–27 Jul 2014

<article class="essay" id="content" lang="en">

The week in review, 140 characters at a time. This week, 26 messages in 22 conversations. (With 7 favorites.)

This document was created automatically from my archive of my Twitter stream. Due to limitations in the Twitter API and occasional glitches in my archiving system, it may not be 100% complete.

Monday at 10:20am

This is why aliens won't talk to us #atheism —@denyreligion

Monday at 12:16pm

"Time is a machine: it will convert your pain into experience."—@ndw

Monday at 12:53pm

RT @siracusa: "I don't think that the fantasy of intellectual property ultimately serves artists or art very well." —@ndw

Monday at 12:55pm

@alexmilowski Knock 'em dead, Alex. You got this!—@ndw

In a conversation that started on Monday at 03:10pm

Odd to arrive at SFO, which I feel I know well, in a terminal with which I'm completely unfamiliar. (American instead of United)—@ndw
@ndw So when do you want your Manhattan?—@datablick
@datablick Ha! Not on the same day I got up at 5am, that's for sure!—@ndw
@ndw Well, we're here all week...—@datablick
@datablick @ndw Ha! Better make up the spare bed if he has one of your manhattans!—@alexmilowski
@alexmilowski @ndw He can sleep in the basement with all the kids.—@datablick
@ndw Decided to fly an airline that doesn't hate you?—@alexmilowski
@alexmilowski AUS-LHR nonstop on BA was the clincher.—@ndw
@ndw I thought BA was a United partner?—@alexmilowski
@alexmilowski Nope, American.—@ndw

Wednesday at 07:42pm

My social media feeds are overrun with news of rallies to stand with Israel or with Gaza. Not a single rally to stand with civilians.—@meyerweb

Wednesday at 08:47pm

A few days ago I posted DocBook V5.1CR3 RELAX NG schemas on docbook(.org). Today I posted an attempt at XML Schemas: 1.0 and (better) 1.1.—@ndw

In a conversation that started on Wednesday at 08:49pm

Have (almost surely) decided the next release of my DocBook toolchain (XSLT 2.0/HTML5) will require XProc, abandon XSL FO.—@ndw
@ndw I've been out of the XML loop for a year or so - I am curious - why abandon XSL-FO?—@EileenOttawa
@EileenOttawa Working group disbanded. Locus of interest/effort clearly CSS.—@ndw
@ndw Woah! Interesting. I have indeed been under a rock! Thanks much—@EileenOttawa
@ndw How will you get to PDF from XML? @EileenOttawa —@webbr
@webbr @EileenOttawa DocBook XML → "for-print" HTML → HTML+CSS formatter → PDF.—@ndw
@ndw @EileenOttawa Thank you for the details. What's the tool in the chain that gets you to PDF from HTML+CSS?—@webbr
@webbr @EileenOttawa I've tested both PrinceXML (oh, the irony) and AntennaHouse's formatter. There are probably others and surely will be.—@ndw
@ndw @EileenOttawa Do you think that there will soon be a free tool for that link in the chain? At this point, I rely on fop to get to PDF.—@webbr
@webbr @EileenOttawa I expect so. Might be already. Something that rivals fop shouldn't be too difficult.—@ndw
[big shift] MT @ndw: Have (almost surely) decided next release of my DocBook toolchain (XSLT 2.0/HTML5) will require XProc, abandon XSL FO—@jeffsonstein

In a conversation that started on Wednesday at 10:35pm

DocBook and HTML 5(.x): —@ndw
@ndw @sideshowbarker you might have missed the point of picture. It overcomes issues with speculative preload scanner in HTML parsing.—@marcosc
@marcosc @sideshowbarker Yes. Curiously, DocBook extended the content model of its image markup recently for similar reasons.—@ndw
@ndw @sideshowbarker interesting. Would really like to hear more. I'm worried we will hit more of these problems with Web Components.—@marcosc
@marcosc @sideshowbarker Oh, the imagine markup issue is unrelated to the question of Web Components. And I may still misunderstand picture.—@ndw
@ndw @sideshowbarker see the follow. Might help explain: —@marcosc
@ndw @sideshowbarker *following—@marcosc
@ndw @sideshowbarker to be clear, we didn't want picture to exist in HTML. But we had no option but to add it :(—@marcosc
@ndw @sideshowbarker also, 99.99% of new tags will be just be Web Components. Does DocBook allow custom elements + API? (Never used DocBook)—@marcosc
@marcosc @sideshowbarker No, but the paper that @alexmilowski and I wrote for #Balisage is about impl DocBook with web components!—@ndw
@ndw @sideshowbarker @alexmilowski can you guys email me the paper?—@marcosc
@ndw obviously, risky stuff on your website. T-mobile UK has a content lock on your website. ;)—@alexmilowski
@alexmilowski I have all kinds of dangerous ideas.—@ndw

In a conversation that started on Thursday at 12:56am

If I ever own a hotel, I will attempt to appoint the rooms with devices that do not have insanely bright LEDs. #unplug #coverwithtowel —@ndw
@ndw If you owned a hotel I bet you'd charge extra for those rooms—@ptsefton
@ptsefton Nah. I wouldn't have any other kind!—@ndw

Thursday at 07:13am

XML Stars, the journal is out! Stories via @ndw @JamieXML @xmlgrrl —@dominixml

Thursday at 09:05am

ORM is a software design pattern popular among people that think it will work this time—@GonzoHacker

Thursday at 10:13am

@scalzi I hope you picked "This timeline sucks"—@ndw

Thursday at 10:16am

Time Warner claims 100mb service for same price as 20mb starting next month. I'm sure that was always the plan b4 Google Fiber arrived.—@ndw

In a conversation that started on Thursday at 02:00pm

Really? Wow. No more donations for you, @Habitat_org —@ndw
@ndw Context?—@tabatkins
@tabatkins that was supposed to be a reply for context *grumble*.—@ndw
@ndw Wow, just read the preamble of @Habitat_org by-laws.—@nsushkin
@ndw what did I miss?—@laurendw

Friday at 06:10am

Friday at 03:12pm

People who methodically type "3-0" seconds into the microwave, instead of a quick "33," cannot compete in today's cutthroat global economy.—@adamisacson

Saturday at 06:42am

Hotel room to gate in 33 minutes. Shame it had to be the 33 minutes that started at 04:00.—@ndw

Saturday at 08:50am

Went to a car boot sale, bought some retro-style nuns' outfits and a Bruce Willis DVD. Old habits, Die Hard.—@gavinbarber

Sunday at 12:44pm

I was upgraded to first class. My seat is encrusted with Swarovski crystals. When you sit, it thanks you in Benedict Cumberbatch's voice.—@scalzi

Sunday at 01:31pm

Hey, weird oven. It only goes to 250 degrees. Good thing I only needed to reheat this. [10 mins later] Celsius Jeff! Celsius!!—@inscitekjeff

Sunday at 02:01pm

@united @Kellblog Well, maybe it's time to change that policy! Passengers care and you do actually know how to reach us!—@ndw

Micah Dubinko (Yahoo!)Prime Number sieve in Scala

There are a number of sieve algorithms that can be used to list prime numbers up to a certain value. I came up with this implementation in Scala. I rather like it, as it makes no use of division or modulus, and only one (explicit) multiplication.

Despite being in Scala, it’s not in a functional style. It uses the awesome mutable BitSet data structure which is very efficient in space and time. It is intrinsically ordered and it allows an iterator, which makes jumping to the next known prime easy. Constructing a BitSet from a for comprehension is also easy with breakOut.

The basic approach is to start with a large BitSet filled with all odd numbers (and 2), then iterate through the BitSet, constructing a new BitSet containing numbers to be crossed off, which is easily done with the &~= (and-not) reassignment method. Since this is a logical bitwise operation, it’s blazing fast. This code takes longer to compile than run on my oldish MacBook Air.

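import scala.collection.mutable.BitSet
import scala.collection.breakOut // available in Scala 2.12 and earlier
println(new java.util.Date())
val top = 200000
// Start with 2 plus every odd number up to top.
val sieve = BitSet(2)
sieve |= (3 to top by 2).map(identity)(breakOut)
// Walk the surviving candidates in ascending order.
val iter = sieve.iteratorFrom(3)
while(iter.hasNext) {
  val n = iter.next()
  // Cross off every multiple of n in one and-not operation.
  sieve &~= (2*n to top by n).map(identity)(breakOut)
}
println(sieve.toIndexedSeq(10000)) // 0-based
println(new java.util.Date())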

As written here, it’s a solution to Euler #7, but it could be made even faster for more general use.

For example

  • I used a hard-coded top value (which is fine when you need to locate all primes up to n). For finding the nth prime, though, the top limit could be calculated
  • I could stop iterating at sqrt(top)
  • I could construct the removal BitSet starting at n*n rather than n*2
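
The three ideas above can be sketched together. This is a minimal illustration in Python rather than Scala, and not the author's code: the names primes_up_to and nth_prime are hypothetical, a bytearray stands in for the BitSet, and the upper limit uses the known bound p_n < n(ln n + ln ln n) for n >= 6, which does not appear in the original post.

```python
import math


def primes_up_to(top):
    """Return all primes <= top using an odd-only sieve."""
    if top < 2:
        return []
    is_prime = bytearray([1]) * (top + 1)
    is_prime[0] = is_prime[1] = 0
    # Cross off the even numbers above 2 up front.
    for even in range(4, top + 1, 2):
        is_prime[even] = 0
    # Stop iterating at sqrt(top): any composite <= top has a factor this small.
    for n in range(3, math.isqrt(top) + 1, 2):
        if is_prime[n]:
            # Start at n*n; smaller multiples were removed by smaller primes.
            # Step by 2*n to skip even multiples, which are already gone.
            for multiple in range(n * n, top + 1, 2 * n):
                is_prime[multiple] = 0
    return [i for i, flag in enumerate(is_prime) if flag]


def nth_prime(n):
    """Return the nth prime (1-based), computing the sieve limit from n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n < 6:
        return [2, 3, 5, 7, 11][n - 1]
    # Assumed bound: p_n < n * (ln n + ln ln n) holds for n >= 6.
    top = int(n * (math.log(n) + math.log(math.log(n)))) + 1
    return primes_up_to(top)[n - 1]


if __name__ == "__main__":
    print(nth_prime(10001))
```

Here nth_prime(10001) corresponds to the Euler #7 case, with the limit derived from n instead of a hard-coded 200000.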

I suspect that spending some time in the profiler could make this even faster. So take this as an example of the power of Scala, and a reminder that sometimes a non-FP solution can be valid too. Does anyone have a FP equivalent to this that doesn’t make my head hurt? :-)


ProgrammableWeb: APIsIDOL OnDemand Delete from Text Index

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Delete from Text Index API allows one to delete content from an index that was previously inserted with the Add to Text Index API. To delete content, one can send a request to the Delete from Text Index API that includes the index name and the reference to the content one desires to delete. After deletion, the content will not be available for further use with additional APIs.
Date Updated: 2014-07-28

ProgrammableWeb: APIsIDOL OnDemand Delete Text Index

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Delete Text Index API allows one to delete an index that was previously created with the Create Text Index API. Indexes come in different formats called "flavors" which vary according to maximum number of documents, scaling performance, and number of field types. To delete an index, one can send a request to the Delete Text Index API including the name of the index one desires to delete.
Date Updated: 2014-07-28

ProgrammableWeb: APIsIDOL OnDemand Create Text Index

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Create Text Index API allows one to create an index with a unique name. Developers can then use this index to classify content using the Add To Text Index API. Indexes come in different formats called "flavors" which vary according to maximum number of documents, scaling performance, and number of field types.
Date Updated: 2014-07-28

ProgrammableWeb: APIsIDOL OnDemand Add to Text Index

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Add to Text Index API organizes incoming content through indexing, making content easy to integrate with other APIs. In order to index content, one must first use the Create Text Index API to create an index. Asynchronous API usage is recommended for large quantity requests.
Date Updated: 2014-07-28

ProgrammableWeb: APIsIDOL OnDemand Image Recognition

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Image Recognition API scans an image to compare and match objects against a database of objects that the user provides. The HP-hosted public database also includes logos of corporations to match against. The Image Recognition API returns the name of the database object, what it represents, as well as information on where the object appears in the image. The API supports all standard image types (TIFF, JPEG, PNG, GIF, BMP and ICO, PBM, PGM, and PPM).
Date Updated: 2014-07-25

ProgrammableWeb: APIsIDOL OnDemand Face Detection

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Face Detection API scans an input image for faces and returns the size and coordinates of a bounding box that surrounds any faces it detects. The API supports all standard image types (TIFF, JPEG, PNG, GIF, BMP and ICO, PBM, PGM, and PPM).
Date Updated: 2014-07-25

ProgrammableWeb: APIsIDOL OnDemand Barcode Recognition

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Barcode Recognition API accepts and scans an image to detect if a barcode is present. If it detects a barcode, the API returns the associated number, as well as coordinates describing the position of the barcode within the frame. The Barcode Recognition API is most accurate with horizontally or vertically oriented barcodes, and accepts a comprehensive repertoire of barcode types.
Date Updated: 2014-07-25

ProgrammableWeb: APIsIDOL OnDemand View Document

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The View Document API uses HP KeyView to extract content and metadata from over 500 different file types. It renders this content into HTML for browser-friendly web viewing, offering an optional highlighting feature for tagging important content.
Date Updated: 2014-07-25

Amazon Web ServicesElastic Load Balancing Connection Timeout Management

When your web browser or your mobile device makes a TCP connection to an Elastic Load Balancer, the connection is used for the request and the response, and then remains open for a short amount of time for possible reuse. This time period is known as the idle timeout for the Load Balancer and is set to 60 seconds. Behind the scenes, Elastic Load Balancing also manages TCP connections to Amazon EC2 instances; these connections also have a 60 second idle timeout.

In most cases, a 60 second timeout is long enough to allow for the potential reuse that I mentioned earlier. However, in some circumstances, different idle timeout values are more appropriate. Some applications can benefit from a longer timeout because they create a connection and leave it open for polling or extended sessions. Other applications tend to have short, non-recurring requests to AWS and the open connection will hardly ever end up being reused.

In order to better support a wide variety of use cases, you can now set the idle timeout for each of your Elastic Load Balancers to any desired value between 1 and 3600 seconds (the default will remain at 60). You can set this value from the command line or through the AWS Management Console.

Here's how to set it from the command line:

$ elb-modify-lb-attributes myTestELB --connection-settings "idletimeout=120" --headers
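
The same setting can also be applied from code. Here is a minimal sketch in Python, assuming the boto3 SDK (not mentioned in this post) and configured AWS credentials. The helper names idle_timeout_attributes and set_idle_timeout are hypothetical; the range check mirrors the 1 to 3600 second limit described above.

```python
def idle_timeout_attributes(seconds):
    """Build the attribute payload for a load balancer idle timeout."""
    # The post gives 1 to 3600 seconds as the allowed range.
    if not 1 <= seconds <= 3600:
        raise ValueError("idle timeout must be between 1 and 3600 seconds")
    return {"ConnectionSettings": {"IdleTimeout": seconds}}


def set_idle_timeout(load_balancer_name, seconds):
    """Apply the idle timeout; assumes boto3 is installed and credentials exist."""
    import boto3  # imported here so the helper above works without boto3

    elb = boto3.client("elb")
    return elb.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )


if __name__ == "__main__":
    # Only builds and prints the payload; no AWS call is made here.
    print(idle_timeout_attributes(120))
```

Calling set_idle_timeout("myTestELB", 120) would be the equivalent of the command above. Validating the value locally is a design choice so that an out-of-range timeout fails before any request is sent.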

And here is how to set it from the AWS Management Console:

This new feature is available now and you can start using it today! Read the documentation to learn more.

-- Jeff;

Norman Walsh (Sun)DocBook and HTML 5(.x)

<article class="essay" id="content" lang="en">

HTML 5(.x) today reminds me a lot of DocBook 1(.x) twenty years ago. That's neither criticism nor compliment, merely observation.

Some random wind blew the HTML 5(.1) “picture” element across my desk today. That led me to a page somewhere that enumerated all of the proposals for HTML 5.x elements in their various stages of standardization.

That's drifted back through my consciousness several times today until finally I realized why. The reason is: it reminds me a whole lot of DocBook twenty years ago. Hear me out.

Twenty years ago, DocBook had a relatively small number of tags. Like HTML of today, it had enough markup to do articles and sections and paragraphs and images and block quotations and a short list of other things.

Twenty years ago, DocBook had a selection of specialized elements in addition to the basic structural elements necessary to capture expository prose. HTML has them too; the specializations are different, but that's not surprising.

DocBook was about interchange so there was a fairly diligent effort undertaken to make sure that the processing expectations of each element were clearly defined. The variety of outputs imagined and the fact that the DocBook community had nearly no appreciable influence over the development of the platforms that would support those outputs meant that there was a certain vagueness, but we have always cared. HTML, the specification, goes to great lengths to describe the processing expectations of…everything, not just proper, valid markup but essentially every sequence of characters. HTML is as much about interoperability of browsers as anything else and so there's tremendous effort undertaken to ensure that interoperability.

DocBook had a relatively large and diverse community of users (some significant fraction of techpubs plus a smattering of other fields of publication). Ok, HTML's relatively large and diverse community (basically everyone everywhere) eclipses the DocBook community the way the population of beetles on the earth dwarfs, say, the human population of Rhode Island, but we're talking relative scales.

An interesting thing about a large and diverse community of users is that they have different interests and different requirements. And if the community is big enough, you wind up with tags that are of interest to “a large group of people” who are still a relatively small group compared to the whole. DocBook certainly has markup that “almost everyone” never uses, and that I sometimes wish we hadn't invented, because various groups of users, perceived at the time to be of significant size, were able to make a compelling case for it.

HTML, like DocBook, has a committee of developers who respond to requests for new tags, proposals for new tags, proposals for changes to tags, proposals for extensions to tags, proposals for the removal of tags, etc. And like any committee, it attempts to establish guidelines and policies and undertakes to serve its community as best it can.

DocBook today has a quite large set of tags. Large enough that lots of folks want subsets. I don't know if HTML has become that large yet, but I bet it will.

HTML's evolution is never going to more than superficially resemble DocBook's evolution. The HTML community has direct and compelling influence on the platform that supports it (or maybe it's the other way around). DocBook still focuses on encoding technical documents; most of the HTML effort seems to be about developing an open, portable application development framework. Nothing wrong with that except to the extent that it seems to marginalize other goals for the web which, no doubt, one could argue it doesn't.

There's nothing profound in these observations, but I look forward to seeing what HTML is like in twenty years. And DocBook too, of course. I wonder if HTML will have twenty year old legacy markup that almost no one uses or if they'll be able to keep things tidier. The fact that HTML is being developed for effectively a single, global platform (or a platform that appears to be that way from most angles) means there's more opportunity for deprecation, I suppose.


Norman Walsh (Sun)The short-form week of 14–20 Jul 2014

<article class="essay" id="content" lang="en">

The week in review, 140 characters at a time. This week, 1 message in 8 conversations. (With 5 favorites.)

This document was created automatically from my archive of my Twitter stream. Due to limitations in the Twitter API and occasional glitches in my archiving system, it may not be 100% complete.

Monday at 09:56am

@ndw Perhaps time to update your bio here?: —@ronhitchens

Tuesday at 12:46pm

New Weird Al video is fantastic: Word Crimes—@codinghorror

Saturday at 08:46pm

RT @geoffarnold: This *should* be the Democratic manifesto. Not holding my breath... —@ndw

Sunday at 07:13am

XML Stars, the journal is out! Stories via @ndw —@dominixml

Sunday at 01:46pm

No, not a Constitutional scholar, just…watch this closely. Church…here. State…there. Separate. Would a picture help?—@StephenKing

Sunday at 02:47pm

Real #standards aren't cool until someone declares you dead, or copies you into JSON and claims they invented it. @jonlehtinen @gkirkpatrick —@JamieXML

Sunday at 03:45pm

No, You Don’t Have a Right to be Forgotten—@dtunkelang

Sunday at 08:28pm

How often do I make chemistry jokes?... Periodically.—@SciencePorn

Norman Walsh (Sun)The short-form week of 7–13 Jul 2014

<article class="essay" id="content" lang="en">

The week in review, 140 characters at a time. This week, 11 messages in 18 conversations. (With 8 favorites.)

This document was created automatically from my archive of my Twitter stream. Due to limitations in the Twitter API and occasional glitches in my archiving system, it may not be 100% complete.

In a conversation that started on Monday at 03:46pm

Yes,,,, and a few other sites are down. Should be back tomorrow.—@ndw
@ndw is down (used in your flickr1.xsl library). Is this related to the other sites?—@erikespana
@ndw I noticed it today. Do you have an estimate of when that file will be back online?—@erikespana
@ndw What about hosting the file on your github account, until your websites are restored?—@erikespana
@erikespana I'll see what I can do today. Not sure what the story is...—@ndw
@erikespana I didn't think that was down. Other sites still are :-(—@ndw

In a conversation that started on Sunday at 06:01pm

Note to self: eat more oysters.—@ndw
@ndw Whilst avoiding the bad ones?—@dpawson

Monday at 10:50am

RT @BadAstronomer: Dear media: Stop putting anti-science reality-denying crackpots on your news shows. —@ndw

Tuesday at 07:14am

XML Stars, the journal is out! Stories via @JamieXML @ndw —@dominixml

Tuesday at 06:23pm

Only an engineer would to this —@SciencePorn

Tuesday at 06:51pm

Looking down on the crater from Haleakalā. —@dtunkelang

Tuesday at 10:51pm

Wednesday at 07:48am

The optimist says the glass is half full. The pessimist says half empty. The scientist says, "why isn't it full of coffee?"—@SciencePorn

In a conversation that started on Wednesday at 08:16am

@peteaven New phone?—@ndw
@ndw yessir. 5s. having fun with slo-mo. all vids are much more dramatic now. oh, and it's a phone too it seems!—@peteaven

Wednesday at 08:19am

RT @geoffarnold: Money quote: "This is what happens in a surveillance state: to inoculate themselves against suspicion, people... http://t.… —@ndw

Wednesday at 04:25pm

Got a Sonos:1 for a small shelf. Worried that power cord might be a problem; discovered cord elegantly recessed into the base. #designwin —@ndw

Wednesday at 08:53pm

You're a ghost driving a meat coated skeleton made from stardust, what do you have to be scared of? ~@porkbeard —@pickover

In a conversation that started on Wednesday at 09:14pm

All is right with the world. Or at least my little corner of it. Weblog and ancillary sites back online. Sorry about that folks.—@ndw
@ndw is not working for me, is that in your corner?—@apb1704
@apb1704 Don't think that was effected and seems up for me. Still having trouble?—@ndw
@ndw Up now, maybe was local problem for me. thanks.—@apb1704
@ndw is still down for me. Any idea when that'll be online again?—@erikespana

Thursday at 07:14am

XML Stars, the journal is out! Stories via @RealMichaelKay @ndw @JamieXML —@dominixml

Thursday at 11:56am

“We have to do the HTML5 blockquote with seven nested tags inside it because, semantics.”—@zeldman

Thursday at 11:47pm

American out of office: "I'm on vacation but will check email hourly. Reach me on my mobile." European: "I am unavailable until September."—@shanselman

Friday at 10:51pm

Sunday at 10:50am

Objective: custard. Achievement: very sweet, slightly scrambled eggs. #dangit —@ndw

ProgrammableWeb: APIsIDOL OnDemand Text Extraction

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Text Extraction API accepts over 500 different document formats. Either upload the file, or link to it using a URL or reference. The API returns scraped content as well as associated metadata that can be used by other APIs for further analysis.
Date Updated: 2014-07-24

ProgrammableWeb: APIsIDOL OnDemand Store Object

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Store Object API takes a file, URL, or reference, and stores the file as a reference for other APIs to receive and use. Referencing can be a helpful alternative to sending entire documents multiple times.
Date Updated: 2014-07-24

ProgrammableWeb: APIsIDOL OnDemand OCR Document

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The OCR Document API extracts text from an input image. It also returns metadata about the organization and sizing of the text itself. The API is most accurate when text is superimposed over a high-contrast background.
Date Updated: 2014-07-24

ProgrammableWeb: APIsIDOL OnDemand Expand Container

IDOL OnDemand by HP offers an array of data processing APIs for format conversion, image analysis, indexing, search, and textual analysis. Preview capabilities exist to view how many of these RESTful APIs behave in action. The Expand Container API extracts content from parent containers such as ZIP, TAR, or PST files. It then stores this information in an accessible list format for other APIs to interact with.
Date Updated: 2014-07-24

ProgrammableWeb: APIsXPS2PDF Web

XPS2PDF is a web-based tool for converting documents from .XPS or .OXPS format into .PDF format. The XPS2PDF Web API allows users to convert documents directly or asynchronously, get a list of recent conversion jobs, get the status of a specific job, or retrieve the quota limits and current usage for a user's account.
Date Updated: 2014-07-24

ProgrammableWeb: APIsGlitxt

Glitxt is designed to let users encode text in an image in a way that conceals the text but makes it obvious to the human eye that something is hidden in the image. This encryption method makes it difficult for automated data miners to extract the encrypted data, but it is not an advanced encryption method. The Glitxt API allows users to encode text in an image, decode an image to get the text, or send a ping request to check if the API is working.
Date Updated: 2014-07-23

ProgrammableWeb: APIsPemilu Election Results

The Pemilu Election Results API allows users to retrieve information on Indonesia's 2014 election results. These results can be sorted by party, province, electoral district, legislative body, and election year. The Election Results API is part of API Pemilu, a collection of APIs that provide Indonesia's approximately 187 million voters with important election information.
Date Updated: 2014-07-23

Jeremy Keith (Adactio)Adactibots

I post a few links on this site every day—around 4 or 5, on average. If you subscribe to the RSS feed, then you’ll know about them (I also push them to Delicious but I don’t recommend relying on that).

If you don’t use RSS—you lawnoffgetting youngster, you—then you’d pretty much have to actually visit my website to see what I’m linking to. How quaint!

Here, let me throw you a bone in the shape of a Twitter bot. You can now follow @adactioLinks.

I made a little If This, Then That recipe which will ping the RSS feed and update the Twitter account whenever there’s a new link.

I’ve done the same thing for my journal (or “blog”, short for “weblog”, if you will). You can either subscribe to the journal’s RSS feed or decide that that’s far too much hassle, and just follow @adactioJournal on Twitter instead.

The journal postings are far less frequent than the links. But I still figured I’d provide a separate, automated Twitter account because I do not want to be that guy saying “In case you missed it earlier…” from my human account …although technically, even my non-bot account is auto-generated: my status updates start life as notes on—Twitter just gets a copy.

There’s also @adactioArticles for longer-form articles and talk transcripts but that’s very, very infrequent—just a few posts a year.

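So these Twitter accounts correspond to different kinds of posts on my site, in decreasing order of frequency: links (@adactioLinks), journal entries (@adactioJournal), and articles (@adactioArticles).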

Amazon Web ServicesBig Data Update - New Blog and New Web-Based Training

The topic of big data comes up almost every time I meet with current or potential AWS customers. They want to store, process, and extract meaning from data sets that seemingly grow in size with every meeting.

In order to help our customers to understand the full spectrum of AWS resources that are available for use on their big data problems, we are introducing two new resources -- a new AWS Big Data Blog and web-based training on Big Data Technology Fundamentals.

AWS Big Data Blog
The AWS Big Data Blog is a way for data scientists and developers to learn big data best practices, discover which managed AWS Big Data services are the best fit for their use case, and help them get started on AWS big data services. Our goal is to make this the hub for developers to discover new ways to collect, store, clean, process, and visualize data at any scale.

Readers will find short tutorials with code samples, case studies that demonstrate the unique benefits of doing big data on AWS, new feature announcements, partner- and customer-generated demos and tutorials, and tips and best practices for using AWS big data services.

The first two posts on the blog show you how to Build a Recommender with Apache Mahout on Amazon Elastic MapReduce and how to Power Gaming Applications with Amazon DynamoDB.

Big Data Training
If you are looking for a structured way to learn about the tools, techniques, and options available to you as you explore big data, our new web-based Big Data Technology Fundamentals course should be of interest to you.

You should plan to spend about three hours going through this course. You will first learn how to identify common tools and technologies that can be used to create big data solutions. Then you will gain an understanding of the MapReduce framework, including the map, shuffle and sort, and reduce components. Finally, you will learn how to use the Pig and Hive programming frameworks to analyze and query large amounts of data.
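To make the map, shuffle and sort, and reduce components a little more concrete, here is a minimal single-process sketch in Python. It is not taken from the course and it is not how Hadoop or Amazon Elastic MapReduce is implemented; the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `word_count`) and the sample documents are assumptions made up for illustration.

```python
from collections import defaultdict


def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1


def shuffle_and_sort(pairs):
    """Shuffle and sort: group the values by key, then order the keys."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped}


def word_count(documents):
    """Run the three phases in order on an in-memory list of strings."""
    return reduce_phase(shuffle_and_sort(map_phase(documents)))


if __name__ == "__main__":
    # Hypothetical input; a real job would read from distributed storage.
    docs = ["big data on AWS", "big data training"]
    print(word_count(docs))
```

In a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; the sketch only shows how the three phases hand data to one another.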

You will need a working knowledge of programming in Java, C#, or a similar language in order to fully benefit from this training course.

The web-based course is offered at no charge, and can be used on its own or to prepare for our instructor-led Big Data on AWS course.

-- Jeff;

Jeremy Keith (Adactio)Indie Web Camp Brighton

If you’re coming to this year’s dConstruct here in Brighton on September 5th—and you really, really should—then consider sticking around for the weekend.

Not only will there be the fantastic annual Maker Faire on Saturday, September 6th, but there’s also an Indie Web Camp happening at 68 Middle Street on the Saturday and Sunday.

We had an Indie Web Camp right after last year’s dConstruct and it was really good fun …and very productive to boot. The format works really well: one day of discussions and brainstorming, and one day of hacking, designing, and building.

So if you find yourself agreeing with the design principles of the Indie Web, be sure to come along. Add yourself to the list of attendees.

If you’re coming from outside Brighton for the dConstruct/Indie Web weekend, take a look at the dConstruct page on AirBnB for some accommodation ideas at very reasonable rates.

Speaking of reasonable rates… just between you and me, I’ve created a discount code for any Indie Web Campers who are coming to dConstruct. Use the discount code “indieweb” to shave £25 off the ticket price (bringing it down to £125 + VAT). And trust me, you do not want to miss this year’s dConstruct.

It’s just a little over six weeks until the best weekend in Brighton. I hope I’ll see you here then.

Shelley Powers (Burningbird)Responding to Charity Navigator's DA on the Humane Society of the United States

circus elephants on parade

Courtesy of the Boston Public Library, Leslie Jones Collection.

I was sent a link to a story and asked if it was true. The story noted that Charity Navigator, the charity watchdog group, had attached a Donor Advisory to the Humane Society of the United States' listing, specifically because of the lawsuits related to the Ringling Brothers circus.

I was astonished. A donor advisory because of a single Endangered Species Act lawsuit? Many nonprofits are involved in lawsuits as they work to achieve the goals that are part of their underlying mission. I have a hefty annual PACER (federal court document system) fee because of the documents I download for the numerous environmental and animal welfare cases I follow—and I'm only following a tiny fraction of the cases I'd really like to follow.

Was the Donor Advisory given because the animal welfare groups lost the case? I would hope not, because penalizing nonprofits for taking a chance in court would have a chilling effect on their ability to do their work.

Was the Advisory given, then, because they also entered into a settlement for attorney fees? That seems to be more likely, especially considering the hefty size of the attorney fee settlement ($15 million). However, that a single incident related to a single court case would override 60 years of history in the Charity Navigator's decision seemed both capricious and arbitrary. If civil lawsuits were not part of the arsenal of the organization, or if HSUS was in the habit of losing these cases and having to pay hefty attorney fees on a regular basis, then I think it would give most people pause before donating—but a single instance? Frankly, my first reaction was, "Well, aren't you the precious."

Charity Navigator also referenced the fact that Ringling Brothers filed a counter-lawsuit against the animal welfare organizations based on RICO—the Racketeering law. The reference to RICO would sound serious, were it not for the fact that, because of the RICO law's overly loose design and the Supreme Court's over-reliance on the "intent" of Congress when passing the law, RICO's purpose has been badly muddied over the years. Now, rather than being used to go after the Mafia or sophisticated white-collar criminal networks, RICO has become a highly tempting tool in corporate America's tool belt, especially after the recent findings in the Chevron RICO lawsuit related to the earlier lawsuit brought by poor Ecuadorians against the oil company for environmental damage to their lands.

Regardless, neither lawsuit—the original Endangered Species Act lawsuit brought by the animal welfare groups (not including HSUS), or the RICO case—ever reached a decision on the merits. The former was dismissed because of lack of standing, and the second never went to trial. As part of the attorney fee settlement, Feld Entertainment (parent company for the circus) agreed to dismiss the RICO lawsuit. The fact that the corporation filed a complaint should be seen as irrelevant and not figure into any agency's determination of whether the organizations involved are sound or not. Not unless Charity Navigator believes that all one has to do is file a complaint in court and it's automatically taken as true.

Charity Navigator noted the reasons why the Judge dismissed the ESA case for lack of standing, though the agency's understanding of the legal documents and of the associated timeline of events is equally confused and inaccurate. For one, the agency stated that Feld filed the RICO lawsuit after the ESA case was decided. In fact, Feld originally filed the RICO lawsuit in 2007, when Judge Sullivan denied the company's request to amend its answer and assert a RICO counter-claim. The new lawsuit was stayed until the ESA case was decided in 2009, and Feld amended its original complaint in 2010, when the RICO case started up again.

I wanted to pull out part of the memorandum Judge Sullivan wrote in 2007 when he rejected Feld Entertainment's request to amend their answer (leading to the RICO lawsuit). It relates to Feld's implication that the animal welfare groups were involved in a complex and corrupt scheme to pay their co-plaintiff, Tom Rider, a scheme that the company lawyers claimed they didn't know about until 2006.

Finally, the Court cannot ignore the fact that defendant has been aware that plaintiff Tom Rider has been receiving payments from the plaintiff organizations for more than two years. Although defendant alleges an “elaborate cover-up” that prevented it from becoming “fully aware of the extent, mechanics, and purpose of the payment scheme until at least June 30, 2006,” Def.’s Mot. to Amend at 4, such a statement ignores the evidence in this case that was available to defendant before June 30, 2006 and does not excuse defendant’s delay from June 30 forward. Plaintiffs’ counsel admitted in open court on September 16, 2005 that the plaintiff organizations provided grants to Tom Rider to “speak out about what really happened” when he worked at the circus.

In other words, Feld's lawyers found out about the "elaborate scheme" to fund Tom Rider, because the animal welfare groups mentioned funding Tom Rider during a court hearing in 2005.

As for that funding, it is true that the animal welfare groups paid Tom Rider about $190,000 over close to ten years. However, what isn't noted is that some of that "money" wasn't money at all. Rider was given a computer, a cell phone to keep in contact with the groups, a used van so he could travel around the country speaking out about the trial and his experiences with the circus, and various other goods. The groups also provided IRS forms for years 2000 through 2006 for Rider. When I added up the income for these years, it came to $152,176.00. However, after all of Tom Rider's expenses were deducted, over the seven years he "took home" a total of $12,582, for an average of $149.78 a month. That's to pay for all of his personal expenses—including a cheap dark blue polyester suit and equally cheap white shirt and tie he wore to the trial. (Tom Rider must have stood out for the plainness of his garb when next to Feld Entertainment's $825.00 an hour DC lawyers during the trial.)

Among the small selection of oddly one-sided court documents that Charity Navigator linked, another was the Judge Sullivan decision denying the animal welfare groups' motion to dismiss the RICO case. What stands out in this document is a reference to the original Judge Sullivan decision, specifically a comment about the Rider funding:

The Court further found that the ESA plaintiffs had been “less than forthcoming about the extent of the payments to Mr. Rider.”

I compare this statement with Sullivan's statement I quoted earlier, wherein Sullivan denied Feld's request to amend its answer because of the supposed underhanded and secret funding—an assertion that Sullivan rejected in 2007. The newer, contradictory 2009 statement was just one of the many inconsistencies in Judge Sullivan's decisions over the years related to these two cases.

But the last issue that Charity Navigator seemed to fixate on was Feld's attempt to get confidential donor lists from the animal welfare groups. I've written about this request, and my great disappointment in Judge Facciola's decision to grant the request.

Nothing will ever convince me this wasn't a bad decision, with the potential to set an extremely bad precedent. Even when the discovery was limited primarily to those people who attended a single event, it's appalling that confidential donor lists can be given to a corporation that represents everything the donors loathe and disdain—and a corporation with a particularly bad record when it comes to dealing with animal welfare groups and other people—not to mention its abysmal record when it comes to its animal acts.

The animal welfare groups settled because when you have a billionaire throwing $825.00 an hour lawyers at a case, and said billionaire doesn't care how much it costs to win, it didn't make sense to continue fighting a fight that was already stacked against them. When Judge Sullivan ruled on the ESA case, he should have recused himself in the RICO case, because to rule favorably for the animal welfare groups in the RICO case would be to say he was inherently mistaken in many of his assertions in the ESA case. When he turned the case over to the Magistrate Judge, Judge Facciola should have exercised independent thinking rather than just continue to parrot Judge Sullivan. And the groups would continue to spend way too much money fighting a lawsuit that the other side would deliberately stretch out as long as it possibly could.

Top all that with the threat to the anonymity of their donors, and the groups settled. Point of fact, if they settled specifically to protect their donors, more power to them. They should be commended for doing so, not punished.

What's ironic is that in my original posts on the donor list request, I noted that if the animal welfare groups had to give these lists out, it would most likely affect their ratings on sites such as Charity Navigator. Never in my wildest dreams did I expect that Charity Navigator would give a donor advisory to the groups just because a judge ordered that the list be provided, not because it actually was provided. The groups had planned on appealing this ruling before they settled, and frankly, I think they had a good chance of winning the appeal. But the very fact that a no longer existing possibility of an event is enough to trigger a donor advisory leads me to wonder how many more innocent nonprofits will be labeled with a donor advisory just because someone sent in a newspaper article about the possibility of an event.

Kenneth Feld's $825.00 an hour lead attorney, John Simpson, was recently interviewed for a legal publication. In it, he spoke about the donor list:

“They didn't want a situation where I’m taking the deposition of some donor asking — if you knew they were going to take this money to pay a witness, would you have given this donation?” Simpson said. “I don’t think they wanted that kind of discovery to take place. Some people might have made the donation anyway. But most of these people would have said — no, I wouldn't have done that. And you would have been in the middle of their donor relations and potentially cutting off their donations in the future.”

In actuality, the one fundraiser that was at issue in the donor list request did specifically state that the money was for the lawsuit, and other requests for funds specifically stated the money was for Tom Rider's media campaign. I, for one, was concerned about what would happen to individuals put into an intimidating situation by a high-priced, DC powerhouse attorney. Mr. Simpson has a way of asking questions in depositions, and then subsequently paraphrasing the responses so that even the most innocent and naive utterance seems dark and dastardly. It was unfortunate that Judge Sullivan allowed his scarcely concealed disdain for Tom Rider to lead him to basically accept whatever Simpson said, even though the animal welfare groups presented solid arguments in defense.

Lastly, Charity Navigator linked an article in the Washington Examiner, as if this was further evidence of good reasoning for the donor advisory. Might as well link Fox News as a character reference for the EPA, or The Daily Caller as a reasoned source of news for President Obama.

Just because something shows up in a publication online does not make what's stated true, or even reliable opinion. That a charity watchdog would link a publication known for its political and social bias as some form of justification for a decision only undermines its own credibility. Yes, the HSUS and the FFA are involved in lawsuits with a couple of insurance companies regarding their liability coverage. As noted, though, it's common for insurance companies to deny claims of liability when it comes to litigation fees. Kenneth Feld, himself, is involved in a lawsuit with his insurance company about it not wanting to pay those $825.00 an hour fees for Feld's attorneys in the lawsuit with his sister.

However, there were several insurance companies involved with the groups and this court case. One way or another most, if not all, of the attorney fee settlement will be paid by one or more insurance companies.

An interesting side note about the insurance company lawsuits is the fact that the Humane Society's lawsuit is being handled in federal court, while the Fund For Animals lawsuit is being managed in the Maryland state court system. This disproves one Feld Entertainment claim: that HSUS and FFA are one organization (a claim used to justify Feld's dragging HSUS into the lawsuit). The reason for the lawsuit split is that FFA is a Maryland corporation, while HSUS is not, and the insurance company was able to argue that it could move the HSUS case to the federal level because of jurisdictional diversity. Nothing more succinctly demonstrates that FFA and HSUS are not the same corporate organization. Yet HSUS has received a donor advisory for a lawsuit it was never involved in. FFA was involved in the ESA suit, but not HSUS.

There is so much to this case, too much to cover in a single writing, but I did want to touch on the major points given by Charity Navigator in its donor advisory. Will the advisory hurt an organization like HSUS? Unlikely. The Humane Society of the United States is one of the oldest, most established, and largest animal welfare organizations in the country. Its charity ratings to this point have been excellent. A reputable organization like the BBB lists it as an accredited charity, and one only has to do a quick search online to see that it is currently involved in many different animal welfare efforts across the country—from rescuing animals in North Carolina to defending American burros. Whether or not people donate to the organization, it won't be because of Charity Navigator's listing, because most people wouldn't need Charity Navigator to learn more about the HSUS.

But such donor advisories could negatively affect lesser-known, smaller charities. I hope that when Charity Navigator issues such a drastic warning from this day on, it does so based on a foundation that is a little less arbitrary, and much less capricious, than the one it used for HSUS and the other animal welfare groups involved in this court case.

Amazon Web ServicesAWS Week in Review - July 14, 2014

Let's take a quick look at what happened in AWS-land last week:

Monday, July 14
Tuesday, July 15
Wednesday, July 16
Thursday, July 17
Friday, July 18

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

-- Jeff;


Developers looking for a different yet strong password protection API may find Passwd a good fit for app creation. The API site classifies password security into three categories: secure, super secure, and insanely secure. Reusing the same access code across multiple sites is no longer adequate, which is why a good combination of letters and numbers, from 8 to 62 characters long, can help prevent unwelcome access by hackers. The API supports JSON and JSONP, with name, type, and values parameters. Simple examples and immediate support are available for developers.
Date Updated: 2014-07-21

ProgrammableWeb: APIsFoodCare

When consumers have an ailment, they usually go to the doctor. Many times doctors recommend dietary changes in order to preserve health. FoodCare offers personalized nutrition for users from the comfort of a mobile phone. Unlike other apps, FoodCare uses a community based approach, where individuals, restaurants, schools, senior communities and hospitals can benefit from the specialized information from registered dieticians and nutrition educators. This is where developers have an ample opportunity to create apps with the FoodCare API. To get started, developers need to apply for a key. Various features in the API documentation site include cuisine types, health conditions, dishes and restaurants for a more complete application development.
Date Updated: 2014-07-21

ProgrammableWeb: APIsWahoo Fitness

At a time when exercise activity is connected to our mobile phones, Wahoo Fitness offers an API so developers can create useful fitness software for iPhone devices, with the main goal of monitoring physical health. Features of the API include multifunction capability, program navigation, economy, memory storage capacity, and ease of internet upload. One benefit of the Wahoo Fitness API is the wide variety of settings in which developers can implement an application: the API can not only display physical activity from heart rate monitors, but also register sensors in scales, bike computers, and bike trainers. Support is available for developers interested in reshaping the way fitness works, either as independent developers or as affiliated partners with Wahoo Fitness.
Date Updated: 2014-07-21

ProgrammableWeb: APIsInboxApp

Inbox is a tool that brings all mail providers together in a single, easy-to-use web service. It streamlines the email experience by giving added control over the collection and cataloguing process. The Inbox RESTful API allows one to access, modify, and filter incoming mail from popular providers like Gmail or Microsoft Exchange, so developers do not have to deal directly with older protocols and formats such as IMAP or MIME. The Inbox API uses Threads and Tags to delineate and classify objects, and is enterprise compatible.
Date Updated: 2014-07-21

ProgrammableWeb: APIsSkynet

Skynet is an API for the Internet of Things. With a collaborative approach, Skynet offers an API that helps developers make drones dance, turn on Hue light bulbs, and control electronic devices like Belkin WeMo. This hybrid cloud API also sends and receives messages to and from devices in order to monitor sensor activity. The main value of Skynet is a Chrome app, already available in the market, named NodeBlu. This app lets developers experiment by dragging, dropping, and writing on their own terms. Features of this API include status, devices, messages, events, and data. The site provides REST examples, a demo, and social media channels for additional support.
Date Updated: 2014-07-21

ProgrammableWeb: APIsImgflip

Imgflip is an SFW image generator targeted at youth culture. On this site, teenagers can caption memes, make a GIF from video, make a GIF from images, and even make a pie chart. Using the API, developers can reach a younger audience through its internet habits. The API is free at the moment, but a paid version can be arranged when developers contact Imgflip directly. The Imgflip API uses REST and JSON interfaces. On the documentation site, developers can see how to use the API to get memes and to caption images, with examples of both success and failure responses. The Imgflip API can work for developers interested in youth culture, and it can generate revenue when both parties, Imgflip and the developer team, agree on terms.
Date Updated: 2014-07-21

ProgrammableWeb: is the API in beta that developers might find beneficial to create activities in a local community. This API aims to provide an intuitive approach for consumers who tend to choose events targeted to their preferences. From the administrator perspective, the API serves as an organizer to spread the word of an event, track attendees, manage the activity and receive feedback for improvement. Developers in public relations industries who manage big events can benefit from this API, since according to the site the interface keeps concerts, marathons and shows digitally organized from beginning to end. API offers the features of user, search, event, stream, avatar and notification. Even though the site publishes worldwide events, technical support is available where developers can submit tickets for a personalized mashup creation.
Date Updated: 2014-07-21

Bob DuCharme (Innodata Isogen)When did linking begin?

Pointing somewhere with a dereferenceable address, in the twelfth (or maybe fifth) century.

University of Bologna woodcut

As I have once before, I'm republishing an entry from an O'Reilly blog I had from 2003 to 2005 on topics related to linking. I've been reading up on early concepts of metadata lately—I particularly recommend Ann Blair's Too Much to Know: Managing Scholarly Information before the Modern Age—and have recently found another interesting reference to the "Regulae Iuris" book mentioned below. When I wrote this, I was more interested in hypertext issues, and if I was going to change anything to update this piece, I would change the word "traverse" to "dereference," but all the points are still meaningful.

Works about linking often claim that it's been around for thousands of years, and then they give examples that are no more than a few centuries old. I can only find one reference to something more than a thousand years old that qualifies as a link: Peter Stein's 1966 work "Regulae Iuris: from Juristic Rules to Legal Maxims" describes some late fifth-century lecture notes on a commentary by the legal scholar Ulpian. The notes mention that confirmation of a particular point can be found in the Regulae ("Rules") of the third-century Roman jurist (and student of Ulpian) Modestinus, "seventeen regulae from the end, in the regula beginning 'Dotis'...". The citation's explicit identification of the point in the cited work where the material could be found makes it the earliest link that I know of.

Other than Stein's tantalizing example, all of my research points to the 12th century as the beginning of linking. In a 1938 work on the medieval scholars of Bologna, Italy, who studied what remained of ancient Roman law, Hermann Kantorowicz wrote that in "the eleventh century...titles of law books are cited without indicating the passage, books of the Code are numbered, and the name of the law book is considered a sufficient reference." He uses this to build his argument that a particular work described in his essay is from the eleventh century and not the twelfth, as other scholars had argued. Apparently, it was common knowledge in Kantorowicz's field that twelfth-century Bolognese scholars would reference a written law using the name of the law book, the rubric heading, and the first few words of the law itself. (Referencing of particular chapters and sections by their first few words was common at the time; the use of chapter, section, and page numbers didn't begin until the following century.)

Italian legal scholars trying to organize and make sense of the massive amounts of accumulated Roman law contributed a great deal to the mechanics of the cross-referencing that provides many of the earliest examples of linking. The medievalist husband and wife team Richard and Mary Rouse also found some in their research into evolving scholarship techniques in the great universities of England and France (that is, Oxford, Cambridge, and the Sorbonne) and they described Gilbert of Poitiers's innovative twelfth-century mechanism for addressing specific parts of his work on the psalms: he added a selection of Greek letters and other symbols down the side of each page to identify concepts such as the Penitential Psalms or the Passion and Resurrection. If you found the symbol for the Passion and Resurrection in the margin of Psalm 2 with a little 8 next to it (actually, a little "viii"—they weren't using Arabic numerals quite yet), it would tell you that the next discussion of this concept appeared in Psalm 8. Once you found the same symbol on one of the eighth psalm's pages, you might find a little "xii" with it to show that the next discussion of the same concept was in Psalm 12. This addressing system made it possible for someone preparing a sermon on the Passion and Resurrection to easily find the relevant material in the Psalms. (In fact, the demand for aids to sermon preparation was one of the main forces in the development of new research tools, as clergymen were encouraged to go out and compete with the burgeoning heretic movements for the hearts and minds of the people.)
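Readers who think in code may recognize Gilbert's margin notes as something close to a linked list: each note carries the address of the next node for the same concept. The following is only an illustrative sketch in Python, not a reconstruction of the manuscript. The dictionary layout, the names `MARGIN_NOTES`, `roman_to_int`, and `follow_concept`, and the idea of keying notes by psalm number are all assumptions; only the 2, viii, xii chain comes from the description above.

```python
# Values of the lowercase Roman numerals needed for psalm numbers (1 to 150).
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}

# Hypothetical margin notes: (psalm, concept) -> numeral written beside the symbol.
MARGIN_NOTES = {
    (2, "Passion and Resurrection"): "viii",
    (8, "Passion and Resurrection"): "xii",
}


def roman_to_int(numeral):
    """Convert a lowercase Roman numeral such as 'viii' to an integer."""
    total = 0
    for index, char in enumerate(numeral):
        value = ROMAN_VALUES[char]
        next_value = ROMAN_VALUES[numeral[index + 1]] if index + 1 < len(numeral) else 0
        # A smaller numeral written before a larger one is subtracted (ix = 9).
        total += -value if next_value > value else value
    return total


def follow_concept(start_psalm, concept, notes):
    """Dereference margin notes one after another, returning the psalms visited."""
    visited = [start_psalm]
    current = start_psalm
    while (current, concept) in notes:
        current = roman_to_int(notes[(current, concept)])
        if current in visited:  # guard against a circular chain of notes
            break
        visited.append(current)
    return visited


if __name__ == "__main__":
    print(follow_concept(2, "Passion and Resurrection", MARGIN_NOTES))
```

The point of the sketch is that each note is a dereferenceable address: given only the starting psalm and the symbol, a reader (or a loop) can reach every later discussion of the concept without knowing the whole work by heart.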

The use of information addressing systems really got rolling in the thirteenth-century English and French universities, as scholarly monks developed concordances, subject indexes, and page numbers for both Christian religious works and the classic ancient Greek works that they learned about from their contact with the Arabic world. In fact, this is where Arabic numerals start to appear in Europe; page numbering was one of the early drivers for their adoption.

Quoting of one work by another was certainly around long before the twelfth century, but if an author doesn't identify an address for his source, his reference can't be traversed, so it's not really a link. Before the twelfth century, religious works had a long tradition of quoting and discussing other works, but in many traditions (for example, Islam, Theravada Buddhism, and Vedic Hinduism) memorization of complete religious works was so common that telling someone where to look within a work was unnecessary. If one Muslim scholar said to another "In the words of the Prophet..." he didn't need to name the sura of the Qur'an that the quoted words came from; he could assume that his listener already knew. Describing such allusions as "links" adds heft to claims that linking is thousands of years old, but a link that doesn't provide an address for its destination can't be traversed, and a link that can't be traversed isn't much of a link. And, such claims diminish the tremendous achievements of the 12th-century scholars who developed new techniques to navigate the accumulating amounts of recorded information they were studying.

Please add any comments to this Google+ post.

Doug Schepers (Vectoreal)You’re drunk FCC, go home

I just chimed in to the FCC to request that they stop the merger of Comcast and Time-Warner Cable. I don’t know if my voice will make a difference, but I do know that saying nothing will definitely not make a difference.

Here was my statement to the FCC (flawed, I’m sure, but I hope the intent and sentiment are conveyed):

Allowing the merger of Comcast and Time-Warner Cable will dramatically decrease consumer benefits and choice.

Some mergers can be good, allowing struggling companies to reduce losses; in this case, neither Comcast nor Time-Warner Cable is in a situation that needs this merger for financial stability; both companies are currently thriving in the marketplace.

Innovation and an open market for goods and services is in the best interest of the American people. This was clearly shown when the Bell System was broken up January 8, leading to the emergence of advanced competitive services, including cellular phone service, and lower prices. The FCC should take that as a model, and decrease the monopolistic merger of competitors, which decreases this innovation, price competition, and customer choice. Customer service is already notoriously poor at both companies, and decreasing customer choice is likely to make it harder for customers to receive adequate service.

Without competition, Internet providers have little incentive to provide either improved service or lower prices. The US is already widely regarded as having relatively expensive and slow Internet service compared to other industrial nations, and this merger threatens to make that worse.

In addition to the loss of benefits to the consumer, this merger threatens American jobs. When a merger occurs, service departments also merge, and workers lose their jobs. This is especially true when the mergers are in similar industries; some studies have shown an average of 19% job loss, far above the norm of 7.9% when the industries are unrelated. Comcast currently employs 136,000 people; Time-Warner Cable currently employs 51,600 people; if the average job loss takes place, that could mean approximately 35,644 jobs lost, or more conservatively 14,820 jobs, in a still-struggling employment market; many of these will be unskilled labor, which is even harder to resolve. While no laws in the US take into account the effect of mergers on job loss, this is still a factor that can be taken into account by the FCC; laws are only necessary when systemic problems arise in the behavior of key industry players and regulators, and allowing this merger could necessitate the creation of a law that would otherwise be avoided.

Please take the necessary steps to block this merger.

If you are a US citizen, you have until August 25th, 2014 to file a comment. The FCC seems to have gone out of its way to make this difficult, so here are some step-by-step instructions:

  1. Fill out the Free Press petition first just in case. Then, if you want to register your opposition independently…
  2. Go to the FCC Electronic Comment Filing System page
  3. Enter “14-57” in the Proceeding Number field; you’ll get no immediate confirmation, but this is the code for the “Applications of Comcast Corporation and Time Warner Cable Inc. for Consent to Assign or Transfer Control of Licenses and Applications”. (Note: this is not arcane at all. That’s just an illusion.)
  4. Fill in all required personal information
  5. Ensure that the Type of Filing field is set to “Comment” (the default)
  6. Write a text document explaining why this is such a bad idea; crib mine if you like, or find a much better rationale, but be sure to be clear in your opposition (or support, if you’re a masochist).
  7. Upload your document using the Choose File button. (That’s right, you can’t just leave a comment in a text area, you have to write a separate document. The FCC seems to accept at least .txt and .doc files.) Add your optional description of the file in the Custom Description, so they know your sentiment even if they don’t open your file (which is pretty likely); I labeled mine “Block Comcast-TWC merger”.

Yay! You live in an arguably democratic country!

Amazon Web ServicesAWS Support API Update - Attachment Creation and Lightweight Monitoring

The AWS Support API provides you with programmatic access to your support cases and to the AWS Trusted Advisor. Today we are extending the API in order to give you more control over the cases that you create and a new, lightweight way to access information about your cases. The examples in this post make use of the AWS SDK for Java.

Creating Attachments for Support Cases
When you create a support case, you may want to include additional information along with the case. Perhaps you want to attach some sample code, a protocol trace, or some screen shots. With today's release you can now create, manage, and use attachments programmatically.

My colleague Kapil Sharma provided me with some sample Java code to show you how to do this. Let's walk through it. The first step is to create an Attachment from a file (File1 in this case):

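// Read File1 from the current directory and use its bytes as the attachment data
Attachment attachment = new Attachment();
attachment.setData(ByteBuffer.wrap(Files.readAllBytes(FileSystems.getDefault().getPath("", "File1"))));
attachment.setFileName("File1");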
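The file-reading half of this step needs nothing beyond the JDK, so it can be tried on its own before any SDK call is made. Here is a minimal sketch, assuming a helper class named AttachmentBytes (a hypothetical name, not part of the AWS SDK) that turns a path into the ByteBuffer that setData expects:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; it is not part of the AWS SDK.
final class AttachmentBytes {
    private AttachmentBytes() {}

    // Reads the whole file into memory and wraps the bytes in a ByteBuffer.
    static ByteBuffer load(Path file) {
        if (!Files.isRegularFile(file)) {
            throw new IllegalArgumentException("Not a regular file: " + file);
        }
        try {
            return ByteBuffer.wrap(Files.readAllBytes(file));
        } catch (IOException e) {
            // Surface I/O problems without forcing callers to handle a checked exception.
            throw new UncheckedIOException(e);
        }
    }
}
```

The result could then be handed to the attachment with something like attachment.setData(AttachmentBytes.load(Paths.get("File1"))). Because the entire file is held in memory, this approach suits the small files (sample code, traces, screen shots) mentioned above.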

Then you create a List of the attachments for the case:

List<Attachment> attachmentSet = new ArrayList<Attachment>();
attachmentSet.add(attachment);

And upload the attachments:

AddAttachmentsToSetRequest addAttachmentsToSetRequest = new AddAttachmentsToSetRequest();
addAttachmentsToSetRequest.setAttachments(attachmentSet);
AddAttachmentsToSetResult addAttachmentsToSetResult = client.addAttachmentsToSet(addAttachmentsToSetRequest);

With the attachment or attachments uploaded, you next need to get an Id for the set:

String attachmentSetId = addAttachmentsToSetResult.getAttachmentSetId();

And then you are ready to create the actual support case:

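CreateCaseRequest request = new CreateCaseRequest()
    .withSubject("Example subject")                // placeholder text
    .withCommunicationBody("Example description")  // placeholder text
    .withAttachmentSetId(attachmentSetId);

CreateCaseResult result = client.createCase(request);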

Lightweight Monitoring
Once you have created a support case or two, you probably want to check on their status. The describeCases function lets you do just that. In the past, this function returned a detailed response that included up to 15 MB of attachments. With today's enhancement, you can now ask for a lightweight response that does not include any attachments. If you are calling describeCases to check for changes in status, you can now do this in a more efficient fashion.

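DescribeCasesRequest request = new DescribeCasesRequest();
request.setIncludeCommunications(false); // leave out communications (and their attachments) for a lightweight response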
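If you poll for status changes, you may only care about the cases whose status differs from the last check. Here is a rough sketch of that bookkeeping, using only the JDK. The class name StatusChangeTracker, the case IDs, and the status strings are all hypothetical; in real code the snapshot map would be filled from the describeCases result.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper for illustration only; it is not part of the AWS SDK.
final class StatusChangeTracker {
    // Last status seen for each case ID.
    private final Map<String, String> lastSeen = new HashMap<>();

    // Takes a snapshot of caseId -> status and returns only the entries
    // whose status is new or differs from the previous snapshot.
    Map<String, String> update(Map<String, String> snapshot) {
        Map<String, String> changed = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : snapshot.entrySet()) {
            String previous = lastSeen.put(entry.getKey(), entry.getValue());
            if (!Objects.equals(previous, entry.getValue())) {
                changed.put(entry.getKey(), entry.getValue());
            }
        }
        return changed;
    }
}
```

Each polling pass would build a fresh snapshot from the lightweight response and call update; an empty result means nothing has changed since the previous pass.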


To learn more about creating and managing cases programmatically, take a look at Programming the Life of an AWS Support Case.

Available Now
The new functionality described in this post is available now and you can start using it today! The SDK for PHP, SDK for .NET, SDK for Ruby, SDK for Java, SDK for JavaScript in the Browser, and the AWS Command Line Interface (CLI) have been updated.

-- Jeff;


Updated: .  Michael(tm) Smith <>