ProgrammableWebPluralsight Technology Index Shows Software Development Technologies In Demand

Pluralsight, an enterprise technology learning platform, has launched the Pluralsight Technology Index, an index of more than 300 software development technologies ranked by global demand and growth rate. Among the software development technologies listed in the index are languages, frameworks, libraries, data stores, and tools.

Simon Willison (Django)Exploring the UK Register of Members Interests with SQL and Datasette

Ever wondered which UK Members of Parliament get gifted the most helicopter rides? How about which MPs have been given Christmas hampers by the Sultan of Brunei? (David Cameron, William Hague and Michael Howard apparently). Here’s how to dig through the Register of Members Interests using SQL and Datasette.

Gifts from the Sultan

mySociety have been building incredible civic participation applications like TheyWorkForYou and FixMyStreet for nearly 15 years now, and have accumulated all kinds of interesting data along the way.

They recently launched their own data portal at data.mysociety.org listing all of the information they have available. While exploring it I stumbled across their copy of the UK Register of Members Interests. Every UK Member of Parliament has to register their conflicts of interest and income sources, and mySociety have an ongoing project to parse that data into a more useful format.

It won’t surprise you to hear that I couldn’t resist turning their XML files into a SQLite database.

The result is register-of-members-interests.datasettes.com - a Datasette instance running against a SQLite database containing over 1.3 million line-items registered by 1,419 MPs over the course of 18 years.

Some fun queries

A few of my favourites so far:

Understanding the data model

Most of the action takes place in the items table, where each item is a line-item from an MP’s filing. You can search that table by keyword (see helicopter example above) or apply filters to it using the standard Datasette interface. You can also execute your own SQL directly against the database.

Each item is filed against a category. There appears to have been quite a bit of churn in the way that the categories are defined over the years, plus the data is pretty untidy - there are no fewer than 10 ways of spelling “Remunerated employment, office, profession etc.” for example!

Categories

There are also a LOT of duplicate items in the set - it appears that MPs frequently list the same item (a rental property for example) every time they fill out the register. SQL DISTINCT clauses can help filter through these, as seen in some of the above examples.
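To illustrate how DISTINCT helps with the repeated filings, here is a minimal sketch using Python's sqlite3 module. The table layout and rows are hypothetical, simplified stand-ins for the real items table, not the actual schema.

```python
import sqlite3

# Hypothetical, simplified stand-in for the real items table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (person_id TEXT, item TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        # The same rental property re-listed on three register dates.
        ("p1", "Rental property in London", "2016-01-11"),
        ("p1", "Rental property in London", "2016-02-08"),
        ("p1", "Rental property in London", "2016-03-07"),
        ("p2", "Helicopter flight", "2016-02-08"),
    ],
)

# Without DISTINCT, every repeated filing comes back as its own row.
all_rows = conn.execute("SELECT person_id, item FROM items").fetchall()

# DISTINCT collapses rows that are identical across the selected columns.
unique_rows = conn.execute(
    "SELECT DISTINCT person_id, item FROM items ORDER BY person_id"
).fetchall()

print(len(all_rows), len(unique_rows))
```

Note that DISTINCT only removes rows that match exactly on the selected columns, so including the date column in the SELECT would bring the duplicates back.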

The data also has the concepts of both members and people. As far as I can tell people are distinct, but members may contain duplicates - presumably to represent MPs who have served more than one term in office. It looks like the member field stopped being populated in March 2015 so analysis is best performed against the people table.

One concept I have introduced myself is the record_id. In the XML documents the items are often grouped together into a related collection, like this:

<regmem personid="uk.org.publicwhip/person/10001"
    memberid="uk.org.publicwhip/member/40289" membername="Diane Abbott" date="2014-07-14">
    <category type="2" name="Remunerated employment, office, profession etc">
        <item>Payments from MRL Public Sector Consultants, Pepple House, 8 Broad Street, Great Cambourne, Cambridge CB23 6HJ:</item>
        <item>26 November 2013, I received a fee of £1,000 for speaking at the 1st African Legislative Summit, National Assembly, Abuja, Nigeria.  Hours: 8 hrs. The cost of my flights, transfers and hotel accommodation in Abuja were also met; estimated value £5,000. <em>(Registered 3 December 2013)</em></item>
        <item>23 July 2013, I received a fee of £5,000 for appearing as a contestant on ITV&#8217;s &#8216;The Chase Celebrity &#8211; Series 3&#8217; television programme.  Address of payer:  ITV Studios Ltd, London Television Centre, Upper Ground, London SE1 9Lt.  Hours: 12 hrs.   <em>(Registered 23 July 2013)</em></item>
    </category>
</regmem>

While these items are presented as separate line items, their grouping carries meaning: the first line item here acts as a kind of heading to help provide context to the other items.

To model this in the simplest way possible, I’ve attempted to preserve the order of these groups using a pair of additional columns: the record_id and the sort_order. I construct the record_id using a collection of other fields - the idea is for it to be sortable, and for each line-item in the same grouping to have the same record_id:

record_id = "{date}-{category_id}-{person_id}-{record}".format(
    date=date,
    category_id=category_id,
    # Keep only the trailing numeric part of the person ID
    person_id=person_id.split("/")[-1],
    record=record,
)

The resulting record_id might look like this: 2018-04-16-70b64e89-24878-0

To recreate that particular sequence of line-items, you can search for all items matching that record_id and then sort them by their sort_order. Here’s that record from Diane Abbott shown with its surrounding context.
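The lookup described above can be sketched with sqlite3 as follows. The record_id value, the three-column table, and the shortened item texts are illustrative assumptions; only the record_id and sort_order column names come from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down version of the items table.
conn.execute("CREATE TABLE items (record_id TEXT, sort_order INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        # Deliberately inserted out of order; the record_id is made up.
        ("2014-07-14-cat2-10001-0", 2, "23 July 2013, fee for appearing on The Chase"),
        ("2014-07-14-cat2-10001-0", 0, "Payments from MRL Public Sector Consultants:"),
        ("2014-07-14-cat2-10001-0", 1, "26 November 2013, fee for speaking in Abuja"),
        ("2014-07-14-cat2-10002-0", 0, "An unrelated record"),
    ],
)


def items_for_record(conn, record_id):
    """Return one record's line-items in their original document order."""
    cursor = conn.execute(
        "SELECT item FROM items WHERE record_id = ? ORDER BY sort_order",
        (record_id,),
    )
    return [row[0] for row in cursor]


grouped = items_for_record(conn, "2014-07-14-cat2-10001-0")
for line in grouped:
    print(line)
```

Sorting by sort_order puts the heading-style line-item first again, which is what gives the remaining items their context.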

A single record

How I built it

The short version: I downloaded all of the XML files and wrote a Python script which parsed them using ElementTree and inserted them into a SQLite database. I’ve put the code on GitHub.

A couple of fun tricks: firstly, I borrowed some code from csvs-to-sqlite to create the full-text search index and enable searching:

def create_and_populate_fts(conn):
    create_sql = """
        CREATE VIRTUAL TABLE "items_fts"
        USING {fts_version} (item, person_name, content="items")
    """.format(
        fts_version=best_fts_version()
    )
    conn.executescript(create_sql)
    conn.executescript(
        """
        INSERT INTO "items_fts" (rowid, item, person_name)
        SELECT items.rowid, items.item, people.name
        FROM items LEFT JOIN people ON items.person_id = people.id
    """
    )

The best_fts_version() function implements basic feature detection against SQLite by trying operations in an in-memory database.
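As a rough sketch of that kind of feature detection (not necessarily identical to the code in csvs-to-sqlite), the function can try to create a virtual table with each FTS module in turn and return the first one that works. The throwaway table name v and column name t are arbitrary.

```python
import sqlite3


def best_fts_version():
    """Return the newest FTS module this SQLite build supports, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        for fts in ("FTS5", "FTS4", "FTS3"):
            try:
                # Fails with OperationalError if the module is not compiled in.
                conn.execute("CREATE VIRTUAL TABLE v USING {} (t)".format(fts))
                return fts
            except sqlite3.OperationalError:
                continue
        return None
    finally:
        conn.close()


print(best_fts_version())
```

Probing an in-memory database keeps the check free of side effects on the real database file.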

Secondly, I ended up writing my own tiny utility function for inserting records into SQLite. SQLite has useful INSERT OR REPLACE INTO syntax which allows you to insert a record and will automatically update an existing record if there is a match on the primary key. This meant I could write this utility function and use it for all of my data inserts:

def insert_or_replace(conn, table, record):
    pairs = record.items()
    columns = [p[0] for p in pairs]
    params = [p[1] for p in pairs]
    sql = "INSERT OR REPLACE INTO {table} ({column_list}) VALUES ({value_list});".format(
        table=table,
        column_list=", ".join(columns),
        value_list=", ".join(["?" for p in params]),
    )
    conn.execute(sql, params)

# ...

insert_or_replace(
    db,
    "people",
    {
        "id": person_id,
        "name": regmem_el.attrib["membername"],
    },
)
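To show the replace-on-primary-key behaviour end to end, here is a small self-contained sketch. The people table definition and the sample values are assumptions made for the demonstration; the helper is a lightly condensed copy of the function above.

```python
import sqlite3


def insert_or_replace(conn, table, record):
    # table and column names are interpolated directly, so they must be trusted.
    columns = list(record.keys())
    params = list(record.values())
    sql = "INSERT OR REPLACE INTO {table} ({column_list}) VALUES ({value_list});".format(
        table=table,
        column_list=", ".join(columns),
        value_list=", ".join("?" for _ in params),
    )
    conn.execute(sql, params)


db = sqlite3.connect(":memory:")
# Hypothetical schema: the primary key is what triggers the replace.
db.execute("CREATE TABLE people (id TEXT PRIMARY KEY, name TEXT)")

insert_or_replace(db, "people", {"id": "10001", "name": "D. Abbott"})
# Same primary key, so this overwrites the earlier row instead of failing.
insert_or_replace(db, "people", {"id": "10001", "name": "Diane Abbott"})

rows = db.execute("SELECT id, name FROM people").fetchall()
print(rows)  # [('10001', 'Diane Abbott')]
```

One thing to keep in mind: INSERT OR REPLACE deletes the conflicting row and inserts a fresh one, so any column left out of the record falls back to its default rather than keeping its old value.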

What can you find?

I’ve really only scratched the surface of what’s in here with my initial queries. What can you find? Send me Datasette query links on Twitter with your discoveries!

Simon Willison (Django)System-Versioned Tables in MariaDB

System-Versioned Tables in MariaDB

Fascinating new feature from the SQL:2011 standard that's getting its first working implementation in MariaDB 10.3.4. "ALTER TABLE products ADD SYSTEM VERSIONING;" causes the database to store every change made to that table - then you can run "SELECT * FROM products FOR SYSTEM_TIME AS OF TIMESTAMP @t1;" to query the data as of a specific point in time. I've tried all manner of horrible mechanisms for achieving this in the past, having it baked into the database itself would be fantastic.

Via Markus Winand

Shelley Powers (Burningbird)The Joy Reid Saga: The Wayback Machine cannot guarantee authenticity

Recently, Mediaite posted screen shots captured by a Twitter user who goes by the name of Not a Bot that seemingly showed several homophobic comments made on a now defunct weblog by MSNBC’s Joy Ann Reid. Reid replied that her weblog had been hacked and several articles modified by unknown parties. The media has responded …


Jeremy Keith (Adactio)Acknowledgements

It feels a little strange to refer to Going Offline as “my” book. I may have written most of the words in it, but it was only thanks to the work of others that they ended up being the right words in the right order in the right format.

I’ve included acknowledgements in the book, but I thought it would be good to reproduce them here in the form of hypertext…

Everyone should experience the joy of working with Katel LeDû and Lisa Maria Martin. From the first discussions right up until the final last-minute tweaks, they were unflaggingly fun to collaborate with. Thank you, Katel, for turning my idea into reality. Thank you, Lisa Maria, for turning my initial mush of words into a far more coherent mush of words.

Jake Archibald and Amber Wilson were the best of technical editors. Jake literally wrote the spec on service workers so I knew I could rely on him to let me know whenever I made any factual missteps. Meanwhile Amber kept me on the straight and narrow, letting me know whenever the writing was becoming unclear. Thank you both for being so generous with your time.

Thanks to my fellow Clearlefty Danielle Huntrods for giving me feedback as the book developed.

Finally, I want to express my heartfelt thanks to everyone who has ever taken the time to write on their website about their experiences with service workers. Lyza Gardner, Ire Aderinokun, Una Kravets, Mariko Kosaka, Jason Grigsby, Ethan Marcotte, Mike Riethmuller, and others inspired me with their generosity. Thank you to everyone who’s making the web better through such kind acts of openness. To quote the original motto of the World Wide Web project, let’s share what we know.

Simon Willison (Django)Quoting Jeremy Hodges

China had about 99 percent of the 385,000 electric buses on the roads worldwide in 2017, accounting for 17 percent of the country’s entire fleet. Every five weeks, Chinese cities add 9,500 of the zero-emissions transporters—the equivalent of London’s entire working fleet

Jeremy Hodges

Simon Willison (Django)Black Online Demo

Black Online Demo

Black is "the uncompromising Python code formatter" by Łukasz Langa - it reformats Python code to a very carefully thought out styleguide, and provides almost no options for how code should be formatted. It's reminiscent of gofmt. José Padilla built a handy online tool for trying it out in your browser.

Via @llanga

Simon Willison (Django)JSON Escape Text

JSON Escape Text

I built a tiny tool for turning text into an escaped JSON string - I needed it to help create descriptions and canned SQL queries for adding to Datasette's metadata.json files.

ProgrammableWebTwilio Adds Support for LINE to its API

Twilio today announced that it has added support to its API for LINE, a popular free communication app that allows its more than 168 million users to make calls and send messages.

LINE support is part of Twilio Channels, a feature of Twilio's API that allows developers to build experiences for a variety of popular messaging platforms including Facebook Messenger, WeChat and Viber.

Simon Willison (Django)Quoting Chris Eppstein

The current linkedin.com homepage clocks in at 1.9MB of CSS (156KB compressed). After re-building a fully-functional version of the homepage with CSS Blocks, we were able to serve the same page with just 38KB of CSS. To be clear: that's the uncompressed size. After compression, that CSS file weighed in at less than 9KB!

Chris Eppstein

Amazon Web ServicesNew .BOT gTLD from Amazon

Today, I’m excited to announce the launch of .BOT, a new generic top-level domain (gTLD) from Amazon. Customers can use .BOT domains to provide an identity and portal for their bots. Fitness bots, slack bots, e-commerce bots, and more can all benefit from an easy-to-access .BOT domain. The phrase “bot” was the 4th most registered domain keyword within the .COM TLD in 2016 with more than 6000 domains per month. A .BOT domain allows customers to provide a definitive internet identity for their bots as well as enhancing SEO performance.

At the time of this writing .BOT domains start at $75 each and must be verified and published with a supported tool like: Amazon Lex, Botkit Studio, Dialogflow, Gupshup, Microsoft Bot Framework, or Pandorabots. You can expect support for more tools over time and if your favorite bot framework isn’t supported feel free to contact us here: contactbot@amazon.com.

Below, I’ll walk through the experience of registering and provisioning a domain for my bot, whereml.bot. Then we’ll look at setting up the domain as a hosted zone in Amazon Route 53. Let’s get started.

Registering a .BOT domain

First, I’ll head over to https://amazonregistry.com/bot, type in a new domain, and click the magnifying glass to make sure my domain is available, which takes me to the registration wizard.

Next, I have the opportunity to choose how I want to verify my bot. I build all of my bots with Amazon Lex so I’ll select that in the drop down and get prompted for instructions specific to AWS. If I had my bot hosted somewhere else I would need to follow the unique verification instructions for that particular framework.

To verify my Lex bot I need to give the Amazon Registry permissions to invoke the bot and verify its existence. I’ll do this by creating an AWS Identity and Access Management (IAM) cross account role and providing the AmazonLexReadOnly permissions to that role. This is easily accomplished in the AWS Console. Be sure to provide the account number and external ID shown on the registration page.

Now I’ll add read only permissions to our Amazon Lex bots.

I’ll give my role a fancy name like DotBotCrossAccountVerifyRole and a description so it’s easy to remember why I made it. Then I’ll click create to create the role and be transported to the role summary page.

Finally, I’ll copy the ARN from the created role and save it for my next step.

Here I’ll add all the details of my Amazon Lex bot. If you haven’t made a bot yet you can follow the tutorial to build a basic bot. I can refer to any alias I’ve deployed but if I just want to grab the latest published bot I can pass in $LATEST as the alias. Finally I’ll click Validate and proceed to registering my domain.

Amazon Registry works with a partner, EnCirca, to register our domains, so we’ll select them and optionally grab Site Builder. I know how to sling some HTML and JavaScript together so I’ll pass on the Site Builder side of things.

 

After I click continue, we’re taken to EnCirca’s website to finalize the registration. With any luck, within a few minutes of completing the purchase we should receive an email with some good news.

Alright, now that we have a domain name let’s find out how to host things on it.

Using Amazon Route 53 with a .BOT domain

Amazon Route 53 is a highly available and scalable DNS service with robust APIs, health checks, service discovery, and many other features. I definitely want to use this to host my new domain. The first thing I’ll do is navigate to the Route 53 console and create a hosted zone with the same name as my domain.

Great! Now, I need to take the Name Server (NS) records that Route 53 created for me and use EnCirca’s portal to add these as the authoritative nameservers on the domain.

Now I just add my records to my hosted zone and I should be able to serve traffic! Way cool, I’ve got my very own .bot domain for @WhereML.

Next Steps

  • I could and should add to the security of my site by creating TLS certificates for people who intend to access my domain over TLS. Luckily with AWS Certificate Manager (ACM) this is extremely straightforward and I’ve got my subdomains and root domain verified in just a few clicks.
  • I could create a CloudFront distribution to front an S3 static single page application to host my entire chatbot and invoke Amazon Lex with a Cognito identity right from the browser.

Randall

ProgrammableWebRSA Conference Endures Cybersecurity Incident, Again

Last week, security solution provider RSA held its annual RSA Conference (RSAC). RSA pitches the event as "the leading cybersecurity event across the globe." Although the cybersecurity-focused event features the most innovative technologies in the space, the event itself has a history of cybersecurity missteps (e.g. multiple USB malware incidents in 2010, app data leak in 2014). Unfortunately for RSA and its attendees, RSAC 2018 suffered similar problems.

Simon Willison (Django)dateparser: python parser for human readable dates

dateparser: python parser for human readable dates

I've used dateutil.parser for this in the past, but dateparser is a major upgrade: it knows how to parse dates in 200 different language locales, can interpret different timezone representations and handles relative dates ("3 months, 1 week and 1 day ago") as well.

Via csvs-to-sqlite: -d / -dt options for parsing columns as dates

Simon Willison (Django)csvs-to-sqlite 0.8

csvs-to-sqlite 0.8

I released a new version of my csvs-to-sqlite tool this morning with a bunch of handy new features. It can now rename columns and define their types, add the CSV filenames as an additional column, create indexes on columns and parse dates and datetimes into SQLite-friendly ISO formatted values.

ProgrammableWebEricsson Releases Study: Exploring IoT Strategies

Ericsson, a provider of technology and services to telecom operators, has released "Exploring IoT Strategies," a study that provides insights from 20 telecom service providers on how they are engaging and positioning themselves in the IoT market.

Anne van Kesteren (Opera)any.js

Thanks to Ms2ger web-platform-tests is now even more awesome (not in the American sense). To avoid writing HTML boilerplate, web-platform-tests supports .window.js, .worker.js, and .any.js resources, for writing JavaScript that needs to run in a window, dedicated worker, or both at once. I very much recommend using these resource formats as they ease writing and reviewing tests and ensure APIs get tested across globals.

Ms2ger extended .any.js to also cover shared and service workers. To test all four globals, create a single your-test.any.js resource:

// META: global=window,worker
promise_test(async () => {
  const json = await new Response(1).json()
  assert_equals(json, 1);
}, "Response object: very basic JSON parsing test");

And then you can load it from your-test.any.html, your-test.any.worker.html, your-test.any.sharedworker.html, and your-test.https.any.serviceworker.html (requires enabling HTTPS) to see the results of running that code in those globals.

The default globals for your-test.any.js are a window and a dedicated worker. You can unset the default using !default. So if you just want to run some code in a service worker:

// META: global=!default,serviceworker

Please give this a try and donate some tests for your favorite API annoyances.

Jeremy Keith (Adactio)Going Offline, available now!

The day is upon us! The hour is at hand! The book is well and truly apart!

That’s right—Going Offline is no longer available for pre-order …it’s just plain available. ABookApart.com is where you can place your order now.

If you pre-ordered the book, thank you. An email is winging its way to you with download details for the digital edition. If you ordered the paperback, the Elves Apart are shipping your lovingly crafted book to you right now.

If you didn’t pre-order the book, I grudgingly admire your cautiousness, but don’t you think it’s time to throw caution to the wind and treat yourself?

Seriously though, I think you’ll like this book. And I’m not the only one. Here’s what people are saying:

I know you have a pile of professional books to read, but this one should skip the line.

Lívia De Paula Labate

It is so good. So, so good. I cannot recommend it enough.

Sara Soueidan

Super approachable and super easy to follow regardless of your level of knowledge.

—also Sara Soueidan

You’re gonna want to preorder a copy, believe me.

Mat Marquis

Beautifully explained without being patronising.

Chris Smith

I very much look forward to hearing what you think of Going Offline. Get your copy today and let me know what you think of it. Like I said, I think you’ll like this book. Apart.

Simon Willison (Django)react-jsonschema-form

react-jsonschema-form

Exciting library from the Mozilla Services team: given a JSON Schema definition, react-jsonschema-form can produce a cleanly designed React-powered form for adding and editing data that matches that schema. Includes support for adding multiple items in a nested array, re-ordering them, custom form widgets and more.

Simon Willison (Django)Why it took a long time to build that tiny link preview on Wikipedia

Why it took a long time to build that tiny link preview on Wikipedia

Wikipedia now shows a little preview card on internal links with an image and summary paragraph of the linked page. As a Wikipedia user I absolutely love this feature - and as an engineer and product designer, it's fascinating to hear the challenges they overcame to ship it. Of particular interest: actually generating a useful summary of a page, while stripping out the cruft that often accumulates at the beginning of their text. It's also an impressive scaling challenge: the API they use for this feature is now handling more than 500,000 requests per minute.

Via Hacker News

ProgrammableWebFacebook Makes Drastic Changes to its APIs and Platform Product

In the wake of a scandal that has led to the biggest backlash it has ever faced, Facebook has announced API and platform product changes that will have far-reaching consequences for developers.

Jeremy Keith (Adactio)Workshops

There’s a veritable smörgåsbord of great workshops on the horizon…

Clearleft presents a workshop with Jan Chipchase on field research in London on May 29th, and again on May 30th. The first day is sold out, but there are still tickets available for the second workshop (tickets are £654). If you’ve read Jan’s beautiful Field Study Handbook, then you’ll know what a great opportunity it is to spend a day in his company. But don’t dilly-dally—that second day is likely to sell out too.

This event is for product teams, designers, researchers, insights teams, in agencies, in-house, local and central government. People who are curious about human interaction, and their place in the world.

I’m really excited that Sarah and Val are finally bringing their web animation workshop to Brighton (I’ve been not-so-subtly suggesting that they do this for a while now). It’s a two day workshop on July 9th and 10th. There are still some tickets available, but probably not for much longer (tickets are £639). The workshop is happening at 68 Middle Street, the home of Clearleft.

This workshop will get you up and running with web animation in less time than it would take to read all the tutorials you have bookmarked. Over two days, you’ll go from beginner or novice web animator to having expert level knowledge of the current web animation landscape. You’ll get an in-depth look at animating with CSS, JavaScript, and SVG through hands-on exercises and learn the most efficient workflows for each.

A bit before that, though, there’s a one-off workshop on responsive web typography from Rich on Thursday, June 29th, also at 68 Middle Street. You can expect the same kind of brilliance that he demonstrated in his insta-classic Web Typography book, but delivered by the man himself.

You will learn how to combine centuries-old craft with cutting edge technology, including variable fonts, to design and develop for screens of all shapes and sizes, and provide the best reading experiences for your modern readers.

Whether you’re a designer or a developer, just starting out or seasoned pro, there will be plenty in this workshop to get your teeth stuck into.

Tickets are just £435, and best of all, that includes a ticket to the Ampersand conference the next day (standalone conference tickets are £235 so the workshop/conference combo is a real bargain). This year’s Ampersand is shaping up to be an unmissable event (isn’t it always?), so the workshop is like an added bonus.

See you there!

Simon Willison (Django)Quoting Will Larson

Migrations are both essential and frustratingly frequent as your codebase ages and your business grows: most tools and processes only support about one order of magnitude of growth before becoming ineffective, so rapid growth makes them a way of life. [...] As a result you switch tools a lot, and your ability to migrate to new software can easily become the defining constraint for your overall velocity. [...] Migrations matter because they are usually the only available avenue to make meaningful progress on technical debt.

Will Larson

ProgrammableWebWhich Platform is Easiest for Publishing a Progressive Web App?

Since their introduction in 2015, Progressive Web Apps (PWA) have been viewed as a potential way to break the grip that native apps have had on the mobile app ecosystem. Drawing an audience, however, isn’t as easy as writing your PWA and watching the users flock to it. The most effective way to reach users is still to publish your app to one of the major app stores (Google Play, iOS App Store, Windows Store).

ProgrammableWebDaily API RoundUp: Mopinion, Cboe, ipstack, Gfycat, Rev.io, Moon Banking

Every day, the ProgrammableWeb team is busy, updating its three primary directories for APIs, clients (language-specific libraries or SDKs for consuming or providing APIs), and source code samples.

Bob DuCharme (Innodata Isogen)Reification is a red herring

And you don't need property graphs to assign data to individual relationships.

RDF's very simple subject-predicate-object data model is a building block that you can use to build other models that can make your applications even better.

I recently tweeted that the ZDNet article Back to the future: Does graph database success hang on query language? was the best overview of the graph database world(s) that I'd seen so far, and I also warned that many such "overviews" were often just Neo4j employees plugging their own product. (The Neo4j company is actually called Neo Technology.) The most extreme example of this is the free O'Reilly book Graph Databases, which is free because it's being given away by its three authors' common employer: Neo Technology! The book would have been more accurately titled "Building Graph Applications with Cypher", the Neo4j query language. This 238-page book on graph databases manages to mention SPARQL and Gremlin only twice each. The ZDNet article above does a much more balanced job of covering RDF and SPARQL, Gremlin and Tinkerpop, and Cypher and Neo4j.

The DZone article RDF Triple Stores vs. Labeled Property Graphs: What's the Difference? is by another Neo employee, field engineer Jesús Barrasa. It doesn't mention Tinkerpop or Gremlin at all, but does a decent job of describing the different approach that property graph databases such as Neo4j and Tinkerpop take in describing graphs of nodes and edges when compared with RDF triplestores. Its straw man arguments about RDF's supposed deficiencies as a data model reminded me of a common theme I've seen over the years.

The fundamental thing that most people don't get about RDF, including many people who are successfully using it to get useful work done, is that RDF's very simple subject-predicate-object data model is a building block that you can use to build other models that can make your applications even better. Just because RDF doesn't require the use of schemas doesn't mean that it can't use them; the RDF Schema Language lets you declare classes, properties, and information about these that you can use to drive user interfaces, to enable more efficient and readable queries, and to do all the other things that people typically use schemas for. Even better, you can develop a schema for the subset of the data you care about (as opposed to being forced to choose between a schema for the whole data set or no schema at all, as with XML), which is great for data integration projects, and then build your schema up from there.

Barrasa writes of property graphs that "[t]he important thing to remember here is that both the nodes and relationships have an internal structure, which differentiates this model from the RDF model. By internal structure, I mean this set of key-value pairs that describe them." This is the first important difference between RDF and property graphs: in the latter, nodes and edges can each have their own separate set (implemented as an array in Neo4j) of key-value pairs. Of course, nodes in RDF don't need this; to say that the node for Jack has an attribute-value pair of (hireDate, "2017-04-12"), we simply make another triple with Jack as the subject and these as the predicate and object.

Describing the other key difference, Barrasa writes that while the nodes of property graphs have unique identifiers, "[i]n the same way, edges, or connections between nodes--which we call relationships--have an ID". Property graph edges are unique at the instance level; if Jane reportsTo Jack and Jack reportsTo Jill, the two reportsTo relationships here each have their own unique identifier and their own set of key-value pairs to store information about each edge.

He writes that in RDF "[t]he predicate will represent an edge--a relationship--and the object will be another node or a literal value. But here, from the point of view of the graph, that's going to be another vertex." Not necessarily, at least for the literal values; these represent the values in RDF's equivalent of the key-value pairs--the non-relationship information being attached to a node such as (hireDate, "2017-04-12") above. This ability is why a node doesn't need its own internal key-value data structure.

He begins his list of differences between property graphs and RDF with the big one mentioned above: "Difference #1: RDF Does Not Uniquely Identify Instances of Relationships of the Same Type," which is certainly true. But, his example, which he describes as "an RDF graph in which Dan cannot like Ann three times", is very artificial.

One of his "RDF workarounds" for using RDF to describe that Dan liked Ann three times is reification, in which we convert each triple to four triples: one saying that a given resource is an RDF statement, a second identifying the resource's subject, a third naming the predicate, and a fourth naming the object. This way, the statement itself has identity, and we can add additional information about it as triples that use the statement's identifier as a subject and additional predicates and objects as key-value pairs such as (time, "2018-03-04T11:43:00") to show when a particular "like" took place. Barrasa writes "This is quite ugly"; I agree, and it can also do bad things to storage requirements.
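To make the four-triples expansion concrete, here is a minimal sketch that models triples as plain Python tuples of prefixed names. The rdf:Statement, rdf:subject, rdf:predicate, and rdf:object terms are the standard RDF reification vocabulary; the m:likes predicate and the d:like1 identifier are made up for this illustration.

```python
def reify(statement_id, triple):
    """Expand one (subject, predicate, object) triple into its four reification triples."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# Hypothetical original statement: Dan likes Ann.
like = ("d:Dan", "m:likes", "d:Ann")

# The statement now has its own identifier...
triples = reify("d:like1", like)
# ...so extra data about it can be attached as one more triple.
triples.append(("d:like1", "m:time", "2018-03-04T11:43:00"))

for t in triples:
    print(t)
```

One original triple plus one piece of metadata becomes five triples here, which gives a sense of the storage cost mentioned above.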

In my 15 years of working with RDF, I have never felt the need to use reification. It's funny how the 2004 RDF Primer 1.0 has a section on reification but the 2014 RDF Primer 1.1 (of which I am proud to be listed in the Acknowledgments) doesn't even mention reification, because simpler modeling techniques are available, so reification was rarely if ever used.

By "modeling techniques" I mean "declaring and then using a model", although in RDF, you don't even have to declare it. If you want to keep track of separate instances of employees, or games, or buildings, you can declare any of these as a class and then create instances of it; similarly, if you want to keep track of separate instances of a particular relationship, declare a class for that relationship and then create instances of it.

How would we apply this to Barrasa's example, where he wants to keep track of information about Likes? We use a class called Like, where each instance identifies who liked who. (When I first wrote that previous sentence, I wrote that we can declare a class called Like, but again, we don't need to declare it to use it. Declaring it is better for serious applications where multiple developers must work together, because part of the point of a schema is to give everyone a common frame of reference about the data they're working with.) The instance could also identify the date and time of the Like, comments associated with it, and anything else you wanted to add as a set of key-value pairs for each Like instance that is implemented as just more triples.

Here's an example. After optional declarations of the relevant class and properties associated with it, the following has four Likes showing who liked who when and a "foo" value to demonstrate the association of arbitrary metadata with that Like.

@prefix d:    <http://learningsparql.com/ns/data/> .
@prefix m:    <http://learningsparql.com/ns/model/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> . 

# Optional schema.
m:Like  a rdfs:Class .          # A class...
m:liker rdfs:domain m:Like .    # and properties that go with this class.
m:liked rdfs:domain m:Like .
m:foo   rdfs:domain m:Like .

[] a m:Like ;
   m:liker d:Dan ;
   m:liked d:Ann ;
   m:time "2018-03-04T11:43:00" ;
   m:foo "bar" .

[] a m:Like ;
   m:liker d:Dan ;
   m:liked d:Ann ;
   m:time "2018-03-04T11:58:00" ;
   m:foo "baz" .

[] a m:Like ;
   m:liker d:Dan ;
   m:liked d:Ann ;
   m:time "2018-03-04T12:04:00" ;
   m:foo "bat" .

[] a m:Like ;
   m:liker d:Ann ;
   m:liked d:Dan ;
   m:time "2018-03-04T12:06:00" ;
   m:foo "bam" .

Instead of making up specific identifiers for each Like, I made them blank nodes so that the RDF processing software will generate identifiers and keep track of them.

As to Barrasa's use case of counting how many times Dan liked Ann, it's pretty easy with SPARQL:

PREFIX d: <http://learningsparql.com/ns/data/> 
PREFIX m: <http://learningsparql.com/ns/model/>

SELECT (count(*) AS ?likeCount) WHERE {
  ?like a m:Like ;
        m:liker d:Dan ;
        m:liked d:Ann .
}

(This query would actually work with just the m:liker and m:liked triple patterns, but as with the example that I tweeted to Dan Brickley about, declaring your RDF resources as instances of classes can lay the groundwork for more efficient and readable queries.) Here is ARQ's output for this query:

-------------
| likeCount |
=============
| 3         |
-------------

Let's get a little fancier. Instead of counting all of Dan's likes of Ann, we'll just list the ones from before noon on March 4, sorted by their foo values:

PREFIX d: <http://learningsparql.com/ns/data/> 
PREFIX m: <http://learningsparql.com/ns/model/>

SELECT ?fooValue ?time WHERE {
  ?like a m:Like ;
        m:liker d:Dan ;
        m:liked d:Ann ;
        m:time ?time ;
        m:foo ?fooValue .
  FILTER (?time < "2018-03-04T12:00")
}
ORDER BY ?fooValue

And here is ARQ's result for this query:

------------------------------------
| fooValue | time                  |
====================================
| "bar"    | "2018-03-04T11:43:00" |
| "baz"    | "2018-03-04T11:58:00" |
------------------------------------
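The FILTER above compares the time values as plain strings, which works because ISO 8601 timestamps written in the same layout sort lexically in chronological order. The following Python sketch, which is not SPARQL and is only an assumption-laden stand-in for the data above, mirrors both queries with ordinary string comparison; the `likes` list and its dictionary keys are made up for this illustration.

```python
# Each dict stands in for one m:Like instance from the Turtle data above.
likes = [
    {"liker": "Dan", "liked": "Ann", "time": "2018-03-04T11:43:00", "foo": "bar"},
    {"liker": "Dan", "liked": "Ann", "time": "2018-03-04T11:58:00", "foo": "baz"},
    {"liker": "Dan", "liked": "Ann", "time": "2018-03-04T12:04:00", "foo": "bat"},
    {"liker": "Ann", "liked": "Dan", "time": "2018-03-04T12:06:00", "foo": "bam"},
]

# Counterpart of the counting query: every Like where Dan liked Ann.
dan_likes_ann = [
    like for like in likes if like["liker"] == "Dan" and like["liked"] == "Ann"
]
like_count = len(dan_likes_ann)

# Counterpart of the FILTER and ORDER BY: a plain string comparison on the
# timestamp, then a sort on the foo value (the first element of each tuple).
before_noon = sorted(
    (like["foo"], like["time"])
    for like in dan_likes_ann
    if like["time"] < "2018-03-04T12:00"
)
```

If the timestamps used mixed layouts or time zone offsets, string comparison would no longer be safe and typed date-time values would be the better choice.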

After working through a similar example for modeling flights between New York and San Francisco, Barrasa begins a sentence "Because we can't create such a simple model in RDF..." This is ironic; the RDF model is simpler than the Labeled Property Graph model, because it's all subject-predicate-object triples without the use of additional data structures attached to the graph nodes and edges. His RDF version would have been much simpler if he had just created instances of a class called Flight, because again, while the base model of RDF is the simple triple, more complex models can easily be created by declaring classes, properties, and information about those classes and properties--which we can do by just creating new triples!

To summarize, complaints about RDF that focus on reification are so 2004, and they are a red herring, because they distract from the greater power that RDF's modeling abilities bring to application development.

A funny thing happened after I wrote all this, though. As part of my plans to look into TinkerPop and Gremlin and potential connections to RDF as a next step, I was looking into Stardog and Blazegraph's common support of both. I found a Blazegraph page called Reification Done Right where I learned of Olaf Hartig and Bryan Thompson's 2014 paper Foundations of an Alternative Approach to Reification in RDF. If Blazegraph has implemented their ideas, then there is a lot of potential there. And if the Blazegraph folks brought this with them to Amazon Neptune, that would be even more interesting, although apparently that hasn't shown up yet.



ProgrammableWebWhat Are Six of the Most Common API Mistakes?

These days, with the number of tools available, building an API can happen in a matter of minutes. But there is a big difference between whipping up an API and crafting one that is secure, reliable and meets the user’s expectations. Heitor Tashiro Sergent from Runscope lays out six of the most common mistakes that arise from APIs that weren’t made with enough care.

Amazon Web ServicesAWS Support – The First Decade

We launched AWS Support a full decade ago, with Gold and Silver plans focused on Amazon EC2, Amazon S3, and Amazon SQS. Starting from that initial offering, backed by a small team in Seattle, AWS Support now encompasses thousands of people working from more than 60 locations.

A Quick Look Back
Over the years, that offering has matured and evolved in order to meet the needs of an increasingly diverse base of AWS customers. We aim to support you at every step of your cloud adoption journey, from your initial experiments to the time you deploy mission-critical workloads and applications.

We have worked hard to make our support model helpful and proactive. We do our best to provide you with the tools, alerts, and knowledge that will help you to build systems that are secure, robust, and dependable. Here are some of our most recent efforts toward that goal:

Trusted Advisor S3 Bucket Policy Check – AWS Trusted Advisor provides you with five categories of checks and makes recommendations that are designed to improve security and performance. Earlier this year we announced that the S3 Bucket Permissions Check is now free, and available to all AWS users. If you are signed up for the Business or Professional level of AWS Support, you can also monitor this check (and many others) using Amazon CloudWatch Events. You can use this to monitor and secure your buckets without human intervention.

Personal Health Dashboard – This tool provides you with alerts and guidance when AWS is experiencing events that may affect you. You get a personalized view into the performance and availability of the AWS services that underlie your AWS resources. It also generates Amazon CloudWatch Events so that you can initiate automated failover and remediation if necessary.

Well Architected / Cloud Ops Review – We’ve learned a lot about how to architect AWS-powered systems over the years and we want to share everything we know with you! The AWS Well-Architected Framework provides proven, detailed guidance in critical areas including operational excellence, security, reliability, performance efficiency, and cost optimization. You can read the materials online and you can also sign up for the online training course. If you are signed up for Enterprise support, you can also benefit from our Cloud Ops review.

Infrastructure Event Management – If you are launching a new app, kicking off a big migration, or hosting a large-scale event similar to Prime Day, we are ready with guidance and real-time support. Our Infrastructure Event Management team will help you to assess the readiness of your environment and work with you to identify and mitigate risks ahead of time.

Partner-Led Support – The new AWS Solution Provider Program for APN Consulting Partners allows partners to manage, service, support, and bill AWS accounts for end customers.

To learn more about how AWS customers have used AWS support to realize all of the benefits that I noted above, watch these videos (and find more on the Customer Testimonials page):

The Amazon retail site makes heavy use of AWS. You can read my post, Prime Day 2017 – Powered by AWS, to learn more about the process of preparing to sustain a record-setting amount of traffic and to accept a like number of orders.

Come and Join Us
The AWS Support Team is in continuous hiring mode and we have openings all over the world! Here are a couple of highlights:

Visit the AWS Careers page to learn more and to search for open positions.

Jeff;

ProgrammableWebWhy The Lynch Mob That’s After Mark Zuckerberg Has Got The Wrong Guy

To all of you — including the US Congress —  that want Mark Zuckerberg's head over the personal data that was gleaned from Facebook and used for profit by Cambridge Analytica, you have got the wrong guy. If you’re quitting Facebook, you might be doing it for the wrong reasons.

ProgrammableWeb​Safari Technology Preview 54 Contains Web API, WebRTC Updates

Apple this week released Safari Technology Preview 54 for macOS Sierra and macOS High Sierra and it contains updates for Web APIs and WebRTC.

ProgrammableWebGoogle Dialogflow Enterprise Edition Generally Available

This week, Google announced that its Dialogflow Enterprise Edition is now generally available. The development suite allows developers to build conversational interfaces for websites, mobile apps, messaging platforms, and IoT devices.

Matt Webb (Schulze & Webb)Mid-program reflections #4 – six thoughts about Office Hours

I meet with each of the nine startups for an hour every week. The session is called "Office Hours" and I'm pretty sure that all startup accelerators do something like this.

For me, it's about founder coaching and generally making sure each team is getting the most out of the program.

I first saw how this works because of Jon Bradford. I was lucky enough to sit in on his Office Hours sessions in 2014 when he was MD with Techstars London. I've developed my own style since. All the good bits are from Jon.

"MD" stands for "Managing Director."


What does a program Managing Director do? I can't tell you in general, but I can say what I do.

I lead on outreach and then selecting the startups. I make the case to the rest of the team about why each startup is worth investment, and I have a thesis about what's happening in the market. I lead on deal negotiation, and I coordinate the legal team.

Programming: I work closely with the Program Director, Lisa Ritchie, and her program team. In theory I'm backstop if there's trouble, but there's been little of that: Lisa both runs a tight ship, and thinks imaginatively ahead of the puck. So I'm consulted only as needed, usually as a startup is being handed off between the different parts of the program. Because of my design consultancy background, I'm a second pair of eyes on the briefs for the agency-led Services phase. I bring in much of the network of experts and advisors, and founders for Founder Stories sessions.

I run Office Hours. I coach the startups when they're in the room, and evangelise for them outside it.


These reflections are about Office Hours. Although this is the ninth paragraph, it was the final paragraph written. I finished writing this post, read it through, then came back here to give you a warning: there are too many words, and I have that horrible deer-in-the-headlights feeling I sometimes get when doing public speaking that, holy shit, everything I'm saying is obvious and asinine. So I'm going to do what I usually do when that feeling comes on, which is to double down and barrel on.


I estimate that I've led or sat in on 250 hours of Office Hours sessions. This doesn't include advisory sessions or board meetings. I don't feel that 250 hours is enough to get good at it.

Also: who the hell am I to be giving advice? I'm less successful at the startup game than a lot of the people I meet with, and with the rest that's only because they're just getting started. But I've seen a lot.

So given I don't feel particularly good at it, I keep notes of approaches that seem to work. This is something I've been doing for a couple of years, on and off: privately journaling at the end of the week about working practice and team dynamics.

Then I come back to the approaches later. I don't mean to follow them slavishly. Only that, in a session, I try to remain conscious of them rather than reacting in the moment.


Six things I try to keep in mind while I'm running an Office Hours session:


1. Do they know how to run the room?

My first session is about us getting to know each other, and talking about what we can expect from Office Hours. After that, I start by asking a question: what's one great thing and one challenging thing that's happened over the last week. (Then we dig into the challenging thing.)

About halfway through the program, I put more of the agenda in the hands of the founders: at the beginning of the meeting I get them to write the agenda up on the whiteboard. This becomes habit pretty quickly. If I'm not clear what a topic is, or what kind of response I'm being asked for, I say.

Much of any founder's time will be spent meeting advisors and investors. There's a knack to running the room and getting what you want out of it, while maintaining a feeling of collaboration and conversation. Meetings aren't just time you spend in a room together. Meetings are an atomic unit of work. They should have purpose and outcomes, although these don't necessarily need to be stated. There are lots of small ways to make sure attendees don't drift or feel lost.

Most of the founders I work with already know how to run a room. At which point, reassured, we can go back to chatting.


2. Am I thinking a couple weeks ahead?

We provide a bunch of programming to the startups, and I want to make sure it's effective.

For example, ahead of "mentor" meetings with experts and advisors, we discuss how to pitch (5 minutes to intro the company, then dig deep into one or two issues. They may have to work to make it useful). During the Services phase, I try to bring up the differences between how agencies work and how startups work, and also how to integrate the deliverables.

Absent anything else, I think ahead to the eventual pitch deck. I'm imagining the slides. If there's not yet a strong traction slide, I work backwards through sales and then to processes around customer development, and guide the conversation to those topics.

Because of this, I need to have a strong opinion about where the company should go and how it will get there. I spend a lot of my time between Office Hours thinking about this. This isn't so that I can say there is a "right" or "wrong" answer, it's so I can have a good understanding of the complexity of what they are taking on. Rather than "correct" or "incorrect," it's useful to feel out decision qualities such as "ability to easily iterate" or "here be layers of unconscious assumptions and hope."

Founders are very convincing people, so I have to watch for where an argument is strong because of good analysis versus mere charisma. Sometimes founders convince even themselves. There's a knack to jumping between sharing visions of the future and robust self-honesty.

My personal mantra is: strong opinions, weakly held. I have to remember that my view is secondary to what the founder and the team want. Of course my opinion might be that the founder is missing something, so I have to satisfy myself that their decision is made with a good process. (And sometimes the choice is between two routes and the answer is: do whatever you're ok waking up at 4am and thinking about for the next 4 to 7 years.)


3. Why hasn't the founder answered this question already?

These founders are some of the brightest people I've met. If anyone has the mindset to tackle any challenge they meet, it's them.

So when a question is brought to Office Hours, I try to ask myself why the answer is not obvious to the founder. I try not to immediately answer it myself.

(There's another reason why I shouldn't leap to answering questions, in that the founder has been closer to and thinking more deeply about their startup than I ever will. In the end, all I really have is a perspective.)

Why might a founder ask a question?

There might be a knowledge, skills, or experience gap. This is possible. I ask myself why they have not worked it out with friends or Google. We can figure out an approach together, and what I try to do then is ask smaller questions which will lead the founder to the answer for themselves.

A second possibility is that the higher-level framework has something missing. A question about, say, which product features to highlight on the homepage should be obvious given a validated model of the customer and an understanding of product differentiation. And those should be possible to figure out given their priors: in this case, a process of having a business model hypothesis and testing it by speaking to customers and investors.

So a question from a founder is a chance to dig upwards to these frameworks. Frameworks aren't axioms. They can and should change, but always deliberately.

The important thing here is not the answer, but the ability to deconstruct the question, to ask it intelligently, and to discuss it. If a question can be treated like this, then it can be worked on by the founder with their team and with their advisors--all people who are much smarter and more experienced than me. A question answered by instinct can't involve and take the benefit of all the smart people around us.

A third possibility is that the answer is clearly evident, but there is some personal or team resistance to seeing it. Resistance often comes about because the answer implies something undesirable. You'd be surprised how often this happens, or maybe you wouldn't. If it's a single founder, some possibilities are that:

  • the answer might imply something that conflicts with the founder's self-image
  • the answer might reveal an undesirable kind of hard work: it's preferable to do the all-consuming and intellectual hard work of grinding through product development, versus the emotionally scary work of sales and possible rejection (for example)
  • like all answers, this answer means bringing an idea into reality, which is terrifying: all ideas are perfect; reality is at best mundane and at worst can fail

So in this case, I try to help the founder be clear-eyed about what an answer means.

If it's a team, these different viewpoints can be embodied in different team members. This is not necessarily a conflict. One member might be not surfacing the answer because they imagine another team member is highly invested in a different approach. Possibilities are unvoiced from an overabundance of care. My job here is to help them become a functional team, and one way to do that is to illustrate the power of saying conflicting viewpoints out loud. So I try to point out differences of opinion. Just because differences of opinion have been unearthed does not mean they need to be resolved. Differences can be tolerated and embraced. (Although courses of action, once decided, need commitment.)

I have a hobby interest in small group dynamics, so I love these sessions intellectually. Though they are the hardest to work.


4. There is often a crisis. Fixing the issue is not my job.

A special type of Office Hours is when there's a crisis. I would characterise a crisis as any time the founder brings urgency into the room--whether it's good or bad. There are times when sales are going just too well! "A great problem to have" can trigger a panicked response just as readily as a more existential crisis such as an unhappy team.

I have to remind myself that fixing the issue is not my primary job. Participating in panic validates panic as a response. But if a startup responded to every crisis with panic, nothing would get done. (I would characterise panic as short-termist thinking, accompanied by a stressed and unpleasant emotional state.)

What makes this challenging is that I often know what they're going through. Sometimes I recognise a situation and my own emotional memories well up. There have been sessions where my heart races, or my palms sweat, or I look from team member to team member and wonder if they realise the dynamic they've found themselves in.

So before we talk about the issue, I try to find the appropriate emotional response: enthusiastically cheer first sales (but don't sit back on laurels); get pissed off about bad news but move on with good humour; treat obstacles with seriousness but don't over-generalise. It's a marathon not a sprint, and so on.

Then use the situation to talk tactics and build some habits. I like to encourage:

  1. Writing things down. Startups are not about product, they are about operationalising sales of that product. Operationalising means there is a machine. The minimum viable machine is a Google doc with a checklist. The sales process can be a checklist. HR can be a checklist. Bookkeeping can be a checklist. When things don't work, revise the checklist. Eventually, turn it into software and people following specific job objectives. This is how (a) the startup can scale where revenue scales faster than cost of sale; and (b) the founder can one day take a holiday.
  2. A habit of momentum. I forget who said to me "first we figure out how to row the boat, then we choose the direction" but movement is a team habit. If, in every meeting, I respond to a business update with "so, what are you doing about that" then that expectation of action will eventually get internalised.

I find these viewpoints sink in better when they're used in responding to a crisis.

I also like to encourage self-honesty. Sometimes my job is to say out loud things which are unsaid. Founders are very good at convincing (both themselves and others), otherwise they wouldn't be founders. Sometimes the data that doesn't fit the narrative is left out... to others and to themselves. So I can help break that down.

There will be crises and crises and crises. But we only have these Office Hours for 12 weeks. If we concentrate on fixing just today's issue, we miss the opportunity to build habits that can handle tomorrow's.


5. Am I being useful right now?

As much as the above is useful in the long-term, there has to be a balance: these sessions should also tackle the issues brought into the room. In the last few weeks of the program, I find that we spend more and more time on day-to-day business issues. The founders have figured out how to get what they need out of me. And if they can do it with me, my hopes are high they can do it with anyone.

What do we look at? An iteration of the pitch deck. A run-through of the sales process. How to hear a "no" as a description of what a customer wants, and to use it to win the sale. Examples of pipelines and proposals. The agenda for a weekly growth meeting. Showing how the almost identical pitch deck can be re-pitched with added intensity if you pay attention to emotional narrative and rhetoric. Investor motivations.

I'm not an expert, but I do a lot of things a little bit, so I can be a useful second pair of eyes.

(I pay attention when the same topic comes up more than once and try to understand why the founder has not instinctively generalised.)

Also towards the end of the program, I get more open about some of my approaches above. The sessions get more and more collaborative. In the end I'm learning quite a lot.


6. If nothing comes up, getting to know each other is great.

I want to make it very clear that all the good stuff you see is entirely down to the startups themselves. Advice is bullshit. The bar I set myself is: can this hour be more effective than the hour they would otherwise spend building their business. Almost certainly not.

As I said above, the founder has been thinking way more about their company and their market than I have. There are experts out there far smarter than me. But there's a bigger point:

I have to remind myself it's not my company. I don't make the decisions. In the event that I do recommend a direction, I remind myself that I mustn't get offended if they don't take my advice. (It's a natural and human response to be offended when offered advice is not taken.) It's not that I ought not be offended--it's that being offended would be a category error. The material I work with is the actions of the founder. The material isn't right or wrong, it simply is.

A good way to do all of the above--to react appropriately, to coach good habits, and to be useful--is for the founder, team, and me to get to know one another. The better you know each other's values, the higher the information content of any given interaction. So sometimes the best thing to do is to hang out.


Reading these reflections, I sound, even to myself, like a pompous arse. I mean, there's a very good chance that I am a pompous arse, which would be the reason why.

Honestly, mostly the sessions are just chatting. I work hard to make them useful chatting, and yes I probably overthink it. My Office Hours will be more useful to some founders than to others. And I'm sure a lot of people, in my shoes, would do a much better job and wouldn't indulge themselves with endless introspection.

Amateur hour coaching, that's all it is.

This feeling is so strong that I think I will have to warn readers somewhere near the top. Say, around the ninth paragraph.


Here's a quote from Bob Shaw's short story "Call me Dumbo," found in the collection Tomorrow Lies in Ambush.

An aircraft factory is a machine for producing aeroplanes and it may be disastrous to attempt to improve production by piecemeal tinkering with individual departments--one must seek out in all its ramifications, and destroy, the machine for stopping the production of aeroplanes, which lurks like a parasite within the organisation.

I love this way of thinking.

Let's start from the perspective that a startup is a machine for growing. But there are obstacles which temper the growth. Our job, together, is to identify and to remove the invisible anti-growth machine.

The end of week 10.

Simon Willison (Django)Datasette plugins, and building a clustered map visualization

Datasette now supports plugins!

Last Saturday I asked Twitter for examples of Python projects with successful plugin ecosystems. pytest was the clear winner: the pytest plugin compatibility table (an ingenious innovation that I would love to eventually copy for Datasette) lists 457 plugins, and even the core pytest system itself is built as a collection of default plugins that can be replaced or over-ridden.

Best of all: pytest’s plugin mechanism is available as a separate package: pluggy. And pluggy was exactly what I needed for Datasette.

You can follow the ongoing development of the feature in issue #14. This morning I released Datasette 0.20 with support for a number of different plugin hooks: plugins can add custom template tags and SQL functions, and can also bundle their own static assets, JavaScript, CSS and templates. The hooks are described in some detail in the Datasette Plugins documentation.
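To give a feel for what a custom SQL function is, here is a minimal sketch that uses only Python's standard sqlite3 module. It does not use Datasette's plugin hooks at all; it assumes that a plugin ultimately gets to register functions on a SQLite connection in a similar way, and the Plugins documentation is the place to check the real hook names. The `reverse_string` function is a made-up example.

```python
import sqlite3


def reverse_string(value):
    """Hypothetical custom SQL function: return the text reversed."""
    return value[::-1]


conn = sqlite3.connect(":memory:")

# Register the Python function under a SQL name, declaring one argument.
conn.create_function("reverse_string", 1, reverse_string)

# The function can now be called from any SQL run on this connection.
result = conn.execute("select reverse_string('datasette')").fetchone()[0]
```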

datasette-cluster-map

I also released my first plugin: datasette-cluster-map. Once installed, it looks out for database tables that have a latitude and longitude column. When it finds them, it draws all of the points on an interactive map using Leaflet and Leaflet.markercluster.
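The idea of looking for tables with a latitude and a longitude column can be sketched in a few lines of standard-library Python. This is only an illustration of the detection idea and not the plugin's own code, which may work quite differently; the `tables_with_coordinates` helper and the `bears` and `notes` tables are hypothetical.

```python
import sqlite3


def tables_with_coordinates(conn):
    """Return names of tables that have both latitude and longitude columns."""
    matches = []
    tables = conn.execute(
        "select name from sqlite_master where type = 'table' order by name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        columns = {row[1] for row in rows}
        if {"latitude", "longitude"} <= columns:
            matches.append(table)
    return matches


# Hypothetical demo database: one table with coordinates, one without.
conn = sqlite3.connect(":memory:")
conn.execute("create table bears (id integer, latitude real, longitude real)")
conn.execute("create table notes (id integer, body text)")

found = tables_with_coordinates(conn)
```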

Let’s try it out on some polar bears!

Polar Bears on a cluster map

The USGS Alaska Science Center have released a delightful set of data entitled Sensor and Location data from Ear Tag PTTs Deployed on Polar Bears in the Southern Beaufort Sea 2009 to 2011. It’s a collection of CSV files, which means it’s trivial to convert it to SQLite using my csvs-to-sqlite tool.

Having created the SQLite database, we can deploy it to a hosting account on Zeit Now alongside the new plugin like this:

# Make sure we have the latest datasette
pip3 install datasette --upgrade
# Deploy polar-bears.db to now with an increased default page_size
datasette publish now \
    --install=datasette-cluster-map \
    --extra-options "--page_size=500" \
    polar-bears.db

The --install option is new in Datasette 0.20 (it works for datasette publish heroku as well) - it tells the publishing provider to pip install the specified package. You can use it more than once to install multiple plugins, and it accepts a path to a zip file in addition to the name of a PyPI package.

Explore the full demo at https://datasette-cluster-map-demo.now.sh/polar-bears

Visualize any query on a map

Since the plugin inserts itself at the top of any Datasette table view with latitude and longitude columns, there are all sorts of neat tricks you can do with it.

I also loaded the San Francisco tree list (thanks, Department of Public Works) into the demo. Impressively, you can click “load all” on this page and Leaflet.markercluster will load in all 189,144 points and display them on the same map… and it works fine on my laptop and my phone. Computers in 2018 are pretty good!

But since it’s a Datasette table, we can filter it. Here’s a map of every New Zealand Xmas Tree in San Francisco (8,683 points). Here’s every tree where the Caretaker is Friends of the Urban Forest. Here’s every palm tree planted in 1990:

Palm trees planted in 1990

Update: This is an incorrect example: there are 21 matches on "palm avenue" because the FTS search index covers the address field - they're not actually palm trees. Here's a corrected query for palm trees planted in 1990.

The plugin currently only works against columns called latitude and longitude… but if your columns are called something else, don’t worry: you can craft a custom SQL query that aliases your columns and everything will work as intended. Here’s an example against some more polar bear data:

select *, "Capture Latitude" as latitude, "Capture Longitude" as longitude
from [USGS_WC_eartag_deployments_2009-2011]
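To see why the alias trick works, here is a small standard-library sketch showing that the aliased query returns extra columns named latitude and longitude alongside the originals. The `deployments` table, the `bear_id` column, and the coordinate values are invented for this example and are not from the USGS data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table whose coordinate columns have awkward names.
conn.execute(
    "create table deployments "
    '(bear_id text, "Capture Latitude" real, "Capture Longitude" real)'
)
conn.execute("insert into deployments values ('bear-1', 70.5, -148.2)")

# Same shape as the query above: keep every column, add two aliased copies.
cursor = conn.execute(
    'select *, "Capture Latitude" as latitude, "Capture Longitude" as longitude '
    "from deployments"
)
column_names = [description[0] for description in cursor.description]
row = cursor.fetchone()
```

Anything that only inspects the result's column names then finds the latitude and longitude it expects.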

Writing your own plugins

I’m really excited to see what people invent. If you want to have a go, your first stop should be the Plugins documentation. If you want an example of a simple plugin (including the all-important mechanism for packaging it up using setup.py) take a look at datasette-cluster-map on GitHub.

And if you have any thoughts, ideas or suggestions on how the plugin mechanism can be further employed please join the conversation on issue #14. I’ve literally just got started with Datasette’s plugin hooks, and I’m very keen to hear about things people want to build that aren’t yet supported.

Jeremy Keith (Adactio)Timing

Apple Inc. is my accidental marketing department.

On April 29th, 2010, Steve Jobs published his infamous Thoughts on Flash. It thrust the thitherto geek phrase “HTML5” into the mainstream press:

HTML5, the new web standard that has been adopted by Apple, Google and many others, lets web developers create advanced graphics, typography, animations and transitions without relying on third party browser plug-ins (like Flash). HTML5 is completely open and controlled by a standards committee, of which Apple is a member.

Five days later, I announced the first title from A Book Apart: HTML5 For Web Designers. The timing was purely coincidental, but it definitely didn’t hurt that book’s circulation.

Fast forward eight years…

On March 29th, 2018, Apple released the latest version of iOS. Unmentioned in the press release, this update added service worker support to Mobile Safari.

Five days later, I announced the 26th title from A Book Apart: Going Offline.

For a while now, quite a few people have cited Apple’s lack of support as a reason why they weren’t investigating service workers. That excuse no longer holds water.

Once again, the timing is purely coincidental. But it can’t hurt.

Jeremy Keith (Adactio)That new-book smell

The first copies of Going Offline showed up today! This is my own personal stash, sent just a few days before the official shipping date of next Monday.

I am excite!

To say I was excited when I opened the box of books would be an understatement. I was positively squealing with joy!

Others in the Clearleft office shared in my excitement. Everyone did that inevitable thing, where you take a fresh-out-of-the-box book, open it up and press it against your nose. It’s like the bookworm equivalent of sniffing glue.

Actually, it basically is sniffing glue. I mean, that’s what’s in the book binding. But let’s pretend that we’re breathing in the intoxicating aroma of freshly-minted words.

If you’d like to bury your nose in a collection of my words glued together in a beautifully-designed package, you can pre-order the book now and await delivery of the paperback next week.

Amazon Web ServicesGet Started with Blockchain Using the new AWS Blockchain Templates

Many of today’s discussions around blockchain technology remind me of the classic Shimmer Floor Wax skit. According to Dan Aykroyd, Shimmer is a dessert topping. Gilda Radner claims that it is a floor wax, and Chevy Chase settles the debate and reveals that it actually is both! Some of the people that I talk to see blockchains as the foundation of a new monetary system and a way to facilitate international payments. Others see blockchains as a distributed ledger and immutable data source that can be applied to logistics, supply chain, land registration, crowdfunding, and other use cases. Either way, it is clear that there are a lot of intriguing possibilities and we are working to help our customers use this technology more effectively.

We are launching AWS Blockchain Templates today. These templates will let you launch an Ethereum (either public or private) or Hyperledger Fabric (private) network in a matter of minutes and with just a few clicks. The templates create and configure all of the AWS resources needed to get you going in a robust and scalable fashion.

Launching a Private Ethereum Network
The Ethereum template offers two launch options. The ecs option creates an Amazon ECS cluster within a Virtual Private Cloud (VPC) and launches a set of Docker images in the cluster. The docker-local option also runs within a VPC, and launches the Docker images on EC2 instances. The template supports Ethereum mining, the EthStats and EthExplorer status pages, and a set of nodes that implement and respond to the Ethereum RPC protocol. Both options create and make use of a DynamoDB table for service discovery, along with Application Load Balancers for the status pages.

Here are the AWS Blockchain Templates for Ethereum:

I start by opening the CloudFormation Console in the desired region and clicking Create Stack:

I select Specify an Amazon S3 template URL, enter the URL of the template for the region, and click Next:

I give my stack a name:

Next, I enter the first set of parameters, including the network ID for the genesis block. I’ll stick with the default values for now:

I will also use the default values for the remaining network parameters:

Moving right along, I choose the container orchestration platform (ecs or docker-local, as I explained earlier) and the EC2 instance type for the container nodes:

Next, I choose my VPC and the subnets for the Ethereum network and the Application Load Balancer:

I configure my keypair, EC2 security group, IAM role, and instance profile ARN (full information on the required permissions can be found in the documentation):

The Instance Profile ARN can be found on the summary page for the role:

I confirm that I want to deploy EthStats and EthExplorer, choose the tag and version for the nested CloudFormation templates that are used by this one, and click Next to proceed:

On the next page I specify a tag for the resources that the stack will create, leave the other options as-is, and click Next:

I review all of the parameters and options, acknowledge that the stack might create IAM resources, and click Create to build my network:

The template makes use of three nested templates:

After all of the stacks have been created (mine took about 5 minutes), I can select JeffNet and click the Outputs tab to discover the links to EthStats and EthExplorer:

Here’s my EthStats:

And my EthExplorer:

If I am writing apps that make use of my private network to store and process smart contracts, I would use the EthJsonRpcUrl.
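As a rough illustration of what using that URL could look like, here is a small sketch that talks to an Ethereum JSON-RPC endpoint with only the Python standard library. The helper names are my own, and the URL passed to call_rpc is whatever the EthJsonRpcUrl output shows for your stack. eth_blockNumber is a standard Ethereum JSON-RPC method that returns the latest block number as a hex string.

```python
import json
from urllib import request


def build_rpc_payload(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }


def call_rpc(url, method, params=None):
    """POST a JSON-RPC request to the given endpoint and return its result."""
    body = json.dumps(build_rpc_payload(method, params)).encode("utf-8")
    req = request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["result"]


def hex_to_int(value):
    """Ethereum JSON-RPC encodes quantities as 0x-prefixed hex strings."""
    return int(value, 16)


# Building the payload needs no network access.
payload = build_rpc_payload("eth_blockNumber")
print(json.dumps(payload))

# With a running network, the call would look like this:
# latest_block = hex_to_int(call_rpc(eth_json_rpc_url, "eth_blockNumber"))
```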

Stay Tuned
My colleagues are eager to get your feedback on these new templates and plan to add new versions of the frameworks as they become available.

Jeff;

Simon Willison (Django)I submitted a PWA to 3 app stores. Here's what I learned

Useful real-world experience shipping a progressive web app to the iOS, Android and Windows app stores.

Via Hacker News

ProgrammableWebMicrosoft Launches a Public Preview of a New Security API

Microsoft announced a public preview of a Security API at the RSA conference in San Francisco this week.

The API allows Microsoft's customers and partners to extend the company's Intelligent Security Graph, a platform designed to "protect and strengthen Microsoft products and services." The Intelligent Security Graph provides real-time threat intelligence and protection, as well as analytics.

ProgrammableWebGoogle Finalizes APIs In Android Things Release Candidate

Google is billing Android Things Developer Preview 8 as a release candidate. The final preview, available now, includes the final set of APIs for Google's IoT platform ahead of its stable 1.0 release. The preview also debuts new behaviors in the Android Things Developer Console. Here's what's new.

Simon Willison (Django)How to rewrite your SQL queries in Pandas, and more

I still haven't fully internalized the idioms needed to manipulate DataFrames in pandas. This tutorial helps a great deal - it shows the Pandas equivalents for a host of common SQL queries.

Via Gergely Szerovay

Simon Willison (Django)Intro to Threads and Processes in Python

I really like the diagrams in this article which compares the performance of Python threads and processes for different types of task via the excellent concurrent.futures library.
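For a feel of the library itself, here is a small sketch of my own (not taken from the article) that runs a CPU-bound function through an executor and times it. The count_primes and timed_map names are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed_map(executor_cls, func, inputs, workers=4):
    """Run func over inputs with the given executor class and time it."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(func, inputs))
    return results, time.perf_counter() - start


# Threads share one interpreter, so CPU-bound work like this gains little.
thread_results, thread_seconds = timed_map(
    ThreadPoolExecutor, count_primes, [10, 10]
)
print(thread_results, round(thread_seconds, 4))
```

Passing ProcessPoolExecutor as executor_cls is the only change needed to try processes instead. That call should be made from inside an if __name__ == "__main__": block, since worker processes may re-import the script.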

Via Gergely Szerovay

Simon Willison (Django)How to Use Static Type Checking in Python 3.6

Useful introduction to optional static typing in Python 3.6, including how to use mypy, PyCharm and the Atom mypy plugin.
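A quick sketch of what the annotations look like in practice, using function annotations plus the variable annotation syntax that arrived in Python 3.6. The function names and data are made up for illustration.

```python
from typing import Dict, List, Optional


def average(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def find_user(users: Dict[int, str], user_id: int) -> Optional[str]:
    """Return the user's name, or None when the id is unknown."""
    return users.get(user_id)


# Variable annotations (new in Python 3.6).
scores: List[float] = [1.0, 2.5, 4.0]
mean: float = average(scores)
name: Optional[str] = find_user({1: "ada"}, 2)

print(mean, name)

# A checker such as mypy is expected to flag a call like the one below,
# because a str is not a List[float]. Python does not enforce the
# annotations at runtime, so the mistake would otherwise only surface
# when the line runs.
# average("not a list")
```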

Via Gergely Szerovay

Simon Willison (Django)The best of Python: a collection of my favorite articles from 2017 and 2018 (so far)

Gergely Szerovay has brought together an outstandingly interesting selection of Python articles from the last couple of years of activity of the Python community on Medium. A whole load of gems in here that I hadn't seen before.

Simon Willison (Django)Creating Simple Interactive Forms Using Python + Markdown Using ScriptedForms + Jupyter

ScriptedForms is a fascinating Jupyter hack that lets you construct dynamic documents defined using markdown that provide form fields and evaluate Python code instantly as you interact with them.

Via @psychemedia

Simon Willison (Django)What’s New in MySQL 8.0

MySQL 8 has lots of exciting improvements: Window functions, SRS aware spatial types for GIS, utf8mb4 by default, a ton of JSON improvements and atomic DDL. I no longer feel at a significant disadvantage when I have to use MySQL in place of PostgreSQL.

ProgrammableWebDaily API RoundUp: Flow, Scanova, Quikkly, Returnly

Every day, the ProgrammableWeb team is busy, updating its three primary directories for APIs, clients (language-specific libraries or SDKs for consuming or providing APIs), and source code samples.

ProgrammableWebHow to Ensure APIs Drive Everlasting Organizational Value

Physicist and computer scientist Alex Wissner-Gross made headlines a few years ago with his definition of intelligence.

ProgrammableWeb: APIsCboe Trade Review

The Cboe Trade Review API is a price generator for error reviews on trades executed on US option exchanges. Trade Review is a system wrapped around Cboe LiveVol’s Time and Sales data that enables an error trade review process, which determines whether an executed trade qualifies for an adjustment. Cboe LiveVol is an equity and index options technology and services data provider for professional and retail traders. This includes market data for Backtester, Custom Scans, Market at a Glance, Market Reference, Option Scans, Theo Calculator, Trade Review and more. Cboe tools, data, and custom analytics services offer technology and data solutions for a consolidated feed, real-time programmatic analysis and scanning, historical files and back testing, real-time decision support, flat files, XML web services, web components, custom development and consulting.
Date Updated: 2018-04-19
Tags: Financial, Analytics, Data, Marketplace, Real Time, Tools

ProgrammableWeb: APIsCboe Time and Sales

The Cboe Time and Sales API gives you access to tick and trade data across US equities, ETFs, indexes and options. Cboe LiveVol is an equity and index options technology and services data provider for professional and retail traders. This includes market data for Backtester, Custom Scans, Market at a Glance, Market Reference, Option Scans, Theo Calculator, Trade Review and more. Cboe tools, data, and custom analytics services offer technology and data solutions for a consolidated feed, real-time programmatic analysis and scanning, historical files and back testing, real-time decision support, flat files, XML web services, web components, custom development and consulting.
Date Updated: 2018-04-19
Tags: Financial, Analytics, Data, Marketplace, Real Time, Tools

ProgrammableWeb: APIsCboe Theo Calc

The Cboe Theo Calc API allows you to integrate Cboe benchmark theoretical values into your workflow and automate execution monitoring and error detection. Cboe LiveVol is an equity and index options technology and services data provider for professional and retail traders. This includes market data for Backtester, Custom Scans, Market at a Glance, Market Reference, Option Scans, Theo Calculator, Trade Review and more. Cboe tools, data, and custom analytics services offer technology and data solutions for a consolidated feed, real-time programmatic analysis and scanning, historical files and back testing, real-time decision support, flat files, XML web services, web components, custom development and consulting.
Date Updated: 2018-04-19
Tags: Financial, Analytics, Data, Marketplace, Real Time, Tools

ProgrammableWeb: APIsQRtag.net

The QRtag.net API returns QR codes as SVG or PNG images in response to URI/CRUD requests. QRtag shortens URLs so that the resulting codes are quick and easy to read. To use the API, embed the request URL as you would a normal image.
Date Updated: 2018-04-19
Tags: Barcodes, Linked Data, QR Codes, URL Shortener

ProgrammableWeb: APIsShinobi Media

The Shinobi Media API returns movie, TV show, actor, and rating data from IMDb. The API is described with a Swagger file. Developers make URI/CRUD requests, authenticate with a token, and receive JSON and XML responses.
Date Updated: 2018-04-19
Tags: Media, Actors, Movies, TV

Amazon Web ServicesNew – Registry of Open Data on AWS (RODA)

Almost a decade ago, my colleague Deepak Singh introduced the AWS Public Datasets in his post Paging Researchers, Analysts, and Developers. I’m happy to report that Deepak is still an important part of the AWS team and that the Public Datasets program is still going strong!

Today we are announcing a new take on open and public data, the Registry of Open Data on AWS, or RODA. This registry includes existing Public Datasets and allows anyone to add their own datasets so that they can be accessed and analyzed on AWS.

Inside the Registry
The home page lists all of the datasets in the registry:

Entering a search term shrinks the list so that only the matching datasets are displayed:

Each dataset has an associated detail page, including usage examples, license info, and the information needed to locate and access the dataset on AWS:

In this case, I can access the data with a simple CLI command:

I could also access it programmatically, or download data to my EC2 instance.

Adding to the Repository
If you have a dataset that is publicly available and would like to add it to RODA, you can simply send us a pull request. Head over to the open-data-registry repo, read the CONTRIBUTING document, and create a YAML file that describes your dataset, using one of the existing files in the datasets directory as a model:

We’ll review pull requests regularly; you can “star” or watch the repo in order to track additions and changes.

Impress Me
I am looking forward to an inrush of new datasets, along with some blog posts and apps that show how to use the data in powerful and interesting ways. Let me know what you come up with.

Jeff;

ProgrammableWebRev.io Launches a REST API for its Usage-Based Billing Platform

Rev.io, a provider of a usage-based billing platform for telecom and IoT companies, today announced a new REST API at the Channel Partners Conference & EXPO in Las Vegas. The API gives the company's customers and partners programmatic access to usage-rating, metered billing, customer data, and inventory catalog information.

ProgrammableWeb: APIsMoon Banking

The Moon Banking API is a ratings service for crypto-friendly banks. This production API is funded by the Lightning Network, a node currently deployed on the Bitcoin Test Network that has several pluggable back-end chain services. Moon Banking provides a way to view ratings of crypto-friendly banks.
Date Updated: 2018-04-18
Tags: Cryptocurrency, Ratings

ProgrammableWeb: APIsRev.io

The Rev.io API provides a way to manage your customers and billing, including adding a new customer, creating a request or quote for a new service, and more. Rev.io is a SaaS recurring revenue and customer management platform.
Date Updated: 2018-04-18
Tags: Billing, Customer Relationship Management

ProgrammableWeb: APIsCboe LiveVol’s Market at a Glance

The Cboe LiveVol Market at a Glance API allows you to get trades, quotes, implied volatility and market stats on the US equity and options markets. Cboe LiveVol is an equity and index options technology and services data provider for professional and retail traders. This includes market data for Backtester, Custom Scans, Market at a Glance, Market Reference, Option Scans, Theo Calculator, Trade Review and more. Cboe tools, data, and custom analytics services offer technology and data solutions for a consolidated feed, real-time programmatic analysis and scanning, historical files and back testing, real-time decision support, flat files, XML web services, web components, custom development and consulting.
Date Updated: 2018-04-18
Tags: Financial, Analytics, Data, Marketplace, Real Time, Tools

ProgrammableWeb: APIsG-Square NLP

The G-Square NLP (Natural Language Processing) API offers sentiment recognition, intent detection for social media comments, and topic and keyword extraction. G-Square develops tools for the financial services industry including analytics for actionable business insights, Forex pricing solutions, and big data analysis.
Date Updated: 2018-04-18
Tags: Natural Language Processing, Financial, Language

Footnotes

Updated: .  Michael(tm) Smith <mike@w3.org>