Building a better Web Browser….

On March 23rd, 2015, ZDNet and many other specialized IT magazines published articles about the Pwn2Own 2015 contest, with headlines that read (more or less):

“Pwn2Own 2015: The year every web browser went down”

And the summary of the article said: “Every major browser showed up (with their latest and best version)... every web browser got hacked”

For those who are not familiar with the Pwn2Own contest, it is a computer hacking contest that started in 2007 and is held annually at the CanSecWest security conference. Contestants are challenged to exploit widely used software and mobile devices with previously unknown vulnerabilities. The name “Pwn2Own” is derived from the fact that contestants must “pwn” (hack) the device in order to “own” (win) it.

The first contest was conceived and developed by Dragos Ruiu in response to his frustration with Apple’s lack of response to the Month of Apple Bugs and the Month of Kernel Bugs, as well as Apple’s television commercials that trivialized the security built into the competing Windows operating system. At the time, there was a widespread belief that, despite these public displays of vulnerabilities in Apple products, OS X was significantly more secure than any of its competitors. Interesting, isn’t it?

The Pwn2Own contest serves to demonstrate the vulnerability of devices and software in widespread use while also providing a checkpoint on the progress made in security since the previous year.

The 2015 winners of the contest received $555,500 (yes, more than half a million dollars) in prize money, plus the laptops they used to hack (HP gaming notebooks) and other additional prizes.

The top “hacker” was Jung Hoon Lee (aka lokihardt) from South Korea. He left Vancouver with the impressive amount of $225,000: nearly a quarter of a million dollars, and about 40% of the total prize money for the contest. Not too bad!

But what makes it more impressive is that, traditionally, the prize goes to a team, whereas “our lokihardt” did it as an individual competitor, not as a member of a team!

All this leads me to the core of the subject of this post: Building a better browser…

A few weeks ago I attended a talk with that title by James Mickens, who works at Microsoft Research in Redmond (Washington).

At the beginning of the World Wide Web, the browser started as a “Universal HTML Interpreter”, a kind of “dumb terminal of the past”. Over time a number of “modules” or “features” have been added, and the “standard modules” of today’s browsers are typically:

  • The network stack: implements transfer protocols (http, https, file, etc.)
  • The HTML and CSS (Cascading Style Sheets) parsers: validate HTML and CSS code and enforce “a valid format” if pages are ill-specified
  • The Document Object Model (DOM tree): a browser-neutral standard to represent HTML content and its associated CSS
  • The layout and rendering engine: traverses the DOM tree and determines the visual size and spatial position of every element of the tree
  • The JavaScript interpreter: implements the JavaScript runtime and reflects the DOM tree in the JavaScript namespace, defining JavaScript objects which are essentially proxies for internal browser objects
  • The storage layer: manages access to persistent data like cookies, cached web objects, and DOM storage, a new abstraction that provides each domain with several megabytes of key/value storage
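
To make the last item more concrete, here is a minimal sketch in Python of what “key/value storage per domain with a size limit” could look like. The class name `DomStorage`, the `QuotaExceeded` exception and the 5 MB default quota are all assumptions made for illustration; this is not the API of any real browser.

```python
from typing import Dict, Optional


class QuotaExceeded(Exception):
    """Raised when a write would push a domain past its storage quota."""


class DomStorage:
    """Toy per-domain key/value store (hypothetical, not a browser API)."""

    def __init__(self, quota_bytes: int = 5 * 1024 * 1024) -> None:
        # The 5 MB default is an assumption standing in for "several megabytes".
        self.quota_bytes = quota_bytes
        self._stores: Dict[str, Dict[str, str]] = {}

    def _used(self, domain: str) -> int:
        # Bytes currently used by one domain (keys plus values, UTF-8 encoded).
        store = self._stores.get(domain, {})
        return sum(len(k.encode()) + len(v.encode()) for k, v in store.items())

    def set_item(self, domain: str, key: str, value: str) -> None:
        store = self._stores.setdefault(domain, {})
        old = store.get(key)
        old_size = len(key.encode()) + len(old.encode()) if old is not None else 0
        new_size = len(key.encode()) + len(value.encode())
        if self._used(domain) - old_size + new_size > self.quota_bytes:
            raise QuotaExceeded(domain)
        store[key] = value

    def get_item(self, domain: str, key: str) -> Optional[str]:
        # Each domain only ever sees its own store.
        return self._stores.get(domain, {}).get(key)
```

The point of the sketch is only that the storage is partitioned by domain: a value written by one domain is invisible to every other domain, and each domain has its own budget.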

One way or the other, browsers have become a sort of “Operating System”, since they have:

  • Network (XHR, WebSockets)
  • Disk IO (DOM storage)
  • Graphics (WebGL, <video>)
  • Sound (<audio>)
  • Concurrency (Web workers)

Unfortunately, browser architectures are broken because they are riddled with poor abstractions, and the consequence is that modern web browsers make it difficult to create fast, secure, and robust programs.

Browsers like Firefox and some versions of IE (e.g. IE8) have a “monolithic architecture”. Monolithic browsers share two important characteristics. First, a browser “instance” consists of a process containing all of the components mentioned above. In some monolithic browsers, separate tabs receive separate processes; however, within a tab, browser components are not isolated. Second, from the web page’s perspective, all of the browser components are either black box or grey box. In particular, the HTML/CSS parser, layout engine, and renderer are all black boxes: the application has no way to monitor or directly influence the operation of these components. Instead, the application provides HTML and CSS as inputs, and receives a DOM tree and a screen repaint as outputs. The JavaScript runtime is grey box, since the JavaScript language provides powerful facilities for reflection and dynamic object modification. However, the so-called “native objects” within the browser are not so “grey” and may lead in many cases to rather unpleasant “surprises”.

Is there any solution to the problem?

One of the solutions, provided by researchers at the University of Illinois, is the so-called “OP web browser”. To enable more secure web browsing, they have designed and implemented a new browser that attempts to improve security using state-of-the-art software design approaches. They do it by combining operating system design principles with formal methods, drawing on the expertise of both communities.

The design philosophy is to partition the browser into smaller subsystems and make all communication between subsystems simple and explicit. At the core of the design is a small browser kernel (micro-kernel) that manages the browser subsystems and interposes on all communications between them to enforce the browser security features.
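
A rough way to picture this design is a kernel that is the only path between subsystems and checks every message against a policy. The Python sketch below is only an illustration of that idea: the class `BrowserKernel`, the subsystem names and the policy table are all made up for the example and are not taken from the OP browser itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    payload: str


class BrowserKernel:
    """Toy kernel: the only path between subsystems, so it sees every message."""

    def __init__(self, allowed_routes: Set[Tuple[str, str]]) -> None:
        # Hypothetical policy: the set of (sender, receiver) pairs that may talk.
        self._allowed_routes = allowed_routes
        self._handlers: Dict[str, Callable[[Message], None]] = {}
        self.audit_log: List[Tuple[Message, bool]] = []

    def register(self, name: str, handler: Callable[[Message], None]) -> None:
        self._handlers[name] = handler

    def send(self, message: Message) -> bool:
        # Interpose on the message: deliver it only if the policy allows the route.
        route = (message.sender, message.receiver)
        delivered = route in self._allowed_routes and message.receiver in self._handlers
        self.audit_log.append((message, delivered))
        if delivered:
            self._handlers[message.receiver](message)
        return delivered


received: List[str] = []
kernel = BrowserKernel(allowed_routes={("javascript", "network")})
kernel.register("network", lambda m: received.append(m.payload))
kernel.register("storage", lambda m: received.append(m.payload))

kernel.send(Message("javascript", "network", "GET /index.html"))  # allowed by policy
kernel.send(Message("plugin", "storage", "read cookies"))  # dropped by policy
```

Because subsystems never call each other directly, the security policy lives in one small place, and the audit log records every attempted communication, including the refused ones.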

This certainly represents progress, since it provides better security and fault isolation than monolithic browsers. However, OP still uses standard, off-the-shelf browser modules to provide the DOM tree, the JavaScript runtime, and so on. Thus, OP still presents web developers with a number of “frustrations” when developing “complex web applications”.

In fact, each browser provides its own implementation of the standard components. These implementation families are roughly compatible with each other, but each one has numerous quirks and bugs. Since a browser’s components are weakly “introspectable” at best (it is difficult to know their internal state), developers are forced to use conditional code paths and ad-hoc best practices to get complex web applications running across different browsers.

There are problems with “Event Handling”, “Parsing Bugs”, “Rendering Bugs” and “JavaScript/DOM incompatibilities”, to mention only some.

So the Holy Grail of a “Browser based on Standards” that allowed “Write Once, Run Everywhere” became “Write Once, Test Everywhere” and is now “Write Variants, Test Everywhere”. What can one say?

Summing up, it is easy to write a simple web page that looks the same and has the same functionality in all browsers. Unfortunately, web pages of even moderate sophistication quickly encounter inconsistencies and bugs in browser runtimes.

James and his team have been working on a prototype of a new generation of browsers called “Exo-kernel Browsers”. Their prototype, called Atlantis, tries to solve the above-mentioned problems by providing pages with an extensible execution environment. It defines a narrow API for basic services like collecting user input, exchanging network data, and rendering images. By composing these primitives, web pages can define their own custom, high-level execution environments.
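
To give a feeling for what “composing primitives” means, here is a small Python sketch. Everything in it is hypothetical: the `NarrowKernel` interface, its two methods and the line-based “markup” are invented for the example and do not reproduce the real Atlantis API. It only shows a page-supplied runtime built on nothing but a couple of low-level services.

```python
from typing import Dict, List, Protocol, Tuple


class NarrowKernel(Protocol):
    """Hypothetical narrow API: just enough to fetch data and draw text."""

    def fetch(self, url: str) -> str: ...

    def draw_text(self, x: int, y: int, text: str) -> None: ...


class FakeKernel:
    """In-memory stand-in so the sketch runs without any browser."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages
        self.drawn: List[Tuple[int, int, str]] = []

    def fetch(self, url: str) -> str:
        return self.pages[url]

    def draw_text(self, x: int, y: int, text: str) -> None:
        self.drawn.append((x, y, text))


class LineMarkupRuntime:
    """Page-supplied runtime for a made-up markup: one line of text per row."""

    LINE_HEIGHT = 16  # pixels per row, an arbitrary choice for the example

    def __init__(self, kernel: NarrowKernel) -> None:
        self.kernel = kernel

    def load(self, url: str) -> None:
        # "Parsing" and "layout" are defined by the page, not by the browser.
        for row, line in enumerate(self.kernel.fetch(url).splitlines()):
            self.kernel.draw_text(0, row * self.LINE_HEIGHT, line)


kernel = FakeKernel({"app://demo": "hello\nworld"})
LineMarkupRuntime(kernel).load("app://demo")
```

The runtime decides what the markup means and where things go on screen; the kernel underneath only moves bytes and pixels.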

Therefore, an application which does not want to depend on Atlantis’s predefined web stack can selectively redefine components of that stack, or define markup formats and scripting languages that look nothing like the current browser runtime. Unlike prior microkernel browsers like OP, and compile-to-JavaScript frameworks like GWT, Atlantis is the first browsing system to truly minimize a web page’s dependence on “black box” browser code. This should make it much easier to develop robust, secure web applications.

The Master Kernel contains the Switchboard Process, the Device Server and the Storage Manager: a very simple architecture with a relatively simple API.

Every time a “web domain” (protocol, host name, port) is instantiated, it receives a separate isolation container with an instance kernel and the script interpreter (called Syphon). A web application selects its environment by adding an “environment” tag at the top of its markup, which allows the page to be interpreted not only as HTML but as any kind of markup language. If no environment is specified, the instance kernel assumes that the page is executed on top of the “Standard Stack”.
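
The (protocol, host name, port) triple is easy to compute with Python’s standard `urllib.parse` module. The sketch below derives it from a URL and hands out one container per triple; the `ContainerRegistry` class and the dictionary used as a “container” are assumptions made for illustration, not Atlantis code.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


class Origin(NamedTuple):
    protocol: str
    host: str
    port: Optional[int]


def origin_of(url: str) -> Origin:
    """Reduce a URL to its (protocol, host name, port) triple."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return Origin(parts.scheme, parts.hostname or "", port)


class ContainerRegistry:
    """Hypothetical registry: one isolation container per origin."""

    def __init__(self) -> None:
        self._containers: Dict[Origin, dict] = {}

    def container_for(self, url: str) -> dict:
        origin = origin_of(url)
        # Pages from the same origin share a container; others get a new one.
        return self._containers.setdefault(origin, {"origin": origin, "pages": []})


registry = ContainerRegistry()
a = registry.container_for("http://example.com/index.html")
b = registry.container_for("http://EXAMPLE.com:80/other.html")
c = registry.container_for("https://example.com/index.html")
```

Here `a` and `b` are the same container, because the two URLs differ only in path and spelling, while `c` is a different one because the protocol (and therefore the default port) changes.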

The instance kernel contains two modules: the Network Manager, which interprets protocols (http, file, etc.), and the User Interface Manager, which creates a new form and registers handlers for low-level GUI events on that form. The User Interface Manager also forwards those events to the application-defined runtime and updates the bitmaps in response to messages from the layout engine.

Syphon, the script interpreter, is one of the major components of the Atlantis architecture.

Applications pass “abstract syntax trees” (ASTs) to Atlantis for execution (instead of low-level bytecode in the style of applets) for two reasons: first, ASTs are easier to optimize than bytecode, and second, it is easier to reconstruct source code from ASTs than from bytecode. The latter is particularly useful for debugging.
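
Both arguments can be tried out with Python’s own `ast` module, used here purely as a stand-in (this is Python’s AST, not Syphon’s). The sketch applies a small optimization, constant folding, directly on the tree and then turns the tree back into readable source. The names `FoldConstants` and `optimize` are mine, and `ast.unparse` needs Python 3.9 or later.

```python
import ast


class FoldConstants(ast.NodeTransformer):
    """Fold `<number> + <number>` and `<number> * <number>` into one literal."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)  # fold the children first
        left, right = node.left, node.right
        if (
            isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            if isinstance(node.op, ast.Add):
                return ast.copy_location(ast.Constant(left.value + right.value), node)
            if isinstance(node.op, ast.Mult):
                return ast.copy_location(ast.Constant(left.value * right.value), node)
        return node


def optimize(source: str) -> str:
    """Parse source into an AST, fold constants, and rebuild source text."""
    tree = FoldConstants().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


optimized = optimize("x = 2 * 3 + y")  # -> "x = 6 + y"
reconstructed = ast.unparse(ast.parse("f(a,b)"))  # -> "f(a, b)"
```

The tree keeps the structure of the program (names, operators, nesting), so rewriting it and printing it back are both short tree walks. With bytecode, that structure would first have to be recovered by a decompiler.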

Atlantis ASTs encode a new language, called Syphon, which is a superset of the recent ECMAScript (JavaScript) specification, but it is described with a generic tree syntax that may be adapted to serve as a compilation target for other high-level languages that may or may not resemble JavaScript.

Syphon offers a number of features that facilitate the construction of robust, application-defined runtimes, such as Object Shimming, Method Binding and Privileged Execution, Strong Typing, Threading, etc.

The core of the current Atlantis runtime contains, according to James, some 8600 lines of C# (C Sharp) code (the Syphon interpreter, the instance kernel, the master kernel and the IPC (Inter-Process Communication) libraries), relying on the .NET runtime for garbage collection, data types and so on. It also includes some 5500 lines of JavaScript for the demonstration web stack and a “compiler” from JavaScript to Syphon AST.

The core of Atlantis provides a very good Trusted Computing Base, enforcing, among other things, the “Same-Origin Policy”, while at the same time providing “Extensibility”: web pages can customize their own runtime in a robust manner.

In the lab, the Atlantis prototype has demonstrated very decent performance despite not having been optimized, which looks very encouraging.

To sum up all the above, current web browsers must support an API that is unnecessarily complex. This API is an uneasy conglomerate of disparate standards that define network protocols, mark-up formats, hardware interfaces, and more.

Using Exo-kernel principles, as in Atlantis, allows each web page to ship with its own implementation of the web stack. Each page can tailor its execution environment to its specific needs; in doing so, the page liberates browser vendors from the futile task of creating a one-size-fits-all web stack.

The approach proposed by James and his team looks very good and will facilitate the development of robust and secure complex web applications. So far so good. My question to James was: why is there not more progress in this area?

There are a few reasons for it:

  • Browser technology is well known and developers have got used to it
  • Browsers today are compared mainly on the speed of their JavaScript and Java engines
  • There is no perception yet that we are reaching the limits of the current technology

According to James, one of these days we are going to have big, very big problems, and then things will have to change.

And this is one of the reasons why I started by speaking about the ZDNet article.

A personal reflection: during a Windows 10-focused keynote in January 2015, Microsoft revealed that IE will be deprecated and that a new “standard” browser will be included in Windows 10. Its code name is Spartan. We already know that it will not support “legacy technologies” such as ActiveX and Browser Helper Objects, that it will use an “extension system” instead, and that it will improve its compliance with standards. IE11 will remain available in parallel for some time to support “legacy systems”.

The question is: “Will Spartan ever become an Exo-kernel Browser?” Or will Atlantis be just a “research project” and stay there?

Time will tell…… as usual !!!

Stay tuned for more….




2 Responses to “Building a better Web Browser….”

  1. Cinzia D'Ascanio says:

    Compliments on the note. It is a pleasure to see how you continue to be a good technician while being an excellent manager.
    You are always the best.
    Concerning the content of your note, I think that, apart from architecture and standards at the browser level, another big issue is the ability to develop while avoiding captivity. It is always possible, but more expensive in time and effort (at least in a short-term ROI analysis). Why do quality assessment systems not help to introduce criteria on this point? And development suites as well? Of course because there are economic advantages, but sometimes it is just laziness.

    • garcifr says:

      Hi Cinzia

      Thank you very much for your comments!

      Developing while avoiding captivity is an interesting issue. As a matter of fact, I think it is the Holy Grail of IT! All the browsers were supposed to stick to standards, and there are many; the problem is that different implementations lead to different behaviours, since standards cannot foresee everything. Look at HTML5: the standard is so big and covers so many domains that the full standardization process may take some time, and still the industry is not waiting to implement it. In the case of browsers I would not call it “captivity” but “variants”, but this is the way it is. One of the projects I ran when I was head of the Section Case Tools was a manual of “Standard C programming”. It helped, but as soon as one has to optimize, one has to take the specific implementation into account. It also happened in databases: SQL is standard, but the different implementations of views, indexes, etc. make the system dependent on the implementation of the underlying database technology.
      Having said all that, what I tried to highlight in the post was the fact that the current architecture of browsers is complex, insecure and in some cases unreliable, and that it lacks the right level of abstraction that a “quasi Operating System” should have. In this respect, ChromeOS is, by its very nature, a browser-based OS. Google therefore has some competitive advantage there, but perhaps the new MS browser is already going in that direction. To be continued…


