

Building a better Web Browser….

Tuesday, March 31st, 2015

On March 23rd, 2015, ZDNet and many other specialized IT magazines published articles about the Pwn2Own 2015 contest with headlines that read (more or less):

“Pwn2Own 2015: The year every web browser went down”

And the summary of the article said: “Every major browser showed up (with its latest and best version)….. every web browser got hacked”.

For those who are not familiar with the Pwn2Own contest, it is a computer hacking contest that started in 2007 and is held annually at the CanSecWest security conference. Contestants are challenged to exploit widely used software and mobile devices with previously unknown vulnerabilities. The name “Pwn2Own” is derived from the fact that contestants must “pwn” or hack the device in order to “own” or win it.

The first contest was conceived and developed by Dragos Ruiu in response to his frustration with Apple’s lack of response to the Month of Apple Bugs and the Month of Kernel Bugs, as well as Apple’s television commercials that trivialized the security built into the competing Windows operating system. At the time, there was a widespread belief that, despite these public displays of vulnerabilities in Apple products, OS X was significantly more secure than any of its competitors…… interesting, isn’t it?

The Pwn2Own contest serves to demonstrate the vulnerability of devices and software in widespread use while also providing a checkpoint on the progress made in security since the previous year.

The 2015 winners of the contest received $555,500 (yes, more than half a million dollars….) in prize money, plus the laptops they used to hack (HP gaming notebooks) and other additional prizes…

The top “hacker” was Jung Hoon Lee (aka lokihardt) from South Korea. He left Vancouver with the impressive amount of $225,000….. yes, nearly a quarter of a million dollars and close to half of the total prize amount for the contest… Not too bad !!!!

But what makes it more impressive is that, traditionally, the prize goes to a team….. but “our lokihardt” did it as an individual competitor, not as a member of a team…. !!!!

All this leads me to the core of the subject of this post: Building a better browser…

A few weeks ago I attended a talk with that title by James Mickens, who works at Microsoft Research in Redmond (Washington).

At the beginning of the World Wide Web, the browser started as a “Universal HTML Interpreter”…. kind of a “dumb terminal of the past”… over time a number of “modules” or “features” have been added, and the “standard modules” of today’s browsers are typically the following (a small code sketch follows the list):

  • The network stack: implements the transfer protocols (http, https, file, etc.)
  • The HTML and CSS (Cascading Style Sheets) parsers: validate HTML and CSS code and enforce a “valid format” if pages are ill-specified…
  • The Document Object Model (DOM tree): a browser-neutral standard to represent HTML content and its associated CSS
  • The layout and rendering engine: traverses the DOM tree and determines the visual size and spatial position of every element of the tree
  • The JavaScript interpreter: implements the JavaScript run-time and reflects the DOM tree in the JavaScript namespace, defining JavaScript objects which are essentially proxies for internal browser objects
  • The storage layer: manages access to persistent data like cookies, cached web objects, and DOM storage, a new abstraction that provides each domain with several megabytes of key/value storage
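
As a concrete illustration of how a page sees two of these modules, the DOM tree and the storage layer, here is a minimal TypeScript sketch using only standard Web APIs (the storage key is just an example name):

```typescript
// Walk the DOM tree produced by the HTML/CSS parsers and count element nodes,
// then persist the result in the storage layer via DOM storage (localStorage).
function countElements(root: Node): number {
  let count = root.nodeType === Node.ELEMENT_NODE ? 1 : 0;
  root.childNodes.forEach((child) => {
    count += countElements(child);
  });
  return count;
}

const total = countElements(document.documentElement);
localStorage.setItem("lastElementCount", String(total)); // example key name
console.log(`This page has ${total} elements (cached in DOM storage).`);
```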

One way or the other, browsers have become a sort of “Operating System”, since they now provide (see the sketch after this list):

  • Network (XHR, WebSockets)
  • Disk IO (DOM storage)
  • Graphics (WebGL, <video>)
  • Sound (<audio>)
  • Concurrency (Web workers)
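
And here is a minimal sketch of a page using some of these “OS-like” services directly; the endpoint URL and the worker script name are placeholders for illustration only:

```typescript
// "OS-like" services from inside a web page: network I/O and concurrency.

// Network: fetch a JSON resource (the modern replacement for XHR).
async function loadConfig(): Promise<unknown> {
  const response = await fetch("/api/config.json"); // hypothetical endpoint
  return response.json();
}

// Concurrency: off-load work to a Web Worker so the UI thread stays responsive.
const worker = new Worker("heavy-computation.js"); // hypothetical worker script
worker.onmessage = (event: MessageEvent) => {
  console.log("Result from worker:", event.data);
};
worker.postMessage({ task: "start" });
```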

Unfortunately, browser architectures are broken because they are riddled with poor abstractions….. and the consequence is that modern web browsers make it difficult to create fast, secure, and robust programs….

Browsers like Firefox and some versions of IE (e.g. IE8) have a “monolithic architecture”. They share two important characteristics: first, a browser “instance” consists of a process containing all of the components mentioned above. In some monolithic browsers, separate tabs receive separate processes; however, within a tab, browser components are not isolated. The second characteristic of a monolithic browser is that, from the web page’s perspective, all of the browser components are either black box or grey box. In particular, the HTML/CSS parser, layout engine, and renderer are all black boxes: the application has no way to monitor or directly influence the operation of these components. Instead, the application provides HTML and CSS as inputs, and receives a DOM tree and a screen repaint as outputs. The JavaScript runtime is grey box, since the JavaScript language provides powerful facilities for reflection and dynamic object modification….. but the so-called “native objects” within the browser are not so “grey” and may in many cases lead to not very nice “surprises”…
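
To see what “grey box” means in practice, here is a small sketch that uses JavaScript’s reflection facilities first on an ordinary object and then on a browser-provided host object; what the second call reveals, and whether the property can be modified, depends on the browser’s internal implementation:

```typescript
// Reflection on a plain JavaScript object: fully transparent and modifiable.
const plain = { url: "https://example.org", visits: 3 };
console.log(Object.getOwnPropertyNames(plain)); // ["url", "visits"]

// Reflection on a host object exposed by the browser (here: window.location).
// What this prints, and whether its properties can be redefined, varies with
// the browser's internals, which is exactly the "grey box" problem.
console.log(Object.getOwnPropertyNames(window.location));
try {
  Object.defineProperty(window.location, "href", { value: "intercepted" });
} catch (e) {
  console.log("Host object refused modification:", e);
}
```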

Is there any solution to the problem?

One of the solutions, proposed by researchers at the University of Illinois, is the so-called “OP web browser”. To enable more secure web browsing, they designed and implemented a new browser, called the OP web browser, that attempts to improve security using state-of-the-art software design approaches. They do it by combining operating system design principles with formal methods, drawing on the expertise of both communities to design a more secure web browser.

The design philosophy is to partition the browser into smaller subsystems and make all communication between subsystems simple and explicit. At the core of the design is a small browser kernel (micro-kernel) that manages the browser subsystems and interposes on all communications between them to enforce the browser security features.
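
The following toy sketch (my own illustration, not OP’s actual code or API) shows that principle: subsystems never talk to each other directly; every message passes through a small kernel that applies a security policy before delivery.

```typescript
// Toy model of a microkernel browser: all inter-subsystem messages are
// routed through the kernel, which enforces a policy before delivery.
type Message = { from: string; to: string; payload: unknown };
type Handler = (msg: Message) => void;
type Policy = (msg: Message) => boolean;

class BrowserKernel {
  private subsystems = new Map<string, Handler>();
  constructor(private policy: Policy) {}

  register(name: string, handler: Handler): void {
    this.subsystems.set(name, handler);
  }

  // Every message is interposed on: checked against the policy, then delivered.
  send(msg: Message): void {
    if (!this.policy(msg)) {
      console.warn(`Kernel blocked ${msg.from} -> ${msg.to}`);
      return;
    }
    this.subsystems.get(msg.to)?.(msg);
  }
}

// Example policy: the JS engine may talk to the network stack, the renderer may not.
const kernel = new BrowserKernel((m) => !(m.from === "renderer" && m.to === "network"));
kernel.register("network", (m) => console.log("network received", m.payload));
kernel.send({ from: "js-engine", to: "network", payload: { url: "https://example.org" } });
kernel.send({ from: "renderer", to: "network", payload: {} }); // blocked by policy
```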

This certainly represents progress from monolithic architectures since it provides better security and fault isolation than monolithic browsers. However, OP still uses standard, off-the-shelf browser modules to provide the DOM tree, the JavaScript runtime, and so on. Thus, OP still presents web developers with a number of “frustrations” when developing “complex web applications”…..

In fact, each browser provides its own implementation of the standard components. These implementation families are roughly compatible with each other, but each one has numerous quirks and bugs. Since a browser’s components are weakly “introspectable” (difficult to know their internal state) at best, developers are forced to use conditional code paths and ad-hoc best practices to get complex web applications running across different browsers……

There are problems with event handling, parsing bugs, rendering bugs, and JavaScript/DOM incompatibilities, to mention only some….
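
A classic example of those conditional code paths is event registration: legacy IE exposed attachEvent while standards-based browsers exposed addEventListener, so libraries had to feature-detect and branch, roughly like this:

```typescript
// Cross-browser event registration circa IE8: feature-detect, then branch.
function addClickHandler(element: any, handler: (ev: Event) => void): void {
  if (element.addEventListener) {
    element.addEventListener("click", handler, false); // standards path
  } else if (element.attachEvent) {
    element.attachEvent("onclick", handler); // legacy IE path
  } else {
    element.onclick = handler; // last-resort fallback
  }
}

addClickHandler(document.body, () => console.log("clicked"));
```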

So the Holy Grail of a “browser based on standards” that would allow “Write Once, Run Everywhere” became “Write Once, Test Everywhere” and is now “Write Variants, Test Everywhere”…… What can one say…?

Summing up, it is easy to write a simple web page that looks the same and has the same functionality in all browsers. Unfortunately, web pages of even moderate sophistication quickly encounter inconsistencies and bugs in browser runtimes…

James and his team have been working on a prototype of a new generation of browsers called “exo-kernel browsers”. Their prototype, called Atlantis, tries to solve the above-mentioned problems by providing pages with an extensible execution environment. It defines a narrow API for basic services like collecting user input, exchanging network data, and rendering images. By composing these primitives, web pages can define their own custom, high-level execution environments.

Therefore, an application which does not want a dependence on Atlantis’ predefined web stack can selectively redefine components of that stack, or define markup formats and scripting languages that look nothing like the current browser runtime. Unlike prior microkernel browsers like OP, and compile-to-JavaScript frameworks like GWT, Atlantis is the first browsing system to truly minimize a web page’s dependence on “black box” browser code. This should make it much easier to develop robust, secure web applications.

The master kernel contains the switchboard process, the device server and the storage manager… a very simple architecture with a relatively simple API.

Every time a “web domain” (protocol, host name, port) is instantiated, it receives a separate isolation container with an instance kernel and the script interpreter (called Syphon). A web application specifies its runtime by adding an “environment” tag at the top of its markup, which allows the page to be interpreted not only as HTML but as any kind of markup language. If no environment is specified, the instance kernel assumes that the page runs on top of the “standard stack”.

The instance kernel contains two modules: the Network Manager, which interprets protocols (http, file, etc.), and the User Interface Manager, which creates a new form, registers handlers for low-level GUI events on that form, forwards those events to the application-defined runtime, and updates the bitmap in response to messages from the layout engine.
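
To make the idea of such a “narrow API” more concrete, here is a purely illustrative TypeScript interface for the kind of low-level services an instance kernel might expose; the names and signatures below are my own invention for illustration, not Atlantis’ actual API:

```typescript
// Hypothetical sketch of a narrow, exokernel-style browser API.
// A page's custom web stack would be built entirely on top of calls like these.
interface InstanceKernel {
  // Network manager: raw data transfer, no HTML/HTTP semantics baked in.
  fetchBytes(url: string): Promise<Uint8Array>;

  // UI manager: low-level input events in, bitmaps out, nothing else.
  onInputEvent(handler: (ev: { kind: "key" | "mouse"; data: unknown }) => void): void;
  presentBitmap(pixels: Uint8ClampedArray, width: number, height: number): void;

  // Storage manager: simple key/value persistence scoped to the web domain.
  store(key: string, value: Uint8Array): Promise<void>;
  load(key: string): Promise<Uint8Array | null>;
}

// A page-defined web stack would implement its own parser, layout engine and
// script runtime purely in terms of an interface of this kind.
```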

Syphon, the script interpreter, is one of the major components of the Atlantis architecture.

Applications pass abstract syntax trees (ASTs) to Atlantis for execution (instead of low-level bytecode “à la applets”) for two reasons: first, it is easier to optimize ASTs than bytecode, and second, it is easier to reconstruct source code from ASTs than from bytecode. The latter is particularly useful for debugging.
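
To illustrate what shipping an AST instead of bytecode means, here is a tiny generic example: the expression price * (1 + taxRate) represented as a tree and pretty-printed back to source. The encoding below is purely illustrative and is not Syphon’s actual tree format:

```typescript
// A generic AST for the expression: price * (1 + taxRate)
// (illustrative encoding only; not the actual Syphon tree format)
type AstNode =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "binop"; op: "+" | "*"; left: AstNode; right: AstNode };

const tree: AstNode = {
  kind: "binop",
  op: "*",
  left: { kind: "var", name: "price" },
  right: {
    kind: "binop",
    op: "+",
    left: { kind: "num", value: 1 },
    right: { kind: "var", name: "taxRate" },
  },
};

// Because the structure is explicit, an engine can optimize it (e.g. constant
// folding) or pretty-print it back to readable source, which helps debugging.
function toSource(n: AstNode): string {
  switch (n.kind) {
    case "num": return String(n.value);
    case "var": return n.name;
    case "binop": return `(${toSource(n.left)} ${n.op} ${toSource(n.right)})`;
  }
}
console.log(toSource(tree)); // "(price * (1 + taxRate))"
```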

Atlantis ASTs encode a new language, called Syphon, which is a superset of the recent ECMAScript JavaScript specification, but it is described with a generic tree syntax that may be adapted to serve as a compilation target for other high-level languages that may or may not resemble JavaScript.

Syphon offers a number of features that facilitate the construction of robust, application-defined runtimes, such as Object Shimming, Method Binding and Privileged Execution, Strong Typing, Threading, etc.

The core of the current Atlantis run-time contains, according to James, some 8,600 lines of C# code (the Syphon interpreter, the instance kernel, the master kernel and the IPC (inter-process communication) libraries), relying on the .NET runtime for garbage collection, data types and so on. It also includes some 5,500 lines of JavaScript for the demonstration web stack and a compiler from JavaScript to Syphon ASTs.

The core of Atlantis provides a very good Trusted Computing Base, enforcing, among other things, the Same-Origin Policy, and at the same time it allows for extensibility, letting web pages customize their own runtime in a robust manner.

In the lab, the Atlantis prototype has demonstrated very decent performance, and this despite the fact that it has not been optimized, which looks very encouraging.

To sum up all the above, current web browsers must support an API that is unnecessarily complex. This API is an uneasy conglomerate of disparate standards that define network protocols, mark-up formats, hardware interfaces, and more.

Using exo-kernel principles, as in Atlantis, allows each web page to ship with its own implementation of the web stack. Each page can tailor its execution environment to its specific needs; in doing so, the page liberates browser vendors from the futile task of creating a one-size-fits-all web stack.

The approach proposed by James and his team looks very good and should facilitate the development of robust and secure complex web applications….. so far so good… my question to James was: why is there not much more progress in this area?

There are a few reasons for it:

  • The browser technology is well known and developers have got used to it
  • Browsers today are compared mainly on the speed of their JavaScript (and Java) virtual machines
  • There is not yet a perception that we are reaching the limits of the current technology…

According to James, one of these days we are going to have big, very big problems and then things will have to change….

And this is one of the reasons why I started by speaking about the ZDNet article…..

A personal reflection…… during a Windows 10-focused keynote in January 2015, Microsoft unveiled that IE will be deprecated and that a new “standard” browser, code-named Spartan, will be included in Windows 10. We already know that it will not support legacy technologies such as ActiveX and Browser Helper Objects, will use an “extension system” instead, and will increase its compliance with standards… IE11 will stay in parallel for some time to support legacy systems….

The question is: “Will Spartan ever become an exo-kernel browser?”…… or will Atlantis remain just a research project…. and stay there?

Time will tell…… as usual !!!

Stay tuned for more….

Best

Paco

Organizing a Hackathon @ UC Berkeley: Epilogue….

Tuesday, March 24th, 2015

I ended the last part of the “Hackathon @ UC Berkeley” saga with the following sentence:

“Sending the winning team members to Barcelona is turning out to be quite an adventure…. all kinds of issues: missing passports, different origins, destinations and dates, changes of dates after tickets have been issued, etc… I will sleep well when they arrive in Barcelona and even better when they are back in Berkeley…”

Well, I was right… it was difficult to sleep throughout the week before Barcelona ….

One of the members of the team had earlier commitments and asked to leave from Los Angeles and come back to Washington DC…. and he would only arrive on March 2nd at 7:00… the event in Barcelona started that day at 9:00……

The second member of the team already had his ticket, but he had a mid-term exam and the professor refused to postpone it… new tickets were needed…. he found a flight to leave on time and come back on time… Uffffff…

The third member of the team did not have a passport… no passport number, no way to book his flight….. I thought that getting an urgent passport was a simple and quick thing to do in the States, if properly justified… well, I was wrong, it is not as simple and it is expensive compared with what we have, for instance, in Spain…..

The Barcelona event would start on Monday, and here we were with one of the members of the team having to pick up his passport on Friday afternoon… to fly during the evening… imagine the problem, with 95,000 people attending the Mobile World Congress from everywhere in the world… we managed to find a flight… departing from Oakland with two stopovers, exhausting….

The guy finally got the passport on Friday afternoon and headed to the airport to take the flight to Barcelona… so far, so good…… except that he went to the wrong airport!!! When he realized it, he rushed to the other airport and missed boarding by just a few minutes….

Here we are on Friday at 22:00 trying to find a solution…… and he was the main coding engineer in the team.

Finally, he found a flight… but it would only leave on Monday, arriving Tuesday evening… OK…. except that on Wednesday at 12:00 the teams had to check in their solutions and present them… I said to myself…. we need a miracle or it is not going to work…….

I asked the team in Barcelona to keep me informed about the arrival of the team members and of Luke and Alic. They arrived safely and on schedule, and the team of two worked in close contact with the one still in Berkeley…. He arrived on Tuesday evening as planned and started working with the others… I suspect the team did not sleep much that night…..

When I got up on Wednesday, I asked Barcelona about the final result, since it was already Wednesday evening there…. and the winner was….. the application inTime by… the Berkeley team!!!

It was just incredible… one of the members of the Barcelona team was also asking, via the WhatsApp group, about the winners… when she was told that the winner was the Berkeley team, she could not believe it… she said “the Berkeley team?… No, it is not possible, yesterday evening their application was not working…” One of the members of the panel said… “Yes, it worked when they presented it….”….

Apparently the presentation was awesome as well…; so….. well done David, Andrew and Jessie!!! …. miracles happen…!!!… but it also shows the quality of the Computer Science students at UC Berkeley…

More details and pictures (in Spanish) about the event in Barcelona can be found here.

I could finally sleep well !!!! ;-))

I asked Luke and Alic to give me their impressions, in writing, about their experience in Barcelona and Granada (Alic); I copy them below:

” It was a thrill to see technologies and professionals from around the world converge in Barcelona for an entire week. The diverse nationalities, languages spoken and projects showcased were all interesting. Since I’m only 20, this was the first time that I witnessed a world-class industry conference and it brought to perspective how dynamic the global tech scene really is.

In a place like Berkeley, I think it’s easy to take certain things for granted. For example, we’re very close to the Silicon Valley and we have access to a great ecosystem for tech and startups. However, I think there’s a lot to see and learn from other parts of the world. Going to Barcelona and experiencing the Mobile World Congress, as well as 4YFN, was eye-opening for me and made me more curious about opportunities and possibilities outside of the Bay Area.

In terms of culture, food, and entertainment, Barcelona didn’t disappoint! I never had tapas before going to Barcelona and now I crave it here in Berkeley. I also hit some tourist spots like Park Guell, La Sagrada Familia, FC Barcelona stadium, Museu Nacional d’Art de Catalunya, and more.

I definitely want to go back and explore more of the city. Hopefully sooner rather than later! ” (dixit Luke)

” It was a wonderful experience getting to see hackers from Barcelona, Berkeley, Cordoba and Granada all come together in one place to compete and develop creative apps for smart watches. The teams not only had the opportunity to learn from each other, but also got to know each other over the course of the hackathon.

The 4YFN and Mobile World Congress were definitely amazing (and sometimes overwhelming) exhibitions of some of the newest technological advancements coming to market. It was definitely a very diverse global event where companies that ranged from local startups to multi-national corporations were able to showcase their newest tech. There was everything from the latest health monitoring devices to completely waterproof electronics, and just a plethora of smart watches.

I was very impressed by the development of and investment in infrastructure for health technologies in Granada. The completion of the Technology Park of Granada combined with the entrepreneurial spirit and technical talent in the region may position Granada to be a leader in the space of health tech and biotech.” (dixit Alic)

I want to warmly thank Alic, Luke and the volunteers for their help with this; I really appreciated working with them towards the successful completion of the Hackathon @ UC Berkeley.

And … congratulations to the winning team: David, Andrew and Jessie !!

Best

Paco

Code for America and OpenOakland.org…. Part 1

Wednesday, March 18th, 2015

It was at the end of September 2014 that Heddy, my office colleague, pointed me to a Code for America (CfA) event in San Francisco. When I searched on the web, I discovered that it was their 2014 Annual Conference… unfortunately I had just missed it, since it had taken place just a few days before…

I continued searching their web site and started learning interesting things about them. Code for America was created back in 2009, and the main player and founder behind CfA is Jennifer Pahlka. In her 2012 TED talk, “Coding a better government”, she said that she created Code for America to get the rock stars of design and coding in America “to work in an environment that represents everything that we are supposed to hate….., to work in Government”…

As stated on their website, “Code for America believes government can work for the people, by the people, in the 21st century”. Code for America calls on “engineers, designers, product managers, data scientists, and more” to “put your skills to work in service to your country. Let’s bring government into the 21st century together”.

Code for America runs five programs:

  • Brigades: local groups of civic hackers and other community volunteers who meet regularly to support the technology, design, and open data efforts of their local governments
  • Fellowships: small teams of developers and designers work with a city, county or state government for a year, building open source apps and helping spread awareness of how contemporary technology works among the government workforce and leadership
  • The Accelerator: provides seed funding, office space, and mentorship to civic startups
  • Peer Network: for innovators in local government
  • Code for All: organizes similar efforts outside the US, particularly Brigades and fellowship programs in countries around the world

I contacted Code for America through their info mailbox and after a few days I got some feedback. It took some time until I could visit their premises in San Francisco and meet with Catherine Bracy, Program Director in charge of International Relations. She explained to me how Code for America is organized, its working methods, how it gets funded, and the projects they are particularly proud of and consider best practice. She invited me to participate in some meetings of the Brigades and pointed me to the ones being run in Oakland (OpenOakland.org) and San Francisco. She also put me in contact with Code for Europe.

OpenOakland defines itself as a non-profit civic innovation organization that brings together coders, designers, data geeks, journalists, and city staff to collaborate on solutions to improve the lives of Oaklanders. It is part of Code for America’s Brigade program and holds frequent events for community, local government and tech folks to work together.

Open Oakland focuses on both community technology and open government projects that are supported through community partnerships and engaged volunteers.

Searching the web for references and background information for my research, “Co-production in Public Services”, I found a “Meetup” call from the OpenOakland.org Brigade and I decided to send a request for participation. I quickly received a few replies welcoming my presence and I participated in one of the Tuesday “Civic Hack Nights” that take place in one of the meeting rooms of the City Hall in downtown Oakland.

I arrived there at 18:15 and there were very few people in the room. I was greeted by Neil, who told me to take a seat, relax and wait for the rest of the people to arrive.

People were arriving regularly and towards 18:30 the room was nearly full (roughly 60 people). The environment was relaxed; some people were already having dinner from boxes they had brought with them.

Spike, one of the Captains of the Brigade, opened the meeting with a few introductory words about the Executive Committee and then gave the floor to the representatives of a number of projects to report on progress.

One former Code for America fellow, apparently now working for a civic-engagement-oriented company, had ordered some pizzas, and before the different project groups scattered around the room, it was “pizza and candy time”.

Gradually people sat together to discuss their projects. There were a lot of conversations going on at the same time, yet despite the parallel conversations the people in each group were very focused on their subject; a lot of activity was going on and interesting discussions were taking place.

I decided to start by sitting with Neil and Ronald, who told me about OpenOakland.org: the origins, the role, the projects, the decision making, the results, the relationships with other civic organizations and the challenges. They asked me about my research project and why I was so interested in the Brigade. I told them that they had a very good reputation within Code for America. They said that, in some aspects, they were more advanced than other Brigades and therefore they now need less assistance from Code for America.

OpenOakland.org had just created an Executive Committee with 11 members that would soon have an “Away Day” to get to know each other better and start moving ahead.

Neil, Irish-born but an Oakland resident for many years, told me that he has been involved in civic activities in his neighbourhood for some years and thought that his experience could be useful to the Brigade. He explained to me that he does not hack, but supports the projects and the Brigade activities. He said that it would be good to have more contacts with other civic organizations in the city. Ronald, a specialist in leadership who had worked for an NGO for many years, tries to put some framework around the projects and activities of the Brigade and supports the Executive Committee in several matters. I told Ronald that I would like to meet him to speak about his ideas.

We spoke about Fellowships and Neil called Eddie, the second Captain of the Brigade, who had been a Code for America Fellow a couple of years before. We agreed that we would have lunch together to speak about it.

Very close to Neil and Ronald’s group there was another one discussing the possibility of building a system to collect applications for summer jobs for teenagers in Oakland. There was a government official with them, and the project group was showing some web sites that could serve as a template for the system. According to the posts on the Brigade’s Google Group, it was decided not to develop the project, for reasons I will explain in another post; it illustrates the maturity of the Brigade in terms of decisions regarding the projects.

I asked Neil and Ronald about the attitude of the IT staff in the City Hall regarding the activities of the Brigade and the applications resulting from the projects. They said that the IT staff is so busy with normal work, and their resources are so scarce, that they have enough work keeping the lights on and carrying out the existing activities. They do not have any special problem with the work of the Brigade, and when requested they are, most of the time, able to deliver data they may have that would be useful for the applications.

It is my understanding, from my informal conversations with developers involved in “hacking for government”, that they think the IT staff in City Halls or other departments in government would probably not be able to develop the kind of applications that the hacking projects are producing, or if they did, they would take a lot of time and resources due to the way projects are managed.

There was another group focusing on “transparency” that was being chaired by another City Hall official working for the “Ethics Commission”. The Brigade released, in September 2014, a web application called “Open Disclosure”, which provides campaign finance data showing the flow of money into Oakland mayoral campaigns. They were now working on the extension to other campaigns.

Finally, there was another group discussing the “marketing” of the activities of the Brigade; they were writing the ideas on big sheets of paper stuck to the wall. Another group was discussing a project related to housing in the city. The discussion was very technical and I understood it was about how to best display the information.

Phil Wolf, who is also a member of the Executive Committee and very active in the Google Group, asked me before I left to write a post about my experience….

Here it is …. And others will follow !!

I am very happy with the experience and grateful to OpenOakland.org for their welcome and help.

Stay Tuned

Best

Paco

Cloud Infrastructure Planning @ Google…..

Saturday, March 14th, 2015

Today, three guys from the “Operation Decision Support” group at Google came to campus to recruit interns and full-time employees…. surprise, surprise.

It looks like they were very happy with one of last summer’s interns, who came from UC Berkeley, so they came again to introduce their department and describe their challenges and working methods.

They started by presenting the three business areas of Google:

  • The 100-billion-dollar businesses: Ads, Search, Access (G-Fiber), Public Cloud
  • The 10-billion-dollar businesses: YouTube, Nest, Play, Android, Chrome
  • The “bold bets”: Life Sciences, Self-driving cars, Energy, Robotics, SpaceX

Pas mal… as they say in French !!!

The Operation Decision Support (ODS) Group is a kind of “consulting group” in Google specialized in “Capacity Planning” and impact analysis on costs and pricing of the solutions and the involved resources. This is particularly important because they have to invoice external customers and ….. charge back the internal ones …..  sounds familiar?  😉

Google has 12 data centers around the world, 6 of which are in the United States and 4 in Europe. Google spends 7 to 8 billion dollars on infrastructure per year… big, big money !!!

ODS has 50 people, among whom there are 20 PhDs specialized in statistics and operational research. They also have experts in modeling and supply chain. The group is becoming very important to the company; they plan to recruit between 15 and 50 new employees this year.

They presented a couple of interesting problems to illustrate the work of the group related to capacity planning, utilization of resources and costs.

The first one: “Increasing the utilization of the infrastructure (CPU, memory, disk space…) through oversubscription” (internal and external customers)

Google has Tier1 and Tier2 customers with different SLAs, who normally subscribe for specific capacity that is not fully used all the time. The problem to be solved is how to “oversubscribe”, i.e. sell capacity to more customers (internal or external).

There are three ways of approaching this problem:

  • Easier: Resell surplus in Tier1 and Tier2 (which on average use around 25% of the contracted capacity) with no SLA for the overcapacity sold.
  • Harder: Resell surplus in Tier1 as Tier2 with SLA
  • Hardest: Oversubscribe Tier1 with no change to its SLA.

In the first case, utilization changes with the time zone, there are peaks and valleys and there is no SLA, no guarantee, no problem.

Perhaps some “guarantee” could be provided by statistical extrapolation methods. For instance, for batch processing, it could be guaranteed that the batch is executed within the next 24 hours.

In the second case, it is necessary to collect detailed utilization data to estimate growth and safety margins (safety stock) to guarantee the SLA.

In the third case, a more sophisticated analysis of the time series of data for every task run in the Tier1 environment is needed. Workloads per task, in general, do not peak simultaneously, which allows for a predictable “surplus” to be sold if some safety stock is kept.
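
As a rough illustration of the kind of analysis involved (a simplified sketch of my own, not Google’s actual model), the sellable surplus for a cluster could be estimated from a utilization time series as contracted capacity minus a high-percentile peak minus a safety stock:

```typescript
// Simplified sketch of oversubscription headroom (not Google's actual model).
// Sellable surplus = contracted capacity - high-percentile peak usage - safety stock.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const index = Math.min(sorted.length - 1, Math.floor(p * sorted.length));
  return sorted[index];
}

function sellableSurplus(
  contractedCapacity: number,
  hourlyUtilization: number[], // observed usage, same units as capacity
  safetyStockFraction: number  // e.g. 0.10 keeps 10% of capacity in reserve
): number {
  const peak = percentile(hourlyUtilization, 0.99);
  const safetyStock = safetyStockFraction * contractedCapacity;
  return Math.max(0, contractedCapacity - peak - safetyStock);
}

// Illustrative numbers only: 1000 capacity units, usage hovering around 25%.
const usage = Array.from({ length: 24 * 30 }, () => 200 + Math.random() * 150);
console.log(sellableSurplus(1000, usage, 0.1)); // roughly 550 units of headroom
```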

It looks easy, but Thomas Olavson, the director of ODS, says that it is not that simple. So, how do you make this approach acceptable, taking into account that the final decision is in the hands of the implementing department (engineering, production, etc.) or the executive team? Here is the method:

  1. Partner with the engineers: fully understand the issue, work together, pilot before roll-out
  2. Build credibility and trust over the time
  3. Overcome “taboos”
    1. Clear SLA
    2. Explore Tier2 with statistically based SLAs
    3. Demonstrate economic impact
    4. Pilot, pilot and pilot.

The second case was related to the deployment of G-Fiber. Google Fiber is Google’s fiber-to-the-premises service in the US, providing broadband Internet and television to a small and slowly increasing number of locations. The service was first introduced in Kansas City (Kansas and Missouri), followed by expansion to 20 other Kansas City area suburbs within 3 years. Initially proposed as an experimental project, Google Fiber was announced as a viable business model on December 12, 2012.

Google is, at the end of the day, a content service provider and wants to deliver high quality content at optimal speed to increase user satisfaction. One solution would be for the connectivity service providers to plug their “pipes” directly into the Google data centers, which is unrealistic since those are normally in remote places. Therefore the solution is to bring Google infrastructure close to the users, and this is exactly what Google is doing with the G-Fiber service.

Answering the question of where and when to build what infrastructure is a tough optimization problem…. For a not very complicated deployment, the model would have some 30,000 variables and more than 30,000 constraints….

Brian Eck, now a senior consultant at ODS, is a former IBM employee who has been working with Google for the last two years (he jokes that the two years have been like “dog years”, since he feels as if he has been working at Google for 14 years!!). He is a specialist in logistics, was confronted with the same problem in manufacturing at IBM, and concluded that the optimization approach was not the way to go….

Instead, he and his colleagues have developed a “Scenario Analysis Tool” for a reduced number of locations, translating the alternative deployment roadmaps into a five-year cost/cash model. The inputs for the model are the demand and the topology provided by the engineering team, the equipment footprint (calculated) and the unit costs of all the cost components, also provided by the engineers. The result is a cost model with the total cash flow over 5 years.

They call the model a “Big Special Purpose Calculator”, which is also very useful to study “what if” scenarios and which can be generalized to other kinds of problems at Google (some “super users” are doing it already).
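
Here is a toy version of such a “special purpose calculator” (illustrative cost categories and numbers only, not the real model), comparing two deployment scenarios as five-year cash flows:

```typescript
// Toy "special purpose calculator": compare deployment scenarios as
// five-year cash flows. Categories and numbers are illustrative only.
interface Scenario {
  name: string;
  yearlyCapex: number[]; // equipment installed each year
  yearlyOpex: number[];  // workforce, travel, power, colo contracts...
}

function totalCash(s: Scenario): number {
  const sum = (xs: number[]) => xs.reduce((a, b) => a + b, 0);
  return sum(s.yearlyCapex) + sum(s.yearlyOpex);
}

const overbuildNow: Scenario = {
  name: "Install overcapacity up-front, revisit in year 3",
  yearlyCapex: [80, 0, 20, 0, 0],
  yearlyOpex: [10, 10, 12, 12, 12],
};
const localTeam: Scenario = {
  name: "Local team, upgrade incrementally",
  yearlyCapex: [40, 15, 15, 15, 15],
  yearlyOpex: [14, 14, 14, 14, 14],
};

for (const s of [overbuildNow, localTeam]) {
  console.log(`${s.name}: ${totalCash(s)} (cost units over 5 years)`);
}
// "What if" questions are answered by editing the scenario inputs and re-running.
```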

One of the decisions that have to be taken in a deployment of this type is whether it is better, given the cost of the workforce including travel, to install overcapacity in locations now and come back in one or two years to update, or to set up a local team that visits periodically and upgrades as necessary.

Applying the model to a specific deployment case, the latter option allowed savings of $10M…..

The model was first implemented using a spreadsheet; it contained 60 worksheets with some 300 lines each and very complex formulas, but it allowed the fine-tuning of the model. Once that was done, it was implemented using the R statistical package.

The critical success factors are not very different from the ones mentioned in the previous case, but here there are additional ones:

  • Strike the right level of detail: “what to include, what to omit”
  • Standardize data: power, colo contracts, workforce, etc

Once more a very interesting talk… it is amazing what is going on in the Bay Area..

Students were queuing to hand in their CVs or get the contact point….  I wish I could….  😉

Stay tuned for more…..

Best

Paco

Fully Automated Driving…. When? How? What is missing? ……

Monday, March 2nd, 2015

Last Friday the guys from Bosch came to campus, invited by the UC Berkeley EECS school.

Bosch, like BMW or Mercedes, has a research centre in Palo Alto where engineers are creating the “vision and the roadmap” for automated driving. The centre participated in the Urban Challenge organized by DARPA and since 2010 has been prototyping systems for the project.

When speaking about “Automated Driving” one has to distinguish between “Supervised by the driver” (some technologies are available today, such as Park Assist and Integrated Cruise Assist, and others, like Highway Assist, are progressing fast), “Highly Automated” (Highway Pilot) with reduced driver supervision, and “Fully Automated” (Auto Pilot).

When will we see fully automated cars on the market? According to Bosch, it is likely that in 2020 we will see the first commercial prototypes of “Highly Automated” cars. No date for “Fully Automated” cars can be forecast today…

But what is missing ?

  • Surround sensing… in all circumstances !!!
  • Safety and security
  • Legislation
  • Very precise and dynamic map data
  • Highly Fault Tolerant System Architecture (what happens if the battery dies???)

Let’s examine briefly some of those aspects…

Surround sensing:

Today, 360° surround sensing is possible with the use of radars, sensors and cameras, but there are issues in special circumstances… what happens in tunnels, in low sun, or with some materials like timber if it is transported by trucks?

What is missing, among other things, is the so-called “Third Sensor Principle”, beyond radar and cameras: sensors that work in real time, asynchronously, using probabilistic algorithms, are computationally very efficient, and come with supervising systems able to decide in cases of conflicting information. In fact, a new generation of sensors…
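
As a rough illustration of the kind of supervising logic involved (a minimal sketch of my own, not Bosch’s algorithm), asynchronous readings could be fused by confidence, with a crude flag for conflicting information:

```typescript
// Minimal sketch: fuse distance estimates from asynchronous sensors, weighting
// each by its confidence (inverse variance), and flag disagreement.
interface Reading {
  sensor: string;
  distanceMeters: number;
  variance: number; // lower variance = more trusted sensor
}

function fuse(readings: Reading[]): { estimate: number; conflict: boolean } {
  const weights = readings.map((r) => 1 / r.variance);
  const totalWeight = weights.reduce((a, b) => a + b, 0);
  const estimate =
    readings.reduce((acc, r, i) => acc + r.distanceMeters * weights[i], 0) / totalWeight;
  // Crude conflict check: any reading far from the fused estimate.
  const conflict = readings.some((r) => Math.abs(r.distanceMeters - estimate) > 2.0);
  return { estimate, conflict };
}

console.log(
  fuse([
    { sensor: "radar", distanceMeters: 31.8, variance: 0.2 },
    { sensor: "camera", distanceMeters: 34.5, variance: 1.0 }, // low sun, less reliable
  ])
);
```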

Dynamic Map Data

Today, the map data we have in our GPS devices is mostly static. What is needed is absolute localization data on maps with dynamic layers, much more precision, and SLAM (Simultaneous Localization and Mapping).

Safety

The driver has to be monitored to detect distraction, drowsiness, health state, etc. Identification and adaptive assistance are also necessary, as is the ability to return control to the driver when needed, which is a key element of this part.

Security

Protection against technical failures by means of redundancy in the steering and braking systems (some elements like assisted steering, ESP HEV and iBooster exist already today, particularly in electric vehicles).

One important aspect of security is quality control and testing for release. In traditional cars the quality control is done statistically but this method will not be feasible for the testing of fully automated driving vehicles. It is estimated that the number of test hours would be multiplied by a factor of one million. New release strategies are needed with a combination of statistical validation and new qualitative design and release strategies for individual components and the full integrated system…

Legislation

Currently, laws regarding traffic and car driving are enacted at the national level. However, there are two international conventions on road traffic, Geneva (1949, UN) and Vienna (1968)… the problem is that some countries have ratified one and not the other… or neither of the two !!!!

Needless to say, as in many fields of technology, legislation does not easily and quickly reflect technical progress….

To illustrate this, it looks like the Vienna (or Geneva, I do not remember..) Convention states that

“Every driver shall, at all times, be able to CONTROL his vehicle or GUIDE HIS ANIMALS”…. one can imagine how modern this rule is when it speaks about… ANIMALS…

The key here  is the meaning given to the word “CONTROL”… if taken literally there is no possibility of “Automated Driving”… but what if CONTROL would mean “SUPERVISION” ?

It looks like the state of California has accepted the “testing” of this kind of vehicle based on well-justified requests and with “certified” people… at least research can continue… well done… I am sure that Google has something to do with this… ;-)))

One of the questions that is often raised is “What will be the User Experience (UX)”?

What will the driver feel? Emotions? The transition of control back to the driver? ….

It has become clear that the automotive industry has become a hardware/software industry; the mechanics are still important, but the car is full of IT systems that have to work in an integrated way and with very fast response times.

This is even more important in the case of fully automated driving, where the software has to be fault tolerant, secure and very efficient… imagine the power of the embedded processors needed to take reactive action in milliseconds to determine the trajectory… and at the same time, a little bit more slowly, say in seconds, to take decisions about the manoeuvre…

Bosch displayed a video that illustrated their vision for Highway Pilot by 2020. I will point to it as soon as they upload it to their web site, because it is very interesting…

The meeting, as usual, ended with a request to EECS students to send applications for jobs/internships at Bosch Research Centre in Palo Alto..

At the end of the meeting, I asked some questions:

Question: “There are initiatives by Google in this field, and recently the press has published that Apple might have 1000 engineers working on the subject… are you working with them?”

Answer: “We are not authorised to speak about the collaboration with partners”…..

Question: “Bosch is not a car manufacturer; what is your business model for this technology?”

Answer: “Bosch is a manufacturer of components or complete systems for traditional or electric cars. We are going to continue with the same approach”

Question: “If there are several suppliers of components or subsystems on the market, and the car manufacturer decides to have multiple suppliers, they will have to work together in a mission critical framework. Standards will be needed; what is the current situation?”

Answer: “Today there are standards at a low level that allow communication among components and subsystems. Higher-level standards will probably be needed in the future, but which ones and when is difficult to say today. Having said that, we believe that the automotive industry will find the necessary agreements to ensure interoperability at least at some level.”

One of the students asked a very interesting question: “Will automated driving avoid today’s typical accidents? Will there be more? Or fewer?”

Answer: “100% safety will never exist. The first accidents, when fully automated driving becomes available, will produce big headlines in the press. This is not new: the introduction of safety belts, airbags, etc. in traditional cars was criticized at the beginning; with time and technological progress the criticism has mostly disappeared; nobody today would accept cars without those safety elements, and legislation enforces them. As far as the number and type of accidents are concerned, Bosch believes that there will be fewer accidents and that those will probably be somewhat different from the accidents we see with human driving…”.

Very interesting conference on a new subject…… at least for me…

I spoke to the Director of Research at the end of the conference and we agreed that I will visit the Center when I go to Palo Alto, Mountain View or Menlo Park for other meetings in March or April.

Stay tuned for more…

Best

Paco

Organizing a Hackathon @ UC Berkeley: Day D …. (Part 3 of 3)

Friday, February 27th, 2015

The day before the start of the Hackathon, we were making sure that the infrastructure was in place in the open space on the CITRIS main floor: installing socket extensions, checking the items received, adjusting security access to the premises for the organizing team and volunteers, checking T-shirt sizes, available drinks (Coke, soda and Red Bull), energy bars and cookies to keep participants’ energy up, buckets for ice, the water dispenser, etc…. we also opened one of the InWatchZ boxes and played with it a little bit.

Everything was ready for Day D…..

We had 93 registrations…. but only some 30 participants showed up and officially checked in on February 20th from 18:30. Some started working right away and stayed until 22:30, closing time for the first day.

All the volunteers were there; Luke organized the presence slots so that there were always a couple of volunteers present. InWatch USA staff arrived and officially opened the Hackathon. Pizza dinner was available, as well as snacks and drinks for the participants, from 19:00. The tradition was respected: “eating while hacking” …… 😉

Every team had to check in at the volunteers’ desk to get a Smartwatch for hacking. It was important to keep track of them, not only for security reasons but also because some teams might have decided to hack the OS or install their own code, and they would need to continue working on it the next day.

One of the teams did not manage to find the options in the API to change the language from Chinese to English for some parts of the software and hesitated about coming back the next day. They did not show up again…

The next day we started at 9:00; bagels and coffee were available with a good variety of flavoured spreads. Good energy to start…. teams were checking in slowly, and some decided to start working from home. Very soon some teams already had parts of their design on the whiteboards and some pseudo-code was taking shape… a couple of teams requested additional Smartwatches; I understood why when they presented their solutions the next day. One of the teams that had checked in the day before but could not stay came back in the morning but dropped out after lunch… they had started too late and realized that they would not be able to deliver anything consistent.

We were wondering why we did not have more participants showing up until we discovered that, at the same time, there was a very big hackathon organized by Stanford University, and later during the day we discovered that Code for America had also organized CodeAcross 2015 at the same time, a series of “civic hacking events” hosted by nodes of the Code for America network around the world. It coincided with International Open Data Day. The theme of CodeAcross 2015 was “Principles for 21st Century Government”.

I will come back with more about Code for America, the brigades, the fellows ,etc in a future post.

As I said above, those events must certainly have impacted the number of teams that participated; it is not normal that only a little more than 1/3 of registered participants checked in. According to Alic and Luke, statistically the attrition percentage is usually around 35%.

Lunch and dinner were available: typical burritos and tortilla chips with different spicy sauces were served for lunch, and Chinese food was served for dinner… more “eating while hacking”….

Teams were working hard, and as we walked around the tables the solutions were taking shape…

We closed the second day at 22:00.

Next day we started at 9:00 again. The events in Cordoba, Granada and Barcelona had finished and the winners were already known:

“Pillow 112”, an application for InWatch that activates an emergency protocol giving an automatic response in situations of violence, was the winner in Granada.

“Saveme”, an application for assistance in emergencies, was the winner in Cordoba, and “Help-App”, an application for tracking and tracing medical emergencies, was the winner in Barcelona.

In Berkeley, teams were asked to check in their solutions just after lunch. Only six teams finally checked in.

Every team had 10 minutes to present their solution, followed by questions from the panel. The evaluation criteria were as follows:

  1. Technical complexity
  2. User experience
  3. User interface design
  4. Degree of innovation
  5. Marketability
  6. Meeting the hackathon objectives
  7. Quality of the presentation

After all the presentations, the panel unanimously awarded the first prize to “inTime”.

inTime is an app for Smartwatches that notifies users when to leave for their next destination based on their calendar. The app uses GPS, pedometer, and barometric sensors to pinpoint where users are and how fast they walk from destination to destination. inTime incorporates multiple modes of transportation such as walking, driving, public transit, and Uber. User data is collected to identify quicker routes that Google Maps cannot provide as well as map out campuses, office buildings, and more.

The team will go to Barcelona to compete in the Global #CampusInwatch Hackathon Final with the winners from Granada, Córdoba and Barcelona.

Sending the winning team members to Barcelona is turning out to be quite an adventure…. all kinds of issues: missing passports, different origins, destinations and dates, changes of dates after tickets have been issued, etc… I will sleep well when they arrive in Barcelona and even better when they are back in Berkeley…

The next post will be after the final that will take place at the same time as the Mobile World Congress from March 2nd to March 4th.

Stay tuned and ….. Let the best team win !!!

Best

Paco

Organizing a Hackathon @ UC Berkeley: Managing the project …. (Part 2 of 3)

Sunday, February 22nd, 2015

I was back in Berkeley on January 17th after the Christmas break, and I had to start moving very quickly if I wanted to meet the very tight deadlines.

I called my first project meeting for January 20th with the objective of taking stock of progress and agreeing on the activities and planning that would take us to the successful opening, on February 20th, of #Campusinwatch 2015Berkeley, which was the name and the hashtag of the hackathon.

Very quickly, with the help of Miguel, whose organization has a Microsoft Lync cloud account, we could organize multi-party conference calls with all the participants on both sides of the Atlantic. Lync worked flawlessly and this was very helpful.

It became clear that it was urgent to open the registration site and start advertising. Alic did an excellent job by very quickly creating the site on challengepost.com; ours was at http://campusinwatchberkeley.challengepost.com/ and we saw the first registrations coming in very fast. Luke was advertising on Facebook and we agreed to send a mass mailing to the available mailing lists within one week. Event launched !!!

In the meantime, Luke had recruited some volunteers to help, and they were added to the mailing list of the organizing committee.

Time to start with the logistics…… no “decent” hackathon can be organized without T-shirts… aha… we needed to make a design and collect the logos….. but….. did we have written permission to use them? Well, not exactly… did we have our first “showstopper”? I had to call a crisis meeting to see how to solve the issue… We could not afford to have legal issues…..

But we could work in parallel in the meantime and proceed with the draft design of the T-shirts and check the logistics for delivery. One volunteer was assigned the task…. unfortunately she was overloaded and could not deliver… we lost a few days, time was running out, and we would need a week for delivery….. would we get there on time?

If participants had to develop applications for the InWatchZ, we would need devices, technical documentation, and perhaps specific libraries… the InWatchZ runs Android 4.4 (KitKat), so the open source libraries available on the Android web site should be enough…. hopefully… we needed information from China…

We expected the devices to be with us towards the first week of February…. but where were they? I had requested the shipment of 25 to The Foundry but there was no news…. until we understood that they were held up in US Customs, since they were shipped from China, where the engineering and manufacturing take place… we were lucky that US Customs called InWatch USA to clear the shipment… I had already warned everybody that this might happen… you know my favourite sentence, “I told you so…” ;-).

Customs cleared, the devices for Berkeley arrived four days before the starting date of the hackathon… we were good !!!

The situation for Spain was different…….. the shipment was still blocked in Hong Kong just three days before the starting date of the event. We were asked to ship ten devices to Madrid….. with no guarantee they would arrive on time…. we decided not to do it.

The logo issue was, in the meantime, solved, and another volunteer took over the design using the web tools of a T-shirt manufacturing company. We were ready to order, hoping that the T-shirts would arrive on time since the estimated delivery time was one week….. at the same time we sent the customized design to Spain for them to order there.

Then a new issue came up: on the advice of one of the Directors of the CET, Luke came with a request to sign a so-called “Project Agreement Document” with all the responsibilities of the participating parties, including a description of the prizes, budget estimation, etc., to formalize the process.

In order to avoid getting stuck, I immediately took over the action, prepared a draft of such a document and circulated it for agreement. After a couple of revisions the document was ready to be signed by all the parties: InWatch USA, Global In Devices, CETSA, The Foundry@CITRIS, Telecenter.org…. but how to get everybody to sign and complete the action if they were in three different locations?

I remembered that when I went to the ICA Conference in Ottawa back in October 2014, we had a meeting with Intel that was under a Non-Disclosure Agreement (NDA). In order for all the participants to formally accept it, they used DocuSign to electronically sign the document.

DocuSign is a company which offers a Software as a Service (SaaS) called Digital Transaction Management (DTM). It has emerged as a category of software designed to safely and securely manage document-based transactions digitally. DTM removes friction inherent in processes that involve people, documents, and data inside and beyond the “firewall” (outside the network of the organizations) to create faster, easier, more convenient and secure transactions. DTM delivers a suite of services that empower companies to easily deploy and update digital processes without the traditional expense and programming required of older enterprise applications.

DocuSign is a worldwide service which claims that its electronic signatures are legally binding around the world.

I thought that DocuSign could help me to solve the signature issue and decided to test the service. I saw on their site that there was the opportunity for a free trial and I signed up for it.

The system did not look very intuitive at the beginning, and I thought it would be better to read the help before starting. I followed the instructions and uploaded the Project Agreement… unfortunately it would not upload and the “processing ring” kept turning. At the same time, the document appeared in the list as a draft….. very strange. I gave up and told myself that the system was useless…. and that I had to figure out another approach to the signing of the document…

When I came back to the office the next day, I decided to give it another try, and this time the document uploaded immediately. I typed in the names and email addresses of the people who had to sign, set the flag to track progress as the signature workflow progressed, and one hour later I had the document electronically signed by everybody… cool !!!

In the meantime, I thought that Alic and Luke also deserved to be recognized for the excellent work they were carrying out, and after some discussion I managed to convince InWatch USA and Global In Devices to fly them to Barcelona to help with the final hackathon. Alic will also go to Granada after Barcelona to set up relationships between The Foundry@CITRIS, the University of Granada and the startups in the Technology Park….. a good way of strengthening the relationships between universities in the USA and the EU. Unfortunately Luke cannot afford to miss more classes and exams, so he will be back after the Mobile World Congress.

Here we were, two days before the start of the event, with all the preparatory actions nearly completed… the T-shirts arrived and they were cool as well…. miracles happen… 😉

Stay tuned for the third and final part !!!

Best

Paco

Organizing a Hackathon @ UC Berkeley: The project and its origins…. (Part 1 of 3)

Sunday, February 22nd, 2015

For those who are not familiar with the term, “hackathon” is the term used in the hacker community for a meeting of programmers whose aim is collaborative software development, although it may also have, in some cases, a hardware component. These events typically last two to three days. The goal is twofold: first, to contribute to a project, very often open source, and second, to learn without haste, but with the aim of developing solutions that might lead to new startups. That is why many hackathons also have a component of mentoring and searching for business angels.

The term integrates the concepts of marathon and hacker, alluding to a collective experience that pursues the common goal of developing applications collaboratively in a short period of time.

Many hackathons have educational and stimulating purposes, as well as social ones aimed at improving quality of life, but we also proposed the goal of creating usable software that might become a product marketable by a new startup.

Hackathons, from an organizational point of view, have a horizontal and intensive dynamic where participants complement individual experiences and skills in order to develop concrete solutions. They promote collaborative work among peers oriented towards problem solving, putting the focus on the work process as a form of collaborative learning and promoting the intrinsic motivation of participants.

Some months ago, when I went to San Francisco for some meetings, I met Juan Francisco, a friend of mine from Granada, who was in charge of the Telecenters organization in Andalucía, a very successful initiative in the region that started with the aim of increasing IT literacy in small villages and that, over time, got involved in social entrepreneurship.

The initiative, funded by the Regional Government of Andalucía with the contribution of EU funds, was, thanks to his energy and efforts, one of the most successful of its kind around the world. He is internationally known in these circles.

In that meeting, Juan Francisco told me about the possibilities of a new generation of standalone smartwatches (no need for a smartphone for them to receive messages, make calls or go to the Internet) that could be a breakthrough in the future, particularly when connected to “wearables”… we are already in the Internet of Things (IoT) here…

In order to promote these technologies, he told me he would like to organize a series of hackathons to facilitate the development of applications for that kind of device, and he would like my help to organize one at UC Berkeley, since the university has a good reputation in computer science, at the same time as the other hackathons he was organizing in Spain and possibly in South America.

You know me, I am a lifelong learner, this was a new experience… and I like challenges; I had never been involved in the organization of a hackathon and I decided to give it a try in my “free time”….  whatever that means  😉

Now, how to go about it?…

I knew there was a big hackathon at UC Berkeley at the beginning of the winter semester; it took place at the so-called Greek Theatre and gathered more than 2000 hackers.

It was a general-purpose one and any kind of IT-related project (hardware, software, web, mobile, etc.) was accepted.

I contacted Costas Spanos, the Director of CITRIS, the Institute I am working with, for guidance and contact points for the activity, and I also asked him whether CITRIS would be interested in sponsoring our hackathon.

He replied immediately and put me in contact with Alic Chen, the head of “The Foundry @ CITRIS”, so that he could look at my proposal and report back to him with a business case.

The Foundry @ CITRIS was created in 2013 to help entrepreneurs build companies that make a significant impact on the world. A new economy is developing at the intersection of hardware, software and services. The Foundry provides access to design, manufacturing & business development tools, along with a community of entrepreneurs and experts to transform entrepreneurial teams into founders.

Since it was created, The Foundry has helped to launch 14 companies, helped them raise more than 7 M$ in venture capital, and it is estimated that it has contributed more than 18 M$ to the Californian economy…… not too bad. Furthermore, The Foundry helped to organize the big hackathon I mentioned above.

I had a meeting with Alic and he immediately bought into the idea, and from there we started working.

The Foundry would sponsor the event and would contribute by making the infrastructure available (space, tables, chairs, electricity, etc.), would act as a consultant to help with the details of the organization (including budget estimation) and would liaise with the Administration to arrange security and cleaning of the premises after the event. Those weekend services are compulsory and are invoiced by the Administration to the organizers.

The next step was to work out how to reach the “hacker” population, particularly but not only Berkeley students, to advertise the event and get as many registrations as possible.

But this was not enough, we needed more logistics support….

A few weeks before, I had participated in a dinner organized by one of the Visiting Scholars from the University of Valencia, who was here to collect information about Spanish startups in the valley and to see how the startup environment had developed at UC Berkeley.

There, I met a South Korean undergrad student whose “nickname” was Luke (his real name is Kun-hyoung Kim… he thought that his real first name would be too complicated to pronounce or remember here and therefore he decided to sign as Luke Kim…. a very practical guy … and a very interesting one…)

He is one of the most active members of CETSA (Center for Entrepreneurship and Technology Student Association), which is linked to UC Berkeley’s Center for Entrepreneurship & Technology (CET), an organization that equips engineers and scientists with the skills to innovate, lead and productize technology in the global economy.

I met Luke again in one of the UC Berkeley conferences and explained to him my idea.

We agreed to discuss it in detail over a cup of coffee the next day and, after my explanations, he bought into the idea and agreed to create a group of volunteers to help with the organization, to advertise the event on their Facebook page and to send the invitation to participate to their mailing lists.

The next step was to agree with the two sponsoring companies, InWatch USA and Global In Devices in Spain, on the precise objectives for the hackathon, including the prizes for the winners, and to get a clear commitment for the payment of all the expenses.

Based on the estimations provided by Alic, I prepared a budget proposal to get the agreement of the sponsoring companies and, after some discussion, I got it.

Juan Francisco also decided to involve Miguel Raimilla, the head of Telecenter.org, the worldwide organization coordinating the telecenter movement, since he has a very good network of contacts and experience in this kind of event. Miguel on board… I knew him already and he is a nice guy..

We were nearly ready to go but…………we were already close to the end of the winter semester and after the winter exams everybody would disappear until mid-January……

Taking into account that the prize for the winners would be a trip to the final hackathon at the Mobile World Congress in Barcelona (March 2nd to 5th, 2015), it meant that the last possible weekend to organize the UC Berkeley hackathon was the weekend of February 20th, at the same time as the other three hackathons taking place in Spain.

It also meant that we would have only four working weeks to deliver……. a lot of discipline and control of the activities would be required…. real project management skills needed…. 😉

Stay tuned for part two, soon

Best

Paco

Back to Berkeley again for the second semester.. and we start ….. in the Cloud!!!

Wednesday, January 28th, 2015

Last week I attended a conference organized by the UC Berkeley EECS (Electrical Engineering and Computer Sciences) Department, whose title was “Erasure Coding for Cloud Storage“. The speaker was Parikshit Gopalan, a researcher on the Microsoft Azure team.

Under that apparently complicated title lies the solution to a very interesting problem of storage efficiency and reliability.

These days nearly everyone stores things in the cloud and many users go for the free storage offered by the various internet content providers (Google, Microsoft, Dropbox, etc.).

To estimate what it represents in terms of storage, let’s take a look at some raw figures:

One simple photo per day: 256 KB; one year of photos: roughly 100 MB per user; one billion users over one year: on the order of 100 PB… call it a data deluge !!!

If 1 TB of raw disk costs around 100 $, those ~100 PB already represent close to 10 M$ in disks alone… spent on reliable storage that is partly “for free” for the users…
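To make the back-of-the-envelope arithmetic explicit, here is a quick sketch; the per-photo size, user count and disk price are just the round numbers above, not exact figures.

    # Back-of-the-envelope storage arithmetic (round numbers, not exact figures)
    PHOTO_SIZE_KB = 256          # one photo per day, per user
    DAYS_PER_YEAR = 365
    USERS = 1_000_000_000        # one billion users
    PRICE_PER_TB_USD = 100       # assumed raw-disk price

    per_user_year_mb = PHOTO_SIZE_KB * DAYS_PER_YEAR / 1024      # ~91 MB, call it 100 MB
    total_tb = per_user_year_mb * USERS / (1024 * 1024)          # ~87,000 TB, i.e. roughly 100 PB
    raw_disk_cost = total_tb * PRICE_PER_TB_USD                  # ~8.7 M$ in raw disk alone

    print(f"per user/year: {per_user_year_mb:.0f} MB")
    print(f"fleet total:   {total_tb:,.0f} TB (~100 PB)")
    print(f"raw disk cost: {raw_disk_cost/1e6:.1f} M$ (before replication, servers, power...)")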

The conclusion is that storing such massive amounts of data comes with costs: not only disk storage but also servers, power, air conditioning and the associated professional services.

The challenge is then: “To manage data to help keep it safe and secure while minimizing the amount of storage space it requires”.

And this is exactly what Parikshit and his colleagues at Microsoft Research and the Windows Azure Storage group have been doing for some time now, saving Windows Azure Storage millions of dollars; the work they have carried out has been recognized by the research community, which has awarded them several prizes for their research.

The easiest way to preserve data integrity is duplication. Generally speaking, three full copies of the data are enough to keep them safe, but in this case the “storage cost overhead” is exactly three, which is prohibitively costly.

One method to keep data accessible and durable, while using less space, is to “code” the data, i.e. create a shortened description of the data that can be reassembled and delivered safely to the user on request.

Parikshit and his colleagues have used “erasure coding” as the basis for their algorithms. Erasure Coding (EC) is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations or storage media.

A very nice description of erasure coding “for dummies” can be found at http://smahesh.com/blog/2012/07/01/dummies-guide-to-erasure-coding/. Erasure coding is also used in several distributed file systems, like the now very popular Hadoop Distributed File System (HDFS).
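To make the idea concrete, here is a toy sketch of my own (far simpler than anything used in production) of the simplest possible erasure code: split the data into k fragments and add a single XOR parity fragment, which lets you rebuild any one lost fragment.

    # Toy erasure code: k data fragments plus one XOR parity fragment.
    # Any single missing fragment can be rebuilt by XOR-ing the survivors.
    from functools import reduce

    def encode(data: bytes, k: int):
        frag_len = -(-len(data) // k)                      # ceiling division
        frags = [data[i*frag_len:(i+1)*frag_len].ljust(frag_len, b"\0") for i in range(k)]
        parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), frags)
        return frags + [parity]

    def reconstruct(frags):
        """frags is the full list with exactly one entry set to None (the lost fragment)."""
        survivors = [f for f in frags if f is not None]
        missing = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), survivors)
        return [missing if f is None else f for f in frags]

    pieces = encode(b"erasure coding in the cloud", k=4)
    pieces[2] = None                                       # simulate a lost fragment
    recovered = reconstruct(pieces)
    print(b"".join(recovered[:4]).rstrip(b"\0"))           # the original data is back

A single XOR parity only survives one lost fragment; real systems need codes that tolerate several simultaneous losses, which is where Reed-Solomon comes in.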

A well-understood way to perform the erasure coding (EC) operation is Reed-Solomon coding (devised in 1960 !!!!), used by the U.S. space program to reduce and correct problems in data communications. It also helped make compact discs possible by catching and correcting problems in the disc’s digital coding.

And why is this coding method important for the subject of this post? For example, a 6+3 Reed-Solomon code converts three copies of the data into nine fragments (six for data and three for parity), each 1/6 of the size of the original data; that cuts the data footprint in half, to a “storage overhead cost” of 1.5 instead of the 3 of the full three copies mentioned above.
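The overhead arithmetic is worth spelling out; a tiny sketch of my own with the figures above:

    # Storage overhead = total fragments stored / fragments' worth of original data
    def overhead(data_fragments: int, parity_fragments: int) -> float:
        return (data_fragments + parity_fragments) / data_fragments

    print(overhead(1, 2))   # 3.0 -> three full copies of the data
    print(overhead(6, 3))   # 1.5 -> 6+3 Reed-Solomon: nine fragments, each 1/6 of the data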

But there is no such thing as a “free meal”… coding data has a cost; it slows server performance due to the need to reassemble data from the code (the same would happen to a person reading a text where every other letter is missing !!!). Data retrieval can also be slowed if a data fragment is stored on a hard disk that has failed, or sits on a server that is temporarily offline (for an upgrade or patching, for instance).

The goal of Parikshit and the rest of the team was to reduce the time and cost of performing data retrieval, especially during hardware failures (… yes, with high-volume standard hardware, failures will happen !!!) or data-centre maintenance operations such as software upgrades. In addition to reducing data-retrieval time, the goal of the new approach was to perform “lazy erasure coding”, which enables greater data compression and thus reduces the “storage overhead cost” to 1.33 … or lower !!!, all with minimal performance losses.

A “storage overhead cost” of 1.33 can be achieved with a 12+4 Reed-Solomon code. But there is an undesirable effect with this approach: if a fragment fails, all 12 data fragments have to be read to reconstruct the data…. meaning 12 disk I/O operations and 12 network transfers.. and that is expensive, doubling the disk I/O operations and network transfers of the 6+3 code mentioned above.

As stated above, Reed-Solomon coding was designed for deep-space communications, with the objective of tolerating as many errors as possible for a given overhead. But the error pattern in data centres behaves differently.

In fact, well-managed data centres (and I bet that Microsoft knows how to do it ;-)…….) have a low hard-failure rate; it means that most of the data chunks (called extents) are healthy, with no failed data fragments. Extents with two or more failed data or parity fragments are rare and appear only for a short duration, and the repair rate in data centres is high and quick.

The approach used by Parikshit and the team is based on the rich mathematical theory of “locally decodable codes and probabilistically checkable proofs“. They have called it Local Reconstruction Codes (LRCs). LRCs enable data to be reconstructed more quickly than with Reed-Solomon codes, because fewer data fragments (six instead of twelve) need to be read to recreate the data in most cases, and the mathematical algorithms are also simpler, leading to less complexity in the operations needed to combine data pieces and therefore quicker response times.

The “local” in the coding technique’s name refers to the idea that, in the event of a fragment being offline, the fragments needed to reconstruct the data are not spread across the entire span of the data centre’s servers and can therefore be retrieved faster.
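The figures quoted above can be reproduced with a small sketch. The exact Azure parameters were not spelled out in the talk as I noted them, so take the (12 data, 2 local, 2 global) and (14, 2, 2) layouts below as illustrative assumptions that happen to match the 1.33 and 1.29 overheads and the “six reads instead of twelve” reconstruction cost mentioned in this post.

    # Rough cost model: Reed-Solomon vs Local Reconstruction Codes (LRC).
    # The layouts below are assumptions chosen to match the figures quoted
    # in the post, not parameters taken from the Azure implementation itself.

    def rs_cost(k, p):
        """k data + p parity fragments: overhead and reads needed to rebuild one lost fragment."""
        return {"overhead": round((k + p) / k, 2), "reads_per_repair": k}

    def lrc_cost(k, local_groups, global_parities):
        """k data fragments split into `local_groups` groups, one local parity per group,
        plus `global_parities` global parities. A single lost data fragment is
        rebuilt from its local group only."""
        group_size = k // local_groups
        return {"overhead": round((k + local_groups + global_parities) / k, 2),
                "reads_per_repair": group_size}

    print(rs_cost(6, 3))       # overhead 1.5,  6 reads per repair
    print(rs_cost(12, 4))      # overhead 1.33, 12 reads per repair
    print(lrc_cost(12, 2, 2))  # overhead 1.33, 6 reads per repair
    print(lrc_cost(14, 2, 2))  # overhead 1.29, 7 reads per repair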

In the implementation, they propose “lazy erasure coding”; but… why “lazy”? In fact, the term comes from the way coding works in the background, off the critical write path. When a data chunk, called an “extent”, is opened and filled, it is duplicated with three full copies for the reasons mentioned earlier. When it is sealed, erasure coding is launched in the background, when the data-centre load level is low. The extent is split into equal-sized data fragments, which are coded to generate the parity fragments, with each data and parity fragment stored in a different place. Once the data is erasure-coded and all the data and parity fragments are distributed, the three original full copies of the data can be deleted.
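The lifecycle described above can be summarized in a minimal simulation; this is only my own illustration of the described workflow, with a stand-in for the real parity computation, and all names are invented, not Azure code.

    # Minimal sketch of the "lazy" lifecycle of an extent (illustrative only).
    class Extent:
        def __init__(self):
            self.data = bytearray()
            self.sealed = False
            self.replicas = 3          # while open: three full copies on the critical write path
            self.fragments = None      # filled in later, in the background

        def append(self, chunk: bytes):
            assert not self.sealed, "sealed extents are immutable"
            self.data += chunk

        def seal(self):
            self.sealed = True

    def lazy_encode(extent: Extent, k: int = 14, parities: int = 4):
        """Background step: split a sealed extent into k data fragments, add parities,
        then drop the full replicas. Real parity math and fragment placement are omitted."""
        assert extent.sealed
        frag_len = -(-len(extent.data) // k)
        data_frags = [bytes(extent.data[i*frag_len:(i+1)*frag_len]) for i in range(k)]
        parity_frags = [b"<parity>"] * parities     # stand-in for the real LRC parities
        extent.fragments = data_frags + parity_frags
        extent.replicas = 0                         # the three full copies can now be deleted

    ext = Extent()
    ext.append(b"some user data")
    ext.seal()
    lazy_encode(ext)
    print(len(ext.fragments), "fragments,", ext.replicas, "full replicas left")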

With the proposed coding approach the team has met the two main criteria of data storage. First, reliability: the data remain durable and readily available. A data chunk can suffer three failures and still be rebuilt with 100% accuracy; in the unlikely event of four failures, the success rate drops to 86%. Second, the new coding results in a storage overhead of 1.29. When one thinks about the amount of storage existing and provisioned regularly in a cloud service, a small decrease in overhead may mean millions in savings.

The proposed coding method was deployed in Azure some time ago and is also present in Windows 8.1 and Windows Server 2012.

A very interesting presentation and a discovery, at least for me, of the storage design and management approach needed in the Cloud.

Stay tuned for more !!

Best

Paco

The last post of 2014: The Reinhart and Rogoff Spreadsheet, Austerity Policies and Programming Language Technology…..

Wednesday, December 31st, 2014

A few days ago, I attended a very interesting conference by Emery Berger, a Professor in the School of Computer Science at the University of Massachusetts Amherst, the flagship campus of the UMass system.

In 2010, the economists Carmen Reinhart and Kenneth Rogoff, both now at Harvard, presented the results of an extensive study of the correlation between indebtedness (debt/GDP) and economic growth (the rate of change of GDP) in 44 countries over a period of approximately 200 years. The authors argued that there was an “apparent tipping point”: when indebtedness crossed 90%, growth rates plummeted. It appears that the results of this study were widely used by politicians to justify austerity measures taken to reduce debt loads in countries around the world.

What programming language did they use to develop the model?

C++? Nope…. even if there are approx. 3.5 million users of this programming language around the world…

Was it Java? Nope… even if there are some 9 million users of Java around the world…

What did they use then? Like many others in the social and bio sciences, they used … Microsoft EXCEL …. It is estimated that there are around 500 million EXCEL users around the world (approx. 7% of the world population) and…. yes, EXCEL is a very powerful programming language, with its formulas/macros and embedded execution model…

A friend of mine, Oscar Pastor, who is a Professor of Computer Science at the Polytechnic University of Valencia, is trying to apply information systems technologies to genetics and, together with his research group, has been doing research in that area of biology. He tries to organize DNA sequences in such a way that discovering patterns for illnesses becomes easier and quicker.

Why do I refer to this? … it appears that hundreds of gigabytes of information on the subject, gathered and processed by research groups, live in flat files and EXCEL worksheets, and therefore checking the correctness of models and data developed with EXCEL is extremely important.

As stated by Emery and his team, program correctness has been an important programming-language research topic for many years. A lot of research has been carried out (and still is !!!) to find techniques to reduce program errors. They range from testing and runtime assertions to dynamic and static analysis tools that can discover a wide range of bugs. These tools enable programmers to find programming errors and to reduce their impact, improving overall program quality.

The Holy Grail in this area is “program proving”, i.e. being able to find a mathematical (formal) representation of a program that allows it to be proved correct, very much in the same way one proves a theorem.

Nonetheless, a computation is not likely to be correct if the input data are not correct. The phrase “garbage in, garbage out,” long known to programmers, describes the problem of producing incorrect outputs even when the program is known to be correct. Consequently, the automatic detection of incorrect inputs is at least as important as the automatic detection of incorrect programs. Unlike programs, data cannot be easily tested or analyzed for correctness.

There are a variety of reasons why data errors exist: they might be data entry errors (typos or false transcription), measurement errors (the acquisition device is faulty) or data integration errors (mixing different data types or measurement units….. ).

On Data Integration errors, remember the Mars Climate Orbiter loss in 1999 “because spacecraft engineers failed to convert from English to metric measurements when exchanging vital data before the craft was launched”.

By contrast with the proliferation of tools at a programmer’s disposal to find programming errors, few tools exist to help find data errors.

There are some automatic approaches to finding data errors, such as data cleaning (cross-validation with ground-truth data) and statistical outlier detection (reporting data as outliers based on their relationship to a given distribution, e.g. a Gaussian). However, identifying a valid input distribution is at least as difficult as designing a correct validator…. and, as stated by the authors of the research, “even when the input distribution is known, outlier analysis often is not an appropriate error-finding method. The reason is that it is neither necessary nor sufficient that a data input error be an outlier for it to cause program errors !!!”.
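A toy example of my own (not from the paper) makes the “neither necessary nor sufficient” point concrete: a genuine outlier can leave a robust result untouched, while an erroneous value that sits comfortably inside the input distribution can flip a threshold-style result entirely.

    import statistics

    growth = [2.1, 1.8, 2.4, 1.9, 2.2, 2.3]              # typical inputs, no outliers

    # A mistyped last value (2.3 entered as 8.3) IS an outlier, yet the median does not move...
    print(statistics.median(growth), statistics.median(growth[:-1] + [8.3]))   # 2.15 vs 2.15

    # ...while a typo that is NOT an outlier (2.3 entered as 2.8) flips a threshold-style result.
    below_tipping_point = lambda xs: all(x < 2.5 for x in xs)
    print(below_tipping_point(growth), below_tipping_point(growth[:-1] + [2.8]))  # True vs False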

While data errors pose a threat to the correctness of any computation, they are especially problematic in data-intensive programming environments like spreadsheets. In this setting, data correctness can be as important as program correctness. The results produced by the computations—formulas, charts, and other analyses— may be rendered invalid by data errors. These errors can be costly: errors in spreadsheet data have led to losses of millions of dollars. …….. and here comes the relationship between Reinhart and Rogoff and Programming Language Technology.

Although Reinhart and Rogoff made available the original data that formed the basis of their study, they did not make public the instrument used to perform the actual analysis: an Excel spreadsheet. Herndon, Ash and Pollin, economists at the University of Massachusetts Amherst, obtained the spreadsheet. They discovered several errors, including the “apparently accidental omission of five countries in a range of formulas“. After correcting for these and other flaws in the spreadsheet, the results invalidate Reinhart and Rogoff’s conclusion: no tipping point exists for economic growth as debt levels rise.

Now, could this kind of “accidental error“ have been detected with the help of programming language technology?

Emery and his team have carried out research whose key finding is that, “with respect to a computation, whether an error is an outlier in the program’s input distribution is not necessarily relevant. Rather, potential errors can be spotted by their effect on a program’s output distribution. An important input error causes a program’s output to diverge dramatically from that distribution. This statistical approach can be used to rank inputs by the degree to which they drive the anomalousness of the program”.

In fact, they have presented “Data Debugging”, an automated technique for locating potential data errors. Since it is impossible to know a priori whether data are erroneous or not, data debugging does the next best thing: locating data that have an unusual impact on the computation. Intuitively, data that have a high impact on the final result are either very important or wrong. By contrast, wrong data whose presence has no particularly unusual effect on the final result do not merit special attention.
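The intuition can be conveyed in a few lines; this is only the leave-one-out intuition in a toy form of my own, not the actual statistical machinery inside CHECKCELL: rank each input cell by how much the final result moves when that cell is left out.

    import statistics

    def impact_ranking(inputs, compute=statistics.mean):
        """Rank inputs by how much the output moves when each one is left out (toy version)."""
        baseline = compute(inputs)
        impacts = []
        for i, value in enumerate(inputs):
            without = inputs[:i] + inputs[i+1:]
            impacts.append((abs(compute(without) - baseline), i, value))
        return sorted(impacts, reverse=True)

    growth = [2.3, 1.9, 2.1, 10.2, 2.0, 1.8]      # one suspicious entry slipped in
    for delta, idx, value in impact_ranking(growth):
        print(f"cell {idx}: value {value:>5} shifts the mean by {delta:.2f}")

Running it puts the 10.2 entry at the top of the ranking: it is either the most important value in the computation… or the one to audit first.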

Based on this theoretical research, they have developed a tool called CHECKCELL, a data-debugging tool designed as an add-in for Microsoft Excel and for Google Spreadsheets.

It highlights all inputs whose presence causes function outputs to be dramatically different from what they would be were those inputs excluded. CHECKCELL then guides the user through an audit, one cell at a time. CHECKCELL appears to be both empirically and analytically efficient.

CHECKCELL’s statistical analysis is guided by the structure of the program present in a worksheet. First, it identifies the inputs and outputs of the computations: it scans the open Excel workbook and collects all formula strings. The collected formulas are parsed using an Excel grammar expressed with the FParsec parser combinator library. CHECKCELL uses each Excel formula’s syntax tree to extract references to input vectors and other formulas, and resolves references to local, cross-worksheet and cross-workbook cells.

One interesting approach was that, in order to generate possible input errors to test the tool, Emery and his team used human volunteers via Amazon’s Mechanical Turk crowdsourcing platform to copy series of data and thus generate typical human transcription errors. According to Emery, on average 5% of the copied data were erroneous.

Emery and his team obtained the Excel spreadsheet directly from Carmen Reinhart and ran CHECKCELL on it. The tool singled out one cell in bright red, identifying it as “a value with an extraordinary impact on the final result”.

They reported this finding to one of the UMass economists (Michael Ash). He confirmed that this value, a data entry of 10.2 for Norway, indicated a key methodological problem in the spreadsheet. The UMass economists had found this flaw by careful manual auditing after their initial analysis of the spreadsheet.

The problem stems from the extraordinary growth (more than 10%) of Norway in a single year, 1946, out of the 130 years registered. Such high growth in one year has an enormous impact on the model, since Norway’s one year in the 60-90 percent debt/GDP category receives equal weight to, for example, Canada’s 23 years in the category, Austria’s 35, Italy’s 39, and Spain’s 47 !!!!!
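The weighting issue is easy to see with a couple of lines. This is a toy illustration of country-level averaging, not the actual Reinhart-Rogoff dataset: apart from Norway’s single 10.2% year, the growth rates and the country selection below are made up for the example.

    # Toy illustration of equal-weight country averaging within a debt/GDP bucket.
    # Only Norway's single 10.2% year comes from the post; the rest is invented.
    bucket = {
        "Norway": [10.2],              # one year in the 60-90% bucket
        "Spain":  [1.5] * 47,          # 47 years (growth rates invented here)
        "Canada": [2.0] * 23,          # 23 years (growth rates invented here)
    }

    country_means = {c: sum(v) / len(v) for c, v in bucket.items()}
    equal_weight = sum(country_means.values()) / len(country_means)             # each country counts once
    year_weight = sum(sum(v) for v in bucket.values()) / sum(len(v) for v in bucket.values())

    print(f"country-weighted mean: {equal_weight:.2f}%")   # ~4.57% -- dominated by Norway's one year
    print(f"year-weighted mean:    {year_weight:.2f}%")    # ~1.78%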

I asked Emery about the reaction of Reinhart and Rogoff when the flaws in the model were discovered…. He answered that both economists maintained their conclusions… and in any case, according to him, they said that theirs was a “working paper” …. and, of course, it had not been subject to a rigorous “peer review”…..

You can find the opinion of the prestigious economist and Nobel Prize winner Paul Krugman on EXCEL errors and the Reinhart-Rogoff model at http://www.nytimes.com/2013/04/19/opinion/krugman-the-excel-depression.html?_r=0

I found an interesting FAQ on this famous (or perhaps infamous…) flaw at http://www.businessweek.com/articles/2013-04-18/faq-reinhart-rogoff-and-the-excel-error-that-changed-history for those who are interested in the subject.

I will come back to Emery’s work when I speak about SurveyMan in one of my next posts.

In the meantime, I wish all readers a happy 2015 !!!

Stay tuned for more in the New Year !!!

Best

Paco