A few days ago I attended a meeting sponsored by UC Berkeley’s EECS (Electrical Engineering and Computer Science) School for the company Airbnb, titled “AirBnB Info Session”. I wanted to know why a company of this type was organizing a meeting at the University.
For those who do not know the company, Airbnb is a website for people to rent out lodging. It has over 800,000 listings in 34,000 cities and 192 countries, with more than 20 million registered guests. Founded in August 2008 and headquartered in San Francisco, California, the company is privately owned and operated by Airbnb, Inc. According to some sources, the current valuation of the company is over $12 billion. The name is short for “Air Bed & Breakfast”.
Users of the site must register and create a personal online profile before using the site. Every property is associated with a host whose profile includes recommendations by other users, reviews by previous guests, as well as a response rating and private messaging system.
Hotels in many countries have complained to public administrations about the way the company operates. This has led to regulations that limit the number of days private owners can rent out their houses without registering as a company, and that ask Airbnb to collect taxes on customer payments.
The presentation was led by the company’s communication team, called “Team X”. It took place in the late afternoon, after normal class hours, and the great majority of participants were undergrad or grad students. The Wozniak meeting room in Soda Hall was full.
The purpose of the meeting was to present the company, its work environment and its technology platform to the students and collect applications for traineeships and/or employment.
I discovered that Oracle and Pixar were organizing the same kind of meetings in the following days. I understood why: a number of undergrads and grads will receive their diplomas in December, so now is a good time for companies to come and hunt for brilliant brains. Computer Science students at Berkeley have a very good reputation and Silicon Valley is in need of good professionals. The fact of the matter is that most students get a job before they finish their studies or shortly after.
I was particularly interested in the part of the presentation concerning the IT infrastructure and working methods. The presentation was delivered by Lu Cheng, who graduated from UC Berkeley in December 2013 and was a former trainee at the company.
The company is structured in the following teams: Growth, Search, Discovery, Trust & Safety, Production & Infrastructure, Data Infrastructure, Security, Mobile and Payments.
Airbnb’s IT infrastructure is hosted in the Amazon Web Services cloud. The company makes extensive use of open source software such as Ruby on Rails, Java, MySQL, Lucene, etc., and to my surprise also uses Hadoop, Apache Mesos (with Chronos, a replacement for cron open sourced by Airbnb) and Apache Spark (for high-speed cluster computing). A very good description of the use of open source at Airbnb can be found at http://nerds.airbnb.com/open-source/. For technical teams, I recommend reading some of the “conversations” in the Tech Talk tab. I think we have a lot to learn from these start-ups!!!
For the future, the company wants to go from “where you want to go” to “where we suggest you go” by exploiting all the information they have in their databases.
They also want to exploit the information in reviews and queries by better understanding the text. They are creating a series of attributes using TF-IDF algorithms.
TF–IDF (which stands for term frequency–inverse document frequency) is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus. It is often used as a weighting factor in information retrieval and text mining. The TF–IDF value increases proportionally to the number of times a word appears in the document, but is offset by the frequency of the word in the corpus, which helps to control for the fact that some words are generally more common than others.
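To make the definition concrete, here is a minimal sketch of one common TF–IDF variant (term count divided by document length, multiplied by the logarithm of the inverse document frequency). It is only an illustration of the statistic, not Airbnb's implementation: the function name `tf_idf` and the sample reviews are made up for this example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper)."""
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # tf = share of the document taken by the term,
            # idf = log(number of documents / documents containing the term)
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample reviews, for illustration only.
reviews = [
    "quiet flat near the beach",
    "the flat was clean and the host was friendly",
    "great host near the station",
]
weights = tf_idf(reviews)
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:8s} {weight:.3f}")
```

In this toy corpus the word “the” appears in every review, so its weight is zero, while “beach”, which appears in only one review, gets the highest weight in the first review.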
Variations of the TF–IDF weighting scheme are often used by search engines as a central tool in scoring and ranking a document’s relevance given a user query. TF–IDF can be successfully used for stop-words filtering in various subject fields including text summarization and classification. One of the simplest ranking functions is computed by summing the TF-IDF for each query term; many more sophisticated ranking functions are variants of this simple model.
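The simple ranking function mentioned above (sum the TF–IDF weight of each query term in each document) can be sketched as follows. This is an assumption-laden toy, not the search code of any real engine: the function name `rank`, the whitespace tokenization and the sample listings are all invented for the example.

```python
import math
from collections import Counter


def rank(query, docs):
    """Return document indices ordered by summed TF-IDF of the query terms."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for index, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term]:  # terms absent from the document add nothing
                tf = counts[term] / len(tokens)
                idf = math.log(n_docs / df[term])
                score += tf * idf
        scored.append((score, index))

    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored]


# Invented sample listings, for illustration only.
listings = [
    "quiet flat near the beach",
    "the flat was clean and the host was friendly",
    "great host near the station",
]
print(rank("beach flat", listings))
print(rank("host station", listings))
```

A query for “beach flat” puts the first listing on top because it contains both terms, one of which is rare in the corpus. Real ranking functions add refinements such as length normalization and term saturation on top of this basic idea.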
They tailor the algorithms to search for locations, user experiences, etc.
At the end of the presentation Lu Cheng made a few interesting comments based on his experience as a new employee of the company: come and work with us, we guarantee a lack of bureaucracy, little or no 24/7 firefighting, interesting IT challenges, fun and fast deployment of new versions!!!
At the end of the session students were queuing to hand their CVs to the team or get a token to go online and register.
I will post soon on other events I attended in the past few days and on my very interesting experience at the 48th ICA-IT conference that took place in Ottawa this week.
Stay tuned!!!