Engineering is all about trade-offs, and Tom Bartel, Trivago Team Lead Interface Platform, does a great job of illustrating the trade-offs that led to the rewrite of the platform in his recent blog post. It was a colossal undertaking, especially when nothing is inherently “wrong” or broken with the code base.
What was working? Well, the site as a whole was. Users enjoyed full functionality on web and mobile. There was a team of trained engineers, many of whom enjoyed their work.
What “didn’t work”? Melody was homegrown, so it wasn’t widely used. The ecosystem was small, documentation was limited, and outside help (e.g. Google and Stack Overflow) was very limited to non-existent. There were at most two main maintainers, with at least one on call at all times. Onboarding new employees was difficult, and some expressed concerns that they were learning and developing non-transferable skills.
Double down on Melody? Allocate resources to modernize the framework, update and add quality documentation, and train engineers in maintenance? Or look down…
…The blank page
It’s not a new project, but it’s a new project. Since the effort, internally called the Web Application Rewrite Project (WARP), is a complete rewrite and not a refactor, all the questions of a new project arise:
- Libraries: which are the best fit for utilities, date calculations, etc.?
- CSS files: how to organize?
- Application state: how will it be managed?
- Event transmission: how will it happen?
- HTML pages: will they be statically pre-generated?
- Structure: how will URLs and pages be organized?
- Application initialization: how will it work?
- Component API: What will the design look like?
Ah, making decisions!
With so many decisions and the team working remotely (remote collaboration was still considered new when this project kicked off in April 2020), Trivago engineers implemented an incredibly methodical and pragmatic approach to working through the engineering questions and ultimately making the decisions:
- Decision document: collects and organizes the relevant facts and the engineers’ points of view.
- Decision meeting: a place to discuss viewpoints, ultimately leading to the decision.
- Decision owner: organizes the document, prepares the decision meeting and ensures that a decision is made.
Some decisions were easy to make while others were hard won. Some went straight from document to test, while others were reworked because the implementation did not meet initial expectations.
It was while implementing a different decision, one that some developers found too complicated, that Trivago engineers decided to move forward with Next.js and React.
A great tip from Trivago’s trial-and-error rewrite process is to commit to the decisions, but keep an open mind and course-correct if necessary. Decisions made with the best knowledge and intentions can bring new insights during implementation.
Once the rewrite was fully functional and useful to the user, it was exposed to the real world and tested, with dashboards, checks, and comparisons serving as guides that showed engineers what needed attention:
- User interaction: Does it differ between the old and the new product? If so, was the cause bugs or something else?
- Revenue: Does the new app forward to booking sites at the same rate?
- Search types: Trivago automatically adjusts search types for better results. For example, if a search is too narrow, Trivago will expand the parameters and add results. A difference in the listings was a direct indicator that something was wrong and prompted further investigation.
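The kind of old-versus-new comparison described in the list above can be sketched as follows. This is only an illustration under stated assumptions: the `MetricSnapshot` shape, the function names, the 2% tolerance, and the sample numbers are all hypothetical and are not taken from Trivago's dashboards.

```typescript
// Hypothetical metric snapshot; the field names are illustrative, not Trivago's.
interface MetricSnapshot {
  sessions: number;
  clickouts: number; // forwards to booking sites
}

// Share of sessions that were forwarded to a booking site.
function clickoutRate(m: MetricSnapshot): number {
  return m.sessions === 0 ? 0 : m.clickouts / m.sessions;
}

// True when the new app's rate is within `tolerance` (relative) of the old app's.
// The default of 2% is an assumed threshold, chosen only for this example.
function withinTolerance(
  oldApp: MetricSnapshot,
  newApp: MetricSnapshot,
  tolerance: number = 0.02
): boolean {
  const oldRate = clickoutRate(oldApp);
  const newRate = clickoutRate(newApp);
  if (oldRate === 0) return newRate === 0;
  return Math.abs(newRate - oldRate) / oldRate <= tolerance;
}

// Made-up sample numbers, for demonstration only.
const oldApp: MetricSnapshot = { sessions: 1000, clickouts: 300 };
const healthyNewApp: MetricSnapshot = { sessions: 1000, clickouts: 297 };
const regressedNewApp: MetricSnapshot = { sessions: 1000, clickouts: 270 };

console.log(withinTolerance(oldApp, healthyNewApp));   // true  (1% lower)
console.log(withinTolerance(oldApp, regressedNewApp)); // false (10% lower)
```

A check like this only flags that something differs; as the list notes, finding out whether the cause is a bug or something else still takes investigation.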
It took several months of engineering, but eventually the switch was flipped and all user traffic went to the new app.
The rewards have been reaped!
Benefits for the user
Startup time depends heavily on the size of the code Trivago ships. Since engineers rely heavily on open source libraries such as Next.js, Preact, and react-use, they closely monitor code size.
The new product reduced page weight from 2.1 MB to 1.7 MB (19%) for topic pages and from 4.1 MB to 2.6 MB (37%) for results pages. Turning the application from a single page into multiple pages and using Next.js’s automatic code splitting has proven to be very beneficial.
As a result, Trivago now runs more smoothly on weaker hardware. Devices on Android 6, which account for approximately 0.5% of all Android customers, are the weakest hardware running the Trivago app. Trivago’s tests show that the application works perfectly on Android 6.
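The percentages quoted for the page-weight reduction can be checked with a few lines of arithmetic. The helper name `percentReduction` is an illustrative choice; the megabyte figures are the ones given in the text.

```typescript
// Percentage reduction from a "before" size to an "after" size.
function percentReduction(beforeMb: number, afterMb: number): number {
  return ((beforeMb - afterMb) / beforeMb) * 100;
}

const topicPages = percentReduction(2.1, 1.7);   // roughly 19.05
const resultsPages = percentReduction(4.1, 2.6); // roughly 36.59

console.log(`${Math.round(topicPages)}%`);   // prints "19%"
console.log(`${Math.round(resultsPages)}%`); // prints "37%"
```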
Benefits for developers
This ties back to why the rewrite started in the first place. The codebase is cleaner, quality documentation is widely available internally, and the global ecosystem can be searched on Google and Stack Overflow. New developers find it easier to onboard, and the skills they develop are more familiar and transferable.
Although there is no definitive evidence of faster development, monthly merged pull requests are higher in the new code base (see table above) with the same number of engineers as were working on the old code base.
There are now ten releases per day compared to the previous two. With a little more cleanup, the remaining legacy systems will be disabled, freeing up more resources for the new application.
All in all, this big rewrite came at a cost and there were quite a few struggles. Revenue was lost, although the timing coincided with the 2020 travel downturn. The team has developed enormously in terms of engineering skills as well as soft skills.
With all the setbacks taken into account, the project was definitely a success.