How does an open philosophy jibe with best practices in performance and security? In short, we’re selective in our dependencies and audit our own upstream sources. Progressive enhancement not only makes for a fast and accessible site; I argue it’s also the cheaper choice in the long run!
Background
The Wikimedia Foundation is the non-profit that hosts Wikipedia and other free knowledge and open data projects. These projects are made possible by a global community who, together with the Foundation, comprise the “Wikimedia movement”. The Wikimedia movement is united by a vision: to bring about a world in which every single human being can freely share in the sum of all knowledge.
I’ve worked at the Wikimedia Foundation for over 10 years, first as a front-end developer and eventually as part of the Performance Team.
The Wikimedia movement is rooted in the culture of freely licensed software. The MediaWiki application that Wikipedia runs on, and all other software developed at the Foundation, is open source. That includes the configuration and datacenter automation of our web servers, databases, and CDN service. The Wikimedia community and any other individual or organization may inspect, contribute to, reuse for themselves, or fork any aspect of the platform at any time. This philosophy is also the basis of long-standing security practices which support visibility and openness.
Security through visibility and trust
We live in an incredible world. Today, most online devices are powered by open source. Whether it is the data centers of video streaming giants and social media sites, or your smartphone, they likely run an open source operating system like Linux or a BSD derivative[1]. The vast majority of websites are also built with open source tools, or run on open source platforms. When you build on existing software that is developed by another organization or community, that software is called an “upstream”.
The Wikimedia Foundation relies heavily on upstream technology to power its platforms. This allows the organization to focus on its core mission of providing free knowledge to the world, rather than on developing and maintaining technology from scratch. Additionally, by collaborating with other open source projects, the Foundation is able to give back to the broader free software ecosystem and help advance the state of technology for everyone.
We’re notable for operating exclusively with upstreams that are also open source. This ensures our freedom principles (to freely inspect, modify, reuse, and fork) are not hindered by proprietary components.
New Wikimedia production software components or dependencies must pass certain fitness checks and a chain of trust for the software’s security and integrity. When the Wikimedia community creates software that is peer-reviewed during development, this trust follows implicitly from its public policies and standards. When adding a new third-party package or dependency (“upstream”), this chain needs to be established by other means.
The Wikimedia Foundation extends its chain to several credible upstream vendors and communities. For example, Debian, known for its Linux distribution, is host to the highly trusted and curated Debian package repository. When a package is present in the Debian repository, this signals trust, stability, and confidence to the industry. While we usually don’t audit source code of Debian packages, installing a Debian package may still require a concept review to validate and verify that the package actually intends to meet our scale, threat model, and performance requirements.
When considering PHP or JavaScript libraries from an anonymous and open registry like npm or Packagist, the Wikimedia Foundation audits the code as if it were its own. We keep ongoing costs to a minimum by only adopting upstream packages in areas that solve non-trivial problems, have stable external requirements, and sit behind a module boundary. Dependencies should reduce cost, not increase it. In practice, we only consider packages with few or no transitive dependencies, written for a stable runtime.
As an added precaution, the Wikimedia Foundation prohibits networking to third-party services in its production realm. When deploying or installing the MediaWiki application, it does not download JavaScript or PHP packages from npm or Composer. Instead, upstream packages are downloaded as a file with an integrity hash, and are checked into Git. This approach implements the organization’s security requirements, allowing for transparent auditing, patch-ability, and independent offline deployment. It also helps with faster onboarding, consistent and reproducible development, and creates a natural place for auditing upstream changes during code review.
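To make the “file with an integrity hash” idea concrete, here is a minimal sketch of how a downloaded upstream file could be checked before it is committed. It assumes an SRI-style hash string (“sha384-…”), which is an assumption for illustration; the function names are hypothetical, and this is not the Foundation’s actual tooling.

```python
import base64
import hashlib
import hmac


def integrity_of(data: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style integrity string: '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def verify_download(data: bytes, expected: str) -> bool:
    """Check downloaded bytes against a pinned integrity string."""
    algorithm, _, _ = expected.partition("-")
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    # Constant-time comparison of the recomputed and pinned strings.
    return hmac.compare_digest(integrity_of(data, algorithm), expected)


# Hypothetical upstream file contents, pinned once at review time.
upstream = b"/* example upstream library */\n"
pinned = integrity_of(upstream)

print(verify_download(upstream, pinned))         # unchanged file
print(verify_download(upstream + b"x", pinned))  # tampered or updated file
```

Because the pinned hash and the file live in the same repository, any upstream change shows up as an ordinary diff during code review.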
The most localized software
With over 300 language editions, Wikipedia might be among the most-translated literature in the world. Wikipedia editors usually write or translate articles manually, and in recent years, the ContentTranslation tool has helped editors do this more efficiently, producing over 1 million articles through this new tool alone.
The MediaWiki platform underneath it all recognizes and localizes its user interface in over 400 languages, including gender, pluralization rules (“10 new messages”), and sort order (ICU collations). We contribute to the Unicode CLDR standard on behalf of Wikipedia’s language communities. These contributions flow downstream to other Unicode customers such as Linux, Apple, and Microsoft.
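Pluralization is a good illustration of why this data matters: the number of plural forms differs per language. The sketch below is a simplified, integer-only rendering of the CLDR plural categories for English and Polish. The function names and message table are hypothetical, and this is not how MediaWiki itself implements plural handling.

```python
def plural_category(language: str, n: int) -> str:
    """Return the CLDR plural category for a non-negative integer n."""
    if language == "en":
        return "one" if n == 1 else "other"
    if language == "pl":
        if n == 1:
            return "one"
        if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
            return "few"
        return "many"
    raise ValueError(f"no plural rules defined for: {language}")


# One message template per plural category (English only in this sketch).
MESSAGES = {
    "en": {"one": "{n} new message", "other": "{n} new messages"},
}


def new_messages(language: str, n: int) -> str:
    """Pick the message form that matches the count."""
    template = MESSAGES[language][plural_category(language, n)]
    return template.format(n=n)


print(new_messages("en", 1))
print(new_messages("en", 10))
print([plural_category("pl", n) for n in (1, 2, 5, 12, 22)])
```

English needs two forms, while Polish needs three for whole numbers alone, so a translator must be able to supply each form separately.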
Languages like Arabic and Hebrew are written from right to left. CSSJanus takes stylesheets designed and developed for left-to-right languages like English, and automatically converts them into right-to-left layouts. We deploy the MediaWiki platform on a weekly basis. Each change to functionality is deployed to all supported languages from day 1, every time. CSSJanus is part of what makes this feasible, with little to no developer training.
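The core idea behind this kind of conversion can be sketched in a few lines: swap every directional keyword in a single pass. This is only an illustration of the technique under simplifying assumptions, not CSSJanus’s actual implementation; the real tool handles considerably more, such as four-value shorthands like margin and padding.

```python
import re

# Directional keywords and their mirrored counterparts.
_SWAP = {"left": "right", "right": "left", "ltr": "rtl", "rtl": "ltr"}

# Word boundaries keep words such as "copyright" untouched.
_TOKEN = re.compile(r"\b(left|right|ltr|rtl)\b")


def flip_css(css: str) -> str:
    """Mirror directional keywords in one pass, so no token is swapped twice."""
    return _TOKEN.sub(lambda match: _SWAP[match.group(1)], css)


ltr_rule = ".box { float: left; margin-right: 2em; direction: ltr; }"
rtl_rule = flip_css(ltr_rule)
print(rtl_rule)
```

Flipping is its own inverse in this sketch: applying flip_css twice returns the original stylesheet.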
Not all issues are that easy! During VisualEditor development, extensive effort went into localizing the bold and italic toolbar buttons. The familiar “B” and “I” buttons usually make way for an equivalent abbreviation, such as F (Fett) and K (Kursiv) in German, or a stylized “A” for language communities that have no accepted standard. But early adoption of English-centric software led to “B” and “I” becoming the established and culturally familiar design pattern in some other languages as well. In Hebrew, Czech, and Malayalam, “correcting” these with a translation actually created confusion.
No profit motive
Large corporations, driven by profit motives, regularly drop support for older devices and browsers. The Wikimedia Foundation, however, has an imperative to make information more accessible, not less.
How does the organization pull that off without the resources of a large corporation? Through equal parts being aggressively lean and aggressively uncompromising.
The organization saves development and testing costs by writing and deploying native JavaScript that targets only modern browsers. Through an approach inspired by BBC News’ “cutting the mustard”, the Foundation enables roughly 20 million people (1% of its 2 billion monthly readers) to access Wikipedia on older devices through a JavaScript-free experience. This is the same experience that every page view starts with, prior to the (optional) arrival of JavaScript code.
The Wikimedia Foundation’s development principles and browser support policy reflect this by emphasizing the importance of progressive enhancement.
Viewing Wikipedia through a web browser is the most common access method, but Wikipedia’s knowledge is consumed far beyond the canonical experience at Wikipedia.org. Wikipedia content goes everywhere. It’s distributed offline through Kiwix and IPFS, rendered in native apps like Apple Dictionary, and even shared peer-to-peer through USB sticks. What these environments have in common is that they may not involve JavaScript as they require high security and high privacy. This is made possible at no extra cost due to APIs offering complete content HTML-first, with CSS and embedded media based on ubiquitous and open formats only.
Summary
The Wikimedia Foundation prioritizes both security and openness. To achieve this balance, it implements a number of practices and policies that ensure that it protects both the freedoms and the privacy of its audience, all while sharing information transparently.
For example, the Foundation publishes a transparency report twice per year, detailing its responses to information and takedown requests. The Wikimedia Foundation’s Board positions are largely held by community members, and appointed by public election through anonymous and cryptographically verifiable votes from any eligible Wikipedia account. Its Governance Wiki publishes the Foundation’s bylaws, board decisions, and meetings.
The Foundation participates in an ecosystem of organizations that collaborate on freely licensed information and open-source software. Overall, the organization balances exceptional security and openness by implementing strong security practices, and providing transparency about its policies and procedures.
Originally published on OpenJS Foundation Blog.
[1] Note that Apple’s macOS and iOS are also Unix-like BSD derivatives, through their inheritance from the NeXTSTEP operating system, which continues to this day via the Darwin kernel.