This paper provides quantitative data showing that, in many cases, using open source software / free software (abbreviated as OSS/FS, FLOSS, or FOSS) is a reasonable or even superior approach to using its proprietary competition according to various measures. This paper’s goal is to show that you should consider using OSS/FS when acquiring software. This paper examines market share, reliability, performance, scalability, security, and total cost of ownership. It also has sections on non-quantitative issues, unnecessary fears, OSS/FS on the desktop, usage reports, governments and OSS/FS, other sites providing related information, and ends with some conclusions. An appendix gives more background information about OSS/FS. You can view this paper at http://www.dwheeler.com/oss_fs_why.html (HTML format). Palm PDA users may wish to use Plucker to view this. A short briefing based on this paper is also available in PDF and OpenOffice.org Impress formats (for the latter, use OpenOffice.org Impress). Old archived copies and a list of changes are also available.
Open Source Software / Free Software (OSS/FS) (also abbreviated as FLOSS or FOSS) has risen to great prominence. Briefly, OSS/FS programs are programs whose licenses give users the freedom to run the program for any purpose, to study and modify the program, and to redistribute copies of either the original or modified program (without having to pay royalties to previous developers).
The goal of this paper is to convince you to consider using OSS/FS when you’re looking for software, using quantitative measures. Some sites provide a few anecdotes on why you should use OSS/FS, but for many that’s not enough information to justify using OSS/FS. Instead, this paper emphasizes quantitative measures (such as experiments and market studies) to justify why using OSS/FS products is in many circumstances a reasonable or even superior approach. I should note that while I find much to like about OSS/FS, I’m not a rabid advocate; I use both proprietary and OSS/FS products myself. Vendors of proprietary products often work hard to find numbers to support their claims; this page provides a useful antidote of hard figures to aid in comparing proprietary products to OSS/FS.
I believe that this paper has met its goal; others seem to think so too. The 2004 report of the California Performance Review, a report from the state of California, urges that “the state should more extensively consider use of open source software”, and specifically references this paper. A review at the Canadian Open Source Education and Research (CanOpenER) site stated “This is an excellent look at the some of the reasons why any organisation should consider the use of [OSS/FS]... [it] does a wonderful job of bringing the facts and figures of real usage comparisons and how the figures are arrived at. No FUD or paid for industry reports here, just the facts”. This paper has been referenced by many other works, too. It’s my hope that you’ll find it useful as well.
The following subsections describe the paper’s scope, challenges in creating it, the paper’s terminology, and the bigger picture. This is followed by a description of the rest of the paper’s organization (listing the sections such as market share, reliability, performance, scalability, security, and total cost of ownership). Those who find this paper interesting may also be interested in the other documents available on David A. Wheeler’s personal home page.
As noted above, the goal of this paper is to convince you to consider using OSS/FS when you’re looking for software, using quantitative measures. Note that this paper’s goal is not to show that all OSS/FS is better than all proprietary software. Certainly, there are many who believe this is true on ethical, moral, or social grounds. It’s true that OSS/FS users have fundamental control and flexibility advantages, since they can modify and maintain their own software to their liking. And some countries perceive advantages to not being dependent on a sole-source company based in another country. However, no numbers could prove the broad claim that OSS/FS is always “better” (indeed you cannot reasonably use the term “better” until you determine what you mean by it). Instead, I’ll simply compare commonly-used OSS/FS software with commonly-used proprietary software, to show that at least in certain situations and by certain measures, some OSS/FS software is at least as good as, or better than, its proprietary competition. Of course, some OSS/FS software is technically poor, just as some proprietary software is technically poor. And remember -- even very good software may not fit your specific needs. But although most people understand the need to compare proprietary products before using them, many people fail to even consider OSS/FS products, or they create policies that unnecessarily inhibit their use; those are errors this paper tries to correct.
This paper doesn’t describe how to evaluate particular OSS/FS programs; a companion paper describes how to evaluate OSS/FS programs. This paper also doesn’t explain how an organization would transition to an OSS/FS approach if one is selected. Other documents cover transition issues, such as The Interchange of Data between Administrations (IDA) Open Source Migration Guidelines (November 2003) and the German KBSt’s Open Source Migration Guide (July 2003) (though both are somewhat dated). Organizations can transition to OSS/FS in part or in stages, which for many is a more practical transition approach.
I’ll emphasize the operating system (OS) known as GNU/Linux (which many abbreviate as “Linux”), the Apache web server, the Mozilla Firefox web browser, and the OpenOffice.org office suite, since these are some of the most visible OSS/FS projects. I’ll also primarily compare OSS/FS software to Microsoft’s products (such as Windows and IIS), since Microsoft Windows has a significant market share and Microsoft is one of proprietary software’s strongest proponents. Note, however, that even Microsoft makes and uses OSS/FS themselves (they have even sold software using the GNU GPL license, as discussed below).
I’ll mention Unix systems as well, though the situation with Unix is more complex; today’s Unix systems include many OSS/FS components or software primarily derived from OSS/FS components. Thus, comparing proprietary Unix systems to OSS/FS systems (when examined as whole systems) is often not as clear-cut. This paper uses the term “Unix-like” to mean systems intentionally similar to Unix; both Unix and GNU/Linux are “Unix-like” systems. The most recent Apple Macintosh OS (Mac OS X) presents the same kind of complications; older versions of MacOS were wholly proprietary, but Apple’s OS has been redesigned so that it’s now based on a Unix system with substantial contributions from OSS/FS programs. Indeed, Apple is now openly encouraging collaboration with OSS/FS developers.
It’s a challenge to write any paper like this; measuring anything is always difficult, for example. Most of these figures are from other works, and it was difficult to find many of them. But there are two special challenges that you should be aware of: legal problems in publishing data, and dubious studies -- typically those funded by a product vendor.
Many proprietary software product licenses include clauses that forbid public criticism of the product without the vendor’s permission. Obviously, there’s no reason that such permission would be granted if a review is negative -- such vendors can ensure that any negative comments are reduced and that harsh critiques, regardless of their truth, are never published. This significantly reduces the amount of information available for unbiased comparisons. Reviewers may choose to change their report so it can be published (omitting important negative information), or not report at all -- in fact, they might not even start the evaluation. Some laws, such as UCITA (a law in Maryland and Virginia), specifically enforce these clauses forbidding free speech, and in many other locations the law is unclear -- making researchers bear substantial legal risk that these clauses might be enforced. These legal risks have a chilling effect on researchers, and thus make it much harder for customers to receive complete unbiased information. This is not merely a theoretical problem; these license clauses have already prevented some public critique, e.g., Cambridge researchers reported that they were forbidden to publish some of their benchmarked results of VMWare ESX Server and Connectix/Microsoft Virtual PC. Oracle has had such clauses for years. Hopefully these unwarranted restraints of free speech will be removed in the future. But in spite of these legal tactics to prevent disclosure of unbiased data, there is still some publicly available data, as this paper shows.
This paper omits or at least tries to warn about studies funded by a product’s vendor, which have a fundamentally damaging conflict of interest. Remember that vendor-sponsored studies are often rigged (no matter who the vendor is) to make the vendor look good instead of being fair comparisons. Todd Bishop’s January 27, 2004 article in the Seattle Post-Intelligencer Reporter discusses the serious problems when a vendor funds published research about itself. A study funder could directly pay someone and ask them to directly lie, but it’s not necessary; a smart study funder can produce the results they wish without, strictly speaking, lying. For example, a study funder can make sure that the evaluation carefully defines a specific environment or extremely narrow question that shows a positive trait of their product (ignoring other, probably more important factors), require an odd measurement process that happens to show off their product, seek unqualified or unscrupulous reviewers who will create positive results (without careful controls or even without doing the work!), create an unfairly different environment between the compared products (and not say so or obfuscate the point), require the reporter to omit any especially negative results, or even fund a large number of different studies and only allow the positive reports to appear in public. (A song by Steve Taylor expresses these kinds of approaches eloquently: “They can state the facts while telling a lie”.)
This doesn’t mean that all vendor-funded studies are misleading, but many are, and there’s no way to be sure which studies (if any) are actually valid. For example, Microsoft’s “get the facts” campaign identifies many studies, but nearly every study is entirely vendor-funded, and I have no way to determine if any of them are valid. After a pair of vendor-funded studies were publicly lambasted, Forrester Research announced that it will no longer accept projects that involve paid-for, publicized product comparisons. One ad, based on a vendor-sponsored study, was found to be misleading by the UK Advertising Standards Authority (an independent, self-regulatory body), which formally adjudicated against the vendor. This example is important because the study was touted as being fair by an “independent” group, yet it was found unfair by an organization that examines advertisements; failing to meet the standard for truth for an advertisement is a very low bar.
Steve Hamm’s BusinessWeek article “The Truth about Linux and Windows” (April 22, 2005) noted that far too many reports are simply funded by one side or another, and even when they say they aren’t, it’s difficult to take some seriously. In particular, he analyzed a report by the Yankee Group’s Laura DiDio, asking deeper questions about the data, and found many serious problems. His article explained why he just doesn’t “trust its conclusions” because “the work seems sloppy [and] not reliable” (a Groklaw article also discussed these problems).
Many companies fund studies that place their products in a good light, not just Microsoft, and the concerns about vendor-funded studies apply equally to vendors of OSS/FS products. I’m independent; I have received no funding of any kind to write this paper, and I have no financial reason to prefer either OSS/FS or proprietary software.
This paper includes data over a series of years, not just the past year; all relevant data should be considered when making a decision, instead of arbitrarily ignoring older data. Note that the older data shows that OSS/FS has a history of many positive traits, as opposed to being a temporary phenomenon.
You can see more detailed explanation of the terms “open source software” and “Free Software”, as well as related information, in the appendix and my list of Open Source Software / Free Software (OSS/FS) references at http://www.dwheeler.com/oss_fs_refs.html. Note that those who use the term “open source software” tend to emphasize technical advantages of such software (such as better reliability and security), while those who use the term “Free Software” tend to emphasize freedom from control by another and/or ethical issues. The opposite of OSS/FS is “closed” or “proprietary” software.
Other alternative terms for OSS/FS software include “libre software” (where libre means free as in freedom), “livre software” (same thing), free-libre and open-source software (FLOS software or FLOSS), open source / Free Software (OS/FS), free / open source software (FOSS), open-source software (indeed, “open-source” is often used as a general adjective), “freed software,” and even “public service software” (since often these software projects are designed to serve the public at large).
Software that cannot be modified and redistributed without further limitation, but whose source code is visible (e.g., “source viewable” or “open box” software, including “shared source” and “community” licenses), is not considered here since such software doesn’t meet the definition of OSS/FS. OSS/FS is not “freeware”; freeware is usually defined as proprietary software given away without cost, and does not provide the basic OSS/FS rights to examine, modify, and redistribute the program’s source code.
A few writers still make the mistake of saying that OSS/FS is “non-commercial” or “public domain”, or they mistakenly contrast OSS/FS with “commercial” products. However, today many OSS/FS programs are commercial programs, supported by one or many for-profit companies, so this designation is quite wrong. Don’t make the mistake of thinking OSS/FS is equivalent to “non-commercial” software! Also, nearly all OSS/FS programs are not in the public domain. The term “public domain software” has a specific legal meaning -- software that has no copyright owner -- and that’s not true in most cases. In short, don’t use the terms “public domain” or “non-commercial” as synonyms for OSS/FS.
An OSS/FS program must be released under some license giving its users a certain set of rights; the most popular OSS/FS license is the GNU General Public License (GPL). All software released under the GPL is OSS/FS, but not all OSS/FS software uses the GPL; nevertheless, some people do inaccurately use the term “GPL software” when they mean OSS/FS software. Given the GPL’s dominance, however, it would be fair to say that any policy that discriminates against the GPL discriminates against OSS/FS.
This is a large paper, with many acronyms. A few of the most common acronyms are:
| Acronym | Meaning |
|---|---|
| GNU | GNU’s Not Unix (a project to create an OSS/FS operating system) |
| GPL | GNU General Public License (the most common OSS/FS license) |
| OS, OSes | Operating System, Operating Systems |
| OSS/FS | Open Source Software/Free Software |
This paper uses logical style quoting (as defined by Hart’s Rules and the Oxford Dictionary for Writers and Editors); quotations do not include extraneous punctuation.
Typical OSS/FS projects are, in fact, an example of something much larger: commons-based peer-production. The fundamental characteristic of OSS/FS is its licensing, and an OSS/FS project that meets at least one customer’s need can be considered a success. However, larger OSS/FS projects are typically developed by many people from different organizations working together for a common goal. As the declaration Free Software Leaders Stand Together states, the business model of OSS/FS “is to reduce the cost of software development and maintenance by distributing it among many collaborators”. Yochai Benkler’s 2002 Yale Law Journal article, “Coase’s Penguin, or Linux and the Nature of the Firm”, argues that OSS/FS development is only one example of the broader emergence of a new, third mode of production in the digitally networked environment. He calls this approach “commons-based peer-production” (to distinguish it from the property- and contract-based models of firms and markets).
Many have noted that OSS/FS approaches can be applied to many other areas, not just software. The Internet encyclopedia Wikipedia, and works created using Creative Commons licenses (Yahoo! can search for these), are other examples of this development approach. Wide Open: Open source methods and their future potential by Geoff Mulgan (who once ran the policy unit at 10 Downing Street) and Tom Steinberg, with Omar Salem, discusses this wider potential. Many have observed that the process of creating scientific knowledge has worked in a similar way for centuries.
OSS/FS is also an example of the incredible value that can result when users have the freedom to tinker (the freedom to understand, discuss, repair, and modify the technological devices they own). Innovations are often created by combining pre-existing components in novel ways, which generally requires that users be able to modify those components. This freedom is, unfortunately, threatened by various laws and regulations such as the U.S. DMCA, and the FCC “broadcast flag”. It’s also threatened by efforts such as “trusted computing” (often called “treacherous computing”), whose goal is to create systems in which external organizations, not computer users, command complete control over a user’s computer (BBC News among others is concerned about this).
Lawrence Lessig’s Code and Other Laws of Cyberspace argues that software code has the same role in cyberspace as law does in realspace. In fact, he simply argues that “code is law”, that is, that as computers are becoming increasingly embedded in our world, what the code does, allows, and prohibits, controls what we may or may not do in a powerful way. In particular he discusses the implications of “open code”.
All of these issues are beyond the scope of this paper, but the referenced materials may help you find more information if you’re interested.
Below is data discussing market share, reliability, performance, scalability, security, and total cost of ownership. I close with a brief discussion of non-quantitative issues, unnecessary fears, OSS/FS on the desktop, usage reports, other sites providing related information, and conclusions. A closing appendix gives more background information about OSS/FS. Each section has many subsections or points. The non-quantitative issues section includes discussions about freedom from control by another (especially a single source), protection from licensing litigation, flexibility, social / moral / ethical issues, and innovation. The unnecessary fears section discusses issues such as support, legal rights, copyright infringement, abandonment, license unenforceability, GPL “infection”, economic non-viability, starving programmers (i.e., the rising commercialization of OSS/FS), compatibility with capitalism, elimination of competition, elimination of “intellectual property”, unavailability of software, importance of source code access, an anti-Microsoft campaign, and what’s the catch. And the appendix discusses definitions of OSS/FS, motivations of developers and developing companies, history, licenses, OSS/FS project management approaches, and forking.
Many people think that a product is only a winner if it has significant market share. This is lemming-like, but there’s some rationale for this: products with big market shares get applications, trained users, and momentum that reduces future risk. Some writers argue against OSS/FS or GNU/Linux as “not being mainstream”, but if their use is widespread then such statements reflect the past, not the present. There’s excellent evidence that OSS/FS has significant market share in numerous markets:
Netcraft’s survey published January 2005 (covering results from December 2004) polled all the web sites they could find (totaling 58,194,836 sites) and found that, counting by name, Apache had 68.43% of the market, Microsoft had 20.86%, Sun had 3.14%, and Zeus had 1.19%. Apache’s share is increasing; all others’ market share is decreasing.
However, many web sites have been created that are simply “placeholder” sites (i.e., their domain names have been reserved but they are not being used); such sites are termed “inactive.” Thus, since 2000, Netcraft has been separately counting “active” web sites. Netcraft’s count of only the active sites is arguably a more relevant figure than counting all web sites, since the count of active sites shows the web server selected by those who choose to actually develop a web site. Apache does extremely well when counting active sites; in their January 2005 survey (covering December 2004), Apache had 69.70% of the web server market, Microsoft had 22.70%, Zeus had 0.89%, and Sun had 0.79%. Apache gained market share; all others lost market share or stayed even. Here is the total market share (by number of active web sites):
Netcraft’s September 2002 survey reported on websites based on their “IP address” instead of the host name; this has the effect of removing computers used to serve multiple sites and sites with multiple names. When counting by IP address, Apache has shown a slow increase from 51% at the start of 2001 to 54%, while Microsoft has been unchanged at 35%. Again, a clear majority.
CNet’s “Apache zooms away from Microsoft’s Web server” summed up the year 2003 noting that “Apache grew far more rapidly in 2003 than its nearest rival, Microsoft’s Internet Information Services (IIS), according to a new survey--meaning that the open-source software remains by far the most widely used Web server on the Internet.” The same happened in 2004; in fact, in December 2004 alone Apache gained a full percentage point over Microsoft’s IIS among the total number of all web sites.
Apache’s dominance in the web server market has been independently confirmed by Security Space - their report on web server market share published January 1, 2005 surveyed 20,725,323 web servers in December 2004 and found that Apache was #1 (74.67%), with Microsoft IIS being #2 (17.92%). E-soft also reports specifically on secure servers (web servers supporting SSL/TLS, such as e-commerce sites); while much closer, Apache still leads with 50.55% market share, as compared to Microsoft’s 40.69%, Netscape/iPlanet’s 2.11%, and Stronghold’s 0.59%. Since Stronghold is a repackaging of Apache, Apache’s real market share is at least 51.14%.
Obviously these figures fluctuate monthly; see Netcraft and E-soft for the latest survey figures.
Therefore, Netcraft developed a technique that indicates the number of actual computers being used as Web servers, together with the OS and web server software used (by arranging many IP addresses to reply to Netcraft simultaneously and then analyzing the responses). This is a statistical approach, so many visits to the site are used over a month to build up sufficient certainty. In some cases, the OS detected is that of a “front” device rather than the web server actually performing the task. Still, Netcraft believes that the error margins world-wide are well within the order of plus or minus 10%, and this is in any case the best available data.
Before presenting the data, it’s important to explain Netcraft’s system for dating the data. Netcraft dates their information based on the web server surveys (not the publication date), and they only report OS summaries from an earlier month. Thus, the survey dated “June 2001” was published in July and covers OS survey results of March 2001, while the survey dated “September 2001” was published in October and covers the operating system survey results of June 2001.
Here’s a summary of Netcraft’s study results:
| OS group | Percentage (March) | Percentage (June) | Composition |
|---|---|---|---|
| Windows | 49.2% | 49.6% | Windows 2000, NT4, NT3, Windows 95, Windows 98 |
| [GNU/]Linux | 28.5% | 29.6% | [GNU/]Linux |
| Solaris | 7.6% | 7.1% | Solaris 2, Solaris 7, Solaris 8 |
| BSD | 6.3% | 6.1% | BSDI BSD/OS, FreeBSD, NetBSD, OpenBSD |
| Other Unix | 2.4% | 2.2% | AIX, Compaq Tru64, HP-UX, IRIX, SCO Unix, SunOS 4 and others |
| Other non-Unix | 2.5% | 2.4% | MacOS, NetWare, proprietary IBM OSes |
| Unknown | 3.6% | 3.0% | not identified by Netcraft OS detector |
Much depends on what you want to measure. Several of the BSDs (FreeBSD, NetBSD, and OpenBSD) are OSS/FS as well; so at least a part of the 6.1% for BSD should be added to GNU/Linux’s 29.6% to determine the percentage of OSS/FS OSes being used as web servers. Thus, it’s likely that approximately one-third of web serving computers use OSS/FS OSes. There are also regional differences, for example, GNU/Linux leads Windows in Germany, Hungary, the Czech Republic, and Poland.
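The “approximately one-third” estimate can be checked with simple bounds. The sketch below uses the June column of the table above. The variable names are illustrative, and because the source does not say how the BSD share splits between the OSS/FS BSDs and the proprietary BSDI BSD/OS, it computes only a range rather than a single figure.

```python
# June figures (percent of web-serving computers) from the Netcraft table above.
gnu_linux = 29.6
bsd_total = 6.1  # covers FreeBSD, NetBSD, OpenBSD and BSDI BSD/OS together

# Assumption: the OSS/FS portion of the BSD share is unknown, so bound it.
oss_low = gnu_linux               # if none of the BSD share were OSS/FS
oss_high = gnu_linux + bsd_total  # if all of the BSD share were OSS/FS

print(f"OSS/FS OS share: between {oss_low:.1f}% and {oss_high:.1f}%")
```

Any split of the BSD share lands between these two bounds, which is consistent with the rough one-third figure.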
Well-known web sites using OSS/FS include Google (GNU/Linux) and Yahoo (FreeBSD).
If you really want to know about the web server market breakdown of “Unix vs. Windows,” you can find that also in this study. All of the various Windows OSes are rolled into a single number (even Windows 95/98 and Windows 2000/NT4/NT3 are merged, although they are fundamentally very different systems). Merging all the Unix-like systems in a similar way produces a total of 44.8% for Unix-like systems (compared to Windows’ 49.2%) in March 2001.
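The Unix-like total can be reproduced by summing the relevant rows of the Netcraft table. This is a minimal sketch: the dictionary layout and the function name `unix_like_total` are illustrative choices, and treating GNU/Linux, Solaris, BSD, and “Other Unix” as the Unix-like groups is the grouping assumed here.

```python
# Netcraft OS shares (percent of web-serving computers), copied from the table above.
shares = {
    "Windows":        {"March": 49.2, "June": 49.6},
    "GNU/Linux":      {"March": 28.5, "June": 29.6},
    "Solaris":        {"March": 7.6,  "June": 7.1},
    "BSD":            {"March": 6.3,  "June": 6.1},
    "Other Unix":     {"March": 2.4,  "June": 2.2},
    "Other non-Unix": {"March": 2.5,  "June": 2.4},
    "Unknown":        {"March": 3.6,  "June": 3.0},
}

# Groups treated as Unix-like for this comparison (an assumption of this sketch).
UNIX_LIKE = ("GNU/Linux", "Solaris", "BSD", "Other Unix")


def unix_like_total(month):
    """Sum the shares of all Unix-like groups for the given month."""
    return sum(shares[group][month] for group in UNIX_LIKE)


for month in ("March", "June"):
    print(f"{month}: Unix-like {unix_like_total(month):.1f}% "
          f"vs. Windows {shares['Windows'][month]:.1f}%")
```

For March the sum is 28.5 + 7.6 + 6.3 + 2.4 = 44.8, matching the figure quoted in the text.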
Note that these figures would probably be quite different if they were based on web addresses instead of physical computers; in such a case, the clear majority of web sites are hosted by Unix-like systems. As stated by Netcraft, “Although Apache running on various Unix systems runs more sites than Windows, Apache is heavily deployed at hosting companies and ISPs who strive to run as many sites as possible on one computer to save costs.”
Here’s how the various OSes fared in the study:
| Operating System | Market Share | Composition |
|---|---|---|
| GNU/Linux | 28.5% | GNU/Linux |
| Windows | 24.4% | All Windows combined (including 95, 98, NT) |
| Sun | 17.7% | Sun Solaris or SunOS |
| BSD | 15.0% | BSD Family (FreeBSD, NetBSD, OpenBSD, BSDI, ...) |
| IRIX | 5.3% | SGI IRIX |
A part of the BSD family is also OSS/FS, so the OSS/FS OS total is even higher; if over 2/3 of the BSDs are OSS/FS, then the total share of OSS/FS would be about 40%. Advocates of Unix-like systems will notice that the majority (around 66%) were running Unix-like systems, while only around 24% ran a Microsoft Windows variant.
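Both estimates in the preceding paragraph follow from the table by simple arithmetic. The sketch below is illustrative only: the two-thirds fraction is the text’s own assumption about the BSD family, and the variable names are invented for this example.

```python
# Market shares (percent) from the table above.
shares = {"GNU/Linux": 28.5, "Windows": 24.4, "Sun": 17.7, "BSD": 15.0, "IRIX": 5.3}

# Assumption taken from the text: at least two-thirds of the BSD share is OSS/FS.
oss_bsd_fraction = 2 / 3
oss_total = shares["GNU/Linux"] + oss_bsd_fraction * shares["BSD"]

# Every listed system except Windows is Unix-like.
unix_like_total = sum(value for name, value in shares.items() if name != "Windows")

print(f"OSS/FS share (lower estimate): {oss_total:.1f}%")
print(f"Unix-like share: {unix_like_total:.1f}%")
```

With exactly two-thirds of the BSD share counted, the OSS/FS total comes to 38.5%. A larger OSS/FS fraction of the BSDs pushes it closer to the “about 40%” quoted above.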
IDC released a similar study on January 17, 2001 titled “Server Operating Environments: 2000 Year in Review”. On the server, Windows accounted for 41% of new server OS sales in 2000, growing by 20% - but GNU/Linux accounted for 27% and grew even faster, by 24%. Other major Unixes had 13%.
IDC’s 2002 report found that Linux held its own in 2001 at 25%. All of this is especially intriguing since GNU/Linux had 0.5% of the market in 1995, according to a Forbes quote of IDC. Data such as these (and the TCO data shown later) have inspired statements such as this one from IT-Director on November 12, 2001: “Linux on the desktop is still too early to call, but on the server it now looks to be unstoppable.”
These measures do not measure all server systems installed that year; some Windows systems are copies that have not been paid for (sometimes called pirated software), and OSS/FS OSes such as GNU/Linux and the BSDs are often downloaded and installed on multiple systems (since it’s legal and free to do so).
Note that a study published October 28, 2002 by the IT analyst company Butler Group concluded that on or before 2009, Linux and Microsoft’s .Net will have fully penetrated the server OS market from file and print servers through to the mainframe.
Evans Data conducted a survey in October 2002. In this survey, they reported “Linux continues to expand its user base. 59% of survey respondents expect to write Linux applications in the next year.”
The survey has two parts, user and vendor. In “Part I : User enterprise”, they surveyed 729 enterprises that use servers. In “Part II : Vendor enterprise”, they surveyed 276 vendor enterprises who supply server computers, including system integrators, software developers, IT service suppliers, and hardware resellers. The most interesting results are those that discuss the use of Linux servers in user enterprises, the support of Linux servers by vendors, and Linux server adoption in system integration projects.
First, the use of Linux servers in user enterprises:
| System | 2002 | 2001 |
|---|---|---|
| Linux server | 64.3% | 35.5% |
| Windows 2000 Server | 59.9% | 37.0% |
| Windows NT Server | 64.3% | 74.2% |
| Commercial Unix server | 37.7% | 31.2% |
And specifically, here’s the average use in 2002:
| System | Ave. units | # samples |
|---|---|---|
| Linux server | 13.4 | N=429 (5.3 in 2001) |
| Windows 2000 Server | 24.6 | N=380 |
| Windows NT Server | 4.5 | N=413 |
| Commercial Unix server | 6.9 | N=233 |
Second, note the support of GNU/Linux servers by vendors:
| System | Year 2002 Support |
|---|---|
| Windows NT/2000 Server | 66.7% |
| Linux server | 49.3% |
| Commercial Unix server | 38.0% |
| Increase of importance in the future | 44.1% |
| Requirement from their customers | 41.2% |
| Major OS in their market | 38.2% |
| Free of licence fee | 37.5% |
| Most reasonable OS for their purpose | 36.0% |
| Open source | 34.6% |
| High reliability | 27.2% |
Third, note the rate of Linux server adoption in system integration projects:
| Project Size (Million Yen) | Linux (2002) | Linux (2001) | Win2000 (2002) | Unix (2002) |
|---|---|---|---|---|
| 0-3 | 62.7% | 65.7% | 53.8% | 15.4% |
| 3-10 | 51.5% | 53.7% | 56.3% | 37.1% |
| 10-50 | 38.3% | 48.9% | 55.8% | 55.8% |
| 50-100 | 39.0% | 20.0% | 45.8% | 74.6% |
| 100+ | 24.4% | 9.1% | 51.1% | 80.0% |
Note that the Japanese Linux white paper 2003 found that 49.3% of IT solution vendors support Linux in Japan.
| Expected GNU/Linux Use | Small Business | Midsize Business | Large Business | Total |
|---|---|---|---|---|
| 50% increase | 21.0% | 16% | 19.0% | 19% |
| 10-25% increase | 30.5% | 42% | 56.5% | 44% |
| No growth | 45.5% | 42% | 24.5% | 36% |
| Reduction | 3.0% | 0% | 0% | 1% |
Sales of GNU/Linux servers increased 63% from 2001 to 2002. This is an increase from $1.3 billion to $2 billion, according to Gartner.
A multitude of studies show that IE is losing market share, while OSS/FS web browsers (particularly Firefox) are gaining market share. The figure above shows web browser market share over time; the red squares are Internet Explorer’s market share (all versions), and the blue circles are the combination of the older Mozilla suite and the newer Mozilla Firefox web browser (both of which are OSS/FS).
OSS/FS web browsers (particularly Firefox) are gradually gaining market share among the general population of web users. By November 1, 2004, Ziff Davis revealed that IE had lost about another percent of the market in only 7 weeks. Chuck Upsdell has combined many data sources and estimates that, as of September 2004, IE has decreased from 94% to 84%, as users switch to other browser families (mainly Gecko); he also believes this downward trend is likely to continue. Information Week reported on March 18, 2005, some results from Net Applications (a maker of Web-monitoring software). Net Applications found that Firefox use rose to 6.17% of the market in February 2005, compared to 5.59% in January 2005. WebSideStory reported in February 2005 that Firefox’s general market share was 5.69% as of February 18, 2005, compared to IE’s 89.85%. OneStat reported on February 28, 2005, that Mozilla-based browsers’ global usage share (or at least Firefox’s) is 8.45%, compared to IE’s 87.28%. Co-founder Niels Brinkman suspects that IE 5 users were upgrading to Firefox, not IE 6, as at least one reason why “global usage share of Mozilla’s Firefox is still increasing and the total global usage share of Microsoft’s Internet Explorer is still decreasing.” The site TheCounter.com reports global statistics about web browsers; February 2005 shows Mozilla-based browsers (including Firefox, but not Netscape) had 6%, while IE 6 had 81% and IE 5 had 8% (89% total for IE). This is significant growth; the August 2004 study of 6 months earlier had Mozilla at 2%, IE 6 at 79%, and IE 5 at 13% (92% for IE). The website quotationspage.com is a popular general-use website; quotationspage statistics of February 2004 and 2005 show a marked rise in the use of OSS/FS browsers. In February 2004, IE had 89.93% while Mozilla-based browsers accounted for 5.29% of browser users; by February 2005, IE had dropped to 76.47% while Mozilla-based browsers (including Firefox) had risen to 14.11%.
Janco Associates also reported Firefox market share data; comparing January 2005 to April 2005, Firefox had jumped from 4.23% to 10.28% of the market (IE dropped from 84.85% to 83.07% in that time, and Mozilla, Netscape, and AOL all lost market share in this time as well according to this survey).
Nielsen/NetRatings’ survey of site visitors found that in June 2004, 795,000 people visited the Firefox website (this was the minimum for their tracking system). There were 2.2 million in January 2005, 1.6 million in February, and 2.6 million people who visited the Firefox website in March 2005. The numbers were also up for Mozilla.org, the website of the Mozilla Foundation (Firefox’s developer).
The growth of OSS/FS web browsers becomes even more impressive when home users are specifically studied. Home users can choose which browser to use, while many business users cannot choose their web browser (it’s selected by the company, and companies are often slow to change). XitiMonitor surveyed a sample of websites used on a Sunday (March 6, 2005), totalling 16,650,993 visits. By surveying on a Sunday, they intended to primarily find out what people choose to use. Of the German users, an astonishing 21.4% were using Firefox. The other countries surveyed were France (12.2%), England (10.9%), Spain (9%), and Italy (8.6%). Here is the original XitiMonitor study of 2005-03-06, an automated translation of the XitiMonitor study, and a blog summary of the XitiMonitor study observing that, “Web sites aiming at the consumer have [no] other choice but [to make] sure that they are compatible with Firefox ... Ignoring compatibility with Firefox and other modern browsers does not make sense business-wise.”
Using this data, we can estimate that 13.3% of European home users were using Firefox on this date in March 2005. How can we get such a figure? Well, we can use these major European countries as representatives of Europe as a whole; they’re certainly representative of western Europe, since they’re the most populous countries. Presuming that the vast majority of Sunday users are home users is quite reasonable for Europe. We can then make the reasonable presumption that the number of web browser users is proportional to the general population. Then we just need to get the countries’ populations; I used the CIA World Fact Book updated to 2005-02-10. These countries’ populations (in millions) are, in the same order as above, 82, 60, 60, 40, and 58; calculating (21.4%*82 + 12.2%*60 + 10.9%*60 + 9%*40 + 8.6%*58) / (82+60+60+40+58) yields 13.3%.
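The population-weighted average above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the dictionary and function names are my own, and the shares and populations are the ones quoted in the text.

```python
# Firefox share (%) among Sunday visitors, per the XitiMonitor figures
# quoted above, and approximate populations in millions from the text.
firefox_share = {"Germany": 21.4, "France": 12.2, "England": 10.9,
                 "Spain": 9.0, "Italy": 8.6}
population_m = {"Germany": 82, "France": 60, "England": 60,
                "Spain": 40, "Italy": 58}


def weighted_share(share, weight):
    """Average the shares, weighting each country by its population."""
    total_weight = sum(weight[country] for country in share)
    weighted_sum = sum(share[country] * weight[country] for country in share)
    return weighted_sum / total_weight


european_estimate = weighted_share(firefox_share, population_m)
print(f"{european_estimate:.1f}%")  # prints 13.3%
```

Weighting by population matters here: a plain average of the five percentages would give Germany’s high figure no more influence than Spain’s, even though Germany has about twice as many people.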
Among leading-edge indicators such as the technically savvy and web developers, the market penetration has been even more rapid and widespread. In one case (Ars Technica), Firefox has become the leading web browser! This is a leading indicator because these are the people developing the web sites you’ll see tomorrow; in many cases, they’ve already switched to OSS/FS web browsers such as Firefox. W3schools is a site dedicated to aiding web developers, and as part of that role it tracks the browsers that web developers use. W3schools found a dramatic shift from July 2003 to September 2004, with IE dropping from 87.2% to 74.8% while Gecko-based browsers (including Netscape 7, Mozilla, and Firefox) rose from 7.2% to 19%. (W3Schools’ current statistics are available.) This trend has continued; as of March 2005 Firefox was still growing in market share, having grown to 21.5% (with an increase every month), while IE was shrinking quickly (IE 6 was down to 64.0% and decreasing every month). CNN found that among its CNET News.com readers, site visitors with OSS/FS browsers jumped up from 8% in January 2004 to 18% by September 2004. Statistics for Engadget.com, which has a technical audience, found that as of September 2004, only 57% used a MS browser and Firefox had rapidly risen to 18%. IT pundits such as PC Magazine’s John C. Dvorak reported even more dramatic slides, with IE dropping to 50% share. InformationWeek reported that on March 30, 2005, 22% of visitors used Firefox, versus 69% who used Internet Explorer. The technical website Ars Technica reported on March 27, 2005, that Firefox was now their #1 browser at 40%, while IE was down to #2 at 30% (vs. 38% in September 2004).
Bloggers, another group of especially active web users (and thus, I believe, another leading indicator), also suggest this is a trend. InformationWeek’s March 30, 2005 article “Firefox Thrives Among Bloggers” specifically discussed this point. InformationWeek reported that on Boing Boing, one of the most popular blog sites, March 2005 statistics show that more of their users use Firefox than Internet Explorer: 35.9% of its visitors use Firefox, compared with 34.5% using Internet Explorer. I checked Boing Boing’s April 2, 2004 statistics; they reported Firefox at 39.1%, IE at 33.8%, Safari at 8.8%, and Mozilla at 4.1%; this means that Firefox plus Mozilla was at 43.2%, significantly beyond IE’s 33.8%. From January 1 through March 9, the Technometria blog found that “Firefox accounted for 28% of browsers compared with 58% for Internet Explorer.” Kottke.org reported on February 27 that 41% of visitors used Mozilla-based browsers (such as Firefox), while 31% used Internet Explorer.
These market share increases are occurring in spite of problems with the data that work against OSS/FS browsers. Some non-IE browsers are configured to lie and use the same identification string as Internet Explorer, even though they aren’t actually IE. Thus, all of these studies understate the true share of non-IE browsers, though the amount of understatement is generally unknown.
In short, efforts such as the grassroots Spread Firefox marketing group seem to have been very effective at convincing people to try out the OSS/FS web browser Firefox. Once people try it, they appear to like it enough to continue using it.
Two key factors seem to be driving this rise: survey respondents indicated that OSS/FS databases are increasing their performance and scalability to the point where they are acceptable for use in corporate enterprise environments, and many organizations have tight IT and database development budgets. Evans found that MySQL, PostgreSQL, and Firebird were popular OSS/FS databases. Evans found Firebird is the most used database among all database programs for ‘edge’ applications, with Microsoft Access as a close second (at 21%). In addition, MySQL and Firebird are locked in a virtual tie in the OSS/FS database space; each is used by just over half of database developers who use OSS/FS databases.
Perhaps the simplest argument that GNU/Linux will have a significant market share is that Sun is modifying its Solaris product to run GNU/Linux applications, and IBM has already announced that GNU/Linux will be the successor of IBM’s own AIX.
There are a lot of anecdotal stories that OSS/FS is more reliable, but finally there is quantitative data confirming that mature OSS/FS programs are often more reliable:
OSS/FS had higher reliability by this measure. It states in section 2.3.1 that:
It is also interesting to compare results of testing the commercial systems to the results from testing “freeware” GNU and Linux. The seven commercial systems in the 1995 study have an average failure rate of 23%, while Linux has a failure rate of 9% and the GNU utilities have a failure rate of only 6%. It is reasonable to ask why a globally scattered group of programmers, with no formal testing support or software engineering standards can produce code that is more reliable (at least, by our measure) than commercially produced code. Even if you consider only the utilities that were available from GNU or Linux, the failure rates for these two systems are better than the other systems.
There is evidence that Windows applications are even less reliable than the proprietary Unix software (and thus less reliable than the OSS/FS software). A later paper published in 2000, “An Empirical Study of the Robustness of Windows NT Applications Using Random Testing”, found that with Windows NT GUI applications, they could crash 21% of the applications they tested, hang an additional 24% of the applications, and could crash or hang all the tested applications when subjecting them to random Win32 messages. Indeed, to get less than 100% of the Windows applications to crash, they had to change the conditions of the test so that certain test patterns were not sent. Thus, there’s no evidence that proprietary Windows software is more reliable than OSS/FS by this measure. Yes, Windows has progressed since that time - but so have the OSS/FS programs.
Although the OSS/FS experiment was done in 1995, and the Windows tests were done in 2000, nothing that’s happened since suggests that proprietary software has become much better than OSS/FS programs since then. Indeed, since 1995 there’s been an increased interest and participation in OSS/FS, resulting in far more “eyeballs” examining and improving the reliability of OSS/FS programs.
The fuzz paper’s authors also found that proprietary software vendors generally didn’t fix the problems identified in an earlier version of their paper (from 1990), and they found that concerning. There was a slight decrease in failure rates between their 1990 and 1995 paper, but many of the flaws they found (and reported) in the proprietary Unix programs were still not fixed 5 years later. In contrast, Scott Maxwell led an effort to remove every flaw identified in the OSS/FS software in the 1995 fuzz paper, and eventually fixed every flaw. Thus, the OSS/FS community’s response shows why, at least in part, OSS/FS programs have such an edge in reliability; if problems are found, they’re often fixed. Even more intriguingly, the person who spearheaded ensuring that these problems were fixed wasn’t an original developer of the programs - a situation only possible with OSS/FS.
Now be careful: OSS/FS is not magic pixie dust; beta software of any kind is still buggy! However, the 1995 experiment compared mature OSS/FS software with mature proprietary software, and the OSS/FS software was more reliable under this measure.
The company used automated tools to look for five kinds of defects in code: memory leaks, null pointer dereferences, bad deallocations, out-of-bounds array accesses, and uninitialized variables. Reasoning found 8 defects in 81,852 source lines of code (SLOC) of the Linux kernel, resulting in a defect density rate of 0.1 defects per KSLOC. In contrast, the three proprietary general-purpose operating systems (two of them versions of Unix) had between 0.6 and 0.7 defects/KSLOC; thus the Linux kernel had a smaller defect rate than all the competing general-purpose operating systems examined. The rates of the two embedded operating systems were 0.1 and 0.3 defects/KSLOC; thus, the Linux kernel had a defect rate better than one embedded operating system, and equivalent to the other.
One issue is that the tool detects issues that may not be true problems. For example, of those 8 defects, one was clearly a bug and had been separately detected and fixed by the developers, and 4 defects clearly had no effect on the running code. None of the defects found were security flaws. To counter this, they also tracked which problems were repaired by the developers of the various products. The Linux kernel did quite well by this measure as well: the Linux kernel had 1 repaired defect out of 81.9 KSLOC, while the proprietary implementations had 235 repaired defects out of 568 KSLOC. This means the Linux kernel had a repair defect rate of 0.013 defects/KSLOC, while the proprietary implementations had a repair defect rate of 0.41 defects/KSLOC.
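For readers who want to check the defect-rate arithmetic, here is a small sketch in Python. The function and variable names are mine, not Reasoning’s; the inputs are the counts quoted above. Note that dividing 1 by 81.9 gives roughly 0.012, slightly below the 0.013 quoted above, which may simply reflect rounding in the original report.

```python
def defects_per_ksloc(defects, sloc):
    """Defect rate per thousand source lines of code (KSLOC)."""
    return defects / (sloc / 1000)


# Counts quoted in the text above.
linux_density = defects_per_ksloc(8, 81_852)             # all reported defects
linux_repaired = defects_per_ksloc(1, 81_900)            # repaired defects only
proprietary_repaired = defects_per_ksloc(235, 568_000)   # repaired defects only

print(f"Linux defect density:      {linux_density:.2f} per KSLOC")
print(f"Linux repaired rate:       {linux_repaired:.3f} per KSLOC")
print(f"Proprietary repaired rate: {proprietary_repaired:.2f} per KSLOC")
print(f"Ratio of repaired rates:   {proprietary_repaired / linux_repaired:.0f}x")
```

Whichever rounding is used, the proprietary implementations’ repaired-defect rate comes out at roughly thirty times that of the Linux kernel.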
CEO Scott Trappe explained this result by noting that the open source model encourages several behaviors that are uncommon in the development of commercial code. First, many users don’t just report bugs, as they would do with [proprietary] software, but actually track them down to their root causes and fix them. Second, many developers are reviewing each other’s code, if only because it is important to understand code before it can be changed or extended. It has long been known that peer review is the most effective way to find defects. Third, the open source model seems to encourage a meritocracy, in which programmers organize themselves around a project based on their contributions. The most effective programmers write the most crucial code, review the contributions of others, and decide which of these contributions make it into the next release. Fourth, open source projects don’t face the same type of resource and time pressures that [proprietary] projects do. Open source projects are rarely developed against a fixed timeline, affording more opportunity for peer review and extensive beta testing before release.
This certainly doesn’t prove that OSS/FS will always be the highest quality, but it clearly shows that OSS/FS can be of high quality.
| Downtime | Apache | Microsoft | Netscape | Other |
|---|---|---|---|---|
| September | 5.21 | 10.41 | 3.85 | 8.72 |
| October | 2.66 | 8.39 | 2.80 | 12.05 |
| November | 1.83 | 14.28 | 3.39 | 6.85 |
| Average | 3.23 | 11.03 | 3.35 | 9.21 |
It’s hard not to notice that Apache (the OSS web server) had the best results over the three-month average (and with better results over time, too). Indeed, Apache’s worst month was better than Microsoft’s best month. The difference between Netscape and Apache is statistically insignificant - but this still shows that the freely-available OSS/FS solution (Apache) has a reliability at least as good as the most reliable proprietary solution.
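The averages in the table, and the claim that Apache’s worst month beat Microsoft’s best month, are easy to verify. The following Python sketch uses the monthly figures from the table above; the variable names are my own.

```python
# Monthly downtime figures (September, October, November) from the table.
downtime = {
    "Apache":    [5.21, 2.66, 1.83],
    "Microsoft": [10.41, 8.39, 14.28],
    "Netscape":  [3.85, 2.80, 3.39],
    "Other":     [8.72, 12.05, 6.85],
}

# Three-month average per server, rounded as in the table.
averages = {name: round(sum(months) / len(months), 2)
            for name, months in downtime.items()}

apache_worst = max(downtime["Apache"])        # Apache's highest downtime
microsoft_best = min(downtime["Microsoft"])   # Microsoft's lowest downtime

print(averages)
print(apache_worst < microsoft_best)  # prints True
```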
The report does state that this might not be solely the fault of the software’s quality, and in particular it noted that several Microsoft IIS sites had short interruptions at the same time each day (suggesting regular restarts). However, this still raises the question - why did the IIS sites require so many regular restarts compared to the Apache sites? Every outage, even if preplanned, results in a service loss (and for e-commerce sites, a potential loss of sales). Presumably, IIS site owners who perform periodic restarts do so because they believe that doing so will improve their IIS systems’ overall reliability. Thus, even with pre-emptive efforts to keep the IIS systems reliable, the IIS systems are less reliable than the Apache-based systems, which simply do not appear to require constant restarting.
As with all surveys, this one has weaknesses, as discussed in Netcraft’s Uptime FAQ. Their techniques for identifying web servers and OSes can be fooled. Only systems for which Netcraft was sent many requests were included in the survey (so it’s not “every site in the world”). Any site that is requested through the “what’s that site running” query form at Netcraft.com is added to the set of sites that are routinely sampled; for performance reasons, Netcraft doesn’t routinely monitor all 22 million sites it knows of. Many OSes don’t provide uptime information and thus can’t be included; this includes AIX, AS/400, Compaq Tru64, DG/UX, MacOS, NetWare, NT3/Windows 95, NT4/Windows 98, OS/2, OS/390, SCO UNIX, Sony NEWS-OS, SunOS 4, and VM. Thus, this uptime counter can only include systems running on BSD/OS, FreeBSD (but not the default configuration in versions 3 and later), recent versions of HP-UX, IRIX, GNU/Linux 2.1 kernel and later (except on Alpha processor based systems), MacOS X, recent versions of NetBSD/OpenBSD, Solaris 2.6 and later, and Windows 2000. Note that Windows NT systems cannot be included in this survey (because their uptimes couldn’t be counted). Windows 2000 systems’ data are included in the source data for this survey, but they have a different problem. Windows 2000 had little hope of being included in the August 2001 list, because the 50th system in the list had an uptime of 661 days, and Windows 2000 had only been launched about 17 months (about 510 days) earlier. Note that HP-UX, GNU/Linux (usually), Solaris and recent releases of FreeBSD cycle back to zero after 497 days, exactly as if the machine had been rebooted at that precise point. Thus it is not possible to see an HP-UX, GNU/Linux (usually), or Solaris system with an uptime measurement above 497 days, and in fact their uptimes can be misleading (they may be up for a long time, yet not show it).
There is yet one other weakness: if a computer switches operating systems later, the long uptime is credited to the new OS. Still, this survey does compare Windows 2000, GNU/Linux (up to 497 days usually), FreeBSD, and several other OSes, and OSS/FS does quite well.
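The 497-day limit is not explained in the survey text quoted here, but it is consistent with an uptime counter stored in 32 bits and incremented 100 times per second. The Python sketch below works through that arithmetic; the counter width and tick rate are my assumptions, not figures from the survey, and the function name is hypothetical.

```python
TICKS_PER_SECOND = 100   # assumed timer frequency (not stated in the text)
COUNTER_BITS = 32        # assumed counter width (not stated in the text)
SECONDS_PER_DAY = 86_400

# Days until a counter of that width overflows and restarts from zero.
wrap_days = 2**COUNTER_BITS / TICKS_PER_SECOND / SECONDS_PER_DAY


def reported_uptime_days(true_uptime_days):
    """Uptime an observer would see if the counter silently wraps to zero."""
    return true_uptime_days % wrap_days


print(int(wrap_days))  # prints 497
# A machine that has really been up 661 days would report far less:
print(round(reported_uptime_days(661)))
```

Under these assumptions, a system affected by the wraparound could never appear in a list whose 50th entry has an uptime of 661 days, no matter how long it had actually been running.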
It could be argued that perhaps systems on the Internet that haven’t been rebooted for such a long time might be insignificant, half-forgotten systems. For example, it’s possible that security patches aren’t being regularly applied, so such long uptimes are not necessarily good things. However, a counter-argument is that Unix and Linux systems don’t need to be rebooted as often for a security update, and this is a valuable attribute for a system to have. Even if you accepted that unproven claim, it’s certainly true that there are half-forgotten Windows systems, too, and they didn’t do so well. Also, only systems someone specifically asked for information about were included in the uptime survey, which would limit the number of insignificant or half-forgotten systems.
At the very least, Unix and Linux are able to quantitatively demonstrate longer uptimes than their Windows competitors can, so Unix and Linux have significantly better evidence of their reliability than Windows.
They examined the Linux kernel (developed as an OSS/FS product), the original Mozilla web browser (developed as a proprietary product), and then the evolution of Mozilla after it became OSS/FS. They found “significant differences in their designs”; Linux possessed a more modular architecture than the original proprietary Mozilla, and the redesigned OSS/FS Mozilla had a more modular structure than both.
To measure design modularity, they used a technique called Design Structure Matrices (DSMs) that identified dependencies between different design elements (in this case, between files, where calling a function/method of another file creates a dependency). They used two different measures using DSMs, which produced agreeing results.
The first measure they computed is a simple one, called “change cost”. This measures the percentage of elements affected, on average, when a change is made to one element in the system. A smaller value is better, since as this value gets larger, it becomes increasingly likely that a change will impact a larger number of other components and have unintended consequences. This measure isn’t that sensitive to the size of a system (see their exhibit 7), though obviously as a program gets larger that percentage implies a larger number of components. When Mozilla was developed as a proprietary product, and initially released as OSS/FS, it had the large value of 17.35%. This means that if a given file is changed, on average, 17.35% of the other files in the system depend (directly or indirectly) on that file. After gaining some familiarity with the code, the OSS/FS developers decided to improve its design between 1998-10-08 and 1998-12-11. Once the redesign was complete, the change cost dramatically decreased down to 2.78%, as you can see:
| Program | Change Cost |
|---|---|
| Mozilla-1998-04-08 | 17.35% |
| Mozilla-1998-10-08 | 18.00% |
| Mozilla-1998-12-11 | 2.78% |
| Mozilla-1999 | 3.80% |
| Linux-2.1.88 | 3.72% |
| Linux-2.1.105 | 5.16% |
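To make the idea of change cost concrete, here is a minimal Python sketch that computes it for a toy set of files. It follows the description above (direct plus indirect dependencies between files), but the function name, the input format, and the choice to count only other files and divide by n*(n-1) are my assumptions; the researchers’ exact normalization may differ.

```python
def change_cost(n_files, dependencies):
    """Percentage of (file, other file) pairs linked by a direct or
    indirect dependency.

    dependencies is a list of (a, b) pairs meaning "file a depends on
    file b", with files numbered 0 .. n_files - 1.
    """
    if n_files < 2:
        return 0.0
    # reach[a][b] is True when a depends on b, directly or indirectly.
    reach = [[False] * n_files for _ in range(n_files)]
    for a, b in dependencies:
        reach[a][b] = True
    # Warshall's algorithm: add indirect dependencies through each file k.
    for k in range(n_files):
        for i in range(n_files):
            if reach[i][k]:
                for j in range(n_files):
                    if reach[k][j]:
                        reach[i][j] = True
    # Count dependent pairs, ignoring a file's dependence on itself.
    linked = sum(1 for i in range(n_files) for j in range(n_files)
                 if i != j and reach[i][j])
    return 100.0 * linked / (n_files * (n_files - 1))


# Four files in a chain: 0 depends on 1, 1 on 2, 2 on 3.
print(change_cost(4, [(0, 1), (1, 2), (2, 3)]))  # prints 50.0
# Four files where 0, 1, and 2 each depend only on file 3.
print(change_cost(4, [(0, 3), (1, 3), (2, 3)]))  # prints 25.0
```

Both toy designs have three direct dependencies, yet the chained design has twice the change cost, because indirect dependencies let a change ripple further. That is the effect a more modular design is meant to limit.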
Change cost is a fairly crude measure, though; it doesn’t take into account the amount of dependency (measured, say, as the number of calls from one file to another), and it doesn’t take clustering into account (a good design should minimize the communication between clusters more than communication in general). Thus, they computed “coordination cost,” an estimated cost of communicating information between agents developing each cluster. This measure is strongly dependent on the size of the system - after all, it’s easier to coordinate smaller projects. Thus, to use this as a measure of the quality of a design compared to another project, the sizes must be similar (in this case, by the number of files). The numbers are unitless, but smaller costs are better. The researchers identified different circumstances with similar sizes, so that the numbers could be compared. The following table compares Mozilla 1998-04-08 (built almost entirely by proprietary means) and Mozilla 1998-12-11 (just after the redesign by OSS/FS developers) with Linux 2.1.105 (built by OSS/FS processes):
| | Linux 2.1.105 | Mozilla 1998-04-08 | Mozilla 1998-12-11 |
|---|---|---|---|
| Number of Source files | 1678 | 1684 | 1508 |
| Coordination Cost | 20,918,992 | 30,537,703 | 10,234,903 |