
LSP status report for March and April 1999

The primary goal of this research is to improve the scalability and robustness of the Linux operating system to support greater network server workloads more reliably. We are specifically interested in single-system scalability, performance, and reliability of network server infrastructure products running on Linux, such as LDAP directory servers, IMAP electronic mail servers, and web servers, among others.

Summary

Several performance-related kernel issues were detected and fixed during this two-month period, and as a result our visibility in the Linux community is improving. We made more funding contacts, and work continues on long-term projects.

Milestones

  • We've changed our name to better reflect our goals rather than our sponsors. The project's new name is the "Linux Scalability Project."
  • NLNet's Teus Hagan paid a site visit to CITI in mid-April to discuss possibly funding part of the Linux Scalability Project. An all-day meeting resulted in some new ideas for the project, and Teus plans to work with Peter to re-write the previous proposal as something that is likely to be accepted by NLNet's board of directors.
  • Peter has been working with Netscape, Intel, and U-M Legal to finish the paperwork required for the materials transfer of a four-way Pentium III machine for our testbed.
  • The AOL-Netscape merger was completed in late March. Tim is working on relocating the project into the Sun-Netscape Alliance under Claire Hough's organization, since he has taken a position in AOL's technology office.
  • With the help of kernel developer Andrea Arcangeli, Chuck found a kernel bug that was causing a massive buffer leak. As a result, a complete fix is included in kernel 2.2.7. A more detailed report is available.
  • Chuck produced a kernel hash table analysis report that describes kernel tuning changes that can improve performance by 5% to 10% on large-memory machines. The report will be submitted to the Linux kernel development community for consideration. The kernel has already picked up some of this work; see Linux kernel 2.2.7.
  • Peter and Chuck have been working on recruiting student interns for summer projects related to our scalability efforts. We've identified several areas where an intern could make significant progress in three months, and Peter has contacted some candidates.
  • Niels has continued to refine his poll() hinting kernel modification. Preliminary observations indicate that the patch yields substantial savings in CPU utilization when an application is maintaining many active connections via poll().
  • Niels' earlier work on making poll() scale for thousands of file descriptors has been included in the AC kernel series, which will be merged eventually with the mainline kernel source.
  • Discussion has begun on the linux-perf and linux-kernel mailing lists regarding the creation of a complete performance monitoring facility for Linux. This facility is expected to be more capable than sar.
  • No further progress has been made on determining an appropriate boilerplate and license for code created by Chuck and released by the project into the public domain.
  • Kernel developers have identified several areas where server platform performance improvement is likely when the development tree is opened again. We'll keep a close eye on some of these changes, which include: more efficient implementation of fsync(), fewer copy operations when flushing dirty pages, support for more than 1G of physical memory, and finer-grained SMP locking in the VM logic.
  • Chuck and Naomaru met with researchers working on the IO-Lite effort at Rice. They have also created a new web server for studying the performance impact of different methods of threading and caching. The Rice researchers are willing to share some of their work with us at CITI.
  • Niels, Peter, and Chuck plan to be present at the USENIX technical conference occurring in Monterey, California, in early June. At that time, Peter and Chuck will meet with Claire Hough, and possibly others at Netscape, to introduce ourselves and our project to members of the Sun-Netscape Alliance. Peter and Chuck will also be attending the Linux Expo conference in Raleigh, NC, in mid-May.
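
For readers unfamiliar with the pattern behind the poll() milestones above, here is a minimal user-space sketch of an application watching many connections with poll(). It is not Niels' kernel patch and says nothing about how that patch works; the function name poll_ready, its parameters, and the use of local socket pairs in place of real network connections are all assumptions made for illustration. It only shows why the cost matters: every registered descriptor is handed to the kernel on each call, even when only a few are active.

```python
import select
import socket


def poll_ready(num_connections, active, timeout_ms=100):
    """Illustrative only: open num_connections local socket pairs, make the
    pairs listed in `active` readable, and return the indices that poll()
    reports as ready for reading."""
    pairs = [socket.socketpair() for _ in range(num_connections)]
    try:
        poller = select.poll()
        fd_to_index = {}
        # Register the reading end of every pair; the kernel examines
        # all of these descriptors on each poll() call.
        for index, (reader, _writer) in enumerate(pairs):
            poller.register(reader.fileno(), select.POLLIN)
            fd_to_index[reader.fileno()] = index
        # Simulate traffic on a small subset of the connections.
        for index in active:
            pairs[index][1].send(b"x")
        events = poller.poll(timeout_ms)  # timeout is in milliseconds
        return sorted(
            fd_to_index[fd] for fd, mask in events if mask & select.POLLIN
        )
    finally:
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    # 200 registered connections, only three of them active.
    print(poll_ready(200, [3, 57, 199]))
```

In this sketch the application pays for 200 registered descriptors to learn about three ready ones, which is the kind of overhead that grows as a server holds thousands of mostly idle connections.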

Challenges

Mindcraft, and the truth about Linux performance

Recently, Mindcraft, an independent testing service, released a report, sponsored by Microsoft, comparing the performance of Windows NT and Linux when serving files via Samba and web pages from a Pentium-based SMP hardware platform. The comparison was not favorable to Linux. The Linux community immediately examined the results and found several inexcusable biases. There is also an ongoing discussion (argument?) between several Linux kernel developers and Mark Russinovich, a well-known expert in Windows NT performance. The greatest concern to the community is dispelling FUD -- Fear, Uncertainty, and Doubt -- one of the most oft-wielded weapons that Microsoft, and many others, have in their arsenal.

It is important for those of us who wish to develop and successfully market products that run on the Linux platform not only to understand the real performance issues, but also to see clearly through FUD tactics and to combat them with well-researched, reasonable, and clear explanations of the issues.

So often, FUD inspires passionate arguments among technical folk that can't be followed by the people who are the real targets of FUD -- managers and purchasers. Managers and purchasers often ignore these arguments because they don't understand them, and thus FUD goes unchallenged. Linux itself doesn't have a marketing machine, so it is vulnerable to this kind of attack.

Of course, most managers worth their salt have one or two technical people they trust to explain these issues in terms they can understand. It is likely that this is how Linux has made inroads into enterprise cultures in the first place. This is probably the best way to begin injecting real information about Linux.

I'm not suggesting counter-FUD (spreading misinformation about NT) or covering up Linux's problems, most of which are already familiar to Linux kernel developers. However, clearly identifying known problems and stating plans for fixing them would probably be very reassuring to technologists and managers who are responsible for running enterprise infrastructure. The Linux community should provide a common location for advertising these issues and diagramming the community's response to them. As well, projects like ours should provide as many publicly accessible and scientifically valid performance studies as we can. And finally, software companies like Netscape should become directly involved in understanding these issues, and training their sales staff to respond to them accurately when customer questions arise.

Managing Change: Meeting our goals during organizational and personnel changes

In the past two months, our project has muddled forward in the midst of grand organizational change at Netscape, as well as expansion and change of the project itself. The Linux Scalability Project is now attached to the Sun-Netscape Alliance, within AOL. Tim Howes, one of the original authors of the project, is now spending most of his time with the AOL technology office. Our local cadre of students will be changing shape as Spring-Summer term begins here at U-M. In addition, more sponsors are becoming involved with the project.

Teus Hagan of NLNet warned of the dangers when he visited with us in April: as more sponsors become involved, it is possible that they will all be interested in different and incompatible goals. It was difficult to get agreement among the original sponsors and participants about our project work scope. It is critical to the goals and identity of this project to be sure that new people and organizations who become involved with the project in the near future have a good grasp of our goals.

Performance graphs

Work on the Linux buffer cache resulted in a fix that prevented a significant buffer leak (and the resulting low-memory scenario). The first graph shows eight consecutive runs without the fix; the second graph demonstrates the improvement in benchmark performance with our fix.


If you have comments or suggestions, email linux-scalability @ citi.umich.edu

for info, email info@citi.umich.edu or call +1 (734) 763-2929.
Copyright © 1999 University of Michigan Regents. All rights reserved.