Archive for the ‘Mozilla’ Category

“Vlad and analysis of dtrace was used”

December 4, 2007

(title from the Google translation of a Japanese blog [edit: it is a technology news site, not a blog] about Firefox memory fragmentation)

Using dtrace and some tools that we’ve built I’ve been able to get more fragmentation data. I haven’t hooked up all the allocators yet — Vlad just made some changes that will make getting data from a bunch of the other allocators much easier, so expect more data soon.

Let’s compare the Windows standard heap, the Windows Low-Fragmentation Heap, nedmalloc and jemalloc. I’ve posted pictures from the two Windows allocators before, but here they are again:

Windows standard heap — Small and compact, but very fragmented:

Windows Standard Heap

The Windows Low-Fragmentation Heap — bigger and less fragmented:

Windows Low-Fragmentation Heap

nedmalloc — faster but more fragmented than the Windows LFH:

nedmalloc

jemalloc — faster, smaller than Windows LFH, and less fragmented:

jemalloc

So far, I’m seeing that tcmalloc (the latest version, which I’m told is slower than previous releases) and nedmalloc are both about 10% faster at pure allocations than the Windows heaps (which are about the same speed). jemalloc looks to be also about 10% faster but I ran my tests on a different machine and need to verify my numbers before making a strong claim about its speed.

jemalloc looks to be a pretty solid contender. Jason Evans, jemalloc’s author, has been super helpful in answering lots of questions I’ve had and has done some investigating of his own. I’ll hold off declaring a winner until I’ve had time to run with a few more allocators, but the data is showing that we can get good wins by switching allocators. I’m also in the process of generating different sets of logs to run against the allocators so we can see how they behave with stress tests such as loading 200 tabs and then closing them all.

On a side but related note, Olli Pettay and Jonas Sicking are doing lots of great work on moving content nodes and related data into arenas.

malloc replacements?

November 21, 2007

We’ve built some great tools lately, including one to test fragmentation of different allocators. I’m currently in the process of hooking up allocators such as tcmalloc, nedmalloc, Hoard, and jemalloc, as well as native platform-specific ones such as the Windows low-fragmentation heap. I’m having to dig into some of their internals to pull out the data that we need, which is taking a bit of time, but things are progressing well.

If anyone knows of other allocators we should be looking at, would you please leave a comment?  I would like to make sure we’re comparing all of our options.

Firefox 3 Beta 1

November 19, 2007

Beta 1 is out. Hurray!

I’ve been working on Firefox 3 stuff for over two and a half years now. Back in March 2005, after GDC, I started to put together what would become the new graphics API in Gecko (“Thebes”). Ignoring the alphas, this is the first real release that includes the graphics, text, fonts, memory, and performance work that I’ve done for the last while. Things are coming together well.

There are plenty of new features, but I’ll admit I’m pretty excited about two core platform things:

  1. Vastly improved font selection and complex script support.
    - I’ll be going into more detail than you probably wanted to know on the improvements here soon.
  2. Color profile support for images and CSS colors
    - This is disabled by default currently. We still need to add prefs for it. Set gfx.color_management.enabled to true in about:config and restart the browser to enable.

Both of these are fairly subtle improvements, but each should make a big difference to large sets of people.

While I’m really excited about this milestone, I think the memory work currently going on will make the next couple betas that much better.

Less fragmentation coming in Firefox 3

November 15, 2007

I’ve had a lot of people ask whether the memory improvements that I’ve been doing recently will make it into Firefox 3 or if they’ll only appear in a later release.

The basic answer: many of these fixes will be included in Firefox 3, but not all of them.

The more complicated answer is that we’re still analyzing the problem and working on solutions. At this point, we’re still digging through the data and finding hotspots. We’ve already identified quite a few places where we will be able to make improvements — some big and some small — and we’re evaluating each for overall invasiveness and impact so we can make the best decisions possible about how and where to implement these fixes.

As we’re already in the beta phase for Firefox 3 we have to be very careful not to add too much risk to the process, so we’re prioritizing memory improvements to get the biggest improvements for the least additional risk. This isn’t to say that we won’t work on fixes with higher potential risk, but we do have to be very careful. What this means is that we won’t be able to address every single issue, but should be able to knock out the big ones in time for Firefox 3.

Eliminating memory fragmentation entirely is almost impossible. We’ve got some amazing tools built now to debug the issues and test our progress. We’ve got several big issues on our radar that we believe will give us big wins. The current plan of attack is to reduce the number of allocations, group allocations of similar lifetimes together into pools, move allocations of similar sizes into their own areas in memory, and look at general malloc replacement solutions. We’re looking at all of these things in parallel and have some data on each but not enough to report anything useful yet. I hope to have some good data on each of these areas by early next week, and the tools will let us show visually how we’re improving.

With as many of these fixes going in to Firefox 3 as possible, Firefox 3 should provide significant improvement in long term memory use over previous versions.

Leaks? Memory? We never forgot about you.

November 14, 2007

I’ve seen quite a few posts lately based on the memory fragmentation work that I’m doing with titles such as “Fixing Firefox’s memory issue becomes a priority.” Others have claimed that this work is a result of Mozilla’s new focus on mobile. While I’m glad that people are paying attention to our memory work and offering great suggestions, let me say: Memory issues have always been a priority.

Since I started working on the project in 1998 we’ve always had a focus on keeping our memory footprint small and keeping leaks to a minimum. Early in the development cycle for each release we’ve set goals for memory and performance. We always set our bar under the previous release. I’ve found that developing desktop software is a pretty constant balancing act between performance and memory use. We’re always making trade-offs and we try our best to choose the things that will work best for the largest set of users.

Some examples:

  • Back in 2001 when I rebuilt our imaging library, I made several decisions to use more memory to store images, which results in faster rendering; optimizing for memory use reduces the speed at which we can display pages. We’ve looked at these issues many times over the years to make sure they were still correct. Recently we’ve adjusted that behavior to not keep full uncompressed images around as long, which will result in memory savings but will cause initial scrolling to be a bit slower on documents you haven’t accessed in a while.
  • In Firefox 1.5 we added a feature called the back/forward cache, which keeps documents that you’ve recently navigated away from in memory. This was done to significantly speed up hitting the back button. It worked great but caused us to use a bit more memory. We made sure that it was only using memory your computer wasn’t already using, but again, it’s an example of a trade-off. We started off with a pretty high number of pages that we kept in the cache and have continued to adjust that number to keep a more limited set of pages to help prevent unnecessary bloating. We’ve also started expiring these pages over time so that you don’t keep pages around you probably aren’t going to use.

With the popularity of extensions rising, we started hearing complaints about memory leaks. We took these reports pretty seriously and have spent a lot of time investigating what is going on. A lot of work has gone in over the last few years to reduce these leaks. Most of our early testing was around the browser, without extensions. Our investigations showed that certain extensions caused a pretty bizarre class of leaks that were pretty difficult to fix given the architecture in Firefox 2. We fixed as many of them as we could in Firefox 2 but some we were unable to fix. In Firefox 3 some Really Smart People (graydon, peterv, dbaron, etc.) have built this thing called the cycle collector into the core, which addresses many of the leaks that we were seeing from extensions (as well as leaks from other places that were of the same class). Our extensive testing shows an occasional leak here and there and we are working to fix those, but in general we aren’t seeing many leaks anymore.

It is only after we’ve gone through so many leak fixes and done so many other memory reduction fixes that we’ve needed to take a deeper look at what is going on under the hood. We’ve long had suspicions that we were being hurt by memory fragmentation, but it wasn’t until recently that we had built good tools to fully diagnose the problem.

I’ll assert here that the way people use their browsers has changed. When Gecko was originally designed back around 1998 people had one, maybe two browser windows open without tabs, and they certainly didn’t have any extensions installed. I look at my browser windows now and I’ve got 3 browser windows open with a total of about 20 tabs open. That is 10x the number of documents open at once!

With the change in how people use their browsers, there is no doubt that they’re going to use more memory. We’re doing everything we can to minimize the impact of having lots of documents open. Many people are trying to shave off bytes here and there. Just in the last week we’ve removed over 200 thousand allocations just from startup and first page load. We’ve got a great community and people eager to solve these problems. We’re now equipped with data and ready to fight this battle.

Stay tuned…

Allocation Data

November 13, 2007

Lots of people have asked where most of the allocations in Mozilla come from. I’ve gzipped some dtrace output that shows the number of calls per allocation size for each call stack, 5 frames deep. Note: This log only shows allocations <= 2048 bytes. This data is pretty raw, but if people want to take a look at it and see if they have ideas for how to improve some of the code paths in question, that would be great.

An example:

  libSystem.B.dylib`malloc+0x37
  XUL`nsStringBuffer::Alloc(unsigned long)+0x15
  XUL`nsACString_internal::MutatePrep(unsigned int, char**,
                                      unsigned int*)+0xce
  XUL`nsACString_internal::ReplacePrep(unsigned int,
                                       unsigned int,
                                       unsigned int)+0x46
  XUL`nsACString_internal::Assign(char const*, unsigned int)+0xc8

       value  ------------- Distribution ------------- count
           4 |                                         0
           8 |@@@@@@@@@@@@@                            9724
          16 |@@@@@@@@@@@@@@@@                         11940
          32 |@@@@@@                                   4359
          64 |@                                        887
         128 |@@                                       1604
         256 |@                                        497
         512 |@                                        676
        1024 |                                         47
        2048 |                                         0

This shows that there are 9724 8 byte allocations, 11940 16 byte ones and so on.
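
A note on reading the table: it looks like the output of DTrace’s quantize() aggregation (an assumption; the post doesn’t say which aggregation was used). If so, each row counts values from that row’s value up to, but not including, the next power of two, so the “8” row would cover allocations of 8 to 15 bytes. A minimal sketch of that bucketing, where `bucket_for` is a made-up helper name and not part of dtrace:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the largest power of two that is <= size,
// i.e. the row of a power-of-two histogram that a positive size lands in.
std::size_t bucket_for(std::size_t size) {
    assert(size > 0);  // zero has no power-of-two bucket in this sketch
    std::size_t bucket = 1;
    // Double the bucket while the next power of two still fits under size.
    while (bucket <= size / 2) {
        bucket *= 2;
    }
    return bucket;
}
```

With this, `bucket_for(8)` and `bucket_for(15)` both land in the 8 row, while `bucket_for(16)` moves to the 16 row.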

Things to look for include:

  • Things with lots of allocations
  • Things that could be stack allocated to avoid memory churn
  • Things whose lifetime is well understood that we could put into pools
  • etc…

Edit: I’ve also posted another log with 8 frame deep stacks as well as a log that only includes allocations post-startup (also 8 frames deep).

Windows Low Fragmentation Heap builds

November 13, 2007

As promised, some Minefield beta 1 Windows builds with the low fragmentation heap turned on.

Installer
Zip

These are built off of the beta1 branch. Beta 1 isn’t out, so this isn’t final; use at your own risk, blah blah blah.

I am curious to hear how memory use in these builds compares to other builds for people.

I’m working on putting together an extension (maybe updating RAMBack) to let you produce pretty heap images but I’m not there yet. Stay tuned for that.

Windows Low Fragmentation Heap

November 11, 2007

Be sure that you’ve read my previous blog post about memory fragmentation before reading this.

I previously stated that I had tried to use the Windows Low-fragmentation Heap and hadn’t seen much difference. I realized that I wasn’t setting it on all the heaps.

I’ve added the following code to the top of main():

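  // Collect the handles of every heap in the process.
  HANDLE heaps[1025];
  DWORD nheaps = GetProcessHeaps(1024, heaps);
  if (nheaps > 1024)
    nheaps = 1024;  // clamp so the loop never reads past the buffer

  for (DWORD i = 0; i < nheaps; i++) {
    // A value of 2 asks for the low-fragmentation heap.
    ULONG HeapFragValue = 2;
    HeapSetInformation(heaps[i],
                       HeapCompatibilityInformation,
                       &HeapFragValue,
                       sizeof(HeapFragValue));
  }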

What do I see? Well, not quite what I expected.

After startup, loading about:blank:

This represents:
total: 19,247,554 bytes
used: 13,381,970 bytes
free: 5,865,584 bytes

This looks like about 1mb in used blocks more than without the LFH turned on. Notice though that we have 4mb more free blocks than with LFH off.

I repeated the same steps I followed in my last blog post: loading TripAdvisor, clicking check rates, waiting for pages to load, closing them all, and then loading about:blank. After clearing caches, I see:

This represents:
total: 54,253,757 bytes
used: 26,337,381 bytes
free: 27,916,376 bytes

Whoa! What happened here? We’re now using 14mb when we loaded the browser and have 27mb of free blocks on our heap! While we are less fragmented than without the low fragmentation heap, we’re using quite a bit more memory. There seems to be a lot of overhead to using the low fragmentation heap. Being far less fragmented means that future allocations will be a lot easier and we won’t have to keep putting them at the end.
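
To put the two snapshots above side by side, here is a small sketch that turns the used/free byte counts into a percentage of the heap sitting in free blocks. The struct and function names are made up for illustration; only the byte counts come from the measurements above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of one heap snapshot; the field names are illustrative.
struct HeapSnapshot {
    std::uint64_t used_bytes;
    std::uint64_t free_bytes;
};

// Total heap size is simply used plus free.
std::uint64_t total_bytes(const HeapSnapshot& s) {
    return s.used_bytes + s.free_bytes;
}

// Percentage of the heap that is sitting in free blocks, rounded down.
std::uint64_t free_percent(const HeapSnapshot& s) {
    assert(total_bytes(s) > 0);  // an empty heap has no meaningful ratio
    return s.free_bytes * 100 / total_bytes(s);
}
```

Fed the numbers above, `free_percent({13381970, 5865584})` gives 30 for the startup snapshot and `free_percent({26337381, 27916376})` gives 51 after the TripAdvisor run, so just over half of the LFH-enabled heap is free blocks at that point.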

“What does this mean after running for a while?”

A fine question, really. It certainly means that early on you’re going to be using a lot more memory. However, over time you may end up using less memory as you will be less fragmented and thus new allocations as you load new pages should be able to fill into the free space more easily. From my quick testing, loading lots of pages, gmail, etc., we don’t seem to grow much more than where we are at this state.

“If it doesn’t grow much more that sounds awesome, how do I try it?”

Watch this space. I’ll upload a build that people can test with and report back their results. If we really don’t grow much more than the initial growth from loading lots of pages at once, this might be worth turning on.

“I hate Windows, what about Mac and Linux?”

If this does turn out to be a good thing on Windows, it certainly doesn’t solve all our problems. There is lots of work that we need to do to continue to reduce fragmentation. As you can see, there is still some fragmentation going on and we need to work to reduce it. I’m a bit scared of the amount of extra space we’re using with this enabled and would like to better understand what is going on. We’re looking at using cross-platform allocators, pools and arenas, etc. More on those results soon…

Memory fragmentation

November 10, 2007

I’ve been doing a lot of work trying to figure out why after loading a lot of pages much of your memory seems to disappear. I’ve tested all sorts of things: disabling extensions, plugins, images, etc. I’ve run leak tools over and over looking for things we might be leaking. Occasionally I’ll find something small we’re actually leaking but more often than not I don’t see any real leaks. This led me to wonder where our memory went. Firefox has a lot of caches internally for performance reasons. These include things like the back/forward cache (which helps speed up loading pages when you hit back), the image cache (keeps images in memory to help load them faster), font cache, textrun cache (short lived, but used to cache computed glyph indices and metrics and such), etc. We also introduced in Gecko 1.9 the cycle collector which hopes to avoid cycles in XPCOM objects that we might hit. We’ve also got the JS garbage collector. All of these things mean we could be holding on to a bunch of objects that could be taking up space so we want to eliminate those from the picture. I released the RAMBack extension earlier this week which clears most of these things.

So, if it is none of these things, what is going on? Why after a while do we end up using more memory than we should be if we aren’t leaking and our caches are clear? At least part of it seems to be due to memory fragmentation.

Let me give you some examples (with pictures!):

Loading the browser with about:blank as my homepage:

This represents a heap size of 12,589,696. This is made up of a total of 11,483,864 bytes of used blocks and 1,105,832 bytes of free blocks in varying sizes.

Each block in the image represents 4096 bytes in memory. Things range from solid black which are completely used to white which are mostly free.
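
As a rough idea of how a picture like this could be derived (this is not the tool used to produce these images, and the names are made up for illustration): given the offset and size of every used block, add up the used bytes that fall inside each 4096-byte page, then shade each square by that count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kPageSize = 4096;  // bytes per square, as in the post

// Hypothetical description of one used block: offset from heap start, length.
struct UsedBlock {
    std::size_t offset;
    std::size_t size;
};

// Returns the number of used bytes in each 4096-byte page of a heap that is
// heap_size bytes long. 4096 would be drawn solid black, 0 as a free page.
std::vector<std::size_t> used_bytes_per_page(
        std::size_t heap_size, const std::vector<UsedBlock>& blocks) {
    std::vector<std::size_t> pages((heap_size + kPageSize - 1) / kPageSize, 0);
    for (const UsedBlock& b : blocks) {
        assert(b.offset + b.size <= heap_size);  // block must lie in the heap
        std::size_t pos = b.offset;
        const std::size_t end = b.offset + b.size;
        while (pos < end) {
            // Credit only the part of the block that falls inside this page.
            const std::size_t page = pos / kPageSize;
            const std::size_t page_end = (page + 1) * kPageSize;
            const std::size_t chunk = std::min(end, page_end) - pos;
            pages[page] += chunk;
            pos += chunk;
        }
    }
    return pages;
}
```

A block that straddles a page boundary is split between the two pages it touches, which is why a few tiny used blocks are enough to keep a page from ever being completely free.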


Loading a bunch of windows and closing them and clearing my caches

Although you can get similar results on many sites, schrep gave me this TripAdvisor hotel search page which opens up lots of windows with lots of pages. To generate this image, I loaded the URL, waited for all of the pages to open, closed them all, loaded about:blank, and then ran RAMBack. At the end of that, here is the result:

Our heap is now 29,999,872 bytes! 16,118,072 of that is used (up 4,634,208 bytes from before… which caches am I forgetting to clear?). The rest, a whopping 13,881,800 bytes, is in free blocks! These are mostly scattered in between tiny used blocks. This is bad.

Light green blocks are completely free pages. I’ve highlighted those because the OS could page them out if it wanted to. You’ll notice there aren’t very many light green squares…

So.. What does this mean?

Well, it means that any allocations >4k are going at the end because we can’t really fit them anywhere earlier. This is bad for a variety of reasons including performance. It makes it very difficult for us to get big chunks of contiguous memory to give back to the OS. This makes us look big!
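
A toy illustration of why large requests end up at the end of the heap. It assumes a simple first-fit search over the free gaps, which is a simplification and not a description of how any particular allocator actually works; `first_fit` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first free gap that can hold `request` bytes, or
// -1 if none can, in which case a simple allocator has to grow the heap at
// the end instead of reusing the space it already has.
int first_fit(const std::vector<std::size_t>& free_gaps, std::size_t request) {
    assert(request > 0);
    for (std::size_t i = 0; i < free_gaps.size(); ++i) {
        if (free_gaps[i] >= request) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With gaps of 1024, 96, 2048, 640 and 3000 bytes there are 6808 free bytes in total, yet a 4096-byte request fits in none of them, so `first_fit` returns -1 and the heap has to grow.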

Yeah, duh, I already knew fragmentation was bad.. Now what?

Well, there are many things we can do. Thanks to vlad and dtrace I’ve got call stack distributions of all of our mallocs and can tell where the most allocations come from. As you might imagine, given the size of our codebase, we do allocations from lots and lots of different places. Fortunately, there are several hot spots. Those include JavaScript, strings, sqlite, CSS parsing, HTML parsing, and array growing. For some of these we don’t need to heap allocate and can just do temporary allocations on the stack. For others we can’t, but we can use arenas (as we already do for some layout objects) to help reduce fragmentation. For example, we could have several arenas we could allocate small sized strings out of. Just during startup we do over 40,000 string allocations between 8 and 64 bytes. As a last resort, we could replace malloc and new entirely with something generally better. I don’t think we should do this until we’ve done as much of the other things as possible.
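
To make the arena idea concrete, here is a minimal bump-pointer arena for small strings. It is only a sketch under simple assumptions (fixed capacity, no per-allocation free) and is not Gecko’s arena code; `Arena` and `arena_strdup` are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Minimal bump-pointer arena. Allocations are carved sequentially out of one
// contiguous buffer and are all released together when the arena is
// destroyed, so they never leave small holes in the general-purpose heap.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new char[capacity]), capacity_(capacity), used_(0) {}

    // Returns `size` bytes starting at a max_align_t-aligned offset, or
    // nullptr if the arena does not have enough room left.
    void* allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > capacity_ || size > capacity_ - start) {
            return nullptr;
        }
        used_ = start + size;
        return buffer_.get() + start;
    }

private:
    std::unique_ptr<char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};

// Copies a C string into the arena; returns nullptr if it does not fit.
char* arena_strdup(Arena& arena, const char* text) {
    assert(text != nullptr);
    const std::size_t len = std::strlen(text) + 1;  // include the terminator
    char* copy = static_cast<char*>(arena.allocate(len));
    if (copy != nullptr) {
        std::memcpy(copy, text, len);
    }
    return copy;
}
```

The trade-off is that nothing is returned to the arena until the whole arena goes away, so this only suits allocations whose lifetimes are similar and well understood.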

I’ll be filing bugs and posting more details shortly.

Thoughts, suggestions, and comments welcome!

Edit: I found a small bug in the code I used to generate my images which resulted in fewer light green (empty) blocks than there should have been. I’ve updated the images to show properly.

Firefox Extensions Oh My!

November 7, 2007

I whipped up a couple of awesome Firefox extensions today. They both require the latest Firefox 3.0 nightlies or beta (release according to digg!).

The first allows people on Windows and Linux to print web pages to PDFs natively. There are some backend bugs at the moment but it is pretty awesome.

I introduce PrintPDF!!!! You can find it on addons.

Next, based on a bunch of the use-less-ram effort going on, I’ve put together an extension that fires the memory pressure notification. This results in us clearing several caches including the bfcache and the image cache as well as forcing cycle collection and garbage collection. We’re working on hooking more things up to this such as the code to discard uncompressed image data and other such things.

I introduce to you, RAMBack (cue the Sexyback music). It helps some now and will continue to help as we release new versions of Firefox.

enjoy.