The WebKit mailing lists

In the beginning there was only webkit-dev, but we were overwhelmed by the feedback we got. Last week Maciej announced the new lists and how to use them:

  • webkit-dev should be used for the development of WebCore/JavaScriptCore itself.
  • webkit-help should be used for questions on how to use the WebKit API in applications, porting to new platforms, building it…
  • webkit-jobs exists so that job posts go to one place instead of to every single committer.

Performance musings

Like many others I enjoy being in Las Palmas at the Gran Canaria Desktop Summit. It is great to see new and old friends, put faces to IRC nicknames… While sitting in talks I started to feel the performance itch. What code is the moc actually generating, how fast is it… Luckily Qt Software released their internal tests and you will find some benchmarks in the tests/benchmark directory.

Looking at QMetaObject and generated code

When using QObject::connect, the following currently happens. For the sender, QMetaObject::indexOfSignal gets called; for the receiver, QMetaObject::indexOfSlot or QMetaObject::indexOfSignal is invoked. The various QMetaObject::indexOf* functions (for properties, methods, signals, slots) all work the same way: you start with the current QMetaObject and go through its list of methods, and if you don’t find anything you go to the parent QMetaObject and do the same. In effect this is a linear search for a signature across the inheritance hierarchy. Once you have found the index of the method within a given QMetaObject, you add the number of methods of all its parent QMetaObjects, and the result is the id used by QObject::qt_metacall. The first thing the generated code in ::qt_metacall does is call the parent’s qt_metacall, which either handles the call or subtracts its own method count from the id.
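To make the lookup and the id arithmetic concrete, here is a minimal Python model of it. This is only a sketch of the idea, not Qt’s implementation: the MetaObject class, its method names and the example signatures are all made up for illustration.

```python
class MetaObject:
    """Toy stand-in for a QMetaObject: own methods plus a link to the parent."""

    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = list(methods)  # signatures declared by this class only
        self.parent = parent

    def method_offset(self):
        """Number of methods declared by all ancestors."""
        offset = 0
        meta = self.parent
        while meta is not None:
            offset += len(meta.methods)
            meta = meta.parent
        return offset

    def index_of_method(self, signature):
        """Linear search through this class, then through each parent in turn."""
        meta = self
        while meta is not None:
            for local_index, candidate in enumerate(meta.methods):
                if candidate == signature:
                    return meta.method_offset() + local_index
            meta = meta.parent
        return -1

    def metacall(self, method_id):
        """Mimic the generated code: ask the parent first, then subtract."""
        handled = None
        if self.parent is not None:
            method_id, handled = self.parent.metacall(method_id)
        if handled is not None or method_id < 0:
            return method_id, handled
        if method_id < len(self.methods):
            return -1, f"{self.name}::{self.methods[method_id]}"
        return method_id - len(self.methods), None


qobject = MetaObject("QObject", ["destroyed()", "deleteLater()"])
widget = MetaObject("Widget", ["show()", "hide()"], parent=qobject)
button = MetaObject("Button", ["clicked()"], parent=widget)

clicked_id = button.index_of_method("clicked()")
```

Here clicked_id is the local index 0 plus the four methods of the two parents, and button.metacall(clicked_id) has to pass through QObject and Widget before Button finally handles it.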

What can be done to improve it

Hashing and the like come to mind, or having a trie for the whole inheritance chain. With things like gperf you could create a perfect hash for the inheritance chain. The problem with having metadata over the whole inheritance chain is that your base class may live in a different library, and if they add a new method your hash might not be unique anymore… and obviously you will require more memory when you store the whole inheritance tree…

The easy solution is to sort methods/signals/slots so you can do a binary search in the various indexOf* methods in QMetaObject. So far this is all I have implemented, but I have some other ideas from “self” and JavaScript on how to cheat a bit to make recurring actions like QObject::invokeMethod a lot faster (there is no need to search again for the “slot”).
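As a rough illustration of the sorted-table idea (again a Python sketch with made-up signatures, not the code from my branch), the lookup within one class becomes a binary search:

```python
from bisect import bisect_left


def index_of_sorted(sorted_signatures, signature):
    """Binary search in a sorted signature table; -1 if not present."""
    position = bisect_left(sorted_signatures, signature)
    if position < len(sorted_signatures) and sorted_signatures[position] == signature:
        return position
    return -1


# The generator would emit the table already sorted; here we sort at runtime.
slots = sorted(["show()", "hide()", "setText(QString)", "clear()"])
hide_index = index_of_sorted(slots, "hide()")
```

The walk up the parents stays the same, but each step now costs O(log n) comparisons instead of O(n).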

Another thing: once you have found the index of the slot/signal/method, you might just save the QMetaObject and the local id instead of adding the offset. When emitting the signal you then avoid going through the hierarchy, because you already know which ::qt_metacall you want to call…
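A small Python sketch of that idea follows. The Meta and Connection classes and the find_owner helper are hypothetical names for illustration; nothing like this exists in Qt under these names.

```python
class Meta:
    """Hypothetical, simplified meta object: own methods plus a parent link."""

    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = list(methods)
        self.parent = parent


def find_owner(meta, signature):
    """Walk up the chain once; return (owning meta, local id) or None."""
    while meta is not None:
        if signature in meta.methods:
            return meta, meta.methods.index(signature)
        meta = meta.parent
    return None


class Connection:
    """Remembers where the slot lives, so later calls skip the search."""

    def __init__(self, receiver_meta, slot):
        found = find_owner(receiver_meta, slot)
        if found is None:
            raise LookupError(f"no such slot: {slot}")
        self.owner, self.local_id = found

    def activate(self):
        # Direct dispatch: no offset arithmetic, no walk through the parents.
        return f"{self.owner.name}::{self.owner.methods[self.local_id]}"


base = Meta("Base", ["reset()", "update()"])
derived = Meta("Derived", ["refresh()"], parent=base)
connection = Connection(derived, "update()")
```

The search happens once at connect time; every later connection.activate() goes straight to the class that declares the slot.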

Non-academic benchmark

The code can be seen in a branch on gitorious, and for some tests in the QObject::connect/QObject::disconnect benchmark the new code is 30% faster. This happens when you have to walk through the inheritance tree to find the signal… For some other cases there is no difference. What is missing is code to deal with old generated code…

Joel on office space

I encourage you to read this article on how to design an office space for software teams (or any kind of engineers). There is scientific evidence for a correlation between good office space and productivity, and his main points are:

  1. Private offices with doors that close were absolutely required and not open to negotiation.
  2. Programmers need lots of power outlets. They should be able to plug new gizmos in at desk height without crawling on the floor.
  3. We need to be able to rewire any data lines (phone, LAN, cable TV, alarms, etc.) easily without opening any walls, ever.
  4. It should be possible to do pair programming.
  5. When you’re working with a monitor all day, you need to rest your eyes by looking at something far away, so monitors should not be up against walls.
  6. The office should be a hang out: a pleasant place to spend time. If you’re meeting your friends for dinner after work you should want to meet at the office.

A great read…

Taking over memprof

Where did all my memory go? Who is allocating it, and how much is being allocated? Where were these QImages allocated from? valgrind provides an accurate leak checker, but for a running application you might want to know about allocations and browse through them without taking the performance hit of valgrind (e.g. with massif).

There is an easy way to answer these questions: use memprof. memprof used to be a GNOME application. It was unmaintained and its website was gone from the net, but this tool is way too good to just let it disappear. After trying to reach the maintainer twice I decided to adopt the orphaned thing.

Check the application out, it is great, it helps me to get an overview of memory allocations for WebKit/GTK+…

First ever Gtk+ patch

During my work on Epiphany I was debugging a problem with the “woohoo” bar. It took me no less than three days to understand the issue, write a test case and a patch, and put everything in this bug. Matthias Clasen was kind enough to review and commit the patch and it can be found here. Sadly the --author option of git was not used and the commit does not carry my name, so ohloh will not list my contribution to Gtk+.

The main issue with debugging was finding signal connections, e.g. which function is connected to which signal on which objects, and figuring out what was called during the signal activation. My approach was the usual printf method in many places and adding _backtrace() to function calls using the glibc builtin backtracing functionality. I would like to have SystemTap in a state where I could use it for tracing, or be able to script gdb (it has python plugin support now) to automatically execute a trigger when certain parts of the code get executed.

Anyway, I’m happy to have fixed a Gtk+ bug and to be a contributor now.

Bitbake parser performance

There is one thing in OE that is pretty time consuming. If you try to get some variables right that influence the entire system (SRCREV, DISTRO, MACHINE…) you will find yourself running bitbake over and over again and each run will take several minutes of your time. There are two solutions to this problem and both are very good.

  • Have less data. If you have less data, less data needs to be parsed. Apparently poky is doing that with their meta module. You can also see this with jhbuild of GNOME where you have to enable modulesets to have more data. For OE we should consider splitting up things that are orthogonal to each other. GUI and console networking tools? For distro builders and autobuilders this will not make a difference, for the average joe it might. I’m not sure we should split OE into several independent modules just for parsing speed.
  • Parse faster. This is what I will talk about now.

I have been bothered by the Bitbake parser for several years now. All these years I had a paid job/main contractor and was doing work for them, so I never finished my work on the various approaches based on Marc Singer’s lexer/grammar. As of now this obstacle is gone, I have plenty of time (wanna change that?), and for the last two days I have been working on the Bitbake parser.

My approach so far was to build on Marc Singer’s flex and lemon work (fixed to really parse everything), try to hook it into python, and try to hook it into the rest of Bitbake. At some point this always stalled because it is very hard to verify that the new parser is doing things properly, and that is quite frustrating as well. lemon/flex is pretty fast at figuring out the structure, but we have quite some python code in our metadata which needs to be executed, and so far this has not been optimized. While one is able to lex and analyze the grammar in under a second, it takes quite some time to execute the python code. Anyway, all of my previous attempts stalled at some point, mostly when trying to verify…

So this time I just ignored how horrible our current regexp based scanner is and decided to turn it step by step into a parser that creates a syntax tree/list and then evaluates this list/tree into the bb.data dictionary.

The first commits moved the actual data handling out of the line based regexp handling into a new python module. Afterwards I turned all these methods into methods that create a Node for the syntax tree and immediately evaluate it, to match the current behavior. Finally I was able to change that to evaluate the tree into a bb.data at the end of the parsing. So I have successfully (git bisectable) converted the current scanner into something that produces an AST and then evaluates it. When parsing the OE metadata, certain files like *.inc or *.bbclass get parsed over and over again. With the above change we can scan these files once, keep the syntax tree around, and then just evaluate it again.
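The scan-once, evaluate-many idea can be sketched in a few lines of Python. This is a toy and not the Bitbake code: it only understands lines of the form VAR = "value", and the names AssignNode, scan, evaluate and stats are made up for this example.

```python
import re

ASSIGNMENT = re.compile(r'^(\w+)\s*=\s*"(.*)"$')

ast_cache = {}          # file name -> list of nodes, kept between recipes
stats = {"scans": 0}    # counts how often the regexp scanner really ran


class AssignNode:
    """One 'VAR = "value"' statement, evaluated later against a data dict."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def eval(self, data):
        data[self.key] = self.value


def scan(name, text):
    """Run the line based scanner only if the file is not cached yet."""
    if name not in ast_cache:
        stats["scans"] += 1
        nodes = []
        for line in text.splitlines():
            match = ASSIGNMENT.match(line.strip())
            if match:
                nodes.append(AssignNode(*match.groups()))
        ast_cache[name] = nodes
    return ast_cache[name]


def evaluate(nodes, data):
    """Apply every node to the given data dict and return it."""
    for node in nodes:
        node.eval(data)
    return data


common = 'DISTRO = "toy"\nMACHINE = "qemu"'
first = evaluate(scan("common.inc", common), {})
second = evaluate(scan("common.inc", common), {"PN": "busybox"})
```

Both recipes end up with the variables from common.inc, but the scanner ran only once; the second recipe just re-evaluates the cached nodes into its own dictionary.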

I ended up with something like 27 patches against Bitbake: plenty of baby steps, each with high confidence that there are no regressions. Together they bring the parsing time down from 3m9.573s to 2m35.994s on my rusty macbook.

There is some more work ahead to improve this situation, move away from the regexp to PLY, attempt multithreaded parsing, attempt to write a peep-hole optimizer(???), look at the data module again…quite some time is spent in the cache too…

MontaVista using Bitbake for MontaVista Linux version 6

A small piece of information in case people didn’t notice: according to this video, MontaVista has looked into OpenEmbedded and the bitbake task executor. Apparently they liked the idea of cooking a customized rootfs by combining a set of recipes and have adopted this strategy and the tools. They have added extra value with an easy to use installer, a source mirror and pre-built packages to speed up the engineer.
It is not clear whether, and how much, they have used recipes and classes from OpenEmbedded, but in any case we adopted the MIT License because we want the widest use possible. I’m really happy to see this happening, and thanks to everyone working on this.

Amused by ofono.foo

I’m seriously amused by the recent announcement; I had to go through the irc log and laugh out loud. There is one joke our lecturer made in the software engineering class, and I would never have assumed that people in the real world would say something like this… oh well… today they did. It made my day.
So what uneducated people like to put into a product requirement is a sentence like “The software should be easy to use.” And you should wonder why anyone would design software that is not easy to use… But today I found out that ofono’s API is supposed to be easy to use. Let us all welcome a world where APIs are meant to be easy to use; finally someone is putting an end to all the APIs that are created to not be used. I know this takes a lot of changing on all our sides.

Scrolling in GTK+ and WebKit/GTK+

  • Model/View GtkAdjustment and GtkScrollbar

    The GtkAdjustment is probably best described as a model for scrolling. It has several properties, e.g. the current position (value), the lower and upper bounds, and the size increments. The GtkScrollbar operates on top of a GtkAdjustment. It is responsible for taking user events and for painting. When scrolling it looks at the lower and upper properties and updates the value when something happens.

  • WebCore::Scrollbar in WebCore

    The Scrollbar is created by the Scrollbar::createNativeScrollbar “factory”. For many platforms the painting/theming and behaviour is done entirely within that class. For GTK+ we use a GtkScrollbar. This widget happens to not have its own GdkWindow, which makes painting a bit easier: we just forward the original expose event and let it draw as well (from ScrollbarGtk::paint). WebCore::Scrollbar is used in two places in WebCore. The prominent one is the WebCore::ScrollView, which is the base class of the WebCore::FrameView and enables people to scroll their content in horizontal and vertical direction. The other big user is scrollable divs (we have a manual test in WebCore/manual-tests/gtk/ to test positioning of these scrollbars).

  • Scrolling on GtkWidget

    The scrolling starts quite innocently with the description of gtk_widget_set_scroll_adjustments. In the case of WebKitWebView in WebKit/GTK+ we do want to support scrolling, so we need to return TRUE… but how? If you take a look at the class structure of GtkWidget, there is a GObject signal identifier you have to set to the signal you have created. This is done in the WebKitWebView class init, and we remember and use the GtkAdjustment, e.g. set by a GtkScrolledWindow, in the WebCore::ScrollView (base of the WebCore::FrameView class). Using a GtkAdjustment allows for different means of scrolling, e.g. by fingers on a touchscreen, or by a wheel… the representation of the scroll concept will change, the implementation not…

  • The problem of mainFrame and subframes

    So far, most of the time we only have an external GtkAdjustment set on the mainFrame, which is embedded in something like a GtkScrolledWindow. But there are pages that create a frameset, and then you have subframes that require scrolling (my test case here is Google Images). What we end up with is one WebCore::FrameView with GtkAdjustments set from the WebKitWebView and some WebCore::FrameViews without GtkAdjustments set at all. In the cases without a GtkAdjustment there is no one who will place a GtkScrollbar (and resize the WebKitWebView to be next to it), so the WebCore::ScrollView creates a ScrollbarGtk to handle this job. This creates the situation that a WebCore::FrameView may or may not have a Scrollbar (and manage its size…).

  • The current solution

    When we have a GtkAdjustment set in the WebCore::ScrollView, do not think about scrollbars, the need for them, or their positioning at all. Simply update the properties, including the current value, the visibleWidth and the contentWidth. The upside is that we have a GtkWidget that works like a GtkWidget and can be embedded into a GtkScrolledWindow or a MokoFingerScroll (back in the better days). The downsides are that we have more platform specific code and that we do not send onscroll events… due to not going through ScrollbarClient::valueChanged…

  • The new solution

    Wrap the GtkAdjustment that gets set on us into a ScrollbarGtk, but do not create a GtkScrollbar. This should create a WebCore::Widget with WebCore::Widget::platformWidget() returning zero, and it should set the width() and height() of this Scrollbar to zero, meaning that the ScrollView calculation (e.g. updating visibleWidth/visibleHeight) is not negatively influenced. We can also kill some more #ifdef PLATFORM(GTK) from WebCore::ScrollView, and we properly send onscroll events because all scrolling goes through the WebCore::Scrollbar code path, like with every other port. For navigation to work we must be sure to reset the GtkAdjustment; due to the nature of the FrameView this is best done when creating the Scrollbar… The progress of this can be tracked in bug #25646.
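
To illustrate the model/view split described at the top of this list, here is a toy Python model. It is only loosely shaped after GtkAdjustment and is not the GTK+ API: the Adjustment class, the clamping rule and the two helper functions are assumptions made for this sketch.

```python
class Adjustment:
    """Toy scrolling model loosely following GtkAdjustment's properties."""

    def __init__(self, lower, upper, page_size, step_increment):
        self.lower = lower
        self.upper = upper
        self.page_size = page_size
        self.step_increment = step_increment
        self.value = lower
        self._listeners = []

    def connect_value_changed(self, callback):
        self._listeners.append(callback)

    def set_value(self, value):
        # Keep the visible page inside [lower, upper].
        highest = max(self.lower, self.upper - self.page_size)
        clamped = min(max(value, self.lower), highest)
        if clamped != self.value:
            self.value = clamped
            for callback in self._listeners:
                callback(clamped)


def scrollbar_step_down(adjustment):
    """What a scrollbar arrow click would do: move by one step increment."""
    adjustment.set_value(adjustment.value + adjustment.step_increment)


def finger_drag(adjustment, pixels):
    """What a touch scrolling container would do: move by the drag distance."""
    adjustment.set_value(adjustment.value - pixels)


seen = []
adjustment = Adjustment(lower=0, upper=1000, page_size=300, step_increment=40)
adjustment.connect_value_changed(seen.append)
scrollbar_step_down(adjustment)   # value becomes 40
finger_drag(adjustment, -5000)    # clamped to upper - page_size
finger_drag(adjustment, 9000)     # clamped to lower
```

The listener plays the role of the scrolled content: it only ever sees value changes and does not care whether a scrollbar, a wheel or a finger caused them.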