shop for SIMtrace

The webshop for the SIMtrace hardware can be found here. We are using a CAcert-signed SSL certificate, so your browser vendor might not like it.

Setting up, or rather mostly modifying, the webshop was my first encounter with Rails. In some ways it is a great framework; in others, with a Smalltalk background, it brings some tears to my eyes. The shop uses spreecommerce because I wanted something that is neither written in PHP nor following the Open Core business model.

The manual of spreecommerce is actually great from an engineering point of view: it describes the concepts, the models, and how to modify them. But it is a bit short on setting things up. There are some config options for which no graphical way to change them exists yet, so it is best to look into the app_configuration.rb of spree_core, e.g. to modify the default country.

There are still some bits that I would like to change; the code for my modifications can be found on Gitorious.

Qt Quick vs. Android XML Layouts

In the last month and the next couple of weeks we were/are developing a similar application for Android, iPhone, and the desktop (using Qt and Qt Quick) with native technologies. I will focus on Android and Qt, as both are Open Source and we developed for them in a similar fashion, with one designer and one developer interacting with each other.

In both cases the application has a couple of views, mainly consisting of a ListView of some sort. I will explain how this was realized for Android and then with Qt Quick, and provide a summary of what worked well and what Qt Quick could learn from it. I had not worked with Qt Quick or Android prior to this project, so I might have used the technology in the wrong way; at your option you can hold this against me or against the technology.
We started the project by doing the Android application. The model for my data is based on the abstract class android.widget.BaseAdapter, for which I needed to implement a couple of functions: answering how many items there are, returning the Object at a given position, providing an identifier for a position, and finally creating a View.
That last function is called getView(int position, View convertView, ViewGroup parent), and the common usage is to check whether convertView can be reused to display the item at position and then modify it (e.g. set the icon and text, hide things on that view); otherwise we use the LayoutInflater to create a View from one of the pre-defined layouts and set the items on that.
In Android one can describe layouts in an XML document or use a still very basic layout designer (which will mess with the indentation of the XML). One unique feature is that one can have different layouts for different properties, e.g. a layout that is only used when the application is in landscape mode, uses DoCoMo as operator, and more. This is a very powerful system, though we have only used it to have different layouts in portrait and landscape so far. The next important thing is that one can ask the LayoutInflater to create the View for a given identifier; the next part is to find the items inside that view. Each View has a method called findViewById that searches the children for that id.
One common pattern is not to call findViewById every time one needs to access a widget, but to create a ViewHolder class with references to the Views and use View.setTag(Object tag) on the top View to attach that ViewHolder to the View.
So how did the designer/developer interaction work here? The Designer was responsible for creating the View for the various parts of the UI, and I was responsible for creating the BaseAdapter-derived class, implementing the getView method, and setting the right text/icon/visibility on the items of the layout based on the item to be displayed.
How did that work? In general it worked great: I could create a UI that worked and the Designer could make it look nice. The biggest problem (besides the basic UI editor and fighting with the RelativeLayout class) is that while redesigning a layout, ids were dropped from the widgets, and I had to help the Designer identify the missing IDs and add them back to the design to keep things working and display the right thing again.
The Quick project is still work in progress and works a lot like the Android project: the developer creates the code (or states and interactions) and the designer makes it look good. Our ListViews are backed by classes based on QAbstractListModel (using setItemRoles to export the roles as QStrings), and the designer is forced to edit the files by hand right now.
The thing that really works better with Qt Quick is that the delegate (the equivalent of getView in Android) is in the hands of the Designer. This means that the developer only needs to export the right strings for the roles from the model, and the designer can use them to create the UI.
From my point of view the Qt Quick approach is really superior to Android’s XML layout system. With Qt Quick we could get started faster by having a full click dummy (create the ListModel and fill it with dummy data), allowing the designer to fill the UI with content from the first day of the project, and by having the delegate in the Designer’s hands we have no “lost” ids in the diff of an unformatted XML stream. The only thing that has no proper equivalent in QML/Quick is basing the state on external properties; in other words, I have not seen any documentation for some basic system/device properties.
First Steps with Qt’s Quick

I am very excited by the Qt Quick technology; I have finally found a reason to use it and gain some experience with it, and want to report from my first hours of exploring it. The first time I saw a declarative UI was with Enlightenment and the Edje framework. I liked it so much that I registered Epicenter five years ago; the plan was to create a handheld framework using declarative UI.

Now Qt Quick is not just a copycat of Edje but adds some nice improvements over it. The first is the use of JavaScript instead of the C-like language, plus better integration with Qt’s object model (properties) and libraries/plugins. From my observation at Openmoko, our developers kept a set of C preprocessor macros around to do common things with Edje; with Quick this seems better, as one can import libraries and such.
The most common mistake I have made so far is dealing with the width/height of an Item. In my case I created an Item, placed a MouseArea (to deal with user input) of the size of the parent (anchors.fill: parent) in it, and then also added some Text (as a sibling). Now it is quite possible that the Text is bigger than the parent Item. For performance reasons no clipping is done by default, so it renders just fine; clicking just doesn’t work. Which brings me to my debug trick…
I place a Rectangle { anchors.fill: parent; color: 'blue' } inside my Item and then I actually see where it is. Another nice thing would be an X-ray view showing the border of each Item and its id, but only in black and white. My solutions to this problem so far (from my first hours of using it) are to either use a Row/Column, which seems to get its width/height updated based on its children, or in some cases to set a width/height on the Item itself.
This brings me to the biggest issue with the qmlviewer, and also an idea on how to resolve it. In the last couple of months I started to contribute to GNU Smalltalk and to look more into Self, Smalltalk-80, and such. The clever Xerox PARC people invented Model-View-Controller as part of Smalltalk-80, and Qt adopted some parts of it with Qt4. In the meantime something called Morphic emerged. Morphic is a direct-manipulation user interface: one creates the UI inside a running UI by composing it from simple objects (just like Quick). In contrast, one can inspect each element and interact with it, changing it at runtime without restarting. This allows faster changes and easier debugging in case something goes wrong; e.g. it easily answers the question: what is the width of that element?
So for the immediate future I would like to see something like the WebKit inspector emerge for QML. It would allow inspecting the scene graph, changing it from the console, simple hit testing to inspect one element, the JavaScript debugger and profiler, the timeline… and I am pretty sure I am not the first one to want it.
GPRS issue resolved

Hi,

With some more debugging, fun with Wireshark scripting, and some looking, a pretty obvious issue has been resolved. GPRS for us actually uses IP, UDP, NS (simple addressing and message types) and BSSGP (the protocol between SGSN and BSS), and for actual data there is LLC at the end of the BSSGP. The LLC frame is part of the BSSGP payload as a TLV (Tag, Length, Value).
I created a simple setup that worked: getting the traffic from the BTS, relaying it with a simple Smalltalk script (I had to do some fixes to GNU Smalltalk), and then sending it to another SGSN. With a small variation, sending the data through our proxy, I made the nanoBTS crash.
From observation I found that the other SGSN pads the FLOW-CONTROL-BVC-ACK and FLOW-CONTROL-MS-ACK packets to 28 bytes, but padding or not padding had no effect on the crash.
The next observation (before I tried doing it manually) was that I now had each packet twice: once as it came from the SGSN and once as it looked after our proxy. Apparently the proxy truncated the UDP packets…
So what errors have happened?
  1. The nanoBTS accesses random memory with short LLC frames and crashes; instead of crashing it should send a STATUS (I think BSSGP) returning our erroneous frame.
  2. The Wireshark BSSGP dissector does not check the size of the LLC frame (I created a bug report with a patch).
  3. The proxy code was not reading the whole datagram and we had to increase the buffer size; according to the spec the maximum size is 1600 bytes for Frame Relay… we now have a slightly bigger buffer…
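The first error comes from trusting the length field of the inner LLC TLV. A minimal sketch of a bounds-checked TLV walk, with hypothetical names and a one-byte length field for simplicity (real BSSGP IEs can use variable-length length encodings, and this is not OpenBSC's actual parser):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer of simple Tag-Length-Value triplets (one-byte tag,
 * one-byte length) and return a pointer to the value of the wanted
 * tag, or NULL. Every access is checked against the end of the
 * buffer, so a short frame or a lying length field results in NULL
 * instead of an out-of-bounds read.
 */
static const uint8_t *tlv_find(const uint8_t *buf, size_t buf_len,
                               uint8_t wanted_tag, size_t *value_len)
{
    size_t i = 0;

    while (i + 2 <= buf_len) {          /* need tag and length bytes */
        uint8_t tag = buf[i];
        size_t len = buf[i + 1];

        if (i + 2 + len > buf_len)      /* the length field lies */
            return NULL;
        if (tag == wanted_tag) {
            *value_len = len;
            return &buf[i + 2];
        }
        i += 2 + len;
    }
    return NULL;
}
```

A BTS written this way could answer a short LLC frame with an error status instead of reading random memory.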
Android and Java

I am now playing with the Android UI/Java code, and right now I have something really, really simple to do. Over time I have a bunch of URIs to GET or POST, and after a request is done I want to have the status and the data. On top of that it should not block the UI thread. Android is using the HttpComponents of the Apache project. The good thing about this is that Google did not reinvent the wheel; the bad thing is that the code is Java. So I will need to write 200 lines of code to set up a thread-safe ConnectionManager and an HttpClient on top of that, and then I need to write my own thread pool… Which brings one back to the main complaint about Java: it is an over-engineered system…

But in general I think the Android UI classes are very promising. Yesterday I implemented the preferences for my application and it was really, really easy, better than anything I have dealt with before. One needs to create an xml/preferences.xml; one can use “categories” and each category can have different preference items, even custom ones. The system will automatically create the view for that (as with most XML documents in Android) and it will take care of saving and loading the settings.

GSM RACH Bursts and Paging Requests

Yesterday I had the pleasure of trying OpenBSC on a real network, and the result was a disaster, but honestly, what else to expect when trying it for the first time? It is not that OpenBSC was crashing, leaking memory, or not recovering from failure; it is just that the load of the network was different from what I assumed, and that led to problems.

What happens is that one sees a lot of location updating requests, which load the SDCCHs, but that is really fine and we have seen such things at the Chaos Congress. What is different is the result of those location updating requests: the network floods us with paging requests… Right now we are sending up to 20 paging requests every two seconds… The first thing to notice is that this is too much for the nanoBTS; it is sending us a nice CCCH/ACCH/BCCH overload warning which we do not handle (we should start two timers and throttle the amount of messages we send). The other part is: if we are out of SDCCHs and ask 20 more phones a second to get one… we have created the RACH DoS that Dieter Spaar demonstrated with a Mobile Station.

The Random Access Request contains the channel type and, IIRC, one to four bits of random number. So even if we have a free channel… it can happen that two phones believe that we have assigned a channel to them… and then we see RF failures, which in turn trigger the phones to try again (or we page them again)… and then nothing works…

The other observation is that if our cell is really busy we should start to assign TCHs to fulfill location updating requests…

So the changes I need to make are: change the paging to not page as many subscribers as we can physically stuff into the PCH, but only as many as we can handle the responses of (pretty obvious?), and allocate a “bigger” channel in case we have no smaller channel free… E.g. use the number of free channels divided by X for paging requests…
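The last idea, scaling paging to the number of free channels, is simple enough to sketch. A hypothetical helper, where the name and the divisor are made up for illustration and this is not OpenBSC's actual scheduler:

```c
/*
 * Decide how many subscribers to page in the current interval based
 * on the number of currently free SDCCHs. Paging more phones than we
 * can hand channels to only provokes RACH collisions, so the budget
 * is free_sdcch / divisor, but at least one whenever any channel is
 * free so that paging never stalls completely.
 */
static int paging_budget(int free_sdcch, int divisor)
{
    int budget;

    if (free_sdcch <= 0 || divisor <= 0)
        return 0;
    budget = free_sdcch / divisor;
    return budget > 0 ? budget : 1;
}
```

With a divisor X of 4 and eight free SDCCHs this pages two subscribers per interval, and it backs off to zero as soon as the cell runs out of channels.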

Hacking on OpenBSC

I was invited to visit the On-Waves (they have a shiny new website) office in Paris this week and was quite busy hacking away on the OpenBSC codebase. On-Waves lets me play a bit with their MSC and learn more about GSM, and in exchange OpenBSC gains a more and more complete and stable GSM A interface.

When developing code for OpenBSC we mostly sit very close to the BTS, have only one active subscriber, test one thing, restart, test another thing, restart. But as with any piece of software I’m writing, I want OpenBSC to be rock solid: run unattended for years, have no memory leaks, deal with the nanoBTS going away and coming back, the MSC going away and coming back, all of this at any point in time. So far events like Hacking at Random and the Congress are the ideal testing ground, as many different handsets, subscribers, etc. are around.

My testing was limited to a small set of handsets connected via USB, executing AT commands for call handling and sending SMS. I’m addressing subscribers on the same cell; that means whenever I do a call I have both mobile-originated and mobile-terminated testing covered, and this is done by funny chat scripts that work most of the time. The next thing is to simulate failure. For some cases where a specific layer 3 message would be sent we have to wait for a more complete OsmocomBB, but what I can easily do is cut off TCP connections. I have done this with another piece of weird shell magic: I use the output of $RANDOM, treat it as seconds, and then use kill -SIGUSR2 `pidof bsc_msc_ip` to close the MSC connection at a random time. Then I leave it running and wait for failures.

I have fixed a bug/issue in the way we release a channel. There are multiple things involved. The first is instructing the BTS that a given channel on a timeslot is open, or closing it (the RF Channel Release of RSL); the other part is that on the channel one can have logical applications running (SAPIs), e.g. call control (SAPI=0) and SMS (SAPI=3). When opening a connection to a Mobile Station (MS), SAPI=0 is always established; when attempting to deliver an SMS we need to open SAPI=3 first.

Now our issue with bringing this down was that whenever we got a SAPI release confirm (we asked for the release and it was released) or a release indication (the MS closed it), we used to respond with an RF Channel Release. So when bringing down a connection where we had delivered an SMS, we would issue the RF Channel Release twice, and the nanoBTS ACKed it twice! To make matters worse, whenever we get an RF Channel Release ACK we mark the channel as free, so there was a small window where we got the first RF Channel Release ACK, allocated the channel again, and then got the second RF Channel Release ACK.

I have fixed this issue in multiple ways. The first is to use the T3111 timer to wait before we issue the RF Channel Release; the second is to handle (RF) failures by “blocking” the lchan for a short second to absorb multiple errors and release ACKs; and the last bit is to properly bring down the channel: when we have a SAPI!=0 we bring that down first, then we send the SACCH deactivate, followed by the SAPI=0 release, and finally the RF Channel Release. This makes things more reliable on our side, but we need to fix some more things; there is a FIXME inside gsm_04_08_utils.c that mentions starting a T3109 timer. In any case, when sending a SAPI release the BTS will answer with either success or a timeout, and we handle both.

Today I addressed losing the RSL or OML connection to the nanoBTS, making sure we reconnect and do not leak any memory. This took me most of the day to get stable, and I found a bug inside osmocore/select.c when releasing a bsc_fd that is the last one in the list. The difficulty here is making sure we do not leak memory, close all file descriptors, close all channels that live on the RSL connection, and make sure that when the BTS is up again we can use the channels that were allocated during the failure. To help with testing I added two commands to our VTY interface to drop the OML or the RSL connection on a given BTS. The other thing that was helpful was to use Linux’s netfilter to drop packets on a TCP connection and wait for a failure. Now I can simulate most of the network failures easily and could build up some trust.

And my final wishlist item would be to have something like 16 GTA02 boards, run FSO on each, and use a simple script to dial, send SMS, and pick up phone calls; this would allow me to heavily test the network in an automated way. On top of that would be an OsmocomBB-enabled Calypso or C123, and then I could even send messages that are normally not sent at all. And thanks to Free Software development I’m sure we are going to reach that goal.

Dealing with security issues in the context of OpenEmbedded

One thing that has bothered me while being at Openmoko is the lack of security response by the OpenEmbedded crew. In one way a security issue is just like any other bug, and distros don’t upgrade each package for each bug fixed upstream, but it gets worse when the security issue exists in the default installation, in a daemon listening to network traffic, with ready-made exploits on the network.

I think it is really unethical to go around and claim how great OpenEmbedded is while companies like Openmoko, Palm, etc. ship vulnerable software to their users, and it is easy to pass the blame to the companies actually using OpenEmbedded; let me say it is too easy.

There are various things one can do to address these problems. One option is to downgrade to the classic Buildroot, as its maintainers seem to address vulnerabilities in time. I use the word downgrade as these systems provide less functionality and flexibility than OpenEmbedded, e.g. they lack the creation of SDKs and the choice of libc (glibc, uclibc, eglibc), but then again they do their homework and provide people with security updates in time. The other option is to go to a distribution like Debian or Fedora with a proven track record of handling security issues.

But I’m going to talk about the third option: improving OpenEmbedded. I had the idea while being at Openmoko, but the guy who was assigned to do it was laid off shortly afterwards, so it never happened. In general, for every package we ship in OpenEmbedded there is an established distribution (e.g. FreeBSD, Debian, Fedora) that ships it as well. Or, in the rare cases where OpenEmbedded is the first adopter, the software is kept current and there is not much security research anyway. This means that to provide security upgrades to our users we only need to monitor the big and established players, and that sounds like something that can be partially scripted.

I’m using FreeBSD on my servers, and in the FreeBSD world there is an application called portaudit which looks at your built/installed packages and compares the name, package version, and patch level to a list of known security issues in the ports tree, then asks you to upgrade. Gentoo has a similar XML file for each security incident; Debian has a security feed as well.

Long story short, on a flight to Iceland I hacked up a Python script called oe_audit.py that takes the FreeBSD auditfile and the output of “bitbake -s” (the list of providers and their versions) and starts comparing these lists. Right now the script is inside the OE tree; it is still a gross hack, but I will improve it to be a proper Python script. In its first version it found issues with plenty of packages in OpenEmbedded, and thanks to the help of some people we are down to only a couple of issues in our tree. In general, addressing security issues is not that hard: follow a couple of mailing lists, look at websites, and when a CVE is published, search for the patch, apply it to our version, be done. Especially given the number of OE developers, we could nominate a security sheriff each week who has to do the upgrades… It is not that we see more than three upgrades a week anyway… So this week it would have been Pango, PHP, PulseAudio…
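The core of such an audit is comparing package versions against the vulnerable ranges in the audit file. The real tool is a Python script, but the idea can be sketched in C, simplified to purely numeric dotted versions (real comparators like the ones in portaudit or dpkg also handle letters, epochs, and suffixes):

```c
#include <stdlib.h>

/*
 * Compare two dotted numeric version strings ("1.2.10" vs. "1.2.9")
 * component by component. Returns <0, 0 or >0 like strcmp. A missing
 * component counts as 0, so "1.2" < "1.2.1". Non-numeric residue is
 * ignored -- this is a sketch, not a full version comparator.
 */
static int vercmp(const char *a, const char *b)
{
    while (*a || *b) {
        char *end_a, *end_b;
        long na = strtol(a, &end_a, 10);
        long nb = strtol(b, &end_b, 10);

        if (na != nb)
            return na < nb ? -1 : 1;
        if (end_a == a && end_b == b)
            break;              /* no digits left on either side */
        a = (*end_a == '.') ? end_a + 1 : end_a;
        b = (*end_b == '.') ? end_b + 1 : end_b;
    }
    return 0;
}
```

A package would then be flagged when its installed version compares below the version the advisory says is fixed.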

Using setitimer of your favorite posix kernel

As this was requested in a comment on a previous post, and since knowing your kernel helps to write better-performing systems, here is some information on how to use the interval timers provided by your POSIX kernel.

What is the interval timer?

The interval timer is managed and provided by your kernel. Every time the interval of the timer expires, the kernel will send a signal to your application. The kernel provides three different interval timers for every application: one measuring the real time passed on the system, one measuring the time your application is actually executing, and finally the profiling timer, which measures the time your application is executing plus the time the system is executing on behalf of your application. More information can be found in the setitimer manpage.

Why is it useful?

In the QtWebKit Performance Measurement Utilities we use the interval timer as the timing implementation for our benchmark macros. To be more precise, we use ITIMER_PROF to measure the time we spend executing in the system and in the application, with the smallest possible precision of this timer: one microsecond. The big benefit of using this instead of elapsed real time, e.g. QTime::elapsed, is that we do not depend so much on system scheduling. This can be really nice, as even on a lightly crowded system we can generate stable timings; the only thing influencing the timing is the MHz of the CPU.

How is it implemented?

It is a kernel timer, which means it is implemented in your kernel. In the case of Linux you should be able to find a file called kernel/itimer.c, which defines the setitimer syscall at the bottom of the file. In our case the SIGPROF signal seems to be generated in kernel/posix-cpu-timers.c in the check_cpu_itimer routine. Of course the timer needs to be accounted for by things like kernel/sched.c when scheduling tasks to run…

How to make use of it?

We want to use ITIMER_PROF, which according to the manpage will generate SIGPROF. This means we need a signal handler for that, and then we need a way to start the timer. So let us start with the SIGPROF handling.

Elapsed time handling
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static volatile sig_atomic_t sig_prof = 0;
static void sig_profiling(int signum)
{
    (void) signum;
    ++sig_prof;
}

The signal handler
    struct sigaction sa;
    sa.sa_handler = sig_profiling;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGPROF, &sa, 0) != 0) {
        fprintf(stderr, "Failed to register signal handler.\n");
        exit(-1);
    }

Start the timer
static void startTimer()
{
    sig_prof = 0;
    struct itimerval tim;
    tim.it_interval.tv_sec = 0;
    tim.it_interval.tv_usec = 1;
    tim.it_value.tv_sec = 0;
    tim.it_value.tv_usec = 1;
    setitimer(ITIMER_PROF, &tim, 0);
}

Discussion of the implementation

What is missing? We are using the sigaction API, but we should also make use of the siginfo_t passed to the signal handler (by setting SA_SIGINFO and using sa_sigaction).

What if we need higher precision or need to handle overflows?
There is the POSIX.1b timer API which provides timers with nanosecond resolution and also provides information about overruns (e.g. when the signal could not be delivered in time). More information can be found by looking at the timer_create function.

When is the interval timer not useful?

Imagine you want to measure the time it takes to complete a download and someone wrote code like this:

QTimer::singleShot(300000, this, SLOT(finishDownload()));

In this case a lot of real time will pass until the download finishes, and the app might be considered very slow, but in terms of the itimer only little time will be accounted, as the time we just sleep is not accounted to us. This means the itimer can be the wrong thing to use when you want to measure real time, e.g. latency or the time to complete network operations.

Explorations in the field of GSM

Something like 14 months ago I had no idea about GSM protocols; 12 months ago I was implementing paging for OpenBSC. Beginning last summer I explored SS7 and SCCP, wrote a simple SCCP stack for On-Waves, and started to implement the GSM A interface for OpenBSC; last week I saw myself learning more about MTP Level 3. With Osmocom I start to explore GSM layer 1 (TDMA, bursts, syncing) and GSM layer 2 (LAPDm), and on GSM layer 3 we mostly see the counterpart of OpenBSC.

I feel like I am back in school (in a positive way); I have learned a lot in the recent year, and looking forward I will learn more about the protocols used on the MSC side and such. I’m very excited about what the future is going to be like. Will we have a complete GSM network (BTS, BSC, MSC, MS, SMSC, GPRS gateway(s)) with GPL software by the end of the year?