Julian's Blog

Stripe Hackathon Report: The Birth of Donation Party

Last Saturday was the Stripe Hack to the Future hackathon. I’ve been working on attending more tech talks and other events lately, and since Stripe is pretty well known and I only live a block away, this one was a no-brainer. Naturally, they encouraged people to utilize Stripe, so I spent a few minutes the night before thinking about what to build. I quickly decided it should be something involving small transactions, and that it should be relatively silly: not something that would look like a “real” business idea.

The idea

Near the front entrance of Hacker Dojo is a credit card reader, used to accept donations. Unlike normal credit card readers, this one donates a random amount up to $20 on your behalf when you swipe your card.

While neat on its own, the real fun starts when you can get a bunch of people to all swipe one after another. Often we can arrange to have a little prize like a t-shirt or coffee mug for the “winner” who ends up donating the most. The money goes to a good cause, isn’t large enough to really care about, and the whole experience has a fun air of competition: there’s a prize at stake, and suspense is high, but the outcome is not under anyone’s control.

With a little tweaking, I figured building a small web app basically simulating that same experience could be a lot of fun.

The Hackathon

I went over to Stripe HQ the next day and after having a quick bite to eat thanks to Stripe’s amazing culinary team, I set about recruiting people to get building. Normally, even at an event like a hackathon where you know you have something in common with almost everyone there, it’s a little hard to just start talking to people. But with the goal of starting work ASAP, and most people’s desire to get working on something too, it was easy to get things started.

Within about 30 minutes we had a team of four (the perfect size!) ready to get going. Most amazingly, we not only quickly came up with a great name, but a name with an available .com domain name! By the end of the night, we actually had made great progress. There’s a lot more to be done, but there happens to be another hackathon this Saturday at RISE where we plan to finish everything up.

Lessons Learned

While what we built was really cool, the actual experience of building it was one of the most valuable experiences I’ve had in the last few months. I’ve been reading and thinking about all sorts of startup and software development related topics lately, and this was a great chance to reflect upon them.

A few times during the hackathon particularly interesting thoughts crossed my mind. They might not be completely unique, but they’re still powerful.

Stay flexible

Since I was basing my hackathon idea off of a real world experience, I was lucky to have an extremely clear idea of what I wanted to build. I had specific interactions, pricing, and even wording in mind from the very start.

By the end of the night, none of the details of what we had built were similar, although the overall premise was preserved.

Initially, I was a little concerned when people started suggesting things directly in conflict with details that were, in my mind, already decided. I realized I had to let go of any specific vision I had, and just let our project evolve with input from the entire team.

Partially, this was because I knew I had to ensure everyone on the team wanted to keep working, and if I was too firm on any particular detail, they might decide they weren’t interested in helping anymore. Of course, there was absolutely no reason for me to believe that any preference I had for the direction of our project was automatically correct, and I have no doubt that the combined input of four people made it far better than I ever could have hoped to achieve on my own.

Urgency helps with decision making

Within about 10 minutes of gathering a small team, we had firmly made an incredible number of major decisions.

What language should we use? What hosting provider? Should we build a mobile app?

Those decisions alone could have taken weeks to decide at even a small company. Initial versions of user flows and interactions took us about 15 minutes, but could have taken even a fast moving startup a while.

The single hardest choice for any company, what to name your product, was decided in 30 seconds.

There’s nothing special about anyone on the team that caused us to make decisions so quickly; we simply didn’t have a lot of time, and therefore had a strong sense of urgency.

No doubt there are thousands of smaller decisions we could have worried about, but more important than any of them was our need to just build stuff.

We didn’t spend any time talking about coding styles, indentation, or any of the other classic programming debates. All software developers, of course, have extremely strong opinions on each of these topics, so not having to debate them was extremely refreshing.

Going forward, I’m going to focus on keeping the same sense of urgency for each and every project I work on, and hopefully all of them will be more successful.

Focus on the core of your project, outsource everything else

Another key to moving fast was to only spend time on the part of your project that makes it interesting, and let someone else take care of everything else you can. No one cares about the efficiency of our web servers, even if we all would have enjoyed tweaking them, so we used Heroku and were live in minutes. When we realized we needed a pub/sub system, we didn’t want to have to deal with setting up our own, so we let Pusher take care of it, and went back to working on something else.

For us, and for any team starting out, focusing only on what the user sees is the only way to go. Leave everything else to someone else.

Keep your team productive

I came to this hackathon mostly expecting to gather a team, come up with a basic design, and then more or less sit down and code. In fact, for about two thirds of the night, I spent all my time just making sure everyone else could work.

At first, it might seem like this is silly. If I too had just gotten to work, there would be four people working instead of three.

But look at it another way: which is better, to have three people working in a productive, uninterrupted state, or four people constantly having to get sidetracked? I spent all night setting up dev apps, SSL certs, celery queues, and whatever else needed to be done.

Quite frankly, all the other team members worked on the hard, exciting stuff, and did a great job. But they wouldn’t have done as awesome of a job without my unglamorous help from the sidelines, and that’s a great feeling.

Advice for Stripe

I really want to thank Stripe for hosting this hackathon and letting everyone eat their tasty food, but I also want to give some feedback that might make future events better. As a quick disclaimer I want to mention that I was fairly heads down for much of the night, so if I missed something, my apologies.

Do some sort of judging

While the event was ostensibly a hackathon, and indeed much hacking was done by several teams, it would be more correct to call it an office hours session. Guests were in attendance, even working on their own projects, but there wasn’t much structure imposed by Stripe, and at the end of the day the event fizzled out with no strong conclusion from Stripe.

There’s something to be said for a low key gathering, but Stripe, next time you host a hackathon, go all in on actually making it a hackathon. Have people register teams, do some sort of judging, give out a prize, the usual. Just helping people organize into teams probably would have tripled the number of people seriously working.

On the other hand, a more structured hackathon might have made it harder for me to pick up new team members, so maybe I should be careful what I wish for.

Make your employees more active

As a small team using Stripe for the first time, I can’t think of a better place to have been working than in the Stripe offices. Stripe employees were available to answer any questions we had, and they were overall really eager to help. That said, it seemed like we were always having to seek them out.

Maybe I just missed it, but I would have loved to have been bugged by Stripe employees dropping by every couple minutes, just to chat with us about our project. They could have quickly checked out how we were using Stripe, warned us before we ran into known problems, and helped us make things better than we could have on our own.

Fun With 4K Sectors

Today I received in the mail a brand new 3TB hard drive for storing my multitudes of bits.

While I was eager to get started using it, I couldn’t help but dig into all the fun details of the new technology I have acquired. There are two interesting considerations with a drive of this size and they both come down to something many people might not know much about: sectors.

A sector is basically a subdivision of usable space on a hard disk. When your operating system wants some data from disk, it asks for data by sector. For decades the standard size of a disk sector has remained unchanged: 512 bytes.

Recently however, two interesting things have happened. First, with the release of hard drives with capacities larger than 2TB, more than 2^32 sectors are required to address all data on disk. Unfortunately, the ubiquitous MBR partition table only supports up to 2^32 sectors per partition.

Second, hard drive manufacturers, in their never ending journey to give us more storage space, have realized that sectors of only 512 bytes no longer make sense. By using 4KB sectors, it is actually possible to store more data on the same hard disk because each sector comes with some overhead used by the hard disk.

What does all this mean? Most obviously, it requires that anyone wishing to use more than 2.2TB in a single disk use the new GUID Partition Table (it’s possible to cleverly utilize more than 2.2TB of a single disk with multiple MBR partitions, but this often does not work with many operating systems). Support for GPT is quite good amongst all operating systems now, and it is required for EFI, which is growing more common as well, so this is not much of an issue.
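The 2.2TB figure falls out of simple arithmetic: 2^32 sector addresses times 512 bytes per sector. Here is a quick sketch in shell arithmetic (the variable names are only for illustration):

```bash
#!/bin/bash
# Capacity addressable with 32-bit sector numbers and 512-byte sectors.
sector_size=512
max_sectors=$(( 1 << 32 ))
max_bytes=$(( max_sectors * sector_size ))

echo "addressable bytes: ${max_bytes}"
echo "in TiB (2^40):     $(( max_bytes / (1 << 40) ))"
echo "in GB (10^9):      $(( max_bytes / 1000000000 ))"
```

That works out to exactly 2 TiB, or roughly 2.2TB in the decimal units printed on drive labels.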

More insidiously however, it means that your hard disk is lying to you. Since sectors have been 512 bytes literally for decades, our friendly hard drive manufacturers assumed that no operating systems would be ready to support sectors of any size other than 512 bytes (perhaps they assume programmers don’t always properly use named constants for values such as sector sizes, which of course is ridiculous). Their clever solution was to have disks store data in 4KB sectors, but continue to advertise to the operating system that sectors are 512 bytes long, and then handle the bookkeeping to translate between the two themselves. So now there are two sector sizes worth worrying about: the logical size – how your operating system talks to your hard disk, and the physical size – what your disk actually does internally. This is all well and good, except that it breaks an implicit assumption about how much work a hard disk has to do when writing data.
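On Linux, one way to see both sizes is to read the block queue attributes that the kernel exposes in sysfs. This is a small sketch, assuming a kernel that provides `logical_block_size` and `physical_block_size` under `/sys/block/<device>/queue/`. The function name and its optional second argument (an alternate sysfs root, handy for trying it out without a real disk) are hypothetical, and a drive that misreports its physical size will still show 512 here.

```bash
#!/bin/bash
# Print the logical and physical sector sizes the kernel reports for a disk.
# $1: device name such as "sdc"; $2: sysfs root (defaults to /sys/block).
sector_sizes() {
    local dev=$1 root=${2:-/sys/block}
    local queue="${root}/${dev}/queue"
    local logical physical
    logical=$(cat "${queue}/logical_block_size") || return 1
    physical=$(cat "${queue}/physical_block_size") || return 1
    echo "logical=${logical} physical=${physical}"
}

# Example: sector_sizes sdc
```

A result of `logical=512 physical=4096` would indicate a drive doing the translation described above.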

Consider the case of an operating system writing to two consecutive 512 byte sectors. With 512 byte physical sectors, this is assumed to require a total of 1024 bytes be written to disk (a hard disk will generally only read and write, at minimum, a whole sector, regardless of how much or little data actually changes). But what if those two 512 byte logical sectors were not part of the same physical sector? Your hard drive has to write both physical sectors, a total of 8192 bytes!
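The same reasoning can be written down as a few lines of integer arithmetic. The sketch below assumes 512 byte logical sectors and 4KB physical sectors, so 8 logical sectors per physical sector; the function name is hypothetical.

```bash
#!/bin/bash
# Number of 4KB physical sectors touched by a write that starts at logical
# sector $1 and covers $2 consecutive 512-byte logical sectors.
physical_sectors_touched() {
    local start=$1 count=$2
    local per_physical=8                       # 4096 / 512
    local first=$(( start / per_physical ))
    local last=$(( (start + count - 1) / per_physical ))
    echo $(( last - first + 1 ))
}

# Two logical sectors inside one physical sector: 1 physical sector, 4096 bytes.
echo "start 0: $(( $(physical_sectors_touched 0 2) * 4096 )) bytes"
# Two logical sectors straddling a boundary: 2 physical sectors, 8192 bytes.
echo "start 7: $(( $(physical_sectors_touched 7 2) * 4096 )) bytes"
```

The same function shows why a 4KB write that starts one logical sector off a boundary touches two physical sectors instead of one.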

If you’ve read any literature about SSD performance over the last few years, you’ll recognize this problem: it’s known as write amplification and like anything where more work than required is done, it’s not good for performance.

So how much performance is lost with a misaligned partition? Timothy Miller investigated by writing a small C program to force write amplification. Curious, and always a sucker for small C programs, I ran his code myself. Here’s my version:

testWriteAmplification.c
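#define _LARGEFILE64_SOURCE   /* exposes lseek64() and off64_t on glibc */
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>           /* atoi(), srandom(), random() */
#include <fcntl.h>

char buffer[4096];

int main(int argc, char *argv[])
{
    int fd, i, off;
    long bk;
    off64_t byte;

    /* Optional first argument: offset in 512 byte logical sectors. */
    if (argc < 2) {
        off = 0;
    } else {
        off = atoi(argv[1]);
    }

    srandom(off);

    /* WARNING: this writes to the raw device and destroys its contents. */
    fd = open("/dev/sdc", O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (i = 0; i < 1000; i++) {
        /* Pick a random 4KB block, then shift it by the requested offset. */
        bk = random() % 200000000;
        byte = (off64_t)bk * 4096 + (off64_t)off * 512;
        if (lseek64(fd, byte, SEEK_SET) < 0 || write(fd, buffer, 4096) != 4096) {
            perror("lseek64/write");
            close(fd);
            return 1;
        }
    }

    close(fd);

    return 0;
}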

The method is simple: write 4096 bytes to 1000 random locations. By default, the program ensures that the write starts and ends at a 4KB sector boundary, but the first argument specifies an offset in 512 byte increments. Any offset not evenly divisible by 8 will cause write amplification, and as it turns out, the performance penalty is serious:

spectre256@ocean ~ $ sudo time ./testWriteAmplification 0
0.00user 0.02system 0:16.17elapsed 0%CPU (0avgtext+0avgdata 1664maxresident)k
0inputs+0outputs (0major+144minor)pagefaults 0swaps
spectre256@ocean ~ $ sudo time ./testWriteAmplification 1
0.00user 0.04system 0:26.45elapsed 0%CPU (0avgtext+0avgdata 1664maxresident)k
0inputs+0outputs (0major+144minor)pagefaults 0swaps

This brings us to the dreaded A-word: alignment. While occasional write amplification would be fine, what if your system was set up in such a way that write amplification is inevitable? This is the danger of differing physical and logical sector sizes. In fact, the default starting sector for many Windows partitions is 63. This has led many other tools to copy this default, leading to misalignment and reduced performance. Some hard drives even internally shift all sectors by one so that such systems default to correct alignment.

Testing different alignments

While the test above showed serious theoretical performance reduction from misaligned writes, I wanted to know what would happen in the real world, so I devised some simple testing to investigate.

Sector 34 is the first available to start a new partition, after accounting for the space needed by GPT. Since 34 is not evenly divisible by 8, a partition starting at sector 34 will not be properly aligned, and is a good choice for testing misaligned performance. Sector 40 is the first possible correctly aligned sector, so I used this as the starting sector for the aligned partition.
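The divisibility check is easy to script. Here is a minimal sketch, assuming 512 byte logical and 4KB physical sectors by default; the function name and output wording are hypothetical.

```bash
#!/bin/bash
# Report whether a partition starting at logical sector $1 begins on a
# physical sector boundary. Sizes default to 512-byte logical and
# 4096-byte physical sectors.
check_alignment() {
    local start=$1 logical=${2:-512} physical=${3:-4096}
    local remainder=$(( (start * logical) % physical ))
    if (( remainder == 0 )); then
        echo "sector ${start}: aligned"
    else
        echo "sector ${start}: misaligned (next boundary in $(( physical - remainder )) bytes)"
    fi
}

check_alignment 34
check_alignment 40
check_alignment 63
```

Sector 63 comes out exactly one 512 byte sector short of a boundary, which is why shifting everything by one sector lines such partitions up.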

Creating the partitions

Using sector 34 as the starting point, I created the misaligned partition using GNU Parted, and then created an ext4 filesystem:

ocean ~ # parted /dev/sdc
GNU Parted 3.1
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mkpart ext4 0s -1s
Warning: You requested a partition from 0.00B to 3001GB (sectors 0..5860533167).
The closest location we can manage is 17.4kB to 3001GB (sectors 34..5860533134).
Is this still acceptable to you?
Yes/No? y
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? i
(parted) q
Information: You may need to update /etc/fstab.

ocean ~ # time mkfs.ext4 /dev/sdc1
mke2fs 1.42 (29-Nov-2011)
/dev/sdc1 alignment is offset by 3072 bytes.
This may result in very poor performance, (re)-partitioning suggested.
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
183148544 inodes, 732566637 blocks
36628331 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
22357 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done


real    0m29.931s
user    0m1.671s
sys     0m0.293s

Here’s the same procedure for the aligned partition:

ocean ~ # parted /dev/sdc
GNU Parted 3.1
Using /dev/sdc
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) rm 1
(parted) mkpart ext4 40s -1s
Warning: You requested a partition from 20.5kB to 3001GB (sectors 40..5860533167).
The closest location we can manage is 20.5kB to 3001GB (sectors 40..5860533134).
Is this still acceptable to you?
Yes/No? y
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? i
(parted) q
Information: You may need to update /etc/fstab.

ocean ~ # time mkfs.ext4 /dev/sdc1
mke2fs 1.42 (29-Nov-2011)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
183148544 inodes, 732566636 blocks
36628331 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
22357 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
    102400000, 214990848, 512000000, 550731776, 644972544

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done


real    0m14.333s
user    0m1.646s
sys     0m0.289s

There are two interesting things to note. First, mkfs warns you when your partition alignment is incorrect. Second, initializing the ext4 filesystem was significantly faster on the aligned partition, validating both the warning from mkfs and the initial testing. Note that parted warns about improper alignment in BOTH cases. It turns out parted is only happy with 1MB alignment (for SSDs), which is too conservative in this case.

Testing “real world” performance

To do my actual testing, I created a simple script that tested a small aspect of “real world” performance. I wanted to test writing both small and large files, as well as some reads. As a Gentoo user, I realized that simulating an update of the Portage ebuild tree would represent a good small file use case. For those not familiar with Gentoo, the Portage ebuild tree is a collection of text files used to automate the compilation of system packages. On my system, it currently consists of 137453 files in 23876 directories totaling 720MB on disk. To simulate the action of updating the ebuild tree, I extracted an old and new snapshot to tmpfs, then used rsync to copy the old, and then new snapshot to the same location on disk.

For large file performance, I tested copying a 4.4GB file from tmpfs to disk.

Here’s the full script that allows me to create and mount a new filesystem, run the tests, and then unmount the filesystem in one step:

#!/bin/bash -ex

mkfs.ext4 /dev/sdc1 > /dev/null
mount /dev/sdc1 /mnt/test

time rsync -aH /root/tmpfs/old/ /mnt/test
time rsync -aH /root/tmpfs/latest/ /mnt/test

time cp /root/tmpfs/bigfile /mnt/test

umount /mnt/test

Results

I ran my test setup 3 times for both the aligned and misaligned partition, recreating the partition and filesystem after each test. Here’s the average of all 3 test results:

                                         Rsync old snapshot  Rsync new snapshot  Copy big file
Misaligned partition (sector 34)         9.046s              0.877s              45.837s
Correctly aligned partition (sector 40)  7.399s              0.939s              33.384s
Speedup for correct alignment            18.2%               -7.0%               27.2%
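The speedup row is the reduction in elapsed time relative to the misaligned run: (misaligned - aligned) / misaligned. As a sketch, the averages and percentage for the first column can be recomputed from the per-run timings listed at the end of this post. The helper names `mean` and `speedup` are hypothetical, and awk does the floating point math because bash arithmetic is integer only.

```bash
#!/bin/bash
# Mean of the timings (in seconds) passed as arguments, to 3 decimals.
mean() {
    awk 'BEGIN { for (i = 1; i < ARGC; i++) sum += ARGV[i]
                 printf "%.3f\n", sum / (ARGC - 1) }' "$@"
}

# Percent reduction in elapsed time: (slow - fast) / slow * 100.
speedup() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f%%\n", (slow - fast) / slow * 100 }'
}

misaligned=$(mean 8.994 9.052 9.093)   # rsync of the old snapshot
aligned=$(mean 7.408 7.115 7.674)
echo "rsync old snapshot: ${misaligned}s vs ${aligned}s, $(speedup "$misaligned" "$aligned")"
```

The other columns can be checked the same way by passing in their three timings.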

Testing Conclusion

Based on the tests, there is a significant real world performance speedup when using a correctly aligned partition, both for large and small writes.

Interestingly, there is a small performance penalty shown during the second test. I’m going to assume this test wasn’t valid: I grabbed portage snapshots only a few days apart, meaning the changes to be synced are minimal. It’s doubtful that program execution times below one second are even accurate enough to be meaningful. If someone else can come up with an explanation though, I’d love to hear it.

Future work?

After doing all this testing, I started to wonder if the partitions on my SSDs are aligned correctly. SSDs are even more prone to write amplification, partially due to the fact that flash storage generally has to erase in large blocks (up to 256KB). Hopefully in the next couple weeks I’ll have time to write another blog post about it.

Unaligned performance with 512 byte sectors

Just for fun, I wanted to see if there was a theoretical performance penalty for 4KB writes on a hard drive with 512 byte physical sectors, so I ran the write amplification script on an old 640GB drive that my new 3TB drive is replacing.

pismo ~ # time ./testWriteAmplification 0

real    0m16.799s
user    0m0.000s
sys     0m0.046s
pismo ~ # time ./testWriteAmplification 1

real    0m22.654s
user    0m0.000s
sys     0m0.066s

Surprisingly, there was a performance penalty, although not as significant (I ran the test multiple times and the performance is consistent with the times shown above). I imagine even hard drives with 512 byte sectors are optimized for writes aligned at 4KB. The takeaway here is that it’s important for all partitions, regardless of the underlying sector size, to be aligned correctly.

Reference

For a full summary of the state of 4KB sector issues, the Linux ATA wiki has a comprehensive page.

Full data from real world testing

For reference, here’s all the performance data from my test script.

Misaligned Partition

        Rsync old snapshot  Rsync new snapshot  Copy big file
Test 1  8.994s              0.878s              44.044s
Test 2  9.052s              0.878s              47.626s
Test 3  9.093s              0.876s              45.841s

Correctly aligned Partition

        Rsync old snapshot  Rsync new snapshot  Copy big file
Test 1  7.408s              1.008s              32.945s
Test 2  7.115s              0.919s              31.746s
Test 3  7.674s              0.890s              35.461s

Where Are You Going Next?

Whenever you leave a job, the most important question is “where are you going next?”. Having quit my last job in March, and not yet permanently settled on anything, I have put considerable thought into this question. Immediately, I knew quite a few things. I wanted to work somewhere small, where my contributions will be significant and varied, and I can learn many things. I wanted to work with amazing people, who will push me to become better myself. Finally, and most importantly, I wanted to work on something that without any reasonable doubt is a net positive for the world.

As a software developer in the Bay Area, especially one lucky enough to have had some great experience in only a few short years of employment, those first two criteria are not especially hard to meet. But is it even possible for a business to be sure it is providing not just something that people will pay for, but something that is “good”? Companies like AT&T and Sprint provide valuable services, but, as seems inevitable for large companies, provide terrible customer service, engage in shady lobbying and business dealings, and receive near universal loathing even from their continued customers. Meanwhile, providers of enjoyable, but dubiously valuable services like Facebook and Twitter are essentially in the position of having to degrade their free product in order to make money, again causing incredible discontent among their users. Wouldn’t nearly any business find a way to anger someone in a way they can’t make a business case to remedy?

On the opposite end of the spectrum, well meaning people with a desire similar to mine have founded countless startups with charitable goals in mind. There must be a thousand new companies intent on educating, feeding, or providing technology for those in less fortunate areas of the world founded this year alone.

While their goal is always noble, I have talked with many of these companies and have yet to meet one with a feasible business plan, and in fact many seem hopelessly naive in their disregard for profitability or general business sense. Coming from me, this is a strong criticism.

When I first started looking for a new company to join in March, I mostly focused on consumer focused companies. B2B companies, while often very profitable, seemed quite frankly boring. While the technical challenges may be there, who wants to labor hard and long only to build better account tracking software? Even worse, many of the inefficiencies, frustrations, and restrictions that pushed me away from a comfortable job at an increasingly large company are even more prevalent, indeed pervasive, with B2B. I turned down invitations to interview at many B2B focused companies even with extremely interesting technical goals.

But as time went on, I became more and more frustrated with consumer focused startups. Most startups (even many that are well known and have raised considerable money) barely can make the case that they are making something useful, and the chances of providing this useful thing profitably are near zero. I have nothing against acquisitions, but a company that seems from the outset to have no exit other than an acqui-hire is unappealing. Conversely, there are plenty of companies that make things people want, but don’t need. At best, these companies are like McDonald’s: they make something no one should eat, but many claim to love anyway. At worst they are like cigarette companies.

But over time, I noticed a trend among companies I truly respect and admire, and with Yonas Beshawred’s recent post, I know what to call them: B2D companies.

And while just targeting software developers doesn’t instantly guarantee you’ll run an ethical and profitable company, there are a lot of reasons why it might be more likely. Most obviously, whatever it is you’re doing, the expectation is simple: you are selling something that can save a considerable amount of time. For pretty much every software developer, saving time is worth spending money. Furthermore, software developers have specific needs and they will tolerate exactly zero bullshit. Your product had better do what it says, with style, painlessly, or they will never come back. If you’re lucky they will write a critical tweet on their way out.

If the description above makes you think selling a product to software developers isn’t easy, you’re right. But really, that’s exactly the point. Like many software developers, nowhere on my list of things I look for in a job does it say I don’t want a challenge; indeed, I require a challenge. Finally, as Paul Graham recently wrote, the best way to come up with an idea for a startup is to find a way to improve something you already know about. For someone like myself, for whom software development is more than a job or even a career, but is in fact a way of life, solving other developers’ problems seems only natural.

I haven’t decided what I’ll do next, but I know I’ll be paying close attention to any chance to make something for other developers.

I Forgot How Frustrating SVN Can Be

Yesterday, I was installing the latest KDE 4.10 beta, which is built from source. The Gentoo KDE overlay already has ebuilds, which use KDE’s SVN as the source, rather than a tarball. While installing all the packages, there was a failure to install kdeartwork-wallpapers:

 * Package:    kde-base/kdeartwork-wallpapers-9999
 * Repository: kde
 * Maintainer: kde@gentoo.org
 * USE:        amd64 elibc_glibc kernel_linux userland_GNU
 * FEATURES:   sandbox splitdebug userpriv usersandbox
>>> Unpacking source...
 * Fetching disabled since 1 hours has not passed since last update.
 * Using existing repository copy at revision 1326016.
 *    working copy: /usr/portage/distfiles/svn-src/kdeartwork/kdeartwork

 * Exporting parts of working copy to /var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/work/kdeartwork-wallpapers-9999
rsync: link_stat "/usr/portage/distfiles/svn-src/kdeartwork/kdeartwork/wallpapers" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]
 * ERROR: kde-base/kdeartwork-wallpapers-9999 failed (unpack phase):
 *   {ESVN}: can't export subdirectory 'wallpapers' to '/var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/work/kdeartwork-wallpapers-9999/'.
 * 
 * Call stack:
 *     ebuild.sh, line   93:  Called src_unpack
 *   environment, line 4224:  Called kde4-meta_src_unpack
 *   environment, line 3403:  Called kde4-meta_src_extract
 *   environment, line 3309:  Called die
 * The specific snippet of code:
 *                       rsync --recursive ${rsync_options} "${wc_path}/${subdir%/}" "${S}/${targetdir}" || die "${escm}: can't export subdirectory '${subdir}' to '${S}/${targetdir}'.";
 * 
 * If you need support, post the output of `emerge --info '=kde-base/kdeartwork-wallpapers-9999'`,
 * the complete build log and the output of `emerge -pqv '=kde-base/kdeartwork-wallpapers-9999'`.
 * This ebuild used the following eclasses from overlays:
 *   /var/lib/layman/kde/eclass/kde4-meta.eclass
 *   /var/lib/layman/kde/eclass/kde4-base.eclass
 *   /var/lib/layman/kde/eclass/kde4-functions.eclass
 *   /var/lib/layman/kde/eclass/cmake-utils.eclass
 * This ebuild is from an overlay named 'kde': '/var/lib/layman/kde/'
 * The complete build log is located at '/var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/temp/environment'.
 * Working directory: '/var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/work/kdeartwork-wallpapers-9999'
 * S: '/var/tmp/portage/kde-base/kdeartwork-wallpapers-9999/work/kdeartwork-wallpapers-9999'

This is a simple package, and since it is a live ebuild, I assumed a directory had been moved and set about debugging the problem. I spent a good amount of time looking into the KDE eclasses, trying to work out how to determine what was being built. Only after a while of this, during another attempt at compiling the package, did I realize that SVN could be the culprit.

A little digging revealed that indeed SVN, for reasons unknown, had done only a partial checkout of the repository. Indeed, manually trying to check out the entire required part of the repository never succeeded in one go: it took multiple manual resumes to check out the complete repository. Somehow, this error was never detected by Portage.
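Manual resumes like these could be automated with a small retry loop. This is only a sketch: the `retry` helper is hypothetical, and it assumes that rerunning `svn update` inside the working copy picks up an interrupted checkout where it stopped, which is not something Portage does for you.

```bash
#!/bin/bash
# Run a command until it succeeds, giving up after a maximum number of attempts.
# $1: maximum attempts; remaining arguments: the command to run.
retry() {
    local max=$1 attempt=1
    shift
    until "$@"; do
        if (( attempt >= max )); then
            echo "giving up after ${attempt} attempts" >&2
            return 1
        fi
        attempt=$(( attempt + 1 ))
        sleep 1
    done
}

# Example (run inside the partially checked out working copy):
#   retry 10 svn update
```

Capping the number of attempts keeps a permanently broken checkout from looping forever.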

Having not even touched SVN for 8 months, after several years of begrudgingly using it, I had totally forgotten the sort of pain it can cause.

GRUB2 Is Coming Along Nicely

Today I decided to resurrect an old piece of hardware I’ve had sitting useless for a while: an Asus Eee PC 1005PE.

As an experiment, and to pave the way for future experimentation, I decided to use GRUB2. I’m also trying out the no-multilib profile, so grub 0.xx is not supported.

I’m pleased to report that using GRUB2 on a normal, BIOS booting system was extremely easy. Here’s some information about how I did it.

Partition Table

I’m using a single EXT4 partition, the simplest possible setup.

pebble ~ # cat /etc/fstab
# /etc/fstab: static file system information.
#
# noatime turns off atimes for increased performance (atimes normally aren't
# needed); notail increases performance of ReiserFS (at the expense of storage
# efficiency).  It's safe to drop the noatime options if you want and to
# switch between notail / tail freely.
#
# The root filesystem should have a pass number of either 0 or 1.
# All other filesystems should have a pass number of 0 or greater than 1.
#
# See the manpage fstab(5) for more information.
#

# <fs>                  <mountpoint>    <type>          <opts>          <dump/pass>

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
/dev/sda1               /               ext4            noatime         0 1

Installation

I installed Gentoo’s GRUB 2.00-rc1 package.

echo "~sys-boot/grub-2.00" >> /etc/portage/package.accept_keywords/grub

Setup

Since I’m using the standard make install to install kernel images to /boot, simply running grub2-mkconfig sets everything up:

pebble ~ # grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.5.7-gentoo
Found linux image: /boot/vmlinuz-3.5.7-gentoo.old
done

That’s it! While there’s more going on behind the scenes than with GRUB1’s simple grub.conf file, in the case of a BIOS-driven setup, everything Just Works.

What Does It Take to Survive an EBS Outage?

| Comments

On October 22nd, EBS issues in a single availability zone brought down all sorts of websites, big and small. EBS is an important and impressive (read: complicated) part of AWS, but this isn’t the first time we’ve seen it underperform, and I was honestly not expecting so many well known websites with extremely talented teams to be bitten again.

Having worked with EC2 at scale for almost three years, I’ve seen several similar incidents and know a bit about how to weather the storm (since a hurricane has previously caused AWS issues, this is more than a figure of speech). So, what does it take to survive an AWS outage? More specifically: how can we build a system with data stored on EBS while gracefully handling its more ugly moments?

The Setup

EBS has a lot going for it: it’s pretty cheap (storing 100GB will only cost you $10/month), you can create snapshots at any time for backup, and when an EC2 instance fails, any EBS volumes can easily be remounted on a new instance. EBS very rarely loses data, and it’s common to use RAID 1 (or 10) for even more safety. Lots of people have their data on EBS and still sleep well at night.

But when EBS acts up, your data can essentially go missing. Without good preparation, there’s not much to do other than wait for EBS to recover.

What can be done to prepare? Two pieces are particularly important: replication across availability zones and quick (preferably automated) failover.

Using Multiple Availability Zones

Due to the nature of EBS, it has been particularly vulnerable to cascading failures. A small trigger can cause outages that affect nearly an entire availability zone. Without having data and instances ready to go in multiple zones, there’s no way to quickly recover.

Most serious deployments in EC2 will replicate data to one or more slave databases. But it seems it’s less common to ensure that the master and slave are in different availability zones.

How much will it cost?

While transfers within the same availability zone are free, Amazon charges a whopping 1 cent per GB sent to another availability zone in the same region.

What’s a reasonable expected cost for replicating from one database instance? According to Scalyr’s EC2 benchmarks, we can expect a maximum of about 2000 4KB writes per second to a single EBS volume on a small EC2 instance. That’s only 8MB/sec, or a maximum of 675GB per day, for which Amazon will charge $6.75. Meanwhile, those two small instances have an on-demand rate of $3.84 per day.

Keep in mind 8MB/sec is an absolute maximum. In practice, even with large EC2 instances (which Scalyr showed to have double the EBS throughput, but cost four times as much), a more sustainable throughput is at best half of that. It’s reasonable to expect that the cost of replicating to another availability zone won’t exceed the cost of the database instances.
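The arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post (2000 writes/sec, 4KB per write, 1 cent per GB, $3.84 per day for two small instances). The function names are made up for illustration, and the rounding convention (1000 KB per MB, 1024 MB per GB) is an assumption chosen because it reproduces the 675GB figure.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def daily_transfer_gb(writes_per_sec, write_kb):
    """GB replicated per day (assumes 1000 KB per MB and 1024 MB per GB)."""
    mb_per_sec = writes_per_sec * write_kb / 1000
    return mb_per_sec * SECONDS_PER_DAY / 1024

def daily_cost_dollars(gb_per_day, cents_per_gb=1):
    """Cross-zone transfer cost per day at the given price per GB."""
    return gb_per_day * cents_per_gb / 100

# Figures quoted in the post.
peak_gb = daily_transfer_gb(2000, 4)               # absolute maximum write rate
peak_cost = daily_cost_dollars(peak_gb)            # cost at that maximum
sustained_cost = daily_cost_dollars(peak_gb / 2)   # "at best half of that"
instances_per_day = 3.84                           # two small on-demand instances

print(peak_gb, peak_cost, sustained_cost)
print(sustained_cost < instances_per_day)
```

At the absolute maximum the transfer bill is higher than the instance bill, but at the more realistic half rate it drops below it.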

What about sending data back to non-database instances in another zone? Scalyr found that EBS read performance is actually about an order of magnitude worse than write performance. Even assuming maximum read AND write performance can be obtained simultaneously, it isn’t significantly more expensive than just maxing out on writes.

A little bit of extra safety

Amazon describes availability zones as “distinct locations that are engineered to be insulated from failures in other Availability Zones”. However, there has been at least one incident where an entire region was affected.

If the possibility of downtime from this scenario is unacceptable, you have to consider replicating across regions instead of availability zones. Latency will be higher between regions, and bandwidth costs will follow the public data transfer pricing, which starts at 12 cents/GB. However, if surviving this level of catastrophe is important, the costs are surely worth it.

How to manage failover

While it’s replication across zones that allows for any chance to survive major EBS outages, quickly reacting to the outage is what will really minimize downtime.

Fortunately, in the last few years this has gotten quite a bit easier. There are many databases supporting master-master replication, automatic leader election, and all sorts of other ways to switch over without any manual action. MongoDB has replica sets; CouchDB replication supports multi-master configuration and was specifically designed to handle extended downtime (there are people running CouchDB on their cell phones!). Even MySQL has good support for master-master replication and automatic failover now.

If the particular database you’re using doesn’t support automatic failover, at the very least build a small tool that lets you manually fail over to another set of databases. Test it heavily ahead of time and make it easy to use: there’s nothing worse than scrambling to fix a problem only to mistakenly make it worse, and any time spent fiddling is time where your site is down.

Simulating behavior during an EBS outage

Simulating EBS failures is hard. The EBS volume doesn’t disappear; it just stops responding to requests. The instance mounting the EBS volume is generally unaffected overall.

The closest thing I can come up with to the behavior of a stuck EBS volume is to mount an NFS volume, and then shut down the NFS daemon. Any process reading or writing to the NFS volume will hang until the volume is reconnected.

Fortunately, this is enough to demonstrate a very important and dangerous type of issue.

Handling EBS issues on the client side

Most out-of-the-box automatic failover setups work great when one of the databases fails. But as mentioned above, EBS issues don’t cause the database to completely fail. There will be very high iowait, and any processes using EBS-backed data may hang, but the instance in general continues running just fine. Importantly, connecting to a database instance with a ‘stuck’ EBS volume will generally succeed. However, remote requests for writing (and reading, depending on the specifics of the database’s caching) will generally not return any data.

Most libraries for working with databases will specify both a connection timeout and a “general” timeout, but the general timeout often defaults to unlimited. This is reasonable, since any specific timeout would limit long-running jobs, but not setting this to a relatively small value can sabotage any failover mechanisms.

To see what happens, let’s use CouchDB and run a little test using the methodology described above. CouchDB uses HTTP as an interface, so it’s dead simple to work with.

Here’s the test procedure:

  1. Set up two machines on the same network.
  2. Machine A exports a directory via NFS, which will simulate our EBS volume.
  3. Machine B mounts the NFS directory, and uses it to store CouchDB’s data.
  4. After starting CouchDB, the NFS daemon is stopped.
  5. curl is used to simulate database traffic.

And here’s the result:

#everything works initially
user@machineB ~ $ curl http://127.0.0.1:5001/sampledb/doc1
{"_id":"doc1","_rev":"1-15f65339921e497348be384867bb940f","hello":"world"}
#NFSd is stopped on machine A
#now requests hang forever
user@machineB ~ $ curl http://127.0.0.1:5001/sampledb/doc1
^C
#this hangs forever too: the connection succeeded
user@machineB ~ $ curl http://127.0.0.1:5001/sampledb/doc1 --connect-timeout 5
^C
#this correctly times out
user@machineB ~ $ curl http://127.0.0.1:5001/sampledb/doc1 -m 1
curl: (28) Operation timed out after 1001 milliseconds with 0 bytes received

Any mechanism expecting the --connect-timeout option to protect against misbehaving instances will be completely defeated when EBS starts acting up.

Be sure to look into specifically what timeout options every DB client library gives. The names may differ, but generally there is an option for connection timeouts, and another timeout for the length of the entire request. Unfortunately, the connection timeout is often the only one prominently mentioned in documentation.
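The same distinction can be reproduced without NFS or CouchDB. The following is a rough Python sketch, not how any particular DB client library works: a local TCP server accepts connections but never answers, standing in for a database sitting on a stuck volume. The names start_stuck_server and request_with_timeouts are made up for this example. The connect succeeds right away, so only the separate read timeout saves the client from hanging.

```python
import socket
import threading

def start_stuck_server():
    """Listen on a free local port, accept connections, but never reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    held = []  # keep accepted sockets open so the client never sees a close

    def accept_forever():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)

    threading.Thread(target=accept_forever, daemon=True).start()
    return listener, listener.getsockname()[1]

def request_with_timeouts(port, connect_timeout, read_timeout):
    """Send a request; report whether a reply arrived within read_timeout."""
    # The connect timeout only covers establishing the TCP connection.
    sock = socket.create_connection(("127.0.0.1", port), timeout=connect_timeout)
    try:
        # A separate, short timeout for the rest of the exchange.
        sock.settimeout(read_timeout)
        sock.sendall(b"GET /sampledb/doc1 HTTP/1.0\r\n\r\n")
        data = sock.recv(4096)
        return "ok" if data else "closed"
    except socket.timeout:
        return "timed out"
    finally:
        sock.close()

listener, port = start_stuck_server()
result = request_with_timeouts(port, connect_timeout=5.0, read_timeout=0.5)
print(result)  # the connection succeeds, but the read gives up after 0.5s
```

With read_timeout set to None instead, the recv call would block forever, just like the curl calls without -m.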

Handling everything else

I’ve covered one particularly important aspect of handling outages, but of course there’s more to it. As Amazon mentioned in their summary, EBS is one of the building blocks of several other components of AWS. Having a service completely immune to EBS-backed database issues is no good if you aren’t also prepared for ELB issues. And of course any other instance in your infrastructure can flat out fail at any time. We can’t prepare for everything, but single-AZ EBS outages are one thing we should all be able to handle.

Help Requested Fixing Macbook Pro Related Kernel Regression

| Comments

UPDATE: The issue mentioned below has been fixed and will be in Kernel version 3.5!

As I mentioned in past posts, I’ve been working on getting my new 15” MacbookPro working completely with Linux. Good progress has been made on a lot of fronts, but there is a new regression in the i915 drivers that is holding me (and likely others) back. I added a post to the long MacbookPro thread on ubuntuforums. Here’s what I posted for anyone who would like to help.

Hi all,

Recently, the team of developers working on improvements to the i915 linux kernel drivers that are needed for many recent Macbook Pro models have been adding several enhancements to the drivers that benefit Macbook Pro users. The most notable one is that in their most recent git kernel tree, the number of LVDS channels is automatically set properly for the Macbook Pro, eliminating the need to apply a patch and pass the i915.lvds_channels=2 option as many on this thread (including myself) have done. Another change is the option to disable the intel_backlight controls, allowing the gmux_backlight control to take over which (for me at least) is the only working backlight control for a Macbook Pro.

However, recently they inadvertently introduced a regression that is also causing a similar black screen. I reported a bug and the developers are taking a look, but I suspect they do not have access to any Macbooks to test with. If any readers of this post have a modern (say, 6,x or above) Macbook Pro running linux in EFI mode, are comfortable building their own kernel from git, and could help out with the bug, I’m sure it would be greatly appreciated and would help get these improvements into the mainline kernel faster.

Here’s what you can do:

1.) Read the bug report at https://bugs.freedesktop.org/show_bug.cgi?id=49518
2.) Get the development git sources from git://people.freedesktop.org/~danvet/drm-intel
3.) Confirm that commit e646d57 from the git sources introduces a black screen
4.) Post to the bug report the result of any patches by kernel developers. Include all the information a kernel developer would need to diagnose the issue: the system name of your Macbook (eg MacbookPro8,1), the full dmesg output with drm.debug=0xe added to your kernel parameters, and the output of the intel_reg_dumper command that is part of the intel-gpu-tools package located at http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ (version 1.1 works for me).
5.) Otherwise help out however you can

Thanks and happy hacking!

Debugging Amarok Transcoding to iPod

| Comments

Thanks to Matěj Laitl, Amarok now supports transcoding music when copying to iPods and iPhones.

However, while attempting to copy some FLAC encoded music to my iPod while transcoding to ALAC, I discovered it did not work, with only an unhelpful error message displayed.

A screenshot of an unhelpful Amarok dialog

Launching amarok with the --debug flag, I was able to see a little more detail:

amarok: Transcoding from KUrl("file:///home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-%20 01 - Trouble Maker.flac") to KUrl("file:///media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a")
amarok: BEGIN: Transcoding::Job::Job(const KUrl&, const KUrl&, const Transcoding::Configuration&, QObject*)
amarok:   BEGIN: void Transcoding::Job::init()
amarok:     foo
amarok:     ("-acodec", "alac")
amarok:     "FFMPEG call is " ("ffmpeg", "-i", "/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac", "-acodec", "alac", "-map_meta_data", "/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a:/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac", "/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a")
amarok:   END__: void Transcoding::Job::init() [Took: 0s]
amarok: END__: Transcoding::Job::Job(const KUrl&, const KUrl&, const Transcoding::Configuration&, QObject*) [Took: 0s]
amarok: BEGIN: virtual void Transcoding::Job::start()
amarok:   starting ffmpeg
amarok:   call is  ("ffmpeg", "-i", "/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac", "-acodec", "alac", "-map_meta_data", "/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a:/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac", "/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a")
amarok:   ""
amarok:   ffmpeg started
amarok: END__: virtual void Transcoding::Job::start() [Took: 0.002s]
amarok: BEGIN: void Transcoding::Job::transcoderDone(int, QProcess::ExitStatus)
amarok:   ""
amarok:   NAY, transcoding fail!
amarok: END__: void Transcoding::Job::transcoderDone(int, QProcess::ExitStatus) [Took: 0s]

So the transcoding appears to be failing. Fortunately we have a big help to figure out why: we have the exact command parameters used for the transcoding. On the surface of course, the command and parameters look completely legitimate. To see what was broken, I just had to run the transcode command myself. The following output was the result:

ffmpeg version 0.10.2 Copyright (c) 2000-2012 the FFmpeg developers
  built on Apr 14 2012 00:48:59 with gcc 4.5.3
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --mandir=/usr/share/man --enable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags='-O2 -pipe -march=native -fomit-frame-pointer' --extra-cflags='-O2 -pipe -march=native -fomit-frame-pointer' --extra-cxxflags='-O2 -pipe -march=native -fomit-frame-pointer' --disable-static --enable-gpl --enable-version3 --enable-postproc --enable-avfilter --disable-stripping --disable-debug --disable-network --disable-vaapi --disable-vdpau --enable-libmp3lame --enable-libvo-aacenc --enable-libvorbis --enable-libx264 --enable-libxvid --disable-indev=v4l --disable-indev=oss --disable-indev=jack --enable-x11grab --disable-outdev=oss --enable-libfreetype --disable-amd3dnow --disable-amd3dnowext --disable-altivec --disable-avx --disable-vis --disable-neon --cpu=host --enable-hardcoded-tables
  libavutil      51. 35.100 / 51. 35.100
  libavcodec     53. 61.100 / 53. 61.100
  libavformat    53. 32.100 / 53. 32.100
  libavdevice    53.  4.100 / 53.  4.100
  libavfilter     2. 61.100 /  2. 61.100
  libswscale      2.  1.100 /  2.  1.100
  libswresample   0.  6.100 /  0.  6.100
  libpostproc    52.  0.100 / 52.  0.100
[flac @ 0x864320] max_analyze_duration 5000000 reached at 5015510
Input #0, flac, from '/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac':
  Metadata:
    ARTIST          : Trouble Maker
    TITLE           : Trouble Maker
    ALBUM           : Trouble Maker
    DATE            : 2011
    GENRE           : K-Pop
    album_artist    : 트러블 메이커
    STYLE           : Trouble Maker
    track           : 1
    TOTALTRACKS     : 4
    disc            : 1
    TOTALDISCS      : 1
  Duration: 00:03:41.37, bitrate: 1004 kb/s
    Stream #0:0: Audio: flac, 44100 Hz, stereo, s16
Unrecognized option 'map_meta_data'
Failed to set value '/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a:/home/spectre256/music/Trouble Maker/Trouble Maker/Trouble Maker - Trouble Maker-  01 - Trouble Maker.flac' for option 'map_meta_data'

It looks like it’s the map_meta_data option causing issues.

After looking through the ffmpeg repository a bit, it seems map_meta_data was removed in version 0.10 after being deprecated in version 0.7. There is a new option called map_metadata that replaces it, but with different syntax. However, for simply transcoding an audio file, neither of these options is needed: the only time they should have to be specified is when doing something such as transcoding a movie with multiple audio tracks.

This means the fix is as simple as ensuring that the map_meta_data option is removed. Amarok has a large codebase but it is actually very friendly to new developers, so understanding where and how to fix this turned out to be straightforward. Here, the relevant code is in src/transcoding/TranscodingJob.cpp. This class represents a single job to do one transcode, which will be run in the background. Different transcoders, such as the one found in src/core/transcoding/formats/TranscodingAlacFormat.cpp can set options to the transcoder (in this case it sets the “-acodec alac” option as you would expect), but there are several common options set in the Transcoding::Job::init() method; one of these is the offending map_meta_data option.

The relevant code is included below. It uses a KProcess object from the KDE libraries to run the transcode job in the background. The command and parameters are set, and then the slots for progress and completion are connected to the proper signals.

void
Job::init()
{
    DEBUG_BLOCK
    m_transcoder = new KProcess( this );

    m_transcoder->setOutputChannelMode( KProcess::MergedChannels );

    //First the executable...
    m_transcoder->setProgram( "ffmpeg" );
    //... then we'd have the infile configuration followed by "-i" and the infile path...
    *m_transcoder << QString( "-i" )
        << m_src.path();
    //... and finally, outfile configuration followed by the outfile path.
    const Transcoding::Format *format = Amarok::Components::transcodingController()->format( m_configuration.encoder() );
    *m_transcoder << format->ffmpegParameters( m_configuration )
        << QString( "-map_meta_data" )
        << QString( m_dest.path() + ":" + m_src.path() )
        << m_dest.path();

    //debug spam follows
    debug() << "foo";
    debug() << format->ffmpegParameters( m_configuration );
    debug() << QString( "FFMPEG call is " ) << m_transcoder->program();

    connect( m_transcoder, SIGNAL( readyRead() ),
       this, SLOT( processOutput() ) );
    connect( m_transcoder, SIGNAL( finished( int, QProcess::ExitStatus ) ),
       this, SLOT( transcoderDone( int, QProcess::ExitStatus ) ) );
}

So the exact fix is simply to remove lines 17 and 18 of the listing (the two that append "-map_meta_data" and its "dest:src" argument), so that the map_meta_data option is no longer passed to ffmpeg. It does in fact work: after removing those lines and recompiling I can happily copy all my FLAC encoded music to my iPod in ALAC format!

I submitted the code for review on the Amarok reviewboard. In addition to the transcoding fix, I also included a tiny fix to make debugging easier in the future: in the debug lines above where a QStringList is sent to debug output, use the .join() function to print the command exactly as it would be run, instead of a comma separated list of individual parameters. The full set of changes can be found on Github.
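To illustrate both changes outside of Qt, here is a small Python stand-in. It is not Amarok's code: build_ffmpeg_args is a hypothetical helper, the source path is shortened, and shlex.join (Python 3.8+) only plays the role that QStringList's join plays in the debug output. It shows the argument list with map_meta_data gone, and the difference between printing a list of parameters and printing the command as it would be run.

```python
import shlex

def build_ffmpeg_args(src, dest, codec_params):
    """Hypothetical helper: the ffmpeg argv after the fix, without -map_meta_data."""
    return ["ffmpeg", "-i", src, *codec_params, dest]

args = build_ffmpeg_args(
    "/home/user/music/Trouble Maker/01 - Trouble Maker.flac",  # shortened example path
    "/media/IPOD/iPod_Control/Music/F45/libgpod869044.m4a",
    ["-acodec", "alac"],
)

# Comma-separated list of parameters, similar to the old debug output.
print(args)

# One shell-style command line, with quoting added where a path needs it.
command_line = shlex.join(args)
print(command_line)
```

The joined form can be pasted straight into a terminal, which is exactly what I needed to do while debugging.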

During this whole process, I’ve noticed a couple of other pain points that would be great to improve. If possible I’ll work on these in the near future:

  • When multiple tracks are picked to be transcoded at once, each track is handled in series. I painfully watched htop show me one core working diligently while the others, and the iPod’s disk, sat mostly idle. Ideally multiple tracks could be transcoded at once.
  • Amarok kept telling me about stale tracks on my iPod and telling me I can do something about them. Either I can’t find this option or it doesn’t exist. Either way the UX at least probably has to be improved.
  • When selecting a transcoding option, Amarok currently performs this transcode whether or not it makes sense based on the source filetypes. For example, if I ask Amarok to transcode to ALAC, any lossy filetype would be converted, which would simply waste space and time. It would be great if Amarok could detect and prevent such “upconverting”.

Update:

My changes are now in Amarok git!

Installing Gentoo on a Macbook Pro, Part 3

| Comments

Since last time I’ve made steady progress on various fronts.

WiFi

I actually spent quite a bit of time fiddling with wifi to get it working, when in reality it was quite simple. I am now running linux 3.3, so support for the BCM4331 chip is included by default in the linux kernel. Firmware has to be downloaded and installed, but fortunately the instructions at the b43 driver page work perfectly. The one catch is that in addition to the CONFIG_B43 and CONFIG_B43_PHY_HT kernel options to actually enable the drivers, CONFIG_BCMA is required as well.
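Since a missing CONFIG_BCMA is easy to overlook, a quick check of the kernel config can save a rebuild. This is only a sketch: missing_options is a made-up helper, the sample config text is invented for the example, and on a real system you would read your own .config (on Gentoo that is usually /usr/src/linux/.config, which is an assumption about your setup).

```python
def missing_options(config_text, required):
    """Return the required options that are neither built in (=y) nor modules (=m)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return [opt for opt in required if opt not in enabled]

# The three options named above for the BCM4331.
REQUIRED_FOR_BCM4331 = ["CONFIG_B43", "CONFIG_B43_PHY_HT", "CONFIG_BCMA"]

# Invented sample showing the mistake described above.
sample_config = """\
CONFIG_B43=m
CONFIG_B43_PHY_HT=y
# CONFIG_BCMA is not set
"""

missing = missing_options(sample_config, REQUIRED_FOR_BCM4331)
print(missing)  # ['CONFIG_BCMA']
```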

After this, wifi worked great, although 802.11n support does not yet exist. I installed wicd and everything worked fine. Initially I was running into issues with wicd asking for a password each time I logged into KDE, but adding wicd to the default runlevel fixed that, and now my computer will automatically connect to wifi networks on startup (assuming of course there is no ethernet connection and there is an unsecured or previously configured wireless network nearby).

Annoying Startup Chime

Thus far, every time I boot or reboot (quite often when testing kernel things!) my laptop has made the infamous startup chime. Apparently the only way to turn this off is to boot into OS X and set the volume to 0. Fortunately, I had my untouched OS X install on the original hard drive, and it is possible to boot from an external hard drive via USB, so I was able to boot into OS X once just to mute the volume. I actually went through all the initial setup steps; maybe I didn’t need to go that far and could have just lowered the volume immediately.

FaceTime Camera

I didn’t see much use in even enabling support for the built-in camera until I learned there is a google chat plugin that allows you to do video chatting over gmail. With minimal work I was able to get everything running quickly.

The instructions in the webcam section of the Macbook Pro Gentoo wiki page were all I had to follow. A few steps there are apparently redundant for the newer models; here’s what I did:

1.) Enable kernel support. The only option required is CONFIG_USB_VIDEO_CLASS. I built it as a module.

2.) Unmask and install media-video/isight-firmware-tools. I unmasked version 1.6.

3.) Extract the firmware file. This requires the firmware file from Mac OS X itself. Fortunately I have my original OS X installation on an external hard drive. I had to build the HFS and HFS+ kernel modules, but I was able to easily grab this file. It can be found at

System/Library/Extensions/IOUSBFamily.kext/Contents/PlugIns/AppleUSBVideoSupport.kext/Contents/MacOS/AppleUSBVideoSupport

Then it’s a simple task to extract the relevant firmware:

ift-extract --apple-driver AppleUSBVideoSupport

4.) It’s possible to immediately test if everything is working via mplayer. I first had to enable the v4l and v4l2 use flags and recompile, but once done the webcam should display with the following command:

mplayer tv:// -tv driver=v4l2:width=640:height=480:device=/dev/video0 -fps 20 -vo x11

I didn’t have to set up any dbus rules to create any device nodes.

5.) Finally, simply unmasking and installing www-plugins/google-talkplugin allowed me to video chat via gmail in Firefox.

Intel Graphics

I haven’t made any functional modifications since last time; booting still works fine using simple patches that allow setting the number of LVDS channels manually. However, I spent some time digging into the actual development work being done here. First I found the intel-gfx mailing list while browsing the mailing lists on freedesktop.org. I came across a bug report with a cleaner set of patches. It turns out that even with enhanced LVDS channel detection, the Macbook Pro still doesn’t behave, so a patch adds a quirk entry as a workaround.

It looks like these changes will likely make it into the 3.5 kernel, which is a ways out, but at least progress is being made. Hopefully in the next couple days I will rework my branch of kernel patches to include these much nicer patches. As an added bonus, I now know where the development for these drivers happens. In the future I’ll keep watching that mailing list as well as the git repos of Dave Airlie and Daniel Vetter.

Todo

So far the laptop is working quite well and I have successfully been able to use it as a laptop is meant to be used: by going out and actually using it for real work for extended periods away from home. Battery life is only ok at the moment. However I’ve gotten to the point where I only have a few things left to get working, even if some of them will be a major project.

  • Suspend/hibernate. I’ve fooled with this a bit but still have to hard-reset the laptop any time it goes to sleep (which unfortunately happens whenever I close the lid currently).

  • Graphics switching. I still intend to, eventually at least, be able to switch between the integrated Intel graphics and dedicated AMD graphics at will.

  • Backlight adjustment. I tried a patch for this but it still stays pegged at the max. This should be a big helper for battery life.

  • Tweak Intel graphics kernel parameters. There are various parameters out there that reportedly improve battery life significantly.

  • Debug a USB issue. The last few boots I noticed that for about 5 seconds after the KDE login screen is displayed, no input devices work. Even the capslock key does not light up. This could be related to the fact that I’ve used an extra USB mouse for these last few boots.

  • Disable the trackpad while typing. I didn’t think it would be an issue, but while typing, I often nudge the trackpad and cause Bad Things to happen. There is a fix out there to disable the trackpad while typing.

Installing Gentoo on a Macbook Pro, Part 2

| Comments

It’s been a couple days since I could sit down with my fancy new Macbook, but last night and today I’ve made some good progress. After part 1 I had a nearly booting system, but not quite everything was working yet. While I could get a kernel to load from grub2, it would hang almost immediately. Also grub2 was clearly misconfigured, as it was spewing many errors about missing commands, and I had to manually type in the path to my kernel image on each boot.

I didn’t know enough about grub2 to hope to debug things there yet, and I was sick of the long cycle of waiting for the livecd to load, mounting filesystems required to chroot, and then making changes, so I hoped to be able to simply get a kernel to boot. As it turns out, this was not too hard.

The first main thing I did that was helpful was to set up a local git copy of the kernel code, since I knew I would be applying various patches eventually. I fiddled around with various versions from 3.1 onward, including the latest code merged as part of the newly opened 3.4 window. I actually found that the only key thing required to boot is to ensure that kernel modesetting was not enabled by default for the i915 driver (CONFIG_DRM_I915_KMS): when it’s enabled the kernel would hang unrecoverably (sometimes not even alt+sysrq+b worked), either a few seconds into boot if the i915 driver is compiled into the kernel, or after Gentoo prints the line “waiting for uevents to be processed” if built as a module.

Unfortunately I wasted a lot of time because I borrowed a kernel config used by Ubuntu. It had almost every single kernel option built as a module, including AHCI and ext2/3/4 support. Of course those had to be built into the kernel to finish booting. After disabling various modules one by one, I eventually copied my Gentoo config from gentoo-sources-3.2.1 and modified it. One slightly confusing thing was finding the correct ethernet driver. Despite lspci listing the name as NetXtreme, it is actually the Tigon3 driver that is needed (CONFIG_TIGON3), NOT either of the NetXtremeII drivers (CONFIG_BNX2 or CONFIG_BNX2X).

After this, I was able to fully boot just fine. Next I spent about 20 minutes filling /var/lib/portage/world with all the packages I wanted, ran an eix-sync, and an emerge -eav world. Despite including firefox, chromium, libreoffice, and about 800 other packages, this finished in only about 4 hours!

I fiddled around a bit with starting X, and determined it will take a lot of work. GRUB’s error messages were bugging me, so I decided to fix them first, since it might help me learn a bit about EFI and GRUB2. I read a good amount of the GRUB2 manual, crafted a mail to the gentoo-user mailing list, and finally stumbled across renergy’s forum post that worked perfectly.

The next task was perhaps the most daunting: get a desktop environment working. Oddly enough, for the most basic case at least, it was not that hard. Initially, I was wondering how things would work at all. Any attempts to enable modesetting by default in the i915 driver led to hangs during boot. Yet, using the i915 module at all seems to require kernel modesetting. As it turns out, the trick is to first disable the Radeon graphics card, since presumably the two interfere if you aren’t careful. Mike Dentifrice’s blog has perfect instructions for this. The following steps are all it took:

  1. Start with a Linux 3.1.1 kernel. Using my existing linux git repo, I added another remote for linux-stable pointing to git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  2. Apply the lvds_dual_channel and apple_bl patches. Few others seem to have shared their work, so I pushed a branch with the patches to my linux github repo.
  3. Build and install the kernel. Make sure the i915 driver is compiled into the kernel and kernel modesetting is enabled by default. In other words, set the following config options:
     CONFIG_DRM_I915=y
     CONFIG_DRM_I915_KMS=y
    
  4. Update your grub.cfg to disable the Radeon card and pass required parameters to the kernel. The menuentry for my kernel ended up looking like this:
    menuentry 'Gentoo GNU/Linux, with Linux 3.1.1-00002-g77b9830' --class gentoo --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-3.1.1-00002-g77b9830-advanced-1f2dc33c-2639-42f9-8502-9c7d3f24e7ec' {
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_gpt
        insmod ext2
        set root='hd0,gpt1'
        # switch gmux to IGD
        outb 0x728 1
        outb 0x710 2
        outb 0x740 2
        #powers down ATI
        outb 0x750 0
        if [ x$feature_platform_search_hint = xy ]; then
            search --no-floppy --fs-uuid --set=root --hint-bios=hd0,gpt1 --hint-efi=hd0,gpt1 --hint-baremetal=ahci0,gpt1  1f2dc33c-2639-42f9-8502-9c7d3f24e7ec
        else
            search --no-floppy --fs-uuid --set=root 1f2dc33c-2639-42f9-8502-9c7d3f24e7ec
        fi
        echo    'Loading Linux 3.1.1-00002-g77b9830 ...'
        linux   /boot/vmlinuz-3.1.1-00002-g77b9830 root=/dev/sda1 ro i915.lvds_channels=2 reboot=pci acpi_backlight=vendor
    }
    

Pretty easy, right? Well, there are some problems. Linux 3.1.1 isn’t exactly the newest. Among other things, it doesn’t support the Macbook Pro’s wireless card. Fortunately there’s a patch for that. But it would be nice to be able to use a newer kernel with all the latest updates, right? Plus, all that power in the Radeon graphics card is going to waste all the time. I’ll have to conquer those hurdles another day.

Since I had a working desktop environment now, I decided to quickly see how things were working. Firefox and Amarok worked just fine, but when I tried to watch a few minutes of Firefly, I realized there was no sound! I fiddled with model=mb5 and other settings in /etc/modprobe.d/alsa.conf but it turns out all I had to do was enable SND_HDA_CODEC_CIRRUS. It’s actually the only Intel HDA codec that’s required.

Stay tuned as there clearly will be a part 3 (at least) in the near future.