Categories
Uncategorized

2021W35

This week was fairly quiet: Monday was a bank holiday, and it’s perf season inside Google, so everyone was busy writing about themselves and others instead of writing code…

Unfortunately for me personally, it was a week of going down rabbit holes and yak shaving.

I spent far, far too long visiting the innards of cmp and protocmp in an effort to modernise an older test. The premise was simple: convert all int64 values in a particular message to 1, so that we don’t care about the values, just the presence or absence of the value. I was trying to use FilterMessage(), but couldn’t make it work. Eventually I ended up with a more brute-force approach: alter the entire result, not just the message I was concerned with.

cmp.Transformer("normalize", func(int64) int64 {
  return 1
})
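To make the idea concrete without pulling in cmp, here is a minimal standard-library sketch of the same normalisation. It is not the test described above: the record type, its fields, and the helpers normalize, equalAfterNormalize and i64 are all hypothetical, and reflect stands in for the transformer. A nil *int64 is assumed to mean "absent".

```go
package main

import (
	"fmt"
	"reflect"
)

// record is a hypothetical stand-in for the message under test.
// A nil *int64 means the field is absent.
type record struct {
	ID    *int64
	Count *int64
	Name  string
}

// normalize returns a copy of r in which every non-nil *int64 field
// points at 1, so a later comparison only sees presence or absence.
func normalize(r record) record {
	v := reflect.ValueOf(&r).Elem()
	for i := 0; i < v.NumField(); i++ {
		f := v.Field(i)
		if p, ok := f.Interface().(*int64); ok && p != nil {
			one := int64(1)
			f.Set(reflect.ValueOf(&one))
		}
	}
	return r
}

// equalAfterNormalize compares two records while ignoring int64 values.
func equalAfterNormalize(a, b record) bool {
	return reflect.DeepEqual(normalize(a), normalize(b))
}

// i64 is a small helper for building *int64 literals.
func i64(n int64) *int64 { return &n }

func main() {
	a := record{ID: i64(42), Name: "a"}
	b := record{ID: i64(7), Name: "a"}
	c := record{Name: "a"}

	fmt.Println(equalAfterNormalize(a, b)) // true: both have an ID
	fmt.Println(equalAfterNormalize(a, c)) // false: c has no ID
}
```

Like the brute-force transformer, this touches every int64 it finds rather than one chosen message.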

The other yak shaving relates to a data pipeline. I was switching it from reading some files to reading from a Spanner database. The code change was verbose, but fairly simple. Then I tried to run it and discovered the configuration change I needed was anything but. I ended up with some configuration of command line flags … in a proto message … wrapped in another command flag … embedded in some GCL … in turn embedded in another shell script. Not only was the quoting horrific, but I couldn’t make the GCL do what I needed. I ended up hacking the interpreter to emit errors from an eval() statement which are normally thrown away. This is one of the benefits of having everything in a monorepo, but I really didn’t need to spend hours on this.

Stats: 9 changes submitted, 47 reviewed.

Weeknotes

It’s been a while since I’ve written anything… I’ve recently noticed Simon Willison’s weeknotes summarising what’s happened in the last week. I’ve done this internally since starting at Google, and I now have a large backlog of snippets (as they’re known in there). I really value this for a few reasons.

  • It’s great to write each week and have a fixed point of review, while all the context is still in your head. More often than not, I’m able to look back and realise that I’ve achieved more than I thought I had!
  • It’s a great historical record. Being able to go back in time is invaluable at perf time, but also in the longer term to see how my career has developed.
  • Reading other people’s snippets is an incredibly pleasurable start to the week. Invariably, I find a few interesting strands of work that I would not otherwise have known about (I love reading the Go team’s snippets to see how generics is progressing!)

While I can’t write exactly the same snippets externally, I’m going to see if I can pick out something useful or interesting I’ve done each week.

2021W34

A large part of last week was playing catchup with discussions from the previous week, as I had a few days out at Beautiful Days, which was amazing (The Levellers! Hawkwind! Dreadzone!). However, two things stood out:

  1. I’d made a small change to how one of our pipelines reads data from Bigtable before leaving. A single flag-flip. Unfortunately, I’d forgotten the ACL change that went alongside this, causing my colleagues to waste time debugging (it was missed in code review too … but while code review is nice it’s never perfect).

    And even when the ACL was fixed, the change caused the pipeline to slow from minutes to hours, so the change had to be rolled back anyway. 🤦‍♀️
  2. In another pipeline, I’d rolled out a change to the output format. I enabled it in QA a couple of weeks back, and all seemed to be working fine. So I’d enabled it in production before I left. Unfortunately some downstream pipelines started reporting “no data” a day or so later while I was out. Thankfully, my change was quickly identified and rolled back. But again, I caused my colleagues to spend too long debugging my problems.

So, in conclusion: insufficient carefulness on my part. Both of these issues could have been handled better if I’d taken a bit longer to sit down and think through the consequences of what I was working on.

Other things I touched on:

  • Centralised some RPC clients for services which our team runs, so that we can make changes in a single place instead of contacting affected teams.
  • Investigated memory spike protection in some of our services. This was fun, and led me to conclude that I didn’t understand the tooling well enough to apply it. I need to write a reproducer first before coming back to this.

Stats: 15 changes submitted, 42 reviewed.

Maven Irritation

I’m looking at Maven related stuff for the first time in … perhaps 4 years. This is long enough that I’ve forgotten how irritating it is. My task is simple: redirect the writes from the project’s directory to another location on local disk. This is because the project is located on a filesystem where writes are expensive (something like an NFS filer, but worse). I have to do this for all projects, not just my own, so saying “edit the POM to do X” is not an option. Given how disparate the projects are, there isn’t even a company-wide POM I can alter.

Of course, the Maven documentation tells you everything you don’t want to know. So, off to Stack Overflow.

First is the helpful description of ${project.build.directory}. This is exactly what I want: a way to affect all settings related to the output directory (target by default). See pom-4.0.0.xml for how this works.

Except… it doesn’t work. You can set -Dproject.build.directory=/tmp on the command line, and Maven will happily ignore you. I think this is because AbstractCompilerMojo marks it as read-only.

It turns out there’s another way to do this, via profiles.

<profile>
  <id>move-output-dir</id>
  <activation>
    <activeByDefault>true</activeByDefault>
  </activation>
  <build>
    <directory>${java.io.tmpdir}/maven-builds/${project.groupId}/${project.artifactId}/target</directory>
  </build>
</profile>

This works perfectly! But it has to be done on a pom-by-pom basis. What I want is to use Maven’s settings.xml to do this. So I pasted it into the profile section there and got:

[WARNING] Some problems were encountered while building the effective settings
[WARNING] Unrecognised tag: ‘build’ (position: START_TAG seen …n … @263:14) @ /…/conf/settings.xml, line 263, column 14

It turns out that profiles in settings.xml are different to profiles in pom.xml (“The profile element in the settings.xml is a truncated version of the pom.xml profile element.”)

At this point, I’m kind of stuck. Maven has provided the illusion of configurability, but all attempts to do so have failed.

Return errors or useful values, not both

One of the nice features of Go is returning multiple values natively. e.g.

type Foo struct{…}

func New() (*Foo, error) {
  if err := initSomething(); err != nil {
    return nil, err
  }
  return &Foo{…}, nil
}

What’s interesting is that either the error is nil (and you have a useful value) or the useful value is nil if you have an error. You don’t have to deal with the case of a useful value and an error. In general, this seems to be a good pattern.

Is it ever useful to return both? Possibly. But I’d argue that if you want to do this, it’s better to embed the error into the value. e.g.

type Foo struct {
  Err error
  // …
}

func New() *Foo {
  if err := initSomething(); err != nil {
    return &Foo{Err: err}
  }
  return &Foo{…}
}

Then it’s clear that the value returned may have other useful properties.
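As a rough sketch of how a caller might use this second pattern: the Name field and the fail parameter below are hypothetical additions for illustration, and initSomething is a stand-in that simply fails on request.

```go
package main

import (
	"errors"
	"fmt"
)

// Foo carries its own error, as in the embedded-error pattern above.
type Foo struct {
	Err  error
	Name string // hypothetical field that stays useful even on failure
}

// initSomething is a hypothetical stand-in that fails when fail is true.
func initSomething(fail bool) error {
	if fail {
		return errors.New("init failed")
	}
	return nil
}

// New always returns a non-nil *Foo; callers check Foo.Err.
func New(name string, fail bool) *Foo {
	if err := initSomething(fail); err != nil {
		return &Foo{Err: err, Name: name}
	}
	return &Foo{Name: name}
}

func main() {
	for _, fail := range []bool{false, true} {
		f := New("example", fail)
		if f.Err != nil {
			// The value is still usable for reporting.
			fmt.Printf("%s: unusable: %v\n", f.Name, f.Err)
			continue
		}
		fmt.Printf("%s: ready\n", f.Name)
	}
}
```

The caller never has to guess whether the pointer is nil. The cost is that nothing forces it to look at Err.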

Go interfaces are for consumption

After a recent discussion, I’ve realised something about Go’s interfaces. They’re best if you use them as a consumer rather than as a producer. Taking my own code as an example, I have a storage layer that uses Bigtable.

package storage

type IO interface {
  Read() (string, error)
  Write(string) error
}

type bigtable struct {…}

func New() IO {
	return &bigtable{…}
}

func (bt *bigtable) Read() (string, error) {…}
func (bt *bigtable) Write(string) error {…}

So, we’re creating an interface then returning an instance of it. This allows us to create a fake version simply.

package fakestorage

type Fake struct {
	value string
	err   error
}

func (f *Fake) Read() (string, error) {
	return f.value, f.err
}

func (f *Fake) Write(string) error {
	return f.err
}

The consumer just takes the interface.

package logger

type Service struct {…}

func New(io storage.IO) *Service {
	return &Service{io}
}

So what’s the problem here? Well, we’re constraining the output of storage.New() with an eye towards the Service consumer. But there’s no need: the concrete bigtable type would satisfy the interface anyway. By returning the interface, we can only call the methods it specifies. There’s no possibility of calling something that (e.g.) we only want at startup.

As a bonus, if we return the concrete type, we get far better results in godoc.
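A minimal single-file sketch of the alternative, under some assumptions: Store stands in for the Bigtable-backed type (it just holds a string in memory), and Warm, reader, Describe and fake are hypothetical names. The producer returns the concrete type, and the consumer declares the small interface it actually needs.

```go
package main

import "fmt"

// Store is the concrete type the producer returns. It is an in-memory
// stand-in for the Bigtable-backed storage.
type Store struct{ value string }

// NewStore returns the concrete type, not an interface.
func NewStore() *Store { return &Store{} }

func (s *Store) Read() (string, error) { return s.value, nil }
func (s *Store) Write(v string) error  { s.value = v; return nil }

// Warm is only wanted at startup. Callers holding a *Store can reach
// it, which would not be possible through a narrow returned interface.
func (s *Store) Warm() { s.value = "warm" }

// reader is declared by the consumer and lists only what it uses.
type reader interface {
	Read() (string, error)
}

// Service is the consumer.
type Service struct{ r reader }

func NewService(r reader) *Service { return &Service{r: r} }

// Describe reads once and formats the result.
func (s *Service) Describe() string {
	v, err := s.r.Read()
	if err != nil {
		return "error: " + err.Error()
	}
	return "value: " + v
}

// fake satisfies reader without involving the storage code at all.
type fake struct{ value string }

func (f fake) Read() (string, error) { return f.value, nil }

func main() {
	st := NewStore()
	st.Warm() // startup-only call on the concrete type
	fmt.Println(NewService(st).Describe())
	fmt.Println(NewService(fake{"canned"}).Describe())
}
```

The fake only has to implement Read, because that is all the consumer asked for.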

Script or program

My new definition for “is it a script”: does it have tests? If so, it’s a proper program. If not, it’s unmaintainable junk.

Logging in Go

Dave Cheney has an excellent post, Let’s talk about logging. In it, he dissects the current logging libraries, and what you really need. I pretty much agree with everything he says.

  • Warnings are just more info messages.
  • Log or handle the error — not both.

Where I disagree is with the error logging. It is distinct from info logging. It just needs to be actionable. If there’s an error, somebody needs to do something to fix it. See also Rob Ewaschuk’s Philosophy on Alerting.
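The “log or handle, not both” rule can be sketched roughly as follows. This is an illustration using only the standard library, not code from the post being discussed: lookup, lookupOrDefault and errNotFound are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
)

// errNotFound is a hypothetical sentinel error.
var errNotFound = errors.New("not found")

// lookup adds context and returns the error. It does not log it,
// because the caller decides what happens next.
func lookup(key string) (string, error) {
	if key == "" {
		return "", fmt.Errorf("lookup %q: %w", key, errNotFound)
	}
	return "value-for-" + key, nil
}

// lookupOrDefault handles the error by falling back, so it stays quiet.
func lookupOrDefault(key, def string) string {
	v, err := lookup(key)
	if err != nil {
		return def
	}
	return v
}

func main() {
	logger := log.New(os.Stderr, "", 0)

	// Handled: a fallback is used and nothing is logged.
	fmt.Println(lookupOrDefault("", "fallback"))

	// Not handled: the top level cannot recover, so it logs exactly once.
	if _, err := lookup(""); err != nil {
		logger.Printf("giving up: %v", err)
	}
}
```

Each error ends up either acted upon or logged once, at the point where somebody could do something about it.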

As an aside, I’m glad to see that we’ve open-sourced our internal logging library as glog. It certainly doesn’t meet the criteria above, but it’s nicer than the standard library in a few small ways (like Verbose for developer-debugging logs).

Go Readability

If you haven’t seen them, do take a look at these slides on Go readability, from a member of Google’s Go readability group.

Readability is an important process at Google. In theory, it’s about ensuring the style guide for a language is applied. In practice, it’s also about ensuring that idiomatic code is produced. This is highly language specific, and not something that can easily be done with tooling.

In the case of Go readability, it feels like a mentoring process over a series of code reviews (other languages take a more “big-bang” approach). The end result is that I have a better idea of not just how to write Go, but how we like Go to be written at Google. I really appreciate the strong emphasis on simplicity in Go code. Hopefully, that comes through in the slides.

Rescuing Data with F-Script

I have an old iMac. It’s not in a good state.

Unwell iMac

I don’t care that much: it’s old, slow, and not a lot of use. The only remaining thing of value is the large collection of photos I have on there. I have an external drive, so it’s easy to copy them over.

All the photos are stored in iPhoto 5, which (from googling around) appears to be the version before they moved all the metadata into a big binary blob. And looking in the iPhoto Library folder revealed a file AlbumData.xml which contained details of all 12,000+ photos. Result!

Apart from one slight problem. I’d never used iPhoto’s “albums.” Well, I’d made one or two (mostly by accident). But for the most part, I’d left the photos in the import “rolls”, just renaming (e.g.) “Roll 42” to “Trip to Wales”. Now the rolls are represented in AlbumData.xml, but without the roll names. Just “Roll 1”, “Roll 2”, etc. <facepalm/>

So like any good engineer, I’ll spend a huge amount of time to avoid having to re-enter 340 roll names, which I could have done in a couple of hours. This information is necessary for whatever system I’ll be pulling the photos into afterwards, dammit!

iPhoto obviously knows what the name of each roll is, or it wouldn’t be able to display them. They’re just stored in some different place (a binary blob, as it happens). Thankfully, I can still run iPhoto. So long as I avoid the middle of the screen, anyway. Now I forget exactly where I heard about it (Core Intuition, perhaps?) but I remembered that F-Script Anywhere allowed inspecting a running Cocoa program.

F-Script itself is a Smalltalk-based language, built on top of Objective-C and the Cocoa runtime. The tutorial includes a nice side-by-side comparison with Objective-C. It’s pretty simple for the most part. F-Script Anywhere is an add-on that allows you to inject the F-Script runtime into any running process. I tried it with iPhoto and immediately had a command line and browser with complete access to all of iPhoto. It helped having a minimal knowledge of Cocoa (I could find the app delegate!)

However, this wasn’t enough on its own. I didn’t know my way around all the classes inside iPhoto. To that end, I used class-dump to piece together the classes and their relations.

Eventually, after a few hours¹, I came up with this.

app := NSApplication sharedApplication
appCtrl := app delegate
albumMgr := appCtrl archiveDocument albumMgr

"Each roll (album) has an imageRec whose name is what we need."
rolls := albumMgr rollAlbums

"Pull out the mapping from pseudo-album name to roll name."
keys := rolls name
values := rolls imageRec name
rollNames := NSDictionary dictionaryWithObjects:values forKeys:keys

"Write the result to a plist."
rollNames writeToFile:'/tmp/roll_names.xml' atomically:YES

One very interesting feature of F-Script is the loop. Or lack thereof. Did you see the rolls name above? rolls is an array, which doesn’t respond to the name method. So F-Script has magic to say “send the name message to each element of the rolls array”. There’s more on this in the messaging patterns section of the language guide. But it feels rather nice.
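For comparison, here is roughly what that implicit loop spells out to in a language without it. This Go sketch uses hypothetical roll and imageRec types, which are not iPhoto’s real classes, and builds the same mapping from pseudo-album name to roll name.

```go
package main

import "fmt"

// imageRec and roll are hypothetical stand-ins for the iPhoto objects.
type imageRec struct{ name string }

type roll struct {
	name string   // pseudo-album name, e.g. "Roll 1"
	rec  imageRec // holds the name the roll was given by hand
}

// rollNames does explicitly what `rolls name` and `rolls imageRec name`
// do implicitly: visit each element and collect one property from it.
func rollNames(rolls []roll) map[string]string {
	names := make(map[string]string, len(rolls))
	for _, r := range rolls {
		names[r.name] = r.rec.name
	}
	return names
}

func main() {
	rolls := []roll{
		{name: "Roll 1", rec: imageRec{name: "Trip to Wales"}},
		{name: "Roll 2", rec: imageRec{name: "Roll 2"}},
	}
	fmt.Println(rollNames(rolls)["Roll 1"]) // Trip to Wales
}
```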

The end result is that I do have all the data I need to move my photos into the next system. I can now safely dispose of the broken iMac, much to the relief of the room that’s been harboring it.

As an aside, this whole business of Albums vs Rolls very much feels like an example of technical debt in a system which was rushed. There was no obvious reason to have rolls and albums represented so differently. But instead, they were bolted on in a strange fashion.

¹ Actual time period is undefined.

Happy Fifth Anniversary, Go

So, Go is five years old. Looking back through version control, my first bit of Go was in May 2012. I’ve been using it as my preferred language for the last year or so. It still feels very pleasant and easy to work with. In no small part that’s due to the excellent tooling. The well-written standard library also helps. Go is perhaps the first language where I’ve not had a burning desire to write my own URL type because the built-in one is so awful (as it is in e.g. Python and Java). Inside Google, it’s been said that if you want to understand a piece of infrastructure, you should look at the Go implementation.

Looking back at that first Go code I wrote (a small tool of about 500 lines), it’s far from embarrassing. Certainly when compared to my efforts in other languages. The code has a few useful types, and methods that act upon them. It’s pretty readable (thanks to gofmt). The one bad thing is that I relied on panic far too heavily, instead of returning errors. The code ran happily for a year or so until the need for it was removed.

Anyway, if you’ve not already done so, try writing your next program in Go. You may be surprised at what a pleasant and productive experience it is.