Categories
Uncategorized

External Tests

I saw Eli Bendersky’s File-driven testing in Go post, and really like it. I was using something very similar yesterday. I’ve been attempting to replace a custom parser written in Python with an ANTLR one, with the goal being to run the same parser in both Python and Go. In order to do that, we need test cases that verify the old regex-based parser and the new antlr-based parser produce the same results. In order to do this, I moved all the test cases into a textproto (which is a common answer inside Google!). The output of the parser is another protobuf message, so we can include that directly.

test_case: {
  input: "some/valid/input"
  output: { … expected proto message inline … }
  want_error: false
}
test_case: { … }
// repeat as needed

The schema is pretty simple.

message TestCase {
  string input = 1;
  MyMessage output = 2;
  bool want_error = 3;
}

message TestCases {
  repeated TestCase test_case = 1;
}

This is trivially usable in the Go code, generating a subtest for each case.

func TestParse(t *testing.T) {
  data, err := os.Read("testdata/parser_test_cases.textproto")
  if err != nil { t.Fatal(err) }

  cases := &pb.TestCases{}
  if err := proto.UnmarshalText(data); err != nil { t.Fatal(err) }

  for _, tc := range cases {
    t.Run(tc.GetInput(), func(t *testing.T) {
      got, err := Parse(tc.input)
      if gotErr := (err != nil); gotErr != tc.getWantError() {
        t.Errorf("Parse() got err? %t want err? %t (err: %v)", gotErr. tc.GetWantError(), err)
      }
      if diffs := cmp.Diff(want, got, protocmp.Transform()) {
        t.Errorf("Parse() had unexpected differences (-want +got):\n%s", diffs)
      }
    })
  }
}

This makes it really easy to add new test cases, as I develop new parts of the ANTLR parser. The Python code is very similar to the Go code (both in the ANTLR and regex case). I was pleased to discover that Python has subtests along the way!

Categories
Uncategorized

Return errors or useful values, not both

One of the nice features of Go is returning multiple values natively. e.g.

type Foo struct{…}

func New() (*Foo, error) {
  if err := initSomething(); err != nil {
    return nil, err
  }
  return &Foo{…}, nil
}

What’s interesting is that either the error is nil (and you have a useful value) or the useful value is nil if you have an error. You don’t have to deal with the case of a useful value and an error. In general, this seems to be a good pattern.

Is it ever useful to return both? Possibly. But I’d argue that if you want to do this, it’s better to embed the error into the value. e.g.

type Foo struct {
  Err error
  // …
}

func New() *Foo {
  if err := initSomething(); err != nil {
    return &Foo{Err: err}
  }
  return &Foo{…}
}

Then it’s clear that the value returned may have other useful properties.

Categories
Uncategorized

Go interfaces are for consumption

After a recent discussion, I’ve realised something about Go’s interfaces. They’re best if you use them as a consumer rather than producer. Using my own code as an example. I have a storage layer that uses bigtable.

package storage

type IO interface {
  Read() (string, error)
  Write(string) error
}

type bigtable struct {…}

fun New() Bigtable {
	return &bigtable{…}
}

func (bt *bigtable)Read() (string, error) {…}
func (bt *bigtable)Write() error {…}

So, we’re creating an interface then returning an instance of it. This allows us to create a fake version simply.

package fakestorage

type Fake struct {
	value string
	err   error
}

func (f *Fake)Read() (string, error) {
	return f.value, f.err
}
func (f *Fake)Write() error {
	return f.err
}

The consumer just takes the interface.

package logger

type Service struct {…}

func New(io storage.IO) Service {
	return &Service{io}
}

So what’s the problem here? Well, we’re constraining the output of storage.New() with an eye towards the Service consumer. But there’s no need. The Bigtable struct would adequately satisfy the interface. By returning the interface we can only call the specified methods. There’s no possibility to call something that (e.g.) we only want in startup.

As a bonus, if we return the concrete type, we get far better results in godoc.

Categories
Uncategorized

Logging in Go

Dave Cheney has an excellent post, Let’s talk about logging. In it, he dissects the current logging libraries, and what you really need. I pretty much agree with everything he says.

  • Warnings are just more info messages.
  • Log or handle the error — not both.

Where I disagree is with the error logging. It is distinct from info logging. It just needs to be actionable. if there’s an error, somebody needs to do something to fix it. See also Rob Ewaschuk’s Philosophy on Alerting.

As an aside, I’m glad to see that we’ve open-sourced our internal logging library as glog. It certainly doesn’t meet the criteria above, but it’s nicer than the standard library in a few small ways (like Verbose for developer-debugging logs).

Categories
Uncategorized

Go Readability

If you haven’t seen them, do take a look at these slides on Go readability, from one of Google’s Go readability group.

Readability is an important process at Google. In theory, it’s about ensuring the style guide for a language is applied. In practice, it’s also about ensuring that idiomatic code is produced. This is highly language specific, and not something that can easily be done with tooling.

In the case of Go readability, it feels like a mentoring process over a series of code reviews (other languages take a more “big-bang” approach). The end result is that I have a better idea of not just how to write Go, but how we like Go to be written at Google. I really appreciate the strong emphasis on simplicity in Go code. Hopefully, that comes through in the slides.

Categories
Uncategorized

Happy Fifth Anniversary, Go

So, Go is five years old. Looking back through version control, my first bit of Go was in May 2012. I’ve been using it as my preferred language for the last year or so. It still feels very pleasant and easy to work with. In no small part that’s due to the excellent tooling. The well written standard library also helps. Go is perhaps the first language where I’ve not had a burning desire to write a URL type because the builtin one is so awful (e.g. Python and Java). Inside Google, it’s been said that if you want to understand a piece of infrastructure, you should look at the Go implementation.

Looking back at that first Go code I wrote (a small tool of about 500 lines), it’s far from embarrassing. Certainly when compared to my efforts in other languages. The code has a few useful types, and methods that act upon them. It’s pretty readable (thanks to gofmt). The one bad thing is that I relied on panic far too heavily, instead of returning errors. The code ran happily for a year or so until the need for it was removed.

Anyway, if you’ve not already done so, try writing your next program in Go. You may be surprised at what a pleasant and productive experience it is.

Categories
Uncategorized

Go Strings

I’ve been looking at Go recently. It’s a pleasant language, with few surprises. However, I wondered (as always) what the encoding of a string is supposed to be. For example:

  • Python 2 has two types: str, and unicode. Python 3 has sensibly renamed these to bytes and str, respectively.
  • Perl has a magic bit which gets set to state that the string contains characters as opposed to bytes (it’s called the UTF-8 bit, but it means characters).

So how does Go deal with characters in strings? Given that the authors of Go also invented UTF-8, we can hope it’s been thought about.

There are three types to think about.

byte[]

A slice of bytes.

string

A (possibly empty) sequence of bytes. Strings are immutable.

rune

A single unicode code point. Produced by characters in single quotes.

There’s no explicit encoding in the above. Nonetheless, there’s an implicit preference for UTF-8:

But this doesn’t help the common case:

package main

import "fmt"

func main() {
  s := "café"
  fmt.Printf("%q has length %dn", s, len(s))
}

// "café" has length 5

The unicode/utf8 package can do what’s needed though. This provides functions for, amongst other things, picking runes out of strings.

package main

import (
  "fmt"
  "unicode/utf8"
)

func main() {
  s := "café"
  fmt.Printf("%q has length %dn", s, utf8.RuneCountInString((s)))
}

// "café" has length 4

This is very Go-like. The default is somewhat low-level, but the types and libraries build on top of it. For example, text/scanner provides a nice way of iterating over runes in a UTF-8 input stream.

On a whim, I took a look at the internals of utf8.RuneCountInString(). It’s deceptively simple.

func RuneCountInString(s string) (n int) {
  for _ = range s {
    n++
  }
  return
}

This relies on the spec defining how a string interacts with a for loop: it’s defined as iterating over the UTF-8 codepoints (or runes).