Calculating the length of strings in Go
Measuring the length of a string may well be one of the most common operations across all the applications out there.
Coming from a Python background, I learnt the hard way that there is a catch when calculating the length of strings in Go.
len
If you use the intuitive len, it returns the number of bytes in a string. This doesn't matter for strings in which every character occupies a single byte, but it gives unexpected results for strings containing UTF-8 characters that occupy more than one byte.
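So, len("Hello") will return 5, but len("Hello, 世界") will return 13 even though there are only 9 characters: the seven ASCII characters in "Hello, " take 1 byte each, while 世 and 界 take 3 bytes each in UTF-8 (7 + 3 + 3 = 13).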
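To see where the 13 bytes come from, here is a minimal sketch that lists how many bytes each character occupies. The helper name byteWidths is made up for illustration, and the sum only matches len for valid UTF-8 input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteWidths is a hypothetical helper (not part of any library).
// It returns the UTF-8 encoded size, in bytes, of each rune in s.
// Assumes s is valid UTF-8; invalid bytes would be reported as
// the replacement character U+FFFD, which has width 3.
func byteWidths(s string) []int {
	widths := make([]int, 0, len(s))
	for _, r := range s { // ranging over a string yields runes, not bytes
		widths = append(widths, utf8.RuneLen(r))
	}
	return widths
}

func main() {
	s := "Hello, 世界"
	widths := byteWidths(s)

	total := 0
	for _, w := range widths {
		total += w
	}

	fmt.Println("widths =", widths)           // one entry per character
	fmt.Println("characters =", len(widths))  // number of runes
	fmt.Println("sum of widths =", total)     // adds up to the byte count
	fmt.Println("len(s) =", len(s))           // byte count reported by len
}
```

For "Hello, 世界" the slice has nine entries, seven of them 1 and the last two 3, which is exactly the gap between 9 characters and 13 bytes.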
RuneCountInString
To get the expected result, use utf8.RuneCountInString(s) from the unicode/utf8 package. It returns the number of "runes" (Unicode code points) in a string, rather than the number of bytes that len returns. So, utf8.RuneCountInString("Hello, 世界") will return the expected value of 9.
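You may also come across len([]rune(s)) for the same purpose. The sketch below compares it with utf8.RuneCountInString on the strings used above; the helper name runeCounts is hypothetical, and I have only checked that the two agree on valid UTF-8 input like this.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCounts is a hypothetical helper (not part of any library).
// It counts the runes in s in two ways so the results can be compared.
func runeCounts(s string) (viaUTF8, viaConversion int) {
	viaUTF8 = utf8.RuneCountInString(s) // walks the string and counts runes
	viaConversion = len([]rune(s))      // converts to a rune slice, then takes its length
	return viaUTF8, viaConversion
}

func main() {
	for _, s := range []string{"Hello", "Hello, 世界"} {
		a, b := runeCounts(s)
		fmt.Printf("%q: bytes=%d RuneCountInString=%d len([]rune)=%d\n", s, len(s), a, b)
	}
}
```

Both approaches count runes rather than bytes. utf8.RuneCountInString states the intent more directly, so that is the one I'd reach for.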
#ConceptInCode
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	str := "Hello, 世界"
	fmt.Println("bytes =", len(str))                    // bytes = 13
	fmt.Println("runes =", utf8.RuneCountInString(str)) // runes = 9
}