UTF8 iterators

String Core4 Lua Commands

SYNOPSIS

  1. for ch in string.utf8fwd(str [, i [, j]]) ...
  2. for ch in string.utf8rev(str [, i [, j]]) ...

DESCRIPTION

These functions allow iterating over a UTF-8 encoded string in forward or reverse direction.

Without further arguments, the iteration covers the full contents of str.

The optional parameters i and j select the substring of str that starts at i and runs through j; either may be negative, in which case it counts from the end of the string. If j is absent, it defaults to -1 (which is the same as the string length). In particular, the parameters (str, 1, j) iterate over the prefix of str with length j, and the parameters (str, -i) iterate over the last i codepoints of str.
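The range rules above can be modelled in plain Lua. The sketch below is not the Core4 implementation: utf8fwd_model and collect are hypothetical names, the code relies on the stock utf8 library of Lua 5.3 or later, and it assumes that i and j count codepoints rather than bytes, as the wording above suggests.

```lua
-- Hypothetical stand-in for string.utf8fwd, built on the stock Lua 5.3+
-- utf8 library. Assumption: i and j are codepoint positions, not bytes.
local function utf8fwd_model(str, i, j)
  local n = assert(utf8.len(str))       -- codepoint count; errors on invalid UTF-8
  i = i or 1
  j = j or -1
  if i < 0 then i = n + i + 1 end       -- negative positions count from the end
  if j < 0 then j = n + j + 1 end
  if i < 1 then i = 1 end               -- clamp to the string
  if j > n then j = n end
  local pos = i                         -- codepoint position to yield next
  local byte = (i <= j) and utf8.offset(str, i) or nil
  return function()
    if pos > j then return nil end
    local ch = utf8.codepoint(str, byte)
    byte = utf8.offset(str, 2, byte)    -- byte offset of the following character
    pos = pos + 1
    return ch
  end
end

-- Helper for the demonstration: gather the yielded codes into one string.
local function collect(...)
  local out = {}
  for ch in utf8fwd_model(...) do out[#out + 1] = ch end
  return table.concat(out, " ")
end

local s = "Sm\xC3\xB8rebr\xC3\xB8d"     -- "Smørebrød", 9 codepoints
print(collect(s))                       -- 83 109 248 114 101 98 114 248 100
print(collect(s, 1, 2))                 -- 83 109
print(collect(s, -3))                   -- 114 248 100
```

If the real functions index by byte instead, the last two calls would select different ranges, so the actual behaviour is worth checking on a short string first.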

RETURN VALUE

Each iteration yields the numeric code of the next character in the selected range; the loop ends when the range is exhausted.

NOTES

An iterator walks the string in linear time: about n steps forward and about 2n steps in reverse. A naive loop that calls string.utf8at() for every position takes O(n²).
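The difference between the two access patterns can be illustrated with the stock utf8 library of Lua 5.3 or later, used here only as a stand-in. The signature of string.utf8at() is not given on this page, so it is not called; utf8.offset(s, k) plays the role of the indexed lookup, and utf8.codes plays the role of the iterator.

```lua
local s = "Sm\xC3\xB8rebr\xC3\xB8d"     -- "Smørebrød"

-- Indexed access: utf8.offset(s, k) scans from the start of s to locate the
-- k-th character, so the loop as a whole does O(n^2) work.
local indexed = {}
for k = 1, utf8.len(s) do
  indexed[#indexed + 1] = utf8.codepoint(s, utf8.offset(s, k))
end

-- Iterator: utf8.codes remembers its byte position between steps, so a full
-- pass over the string is O(n).
local iterated = {}
for _, ch in utf8.codes(s) do
  iterated[#iterated + 1] = ch
end

print(table.concat(indexed, " "))       -- 83 109 248 114 101 98 114 248 100
print(table.concat(iterated, " "))      -- 83 109 248 114 101 98 114 248 100
```

Both loops produce the same codes; only the amount of rescanning differs, which matters once strings grow beyond a few hundred characters.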

EXAMPLE

  s = "Sm\xC3\xB8rebr\xC3\xB8d"
  print(s)
  for ch in string.utf8fwd(s) do
    print(ch)
  end

  Smørebrød
  83
  109
  248
  114
  101
  98
  114
  248
  100
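The example covers only the forward direction. As a rough sketch of a reverse walk, the following uses the stock utf8 library of Lua 5.3 or later; utf8rev_model is a hypothetical name, and the code assumes that string.utf8rev yields the same codes as string.utf8fwd in the opposite order, which this page does not state explicitly.

```lua
-- Hypothetical model of a reverse walk over a whole string (stock Lua 5.3+).
local function utf8rev_model(str)
  local byte = #str + 1                 -- one past the last byte
  return function()
    if byte <= 1 then return nil end    -- already at the start of the string
    byte = utf8.offset(str, -1, byte)   -- step back to the previous character start
    return utf8.codepoint(str, byte)    -- then decode it
  end
end

local s = "Sm\xC3\xB8rebr\xC3\xB8d"     -- "Smørebrød"
local out = {}
for ch in utf8rev_model(s) do out[#out + 1] = ch end
print(table.concat(out, " "))           -- 100 248 114 98 101 114 248 109 83
```

Each step here first locates the start of the previous character and then decodes it, which is one plausible reading of why the reverse direction is described as costlier than the forward one.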

SEE ALSO