In Str
See primary documentation in context for method encode
multi method encode(Str:D $encoding = 'utf8', :$replacement, Bool() :$translate-nl = False, :$strict)
Returns a Blob which represents the original string in the given encoding and normal form. The actual return type is as specific as possible, so $str.encode('UTF-8') returns a utf8 object, while $str.encode('ISO-8859-1') returns a buf8. If :translate-nl is set to True, newlines are translated from \n to \r\n, but only on Windows. $replacement indicates how characters that are not available in the target encoding are replaced, while $strict indicates whether unmapped codepoints will still decode; an example is codepoint 129, which does not exist in windows-1252.
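As a minimal sketch of the return-type behaviour described above, the following inspects what encode hands back. The variable names are illustrative only, and the check for the single-byte encoding is deliberately limited to "it is a Blob", since the exact concrete type may vary.

```raku
my $str = "Raku";

# Encode as UTF-8 and inspect the concrete type of the result.
my $utf8 = $str.encode('UTF-8');
say $utf8.^name;        # OUTPUT: «utf8»
say $utf8 ~~ Blob;      # OUTPUT: «True»

# Whatever concrete type a single-byte encoding returns, it is still a Blob.
my $latin1 = $str.encode('ISO-8859-1');
say $latin1 ~~ Blob;    # OUTPUT: «True»
say $latin1.elems;      # OUTPUT: «4»
```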
my $str = "Þor is mighty";
say $str.encode("ascii", :replacement('Th')).decode("ascii"); # OUTPUT: «Thor is mighty»
In this case, any unknown character is replaced by Th. We know in advance that the only character not available in the ascii encoding is Þ, so we substitute its Latin equivalent, Th. When no replacement string is given, :replacement is understood as a Bool:
say $str.encode("ascii", :replacement).decode("ascii"); # OUTPUT: «?or is mighty»
If :replacement is neither set nor assigned a value, the error Error encoding ASCII string: could not encode codepoint 222 will be issued (in this case, since Þ is codepoint 222).
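One possible way to handle that error is to attempt the plain encoding first and fall back to :replacement only when it fails. This is a sketch, not a recommended pattern from the documentation; $encoded and $error are hypothetical names introduced for illustration.

```raku
my $str = "Þor is mighty";

# Without :replacement the call fails; try yields an undefined value
# and stores the exception in $!.
my $encoded = try $str.encode("ascii");
my $error   = $!;

# Fall back to the Bool form of :replacement when the first attempt failed.
$encoded //= $str.encode("ascii", :replacement);

note "fell back to replacement: " ~ $error.message if $error;
say $encoded.decode("ascii");   # OUTPUT: «?or is mighty»
```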
Since the Blob returned by encode is the original string in normal form, and every element of the default utf8 Blob is a byte, you can obtain the length in bytes of a string by calling a method that returns the size of the Blob on it:
say "þor".encode.bytes; # OUTPUT: «4»
say "þor".encode.elems; # OUTPUT: «4»
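To make the difference between characters and bytes concrete, this short sketch compares the two counts for the same string and lists the individual UTF-8 bytes. The variable $word is illustrative only.

```raku
my $word = "þor";

say $word.chars;          # OUTPUT: «3»
say $word.encode.bytes;   # OUTPUT: «4»

# þ (codepoint 254) takes two bytes in UTF-8; o and r take one each.
say $word.encode.list;    # OUTPUT: «(195 190 111 114)»
```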