Thanks to my friend @visarz for pointing out that I should have used this operator in my Exercism solutions.
I’ve been doing a lot of Exercism problems recently where I’ve been dealing with charcodes/code points of characters.
By charcode/code point, I mean the integer that Unicode assigns to a character: e.g. capital A has a code point of 65.
We saw this in the RNA transcription and Hamming problems:
iex> IO.inspect('GCTAU', charlists: :as_lists)
[71, 67, 84, 65, 85]
'GCTAU'
@dna_nucleotide_to_rna_nucleotide_map %{
  # `G` -> `C`
  71 => 67,
  # `C` -> `G`
  67 => 71,
  # `T` -> `A`
  84 => 65,
  # `A` -> `U`
  65 => 85
}
It turns out I had forgotten about a key Elixir operator that would have massively simplified the code above!
How to get the integer charcode or code point for a UTF-8 character
The ? operator gives you the integer code point/charcode of a character in Elixir:
iex> ?A
65
iex> ?å
229
iex> ?Д
1044
iex> ?😃
128515
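Since ?A is just another way of writing the integer 65, it behaves like any other integer. Here is a small sketch of that idea; the variable names are my own, chosen only for illustration:

```elixir
# `?A` evaluates to the integer 65, so the two are interchangeable.
same_as_integer = ?A == 65

# Code points support ordinary integer arithmetic.
case_offset = ?a - ?A

# A charlist is a list of code points, so it equals a list built with `?`.
same_as_charlist = ~c"GCTAU" == [?G, ?C, ?T, ?A, ?U]

# The utf8 binary modifier turns a code point back into a string.
back_to_string = <<?Д::utf8>>

IO.inspect({same_as_integer, case_offset, same_as_charlist, back_to_string})
```

I used the ~c sigil rather than single quotes here because newer Elixir versions warn about single-quoted charlists.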
The map I showed in the section above can be better represented like this:
@dna_nucleotide_to_rna_nucleotide_map %{
  ?G => ?C,
  ?C => ?G,
  ?T => ?A,
  ?A => ?U
}
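To show the map in use, here is a minimal sketch of a transcription function built on it. The module name RnaSketch and the function to_rna/1 are hypothetical, not taken from my actual Exercism solution, and I'm assuming the input is a charlist:

```elixir
defmodule RnaSketch do
  # Hypothetical module; the map is the same one shown above.
  @dna_nucleotide_to_rna_nucleotide_map %{
    ?G => ?C,
    ?C => ?G,
    ?T => ?A,
    ?A => ?U
  }

  # Takes a DNA charlist and returns the transcribed RNA charlist.
  # Map.fetch!/2 raises a KeyError for any code point missing from the map.
  def to_rna(dna) when is_list(dna) do
    Enum.map(dna, &Map.fetch!(@dna_nucleotide_to_rna_nucleotide_map, &1))
  end
end

IO.inspect(RnaSketch.to_rna(~c"GCTA") == ~c"CGAU")
```

Using Map.fetch!/2 is a design choice for the sketch: it fails loudly on an invalid nucleotide instead of silently putting nil into the result.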
Much more readable! We can also do something like this:
iex> 'ABCABC' |> Enum.filter(fn char -> char == ?A end)
'AA'
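Because ? produces a plain integer, it should also work in function heads and guards. The sketch below is my own illustration (CharSketch and its functions are made-up names), not something from the Exercism problems:

```elixir
defmodule CharSketch do
  # Hypothetical helper: pattern match on specific code points in function heads.
  def complement(?G), do: ?C
  def complement(?C), do: ?G

  # Hypothetical helper: a guard with a range of code points.
  def upcase_letter?(char) when char in ?A..?Z, do: true
  def upcase_letter?(_char), do: false

  # Counts how many times `target` appears in a charlist.
  def count(charlist, target), do: Enum.count(charlist, &(&1 == target))
end

IO.inspect(CharSketch.count(~c"ABCABC", ?A))
```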
Looks like I will need to go back and update my solutions! Thanks again, @visarz.
Further reading
Binaries, strings, and charlists on elixir-lang.org