The manual, hand-rolled coordinate parsing function we made last time is already pretty complicated, and capturing every edge case will make it more complicated still.
But that’s not even the worst part. The worst part is that this parse function is a one-off, ad hoc solution to parsing the very specific format of latitude/longitude coordinates. There is nothing inside the body that is reusable outside of this example.
We’re definitely seeing how tricky and subtle parsing can be, and it’s easy to end up with a complex function that would need to be studied very carefully to understand what it’s doing, and even then it would be easy to overlook the potential bugs that lurk within.
Let’s back up and formally state what the problem of parsing is and analyze it from the perspective of functional programming and functions.
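As a point of reference for the exercises that follow, here is a minimal sketch of what a parser type could look like. The exercises only tell us that there is a generic Parser<A>, that an int parser exists, and that int.run("-123") currently fails; the stored closure, the String convenience method, and the body of int below are assumptions made for illustration, not necessarily the episode's exact definitions.

```swift
// Assumed shape: a parser is a function that tries to consume a value of
// type A from the front of the input, and returns nil if it cannot.
struct Parser<A> {
  let run: (inout Substring) -> A?

  // Convenience for running on a whole String and inspecting what is left.
  func run(_ string: String) -> (match: A?, rest: Substring) {
    var input = string[...]
    let match = self.run(&input)
    return (match, input)
  }
}

// A sketch of an int parser that handles only non-negative numbers.
let int = Parser<Int> { input in
  // Collect the leading ASCII digits without consuming them yet.
  let digits = input.prefix(while: { ("0"..."9").contains($0) })
  // Int(_:) returns nil for an empty or overflowing digit string.
  guard let value = Int(digits) else { return nil }
  // Only consume input once we know parsing succeeded.
  input.removeFirst(digits.count)
  return value
}
```

With this sketch, int.run("123 Hello") yields a match of 123 and leaves " Hello" unconsumed, while int.run("-123") yields nil and leaves the input untouched, which is the deficiency one of the exercises asks you to fix.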
Create a parser char: Parser<Character> that will parse a single character off the front of the input string.
Create a parser whitespace: Parser<Void> that consumes all of the whitespace from the front of the input string. Note that this parser is of type Void because we probably don’t care about the actual whitespace we consumed, we just want it consumed.
Right now our int parser doesn’t work for negative numbers. For example, int.run("-123") will fail. Fix this deficiency in the int parser.
Create a parser double: Parser<Double> that consumes a double from the front of the input string.
Define a function literal: (String) -> Parser<Void> that takes a string, and returns a parser which will parse that string from the beginning of the input. This exercise shows how you can build complex parsers: you can use a function to take some up-front configuration, and then use that data in the definition of the parser.
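To illustrate the "up-front configuration" pattern, here is one possible sketch of literal. It assumes a Parser<A> that wraps a function of type (inout Substring) -> A?; that definition is an assumption repeated here so the snippet stands alone, and other implementations are equally valid.

```swift
// Assumed parser shape, repeated so this snippet is self-contained.
struct Parser<A> {
  let run: (inout Substring) -> A?
}

// The string argument is configuration: it is captured by the closure
// and used every time the returned parser runs.
func literal(_ string: String) -> Parser<Void> {
  return Parser<Void> { input in
    // Fail without consuming anything if the input does not start
    // with the expected text.
    guard input.starts(with: string) else { return nil }
    input.removeFirst(string.count)
    return ()
  }
}
```

For example, running literal("Hello") on "Hello, world" succeeds and leaves ", world" in the input, while running literal("xyz") on that remainder returns nil and consumes nothing.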
In this episode we mentioned that there is a correspondence between functions of the form (A) -> A and functions of the form (inout A) -> Void. We even covered this in a previous episode, but it is instructive to write it out again. So, define two functions, one of them named fromInout, that convert functions of the form (A) -> A to functions of the form (inout A) -> Void, and vice versa.
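One way the correspondence could be written out is sketched below. The name toInout is an assumption, since only fromInout is named in the exercise, and the direction assigned to each function is inferred from the names.

```swift
// Turns a pure transformation into an in-place mutation by
// overwriting the value with the result of the function.
// (`toInout` is an assumed name.)
func toInout<A>(_ f: @escaping (A) -> A) -> (inout A) -> Void {
  return { a in a = f(a) }
}

// Turns an in-place mutation into a pure transformation by
// mutating a local copy and returning it.
func fromInout<A>(_ f: @escaping (inout A) -> Void) -> (A) -> A {
  return { a in
    var copy = a
    f(&copy)
    return copy
  }
}
```

Converting a function with one of these and then converting the result back with the other gives a function that behaves like the original, which is what makes this a correspondence rather than a one-way conversion.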
In this free episode of Swift Talk, Chris and Florian discuss various techniques for parsing strings as a means to process a ledger file. It gives a good overview of these techniques, including parser grammars.
In this free episode of Swift Talk, Chris and Florian discuss how to efficiently use Swift strings, and in particular how to use the Substring type to prevent unnecessary copies of large strings.
We write a simple CSV parser as an example demonstrating how to work with Swift’s String and Substring types.
Swift contributor Michael Ilseman lays out some potential future directions for Swift’s string consumption API. This could be seen as a “Swiftier” way of doing what the Scanner type does today, but possibly even more powerful.
This question on the Swift forums brings up an interesting discussion on how to best handle large files (hundreds of megabytes and millions of lines) in Swift. The thread contains lots of interesting tips on how to improve performance, and contains some hope of future standard library changes that may help too.