What Is a Parser?: Part 3

Episode #58 • May 20, 2019 • Subscriber-Only

It’s time to ask the all important question: what’s the point? We now have a properly defined parser type, one that can parse efficiently and incrementally, but does it give us anything new over existing tools?

What Is a Parser?: Part 3
Introduction
00:05
Constructing more parsers
00:30
What's the point?
09:04
Revisiting coordinate parsing
10:50

Unlock This Episode

Our Free plan includes 1 subscriber-only episode of your choice, plus weekly updates from our newsletter.

Introduction

The interface for mutating substrings is the same as the interface for mutating strings, but we’ll get a bit of a performance boost by working with a view into the string instead of a copy.

We still have the restriction that the entire String must be in memory, which means parsing a very large String isn’t going to be efficient, but the optimizations we’ve made so far were very low-hanging and already buy us a lot, so let’s kick that can a bit down the road.

This episode is for subscribers only.

Subscribe to Point-Free

Access this episode, plus all past and future episodes when you become a subscriber.

See plans and pricing

Already a subscriber? Log in

Exercises

  1. Right now all of our parsers (int, double, literal, etc.) are defined at the top-level of the file, hence they are defined in the module namespace. While that is completely fine to do in Swift, it can sometimes improve the ergonomics of using these values by storing them as static properties on the Parser type itself. We have done this a bunch in previous episodes, such as with our Gen type and Snapshotting type.

    Move all of the parsers we have defined so far to be static properties on the Parser type. You will want to suitably constrain the A generic in the extension in order to further restrict how these parsers are stored, i.e. you shouldn’t be allowed to access the integer parser via Parser<String>.int.

  2. We have previously devoted an entire episode (here) to the concept of map, then 3 entire episodes (part 1, part 2, part 3) to zip, and then 5 (!) entire episodes (part 1, part 2, part 3, part 4, part 5) to flatMap. In those episodes we showed that those operations are very general, and go far beyond what Swift gives us in the standard library for arrays and optionals.

    Define map, zip and flatMap on the Parser type. Start by defining what their signatures should be, and then figure out how to implement them in the simplest way possible. What gotcha to be on the look out for is that you do not want to consume any of the input string if the parser fails.

  3. Create a parser end: Parser<Void> that simply succeeds if the input string is empty, and fails otherwise. This parser is useful to indicate that you do not intend to parse anymore.

  4. Implement a function that takes a predicate (Character) -> Bool as an argument, and returns a parser Parser<Substring> that consumes from the front of the input string until the predicate is no longer satisfied. It would have the signature func pred: ((Character) -> Bool) -> Parser<Substring>.

  5. Implement a function that transforms any parser into one that does not consume its input at all. It would have the signature func nonConsuming: (Parser<A>) -> Parser<A>.

  6. Implement a function that transforms a parser into one that runs the parser many times and accumulates the values into an array. It would have the signature func many: (Parser<A>) -> Parser<[A]>.

  7. Implement a function that takes an array of parsers, and returns a new parser that takes the result of the first parser that succeeds. It would have the signature func choice: (Parser<A>...) -> Parser<A>.

  8. Implement a function that takes two parsers, and returns a new parser that returns the result of the first if it succeeds, otherwise it returns the result of the second. It would have the signature func either: (Parser<A>, Parser<B>) -> Parser<Either<A, B>> where Either is defined:

    enum Either<A, B> {
      case left(A)
      case right(B)
    }
    
  9. Implement a function that takes two parsers and returns a new parser that runs both of the parsers on the input string, but only returns the successful result of the first and discards the second. It would have the signature func keep(_: Parser<A>, discard: Parser<B>) -> Parser<A>. Make sure to not consume any of the input string if either of the parsers fail.

  10. Implement a function that takes two parsers and returns a new parser that runs both of the parsers on the input string, but only returns the successful result of the second and discards the first. It would have the signature func discard(_: Parser<A>, keep: Parser<B>) -> Parser<B>. Make sure to not consume any of the input string if either of the parsers fail.

  11. Implement a function that takes two parsers and returns a new parser that returns of the first if it succeeds, otherwise it returns the result of the second. It would have the signature func choose: (Parser<A>, Parser<A>) -> Parser<A>. Consume as little of the input string when implementing this function.

  12. Generalize the previous exercise by implementing a function of the form func choose: ([Parser<A>]) -> Parser<A>.

  13. Right now our parser can only fail in a single way, by returning nil. However, it can often be useful to have parsers that return a description of what went wrong when parsing.

    Generalize the Parser type so that instead of returning an A? value it returns a Result<A, String> value, which will allow parsers to describe their failures. Update all of our parsers and the ones in the above exercises to work with this new type.

  14. Right now our parser only works on strings, but there are many other inputs we may want to parse. For example, if we are making a router we would want to parse URLRequest values.

    Generalize the Parser type so that it is generic not only over the type of value it produces, but also the type of values it parses. Update all of our parsers and the ones in the above exercises to work with this new type (you may need to constrain generics to work on specific types instead of all possible input types).

References

Parse, don’t validate

Alexis King • Tuesday Nov 5, 2019

This article demonstrates that parsing can be a great alternative to validating. When validating you often check for certain requirements of your values, but don’t have any record of that check in your types. Whereas parsing allows you to upgrade the types to something more restrictive so that you cannot misuse the value later on.

Ledger Mac App: Parsing Techniques

Chris Eidhof & Florian Kugler • Friday Aug 26, 2016

In this free episode of Swift talk, Chris and Florian discuss various techniques for parsing strings as a means to process a ledger file. It contains a good overview of various parsing techniques, including parser grammars.

Swift Strings and Substrings

Chris Eidhof & Florian Kugler • Friday Dec 1, 2017

In this free episode of Swift talk, Chris and Florian discuss how to efficiently use Swift strings, and in particular how to use the Substring type to prevent unnecessary copies of large strings.

We write a simple CSV parser as an example demonstrating how to work with Swift’s String and Substring types.

Swift Pitch: String Consumption

Michael Ilseman et al. • Sunday Mar 3, 2019

Swift contributor Michael Ilseman lays out some potential future directions for Swift’s string consumption API. This could be seen as a “Swiftier” way of doing what the Scanner type does today, but possibly even more powerful.

Difficulties With Efficient Large File Parsing

Ezekiel Elin et al. • Thursday Apr 25, 2019

This question on the Swift forums brings up an interesting discussion on how to best handle large files (hundreds of megabytes and millions of lines) in Swift. The thread contains lots of interesting tips on how to improve performance, and contains some hope of future standard library changes that may help too.

Scanner

Apple

Official documentation for the Scanner type by Apple. Although the type hasn’t (yet) been updated to take advantage of Swift’s modern features, it is still a very powerful API that is capable of parsing complex text formats.

NSScanner

Nate Cook • Monday Mar 2, 2015

A nice, concise article covering the Scanner type, including a tip of how to extend the Scanner so that it is a bit more “Swifty”. Take note that this article was written before NSScanner was renamed to just Scanner in Swift 3.

Downloads