The goal here is to convert a Genbank record (received as XML) into a plist containing just the values we're interested in, like the example above. The first post in the series is here.
Below is the code for about the first half of the Reader class. We call it by sending it the message read:(NSURL *)xmlURL. Parsing is very straightforward. But how to deal with duplicates?
When I get parser:didStartElement:..., I check whether the element name is already a key in the dictionary. If it is, I change the key from, say, "FirstName" to "FirstNamei", where i is the number of these elements seen so far, including the current one. This only happens for elements that occur more than once. So the abstract sits under its own key ("Abstract" or "AbstractText"), but for multiple authors "FirstName" is set to "multiple" and the actual names go under "FirstName1", "FirstName2", etc. Simple, but effective.
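To make the renaming scheme concrete, here is a minimal sketch of it as a standalone NSXMLParser delegate. It is not the Reader class from this post: the class name DuplicateKeyCollector, the wanted set, the counts dictionary and keyForElement: are all hypothetical names made up for the illustration. It also keeps a per-element counter instead of only checking the dictionary for the key, and it moves the first value to "FirstName1" when the second one turns up, which is my reading of the scheme described above. It assumes the elements of interest are leaf elements that contain only text.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sketch of the duplicate-key scheme; not the post's Reader class.
@interface DuplicateKeyCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSSet *wanted;                // element names to keep
@property (nonatomic, strong) NSMutableDictionary *values;  // flattened result
@property (nonatomic, strong) NSMutableDictionary *counts;  // name -> times seen
@property (nonatomic, copy) NSString *currentKey;           // nil while skipping
@property (nonatomic, strong) NSMutableString *currentText; // text for currentKey
- (instancetype)initWithWantedElements:(NSSet *)names;
@end

@implementation DuplicateKeyCollector

- (instancetype)initWithWantedElements:(NSSet *)names {
    self = [super init];
    if (self) {
        self.wanted = names;
        self.values = [NSMutableDictionary dictionary];
        self.counts = [NSMutableDictionary dictionary];
    }
    return self;
}

// Pick the storage key for a new occurrence of elementName.
- (NSString *)keyForElement:(NSString *)elementName {
    // Messaging a missing (nil) count yields 0, so the first occurrence gets n == 1.
    NSUInteger n = [[self.counts objectForKey:elementName] unsignedIntegerValue] + 1;
    [self.counts setObject:[NSNumber numberWithUnsignedInteger:n] forKey:elementName];
    if (n == 1) {
        return elementName;  // first occurrence keeps the plain key
    }
    if (n == 2) {
        // Second occurrence: move the first value to "<name>1" and
        // flag the plain key as "multiple".
        id first = [self.values objectForKey:elementName];
        if (first) {
            [self.values setObject:first
                            forKey:[elementName stringByAppendingString:@"1"]];
        }
        [self.values setObject:@"multiple" forKey:elementName];
    }
    return [NSString stringWithFormat:@"%@%lu", elementName, (unsigned long)n];
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if (![self.wanted containsObject:elementName]) {
        self.currentKey = nil;  // not an element we care about
        return;
    }
    self.currentKey = [self keyForElement:elementName];
    self.currentText = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // Text can arrive in several pieces, so accumulate it.
    if (self.currentKey) {
        [self.currentText appendString:string];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if (self.currentKey == nil) {
        return;
    }
    [self.values setObject:[NSString stringWithString:self.currentText]
                    forKey:self.currentKey];
    self.currentKey = nil;
}

@end
```

To try it, create an NSXMLParser with the XML data, set an instance of this class as its delegate, and call parse; the result ends up in values. The counter is there so the code does not have to guess i from the keys already present; checking the dictionary directly, as described above, should work just as well.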
Dealing with Genbank sure is fun. For example, one record has "Forename" as the key, another has "FirstName". Why in the world would they do that?