new project: congress ratings
[Mar. 19th, 2007 | 06:48 pm]
So my next project is fun and neat. I can't remember exactly how I got here, but I noticed that all roll-call votes in Congress are accessible in a handy XML format. (Here's an example vote. Note that the page you see is that XML file after being processed with an XSLT stylesheet, so you'll have to hit View Source to see the raw data.) My idea (a little vague at this point) is to take all of the votes, figure out which issues I care about and which way I would have voted, and then "rate" representatives by how closely their votes align with mine. This is a little grand in scope.
Anyway, it took me a while to get HaXml (a Haskell XML parser) installed, because I kept failing to compile it from source for various stupid reasons. I finally figured out that it was in fact already in Debian as the libghc6-haxml-dev package, which made my life about 5 times easier.
So I saved a sample vote and have it parsing and I'm extracting simple data from it, which is exciting! I have a few questions, though: does anyone know the answer to these?
- I have these two functions:
nothing :: Maybe a -> Bool
nothing Nothing = True
nothing (Just _) = False
findElementContent :: String -> Content -> Maybe Element
findElementContent target (CElem el) = findElement target el
findElementContent target _ = Nothing
findElementContents :: String -> [Content] -> Maybe Element
findElementContents _ [] = Nothing
findElementContents target (c:cs) = if (nothing (findElementContent target c))
    then findElementContents target cs
    else findElementContent target c
findElementContent takes in a target tag and some data (Content), and returns the element that has that tag name if it exists, and Nothing otherwise. (findElementContents is just a helper function to do the same thing with a list of Content.) But findElementContents looks pretty ugly to me - what I want it to do is return findElementContent target c if that isn't Nothing, and otherwise recur on the rest of the list. The code is correct, but is it inefficient since I'm calling findElementContent target c twice? My limited understanding says no, since findElementContent is referentially transparent since it doesn't use monads (i.e. if you call it again with the same inputs it will return the same thing, always), but I'm not entirely clear on this.
- As I mentioned, findElementContents seems a little inelegant - is there a better way to do this? Is there some builtin nothing that I couldn't find?
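
A minimal sketch touching both questions, offered as one possible approach rather than a definitive answer. On the first: purity guarantees that both calls to findElementContent target c return the same value, but that alone does not guarantee the work is done only once; whether the two identical calls get shared is up to the optimizer. Binding the result once with case sidesteps the question. On the second: the standard library's Data.Maybe exports isNothing, which does what nothing does, and "first Just in a list" can be written with msum from Control.Monad or with mapMaybe and listToMaybe from Data.Maybe. The Element and Content types below are simplified stand-ins so the sketch compiles without HaXml, and this findElement is a hypothetical depth-first search standing in for the one the post uses but does not show.

```haskell
import Control.Monad (msum)
import Data.Maybe (isNothing, listToMaybe, mapMaybe)

-- Simplified stand-ins (assumption): they only mimic the shape of
-- HaXml's types. The real ones have more constructors and fields.
data Element = Elem String [Content] deriving (Eq, Show)
data Content = CElem Element | CString String deriving (Eq, Show)

-- Data.Maybe.isNothing is the library equivalent of the post's nothing.
nothing :: Maybe a -> Bool
nothing = isNothing

-- Hypothetical stand-in for the post's findElement: returns the first
-- element, searching depth-first, whose tag name equals the target.
findElement :: String -> Element -> Maybe Element
findElement target el@(Elem name children)
  | name == target = Just el
  | otherwise      = findElementContents target children

findElementContent :: String -> Content -> Maybe Element
findElementContent target (CElem el) = findElement target el
findElementContent _      _          = Nothing

-- Variant 1: evaluate the lookup once and pattern match on the result.
findElementContents :: String -> [Content] -> Maybe Element
findElementContents _ [] = Nothing
findElementContents target (c:cs) =
  case findElementContent target c of
    Nothing -> findElementContents target cs
    found   -> found

-- Variant 2: msum over Maybe values yields the first Just, if any.
findElementContents' :: String -> [Content] -> Maybe Element
findElementContents' target = msum . map (findElementContent target)

-- Variant 3: keep only the Just results, then take the first one.
findElementContents'' :: String -> [Content] -> Maybe Element
findElementContents'' target =
  listToMaybe . mapMaybe (findElementContent target)
```

All three variants stop at the first match because of laziness, so the shorter ones should not traverse more of the list than the explicit recursion does.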
Resources I've been using:
- HaXml reference
- standard library reference, including the Prelude