Diff’ing my way to the scrapbook!

Fetching a scrapbook page & parsing it every time to get new messages seems kinda overkill… especially for the following two reasons -

1) I’m interested in only the new message
2) Parsing HTML which is not controlled by you & which changes frequently can leave your application a bit shaky!

So this is what I figured out while I was SMS’ing a friend during a power cut…

I can diff the new copy of the scrapbook against an old cached copy & parse the diff for only the new message’s HTML (text, links, images etc.).

This will give me 2 benefits…
1) I’m no longer dependent on the full page’s structure
2) Minimal HTML parsing :)

BTW… a quick Google search showed that my dear Python already has a diffing module up its sleeve: difflib, which can do the diffing for me.
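Here is a rough sketch of the idea using difflib.ndiff from the standard library. The function name new_lines and the sample markup (including the "scrap" class) are made up for illustration; the real scrapbook HTML will look different, and this assumes each message sits on its own line or lines.

```python
import difflib

def new_lines(old_page, new_page):
    """Return the lines that appear in new_page but not in old_page."""
    diff = difflib.ndiff(old_page.splitlines(), new_page.splitlines())
    # ndiff prefixes every line with a two-character code;
    # "+ " marks a line that exists only in the second sequence.
    return [line[2:] for line in diff if line.startswith("+ ")]

# Hypothetical cached copy and freshly fetched copy of the page.
old = """<html>
<div class="scrap">Hi there!</div>
</html>"""

new = """<html>
<div class="scrap">Happy birthday!</div>
<div class="scrap">Hi there!</div>
</html>"""

for line in new_lines(old, new):
    print(line)
```

Only the added lines come back, so whatever HTML parsing is still needed only has to deal with that small fragment instead of the whole page.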

Neat… ain’t it?
Py rocks!

Previous posts : First, Second


  • SK Jain

    Mayank

    The Hindi translation is fine, but what exactly these lines mean to you is more important & pertinent.

    Dad