Diff’ing my way to the scrapbook!
Fetching a scrapbook page & parsing it every time to get new messages seems kinda overkill… especially because of the following two reasons -
1) I’m interested in only the new messages
2) Parsing HTML that you don’t control & that changes frequently can leave your application a bit shaky!
So this is what I figured out when I was SMS’ing a friend during a power cut…
I can diff the new copy of the scrapbook against an old cached copy & parse only the diff for the new message’s HTML (text/links/images etc).
This will give me 2 benefits…
1) I’m no longer dependent on the full page’s structure
2) Minimal HTML parsing
BTW… a quick Google search showed that my dear Python already has a diffing module up its sleeve: difflib. Diffing can be done like this.
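Here’s a rough sketch of the idea, assuming the cached copy & the fresh copy are both plain strings. The function name added_lines and the sample HTML are made up for illustration; the real scrapbook markup will look different. It uses difflib.ndiff, which marks lines found only in the new copy with a "+ " prefix.

```python
import difflib


def added_lines(old_page, new_page):
    """Return the lines that appear in new_page but not in old_page."""
    diff = difflib.ndiff(old_page.splitlines(), new_page.splitlines())
    # ndiff prefixes every line with a 2-character code:
    # "+ " = only in the new copy, "- " = only in the old copy,
    # "  " = unchanged, "? " = hint line. Keep only the "+ " ones.
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical cached copy & fresh copy of the scrapbook page
cached = "<div>scrap 1</div>\n<div>scrap 2</div>"
fresh = "<div>scrap 3</div>\n<div>scrap 1</div>\n<div>scrap 2</div>"

print(added_lines(cached, fresh))
```

Only the lines that come back need to go through the HTML parser, so the rest of the page can change its layout without breaking anything.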
Neat… ain’t it?
Py rocks!
Previous posts : First, Second








