Scripts Thread, python rss read script in Coding and Web Development
  1. #1

    RabbieBurns's Avatar
    Join Date
    Apr 2008
    Location
    Sydney
    Posts
    5,532
    Thank Post
    1,341
    Thanked 470 Times in 307 Posts
    Blog Entries
    6
    Rep Power
    200

    python rss read script

    Anyone good with Python? I'm looking for a Python script that can parse an RSS feed and download the file that the RSS points to. Does anyone have a working example of this I might be able to use?

  2. #2

    RabbieBurns's Avatar
    OK, I've been having a play with feedparser.

    I can get the whole RSS to list all the items, but I'm not sure how to iterate through them looking for the items I want, nor how to download the link associated with any matching items.

  3. #3

    dhicks's Avatar
    Join Date
    Aug 2005
    Location
    Knightsbridge
    Posts
    5,738
    Thank Post
    1,299
    Thanked 797 Times in 693 Posts
    Rep Power
    239
    Quote Originally Posted by RabbieBurns View Post
    I can get the whole RSS to list all the items, but I'm not sure how to iterate through them looking for the items I want, nor how to download the link associated with any matching items.
    I just had a quick look at Universal Feed Parser. Seems easy enough - don't you want to just do a for-each loop over the entries list?

    Code:
    import feedparser
    d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
    # Loop over every entry in the parsed feed and show its contents
    for entry in d['entries']:
        print(entry)
    --
    David Hicks

  4. #4

    RabbieBurns's Avatar
    Yeah, that feedparser was what I meant I'd tried in my 2nd post.

    What I'm wanting to do is loop through the list, compare the items to pre-existing search criteria, and then download the links from any matching.

    I'm not really sure how to do anything, to be honest, so I'm just experimenting with it all just now...

  5. #5

    dhicks's Avatar
    Quote Originally Posted by RabbieBurns View Post
    What I'm wanting to do is loop through the list
    Which list? The list of entries in the RSS file?

    Code:
    import feedparser
    d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
    for entry in d['entries']:
        # Your code goes here.
        pass
    and compare the items to pre-existing search criteria
    First, run the code above with the "print" statement in it and see what fields each "entry" dictionary item has. Decide which field you want to check, then you'll be able to do something along the lines of:

    Code:
    import feedparser
    d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
    for entry in d['entries']:
        # 'name' stands in for whichever field you decide to check
        if entry['name'] == "Bananas":
            print("We have bananas!")
        else:
            print("Sorry, we have no bananas.")
    and then download the links from any matching
    There's bound to be a library for downloading resources from a given URL. Try searching the Python documentation with Google.
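
    As a rough sketch of that download step, assuming Python 3 and only the standard library: urllib.request can fetch a URL and the body can be streamed to a file. The function name download_link, the dest_dir parameter and the "download.html" fallback name are made up for illustration and are not part of feedparser.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def download_link(url, dest_dir="."):
    """Fetch url and save it under dest_dir; return the saved file's path.

    Hypothetical helper for illustration only.
    """
    # Use the last path segment as the file name, with a fallback
    # for URLs whose path ends in a slash or is empty.
    name = os.path.basename(urlparse(url).path) or "download.html"
    dest_path = os.path.join(dest_dir, name)
    # Stream the response to disk instead of reading it all into memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

    Inside the loop over the feed entries, this would be called with the entry's link, for example download_link(entry['link']).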

    --
    David Hicks

  6. #6

    RabbieBurns's Avatar
    Thanks, that gives me a basis for thought processes.

    Say, for example, the RSS is

    BBC News | News Front Page | World Edition

    then from a previous process (which I've already got) the array of things to search for was, say, "cricket".

    I would want it to just download the .stm page that the RSS points to.

    But I think I've already got a routine to do the actual downloading part.

    Ach, I dunno. I'm not a coder whatsoever, I just muck about with other people's code till I get something that resembles something that works, ending up with a really crude effort.

  7. #7

    dhicks's Avatar
    Quote Originally Posted by RabbieBurns View Post
    then from a previous process (which I've already got) the array of things to search for was, say, "cricket". I would want it to just download the .stm page that the RSS points to.
    Sorry, what part of what is "cricket"? A string you're looking for inside something, or the name of an array you've already read in from somewhere?

    The RSS file you link to would seem to have fields called title, description, link, guid, pubDate, category and media. Therefore, I'm guessing you want something along the lines of:

    Code:
    import feedparser
    d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
    for entry in d['entries']:
        if entry['title'] == "cricket":
            goGetTheURL(entry['link'])
    --
    David Hicks
    Last edited by dhicks; 29th September 2009 at 11:24 PM. Reason: No "each" needed in for loop.

  8. #8

    Join Date
    Sep 2009
    Posts
    2
    Thank Post
    0
    Thanked 0 Times in 0 Posts
    Rep Power
    0
    Wow, timely thread.

    I'm doing the same thing. I'm getting stuck on the 'entry' iterator, unfortunately.

    >>> for each entry in d['entries']
    File "<stdin>", line 1
    for each entry in d['entries']
    ^
    SyntaxError: invalid syntax
    >>>


    Any thoughts?

  9. #9

    argh, the shame.


    Never mind! It helps if I write the for loop in the form appropriate for the language.

    No "for each" statement

  10. #10

    dhicks's Avatar
    Quote Originally Posted by MrCurious View Post
    No "for each" statement
    Oh. Bother. <Hastily edits code>...

    --
    David Hicks

  11. #11

    dhicks's Avatar
    Yep, just tested the following:

    Code:
    import feedparser
    d = feedparser.parse("http://newsrss.bbc.co.uk/rss/newsonline_world_edition/front_page/rss.xml")
    for entry in d['entries']:
        if entry['title'] == "cricket":
            goGetTheURL(entry['link'])
    It goes and downloads an RSS file from the BBC website, checks each item listed and, if its title exactly matches "cricket", calls goGetTheURL with the entry's link. You'd probably want to check the title against a regular expression or something; an exact string match probably isn't what you want.
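
    One way the regular expression check might look, as a sketch rather than tested-against-the-BBC code: build a single case-insensitive pattern from the list of search terms and keep the links of entries whose title mentions any of them as a whole word. The names build_matcher, matching_links and sample_entries are invented for this example, and the sample dictionaries only stand in for d['entries'], on the assumption that the same 'title' and 'link' key access works there as in the code above.

```python
import re


def build_matcher(terms):
    """Compile one case-insensitive pattern matching any term as a whole word."""
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def matching_links(entries, terms):
    """Return the 'link' of every entry whose 'title' mentions one of the terms."""
    if not terms:
        # An empty alternation would match every title, so bail out early.
        return []
    matcher = build_matcher(terms)
    return [
        entry["link"]
        for entry in entries
        if matcher.search(entry.get("title", ""))
    ]


# Hypothetical sample data standing in for d['entries'].
sample_entries = [
    {"title": "England win Cricket thriller", "link": "http://example.com/1.stm"},
    {"title": "Markets rally", "link": "http://example.com/2.stm"},
]
print(matching_links(sample_entries, ["cricket"]))
```

    With the real feed, matching_links(d['entries'], ["cricket"]) would give the list of links to hand to goGetTheURL one at a time. The whole-word match is a design choice: it skips titles that only contain "cricketers", so drop the \b anchors if a plain substring match is wanted.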

    RabbieBurns, MrCurious: Does that cover what you were trying to do? Post more details of what you're trying to do if you want.

    --
    David Hicks
    Last edited by dhicks; 29th September 2009 at 11:37 PM.


