  1. #1

    russdev's Avatar
    Join Date
    Jun 2005
    Location
    Leicestershire
    Posts
    6,985
    Thank Post
    735
    Thanked 559 Times in 374 Posts
    Blog Entries
    3
    Rep Power
    206

    find Common Phrases Between Documents

    Hello

    Does anyone know of any software that will look at two documents (in the end it's going to be 100+ documents, but let's start with two) and find common phrases between them? The problem is that I do not know what the phrases are. I think I might be asking for too much and will have to find common words first, then compare the context and start building a list of phrases to check for. Any ideas gratefully received.

    Russell

  2. #2

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    11,271
    Thank Post
    884
    Thanked 2,749 Times in 2,322 Posts
    Blog Entries
    11
    Rep Power
    785

  3. Thanks to SYNACK from:

    russdev (20th September 2011)

  4. #3

    russdev's Avatar
    Thanks

    Sadly I don't think that will work, as the phrases are not going to be in their database (almost certainly not, as they are chat logs, publicly available before someone says something). It has to look at the documents we have and compare them against each other. As I said, I think I am being too ambitious.

    Russ

  5. #4

    Andrew_C's Avatar
    Join Date
    Sep 2005
    Location
    Winchester
    Posts
    3,137
    Thank Post
    65
    Thanked 424 Times in 319 Posts
    Rep Power
    169
    I'll bet it exists, probably used in unis for plagiarism checking. I'll bet it costs £Mega though.

  6. #5


    tom_newton's Avatar
    Join Date
    Sep 2006
    Location
    Leeds
    Posts
    4,507
    Thank Post
    871
    Thanked 862 Times in 681 Posts
    Rep Power
    199
    Perl.

  7. #6


    Join Date
    Dec 2005
    Location
    In the server room, with the lead pipe.
    Posts
    4,715
    Thank Post
    288
    Thanked 789 Times in 616 Posts
    Rep Power
    226
    Who's running it? You or someone non-technical?

    As Tom alludes, crude functionality is quick and trivial to achieve if you have access to decent tools, but if you want a GUI it gets trickier.

    There's similarity-tester in the Ubuntu universe repos, designed for catching people nicking code, but it also works on natural language.

    You'd probably want to prepare the (Microsoft?) documents beforehand by running them through wv so they are readable.

    Quite a lot of VLEs have plagiarism modules available.
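
To give an idea of how little the "crude" version needs, here is a minimal sketch in Python (swapped in for the Perl suggested above, purely for illustration). It assumes the documents are already plain text and treats a "phrase" as any run of three consecutive words. The function names and the default of three words are assumptions made for this example, not anything from a particular tool.

```python
import re


def tokenize(text):
    # Lowercase the text and keep runs of letters, digits and apostrophes as words.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(words, n):
    # Every run of n consecutive words, collected as a set of tuples.
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(text_a, text_b, n=3):
    # Phrases of exactly n words that occur in both texts, sorted alphabetically.
    shared = ngrams(tokenize(text_a), n) & ngrams(tokenize(text_b), n)
    return sorted(" ".join(phrase) for phrase in shared)


if __name__ == "__main__":
    a = "Meet me at the usual place after school today."
    b = "She said to meet me at the usual place tomorrow."
    for phrase in common_phrases(a, b):
        print(phrase)
```

With the two sample sentences this prints "at the usual", "me at the", "meet me at" and "the usual place". Overlapping hits like these are really one longer phrase, so a less crude version would merge them or start with a larger n and work down.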

  8. #7

    russdev's Avatar
    @tom time to brush up on my Perl then...

    @pete

    It will just be me, so technical is fine. It has to be standalone, as I'm not running it as part of an LP. I will check out similarity-tester...

    Crude is fine at the moment, to be honest. Because of the content (sounds cagey, doesn't it?) I don't fancy reading all 500+ documents..

    Russell
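
For the 500+ document case, one possible approach is to count how many documents each phrase turns up in and only look at phrases that recur. The sketch below is Python again and makes the same assumptions as before: plain-text input, and a "phrase" meaning a fixed run of three words. `phrase_document_counts`, `min_docs` and the sample file names are made up for this example.

```python
import re
from collections import Counter


def phrases_in(text, n=3):
    # The set of distinct n-word phrases in one document.
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_document_counts(documents, n=3, min_docs=2):
    # documents maps a name to its text. Each phrase is counted once per
    # document, so the count is the number of documents that contain it.
    counts = Counter()
    for text in documents.values():
        counts.update(phrases_in(text, n))
    kept = [(phrase, c) for phrase, c in counts.items() if c >= min_docs]
    # Most widespread phrases first, ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical sample data; real use would read the text from files.
    docs = {
        "a.txt": "see you at the park",
        "b.txt": "meet at the park later",
        "c.txt": "at the park again and see you at noon",
    }
    for phrase, count in phrase_document_counts(docs):
        print(count, phrase)
```

On the sample data this reports "at the park" in 3 documents and "see you at" in 2. Raising `min_docs` or `n` cuts down the noise from everyday phrases, which is likely to matter a lot with chat logs.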


