  1. #1

    russdev's Avatar
    Join Date
    Jun 2005
    Location
    Leicestershire
    Posts
    6,946
    Thank Post
    709
    Thanked 553 Times in 368 Posts
    Blog Entries
    3
    Rep Power
    204

    find Common Phrases Between Documents

    Hello

    Does anyone know of any software that will look at two documents (well, in the end it's going to be 100+ documents, but let's start with two) and find common phrases between them? The problem is that I do not know in advance what the phrases are. I think I might be asking for too much and will have to find common words first, then compare the context and start building a list of phrases to check for. Any ideas gratefully received.

    Russell
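
The fallback described above (find common words first, then compare their context) can be sketched roughly as below. This is only an illustration, not a tool anyone in the thread named: the function names, the `min_len` filter (a crude stand-in for a proper stop-word list), and the two sample sentences are all made up for the example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def common_words(text_a, text_b, min_len=4):
    """Words of at least min_len characters that occur in both texts."""
    shared = set(tokenize(text_a)) & set(tokenize(text_b))
    return sorted(w for w in shared if len(w) >= min_len)

def contexts(text, word, width=2):
    """Each occurrence of word, with up to `width` words on either side."""
    words = tokenize(text)
    return [" ".join(words[max(0, i - width): i + width + 1])
            for i, w in enumerate(words) if w == word]

# Made-up sample text, standing in for two real documents.
doc_a = "Everyone meet at the usual place tonight."
doc_b = "Is the usual place still open on Sunday?"

for word in common_words(doc_a, doc_b):
    print(word, "|", contexts(doc_a, word), "|", contexts(doc_b, word))
```

Reading the context windows side by side is what would let you start building the list of candidate phrases by hand.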

  2. #2

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    11,239
    Thank Post
    882
    Thanked 2,742 Times in 2,316 Posts
    Blog Entries
    11
    Rep Power
    784

  3. Thanks to SYNACK from:

    russdev (20th September 2011)

  4. #3

    russdev's Avatar
    Thanks

    Sadly, I don't think that will work, as the phrases are not going to be in their database (almost certainly not as they are chat logs publicly available before someone says something). It has to look at the documents we have and compare them against each other. As I said, I think I am being too ambitious.

    Russ

  5. #4

    Andrew_C's Avatar
    Join Date
    Sep 2005
    Location
    Winchester
    Posts
    3,044
    Thank Post
    64
    Thanked 391 Times in 299 Posts
    Rep Power
    163
    I'll bet it exists, probably used in unis for plagiarism checking. I'll bet it costs £Mega though.

  6. #5


    tom_newton's Avatar
    Join Date
    Sep 2006
    Location
    Leeds
    Posts
    4,485
    Thank Post
    867
    Thanked 853 Times in 674 Posts
    Rep Power
    197
    Perl.

  7. #6


    Join Date
    Dec 2005
    Location
    In the server room, with the lead pipe.
    Posts
    4,677
    Thank Post
    279
    Thanked 782 Times in 609 Posts
    Rep Power
    224
    Who's running it? You or someone non-technical?

    As Tom alludes, crude functionality is trivial to achieve quickly if you've access to decent tools, but if you want a GUI it gets trickier.

    There's similarity-tester in the Ubuntu universe repos, designed for catching people nicking code, but it also works on natural language.

    You'd probably want to prepare the (Microsoft?) documents beforehand by running them through wv, so that they are readable.

    Quite a lot of VLEs have plagiarism modules available.
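
To give an idea of what "crude functionality" might look like for the two-document case: Tom suggests Perl, but the same idea is sketched here in Python. It assumes the documents have already been converted to plain text, and it simply reports every run of `n` consecutive words that appears in both. The function names and the sample sentences are invented for the example; this is not how similarity-tester or any plagiarism module works internally.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def common_phrases(text_a, text_b, n=3):
    """Return the n-word phrases found in both texts, as sorted strings."""
    shared = set(ngrams(tokenize(text_a), n)) & set(ngrams(tokenize(text_b), n))
    return sorted(" ".join(phrase) for phrase in shared)

# Made-up sample text, standing in for two real documents.
doc_a = "Meet me at the usual place after school."
doc_b = "She said to meet me at the usual place tomorrow."

print(common_phrases(doc_a, doc_b, n=4))
```

A longer shared phrase shows up as several overlapping n-word phrases, so a smaller `n` gives more (and noisier) hits while a larger `n` gives fewer, more distinctive ones.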

  8. #7

    russdev's Avatar
    @tom time to brush up on my Perl then...

    @pete

    It will just be me, so technical is fine. It has to be standalone, as I'm not running it as part of a LP. I will check out similarity-tester...

    Crude is fine at the moment, to be honest; because of the content (sounds cagey, doesn't it?) I don't fancy reading all 500+ documents.

    Russell
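
For the 100+ (or 500+) document case, one crude standalone approach is to count how many documents each phrase appears in and list the phrases that turn up in several. The sketch below is a guess at what such a script could look like, not something posted in the thread. The file names and contents are invented, and the texts are held in a dictionary so the example runs on its own; in practice they would be read from disk.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def phrase_document_counts(docs, n=3):
    """Map each n-word phrase to the number of documents it appears in."""
    counts = Counter()
    for text in docs.values():
        words = tokenize(text)
        # A set, so a phrase repeated inside one document is counted once.
        seen = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        counts.update(seen)
    return counts

def shared_phrases(docs, n=3, min_docs=2):
    """Phrases found in at least min_docs documents, most widespread first."""
    counts = phrase_document_counts(docs, n)
    hits = [(phrase, c) for phrase, c in counts.items() if c >= min_docs]
    return sorted(hits, key=lambda item: (-item[1], item[0]))

# Hypothetical file names and contents, for illustration only.
docs = {
    "log1.txt": "see you at the usual place",
    "log2.txt": "not at the usual place again",
    "log3.txt": "is the usual place still open",
}

for phrase, count in shared_phrases(docs, n=3, min_docs=2):
    print(count, phrase)
```

Sorting by document count puts the most widespread phrases at the top, which should cut down how much of each document actually needs reading.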
