  1. #1

    russdev's Avatar
    Join Date
    Jun 2005
    Location
    Leicestershire
    Posts
    6,873
    Thank Post
    650
    Thanked 534 Times in 353 Posts
    Blog Entries
    3
    Rep Power
    200

    Find Common Phrases Between Documents

    Hello

    Does anyone know of any software that will look at two documents (well, in the end it's going to be 100+ documents, but let's start with two) and find common phrases between them? The problem is that I do not know what the phrases are. I think I might be asking for too much, and may have to find common words first, then compare the context and start building a list of phrases to check for. Any ideas gratefully received.

    Russell

  2. #2

    SYNACK's Avatar
    Join Date
    Oct 2007
    Posts
    10,692
    Thank Post
    824
    Thanked 2,570 Times in 2,187 Posts
    Blog Entries
    9
    Rep Power
    731

  3. Thanks to SYNACK from:

    russdev (20th September 2011)

  4. #3

    russdev's Avatar
    Thanks

    Sadly, I don't think that will work, as the phrases are not going to be in their database (almost certainly not, as they are chat logs, publicly available before someone says something). It has to look at the documents we have and compare them against each other. As I said, I think I am being too ambitious.

    Russ

  5. #4

    Andrew_C's Avatar
    Join Date
    Sep 2005
    Location
    Winchester
    Posts
    2,840
    Thank Post
    62
    Thanked 348 Times in 269 Posts
    Rep Power
    149
    I'll bet it exists; it's probably used in universities for plagiarism checking. I'll bet it costs £Mega, though.

  6. #5


    tom_newton's Avatar
    Join Date
    Sep 2006
    Location
    Leeds
    Posts
    4,448
    Thank Post
    865
    Thanked 839 Times in 662 Posts
    Rep Power
    194
    Perl.

  7. #6


    Join Date
    Dec 2005
    Location
    In the server room, with the lead pipe.
    Posts
    4,534
    Thank Post
    271
    Thanked 752 Times in 590 Posts
    Rep Power
    218
    Who's running it? You or someone non-technical?

    As Tom alludes, crude functionality is trivial to achieve quickly if you have access to decent tools, but if you want a GUI it gets trickier.

    There's similarity-tester in the Ubuntu universe repos, designed to catch people nicking code, but it also works on natural language.

    You'd probably want to prepare the (Microsoft?) documents beforehand by running them through wv, so that they are readable as plain text.

    Quite a lot of VLEs have plagiarism modules available.
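
    For what it's worth, the crude version mentioned above can be sketched in a few lines. This is only a rough illustration in Python (not Perl, and not how similarity-tester works): it assumes the documents are already plain text, splits each into overlapping runs of n words, and reports the runs that appear in both. The names word_ngrams and common_phrases and the two sample sentences are made up for the example.

```python
import re


def word_ngrams(text, n=4):
    """Return the set of n-word phrases in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(text_a, text_b, n=4):
    """Return the n-word phrases that occur in both texts, sorted alphabetically."""
    return sorted(word_ngrams(text_a, n) & word_ngrams(text_b, n))


# Invented sample text, purely for illustration.
doc_a = "Meet me at the usual place after school tomorrow."
doc_b = "She said: meet me at the usual place, not the park."

print(common_phrases(doc_a, doc_b, n=4))
# ['at the usual place', 'me at the usual', 'meet me at the']
```

    A longer shared phrase shows up as several overlapping hits, so a bigger n gives fewer, more specific matches and a smaller n gives more noise.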

  8. #7

    russdev's Avatar
    @tom Time to brush up on my Perl then...

    @pete

    It will just be me, so technical is fine. It has to be standalone, as I'm not running it as part of an LP. I will check out similarity-tester...

    Crude is fine at the moment, to be honest. Because of the content (sounds cagey, doesn't it?) I don't fancy reading all 500+ documents.
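
    One crude way to avoid reading everything might be to count, for each phrase, how many of the documents it turns up in, and only look at the most widespread ones. A rough Python sketch of that idea, assuming plain-text input; the function names, the min_docs cut-off and the three sample "logs" are all invented for illustration:

```python
import re
from collections import Counter


def phrases(text, n=3):
    """Return the set of n-word phrases in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def rank_shared_phrases(docs, n=3, min_docs=2):
    """docs maps a name to its text. Returns (phrase, document count) pairs
    for phrases found in at least min_docs documents, most widespread first."""
    counts = Counter()
    for text in docs.values():
        # phrases() returns a set, so each document counts a phrase once.
        counts.update(phrases(text, n))
    shared = [(p, c) for p, c in counts.items() if c >= min_docs]
    return sorted(shared, key=lambda pc: (-pc[1], pc[0]))


# Invented sample data standing in for the real files.
docs = {
    "log1.txt": "see you at the usual place",
    "log2.txt": "not at the usual time",
    "log3.txt": "at the usual place again",
}

for phrase, count in rank_shared_phrases(docs):
    print(count, phrase)
# 3 at the usual
# 2 the usual place
```

    Everyday filler phrases will float to the top as well, so the output is a starting list to prune by hand rather than an answer.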

    Russell
