General Chat thread: Find Common Phrases Between Documents
19th September 2011, 04:02 PM #1 find Common Phrases Between Documents
Hello
Does anyone know of any software that will look at two documents (in the end it is going to be 100+ documents, but let's start with two) and find common phrases between them? The problem is that I do not know what the phrases are. I think I might be asking for too much, and may have to find common words first, then compare the context and start building a list of phrases to check for. Any ideas gratefully received.
Russell
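For anyone wanting to try the fallback described above (common words first, then compare the context), here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the `window` size and the `min_len` cut-off for skipping short words are all made up here, and the tokenising is deliberately crude.

```python
import re
from collections import defaultdict

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shared_word_contexts(text_a, text_b, window=2, min_len=4):
    """Map each word found in both texts to the words around it in each text.

    window and min_len are illustrative assumptions, not recommended values.
    """
    tokens_a, tokens_b = words(text_a), words(text_b)
    # Words present in both documents, ignoring very short ones.
    common = {w for w in set(tokens_a) & set(tokens_b) if len(w) >= min_len}
    contexts = defaultdict(list)
    for label, tokens in (("a", tokens_a), ("b", tokens_b)):
        for i, word in enumerate(tokens):
            if word in common:
                snippet = " ".join(tokens[max(0, i - window): i + window + 1])
                contexts[word].append((label, snippet))
    return dict(contexts)
```

Reading the snippets side by side for each shared word is one way to start building the list of candidate phrases by hand.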
-
-
20th September 2011, 01:22 AM #2
-
Thanks to SYNACK from:
russdev (20th September 2011)
-
20th September 2011, 04:20 PM #3 Thanks
Sadly I don't think that will work, as the phrases are not going to be in their database (almost certainly not, as they are chat logs; publicly available ones, before someone says something). It has to look at the documents we have and compare them against each other. As I said, I think I am being too ambitious.
Russ
-
-
20th September 2011, 04:26 PM #4 I'll bet it exists, probably used in unis for plagiarism checking. I'll bet it costs £Mega though.
-
-
20th September 2011, 04:36 PM #5 Perl.
-
-
20th September 2011, 05:13 PM #6 Who's running it? You or someone non-technical?
As Tom alludes, crude functionality is trivial to achieve quickly if you've access to decent tools, but if you want a GUI it gets trickier.
There's similarity-tester in the Ubuntu universe repos, designed for people nicking code, but it also works on natural language.
You'd probably want to prepare the (Microsoft?) documents beforehand by running them through wv so that they are readable.
Quite a lot of VLEs have plagiarism modules available.
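To give an idea of how little the crude version takes, here is a rough Python sketch (Python swapped in for the Perl suggested earlier in the thread). It assumes the documents are already plain text, treats a "phrase" as a run of `n` consecutive words, and the function names are invented for this example. It is not how similarity-tester works, just a naive set intersection.

```python
import re

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(tokens, n):
    """Return the set of all runs of n consecutive words."""
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def common_phrases(text_a, text_b, n=3):
    """Return the n-word phrases that occur in both texts."""
    return ngrams(words(text_a), n) & ngrams(words(text_b), n)

# Example use with two plain-text files (hypothetical file names):
#   with open("log1.txt") as a, open("log2.txt") as b:
#       print(sorted(common_phrases(a.read(), b.read(), n=4)))
```

Raising `n` cuts down on noise from everyday word runs, at the cost of missing shorter phrases.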
-
-
20th September 2011, 07:49 PM #7 @tom time to brush up on my Perl then...
@pete
It will just be me, so technical is fine. It has to be standalone, as I'm not running it as part of an LP. I will check out similarity-tester...
Crude is fine at the moment, to be honest, because of the content. Sounds cagey, doesn't it?
I don't fancy reading all 500+ documents...
Russell
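For the 500+ document case, one crude option is to count how many documents each phrase turns up in and read only the top of that list. The Python sketch below is a guess at what that could look like, not a tested tool: the function names and the `n` and `min_docs` defaults are assumptions, and it keeps every phrase in memory, which may or may not be comfortable at that scale.

```python
import re
from collections import Counter

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngrams(tokens, n):
    """Return the set of all runs of n consecutive words."""
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def phrases_by_document_count(texts, n=3, min_docs=2):
    """List (phrase, number of documents containing it), most widespread first."""
    counts = Counter()
    for text in texts:
        # ngrams() returns a set, so each document counts a phrase only once.
        counts.update(ngrams(words(text), n))
    return [(p, c) for p, c in counts.most_common() if c >= min_docs]
```

Passing in a list holding the text of each file gives back the phrases shared by at least `min_docs` documents, so the reading can start with the most widely shared ones.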
-